Abstract
The age of modern introns and the evolutionary forces controlling intron loss and gain remain matters of much debate. In the case of the apicomplexan malaria parasite Plasmodium falciparum, previous studies have shown that while the positions of two thirds of P. falciparum introns are not shared with surveyed non-apicomplexans (leaving open the possibility that they were relatively recently gained), 99.1% are shared with Plasmodium yoelii, which diverged from P. falciparum at least 100 Mya. We show here that 60.6% of P. falciparum intron positions in conserved regions are shared with the distantly related apicomplexan Theileria parva, whereas only 18.2% of introns in the more intron-rich T. parva are shared with P. falciparum. Comparison of 3305 pairs of orthologous genes between T. parva and Theileria annulata showed that 7089/7111 (99.7%) introns in conserved regions are shared between species. These levels of conservation imply significant differences in rates of intron loss and gain through apicomplexan history. Because transposable elements (TEs) and/or (often TE-encoded) reverse transcriptase are implicated in models of intron loss and gain, the observed low rates of intron loss and gain in recent Plasmodium and Theileria evolution are consistent with the lack of known TEs in those groups. We suggest that intron loss/gain in some eukaryotic lineages may be concentrated in relatively short episodes coincident with occasional TE invasions.
Eukaryotic species vary dramatically in average number of introns per gene, from less than 0.2 intron/gene in Cryptosporidium species, hemiascomycetes fungi, Encephalitozoon cuniculi, red algae, and Giardia lamblia to more than one per gene in animals, most characterized fungi, most apicomplexans, amoebae, diatoms, paramecia, jakobids, land plants, and green algae (compiled in Jeffares et al. 2006; Roy and Gilbert 2006). Such a pattern clearly implies recurrent episodes of massive intron loss and/or gain, although the relative importance of these two processes remains hotly debated (Babenko et al. 2004; Qiu et al. 2004; Csuros 2005; Roy and Gilbert 2005b, c).
Many studies have shown a complex pattern of intron position sharing between species (e.g., Perler et al. 1980; Dibb and Newman 1989; Moriyama et al. 1998; Fedorov et al. 2002; Guilliano et al. 2002; Kent and Zahler 2000; Rogozin et al. 2003; Roy et al. 2003; Kiontke et al. 2004; Nielsen et al. 2004; Vanacova et al. 2005; Roy and Hartl 2006). In some cases, the vast majority of introns are found at identical positions of orthologous genes. For instance, only 15 human-specific introns were found among over 10,000 intron positions in 1560 human–mouse ortholog pairs (Roy et al. 2003). In other cases, intron positions have diverged significantly over relatively short times (Seo et al. 2001; Kent and Zahler 2000; Edvardsen et al. 2004).
The apicomplexan malaria parasite Plasmodium falciparum provides a particularly interesting case. In a study of 684 sets of orthologs between P. falciparum and seven non-apicomplexan eukaryotic species, only one third of P. falciparum intron positions were shared with another species, less than was found for any non-apicomplexan species, leaving open the possibility that the majority of P. falciparum introns have been relatively recently gained (Rogozin et al. 2003). By contrast, a study of 3789 pairs of orthologs between P. falciparum and the rodent parasite P. yoelii (diverged ≥100 Mya) found very high conservation, with 99.1% of P. falciparum introns shared with P. yoelii (and 99.6% of P. yoelii introns shared with P. falciparum), and at least three quarters and very likely at least 95% of the observed differences attributable to intron loss, not intron gain (Roy and Hartl 2006).
We studied conserved regions of 1279 orthologous gene pairs between P. falciparum and the distantly related apicomplexan parasite Theileria parva. A total of 335/553 (60.6%) P. falciparum intron positions were shared with T. parva; 335/1842 (18.2%) T. parva intron positions were shared with P. falciparum. We then compared 3305 pairs of orthologous genes between T. parva and T. annulata (diverged ~82 Mya). Among 7111 intron positions in conserved regions, 7089 were shared between species, whereas only 11 (0.15%) were specific to each species. Implied intron loss/gain rates between Theileria species and between Plasmodium species are orders of magnitude smaller than estimated rates between T. parva and P. falciparum.
Proposed mechanisms of intron loss and gain involve transposable elements (TE) or TE-encoded reverse transcriptases. Thus, as noted previously for Plasmodium (Roy and Hartl 2006), a dearth of intron loss and gain in Theileria is consistent with the lack of transposable elements in modern Theileria species. We suggest that in some lineages including apicomplexans most intron loss/gain may be confined to relatively rare episodes of transposable element invasion.
Results and Discussion
Very little intron loss/gain in Theileria
We compared intron–exon structures for 3305 putative pairs of orthologous genes for T. parva and T. annulata. In 3.24 Mb of conserved amino acid level alignment, there were only 22 intron positions specific to one species out of 7111 total intron positions. Eleven introns were specific to each species (Table 1; Fig. 1). The divergence at synonymous positions between these species ($d_S$) has been estimated as 0.82 (Pain et al. 2005). Assuming that the rate of divergence at synonymous positions matches that found in Plasmodium (~$5 \times 10^{-9}$; Castillo-Davis et al. 2004; Tanabe et al. 2004; Mu et al. 2005; Neafsey et al. 2005) gives an estimated divergence time of ~82 Mya. The Theileria genes experiencing intron loss/gains are summarized in Table 2.
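The date above can be checked with a short calculation. The sketch below assumes that synonymous divergence accumulates along both lineages, so that t = dS / (2 × rate), and that the rate is expressed per site per year; both are assumptions chosen because they reproduce the ~82 My figure, and the function name is hypothetical.

```python
def divergence_time_my(d_s, rate_per_site_per_year):
    """Divergence time in My, assuming d_s accumulates on both lineages."""
    years = d_s / (2.0 * rate_per_site_per_year)
    return years / 1e6

# dS = 0.82 between T. parva and T. annulata; rate ~5e-9 taken from Plasmodium
t_theileria = divergence_time_my(0.82, 5e-9)
print(round(t_theileria, 1))  # 82.0
```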
Table 1.
Summary of species comparisons
Figure 1.
Intron divergences between and within Theileria and Plasmodium species. Percentages of total intron positions in conserved regions that are species specific are given for the three apicomplexan comparisons. Times of divergence are from Escalante and Ayala (1995) (Theileria–Plasmodium divergence), from an assumption of P. yoelii–P. falciparum speciation at least as deep as host divergence, and based on an estimated $d_S = 0.82$ divergence between Theileria species, assuming a nucleotide substitution rate of $5 \times 10^{-9}$ (T. annulata–T. parva divergence).
Table 2.
Theileria genes experiencing intron loss/gains
Intron gain/loss between T. parva and Plasmodium falciparum
We next analyzed 2060 intron positions in 596 kb of conserved regions of 1279 putative pairs of orthologous genes for T. parva and P. falciparum. A total of 335/553 (60.6%) of P. falciparum intron positions were shared with T. parva, whereas 335/1842 (18.2%) T. parva intron positions were shared with P. falciparum (Table 1; Fig. 1). The results were nearly identical in a subset of 597 gene pairs for which conservation of synteny makes particularly confident orthology assignment possible. In a comparison of a small set of conserved regions of 464 possibly orthologous sequences between T. parva and Paramecium tetraurelia, 5/17 P. tetraurelia introns were found in T. parva and 5/12 T. parva introns were found in P. tetraurelia. However, these results are difficult to interpret because of small sample numbers, and the lack of a full P. tetraurelia genome for analysis prohibits confident orthology assignments.
Discordant Theileria introns
We BLASTed each of the 22 introns found in only one of the two Theileria species against the genome in which it is found (e.g., T. parva introns against the T. parva genome). None of the intronic sequences gave hits with convincing sequence similarity to any intergenic, coding, or intronic sequence. Twenty-one of 22 were exact changes, without loss or gain of adjacent coding sequences. Comparisons with other apicomplexans for which genomic sequence was available confirmed intron presence in 10 cases, mostly in Babesia bigemina, strongly suggesting intron loss. In many cases intron positions that are conserved between the two Theileria species are absent in these outgroups, thus absence of discordant Theileria introns in these outgroups does not imply intron gain. Full understanding of the history of these introns will have to await the availability of fully annotated genome sequences for additional apicomplexan species.
Interestingly, most discordant Theileria introns (77.2%), including most confirmed losses (80.0%), fell in phase one, a much higher fraction than for all Theileria introns (33.9%, P < 0.005 by a binomial test). This is particularly surprising since a bias toward loss of phase zero introns was previously found for a variety of eukaryotic lineages (Roy and Gilbert 2005a). In the manner of Roy and Hartl (2006), we identified discordant introns that were 5' or 3' of the median intron in conserved regions and found no difference (6 vs. 7 for all discordant introns; 3 vs. 2 for confirmed intron losses); thus there is no clear bias toward loss of 3' introns. In addition, in only one case were two discordant introns found in the same gene, and these were not adjacent. Thus, in these ways the data do not provide evidence for mRNA-mediated intron loss, although this could reflect the small number of intron losses identified.
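The phase-one excess can be illustrated with an exact binomial tail probability. This is a minimal sketch, not necessarily the test as originally run: it assumes a one-sided test, and it assumes that 17 of the 22 discordant introns were in phase one (a count inferred here from the reported percentage, not stated in the text). The function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed counts: 17 of 22 discordant introns in phase one;
# null expectation 33.9% (fraction of all Theileria introns in phase one).
p_value = binomial_upper_tail(17, 22, 0.339)
print(p_value < 0.005)  # True
```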
The average length of intergenic regions flanking genes experiencing an intron loss/gain was very similar to values for all genes for Plasmodium (2221 nucleotides in all genes vs. 2257 for genes experiencing intron loss/gains) and Theileria (1618 vs. 2034, P = 0.48 by a Monte Carlo simulation).
Gain and loss in Theileria and Plasmodium
To get a sense of the relative importance of intron loss and gain in Theileria and Plasmodium since the common ancestor, we assessed intron presence in homologous genes in the distantly related apicomplexan species Toxoplasma gondii for 30 Theileria-specific and 30 Plasmodium-specific introns. In both cases, 19/30 were present in T. gondii, suggesting intron loss. Levels of conservation of Theileria/Plasmodium introns in T. gondii are unknown, thus for the introns that are absent in T. gondii intron loss/gain is uncertain. This small sample suggests that intron losses have outnumbered intron gains in these species; however, full understanding of intron evolution in the deeper branches of apicomplexans will have to await availability of a fully annotated T. gondii genome.
Rates of intron loss and gain
We next estimated rates of intron loss and gain from the data. Consider two species A and B that diverged $t$ million years ago, in which the ancestor contained $N$ introns in the $c$ Mb of studied orthologous coding sequence, and in which both species have experienced a constant rate of intron gain $g$/Mb/My and intron loss $l$/My since their divergence. We assume that all observed intron positions shared between the two species represent truly ancestral introns (i.e., no multiple insertions into homologous sites). If all introns are lost at an equal rate, the chance that an ancestral intron has not been lost by the present time along a single lineage is given by the exponential distribution, $e^{-lt}$. The expected number of introns shared between the species is the number of ancestral introns $N$ times the probability that an ancestral intron has not been lost in species A ($e^{-lt}$) times the probability that it has not been lost in species B (also $e^{-lt}$), or $Ne^{-2lt}$. The probability that an ancestral intron is lost in a given species is $1 - e^{-lt}$. An intron may thus be retained in species A (with probability $e^{-lt}$) but lost in species B (with probability $1 - e^{-lt}$), with total probability $e^{-lt}(1 - e^{-lt})$, or lost in A but retained in B, also with probability $e^{-lt}(1 - e^{-lt})$; thus the expected total number of ancestral introns that are present in only one species is $2Ne^{-lt}(1 - e^{-lt})$.
The rate of intron gain across the entire sequence is simply $cg$ per My. However, an intron that is gained at time $t_g$ before the present may be subsequently lost; it survives to the present with probability $e^{-lt_g}$. Integrating, we get an expectation of $\int_0^t cg\,e^{-lt_g}\,dt_g = (cg/l)(1 - e^{-lt})$ extant intron gains per lineage since the time of divergence, and thus a total of twice that number of species-specific intron gains (since there are two species).
In the case of the T. parva–T. annulata comparison, there are 22 species-specific and 7089 shared intron positions in 3.24 Mb of sequence. Ten of the species-specific introns are known to be due to intron loss. The remaining 12 may be due to intron loss or gain. Assuming that all 22 species-specific introns are attributable to intron loss, we estimate $22/7089 = 2Ne^{-lt}(1 - e^{-lt})/Ne^{-2lt}$, which gives $lt = 0.0015$, or $l = 1.9 \times 10^{-5}$/My for $t = 82$ My. Assuming that all 12 species-specific introns of unknown origin are due to intron gain gives estimates of $l = 8.6 \times 10^{-6}$/My and $g = 0.023$/Mb/My. Figure 2 shows the entire range of estimates assuming between 0 and 12 of the species-specific introns are due to intron gain. For a previous P. falciparum–P. yoelii comparison (Roy and Hartl 2006), at least 19 of the 27 species-specific introns among 2212 total introns in 3.5 Mb of conserved regions are due to intron loss. Estimated rates of loss, assuming $t = 100$ My, are therefore $5.4$–$6.2 \times 10^{-5}$/My, and the estimated gain rate is between zero and $g = 0.011$/Mb/My (Fig. 2).
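The Theileria estimates above can be reproduced numerically. The sketch below inverts the two expectations from the model: the ratio of loss-derived species-specific positions to shared positions equals 2(e^{lt} - 1), and the number of extant species-specific gains equals 2(cg/l)(1 - e^{-lt}). The function names are hypothetical and this is an illustration of the arithmetic, not the original analysis code.

```python
from math import exp, log

def loss_rate(n_loss_specific, n_shared, t_my):
    """Solve n_loss_specific / n_shared = 2 (e^{lt} - 1) for l (per My)."""
    lt = log(1.0 + n_loss_specific / (2.0 * n_shared))
    return lt / t_my

def gain_rate(n_gain_specific, c_mb, l, t_my):
    """Solve n_gain_specific = 2 (c g / l)(1 - e^{-lt}) for g (per Mb per My)."""
    if l == 0:
        # Limit of the expression above as l -> 0: gains = 2 c g t
        return n_gain_specific / (2.0 * c_mb * t_my)
    return n_gain_specific * l / (2.0 * c_mb * (1.0 - exp(-l * t_my)))

# T. parva vs. T. annulata: 7089 shared positions, 3.24 Mb, t = 82 My
l_all_loss = loss_rate(22, 7089, 82)            # all 22 differences are losses
l_min_loss = loss_rate(10, 7089, 82)            # only the 10 confirmed losses
g_max = gain_rate(12, 3.24, l_min_loss, 82)     # remaining 12 are gains
print(f"{l_all_loss:.1e} {l_min_loss:.1e} {g_max:.3f}")
```

Sweeping the assumed number of gains from 0 to 12 in the same way gives a range of (l, g) pairs for this comparison.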
Figure 2.
Estimated rates of intron loss and gain in three comparisons. The Plasmodium and Theileria traces give possible estimates of intron loss and gain for the P. falciparum–P. yoelii and T. parva–T. annulata comparisons, respectively, given the number of observed species-specific intron positions. The black traces give possible estimates derived from the T. parva–P. falciparum comparison, assuming either a divergence time of 350 Mya or 824 Mya. Estimates are derived as described in the text.
Rates of loss and/or gain implied by the T. parva–P. falciparum divergence are much higher. The Theileria–Plasmodium divergence very likely postdates the deepest splits within known apicomplexans, estimated to be 350–824 Mya (Escalante and Ayala 1995). These values are unlikely to be a significant underestimate, since the apicomplexan ancestor was very likely to have been an animal parasite and therefore to not predate the origin of animals (Zilversmit and Hartl 2005). Rates of intron gain and loss similar to those estimated for the P. falciparum–P. yoelii and T. parva–T. annulata divergence would therefore predict that between 3 and 10% of introns in a T. parva–P. falciparum comparison should be species specific. Instead, 84% of intron positions are species specific.
Assuming that all 1725 species-specific introns out of 2060 total intron positions in 0.60 Mb of conserved sequences are due to intron loss gives an estimate of l = 0.0015–0.0075. These values are two orders of magnitude larger than the P. falciparum–P. yoelii and T. parva–T. annulata estimates. Attributing all 1725 species-specific introns to intron gain yields estimates of g = 1.8–4.1/Mb/My (assuming t of 824 My and 350 My, respectively), two to three orders of magnitude higher than the P. falciparum–P. yoelii and T. parva–T. annulata estimates. Figure 2 gives the entire range of estimates assuming that between 0 and 1725 of the species-specific introns are attributable to intron gain, showing clearly higher estimates than the T. parva–T. annulata and P. falciparum–P. yoelii estimates.
Rates of intron evolution through time
Other results suggest significant variations in intron loss/gain rate through time. While intron position divergence between mouse and humans is only 0.1%, divergence between mammals and flies or worms, reflecting a time of divergence perhaps tenfold earlier, is not 1% but ~80% (Rogozin et al. 2003; Roy et al. 2003). While intron position divergence between euascomycetous fungi appears to be around one quarter, divergence between euascomycetes and hemiascomycetes is near complete, apparently due to hemiascomycetes having lost nearly all of their ancestral introns (Rogozin et al. 2003; Nielsen et al. 2004). Similarly, while divergence between Theileria and Plasmodium is 84%, their divergence from the slightly more distantly related C. parvum, with only one intron per 10 genes, is nearly complete (Abrahamsen et al. 2004).
At the rates of intron loss estimated from the T. parva–T. annulata comparison, it would take 35 billion years for a lineage to lose half of its ancestral introns, yet many lineages have lost much higher fractions of their introns over much shorter times (Rogozin et al. 2003; Roy and Gilbert 2005b, c). At the rates of intron gain estimated from the T. parva–T. annulata comparison, it would take hundreds of billions of years to reach the intron densities of 5–8 introns per gene observed in a variety of eukaryotic lineages, even ignoring subsequent intron loss. Thus, intron loss and gain rates have clearly varied substantially through eukaryotic evolution.
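The half-life figure follows from the exponential loss model: the time for a lineage to lose half of its ancestral introns is ln 2 / l. A minimal sketch, using the rounded T. parva–T. annulata loss rate of 1.9 × 10^-5/My (the function name is hypothetical), gives a value in the tens of billions of years, the same order as the figure quoted above.

```python
from math import log

def half_life_my(loss_rate_per_my):
    """Time (My) for half of the ancestral introns to be lost at a constant rate."""
    return log(2.0) / loss_rate_per_my

half_life = half_life_my(1.9e-5)  # upper-bound loss rate from the Theileria comparison
print(f"{half_life / 1000:.0f} billion years")
```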
Transposable elements and rate variation
Why have intron loss and gain rates varied so significantly through the history of these species? Differences in intron number have traditionally been attributed to differences in selection based on biological differences, with for instance fast-replicating species experiencing more selection against excess DNA. Such an explanation does not readily present itself in this case. Both Plasmodium and Theileria are intracellular parasites of vertebrates transmitted by an arthropod vector. Both groups have a complex life cycle with multiple different asexual stages and a single meiotic cycle per transmission.
Instead, the rate of intron loss and gain mutations themselves may be a central part of the explanation (Roy and Gilbert 2006; Roy and Hartl 2006). Intron loss and gain may not follow a classical mode of near-constant rate but may instead be largely confined to dramatic episodes (Fedorov et al. 2003). Such a scenario is in fact expected in lineages in which the number of TEs varies through time. The most likely mechanism of intron loss is reverse transcription (likely largely by TE-encoded reverse transcriptase) of a spliced mRNA and subsequent recombination with the genomic copy (Mourier and Jeffares 2003; Roy and Gilbert 2005a). Intron gain is likely also dependent on TEs, either via reverse splicing of spliced-out RNA introns into previously intronless sites of mRNAs, followed by reverse transcription and subsequent double recombination with the genomic copy, or by simple conversion of coding sequence-interrupting TE insertions into new introns (Crick 1979; Sharp 1985; Roy 2004). It is well established that rates of TE insertion vary through time (e.g., Lander et al. 2001; Blumenstiel et al. 2002; Salem et al. 2005), and that individual TE families can go extinct in species that previously harbored them (Lander et al. 2001), or can be introduced to species that previously lacked these families (Robertson and Lampe 1995; de Almeida and Carareto 2005; Sanchez-Gracia et al. 2005; Diao et al. 2006). Such differences in TE numbers and insertion patterns through time might therefore lead to dramatic differences in both rates of intron loss and gain through time, leading to apparent non-linearity of rates of intron position change.
There are no known TEs in Plasmodium or Theileria. The dearth of intron loss and gain in Plasmodium and Theileria could therefore be due to the lack of known TEs in those groups, while the much higher degree of change between the two groups could reflect one or more TE invasions since the common ancestor. Given the high rates of gene conversion in some apicomplexans, a relatively short spurt of high TE abundance could lead to a large amount of intron loss, leading to large deviations from clock-like behavior.
What could drive variation in TE abundance in the evolution of apicomplexan parasites? Two possibilities present themselves. First, given the close association between both groups studied here and mammalian hosts, which have experienced dramatic changes in TE number through time (Lander et al. 2001), it is possible that fluctuation in host TE number could influence fluctuation in parasite TE number. Second, differences in demographics and relationships of host, vector, and parasite could create fluctuations in TE number. During times of low rates of parasite transmission, most infections may contain only a single parasite genotype, while increases in transmission may increase rates of zygote formation between unlike genotypes (Ferreira et al. 1998; Razakandrainibe et al. 2005). Increases in transmission could accompany host or vector switches due to lack of immune defense against the newly introduced parasite or to increased virulence (Boyd 1949; Waters et al. 1991; Mu et al. 2005). Since theory predicts that proliferation of TEs under certain conditions requires sexual reproduction between individuals of unlike genotype (Hickey 1982), this could lead to transient conditions permissive for TE proliferation.
Implications to gene prediction and genome annotation
This is at least the third genome-wide study to show that in conserved regions of alignment, the vast majority of intron positions can be conserved over very long evolutionary times (Roy et al. 2003; Roy and Hartl 2006). In each of these pairwise comparisons of intron positions between pairs of putatively orthologous genes, a very large number of apparently discordant intron positions are cases in which an intron in one of the orthologous genes lies adjacent to a gap in the alignment of that same sequence. Such cases are very simply explained if a sequence that has been annotated as exonic in one of the genes is in fact an intron, or if the annotated intron sequence in the other gene is in fact exonic. These findings suggest that annotations could be improved, perhaps greatly, by comparison of apparently orthologous sequences, to determine whether corresponding sequence in the two genes may be called either both intronic or both exonic.
Conclusions
We show that intron loss and gain has been very scant between T. parva and T. annulata, that introns in P. falciparum are largely conserved in the distant genus Theileria, and that there are large apparent differences in the rate of intron loss/gain divergence between closely and distantly related apicomplexan species.
Methods
Sequences and orthologous gene pair definition
We downloaded the T. parva and T. annulata genome sequences and annotations from GenBank (accession numbers AAGK01000001.1 and CR940346.1, respectively), and the P. falciparum genome sequence and annotation from PlasmoDB (plasmodb.org, version 10.03.2002.v2). We also downloaded the sequence and annotation for the largest chromosome of P. tetraurelia (Zagulski et al. 2004) from GenBank. Reciprocal BLASTP searches between T. parva and T. annulata and between T. parva and P. falciparum yielded 3305 and 1279 pairs of putatively orthologous genes, respectively. We used ClustalW with default parameters to align protein sequences of each pair and mapped the intron positions onto the resultant alignments using a custom Perl program. BLASTP searches of the P. tetraurelia sequences against T. parva yielded 464 putatively orthologous gene pairs.
Analysis of T. parva–T. annulata alignments
For each T. parva–T. annulata gene pair, we first excluded intron position discrepancies that were due to several obvious and recurrent annotation errors or concerned introns in doubtful regions of alignment. We excluded intron positions with less than 50% amino acid-level identity in the 15 aligned amino acids on each side (thus retaining positions near gaps). We excluded intron positions in one sequence opposite or within five positions of a 5-amino-acid or greater gap in the same sequence, as such cases are easily explained as an exonic stretch of sequence having been erroneously called an intron by the annotation (in the intron-containing sequence) or vice versa (in the other sequence). However, we retained intron positions adjacent to or within a 15-amino-acid or longer gap in the other sequence, as such cases are not explicable as simple annotation errors and could reflect intron loss or gain in which the adjacent sequences have been added/lost at the same time. We next excluded sequences that fell at the beginning or end of the alignment or in regions before/after the first/last region of good alignment. Custom Perl programs were written to perform these filters. Each apparent discordance was then analyzed by eye. The vast majority of remaining cases concerned a species-specific intron position close in the alignment to a shared intron position, with an intervening gap and no intervening region of homology. Such cases are easily explained as an error in the prediction of the boundary of one intron or the other, such that an intron–exon–intron had been called a single intron or vice versa. A few other cases involved very slight (<5 bp) differences in intron positions between species, likely because of annotation error. This left 22 intron discordances, all in genes with well-conserved synteny. We performed BLASTN searches of each remaining intron against the corresponding genome. In no case was a clear sequence similarity found between the discordant intron and another genomic element.
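The flanking-identity filter described above can be sketched as follows. This is an illustration in Python, not the Perl programs used for the analysis. It assumes that the 50% threshold is applied separately to the 15 alignment columns on each side of the intron position, that gap columns count as non-identical, and that windows are simply truncated at the alignment ends; all function and parameter names are hypothetical.

```python
def flank_identity(seq_a, seq_b, start, end):
    """Fraction of alignment columns in [start, end) with identical residues.

    Columns containing a gap ('-') never count as identical.
    """
    cols = range(max(start, 0), min(end, len(seq_a)))
    if not cols:
        return 0.0
    matches = sum(1 for i in cols if seq_a[i] == seq_b[i] and seq_a[i] != "-")
    return matches / len(cols)

def in_conserved_region(seq_a, seq_b, pos, window=15, min_identity=0.5):
    """True if both flanks of alignment column `pos` meet the identity threshold."""
    left = flank_identity(seq_a, seq_b, pos - window, pos)
    right = flank_identity(seq_a, seq_b, pos, pos + window)
    return left >= min_identity and right >= min_identity

# Toy aligned protein pair (hypothetical sequences), intron mapped to column 15
aligned_a = "MKTLLVAGSSRDEQW" + "PLNVTYIHGACDEFK"
aligned_b = "MKTLLVAGSSRDEQW" + "PLNVTYIH-ACDEFK"
print(in_conserved_region(aligned_a, aligned_b, 15))  # True
```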
Analysis of T. parva-P. falciparum and T. parva-P. tetraurelia alignments
Because of the much greater divergence between these species, we applied a different definition of “conserved regions.” We required that one third of the 25 positions in the alignment on either side of an intron (including gaps) were conserved. For the T. parva–P. falciparum alignments, this yielded a total of 2060 introns. The analysis was also performed using a more restrictive criterion of 50% conservation over 15 alignment positions on each side, excluding intron positions with alignment gaps within five positions. Degree of intron position conservation under these conditions was only marginally higher (64.0% of P. falciparum introns present in T. parva; 20.0% of T. parva introns present in P. falciparum). We further identified 597 gene pairs with evidence of conserved synteny between T. parva and P. falciparum. An orthologous pair was considered to be syntenic if each gene fell within 10 kb of the corresponding gene from a second putatively orthologous pair. The rate of intron conservation in this subset of genes was very similar (60.6% of P. falciparum introns present in T. parva; 17.0% of T. parva introns present in P. falciparum). For the T. parva–P. tetraurelia comparison, there were 24 intron positions in regions of conserved alignments.
Presence/absence of discordant Theileria introns in other apicomplexans
To determine whether each of the 22 introns present in only one of the two Theileria species was present in distantly related apicomplexans, we ran BLASTP searches of the corresponding protein sequences against the Eimeria tenella GeneDB database of preliminary gene predictions (http://www.genedb.org/genedb/etenella/) and ran TBLASTN searches against E. tenella (http://www.sanger.ac.uk/Projects/E_tenella/) and Babesia bigemina contigs (http://www.sanger.ac.uk/Projects/B_bigemina/) to identify possible unannotated or misannotated sequences. For good GeneDB hits, we determined the presence/absence by inspecting the intron positions in the GeneDB entry; for contig hits, we determined the presence/absence of a gap in the alignment at a position corresponding to the intron position. In a few cases, sequence similarity with the contig ended abruptly at the intron position, although sequence homologous to the other flanking exon could not be found. Such cases are easily explained if the intron is present in the species but only one flanking exon is currently represented in the genome assembly; they were therefore scored as probable intron presence.
Intergenic distances
We calculated average flanking intergenic distance for all T. parva and all P. falciparum genes and for the 21 Theileria genes and 21 Plasmodium genes (Roy and Hartl 2006) experiencing intron loss/gains. We used Monte Carlo simulation to generate 10,000 random sets of 21 genes for Theileria. A total of 4811/10,000 sets had a higher average distance than the real set, yielding P = 0.48.
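The resampling procedure can be sketched as below. This is a minimal illustration that assumes genes are drawn without replacement and that the P value is the fraction of random sets whose mean flanking distance exceeds the observed mean; the function name, the seed, and the toy distances are hypothetical and are not the data used in the study.

```python
import random

def monte_carlo_p(all_distances, observed_mean, set_size, n_sets=10000, seed=0):
    """Fraction of random gene sets whose mean distance exceeds the observed mean."""
    rng = random.Random(seed)
    higher = 0
    for _ in range(n_sets):
        sample = rng.sample(all_distances, set_size)  # draw without replacement
        if sum(sample) / set_size > observed_mean:
            higher += 1
    return higher / n_sets

# Toy intergenic distances (hypothetical), not the T. parva values
toy_distances = list(range(500, 4500, 10))
toy_observed = sum(toy_distances) / len(toy_distances)
p_toy = monte_carlo_p(toy_distances, toy_observed, set_size=21, n_sets=2000)
print(0.0 <= p_toy <= 1.0)  # True
```

With the counts reported above, 4811 of 10,000 random sets exceeding the real set corresponds to P = 0.48.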
Footnotes
Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5410606.
References
- Abrahamsen M.S., Templeton T.J., Enomoto S., Abrahante J.E., Zhu G., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Templeton T.J., Enomoto S., Abrahante J.E., Zhu G., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Enomoto S., Abrahante J.E., Zhu G., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Abrahante J.E., Zhu G., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Zhu G., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Lancto C.A., Deng M., Liu C., Widmer G., Tzipori S., Deng M., Liu C., Widmer G., Tzipori S., Liu C., Widmer G., Tzipori S., Widmer G., Tzipori S., Tzipori S., et al. Complete genome sequence of the apicomplexan. Cryptosporidium parvum. Science. 2004;304:441–445. doi: 10.1126/science.1094786. [DOI] [PubMed] [Google Scholar]
- Babenko V.N., Rogozin I.B., Mekhedov S.L., Koonin E.V., Rogozin I.B., Mekhedov S.L., Koonin E.V., Mekhedov S.L., Koonin E.V., Koonin E.V. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res. 2004;32:3724–3733. doi: 10.1093/nar/gkh686. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Blumenstiel J.P., Hartl D.L., Lozovsky E.R., Hartl D.L., Lozovsky E.R., Lozovsky E.R. Patterns of insertion and deletion in constrasting chromatin domains. Mol. Biol. Evol. 2002;19:2211–2225. doi: 10.1093/oxfordjournals.molbev.a004045. [DOI] [PubMed] [Google Scholar]
- Boyd M.F. Historical review. In: Boyd M.F., editor. Malariology. Saunders; Philadelphia: 1949. pp. 3–25. [Google Scholar]
- Castillo-Davis C.I., Bedford T.B., Hartl D.L., Bedford T.B., Hartl D.L., Hartl D.L. Accelerated rates of intron gain/loss and protein evolution in duplicate genes in human and mouse malaria parasites. Mol. Biol. Evol. 2004;21:1422–1427. doi: 10.1093/molbev/msh143. [DOI] [PubMed] [Google Scholar]
- Crick F. Split genes and RNA splicing. Science. 1979;204:264–271. doi: 10.1126/science.373120. [DOI] [PubMed] [Google Scholar]
- Csuros M. Likely scenarios of intron evolution. Third RECOMB Satellite Workshop on Comparative Genomics. Lecture Notes Comp. Sci. 2005;3678:47–60. [Google Scholar]
- de Almeida L.M., Carareto C.M., Carareto C.M. Multiple events of horizontal transfer of the Minos transposable element between Drosophila species. Mol. Phylogenet. Evol. 2005;110:583–594. doi: 10.1016/j.ympev.2004.11.026. [DOI] [PubMed] [Google Scholar]
- Diao X., Freeling M., Lisch D., Freeling M., Lisch D., Lisch D. Horizontal transfer of a plant transposon. PLoS Biol. 2006;4:e5. doi: 10.1371/journal.pbio.0040005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dibb N.J., Newman A.J., Newman A.J. Evidence that introns arose at proto-splice sites. EMBO J. 1989;8:2015–2021. doi: 10.1002/j.1460-2075.1989.tb03609.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Edvardsen R.B., Lerat E., Maeland A.D., Flat M., Tewari R., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Lerat E., Maeland A.D., Flat M., Tewari R., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Maeland A.D., Flat M., Tewari R., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Flat M., Tewari R., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Tewari R., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Jensen M.F., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Lehrach H., Reinhardt R., Seo H.C., Chourrout D., Reinhardt R., Seo H.C., Chourrout D., Seo H.C., Chourrout D., Chourrout D. Hypervariable and highly divergent intron–exon organizations in the chordate Oikopleura dioica . J. Mol. Evol. 2004;59:448–457. doi: 10.1007/s00239-004-2636-5. [DOI] [PubMed] [Google Scholar]
- Escalante A.A., Ayala F.J. Evolutionary origin of Plasmodium and other Apicomplexa based on rRNA genes. Proc. Natl. Acad. Sci. 1995;92:5793–5797. doi: 10.1073/pnas.92.13.5793.
- Fedorov A., Merican A.F., Gilbert W. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc. Natl. Acad. Sci. 2002;99:16128–16133. doi: 10.1073/pnas.242624899.
- Fedorov A., Roy S., Fedorova L., Gilbert W. Mystery of intron gain. Genome Res. 2003;13:2236–2241. doi: 10.1101/gr.1029803.
- Ferreira M.U., Lin Q., Kimura M., Ndawi B.T., Tanabe K., Kawamoto F. Allelic diversity in the merozoite surface protein-1 and epidemiology of multiple-clone Plasmodium falciparum infections in northern Tanzania. J. Parasitol. 1998;84:1286–1289.
- Hickey D.A. Selfish DNA: a sexually-transmitted nuclear parasite. Genetics. 1982;101:519–531. doi: 10.1093/genetics/101.3-4.519.
- Jeffares D.C., Mourier T., Penny D. The biology of intron gain and loss. Trends Genet. 2006;22:16–22. doi: 10.1016/j.tig.2005.10.006.
- Kent W.J., Zahler A.M. Conservation, regulation, synteny, and introns in a large-scale C. briggsae–C. elegans genomic alignment. Genome Res. 2000;10:1115–1125. doi: 10.1101/gr.10.8.1115.
- Kiontke K., Gavin N.P., Raynes Y., Roehrig C., Piano F., Fitch D.H. Caenorhabditis phylogeny predicts convergence of hermaphroditism and extensive intron loss. Proc. Natl. Acad. Sci. 2004;101:9003–9008. doi: 10.1073/pnas.0403094101.
- Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., Fitzhugh W., et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062.
- Moriyama E.N., Petrov D.A., Hartl D.L. Genome size and intron size in Drosophila. Mol. Biol. Evol. 1998;15:770–773. doi: 10.1093/oxfordjournals.molbev.a025980.
- Mourier T., Jeffares D.C. Eukaryotic intron loss. Science. 2003;300:1393. doi: 10.1126/science.1080559.
- Mu J., Joy D.A., Duan J., Huang Y., Carlton J., Walker J., Barnwell J., Beerli P., Charleston M.A., Pybus O.G., et al. Host switch leads to emergence of Plasmodium vivax malaria in humans. Mol. Biol. Evol. 2005;22:1686–1693. doi: 10.1093/molbev/msi160.
- Neafsey D.E., Hartl D.L., Berriman M. Evolution of noncoding and silent coding sites in the Plasmodium falciparum and Plasmodium reichenowi genomes. Mol. Biol. Evol. 2005;22:1621–1626. doi: 10.1093/molbev/msi154.
- Nielsen C.B., Friedman B., Birren B., Burge C.B., Galagan J.E. Patterns of intron gain and loss in fungi. PLoS Biol. 2004;2:e422. doi: 10.1371/journal.pbio.0020422.
- Qiu W.G., Schisler N., Stoltzfus A. The evolutionary gain of spliceosomal introns: Sequence and phase preferences. Mol. Biol. Evol. 2004;21:1252–1263. doi: 10.1093/molbev/msh120.
- Pain A., Renauld H., Berriman M., Murphy L., Yeats C.A., Weir W., Kerhornou A., Aslett M., Bishop R., Bouchier C., et al. Genome of the host-cell transforming parasite Theileria annulata compared with T. parva. Science. 2005;309:131–133. doi: 10.1126/science.1110418.
- Perler F., Efstratiadis A., Lomedico P., Gilbert W., Kolodner R., Dodgson J. The evolution of genes: the chicken preproinsulin gene. Cell. 1980;20:555–566. doi: 10.1016/0092-8674(80)90641-8.
- Razakandrainibe F.G., Durand P., Koella J.C., De Meeus T., Rousset F., Ayala F.J., Renaud F. “Clonal” population structure of the malaria agent Plasmodium falciparum in high-infection regions. Proc. Natl. Acad. Sci. 2005;102:17388–17393. doi: 10.1073/pnas.0508871102.
- Robertson H.M., Lampe D.J. Recent horizontal transfer of a mariner transposable element among and between Diptera and Neuroptera. Mol. Biol. Evol. 1995;12:850–862. doi: 10.1093/oxfordjournals.molbev.a040262.
- Rogozin I.B., Wolf Y.I., Sorokin A.V., Mirkin B.G., Koonin E.V. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr. Biol. 2003;13:1512–1517. doi: 10.1016/s0960-9822(03)00558-x.
- Roy S.W. The origin of recent introns: transposons? Genome Biol. 2004;5:251. doi: 10.1186/gb-2004-5-12-251.
- Roy S.W., Gilbert W. The pattern of intron loss. Proc. Natl. Acad. Sci. 2005a;102:713–718. doi: 10.1073/pnas.0408274102.
- Roy S.W., Gilbert W. Complex early genes. Proc. Natl. Acad. Sci. 2005b;102:1986–1991. doi: 10.1073/pnas.0408355101.
- Roy S.W., Gilbert W. Rates of intron loss and gain: Implications for early eukaryotic evolution. Proc. Natl. Acad. Sci. 2005c;102:5773–5778. doi: 10.1073/pnas.0500383102.
- Roy S.W., Gilbert W. The evolution of spliceosomal introns: Patterns, puzzles, and progress. Nat. Rev. Genet. 2006;7:211–221. doi: 10.1038/nrg1807.
- Roy S.W., Hartl D.L. Very little intron loss/gain in Plasmodium: Intron loss/gain mutation rates and intron number. Genome Res. 2006;16:750–756. doi: 10.1101/gr.4845406.
- Roy S.W., Fedorov A., Gilbert W. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc. Natl. Acad. Sci. 2003;100:7158–7162. doi: 10.1073/pnas.1232297100.
- Salem A.H., Ray D.A., Hedges D.J., Jurka J., Batzer M.A. Analysis of the human Alu Ye lineage. BMC Evol. Biol. 2005;5:18. doi: 10.1186/1471-2148-5-18.
- Sanchez-Gracia A., Maside X., Charlesworth B. High rate of horizontal transfer of transposable elements in Drosophila. Trends Genet. 2005;21:200–203. doi: 10.1016/j.tig.2005.02.001.
- Seo H.C., Kube M., Edvardsen R.B., Jensen M.F., Beck A., Spriet E., Gorsky G., Thompson E.M., Lehrach H., Reinhardt R., et al. Miniature genome in the marine chordate Oikopleura dioica. Science. 2001;294:2506. doi: 10.1126/science.294.5551.2506.
- Sharp P.A. On the origins of RNA splicing and introns. Cell. 1985;42:397–400. doi: 10.1016/0092-8674(85)90092-3.
- Tanabe K., Sakihama N., Hattori T., Ranford-Cartwright L., Goldman I., Escalante A.A., Lal A.A. Genetic distance in housekeeping genes between Plasmodium falciparum and Plasmodium reichenowi and within P. falciparum. J. Mol. Evol. 2004;59:687–694. doi: 10.1007/s00239-004-2662-3.
- Vanacova S., Yan W., Carlton J.M., Johnson P.J. Spliceosomal introns in the deep-branching eukaryote Trichomonas vaginalis. Proc. Natl. Acad. Sci. 2005;102:4430–4435. doi: 10.1073/pnas.0407500102.
- Waters A.P., Higgins D.G., McCutchan T.F. Plasmodium falciparum appears to have arisen as a result of lateral transfer between avian and human hosts. Proc. Natl. Acad. Sci. 1991;88:3140–3144. doi: 10.1073/pnas.88.8.3140.
- Zagulski M., Nowak J.K., Le Mouel A., Nowacki M., Migdalski A., Gromadka R., Noel B., Blanc I., Dessen P., Wincker P., et al. High coding density on the largest Paramecium tetraurelia somatic chromosome. Curr. Biol. 2004;14:1397–1404. doi: 10.1016/j.cub.2004.07.029.
- Zilversmit M., Hartl D.L. Evolutionary history and population genetics of human malaria parasites. In: Sherman I.W., editor. Molecular approaches to malaria. American Society for Microbiology Press; Washington, D.C.: 2005. pp. 95–109.