Abstract
The majority of eukaryotic diversity is hidden in protists, yet our current knowledge of processes and structures in the eukaryotic cell is almost exclusively derived from multicellular organisms. The increasing sensitivity of molecular methods and growing interest in microeukaryotes have only recently demonstrated that many features so far considered to be universal for eukaryotes actually exist in strikingly different versions. In other words, during their long evolutionary histories, protists have solved general biological problems in many more ways than previously appreciated. Interestingly, some groups have broken more rules than others, and the Euglenozoa and the Alveolata stand out in this respect. A review of the numerous odd features in these 2 groups allows us to draw attention to the high level of convergent evolution in protists, which perhaps reflects limits to the ways in which certain features can be altered. Moreover, the appearance of one deviation in an ancestor can constrain the set of possible downstream deviations in its descendants, so features that might be functionally independent can still be evolutionarily linked. What functional advantage may be conferred by the excessive complexity of euglenozoan and alveolate gene expression, organellar genome structure, and RNA editing and processing has been thoroughly debated, but we suggest these are more likely the products of constructive neutral evolution, and as such do not necessarily confer any selective advantage at all.
Keywords: mitochondria, plastids, comparative genomics, molecular evolution, Trypanosoma
The vast majority of eukaryotes on the planet, in terms of both abundance and diversity, are microbial. Generalities about fundamental biological processes are based on knowledge of a few model organisms, yet many microeukaryotes have deviated well beyond these generalities over the course of evolutionary history, which is a reflection of the deep phylogenetic distances between eukaryotic lineages that are neither plants nor animals nor fungi. It is also clear that some groups of protists have broken more rules than others, and 2 diverse lineages that particularly stand out in this regard are the Euglenozoa and the Alveolata (Fig. 1). In members of both of these groups, fundamental structures and processes have substantially deviated from those of other eukaryotes; however, perhaps even more interestingly, both groups have frequently departed in the same general fashion, resulting in surprising levels of convergence that suggest limits to the ways these features can be altered.
The Euglenozoa is a monophyletic group of single-celled flagellates within the Excavata, composed of 2 major subgroups (kinetoplastids and euglenids) and 1 smaller subgroup (diplonemids) (Fig. 1). Members of Euglenozoa have diverse modes of nutrition, including predation, parasitism, and photoautotrophy. Predatory euglenozoans are phylogenetically widespread within the group and tend to have diverse feeding apparatuses, feeding strategies, and prey preferences (1). For instance, some predatory species are limited to small prey such as bacteria, whereas other species frequently consume larger prey, such as other eukaryotic cells. Photoautotrophy is restricted to a specific subclade of euglenids and originated via secondary endosymbiosis between a predatory euglenid and a green algal prey (1). Parasitic and commensal euglenozoans appear to have evolved independently several times within kinetoplastids (2), and some species (e.g., Trypanosoma and Leishmania) cause important human illnesses such as African sleeping sickness, Chagas disease, and leishmaniases.
The Alveolata, another monophyletic group of primarily single-celled eukaryotes that have adopted similarly diverse modes of life, is composed of 3 major subgroups: ciliates, apicomplexans, and dinoflagellates (Fig. 1). All 3 subgroups contain predatory and parasitic species, and only dinoflagellates and an unusual lineage called Chromera are known to contain fully integrated and photosynthetic plastids (3). Photosynthetic dinoflagellates play important roles as planktonic primary producers in oceanic ecosystems, and some of the lineages form symbiotic relationships with corals (e.g., Symbiodinium) and are critical for maintaining the health of reef systems around the world. Nonphotosynthetic plastids have independently evolved in some dinoflagellates and in apicomplexans; the latter are all obligate parasites of animals, and a few are exceedingly important disease organisms of vertebrates (e.g., Cryptosporidium, Toxoplasma, and Plasmodium). Although plastids have not been definitively demonstrated in ciliates, several independent lineages in this group harbor photosynthetic symbionts that are intermittently replenished by feeding.
Both euglenozoans and alveolates have a reputation for “doing things their own way,” which is to say that they have developed seemingly unique ways to build important cellular structures or carry out molecular tasks critical for their survival. Why such hotspots for the evolution of novel solutions to problems should exist in the tree of life is not entirely clear. However, the deeper we look into these groups, the more often it is found that they are also evolving strikingly similar mechanisms for achieving these essential biological functions. Significantly, however, there is a great weight of phylogenetic data that show these lineages are not closely related: of the 5 eukaryotic supergroups hypothesized to explain all eukaryotic diversity, alveolates and euglenozoans fall into 2 different supergroups, chromalveolates and excavates, respectively (Fig. 1). The support for these supergroups as a whole remains contentious (4–7), but there is strong support from phylogenomics and many individual phylogenies and rare genomic characters for a specific relationship between alveolates and stramenopiles on one hand, and euglenozoans and heteroloboseans on the other hand (7). Moreover, no analysis of eukaryotic phylogeny has ever suggested they are closely related to one another. Still more significantly, the majority of the characteristics we discuss below are not universal to all members of either alveolates or euglenozoans, but rather appear to have evolved within a subgroup of each lineage. Altogether, the distribution of these characteristics can really only adequately be explained by convergent evolution. Below, we will examine some of these examples of convergence and what the cooccurrence of convergent traits may tell us about how they evolved.
Convergent Evolution
Recognizing the independent origins of similar traits in distantly related lineages—convergent evolution—allows us to better understand how different environmental and intrinsic conditions have shaped the characteristics of organisms over time; each specific example of convergence reflects a fundamental biological problem and its possible solutions. The causes of convergent evolution are varied and can involve camouflage, mimicry, biomechanical optimization, molecular constraints, developmental canalization and character-state reversals. Examples of convergent evolution range from the biochemical level to the behavioral level and are best characterized within animals and land plants (8–11), which collectively represent only a small portion of the full tree of eukaryotes (4, 6). The occurrence and adaptive significance of convergent evolution in microbial eukaryotes, by contrast, is poorly understood, but it is clear from several examples that convergent traits can evolve over vast phylogenetic distances (6). Convergence in very distantly-related lineages is particularly compelling because the influence of homologous developmental programs (i.e., intrinsic conditions) in constraining subsequent evolution should be minimal if not absent altogether (6). Therefore, improved understanding of convergent evolution in distantly related microbes will provide a much broader framework for evaluating the forces of natural selection and the potential role of constructive neutrality during the evolution of ultrastructural systems and complex molecular processes.
Eukaryotic cells are built from a few core systems that have become tremendously diverse over the course of evolutionary history. Some systems are remarkably conserved, in particular fundamental molecular processes such as information flow or core metabolism, but even in these systems substantial modifications accumulated in some lineages. In other cases, conserved ancestral building blocks (such as the proteinaceous cytoskeleton involved in locomotion and feeding) are widely shared, but have been used in different ways with diverse outcomes. The origins of other components are less clear and likely more recent, but also show a great deal of morphological variation (examples include photoreception systems or surface armor). Taken together, the diversity of cellular and molecular systems in microbial eukaryotes is simply staggering, and some emerging patterns indicate that convergent evolution played a major role in shaping the overall organization of eukaryotic cells at all levels (6, 11).
Below, we describe several features for which excessive complexity is a common denominator. Such complexity is counterintuitive in single-celled organisms, especially when selective advantages for these complex structures and/or mechanisms remain elusive. We argue that the theory of constructive neutral evolution (12), which invokes nonselective factors such as excess capacities, can best account for their emergence.
Cellular Organization of Euglenids and Dinoflagellates
The comparable combinations of ultrastructural features in euglenozoans and alveolates (Table S1) have been appreciated for decades (13, 14). For instance, the cells of benthic predatory species of euglenids and dinoflagellates are streamlined and dorsoventrally flattened and possess batteries of extrusive organelles, or extrusomes, that are similar in morphology and behavior (Fig. 2). The mucocysts of euglenids and the trichocysts of dinoflagellates are compact, linear bodies containing a highly organized latticed framework of carbohydrates. When these bodies are released through discrete pores in the surface of the cell, the extrusomes become hydrated and rapidly extend in length as spear-like threads (Fig. 2 D and I) (15). Although the origin and function of extrusomes in both groups are not clear, they probably play a role in escape responses, defense, and capturing prey cells.
Benthic euglenids and dinoflagellates, in particular, adhere to substrates and are capable of gliding motility using 2 heterodynamic flagella equipped with flagellar hairs (or mastigonemes). In both groups, the recurrent flagellum sits within a groove on the ventral surface of the cell and is oriented backwards. Euglenids and dinoflagellates also possess cytoskeletal elements (called “paraxial/paraflagellar rods,” which run in parallel to the 9 + 2 microtubular axonemes within each flagellum) that are not found in any other group of eukaryotes. A major difference between euglenids and dinoflagellates, however, is the structure, orientation and motility of the anterior flagellum. The anterior paraxial rod in euglenids is oriented on the ventral side of the axoneme, is stiff and held straight in front of the cell; the paraxial rod functions with the flagellar hairs to produce gliding forces (16). By contrast, the anterior flagellum of dinoflagellates forms a transverse loop or spiral around the circumference of the cell and usually sits within a transverse groove called the cingulum (Fig. 1). The coiled transverse flagellum bears hairs and a flagellar membrane that connects it to the base of the cingulum, and this entire apparatus is capable of producing forces on the surrounding medium that tend to spin the cell around its longitudinal axis.
Many free-living euglenids and dinoflagellates engulf prey organisms using sophisticated feeding apparatuses positioned on the ventral side of the cell. Although the evolution of these apparatuses is a shared feature, the details of these ultrastructural systems are quite distinctive. For instance, there are a few chief components present in most of the predatory species of euglenids described so far, namely “rods” and “vanes.” Two feeding rods oriented longitudinally within the cell are composed of microtubules and amorphous proteinaceous material. These stiff elements provide structural support for gripping and internalizing prey cells and work in concert with 4–5 membranous vanes that are usually reinforced with additional microtubules (1). The vanes originate from the rods, form the inside core of the feeding apparatus, and create space within the apparatus by opening up in a pinwheel-like fashion; the same mechanism can cause the apparatus to protrude from the cell when feeding. By contrast, the diversity and complexity of feeding apparatuses in dinoflagellates probably reflect independent origins in different lineages within the group. The feeding apparatus in dinoflagellates can be simple pockets that unzip when prey is drawn into the cell, dynamic siphons that suck out the cytoplasm of prey cells in a straw-like fashion or expansive veils that completely envelop large filamentous prey and fold it methodically into manageable packets small enough to ingest. Different kinds of feeding apparatuses are often associated with different kinds of photoreceptive eyespots and ocelloids, suggesting that in some dinoflagellates, photoreceptors are adaptations for detecting and capturing photosynthetic prey. Some predatory euglenids with a rod-and-vane feeding apparatus also possess a photoreceptor system, as a putative stigma and photosensory swelling (17), and this combination of features may serve the same basic function as in dinoflagellates.
Another convergent similarity between benthic euglenids and dinoflagellates is the tendency to reinforce their cell surfaces with robust proteinaceous layers beneath the plasma membrane (Fig. 2 C and H). Euglenids possess a distinctive (and synapomorphic) pellicle consisting of discontinuous strips that run longitudinally or helically over the entire cell surface (1). The strips articulate along their lateral margins, and in many euglenids these zones facilitate sliding between strips, which produces rhythmic deformations in cell shape called “euglenoid movement.” Benthic dinoflagellates can also change their shape, especially after engulfing large and oddly shaped prey cells. The proteinaceous surface layer in dinoflagellates, called the “dinoflagellate pellicle,” forms a continuous and flexible sheath beneath alveolar vesicles, which may in turn be filled with cellulosic material. Both the euglenid and the dinoflagellate pellicles comprise novel classes of proteins: articulins and epiplasmins (14). Although it is unclear whether these proteins represent an example of molecular convergence or distant homology, their presence in both euglenids and dinoflagellates underscores the striking similarities between these 2 very distantly related groups of eukaryotes.
At different points in their evolutionary history, both euglenids and dinoflagellates independently acquired photosynthesis via secondary endosymbiosis. Accordingly, some representatives of both groups contain at least 3 different genomes within 3 different cellular compartments: the nucleus, the plastid, and the mitochondrion. The general organization of the nucleus is a particularly notable feature that is shared by euglenids and dinoflagellates; both groups possess a conspicuous nucleus with a relatively large nucleolus and permanently condensed chromosomes (Fig. 2 B and G). The plastids in both groups also share the unusual features of 3 envelope membranes and a tendency to have thylakoids in stacks of 3 (Fig. 2 E and J) (13). However, the analogous similarities between euglenozoans and dinoflagellates do not end at the ultrastructural level. As described in the next 3 sections, the molecular processes associated with the nucleus, plastid, and mitochondrion also reflect high levels of convergent evolution.
The Nucleus: Spliced Leaders and Polycistronic mRNA Processing
The nuclear genomes of kinetoplastids and dinoflagellates have both acquired a long list of unusual characteristics. Some of these are unique to one lineage and very different in the other. For example, dinoflagellates have among the largest nuclear genomes known, and these genomes have a very low gene density and permanently condensed chromosomes that lack nucleosomes (18). Kinetoplastid genomes, however, are relatively small, gene-dense, and remain uncondensed during the cell cycle (19). Both genomes are notorious for their rich representation of modified nucleotides, but the nucleotides themselves are not the same: the hypermodified base J (β-d-glucopyranosyloxymethyluracil) is common in kinetoplastid telomeric regions, whereas dinoflagellates have a high proportion of 5-hydroxymethyluracil and 5-methylcytosine.
However, other dramatic alterations to these genomes have taken place convergently, and interestingly several characteristics have been altered in the same way in both lineages, in particular relating to how genes are arranged and transcribed, and how transcripts are processed. The canonical, simplified view of eukaryotic gene expression involves a single gene transcribed, capped, polyadenylated, spliced (if introns are present), and exported to the cytosol. Both kinetoplastids and dinoflagellates deviate from this canonical view in 2 significant ways that impact the way expression may be controlled.
The first of these is trans-splicing. The spliceosome is a large multisubunit complex that normally recognizes GT-AG bounded spliceosomal introns within eukaryotic genes, and catalyses their removal and the ligation of the flanking exons. Spliceosomal introns are very rare in trypanosomes (19), and available evidence suggests they are relatively rare in dinoflagellates as well (20). In contrast, every mRNA in both groups has a 5′ spliced leader (SL) sequence that is added by trans-splicing. The SL, also called a miniexon, is a short conserved sequence that is encoded by a high copy-number family of genes throughout the genome. In dinoflagellates, the same 22-bp fragment is added to all transcripts, and the sequence is also conserved across the entire group (21, 22). In kinetoplastids, the SLs are conserved within a given genome, but vary in size and sequence between species (23). The SL is expressed as a short RNA consisting of the leader sequence followed by a GT dinucleotide and a short stretch of sequence. Complementing this, mRNAs for protein-coding genes begin with a short stretch of sequence ending with an AG dinucleotide, followed by the 5′ untranslated region and the coding region. The spliceosome brings these 2 elements together and mediates the removal of the 2 intronic fragments and ligation of the SL to the 5′ end of the mRNA (Fig. 3) (23).
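The joining step described above can be illustrated with a toy string model. The following Python sketch is only an illustration under simplifying assumptions: the function name `trans_splice`, the `leader_len` parameter, and the example sequences are hypothetical, sequences are written in DNA notation to match the GT and AG dinucleotides named in the text, and real splice-site recognition involves far more than locating these 2 dinucleotides.

```python
def trans_splice(sl_rna: str, pre_mrna: str, leader_len: int) -> str:
    """Toy model of spliced-leader (SL) addition.

    sl_rna   : leader of length leader_len, then GT, then a short tail.
    pre_mrna : short upstream stretch ending in AG, then 5' UTR + coding region.
    Returns the leader joined to everything downstream of the first AG.
    """
    leader = sl_rna[:leader_len]
    # Donor side: the leader must be followed immediately by GT.
    if sl_rna[leader_len:leader_len + 2] != "GT":
        raise ValueError("SL RNA lacks a GT dinucleotide after the leader")
    # Acceptor side: this toy model simply takes the first AG it finds.
    ag = pre_mrna.find("AG")
    if ag == -1:
        raise ValueError("pre-mRNA lacks an AG acceptor dinucleotide")
    # Discard both intron-like fragments and ligate leader to message body.
    return leader + pre_mrna[ag + 2:]


# Hypothetical sequences, chosen only to make the junction easy to see.
sl = "AACTAACGCT" + "GT" + "ATGAGA"        # leader + GT + tail
pre = "TTTCTTTC" + "AG" + "CAAAATGGCTTGA"  # upstream stretch + AG + message
mature = trans_splice(sl, pre, leader_len=10)
```

In this sketch the tail of the SL RNA and the upstream stretch of the pre-mRNA are simply dropped, which corresponds to the removal of the 2 intronic fragments.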
The second major oddity shared by kinetoplastid and dinoflagellate nuclear gene expression is the presence of polycistronic messages. Once again, the canonical view of nuclear gene expression in eukaryotes centers around the transcription of a single gene at a time; this stands in contrast to prokaryotes, where multiple genes can be expressed on a single, multifunctional mRNA and many genes can be coregulated in operons. Complete genomic sequences from trypanosomatids demonstrate an organization where genes are distributed in contiguous clusters, ranging in size from a handful of genes to several hundreds. In these clusters, stretching up to >1 Mb, genes are oriented on the same strand, usually toward the telomeres, with adjacent clusters located on opposite strands (19). All of the genes within a contiguous cluster are transcribed on a single, sometimes very long, polycistronic mRNA. Relatively short AT-rich regions separate the clusters and are considered to contain the sites for transcription initiation and termination. Comparison of trypanosomatid genomes shows a high degree of conservation in gene order, even within clusters between flagellates that diverged 200–500 Mya (24).
It is important to point out that, in contrast to prokaryotes, these clusters do not contain genes of related function (19) and they are not coordinately regulated like bacterial operons, so they should not be considered operons. These polycistronic messages are not even translated intact, but are processed to monomeric mRNAs before translation; these monomeric mRNAs are the substrate for trans-splicing by the addition, at the 5′ end, of an SL already equipped with a methylated cap, followed by polyadenylation at the 3′ end (23).
Far less is known about the organization of dinoflagellate genomes. Due to the enormous size of their nuclear DNA, nearly all sequencing of dinoflagellate genes performed to date has focused on expressed sequence tags, which do not provide information on the context of the gene. Nevertheless, what little is known about dinoflagellate genomes suggests a fascinating parallel with kinetoplastids. It now appears that some genes are isolated in the genome, but others are organized as tandem repeats (20). These gene repeats are cotranscribed, resulting in polycistronic messages that differ from those of kinetoplastids in that the mRNAs have so far only been found to carry multiple copies of a single gene (20). These transcripts are apparently processed into monocistronic mRNAs, which are presumably the substrates for trans-splicing.
In kinetoplastids, the presence of polycistronic mRNAs, together with the absence of introns, is frequently argued to be an ancient holdover, frozen since their early divergence from other eukaryotes (25). However, this interpretation is flawed for several reasons, particularly because there is no evidence whatsoever for an ancient divergence of kinetoplastids (4). Nonetheless, the independent origin of the same features in dinoflagellates raises an intriguing alternative explanation, namely that the evolutionary origins of polycistronic mRNAs and trans-splicing are linked. This is all the more compelling when one considers that both features are also found together in the nematode Caenorhabditis elegans (26). It is unlikely that this is either functionally advantageous or an evolutionary relict, but rather that the evolution of one feature preconditions the genome by removing deleterious effects of the second feature. For example, the establishment of widespread SL addition in a nuclear genome could precondition that genome for the subsequent establishment of polycistronic transcription. Polycistronic mRNAs that would otherwise be deleterious could flourish simply because the processing pathway eliminates their deleterious effect (the inability to translate all but the first cistron). SL addition appears to be universal in both dinoflagellates and kinetoplastids (in C. elegans, 70% of mature mRNAs are produced through trans-splicing) (26). Polycistronic messages, however, are also near universal in kinetoplastids, whereas in dinoflagellates (and C. elegans) only a subset of genes are expressed on polycistronic mRNAs (20). Since so far only tandem duplications of closely related copies of the same gene are known in dinoflagellates, it would appear they may arise and dissolve continuously.
The functional impacts of SL addition and polycistronic transcription are also different in the 2 lineages. Posttranscriptional control may be somewhat restricted by the absence of sequence diversity at the 5′ end of mRNAs, but more importantly a heavy use of polycistronic messages eliminates the possibility of transcription-level differentiation of expression of any genes within the same cluster. In kinetoplastids, there is only a handful of promoters and a marked paucity of transcription factors (25), unavoidably leading to the general lack of control over transcription initiation. Indeed, in the well-studied T. brucei, virtually all nuclear DNA seems to be permanently transcribed. Consequently, control levels in kinetoplastids are confined to RNA processing, export, and half-life, as well as translation and protein stability (27). This is a good illustration of how convergent processes differ in the details in different lineages. In this case, the kinetoplastids cotranscribe many different genes whereas dinoflagellates cotranscribe many copies of the same gene, and as a result transcription-level control is likely not so severely affected in the latter group.
The Plastid: Three Membrane Plastids and Unique Targeting System
Plastids are known in both alveolates and euglenozoans to have been derived from secondary endosymbiosis: the uptake of a eukaryotic alga by another eukaryote. In the Euglenozoa, plastids are derived from a green alga and are relatively restricted, being found in a subset of euglenids and nowhere else, namely the “euglenophytes” (28). In the Alveolata, plastids are derived from a red alga and are more widespread and ancient, being known in dinoflagellates and apicomplexans, and suspected of originating before the divergence of alveolates (5). As with nuclear genomes, plastids have evolved a number of unusual characteristics, some unique and some arising convergently. Euglenophyte plastid genomes are home to some unique self-splicing introns (29), whereas the dinoflagellate plastid genome has been massively reduced in coding content and broken into single gene minicircles with polyuridylylated transcripts (30). Curiously, both features are also found in kinetoplastid mitochondria (31).
Once again, however, 2 probably interconnected features have arisen in both groups. The vast majority of secondary plastids are bounded by 4 membranes. Most proteins in these plastids are encoded in the nucleus and are posttranslationally targeted to the organelle by way of a 2-part pathway beginning with the endomembrane system and followed by the original primary plastid targeting system. In dinoflagellates and euglenophytes, however, the plastid is novel in that it is bounded by 3 membranes rather than 4. It was argued that this may reflect a different mechanism of plastid uptake, specifically that in these lineages plastids arose through myzocytosis whereas other secondary plastids arose through endocytosis. Myzocytosis is a mode of predation where a cell pierces its prey and sucks the prey cytoplasm directly into a digestive vacuole, leaving the prey wall and membrane behind. Although not as common as endocytosis of whole prey cells, myzocytosis is known in both dinoflagellates and euglenozoans, leading to the suggestion that their plastids originated from a myzocytosed alga, and therefore lacked its plasma membrane (32). However, plastids in the closest relatives of dinoflagellates, apicomplexans and Chromera, are bounded by 4 membranes and have now been shown to be orthologous to the dinoflagellate plastid (3). Accordingly, in at least dinoflagellates, plastids must have originated in the same fashion as some 4-membrane counterparts and at one time been bounded by 4 membranes, which means that the origins of 3 membranes around the plastids in dinoflagellates and euglenophytes cannot be attributed to a shared, unusual mechanism such as myzocytosis.
Interestingly, the system used to target proteins to 3-membrane plastids is also different in subtle but important ways from that of canonical secondary plastids with 4 enveloping membranes, and the same variations have been adopted in dinoflagellates and euglenophytes. The N-terminal leaders that direct proteins to canonical secondary plastids include a signal peptide (to enter the endomembrane system) and a transit peptide (to cross the 2 plastid membranes), and are similar in secondarily derived red and green plastids. In dinoflagellates and euglenophytes, however, an additional hydrophobic domain is found following the transit peptide of some, but intriguingly not all, proteins (33, 34). This domain is thought to anchor the proteins in the endomembrane, so as the protein moves through the Golgi apparatus the leader lies in the lumen but the mature protein remains in the cytosol (35, 36). The number of membranes and these unusual characteristics of targeting have both evolved convergently in dinoflagellates and euglenophytes, which suggests some link in how these 2 features evolved. Unfortunately, the mechanism by which proteins cross the membrane that is missing in both dinoflagellates and euglenophytes (the plasma membrane of the engulfed alga) is the most poorly understood step in the targeting pathway to canonical secondary plastids, so any specific model for preconditioning would be highly speculative.
The Mitochondrion: RNA Editing and Genome Breakdown
The mitochondrial genomes of dinoflagellates and kinetoplastids are both highly unorthodox, and once again have evolved some unique features and several common complex characteristics. The kinetoplastid mitochondrion contains uniquely structured, protein-rich mitochondrial ribosomes with a reduced RNA component, unusual fatty acid synthesis and respiratory complexes such as the prokaryotic-like complex I, alternative terminal oxidase, massive tRNA import, and incomplete Krebs cycle. The complex genome of the kinetoplastid mitochondrion is known as kinetoplast DNA or kDNA, its genes being subjected to unprecedented levels of RNA editing (Fig. 4) (31). Dinoflagellate mitochondria have received far less attention, but it is now emerging that their genomes have also evolved a number of highly unusual characteristics, including trans-splicing, tRNA import, fragmented rRNAs, the loss of start and stop codons, and an oligouridine tail (37, 38). Most strikingly, however, the structure of dinoflagellate mitochondrial genomes has also broken down into many fragments, the transcripts of which have high levels of RNA editing; however, as we discuss below, the details of both systems differ between kinetoplastids and dinoflagellates (Fig. 4).
Within the Euglenozoa as a whole, mitochondrial genomes are generally odd. The euglenid mitochondrial genomes are experimentally refractory and remain poorly known (M. W. Gray, personal communication). The mitochondrion of related diplonemids was recently shown to harbor a genome of unprecedented organization, with fragments of genes residing on minicircles, which are assembled in the correct order posttranscriptionally by means as yet unknown (39). Virtually nothing is known about the form or content of the giant kDNAs in bodonid flagellates, which are estimated to comprise millions of base pairs (40), whereas the kDNA networks of trypanosomatids are among the best studied and most complex mitochondrial genomes known. They are composed of circular DNA molecules that are relaxed and catenated into a single 3-dimensional network. These networks are composed of dozens of maxicircles, which are the equivalents of a classical mitochondrial genome, and thousands of minicircles (31) involved in editing, discussed below. The gene content of the maxicircle genome is not unusual, except for the complete absence of tRNA genes. tRNAs have been demonstrated to be imported from the cytosol into the tRNA-lacking organelle of T. brucei, so that the prokaryotic translation system of the mitochondrion must cope with imported eukaryotic tRNAs (41). The only exception is tRNAMet-i, the import of which is blocked because it cannot function in the prokaryotic system. Instead, tRNAMet formyl-transferase is present, which formylates the translation initiator tRNAMet-e upon import (42).
Within alveolates, mitochondrial genome evolution has also taken more than its share of strange turns. Although the circular mitochondrial genome of ciliates is undistinguished in both form and content, the genomes in apicomplexans and dinoflagellates are both highly reduced and often scrambled (37, 38, 43). These lineages have the smallest mitochondrial genomes known, with most species examined having just 3 protein-coding genes: cox1, cox3, and cob (strictly speaking, the dinoflagellate Oxyrrhis has only 2 genes since cob and cox3 are expressed as a fusion) (37). The only other coding regions are small fragments of rRNAs. These do not amount to an entire copy of either large or small subunit rRNAs, so the fragments are thought to be important, with the functional RNAs assembled by base-pairing interactions. As with kinetoplastids, no tRNAs are encoded in these genomes, and they have been shown to be imported into apicomplexan mitochondria. Moreover, apicomplexans also block the import of tRNAMet-i, and use tRNAMet formyl-transferase to formylate the translation initiator tRNAMet-e. Indeed, kinetoplastids and apicomplexans have independently evolved very similar tRNA import mechanisms to cope with this unique lack of tRNAs (44). In apicomplexans, the 3 protein-coding genes map to a linear, tandem repeat with rRNA fragments interspersed (43). In dinoflagellates, the same coding regions are present, but the organization is much more complex. Here, multiple copies of each gene are found in various orientations on linear chromosomes of varying size. In some species, all possible permutations of 3 genes are adjacent, whereas in others chromosomes seem to contain copies of only 1 gene. Chromosomes also contain rRNA fragments and substantial noncoding regions, and some have been shown to have structurally complex ends characterized by families of repeats (37, 38, 45).
In kinetoplastids, the evolution of the complex genome organization is tightly linked to how genes are expressed, and specifically to RNA editing. The genes, such as they are, are encoded on the maxicircles and expressed as polycistronic mRNAs, but after processing into monocistrons these messages are then massively altered by the insertion and deletion of uridine residues (up to 553 insertions and 89 deletions in a single mRNA). Editing is mediated by hundreds of small guide (g) RNAs in an elaborate process involving numerous multisubunit protein complexes (31, 46). The gRNAs that contain the information that directs editing are encoded on the minicircles, so the breakup of the genome into 2 chromosome types is likely linked to the evolution of editing.
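The net effect of insertion/deletion editing on a message can be illustrated with a minimal sketch. It is a toy model only: the function names, the example sequence, and the edit table are hypothetical, and the table merely stands in for the information carried by gRNAs. It does not model gRNA base pairing or the protein complexes involved.

```python
def apply_u_edits(pre_edited, edits):
    """Return the edited RNA after uridine insertions and deletions.

    pre_edited: RNA string as transcribed (hypothetical example input).
    edits: dict mapping a 0-based index in pre_edited to an integer n;
           n > 0 inserts n U's before that index, n < 0 deletes |n| U's
           starting at that index. This table is an assumption of the
           sketch, standing in for gRNA-encoded information.
    """
    seq = list(pre_edited)
    # Apply edits from the highest index down so lower indices stay valid.
    for pos in sorted(edits, reverse=True):
        n = edits[pos]
        if n > 0:
            seq[pos:pos] = ["U"] * n
        elif n < 0:
            stretch = seq[pos:pos - n]
            # Only uridines may be removed in this model.
            if len(stretch) != -n or any(base != "U" for base in stretch):
                raise ValueError(f"cannot delete {-n} U's at index {pos}")
            del seq[pos:pos - n]
    return "".join(seq)


def count_edits(edits):
    """Return (number of inserted U's, number of deleted U's)."""
    inserted = sum(n for n in edits.values() if n > 0)
    deleted = sum(-n for n in edits.values() if n < 0)
    return inserted, deleted


pre_edited = "AGGUUUACG"   # made-up pre-edited message
edits = {1: 2, 3: -2}      # insert 2 U's before index 1, delete 2 U's at index 3
print(apply_u_edits(pre_edited, edits))  # AUUGGUACG
print(count_edits(edits))                # (2, 2)
```

In this representation, a heavily edited message would simply correspond to a much larger edit table, with hundreds of inserted and dozens of deleted uridines.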
In dinoflagellates, RNA editing has also been found to be widespread, but the process is mechanistically different and in no way related to the breakdown of the genome structure. Here, transcripts are edited at ≈2% of their positions via substitutional editing, as opposed to insertion/deletion editing (38, 47, 48). Although A → G is the most common substitution, several others have been observed (U → C, G → C, G → A, A → C, and C → U), suggesting a highly flexible and sophisticated editing mechanism (38, 48). Fragments of edited gene sequences have been found in dinoflagellate mitochondrial genomes, prompting the suggestion that they employ gRNAs similar to those of kinetoplastids (49). However, the genomes are prone to recombination, so the significance of these fragments remains unclear; overall, there is no direct evidence for any particular editing mechanism at present. It is worth noting that mitochondrial transcripts in dinoflagellates have substantial polyadenylated tails, a feature linked to the editing process in kinetoplastids (50), and generally very rare in organelles.
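Substitutional editing of this kind is typically inferred by comparing a gene sequence with its transcript. The sketch below shows one simple way such a comparison could be tallied. The function name and the sequences are hypothetical, and the two sequences are assumed to be already aligned and of equal length, which holds only when no insertions or deletions are involved.

```python
from collections import Counter


def tally_substitutions(genomic, transcript):
    """Count substitution types between an aligned gene and its transcript.

    Both inputs are assumed to be pre-aligned and of equal length
    (an assumption of this sketch). Returns (Counter of "X>Y" changes,
    fraction of positions that differ).
    """
    # Write both sequences in the RNA alphabet so that T and U compare equal.
    gene = genomic.upper().replace("T", "U")
    rna = transcript.upper().replace("T", "U")
    if not gene or len(gene) != len(rna):
        raise ValueError("sequences must be non-empty and of equal length")
    changes = Counter()
    for g, r in zip(gene, rna):
        if g != r:
            changes[f"{g}>{r}"] += 1
    return changes, sum(changes.values()) / len(gene)


# Made-up 50-nt example with a single A-to-G change at the first position.
genomic = "ATGC" * 12 + "AT"
transcript = "G" + ("AUGC" * 12 + "AU")[1:]
changes, fraction = tally_substitutions(genomic, transcript)
print(dict(changes), fraction)  # {'A>G': 1} 0.02
```

One edited position in 50 corresponds to the ≈2% level mentioned above; real data would also require distinguishing edits from sequencing errors and from differences between gene copies, which this sketch does not attempt.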
Limited data further indicate that uridine-insertion RNA editing might even coexist with trans-splicing in diplonemids (51). We predict that the extreme diversity of editing types documented in the dinoflagellate mitochondrion (48) also requires a complex, albeit poorly understood, protein machinery that is the result of constructive neutrality similar to that described above for the kinetoplastid mitochondrion.
Conclusions
The deeper we look at protist biology, the greater the variety we discover in how cells can accomplish fundamental processes. Not only do protists represent the majority of the phylogenetic tree of eukaryotes, and therefore the greatest evolutionary diversity, but they also have pushed the limits of many biological systems and bent the “rules” of biology (such as the central dogma) far beyond what we see in the better studied multicellular eukaryotes. The alveolates and the euglenozoans may be “hotspots” for the generation of diverse solutions to fundamental processes, but it is also possible that they only appear this way because they are among the best studied protist groups. Other odd protists abound, but we know next to nothing about many of them, particularly at the molecular and genomic levels. All this is presently changing, and to interpret genomic diversity in eukaryotes we will have to set aside many of our preconceptions.
Comparing the alveolates and the euglenozoans is also appealing because they have broken many of the same rules in the same general way. Because they are so distant on the phylogenetic tree of eukaryotes (4, 7), convergence between the 2 groups would ultimately be influenced only by intrinsic factors of a very basic nature (i.e., that are likely common to most or all eukaryotes) (6). In contrast, where multiple aspects of a system have all converged similarly, it is likely that the convergent appearance of one new characteristic can be a strong factor in the convergent evolution of others. Even if these characteristics are not obligatorily functionally linked, their evolution may be tightly linked. For example, polycistronic mRNAs can exist without a SL, but they are evolutionarily linked because adding the SL allows the polycistronic mRNA to function. Conversely, one can imagine other ways to get a polycistronic mRNA to function without SL processing (e.g., changes to translation initiation), but because no such system is known, these are evidently less likely than the advent of SL processing. In other words, within the limited universe of acceptable changes, one change closes some possibilities, but opens new ones as well.
So why have protists in general and alveolates and euglenozoans in particular engaged in so much evolutionary experimentation? Many characteristics discussed here have been considered individually and concluded to be ancient relicts, going back even so far as the RNA world, or to have been favored by selection over the canonical way of accomplishing the same task (52, 53). We find neither of these arguments to be particularly compelling given the narrow distribution of these characters in nature, and their often extreme complexity. For example, dozens of nuclear-encoded proteins are required for T. brucei to edit just 12 mRNAs (31, 46, 50). Despite considerable controversy, no obvious evolutionary advantage has ever been demonstrated for this type of editing, and the possible advantages that have been proposed (e.g., the generation of 2 proteins from 1 gene) (53) are more than outweighed by the demonstrated cost (i.e., “save” 1 gene at the cost of dozens of genes). We argue that constructive neutral evolution offers a more compelling explanation (12, 54). This is a very simple and intuitive way of explaining complexity in biological systems, but one that has not received much attention. Briefly, it is possible for a biological system to increase in complexity (that is, to increase the number of components or interactions needed to sustain the system) by making a series of neutral changes that collectively do not affect fitness. Pan-editing is often thought of as an error-correcting system, but, as Stoltzfus (12) pointed out, the duplicated information (e.g., gRNAs) must have been created before the mutations they are correcting, or they too would carry the mutations, so the error-then-solution model is backwards. Instead, if a gratuitous duplication of information took place first (i.e., the origin of a gRNA), then a subsequent mutation could be neutralized by the presence of the duplicated information needed to change it.
The fixation of such a mutation would render the gRNA essential, and would also allow for further mutations as long as the gRNAs could mediate their reversal. This last point is important because it would bias the system against the loss of the gRNA since mutations at many sites will further establish the gRNA as essential, whereas only complete reversion to the original sequence could render it unnecessary. Overall, the editing activity and the sites that are edited will coevolve, and the complexity of the system will inevitably grow while conferring no real selective advantage (for many other case studies and much greater detail, see refs. 12 and 54).
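The bias against loss of the gRNA described here can be illustrated with a toy stochastic simulation. The model and all of its parameters (`n_sites`, `p_revert`, one mutation per step) are our own simplifying assumptions for illustration; they are not taken from refs. 12 or 54. Every mutation at an unmutated site is treated as neutral because the gRNA masks it, and a mutated site reverts exactly with a fixed probability when it is hit again.

```python
import random


def simulate_dependence(n_sites, steps, p_revert=1/3, seed=0):
    """Track how many gene sites depend on a gRNA for correction.

    Toy model (all assumptions are illustrative): at each step one
    random site is hit by a mutation. A clean site becomes
    edit-dependent; a dependent site reverts to the original
    sequence with probability p_revert. Returns the number of
    dependent sites after each step.
    """
    rng = random.Random(seed)
    dependent = [False] * n_sites
    history = []
    for _ in range(steps):
        site = rng.randrange(n_sites)
        if not dependent[site]:
            # Neutral in the presence of the gRNA, so it can be fixed.
            dependent[site] = True
        elif rng.random() < p_revert:
            # Exact back-mutation: this site no longer needs editing.
            dependent[site] = False
        history.append(sum(dependent))
    return history


history = simulate_dependence(n_sites=20, steps=2000)
# The gRNA could be lost neutrally only at steps where no site depends on it.
print("dependent sites at end:", history[-1])
print("steps at which the gRNA was dispensable:", history.count(0))
```

Under these assumptions, the gRNA is dispensable only when the count returns to zero, which requires every mutated site to have reverted at the same time. With many sites this becomes vanishingly unlikely, which is the ratchet-like behavior the argument relies on.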
Within this framework, together with the recognition that the evolution of an unusual character can be an intrinsic factor in the subsequent evolution of additional, specific characters, a complex cellular system may be explained simply by identifying the event(s) that preconditioned the cell for such a system. Convergence may offer a glimpse into these conditions by revealing how characters are linked when the same events are played out multiple times.
Acknowledgments.
We thank Mona Hoppenrath and Susan Breglia for providing the images in Figs. 1 and 2F, respectively. This work was supported by the Grant Agency of the Czech Republic Grant 204/09/1667 (to J.L.); Ministry of Education of the Czech Republic Grants LC07032, 2B06129, and 600766580 (to J.L.); a grant from the Tula Foundation to the Centre for Microbial Diversity and Evolution (to B.S.L. and P.J.K.); and the Canadian Institute for Advanced Research (J.L., B.S.L., and P.J.K.).
Footnotes
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “In the Light of Evolution III: Two Centuries of Darwin,” held January 16–17, 2009, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and audio files of most presentations are available on the NAS web site at www.nasonline.org/Sackler_Darwin.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/cgi/content/full/0901004106/DCSupplemental.
References
- 1. Leander BS, Esson HJ, Breglia SA. Macroevolution of complex cytoskeletal systems in euglenids. BioEssays. 2007;29:987–1000. doi: 10.1002/bies.20645.
- 2. Simpson AGB, Stevens JR, Lukeš J. The evolution and diversity of kinetoplastid flagellates. Trends Parasitol. 2006;22:168–174. doi: 10.1016/j.pt.2006.02.006.
- 3. Oborník M, Janouškovec J, Chrudimský T, Lukeš J. Evolution of the apicoplast and its hosts: From heterotrophy to autotrophy and back again. Int J Parasitol. 2009;39:1–12. doi: 10.1016/j.ijpara.2008.07.010.
- 4. Keeling PJ, et al. The tree of eukaryotes. Trends Ecol Evol. 2005;20:670–676. doi: 10.1016/j.tree.2005.09.005.
- 5. Keeling PJ. Chromalveolates and the evolution of plastids by secondary endosymbiosis. J Euk Microbiol. 2009;56:1–8. doi: 10.1111/j.1550-7408.2008.00371.x.
- 6. Leander BS. A hierarchical view of convergent evolution in microbial eukaryotes. J Euk Microbiol. 2008;55:59–68. doi: 10.1111/j.1550-7408.2008.00308.x.
- 7. Hampl V, et al. Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups.” Proc Natl Acad Sci USA. 2009;106:3859–3864. doi: 10.1073/pnas.0807880106.
- 8. Conway-Morris S, Gould SJ. Showdown on the Burgess Shale. Nat Hist. 1998;107:48.
- 9. Zakon HH. Convergent evolution on the molecular level. Behav Evol. 2002;59:250–261. doi: 10.1159/000063562.
- 10. Emery NJ, Clayton NS. The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science. 2004;306:1903–1907. doi: 10.1126/science.1098410.
- 11. Arndt J, Reznick D. Convergence and parallelism reconsidered: What have we learned about the genetics of adaptation? Trends Ecol Evol. 2008;23:26–32. doi: 10.1016/j.tree.2007.09.011.
- 12. Stoltzfus A. On the possibility of constructive neutral evolution. J Mol Evol. 1999;49:169–181. doi: 10.1007/pl00006540.
- 13. Taylor FJR. The Biology of Dinoflagellates. Oxford: Blackwell Scientific; 1987.
- 14. Bouck GB, Ngo H. Cortical structure and function in euglenoids with reference to trypanosomes, ciliates, and dinoflagellates. Int Rev Cytol. 1996;169:267–318. doi: 10.1016/s0074-7696(08)61988-9.
- 15. Hausmann K. Extrusive organelles in protists. Int Rev Cytol. 1978;52:197–268. doi: 10.1016/s0074-7696(08)60757-3.
- 16. Saito A, et al. Gliding movement in Peranema trichophorum is powered by flagellar surface motility. Cell Motil Cytoskel. 2003;55:244–253. doi: 10.1002/cm.10127.
- 17. Leander BS, Triemer RE, Farmer MA. Character evolution in heterotrophic euglenids. Eur J Protistol. 2001;37:337–356.
- 18. McEwan M, Humayun R, Slamovits CH, Keeling PJ. Nuclear genome sequence survey of the dinoflagellate Heterocapsa triquetra. J Euk Microbiol. 2008;55:530–535. doi: 10.1111/j.1550-7408.2008.00357.x.
- 19. Berriman M, et al. The genome of the African trypanosome, Trypanosoma brucei. Science. 2005;309:416–422. doi: 10.1126/science.1112642.
- 20. Bachvaroff TR, Place AR. From stop to start: Tandem gene arrangement, copy number and trans-splicing sites in the dinoflagellate Amphidinium carterae. PLoS One. 2008;3:e2929. doi: 10.1371/journal.pone.0002929.
- 21. Zhang H, et al. Spliced leader RNA trans-splicing in dinoflagellates. Proc Natl Acad Sci USA. 2007;104:4618–4623. doi: 10.1073/pnas.0700258104.
- 22. Slamovits CH, Keeling PJ. Widespread recycling of processed cDNAs in dinoflagellates. Curr Biol. 2008;18:R550–R552. doi: 10.1016/j.cub.2008.04.054.
- 23. Campbell DA, Thomas S, Sturm NR. Transcription in kinetoplastid protozoa: Why be normal? Microbes Infect. 2003;5:1231–1240. doi: 10.1016/j.micinf.2003.09.005.
- 24. Ghedin E, et al. Gene synteny and evolution of genome architecture in trypanosomatids. Mol Biochem Parasitol. 2004;134:183–191. doi: 10.1016/j.molbiopara.2003.11.012.
- 25. Gunzl A, Vanhamme L, Myler PJ. Transcription in Trypanosomes: A different means to the end. In: Barry D, McCulloch R, Mottram J, Acosta-Serrano A, editors. Trypanosomes After the Genome. Norfolk, UK: Horizon Bioscience; 2007.
- 26. Graber JH, Salisbury J, Hutchins LN, Blumenthal T. C. elegans sequences that control trans-splicing and operon pre-mRNA processing. RNA. 2007;13:1409–1426. doi: 10.1261/rna.596707.
- 27. Clayton CE. Life without transcriptional control? From fly to man and back again. EMBO J. 2002;21:1881–1888. doi: 10.1093/emboj/21.8.1881.
- 28. Leander BS. Did trypanosomatid parasites have photosynthetic ancestors? Trends Microbiol. 2004;12:251–258. doi: 10.1016/j.tim.2004.04.001.
- 29. Copertino DW, Hallick RB. Group-II and group-III introns of twintrons–potential relationships with nuclear premessenger RNA introns. Nucl Acids Res. 1993;18:467–471. doi: 10.1016/0968-0004(93)90008-b.
- 30. Wang Y, Morse D. Rampant polyuridylylation of plastid gene transcripts in the dinoflagellate Lingulodinium. Nucl Acids Res. 2006;34:613–619. doi: 10.1093/nar/gkj438.
- 31. Lukeš J, Hashimi H, Zíková A. Unexplained complexity of the mitochondrial genome and transcriptome in kinetoplastid flagellates. Curr Genet. 2005;48:277–299. doi: 10.1007/s00294-005-0027-0.
- 32. Schnepf E, Deichgraber G. “Myzocytosis,” a kind of endocytosis with implications to compartmentalization in endosymbiosis. Naturwissenschaften. 1984;71:218–219.
- 33. Patron NJ, Waller RF, Archibald JM, Keeling PJ. Complex protein targeting to dinoflagellate plastids. J Mol Biol. 2005;384:1015–1024. doi: 10.1016/j.jmb.2005.03.030.
- 34. Durnford DG, Gray MW. Analysis of Euglena gracilis plastid-targeted proteins reveals different classes of transit sequences. Eukaryot Cell. 2006;5:2079–2091. doi: 10.1128/EC.00222-06.
- 35. Sulli C, Fang ZW, Muchhal U, Schwartzbach SD. Topology of Euglena chloroplast protein precursors within endoplasmic reticulum to Golgi to chloroplast transport vesicles. J Biol Chem. 1999;274:457–463. doi: 10.1074/jbc.274.1.457.
- 36. Nassoury N, Cappadocia M, Morse D. Plastid ultrastructure defines the protein import pathway in dinoflagellates. J Cell Sci. 2003;116:2867–2874. doi: 10.1242/jcs.00517.
- 37. Slamovits CH, Saldarriaga JF, Larocque A, Keeling PJ. The highly reduced and fragmented mitochondrial genome of the early-branching dinoflagellate Oxyrrhis marina shares characteristics with both apicomplexan and dinoflagellate mitochondrial genomes. J Mol Biol. 2007;372:356–368. doi: 10.1016/j.jmb.2007.06.085.
- 38. Nash EA, Nisbet RER, Barbrook AC, Howe CJ. Dinoflagellates: A mitochondrial genome all at sea. Trends Genet. 2008;24:328–335. doi: 10.1016/j.tig.2008.04.001.
- 39. Marande W, Lukeš J, Burger G. Unique mitochondrial genome structure in diplonemids, the sister group of kinetoplastids. Eukaryot Cell. 2005;4:1137–1146. doi: 10.1128/EC.4.6.1137-1146.2005.
- 40. Lukeš J, Jirků M, Avliyakulov N, Benada O. Pankinetoplast DNA structure in a primitive bodonid flagellate, Cryptobia helicis. EMBO J. 1998;17:838–846. doi: 10.1093/emboj/17.3.838.
- 41. Crausaz-Esseiva A, Naguleswaran A, Hemphill A, Schneider A. Mitochondrial tRNA import in Toxoplasma gondii. J Biol Chem. 2004;279:42363–42368. doi: 10.1074/jbc.M404519200.
- 42. Tan THP, Bochud-Allemann N, Horn EK, Schneider A. Eukaryotic-type elongator tRNAMet of Trypanosoma brucei becomes formylated after import into mitochondria. Proc Natl Acad Sci USA. 2002;99:1152–1157. doi: 10.1073/pnas.022522299.
- 43. Feagin JE. Mitochondrial genome diversity in parasites. Int J Parasitol. 2000;33:371–390. doi: 10.1016/s0020-7519(99)00190-3.
- 44. Bouzaidi-Tiali N, Aeby E, Charriere F, Pusnik M, Schneider A. Elongation factor 1a mediates the specificity of mitochondrial tRNA import in T. brucei. EMBO J. 2007;20:4302–4312. doi: 10.1038/sj.emboj.7601857.
- 45. Jackson CJ, et al. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria. BMC Biol. 2007;5:41. doi: 10.1186/1741-7007-5-41.
- 46. Hashimi H, Zíková A, Panigrahi AK, Stuart KD, Lukeš J. TbRGG1, a component of a novel multiprotein complex involved in kinetoplastid RNA editing. RNA. 2008;14:970–980. doi: 10.1261/rna.888808.
- 47. Lin SJ, Zhang HA, Spencer DF, Norman JE, Gray MW. Widespread and extensive editing of mitochondrial mRNAs in dinoflagellates. J Mol Biol. 2002;320:727–739. doi: 10.1016/s0022-2836(02)00468-0.
- 48. Zhang H, Lin S. mRNA editing and spliced-leader RNA trans-splicing groups Oxyrrhis, Noctiluca, Heterocapsa, and Amphidinium as basal lineages of dinoflagellates. J Phycol. 2008;44:703–711. doi: 10.1111/j.1529-8817.2008.00521.x.
- 49. Nash EA, et al. Organisation of the mitochondrial genome in the dinoflagellate Amphidinium carterae. Mol Biol Evol. 2007;24:1528–1536. doi: 10.1093/molbev/msm074.
- 50. Etheridge RD, Aphasizheva I, Gershon PD, Aphasizhev R. 3′ adenylation determined mRNA abundance and monitors completion of RNA editing in T. brucei mitochondria. EMBO J. 2008;27:1596–1608. doi: 10.1038/emboj.2008.87.
- 51. Marande W, Burger G. Mitochondrial DNA as a genomic jigsaw puzzle. Science. 2007;318:415. doi: 10.1126/science.1148033.
- 52. Speijer D. Evolutionary aspects of RNA editing. In: Goringer HU, editor. RNA Editing. Berlin: Springer; 2007.
- 53. Ochsenreiter T, Cipriano M, Hajduk SL. Alternative mRNA editing in trypanosomes is extensive and may contribute to mitochondrial protein diversity. PLoS One. 2008;3:e1566. doi: 10.1371/journal.pone.0001566.
- 54. Covello PS, Gray MW. On the evolution of RNA editing. Trends Genet. 1993;9:265–268. doi: 10.1016/0168-9525(93)90011-6.