Abstract
Comparing the genome sequences of free-living organisms in the five eukaryotic supergroups enables predictions to be made about the genome of the last common ancestor of eukaryotes. The genome sequence of the amoeboflagellate Naegleria gruberi reported by Fritz-Laylin et al. (2010) reveals the surprising complexity of this unicellular organism and, by inference, of the last common eukaryotic ancestor.
In terms of organizational complexity and cellular sophistication, eukaryotes far exceed the other two domains of life, archaea and bacteria. Thus, the origin and the early evolution of eukaryotes are of major interest in terms of understanding the emergence of biological complexity. Depending on the particular phylogenomics study, eukaryotes are classified into five to seven supergroups: two groups (plants and animals) include both unicellular and multicellular organisms, and three to five groups consist entirely of diverse unicellular organisms that are free-living or parasitic (Figure 1). Early phylogenetic studies placed parasitic unicellular eukaryotes (e.g., Diplomonads, Microsporidia) at the root of the eukaryotic evolutionary tree. However, subsequent phylogenomic analyses showed that this position for unicellular parasites was, in all likelihood, an artifact caused by their relatively simple genomes, which have been reduced in size by the fast, reductive evolution that is typical of parasites (Embley and Martin, 2006). As a result, the relationships between individual eukaryotic supergroups are still ambiguous (Keeling, 2007) (Figure 1). Genome sequences of free-living (not parasitic) organisms from the three unicellular supergroups (Figure 1) will help to resolve this debate and will facilitate the reconstruction of the gene repertoire for the last common ancestor of eukaryotes. In this issue of Cell, Fritz-Laylin et al. (2010) take a step in this direction with their report of the genome sequence of the free-living amoeboflagellate Naegleria gruberi, a member of the Excavate supergroup (Figure 1). The genome of Naegleria is unexpectedly gene rich, suggesting that the last common ancestor of the eukaryotes possessed a surprisingly complex gene complement.
Figure 1. Eukaryotic Evolution.
The five eukaryotic supergroups (Excavates, Rhizaria, Unikonts, Chromalveolates, and Plantae) are shown to diverge directly from the last common ancestor (black circle) because the relationship between individual supergroups is uncertain (Keeling, 2007). Analysis of the genome sequence of the free-living amoeboflagellate Naegleria gruberi in the Excavate supergroup reveals that this organism has 4133 genes that are shared with at least one other eukaryote supergroup (Fritz-Laylin et al., 2010). These 4133 genes are inferred to be ancestral genes that were present in the last common ancestor of eukaryotes, suggesting that this common ancestor was surprisingly complex. The numbers of putative ancestral genes present in selected major clades are indicated in blue. The Rhizaria supergroup is included for completeness, despite the current absence of sequenced genomes. The names of groups that include mostly parasites are italicized. Branch lengths are arbitrary.
The 41 million base pair genome of N. gruberi was predicted to encompass 15,727 protein-coding genes, which account for nearly 58% of the total genome sequence (Fritz-Laylin et al., 2010). For comparison, the human genome carries 21,000 to 25,000 genes (Clamp et al., 2007), which comprise less than 1.5% of the genome. Clearly, the Naegleria genome is complex but compact. In addition, Naegleria contains 0.7 introns per gene on average. Thus, in terms of gene architecture, this organism is intermediate between parasitic unicellular eukaryotes, which possess only a few introns in the entire genome, and multicellular eukaryotes, which contain 8 introns per gene on average (Fritz-Laylin et al., 2010).
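The compactness described above can be checked with simple arithmetic. The following Python sketch uses only the figures quoted from Fritz-Laylin et al. (2010); the variable and function names are illustrative, and the derived values are rough back-of-the-envelope estimates rather than numbers reported in the paper. It also assumes that the "nearly 58%" figure can be treated as the total genic span.

```python
# Figures quoted in the text (Fritz-Laylin et al., 2010).
GENOME_SIZE_BP = 41_000_000   # ~41 million base pairs
GENE_COUNT = 15_727           # predicted protein-coding genes
GENIC_FRACTION = 0.58         # "nearly 58%" of the genome (assumed genic span)
INTRONS_PER_GENE = 0.7        # average introns per gene


def genome_summary(genome_bp, genes, genic_fraction, introns_per_gene):
    """Derive rough per-gene statistics from genome-wide totals."""
    genic_bp = genome_bp * genic_fraction
    return {
        "genes_per_mb": genes / (genome_bp / 1_000_000),
        "mean_genic_bp_per_gene": genic_bp / genes,
        "estimated_total_introns": genes * introns_per_gene,
    }


summary = genome_summary(
    GENOME_SIZE_BP, GENE_COUNT, GENIC_FRACTION, INTRONS_PER_GENE
)
for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the genome carries roughly 380 genes per megabase, with on the order of 1.5 kilobases of genic sequence per gene and about 11,000 introns in total.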
Arguably, the most striking results of the Fritz-Laylin et al. study come from comparing the Naegleria genome with genomes from organisms in the other eukaryote supergroups in order to identify genes that may have been present in organisms at the root of the eukaryotic tree (Figure 1). This approach is known as comparative-genomic reconstruction of ancestral forms. Although the unresolved relationship among the eukaryotic supergroups confounds this process, rebuilding the ancient eukaryote genome is relatively easy compared to the same task for archaea or bacteria because horizontal gene transfer among eukaryotes appears to be limited (Keeling and Palmer, 2008). Thus, genes that are represented by orthologs in a pair of eukaryotic genomes can be mapped to their last common ancestor with reasonable confidence. Further, given the lack of resolution in the deep branches of the eukaryotic tree (Figure 1), it is reasonable to assume that genes found in multiple supergroups originate from the last common ancestor of eukaryotes. With these assumptions and the new genome sequence of Naegleria in hand, Fritz-Laylin and colleagues identified 4133 genes shared by Naegleria and at least one other eukaryote supergroup, and the authors tentatively assigned these genes to the last common ancestor of eukaryotes. In support of this assignment, 3784 of these genes are present in at least three supergroups.
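The assignment logic described above can be sketched in a few lines of Python. This is a minimal illustration on made-up data, not the pipeline used by Fritz-Laylin et al.: the gene family names, the presence/absence table, and the function `infer_ancestral_families` are all hypothetical. A family is assigned to the last common ancestor if it is present in the focal genome (here standing in for Naegleria, an Excavate) and in at least `min_other` other supergroups, which assumes, as the text does, that horizontal gene transfer among eukaryotes is negligible.

```python
from typing import Dict, Set

# Hypothetical data: gene family -> supergroups in which an ortholog is found.
FAMILY_PRESENCE: Dict[str, Set[str]] = {
    "famA": {"Excavates", "Unikonts", "Plantae", "Chromalveolates"},
    "famB": {"Excavates", "Plantae"},
    "famC": {"Excavates"},            # found only in the focal supergroup
    "famD": {"Unikonts", "Plantae"},  # absent from the focal supergroup
}


def infer_ancestral_families(
    presence: Dict[str, Set[str]],
    focal: str = "Excavates",
    min_other: int = 1,
) -> Set[str]:
    """Return families present in the focal supergroup and in at least
    `min_other` additional supergroups."""
    ancestral = set()
    for family, groups in presence.items():
        if focal not in groups:
            continue  # a focal-genome count cannot see families it has lost
        other_groups = groups - {focal}
        if len(other_groups) >= min_other:
            ancestral.add(family)
    return ancestral


shared_with_one = infer_ancestral_families(FAMILY_PRESENCE, min_other=1)
shared_with_two = infer_ancestral_families(FAMILY_PRESENCE, min_other=2)
print(sorted(shared_with_one))  # ['famA', 'famB']
print(sorted(shared_with_two))  # ['famA']
```

In this toy table, `famD` is present in two supergroups but is not counted because it is missing from the focal genome, which illustrates why a count anchored on a single genome tends to underestimate the ancestral gene set.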
The estimate of 4133 genes in the last common eukaryotic ancestor is a moderate but significant upward revision from the estimate of 3417 genes reported in an earlier study that compared representative genomes from only two supergroups, Unikonts and Plantae (Figure 1) (Koonin et al., 2004). Estimates from both studies are quite conservative because they do not account for loss of ancestral genes, which is likely to be substantial in Naegleria. Indeed, even the animal and plant supergroups, which seem to be the least prone to gene loss, have lost about 20% of the ancestral genes found in Naegleria (Figure 1). Two further factors point in the same direction: there is no way to account for ancestral genes that have been lost (or changed beyond recognition) in all extant lineages, and a considerable fraction of genes in each sequenced eukaryotic genome have no detectable homologs in distant organisms (in N. gruberi, this fraction is approximately 25%). Taken together, these considerations compel the conclusion that the last common ancestor of eukaryotes was already a complex organism. There is no reason to believe that the last common ancestor of eukaryotes was simpler than extant free-living unicellular eukaryotes.
This conclusion is based on numbers, but a more biological approach that reconstructs distinct functional systems in the common ancestor supports the hypothesis, perhaps even more convincingly. In particular, Naegleria contains a great majority of the eukaryotic genes involved in translation, replication, splicing, and other basic cellular processes that distinguish eukaryotes from archaea and bacteria. This result agrees with previous genomic reconstruction studies, which also inferred a surprisingly complex eukaryotic ancestor with fully formed eukaryote-specific functional systems (Collins and Penny, 2005; Mans et al., 2004). Of special interest is the presence of the key components of the RNA interference machinery, indicating that this quintessential eukaryotic system of defense and regulation dates back to the last common ancestor of eukaryotes. Furthermore, inspection of the conserved gene repertoire suggests that the ancient eukaryotic predecessor was capable of both flagellar and amoeboid movement.
Functional annotation of the Naegleria genes may also have unexpected implications for the early history of eukaryotic mitochondria. All extant eukaryotes appear to possess mitochondria-related organelles, suggesting that this endosymbiosis antedates the last common ancestor of eukaryotes. However, the properties of these organelles differ widely. For example, animals, plants, aerobic fungi, and aerobic protists possess full-fledged mitochondria that are capable of oxidative phosphorylation, whereas anaerobic fungi and anaerobic protists possess reduced organelles, such as hydrogenosomes and mitosomes, that typically generate hydrogen (Embley and Martin, 2006). Naegleria is an aerobe with bona fide mitochondria, but genome analysis also revealed genes encoding Fe-hydrogenases that are endowed with mitochondrial import peptides, indicating that under anaerobic conditions, this organism switches to hydrogen-producing energy metabolism (Fritz-Laylin et al., 2010). Although it is premature to conclude that the last common ancestor of eukaryotes possessed similar dual-function mitochondria, these findings suggest that versatile energy conversion systems appeared early in eukaryotic evolution.
A common ancestor exhibiting remarkable genomic and cellular complexity appears to be a fundamental evolutionary pattern that is not limited to the evolution of eukaryotes. Similar conclusions have been drawn about the complexity of the common ancestors of archaea (Csuros and Miklos, 2009) and large eukaryotic viruses (Yutin et al., 2009). It seems that the evolution of major classes of life typically begins with a turbulent phase, which leads to the emergence of a highly complex ancestor. Specific lineages then diverge from this common ancestor by one of three pathways: (1) genome streamlining, in which numerous genes are lost, the genome shrinks, and functional redundancy decreases; (2) genome stasis, in which limited numbers of genes are lost and gained at roughly the same rate via duplication and other processes; and (3) genome expansion, in which the rate of gene acquisition substantially exceeds the rate of gene loss.
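The three pathways listed above differ only in the balance between gene gain and gene loss, so they can be expressed as a simple decision rule. The Python sketch below is purely illustrative: the function name `classify_genome_trajectory` and the `tolerance` threshold that decides what counts as "roughly the same rate" are assumptions introduced here, not quantities defined in the text or in the cited studies.

```python
def classify_genome_trajectory(gene_gain_rate, gene_loss_rate, tolerance=0.1):
    """Label a lineage by comparing gene gain and loss rates.

    Both rates must use the same (arbitrary) units. `tolerance` is a
    hypothetical cutoff: rates that differ by no more than this fraction
    of the larger rate are treated as balanced.
    """
    if gene_gain_rate < 0 or gene_loss_rate < 0:
        raise ValueError("rates must be non-negative")
    larger = max(gene_gain_rate, gene_loss_rate)
    if larger == 0 or abs(gene_gain_rate - gene_loss_rate) <= tolerance * larger:
        return "stasis"
    if gene_gain_rate > gene_loss_rate:
        return "expansion"
    return "streamlining"


# Made-up rates for three hypothetical lineages.
print(classify_genome_trajectory(gene_gain_rate=10, gene_loss_rate=50))
print(classify_genome_trajectory(gene_gain_rate=20, gene_loss_rate=21))
print(classify_genome_trajectory(gene_gain_rate=50, gene_loss_rate=10))
```

With these made-up inputs the three calls fall into streamlining, stasis, and expansion, respectively; real lineages would require rate estimates from a phylogenetic reconstruction.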
Comparative genomics of free-living unicellular eukaryotes such as Naegleria will help to develop more detailed and confident reconstructions of the gene repertoire of the last common ancestor of eukaryotes. However, understanding the processes that led to the emergence of complex common ancestors, particularly for the eukaryotes, requires other approaches and is one of the most difficult and exciting challenges facing evolutionary biologists today.
References
- Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES. Proc Natl Acad Sci USA. 2007;104:19428–19433. doi: 10.1073/pnas.0709013104.
- Collins L, Penny D. Mol Biol Evol. 2005;22:1053–1066. doi: 10.1093/molbev/msi091.
- Csuros M, Miklos I. Mol Biol Evol. 2009;26:2087–2095. doi: 10.1093/molbev/msp123.
- Embley TM, Martin W. Nature. 2006;440:623–630. doi: 10.1038/nature04546.
- Fritz-Laylin LK, Prochnik SE, Ginger ML, Dacks JB, Carpenter ML, Field MC, Kuo A, Paredez A, Chapman J, Pham J, et al. Cell. 2010; this issue. doi: 10.1016/j.cell.2010.01.032.
- Keeling PJ. Science. 2007;317:1875–1876. doi: 10.1126/science.1149593.
- Keeling PJ, Palmer JD. Nat Rev Genet. 2008;9:605–618. doi: 10.1038/nrg2386.
- Koonin EV, Fedorova ND, Jackson JD, Jacobs AR, Krylov DM, Makarova KS, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, et al. Genome Biol. 2004;5:R7. doi: 10.1186/gb-2004-5-2-r7.
- Mans BJ, Anantharaman V, Aravind L, Koonin EV. Cell Cycle. 2004;3:1612–1637. doi: 10.4161/cc.3.12.1316.
- Yutin N, Wolf YI, Raoult D, Koonin EV. Virol J. 2009;6:223. doi: 10.1186/1743-422X-6-223.