Philosophical Transactions of the Royal Society B: Biological Sciences. 2015 Sep 26;370(1678):20140323. doi: 10.1098/rstb.2014.0323

The ring of life hypothesis for eukaryote origins is supported by multiple kinds of data

James McInerney 1,2, Davide Pisani 3, Mary J O'Connell 4
PMCID: PMC4571563  PMID: 26323755

Abstract

The literature is replete with manuscripts describing the origin of eukaryotic cells. Most of the models for eukaryogenesis are either autogenous (sometimes called slow-drip), or symbiogenic (sometimes called big-bang). In this article, we use large and diverse suites of ‘omics’ and other data to make the inference that autogenous hypotheses are a very poor fit to the data and that the origin of eukaryotic cells occurred in a single symbiosis.

Keywords: ring of life, fusion, merger, hybrid, phylogeny, eukaryote

1. Introduction

The value indeed of an aggregate of characters is very evident in natural history

— Charles Darwin [1, p. 417].

How we think we should depict evolutionary history has, of course, an enormous effect on how we analyse data, how we write analytical software and how we depict the final results. If we feel that the evolutionary history of a dataset has been tree-like, then it is likely that the first, perhaps only, analyses we carry out will be a phylogenetic analysis using software that generates, as an output, a tree. We know of course that human populations do not have a tree-like history; therefore, we are usually disinclined to use software for tree reconstruction to depict these histories [2]. A phylogenetic tree can always be derived based on the complete genomes of two parents and their children. However, we know that the tree will be meaningless, because a tree-like process did not generate the data. What this means is that the outcome of an evolutionary analysis is always contingent on our a priori opinions for how the data have evolved. In some cases, as in the above example, our knowledge of the process that generated the data is good enough to let us unambiguously avoid the use of a particularly poorly fitting model (i.e. a tree) to describe the data (i.e. the relationships between the genomes of two parents and their progeny). However, in most cases, we lack the knowledge to unambiguously reject a model (or a class of models) based on previous observations. In such cases, a better course of action would be to consider a variety of models and ask which fits the data best (if not adequately). In this article, we try to understand the patterns we observe that speak about eukaryotic evolutionary history and we assess the goodness-of-fit between the data collected so far and a variety of models for eukaryotic origins and evolution. We do not limit our model analysis to simple alternative phylogenetic trees, we also include processes that are not tree-like.

One of the most profound changes in evolutionary biology has taken place in the past 10 years. We have had to adjust our thinking on the best means of depicting and analysing evolutionary relationships. Instead of using only phylogenetic trees of organisms as the organizing principle, we must now think also in terms of flows of genetic information. Flows of genetic information can be vertical from ancestor to descendants (following a tree-like process) or horizontal from one lineage to another (following a network-like process) [3]. Horizontal flows can be facilitated by plasmids [4], phage [5], viruses [6], transposons [7], gene transfer agents [8], intercellular nanotubes [9], or simply the hybridization of sexual species, followed by re-integration of the hybrid into one of the ancestral species [10,11]. Contemporary genomes are, and extinct genomes were, complex mixtures of genetic mergers [12,13], with the horizontal gene flows being as normal as vertical gene flows. This presents us with a problem if we are restricted to using tree-like processes that only depict vertical gene flows to model the data, as we would not be able to model and understand the impact of horizontal gene flows in evolution.

Phenotypes, particularly in single-celled organisms, coalitions of evolving entities [14] and viruses can only be explained by integrating both a horizontal and vertical view of evolution. In large-scale genome sequencing of bacterial strains, we see thousands of recombination events [15], and indeed, on occasion, explaining phenotypes requires methods that explicitly need to account for and remove vertical transmission signals [16]. One thousand eubacterial genes have flowed into the stem lineage of halophilic Archaebacteria and identification of this important evolutionary event required the modelling of horizontal, rather than vertical gene flows [17]. Indeed, the impossibility of identifying horizontal flows (even when massive) when using only tree-like models is evident in the history of haloarchaeal studies. One of the first analyses of complete genomes from a broad selection of prokaryotes placed the halophiles as deep-branching Archaebacteria [18]. The phylogenetic position of this lineage was determined through the interaction of two signals, both present in the genomes of halophiles, one pulling this lineage towards the methanogenic euryarchaeotes, the other pulling the halophiles towards the Eubacteria. The existence of two incongruent signals ultimately caused the halophiles to cluster at the base of the archaebacterial tree—a phylogenetic position that was not supported by the individual gene trees. A tree-based analysis could not get the correct answer for the placement of archaebacterial halophiles, because halophile evolution is driven by gene flow based adaptive processes rather than by a pure cladogenetic process. Indeed, archaebacterial evolution more broadly can only be fully understood when considered as the result of large-scale horizontal gene flows [19] interacting with vertical ones. This is because the origins of most major groups of Archaebacteria coincide with massive flows of genes from Eubacteria to Archaebacteria [19,20]. 
It is becoming increasingly evident that at least some of the genes present in every known organism or lineage (animals included) underwent part of their evolution in a completely different lineage [3]. In the case of animals, obvious examples include all genes of mitochondrial origins (that evolved in the alpha-proteobacterial lineage). To conclude, while one can always force chimerical data onto a tree, model misspecification will ensue. In the best-case scenario, this will simply limit our understanding of the generative process underpinning the data. However, in the worst-case scenario, model misidentification will lead to misleading conclusions, such as the finding of support for monophyletic Archaebacteria [21], a topology that is no longer supported by the data or by better models (see discussion in §8).

Genomes contain exquisitely detailed information about evolutionary history; however, because of a deep-rooted tradition in the use of tree-like models to explain biological evolution, it was felt that a clear understanding of cladogenesis (to be derived using a phylogenetic tree for a sufficient number of genomes [22]) would have been enough to clarify the early evolution of life on Earth. This narrative remains omnipresent, despite our current understanding of the pervasive role of horizontal gene transfer (HGT) in evolution [14]. A tree is expected; therefore, a tree is sought out.

2. What is the problem with trees in prokaryotic evolution?

Phylogenetic trees have a very precise meaning [23]. Each node on a tree is either a contemporary taxon or a hypothetical ancestral taxon and the linkages indicate vertical inheritance. An evolutionary diagram is a tree if it does not display any reticulations (loops); otherwise it is not a tree. Trees depict evolution as a continuously diverging process and the ramifications on trees indicate speciation or duplication events. A more general diagram is a phylogenetic network. These are found in two varieties: networks that display uncertainty and networks that display recombination [24]. Recombination networks also have a restrictive set of assumptions: the data are required to be homologous along their entirety, and reticulations on phylogenetic network graphs indicate places where recombination events between homologous regions have occurred.
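The distinction between a tree and a diagram with reticulations can be made concrete with a small sketch. It assumes the diagram is rooted and supplied as a list of (parent, child) edges, and it treats any node with more than one parent as a reticulation. The function name and the taxon labels are hypothetical and are used only for illustration.

```python
from collections import defaultdict

def has_reticulation(edges):
    """Return True if any node in a rooted diagram has more than one parent.

    `edges` is a list of (parent, child) pairs. In a tree every node has at
    most one parent; a node with two or more parents marks a merger.
    """
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return any(len(p) > 1 for p in parents.values())

# Hypothetical diverging tree: every lineage has a single ancestor.
tree_edges = [("root", "A"), ("root", "anc"), ("anc", "B"), ("anc", "C")]

# Hypothetical merger: lineage "hybrid" descends from both A and B.
network_edges = [("root", "A"), ("root", "B"), ("A", "hybrid"), ("B", "hybrid")]

print(has_reticulation(tree_edges))     # False
print(has_reticulation(network_edges))  # True
```

A check of this kind says nothing about which process produced the data; it only separates diagrams that can be drawn as trees from those that cannot.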

More than 10 years ago, Creevey et al. [25] showed that strongly supported but completely conflicting phylogenies existed throughout the prokaryote world. Since then, other studies have confirmed that the majority of prokaryotic genome data cannot fit onto a single phylogenetic tree [26–29] (also see [20]). Only gene flow (both horizontal and vertical) analysis can adequately account for prokaryotic evolution [13,30].

More recently, because of our growing realization that so many molecular sequences are in fact composite entities, we have been advocating a more realistic model of sequence evolution where we allow the merging of both homologous and non-homologous evolving entities [31–33]. In this view, we can analyse situations where very different evolving entities, sometimes from the same evolutionary level (e.g. symbiosis of two organisms, merger of two genes) and sometimes from different levels (e.g. plasmids merging with cells, domains merging with genes) can merge [14]. This is a ‘goods-thinking’ outlook and can have many benefits [12]. We have recently defined N-rooted fusion graphs that can instead be used to depict branching relationships with mergings [31,33].

An over-reliance on tree-thinking and tree-based methods [34] has led some researchers to proclaim that we have reached a phylogenetic impasse [35]. However, this is surely premature, as there is a wealth of characters that can be used to provide insights into eukaryogenesis.

3. Different data, different viewpoints on eukaryotic origins?

Most scientists are familiar with the ideas of Karl Popper and his development of falsifiability [36], id est, that the demarcation line between science and non-science is whether or not an idea can be tested, or falsified. However, prior to Popper, a popular philosophy of science was that of William Whewell, who put forward the idea of consilience [37]. Consilience means the ‘jumping together’ of pieces of evidence from disparate kinds of data. Consilience does not mean compromise; rather it is an appeal to look at reciprocal supports from disparate sets of data. Whewell argued that support for a particular theory would be made stronger if multiple lines of evidence that were orthogonal to one another were seen to agree with the same overall theory. Furthermore, if hypothesis X and hypothesis Y both tended to support an overall theory T, then if some new evidence supporting hypothesis X arises, this new evidence not only increases support for the overall theory, but it also strengthens our support for hypothesis Y, even if we have not collected any new data that speaks to hypothesis Y. Specifically with regard to the origins of the eukaryotic cell, we might look to disparate datasets in an effort to see how they agree with an overall theory (if indeed they do at all).
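One way to read Whewell's argument in modern probabilistic terms is sketched below. This is an interpretation rather than anything Whewell wrote. It assumes that hypotheses X and Y are each more probable if theory T is true than if it is false, and that X and Y are independent once the truth of T is fixed. All of the probabilities are invented purely for illustration.

```python
def posterior_theory(prior_t, p_x_given_t, p_x_given_not_t):
    """P(T | X is confirmed), by Bayes' rule."""
    numerator = p_x_given_t * prior_t
    return numerator / (numerator + p_x_given_not_t * (1 - prior_t))

def prob_y(p_t, p_y_given_t, p_y_given_not_t):
    """P(Y), marginalizing over whether theory T is true."""
    return p_y_given_t * p_t + p_y_given_not_t * (1 - p_t)

# Illustrative (assumed) numbers, not estimates of anything real.
PRIOR_T = 0.5
P_X_T, P_X_NOT_T = 0.8, 0.3   # X is more likely if T holds
P_Y_T, P_Y_NOT_T = 0.8, 0.3   # Y is more likely if T holds

y_before = prob_y(PRIOR_T, P_Y_T, P_Y_NOT_T)
t_after = posterior_theory(PRIOR_T, P_X_T, P_X_NOT_T)
y_after = prob_y(t_after, P_Y_T, P_Y_NOT_T)

print(y_before < y_after)  # True: confirming X also raises support for Y
```

Under these assumptions, confirming X raises the probability of T, and because Y is more probable when T holds, support for Y rises as well, although no new data bearing directly on Y were collected.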

Consilience was the most popular way of viewing science at the time when Darwin, Wallace and colleagues were working out the theory of evolution. Indeed in The origin of species, Darwin noted, ‘The importance, for classification, of trifling characters mainly depends on their being correlated with several other characters of more or less importance’ [1, p. 417].

Here, we argue that some of the hypotheses for the origins of eukaryotes have already been falsified (sensu Popper). In other cases, we rely on an accumulation of evidence from disparate sources for hypotheses that have not yet been rejected by specific observations in order to evaluate which model for the origin of the eukaryotes is better supported (sensu Whewell). Indeed, some hypotheses of eukaryogenesis can be rejected simply because the prediction of the model does not fit with the empirical data, and in some other cases, while no individual piece of evidence is enough to unequivocally define or support a model, when the evidence is taken together its support for the considered hypothesis will become overwhelming. We argue that the only current model that can adequately account for the origin of eukaryotes is the ring of life model of Lake and Rivera [38].

4. What do we know about the last eukaryotic common ancestor?

Eukaryotes presumably did not arise ex nihilo. However, there are considerable disagreements concerning how we might read their traits. In the following sections, we first review what is known about eukaryote traits and how they relate to prokaryote traits. We then ask how these traits map onto hypotheses of eukaryote origins and we finally ask which mappings are the best fit. In some cases, we find that eukaryotes have homologous structures with all prokaryotes: membranes, for instance, are homologous. In other cases, we find that eukaryotes have proteins or structures that are uniquely eukaryotic. In other cases, we find homologies between eukaryotes and some, but not all prokaryotes, and also we find that there are partial homologies when comparing prokaryotic and eukaryotic features. The detail of the homologies, the organelles and the structures is important. Fortunately, the current wave of ‘omics’ studies is providing more and more data that can speak to eukaryotic origins [39–41].

There are a number of traits that are either uniquely eukaryotic or are present in all eukaryotes. These are the traits that most likely trace back to the first eukaryotic cells. The last eukaryote common ancestor (LECA) had a nucleus and associated with this nucleus were nuclear pore complexes, some of whose proteins appear to be homologous and some analogous in modern eukaryotes [42,43]. Additionally, comparative analysis of chromatin proteins from protists has provided insights into the early evolutionary history of histone and DNA modification, nucleosome assembly and chromatin-remodelling systems [44]. Many of the individual domains in these eukaryotic systems can be shown to have had bacterial precursors, but today they are found in distinctive regulatory complexes that are unique to the eukaryotes [44]. In general, eukaryotes have linear chromosomes and possess centromeres [45], though prokaryotes like Borrelia burgdorferi also contain linear chromosomes and plasmids [46]. It is unlikely that chromosome linearity is homologous, as there is no suggestion of a sister-group relationship between Borrelia and eukaryotes. Similarly, there have been suggestions of a relationship between Planctomycetes and eukaryotes, based on the superficial similarity between the eukaryotic nucleus and the planctomycete membrane that encloses its chromosome. Any direct (or intermediate) relationship between Planctomycetes and eukaryotes has been shown to be spurious or analogous [47]. In conclusion, though there is some evidence that the eukaryotic nucleus (excluding the genetic material in this case) has some homologies with eubacterial characters, it is mostly uniquely eukaryotic, with its own independent evolutionary history.

The nucleolus is uniquely eukaryotic. However, careful analysis of the phyletic distribution of the proteome of the nucleolus shows that the nucleolar protein domains confirm the archaebacterial origin of the core machinery for ribosome maturation and assembly, but also reveals substantial eubacterial and eukaryotic contributions to nucleolus evolution [48]. The nucleolus as a whole has no homologue in prokaryotes, though the nucleolus parts have homologies in prokaryotes. These ‘parts' or ‘goods' [12,49] form a unique structure in eukaryotes. It appears therefore, that the nucleolus is in some respects chimerical.

Introns are not unique to eukaryotes, but spliceosomal introns are [50]. While group II (self-splicing) introns are commonly found in eukaryotes, Eubacteria and Archaebacteria, spliceosomal introns are only found in eukaryotes. It is thought that spliceosomal introns arose from group II introns early in eukaryotic evolution [50]. Furthermore, intron positions are seen to be conserved across animals, plants and many protists, indicating that spliceosomal introns arose early in eukaryotic evolution and differential loss of introns has been a typical mode of evolution since then [50]. If the suggested relationship between group II introns and spliceosomal introns is confirmed, then this suggests a specific flow of genetic material between eukaryotes and Eubacteria.

Capped and polyadenylated mRNA is also unique to eukaryotes and while both kinds of cellular processes are distinct, there is a common involvement of RNA polymerase II, which is a large macromolecular complex involving both non-coding RNAs and 12–14 proteins. Eukaryotes have three different polymerases involved in transcription, while Eubacteria have only one [51]. There are clear homologies between the RNA polymerase II components of eukaryotes and RNA polymerase proteins in Eubacteria and Archaebacteria, though there are more subunits in common between archaebacterial RNA polymerase and eukaryotic polymerase II (10 subunits in common) than between the eubacterial version and the others (five subunits in common with the others). Eukaryote polymerase II has two subunits (Rpb8 and Rpb9) that are unique to eukaryotes.

In terms of other processes that are unique to eukaryotes, the anaphase promoting complex or cyclosome (APC/C) is a major component of the cell cycle and is not found in prokaryotes. In total, 24 out of 37 known APC/C subunits, adaptors/co-activators and main targets, could be inferred to be present in the LECA. For the most part, these components are well conserved in all extant eukaryotic lineages.

Meiosis and mitosis are both eukaryotic features, but two points are interesting about this. Firstly, it is thought that meiosis might be an early invention in eukaryotic evolution [52]; secondly, cellular fusion and recombination have been observed in Archaebacteria [53,54]. Cellular fusion and recombination are a long way from being sexual reproduction; however, it is interesting in the light of early meiosis in eukaryotes that prokaryotes can carry out such functions.

One of the most significant differences between prokaryotes and eukaryotes is to be found in cell size and cell volume. Typically, there is a 1000-fold difference in the sizes of the cells in the two groups, though there can be huge variation [55]. Biochemical calculations have also found that the energy requirement per gene is orders of magnitude higher in eukaryotes than in prokaryotes [55], and this additional requirement in terms of energy means that eukaryotes typically need a cellular powerhouse with large amounts of energy-generating membranes. Given the clear relationship between the mitochondrion and alphaproteobacteria, this would only support a model for eukaryote origins that involves a mitochondrial endosymbiosis.

In addition to eukaryote-specific features, eukaryotes share a number of traits that seem more to resemble Archaebacteria than Eubacteria. For instance, eukaryotes and Archaebacteria both have histones [56,57], whereas Eubacteria have a non-homologous system of ‘histone-like’ proteins [58].

Eukaryotes have an elaborate cytoskeleton based on tubulin and actins [59,60]; however, many prokaryotes are known to have homologues of these molecules. FtsZ, which is involved in septum formation during cell division in Eubacteria is found in many eukaryotes as well as Eubacteria and in many archaebacterial groups, except Crenarchaeota [61]. By contrast, while the Crenarchaeota do not have an FtsZ equivalent, the crenarchaeotal protein crenactin, is related to eukaryotic actin or ‘actin-related proteins' and is absent from Euryarchaeota [59]. Actins are structural proteins, often having such diverse functions as providing cell structural support and, in animals, facilitating muscle contraction, for instance [62]. Most recently, it has been shown that a newly discovered group of Archaebacteria—the ‘Lokiarchaeota’—possess an actin variant that is the most closely related prokaryotic actin to eukaryotic actins. This ‘lokiactin’ has a limited distribution within Archaebacteria and is specifically found in Archaebacteria that themselves—using ribosomal protein phylogenies—are placed as sister taxa to eukaryotes. Additionally, a group of proteins that are typically responsible for cell proliferation, the GTPases of the Ras superfamily, are found in Lokiarchaeota and many of these sequences form sister-group relationships with eukaryotes on phylogenetic trees. In addition, the Lokiarchaeota contain ‘[…] a primordial version of the eukaryotic ESCRT (endosomal sorting complex required for transport) vesicle trafficking pathway’ [63, p. 177]. The ESCRT pathway, which up to now was considered unique to eukaryotes is involved in the formation of multivesicular bodies, which are a trafficking system to bring cargo to the vacuole in eukaryotes. However, the ESCRT system is also known to have other functions such as eukaryotic cell abscission, viral budding, exosome secretion and autophagy [64]. 
These are not typical features of prokaryotes, so future work will have to properly elucidate their likely function in Lokiarchaeota. It is worth mentioning that Lokiarchaeota are, at the time of writing, only known from metagenomic sequencing [63]. However, the finding of Lokiarchaeota suggests a gene flow between eukaryotes and Archaebacteria, at least for these key genes.

Mitochondria are a unique feature of eukaryotes, though not all eukaryotes possess mitochondria [65–69]. For some time, it was thought that many lineages of eukaryotes that lacked mitochondria were primitively amitochondriate [70]. This was because they branched deeply in phylogenetic trees of eukaryotes using poorly performing phylogenetic methods [71]. However, with better phylogenetic methods [72], the discovery of typical mitochondrial genes in the genomes of these amitochondriate eukaryotes [73,74], and the demonstration that some possessed an organelle that is homologous to the mitochondrion (the hydrogenosome), the ‘Archaezoa’ hypothesis has been dismantled [65,66,75]. The implication of this work is that we now know of no primitively amitochondriate eukaryotes [76]. That is to say, we know of no eukaryote genomes that are primitively devoid of a significant gene flow from Eubacteria through mitochondrial acquisition.

The Walker A-type cytoskeletal ATPase protein MinD, which facilitates cell division, is found widely among Eubacteria and in some eukaryotes, but is absent in Archaebacteria. In prokaryotes, this protein acts to prevent Z-ring formation anywhere except at mid-cell [77].

An analysis of 630 genes in eukaryotes with a sister-group relationship to alpha-proteobacteria on phylogenetic trees revealed that many biochemical pathways are completely, or nearly, contained in this cohort. These include the pathways for beta-oxidation, electron transport chain, fructose/mannose metabolism and pathways for the synthesis of lipids, biotin, haem and iron–sulfur clusters [78]. In addition, the eukaryotic glyceraldehyde-3-phosphate dehydrogenase [79] is of bacterial origin, and the machinery for iron–sulfur cluster assembly (iron–sulfur clusters are an important part of eukaryotic metalloproteins) is related to alpha-proteobacterial genes [80,81]. The implication from this analysis is that much of eukaryotic metabolism is eubacterial-derived.

Analyses of homologies between eukaryotes and prokaryotes have consistently found a disjoint between the relationships that have been inferred [69,82–85] (see supporting information, figure S3, for Alvarez-Ponce et al. [85]: http://archaebacterial.pnas.org/content/suppl/2013/04/01/1211371110.DCSupplemental/sapp.pdf). In general, while there are many gene families that are found across all three groups, quite often we find gene families that are found in only two of the three groups (i.e. eukaryotes and Eubacteria, and eukaryotes and Archaebacteria). Analysis of ‘omics’ data has shown that the phyletic affiliation of these genes has a very important effect on the cellular role of their proteins.

In gene deletion experiments, it is possible to find out which proteins are essential for a cell to exist. Several such studies have been conducted for Saccharomyces cerevisiae [86]. This allows a test of the association between evolutionary history and function. When eukaryotic genes with prokaryote homologues are partitioned into one class that are homologous with archaebacterial genes and another class that are homologous to eubacterial genes, we see that the genes that are homologous to Archaebacteria are more likely to be lethal upon deletion, they are more highly expressed than the eubacterial genes, the protein products are significantly more central in protein interaction networks (PINs) and are usually under more selective constraint (as judged by dN/dS ratios) [82,83,85]. In addition, there is an interesting difference between eukaryotic genes with archaebacterial homologues and those with eubacterial homologues—the eubacterial homologues are more likely to be duplicated in large eukaryotic genomes and are more likely to be lost in small eukaryotic genomes. The genes with archaebacterial homologues are more stable in terms of duplication or loss [85].

Phylogenetic analyses using sophisticated heterogeneous models have consistently placed eukaryote informational genes within archaebacterial clades [87–89]. Phylogenetic supertrees recognize three main kinds of tree topology: eukaryotes branch with Cyanobacteria, eukaryotes branch with alpha-proteobacteria and eukaryotes branch within Archaebacteria [84]. This supertree analysis found that the gene trees infer many other kinds of relationships, but all these topologies were at a low frequency that was no more than expected by chance [84].

It is clear from the above that there are many eukaryotic signature proteins (ESPs) [69,83,85], and many eukaryotic-specific features (proteins, organelles and processes) that are either unique, highly elaborated compared with their prokaryotic equivalents, or completely absent (e.g. simultaneous transcription and translation), just as there are many eubacterial-specific features (e.g. peptidoglycan cell walls) and archaebacterial-specific features (e.g. ether-linked lipids, methanogenesis in some groups). If these are considered in isolation, explaining their existence requires hypothesizing that either (a) these features evolved in an isolated derived eukaryotic lineage that engaged in sustained gene genesis, or that (b) these eukaryote-specific features are ancestral and prokaryotes have lost them as a form of simplifying evolution [90].

5. Analysis of path lengths on four different hypotheses

Interpretations of evolutionary history usually involve two components: the data and the model. When looking at deep evolutionary history, it is possible to employ many different kinds of data: fossil, cellular, genomic, among others. The model, on the other hand, is a statement of how evolution might have occurred. There are usually two parts to the model: the branching diagram and the process. The overwhelming majority of branching diagrams in the literature are tree diagrams, and in all likelihood for most purposes these are the correct kinds of diagram to use [91]. However, if evolutionary history of the evolving objects under study has contained introgressive events, then a simple diverging tree cannot be used. To use a tree when the data were not generated by a tree-like process is an error in model selection [92]. There are many network-like alternatives to trees, including phylogenetic networks [93], sequence-sharing networks [32] and N-rooted fusion networks [31,33]. Each of these diagrams contains inferences of flows of evolving entities [24]. In this article, we use the relative distances between the major groups of organisms, as inferred on four different evolutionary scenarios, to ask how well the data that have been collected can map onto these distances. The paths we analyse have been defined elsewhere and we are using the available data, together with Popperian and Whewellian philosophy to analyse how well they fit the data.

Using path length analysis (see table 1 and figure 1), we find that there are significant differences in the times that are estimated since the lineages split. Naturally, HGT will confuse these times for some genes and make coalescence seem more recent than it actually is for some genes and more ancient for other genes. However, all things being equal, the different hypotheses suggest radically different path length ratios. When a long separation time exists, we expect that two taxa will have diverged in most of their features. We expect, for instance, that their PINs will have diverged, that genome contents will have changed and biochemistry will be different. For shorter paths, we expect fewer differences. In addition, we expect that the totality of data will map differently onto the different diagrams. Most importantly, all four hypotheses imply different path lengths: none are simply a re-rooting of another hypothesis.

Table 1.

Hypothesis as depicted in figure 1 and the corresponding path lengths inferred by these hypotheses. EUB, Eubacteria; EUK, Eukaryota; ARC, Archaebacteria; EOC, Eocyta.

hypothesis relationship of path lengths to one another
(a) three domains Inline graphic
(b) eukaryotes early Inline graphic
(c) eocyte Inline graphic
(d) ring of life Inline graphic

Figure 1.

Phylogenetic hypotheses for the highest level relationships of life. On the right are the classes of hypothesis described in the text and on the left are the various inferred path lengths. The colour coding is retained for each hypothesis. (Online version in colour.)

An analysis of table 1 and figure 1 shows some interesting patterns. The three-domains hypothesis (table 1a and figure 1a) implies that the shortest path on the tree is the one between eukaryotes and Archaebacteria—implying that these should be the most similar organisms, on average. The eukaryotes-early hypothesis (table 1b and figure 1b) shows eukaryotes to be equally distant from both kinds of prokaryote and in both cases this is a long path. The ring of life hypothesis (table 1d and figure 1d) shows eukaryotes equally distant to both kinds of prokaryote, though in this case eukaryotes are closer to the prokaryotes than either is to the other. The eocyte hypothesis (table 1c and figure 1c) suggests a short path separating eukaryotes and a subset of the Archaebacteria—the eocytes, while a longer path separates eukaryotes and the rest of the Archaebacteria and a much longer path separates eukaryotes from Eubacteria. The eocyte hypothesis is, of course, a pruning of the ring of life hypothesis, where there is no explicit direct path from Eubacteria to eocytes, except through the root. In reality, the eocyte hypothesis is a ‘region’ of the overall ring of life hypothesis, having been based initially on ribosome structures and later on informational gene phylogenies.
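The qualitative orderings described in this paragraph can be written down as a small sketch. The ranks below are read from the prose and are not taken from the inline graphics of table 1. They are ordinal only (1 marks the shortest path) and carry no information about magnitudes. The tie between the two longer paths under the three-domains hypothesis is an assumption made here, because the text states only that the eukaryote to Archaebacteria path is the shortest. In the eocyte entry, ARC stands for the rest of the Archaebacteria. All names are hypothetical.

```python
# Ordinal ranks of path lengths (1 = shortest), as read from the prose.
HYPOTHESES = {
    "three domains": {("EUK", "ARC"): 1, ("EUK", "EUB"): 2, ("ARC", "EUB"): 2},
    "eukaryotes early": {("ARC", "EUB"): 1, ("EUK", "ARC"): 2, ("EUK", "EUB"): 2},
    "eocyte": {("EUK", "EOC"): 1, ("EUK", "ARC"): 2, ("EUK", "EUB"): 3},
    "ring of life": {("EUK", "ARC"): 1, ("EUK", "EUB"): 1, ("ARC", "EUB"): 2},
}

def shortest_pairs(ranks):
    """Return the pair(s) of groups separated by the shortest path."""
    best = min(ranks.values())
    return sorted(pair for pair, rank in ranks.items() if rank == best)

def eukaryotes_equidistant(ranks):
    """True if eukaryotes are equally far from Archaebacteria and Eubacteria."""
    return ranks[("EUK", "ARC")] == ranks[("EUK", "EUB")]

for name, ranks in HYPOTHESES.items():
    print(name, shortest_pairs(ranks), eukaryotes_equidistant(ranks))
```

Written this way, it is easy to see that no two hypotheses share the same set of orderings, and that the ring of life and eukaryotes-early hypotheses agree in placing eukaryotes equidistant from both prokaryote groups while disagreeing on whether those paths are the short ones.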

6. Three-domains hypothesis (tree-like, autogenous)

The three-domains model places the eukaryotes as a sister-group to Archaebacteria. This implies a shorter path length, on average, between eukaryote and archaebacterial genomes. That is to say, we would expect eukaryotes and Archaebacteria to have more in common than either has in common with Eubacteria (this assumes a somewhat constant rate of genomic change, in effect clocklike behaviour). Instead, the data reveal that eukaryote genomes tend to have more eubacterial genes than archaebacterial genes [67,69,82,83]. This could still be explained on the Woese model, if those eubacterial homologues were the least likely to be lost throughout evolutionary history. However, this hypothesis is falsified by the observation by Alvarez-Ponce et al. [85] of an association between the likelihood of gene loss (and also of gene duplication) and whether a eukaryote gene is homologous to a eubacterial gene or an archaebacterial gene: the eubacterial homologues are the ones more prone to loss. In other words, despite the longer path between eukaryotes and Eubacteria in figure 1a, this model requires that a greater number of dispensable genes were retained from Eubacteria than from the more closely related Archaebacteria. This finding makes the three-domains hypothesis highly implausible.

7. The eukaryotes-early hypothesis (tree-like, simplificationist)

The eukaryotes-early hypothesis has been seen in several incarnations. Its first incarnation was known as the ‘introns-early’ hypothesis, which speculated that genes in pieces were ancestral (perhaps a left-over from the RNA world that preceded it) and that, in some lineages, these genes were streamlined to become prokaryotic [94,95]. Rooted phylogenetic trees were also put forward to support this idea [96,97], as were genome content phylogenies analysed using parsimony [98], whose authors called the last common ancestor of all life ‘incongruously complex’. In this scenario, prokaryotes are seen as ‘superficially and secondarily’ simple [96], with eukaryotes being a ‘[…] unique cell type that cannot be deconstructed into features inherited directly from Archaea and Bacteria’ [99]. How this irreducible complexity emerged, however, is unclear, unless one assumes a previous unsampled (perhaps extra-terrestrial) history during which this complexity first arose (which would make this hypothesis highly unparsimonious when compared against other much simpler hypotheses), or the existence of a creator. Irrespective of the above, assuming the tree in figure 1b, we see that the longest period of independent evolution is along the lineage leading to eukaryotes and that Archaebacteria and Eubacteria have a shorter path separating them. Ancestrally, the cell type is eukaryote, so the prokaryote cell structure has evolved from a more complex ancestor and prokaryote evolution is predominantly reductional. Certainly, we know that prokaryote genome evolution in particular is biased towards deletions [100], though eukaryotes also manifest this bias.

Following the paths in figure 1b, this would imply that somehow when Archaebacteria and Eubacteria split, the Archaebacteria preferentially took with them or retained more essential genes, more highly expressed genes and more central proteins in the ancestral PIN and lost the less essential, more lowly expressed genes and genes whose protein products were less central in PINs. Loss of non-essential genes is not new to evolutionary biology and such a scenario is well known, particularly in symbiont or parasite genomes [101]. However, the topology has an additional implication—that the Eubacteria preferentially took or retained the less essential genes, the more lowly expressed genes and the genes whose protein products were less central in PINs, while dispensing with the more essential genes. It is not immediately obvious what selective advantage might accrue from deleting genes that are presumably very important for life. In effect, this hypothesis expects that Eubacteria were effectively eviscerated, but somehow thrived and expanded. This speculation is the opposite of our expectations using standard evolutionary theories.

In addition to the problem with explaining the eukaryotes-early hypothesis using cellular data, the model fails to account for the absence of fossil eukaryotes before 1.7 billion years ago. The oldest known fossil eukaryote Shuiyousphaeridium macroreticulatum is from the Ruyang group (in China) and is dated at approximately 1.6 to 1.8 Ga [102]. By contrast, the oldest known prokaryotic fossils are twice as old [102]. Assuming that single-celled prokaryotes and eukaryotes have been equally amenable to fossilization over time, there is a question over where the missing fossils have gone.

8. Eocyte hypothesis (tree-like, partial explanation)

The eocyte hypothesis was first put forward in 1984 on the basis of the similarities in the ultrastructures of ribosomes found in eukaryotes and some Archaebacteria [103]. This led Lake and colleagues to propose that there was a sister-group relationship between eukaryotes and a subset of Archaebacteria, which they called eocytes, and not the relationship that was being suggested from studies of small-subunit ribosomal RNA sequences, which placed eukaryotes as the sister-group of all Archaebacteria. This eocyte hypothesis spent more than 20 years as the lesser-known alternative to the Woesian tree, but has recently received more attention as better phylogenetic methods have been applied to collections of informational genes [87–89] and supertrees have been built from large collections of orthologues [84]. This hypothesis relates to the specific set of relationships between eukaryotes and Archaebacteria and centres on whether Archaebacteria are monophyletic or not. If eukaryotes are ancestral, then it is necessary to explain how sophisticated heterogeneous maximum-likelihood phylogenies place these informational genes within the diversity of Archaebacteria and reject with significance the hypotheses of monophyletic Archaebacteria [87,89].

Probably the most conclusive support for a within-Archaebacteria relationship for some genes in eukaryotes has come from the newly discovered ‘Lokiarchaeota’ sequences [63]. These sequences clearly identify a sister-group relationship with eukaryotes for some key genes, including members of the Ras superfamily, eukaryotic actins (lokiactins) and the ESCRT pathway. They provide functional clues to the first eukaryotes and provide a genetic toolbox that could explain the origin of phagotrophy.

9. Ring of life (network, complexificationist)

The ring of life hypothesis places a genomic and cellular merger event at the centre of eukaryogenesis. In this scenario, eukaryotes are not ancient: they are a more recent group than either of the two prokaryotic groups. This scenario also implies that neither prokaryotic group is monophyletic, as the eukaryotes arose from within each group and not as a separate lineage. Furthermore, the expectation from this scenario is that Archaebacteria (specifically, the eocytes) and the Eubacteria (specifically, the mitochondrial ancestor) made similar contributions to the eukaryote. The implication is that eukaryotes are not simply modified Archaebacteria or ‘Archaebacteria with some eubacterial parts’, but rather that the origin of eukaryotes corresponds to an egalitarian merger of two distant relatives, and without both parts eukaryotes would not emerge. This is what we mean by ‘similar’: this event would not have occurred unless both were present.

Support for this hypothesis comes from genome content phylogenies [38]. Lake and Rivera [104] have used the method of conditioned reconstruction to produce a collection of phylogenetic trees that are incongruent. However, these trees can be perfectly mapped onto a ring [105]. By taking a ‘signal stripping’ approach, Pisani et al. [84] showed that there were three and only three phylogenetic signals uniting prokaryotes and eukaryotes: one specifically uniting eukaryotes and cyanobacteria, one uniting eukaryotes and alpha-proteobacteria and one uniting eukaryotes and Archaebacteria. An analysis of chaperone systems in eukaryotes showed that there is a mix of genes with homologies to both Archaebacteria and Eubacteria [106] active in eukaryotes and, furthermore, that folding of proteins with archaebacterial affiliations by eubacterial chaperones and vice versa was normal in eukaryotes. A simple analysis of the phylogenetic affiliations of eukaryotic genes [69,82,83,85] shows there to be evolutionary affiliations between most functional categories of eukaryotic proteins.

Calculations from bioenergetics show that the per-gene energy requirement for eukaryotic genes vastly exceeds the prokaryotic requirement and, therefore, eukaryotes can only exist if they have a ‘powerhouse’ [55]. In addition, eukaryote proteomes show evidence of protein interactions being structured according to phylogenetic affiliation [82,83]: eubacterial proteins preferentially interact with eubacterial proteins, archaebacterial proteins preferentially interact with archaebacterial proteins and ESPs preferentially interact with ESPs.
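One way to quantify whether interactions are structured by phylogenetic affiliation is a categorical assortativity coefficient. The sketch below is illustrative and is not the procedure used in the cited studies: the toy edge list, the protein names and their origin labels are all hypothetical. It computes the fraction of interactions joining proteins of the same origin, compares it with the fraction expected if edge ends were paired at random, and rescales the difference so that 0 means no structure and 1 means that all interactions are within-origin.

```python
from collections import Counter

# Hypothetical toy interaction network; names and labels are invented
# for illustration and carry no biological meaning.
ORIGIN = {"E1": "eubacterial", "E2": "eubacterial", "E3": "eubacterial",
          "A1": "archaebacterial", "A2": "archaebacterial",
          "A3": "archaebacterial", "S1": "ESP", "S2": "ESP"}
INTERACTIONS = [("E1", "E2"), ("E2", "E3"), ("E1", "E3"),
                ("A1", "A2"), ("A2", "A3"), ("S1", "S2"),
                ("E3", "A1"), ("A3", "S1")]


def assortativity(edges, origin):
    """Categorical assortativity of an undirected network.

    Returns (observed - expected) / (1 - expected), where observed is the
    fraction of edges joining same-origin nodes and expected is that
    fraction under random pairing of edge ends.
    """
    n_edges = len(edges)
    observed = sum(origin[u] == origin[v] for u, v in edges) / n_edges
    ends = Counter()
    for u, v in edges:
        ends[origin[u]] += 1
        ends[origin[v]] += 1
    expected = sum((count / (2 * n_edges)) ** 2 for count in ends.values())
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(round(assortativity(INTERACTIONS, ORIGIN), 3))
```

A positive value on real data would be consistent with like-with-like interaction, although a permutation test on the labels would be needed before drawing any conclusion.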

This model would expect and require that some gene phylogenies place eukaryotes within the diversity of Eubacteria and also that other phylogenies would place them within the diversity of Archaebacteria. This is exactly what we see [67,68,87].

Analyses have not attributed equal roles to archaebacterial and eubacterial homologues of eukaryotes. For instance, archaebacterial homologues are more central in PINs (both closeness centrality and betweenness centrality), more slowly evolving, more likely to be lethal upon deletion, and more likely to be involved in informational processes, among other things [83]. Simply stated, archaebacterial genes seem to be more important in eukaryotic genomes, but eubacterial genes seem to be more flexible and more likely to be duplicated or lost [85]. Given a merger scenario for eukaryogenesis, this would imply that at the start of eukaryogenesis there were two genome equivalents in the merger. Over time, one (the archaebacterial) acted as a more central genome, more involved in informational processes. The eubacterial genome could lose the more important informational genes without a fitness cost if much of the cellular duplication, transcription and translation was being carried out by the archaebacterial proteins. The mitochondrion retained some informational processes, but the nucleo-cytosolic informational functions were the responsibility of the archaebacterial homologues. By contrast, the less central eubacterial homologues were largely involved in metabolic functions and overall, these genes were more numerous.
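For readers unfamiliar with the centrality measures mentioned above, the following sketch shows how closeness centrality can be computed on a PIN (protein interaction network). The network is a hypothetical toy example, deliberately built so that the nodes labelled archaebacterial sit at the centre; it illustrates the calculation only and does not reproduce the result reported in [83]. Closeness is taken here as (n - 1) divided by the sum of shortest-path distances from a node to all others, and the graph is assumed to be connected.

```python
from collections import deque

# Hypothetical toy network: "A" nodes stand for archaebacterial homologues,
# "E" nodes for eubacterial homologues. The layout is invented.
TOY_PIN = [("A1", "A2"), ("A1", "E1"), ("A1", "E2"),
           ("A1", "E3"), ("A2", "E4")]


def closeness(edges):
    """Closeness centrality of every node in a connected undirected graph."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    n_nodes = len(adjacency)
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path distances from source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        scores[source] = (n_nodes - 1) / sum(distance.values())
    return scores


def mean_by_prefix(scores, prefix):
    """Average score over nodes whose name starts with the given prefix."""
    values = [s for name, s in scores.items() if name.startswith(prefix)]
    return sum(values) / len(values)


if __name__ == "__main__":
    scores = closeness(TOY_PIN)
    print("archaebacterial mean:", mean_by_prefix(scores, "A"))
    print("eubacterial mean:", mean_by_prefix(scores, "E"))
```

Betweenness centrality follows the same logic but counts how often a node lies on shortest paths between other nodes; established network libraries provide both measures.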

10. What is a eukaryote?

Eukaryotes are a secondary lineage of life that originated following a symbiotic event involving a eubacterium and an archaebacterium. As a consequence, the eukaryotic genome is a complex mixture of archaebacterial and eubacterial genes that were horizontally shaken into the same cocktail glass. In addition, eukaryotes invented their own genes and cellular structures. There have almost certainly been three major gene flows into eukaryotes: one from an archaebacterium (the eocyte) that would not be hugely different to modern Archaebacteria, one from a eubacterium that would be placed somewhere within the diversity of modern alpha-proteobacteria, and a third, found in the Archaeplastida, that arose from within the diversity of Cyanobacteria. Other prokaryote-to-eukaryote gene flows, though they are known to exist [107] and might have had significant phenotypic effects [108], do not seem to have been formative.

In modern eukaryotes, genes with archaebacterial homologues are more highly expressed, more central in PINs, more likely to be lethal upon deletion, more slowly evolving and more likely to be informational. By contrast, eukaryotic genes with eubacterial homologues are more likely to be duplicated in larger genomes and more likely to be lost in smaller genomes; they are more likely to be involved in Mendelian disease in humans, less likely to be lethal upon deletion and faster evolving than their archaebacterial counterparts.

The main archaebacterial gene flow into eukaryotes brought with it the raw materials for the eukaryote cytoskeleton, transcription, vesicle trafficking and some elements of cell division. The eubacterial gene flow brought with it beta-oxidation, electron transport chain, fructose/mannose metabolism and pathways for the synthesis of lipids, biotin, haem and iron–sulfur clusters [78]. The eubacterial gene flow is also likely to have brought group II introns, which evolved into spliceosomal introns [50].

To conclude, it is clear that eukaryotes cannot be correctly defined as ‘derived’ Archaebacteria, or as ‘derived’ Eubacteria. Indeed, to view eukaryotes as being from either the archaebacterial or the eubacterial lineages is an over-simplification. Each human is derived equally from both parents. They would not exist without a genetic contribution from both, and it does not matter if they look more like their mother or father, or which surname they carry, if any. The reality is that a human only exists as a consequence of a contribution from both parents. Analogously, eukaryotes are equally eubacterial and archaebacterial. A taxonomic debate exists in the literature on the early evolution of life, whereby hypotheses have been suggested to be characterizable either as three-domains or two-domains based (2D versus 3D hypotheses). This characterization inherently assumes the existence of a tree-like pattern of evolution, which is misleading. Because eukaryotes arose from both Archaebacteria and Eubacteria, there are only two (monophyletic) lineages of life: (i) cellular life and (ii) the eukaryotes. Monophyletic eukaryotes are nested within monophyletic life. Eukaryotes make domain-based classifications obsolete and we therefore advocate dismissing the use of this term (which can easily be replaced by the term lineage, for instance) entirely. That is, we advocate a ‘domain-free’ view of the history of life, as debates about whether there should be two domains or three are essentialist and moot.

In a pluralistic view of cellular life on the planet, we can see that the merging of eubacterial genes with archaebacterial genes gave rise to the halophiles and indeed it made an enormous contribution to the origins of most of the major groups of Archaebacteria. We see that photosynthesis can only be interpreted as a series of gene flows around the prokaryotic and eukaryotic worlds. We see that eukaryotes have arisen as a consequence of major flows between prokaryotes initially (eukaryogenesis), and later, between a prokaryote group and a eukaryotic group (plastid origins) [84].

Life's history is complex and we should not try to simplify it to suit our need for orderly nomenclatural systems.

Competing interests

We declare we have no competing interests.

Funding

We received no funding for this study.

References

  • 1.Darwin C. 1859. On the origin of species by means of natural selection. London, UK: John Murray. [Google Scholar]
  • 2.Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol. Bioinform. Online 1, 47–50. [PMC free article] [PubMed] [Google Scholar]
  • 3.McInerney JO. 2013. More than three dimensions: inter-lineage evolution's ecological importance. Trends Ecol. Evol. 28, 624–625. ( 10.1016/j.tree.2013.09.002) [DOI] [PubMed] [Google Scholar]
  • 4.Lydiate DJ, Malpartida F, Hopwood DA. 1985. The Streptomyces plasmid SCP2*: its functional analysis and development into useful cloning vectors. Gene 35, 223–235. ( 10.1016/0378-1119(85)90001-0) [DOI] [PubMed] [Google Scholar]
  • 5.Canchaya C, Fournous G, Chibani-Chennoufi S, Dillmann ML, Brussow H. 2003. Phage as agents of lateral gene transfer. Curr. Opin. Microbiol. 6, 417–424. ( 10.1016/S1369-5274(03)00086-9) [DOI] [PubMed] [Google Scholar]
  • 6.Xiao X, Li J, Samulski RJ. 1996. Efficient long-term gene transfer into muscle tissue of immunocompetent mice by adeno-associated virus vector. J. Virol. 70, 8098–8108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Salyers AA, Shoemaker NB, Stevens AM, Li LY. 1995. Conjugative transposons: an unusual and diverse set of integrated gene transfer elements. Microbiol. Rev. 59, 579–590. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Lang AS, Zhaxybayeva O, Beatty JT. 2012. Gene transfer agents: phage-like elements of genetic exchange. Nat. Rev. Microbiol. 10, 472–482. ( 10.1038/nrmicro2802) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Dubey GP, Ben-Yehuda S. 2011. Intercellular nanotubes mediate bacterial communication. Cell 144, 590–600. ( 10.1016/j.cell.2011.01.015) [DOI] [PubMed] [Google Scholar]
  • 10.Heliconius Genome Consortium. 2012. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature 487, 94–98. ( 10.1038/nature11041) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Patterson N, Richter DJ, Gnerre S, Lander ES, Reich D. 2006. Genetic evidence for complex speciation of humans and chimpanzees. Nature 441, 1103–1108. ( 10.1038/nature04789) [DOI] [PubMed] [Google Scholar]
  • 12.McInerney JO, Pisani D, Bapteste E, O'Connell MJ. 2011. The public goods hypothesis for the evolution of life on Earth. Biol. Direct 6, 41 ( 10.1186/1745-6150-6-41) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Bapteste E, et al. 2009. Prokaryotic evolution and the tree of life are two different things. Biol. Direct 4, 34 ( 10.1186/1745-6150-4-34) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Bapteste E, Lopez P, Bouchard F, Baquero F, McInerney JO, Burian RM. 2012. Evolutionary analyses of non-genealogical bonds produced by introgressive descent. Proc. Natl Acad. Sci. USA 109, 18 266–18 272. ( 10.1073/pnas.1206541109) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Chewapreecha C, et al. 2014. Dense genomic sampling identifies highways of pneumococcal recombination. Nat. Genet. 46, 305–309. ( 10.1038/ng.2895) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Sheppard SK, et al. 2013. Genome-wide association study identifies vitamin B5 biosynthesis as a host specificity factor in Campylobacter. Proc. Natl Acad. Sci. USA 110, 11 923–11 927. ( 10.1073/pnas.1305559110) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Nelson-Sathi S, Dagan T, Landan G, Janssen A, Steel M, McInerney JO, Deppenmeier U, Martin WF. 2012. Acquisition of 1,000 eubacterial genes physiologically transformed a methanogen at the origin of Haloarchaea. Proc. Natl Acad. Sci. USA 109, 20 537–20 542. ( 10.1073/pnas.1209119109) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Wolf YI, Rogozin IB, Grishin NV, Tatusov RL, Koonin EV. 2001. Genome trees constructed using five different approaches suggest new major bacterial clades. BMC Evol. Biol. 1, 8 ( 10.1186/1471-2148-1-8) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Nelson-Sathi S, et al. 2015. Origins of major archaeal clades correspond to gene acquisitions from bacteria. Nature 517, 77–80. ( 10.1038/nature13805) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Akanni WA, Siu-Ting K, Creevey CJ, McInerney JO, Wilkinson M, Foster PG, Pisani D. 2015. Horizontal gene flow from Eubacteria to Archaebacteria and what it means for our understanding of eukaryogenesis. Phil. Trans. R. Soc. B 370, 20140337 ( 10.1098/rstb.2014.0337) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Gouy M, Li WH. 1989. Phylogenetic analysis based on rRNA sequences supports the archaebacterial rather than the eocyte tree. Nature 339, 145–147. ( 10.1038/339145a0) [DOI] [PubMed] [Google Scholar]
  • 22.Eisen JA, Fraser CM. 2003. Phylogenomics: intersection of evolution and genomics. Science 300, 1706–1707. ( 10.1126/science.1086292) [DOI] [PubMed] [Google Scholar]
  • 23.Doolittle WF, Bapteste E. 2007. Pattern pluralism and the Tree of Life hypothesis. Proc. Natl Acad. Sci. USA 104, 2043–2049. ( 10.1073/pnas.0610699104) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Bapteste E, et al. 2013. Networks: expanding evolutionary thinking. Trends Genet. 29, 439–441. ( 10.1016/j.tig.2013.05.007) [DOI] [PubMed] [Google Scholar]
  • 25.Creevey CJ, Fitzpatrick DA, Philip GK, Kinsella RJ, O'Connell MJ, Pentony MM, Travers SA, Wilkinson M, McInerney JO. 2004. Does a tree-like phylogeny only exist at the tips in the prokaryotes? Proc. R. Soc. Lond. B 271, 2551–2558. ( 10.1098/rspb.2004.2864) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Puigbo P, Wolf YI, Koonin EV. 2010. The tree and net components of prokaryote evolution. Genome Biol. Evol. 2, 745–756. ( 10.1093/gbe/evq062) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Puigbo P, Wolf YI, Koonin EV. 2009. Search for a 'Tree of Life' in the thicket of the phylogenetic forest. J. Biol. 8, 59 ( 10.1186/jbiol159) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Koonin EV, Wolf YI, Puigbo P. 2009. The phylogenetic forest and the quest for the elusive tree of life. Cold Spring Harb. Symp. Quant. Biol. 74, 205–213. ( 10.1101/sqb.2009.74.006) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Koonin EV, Puigbó P, Wolf YI. 2011. Comparison of phylogenetic trees and search for a central trend in the ‘Forest of Life’. J. Comput. Biol. 18, 917–924. ( 10.1089/cmb.2010.0185) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Kloesges T, Popa O, Martin W, Dagan T. 2011. Networks of gene sharing among 329 proteobacterial genomes reveal differences in lateral gene transfer frequency at different phylogenetic depths. Mol. Biol. Evol. 28, 1057–1074. ( 10.1093/molbev/msq297) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Haggerty LS, et al. 2014. A pluralistic account of homology: adapting the models to the data. Mol. Biol. Evol. 31, 501–516. ( 10.1093/molbev/mst228) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Halary S, Leigh JW, Cheaib B, Lopez P, Bapteste E. 2010. Network analyses structure genetic diversity in independent genetic worlds. Proc. Natl Acad. Sci. USA 107, 127–132. ( 10.1073/pnas.0908978107) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Coleman O, Hogan R, McGoldrick N, Rudden N, McInerney JO. 2015. Evolution by pervasive gene fusion in antibiotic resistance and antibiotic synthesizing genes. Computation 3, 114–127. ( 10.3390/computation3020114) [DOI] [Google Scholar]
  • 34.O'Hara R. 1998. Population thinking and tree thinking in systematics. Zool. Scr. 26, 323–329. ( 10.1111/j.1463-6409.1997.tb00422.x) [DOI] [Google Scholar]
  • 35.Gribaldo S, Poole AM, Daubin V, Forterre P, Brochier-Armanet C. 2010. The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse? Nat. Rev. Microbiol. 8, 743–752. ( 10.1038/nrmicro2426) [DOI] [PubMed] [Google Scholar]
  • 36.Popper K. 1934. The logic of scientific discovery. Vienna, Austria: Mohr Siebeck. [Google Scholar]
  • 37.Whewell W. 1840. The philosophy of inductive sciences, founded upon their history. London, UK: JW Parker. [Google Scholar]
  • 38.Rivera MC, Lake JA. 2004. The ring of life provides evidence for a genome fusion origin of eukaryotes. Nature 431, 152–155. ( 10.1038/nature02848) [DOI] [PubMed] [Google Scholar]
  • 39.Cravatt BF, Kodadek T. 2015. Editorial overview: Omics: methods to monitor and manipulate biological systems: recent advances in 'omics’. Curr. Opin. Chem. Biol. 24, v–vii. ( 10.1016/j.cbpa.2014.12.023) [DOI] [PubMed] [Google Scholar]
  • 40.Fisch KM, Meißner T, Gioia L, Ducom JC, Carland TM, Loguercio S, Su AI. 2015. Omics Pipe: a community-based framework for reproducible multi-omics data analysis. Bioinformatics 31, 1724–1728. ( 10.1093/bioinformatics/btv061) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Fondi M, Lio P. 2015. Multi-omics and metabolic modelling pipelines: challenges and tools for systems microbiology. Microbiol. Res. 171, 52–64. ( 10.1016/j.micres.2015.01.003) [DOI] [PubMed] [Google Scholar]
  • 42.Bapteste E, Charlebois RL, MacLeod D, Brochier C. 2005. The two tempos of nuclear pore complex evolution: highly adapting proteins in an ancient frozen structure. Genome Biol. 6, R85 ( 10.1186/gb-2005-6-10-r85) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Mans BJ, Anantharaman V, Aravind L, Koonin EV. 2004. Comparative genomics, evolution and origins of the nuclear envelope and nuclear pore complex. Cell Cycle 3, 1612–1637. ( 10.4161/cc.3.12.1345) [DOI] [PubMed] [Google Scholar]
  • 44.Iyer LM, Anantharaman V, Wolf MY, Aravind L. 2008. Comparative genomics of transcription factors and chromatin proteins in parasitic protists and other eukaryotes. Int. J. Parasitol. 38, 1–31. ( 10.1016/j.ijpara.2007.07.018) [DOI] [PubMed] [Google Scholar]
  • 45.Cavalier-Smith T. 2010. Origin of the cell nucleus, mitosis and sex: roles of intracellular coevolution. Biol. Direct 5, 7 ( 10.1186/1745-6150-5-7) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.McInerney JO. 1998. Replicational and transcriptional selection on codon usage in Borrelia burgdorferi. Proc. Natl Acad. Sci. USA 95, 10 698–10 703. ( 10.1073/pnas.95.18.10698) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.McInerney JO, Martin WF, Koonin EV, Allen JF, Galperin MY, Lane N, Archibald JM, Embley TM. 2011. Planctomycetes and eukaryotes: a case of analogy not homology. Bioessays 33, 810–817. ( 10.1002/bies.201100045) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Staub E, Fiziev P, Rosenthal A, Hinzmann B. 2004. Insights into the evolution of the nucleolus by an analysis of its protein domain repertoire. Bioessays 26, 567–581. ( 10.1002/bies.20032) [DOI] [PubMed] [Google Scholar]
  • 49.Erwin DH. 2015. A public goods approach to major evolutionary innovations. Geobiology 13, 308–315. ( 10.1111/gbi.12137) [DOI] [PubMed] [Google Scholar]
  • 50.Rogozin IB, Carmel L, Csuros M, Koonin EV. 2012. Origin and evolution of spliceosomal introns. Biol. Direct 7, 11 ( 10.1186/1745-6150-7-11) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Decker KB, Hinton DM. 2013. Transcription regulation at the core: similarities among bacterial, archaeal, and eukaryotic RNA polymerases. Annu. Rev. Microbiol. 67, 113–139. ( 10.1146/annurev-micro-092412-155756) [DOI] [PubMed] [Google Scholar]
  • 52.Ramesh MA, Malik SB, Logsdon JM Jr. 2005. A phylogenomic inventory of meiotic genes; evidence for sex in Giardia and an early eukaryotic origin of meiosis. Curr. Biol. 15, 185–191. [DOI] [PubMed] [Google Scholar]
  • 53.Naor A, Gophna U. 2013. Cell fusion and hybrids in Archaea: prospects for genome shuffling and accelerated strain development for biotechnology. Bioengineered 4, 126–129. ( 10.4161/bioe.22649) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Gophna AN, Lapierre P, Mevarech M, Papke RT, Gophna U. 2012. Low species barriers in halophilic Archaea and the formation of recombinant hybrids. Curr. Biol. 22, 1444–1448. ( 10.1016/j.cub.2012.05.056) [DOI] [PubMed] [Google Scholar]
  • 55.Lane N, Martin W. 2010. The energetics of genome complexity. Nature 467, 929–934. ( 10.1038/nature09486) [DOI] [PubMed] [Google Scholar]
  • 56.Pereira SL, Grayling RA, Lurz R, Reeve JN. 1997. Archaeal nucleosomes. Proc. Natl Acad. Sci. USA 94, 12 633–12 637. ( 10.1073/pnas.94.23.12633) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Reeve JN, Bailey KA, Li WT, Marc F, Sandman K, Soares DJ. 2004. Archaeal histones: structures, stability and DNA binding. Biochem. Soc. Trans. 32, 227–230. ( 10.1042/BST0320227) [DOI] [PubMed] [Google Scholar]
  • 58.Anuchin AM, Goncharenko AV, Demidenok OI, Kaprel'iants AS. 2011. Histone-like proteins of bacteria (review). Prikl. Biokhim. Mikrobiol. 47, 635–641. [PubMed] [Google Scholar]
  • 59.Yutin N, Wolf MY, Wolf YI, Koonin EV. 2009. The origins of phagocytosis and eukaryogenesis. Biol. Direct 4, 9 ( 10.1186/1745-6150-4-9) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Hammesfahr B, Kollmar M. 2012. Evolution of the eukaryotic dynactin complex, the activator of cytoplasmic dynein. BMC Evol. Biol. 12, 95 ( 10.1186/1471-2148-12-95) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Makarova KS, Yutin N, Bell SD, Koonin EV. 2010. Evolution of diverse cell division and vesicle formation systems in Archaea. Nat. Rev. Microbiol. 8, 731–741. ( 10.1038/nrmicro2406) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Gunning PW, Ghoshdastider U, Whitaker S, Popp D, Robinson RC. 2015. The evolution of compositionally and functionally distinct actin filaments. J. Cell Sci. 128, 2009–2019. ( 10.1242/jcs.165563) [DOI] [PubMed] [Google Scholar]
  • 63.Spang A, et al. 2015. Complex Archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179. ( 10.1038/nature14447) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Henne WM, Buchkovich NJ, Emr SD. 2011. The ESCRT pathway. Dev. Cell 21, 77–91. ( 10.1016/j.devcel.2011.05.015) [DOI] [PubMed] [Google Scholar]
  • 65.Hirt RP, Logsdon JM Jr, Healy B, Dorey MW, Doolittle WF, Embley TM. 1999. Microsporidia are related to fungi: evidence from the largest subunit of RNA polymerase II and other proteins. Proc. Natl Acad. Sci. USA 96, 580–585. ( 10.1073/pnas.96.2.580) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Embley TM, Horner DA, Hirt RP. 1997. Anaerobic eukaryote evolution: hydrogenosomes as biochemically modified mitochondria? Trends Ecol. Evol. 12, 437–441. ( 10.1016/S0169-5347(97)01208-1) [DOI] [PubMed] [Google Scholar]
  • 67.Fitzpatrick DA, Creevey CJ, McInerney JO. 2006. Genome phylogenies indicate a meaningful alpha-proteobacterial phylogeny and support a grouping of the mitochondria with the Rickettsiales. Mol. Biol. Evol. 23, 74–85. ( 10.1093/molbev/msj009) [DOI] [PubMed] [Google Scholar]
  • 68.Esser C, Martin W, Dagan T. 2007. The origin of mitochondria in light of a fluid prokaryotic chromosome model. Biol. Lett. 3, 180–184. ( 10.1098/rsbl.2006.0582) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Esser C, et al. 2004. A genome phylogeny for mitochondria among alpha-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Mol. Biol. Evol. 21, 1643–1660. ( 10.1093/molbev/msh160) [DOI] [PubMed] [Google Scholar]
  • 70.Cavalier-Smith T. 1989. Molecular phylogeny. Archaebacteria and Archezoa. Nature 339, 100. [DOI] [PubMed] [Google Scholar]
  • 71.Woese CR, Kandler O, Wheelis ML. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl Acad. Sci. USA 87, 4576–4579. ( 10.1073/pnas.87.12.4576) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Foster PG. 2004. Modeling compositional heterogeneity. Syst. Biol. 53, 485–495. ( 10.1080/10635150490445779) [DOI] [PubMed] [Google Scholar]
  • 73. Horner DS, Embley TM. 2001. Chaperonin 60 phylogeny provides further evidence for secondary loss of mitochondria among putative early-branching eukaryotes. Mol. Biol. Evol. 18, 1970–1975. (doi:10.1093/oxfordjournals.molbev.a003737)
  • 74. Horner DS, Hirt RP, Kilvington S, Lloyd D, Embley TM. 1996. Molecular data suggest an early acquisition of the mitochondrion endosymbiont. Proc. R. Soc. Lond. B 263, 1053–1059. (doi:10.1098/rspb.1996.0155)
  • 75. Horner DS, Hirt RP, Embley TM. 1999. A single eubacterial origin of eukaryotic pyruvate: ferredoxin oxidoreductase genes: implications for the evolution of anaerobic eukaryotes. Mol. Biol. Evol. 16, 1280–1291. (doi:10.1093/oxfordjournals.molbev.a026218)
  • 76. Embley TM, Martin W. 2006. Eukaryotic evolution, changes and challenges. Nature 440, 623–630. (doi:10.1038/nature04546)
  • 77. Ghosal D, Trambaiolo D, Amos LA, Lowe J. 2014. MinCD cell division proteins form alternating copolymeric cytomotive filaments. Nat. Commun. 5, 5341. (doi:10.1038/ncomms6341)
  • 78. Gabaldon T, Huynen MA. 2003. Reconstruction of the proto-mitochondrial metabolism. Science 301, 609. (doi:10.1126/science.1085463)
  • 79. Martin W, Brinkmann H, Savonna C, Cerff R. 1993. Evidence for a chimeric nature of nuclear genomes: eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl Acad. Sci. USA 90, 8692–8696. (doi:10.1073/pnas.90.18.8692)
  • 80. Tovar J, Leon-Avila G, Sanchez LB, Sutak R, Tachezy J, van der Giezen M, Hernandez M, Muller M, Lucocq JM. 2003. Mitochondrial remnant organelles of Giardia function in iron-sulphur protein maturation. Nature 426, 172–176. (doi:10.1038/nature01945)
  • 81. Emelyanov VV. 2003. Phylogenetic affinity of a Giardia lamblia cysteine desulfurase conforms to canonical pattern of mitochondrial ancestry. FEMS Microbiol. Lett. 226, 257–266. (doi:10.1016/S0378-1097(03)00598-6)
  • 82. Alvarez-Ponce D, McInerney JO. 2011. The human genome retains relics of its prokaryotic ancestry: human genes of archaebacterial and eubacterial origin exhibit remarkable differences. Genome Biol. Evol. 3, 782–790. (doi:10.1093/gbe/evr073)
  • 83. Cotton JA, McInerney JO. 2010. Eukaryotic genes of archaebacterial origin are more important than the more numerous eubacterial genes, irrespective of function. Proc. Natl Acad. Sci. USA 107, 17 252–17 255. (doi:10.1073/pnas.1000265107)
  • 84. Pisani D, Cotton JA, McInerney JO. 2007. Supertrees disentangle the chimerical origin of eukaryotic genomes. Mol. Biol. Evol. 24, 1752–1760. (doi:10.1093/molbev/msm095)
  • 85. Alvarez-Ponce D, Lopez P, Bapteste E, McInerney JO. 2013. Gene similarity networks provide tools for understanding eukaryote origins and evolution. Proc. Natl Acad. Sci. USA 110, E1594–E1603. (doi:10.1073/pnas.1211371110)
  • 86. Giaever G, et al. 2002. Functional profiling of the Saccharomyces cerevisiae genome. Nature 418, 387–391. (doi:10.1038/nature00935)
  • 87. Cox CJ, Foster PG, Hirt RP, Harris SR, Embley TM. 2008. The archaebacterial origin of eukaryotes. Proc. Natl Acad. Sci. USA 105, 20 356–20 361. (doi:10.1073/pnas.0810647105)
  • 88. Williams TA, Embley TM. 2014. Archaeal ‘dark matter’ and the origin of eukaryotes. Genome Biol. Evol. 6, 474–481. (doi:10.1093/gbe/evu031)
  • 89. Williams TA, Foster PG, Cox CJ, Embley TM. 2013. An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, 231–236. (doi:10.1038/nature12779)
  • 90. de Duve C. 2007. The origin of eukaryotes: a reappraisal. Nat. Rev. Genet. 8, 395–403. (doi:10.1038/nrg2071)
  • 91. Baldauf SL. 2003. Phylogeny for the faint of heart: a tutorial. Trends Genet. 19, 345–351. (doi:10.1016/S0168-9525(03)00112-4)
  • 92. Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McInerney JO. 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol. Biol. 6, 29. (doi:10.1186/1471-2148-6-29)
  • 93. Bryant D, Moulton V. 2004. Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Mol. Biol. Evol. 21, 255–265. (doi:10.1093/molbev/msh018)
  • 94. Kersanach R, Brinkmann H, Liaud MF, Zhang DX, Martin W, Cerff R. 1994. Five identical intron positions in ancient duplicated genes of eubacterial origin. Nature 367, 387–389. (doi:10.1038/367387a0)
  • 95. de Souza SJ, Long M, Klein RJ, Roy S, Lin S, Gilbert W. 1998. Toward a resolution of the introns early/late debate: only phase zero introns are correlated with the structure of ancient proteins. Proc. Natl Acad. Sci. USA 95, 5094–5099. (doi:10.1073/pnas.95.9.5094)
  • 96. Forterre P, Philippe H. 1999. The last universal common ancestor (LUCA), simple or complex? Biol. Bull. 196, 373–375; discussion 375–377. (doi:10.2307/1542973)
  • 97. Forterre P, Philippe H. 1999. Where is the root of the universal tree of life? Bioessays 21, 871–879.
  • 98. Harish A, Tunlid A, Kurland CG. 2013. Rooted phylogeny of the three superkingdoms. Biochimie 95, 1593–1604. (doi:10.1016/j.biochi.2013.04.016)
  • 99. Kurland CG, Collins LJ, Penny D. 2006. Genomics and the irreducible nature of eukaryote cells. Science 312, 1011–1014. (doi:10.1126/science.1121674)
  • 100. Kuo CH, Ochman H. 2009. Deletional bias across the three domains of life. Genome Biol. Evol. 1, 145–152. (doi:10.1093/gbe/evp016)
  • 101. Moran NA. 1996. Accelerated evolution and Muller's rachet in endosymbiotic bacteria. Proc. Natl Acad. Sci. USA 93, 2873–2878. (doi:10.1073/pnas.93.7.2873)
  • 102. Wacey D, Kilburn M, Saunders M, Cliff J, Brasier M. 2011. Microfossils of sulphur-metabolizing cells in 3.4-billion-year-old rocks of Western Australia. Nat. Geosci. 4, 698–702. (doi:10.1038/ngeo1238)
  • 103. Lake JA, Henderson E, Oakes M, Clark MW. 1984. Eocytes: a new ribosome structure indicates a kingdom with a close relationship to eukaryotes. Proc. Natl Acad. Sci. USA 81, 3786–3790. (doi:10.1073/pnas.81.12.3786)
  • 104. Lake JA, Rivera MC. 2004. Deriving the genomic tree of life in the presence of horizontal gene transfer: conditioned reconstruction. Mol. Biol. Evol. 21, 681–690. (doi:10.1093/molbev/msh061)
  • 105. McInerney JO, Wilkinson M. 2005. New methods ring changes for the tree of life. Trends Ecol. Evol. 20, 105–107. (doi:10.1016/j.tree.2005.01.007)
  • 106. Bogumil D, Alvarez-Ponce D, Landan G, McInerney JO, Dagan T. 2014. Integration of two ancestral chaperone systems into one: the evolution of eukaryotic molecular chaperones in light of eukaryogenesis. Mol. Biol. Evol. 31, 410–418. (doi:10.1093/molbev/mst212)
  • 107. Hirt RP, Alsmark C, Embley TM. 2015. Lateral gene transfers and the origins of the eukaryote proteome: a view from microbial parasites. Curr. Opin. Microbiol. 23, 155–162. (doi:10.1016/j.mib.2014.11.018)
  • 108. Marcet-Houben M, Gabaldon T. 2010. Acquisition of prokaryotic genes by fungal genomes. Trends Genet. 26, 5–8. (doi:10.1016/j.tig.2009.11.007)

Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
