SUMMARY
The concept of the minimal cell has fascinated scientists for a long time, from both fundamental and applied points of view. This broad concept encompasses extreme reductions of genomes, the last universal common ancestor (LUCA), the creation of semiartificial cells, and the design of protocells and chassis cells. Here we review these different areas of research and identify common and complementary aspects of each one. We focus on systems biology, a discipline that is greatly facilitating the classical top-down and bottom-up approaches toward minimal cells. In addition, we also review the so-called middle-out approach and its contributions to the field with mathematical and computational models. Owing to the advances in genomics technologies, much of the work in this area has been centered on minimal genomes, or rather minimal gene sets, required to sustain life. Nevertheless, a fundamental expansion has been taking place in the last few years wherein the minimal gene set is viewed as a backbone of a more complex system. Complementing genomics, progress is being made in understanding the system-wide properties at the levels of the transcriptome, proteome, and metabolome. Network modeling approaches are enabling the integration of these different omics data sets toward an understanding of the complex molecular pathways connecting genotype to phenotype. We review key concepts central to the mapping and modeling of this complexity, which is at the heart of research on minimal cells. Finally, we discuss the distinction between minimizing the number of cellular components and minimizing cellular complexity, toward an improved understanding and utilization of minimal and simpler cells.
INTRODUCTION
As recognized at the beginning of the current era of molecular systems biology, a cell could be as simple as our simplest definition of life (1). Indeed, all known life forms have the cell as their basic unit. On the other hand, the cell is the most complex structure in the micrometer size range known to humans (2). Despite several achievements in identifying and characterizing the molecular constituents of life, we are far from understanding how these constituents interact with each other and give rise to a robust and self-replicating system. Also, there is no widely accepted theory of how the first cells arose on Earth, nor has complete synthesis from scratch of simpler living cells been achieved in the laboratory. Therefore, at present, the minimal cell can be defined only on a semiabstract level as a living cell that has a minimal and sufficient number of components (3) and three main features: (i) some form of metabolism to provide molecular building blocks and energy necessary for synthesizing the cellular components, (ii) genetic replication from a template or an equivalent information processing and transfer machinery, and (iii) a boundary (membrane) that separates the cell from its environment. The necessity of coordination between boundary fission and the full segregation of the previously generated twin genetic templates could be added to this definition. Another fundamental characteristic that could be added to the essential features of a minimal cell is the ability to evolve, which is a universal characteristic among all known living cells (4).
From a physicochemical perspective, the minimal cell portrays the transition from nonliving to living matter, which can refer to the transition that occurred during the origin of life that preceded the evolution of species on Earth as well as the transition that is expected to be attained in the laboratory with the creation of an artificial living cell (5). The result of the former transition, usually called the last universal common ancestor (LUCA), universal common ancestor, last common ancestor, or cenancestor, roots the currently accepted tree of life from which all life forms are supposed to have evolved (6, 7). The hypothetical laboratory transition forms the basis of the concept of artificial cells, minimal cells fully created in the laboratory from known parts. It is often difficult to separate the concept of an artificial cell from that of a semiartificial cell which is, to some degree, built from biogenic parts. The pioneering work by J. Craig Venter's team is perhaps the best example of a semiartificial cell, having reported the first functional cell with its genetic material being an artificial, in vitro-synthesized chromosome (8).
Because of its interdisciplinary nature, the work on minimal cells has been closely linked with several lines of research, including minimal genomes, protocells, models of minimal cells, and chassis cells (Table 1).
TABLE 1.
Concept or construct | Short definition | References (scientific landmarks) | References (reviews)
---|---|---|---
Minimal genome | Simplified genome without nonessential genes (under specific environmental conditions) | 17, 25, 26 | 2, 19, 23, 115, 205
LUCA | Life form commonly accepted to have existed before the divergence of the domains Bacteria, Archaea, and Eukarya; hypothesized to have been inorganically hosted (a) | 7, 94, 95 | 80, 97, 132, 206–208
Chassis cell | Cell designed for use in industrial production processes with a high degree of controllability and efficiency | 4, 101, 102, 105 | 11, 12
Artificial/semiartificial cell | Cell built in the laboratory (at least partially) with resources to extant genetic and other biological material | 8, 145 | 5, 116, 209–211
Minimal cell models: protocells | In vitro models of a minimal cell usually containing some kind of biological material encapsulated in liposomes or other lipidic vesicles | 10, 139–141 | 9, 212, 213
Minimal cell models: in silico minimal cells | Virtual model/reconstruction of any of the possible constructs described above or any other model of a minimal “ome” relevant to the study of the minimal cell | 46, 156, 164–166, 169, 170 | 158, 168
(a) See reference 75.
Minimal cell models, as the name indicates, refer to any construct that exhibits certain characteristics of biological cells while having a considerably simpler nature. The simplicity of such constructs permits detailed study of the biological characteristics of interest. Minimal cell models comprise physical constructs, protocells, and theoretical models based on mathematical and/or computational descriptions that capture certain features of the living cells (9). Protocells are compartmentalized assemblies based on lipidic vesicles, polymeric or polypeptide capsules, colloidosomes, coacervates, and others (reviewed in reference 10) that usually encapsulate biological material such as organic chemicals, proteins, or RNA. Protocells have been considered models of states of transition toward fully functional living cells and have been mainly developed for studying the emergence of biological characteristics such as self-organization and replication in simpler assemblies of biochemical entities.
The concept that relates to the minimal cell from a more applied angle is that of the chassis or platform cell. The chassis cell can be defined as a cell with reduced complexity that is designed for one or several biotechnological applications and can be modified and controlled with precision and in a predictive manner (11). Although studies of minimal cells have often claimed to pursue both scientific and technological purposes, the two aims are often incompatible. For example, those bacterial cells that have evolved the smallest genomes in nature show slower and less efficient metabolism with low division rates, features that are opposite of those desired for a chassis cell (11, 12). Thus, the chassis cell will need to achieve a tradeoff between the simplicity or minimality needed for predictive manipulations and the complexity needed for robustness and efficiency.
In this review, the various concepts and approaches related to research on minimal cells are further discussed from a systems biology perspective. The plural terms “minimal cells” and “simpler cells” are preferred, as many configurations of each seem to be possible, given the observed high functional redundancy in biological networks.
A Systems Biology Perspective on Minimal Cells
Besides being the focus of fundamental and applied research for a long time, minimal genomes have been quasisynonymous with minimal cells since the sequencing of Mycoplasma genitalium in 1995 (13). M. genitalium is so far considered the microbe with the smallest autonomously replicating genome (∼580 kb) that can be grown in laboratory cultures (13). Recently, the focus of minimal cell research has been expanding beyond the genome, as high-throughput technologies are enabling system-wide quantifications of other biomolecules. These studies mainly include proteomics, lipidomics, metabolomics, and fluxomics. The exponential growth of different omics data sets and computational models has been helping biologists to integrate these data and to predict the behavior of whole cells. The study of life and, consequently, of minimal cells is thus facing a new paradigm, with systems biology beginning to be accepted as an approach that puts biology closer to the other natural sciences by establishing laws and enabling quantitative predictions (14).
Minimal or Simpler Cells?
When discussing minimal cells, there is frequently an association of two different concepts. The first concept relates minimal cells to the smallest number of components, implying cells with a small number of genes and expressed proteins. The second concept centers on the lowest complexity and connotes so-called simpler cells, cells with a behavior easier to predict and to manipulate. While the minimality in terms of the number of components is relatively straightforward to measure by genome sequencing and other high-throughput technologies, quantification of complexity has yet to be tackled. For example, the number and dynamics of the interactions between different biomolecules can be regarded as indicators of a cell's complexity (15). However, the technologies for mapping biomolecular interactions in a system-wide manner are yet to mature (16).
As the relationship between the number of components in a system and the system's complexity is often nonlinear, the minimal cell may not necessarily be the simplest cell. We therefore review the literature concerning both concepts. We start with systems with smaller numbers of components, from the minimal genome to the minimal proteomes and minimal nutritional requirements. Next, the special cases of the LUCA and chassis cells are reviewed. Later, different systems-level approaches toward minimal and simpler cell constructs are explored, namely, top-down, bottom-up, and the middle-out/integrative approaches. The last section discusses the importance of considering complexity in a holistic approach to minimal cells and the contribution of systems biology to attaining this goal.
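As a rough illustration of why component number and complexity need not scale together, the sketch below counts the possible pairwise interactions among n components, n(n - 1)/2. Using potential pairwise interactions as a stand-in for complexity is an assumption made here for illustration only, not a measure proposed in the works reviewed, and the function name is hypothetical. The input values are the gene-set sizes quoted later in this review (206, 256, and 382 genes).

```python
def possible_pairwise_interactions(n_components: int) -> int:
    """Upper bound on distinct pairwise interactions among n components.

    Assumption: every pair of components could, in principle, interact.
    Real interaction networks are far sparser than this bound.
    """
    if n_components < 0:
        raise ValueError("number of components must be non-negative")
    return n_components * (n_components - 1) // 2


# Gene-set sizes quoted in the text; the interaction counts are only upper bounds.
for n_genes in (206, 256, 382):
    print(n_genes, possible_pairwise_interactions(n_genes))
```

Under this crude bound, going from 206 to 382 components (less than a twofold increase) more than triples the number of possible pairs, which is one way to see that removing components and reducing complexity are related but distinct goals.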
TOWARDS THE SMALLEST NUMBER OF COMPONENTS
Finding the smallest number of components required to constitute a living cell is the classical approach used to understand and create minimal cells. One of the fundamental distinctions to be made here from the systems biology perspective is between a minimal set of components and a minimal “ome.” This distinction was introduced early in 1996, with the first comparative approach for two full genomes (17). A (minimal) genome, proteome, or another ome is the full, functional set of components within a (minimal) living cell, either sequenced, enumerated, or even not yet fully accessible, as in the case of the metabolome (18). On the other end of the spectrum, a (minimal) set is theoretical, derived from comparative or analytical studies, and has not been proven to be functional in a living cell.
Minimal Genome
As the genome was the first available ome in cell-level systems biology, searching for the smallest functional genome represents most of the state of the art for minimal cells. One comprehensive definition of a minimal genome was given by Koonin: “the smallest possible group of genes sufficient to sustain a functional cellular life form under the most favorable conditions imaginable, that is the presence of a full complement of essential nutrients and the absence of environmental stress” (19). The phrase “most favorable conditions” should be emphasized, which in practice indicates that one minimal cell may have extremely complex nutritional requirements. The smallest prokaryotic genomes sequenced to date belong to species not considered autonomously alive, which, while missing essential genes, became entirely dependent on much more complex hosts: insects (20). “Candidatus Carsonella ruddii” has an impressive 160-kb genome (21), and “Candidatus Hodgkinia cicadicola” has an even smaller one, with 144 kb, which leaves scientists at the edge of considering them organelles, as in the case of mitochondria and chloroplasts (22). The genome of “Candidatus Carsonella ruddii” lacks genes involved in cell envelope biogenesis and metabolism of nucleotides and lipids (21) and also lacks genes involved in DNA replication, transcription, and translation, which are essential for any bacterial cell to live autonomously (22). However, achieving a minimal genome implies that the microorganism containing it should be accessible with current isolation and cultivation techniques without the aid of another living host, as emphasized by Mushegian, who defined a minimal genome as the “smallest number of genetic elements sufficient to build a modern-type free-living cellular organism” (23). As mentioned above, the smallest natural genome capable of autonomous growth or laboratory cultivation in pure culture and also in a defined medium (24) is the one of M. genitalium, with 580 kb (13).
The first theoretical minimal gene set was proposed by Mushegian and Koonin based on a system-wide comparison of Haemophilus influenzae and M. genitalium genomes, consisting of 256 genes (17). Later, one integrative study utilized a larger data set including results from both experimental and computational approaches for the minimal genome and predicted a set of 206 genes for a theoretical minimal gene set (25). This minimal gene set included genes for DNA replication, repair, restriction, and modification; a basic transcription machinery; aminoacyl-tRNA synthesis; tRNA maturation and modification; ribosomal proteins; ribosome function, maturation, and modification; translation factors; RNA degradation; protein processing, folding, and secretion; cellular division; transport; and energetic and intermediary metabolism (glycolysis, proton motive force generation, pentose phosphate pathway, lipid metabolism, and biosynthesis of nucleotides and cofactors). Those authors did not include rRNA or tRNA genes, and they recognized that the basic substrate transport machinery could not be clearly defined, even though this minimal cell would rely greatly on the import of several substrates, including all 20 amino acids (for which it had no biosynthetic ability). Theoretical minimal gene sets will need to be tested in vivo to be qualified as minimal genomes. The technology to synthesize full genomes has been developed only very recently, and it has not yet been applied toward this goal (8).
Determining a minimal gene set is frequently associated with predicting which genes are essential for a species. M. genitalium was the first organism to be analyzed in a large-scale essentiality assay, with between 265 and 350 genes being identified as essential (26). Proof of gene dispensability, however, requires isolation and characterization of pure clonal populations, which was not done in that study. This gap was later filled by that same team, who identified 382 essential genes; the difference in the number of essential genes might have been due not only to mutant complementation in the previous approach but also to different medium conditions (27). Several other prokaryotes were targets of genome-wide essentiality studies, for either antibiotic design or antimicrobial control, providing important data sets for benchmarking results. These organisms include Acinetobacter baylyi (28), Caulobacter crescentus (29), Francisella novicida (30), Haemophilus influenzae (28), Helicobacter pylori (31), Salmonella enterica serovar Typhimurium (32), Staphylococcus aureus (33, 34), Neisseria meningitidis (35), and Vibrio cholerae (36). Both the DEG (37) and OGEE (38) databases centralize much of these data.
Essential gene sets obtained by determining all viable single knockouts of a species are always a subset of a possible minimal genome, due to synergistic effects. In other words, these sets exclude genes that are not essential when deleted individually but that cause cell death when deleted simultaneously, also termed synthetic lethal genes. Higher-structure chromosomal effects will also not be evident when genes are deleted individually (reviewed in reference 2). Also, essential gene sets usually lack essential noncoding sequences that would be part of a minimal genome, such as essential promoter regions, tRNAs, small noncoding RNAs, and other noncoding sequences with unknown but essential functions. A recent genome-scale essentiality study identified and described 130 essential noncoding elements of Caulobacter crescentus, including 90 intergenic segments of unknown function (29).
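The gap between a single-knockout essential gene set and a viable minimal genome can be sketched with a toy model. Everything in the block below is hypothetical: the four gene names, the mapping of genes to functions, and the viability rule (a cell is viable if every required function is still provided by at least one gene) are assumptions chosen only to reproduce the logic of synthetic lethality described above.

```python
from itertools import combinations

# Hypothetical toy genome (not real annotations): gene -> function it provides.
GENE_FUNCTION = {
    "geneA": "replication",
    "geneB": "translation",
    "geneC": "glycolysis",  # geneC and geneD are functionally redundant
    "geneD": "glycolysis",
}
GENOME = frozenset(GENE_FUNCTION)
REQUIRED_FUNCTIONS = frozenset(GENE_FUNCTION.values())


def is_viable(genes):
    """Toy viability rule: every required function has at least one gene left."""
    return REQUIRED_FUNCTIONS <= {GENE_FUNCTION[g] for g in genes}


def single_knockout_essentials(genome):
    """Genes whose individual deletion abolishes viability."""
    return {g for g in genome if not is_viable(genome - {g})}


def synthetic_lethal_pairs(genome):
    """Pairs of individually dispensable genes that are lethal when deleted together."""
    dispensable = sorted(genome - single_knockout_essentials(genome))
    return [p for p in combinations(dispensable, 2) if not is_viable(genome - set(p))]


essential = single_knockout_essentials(GENOME)
print(sorted(essential))               # ['geneA', 'geneB']
print(synthetic_lethal_pairs(GENOME))  # [('geneC', 'geneD')]
print(is_viable(essential))            # False: the essential set alone is not viable
```

In this toy case, the single-knockout screen reports two essential genes, yet any viable reduced genome needs three (geneA, geneB, and either geneC or geneD), so the essential set is a strict subset of each possible minimal genome.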
It is now commonly accepted in the scientific community that multiple minimal genomes can exist. Currently known prokaryotic genomes are complex and highly adapted, exhibiting functionally equivalent components with different evolutionary origins, named nonorthologous displacements (NODs). In order to reduce the number of potential combinations, one rational direction is to identify a minimal genome for a number of functional niches or to determine the minimal gene set for a thermophilic autotroph or a mesophilic heterotroph, among others (19).
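The effect of nonorthologous displacement on a comparative gene set can be sketched as follows. The gene identifiers, ortholog group labels, and function assignments are all hypothetical, and the two-genome comparison is a simplified illustration of the general idea rather than the procedure used in the studies cited.

```python
# Hypothetical annotations (illustrative only): gene -> (ortholog group, function).
GENOME_1 = {
    "g1_001": ("OG_1", "DNA replication"),
    "g1_002": ("OG_2", "transcription"),
    "g1_003": ("OG_3", "nucleotide phosphorylation"),
    "g1_004": ("OG_4", "cell wall synthesis"),
}
GENOME_2 = {
    "g2_001": ("OG_1", "DNA replication"),
    "g2_002": ("OG_2", "transcription"),
    "g2_003": ("OG_9", "nucleotide phosphorylation"),  # unrelated gene, same function
}


def ortholog_groups(genome):
    """Set of ortholog groups represented in a genome."""
    return {group for group, _ in genome.values()}


def functions(genome):
    """Set of functions encoded by a genome."""
    return {function for _, function in genome.values()}


def shared_by_orthology(genome_a, genome_b):
    """Functions carried by ortholog groups present in both genomes."""
    shared_groups = ortholog_groups(genome_a) & ortholog_groups(genome_b)
    return {fn for group, fn in genome_a.values() if group in shared_groups}


def displaced_functions(genome_a, genome_b):
    """Functions present in both genomes but encoded by nonorthologous genes."""
    shared_functions = functions(genome_a) & functions(genome_b)
    return shared_functions - shared_by_orthology(genome_a, genome_b)


print(sorted(shared_by_orthology(GENOME_1, GENOME_2)))
print(sorted(displaced_functions(GENOME_1, GENOME_2)))
```

A comparison based on orthologs alone would miss the displaced function in this toy case, and either of the two nonorthologous genes could fill it, which is one reason why more than one minimal gene set is possible.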
Other Minimal Sets of Components
The cell-level evaluation of components other than the genome includes functional inferences from the genome at the protein level, directly generating theoretical minimal proteomes by assuming a general translation from the genome. Recently, this functional inference has allowed other omics approaches that analyze whole sets of specific genetic sequences. One example is a comparison of complete sets of tRNA isoacceptors (tRNomics) and tRNA/rRNA modification enzymes (modomics) in all sequenced Mollicutes, a class of bacteria that lacks a cell wall and includes the genus Mycoplasma (39). In that study, it was shown that the organisms have developed different strategies to minimize the RNA component of the translation apparatus. Even given a good representation of the RNA modification enzymes in the genomes of these bacteria (up to 6% in M. genitalium), only 9 enzymes were identified as being more resistant to loss in Mollicutes (39). This finding indicates that even in extremely reduced genomes, for the most basic processes of the cell, such as translation and codification, different strategies can be adopted.
Recently, the whole methylomes of M. genitalium and Mycoplasma pneumoniae were analyzed at a single-base resolution, suggesting a potential role for methylation in regulating the cell cycle and gene expression in these reduced bacteria (40). In another study, the whole transcriptome of Prochlorococcus marinus MED4, the smallest known photosynthetic organism considering both genome and cell size, was analyzed, with a focus on the effects of the light cycle (41). It was found that 90% of the annotated genes of this species were expressed under some condition, and 80% showed cyclic expression together with the light-dark cycle, including genes involved in the cell cycle, photosynthesis, and phosphorus metabolism. While measurements of the proteome and the metabolome are not available for Prochlorococcus, transcriptomics allowed per se the identification of specific metabolic transitions and possible regulatory proteins for these minimal photosynthetic bacteria (41).
Minimal protein sets have recently begun to be inferred by integrating experimental data. This is a step in moving from functional inference from minimal genomes toward a real assessment of minimal proteomes. Pioneering work included a comparison of 17 prokaryotic genomes by integrating a database of experimentally determined unique peptides to define a core proteome (42). The authors of that study predicted 144 orthologs for the core genome, of which ∼74% were actually expressed in all species. More than half of this core proteome was related to protein synthesis, but strikingly, 10 proteins had not been functionally characterized. That study also identified differences in the proteomes associated with the different lifestyles of the bacteria analyzed, and the authors concluded that the phenomenon of phenotypic plasticity has an impact on the minimal proteome, which could not be accessed simply by comparing genomes (42). In another work, the proteomes of Acholeplasma laidlawii and Mycoplasma gallisepticum were analyzed by two-dimensional (2D) electrophoresis, matrix-assisted laser desorption ionization (MALDI), and liquid chromatography-mass spectrometry (LC-MS) (43) and compared to the proteome of Mycoplasma mobile obtained in another study (44). Clusters of orthologous genes (COGs) were used to compare both the genomes and proteomes of the three Mollicutes species (43). Two hundred twelve COGs were identified as being part of the core proteome, including DNA replication, repair, transcription, and translation and molecular chaperones. Some metabolic pathways were also represented in this core proteome, including glycolysis, the nonoxidative part of the pentose phosphate pathway, glycerophospholipid biosynthesis, and the synthesis of nucleoside triphosphates (43). One surprising finding was the low level of conservation of proteins related to cell division, as only two proteins were conserved in the core: FtsH and an Smc-like protein. Strikingly, the genome of M. mobile does not even contain FtsK or FtsZ, which indicates that the essential process of cell division has greater plasticity than other cellular systems (43). Building on results of another study of the interactome of M. pneumoniae (45), those authors also concluded that most COGs in the Mollicutes core proteome (140) are expected to associate in protein complexes, and 54 COGs are predicted to participate in more than one complex (43). Due to secondary functions of such complexes, such as the maintenance of overall cellular stability (and particularly genome stability), which could explain the maintenance of incomplete metabolic pathways in reduced genomes, those authors proposed that the concept of a minimal genome should be treated not as a set of essential functions but as a set of essential structures (43).
Another system that can be analyzed at the cell level is the metabolic network of an organism. Given that the whole metabolome is still not accessible due to technological limitations, studies in this area are mainly computational. A minimal metabolic network of 50 enzymatic reactions was derived from the theoretically inferred minimal gene set of Gil et al. (25); it was shown that the encoded metabolism was consistent and that the network's topological parameters were similar to those of natural metabolic networks (46). Another work performed data mining on the KEGG Pathways database in an effort to obtain a minimal anabolic network and the correspondent minimal metabolome for a reductive chemoautotroph (47). The resulting metabolic network comprised 287 metabolites, with more than half being intermediates in the biosynthesis of monomers.
Recently, a series of three papers reported a variety of analyses of M. pneumoniae, a reduced-genome bacterium. These studies included the determination of the proteome (45), the transcriptome (48), and a metabolic network that allowed the identification of a minimal medium that supported growth of M. pneumoniae as well as of M. genitalium (24). This series was a pioneering step forward in the integration of omes other than the genome in the minimal cell panorama and also in the use of the power of a holistic system perspective for the study of a single species.
Work on minimal omes other than the genome facilitated the analysis of the impact of different environmental conditions on minimal sets, mainly through transcriptomics and expression proteomics (42). Also, proteomics permits insight into the spatial organization of minimal cells by analyzing which protein complexes are assembled and which structural functions these complexes could have (43, 45). On the negative side, environment-dependent cell-level analyses are often more prone to errors than genome sequencing. The technology for expression proteomics is still under development, and proteins with extreme physical and chemical properties, such as low mass and high hydrophobicity, including membrane proteins, can be underrepresented in these assays (49). Moreover, some proteins might be dispensable under optimal growth conditions and expressed only under specific stress conditions. Such proteins will be missed, and the size of the core transcriptome and proteome will be underestimated, if the experimental setup does not include sufficiently diverse conditions.
Minimal Environmental Conditions for Life
Evolution enabled many alternative ecological niches and nutritional pathways for prokaryotes, and there is no experimental or even conceptual support for the existence of just one form of a minimal prokaryotic cell from a metabolic point of view, as recognized by Szathmáry (50), Koonin (19), and Gil et al. (25). Many minimal metabolic networks adapted to different habitats could sustain the universal genetic machinery, the translation and transcription apparatus, which are usually more conserved and similar among distantly related prokaryotes. Depending on environmental conditions such as temperature, pH, salinity, and especially the nutrients available in a specific niche, organisms could differ substantially and still have a reduced number of genes. Here an important minimal set, almost absent in the scientific literature, comes to the scene as a major player in the study and design of minimal cells: the minimal, defined media able to sustain such cells. Minimal medium is not a biological component per se, but it is an emergent biological property that directly reflects the degree of dependence of the cell on the environment.
Currently, there are no comprehensive comparative studies on the different minimal nutritional requirements of different prokaryotic organisms. However, a variety of older studies exist that seem to have been largely forgotten. A good example is the extensive work started in the 1950s by MacLeod and coauthors on minimal nutritional requirements of marine bacteria (51, 52). Those authors explored and presented several combinatorial possibilities for the composition of defined media, mentioning special needs for amino acids as sole carbon sources or as supplements in addition to non-amino acid sources of carbon and energy and also identifying special needs for ions, vitamins, and other growth factors (51). Bryant and Robinson reviewed work on nutritional requirements of ruminal bacteria and corroborated the conclusion that volatile fatty acids are essential for the growth of several of these organisms, as is ammonium, which is required regardless of the amount of amino acids and peptides present in the medium (53).
The study of mutations leading to specific auxotrophies in bacteria also started several decades ago, long before the DNA structure was discovered (54). Fundamental for the identification of the different steps of metabolic pathways, the classical study of auxotrophies is also central to the study of minimal or simpler cells by identifying possible pathways for viability after gene inactivation.
Older studies on nutritional requirements also include the interesting finding that minimal nutritional requirements increase at extreme temperatures for strains of Lactobacillus plantarum (55) and Escherichia coli (56) and several strains of thermophilic bacilli (57). This implies that genome reductions starting from these species will have to take into account the conditions that the cells will face in artificial cultures.
Extensive nutritional requirements were predicted for previous theoretical minimal gene sets, including all amino acids, nucleotides, fatty acids, and complex coenzymes (17). The number of components of a minimal medium is therefore not a limiting factor for designing and deriving theoretical minimal cells, as long as it does not require other living cells (it remains an axenic culture). However, it certainly becomes a limitation for industrially relevant chassis cells, which must be efficient and profitable (see Chassis Cells, below). The organisms used most often in minimal cell studies for biotechnological applications, E. coli and Bacillus subtilis, are facultative anaerobes, highly versatile organisms with relatively simple nutrient requirements (58). Indeed, E. coli probably has the simplest growth requirements known so far: a medium composed of as few as seven substances corresponding to eight components, disodium phosphate, monopotassium phosphate, sodium chloride, ammonium chloride, magnesium sulfate, calcium chloride, and one carbon source, can sustain growth (59). However, it should not be overlooked that some trace metals are also considered essential but are not added to the medium, as they are present in sufficient amounts in water, including copper (60), nickel and cobalt (61), molybdenum (62), iron (63), manganese (64), and zinc (65). Together, the defined substances and these trace metals probably constitute the simplest growth requirements known so far for prokaryotes. An extensive review of nutritional requirements of microorganisms used in fermentation processes covers interesting points, such as why each of the principal elements is needed for the cell's physiology; the major requirements (carbon, nitrogen, sulfur, trace elements, vitamins, and other growth factors) and also physicochemical constraints of growth, such as pH and ionic strength; and the effect of concentrations on growth rates (66).
Defining minimal media for minimal cells also requires definition of a minimal threshold of growth rates. Achieving a clear exponential phase might not be a necessity for the fundamental pursuit of a minimal/simpler cell, while for biotechnological applications, minimalism will have to cope, in a more complex tradeoff, with a minimum yield of biomass and a minimum specific growth rate.
It is estimated that only approximately 1% of bacteria on Earth can be readily cultivated in vitro (67). With this lack of technological capabilities regarding cultivation of prokaryotic cells, there is a great possibility that simpler organisms with more complex requirements might go unnoticed. Organisms that cannot be maintained in a bacteriology culture collection, even in the richest media known, are commonly named “Candidatus” (68). This is a useful term that is not completely implemented within the scientific community. There are no reports of the cultivation of Buchnera aphidicola without insect cells (69, 70); however, as this genus was discovered before the implementation of this nomenclature, and there is sufficient biochemical information available on it, it is not named “Candidatus” (71). While in many cases, unknown nutritional requirements are the reason for the impossibility of cultivating an organism in vitro, “Candidatus” species may also require their host's cells due to unknown physical constraints.
Until recently, M. genitalium was difficult to grow in defined media, and efforts were made to calculate the best composition of such a medium by using genome-scale metabolic modeling (24, 72). These system-level approaches are certainly a promising direction in the field of estimating prokaryotic minimal nutritional requirements.
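The idea of using a metabolic model to search for a minimal medium can be sketched with a deliberately small example. The block below is not a reproduction of the genome-scale models cited above: it uses a qualitative reachability (network expansion) check and a brute-force search over candidate nutrients, with no stoichiometry or fluxes. The three reactions, the metabolite names, the biomass targets, and the candidate nutrient list are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy network: each reaction is (required substrates, products).
REACTIONS = [
    ({"glucose"}, {"pyruvate", "atp"}),
    ({"pyruvate", "ammonium"}, {"alanine"}),
    ({"atp", "phosphate"}, {"nucleotide"}),
]
# Hypothetical biomass precursors that the toy cell must be able to make.
TARGETS = {"alanine", "nucleotide"}
# Hypothetical nutrients that could be supplied in a defined medium.
CANDIDATE_NUTRIENTS = ["ammonium", "glucose", "phosphate", "pyruvate", "sulfate"]


def producible(medium):
    """Metabolites reachable from the medium by repeatedly firing reactions."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in REACTIONS:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def smallest_media(candidates, targets):
    """All nutrient subsets of the smallest size that allow every target to be made."""
    for size in range(len(candidates) + 1):
        hits = [set(m) for m in combinations(candidates, size)
                if targets <= producible(m)]
        if hits:
            return hits
    return []


print(smallest_media(CANDIDATE_NUTRIENTS, TARGETS))
```

Exhaustive search is feasible only for a handful of candidate nutrients, since the number of subsets doubles with each nutrient added. Genome-scale approaches typically rely on constraint-based optimization instead, and that substitution should be kept in mind when reading this sketch.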
LUCA AND THE FIRST CELLS
Since the first proposal of the common ancestry theory, described by Charles Darwin in his seminal book On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life (73), much has been debated and speculated about the origin of life and the nature of a possible cell or set of cells that preceded the evolution of the three main lineages of life forms known today: Archaea, Bacteria, and Eukarya. The strongest support for this theory comes from the shared biological features of the three domains, including double-stranded DNA to encode genetic information, transcription to RNA, translation to proteins that are the universal operators of cellular functions, lipidic membranes, and primary metabolism, among others. Other evidence includes the high-level homologies of biological structures with different functions, indicating divergent evolution from a common ancestor; the congruence of morphological and molecular phylogenies; the agreement between phylogeny, the paleontological record, and biogeography; and the hierarchical classification of morphological characteristics (7).
Recent theoretical work on the appearance of the LUCA (74) made a vital connection between the theory of an inorganically hosted origin of cells (75) and the origin of genomes. The hypothesis of the inorganically hosted LUCA was first posed in 1997 by Russell and Hall, with the premise that it was based on “what life does rather than what life is” (75). This hypothesis was a detailed, complex description of 17 stages of geochemical transformation in a submarine hydrothermal spring, where iron monosulfide bubbles were the hatcheries for the first cells. In a later publication, Russell et al. significantly developed the geochemical details of this theory, specifically the implications of temperature and energetics for the primitive origin of cells (76). In that same year, more biochemistry was incorporated into the theory, including a comparison of the amino acid sequences of the enzymes of glycolytic pathways in eukaryotes and prokaryotes and a simplification of the visual model of the origin of life in hydrothermal vents (77). With its claim that the first free-living cells were eubacterial and archaebacterial chemoautotrophs that emerged more than 3.8 billion years ago from inorganic compartments (77), this is probably the most widely accepted theory of the origin of life so far (74, 78). The geochemical conditions of early Earth and those of other planets in the solar system where life might have originated were discussed comprehensively elsewhere (79).
It has been proposed that the universal ancestor should have been a fully DNA- and protein-based organism with extensive processing of RNA transcripts and should have had an extensive set of proteins for DNA, RNA, and protein synthesis and DNA repair and recombination; control systems for the regulation of genes and cell division; and chaperone proteins, and it probably lacked operons (80). However, there is still uncertainty in the literature regarding the question of whether the LUCA's genetic machinery was based primarily on RNA or DNA and, if it had DNA, how it was replicated (81, 82). In a comparison of sequences of proteins involved in DNA replication, it was proposed that the LUCA had a genetic system that contained both RNA and DNA, but the latter was, at that time, produced by reverse transcription (83).
Recently, the first formal tests of the LUCA hypothesis were performed by Theobald, with statistical evidence corroborating the monophyly of all known life (7). In that study, Theobald ignored the commonly assumed sequence similarity as a proof of common ancestry, as sequence similarity can be a result of convergent evolution due to selection, structural constraints on sequence identity, mutation bias, chance, or artifact manufacture (7). Although this was the first formal attempt at establishing the LUCA theory on a statistical basis, others claim that the tests performed were not sufficient to reject the alternative hypothesis of separate origins of life (84). Theobald replied with improvements of the models used for the formal test and emphasized that his work did not provide absolute proof for the theory of a LUCA but mentioned several strong arguments in favor of it, such as the low sequence requirements for a specific fold and the enormity of the sequence space (85). Although the alternative hypothesis of separate origins cannot be absolutely ruled out (78, 79), a single common ancestry is currently the best-supported theory of the origin of life. Several extended perspectives and reviews focusing on the issue have been published (Table 1), while the focus here is on systems approaches concerning the LUCA.
A prominent systems biology initiative concerning the LUCA is LUCApedia, a recently launched online database that integrates different data sets related to the LUCA and its predecessors (86). With this database, users working on the LUCA hypothesis have a tool for benchmarking their results with other studies predicting the characteristics of the LUCA, searching by protein name or identification in data sets for COGs, protein domain folds, protein structures, and cofactor usage, etc. (86). Comparative studies make up the vast majority of the system-level approaches for the LUCA, with a focus on genome sequences (87, 88), protein domains (89–91), and proteome hydrophobicity (92). A comprehensive review concerning comparative genomics and its role in defining the LUCA's theoretical gene sets suggests that the estimated genome size of the LUCA is 500 to 600 genes (93). A comparison of protein folds from all three domains of life found approximately 50 folds that are present in all three domains (89), and one study that used the COG database found 80 COGs present in all organisms studied across the three domains of life, 50 of which show the same phylogenetic pattern as rRNA (which the authors called three-domain genes) (94). Of the 50 three-domain genes, 37 were associated with the ribosome in modern cells (94). Another interesting study looked at a large set of diverse predicted proteomes to infer the evolution of hydrophobicity (92). By using the percentage of the most hydrophobic residues in proteins, a universal “oil escape” was observed, indicating that the LUCA was more hydrophobic than modern cells (92).
One of the major problems of comparisons of whole genomes or proteomes in order to infer the LUCA's composition arises due to the relatively unknown extents of horizontal gene transfer (HGT) and gene loss (93), which generate phylogenetic trees not compatible with the rRNA phylogenetic tree topology. Mirkin et al. analyzed the extent of HGT by using the COG database to construct trees for all the COGs and found an approximately equal likelihood of HGT and gene loss events in the evolution of prokaryotic genomes (95). Although those authors state that their intent was not to reconstruct the functional aspects of the LUCA but rather to make a preliminary attempt at constructing evolutionary scenarios by using comparative genomics data, they support the plausibility of a set of ∼572 genes as being sufficient to sustain a functioning LUCA (95). Even though this and other studies have approached HGT events and gene losses within the LUCA context (95, 96), it is still relatively difficult to estimate the extent of the bias that they cause in comparative approaches. There may have been genes present in the LUCA that were lost before all the major lineages diverged, so when genomes are compared at present, those ancestral genes do not appear in the common pool. Also, some genes may not have been present in the LUCA but, after originating, spread quickly by HGT, being present today in all known microorganisms (93). The presence of de novo synthetic pathways in some but not all prokaryotes may therefore leave some uncertainty about which metabolic routes were taken by the universal ancestor.
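The two biases described above can be illustrated with a toy set comparison. The following is a minimal sketch, not a reconstruction of any cited analysis: all gene and lineage names are hypothetical, and the "true" ancestral set is simply assumed so that the errors of a strict intersection across genomes become visible.

```python
# Toy illustration of how gene loss and HGT bias a "conserved core" estimate.
# All gene and lineage names below are hypothetical placeholders.

# Assumed (unknowable in practice) set of genes present in the ancestor.
ancestral = {"geneA", "geneB", "geneC", "geneD", "geneE"}

# Present-day genomes:
# - geneE was lost before the lineages diverged (absent everywhere),
# - geneD was lost in one lineage only,
# - geneX arose later and spread to every lineage by HGT.
genomes = {
    "bacterium_1": {"geneA", "geneB", "geneC", "geneD", "geneX"},
    "bacterium_2": {"geneA", "geneB", "geneC", "geneD", "geneX"},
    "archaeon_1": {"geneA", "geneB", "geneC", "geneX"},
}

# Strict comparative core: genes found in every genome.
core = set.intersection(*genomes.values())

# Ancestral genes missed by the comparison (gene loss).
missed = ancestral - core
# Non-ancestral genes wrongly placed in the core (HGT).
spurious = core - ancestral

print("core:", sorted(core))
print("missed ancestral genes:", sorted(missed))
print("spurious core genes:", sorted(spurious))
```

In this sketch the strict core both underestimates the ancestral set (through loss) and overestimates it (through HGT), which is the reason why the extent of each process matters for comparative estimates.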
The transition from organic chemical compounds to cells is still an extremely delicate subject in biology (97). The vast amount of data that modern experimentalists face in a rapidly evolving technological scenario may be the causative agent of a seemingly increasing distance between experimental approaches and theoretical work taking into account the geochemical context of early life. This gap can be diminished with approaches that are becoming more holistic. The search for the LUCA's minimal omes using evolutionary perspectives will undoubtedly contribute to and benefit from the generic quest for the minimal cell, as the examples mentioned above illustrate. The theory of an inorganically hosted origin of life (74, 77) can shed light on the design of membrane-free minimal cell systems. Similarly, the current discussion on the basis of the LUCA's genetic machinery (82) opens a possibility for minimal cell design based solely on RNA genomes. Also, studies of the LUCA directly benefit from those of minimal cells: while minimal gene sets are theoretical and do not explicitly incorporate evolution, comparative genomics is based on orthology and its resulting minimal gene sets should be related to those of ancestral life forms (93).
CHASSIS CELLS
Probably the most proclaimed reason for the recent interest in minimal cells and related minimal data sets (e.g., the minimal genome and minimal metabolic networks) has been the potential for biotechnological applications. When referring to a minimal cell that is intentionally simplified for use in industry, the terms platform cell or factory cell (12) or the term chassis cell (11) is preferred. This conceptual construct is of extreme importance for biotechnology industries, as it implies more specialized and more comprehensible cells for biological production of industrial chemicals and pharmaceuticals.
Microbial cells have been shown to be extremely profitable in many applications, thanks to the catalytic power of enzymes and also the large panoply of products that they can synthesize. Nevertheless, these cell factories still remain, to a large extent, black boxes that often surprise engineers. In industrial bioprocesses, as opposed to scientific discovery, no surprises are desired, and total control over a specially designed and fully comprehensible chassis cell is the ultimate goal. This fact has led some to argue that a minimal cell would be interesting for industry due to its supposed simplicity; however, this is highly debatable, as shown in Table 2, where the predicted requisites of a chassis cell are enumerated based on two recent, comprehensive reviews (11, 12). One of the details that can be controversial in comparisons of industrially driven to scientifically driven minimal cells is the necessity for evolution: some have argued that, ideally, no evolution should occur in a chassis cell (4). A recent study proposed that evolvability is inevitable and can actually increase without any pressure for adaptation in a population model, given that it is the result of the exploration of the genetic space (98). Evolution seems to be a process inextricable from DNA replication, and it can also be seen as necessary to improve organisms through evolutionary engineering, for which major achievements have been reviewed elsewhere (99, 100). In populations of chassis cells that maintain evolvability, optimized pathways and enzymes and better growth rates could be selected for in desired media, either complex or defined.
TABLE 2.
Requirement for a chassis cell |
---|
Overall simplicity |
Minimal no. of carbon sinks and other nonoptimal flux paths |
Predictable metabolic and regulatory networks (more control over growth and production) |
Simplified translation code |
Reduced genetic drift and limited evolvability |
Robust mechanisms for genome replication, cytokinesis, and coordination in between |
Robust cell membrane and cell wall that confers resistance to shear stress in bioreactors |
Efficient transcription, translation, and regulation for optimization of cellular fluxes to desired goals |
Availability of predictive mathematical models that save expensive trial resources |
Process-specific modules for implementation of different industrial solutions (particular for each process) |
Other stress tolerance mechanisms, such as product tolerance, high substrate tolerance, and tolerance to low O2 levels |
A chassis cell must balance factors that pull between simplicity and complexity: precise control often requires simplicity, whereas energetic and nutritional efficiency and high productivity tend to require complex pathways within relatively large networks. Model organisms such as E. coli and B. subtilis, which are well studied and display robust growth, have been preferred subjects of genome-reducing approaches for chassis cells (4, 101–103). When an industrial biotechnology process is discussed, even the complexity of a eukaryote can be accepted as the minimum simplicity, e.g., if synthesis of eukaryotic proteins is desired (104).
Several large projects for genome reduction of industrially relevant prokaryotes have achieved satisfactory results so far. B. subtilis MG1M, based on an ∼1-Mbp deletion of B. subtilis 168, showed little reduction in growth and comparable enzyme productivity (101). B. subtilis MBG874 was achieved by a deletion of 874 kb (20% of the original genome size) and showed a reorganization of the gene expression network and productivities of extracellular cellulase and protease that were 1.7- and 2.5-fold higher than those of wild-type (WT) cells, respectively (105). E. coli MGF-01 was obtained after successive deletions of genomic fragments from E. coli K-12 (a total deletion of about 1 Mbp, or 22% of the genome) and showed improved growth and high-level threonine productivity compared to the wild-type strain (102, 106). E. coli MDS42, obtained by a 14.3% reduction of the genome of E. coli K-12 (103), showed genome stabilization, high electroporation efficiency (103), reduced evolvability (4), and, later, an 83% increase in L-threonine production after metabolic engineering, compared to an E. coli MG1655 strain engineered with the same modifications (107).
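The reported reduction percentages can be checked against the genome sizes listed in Table 4 with simple arithmetic. The sketch below is illustrative only: the helper name `percent_deleted` is hypothetical, and the parental genome sizes (4.21 Mbp for B. subtilis 168 and 4.64 Mbp for E. coli K-12 MG1655) are taken from Table 4 rather than from the individual studies.

```python
def percent_deleted(deleted_kbp, genome_kbp):
    """Return the deleted fraction of a genome as a percentage."""
    return 100.0 * deleted_kbp / genome_kbp


# (deleted size in kbp, parental genome size in kbp from Table 4)
strains = {
    "B. subtilis MBG874": (874, 4210),
    "E. coli MGF-01": (1000, 4640),  # "about 1 Mbp" deleted
}

for name, (deleted, genome) in strains.items():
    print(f"{name}: {percent_deleted(deleted, genome):.1f}% of the genome deleted")
```

The values obtained this way (about 20.8% and 21.6%) are close to the reported 20% and 22%; the small differences presumably reflect rounding and the exact reference genome size used in each study.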
Interesting modifications and bottlenecks to be tackled in biotechnological production have been identified by using genome-scale network reconstructions (GENREs) (108), and future designs of chassis cells might emerge from these methods. Accurate submodels of E. coli MG1655 have been derived for aerobic, carbon-limited growth on a chemically defined medium with glucose, glycerol, and acetate as carbon sources (109). These models were created from subsets of reactions from the first E. coli GENRE (110) with the biomass composition as a function of the growth rate (109). Several other metabolic models have been developed, and their applications have been reviewed elsewhere (108). However, regarding modeling of the dynamics of chassis cells in synthetic biology, the focus has been more on modeling individual modules than on modeling whole chassis systems (111).
It seems evident that for chassis cell design, an integrative and pragmatic approach is required (Table 2), along with the best understanding possible of the model organisms to be used. Between the widely used organism E. coli and the minimal organism M. genitalium, there are considerable differences that should be taken into account in time-constrained industrial projects. Even though E. coli has roughly 10 times as many protein-coding genes as M. genitalium, a search for species names returned 276 times more abstracts on Medline for the former. The species knowledge index (SKI) is a measure of the amount of scientific literature available for an organism, defined as the number of abstracts on Medline referring to the species, normalized by the number of genes in the genome (112). The SKI at present is 31 times higher for E. coli than for M. genitalium (Table 3). Although a larger amount of scientific literature does not necessarily imply more knowledge, it is certainly a good indication that there are more scientific data for E. coli than for M. genitalium, which will provide a more solid basis for future interventions using the former species. However, it is not only knowledge about E. coli that makes this species a more promising starting point for the development of chassis cells. The versatility and network redundancy of E. coli are interesting for industrial processes, which often require backup and alternative metabolic routes in cases of enzyme saturation or the ability to change between substrates. The two bacteria also differ strikingly in their doubling times (Table 3), which is often a determining factor in industrial processes. The short doubling time of E. coli has been shown to be related to posttranscriptional control of protein abundances and posttranslational control of flux rates (113).
Studies of Mycobacterium smegmatis concluded that the organization of regulatory operons involved in the regulation of DNA replication and macromolecular synthesis in mycobacteria is very different from that of the majority of other bacteria, which can introduce problems in attempts to control the regulation of these cells (114).
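The SKI comparison discussed above can be reproduced from the values in Table 3. The following is a minimal sketch under one stated assumption: the number of ORFs in Table 3 is used as the gene count in the normalization, which may differ slightly from the gene counts used in the original SKI calculation (112). The function and variable names are hypothetical.

```python
def species_knowledge_index(n_abstracts, n_genes):
    """SKI: Medline abstracts referring to a species, per gene in its genome."""
    return n_abstracts / n_genes


# Values taken from Table 3.
ski = {"E. coli": 47.5, "M. genitalium": 1.53}
orfs = {"E. coli": 4325, "M. genitalium": 482}  # assumed to be the gene counts

# How many times higher the SKI is for E. coli.
ski_ratio = ski["E. coli"] / ski["M. genitalium"]

# How many times more genes E. coli has.
gene_ratio = orfs["E. coli"] / orfs["M. genitalium"]

# Since abstracts = SKI * genes, the ratio of abstract counts is the product.
abstract_ratio = ski_ratio * gene_ratio

print(f"SKI ratio: {ski_ratio:.1f}")
print(f"gene ratio: {gene_ratio:.1f}")
print(f"implied abstract ratio: {abstract_ratio:.0f}")
```

Under this assumption, the SKI ratio is about 31, and the implied ratio of abstract counts falls in the same range as the roughly 276-fold difference quoted in the text.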
TABLE 3.
Parameter | Value for Escherichia coli | Value for Mycoplasma genitalium |
---|---|---|
Characteristics of species | ||
No. of ORFs | 4,325a | 482b |
No. of NCBI COGs | 2,131 | 362 |
No. of NCBI structure direct links | 1,096 | 6 |
DNA content (mg/ml cell vol)c | 13 | 100 |
Doubling time (h)d | 0.35 | 12 |
Species knowledge indexe | 47.5 | 1.53 |
Characteristics of in silico metabolic network reconstruction | ||
Model ID | iJO1366a | iPS189b |
No. of genes | 1,366 | 189 |
Overall accuracy of gene essentiality predictions (%) | 91 | 87 |
No. of reactions | 2,251 | 262 |
Metabolic | 1,473 | 178 |
Transport | 778 | 84 |
No. of unique metabolites | 1,136 | 274 |
No. of gene-associated reactions | 1,310 | 168 |
No. of spontaneous reactions | 25 | 6 |
No. of non-gene-associated reactions | 133 | 88 |
SYSTEMS APPROACHES FOR UNDERSTANDING AND CREATING MINIMAL CELLS
The systems biology approaches relevant to the construction or definition of minimal cells can be divided into four broad categories. The first two approaches are the traditional approaches of any systems science or technology, namely, top-down (analytic [deconstruction of systems]) and bottom-up (synthetic [construction of systems]) approaches, referred to in many reviews of the field (3, 5, 12, 50, 115–119). Both of these classical approaches have comprised mainly physical or experimental studies, in vivo in the case of top-down or in vitro in the case of bottom-up approaches. We introduce here the middle-out approach, which includes large-scale data integration, modeling, and simulations relevant to the study of minimal or simpler cells. Following Denis Noble's definition, the middle-out approach considered here is one that “starts at any level … at which there are sufficient data and reaches (up, down and across) toward other levels and components” (120). The fourth category is system-level comparative studies, the first approaches to be used at a system level toward the construction of minimal cells (17) and probably still the most used approach today for systems biology of minimal cells (93, 121).
Almost a decade ago, Eörs Szathmáry highlighted the importance of bridging the gap not only between bottom-up and top-down approaches but also between experimental and theoretical studies (50). In an attempt to organize the sparse and diverse knowledge obtained from the long pursuit of minimal life, we reviewed the diversity of relevant studies, as summarized in Fig. 1. We consider the classification “experimental” versus “theoretical/computational” to be independent of the four major categories presented above. In the following sections, we also attempt to link each approach with its associated technologies and the disciplines that it has primarily served, such as the top-down approach with molecular biology and the bottom-up approach with biophysics and biochemistry. This view differs from that of other authors, who, for instance, associate the quest for minimal cells with synthetic biology only (122).
Top-Down Approach
Broadly, the top-down approach involves removing nonessential components of the studied system until it is no longer functional, thereby gaining an understanding of each part's individual function within the whole system. Traditionally, this approach has also been referred to as reductionism, and in minimal cell studies, it has involved mainly attempts to define minimal gene sets and minimal genomes (see “Minimal Genome,” above), which were achieved by knocking out genes to determine which ones were nonessential.
Several techniques to perform large-scale knockout studies have been developed, as reviewed elsewhere (25), including antisense RNA to inhibit gene expression, systematic inactivation of individual genes, and massive transposon mutagenesis strategies (the most widely used approach). The recent technological capacity to study synthetic lethality on a genome scale in E. coli, taking advantage of conjugation of deletion or hypomorphic strains to create double mutants (123), promises important data sets for the design of reduced strains. As conjugation occurs in other bacteria, it is expected that it will be applied to other organisms (123). Metabolic modeling has already been performed to predict synthetic lethal genes for E. coli on a genome scale, not only for pairs of genes but also for triplets, some quadruplets, and higher-order lethal combinations (124).
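The notion of synthetic lethality can be made concrete with a toy network. The sketch below is a deliberate simplification: it uses plain reachability of a biomass metabolite rather than the genome-scale flux-based modeling used in the cited work (124), and the gene names, metabolites, and reactions are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy network: gene -> (required substrates, products formed).
REACTIONS = {
    "geneA": ({"glucose"}, {"X"}),
    "geneB": ({"X"}, {"Y"}),        # one route from X to Y
    "geneC": ({"X"}, {"Y"}),        # redundant alternative route
    "geneD": ({"Y"}, {"biomass"}),
}
NUTRIENTS = {"glucose"}


def grows(deleted):
    """Return True if biomass is still reachable after deleting the given genes."""
    available = set(NUTRIENTS)
    changed = True
    while changed:
        changed = False
        for gene, (substrates, products) in REACTIONS.items():
            if gene in deleted:
                continue
            # Fire a reaction once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return "biomass" in available


genes = sorted(REACTIONS)

# Genes whose single deletion abolishes growth.
essential = [g for g in genes if not grows({g})]

# Pairs of individually nonessential genes whose joint deletion is lethal.
synthetic_lethal = [
    pair
    for pair in combinations(genes, 2)
    if not set(pair) & set(essential) and not grows(set(pair))
]

print("essential:", essential)
print("synthetic lethal pairs:", synthetic_lethal)
```

Here geneB and geneC are each dispensable because they back each other up, but deleting both abolishes growth. Extending the pair search to triplets or larger combinations follows the same pattern, at a rapidly growing computational cost.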
Simultaneous deletions of large parts of the chromosome were done mainly for model bacteria that are at the same time industrially relevant (see Chassis Cells, above). Reductions of the genome of E. coli of up to 29.7% (125) were achieved using the red recombination system of phage lambda (126). Another more recent large-scale deletion technique merged Tn5 transposon mutagenesis with the Cre/loxP excision system and phage P1 transduction (127). This method has the advantage of not requiring the construction of genetic vectors or the performance of complex PCR experiments for each deletion, but so far, it has achieved a reduction of only 7% of the genome of E. coli MG1655.
The reduction of genomes occurs naturally in specific habitats, where bacteria adapt drastically to a specific niche, losing several unnecessary genes usually related to the biosynthesis of amino acids and other essential metabolites that they can take up from a stable niche. There has been increasing interest in the natural top-down reduction of the genome of B. aphidicola, as this bacterium keeps the biosynthetic abilities of most amino acids that are provided to the insect host (128). An innovative study analyzed the dynamics of natural genome reduction in Salmonella enterica by an experimental evolution procedure using serial passages (129). Those authors obtained deletions of up to 200 kb (approximately 4% of the WT genome), and impressively, two of the large deletions isolated included several genes that were previously identified as being individually essential for growth (130). These results reinforce the need to perform single-deletion studies under different experimental conditions and, ultimately, to conduct large-scale simultaneous deletions for studies of genome reduction.
Being based on existing natural genomes, top-down approaches can be limiting in drawing universal conclusions about minimalism and simplicity. It has been recognized that as each study starts with a specific organism, it arrives at a specific minimal gene set (131). Finally, it seems that simplifying existing genomes will always lead to a complex cell with complex means of transcribing and translating its genetic code, and there is general discussion about whether this is indeed the simplest living system possible (50).
Table 4 enumerates the most relevant species used in the top-down or analytic approach to obtain or understand minimized cells.
TABLE 4.
Category and species | Genome size | Special feature(s) (reference[s]) |
---|---|---|
Mollicutes | | Usually parasites without a cell wall; these were the first genomes to be analyzed by global transposon mutagenesis (M. genitalium and M. pneumoniae) (26); the same methodology was applied to Mycoplasma pulmonis (214); defined media have been described for both M. genitalium and M. pneumoniae (24); different species have been compared at the system level for the genome (203), proteome (43), complete sets of tRNA isoacceptors (tRNomics) and tRNA/rRNA modification enzymes (modomics) (39), and methylome (40) |
Mycoplasma genitalium G37 | 580 kbp | Second genome to be fully sequenced (13) and still the autonomously replicating culturable species with the smallest genome; the full genome was analyzed early by global transposon mutagenesis for essential genes (26); a later expt concluded that 387 protein-encoding and 43 structural RNA genes were essential (27); genome-scale metabolic reconstruction (72) and an integrative whole-cell computational model (170) are available |
Mycoplasma pneumoniae M129 | 816 kbp | A genome-scale in vivo assay was performed for this bacterium to determine essential genes for mouse infection, which identified 194 genes (215); the proteome (45, 216), transcriptome (48), and metabolic network (24) have been analyzed at the cell level; it seems to have a larger fraction of multifunctional enzymes than other bacteria (24); the transcriptome was shown to be remarkably dynamic and complex (including antisense transcripts, alternative transcripts, and multiple regulators) and more similar to those of eukaryotes than to those of other bacteria (48) |
“Candidatus Phytoplasma mali” AT | 602 kbp | Insect-transmitted plant pathogen that represents an economically important disease of apple (217); one of its most distinctive characteristics is its linear chromosome (218) |
Obligate endosymbionts of insects | | Usually the smallest and most GC-poor genomes reported, with the exception of Hodgkinia (219); genomes indicate functional convergence during evolution (220) |
“Candidatus Tremblaya princeps” PCVAL | 138 kbp | Smallest genome of an endosymbiont; genes for synthesis of nucleotides and cofactors, energy production, transport, and cell wall biogenesis are absent; only part of the replication machinery is preserved (221); ability to synthesize most of the amino acids is still encoded; it is a primary insect endosymbiont with a secondary endosymbiont (221) |
Buchnera aphidicola APS | 656 kbp | Model bacterium for extremely reduced prokaryotic genomes of obligate endosymbionts of insects (71, 128, 162, 222, 223); there are no reports of cultures without insect cells (70, 223) |
“Candidatus Hodgkinia cicadicola” Dsem | 144 kbp | Unprecedented combination of an extremely small genome (144 kb), a GC-biased base composition (58.4%), and a coding reassignment of the UGA codon from stop to tryptophan (219) |
“Candidatus Carsonella ruddii” PV | 160 kbp | Symbiont that appears to be present in all species of phloem sap-feeding insects; more than half of the ORFs are devoted to translation and amino acid metabolism (21) |
“Candidatus Sulcia muelleri” | | The most ancient and widely distributed insect nutritional symbiont; the cells can be very large with an elongated shape, often >30 μm in length (224); it is present in a large group of related insects, which supports the ancient acquisition of the symbiont by a shared ancestor, dating the original infection to at least 260 million yr ago (224); together with other endosymbionts, it forms dual-symbiont systems that allow collective production of the 10 amino acids not synthesized by the host (220) |
DMIN | 244 kbpa | |
GWSS | 246 kbpb | |
Other obligate endosymbionts | ||
“Candidatus Vesicomyosocius okutanii” HA | 1.02 Mbp | Thioautotrophic primary endosymbiont of a deep-sea clam; this is the smallest genome reported for autotrophic bacteria (225); it contains genes for thioautotrophy and for synthesis of almost all amino acids and various cofactors but apparently lacks several transporters of these substances to the host cell and several other genes that are essential in E. coli, mainly the ftsZ genes and related genes for cytokinesis (225) |
Free-living prokaryotes with the smallest genomes | ||
Pelagibacter ubique SAR11 HTCC1062 | 1.31 Mbp | Heterotrophic prokaryote supposed to be the most abundant species on Earth (226); it has the smallest genome encoding the smallest no. of predicted ORFs of all free-living microorganisms (227); in contrast to other genome-reduced prokaryotes, it has complete biosynthetic pathways for all 20 amino acids and all but a few cofactors; no pseudogenes, introns, transposons, extrachromosomal elements, or inteins are known; it has few paralogs and the shortest intergenic spacers observed for any cell (227); noncanonical metabolic rearrangements in defined media have been reported (226); an analysis of the proteome covering 65% of the ORFs confirmed remodeling of expression during adaptation to stationary phase (228) |
Prochlorococcus marinus MED4 | 1.66 Mbp | Smallest genome and cell size of an oxygenic phototroph; it is believed to be the most abundant photosynthetic organism on Earth (227); the two genomes spanning the largest phylogenetic distance in the genus were compared, revealing genomic dynamics and small proportions of regulatory genes (229); the no. of noncoding RNAs relative to the genome size is comparable to that found in other bacteria (230); simplified regulation of nitrogen utilization was reported (231) |
Model bacteria relevant for industry | ||
Escherichia coli K-12 MG1655 | 4.64 Mbp | Model Gram-negative bacterium with the highest species knowledge index for a prokaryote (112); different genome-scale gene essentiality assays concluded that 620 genes (150) and, later, 303 genes (204) were essential; using the lambda red recombination system, genome reductions of up to 15% (103), 22% (102, 106), and 29.7% (125) of the original genome size were reported; another procedure combining Tn5 transposon mutagenesis with the Cre/loxP excision system and phage P1 transduction achieved a smaller but faster reduction of ∼7% (127) |
Bacillus subtilis subsp. subtilis 168 | 4.21 Mbp | Model Gram-positive bacterium; an early estimation of the no. of essential genes based on 79 chromosomal deletions extrapolated that 562 kbp would be sufficient to sustain a minimal cell based on this species (232); a later assay concluded that 271 genes were indispensable for growth (149); 7.7% of the genome was deleted by removing prophages and AT-rich islands using plasmid-based chromosomal integration-excision systems, which resulted in B. subtilis strain Δ6 (233); another project, yielding the MG1M strain, deleted about 25% (991 kbp) of the genome (101); later, strain MBG874 was reported, with a deletion of 874 kb (20%), and showed enhanced protein productivity; this was the first report demonstrating that genome reduction could contribute to the creation of a bacterial cell with application in industry (105) |
Archaea | ||
Nanoarchaeum equitans Kin4-M | 491 kbp | The only known archaeal parasite; it is an obligate symbiont of another archaeon (Ignicoccus sp.); unlike the small genomes of bacteria undergoing reductive evolution, N. equitans has very small regions of noncoding DNA (234); the genome encodes the machinery for information processing and repair but lacks genes for lipid, cofactor, amino acid, or nucleotide biosynthesis |
Comparative Approach
Comparative approaches applied to the minimal cell have been mainly those of comparative genomics, involving whole genomes and inferred proteomes. Usually, conserved genes have a higher probability of being not only essential (and therefore part of a possible minimal genome) but also ancient (possibly part of the LUCA's genome). The best known of these genes is the 16S rRNA, traditionally used for phylogeny. In this manner, comparative studies serve mainly evolutionary biology and the quest for the LUCA's constitution (132).
The early comparison of the genomes of M. genitalium and Haemophilus influenzae described above was the first system-level comparative approach to construction of a minimal genome (17). Although only 240 genes were conserved between both genomes, 22 cases of NODs were identified. Depending on the conceptual or practical cellular construct being pursued, choosing the simplest, most ancient, or most economic protein when facing a NOD will be crucial in the search for a minimal cell. An analysis of possible functional redundancy and the presence of parasite-specific genes in this study resulted in a final set of 256 genes as the hypothetical number of genes capable of sustaining a cell (17).
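The logic of such a pairwise comparison can be illustrated with a minimal sketch. It assumes that each genome is reduced to a mapping from ortholog group to annotated function; all group and function names below are hypothetical placeholders, not data from reference 17. Conserved genes are the shared ortholog groups, and a NOD is flagged when both genomes carry a function but share no ortholog group for it.

```python
# Toy illustration only: ortholog groups and functions are hypothetical placeholders.

def conserved_orthologs(genome_a, genome_b):
    """Return the ortholog groups present in both genomes."""
    return set(genome_a) & set(genome_b)

def functions_to_groups(genome):
    """Index a genome (ortholog group -> function) by function."""
    index = {}
    for group, function in genome.items():
        index.setdefault(function, set()).add(group)
    return index

def find_nods(genome_a, genome_b):
    """Functions found in both genomes but carried out by unrelated ortholog groups."""
    by_function_a = functions_to_groups(genome_a)
    by_function_b = functions_to_groups(genome_b)
    shared_functions = by_function_a.keys() & by_function_b.keys()
    return sorted(f for f in shared_functions
                  if not (by_function_a[f] & by_function_b[f]))

genome_a = {"og1": "DNA replication", "og2": "translation",
            "og3": "glycolysis step", "og5": "adhesion"}
genome_b = {"og1": "DNA replication", "og2": "translation",
            "og4": "glycolysis step", "og6": "cell wall synthesis"}

shared = conserved_orthologs(genome_a, genome_b)
nods = find_nods(genome_a, genome_b)
# Candidate minimal functions: those of conserved orthologs plus those rescued as NODs.
candidate_functions = sorted({genome_a[g] for g in shared} | set(nods))
print(candidate_functions)
```

In this toy case, an intersection of ortholog groups alone would miss the glycolysis step, which is why NODs have to be handled explicitly when deriving a minimal gene set.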
A new wave of comparative studies integrated proteogenomics to validate genetic conservation, using high-throughput tandem mass spectrometry to verify the expression of predicted conserved coding regions (133). Gupta et al. first used this approach to compare the expression of orthologous genes across three Shewanella species (121); shortly afterwards, comparative proteogenomics was applied in the above-described quest for the core proteome of a minimal cell (43) (see “Other Minimal Sets of Components,” above).
The computational comparison of proteins on a large scale can outperform the comparison of genomic sequences. One example is the annotation of curated domain structures performed in a previous phylogenomic study of 420 free-living organisms in an attempt to define the proteomic content of the LUCA (91). Others have compared protein folds across Bacteria and Archaea, which indicated a possible set of the top 30 most conserved folds (134).
When jumping from comparisons of genomes to comparisons of proteomes, transcriptomes, or fluxomes, experimental conditions are an additional but indispensable layer of information. The results in these cases are influenced by the media and conditions provided to the cells, which must be kept constant to allow comparative studies to be performed. The comparison of several omics data sets is highly promising, although it can be a challenging task, as many of the studies available in the literature were not done under the same experimental conditions. Even the same complex media can have small variations that will impair comparisons (135), so ultimately, defined media should be preferred for comparative analysis. This will require the generation of new, controlled experimental data for future comparative studies.
Not only omics-level comparisons (arriving at minimal sets) but also comparisons at the level of the organelle can be relevant for the study of minimal cells. A comparison of the sequences of modern ribosomes identified the most conserved regions from the three domains of life, which were then mapped onto determined structures of 30S and 50S subunits of ribosomes (136).
In silico system-level comparative studies include the use of graph theory-based algorithms to compare biological networks (protein-protein or metabolic) on a global scale, based on topology alone (137).
Arriving at minimal theoretical sets by comparative and top-down approaches is not sufficient to achieve minimal cells. After the 1,000th prokaryotic genome was made available, the striking discovery that not one single protein-encoding gene is conserved across all prokaryotic genomes shocked biologists (138). Moreover, if Archaea are excluded, only two protein-encoding genes, a translation elongation factor and a ribosomal protein, plus the two rRNA genes are conserved across all Bacteria (138). These facts imply that systematic comparative approaches will gain from focusing on functional differences at levels other than the genome level. Ultimately, by recognizing that the comparative and top-down approaches are insufficient to reduce complexity to the level of a full comprehension of the cell, one should build or synthesize that minimal cell from its parts. This is what the bottom-up approach intends to achieve.
Bottom-Up Approach
The bottom-up, or synthetic, approach is aimed at assembling a minimal or simpler cell in the laboratory, i.e., constructing minimal cells from nonliving material (5). Bottom-up studies have concerned mainly physical and chemical properties and the dynamics of the building blocks of life. Focus has been placed on inserting genetic material (RNA or DNA) or enzymes inside lipid vesicles, creating what are often called protocells (see the introduction). Properties such as stability, permeability, and self-reproduction together with the dynamics of eventual biochemical reactions can be studied in these constructs (for a detailed compilation of the work of biophysics in this area, see references 118 and 119). More complex biological properties can also be analyzed in protocells. For example, in a pioneering study, it was shown that Darwinian competition emerges in populations of vesicles with encapsulated genetic material (139). The competition arose simply due to the physical principle of osmosis-driven vesicle growth. Other researchers studied enzymatic RNA replication (140) and movement of vesicles resembling bacterial chemotaxis (141), based on different protocells assembled in those studies.
Solé et al. (9) make a distinction between the major achievements of bottom-up studies, which may lead to the construction of completely artificial cells, and those of reconstruction studies (118), which use components from a biological origin to produce what is here named semiartificial cells.
One innovative bottom-up project involves the idea of creating a minimal cell based on purified proteins. The authors of that study intended to identify the genes necessary for a minimal cell and, after preparation of the purified biochemical molecules, to encapsulate these genes within membranes, possibly rendering an artificial cell (116, 142). Another system of this kind is Cytomin, a cell-free translation system that has revealed promising results for protein synthesis and energy efficiency (143, 144).
Probably the major landmark in bottom-up approaches is the synthesis of the first artificial bacterial chromosome (145). Although a cell was not created per se, this study established the technology for the creation of the code for an entire cell. Nevertheless, although the creation and assembly of fully artificial cells are some of the ultimate goals of bioengineering and would help obtain a deeper understanding of biosystems, they seem part of science fiction, for now.
It might appear that the bottom-up approach is in a privileged position for the study of the LUCA and prebiotic chemistry compared to the top-down approach, as both the creation of artificial cells in the laboratory and the creation of ancestor cells in nature constitute transitions from nonliving to living entities (5). However, the connection between both areas of research should be handled with care (1, 5). While fully tracking the history of life until its origins could in principle allow replication of the process in the laboratory, the opposite cannot be assumed. Any artificial cell to be created in the laboratory based on modern genes, modern proteins, and modern membranes may be far from resembling what the LUCA was. It has been argued that the origin of genetic and enzymatic machineries must have occurred within some inorganic scaffold, with the LUCA not being free-living at first (74), while bottom-up studies commonly use vesicles to build protocells (see LUCA and the First Cells, above). In this manner, classical bottom-up work, regarding the current state of the art, may not be directly associable to the study of the LUCA, as discussed elsewhere (118, 119). Moreover, validating that a protocell would be a good model of a chassis cell would require protocells to be experimentally validated for chassis cell design. Within the state of the art, protocells are still, unfortunately, a meager model of such constructs.
Middle-Out Approach
Kohl and Noble attributed the term middle-out originally to Sydney Brenner (146), who coined it during a discussion at a Novartis Foundation symposium on complexity in biological information processing (147). For the purposes of this review, given that the focus is on prokaryotic systems, Noble's definition (120) was adapted to “the approach which starts at any level (gene, RNA, protein, metabolic or regulatory pathways) at which there are sufficient data and reaches (up, down and across) toward other levels and components.” The middle-out approach is often difficult to distinguish from classical approaches. In this review, we classified middle-out approaches as those studies that integrate different layers of information in a final holistic model or construct, as mentioned in Table 1.
Gil et al. performed large-scale work on the integration of several minimal gene sets and generated probably the most comprehensive and accepted theoretical minimal protein-encoding gene set for prokaryotic life (25) (see “Minimal Genome,” above, for the composition of this minimal gene set). That study integrated the orthologous genes resulting from a comparison of the genomes of five endosymbionts (148) with functional equivalents without sequence similarity. Afterwards, the results were integrated into several data sets: a list of essential B. subtilis genes (149), proposed essential E. coli genes from different sources (150–152), the proposed computationally derived minimal gene set of Mushegian and Koonin (17), the results of global transposon mutagenesis for mycoplasmas (26), a list of essential genes identified in S. aureus (33, 153), and the reduced genome of the plant pathogen “Candidatus Phytoplasma asteris” (154). To identify corresponding orthologous genes and protein functions and reconstruct the metabolic pathways, those authors used a comprehensive variety of online databases and resources (25). The final functional classification of the gene set was done with the categories used in the sequencing work on Aquifex aeolicus, one of the earliest-diverging bacteria known (155), and the resulting minimal metabolic network was analyzed for detection of gaps in essential pathways. The proposed minimal gene set reflects a rational integration that has been described in detail elsewhere (25).
Another example of an integrative approach resulting in an original construct is the whole-cell tomogram of M. pneumoniae, which includes individual heteromultimeric protein complexes represented to scale within one bacterial cell, obtained by using electron tomographies of 26 entire cells (45). A combination of pattern recognition and classification algorithms allowed the positioning of the identified protein complexes in a whole-cell illustration of the spatial organization of the proteome of this reduced bacterium (45) (Fig. 1).
A major achievement that so far represents the climax of integrative experimental projects toward the creation of artificial cells came 2 years after the creation of the first synthetic artificial genome (145). The Venter Institute announced the successful transplantation of an artificial chromosome, Mycoplasma mycoides JCVI-syn1.0, to another recipient cell, a Mycoplasma capricolum cell, creating new cells controlled by the synthetic chromosome (8). This represented a stretching of the boundaries of biotechnology, opening doors to new work using semiartificial bacterial cells.
Models and Simulations of Minimal and Simpler Cells
Because minimal or simpler cells are still conceptual constructs, theoretical representations and mathematical models are crucial for the advancement of the field. Theories (like the one of a hydrothermal origin of life [see LUCA and the First Cells, above]) (75, 76) and models (e.g., physical [experimental protocells] or virtual [in silico simulation models]) are the minimal or simpler cell-related constructs that are closer to being holistically understood among those represented in Table 1, given the complexity of prokaryotic cells.
Theoretical or virtual protocell systems include a vast array of representations of self-replicable systems, some explored mathematically. A pioneering protocell model is the so-called chemoton by Tibor Gánti (156). The chemoton consists of three functionally dependent autocatalytic subsystems: the metabolic network, the template polymerization subsystem, and the membrane subsystem enclosing the previous subsystems. All three subsystems are precisely coupled by stoichiometry, which ensures correct functioning. The chemoton is considered an elegant platform to support different protocell models (157). Physical protocells as minimal cell models and theoretical models of protocells have been reviewed comprehensively elsewhere (9).
On the other hand, the field of modeling of whole cells is still very scattered, and a variety of different modeling approaches have been used so far. In general, whole-cell simulation requires modeling of different biological networks at an appropriate scale. Existing models can be broadly categorized into three classes: interaction models or network representations, constraint-based models (e.g., stoichiometric models), and mechanistic models (e.g., kinetic models), although these models are still far from being holistic (for a review, see reference 158). Among these models, the constraint-based models have played a major role in contemporary attempts at modeling minimal life, mainly because of the simplicity or abstraction that they allow. Genome-scale network reconstructions (GENREs), which have been increasingly used in metabolic modeling, are one example with several practical applications, as discussed elsewhere (108). GENREs require the integration of experimental data in a middle-out manner (108, 159). The minimal requirement for reconstructing a GENRE is the annotated genome sequence of the organism of interest. The resulting basic framework can be further refined and expanded with the incorporation of experimental data at the cell level (mainly transcriptomics and proteomics) and manual curation based on the available literature. These models allow assessment of the biosynthetic capabilities of a species in a systematic manner. Furthermore, these models also enable the simulation of intracellular metabolic fluxes as well as the effects of genetic modifications such as gene knockouts (160, 161). So far, a large number of manually curated prokaryotic GENREs have been reported (108). These models are promising for studies of prokaryotic simplification and even for comparative studies, which will allow the definition of common and different metabolic features. A few studies with GENREs related to minimal or simpler cells have been done. Pál et al. used one E. coli GENRE to analyze reductive evolution from the network of E. coli toward the small networks of B. aphidicola and Wigglesworthia glossinidia, an analysis that achieved a remarkable accuracy of 80% (162). GENREs have also been used to predict gene essentiality in different organisms and theoretical compositions of minimal media (163).
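How a network reconstruction can be interrogated for gene essentiality and medium dependence can be sketched in a few lines. The example below is not a constraint-based (flux balance) model; it assumes a much cruder qualitative rule, namely that a reaction can fire once all of its substrates are available, and it asks whether a biomass placeholder remains producible after a single-gene knockout. The network, gene names, and metabolite names are hypothetical.

```python
# Qualitative reachability sketch, NOT flux balance analysis.
# Reactions, genes, and metabolites are hypothetical placeholders.
REACTIONS = {
    "r1": {"gene": "geneA", "substrates": {"glucose"}, "products": {"pyruvate"}},
    "r2": {"gene": "geneB", "substrates": {"pyruvate"}, "products": {"acetyl_coa"}},
    "r3": {"gene": "geneC", "substrates": {"acetate"}, "products": {"acetyl_coa"}},
    "r4": {"gene": "geneD", "substrates": {"acetyl_coa", "ammonium"},
           "products": {"biomass"}},
}

def producible(medium, reactions, knocked_out=frozenset()):
    """Expand the set of available metabolites until no further reaction can fire."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for reaction in reactions.values():
            if reaction["gene"] in knocked_out:
                continue
            if reaction["substrates"] <= available and not reaction["products"] <= available:
                available |= reaction["products"]
                changed = True
    return available

def essential_genes(medium, reactions, target="biomass"):
    """Genes whose single knockout makes the target unreachable on this medium."""
    genes = {reaction["gene"] for reaction in reactions.values()}
    return sorted(g for g in genes
                  if target not in producible(medium, reactions, {g}))

poor_medium = {"glucose", "ammonium"}
rich_medium = {"glucose", "acetate", "ammonium"}
essential_on_poor = essential_genes(poor_medium, REACTIONS)
essential_on_rich = essential_genes(rich_medium, REACTIONS)
print(essential_on_poor, essential_on_rich)
```

Even this toy case shows that the set of essential genes shrinks when the medium is enriched, so a list of essential genes is meaningful only together with the medium on which it was derived.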
Other work in modeling minimal cells has been done with mechanistic cell-level models, focusing on different features, such as cell geometry and division (164), macromolecular interactions (165), and also metabolism (166), with the latter study aimed at modeling a minimal cell from knowledge of the metabolic kinetics of E. coli (167). Another comprehensive, ongoing, whole-cell simulation project based on M. genitalium and including 127 genes, the E-CELL model, is running in Japan (168). More recently, Shuler et al. developed probably the most comprehensive and abstract minimal cell model to date (169), based on the minimal gene set derived by Gil et al. (25). Those authors added genes for 3 rRNA products, 20 tRNA species, and transport systems for amino acids and inorganic ions that were missing in the source gene set. This minimal cell model has 241 genes in total, represented in a 233-kb chromosome coding for all the functions supposedly required for a chemoheterotrophic bacterium to grow and divide (169). The model formulation consists of a differential algebraic equation system, which includes the DNA replication process as well as cytokinesis and the coupling between cell physiology and cell growth. It is also able to output several parameters, such as partition factors, chromosome replication, and cell division parameters (169).
The recently reported whole-cell model of M. genitalium was an important advance not only for the modeling field but also for the biological study of prokaryotes, by allowing accurate phenotypic predictions (170). This model integrates 28 essential cellular processes that were represented in different submodels; these processes fall into five main categories, DNA, RNA, protein, metabolism, and other (cytokinesis and host interaction), including over 1,900 quantitative parameters. Each of the 28 submodels was simulated with an appropriate mathematical representation; for instance, metabolism was modeled by using a constraint-based approach, while RNA and protein degradation were modeled by using mechanistic Poisson processes (170). This integrative strategy makes the assumption that the submodels are approximately independent on short time scales so that at each time step, the submodels depend on the values of variables determined by the other submodels at the previous time step (170). This formulation of independent and decoupled modules allowed the most complete simulation of M. genitalium so far, not only providing insights into the simulated cellular functions but also directing experimental assays that identified kinetic parameters and details of the biological function of metabolic genes.
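The integration assumption described above can be sketched as a simple time-stepping loop. The three submodels below are deterministic toy functions with made-up rate constants, introduced here only for illustration; the published model uses 28 submodels with their own mathematical representations. The point of the sketch is the update rule: within a time step, every submodel reads the same snapshot of the state from the previous step, and the changes that they return are then merged.

```python
# Hypothetical toy submodels with arbitrary rate constants, for illustration only.

def transcription(state):
    return {"rna": 2.0}                      # constant RNA synthesis per step

def rna_degradation(state):
    return {"rna": -0.5 * state["rna"]}      # half of the RNA pool decays per step

def translation(state):
    return {"protein": 1.0 * state["rna"]}   # protein made in proportion to RNA

SUBMODELS = [transcription, rna_degradation, translation]

def step(state, submodels):
    """Advance one time step; all submodels see the state of the previous step."""
    snapshot = dict(state)
    new_state = dict(state)
    for submodel in submodels:
        for variable, change in submodel(snapshot).items():
            new_state[variable] += change
    return new_state

def simulate(initial_state, submodels, n_steps):
    state = dict(initial_state)
    for _ in range(n_steps):
        state = step(state, submodels)
    return state

final_state = simulate({"rna": 0.0, "protein": 0.0}, SUBMODELS, 3)
print(final_state)
```

Because each submodel reads only the snapshot, the order in which the submodels are called within a step does not affect the result, which is what allows heterogeneous representations to be combined in one simulation.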
TOWARDS THE LOWEST COMPLEXITY
For both fundamental science and the design of better platform cells with applications in industrial biotechnology, some of the major concerns are the complexity of the cells used, rather than the number of components that these cells have, and how precisely these cells can be understood and engineered in a predictive manner. Therefore, at this point, it can be argued that for the study of the minimal cell, the focus should be on minimizing the complexity and not the number of components. Complexity is often related to the number of interactions present in the interactome, i.e., the full set of interactions linking biological molecules in a cell (171). Once the interactome is known and the complexity of the system is understood, this complexity can be reduced by the rational deletion of some elements, such as single genes or even whole metabolic or regulatory modules that are not essential and that represent a considerable increase of the complexity of the system. One example is work by Trinh et al., who, by knocking out only 8 genes, reduced the functional space of the E. coli central metabolic network from 15,000 pathway possibilities to only 6 growth-supporting pathways (172).
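The way in which a few knockouts can collapse the functional space of a network can be illustrated with a crude stand-in for pathway analysis. The sketch assumes that every reaction converts exactly one substrate into one product, so that pathways reduce to simple routes from a source to a sink; real analyses of this kind operate on full stoichiometry. The network and gene names are hypothetical.

```python
# Route counting in a toy unimolecular network (hypothetical reactions and genes).
# Each entry is (substrate, product, gene encoding the reaction).
TOY_REACTIONS = [
    ("substrate", "A", "g1"), ("substrate", "B", "g2"),
    ("A", "C", "g3"), ("A", "D", "g4"),
    ("B", "C", "g5"), ("B", "D", "g6"),
    ("C", "product", "g7"), ("D", "product", "g8"),
]

def count_routes(reactions, source, sink, knocked_out=frozenset()):
    """Count simple routes from source to sink after removing knocked-out genes."""
    edges = {}
    for substrate, product, gene in reactions:
        if gene not in knocked_out:
            edges.setdefault(substrate, []).append(product)

    def walk(node, visited):
        if node == sink:
            return 1
        return sum(walk(nxt, visited | {nxt})
                   for nxt in edges.get(node, []) if nxt not in visited)

    return walk(source, {source})

routes_wild_type = count_routes(TOY_REACTIONS, "substrate", "product")
routes_reduced = count_routes(TOY_REACTIONS, "substrate", "product", {"g2", "g4"})
print(routes_wild_type, routes_reduced)
```

Two knockouts remove only a quarter of the reactions here but three-quarters of the routes, because the number of routes grows combinatorially with the number of branch points.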
Interactomes and Network Biology
Network biology explores the connectivity of molecular elements in biological networks, which can change dramatically for different proteins (173–175). It has been suggested that the complexity of the network of protein-protein interactions in a cell can be reduced to and be represented by a small number of highly connected hubs or protein units of structure and function (174). Network biology also specializes in applying graph theory to biological systems and revealing universal features of cellular networks (176). One of the major discoveries was that biological networks follow a hierarchical organization (173) in a modular manner, a feature that, from a holistic perspective, can facilitate interventions and predictions in the network. Recently, the hierarchical organization of biological networks was highlighted as being vital for the reduction of the complexity of bacterial cells for biotechnological applications but under another nomenclature (177). Those authors emphasize the need to introduce the concept of orthogonalization, a classical notion in engineering and mathematics that represents the ability of subsystems of a higher system to function independently, in biology (177).
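As a minimal illustration of connectivity in an interaction network, the sketch below computes the degree of each node from an edge list and flags highly connected nodes as hubs. The protein names and the degree threshold are arbitrary assumptions made for the example; no particular cutoff is implied by the studies cited above.

```python
from collections import Counter

# Hypothetical protein-protein interaction edge list.
INTERACTIONS = [("p1", "p2"), ("p1", "p3"), ("p1", "p4"),
                ("p1", "p5"), ("p5", "p6"), ("p6", "p7")]

def degrees(edges):
    """Number of interaction partners recorded for each protein."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def hubs(edges, min_degree):
    """Proteins with at least min_degree interactions (threshold is an assumption)."""
    return sorted(node for node, d in degrees(edges).items() if d >= min_degree)

degree_table = degrees(INTERACTIONS)
hub_proteins = hubs(INTERACTIONS, min_degree=3)
print(dict(degree_table), hub_proteins)
```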
An analysis of different prokaryotic networks suggested that more environmental variability is related to more network modularity and therefore more orthogonalization (178). It was demonstrated that E. coli metabolic modules are functionally uniform, with each metabolic class being assignable to one specific structural module, while the reduced network modules of B. aphidicola showed a larger mixture of different functions (178). Another interesting conclusion on biological complexity was that the transition to the largest and more complex metabolic networks was dependent on the presence of oxygen (179).
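One common way to quantify network modularity, which may differ from the measure used in reference 178, is Newman's modularity Q: for each module, the fraction of edges that fall inside it minus the fraction expected from the degrees of its nodes. The sketch below computes Q for an undirected toy network under two alternative assignments of nodes to modules; the network and node labels are hypothetical.

```python
# Newman modularity Q for a given partition of an undirected network.
# Toy network: two triangles (a-b-c and d-e-f) joined by a single edge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

def modularity(edges, module_of):
    """Q = sum over modules of (internal edges / m) - (degree sum / 2m)^2."""
    m = len(edges)
    internal = {}
    degree_sum = {}
    for a, b in edges:
        module_a, module_b = module_of[a], module_of[b]
        degree_sum[module_a] = degree_sum.get(module_a, 0) + 1
        degree_sum[module_b] = degree_sum.get(module_b, 0) + 1
        if module_a == module_b:
            internal[module_a] = internal.get(module_a, 0) + 1
    return sum(internal.get(module, 0) / m - (total / (2 * m)) ** 2
               for module, total in degree_sum.items())

by_triangle = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
mixed = {"a": 1, "c": 1, "e": 1, "b": 2, "d": 2, "f": 2}
q_by_triangle = modularity(EDGES, by_triangle)
q_mixed = modularity(EDGES, mixed)
print(q_by_triangle, q_mixed)
```

The partition that follows the two triangles scores higher than the mixed one, which is the sense in which functionally uniform modules correspond to a more modular, and hence more orthogonal, network.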
Genome Size and Cellular Complexity
The results of high-throughput interactome studies permit a first glance at the relationship between the genome size (in terms of the number of open reading frames [ORFs]) and the number of interactions identified (16) and show that there is no correlation between the two variables (Fig. 2). The total number of interactions is widely dispersed, but when normalized by the number of baits tested in each study, the number of interactions identified was between 2 and 8 times the number of baits, with the exception of Campylobacter jejuni, for which the number of interactions is 18 times the number of baits tested (Fig. 2). This indicates that interactome size might be independent of genome size, although the available data are still insufficient for definitive conclusions.
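The normalization and the correlation test described here amount to simple arithmetic, sketched below. The four records are synthetic placeholder values chosen only to make the example run; they are not the data plotted in Fig. 2. The Pearson coefficient is computed from its definition to keep the example free of external dependencies.

```python
from math import sqrt

# Synthetic placeholder records, NOT the values from the interactome studies.
STUDIES = [
    {"orfs": 700, "baits": 100, "interactions": 300},
    {"orfs": 1500, "baits": 400, "interactions": 7200},
    {"orfs": 1600, "baits": 200, "interactions": 1000},
    {"orfs": 4300, "baits": 1000, "interactions": 2000},
]

def interactions_per_bait(study):
    """Normalize the interactome size by the number of baits tested."""
    return study["interactions"] / study["baits"]

def pearson(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

ratios = [interactions_per_bait(s) for s in STUDIES]
genome_sizes = [s["orfs"] for s in STUDIES]
r_size_vs_ratio = pearson(genome_sizes, ratios)
print(ratios, round(r_size_vs_ratio, 2))
```

With so few data points, a coefficient of this kind is only indicative, which is consistent with the caution expressed above about drawing definitive conclusions.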
A general lack of strong correlations between genome size and several other cellular features, inferred from annotation data (180), corroborates the notion that the genome size (in kb) is a poor indicator of complexity (Fig. 3A). Of recent annotation data, the worst correlation occurs for the number of predicted HGT events, followed closely by the number of pseudogenes and the number of rRNA copies per genome. The absence of a correlation between the genome size and copy number of small-subunit rRNAs was also suggested by other authors (181), as is the case for pseudogenes. It was shown that the vast majority (90%) of prokaryotic genomes contain <18% noncoding DNA, but this value can be up to 50% in parasites that are enriched in pseudogenes (182). Interestingly, eukaryote-like kinases are present in the genomes of M. genitalium and M. pneumoniae (two kinases and one kinase, respectively) but not in the genome of E. coli (183).
The lack of a correlation between genome size and doubling time is another interesting point to consider (Fig. 3), from the points of view of both evolutionary fitness and industrial application. Indeed, codon usage bias is a much better indicator of the growth rate (184, 185) than genome size. Another interesting feature, the CRISPR (clustered regularly interspaced short palindromic repeat) defense mechanism, has been indicated as a complex feature of prokaryotes, in which neither the number of loci nor the size of the sequences correlates with genome size (186).
The best correlations with genome size occur for metabolism-related features such as the number of predicted enzymes and the transporters assigned by the transporter classification system (Fig. 3). This correlation is weaker when only manually curated GENREs are considered (Fig. 3B; see also Data Set S1 in the supplemental material). Manually curated GENREs are available for a significantly smaller number of species than for those with sequenced genomes; however, the former include a rigorous process of validation and a supervised procedure for gap filling of the network. Overall, it seems plausible that genome size reflects fairly well the metabolic capability of an organism. Metabolic networks are among the most studied and manipulated of all prokaryotic features (108, 159, 187), and it has been suggested that the complexity of metabolism lies mostly in the regulation imposed on the metabolic network (188), which can occur on a large scale with the intervention of a single ubiquitous transcription factor (189), making it difficult to infer biological complexity based on metabolic network size alone.
The complexity of transcriptional regulatory networks, e.g., through transcription factor-gene interactions, can be seen as another metric of overall cellular complexity. Although the number of transcription factors seems to increase with increasing genome size, the number of regulatory sites per intergenic region is independent of it (190). On the other hand, the M. pneumoniae genome, despite having only 0.81 Mb, contains frequent antisense transcripts, alternative transcripts, and multiple regulators per gene, which make the regulation and transcriptome of this bacterium highly dynamic and somewhat similar to those of eukaryotes (48). M. genitalium lacks two-component regulatory systems with histidine kinase sensors and response regulator domains, which are widespread in E. coli and H. influenzae (13). This absence led to the anticipation that its regulatory circuits would be less responsive to environmental signals (191) and therefore less controllable in industrial scenarios.
The minimal nutritional requirements of a species summarize its biosynthetic capabilities and hence can be used as a metric of its metabolic complexity. Based on nutritional information for 15 species (see Data Set S2 in the supplemental material), there seems to be a nonlinear relationship between the number of medium components and genome size, with an apparent stabilization in a minimal medium with between 7 and 8 components after the 3-Mb mark (for heterotrophic growth) (Fig. 4). The underlying negative correlation is in accordance with the expectation that the nutritional requirements of smaller genomes should be greater, reflecting evolutionary adaptations that have implications for the design of chassis cells.
The number of genome copies per cell is another feature that defies genome size as an appropriate measure of complexity. Surprisingly, until recently, insect obligate endosymbionts held the record for the largest numbers of copies of genomes per cell, with the average ranging from 20 to several hundred genome copies in Buchnera cells and from 200 to 900 copies in “Candidatus Sulcia” cells (192). Moreover, it was shown that the number of copies of genomes of intracellular symbionts varies in response to the developmental stage of the host, increasing during postembryonic development of insects into adults and decreasing during aging (193). It is reasonable to think that endosymbiosis transforms these prokaryotes into cell factories that are more active in providing the host the “agreed nutrients” by increasing the genome copy number, which can be exploited for more profitable biotechnological applications with minimal cells.
Subcellular Architecture
Highly organized subcellular architecture is increasingly becoming an object of attention and brings a whole new perspective to the biology of prokaryotes (194, 195), which until recently have been regarded as simple membrane-bound cells with a uniform cytoplasm and one circular genome. It has been shown that even enzymes thought to have only specific chemical roles can have well-defined structural roles in a prokaryotic cytoplasm. The CTP synthase of Caulobacter crescentus forms filaments that help define the characteristic curvature of these bacteria, and these filaments are formed in E. coli as well (196). M. pneumoniae also displays highly ordered structural features (45, 48), including a complex terminal structure that directs human respiratory tract colonization and is considered an organelle per se, with the function of promoting attachment (197). Although this bacterium is among the simplest prokaryotes, with an extremely reduced genome and without a cell wall, its subcellular architecture shows that smaller genomes can translate into complex cellular structures.
CONCLUSIONS AND FUTURE PERSPECTIVES
The genome, as the first ome made accessible by technological advances, has so far received most of the attention in the field of minimal or simpler cells. Efforts toward the construction of minimal genomes include mainly the large-scale identification of nonessential genes, relatively few experimental genome reductions, and an outstanding example of the construction of a bacterial cell harboring a synthetic genome. In another line of research, comparative approaches identified core, conserved gene sets that were at first thought to constitute the minimal genome. With the sequencing of more and more genomes, at present, this core is practically reduced to zero, as no protein-encoding gene is universal across the prokaryotic domain (138). This outstanding discovery has reshaped the way in which the field of minimal cells is viewed from a systems biology perspective. The genome is no longer seen as the static core identity of the cell but is seen more as a backbone or a database of tools pertaining to a complex and dynamic system. Technologies complementary to genomics are thus entering the main stage, such as transcriptomics, proteomics, metabolomics, as well as computational tools for simulating the dynamic behavior of the cell. The minimal cell can be seen at present as a broad concept that does not apply to one genome composition only. It seems that a panoply of different small genomes may exist, being regulated differently, expressed in different proteomes, and strongly dependent on the available media and environment.
In parallel to omics-oriented research, the study of the last universal common ancestor has been integrated within the geochemical context of early Earth, which is crucial to the reconstitution and understanding of the genetic and metabolic capabilities of this minimal cell. Furthermore, the design of chassis cells is becoming more and more targeted on specific needs, such as product and culture conditions, expanding on the previous notion that a general minimal cell with a reduced genome would fit industrial needs. Overall, it has become clear that both fundamental and applied goals for research on minimal cells can be achieved only through a system-level analysis encompassing bottom-up, top-down, and middle-out approaches.
The need for taking a holistic approach to the design of minimal cells is underlined by the need to complement experimental approaches with mathematical modeling. Mathematical models can aid in the interpretation and integration of large omics data sets, hypothesis generation, uncovering general principles underlying the operation of complex cellular machinery, and, eventually, designing the network modules for minimal cells. One of the foremost tasks will be to devise metrics for assessing the minimality and simplicity of a biological system, features which may not necessarily go hand in hand. Although minimality can be defined in a relatively straightforward manner, e.g., in terms of genome size, to date there are no explicit metrics of complexity available. Several recent studies providing insight into the cellular interactome (16, 174, 175) indicate that the topological and functional features of these networks may be used for devising suitable complexity metrics.
A cell factory viewpoint of minimal and simpler cells can provide useful insights into the relationship between simplicity and complexity. A cell factory to be used in biotechnological applications will be required to strike a balance between various contrasting features (Fig. 5A). For example, while minimality implies a smaller genome size, it undesirably increases the requirements for nutritional supply. Similarly, minimal complexity and optimal local control may require a certain degree of orthogonalization between the functions of different components or functional modules, while some cross talk between these components will be essential to achieve globally optimal control and a high metabolic efficiency. Indeed, cellular metabolic networks feature both orthogonalization (e.g., distinct biochemical pathways) and cross talk (e.g., through the use of universal redox and energy cofactors). Furthermore, metabolic efficiency and rates often counter each other (198), prompting another balance for the system as a whole. These different tradeoff considerations clearly suggest that minimal cells used for an industrial purpose will have to be tailored to a particular need, with the complexity of the desired phenotype and the economy of the overall process dictating the balancing point. It will be interesting to extend these engineering viewpoints to evolutionary considerations for the LUCA. For example, the theoretical/experimental LUCA models could be refined so as to strike a balance between the number of components and the level of complexity that would likely represent optimal fitness under the postulated environmental conditions.
Research from diverse fields, ranging from fundamental biology to LUCA to chassis cells, is providing a clearer picture of the workflow that will most likely lead to the reconstruction of simple and minimal cells for basic research as well as for industrial applications. This implies an iterative process building upon top-down studies generating omics data sets; bottom-up, mechanistic studies generating biochemical and biophysical data; and middle-out integrative modeling allowing some degree of abstraction together with important predictions (Fig. 5B). Ultimately, all approaches toward the construction of minimal or simpler cells are systems biology approaches, as the goal is to achieve a whole system (the whole minimized or simplified cell), even though these approaches have much to gain from nonsystematic studies. Examples include studies of a specific protein or regulatory module for cell division of a minimal cell (199, 200); phylogenetic studies and even reconstruction of ancient enzymes, tracing their chemistry back to the context of ancient life (201); and the study of a specific pathway that could later be optimized in a chassis cell (202). The merger between such nonsystematic studies, systematic approaches, and synthetic DNA technology is expected to lead to exciting achievements toward the construction of minimal cells. This combination will be key for answering the long-sought questions of the origin and nature of life and for improving our ability to rationally design minimal or simpler cells.
ACKNOWLEDGMENT
J.C.X. is sponsored by grant SFRH/BD/81626/2011 from the Fundação para a Ciência e Tecnologia, Portugal.
Footnotes
Supplemental material for this article may be found at http://dx.doi.org/10.1128/MMBR.00050-13.
REFERENCES
- 1. Szostak JW, Bartel DP, Luisi PL. 2001. Synthesizing life. Nature 409:387–390. doi:10.1038/35053176
- 2. Fehér T, Papp B, Pal C, Pósfai G. 2007. Systematic genome reductions: theoretical and experimental approaches. Chem. Rev. 107:3498–3513. doi:10.1021/cr0683111
- 3. Henry C, Overbeek R, Stevens RL. 2010. Building the blueprint of life. Biotechnol. J. 5:695–704. doi:10.1002/biot.201000076
- 4. Umenhoffer K, Fehér T, Balikó G, Ayaydin F, Pósfai J, Blattner FR, Pósfai G. 2010. Reduced evolvability of Escherichia coli MDS42, an IS-less cellular chassis for molecular and synthetic biology applications. Microb. Cell Fact. 9:38. doi:10.1186/1475-2859-9-38
- 5. Rasmussen S, Chen L, Deamer D, Krakauer DC, Packard NH, Stadler PF, Bedau MA. 2004. Evolution. Transitions from nonliving to living matter. Science 303:963–965. doi:10.1126/science.1093669
- 6. Doolittle WF. 1999. Phylogenetic classification and the universal tree. Science 284:2124–2129. doi:10.1126/science.284.5423.2124
- 7. Theobald DL. 2010. A formal test of the theory of universal common ancestry. Nature 465:219–222. doi:10.1038/nature09014
- 8. Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang R-Y, Algire MA, Benders GA, Montague MG, Ma L, Moodie MM, Merryman C, Vashee S, Krishnakumar R, Assad-Garcia N, Andrews-Pfannkoch C, Denisova EA, Young L, Qi Z-Q, Segall-Shapiro TH, Calvey CH, Parmar PP, Hutchison CA, Smith HO, Venter JC. 2010. Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329:52–56. doi:10.1126/science.1190719
- 9. Solé RV, Munteanu A, Rodriguez-Caso C, Macía J. 2007. Synthetic protocell biology: from reproduction to computation. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362:1727–1739. doi:10.1098/rstb.2007.2065
- 10. Huang X, Li M, Green DC, Williams DS, Patil AJ, Mann S. 2013. Interfacial assembly of protein-polymer nano-conjugates into stimulus-responsive biomimetic protocells. Nat. Commun. 4:2239. doi:10.1038/ncomms3239
- 11. Vickers CE, Blank LM, Krömer JO. 2010. Chassis cells for industrial biochemical production. Nat. Chem. Biol. 6:875–877. doi:10.1038/nchembio.484
- 12. Foley PL, Shuler ML. 2010. Considerations for the design and construction of a synthetic platform cell for biotechnological applications. Biotechnol. Bioeng. 105:26–36. doi:10.1002/bit.22575
- 13. Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, Fleischmann RD, Bult CJ, Kerlavage AR, Sutton G, Kelley JM, Fritchman JL, Weidman JF, Small KV, Sandusky M, Fuhrmann J, Nguyen D, Utterback TR, Saudek DM, Phillips CA, Merrick JM, Tomb J-F, Dougherty BA, Bott KF, Hu P-C, Lucier TS, Peterson SN, Smith HO, Hutchison CA, III, Venter JC. 1995. The minimal gene complement of Mycoplasma genitalium. Science 270:397–404. doi:10.1126/science.270.5235.397
- 14. Westerhoff HV, Winder C, Messiha H, Simeonidis E, Adamczyk M, Verma M, Bruggeman FJ, Dunn W. 2009. Systems biology: the elements and principles of life. FEBS Lett. 583:3882–3890. doi:10.1016/j.febslet.2009.11.018
- 15. Bonchev D. 2004. Complexity analysis of yeast proteome network. Chem. Biodivers. 1:312–326. doi:10.1002/cbdv.200490028
- 16. Bouveret E, Brun C. 2012. Bacterial interactomes: from interactions to networks. Methods Mol. Biol. 804:15–33. doi:10.1007/978-1-61779-361-5_2
- 17. Mushegian A, Koonin EV. 1996. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. U. S. A. 93:10268–10273. doi:10.1073/pnas.93.19.10268
- 18. Van der Werf MJ, Overkamp KM, Muilwijk B, Coulier L, Hankemeier T. 2007. Microbial metabolomics: toward a platform with full metabolome coverage. Anal. Biochem. 370:17–25. doi:10.1016/j.ab.2007.07.022
- 19. Koonin EV. 2000. How many genes can make a cell: the minimal-gene-set concept. Annu. Rev. Genomics Hum. Genet. 1:99–116. doi:10.1146/annurev.genom.1.1.99
- 20. McCutcheon JP, McDonald BR, Moran NA. 2009. Convergent evolution of metabolic roles in bacterial co-symbionts of insects. Proc. Natl. Acad. Sci. U. S. A. 106:15394–15399. doi:10.1073/pnas.0906424106
- 21. Nakabachi A, Yamashita A, Toh H, Ishikawa H, Dunbar HE, Moran NA, Hattori M. 2006. The 160-kilobase genome of the bacterial endosymbiont Carsonella. Science 314:267. doi:10.1126/science.1134196
- 22. Tamames J, Gil R, Latorre A, Peretó J, Silva FJ, Moya A. 2007. The frontier between cell and organelle: genome analysis of Candidatus Carsonella ruddii. BMC Evol. Biol. 7:181. doi:10.1186/1471-2148-7-181
- 23. Mushegian A. 1999. The minimal genome concept. Curr. Opin. Genet. Dev. 9:709–714. doi:10.1016/S0959-437X(99)00023-4
- 24. Yus E, Maier T, Michalodimitrakis K, van Noort V, Yamada T, Chen W-H, Wodke JA, Güell HM, Martínez S, Bourgeois R, Kühner S, Raineri E, Letunic I, Kalinina OV, Rode M, Herrmann R, Gutiérrez-Gallego R, Russell RB, Gavin A-C, Bork P, Serrano L. 2009. Impact of genome reduction on bacterial metabolism and its regulation. Science 326:1263–1268. doi:10.1126/science.1177263
- 25. Gil R, Silva FJ, Peretó J, Moya A. 2004. Determination of the core of a minimal bacterial gene set. Microbiol. Mol. Biol. Rev. 68:518–537. doi:10.1128/MMBR.68.3.518-537.2004
- 26. Hutchison CA, Peterson SN, Gill SR, Cline RT, White O, Fraser CM, Smith HO, Venter JC. 1999. Global transposon mutagenesis and a minimal Mycoplasma genome. Science 286:2165–2169. doi:10.1126/science.286.5447.2165
- 27. Glass JI, Assad-Garcia N, Alperovich N, Yooseph S, Lewis MR, Maruf M, Hutchison CA, Smith HO, Venter JC. 2006. Essential genes of a minimal bacterium. Proc. Natl. Acad. Sci. U. S. A. 103:425–430. doi:10.1073/pnas.0510013103
- 28. De Berardinis V, Vallenet D, Castelli V, Besnard M, Pinet A, Cruaud C, Samair S, Lechaplais C, Gyapay G, Richez C, Durot M, Kreimeyer A, Le Fèvre F, Schächter V, Pezo V, Döring V, Scarpelli C, Médigue C, Cohen GN, Marlière P, Salanoubat M, Weissenbach J. 2008. A complete collection of single-gene deletion mutants of Acinetobacter baylyi ADP1. Mol. Syst. Biol. 4:174. doi:10.1038/msb.2008.10
- 31. Salama NR, Shepherd B, Falkow S. 2004. Global transposon mutagenesis and essential gene analysis of Helicobacter pylori. J. Bacteriol. 186:7926–7935. doi:10.1128/JB.186.23.7926-7935.2004
- 32. Langridge GC, Phan M-D, Turner DJ, Perkins TT, Parts L, Haase J, Charles I, Maskell DJ, Peters SE, Dougan G, Wain J, Parkhill J, Turner AK. 2009. Simultaneous assay of every Salmonella Typhi gene using one million transposon mutants. Genome Res. 19:2308–2316. doi:10.1101/gr.097097.109
- 33. Forsyth RA, Haselbeck RJ, Ohlsen KL, Yamamoto RT, Xu H, Trawick JD, Wall D, Wang L, Brown-Driver V, Froelich JM, Kedar GC, King P, McCarthy M, Malone C, Misiner B, Robbins D, Tan Z, Zhu Z, Carr G, Mosca DA, Zamudio C, Foulkes JG, Zyskind JW. 2002. A genome-wide strategy for the identification of essential genes in Staphylococcus aureus. Mol. Microbiol. 43:1387–1400. doi:10.1046/j.1365-2958.2002.02832.x
- 34. Chaudhuri RR, Allen AG, Owen PJ, Shalom G, Stone K, Harrison M, Burgis TA, Lockyer M, Garcia-Lara J, Foster SJ, Pleasance SJ, Peters SE, Maskell DJ, Charles IG. 2009. Comprehensive identification of essential Staphylococcus aureus genes using transposon-mediated differential hybridisation (TMDH). BMC Genomics 10:291. doi:10.1186/1471-2164-10-291
- 35. Mendum TA, Newcombe J, Mannan AA, Kierzek AA, McFadden J. 2011. Interrogation of global mutagenesis data with a genome scale model of Neisseria meningitidis to assess gene fitness in vitro and in sera. Genome Biol. 12:R127. doi:10.1186/gb-2011-12-12-r127
- 36. Cameron DE, Urbach JM, Mekalanos JJ. 2008. A defined transposon mutant library and its use in identifying motility genes in Vibrio cholerae. Proc. Natl. Acad. Sci. U. S. A. 105:8736–8741. doi:10.1073/pnas.0803281105
- 37. Zhang R, Lin Y. 2009. DEG 5.0, a database of essential genes in both prokaryotes and eukaryotes. Nucleic Acids Res. 37:D455–D458. doi:10.1093/nar/gkn858
- 38. Chen W-H, Minguez P, Lercher MJ, Bork P. 2012. OGEE: an online gene essentiality database. Nucleic Acids Res. 40:D901–D906. doi:10.1093/nar/gkr986
- 39. De Crécy-Lagard V, Marck C, Brochier-Armanet C, Grosjean H. 2007. Comparative RNomics and modomics in Mollicutes: prediction of gene function and evolutionary implications. IUBMB Life 59:634–658. doi:10.1080/15216540701604632
- 40. Lluch-Senar M, Luong K, Lloréns-Rico V, Delgado J, Fang G, Spittle K, Clark TA, Schadt E, Turner SW, Korlach J, Serrano L. 2013. Comprehensive methylome characterization of Mycoplasma genitalium and Mycoplasma pneumoniae at single-base resolution. PLoS Genet. 9:e1003191. doi:10.1371/journal.pgen.1003191
- 41. Zinser ER, Lindell D, Johnson ZI, Futschik ME, Steglich C, Coleman ML, Wright MA, Rector T, Steen R, McNulty N, Thompson LR, Chisholm SW. 2009. Choreography of the transcriptome, photophysiology, and cell cycle of a minimal photoautotroph, Prochlorococcus. PLoS One 4:e5135. doi:10.1371/journal.pone.0005135
- 42. Callister SJ, McCue LA, Turse JE, Monroe ME, Auberry KJ, Smith RD, Adkins JN, Lipton MS. 2008. Comparative bacterial proteomics: analysis of the core genome concept. PLoS One 3:e1542. doi:10.1371/journal.pone.0001542
- 43. Fisunov GY, Alexeev DG, Bazaleev NA, Ladygina VG, Galyamina MA, Kondratov IG, Zhukova NA, Serebryakova MV, Demina IA, Govorun VM. 2011. Core proteome of the minimal cell: comparative proteomics of three mollicute species. PLoS One 6:e21964. doi:10.1371/journal.pone.0021964
- 44. Jaffe JD, Stange-Thomann N, Smith C, DeCaprio D, Fisher S, Butler J, Calvo S, Elkins T, FitzGerald MG, Hafez N, Kodira CD, Major J, Wang S, Wilkinson J, Nicol R, Nusbaum C, Birren B, Berg HC, Church GM. 2004. The complete genome and proteome of Mycoplasma mobile. Genome Res. 14:1447–1461. doi:10.1101/gr.2674004
- 45. Kühner S, van Noort V, Betts MJ, Leo-Macias A, Batisse C, Rode M, Yamada T, Maier T, Bader S, Beltran-Alvarez P, Castaño-Diez D, Chen W-H, Devos D, Güell M, Norambuena T, Racke I, Rybin V, Schmidt A, Yus E, Aebersold R, Herrmann R, Böttcher B, Frangakis AS, Russell RB, Serrano L, Bork P, Gavin A-C. 2009. Proteome organization in a genome-reduced bacterium. Science 326:1235–1240. doi:10.1126/science.1176343
- 46. Gabaldón T, Peretó J, Montero F, Gil R, Latorre A, Moya A. 2007. Structural analyses of a hypothetical minimal metabolism. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362:1751–1762. doi:10.1098/rstb.2007.2067
- 47. Srinivasan V, Morowitz HJ. 2009. The canonical network of autotrophic intermediary metabolism: minimal metabolome of a reductive chemoautotroph. Biol. Bull. 216:126–130.
- 48. Güell M, van Noort V, Yus E, Chen W-H, Leigh-Bell J, Michalodimitrakis K, Yamada T, Arumugam M, Doerks T, Kühner S, Rode M, Suyama M, Schmidt S, Gavin A-C, Bork P, Serrano L. 2009. Transcriptome complexity in a genome-reduced bacterium. Science 326:1268–1271. doi:10.1126/science.1176951
- 49. Chandramouli K, Qian P-Y. 2009. Proteomics: challenges, techniques and possibilities to overcome biological sample complexity. Hum. Genomics Proteomics 2009:239204. doi:10.4061/2009/239204
- 50. Szathmáry E. 2005. Life: in search of the simplest cell. Nature 433:469–470. doi:10.1038/433469a
- 51. Macleod RA, Onofrey E, Norris ME. 1954. Nutrition and metabolism of marine bacteria. I. Survey of nutritional requirements. J. Bacteriol. 68:680–686.
- 52. Wong PTS, Thompson J, MacLeod RA. 1969. Nutrition and metabolism of marine bacteria. XVII. Ion-dependent retention of alpha-aminoisobutyric acid and its relation to Na+ dependent transport in a marine pseudomonad. J. Biol. Chem. 244:1016–1025.
- 53. Bryant MP, Robinson IM. 1962. Some nutritional characteristics of predominant culturable ruminal bacteria. J. Bacteriol. 84:605–614.
- 54. Roepke RR, Libby RL, Small MH. 1944. Mutation or variation of Escherichia coli with respect to growth requirements. J. Bacteriol. 48:401–412.
- 55. Borek E, Waelsch H. 1951. The effect of temperature on the nutritional requirement of microorganisms. J. Biol. Chem. 190:191–196.
- 56. Ware GC. 1951. Nutritional requirements of Bacterium coli at 44 degrees. J. Gen. Microbiol. 5:880–884. doi:10.1099/00221287-5-5-880
- 57. Campbell LL, Williams OB. 1953. The effect of temperature on the nutritional requirements of facultative and obligate thermophilic bacteria. J. Bacteriol. 65:141–145.
- 58. Clements LD, Miller BS, Streips UN. 2002. Comparative growth analysis of the facultative anaerobes Bacillus subtilis, Bacillus licheniformis, and Escherichia coli. Syst. Appl. Microbiol. 25:284–286. doi:10.1078/0723-2020-00108
- 59. Joyce AR, Reed JL, White A, Edwards R, Osterman A, Baba T, Mori H, Lesely SA, Palsson BØ, Agarwalla S. 2006. Experimental and computational assessment of conditionally essential genes in Escherichia coli. J. Bacteriol. 188:8259–8271. doi:10.1128/JB.00740-06
- 60. Rensing C, Fan B, Sharma R, Mitra B, Rosen BP. 2000. CopA: an Escherichia coli Cu(I)-translocating P-type ATPase. Proc. Natl. Acad. Sci. U. S. A. 97:652–656. doi:10.1073/pnas.97.2.652
- 61. Bleriot C, Effantin G, Lagarde F, Mandrand-Berthelot M-A, Rodrigue A. 2011. RcnB is a periplasmic protein essential for maintaining intracellular Ni and Co concentrations in Escherichia coli. J. Bacteriol. 193:3785–3793. doi:10.1128/JB.05032-11
- 62. McLuskey K, Harrison JA, Schuttelkopf AW, Boxer DH, Hunter WN. 2003. Insight into the role of Escherichia coli MobB in molybdenum cofactor biosynthesis based on the high resolution crystal structure. J. Biol. Chem. 278:23706–23713. doi:10.1074/jbc.M301485200
- 63. Semsey S, Andersson AMC, Krishna S, Jensen MH, Massé E, Sneppen K. 2006. Genetic regulation of fluxes: iron homeostasis of Escherichia coli. Nucleic Acids Res. 34:4960–4967. doi:10.1093/nar/gkl627
- 64. Jakubovics NS, Jenkinson HF. 2001. Out of the iron age: new insights into the critical role of manganese homeostasis in bacteria. Microbiology 147:1709–1718.
- 65. Lee LJ, Barrett JA, Poole RK. 2005. Genome-wide transcriptional response of chemostat-cultured Escherichia coli to zinc. J. Bacteriol. 187:1124–1134. doi:10.1128/JB.187.3.1124-1134.2005
- 66. Kampen W. 1997. Nutritional requirements in fermentation processes, p 122–160. In Vogel HC, Todaro CC (ed), Fermentation and biochemical engineering handbook: principles, process design, and equipment, 2nd ed. Noyes Publications, Saddle River, NJ.
- 67. Vartoukian SR, Palmer RM, Wade WG. 2010. Strategies for culture of “unculturable” bacteria. FEMS Microbiol. Lett. 309:1–7. doi:10.1111/j.1574-6968.2010.02000.x
- 68. Murray RG, Stackebrandt E. 1995. Taxonomic note: implementation of the provisional status Candidatus for incompletely described procaryotes. Int. J. Syst. Bacteriol. 45:186–187. doi:10.1099/00207713-45-1-186
- 69. Gosalbes MJ, Lamelas A, Moya A, Latorre A. 2008. The striking case of tryptophan provision in the cedar aphid Cinara cedri. J. Bacteriol. 190:6026–6029. doi:10.1128/JB.00525-08
- 70. Douglas AE, Bouvaine S, Russell RR. 2011. How the insect immune system interacts with an obligate symbiotic bacterium. Proc. Biol. Sci. 278:333–338. doi:10.1098/rspb.2010.1563
- 71. Gil R, Sabater-Muñoz B, Latorre A, Silva FJ, Moya A. 2002. Extreme genome reduction in Buchnera spp.: toward the minimal genome needed for symbiotic life. Proc. Natl. Acad. Sci. U. S. A. 99:4454–4458. doi:10.1073/pnas.062067299
- 72. Suthers PF, Dasika MS, Kumar VS, Denisov G, Glass JI, Maranas CD. 2009. A genome-scale metabolic reconstruction of Mycoplasma genitalium, iPS189. PLoS Comput. Biol. 5:e1000285. doi:10.1371/journal.pcbi.1000285
- 73. Darwin C. 1859. On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. John Murray, London, United Kingdom.
- 74. Koonin EV, Martin W. 2005. On the origin of genomes and cells within inorganic compartments. Trends Genet. 21:647–654. doi:10.1016/j.tig.2005.09.006
- 75. Russell MJ, Hall AJ. 1997. The emergence of life from iron monosulphide bubbles at a submarine hydrothermal redox and pH front. J. Geol. Soc. London 154:377–402. doi:10.1144/gsjgs.154.3.0377
- 76. Russell MJ, Hall AJ, Mellersh AR. 2003. On the dissipation of thermal and chemical energies on the early Earth: the onsets of hydrothermal convection, chemiosmosis, genetically regulated metabolism and oxygenic photosynthesis, p 325–388. In Ikan R (ed), Natural and laboratory-simulated thermal geochemical processes. Kluwer Academic Publishers, Dordrecht, Netherlands.
- 77. Martin W, Russell MJ. 2003. On the origins of cells: a hypothesis for the evolutionary transitions from abiotic geochemistry to chemoautotrophic prokaryotes, and from prokaryotes to nucleated cells. Philos. Trans. R. Soc. Lond. B Biol. Sci. 358:59–83. doi:10.1098/rstb.2002.1183
- 78. Martin W, Baross J, Kelley D, Russell MJ. 2008. Hydrothermal vents and the origin of life. Nat. Rev. Microbiol. 6:805–814. doi:10.1038/nrmicro1991
- 79. Nisbet EG, Sleep NH. 2001. The habitat and nature of early life. Nature 409:1083–1091. doi:10.1038/35059210
- 80. Penny D, Poole A. 1999. The nature of the last universal common ancestor. Curr. Opin. Genet. Dev. 9:672–677. doi:10.1016/S0959-437X(99)00020-9
- 81. Becerra A, Islas S, Leguina JI, Silva E, Lazcano A. 1997. Polyphyletic gene losses can bias backtrack characterizations of the cenancestor. J. Mol. Evol. 45:115–117. doi:10.1007/PL00006209
- 82. Poole AM, Logan DT. 2005. Modern mRNA proofreading and repair: clues that the last universal common ancestor possessed an RNA genome? Mol. Biol. Evol. 22:1444–1455. doi:10.1093/molbev/msi132
- 83. Leipe DD, Aravind L, Koonin EV. 1999. Did DNA replication evolve twice independently? Nucleic Acids Res. 27:3389–3401. doi:10.1093/nar/27.17.3389
- 84. Yonezawa T, Hasegawa M. 2010. Was the universal common ancestry proved? Nature 468:E9; discussion E10. doi:10.1038/nature09482
- 85. Theobald DL. 2010. Theobald reply. Nature 468:E10. doi:10.1038/nature09483
- 86. Goldman AD, Bernhard TM, Dolzhenko E, Landweber LF. 2013. LUCApedia: a database for the study of ancient life. Nucleic Acids Res. 41:D1079–D1082. doi:10.1093/nar/gks1217
- 87. Kyrpides N, Overbeek R, Ouzounis C. 1999. Universal protein families and the functional content of the last universal common ancestor. J. Mol. Evol. 49:413–423. doi:10.1007/PL00006564
- 88. Mat W-K, Xue H, Wong JT-F. 2008. The genomics of LUCA. Front. Biosci. 13:5605–5613. doi:10.2741/3103
- 89. Yang S, Doolittle RF, Bourne PE. 2005. Phylogeny determined by protein domain content. Proc. Natl. Acad. Sci. U. S. A. 102:373–378. doi:10.1073/pnas.0408810102
- 90. Wang M, Yafremava LS, Caetano-Anollés D, Mittenthal JE, Caetano-Anollés G. 2007. Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world. Genome Res. 17:1572–1585. doi:10.1101/gr.6454307
- 91. Kim KM, Caetano-Anollés G. 2011. The proteomic complexity and rise of the primordial ancestor of diversified life. BMC Evol. Biol. 11:140. doi:10.1186/1471-2148-11-140
- 92. Mannige RV, Brooks CL, Shakhnovich EI. 2012. A universal trend among proteomes indicates an oily last common ancestor. PLoS Comput. Biol. 8:e1002839. doi:10.1371/journal.pcbi.1002839
- 93. Koonin EV. 2003. Comparative genomics, minimal gene-sets and the last universal common ancestor. Nat. Rev. Microbiol. 1:127–136. doi:10.1038/nrmicro751
- 94. Harris JK, Kelley ST, Spiegelman GB, Pace NR. 2003. The genetic core of the universal ancestor. Genome Res. 13:407–412. doi:10.1101/gr.652803
- 95. Mirkin BG, Fenner TI, Galperin MY, Koonin EV. 2003. Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evol. Biol. 3:2. doi:10.1186/1471-2148-3-2
- 96. Pál C, Papp B, Lercher MJ. 2005. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat. Genet. 37:1372–1375. doi:10.1038/ng1686
- 97. Morange M. 2011. Some considerations on the nature of LUCA, and the nature of life. Res. Microbiol. 162:5–9. doi:10.1016/j.resmic.2010.10.001
- 98. Lehman J, Stanley KO. 2013. Evolvability is inevitable: increasing evolvability without the pressure to adapt. PLoS One 8:e62186. doi:10.1371/journal.pone.0062186
- 99. Johannes TW, Zhao H. 2006. Directed evolution of enzymes and biosynthetic pathways. Curr. Opin. Microbiol. 9:261–267. doi:10.1016/j.mib.2006.03.003
- 100. Lee JW, Na D, Park JM, Lee J, Choi S, Lee SY. 2012. Systems metabolic engineering of microorganisms for natural and non-natural chemicals. Nat. Chem. Biol. 8:536–546. doi:10.1038/nchembio.970
- 101. Ara K, Ozaki K, Nakamura K, Yamane K, Sekiguchi J, Ogasawara N. 2007. Bacillus minimum genome factory: effective utilization of microbial genome information. Biotechnol. Appl. Biochem. 46:169–178. doi:10.1042/BA20060111
- 102. Mizoguchi H, Mori H, Fujio T. 2007. Escherichia coli minimum genome factory. Biotechnol. Appl. Biochem. 46:157–167. doi:10.1042/BA20060107
- 103. Pósfai G, Plunkett G, Fehér T, Frisch D, Keil GM, Umenhoffer K, Kolisnychenko V, Stahl B, Sharma SS, de Arruda M, Burland V, Harcum SW, Blattner FR. 2006. Emergent properties of reduced-genome Escherichia coli. Science 312:1044–1046. doi:10.1126/science.1126439
- 104. Giga-Hama Y, Tohda H, Takegawa K, Kumagai H. 2007. Schizosaccharomyces pombe minimum genome factory. Biotechnol. Appl. Biochem. 46:147–155. doi:10.1042/BA20060106
- 105. Morimoto T, Kadoya R, Endo K, Tohata M, Sawada K, Liu S, Ozawa T, Kodama T, Kakeshita H, Kageyama Y, Manabe K, Kanaya S, Ara K, Ozaki K, Ogasawara N. 2008. Enhanced recombinant protein productivity by genome reduction in Bacillus subtilis. DNA Res. 15:73–81. doi:10.1093/dnares/dsn002
- 106. Mizoguchi H, Sawano Y, Kato J, Mori H. 2008. Superpositioning of deletions promotes growth of Escherichia coli with a reduced genome. DNA Res. 15:277–284. doi:10.1093/dnares/dsn019
- 107. Lee JH, Sung BH, Kim MS, Blattner FR, Yoon BH, Kim JH, Kim SC. 2009. Metabolic engineering of a reduced-genome strain of Escherichia coli for L-threonine production. Microb. Cell Fact. 8:2. doi:10.1186/1475-2859-8-2
- 108. Oberhardt MA, Palsson BØ, Papin JA. 2009. Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5:320. doi:10.1038/msb.2009.77
- 109. Taymaz-Nikerel H, Borujeni AE, Verheijen PJT, Heijnen JJ, van Gulik WM. 2010. Genome-derived minimal metabolic models for Escherichia coli MG1655 with estimated in vivo respiratory ATP stoichiometry. Biotechnol. Bioeng. 107:369–381. doi:10.1002/bit.22802
- 110. Reed JL, Vo TD, Schilling CH, Palsson BØ. 2003. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 4:R54. doi:10.1186/gb-2003-4-9-r54
- 111. Andrianantoandro E, Basu S, Karig DK, Weiss R. 2006. Synthetic biology: new engineering rules for an emerging discipline. Mol. Syst. Biol. 2:2006.0028. doi:10.1038/msb4100073
- 112. Janssen P, Goldovsky L, Kunin V, Darzentas N, Ouzounis CA. 2005. Genome coverage, literally speaking. The challenge of annotating 200 genomes with 4 million publications. EMBO Rep. 6:397–399. doi:10.1038/sj.embor.7400412
- 113. Valgepea K, Adamberg K, Seiman A, Vilu R. 2013. Escherichia coli achieves faster growth by increasing catalytic and translation rates of proteins. Mol. Biosyst. 9:2344–2358. doi:10.1039/c3mb70119k
- 114. Klann AG, Belanger AE, Abanes-De Mello A, Lee JY, Hatfull GF. 1998. Characterization of the dnaG locus in Mycobacterium smegmatis reveals linkage of DNA replication and cell division. J. Bacteriol. 180:65–72.
- 115. Moya A, Gil R, Latorre A, Peretó J, Pilar Garcillán-Barcia M, de la Cruz F. 2009. Toward minimal bacterial cells: evolution vs. design. FEMS Microbiol. Rev. 33:225–235. doi:10.1111/j.1574-6976.2008.00151.x
- 116. Jewett MC, Forster AC. 2010. Update on designing and building minimal cells. Curr. Opin. Biotechnol. 21:697–703. doi:10.1016/j.copbio.2010.06.008
- 117. Luisi PL. 2002. Toward the engineering of minimal living cells. Anat. Rec. 268:208–214. doi:10.1002/ar.10155
- 118. Luisi PL, Ferri F, Stano P. 2006. Approaches to semi-synthetic minimal cells: a review. Naturwissenschaften 93:1–13. doi:10.1007/s00114-005-0056-z
- 119. Stano P. 2011. Minimal cells: relevance and interplay of physical and biochemical factors. Biotechnol. J. 6:850–859. doi:10.1002/biot.201100079
- 120. Noble D. 2002. The rise of computational biology. Nat. Rev. Mol. Cell Biol. 3:459–463. doi:10.1038/nrm810
- 121. Gupta N, Benhamida J, Bhargava V, Goodman D, Kain E, Kerman I, Nguyen N, Ollikainen N, Rodriguez J, Wang J, Lipton MS, Romine M, Bafna V, Smith RD, Pevzner PA. 2008. Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes. Genome Res. 18:1133–1142. doi:10.1101/gr.074344.107
- 122. O'Malley MA, Powell A, Davies JF, Calvert J. 2008. Knowledge-making distinctions in synthetic biology. Bioessays 30:57–65. doi:10.1002/bies.20664
- 123. Butland G, Babu M, Díaz-Mejía JJ, Bohdana F, Phanse S, Gold B, Yang W, Li J, Gagarinova AG, Pogoutse O, Mori H, Wanner BL, Lo H, Wasniewski J, Christopolous C, Ali M, Venn P, Safavi-Naini A, Sourour N, Caron S, Choi J-Y, Laigle L, Nazarians-Armavil A, Deshpande A, Joe S, Datsenko KA, Yamamoto N, Andrews BJ, Boone C, Ding H, Sheikh B, Moreno-Hagelseib G, Greenblatt JF, Emili A. 2008. eSGA: E. coli synthetic genetic array analysis. Nat. Methods 5:789–795. doi:10.1038/nmeth.1239
- 124. Suthers PF, Zomorrodi A, Maranas CD. 2009. Genome-scale gene/reaction essentiality and synthetic lethality analysis. Mol. Syst. Biol. 5:301. doi:10.1038/msb.2009.56
- 125. Hashimoto M, Ichimura T, Mizoguchi H, Tanaka K, Fujimitsu K, Keyamura K, Ote T, Yamakawa T, Yamazaki Y, Mori H, Katayama T, Kato J. 2005. Cell size and nucleoid organization of engineered Escherichia coli cells with a reduced genome. Mol. Microbiol. 55:137–149. doi:10.1111/j.1365-2958.2004.04386.x
- 126. Murphy KC. 1998. Use of bacteriophage lambda recombination functions to promote gene replacement in Escherichia coli. J. Bacteriol. 180:2063–2071.
- 127. Yu BJ, Sung BH, Koob MD, Lee CH, Lee JH, Lee WS, Kim MS, Kim SC. 2002. Minimization of the Escherichia coli genome using a Tn5-targeted Cre/loxP excision system. Nat. Biotechnol. 20:1018–1023. doi:10.1038/nbt740
- 128. Van Ham RCHJ, Kamerbeek J, Palacios C, Rausell C, Abascal F, Bastolla U, Fernández JM, Jiménez L, Postigo M, Silva FJ, Tamames J, Viguera E, Latorre A, Valencia A, Morán F, Moya A. 2003. Reductive genome evolution in Buchnera aphidicola. Proc. Natl. Acad. Sci. U. S. A. 100:581–586. doi:10.1073/pnas.0235981100
- 129. Nilsson AI, Koskiniemi S, Eriksson S, Kugelberg E, Hinton JCD, Andersson DI. 2005. Bacterial genome size reduction by experimental evolution. Proc. Natl. Acad. Sci. U. S. A. 102:12112–12116. doi:10.1073/pnas.0503654102
- 130. Knuth K, Niesalla H, Hueck CJ, Fuchs TM. 2004. Large-scale identification of essential Salmonella genes by trapping lethal insertions. Mol. Microbiol. 51:1729–1744. doi:10.1046/j.1365-2958.2003.03944.x
- 131. Huynen M. 2000. Constructing a minimal genome. Trends Genet. 16:116. doi:10.1016/S0168-9525(99)01972-1
- 132.Delaye L, Becerra A, Lazcano A. 2005. The last common ancestor: what's in a name? Orig. Life Evol. Biosph. 35:537–554. 10.1007/s11084-005-5760-3 [DOI] [PubMed] [Google Scholar]
- 133.Ansong C, Purvine SO, Adkins JN, Lipton MS, Smith RD. 2008. Proteogenomics: needs and roles to be filled by proteomics in genome annotation. Brief. Funct. Genomic. Proteomic. 7:50–62. 10.1093/bfgp/eln010 [DOI] [PubMed] [Google Scholar]
- 134.Wolf YI, Brenner SE, Bash PA, Koonin EV. 1999. Distribution of protein folds in the three superkingdoms of life. Genome Res. 9:17–26 [PubMed] [Google Scholar]
- 135.Pavankumar AR, Ayyappasamy SP, Sankaran K. 2012. Small RNA fragments in complex culture media cause alterations in protein profiles of three species of bacteria. Biotechniques 52:167–172 [DOI] [PubMed] [Google Scholar]
- 136.Mears JA, Cannone JJ, Stagg SM, Gutell RR, Agrawal RK, Harvey SC. 2002. Modeling a minimal ribosome based on comparative sequence analysis. J. Mol. Biol. 321:215–234. 10.1016/S0022-2836(02)00568-5 [DOI] [PubMed] [Google Scholar]
- 137.Kuchaiev O, Milenkovic T, Memisevic V, Hayes W, Przulj N. 2010. Topological network alignment uncovers biological function and phylogeny. J. R. Soc. Interface 7:1341–1354. 10.1098/rsif.2010.0063 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Lagesen K, Ussery DW, Wassenaar TM. 2010. Genome update: the 1000th genome—a cautionary tale. Microbiology 156:603–608. 10.1099/mic.0.038257-0 [DOI] [PubMed] [Google Scholar]
- 139.Chen IA, Roberts RW, Szostak JW. 2004. The emergence of competition between model protocells. Science 305:1474–1476. 10.1126/science.1100757 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Oberholzer T, Wick R, Luisi PL, Biebricher CK. 1995. Enzymatic RNA replication in self-reproducing vesicles: an approach to a minimal cell. Biochem. Biophys. Res. Commun. 207:250–257. 10.1006/bbrc.1995.1180 [DOI] [PubMed] [Google Scholar]
- 141.Hanczyc MM, Toyota T, Ikegami T, Packard N, Sugawara T. 2007. Fatty acid chemistry at the oil-water interface: self-propelled oil droplets. J. Am. Chem. Soc. 129:9386–9391. 10.1021/ja0706955 [DOI] [PubMed] [Google Scholar]
- 142.Forster AC, Church GM. 2006. Towards synthesis of a minimal cell. Mol. Syst. Biol. 2:45. 10.1038/msb4100090 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Jewett MC, Swartz JR. 2004. Mimicking the Escherichia coli cytoplasmic environment activates long-lived and efficient cell-free protein synthesis. Biotechnol. Bioeng. 86:19–26. 10.1002/bit.20026 [DOI] [PubMed] [Google Scholar]
- 144.Jewett MC, Calhoun KA, Voloshin A, Wuu JJ, Swartz JR. 2008. An integrated cell-free metabolic platform for protein production and synthetic biology. Mol. Syst. Biol. 4:220. 10.1038/msb.2008.57 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Gibson DG, Benders GA, Andrews-Pfannkoch C, Denisova EA, Baden-Tillson H, Zaveri J, Stockwell TB, Brownley A, Thomas DW, Algire MA, Merryman C, Young L, Noskov VN, Glass JI, Venter JC, Hutchison CA, III, Smith HO. 2008. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319:1215–1220. 10.1126/science.1151721 [DOI] [PubMed] [Google Scholar]
- 146.Kohl P, Noble D. 2009. Systems biology and the virtual physiological human. Mol. Syst. Biol. 5:292. 10.1038/msb.2009.51 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Brenner S, Noble D, Sejnowski T, Fields R, Laughlin S, Berridge M, Segel L, Prank K, Dolmetsch R. 2001. Understanding complex systems: top-down, bottom-up or middle-out?, p 150–159 In Bock GR, Goode JA. (ed), Novartis Foundation Symposium: complexity in biological information processing. John Wiley & Sons, Ltd, Chichester, United Kingdom [Google Scholar]
- 148.Gil R, Silva FJ, Zientz E, Delmotte F, González-Candelas F, Latorre A, Rausell C, Kamerbeek J, Gadau J, Hölldobler B, van Ham RCHJ, Gross R, Moya A. 2003. The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes. Proc. Natl. Acad. Sci. U. S. A. 100:9388–9393. 10.1073/pnas.1533499100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, Boland F, Brignell SC, Bron S, Bunai K, Chapuis J, Christiansen LC, Danchin A, Débarbouille M, Dervyn E, Deuerling E, Devine K, Devine SK, Dreesen O, Errington J, Fillinger S, Foster SJ, Fujita Y, Galizzi A, Gardan R, Eschevins C, Fukushima T, Haga K, Harwood CR, Hecker M, Hosoya D, Hullo MF, Kakeshita H, Karamata D, Kasahara Y, Kawamura F, Koga K, Koski P, Kuwana R, Imamura D, Ishimaru M, Ishikawa S, Ishio I, Le Coq D, Masson A, Mauël C, et al. 2003. Essential Bacillus subtilis genes. Proc. Natl. Acad. Sci. U. S. A. 100:4678–4683. 10.1073/pnas.0730515100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 150.Gerdes SY, Scholle MD, Campbell JW, Balazsi G, Ravasz E, Daugherty MD, Somera AL, Kyrpides NC, Anderson I, Gelfand MS, Bhattacharya A, Kapatral V, D'Souza M, Baev MV, Grechkin Y, Mseeh F, Fonstein MY, Overbeek R, Barabasi A-L, Oltvai ZN, Osterman AL, Balázsi G, Souza D, Barabási A, Bala G. 2003. Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. J. Bacteriol. 185:5673–5684. 10.1128/JB.185.19.5673-5684.2003 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 151.Kato J, Hashimoto M. 2007. Construction of consecutive deletions of the Escherichia coli chromosome. Mol. Syst. Biol. 3:132. 10.1038/msb4100174 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 152.Kang Y, Durfee T, Glasner JD, Qiu Y, Frisch D, Winterberg KM, Blattner FR. 2004. Systematic mutagenesis of the Escherichia coli genome. J. Bacteriol. 186:4921–4930. 10.1128/JB.186.15.4921-4930.2004 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 153.Ji Y, Zhang B, Van Horn SF, Warren P, Woodnutt G, Burnham MK, Rosenberg M. 2001. Identification of critical staphylococcal genes using conditional phenotypes generated by antisense RNA. Science 293:2266–2269. 10.1126/science.1063566 [DOI] [PubMed] [Google Scholar]
- 154.Oshima K, Kakizawa S, Nishigawa H, Jung H-Y, Wei W, Suzuki S, Arashida R, Nakata D, Miyata S, Ugaki M, Namba S. 2004. Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat. Genet. 36:27–29. 10.1038/ng1277 [DOI] [PubMed] [Google Scholar]
- 155.Deckert G, Warren PV, Gaasterland T, Young WG, Lenox AL, Graham DE, Overbeek R, Snead MA, Keller M, Aujay M, Huber R, Feldman RA, Short JM, Olsen GJ, Swanson RV. 1998. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature 392:353–358. 10.1038/32831 [DOI] [PubMed] [Google Scholar]
- 156.Gánti T. 1975. Organization of chemical reactions into dividing and metabolizing units: the chemotons. Biosystems 7:15–21. 10.1016/0303-2647(75)90038-6 [DOI] [PubMed] [Google Scholar]
- 157.Szathmáry E, Griesemer J. 2008. Ganti's chemoton model and life criteria, p 407–432 In Rasmussen S, Bedau MA, Chen L, Deamer D, Krakauer DC, Packard NH, Stadler PF. (ed), Protocells: bridging nonliving and living matter. MIT Press, Cambridge, MA [Google Scholar]
- 158.Stelling J. 2004. Mathematical models in microbial systems biology. Curr. Opin. Microbiol. 7:513–518. 10.1016/j.mib.2004.08.004 [DOI] [PubMed] [Google Scholar]
- 159.Durot M, Bourguignon P-Y, Schachter V. 2009. Genome-scale models of bacterial metabolism: reconstruction and applications. FEMS Microbiol. Rev. 33:164–190. 10.1111/j.1574-6976.2008.00146.x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 160.Orth JD, Thiele I, Palsson BØ. 2010. What is flux balance analysis? Nat. Biotechnol. 28:245–248. 10.1038/nbt.1614 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161.Price ND, Reed JL, Palsson BØ. 2004. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol. 2:886–897. 10.1038/nrmicro1023 [DOI] [PubMed] [Google Scholar]
- 162.Pál C, Papp B, Lercher MJ, Csermely P, Oliver SG, Hurst LD. 2006. Chance and necessity in the evolution of minimal metabolic networks. Nature 440:667–670. 10.1038/nature04568 [DOI] [PubMed] [Google Scholar]
- 163.Gianchandani EP, Chavali AK, Papin JA. 2010. The application of flux balance analysis in systems biology. Wiley Interdiscip. Rev. Syst. Biol. Med. 2:372–382. 10.1002/wsbm.60 [DOI] [PubMed] [Google Scholar]
- 164.Surovtsev IV, Zhang Z, Lindahl PA, Morgan JJ. 2009. Mathematical modeling of a minimal protocell with coordinated growth and division. J. Theor. Biol. 260:422–429. 10.1016/j.jtbi.2009.06.001 [DOI] [PubMed] [Google Scholar]
- 165.Flamm C, Endler L, Müller S, Widder S, Schuster P. 2007. A minimal and self-consistent in silico cell model based on macromolecular interactions. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362:1831–1839. 10.1098/rstb.2007.2075 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 166.Castellanos M, Wilson DB, Shuler ML. 2004. A modular minimal cell model: purine and pyrimidine transport and metabolism. Proc. Natl. Acad. Sci. U. S. A. 101:6681–6686. 10.1073/pnas.0400962101 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 167.Browning ST, Shuler ML. 2001. Towards the development of a minimal cell model by generalization of a model of Escherichia coli: use of dimensionless rate parameters. Biotechnol. Bioeng. 76:187–192. 10.1002/bit.10007 [DOI] [PubMed] [Google Scholar]
- 168.Tomita M. 2001. Whole-cell simulation: a grand challenge of the 21st century. Trends Biotechnol. 19:205–210. 10.1016/S0167-7799(01)01636-5 [DOI] [PubMed] [Google Scholar]
- 169.Shuler ML, Foley P, Atlas J. 2012. Modeling a minimal cell. Methods Mol. Biol. 881:573–610. 10.1007/978-1-61779-827-6_20 [DOI] [PubMed] [Google Scholar]
- 170.Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival B, Assad-Garcia N, Glass JI, Covert MW. 2012. A whole-cell computational model predicts phenotype from genotype. Cell 150:389–401. 10.1016/j.cell.2012.05.044 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 171.Kiemer L, Cesareni G. 2007. Comparative interactomics: comparing apples and pears? Trends Biotechnol. 25:448–454. 10.1016/j.tibtech.2007.08.002 [DOI] [PubMed] [Google Scholar]
- 172.Trinh CT, Unrean P, Srienc F. 2008. Minimal Escherichia coli cell for the most efficient production of ethanol from hexoses and pentoses. Appl. Environ. Microbiol. 74:3634–3643. 10.1128/AEM.02708-07 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 173.Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL. 2002. Hierarchical organization of modularity in metabolic networks. Science 297:1551–1555. 10.1126/science.1073374 [DOI] [PubMed] [Google Scholar]
- 174.Rives AW, Galitski T. 2003. Modular organization of cellular networks. Proc. Natl. Acad. Sci. U. S. A. 100:1128–1133. 10.1073/pnas.0237338100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 175.Bolser D, Dafas P, Harrington R, Park J, Schroeder M. 2003. Visualisation and graph-theoretic analysis of a large-scale protein structural interactome. BMC Bioinformatics 4:45. 10.1186/1471-2105-4-45 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 176.Barabási A-L, Oltvai ZN. 2004. Network biology: understanding the cell's functional organization. Nat. Rev. Genet. 5:101–113. 10.1038/nrg1272 [DOI] [PubMed] [Google Scholar]
- 177.Mampel J, Buescher JM, Meurer G, Eck J. 2013. Coping with complexity in metabolic engineering. Trends Biotechnol. 31:52–60. 10.1016/j.tibtech.2012.10.010 [DOI] [PubMed] [Google Scholar]
- 178.Parter M, Kashtan N, Alon U. 2007. Environmental variability and modularity of bacterial metabolic networks. BMC Evol. Biol. 7:169. 10.1186/1471-2148-7-169 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 179.Raymond J, Segre D. 2006. The effect of oxygen on biochemical networks and the evolution of complex life. Science 311:1764–1767. 10.1126/science.1118439 [DOI] [PubMed] [Google Scholar]
- 180.Markowitz VM, Chen I-MA, Palaniappan K, Chu K, Szeto E, Grechkin Y, Ratner A, Jacob B, Huang J, Williams P, Huntemann M, Anderson I, Mavromatis K, Ivanova NN, Kyrpides NC. 2012. IMG: the Integrated Microbial Genomes database and comparative analysis system. Nucleic Acids Res. 40:D115–D122. 10.1093/nar/gkr1044 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 181.Fogel GB, Collins CR, Li J, Brunk CF. 1999. Prokaryotic genome size and SSU rDNA copy number: estimation of microbial relative abundance from a mixed population. Microb. Ecol. 38:93–113. 10.1007/s002489900162 [DOI] [PubMed] [Google Scholar]
- 182.Rogozin IB, Makarova KS, Natale DA, Spiridonov AN, Tatusov RL, Wolf YI, Yin J, Koonin EV. 2002. Congruent evolution of different classes of non-coding DNA in prokaryotic genomes. Nucleic Acids Res. 30:4264–4271. 10.1093/nar/gkf549 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 183.Pérez J, Castañeda-García A, Jenke-Kodama H, Müller R, Muñoz-Dorado J. 2008. Eukaryotic-like protein kinases in the prokaryotes and the myxobacterial kinome. Proc. Natl. Acad. Sci. U. S. A. 105:15950–15955. 10.1073/pnas.0806851105 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 184.Vieira-Silva S, Rocha EPC. 2010. The systemic imprint of growth and its uses in ecological (meta)genomics. PLoS Genet. 6:e1000808. 10.1371/journal.pgen.1000808 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 185.Vieira-Silva S, Touchon M, Rocha EPC. 2010. No evidence for elemental-based streamlining of prokaryotic genomes. Trends Ecol. Evol. 25:319–320. 10.1016/j.tree.2010.03.001 [DOI] [PubMed] [Google Scholar]
- 186.Sorek R, Lawrence CM, Wiedenheft B. 2013. CRISPR-mediated adaptive immune systems in Bacteria and Archaea. Annu. Rev. Biochem. 82:237–266. 10.1146/annurev-biochem-072911-172315 [DOI] [PubMed] [Google Scholar]
- 187.Orth JD, Conrad TM, Na J, Lerman JA, Nam H, Feist AM, Palsson BØ. 2011. A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Mol. Syst. Biol. 7:535. 10.1038/msb.2011.65 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 188.Gerosa L, Sauer U. 2011. Regulation and control of metabolic fluxes in microbes. Curr. Opin. Biotechnol. 22:566–575. 10.1016/j.copbio.2011.04.016 [DOI] [PubMed] [Google Scholar]
- 189.Brand MD, Curtis RK. 2002. Simplifying metabolic complexity. Biochem. Soc. Trans. 30:25–30. 10.1042/bst0300025 [DOI] [PubMed] [Google Scholar]
- 190.Molina N, van Nimwegen E. 2008. Universal patterns of purifying selection at noncoding positions in bacteria. Genome Res. 18:148–160. 10.1101/gr.6759507 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 191.Koonin EV, Mushegian A, Rudd KE. 1996. Sequencing and analysis of bacterial genomes. Curr. Biol. 6:404–416. 10.1016/S0960-9822(02)00508-0 [DOI] [PubMed] [Google Scholar]
- 192.Woyke T, Tighe D, Mavromatis K, Clum A, Copeland A, Schackwitz W, Lapidus A, Wu D, McCutcheon JP, McDonald BR, Moran NA, Bristow J, Cheng J-F. 2010. One bacterial cell, one complete genome. PLoS One 5:e10314. 10.1371/journal.pone.0010314 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 193.Komaki K, Ishikawa H. 2000. Genomic copy number of intracellular bacterial symbionts of aphids varies in response to developmental stage and morph of their host. Insect Biochem. Mol. Biol. 30:253–258. 10.1016/S0965-1748(99)00125-3 [DOI] [PubMed] [Google Scholar]
- 194.Gitai Z. 2005. The new bacterial cell biology: moving parts and subcellular architecture. Cell 120:577–586. 10.1016/j.cell.2005.02.026 [DOI] [PubMed] [Google Scholar]
- 195.Minton AP, Rivas G. 2011. Biochemical reactions in the crowded and confined physiological environment: physical chemistry meets synthetic biology, p 73–89 In Luisi PL, Stano P. (ed), The minimal cell, 1st ed. Springer, Dordrecht, Netherlands [Google Scholar]
- 196.Ingerson-Mahar M, Briegel A, Werner JN, Jensen GJ, Gitai Z. 2010. The metabolic enzyme CTP synthase forms cytoskeletal filaments. Nat. Cell Biol. 12:739–746. 10.1038/ncb2087 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 197.Popham PL, Hahn T-W, Krebes KA, Krause DC. 1997. Loss of HMW1 and HMW3 in noncytadhering mutants of Mycoplasma pneumoniae occurs post-translationally. Proc. Natl. Acad. Sci. U. S. A. 94:13979–13984. 10.1073/pnas.94.25.13979 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 198.Bachmann H, Fischlechner M, Rabbers I, Barfa N, Branco Dos Santos F, Molenaar D, Teusink B. 2013. Availability of public goods shapes the evolution of competing metabolic strategies. Proc. Natl. Acad. Sci. U. S. A. 110:14302–14307. 10.1073/pnas.1308523110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 199.Lluch-Senar M, Querol E, Piñol J. 2010. Cell division in a minimal bacterium in the absence of ftsZ. Mol. Microbiol. 78:278–289. 10.1111/j.1365-2958.2010.07306.x [DOI] [PubMed] [Google Scholar]
- 200.Jonas K, Chen YE, Laub MT. 2011. Modularity of the bacterial cell cycle enables independent spatial and temporal control of DNA replication. Curr. Biol. 21:1092–1101. 10.1016/j.cub.2011.05.040 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 201.Perez-Jimenez R, Inglés-Prieto A, Zhao Z-M, Sanchez-Romero I, Alegre-Cebollada J, Kosuri P, Garcia-Manyes S, Kappock TJ, Tanokura M, Holmgren A, Sanchez-Ruiz JM, Gaucher EA, Fernandez JM. 2011. Single-molecule paleoenzymology probes the chemistry of resurrected enzymes. Nat. Struct. Mol. Biol. 18:592–596. 10.1038/nsmb.2020 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 202.Zhang Y-HP, Evans BR, Mielenz JR, Hopkins RC, Adams MWW. 2007. High-yield hydrogen production from starch and water by a synthetic enzymatic pathway. PLoS One 2:e456. 10.1371/journal.pone.0000456 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 203.Himmelreich R, Plagens H, Hilbert H, Reiner B, Herrmann R. 1997. Comparative analysis of the genomes of the bacteria Mycoplasma pneumoniae and Mycoplasma genitalium. Nucleic Acids Res. 25:701–712. 10.1093/nar/25.4.701 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 204.Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2:2006.0008. 10.1038/msb4100050 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 205.Dewall MT, Cheng DW. 2011. The minimal genome: a metabolic and environmental comparison. Brief. Funct. Genomics 10:312–315. 10.1093/bfgp/elr030 [DOI] [PubMed] [Google Scholar]
- 206.Lazcano A, Miller SL. 1996. The origin and early evolution of life: prebiotic chemistry, the pre-RNA world, and time. Cell 85:793–798. 10.1016/S0092-8674(00)81263-5 [DOI] [PubMed] [Google Scholar]
- 207.Chen IA. 2006. The emergence of cells during the origin of life. Science 314:1558–1559. 10.1126/science.1137541 [DOI] [PubMed] [Google Scholar]
- 208.Zimmer C. 2009. On the origin of life on Earth. Science 323:198–199. 10.1126/science.323.5911.198 [DOI] [PubMed] [Google Scholar]
- 209.Pohorille A, Deamer D. 2002. Artificial cells: prospects for biotechnology. Trends Biotechnol. 20:123–128. 10.1016/S0167-7799(02)01909-1 [DOI] [PubMed] [Google Scholar]
- 210.Murtas G. 2009. Artificial assembly of a minimal cell. Mol. Biosyst. 5:1292–1297. 10.1039/b906541e [DOI] [PubMed] [Google Scholar]
- 211.Porcar M, Danchin A, de Lorenzo V, Dos Santos VA, Krasnogor N, Rasmussen S, Moya A. 2011. The ten grand challenges of synthetic life. Syst. Synth. Biol. 5:1–9. 10.1007/s11693-011-9084-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 212.Solé RV. 2009. Evolution and self-assembly of protocells. Int. J. Biochem. Cell Biol. 41:274–284. 10.1016/j.biocel.2008.10.004 [DOI] [PubMed] [Google Scholar]
- 213.Szathmáry E, Santos M, Fernando C. 2005. Evolutionary potential and requirements for minimal protocells. Top. Curr. Chem. 259:167–211. 10.1007/tcc001 [DOI] [Google Scholar]
- 214.French CT, Lao P, Loraine AE, Matthews BT, Yu H, Dybvig K. 2008. Large-scale transposon mutagenesis of Mycoplasma pulmonis. Mol. Microbiol. 69:67–76. 10.1111/j.1365-2958.2008.06262.x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 215.Sassetti CM, Rubin EJ. 2003. Genetic requirements for mycobacterial survival during infection. Proc. Natl. Acad. Sci. U. S. A. 100:12989–12994. 10.1073/pnas.2134250100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 216.Catrein I, Herrmann R. 2011. The proteome of Mycoplasma pneumoniae, a supposedly “simple” cell. Proteomics 11:3614–3632. 10.1002/pmic.201100076 [DOI] [PubMed] [Google Scholar]
- 217.Baric S. 2012. Quantitative real-time PCR analysis of “Candidatus Phytoplasma mali” without external standard curves. Erwerbs Obstbau 54:147–153. 10.1007/s10341-012-0166-7 [DOI] [Google Scholar]
- 218.Kube M, Schneider B, Kuhl H, Dandekar T, Heitmann K, Migdoll AM, Reinhardt R, Seemüller E. 2008. The linear chromosome of the plant-pathogenic mycoplasma “Candidatus Phytoplasma mali.” BMC Genomics 9:306. 10.1186/1471-2164-9-306 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 219.McCutcheon JP, McDonald BR, Moran NA. 2009. Origin of an alternative genetic code in the extremely small and GC-rich genome of a bacterial symbiont. PLoS Genet. 5:e1000565. 10.1371/journal.pgen.1000565 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 220.McCutcheon JP, Moran NA. 2010. Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution. Genome Biol. Evol. 2:708–718. 10.1093/gbe/evq055 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 221.López-Madrigal S, Latorre A, Porcar M, Moya A, Gil R. 2011. Complete genome sequence of “Candidatus Tremblaya princeps” strain PCVAL, an intriguing translational machine below the living-cell status. J. Bacteriol. 193:5587–5588. 10.1128/JB.05749-11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 222.Yizhak K, Tuller T, Papp B, Ruppin E. 2011. Metabolic modeling of endosymbiont genome reduction on a temporal scale. Mol. Syst. Biol. 7:479. 10.1038/msb.2011.11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 223.Prickett MD, Page M, Douglas AE, Thomas GH. 2006. BuchneraBASE: a post-genomic resource for Buchnera sp. APS. Bioinformatics 22:641–642. 10.1093/bioinformatics/btk024 [DOI] [PubMed] [Google Scholar]
- 224.Moran NA, Tran P, Gerardo NM. 2005. Symbiosis and insect diversification: an ancient symbiont of sap-feeding insects from the bacterial phylum Bacteroidetes. Appl. Environ. Microbiol. 71:8802–8810. 10.1128/AEM.71.12.8802-8810.2005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 225.Kuwahara H, Yoshida T, Takaki Y, Shimamura S, Nishi S, Harada M, Matsuyama K, Takishita K, Kawato M, Uematsu K, Fujiwara Y, Sato T, Kato C, Kitagawa M, Kato I, Maruyama T. 2007. Reduced genome of the thioautotrophic intracellular symbiont in a deep-sea clam, Calyptogena okutanii. Curr. Biol. 17:881–886. 10.1016/j.cub.2007.04.039 [DOI] [PubMed] [Google Scholar]
- 226.Carini P, Steindler L, Beszteri S, Giovannoni SJ. 2013. Nutrient requirements for growth of the extreme oligotroph “Candidatus Pelagibacter ubique” HTCC1062 on a defined medium. ISME J. 7:592–602. 10.1038/ismej.2012.122 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 227.Giovannoni SJ, Tripp HJ, Givan S, Podar M, Vergin KL, Baptista D, Bibbs L, Eads J, Richardson TH, Noordewier M, Rappé MS, Short JM, Carrington JC, Mathur EJ. 2005. Genome streamlining in a cosmopolitan oceanic bacterium. Science 309:1242–1245. 10.1126/science.1114057 [DOI] [PubMed] [Google Scholar]
- 228.Sowell SM, Norbeck AD, Lipton MS, Nicora CD, Callister SJ, Smith RD, Barofsky DF, Giovannoni SJ. 2008. Proteomic analysis of stationary phase in the marine bacterium “Candidatus Pelagibacter ubique.” Appl. Environ. Microbiol. 74:4091–4100. 10.1128/AEM.00599-08 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 229.Rocap G, Larimer FW, Lamerdin J, Malfatti S, Chain P, Ahlgren NA, Arellano A, Coleman M, Hauser L, Hess WR, Johnson ZI, Land M, Lindell D, Post AF, Regala W, Shah M, Shaw SL, Steglich C, Sullivan MB, Ting CS, Tolonen A, Webb EA, Zinser ER, Chisholm SW. 2003. Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation. Nature 424:1042–1047. 10.1038/nature01947 [DOI] [PubMed] [Google Scholar]
- 230.Steglich C, Futschik ME, Lindell D, Voss B, Chisholm SW, Hess WR. 2008. The challenge of regulation in a minimal photoautotroph: non-coding RNAs in Prochlorococcus. PLoS Genet. 4:e1000173. 10.1371/journal.pgen.1000173 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 231.García-Fernández JM, de Marsac NT, Diez J. 2004. Streamlined regulation and gene loss as adaptive mechanisms in Prochlorococcus for optimized nitrogen utilization in oligotrophic environments. Microbiol. Mol. Biol. Rev. 68:630–638. 10.1128/MMBR.68.4.630-638.2004 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 232.Itaya M. 1995. An estimation of minimal genome size required for life. FEBS Lett. 362:257–260. 10.1016/0014-5793(95)00233-Y [DOI] [PubMed] [Google Scholar]
- 233.Westers H, Dorenbos R, van Dijl JM, Kabel J, Flanagan T, Devine KM, Jude F, Seror SJ, Beekman AC, Darmon E, Eschevins C, de Jong A, Bron S, Kuipers OP, Albertini AM, Antelmann H, Hecker M, Zamboni N, Sauer U, Bruand C, Ehrlich DS, Alonso JC, Salas M, Quax WJ. 2003. Genome engineering reveals large dispensable regions in Bacillus subtilis. Mol. Biol. Evol. 20:2076–2090. 10.1093/molbev/msg219 [DOI] [PubMed] [Google Scholar]
- 234.Waters E, Hohn MJ, Ahel I, Graham DE, Adams MD, Barnstead M, Beeson KY, Bibbs L, Bolanos R, Keller M, Kretz K, Lin X, Mathur E, Ni J, Podar M, Richardson T, Sutton GG, Simon M, Soll D, Stetter KO, Short JM, Noordewier M. 2003. The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Proc. Natl. Acad. Sci. U. S. A. 100:12984–12988. 10.1073/pnas.1735403100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 235.McCutcheon JP, Moran NA. 2007. Parallel genomic evolution and metabolic interdependence in an ancient symbiosis. Proc. Natl. Acad. Sci. U. S. A. 104:19392–19397. 10.1073/pnas.0708855104 [DOI] [PMC free article] [PubMed] [Google Scholar]