Author manuscript; available in PMC: 2025 Sep 19.
Published in final edited form as: Cell. 2024 Sep 19;187(19):5151–5170. doi: 10.1016/j.cell.2024.08.028

Modern microbiology: embracing complexity through integration across scales

A Murat Eren 1,2,3,4,5,*, Jillian F Banfield 6,7,8,9,*
PMCID: PMC11450119  NIHMSID: NIHMS2026252  PMID: 39303684

Abstract

Microbes were the only form of life on Earth for most of its history, and they still account for the vast majority of life’s diversity. They convert rocks to soil, produce much of the oxygen we breathe, remediate our sewage, and sustain agriculture. Microbes are vital to planetary health as they maintain biogeochemical cycles that produce and consume major greenhouse gases and support large food webs. Modern microbiologists analyze nucleic acids, proteins, and metabolites; leverage sophisticated genetic tools, software, and bioinformatic algorithms; and process and integrate complex and heterogeneous datasets so that microbial systems may be harnessed to address contemporary challenges in health, the environment, and basic science. Here we consider an inevitably incomplete list of emergent themes in our discipline and highlight those that we recognize as the archetypes of its modern era that aim to address the most pressing problems of the 21st century.

Introduction

Our planet faces formidable challenges that span from climate change, environmental pollution, depletion of fertile soils, and loss of biodiversity to feeding and caring for the ever-growing human populations in a sustainable fashion. Microbes, whose evolution has been intimately interconnected with changes in Earth’s physicochemical environments over biological time, remain an integral part of these interlinked crises. With its aim to understand microbial life, and given the potential to harness microbes to address these huge problems, the field of microbiology is emerging as arguably the most important science of the mid-21st century.

Microbiology has made remarkable progress throughout its relatively brief history of a few hundred years. The advances seem far from reaching a plateau, as we have likely just begun to scratch the surface of major biotechnological and medical advances that are enabled by studies of microbial life. Today, microbiologists aim to elucidate and ultimately address a wide range of processes that span from greenhouse gas emissions and soil carbon sinks to wastewater management, protection of clean water resources, detoxification of heavy-metal and organic pollutants, maintenance of soil fertility and rhizosphere nutrition, the emergence and mitigation of plant and animal pathogens, antibiotic resistance, and pandemics. Investigations to address such diverse questions often rely on a broad range of data-producing and data-processing technologies, which demand increasingly higher levels of integration and synthesis for unified insights.

Microbiology draws its power from relatively well-established technologies such as molecular sequencers, mass spectrometers, high-throughput cultivation, genetic manipulation and a large arsenal of ‘omics strategies, statistics, and modeling, as well as technologies that are rapidly emerging, such as artificial intelligence, machine learning, microbiome-based genome editing, and more. As the contemporary questions of microbiology demand the orchestration of diverse data generation and analysis strategies that yield increasingly large datasets, integrative approaches that span scales of complexity emerge as one of the hallmarks of modern microbiology. Here we take a look at the past and contemplate the future of our rapidly transforming discipline by starting with a discussion on some of the major themes that underpin the new data-intensive epoch of microbiology.

Modern microbiology: from individual microbes to communities

It is a daunting task to precisely define what modern microbiology is, but a useful one to reminisce about how we got to where we are, and to recognize new motivations, opportunities, and challenges that influence today’s microbiologists. If we could explain how we study microbes today and what we know about them to Antonie van Leeuwenhoek, who saw microbes for the first time in the 1670s, to Ignaz Semmelweis, who realized in the 1840s that washing hands reduces patient mortality in hospitals, or to Fanny Hesse, who established agar to grow microbes in laboratory settings in the 1880s – what would they find to be the most fascinating advance microbiology has achieved compared to its early days? We would indeed have a myriad of surprises to show them proudly, as we now can isolate microbes in nanodroplets1, image them in situ with colorful probes2 and in aqua in three dimensions3, survey metabolites they exchange4, visualize and model their metabolic activities both at the level of individual elements5 and at planetary scales6, pull them around by their surface proteins7, stick them into little wells8, and even modify9, transform10, or synthesize11 their genetic repertoire, or transfer their genes into the genomes of plants12 and animals13. But apart from the permanently central role of technology in microbiology, there is one conceptual fracture in our otherwise continuous progress towards a better understanding of microbial life which can serve as a line between the still alive and well classical roots of microbiology and its aspiring modern times: our expanding focus from individual organisms in cultures to complex communities in natural habitats, with the admission that ‘a pure culture is not enough’14.

The shift from individuals to communities in microbiology was enabled by a series of technological and conceptual advances: it was foreshadowed by the theory of evolution, and guided by many breakthroughs such as the discovery of the structure of DNA15, the recognition of the utility of the molecular information encoded in genomes as a means to document evolutionary history16, and the arrival of sequencing17. One of the most recent and most defining conceptual advances, which wove this thread of progress into a single evolutionary framework connecting all cellular life, was the work led by Carl Woese, which offered a first glimpse of the tree of life through the comparison of ribosomal RNA gene sequences in 197718. The translation of this transformative idea to practical applications in microbiology started to emerge through the pioneering work by Norman Pace and colleagues that described rapid sequencing of ribosomal RNA (rRNA) gene fragments19 and cloning methods that enabled gene recovery from environmental samples20, which gained further traction with simpler molecular strategies that require lower input biomass21. In the following years, PCR-amplification of 16S and 18S rRNA gene fragments became an increasingly popular method to survey microbial diversity in naturally occurring habitats22,23. Coinciding with the increasing availability and affordability of short-read sequencing in the early 2000s, a plethora of amplicon studies revealed the richness and dynamism of microbial communities from marine24 and terrestrial systems25 to host-associated habitats26, and led to a broad recognition and awareness of Earth’s flourishing microbiomes and their connection to the rest of life27.
The transition from individuals to whole microbiomes motivated a societal level adjustment of our perspective: microbes were no longer a band of bad actors in the public mind that typically caused diseases, but the architects of life, the wardens of planetary health, and the fundamental forces that make our ecosystems tick.

Surveys of microbial diversity through rRNA gene amplicons spread like wildfire in microbiology and resulted in over 85,000 studies in the past 20 years. Yet, taxonomic insights offer limited utility to understand functional drivers of biological systems, an overarching goal that brings together many corners of microbiology. Despite the inspiration to better understand microbial systems, the integrative power of amplicon surveys remained severely constrained. Understanding of microbes in their environmental context required genomes, which first became available for isolates that were generated by indispensable yet labor-intensive isolation efforts28. However, isolation is inherently biased towards microbes that can grow alone and for which suitable growth conditions and substrates are known, and has not delivered even a single representative of the majority of currently known lineages29.

The realization of the need for direct access to the genetic makeup of environmental populations intensified during the late 1990s. In pursuit of natural products potentially available from the untapped diversity of microbes in the environment, Handelsman and colleagues introduced a cloning-based approach to access genes and pathways of interest from soil ‘metagenome’30. The early approaches to metagenomics, which promoted the idea of cultivation-independent surveys of environmental populations, relied upon “isolating DNA from an environmental sample, cloning the DNA into a suitable vector, transforming the clones into a host bacterium, and screening the resulting transformants”14. The arduous beginnings of metagenomics offered unique insights into environmental microbes31,32, and demonstrated the feasibility of culture-independent surveys of natural populations. An alternative approach to metagenomics relied upon the sequencing of randomly chosen short DNA fragments33. Combined with the advances in sequencing chemistry and molecular approaches, this ‘shotgun sequencing’ approach replaced cloning-based strategies, and ultimately enabled the reconstruction of genomes directly from natural ecosystems.

The core idea behind reconstructing genomes from environmental metagenomes, which was laid out in the first example of its kind, stood on a relatively simple principle that relied on a few imperfect but computationally tractable steps34: assembling metagenomic short reads into longer contiguous DNA segments (contigs), estimating the abundance of contigs in metagenomes, and partitioning contigs into separate genome bins by exploiting differences in GC content and abundance of each discrete population. The power of genome-resolved metagenomics came from the fact that no aspect of it required any prior knowledge about organisms in a given environment. Its simplicity allowed this strategy to be encapsulated in automated workflows, from which emerged a plethora of genomes, not only from taxa microbiology knew about but also from those we had not recognized before, and not only from the primary domains of life but also from viruses, plasmids, and other exciting biological entities. Over time, the size of sequencing datasets increased, along with the power of bioinformatics tools, enabling the transition of genome-resolved surveys from relatively low complexity systems to the microbiologically most complex ecosystems on Earth, such as oceans and soils. Metagenome-assembled genomes (MAGs), along with single-amplified genomes (SAGs) and those that emerged from improved isolation efforts, opened the floodgates of genomic data pouring in from all environments. Today, the rapidly maturing evolutionary framework around genomes offers a playground for all corners of our discipline to integrate heterogeneous data types that slice through microbial systems in complementary ways through analytical, computational, and molecular tools to address fundamental questions of modern microbiology (Figure 1).
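The binning step described above can be illustrated with a deliberately simplified sketch. It assumes contigs have already been assembled and that a mean read depth per contig is available as an abundance proxy; the function names, the grid-based grouping, and the toy sequences are hypothetical and chosen for illustration only. Production binning tools use far richer signals and proper clustering.

```python
from collections import defaultdict


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a contig (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def bin_contigs(contigs, coverage, gc_step=0.10, cov_step=10.0):
    """Group contigs whose GC content and mean coverage fall in the same grid cell.

    contigs:  dict of contig name -> DNA sequence (hypothetical input)
    coverage: dict of contig name -> mean read depth, used as an abundance proxy
    gc_step, cov_step: illustrative grid sizes, not values from any published tool
    """
    bins = defaultdict(list)
    for name, seq in contigs.items():
        key = (int(gc_content(seq) / gc_step), int(coverage[name] / cov_step))
        bins[key].append(name)
    return dict(bins)


# Toy data: two GC-rich, high-coverage contigs and one AT-rich, low-coverage contig.
contigs = {
    "c1": "GCGCGCGCAT",
    "c2": "GGCCGGCCTA",
    "c3": "ATATATATGC",
}
coverage = {"c1": 52.0, "c2": 55.0, "c3": 12.0}

bins = bin_contigs(contigs, coverage)
for members in bins.values():
    print(sorted(members))
```

In this toy case the two contigs with similar composition and depth land in one bin and the third contig in another, which mirrors the principle that contigs from the same population share GC content and abundance without requiring any prior knowledge of the organisms involved.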

Figure 1.

The measurement tools, data types, and analysis strategies commonly used in modern microbiology to study naturally occurring microbial life in a wide range of complex habitats through rapidly emerging genomic blueprints of the diversity of life.

From communities back to individual microbes

Cultivation, the most fundamental methodological advance that shaped classical microbiology by enabling molecular insights with broad applications through manipulation of representative microbial organisms, continues to be the bedrock of contemporary studies of microbial physiology in the post-omics era35. The ability to reconstruct microbial genomes directly from the environment represents an effective means to improve cultivation efforts as it can provide clues regarding metabolic requirements of little known organisms, and guide their directed isolation36. The first demonstration of this idea was the isolation of Leptospirillum ferrodiazotrophum37 from a relatively simple subsurface acid mine drainage biofilm following the observation that it was the only organism in its community capable of nitrogen fixation. The first successful isolation of Coxiella burnetii38, the causative agent of human Q fever39, represents another example of how partial insights into genomic features of target organisms can help define complex media that support the axenic growth of even obligate intracellular organisms. Peering into the metabolic make up of environmental communities through metagenome-assembled genomes continues to be a powerful tool to support cultivation efforts even from complex mixtures of microbes. For instance, Pope et al. were able to isolate a representative of the family Succinivibrionaceae, so called WG-1, that dampens methane emissions from digestive activities of macropods using a composite genome reconstructed directly from a metagenome. Metabolic insights gained from this genome supported the development of a medium with specific carbohydrate and nitrogen sources as well as antibiotics to achieve an axenic culture40. Others applied similar principles to grow microbes from other environments, including human gut41 and even extreme subsurface environments42. 
Prediction of the episymbiotic lifestyles of Candidate Phyla Radiation (CPR) bacteria from environmentally-derived genomes43 led to the co-isolation of these tiny bacteria with their hosts44, and opened the way for microscopic45, metabolic46, and transcriptomic47 definitions of the impact of these symbioses, and their interactions with their hosts48. An exciting workflow that integrated the use of environmental genomes, host immunity, and fluorescent labeling for targeted cultivation of microbes was demonstrated by Cross et al., who cultivated multiple CPR bacteria, including the elusive Absconditabacteria (SR1)7. Cross et al. first surveyed single-amplified and metagenome-assembled genomes of target bacteria to identify membrane proteins that could serve as immunogens, then predicted extracellular domains in these candidate genes with homology to previously recognized immunogens, synthesized matching peptides, introduced them into rabbits, recovered antibodies generated by the rabbit immune system, used these antibodies to ‘label’ bacterial cells in a human saliva sample, and were able to sort labeled cells into cultures7.

In addition to culture-independent insights that improve cultivation efforts and new approaches that aim to preserve natural growth conditions of environmental microbes49,50 or their interactions51, modern approaches to cultivation include high-throughput strategies that exploit robotics and microfluidics. Recent studies demonstrate the utility of microfluidics to generate thousands of nanoliter droplets on petri dishes with chemical gradients52, and the application of robotics and machine learning to process thousands of colonies per hour53. One critical application of microfluidics is targeted cultivation of microorganisms based on any genetic feature that is learned from metagenomes. Using this strategy, Ma et al. were able to identify a microcolony that carried their target gene sequence among 500 microcolonies they grew on a multiplexed microfluidic device, and recovered the first isolate of a previously unidentified genus of the Ruminococcaceae family54 that has been among the most abundant and prevalent taxa in the human gut with no cultured representatives until then55. Another exciting application of microfluidics is the untargeted, high-throughput cultivation of microorganisms in droplets. The isolation of individual bacterial cells in droplets reduces the loss of microbial colonies in cultivation efforts due to resource competition, differences in growth rates, or dilution of quorum sensing molecules56. In an application of this approach, Watterson et al. populated millions of picoliter droplets created from distinct culture media with individual microbial cells from a single sample1. Droplets they sorted based on colony density included many of the rare organisms conventional cultivation strategies missed in the same sample, and recovered antibiotic resistant strains that conventional plate-based assessments of antibiotic resistance were unable to detect1.

Dramatic expansion of the tree of life

Our newly found power to sequence genes from environmental samples had far-reaching implications for our understanding of the diversity and evolution of life through phylogenomics and resulted in remarkable changes to the structure of the tree of life. The sequencing of marker genes such as rRNAs had revealed many, but not all, branches of life, but inconsistencies in predicted physiology in major branches and unexpected gene inventories only became evident upon recovery of genomes. Notable features of the revised and genome-informed tree of life include the Asgard archaea with their inventories of eukaryote-associated genes57, making them of great interest from the perspective of the origin of Eukaryotes58. Other major groups of particular interest include the possibly not monophyletic DPANN archaea and the seemingly monophyletic CPR, which display consistently small genomes and limited biosynthetic capacities. This led to the prediction that both the bacterial and archaeal domains of life feature large groups of microbes that generally depend on host microorganisms for basic building blocks such as lipids and nucleotides59, and recent work has clarified the partitioning of host-derived lipids into some episymbionts60. Inclusion of sequences from metagenomes as well as isolates brought into focus the dramatic prevalence of branches of the tree that remain without any isolated representatives29. Further refinements have begun to clarify relationships between deep branches, for example the Chloroflexi and CPR, and the distribution of traits61. The revised views of the bacterial and archaeal domains as well as the whole Tree of Life contrast dramatically with even contemporary views of the diversity of life that are highly eukaryote-centric and, in some cases, do not even include non-eukaryotic organisms. These renderings enabled a new appreciation of our place in biological evolution and biodiversity.

The value of complete and finished genomes is hard to overemphasize. In addition to providing a quality-assured inventory of genes and metabolic potential without contamination by sequences from other sources, finished genomes ensure full genomic context for any genetic feature and define the overall genome architecture, including exact genome size and replichore structure. However, one of the hallmarks of the contemporary catalogs of metagenome-assembled genomes is their low completion and occasional contamination due to the highly fragmented nature of metagenomic assemblies62. While some of these concerns can be addressed by manual62,63 or automated genome curation and bin refinement approaches64, long-read sequencing technologies are already enabling the reconstruction of complete genomes without binning, and this will soon be possible at scale. Products of long-read sequencing can be error prone, and efforts to automate error correction and to improve the accuracy of long reads are underway (e.g.,65–67). The number of assembly algorithms that can report circularized sequences from metagenomes is also increasing68–70. However, assembled long reads are not without issues, and the resulting sequences can suffer from false circularization due to artifactual repeated sequences and misassemblies that are not supported by the original reads. High-throughput methods for detection and correction of assembly errors will ultimately bring microbiology closer to the ideal of routinely rendering accurate and complete genomes and increase the utility of genomes from environments for downstream applications in microbiology.

Expanding and exploring the protein universe

Metagenomic sequencing can uncover new protein families and whole proteome comparisons (including proteins with unknown functions) can establish relatedness amongst large groups of organisms71, including those that lack universal genes, such as extrachromosomal elements72. Despite the lack of recognizable functional domains, the sequences themselves can provide information about physicochemical properties of the protein (e.g., hydrophobic domains or positively charged regions that may be involved in nucleic acid binding) and uncover metal and ligand binding sites. A new generation of computational methods such as AlphaFold73, RoseTTAFold74, and ESMFold75 enable prediction of the three-dimensional structures of proteins, thus can provide functional clues that are not apparent based on the sequence alone. Often, predictions leverage templates in the form of the structures of biochemically validated proteins accessed via large protein databases such as the Protein Data Bank (PDB)76. However, it is possible to improve these predictions by supplementing the underlying multi-sequence protein alignments using the ever expanding database of amino acid sequences derived from metagenomic studies77. In fact, using sequences independent of templates it is possible to discover new protein folds and vastly increase access to three dimensional structures for proteins with no related solved structures or sequence similarity to known proteins78. These approaches, in combination with ecological and other information, have the potential to advance understanding the protein inventories of enigmatic entities such as viruses79 and aid in the development of biotechnology tools80. 
Exciting new bilingual models that translate information between structural and sequence space81, computational tools that increase the accessibility of protein structure calculations82, and tools that improve the searchability of existing structures83,84 in large databases with millions of predictions75 help locate variants of proteins of interest with potentially different size or kinetics. It may be possible to revolutionize drug development given methods that predict molecular interactions and discover small molecules or antibodies that bind to proteins identified as drug targets85. Indeed, computational approaches continue to improve at a rapid pace to predict a broad range of biomolecular interactions73 and the stoichiometry of large protein complexes86. Altogether, the ongoing revolution in protein structure space and new opportunities to integrate structural insights with established ‘omics data types and analysis strategies will undoubtedly influence every corner of microbiology.

Going beyond the tree of life – (giant) viruses, (giant) plasmids, and other extrachromosomal entities

With their astronomical numbers and complexity, viruses represent a force of nature that is implicated in all aspects of life, starting from its very origins87,88. The immense impact of viruses on life continues through metabolic reprogramming of their hosts89,90, biogeochemical and ecological processes they influence in complex habitats91, and even the modifications of the reproductive behavior of animals92.

Contemporary virus research spans from their characterization and organization into cohesive units to linking them to particular hosts or ecosystems and investigating their biotechnological applications. The emergence of computational tools to recognize viral sequences in metagenomic assemblies93–95 resulted in the recovery of tens of thousands of new viral genomes from the human gut96, oceans97, and soils98. The increasing access to metatranscriptomics further supports the recovery of RNA viruses99, and enables spatiotemporal tracking of eukaryotic viruses of public health significance and their variants100. Microbiologists were surprised when ‘giant eukaryotic viruses’, whose genome sizes surpass those of some bacteria101, were discovered102. Genome-resolved metagenomics has also had a remarkable impact on our rapidly emerging understanding of the diversity and biogeography of giant viruses103, their complex metabolic capabilities104, host reprogramming105, and impact on host genome evolution106. Despite the progress towards a global taxonomic framework for viruses107,108, the luxury to be represented in a universal tree of life continues to be limited to cellular organisms. However, at least at the level of individual studies, the integrated use of metagenomic assemblies or genome-resolved metagenomics, phylogenomics, and protein structure prediction offers promising insights, not only to achieve a more complete representation of viral clades but also to fill the gaps between disconnected realms of viruses. For instance, in a recent study Gaïa et al. applied phylogeny-guided genome-resolved metagenomics based on DNA-dependent RNA polymerase subunits to describe a large clade of viruses, and by combining functional features and predicted protein structures they were able to reveal a first association between giant viruses that infect unicellular marine eukaryotes and herpesviruses that infect mammals109.
In addition to their prominent place in basic science, viruses continue to be a powerful driver of biotechnology for innumerable reasons, including therapeutic opportunities110, the potential of their proteins as small genome editors111 and their ability to deliver DNA (including editing tools) into the cells in which they replicate112–114. They also force us to question the classical definitions of what is a bona fide living entity115 with opposing views116, yet continuously elucidate the conceptual frameworks that can offer formal organizations of the realms of biology117.

Plasmids, which were first defined by Joshua Lederberg in 1952118 and have served as a workhorse of molecular biology and genetics119, represent another exciting corner of microbiology. The significance of plasmids parallels that of viruses: plasmids also impact the ecology and evolution of their hosts120, often find themselves under the spotlight of public health concerns such as global antibiotic resistance121 largely due to their contribution to the proliferation of such genetic factors122, encode genes that are associated with a wide range of metabolic activities123, and suffer from a similar exclusion from trees of life. It is conceivable that these similarities may simply be an echo of deeper evolutionary connections that are no longer present in contemporary sequence space. Indeed, recent characterizations of ‘virus-plasmids’ that can act both as viruses and plasmids124, observations of plasmids that propagate through virus-like particles125, or phylogenetic analyses that attribute the emergence of DNA viruses to recombination events between RNA viruses and plasmids126 reinforce the likely presence of deeper evolutionary links and hint that there is likely much more to discover in this junction. Also similar to viruses, the diversity of naturally occurring plasmids has been grossly underestimated. While metagenomic sequencing and assemblies access naturally occurring plasmids, recognizing plasmids in sequence collections remains a challenge. Some plasmids, even for some of the most rigorously studied microbial taxa such as Wolbachia, remained undetected for decades127. But our ability to recognize plasmids is improving thanks to computational strategies that exploit k-mer profiles learned from reference plasmids128, conserved plasmid functions129, circularity of sequences in metagenomes130, or deep learning approaches that use combinations of these features131. 
Today we know that even for relatively well studied ecosystems, plasmids found exclusively in metagenomes exceed the number of previously known plasmids severalfold132. An interesting insight comes from a recent study that uses 68,000 non-redundant plasmids that were identified from human gut metagenomes and shows that plasmid distribution patterns are not correlated with bacterial taxonomy123, which reinforces the critical importance of treating them as semi-independent entities that represent an adaptive force in microbial systems133, similar to viruses. Yet organizing plasmids into evolutionarily cohesive units is perhaps an even bigger challenge than their identification. Unlike viruses, where relatively stable genes within distinct viral realms can yield taxonomic frameworks to shed some light on evolutionary routes of individual viral genomes134, attempts to find cohesive units in diverse sets of plasmids have been largely limited to partitioning pairwise sequence similarity networks135,136. A recent attempt to identify evolutionarily cohesive ‘plasmid systems’ in such networks through a containment-aware partitioning approach demonstrated a means to identify plasmid backbones and cargo genes, and partition plasmid genes into housekeeping genes and those that respond to particular environmental conditions de novo123. Plasmids are a pillar of biopharmaceutical advances119, and naturally occurring plasmids represent a new asset. But the world of plasmids is chaotic: some can be so essential and integrated into the host cellular physiology that they compelled researchers to call them ‘chromids’137, while others look small and irrelevant even though they can be among the most numerous genetic elements in natural habitats138. With their unruly nature, plasmids remain an intimidating and exciting dimension of microbiology.

A significant challenge that is generally true for most extrachromosomal elements that are not typically found integrated into host genomes is the difficulty of linking them to their host organisms, which can be a significant burden in complex environments. By preserving the unity of naturally occurring genetic elements within cells, which is typically lost in bulk sequencing efforts, single-cell genomics approaches offer a relatively straightforward means to capture virus-host139 and plasmid-host140 associations with high confidence. In some cases it is possible to leverage the fact that bacteria and archaea incorporate snippets of the genomes of their extrachromosomal elements into their CRISPR loci (spacers) so that invasive elements can be recognized by the protein-RNA-based surveillance system. Extrachromosomal elements whose genomes match CRISPR spacer sequences, especially from the same or related samples, are likely the targets of that microbial immune system, thus indicating relationships between extrachromosomal elements and their hosts141. Because CRISPR loci are often very fast evolving, the effectiveness of this method can be improved when spacer sequences are recovered directly from short metagenomic reads rather than from the assembled consensus sequence141. Another solution to identify hosts for extrachromosomal entities in complex samples comes from Hi-C142, a molecular strategy that can physically link genetic entities that are adjacent to one another prior to sequencing, and can reveal associations between host genomes and extrachromosomal entities, including plasmids143 and viruses144,145. Finally, heterogeneity in population genomic data can identify fast evolutionary modes and uncover the exact integration sites for prophages and proviruses, simultaneously establishing host association for these elements146,147.
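The spacer-matching idea above can be sketched in a few lines. This is a minimal illustration that assumes spacers have already been extracted from host CRISPR loci and that exact matches on either strand are sufficient; the function names and sequences are hypothetical, and real analyses typically tolerate mismatches and rely on dedicated alignment tools.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def link_elements_to_hosts(host_spacers, elements):
    """Report (host, element, spacer) triples where a spacer occurs in an element.

    host_spacers: dict of host genome name -> list of CRISPR spacer sequences
    elements:     dict of extrachromosomal element name -> DNA sequence
    Both inputs are hypothetical; matching is exact and checks both strands.
    """
    links = []
    for host, spacers in host_spacers.items():
        for spacer in spacers:
            targets = (spacer.upper(), reverse_complement(spacer))
            for name, seq in elements.items():
                if any(target in seq.upper() for target in targets):
                    links.append((host, name, spacer))
    return links


# Toy data: one spacer matches a phage directly, another matches a plasmid
# on the opposite strand.
host_spacers = {
    "host_A": ["ACGTTGCAAGGCTA"],
    "host_B": ["GATTACAGATTACA"],
}
elements = {
    "phage_1": "GGG" + "ACGTTGCAAGGCTA" + "CCC",
    "plasmid_1": "CCCC" + reverse_complement("GATTACAGATTACA") + "GGGG",
}

links = link_elements_to_hosts(host_spacers, elements)
for host, element, spacer in links:
    print(f"{element} is a likely target of {host} (spacer {spacer})")
```

Searching both strands matters because a spacer may match either orientation of the element sequence as it appears in an assembly.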

Overall, our insights into extrachromosomal elements are improving at a rapid pace, and cover more than historically well-recognized ones such as viruses and plasmids. The recent description of ‘Borgs’, which can have over 1 Mbp linear genomes, contain large inventories of metabolically relevant genes, and are not simply classifiable as plasmids or viruses148, ‘obelisks’, which are ~1 kbp viroid-like extrachromosomal circular RNAs that are prevalent in gut metagenomes and metatranscriptomes149, the recent expansion of ‘viroid-like circular RNAs’150, the ‘phage-inducible chromosomal island-like elements’ that demonstrate the extent of parasitism among mobile genetic elements151 and the overall prevalence of poorly characterized phage satellites across bacterial genomes152 indicate that there is more to be learned about these mysterious entities.

Discovery and validation of new genetic codes

Metagenomic sequencing has brought to light unexpected phenomena that present new dimensions of modern microbiology. One of these relates to the genetic code and its variability. The genetic code was long considered immutable and, indeed, it is widely conserved in bacteria and archaea. However, partial sequencing of a ribosomal gene cluster in the Mycoplasma capricolum genome revealed that, as in some mitochondrial genomes, TGA (normally a stop codon) is read as tryptophan (genetic code 4)153. Subsequently, metagenomics-based analyses indicated that Gracilibacteria and Absconditabacteria, sibling lineages within the CPR, have reassigned the TGA codon43. Its translation as glycine154, confirmed using metaproteomics155, ushered in a third bacterial genetic code (Code 25). These code variations appear to have arisen in single, rare and ancient evolutionary events. To date, no analogous genome-wide codon reassignments have been reported in archaea, although this domain of life remains under-explored relative to bacteria and eukaryotes. Recently, a new computational approach, ‘Codetta’, was developed to identify new instances of codon reassignment in the now vast genomic sequence databases156. These analyses revealed the first evidence for reassignment of sense codons to different amino acids in bacteria. In combination, these studies broaden and deepen our understanding of the modes and frequency of genetic code evolution in prokaryotes.
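To make the consequence of a TGA reassignment concrete, the following Python sketch translates the same open reading frame under the standard code, under code 4 (TGA read as tryptophan), and under code 25 (TGA read as glycine). The table-building helper, the function names, and the example sequence are illustrative assumptions; only the codon assignments themselves follow the codes named above.

```python
from itertools import product
from typing import Dict

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def build_code(**reassignments: str) -> Dict[str, str]:
    """Return a codon-to-amino-acid table, applying any reassignments."""
    codons = ["".join(c) for c in product(_BASES, repeat=3)]
    table = dict(zip(codons, _STANDARD_AA))
    table.update(reassignments)
    return table


STANDARD = build_code()
CODE_4 = build_code(TGA="W")   # TGA read as tryptophan
CODE_25 = build_code(TGA="G")  # TGA read as glycine


def translate(seq: str, table: Dict[str, str]) -> str:
    """Translate in frame 0 and stop at the first stop codon."""
    protein = []
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        aa = table[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


orf = "ATGAAATGAGGCTAA"  # invented example: ATG AAA TGA GGC TAA
```

Translating `orf` with `STANDARD` terminates at the in-frame TGA, whereas `CODE_4` and `CODE_25` read through it, which is why gene predictions made with the wrong code tend to yield fragmented proteins.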

The origin of the relatively (but with exceptions) stable modern genetic code has been investigated recently by Žagrović and colleagues, who found that, although mRNAs and their cognate proteins are different types of polymers, mRNAs and the proteins they encode display complementarity157. This is particularly evident if the molecules are unstructured (disordered), as would have been the case during evolution of the primordial genetic code. In other work potentially relevant to understanding the origin of the genetic code, analysis of large sequence databases from across the tree of life confirmed that frame-shifted proteins have similar physicochemical properties (e.g., hydrophobicity, nucleobase affinity and intrinsic disorder) to the original, non-frame-shifted proteins, despite essentially no primary sequence similarity158. It has long been known that similar codons encode amino acids with similar physicochemical properties and that the second codon position is most important for determining the properties of the amino acid encoded159. Thus, frame-shifted codons, which share two of three nucleotides, tend to incorporate related amino acids. Further insights into the evolution of the genetic code come from integrated analyses of genomes and environmental selective pressures quantified through metagenomes. One such investigation by Shenhav and Zeevi shows that the codon assignments of the genetic code minimize the chance that DNA mutations dramatically change the nutrient requirements of protein synthesis, revealing an overarching principle across the tree of life that enables relatively less risky exploration of the mutational landscape160.

Findings related to genetic codes of bacteriophages (phages) challenge the expectation that phages are unlikely to use a genetic code that differs from that of their hosts and the perception that genetic code changes are rare evolutionary events. In 2008, Shackelton and Holmes wrote “… it is extremely unlikely that viruses of hosts utilizing the universal genetic code would emerge, via cross-species transmission, in hosts utilizing alternative codes, and vice versa”161. Consistent with this prediction, these authors confirmed that alternatively coded phages replicate in hosts that also use an alternative genetic code. However, Ivanova et al. reported that some phage genomes do in fact use a genetic code in which one of the three normal stop codons is read as an amino acid, yet they replicate in hosts that use the standard genetic code162. Recently, the reverse was suggested: standard code phages can replicate in bacteria that use a modified genetic code, specifically, Gracilibacteria and Absconditabacteria, whose genomes encode TGA as glycine163. Studies now show that code mismatch between phages and hosts is a relatively common phenomenon. For example, diverse alternatively coded phages replicate in Bacteroidetes and Firmicutes, bacteria that are particularly prevalent in human and animal microbiomes164. Combined taxonomic and genetic code analysis of a metagenomically-derived database of alternatively coded phages revealed that these phages can be closely related to phages that use the standard code. In the most surprising case, one phage lineage features standard code, TGA-reassigned and TAG-reassigned genomes. Thus, contrary to evidence from prokaryote genomes, results for phages suggest that the barrier to code change is relatively low, so that codon reassignment can arise independently in different lineages and over relatively short evolutionary timescales164.

Indications that code shifts in phages are relatively facile raise the questions of the evolutionary path to code change and the machinery required for different codes to be interpreted by the ribosome during phage infection. Acquisition of a suppressor tRNA that translates a normal stop codon as an amino acid could occur via lateral gene transfer or, more likely, via mutation of the anticodon of a tRNA (possibly preceded by duplication of the tRNA)165. Interestingly, to date, analyses of metagenome-derived phage genomes suggest charging of suppressor tRNAs by one of only three amino acids, suggesting convergent paths to these codon reassignments. The advantages of code mismatch remain unclear, but may include preventing premature production of structural proteins and proteins involved in host cell lysis164.

Another form of genetic code variation involves the reassignment of normal stop codons to non-standard amino acids. The first recognized example of this was the translation of TGA codon to selenocysteine in specific proteins, as signaled by a SECIS element, a specific structural mRNA feature166. This can substantially modify the properties (e.g., stronger nucleophilic ability and lower reduction potential) of proteins relative to the cysteine-containing homologs167. More recently, reassignment of the TAG stop to the 22nd amino acid, pyrrolysine, was discovered, and, to date, found to be prevalent in methylamine methyltransferases and a few other proteins168.

Mapping biogeography to genes and genomes

Genomes have been reconstructed directly from metagenomes sampled from every major ecosystem, including microbiomes of humans, animals, insects, plants, soils, lakes, aquifers and oceans. The large influx of genomes had an immediate impact on life sciences through broader access to predicted genes, functions, and metabolic potential of naturally occurring organisms. Surveys of traits that define the biogeographical patterns and niche partitioning of plants and animals go as far back as the eighteenth century in biology. However, a broad appreciation of the fact that microbes also display non-uniform biogeographical distribution patterns as a function of environmental gradients took time to come into focus169. Some of the most defining work emerged from the studies of microbial mats in Yellowstone National Park, United States. Initially using denaturing gradient gel electrophoresis to separate 16S rRNA gene fragments amplified from the environment, Ward and colleagues showed the astonishingly clear patterns of microbial population biogeography within one-millimeter slices of Mushroom Spring cyanobacterial mats170. These observations at the micron scale were confirmed by others at the scale of meters, when Moore et al. showed that physiologically and phylogenetically distinct Prochlorococcus populations distribute across the water column as a function of light and temperature171, and at the scale of continents, when Cho and Tiedje showed that genotypic differences between Pseudomonas populations correlated with geographic distances between soil samples from which they originated172. With the recognition of microbial traits, their ecology, and their impact on biogeochemical systems173, the next challenge was to go from phylogenetic indicators of biogeography to traits that drive biogeographical partitioning of microbes174.
The initial gene-centric analyses of short metagenomic reads175 and longer genomic fragments recovered directly from environmental samples176 quickly revealed non-uniform distribution patterns for genetic features that were associated with environment-specific metabolic requirements. The ability to characterize biogeography of microbial populations along with their genes was a step towards one of the holy grails of microbiology with significant biotechnological and biomedical implications: elucidating genetic determinants of fitness encoded in genomes, which benefited from ecology-enabled comparative genomics approaches that soon followed.

Characterizing genetic heterogeneity in microbial populations

The first comparison of complete genomes of two closely related Helicobacter pylori strains revealed a large extent of gene conservancy and systematic differences177 and the existence of ‘core’ and ‘flexible’ gene pools178. Research in the early 2000s took advantage of the increasing access to microbial genomes to reveal the utility of comparative genomics to gain insights into the likely determinants of microbial colonization and virulence179, or develop vaccines180. The idea to systematically study differentially occurring genetic features of closely related microbial populations came to fruition with the recognition of the microbial pangenomes181 and evolutionary processes that shape them182,183. By partitioning the gene pool of a set of genomes, pangenomics offers insights into genes that are found in the vast majority of genomes as well as those that differentially occur between them, which provides immense power to link gene function to population ecology.
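The partitioning step at the heart of pangenomics can be sketched in a few lines of Python. The sketch assumes that genes have already been grouped into clusters of homologs and that each genome is represented by the set of cluster identifiers it carries; the function name, the `core_fraction` threshold, and the toy genomes are hypothetical choices for illustration rather than the conventions of any particular pangenomics tool.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]],
    core_fraction: float = 1.0,
) -> Tuple[Set[str], Set[str]]:
    """Split gene clusters into 'core' and 'flexible' pools.

    A cluster is called core if it occurs in at least `core_fraction`
    of the genomes; everything else is flexible.
    """
    n_genomes = len(genomes)
    # Count in how many genomes each gene cluster occurs.
    occurrence = Counter(
        cluster for clusters in genomes.values() for cluster in set(clusters)
    )
    core = {c for c, k in occurrence.items() if k / n_genomes >= core_fraction}
    flexible = set(occurrence) - core
    return core, flexible


# Toy example with three genomes and four gene clusters (invented data).
genomes = {
    "genome_1": {"gc_a", "gc_b", "gc_c"},
    "genome_2": {"gc_a", "gc_b", "gc_d"},
    "genome_3": {"gc_a", "gc_c"},
}
core, flexible = partition_pangenome(genomes)
```

Lowering `core_fraction` (for example to 0.6) relaxes the definition so that clusters missing from a minority of genomes still count as core, which can be useful when genomes are incomplete.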

Genomes can tell very little about their biogeography, and even less about how well their gene content represents the gene content of environmental populations to which they belong. Pangenomics combined with shotgun metagenomes can uncover not only the biogeography of individual populations, but also gene conservancy across habitats184. An application of this idea by Coleman and Chisholm to a Prochlorococcus pangenome and two marine metagenomes revealed that the scarcity of phosphorus in the North Atlantic Ocean resulted in a strict maintenance of accessory phosphorus acquisition genes for Prochlorococcus populations in the North Atlantic while their close relatives in the Pacific Ocean had largely lost them185. This integrated approach to pangenomics and metagenomics to identify selective forces that shape gene variation across populations and environments, also termed metapangenomics186, has revealed differential enrichment of near-identical Haemophilus parainfluenzae populations that are millimeters apart from each other in the human oral cavity, where populations enriched on the tongue commonly encoded sodium-dependent oxaloacetate decarboxylase enzymes, and those enriched in dental plaque completely lacked them187, and differential distribution of transporters and enzymes within marine SAR324 populations to balance autotrophic and heterotrophic lifestyles as a function of water depth188. The metapangenomic approach can also be used to determine whether critical genes validated experimentally in cultures are conserved in environmental populations189.

The inclusion of ecological patterns into comparative genomics reduces the number of targets in our search for genes that may be particularly important under particular environmental conditions. However, difficulty with de novo prediction of gene function poses challenges. Efforts to shed light on the ecology of unknown genes190,191, profile hidden Markov model search algorithms192, application of deep learning to distant homology searches193, algorithms that enable rapid mining for protein structures83, and bringing graph theory into pangenomes to infer synteny of genes194,195 will improve insights into the microbial gene pool.

Disentangling targets of long-term evolutionary processes

Even the genes that are shared across all members of a population will vary across individuals in subtle ways. Since the early descriptions of within-population trait variations in animals and plants by Darwin and Mendel, population genetics, the study of genetic variation among individual members of a population, has had a remarkable history in biology revealing molecular principles of evolution196, and rapidly gained attention in microbiology with the increasing number of genomes and metagenomes. Even with limited access to sequencing, it was becoming clear during the early 1990s that microbial organisms were not strictly clonal197, and this signal was essential to study fundamental principles of genetic innovation and divergence198 as the microbial world is a true playground for evolution199, which enables microbes to respond to environmental change within seasonal timescales200. Metagenomic datasets capture genomic heterogeneity of closely related populations and provide the data needed to identify genes under neutral, adaptive, or purifying forces. Indeed, the initial culture-independent observations of fine-scale population structures through microdiversity patterns found in clone libraries from marine systems201 foreshadowed the monumental impact of shotgun metagenomics to study genome-wide spatiotemporal patterns of evolution. Identifying targets of distinct evolutionary processes in complex systems is an intimidating task, given that even the most well-controlled laboratory systems conceivable, such as the long-term evolution experiment that exceeds 75,000 generations for 12 Escherichia coli populations202, present us with overwhelming complexity203. Read recruitment from shotgun metagenomes using assembled contigs or reference genomes enables the characterization of genetic variants with access to exact allele frequencies at a single-nucleotide resolution204. 
Early studies that took advantage of shotgun metagenomics to study population genetics in relatively simple microbial systems observed the outcomes of transposase activity and purifying selection146 and quantified genetic variability patterns throughout entire genomes of environmental populations205. In addition to increasing depth of metagenomic sequencing efforts, large-scale metagenomic surveys that yield public datasets with high-legacy206,207 have been improving the application of microbial population genetics. More recent research efforts have elucidated variable rates of evolution for genes within individual microbial populations of the human gut208, patterns of gene- and genome-wide selective sweeps in lakes209, fate of subpopulations after human fecal microbiota transplantation procedures210, associations between microbial population structures and human biogeography211, location-specific purifying forces in deep-sea hydrothermal vents212, the impact of large-scale oceanic current temperatures on population structures of marine organisms213, the impact of anatomy on population structures on human skin214, and more215,216.
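How read recruitment yields allele frequencies at single-nucleotide resolution can be shown with a minimal Python sketch. It assumes that reads have already been mapped and summarized as a pileup, that is, the string of bases observed at each reference position; the function names, the `min_minor_freq` threshold, and the toy pileup are illustrative assumptions, and real analyses additionally need to account for sequencing error, coverage, and mapping artifacts.

```python
from collections import Counter
from typing import Dict, List


def allele_frequencies(pileup: Dict[int, str]) -> Dict[int, Dict[str, float]]:
    """Return per-position allele frequencies from a pileup.

    `pileup` maps a reference position to the bases that recruited
    reads show at that position (one character per read).
    """
    freqs = {}
    for pos, bases in pileup.items():
        counts = Counter(bases.upper())
        depth = sum(counts.values())
        freqs[pos] = {base: n / depth for base, n in counts.items()}
    return freqs


def variable_positions(
    freqs: Dict[int, Dict[str, float]],
    min_minor_freq: float = 0.2,
) -> List[int]:
    """List positions whose second most frequent allele passes a threshold."""
    hits = []
    for pos, alleles in sorted(freqs.items()):
        ranked = sorted(alleles.values(), reverse=True)
        if len(ranked) > 1 and ranked[1] >= min_minor_freq:
            hits.append(pos)
    return hits


# Toy pileup with ten reads covering three positions (invented data).
pileup = {
    101: "AAAAAAAAAA",  # invariant
    102: "AAAAAAGGGG",  # two alleles at 60% and 40%
    103: "CCCCCCCCCT",  # rare variant, possibly a sequencing error
}
freqs = allele_frequencies(pileup)
snv_positions = variable_positions(freqs)
```

The threshold on the minor allele is a crude stand-in for the statistical filtering that dedicated tools apply before variants are interpreted.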

Many of the studies to date used subtle genetic variation in natural populations to infer to what extent divergence within microbial populations predicts or correlates with other environmental variables, such as differences in biogeography, various ecological measurements, or health and disease states of organisms colonized by microbes. However, while such statistical inferences that treat every variant as equal are useful for broad ecological insights, they are not suitable to resolve the impact of individual variants on function, or the evolutionary processes that maintain them. Increasing the utility of environmental surveys of genetic variation to advance biotechnological applications will undoubtedly benefit from consideration of protein structures217, given the sequence-structure-function paradigm218 and our renewed understanding of the direct relationships between protein sequence and function219. The well-understood importance of protein structures to interpret genomic variants220 was recognized in the context of population genetics over two decades ago221. However, these ideals were delayed by challenges associated with predicting protein structures from amino acid sequences until recently222. Especially with the introduction of a new generation of deep learning frameworks that enable accurate predictions of protein structures, careful interpretation of variants observed in metagenomes is becoming more accessible through integrative ‘omics platforms223 for structure-guided mining of metagenomes224. A recent study by Kiefl et al. that linked protein structures and microbial population genetics demonstrated the non-uniform distribution of non-synonymous variants with respect to their relative solvent accessibility and distance to ligands, and the rapidly decreasing rates of non-synonymous variants around ligand-binding sites of key metabolic genes as a function of bioavailable nitrogen concentrations in marine systems225. In another study, Han and Peng et al. observed variable selective pressures across the genomic features of organohalide reducing taxa that occupy deep sea cold seeps, and demonstrated the presence of stronger purifying selective forces acting on buried residues of metabolically critical reductive dehalogenases in the environment226.

Elucidating how microbes respond to rapid environmental change

Environments change constantly, and survival depends on the ability to keep up. Of all organisms, microbes are unusual in that their evolutionary rates may be fast enough to respond to steady and relatively slow environmental shifts that take place in timescales of decades or longer. However, the conventional means of genome evolution is not suitable to respond to rapid changes that may occur within hours, days, or weeks. Thus, life has evolved many means to dynamically respond to ever shifting fitness landscapes, some of which are just coming to light.

Optimization of transcription and mechanisms behind transcriptional regulation is arguably one of the most well understood means to respond to rapidly changing environmental conditions. Technological advances brought transcriptomics to fruition starting around the 1970s, and continued at a rapid pace with the increasing access to sequencing technologies227. The first whole-genome characterization of the transcriptomic landscape of a microbe demonstrated that Listeria monocytogenes had remarkable transcriptional changes as a function of the environment in which it occurred228. With massive advances in single-cell transcriptomics, recent studies demonstrate responses such as spatial expression profiles in biofilms229, heterogeneous responses to antibiotics exposure230, and transcriptional responses to phage infection231 at the level of individual cells across isogenic populations. Metatranscriptomics, the study of RNA molecules directly recovered from complex samples, was applied first to marine microbial populations232 and had challenging beginnings due to the extremely low numbers of messenger RNAs per cell, short half-life of RNA molecules, lack of poly(A) tails that increase signal-to-noise ratio in RNA libraries, and challenges associated with the quantification of RNA sequencing results233. Nevertheless, contemporary large-scale metatranscriptomics surveys have reached a level of maturity to track microbial responses to warming oceans234, and elucidate disease-specific microbial responses in the human gut235.

Changing patterns of expression via transcriptional regulation is only one aspect of the genetic toolkit that rapidly optimizes microbial responses. Analyses of genomes and metagenomes revealed the pervasiveness of ‘phase variation’ in bacteria through inversion of regions in intergenic DNA to prevent the expression of costly genes in environments that do not require them236. Jiang and Hall et al. demonstrated that the same microbial populations that were engrafted to multiple individuals through fecal microbiota transplantation can acquire rapid and heritable changes in their promoter orientations within days, highlighting the speed by which microbes can optimize their gene use to respond to individual-specific selective forces without changing their gene content236. Another intriguing mechanism that enables microbes to change the efficiency of their existing genes is implemented by diversity generating retroelements (DGRs)237. DGRs are a family of genetic elements that generate hypervariability in target genes based on a template sequence where every adenosine is replaced with a random base prior to mutagenic homing thanks to a highly error-prone retron-type reverse transcriptase238. DGRs, which can generate billions of sequence variants in a single population within a few generations, were first discovered in a phage that infects Bordetella239, yet they are widespread in bacteria and archaea240, and the analyses of metagenomes show that they target a wide variety of functions, including those that are involved in protein binding, carbohydrate binding, and cell adhesion241. Within-population variants of DGR targets recovered from metagenomes can yield strong biogeographical partitioning242, suggesting that selective forces optimize the pool of random variants to match the fitness requirements imposed by environmental conditions.
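The adenine-specific mutagenesis that underlies DGR hypervariability can be mimicked with a short Python simulation. In this sketch every adenine in a template is replaced by a base drawn uniformly from A, C, G and T, while all other positions are copied faithfully; the uniform draw, the function names, and the template sequence are simplifying assumptions made for illustration and do not model the actual error spectrum of the reverse transcriptase.

```python
import random
from typing import List, Optional


def dgr_variants(
    template: str,
    n_variants: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Simulate variant sequences derived from a DGR template.

    Assumption: each adenine is replaced by a uniformly random base
    (which may again be A); non-adenine positions are left unchanged.
    """
    rng = random.Random(seed)
    template = template.upper()
    variants = []
    for _ in range(n_variants):
        variant = "".join(
            rng.choice("ACGT") if base == "A" else base for base in template
        )
        variants.append(variant)
    return variants


def max_diversity(template: str) -> int:
    """Upper bound on distinct variants: four options per adenine."""
    return 4 ** template.upper().count("A")


template = "GCTAACTTAGAC"  # invented template with four adenines
variants = dgr_variants(template, n_variants=50, seed=1)
```

Under these assumptions a template with 16 adenines already allows 4^16 (about 4.3 billion) distinct variants, which is consistent in order of magnitude with the scale of diversity described above.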

Translational regulation is yet another layer of biology that enables cells from all domains of life to mount rapid responses to environmental change, this time without changing their genomes or patterns of transcription, but by changing what is prioritized by the translational machinery. Transfer RNAs (tRNAs), essential components of protein synthesis, play a significant role in this process through their relative abundances in the cell and chemical modifications that influence their interactions with the ribosome243,244. Cells can actively change their tRNA relative abundances to switch their metabolic state and prioritize the translation of immediate needs to respond to environmental conditions245. Phages and plasmids can also influence the tRNA pool as they carry large inventories of tRNAs246 and tRNA modification enzymes123 that are most likely driven by adaptive processes247 that enable direct influence on translation248. Characterization of tRNA transcripts and their chemical modifications to study the dynamics of translational regulation is well-established for cell populations that belong to the same organism, yet expanding these surveys into naturally occurring complex habitats is complicated by immense molecular and analytical challenges. For instance, due to their rigid secondary structures and chemical modifications, high-throughput sequencing of tRNA transcripts has been notoriously difficult249. However, improving molecular protocols promise broader access to tRNA sequencing250 and new opportunities to study targets and implications of translational regulation in complex systems as well. 
For instance, an application of tRNA sequencing to animal gut microbial populations revealed differences in taxon-specific tRNA chemical modification profiles as a function of diet, where highly modified tRNAs in high-fat diet condition resolved to codons that were highly enriched in highly expressed proteins compared to low-fat diet condition251, and demonstrated the feasibility of this approach to track rapid changes in translational priorities and their genetic targets. Another promising approach to study translational responses of environmental microbial populations is ribosome profiling – a strategy that gives access to ribosome-protected RNA fragments that are being translated at a given time point252. An application of ribosome profiling to human gut microbial populations identified significant differences in transcripts and what was actually translated253, highlighting the need for highly-resolved strategies that target critical components of translation to gain more complete insights into microbial responses to environmental conditions.

Insights into in situ microbial activity

From the perspective of ecological studies that seek to understand in situ processes it is important to distinguish active versus inactive microbial community members, an ideal which has become more accessible in recent years. For example, bioorthogonal non-canonical amino acid tagging (BONCAT) can visualize translationally active cells within complex environmental samples254. Another innovative approach uses fluorescence in situ hybridization (FISH) and nanoscale secondary ion mass spectrometry (FISH–nanoSIMS) in combination with ¹⁵N stable isotope probing to link metabolic activity of cells to their location in complex consortia255,256. From the perspective of metagenomic studies, an important motivation for use of methods that can identify organisms that are active in situ is the recognition that a substantial fraction of DNA (e.g., in soil) is from dead organisms or is extracellular257. This may in part be a reflection of the fact that soils are incredibly biodiverse and display rapid shifts in community composition over the seasons. A key approach to identify the subset of organisms that are active at any time is to leverage isotopes, which are selectively incorporated into proteins258 or DNA259 of actively growing organisms. Typically, stable isotope amendments are applied to samples of interest shortly before extraction of proteins or DNA. In the case of proteins, labeled vs. unlabeled proteins (e.g., containing ¹⁵N amended in the form of ammonia) are distinguished using highly accurate mass spectrometry measurements of peptides and their fragmentation products260. In the case of DNA, fractions are separated by weight and different fractions sequenced independently. Implementation examples include growth of plants in a ¹³CO₂ atmosphere and tracking of ¹³C into the rhizosphere microbiome or addition of isotopically heavy water that is tracked into organisms that incorporate oxygen into de novo synthesized DNA. 
In the original implementation, referred to as qSIP (quantitative stable isotope probing), the heavy fraction (after consideration of the effect of genome GC content) is linked to actively growing organisms259. For example, one study used H₂¹⁸O addition and quantitative PCR of 16S rRNA genes to track growth and turnover of bacteria and fungi and thus quantified these contributions to the CO₂ flux from soil following the water addition to dry grassland soil261. Subsequently, DNA labeling was tracked into whole metagenomes, enabling stable isotope-informed genome-resolved metagenomics262. Despite the potential for artifacts when, for example, field-collected soil samples are necessarily incubated with heavy water in the laboratory, stable isotope labeling is emerging as an important method for the study of microbial ecosystem dynamics.

Emerging opportunities enabled by genome editing of communities

Most current biological knowledge was generated via experiments in which a specific microbe was grown as an isolate (i.e., in a pure culture) and a certain gene was knocked out or added experimentally to determine its function, and much of it involved just one organism, E. coli. However, the tree of life is vast, and the suite of isolates available for traditional genetic manipulation is extremely limited. Even after functional assignments that leverage structural similarity, there are vast numbers of proteins for which functions cannot be predicted.

Why do microbes grow as members of communities, rather than evolve into multicellular organisms such as trees or insects? The benefits may relate to adaptability to changing environmental conditions and likely come from flexibility in community membership and efficiencies gained via cooperative activities. As a consequence, microbes often depend on other organisms for key resources and devote genetic potential to communication and competition. These facets of microbial activity are understandably understudied because the majority of genetic and biochemical research conducted to date has been performed on organisms growing alone. In the absence of new methods that enable genetic manipulations of organisms without isolation, this is an essentially insurmountable problem. Understanding of the genes and pathways via which microbes interact with each other (and macroorganisms) requires the ability to perform “gold standard” genetic manipulation without the requirement for isolation, in other words, genome editing of microbes while they exist in the context of their communities. With this approach, not only can genes involved in interaction be interrogated, but organisms that cannot be grown in isolation also become tractable for experimental manipulation.

If one is to edit a microbiome using CRISPR-Cas methods, there are several key requirements. First, stable, reproducible laboratory-based microbiomes are needed for method development and testing. Next, it is vital that the sequence that is targeted for modification is known. For this reason, genome-resolved metagenomics is foundational. The third challenge is to find a means by which the genome editing tools can be delivered into the targeted cells, and this is likely to be microbe-specific (e.g., conjugation, natural competence). To address this challenge, Rubin and others developed ET-seq and, to enable organism- and locus-specific genetic manipulation, a DNA-editing all-in-one RNA-guided CRISPR–Cas transposase (DART)263. Although direct editing of a human infant microbiome member was achieved, the editing efficiency was low and editing occurred in well-studied bacteria. Future delivery improvements are needed to target diverse organisms from little-studied branches of the tree of life before this potentially revolutionary approach can reach its potential.

Genetic manipulation in community context has been explored in the mouse microbiome, ultimately, with the goal of engineering communities in situ. The first step recovered engineered strains for reintroduction into the gut to alter overall community function. To do this, Ronda et al. developed Metagenomic Alteration of Gut microbiome by In situ Conjugation (MAGIC)264. The method harnesses horizontal gene transfer to genetically modify gut bacteria. A vector introduces green fluorescent protein labeling to recipient cells that can then be cell sorted, or introduces antibiotic resistance that can be selected for. Libraries included vectors bearing replication elements suitable for different gut bacteria. Among the challenges identified was long-term stability of the engineered genetic constructs, an issue that was somewhat addressed by isolation and engineering of host-derived strains that could mediate further transfer of engineered functions in the microbiome.

A recent study reported the important accomplishment of genetic editing of a population of episymbiotic Saccharibacteria in the context of a co-culture with host Actinobacteria265. The experiments harnessed the natural competence of the Saccharibacteria to insert heterologous sequences and targeted gene deletions. This enabled identification of novel Saccharibacterial genes as important for the growth of these cell surface-attached symbionts on their Actinobacteria host cells.

Ultimately, the ability to genetically modify strains of interest in microbiome context will make it possible to assign functions to a vast spectrum of novel genes and enable precise tuning of microbiome composition and function. By microbiome manipulation, a vast range of desirable outcomes of societal and environmental importance may be achieved.

A shifting mindset to address the challenges of our modern era

A hallmark of modern microbiology is the plethora of new kinds of data generated from inconceivably intricate microbial systems as a whole with brand new molecular and computational strategies. Taking full advantage of our observational powers demands that we focus on improving many of the well-established practices in microbiology: from how we responsibly disseminate data for transparency and reproducibility, and how we train life scientists to better exploit new data streams and computational resources, to how we encourage, support, and promote team science efforts to meet our immense need for integration when we mostly credit the first and/or last names that appear in final publications. Apart from such tangible issues that can be solved through tangible actions, some aspects of modern microbiology can also benefit from discussions regarding how the classical determinants of good scientific practice apply to this new era in which our observational powers far exceed our ability to interpret the data they yield. For instance, how critical is it to lead contemporary research in microbiology with the expectation of testable hypotheses as the first and foremost requirement to benchmark or implement new ideas? And how should we recognize, prioritize, or promote the synthesis of research products from laboratory experiments or data-driven explorations?

We believe it is important to recognize that hypothesis-driven research is inevitably limited by the ability of the human mind to imagine facets of microbiology that may be far from what we know today. Indeed, from the discovery of penicillin to insulin, from radioactivity to X-rays, the history of science has a long list of breakthroughs that emerged as unintended consequences of exploratory research and serendipity. For example, although the CRISPR-Cas system could theoretically be conceived of de novo, its accidental discovery through microbial genome sequencing provided a new window on virus-host interactions and ultimately opened the path for a revolution in biotechnology. Sequences from bacterial host CRISPR loci and coexisting viruses revealed infecting bacteriophage genomes as the source of the CRISPR spacer sequences266–268, but biochemical experiments were essential for uncovering mechanisms and thus enabling adaptation for targeted editing of eukaryotes269, with almost unimaginable impacts in medicine, agriculture, and basic science. This and other examples underline the opportunities that arise from exploratory research, and its confluence with laboratory experimentation, once the hypotheses reveal themselves from exploration.

In a world of unknowns, rigorous exploratory research provides the basis for hypothesis-driven research. Thus, given the huge scale and the microbial underpinnings of many of the problems of the modern era, and the hope and potential for microbiome-based solutions, we advocate increasingly deploying the methods of microbiology in the context of ecosystems, whether they be the ocean, agricultural land, wetlands, municipal landfills, or others, and exploring open-mindedly and, dare we say it, without asking for or testing any strict hypotheses. Only by developing a concrete understanding of how these ecosystems seem to work, and only after conducting targeted, field-scale experiments within them, will it be possible to tackle questions of utmost importance for the wellbeing of us and our planet, such as “how do we store more carbon in soil?”, “how do we cut trace gas emissions from rice paddies?”, “what can we do to reduce methane releases from cattle?”, “how do we improve the quality of marginal drinking water, accelerate the breakdown of ocean plastic, reduce salinization of agricultural soils, and use the Earth’s subsurface to store CO2?”, “can food biotechnologies be improved to make, for example, vegan meat alternatives more palatable?”, and “how do manipulations of diet, probiotics, or environmental exposures alter the development of human microbiomes, and can changes in these factors address the ever-growing list of microbiome-related diseases?”. For example, given that around 40% of the land surface is already highly manipulated in support of food production270, agricultural systems are a logical target to learn how microbiome functionality can be modulated to reduce greenhouse gas emissions271. Yet our ability to implement investigations of such magnitude in terrestrial, marine, or host-associated habitats requires a mindset that embraces natural complexity and remains open to experiments in such systems at large scale, with integrated analyses of readouts ranging from microbiome composition to biogeochemical fluxes to crop productivity or health status.

These considerations are particularly relevant to highly critical ecosystem-level questions such as “how will future climate scenarios alter labile and recalcitrant dissolved organic carbon pools in warming oceans and impact marine food webs and the atmosphere?”, “how will nitrous oxide emissions from grassland soil be altered by shifts in rainfall amount and timing or temperature driven by climate change?”, or “how does soil recover after severe wildfires and how does the patchiness of burnt areas impact the reestablishment of healthy soil microbiomes?”. For questions of this magnitude, harnessing natural events and variation over temporal and geographical gradients may have greater potential to improve our understanding of key processes and guide modeling efforts to predict ecosystem behavior than artificial or small-scale controlled disruptions designed to test hypotheses. In other words, sometimes there are reasons to let nature do the experiment.

Our review highlights the motivation, progress, and potential of the data-driven epoch of microbiology to tackle systems of great complexity with vast scales of unknowns. A critical admission in our search for solutions to the most pressing needs of our planet is that the complexity of microbial systems that modern microbiology aims to elucidate may not always be compatible with the desire for hypothesis-driven research. Thus, from the level of PhD committees to funding agencies, a strict requirement for ‘hypotheses’ should not be a critical determinant of the fate of research proposals. While we should not hold data products to a lesser degree of scrutiny, we also should not trivialize data-driven investigations at the level of grant panels or editorial decisions. In the quest to understand how nature works, it is important to embrace complexity while looking for means to reveal its driving principles.

Acknowledgements

The authors thank Steve Giovannoni, André Marques, Tom Delmont, Jessika Füssel, Carlotta Ronda, Lin-Xing Chen, Rodolphe Barrangou, Mary Ann Moran, as well as the members of the Meren Lab (https://merenlab.org/) and the Banfield Lab (https://www.banfieldlab.com/) for helpful discussions. This effort was supported by the Center for Chemical Currencies of a Microbial Planet (C-CoMP) (NSF Award OCE-2019589, C-CoMP publication #049), and a grant from the National Institutes of Health (NIH award RAI092531A).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

This review explores the history of microbiology and contemplates the future of this evolving discipline as microbiologists embrace rapidly emerging technologies to explore systems of great complexity and answer fundamental questions of the modern microbiology era.

References

  • 1. Watterson WJ, Tanyeri M, Watson AR, Cham CM, Shan Y, Chang EB, Eren AM, and Tay S (2020). Droplet-based high-throughput cultivation for accurate screening of antibiotic resistant gut microbes. Elife 9. 10.7554/eLife.56998.
  • 2. Welch JLM, Rossetti BJ, Rieken CW, Dewhirst FE, and Borisy GG (2016). Biogeography of a human oral microbiome at the micron scale. Proceedings of the National Academy of Sciences 113, E791–E800.
  • 3. Carroll BL, Nishikino T, Guo W, Zhu S, Kojima S, Homma M, and Liu J (2020). The flagellar motor of Vibrio alginolyticus undergoes major structural remodeling during rotational switching. Elife 9. 10.7554/eLife.61446.
  • 4. Quinn A, El Chazli Y, Escrig S, Daraspe J, Neuschwander N, McNally A, Genoud C, Meibom A, and Engel P (2024). Host-derived organic acids enable gut colonization of the honey bee symbiont Snodgrassella alvi. Nat Microbiol 9, 477–489.
  • 5. Musat N, Halm H, Winterholler B, Hoppe P, Peduzzi S, Hillion F, Horreard F, Amann R, Jørgensen BB, and Kuypers MMM (2008). A single-cell view on the ecophysiology of anaerobic phototrophic bacteria. Proc. Natl. Acad. Sci. U. S. A 105, 17861–17866.
  • 6. Moran MA, Kujawinski EB, Schroer WF, Amin SA, Bates NR, Bertrand EM, Braakman R, Brown CT, Covert MW, Doney SC, et al. (2022). Microbial metabolites in the marine carbon cycle. Nat Microbiol 7, 508–523.
  • 7. Cross KL, Campbell JH, Balachandran M, Campbell AG, Cooper CJ, Griffen A, Heaton M, Joshi S, Klingeman D, Leys E, et al. (2019). Targeted isolation and cultivation of uncultivated bacteria by reverse genomics. Nat. Biotechnol 37, 1314–1321.
  • 8. Stepanauskas R, Fergusson EA, Brown J, Poulton NJ, Tupper B, Labonté JM, Becraft ED, Brown JM, Pachiadaki MG, Povilaitis T, et al. (2017). Improved genome recovery and integrated cell-size analyses of individual uncultured microbial cells and viral particles. Nat. Commun 8, 84.
  • 9. Jin W-B, Li T-T, Huo D, Qu S, Li XV, Arifuzzaman M, Lima SF, Shi H-Q, Wang A, Putzel GG, et al. (2022). Genetic manipulation of gut microbes enables single-gene interrogation in a complex microbiome. Cell 185, 547–562.e22.
  • 10. Johnston C, Martin B, Fichant G, Polard P, and Claverys J-P (2014). Bacterial transformation: distribution, shared mechanisms and divergent control. Nat. Rev. Microbiol 12, 181–196.
  • 11. Hutchison CA 3rd, Chuang R-Y, Noskov VN, Assad-Garcia N, Deerinck TJ, Ellisman MH, Gill J, Kannan K, Karas BJ, Ma L, et al. (2016). Design and synthesis of a minimal bacterial genome. Science 351, aad6253.
  • 12. Lu Y, Wu K, Jiang Y, Guo Y, and Desneux N (2012). Widespread adoption of Bt cotton and insecticide decrease promotes biocontrol services. Nature 487, 362–365.
  • 13. Kaur R, Leigh BA, Ritchie IT, and Bordenstein SR (2022). The Cif proteins from Wolbachia prophage WO modify sperm genome integrity to establish cytoplasmic incompatibility. PLoS Biol 20, e3001584.
  • 14. Handelsman J (2004). Metagenomics: application of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev 68, 669–685.
  • 15. Watson JD, and Crick FH (1953). Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171, 737–738.
  • 16. Zuckerkandl E, and Pauling L (1965). Molecules as documents of evolutionary history. J. Theor. Biol 8, 357–366.
  • 17. Sanger F, Nicklen S, and Coulson AR (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. U. S. A 74, 5463–5467.
  • 18. Woese CR, and Fox GE (1977). Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc. Natl. Acad. Sci. U. S. A 74, 5088–5090.
  • 19. Lane DJ, Pace B, Olsen GJ, Stahl DA, Sogin ML, and Pace NR (1985). Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proc. Natl. Acad. Sci. U. S. A 82, 6955–6959.
  • 20. Pace NR, Stahl DA, Lane DJ, and Olsen GJ (1986). The Analysis of Natural Microbial Populations by Ribosomal RNA Sequences. In Advances in Microbial Ecology, Marshall KC, ed. (Springer US), pp. 1–55.
  • 21. Böttger EC (1989). Rapid determination of bacterial ribosomal RNA sequences by direct sequencing of enzymatically amplified DNA. FEMS Microbiol. Lett 65, 171–176.
  • 22. DeLong EF, and Pace NR (2001). Environmental diversity of bacteria and archaea. Syst. Biol 50, 470–478.
  • 23. López-García P, Rodríguez-Valera F, Pedrós-Alió C, and Moreira D (2001). Unexpected diversity of small eukaryotes in deep-sea Antarctic plankton. Nature 409, 603–607.
  • 24. Sogin ML, Morrison HG, Huber JA, Welch DM, Huse SM, Neal PR, Arrieta JM, and Herndl GJ (2006). Microbial diversity in the deep sea and the underexplored “rare biosphere.” Proceedings of the National Academy of Sciences 103, 12115–12120.
  • 25. Fierer N, and Jackson RB (2006). The diversity and biogeography of soil bacterial communities. Proc. Natl. Acad. Sci. U. S. A 103, 626–631.
  • 26. Ley RE, Bäckhed F, Turnbaugh P, Lozupone CA, Knight RD, and Gordon JI (2005). Obesity alters gut microbial ecology. Proc. Natl. Acad. Sci. U. S. A 102, 11070–11075.
  • 27. Blaser MJ, Cardon ZG, Cho MK, Dangl JL, Donohue TJ, Green JL, Knight R, Maxon ME, Northen TR, Pollard KS, et al. (2016). Toward a Predictive Understanding of Earth’s Microbiomes to Address 21st Century Challenges. MBio 7. 10.1128/mBio.00714-16.
  • 28. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, and Merrick JM (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512.
  • 29. Hug LA, Baker BJ, Anantharaman K, Brown CT, Probst AJ, Castelle CJ, Butterfield CN, Hernsdorf AW, Amano Y, Ise K, et al. (2016). A new view of the tree of life. Nat Microbiol 1, 16048.
  • 30. Handelsman J, Rondon MR, Brady SF, Clardy J, and Goodman RM (1998). Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chem. Biol 5, R245–R249.
  • 31. Stein JL, Marsh TL, Wu KY, Shizuya H, and DeLong EF (1996). Characterization of uncultivated prokaryotes: isolation and analysis of a 40-kilobase-pair genome fragment from a planktonic marine archaeon. J. Bacteriol 178, 591–599.
  • 32. Béjà O, Aravind L, Koonin EV, Suzuki MT, Hadd A, Nguyen LP, Jovanovich SB, Gates CM, Feldman RA, Spudich JL, et al. (2000). Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289, 1902–1906.
  • 33. Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, and Rohwer F (2002). Genomic analysis of uncultured marine viral communities. Proc. Natl. Acad. Sci. U. S. A 99, 14250–14255.
  • 34. Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, Solovyev VV, Rubin EM, Rokhsar DS, and Banfield JF (2004). Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428, 37–43.
  • 35. Tamaki H (2019). Cultivation Renaissance in the Post-Metagenomics Era: Combining the New and Old. Microbes Environ 34, 117–120.
  • 36. Tyson GW, and Banfield JF (2005). Cultivating the uncultivated: a community genomics perspective. Trends Microbiol 13, 411–415.
  • 37. Tyson GW, Lo I, Baker BJ, Allen EE, Hugenholtz P, and Banfield JF (2005). Genome-directed isolation of the key nitrogen fixer Leptospirillum ferrodiazotrophum sp. nov. from an acidophilic microbial community. Appl. Environ. Microbiol 71, 6319–6324.
  • 38. Omsland A, Cockrell DC, Howe D, Fischer ER, Virtaneva K, Sturdevant DE, Porcella SF, and Heinzen RA (2009). Host cell-free growth of the Q fever bacterium Coxiella burnetii. Proc. Natl. Acad. Sci. U. S. A 106, 4430–4434.
  • 39. Maurin M, and Raoult D (1999). Q fever. Clin. Microbiol. Rev 12, 518–553.
  • 40. Pope PB, Smith W, Denman SE, Tringe SG, Barry K, Hugenholtz P, McSweeney CS, McHardy AC, and Morrison M (2011). Isolation of Succinivibrionaceae implicated in low methane emissions from Tammar wallabies. Science 333, 646–648.
  • 41. Lugli GA, Milani C, Duranti S, Alessandri G, Turroni F, Mancabelli L, Tatoni D, Ossiprandi MC, van Sinderen D, and Ventura M (2019). Isolation of novel gut bifidobacteria using a combination of metagenomic and cultivation approaches. Genome Biol 20, 96.
  • 42. Karnachuk OV, Lukina AP, Kadnikov VV, Sherbakova VA, Beletsky AV, Mardanov AV, and Ravin NV (2021). Targeted isolation based on metagenome-assembled genomes reveals a phylogenetically distinct group of thermophilic spirochetes from deep biosphere. Environ. Microbiol 23, 3585–3598.
  • 43. Wrighton KC, Thomas BC, Sharon I, Miller CS, Castelle CJ, VerBerkmoes NC, Wilkins MJ, Hettich RL, Lipton MS, Williams KH, et al. (2012). Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science 337, 1661–1665.
  • 44. He X, McLean JS, Edlund A, Yooseph S, Hall AP, Liu S-Y, Dorrestein PC, Esquenazi E, Hunter RC, Cheng G, et al. (2015). Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle. Proc. Natl. Acad. Sci. U. S. A 112, 244–249.
  • 45. Luef B, Frischkorn KR, Wrighton KC, Holman H-YN, Birarda G, Thomas BC, Singh A, Williams KH, Siegerist CE, Tringe SG, et al. (2015). Diverse uncultivated ultra-small bacterial cells in groundwater. Nat. Commun 6, 6372.
  • 46. Moreira D, Zivanovic Y, López-Archilla AI, Iniesto M, and López-García P (2021). Reductive evolution and unique predatory mode in the CPR bacterium Vampirococcus lugosii. Nat. Commun 12, 2454.
  • 47. Kuroda K, Nakajima M, Nakai R, Hirakata Y, Kagemasa S, Kubota K, Noguchi TQP, Yamamoto K, Satoh H, Nobu MK, et al. (2024). Microscopic and metatranscriptomic analyses revealed unique cross-domain parasitism between phylum Candidatus Patescibacteria/candidate phyla radiation and methanogenic archaea in anaerobic ecosystems. MBio 15, e0310223.
  • 48. Bouderka F, López-García P, Deschamps P, Krupovic M, Gutiérrez-Preciado A, Ciobanu M, Bertolino P, David G, Moreira D, and Jardillier L (2024). Culture- and genome-based characterization of a tripartite interaction between patescibacterial epibionts, methylotrophic proteobacteria, and a jumbo phage in freshwater ecosystems. bioRxiv, 2024.03.08.584096. 10.1101/2024.03.08.584096.
  • 49. Rappé MS, Connon SA, Vergin KL, and Giovannoni SJ (2002). Cultivation of the ubiquitous SAR11 marine bacterioplankton clade. Nature 418, 630–633.
  • 50. Kaeberlein T, Lewis K, and Epstein SS (2002). Isolating “uncultivable” microorganisms in pure culture in a simulated natural environment. Science 296, 1127–1129.
  • 51. Garcia SL (2016). Mixed cultures as model communities: hunting for ubiquitous microorganisms, their partners, and interactions. Aquat. Microb. Ecol 77, 79–85.
  • 52. Jiang C-Y, Dong L, Zhao J-K, Hu X, Shen C, Qiao Y, Zhang X, Wang Y, Ismagilov RF, Liu S-J, et al. (2016). High-Throughput Single-Cell Cultivation on Microfluidic Streak Plates. Appl. Environ. Microbiol 82, 2210–2218.
  • 53. Huang Y, Sheth RU, Zhao S, Cohen LA, Dabaghi K, Moody T, Sun Y, Ricaurte D, Richardson M, Velez-Cortes F, et al. (2023). High-throughput microbial culturomics using automation and machine learning. Nat. Biotechnol 41, 1424–1433.
  • 54. Ma L, Kim J, Hatzenpichler R, Karymov MA, Hubert N, Hanan IM, Chang EB, and Ismagilov RF (2014). Gene-targeted microfluidic cultivation validated by isolation of a gut bacterium listed in Human Microbiome Project’s Most Wanted taxa. Proc. Natl. Acad. Sci. U. S. A 111, 9768–9773.
  • 55. Fodor AA, DeSantis TZ, Wylie KM, Badger JH, Ye Y, Hepburn T, Hu P, Sodergren E, Liolios K, Huot-Creasy H, et al. (2012). The “most wanted” taxa from the human microbiome for whole genome sequencing. PLoS One 7, e41294.
  • 56. Boedicker JQ, Vincent ME, and Ismagilov RF (2009). Microfluidic confinement of single cells of bacteria in small volumes initiates high-density behavior of quorum sensing and growth and reveals its variability. Angew. Chem. Int. Ed Engl 48, 5908–5911.
  • 57. Spang A, Saw JH, Jørgensen SL, Zaremba-Niedzwiedzka K, Martijn J, Lind AE, van Eijk R, Schleper C, Guy L, and Ettema TJG (2015). Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179.
  • 58. Eme L, Spang A, Lombard J, Stairs CW, and Ettema TJG (2017). Archaea and the origin of eukaryotes. Nat. Rev. Microbiol 15, 711–723.
  • 59. Castelle CJ, and Banfield JF (2018). Major New Microbial Groups Expand Diversity and Alter our Understanding of the Tree of Life. Cell 172, 1181–1197.
  • 60. Ding S, Hamm JN, Bale NJ, Sinninghe Damsté JS, and Spang A (2024). Selective lipid recruitment by an archaeal DPANN symbiont from its host. Nat. Commun 15, 3405.
  • 61. Witwinowski J, Sartori-Rupp A, Taib N, Pende N, Tham TN, Poppleton D, Ghigo J-M, Beloin C, and Gribaldo S (2022). An ancient divide in outer membrane tethering systems in bacteria suggests a mechanism for the diderm-to-monoderm transition. Nat Microbiol 7, 411–422.
  • 62. Chen L-X, Anantharaman K, Shaiber A, Eren AM, and Banfield JF (2020). Accurate and complete genomes from metagenomes. Genome Res 30, 315–333.
  • 63. Shaiber A, and Eren AM (2019). Composite Metagenome-Assembled Genomes Reduce the Quality of Public Genome Repositories. MBio 10. 10.1128/mBio.00725-19.
  • 64. Orakov A, Fullam A, Coelho LP, Khedkar S, Szklarczyk D, Mende DR, Schmidt TSB, and Bork P (2021). GUNC: detection of chimerism and contamination in prokaryotic genomes. Genome Biol 22, 178.
  • 65. Sacristán-Horcajada E, González-de la Fuente S, Peiró-Pastor R, Carrasco-Ramiro F, Amils R, Requena JM, Berenguer J, and Aguado B (2021). ARAMIS: From systematic errors of NGS long reads to accurate assemblies. Brief. Bioinform 22. 10.1093/bib/bbab170.
  • 66. Wick RR, and Holt KE (2022). Polypolish: Short-read polishing of long-read bacterial genome assemblies. PLoS Comput. Biol 18, e1009802.
  • 67. Hackl T, Trigodet F, Murat Eren A, Biller SJ, Eppley JM, Luo E, Burger A, DeLong EF, and Fischer MG (2021). proovframe: frameshift-correction for long-read (meta)genomics. bioRxiv, 2021.08.23.457338. 10.1101/2021.08.23.457338.
  • 68. Kolmogorov M, Bickhart DM, Behsaz B, Gurevich A, Rayko M, Shin SB, Kuhn K, Yuan J, Polevikov E, Smith TPL, et al. (2020). metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110.
  • 69. Feng X, Cheng H, Portik D, and Li H (2022). Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nat. Methods 19, 671–674.
  • 70. Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, and Quince C (2024). High-quality metagenome assembly from long accurate reads with metaMDBG. Nat. Biotechnol 10.1038/s41587-023-01983-6.
  • 71. Méheust R, Castelle CJ, Jaffe AL, and Banfield JF (2022). Conserved and lineage-specific hypothetical proteins may have played a central role in the rise and diversification of major archaeal groups. BMC Biol 20, 154.
  • 72. Rohwer F, and Edwards R (2002). The Phage Proteomic Tree: a genome-based taxonomy for phage. J. Bacteriol 184, 4529–4535.
  • 73. Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 10.1038/s41586-024-07487-w.
  • 74. Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD, et al. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876.
  • 75. Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, Smetanin N, Verkuil R, Kabeli O, Shmueli Y, et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130.
  • 76. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, and Bourne PE (2000). The Protein Data Bank. Nucleic Acids Res 28, 235–242.
  • 77. Munsamy G, Bohnuud T, and Lorenz P (2024). Improving AlphaFold2 performance with a global metagenomic & biological data supply chain. bioRxiv, 2024.03.06.583325. 10.1101/2024.03.06.583325.
  • 78. Ovchinnikov S, Park H, Varghese N, Huang P-S, Pavlopoulos GA, Kim DE, Kamisetty H, Kyrpides NC, and Baker D (2017). Protein structure determination using metagenome sequence data. Science 355, 294–298.
  • 79. Flamholz ZN, Biller SJ, and Kelly L (2024). Large language models improve annotation of prokaryotic viral proteins. Nat Microbiol 9, 537–549.
  • 80. Xiang G, Li Y, Sun J, Huo Y, Cao S, Cao Y, Guo Y, Yang L, Cai Y, Zhang YE, et al. (2024). Evolutionary mining and functional characterization of TnpB nucleases identify efficient miniature genome editors. Nat. Biotechnol 42, 745–757.
  • 81. Heinzinger M, Weissenow K, Sanchez JG, Henkel A, Steinegger M, and Rost B (2023). ProstT5: Bilingual Language Model for Protein Sequence and Structure. bioRxiv, 2023.07.23.550085. 10.1101/2023.07.23.550085.
  • 82. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, and Steinegger M (2022). ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682.
  • 83. van Kempen M, Kim SS, Tumescheit C, Mirdita M, Lee J, Gilchrist CLM, Söding J, and Steinegger M (2024). Fast and accurate protein structure search with Foldseek. Nat. Biotechnol 42, 243–246.
  • 84. Prochazka D, Slaninakova T, Olha J, Rosinec A, Gresova K, Janosova M, Cillik J, Porubska J, Svobodova R, Dohnal V, et al. (2024). AlphaFind: Discover structure similarity across the entire known proteome. bioRxiv 10.1101/2024.02.15.580465.
  • 85. Borkakoti N, and Thornton JM (2023). AlphaFold2 protein structure prediction: Implications for drug discovery. Curr. Opin. Struct. Biol 78, 102526.
  • 86. Shor B, and Schneidman-Duhovny D (2024). CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nat. Methods 21, 477–487.
  • 87. Krupovic M, Dolja VV, and Koonin EV (2020). The LUCA and its complex virome. Nat. Rev. Microbiol 18, 661–670.
  • 88. Koonin EV, and Martin W (2005). On the origin of genomes and cells within inorganic compartments. Trends Genet 21, 647–654.
  • 89. Forterre P (2013). The virocell concept and environmental microbiology. ISME J 7, 233–236.
  • 90. Rosenwasser S, Ziv C, van Creveld SG, and Vardi A (2016). Virocell Metabolism: Metabolic Innovations During Host-Virus Interactions in the Ocean. Trends Microbiol 24, 821–832.
  • 91. Fuhrman JA (1999). Marine viruses and their biogeochemical and ecological effects. Nature 399, 541–548.
  • 92. LePage DP, Metcalf JA, Bordenstein SR, On J, Perlmutter JI, Shropshire JD, Layton EM, Funkhouser-Jones LJ, Beckmann JF, and Bordenstein SR (2017). Prophage WO genes recapitulate and enhance Wolbachia-induced cytoplasmic incompatibility. Nature 543, 243–247.
  • 93. Roux S, Enault F, Hurwitz BL, and Sullivan MB (2015). VirSorter: mining viral signal from microbial genomic data. PeerJ 3, e985.
  • 94. Kieft K, Zhou Z, and Anantharaman K (2020). VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences. Microbiome 8, 90.
  • 95. Camargo AP, Roux S, Schulz F, Babinski M, Xu Y, Hu B, Chain PSG, Nayfach S, and Kyrpides NC (2023). Identification of mobile genetic elements with geNomad. Nat. Biotechnol 10.1038/s41587-023-01953-y.
  • 96. Camarillo-Guerrero LF, Almeida A, Rangel-Pineros G, Finn RD, and Lawley TD (2021). Massive expansion of human gut bacteriophage diversity. Cell 184, 1098–1109.e9.
  • 97. Gregory AC, Zayed AA, Conceição-Neto N, Temperton B, Bolduc B, Alberti A, Ardyna M, Arkhipova K, Carmichael M, Cruaud C, et al. (2019). Marine DNA Viral Macro- and Microdiversity from Pole to Pole. Cell 177, 1109–1123.e14.
  • 98. Ma B, Wang Y, Zhao K, Stirling E, Lv X, Yu Y, Hu L, Tang C, Wu C, Dong B, et al. (2024). Biogeographic patterns and drivers of soil viromes. Nat Ecol Evol 8, 717–728.
  • 99. Callanan J, Stockdale SR, Shkoporov A, Draper LA, Ross RP, and Hill C (2020). Expansion of known ssRNA phage genomes: From tens to over a thousand. Sci Adv 6, eaay5981.
  • 100. Crits-Christoph A, Kantor RS, Olm MR, Whitney ON, Al-Shayeb B, Lou YC, Flamholz A, Kennedy LC, Greenwald H, Hinkle A, et al. (2021). Genome Sequencing of Sewage Detects Regionally Prevalent SARS-CoV-2 Variants. MBio 12. 10.1128/mBio.02703-20.
  • 101. Abrahão J, Silva L, Silva LS, Khalil JYB, Rodrigues R, Arantes T, Assis F, Boratto P, Andrade M, Kroon EG, et al. (2018). Tailed giant Tupanvirus possesses the most complete translational apparatus of the known virosphere. Nat. Commun 9, 749.
  • 102. Raoult D, Audic S, Robert C, Abergel C, Renesto P, Ogata H, La Scola B, Suzan M, and Claverie J-M (2004). The 1.2-megabase genome sequence of Mimivirus. Science 306, 1344–1350.
  • 103. Schulz F, Abergel C, and Woyke T (2022). Giant virus biology and diversity in the era of genome-resolved metagenomics. Nat. Rev. Microbiol 20, 721–736.
  • 104. Moniruzzaman M, Martinez-Gutierrez CA, Weinheimer AR, and Aylward FO (2020). Dynamic genome evolution and complex virocell metabolism of globally-distributed giant viruses. Nat. Commun 11, 1710.
  • 105. Schulz F, Roux S, Paez-Espino D, Jungbluth S, Walsh DA, Denef VJ, McMahon KD, Konstantinidis KT, Eloe-Fadrosh EA, Kyrpides NC, et al. (2020). Giant virus diversity and host interactions through global metagenomics. Nature 578, 432–436.
  • 106. Moniruzzaman M, Weinheimer AR, Martinez-Gutierrez CA, and Aylward FO (2020). Widespread endogenization of giant viruses shapes genomes of green algae. Nature 588, 141–145.
  • 107.Koonin EV, Dolja VV, Krupovic M, Varsani A, Wolf YI, Yutin N, Zerbini FM, and Kuhn JH (2020). Global Organization and Proposed Megataxonomy of the Virus World. Microbiol. Mol. Biol. Rev 84. 10.1128/MMBR.00061-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108.Siddell SG, Smith DB, Adriaenssens E, Alfenas-Zerbini P, Dutilh BE, Garcia ML, Junglen S, Krupovic M, Kuhn JH, Lambert AJ, et al. (2023). Virus taxonomy and the role of the International Committee on Taxonomy of Viruses (ICTV). J. Gen. Virol 104. 10.1099/jgv.0.001840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.Gaïa M, Meng L, Pelletier E, Forterre P, Vanni C, Fernandez-Guerra A, Jaillon O, Wincker P, Ogata H, Krupovic M, et al. (2023). Mirusviruses link herpesviruses to giant viruses. Nature 616, 783–789. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Strathdee SA, Hatfull GF, Mutalik VK, and Schooley RT (2023). Phage therapy: From biological mechanisms to future directions. Cell 186, 17–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.Pausch P, Al-Shayeb B, Bisom-Rapp E, Tsuchida CA, Li Z, Cress BF, Knott GJ, Jacobsen SE, Banfield JF, and Doudna JA (2020). CRISPR-CasΦ from huge phages is a hypercompact genome editor. Science 369, 333–337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112.Selle K, Fletcher JR, Tuson H, Schmitt DS, McMillan L, Vridhambal GS, Rivera AJ, Montgomery SA, Fortier L-C, Barrangou R, et al. (2020). In Vivo Targeting of Clostridioides difficile Using Phage-Delivered CRISPR-Cas3 Antimicrobials. MBio 11. 10.1128/mBio.00019-20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113.Nethery MA, Hidalgo-Cantabrana C, Roberts A, and Barrangou R (2022). CRISPR-based engineering of phages for in situ bacterial base editing. Proc. Natl. Acad. Sci. U. S. A 119, e2206744119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Gencay YE, Jasinskytė D, Robert C, Semsey S, Martínez V, Petersen AØ, Brunner K, de Santiago Torio A, Salazar A, Turcu IC, et al. (2024). Engineered phage with antibacterial CRISPR-Cas selectively reduce E. coli burden in mice. Nat. Biotechnol 42, 265–274. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Raoult D, and Forterre P (2008). Redefining viruses: lessons from Mimivirus. Nat. Rev. Microbiol 6, 315–319. [DOI] [PubMed] [Google Scholar]
  • 116.Moreira D, and López-García P (2009). Ten reasons to exclude viruses from the tree of life. Nat. Rev. Microbiol 7, 306–311. [DOI] [PubMed] [Google Scholar]
  • 117.Koonin EV, and Starokadomskyy P (2016). Are viruses alive? The replicator paradigm sheds decisive light on an old but misguided question. Stud. Hist. Philos. Biol. Biomed. Sci 59, 125–134. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Lederberg J (1998). Plasmid (1952–1997). Plasmid 39, 1–9. [DOI] [PubMed] [Google Scholar]
  • 119.Prazeres DMF, and Monteiro GA (2014). Plasmid Biopharmaceuticals. Microbiol Spectr 2. 10.1128/microbiolspec.PLAS-0022-2014. [DOI] [PubMed] [Google Scholar]
  • 120.Rodríguez-Beltrán J, DelaFuente J, León-Sampedro R, MacLean RC, and San Millán Á (2021). Beyond horizontal gene transfer: the role of plasmids in bacterial evolution. Nat. Rev. Microbiol 19, 347–359. [DOI] [PubMed] [Google Scholar]
  • 121.Prestinaci F, Pezzotti P, and Pantosti A (2015). Antimicrobial resistance: a global multifaceted phenomenon. Pathog. Glob. Health 109, 309–318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.Alawi M, Torrijos TV, and Walsh F (2022). Plasmid-mediated antimicrobial resistance in drinking water. Environmental Advances 8, 100191. [Google Scholar]
  • 123.Yu MK, Fogarty EC, and Eren AM (2024). Diverse plasmid systems and their ecology across human gut metagenomes revealed by PlasX and MobMess. Nat Microbiol 9, 830–847. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Pfeifer E, and Rocha EPC (2024). Phage-plasmids promote recombination and emergence of phages and plasmids. Nat. Commun 15, 1545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125.Erdmann S, Tschitschko B, Zhong L, Raftery MJ, and Cavicchioli R (2017). A plasmid from an Antarctic haloarchaeon uses specialized membrane vesicles to disseminate and infect plasmid-free cells. Nat Microbiol 2, 1446–1455. [DOI] [PubMed] [Google Scholar]
  • 126.Krupovic M, Ravantti JJ, and Bamford DH (2009). Geminiviruses: a tale of a plasmid becoming a virus. BMC Evol. Biol 9, 112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Reveillaud J, Bordenstein SR, Cruaud C, Shaiber A, Esen ÖC, Weill M, Makoundou P, Lolans K, Watson AR, Rakotoarivony I, et al. (2019). The Wolbachia mobilome in Culex pipiens includes a putative plasmid. Nat. Commun 10, 1051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Pellow D, Mizrahi I, and Shamir R (2020). PlasClass improves plasmid sequence classification. PLoS Comput. Biol 16, e1007781. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.Robertson J, and Nash JHE (2018). MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies. Microb Genom 4. 10.1099/mgen.0.000206. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Rozov R, Brown Kav A, Bogumil D, Shterzer N, Halperin E, Mizrahi I, and Shamir R (2017). Recycler: an algorithm for detecting plasmids from de novo assembly graphs. Bioinformatics 33, 475–482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 131.Andreopoulos WB, Geller AM, Lucke M, Balewski J, Clum A, Ivanova NN, and Levy A (2022). Deeplasmid: deep learning accurately separates plasmids from bacterial chromosomes. Nucleic Acids Res 50, e17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132.Camargo AP, Call L, Roux S, Nayfach S, Huntemann M, Palaniappan K, Ratner A, Chu K, Mukherjee S, Reddy TBK, et al. (2024). IMG/PR: a database of plasmids from genomes and metagenomes with rich annotations and metadata. Nucleic Acids Res 52, D164–D173. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133.Pilosof S (2023). Conceptualizing microbe-plasmid communities as complex adaptive systems. Trends Microbiol 31, 672–680. [DOI] [PubMed] [Google Scholar]
  • 134.Simmonds P, Adriaenssens EM, Zerbini FM, Abrescia NGA, Aiewsakun P, Alfenas-Zerbini P, Bao Y, Barylski J, Drosten C, Duffy S, et al. (2023). Four principles to establish a universal virus taxonomy. PLoS Biol 21, e3001922. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135.Redondo-Salvo S, Fernández-López R, Ruiz R, Vielva L, de Toro M, Rocha EPC, Garcillán-Barcia MP, and de la Cruz F (2020). Pathways for horizontal gene transfer in bacteria revealed by a global map of their plasmids. Nat. Commun 11, 3602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136.Acman M, van Dorp L, Santini JM, and Balloux F (2020). Large-scale network analysis captures biological features of bacterial plasmids. Nat. Commun 11, 2452. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Harrison PW, Lower RPJ, Kim NKD, and Young JPW (2010). Introducing the bacterial “chromid”: not a chromosome, not a plasmid. Trends Microbiol 18, 141–148. [DOI] [PubMed] [Google Scholar]
  • 138.Fogarty EC, Schechter MS, Lolans K, Sheahan ML, Veseli I, Moore RM, Kiefl E, Moody T, Rice PA, Yu MK, et al. (2024). A cryptic plasmid is among the most numerous genetic elements in the human gut. Cell 187, 1206–1222.e16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139.Jarett JK, Džunková M, Schulz F, Roux S, Paez-Espino D, Eloe-Fadrosh E, Jungbluth SP, Ivanova N, Spear JR, Carr SA, et al. (2020). Insights into the dynamics between viruses and their hosts in a hot spring microbial mat. ISME J 14, 2527–2541. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 140.Diebold PJ, New FN, Hovan M, Satlin MJ, and Brito IL (2021). Linking plasmid-based beta-lactamases to their bacterial hosts using single-cell fusion PCR. Elife 10. 10.7554/eLife.66834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141.Andersson AF, and Banfield JF (2008). Virus population dynamics and acquired virus resistance in natural microbial communities. Science 320, 1047–1050. [DOI] [PubMed] [Google Scholar]
  • 142.Belton J-M, McCord RP, Gibcus JH, Naumova N, Zhan Y, and Dekker J (2012). Hi-C: a comprehensive technique to capture the conformation of genomes. Methods 58, 268–276. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143.Stalder T, Press MO, Sullivan S, Liachko I, and Top EM (2019). Linking the resistome and plasmidome to the microbiome. ISME J 13, 2437–2446. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144.Marbouty M, Thierry A, Millot GA, and Koszul R (2021). MetaHiC phage-bacteria infection network reveals active cycling phages of the healthy human gut. Elife 10. 10.7554/eLife.60608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145.Hwang Y, Roux S, Coclet C, Krause SJE, and Girguis PR (2023). Viruses interact with hosts that span distantly related microbial domains in dense hydrothermal mats. Nat Microbiol 8, 946–957. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146.Allen EE, Tyson GW, Whitaker RJ, Detter JC, Richardson PM, and Banfield JF (2007). Genome dynamics in a natural archaeal population. Proc. Natl. Acad. Sci. U. S. A 104, 1883–1888. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147.Chen L-X, Zhao Y, McMahon KD, Mori JF, Jessen GL, Nelson TC, Warren LA, and Banfield JF (2019). Wide Distribution of Phage That Infect Freshwater SAR11 Bacteria. mSystems 4. 10.1128/mSystems.00410-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Al-Shayeb B, Schoelmerich MC, West-Roberts J, Valentin-Alvarado LE, Sachdeva R, Mullen S, Crits-Christoph A, Wilkins MJ, Williams KH, Doudna JA, et al. (2022). Borgs are giant genetic elements with potential to expand metabolic capacity. Nature 610, 731–736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149.Zheludev IN, Edgar RC, Lopez-Galiano MJ, de la Peña M, Babaian A, Bhatt AS, and Fire AZ (2024). Viroid-like colonists of human microbiomes. bioRxiv 10.1101/2024.01.20.576352. [DOI] [Google Scholar]
  • 150.Lee BD, Neri U, Roux S, Wolf YI, Camargo AP, Krupovic M, RNA Virus Discovery Consortium, Simmonds P, Kyrpides N, Gophna U, et al. (2023). Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs. Cell 186, 646–661.e4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151.Boyd CM, Subramanian S, Dunham DT, Parent KN, and Seed KD (2024). A Vibrio cholerae viral satellite maximizes its spread and inhibits phage by remodeling hijacked phage coat proteins into small capsids. Elife 12. 10.7554/eLife.87611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152.de Sousa JAM, Fillol-Salom A, Penadés JR, and Rocha EPC (2023). Identification and characterization of thousands of bacteriophage satellites across bacteria. Nucleic Acids Res 51, 2759–2777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 153.Yamao F, Muto A, Kawauchi Y, Iwami M, Iwagami S, Azumi Y, and Osawa S (1985). UGA is read as tryptophan in Mycoplasma capricolum. Proc. Natl. Acad. Sci. U. S. A 82, 2306–2309. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 154.Campbell JH, O’Donoghue P, Campbell AG, Schwientek P, Sczyrba A, Woyke T, Söll D, and Podar M (2013). UGA is an additional glycine codon in uncultured SR1 bacteria from the human microbiota. Proc. Natl. Acad. Sci. U. S. A 110, 5540–5545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155.Hanke A, Hamann E, Sharma R, Geelhoed JS, Hargesheimer T, Kraft B, Meyer V, Lenk S, Osmers H, Wu R, et al. (2014). Recoding of the stop codon UGA to glycine by a BD1–5/SN-2 bacterium and niche partitioning between Alpha- and Gammaproteobacteria in a tidal sediment microbial community naturally selected in a laboratory chemostat. Front. Microbiol 5, 231. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156.Shulgina Y, and Eddy SR (2021). A computational screen for alternative genetic codes in over 250,000 genomes. Elife 10. 10.7554/eLife.71402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157.Zagrovic B, Bartonek L, and Polyansky AA (2018). RNA-protein interactions in an unstructured context. FEBS Lett 592, 2901–2916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Bartonek L, Braun D, and Zagrovic B (2020). Frameshifting preserves key physicochemical properties of proteins. Proc. Natl. Acad. Sci. U. S. A 117, 5907–5912. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159.Alff-Steinberger C (1969). The genetic code and error transmission. Proc. Natl. Acad. Sci. U. S. A 64, 584–591. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 160.Shenhav L, and Zeevi D (2020). Resource conservation manifests in the genetic code. Science 370, 683–687. [DOI] [PubMed] [Google Scholar]
  • 161.Shackelton LA, and Holmes EC (2008). The role of alternative genetic codes in viral evolution and emergence. J. Theor. Biol 254, 128–134. [DOI] [PubMed] [Google Scholar]
  • 162.Ivanova NN, Schwientek P, Tripp HJ, Rinke C, Pati A, Huntemann M, Visel A, Woyke T, Kyrpides NC, and Rubin EM (2014). Stop codon reassignments in the wild. Science 344, 909–913. [DOI] [PubMed] [Google Scholar]
  • 163.Liu J, Jaffe AL, Chen L, Bor B, and Banfield JF (2023). Host translation machinery is not a barrier to phages that interact with both CPR and non-CPR bacteria. MBio 14, e0176623. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 164.Borges AL, Lou YC, Sachdeva R, Al-Shayeb B, Penev PI, Jaffe AL, Lei S, Santini JM, and Banfield JF (2022). Widespread stop-codon recoding in bacteriophages may regulate translation of lytic genes. Nat Microbiol 7, 918–927. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165.Massey SE (2017). The identities of stop codon reassignments support ancestral tRNA stop codon decoding activity as a facilitator of gene duplication and evolution of novel function. Gene 619, 37–43. [DOI] [PubMed] [Google Scholar]
  • 166.Fischer N, Paleskava A, Gromadski KB, Konevega AL, Wahl MC, Stark H, and Rodnina MV (2007). Towards understanding selenocysteine incorporation into bacterial proteins. Biol. Chem 388, 1061–1067. [DOI] [PubMed] [Google Scholar]
  • 167.Chung CZ, Miller C, Söll D, and Krahn N (2021). Introducing Selenocysteine into Recombinant Proteins in Escherichia coli. Curr Protoc 1, e54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168.Li J, Kang PT, Jiang R, Lee JY, Soares JA, Krzycki JA, and Chan MK (2023). Insights into pyrrolysine function from structures of a trimethylamine methyltransferase and its corrinoid protein complex. Commun Biol 6, 54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169.Martiny JBH, Bohannan BJM, Brown JH, Colwell RK, Fuhrman JA, Green JL, Horner-Devine MC, Kane M, Krumins JA, Kuske CR, et al. (2006). Microbial biogeography: putting microorganisms on the map. Nat. Rev. Microbiol 4, 102–112. [DOI] [PubMed] [Google Scholar]
  • 170.Ward DM, Ferris MJ, Nold SC, and Bateson MM (1998). A Natural View of Microbial Biodiversity within Hot Spring Cyanobacterial Mat Communities. Microbiol. Mol. Biol. Rev 62, 1353–1370. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 171.Moore LR, Rocap G, and Chisholm SW (1998). Physiology and molecular phylogeny of coexisting Prochlorococcus ecotypes. Nature 393, 464–467. [DOI] [PubMed] [Google Scholar]
  • 172.Cho JC, and Tiedje JM (2000). Biogeography and degree of endemicity of fluorescent Pseudomonas strains in soil. Appl. Environ. Microbiol 66, 5448–5456. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173.Newman DK, and Banfield JF (2002). Geomicrobiology: how molecular-scale interactions underpin biogeochemical systems. Science 296, 1071–1077. [DOI] [PubMed] [Google Scholar]
  • 174.Green JL, Bohannan BJM, and Whitaker RJ (2008). Microbial biogeography: from taxonomy to traits. Science 320, 1039–1043. [DOI] [PubMed] [Google Scholar]
  • 175.Tringe SG, von Mering C, Kobayashi A, Salamov AA, Chen K, Chang HW, Podar M, Short JM, Mathur EJ, Detter JC, et al. (2005). Comparative metagenomics of microbial communities. Science 308, 554–557. [DOI] [PubMed] [Google Scholar]
  • 176.DeLong EF, Preston CM, Mincer T, Rich V, Hallam SJ, Frigaard N-U, Martinez A, Sullivan MB, Edwards R, Brito BR, et al. (2006). Community genomics among stratified microbial assemblages in the ocean’s interior. Science 311, 496–503. [DOI] [PubMed] [Google Scholar]
  • 177.Alm RA, Ling LS, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL, et al. (1999). Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397, 176–180. [DOI] [PubMed] [Google Scholar]
  • 178.Hacker J, and Carniel E (2001). Ecological fitness, genomic islands and bacterial pathogenicity. A Darwinian view of the evolution of microbes. EMBO Rep 2, 376–381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 179.Tettelin H, Masignani V, Cieslewicz MJ, Eisen JA, Peterson S, Wessels MR, Paulsen IT, Nelson KE, Margarit I, Read TD, et al. (2002). Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae. Proc. Natl. Acad. Sci. U. S. A 99, 12391–12396. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 180.Maione D, Margarit I, Rinaudo CD, Masignani V, Mora M, Scarselli M, Tettelin H, Brettoni C, Iacobini ET, Rosini R, et al. (2005). Identification of a universal Group B streptococcus vaccine by multiple genome screen. Science 309, 148–150. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 181.Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, Angiuoli SV, Crabtree J, Jones AL, Durkin AS, et al. (2005). Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome.” Proc. Natl. Acad. Sci. U. S. A 102, 13950–13955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 182.Domingo-Sananes MR, and McInerney JO (2021). Mechanisms That Shape Microbial Pangenomes. Trends Microbiol 29, 493–503. [DOI] [PubMed] [Google Scholar]
  • 183.McInerney JO (2023). Prokaryotic Pangenomes Act as Evolving Ecosystems. Mol. Biol. Evol 40. 10.1093/molbev/msac232. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 184.Kashtan N, Roggensack SE, Rodrigue S, Thompson JW, Biller SJ, Coe A, Ding H, Marttinen P, Malmstrom RR, Stocker R, et al. (2014). Single-cell genomics reveals hundreds of coexisting subpopulations in wild Prochlorococcus. Science 344, 416–420. [DOI] [PubMed] [Google Scholar]
  • 185.Coleman ML, and Chisholm SW (2010). Ecosystem-specific selection pressures revealed through comparative population genomics. Proc. Natl. Acad. Sci. U. S. A 107, 18634–18639. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 186.Delmont TO, and Eren AM (2018). Linking pangenomes and metagenomes: the Prochlorococcus metapangenome. PeerJ 6, e4320. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 187.Utter DR, Borisy GG, Eren AM, Cavanaugh CM, and Mark Welch JL (2020). Metapangenomics of the oral microbiome provides insights into habitat adaptation and cultivar diversity. Genome Biol 21, 293. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 188.Boeuf D, Eppley JM, Mende DR, Malmstrom RR, Woyke T, and DeLong EF (2021). Metapangenomics reveals depth-dependent shifts in metabolic potential for the ubiquitous marine bacterial SAR324 lineage. Microbiome 9, 172. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 189.Bunker JJ, Drees C, Watson AR, Plunkett CH, Nagler CR, Schneewind O, Eren AM, and Bendelac A (2019). B cell superantigens in the human intestinal microbiota. Sci. Transl. Med 11. 10.1126/scitranslmed.aau9356. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 190.Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, et al. (2022). Unifying the known and unknown microbial coding sequence space. Elife 11. 10.7554/eLife.67667. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 191.Rodríguez Del Río Á, Giner-Lamia J, Cantalapiedra CP, Botas J, Deng Z, Hernández-Plaza A, Munar-Palmer M, Santamaría-Hernando S, Rodríguez-Herva JJ, Ruscheweyh H-J, et al. (2024). Functional and evolutionary significance of unknown genes from uncultivated taxa. Nature 626, 377–384. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 192.Remmert M, Biegert A, Hauser A, and Söding J (2011). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9, 173–175. [DOI] [PubMed] [Google Scholar]
  • 193.Hamamsy T, Morton JT, Blackwell R, Berenberg D, Carriero N, Gligorijevic V, Strauss CEM, Leman JK, Cho K, and Bonneau R (2023). Protein remote homology detection and structural alignment using deep learning. Nat. Biotechnol 10.1038/s41587-023-01917-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 194.Gautreau G, Bazin A, Gachet M, Planel R, Burlot L, Dubois M, Perrin A, Médigue C, Calteau A, Cruveiller S, et al. (2020). PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph. PLoS Comput. Biol 16, e1007732. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 195.Colquhoun RM, Hall MB, Lima L, Roberts LW, Malone KM, Hunt M, Letcher B, Hawkey J, George S, Pankhurst L, et al. (2021). Pandora: nucleotide-resolution bacterial pangenomics with reference graphs. Genome Biol 22, 267. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 196.Okazaki A, Yamazaki S, Inoue I, and Ott J (2021). Population genetics: past, present, and future. Hum. Genet 140, 231–240. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 197.Smith JM, Smith NH, O’Rourke M, and Spratt BG (1993). How clonal are bacteria? Proc. Natl. Acad. Sci. U. S. A 90, 4384–4388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 198.Cohan FM (1994). Genetic exchange and evolutionary divergence in prokaryotes. Trends Ecol. Evol 9, 175–180. [DOI] [PubMed] [Google Scholar]
  • 199.Cohan FM (2001). Bacterial species and speciation. Syst. Biol 50, 513–524. [DOI] [PubMed] [Google Scholar]
  • 200.Rohwer RR, Kirkpatrick M, Garcia SL, Kellom M, McMahon KD, and Baker BJ (2024). Bacterial ecology and evolution converge on seasonal and decadal scales. bioRxiv. 10.1101/2024.02.06.579087. [DOI] [Google Scholar]
  • 201.Acinas SG, Klepac-Ceraj V, Hunt DE, Pharino C, Ceraj I, Distel DL, and Polz MF (2004). Fine-scale phylogenetic architecture of a complex bacterial community. Nature 430, 551–554. [DOI] [PubMed] [Google Scholar]
  • 202.Lenski RE (2023). Revisiting the Design of the Long-Term Evolution Experiment with Escherichia coli. J. Mol. Evol 91, 241–253. [DOI] [PubMed] [Google Scholar]
  • 203.Good BH, McDonald MJ, Barrick JE, Lenski RE, and Desai MM (2017). The dynamics of molecular evolution over 60,000 generations. Nature 551, 45–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 204.Denef VJ (2019). Peering into the Genetic Makeup of Natural Microbial Populations Using Metagenomics. In Population Genomics: Microorganisms, Polz MF and Rajora OP, eds. (Springer International Publishing), pp. 49–75. [Google Scholar]
  • 205.Simmons SL, Dibartolo G, Denef VJ, Goltsman DSA, Thelen MP, and Banfield JF (2008). Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation. PLoS Biol 6, e177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 206.Human Microbiome Project Consortium (2012). Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 207.Sunagawa S, Coelho LP, Chaffron S, Kultima JR, Labadie K, Salazar G, Djahanschiri B, Zeller G, Mende DR, Alberti A, et al. (2015). Ocean plankton. Structure and function of the global ocean microbiome. Science 348, 1261359. [DOI] [PubMed] [Google Scholar]
  • 208.Schloissnig S, Arumugam M, Sunagawa S, Mitreva M, Tap J, Zhu A, Waller A, Mende DR, Kultima JR, Martin J, et al. (2013). Genomic variation landscape of the human gut microbiome. Nature 493, 45–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 209.Bendall ML, Stevens SL, Chan L-K, Malfatti S, Schwientek P, Tremblay J, Schackwitz W, Martin J, Pati A, Bushnell B, et al. (2016). Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations. ISME J 10, 1589–1601. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 210.Li SS, Zhu A, Benes V, Costea PI, Hercog R, Hildebrand F, Huerta-Cepas J, Nieuwdorp M, Salojärvi J, Voigt AY, et al. (2016). Durable coexistence of donor and recipient strains after fecal microbiota transplantation. Science 352, 586–589. [DOI] [PubMed] [Google Scholar]
  • 211.Truong DT, Tett A, Pasolli E, Huttenhower C, and Segata N (2017). Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res 27, 626–638. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 212.Anderson RE, Reveillaud J, Reddington E, Delmont TO, Eren AM, McDermott JM, Seewald JS, and Huber JA (2017). Genomic variation in microbial populations inhabiting the marine subseafloor at deep-sea hydrothermal vents. Nat. Commun 8, 1114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 213.Delmont TO, Kiefl E, Kilinc O, Esen OC, Uysal I, Rappé MS, Giovannoni S, and Eren AM (2019). Single-amino acid variants reveal evolutionary processes that shape the biogeography of a global SAR11 subclade. Elife 8. 10.7554/eLife.46497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 214.Conwill A, Kuan AC, Damerla R, Poret AJ, Baker JS, Tripp AD, Alm EJ, and Lieberman TD (2022). Anatomy promotes neutral coexistence of strains in the human skin microbiome. Cell Host Microbe 30, 171–182.e7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 215.Garud NR, and Pollard KS (2020). Population Genetics in the Human Microbiome. Trends Genet 36, 53–67. [DOI] [PubMed] [Google Scholar]
  • 216.Logares R (2024). Decoding populations in the ocean microbiome. Microbiome 12, 67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 217.Wilke CO (2012). Bringing molecules back into molecular evolution. PLoS Comput. Biol 8, e1002572. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 218.Anfinsen CB (1973). Principles that govern the folding of protein chains. Science 181, 223–230. [DOI] [PubMed] [Google Scholar]
  • 219.Park Y, Metzger BPH, and Thornton JW (2024). The simplicity of protein sequence-function relationships. bioRxiv. 10.1101/2023.09.02.556057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 220.Dean AM, and Thornton JW (2007). Mechanistic approaches to the study of evolution: the functional synthesis. Nat. Rev. Genet 8, 675–688. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 221.Sunyaev S, Lathe W 3rd, and Bork P (2001). Integration of genome data and protein structures: prediction of protein folds, protein interactions and “molecular phenotypes” of single nucleotide polymorphisms. Curr. Opin. Struct. Biol 11, 125–130. [DOI] [PubMed] [Google Scholar]
  • 222.Kuhlman B, and Bradley P (2019). Advances in protein structure prediction and design. Nat. Rev. Mol. Cell Biol 20, 681–697. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 223.Eren AM, Kiefl E, Shaiber A, Veseli I, Miller SE, Schechter MS, Fink I, Pan JN, Yousef M, Fogarty EC, et al. (2021). Community-led, integrated, reproducible multi-omics with anvi’o. Nat Microbiol 6, 3–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 224.Robinson SL (2023). Structure-guided metagenome mining to tap microbial functional diversity. Curr. Opin. Microbiol 76, 102382. [DOI] [PubMed] [Google Scholar]
  • 225.Kiefl E, Esen OC, Miller SE, Kroll KL, Willis AD, Rappé MS, Pan T, and Eren AM (2023). Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution. Sci Adv 9, eabq4632. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 226.Han Y, Peng Y, Peng J, Cao L, Xu Y, Yang Y, Wu M, Zhou H, Zhang C, Zhang D, et al. (2024). Phylogenetically and structurally diverse reductive dehalogenases link biogeochemical cycles in deep-sea cold seeps. bioRxiv. 10.1101/2024.01.23.576788. [DOI] [Google Scholar]
  • 227.Moses L, and Pachter L (2022). Museum of spatial transcriptomics. Nat. Methods 19, 534–546. [DOI] [PubMed] [Google Scholar]
  • 228.Toledo-Arana A, Dussurget O, Nikitas G, Sesto N, Guet-Revillet H, Balestrino D, Loh E, Gripenland J, Tiensuu T, Vaitkevicius K, et al. (2009). The Listeria transcriptional landscape from saprophytism to virulence. Nature 459, 950–956. [DOI] [PubMed] [Google Scholar]
  • 229.Dar D, Dar N, Cai L, and Newman DK (2021). Spatial transcriptomics of planktonic and sessile bacterial populations at single-cell resolution. Science 373. 10.1126/science.abi4882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 230.Ma P, Amemiya HM, He LL, Gandhi SJ, Nicol R, Bhattacharyya RP, Smillie CS, and Hung DT (2023). Bacterial droplet-based single-cell RNA-seq reveals antibiotic-associated heterogeneous cellular states. Cell 186, 877–891.e14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 231.Wang B, Lin AE, Yuan J, Novak KE, Koch MD, Wingreen NS, Adamson B, and Gitai Z (2023). Single-cell massively-parallel multiplexed microbial sequencing (M3-seq) identifies rare bacterial populations and profiles phage infection. Nat Microbiol 8, 1846–1862. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 232.Poretsky RS, Bano N, Buchan A, LeCleir G, Kleikemper J, Pickering M, Pate WM, Moran MA, and Hollibaugh JT (2005). Analysis of microbial gene transcripts in environmental samples. Appl. Environ. Microbiol 71, 4121–4126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 233.Gifford SM, Sharma S, Rinta-Kanto JM, and Moran MA (2011). Quantitative analysis of a deeply sequenced marine microbial metatranscriptome. ISME J 5, 461–472. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 234.Salazar G, Paoli L, Alberti A, Huerta-Cepas J, Ruscheweyh H-J, Cuenca M, Field CM, Coelho LP, Cruaud C, Engelen S, et al. (2019). Gene Expression Changes and Community Turnover Differentially Shape the Global Ocean Metatranscriptome. Cell 179, 1068–1083.e21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 235.Schirmer M, Franzosa EA, Lloyd-Price J, McIver LJ, Schwager R, Poon TW, Ananthakrishnan AN, Andrews E, Barron G, Lake K, et al. (2018). Dynamics of metatranscription in the inflammatory bowel disease gut microbiome. Nat Microbiol 3, 337–346. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 236.Jiang X, Hall AB, Arthur TD, Plichta DR, Covington CT, Poyet M, Crothers J, Moses PL, Tolonen AC, Vlamakis H, et al. (2019). Invertible promoters mediate bacterial phase variation, antibiotic resistance, and host adaptation in the gut. Science 363, 181–187.
  • 237.Medhekar B, and Miller JF (2007). Diversity-generating retroelements. Curr. Opin. Microbiol 10, 388–395.
  • 238.Naorem SS, Han J, Wang S, Lee WR, Heng X, Miller JF, and Guo H (2017). DGR mutagenic transposition occurs via hypermutagenic reverse transcription primed by nicked template RNA. Proc. Natl. Acad. Sci. U. S. A 114, E10187–E10195.
  • 239.Liu M, Deora R, Doulatov SR, Gingery M, Eiserling FA, Preston A, Maskell DJ, Simons RW, Cotter PA, Parkhill J, et al. (2002). Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage. Science 295, 2091–2094.
  • 240.Paul BG, Burstein D, Castelle CJ, Handa S, Arambula D, Czornyj E, Thomas BC, Ghosh P, Miller JF, Banfield JF, et al. (2017). Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea. Nat Microbiol 2, 17045.
  • 241.Roux S, Paul BG, Bagby SC, Nayfach S, Allen MA, Attwood G, Cavicchioli R, Chistoserdova L, Gruninger RJ, Hallam SJ, et al. (2021). Ecology and molecular targets of hypermutation in the global microbiome. Nat. Commun 12, 3076.
  • 242.Paul BG, and Eren AM (2022). Eco-evolutionary significance of domesticated retroelements in microbial genomes. Mob. DNA 13, 6.
  • 243.Mohler K, and Ibba M (2017). Translational fidelity and mistranslation in the cellular response to stress. Nat Microbiol 2, 17117.
  • 244.Samatova E, Daberger J, Liutkute M, and Rodnina MV (2020). Translational Control by Ribosome Pausing in Bacteria: How a Non-uniform Pace of Translation Affects Protein Production and Folding. Front. Microbiol 11, 619430.
  • 245.Torrent M, Chalancon G, de Groot NS, Wuster A, and Madan Babu M (2018). Cells alter their tRNA abundance to selectively regulate protein synthesis during stress conditions. Sci. Signal 11. 10.1126/scisignal.aat6409.
  • 246.Al-Shayeb B, Sachdeva R, Chen L-X, Ward F, Munk P, Devoto A, Castelle CJ, Olm MR, Bouma-Gregson K, Amano Y, et al. (2020). Clades of huge phages from across Earth’s ecosystems. Nature 578, 425–431.
  • 247.Yang JY, Fang W, Miranda-Sanchez F, Brown JM, Kauffman KM, Acevero CM, Bartel DP, Polz MF, and Kelly L (2021). Degradation of host translational machinery drives tRNA acquisition in viruses. Cell Syst 12, 771–779.e5.
  • 248.Bailly-Bechet M, Vergassola M, and Rocha E (2007). Causes for the intriguing presence of tRNAs in phages. Genome Res 17, 1486–1495.
  • 249.Zhang W, Foo M, Eren AM, and Pan T (2022). tRNA modification dynamics from individual organisms to metaepitranscriptomics of microbiomes. Mol. Cell 82, 891–906.
  • 250.Padhiar NH, Katneni U, Komar AA, Motorin Y, and Kimchi-Sarfaty C (2024). Advances in methods for tRNA sequencing and quantification. Trends Genet 40, 276–290.
  • 251.Schwartz MH, Wang H, Pan JN, Clark WC, Cui S, Eckwahl MJ, Pan DW, Parisien M, Owens SM, Cheng BL, et al. (2018). Microbiome characterization by high-throughput transfer RNA sequencing and modification analysis. Nat. Commun 9, 5353.
  • 252.Ingolia NT, Ghaemmaghami S, Newman JRS, and Weissman JS (2009). Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223.
  • 253.Fremin BJ, Sberro H, and Bhatt AS (2020). MetaRibo-Seq measures translation in microbiomes. Nat. Commun 11, 3268.
  • 254.Hatzenpichler R, Scheller S, Tavormina PL, Babin BM, Tirrell DA, and Orphan VJ (2014). In situ visualization of newly synthesized proteins in environmental microbes using amino acid tagging and click chemistry. Environ. Microbiol 16, 2568–2590.
  • 255.Musat N, Foster R, Vagner T, Adam B, and Kuypers MMM (2012). Detecting metabolic activities in single cells, with emphasis on nanoSIMS. FEMS Microbiol. Rev 36, 486–511.
  • 256.McGlynn SE, Chadwick GL, Kempes CP, and Orphan VJ (2015). Single cell activity reveals direct electron transfer in methanotrophic consortia. Nature 526, 531–535.
  • 257.Carini P, Marsden PJ, Leff JW, Morgan EE, Strickland MS, and Fierer N (2016). Relic DNA is abundant in soil and obscures estimates of soil microbial diversity. Nat Microbiol 2, 16242.
  • 258.VerBerkmoes NC, Denef VJ, Hettich RL, and Banfield JF (2009). Systems biology: Functional analysis of natural microbial consortia using community proteomics. Nat. Rev. Microbiol 7, 196–205.
  • 259.Hungate BA, Mau RL, Schwartz E, Caporaso JG, Dijkstra P, van Gestel N, Koch BJ, Liu CM, McHugh TA, Marks JC, et al. (2015). Quantitative microbial ecology through stable isotope probing. Appl. Environ. Microbiol 81, 7570–7581.
  • 260.Pan C, Fischer CR, Hyatt D, Bowen BP, Hettich RL, and Banfield JF (2011). Quantitative tracking of isotope flows in proteomes of microbial communities. Mol. Cell. Proteomics 10, M110.006049.
  • 261.Blazewicz SJ, Hungate BA, Koch BJ, Nuccio EE, Morrissey E, Brodie EL, Schwartz E, Pett-Ridge J, and Firestone MK (2020). Taxon-specific microbial growth and mortality patterns reveal distinct temporal population responses to rewetting in a California grassland soil. ISME J 14, 1520–1532.
  • 262.Vyshenska D, Sampara P, Singh K, Tomatsu A, Kauffman WB, Nuccio EE, Blazewicz SJ, Pett-Ridge J, Louie KB, Varghese N, et al. (2023). A standardized quantitative analysis strategy for stable isotope probing metagenomics. mSystems 8, e0128022.
  • 263.Rubin BE, Diamond S, Cress BF, Crits-Christoph A, Lou YC, Borges AL, Shivram H, He C, Xu M, Zhou Z, et al. (2022). Species- and site-specific genome editing in complex bacterial communities. Nat Microbiol 7, 34–47.
  • 264.Ronda C, Chen SP, Cabral V, Yaung SJ, and Wang HH (2019). Metagenomic engineering of the mammalian gut microbiome in situ. Nat. Methods 16, 167–170.
  • 265.Wang Y, Gallagher LA, Andrade PA, Liu A, Humphreys IR, Turkarslan S, Cutler KJ, Arrieta-Ortiz ML, Li Y, Radey MC, et al. (2023). Genetic manipulation of candidate phyla radiation bacteria provides functional insights into microbial dark matter. bioRxiv. 10.1101/2023.05.02.539146.
  • 266.Mojica FJ, Juez G, and Rodríguez-Valera F (1993). Transcription at different salinities of Haloferax mediterranei sequences adjacent to partially modified PstI sites. Mol. Microbiol 9, 613–621.
  • 267.Bolotin A, Quinquis B, Sorokin A, and Ehrlich SD (2005). Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561.
  • 268.Pourcel C, Salvignol G, and Vergnaud G (2005). CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653–663.
  • 269.Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, and Charpentier E (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821.
  • 270.Ritchie H, and Roser M (2024). Half of the world’s habitable land is used for agriculture. Our World in Data.
  • 271.Foley JA, Defries R, Asner GP, Barford C, Bonan G, Carpenter SR, Chapin FS, Coe MT, Daily GC, Gibbs HK, et al. (2005). Global consequences of land use. Science 309, 570–574.
