BMC Biology. 2025 Aug 20;23:261. doi: 10.1186/s12915-025-02331-7

Adaptive genetics reveals constraints on protein structure/function by evolving E. coli under constant nutrient limitation

Katja Schwartz 1,#, Margie Kinnersley 2,#, Charles Ross Lindsey 3, Gavin Sherlock 1, Frank Rosenzweig 2,3
PMCID: PMC12366229  PMID: 40830465

Abstract

Background

Evolution of microbes under laboratory selection produces genetically diverse populations, owing to the continuous input of mutations and to competition among lineages. Whole-genome whole-population sequencing makes it possible to identify mutations arising in such populations, to use them to discern functional modules where adaptation occurs, and then map gene structure–function relationships. Here, we report on the use of this approach, adaptive genetics, to discover targets of selection and the mutational consequences thereof in E. coli evolving under chronic nutrient limitation.

Results

Replicate bacterial populations were cultured for ≥ 300 generations in glucose limited chemostats and sequenced every 50 generations at 1000X-coverage, enabling identification of mutations that rose to ≥ 1% frequency. Thirty-nine genes qualified as high value targets of selection, being mutated far more often than would be expected by chance. A majority of these encode regulatory proteins that control gene expression at the transcriptional (e.g., RpoS and OmpR), post-transcriptional (e.g., Hfq and ProQ), and post-translational (e.g., GatZ) levels. The downstream effects of these regulatory mutations likely impact not only acquisition and processing of limiting glucose, but also assembly of structural elements such as lipopolysaccharide, periplasmic glucans, and cell surface appendages such as flagella and fimbriae. Whether regulatory or structural in nature, recurrent mutations at high value targets tend to cluster at sites either known or predicted to be involved in RNA–protein or protein–protein interactions.

Conclusions

Our observations highlight the value of experimental evolution as a proving ground for inferences gathered from traditional molecular genetics. By coupling experimental evolution to whole-genome, whole-population sequencing, adaptive genetics makes it possible not only to identify the genes whose mutation confers a selective advantage, but also to discover which residues in which genes are most likely to confer a particular type of selective advantage and why.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12915-025-02331-7.

Keywords: E. coli, Experimental evolution, Whole genome sequencing, Functional genomics, Adaptive genetics, Parallelism, Metabolic networks

Background

Genetics aims to discover the parts out of which an organism is built, maintained, and replicated, and to discover how those parts interact. To identify the genetic basis of a phenotype or trait, classical, or forward, genetics relies on screening collections of mutants for those that alter specific structures or processes (e.g., [1–5]). By contrast, molecular, or reverse, genetics starts with a known gene or locus, variants of which are created and characterized to determine how specific changes alter a given structure or process (e.g., [2, 6, 7]). Genomics has transformed both approaches and accelerated the pace of discovery using each (e.g., [8, 9]). Forward and reverse genetics typically depend on the isolation and characterization of deleterious mutants that carry out a process less well than their wild-type counterpart. In recent years, a new approach, adaptive genetics, has emerged as a tool for connecting genotype with phenotype; this approach relies on the isolation and characterization of beneficial mutants that rise in frequency in response to selection [10]. Laboratory evolution is often driven by selection on large populations of microbes (e.g., [11–13]), ideal conditions for using adaptive genetics to explore protein structure–function relationships as well as the regulation of metabolic pathways in response to specific environmental challenges.

Microbial evolution experiments typically result in genetically diverse populations, owing to the continuous input of mutations (beneficial or otherwise), the persistence of neutral passengers, and competition among lineages (e.g., [14–17]). Mutations that reduce a gene product’s activity, or even cause loss-of-function, are among the most common classes of beneficial mutations observed in such experiments (e.g., [17–22]). This observation suggests that it can be both more advantageous evolutionarily and easier to break old parts than to improve old parts or create new ones. Beneficial missense mutations that diminish or alter protein activity also provide structural and functional information about the proteins mutated. Indeed, this is the basis for Deep Mutational Scanning, which uses saturating mutagenesis to probe a protein’s structure in conjunction with a functional assay to determine correlated changes in its activity (e.g., [23, 24]).

Previously, we reported a genomic analysis of how independent Escherichia coli populations evolve under continuous aerobic glucose limitation in chemostats [25]. Each of these populations originated from an E. coli mutator strain that also carries a transfer RNA (tRNA) suppressor (JA122; [26]); such a genotype increases mutational load yet softens the effect of nonsense mutations. By whole-genome sequencing populations every 50 generations at ~ 1000X-coverage, we identified de novo mutations that rose to ≥ 1% frequency then sequenced scores of clones from each population at the time point where we observed highest genetic diversity. This enabled us to track the evolutionary dynamics of high fitness lineages over time [25].
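The link between sequencing depth and the ≥ 1% frequency threshold can be illustrated with a short calculation. The sketch below assumes that reads sample alleles independently, so that the number of variant-supporting reads is binomial; the minimum read count of 5 is a hypothetical calling threshold chosen for illustration and is not taken from our variant-calling pipeline.

```python
import math

def prob_at_least(min_reads: int, depth: int, allele_freq: float) -> float:
    """P(at least min_reads variant reads) under Binomial(depth, allele_freq)."""
    below = sum(
        math.comb(depth, k) * allele_freq**k * (1 - allele_freq) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - below

depth = 1000          # approximate coverage stated in the text
allele_freq = 0.01    # frequency threshold stated in the text
min_reads = 5         # hypothetical minimum number of supporting reads

# Average number of reads expected to carry a 1% allele at this depth.
expected_reads = depth * allele_freq

# Chance that such an allele is supported by at least min_reads reads.
p_detect = prob_at_least(min_reads, depth, allele_freq)
print(f"expected variant reads: {expected_reads:.1f}")
print(f"P(>= {min_reads} variant reads): {p_detect:.3f}")
```

Under these assumptions, an allele at 1% frequency is expected to be supported by about ten reads at 1000X coverage, which is why deep sequencing is needed to follow low-frequency lineages.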

In the simple, unstructured environment of a chemostat, cells can adaptively evolve via one or both of two strategies: they can become more adept at scavenging the limiting nutrient or they can become more efficient at converting the limiting nutrient to progeny [27–29]. Previous work [13, 30–32], as well as our companion piece [25], has shown that under glucose limitation selection impacts targets that affect both strategies, ranging from global regulators to operon-specific (local) regulators, to enzymes, and to structural proteins, as well as promoters and terminators. In a chemostat, selection may also favor strategies that increase cells’ residence time, leading to spatially segregated populations, one planktonic, the other sessile (i.e., adhering to the vessel wall). All these strategies are executed via integrated functional modules, hierarchically organized as either single genes, operons, or regulons depending on their specificity of action and responsiveness to higher order control. The relationships among these regulatory circuits, whether as simple as product inhibition of a single enzyme or as complex as the orchestration of cell division, can be thought of as the cell’s wiring diagram, a diagram that adaptive genetics can help us to read.

In this report, we focus on the physical location of de novo alleles arising in genes that are mutated much more frequently than expected by chance, and consider how they impact the structure and function of proteins encoded by these genes. We place special emphasis on independent mutations clustered in proximity to one another, as recurrent mutations at specific locations across independent experiments both argue for their value under selection and illustrate how adaptive evolution may be constrained at the molecular level. We find that some non-random mutations map to sites known or predicted to be involved in substrate binding or to sites required to assemble multimeric proteins that act, for example, in lipopolysaccharide (LPS) transport and flagellar biogenesis. However, many other mutations map to sites that regulate gene expression and do so not just at the level of transcription, but also at the post-transcriptional and post-translational levels, levels that have been historically understudied in experimental evolution. Out of 39 genes statistically shown to be high-value targets of selection, more than half were regulatory genes or regulatory elements. Further, out of 379 de novo mutations arising in these 39 genes, nearly two-thirds arose within regulatory genes or extragenic regulatory sequences. We discuss how these mutations affect not only acquisition and utilization of the limiting resource, but also the construction of structural elements like LPS, periplasmic glucans, flagella, and fimbriae. Taken together, our findings highlight the role that regulatory mutations play in driving adaptation and demonstrate the power of adaptive genetics to discover which genes become targets of selection during experimental evolution and which residues in those genes are most likely to confer a particular type of selective advantage and why.

Results and discussion

Experimental design

The design of our evolution experiments has been described in detail previously [25]. Briefly, using Davis Minimal Medium, E. coli JA122 was evolved in triplicate in aerobic, glucose-limited (0.0125% w/v) chemostats at a fixed dilution rate of 0.2 h^−1 and at a constant temperature of 30 °C. Relative to E. coli K12 MG1655, our founder strain JA122 has an elevated mutation rate (1.0 × 10^−7 vs 3.6 × 10^−9 per bp per generation) due to a nonsense mutation in the base excision repair glycosylase MutY (L299*). JA122 also contains nonsense mutations in the housekeeping (σD aka σ70 (RpoD), E26*) and stationary phase (σS aka σ38 (RpoS), Q33*) sigma factors, each of which directs ribonucleic acid (RNA) polymerase holoenzyme to its respective consensus sequence (note: after a sigma factor’s first mention, we hereafter refer to it by its gene name). However, the founder strain also carries a nonsense suppressor tRNA capable of suppressing all three types of nonsense mutations [32, 33]. Thus, while the ancestral MutY defect increases mutational load on our populations, the presence of a glnX suppressor softens the effect of nonsense mutations. Suppressor activity may even be enhanced by the slow growth conditions [33, 34] imposed by resource limitation.
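For readers less familiar with chemostat culture, the stated parameters can be converted into more intuitive quantities. The sketch below relies on the standard steady-state relationship in which specific growth rate equals dilution rate, so that doubling time is ln(2)/D; the variable names are illustrative, and the 300-generation figure is the minimum run length given in the Abstract.

```python
import math

dilution_rate_per_h = 0.2        # fixed dilution rate D (h^-1), from the text
generations = 300                # minimum number of generations, from the Abstract
mutation_rate_ja122 = 1.0e-7     # founder strain JA122, from the text
mutation_rate_mg1655 = 3.6e-9    # E. coli K12 MG1655, from the text

# At steady state the specific growth rate equals D, so the
# population doubling time is ln(2) / D.
doubling_time_h = math.log(2) / dilution_rate_per_h

# Wall-clock time needed to accumulate the stated number of generations.
elapsed_days = generations * doubling_time_h / 24

# Fold elevation of the mutator strain's mutation rate over MG1655.
fold_elevation = mutation_rate_ja122 / mutation_rate_mg1655

print(f"doubling time: {doubling_time_h:.2f} h")
print(f"time for {generations} generations: {elapsed_days:.1f} days")
print(f"mutation rate elevation: {fold_elevation:.1f}-fold")
```

On these figures, cells double roughly every 3.5 h, 300 generations take a little over six weeks, and the mutator background raises the mutation rate by a factor of about 28.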

Identification of high-value targets and mutation clusters

Here, we analyzed a collection of functional modules whose components become targets of selection when E. coli evolves under continuous glucose limitation. Targets of selection were defined as genes and regulatory elements in which the number of observed mutations exceeds the number that would be expected by random chance, given the observed number of mutations and gene/element sizes (Table 1; [25]). Thirty-nine targets meeting a 5% false-discovery rate (FDR) threshold were further examined for evidence of non-random patterns of mutation either in their primary sequence or in their 3-dimensional structures, the latter using ClusterExplorer [35], the nonrandom mutations cluster (NMC) algorithm [36], and the Identification of Protein Amino Acid Clustering (iPAC) program [37]. Nineteen protein coding genes/intergenic regions exhibited at least one significant cluster of mutations (Table 1). Many of these clusters precisely define intergenic regulatory regions or occur in known protein structural elements, such as domains involved in catalysis, protein-RNA, or protein–protein interactions. In targets where de novo mutations were not clustered but still present in excess, they often occurred in regions that could alter protein activity or regulation in ways that enhance acquisition or utilization of the limiting nutrient, facilitate energy conservation, or increase cells’ residence time in the chemostat. A majority of genes deemed to be high-value targets of selection fell into one of four functional categories: regulatory proteins (galS, malT, malK, rho, hfq, proQ, ompR, rpoS, rpoA, gatZ), proteins that act in lipopolysaccharide export (lptA, lptB, lptC, lptD, lptG), multifunctional inner membrane proteins (opgG, opgH), and proteins required to construct cell surface appendages (fliG, fliH, fliP, fimH). 
Within each of these categories we found examples of groups of genes whose products collaborate in specific biological processes, qualifying them to be regarded as components of a functional module (Fig. 1).
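The logic of this enrichment test can be sketched as follows. The code is a minimal illustration and not the procedure used in [25]: it assumes that mutation counts per target follow a Poisson distribution whose mean scales with target length, and it uses the Benjamini-Hochberg procedure as one common way to control the FDR. Expected and observed counts are copied from Table 1; the function names are our own for this example.

```python
import math

def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected), summing pmf terms in log space."""
    if observed <= 0:
        return 1.0
    total = 0.0
    for k in range(observed, observed + 1000):
        term = math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
        total += term
        # Stop once terms are decaying and no longer change the sum.
        if k > expected and term < total * 1e-16:
            break
    return total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Expected counts scale with target length: the galS row of Table 1
# (346 aa, 0.78 expected) implies a per-residue rate that can be
# applied to hfq (102 aa).
rate_per_aa = 0.78 / 346
hfq_expected = 102 * rate_per_aa

# (target, expected mutations, observed mutations) from Table 1.
rows = [("galS", 0.78, 38), ("hfq", 0.23, 24), ("glpR", 0.57, 4)]
pvals = [poisson_upper_tail(obs, exp) for _, exp, obs in rows]

# In a real analysis the correction would span every gene and
# intergenic region tested, not just these three rows.
qvals = benjamini_hochberg(pvals)
for (name, _, _), p, q in zip(rows, pvals, qvals):
    print(f"{name}: p = {p:.2e}, BH-adjusted = {q:.2e}")
```

Because the tabulated expected counts are rounded and the exact null model may differ, this sketch should not be expected to reproduce the p-values in Table 1 digit for digit; it is meant only to show why a handful of mutations in a short gene can be highly significant.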

Table 1.

Protein coding genes/intergenic regions that had more mutations than expected by chance

Columns: Gene/region; Function; Length; Number Expected Mutations; Number Observed Mutations (p-value); 3D structure PDB ID; Location Significant Clusters: ClusterExplorer (p-value); FDR; Location Significant Clusters: NMC (p-value); Location Significant Clusters: iPAC (p-value)
galS transcriptional dual regulator, represses galactose utilization genes 346 aa 0.78 38 (6.55E-50)
hfq RNA-binding protein 102 aa 0.23 24 (6.91E-40) 3QHS

14-64 (1.21E-06)

17-64 (3.42E-06)

26-32 (4.49E-05)

17-32 (4.85E-05)

29-32 (5.51E-05)

14-32 (5.68E-05)

14-37 (1.65E-04)

17-37 (1.76E-04)

26-64 (1.91E-04)

60-64 (2.66E-04)

14-62 (2.82E-04)

20-64 (3.56E-04)

17-62 (5.04E-04)

pgi glucose-6-phosphate isomerase 549 aa 1.23 35 (4.54E-38) 3NBU 263-267 (3.76E-02) 3.46E-01
opgH OPG biosynthesis protein H 847 aa 1.90 31 (8.74E-27) 333-457 (5.00E-02) 3.46E-01

334-512 (3.71E-05)

334-458 (4.61E-05)

347-512 (6.32E-05)

347-458 (6.82E-05)

334-507 (1.15E-04)

334-434 (1.15E-04)

334-488 (1.21E-04)

malT Transcriptional activator of maltose utilization genes 901 aa 2.02 30 (8.10E-25) 1HZ4

4-358 (4.34E-06)

4-60 (3.22E-05)

4-336 (4.53E-05)

4-319 (9.22E-05)

9-358 (1.06E-04)

311-319 (1.07E-04)

634-637 (1.14E-04)

236-358 (1.39E-04)

634 (1.53E-04)

634 (1.49E-05)

634-637 (1.95E-04)

malK maltose transporter ATP binding subunit, negative regulator of MalT 371 aa 0.83 22 (7.47E-24) 2AWO 295-297 (3.31E-03) 0.0736

296-298 (1.11E-07)

296-297 (2.24E-06)

225-352 (6.38E-06)

225-298 (8.28E-06)

225-349 (3.23E-05)

225-297 (4.65E-05)

286-298 (5.46E-05)

225-339 (6.13E-05)

253-298 (1.11E-04)

231-298 (1.27E-04)

231-352 (1.37E-04)

267-298 (2.60E-04)

253-352 (3.05E-04)

286-297 (3.95E-04)

231-349 (5.01E-04)

253-297 (5.39E-04)

296-296 (5.45E-04)

51-51 (5.52E-04)

231-297 (5.71E-04)

297-298 (3.34E-06)

225-297 (5.16E-06)

296-297 (5.78E-06)

297-339 (7.85E-06)

297-352 (4.11E-05)

267-297 (8.09E-05)

297-349 (1.13E-04)

231-297 (1.14E-04)

225-296 (1.77E-04)

296-298 (2.25E-04)

296-339 (2.33E-04)

253-297 (2.68E-04)

286-297 (3.14E-04)

296-296 (5.47E-04)

51-51 (5.55E-04)

upstream mglB Promoter region for mglBAC transcription unit 280 bp 0.21 7 (2.91E-09) 0-2 (1.80E-04) 1.37E-02

1-3 (3.86E-10)

2-3 (3.48E-08)

1-20 (7.54E-07)

3 (8.85E-05)

2-20 (2.07E-05)

1-2 (6.14E-04)

rho Transcription termination factor 419 aa 0.94 11 (5.49E-09) 1PVO 87-88 (4) 87-88 (6.28E-05)
downstream rpsU Terminator region for 30S ribosomal subunit protein S21 111 bp 0.08 5 (3.06E-08) 56-63 (3.92E-03) 7.36E-02

57-73 (2.16E-03)

57-64 (2.28E-03)

fimH Type 1 fimbriae D-mannose specific adhesin 300 aa 0.68 9 (4.38E-08) 1KLF

54-94 (2.10E-05)

54-62 (5.72E-05)

94-94 (9.14E-05)

54-127 (1.32E-04)

58-62 (1.81E-04)

58-94 (1.87E-04)

58-127 (7.83E-04)

54-62 (8.30E-05)

94-94 (1.52E-04)

62-94 (2.85E-04)

54-58 (1.07E-03)

58-127 (1.79E-03)

58-94 (2.21E-03)

rpoS RNA polymerase, sigma S 330 aa 0.74 9 (9.71E-08) 5IPL 299-325 (4.53E-04)
downstream cspC Terminator region for stress protein cspC 166 bp 0.12 5 (2.21E-07)

99-117 (7.04E-04)

99-100 (8.32E-04)

gatZ tagatose-1,6-bisphosphate aldolase 2 subunit 420 aa 0.94 9 (7.06E-07) 2FIQ
pfkA 6-phosphofructokinase I 320 aa 0.72 8 (9.47E-07) 1PFK
rpoA RNA polymerase subunit alpha 329 aa 0.74 7 (1.27E-05) 3IYD 322 (6.44E-05) 322 (6.70E-05)
lptD LPS assembly protein 784 aa 1.76 10 (1.61E-05) 4N4R (S. typhimurium)

618 (1.94E-05)

587-676 (1.60E-03)

618 (1.94E-05)

618-676 (6.47E-05)

618-660 (3.48E-03)

proQ RNA chaperone 232 aa 0.52 6 (1.81E-05) 5NB9
lptG LPS transport system protein 360 aa 0.81 7 (2.24E-05) 5L75 (K. pneumoniae)
downstream hfq/upstream hflX Terminator region for hfq 76 bp 0.06 3 (2.93E-05)
yciM (lapB) LPS assembly protein B 389 aa 0.87 7 (3.64E-05) 4ZLH (cytoplasmic domain) 45-53 (3.85E-02) 3.46E-01 46-54 (1.66E-04)
downstream fis Terminator region for transcriptional regulator Fis 86 bp 0.06 3 (4.22E-05)
slt lytic murein transglycosylase 645 aa 1.45 8 (1.34E-04) 1QSA
gabD NADP+-dependent succinate-semialdehyde dehydrogenase 482 aa 1.08 7 (1.36E-04)

369 (1.43E-05)

369-415 (7.50E-03)

wzzE enterobacterial common antigen polysaccharide co-polymerase 348 aa 0.78 6 (1.64E-04) 4WL1
yobF unknown 47 aa 0.11 3 (1.92E-04)
opgG OPG biosynthesis protein 511 aa 1.15 7 (1.93E-04) 1TXK (aa 23-511)
fliH Flagellar biosynthesis protein 228 aa 0.51 5 (1.95E-04) 5B0O (Salmonella)
ompR DNA-binding transcriptional dual regulator 239 aa 0.54 5 (2.41E-04) 1ODD (aa 122-239)
malE maltose ABC transporter periplasmic binding protein 396 aa 0.89 6 (3.25E-04) 1DMB
ybaL Putative transporter 558 aa 1.25 7 (3.26E-04) 3FWZ (TrkA-N domain) 40-48 (5.90E-03) 8.98E-02
upstream adhE alcohol dehydrogenase promoter region 477 bp 0.36 4 (5.08E-04)

250 (4.18E-03)

256-263 (4.30E-03)

yiaO 2,3-diketo-L-gulonate:Na+ symporter - periplasmic binding protein 328 aa 0.74 5 (9.92E-04) 158-159 (4.15E-02) 4.15E-02
lptC lipopolysaccharide transport system protein 191 aa 0.43 4 (1.02E-03) 3MY2
fliG flagellar motor switch protein 331 aa 0.74 5 (1.03E-03) 172-176 (1.33E-03) 0.05 173-177 (1.43E-03)
lpxD UDP-3-O-(3-hydroxymyristoyl)glucosamine N-acyltransferase 341 aa 0.77 5 (1.18E-03) 3EH0
valY tRNA-Val 78 bp 0.06 2 (1.55E-03)
fliP Flagellar biosynthesis protein 245 aa 0.55 4 (2.49E-03) 6F2D (Salmonella)
hyaB hydrogenase 1 large subunit 597 aa 1.34 6 (2.60E-03)
glpR Transcriptional repressor, glycerol-3-phosphate utilization genes 253 aa 0.57 4 (2.76E-03)

Fig. 1.


Functional modules whose genetic components are mutated more often than expected by chance in E. coli evolved under glucose limitation. A Glucose assimilation under glucose limitation. In glucose-limited chemostats, glucose diffusion across the outer membrane into the periplasmic space is facilitated by glycoporin LamB, and its transport from the periplasmic space into the cytoplasm occurs via the inner membrane transport complex MglBAC. LamB expression is regulated by RNA chaperones Hfq and ProQ, DNA-binding protein OmpR, and transcriptional regulators RpoD (σ70) and RpoS (σ38). B OPG biosynthesis. Transport of cytoplasmic UDP-glucose and its assembly into osmoregulated periplasmic glucans (OPGs) requires OpgG and OpgH, which are frequently mutated in our experiments. C LPS trafficking. Transport and secretion of lipopolysaccharide (LPS) to the outer cell surface requires the Lpt complex, elements of which are targets of selection. D Cell surface appendages. Proteins in functional modules required to construct appendages used in motility (flagella, left) and attachment (fimbriae, right) are frequently mutated. (OM=Outer membrane, IM=Inner membrane, Magenta hexagons=Glucose, Boldface=Proteins discussed at length in the Results)

Regulatory proteins

Under resource limitation, regulatory mutations that influence glucose uptake and conservation offer high-value targets for selection

Among the functional modules most frequently mutated in our experiments are those required to scavenge glucose when it is the limiting substrate. One such module is organized around the diffusion of glucose across the E. coli outer membrane; another is organized around transport of glucose across the inner membrane (Fig. 1A). When glucose is limiting, E. coli accesses this substrate via proteins associated with the movement of galactose and maltose [13, 25, 31, 32, 38–41], notably the high-affinity outer membrane maltose/glucose porin, LamB [42]. While no mutations were observed at the lamB locus itself, our data were enriched in mutations likely to alter the activity of four lamB effectors: malT, malK, rho, and hfq, all of which were enriched for clustered mutations (Table 1). Because each of these effectors represses lamB expression, either directly (e.g., rho) or indirectly (e.g., malK), and because all 87 de novo alleles are either nonsense or missense mutations in key functional domains, all likely de-repress lamB transcription ([25] and references therein), enhancing cells’ capacity to scavenge limiting glucose.

Another component of this functional module is built around regulating expression of the D-galactose/methyl-β-D-galactoside transporter MglBAC, which under glucose limitation moves glucose from the periplasmic space across the inner cell membrane [25, 43] (Fig. 1A). As with LamB, we observed no mutations in the MglBAC uptake system itself. However, galS, a key effector of mglBAC expression, is the most frequently mutated gene in our population sequencing dataset, though these mutations show no evidence of clustering (Table 1). The DNA-binding protein GalS negatively regulates mglBAC transcription [44]. Because every new galS allele is either a nonsense or missense mutation, all could be expected to diminish GalS deoxyribonucleic acid (DNA) binding affinity, resulting in mglBAC de-repression, which would enhance cells’ ability to assimilate limiting glucose. The spectrum and evolutionary dynamics of mutations arising in the functional modules depicted in Fig. 1A are discussed at length in our previous communication, as are the structural consequences of mutations in galS, malT, malK, and rho [25]. Below, we discuss the structural and functional consequences of non-random mutations in other lamB effectors, RNA chaperones hfq and proQ, response regulator ompR, RNA polymerase holoenzyme components rpoA and rpoS, as well as in the putative protein chaperone gatZ.

RNA chaperone Hfq is a target of recurrent mutation under glucose limitation

One of the most frequently mutated genes in our population sequencing dataset, hfq, is also among the most frequently mutated genes in our clonal sequencing dataset and shows multiple mutational clusters ([25] and Table 1). All but one of the de novo hfq alleles are missense mutations, with examples of the same residue being repeatedly mutated within the same experimental population (e.g., Pro64Thr and Pro64Gln), or the same residues being mutated in all experimental populations (e.g., Arg17Leu, Gly29Cys) (Fig. 2A, Additional File 2: Table S1 and [25]). hfq encodes the RNA-binding protein Hfq, which regulates post-transcriptional small RNA (sRNA)/mRNA interactions that modulate cellular processes ranging from central metabolism and amino acid biosynthesis to peptidoglycan biosynthesis, motility, and cell division ([45–47] and refs therein). Hfq also acts as a general stress response regulator by interacting with mRNAs that encode alternative RNA polymerase sigma factors σ24 (RpoE), σ32 (RpoH), and RpoS, each of which can substitute for housekeeping sigma factor RpoD to produce RNA polymerase holoenzyme required to initiate transcription [48]. Each σ-factor controls the expression of a different set of genes by binding to specific consensus sequences located near positions −10 and −35 upstream of the transcriptional start site.

Fig. 2.


Mutations in RNA chaperone Hfq (N= 24, P=6.91E-40). A Location of mutations on the primary structure of the RNA-binding protein, Hfq. Residues reported to be involved in RNA binding include: Gln8, Phe39, Lys56, and His57 [53]. Mutations occurring at amino acid positions 17, 26, 29, 31, 32, 60, 62, and 64 were recorded in more than one chemostat. B Rotated 3-D image of the ancestral genotype with missense mutations located on the distal face (in residues 26, 29, 31, 32, 52, 60, 62 and 64) highlighted in yellow, and Arg17Leu (located on the rim) highlighted in red. Alternating subunits of Hfq are in different shades of grey

E. coli Hfq is a 102 amino-acid protein with a disordered N-terminal domain (aa 1–6), a core Sm-like domain (aa 7–65) and an unstructured C-terminal tail (aa 66–102) [49–52] (Fig. 2A). The active protein is composed of six monomers organized into a ring-like structure having three surface domains: the distal face, lateral face (or rim), and proximal face (Fig. 2B) [51, 53, 54]. The distal face of each Hfq monomer contains a tripartite motif with an adenine-binding groove (A-site), a purine interaction site (R-site) and a nonselective site (E-site) (Fig. 3B). This motif has strong affinity for (ARN)x nucleotide repeats frequently found in the 5′ UTR of mRNAs such as the rpoS mRNA leader [55]. Lateral face residues on Hfq help facilitate sRNA-mRNA annealing, while those on its proximal face bind AU-rich tail regions of sRNAs [56–59]. For example, Hfq promotes RpoS translation at suboptimal E. coli temperatures by facilitating interaction between rpoS mRNA on the distal face with DsrA bound to the proximal face, leading to exposure of an obscured rpoS ribosome binding site [59–65].

Fig. 3.


Hfq modulates translation of stationary phase sigma factor (σS) by directly interacting with A-rich regions of the rpoS leader sequence. A Distal view surface representation of three Hfq subunits bound to A7 oligonucleotide representing the A-rich region of the rpoS mRNA leader. Unchanged residues are colored grey and residues affected by mutations are colored as follows: Leu26=green, Gly29=blue, Lys31=red, Leu32=yellow, Gln52=cyan, Ser60=pink, Val62=white, Pro64=orange. Hydrogen bonds (3.5 Å) between the A7 oligonucleotide and Hfq are depicted as green dashed lines. B Close up view of adenine nucleotides interacting with the A-site and R-site. Residues colored as in panel A

We observed a total of 24 mutations in hfq, 14 of which were unique (Table 1 and Additional File 2: Tables S1 and S2). Every unique mutation occurred within hfq’s core Sm-like domain. Most were concentrated on the distal face (Leu26Phe, Gly29Cys, Lys31Asn, Leu32Met, Gln52His, Ser60Tyr, Val62Phe, Pro64Thr, Pro64Gln), and some (Pro64Gln) arose independently multiple times (Fig. 3A, B and Additional File 2: Tables S1 and S2) [59, 66]. Specific changes at many of these residues (Gly29Cys, Lys31Asn, Leu32Met, Gln52His, and Ser60Tyr) (Fig. 2B, Fig. 3A) have previously been implicated in Hfq’s interaction with A-rich RNA molecules as well as with ADP and ATP [60]. A secondary RNA and ADP binding site is located on the rim of the Hfq hexamer and includes charged residues Arg16, Arg17, and Arg19 [67]. A mutation at Arg17 (Arg17Leu) arose independently in all three chemostats.
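The observation that every named substitution falls within the Sm-like core can be checked mechanically by mapping residue numbers onto the domain boundaries given above. The following is a minimal sketch: the domain coordinates and mutation labels are taken from the text, only the substitutions named explicitly here are listed (not all 14 unique alleles), and the helper names are hypothetical.

```python
import re

# Hfq domain boundaries from the text (1-based, inclusive residue numbers).
HFQ_DOMAINS = [
    ("N-terminal domain", 1, 6),
    ("Sm-like core", 7, 65),
    ("C-terminal tail", 66, 102),
]

# Substitutions named explicitly in the text.
HFQ_MUTATIONS = [
    "Arg17Leu", "Leu26Phe", "Gly29Cys", "Lys31Asn", "Leu32Met",
    "Gln52His", "Ser60Tyr", "Val62Phe", "Pro64Thr", "Pro64Gln",
]

# Three-letter residue, position, then three-letter residue or '*' for a stop.
MUTATION_RE = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)$")

def residue_position(mutation: str) -> int:
    """Extract the residue number from a label such as 'Arg17Leu'."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation}")
    return int(match.group(2))

def domain_of(position: int) -> str:
    """Return the name of the Hfq domain containing the given residue."""
    for name, start, end in HFQ_DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} lies outside the 102-residue protein")

domains_hit = {m: domain_of(residue_position(m)) for m in HFQ_MUTATIONS}
for mutation, domain in domains_hit.items():
    print(f"{mutation}: {domain}")
```

The same mapping could be applied to any protein whose domain boundaries are known, which makes it straightforward to ask whether de novo alleles avoid particular regions, as they do the Hfq proximal face and C-terminal tail.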

Because the majority of hfq mutations precisely delineate the binding site of the rpoS leader (A7 oligonucleotide), these mutations likely affect regulation of RpoS translation by the small non-coding RNA DsrA (Fig. 3A). Amino acid changes in, or adjacent to, the A- and R-sites of the ARN binding motif may disrupt the base stacking and hydrogen bond formation needed for Hfq to interact with the A-rich rpoS leader (Fig. 3B) [60]. Hfq mutations resulting in diminished translation of RpoS are likely to increase lamB transcription by reducing RpoS competition with RpoD for core RNA polymerase [20] (Fig. 1A). In addition, Hfq collaborates with antisense sRNA MicA to downregulate lamB expression [55]; disruption of this regulatory circuit is likely to increase levels of glycoporin LamB (Fig. 1A). Strikingly, no mutations were detected that altered residues either on the hexamer’s proximal face or in the C-terminal tail, suggesting there may be adaptive constraints on Hfq evolution in this environment.

RNA chaperone ProQ is also repeatedly mutated under glucose limitation

ProQ was originally identified as a non-essential osmoregulatory factor required to optimally express proline channel protein ProP [68, 69]. ProQ was later shown to be a major RNA-binding regulatory protein that has both strand exchange and RNA duplexing activities [70]. Multiple RNA targets for ProQ have been proposed, with many sRNA species co-precipitating with this protein [71, 72]. Deuteration protection assays reveal specific regions where ProQ preferentially binds different sRNAs [73]. E. coli ProQ consists of an N-terminal domain, spanning residues 1–130, that is very similar to that of the ProQ paralog, FinO. This N-terminal domain is connected to a Tudor-like C-terminal domain (residues 180–232) by a 63 aa linker region. All three regions are proposed to bind RNA [73]. In the N-terminal domain seven positively charged residues (Arg32, Arg69, Arg80, Arg100, Lys101, Lys107, and Arg114; colored in yellow in Fig. 4B) form a patch that is highly protected in deuteration protection assays; this patch has been implicated in binding to the 3′UTRs of sRNAs that include the late stationary phase ncRNA SraB [73, 74] and DNA-damage inducible ncRNA SraB [75].

Fig. 4.


Mutations in RNA chaperone ProQ (N=6, P=1.81E-05). A Protter diagram of ProQ protein. B Top: 3-dimensional model of ProQ N-terminal domain rotated 180°. Positively charged residues implicated in RNA binding (Arg32, Arg69, Arg80, Arg100, Lys101, Lys107, Arg114) are indicated in yellow [73]. Mutations arising in our evolution experiments include Arg80Leu (magenta), Gly85Val and Leu103Pro in red (top left); on the opposite side of the N-terminal domain are these mutations: Ala106Glu and Ser53Ile (red) and Cys88* (cyan) (top right). Bottom: 3-dimensional model of ProQ C-terminal domain rotated 180°. Ala227Glu is adjacent to Arg226 in the proposed RNA binding patch (red; bottom left), while Gly189Val and Ala203Asp (cyan; bottom right) are on the opposite side of the molecule and are not in the vicinity of any positively charged residues

Like hfq, the RNA chaperone proQ is mutated more often than expected by chance (Table 1, Fig. 4, and Additional File 1: Figure S1). While we did not detect significant clustering of proQ mutations, we did find in the N-terminal patch substitution of a non-polar for a positively charged amino acid (Arg80Leu) as well as two other missense mutations (Gly85Val and Leu103Pro) (colored in red, Fig. 4A). Three additional mutated residues are located on the other side of the N-terminal domain adjacent to other charged residues: Ala106Glu and Cys88*(Gln) in the vicinity of Lys75 and Arg109, and Ser53Ile adjacent to Lys54 (Fig. 4B, left). In ProQ’s C-terminal domain one de novo mutation (Ala227Glu) is located adjacent to Arg226 in a proposed RNA binding patch (red, Fig. 4B, right), while two other mutations (Gly189Val and Ala203Asp) on the opposite side of the molecule are not in the vicinity of any positively charged residues (cyan, Fig. 4B, right). All these mutations are likely to destabilize the ProQ protein.

It is perhaps not mere coincidence that RNA chaperones Hfq and ProQ are both mutated far more often than expected by chance. Recent co-immunoprecipitation and RIL-seq data indicate that each chaperone can bind hundreds of target sRNAs and mRNAs [76]. In most instances, the two proteins bind different targets, with ProQ showing marked preference for sRNAs, although more sRNAs overall are bound by Hfq [77]. In scores of cases, the RNA-RNA interactomes of the two chaperones overlap, setting up the potential for ProQ and Hfq to compete for the same target, as they do for rybB sRNA and, to a lesser extent, for micA sRNA [76, 78]. Expression of both these σE-dependent sRNAs [79] is known to modulate expression of stationary phase transcription factor σ38/RpoS ([80]; and Fig. 1A), whose diminished activity has been repeatedly associated with increased fitness among E. coli evolved under glucose limitation [25, 30, 39, 81]. Also, as noted above, micA RNA bound to Hfq acts as a post-transcriptional repressor of LamB synthesis (also see [82]). It is tempting to speculate that MicA bound to ProQ may open an alternative route to LamB repression, which, if mutationally blocked, would prove adaptive under glucose limitation.

DNA-binding dual transcriptional regulator OmpR is recurrently mutated

Outer membrane permeability is an important determinant of adaptation to very low concentrations of glucose [38]. When glucose is non-limiting, it can enter the periplasm passively through outer-membrane porins OmpC and OmpF. However, differences in the relative amounts of these porins have also been observed during glucose-limited chemostat growth [31, 40, 83]. The regulation of OmpC and OmpF is complex; their relative expression is controlled in part by the EnvZ/OmpR two-component regulatory system that responds to changes in medium osmolarity or pH [84, 85]. OmpR is active either as a dimer or monomer and is composed of an N-terminal receiver domain and a C-terminal DNA-binding effector domain that also interacts with the RNA polymerase α subunit, RpoA [86–89]. When phosphorylated, OmpR’s interaction with RpoA is favored, and ompF and ompC expression is active but modulated in a reciprocal manner: i.e., when external osmolarity is high, ompF expression is favored over that of ompC, whereas when osmolarity is low the reverse is true [89, 90]. Mutations that affect porin regulation have been reported in both domains of OmpR, as well as in its cognate histidine kinase EnvZ and in RNA polymerase α-subunit RpoA [89]. OmpR-P is also involved in regulating a number of other genes including lamB, malE, flagellar master operon genes flhDC, curli production genes csgDEFG, and the small regulatory RNAs micF, omrA, and omrB (Fig. 1A) [78, 91–96]. While OmpR is not considered essential, its inactivation or deletion in the presence of constitutive malT mutations is lethal due to outer membrane changes that stem from LamB hyperaccumulation [78, 97]. This effect can be mitigated by expression of the LptB component of the LPS transporter, establishing a link between mutation of OmpR and MalT, LamB overexpression, membrane stress, and LPS transport [78], all of which appear to play roles in E. coli’s evolutionary adaptation to chronic glucose limitation (Fig. 1).

Although no mutation clusters were detected in ompR, this locus was mutated more frequently than would be expected by chance (Table 1). Two of the five mutations observed (Lys6Asn and Ala35Asp) occurred in the N-terminal receiver domain (Fig. 5A) while the remaining three (Ser174Arg, Pro179Thr, and Leu228Met) were in the C-terminal effector domain (Fig. 5B). When mapped onto a model of the OmpR receiver domain, the Ala35Asp mutation is close to the Asp55 phosphorylation residue and the rest of the catalytic triad (Asp11 and Asp12), suggesting that it may interfere with phosphorylation or transmission of the phosphorylation signal to the effector domain (Fig. 5A). Based on the OmpR crystal structure [98], the three effector domain mutations (Ser174Arg, Pro179Thr, and Leu228Met) occur at the end of the α1 helix, in the loop between helices α1 and α2, and at the N-terminal end of sheet β5 (Fig. 5B, magenta/red). OmpR DNA-binding activity is known to involve residues in α3 (Fig. 5B, green), while OmpR-RpoA binding occurs in the loops between helices α1 and α2, in helix α2, and in the loop between helices α2 and α3 (Fig. 5B, blue/magenta) [99–102]. The relative location of mutations Ser174Arg at the end of helix α1, Pro179Thr between helices α1 and α2, and Leu228Met in sheet β5 suggests that these de novo mutations are more likely to impact OmpR association with RpoA than its binding to DNA. OmpR mutants carrying a Pro179Leu allele demonstrate a transcription negative phenotype, but still can interact with EnvZ and bind DNA, further supporting the hypothesis that mutation at this residue interferes with RpoA binding [101]. Remarkably, as we discuss below, not only is OmpR mutated where it interacts with RpoA, but RpoA is also frequently mutated where it binds to OmpR. The fitness benefit of such mutations is clear: impaired OmpR function is known to result in constitutive expression of glycoporin LamB, the major route by which limiting glucose is transported across the E. coli cell wall under glucose-limiting conditions [78].

Fig. 5.

Location and frequency of mutations in the outer membrane protein regulator, OmpR (N=5, P=2.41E-4). A SWISSMODEL homology model of the N-terminal receiver domain of OmpR based on the structure of YycF (PDB 2zwn, sequence identity = 47.06%). Residues that form the active site (Asp11, Asp12, and Asp55) as well as a residue that, when mutated, affects transcriptional activation (R42) are shown in green. Residues that were mutated in this study (Lys6Asn and Ala35Asp) are colored red. B Ribbon diagram of the C-terminal effector domain of OmpR (aa 130-239, PDB 1OPC). Helix α3 contains DNA contact residues V203, R207 and R209 (green), which interact with thymine, guanine and the phosphate backbone in the major groove. OmpR and RNA polymerase α subunit (RpoA) interactions occur in the loop between helices α1 and α2, in helix α2 and in the loop between helices α2 and α3 (blue and magenta, respectively)

The α-subunit of RNA polymerase, RpoA, and stationary phase transcription factor, σ38/RpoS, are both recurrently mutated under glucose limitation

rpoA encodes the α-subunit of RNA polymerase, which consists of a C-terminal domain and an N-terminal domain connected by a flexible linker. Dimerization of RpoA, which is controlled by its carboxy-terminal domain, is required to assemble the RNA polymerase core complex [103]. rpoA was recurrently mutated in our replicate experiments (Table 1, Fig. 6, and Additional File 2: Table S1). One of the targeted residues, Pro322, was hit three times, in each case causing a change from a non-polar to a polar residue (Pro322Thr in chemostat 1 and Pro322Thr/Pro322Gln in chemostat 2) and was therefore recognized as a significant cluster by the NMC and iPAC algorithms (Table 1). Mutations in the C-terminal half of RpoA affect the activity of many different positive RpoA regulators including CRP, FNR, and OmpR (reviewed in [104]). Rather than exhibiting a general inhibitory effect, most of these mutations appear to be specific to certain regulators and clustered in discrete patches along the RpoA primary sequence. For example, mutations in the C-terminal 10 amino acids, in particular aa322 and 323, prevent transcription of the OmpR-controlled genes ompF and ompC [89, 104]. Expression phenotype can depend on the nature of the amino acid substitution. Pro322Ser (nonpolar to polar) mutations reduce expression of both porin genes, whereas Pro323Leu (nonpolar to nonpolar) reduces ompC transcription but negligibly affects that of ompF [89]. Because all three RpoA mutations we observed at residue 322 changed proline to a nonpolar amino acid (leucine in one instance and threonine twice), we hypothesize that in lineages containing these mutations only ompC expression is affected.

Fig. 6.

Mutations in proteins required for transcription initiation: α-subunit of the RNA polymerase core enzyme, RpoA (N=7, P=1.27E-05). RpoA consists of an alpha N-terminal domain that interacts with DNA-binding transcriptional dual regulator Crp at class II promoters plus an alpha C-terminal domain. A Location of mutations on the primary structure of the RpoA protein. Red and blue represent, respectively, functional domains 1 and 2. Amino acids that are filled represent a specific mutation that arose in the evolution experiments. B Rotated 3-D image of the ancestral protein with functional domains 1 and 2 colored in red and blue, respectively. Observed missense mutations are highlighted in yellow

As noted, the ancestral E. coli strain used for these evolutions contained several mutations likely to influence transcriptional regulation, in particular a nonsense mutation in housekeeping sigma factor, RpoD (Glu26*), a nonsense mutation in the stationary phase transcription factor, RpoS (Gln33*), and a nonsense suppressor mutation in glnX tRNA that suppresses amber, ochre, and opal mutations. Our three experimental populations accumulated an additional 9 rpoS mutations, far more than expected by chance (Table 1). Six of these were missense mutations, and two (Arg299Ser, Phe278Leu) were determined by iPAC to cluster significantly (Table 1, Fig. 7, and Additional File 2: Table S1). Both mutations occurred in domain 4 (262–315) that encompasses the RpoS DNA-binding region (288–307) with its helix-turn-helix motif. Because housekeeping RpoD regulates expression of E. coli glucose scavenging proteins and because RpoS and RpoD compete for the core RNA polymerase (RNAP), the fitness advantage that accrues to RpoS mutants under glucose limitation has long been attributed to reduced competition between RpoD and RpoS for RNAP (e.g., [81]), which would enable glucose scavenging under slow growth conditions. As the ancestor in our experiments contains N-terminal nonsense mutations in each sigma factor as well as a tRNA suppressor mutation capable of bypassing both, it may be that excess mutations in RpoS represent one of several mechanisms that serve to minimize RpoD-RpoS competition, ensuring maximal expression of genes involved in glucose transport and assimilation (see Fig. 1A).

Fig. 7.

Mutations in proteins required for transcription initiation: RpoS (N=9, P=9.71E-05). RpoS consists of four sigma-70 factor domains (DT1–4) that function in DNA binding and melting and that interact with RNA polymerase subunits RpoA, RpoB, and RpoC. A Location of mutations on the primary structure of the RpoS protein. Red, blue, magenta, and green highlighting represent domains 1-4, respectively. Amino acids that are filled represent a specific mutation that arose in the evolution experiments. B Rotated 3-D image of the ancestral genotype with functional domains 1-4 colored in red, blue, magenta, and green, respectively. Observed missense mutations are highlighted in yellow

A post-translational regulatory protein becomes a target of selection under glucose limitation

gatZ was recurrently mutated in our experiments, though these mutations were not clustered (Table 1, Additional File 1: Figure S2). While GatZ appears to lack catalytic activity, evidence suggests that it acts as a protein chaperone to ensure proper folding of GatY, tagatose-1,6-bisphosphate aldolase [105]. GatZ inactivation has been shown to be beneficial under anaerobic conditions and conducive to H2 production from glycerol [106]. Interestingly, recurrent mutations inactivating the gat operon have been shown to be beneficial among E. coli experimentally evolved in gnotobiotic mice [107–109]. In multiple instances, adaptive mutations were either IS insertions (79%) or short deletions (21%) in the gatYZABCD operon, whose gene products collectively allow for galactitol catabolism. All these mutations exerted polar effects and produced the same phenotype: the inability to metabolize galactitol. All conferred a fitness advantage [108], perhaps related to dispensing with an unnecessary and costly pathway. In this regard, it is noteworthy that a majority of gatZ mutations were either nonsense mutations or indels and that we also observed nonsense mutations in gatC and gatY, in all a total of 13 mutations in the module defined by this single operon (Additional File 2: Tables S1 and S2). Similar to [108], none of our mutant gat alleles ever went to fixation. On the other hand, none ever went extinct, and in two populations their final frequency was ~ 20%. While the role played by GatYZ under continuous glucose limitation has not been explored, it is noteworthy that these proteins reversibly interconvert D-tagatose 1,6-bisphosphate with the glycolytic intermediates D-glyceraldehyde 3-phosphate (G3P) and dihydroxyacetone phosphate (DHAP) [110]. Gat inactivation may therefore also serve as a mechanism to prevent diversion of limiting carbon to non-essential pathways. This hypothesis is consistent with fixation of a GatY nonsense mutation (G49*) in the predominant clone isolated from a previous E. coli evolution experiment performed under glucose limitation [32].

Multifunctional inner membrane proteins

Proteins required for synthesis of periplasmic glucans are more frequently mutated than expected by chance

Osmoregulated periplasmic glucans (OPGs), formerly termed membrane-derived oligosaccharides (MDOs), consist of 5–24 subunits of D-glucose connected by β-glycosidic linkages. As their name implies, periplasmic concentrations of OPGs vary inversely with extracellular osmolarity. To date, OPGs have been reported in four of six major subdivisions of the Proteobacteria, indicating that they may be essential components of the cell envelope across this group [111]. In E. coli, OPGs consist of 5–12 glucose residues; cellular OPG content can range from as much as 5% total cell dry weight in low osmolarity medium to as little as 0.5% dry weight in high osmolarity medium. OPG biosynthesis requires OpgH, which spans the inner membrane, and OpgG, which closely interacts with OpgH in the periplasm (Fig. 1B). OPG molecules can be decorated with phosphoglycerol, succinyl, and phosphoethanolamine residues by the OpgB, OpgC, and OpgE/OpgD proteins, respectively, but it is OpgH/OpgG that transport UDP-glucose out of the cytoplasm and form β-glycosidic linkages between glucose monomers. Recent data suggest that these inner membrane proteins have functions other than transport and catalysis, including the coordination of cell division and cell size via the intermediate UDP-glucose [112].

In our evolution experiments opgH was much more frequently mutated than expected by chance, being the third most frequently mutated gene in our population and clone sequencing datasets (Table 1). OpgH mutations were found to be significantly clustered by the Cluster Explorer (positions 333–457) and NMC (positions 334–512, inclusive) algorithms, with 18 of 31 independent mutations occurring in this region of the primary sequence. OpgH is predicted to have three cytoplasmic regions connected by eight transmembrane domains that span the inner cell membrane (Fig. 8) [113]. Currently, no solved structure exists for displaying the location of OpgH amino acid substitutions in 3-dimensional space. Nevertheless, our understanding of the protein’s basic topology makes it clear that mutations cluster within the middle cytoplasmic region, a region that exhibits features reminiscent of glucosyltransferases; the C-terminal portion of this region has been shown to be essential for catalytic activity [113]. Thus, missense mutations here may compromise assembly of the OPG backbone. They may also compromise transport of UDP-glucose to the periplasm, which would also impair OPG synthesis, but not due to a catalytic defect. Also, for reasons that are poorly understood, OPG assembly requires acyl carrier protein (ACP). While ACP necessarily interacts with one or more of the three cytoplasmic regions, the exact site of this interaction is not currently known. It is noteworthy that while opgH is a frequent mutational target, only a handful of de novo opgH alleles ever surpass 10% frequency in our replicate evolutions (Fig. 8). The most successful of these, Pro434Thr, attained 5% frequency by 200 generations and eventually rose to 78% by the end of the experiment.
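
To give a rough sense of how unlikely this concentration of mutations is, the following sketch computes a simple binomial tail probability for 18 of 31 mutations falling in the Cluster Explorer window (positions 333–457). This is not the Cluster Explorer, NMC, or iPAC procedure; it assumes that, under the null, each mutation lands uniformly and independently along the protein, and it assumes an OpgH length of 847 residues, a value that is not stated in the text and should be checked against the annotation in use. The function name binomial_tail is hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of observing at least k successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: OpgH length in residues; not given in the text.
PROTEIN_LENGTH = 847

# Values reported in the text: window boundaries and mutation counts.
WINDOW_START, WINDOW_END = 333, 457
MUTATIONS_IN_WINDOW = 18
TOTAL_MUTATIONS = 31

# Inclusive window length and the chance that one uniformly placed
# mutation falls inside it.
window_len = WINDOW_END - WINDOW_START + 1
p_window = window_len / PROTEIN_LENGTH

# Expected count under the uniform null, and the upper-tail probability.
expected_in_window = TOTAL_MUTATIONS * p_window
p_value = binomial_tail(MUTATIONS_IN_WINDOW, TOTAL_MUTATIONS, p_window)

if __name__ == "__main__":
    print(f"window length: {window_len} residues")
    print(f"expected mutations in window: {expected_in_window:.2f}")
    print(f"P(X >= {MUTATIONS_IN_WINDOW}): {p_value:.3g}")
```

Because the window was chosen after looking at the data, this tail probability overstates significance; dedicated clustering algorithms correct for that selection, which is one reason to treat the number as illustrative only.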

Fig. 8.

Mutations in proteins required for multifunctional inner membrane protein export: Osmoregulated periplasmic glucans protein, OpgH (N=31, P=8.74E-27). Location of de novo mutations in the OpgH primary structure. Amino acids that are filled represent a specific mutation that arose in the evolution experiments. Amino acid positions 370, 373, and 408 were each mutated in multiple chemostats. OpgH is a transmembrane protein, and the inner membrane is represented by a peach-colored bar

opgG is the first gene to be transcribed in the opgG/opgH operon and is OpgH’s periplasmic partner in the synthesis and placement of periplasmic glucans between the inner and outer membranes (Fig. 1B). Although opgG was also more frequently mutated than expected by chance (Table 1), no mutations were significantly clustered. All OpgG mutations were either missense or nonsense mutations, including 2 independent nonsense mutations at Glu81* (Fig. 9). While Glu147* eventually rose to 13% frequency, no other opgG allele exceeded 2.5% frequency across our replicate evolutions [25]. This outcome stands in contrast with a prior E. coli evolution experiment where an opgG nonsense mutation (E487*) became fixed in the predominant lineage isolated after 765 generations of continuous glucose limitation [32]. We also observed a novel allele in another inner membrane protein gene, opgC, which encodes a succinyl transferase. While this missense mutation (Gln64Lys) arose in a single population at 100 generations, it was one of only a handful of alleles across replicate experiments that ever went to fixation. Whether this allele was a driver or passenger mutation is not currently known.

Fig. 9.

Mutations in proteins required for multifunctional inner membrane protein export: Osmoregulated periplasmic glucans protein, OpgG (N=7, P=1.93E-04). A Location of de novo mutations on the OpgG primary structure. Amino acids that are filled represent the result of a specific mutation that arose during the evolution experiments. The mutation at amino acid position 81 occurred in two chemostats. B Rotated 3-D image of the ancestral genotype with observed missense mutations highlighted in yellow [143, 144]

The adaptive value that enables novel opg alleles repeatedly to spread from single mutant cells to > 2% of a population of 10⁹ cells remains obscure. Blocking the use of glucose as a structural element in the periplasm, as opposed to a source of carbon and energy for growth, could be construed as an energy conservation mechanism. Interestingly, in cells undergoing binary fission under nutrient-rich conditions, OpgH localizes to the nascent septum, where it suppresses assembly of the tubulin-like cell division protein FtsZ [112]. This activity delays cell division and enables cell size to increase. Under slow-growth, nutrient-limiting conditions, there may be a premium for abolishing or diminishing this interaction so as to promote cell division among smaller cells.

Lipopolysaccharide (LPS) assembly and transport

Genes whose products act in LPS assembly and transport are repeatedly mutated when glucose is limiting

Lipopolysaccharide is an essential component of the E. coli outer membrane, contributing to its structural integrity and providing a protective permeability barrier against a variety of stress factors, including antibiotics and detergents [114]. LPS itself has a tripartite structure, consisting of lipid A (the hydrophobic moiety that anchors LPS to the outer membrane), a core oligosaccharide, and an O-antigen made of repeating oligosaccharide units [114]. The LPS transport system consists of seven proteins (LptA, LptB, LptC, LptD, LptE, LptF, and LptG) that act in concert to extract LPS from the inner membrane then to transport it across the periplasmic space to the outer membrane, where it forms a layer ([114, 115]; Figs. 1C and 10). In our evolution experiments, three of these genes (lptC, lptD, which has significantly clustered mutations, and lptG) are mutated more frequently than expected by chance in both population and clonal sequencing data (Table 1, Additional File 1: Figures S3, S4, S5 and [25]), as is lapB/yciM, which encodes the LPS assembly protein and also has significantly clustered mutations (Additional File 1: Figure S6).

Fig. 10.

LPS transport is a target of selection under glucose limitation. A Ribbon view of the LPS transport system, showing how it extracts LPS from the inner membrane and transports it across the periplasmic space to the outer membrane (also see Figure 1C). B View of A, but from underneath. C A lateral view of LptD/LptE complex (PDB 4Q35 from Shigella flexneri). Hydrophobic residues forming a hydrophobic intra-membrane hole between N-terminus and C-terminus used for lipid A passage are colored in magenta (Trp180, a residue mutated in our evolution to Leu) and yellow (the rest of the residues: Phe203, Phe211, Phe218, Phe228, Leu760 and Leu763, residues in which mutations were not observed). D A view from underneath the LptD barrel (with LptE bound) into a passageway for core oligosaccharide and antigen A portion of LPS. Arg729, Glu733 and Leu736 mutated in our evolution are colored in red and located on β-strand 26 of the luminal gate. An animated 3D image showing de novo mutations in the LptD/LptE complex can be found in Additional File 3: Figure S13. Other LPS transport proteins that are mutated more frequently than can be explained by chance include LptC (N=4, P=1.02E-03), LptD (N=10, P=1.61E-05), LptG (N=7, P=2.24E-05), and LapB (N=7, P=3.64E-05) (see Additional File 1: Figures S3, S4, S5 and S6, respectively)

LptD and LptE form a complex responsible for the final step of transporting LPS to the outer membrane. Assembly protein LptD consists of two domains: the C-terminal half of the protein forms a β-barrel that spans the outer membrane and envelops the LPS assembly protein LptE, while the N-terminal domain is a part of the periplasmic bridge to the inner membrane [116–118]. The lipid A part of LPS is transported through the intra-membrane hole formed between the N-terminal and C-terminal portions of LptD, while the O-antigen and the core oligosaccharide are passed between β-strands 1 and 26, opening up the barrel of LptD’s C-terminal domain [119]. Both LptE and LptD are considered essential [120].

Crystal structures of the LptD/E complex have been resolved for E. coli (unpublished, reported as PDB 4RHB) as well as those for other Gram-negative bacteria [121]. We mapped de novo mutations in lptD onto structures PDB (Protein Data Bank) 4RHB (LptD C-terminus/LptE complex from E. coli) and PDB 4Q35 (LptD/LptE from Shigella flexneri [121]). Multiple nonsense mutations were located at the extracellular end of the barrel: a single nonsense mutation at Glu676, three independent nonsense mutations at Glu587, and two independent nonsense mutations at Glu618, the last of which is located adjacent to the periplasmic entrance to the barrel, near where LptE is inserted (see Fig. 10C, D as well as an animated 3-D image of the same structure in Additional File 3: Figure S13). It is important to keep in mind that all de novo nonsense mutations recovered in these evolutions may be partly suppressed by a tRNA suppressor mutation in the ancestral strain.

Missense mutations at Gln653 (mutated to His), Ser660 (mutated to Ile), and Ala687 (mutated to Ser) are all located on the same side of the barrel, whereas residues Arg729 (mutated to Leu), Glu733*, and Leu736 (mutated to Met) are accessible from the inside of the barrel’s lumen and are part of β-strand 26, close to where it unzips from β-strand 1. Trp180 (mutated to Leu) has been implicated as providing a hydrophobic intra-membrane exit from the lumen, formed by the N-terminus of LptD, which is used to transport O-antigen and Lipid A (colored in yellow in Fig. 10, Additional File 3: Figure S13). Trp180Gln is reported to be lethal in E. coli [119].

LptE stabilizes LptD at the membrane [118] and is required for proper LptD assembly at the membrane [122, 123]. While the observed number of mutations in LptE did not exceed the number expected by chance, both mutated residues in the mature protein (Ser88, Ser125) are located on the protein’s surface (Fig. 10C, D; Additional File 3: Figure S13). Moreover, they cluster with residues in LptE previously implicated in its direct interaction with the lumen of the LptD barrel (Thr86, Phe90, Phe123, Arg124, Met142, Arg150; [117]; Fig. 10). We speculate that decreased LPS biosynthesis may serve as an energy conservation measure under chronic nutrient limitation or that it may play a role in alleviating membrane integrity stress resulting from LamB overproduction [78, 97]. In these respects, it is noteworthy that missense (R165L) and nonsense (E164*) mutations in lptG were fixed in the predominant lineage isolated from an independent E. coli evolution experiment carried out under similar conditions [32].

Proteins required to construct cell surface appendages

Mutations in genes required for flagellar synthesis and activity

Twenty-four mutations were observed that are likely to impact flagellar synthesis and activity: 17 occurred in the fliFGHIJK operon, of which 11 were nonsense mutations, while 7 occurred in the fliMNOPQR operon, of which 2 were nonsense [25]. Of these twelve loci, three were mutated more frequently than expected by chance: fliG, fliH, and fliP, with fliG showing significantly clustered mutations (Table 1, Fig. 1D, and Additional File 1: Figures S7, S8, S9). fliP and fliH encode components of the flagellar export apparatus: FliP being a cytoplasmic ATPase (adenosine triphosphatase) and FliH being one of six integral membrane components. All five fliH alleles were transversions, 4 of which resulted in nonsense mutations (Glu37*, Glu62*, Glu104*, Cys220*); none of these alleles ever exceeded 13% frequency in our experimental populations [25].

FliG, FliM, and FliN form the C-ring of the flagellar motor switch ([124]; Fig. 1D). FliG consists of 331 amino acid residues organized in at least two discrete domains: one at the carboxy terminus, another in the middle of the protein. The C-terminal domain of ~ 100 residues is essential for flagellar rotation, but dispensable for flagellar assembly [125]. The flagellar rotor itself consists of ~ 25 FliG molecules that interact with one another and with FliM, which in turn interacts with the FliN C-ring protein. It is in the FliG middle domain where FliG:FliG and FliG:FliM interactions appear to occur [125]. In our experiments all five de novo fliG mutations were G→T transversions, four of which resulted in nonsense mutations (Glu19*, Glu19*, Glu174*, Glu177*) and one of which resulted in a missense mutation from a non-polar to a polar residue, Ala173Ser (Additional File 1: Figure S7). Three of these mutations were found to cluster between amino acids 172 and 177 by the Cluster Explorer and NMC algorithms (Table 1). The phenotype and severity of these C-terminal and middle domain nonsense mutations would depend on the efficiency with which the ancestral tRNA glnX suppressor enables full-length FliG protein to be translated. Finally, it should be noted that in addition to mutations in the fliMNOPQR operon, de novo mutations also occurred at greater than expected frequencies in the gene encoding the flagellar biosynthesis protein, FlhB, as well as in the gene encoding the flagellar assembly protein, FlgJ (Additional File 1: Figures S10 and S11; Additional File 2: Table S1 and [25]).
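
The distinction between transitions and transversions, and the way a single transversion can convert a glutamate codon into a stop codon, can be illustrated with a few lines of code. This is a minimal sketch for illustration only; it is not the variant-calling or annotation pipeline used in the study, the function name classify_snv is hypothetical, and bases are given on the coding strand.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

# Standard genetic code: the two glutamate codons and the three stop codons.
GLU_CODONS = ("GAA", "GAG")
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})


def classify_snv(ref: str, alt: str) -> str:
    """Classify a single-nucleotide change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # A transition stays within one chemical class (purine<->purine or
    # pyrimidine<->pyrimidine); a transversion crosses between classes.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# A first-position G->T change applied to each glutamate codon.
glu_first_position_g_to_t = ["T" + codon[1:] for codon in GLU_CODONS]

if __name__ == "__main__":
    print("G>T is a", classify_snv("G", "T"))
    for before, after in zip(GLU_CODONS, glu_first_position_g_to_t):
        kind = "stop" if after in STOP_CODONS else "sense"
        print(f"{before} -> {after} ({kind} codon)")
```

Both glutamate codons become stop codons (TAA or TAG) after a first-position G→T change, which is consistent with the preponderance of Glu-to-stop alleles among the nonsense mutations listed above.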

In E. coli motility is linked to growth rate through the flagellar master regulator FlhD4C2, and cells grown in glucose-limited chemostats at slow dilution rate (µ = 0.12 h⁻¹) produce fewer flagella than cells grown at high dilution rate (µ = 0.6 h⁻¹) [126]. This observation suggests a trade-off between resource investment in motility versus growth in a nutrient-poor environment, especially when that environment is well-mixed, as is the case in a chemostat [127]. Indeed, flagellar biosynthesis may draw upon as much as 2% of an E. coli cell’s biosynthetic capacity and 0.1% of its total energy expenditure [128]. Diminished expression of E. coli flagellar operons has also been seen in long-term evolution experiments carried out via serial dilution [129]. And, in previous chemostat experiments originating from the same ancestor used here, multiple mutations in fliF, fliH, fliI, and fliM were observed, with three (E59D, A62S, E178*) occurring in the dominant lineage at the motor switching and energizing component, fliM [32]. We speculate that recurrent mutations in flagellar operons may be due to two related factors: reduced investment in the motility apparatus among cells growing slowly (D = 0.2 h⁻¹) in a well-mixed, nutrient-poor environment, followed by selection on the motility apparatus because it is weakly expressed and nonessential under these conditions.

Mutations required for type 1 fimbriae secretion are frequently selected, inducing biofilm formation pathways

While biofilm formation was not an intended outcome of selection under continuous glucose limitation, it appears to be an adaptive strategy that enables low-frequency clones to persist. We observed biofilms in each of our replicate evolutions, and our genetic data reflect that observation. Type 1 fimbriae are required to maintain biofilm structures in E. coli [130], and expression of the fim operon (fimAICDFGH) encoding the structural components of type 1 pili is regulated by an invertible switch, fimS ([131, 132]; Fig. 11A). fimS controls type 1 pilin phase variation in E. coli; its orientation is set by two recombinases whose genes, fimB and fimE, lie immediately upstream of fimS. FimE protein is responsible for switching fimS from “Phase ON” to “Phase OFF” while FimB functions as a bi-directional recombinase [133–135]. In our experiments, fimS remained in the ancestral “off” configuration for the great majority of sequenced clones. However, we did observe a small minority of clones that had inverted fimS in the Phase ON orientation (see Methods), which is expected to result in transcription of the fim operon. We suggest that selection of the chromosomal inversion activating fimS primes “Phase ON” lineages for selection of additional fim operon mutations (e.g., fimH, Additional File 1: Figure S12 and discussion below).

Fig. 11.

Genes required for biofilm formation are mutated in glucose-limited E. coli populations. A The Fim switch and transcriptional regulation of the Fim operon. The region bracketed by filled triangles represents the invertible fim switch (fimS). When the switch is in the top configuration, the fim operon is transcribed (ON). When the switch is in the bottom configuration (inverted), the fim operon is not transcribed (OFF). B Microtiter plate with clones A1-6 through H1-6 from chemostat 2 (columns 1-6 are technical replicates of columns 7-12) grown, washed and stained with Crystal Violet dye (see Methods). The propensity to form biofilms positively correlates with the intensity of color. C Biofilm formation in clones with the activated form of the fim operon switch (fimS ON, in green) and the deactivated form (fimS OFF, in red), and various fimH genotypes. The propensity to form biofilms was evaluated in quadruplicate by measuring optical density at 595 nm. Clones with identical genotypes that arose independently in different chemostats are plotted separately. Evolved clones with the Fim operon turned ON differ significantly from wild-type clones whose Fim operon is turned OFF (p-value < 2.5E-09)

Fimbriated cells normally grow more slowly than those that do not produce fimbriae, especially at lower temperatures (our experiments were carried out at 30 °C not 37 °C) [136]. However, in a chemostat the growth rate cost of making fimbriae might be offset by the benefit of increased adherence to vessel walls, which would extend cells’ residence time and spatially segregate them from the larger, and more rapidly dividing, planktonic population. While simply turning on the fim operon likely provides some selective advantage, most sequenced clones with the fimS switch turned on (43 out of 55) possessed additional mutations in the fimH gene encoded within the fim operon (Fig. 1D and below). Mutant fimH clones tested strongly positive for biofilm formation in a microtiter plate assay, while wild type fimH clones did not (Figs. 11 and 12). In the 12 of 55 fimS “Phase ON” clones that lack additional fim operon mutations we sometimes observed other biofilm-relevant mutations. For example, three mutations (Pro30Thr, Ala102Glu, and Ala157Asp) were observed in matA, which encodes a transcription factor that exerts a dual regulatory function on the choice of planktonic vs. sessile lifestyle [137]. There is also a matA promoter mutation in 4 clones from chemostat 3 (G4, E1, D8, C6). How (or whether) these mutations affect matA expression is unknown (Additional File 2: Table S1).

Fig. 12.

E. coli FimH undergoes allosteric changes in response to lectin binding and shear stress that affect its propensity to adhere to the uroepithelium and promote biofilm formation. A Full length FimHFL consists of an N-terminal lectin domain (FimHLD) connected to a C-terminal pilin domain (FimHPD), which undergo conformational changes upon binding to mannosylated uroplakin 1a (UP1a), whose sugar moiety projects from the luminal side of the urothelium (green). FimHFL undergoes additional changes in response to shear stress (adapted from [136]). B FimH structure bound to n-heptyl-α-d-mannopyranoside in the low-affinity state (PDB 4XOE) (left), in the high-affinity state (middle), and in the presence of shear force (PDB 4XOB) (right). Asn33 and Gly73 are represented by magenta, Asp37, Gln41 and Ala106 by cyan, Ser62 by yellow. Mutants at each of these positions were recovered from our evolution experiments as well as from screens of uropathogenic strains. (Also see Additional File 4: Figure S14, an animated 3-D image of FimH showing the location of de novo mutations recovered in our experiments.)

FimH mutations arising in E. coli under glucose limitation in the lab recapitulate FimH mutations seen in pathogenic E. coli isolated in the clinic

fimH encodes a type 1 fimbrial adhesin that binds D-mannose [138] (Figs. 1D and 11). The majority of fimS “ON” clones contained additional mutations in FimH, with three residues in FimH (Asp37, Gln41, and Gly73) being recurrent, independent targets of mutation, suggesting that these mutations are adaptive. (Note: The amino acid numbering used here reflects cleavage of the signal peptide [139], and therefore does not match the numbering in Additional File 2: Table S1.) We also observed additional single mutations affecting two other residues (Asn33 and Ala106). Asn33His and Gly73Glu have both been previously observed as naturally occurring variants in uropathogenic strains (CI#7 and CI#4 respectively); both bind yeast mannan and human fibronectin [140]. FimH mutants containing either Asn33His or Gly73Glu exhibit higher affinity to monomeric mannose relative to an otherwise isogenic strain lacking those mutations [141]. Importantly, higher affinity to monomeric mannose, compared to tri-mannose structures, differentiates multiple uropathogenic strains from bowel isolates originating in healthy individuals, and has been shown to provide a selective advantage for urinary tract colonization in mice [141]. Similarly, Gly73Glu mutants were recovered in a screen for clones from a fimH mutant library that more readily agglutinate yeast and exhibit higher affinity for human fibronectin [142]. Last, Gly73 and Ala106 in FimH have been described as evolutionary hot spots in E. coli strains isolated from patients with Crohn’s disease. Gly73Arg, Gly73Glu, Gly73Ala, Gly73Trp (the exact change that we observed), and Ala106Trp substitutions have been seen among Crohn’s isolates able to bind human T84 intestinal epithelial cells and cause inflammatory bowel disorder [143]. Together, these observations show that our chemostat evolutions select for clinically relevant mutations in fimH related to biofilm formation.

To better understand how these mutations might result in a biofilm phenotype, we modeled how they affect FimH structure. The FimH protein consists of lectin and pilin domains that adopt different conformations, depending on the presence/magnitude of shear stress induced by flow of fluids [144, 145]. When the lectin domain is bound to mannose, the lectin and pilin domains are connected more rigidly in the absence of flow. However, in the presence of flow the connection between the two domains becomes more flexible, increasing ligand affinity [144]. We examined the locations of the observed mutated residues in the structures of both conformations (see Fig. 12; Additional File 1: Figure S12; and the animated 3-D image of the FimH structure in Additional File 4: Figure S14). Asn33 and Gly73 exhibit different accessibility between the two conformational states. In the relaxed, high-affinity conformation (with shear stress) they are exposed, while in the rigid, low-affinity conformation they are almost entirely buried (in magenta, Fig. 12B). This is similar to another residue (Ser62) located in the same vicinity, which has also been identified as polymorphic in multiple pathogenic E. coli strains (uropathogenic strain NU14 and neonatal meningitis isolates RS218 and IHE3034; [146]; Fig. 12B in yellow). While accessibility of the other residues that we observed as mutated (Asp37, Gln41, Ala106) does not vary between conformations, they do cluster together within the structure (Fig. 12B in cyan), suggesting they may impact activity of the FimH adhesin via a similar mechanism.

Conclusions

Much of what we understand about protein structure–function relationships has been inferred from the analysis of mutations, often by using screens or selections to separate viable from non-viable mutants under defined conditions. Now, we can use the evolutionary process itself to gain insight into these relationships. Microbial populations can be experimentally evolved under selection for hundreds or even thousands of generations in the laboratory, and sequenced either in their entirety or as individual clones. This approach not only expands our understanding of protein structure–function relationships, it also opens up possibilities for understanding which mutational targets are favored under different forms of selection, the distribution of their fitness effects, and how particular sets of mutations interact, synergistically or antagonistically, to create fitness landscapes. This approach, adaptive genetics, represents a significant advance in adaptive laboratory evolution (ALE), which has found successful application in industrial biotechnology and metabolic engineering [147].

Adaptive genetics reveals diverse mechanisms by which E. coli can enhance acquisition of limiting glucose

Across replicate evolution experiments we found recurrent mutation of genes whose products serve in functional modules that localize to the cell envelope, where a cell interacts with its external milieu (Fig. 1). One such module is centered on membrane-associated proteins that facilitate movement of limiting glucose across the outer and inner cell membranes. Consistent with prior studies [25, 30, 38, 39, 148], we found that the transport proteins themselves—LamB and MglBAC—were never mutated, or if they were, those mutants were either quickly purged from our experimental populations or never exceeded our detection limit (freq. ≥ 1%). Instead, the regulatory apparatus controlling these transport proteins’ expression was malleable. Under continuous resource limitation selection favors phenotypes that best scavenge limiting substrate. Thus, in one way or another, each recurrently mutated target in our experiments contributed to constitutive expression of LamB and MglBAC.

Functionally, most of these targets (galS, malK, rho, hfq, proQ, ompR, rpoS) act in MglBAC and LamB repressor pathways (Fig. 1). However, the mechanisms by which mutations at these genes derepress mglBAC and lamB expression differ markedly. For example, the most frequently mutated gene in our dataset is galS, whose gene product is a DNA-binding protein that acts specifically as an mglBAC transcriptional repressor. Likewise, MalK acts specifically as a transcriptional repressor of MalT, a lamB transcriptional activator. By contrast, rho encodes an RNA-binding protein that regulates gene expression by acting as a transcriptional terminator of nascent transcripts. (The structural location and evolutionary dynamics of de novo alleles at galS, malK, malT, and rho are discussed at length in our companion report [25].)

Defects in hfq and proQ also ultimately derepress lamB expression but do so by yet another mechanism. hfq and proQ encode RNA chaperones that collaborate with small non-coding RNAs (sRNAs) to target specific transcripts for post-transcriptional regulation. For example, global regulator Hfq collaborates with sRNAs MicA and DsrA to downregulate LamB and RpoS translation, respectively. RNA chaperone ProQ also acts as a regulator of translation, and the suite of sRNAs with which it interacts is known to overlap that of Hfq [76]. Mutations in the stationary phase transcription factor RpoS have long been known to confer a fitness advantage on E. coli growing under continuous nitrogen or continuous carbon limitation [20], as reduced expression of RpoS diminishes its competition with housekeeping transcription factor RpoD for binding with RNA polymerase. At steady state, the glucose level in a chemostat is vanishingly low [32]. But instead of mounting a costly stress response to bring cells into stationary phase, RpoS-deficient, RpoD-proficient E. coli continue to grow, albeit slowly, transcribing genes that include LamB and MglBAC [148]. Non-random mutation of outer membrane protein OmpR likely compensates for potentially lethal hyperaccumulation of LamB [78, 97].

Adaptive genetics shows us that E. coli’s first-level response to resource limitation is orchestrated almost entirely by regulatory mutations. These mutations occur in proteins that increase either LamB or MglBAC expression by virtue of their activity as transcriptional repressors (GalS, MalK), transcriptional activators (MalT, RpoS), transcriptional terminators (Rho), post-transcriptional regulators (Hfq, ProQ), or as regulators of protein packing at the outer membrane (OmpR). We again note that RpoS and RpoD both contain suppressed nonsense mutations in our ancestral strain, likely decreasing their overall expression, which in turn may affect the expression of downstream targets. It is entirely possible that this too influences the adaptive targets.

Adaptive genetics reveals diverse mechanisms by which E. coli can economize limiting glucose

We also discovered an excess of mutations in functional modules involved in the economy of limiting glucose, specifically its polymerization into cell envelope macromolecules used to strengthen the periplasm (OPGs) and to protect the outer membrane (LPS). In E. coli, osmoregulated periplasmic glucans are polymers of 5–12 D-glucose monomers connected by β-glycosidic linkages. Assembly of OPGs in the periplasmic space requires collaboration between OpgG and OpgH, both of which were recurrently mutated in our experiments far more often than expected by chance. Mutations that restrict allocation of glucose to structures that can account for as much as 5% of biomass dry weight seem an obvious resource conservation strategy. Moreover, data now indicate that membrane-spanning protein OpgH also moonlights as a regulatory protein. OpgH localizes to the E. coli septal site, sequesters tubulin-like cell division protein FtsZ, and helps enforce minimal size as a prerequisite to cell division [112]. In Caulobacter crescentus, OpgH is not only essential for OPG biogenesis but also helps regulate morphogenesis [149]. The extraordinary number of non-random mutations observed at opgH suggests they could provide a two-fold advantage: boosting the economy of limiting glucose and enabling E. coli to divide at a lower-than-normal cell size.

Another mechanism by which cells could economize limiting carbon would be to limit its allocation to lipopolysaccharide, which by mass is approximately two-thirds sugar [150]. E. coli LPS composition, including that of its core oligosaccharide, differs by serotype. The ancestral strain used in our experiments is a derivative of E. coli K12, whose LPS core is composed not only of glucose, but also of galactose, heptose, rhamnose, N-acetyl-d-glucosamine, and 3-deoxy-d-manno-oct-2-ulosonic acid [151]. Structural genes involved in LPS assembly (lapB) and transport (lptC, lptD, lptG) were recurrently mutated in our evolution experiments, and the clustering of missense mutations in the proteins they encode suggests they would compromise LPS assembly and transport. Interestingly, while the number of mutations failed to meet our non-random threshold for inclusion in Table 1, we also recorded eight independent occurrences of four missense alleles in the LPS regulatory protein YciM (Ala183Ser, Gln278Lys, Ser297Ile, and Asn586Lys); one (Gln278Lys) reached > 19% frequency in one of our evolving populations [25]. Recent data have shown that YciM regulates LPS biosynthesis by controlling proteolysis of LpxC, the enzyme that controls the first committed step in this process [152, 153], by competing with FtsH for LapB to inhibit LpxC degradation [154]. Finally, we recorded ten independent occurrences of three missense mutations in LpxC itself (Gly152Val, Gln202His, Cys207Phe), all of which occurred in the same experimental population [25]. The sheer number of recurrent structural and regulatory mutations in LPS biogenesis points to the value of this process as a target of selection. However, as few of these variants ever reached high frequency, their fitness values appear to be either limited or transitory.

In addition to diminishing carbon allocation to OPG and LPS, cells can also economize by dispensing with costly pathways that are non-essential under our evolutionary conditions. One such pathway is galactitol metabolism, which is mediated by proteins encoded by genes in the gat operon (gatYZABCD). These genes encode a galactitol-specific phosphoenolpyruvate transferase (PTS) active transport system as well as a dehydrogenase and aldolase by which galactitol is brought into glycolysis at the D-glyceraldehyde 3-phosphate/dihydroxyacetone phosphate (G3P/DHAP) branch point [110]. The final step in the reaction sequence, catalyzed by GatY, is reversible, opening the possibility for limiting carbon to be diverted from glycolysis. We observed 13 mutations in the gat operon, 9 of which recurred in the regulatory protein GatZ, whose chaperone activity is required for GatY to fold and exhibit catalytic activity. Three independent mutations were observed in GatY itself, including one nonsense mutation.

Finally, E. coli can also economize limiting carbon and energy by dispensing with structures that may not be under strong selection among slow-growing cells in a well-mixed environment. Foremost among these structures are flagella, whose biosynthesis and rotation costs have been estimated to consume ~10% of the bacterium’s entire energy budget [155]. Because a chemostat is homogeneous with respect to the spatial distribution of resources, cells are not under selection to be chemotactic. Moreover, cells in our evolution experiments are growing slowly (D = 0.2 h⁻¹), and previous studies have shown that under nutrient limiting conditions, slow-growing cells produce far fewer flagella than fast-growing cells [126]. Thus, it is perhaps unsurprising that we observed recurrent mutations in the two major fli operons. In every instance, these occurred in structural genes that encode components of the flagellar apparatus (e.g., FliG) or proteins required for its assembly (e.g., FlhB and FlgJ). Many (13/24) were nonsense mutations, whose severity would depend on the efficiency of the ancestral glnX nonsense suppressor. Mutations in flagellar biosynthesis were not only recurrent, but also arose in each of our replicate experiments; one, in flagellar export protein FliO (Val82Phe), was one of the few de novo mutations that went to fixation [25]. Similarly, in a prior evolution experiment, also carried out under glucose limitation starting with the same ancestor, three mutations arose in FliM (E59D, A62S, E178*), one in FliF (A28A), and one in FliH (E41D); all were fixed in the numerically dominant clone isolated from the terminal population [32]. The material and energetic savings that accrue to cells dispensing with motility are obvious and likely provide a selective advantage to cells evolving in chemostats. Recent data have shown that energy-intensive processes like flagellar motility can be mutagenic [156] owing to the reactive oxygen species they generate. Because our ancestral strain harbors a nonsense mutation in base excision repair glycosylase MutY, mutations that disable any mutagenic process are likely to provide a fitness advantage.

Adaptive genetics reveals unexpected targets of selection, some of which have clinical correlates

The selective advantages that accrue to the mutational targets described above derive from either enhanced acquisition of limiting resources or from improvements in their economy. All make sense in the selective environment of a nutrient limited chemostat. Ideally, such an environment exhibits neither temporal nor spatial variation, and populations consist of homogenously distributed planktonic cells at steady state, with equal numbers of cells being born and being washed out per unit time. However, “life finds a way” to confound this ideal environment, often in the form of wall growth [157], which effectively prolongs adherent cells’ residence time. Unlike the mutations recurring in OPG, LPS, galactitol, and flagellar metabolism that conserve limiting resources, mutations that enable cells to adhere to reactor walls require their expenditure, here in the production of type 1 pilin proteins.

Wall growth niche specialists arising in our experiments owe their phenotype to two distinct types of mutations that appear to occur in succession. First, a small fraction of cells undergoes phase variation via regulatory change at the invertible phase switch locus fimS, and are switched from ancestral “OFF” to evolved “ON,” which activates expression of the fim operon (fimAICDFGH). Next, “Phase ON cells” expressing type 1 pilin components undergo further selection for mutants with augmented adherence. In particular, a large number of recurring mutations clustered at fimH, the structural gene encoding mannose-binding fimbrial adhesin. Among these, we found multiple FimH mutations identical to those recovered from clinical E. coli isolates, notably highly adherent (and therefore virulent) uropathogenic strains and strains isolated from Crohn’s disease patients [143].

Regulatory genes are frequently high value targets of selection

Perhaps the most striking feature of our dataset is the frequency with which regulatory genes are mutated, compared to the structural genes that they regulate. The most dramatic example of this discrepancy is to be found in glucose assimilation (Fig. 1). While no mutants were detected in genes that encode the high-affinity uptake machinery (lamB and mglBAC), literally scores of mutations arose within genes and gene regulatory regions that either directly or indirectly regulate LamB and MglBAC expression. Two aspects of these results are especially noteworthy: the variety of gene regulatory mechanisms affected, and the way recurrent mutations cluster in each. High-value targets included proteins that act locally as transcriptional activators (e.g., MalT) or repressors (GalS, OmpR) as well as proteins that act globally as transcriptional activators by directing RNA polymerase to different consensus sequences (RpoS, RpoD). High-value targets also included transcription termination factors (Rho) as well as terminator regions (downstream of hfq). We also observed a higher-than-expected incidence of mutations in post-transcriptional regulatory proteins (Hfq, ProQ) as well as at loci regulating processes functionally downstream of glucose assimilation that determine how glucose is allocated. These include: (i) transmembrane protein OpgH, which not only acts in OPG assembly but also moonlights to regulate the minimal cell size required for cell division; (ii) post-translational regulator YciM, which modulates proteolysis of the enzyme catalyzing the first committed step in LPS biosynthesis; and (iii) post-translational regulator GatZ, which activates GatY to catalyze the reversible interconversion of G3P/DHAP and 1,6-BP. In most of these examples, independent mutations cluster in proteins’ substrate binding sites, likely compromising their activity.

That regulatory genes are high-value targets in our experiments is consistent with a large body of theoretical and empirical evidence suggesting that changing the expression of structural genes is more suitable for evolutionary “tinkering” (sensu [158]) than is changing their coding sequences. This argument has been used to explain, for example, the enormous phenotypic difference between humans and chimpanzees despite their genome sequences being ~96% identical [159, 160]. The core idea is that regulatory mutations are less likely than structural mutations to be pleiotropic and therefore less likely to adversely affect fitness, especially cis-regulatory mutations in morphologically complex eukaryotes [161]. Longstanding theory suggests such organisms pay a “cost of complexity” because the rate of adaptive evolution decreases as the degree of pleiotropy increases [162].

Cost of complexity notwithstanding, a general outcome of both in vivo and in silico experimental evolution is that bacteria more often adapt to environmental challenges by adjusting global regulatory networks than by altering the structure of genes and pathways [163]. Using connectivity as a proxy for pleiotropy, Ruelens et al. recently surveyed targets of selection in sequenced microbial evolution experiments; they discovered that non-disruptive single nucleotide polymorphisms in highly pleiotropic genes tended to yield the largest fitness benefits, especially in large populations [164]. In instances where early, large-effect mutations target global regulators, later structural mutations may compensate for their pleiotropic effects [165–167]. For example, a frequent target of selection in E. coli chemostat evolutions is RpoS, owing to the fact that activating the stress response trades off against nutrient acquisition when resources are scarce [148]. Stoebel et al. showed that replicate experimental evolution of an rpoS deletion mutant rapidly and repeatedly resulted in rewiring of the genome to allow for RpoS-independent expression of the osmotic stress response pathway encoded by the otsBA operon [168].

When bacteria evolve under intense selection pressures like antibiotic exposure, starvation, or high temperature, a frequent mutational target is the RNA polymerase complex (RNAPC), which is central to the coordination of all gene expression [169]. Remarkably, the most frequently altered residues in RNAPC tend to be those that are most highly conserved in RpoB and RpoC [170]. As with high-value targets in our dataset, adaptive RNAPC mutations tend to cluster within known functional domains and to be located near the active site. However, there is little overlap among mutations arising under different forms of “extreme” stress, an observation that may help explain their rarity in natural settings [170]. Our population and clone sequencing data uncovered hundreds of mutations, including several in assembly and initiation protein RpoA, but in no case did we observe de novo variants in the core proteins, RpoB or RpoC, indicating that a glucose-limited chemostat must be a benign condition relative to outright starvation or high temperature and/or be one that selects against novel rpoB and rpoC variants.

Adaptive genetics can be a tool to test and to develop structure-function hypotheses

Mutational patterns arising during laboratory evolution can be used to test hypotheses concerning gene structure–function. For example, all 19 unique missense alleles of the malT transcriptional activator cluster in predicted functional domains that govern MalT-MalK interactions; these in turn regulate expression of LamB, the principal route for glucose uptake under resource limitation [25]. Similarly, the most frequently arising mutations in RNA chaperone Hfq localize to the Hfq hexamer’s distal face, a region proposed to bind A-rich regions of rpoS mRNA, whose diminished translation has repeatedly been shown to be a high-value target for selection in slow-growing cells [20, 39]. Missense mutations in transcriptional terminator rho also occur near putative RNA-binding domains, and their reduced binding affinity could be expected to favor increased expression of genes regulated by Rho-dependent terminators like lamB, malK, and mglBAC, all of which play key roles in glucose scavenging. In each of the foregoing examples, adaptive genetics supports hypotheses originally formulated via molecular genetics [25].

Mutational patterns arising during laboratory evolution can also be used to generate hypotheses concerning gene structure–function, as well as to produce genetic tools needed to test those hypotheses. For example, the extraordinary number of mutations in OpgH leads us to hypothesize a dual role for this protein in economizing glucose and altering the minimum cell size required for division. The clustering pattern as well as the identity of specific OpgH mutations provides a guide to detailed mechanistic understanding of its putative moonlighting function as well as how that function may be related to its established role in OPG assembly. Likewise, the large number of FimH mutations, considered in the light of their clustering pattern, provides new genetic tools to understand the role that particular FimH mutations may play in increasing adhesion in clinical E. coli isolates. Finally, while our founding genotype allowed for the selection of adaptive nonsense mutations that, rather than resulting simply in truncated proteins, instead may decrease gene expression via nonsense suppression, not all adaptive targets were equal recipients of such mutations. For example, more than half of the adaptive mutations in FliG and FliH are nonsense mutations, while none of those in FliP, FlhB, FlgJ, or FimH are nonsense mutations. This suggests that adaptation can be achieved by expression modulation of some targets, but not of others, where the specific types of missense mutations are important. Our collection of mutants, contextualized by adaptation under continuous nutrient limitation, therefore provides a rich resource for future studies aimed at understanding their different physiological impacts and fitness values, singly or in combination with other adaptive alleles, in a common genetic background.

Methods

Details on chemostat experiments, population and clonal sequencing, variant calling, and identification of genomic features subject to an excess of mutations were described previously [25].

Generation of primary and tertiary protein structures

Primary protein structures were inferred using Protter v.1.0 [171] using Uniprot protein accessions to inform transmembrane positions and domains whenever possible. Tertiary protein structures were visualized using Polyview-3D [171]. The source of structural data for Polyview-3D was gathered from the Protein Data Bank (PDB). For PDB data to qualify for use in generating tertiary structures the designated species had to be Escherichia coli and the amino acid sequence had to be identical to that used previously to infer the protein’s primary structure.

Mutation cluster analysis

ClusterExplorer

For ClusterExplorer, residues were coded as exposed (e), and either conserved (c, identical to JA122) or mutated (m) for input into the ClusterExplorer program [35]. Clusters with uncorrected p-values < 0.05 were retained and tabulated in Table 1. Only those clusters with a corrected p-value (FDR) < 0.05 were considered significant and discussed further in the text. Cluster locations are identified by residue number relative to the translational start site for each protein.
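The two-tier filtering described above (uncorrected p < 0.05 for tabulation, FDR-corrected p < 0.05 for significance) can be sketched as follows. This is a minimal illustration and not the ClusterExplorer implementation: it assumes the FDR correction is the Benjamini-Hochberg procedure, which the text does not specify, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical uncorrected cluster p-values (not data from this study).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)

# Tier 1: clusters retained and tabulated (uncorrected p < 0.05).
tabulated = [i for i, p in enumerate(raw) if p < 0.05]
# Tier 2: clusters considered significant (assumed BH-adjusted p < 0.05).
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

With these made-up values, four clusters pass the uncorrected threshold but only the first two survive correction, which is why the tabulated set can be larger than the set discussed as significant.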

Nonrandom mutation clustering (NMC)

The NMC algorithm [36] implemented in R was used to identify significant clusters of missense substitutions relative to the ancestral JA122 amino acid sequence. Substitution matrices in which each mutation is coded as a 0 (ancestral) or 1 (derived) were used as input and clusters scored as significant if they had a Bonferroni-corrected p-value < 0.05.
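As an illustration of the input format and the multiple-testing step described above, the sketch below builds a 0/1 substitution matrix from aligned sequences and applies a Bonferroni correction. It does not reproduce the NMC algorithm itself; the sequences, p-values, and function names are hypothetical and chosen only to show the encoding.

```python
def substitution_matrix(ancestral, derived_seqs):
    """One row per sample, one column per residue: 0 = ancestral, 1 = derived."""
    rows = []
    for seq in derived_seqs:
        if len(seq) != len(ancestral):
            raise ValueError("sequences must be aligned to the ancestral length")
        rows.append([0 if a == b else 1 for a, b in zip(ancestral, seq)])
    return rows


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical 8-residue ancestral sequence and three sampled alleles.
ancestral = "MKTAYIAK"
clones = ["MKTAYIAK", "MKTDYIAK", "MKTAYIVK"]
matrix = substitution_matrix(ancestral, clones)

# Hypothetical cluster p-values; a cluster is kept if corrected p < 0.05.
corrected = bonferroni([0.01, 0.02, 0.5])
kept = [i for i, p in enumerate(corrected) if p < 0.05]
```

Only missense differences relative to the ancestor are encoded here, which matches the description of the matrices but leaves out any handling of indels or synonymous changes.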

Identification of Protein Amino Acid Clustering (iPAC)

The iPAC package [37] combines the NMC algorithm with structure information to search for mutations that cluster together in 3D space. Proteins for which structure information was available through PDB were analyzed in R [172] with iPAC using substitution matrices as described for NMC. FASTA protein files were mapped to PDB files using pairwise sequence alignment and positional information converted to 1D space using the Multidimensional Scaling (MDS) option.
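To make the dimensionality-reduction step concrete, the sketch below projects 3D residue coordinates onto a single axis by classical (Torgerson) multidimensional scaling, so that residues close in space receive nearby 1D positions. It is an illustrative pure-Python version under stated assumptions, not the iPAC code (iPAC is an R package): the coordinates are hypothetical, and the leading eigenvector is found by simple power iteration.

```python
import math


def classical_mds_1d(coords, iterations=500):
    """Project points onto one axis by classical (Torgerson) MDS."""
    n = len(coords)
    # Squared Euclidean distances between all pairs of points.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in coords]
          for p in coords]
    row_mean = [sum(r) / n for r in d2]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J (D^2 is symmetric).
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenvector of B.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all points coincide
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


# Hypothetical coordinates (in angstroms) for three residues on a line.
ca_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (9.0, 12.0, 0.0)]
embedding = classical_mds_1d(ca_coords)
```

For points that are exactly collinear, as in this toy input, the 1D positions preserve all pairwise distances; for a real folded protein the projection is only an approximation, and the sign of the axis is arbitrary.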

Biofilm formation assays

To quantitate biofilm formation, quadruplicate samples of each clone were inoculated into 100 µL of Luria Broth in microtiter plate format (Corning 3585) and grown overnight in a Tecan apparatus (Genios) at 30 °C. After 20 h incubation, cell suspensions were removed, the microtiter wells rinsed by submersion in water, then stained with 200 µL of 0.5% Crystal Violet solution. After removing the dye, wells were rinsed and dried by inversion, and the dye solubilized in 150 µL of 95% ethanol. Optical density was measured at 595 nm; data represent four biological replicates of each clone. OD595 was measured in quadruplicate in a Tecan plate reader and averaged for each well. Each clone was grown and measured four times.
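The per-clone summary implied by this assay (replicate OD595 readings reduced to a mean and a measure of spread) might be computed as in the sketch below. The clone labels and readings are hypothetical and are not data from this study; the statistical test behind the reported p-value is not described here and is not reproduced.

```python
from statistics import mean, stdev


def summarize_od595(readings):
    """Map clone -> (mean, sample standard deviation) of replicate OD595."""
    return {clone: (mean(vals), stdev(vals)) for clone, vals in readings.items()}


# Hypothetical OD595 readings, four replicates per clone.
readings = {
    "ancestor_fimS_OFF": [0.10, 0.12, 0.11, 0.09],
    "evolved_fimS_ON": [0.80, 0.95, 0.88, 0.91],
}
summary = summarize_od595(readings)

for clone, (avg, sd) in summary.items():
    print(f"{clone}: mean OD595 = {avg:.3f}, SD = {sd:.3f}")
```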

Supplementary Information

12915_2025_2331_MOESM1_ESM.docx (4.5MB, docx)

Additional file 1: Figure S1. Mutations in the gene encoding RNA chaperone, ProQ. Figure S2. Mutations in putative chaperone protein, GatZ. Figure S3. Mutations in LPS transport protein, LptC. Figure S4. Mutations in LPS assembly protein, LptD. Figure S5. Mutations in the LPS export system permease, LptG. Figure S6. Mutations in LPS assembly protein, LapB. Figure S7. Mutations in flagellar motor switch protein, FliG. Figure S8. Mutations in cytosolic flagellar export protein, FliH. Figure S9. Mutations in flagellar biosynthesis protein, FliP. Figure S10. Mutations in flagellar biosynthesis protein, FlhB. Figure S11. Mutations in flagellar assembly protein, FlgJ. Figure S12. Mutations in bacterial adhesion protein, FimH.

12915_2025_2331_MOESM2_ESM.xlsx (1.8MB, xlsx)

Additional file 2: Table S1: Variants Identified in Clonal Sequencing. Table S2: Variants identified in Population sequencing. Table S3: Ancestral state of de novo mutations discussed in narrative.

12915_2025_2331_MOESM3_ESM.pptx (725.4KB, pptx)

Additional file 3: Figure S13: Animated 3-D image of the Shigella flexneri LptD/LptE complex showing location of mutations arising in the Escherichia coli homolog under glucose limitation.

12915_2025_2331_MOESM4_ESM.pptx (1.7MB, pptx)

Additional file 4: Figure S14: Animated 3-D image of E. coli FimH showing location of mutations arising under glucose limitation.

Acknowledgements

The authors thank Emily Cook, Peter Conlin, Shelley Copley and two anonymous reviewers for their careful reading of the manuscript and thoughtful suggestions for its improvement. Logan Pierpont and Emily Cook designed Figure 1.

Abbreviations

E. coli

Escherichia coli

tRNA

Transfer RNA

LPS

Lipopolysaccharide

RNA

Ribonucleic acid

DNA

Deoxyribonucleic acid

FDR

False-discovery rate

NMC

Nonrandom mutation clustering

iPAC

Identification of Protein Amino Acid Clustering

sRNA

Small RNA

mRNA

Messenger RNA

UTR

Untranslated region

RNAP

RNA polymerase

OPGs

Osmoregulated periplasmic glucans

MDOs

Membrane-derived oligosaccharides

ACP

Acyl carrier protein

PDB

Protein Data Bank

ATPase

Adenosine triphosphatase

G3P

D-glyceraldehyde 3-phosphate

DHAP

Dihydroxyacetone phosphate

UDP

Uridine diphosphate glucose

BP/bp

Base pairs

RNAPC

RNA polymerase complex

MDS

Multidimensional scaling

Authors’ contributions

Conceived and designed the experiments: GS, MK, FR. Performed the experiments: MK and KS. Analyzed the data: KS, GS, MK, CRL, FR. Contributed reagents/materials/analysis tools: GS, FR. Wrote the paper: KS, MK, GS, FR. All authors read and approved the final manuscript.

Funding

This work was funded by NASA grant 80NSSC20K0621 (Exobiology) to FR and GS, and by NIH grant R35 GM131824 to GS.

Data availability

All raw sequencing data are available from the SRA under BioProject ID PRJNA517527 [173], which is associated with our companion paper [25].

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors consent to publication of this report, which contains no personal data.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Katja Schwartz and Margie Kinnersley contributed equally to this work.

Contributor Information

Gavin Sherlock, Email: gsherloc@stanford.edu.

Frank Rosenzweig, Email: frank.rosenzweig@biology.gatech.edu.

References

  • 1. Bentley A, MacLennan B, Calvo J, Dearolf CR. Targeted recovery of mutations in Drosophila. Genetics. 2000;156(3):1169–73.
  • 2. Pratt LA, Kolter R. Genetic analysis of Escherichia coli biofilm formation: roles of flagella, motility, chemotaxis and type I pili. Mol Microbiol. 1998;30(2):285–93.
  • 3. Ellis HM, Horvitz HR. Genetic control of programmed cell death in the nematode C. elegans. Cell. 1986;44(6):817–29.
  • 4. Bingham PM, Levis R, Rubin GM. Cloning of DNA sequences from the white locus of D. melanogaster by a novel and general method. Cell. 1981;25(3):693–704.
  • 5. Hartwell LH, Culotti J, Reid B. Genetic control of the cell-division cycle in yeast. I. Detection of mutants. Proc Natl Acad Sci U S A. 1970;66(2):352–9.
  • 6. Baetz KK, Krogan NJ, Emili A, Greenblatt J, Hieter P. The ctf13-30/CTF13 genomic haploinsufficiency modifier screen identifies the yeast chromatin remodeling complex RSC, which is required for the establishment of sister chromatid cohesion. Mol Cell Biol. 2004;24(3):1232–44.
  • 7. Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999;285(5429):901–6.
  • 8. McCluskey K, Wiest AE, Grigoriev IV, Lipzen A, Martin J, Schackwitz W, et al. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa. G3 (Bethesda). 2011;1(4):303–16.
  • 9. Lyons E, Freeling M, Kustu S, Inwood W. Using genomic sequencing for classical genetics in E. coli K12. PLoS One. 2011;6(2):e16717.
  • 10. Cooper VS. Experimental Evolution as a High-Throughput Screen for Genetic Adaptations. mSphere. 2018;3(3):e00121-18.
  • 11. Atwood KC, Schneider LK, Ryan FJ. Periodic selection in Escherichia coli. Proc Natl Acad Sci U S A. 1951;37(3):146–55.
  • 12. Lenski RE, Rose MR, Simpson SC, Tadler SC. Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2,000 generations. Am Nat. 1991;138:1315–41.
  • 13. Notley-McRobb L, Ferenci T. The generation of multiple co-existing mal-regulatory mutations through polygenic evolution in glucose-limited populations of Escherichia coli. Environ Microbiol. 1999;1(1):45–52.
  • 14. Good BH, McDonald MJ, Barrick JE, Lenski RE, Desai MM. The dynamics of molecular evolution over 60,000 generations. Nature. 2017;551(7678):45–50.
  • 15. Maddamsetti R, Lenski RE, Barrick JE. Adaptation, Clonal Interference, and Frequency-Dependent Interactions in a Long-Term Evolution Experiment with Escherichia coli. Genetics. 2015;200(2):619–31.
  • 16. Levy SF, Blundell JR, Venkataram S, Petrov DA, Fisher DS, Sherlock G. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature. 2015;519(7542):181–6.
  • 17. Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, Botstein D, et al. Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations. Nature. 2013;500(7464):571–4.
  • 18. Kvitek DJ, Sherlock G. Whole genome, whole population sequencing reveals that loss of signaling networks is the major adaptive strategy in a constant environment. PLoS Genet. 2013;9(11):e1003972.
  • 19. Wong A, Rodrigue N, Kassen R. Genomics of adaptation during experimental evolution of the opportunistic pathogen Pseudomonas aeruginosa. PLoS Genet. 2012;8(9):e1002928.
  • 20. Notley-McRobb L, King T, Ferenci T. rpoS mutations and loss of general stress resistance in Escherichia coli populations as a consequence of conflict between competing stress responses. J Bacteriol. 2002;184(3):806–11.
  • 21. Cooper VS, Lenski RE. The population genetics of ecological specialization in evolving Escherichia coli populations. Nature. 2000;407(6805):736–9.
  • 22. Cunningham CW, Jeng K, Husti J, Badgett M, Molineux IJ, Hillis DM, et al. Parallel molecular evolution of deletions and nonsense mutations in bacteriophage T7. Mol Biol Evol. 1997;14(1):113–6.
  • 23. Schmiedel JM, Lehner B. Determining protein structures using deep mutagenesis. Nat Genet. 2019;51(7):1177–86.
  • 24. Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, et al. Inferring protein 3D structure from deep mutation scans. Nat Genet. 2019;51(7):1170–6.
  • 25. Kinnersley M, Schwartz K, Yang DD, Sherlock G, Rosenzweig F. Evolutionary dynamics and structural consequences of de novo beneficial mutations and mutant lineages arising in a constant environment. BMC Biol. 2021;19(1):20.
  • 26. Helling RB, Vargas CN, Adams J. Evolution of Escherichia coli during growth in a constant environment. Genetics. 1987;116(3):349–58.
  • 27. Dykhuizen DE, Dean AM. Predicted fitness changes along an environmental gradient. Evol Ecol. 1994;8(5):524–41.
  • 28. Ferea TL, Botstein D, Brown PO, Rosenzweig RF. Systematic changes in gene expression patterns following adaptive evolution in yeast. Proc Natl Acad Sci U S A. 1999;96(17):9721–6.
  • 29. Tilman D. Resource Competition and Community Structure. Princeton: Princeton University Press; 1982.
  • 30. Ferenci T. What is driving the acquisition of mutS and rpoS polymorphisms in Escherichia coli? Trends Microbiol. 2003;11(10):457–61.
  • 31. Maharjan R, Seeto S, Notley-McRobb L, Ferenci T. Clonal adaptive radiation in a constant environment. Science. 2006;313(5786):514–7.
  • 32. Kinnersley M, Wenger J, Kroll E, Adams J, Sherlock G, Rosenzweig F. Ex uno plures: clonal reinforcement drives evolution of a simple microbial community. PLoS Genet. 2014;10(6):e1004430.
  • 33. Singaravelan B, Roshini BR, Munavar MH. Evidence that the supE44 mutation of Escherichia coli is an amber suppressor allele of glnX and that it also suppresses ochre and opal nonsense mutations. J Bacteriol. 2010;192(22):6039–44.
  • 34. Bharti N, Santos L, Davyt M, Behrmann S, Eichholtz M, Jimenez-Sanchez A, et al. Translation velocity determines the efficacy of engineered suppressor tRNAs on pathogenic nonsense mutations. Nat Commun. 2024;15(1):2957.
  • 35. Zhou T, Enyeart PJ, Wilke CO. Detecting clusters of mutations. PLoS One. 2008;3(11):e3765.
  • 36. Ye J, Pavlicek A, Lunney EA, Rejto PA, Teng CH. Statistical method on nonrandom clustering with application to somatic mutations in cancer. BMC Bioinformatics. 2010;11:11.
  • 37. Ryslik GA, Cheng Y, Cheung KH, Modis Y, Zhao H. Utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics. 2013;14:190.
  • 38. Ferenci T. Adaptation to life at micromolar nutrient levels: the regulation of Escherichia coli glucose transport by endoinduction and cAMP. FEMS Microbiol Rev. 1996;18(4):301–17.
  • 39. Ferenci T. The spread of a beneficial mutation in experimental bacterial populations: the influence of the environment and genotype on the fixation of rpoS mutations. Heredity (Edinb). 2008;100(5):446–52.
  • 40. Liu X, Ferenci T. Regulation of porin-mediated outer membrane permeability by nutrient limitation in Escherichia coli. J Bacteriol. 1998;180(15):3917–22.
  • 41. Manch K, Notley-McRobb L, Ferenci T. Mutational adaptation of Escherichia coli to glucose limitation involves distinct evolutionary pathways in aerobic and oxygen-limited environments. Genetics. 1999;153(1):5–12.
  • 42. Carreon-Rodriguez OE, Gosset G, Escalante A, Bolivar F. Glucose Transport in Escherichia coli: From Basics to Transport Engineering. Microorganisms. 2023;11(6):1588.
  • 43. Death A, Ferenci T. The importance of the binding-protein-dependent Mgl system to the transport of glucose in Escherichia coli growing on low sugar concentrations. Res Microbiol. 1993;144(7):529–37.
  • 44. Geanacopoulos M, Adhya S. Functional characterization of roles of GalR and GalS as regulators of the gal regulon. J Bacteriol. 1997;179(1):228–34.
  • 45. Leyes-Vence M, Roca-Sanchez T, Flores-Lozano C, Villarreal-Villareal G. All-Inside Partial Epiphyseal Anterior Cruciate Ligament Reconstruction Plus an Associated Modified Lemaire Procedure Sutured to the Femoral Button. Arthrosc Tech. 2019;8(5):e473–80.
  • 46. Melamed S, Peer A, Faigenbaum-Romm R, Gatt YE, Reiss N, Bar A, et al. Global Mapping of Small RNA-Target Interactions in Bacteria. Mol Cell. 2016;63(5):884–97.
  • 47. Sobrero P, Valverde C. The bacterial protein Hfq: much more than a mere RNA-binding factor. Crit Rev Microbiol. 2012;38(4):276–99.
  • 48. Guisbert E, Rhodius VA, Ahuja N, Witkin E, Gross CA. Hfq modulates the sigmaE-mediated envelope stress response and the sigma32-mediated cytoplasmic stress response in Escherichia coli. J Bacteriol. 2007;189(5):1963–73.
  • 49. Santiago-Frangos A, Jeliazkov JR, Gray JJ, Woodson SA. Acidic C-terminal domains autoregulate the RNA chaperone Hfq. Elife. 2017;6:e27049.
  • 50. Santiago-Frangos A, Kavita K, Schu DJ, Gottesman S, Woodson SA. C-terminal domain of the RNA chaperone Hfq drives sRNA competition and release of target RNA. Proc Natl Acad Sci U S A. 2016;113(41):E6089–96.
  • 51. Beich-Frandsen M, Vecerek B, Konarev PV, Sjoblom B, Kloiber K, Hammerle H, et al. Structural insights into the dynamics and function of the C-terminus of the E. coli RNA chaperone Hfq. Nucleic Acids Res. 2011;39(11):4900–15.
  • 52. Sun X, Zhulin I, Wartell RM. Predicted structure and phyletic distribution of the RNA-binding protein Hfq. Nucleic Acids Res. 2002;30(17):3662–71.
  • 53. Sauter C, Basquin J, Suck D. Sm-like proteins in Eubacteria: the crystal structure of the Hfq protein from Escherichia coli. Nucleic Acids Res. 2003;31(14):4091–8.
  • 54. Schumacher MA, Pearson RF, Moller T, Valentin-Hansen P, Brennan RG. Structures of the pleiotropic translational regulator Hfq and an Hfq-RNA complex: a bacterial Sm-like protein. EMBO J. 2002;21(13):3546–56.
  • 55. Link TM, Valentin-Hansen P, Brennan RG. Structure of Escherichia coli Hfq bound to polyriboadenylate RNA. Proc Natl Acad Sci U S A. 2009;106(46):19292–7.
  • 56. Panja S, Schu DJ, Woodson SA. Conserved arginines on the rim of Hfq catalyze base pair formation and exchange. Nucleic Acids Res. 2013;41(15):7536–46.
  • 57. Otaka H, Ishikawa H, Morita T, Aiba H. PolyU tail of rho-independent terminator of bacterial small RNAs is essential for Hfq action. Proc Natl Acad Sci U S A. 2011;108(32):13059–64.
  • 58. Sauer E, Weichenrieder O. Structural basis for RNA 3’-end recognition by Hfq. Proc Natl Acad Sci U S A. 2011;108(32):13065–70.
  • 59. Mikulecky PJ, Kaw MK, Brescia CC, Takach JC, Sledjeski DD, Feig AL. Escherichia coli Hfq has distinct interaction surfaces for DsrA, rpoS and poly(A) RNAs. Nat Struct Mol Biol. 2004;11(12):1206–14.
  • 60. Wang W, Wang L, Wu J, Gong Q, Shi Y. Hfq-bridged ternary complex is important for translation activation of rpoS by DsrA. Nucleic Acids Res. 2013;41(11):5938–48.
  • 61. Soper TJ, Doxzen K, Woodson SA. Major role for mRNA binding and restructuring in sRNA recruitment by Hfq. RNA. 2011;17(8):1544–50.
  • 62. Soper T, Mandin P, Majdalani N, Gottesman S, Woodson SA. Positive regulation by small RNAs and the role of Hfq. Proc Natl Acad Sci U S A. 2010;107(21):9602–7.
  • 63. Brescia CC, Mikulecky PJ, Feig AL, Sledjeski DD. Identification of the Hfq-binding site on DsrA RNA: Hfq binds without altering DsrA secondary structure. RNA. 2003;9(1):33–43.
  • 64. Majdalani N, Cunning C, Sledjeski D, Elliott T, Gottesman S. DsrA RNA regulates translation of RpoS message by an anti-antisense mechanism, independent of its action as an antisilencer of transcription. Proc Natl Acad Sci U S A. 1998;95(21):12462–7.
  • 65. Muffler A, Fischer D, Hengge-Aronis R. The RNA-binding protein HF-I, known as a host factor for phage Qbeta RNA replication, is essential for rpoS translation in Escherichia coli. Genes Dev. 1996;10(9):1143–51.
  • 66. Updegrove TB, Zhang A, Storz G. Hfq: the flexible RNA matchmaker. Curr Opin Microbiol. 2016;30:133–8.
  • 67. Stanek KA, Patterson-West J, Randolph PS, Mura C. Crystal structure and RNA-binding properties of an Hfq homolog from the deep-branching Aquificae: conservation of the lateral RNA-binding mode. Acta Crystallogr D Struct Biol. 2017;73(Pt 4):294–315.
  • 68. Smith MN, Crane RA, Keates RA, Wood JM. Overexpression, purification, and characterization of ProQ, a posttranslational regulator for osmoregulatory transporter ProP of Escherichia coli. Biochemistry. 2004;43(41):12979–89.
  • 69. Kunte HJ, Crane RA, Culham DE, Richmond D, Wood JM. Protein ProQ influences osmotic activation of compatible solute transporter ProP in Escherichia coli K-12. J Bacteriol. 1999;181(5):1537–43.
  • 70. Chaulk SG, Smith Frieday MN, Arthur DC, Culham DE, Edwards RA, Soo P, et al. ProQ is an RNA chaperone that controls ProP levels in Escherichia coli. Biochemistry. 2011;50(15):3095–106.
  • 71. Holmqvist E, Li L, Bischler T, Barquist L, Vogel J. Global Maps of ProQ Binding In Vivo Reveal Target Recognition via RNA Structure and Stability Control at mRNA 3' Ends. Mol Cell. 2018;70(5):971–82.e6.
  • 72. Smirnov A, Forstner KU, Holmqvist E, Otto A, Gunster R, Becher D, et al. Grad-seq guides the discovery of ProQ as a major small RNA-binding protein. Proc Natl Acad Sci U S A. 2016;113(41):11591–6.
  • 73. Gonzalez GM, Hardwick SW, Maslen SL, Skehel JM, Holmqvist E, Vogel J, et al. Structure of the Escherichia coli ProQ RNA-binding protein. RNA. 2017;23(5):696–711.
  • 74. Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EG, Margalit H, et al. Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol. 2001;11(12):941–50.
  • 75. Sass TH, Lovett ST. The DNA damage response of Escherichia coli, revisited: Differential gene expression after replication inhibition. Proc Natl Acad Sci U S A. 2024;121(27):e2407832121.
  • 76. Melamed S, Adams PP, Zhang A, Zhang H, Storz G. RNA-RNA Interactomes of ProQ and Hfq Reveal Overlapping and Competing Roles. Mol Cell. 2020;77(2):411–25.e7.
  • 77. Waters SA, McAteer SP, Kudla G, Pang I, Deshpande NP, Amos TG, et al. Small RNA interactome of pathogenic E. coli revealed through crosslinking of RNase E. EMBO J. 2017;36(3):374–87.
  • 78. Reimann SA, Wolfe AJ. Constitutive expression of the maltoporin LamB in the absence of OmpR damages the cell envelope. J Bacteriol. 2011;193(4):842–53.
  • 79. Nicoloff H, Gopalkrishnan S, Ades SE. Appropriate Regulation of the sigma(E)-Dependent Envelope Stress Response Is Necessary To Maintain Cell Envelope Integrity and Stationary-Phase Survival in Escherichia coli. J Bacteriol. 2017;199(12):e00089-17.
  • 80. Wassarman KM, Repoila F, Rosenow C, Storz G, Gottesman S. Identification of novel small RNAs using comparative genomics and microarrays. Genes Dev. 2001;15(13):1637–51.
  • 81. Farrell MJ, Finkel SE. The growth advantage in stationary-phase phenotype conferred by rpoS mutations is dependent on the pH and nutrient environment. J Bacteriol. 2003;185(24):7044–52.
  • 82. Iosub IA, van Nues RW, McKellar SW, Nieken KJ, Marchioretto M, Sy B, et al. Hfq CLASH uncovers sRNA-target interaction networks linked to nutrient availability adaptation. Elife. 2020;9:e54655.
  • 83. Nikaido H. Porins and specific channels of bacterial outer membranes. Mol Microbiol. 1992;6(4):435–42.
  • 84. Alphen WV, Lugtenberg B. Influence of osmolarity of the growth medium on the outer membrane protein pattern of Escherichia coli. J Bacteriol. 1977;131(2):623–30.
  • 85. Stincone A, Daudi N, Rahman AS, Antczak P, Henderson I, Cole J, et al. A systems biology approach sheds new light on Escherichia coli acid resistance. Nucleic Acids Res. 2011;39(17):7512–28.
  • 86. Igarashi K, Hanamura A, Makino K, Aiba H, Aiba H, Mizuno T, et al. Functional map of the alpha subunit of Escherichia coli RNA polymerase: two modes of transcription activation by positive factors. Proc Natl Acad Sci U S A. 1991;88(20):8958–62.
  • 87. Rhee JE, Sheng W, Morgan LK, Nolet R, Liao X, Kenney LJ. Amino acids important for DNA recognition by the response regulator OmpR. J Biol Chem. 2008;283(13):8664–77.
  • 88. Sharif TR, Igo MM. Mutations in the alpha subunit of RNA polymerase that affect the regulation of porin gene transcription in Escherichia coli K-12. J Bacteriol. 1993;175(17):5460–8.
  • 89. Slauch JM, Russo FD, Silhavy TJ. Suppressor mutations in rpoA suggest that OmpR controls transcription by direct interaction with the alpha subunit of RNA polymerase. J Bacteriol. 1991;173(23):7501–10.
  • 90. Aiba H, Mizuno T. Phosphorylation of a bacterial activator protein, OmpR, by a protein kinase, EnvZ, stimulates the transcription of the ompF and ompC genes in Escherichia coli. FEBS Lett. 1990;261(1):19–22.
  • 91. Gerken H, Charlson ES, Cicirelli EM, Kenney LJ, Misra R. MzrA: a novel modulator of the EnvZ/OmpR two-component regulon. Mol Microbiol. 2009;72(6):1408–22.
  • 92. Guillier M, Gottesman S. Remodelling of the Escherichia coli outer membrane by two small regulatory RNAs. Mol Microbiol. 2006;59(1):231–47.
  • 93. Ogasawara H, Yamada K, Kori A, Yamamoto K, Ishihama A. Regulation of the Escherichia coli csgD promoter: interplay between five transcription factors. Microbiology. 2010;156(Pt 8):2470–83.
  • 94. Shin S, Park C. Modulation of flagellar expression in Escherichia coli by acetyl phosphate and the osmoregulator OmpR. J Bacteriol. 1995;177(16):4696–702.
  • 95. Vidal O, Longin R, Prigent-Combaret C, Dorel C, Hooreman M, Lejeune P. Isolation of an Escherichia coli K-12 mutant strain able to form biofilms on inert surfaces: involvement of a new ompR allele that increases curli expression. J Bacteriol. 1998;180(9):2442–9.
  • 96. Pruss BM. Acetyl phosphate and the phosphorylation of OmpR are involved in the regulation of the cell division rate in Escherichia coli. Arch Microbiol. 1998;170(3):141–6.
  • 97. Reimann SA, Wolfe AJ. A critical process controlled by MalT and OmpR is revealed through synthetic lethality. J Bacteriol. 2009;191(16):5320–4.
  • 98. Martinez-Hackert E, Harlocker S, Inouye M, Berman HM, Stock AM. Crystallization, X-ray studies, and site-directed cysteine mutagenesis of the DNA-binding domain of OmpR. Protein Sci. 1996;5(7):1429–33.
  • 99. Kato N, Tsuzuki M, Aiba H, Mizuno T. Gene activation by the Escherichia coli positive regulator OmpR: a mutational study of the DNA-binding domain of OmpR. Mol Gen Genet. 1995;248(4):399–406.
  • 100. Martinez-Hackert E, Stock AM. The DNA-binding domain of OmpR: crystal structures of a winged helix transcription factor. Structure. 1997;5(1):109–24.
  • 101. Pratt LA, Silhavy TJ. OmpR mutants specifically defective for transcriptional activation. J Mol Biol. 1994;243(4):579–94.
  • 102. Russo FD, Slauch JM, Silhavy TJ. Mutations that affect separate functions of OmpR the phosphorylated regulator of porin transcription in Escherichia coli. J Mol Biol. 1993;231(2):261–73.
  • 103. Zhang G, Darst SA. Structure of the Escherichia coli RNA polymerase alpha subunit amino-terminal domain. Science. 1998;281(5374):262–6.
  • 104. Russo FD, Silhavy TJ. Alpha: the Cinderella subunit of RNA polymerase. J Biol Chem. 1992;267(21):14515–8.
  • 105. Brinkkotter A, Shakeri-Garakani A, Lengeler JW. Two class II D-tagatose-bisphosphate aldolases from enteric bacteria. Arch Microbiol. 2002;177(5):410–9.
  • 106. Tran KT, Maeda T, Sanchez-Torres V, Wood TK. Beneficial knockouts in Escherichia coli for producing hydrogen from glycerol. Appl Microbiol Biotechnol. 2015;99(6):2573–81.
  • 107. Barroso-Batista J, Pedro MF, Sales-Dias J, Pinto CJG, Thompson JA, Pereira H, et al. Specific Eco-evolutionary Contexts in the Mouse Gut Reveal Escherichia coli Metabolic Versatility. Curr Biol. 2020;30(6):1049–62.e7.
  • 108. Barroso-Batista J, Sousa A, Lourenco M, Bergman ML, Sobral D, Demengeot J, et al. The first steps of adaptation of Escherichia coli to the gut are dominated by soft sweeps. PLoS Genet. 2014;10(3):e1004182.
  • 109. Tsukimi T, Obana N, Shigemori S, Arakawa K, Miyauchi E, Yang J, et al. Genetic mutation in Escherichia coli genome during adaptation to the murine intestine is optimized for the host diet. mSystems. 2024;9(2):e0112323.
  • 110. Ha J, Kim D, Yeom J, Kim Y, Yoo SM, Yoon SH. Identification of a gene cluster for D-tagatose utilization in Escherichia coli B2 phylogroup. iScience. 2022;25(12):105655.
  • 111. Bontemps-Gallo S, Bohin JP, Lacroix JM. Osmoregulated Periplasmic Glucans. EcoSal Plus. 2017;7(2):10.1128.
  • 112. Hill NS, Buske PJ, Shi Y, Levin PA. A moonlighting enzyme links Escherichia coli cell size with central metabolism. PLoS Genet. 2013;9(7):e1003663.
  • 113. Debarbieux L, Bohin A, Bohin JP. Topological analysis of the membrane-bound glucosyltransferase, MdoH, required for osmoregulated periplasmic glucan synthesis in Escherichia coli. J Bacteriol. 1997;179(21):6692–8.
  • 114. Sperandeo P, Martorana AM, Polissi A. The lipopolysaccharide transport (Lpt) machinery: A nonconventional transporter for lipopolysaccharide assembly at the outer membrane of Gram-negative bacteria. J Biol Chem. 2017;292(44):17981–90.
  • 115. Li Y, Orlando BJ, Liao M. Structural basis of lipopolysaccharide extraction by the LptB2FGC complex. Nature. 2019;567(7749):486–90.
  • 116. Freinkman E, Okuda S, Ruiz N, Kahne D. Regulated assembly of the transenvelope protein complex required for lipopolysaccharide export. Biochemistry. 2012;51(24):4800–6.
  • 117. Freinkman E, Chng SS, Kahne D. The complex that inserts lipopolysaccharide into the bacterial outer membrane forms a two-protein plug-and-barrel. Proc Natl Acad Sci U S A. 2011;108(6):2486–91.
  • 118. Chng SS, Ruiz N, Chimalakonda G, Silhavy TJ, Kahne D. Characterization of the two-protein complex in Escherichia coli responsible for lipopolysaccharide assembly at the outer membrane. Proc Natl Acad Sci U S A. 2010;107(12):5363–8.
  • 119. Gu Y, Stansfeld PJ, Zeng Y, Dong H, Wang W, Dong C. Lipopolysaccharide is inserted into the outer membrane through an intramembrane hole, a lumen gate, and the lateral opening of LptD. Structure. 2015;23(3):496–504.
  • 120. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006;2006(2):0008.
  • 121. Qiao S, Luo Q, Zhao Y, Zhang XC, Huang Y. Structural basis for lipopolysaccharide insertion in the bacterial outer membrane. Nature. 2014;511(7507):108–11.
  • 122. Chimalakonda G, Ruiz N, Chng SS, Garner RA, Kahne D, Silhavy TJ. Lipoprotein LptE is required for the assembly of LptD by the beta-barrel assembly machine in the outer membrane of Escherichia coli. Proc Natl Acad Sci U S A. 2011;108(6):2492–7.
  • 123. Ruiz N, Chng SS, Hiniker A, Kahne D, Silhavy TJ. Nonconsecutive disulfide bond formation in an essential integral outer membrane protein. Proc Natl Acad Sci U S A. 2010;107(27):12245–50.
  • 124. Liu R, Ochman H. Stepwise formation of the bacterial flagellar system. Proc Natl Acad Sci U S A. 2007;104(17):7116–21.
  • 125. Lowder BJ, Duyvesteyn MD, Blair DF. FliG subunit arrangement in the flagellar rotor probed by targeted cross-linking. J Bacteriol. 2005;187(16):5640–7.
  • 126. Sim M, Koirala S, Picton D, Strahl H, Hoskisson PA, Rao CV, et al. Growth rate control of flagellar assembly in Escherichia coli strain RP437. Sci Rep. 2017;7:41189.
  • 127. Colin R, Ni B, Laganenka L, Sourjik V. Multiple functions of flagellar motility and chemotaxis in bacterial physiology. FEMS Microbiol Rev. 2021;45(6):fuab038.
  • 128. Fontaine F, Stewart EJ, Lindner AB, Taddei F. Mutations in two global regulators lower individual mortality in Escherichia coli. Mol Microbiol. 2008;67(1):2–14.
  • 129. Cooper TF, Rozen DE, Lenski RE. Parallel changes in gene expression after 20,000 generations of evolution in Escherichia coli. Proc Natl Acad Sci U S A. 2003;100(3):1072–7.
  • 130. Beloin C, Roux A, Ghigo JM. Escherichia coli biofilms. Curr Top Microbiol Immunol. 2008;322:249–89.
  • 131. Hinde P, Deighan P, Dorman CJ. Characterization of the detachable Rho-dependent transcription terminator of the fimE gene in Escherichia coli K-12. J Bacteriol. 2005;187(24):8256–66.
  • 132. Abraham JM, Freitag CS, Clements JR, Eisenstein BI. An invertible element of DNA controls phase variation of type 1 fimbriae of Escherichia coli. Proc Natl Acad Sci U S A. 1985;82(17):5724–7.
  • 133. Gally DL, Leathart J, Blomfield IC. Interaction of FimB and FimE with the fim switch that controls the phase variation of type 1 fimbriae in Escherichia coli K-12. Mol Microbiol. 1996;21(4):725–38.
  • 134. McClain MS, Blomfield IC, Eberhardt KJ, Eisenstein BI. Inversion-independent phase variation of type 1 fimbriae in Escherichia coli. J Bacteriol. 1993;175(14):4335–44.
  • 135. Schwan WR. Regulation of fim genes in uropathogenic Escherichia coli. World J Clin Infect Dis. 2011;1(1):17–25.
  • 136. Muller CM, Aberg A, Straseviciene J, Emody L, Uhlin BE, Balsalobre C. Type 1 fimbriae, a colonization factor of uropathogenic Escherichia coli, are controlled by the metabolic sensor CRP-cAMP. PLoS Pathog. 2009;5(2):e1000303.
  • 137. Lehti TA, Bauchart P, Dobrindt U, Korhonen TK, Westerlund-Wikstrom B. The fimbriae activator MatA switches off motility in Escherichia coli by repression of the flagellar master operon flhDC. Microbiology. 2012;158(Pt 6):1444–55.
  • 138. Klemm P, Schembri MA. Bacterial adhesins: function and structure. Int J Med Microbiol. 2000;290(1):27–35.
  • 139. Klemm P, Christiansen G. Three fim genes required for the regulation of length and mediation of adhesion of Escherichia coli type 1 fimbriae. Mol Gen Genet. 1987;208(3):439–45.
  • 140. Sokurenko EV, Courtney HS, Maslow J, Siitonen A, Hasty DL. Quantitative differences in adhesiveness of type 1 fimbriated Escherichia coli due to structural differences in fimH genes. J Bacteriol. 1995;177(13):3680–6.
  • 141. Sokurenko EV, Chesnokova V, Dykhuizen DE, Ofek I, Wu XR, Krogfelt KA, et al. Pathogenic adaptation of Escherichia coli by natural variation of the FimH adhesin. Proc Natl Acad Sci U S A. 1998;95(15):8922–6.
  • 142. Schembri MA, Sokurenko EV, Klemm P. Functional flexibility of the FimH adhesin: insights from a random mutant library. Infect Immun. 2000;68(5):2638–46.
  • 143. Dreux N, Denizot J, Martinez-Medina M, Mellmann A, Billig M, Kisiela D, et al. Point mutations in FimH adhesin of Crohn’s disease-associated adherent-invasive Escherichia coli enhance intestinal inflammatory response. PLoS Pathog. 2013;9(1):e1003141.
  • 144. Rabbani S, Fiege B, Eris D, Silbermann M, Jakob RP, Navarra G, et al. Conformational switch of the bacterial adhesin FimH in the absence of the regulatory domain: Engineering a minimalistic allosteric system. J Biol Chem. 2018;293(5):1835–49.
  • 145. Sauer MM, Jakob RP, Eras J, Baday S, Eris D, Navarra G, et al. Catch-bond mechanism of the bacterial adhesin FimH. Nat Commun. 2016;7:10738.
  • 146. Johnson JR, Weissman SJ, Stell AL, Trintchina E, Dykhuizen DE, Sokurenko EV. Clonal and pathotypic analysis of archetypal Escherichia coli cystitis isolate NU14. J Infect Dis. 2001;184(12):1556–65.
  • 147.Dragosits M, Mattanovich D. Adaptive laboratory evolution – principles and applications for biotechnology. Microb Cell Fact. 2013;12:64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Ferenci T. Trade-off Mechanisms Shaping the Diversity of Bacteria. Trends Microbiol. 2016;24(3):209–23. [DOI] [PubMed] [Google Scholar]
  • 149.Daitch AK, Smith EL, Goley ED. OpgH is an essential regulator of Caulobacter morphology. mBio. 2024;15(9):e0144324. [DOI] [PMC free article] [PubMed]
  • 150.Javed A, Balhuizen MD, Pannekoek A, Bikker FJ, Heesterbeek DAC, Haagsman HP, et al. Effects of Escherichia coli LPS Structure on Antibacterial and Anti-Endotoxin Activities of Host Defense Peptides. Pharmaceuticals (Basel). 2023;16(10):1485. [DOI] [PMC free article] [PubMed]
  • 151.Liu Y, Koudelka GB. The Oligosaccharide Region of LPS Governs Predation of E. coli by the Bacterivorous Protist, Acanthamoeba castellanii. Microbiol Spectr. 2023;11(1):e0293022. [DOI] [PMC free article] [PubMed]
  • 152.Douglass MV, Cleon F, Trent MS. Cardiolipin aids in lipopolysaccharide transport to the gram-negative outer membrane. Proc Natl Acad Sci U S A. 2021;118(15):e2018329118. [DOI] [PMC free article] [PubMed]
  • 153.Guest RL, Same Guerra D, Wissler M, Grimm J, Silhavy TJ. YejM Modulates Activity of the YciM/FtsH Protease Complex To Prevent Lethal Accumulation of Lipopolysaccharide. mBio. 2020;11(2):e00598-20. [DOI] [PMC free article] [PubMed]
  • 154.Shu S, Mi W. Regulatory mechanisms of lipopolysaccharide synthesis in Escherichia coli. Nat Commun. 2022;13(1):4576. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155.Bhattacharyya S, Lopez S, Singh A, Harshey RM. Flagellar Motility is Mutagenic. bioRxiv. 2024;121(41):e2412541121. Published in PNAS (USA). [DOI] [PMC free article] [PubMed]
  • 156.Bhattacharyya S, Lopez S, Singh A, Harshey RM. Flagellar motility is mutagenic. Proc Natl Acad Sci U S A. 2024;121(41): e2412541121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157.Gresham D, Hong J. The functional basis of adaptive evolution in chemostats. FEMS Microbiol Rev. 2015;39(1):2–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Jacob F. Evolution and tinkering. Science. 1977;196(4295):1161–6. [DOI] [PubMed] [Google Scholar]
  • 159.Suntsova MV, Buzdin AA. Differences between human and chimpanzee genomes and their implications in gene expression, protein functions and biochemical properties of the two species. BMC Genomics. 2020;21(Suppl 7):535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 160.King MC, Wilson AC. Evolution at two levels in humans and chimpanzees. Science. 1975;188(4184):107–16. [DOI] [PubMed] [Google Scholar]
  • 161.Carroll SB. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell. 2008;134(1):25–36. [DOI] [PubMed] [Google Scholar]
  • 162.Orr HA. Adaptation and the cost of complexity. Evolution. 2000;54(1):13–20. [DOI] [PubMed] [Google Scholar]
  • 163.Hindre T, Knibbe C, Beslon G, Schneider D. New insights into bacterial adaptation through in vivo and in silico experimental evolution. Nat Rev Microbiol. 2012;10(5):352–65. [DOI] [PubMed] [Google Scholar]
  • 164.Ruelens P, Wynands T, de Visser J. Interaction between mutation type and gene pleiotropy drives parallel evolution in the laboratory. Philos Trans R Soc Lond B Biol Sci. 2023;378(1877):20220051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165.Knight CG, Zitzmann N, Prabhakar S, Antrobus R, Dwek R, Hebestreit H, et al. Unraveling adaptive evolution: how a single point mutation affects the protein coregulation network. Nat Genet. 2006;38(9):1015–22. [DOI] [PubMed] [Google Scholar]
  • 166.Fong SS, Joyce AR, Palsson BO. Parallel adaptive evolution cultures of Escherichia coli lead to convergent growth phenotypes with different gene expression states. Genome Res. 2005;15(10):1365–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 167.MacLean RC, Bell G, Rainey PB. The evolution of a pleiotropic fitness tradeoff in Pseudomonas fluorescens. Proc Natl Acad Sci U S A. 2004;101(21):8072–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168.Stoebel DM, Hokamp K, Last MS, Dorman CJ. Compensatory evolution of gene regulation in response to stress by Escherichia coli lacking RpoS. PLoS Genet. 2009;5(10): e1000671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169.Couce A. Regulatory networks may evolve to favor adaptive foresight. PLoS Biol. 2024;22(12): e3002922. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 170.Cohen Y, Hershberg R. Rapid Adaptation Often Occurs through Mutations to the Most Highly Conserved Positions of the RNA Polymerase Core Enzyme. Genome Biol Evol. 2022;14(9):evac105. [DOI] [PMC free article] [PubMed]
  • 171.Omasits U, Ahrens CH, Muller S, Wollscheid B. Protter: interactive protein feature visualization and integration with experimental proteomic data. Bioinformatics. 2014;30(6):884–6. [DOI] [PubMed] [Google Scholar]
  • 172.R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2024.
  • 173.Kinnersley M, Schwartz K, Yang DD, Sherlock G, Rosenzweig F. Evolutionary dynamics and structural consequences of de novo beneficial mutations and mutant lineages arising in a constant environment. Available from: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA517527. [DOI] [PMC free article] [PubMed]

Associated Data


Supplementary Materials

12915_2025_2331_MOESM1_ESM.docx (4.5MB, docx)

Additional file 1: Figure S1. Mutations in the gene encoding RNA chaperone, ProQ. Figure S2. Mutations in putative chaperone protein, GatZ. Figure S3. Mutations in LPS transport protein, LptC. Figure S4. Mutations in LPS assembly protein, LptD. Figure S5. Mutations in the LPS export system permease, LptG. Figure S6. Mutations in LPS assembly protein, LapB. Figure S7. Mutations in flagellar motor switch protein, FliG. Figure S8. Mutations in cytosolic flagellar export protein, FliH. Figure S9. Mutations in flagellar biosynthesis protein, FliP. Figure S10. Mutations in flagellar biosynthesis protein, FlhB. Figure S11. Mutations in flagellar assembly protein, FlgJ. Figure S12. Mutations in bacterial adhesion protein, FimH.

12915_2025_2331_MOESM2_ESM.xlsx (1.8MB, xlsx)

Additional file 2: Table S1: Variants identified in clonal sequencing. Table S2: Variants identified in population sequencing. Table S3: Ancestral state of de novo mutations discussed in narrative.

12915_2025_2331_MOESM3_ESM.pptx (725.4KB, pptx)

Additional file 3: Figure S13: Animated 3-D image of the Shigella flexneri LptD/LptE complex showing location of mutations arising in the Escherichia coli homolog under glucose limitation.

12915_2025_2331_MOESM4_ESM.pptx (1.7MB, pptx)

Additional file 4: Figure S14: Animated 3-D image of E. coli FimH showing location of mutations arising under glucose limitation.

Data Availability Statement

All raw sequencing data are available from the SRA under BioProject ID PRJNA517527 [173], which is associated with our companion paper [25].


Articles from BMC Biology are provided here courtesy of BMC
