Abstract
Adaptive laboratory evolution can generate microbial strains that exhibit extreme phenotypes, revealing fundamental biological adaptation mechanisms. Here, we use adaptive laboratory evolution to evolve Escherichia coli strains that grow at temperatures as high as 45.3 °C, a temperature lethal to wild-type cells. The strains adopted a hypermutator phenotype and employed multiple systems-level adaptations that made global analysis of the DNA mutations difficult. Given the challenge at the genomic level, we were motivated to uncover high-temperature tolerance adaptation mechanisms at the transcriptomic level. We employed independently modulated gene set (iModulon) analysis to reveal five transcriptional mechanisms underlying growth at high temperatures. These mechanisms were connected to acquired mutations, changes in transcriptome composition, sensory inputs, phenotypes, and protein structures. They are as follows: (i) downregulation of general stress responses while upregulating the specific heat stress responses, (ii) upregulation of flagellar basal bodies without upregulating motility and upregulation of fimbriae, (iii) shift toward anaerobic metabolism, (iv) shift in regulation of iron uptake away from siderophore production, and (v) upregulation of yjfIJKL, a novel heat tolerance operon whose structures we predicted with AlphaFold. iModulons associated with these five mechanisms explain nearly half of all variance in the gene expression in the adapted strains. These thermotolerance strategies reveal that optimal coordination of known stress responses and metabolism can be achieved with a small number of regulatory mutations and may suggest a new role for large protein export systems. Adaptive laboratory evolution with transcriptomic characterization is a productive approach for elucidating and interpreting adaptation to otherwise lethal stresses.
Keywords: systems biology, adaptive laboratory evolution, temperature stress, transcriptomics, transcriptional regulatory networks, iModulons
Graphical Abstract
Significance.
In systems biology, we seek a global understanding of living organisms and systems. iModulon analysis has enabled this for the gene expression patterns in microbes, and it can now be applied to more complex strains and phenotypes. Adaptive laboratory evolution can generate such strains and reveal the fundamental biology of stresses such as prolonged exposure to high temperature. Here, we generated strains that were tolerized to very high temperatures and performed a deep analysis of the transcriptomic alterations that were selected for. We found that upregulation of large membrane protein systems (flagellar basal bodies, fimbriae, and the previously uncharacterized yjfIJKL operon) conferred resistance, as did a streamlining of stress responses, iron metabolism, and redox metabolism. These results highlight the benefit of the global iModulon perspective for understanding evolutionary strategies, and their specifics motivate further research with potential impacts for strain engineering, cellular envelope science, and adaptation to global warming.
Introduction
Adaptive laboratory evolution (ALE) generates microbes that push biological systems to extremes, producing interesting new phenotypes and providing insight into fundamental biology. Starting with a microbe and condition of interest, cells are grown for many generations and propagated in exponential growth phase when flasks reach a target density (Sandberg et al. 2019). Mutants that grow faster are more likely to propagate, driving accumulation of beneficial mutations. In tolerization ALEs, cells grow under increasing stress to evolve highly tolerant strains (Peabody et al. 2014). A detailed understanding of these strains reveals mechanisms of stress tolerance that are able to inform the design of cellular factories (Peabody et al. 2014), our understanding of the evolution of pathogens (Hughes and Andersson 2015), and the fundamental science of systems that may otherwise be hard to study. Typically, ALE endpoint strains are studied by DNA resequencing and subsequent characterization of the mutations that are expected to improve fitness. This works well for many ALEs, as evolved strains typically have a relatively small number of mutations (Phaneuf et al. 2019). However, microbes are able to increase their mutation rate by mutating the DNA mismatch repair machinery and will evolve into hypermutator strains in highly stressful environments (Swings et al. 2017). The high mutation rates in these strains complicate the identification of causal mutations. Therefore, methods that can reduce the complexity of analyzing hypermutator strains and cut through the genomic noise of their many mutations are needed.
The transcriptional regulatory network (TRN) senses cellular states and environments and helps to maintain homeostasis and regulate growth by adjusting gene expression levels. Analyzing changes to transcriptomic allocation in hypermutator strains represents a possible route for their characterization, as it can reveal regulatory changes induced by the mutations or be used to infer changes to metabolism and stress. However, we again run into a problem of scale: hundreds or thousands of genes may be differentially expressed, making it difficult to glean a global understanding from transcriptomic datasets.
This problem can now be addressed by a transcriptomic analysis approach called independently modulated gene set (iModulon) analysis (Sastry et al. 2019). iModulon analysis employs independent component analysis (ICA) to identify coregulated signals from large compendia of gene expression data. These signals are represented by iModulons, which have a weighting for each gene in a signal and an activity level (signal strength) in each sample. Highly weighted genes are considered to be members of the iModulon. iModulon gene sets tend to match well with regulons. Regulons are defined using a variety of bottom-up experimental methods (Myers et al. 2015), whereas iModulons are quantitative structures learned top-down from expression data alone. Compared to analyzing individual gene expression levels, analyzing the activity levels of iModulons decreases the number of significant variables ∼17-fold (Lamoureux et al. 2023). iModulon structures have been established for several organisms and are available to browse, search, or download from iModulonDB.org (Rychel et al. 2021; Catoiu et al. 2025). iModulons have proven useful for analyzing nonhypermutator ALE strains in several cases (Anand et al. 2019, 2020, 2021, 2022; Kavvas et al. 2022; Rychel et al. 2023), making them a promising option for characterizing hypermutator strains. We recently compiled over 1,000 Escherichia coli K-12 RNA sequencing (RNA-seq) expression profiles into a dataset called PRECISE-1K and characterized its iModulons (Lamoureux et al. 2023). Data from the present work were included in that dataset, but presented without the detailed characterization provided here.
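The structure described above (a weight for each gene and an activity in each sample) can be illustrated with a toy example. This is only a sketch of the data layout, not the ICA step itself: the gene names, the numbers, and the fixed `cutoff` are all hypothetical, and how membership thresholds are actually chosen is not specified in this section.

```python
# Toy iModulon decomposition X ≈ M · A (hypothetical values, not study data).
genes = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"]

# M: gene weights, one row per gene and one column per iModulon.
M = [
    [0.40, 0.01],
    [0.35, -0.02],
    [0.30, 0.00],
    [0.02, 0.45],
    [-0.01, 0.38],
]

# A: activities, one row per iModulon and one column per sample.
A = [
    [5.0, -3.0, 0.0],
    [1.0, 2.0, -4.0],
]


def reconstruct(M, A):
    """Rebuild the expression matrix (genes x samples) as the product M · A."""
    n_samples = len(A[0])
    return [
        [sum(weight * A[k][s] for k, weight in enumerate(row)) for s in range(n_samples)]
        for row in M
    ]


def members(M, genes, k, cutoff=0.1):
    """Genes whose absolute weight in iModulon k exceeds an assumed cutoff."""
    return [gene for gene, row in zip(genes, M) if abs(row[k]) > cutoff]


X_hat = reconstruct(M, A)
first_members = members(M, genes, 0)
second_members = members(M, genes, 1)
```

In this layout, comparing samples means comparing columns of `A` (a handful of activities) rather than columns of the full expression matrix (thousands of genes).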
To explore how transcriptomic allocation adaptively evolves, we must choose a selection pressure that will produce informative strains. High temperature exerts a fundamental stress on biological systems by destabilizing proteins, membranes, and other molecules (Chen et al. 2017a). Heat tolerance is key to understanding the early evolution of life (Nguyen et al. 2017), pathogen response to fever (Blatteis 2003), engineering of cell factories (Vavitsas et al. 2022), and the response to global warming (Cavicchioli et al. 2019). One prior study demonstrated the presence of epistasis among mutations acquired under high-temperature stress (Tenaillon et al. 2012). Another prior study used ALE with increasing temperatures to generate ten E. coli strains that grow well at 42 °C (Sandberg et al. 2014). Mutational analysis and a simple transcriptomic analysis of these strains revealed some valuable thermal adaptation strategies, such as modifying mRNA degradation and peptidoglycan recycling pathways. However, interpreting the transcriptomic responses of these strains was difficult in the absence of iModulon analysis, and we predicted that higher heat tolerance could be achieved with further evolution.
Here, we evolved an isolate from the 42 °C evolution further to push the limits of heat tolerance. The six endpoint strains of this study can grow at temperatures as high as 45.3 °C, which is lethal to wild-type strains. To achieve this increase in heat tolerance, the strains were all hypermutators. We generated transcriptomic samples from these strains at various temperatures and previously included them in PRECISE-1K (Lamoureux et al. 2023). We enumerated each of the major transcriptomic adaptations that facilitate rapid growth at high temperatures. Despite their broad range of genomic mutations, the strains exhibited only a few major transcriptomic changes. These correspond to the regulation of stress responses, motility, redox metabolism, and iron uptake. We also identify and predict protein structures for a strongly upregulated, previously uncharacterized operon, yjfIJKL, which may be beneficial for survival at high temperatures. In addition to the specific insights on heat tolerance, this study demonstrates the value of transcriptomic analysis (particularly the iModulon framework) for gaining clear insights from hypermutator strains.
Results and Discussion
ALE Increased Temperature Tolerance Via a Hypermutator Phenotype
Using ALE, we obtained six evolved strains that tolerated 45.3 °C (Fig. 1a and S1). Each one descended from the same ancestor from the previous 42 °C ALE (Sandberg et al. 2014), 42c_3, which itself descended from E. coli K-12 MG1655. The mutations present in the starting strain are summarized in Table S1. We generated growth curves and computed growth rates for each of the strains at 30, 37, and 44 °C (Fig. 1b), showing a significant increase in growth rate at 44 °C after evolution (P = 5.9 × 10⁻⁷). Interestingly, the strains did not exhibit a major tradeoff in growth rates at 37 °C on average (P = 0.13) and only had a slight growth disadvantage against wild type at 30 °C (P = 0.012). The evolved strains maintained much of their ability to adapt to changes in temperature, suggesting that the acquired genomic mutations were not especially deleterious at lower temperatures.
Fig. 1.
ALE increased temperature tolerance via changes to the genome and transcriptome. a) ALE schematic, showing a previous round of ALE that tolerized E. coli up to 42 °C (Sandberg et al. 2014) and that the present study focuses on descendants from a single strain of the prior study to generate strains that tolerate up to 45.3 °C. Symbol shapes represent strain cohorts, and colors represent temperatures; these will be kept consistent throughout the paper. See Fig. S1 for details. b) Growth rates for the wild-type (WT) and final six evolved strains at three temperatures, showing a significant increase in growth rate at 44 °C (P = 5.9 × 10⁻⁷) but similar growth rates as WT at 30 and 37 °C. c) Treemap of all 504 mutations observed in any of the 6 evolved strains, where each mutation is mapped to its nearest gene and genes that mutate in 5 or more strains are labeled. Colors indicate COGs (Table S2). d) Treemap of the variance in the transcriptomes of the evolved strains by iModulon, showing that relatively few iModulons capture most of the variation. The 20 iModulons that explain the most variance are labeled, with some names shortened for space. Categories are labeled with bold names, except two categories combined as “Other”: “Genomic” in blue and “Translation” in pink. For more information on each iModulon, see Table S3 and iModulonDB.org (E. coli PRECISE-1K). e) Venn diagram of the 37 significant DiMAs from evolution (panel f) and temperature changes (panel g). Colors match the categories in panel d. Bold iModulon names are also in the top 20 by explained variance. f and g) Colors match the categories in panel d, except that gray represents insignificant iModulon activities and black represents the “unknown” category. f) DiMA plot comparing the iModulon activities in the WT and evolved strains at their highest respective temperatures. g) DiMA plot comparing iModulon activities in the evolved strains at cold (30 °C) and hot (44 °C) temperatures.
The ancestral strain, 42c_3, contained 30 mutations, including mutL G49V. MutL is part of the DNA mismatch repair machinery, and mutations in this gene tend to induce increases in mutation rates (Bridges 2001; Couce et al. 2013; Modrich 2016; Swings et al. 2017). Thus, each of the evolved strains was a hypermutator and ended with between 60 and 126 mutations, with the average strain experiencing 84 mutations. Details of all mutations are shown in Table S2. Each of the mutations was assigned to its nearest gene, and mutated genes were visualized in Fig. 1c. No particular cluster of orthologous genes (COG) was enriched in this set, and the large number of mutations precluded a detailed analysis of the potential benefit of each one, as has been performed in past ALE studies (e.g. [Sandberg et al. 2014]).
iModulon Analysis Revealed Transcriptomic Adaptations
In order to gain a clear understanding of the adaptations in the evolved strains, we generated RNA-seq data for each of them and the 42c_3 ancestor at 30, 37, and 44 °C in duplicate. From the prior study (Sandberg et al. 2014), we also had samples at 42 °C for the wild type and each of the 42 °C evolved strains. All of these profiles were included in PRECISE-1K, a compendium of over 1,000 E. coli transcriptomes, which were generated using the same experimental protocol and analyzed with iModulons in aggregate (Lamoureux et al. 2023).
PRECISE-1K provides a large and diverse condition space from which to identify coregulated, independently modulated signals (iModulons). The 201 iModulons computed from PRECISE-1K have been characterized with assigned functions, regulators, and categories to facilitate interpretation. They are available at iModulonDB.org (Catoiu et al. 2025) under “E. coli PRECISE-1K,” in the project “hot_tale.” We can quantify the explained variance of each iModulon in the samples from the evolved strains by using only information from one or several iModulons at a time to reconstruct the entire dataset (Fig. 1d). In the data generated for this study (a subset of PRECISE-1K), the 201 iModulons captured 82% of the variance in the data, with the remaining 18% assumed to be noise or other variation with no clear structure. The top 20 highest explained variance iModulons explained 61.3% of the overall variance (labeled iModulons, Fig. 1d). Thus, a relatively small number of variables can be highly informative about the global state of these samples and therefore represent an approach to identify key thermotolerance strategies that emerge during ALE.
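The reconstruction-based explained variance described above can be sketched as follows. This is a minimal illustration under stated assumptions: the expression matrix `X` is taken to be already centered, explained variance is defined as one minus the ratio of residual to total sum of squares, and the matrices are tiny hypothetical examples rather than PRECISE-1K data.

```python
def explained_variance(X, M, A, ks):
    """Fraction of the sum of squares of X captured by the iModulons listed in ks.

    X: expression (genes x samples), assumed already centered.
    M: gene weights (genes x iModulons). A: activities (iModulons x samples).
    """
    n_genes, n_samples = len(X), len(X[0])
    ss_total = sum(value * value for row in X for value in row)
    ss_resid = 0.0
    for g in range(n_genes):
        for s in range(n_samples):
            # Reconstruct this entry using only the selected iModulons.
            approx = sum(M[g][k] * A[k][s] for k in ks)
            ss_resid += (X[g][s] - approx) ** 2
    return 1.0 - ss_resid / ss_total


# Hypothetical two-gene, two-sample, two-iModulon example.
M = [[1.0, 0.0], [0.0, 1.0]]
A = [[3.0, -3.0], [1.0, -1.0]]
X = [[3.0, -3.0], [1.0, -1.0]]

ev_first = explained_variance(X, M, A, [0])
ev_second = explained_variance(X, M, A, [1])
ev_both = explained_variance(X, M, A, [0, 1])
```

Passing one index at a time ranks iModulons by their individual contribution, while passing several indices gives the variance explained by a group, as in the top-20 figure quoted above.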
Differential iModulon activities (DiMAs) are similar to commonly used plots of differentially expressed genes, except they are much easier to interpret because the iModulons are much fewer in total number (201) than genes (4257). iModulons are also knowledge-enriched with regulatory information (Lamoureux et al. 2023). DiMAs between starting and evolved strains represent a summary of transcriptional adaptations (Rychel et al. 2023) (Fig. 1f). In this dataset, we can also quantify DiMAs for the evolved strains between the cold and hot temperatures (Fig. 1g) and compare the sets of significant iModulons (Fig. 1e). Statistics and additional details for each iModulon are provided in Table S3. Note that not all genes in iModulons always exhibit the same changes as the iModulon as a whole.
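A simplified sketch of a DiMA comparison is shown below. It is an assumption-laden illustration: it only takes the difference in mean activity between two sample groups and applies a fixed cutoff (`min_diff`, a placeholder), whereas the statistical test behind the reported DiMAs is not described in this section. The iModulon names come from the text, but every activity value and sample name is invented.

```python
from statistics import mean


def dima(activities, group_a, group_b, min_diff=5.0):
    """Return iModulons whose mean activity differs between two groups.

    activities: {iModulon name: {sample name: activity}}.
    The value returned per iModulon is mean(group_b) - mean(group_a).
    """
    changed = {}
    for name, per_sample in activities.items():
        diff = mean(per_sample[s] for s in group_b) - mean(per_sample[s] for s in group_a)
        if abs(diff) >= min_diff:
            changed[name] = diff
    return changed


# Invented activities for duplicate wild-type and evolved samples.
activities = {
    "RpoS": {"wt_1": 0.0, "wt_2": 1.0, "evo_1": -12.0, "evo_2": -11.0},
    "FliA": {"wt_1": 2.0, "wt_2": 3.0, "evo_1": 2.5, "evo_2": 3.5},
    "YjfJ": {"wt_1": 0.0, "wt_2": 0.0, "evo_1": 8.0, "evo_2": 10.0},
}

result = dima(activities, ["wt_1", "wt_2"], ["evo_1", "evo_2"])
```

Running the same function on cold versus hot samples of the evolved strains and intersecting the two result sets would give the kind of overlap summarized in a Venn diagram.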
Highly variable and differentially activated iModulons in ALE studies typically indicate one of three features (Rychel et al. 2023): (i) large genomic alterations have directly amplified or deleted the genes of an iModulon, (ii) mutations in regulatory pathways have altered gene expression, or (iii) underlying metabolites or processes that are sensed by the TRN have been altered. Based on the results of the mutation caller, the evolved strains had no major amplifications or deletions that would have given rise to their own iModulons. Therefore, the signals are either the result of regulatory mutations or the output of the sensory systems within the evolved cells.
Based on the direction of the change, the large body of existing literature on the E. coli TRN, and experimental evidence, we have inferred the mechanisms which underlie the major changes in the transcriptome. We also proposed explanations of how they provide benefits to the evolving strains. We present the five mechanisms with the strongest signals in the following sections, in Fig. 2, and in Table S4.
Fig. 2.
Overview of the five major mechanisms (mechs) that underlie the adaptation to high-temperature growth. Each relationship marked by an arrow corresponds to a row in Table S4 with the corresponding number, in which its evidence and novelty is listed. On the left side, high temperature is the main “input,” which leads to a variety of sensory inputs in the first column, which are expected from literature or inferred from the iModulon evidence. In the second column, presence or absence of mutations in the evolved strains is shown, according to the legend. iModulons in the third column integrate sensory inputs and effects of mutations to determine their activity levels. Colors in the iModulon icons represent the change between the wild-type (WT) activity at 42 °C and the given strain's evolved activity at 44 °C, normalized by the standard deviation of the iModulon's activity in all samples from PRECISE-1K (e.g. the RpoS iModulon is downregulated [blue], the FliA iModulon is constant [gray], and the YjfJ iModulon is upregulated [red]). iModulon icons are sized according to their explained variance (from Fur-2, 0.27% to RpoS, 19.8%; scaled using the square root). To the right of the iModulon column are the pathways and phenotypes, which are determined by the transcriptomic and genomic changes. The far right column lists hypothesized strategies by which the evolved strains tolerated high temperatures. Different background shades represent different topics/mechanisms, and each is labeled with the respective figure in which to find more information. Each column represents a “level” of omics data or biological phenomena, and we combine information across levels with knowledge in the literature to develop a global understanding of the adaptations in these strains.
Though these putative mechanisms are consistent with genomic, transcriptomic, and additional data as well as the literature, it is important to note that they have not been fully validated. As we are combining results with inference-based interpretation in this study, we have named the main section of this article “Results and Discussion,” instead of the standard “Results.” We strongly encourage molecular biology studies that isolate and test individual hypotheses, as our goals are instead to evaluate the global behavior across several interacting systems. The observed phenotypes may be the result of drift instead of selection, as the 42c_3 ancestral strain may have harbored neutral or deleterious mutations which we erroneously interpret as being beneficial.
Stress Sigma Factors Shift Transcriptomic Allocation From General to Specific Responses
The iModulon with the largest explained variance in observed transcriptomic changes is the RpoS iModulon, which reallocates an enormous 20% of the transcriptome in the evolved strains. RpoS is the general stress response sigma factor, which is governed by complex regulation and limits growth when active (Gottesman 2019; Schellhorn 2020). The general stress response is part of a tradeoff between self-preservation and nutritional competence (Ferenci 2005). Prior iModulon studies have explored a “fear/greed tradeoff,” where typical strains exhibit a negative correlation between the RpoS and Translation iModulon activities. Faster growing cells activate the Translation and downregulate the RpoS iModulons (Utrilla et al. 2016; Sastry et al. 2019, 2021a; Anand et al. 2020; Kavvas et al. 2022; Rychel et al. 2023). As the prior generation of 42 °C evolved strains mutated to tolerate high temperatures, they experienced less stress and therefore downregulated RpoS (Sandberg et al. 2014) (Fig. 3a, “42c Other”).
Fig. 3.
Stress sigma factors shift allocation from general to specific responses. Error bars represent mean ± 95% confidence interval. Colors in the columns of the legend are consistent for each plot. a to c) Activities of the RpoS and Translation iModulons, which constitute the fear/greed tradeoff. a) RpoS activity is downregulated by evolution (P = 0.036) and accounts for 19.8% of the variance in the dataset. The RpoS iModulon regulates the general stress response, which slows growth. b) Scatter plot color coded according to the legend, with low opacity circles representing all other samples in PRECISE-1K (n = 969). A black dashed line was fit to the other samples, representing the typical fear/greed tradeoff (Dalldorf et al. 2024). The temperature evolved samples have lower RpoS activity than expected, due to the rpoS mutation in the strains. c) Translation activity is correlated with temperature, but downregulated less strongly after evolution due to successful adaptation (P = 0.027). d) Knowledge graph describing this figure. Numbered arrows are consistent with Fig. 2, and additional details are in Table S4. e) RpoH iModulon activity, which maintains its correlation with temperature but is slightly upregulated by prolonged heat exposure (P = 0.0031). RpoH regulates high-temperature responses.
In the 42c_3 ancestor of the high-temperature–tolerant strains, we observe even stronger downregulation of the RpoS iModulon, which is maintained in the new evolved strains. This iModulon activity likely results from a frameshift mutation in rpoS in 42c_3 and its derivatives. The mutation appears to have mostly deactivated RpoS, allowing sigma factors that do not suppress growth to outcompete it and providing a growth rate benefit during ALE. Deactivating RpoS is generally a good strategy for ALE (Zambrano et al. 1993; Utrilla et al. 2016; Sastry et al. 2019, 2021a; Anand et al. 2020; Dalldorf et al. 2024; Rychel et al. 2023), but the thermotolerant strains take this to an extreme via this mutation. It is important to note that while changes such as deactivating RpoS are beneficial for the ALE, they likely carry major tradeoffs for the general robustness of the strains in other conditions.
The Translation iModulon is typically anticorrelated with RpoS, because similar underlying growth/stress and RNA polymerase-related variables control both iModulons (Gottesman 2019; Schellhorn 2020). However, the Translation iModulon remains anticorrelated with temperature (Fig. 3c), while the RpoS iModulon is downregulated at all temperatures, diverging from the usual fear/greed tradeoff (Fig. 3b). The rpoS frameshift probably stops RpoS activity but does not regulate the Translation iModulon, explaining this discrepancy (Fig. 3d). We do observe a small upregulation of the Translation iModulon at high temperatures after evolution, suggesting that the mutations and tolerization strategies in these strains have successfully decreased the stress signals, which typically downregulate Translation at high temperatures.
Unlike RpoS, the heat stress sigma factor RpoH does not mutate, maintains its wild-type correlation with temperature, and is differentially upregulated at high temperatures after evolution (Fig. 3d). RpoH senses temperature via several mechanisms including an RNA thermosensor and temperature-dependent proteolysis (Morita et al. 1999; Yura 2019), and it activates a variety of heat shock genes and chaperones (Erickson et al. 1987). Presumably, any mutations or changes to heat stress regulation were selected against. Prolonged exposure to high temperatures slightly upregulates RpoH, probably via the known temperature-dependent pathways (Morita et al. 1999; Yura 2019).
Thus, the evolved cells downregulate general stress responses (RpoS) to improve growth, but upregulate specific responses to heat (RpoH). This represents an effective strategy for stress tolerization ALE; indeed, it mirrors the response of oxidative stress evolved strains, which maintain activity of the specific oxidative stress response, SoxS, while also downregulating RpoS (Rychel et al. 2023).
Motility iModulons Amplify Flagellar Basal Body Expression While Suppressing the Filament
A major fraction (18%) of the variance in the transcriptome of the thermotolerant strains is explained by motility iModulons, which respond to two transcription factors, FlhDC and FliA. The regulation of this system has been studied in detail (Fitzgerald et al. 2014), and the iModulon gene structure matches well with the known literature. The two primary iModulons of interest are FlhDC-2 and FliA (Fig. 4a). The promoter of flhDC integrates many signals that affect motility (Shin and Park 1995; Soutourina et al. 1999; Lehnen et al. 2002; Sperandio et al. 2002; Lee and Park 2013; Kim et al. 2020), and then, expression of FlhDC induces flagellar synthesis in steps: first, the basal body is synthesized, and then the hook, junction, flagellin, motor, and control mechanisms are added. The timing of these steps is ensured by using a second regulator, the sigma factor FliA, which is induced by FlhDC (as part of the class I genes, left side/purple in Fig. 4a), coregulates the intermediate steps (class II genes, center/orange), and solely regulates the final steps (class III genes, right/green) (Fitzgerald et al. 2014). A good understanding of the regulation and dynamics of this system is important for fundamental biology, understanding host–pathogen interactions (Ottemann and Miller 1997), and developing a toolkit for designing protein secretion systems (Singer et al. 2012; Green et al. 2019).
Fig. 4.
Changes to motility and fimbriae regulation suggest a possible role in high-temperature tolerance. a) Venn diagram of genes in the two main motility iModulons, FlhDC-2 and FliA, which are highly similar to the known regulons (Fitzgerald et al. 2014; Santos-Zavaleta et al. 2019). The most notable exception is fliMNOPQR, which is thought to be regulated by both FlhDC and FliA but is only found in the FlhDC-related iModulon. b) Illustration of the flagellum, adapted from Fitzgerald et al. (2014). Components are colored according to the Venn diagram in panel a, and low opacity components are those that were expected to be under dual regulation based on the literature. ATPase mutations in fliI and fliJ observed in the evolved strains are pictured with a red star and label. The established regulatory cascade (Fitzgerald et al. 2014) is also pictured in the lower right, showing that the antisigma factor FlgM requires the ATPase to be exported and derepresses FliA. c) Knowledge graph summarizing this section. Numbered arrows are consistent with Fig. 2, and additional details are in Table S4. d and e) Scatter plots of iModulon activities (iModulon activity phase planes), illustrated according to the legend in the box on the left side of the figure between panels a and d. d) The activities of the two FlhDC iModulons form a tight curve, with heat tolerant strains having high activity below 40 °C but decreasing at high temperatures. This trend is not affected by observed mutations. e) Though FliA activity is typically correlated with FlhDC-2 activity (black dotted line: best fit for other projects; Pearson R = 0.87) and follows it in the regulatory cascade (panel b), the mutant strains (squares) do not activate FliA as strongly as expected, particularly in the case of the strain with two ATPase mutations (diamonds). Green and orange lines illustrate observed trends. Temperature and other regulatory changes move strains along the line, while mutations modify the line. f) Bar and swarm plot of Fimbriae iModulon activity, with all evolved strains upregulating it (P = 0.00030) and those with the ecpC Δ1 mutation upregulating it the strongest at high temperatures. Error bars represent mean ± 95% confidence interval.
The two flagellar iModulons nearly perfectly mirror the known regulation (Fig. 4b). However, FlhDC-2, but not FliA, includes fliLMNOPQR, despite the fact that it has binding sites for both regulators. This operon is needed earlier in the synthesis of flagella (Minamino and Namba 2004). iModulons learn from relationships in expression data, so the exclusion of this operon from FliA indicates that these genes tend to be more correlated with the early stages of synthesis, in agreement with their function and despite the ability to be regulated by the late stage regulator. In addition, a third iModulon plays a unique role: the FlhDC-1 iModulon contains several genes from both iModulons at both positive and negative weights and appears to capture a third dimension or nonlinearity in the motility transcriptome. This may reflect different binding affinities and changing ratios between the two regulators. FlhDC-1 and 2 form a nonlinear curve, suggesting that samples adjust toward FlhDC-1 as FlhDC expression increases (Fig. 4d). Thus, iModulons reflect the known transcriptional regulation while providing additional nuance which is useful for a practical understanding of the system. We propose that a future study explore the binding affinities of the fliLMNOPQR promoter for both of these regulators to test this observation.
At high temperatures, the flagellar secretion system is less able to secrete FlgM, the antisigma factor for FliA (Rudenko et al. 2019). This mechanism may have evolved to help E. coli avoid flagellin-mediated detection by the host immune system during a fever (Ottemann and Miller 1997). Thus, the flagella synthesis pathway is cut off at a regulatory point between the two iModulons (Fig. 4c). We observe this mechanism clearly in the activity of the evolved strains at 44 °C, which have some FlhDC-2 activity but no FliA activation (Fig. 4e). The activity phase plane between FlhDC-2 and FliA is thus highly informative: typically, the two iModulons exhibit a strong correlation (Pearson R = 0.87), but in cases below the best-fit line, some mechanism likely inhibits FliA activity; potentially, this may be the failed secretion of FlgM.
Interestingly, we also observed the evolved samples exhibiting activity below the best-fit line at lower temperatures, when heat should not be disrupting FlgM secretion (Fig. 4e). This observation indicates that something besides temperature in these strains downregulates the FliA iModulon. Indeed, the ancestor (42c_3) of all evolved strains had a frameshift mutation in fliJ (Table S1), which is involved in the secretion of flagellar export substrates (Minamino and Namba 2004). The strains occupy a unique location in the phase plane, possibly because FlgM cannot be exported as efficiently due to this mutation. Another mutation, a frameshift in the export ATPase fliI, affected only the hot_4 strain. With this mutation, FliA iModulon activity decreases even further (diamonds, Fig. 4e). This second mutation having an additive effect appears to strengthen this speculative mechanism relating flagellar mutations with flagellar expression.
iModulons have thus quantitatively captured the complex transcriptional regulation of motility and revealed the effects of temperature and mutations on the regulation of FliA. Key questions remain to be elucidated: (i) the strains strongly upregulate FlhDC, but due to its complex upstream regulation, it is difficult to deduce the molecular mechanism; (ii) since FlhDC activity also decreases at high temperatures, some unknown mechanism downstream of FliA may be feeding back to regulate these genes; and (iii) there may be an evolutionary benefit to expressing FlhDC-regulated genes but not FliA-regulated genes at high temperatures.
We can speculate about the evolutionary benefit in question (iii) and propose two hypotheses that warrant further study: (a) the basal body could secrete other temperature-sensitive proteins and/or (b) the basal body may provide membrane stability. (a) The flagellar basal body is able to rapidly secrete large proteins and can be engineered to secrete a variety of protein substrates (Singer et al. 2012). Its typical secretion substrate, fliC, is downregulated by the evolved fliIJ mutants, which means that the exporter is upregulated while its substrate is absent. Therefore, perhaps it has been repurposed to help eliminate protein aggregates in the cells, which would accumulate under high temperatures. Alternatively, (b) the flagellar basal body is one of very few protein complexes that span both membranes. More basal bodies may therefore be structurally beneficial to the cellular envelope, which is destabilized by high temperatures. Indeed, a simulation study found that protein crowding in the membrane can increase membrane viscosity, counteracting a stress-inducing decrease in viscosity that occurs at high temperatures (Fábián et al. 2023). Though they have not yet been tested, both of these hypotheses warrant future study and are also applicable to the Fimbriae and YjfJ iModulons discussed later.
The Fimbriae iModulon is Another Upregulated Large Protein Export System
Another extracellular structure, the fimbriae, is an important part of transcriptome reallocation in the evolved strains (1.6% explained variance). This iModulon contains the fimbriae synthesis genes fimAICDFGH (Geibel et al. 2013), which are strongly upregulated in 42c_3 and the evolved strains and negatively correlated with temperature (Fig. 4f) (Hinthong et al. 2015). Interestingly, the negative correlation with temperature was abolished, and fimbriae were strongly upregulated at high temperatures in two evolved strains, hot_3 and hot_10, which both shared a frameshift mutation in the ecpC gene. EcpC is a putative usher protein for another extracellular structure, the common pilus (Garnett et al. 2012). This result suggests cross-talk between various extracellular fiber systems in E. coli, and it also suggests that upregulation of fimbriae may be related to survival at high temperatures.
The altered temperature response of fimbriae expression in the evolved strains, particularly in hot_3 and hot_10, suggests that fimbrial upregulation may play a functional role in thermal adaptation. Similar to our two hypotheses about flagellar basal bodies, the upregulation of fimbriae may also assist with misfolded protein export and/or membrane stability. The usher proteins fimD and ecpC typically export large proteins and may also have been repurposed for misfolded substrates in the evolved strains. Unlike flagellar basal bodies, the fimbriae systems are only in the outer membrane and would therefore be limited to secreting periplasmic proteins or supporting the stability of the outer membrane. Further research into the specificity of protein export by fimbriae and pilus systems, and their role in thermal stability of the cellular envelope, could be helpful for understanding heat tolerance and for the design of heterologous protein producers (Schlegel et al. 2013).
Redox Metabolism Shifts Toward Less Aerobic Metabolism
ArcA and Fnr, the regulators of aerobicity, exert significant control over cellular phenotypes via alterations to the expression of genes involved in respiration (Federowicz et al. 2014). Together, their associated iModulons explain 3.4% of the variance in the transcriptomes of the evolved strains but are likely to have a larger effect on metabolism and phenotypes. The ArcAB two-component system represses aerobic metabolism genes when the electron transport chain (ETC) is in a reduced state (Malpica et al. 2004), and Fnr derepresses anaerobic metabolism genes when its iron–sulfur (Fe–S) clusters are not oxidized (Myers et al. 2013). Fnr activity is captured by three iModulons with similar activities in these samples; we therefore focus on Fnr-3, which has the highest explained variance of the three. ArcA and Fnr-3 activities are correlated (black line, Fig. 5b) since they both sense different features of the same underlying cellular redox state.
Fig. 5.
iModulon activities and mutations reveal hallmarks of redox metabolism, iron uptake, and uncharacterized genes, which may facilitate temperature adaptation. All figures use the colors, and all scatterplots use the shapes, given in the legend (top middle). For bar graphs, error bars represent mean ± 95% confidence interval. a to c) iModulon activities for ArcA, which regulates aerobic metabolism (by sensing oxidation state of quinones in the ETS), and Fnr-3, which regulates anaerobic metabolism (by sensing oxidative damage to the Fnr Fe–S cluster). b) Linear fits for the evolved samples (short green line) and all other samples (long black line) are shown. Temperature shifts expression up the trendline toward a less oxidized state, and the evolved strains have shifted their trend leftward, likely due to an arcB E118G mutation. d) Rate-yield plot demonstrating the effect of temperature and evolution on biomass yield and glucose uptake rate. Gray dotted lines indicate isoclines with constant growth rate according to the labels on the right. Colored arrows show the effect of a change in temperature (blue and purple) and evolution (green). e to g) iModulon activities for Fur-1 and Fur-2, which regulate iron uptake and are fit to a logarithmic curve. h) Zoomed in version of f), showing that raising the temperature tends to shift samples above the trendline, toward Fur-2 expression. Fur-2 contains feoABC, the simple iron transporter, whereas Fur-1 contains the more metabolically expensive and less necessary siderophore synthesis pathways. The arrow labeled “Iron Concentration” shows the direction of increasing iron concentration from other studies in PRECISE-1K, and the one labeled “Temp” describes the observed trend from this study. i) Distance to the trendline from f and h), showing that increasing temperatures shifts the preference of Fur toward activating Fur-2. 
j) feoB gene expression is correlated with temperature in all strains, which is consistent with the association between Fur-2 and high temperature. k and l) Knowledge graphs for each of the temperature adaptation mechanisms presented in this figure, where numbering is consistent with Fig. 2 and additional details are available in Table S4.
As temperature increases, gene expression shifts upward and leftward in the ArcA/Fnr-3 phase plane (Fig. 5a to c). This shift indicates that high temperatures decrease oxidation, which is consistent with the decrease in oxygen solubility as temperatures increase (Gevantman 2022), but may also be the result of other metabolic changes. Indeed, a study described how higher temperatures can lead to oxygen limitation in animals (Rubalcaba et al. 2020). Decreased oxygen solubility may cause the ETC to be more reduced and the Fe–S clusters to be less oxidized, causing changes to the activity state of these iModulons. The expression change will potentially decrease the production of reduced nicotinamide adenine dinucleotide (NADH) and reliance on oxygen, since the downregulated ArcA iModulon contains aerobic metabolism and NADH-producing genes (Malpica et al. 2004; Federowicz et al. 2014).
In addition to its effect on oxygen concentration, high temperature also increases the rate of autoxidation, a process in which reactive oxygen species (ROS) are generated by ETC components like NADH dehydrogenase (Messner and Imlay 1999; Belhadj Slimen et al. 2014). We predict that decreasing ArcA expression should decrease ETC activity and help to decrease the number of electrons that end up being wasted by this process at high temperatures (Messner and Imlay 1999; Belhadj Slimen et al. 2014). Interestingly, the 42c_3 strain and all fully evolved strains harbored the mutation arcB E118G, suggesting that this modification to the ArcAB system may be associated with survival at high temperatures. In Fig. 5b, we observe that ArcA iModulon activity has shifted to the left of the trendline formed by the other samples (Lamoureux et al. 2023), so we infer that this mutation increases the phosphorylation of ArcA by ArcB (van Beilen and Hellingwerf 2016). The arcB mutation would explain the shift in ArcA iModulon activities and potentially provide the benefit of decreasing autoxidation from the ETC, reinforcing a change induced by high temperatures. However, there ought to be a tradeoff to this mutation at lower temperatures, when autoxidation does not have as strong an effect on biomass yield.
We also note one outlier strain, hot_9, which did not upregulate Fnr iModulons or further downregulate the ArcA iModulon when temperature increased. This strain harbored the mutation gor G127D, which may have enhanced ROS detoxification by glutathione reductase or decreased autoxidation (Prinz et al. 1997) at high temperatures.
To explore the systems-level changes to energy metabolism that arise from these genomic and transcriptomic changes, we measured the glucose uptake rate and biomass yield of the wild-type and evolved strains at three temperatures and plotted them on a rate-yield plane (Fig. 5d). Regions of the rate-yield plane are associated with distinct states of energy metabolism called aero-types, as has been characterized in prior studies (Chen et al. 2021; Anand et al. 2022; Rychel et al. 2023). Samples with high biomass yields are in the highest aero-type, corresponding to efficient aerobic growth. Lower aero-types are progressively less efficient and pump fewer protons across the inner membrane during respiration. Within an aero-type, samples may shift left or right based on the rate of glucose uptake. We find that temperature strongly affects yield and uptake: cold samples are highly efficient (high aero-type), but unable to rapidly uptake glucose (Fig. 5d, blue arrow), whereas hot samples can rapidly take up glucose but have low yield (lower aero-type) due to heat-induced damage and waste (purple arrow).
The evolved strains at high temperature have higher uptake and yield compared with the wild type (Fig. 5d, arrow labeled “ALE”). The increased uptake may be due to the changes toward anaerobic metabolism, which utilizes more glucose (Federowicz et al. 2014). The increased yield demonstrates the successful adaptation of these strains to high-temperature conditions, which likely results from a combination of mechanisms that could include decreased autoxidation brought about by the shift toward anaerobic metabolism. We note that, on average, the effects of ALE at 37 °C are negligible. At 30 °C, on the other hand (Fig. 5d, green arrow with no label), yield decreases while glucose uptake rates remain low. This is consistent with the arcB mutation preventing upregulation of the high-yield aerobic pathways, which use glucose more efficiently in the wild type at low temperatures.
Thus, iModulon and glucose uptake rate-yield analysis have revealed a consistent prediction for the effects of temperature and mutations on energy metabolism (Fig. 5k). At high temperatures, dissolved oxygen decreases, and electrons leak from the ETC into ROS more readily (Belhadj Slimen et al. 2014), inducing a metabolic shift toward anaerobiosis, which is amplified by an arcB mutation in the high-temperature–tolerant evolved strains. A mutation in gor may alleviate some autoxidation at high temperatures. This shift successfully increases both glucose uptake and biomass yield at high temperatures but carries a tradeoff that decreases yield at lower temperatures. This temperature tolerance strategy, though not thoroughly validated, is informative for the fundamental biology of cross-stress tolerance and the relationship between stress and metabolism. Its mutations may also provide design variables of interest for fermentation applications, which may experience high temperatures or uneven oxygenation.
Fur Preferentially Derepresses feoB, a Commonly Mutated Iron Transporter
The two Fur iModulons regulate iron uptake systems and exhibit a nonlinear relationship (Fig. 5e to g) (Sastry et al. 2021). They explain ∼1% of the variance in the transcriptome and exhibit an interesting relationship with temperature, in which temperature shifts activity perpendicularly to the trendline (Fig. 5h and i). This behavior is observed in both wild-type and evolved strains, suggesting that it may be a fundamental feature of Fur binding. The effect is to prefer Fur-1 iModulon genes in lower temperatures and Fur-2 genes in higher temperatures. Fur-1 contains siderophore synthesis genes, which are needed when iron becomes less soluble at cold temperatures, as has been studied in Vibrio salmonicida (Colquhoun and Sørum 2001). Fur-2, on the other hand, contains less metabolically expensive ionic iron transporters, like feoABC (Smith et al. 2019), which is upregulated (Fig. 5j). These would be preferred at higher temperatures due to their lower cost and the readily available dissolved iron. Thus, we predict that the Fur transcription factor exhibits temperature sensitivity which may support a temperature-dependent preference for iron uptake systems (Fig. 5l).
Though there are no transcriptional regulatory mutations to the iron uptake system, the transporter gene feoB mutates in three of the six evolved strains (hot_4: F363L; hot_8: W699*; hot_9: F363L and V563M). Further research ought to probe the effects of these mutations on the temperature stability and function of FeoABC.
The yjfIJKL Operon May Be a New Heat Tolerance Operon
Finally, a large 2.2% of the explained variance in the transcriptome is attributed to a single operon of all uncharacterized genes, yjfIJKL, which constitutes the YjfJ iModulon. The iModulon was named as such because yjfJ encodes what was previously thought to be a putative transcription factor, and it was presumed to be the regulator of the operon during the initial PRECISE-1K curation. The iModulon is strongly upregulated in 42c_3 and the evolved strains (Fig. 6a), but not in any other samples in the dataset. This is predicted to be the result of a single nucleotide deletion 80 bp upstream of the operon (yjfI-pΔ1), in its promoter region (Fig. 6b).
Fig. 6.
Upregulation and structure prediction for yjfIJKL suggest a role in survival at high temperatures. a) YjfJ iModulon activity, describing the expression of the yjfIJKL operon. The iModulon appears to be activated in all evolved strains by a single nucleotide promoter deletion upstream of yjfI. This iModulon represents an unknown molecular process, but is a clear signal detected by ICA. b) Knowledge graph for this section, where numbering is consistent with Fig. 2 and additional details are available in Table S4. c to f) Predicted structures from AlphaFold (Jumper et al. 2021) for YjfJ iModulon member genes. See Table 1 for details. c) YjfJ 48-mer colored by QMEANDisCo confidence score (Studer et al. 2020). d) YjfK 1-mer, also colored by QMEANDisCo. e and f) YjfL 6-mer seen from top e) and side view f). Colored by residue index as shown in legend.
Given the large influence on the transcriptome, the likely causal mutation, and the lack of existing knowledge about these genes, we decided to perform structure prediction in order to improve functional annotation of the YjfJ iModulon and identify its possible role in high-temperature stress mitigation. Prior to this work, it was known that YjfJ exhibits homology to PspA, a phage shock protein (Jovanovic et al. 2014), which we confirmed using ssbio (41.8% similarity, score 88.5) (Mih et al. 2018). PspA uses a variety of interesting structural mechanisms to remodel bacterial membranes in response to stress (Junglas et al. 2021; Pfitzner et al. 2021; Thurotte et al. 2017). Using QSPACE (Catoiu et al. 2024), we were able to quickly find all structures for YjfJ in the SWISS-MODEL repository (Bienert et al. 2017), AlphaFold Database (Jumper et al. 2021; Varadi et al. 2022), and Protein Data Bank (PDB) (Rose et al. 2021). The SWISS-MODEL structure for YjfJ resembles an endosomal sorting complex required for transport-III (ESCRT-III)–like membrane channel (Fig. 6c) (Pfitzner et al. 2021), suggesting that YjfJ may also interact with the membrane. We also performed structural prediction for the other three genes in the iModulon: YjfK is likely a primitive outer membrane porin (Fig. 6d), YjfL may oligomerize into inner membrane channel-like porin structures (Fig. 6e and f), and YjfI had only low-confidence structure predictions typical for relaxed membrane lipoproteins that can form sheet-like structures in the periplasm. Details of each gene are summarized in Table 1.
Table 1.
Structural predictions for uncharacterized genes in the strongly upregulated YjfJ iModulon
| Gene | Description | Similar To | Location | Structure | Confidence |
|---|---|---|---|---|---|
| YjfI (b4181) | Alpha helix and beta sheet domains, low confidence | Membrane lipoproteins that form periplasmic sheets | Periplasm | N/A | N/A |
| YjfJ (b4182) | Fig. 6c. 55-mer pore structure with C-11 symmetry | PspA, Vipp1, and other ESCRT-III proteins that remodel membranes | Inner membrane | SWISS-MODEL (6zvr) | QSPRD: 0.3460 GMQE: 0.66 QMEANDisCo: 0.61 ± 0.05 Seq: 20.64% |
| YjfK (b4183) | Fig. 6d. Primitive outer membrane β-barrel porin | MliC, YqeJ, YnfC | Outer membrane | AlphaFold | pLDDT: 0.925 |
| YjfL (b4184) | Fig. 6e–f. Multiple transmembrane helices. Hexamer forms inner membrane channels | Channel-like porins | Inner membrane | AlphaFold Multimer (unrelaxed) | iPTM: 0.67 PTM: 0.71 Model: 0.68 |
QSPRD, quaternary structure score (Bertoni et al. 2017); GMQE, global model quality estimation (Benkert et al. 2011); QMEANDisCo, qualitative model energy analysis with distance constraints (global) (Studer et al. 2020); pLDDT, per-residue local distance difference test (Mariani et al. 2013); iPTM, interface predicted template modeling (Yin et al. 2022); PTM, predicted template modeling (Jumper et al. 2021); model: 0.8 × iPTM + 0.2 × PTM (Evans et al. 2022).
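As a quick check of the model score defined in the footnote, the weighted combination can be recomputed from the YjfL values reported in Table 1. The function name `multimer_model_confidence` is hypothetical and used here only for illustration.

```python
def multimer_model_confidence(iptm, ptm):
    """Weighted score 0.8 * iPTM + 0.2 * PTM, as given in the Table 1 footnote."""
    return 0.8 * iptm + 0.2 * ptm

# YjfL hexamer values taken from Table 1.
yjfl_score = multimer_model_confidence(iptm=0.67, ptm=0.71)
print(round(yjfl_score, 2))  # 0.68, matching the "Model" entry for YjfL
```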
Taken together, all four genes in the iModulon likely interact with the cellular envelope, and three of them (yjfJKL) appear to add pore structures to the membranes. Increasing the membrane protein concentration has been shown to increase membrane viscosity, counteracting a decrease in viscosity that occurs at high temperatures (Fábián et al. 2023). There is also a chance that larger openings formed by YjfJ are able to secrete proteins, including misfolded proteins, from the periplasm, or form vesicles as an envelope stress response (McBroom and Kuehn 2007). Thus, the mechanisms associated with the YjfJ iModulon are similar to those for the motility and Fimbriae iModulons discussed earlier: the evolved cells upregulate certain envelope-associated structures, with potential, predicted effects that increase membrane stability or protein aggregate efflux. The discovery of the YjfJ iModulon, its upregulation during temperature adaptation, and its putative functional annotation through structural proteomics provide clear impetus for undertaking a future study to detail its molecular functions. The promoter mutation we identified would be useful to increase expression in such a study.
Conclusion
In this study, we used ALE to produce six E. coli strains that can grow at 45.3 °C, a temperature lethal to wild-type cells. Although their hypermutator phenotype complicated a global, detailed genomic analysis, iModulon analysis revealed global transcriptomic adaptations. These included 14 iModulons that could be linked to 11 mutations, accounting for half of the observed gene expression variation in these strains (Fig. 2). We predict that the strains adapt to high temperatures by (i) specializing their stress response by downregulating RpoS and upregulating RpoH; (ii) activating flagellar basal bodies and fimbriae while downregulating FliA, with possible effects on envelope stability or the export of proteins; (iii) downregulating aerobic metabolism genes to counteract changes to oxygen solubility and autoxidation rates; (iv) upregulating and modifying ionic iron uptake while shifting away from unnecessary expression of siderophores; and (v) upregulating the previously uncharacterized yjfIJKL operon, which, based on our structural analysis, is likely to improve membrane stability.
The five mechanisms described above suggest three general principles for mesophilic microbes growing at high temperatures. First, stress responses and metabolism are streamlined: RpoS and ArcA regulons are downregulated, and shifting from Fur-1 to Fur-2 decreases wasteful siderophore pathway expression. Second, high temperatures drive protein aggregation, so the resulting aggregates must be cleared: the RpoH-dependent proteases and chaperones are upregulated (Gragerov et al. 1991), and large protein export systems such as flagellar basal bodies and fimbriae are upregulated. Third, envelope viscosity and stability improve through upregulation of membrane protein systems, including the flagellar basal body, fimbriae, and yjfIJKL operon.
This study and similar work on ROS tolerance (Rychel et al. 2023) emphasize the value of transcriptomic analysis through iModulons for building a multilevel (or multiomic; Fig. 2) understanding of cellular stress tolerance phenotypes. In both cases, the stress response becomes specialized for the given strain by modifying activity of RpoS while leaving the specific stress regulon (RpoH or SoxS) to function as it does in wild type. Interestingly, both cases also showed a shift toward anaerobiosis and higher preference for Fur-2 ionic iron transport, but with different predicted underlying mechanisms. The rich information gleaned from these ALE experiments and transcriptomic datasets motivates further applications of iModulons for understanding unique strains. This effort will build up more examples associated with each iModulon and further enrich the field's working understanding of transcriptional regulation.
A particularly fruitful use of iModulon analysis in this study lies in the use of activity phase planes (Figs. 3b, 4d, 4e, 5b, 5f, 5h). Each figure showed a trend that was observed across the >1,000 samples of PRECISE-1K (Lamoureux et al. 2023), along with modifications to the overall trend resulting from regulatory changes in the evolved strains. This is an example of an emerging principle of data science in microbial physiology—i.e. learning with scale—as the evolved transcriptomic changes could not have been understood without the context of a large reference dataset.
The goal of this study was to predict mechanisms underlying temperature tolerization. iModulons proved to be a key tool for achieving this goal, particularly because mutational data were complicated by hypermutator phenotypes. Relating iModulons to selected mutations is important, but we recognize that this pursuit is somewhat limited in scope. The mutational mechanisms are inferred based on literature associations of the genes and regulons, as opposed to being individually experimentally validated. We rely on prior work in the literature, which allows us to cover more of the global features of the transcriptome in a single manuscript. It is also possible that observed phenotypes are the result of drift, rather than selection. This approach bears the risk of presenting incorrect conclusions, and thus we encourage future studies to more thoroughly validate the hypotheses presented here using traditional methods.
In addition to its contribution to the understanding of TRN adjustments at high temperatures, we anticipate that this study will lead to practical applications. Engineering flagellar basal bodies for heterologous protein export is a promising approach (Green et al. 2019), and we have implicated mutations in the ATPase genes fliIJ in a mechanism that upregulates the export basal body without the wasteful production of other motility proteins. We are also the first to report that high temperatures change the activity of ArcA and Fnr and predict the effects of this sensitivity on cellular metabolism. This observation could also be useful for designing cell factories, in which changes to oxygen and temperature commonly occur, and regulatory effects need to be precisely understood. Perturbations to temperature and oxygen levels in nature will also become more common and extreme as the climate changes, so strategies like these may be employed by wild bacteria as tolerance evolves.
Taken together, we presented a global characterization of laboratory evolved, high-temperature–tolerant strains of E. coli with emphasis on the transcriptome as opposed to the genome. Our multilevel approach was effective for understanding the coordination of multiple mechanisms resulting in temperature tolerization. It also predicted new mechanisms involved in temperature tolerization and resulted in the putative functional annotation of unknown genes. Given the availability of large amounts of transcriptomic data and tools like iModulon analysis, we believe that TRN evolution will continue to be elucidated for a variety of environmental challenges. Such studies will reveal how known cellular processes cooperate in generating tolerized phenotypes, and will discover new ones.
Materials and Methods
Microbial Strains
The starting strain of the original 42 °C evolution (Sandberg et al. 2014) was E. coli K-12 MG1655. Mutations for the evolved strains are listed on aledb.org and in Table S1.
Culture Conditions
All strains were grown and evolved in M9 minimal medium prepared by addition of 0.1 mM CaCl2, 2 mM MgSO4, 1× trace elements solution, 1× M9 salt solution, and 4 g/L D-glucose to Milli-Q water. The M9 salt solution was composed of 68 g/L Na2HPO4, 30 g/L KH2PO4, 5 g/L NaCl, and 10 g/L NH4Cl. The trace elements solution was prepared by mixing 27 g/L FeCl3⋅6 H2O, 1.3 g/L ZnCl2, 2 g/L CoCl2⋅6 H2O, 2 g/L Na2MoO4⋅2 H2O, 0.75 g/L CaCl2, 0.91 g/L CuCl2, and 0.5 g/L H3BO3 in a Milli-Q water solution consisting of 10% concentrated HCl by final volume. Sterilization was achieved in all solutions and media by filtration through a 0.22-μm polyvinylidene fluoride (PVDF) membrane.
Adaptive Laboratory Evolution
Stage I of the ALE experiment was started from isolates of the wild-type E. coli K-12 MG1655 and evolved at 42 °C as described previously (Sandberg et al. 2014). Clones were isolated from ten populations at the end of this experiment, and eight of them with distinct mutational histories were used to start the Stage II ALE experiment. Unfortunately, a contamination event early in the Stage II ALE led to the 42c_3 strain becoming the dominant strain in all flasks that were subsequently analyzed, as evidenced by DNA resequencing. Although this event led to less diverse starting conditions than were originally intended, it does suggest that mutations in 42c_3 were particularly beneficial in the ALE conditions, and diverse endpoint strains were still obtained.
All cultures during the Stage II evolution were grown in 35-mL flasks with a 15-mL working volume and were vigorously stirred at 1100 rpm to create a well-mixed and aerobic environment. Initial temperatures for these cultures were set to 42 °C. The temperatures were increased by 0.5 °C approximately every 150 generations (∼15 passages) to give the cultures time to optimize their growth under the new conditions. Due to the higher stress levels, temperature increases were only 0.25 °C above 44 °C. An automated system was used to propagate the evolving populations over the course of the ALE. To maintain the evolving populations in the exponential growth phase, their growth was periodically monitored by taking optical density measurements at a 600-nm wavelength (OD600) on a Tecan Sunrise plate reader (Fig. S1). Once the target OD600∼0.3 (∼1 on a 1-cm path length spectrophotometer) was reached, ∼0.66% of the cells in a population were passaged to fresh medium. Population samples along the adaptive trajectories were taken by mixing 800 μL of culture with 800 μL of 50% glycerol and stored at −80 °C for subsequent analysis (not reported).
DNA Sequencing and Mutation Calling
Growth-improved clones along the ALE trajectory were isolated and grown in the standard medium condition. Cells were then harvested while in exponential growth, and genomic DNA was extracted using a KingFisher Flex Purification system previously validated for the high throughput platform mentioned below (Marotz et al. 2017). Shotgun metagenomic sequencing libraries were prepared using a miniaturized version of the KAPA HyperPlus Illumina-compatible library prep kit (KAPA Biosystems). DNA extracts were normalized to 5 ng total input per sample using an Echo 550 acoustic liquid-handling robot (Labcyte Inc.), and 1/10 scale enzymatic fragmentation, end-repair, and adapter-ligation reactions were carried out using a Mosquito HTS liquid-handling robot (TTP Labtech Inc.). Sequencing adapters were based on the iTru protocol (Glenn et al. 2019), in which short universal adapter stubs are ligated first and then sample-specific barcoded sequences added in a subsequent polymerase chain reaction (PCR) step. Amplified and barcoded libraries were then quantified using a PicoGreen assay and pooled in approximately equimolar ratios before being sequenced on an Illumina HiSeq 4000 instrument.
Sequencing reads were filtered and trimmed using AfterQC version 0.9.7 (Chen et al. 2017b). We mapped reads to the E. coli K-12 MG1655 reference genome (NC_000913.3; Hayashi et al. 2006) using the breseq pipeline version 0.33.1 (Deatherage and Barrick 2014). Mutation analysis was performed using ALEdb (Phaneuf et al. 2019).
Physiological Characterization
Cultures were initially inoculated from −80 °C glycerol stocks and grown at 37 °C overnight. Physiological adaptation was achieved by growing cell cultures exponentially over two passages for five to ten generations at the target temperature for phenotypic characterization. Next, cultures growing at the exponential growth phase were passaged to a 15-mL working volume tube and grown fully aerated. Spectrophotometer readings at OD600 were periodically taken (Thermo Fisher Scientific, Waltham, MA, USA) until stationary phase was reached. Growth rates were determined for each culture by least-squares linear regression of ln(OD600) versus time.
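The growth-rate calculation described above can be sketched as follows. This is a minimal illustration with synthetic OD600 readings; the function name `growth_rate` and the toy values are assumptions, not the code or data used in the study.

```python
from math import exp, log

def growth_rate(times_h, od600):
    """Slope of ln(OD600) versus time (1/h) by ordinary least squares."""
    ln_od = [log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_ln = sum(ln_od) / n
    num = sum((t - mean_t) * (l - mean_ln) for t, l in zip(times_h, ln_od))
    den = sum((t - mean_t) ** 2 for t in times_h)
    return num / den

# Synthetic exponential-phase readings generated with a known rate of 0.7 per hour.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
od = [0.05 * exp(0.7 * t) for t in times]

mu = growth_rate(times, od)      # recovers approximately 0.7 per hour
doubling_time_h = log(2) / mu    # roughly one hour for this toy culture
```

Only readings from the exponential phase should be passed to such a fit, since ln(OD600) is linear in time only while growth is exponential.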
Samples were filtered through a 0.22-μm filter (MilliporeSigma, Burlington, MA, USA) at the same time that OD600 measurements were taken, and the filtrate was analyzed for glucose and acetate concentrations using a high-performance liquid chromatography system (Agilent Technologies, Santa Clara, CA, USA) with an Aminex HPX-87H column (Bio-Rad Laboratories, Hercules, CA, USA). Glucose uptake rates and acetate production rates in exponential growth were determined by best-fit linear regression of glucose and acetate concentrations versus cell dry weights, multiplied by growth rates over the same sample range. The above-described phenotypic characterizations were performed for two biological replicates of each of the selected clonal isolates along the ALE trajectory, at 30, 37, and 44 °C.
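The rate calculation above (regression slope of concentration versus cell dry weight, multiplied by growth rate) can be sketched as follows. The units (mmol/L for glucose, gDW/L for biomass), the toy values, and the names `ols_slope` and `uptake_rate` are assumptions made for illustration; the conversion from OD600 to cell dry weight is not specified here and is taken as already applied.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y versus x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

def uptake_rate(biomass_gdw_per_l, glucose_mmol_per_l, mu_per_h):
    """Specific glucose uptake rate in mmol/gDW/h (reported as a positive number)."""
    slope = ols_slope(biomass_gdw_per_l, glucose_mmol_per_l)  # mmol/gDW, negative
    return -slope * mu_per_h

# Toy exponential-phase samples: 12 mmol glucose consumed per gDW formed.
biomass = [0.05, 0.10, 0.20, 0.40]
glucose = [22.0, 21.4, 20.2, 17.8]
mu = 0.7  # growth rate in 1/h over the same sample range (assumed)

q_glc = uptake_rate(biomass, glucose, mu)        # about 8.4 mmol/gDW/h
yield_gdw_per_mmol = -1.0 / ols_slope(biomass, glucose)
```

Under these definitions the growth rate equals yield times uptake rate, which is why lines of constant growth rate appear as isoclines on a rate-yield plane such as Fig. 5d.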
RNA Sequencing
During phenotypic characterization, 3 mL of cell broth was taken at OD600∼0.6 and immediately added to 2 volumes of QIAGEN RNAprotect Bacteria Reagent (6 mL). Then, the sample was vortexed for 5 s, incubated at room temperature for 5 min, and immediately centrifuged for 10 min at 5000 × g. The supernatant was decanted, and the cell pellet was stored at −80 °C. Cell pellets were thawed and incubated with Ready-lyse Lysozyme, SuperaseIn, Proteinase K, and 20% sodium dodecyl sulfate (SDS) for 20 min at 37 °C. Total RNA was isolated and purified using the RNeasy Plus Mini Kit (QIAGEN) columns following vendor procedures. An on-column DNase treatment was performed for 30 min at room temperature. RNA was quantified using a Nanodrop and quality assessed by running an RNA-nano chip on a bioanalyzer. The rRNA was removed using Illumina Ribo-Zero rRNA Removal Kit (Gram-negative bacteria). The quantity was determined by Nanodrop 1000 spectrophotometer (Thermo Scientific). The quality was checked using RNA 6000 Pico Kit using Agilent 2100 Bioanalyzer (Agilent). Paired-end, strand-specific RNA-seq libraries were built with the KAPA RNA Hyper Prep kit (KAPA Biosystems) following the manufacturer's instructions. Libraries were sequenced on an Illumina HiSeq 4000 instrument.
As part of the PRECISE-1K dataset (Lamoureux et al. 2023), transcriptomic reads were mapped using our pipeline (https://github.com/SBRG/iModulonMiner) (Sastry et al. 2024) and run on Amazon Web Services Batch. First, raw read trimming was performed using Trim Galore with default options, followed by FastQC on the trimmed reads. Next, reads were aligned to the E. coli K-12 MG1655 reference genome (NC_000913.3) (Hayashi et al. 2006) using Bowtie (Langmead et al. 2009). The read direction was inferred using RSeQC (Wang et al. 2012). Read counts were generated using featureCounts (Liao et al. 2014). All quality control metrics were compiled using MultiQC (Ewels et al. 2016). Finally, the expression dataset was reported in units of log-transformed transcripts per million (log(TPM)).
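As a rough illustration of the final unit conversion, the sketch below turns per-gene read counts into TPM and then log-transforms them. The text only states that values are log-transformed; the base-2 logarithm and the pseudocount of 1 used here are assumptions, and the function names and toy values are hypothetical.

```python
import math


def tpm(counts, gene_lengths_bp):
    """Transcripts per million from read counts and gene lengths (bp)."""
    # Reads per kilobase corrects for gene length.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, gene_lengths_bp)]
    # Scaling so that the values in one sample sum to one million.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


def log_tpm(counts, gene_lengths_bp, pseudocount=1.0):
    """log2(TPM + pseudocount); base and pseudocount are assumptions."""
    return [math.log2(v + pseudocount) for v in tpm(counts, gene_lengths_bp)]


# Toy sample with three genes (hypothetical values).
toy_counts = [100, 200, 0]
toy_lengths = [1000, 2000, 500]
toy_tpm = tpm(toy_counts, toy_lengths)
toy_log_tpm = log_tpm(toy_counts, toy_lengths)
```

The first two genes have the same TPM despite different raw counts because the second gene is twice as long.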
All included samples passed rigorous quality control, with “high-quality” defined as (i) passing the following FastQC checks: per_base_sequence_quality, per_sequence_quality_scores, per_base_n_content, and adapter_content; (ii) having at least 500,000 reads mapped to the coding sequences of the reference genome (NC_000913.3); (iii) not being an outlier in a hierarchical clustering based on pairwise Pearson correlation between all samples in PRECISE-1K; and (iv) having a minimum Pearson correlation between biological replicates of 0.95.
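Criteria (ii) and (iv) are simple numeric cutoffs and can be sketched as below. This is a simplified illustration with hypothetical function names and toy profiles; the FastQC checks and the clustering-based outlier test of criteria (i) and (iii) are not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def passes_numeric_qc(mapped_cds_reads, replicate_a, replicate_b,
                      min_reads=500_000, min_correlation=0.95):
    """Check the read-depth and replicate-correlation cutoffs from the text."""
    enough_reads = mapped_cds_reads >= min_reads
    reproducible = pearson(replicate_a, replicate_b) >= min_correlation
    return enough_reads and reproducible


# Toy log(TPM) profiles for two replicates (hypothetical values).
rep1 = [1.0, 2.0, 3.0, 4.0]
rep2 = [1.1, 2.0, 2.9, 4.2]
```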
iModulon Computation and Curation
The full PRECISE-1K compendium, including the samples for this study, was used to compute iModulons using our previously described method (McConn et al. 2021; Lamoureux et al. 2023). The log(TPM) dataset X was first centered such that wild-type E. coli MG1655 samples in M9 minimal media with glucose had expression values of 0 for all genes. ICA was performed using the Scikit-Learn (v0.19.0) implementation of FastICA (Pedregosa et al. 2011). We performed 100 iterations of the algorithm across a range of dimensionalities, and for each dimensionality, we pooled and clustered the components with DBSCAN to find robust components that appeared in more than 50 of the iterations. If the dimensionality parameter is too high, ICA will begin to return single-gene components; if it is too low, the components will be too dense to represent biological signals. Therefore, we selected a dimensionality that was as high as possible without creating many single-gene components, as described (McConn et al. 2021). At the optimal dimensionality, the total number of iModulons was 201. The output is composed of the matrix M [genes × iModulons], which defines the relationship between each iModulon and each gene, and the matrix A [iModulons × samples], which contains the activity levels for each iModulon in each sample.
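The centering step can be illustrated with a short NumPy sketch. It assumes that "expression values of 0" means subtracting each gene's mean over the reference-condition replicates; the function name and toy matrix are hypothetical, and the ICA step itself is not reproduced here.

```python
import numpy as np


def center_on_reference(log_tpm, reference_columns):
    """Subtract each gene's mean reference expression from all samples.

    log_tpm: array of shape [genes x samples].
    reference_columns: indices of the reference-condition samples.
    """
    baseline = log_tpm[:, reference_columns].mean(axis=1, keepdims=True)
    return log_tpm - baseline


# Toy matrix: 3 genes x 4 samples; columns 0 and 1 are reference replicates.
toy_log_tpm = np.array([
    [5.0, 5.2, 7.1, 6.9],
    [2.0, 2.0, 1.0, 1.2],
    [8.0, 8.4, 8.1, 8.3],
])
X = center_on_reference(toy_log_tpm, [0, 1])
```

After centering, every value in X is a log fold change relative to the reference condition, so the decomposition X ≈ M · A yields activities in A that are likewise relative to that condition.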
For each iModulon, a threshold must be drawn in the M matrix to determine which genes are members of that iModulon. These thresholds are based on the distribution of gene weights. The highest weighted genes were progressively removed until the remaining weights had a D'Agostino K2 normality test statistic below 550. Thus, the iModulon member genes are outliers from an otherwise normal distribution. iModulon annotation and curation were performed by comparing the iModulons against the known TRN from RegulonDB (Santos-Zavaleta et al. 2019). Names, descriptions, and statistics for each iModulon are available from the PRECISE-1K manuscript (Lamoureux et al. 2023), iModulonDB (Rychel et al. 2021), and Table S3.
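The thresholding procedure can be sketched with `scipy.stats.normaltest`, which returns the D'Agostino K2 statistic. This is an illustrative sketch rather than the published implementation: ranking genes by absolute weight, the `min_remaining` guard, and the synthetic weights are assumptions made for the example.

```python
import numpy as np
from scipy import stats


def imodulon_members(gene_weights, cutoff=550, min_remaining=20):
    """Indices of genes removed before the remaining weights look normal."""
    weights = np.asarray(gene_weights, dtype=float)
    order = np.argsort(np.abs(weights))  # smallest |weight| first
    n_genes = len(weights)
    n_removed = 0
    while n_genes - n_removed > min_remaining:
        remaining = weights[order[: n_genes - n_removed]]
        k2, _ = stats.normaltest(remaining)  # D'Agostino K2 statistic
        if k2 < cutoff:
            break
        n_removed += 1  # drop the next highest-weighted gene and retest
    return np.sort(order[n_genes - n_removed:])


# Synthetic column of M: mostly small noise plus ten strongly weighted genes.
rng = np.random.default_rng(0)
toy_weights = rng.normal(0.0, 0.01, 2000)
toy_weights[:10] = 0.5
members = imodulon_members(toy_weights)
```

In this toy case the ten strongly weighted genes are removed one at a time, after which the remaining weights fall below the cutoff and the loop stops.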
Differential iModulon Activity Analysis
DiMAs were calculated as previously described (Sastry et al. 2019; Sastry et al. 2024). For each iModulon, a null distribution was generated by calculating the absolute difference between activity levels in each pair of biological replicates and fitting a log-normal distribution to them. For the groups being compared, their mean difference for each iModulon was compared with that iModulon's null distribution to obtain a P-value. The set of P-values for all iModulons was then false discovery rate (FDR) corrected to generate q-values. Activities were considered significant if they passed an absolute difference threshold of 5 and an FDR of 0.1. The main comparison in this study was between the wild-type strain at 42 °C (n = 1) and the combined set of all fully evolved strains at 44 °C (n = 12). This comparison is shown in Fig. 1f, and its P-values are reported in figure captions throughout the manuscript as well as in Table S3. We used the same statistical algorithm to compare the evolved strains at 30 °C (n = 12) and 44 °C (n = 12) in Fig. 1g.
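A minimal sketch of this test is given below, using only the Python standard library. It assumes a two-parameter log-normal (location fixed at 0) fitted by the mean and standard deviation of the log differences, a one-sided tail probability as the P-value, and Benjamini-Hochberg FDR correction; the function names and toy values are hypothetical.

```python
import math


def lognormal_tail(x, log_mean, log_sd):
    """P(D > x) for a log-normal variable D with the given log-scale parameters."""
    if x <= 0:
        return 1.0
    z = (math.log(x) - log_mean) / (log_sd * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def dima_p_value(replicate_abs_diffs, group_mean_diff):
    """P-value of a group difference against the replicate-difference null."""
    logs = [math.log(d) for d in replicate_abs_diffs if d > 0]
    log_mean = sum(logs) / len(logs)
    log_sd = math.sqrt(sum((v - log_mean) ** 2 for v in logs) / len(logs))
    return lognormal_tail(abs(group_mean_diff), log_mean, log_sd)


def benjamini_hochberg(p_values):
    """FDR-adjusted q-values, returned in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values


def is_significant(group_mean_diff, q_value, min_diff=5.0, max_fdr=0.1):
    """Apply the absolute-difference and FDR thresholds from the text."""
    return abs(group_mean_diff) > min_diff and q_value < max_fdr


# Toy null: absolute activity differences between replicate pairs.
toy_null = [0.5, 1.0, 0.8, 1.2, 0.6]
p_large = dima_p_value(toy_null, 10.0)
p_small = dima_p_value(toy_null, 0.7)
```

Because the null is built from replicate-to-replicate differences, an iModulon with noisy replicates needs a larger activity change to reach significance than one with tight replicates.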
iModulon Explained Variance Calculation
The explained variance for each iModulon in this study was calculated using our workflow (Sastry et al. 2024). Since iModulons are built on a matrix decomposition, the contribution of each one to the overall expression dataset can be calculated. For each iModulon, the column of M and the row of A for the evolved samples in this study were multiplied together, and the explained variance between the result and the full expression dataset was computed. These explained variance scores were used to size the subsets of the treemap in Fig. 1d and the icons in the third column of Fig. 2. Note that the variance explained by ICA is “knowledge-based” in contrast to the “statistic-based” variance explanation provided by the commonly used principal component analysis.
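The per-iModulon calculation can be sketched as follows. This is one plausible reading of the description, not the workflow's exact code: it assumes X is already centered and measures explained variance as one minus the ratio of residual to total sum of squares. The function name and toy matrices are hypothetical.

```python
import numpy as np


def imodulon_explained_variance(X, M, A, k):
    """Fraction of the variation in X captured by iModulon k alone.

    X: [genes x samples] centered expression, M: [genes x iModulons],
    A: [iModulons x samples].
    """
    reconstruction = np.outer(M[:, k], A[k, :])  # column of M times row of A
    residual_ss = np.sum((X - reconstruction) ** 2)
    total_ss = np.sum(X ** 2)
    return 1.0 - residual_ss / total_ss


# Toy decomposition with two components (hypothetical values).
toy_M = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
toy_A = np.array([[2.0, -2.0],
                  [1.0, -1.0]])
toy_X = toy_M @ toy_A
ev_first = imodulon_explained_variance(toy_X, toy_M, toy_A, 0)
ev_second = imodulon_explained_variance(toy_X, toy_M, toy_A, 1)
```

In this toy case the two components do not overlap, so their explained variances add up to one; with real iModulons the contributions need not sum exactly to the total.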
Structure Comparison and Prediction
The QSPACE platform (Catoiu et al. 2024) was used to identify protein structures for YjfJ (b4182), YjfK (b4183), and YjfL (b4184) available in the PDB (Rose et al. 2021), the SWISS-MODEL repository (Bienert et al. 2017), and the AlphaFold Database (Jumper et al. 2021; Varadi et al. 2022). The SWISS-MODEL for YjfJ (P0AF78-6zvs.5.A) was derived from a template of the ESCRT-III-like protein Vipp1 with C12 symmetry (PDB:6ZVS; Liu et al. 2021); however, C11 to C18 symmetries have also been proposed for Vipp1 (Gupta et al. 2021; Liu et al. 2021). Vipp1 is homologous to PspA (Liu et al. 2021). In the default SWISS-MODEL workspace, the pseudo-heterooligomeric 6ZVS template yields a monomer. To obtain the 48mer-YjfJ, PDB:6ZVS was downloaded from RCSB-PDB. The original CIF file (6ZVS.cif) contains 72 chains for Vipp1 (258 amino acids), where all chains are missing positions 220-258. In addition, 24 chains are missing positions 1-25 and 159-219 and 36 chains are missing a proline at position 219. To obtain the largest homo-oligomeric template compatible with the SWISS-MODEL workspace, the 24 incomplete chains were removed and the proline at position 219 was removed from 36 of the remaining chains, resulting in a 48mer of Vipp1 (each chain contains positions 1-218). A custom script was used to convert multi-character chain IDs in the original CIF file to single-character alphanumeric chain IDs in the PDB template. This 48mer was used as a template to model YjfJ.
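The chain-ID conversion is needed because the legacy PDB format reserves a single column for the chain identifier. The sketch below shows one way such a mapping could be built; it is not the authors' custom script, and the function name and the example chain IDs are hypothetical. The 62 single-character alphanumeric IDs are enough for the 48 retained chains.

```python
import string

# 26 uppercase + 26 lowercase + 10 digits = 62 possible single-character IDs.
SINGLE_CHAR_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def build_chain_id_map(cif_chain_ids):
    """Map each distinct (possibly multi-character) chain ID to one character."""
    unique_ids = list(dict.fromkeys(cif_chain_ids))  # keep first-seen order
    if len(unique_ids) > len(SINGLE_CHAR_IDS):
        raise ValueError("too many chains for single-character PDB chain IDs")
    return {old: SINGLE_CHAR_IDS[i] for i, old in enumerate(unique_ids)}


# Hypothetical multi-character IDs standing in for the 48 retained chains.
example_ids = [f"A{a}{b}" for a in "ABCDEF" for b in "ABCDEFGH"]
chain_map = build_chain_id_map(example_ids)
```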
A high-confidence (pLDDT = 0.925) AlphaFold model (P39293) was identified for YjfK in the AlphaFold Database. Although the shape of the YjfK AF-model resembles a small outer membrane channel, neither the sequence-based prediction (DeepTMHMM [Hallgren et al. 2022]) nor the structure-based prediction (OPM PPM server 2.0 [Lomize et al. 2022]) was able to confirm that YjfK is a membrane protein. The YjfL hexamer was modeled using ColabFold (Mirdita et al. 2022) (AF-Multimer v2, model score = 0.68).
Acknowledgments
We thank Amitesh Anand, Xin Fang, and Patrick V. Phaneuf for helpful discussions. This work was supported by National Institutes of Health grants (grant numbers GM102098 and GM057089) and the Novo Nordisk Foundation (grant numbers NNF10CC1016517 and NNF20CC0035580). This research used resources of the National Energy Research Scientific Computing Center, supported by the US Department of Energy (contract number DE-AC02-05CH11231). Graphical abstract created in https://BioRender.com.
Contributor Information
Kevin Rychel, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Ke Chen, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Edward A Catoiu, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Elina Olson, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Troy E Sandberg, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Ye Gao, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Sibei Xu, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Ying Hefner, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Richard Szubin, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Arjun Patel, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
Adam M Feist, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
Bernhard O Palsson, Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark.
Supplementary Material
Supplementary material is available at Genome Biology and Evolution online.
Author Contributions
K.R., K.C., A.M.F., and B.O.P. designed the study. K.C., E.A.O., T.E.S., Y.G., S.X., Y.H., and R.S. performed experiments. K.R., K.C., and E.A.C. analyzed the data and wrote the manuscript, with contributions from all coauthors.
Data Availability
RNA-seq data have been deposited to GEO and are publicly available as of the date of publication, under accession number GSE140478. DNA-seq data are available from aledb.org under the project “Hot mutL” at https://aledb.org/metadata/?ale_experiment_id=188. iModulons and related data are available from iModulonDB.org under the dataset “E. coli PRECISE-1K.” The YjfJ 72-mer structure is available at https://www.rcsb.org/structure/6ZVS. All original code and data to generate figures are available at https://github.com/kevin-rychel/hot_tale, which also links to the alignment, ICA, and iModulon analysis workflows (Sastry et al. 2024). The repository is publicly available as of the date of publication. Any additional information required to reanalyze the data reported in this paper, including strains generated in this study, is available from the lead contact upon request: Bernhard Palsson (palsson@ucsd.edu).
Literature Cited
- Anand A, et al. Adaptive evolution reveals a tradeoff between growth rate and oxidative stress during naphthoquinone-based aerobic respiration. Proc Natl Acad Sci U S A. 2019:116:25287–25292. 10.1073/pnas.1909987116.
- Anand A, et al. OxyR is a convergent target for mutations acquired during adaptation to oxidative stress-prone metabolic states. Mol Biol Evol. 2020:37:660–667. 10.1093/molbev/msz251.
- Anand A, et al. Restoration of fitness lost due to dysregulation of the pyruvate dehydrogenase complex is triggered by ribosomal binding site modifications. Cell Rep. 2021:35:108961. 10.1016/j.celrep.2021.108961.
- Anand A, et al. Laboratory evolution of synthetic electron transport system variants reveals a larger metabolic respiratory system and its plasticity. Nat Commun. 2022:13:3682. 10.1038/s41467-022-30877-5.
- Belhadj Slimen I, et al. Reactive oxygen species, heat stress and oxidative-induced mitochondrial damage. A review. Int J Hyperthermia. 2014:30:513–523. 10.3109/02656736.2014.971446.
- Benkert P, Biasini M, Schwede T. Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics. 2011:27:343–350. 10.1093/bioinformatics/btq662.
- Bertoni M, Kiefer F, Biasini M, Bordoli L, Schwede T. Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Sci Rep. 2017:7:10480. 10.1038/s41598-017-09654-8.
- Bienert S, et al. The SWISS-MODEL repository—new features and functionality. Nucleic Acids Res. 2017:45:D313–D319. 10.1093/nar/gkw1132.
- Blatteis CM. Fever: pathological or physiological, injurious or beneficial? J Therm Biol. 2003:28:1–13. 10.1016/S0306-4565(02)00034-7.
- Bridges BA. Hypermutation in bacteria and other cellular systems. Philos Trans R Soc Lond B Biol Sci. 2001:356:29–39. 10.1098/rstb.2000.0745.
- Catoiu EA, et al. iModulonDB 2.0: dynamic tools to facilitate knowledge-mining and user-enabled analyses of curated transcriptomic datasets. Nucleic Acids Res. 2025:53:D99–D106. 10.1093/nar/gkae1009.
- Catoiu EA, Mih N, Lu M, Palsson B. Establishing comprehensive quaternary structural proteomes from genome sequence. eLife. 2024:13(RP100485). 10.1101/2024.04.24.590993.
- Cavicchioli R, et al. Scientists’ warning to humanity: microorganisms and climate change. Nat Rev Microbiol. 2019:17:569–586. 10.1038/s41579-019-0222-5.
- Chen K, et al. Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation. Proc Natl Acad Sci U S A. 2017a:114:11548–11553. 10.1073/pnas.1705524114.
- Chen K, et al. Bacterial fitness landscapes stratify based on proteome allocation associated with discrete aero-types. PLoS Comput Biol. 2021:17:e1008596. 10.1371/journal.pcbi.1008596.
- Chen S, et al. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data. BMC Bioinformatics. 2017b:18:80. 10.1186/s12859-017-1469-3.
- Colquhoun DJ, Sørum H. Temperature dependent siderophore production in Vibrio salmonicida. Microb Pathog. 2001:31:213–219. 10.1006/mpat.2001.0464.
- Couce A, Guelfo JR, Blázquez J. Mutational spectrum drives the rise of mutator bacteria. PLoS Genet. 2013:9:e1003167. 10.1371/journal.pgen.1003167.
- Dalldorf C, et al. The hallmarks of a tradeoff in transcriptomes that balances stress and growth functions. mSystems. 2024:9:e0030524. 10.21203/rs.3.rs-2729651/v1.
- Deatherage DE, Barrick JE. Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq. Methods Mol Biol. 2014:1151:165–188. 10.1007/978-1-4939-0554-6_12.
- Erickson JW, Vaughn V, Walter WA, Neidhardt FC, Gross CA. Regulation of the promoters and transcripts of rpoH, the Escherichia coli heat shock regulatory gene. Genes Dev. 1987:1:419–432. 10.1101/gad.1.5.419.
- Evans R, et al. Protein complex prediction with AlphaFold-Multimer [preprint]. bioRxiv 463034. 10.1101/2021.10.04.463034.
- Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016:32:3047–3048. 10.1093/bioinformatics/btw354.
- Fábián B, Vattulainen I, Javanainen M. Protein crowding and cholesterol increase cell membrane viscosity in a temperature dependent manner. J Chem Theory Comput. 2023:19:2630–2643. 10.1021/acs.jctc.3c00060.
- Federowicz S, et al. Determining the control circuitry of redox metabolism at the genome-scale. PLoS Genet. 2014:10:e1004264. 10.1371/journal.pgen.1004264.
- Ferenci T. Maintaining a healthy SPANC balance through regulatory and mutational adaptation. Mol Microbiol. 2005:57:1–8. 10.1111/j.1365-2958.2005.04649.x.
- Fitzgerald DM, Bonocora RP, Wade JT. Comprehensive mapping of the Escherichia coli flagellar regulatory network. PLoS Genet. 2014:10:e1004649. 10.1371/journal.pgen.1004649.
- Garnett JA, et al. Structural insights into the biogenesis and biofilm formation by the Escherichia coli common pilus. Proc Natl Acad Sci U S A. 2012:109:3950–3955. 10.1073/pnas.1106733109.
- Geibel S, Procko E, Hultgren SJ, Baker D, Waksman G. Structural and energetic basis of folded-protein transport by the FimD usher. Nature. 2013:496:243–246. 10.1038/nature12007.
- Gevantman LH. Solubility of selected gases in water. CRC Handbook of Chemistry and Physics; 2022.
- Glenn TC, et al. Adapterama I: universal stubs and primers for 384 unique dual-indexed or 147,456 combinatorially-indexed Illumina libraries (iTru & iNext). PeerJ. 2019:7:e7755. 10.7717/peerj.7755.
- Gottesman S. Trouble is coming: signaling pathways that regulate general stress responses in bacteria. J Biol Chem. 2019:294:11685–11700. 10.1074/jbc.REV119.005593.
- Gragerov AI, Martin ES, Krupenko MA, Kashlev MV, Nikiforov VG. Protein aggregation and inclusion body formation in Escherichia coli rpoH mutant defective in heat shock protein induction. FEBS Lett. 1991:291:222–224. 10.1016/0014-5793(91)81289-K.
- Green CA, et al. Engineering the flagellar type III secretion system: improving capacity for secretion of recombinant protein. Microb Cell Fact. 2019:18:10. 10.1186/s12934-019-1058-4.
- Gupta TK, et al. Structural basis for VIPP1 oligomerization and maintenance of thylakoid membrane integrity. Cell. 2021:184:3643–3659.e23. 10.1016/j.cell.2021.05.011.
- Hallgren J, et al. DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks [preprint]. bioRxiv 487609. 10.1101/2022.04.08.487609.
- Hayashi K, et al. Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110. Mol Syst Biol. 2006:2:2006.0007. 10.1038/msb4100049.
- Hinthong W, et al. Effect of temperature on fimbrial gene expression and adherence of enteroaggregative Escherichia coli. Int J Environ Res Public Health. 2015:12:8631–8643. 10.3390/ijerph120808631.
- Hughes D, Andersson DI. Evolutionary consequences of drug resistance: shared principles across diverse targets and organisms. Nat Rev Genet. 2015:16:459–471. 10.1038/nrg3922.
- Jovanovic G, et al. The N-terminal amphipathic helices determine regulatory and effector functions of phage shock protein A (PspA) in Escherichia coli. J Mol Biol. 2014:426:1498–1511. 10.1016/j.jmb.2013.12.016.
- Jumper J, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021:596:583–589. 10.1038/s41586-021-03819-2.
- Junglas B, et al. PspA adopts an ESCRT-III-like fold and remodels bacterial membranes. Cell. 2021:184:3674–3688.e18. 10.1016/j.cell.2021.05.042.
- Kavvas ES, et al. Experimental evolution reveals unifying systems-level adaptations but diversity in driving genotypes. mSystems. 2022:7:e00165-22. 10.1128/msystems.00165-22.
- Kim JM, Garcia-Alcala M, Balleza E, Cluzel P. Stochastic transcriptional pulses orchestrate flagellar biosynthesis in Escherichia coli. Sci Adv. 2020:6:eaax0947. 10.1126/sciadv.aax0947.
- Lamoureux CR, et al. A multi-scale expression and regulation knowledge base for Escherichia coli. Nucleic Acids Res. 2023:51:10176–10193. 10.1093/nar/gkad750.
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009:10:R25. 10.1186/gb-2009-10-3-r25.
- Lee C, Park C. Mutations upregulating the flhDC operon of Escherichia coli K-12. J Microbiol. 2013:51:140–144. 10.1007/s12275-013-2212-z.
- Lehnen D, et al. LrhA as a new transcriptional key regulator of flagella, motility and chemotaxis genes in Escherichia coli. Mol Microbiol. 2002:45:521–532. 10.1046/j.1365-2958.2002.03032.x.
- Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014:30:923–930. 10.1093/bioinformatics/btt656.
- Liu J, et al. Bacterial Vipp1 and PspA are members of the ancient ESCRT-III membrane-remodeling superfamily. Cell. 2021:184:3660–3673.e18. 10.1016/j.cell.2021.05.041.
- Lomize AL, Todd SC, Pogozheva ID. Spatial arrangement of proteins in planar and curved membranes by PPM 3.0. Protein Sci. 2022:31:209–220. 10.1002/pro.4219.
- Malpica R, Franco B, Rodriguez C, Kwon O, Georgellis D. Identification of a quinone-sensitive redox switch in the ArcB sensor kinase. Proc Natl Acad Sci U S A. 2004:101:13318–13323. 10.1073/pnas.0403064101.
- Mariani V, Biasini M, Barbato A, Schwede T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics. 2013:29:2722–2728. 10.1093/bioinformatics/btt473.
- Marotz C, et al. DNA extraction for streamlined metagenomics of diverse environmental samples. Biotechniques. 2017:62:290–293. 10.2144/000114559.
- McBroom AJ, Kuehn MJ. Release of outer membrane vesicles by Gram-negative bacteria is a novel envelope stress response. Mol Microbiol. 2007:63:545–558. 10.1111/j.1365-2958.2006.05522.x.
- McConn JL, Lamoureux CR, Poudel S, Palsson BO, Sastry AV. Optimal dimensionality selection for independent component analysis of transcriptomic data. BMC Bioinformatics. 2021:22:584. 10.1186/s12859-021-04497-7.
- Messner KR, Imlay JA. The identification of primary sites of superoxide and hydrogen peroxide formation in the aerobic respiratory chain and sulfite reductase complex of Escherichia coli. J Biol Chem. 1999:274:10119–10128. 10.1074/jbc.274.15.10119.
- Mih N, et al. ssbio: a Python framework for structural systems biology. Bioinformatics. 2018:34:2155–2157. 10.1093/bioinformatics/bty077.
- Minamino T, Namba K. Self-assembly and type III protein export of the bacterial flagellum. Microb Physiol. 2004:7:5–17. 10.1159/000077865.
- Mirdita M, et al. ColabFold: making protein folding accessible to all. Nat Methods. 2022:19:679–682. 10.1038/s41592-022-01488-1.
- Modrich P. Mechanisms in E. coli and human mismatch repair (Nobel lecture). Angew Chem Int Ed. 2016:55:8490–8501. 10.1002/anie.201601412.
- Morita MT, et al. Translational induction of heat shock transcription factor sigma 32: evidence for a built-in RNA thermosensor. Genes Dev. 1999:13:655–665. 10.1101/gad.13.6.655.
- Myers KS, et al. Genome-scale analysis of Escherichia coli FNR reveals complex features of transcription factor binding. PLoS Genet. 2013:9:e1003565. 10.1371/journal.pgen.1003565.
- Myers KS, Park DM, Beauchene NA, Kiley PJ. Defining bacterial regulons using ChIP-seq. Methods. 2015:86:80–88. 10.1016/j.ymeth.2015.05.022.
- Nguyen V, et al. Evolutionary drivers of thermoadaptation in enzyme catalysis. Science. 2017:355:289–294. 10.1126/science.aah3717.
- Ottemann KM, Miller JF. Roles for motility in bacterial-host interactions. Mol Microbiol. 1997:24:1109–1117. 10.1046/j.1365-2958.1997.4281787.x.
- Peabody GL, Winkler J, Kao KC. Tools for developing tolerance to toxic chemicals in microbial systems and perspectives on moving the field forward and into the industrial setting. Curr Opin Chem Eng. 2014:6:9–17. 10.1016/j.coche.2014.08.001.
- Pedregosa F, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011:12:2825–2830.
- Pfitzner A-K, Moser von Filseck J, Roux A. Principles of membrane remodeling by dynamic ESCRT-III polymers. Trends Cell Biol. 2021:31:856–868. 10.1016/j.tcb.2021.04.005.
- Phaneuf PV, Gosting D, Palsson BO, Feist AM. ALEdb 1.0: a database of mutations from adaptive laboratory evolution experimentation. Nucleic Acids Res. 2019:47:D1164–D1171. 10.1093/nar/gky983.
- Prinz WA, Åslund F, Holmgren A, Beckwith J. The role of the thioredoxin and glutaredoxin pathways in reducing protein disulfide bonds in the Escherichia coli cytoplasm. J Biol Chem. 1997:272:15661–15667. 10.1074/jbc.272.25.15661.
- Rose Y, et al. RCSB Protein Data Bank: architectural advances towards integrated searching and efficient access to macromolecular structure data from the PDB archive. J Mol Biol. 2021:433:166704. 10.1016/j.jmb.2020.11.003.
- Rubalcaba JG, Verberk WCEP, Hendriks AJ, Saris B, Woods HA. Oxygen limitation may affect the temperature and size dependence of metabolism in aquatic ectotherms. Proc Natl Acad Sci U S A. 2020:117:31963–31968. 10.1073/pnas.2003292117.
- Rudenko I, Ni B, Glatter T, Sourjik V. Inefficient secretion of anti-sigma factor FlgM inhibits bacterial motility at high temperature. iScience. 2019:16:145–154. 10.1016/j.isci.2019.05.022.
- Rychel K, et al. iModulonDB: a knowledgebase of microbial transcriptional regulation derived from machine learning. Nucleic Acids Res. 2021:49:D112–D120. 10.1093/nar/gkaa810.
- Rychel K, et al. Laboratory evolution, transcriptomics, and modeling reveal mechanisms of paraquat tolerance. Cell Rep. 2023:42:113105. 10.1016/j.celrep.2023.113105.
- Sandberg TE, et al. Evolution of Escherichia coli to 42 °C and subsequent genetic engineering reveals adaptive mechanisms and novel mutations. Mol Biol Evol. 2014:31:2647–2662. 10.1093/molbev/msu209.
- Sandberg TE, Salazar MJ, Weng LL, Palsson BO, Feist AM. The emergence of adaptive laboratory evolution as an efficient tool for biological discovery and industrial biotechnology. Metab Eng. 2019:56:1–16. 10.1016/j.ymben.2019.08.004.
- Santos-Zavaleta A, et al. RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12. Nucleic Acids Res. 2019:47:D212–D220. 10.1093/nar/gky1077.
- Sastry AV, et al. The Escherichia coli transcriptome mostly consists of independently regulated modules. Nat Commun. 2019:10:13483. 10.1038/s41467-019-13483-w.
- Sastry AV, et al. Machine learning of bacterial transcriptomes reveals responses underlying differential antibiotic susceptibility. mSphere. 2021:6:e00443-21. 10.1128/mSphere.00443-21.
- Sastry AV, et al. iModulonMiner and PyModulon: Software for unsupervised mining of gene expression compendia. PLoS Comput Biol. 2024:20. 10.1371/journal.pcbi.1012546.
- Schellhorn HE. Function, evolution, and composition of the RpoS regulon in Escherichia coli. Front Microbiol. 2020:11:560099. 10.3389/fmicb.2020.560099.
- Schlegel S, et al. Optimizing heterologous protein production in the periplasm of E. coli by regulating gene expression levels. Microb Cell Fact. 2013:12:24. 10.1186/1475-2859-12-24.
- Shin S, Park C. Modulation of flagellar expression in Escherichia coli by acetyl phosphate and the osmoregulator OmpR. J Bacteriol. 1995:177:4696–4702. 10.1128/jb.177.16.4696-4702.1995.
- Singer HM, et al. Selective purification of recombinant neuroactive peptides using the flagellar type III secretion system. mBio. 2012:3:e00115-12. 10.1128/mBio.00115-12.
- Smith AT, et al. The FeoC [4Fe-4S] cluster is redox-active and rapidly oxygen-sensitive. Biochemistry. 2019:58:4935–4949. 10.1021/acs.biochem.9b00745.
- Soutourina O, et al. Multiple control of flagellum biosynthesis in Escherichia coli: role of H-NS protein and the cyclic AMP-catabolite activator protein complex in transcription of the flhDC master operon. J Bacteriol. 1999:181:7500–7508. 10.1128/JB.181.24.7500-7508.1999.
- Sperandio V, Torres AG, Kaper JB. Quorum sensing Escherichia coli regulators B and C (QseBC): a novel two-component regulatory system involved in the regulation of flagella and motility by quorum sensing in E. coli. Mol Microbiol. 2002:43:809–821. 10.1046/j.1365-2958.2002.02803.x.
- Studer G, et al. QMEANDisCo—distance constraints applied on model quality estimation. Bioinformatics. 2020:36:1765–1771. 10.1093/bioinformatics/btz828.
- Swings T, et al. Adaptive tuning of mutation rates allows fast response to lethal stress in Escherichia coli. eLife. 2017:6:e22939. 10.7554/eLife.22939.
- Tenaillon O, et al. The molecular diversity of adaptive convergence. Science. 2012:335:457–461. 10.1126/science.1212986.
- Thurotte A, Brüser T, Mascher T, Schneider D. Membrane chaperoning by members of the PspA/IM30 protein family. Commun Integr Biol. 2017:10:e1264546. 10.1080/19420889.2016.1264546.
- Utrilla J, et al. Global rebalancing of cellular resources by pleiotropic point mutations illustrates a multi-scale mechanism of adaptive evolution. Cell Syst. 2016:2:260–271. 10.1016/j.cels.2016.04.003.
- van Beilen JWA, Hellingwerf KJ. All three endogenous quinone species of Escherichia coli are involved in controlling the activity of the aerobic/anaerobic response regulator ArcA. Front Microbiol. 2016:7:1339. 10.3389/fmicb.2016.01339.
- Varadi M, et al. AlphaFold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022:50:D439–D444. 10.1093/nar/gkab1061.
- Vavitsas K, Glekas PD, Hatzinikolaou DG. Synthetic biology of thermophiles: taking bioengineering to the extremes? Appl Microbiol. 2022:2:165–174. 10.3390/applmicrobiol2010011.
- Wang L, Wang S, Li W. RSeQC: quality control of RNA-Seq experiments. Bioinformatics. 2012:28:2184–2185. 10.1093/bioinformatics/bts356.
- Yin R, Feng BY, Varshney A, Pierce BG. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci. 2022:31:e4379. 10.1002/pro.4379.
- Yura T. Regulation of the heat shock response in Escherichia coli: history and perspectives. Genes Genet Syst. 2019:94:103–108. 10.1266/ggs.19-00005.
- Zambrano MM, Siegele DA, Almirón M, Tormo A, Kolter R. Microbial competition: Escherichia coli mutants that take over stationary phase cultures. Science. 1993:259:1757–1760. 10.1126/science.7681219.
Data Availability Statement
RNA-seq data have been deposited in GEO under accession number GSE140478 and are publicly available as of the date of publication. DNA-seq data are available from aledb.org under the project “Hot mutL” at https://aledb.org/metadata/?ale_experiment_id=188. iModulons and related data are available from iModulonDB.org under the dataset “E. coli PRECISE-1K.” The YjfJ 72-mer structure is available at https://www.rcsb.org/structure/6ZVS. All original code and data needed to generate the figures are available at https://github.com/kevin-rychel/hot_tale, which also links to the alignment, ICA, and iModulon analysis workflows (Sastry et al. 2024); this repository is publicly available as of the date of publication. Any additional information required to reanalyze the data reported in this paper, including strains generated in this study, is available upon request from the lead contact, Bernhard Palsson (palsson@ucsd.edu).







