Abstract
One-carbon compounds, such as formate, are promising and sustainable feedstocks for microbial bioproduction of fuels and chemicals. Growth of Escherichia coli on formate was recently achieved by introducing the reductive glycine pathway (rGlyP), which is theoretically the most energy-efficient aerobic formate assimilation pathway, into its genome. Although adaptive laboratory evolution enhanced the growth rate and biomass yield significantly, even the best-performing formatotrophic E. coli strain did not approach the theoretical optimal biomass yield of the rGlyP. In this study, we investigated these previously engineered formatotrophic E. coli strains to find out why the biomass yield was suboptimal and how it may be improved. Through a combination of metabolic modelling, genomic and proteomic analysis, we identified several potential metabolic bottlenecks and future targets for optimization. This study also provides further insights into the evolutionary mutations and related changes in proteome allocation that supported the already substantially improved growth of formatotrophic E. coli strains. This systems-level analysis provides key insights to realize high-yield, fast-growing formatotrophic strains for future bioproduction.
Keywords: Escherichia coli, Metabolic modelling, C1-assimilation, Formate, Reductive glycine pathway
1. Introduction
Emissions of CO2 originating from fossil resources are a global problem leading to a climate crisis with potentially disastrous consequences [1,2]. A promising alternative to fossil-based production methods is microbial bioproduction of value-added chemicals and fuels. However, current microbial bioproduction processes often rely on agriculture-intensive feedstocks like glucose, which compete for arable land and resources with food production, and thus cannot be scaled to the extent required to replace fossil resources. Therefore, one-carbon compounds such as formate, methanol, methane, and carbon monoxide are being explored as sustainable feedstocks for microbial bioproduction [3–5]. Formate and methanol are of particular interest due to their high solubility in water [6,7]. These compounds can be produced directly (formate) or in two steps (methanol) by electrochemical reduction of CO2 using electricity derived from renewable resources. In this work we focus on formate as a feedstock.
Some natural formate-utilizing microbes have been identified, but few have been well-characterized, and they generally lack effective and robust genetic engineering tools [7]. Consequently, optimization of metabolic pathways in most natural formatotrophs is difficult and engineering of novel (production) pathways is often not feasible. In order to explore beyond the limited product spectrum offered by natural formatotrophs, formate assimilation pathways can instead be introduced in model hosts like Escherichia coli. Several natural and synthetic formate assimilation pathways have been described, such as the Calvin–Benson–Bassham (CBB) cycle, the Serine Cycle, the Wood-Ljungdahl Pathway, the Serine-Threonine Cycle, and the reductive Glycine Pathway (rGlyP) [8,9]. The rGlyP is the most energy-efficient aerobic formate assimilation pathway [6]; it was first designed by Bar-Even et al. in 2013 [10] and later discovered in the natural formatotroph and autotroph Desulfovibrio desulfuricans [11]. Soon after the pathway was designed, efforts to engineer the rGlyP in E. coli started through a step-wise, modular engineering approach [12–15].
The rGlyP in E. coli consists of an energy generation reaction and a formate assimilation route (Fig. 1). The energy generation reaction is catalyzed by a heterologously expressed formate dehydrogenase from Pseudomonas sp. strain 101 (PsFDH), which oxidizes formate to CO2 to regenerate NADH [17]. The formate assimilation route was subdivided into three modules to facilitate stepwise integration and confirmation of individual modules in E. coli. In the first (C1) module, formate is ligated to tetrahydrofolate (THF) at the expense of ATP and reduced to 5,10-methylene-THF by three heterologously expressed enzymes originating from Methylorubrum extorquens (formerly Methylobacterium extorquens). Next, in the C2 module the C1 moiety of 5,10-methylene-THF is condensed with CO2 and NH3 resulting in the two-carbon amino acid glycine. This reaction is catalyzed by the (overexpressed) native glycine cleavage system (Gcv) of E. coli, operating in the reverse (reductive) direction, driven in this direction by elevated CO2 concentrations.
Fig. 1.
Reductive glycine pathway (rGlyP) as engineered in E. coli by Kim et al. [16]. Me: M. extorquens, PsFDH: formate dehydrogenase, MeFtfL: formate-THF ligase, MeFch: 5,10-methenyl-THF cyclohydrolase, MeMtdA: 5,10-methylene-THF dehydrogenase, GcvTHP, Lpd: glycine cleavage system, GlyA: serine hydroxymethyltransferase, SdaA: serine deaminase.
Finally, in the C3 module, glycine is ligated with another C1 moiety from 5,10-methylene-THF and converted to the C3 amino acid serine and then into pyruvate, by the native serine hydroxymethyltransferase (GlyA) and serine deaminase (SdaA), respectively. Pyruvate can then be converted to various biomass components via native metabolic pathways and possibly also into various bioproducts.
In 2020, Kim et al. [16] achieved full formatotrophic growth of E. coli by introducing the complete rGlyP into its genome and overexpressing the native enzymes involved in the rGlyP. This initial strain was named K4 and had a doubling time (DT) of 65–80 h on 28 mM of formate. K4 served as a base strain for further engineering to investigate and improve formatotrophic growth. Supplementary Table 1 gives an overview of growth rates of the various formatotrophic E. coli strains derived from K4.
Adaptive laboratory evolution (ALE) was applied to the K4 strain by Kim et al. [16], resulting in the K4e strain, which grew at a DT of 7.7 h on 153 mM formate. Thus, ALE achieved here not only an increased growth rate but also increased formate tolerance as the parental strain K4 did not grow at formate concentrations above ∼60 mM. Two key mutations were further investigated for involvement in this vast growth improvement. These mutations appeared to increase expression of the NADH-regenerating PsFDH, and the native membrane bound transhydrogenase complex (PntAB), which regenerates NADPH by oxidizing NADH. These two mutations were reverse engineered in the naive K4 strain, which did improve growth rate and formate tolerance substantially, but not completely to the extent of the K4e. Unlike K4e, the reconstructed strain (K4 g-FDH∗ g-PntAB∗) did not grow on 153 mM formate. On 109 mM formate the reconstructed strain had a DT of 9.7 h, which is slightly slower than the 7.9 h DT of K4e at the same formate concentration.
After further ALE on K4e, Kim et al. [18] isolated four variants of an even further evolved strain (K4e2), which had a DT of only 6 h on 150 mM formate and a further improved formate tolerance, as well as a biomass yield that increased from 2.3 (K4e) to 3.3 gCDW mol−1 formate. The growth improvement of K4e2 was in part attributed to decreased acetate production due to a mobile genetic element integration interrupting the promoter region of the gene encoding acetate kinase (ackA).
Acetate kinase and phosphate acetyltransferase (Pta) together catalyze the conversion of acetyl-CoA to acetate. During fast growth on glucose, this conversion is used as part of overflow metabolism to allow faster growth by investing fewer resources into respiratory chain enzymes while still yielding some energy. However, acetate production during growth on formate via the rGlyP is unfavorable as it has a net energy cost and does not contribute to biomass formation. Furthermore, the potential re-uptake of acetate costs additional ATP. Hence, Kim et al. [18] subsequently deleted the full ackA-pta operon in K4e2 as a more stable and rational approach to decrease acetate production, which resulted in slightly better growth than K4e2 with the interrupted ackA promoter. This ackA-pta deletion was also reconstructed in the predecessor strain K4e. However, while the reconstructed strain K4e ΔackA-pta had a decreased DT (from 8.5 to 7.2 h on 150 mM formate), it did not grow as fast as the fully evolved K4e2 ΔackA-pta (6.4 h). Thus, it appears that blocking acetate production through the AckA-Pta route does not explain the full extent of the K4e2 improvement over K4e. Other mutations in all four isolates of K4e2 were found in genes encoding the pyruvate dehydrogenase complex regulator (PdhR) and RNA polymerase subunit β' (RpoC) [18].
In a parallel effort, full formatotrophic growth of E. coli via the rGlyP was also demonstrated in another study; however, the best strain resulting from that study (FC8) had a doubling time of ∼65 h at best [19].
To summarize, while some mutations have been demonstrated to contribute to the growth improvement of the evolved strains K4e and K4e2 over their parental strain K4, there should be other mutations also contributing to the improved growth phenotypes of the evolved strains. In this study we further investigated the E. coli K4, K4e, and K4e2 strains at a genomic and proteomic level to further identify mutations and their effects on protein levels, aiming to further understand how E. coli evolved towards more optimal formatotrophic growth.
The K4e and K4e2 strains have recently also been used in proof-of-principle studies to make the relevant bioproducts lactate and polyhydroxybutyrate from formate [18,20]. However, the resulting production levels and rates are much too low for economically feasible sustainable bioproduction from formate. Furthermore, while the formatotrophic growth of K4e2 is much improved over the parental K4 strain, the growth rate and biomass yield are still relatively low compared to natural formatotrophs [6,21]. This indicates that the formatotrophic E. coli strains still have inefficiencies, such as bottlenecks or suboptimal fluxes in their metabolism, that limit biomass yields and productivities. Therefore, in this work we combined the omics-based approaches with metabolic modelling of the E. coli rGlyP metabolism to discover bottlenecks and potential targets for further optimization of these strains for improved growth on formate.
2. Methods
2.1. Strains and culture conditions
The formatotrophic E. coli strains used in this study are listed in Table 1. Lysogeny Broth (LB) medium (10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl) was used for rich medium cultivation. M9 minimal medium (50 mM Na2HPO4, 20 mM KH2PO4, 1 mM NaCl, 20 mM NH4Cl, 2 mM MgSO4 and 100 μM CaCl2, 134 μM EDTA, 13 μM FeCl3·6H2O, 6.2 μM ZnCl2, 0.76 μM CuCl2·2H2O, 0.42 μM CoCl2·2H2O, 1.62 μM H3BO3, 0.081 μM MnCl2·4H2O, set to pH 7.4) was used for all cultivations on formate. All cultivations were carried out at 37°C, with orbital shaking at 180 rpm. All cultivations on formate were carried out in incubators with 10 % CO2 and 90 % air.
Table 1.
E. coli strains used in this study.
| Strains | Description | Genomic knock-outs | Genomic integrations (same in all strains) | Source |
|---|---|---|---|---|
| E. coli MG1655 K4 | rGlyP genes in genome | ΔltaE Δkbl ΔserA ΔglyA ΔsdaA ΔyiaB | Me_ftfL-fch-mtdA (inserted at SS9, strong constitutive promoter PGI-20, medium-strength ribosome binding sites (RBSC)). Ec_glyA-sdaA (inserted at IS7, strong constitutive promoter PGI-20, medium-strength ribosome binding site (RBSC) and native RBS, respectively). Ps_fdh (inserted at IS10, under strong constitutive promoter PGI-20 and strong ribosome binding site (RBSA)). pstrong-gcvTHP (strong constitutive promoter PGI-20 and medium-strength RBSC inserted at native operon). | Adapted from Ref. [16] by Vittorio Rainaldi to restore aceA by P1 phage transduction. |
| E. coli MG1655 K4e | rGlyP genes in genome and evolved | ΔltaE Δkbl ΔserA ΔglyA ΔsdaA ΔyiaB | Same as K4 | Adapted from Ref. [16] by Vittorio Rainaldi to restore aceA by P1 phage transduction. |
| E. coli MG1655 K4e2 | rGlyP in genome and further evolved | ΔltaE Δkbl ΔserA ΔglyA ΔsdaA ΔaceA | Same as K4 | [18] |
2.2. Whole genome sequencing
Cells were grown overnight on LB and DNA extraction was performed using the DNeasy® Blood & Tissue kit manufactured by QIAGEN, according to the manufacturer's instructions (version 07/2020) for Gram-negative bacteria, including the optional RNase A digestion step. Whole genome sequencing was performed by Novogene using the Illumina NovaSeq 6000 platform for paired-end reads of 150 bp. Genome sequence data was analyzed using Geneious 10.0.9. Paired reads were mapped to a reference genome (Escherichia coli str. K-12 substr. MG1655, U00096.3) retrieved from NCBI, using default settings for mapping to a reference genome, and a consensus sequence was generated based on the majority of reads.
2.3. Quantitative proteomics
2.3.1. Cultivation and harvesting
Cell material for proteomics analysis was obtained from cultivations in 10 mL M9 with 30 mM (K4) or 80 mM (K4e & K4e2) formate, inoculated from pre-cultures on the same medium. These cultures were grown in 125 mL vented and baffled Erlenmeyer flasks and harvested during log growth phase (harvesting OD600 K4: 0.16–0.18, K4e: 0.54–0.67, K4e2: 0.30–0.43). The equivalent of 3 mL of OD600 = 1 was harvested (e.g. 6 mL of OD600 = 0.5). These cells were harvested by centrifugation (4500 g, 10 min, 4°C), and washed two times with 1 mL phosphate buffered saline (PBS) (8 g/L NaCl, 0.2 g/L KCl, 1.42 g/L Na2HPO4, 0.24 g/L KH2PO4, pH 7.4), then centrifuged again and supernatant was removed. Pellets were flash frozen in liquid nitrogen and stored at −80°C.
2.3.2. Protein extraction and alkylation
Cell pellets were resuspended in 300 μL lysis buffer (2 % sodium-lauroyl sarcosinate in 100 mM ammonium bicarbonate) and heated for 10 min at 90°C, shaking. The lysate was further sonicated by 60 ultrasonic pulses (VialTweeter, Hielscher Ultrasonics) and then samples were centrifuged for 10 min at 14,000 rpm at 4°C. Protein concentrations were determined using the Pierce™ BCA Protein Assay Kit (ThermoFisher Scientific), according to manufacturer's instructions. Then, 7.5 μL of TCEP buffer (0.2 M Tris (2-carboxy-ethyl) phosphine in 100 mM ammonium bicarbonate) was added and incubated for 10 min at 90°C, shaking. Next, 7.5 μL of 0.4 M iodoacetamide was added and samples were incubated for 30 min at 25°C, shaking (shielded from light).
2.3.3. Protein digestion
Based on the BCA results, 50 μg of total protein was digested by addition of 1 μg trypsin (Serva) in diluted lysis buffer (0.25 % sodium-lauroyl sarcosinate (SLS) with 100 mM ammonium bicarbonate) and overnight incubation at 30°C. After digestion, SLS was precipitated by adding trifluoroacetic acid (TFA, Thermo Fisher Scientific) to a final concentration of 1.5 %. Peptides were desalted using C18 solid phase extraction cartridges (Macherey-Nagel). Cartridges were prepared by adding acetonitrile (ACN), followed by equilibration with 0.1 % TFA. Peptides were loaded on equilibrated cartridges, washed with buffer containing 5 % ACN and 0.1 % TFA, and finally eluted with 50 % ACN and 0.1 % TFA.
2.3.4. LC-MS
Dried peptides were reconstituted in 0.1 % trifluoroacetic acid and then analyzed using liquid chromatography mass spectrometry (LC-MS) carried out on an Exploris 480 instrument connected to an Ultimate 3000 RSLC nano and a nanospray flex ion source (all Thermo Scientific). Peptide separation was performed on a reverse phase HPLC column (75 μm × 42 cm) packed in-house with C18 resin (2.4 μm; Dr. Maisch). The following separating gradient was used: 94 % solvent A (0.15 % formic acid) and 6 % solvent B (99.85 % acetonitrile, 0.15 % formic acid) to 25 % solvent B over 40 min, and an additional increase of solvent B to 35 % for 20 min at a flow rate of 300 nl/min.
MS raw data was acquired on an Exploris 480 (Thermo Scientific) in data independent acquisition (DIA) mode. Peptides were ionized at a spray voltage of 2.3 kV, the ion transfer tube temperature was set at 275°C, and 445.12003 m/z was used as internal calibrant. The funnel RF level was set to 40. For DIA experiments, full MS resolution was set to 120,000 at m/z 200, with a full MS AGC (Automatic Gain Control) target of 300 % and an injection time (IT) of 50 ms. Mass range was set to 350–1400. AGC target value for fragment spectra was set at 3000 %. 45 windows of 14 Da were used with an overlap of 1 Da. Resolution was set to 15,000 and IT to 22 ms. Stepped HCD collision energy of 25, 27.5, 30 % was used. MS1 data was acquired in profile, MS2 DIA data in centroid mode.
2.3.5. Data analysis
DIA data was analyzed using DIA-NN [22] as previously described [23], but using the UniProt database for E. coli with the following additional protein sequences: formate-THF ligase, 5,10-methenyl-THF cyclohydrolase, and 5,10-methylene-THF dehydrogenase from M. extorquens AM1, and formate dehydrogenase from Pseudomonas sp. (strain 101), each with an N-terminal His-tag (complete sequences in Supplementary). Full tryptic digest with up to three missed cleavage sites, oxidized methionines, and carbamidomethylated cysteines was set, with "match between runs" and "remove likely interferences" enabled. The neural network classifier was set to the single-pass mode, and protein inference was based on genes. Quantification strategy was set to any LC (high accuracy). Cross-run normalization was set to RT-dependent. Library generation was set to smart profiling. DIA-NN outputs were further evaluated using the SafeQuant script modified to process DIA-NN outputs [24,25].
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD050315 and in Supplementary data 1.
2.4. Constraint-based metabolic modelling
2.4.1. Model
The iJO1366 genome-scale metabolic model of E. coli MG1655 developed by Orth et al. [26] was modified and used in this study. Modifications to the model included adding the reactions in the rGlyP pathway not present yet in the model. For this, an NADH-regenerating formate dehydrogenase reaction (ID: FDH) was added and the glycine cleavage system reaction (ID: GLYCL) was made reversible. Additionally, the lower bound of the ATP maintenance reaction (ID: ATPM) was updated from 3.15 to 6.86 mmol gCDW−1 h−1 to match the most recent E. coli MG1655 model (iML1515) [27] and the NAD(P) transhydrogenase (periplasm) reaction (ID: THD2pp) was updated to import only one proton into the cytoplasm instead of two [28,29]. Furthermore, the NADP-dependent malic enzyme reaction (ID: ME2) was made reversible and the malate oxidase reaction (ID: MOX) was made irreversible (only allowing conversion of l-malate to oxaloacetate). All scripts and models used in this study can be found at https://git.wur.nl/ssb/publications/ecoli-synthetic-formatotrophy.
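The bound changes listed above can be summarised in a small sketch. It deliberately uses a plain dictionary of (lower, upper) bounds rather than the actual COBRApy model object. The default bound magnitude of 1000, the starting directionality assumed for GLYCL, ME2 and MOX, and the assumption that the forward direction of MOX is L-malate to oxaloacetate are all illustrative; only the reaction IDs and the two ATPM values come from the text. The added FDH reaction and the changed THD2pp proton stoichiometry are not shown, because they alter stoichiometry rather than bounds.

```python
BIG = 1000.0  # assumed default flux bound (mmol gCDW^-1 h^-1), for illustration only

# Hypothetical starting bounds as (lower, upper); these are NOT read from iJO1366.
bounds = {
    "GLYCL": (0.0, BIG),  # glycine cleavage system, assumed irreversible at the start
    "ATPM": (3.15, BIG),  # ATP maintenance, original lower bound as stated in the text
    "ME2": (0.0, BIG),    # NADP-dependent malic enzyme, assumed irreversible at the start
    "MOX": (-BIG, BIG),   # malate oxidase, assumed reversible at the start
}


def apply_modifications(original):
    """Return a modified copy of the bounds; the input dictionary is left untouched."""
    out = dict(original)
    out["GLYCL"] = (-BIG, out["GLYCL"][1])  # make reversible (reductive direction allowed)
    out["ATPM"] = (6.86, out["ATPM"][1])    # raise maintenance to the value used for iML1515
    out["ME2"] = (-BIG, out["ME2"][1])      # make reversible
    out["MOX"] = (0.0, out["MOX"][1])       # forward only (assumed: L-malate -> oxaloacetate)
    return out


modified = apply_modifications(bounds)
```

In a real workflow the same changes would be made on the reaction objects of the loaded model; the dictionary form is used here only to keep the sketch self-contained.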
2.4.2. pFBA and flux sampling
For each growth condition the optimal value of the selected objective function (either growth or production of target metabolites) was predicted using the flux balance analysis (FBA) tool from the COnstraint-Based Reconstruction and Analysis package for Python (COBRApy) [30].
Carbon source uptake rates for both growth on formate and growth on glucose were fixed to correspond to the actual growth rate of the fastest growing rGlyP strain K4e2 on formate (namely ∼0.12 h−1). To simulate formatotrophic growth, formate uptake rate was set to 22.5 mmol gCDW−1 h−1 and glucose uptake rate to zero. Growth on glucose was simulated with a glucose uptake rate of 1.5 mmol gCDW−1 h−1 and formate uptake rate of zero.
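The relation between the doubling time, the specific growth rate and the uptake rate quoted above can be checked with a short calculation. This is a minimal sketch using the standard relation μ = ln 2 / DT; the function names are hypothetical, and the yield computed from the rounded values need not match the model's reported optimum exactly.

```python
import math


def growth_rate_from_doubling_time(doubling_time_h):
    """Specific growth rate (h^-1) for exponential growth with the given doubling time."""
    return math.log(2) / doubling_time_h


def biomass_yield(mu_per_h, uptake_mmol_per_gcdw_h):
    """Biomass yield in gCDW per mol substrate, from growth rate and uptake rate."""
    uptake_mol_per_gcdw_h = uptake_mmol_per_gcdw_h / 1000.0
    return mu_per_h / uptake_mol_per_gcdw_h


# A 6 h doubling time corresponds to roughly 0.12 h^-1.
mu = growth_rate_from_doubling_time(6.0)

# Yield implied by the rounded growth rate and the formate uptake rate from the text.
implied_yield = biomass_yield(0.12, 22.5)
```

With these inputs the growth rate is about 0.116 h−1 and the implied yield is about 5.3 gCDW mol−1 formate.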
The constraint-based model was adapted to run parsimonious flux balance analysis (pFBA) [31]. For this adaptation, reversible reactions were split into two reactions, forward and reverse. In addition, a ‘flux’ metabolite was added as a product to each reaction and a sink reaction was introduced for this flux metabolite. The FBA objective was set to minimize the flux through the flux metabolite sink. This gave the minimal amount of total flux the model needed to still produce the optimal growth rate. Then, the flux through the flux metabolite sink was fixed to this minimal value, allowing a small deviation of 0.01 mmol h−1. After this, flux sampling was performed: 50,000 samples were taken for each substrate using the OptGP method [32].
From the flux samples the following statistics were calculated for each reaction: average, standard deviation and the Shapiro–Wilk test statistic for normal distribution [33]. Prior to the calculations, fluxes through reverse reactions were subtracted from fluxes through the corresponding forward reaction. For the relevant comparisons, the Kolmogorov–Smirnov test statistic and the log2 fold change were calculated using the Python SciPy library.
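The reformulation described above can be illustrated on a toy network. The sketch below is plain Python and does not use COBRApy; the data layout (a dictionary mapping reaction names to a stoichiometry dictionary and bounds), the function name `split_and_tag`, the metabolite ID `flux_c` and the simplified toy stoichiometries are all hypothetical. Because every split reaction produces exactly one unit of the flux metabolite, the steady-state flux through the sink equals the sum of all reaction fluxes, so minimising the sink flux minimises total flux.

```python
INF = float("inf")


def split_and_tag(reactions, flux_met="flux_c"):
    """Split reversible reactions and add a 'flux' metabolite to every reaction.

    reactions maps name -> (stoichiometry dict, lower bound, upper bound).
    Returns a new dict containing only non-negative-flux reactions plus a sink.
    Reactions with both bounds at zero carry no flux and are dropped.
    """
    out = {}
    for name, (stoich, lb, ub) in reactions.items():
        if ub > 0:  # forward part keeps the original stoichiometry
            fwd = dict(stoich)
            fwd[flux_met] = 1.0
            out[name + "_fwd"] = (fwd, max(lb, 0.0), ub)
        if lb < 0:  # reverse part uses the negated stoichiometry
            rev = {met: -coef for met, coef in stoich.items()}
            rev[flux_met] = 1.0
            out[name + "_rev"] = (rev, max(-ub, 0.0), -lb)
    # Sink that drains the flux metabolite; its flux equals the total flux.
    out["flux_sink"] = ({flux_met: -1.0}, 0.0, INF)
    return out


# Toy example with simplified, hypothetical stoichiometries.
toy = {
    "FDH": ({"for_c": -1, "nad_c": -1, "co2_c": 1, "nadh_c": 1}, 0.0, 1000.0),
    "GLYCL": ({"gly_c": -1, "mlthf_c": 1}, -1000.0, 1000.0),
}
tagged = split_and_tag(toy)
```

Here the irreversible FDH yields one tagged reaction, while the reversible GLYCL yields a forward and a reverse reaction, each bounded between 0 and 1000.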
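The post-processing of the flux samples can be sketched as follows. This is an illustrative standard-library version, not the authors' script: the function names are hypothetical, the sample standard deviation is assumed, the log2 fold change is assumed to be taken on absolute mean fluxes, and the two-sample Kolmogorov–Smirnov statistic is written out by hand instead of calling SciPy. The Shapiro–Wilk statistic is omitted.

```python
import bisect
import math
import statistics


def net_flux(forward, reverse):
    """Net flux per sample: forward flux minus the matching reverse flux."""
    return [f - r for f, r in zip(forward, reverse)]


def summarize(samples):
    """Mean and sample standard deviation of a list of sampled fluxes."""
    return statistics.mean(samples), statistics.stdev(samples)


def log2_fold_change(samples_a, samples_b):
    """log2 ratio of absolute mean fluxes (assumed definition)."""
    return math.log2(abs(statistics.mean(samples_a)) / abs(statistics.mean(samples_b)))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect.bisect_right(sa, x) / len(sa)
        fb = bisect.bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


# Tiny made-up samples, for illustration only.
net = net_flux([5, 6, 7], [1, 1, 1])
mean_flux, sd_flux = summarize(net)
```

For a full analysis the SciPy routines named in the text would be the natural choice; the hand-written versions only make the definitions explicit.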
2.4.3. Calculating the formate cost of biomass precursors and energy cofactors
The efficiency of various pathways for the synthesis of biomass precursors was calculated by adding sink reactions for relevant metabolites (in the cytoplasm) and running FBA setting the flux through the sink reaction as maximization objective. Pathway efficiencies were tested by eliminating competing reactions as indicated. For a fair comparison of different pathways, NADH was by default produced via NADH-regenerating formate dehydrogenase, NADPH via NAD(P) transhydrogenase and ATP via NADH dehydrogenase (ubiquinone-8 & 3 protons), cytochrome oxidase bo3 (ubiquinol-8: 4 protons) and ATP synthase (four protons for one ATP), as the full model predicts that this is the way most of these energy cofactors are made when growing on formate.
To estimate the costs to generate ATP, NADH, or NADPH, conversion reactions were introduced that convert each of them to its low-energy form (ADP, NAD+, and NADP+, respectively). The corresponding reaction was set as the FBA maximization objective in each case. Dividing the maximal flux by the formate input flux gives the amount of energy carrier regenerated per formate; its reciprocal is the cost in mol formate per mol energy carrier.
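The cost calculation can be sketched as follows, expressing the cost in mol formate per mol energy carrier as in Fig. 4A. The function name is hypothetical, and the maximal regeneration flux of 19.74 mmol gCDW−1 h−1 is an illustrative value back-calculated from the 1.14 formate per NADH reported for PsFDH, not a number taken from the model output.

```python
def formate_cost(formate_uptake, max_regeneration_flux):
    """Formate consumed per energy carrier regenerated (mol/mol).

    Both arguments must be in the same units, e.g. mmol gCDW^-1 h^-1.
    """
    if max_regeneration_flux <= 0:
        raise ValueError("regeneration flux must be positive")
    return formate_uptake / max_regeneration_flux


# Illustrative input: 22.5 formate taken up, 19.74 NADH regenerated (hypothetical value).
nadh_cost = formate_cost(22.5, 19.74)
```

With these inputs the cost is about 1.14 mol formate per mol NADH.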
In addition, formate costs per metabolite/energy precursor and predicted losses in biomass for less efficient pathway variants were computed by setting maintenance requirements to zero. When calculating the cost of individual metabolites, we confirmed that the given metabolite had a net cost of ATP, NADH, and NADPH. While growth on formate virtually always resulted in a net cost of these energy carriers, growth on glucose for instance could for some metabolites result in a net yield of one energy carrier and a net cost of another, leading to conversions of energy carriers that would not be necessary (to the same degree) in full biomass production.
3. Results & discussion
We investigated three E. coli strains (K4, K4e, and K4e2), which were engineered and evolved to grow on formate through the synthetic rGlyP [16,18]. We aimed to characterize the metabolic network of these strains and identify targets for future optimization, through a combination of omics-based data collection and metabolic modelling.
The main product of the rGlyP is pyruvate, which can then be converted through central carbon metabolism further into 11 other biomass precursors and from there to the rest of biomass [34]. We have predicted the most efficient metabolic routes for production of these main precursors from formate, using a genome-scale model of E. coli metabolism adapted from Orth et al. [26] to include the rGlyP pathway. We also compared the predicted metabolic fluxes for optimal growth on formate for E. coli via the rGlyP with predicted fluxes for growth on its natural substrate glucose (Fig. 2). In parallel, we analyzed the proteome of E. coli strains K4, K4e, and K4e2 during growth on formate, and compared these to previously published proteomes for E. coli grown on glucose at a controlled growth rate, similar to the growth of the fastest formatotrophic strain K4e2 on formate (∼6 h doubling time) (Fig. 3). We compare protein allocation and the optimal flux network for the growth on formate, with those of the ancestor strain when growing on glucose, assuming that E. coli is better optimized during its long evolutionary history for the natural substrate. From this comparison we aim to identify enzymes and related reactions that are possibly not optimally adapted yet for growth on formate.
Fig. 2.
Average predicted fluxes using the metabolic model for growth on formate and on glucose at the same growth rate (∼6 h doubling time). All fluxes are normalized to a formate uptake flux of 100. Blue arrows denote reactions that, based on the model, carry a higher flux for formatotrophic growth than for growth on glucose, while orange arrows indicate reactions with a higher flux on glucose. Reactions with both blue and orange arrows have a reversed flux direction between the two conditions. Black arrows indicate fluxes that are unchanged across the conditions. Light grey boxes indicate the 12 typical biomass precursors. Dashed lines indicate simplified multiple-enzyme routes.
Fig. 3.
Protein abundance fold changes for formatotrophically grown E. coli K4e2 versus E. coli grown on glucose. Green arrows denote that protein abundance is higher on formate than on glucose; red arrows denote the opposite (higher on glucose than on formate). Black arrows indicate either no difference, or that a protein was not detected. Arrow thickness indicates the degree of fold change (a thicker arrow means much more abundant on formate than on glucose; a thinner arrow means much less abundant on formate than on glucose). ∗Not all subunits of the enzyme complex were detected. ∗∗Enzyme complex consisting of: AceEF, Lpd (5.7, 1.6, 2.1). ∗∗∗ TktB arrow is not shown separately.
For formatotrophic growth of E. coli, the metabolic model predicts a maximum yield of 5.4 gCDW mol−1 formate, which is much higher than the highest achieved experimental yield (3.3 gCDW mol−1 formate). However, predicted yields are highly dependent on the non-growth-associated ATP maintenance (6.86 mmol gCDW−1 h−1 assumed in the model), which may change in different growth conditions (predicted yield without maintenance cost is 6.54 gCDW mol−1 formate). Nevertheless, experimental yields of ∼4 gCDW mol−1 formate have been reported for some formatotrophs using the CBB cycle, which is less energetically efficient than the rGlyP [21,35,36]. Furthermore, recently a yield of 4.5 gCDW mol−1 formate was achieved through the rGlyP engineered in C. necator [23]. This yield gap between these other formatotrophs and the rGlyP in E. coli indicates that there are potentially energy losses or stoichiometrically less efficient routes occurring in the formatotrophic metabolism of the currently available formatotrophic E. coli strains. The yield gap may also be partly related to the still suboptimal growth rate, as increasing growth rate normally also increases yield [37]. The fastest CBB cycle formatotrophs can typically grow slightly faster (∼3.5–4 h doubling time) [35,36,38] than the fastest growth via the rGlyP so far (6 h).
Here, we discuss key findings from the proteomics and modelling comparison for growth on formate and glucose. We discuss possible routes and the predicted optimal routes for the regeneration of energy cofactors and the production of biomass precursors from formate. These routes are analyzed using proteomics data, which provide an estimate for the abundance of the enzymes involved in these routes and hence a proxy for the flux through these routes.
3.1. NADH regeneration by formate dehydrogenase is optimal
During growth on formate, energy metabolism widely differs from growth on glucose because formate is a much more oxidized, single-carbon substrate. The most efficient and direct way to derive energy from formate is through its oxidation to CO2 by heterologously introduced formate dehydrogenase (PsFDH), which regenerates NADH. In turn, NADH powers the regeneration of other energy carriers, such as NADPH and ATP.
For an optimal biomass yield, the model predicts that 78 % of formate should be oxidized by PsFDH to CO2 for energy production, reflecting the high ratio of formate oxidation to formate assimilation into biomass. This study focusses on biomass as a product, which has a relatively high energy requirement and hence a large fraction of formate oxidized. However, also for bioproducts a substantial share has to be oxidized. For example, for the production of key product precursor pyruvate the model predicts that 6.57 mol formate is required per mol pyruvate, of which 3.57 mol is net released as CO2 from formate.
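The 3.57 mol of CO2 quoted for pyruvate follows from a simple carbon balance, since pyruvate contains three carbon atoms. The sketch below assumes that formate is the sole net carbon input, so every formate carbon not ending up in the product leaves as CO2; the function name is hypothetical.

```python
def net_co2_released(formate_per_product, carbons_in_product):
    """Mol CO2 net released per mol product, assuming formate is the only net carbon input."""
    return formate_per_product - carbons_in_product


# Pyruvate (C3) at a predicted cost of 6.57 mol formate per mol pyruvate.
co2_per_pyruvate = net_co2_released(6.57, 3)

# Share of the formate carbon that is oxidized to CO2 instead of assimilated.
oxidized_fraction = co2_per_pyruvate / 6.57
```

For pyruvate this gives about 3.57 mol CO2 per mol product, meaning that roughly 54 % of the formate is oxidized, compared with the 78 % predicted for biomass.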
Currently there is still a substantial gap between the theoretical and experimental biomass yield. Although a small amount of acetate is still excreted as a by-product by the best-performing formatotrophic E. coli strain, most of the yield gap between that strain and the predicted optimal yield is likely dissipated as CO2 through the use of inefficient metabolic routes for energy and biomass production. For instance, as an alternative to the most efficient NADH regeneration route via PsFDH, NADH could be regenerated in the oxidative TCA cycle, which costs more than twice as much formate per NADH regenerated (Fig. 4A). The model predicts that NADH regeneration via the TCA cycle would result in a 32 % loss in biomass yield compared to the predicted optimal yield using PsFDH.
Fig. 4.
(A) The cost of energy carriers NADH, NADPH, and ATP through various regeneration pathways, in mol formate per mol energy carrier. (B–E) Metabolic maps of the five most theoretically efficient NADPH regeneration routes in formatotrophic E. coli strains analyzed by metabolic modelling. Not all reaction components are depicted.
While the full oxidative TCA cycle does not provide a net biomass yield since it oxidizes the carbon that is fed into it to CO2, its activity can be inferred by analyzing the biomass produced from TCA cycle intermediates. To this end, Kim et al. [16] performed labelling experiments using 13C-labelled formate and/or CO2 and measured the abundance of 13C in several amino acids produced by K4e. In their experiments, threonine (which is produced from oxaloacetate) would be differently labelled depending on whether oxaloacetate was produced through the anaplerotic reactions from pyruvate, or through the TCA cycle via acetyl-CoA. They found that while most threonine was produced through anaplerotic routes, approximately 23 % still appeared to come from acetyl-CoA, indicating that the oxidative TCA cycle was still active to some extent. In contrast, Claassens et al. [45] performed similar 13C-labelling experiments with the rGlyP in C. necator, showing that only 12 % of threonine came via acetyl-CoA, which may contribute to the higher biomass yield of their engineered C. necator strain (2.6 gCDW mol−1 formate) compared to E. coli K4e (2.3 gCDW mol−1 formate). Furthermore, proteomic analysis of K4e also indicates potential oxidative activity of the TCA cycle, as will be further discussed in section 3.4. These results, and the predicted 32 % lower yield when using the TCA cycle for NADH generation, suggest that this can be one of the key wasteful processes in the formatotrophic strains.
Note that the cost of regenerating NADH from formate includes the cost of importing formate into the cytoplasm. The mechanism and energetics of formate import into the cytoplasm are not completely understood, though previous studies have suggested that an additional proton is needed to import formate (CHOO−) as neutral formic acid (CHOOH), either through diffusion through the membrane or through the hydrophobic center of the FocA transporter located in the inner membrane [39]. Recently, Vanyan et al. [40] showed that in E. coli, import of exogenously added formate into the cytoplasm dissipated the proton motive force (PMF), further supporting the notion that formate is likely symported with one (or more) protons. Therefore, the formate import reaction in the model also imports one proton per imported formate. This is why, for instance, NADH regeneration by PsFDH costs 1.14 formate per NADH: the additional 0.14 reflects the energy cost of transport.
3.2. NADPH regeneration by membrane-bound transhydrogenase is optimal
The first part of the rGlyP oxidizes one NADPH to NADP+ for each assimilated formate, leading to a high demand for NADPH regeneration. We modelled the top five theoretically most efficient routes for NADPH regeneration that may be used by the formatotrophic E. coli strains during growth on formate (Fig. 4).
However, there is a potentially even more efficient route (not shown in figure) through a formate dehydrogenase that directly regenerates NADPH. For instance, the NAD+-specific PsFDH enzyme was previously engineered to achieve high kinetic efficiency and specificity for NADPH regeneration. While the strains analyzed in this study do not possess an NADPH-regenerating FDH and thus are unable to perform this route, introducing such an enzyme would be a promising avenue for future optimization of formatotrophic growth.
The most efficient route currently possible in the formatotrophic strains utilizes membrane-bound transhydrogenase (PntAB) to transhydrogenate NADH into NADPH at the cost of one proton translocated into the cell (Fig. 4 B). Kim et al. (2020) found mutations in the promoter of the pntAB operon, as well as in the 5′UTR of the gene encoding PsFDH in the K4e strain, which appeared to increase transcript levels for both. Our proteomics findings also show that the abundance of both these proteins increased; PntAB appeared to be ∼11–13-fold higher in the K4e2 strain on formate than in the glucose reference (Fig. 3).
The second most efficient route for NADPH regeneration predicted by the model is a cycle that converts NADH to NADPH via acetaldehyde (Fig. 4C). This cycle starts with the NADH-dependent reduction of acetyl-CoA to acetaldehyde, followed by oxidation of acetaldehyde to acetate coupled to NADPH regeneration; these steps are catalyzed by two acetaldehyde dehydrogenases (AdhE and AldB, respectively). Next, acetate is converted back to acetyl-CoA (costing the equivalent of 2 ATP) by acetyl-CoA synthetase (Acs). This route costs an additional 0.42 mol formate per mol NADPH regenerated compared to the PntAB route (Fig. 4 A). We observed increased proteomic abundance of AdhE and Acs on formate compared to glucose, but decreased levels of AldB (Fig. 3). Since AldB catalyzes the NADPH-regenerating reaction, it appears that this route is not being used for NADPH regeneration. According to the model prediction, AdhE should carry no flux (Fig. 2), and thus may be a suitable target for knock-out.
The next two routes for NADPH regeneration use glucose-6-phosphate dehydrogenase (Zwf) in the (oxidative) Pentose Phosphate Pathway (PPP) to produce gluconate-6-phosphate (Fig. 4 D). Gluconate-6P can either be oxidized to ribulose-5-phosphate to regenerate a second NADPH by 6-phosphogluconate dehydrogenase (Gnd) or it can be dehydrated to 2-dehydro-3-deoxy-d-gluconate-6-phosphate by phosphogluconate dehydratase (Edd). The Zwf-Gnd route and Zwf-Edd route (the latter known as the Entner-Doudoroff pathway) respectively use 0.57 mol and 1.57 mol more formate than the PntAB route (Fig. 4 A). If, for example, all NADPH during formatotrophic growth were generated through the Zwf-Gnd route, this would lead to a 15 % predicted loss in biomass yield, a substantial reduction. Hence the model predicts that no flux should go through these routes during growth on formate. However, during growth on glucose, Zwf-Gnd is one of the key NADPH regeneration routes, as also observed in the model simulations; hence this 'natural NADPH-regeneration' route may still be active in E. coli during growth on formate (Fig. 2).
Strikingly, proteomics indicated a higher abundance of Zwf on formate than on glucose (Fig. 3). Notably, the gene encoding Zwf is also known as one of the most consistently flux-coupled genes in E. coli [41]. Zwf is regulated by the SoxRS regulon and this regulatory mechanism is controlled by NADPH levels. Low NADPH levels during rGlyP on formate may be a possible explanation for the upregulation of Zwf [42]. Given that flux through Zwf during formatotrophic growth may lead to a substantial reduction in yield, Zwf appears to be a key candidate for downregulation or even knock-out.
The final NADPH regeneration pathway discussed here is a pyruvate-malate cycle that uses malic enzyme B (MaeB) to decarboxylate malate to pyruvate. Pyruvate is then converted back to malate via phosphoenolpyruvate (PEP) and oxaloacetate, catalyzed by PEP synthase (Pps), PEP carboxylase (Ppc), and malate dehydrogenase (Mdh) (Fig. 4 E). This cycle costs one formate more for NADPH regeneration than the PntAB route (Fig. 4 A), and requires MaeB to operate in its malate-decarboxylating direction. However, it is uncertain in which direction MaeB operates in vivo during growth on formate, since the conditions during formatotrophic growth (including elevated CO2) may push this reversible reaction more towards carboxylation. MaeB levels appear to be higher on formate than on glucose, which could mean that this enzyme is regenerating NADPH, if it does operate in the decarboxylating direction. Alternatively, if MaeB is operating in the carboxylating direction, it may be used for biomass formation, which is further discussed below.
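The relative costs quoted in this section for the five NADPH regeneration routes can be collected and ranked in a few lines. The Python sketch below only restates the numbers given in the text (extra mol formate per mol NADPH relative to the PntAB route); the dictionary name and route labels are illustrative choices, not identifiers from the model.

```python
# Extra formate cost per NADPH relative to the PntAB route, as quoted in the text.
EXTRA_FORMATE_PER_NADPH = {
    "PntAB (membrane-bound transhydrogenase)": 0.00,
    "AdhE-AldB-Acs acetaldehyde cycle": 0.42,
    "Zwf-Gnd (oxidative PPP)": 0.57,
    "Zwf-Edd (Entner-Doudoroff)": 1.57,
    "MaeB pyruvate-malate cycle": 1.00,
}

# Sort routes from cheapest to most expensive in formate terms.
ranked = sorted(EXTRA_FORMATE_PER_NADPH.items(), key=lambda item: item[1])

for route, extra in ranked:
    print(f"{route:<42s} +{extra:.2f} mol formate per mol NADPH")
```

Sorted this way, the pyruvate-malate cycle falls between the Zwf-Gnd and Zwf-Edd routes, although it is discussed last in the text.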
3.3. ATP production by NADH dehydrogenase I and cytochrome bo3 is optimal
There are various routes for generating the proton motive force (PMF) required for the regeneration of ATP by ATP synthase, and for other PMF-driven reactions (e.g. NADPH regeneration by membrane-bound transhydrogenase, formate import, etc.). The efficiency of these routes was compared in terms of formate cost per ATP regenerated. By far the stoichiometrically most efficient route for ATP regeneration on formate uses NADH dehydrogenase I (Nuo complex) combined with cytochrome bo3 (Cyo complex) to oxidize NADH and pump protons out of the cytosol (Fig. 4 A). Some of the other, less efficient options use combinations with other dehydrogenases and oxidases, namely: NADH dehydrogenase II (Ndh), cytochrome bd-I/bd-II (Cyd complex/App complex), and formate dehydrogenase N/O (Fdn complex/Fdo complex). Formate dehydrogenase N and O are native membrane proteins that reduce menaquinone, not to be confused with the heterologous NAD+-reducing PsFDH.
The model predicted that flux through the Nuo and Cyo complexes should be 1.9 and 1.6-fold higher on formate than on glucose respectively, and that the other routes for PMF generation should carry no flux for optimal biomass yield on formate. The difference in ATP requirement is because during the breakdown of glucose to major biomass precursors there is net ATP production, unlike for formate assimilation, which consumes ATP.
However, proteomics showed that NuoBCEFGI had on average 4-fold higher levels in E. coli on glucose than in K4e2 on formate, while the other subunits (NuoAHJKLMN) were not detected in the proteomic analyses (Supplementary Table 3). CyoA was approximately 14-fold higher in the glucose condition than on formate, but CyoBCD were not detected either. The seemingly lower abundances of Nuo and Cyo on formate contradict what the model predicts as optimal, and these complexes may thus be suitable targets for upregulation. However, it may also be that in this case protein abundance does not reflect flux, due to, for instance, differences in substrate or product concentrations (e.g. NADH:NAD ratio) or post-translational modifications. Furthermore, too high activity of these respiratory chain enzymes may lower the NADH:NAD ratio too much, which would likely be unfavorable for the thermodynamic feasibility of glycine production by the glycine cleavage system in the rGlyP. Thus, fine-tuning the expression of the Nuo and Cyo subunits may be needed.
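The mismatch between predicted flux and measured abundance for Nuo and Cyo can be put on a common log2 scale, as in the minimal Python sketch below. It uses only the fold changes quoted in the text; the "gap" score itself is a hypothetical heuristic introduced here for illustration and is not a method used in the study. As noted above, abundance need not reflect flux, so such a score can only flag candidates for closer inspection.

```python
import math

# (predicted flux fold change, measured abundance fold change), both formate/glucose.
# Values are taken from the text; abundance ratios are inverted because the text
# reports them as glucose/formate (4-fold for NuoBCEFGI, 14-fold for CyoA).
observations = {
    "Nuo (NuoBCEFGI average)": (1.9, 1 / 4),
    "Cyo (CyoA only)": (1.6, 1 / 14),
}


def log2_gap(predicted_fold, measured_fold):
    """Hypothetical score: how far abundance lags behind predicted flux, in log2 units."""
    return math.log2(predicted_fold) - math.log2(measured_fold)


gaps = {name: log2_gap(*folds) for name, folds in observations.items()}

for name, gap in gaps.items():
    print(f"{name}: log2 gap = {gap:.2f}")
```

A positive gap means the model expects more flux on formate while the detected subunits are less abundant, which is the pattern that motivates considering Nuo and Cyo as upregulation targets.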
Many of the enzymes involved in the less efficient routes for ATP regeneration were not detected or showed contradicting trends for their different subunits. Therefore, it is unclear whether the other involved enzymes require downregulation or knock-out.
3.4. Oxidative TCA cycle may be overactive
During aerobic growth on glucose and all other natural substrates, the TCA cycle oxidizes acetyl-CoA to regenerate the energy cofactors NADH and NADPH. However, energy generation during growth on formate is most efficiently done directly by PsFDH, while operation of the full TCA cycle is relatively inefficient as mentioned previously. During growth on formate, the TCA cycle is therefore only needed to produce biomass precursors.
The model thus predicts lower flux through almost all TCA cycle enzymes on formate compared to glucose (Fig. 2). The TCA cycle section between oxaloacetate and alpha-ketoglutarate still carries flux to fulfil the demand of the precursor alpha-ketoglutarate, which requires about 10 times less flux than for growth on glucose. The section between alpha-ketoglutarate and fumarate is predicted to carry no flux in the oxidative direction during growth on formate. Succinyl-CoA is the remaining (minor) biomass precursor in the TCA cycle. A small amount is needed as a CoA donor for the biosynthesis of the amino acids lysine and methionine. This amino acid synthesis releases succinate again, which can be recycled back to succinyl-CoA in the TCA cycle via succinyl-CoA synthetase (SucCD). Hence, only very low fluxes towards succinate or succinyl-CoA are needed for biosynthesis and to sustain the succinate/succinyl-CoA pool in the cells. The model suggests this flux could come from the reductive direction via fumarate, through fumarate reductase. However, in E. coli, fumarate reductase is usually only active under anaerobic conditions.
In the last section of the TCA cycle, malate dehydrogenase (Mdh) carries substantial flux. This is related to its involvement in the stoichiometrically most efficient route to produce oxaloacetate and PEP, as will be explained below.
Contrary to the expectations based on optimal flux distribution for formatotrophic growth, proteomics indicate that several TCA cycle enzymes have higher abundances during growth on formate than during growth on glucose (e.g. citrate synthase: GltA, aconitase: AcnA, and alpha-ketoglutarate dehydrogenase: SucAB) (Fig. 3).
Though it is unknown whether these increased levels of GltA, AcnA, and SucAB also lead to increased flux, SucA has been identified as a flux-coupled gene (meaning the qualitative change in abundance was closely linked to the qualitative change in flux) [41]. Since SucAB should carry no flux according to our model prediction, the genes encoding this enzyme complex (sucA-sucB) could be a key target for knock-out (or knock-down to allow some succinyl-CoA biosynthesis) to decrease wasteful TCA cycle flux and improve formatotrophic growth efficiency.
Citrate synthase (GltA) is known to be subject to catabolite repression during growth on glucose, which may explain why its expression is higher on formate than on glucose. Since flux through GltA should be approximately 10-fold lower on formate than on glucose according to the model, the gene encoding GltA is another key target for downregulation to potentially limit TCA cycle related losses. Expression tuning of AcnA may be less important if GltA-tuning can limit the flux into the TCA cycle.
3.5. Carboxylating malic enzyme for oxaloacetate and PEP synthesis on formate
Oxaloacetate and alpha-ketoglutarate are two major precursors for the production of amino acids and nucleotides, and respectively support roughly 17 % and 12 % of biomass production [34,43]. Alpha-ketoglutarate is generated from oxaloacetate and acetyl-CoA through a part of the TCA cycle. However, there are various routes to produce oxaloacetate (Fig. 5 A).
Fig. 5.
(A–C) Metabolic maps of the three oxaloacetate production routes analyzed by metabolic modelling. Pyruvate is produced via the rGlyP as depicted in Fig. 1. Not all reaction components are depicted. (D) The cost of oxaloacetate biosynthesis through various pathways, in mol formate per mol oxaloacetate.
For growth on glucose, the model predicts that the most efficient route for production of oxaloacetate is through glycolysis until phosphoenolpyruvate (PEP), which can then be converted directly to oxaloacetate by PEP carboxylase (Ppc). For growth on formate, oxaloacetate can also be produced from PEP, which could be generated from pyruvate coming out of the rGlyP via PEP synthase (PpsA) (Fig. 5 A). However, the model predicts that oxaloacetate would be generated through a more efficient route, via NADPH-dependent reduction and carboxylation of pyruvate into malate via malic enzyme MaeB (Fig. 5 B). In typical E. coli models, this reaction is assumed to proceed in the reverse, oxidative direction. However, as mentioned previously, this thermodynamically reversible reaction could be functioning in the reductive direction under the conditions for formatotrophic growth, including high CO2 concentration and possibly elevated NADPH:NADP+ (ΔrG'm at concentrations of 1 mM for all reactants is 5 kJ/mol; ΔrG' at 0.1 atm CO2 and 10:1 NADPH:NADP is −3.4 kJ/mol).
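The dependence of the reaction energy on concentrations follows the standard relation ΔrG' = ΔrG'm + RT ln(Q/Q_1mM). The Python sketch below applies it to the carboxylating direction (pyruvate + CO2 + NADPH → malate + NADP+), starting from the ΔrG'm of 5 kJ/mol quoted in the text. The temperature of 298.15 K, the function names, and the treatment of dissolved CO2 as a free parameter are assumptions of this sketch; the text does not state the dissolved CO2 concentration corresponding to 0.1 atm, so only the contribution of the 10:1 NADPH:NADP+ ratio is evaluated here.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def q_carboxylation(nadph_to_nadp=1.0, co2_mM=1.0, malate_to_pyruvate=1.0):
    """Reaction quotient relative to the all-1-mM reference state."""
    # Products (malate, NADP+) in the numerator, substrates in the denominator.
    return malate_to_pyruvate / (nadph_to_nadp * co2_mM)


def delta_g_prime(dg_m, q_relative, temperature_k=298.15):
    """Shift the 1 mM reference energy dg_m (kJ/mol) to other concentrations."""
    return dg_m + R_KJ_PER_MOL_K * temperature_k * math.log(q_relative)


DG_M = 5.0  # kJ/mol at 1 mM for all reactants, as quoted in the text

# Effect of a 10:1 NADPH:NADP+ ratio alone (CO2 kept at the 1 mM reference).
dg_ratio_only = delta_g_prime(DG_M, q_carboxylation(nadph_to_nadp=10.0))
print(round(dg_ratio_only, 1))  # -0.7
```

Under these assumptions the cofactor ratio alone brings the reaction close to equilibrium; any dissolved CO2 concentration above 1 mM, passed through `co2_mM`, lowers the value further, in the direction of the −3.4 kJ/mol quoted in the text.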
Based on the model predictions, the route via PEP synthase costs one additional mol of formate per mol oxaloacetate compared to the route via carboxylating malic enzyme activity (Fig. 5 D) and would therefore lead to a 2.8 % predicted loss in biomass yield.
For growth on formate, the route via carboxylating activity of malic enzyme to oxaloacetate is also the most efficient starting route to generate PEP from pyruvate and further precursors that are synthesized via gluconeogenesis from PEP. Hence, the model predicts for growth on formate that all flux towards PEP proceeds from oxaloacetate through PEP carboxykinase (PckA). Relatedly, the model predicts no flux via PEP synthase (PpsA) and PEP carboxylase (Ppc) (Fig. 2). However, analysis of the proteome showed increased levels of the latter two enzymes in the E. coli strains during growth on formate, while the NADPH-dependent enzyme MaeB is also increased (Fig. 3). Thus, downregulation of PpsA and Ppc and/or further upregulation of MaeB could potentially increase the yield on formate. For the latter we assume that the reaction through MaeB can indeed proceed fast enough in the carboxylating direction and that the metabolic flux is not yet fully going through that enzyme in the evolved strains. In this case, however, there may be a trade-off between efficiency and rate, where the MaeB route could be more stoichiometrically efficient but slower than the more thermodynamically favorable PEP route, potentially limiting growth rate.
3.6. Glyoxylate shunt is inefficient for biomass production on formate
Another possible route for the production of oxaloacetate from pyruvate is the glyoxylate shunt (Fig. 5C), which can produce oxaloacetate with an additional cost of two mol formate per mol oxaloacetate compared to the MaeB route (Fig. 5 D). The glyoxylate shunt is catalyzed by isocitrate lyase (AceA) and malate synthase (AceB/GlcB). The gene encoding AceA has been knocked out during construction of the formatotrophic E. coli by Kim et al. [16], and has been restored again in the K4 and K4e strains characterized in this study (for further engineering purposes), but not in K4e2. Thus, since K4e2 lacks AceA, the glyoxylate shunt cannot be used in this strain. However, somewhat surprisingly, proteomics indicate that AceB has a 13-fold higher level in K4e2 on formate than in the glucose reference (Fig. 3). Hence, AceB may be a suitable knock-out or downregulation target, to prevent unnecessary proteome allocation to this likely unused enzyme. However, we note that AceB may serve to assimilate glyoxylate, which could be a by-product of an inefficient bypass of the rGlyP. Some of the glycine generated in this pathway may be oxidized to glyoxylate [44], as was also previously observed during rGlyP operation in Cupriavidus necator [45]. In the slower growing K4 and K4e strains characterized in this study, where the gene encoding AceA has been reconstituted, proteomic detection of AceA is similar to the glucose reference, though since growth rates between these three conditions vary greatly, a direct comparison cannot be made.
3.7. Glycolysis naturally downregulated in favor of gluconeogenesis during growth on formate
In upper central carbon metabolism, several important biomass precursors are produced. During formatotrophic growth, these metabolites must be produced through gluconeogenesis and the PPP.
Glucose-6-phosphate and fructose-6-phosphate are required for the production of glycogen, lipopolysaccharides, and the cell wall. Glycerate-3-phosphate serves as a precursor for glycine, serine, cysteine, and one-carbon metabolites during glycolytic growth. However, the route to produce these compounds from glycerate-3-phosphate was previously knocked out during engineering of the rGlyP in E. coli. Since these compounds are also central intermediates in the rGlyP, they are most efficiently directly obtained from this pathway during formatotrophic growth.
As expected, we observe decreased abundance of most glycolytic enzymes during growth on formate, as no major flux is needed, unlike during growth on glucose when glycolysis is a major metabolic pathway (Fig. 3). Notable exceptions are enzymes that function specifically in gluconeogenesis and not in glycolysis: FbaB and Fbp. Thus, it appears that gluconeogenesis is already upregulated, though it is unknown whether this is to the optimal level. It could for instance be that gluconeogenesis is too strongly upregulated, as the gluconeogenic demand from the rGlyP is lower than the typical gluconeogenic demand for E. coli from other substrates, as compounds generated from glycerate-3-phosphate usually are supplied by the rGlyP directly.
3.8. Pentose phosphate pathway products partially made via inefficient Zwf and Gnd route
Erythrose-4-phosphate (E4P) and ribose-5-phosphate (R5P) are produced in the PPP and are needed in substantial quantities for synthesis of several amino acids and nucleotides. The most efficient pathway to produce E4P and R5P is direct rearrangement in the PPP via transketolase (Tkt) and transaldolase (Tal) (Fig. 6 A).
Fig. 6.
(A–C) Metabolic maps of the three erythrose-4-phosphate (E4P) and ribose-5-phosphate (R5P) production routes analyzed by metabolic modelling. Not all reaction components are depicted. (D) The cost of E4P and R5P biosynthesis through various pathways, in mol formate per mol E4P or R5P.
Alternative pathways to produce E4P and R5P are via oxidative variants of the PPP utilizing glucose-6-phosphate dehydrogenase (Zwf). The first oxidative PPP variant uses Zwf, as well as Tkt and Tal (Fig. 6 B), which costs an additional 1.2 and 0.6 mol formate per E4P and R5P, respectively, compared to the most efficient, non-oxidative PPP route (Fig. 6 D). Alternatively, E4P can also be produced via Zwf combined with phosphofructokinase (PfkA) and fructose-bisphosphate aldolase (FbaA) instead of Tal (Fig. 6C), which costs 1.8 mol formate more than via the PPP (Fig. 6 D).
Though the model predicts Tkt, Tal and Rpe should carry more flux on formate than on glucose (Fig. 2), of these only TktAB appears to be more abundant at proteomic level on formate (Fig. 3). TalAB abundance is similar in both conditions while Rpe appears to be downregulated on formate. Hence, E4P and R5P may be (partially) formed via the less efficient Zwf-Tal route, as Zwf is more abundant than on glucose, possibly due to the higher NADPH demand in formatotrophic growth. Thus, redirecting the route towards E4P and R5P by downregulating or knocking out Zwf, and possibly overexpressing TalAB and Rpe, may increase metabolic efficiency during growth on formate. However, since biomass production requires a relatively small amount of E4P and R5P (in contrast to NADPH requirements), the production of these precursors via the less efficient Zwf-Tal route only results in a 0.7 % predicted loss in biomass yield.
3.9. Improved growth of K4e due to improved NAD(P)H regeneration capacity and a potential reduction in flux at rGlyP pathway branchpoint
It has been shown by Kim et al. [16] that the formatotrophic growth improvement of K4e over K4 is largely caused by mutations in regulatory regions of the genes encoding PsFDH and PntAB, the enzymes involved in the most efficient routes for NADH and NADPH generation, respectively. For PntAB the mutation was found in the promoter region, which led to a ∼13-fold increase of transcript level as measured by Kim et al. [16]. This is corroborated by our finding that protein abundance of PntAB increased by 4.6 and 5.7-fold for PntA and PntB, respectively, for K4e compared to K4. The mutation for PsFDH was found at the start of the 5′UTR, which caused a 2.5-fold increase in transcript as detected by Kim et al. [16]. We measured a further 13-fold increase of PsFDH at proteome level, which may be caused by a combination of slightly higher transcript levels, as well as improved translation initiation from the mutated 5′UTR.
However, these mutations do not appear to explain the full extent of the K4e phenotype, as their reconstruction in the naive K4 strain did not yield the same growth rate as K4e.
We further analyzed the genome sequences and identified additional mutations in K4e in the coding sequences of the genes encoding 5,10-methylenetetrahydrofolate reductase (MetF), fatty acid biosynthesis regulator (FabR), and type II secretion system protein (GspL). The mutation in gspL is a silent mutation (G138G C to T). This gene is part of an operon that is transcriptionally repressed under standard laboratory conditions [46] and this protein was not detected in any of the proteomes we analyzed. The mutations in metF (A280V) and fabR (G42V) are amino acid substitutions.
MetF catalyzes the NADH-dependent reduction of 5,10-methylene-THF to 5-methyl-THF, which is part of a native E. coli route for methionine synthesis. However, since 5,10-methylene-THF is also an intermediate metabolite in the rGlyP and hence may be present in higher concentrations during growth on formate via the rGlyP, MetF expression/activity may need to be downregulated. Proteomics analysis indicates that MetF is indeed somewhat less abundant in K4e than in K4 (Fig. 7 A). In addition, the amino acid substitution may alter activity further, as the substituted amino acid is directly next to an active site residue involved in substrate binding.
Fig. 7.
Scatter plots showing abundance differences for proteins encoded by genes that have acquired mutations during ALE. X-axis indicates log2 fold change in protein abundance and y-axis indicates -log10 p-value. (A) K4e abundance/K4 abundance, for mutations first occurring in K4e. (B) K4e2 abundance/K4e abundance, for mutations first occurring in K4e2.
FabR is a transcriptional repressor for the two type II fatty acid synthase enzymes (FabA and FabB), thereby impacting membrane lipid homeostasis [47]. Proteomics analysis indicates that FabR is less abundant in K4e than in K4, and that FabA and FabB are slightly upregulated in K4e (Fig. 7 A). However, it is unclear whether this upregulation is beneficial for growth on formate and what the mechanistic explanation would be. As the reaction catalyzed by FabB consumes NADPH, it could be that this slight upregulation compensates for potentially lower NADPH levels during growth on formate. Alternatively, the slight upregulation of FabA and FabB could be related to different membrane composition requirements for formate transport or to the increased expression levels of the membrane-bound transhydrogenase PntAB.
3.10. Enhanced growth of K4e2 via reduced acetate production, potential Gcv increase, and further proteomic changes
The further growth improvement of K4e2 over K4e was attributed in part to a mutation in the 5′UTR of the ackA-pta operon, leading to decreased acetate production, as observed by Kim et al. [18]. Proteomics indeed showed a strong downregulation of AckA and Pta in K4e2 compared to K4e (Fig. 7 B). In addition, acetate reuptake using Acs also appears to be downregulated in K4e2 compared to K4e (Fig. 8C), though it is not as low as for growth on glucose at the same growth rate (at this growth rate glucose metabolism does not lead to acetate overflow and hence no uptake is required). However, the level of pyruvate oxidase (PoxB) is up in all rGlyP strains compared to glucose. PoxB is a membrane protein coupled to the electron transport chain and converts pyruvate to acetate, which may explain why there is still some production of acetate, even in the K4eΔackA-pta as Kim et al. [18] reported. Thus, the gene encoding PoxB is likely also a good target for knock-out.
Fig. 8.
Percentages of total quantified proteome for proteins involved in (A) the glycine cleavage system, (B) pyruvate dehydrogenase, (C) acetate metabolism, (D) NADPH regeneration.
Additionally, Kim et al. [18] identified amino acid substitutions in PdhR and RpoC in K4e2, which likely explain the remainder of the growth differences between K4e2 and K4e.
Since RpoC is a subunit of RNA polymerase involved in DNA binding, it can act as a global regulator of gene expression and is often found mutated in ALE experiments on E. coli MG1655 for a range of experimental conditions [48,49]. While proteomics indicates abundance of RpoC is barely changed in K4e2 compared to K4e (Fig. 7 B), the amino acid substitution likely influences RpoC activity. The exact RpoC mutation in K4e2 (A919V) has been reported previously by Szalewska-Palasz et al. [50] to bypass requirement of the alarmone guanosine tetraphosphate (ppGpp) for transcription from sigma-54 promoters. This signaling molecule is involved in the stringent response, which is activated during nutrient starvation. ppGpp has many targets, and is for instance involved in decreasing DNA replication and changes in transcriptional profile, inhibiting ribosome maturation and protein translation activity [51]. Possibly, the RpoC mutation could impact transcription levels from starvation sigma-54 promoters compared to the (maintenance associated) sigma-70 promoters. Proteomics showed an increase of proteins expressed from confirmed sigma-54 promoters in K4e2 compared to K4 and K4e, though this could be (partially) caused by differences in growth rate (Supplementary Fig. 2). It is known that some sigma-54 genes are associated with nitrogen fixation, though many are not [52]. However, even if the RpoC mutation influences transcription from sigma-54 promoters, it is unclear which (if any) of the impacted genes are contributing to the improved growth of K4e2.
PdhR regulates expression of an operon which encodes the components of the pyruvate dehydrogenase enzyme complex (Pdh), namely pyruvate dehydrogenase E1 component (AceE), pyruvate dehydrogenase E2 subunit (AceF), and lipoamide dehydrogenase (Lpd). We found that levels of PdhR, AceEF, and Lpd are higher in K4e2 compared to K4, K4e, and WT E. coli grown on glucose (Fig. 8 B), possibly as a result of the PdhR mutation in K4e2. Increased expression of Pdh may not be advantageous since the flux through Pdh should be about three times lower during growth on formate than on glucose, according to the predicted fluxes for optimal yield (Fig. 2). However, Lpd is also a component of the glycine cleavage system enzyme complex (Gcv), together with aminomethyltransferase (GcvT), glycine cleavage system H protein (GcvH), and glycine decarboxylase (GcvP). The GcvTHP operon was overexpressed by Kim et al. [16] in K4 to allow growth on formate as the Gcv catalyzes a key reaction in the rGlyP. They did not overexpress Lpd as it was not essential. As expected, proteomics showed increased levels of GcvTHP in the K4 strains on formate compared to WT E. coli on glucose (Fig. 8 A). We observe that Lpd levels are similar in K4 and K4e on formate, and slightly higher in those strains compared to WT E. coli on glucose, but approximately 2-fold higher in K4e2. The upregulation of Lpd in K4e2 may contribute to the strain's improved growth on formate by increasing Gcv levels. If this is the case, it may be even better to specifically upregulate Lpd, but not AceEF.
3.11. rGlyP strains show trend towards larger proteome fraction of central carbon metabolism
The large differences in growth rates between the K4, K4e, and K4e2 strains on formate likely both result from and cause differences in their overall proteome allocation (Fig. 9 A). A notable difference is that the rGlyP fraction of the proteome increased after ALE from approximately 10 % in K4, to 13 % in K4e, and 16 % in K4e2, mainly due to increasing abundances of PsFDH and the GCV system. PsFDH abundance increased strongly between K4 and K4e, from 0.3 % to 3.7 %, and is even slightly higher in K4e2 (4.3 %). This increase likely explains a large part of the improved growth between K4 and K4e, as this relatively slow enzyme (kcat ∼10 s−1) is responsible for all energy supply in the cell [10,17]. A further increase of the PsFDH level may boost growth even more, but may also burden a large part of the proteome. An alternative may be the use of faster metal-cofactor dependent FDH enzymes, such as the molybdenum-dependent formate dehydrogenase from the natural formatotroph C. necator. In another study, in which the rGlyP was implemented in C. necator, CnFdh was found to occupy only ∼1.6 % of the proteome while supporting doubling times of ∼10 h [23]. This FDH may also be functionally implemented in formatotrophic E. coli strains. Recently, Schulz et al. [53] demonstrated the functional expression of CnFdh in an E. coli NADH sensor strain, which was able to generate all NADH required for E. coli. Incorporation of a faster FDH such as CnFdh in K4e2 may be a promising approach to improve growth rate on formate, and an alternative to further increasing overexpression of PsFDH.
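The PsFDH proteome fractions quoted above can be converted into fold changes between successive strains, as in the short Python sketch below. It uses only the rounded percentages from the text; the variable names are illustrative.

```python
import math

# PsFDH share of the quantified proteome (%), as quoted in the text.
psfdh_percent = {"K4": 0.3, "K4e": 3.7, "K4e2": 4.3}

# Fold change between successive strains along the ALE lineage.
fold_k4_to_k4e = psfdh_percent["K4e"] / psfdh_percent["K4"]
fold_k4e_to_k4e2 = psfdh_percent["K4e2"] / psfdh_percent["K4e"]

print(f"K4 -> K4e:   {fold_k4_to_k4e:.1f}-fold (log2 = {math.log2(fold_k4_to_k4e):.2f})")
print(f"K4e -> K4e2: {fold_k4e_to_k4e2:.2f}-fold (log2 = {math.log2(fold_k4e_to_k4e2):.2f})")
```

Because the input percentages are rounded to one decimal, the roughly 12-fold increase from K4 to K4e computed here is only approximately comparable to the 13-fold proteome-level increase reported in section 3.9.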
Fig. 9.
(A) Percentages of total quantified proteome for proteins involved in central carbon metabolic pathways. A list of individual proteins included in each pathway is provided in Supplementary Table 2. GNG: gluconeogenesis/glycolysis. (B) Proteomaps displaying all proteins proportional to the size of their proteome fraction, and grouped based on cellular function (Supplementary Fig. 3). Proteomaps were generated using the tool developed by Liebermeister et al., Nocaj and Brandes, and Otto et al. [[61], [62], [63]].
The assimilation modules of the rGlyP (C1, C2 and C3) form a relatively stable part of the proteome throughout the different strains. Only between K4e and K4e2, the relative fraction of the C2 module substantially increased from 4.5 to 5.9 %. This is mostly related to the earlier discussed increase in the Lpd enzyme, as well as the GcvT and GcvH proteins that take part in the GCV system. While the kinetics of the GCV system are poorly characterized [54], considering the turnover numbers and the small thermodynamic driving force of this reaction, it is likely that GCV also serves as a bottleneck in the rGlyP pathway. Relatedly, its increased abundance may have contributed to the faster growth rate of K4e2 over K4e.
In total the rGlyP forms a high proportion of the proteome (up to 16 %), but during growth on glucose at a similar growth rate to K4e2, the main pathways for carbon assembly and energy generation (glycolysis and TCA cycle) also occupy 13 % of the proteome. In addition, several enzymes in the rGlyP have relatively slow kinetics [55]. Furthermore, the rGlyP reactions have to carry higher fluxes of the single-carbon substrate formate, than the fluxes required during the growth on six-carbon substrate glucose in glycolysis and TCA (Fig. 2). Thus, an open question remains whether improving the abundance of rGlyP enzymes in the pathway could further boost the growth rate, and which are the bottleneck enzymes in this pathway that would be most essential to further increase.
In the formatotrophic strains, especially in K4e, a large part of the proteome is still taken up by the TCA cycle and glyoxylate shunt (6.1 and 5.5 % respectively in K4e), which according to the optimal flux distribution should carry very low or no flux during optimal formatotrophic growth. In K4e2, the fraction of the TCA cycle is already reduced further (to 4.1 %) and the glyoxylate shunt occupies a much lower fraction (1.3 %), partly because the gene encoding AceA is not present in K4e2.
In addition to rGlyP proteins, we observed a high abundance of the outer membrane porins OmpC and OmpA in the formatotrophic E. coli strains (Fig. 9 B). Both proteins are known to mediate permeability of the outer membrane [56]. They were found in vitro to form nonspecific diffusion channels through which various small molecules can passively diffuse, including some sugars and antibiotics, and in the case of OmpC also ions [[57], [58], [59], [60]]. Thus, these proteins may also facilitate diffusion of formic acid (and perhaps formate) through the outer membrane. However, since all three formatotrophic strains showed this high abundance of OmpC and OmpA compared to the glucose reference, it is unclear whether their expression is upregulated in response to growth conditions, or whether it is caused by an unknown genetic difference between the reference and the formatotrophic strains.
4. Conclusions
We investigated the metabolism and proteome of the engineered formatotrophic E. coli strains K4, K4e, and K4e2 grown on formate, and compared the latter to E. coli grown on glucose. We aimed to gain a better understanding of the metabolic differences between these strains and to identify avenues for further improvement of formatotrophic growth by rational engineering. We identified several potential metabolic bottlenecks and inefficient routes in the formatotrophic metabolism of K4e2.
We found that, despite high expression of the heterologous PsFDH that efficiently regenerates NADH, the K4e2 strain may still be using the full TCA cycle for energy generation on formate as well, which is highly inefficient. Thus, improved yield during growth on formate may be achieved by downregulating expression of genes encoding key enzymes in this cycle, such as GltA and SucAB. Furthermore, while the optimal route for NADPH regeneration through PntAB is overexpressed, there may still be a bottleneck for NADPH, possibly contributing to utilization of less efficient NADPH regeneration routes, such as those using Zwf. Thus, knockout of Zwf and stronger overexpression of PntAB may also help improve the metabolic efficiency and biomass yield.
While malate decarboxylation by MaeB is part of another highly inefficient NADPH regeneration cycle, we suggest that MaeB could operate in the reverse (pyruvate-carboxylating) direction under formatotrophic growth conditions. This carboxylating activity would allow production of biomass precursors (such as oxaloacetate and PEP) through a more efficient route than the alternative via Pps and Ppc. If this is the case, upregulation of MaeB combined with downregulation or knock-out of Pps and Ppc could help promote this more efficient biomass production route, although the latter is predicted to lead to only a minor increase in biomass yield.
When comparing the genomes and proteomes of the three formatotrophic strains, we identified some previously unreported mutations that may contribute to their differences in growth on formate. For instance, we found a mutation in strain K4e in MetF, which sits at a branching point between the rGlyP and native metabolism, and which may contribute to the improved formatotrophic growth of K4e compared to its ancestor K4.
To conclude, our investigation provides insight into the metabolic landscape of engineered and evolved formatotrophic E. coli strains, thereby elucidating critical areas for improvement in efficiently utilizing formate as a renewable feedstock for bioproduction (Supplementary Table 5). Our findings offer a roadmap for future rational engineering endeavors aimed at maximizing the efficiency of synthetic formatotrophy. Addressing the identified metabolic bottlenecks and fine-tuning expression targets could pave the way towards industrial applications of formate-based bioproduction systems as a sustainable alternative to traditional petrochemical production.
CRediT authorship contribution statement
Suzan Yilmaz: Writing – review & editing, Writing – original draft, Visualization, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Boas Kanis: Writing – review & editing, Software, Methodology, Formal analysis, Data curation. Rensco A.H. Hogers: Software, Methodology, Formal analysis, Data curation. Sara Benito-Vaquerizo: Writing – review & editing, Software, Methodology, Formal analysis, Data curation, Conceptualization. Jörg Kahnt: Methodology, Investigation, Formal analysis, Data curation. Timo Glatter: Writing – review & editing, Methodology. Beau Dronsella: Writing – review & editing, Methodology. Tobias J. Erb: Supervision, Funding acquisition. Maria Suarez-Diez: Writing – review & editing, Supervision, Project administration, Methodology, Funding acquisition, Conceptualization. Nico J. Claassens: Writing – review & editing, Supervision, Project administration, Methodology, Funding acquisition, Conceptualization.
Declaration of competing interest
We declare that we have no competing interests that could influence the work reported in this article.
Acknowledgements
We thank Seohyoung Kim, Steffen Lindner, and Arren Bar-Even for creating and donating the strains used in this study and for their advice and support. We also thank John van der Oost for his advice and support, and we thank Vittorio Rainaldi for restoring the AceA gene in the K4 and K4e strains that were used in this study. We thank Bart Nijsse for his contributions in data analysis. S.Y. and N.J.C. acknowledge the support of the Dutch Research Council (NWO) via the Gravitation Project BaSyC (024.003.019). In addition, N.J.C. acknowledges support from his NWO Veni fellowship (VI.Veni.192.156).
Footnotes
Peer review under the responsibility of Editorial Board of Synthetic and Systems Biotechnology.
Supplementary data to this article can be found online at https://doi.org/10.1016/j.synbio.2025.03.001.
References
- 1.Intergovernmental Panel on Climate Change . Intergovernmental Panel on Climate Change (IPCC); 2023. Climate change 2023: synthesis report. [DOI] [Google Scholar]
- 2.Provisional state of the global climate 2023. World Meteorological Organization; 2023. [Online]. Available: https://wmo.int/sites/default/files/2023-11/WMO%20Provisional%20State%20of%20the%20Global%20Climate%202023.pdf [Google Scholar]
- 3.Collas F., et al. Engineering the biological conversion of formate into crotonate in Cupriavidus necator. Metab Eng. 2023;79:49–65. doi: 10.1016/j.ymben.2023.06.015. [DOI] [PubMed] [Google Scholar]
- 4.Gleizer S., Bar-On Y.M., Ben-Nissan R., Milo R. Engineering microbes to produce fuel, commodities, and food from CO2. Cell Rep Phys Sci. 2020;1(10) doi: 10.1016/j.xcrp.2020.100223. [DOI] [Google Scholar]
- 5.Haas T., Krause R., Weber R., Demler M., Schmid G. Technical photosynthesis involving CO2 electrolysis and fermentation. Nat Catal. 2018;1(1):32–39. doi: 10.1038/s41929-017-0005-1. [DOI] [Google Scholar]
- 6.Cotton C.A., Claassens N.J., Benito-Vaquerizo S., Bar-Even A. Renewable methanol and formate as microbial feedstocks. Curr Opin Biotechnol. 2020;62:168–180. doi: 10.1016/j.copbio.2019.10.002. [DOI] [PubMed] [Google Scholar]
- 7.Yishai O., Lindner S.N., Gonzalez De La Cruz J., Tenenboim H., Bar-Even A. The formate bio-economy. Curr Opin Chem Biol. 2016;35:1–9. doi: 10.1016/j.cbpa.2016.07.005. [DOI] [PubMed] [Google Scholar]
- 8.Bar-Even A. Formate assimilation: the metabolic architecture of natural and synthetic pathways. Biochemistry. 2016;55(28):3851–3863. doi: 10.1021/acs.biochem.6b00495. [DOI] [PubMed] [Google Scholar]
- 9.Wenk S., et al. Evolution-assisted engineering of E. coli enables growth on formic acid at ambient CO2 via the Serine Threonine Cycle. Metab Eng. 2025;88:14–24. doi: 10.1016/j.ymben.2024.10.007. [DOI] [PubMed] [Google Scholar]
- 10.Bar-Even A., Noor E., Flamholz A., Milo R. Design and analysis of metabolic pathways supporting formatotrophic growth for electricity-dependent cultivation of microbes. Biochim Biophys Acta BBA - Bioenerg. 2013;1827(8–9):1039–1047. doi: 10.1016/j.bbabio.2012.10.013. [DOI] [PubMed] [Google Scholar]
- 11.Sánchez-Andrea I., et al. The reductive glycine pathway allows autotrophic growth of Desulfovibrio desulfuricans. Nat Commun. 2020;11(1):5090. doi: 10.1038/s41467-020-18906-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Yishai O., Goldbach L., Tenenboim H., Lindner S.N., Bar-Even A. Engineered assimilation of exogenous and endogenous formate in Escherichia coli. ACS Synth Biol. 2017;6(9):1722–1731. doi: 10.1021/acssynbio.7b00086. [DOI] [PubMed] [Google Scholar]
- 13.Yishai O., Bouzon M., Döring V., Bar-Even A. In vivo assimilation of one-carbon via a synthetic reductive Glycine pathway in Escherichia coli. ACS Synth Biol. 2018;7(9):2023–2028. doi: 10.1021/acssynbio.8b00131. [DOI] [PubMed] [Google Scholar]
- 14.Tashiro Y., Hirano S., Matson M.M., Atsumi S., Kondo A. Electrical-biological hybrid system for CO2 reduction. Metab Eng. 2018;47:211–218. doi: 10.1016/j.ymben.2018.03.015. [DOI] [PubMed] [Google Scholar]
- 15.Bang J., Lee S.Y. Assimilation of formic acid and CO2 by engineered Escherichia coli equipped with reconstructed one-carbon assimilation pathways. Proc Natl Acad Sci USA. 2018;115(40) doi: 10.1073/pnas.1810386115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Kim S., et al. Growth of E. coli on formate and methanol via the reductive glycine pathway. Nat Chem Biol. 2020;16(5):538–545. doi: 10.1038/s41589-020-0473-5. [DOI] [PubMed] [Google Scholar]
- 17.Tishkov V.I., Popov V.O. Catalytic mechanism and application of formate dehydrogenase. Biochem Mosc. 2004;69(11):1252–1267. doi: 10.1007/s10541-005-0071-x. [DOI] [PubMed] [Google Scholar]
- 18.Kim S., et al. Optimizing E. coli as a formatotrophic platform for bioproduction via the reductive glycine pathway. Front Bioeng Biotechnol. 2023;11 doi: 10.3389/fbioe.2023.1091899. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Bang J., Hwang C.H., Ahn J.H., Lee J.A., Lee S.Y. Escherichia coli is engineered to grow on CO2 and formic acid. Nat Microbiol. 2020;5(12):1459–1463. doi: 10.1038/s41564-020-00793-9. [DOI] [PubMed] [Google Scholar]
- 20.Fedorova D., et al. Demonstration of bioplastic production from CO2 and formate using the reductive Glycine pathway in E. coli. bioRxiv. 2023 doi: 10.1101/2023.12.02.569694. [DOI] [Google Scholar]
- 21.Claassens N.J., Cotton C.A.R., Kopljar D., Bar-Even A. Making quantitative sense of electromicrobial production. Nat Catal. 2019;2(5):437–447. doi: 10.1038/s41929-019-0272-0. [DOI] [Google Scholar]
- 22.Demichev V., Messner C.B., Vernardis S.I., Lilley K.S., Ralser M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat Methods. 2020;17(1):41–44. doi: 10.1038/s41592-019-0638-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Dronsella B., Orsi E., Schulz-Mirbach H., Benito-Vaquerizo S., Yilmaz S., Glatter T., Bar-Even A., Erb T.J., Claassens N.J. One-carbon fixation via the synthetic reductive glycine pathway exceeds yield of the Calvin Cycle. Nat Microbiol. 2025;10(3):646–653. doi: 10.1038/s41564-025-01941-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Ahrné E., Molzahn L., Glatter T., Schmidt A. Critical assessment of proteome-wide label-free absolute abundance estimation strategies. Proteomics. 2013;13(17):2567–2578. doi: 10.1002/pmic.201300135. [DOI] [PubMed] [Google Scholar]
- 25.Glatter T., Ludwig C., Ahrné E., Aebersold R., Heck A.J.R., Schmidt A. Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem lys-C/trypsin proteolysis over trypsin digestion. J Proteome Res. 2012;11(11):5145–5156. doi: 10.1021/pr300273g. [DOI] [PubMed] [Google Scholar]
- 26.Orth J.D., et al. A comprehensive genome‐scale reconstruction of Escherichia coli metabolism—2011. Mol Syst Biol. 2011;7(1):535. doi: 10.1038/msb.2011.65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Monk J.M., et al. iML1515, a knowledgebase that computes Escherichia coli traits. Nat Biotechnol. 2017;35(10):904–908. doi: 10.1038/nbt.3956. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Bizouarn T., Sazanov L.A., Aubourg S., Baz Jackson J. Estimation of the H+/H− ratio of the reaction catalysed by the nicotinamide nucleotide transhydrogenase in chromatophores from over-expressing strains of Rhodospirillum rubrum and in liposomes inlaid with the purified bovine enzyme. Biochim Biophys Acta BBA - Bioenerg. 1996;1273(1):4–12. doi: 10.1016/0005-2728(95)00125-5. [DOI] [PubMed] [Google Scholar]
- 29.Hutton M., Day J.M., Bizouarn T., Jackson J.B. Kinetic resolution of the reaction catalysed by proton‐translocating transhydrogenase from Escherichia coli as revealed by experiments with analogues of the nucleotide substrates. Eur J Biochem. 1994;219(3):1041–1051. doi: 10.1111/j.1432-1033.1994.tb18587.x. [DOI] [PubMed] [Google Scholar]
- 30.Ebrahim A., Lerman J.A., Palsson B.O., Hyduke D.R. COBRApy: COnstraints-based reconstruction and analysis for Python. BMC Syst Biol. 2013;7(1):74. doi: 10.1186/1752-0509-7-74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Lewis N.E., et al. Omic data from evolved E. coli are consistent with computed optimal growth from genome‐scale models. Mol Syst Biol. 2010;6(1):390. doi: 10.1038/msb.2010.47. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Megchelenbrink W., Huynen M., Marchiori E. optGpSampler: an improved tool for uniformly sampling the solution-space of genome-scale metabolic networks. PLoS One. 2014;9(2) doi: 10.1371/journal.pone.0086587. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Shapiro S.S., Wilk M.B. An analysis of variance test for normality (complete samples) Biometrika. 1965;52(3/4):591. doi: 10.2307/2333709. [DOI] [Google Scholar]
- 34.Noor E., Eden E., Milo R., Alon U. Central carbon metabolism as a minimal biochemical walk between precursors for biomass and energy. Mol Cell. 2010;39(5):809–820. doi: 10.1016/j.molcel.2010.08.031. [DOI] [PubMed] [Google Scholar]
- 35.Grunwald S., et al. Kinetic and stoichiometric characterization of organoautotrophic growth of Ralstonia eutropha on formic acid in fed‐batch and continuous cultures. Microb Biotechnol. 2015;8(1):155–163. doi: 10.1111/1751-7915.12149. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Kelly D.P., Wood A.P., Gottschal J.C., Kuenen J.G. Autotrophic metabolism of formate by Thiobacillus strain A2. J Gen Microbiol. 1979;114(1):1–13. doi: 10.1099/00221287-114-1-1. [DOI] [Google Scholar]
- 37.Pirt S.J. The maintenance energy of bacteria in growing cultures. Proc R Soc Lond B Biol Sci. 1965;163(991):224–231. doi: 10.1098/rspb.1965.0069. [DOI] [PubMed] [Google Scholar]
- 38.Pronk J.T., Meijer W.M., Hazeu W., Van Dijken J.P., Bos P., Kuenen J.G. Growth of Thiobacillus ferrooxidans on formic acid. Appl Environ Microbiol. 1991;57(7):2057–2062. doi: 10.1128/aem.57.7.2057-2062.1991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Lü W., Du J., Schwarzer N.J., Wacker T., Andrade S.L.A., Einsle O. The formate/nitrite transporter family of anion channels. Biol Chem. 2013;394(6):715–727. doi: 10.1515/hsz-2012-0339. [DOI] [PubMed] [Google Scholar]
- 40.Vanyan L., Kammel M., Sawers R.G., Trchounian K. Evidence for bidirectional formic acid translocation in vivo via the Escherichia coli formate channel FocA. Arch Biochem Biophys. 2024;752 doi: 10.1016/j.abb.2023.109877. [DOI] [PubMed] [Google Scholar]
- 41.Kim H.U., Kim W.J., Lee S.Y. Flux-coupled genes and their use in metabolic flux analysis. Biotechnol J. 2013:1035–1042. doi: 10.1002/biot.201200279. [DOI] [PubMed] [Google Scholar]
- 42.Krapp A.R., Humbert M.V., Carrillo N. The soxRS response of Escherichia coli can be induced in the absence of oxidative stress and oxygen by modulation of NADPH content. Microbiology. 2011;157(4):957–965. doi: 10.1099/mic.0.039461-0. [DOI] [PubMed] [Google Scholar]
- 43.Simensen V., et al. Experimental determination of Escherichia coli biomass composition for constraint-based metabolic modeling. PLoS One. 2022;17(1) doi: 10.1371/journal.pone.0262450. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Schulz-Mirbach H., et al. On the flexibility of the cellular amination network in E. coli. Elife. 2022;11 doi: 10.7554/eLife.77492. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Claassens N.J., et al. Replacing the Calvin cycle with the reductive glycine pathway in Cupriavidus necator. Metab Eng. 2020;62:30–41. doi: 10.1016/j.ymben.2020.08.004. [DOI] [PubMed] [Google Scholar]
- 46.Francetic O., Belin D., Badaut C., Pugsley A.P. Expression of the endogenous type II secretion pathway in Escherichia coli leads to chitinase secretion. EMBO J. 2000;19(24):6697–6703. doi: 10.1093/emboj/19.24.6697. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Zhu K., Zhang Y.-M., Rock C.O. Transcriptional regulation of membrane lipid homeostasis in Escherichia coli. J Biol Chem. 2009;284(50):34880–34888. doi: 10.1074/jbc.M109.068239. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Conrad T.M., et al. Whole-genome resequencing of Escherichia coli K-12 MG1655 undergoing short-term laboratory evolution in lactate minimal media reveals flexible selection of adaptive mutations. Genome Biol. 2009;10(10):R118. doi: 10.1186/gb-2009-10-10-r118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.LaCroix R.A., et al. Use of adaptive laboratory evolution to discover key mutations enabling rapid growth of Escherichia coli K-12 MG1655 on glucose minimal medium. Appl Environ Microbiol. 2015;81(1):17–30. doi: 10.1128/AEM.02246-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Szalewska-Palasz A., et al. Properties of RNA Polymerase Bypass Mutants: implications for the role of ppGpp and its co-factor DksA in controlling transcription dependent on σ54. J Biol Chem. 2007;282(25):18046–18056. doi: 10.1074/jbc.M610181200. [DOI] [PubMed] [Google Scholar]
- 51.Irving S.E., Choudhury N.R., Corrigan R.M. The stringent response and physiological roles of (pp)pGpp in bacteria. Nat Rev Microbiol. 2021;19(4):256–271. doi: 10.1038/s41579-020-00470-y. [DOI] [PubMed] [Google Scholar]
- 52.Reitzer L., Schneider B.L. Metabolic context and possible physiological themes of σ54-dependent genes in Escherichia coli. Microbiol Mol Biol Rev. 2001;65(3):422–444. doi: 10.1128/MMBR.65.3.422-444.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Schulz M., et al. Functional expression of a Mo-dependent formate dehydrogenase in Escherichia coli under aerobic conditions. bioRxiv. 2023 doi: 10.1101/2023.10.27.564357. [DOI] [Google Scholar]
- 54.Hong Y., Ren J., Zhang X., Wang W., Zeng A.-P. Quantitative analysis of glycine related metabolic pathways for one-carbon synthetic biology. Curr Opin Biotechnol. 2020;64:70–78. doi: 10.1016/j.copbio.2019.10.001. [DOI] [PubMed] [Google Scholar]
- 55.Löwe H., Kremling A. In-depth computational analysis of natural and artificial carbon fixation pathways. BioDesign Res. 2021;2021 doi: 10.34133/2021/9898316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Zhang D., Ye J., Dai H., Lin X., Li H., Peng X. Identification of ethanol tolerant outer membrane proteome reveals OmpC-dependent mechanism in a manner of EnvZ/OmpR regulation in Escherichia coli. J Proteomics. 2018;179:92–99. doi: 10.1016/j.jprot.2018.03.005. [DOI] [PubMed] [Google Scholar]
- 57.Benz R., Schmid A., Hancock R.E. Ion selectivity of gram-negative bacterial porins. J Bacteriol. 1985;162(2):722–727. doi: 10.1128/jb.162.2.722-727.1985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Iyer R., Moussa S.H., Durand-Réville T.F., Tommasi R., Miller A. Acinetobacter baumannii OmpA is a selective antibiotic permeant porin. ACS Infect Dis. 2018;4(3):373–381. doi: 10.1021/acsinfecdis.7b00168. [DOI] [PubMed] [Google Scholar]
- 59.Sugawara E., Nikaido H. Pore-forming activity of OmpA protein of Escherichia coli. J Biol Chem. 1992;267(4):2507–2511. doi: 10.1016/S0021-9258(18)45908-X. [DOI] [PubMed] [Google Scholar]
- 60.Zhou G., et al. Outer membrane porins contribute to antimicrobial resistance in gram-negative bacteria. Microorganisms. 2023;11(7):1690. doi: 10.3390/microorganisms11071690. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Liebermeister W., Noor E., Flamholz A., Davidi D., Bernhardt J., Milo R. Visual account of protein investment in cellular functions. Proc Natl Acad Sci USA. 2014;111(23):8488–8493. doi: 10.1073/pnas.1314810111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Nocaj A., Brandes U. Computing voronoi treemaps: faster, simpler, and resolution‐independent. Comput Graph Forum. 2012;31(3pt1):855–864. doi: 10.1111/j.1467-8659.2012.03078.x. [DOI] [Google Scholar]
- 63.Otto A., et al. Systems-wide temporal proteomic profiling in glucose-starved Bacillus subtilis. Nat Commun. 2010;1(1):137. doi: 10.1038/ncomms1137. [DOI] [PMC free article] [PubMed] [Google Scholar]