Iterative design of training data to control intricate enzymatic reaction networks

Bob van Sluijs; Tao Zhou; Britta Helwig; Mathieu G Baltussen; Frank H T Nelissen; Hans A Heus; Wilhelm T S Huck

Nat. Commun. 2024 Feb 21;15:1602. doi: 10.1038/s41467-024-45886-9

PMCID: PMC10881569 PMID: 38383500

Abstract

Kinetic modeling of in vitro enzymatic reaction networks is vital to understand and control the complex behaviors emerging from the nonlinear interactions inside. However, modeling is severely hampered by the lack of training data. Here, we introduce a methodology that combines an active learning-like approach and flow chemistry to efficiently create optimized datasets for a highly interconnected enzymatic reaction network with multiple sub-pathways. The optimal experimental design (OED) algorithm designs a sequence of out-of-equilibrium perturbations to maximize the information about the reaction kinetics, yielding a descriptive model that allows control of the output of the network towards any cost function. We experimentally validate the model by forcing the network to produce different product ratios while maintaining a minimum level of overall conversion efficiency. Our workflow scales with the complexity of the system and enables the optimization of previously unobtainable network outputs.

Subject terms: Chemistry, Process chemistry, Computational models, Computational chemistry, Dynamic networks

Kinetic modeling of in vitro enzymatic reaction networks (ERNs) is severely hampered by the lack of training data. Here, authors introduce a methodology that combines an active learning-like approach and flow chemistry to create optimized datasets for an intricate ERN.

Introduction

Living cells rely on enzymatic reaction networks (ERNs) to produce energy and building blocks to support cellular processes. Evolution has shaped these ERNs into interconnected sub-pathways to generate multiple outputs from multiple inputs, driving product formation across complex kinetic landscapes. Recently, significant progress has been made in reconstituting ERNs in vitro with the aim of building a cell from the bottom up^1–4, or to produce value-added chemicals from sustainable substrates as an advanced biotechnology^5–8. However, most of these networks typically do not capture one of the essential features of biological ERNs, where several interconnected sub-pathways function simultaneously to generate multiple outputs. Controlling such networks remains challenging due to the lack of sufficiently informative experimental datasets that can be utilized to train kinetic models which trace the dynamic properties of large ERNs and enable on-demand design^9,10.

Typically, the optimization of ERNs towards specific outcomes, like increasing the overall efficiency, is achieved by searching a large combinatorial space of inputs and measuring the product formation of the ERN. Experimentally, this is prohibitively time-, labor-, and cost-intensive¹¹. Recently, Pandi et al. showed that such a screening process can be significantly improved by an AI-based active learning protocol¹². Additionally, promising advances that utilize machine learning to derive individual reaction mechanisms from large datasets have been published recently^9,10,13–15. Yet these black box approaches are limited in their ability to guide the design of large ERNs: they are often very adept at mapping a specific region of the input-output space, but not the entire space (the entire kinetic landscape). Kinetic models based on ordinary differential equations can track all the intermediates through time by explicitly formulated reaction rates and hence are especially powerful in guiding the optimization of complex ERNs¹⁶. In the context of larger networks, however, parameterizing these models is challenging. Not every interaction can be observed, which complicates the identification of individual rates. Training data often rely on steady-state batch experiments where a single combination of control inputs is tested. These experiments tend to be kinetically non-informative and are not sufficient to approximate the kinetic landscape of complex ERNs. To address this, time-course datasets that track the responses of ERNs to controlled perturbations are needed. This was demonstrated by Shen et al. in batch and by Hold et al. in flow, who characterized networks by adding the enzymes sequentially and measuring the change in product formation^17–19. However, as the complexity and scale of an ERN increase (substrate competition, allosteric interactions, feedback loops, futile cycles, etc.), intuitively choosing a set of perturbations that yields relevant information about the kinetic landscape becomes increasingly difficult.

Here, we present a generalizable method that trains a kinetic model iteratively, by adding new and more informative experiments to a training dataset in each optimization cycle (akin to active learning). It incorporates an optimal experimental design (OED) algorithm that evolves a sequence of out-of-equilibrium perturbations to be maximally informative. We subsequently test the utility of the model by using the experimental outcomes of these perturbation experiments as test data for the previous iteration of the model. Using this approach, we demonstrate that a limited number of design iterations is enough to obtain data of sufficient quality to map the kinetic landscape of the ERN and obtain a measure of control over it as a multi-input multi-output (MIMO) system in vitro.

Results

Overview of the nucleotide salvage pathway

The in vitro ERN constructed in this work derives from the nucleotide salvage pathway (Fig. 1a), which regenerates nucleotides for cellular processes by recovering bases and nucleosides from the degradation of RNA and DNA. The network starts with phosphoribosyl pyrophosphate (PRPP), which can be produced from glucose via the pentose phosphate pathway and is coupled by the enzymes UPRT and APRT to the nucleobases uracil and adenine, respectively, to form the monophosphate nucleotides UMP and AMP. For solubility reasons we did not include guanine as a nucleobase and started from GMP. UMP, GMP and AMP are subsequently converted to their corresponding diphosphate nucleotides (NDPs) by the enzymes UMPK, GMPK and AK, respectively, using ATP as cofactor. Finally, NDPs are converted to NTPs by a single enzyme, PK. In total, this system consists of six enzymes catalyzing eight reversible reactions, in which PK is shared between three substrates and there is resource competition for ATP, PEP and PRPP throughout the network. Previous works demonstrated that all these enzymes can function in one pot to synthesize labeled nucleotides with an excess amount of the key compound^20–22. Yet the overall performance is poor and controlling multiple state outputs remains a challenge, which requires the guidance of a kinetic model with sufficient resolution.

Kinetic model of the nucleotide salvage pathway

Translating the reactions of the ERN into a coarse-grained model of ordinary differential equations (ODEs) resulted in an ODE system of 15 equations with over 40 kinetic rates (for a full description of the model and coarse-graining process see Supplementary information 2). Generally, choosing the right model can be challenging (Fig. S5-11): large enzymatic reaction networks require more parameterization, which can cause the model to overfit the training data and reduce its predictive power. This parameter problem is present in all models, but with ODEs it can be viewed from the perspective of a parameter’s forward sensitivity to the observed species (Fig. 1b)²³. These sensitivities map onto the contribution a parameter has to the observed rates of change over time (Supplementary information 1.6 eq. 6). When the sensitivities of two rates correlate with one another, the observations can be approximated by the model by modifying both rates simultaneously. A positive correlation between the forward sensitivities of kinetic rates implies a similar effect on the rate of change of the observed species; thus, the model can fit the data by increasing the value of one rate whilst decreasing the value of its partner. A negative correlation implies an opposing effect on the rate of change; thus, to fit the data, both kinetic rates need to either increase or decrease.
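The effect of correlated sensitivities can be illustrated with a toy example that is not part of the authors' model: a single Michaelis-Menten rate v = Vmax·S/(Km + S), whose analytical sensitivities to Vmax and Km are compared for two hypothetical sets of substrate concentrations. The function names, parameter values, and concentration ranges below are assumptions chosen purely for illustration.

```python
import math


def mm_sensitivities(s, vmax, km):
    """Analytical sensitivities of v = vmax*s/(km+s) with respect to vmax and km."""
    dv_dvmax = s / (km + s)
    dv_dkm = -vmax * s / (km + s) ** 2
    return dv_dvmax, dv_dkm


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def sensitivity_correlation(substrates, vmax=1.0, km=1.0):
    """Correlation between the two sensitivity profiles over a set of conditions."""
    pairs = [mm_sensitivities(s, vmax, km) for s in substrates]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Hypothetical designs: substrate far below Km versus substrate spanning Km.
narrow = sensitivity_correlation([0.01, 0.02, 0.03, 0.04, 0.05])
wide = sensitivity_correlation([0.1, 0.5, 1.0, 5.0, 20.0])
print(f"narrow design: r = {narrow:.3f}")
print(f"wide design:   r = {wide:.3f}")
```

In the narrow design the two sensitivities are almost perfectly anti-correlated, so only the ratio Vmax/Km is constrained and both parameters can increase or decrease together without changing the fit. Spanning Km weakens the correlation, which is the kind of effect an informative experiment is meant to achieve.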

This unidentifiability means many combinations of kinetic rates can approximate the data (not just the ‘true’ rates), which in turn leads to prediction errors as the experimental conditions change from those used to generate the initial training data^23–25. Thus, experimental data can be deemed uninformative if they do not allow us to discern which reactions contribute most to the flux of a species at a specific time, which results in prediction errors as conditions change. Generally, it is easier to completely identify rates in simplified models, but their quantitative predictive power will be limited as mechanistic assumptions are readily broken (Supplementary information 2, Fig. S10). Conversely, detailed mechanistic models are more descriptive, but their kinetic rates are harder to identify.

However, from a broadly practical perspective, precisely identifying individual rates is not needed to control the behavior of an ERN. A model just needs to approximate the kinetic landscape adequately, and the remaining uncertainty needs to be manageable. To address this efficiently, we adapted an active learning approach commonly applied in machine learning with the singular goal of controlling ERNs. We utilized optimal experimental design (OED) to design experiments that maximize information about the ERN in the data, subsequently trained a kinetic model, and tested its predictive power. This cycle was repeated until the uncertainty around the predictions was reduced and they matched the experimental outcome.
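The design, train, and test cycle can be sketched on a deliberately small problem. The example below is an assumption-laden toy, not the authors' pipeline: it uses a one-parameter decay model y = exp(-k·t), a simulated "experiment" with a hypothetical true rate, a grid search in place of a real parameter estimator, and an arbitrary stopping tolerance.

```python
import math
import random


def model(k, t):
    """Toy one-parameter model (hypothetical stand-in for the kinetic model)."""
    return math.exp(-k * t)


def design_time(k_est, candidates):
    """Pick the sampling time that maximizes the squared sensitivity (dy/dk)^2."""
    return max(candidates, key=lambda t: (t * math.exp(-k_est * t)) ** 2)


def fit_rate(data, grid):
    """Least-squares estimate of k by exhaustive search over a grid."""
    return min(grid, key=lambda k: sum((model(k, t) - y) ** 2 for t, y in data))


def run_cycle(k_true=0.8, k_guess=2.0, tol=0.05, max_iter=10, noise=0.01, seed=1):
    rng = random.Random(seed)
    candidates = [0.05 * i for i in range(1, 101)]  # candidate sampling times
    grid = [0.01 * i for i in range(1, 501)]        # candidate rate constants
    k_est, data, history = k_guess, [], []
    for iteration in range(1, max_iter + 1):
        t = design_time(k_est, candidates)            # design the experiment
        y = model(k_true, t) + rng.gauss(0.0, noise)  # simulated measurement
        error = abs(model(k_est, t) - y)              # previous model predicts new data
        data.append((t, y))                           # extend the training set
        k_est = fit_rate(data, grid)                  # retrain on all data
        history.append((iteration, t, error, k_est))
        if error < tol:                               # prediction is good enough
            break
    return k_est, history


k_final, log = run_cycle()
for iteration, t, error, k_est in log:
    print(f"iter {iteration}: t = {t:.2f}, test error = {error:.3f}, k = {k_est:.2f}")
```

The stopping rule mirrors the idea in the text: each new experiment first serves as test data for the model from the previous iteration and only then joins the training set.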

OED and pulsing substrates into the flow reactor

We highlight this experimental workflow in Fig. 2a. First, all enzymes were individually immobilized on microfluidically produced hydrogel beads with a diameter of 50 μm²⁶. The activity of each enzyme after immobilization was measured separately. Next, enzyme-loaded beads were loaded into a microfluidic continuous stirred-tank reactor (CSTR). The CSTR chamber itself has a volume of 100 μl, and the flow setup has six inlets, one for each of the input substrates uracil, GMP, adenine, ATP, PEP, and PRPP, and a single outlet (Fig. S15). Samples were collected from the outlet by a fraction collector, at intervals that depended on the total flow rate, and analyzed offline by ion-pair HPLC²⁷. The analysis of the chromatographic peaks provides a compositional pattern of eight input substrates, intermediates, and product molecules (uracil, UMP, GMP, adenine, ADP, GTP, UTP, and ATP), each changing with every input combination.
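For readers less familiar with flow reactors, the standard CSTR mass balance can be sketched as follows. This is a generic textbook balance rather than the authors' model: a single first-order conversion S → P stands in for the immobilized enzymes, and the rate constant and flow rates are hypothetical values. Only the 100 μl chamber volume is taken from the text.

```python
def simulate_cstr(flow_ul_per_h, volume_ul=100.0, k_per_h=2.0,
                  s_in=1.0, t_end_h=20.0, dt_h=0.001):
    """Explicit Euler integration of a CSTR with one first-order reaction.

    dS/dt = D*(S_in - S) - k*S
    dP/dt = k*S - D*P
    where D = flow / volume is the dilution rate.
    """
    dilution = flow_ul_per_h / volume_ul
    s, p = 0.0, 0.0
    for _ in range(int(round(t_end_h / dt_h))):
        ds = dilution * (s_in - s) - k_per_h * s
        dp = k_per_h * s - dilution * p
        s += dt_h * ds
        p += dt_h * dp
    return s, p


# Hypothetical total flow rates in microliters per hour.
s_slow, p_slow = simulate_cstr(50.0)
s_fast, p_fast = simulate_cstr(400.0)
print(f"slow flow: substrate = {s_slow:.3f}, product = {p_slow:.3f}")
print(f"fast flow: substrate = {s_fast:.3f}, product = {p_fast:.3f}")
```

At steady state this toy system gives P = k·S_in/(D + k), so a longer residence time (lower flow rate) shifts the outlet composition towards product, while a higher flow rate leaves more unconverted substrate.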

The optimal experimental design workflow is shown in Fig. 2b. In step one, a swarm/evolutionary algorithm evolves an input flow profile for each of the six inputs at three different flow rates^28,29. This algorithm scores input patterns by maximizing the D-Fisher information criterion (Supplementary information 1.6 eq. 7)³⁰. This criterion is obtained by computing the determinant of the Fisher information matrix, which is derived from the parameter sensitivities (Supplementary information 1.2 eq. 6). This metric maps onto the volume of the parameter space where the ODE model can fit to the experimental data^30,31. This means the algorithm is driven to find a combination of input sequences that breaks the correlation between parameter sensitivities (if only temporarily). The transition between different total flow rates results in different output compositions and serves as another control parameter that increases the information content about substrate conversion fluxes in the data. At high flow rates, input molecules and monophosphates are detected (as only a fraction of substrate has been converted); at low flow rates, increased NTP formation is observed (Supplementary information 2.3 Fig. S13 & S14). In step two, these data are added to a training dataset, and in step three the model is trained on this dataset. In step four, the predictive power of the model is assessed by using the previous iteration of the model to predict the current experiment (test data). If the predictive power is sufficient or no longer improves, the cycle is terminated; if not, the cycle continues, and the latest iteration of the model and database is used to design a new experiment in step one³².
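A minimal sketch of a D-optimality score is given below, assuming independent measurement noise with a common standard deviation, so that the Fisher information matrix reduces to SᵀS/σ² for a sensitivity matrix S (rows are observations, columns are parameters). The helper names and the two small sensitivity matrices are hypothetical and do not come from the authors' software.

```python
def fisher_information(sens, sigma=1.0):
    """FIM = S^T S / sigma^2 for a sensitivity matrix given as a list of rows."""
    n_par = len(sens[0])
    return [[sum(row[i] * row[j] for row in sens) / sigma ** 2
             for j in range(n_par)] for i in range(n_par)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            return 0.0  # singular matrix: some parameter combination is unconstrained
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def d_criterion(sens, sigma=1.0):
    """D-optimality score: determinant of the Fisher information matrix."""
    return determinant(fisher_information(sens, sigma))


# Hypothetical sensitivities of three observations to two kinetic rates.
correlated = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]     # columns are proportional
decorrelated = [[1.0, 2.0], [2.0, 1.0], [3.0, -1.0]]  # proportionality is broken
print(d_criterion(correlated), d_criterion(decorrelated))
```

When the sensitivity columns are proportional, the determinant vanishes and the two rates cannot be separated. An experiment that breaks the proportionality yields a positive score, which is what an optimizer maximizing this criterion would favor.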

Iterative design of training data to build a kinetic model

A total of three iterations of the optimization cycle were performed (excluding a calibration), each time exchanging the microfluidic chip and altering the enzyme concentrations (Supplementary information 3.2 Fig. S17-S20). The lower and upper boundaries of the concentration ranges for the substrates were based on the enzyme activity assays and substrate solubility (Supplementary information 4.4 Fig. S24-S37). The initial experiment (not part of the cycle) was manually designed and ‘calibrates’ the model (Supplementary information 3.2 Fig. S17). This allows for the subsequent OED of an informative input sequence, since more knowledge about the system equates to better OED outcomes²⁶. To illustrate the non-intuitive character of the evolved input sequence, we show the substrate inputs of the final experiment of the optimization cycle (Fig. 3a) and the complexity of the time-course data including model convergence (Fig. 3b).

Fig. 3. a Input flow rates for each of the six input substrates evolved by the OED algorithm. b Data as measured on HPLC (black triangles) and the fit of the model to the data (solid lines). Source data are provided in Source Data Fig3.

We subsequently place these data in the context of the optimization cycle (Fig. 4). Figure 4a shows parameter distributions of the model trained in the first iteration (top) and the parameter distribution of the model trained in the third iteration (bottom). We note a significant decline in the distribution width of most kinetic parameters (Fig. S14). To demonstrate the improved predictive power of the model, Fig. 4b compares the predicted outcomes (shaded area) of the model trained after iteration one and iteration two of the OED cycle (predicting the experiment performed in the third iteration shown in Fig. 3). The second iteration of the model already shows a drastic reduction in the variance around the prediction and highlights that the model can approximate the behavior of the ERN quantitatively.

Trained model controls nucleotide salvage pathway in flow

This presents us with new opportunities for the third iteration of the model, beyond traditional optimization schemes that often focus on maximizing the yield of a single product. Here, we demonstrate how we can use the final iteration of the model to control a MIMO system to achieve a range of more complex output states^29,33. We opted to tune the ATP/GTP/UTP output ratios whilst maintaining a minimal conversion efficiency—defined as the percentage of nucleobases converted to triphosphates—of 60%.

The outcome of this sampling process is shown in Fig. 5a. We randomly generated 10⁵ substrate input combinations, and each input combination was simulated twenty times using different combinations of estimated kinetic rates. Every dot represents a different condition, and the color indicates the ratio between ATP/UTP/GTP. The 20 sets of estimated kinetic rates, when simulated, predict different ATP, UTP, and GTP concentrations. This is reflected by the y-axis, which shows the standard deviation of the predicted mean concentrations for these simulations. It captures the certainty of the model and the likelihood that there will be a prediction error for a given set of input conditions. The x-axis shows the conversion efficiency. We selected seven experimental conditions representing seven ATP/UTP/GTP ratios in Fig. 5a, including one repeated ratio (experiments 1 & 7) and one experiment with a lower conversion efficiency (experiment 3). This experiment serves two purposes: first, to demonstrate that the model can control a MIMO system and access a part of the output space that requires an accurate map of the kinetics and finely tuned control inputs (which is achieved by optimizing the ERN for different triphosphate blends with a high conversion efficiency). Second, to identify the operable space of the model, for which we test a range of total input substrate concentrations along with compositional blends of final products.
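The sampling procedure can be sketched as follows. Everything in this block is a simplified assumption: `toy_steady_state` is a hypothetical stand-in for the trained ODE model, the rate and concentration ranges are invented, the inputs are named after their products for brevity, and the default number of conditions is far smaller than the 10⁵ used in the study.

```python
import random
import statistics

PRODUCTS = ("ATP", "UTP", "GTP")


def toy_steady_state(inputs, rates, dilution=1.0):
    """Hypothetical model: first-order conversion slowed by the total substrate load."""
    load = 1.0 + sum(inputs.values())
    return {p: inputs[p] * rates[p] / (rates[p] + dilution * load) for p in PRODUCTS}


def screen_conditions(n_conditions=1000, n_parameter_sets=20,
                      min_efficiency=0.6, seed=0):
    rng = random.Random(seed)
    # Ensemble of rate sets standing in for the estimated kinetic rates.
    ensemble = [{p: rng.uniform(2.0, 8.0) for p in PRODUCTS}
                for _ in range(n_parameter_sets)]
    selected = []
    for _ in range(n_conditions):
        inputs = {p: rng.uniform(0.1, 1.0) for p in PRODUCTS}  # mM, assumed range
        predictions = [toy_steady_state(inputs, rates) for rates in ensemble]
        mean = {p: statistics.mean([pred[p] for pred in predictions])
                for p in PRODUCTS}
        spread = {p: statistics.stdev([pred[p] for pred in predictions])
                  for p in PRODUCTS}
        total = sum(mean.values())
        efficiency = total / sum(inputs.values())  # fraction of input converted
        if efficiency >= min_efficiency:
            selected.append({
                "inputs": inputs,
                "ratio": {p: mean[p] / total for p in PRODUCTS},
                "efficiency": efficiency,
                "uncertainty": max(spread.values()),
            })
    return selected


candidates = screen_conditions()
print(f"{len(candidates)} of 1000 sampled conditions meet the efficiency threshold")
```

Each retained condition carries a predicted product ratio, a conversion efficiency, and an ensemble spread, which correspond to the color, x-axis, and y-axis described for Fig. 5a.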

Figure 5b shows the predicted confidence interval of the final yield and the yield as measured by HPLC. For experiments 1–5, uncertainties and total output concentrations vary, but predictions still match. For very low input concentrations of uracil, GMP, adenine, and ATP in experiments 6 and 7, the prediction error increases even though the simulated standard deviation is low. This relation between the prediction error, quantified as the percentage by which the simulated mean deviates from the HPLC measurement, and the summed input concentration of the nucleobases is shown in Fig. 5c. It highlights that the model can predict concentrations quantitatively as long as the total concentration of substrate inputs is larger than 0.3 mM. The cause can likely be attributed to a decrease in the signal-to-noise ratio of the HPLC measurement, leading to larger variations in the experimental data (see Supplementary information Tables S1-2). To test this, we evaluated alternative models: more complex models that contained different rate laws (Fig. S5-6) as well as combinations of allosteric interactions reported in the literature (Fig. S7). However, none of these models performed better, and prediction errors increased. This suggests that these interactions do not play a significant role in this network, at least not significant enough to overcome a potential overfit of the training data. In contrast, reducing the complexity of the model increased the prediction error significantly. We were able to confirm that reactions catalyzed by the PK, UMPK, GMPK, and AK enzymes need to be reversible, whereas UPRT and APRT can be considered unidirectional (Fig. S8-11). In summary, a total input concentration of 0.3 mM marks the practical boundary of the model trained on this data, knowledge we can leverage to efficiently probe conditions in the identified operable space.

Discussion

We have presented a methodology to design informative training data and map the kinetic landscape of an ERN as efficiently as possible. By designing sufficiently complex experiments we were able to restrict the combinations of potential kinetic rates such that they map onto real product formation fluxes across a large input-output space. This space could subsequently be sampled for any cost function. To highlight this versatility, we opted to create different compositional blends of triphosphate compounds which require not one but multiple finely tuned input conditions. Finally, we identify the operable space wherein the model is useful and demonstrate that other mechanistic descriptions of the systems reduce the predictive power of the model. This underscores that the active learning aspect of the OED pipeline is able to balance the degree to which we parameterize the model, its mechanistic assumptions, and its predictive power within three iterations.

The number of OED iterations required to achieve this depends on both the complexity of the network and the quality of the experimental data. If the system is highly non-linear, more certainty about the rates will be needed, as smaller deviations from the true value will result in larger prediction errors. In contrast, very linear and orthogonal networks will likely require significantly fewer optimization cycles (and a simpler model) to enable a form of MIMO control. Overall, this means the pipeline can be utilized in different contexts as long as there is a kinetic model with control inputs (in Fig. S1 we probed the applicability of this software to larger systems, specifically by comparing the CPU time needed for the model presented here and the E. coli core metabolism model). Process optimization for organic synthesis using design of experiments in flow has been reported, but most of these studies aim to determine the optimal operational conditions for one reaction^34–36. So far, experimental design schemes have not been applied to multiple organic reaction networks (or placed in the context of an active learning cycle). However, there is no reason why they cannot be applied to train a kinetic model that provides more understanding and a high level of control over chemical reaction networks^10,13.

In future work, more complex cost functions can be defined, including the identification of key reaction mechanisms and interactions by making the coarse-graining process of the model an explicit part of the active learning process. In this instance, the algorithm, besides mapping the kinetic landscape, seeks to find input combinations which either validate or invalidate mechanistic assumptions embedded in different models. Currently, we are able to discriminate between different rate laws (broadly classifying them as descriptive or not) and the inclusion of reaction reversibility, whereas potential allosteric interactions did not seem to be present in a manner that affected predicted outcomes. However, the differences were not explicitly maximized by the algorithm, thus the observed difference in predictive power was minimal in most cases³⁷. Nevertheless, our results are promising and are complementary to other work that has shown that black box models can identify the reaction mechanism of a single reaction from bulk data¹⁴. Such approaches have not been reported in the context of a biochemical network, nor have they been embedded in an active learning-like approach, which offers promise for the future. Overall, we believe our pipeline is beneficial to all who seek to build complex biochemical pathways with controlled inputs.

Methods

Materials

Enzymes adenylate kinase (AK) and pyruvate kinase (PK) and all chemicals were purchased from Sigma and used directly without further processing. Enzymes adenine phosphoribosyl transferase (APRT) and uracil phosphoribosyl transferase (UPRT) were expressed and purified as described by Arthur et al.³⁸. Genes for guanosine monophosphate kinase (GMPK) and uridine monophosphate kinase (UMPK) were PCR amplified from E. coli K12 using gene-specific primers, cloned into pET15b, expressed overnight at 30 °C (GMPK) and 18 °C (UMPK) in E. coli BL21(DE3), and purified according to protocols modified from Oeschger et al.³⁹ (GMPK) and Serina et al.⁴⁰ (UMPK) to accommodate Ni²⁺-sepharose purification. Purified enzymes were dialyzed against 20 mM potassium phosphate buffer (pH 7.2) prior to immobilization. All the enzymes were immobilized on microfluidically produced hydrogel beads, as reported²⁶. After immobilization, all the enzyme-beads were freeze-dried and stored at -20 °C. For each enzyme, 1 mg of beads was suspended in 31 μl IVTT buffer (pH 7.3, 9 mM magnesium acetate, 5 mM potassium phosphate, 95 mM potassium glutamate, 5 mM ammonium chloride, 0.5 mM calcium chloride, 1 mM spermidine, 8 mM putrescine, 1 mM dithiothreitol, 10 mM creatine phosphate). All reactions were conducted in this IVTT buffer at room temperature.

Flow experiments setup

Cetoni Nemesys syringe pumps with Hamilton syringes were used to control the inputs, and the flow profile was programmed using the Cetoni neMESYS software^26,41. Before performing the designed flow profile, the whole system was equilibrated with buffer for two hours. The outflow of the CSTR was collected using a fraction collector, collecting for either 30 or 15 minutes or three droplets per fraction. The ion-pair HPLC analysis was adapted from ref. ²⁶ and performed on a Shimadzu Nexera X3 HPLC system with an Inertsil ODS-4 column (3 μm, 150 × 4.6 mm; GL Science) and a guard column (3 μm; 10 × 4.6 mm) at 40 °C. The elution gradient was as follows: 100% buffer A (100 mM potassium phosphate buffer (pH 6.4) with 8 mM ion-pair reagent tetrabutylammonium bisulfate, filtered before use) for 13 min; 0–77% linear gradient of buffer B (70% buffer A with 30% acetonitrile) for 22 min; 77–100% buffer B for 1 min; and 100% buffer B for 14 min. The flow rate was maintained at 1 ml/min. Peaks were identified by comparison with standard samples. The concentration was obtained from the integrated peak areas with the calibration curve of each standard.
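Converting integrated peak areas into concentrations with a calibration curve can be sketched as below, assuming a linear detector response fitted by ordinary least squares. The standard concentrations and peak areas are invented numbers for illustration, and the text does not state which fitting procedure the authors used.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def area_to_concentration(area, slope, intercept):
    """Invert the calibration line to obtain a concentration from a peak area."""
    return (area - intercept) / slope


# Hypothetical calibration standards for one compound (mM, arbitrary area units).
standards_mM = [0.1, 0.2, 0.5, 1.0]
peak_areas = [105.0, 205.0, 505.0, 1005.0]

slope, intercept = fit_line(standards_mM, peak_areas)
sample_mM = area_to_concentration(305.0, slope, intercept)
print(f"slope = {slope:.1f}, intercept = {intercept:.1f}, sample = {sample_mM:.3f} mM")
```

In practice one calibration line would be fitted per compound, since each standard has its own response factor.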

Software and modeling

An overview of the software that performs the optimizations can be found in Supplementary Information 1. A generated text-based model object²⁵ is translated to an SBML and AMICI object modified from ref. ²⁸ and ref. ⁴² (Supplementary Information Fig. S1-S4). AMICI is a continuously updated package that compiles ODE models to C++^43–46. Several publicly available tools integrate with AMICI^45–49. This is needed for the expanding repertoire of ever larger kinetic models (most in vivo)^50–54. To quantify the computational cost (and the general applicability to larger systems), we tested the speed of the presented pipeline on an in vivo E. coli core metabolism model (ref. ⁵³) and placed it in the context of our in vitro reactor set-up (see Fig. S1). This test was run on a single core of an Intel Xeon E5-1660 v4 @ 3.2 GHz. For more information on the efficiency of AMICI itself (where the bulk of the calculations are performed), we refer the reader to refs. ^43,44,55–57, or to its by now numerous applications^58–64.

Statistics & reproducibility

No statistical method was used to predetermine the sample size. No data were excluded from the analyses. For the experimental data shown in Fig. 5, the experimental conditions predicting specific ratios were selected randomly after sampling 10⁵ possible ratios of ATP/UTP/GTP, provided that these ratios conformed to the required conversion efficiency (60%) and that the chosen set of conditions differed sufficiently in the summed total inflow concentration of all substrates to cover the largest possible space and test the model.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Acknowledgements

This project is funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (ERC Adv. Grant Life-Inspired, grant agreement No. 833466 and ERC PoC Grant OptiPlex, grant agreement No. 101069237). T. Z. acknowledges the Swiss National Science Foundation for financial support (P500PB_203166).

Author contributions

B.v.S., T.Z. and W.T.S.H. conceived the study. B.v.S. and T.Z. designed and performed the experiments, respectively. B.v.S., T.Z. and W.T.S.H. analyzed the data and discussed the results. B.H. carried out foundational work, and B.v.S. and M.G.B. built software to auto-generate strings for kinetic models of ERNs. F.N. and H.H. purified the four commercially unavailable enzymes and provided the related plasmids. All authors discussed the results, provided comments, and revised the manuscript.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

All relevant data supporting the key findings of this study are available within the article and its Supplementary Information files. Source data are provided with this paper as a single source data file, including the time-dependent inputs, HPLC quantifications, and parameter estimates, archived at 10.5281/zenodo.10411170.

Code availability

The package is written in Python 3.8 (Python Software Foundation, Delaware, US). The code can be found on the Huckgroup GitHub at http://github.com/huckgroup/OED and is archived (see ref. ⁶⁵) at 10.5281/zenodo.10411170 (2023). For more information contact bob.vansluijs@gmail.com.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Tao Zhou, Email: tao.zhou@ru.nl.

Wilhelm T. S. Huck, Email: w.huck@science.ru.nl

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-024-45886-9.

References

1.Berhanu S, Ueda T, Kuruma Y. Artificial photosynthetic cell producing energy for protein synthesis. Nat. Commun. 2019;10:1325. doi: 10.1038/s41467-019-09147-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Bhattacharya A, Brea RJ, Niederholtmeyer H, Devaraj NK. A minimal biochemical route towards de novo formation of synthetic phospholipid membranes. Nat. Commun. 2019;10:300. doi: 10.1038/s41467-018-08174-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Lee KY, et al. Photosynthetic artificial organelles sustain and control ATP-dependent reactions in a protocellular system. Nat. Biotechnol. 2018;36:530–535. doi: 10.1038/nbt.4140. [DOI] [PubMed] [Google Scholar]
4.Pols T, et al. A synthetic metabolic network for physicochemical homeostasis. Nat. Commun. 2019;10:4239. doi: 10.1038/s41467-019-12287-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Burgener S, Luo S, McLean R, Miller TE, Erb TJ. A roadmap towards integrated catalytic systems of the future. Nat. Catal. 2020;3:186–192. doi: 10.1038/s41929-020-0429-x. [DOI] [Google Scholar]
6.Valliere MA, Korman TP, Arbing MA, Bowie JU. A bio-inspired cell-free system for cannabinoid production from inexpensive inputs. Nat. Chem. Biol. 2020;16:1427–1433. doi: 10.1038/s41589-020-0631-9. [DOI] [PubMed] [Google Scholar]
7.Rasor BJ, et al. Toward sustainable, cell-free biomanufacturing. Curr. Opin. Biotechnol. 2021;69:136–144. doi: 10.1016/j.copbio.2020.12.012. [DOI] [PubMed] [Google Scholar]
8.Miller TE, et al. Light-powered CO(2) fixation in a chloroplast mimic with natural and synthetic parts. Science. 2020;368:649–654. doi: 10.1126/science.aaz6802. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Yu T, et al. Machine learning-enabled retrobiosynthesis of molecules. Nat. Catal. 2023;6:137–151. doi: 10.1038/s41929-022-00909-w. [DOI] [Google Scholar]
10.Margraf JT, Jung H, Scheurer C, Reuter K. Exploring catalytic reaction networks with machine learning. Nat. Catal. 2023;6:112–121. doi: 10.1038/s41929-022-00896-y. [DOI] [Google Scholar]
11.Morgado G, Gerngross D, Roberts TM, Panke S. Synthetic biology for cell-free biosynthesis: fundamentals of designing novel in vitro multi-enzyme reaction networks. Adv. Biochem. Eng. Biotechnol. 2018;162:117–146. doi: 10.1007/10_2016_13. [DOI] [PubMed] [Google Scholar]
12.Pandi A, et al. A versatile active learning workflow for optimization of genetic and metabolic networks. Nat. Commun. 2022;13:3876. doi: 10.1038/s41467-022-31245-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Wen M, et al. Chemical reaction networks and opportunities for machine learning. Nat. Comput. Sci. 2023;3:12–24. doi: 10.1038/s43588-022-00369-z. [DOI] [PubMed] [Google Scholar]
14.Bures J, Larrosa I. Organic reaction mechanism classification using machine learning. Nature. 2023;613:689–695. doi: 10.1038/s41586-022-05639-4. [DOI] [PubMed] [Google Scholar]
15.Faulon JL, Faure L. In silico, in vitro, and in vivo machine learning in synthetic biology and metabolic engineering. Curr. Opin. Chem. Biol. 2021;65:85–92. doi: 10.1016/j.cbpa.2021.06.002. [DOI] [PubMed] [Google Scholar]
16. Martin JP, et al. A dynamic kinetic model captures cell-free metabolism for improved butanol production. Metab. Eng. 2023;76:133–145. doi: 10.1016/j.ymben.2023.01.009.
17. Shen L, et al. A combined experimental and modelling approach for the Weimberg pathway optimisation. Nat. Commun. 2020;11:1098. doi: 10.1038/s41467-020-14830-y.
18. Bujara M, Schumperli M, Pellaux R, Heinemann M, Panke S. Optimization of a blueprint for in vitro glycolysis by metabolic real-time analysis. Nat. Chem. Biol. 2011;7:271–277. doi: 10.1038/nchembio.541.
19. Hold C, Billerbeck S, Panke S. Forward design of a complex enzyme cascade reaction. Nat. Commun. 2016;7:12971. doi: 10.1038/ncomms12971.
20. Parkin DW, Leung HB, Schramm VL. Synthesis of nucleotides with specific radiolabels in ribose. Primary 14C and secondary 3H kinetic isotope effects on acid-catalyzed glycosidic bond hydrolysis of AMP, dAMP, and inosine. J. Biol. Chem. 1984;259:9411–9417. doi: 10.1016/S0021-9258(17)42716-5.
21. Tolbert TJ, Williamson JR. Preparation of specifically deuterated and 13C-labeled RNA for NMR studies using enzymatic synthesis. J. Am. Chem. Soc. 1997;119:12100–12108. doi: 10.1021/ja9725054.
22. Nelissen FHT, Girard FC, Tessari M, Heus HA, Wijmenga SS. Preparation of selective and segmentally labeled single-stranded DNA for NMR by self-primed PCR and asymmetrical endonuclease double digestion. Nucleic Acids Res. 2009;37:e114. doi: 10.1093/nar/gkp540.
23. Gábor A, Villaverde AF, Banga JR. Parameter identifiability analysis and visualization in large-scale kinetic models of biosystems. BMC Syst. Biol. 2017;11:54. doi: 10.1186/s12918-017-0428-y.
24. Kreutz C, Raue A, Kaschek D, Timmer J. Profile likelihood in systems biology. FEBS J. 2013;280:2564–2571. doi: 10.1111/febs.12276.
25. Raue A, et al. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics. 2009;25:1923–1929. doi: 10.1093/bioinformatics/btp358.
26. Baltussen MG, van de Wiel J, Fernandez Regueiro CL, Jakstaite M, Huck WTS. A Bayesian approach to extracting kinetic information from artificial enzymatic networks. Anal. Chem. 2022;94:7311–7318. doi: 10.1021/acs.analchem.2c00659.
27. Nakajima K, et al. Simultaneous determination of nucleotide sugars with ion-pair reversed-phase HPLC. Glycobiology. 2010;20:865–871. doi: 10.1093/glycob/cwq044.
28. van Sluijs B, Maas RJM, van der Linden AJ, de Greef TFA, Huck WTS. A microfluidic optimal experimental design platform for forward design of cell-free genetic networks. Nat. Commun. 2022;13:3626. doi: 10.1038/s41467-022-31306-3.
29. Smith RW, van Sluijs B, Fleck C. Designing synthetic networks in silico: a generalised evolutionary algorithm approach. BMC Syst. Biol. 2017;11:118. doi: 10.1186/s12918-017-0499-9.
30. Sinkoe A, Hahn J. Optimal experimental design for parameter estimation of an IL-6 signaling model. Processes. 2017;5:49. doi: 10.3390/pr5030049.
31. de Aguiar PF, Bourguignon B, Khots MS, Massart DL, Phan-Than-Luu R. D-optimal designs. Chemometrics Intell. Lab. Syst. 1995;30:199–210. doi: 10.1016/0169-7439(94)00076-X.
32. Ruess J, Parise F, Milias-Argeitis A, Khammash M, Lygeros J. Iterative experiment design guides the characterization of a light-inducible gene expression circuit. Proc. Natl Acad. Sci. USA. 2015;112:8148–8153. doi: 10.1073/pnas.1423947112.
33. Otero-Muras I, Carbonell P. Automated engineering of synthetic metabolic pathways for efficient biomanufacturing. Metab. Eng. 2021;63:61–80. doi: 10.1016/j.ymben.2020.11.012.
34. Taylor CJ, et al. A brief introduction to chemical reaction optimization. Chem. Rev. 2023;123:3089–3126. doi: 10.1021/acs.chemrev.2c00798.
35. Taylor CJ, et al. Flow chemistry for process optimisation using design of experiments. J. Flow Chem. 2021;11:75–86. doi: 10.1007/s41981-020-00135-0.
36. Wyvratt BM, McMullen JP, Grosser ST. Multidimensional dynamic experiments for data-rich process development of reactions in flow. React. Chem. Eng. 2019;4:1637–1645. doi: 10.1039/C9RE00078J.
37. Egert J, Kreutz C. Realistic simulation of time-course measurements in systems biology. bioRxiv 2023.01.05.522854 (2023).
38. Arthur PK, Alvarado LJ, Dayie TK. Expression, purification and analysis of the activity of enzymes from the pentose phosphate pathway. Protein Expr. Purif. 2011;76:229–237. doi: 10.1016/j.pep.2010.11.008.
39. Oeschger MP, Bessman MJ. Purification and properties of guanylate kinase from Escherichia coli. J. Biol. Chem. 1966;241:5452–5460. doi: 10.1016/S0021-9258(18)96451-3.
40. Serina L, et al. Escherichia coli UMP kinase, a member of the aspartokinase family, is a hexamer regulated by guanine nucleotides and UTP. Biochemistry. 1995;34:5066–5074. doi: 10.1021/bi00015a018.
41. Helwig B, van Sluijs B, Pogodaev AA, Postma SGJ, Huck WTS. Bottom-up construction of an adaptive enzymatic reaction network. Angew. Chem. Int. Ed. Engl. 2018;57:14065–14069. doi: 10.1002/anie.201806944.
42. Choi K, et al. Tellurium: An extensible python-based modeling environment for systems and synthetic biology. Biosystems. 2018;171:74–79. doi: 10.1016/j.biosystems.2018.07.006.
43. Fröhlich F, et al. AMICI: high-performance sensitivity analysis for large ordinary differential equation models. Bioinformatics. 2021;37:3676–3677. doi: 10.1093/bioinformatics/btab227.
44. Lakrisenko P, et al. Efficient computation of adjoint sensitivities at steady-state in ODE models of biochemical reaction networks. PLoS Comput. Biol. 2023;19:e1010783. doi: 10.1371/journal.pcbi.1010783.
45. Schälte Y, et al. pyPESTO: A modular and scalable tool for parameter estimation for dynamic models. arXiv preprint arXiv:2305.01821 (2023).
46. Schmiester L, Weindl D, Hasenauer J. Efficient gradient-based parameter estimation for dynamic models using qualitative data. Bioinformatics. 2021;37:4493–4500. doi: 10.1093/bioinformatics/btab512.
47. Schmiester L, Weindl D, Hasenauer J. Parameterization of mechanistic models from qualitative data using an efficient optimal scaling approach. J. Math. Biol. 2020;81:603–623. doi: 10.1007/s00285-020-01522-w.
48. Schmiester L, et al. PEtab—Interoperable specification of parameter estimation problems in systems biology. PLoS Comput. Biol. 2021;17:e1008646. doi: 10.1371/journal.pcbi.1008646.
49. van Rosmalen RP, Smith R, Dos Santos VM, Fleck C, Suarez-Diez M. Model reduction of genome-scale metabolic models as a basis for targeted kinetic models. Metab. Eng. 2021;64:74–84. doi: 10.1016/j.ymben.2021.01.008.
50. Dash S, et al. Development of a core Clostridium thermocellum kinetic metabolic model consistent with multiple genetic perturbations. Biotechnol. Biofuels. 2017;10:1–16. doi: 10.1186/s13068-017-0792-2.
51. Foster CJ, Gopalakrishnan S, Antoniewicz MR, Maranas CD. From Escherichia coli mutant 13C labeling data to a core kinetic model: a kinetic model parameterization pipeline. PLoS Comput. Biol. 2019;15:e1007319. doi: 10.1371/journal.pcbi.1007319.
52. Gopalakrishnan S, Dash S, Maranas C. K-FIT: An accelerated kinetic parameterization algorithm using steady-state fluxomic data. Metab. Eng. 2020;61:197–205. doi: 10.1016/j.ymben.2020.03.001.
53. Khodayari A, Zomorrodi AR, Liao JC, Maranas CD. A kinetic model of Escherichia coli core metabolism satisfying multiple sets of mutant flux data. Metab. Eng. 2014;25:50–62. doi: 10.1016/j.ymben.2014.05.014.
54. Foster CJ, Wang L, Dinh HV, Suthers PF, Maranas CD. Building kinetic models for metabolic engineering. Curr. Opin. Biotechnol. 2021;67:35–41. doi: 10.1016/j.copbio.2020.11.010.
55. Städter P, Schälte Y, Schmiester L, Hasenauer J, Stapor PL. Benchmarking of numerical integration methods for ODE models of biological systems. Sci. Rep. 2021;11:2696. doi: 10.1038/s41598-021-82196-2.
56. Shaikh B, et al. BioSimulators: a central registry of simulation engines and services for recommending specific tools. Nucleic Acids Res. 2022;50:W108–W114. doi: 10.1093/nar/gkac331.
57. Fröhlich F, Theis FJ, Rädler JO, Hasenauer J. Parameter estimation for dynamical systems with discrete events and logical operations. Bioinformatics. 2017;33:1049–1056. doi: 10.1093/bioinformatics/btw764.
58. Fröhlich F. In Computational Modeling of Signaling Networks 59–86 (Springer, 2022).
59. Lao-Martil D, et al. Kinetic modeling of Saccharomyces cerevisiae central carbon metabolism: achievements, limitations, and opportunities. Metabolites. 2022;12:74. doi: 10.3390/metabo12010074.
60. Fröhlich F, Gerosa L, Muhlich J, Sorger PK. Mechanistic model of MAPK signaling reveals how allostery and rewiring contribute to drug resistance. Mol. Syst. Biol. 2023;19:e10988. doi: 10.15252/msb.202210988.
61. Smith RW, van Rosmalen RP, Martins dos Santos VA, Fleck C. DMPy: a Python package for automated mathematical model construction of large-scale metabolic systems. BMC Syst. Biol. 2018;12:1–16. doi: 10.1186/s12918-018-0584-8.
62. Massonis G, Villaverde AF, Banga JR. Improving dynamic predictions with ensembles of observable models. Bioinformatics. 2023;39:btac755. doi: 10.1093/bioinformatics/btac755.
63. Mishra S, Wang Z, Volk MJ, Zhao H. Design and application of a kinetic model of lipid metabolism in Saccharomyces cerevisiae. Metab. Eng. 2023;75:12–18. doi: 10.1016/j.ymben.2022.11.003.
64. Contento L, Stapor P, Weindl D, Hasenauer J. In International Conference on Computational Methods in Systems Biology 36–43 (Springer, 2023).
65. van Sluijs B. “Iterative design of training data to control intricate enzymatic networks”. Zenodo. doi: 10.5281/zenodo.10411170 (2023).


