2021 Jan 20;44(4):683–700. doi: 10.1007/s00449-020-02478-3

Model-assisted DoE software: optimization of growth and biocatalysis in Saccharomyces cerevisiae bioprocesses

André Moser 1, Kim B Kuchemüller 2, Sahar Deppe 3, Tanja Hernández Rodríguez 3, Björn Frahm 3, Ralf Pörtner 2, Volker C Hass 1, Johannes Möller 2
PMCID: PMC7997827  PMID: 33471162

Abstract

Bioprocess development and optimization are still cost- and time-intensive due to the enormous number of experiments involved. In this study, the recently introduced model-assisted Design of Experiments (mDoE) concept (Möller et al. in Bioproc Biosyst Eng 42(5):867, 10.1007/s00449-019-02089-7, 2019) was extended and implemented in a software toolbox (“mDoE-toolbox”) to significantly reduce the number of required cultivations. The application of the toolbox is demonstrated in two case studies with Saccharomyces cerevisiae. In the first case study, a fed-batch process was optimized with respect to the pH value and linearly rising feeding rates of glucose and nitrogen source. Using the mDoE-toolbox, the biomass concentration was increased by 30% compared to previously performed experiments. The second case study was the whole-cell biocatalysis of ethyl acetoacetate (EAA) to (S)-ethyl-3-hydroxybutyrate (E3HB), for which the feeding rates of glucose, nitrogen source, and EAA were optimized. An increase of 80% compared to a previously performed experiment with similar initial conditions was achieved for the E3HB concentration.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00449-020-02478-3.

Keywords: Biocatalysis, Monte Carlo methods, Fed-batch strategy, Model-assisted design of experiments, Quality by design

Introduction

Biotechnology is expected to make a significant contribution to the establishment of a bio-based economy, since it offers new product manufacturing approaches and resource-efficient technologies [2, 3]. However, the development of a bio-economy requires new sustainable and environmentally friendly industrial production processes [4, 5]. Experiments for their development and optimization are usually designed using one-factor-at-a-time approaches and statistical Design of Experiments (DoE) methods. DoE methods inevitably require a large number of experiments to be performed and analytically evaluated [6, 7]. Although the use of high-throughput systems is well established, e.g., for the screening of new enzymes or drugs [8, 9], they can be used for the actual process development (e.g., dimensioning of bioreactors, design of process control strategies, and scale-up) only with simplifications [10–12]. Although DoE can be used to identify correlations between process parameters and their influence on the final productivity, the complex bioprocess is reduced to a few key figures (e.g., final product concentration), and the dynamics of growth and metabolism are not sufficiently taken into account [1, 13, 14]. In addition, the heuristic conception of a DoE by choosing the limits of the parameter space poses a particular challenge [15–17]. Thus, there is a high risk that the experiments carried out were poorly chosen and have only limited validity, which results in further costs and time delays [1, 18].

To overcome the previously mentioned limitations of DoE, a new model-assisted Design of Experiments (mDoE) concept was recently introduced for knowledge-driven bioprocess development and optimization [1, 14, 19]. In the mDoE approach, the recommended experiments in statistical DoE designs are simulated using mathematical process models instead of being performed experimentally. The DoE designs (i.e., experimental space) are then evaluated based on the simulations, which yields a well-defined experimental space with a significantly reduced number of experiments that actually have to be performed. Besides the significant reduction of the number of experiments, the use of mathematical process models is nowadays seen as a sustainable part of a knowledge-driven bioprocess development strategy, because they contribute to the scientific understanding of the process [12, 20–22]. So far, the successful application of the mDoE approach has been shown for medium and feeding strategy optimization with an antibody-producing Chinese Hamster Ovary cell line [1, 14, 19]. In addition to the application in the field of bio-economy presented here, mDoE is currently used in optimization studies with algae, stem cells, and different mammalian producer cell lines.

In this study, the mDoE concept is incorporated into a software toolbox (“mDoE-toolbox”) for efficient design and optimization of biotechnological processes with a reduced number of experiments. In general, the toolbox can be used for different applications such as cell culture, algae, and yeast. Here, the application of the mDoE-toolbox is shown in two optimization case studies with Saccharomyces cerevisiae. In the first case study (S1), the cultivation conditions of a fed-batch process were optimized to increase the biomass concentration. The chosen factors were the pH value of the medium and the linearly rising feeding rates for glucose (FGlc) and nitrogen source (FN). In the second case study (S2), the concentration of (S)-ethyl-3-hydroxybutyrate (E3HB) was maximized in the biocatalytic conversion of ethyl acetoacetate (EAA) to E3HB based on constant feeding rates for EAA (FEAA), glucose (FGlc), and nitrogen source (FN).

mDoE-toolbox

The mDoE concept (see [1, 14]) was incorporated into a software toolbox, implemented in MATLAB (V2018a) and R (V3.5.1). The core of the mDoE-toolbox is the combination of a mathematical process model, including model-parametric uncertainties, with the computational planning and evaluation of DoE designs.

In the beginning, the objective of the study (e.g., maximization of product concentration and minimization of an inhibitory component) is defined. Then, the biotechnological system is modeled, as can be seen in the structural workflow in Fig. 1 box I. For this purpose, prior knowledge (e.g., pre-experiments and literature) about the strain is used to define mathematical expressions for cell growth, metabolism, and productivity [1]. It should be noted that process modeling itself is not a strictly defined task, and a variety of models and modeling approaches of different complexity exist in the literature [24, 25]. The mDoE-toolbox is designed to be applied in the initial phase of process development, for which very little data are available. Therefore, structurally simple [14, 26] or generalized models [27] are applied, for which model parameters can be adapted based on the few data points typically generated in medium tests or first cultivations. After defining a mathematical model, the model-parametric uncertainties are derived with Monte Carlo sampling based on the experimental uncertainty (i.e., measurement error, Fig. 1 box II). In this way, the expected process variability based on the measurement errors is simulated and later used in the DoE evaluation [19, 28]. Next, the experimental factors and responses are defined (Fig. 1 box III) with individual boundary values, e.g., a tolerated concentration of an inhibitory component or a minimal required product concentration. A DoE design, such as an optimal design [29, 30] or Box–Behnken design [31, 32], is subsequently planned. Additionally, mDoE enables the in silico comparison of different DoE designs, which is not targeted in this study.

Fig. 1. Structural workflow of mDoE-toolbox consisting of the combination of mathematical process models and classical DoE under the consideration of model-parametric uncertainty based on experimental variability [1, 19]

For each recommended factor combination i, the time courses of the modeled state variables (e.g., cell weight, substrate, and product concentration) are simulated multiple times (Monte Carlo simulations, Fig. 1 box IV) taking into account the previously determined parameter probability functions (box II). From these simulations, the average expected response r¯i (e.g., average maximal cell dry weight) and the variability νi are calculated. Because the response is obtained from Monte Carlo simulations, νi is expressed as the difference between the 10% and 90% quantiles of the simulations [19].

In the next step, r¯i and νi are used for the computational evaluation (box V) of the previously planned experimental design (box III). For this purpose, both r¯i and νi are combined into one objective/desirability function (Di, the desirability at experimental factor combination i) for each planned experiment in the DoE designs [33, 34]. This enables an evaluation of each planned experiment with respect to its simulated average and its expected variability with the aim of simultaneously maximizing r¯i and reducing νi. The evaluation of DoEs using Di reflects a risk-based approach. An experiment with a high Di is favorable, whereas a low Di indicates a high variability and/or a low average response, which is not desired. After calculating Di for all planned and simulated experiments, the experimental design planned in box III is analyzed, and response surface (RS) plots are generated automatically for visualization. Only a few (e.g., 2–4) experiments with the highest Di are recommended to be performed; experiments with low Di are neglected, which enables a significantly reduced number of experiments compared to the initially planned DoE design (box III).

Using the mDoE-toolbox, the available knowledge can be captured in the mathematical model, which can serve as a basis for advanced process understanding and digital twins [36, 37]. In this way, the new data obtained from the recommended experiments can be used to re-adapt the model parameters and their probability distribution or to modify the model structure if so far unknown effects were identified [19, 24, 28].

Materials and methods

S1: optimization of fed-batch strategy for maximization of dry cell weight

Genetically unmodified Saccharomyces cerevisiae (Agrano, Germany, commercial strain used for industrial food production) was cultivated using complex media consisting of water, glucose (Glc), yeast extract (YE), and soy peptone (Pep, all Roth, Germany). No preculture was carried out, and dried yeast was directly inoculated. An overview of the performed experiments in S1 with the medium and feed compositions used is shown in Table 1.

Table 1.

Initial and feed volume, as well as initial and feed concentration of cDCW, cGlc, cYE and cPep of every experiment in S1

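| Purpose | Experiment | Medium: volume (l) | Medium: biomass (g l⁻¹) | Medium: cGlc (g l⁻¹) | Medium: cYE (g l⁻¹) | Medium: cPep (g l⁻¹) | FGlc: cGlc (g l⁻¹) | FN: cYE (g l⁻¹) | FN: cPep (g l⁻¹) |
|---|---|---|---|---|---|---|---|---|---|
| Modeling | Cultivation I | 1.00 | 3.0 | 6.0 | 1.4 | 2.3 | 290 | 50.0 | 76.0 |
| Modeling | Cultivation II | 1.00 | 3.0 | 6.0 | 1.4 | 2.3 | 290 | 50.0 | 76.0 |
| Modeling | Cultivation III | 0.75 | 20.0 | 7.9 | 1.4 | 2.3 | 290 | 85.0 | 135 |
| Modeling | pH-Exp | 0.30 | 10.0 | 30.0 | 2.9 | 4.5 | 180 | 17.0 | 27.0 |
| mDoE-toolbox | S1 | 0.70 | 2.0 | 15.0 | 1.4 | 2.3 | 400 | 130 | 270 |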

Experiments for modeling

For the parameterization of the pH-related model part (see Section “Mathematical process model”), four cultivations with different pH were performed in 1 l baffled shake flasks (500 ml working volume, Schott, Germany), which were shaken at 170 rpm (1.9 cm shaking diameter, MaxQ4000, Thermo Fisher Scientific, USA) with an initial cDCW = 10 g l⁻¹. The temperature was controlled at 30 °C. The pH was adjusted initially to the desired value (pH = 3, 4, 5, and 6, respectively) and maintained manually. In all experiments, the pH was adjusted using 20 wt% potassium hydroxide solution or 20 wt% phosphoric acid (both VWR, Germany). One feed pulse (glucose and nitrogen source) of 50 ml was added to each flask after 24 h (feed concentration in Table 1).

After adjusting the pH part of the model, further model parameterization was done based on historical data of three bioreactor (2 l working volume, Biostat B, Sartorius, Germany) fed-batch cultivations with different initial concentrations and feed compositions (see Table 1). Gassing was manually set in relation to the state of the process between 1 and 2 vvm (max. 2 l min⁻¹), and stirring was held between 500 and 800 rpm to maintain a DO above 10%. The pH was automatically controlled at 5. Temperature was set to 30 °C.

Recommended experiments from mDoE-toolbox

For the experiments recommended by the mDoE-toolbox, the initial dry cell weight (DCW) was adjusted to cDCW = 2 g l⁻¹ and the initial conditions of the complex culture medium were prepared as shown in Table 1. The recommended feeding strategy for the mDoE experiments was a linearly rising feeding rate starting at tstart = 1 h. The final feed volume flow at tend = 48 h was determined by the mDoE-toolbox for FGlc and FN separately. The feed rates were determined using the following function:

$$F_k(t) = \frac{F_{k,\mathrm{end}}}{t_{\mathrm{end}}} \times \left(t - t_{\mathrm{start}}\right). \tag{1}$$

For t<tstart, Fk(t) equals zero.
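The feed profile of Eq. 1 can be written as a small function. This is only a minimal sketch: the function name, the default times, and the final rate of 0.5 ml min⁻¹ are illustrative assumptions and not values from the mDoE-toolbox.

```python
def linear_feed_rate(t, f_end, t_start=1.0, t_end=48.0):
    """Feed rate F_k(t) as read from Eq. 1; zero before t_start.

    t, t_start and t_end are given in hours; f_end is in the flow unit of
    the pump (e.g. ml/min). All default values are illustrative.
    """
    if t < t_start:
        return 0.0
    return f_end / t_end * (t - t_start)


# Hypothetical final feed rate of 0.5 ml/min, for illustration only.
for t in (0, 1, 12, 24, 48):
    print(f"t = {t:2d} h  ->  F = {linear_feed_rate(t, f_end=0.5):.4f} ml/min")
```

Read this way, the rate reached at tend is Fk,end × (tend − tstart)/tend, i.e., slightly below Fk,end whenever tstart > 0.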

Linear feed strategies were chosen instead of exponential strategies, as these are easier to handle and less risky. Slightly too large feed rates resulting from exponential strategies quickly result in over-feeding. The pH was held constant using 20 wt% potassium hydroxide solution or 20 wt% phosphoric acid (both VWR). The temperature was controlled at 30 °C.

S2: optimization of fed-batch strategy for biocatalysis

Genetically unmodified Saccharomyces cerevisiae (Agrano, Germany) served as the whole-cell biocatalyst. In the biocatalysis (S2), the medium consisted of Glc, YE, Pep, and EAA (all Roth, Germany). The temperature was set to 30 °C and the pH to 5. The pH was controlled by the addition of 20 wt% potassium hydroxide solution or 20 wt% phosphoric acid. The airflow rate was adjusted to 1–2 vvm and the stirring rate to 800 rpm to maintain aerobic conditions (DO > 10%). Antifoam was fed when required. The experiments for the biocatalysis part (S2) are shown in Table 2.

Table 2.

Initial and feed volume, as well as initial and feed concentration of cDCW, cGlc, cYE, and cPep of the biocatalysis experiment of reference process S2; EAA is fed as a pure component

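| Purpose | Experiment | Medium: volume (l) | Medium: biomass (g l⁻¹) | Medium: cGlc (g l⁻¹) | Medium: cYE (g l⁻¹) | Medium: cPep (g l⁻¹) | FGlc: cGlc (g l⁻¹) | FN: cYE (g l⁻¹) | FN: cPep (g l⁻¹) |
|---|---|---|---|---|---|---|---|---|---|
| Modeling | Biocatalysis I | 4.0 | 45.5 | 0.1 | 0 | 0 | 400 | 130 | 270 |
| Modeling | Biocatalysis II | 10.0 | 78.2 | 0.1 | 0 | 0 | 400 | 130 | 270 |
| mDoE-toolbox | S2 | 0.6 | 40.0 | 2.0 | 1.6 | 2.4 | Adjusted individually | Adjusted individually | Adjusted individually |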

Experiments for modeling

Two experiments for model parameterization were performed in 5 l (Biocatalysis I) (BioFlo, Eppendorf, Germany) and 20 l reactor (Biocatalysis II) (BioStat C, Sartorius). Initial and feed conditions are shown in Table 2. The first experiment was an initial test experiment and the second experiment was with a high initial cDCW.

Recommended experiments from mDoE-toolbox

The recommended biocatalysis experiments were performed in 1 l stirred bioreactors (Medorex, Germany). An initial cell dry weight of cDCW = 40 g l⁻¹ was chosen. Constant FGlc, FN, and FEAA were defined as factors. Feeding started immediately after inoculation. Since the independently predicted feed flow rates for glucose and nitrogen were too low for the available pumps, they were fed together. No online off-gas measurement was performed.

Analytics

Concentrations of ethanol (cEtOH), cGlc, cEAA, and cE3HB were quantified with high-pressure liquid chromatography using a Rezex ROA column (300 × 7.8 mm, Phenomenex, USA) and 0.005 N sulfuric acid as the aqueous mobile phase according to the manufacturer’s protocol. cDCW was determined by filtering the medium through cellulose acetate filters (0.45 µm, VWR, US) and measuring the weight of the retentate after drying in a moisture analyzer (MA45, Sartorius, Germany). The percentages of oxygen and carbon dioxide in the off-gas were measured via an extractive gas analyzer (Sick, Germany). The respiratory quotient (RQ) is then calculated as the amount of carbon dioxide produced divided by the amount of oxygen consumed [38, 39]:

$$\mathrm{RQ} = \frac{\mathrm{CO}_{2,\mathrm{produced}}\ [\mathrm{mol}]}{\mathrm{O}_{2,\mathrm{consumed}}\ [\mathrm{mol}]}. \tag{2}$$

The pH value of the medium was measured in situ with an amperometric pH probe (405-DPAS-SC-K8S, Mettler Toledo, US, and EasyFerm Plus PHI S8 225, Hamilton, US). The pH values in the shaking flask experiments were checked offline with a benchtop pH meter (FiveEasy F20, Mettler Toledo, US). The DO in the biocatalysis experiments was measured with an optical dissolved oxygen (DO) probe (VisiFerm DO ECS 225 H0, Hamilton, US).

Mathematical process model

A novel structured compartment model, capable of being adapted to different biotechnological expression systems (e.g., bacteria, yeast, fungi, mammalian cell lines), was used to describe yeast growth, metabolism, and biocatalysis [27, 40]. The model was previously introduced by Brüning et al. [27] and is briefly explained in the following. The main part of the model is the segregation of the biomass into six distinct compartments, which are linked and individually described by mathematical equations representing different essential metabolic tasks. The detailed figure of the six model compartments can be found in the Electronic Supplementary Material (ESM) Fig. S1.

The following compartments are considered: an autocatalytically active biomass (Xpri), a product forming (Xp), a biocatalytically inactive (Xi), a structurally active (Xs) and inactive (Xsi), and a dead biomass (Xd) compartment. Biomass synthesis is based on a carbon (SC) and a nitrogen substrate (SN) and biocatalysis is modeled based on an educt (SBC). Furthermore, physicochemical state variables, such as DO, pH, and temperature, have a direct influence on cell metabolism, biomass growth and/or biocatalytic activity [27]. The uptake rates (rS) of the substrates (S) are rate-limiting steps, which are modeled by Monod kinetics typically used in bioprocess modeling [14, 26, 28, 40]:

$$r_S = r_S^{\max}\,\frac{S}{K_s + S} \times \prod_{i=1}^{n} f_{\mathrm{Dsig}}(x_i). \tag{3}$$

Ks is the half-saturation constant. The Monod-like term for the uptake rates is multiplied with the product of multiple double sigmoidal functions (fDsig) of the state variables (xi), which describe changes of the cell metabolism [27, 41]:

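$$f_{\mathrm{Dsig}}(x) = \left(Y_l + \frac{Y_{\mathrm{mid}} - Y_l}{1 + e^{-K_{sl}\,(x - X_{50,l})}}\right) \times \left(1 + \frac{Y_h / Y_{\mathrm{mid}} - 1}{1 + e^{-K_{sl}\,(x - X_{50,h})}}\right). \tag{4}$$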

The value of a state variable is described by x. Yl is the value of fDsig at low x, and Yh is the value at high x. Ymid is the value between X50,l and X50,h, which are location parameters of the low/high side of the function. Ksl determines the gradient of the slope [27]. The sigmoidal functions are also used to describe the influence of operating parameters on the activation and inactivation rates as well as yield coefficients. This structure enables the description of complex changes in multiple metabolic pathways and their intensity, e.g., for biomass formation, overflow metabolisms, biocatalysis, and complete oxidation under aerobic and anaerobic conditions. Moreover, the product of fDsig(x) is used to account for combined influences such as substrate/product inhibition and/or pH, DO, or temperature on the uptake rates. A parameterization strategy for the double sigmoidal functions is described in [41]. Each pathway is represented by the same, generalized stoichiometric function:

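$$\mathrm{C}_x\mathrm{H}_y\mathrm{O}_z + \nu_1\,\mathrm{O}_2 + \nu_2\,\mathrm{H}_g\mathrm{O}_h\mathrm{N}_i \rightarrow \nu_3\,\mathrm{C}_a\mathrm{H}_b\mathrm{O}_c\mathrm{N}_d + \nu_4\,\mathrm{CO}_2 + \nu_5\,\mathrm{H}_2\mathrm{O}. \tag{5}$$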

The stoichiometric coefficients νi were determined previously [42] and used according to:

$$Y_{i/S} = \nu_i\,\frac{MW_i}{MW_S}, \tag{6}$$

where MWi is the molecular weight for the state variable (e.g., biomass, O2, by-product) and MWS for the substrate [27, 43]. The yield coefficients, describing the formation of a substance i based on the substrate S (Yi/S) are used in the calculation of production and uptake rates, whereas rates of each pathway are summed up to total rates. The total rates rci are then used in general mass balances for each component with the concentrations of components in the feed ci,feed and their concentration in the bioreactor ci:

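$$\frac{dc_i}{dt} = \underbrace{r_{c_i}^{+} \times X_v}_{\text{production}} - \underbrace{r_{c_i}^{-} \times X_v}_{\text{uptake}} + \underbrace{c_{i,\mathrm{feed}} \times \frac{F_{c_i}}{V}}_{\text{input}} - \underbrace{c_i \times \frac{F_{V,\mathrm{in}}}{V}}_{\text{dilution}}. \tag{7}$$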
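The kinetic building blocks of Eqs. 3, 4, 6, and 7 can be sketched in a few lines of Python. This is a hedged illustration, not the toolbox implementation: all function and argument names are chosen here for readability, the parameter values in the usage lines are invented, and Xv is simply passed through as it appears in Eq. 7.

```python
import math


def double_sigmoid(x, y_low, y_mid, y_high, k_sl, x50_low, x50_high):
    """Double sigmoidal modifier f_Dsig(x) as read from Eq. 4 (y_mid != 0)."""
    low_side = y_low + (y_mid - y_low) / (1.0 + math.exp(-k_sl * (x - x50_low)))
    high_side = 1.0 + (y_high / y_mid - 1.0) / (
        1.0 + math.exp(-k_sl * (x - x50_high))
    )
    return low_side * high_side


def uptake_rate(s, r_max, k_s, modifiers=()):
    """Monod-type uptake rate multiplied by all modifier values (Eq. 3)."""
    rate = r_max * s / (k_s + s)
    for f in modifiers:
        rate *= f
    return rate


def yield_coefficient(nu_i, mw_i, mw_s):
    """Mass-based yield Y_i/S from a stoichiometric coefficient (Eq. 6)."""
    return nu_i * mw_i / mw_s


def mass_balance(c, r_plus, r_minus, x_v, c_feed, f_c, f_in, v):
    """Right-hand side dc_i/dt of the general mass balance (Eq. 7)."""
    return r_plus * x_v - r_minus * x_v + c_feed * f_c / v - c * f_in / v


# Illustrative (invented) inhibition curve: full activity between roughly
# x = 1 and x = 30, dropping to 20 % at high x.
inhibition = double_sigmoid(15.0, 0.0, 1.0, 0.2, 2.0, 1.0, 30.0)
print(uptake_rate(s=0.5, r_max=2.0, k_s=0.5, modifiers=[inhibition]))

# Textbook stoichiometry: glucose -> 2 ethanol + 2 CO2.
print(yield_coefficient(2, 46.07, 180.16))
```

The three plateaus of the double sigmoid (Yl, Ymid, Yh) can be checked by evaluating the function far below X50,l, between the two location parameters, and far above X50,h.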

Monte Carlo-based uncertainty quantification

To quantify the variability of the model simulations in the mDoE-toolbox, the model-parametric uncertainties are determined using Monte Carlo sampling with repeated parameter adaptations [19]. In brief, the measurement errors of the experimental data points were considered to be independent and normally distributed with the determined standard deviation. For the initial values, the standard deviation was assumed to be 5%. The individual biomass compartments considered in the six-compartment model (Section “Mathematical process model”) could not be experimentally determined and were assumed to have a standard deviation of 10%. The standard deviations of the set pH value, temperature, DO, and feeding rates, as well as their concentrations, were defined to be 5% based on the typical standard deviations in bioprocesses (i.e., expert knowledge) [19, 44, 45]. The model-parametric uncertainty was determined based on the experimental uncertainty using multiple parameterization runs (Monte Carlo samples). Due to limited computational power, 116 adaptations were performed in case study S1 and 240 adaptations in case study S2. Model parameters were adapted by minimizing the weighted root-mean-square deviation RMSD [14, 19, 27, 46]. The RMSD is calculated from the squared difference between the measured value ym and the simulated value ys, multiplied by a factor for weighting individual data points kweighting, and divided by the number of data points n in the data set:

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{n} \left(y_{s,i} - y_{m,i}\right)^2}{n}} \cdot k_{\mathrm{weighting}}. \tag{8}$$

Only high cDCW > 100 g l⁻¹ were weighted by 0.5, and no weighting was used for other state variables. The individually adapted model parameters, their medians, 10% and 90% quantiles (ESM: Tables S1 and S2), as well as their distributions are shown in the ESM.
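A weighted RMSD of this kind could be computed as follows. The sketch assumes that Eq. 8 is read as the root of the mean squared deviation multiplied by kweighting, and it applies a single weighting factor to the whole data set passed in; the function name and the numbers are illustrative only.

```python
import math


def weighted_rmsd(simulated, measured, k_weighting=1.0):
    """Root-mean-square deviation scaled by a weighting factor (Eq. 8)."""
    if len(simulated) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((ys - ym) ** 2 for ys, ym in zip(simulated, measured))
    return math.sqrt(squared / len(measured)) * k_weighting


# Invented example values; the second call down-weights the data set by 0.5.
print(weighted_rmsd([3.0, 3.0], [0.0, 0.0]))
print(weighted_rmsd([3.0, 3.0], [0.0, 0.0], k_weighting=0.5))
```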

The simulations using the determined parameters were additionally evaluated using the coefficient of determination (R²), which includes the differences between simulated ys,i and experimental data yi as well as the differences between experimental data and their mean y¯ [1, 19, 47]:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - y_{s,i}\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}. \tag{9}$$

R² lies between minus infinity and one. If R² is one, the data points correspond precisely to the solution of the model. If R² is less than zero, the mean of the measured data points describes the data better than the solution of the model [19].
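The three cases described for the coefficient of determination can be reproduced with a short sketch of Eq. 9. The data below are invented for illustration.

```python
def r_squared(measured, simulated):
    """Coefficient of determination between data and simulation (Eq. 9)."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - ys) ** 2 for y, ys in zip(measured, simulated))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


data = [1.0, 2.0, 3.0]
print(r_squared(data, [1.0, 2.0, 3.0]))  # simulation matches the data
print(r_squared(data, [2.0, 2.0, 2.0]))  # simulation equals the data mean
print(r_squared(data, [3.0, 2.0, 1.0]))  # simulation worse than the mean
```

A simulation that is no better than the mean of the measurements gives a value of zero, and anything worse becomes negative.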

Planning of experimental design

The factor settings of the DoE designs were determined as follows: First, a large number of points (> 10⁶) were randomly distributed in the three-dimensional design space (i.e., three investigated factors). Then, clusters in these points were determined using the k-means algorithm [48, 49]. The points are thereby partitioned into k sets, where k corresponds to the number of experiments in the DoE design. The resulting k cluster centers are the factor combinations (i.e., planned experiments) of the DoE design. Based on this algorithm and previous studies, a total of 29 experiments were planned, which were later individually simulated and evaluated using the mDoE-toolbox. The main advantage of this method over other approaches is its universal applicability to any number of investigated factors and experimental spaces of any shape [50].
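The planning procedure described above can be sketched with a plain k-means (Lloyd) iteration. This is a simplified illustration under several assumptions: the design space is a box, clustering is done in the unit cube so that all factors are weighted equally, far fewer random points than the > 10⁶ mentioned above are used to keep the run time short, and the factor ranges in the usage lines are invented.

```python
import random


def kmeans_design(bounds, k, n_points=1000, n_iter=20, seed=0):
    """Return k factor combinations as k-means centers of random points.

    bounds: list of (low, high) tuples, one per factor.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    centers = [p[:] for p in rng.sample(points, k)]

    for _ in range(n_iter):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p in points:
            # Assign the point to its nearest center (squared distance).
            nearest = min(
                range(k),
                key=lambda j: sum((p[d] - centers[j][d]) ** 2 for d in range(dim)),
            )
            counts[nearest] += 1
            for d in range(dim):
                sums[nearest][d] += p[d]
        # Move each center to the mean of its points; empty clusters stay put.
        new_centers = [
            [s / counts[j] for s in sums[j]] if counts[j] else centers[j]
            for j in range(k)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    # Rescale the centers from the unit cube to the real factor ranges.
    return [
        [lo + c[d] * (hi - lo) for d, (lo, hi) in enumerate(bounds)]
        for c in centers
    ]


# Hypothetical ranges for pH, F_Glc and F_N; 29 planned experiments.
bounds = [(4.0, 6.0), (0.0, 1.0), (0.0, 1.0)]
design = kmeans_design(bounds, k=29)
print(len(design), design[0])
```

For a non-rectangular experimental space, the random points would first be filtered to the admissible region before clustering; that step is omitted here.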

Monte Carlo-based simulation of planned experiments

Instead of performing each planned cultivation from the DoE design, the cultivations were first simulated (see Fig. 1) with the developed process model. Due to limited computational power, 30 Monte Carlo simulations (Section “Monte Carlo-based uncertainty quantification”) (C-eStlM, Germany) were performed for each planned experiment, so that the propagated uncertainty of the simulations was quantified. The model parameter values were drawn by Latin Hypercube Sampling using the R package “lhs” (V1.0.2) [51, 52]. The interval boundaries were determined by calculating the 10% and 90% quantiles using the Type R-7 method provided in R.
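The sampling and summary steps can be illustrated in Python as follows. This is a sketch only and does not call the R package “lhs”: the Latin hypercube is drawn in the unit cube, the type 7 quantile is implemented as linear interpolation between order statistics, and `toy_response` is a made-up stand-in for a process simulation.

```python
import math
import random


def latin_hypercube(n_samples, n_params, seed=0):
    """Unit-cube Latin hypercube: one sample per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # One random point in each of n equal-width strata, then shuffled.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [list(row) for row in zip(*columns)]


def quantile_r7(values, p):
    """Sample quantile by linear interpolation (R's type 7 definition)."""
    x = sorted(values)
    h = (len(x) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(x) - 1)
    return x[lo] + (h - lo) * (x[hi] - x[lo])


def summarize(responses):
    """Mean response and variability (90 % minus 10 % quantile)."""
    mean = sum(responses) / len(responses)
    nu = quantile_r7(responses, 0.9) - quantile_r7(responses, 0.1)
    return mean, nu


def toy_response(u):
    """Hypothetical response model, used only to have something to sample."""
    return 50.0 + 10.0 * u[0] - 5.0 * u[1]


samples = latin_hypercube(30, 2)  # 30 Monte Carlo runs, 2 parameters
mean_response, variability = summarize([toy_response(u) for u in samples])
print(mean_response, variability)
```

In practice, the unit-cube samples would be mapped onto the parameter distributions obtained from the uncertainty quantification before each simulation run.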

Computational evaluation of experimental design

First, for each planned factor combination i, the average expected response (r¯i) is calculated based on the Monte Carlo simulations. Furthermore, νi is calculated as the difference between the 10% and the 90% quantile and is used as a measure of the expected process variability. In the mDoE-toolbox, the maximization of r¯i is targeted together with a simultaneous minimization of νi. Therefore, for both r¯i and νi, individual desirability functions d(r¯i) and d(νi) are calculated by rescaling between 0 and 1. They are based on the minimal response L and the maximal response U of all r¯i and νi, respectively (the vectors including all i experiments are denoted as r¯ and ν). The desirability function d(r¯i) is thus defined on the optimization range between L(r¯) and U(r¯). For the maximization of r¯i, d(r¯i) is calculated as follows:

$$d(\bar{r}_i) = \frac{\bar{r}_i - L(\bar{r})}{U(\bar{r}) - L(\bar{r})}. \tag{10}$$

For the minimization of νi, d(νi) is inversely calculated, i.e., a high νi has a low d(νi) and vice versa:

$$d(\nu_i) = \frac{\nu_i - U(\nu)}{L(\nu) - U(\nu)}. \tag{11}$$

In the mDoE-toolbox, d(r¯i) and d(νi) are combined into one numerical value Di, which quantifies the average value and the variability of each planned experiment (see Section 1.1), using the weighting factors w1 and w2:

$$D_i = w_1 \times d(\bar{r}_i) + w_2 \times d(\nu_i) \tag{12}$$

for which

$$w_1 + w_2 = 1. \tag{13}$$

By this approach, a risk-based evaluation of the planned designs is enabled, and the weight w2 of d(νi) reflects the percentage at which νi is considered. In this study, w1 = 0.8 and w2 = 0.2. Contour and 3D plots were generated with Gnuplot 5.2.8.
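Eqs. 10–13 translate into a few lines of Python. The sketch below uses the weights stated above; the response and variability values are invented, and the function assumes that the responses (and the variabilities) are not all identical, since the rescaling would otherwise divide by zero.

```python
def desirability(r_bar, nu, w1=0.8, w2=0.2):
    """Combined desirability D_i for each planned experiment (Eqs. 10-13)."""
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to one (Eq. 13)")
    l_r, u_r = min(r_bar), max(r_bar)
    l_nu, u_nu = min(nu), max(nu)
    d_r = [(r - l_r) / (u_r - l_r) for r in r_bar]       # Eq. 10
    d_nu = [(v - u_nu) / (l_nu - u_nu) for v in nu]      # Eq. 11
    return [w1 * a + w2 * b for a, b in zip(d_r, d_nu)]  # Eq. 12


# Invented mean responses and variabilities for three planned experiments.
r_bar = [40.0, 60.0, 80.0]
nu = [5.0, 20.0, 10.0]
d_values = desirability(r_bar, nu)
best = max(range(len(d_values)), key=d_values.__getitem__)
print(d_values, "-> recommended experiment index:", best)
```

With w1 = 0.8, the ranking is dominated by the mean response; the variability mainly separates candidates with similar means.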

Results and discussion

The mDoE-toolbox software was tested in two optimization studies with Saccharomyces cerevisiae (S1 and S2, respectively). First, the aim was to maximize cDCW after 48 h (S1) based on the experimental factors pH, FGlc, and FN. Second, the biocatalytic conversion of EAA to E3HB was optimized (S2). Here, the E3HB concentration was to be maximized based on FEAA, FGlc, and FN. EAA shows inhibitory effects above a concentration of cEAA = 0.5 g l⁻¹ [53, 54]. In both processes, ethanol formation is crucial due to its inhibitory effect on cell growth and biocatalysis [55, 56]. In addition, Saccharomyces cerevisiae produces ethanol even under aerobic conditions if the glucose concentration is above a certain limit. This phenomenon is known as the Crabtree effect and should be minimized to optimize growth and biocatalysis [57, 58].

Monte Carlo-based uncertainty quantification

Model parameters and their distributions were determined using Monte Carlo sampling (see Fig. 1 box II), as explained in Section “Monte Carlo-based uncertainty quantification”. For this purpose, three sets of experiments (see Tables 1 and 2) were used. The first set consists of shaking flask experiments to adapt the parameters for the pH model. The second set consists of three historical fed-batch cultivations to model the growth of yeast, uptake rates, and production rates in relation to critical process parameters, e.g., glucose and ethanol concentration (both in Table 1). The last set was designed based on literature and was used to identify the parameters for the biocatalysis (Table 2).

Growth and metabolic model parameters (S1)

The growth and metabolic model parameters targeted in case study S1 were adjusted using data of three fed-batch cultivations (see Table 1). These model parameters describe the uptake of glucose, ethanol, and nitrogen, the activation, inactivation, and mortality rates, as well as the general yield coefficients for glucose and ethanol. Furthermore, the parameters of the sigmoidal functions for the influence of glucose limitation and ethanol inhibition on the glucose and ethanol uptake as well as the biomass inactivation rate were identified (see ESM Section 3.1). The comparison of the experimental data with the Monte Carlo-based simulations, including 10% and 90% quantiles, is shown in Fig. 2a–f. Furthermore, gassing rates and experimental online data (Fig. 2g, h), and calculated total volume V and total feeding F (Fig. 2j–l) are shown.

Fig. 2. Comparison of experimental data to the adapted model for the three initial cultivations (see Table 1). a–f Solid lines represent the mean of 116 Monte Carlo simulations (Section “Monte Carlo-based simulation of planned experiments”); dashed lines represent the 10% and 90% quantiles of the simulations; g–i online data of the off-gas measurement as well as the calculated respiratory quotient; j–l calculated V and feeding rates (F). Experimental settings and the reactor used are given in Section “Experiments for modeling”

Cultivations I and II were initial test cultivations aiming to achieve a high biomass density. For this purpose, different initial biomass concentrations were chosen. Cultivation III was performed with a feeding strategy which should lead to ethanol inhibition (Table 1).

Cultivation I. In cultivation I, the biomass (Fig. 2a) increases from initially cDCW = 18 g l⁻¹ to 100 g l⁻¹ (t = 42 h). An initial cDCW that high would not be used in “real” bioprocesses and was only used for the purpose of model parameter adaptation. The ethanol concentration stayed relatively low, below cEtOH = 10 g l⁻¹, throughout the experiment, and the glucose concentration was not measurable after t = 2 h. These results are reflected in the course of the respiratory quotient (Fig. 2b, RQ), which was constantly around one, indicating a low ethanol formation. The feeding rate was increased stepwise to F = 0.35 ml min⁻¹ (Fig. 2j).

Cultivation II. In cultivation II, a biomass density of cDCW = 64 g l⁻¹ was achieved at the end of the process and an ethanol concentration of cEtOH = 28 g l⁻¹ was determined (Fig. 2d), leading to low growth inhibition. Glucose was directly consumed when fed, and therefore, the measured glucose concentrations were about 0 g l⁻¹ after t = 10 h. The rather strong ethanol production after t = 30 h is reflected in the RQ (Fig. 2e). Feeding (Fig. 2h) was higher than in cultivation I despite a lower initial biomass density.

Cultivation III. As can be seen in Fig. 2c, only cDCW = 24 g l⁻¹ was formed at t = 44 h, but over cEtOH = 40 g l⁻¹ was produced during the same time period. This trend was also confirmed by the RQ (Fig. 2f), which was clearly above one from t = 20 h onwards, indicating an increased CO2 formation during ethanol production. Feeding (Fig. 2i) was designed to induce ethanol inhibition (i.e., over-feeding) and was increased stepwise up to F = 0.7 ml min⁻¹ [59].

Overall, the model parameters could be adapted well to the process data with an R² above 0.85 (total for cDCW, cGlc, cEtOH) comparing the experimental data to the mean of the simulations for every experiment (Table 3).

Table 3.

Total R² for average model parameters in Monte Carlo-based uncertainty quantification (S1—cultivation and S2—biocatalysis)

| Case study | Experiment | R² |
|---|---|---|
| S1 | Cultivation I | 0.97 |
| S1 | Cultivation II | 0.98 |
| S1 | Cultivation III | 0.85 |
| S2 | Biocatalysis I | 0.92 |
| S2 | Biocatalysis II | 0.96 |

In addition, the modeling of high biomass densities and ethanol inhibition was adapted sufficiently. The width of the uncertainty band (10% and 90% quantiles) of the simulations (Fig. 2a–f) was narrow, indicating a reliable estimation of model parameters. All model parameters and their individual distribution are shown in ESM Figs. S3–S5.

pH-dependent model parameters (S1)

As can be seen in Fig. 3, the influence of a wide pH range (4–7) on the growth and metabolism of Saccharomyces cerevisiae can be sufficiently reflected by the model with the adapted model parameters in case study S1. The concentrations of biomass and glucose are well reflected with an R² above 0.92. The simulation of the ethanol concentration is sufficient (R² = 0.74).

Fig. 3. Comparison of experimental (Exp) data and simulated (Sim) data for the adaptation of pH-related model parameters (see Section “Experiments for modeling”). Error bars show the standard deviation of two parallel shaking flask experiments for each investigated pH (4, 5, 6, and 7, respectively). The quality of fit is represented by R² (optimal simulation x = y). For each pH, the individual growth curves are shown in the ESM Fig. S2. Experimental settings are specified in Section “Experiments for modeling”

For all cultivations (see ESM Fig. S2 for individual plots), cells grew up to approx. cDCW = 20–25 g l⁻¹ in the first hours, accompanied by strong ethanol production. After glucose depletion, cell growth stagnated, and ethanol was taken up until the feed pulse was added at t = 24 h. After the feed pulse, the biomass density increased again, while glucose was consumed and ethanol produced. No strong growth inhibitions were seen for the different pH values investigated. The biomass densities at the end of the process were found to be slightly higher at pH 5 (compared to pH 4 and 6, respectively), and 10% higher than at pH 3. Furthermore, the glucose and ethanol consumption rates increased with increasing pH. The growth rates at pH 4 and pH 5 were equally the highest with 0.105 ± 0.005 h⁻¹.

Using the experimental data, the model parameters related to the pH-dependent glucose metabolism and uptake, and the segregation of glucose into biomass and ethanol production were determined. These parameters were not used in the Monte Carlo parameter adaptation (Section “Monte Carlo-based uncertainty quantification”) to speed up the calculations of the parameterization algorithm and were kept constant thereafter.

Biocatalytic model part (S2)

The biocatalysis model used in S2 was adapted to data from two experiments, which were based on the literature [53, 54]. The focus was on the model parameters characterizing the biocatalysis, so the adapted parameter set partly differs from the one chosen in case study S1. The new parameters are listed in ESM Section 3.2 and describe the uptake of EAA, glucose, ethanol, and nitrogen; the activation, inactivation, and mortality rates; and the general yield coefficients for EAA, glucose, and ethanol. Furthermore, the parameters of the sigmoidal functions quantifying the influence of ethanol inhibition, glucose limitation on glucose uptake, and EAA inhibition on the biomass inactivation rate were adapted to the experimental data. In Fig. 4, the comparison between experimental and simulation data of the biocatalysis parameterization experiments is shown.

Fig. 4

Model parameter adaptation for biocatalysis (see Table 2). a–d Solid line represents the mean of 116 Monte Carlo simulations (Section “Monte Carlo-based simulation of planned experiments”), dashed lines represent the 10% and 90% quantiles of the simulations; e, f calculated V and F. Experimental settings and used reactors are specified in Section “S2: optimization of fed-batch strategy for biocatalysis”

Biocatalysis I The first experiment aimed at the production of E3HB with cDCW = 45 g l⁻¹. The feed and initial concentrations of the parameterization experiments are listed in Table 2. In the first experiment, biomass (Fig. 4a) decreased to cDCW = 25 g l⁻¹ (t = 26 h), first due to dilution by feeding and towards the end due to a higher mortality rate induced by toxically high concentrations of EAA. The ethanol concentration (Fig. 4c) rose from 30 g l⁻¹ to over 40 g l⁻¹ at t = 26 h. Feeding was designed to increase cEAA (Fig. 4e). Therefore, a constant FGlc = 0.65 ml min⁻¹ and a constant FEAA = 0.067 ml min⁻¹ were set during the first 28 h. Then, FGlc was reduced and FEAA was increased to identify potential EAA inhibition. Thus, cEAA rose to over 4 g l⁻¹, and the resulting inhibition is reflected in rising cGlc, despite the reduced glucose feed. E3HB constantly increased up to cE3HB = 24 g l⁻¹ (t = 32 h) and then stopped increasing due to the EAA and potential ethanol inhibition.

Biocatalysis II In the second experiment, a higher initial biomass concentration of cDCW = 80 g l⁻¹ was used, and ethanol and glucose concentrations were kept below 0.1 g l⁻¹ during the whole experiment (Fig. 4b, d). As in Cultivation I (S1), this high cDCW was used for model parameter adaptation. After 48 h, cE3HB = 44 g l⁻¹ was reached, whereas cEAA was constantly low and the biomass density decreased due to dilution. Constant glucose (FGlc = 3 ml min⁻¹) and EAA feeding rates (FEAA = 0.05 ml min⁻¹) were set. FEAA was increased to 0.085 ml min⁻¹ in two steps.
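The dilution effect mentioned for both experiments can be illustrated with a simple volume balance. The sketch below assumes a component that is neither formed nor consumed, so that its mass stays constant while the volume grows with the total feed. The feed rates and the initial biomass concentration are those reported for Biocatalysis I; the starting volume is a hypothetical placeholder, since the reactor volume is not restated in this section.

```python
def diluted_concentration(c0, v0_l, feed_ml_min, hours):
    """Concentration after feeding if the component is neither formed nor consumed.

    Mass is conserved, so c(t) = c0 * V0 / (V0 + F * t).
    """
    added_l = feed_ml_min * 60.0 * hours / 1000.0  # ml/min -> litres over the period
    return c0 * v0_l / (v0_l + added_l)


total_feed = 0.65 + 0.067  # ml/min, F_Glc + F_EAA reported for Biocatalysis I
v0 = 5.0                   # l, hypothetical starting volume (assumption)
c_dcw_26h = diluted_concentration(45.0, v0, total_feed, 26.0)
print(f"biomass after 26 h from dilution alone: {c_dcw_26h:.1f} g/l")
```

Under these assumptions, dilution alone lowers the biomass concentration noticeably, but any further decrease would have to come from other effects such as the EAA-induced mortality described above.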

Overall, the adapted model parameters reflect the kinetics of biocatalysis satisfactorily. The calculated R² values, which compare the experimental data with the mean simulation, are shown in Table 3 and are higher than 0.92 for both experiments (summarized for cE3HB, cEAA, cDCW, cEtOH).

The biocatalytic metabolite concentrations are well reproduced by the model for both experiments. The band between the 10% and 90% quantiles of the simulations (Fig. 4a–d) is narrow. The model parameters and their individual distributions are shown in ESM Figs. S6–S9.
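The mean curve and the 10%/90% quantile band shown in such figures can be derived from an ensemble of simulations as sketched below. This is only an illustration: the exponential growth curve stands in for the full process model, and the sampled growth rate, the number of runs, and the starting concentration are assumptions, not the parameter distributions of the study.

```python
import math
import random
import statistics


def quantile_band(curves):
    """Return (mean, q10, q90) per time point for a list of simulated curves."""
    mean, q10, q90 = [], [], []
    for values in zip(*curves):           # all simulations at one time point
        deciles = statistics.quantiles(values, n=10)
        mean.append(statistics.fmean(values))
        q10.append(deciles[0])            # 10% quantile
        q90.append(deciles[-1])           # 90% quantile
    return mean, q10, q90


rng = random.Random(1)                    # fixed seed for reproducibility
times = [0, 2, 4, 6, 8, 10]               # h
x0 = 1.0                                  # g/l, hypothetical
curves = []
for _ in range(100):                      # number of runs chosen for illustration
    mu = rng.gauss(0.105, 0.005)          # h^-1, illustrative distribution
    curves.append([x0 * math.exp(mu * t) for t in times])

mean, q10, q90 = quantile_band(curves)
band_width = [hi - lo for lo, hi in zip(q10, q90)]
```

A narrow band means that the sampled parameter sets lead to similar trajectories, which is how the width of the band is read as an indicator of parameter reliability.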

Optimization of fed-batch process with mDoE-toolbox (S1)

The factors (pH, FGlc, and FN) were selected based on the literature and experience in the cultivation of Saccharomyces cerevisiae. pH is described to have a strong influence on viability and growth rate [60]. Glucose is essential for cell growth, but high glucose feeding rates lead to ethanol formation due to the Crabtree effect and possibly ethanol inhibition [58, 59]. A nitrogen source is essential for cell growth, but feeding needs tight control to avoid dilution and to enable steady growth [61].

mDoE-toolbox (S1)

Planning of experimental design and Monte Carlo simulations Using the mDoE-toolbox explained in Sect. 1.1, rather widely distributed initial boundary values are defined first for the planning of the experimental design (Fig. 1, box III) [1]. The start of feeding was set to t = 1 h. The initial boundaries were set to 0.1 ≤ FGlc ≤ 1 ml min⁻¹ and 0.05 ≤ FN ≤ 0.6 ml min⁻¹. pH was varied between 3 and 7. Using the mDoE-toolbox, a total of 29 design points (i.e., planned experiments) were distributed in the three-dimensional design space determined by the previously defined boundaries (see Section “Planning of experimental design”). For each of the planned experiments, Monte Carlo simulations (Fig. 1, box IV) were performed, as explained in Section “Monte Carlo-based simulation of planned experiments”.

Computational evaluation of experimental design From the Monte Carlo simulations, r̄i and νi were calculated for the maximum cDCW for each experimental setting i, which were further used to derive the desirability Di (Fig. 1, box V). In Fig. 5, the desirability functions are plotted for pH = 5.0, FN = 0.20 ml min⁻¹, and FGlc = 0.42 ml min⁻¹, for which the highest Di was calculated. Figure 5 shows the contour and 3D plots at these process conditions.
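The step from the Monte Carlo summaries to a single score per design point can be sketched as follows. The snippet assumes one-sided desirability functions in the style of Derringer and Suich [34] with lower and upper limits L and U, and a weighted geometric mean for the combined value; whether the toolbox uses exactly this form is an assumption here. All numbers, limits, and weights are hypothetical.

```python
def d_maximize(value, low, high):
    """One-sided desirability: 0 at or below `low`, 1 at or above `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def d_minimize(value, low, high):
    """Mirror image of d_maximize, for responses that should be small."""
    return 1.0 - d_maximize(value, low, high)


def combined_desirability(d_values, weights):
    """Weighted geometric mean; any zero desirability gives zero overall."""
    if any(d == 0.0 for d in d_values):
        return 0.0
    total = sum(weights)
    product = 1.0
    for d, w in zip(d_values, weights):
        product *= d ** (w / total)
    return product


# Hypothetical Monte Carlo summaries for three design points:
# (mean of the maximum biomass in g/l, predicted variability)
designs = {"A": (60.0, 0.05), "B": (75.0, 0.10), "C": (78.0, 0.30)}
scores = {}
for name, (r_mean, nu) in designs.items():
    d_r = d_maximize(r_mean, low=40.0, high=80.0)   # high response is desirable
    d_nu = d_minimize(nu, low=0.0, high=0.4)        # low variability is desirable
    scores[name] = combined_desirability([d_r, d_nu], weights=[1.0, 1.0])

best = max(scores, key=scores.get)
```

In this toy example, design point C has the highest mean response but is penalized for its high predicted variability, so B receives the highest combined score.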

Fig. 5

Contour and 3D plot for the response surfaces based on Di for the optimization of pH, FGlc, and FN in S1. The responses were calculated with Monte Carlo simulations in the mDoE-toolbox, as explained in Section 1.1. a–c Graphs are adjusted to pH = 5, FN = 0.20 ml min⁻¹, and FGlc = 0.42 ml min⁻¹, respectively; lines show differences of 0.05; d–f 3D representation of RS curves. The individual quadratic RS equations are shown in ESM Table S1

Di is high for flow rates of FGlc < 0.5 ml min⁻¹ and FN < 0.3 ml min⁻¹ at pH = 5.0 (Fig. 5a). The influence of the pH value (Fig. 5b) on the desirability function is low, and a pH between 4 and 6 provides the best results. These results were additionally confirmed by the RS in Fig. 5c, where a low FN < 0.3 ml min⁻¹ and a pH between 4 and 6 again show the best results. Using the mDoE-toolbox, the experimental space was computationally simulated and evaluated. For the factors investigated, the resulting RS plots could hardly have been predicted from experience alone. The in silico calculation and computational evaluation of the planned experimental design offer a knowledge-driven approach, which is the major advantage of the mDoE-toolbox.
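How a quadratic response surface is evaluated over the design space to locate the region of high Di can be sketched for two factors at a fixed pH. The coefficients below are purely hypothetical and were chosen only to give a concave surface; the fitted RS equations of the study are those in ESM Table S1. The grid limits correspond to the initial boundaries of FGlc and FN.

```python
def quadratic_rs(x1, x2, b):
    """Second-order response surface for two factors."""
    return (b["0"] + b["1"] * x1 + b["2"] * x2
            + b["11"] * x1 ** 2 + b["22"] * x2 ** 2 + b["12"] * x1 * x2)


# Hypothetical coefficients (assumption, not the fitted values of the study)
COEFFS = {"0": 0.5, "1": 0.8, "2": 0.8, "11": -1.0, "22": -2.0, "12": 0.0}

# Grid over the initial boundaries, in ml/min
f_glc_grid = [round(0.10 + 0.05 * i, 2) for i in range(19)]   # 0.10 ... 1.00
f_n_grid = [round(0.05 + 0.05 * i, 2) for i in range(12)]     # 0.05 ... 0.60

# Exhaustive search for the grid point with the highest predicted desirability
best_point = max(((x1, x2) for x1 in f_glc_grid for x2 in f_n_grid),
                 key=lambda p: quadratic_rs(p[0], p[1], COEFFS))
best_value = quadratic_rs(*best_point, COEFFS)
```

A grid search is used here only because it is easy to follow; for a quadratic surface the stationary point could also be obtained analytically.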

mDoE-suggested experimental settings Based on the evaluation of the experimental space, further experiments were recommended by the toolbox (see Sect. 2.8). To statistically validate the recommended factor settings, four experiments located in the high Di region were chosen, which are listed in Table 4.

Table 4.

Factor combination of the four yeast cultivation experiments determined with the usage of the mDoE-toolbox. FGlc and FN are the final maximum feed rates of the linearly rising feed at the end of the experiments

Exp.   FGlc (ml min⁻¹)   FN (ml min⁻¹)   pH (–)
#1     0.28              0.12            4
#2     0.28              0.12            6
#3     0.41              0.18            5
#4     0.54              0.25            5
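Because Table 4 lists only the final maximum rates of the linearly rising feeds, a short sketch may help to picture the resulting profile. It assumes that each feed starts at zero at t = 1 h (the feeding start given in the planning step) and reaches its maximum at 48 h; the starting rate and the end time are assumptions made for this illustration.

```python
def linear_feed(t_h, f_max, t_start=1.0, t_end=48.0):
    """Feed rate in ml/min at time t_h for a ramp from 0 to f_max."""
    if t_h <= t_start:
        return 0.0
    if t_h >= t_end:
        return f_max
    return f_max * (t_h - t_start) / (t_end - t_start)


def fed_volume_ml(f_max, t_start=1.0, t_end=48.0):
    """Total volume fed over the ramp (area of the triangle), in ml."""
    return 0.5 * f_max * (t_end - t_start) * 60.0


# Experiment #3 of Table 4: F_Glc = 0.41 ml/min, F_N = 0.18 ml/min
f_glc_mid = linear_feed(24.5, 0.41)   # rate halfway through the ramp
v_glc = fed_volume_ml(0.41)           # total glucose feed volume
v_n = fed_volume_ml(0.18)             # total nitrogen feed volume
```

The fed volumes computed this way indicate how much dilution a given pair of maximum feed rates implies over the whole cultivation.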

Performed experiments

The four recommended experiments were performed (see Section “Recommended experiments from mDoE-toolbox”). The comparison of the experimental data to the model simulations, including the parametric 10% and 90% uncertainty-based prediction bands, is depicted in Figure 6.

Fig. 6

Experimental data of the four performed cultivations compared to the simulated data from the mDoE-toolbox for cDCW, cGlc, and cEtOH. The solid line is the mean of 30 simulations (Section “Monte Carlo-based simulation of planned experiments”); dashed lines represent the 10% and 90% quantiles of the simulations. Online off-gas data and individual feeding profiles can be found in ESM Fig. S10. Experimental settings and the used reactor are specified in Section “Recommended experiments from mDoE-toolbox”

Biomass In all four cultivations, cells grew to maximum biomass densities between cDCW = 45 g l⁻¹ (Fig. 6d) and cDCW = 80 g l⁻¹ (Fig. 6a). Growth was predicted sufficiently for all cultivations and only partly underestimated between t = 18 and 24 h in cultivation #3 (Fig. 6c). The width of the uncertainty band (10% and 90% quantiles) was relatively narrow for cultivations #1–#3 (Fig. 6a–c), which reflects that the variability of these experimental settings is predicted to be low. Broad uncertainty bands, indicating a high variability of the experimental settings, were predicted for cultivation #4 (Fig. 6d), due to high cEtOH near inhibitory conditions.

Glucose and ethanol Glucose was consumed during the cultivations and was constantly very low (i.e., fully consumed) after t ≈ 20 h. In cultivations #1 and #2, cEtOH increased only at the end of the cultivation, in relatively low amounts below cEtOH = 12 g l⁻¹, for which no growth inhibition was seen. In cultivations #3 and #4 (Fig. 6g, h), ethanol was not consumed and was produced up to a maximum inhibitory concentration of cEtOH = 45.5 ± 1.8 g l⁻¹ in cultivation #4 (Fig. 6h) [59, 61].

Overall, cDCW = 80 g l⁻¹ was achieved (cultivation #1) after the application of the mDoE-toolbox in a study with three influencing factors (FGlc, FN, pH). This reflects an improvement of 30% in relation to cultivation II (Fig. 2b) with similar initial conditions. Simultaneously, cEtOH at t = 48 h could be reduced by 50% compared to cultivation II (Fig. 2e), resulting in higher substrate usage and a safer process operation point due to a lower risk of inhibition.

The investigated factors are difficult to assess in traditional DoE studies due to their dynamic nature, i.e., the feeding rate itself changes during the process. A model-based approach strongly supports the evaluation of such dynamically changing factors.

Optimization of biocatalytic conversion of EAA to E3HB (S2)

In case study S2, the feeding rates FEAA, FGlc, and FN were manipulated to optimize the biocatalytic conversion of EAA to E3HB. The optimization objective is defined as the maximization of the biocatalytic product concentration (cE3HB). Among these factors, FEAA is the most critical, because cEAA has to be kept below 0.5 g l⁻¹ to avoid inhibition [53].

mDoE-toolbox (S2)

Planning of experimental design and Monte Carlo simulations The boundaries for FEAA were 0 ≤ FEAA ≤ 0.04 ml min⁻¹, based on the literature, to avoid inhibition through high EAA concentrations [53, 54]. FGlc and FN were defined to meet the demands of the maintenance metabolism and the amount required for biocatalysis, and were set to 0 ≤ FGlc ≤ 0.2 ml min⁻¹ and 0 ≤ FN ≤ 0.03 ml min⁻¹. The composition of the feeds can be found in Table 2. For each simulated and performed biocatalysis, cDCW = 30 g l⁻¹ was directly inoculated with no prior cultivation. The same design of experiments as in S1 was applied (see Fig. 1, box III), and 29 experiments were planned initially using the mDoE-toolbox. For each planned experiment, Monte Carlo simulations were performed (see Fig. 1, box IV).

Computational evaluation of experimental design Di was calculated and the response surface was predicted (Fig. 1, box V). As can be seen in Fig. 7, a maximum was determined at FEAA = 0.02 ml min⁻¹, FGlc = 0.11 ml min⁻¹, and FN = 0.01 ml min⁻¹.

Fig. 7

Contour and 3D plot for the response surfaces based on Di for FEAA, FGlc, and FN in case study S2. The responses were calculated with Monte Carlo simulations in the mDoE-toolbox, as explained in Section 1.1. a–c Graphs are adjusted to FEAA = 0.02 ml min⁻¹, FGlc = 0.11 ml min⁻¹, and FN = 0.01 ml min⁻¹, respectively; lines show differences of 0.05; d–f 3D representation of RS curves. The individual quadratic RS equations are shown in ESM Table S1

Excessive EAA feeding is predicted to inhibit and deactivate biocatalysis. A high Di is achieved only with a low FEAA, as can be seen from the shape of the 3D plots (Fig. 7d–f). Values of FGlc that are too small lead to low cGlc, which limits the biocatalysis in the simulations. Values of FGlc that are too high result in an increased cGlc, which leads to ethanol formation due to the Crabtree effect. In addition, high feeding rates (FGlc and FN) always result in dilution. The impact of FN on achieving high cE3HB is rather low. Therefore, a small FN is desirable.

mDoE-suggested experimental settings Based on the Monte Carlo simulations and the computational evaluation of the experimental space, experiments located in the high Di regions were identified and four of them were chosen. The experimental settings are listed in Table 5.

Table 5.

Factor combination of the four recommended biocatalytic experiments determined with the mDoE-toolbox

Exp.   FEAA (ml min⁻¹)   FGlc (ml min⁻¹)   FN (ml min⁻¹)
#1     0.024             0.09              0.010
#2     0.018             0.12              0.010
#3     0.014             0.06              0.020
#4     0.010             0.09              0.002
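The settings in Table 5 can be checked against the initial boundaries of S2, and the total volume added by the three feeds can be estimated, as in the sketch below. It assumes constant feeding rates over a 48 h process; the duration is an assumption made for this illustration.

```python
# Initial boundaries of the S2 design space, in ml/min
BOUNDS = {"F_EAA": (0.0, 0.04), "F_Glc": (0.0, 0.2), "F_N": (0.0, 0.03)}

# Recommended settings from Table 5, in ml/min
TABLE_5 = {
    1: {"F_EAA": 0.024, "F_Glc": 0.09, "F_N": 0.010},
    2: {"F_EAA": 0.018, "F_Glc": 0.12, "F_N": 0.010},
    3: {"F_EAA": 0.014, "F_Glc": 0.06, "F_N": 0.020},
    4: {"F_EAA": 0.010, "F_Glc": 0.09, "F_N": 0.002},
}


def within_bounds(setting):
    """True if every feed rate lies inside its initial boundary."""
    return all(BOUNDS[k][0] <= v <= BOUNDS[k][1] for k, v in setting.items())


def fed_volume_ml(setting, hours=48.0):
    """Total volume added by all constant feeds over the given duration."""
    return sum(setting.values()) * 60.0 * hours


all_inside = all(within_bounds(s) for s in TABLE_5.values())
volumes = {exp: fed_volume_ml(s) for exp, s in TABLE_5.items()}
```

The summed feed volume gives a quick impression of the dilution each setting would cause, which is one of the effects the text attributes to high feeding rates.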

Performed experiments

As can be seen in Fig. 8, final E3HB concentrations of cE3HB = 10–50 g l⁻¹ were reached in the four recommended experiments.

Fig. 8

Experimental data of the four performed cultivations compared to the simulated data from the mDoE-toolbox for cEAA, cE3HB, and cGlc. The solid line is the mean of 30 simulations (Section “Monte Carlo-based simulation of planned experiments”); dashed lines represent the 10% and 90% quantiles of the simulations. Experimental settings and the used reactor are specified in Section “Recommended experiments from mDoE-toolbox”

EAA and E3HB Although the variation in FEAA across the four experiments was relatively small (0.010 ≤ FEAA ≤ 0.024 ml min⁻¹, see Table 5), the maximum cE3HB decreased strongly with decreasing FEAA from cultivation #1 to #4. EAA was constantly consumed during the bioprocesses and reached a maximum of cEAA = 2 g l⁻¹ in experiment #1 (Fig. 8a). The width of the band between the 10% and 90% quantiles of the cEAA simulations increases for higher FEAA, due to an increasing probability of EAA inhibition.

In cultivations #3 and #4 (Fig. 8c, d), cE3HB is lower than the prediction. Since no EAA was detectable, a higher by-product formation rate might have occurred in the experiments with lower product concentrations (Fig. 8c, d).

Biomass, glucose, and ethanol Despite a low FN, biomass density increased by at least 15 g l⁻¹ in every experiment (Fig. 8e–h). This was consistent with cGlc and cEtOH, which were below 5 g l⁻¹ for cultivations #2–#4. In cultivation #1, increasing cEAA led to inhibition and a reduction of the metabolic activity, resulting in a lower glucose consumption and an increasing glucose concentration up to cGlc = 1 g l⁻¹. This induced ethanol formation, reaching cEtOH = 10 g l⁻¹ (t = 48 h, Fig. 8e).

It could be shown that even the feeding strategy for a biocatalytic process with complex reaction mechanisms could be designed and optimized with the application of the mDoE-toolbox. By calculating an optimal feeding profile for EAA, glucose, and nitrogen source, cE3HB = 44 g l⁻¹ (Exp. #1) was achieved. This is an improvement of 80% in comparison to the experiment Biocatalysis I (Fig. 4a), with similar initial conditions. The same E3HB concentration was reached with a 60% lower initial cDCW than in the high cell density experiment (Biocatalysis II, Fig. 4b), and it is 10% higher than reported in the literature [53, 54]. Furthermore, the application of the mDoE-toolbox reduced the initial experimental space by over 90% (compare Table 4 with Fig. 5). Only four recommended experiments had to be performed to find improved operating conditions, resulting in 80% higher product concentrations. The reactor volume was 80% smaller compared to Biocatalysis I, which may have had an additional influence on the improved product concentration. However, process understanding obtained using mathematical process models was recently shown to be transferable between different scales [19, 37].
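The percentages quoted above can be retraced from the reported values with a few lines of arithmetic. The sketch below uses the E3HB and biomass concentrations given in the text. The last part shows one possible reading of the reduction of the experimental space, namely the bounding box of the recommended settings in Table 4 relative to the initial S1 boundaries; this interpretation is an assumption and not necessarily the calculation used by the authors.

```python
def percent_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


e3hb_gain = percent_change(44.0, 24.0)    # Exp. #1 vs. Biocatalysis I, g/l
dcw_change = percent_change(30.0, 80.0)   # initial biomass vs. Biocatalysis II, g/l

# Bounding box of the Table 4 settings relative to the initial S1 boundaries
initial = {"F_Glc": (0.1, 1.0), "F_N": (0.05, 0.6), "pH": (3.0, 7.0)}
recommended = {"F_Glc": (0.28, 0.54), "F_N": (0.12, 0.25), "pH": (4.0, 6.0)}

remaining = 1.0
for name, (lo, hi) in initial.items():
    r_lo, r_hi = recommended[name]
    remaining *= (r_hi - r_lo) / (hi - lo)   # fraction of each axis still covered

space_reduction = 100.0 * (1.0 - remaining)
print(f"E3HB: {e3hb_gain:+.0f}%, initial biomass: {dcw_change:+.1f}%, "
      f"design space: -{space_reduction:.0f}%")
```

The first two values round to the roughly 80% higher product concentration and the roughly 60% lower initial biomass stated in the text.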

Conclusion

In this study, the mDoE-toolbox was introduced to enable a more knowledge-driven experimental design and to strongly reduce the number of experiments during bioprocess development and optimization. The application of the toolbox was shown for two different case studies with Saccharomyces cerevisiae. In case study S1, a fed-batch process was optimized to maximize the final biomass density depending on the factors pH and linearly rising substrate feeding rates. Just four experiments were needed to achieve a 30% increase of the final biomass density compared to Cultivation II, instead of the 29 initially planned experiments that would have been performed in a fully experimental evaluation of the DoE. In case study S2, the biocatalytic production of E3HB was optimized based on constant substrate feeding rates. Again, just four experiments were needed instead of the 29 initially planned experiments. An improvement of 80% in the final E3HB concentration was experimentally achieved compared to the experiment Biocatalysis I with similar initial conditions. Although this reaction is well known, an improvement of about 10% of cE3HB was achieved compared to the literature [53, 54]. In addition, this result was obtained in less than half the cultivation time [54]. In both processes, the optimization of factors that are rather difficult to assess, such as time-varying feeding rates (S1) or feeding of inhibitory components (S2), was possible through modeling.

In summary, the usage of the mDoE-toolbox enables optimization studies with dynamic factors in statistics-based biotechnology research employing a reduced number of experiments. Further research will focus on online model parameter adaptation and the consequent online re-design of experiments to increase the obtained process understanding further.

Acknowledgements

The authors would like to thank V. Gockel, M. Meissner, and M. Menden for their laboratory work at Furtwangen University. Krathika Bhat is kindly acknowledged for English proof reading. The mDoE-toolbox can be made available for scientific collaborations.

Abbreviations

DCW: Dry cell weight
DO: Dissolved oxygen
DoE: Design of experiments
EAA: Ethyl acetoacetate
EtOH: Ethanol
E3HB: (S)-Ethyl-3-hydroxybutyrate
Glc: Glucose
mDoE: Model-assisted design of experiments
Pep: Peptone
RMSD: Weighted mean squared deviation
RS: Response surface
RSM: Response surface method
RQ: Respiratory quotient
YE: Yeast extract

List of symbols

ck: Concentration of component k in the bioreactor (g l⁻¹)
ck,feed: Concentration of component k in the feed (g l⁻¹)
d: Desirability function (–)
Di: Combined desirability function (–)
F: Total feeding rate (ml min⁻¹)
Fck: Feeding rate of component k (ml min⁻¹)
Fk: Feeding rate of component k (ml min⁻¹)
FV,in: Total volumetric feed rate (g l⁻¹ h⁻¹)
i (index): Recommended factor combination (–)
k (index): EAA, Glc, N, EtOH, E3HB (–)
kweighting: Weighting factor in RMSD (–)
Ks: Saturation constant (g l⁻¹)
Ksl: Gradient slope (–)
L: Lowest value/lower limit (–)
MWk: Molecular weight of the state variable k (g mol⁻¹)
MWS: Molecular weight of the substrate S (g mol⁻¹)
n: Number of data points (–)
rck: Total uptake/consumption rate (h⁻¹)
r̄i: Average value of response i (–)
rS: Substrate uptake rate (g l⁻¹ h⁻¹)
R²: Coefficient of determination (–)
SBC: Biocatalysis educt (g l⁻¹)
SC: Carbon substrate (g l⁻¹)
SN: Nitrogen substrate (g l⁻¹)
t: Time (h)
U: Highest value/upper limit (–)
V: Working volume of the bioreactor (l)
w: Weighting factor in Di (–)
Xd: Dead biomass (g l⁻¹)
Xi: Inactive biomass (g l⁻¹)
Xp: Product forming biomass (g l⁻¹)
Xpri: Autocatalytically active biomass (g l⁻¹)
Xs: Structurally active biomass (g l⁻¹)
Xsi: Structurally inactive biomass (g l⁻¹)
Xv: Viable biomass (g l⁻¹)
X50,l: Low location parameter of x (–)
X50,h: High location parameter of x (–)
yi: Measured value (–)
ȳi: Mean of measured data (–)
ys: Simulated value (–)
Yh: y at high x (–)
Yi/S: Yield coefficient (–)
Yl: y at low x (–)
Ymid: y at medium x (–)
νi in model: Stoichiometric coefficient (–)
νi in MC simulation: Predicted variability (–)

Funding

Open Access funding enabled and organized by Projekt DEAL. The authors acknowledge partial funding by German Federal Ministry of Education and Research (BMBF, Grant 031B0577A-C).

Compliance with ethical standards

Conflict of interest

The authors declare that there are no conflict of interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Möller J, Kuchemüller KB, Steinmetz T, Koopmann KS, Pörtner R. Model-assisted design of experiments as a concept for knowledge-based bioprocess development. Bioproc Biosyst Eng. 2019;42(5):867. doi: 10.1007/s00449-019-02089-7. [DOI] [PubMed] [Google Scholar]
  • 2.Dubois JL. Requirements for the development of a bioeconomy for chemicals. Curr Opin Environ Sustain. 2011;3(1):11. doi: 10.1016/j.cosust.2011.02.001. [DOI] [Google Scholar]
  • 3.Lokko Y, Heijde M, Schebesta K, Scholtès P, Van Montagu M, Giacca M. Biotechnology and the bioeconomy towards inclusive and sustainable industrial development. New Biotechnol. 2018 doi: 10.1016/j.nbt.2017.06.005. [DOI] [PubMed] [Google Scholar]
  • 4.Scarlat N, Dallemand JF, Monforti-Ferrario F, Nita V. The role of biomass and bioenergy in a future bioeconomy: policies and facts. Environ Dev. 2015;15:3. doi: 10.1016/j.envdev.2015.03.006. [DOI] [Google Scholar]
  • 5.Guo M, Song W. The growing U.S. bioeconomy: drivers, development and constraints. New Biotechnol. 2019;49:48. doi: 10.1016/j.nbt.2018.08.005. [DOI] [PubMed] [Google Scholar]
  • 6.Mandenius CF, Brundin A. Bioprocess optimization using design-of-experiments methodology. Biotechnol Progr. 2008;24(6):1191. doi: 10.1002/btpr.67. [DOI] [PubMed] [Google Scholar]
  • 7.Glauche F, Pilarek M, Bournazou MNC, Grunzel P, Neubauer P. Design of experiments-based high-throughput strategy for development and optimization of efficient cell disruption protocols. Eng Life Sci. 2017;17(11):1166. doi: 10.1002/elsc.201600030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Vervoort Y, Linares AG, Roncoroni M, Liu C, Steensels J, Verstrepen KJ. High-throughput system-wide engineering and screening for microbial biotechnology. Curr Opin Biotech. 2017;46:120. doi: 10.1016/j.copbio.2017.02.011. [DOI] [PubMed] [Google Scholar]
  • 9.Dörr M, Fibinger MP, Last D, Schmidt S, Santos-Aberturas J, Böttcher D, Hummel A, Vickers C, Voss M, Bornscheuer UT. Fully automatized high-throughput enzyme library screening using a robotic platform. Biotechnol Bioeng. 2016;113(7):1421. doi: 10.1002/bit.25925. [DOI] [PubMed] [Google Scholar]
  • 10.Bareither R, Bargh N, Oakeshott R, Watts K, Pollard D. Automated disposable small scale reactor for high throughput bioprocess development: a proof of concept study. Biotechnol Bioeng. 2013;110(12):3126. doi: 10.1007/s00449-019-02089-7. [DOI] [PubMed] [Google Scholar]
  • 11.Savizi ISP, Soudi T, Shojaosadati SA. Systems biology approach in the formulation of chemically defined media for recombinant protein overproduction. Appl Microbiol Biotechnol. 2019;103(20):8315. doi: 10.1007/s00449-019-02089-7. [DOI] [PubMed] [Google Scholar]
  • 12.Abt V, Barz T, Cruz-Bournazou MN, Herwig C, Kroll P, Möller J, Pörtner R, Schenkendorf R. Model-based tools for optimal experiments in bioprocess engineering. Curr Opin Chem Eng. 2018;22:244. doi: 10.1007/s00449-019-02089-7. [DOI] [Google Scholar]
  • 13.Mandenius CF, Graumann K, Schultz TW, Premstaller A, Olsson IM, Petiot E, Clemens C, Welin M. Quality-by-design for biotechnology-related pharmaceuticals. Biotechnol J. 2009;4(5):600. doi: 10.1007/s00449-019-02089-7. [DOI] [PubMed] [Google Scholar]
  • 14.Kuchemüller KB, Pörtner R, Möller J. Efficient optimization of process strategies with model-assisted design of experiments. New York: Springer US; 2020. pp. 235–249. [DOI] [PubMed] [Google Scholar]
  • 15.Brunner M, Fricke J, Kroll P, Herwig C. Investigation of the interactions of critical scale-up parameters (ph, po2 and pco2) on cho batch performance and critical quality attributes. Bioproc Biosyst Eng. 2017;40(2):251. doi: 10.1007/s00449-016-1693-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.von Stosch M, Willis MJ. Intensified design of experiments for upstream bioreactors. Eng Life Sci. 2017;17(11):1173. doi: 10.1002/elsc.201600037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Manzon D, Claeys-Bruno M, Declomesnil S, Carité C, Sergent M. Quality by design: comparison of design space construction methods in the case of design of experiments. Chemometr Intell Lab. 2020;200:104002. doi: 10.1016/j.chemolab.2020.104002. [DOI] [Google Scholar]
  • 18.von Stosch M, Hamelink JM, Oliveira R. Hybrid modeling as a qbd/pat tool in process development: an industrial E. coli case study. Bioproc Biosyst Eng. 2016;39(5):773. doi: 10.1007/s00449-019-02089-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Möller J, Hernández Rodríguez T, Müller J, Arndt L, Kuchemüller KB, Frahm B, Eibl R, Eibl D, Pörtner R. Model uncertainty-based evaluation of process strategies during scale-up of biopharmaceutical processes. Comput Chem Eng. 2020;134:106693. doi: 10.1016/j.compchemeng.2019.106693. [DOI] [Google Scholar]
  • 20.Carrondo MJT, Alves PM, Carinhas N, Glassey J, Hesse F, Merten OW, Micheletti M, Noll T, Oliveira R, Reichl U, Staby A, Teixeira AP, Weichert H, Mandenius CF. How can measurement, monitoring, modeling and control advance cell culture in industrial biotechnology? Biotechnol J. 2012;7(12):1522. doi: 10.1002/biot.201200226. [DOI] [PubMed] [Google Scholar]
  • 21.Kontoravdi C, Samsatli NJ, Shah N. Development and design of bio-pharmaceutical processes. Curr Opin Chem Eng. 2013;2(4):435. doi: 10.1016/j.coche.2013.09.007. [DOI] [Google Scholar]
  • 22.Jarka GVGK, Christoph CWST, Rui O, Gerald S, Carl-Fredrik M. Process analytical technology (pat) for biopharmaceuticals. Biotechnol J. 2011;6(4):369. doi: 10.1002/biot.201000356. [DOI] [PubMed] [Google Scholar]
  • 23.Bailey JE. Mathematical modeling and analysis in biochemical engineering: past accomplishments and future opportunities. Biotechnol Progr. 1998;14(1):8. doi: 10.1021/bp9701269. [DOI] [PubMed] [Google Scholar]
  • 24.Möller J, Korte K, Pörtner R, Zeng AP, Jandt U. Model-based identification of cell-cycle-dependent metabolism and putative autocrine effects in antibody producing cho cell culture. Biotechnol Bioeng. 2018;115(12):2996. doi: 10.1002/bit.26828. [DOI] [PubMed] [Google Scholar]
  • 25.Kroll P, Hofer A, Ulonska S, Kager J, Herwig C. Model-based methods in the biopharmaceutical process lifecycle. Pharm Res. 2017;34(12):2596. doi: 10.1007/s11095-017-2308-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Kern S, Platas-Barradas O, Pörtner R, Frahm B. Model-based strategy for cell culture seed train layout verified at lab scale. Cytotechnology. 2016;68(4):1019. doi: 10.1007/s10616-015-9858-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Brüning S, Gerlach I, Pörtner R, Mandenius CF, Hass VC. Modeling suspension cultures of microbial and mammalian cells with an adaptable six-compartment model. Chem Eng Technol. 2017;40(5):956. doi: 10.1002/ceat.201600639. [DOI] [Google Scholar]
  • 28.Möller J, Bhat K, Riecken K, Pörtner R, Zeng AP, Jandt U. Process-induced cell cycle oscillations in cho cultures: online monitoring and model-based investigation. Biotechnol Bioeng. 2019;116(11):2931. doi: 10.1002/bit.27124. [DOI] [PubMed] [Google Scholar]
  • 29.Jin Z, Han SY, Zhang L, Zheng SP, Wang Y, Lin Y. Combined utilization of lipase-displaying Pichia pastoris whole-cell biocatalysts to improve biodiesel production in co-solvent media. Bioresour Technol. 2013;130:102. doi: 10.1016/j.biortech.2012.12.020. [DOI] [PubMed] [Google Scholar]
  • 30.Isidro IA, Portela RM, Clemente JJ, Cunha AE, Oliveira R. Hybrid metabolic flux analysis and recombinant protein prediction in Pichia pastoris x–33 cultures expressing a single-chain antibody fragment. Bioproc Biosyst Eng. 2016;39(9):1351. doi: 10.1007/s00449-016-1611-z. [DOI] [PubMed] [Google Scholar]
  • 31.Mondal NK, Samanta A, Dutta S, Chattoraj S. Optimization of Cr(VI) biosorption onto Aspergillus niger using 3-level box-behnken design: Equilibrium, kinetic, thermodynamic and regeneration studies. J Genet Eng Biotechnol. 2017;15(1):151. doi: 10.1016/j.jgeb.2017.01.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Abdel-Fattah YR. Optimization of thermostable lipase production from a thermophilic Geobacillus sp. using box-behnken experimental design. Biotechnol Lett. 2002;24(14):1217. doi: 10.1023/A:1016167416712. [DOI] [Google Scholar]
  • 33.Candioti LV, Zan MMD, Camara MS, Goicoechea HC. Experimental design and multiple response optimization. using the desirability function in analytical methods development. Talanta. 2014;124:123. doi: 10.1016/j.talanta.2014.01.034. [DOI] [PubMed] [Google Scholar]
  • 34.Derringer G, Suich R. Simultaneous optimization of several response variables. J Qual Technol. 1980;12(4):214. doi: 10.1080/00224065.1980.11980968. [DOI] [Google Scholar]
  • 35.Narayanan H, Luna MF, von Stosch M, Cruz Bournazou MN, Polotti G, Morbidelli M, Butte A, Sokolov M. Bioprocessing in the digital age: the role of process models. Biotechnol J. 2020;15(1):1900172. doi: 10.1002/biot.201900172. [DOI] [PubMed] [Google Scholar]
  • 36.Herwig C. Applied basic science in process analytics and control technology. Anal Bioanal Chem. 2020;412(9):2025. doi: 10.1007/s00216-020-02465-3. [DOI] [PubMed] [Google Scholar]
  • 37.Kuchemüller KB, Pörtner R, Möller J. Digital twins and their role in model-assisted design of experiments. New York: Springer US; 2020. [DOI] [PubMed] [Google Scholar]
  • 38.Zeng AP, Byun TG, Posten C, Deckwer WD. Use of respiratory quotient as a control parameter for optimum oxygen supply and scale-up of 2,3-butanediol production under microaerobic conditions. Biotechnol Bioeng. 1994;44(9):1107. doi: 10.1002/bit.260440912. [DOI] [PubMed] [Google Scholar]
  • 39.Heyman B, Tulke H, Putri SP, Fukusaki E, Büchs J. Online monitoring of the respiratory quotient reveals metabolic phases during microaerobic 2,3-butanediol production with Bacillus licheniformis. Eng Life Sci. 2020;20(3–4):133. doi: 10.1002/elsc.201900121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Moser A, Appl C, Brüning S, Hass VC. Mechanistic mathematical models as a basis for digital twins. New York: Springer US; 2020. [DOI] [PubMed] [Google Scholar]
  • 41.Gerlach I, Brüning S, Gustavsson R, Mandenius CF, Hass VC. Operator training in recombinant protein production using a structured simulator model. J Biotechnol. 2014;177:53. doi: 10.1016/j.jbiotec.2014.02.022. [DOI] [PubMed] [Google Scholar]
  • 42.Hass VC. Verbesserung der bioverfahrenstechnischen ausbildung durch einen virtuellen bioreaktor. Chem-Ing-Techk. 2005;77(1–2):161. doi: 10.1002/cite.200407053. [DOI] [Google Scholar]
  • 43.Gerlach I, Hass VC, Brüning S, Mandenius CF. Virtual bioreactor cultivation for operator training and simulation: application to ethanol and protein production. J Chem Technol Biotechnol. 2013;88(12):2159. doi: 10.1002/jctb.4079. [DOI] [Google Scholar]
  • 44.Wechselberger P, Sagmeister P, Herwig C. Model-based analysis on the extractability of information from data in dynamic fed-batch experiments. Biotechnol Progr. 2013;29(1):285. doi: 10.1002/btpr.1649. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Hernández Rodríguez T, Posch C, Schmutzhard J, Stettner J, Weihs C, Pörtner R, Frahm B. Predicting industrial scale cell culture seed trains—a bayesian framework for model fitting and parameter estimation, dealing with uncertainty in measurements and model parameters, applied to a nonlinear kinetic cell culture model, using a mcmc method. Biotechnol Bioeng. 2019;116(11):2944. doi: 10.1002/bit.27125. [DOI] [PubMed] [Google Scholar]
  • 46. Nelder JA, Mead R. A simplex method for function minimization. Comput J. 1965;7(4):308. doi: 10.1093/comjnl/7.4.308.
  • 47. Cameron AC, Windmeijer FA. An R-squared measure of goodness of fit for some common nonlinear regression models. J Econometr. 1997;77(2):329. doi: 10.1016/S0304-4076(96)01818-0.
  • 48. D’haeseleer P. How does gene expression clustering work? Nat Biotechnol. 2005;23(12):1499. doi: 10.1038/nbt1205-1499.
  • 49. Möller J, Rosenberg M, Riecken K, Pörtner R, Zeng AP, Jandt U. Quantification of the dynamics of population heterogeneities in CHO cultures with stably integrated fluorescent markers. Anal Bioanal Chem. 2020;412(9):2065. doi: 10.1007/s00216-020-02401-5.
  • 50. Lloyd S. Least squares quantization in PCM. IEEE T Inf Theory. 1982;28(2):129. doi: 10.1109/TIT.1982.1056489.
  • 51. McKay MD, Beckman RJ, Conover WJ. Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics. 1979;21(2):239. doi: 10.1080/00401706.1979.10489755.
  • 52. Gargalo CL, Cheali P, Posada JA, Carvalho A, Gernaey KV, Sin G. Assessing the environmental sustainability of early stage design for bioprocesses under uncertainties: an analysis of glycerol bioconversion. J Clean Prod. 2016;139:1245. doi: 10.1016/j.jclepro.2016.08.156.
  • 53. Hirschmann R, Borodkin N, Baganz F, Hass V. Towards the integration of the anaerobic ethyl (S)-3-hydroxybutyrate production process into a biorefinery concept. Chem Eng Trans. 2018;70:559. doi: 10.3303/CET1870094.
  • 54. Kometani T, Yoshii H, Kitatsuji E, Nishimura H, Matsuno R. Large-scale preparation of (S)-ethyl 3-hydroxybutanoate with a high enantiomeric excess through baker’s yeast-mediated bioreduction. J Ferment Bioeng. 1993;76(1):33. doi: 10.1016/0922-338X(93)90049-E.
  • 55. Lin Y, Tanaka S. Ethanol fermentation from biomass resources: current state and prospects. Appl Microbiol Biotechnol. 2006;69(6):627. doi: 10.1007/s00253-005-0229-x.
  • 56. Bai F, Anderson W, Moo-Young M. Ethanol fermentation technologies from sugar and starch feedstocks. Biotechnol Adv. 2008;26(1):89. doi: 10.1016/j.biotechadv.2007.09.002.
  • 57. Sonnleitner B, Käppeli O. Growth of Saccharomyces cerevisiae is controlled by its limited respiratory capacity: formulation and verification of a hypothesis. Biotechnol Bioeng. 1986;28(6):927. doi: 10.1002/bit.260280620.
  • 58. Fiechter A, Fuhrmann G, Käppeli O. Regulation of glucose metabolism in growing yeast cells. Adv Microbiol Physiol. 1981;22:123. doi: 10.1016/S0065-2911(08)60327-6.
  • 59. Casey GP, Ingledew WM. Ethanol tolerance in yeasts. CRC Crit Rev Microbiol. 1986;13(3):219. doi: 10.3109/10408418609108739.
  • 60. Arroyo-López FN, Orlić S, Querol A, Barrio E. Effects of temperature, pH and sugar concentration on the growth parameters of Saccharomyces cerevisiae, S. kudriavzevii and their interspecific hybrid. Int J Food Microbiol. 2009;131(2):120. doi: 10.1016/j.ijfoodmicro.2009.01.035.
  • 61. Larsson C, von Stockar U, Marison I, Gustafsson L. Growth and metabolism of Saccharomyces cerevisiae in chemostat cultures under carbon-, nitrogen-, or carbon- and nitrogen-limiting conditions. J Bacteriol. 1993;175(15):4809. doi: 10.1128/jb.175.15.4809-4816.1993.


Articles from Bioprocess and Biosystems Engineering are provided here courtesy of Springer