Author manuscript; available in PMC: 2022 May 19.
Published in final edited form as: ACS Synth Biol. 2022 Mar 10;11(4):1699–1704. doi: 10.1021/acssynbio.2c00082

Correction to GAMES: A dynamic model development workflow for rigorous characterization of synthetic genetic systems

Kate E Dray 1, Joseph J Muldoon 1,2, Niall M Mangan 3,4, Neda Bagheri 1,2,4,5,*, Joshua N Leonard 1,2,4,6,*
PMCID: PMC9119373  NIHMSID: NIHMS1796301  PMID: 35271255

After publication, we identified two minor mistakes in the code for the GAMES workflow. Each subtly affected our analysis of the example case study but had no effect on the GAMES workflow. Here we describe those mistakes and discuss how the resulting corrections modify the case study analysis.

Correction 1:

There are two locations in the workflow where noise is added to each data point in the training data: to generate the parameter estimation method (PEM) evaluation data, and to calculate the threshold used for the parameter profile likelihood (PPL). In both locations, the value of the noise added to each data point is selected from a distribution centered at zero with a standard deviation equal to the standard error associated with the data point (assuming, for our case study, that the data point represents the mean value for three replicates). We set the standard deviation, $\sigma_{SD}$, for each data point to 0.05, so the standard error, $\sigma_{SE}$, is calculated with Equation C1, where $n_{replicates}$ is the number of replicates:

$$\sigma_{SE} = \frac{\sigma_{SD}}{\sqrt{n_{replicates}}}$$ (Equation C1)

The original published code omitted the square root operation in Equation C1, so the standard deviation of the distribution from which each added noise value was drawn was smaller than it would have been with the correct calculation. We corrected the code in v1.0.2 and repeated the relevant PEM evaluation and PPL simulations. The corrected results show minor changes: the quantitative value of the PPL threshold changed for all parameters in the example study, and the identifiability classification of one parameter changed for one model, which affects the interpretation of that parameter. The other qualitative interpretations remain the same, and no changes were made to the GAMES workflow.
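The effect of the omitted square root can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the GAMES code: the function names and the example training data are hypothetical, while the values 0.05 and three replicates come from the text above. It assumes Gaussian noise, consistent with the normal noise distribution quoted in the Figure 4 caption.

```python
import math
import random

SIGMA_SD = 0.05    # standard deviation assigned to each data point (from the text)
N_REPLICATES = 3   # each data point is the mean of three replicates (from the text)


def standard_error(sigma_sd, n_replicates):
    """Equation C1: standard error of the mean."""
    return sigma_sd / math.sqrt(n_replicates)


def add_noise(data, sigma_se, rng):
    """Add zero-centered Gaussian noise with standard deviation sigma_se to each point."""
    return [y + rng.gauss(0.0, sigma_se) for y in data]


sigma_se_corrected = standard_error(SIGMA_SD, N_REPLICATES)
sigma_se_erroneous = SIGMA_SD / N_REPLICATES  # square root omitted, as in the original code

rng = random.Random(0)                 # fixed seed so the sketch is reproducible
training_data = [0.1, 0.4, 0.8, 1.0]   # hypothetical normalized data points
noisy_data = add_noise(training_data, sigma_se_corrected, rng)

print(f"corrected sigma_SE: {sigma_se_corrected:.4f}")  # 0.0289
print(f"erroneous sigma_SE: {sigma_se_erroneous:.4f}")  # 0.0167
```

With three replicates, the corrected standard error is about 0.029, whereas omitting the square root gives about 0.017, so the noise added by the original code was narrower than intended.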

With the correction, the PEM evaluation results are very similar to the previous results. The threshold used to define the PEM evaluation criterion remains the same ($R^2 = 0.99$), and this threshold is satisfied for all models (Figure 4c model A, Figure S5a model B, Figure S6a model C, Figure S7a model D).

Figure 4. Evaluate parameter estimation method.

(a) Module 1 workflow for evaluating the PEM using simulated training data. A model must pass the PEM evaluation criterion before moving on to Module 2. (b, c) Module 1 case study for a hypothetical crTF. (b) Generating the PEM data. A global search of 1000 parameter sets was filtered by $\chi^2$ with respect to the training data and the 8 parameter sets with the lowest $\chi^2$ values were used as reference parameters to generate PEM evaluation data. For each data set, technical error was added using a noise distribution of $N(0, 0.029^2)$. Triangle data points are PEM evaluation data. (c) Determination of the PEM evaluation criterion. For each PEM evaluation data set, a global search with 100 randomly chosen parameter sets was used to choose 10 parameter sets to use as initial guesses for optimization. The optimized parameter sets and cost function from each of the PEM evaluation problems were used to evaluate the PEM evaluation criterion. Each parameter was allowed to vary across three orders of magnitude in either direction of the reference parameter value, except for n, which was allowed to vary across $[10^0, 10^{0.6}]$. Results are shown only for parameter sets yielding $\chi^2$ values within the bottom (best) 10% of $\chi^2$ values (to the left of the pink dotted line in Figure S2b) achieved in the initial global search with respect to the training data (Module 1.1). Only parameter sets yielding $R^2 \geq 0.90$ are included on the plot to more clearly show data points with $R^2$ values that exceed $R^2_{opt}$. Both of these filtering strategies apply to all plots of PEM evaluation data in this tutorial.

With the correction, the main difference observed for the PPL results is that the calculated thresholds for each model are higher than in the original analysis. This change is consistent with our understanding of the PPL threshold, which is related to the extent of overfitting that is possible given a model, a training data set, and the associated measurement error. For models A (Figure 7b, Figure 8b, Figure S4a) and B (Figure 8b, Figure S5c), the increased threshold is the only substantial difference between the corrected results and previous results; all PPL shapes and parameter classifications remain the same.
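The link between noise level, overfitting, and the threshold can be made concrete with a small simulation in the spirit of the procedure described in the caption of Figure S4a. The sketch below is a toy under stated assumptions, not the authors' implementation: it swaps the crTF model for a hypothetical two-parameter line, replaces the optimization step with a closed-form least-squares fit, assumes a cost function normalized by the measurement error, and reads the 99% confidence level as the 99th percentile of the simulated differences. All function names and data values are made up for illustration.

```python
import random


def fit_line(x, y):
    """Closed-form least-squares fit of y = a*x + b (stand-in for the optimizer)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def chi2(params, x, y, sigma):
    """Sum of squared residuals scaled by the measurement error (assumed cost function)."""
    a, b = params
    return sum(((yi - (a * xi + b)) / sigma) ** 2 for xi, yi in zip(x, y))


def ppl_threshold(theta_ref, x, sigma, n_realizations=1000, confidence=0.99, seed=0):
    """Estimate the threshold from chi2(theta_ref) - chi2(theta_fit) over noise realizations."""
    rng = random.Random(seed)
    a_ref, b_ref = theta_ref
    differences = []
    for _ in range(n_realizations):
        # One noise realization of the reference data
        y_noisy = [a_ref * xi + b_ref + rng.gauss(0.0, sigma) for xi in x]
        theta_fit = fit_line(x, y_noisy)
        # Amount of overfitting for this realization
        differences.append(chi2(theta_ref, x, y_noisy, sigma) - chi2(theta_fit, x, y_noisy, sigma))
    differences.sort()
    index = min(int(confidence * n_realizations), n_realizations - 1)
    return differences[index], differences


x = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]   # hypothetical independent variable
threshold, differences = ppl_threshold((2.0, 0.5), x, sigma=0.029)
print(f"estimated threshold: {threshold:.2f}")
```

Because the toy fit is exact, every difference here is non-negative. With a numerical optimizer that can stop in a local minimum, slightly negative differences can appear, as discussed under Correction 2.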

Figure 7. Assess parameter identifiability.

(a) Module 3 workflow for evaluating and refining parameter identifiability through the profile likelihood approach. Depending on the results of the parameter identifiability analysis, the next step is either experimental design (Module 0), model reduction (Module 0), or model comparison (Module 4). (b–d) Module 3 case study for a hypothetical crTF. (b) Application of the profile likelihood approach to the model defined in Figure 2. The calibrated parameter set from a parameter estimation was run with 1000 global search parameter sets, and 100 initial guesses for optimization were used as the starting point (represented in blue). Parameters were allowed to vary across three orders of magnitude in either direction of the reference parameter value, except for n, which was allowed to vary across $[10^0, 10^{0.6}]$. An adaptive step method (Supplementary Note 1) was used to determine each step size. The threshold is defined as the 99% confidence interval of the practical $\chi^2_{df}$ distribution ($\Delta_{1-\alpha} = 5.1$). (c) Plots of parameter relationships along the profile likelihood associated with km. We consider a range of possible values of the unidentifiable parameter (m) and plot these values against recalibrated values of other model parameters (km, e, n, b, kbind). (d) Plots of internal model states considering a range of possible values of unidentifiable parameter m. Time courses represent the trajectory of each state variable in the model as a function of m choice. Each trajectory was generated by holding m constant at the given value and re-optimizing all other free parameters (results are from the same simulations used to plot the PPL results in b). Data are shown for these conditions: 50 ng DNA-binding domain plasmid, 50 ng activation domain plasmid, and a saturating ligand dose (100 nM).

Figure 8. Refinement of parameter identifiability using experimental design and model reduction.

(a) Model B training data. Additional training data for a DNA-binding domain plasmid dose response at two plasmid doses of activation domain and a saturating ligand dose were generated using the reference parameter set. Noise was added as was done for the ligand dose response. Model B has the same model structure as Model A, but Model B incorporates this additional training data set and therefore has different calibrated parameters. (b) PPL results for Models A, B, C, and D. Results from Model A (Figure 7b) are shown again to facilitate comparison between other PPLs for other models. Each column is a different parameter, and each row is a different model. All parameters were allowed to vary across three orders of magnitude in either direction of the reference parameter value, except for n, which was allowed to vary across $[10^0, 10^{0.6}]$ for all PPL simulations in this figure. Calibrated parameter values for Models B, C, and D are in Supplementary Table 2. The threshold is defined as the 99% confidence interval of the $\chi^2_{df}$ distribution (Model B: $\Delta_{1-\alpha} = 7.7$, Model C: $\Delta_{1-\alpha} = 7.9$, Model D: $\Delta_{1-\alpha} = 7.0$). A green check mark means the parameter is identifiable and a red X means the parameter is unidentifiable. (c) Parameter relationships along the profile likelihood associated with m. b and m compensate for one another along the profile likelihood. (d) Model reduction scheme for Model C. Instead of fitting both m and b, the ratio between the two parameters was fit and b was fixed to an arbitrary value of 1.

The corrected PPL shapes and parameter classifications for model C were also in agreement with the previous results (Figure 8b, Figure S6c), with the exception of the parameter m*, which now appears practically unidentifiable—whereas previously this parameter was deemed identifiable—as the PPL reaches the threshold in the negative direction but not in the positive direction. However, this corrected result has no impact on downstream analysis because m* is still classified as identifiable for the final model (model D). The classification of m* as practically unidentifiable for model C is reasonable given that the increased PPL threshold necessitates that higher m* values be traversed when determining the PPL. As m* is a ratio between the parameters m and b, once m* reaches a sufficiently high value such that m >> b, increasing m* further has no meaningful effect on the agreement between the training data and simulated data. This interpretation explains why m* does not reach the threshold in the positive direction for model C with the correction included here.

The corrected results for model D are very similar to the previous results (Figure 8b, Figure S7c). All parameter classifications remain the same, and all parameters are identifiable. The qualitative shape of the PPL for m* is similar to the shape observed for m* in model C (with the correction), but in model D, the PPL crosses the threshold in both the negative and positive directions. This is reasonable because model D has fewer free parameters (four free parameters) than does model C (five free parameters), and therefore model D has a lower calculated PPL threshold, enabling the PPL for m* to cross the threshold in the positive direction.
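The classification rule discussed above for m* (identifiable only if the PPL reaches the threshold in both the negative and positive directions) can be written as a short check. This is a minimal sketch rather than the GAMES implementation: the function names are hypothetical, the profile values are invented, and the threshold is assumed to be measured as a rise above the χ2 of the calibrated parameter set. Only the thresholds 7.9 (model C) and 7.0 (model D) come from the Figure 8 caption.

```python
def crosses_threshold(profile, chi2_calibrated, threshold):
    """True if any profile value rises at least `threshold` above the calibrated chi2."""
    return any(value - chi2_calibrated >= threshold for value in profile)


def classify_parameter(profile_negative, profile_positive, chi2_calibrated, threshold):
    """Identifiable only if the profile crosses the threshold in both directions."""
    crosses_negative = crosses_threshold(profile_negative, chi2_calibrated, threshold)
    crosses_positive = crosses_threshold(profile_positive, chi2_calibrated, threshold)
    return "identifiable" if crosses_negative and crosses_positive else "unidentifiable"


# Hypothetical profile: rises steeply in the negative direction but
# plateaus in the positive direction (invented values).
chi2_calibrated = 1.0
profile_negative = [1.5, 3.0, 6.0, 9.5]
profile_positive = [1.4, 4.0, 7.5, 8.4, 8.4, 8.4]

# Same invented profile evaluated against the two thresholds from Figure 8b
result_high = classify_parameter(profile_negative, profile_positive, chi2_calibrated, 7.9)
result_low = classify_parameter(profile_negative, profile_positive, chi2_calibrated, 7.0)
print(result_high)  # unidentifiable
print(result_low)   # identifiable
```

Reusing one profile for both thresholds is a simplification for illustration; in the case study each model has its own PPL. The point is only that a plateau lying between the two thresholds is classified differently depending on which threshold applies.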

Correction 2:

We also noted a minor mistake in the model D case study. For model D, an incorrect value for kbind (the fixed value of 1 rather than the reference value of 0.05) was used to define the reference parameter set and calculate the PPL threshold. This mistake was corrected before generating the simulation results reported here. Correcting this value led to some parameter sets having higher $\chi^2(\theta_{fit})$ values than $\chi^2(\theta_{ref})$ values (Figure S7c) because kbind cannot be fit to the reference parameter value for each noise realization. However, the resulting reduced model with kbind = 1 still yields very similar agreement between the training data and simulated data (Figure S7b), which shows that fixing kbind to 1 (and not to the reference value of 0.05, which would be unknown in a practical situation when the reference parameters do not exist) does not significantly affect the results. This phenomenon, in which some parameter sets have slightly higher $\chi^2(\theta_{fit})$ values than $\chi^2(\theta_{ref})$ values, was also observed in the original results for all models but to a lesser extent. In general, slightly negative values for $\chi^2(\theta_{ref}) - \chi^2(\theta_{fit})$ can be attributed to the optimization algorithm finding local minima (that have only slightly different $\chi^2$ values than the global minimum) to define $\chi^2(\theta_{fit})$ for some noise realizations.

Conclusions:

These corrections affected our interpretation of the example case study but had no effect on the GAMES workflow itself. The code used to define the case study example has been updated and annotated on GitHub: https://github.com/leonardlab/GAMES.

Supplementary Material

Figure S4. Additional PPL results for Model A.

(a) Evaluation of the $\chi^2$ distribution via a simulation study. 1000 individual noise realizations were generated. Parameters were individually estimated for each of the noise realizations to calculate $\chi^2(\theta_{fit})$. Reference parameters were used to calculate $\chi^2(\theta_{ref})$. The difference between these values, $\chi^2(\theta_{ref}) - \chi^2(\theta_{fit})$, represents the amount of overfitting for each noise realization. $\Delta_{1-\alpha}$ (blue dotted line) was determined by evaluating the 99% confidence interval ($\alpha = 0.01$) of the distribution ($\Delta_{1-\alpha} = 5.1$). (b) Three-dimensional plot of km, b, and m along the unidentifiability associated with m. The surface is smooth, indicating dependencies between the three parameters. The logarithm ($\log_{10}(\theta)$) of each parameter is plotted.

Figure S5. PEM evaluation, parameter estimation, and determination of confidence threshold for Model B.

(a) PEM evaluation criterion with 1000 parameter sets in the global search and 100 initial guesses. Results are shown only for parameter sets yielding $R^2 \geq 0.90$. The PEM evaluation criterion is satisfied. (b) Best fit to the training data using the calibrated parameter set. The visual inspection criterion is satisfied. Parameter values are in Supplementary Table 2. (c) Determination of the confidence threshold for PPL calculations ($\Delta_{1-\alpha} = 7.7$).

Figure S6. PEM evaluation, parameter estimation, and determination of confidence threshold for Model C.

(a) PEM evaluation criterion with 1000 parameter sets in the global search and 100 initial guesses. The PEM evaluation criterion is satisfied. (b) Best fit to the training data using the calibrated parameter set. The visual inspection criterion is satisfied. Parameter values are in Supplementary Table 2. (c) Determination of the confidence threshold for PPL calculations ($\Delta_{1-\alpha} = 7.9$).

Figure S7. PEM evaluation, parameter estimation, and determination of confidence threshold for Model D.

(a) PEM evaluation criterion with 1000 parameter sets in the global search and 100 initial guesses. The PEM evaluation criterion is satisfied. (b) Best fit to the training data using the calibrated parameter set. The visual inspection criterion is satisfied. Parameter values are in Supplementary Table 2. (c) Determination of the confidence threshold for PPL calculations ($\Delta_{1-\alpha} = 7.0$).
