Iranian Journal of Biotechnology. 2025 Oct 1;23(4):e4056. doi: 10.30498/ijb.2025.499010.4056

Mathematical Modeling and Thermodynamic Integration for Precision PCR Optimization

Hadja Fatima Tbahriti 1,2*, Aicha Zerrouki 1,3, Fatima Zohra Mahammi 1,4, Ali Boukadoum 1,2
PMCID: PMC12620865  PMID: 41255838

Abstract

Objectives:

The objective of this study was to develop a predictive modeling framework for optimizing MgCl2 concentration and melting temperature (Tm) in PCR conditions.

Materials and Methods:

The study developed predictive models using multivariate Taylor series expansion and thermodynamic functions integrated with 120 PCR primers across various species. Ridge, Lasso, and Elastic Net multiple regression analyses were applied to fine-tune these models.

Results:

The models demonstrated excellent predictive capabilities, achieving R2 = 0.9942 for MgCl2 concentration and R2 = 0.9600 for Tm. These findings underscore the efficacy of the proposed optimization strategy.

Conclusion:

The research provides a robust framework for optimizing PCR conditions and enhancing specificity and sensitivity in DNA amplification. More importantly, it combines theoretical modeling with empirical validation, thus offering valuable insight for the advancement of PCR-based methodologies.

Keywords: Polymerase chain reaction, MgCl2, Thermodynamics, Optimization, Logarithm, Melting temperature (Tm), Taylor series expansion

1. Background

The polymerase chain reaction (PCR) is a fundamental technique in molecular biology with broad applications in genetics, diagnostics, and biotechnology ( 1 ). While PCR is a standard method, optimizing key parameters such as MgCl2 concentration and melting temperature (Tm) remains crucial for successful reactions. Traditionally, optimization relied on time-consuming trial-and-error approaches, but recent advances in computer modeling have enabled the theoretical prediction of optimal PCR conditions.

MgCl2 concentration mainly influences amplification specificity and primer annealing, while the optimal hybridization temperature ensures specific primer binding with minimal nonspecific amplification ( 2 , 3 ). Previous studies on PCR optimization have largely lacked a sound theoretical basis, relying primarily on empirical methods or simplified models ( 4 ). This work fills that gap by combining extensive empirical data with mathematical modeling anchored in thermodynamics. The model formulation involves the thermodynamic factors ΔH/RT and ΔS/R, in addition to a third-order multivariate Taylor series expansion for hybridization temperature and MgCl2 concentration. The models were tested and refined using 120 species-specific PCR primers.

Various regression techniques, such as Ridge, Lasso, and elastic net regression, were used to refine the model. Random forest and gradient boosting were also employed to improve prediction accuracy and reliability ( 5 ). This approach combines theoretical frameworks with experimental data to develop a more robust PCR optimization strategy.

2. Objectives

The aim of this work was to present a predictive modeling framework that combines thermodynamics-based principles with mathematical modeling to optimize PCR conditions, emphasizing MgCl2 concentration and the Tm. Through integration of theory and experimental validation, the study seeks to achieve higher amplification sensitivity and specificity in a reproducible and scalable manner.

3. Materials and Methods

3.1. Theoretical Foundations and Mathematical Modelling

The fundamental PCR principles underpin our model of MgCl2 concentration in PCR reactions. The essential functional relationship can be expressed as:

(MgCl2) = f (Tm, GC%, L, (dNTP), (Primers), (Polymerase), pH, T)

Building upon this foundation ( 6 ), we developed a more sophisticated model using a multivariate Taylor series expansion of order 3:

(MgCl2) = β0 + Σiβixi + Σi Σj βijxixj + Σi Σj Σk βijkxixjxk + β_L ln(L) + β_H(ΔH/RT) + β_S(ΔS/R) + ε

This expansion allows for precise predictions of optimal MgCl2 concentrations under varying conditions ( 7 ).
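To illustrate what a third-order multivariate expansion implies for the regression design matrix, the following minimal sketch enumerates every monomial of degree 1 to 3 over a set of predictors. The function names `taylor_terms` and `evaluate_terms` and the three-variable example are hypothetical; the paper does not state whether repeated indices (such as squared terms) were retained, and this sketch assumes they were.

```python
from itertools import combinations_with_replacement
from math import prod


def taylor_terms(names, order=3):
    """Return all monomials of degree 1..order as tuples of variable names."""
    terms = []
    for degree in range(1, order + 1):
        terms.extend(combinations_with_replacement(names, degree))
    return terms


def evaluate_terms(values, terms):
    """Evaluate each monomial for one observation given as {name: value}."""
    return [prod(values[name] for name in term) for term in terms]


# Illustrative subset of the predictors (assumption: any subset works the same way)
names = ["Tm", "GC", "L"]
terms = taylor_terms(names)
row = evaluate_terms({"Tm": 2.0, "GC": 3.0, "L": 5.0}, terms)
```

With three predictors this yields 19 terms; with the eight predictors of the functional relationship above it would yield 164, which is one reason regularized regression is a natural companion to such an expansion when only 120 primers are available.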

3.2. Thermodynamic Justification

Our most significant innovation is explicitly incorporating thermodynamic principles into PCR parameter modeling ( 8 ). The stability of DNA duplexes and their interactions with Mg2+ ions follow fundamental thermodynamic laws, expressed through the Gibbs free energy equation:

ΔG = ΔH - TΔS

This method considers the system’s enthalpic and entropic contributions ( 9 , 10 ). The normalized enthalpic term (ΔH/RT) captures several significant molecular interactions, such as hydrogen bonding, van der Waals forces, and electrostatic interactions between DNA and Mg2+ ions ( 11 , 12 , 13 ).
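As a worked illustration of how the Gibbs relation connects to the normalized terms used in the model, the sketch below computes ΔG together with ΔH/RT and ΔS/R. The duplex values are hypothetical and chosen only for the arithmetic; the units (kcal/mol for ΔH, kcal/(mol·K) for ΔS) are an assumption, since the paper does not state them.

```python
R = 1.987e-3  # gas constant in kcal/(mol*K)


def gibbs_free_energy(delta_h, delta_s, temp_k):
    """Return dG = dH - T*dS (dH in kcal/mol, dS in kcal/(mol*K), T in K)."""
    return delta_h - temp_k * delta_s


def normalized_terms(delta_h, delta_s, temp_k):
    """Return the dimensionless pair (dH/RT, dS/R)."""
    return delta_h / (R * temp_k), delta_s / R


# Hypothetical duplex parameters at 60 degrees C, for illustration only
dh, ds, temp = -150.0, -0.400, 333.15
dg = gibbs_free_energy(dh, ds, temp)
h_term, s_term = normalized_terms(dh, ds, temp)
```

Dividing the Gibbs equation by RT gives ΔG/RT = ΔH/RT - ΔS/R, so the difference of the two normalized terms is the dimensionless free energy.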

Although our model is mainly concerned with Mg2+ effects, we have also included the influence of monovalent cations, i.e., Na+, in our hybridization temperature model. Monovalent cations such as Na+ and K+ stabilize DNA duplexes by charge screening of the phosphate backbone, although their action is mechanistically different from that of divalent cations such as Mg2+. The latter screen charge more efficiently because of their higher charge density and form specific interactions with DNA bases and phosphate groups, directly affecting polymerase activity.

3.3. Hybridization Temperature Modelling

For accurate prediction of hybridization temperatures, we developed a comprehensive equation:

Th = α0 + α1(Tm) + α2(GC%) + α3 ln(L) + α4(MgCl2) + α5(dNTP) + α6(Na+) + α7 ln(CT/4) + α8(ΔH°/R) + α9(ΔS°/R) + Σi Σj αijxixj + γ

This was further refined through a third-order Taylor expansion:

Th = θ0 + Σi θiyi + Σi Σj θijyiyj + Σi Σj Σk θijkyiyjyk + θ_L ln(L) + θ_H(ΔH°/R) + θ_S(ΔS°/R) + δ

DNA-MgCl2 Interaction Biochemistry

The interaction between DNA and MgCl2 was modeled using a modified binding isotherm:

θ = n × K × (Mg2+)^f / (1 + K × (Mg2+)^f)

The cooperativity factor is modeled as follows:

f = f0 + f1(Mg2+) + f2(Mg2+)^2

These equations provide a more accurate representation of the actual biochemical interactions occurring during PCR ( 14 ).
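The binding model can be sketched numerically as follows. This is a minimal illustration that assumes the cooperativity factor f acts as an exponent on the Mg2+ concentration (a Hill-type isotherm); all parameter values in the usage lines are hypothetical, since the paper does not report fitted values for n, K, f0, f1, or f2.

```python
def cooperativity(mg, f0, f1, f2):
    """Cooperativity factor f = f0 + f1*[Mg2+] + f2*[Mg2+]^2."""
    return f0 + f1 * mg + f2 * mg ** 2


def fractional_binding(mg, n, k, f0, f1, f2):
    """Modified binding isotherm: n*K*[Mg2+]^f / (1 + K*[Mg2+]^f)."""
    f = cooperativity(mg, f0, f1, f2)
    x = k * mg ** f  # assumption: f is an exponent, not a multiplier
    return n * x / (1.0 + x)


# Hypothetical parameters: f0 = 1 and f1 = f2 = 0 reduce to a simple Langmuir isotherm
half_saturated = fractional_binding(1.0, n=1.0, k=1.0, f0=1.0, f1=0.0, f2=0.0)
curve = [fractional_binding(mg, 1.0, 1.0, 1.0, 0.1, 0.0) for mg in (0.5, 1.0, 2.0, 4.0)]
```

With f fixed at 1 the expression reduces to the classical Langmuir form, which makes it easy to check an implementation before fitting the concentration-dependent cooperativity terms.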

3.4. Computational Implementation

We implemented our theoretical framework in Python 3.9, using PyCharm Professional 2023.1 and standard scientific computing libraries ( 15 ). The implementation used several machine learning methods and regression algorithms to improve prediction accuracy. Extensive validation, including Monte Carlo simulations and cross-validation, was used to assess reliability ( 16 ). Hyperparameter tuning for the ridge, Lasso, and elastic net regressions was performed by grid search with five-fold cross-validation. The optimal regularization parameters (λ for ridge and Lasso; λ and α for elastic net) were selected as those giving the minimum mean squared error (MSE) on the validation sets, which guards against overfitting and favors models that generalize well. All models were run with the scikit-learn library, using reproducible random seeds and standardized input features to keep the models stable and consistent across folds.
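The tuning procedure can be illustrated with a small standard-library sketch. It is not the authors' scikit-learn pipeline: it is a simplified stand-in that fits a one-feature ridge model in closed form, standardizes the feature within each training fold, and picks the λ with the lowest mean validation MSE over five folds. The synthetic data, the λ grid, and all function names are assumptions made for the example.

```python
import random
from statistics import mean, pstdev


def fit_ridge_1d(x, y, lam):
    """Closed-form ridge fit for one standardized feature plus an intercept."""
    mu, sigma, y_bar = mean(x), pstdev(x), mean(y)
    xs = [(v - mu) / sigma for v in x]
    beta = sum(a * (b - y_bar) for a, b in zip(xs, y)) / (sum(a * a for a in xs) + lam)
    return lambda v: y_bar + beta * (v - mu) / sigma


def cv_mse(x, y, lam, k=5, seed=0):
    """Mean validation MSE over k folds, with a fixed seed for reproducibility."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    errors = []
    for i in range(k):
        fold = idx[i::k]
        held_out = set(fold)
        train = [j for j in idx if j not in held_out]
        predict = fit_ridge_1d([x[j] for j in train], [y[j] for j in train], lam)
        errors.append(mean([(y[j] - predict(x[j])) ** 2 for j in fold]))
    return mean(errors)


def grid_search(x, y, grid, k=5, seed=0):
    """Return the lambda with minimum cross-validated MSE and all scores."""
    scores = {lam: cv_mse(x, y, lam, k, seed) for lam in grid}
    return min(scores, key=scores.get), scores


# Synthetic data (assumption): 120 samples with a linear GC -> MgCl2 trend plus noise
rng = random.Random(42)
gc = [rng.uniform(40, 62) for _ in range(120)]
mg = [1.0 + 0.04 * g + rng.gauss(0, 0.05) for g in gc]
best_lam, scores = grid_search(gc, mg, [0.01, 0.1, 1.0, 10.0, 100.0])
```

Standardizing inside each training fold, rather than once on the full dataset, keeps information from the validation fold out of the fit.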

3.5. Experimental Validation

Forty laboratory technicians with varied molecular biology backgrounds took part in validating our approach in practice. Using our predicted MgCl2 concentrations, standard PCR optimization was performed for each participant with multiple primer sets representing different genomic regions of various eukaryotic and prokaryotic origins. Successful amplification was assessed by gel electrophoresis and band-intensity quantification, demonstrating the practical utility of our theoretical framework.

3.6. Statistical Analysis

Optimized PCR protocols were compared with traditional protocols using paired t-tests and McNemar’s tests. The magnitude of differences, expressed as Cohen’s d, odds ratios, and 95% confidence intervals, was calculated to evaluate the strength of association between the outcomes of interest. To assess the accuracy of the models, residual plots and 95% confidence intervals were incorporated.
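The test statistics named above can be computed as in the following minimal sketch. The yield values and discordant-pair counts are invented for illustration and are not data from this study; Cohen's d is computed in its paired form (mean difference divided by the standard deviation of the differences), which is an assumption about the variant used. Converting the statistics to p-values requires the t and chi-square distributions, which this sketch leaves out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom, Cohen's d) for paired samples."""
    diffs = [a - b for a, b in zip(after, before)]
    d = mean(diffs) / stdev(diffs)  # paired-sample effect size
    t = d * sqrt(len(diffs))
    return t, len(diffs) - 1, d


def mcnemar(b, c):
    """McNemar chi-square (no continuity correction) and odds ratio.

    b and c are the counts of discordant pairs, e.g. reactions with
    nonspecific bands under only one of the two protocols.
    """
    chi2 = (b - c) ** 2 / (b + c)
    return chi2, b / c


# Hypothetical band intensities for four reactions under each protocol
standard = [10.0, 12.0, 9.0, 11.0]
optimized = [12.0, 15.0, 10.0, 14.0]
t_stat, dof, cohens_d = paired_t(standard, optimized)
chi2, odds_ratio = mcnemar(15, 5)
```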

4. Results

4.1. Modelling of MgCl2 Concentration

4.1.1. Comparison of Regression Models

Several regression models that attempted to forecast the ideal concentration of magnesium chloride (MgCl2) were compared (Table 1). Comparing the models may reveal crucial information about their performance in terms of execution time, Mean Absolute Error (MAE), and coefficient of determination (R2).

Table 1.

Comparison of regression model performances for [MgCl2]

Model MAE R2 Execution time (s)
Linear regression 0.0017 0.9942 0.023
Ridge regression 0.0018 0.9942 0.031
Lasso regression 0.0186 0.9384 0.042
Polynomial regression 0.0208 0.9309 0.156
Random Forest 0.0305 0.8989 0.287

MAE - Mean Absolute Error, MSE - Mean Squared Error

The linear regression model is the top choice based on its high R2 value of 0.9942 and the lowest MAE of 0.0017. It yields reliable predictions and accounts for most of the variance in the dataset. Its execution time of 0.023 seconds also makes it well suited to fast computation in real-world applications.
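For reference, the two accuracy metrics compared in Table 1 can be computed as in this short sketch. The actual and predicted concentrations below are made-up numbers used only to exercise the functions.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical MgCl2 concentrations (mM), for illustration only
actual = [1.5, 2.0, 2.5, 3.0]
predicted = [1.6, 1.9, 2.5, 3.1]
mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
```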

4.1.2. Predictive Equation for (MgCl2)

The resulting equation for predicting (MgCl2) is:

(MgCl2) ≈ 1.5625 + (-0.0073 × Tm) + (-0.0629 × GC) + (0.0273 × L) + (0.0013 × dNTP) + (-0.0120 × Primers) + (0.0007 × Polymerase) + (0.0012 × log(L)) + (0.0016 × Tm_GC) + (0.0639 × dNTP_Primers) + (0.0056 × pH_Polymerase)
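The equation can be evaluated programmatically as in the sketch below, which only transcribes the published coefficients. Two points are assumptions: the inputs are presumably on the standardized scale used during fitting (Section 3.4 mentions standardized input features), and the composite terms Tm_GC, dNTP_Primers, and pH_Polymerase are taken as precomputed features because the paper does not define how they are formed. The function name `predict_mgcl2` is hypothetical.

```python
MGCL2_INTERCEPT = 1.5625
MGCL2_COEFFS = {
    "Tm": -0.0073,
    "GC": -0.0629,
    "L": 0.0273,
    "dNTP": 0.0013,
    "Primers": -0.0120,
    "Polymerase": 0.0007,
    "log_L": 0.0012,
    "Tm_GC": 0.0016,
    "dNTP_Primers": 0.0639,
    "pH_Polymerase": 0.0056,
}


def predict_mgcl2(features):
    """Evaluate the linear MgCl2 equation for one sample given as {name: value}."""
    missing = set(MGCL2_COEFFS) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return MGCL2_INTERCEPT + sum(
        coeff * features[name] for name, coeff in MGCL2_COEFFS.items()
    )


# A sample with every (standardized) feature at zero returns the intercept
baseline = predict_mgcl2({name: 0.0 for name in MGCL2_COEFFS})
```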

4.1.3. Analysis of Variable Importance

The analysis of variable importance revealed the crucial role of the interaction between dNTP and primers, along with the significant influence of GC content and amplicon length (Table 2).

Table 2.

Relative importance of variables in the [MgCl2] model

Variable Relative importance (%)
dNTP_Primers 28.5
GC 22.1
L 15.7
Tm 12.3
Primers 8.9
pH_Polymerase 5.6
Tm_GC 3.2
log_L 2.1
dNTP 1.1
Polymerase 0.5

4.1.4. Comparison with Non-Linear Models

Besides the linear regression model described above, other non-linear models like Support Vector Regression (SVR) and Neural Networks may also be investigated for comparison. Although the linear model worked well with an R2 measure of 0.9942, it is understood that non-linear models might be capable of capturing more intricate patterns in the data. For example, SVR, using its kernel trick, and Neural Networks, using their deep layers, can better capture non-linear relationships, potentially enhancing prediction accuracy. Past research has demonstrated that non-linear models would be superior to linear models when there are non-linear dependencies. Exploring these models in future research would potentially enhance prediction accuracy.

4.2. Modeling of Melting Temperature (Tm)

4.2.1. Model Performance

For Tm prediction, we obtained a linear regression model with the following performances:

MSE = 0.7661, R2 = 0.9600, Execution time = 0.019 s

4.2.2. The Predictive Equation for Tm

The prediction equation for Tm is:

Tm ≈ 39.5409 + (0.3890 × GC) + (0.0132 × L) + (7.1786 × MgCl2) + (-19.7616 × dNTP) + (-0.0125 × Na)
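As a worked example, the sketch below evaluates the Tm equation and returns each term's contribution, which makes the relative weight of the predictors visible for a given reaction. The input units (GC in percent, L in nucleotides, MgCl2, dNTP, and Na+ in mM) and the example values are assumptions, since the paper does not state the units of the fitted variables.

```python
TM_INTERCEPT = 39.5409
TM_COEFFS = {
    "GC": 0.3890,
    "L": 0.0132,
    "MgCl2": 7.1786,
    "dNTP": -19.7616,
    "Na": -0.0125,
}


def predict_tm(conditions):
    """Return (Tm, per-term contributions) for one reaction given as {name: value}."""
    contributions = {name: TM_COEFFS[name] * conditions[name] for name in TM_COEFFS}
    return TM_INTERCEPT + sum(contributions.values()), contributions


# Hypothetical reaction: 50% GC, length 20, 1.5 mM MgCl2, 0.2 mM dNTP, 50 mM Na+
tm, parts = predict_tm({"GC": 50.0, "L": 20.0, "MgCl2": 1.5, "dNTP": 0.2, "Na": 50.0})
```

Under these assumed inputs the GC and MgCl2 terms contribute the most, in line with the sensitivity ranking reported for the model.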

4.2.3. Sensitivity Analysis

A sensitivity analysis of the Tm model (Supplementary Table 1) highlights the predominant influence of GC content and MgCl2 concentration on melting temperature and reveals a surprising negative effect of dNTP concentration. Notably, our model incorporates Na+ concentration with a coefficient of -0.0125. Although this effect is less pronounced than that of MgCl2 (coefficient 7.1786), the negative coefficient for Na+ suggests a complex interaction between monovalent and divalent cations in PCR reactions, whereby increased Na+ concentrations can slightly decrease melting temperature. This underscores the importance of considering multiple ionic species when optimizing PCR conditions.

4.3. Analysis of Correlations Between Parameters

The correlation matrix reveals significant relationships among the key PCR parameters (Supplementary Table 2). MgCl2 concentration correlates strongly with both melting temperature (Tm; 0.90) and guanine-cytosine (GC) content (0.92). These correlations show that DNA strand stability is significantly influenced by the (MgCl2) level, especially for GC-rich sequences, which have higher melting points. For optimal primer stability and amplification, (MgCl2) must therefore be calibrated according to the GC content. The high correlation between GC and Tm (0.85) confirms that GC-rich sequences melt at higher temperatures and shows how closely the two factors are linked. Amplicon length (L) shows weak correlations with the other parameters (0.05 to 0.20), suggesting that it has little direct influence on the reaction. dNTP and primer concentrations are moderately correlated with (MgCl2), Tm, and GC (0.12-0.38), showing that their effect on amplification efficiency is secondary but relevant. The correlations between Na+ concentration and the other parameters were weak, ranging from 0.08 to 0.22, although our model still recognizes the auxiliary role of monovalent cations, alongside divalent Mg2+ ions, in DNA duplex stability. These weak correlations indicate that Na+ can be treated relatively independently when optimizing PCR conditions.

This analysis, therefore, stresses the need for harmonized optimization of (MgCl2), GC content, and Tm, taking into consideration auxiliary effects depending on amplicon length and specific reagent concentrations.
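A correlation matrix of this kind can be computed with a few lines of code. The sketch below implements the Pearson coefficient directly and applies it to every pair of columns; the three short columns are invented values, not the study's data, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of named columns."""
    return {
        a: {b: pearson_r(columns[a], columns[b]) for b in columns}
        for a in columns
    }


# Hypothetical measurements for five primers
data = {
    "MgCl2": [1.5, 2.0, 2.5, 3.0, 3.5],
    "Tm": [52.0, 54.5, 55.0, 58.0, 59.5],
    "GC": [40.0, 45.0, 50.0, 55.0, 62.0],
}
matrix = correlation_matrix(data)
```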

4.4. Model Validation

The (MgCl2) and Tm prediction models were validated on multiple PCR machines, with contributions from several researchers. The model using (MgCl2) explained 99.21% of the variation, with an impressive R2 of 0.9921, a mean squared error of 0.0019, and a mean error of 2.3%. The Tm model had an R2 of 0.9587, accounting for 95.87% of the variance, though with a higher MSE of 0.7892, indicating some variability but a reasonable mean error of 1.8%. These results were consistent across different PCR machines, suggesting robustness and applicability in various laboratory settings (Table 3). Although the high R2 values indicate strong model performance, we also evaluated the models on independent datasets to rule out overfitting. The results showed consistent performance across these test sets. Future studies will explore the use of additional external datasets to validate and enhance the models' generalizability further.

Table 3.

Sensitivity analysis for the Tm model

Variable Coefficient Relative sensitivity (%)
GC 0.3890 35.2
MgCl2 7.1786 32.8
dNTP -19.7616 18.5
L 0.0132 9.7
Na -0.0125 3.8

4.4.1. Comprehensive Analysis and Predictive Modeling of PCR Parameters Impacting MgCl2 Concentration and Melting Temperature (Tm)

An extensive analysis of PCR conditions and their relation to MgCl2 concentration reveals several points relevant to optimizing a PCR reaction (Fig. 1). The multi-panel figure presents an integrated overview of the interplay between various PCR parameters and the required MgCl2 concentration, a critical component for successful amplification.

Figure 1.

Multi-parameter Analysis of PCR Conditions and MgCl2 Concentration. Predicted vs. Actual MgCl2 concentrations: the shaded area represents the 95% confidence interval of the regression model.

4.4.2. Distribution of MgCl2 Concentration

The MgCl2 concentrations follow an approximately normal distribution centered at about 3 mM, with most values between 2 and 4 mM. This spread suggests that, although other variables affect the requirement, there is a typical MgCl2 concentration for PCR (Fig. 1A).

4.4.3. Relationship between Melting Temperature (Tm) and MgCl2

A positive correlation was observed between Tm and the required MgCl2 concentration, with higher MgCl2 levels generally associated with increased Tm. This trend may be explained by the stabilizing action of Mg2+ ions on DNA duplexes, which increases the MgCl2 requirement to maintain stringent annealing conditions at elevated temperatures (Fig. 1B).

4.4.4. GC Content vs. MgCl2 Concentration

The required MgCl2 concentration increases with GC content (Fig. 1C). The stronger bonding of G-C base pairs may explain why more Mg2+ ions are needed to drive the strand separation and primer annealing processes.

4.4.5. Amplicon Length (L) and MgCl2 Concentration

A slight positive correlation was observed between amplicon length (L) and MgCl2 concentration, suggesting that longer amplicons may require a somewhat higher MgCl2 level due to the increased need for stability in longer DNA fragments during amplification (Fig. 1D).

4.4.6. dNTP Concentration vs. MgCl2 Concentration

An inverse relationship was observed between dNTP concentration and MgCl2 levels, with higher MgCl2 concentrations corresponding to lower dNTP levels. This effect can be explained by the chelation of Mg2+ ions by dNTPs, requiring additional MgCl2 to maintain sufficient free Mg2+ ions for polymerase activity (Fig. 1E).

4.4.7. Primer Concentration-MgCl2 Relationship

An inverse relationship was observed between primer concentration and MgCl2 concentration, where higher primer levels tended to correspond to lower MgCl2 requirements. This effect may be partially explained by competition between primers and Mg2+ ions for binding to template DNA, with excess primers compensating for reduced Mg2+ availability (Fig. 1F).

4.4.8. Polymerase Concentration vs. MgCl2 Concentration

A slight negative correlation was identified between polymerase concentration and MgCl2 concentration, with higher polymerase levels associated with reduced MgCl2 requirements. This may be due to the Mg2+-binding properties of polymerase, whereby higher enzyme concentrations sequester more Mg2+ ions and thus reduce the need for additional MgCl2 (Fig. 1G).

4.4.9. Correlation Heatmap of PCR Parameters

The correlation heat map provides an overview of the relationships among all variables, highlighting the expected positive correlation between Tm and GC content, the positive association between Tm and MgCl2 (also observed in Figure 1B), and the negative correlations of dNTP and primer concentrations with MgCl2 (Fig. 1H).

4.5. Statistical Analysis

4.5.1. Predicted vs. Actual MgCl2 Concentrations

A scatter plot of predicted versus actual MgCl2 concentrations shows points clustering around the diagonal line, indicating a reasonably good fit for the linear regression model (Fig. 1I). MgCl2 concentrations can therefore be estimated with moderate accuracy from other PCR parameters, which provides a practical method for estimating MgCl2 during PCR optimization. The histogram of (MgCl2) concentrations is approximately normal, with a mean of ∼3 mM and a range of ∼2–4 mM (Fig. 1A). Higher Tm values corresponded to higher MgCl2 concentrations in PCR reactions (Fig. 1B), and higher %GC likewise demanded higher MgCl2 concentrations (Fig. 1C). Amplicon length showed a weak positive relation with MgCl2 concentration; thus, longer amplicons may need more MgCl2 (Fig. 1D). dNTP concentration was inversely correlated with MgCl2 concentration, with lower dNTP concentrations corresponding to higher MgCl2 demands (Fig. 1E). Likewise, increased primer concentrations were associated with lower MgCl2 levels (Fig. 1F). A modestly negative correlation was also observed between polymerase concentration and MgCl2 concentration: across polymerase levels, the MgCl2 required to reach maximal polymerase productivity decreased consistently (Fig. 1G). Finally, the correlation heat map summarized these relationships, confirming a strong positive correlation between Tm and GC content, a positive relationship between Tm and MgCl2, and negative relationships between dNTP and primer concentrations and MgCl2 (Fig. 1H).

4.5.2. Th Distribution

The histogram shows that hybridization temperatures (Th) range from 46 °C to 60 °C, with most values falling between 50 °C and 58 °C. Two distinct peaks near 52 °C and 54 °C mark the most common hybridization temperatures. The distribution is slightly right-skewed, probably owing to variability in the primer sequences and experimental conditions (Fig. 2).

Figure 2.

Distribution of Melting Temperatures (Tm) and Their Relationships with GC Content and Amplicon Length.

4.5.3. Relationship of GC Content with Th

The scatter plot showing the relationship between GC content and Th follows a strongly positive correlation pattern. When the GC content increases from 40% to 62%, Th almost rises linearly. This agrees with the DNA hybridization theory since separating the pairs of GC bases takes more energy than separating AT pairs (Fig. 2).

4.5.4. Relationship between Amplicon Length (L) and Th

The scatter plot of L vs. Th shows no clear relationship. Amplicon lengths cluster around 20 and 25 bp, and Th spans a wide range at both lengths. Amplicon length therefore does not appear to be a significant determinant of hybridization temperature; other factors, GC content in particular, are more influential (Fig. 2).

4.5.5. Predicted vs. Actual Th Values

The scatter plot of predicted versus actual Th values shows a positive correlation, with most points lying near the diagonal. The model therefore predicts well overall, although some points deviate further from the diagonal, indicating room for improvement (Fig. 2).

We calculated Pearson correlation coefficients between predicted and actual values for each parameter to further assess the accuracy of predictions. Table 4 presents these results, showing strong positive correlations for all parameters.

Table 4.

Correlation matrix of critical parameters

Parameter [MgCl2] Tm GC L dNTP Primers
[MgCl2] 1.00 0.90 0.92 0.20 0.38 0.34
Tm 0.90 1.00 0.85 0.15 0.30 0.28
GC 0.92 0.85 1.00 0.18 0.35 0.31
L 0.20 0.15 0.18 1.00 0.05 0.07
dNTP 0.38 0.30 0.35 0.05 1.00 0.12
Primers 0.34 0.28 0.31 0.07 0.12 1.00

[MgCl2]: Magnesium Chloride concentration; Tm: melting temperature; GC: Guanine-Cytosine content; L: amplicon length; dNTP: deoxyribonucleotide triphosphates concentration.

(A) The histogram of Th shows a range from 46 °C to 60 °C, with two prominent peaks around 52 °C and 54 °C. (B) The scatter plot reveals a strong positive correlation between GC content and Th. (C) The scatter plot of L vs. Th shows no clear correlation. (D) The scatter plot of predicted vs. actual Th indicates good model accuracy overall, with some prediction errors.

4.6. PCR Performance Comparison

PCR performance was compared between software-optimized conditions and standard methods. The paired t-test showed that software-optimized conditions resulted in significantly higher PCR yields than standard methods (p < 0.001; Table 5). The specificity of the PCR reactions was also assessed by examining the presence of nonspecific bands. The results of McNemar’s test indicated that the software-optimized conditions significantly reduced the occurrence of nonspecific bands (p < 0.001; Table 6).

Table 5.

Model validation results

Model R2 (test) MSE (test) Mean error (%)
[MgCl2] 0.9921 0.0019 2.3
Tm 0.9587 0.7892 1.8

Table 6.

Pearson Correlation Coefficients between Predicted and Actual Values

Parameter Correlation (r) p-value
[MgCl2] 0.92 <0.001
Tm 0.89 <0.001

4.7. User Satisfaction

To assess the software's effectiveness and usability, we collected user satisfaction scores; descriptive statistics for these scores are available (Supplementary Table 3). The high mean satisfaction score of 8.2 out of 10 indicates that users generally found the software helpful and practical.

5. Discussion

Using extensive multivariate modeling based on Taylor series expansion and Michaelis-Menten equations, we explored how MgCl2 concentration and Tm affect PCR performance. Our findings indicate a vital role for MgCl2 concentration, which strongly affects nucleic acid solvation and stability. Navalpreet et al. and Yi et al. found that MgCl2 facilitates solvation in nucleic acid interactions through hydrophilic ionic effects, and we found that MgCl2 is required for primer annealing and amplification specificity ( 16 , 17 ). Our comparative analysis of regression techniques revealed clear differences, with linear regression (R2 of 0.9942) outperforming machine learning approaches such as Random Forest (R2 of 0.8989). We attribute this performance gap to the fundamentally linear relationship between MgCl2 requirements and PCR thermodynamic parameters. Machine learning algorithms are powerful at recognizing complex patterns, but the physicochemical interactions in PCR optimization produce simple linear relationships that are well represented by less complex statistical models. The inverse relationship between polymerase concentration and optimal MgCl2 indicates competitive binding dynamics, whereby polymerases compete for magnesium ions, which are essential cofactors for polymerase activity ( 18 ). Because polymerase-specific surface chemistries can shift the magnesium concentration that most efficiently supports DNA synthesis, different polymerase formulations can be expected to have different magnesium affinities, which may limit the universal applicability of our predictions unless they are adjusted for specific polymerases. PCR inhibitors such as EDTA, proteins, and polysaccharides may further modify ionic availability by chelation or non-specific binding, shifting practical MgCl2 concentration requirements, especially in clinical or environmental samples.
Our method also shows clear benefits in optimizing MgCl2 compared with commercial tools such as NEB Tm Calculator and OligoAnalyzer ( 18 ). These tools give Tm values within ±2.1 °C of our predictions, but they offer no specific guidance on the optimal MgCl2 concentration beyond a general suggestion. In experimental comparisons, the MgCl2 concentrations predicted by our model produced 27% greater amplification yields and 31% fewer failed reactions than standard protocols. Although our study focused mainly on DNA amplification, the observations of Mert et al. ( 19 ) on Mg2+ in RNA stability were extremely useful in parsing ionic interactions in nucleic acid stability ( 20 , 21 ). We also account for the roles of monovalent (Na+) and divalent (Mg2+) cations in PCR optimization. Tbahriti et al. ( 22 ) showed that the relative balance of these cations, rather than their absolute concentrations, is one of the factors most strongly correlated with PCR efficiency in complex samples. Our model has limitations that deserve attention. Although we validated it with data from 40 technicians, we have yet to assess generalization across different PCR platforms (qPCR, digital PCR) or extreme template conditions. Further work will require systematic cross-validation over diverse template sources and thorough comparison against platform-specific protocols. Recent advances in machine learning have improved RT-PCR diagnostic processes. Previous authors have shown that the accuracy of SARS-CoV-2 detection can be improved with deep learning models trained to predict specific parameters ( 21 , 22 ). Ong et al. demonstrated that machine learning could predict pCR from multi-omic signatures to inform neoadjuvant chemotherapy decisions ( 23 ). Febrian et al. compared regression algorithms for bioactivity prediction, which helps determine which algorithms are best suited to modelling PCR optimization ( 24 ).
We also recognize that our tool does not consider the contribution of secondary structures in the input sequence, which can affect the Tm of the primer. Nevertheless, our estimated values are similar to experimentally observed hybridization temperatures. Future developments will include machine learning techniques to predict and penalize inferred unlikely secondary structures, increasing accuracy for complex templates and making the tool applicable to a wider range of PCR applications ( 22 , 24 ).

6. Conclusion

This work focused on optimizing MgCl2 concentration and Tm in PCR and developed advanced mathematical models based on experimental data from 120 PCR primers. The results demonstrate complex interactions among the reaction components; for example, GC content and dNTP concentration significantly influence amplification efficiency. The thermodynamic approach gave deeper insight into the biochemistry of PCR and enabled systematic optimization. Multi-omics data integrated with machine learning show promise for further developing PCR technology toward a precision that could benefit molecular biology research and diagnostics. Although the models are accurate, further validation across contexts is needed to establish generalizability. Other refinements could account for factors such as the presence of PCR inhibitors or sample quality.

In future work, we plan to introduce machine learning-based strategies to dynamically predict and correct for secondary structures such as hairpins and self-dimers, thereby further improving the reliability and adaptability of our Tm predictions in diverse genomic contexts. This study thus has important implications for models that can be used with real-time PCR in real-world settings such as clinical diagnostics and environmental PCR. Such models can tailor PCR conditions to particular DNA samples, improving the accuracy of results in the clinic or in the field. We will elaborate on these use cases in future work.

Acknowledgments

We would like to express our deepest gratitude to all the researchers and students who have contributed to this work. We are especially thankful to Mohamed Ali Shariati for his invaluable support and insightful guidance throughout this project.

We also wish to acknowledge the contributions of the following individuals, whose dedication and hard work were essential for the success of this research: Raheleh Rezaei, Abbas Ali Dehpour, Seyed Mohammad Hosseini, Maryam Gholamitabar Tabari, and Farkhondeh Nemati. Their input, knowledge, and expertise were of great significance. We would also like to extend our sincere thanks to Pr. Boughrara Wefa and Francesco Centorrino for their valuable support and contributions throughout this work.

Additionally, we would like to extend our thanks to the students and participants from the "biovision" Instagram page (bio._.vision), whose enthusiasm and engagement played an important role in advancing this project.

Funding

No funding was received for this research.

Compliance with ethical standards

This article contains no studies with human participants performed by any authors.

Conflict of interest

The authors declare that they have no conflicts of interest.

References

  • 1. Lorenz TC. Polymerase chain reaction: basic protocol plus troubleshooting and optimization strategies. Curr Protoc Cytom. 2012;73:114. doi: 10.1002/0471142727.mb1501s73.
  • 2. Jiang X, et al. An amplification velocity-controlled PCR device for accurately detecting the initial content of target DNA templates. Chem Eng J. 2022. doi: 10.1016/j.cej.2022.141123.
  • 3. Zhu M, et al. Interactions of the primers and Mg2+ with graphene quantum dots enhance PCR performance. RSC Adv. 2015;5:74515–74522. doi: 10.1039/C5RA12729G.
  • 4. Owczarzy R, Tataurov AV, et al. Predicting stability of DNA duplexes in solutions containing magnesium (2+). Biochemistry. 2008;47(19):5336–5353. doi: 10.1021/bi702363u.
  • 5. Grgičin D, Dolanski Babić S, Ivek T, et al. Effect of magnesium ions on dielectric relaxation in semidilute DNA aqueous solutions. Phys Rev E. 2013;88:052703. doi: 10.1103/PhysRevE.88.052703.
  • 6. Thirioux X, Maffart A. Taylor Series Revisited. In: Formal Methods for Dynamical Systems. Lecture Notes in Computer Science. 2019;11800:291–310. doi: 10.1007/978-3-030-32505-3_19.
  • 7. Li J, et al. Entropy Driving the Mg2+-Induced Folding of TPP Riboswitch RNA. J Phys Chem B. 2022;1:2646744683. doi: 10.1021/acs.jpcb.2c03688.
  • 8. Vaitiekunas P, et al. The energetic basis of the DNA double helix: a combined microcalorimetric approach. Nucleic Acids Res. 2015;43. doi: 10.1093/nar/gkv812.
  • 9. Markoulatos P, Siafakas N, Moncany M. Multiplex polymerase chain reaction: a practical approach. J Clin Lab Anal. 2002;16(5):214–219. doi: 10.1002/jcla.10087.
  • 10. Ragot M, et al. Thermodynamic study of the formation of adenine nucleotide-manganese complexes. II. Calorimetric results. Biochim Biophys Acta. 1977;497:73–80. doi: 10.1016/0304-4165(77)90073-3.
  • 11. Lervik A, et al. Michaelis-Menten kinetics under non-isothermal conditions. Phys Chem Chem Phys. 2015;17:1634–1647. doi: 10.1039/C4CP04334K.
  • 12. Langer A, et al. Polymerase/DNA interactions and enzymatic activity: multi-parameter analysis with electro-switchable biosurfaces. Sci Rep. 2015;5:12066. doi: 10.1038/srep12066.
  • 13. Kozliak EI. Energy as Money, Chemical Bonding as Business, and Negative ΔH and ΔG as Investment. J Chem Educ. 2002;79:1435. doi: 10.1021/ed079p1435.
  • 14. Kurus NN, et al. Determination of the Thermodynamic Parameters of DNA. ACS Omega. 2018. doi: 10.1021/acsomega.7b01815.
  • 15. Lievens A, et al. Simulation of between repeat variability in real time PCR reactions. PLoS One. 2012;7:e47112. doi: 10.1371/journal.pone.0047112.
  • 16. Navalpreet K, et al. Understanding the interaction of pyrimidine-based model compounds of DNA and RNA with magnesium chloride in aqueous solutions via thermodynamic, transport, and spectroscopic studies. J Chem Eng Data. 2023. doi: 10.1021/acs.jced.3c00282.
  • 17. Yi Q, et al. The GC-content at the 5’ ends of human protein-coding genes is undergoing mutational decay. bioRxiv. 2024. doi: 10.1101/2024.03.12.584636.
  • 18. Cora B, et al. Increased dNTP pools rescue mtDNA depletion in human POLG-deficient fibroblasts. FASEB J. 2019. doi: 10.1096/fj.201801591r.
  • 19. Mert Y, et al. Influence of Mg2+ distribution on the stability of folded states of the Twister ribozyme revealed using grand canonical Monte Carlo and generative deep learning enhanced sampling. ACS Omega. 2023. doi: 10.1021/acsomega.3c00931.
  • 20. Abhishek A, Kognole AD. Contributions and competition of Mg2+ and K+ in folding and stabilization of the Twister Ribozyme. bioRxiv. 2020. doi: 10.1101/2020.06.15.152744.
  • 21. Gunay M, Sanwal M. RT-PCR accuracy improvement for SARS-CoV-2 detection using deep neural networks. Biomed Signal Process Control. 2024. doi: 10.1016/j.bspc.2024.106169.
  • 22. Tbahriti HF, Zerrouki A, Boukadoum A, Kameche M, Povetkin S, Simonov A, Thiruvengadam M. Comprehensive review and meta-analysis of magnesium chloride optimization in PCR: Investigating concentration effects on reaction efficiency and template specificity. Anal Biochem. 2025;705:115909. doi: 10.1016/j.ab.2025.115909.
  • 23. Ong J, et al. Prediction of pCR and chemosensitivity for breast cancer patients using DLG3, RADL, and Pathomics signatures. Transl Oncol. 2024. doi: 10.1016/j.tranon.2024.101985.
  • 24. Febrian D, et al. Regression algorithms in predicting the SARS-CoV-2 replicase polyprotein 1ab inhibitor: A comparative study. J Electron Electromed Eng Med Inform. 2023. doi: 10.35882/jeeemi.v6i1.33.

Articles from Iranian Journal of Biotechnology are provided here courtesy of Iran National Institute of Genetic Engineering and Biotechnology
