Abstract

Here, we demonstrate a method to estimate the uncertainty (confidence intervals and standard errors) of free energy differences calculated by molecular simulations. The widths of confidence intervals and standard errors can be calculated solely from temperature and the number of transitions between states. An uncertainty (95% confidence interval) lower than ±1 kcal/mol can be achieved by a simulation with four forward and four reverse transitions. For a two-state Markovian system, the confidence interval is exact, regardless of the number of transitions.
1. Introduction
One of the main purposes of molecular simulations is to predict equilibrium constants and associated free energy differences for processes such as conformational changes, formation of noncovalent complexes, chemical reactions, or phase transitions. The equilibrium constant of a transition from state A to B can be determined experimentally as the ratio of the concentrations of B and A in equilibrium. In molecular simulations, it is common to simulate only one copy of the system (e.g., a single solvated protein or a protein–ligand pair); thus, the equilibrium constant can be predicted as the ratio of the times spent in states B and A. For practical application of such predictions, it is necessary to assess their accuracy.
Accuracy of molecular simulations is determined by systematic and random errors. Systematic errors can be caused, for example, by over- or underestimation of some noncovalent interactions or oversimplification of the structure-energy relationship. In this work, we deal with random errors (uncertainties) caused by lack of sampling. They can be, in principle, eliminated by running infinitely long simulations.
Current statistical methods used in this field are based on the treatment of autocorrelation.1−3 Many characteristics of a molecular system, including the densities of states, can be calculated by averaging along the trajectory. Accuracy of the mean value can be characterized by a standard error of the mean. It is commonly calculated as $\sigma/\sqrt{n}$, where σ is the standard deviation and n is the number of samples. However, n can be set to the number of samples only if these samples are independent. This is not the case for states sampled along a simulation trajectory because they are strongly autocorrelated. To apply $\sigma/\sqrt{n}$ to a simulation trajectory, it is necessary to estimate the number of independent samples n, which is significantly lower than the number of all samples, either by block averaging1,2 or autocorrelation analysis.3
As an alternative, we present a simple method, “JumpCount”, based solely on the number of observed transitions between states of the system. The prerequisites of the method are that the states of the system can be clearly distinguished, that there is an energy barrier between them, and that the process is Markovian, i.e., the probability of transition from one state to another is constant and independent of the history of the system, so that it can be described by first-order kinetics. Finally, the trajectory must be sampled with a frequency high enough to capture all transitions.
The method is demonstrated on model data (random numbers with an appropriate statistical distribution) and two types of molecular systems–glycerol and fast-folding miniproteins.
2. Theory
First, we will derive expressions for the uncertainties of an equilibrium constant and the associated free energy difference for a simple reversible transition from state A to state B. A simulation can be run so that the system undergoes the same number of transitions from A to B (nA) and from B to A (nB). This is illustrated in Figure 1 for the number of transitions nA = nB = 3.
Figure 1.

A schematic representation of a trajectory with three A to B and three B to A transitions (data taken from glycerol simulation).
For a Markovian process, the first passage times for the transition from A to B (and vice versa) are exponentially distributed with the probability density $P(t_A) = k_1 \exp(-k_1 t_A)$, where $k_1$ is the rate constant for the transition from A to B. Analogously, $P(t_B) = k_{-1} \exp(-k_{-1} t_B)$. It is possible to estimate $k_1$ and $k_{-1}$ as $n_A/\sum_i t_{A,i}$ and $n_B/\sum_i t_{B,i}$, respectively.
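The counting of transitions and the rate estimates described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used in this work: the function names `dwell_times` and `estimate_rates`, the per-frame list of state labels, and the choice to discard the final unfinished dwell are all assumptions made for the example.

```python
def dwell_times(states, dt):
    """Collect completed dwell times per state from per-frame state labels.

    states: sequence of labels such as "A" and "B", one per saved frame
    dt: time between saved frames (any time unit)
    The last, unfinished dwell is discarded (an assumption of this sketch).
    """
    completed = {}
    start = 0
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            completed.setdefault(states[i - 1], []).append((i - start) * dt)
            start = i
    return completed


def estimate_rates(states, dt):
    """Return {state: (number of transitions out of it, estimated rate constant)}."""
    return {
        state: (len(times), len(times) / sum(times))
        for state, times in dwell_times(states, dt).items()
    }


# Hypothetical trajectory: one label per frame, frames saved every 1 ps
trajectory = list("AAABBAAAABBBA")
rates = estimate_rates(trajectory, dt=1.0)
n_a, k1 = rates["A"]
n_b, k_minus1 = rates["B"]
print(n_a, n_b, k1 / k_minus1)  # numbers of transitions and the estimate of K
```

In practice, the state labels would come from a clear-cut state definition (for example, torsion angle ranges), and the saving interval must be short enough that no transition is missed.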
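The sum of independent random variables with exponential distribution follows the gamma distribution, namely, Gamma(shape = $n_A$, rate = $k_1$) for $\sum_i t_{A,i}$ and, analogously, Gamma(shape = $n_B$, rate = $k_{-1}$) for $\sum_i t_{B,i}$. The equilibrium constant can be estimated as the ratio of the estimated rate constants, $\hat{K} = n_A \sum_i t_{B,i} / (n_B \sum_i t_{A,i})$, which reduces to $\sum_i t_{B,i} / \sum_i t_{A,i}$ for $n_A = n_B$ ($\hat{K}$ will be used as a symbol for an estimate of the true equilibrium constant K). The corresponding free energy difference can be estimated as $\Delta\hat{G}^0 = -kT \log \hat{K}$, where k is the Boltzmann constant and T is the temperature in kelvins. The ratio of the estimate and the true value of K, $\hat{K}/K$, can be written as a rescaled ratio of two independent gamma-distributed random variables and follows the F-distribution (also known as the Fisher–Snedecor distribution, known from the F-test and ANOVA)4 with degrees of freedom $d_1 = 2n_B$ and $d_2 = 2n_A$. The confidence interval (CI) of K can therefore be calculated using the quantile function of the F-distribution. The 95% CI of K can be calculated as

$$\left(\frac{\hat{K}}{q_{F,0.975,d_1,d_2}},\ \frac{\hat{K}}{q_{F,0.025,d_1,d_2}}\right) \qquad (1)$$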
where qF is the quantile of F-distribution (inverse of the cumulative distribution function) with subscripts d1 and d2.
For free energy, the 95% CI can be calculated as
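$$\left(-kT \log \hat{K} + kT \log q_{F,0.025,d_1,d_2},\ -kT \log \hat{K} + kT \log q_{F,0.975,d_1,d_2}\right) \qquad (2)$$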
As a result, the width of the CI of the free energy difference depends solely on temperature and the number of transitions. For 300 K, the 95% CI for the free energy difference is $\Delta\hat{G}^0 \pm 3.66\,kT$ (±2.18 kcal/mol) for nA = nB = 1, $\Delta\hat{G}^0 \pm 2.26\,kT$ (±1.35 kcal/mol) for nA = nB = 2, etc. (see Table 1). The confidence interval of ±1 kcal/mol, which is often used as a threshold of accuracy in molecular simulations, can be reached in a simulation with nA = nB = 4.
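The confidence interval of the free energy difference can be evaluated with a short script. The sketch below is an illustration only and uses nothing beyond the Python standard library: because the degrees of freedom are even ($2n_B$ and $2n_A$), the cumulative distribution function of the F-distribution reduces to a finite binomial sum, and the quantile is found by bisection. The function names and the value of the Boltzmann constant in kcal/(mol K) are choices made here; a statistics library (for example `scipy.stats.f.ppf`) could be used for the quantile instead.

```python
import math

K_B = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol K)


def f_cdf_even(x, n1, n2):
    """CDF of the F-distribution with d1 = 2*n1 and d2 = 2*n2 degrees of freedom.

    For integer n1 and n2 the regularized incomplete beta function
    reduces to a finite binomial sum.
    """
    if x <= 0.0:
        return 0.0
    z = n1 * x / (n1 * x + n2)
    m = n1 + n2 - 1
    return sum(math.comb(m, j) * z**j * (1.0 - z) ** (m - j) for j in range(n1, m + 1))


def f_quantile_even(p, n1, n2):
    """Quantile of the same F-distribution, found by bisection on a log scale."""
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if f_cdf_even(mid, n1, n2) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def free_energy_ci(k_hat, n_a, n_b, temperature=300.0, level=0.95):
    """Return (lower, estimate, upper) of the free energy difference in kcal/mol."""
    kt = K_B * temperature
    tail = (1.0 - level) / 2.0
    q_low = f_quantile_even(tail, n_b, n_a)  # d1 = 2 nB, d2 = 2 nA
    q_high = f_quantile_even(1.0 - tail, n_b, n_a)
    estimate = -kt * math.log(k_hat)
    return estimate + kt * math.log(q_low), estimate, estimate + kt * math.log(q_high)


# Half-widths of the 95% CI at 300 K for nA = nB = 1..4
for n in (1, 2, 3, 4):
    lower, estimate, upper = free_energy_ci(1.0, n, n)
    print(n, round(upper - estimate, 2))
```

The half-width does not depend on the value of the estimate itself, so the loop uses an arbitrary estimate of 1.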
Table 1. 95% Confidence Intervals at 300 K (Mean ± Error) for Processes with Numbers of Transitions from A to B (nA) and from B to A (nB).
The concept described above can be easily generalized to nA = nB + 1 (we denote the starting state as A). In this case, the confidence interval can be calculated by setting the degrees of freedom d1 = 2nB and d2 = 2nA in the F-distribution. The confidence interval is then asymmetric (see Table 1).
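The variance of $\Delta\hat{G}^0$ can be calculated as

$$\mathrm{Var}(\Delta\hat{G}^0) = \mathrm{Var}(-kT \log \hat{K}) = (kT)^2\,\mathrm{Var}(\log \hat{K}) \qquad (3)$$

Since K is exact (its variance is 0) and $\hat{K}/K$ follows the F-distribution, the variance of $\log \hat{K}$ and the standard error of $\Delta\hat{G}^0$ can be calculated as

$$\mathrm{Var}(\log \hat{K}) = \mathrm{Var}(\log(\hat{K}/K)) = \psi^{(1)}(n_A) + \psi^{(1)}(n_B) \qquad (4)$$

$$\mathrm{SE}(\Delta\hat{G}^0) = kT \sqrt{\psi^{(1)}(n_A) + \psi^{(1)}(n_B)} \qquad (5)$$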
where ψ(1) is a polygamma function of order 1 (trigamma). The values of standard error are presented in Table 2. However, we believe that confidence intervals are more informative because the free energy estimates discussed here are not normally distributed and most researchers are familiar with standard errors in the context of normally distributed samples.
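The standard error can be evaluated without a special-function library, because for a positive integer argument the trigamma function has the closed form $\psi^{(1)}(n) = \pi^2/6 - \sum_{k=1}^{n-1} 1/k^2$. The following sketch is an illustration under that identity; the function names and the value of the Boltzmann constant in kcal/(mol K) are assumptions of the example.

```python
import math

K_B = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol K)


def trigamma_int(n):
    """Trigamma function for a positive integer n."""
    return math.pi**2 / 6.0 - sum(1.0 / (k * k) for k in range(1, n))


def free_energy_se(n_a, n_b, temperature=300.0):
    """Standard error of the free energy difference in kcal/mol."""
    kt = K_B * temperature
    return kt * math.sqrt(trigamma_int(n_a) + trigamma_int(n_b))


# Standard errors at 300 K for equal numbers of forward and reverse transitions
for n in (1, 2, 4, 10):
    print(n, round(free_energy_se(n, n), 3))
```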
Table 2. Standard Errors at 300 K (Mean ± Error) for Processes with Numbers of Transitions from A to B (nA) and from B to A (nB).
For a system with multiple states (e.g., A, B, and C), it is possible to calculate confidence intervals and standard errors as described above. However, it is necessary to count only the accomplished transitions between states. For example, when calculating ΔGA→C, it is necessary to count the process with transitions A → B → A → B → C as a single accomplished transition from A to C. The resulting numbers of accomplished transitions nA and nC can be used in eqs 1 and 2 to obtain the CI of KA→C and ΔGA→C (see Supporting Information for a demonstration of the fact that the sum of tAi and the sum of tCi follow the Gamma(shape = nA, ...) and Gamma(shape = nC, ...) distributions, respectively).
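Counting accomplished transitions can be sketched as follows. This is a hypothetical helper, not the procedure used in this work: it walks through a sequence of visited states, ignores every state other than the two end states of interest, and counts a transition only when the system arrives in one end state after having last visited the other.

```python
def count_accomplished(states, source, target):
    """Count accomplished source->target and target->source transitions.

    states: sequence of state labels in the order they were visited
    Intermediate states (anything other than source and target) are skipped.
    """
    n_forward = 0
    n_reverse = 0
    last_end_state = None  # the end state (source or target) visited most recently
    for state in states:
        if state not in (source, target):
            continue
        if last_end_state == source and state == target:
            n_forward += 1
        elif last_end_state == target and state == source:
            n_reverse += 1
        last_end_state = state
    return n_forward, n_reverse


# The example from the text: A -> B -> A -> B -> C is one accomplished A -> C transition
print(count_accomplished(list("ABABC"), "A", "C"))
```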
As an alternative, for a system with multiple states, it is possible to calculate the free energy differences for states with direct transitions (e.g., A to B and B to C, for a system with A ⇌ B ⇌ C transitions) separately and combine errors as $\sigma_{A \to C} = \sqrt{\sigma_{A \to B}^2 + \sigma_{B \to C}^2}$. This approach is rigorous for variables with normal distribution. It is possible to apply it to systems with a high number of transitions, because the distribution of K̂/K becomes a close-to-normal distribution according to the central limit theorem.
3. Computational Details
Numerical simulations with random numbers were performed in R 3.4.4.5 Molecular dynamics simulation of glycerol in water was conducted using GROMACS 2018.6. Relevant preparation steps were conducted using GROMACS 2022.3.6 The system was prepared for MD simulation as follows. The starting structure of sn-glycerol was obtained by conversion of a SMILES file to PDB format using Open Babel.7 The topology was built manually according to Glycam06.8 Partial atomic charges were calculated at the HF/6-31G*//HF/6-31G* level of theory using the RESP method.9 The simulation box was cubic with an edge length of 2.94013 nm. The molecule was solvated by TIP3P water. No ions were added, as the net charge of the system was zero. The potential energy of the system was then minimized using the steepest descent method until the maximum force acting on any atom was lower than 1000 kJ/mol/nm. This step was followed by isothermal-isochoric equilibration and isothermal–isobaric equilibration, each at 300 K for 100 ps.
Then, a 1 μs long MD simulation was conducted. At the beginning of the simulation, the velocities of atoms were generated randomly from the Maxwell distribution for 300 K. In the relevant steps of equilibration and the MD run, the following parameters were used: the leapfrog integrator (md) and a cutoff radius of 1 nm for short-range electrostatic and van der Waals interactions (spanning the whole molecule of glycerol). The Particle Mesh Ewald method10 was used for computing long-range electrostatic interactions. Temperature coupling was conducted using the Parrinello–Bussi thermostat11 (300 K) and pressure coupling was conducted with the Parrinello–Rahman barostat12 (1 bar). After obtaining the trajectory of glycerol from the MD run, the values of torsion angles in the molecule were computed every 1 ps using Plumed.13
Trajectories of fast-folding miniproteins were taken from literature.14
4. Results
The easiest way to test the concept is to generate the first passage times as exponentially distributed random numbers. This was performed in the R software (see Supporting Information). We tested scenarios with nA = nB set to 1 to 20 and with K set to 1 to 1,000. The values of k–1 and k1 were set to 1 and K, respectively (in arbitrary units). For each pair of n and K we generated 10,000 sets of first passage times and calculated the 95% CI of K and the 95% CI of the free energy difference (by eqs 1 and 2, respectively). We expect the rate of type I errors (i.e., the fraction of trials for which the predefined K lies outside the calculated CI) to be 5%. Indeed, the type I error rate ranged from 4.50 to 5.52% with a median of 4.99% (Figure 2). Similar results were observed for nA = nB + 1 (see Supporting Information).
Figure 2.
Rates of type I errors (in %) for different nA = nB and K in simulations using generated random numbers with exponential distribution.
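A test of this kind can be reproduced in outline with the Python standard library. The sketch below is not the R script from Supporting Information. It is an independent illustration with hypothetical function names, a smaller number of trials, and an F-distribution quantile computed from a binomial sum (valid because the degrees of freedom are even) followed by bisection.

```python
import math
import random


def f_cdf_even(x, n1, n2):
    """CDF of the F-distribution with d1 = 2*n1 and d2 = 2*n2 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    z = n1 * x / (n1 * x + n2)
    m = n1 + n2 - 1
    return sum(math.comb(m, j) * z**j * (1.0 - z) ** (m - j) for j in range(n1, m + 1))


def f_quantile_even(p, n1, n2):
    """Quantile of the same F-distribution by bisection on a log scale."""
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if f_cdf_even(mid, n1, n2) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def type1_error_rate(n, true_k, trials, seed=1):
    """Fraction of trials in which the 95% CI of K misses the true K (nA = nB = n)."""
    rng = random.Random(seed)
    q_low = f_quantile_even(0.025, n, n)
    q_high = f_quantile_even(0.975, n, n)
    k1, k_minus1 = true_k, 1.0  # rate constants, arbitrary units
    misses = 0
    for _ in range(trials):
        t_a = sum(rng.expovariate(k1) for _ in range(n))        # times spent in A
        t_b = sum(rng.expovariate(k_minus1) for _ in range(n))  # times spent in B
        k_hat = t_b / t_a
        if not (k_hat / q_high <= true_k <= k_hat / q_low):
            misses += 1
    return misses / trials


for n in (1, 2, 4):
    print(n, type1_error_rate(n, true_k=10.0, trials=2000))
```

With a finite number of trials, the printed rates scatter around the nominal 5%.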
Numerical support for eq 5 for the calculation of standard errors is demonstrated in Figure 3. For each nA and nB we generated n × 10,000 sets of first passage times and calculated the standard error by eq 5 and numerically as the standard deviation of 10,000 calculated values of ΔG0. The two sets of values are in perfect agreement.
Figure 3.
Standard errors calculated by eq 5 (lines) and numerically (points) for nA = nB (A) and nA = nB + 1 (B).
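The comparison of the analytical standard error with a numerical one can be sketched in Python as follows. This is an illustration with hypothetical function names and a fixed random seed, not the script used to produce Figure 3. It assumes that the equilibrium constant is estimated as the ratio of the estimated rate constants, so that the true K equals 1 when both rate constants are set to 1.

```python
import math
import random
import statistics

K_B = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol K)


def trigamma_int(n):
    """Trigamma function for a positive integer n."""
    return math.pi**2 / 6.0 - sum(1.0 / (k * k) for k in range(1, n))


def analytic_se(n_a, n_b, temperature=300.0):
    """Standard error from the trigamma expression, in kcal/mol."""
    return K_B * temperature * math.sqrt(trigamma_int(n_a) + trigamma_int(n_b))


def simulated_se(n_a, n_b, trials, temperature=300.0, seed=1):
    """Standard deviation of simulated free energy estimates, in kcal/mol."""
    rng = random.Random(seed)
    kt = K_B * temperature
    values = []
    for _ in range(trials):
        t_a = sum(rng.expovariate(1.0) for _ in range(n_a))
        t_b = sum(rng.expovariate(1.0) for _ in range(n_b))
        k_hat = (n_a * t_b) / (n_b * t_a)  # ratio of estimated rate constants
        values.append(-kt * math.log(k_hat))
    return statistics.stdev(values)


for n_a, n_b in ((1, 1), (3, 3), (4, 3)):
    print(n_a, n_b, round(analytic_se(n_a, n_b), 3), round(simulated_se(n_a, n_b, 10000), 3))
```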
Our approach was demonstrated on two types of molecular systems. An example of a molecular system with multiple states is a glycerol molecule in water.15 Each of the two O–C–C–O torsion angles can adopt three minima. This gives nine combinations; however, three pairs are equivalent as a result of the symmetry of the molecule; thus, six conformers can be experimentally resolved. Equilibria of these conformers have been studied by molecular simulations as well as experimentally.15,16
Here, we performed a 1 μs simulation of glycerol in water and calculated the equilibria for six conformers. Consistent with simulation and experimental studies15,16 we observed conformer populations αγ > αβ > αα > βγ > γγ > ββ ∼ γγ. Confidence intervals for all conformers relative to αγ were calculated at times 5, 10, 20, 50, 100, 200, 500, and 1,000 ns. In total, 37 confidence intervals were calculated (eight for each conformer except αγ, confidence intervals for γγ were not available at 5, 10, and 20 ns due to no sampling). These confidence intervals were compared with the value of ΔG calculated from the whole trajectory. Since we do not know the exact value of ΔG, we used this value as a “ground truth”.
One of these 37 confidence intervals did not span the ground truth ΔG. This was the case for ΔG of βγ at 50 ns (Figure 4). The distance of ΔG from the confidence interval was very small. Figures for other conformers are available in Supporting Information. Since we compare the confidence intervals with the estimate of ΔG calculated for the whole trajectory, not the exact value of ΔG, we expect the rate of type I errors to be lower than 5%. This is in agreement with the observed 1/37 (2.7%).
Figure 4.

Confidence intervals of free energy of βγ conformer of glycerol, relative to αγ. The value of free energy calculated for the whole simulation is depicted as a blue line. A confidence interval that does not span this value is depicted in red.
The main prerequisite of the above-outlined approach is Markovianity of the processes studied. This is usually fulfilled for transitions associated with a single energy barrier, such as simple conformational changes or chemical reactions. More complex transitions, such as protein folding, can be non-Markovian.17 The method of molecular dynamics simulation is Markovian because each state of the molecular system depends solely on the previous state, not on the history. However, coarse graining of the system representation into a few substates (e.g., folded and unfolded, or ligand-bound and unbound) and ignoring the complex kinetics within these states may cause the Markovianity condition to be violated.
Keeping in mind the limitation of the Markovianity prerequisite, we applied our approach to the folding and unfolding trajectories of fast-folding miniproteins.14 Trajectories were kindly provided by D. E. Shaw Research. Root-mean-square deviation (RMSD) profiles from the native structure were calculated, and folded and unfolded states were assigned by visual inspection of RMSD profiles and trajectories. Folding free energies were estimated as $\Delta\hat{G}_{\mathrm{folding}} = -kT \log \hat{K}$ for the whole trajectories (multiple trajectories of the same system were combined). These values were used as a ground truth. They were compared with $\Delta\hat{G}_{\mathrm{folding}}$ calculated at 20 points equally distributed along each trajectory. The ground truth was outside the 95% confidence intervals for six of 432 (1.39%) values. These confidence intervals are shown in Figure 5. Corresponding plots for other systems can be found in Supporting Information.
Figure 5.
Confidence intervals of folding free energies for (A) NTL9 (simulation 2), (B) protein G (simulation 0), and (C) α3D (simulation 1). Top part of each subplot shows calculated folding free energy with confidence intervals depicted as error bars. The value of folding free energy calculated for the whole simulation is depicted as a blue line. Confidence intervals that do not span this value are depicted in red. Bottom parts show RMSD profiles. Folded states are highlighted by gray background.
Since we do not know the exact value of ΔGfolding and we used the value calculated for the whole simulation as the ground truth, we expect the rate of type I errors (values outside the 95% CI) to be lower than 5%, which is in good agreement with the observed 1.39%. Furthermore, all values outside the confidence intervals were located very close to them.
5. Discussion
Our results show that relatively accurate predictions of ΔG can be obtained from simulations with a relatively low number of transitions between the states. This finding may reduce the costs and increase the efficiency of future applications of molecular simulations in ligand design and other fields.
Most important, in our opinion, is the fact that the method is very easy to use. It requires counting of transitions between states of the system and looking up confidence intervals from a table or by a simple program. It must be kept in mind that automated identification of transitions between states can be challenging. While for the glycerol system presented above it was possible to decide the state based on torsion angles, for fast-folding miniproteins, it was necessary to count transitions “manually” based on visual inspection of the trajectories, because the value of RMSD from the native structure cannot strictly distinguish the folded and unfolded states.
It is possible to generalize the concept to binding, such as simulation of protein–ligand interactions. The dissociation constant of a complex PL (the equilibrium constant of PL ⇌ P + L) can be expressed as Kd = cL∑tP/∑tPL, where cL is the concentration of the ligand in the simulation box in the unbound state. The value of ΔG0 is calculated as kT log Kd. Therefore, the accuracy of ∑tP/∑tPL (and thus of the resulting Kd and ΔGbind) can be calculated as described above.
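As a worked illustration of the binding case, the following Python sketch converts the times spent in the unbound and bound states into a dissociation constant and a binding free energy. Several points are assumptions of the example rather than part of the method: the function names, the box volume and the times, a single ligand molecule with the whole box volume accessible to it, and a standard concentration of 1 mol/L used to make Kd dimensionless inside the logarithm.

```python
import math

K_B = 0.0019872041        # Boltzmann (gas) constant in kcal/(mol K)
AVOGADRO = 6.02214076e23  # Avogadro constant in 1/mol


def ligand_concentration(box_volume_nm3):
    """Molar concentration of a single ligand molecule in a box of the given volume."""
    litres = box_volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return 1.0 / (AVOGADRO * litres)


def binding_free_energy(t_unbound, t_bound, box_volume_nm3, temperature=300.0):
    """Return (Kd in mol/L, kT log Kd in kcal/mol), with Kd taken relative to 1 mol/L."""
    kd = ligand_concentration(box_volume_nm3) * t_unbound / t_bound
    return kd, K_B * temperature * math.log(kd)


# Hypothetical numbers: a 5 nm cubic box, 200 ns unbound and 800 ns bound in total
kd, dg_bind = binding_free_energy(200.0, 800.0, box_volume_nm3=125.0)
print(kd, dg_bind)
```

Because cL is a fixed number, the uncertainty of the result comes only from the ratio of the two times, so the interval widths obtained from the numbers of binding and unbinding events apply unchanged.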
The concept described above was presented for one long simulation with multiple transitions between the states of the system. An alternative design is to run a series of nA independent simulations starting from state A until they transit to state B and a series of nB simulations starting from state B until they transit to state A. The value of K can be estimated as

$$\hat{K} = \frac{n_A \sum_{i=1}^{n_B} t_{B,i}}{n_B \sum_{i=1}^{n_A} t_{A,i}} \qquad (6)$$
The confidence intervals and standard errors can be estimated by the equations presented above.
However, we must warn readers who are not patient enough to run all simulations until all transitions are observed. The method described above can be used only when all simulations result in transitions from A to B or from B to A.
Let us consider the situation when 10 simulations starting from A are performed, but only one of them (i = 1) results in a transition to B. The other nine simulations end at time tmax without a transition. Some users might be tempted to estimate the rate constant k1 as 1/tA1 and to discard the results of the nine unsuccessful simulations. However, k1 is likely to be significantly overestimated because nine of the 10 simulations did not reach state B. The correct estimator of k1 with unfinished simulations (right-censored t) is

$$\hat{k}_1 = \frac{n_A}{\sum_{i=1}^{n_A} t_{A,i} + m_A t_{\max}} \qquad (7)$$
where nA is the number of simulations in which the transition from A to B was observed (indexes i) and mA is the number of simulations in which the time tmax is reached without a transition.
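Therefore, K can be estimated as

$$\hat{K} = \frac{n_A \left(\sum_{i=1}^{n_B} t_{B,i} + m_B t_{\max}\right)}{n_B \left(\sum_{i=1}^{n_A} t_{A,i} + m_A t_{\max}\right)} \qquad (8)$$

where mB is defined analogously to mA for the simulations starting from state B.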
Derivation of equations for this design of simulations is presented in Supporting Information. The fact that some long simulations starting from state A fail to reach state B, while others reach it quickly, can also be a signature of non-Markovianity.
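The effect of discarding unfinished simulations can be illustrated numerically. In the sketch below, the function names and all times are hypothetical; the rate estimator is the standard maximum-likelihood estimator for exponentially distributed times with right censoring at tmax, which is the form described above.

```python
def censored_rate(observed_times, n_censored, t_max):
    """Rate constant from observed transition times plus runs stopped at t_max."""
    n_observed = len(observed_times)
    return n_observed / (sum(observed_times) + n_censored * t_max)


def censored_equilibrium_constant(times_a, m_a, times_b, m_b, t_max):
    """Estimate of K as the ratio of the censored rate estimates k1 / k-1."""
    k1 = censored_rate(times_a, m_a, t_max)
    k_minus1 = censored_rate(times_b, m_b, t_max)
    return k1 / k_minus1


# Hypothetical case from the text: 10 runs from A, one transition (here at 20 ns),
# nine runs stopped at t_max = 100 ns without a transition
naive = 1.0 / 20.0
corrected = censored_rate([20.0], n_censored=9, t_max=100.0)
print(naive, corrected, naive / corrected)
```

In this made-up example the naive estimate, which ignores the nine unfinished runs, exceeds the censoring-aware estimate by a factor of 46.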
The results show that our approach can be applied to fast-folding mini-proteins. Either the degree of non-Markovianity in these systems is not high enough to significantly affect the performance of our approach or this approach is robust enough for the non-Markovianity typical for biomolecular systems.
In principle, non-Markovianity can be tested by methods of inferential statistics, such as the Kolmogorov–Smirnov test (or some other goodness-of-fit test), to assess the deviation from the exponential distribution. However, we would like to stress that the power of the Kolmogorov–Smirnov test is rather low, i.e., it can reject Markovianity only if there are enough samples (enough transitions). With few transitions, the test can neither reject nor confirm Markovianity.
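A one-sample Kolmogorov–Smirnov check of dwell times against an exponential distribution can be sketched with the Python standard library. This is a rough illustration with several caveats: the function name and the example data are hypothetical, the p-value uses a common asymptotic approximation of the Kolmogorov distribution, and estimating the rate from the same data (as done by default here) makes the nominal p-value too large, so the test becomes even less likely to reject.

```python
import math


def ks_exponential(times, rate=None):
    """Return the KS statistic D and an approximate p-value for exponentiality."""
    n = len(times)
    if rate is None:
        rate = n / sum(times)  # assumption: rate estimated from the same data
    d = 0.0
    for i, t in enumerate(sorted(times), start=1):
        cdf = 1.0 - math.exp(-rate * t)
        # largest gap between the empirical and the exponential CDF
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        p_value = 1.0  # the series converges slowly here and p is close to 1
    else:
        series = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2) for j in range(1, 101))
        p_value = min(1.0, max(0.0, 2.0 * series))
    return d, p_value


# Hypothetical dwell times (ns) from a run with only four transitions
print(ks_exponential([9.0, 10.0, 11.0, 10.0]))
```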
The problem of non-Markovianity can be solved by dissection of sampled states into a minimal set of substates for which mutual transitions are Markovian. This approach is used when building Markov state models.18 Alternatively, it would be possible to estimate the true distribution of tA and tB for non-Markovian systems and derive corresponding distributions for K and ΔG.
We argue that the approach can be generalized to biased simulations with a static bias potential, such as single-replica umbrella sampling.19 Adapting the approach to methods with a time-dependent bias potential, such as metadynamics,20 will be the subject of future research.
Supporting Information contains commands to calculate confidence intervals and standard errors in various programming languages. Data necessary to reproduce all calculations are available online via Zenodo (dx.doi.org/10.5281/zenodo.7610654). An online calculator of CIs and standard errors is available at https://jumpcount.cz.
Acknowledgments
The authors would like to thank D. E. Shaw Research for providing us with the trajectories. The work was supported by the Czech Science Foundation (22-29667S and 19-16857S). Computational resources were supplied by the project “e-Infrastruktura CZ” (e-INFRA CZ LM2018140) and ELIXIR-CZ project (LM2023055) supported by the Ministry of Education, Youth and Sports of the Czech Republic.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.2c01237.
Testing of the method on random numbers with exponential distribution, detailed results of simulation of aqueous glycerol, full results of application of the method on fast-folding miniproteins, and instructions to calculate errors in various programming languages (PDF)
The authors declare no competing financial interest.
References
- Flyvbjerg H.; Petersen H. G. Error estimates on averages of correlated data. J. Chem. Phys. 1989, 91, 461–466. 10.1063/1.457480.
- Bussi G.; Tribello G. A. In Biomolecular Simulations: Methods and Protocols; Bonomi M., Camilloni C., Eds.; Springer: New York, NY, 2019; pp 529–578.
- Klimovich P. V.; Shirts M. R.; Mobley D. L. Guidelines for the analysis of free energy calculations. J. Comput. Aided Mol. Des. 2015, 29, 397–411. 10.1007/s10822-015-9840-9.
- Fisher R. A. On the interpretation of χ2 from contingency tables and the calculation of P. J. R. Stat. Soc. 1922, 85, 87–94. 10.2307/2340521.
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.
- Hess B.; Kutzner C.; van der Spoel D.; Lindahl E. GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 2008, 4, 435–447. 10.1021/ct700301q.
- O’Boyle N. M.; Banck M.; James C. A.; Morley C.; Vandermeersch T.; Hutchison G. R. Open Babel: An open chemical toolbox. J. Cheminform. 2011, 3, 33. 10.1186/1758-2946-3-33.
- Kirschner K. N.; Yongye A. B.; Tschampel S. M.; González-Outeiriño J.; Daniels C. R.; Foley B. L.; Woods R. J. GLYCAM06: A generalizable biomolecular force field. Carbohydrates. J. Comput. Chem. 2008, 29, 622–655. 10.1002/jcc.20820.
- Bayly C. I.; Cieplak P.; Cornell W.; Kollman P. A. A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. J. Phys. Chem. 1993, 97, 10269–10280. 10.1021/j100142a004.
- Darden T.; York D.; Pedersen L. Particle mesh Ewald: An N · log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089–10092. 10.1063/1.464397.
- Bussi G.; Donadio D.; Parrinello M. Canonical sampling through velocity rescaling. J. Chem. Phys. 2007, 126, 014101. 10.1063/1.2408420.
- Parrinello M.; Rahman A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 1981, 52, 7182–7190. 10.1063/1.328693.
- Tribello G. A.; Bonomi M.; Branduardi D.; Camilloni C.; Bussi G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604–613. 10.1016/j.cpc.2013.09.018.
- Lindorff-Larsen K.; Piana S.; Dror R. O.; Shaw D. E. How Fast-Folding Proteins Fold. Science 2011, 334, 517–520. 10.1126/science.1208351.
- Yongye A. B.; Foley B. L.; Woods R. J. On Achieving Experimental Accuracy from Molecular Dynamics Simulations of Flexible Molecules: Aqueous Glycerol. J. Phys. Chem. A 2008, 112, 2634–2639. 10.1021/jp710544s.
- van Koningsveld H. A conformational study on glycerol in a D2O solution by means of 220 Mc PMR data. Recueil des Travaux Chimiques des Pays-Bas 1970, 89, 801–812. 10.1002/recl.19700890806.
- Ayaz C.; Tepper L.; Brünig F. N.; Kappler J.; Daldrop J. O.; Netz R. R. Non-Markovian modeling of protein folding. Proc. Natl. Acad. Sci. U.S.A. 2021, 118, e2023856118. 10.1073/pnas.2023856118.
- Bowman G. R.; Pande V. S.; Noé F. An Introduction to Markov State Models and Their Application to Long Timescale Molecular Simulation; Springer: Dordrecht, 2014.
- Torrie G.; Valleau J. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys. 1977, 23, 187–199. 10.1016/0021-9991(77)90121-8.
- Laio A.; Parrinello M. Escaping free-energy minima. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12562–12566. 10.1073/pnas.202427399.