2023 Oct 14;6:222. doi: 10.1038/s42004-023-01019-9

Table 3.

Summarizing the reproducibility of the experimental relative binding free energies and the accuracy of FEP+.

| Accuracy metric | Experimental survey | FEP+ benchmark |
|---|---|---|
| Pairwise RMSE (kcal mol⁻¹) | 0.91 [0.83, 1.11] | 1.25 [1.17, 1.33] |
| Pairwise MUE (kcal mol⁻¹) | 0.67 [0.61, 0.83] | 0.98 [0.91, 1.05] |
| Edgewise RMSE (kcal mol⁻¹) | N/A | 1.17 [1.08, 1.25] |
| Edgewise MUE (kcal mol⁻¹) | N/A | 0.91 [0.84, 0.98] |
| R² | 0.79 [0.75, 0.82] | 0.56 [0.51, 0.60] |
| Kendall τ | 0.71 [0.65, 0.74] | 0.51 [0.48, 0.55] |

Every metric value, such as RMSE or R², is a weighted average. For the pairwise errors, R², and Kendall τ, each weight is the number of compounds in the assay (experimental survey) or in the FEP+ graph (FEP+ benchmark). For the edgewise errors, each weight is the number of edges in the FEP+ graph. Square brackets enclose 95% confidence intervals calculated by bootstrap sampling over the pairs of experimental series or the FEP+ graphs. Because the edgewise error depends on the topology of an FEP+ graph, it has no equivalent metric in the experimental survey.
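The weighting and bootstrap described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a percentile bootstrap with 2000 resamples, which the footnote does not specify. The function names and the per-graph values are hypothetical and are not data from the paper.

```python
import random


def weighted_mean(values, weights):
    """Weighted average of per-group metric values (e.g. one RMSE per FEP+ graph)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def bootstrap_ci(values, weights, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval from resampling whole groups with replacement.

    Assumption: percentile method and resample count are illustrative choices.
    """
    rng = random.Random(seed)
    n = len(values)
    stats = []
    for _ in range(n_resamples):
        # Draw n groups with replacement, keeping each value paired with its weight.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            weighted_mean([values[i] for i in idx], [weights[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((1 - level) / 2 * (n_resamples - 1))]
    upper = stats[int((1 + level) / 2 * (n_resamples - 1))]
    return lower, upper


# Hypothetical per-graph RMSE values (kcal/mol) and compound counts.
rmse_per_graph = [1.0, 1.5, 0.8, 1.3]
n_compounds = [10, 30, 20, 40]

point = weighted_mean(rmse_per_graph, n_compounds)
low, high = bootstrap_ci(rmse_per_graph, n_compounds)
print(f"{point:.2f} [{low:.2f}, {high:.2f}]")
```

For an edgewise error, the same functions would apply with the number of edges in each graph passed as the weights instead of the compound counts.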