2022 Oct 14;36(10):707–734. doi: 10.1007/s10822-022-00462-5

Table 3.

Error metrics for all (ranked and non-ranked) SAMPL8 methods for all host–guest systems

ID sid RMSE (kcal/mol) MAE (kcal/mol) ME (kcal/mol) R² m τ
CB8
 SILCS/LGFE/TIP3P/GCMC-MD/rew 5 1.96 [1.09, 4.61] 1.71 [0.87, 4.09] − 0.23 [− 2.76, 2.29] 0.38 [0.00, 0.96] 0.58 [− 0.71, 1.80] 0.52 [− 0.58, 1.00]
 DDM/FEP/MBAR/FM/RW[pm6s6]* 28 2.46 [1.26, 3.94] 2.03 [1.03, 3.50] 0.68 [− 1.37, 2.71] 0.59 [0.03, 0.98] 1.22 [− 0.05, 2.32] 0.52 [− 0.33, 1.00]
 ML/NNET/CORINA-descriptors2 10 2.73 [1.58, 4.18] 2.41 [1.29, 3.85] − 1.43 [− 3.42, 0.66] 0.01 [0.00, 0.95] 0.04 [− 0.71, 0.63] 0.14 [− 1.00, 1.00]
 SILCS/LGFE/TIP3P/GCMC-MD* 3 3.06 [1.65, 5.07] 2.59 [1.31, 4.64] − 2.46 [− 4.50, − 0.44] 0.40 [0.00, 0.96] 0.29 [− 0.49, 1.19] 0.43 [− 0.65, 1.00]
 SILCS/LGFE/TIP3P/GCMC-MD/G5 4 3.16 [1.69, 5.19] 2.67 [1.35, 4.73] − 2.53 [− 4.61, − 0.46] 0.33 [0.00, 0.96] 0.27 [− 0.54, 1.12] 0.43 [− 0.71, 1.00]
 DDM/FEP/MBAR/Paramchem/FM/[gfn2gfn2] 19 3.47 [2.19, 4.94] 3.23 [1.89, 4.67] − 1.48 [− 3.93, 1.12] 0.43 [0.00, 0.97] − 0.32 [− 0.94, 0.39] − 0.33 [− 1.00, 0.56]
 ML/NNET/CORINA-descriptors1 9 3.50 [2.11, 4.96] 3.02 [1.58, 4.66] − 2.78 [− 4.59, − 0.87] 0.14 [0.00, 0.95] 0.15 [− 0.47, 0.71] 0.33 [− 0.71, 1.00]
 DDM/FEP/MBAR/FM/[pm6s6] 22 3.60 [2.36, 4.99] 3.23 [1.91, 4.74] 3.23 [1.66, 4.73] 0.64 [0.04, 0.99] 0.92 [0.08, 1.69] 0.62 [− 0.20, 1.00]
 DDM/FEP/MBAR/FM/RW/[wb97xd,s6] 29 3.67 [2.37, 5.12] 3.29 [1.93, 4.85] 2.89 [0.83, 4.79] 0.38 [0.00, 0.98] 0.75 [− 0.29, 1.83] 0.33 [− 0.53, 1.00]
 DDM/FEP/MBAR/Paramchem/FM/[C36S6] 18 3.73 [2.21, 5.28] 3.17 [1.65, 4.94] 3.08 [1.17, 4.89] 0.41 [0.01, 0.96] 0.74 [− 0.31, 1.82] 0.33 [− 0.33, 1.00]
 DDM/FEP/MBAR/FM/[mp2,b3lyp]* 15 3.77 [2.48, 5.26] 3.39 [2.05, 4.94] 2.50 [0.09, 4.66] 0.20 [0.00, 0.96] 0.57 [− 0.34, 2.40] 0.52 [− 0.33, 1.00]
 DDM-SAMS/GAFF-DMBIS/TIP3P/MCMC-SAMS/* 2 3.82 [2.33, 5.43] 3.25 [1.77, 5.08] 1.74 [− 1.08, 4.29] 0.11 [0.00, 0.95] 0.49 [− 1.29, 2.47] 0.05 [− 0.67, 0.88]
 APR/GAFF2-AM1BCC/TIP3P/US/TI** 33 3.85 [2.35, 5.46] 3.38 [1.90, 5.15] 2.87 [0.65, 5.01] 0.53 [0.08, 0.97] 1.18 [0.29, 2.75] 0.52 [− 0.20, 1.00]
 APR/GAFF2-AM1BCC/TIP3P/US/MBAR** 34 3.89 [2.32, 5.60] 3.41 [1.89, 5.19] 2.86 [0.59, 5.03] 0.63 [0.21, 0.96] 1.40 [0.54, 2.91] 0.52 [− 0.20, 1.00]
 DDM/FEP/MBAR/Paramchem/FM/RW/[wb97xd] 26 3.90 [2.08, 5.96] 3.41 [1.81, 5.33] 0.19 [− 3.01, 3.07] 0.53 [0.05, 0.97] 1.68 [0.12, 3.21] 0.62 [− 0.18, 1.00]
 GFN2-xTB/MetaMD/GBSA/ensemble/Nobuffer* 1 4.06 [2.54, 5.68] 3.60 [2.00, 5.31] 2.89 [0.46, 5.15] 0.01 [0.00, 0.92] 0.08 [− 0.89, 1.12] − 0.05 [− 0.87, 0.79]
 APR/OPENFF1.2.0-AM1BCC/TIP3P/US/TI** 31 4.07 [2.20, 5.89] 3.18 [1.53, 5.38] 2.89 [0.53, 5.22] 0.11 [0.00, 0.89] 0.38 [− 0.50, 1.58] 0.24 [− 0.58, 0.88]
 APR/OPENFF1.2.0-AM1BCC/TIP3P/US/MBAR** 32 4.14 [2.47, 5.88] 3.45 [1.81, 5.47] 3.21 [0.97, 5.40] 0.14 [0.00, 0.92] 0.40 [− 0.48, 1.55] 0.24 [− 0.65, 0.88]
 US/GAFF-AM1BCC/TIP3P/HRE-MD/emp_corr* 11 4.15 [1.96, 6.42] 3.37 [1.60, 5.65] 2.20 [− 0.70, 5.08] 0.74 [0.28, 0.99] 2.00 [0.86, 3.85] 0.43 [− 0.20, 1.00]
 LiGaMD/q4MD/TIP4P/enhanced-sampling 16 4.28 [2.43, 6.39] 3.60 [1.90, 5.80] − 2.35 [− 5.22, 0.57] 0.03 [0.00, 0.95] − 0.17 [− 1.82, 0.97] − 0.24 [− 1.00, 0.76]
 DDM/FEP/MBAR/C36 17 4.44 [2.51, 6.23] 3.76 [1.91, 5.78] 3.58 [1.33, 5.74] 0.24 [0.00, 0.93] 0.60 [− 0.42, 1.85] 0.24 [− 0.41, 1.00]
 GAFF-RESP/TIP3P/MD/xtb-GFN2B/Boltz-Avg 8 4.55 [2.64, 6.42] 3.98 [2.15, 5.93] − 0.92 [− 4.14, 2.81] 0.00 [0.00, 0.95] 0.04 [− 1.70, 1.68] 0.14 [− 1.00, 1.00]
 GAFF-RESP/TIP3P/MD-Classical/xtb-GFN2B* 7 4.60 [2.50, 6.87] 3.87 [2.08, 6.13] 1.50 [− 1.90, 4.90] 0.01 [0.00, 0.94] − 0.18 [− 1.62, 1.48] − 0.24 [− 1.00, 0.60]
 DDM/FEP/MBAR/FM/[mp2s6] 21 4.68 [3.08, 6.22] 4.09 [2.36, 5.96] 4.09 [2.00, 5.93] 0.37 [0.00, 0.96] 0.74 [− 0.30, 2.02] 0.43 [− 0.47, 1.00]
 MD/fmB3LYP(H)-fmMP2(G)/TIP3P/REUS/* 13 4.68 [3.19, 6.20] 4.27 [2.64, 5.95] 2.52 [− 0.86, 5.26] 0.16 [0.00, 0.94] 0.74 [− 0.37, 3.25] 0.33 [− 0.37, 1.00]
 DDM/FEP/MBAR/FM/[gfn2,s6] 23 5.13 [3.64, 6.68] 4.79 [3.09, 6.47] 4.50 [2.23, 6.45] 0.23 [0.00, 0.98] 0.52 [− 0.72, 1.52] 0.43 [− 0.60, 1.00]
 DDM/FEP/MBAR/FM/RW/[blyp,s6] 24 5.13 [2.52, 8.00] 4.28 [2.32, 6.87] 0.08 [− 4.29, 3.49] 0.42 [0.01, 0.98] 1.80 [− 0.14, 4.06] 0.52 [− 0.26, 1.00]
 DDM/FEP/MBAR/FM/[pm6pm6] 20 5.29 [2.87, 7.44] 4.22 [1.99, 6.80] − 3.94 [− 6.68, − 1.10] 0.13 [0.00, 0.94] − 0.31 [− 1.63, 0.35] − 0.52 [− 1.00, 0.53]
 DDM/FEP/MBAR/FM/RW/[blyp,s6BLUR] 25 5.53 [3.44, 7.69] 5.01 [3.11, 7.11] 3.90 [0.65, 6.83] 0.56 [0.06, 0.95] 1.73 [0.34, 3.63] 0.52 [− 0.29, 1.00]
 ABFE/Parsley-GAFF-BCC/TIP3P/MD/NoBuffer2 14 5.72 [3.24, 13.27] 5.16 [2.57, 11.84] 5.16 [− 1.01, 11.36] 0.22 [0.00, 0.95] 0.51 [− 2.41, 3.74] 0.33 [− 0.79, 1.00]
 ABFE/Parsley-GAFF-BCC/TIP3P/MD/NoBuffer1* 30 5.72 [3.22, 13.09] 5.16 [2.53, 11.73] 5.16 [− 1.01, 11.30] 0.22 [0.00, 0.95] 0.51 [− 2.41, 3.72] 0.33 [− 0.79, 1.00]
 DDM/FEP/MBAR/FM/RW/[wb97xdBLUR] 27 5.98 [4.03, 7.91] 5.54 [3.65, 7.52] 4.40 [0.97, 7.30] 0.62 [0.06, 0.97] 1.92 [0.34, 3.70] 0.43 [− 0.29, 1.00]
 EE-MCC/GAFF2-AM1-BCC/TIP3P/MD/* 6 6.64 [4.39, 8.82] 5.97 [3.63, 8.42] 5.97 [3.39, 8.42] 0.48 [0.04, 0.95] 1.21 [0.11, 2.65] 0.39 [− 0.29, 1.00]
 US/GAFF-AM1BCC/TIP3P/HRE-MD 12 9.36 [5.05, 18.19] 8.80 [4.07, 16.52] 8.80 [1.24, 16.37] 0.70 [0.00, 0.97] 1.77 [− 1.76, 5.59] 0.52 [− 0.60, 1.00]
GDCC–TEMOA and TEETOA
 DDM/AMOEBA/BAR* 44 0.88 [0.46, 1.56] 0.72 [0.36, 1.36] 0.10 [− 0.72, 0.84] 0.78 [0.20, 0.98] 0.97 [0.43, 1.38] 0.79 [0.24, 1.00]
 ATM/GAFF2-AM1BCC/TIP3P/HREM* 37 1.59 [1.10, 3.96] 1.25 [0.87, 3.43] − 0.39 [− 2.33, 1.83] 0.88 [0.14, 0.98] 1.67 [0.46, 3.03] 0.71 [0.00, 1.00]
 PMF/GAFF2-AM1BCC/TIP3P/MD-US* 38 1.59 [1.13, 4.02] 1.31 [0.88, 3.53] − 0.16 [− 2.17, 2.03] 0.79 [0.06, 0.97] 1.51 [0.30, 2.97] 0.71 [− 0.08, 1.00]
 AM1BCC/MMPBSA/TIP4PEW/MD_NR3 42 1.98 [1.15, 3.28] 1.69 [0.91, 2.96] 0.77 [− 0.82, 2.35] 0.00 [0.00, 0.83] 0.02 [− 0.75, 0.70] 0.18 [− 0.76, 0.82]
 AM1BCC/MMPBSA/TIP4PEW/MD* 43 2.05 [1.20, 3.30] 1.65 [0.91, 2.97] 1.05 [− 0.52, 2.57] 0.02 [0.00, 0.85] 0.06 [− 0.66, 0.75] 0.18 [− 0.71, 0.83]
 AM1BCC/MMPBSA/TIP4PEW/MD_NR2 41 2.10 [1.20, 3.43] 1.69 [0.93, 3.10] 1.04 [− 0.57, 2.67] 0.00 [0.00, 0.84] 0.02 [− 0.75, 0.69] 0.18 [− 0.77, 0.82]
 ML/NNET/CORINA-descriptors-8* 39 2.39 [1.44, 3.85] 2.16 [1.16, 3.51] 0.58 [− 1.44, 2.58] 0.60 [0.00, 0.94] − 0.35 [− 1.13, 0.38] − 0.64 [− 1.00, 0.57]
 SILCS/LGFE/TIP3P/GCMC-MD* 36 2.40 [1.38, 3.75] 2.10 [1.12, 3.38] − 0.24 [− 2.17, 1.70] 0.26 [0.00, 0.89] − 0.32 [− 1.05, 0.47] − 0.29 [− 1.00, 0.57]
 SILCS/LGFE/TIP3P/GCMC-MD_NR 35 2.51 [1.34, 4.50] 1.81 [1.02, 3.90] − 1.69 [− 3.66, 0.26] 0.00 [0.00, 0.88] − 0.01 [− 1.07, 0.92] 0.07 [− 0.83, 0.83]
 APR/OPENFF1.2.0-AM1BCC/TIP3P/US/TI** 48 2.97 [1.27, 4.80] 2.24 [0.99, 4.07] − 0.76 [− 3.09, 1.36] 0.48 [0.09, 0.94] 1.55 [0.43, 3.44] 0.50 [− 0.09, 0.92]
 APR/OPENFF1.2.0-AM1BCC/TIP3P/US/MBAR** 49 2.98 [1.31, 4.86] 2.24 [1.02, 4.07] − 0.83 [− 3.14, 1.24] 0.48 [0.09, 0.92] 1.54 [0.39, 3.48] 0.50 [− 0.08, 0.92]
 APR/GAFF2-AM1BCC/TIP3P/US/MBAR** 51 3.24 [1.27, 5.29] 2.30 [0.97, 4.37] − 1.49 [− 3.82, 0.62] 0.35 [0.02, 0.90] 1.24 [0.07, 3.24] 0.43 [− 0.28, 0.84]
 APR/GAFF2-AM1BCC/TIP3P/US/TI** 50 3.31 [1.41, 5.30] 2.47 [1.11, 4.45] − 1.29 [− 3.68, 0.96] 0.26 [0.00, 0.87] 1.08 [− 0.11, 3.24] 0.29 [− 0.36, 0.83]
 LiGaMD/GAFF2/RESP/TIP4P/Sampling 47 4.48 [1.46, 6.77] 3.07 [1.10, 5.66] 1.72 [− 1.10, 4.79] 0.00 [0.00, 0.76] − 0.02 [− 1.78, 1.92] 0.07 [− 0.67, 0.75]
 DDM/C36/TIP3P/MD/MBAR* 45 4.52 [2.01, 6.71] 3.45 [1.62, 5.84] − 3.45 [− 5.79, − 1.34] 0.04 [0.00, 0.78] 0.35 [− 0.98, 2.00] 0.29 [− 0.57, 0.92]
 MD/ParamChem/TIP3P/REUS/* 46 4.91 [2.50, 7.18] 3.95 [2.02, 6.33] − 3.90 [− 6.29, − 1.69] 0.01 [0.00, 0.68] 0.18 [− 1.18, 1.63] 0.04 [− 0.58, 0.76]
 AM1BCC/MMPBSA/TIP4PEW/MD_NR1 40 9.26 [7.30, 11.28] 8.93 [7.00, 10.95] 8.93 [7.00, 10.95] 0.00 [0.00, 0.73] 0.06 [− 1.18, 0.98] 0.18 [− 0.67, 0.74]

The root mean square error (RMSE), mean absolute error (MAE), signed mean error (ME), coefficient of determination (R²), slope (m), and Kendall's rank correlation coefficient (τ) were computed via bootstrapping with replacement. Results are shown separately for each host category, with the lower and upper bounds of the 95% confidence intervals in brackets. Statistics do not include the optional host–guest systems CB8–G8, CB8–G9, and TEMOA–G3. Each method is assigned a unique submission ID (sid). An asterisk next to a method's name denotes a ranked submission, and a double asterisk denotes a reference calculation.
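As an illustration of how entries of the form "value [lower, upper]" can be obtained, the following is a minimal Python sketch of a percentile bootstrap with replacement over host–guest pairs. It is not the SAMPL8 analysis code. The sign convention for ME (predicted minus experimental), the regression direction (predicted on experimental), the use of the tau-b variant to handle ties in resamples, the number of bootstrap samples, and the example free energies are all assumptions made here for illustration; the example values are synthetic and do not come from the table.

```python
import math
import random


def rmse(pred, expt):
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))


def mae(pred, expt):
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)


def me(pred, expt):
    # Signed mean error; predicted minus experimental is an assumed convention.
    return sum(p - e for p, e in zip(pred, expt)) / len(pred)


def slope_and_r2(pred, expt):
    # Least-squares fit of predicted (y) on experimental (x); assumed direction.
    if len(set(pred)) < 2 or len(set(expt)) < 2:
        return float("nan"), float("nan")  # degenerate resample, fit undefined
    n = len(pred)
    mx, my = sum(expt) / n, sum(pred) / n
    sxx = sum((x - mx) ** 2 for x in expt)
    syy = sum((y - my) ** 2 for y in pred)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expt, pred))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def kendall_tau(pred, expt):
    # Kendall tau-b, which accounts for the ties that resampling introduces.
    conc = disc = tie_x = tie_y = 0
    n = len(pred)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = expt[i] - expt[j], pred[i] - pred[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                tie_x += 1
            elif dy == 0:
                tie_y += 1
            elif (dx > 0) == (dy > 0):
                conc += 1
            else:
                disc += 1
    denom = math.sqrt((conc + disc + tie_x) * (conc + disc + tie_y))
    return (conc - disc) / denom if denom else float("nan")


def bootstrap_ci(pred, expt, stat, n_boot=2000, alpha=0.05, seed=0):
    # Resample (pred, expt) pairs with replacement and take percentile bounds.
    rng = random.Random(seed)
    n = len(pred)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        v = stat([pred[i] for i in idx], [expt[i] for i in idx])
        if not math.isnan(v):
            values.append(v)
    values.sort()
    lo = values[int((alpha / 2) * (len(values) - 1))]
    hi = values[int(round((1 - alpha / 2) * (len(values) - 1)))]
    return lo, hi


# Synthetic binding free energies in kcal/mol (illustrative only).
expt = [-6.5, -7.1, -8.3, -9.0, -10.2, -11.4, -12.0]
pred = [-5.9, -8.0, -7.5, -10.1, -9.6, -12.5, -13.2]

stats = {
    "RMSE": rmse,
    "MAE": mae,
    "ME": me,
    "R2": lambda p, e: slope_and_r2(p, e)[1],
    "m": lambda p, e: slope_and_r2(p, e)[0],
    "tau": kendall_tau,
}
for name, fn in stats.items():
    lo, hi = bootstrap_ci(pred, expt, fn)
    print(f"{name} {fn(pred, expt):.2f} [{lo:.2f}, {hi:.2f}]")
```

With only a handful of guests per host, resamples often contain repeated pairs, which is one reason intervals for R², m, and τ in tables like this one tend to span most of their allowed range.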