Table 3.
Error metrics for all (ranked and non-ranked) SAMPL8 methods for all host–guest systems
| ID | sid | RMSE (kcal/mol) | MAE (kcal/mol) | ME (kcal/mol) | R² | m | Tau |
|---|---|---|---|---|---|---|---|
| CB8 | | | | | | | |
| SILCS/LGFE/TIP3P/GCMC-MD/rew | 5 | 1.96 [1.09, 4.61] | 1.71 [0.87, 4.09] | − 0.23 [− 2.76, 2.29] | 0.38 [0.00, 0.96] | 0.58 [− 0.71, 1.80] | 0.52 [− 0.58, 1.00] |
| DDM/FEP/MBAR/FM/RW[pm6s6]* | 28 | 2.46 [1.26, 3.94] | 2.03 [1.03, 3.50] | 0.68 [− 1.37, 2.71] | 0.59 [0.03, 0.98] | 1.22 [− 0.05, 2.32] | 0.52 [− 0.33, 1.00] |
| ML/NNET/CORINA-descriptors2 | 10 | 2.73 [1.58, 4.18] | 2.41 [1.29, 3.85] | − 1.43 [− 3.42, 0.66] | 0.01 [0.00, 0.95] | 0.04 [− 0.71, 0.63] | 0.14 [− 1.00, 1.00] |
| SILCS/LGFE/TIP3P/GCMC-MD* | 3 | 3.06 [1.65, 5.07] | 2.59 [1.31, 4.64] | − 2.46 [− 4.50, − 0.44] | 0.40 [0.00, 0.96] | 0.29 [− 0.49, 1.19] | 0.43 [− 0.65, 1.00] |
| SILCS/LGFE/TIP3P/GCMC-MD/G5 | 4 | 3.16 [1.69, 5.19] | 2.67 [1.35, 4.73] | − 2.53 [− 4.61, − 0.46] | 0.33 [0.00, 0.96] | 0.27 [− 0.54, 1.12] | 0.43 [− 0.71, 1.00] |
| DDM/FEP/MBAR/Paramchem/FM/[gfn2gfn2] | 19 | 3.47 [2.19, 4.94] | 3.23 [1.89, 4.67] | − 1.48 [− 3.93, 1.12] | 0.43 [0.00, 0.97] | − 0.32 [− 0.94, 0.39] | − 0.33 [− 1.00, 0.56] |
| ML/NNET/CORINA-descriptors1 | 9 | 3.50 [2.11, 4.96] | 3.02 [1.58, 4.66] | − 2.78 [− 4.59, − 0.87] | 0.14 [0.00, 0.95] | 0.15 [− 0.47, 0.71] | 0.33 [− 0.71, 1.00] |
| DDM/FEP/MBAR/FM/[pm6s6] | 22 | 3.60 [2.36, 4.99] | 3.23 [1.91, 4.74] | 3.23 [1.66, 4.73] | 0.64 [0.04, 0.99] | 0.92 [0.08, 1.69] | 0.62 [− 0.20, 1.00] |
| DDM/FEP/MBAR/FM/RW/[wb97xd,s6] | 29 | 3.67 [2.37, 5.12] | 3.29 [1.93, 4.85] | 2.89 [0.83, 4.79] | 0.38 [0.00, 0.98] | 0.75 [− 0.29, 1.83] | 0.33 [− 0.53, 1.00] |
| DDM/FEP/MBAR/Paramchem/FM/[C36S6] | 18 | 3.73 [2.21, 5.28] | 3.17 [1.65, 4.94] | 3.08 [1.17, 4.89] | 0.41 [0.01, 0.96] | 0.74 [− 0.31, 1.82] | 0.33 [− 0.33, 1.00] |
| DDM/FEP/MBAR/FM/[mp2,b3lyp]* | 15 | 3.77 [2.48, 5.26] | 3.39 [2.05, 4.94] | 2.50 [0.09, 4.66] | 0.20 [0.00, 0.96] | 0.57 [− 0.34, 2.40] | 0.52 [− 0.33, 1.00] |
| DDM-SAMS/GAFF-DMBIS/TIP3P/MCMC-SAMS/* | 2 | 3.82 [2.33, 5.43] | 3.25 [1.77, 5.08] | 1.74 [− 1.08, 4.29] | 0.11 [0.00, 0.95] | 0.49 [− 1.29, 2.47] | 0.05 [− 0.67, 0.88] |
| APR/GAFF2-AM1BCC/TIP3P/US/TI** | 33 | 3.85 [2.35, 5.46] | 3.38 [1.90, 5.15] | 2.87 [0.65, 5.01] | 0.53 [0.08, 0.97] | 1.18 [0.29, 2.75] | 0.52 [− 0.20, 1.00] |
| APR/GAFF2-AM1BCC/TIP3P/US/MBAR** | 34 | 3.89 [2.32, 5.60] | 3.41 [1.89, 5.19] | 2.86 [0.59, 5.03] | 0.63 [0.21, 0.96] | 1.40 [0.54, 2.91] | 0.52 [− 0.20, 1.00] |
| DDM/FEP/MBAR/Paramchem/FM/RW/[wb97xd] | 26 | 3.90 [2.08, 5.96] | 3.41 [1.81, 5.33] | 0.19 [− 3.01, 3.07] | 0.53 [0.05, 0.97] | 1.68 [0.12, 3.21] | 0.62 [− 0.18, 1.00] |
| GFN2-xTB/MetaMD/GBSA/ensemble/Nobuffer* | 1 | 4.06 [2.54, 5.68] | 3.60 [2.00, 5.31] | 2.89 [0.46, 5.15] | 0.01 [0.00, 0.92] | 0.08 [− 0.89, 1.12] | − 0.05 [− 0.87, 0.79] |
| APR/OPENFF1.2.0-AM1BCC/TIP3P/US/TI** | 31 | 4.07 [2.20, 5.89] | 3.18 [1.53, 5.38] | 2.89 [0.53, 5.22] | 0.11 [0.00, 0.89] | 0.38 [− 0.50, 1.58] | 0.24 [− 0.58, 0.88] |
| APR/OPENFF1.2.0-AM1BCC/TIP3P/US/MBAR** | 32 | 4.14 [2.47, 5.88] | 3.45 [1.81, 5.47] | 3.21 [0.97, 5.40] | 0.14 [0.00, 0.92] | 0.40 [− 0.48, 1.55] | 0.24 [− 0.65, 0.88] |
| US/GAFF-AM1BCC/TIP3P/HRE-MD/emp_corr* | 11 | 4.15 [1.96, 6.42] | 3.37 [1.60, 5.65] | 2.20 [− 0.70, 5.08] | 0.74 [0.28, 0.99] | 2.00 [0.86, 3.85] | 0.43 [− 0.20, 1.00] |
| LiGaMD/q4MD/TIP4P/enhanced-sampling | 16 | 4.28 [2.43, 6.39] | 3.60 [1.90, 5.80] | − 2.35 [− 5.22, 0.57] | 0.03 [0.00, 0.95] | − 0.17 [− 1.82, 0.97] | − 0.24 [− 1.00, 0.76] |
| DDM/FEP/MBAR/C36 | 17 | 4.44 [2.51, 6.23] | 3.76 [1.91, 5.78] | 3.58 [1.33, 5.74] | 0.24 [0.00, 0.93] | 0.60 [− 0.42, 1.85] | 0.24 [− 0.41, 1.00] |
| GAFF-RESP/TIP3P/MD/xtb-GFN2B/Boltz-Avg | 8 | 4.55 [2.64, 6.42] | 3.98 [2.15, 5.93] | − 0.92 [− 4.14, 2.81] | 0.00 [0.00, 0.95] | 0.04 [− 1.70, 1.68] | 0.14 [− 1.00, 1.00] |
| GAFF-RESP/TIP3P/MD-Classical/xtb-GFN2B* | 7 | 4.60 [2.50, 6.87] | 3.87 [2.08, 6.13] | 1.50 [− 1.90, 4.90] | 0.01 [0.00, 0.94] | − 0.18 [− 1.62, 1.48] | − 0.24 [− 1.00, 0.60] |
| DDM/FEP/MBAR/FM/[mp2s6] | 21 | 4.68 [3.08, 6.22] | 4.09 [2.36, 5.96] | 4.09 [2.00, 5.93] | 0.37 [0.00, 0.96] | 0.74 [− 0.30, 2.02] | 0.43 [− 0.47, 1.00] |
| MD/fmB3LYP(H)-fmMP2(G)/TIP3P/REUS/* | 13 | 4.68 [3.19, 6.20] | 4.27 [2.64, 5.95] | 2.52 [− 0.86, 5.26] | 0.16 [0.00, 0.94] | 0.74 [− 0.37, 3.25] | 0.33 [− 0.37, 1.00] |
| DDM/FEP/MBAR/FM/[gfn2,s6] | 23 | 5.13 [3.64, 6.68] | 4.79 [3.09, 6.47] | 4.50 [2.23, 6.45] | 0.23 [0.00, 0.98] | 0.52 [− 0.72, 1.52] | 0.43 [− 0.60, 1.00] |
| DDM/FEP/MBAR/FM/RW/[blyp,s6] | 24 | 5.13 [2.52, 8.00] | 4.28 [2.32, 6.87] | 0.08 [− 4.29, 3.49] | 0.42 [0.01, 0.98] | 1.80 [− 0.14, 4.06] | 0.52 [− 0.26, 1.00] |
| DDM/FEP/MBAR/FM/[pm6pm6] | 20 | 5.29 [2.87, 7.44] | 4.22 [1.99, 6.80] | − 3.94 [− 6.68, − 1.10] | 0.13 [0.00, 0.94] | − 0.31 [− 1.63, 0.35] | − 0.52 [− 1.00, 0.53] |
| DDM/FEP/MBAR/FM/RW/[blyp,s6BLUR] | 25 | 5.53 [3.44, 7.69] | 5.01 [3.11, 7.11] | 3.90 [0.65, 6.83] | 0.56 [0.06, 0.95] | 1.73 [0.34, 3.63] | 0.52 [− 0.29, 1.00] |
| ABFE/Parsley-GAFF-BCC/TIP3P/MD/NoBuffer2 | 14 | 5.72 [3.24, 13.27] | 5.16 [2.57, 11.84] | 5.16 [− 1.01, 11.36] | 0.22 [0.00, 0.95] | 0.51 [− 2.41, 3.74] | 0.33 [− 0.79, 1.00] |
| ABFE/Parsley-GAFF-BCC/TIP3P/MD/NoBuffer1* | 30 | 5.72 [3.22, 13.09] | 5.16 [2.53, 11.73] | 5.16 [− 1.01, 11.30] | 0.22 [0.00, 0.95] | 0.51 [− 2.41, 3.72] | 0.33 [− 0.79, 1.00] |
| DDM/FEP/MBAR/FM/RW/[wb97xdBLUR] | 27 | 5.98 [4.03, 7.91] | 5.54 [3.65, 7.52] | 4.40 [0.97, 7.30] | 0.62 [0.06, 0.97] | 1.92 [0.34, 3.70] | 0.43 [− 0.29, 1.00] |
| EE-MCC/GAFF2-AM1-BCC/TIP3P/MD/* | 6 | 6.64 [4.39, 8.82] | 5.97 [3.63, 8.42] | 5.97 [3.39, 8.42] | 0.48 [0.04, 0.95] | 1.21 [0.11, 2.65] | 0.39 [− 0.29, 1.00] |
| US/GAFF-AM1BCC/TIP3P/HRE-MD | 12 | 9.36 [5.05, 18.19] | 8.80 [4.07, 16.52] | 8.80 [1.24, 16.37] | 0.70 [0.00, 0.97] | 1.77 [− 1.76, 5.59] | 0.52 [− 0.60, 1.00] |
| GDCC–TEMOA and TEETOA | | | | | | | |
| DDM/AMOEBA/BAR* | 44 | 0.88 [0.46, 1.56] | 0.72 [0.36, 1.36] | 0.10 [− 0.72, 0.84] | 0.78 [0.20, 0.98] | 0.97 [0.43, 1.38] | 0.79 [0.24, 1.00] |
| ATM/GAFF2-AM1BCC/TIP3P/HREM* | 37 | 1.59 [1.10, 3.96] | 1.25 [0.87, 3.43] | − 0.39 [− 2.33, 1.83] | 0.88 [0.14, 0.98] | 1.67 [0.46, 3.03] | 0.71 [0.00, 1.00] |
| PMF/GAFF2-AM1BCC/TIP3P/MD-US* | 38 | 1.59 [1.13, 4.02] | 1.31 [0.88, 3.53] | − 0.16 [− 2.17, 2.03] | 0.79 [0.06, 0.97] | 1.51 [0.30, 2.97] | 0.71 [− 0.08, 1.00] |
| AM1BCC/MMPBSA/TIP4PEW/MD_NR3 | 42 | 1.98 [1.15, 3.28] | 1.69 [0.91, 2.96] | 0.77 [− 0.82, 2.35] | 0.00 [0.00, 0.83] | 0.02 [− 0.75, 0.70] | 0.18 [− 0.76, 0.82] |
| AM1BCC/MMPBSA/TIP4PEW/MD* | 43 | 2.05 [1.20, 3.30] | 1.65 [0.91, 2.97] | 1.05 [− 0.52, 2.57] | 0.02 [0.00, 0.85] | 0.06 [− 0.66, 0.75] | 0.18 [− 0.71, 0.83] |
| AM1BCC/MMPBSA/TIP4PEW/MD_NR2 | 41 | 2.10 [1.20, 3.43] | 1.69 [0.93, 3.10] | 1.04 [− 0.57, 2.67] | 0.00 [0.00, 0.84] | 0.02 [− 0.75, 0.69] | 0.18 [− 0.77, 0.82] |
| ML/NNET/CORINA-descriptors-8* | 39 | 2.39 [1.44, 3.85] | 2.16 [1.16, 3.51] | 0.58 [− 1.44, 2.58] | 0.60 [0.00, 0.94] | − 0.35 [− 1.13, 0.38] | − 0.64 [− 1.00, 0.57] |
| SILCS/LGFE/TIP3P/GCMC-MD* | 36 | 2.40 [1.38, 3.75] | 2.10 [1.12, 3.38] | − 0.24 [− 2.17, 1.70] | 0.26 [0.00, 0.89] | − 0.32 [− 1.05, 0.47] | − 0.29 [− 1.00, 0.57] |
| SILCS/LGFE/TIP3P/GCMC-MD_NR | 35 | 2.51 [1.34, 4.50] | 1.81 [1.02, 3.90] | − 1.69 [− 3.66, 0.26] | 0.00 [0.00, 0.88] | − 0.01 [− 1.07, 0.92] | 0.07 [− 0.83, 0.83] |
| APR/OPENFF1.2.0-AM1BCC/TIP3P/US/TI** | 48 | 2.97 [1.27, 4.80] | 2.24 [0.99, 4.07] | − 0.76 [− 3.09, 1.36] | 0.48 [0.09, 0.94] | 1.55 [0.43, 3.44] | 0.50 [− 0.09, 0.92] |
| APR/OPENFF1.2.0-AM1BCC/TIP3P/US/MBAR** | 49 | 2.98 [1.31, 4.86] | 2.24 [1.02, 4.07] | − 0.83 [− 3.14, 1.24] | 0.48 [0.09, 0.92] | 1.54 [0.39, 3.48] | 0.50 [− 0.08, 0.92] |
| APR/GAFF2-AM1BCC/TIP3P/US/MBAR** | 51 | 3.24 [1.27, 5.29] | 2.30 [0.97, 4.37] | − 1.49 [− 3.82, 0.62] | 0.35 [0.02, 0.90] | 1.24 [0.07, 3.24] | 0.43 [− 0.28, 0.84] |
| APR/GAFF2-AM1BCC/TIP3P/US/TI** | 50 | 3.31 [1.41, 5.30] | 2.47 [1.11, 4.45] | − 1.29 [− 3.68, 0.96] | 0.26 [0.00, 0.87] | 1.08 [− 0.11, 3.24] | 0.29 [− 0.36, 0.83] |
| LiGaMD/GAFF2/RESP/TIP4P/Sampling | 47 | 4.48 [1.46, 6.77] | 3.07 [1.10, 5.66] | 1.72 [− 1.10, 4.79] | 0.00 [0.00, 0.76] | − 0.02 [− 1.78, 1.92] | 0.07 [− 0.67, 0.75] |
| DDM/C36/TIP3P/MD/MBAR* | 45 | 4.52 [2.01, 6.71] | 3.45 [1.62, 5.84] | − 3.45 [− 5.79, − 1.34] | 0.04 [0.00, 0.78] | 0.35 [− 0.98, 2.00] | 0.29 [− 0.57, 0.92] |
| MD/ParamChem/TIP3P/REUS/* | 46 | 4.91 [2.50, 7.18] | 3.95 [2.02, 6.33] | − 3.90 [− 6.29, − 1.69] | 0.01 [0.00, 0.68] | 0.18 [− 1.18, 1.63] | 0.04 [− 0.58, 0.76] |
| AM1BCC/MMPBSA/TIP4PEW/MD_NR1 | 40 | 9.26 [7.30, 11.28] | 8.93 [7.00, 10.95] | 8.93 [7.00, 10.95] | 0.00 [0.00, 0.73] | 0.06 [− 1.18, 0.98] | 0.18 [− 0.67, 0.74] |
The root mean square error (RMSE), mean absolute error (MAE), signed mean error (ME), coefficient of determination (R²), slope (m), and Kendall’s rank correlation coefficient (Tau) were computed via bootstrapping with replacement. Results are shown for each host category, with the lower and upper bounds of the 95% confidence intervals in brackets. Statistics do not include the optional host–guest systems CB8–G8, CB8–G9, and TEMOA–G3. Each method is assigned a unique submission ID (sid). An asterisk next to a method’s name denotes a ranked submission, and a double asterisk denotes a reference calculation.
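
The statistics in this table can be illustrated with a minimal sketch of a percentile bootstrap with replacement. This is not the SAMPL8 analysis code: the function names, the resample count, the fixed seed, and the example binding free energies are all hypothetical. Two choices are assumptions rather than facts from the table: the correlation column is computed as the squared Pearson coefficient (its values in the table are never negative), and Kendall's Tau is computed in its simple tau-a form, which counts tied pairs as neither concordant nor discordant.

```python
import math
import random


def rmse(pred, expt):
    """Root mean square error of predicted vs. experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))


def mae(pred, expt):
    """Mean absolute error."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)


def me(pred, expt):
    """Signed mean error (predicted minus experimental)."""
    return sum(p - e for p, e in zip(pred, expt)) / len(pred)


def _centered_sums(pred, expt):
    n = len(pred)
    mp, mx = sum(pred) / n, sum(expt) / n
    sxy = sum((p - mp) * (e - mx) for p, e in zip(pred, expt))
    spp = sum((p - mp) ** 2 for p in pred)
    sxx = sum((e - mx) ** 2 for e in expt)
    return sxy, spp, sxx


def r_squared(pred, expt):
    """Squared Pearson correlation (assumed meaning of the correlation column)."""
    sxy, spp, sxx = _centered_sums(pred, expt)
    return 0.0 if spp == 0 or sxx == 0 else sxy ** 2 / (spp * sxx)


def slope(pred, expt):
    """Least-squares slope of predicted (y) against experimental (x)."""
    sxy, _, sxx = _centered_sums(pred, expt)
    return 0.0 if sxx == 0 else sxy / sxx


def kendall_tau(pred, expt):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(pred)
    if n < 2:
        return 0.0
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pred[i] - pred[j]) * (expt[i] - expt[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def bootstrap_ci(pred, expt, stat, n_boot=1000, alpha=0.05, seed=0):
    """Return (estimate, lower, upper) using a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(pred)
    samples = []
    for _ in range(n_boot):
        # Resample host-guest pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(stat([pred[i] for i in idx], [expt[i] for i in idx]))
    samples.sort()
    lo = samples[max(0, int(alpha / 2 * n_boot))]
    hi = samples[min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)]
    return stat(pred, expt), lo, hi


# Hypothetical binding free energies in kcal/mol (not taken from the table).
expt = [-6.0, -7.5, -8.0, -9.5, -11.0]
pred = [-5.2, -8.1, -6.9, -10.4, -9.8]

for name, stat in [("RMSE", rmse), ("MAE", mae), ("ME", me),
                   ("R2", r_squared), ("m", slope), ("Tau", kendall_tau)]:
    value, lo, hi = bootstrap_ci(pred, expt, stat)
    print(f"{name}: {value:.2f} [{lo:.2f}, {hi:.2f}]")
```

With only a handful of guests per host, resamples often repeat the same systems, which is one reason the bracketed intervals in the table are so wide.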