2025 Feb 5;17:18. doi: 10.1186/s13321-025-00952-2

Table 1.

Benchmarking BarlowDTI against other models using Kang et al. splits [41]

| Dataset | Model | ROC AUC | PR AUC |
|---|---|---|---|
| BioSNAP | BarlowDTI | 0.9599 ± 0.0004 | 0.9670 ± 0.0004 |
| | XGBoost | 0.9142 | 0.9229 |
| | MolTrans [42] | 0.895 ± 0.002 | 0.901 ± 0.004 |
| | Kang et al. [41] | 0.914 ± 0.006 | 0.900 ± 0.007 |
| | DLM-DTI [17] | 0.914 ± 0.003 | 0.914 ± 0.006 |
| | ConPLex [43] | - | 0.897 ± 0.001 |
| BindingDB | BarlowDTI | 0.9364 ± 0.0003 | 0.7344 ± 0.0018 |
| | XGBoost | 0.9261 | 0.6948 |
| | MolTrans [42] | 0.914 ± 0.001 | 0.622 ± 0.007 |
| | Kang et al. [41] | 0.922 ± 0.001 | 0.623 ± 0.010 |
| | DLM-DTI [17] | 0.912 ± 0.004 | 0.643 ± 0.006 |
| | ConPLex [43] | - | 0.628 ± 0.012 |
| DAVIS | BarlowDTI | 0.9480 ± 0.0008 | 0.5524 ± 0.0011 |
| | XGBoost | 0.9285 | 0.4782 |
| | MolTrans [42] | 0.907 ± 0.002 | 0.404 ± 0.016 |
| | Kang et al. [41] | 0.920 ± 0.002 | 0.395 ± 0.007 |
| | DLM-DTI [17] | 0.895 ± 0.003 | 0.373 ± 0.017 |
| | ConPLex [43] | - | 0.458 ± 0.016 |

Performance was evaluated on three established benchmarks; values are the mean ± standard deviation over five replicates. For each benchmark, results that are both the best and statistically significant (two-sided Welch’s t-test [52, 53], α = 0.001, with Benjamini-Hochberg [54] multiple-testing correction) are highlighted in bold.
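
The significance procedure named in the footnote can be illustrated with a minimal, dependency-free sketch. It is not the authors' code. It assumes the reported ± values are sample standard deviations and that each compared model has five replicates (the table states this only for the reported results as a whole). The function names `welch_t` and `benjamini_hochberg` are hypothetical. The sketch computes Welch's t statistic with the Welch-Satterthwaite degrees of freedom from summary statistics, and separately shows the Benjamini-Hochberg adjustment of a list of p-values. Converting the statistic into a two-sided p-value needs the Student t distribution (for example `2 * scipy.stats.t.sf(abs(t), df)`), which is left out here to avoid a third-party dependency.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Inputs are summary statistics (mean, sample standard deviation,
    number of replicates) for two groups with possibly unequal variances.
    """
    var_a = sd_a ** 2 / n_a  # squared standard error of group A
    var_b = sd_b ** 2 / n_b  # squared standard error of group B
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Example: BioSNAP ROC AUC, BarlowDTI vs DLM-DTI, values from Table 1.
# n = 5 for both groups is an assumption.
t_stat, dof = welch_t(0.9599, 0.0004, 5, 0.914, 0.003, 5)

# Hypothetical p-values, only to show the adjustment step.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.001 for p in adjusted]
```

Under these assumptions, a comparison counts as significant when its adjusted p-value falls below α = 0.001, which is the threshold given in the footnote.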