Author manuscript; available in PMC: 2024 Feb 1.
Published in final edited form as: Comput Biol Med. 2022 Dec 23;153:106479. doi: 10.1016/j.compbiomed.2022.106479

Table 4.

Comparison of our models’ performance with that of other models on the five data sets with 5% labeled data. Our four models are AE-MBO (autoencoder with MBO), BT-MBO (transformer with MBO), ECFP-MBO (extended-connectivity fingerprints with MBO), and Consensus-MBO, which generates a consensus from our two top-scoring methods for a given data set and percentage of labeled data (more details in Section 2.6). The BT-GBDT, BT-RF, and BT-SVM models use the BT-FPs as features for gradient boosting decision trees, random forest, and support vector machine, respectively (note that these models are denoted by ‘SSLP-FP’ in [5]). AE-GBDT, AE-RF, and AE-SVM use the AE-FPs as features for the specified machine learning method. Performance is reported as the average ROC-AUC score over 50 labeled sets, with standard deviation; Consensus-MBO performance is reported as the average over 10 trials, with standard deviation.

ROC-AUC Scores for 5% Labeled Data
| Model | Ames | Bace | BBBP | Beet | ClinTox |
|---|---|---|---|---|---|
| BT-MBO (Proposed) | 0.716 ± .014 | 0.680 ± .027 | 0.785 ± .036 | 0.621 ± .070 | 0.774 ± .058 |
| AE-MBO (Proposed) | 0.653 ± .012 | 0.646 ± .024 | 0.730 ± .040 | 0.578 ± .047 | 0.596 ± .032 |
| ECFP-MBO (Proposed) | 0.710 ± .012 | 0.720 ± .026 | 0.721 ± .025 | 0.662 ± .073 | 0.563 ± .029 |
| Consensus-MBO (Proposed) | 0.722 ± .013 | 0.702 ± .032 | 0.765 ± .034 | 0.666 ± .070 | 0.712 ± .055 |
| BT-GBDT [5] | 0.717 ± .009 | 0.654 ± .029 | 0.696 ± .061 | 0.551 ± .054 | 0.524 ± .029 |
| BT-RF [5] | 0.709 ± .014 | 0.642 ± .030 | 0.684 ± .057 | 0.555 ± .058 | 0.513 ± .025 |
| BT-SVM [5] | 0.721 ± .011 | 0.679 ± .027 | 0.739 ± .052 | 0.566 ± .051 | 0.635 ± .084 |
| AE-GBDT [17] | 0.666 ± .011 | 0.653 ± .030 | 0.665 ± .050 | 0.549 ± .042 | 0.506 ± .007 |
| AE-RF [17] | 0.662 ± .013 | 0.663 ± .028 | 0.632 ± .063 | 0.551 ± .040 | 0.503 ± .006 |
| AE-SVM [17] | 0.653 ± .010 | 0.644 ± .028 | 0.716 ± .044 | 0.535 ± .036 | 0.520 ± .023 |
| ECFP2_512 [48] | 0.700 ± .010 | 0.706 ± .025 | 0.703 ± .029 | 0.596 ± .059 | 0.517 ± .018 |
| ECFP2_1024 [48] | 0.703 ± .011 | 0.701 ± .031 | 0.701 ± .028 | 0.603 ± .057 | 0.513 ± .010 |
| ECFP2_2048 [48] | 0.705 ± .009 | 0.700 ± .031 | 0.696 ± .033 | 0.609 ± .063 | 0.512 ± .012 |
| ECFP4_512 [48] | 0.685 ± .015 | 0.706 ± .025 | 0.677 ± .036 | 0.575 ± .058 | 0.513 ± .011 |
| ECFP4_1024 [48] | 0.691 ± .010 | 0.704 ± .032 | 0.686 ± .036 | 0.587 ± .050 | 0.515 ± .012 |
| ECFP4_2048 [48] | 0.700 ± .013 | 0.712 ± .025 | 0.678 ± .032 | 0.602 ± .064 | 0.510 ± .010 |
| ECFP6_512 [48] | 0.669 ± .012 | 0.694 ± .029 | 0.664 ± .032 | 0.580 ± .057 | 0.508 ± .009 |
| ECFP6_1024 [48] | 0.677 ± .009 | 0.698 ± .025 | 0.672 ± .033 | 0.574 ± .055 | 0.511 ± .009 |
| ECFP6_2048 [48] | 0.688 ± .012 | 0.710 ± .028 | 0.670 ± .034 | 0.572 ± .057 | 0.510 ± .009 |
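
Each entry in the table above is an average ROC-AUC score with a standard deviation over repeated labeled sets. As a rough illustration of how such an entry could be computed, the sketch below evaluates ROC-AUC for each trial and then summarizes the trials. It is not the authors' code: the function names `roc_auc` and `summarize` are hypothetical, ROC-AUC is computed through the standard rank-sum identity instead of a library call, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
import statistics


def roc_auc(labels, scores):
    """ROC-AUC for binary labels (0/1) via the rank-sum identity.

    Tied scores receive their average rank, so a tie between a positive
    and a negative example counts as half a correct ordering.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole run of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC requires both classes to be present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize(aucs):
    """Format per-trial ROC-AUC scores as 'mean ± std'.

    Assumption: sample standard deviation (statistics.stdev); the source
    does not state whether sample or population deviation was reported.
    """
    mean = statistics.mean(aucs)
    std = statistics.stdev(aucs)
    return f"{mean:.3f} ± {std:.3f}"


# Usage: one (labels, scores) pair per labeled set; toy values only.
trials = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
]
per_trial = [roc_auc(y, s) for y, s in trials]
print(summarize(per_trial))
```

In the setting described in the caption, `trials` would hold one prediction set per labeled set (50 for the individual models, 10 for Consensus-MBO), and the summary string would correspond to a single table cell.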