Author manuscript; available in PMC: 2024 Feb 1.
Published in final edited form as: Comput Biol Med. 2022 Dec 23;153:106479. doi: 10.1016/j.compbiomed.2022.106479

Table 3.

Performance of our models on the five data sets with 2% labeled data, compared with that of other models. Our four models are AE-MBO (autoencoder with MBO), BT-MBO (transformer with MBO), ECFP-MBO (extended-connectivity fingerprints with MBO), and Consensus-MBO, which forms a consensus from our two top-scoring methods for a given data set and percentage of labeled data (more details in Section 2.6). The BT-GBDT, BT-RF, and BT-SVM models use the BT-FPs as features for gradient boosting decision trees, random forest, and support vector machine, respectively (note that these models are denoted by ‘SSLP-FP’ in [5]). AE-GBDT, AE-RF, and AE-SVM use the AE-FPs as features for the specified machine learning method. Performance is given as the average ROC-AUC score over 50 labeled sets, with standard deviation; Consensus-MBO performance is given as the average over 10 trials, with standard deviation.

ROC-AUC Scores for 2% Labeled Data
| Model | Ames | Bace | BBBP | Beet | ClinTox |
| --- | --- | --- | --- | --- | --- |
| BT-MBO (Proposed) | 0.677 ± .021 | 0.618 ± .037 | 0.736 ± .051 | 0.576 ± .075 | 0.704 ± .113 |
| AE-MBO (Proposed) | 0.619 ± .016 | 0.589 ± .029 | 0.685 ± .048 | 0.548 ± .067 | 0.561 ± .030 |
| ECFP-MBO (Proposed) | 0.672 ± .021 | 0.670 ± .034 | 0.682 ± .028 | 0.614 ± .089 | 0.551 ± .026 |
| Consensus-MBO (Proposed) | 0.683 ± .023 | 0.642 ± .043 | 0.712 ± .058 | 0.593 ± .090 | 0.656 ± .076 |
| BT-GBDT [5] | 0.674 ± .023 | 0.600 ± .036 | 0.643 ± .075 | 0.521 ± .035 | 0.513 ± .024 |
| BT-RF [5] | 0.666 ± .025 | 0.588 ± .034 | 0.619 ± .071 | 0.510 ± .020 | 0.504 ± .010 |
| BT-SVM [5] | 0.680 ± .017 | 0.605 ± .036 | 0.663 ± .082 | 0.522 ± .040 | 0.569 ± .080 |
| AE-GBDT [17] | 0.632 ± .018 | 0.588 ± .038 | 0.614 ± .068 | 0.529 ± .036 | 0.504 ± .010 |
| AE-RF [17] | 0.631 ± .019 | 0.581 ± .034 | 0.596 ± .054 | 0.517 ± .022 | 0.502 ± .004 |
| AE-SVM [17] | 0.627 ± .015 | 0.580 ± .035 | 0.625 ± .066 | 0.512 ± .028 | 0.508 ± .012 |
| ECFP2_512 [48] | 0.658 ± .018 | 0.629 ± .039 | 0.643 ± .051 | 0.552 ± .064 | 0.512 ± .017 |
| ECFP2_1024 [48] | 0.663 ± .018 | 0.635 ± .038 | 0.621 ± .048 | 0.565 ± .057 | 0.508 ± .010 |
| ECFP2_2048 [48] | 0.666 ± .019 | 0.634 ± .044 | 0.641 ± .041 | 0.541 ± .058 | 0.509 ± .010 |
| ECFP4_512 [48] | 0.646 ± .019 | 0.634 ± .042 | 0.609 ± .052 | 0.542 ± .052 | 0.505 ± .008 |
| ECFP4_1024 [48] | 0.655 ± .020 | 0.638 ± .036 | 0.617 ± .045 | 0.538 ± .050 | 0.505 ± .006 |
| ECFP4_2048 [48] | 0.652 ± .021 | 0.645 ± .040 | 0.619 ± .053 | 0.549 ± .057 | 0.506 ± .008 |
| ECFP6_512 [48] | 0.635 ± .025 | 0.632 ± .040 | 0.604 ± .051 | 0.531 ± .056 | 0.506 ± .011 |
| ECFP6_1024 [48] | 0.639 ± .020 | 0.632 ± .046 | 0.592 ± .048 | 0.542 ± .049 | 0.503 ± .006 |
| ECFP6_2048 [48] | 0.650 ± .022 | 0.635 ± .046 | 0.584 ± .046 | 0.531 ± .047 | 0.505 ± .007 |
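
The caption reports each entry as an average ROC-AUC over 50 labeled sets with a standard deviation. The following is a minimal sketch of how such a "mean ± std" entry could be produced, not the authors' code. It rests on several assumptions: the data are synthetic one-dimensional samples, a simple nearest-centroid scorer stands in for the MBO and baseline classifiers, the names `roc_auc` and `run_trials` are hypothetical, and the sample standard deviation is used because the source does not say which estimator was applied.

```python
import random
import statistics


def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs both classes present")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def run_trials(n_trials=50, labeled_fraction=0.02, n_samples=500, seed=0):
    """Return one ROC-AUC per randomly drawn labeled set (assumed protocol)."""
    rng = random.Random(seed)
    # Synthetic 1-D data (assumption): class 0 ~ N(0, 1), class 1 ~ N(2, 1).
    labels = [i % 2 for i in range(n_samples)]
    features = [rng.gauss(2.0 * y, 1.0) for y in labels]
    n_labeled = max(2, round(labeled_fraction * n_samples))
    aucs = []
    while len(aucs) < n_trials:
        labeled = set(rng.sample(range(n_samples), n_labeled))
        pos = [features[i] for i in labeled if labels[i] == 1]
        neg = [features[i] for i in labeled if labels[i] == 0]
        if not pos or not neg:
            continue  # redraw: both classes must appear in the labeled set
        mu_pos, mu_neg = statistics.mean(pos), statistics.mean(neg)
        unlabeled = [i for i in range(n_samples) if i not in labeled]
        # Stand-in scorer: higher when a point lies closer to the positive centroid.
        scores = [abs(features[i] - mu_neg) - abs(features[i] - mu_pos)
                  for i in unlabeled]
        aucs.append(roc_auc([labels[i] for i in unlabeled], scores))
    return aucs


if __name__ == "__main__":
    aucs = run_trials()
    mean, std = statistics.mean(aucs), statistics.stdev(aucs)
    print(f"{mean:.3f} ± {std:.3f} over {len(aucs)} labeled sets")
```

With so few labeled points per draw, the score depends strongly on which points happen to be labeled. That sensitivity is one plausible reason for reporting a standard deviation alongside the mean.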