2020 Oct 6;10:16581. doi: 10.1038/s41598-020-73644-6

Table 1.

Performance of 14 binary classifiers for predicting hemolytic activity using 3 model datasets (HemoPI-1, HemoPI-2, HemoPI-3) and 56 sequence-based physicochemical descriptors.

             Mean accuracy (%)
             HemoPI-1 dataset     HemoPI-2 dataset     HemoPI-3 dataset
Classifier   Model   Validation   Model   Validation   Model   Validation
LOGREG       92.6    89.6         68.0    65.8         69.7    70.8
KNN          93.2    86.8         71.6    63.4         72.8    65.8
CART         90.4    83.7         69.6    62.4         68.9    64.0
RFC          93.8    87.3         69.0    63.9         72.4    66.8
GBC          94.0    90.4         76.7    72.3         75.8    72.9
ABC          93.8    91.4         71.3    73.8         72.3    71.4
LDA          94.2    90.5         70.0    59.9         70.8    70.8
QDA          91.5    85.5         66.4    60.9         65.9    59.4
NB           85.9    85.5         62.3    63.4         64.4    65.9
SVC-LIN      88.1    85.0         61.9    59.9         65.4    66.5
SVC-RBF      88.7    85.5         63.7    62.4         66.3    66.2
SVC-POLY     71.5    60.0         54.4    54.5         54.6    54.5
SVC-SIG      88.0    81.8         59.6    54.5         63.4    63.4
XGBC         94.8    92.4         76.1    69.3         74.7    73.2

All performance metrics (accuracy, precision, Matthews correlation coefficient, Cohen’s kappa statistic, and receiver operating characteristic area under the curve) are available in the Supporting Information.

The best binary classifiers and their respective performances are depicted in bold.
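The ranking implied by the table can be recomputed from the reported values. The following is a minimal Python sketch, not the authors' procedure. It assumes that "best" means the highest validation accuracy per dataset, which the table does not define. The names `TABLE_1`, `best_by_validation`, and `accuracy_gap` are hypothetical helpers introduced here for illustration.

```python
# Accuracy values (%) transcribed from Table 1.
# Each classifier maps to three (model, validation) pairs, one per dataset.
DATASETS = ["HemoPI-1", "HemoPI-2", "HemoPI-3"]

TABLE_1 = {
    "LOGREG":   [(92.6, 89.6), (68.0, 65.8), (69.7, 70.8)],
    "KNN":      [(93.2, 86.8), (71.6, 63.4), (72.8, 65.8)],
    "CART":     [(90.4, 83.7), (69.6, 62.4), (68.9, 64.0)],
    "RFC":      [(93.8, 87.3), (69.0, 63.9), (72.4, 66.8)],
    "GBC":      [(94.0, 90.4), (76.7, 72.3), (75.8, 72.9)],
    "ABC":      [(93.8, 91.4), (71.3, 73.8), (72.3, 71.4)],
    "LDA":      [(94.2, 90.5), (70.0, 59.9), (70.8, 70.8)],
    "QDA":      [(91.5, 85.5), (66.4, 60.9), (65.9, 59.4)],
    "NB":       [(85.9, 85.5), (62.3, 63.4), (64.4, 65.9)],
    "SVC-LIN":  [(88.1, 85.0), (61.9, 59.9), (65.4, 66.5)],
    "SVC-RBF":  [(88.7, 85.5), (63.7, 62.4), (66.3, 66.2)],
    "SVC-POLY": [(71.5, 60.0), (54.4, 54.5), (54.6, 54.5)],
    "SVC-SIG":  [(88.0, 81.8), (59.6, 54.5), (63.4, 63.4)],
    "XGBC":     [(94.8, 92.4), (76.1, 69.3), (74.7, 73.2)],
}


def best_by_validation(table, dataset_index):
    """Return (classifier, accuracy) with the highest validation accuracy.

    Assumption: "best" is judged on validation accuracy alone.
    """
    name = max(table, key=lambda clf: table[clf][dataset_index][1])
    return name, table[name][dataset_index][1]


def accuracy_gap(table, classifier, dataset_index):
    """Return model accuracy minus validation accuracy, in percentage points."""
    model, validation = table[classifier][dataset_index]
    return round(model - validation, 1)


best = {ds: best_by_validation(TABLE_1, i) for i, ds in enumerate(DATASETS)}

for i, ds in enumerate(DATASETS):
    clf, acc = best[ds]
    gap = accuracy_gap(TABLE_1, clf, i)
    print(f"{ds}: {clf} (validation {acc}%, model-validation gap {gap})")
```

The gap between model and validation accuracy is included as a rough indicator of overfitting. Under a different criterion, such as one of the other metrics reported in the Supporting Information, the ranking may differ.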