2020 Mar 4;10:3986. doi: 10.1038/s41598-020-60747-3

Table 2.

Performance of QSAR classification models for luciferase inhibition.

Luciferase assay
10-fold cross-validation (full set, n = 1724)

| Method | Acc | Accb | Sp | Se | MCC |
|---|---|---|---|---|---|
| RF | 0.828 ± 0.006 | 0.752 ± 0.0095 | 0.931 ± 0.002 | 0.573 ± 0.017 | 0.557 ± 0.016 |
| SVM-linear | 0.804 ± 0.05 | 0.715 ± 0.011 | 0.923 ± 0.007 | 0.507 ± 0.015 | 0.486 ± 0.014 |
| SVM-radial | 0.719 ± 0.002 | 0.512 ± 0.0025 | 0.997 ± 0.001 | 0.027 ± 0.004 | 0.109 ± 0.014 |
| SVM-sigmoid | 0.669 ± 0.011 | 0.563 ± 0.012 | 0.813 ± 0.011 | 0.312 ± 0.014 | 0.136 ± 0.025 |
| LDA | 0.806 ± 0.005 | 0.744 ± 0.0125 | 0.89 ± 0.007 | 0.598 ± 0.018 | 0.51 ± 0.015 |
| CART | 0.783 ± 0.01 | 0.7055 ± 0.014 | 0.888 ± 0.011 | 0.523 ± 0.017 | 0.442 ± 0.026 |
| NN | 0.783 ± 0.011 | 0.7185 ± 0.024 | 0.869 ± 0.011 | 0.568 ± 0.037 | 0.453 ± 0.031 |

Fitting (training set, n = 1464)

| Method | Acc | Accb | Sp | Se | MCC |
|---|---|---|---|---|---|
| RF | 0.998 ± 0.001 | 0.997 ± 0.002 | 0.999 ± 0.001 | 0.995 ± 0.003 | 0.996 ± 0.002 |
| SVM-linear | 0.822 ± 0.003 | 0.738 ± 0.013 | 0.935 ± 0.006 | 0.542 ± 0.02 | 0.538 ± 0.01 |
| SVM-radial | 0.973 ± 0.004 | 0.953 ± 0.006 | 1 ± 0 | 0.906 ± 0.012 | 0.934 ± 0.008 |
| SVM-sigmoid | 0.638 ± 0.01 | 0.531 ± 0.019 | 0.783 ± 0.012 | 0.279 ± 0.026 | 0.066 ± 0.026 |
| LDA | 0.848 ± 0.007 | 0.7935 ± 0.011 | 0.923 ± 0.006 | 0.664 ± 0.016 | 0.617 ± 0.018 |
| CART | 0.137 ± 0.012 | 0.1925 ± 0.0275 | 0.062 ± 0.01 | 0.323 ± 0.045 | −0.653 ± 0.032 |
| NN | 0.837 ± 0.022 | 0.798 ± 0.036 | 0.892 ± 0.026 | 0.704 ± 0.046 | 0.602 ± 0.051 |

External validation (test set, n = 258)

| Method | Acc | Accb | Sp | Se | MCC |
|---|---|---|---|---|---|
| RF | 0.835 ± 0.024 | 0.7645 ± 0.041 | 0.925 ± 0.026 | 0.604 ± 0.056 | 0.571 ± 0.045 |
| SVM-linear | 0.811 ± 0.032 | 0.727 ± 0.0385 | 0.918 ± 0.029 | 0.536 ± 0.048 | 0.502 ± 0.064 |
| SVM-radial | 0.724 ± 0.037 | 0.51 ± 0.01 | 0.994 ± 0.004 | 0.026 ± 0.016 | 0.079 ± 0.067 |
| SVM-sigmoid | 0.667 ± 0.021 | 0.56 ± 0.05 | 0.807 ± 0.031 | 0.313 ± 0.069 | 0.128 ± 0.068 |
| LDA | 0.802 ± 0.027 | 0.7405 ± 0.036 | 0.879 ± 0.027 | 0.602 ± 0.045 | 0.495 ± 0.059 |
| CART | 0.207 ± 0.023 | 0.275 ± 0.033 | 0.12 ± 0.013 | 0.43 ± 0.053 | −0.467 ± 0.045 |
| NN | 0.786 ± 0.031 | 0.735 ± 0.048 | 0.845 ± 0.033 | 0.625 ± 0.063 | 0.468 ± 0.072 |

Each model-building process was repeated 10 times, each with a distinct data split and a distinct undersampling of inactive compounds from the entire Tox21 dataset. The mean (M) and standard deviation (SD) of each performance criterion are reported. Acc: accuracy; Accb: balanced accuracy; Sp: specificity; Se: sensitivity; MCC: Matthews correlation coefficient (see Methods).