2019 Sep 13;59(10):4438–4449. doi: 10.1021/acs.jcim.9b00236

Table 3. Average Metrics for Each of 10 Hold-Out Folds from Cross-Validation Using 676 Molecules from Aliper et al. Annotated with Only One of the 12 MeSH Classesᵃ

| problem group | metric | SVM¹ | DNN² | IMG + CNN³ | MFP + RF⁴ |
|---|---|---|---|---|---|
| 3-class | accuracy | 0.53 | 0.701 | 0.747 ± 0.0657 | 0.742 ± 0.0692 |
| | balanced accuracy | | | 0.739 ± 0.0644 | 0.715 ± 0.0766 |
| | MCC | | | 0.619 ± 0.102 | 0.612 ± 0.106 |
| | ROC AUC | | | 0.870 ± 0.0412 | 0.894 ± 0.0417 |
| | average precision score | | | 0.806 ± 0.0592 | 0.847 ± 0.0588 |
| 5-class | accuracy | 0.417 | 0.596 | 0.653 ± 0.0451 | 0.694 ± 0.0497 |
| | balanced accuracy | | | 0.620 ± 0.0509 | 0.635 ± 0.0661 |
| | MCC | | | 0.549 ± 0.0599 | 0.606 ± 0.0660 |
| | ROC AUC | | | 0.867 ± 0.0322 | 0.892 ± 0.0284 |
| | average precision score | | | 0.735 ± 0.0568 | 0.791 ± 0.0471 |
| 12-class | accuracy | 0.366 | 0.546 | 0.608 ± 0.0500 | 0.641 ± 0.0331 |
| | balanced accuracy | | | 0.507 ± 0.107 | 0.504 ± 0.0522 |
| | MCC | | | 0.525 ± 0.0620 | 0.572 ± 0.0388 |
| | ROC AUCᵇ | | | 0.863 ± 0.209 | 0.896 ± 0.0200 |
| | average precision scoreᵇ | | | 0.672 ± 0.0303 | 0.751 ± 0.0205 |
ᵃ Values for the gene-expression-based models are from Aliper et al., who used different training and validation folds for 10-fold cross-validation with a ¹support vector machine (SVM) or ²multilayer perceptron deep neural network (DNN) based on pathway activation scores. Values from this paper were obtained using ³molecule images as input to a convolutional neural network (IMG + CNN) or ⁴Morgan molecular fingerprints as input to a random forest (MFP + RF). Values for ³ and ⁴ are the mean across the validation folds ± standard deviation.

ᵇ Receiver operating characteristic area under the curve (ROC AUC) and average precision score were computed as the weighted average of scores across classes. For the 12-class problem they were computed only for the first six validation sets, because the dermatological and urological classes had fewer than 10 examples.
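The two aggregation steps described in the footnotes (a weighted average of per-class scores within a fold, then a mean ± standard deviation across folds) can be illustrated with a minimal sketch. This is not the authors' code. The function names, class labels, and numbers below are hypothetical, and two details are assumptions because the footnotes do not specify them: classes are weighted by their number of examples, and the standard deviation is the sample (n − 1) form.

```python
from statistics import mean, stdev


def weighted_class_average(per_class_scores, class_counts):
    """Average per-class scores within one fold.

    Assumption: each class is weighted by its number of examples;
    the footnote says only "weighted average".
    """
    total = sum(class_counts[c] for c in per_class_scores)
    weighted_sum = sum(per_class_scores[c] * class_counts[c] for c in per_class_scores)
    return weighted_sum / total


def summarize_folds(fold_scores):
    """Return (mean, standard deviation) across validation folds.

    Assumption: sample standard deviation (n - 1 denominator);
    the footnote does not say which form was used.
    """
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical per-class scores and class sizes for two folds (not from the paper).
fold_a = weighted_class_average({"class_1": 0.90, "class_2": 0.80},
                                {"class_1": 30, "class_2": 10})
fold_b = weighted_class_average({"class_1": 0.80, "class_2": 0.60},
                                {"class_1": 20, "class_2": 20})

avg, sd = summarize_folds([fold_a, fold_b])
print(f"{avg:.3f} ± {sd:.3f}")
```

With only a handful of examples in a class, a per-class score in a given fold can be unstable or undefined, which is consistent with the footnote's restriction of these two metrics to a subset of validation sets for the 12-class problem.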