Author manuscript; available in PMC: 2019 Mar 18.
Published in final edited form as: Toxicology. 2017 Jun 23;389:139–145. doi: 10.1016/j.tox.2017.06.003

Table 4.

Statistical performance of the final Random Forest (100 trees) models: A) using all 2D MOE descriptors and transporter predictions (DILI_MOE_transp_RF model), B) using only the 2D MOE descriptors (DILI_MOE_RF model), and C) the open-source model (DILI_RDKit_RF100).

Dataset                                    Accuracy     Sensitivity  Specificity  AUC          Precision

A) DILI_MOE_transp_RF100
10-fold CV (mean ± SD over 50 iterations)  0.65 ± 0.01  0.68 ± 0.01  0.61 ± 0.01  0.69 ± 0.01  0.65 ± 0.01
Mulliner 921 cpds                          0.57         0.63         0.50         0.59         0.62
Liew 341 cpds                              0.67         0.72         0.56         0.71         0.75
Chen 96 cpds                               0.59         0.54         0.65         0.61         0.63
Merged test set 966 cpds                   0.59         0.68         0.50         0.62         0.62

B) DILI_MOE_RF100
10-fold CV (mean ± SD over 50 iterations)  0.65 ± 0.01  0.68 ± 0.01  0.61 ± 0.01  0.69 ± 0.01  0.65 ± 0.01
Mulliner 921 cpds                          0.58         0.60         0.55         0.59         0.63
Liew 341 cpds                              0.68         0.68         0.67         0.71         0.79
Chen 96 cpds                               0.63         0.56         0.70         0.66         0.67
Merged test set 966 cpds                   0.60         0.64         0.56         0.62         0.63

C) DILI_RDKit_RF100
10-fold CV (mean ± SD over 50 iterations)  0.64 ± 0.01  0.70 ± 0.01  0.57 ± 0.01  0.69 ± 0.01  0.63 ± 0.01
Mulliner 921 cpds                          0.60         0.64         0.54         0.62         0.64
Liew 332 cpds                              0.67         0.72         0.56         0.71         0.72
Chen 95 cpds                               0.64         0.64         0.64         0.73         0.64
Merged test set 966 cpds                   0.60         0.67         0.52         0.64         0.63

Notes: The number of compounds in the external datasets differs slightly for model C because, for some compounds (peptides), some descriptor values computed by RDKit were too large for the machine learning algorithm to handle.
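As an illustration of how one external-set row of Table 4 could be produced, the minimal Python sketch below first drops compounds whose descriptor values are unusable and then computes accuracy, sensitivity, specificity and precision from binary predictions. It is not the authors' code. The 32-bit float limit is an assumption about what "too large" means for the learner, and the function names and record layout are hypothetical. AUC is left out because it needs predicted probabilities rather than class labels.

```python
import math

# Assumption: the learner stores features as 32-bit floats, so a descriptor
# that is NaN, infinite, or larger in magnitude than the float32 maximum
# cannot be used. The source only says the values were "too large".
FLOAT32_MAX = 3.4028234663852886e38


def usable(descriptors):
    """Return True if every descriptor value is finite and fits in float32."""
    return all(math.isfinite(v) and abs(v) <= FLOAT32_MAX for v in descriptors)


def filter_compounds(compounds):
    """Keep only (name, descriptors, label) records with usable descriptors."""
    return [c for c in compounds if usable(c[1])]


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute the label-based metrics of Table 4 for 0/1 labels (1 = DILI positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),     # positive predictive value
    }


# Hypothetical toy records: (name, descriptor values, true label).
toy_compounds = [
    ("cpd_a", [1.2, 300.0], 1),
    ("cpd_b", [0.4, 1e39], 1),            # exceeds the float32 range
    ("cpd_c", [2.0, float("inf")], 0),    # non-finite descriptor
    ("cpd_d", [0.1, 12.0], 0),
]
kept = filter_compounds(toy_compounds)
print([name for name, _, _ in kept])
print(binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```

Filtering before prediction means the compound count of a test set can shrink for one descriptor package but not another, which is the kind of difference the note describes for model C.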