Author manuscript; available in PMC: 2011 Jul 26.
Published in final edited form as: J Chem Inf Model. 2010 Jul 26;50(7):1189–1204. doi: 10.1021/ci100176x

Table 3.

Statistical parameters of QSAR models obtained before and after curation.

ID Name R2 Q2 R2EF Sws Scv SEF R2EVS R2EVS(NM)
1 Rat 0.96 0.84-0.93 0.89-0.92 0.11-0.13 0.16-0.24 0.20-0.26
2 Rat(NM) 0.91-0.97 0.89-0.95 0.45-0.88 0.10-0.18 0.14-0.28 0.28-0.58
3 TP 0.83 0.76 0.33 0.38 0.54 −0.58
4 TP(NM) 0.85 0.54 0.31 0.54 0.49 0.44
5 DILI non-curated No modeling was possible
6 DILI50 Modeling set: 5-fold external CV accuracy = 62-68%; External sets: accuracy = 56-73%
7* 62Ames non-curated: SensitivityRF=83%; SensitivitySVM=87%; SpecificityRF=SpecificitySVM=75%; AUCGP=88%; AUCSVM=89%; AUCRF=83%
8* 63Ames curated: SensitivityRF=SensitivitySVM=79%; SpecificityRF=SpecificitySVM=81%; AUCGP=86%; AUCSVM=84%; AUCRF=83%

Where:

TP – Tetrahymena pyriformis dataset, (NM) – modeling set with various representations of nitro groups

R2 - determination coefficient, Q2 - cross-validated determination coefficient

R2EF- determination coefficient for external folds extracted from the modeling set

Sws - standard error of prediction for the work set

Scv - standard error of prediction for the work set under cross-validation

SEF - standard error of prediction for external folds extracted from the modeling set

A - number of PLS latent variables, D - number of descriptors, M - number of molecules in the work set

R2EVS - determination coefficient for external validation set

R2EVS(NM) - determination coefficient for external validation set with shuffled nitro groups

AUC – Area Under Curve statistical parameter

RF – Random Forest

SVM – Support Vector Machine

GP – Gaussian Processes

* Prediction performance is reported for the external validation set.
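
As a rough illustration of the statistics defined in the legend, the following Python sketch computes a determination coefficient and a sensitivity/specificity pair from observed and predicted values. It assumes the common definition R2 = 1 - SS_res/SS_tot, which the table does not state explicitly but which would be consistent with the negative value reported in row 3. The function names and the 1/0 label encoding are hypothetical and are not taken from the paper.

```python
def r_squared(observed, predicted):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: spread of the prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding: 1 = active (positive), 0 = inactive (negative).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the table.
    print(r_squared([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
    print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Under this assumed definition, a model whose external predictions are worse than simply predicting the mean of the observed values yields a negative determination coefficient, whereas a squared correlation coefficient could never be negative.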