2018 Jan 3;18:580. doi: 10.1186/s12859-017-1995-z

Table 4.

Predictive performance on the TESTsmall dataset

| Measure | Predictor | Material production (MF): average ±std | MF: p-value | Purification (PF): average ±std | PF: p-value | Crystallization (CF): average ±std | CF: p-value | Diffraction-quality crystallization (CR): average ±std | CR: p-value |
|---|---|---|---|---|---|---|---|---|---|
| AUC | fDETECT | 0.68 ±0.11 | | 0.64 ±0.11 | | 0.55 ±0.11 | | 0.64 ±0.07 | |
| AUC | PPCpred | 0.64 ±0.11 | 0.004 | 0.67 ±0.12 | <0.001 | 0.60 ±0.12 | <0.001 | 0.66 ±0.08 | 0.054 |
| AUC | Crysalis | 0.67 ±0.11 | 0.392 | 0.59 ±0.11 | <0.001 | 0.56 ±0.10 | 0.366 | 0.60 ±0.08 | <0.001 |
| AUC | PredPPCrys | 0.62 ±0.11 | <0.001 | 0.59 ±0.11 | 0.002 | 0.48 ±0.12 | <0.001 | 0.62 ±0.08 | 0.001 |
| AUC | XtalPred | NA | | NA | | NA | | 0.59 ±0.09 | <0.001 |
| AUC | XtalPred-RF | NA | | NA | | NA | | 0.65 ±0.08 | 0.392 |
| AUC | TargetCrys | NA | | NA | | NA | | 0.64 ±0.07 | 0.734 |
| AUC | CRYSTALP2 | NA | | NA | | NA | | 0.63 ±0.08 | 0.419 |
| MCC | fDETECT | 0.21 ±0.17 | | 0.15 ±0.19 | | 0.16 ±0.18 | | 0.20 ±0.13 | |
| MCC | PPCpred | 0.19 ±0.20 | 0.039 | 0.26 ±0.21 | <0.001 | 0.20 ±0.18 | 0.752 | 0.23 ±0.15 | 0.253 |
| MCC | Crysalis | 0.20 ±0.18 | 0.115 | 0.11 ±0.19 | 0.018 | 0.07 ±0.16 | <0.001 | 0.15 ±0.15 | 0.005 |
| MCC | PredPPCrys | 0.12 ±0.17 | <0.001 | 0.06 ±0.18 | <0.001 | 0.00 ±0.19 | <0.001 | 0.19 ±0.15 | 0.314 |
| MCC | XtalPred | NA | | NA | | NA | | 0.18 ±0.18 | 0.580 |
| MCC | XtalPred-RF | NA | | NA | | NA | | 0.24 ±0.14 | 0.039 |
| MCC | TargetCrys | NA | | NA | | NA | | 0.21 ±0.12 | 0.889 |
| MCC | CRYSTALP2 | NA | | NA | | NA | | 0.23 ±0.14 | 0.297 |
| Accuracy | fDETECT | 78.0 ±4.9 | | 72.7 ±6.2 | | 68.5 ±6.9 | | 59.9 ±6.7 | |
| Accuracy | PPCpred | 77.4 ±5.4 | 0.040 | 76.2 ±6.7 | <0.001 | 69.9 ±6.6 | 0.770 | 61.4 ±7.6 | 0.211 |
| Accuracy | Crysalis | 77.7 ±4.9 | 0.104 | 71.4 ±6.0 | 0.018 | 65.1 ±6.1 | <0.001 | 57.5 ±7.3 | 0.004 |
| Accuracy | PredPPCrys | 75.6 ±4.8 | <0.001 | 70.0 ±6.0 | <0.001 | 61.9 ±7.5 | <0.001 | 59.3 ±7.5 | 0.348 |
| Accuracy | XtalPred | NA | | NA | | NA | | 58.7 ±8.8 | 0.315 |
| Accuracy | XtalPred-RF | NA | | NA | | NA | | 62.2 ±7.2 | 0.117 |
| Accuracy | TargetCrys | NA | | NA | | NA | | 60.3 ±6.2 | 1.000 |
| Accuracy | CRYSTALP2 | NA | | NA | | NA | | 61.4 ±6.7 | 0.256 |

We report the average AUC, MCC, and accuracy, together with their standard deviations, over 100 bootstrap tests (each test uses 25% of the proteins, chosen at random). The statistical significance of the difference between fDETECT and each other method was measured with a paired t-test; the measured values are normally distributed, which we verified with the Anderson-Darling test at the 0.05 significance level. For each outcome, the best results that are not significantly different from each other (p-value > 0.05) are given in bold font. NA means that a given method does not provide this type of prediction.
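
The evaluation protocol described in the note above can be sketched roughly as follows. This is an illustrative sketch and not the authors' code: the function names (`auc`, `mcc`, `bootstrap_paired`, `paired_t_statistic`) are hypothetical, the 25% subsets are assumed to be drawn without replacement, and subsets containing a single class are assumed to be skipped. Only the paired t statistic is computed; a p-value would come from a t distribution with n − 1 degrees of freedom (for instance via `scipy.stats.ttest_rel`), and the normality check (for instance via `scipy.stats.anderson`) is left out.

```python
import math
import random
import statistics


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcc(labels, predictions):
    """Matthews correlation coefficient for binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_paired(labels, out_a, out_b, metric, n_tests=100, fraction=0.25, seed=0):
    """Score two predictors on the same random subsets; return per-test metric values."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    k = max(2, round(fraction * n))
    values_a, values_b = [], []
    while len(values_a) < n_tests:
        idx = rng.sample(range(n), k)  # assumption: sampling without replacement
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # assumption: skip subsets that contain only one class
        values_a.append(metric(y, [out_a[i] for i in idx]))
        values_b.append(metric(y, [out_b[i] for i in idx]))
    return values_a, values_b


def paired_t_statistic(values_a, values_b):
    """t statistic of the per-test differences (paired t-test, n - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(len(diffs)))
```

Pairing matters here: both predictors are scored on the same subset in every test, so the t statistic is computed on per-test differences and not on two independent samples. The average and standard deviation reported per cell would then correspond to `statistics.mean(values)` and `statistics.stdev(values)` over the 100 tests.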