Author manuscript; available in PMC: 2011 Sep 1.
Published in final edited form as: J Struct Biol. 2010 Mar 27;171(1):64–73. doi: 10.1016/j.jsb.2010.03.016

Table 2.

Effects of experimental and sequence variables on prediction power

| Model | Experimental variables used in prediction model | Sequence variables used in prediction model | DSPred errorᶠ | DSPred correlationᵍ | ROC areaʰ (DS > 3) | ROC areaʰ (DS ≥ 5) |
|---|---|---|---|---|---|---|
| A. Best with expt & seqᵃ | R30, YldS, SECR1, DLSMR | MW, Dismax | 1.96 (0.13) | 0.56 (0.06) | 0.77 (0.04) | 0.87 (0.05) |
| B. Leave out seq from Aᵇ | R30, (YldS), SECR1, DLSMR | | 2.73 (0.08) | −0.07 (0.06) | 0.61 (0.05) | 0.49 (0.06) |
| C. Leave out expt from Aᶜ | | MW, Dismax | 2.46 (0.10) | 0.18 (0.07) | 0.65 (0.05) | 0.69 (0.06) |
| D. Best with expt onlyᵈ | R30, YldS, SECPPᵈ, DLSMWᵈ, LPavᵈ | | 1.90 (0.06) | 0.57 (0.04) | 0.70 (0.08) | 0.71 (0.08) |
| E. Best with seq onlyᵉ | | MW, Dismax, Hydavᵉ, XPᵉ | 2.58 (0.12) | 0.17 (0.08) | 0.64 (0.05) | 0.63 (0.06) |

For descriptions of variables see Table 1.

a. Best partition model combining experimental and sequence variables from 77-sample training set.

b. The 4 experimental variables from model A were supplied to the partition algorithm. The algorithm discarded YldS as a criterion.

c. The 2 sequence variables from A were supplied to the algorithm; the algorithm used both as criteria.

d. All experimental variables were supplied. The algorithm used 2 of the same variables as in A, replaced SECR1 and DLSMR with related variables SECPP and DLSMW, and added LPav.

e. All sequence variables were supplied; hydropathy (Hydav) and XtalPred score (XP) were added to the sequence variables used in A.

f, g, h. Three measures of predictive power for the 30-sample test set (parentheses: standard deviation estimated from synthetic data).

f. Square root of the mean square difference between predicted and observed diffraction scores (DS).

g. Pearson’s correlation coefficient for predicted and observed DS.

h. Area under ROC curves as in Figure 4b, with success defined as “better than 10 Å diffraction” (DS > 3) or as “2.8 Å or better diffraction” (DS ≥ 5).
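
The three measures defined in footnotes f, g and h can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`rmse`, `pearson`, `roc_area`) and the six example scores are hypothetical and are not data from the study. The ROC area is computed here through its rank-based (Mann-Whitney) equivalent, which is an assumption about how such an area may be obtained and not a description of how Figure 4b was constructed.

```python
import math


def rmse(pred, obs):
    """Footnote f: square root of the mean square difference."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson(pred, obs):
    """Footnote g: Pearson's correlation coefficient."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def roc_area(pred, obs, is_success):
    """Footnote h: area under the ROC curve.

    Computed as the fraction of (success, failure) pairs in which the
    successful sample has the higher predicted score; ties count one half.
    Assumes at least one success and one failure are present.
    """
    pos = [p for p, o in zip(pred, obs) if is_success(o)]
    neg = [p for p, o in zip(pred, obs) if not is_success(o)]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical observed and predicted diffraction scores (illustration only).
observed = [1, 2, 3, 4, 5, 6]
predicted = [2.0, 1.5, 3.5, 3.0, 4.5, 5.5]

print("DSPred error:", rmse(predicted, observed))
print("Correlation:", pearson(predicted, observed))
print("ROC area (DS > 3):", roc_area(predicted, observed, lambda ds: ds > 3))
print("ROC area (DS >= 5):", roc_area(predicted, observed, lambda ds: ds >= 5))
```

The two success thresholds are passed as predicates, so the same predicted scores yield both ROC columns of the table.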