Author manuscript; available in PMC: 2013 Jan 1.
Published in final edited form as: Prog Nucl Magn Reson Spectrosc. 2011 May 23;60:1–28. doi: 10.1016/j.pnmrs.2011.05.002

Table 3.

Performance of shAIC relative to other methods evaluated using the control set of 38 proteins and cross-validation of shAIC.

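Part A. Comparison with existing programs
Part B. shAIC: overall and segregated

Correlation coefficient squared, $R_{\mathrm{tert}}^{2}$, for tertiary chemical shift [a]

| | ShiftX | Sparta | CamShift | SHIFTS | Sparta+ | shAIC | Sheet | Helix | Coil |
|---|---|---|---|---|---|---|---|---|---|
| N | 0.520 | 0.564 | 0.468 | 0.282 | 0.661 | 0.653 | 0.721 | 0.600 | 0.653 |
| C’ | 0.243 | 0.370 | 0.336 | 0.089 | 0.478 | 0.462 | 0.528 | 0.476 | 0.382 |
| | 0.443 | 0.522 | 0.395 | 0.328 | 0.621 | 0.594 | 0.577 | 0.621 | 0.602 |
| | 0.350 | 0.455 | 0.399 | 0.172 | 0.533 | 0.477 | 0.573 | 0.396 | 0.479 |
| HN | 0.370 | 0.389 | 0.405 | 0.172 | 0.523 | 0.451 | 0.521 | 0.468 | 0.437 |
| | 0.533 | 0.475 | 0.548 | 0.422 | 0.614 | 0.543 | 0.573 | 0.447 | 0.450 |

Correlation coefficient squared, $R_{\mathrm{sec}}^{2}$, for secondary chemical shift [a]

| | ShiftX | Sparta | CamShift | SHIFTS | Sparta+ | shAIC |
|---|---|---|---|---|---|---|
| N | 0.577 | 0.616 | 0.531 | 0.333 | 0.701 | 0.694 |
| C’ | 0.555 | 0.643 | 0.624 | 0.340 | 0.704 | 0.696 |
| | 0.741 | 0.784 | 0.724 | 0.661 | 0.828 | 0.815 |
| | 0.556 | 0.632 | 0.591 | 0.381 | 0.687 | 0.649 |
| HN | 0.435 | 0.455 | 0.463 | 0.249 | 0.573 | 0.508 |
| | 0.721 | 0.695 | 0.733 | 0.647 | 0.772 | 0.733 |

Rmsd / ppm: control set (first six columns) and shAIC training set (last two columns)

| | ShiftX | Sparta | CamShift | SHIFTS | Sparta+ | shAIC | Derivation [b] | Cross-validation [c] |
|---|---|---|---|---|---|---|---|---|
| N | 2.827 | 2.694 | 2.905 | 4.313 | 2.356 | 2.343 | 2.391 | 2.561 |
| C’ | 1.253 | 1.105 | 1.128 | 1.756 | 1.004 | 1.016 | 1.074 | 1.172 |
| | 1.144 | 1.044 | 1.175 | 1.337 | 0.926 | 0.961 | 0.946 | 1.035 |
| | 1.219 | 1.099 | 1.157 | 1.525 | 1.012 | 1.071 | 1.138 | 1.336 |
| HN | 0.559 | 0.546 | 0.528 | 0.630 | 0.466 | 0.503 | 0.465 | 0.501 |
| | 0.283 | 0.318 | 0.276 | 0.319 | 0.262 | 0.276 | 0.262 | 0.280 |

90% confidence intervals [d] / ppm (control set)

| | ShiftX | Sparta | CamShift | SHIFTS | Sparta+ | shAIC |
|---|---|---|---|---|---|---|
| N | 4.476 | 4.360 | 4.674 | 7.005 | 3.675 | 3.769 |
| C’ | 1.988 | 1.749 | 1.838 | 2.841 | 1.631 | 1.626 |
| | 1.822 | 1.639 | 1.896 | 2.153 | 1.409 | 1.462 |
| | 1.993 | 1.761 | 1.865 | 2.425 | 1.651 | 1.750 |
| HN | 0.884 | 0.872 | 0.827 | 0.958 | 0.725 | 0.798 |
| | 0.451 | 0.528 | 0.454 | 0.520 | 0.393 | 0.421 |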
[a] Squared correlation coefficients (coefficients of determination) for observed vs. predicted tertiary or secondary chemical shifts are described in the text. Only chemical shift values for which all programs provided a prediction were included in the analysis (e.g., Sparta does not provide predictions for terminal residues). Outliers were removed from the analysis based on the criterion that, for all methods, the error was larger than five times the standard deviation, with the standard deviation estimated from rmsds in the training set broken down by residue and secondary structure type.

[b] Rmsds between predicted and observed chemical shifts in the training set of 681 proteins after derivation of all parameters.

[c] Rmsds between predicted and observed chemical shifts in the training set of 681 proteins using cross-validation. The set was divided into 10 equal subsets; for each subset, the 9 other subsets were used to derive the parameters, which in turn were used to predict the shifts for the held-out subset.

[d] 90% of the predictions have an error less than this threshold.
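
The quantities reported in the table and its footnotes can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`rmsd`, `squared_correlation`, `error_threshold`, `k_fold_splits`) and the toy shift values are hypothetical. The sketch also makes three assumptions: the squared correlation coefficient is taken as the squared Pearson coefficient, the 90% threshold is computed with a nearest-rank rule, and the 10 subsets are drawn at random with sizes that differ by at most one, since 681 is not divisible by 10.

```python
import math
import random


def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted shifts (ppm)."""
    errors = [p - o for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def squared_correlation(observed, predicted):
    """Square of the Pearson correlation coefficient (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def error_threshold(observed, predicted, fraction=0.9):
    """Absolute error that at least `fraction` of predictions stay at or below.

    Nearest-rank rule; an assumption, the source does not state the estimator.
    """
    abs_errors = sorted(abs(p - o) for o, p in zip(observed, predicted))
    k = math.ceil(fraction * len(abs_errors))
    return abs_errors[k - 1]


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for k roughly equal random folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Parameters would be derived on the other k-1 folds, then applied here.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Toy values in ppm, invented for illustration only.
observed = [120.1, 118.4, 123.7, 115.9, 121.3]
predicted = [121.0, 117.2, 125.1, 116.3, 119.8]
print(rmsd(observed, predicted))
print(squared_correlation(observed, predicted))
print(error_threshold(observed, predicted))
print([len(test) for _, test in k_fold_splits(681)])
```

In a real evaluation the cross-validated rmsd would be computed from predictions pooled over the 10 held-out folds, which is why it is expected to be somewhat larger than the rmsd obtained after deriving parameters on the full training set.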