Table 1. Prediction accuracy of the effect of amino acid substitutions on function.
Protein | Predicted tolerated* | Predicted deleterious* | Total* | Rank-ordering accuracy† | SIFT‡ predicted tolerated | SIFT‡ predicted deleterious | SIFT‡ total |
---|---|---|---|---|---|---|---|
HIV | 81% (85/105) | 83% (131/158) | 82% (216/263) | 95% (88/93) | 77% (81/105) | 79% (125/158) | 78% (206/263) |
Lys | 79% (534/676) | 72% (53/74) | 78% (587/750) | 89% (132/149) | 66% (446/676) | 95% (70/74) | 69% (516/750) |
Lac-N | 75% (86/115) | 78% (119/153) | 76% (205/268) | 88% (111/126) | 66% (76/115) | 76% (117/153) | 72% (193/268) |
Lac-C | 72% (1,700/2,373) | 73% (387/532) | 72% (2,087/2,905) | 85% (901/1,066) | 73% (1,726/2,373) | 79% (421/532) | 74% (2,147/2,905) |
Only nonintermediate phenotypes (either wild-type function or complete ablation of function as assayed) from the mutation studies of HIV protease, T4 lysozyme, and Escherichia coli lac repressor were included in the analysis, because they were regarded as being the most reliable.
‡Prediction results from SIFT version 2 (http://blocks.fhcrc.org/~pauline/SIFT.html) (18), included for comparison.
*The overall prediction accuracy and the fraction of amino acid substitutions correctly predicted to be either functionally tolerated or functionally deleterious are listed for the four domains. The subalignments used to represent each query sequence are indicated in Fig. 1.
†The accuracy of the rank-ordering of amino acids in each residue profile, as measured by their relative propensity to disrupt function. Only residue positions that contain both tolerated and deleterious substitutions were included in this analysis.
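
As a quick arithmetic check on the overall accuracy column, the sketch below pools the correct and total counts for the tolerated and deleterious classes reported in Table 1. The function name `overall_accuracy` and the dictionary layout are illustrative assumptions, not part of the original analysis; only the counts come from the table.

```python
# Counts (correct, total) copied from Table 1 for the tolerated and
# deleterious classes. The data layout is an illustrative assumption.
counts = {
    "HIV": {"tolerated": (85, 105), "deleterious": (131, 158)},
    "Lys": {"tolerated": (534, 676), "deleterious": (53, 74)},
    "Lac-N": {"tolerated": (86, 115), "deleterious": (119, 153)},
    "Lac-C": {"tolerated": (1700, 2373), "deleterious": (387, 532)},
}


def overall_accuracy(classes):
    """Pool correct and total counts across both classes.

    Returns (correct, total, fraction_correct).
    """
    correct = sum(c for c, _ in classes.values())
    total = sum(t for _, t in classes.values())
    return correct, total, correct / total


for protein, classes in counts.items():
    correct, total, fraction = overall_accuracy(classes)
    print(f"{protein}: {correct}/{total} = {fraction:.0%}")
```

The pooled counts match the totals listed in the table (for example, 85 + 131 = 216 correct out of 105 + 158 = 263 substitutions for HIV protease). Because the tolerated class is much larger than the deleterious class for Lys and Lac-C, the overall figure for those domains is dominated by the tolerated-class accuracy.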