2001 May;11(5):863–874. doi: 10.1101/gr.176601

Table 1.

Summary of Prediction Results for SIFT and BLOSUM62

| Test set | Method | Tolerant prediction accuracy | Deleterious prediction accuracy | Total prediction accuracy | Experimental prediction accuracy |
|---|---|---|---|---|---|
| LacI* n = 4004 | SIFT | 78% (1747/2254) | 57% (989/1750) | 68% (2736/4004) | 66% (989/1496) |
| | BLOSUM62 | 31% (696/2254) | 84% (1475/1750) | 54% (2171/4004) | 49% (1475/3033) |
| HIV-1 Protease n = 336 | Automated SIFT | 70% (78/111) | 82% (184/225) | 78% (262/336) | 85% (184/217) |
| | SIFT without RSV, avian sequences | 68% (75/111) | 88% (197/225) | 81% (272/336) | 85% (197/233) |
| | BLOSUM62 | 63% (70/111) | 73% (165/225) | 70% (235/336) | 80% (165/206) |
| Bacteriophage T4 Lysozyme n = 2015 | SIFT | 59% (817/1377) | 72% (460/638) | 63% (1277/2015) | 45% (460/1020) |
| | BLOSUM62 | 30% (406/1377) | 85% (542/638) | 47% (948/2015) | 36% (542/1513) |

The effects of 4004 substitutions were assayed for LacI (Markiewicz et al. 1994; Pace et al. 1997), 336 substitutions for HIV-1 protease (Loeb et al. 1989), and 2015 substitutions for bacteriophage T4 lysozyme (Rennell et al. 1991). These three data sets are used to test prediction performance. Tolerant prediction accuracy is the number of substitutions correctly predicted to have no effect divided by the total number of substitutions that gave a wild-type phenotype under experimental test conditions. Subtracting the numerator from the denominator gives the number of substitutions that were predicted to be deleterious but gave a wild-type phenotype under experimental conditions. Deleterious prediction accuracy is the number of substitutions correctly predicted to have an effect on the protein divided by the number of substitutions that affected the protein. Subtracting the numerator from the denominator gives the number of substitutions that were predicted to have a wild-type phenotype but gave a deleterious phenotype under experimental conditions. Total prediction accuracy is the total number of substitutions correctly predicted divided by the total number of substitutions. Experimental prediction accuracy is the number of substitutions that were experimentally shown to affect protein function divided by the number of substitutions predicted to affect function. For the biologist investigating substitutions predicted to have a deleterious effect, the experimental prediction accuracy reflects the proportion of predictions that will yield affected phenotypes experimentally.
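The four accuracy measures defined in this legend can be recomputed from the two pairs of counts in each table row. The following is a minimal sketch, not part of the original analysis: the class `Counts` and the function `accuracies` are hypothetical names introduced here for illustration, and the example uses the LacI/SIFT counts from Table 1. If "deleterious" is treated as the positive class, the experimental prediction accuracy corresponds to what is usually called precision.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Counts:
    """Counts for one table row (hypothetical container, for illustration)."""
    tolerant_correct: int     # wild-type substitutions predicted to be tolerated
    tolerant_total: int       # all substitutions with a wild-type phenotype
    deleterious_correct: int  # affected substitutions predicted to be deleterious
    deleterious_total: int    # all substitutions with an affected phenotype


def accuracies(c: Counts) -> Dict[str, float]:
    """Return the four accuracy measures as fractions between 0 and 1."""
    # Wild-type substitutions that were wrongly predicted to be deleterious.
    false_deleterious = c.tolerant_total - c.tolerant_correct
    # Everything predicted deleterious, whether or not the prediction was right.
    predicted_deleterious = c.deleterious_correct + false_deleterious
    total = c.tolerant_total + c.deleterious_total
    return {
        "tolerant": c.tolerant_correct / c.tolerant_total,
        "deleterious": c.deleterious_correct / c.deleterious_total,
        "total": (c.tolerant_correct + c.deleterious_correct) / total,
        "experimental": c.deleterious_correct / predicted_deleterious,
    }


# LacI, SIFT row of Table 1: 1747/2254 tolerant, 989/1750 deleterious.
lac_sift = Counts(1747, 2254, 989, 1750)
result = accuracies(lac_sift)
for name, value in result.items():
    print(f"{name}: {value:.0%}")
```

For the LacI/SIFT row, the denominator of the experimental prediction accuracy works out to 989 + (2254 − 1747) = 1496, which matches the 989/1496 reported in the table.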

* SIFT offers prediction for positions 5–329 of the LacI repressor because fewer than half of the sequences are represented at positions 1–4 and 330–360.