2003 Dec 8;100(26):15688–15693. doi: 10.1073/pnas.2533904100

Table 1. Performance of composition-adjusted substitution matrices.

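| Sequence pairs | Organisms compared | No. of sequence pairs | Mean BLOSUM-62 bit score* | Background frequencies specified | Median change in bit score* with respect to BLOSUM-62, absolute | Median change in bit score* with respect to BLOSUM-62, relative (%) | Cases improved (%) | Cases (%) with statistical significance improved/worsened by a factor >10 |
|---|---|---|---|---|---|---|---|---|
| Related | C. tetani and M. tuberculosis | 40 | 68.3 | Organism | +1.6 | +2.7 | 58 | 20/8 |
| Related | C. tetani and M. tuberculosis | 40 | 68.3 | Sequence | +2.3 | +3.3 | 85 | 38/3 |
| Related | B. subtilis and L. lactis | 37 | 59.8 | Organism | +1.1 | +1.8 | 84 | 16/3 |
| Related | B. subtilis and L. lactis | 37 | 59.8 | Sequence | +2.1 | +3.6 | 95 | 11/3 |
| Related | M. tuberculosis and S. coelicolor | 34 | 58.6 | Organism | +1.4 | +2.6 | 76 | 24/3 |
| Related | M. tuberculosis and S. coelicolor | 34 | 58.6 | Sequence | +2.7 | +4.1 | 100 | 32/0 |
| Unrelated (negative control) | C. tetani and M. tuberculosis | 1,560 | 16.7 | Organism | -0.02 | -0.1 | 49 | 0.4/0.1 |
| Unrelated (negative control) | C. tetani and M. tuberculosis | 1,560 | 16.7 | Sequence | -0.05 | -0.3 | 47 | 0.6/0.4 |
| Unrelated (negative control) | B. subtilis and L. lactis | 1,332 | 15.7 | Organism | +0.00 | +0.0 | 50 | 0.0/0.0 |
| Unrelated (negative control) | B. subtilis and L. lactis | 1,332 | 15.7 | Sequence | +0.04 | +0.3 | 52 | 0.2/0.4 |
| Unrelated (negative control) | M. tuberculosis and S. coelicolor | 1,122 | 16.4 | Organism | +0.05 | +0.3 | 53 | 0.0/0.1 |
| Unrelated (negative control) | M. tuberculosis and S. coelicolor | 1,122 | 16.4 | Sequence | +0.06 | +0.4 | 53 | 0.6/0.2 |
| Structural | Various | 32 | 50.4 | Sequence | +1.3 | +3.2 | 72 | 22/0 |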
* Bit scores for all comparisons were calculated by using composition-based statistics (19) and experimentally determined gapped statistical parameters (18, 19), as is now standard in BLAST (12, 13). All matrices were scaled to have ungapped λ = 0.00635 and were used in conjunction with gap costs of -550 - 50k for a gap of length k.
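The scale of these numbers is easier to read with a little arithmetic. The sketch below is only an illustration: the gap score and the scaled λ come from the footnote, while `STANDARD_LAMBDA` (the ungapped λ of about 0.3176 that BLAST reports for unscaled BLOSUM-62) and the function names are assumptions brought in for comparison, not values stated in this table.

```python
# Values quoted in the table footnote.
SCALED_LAMBDA = 0.00635   # ungapped lambda of the scaled matrices
GAP_OPEN = 550            # constant part of the gap penalty
GAP_EXTEND = 50           # per-residue part of the gap penalty

# Assumption (not from the table): ungapped lambda commonly reported
# by BLAST for the standard, unscaled BLOSUM-62 matrix.
STANDARD_LAMBDA = 0.3176


def gap_score(k: int) -> int:
    """Score contribution of a gap of length k: -550 - 50k."""
    if k < 1:
        raise ValueError("gap length must be at least 1")
    return -GAP_OPEN - GAP_EXTEND * k


def scale_factor() -> float:
    """Ratio between the assumed standard lambda and the scaled lambda."""
    return STANDARD_LAMBDA / SCALED_LAMBDA


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(f"gap of length {k}: score {gap_score(k)}")
    print(f"approximate scale factor: {scale_factor():.1f}")
```

Under that assumption the scale factor is close to 50, so a gap score of -550 - 50k corresponds to roughly -(11 + k) on the scale of the unscaled matrix.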

A change in statistical significance by a factor of >10 is equivalent to a change of >3.322 bits in score.
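The 3.322-bit threshold is log2(10). A minimal sketch of the conversion follows, assuming the usual relation in which the expected number of chance hits scales as 2 raised to minus the bit score. The footnote itself gives only the number, and the function names are hypothetical.

```python
import math


def bits_for_fold_change(fold: float) -> float:
    """Bit-score difference that changes significance by the given factor."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(fold)


def fold_change_for_bits(delta_bits: float) -> float:
    """Factor by which significance changes for a given bit-score difference."""
    return 2.0 ** delta_bits


if __name__ == "__main__":
    print(f"{bits_for_fold_change(10.0):.3f} bits per 10-fold change")
```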

For the "Sequence" background frequencies, twenty pseudocounts proportional to the amino acid frequencies implicit in BLOSUM-62 were added to the actual amino acid counts from the proteins compared.
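The pseudocount adjustment can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, the prior is passed in by the caller rather than hard-coded (the frequencies implicit in BLOSUM-62 are not listed here), and the example uses a toy three-letter alphabet. Whether the counts are taken per protein or pooled across the pair is not specified in the footnote, so the function simply accepts whatever counts it is given.

```python
def smoothed_frequencies(counts, prior, total_pseudocounts=20.0):
    """Add pseudocounts proportional to `prior` to the observed counts.

    counts: mapping residue -> observed count in the sequence(s)
    prior:  mapping residue -> prior frequency (normalized here)
    Returns a mapping residue -> frequency that sums to 1.
    """
    prior_total = sum(prior.values())
    if prior_total <= 0:
        raise ValueError("prior frequencies must sum to a positive value")
    observed_total = sum(counts.get(res, 0) for res in prior)
    denominator = observed_total + total_pseudocounts
    return {
        res: (counts.get(res, 0) + total_pseudocounts * p / prior_total) / denominator
        for res, p in prior.items()
    }


if __name__ == "__main__":
    # Toy alphabet and prior, purely illustrative (not real amino acid data).
    toy_prior = {"A": 0.5, "B": 0.3, "C": 0.2}
    toy_counts = {"A": 10, "B": 0, "C": 10}
    print(smoothed_frequencies(toy_counts, toy_prior))
```

In the toy example, residue B never occurs in the counts but still receives a nonzero frequency, which is the practical purpose of the pseudocounts when short sequences are compared.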