2010 Mar 24;38(13):4231–4245. doi: 10.1093/nar/gkq162

Table 1.

Scoring matrix used to define domain boundaries and calculate the dos score, representing the likelihood of a given amino acid being found in a GW domain

| Amino acid | Score [half-bits] | Score [bits] | Ratio | Frequency (domain:protein) | Count (domain:protein) |
|---|---|---|---|---|---|
| W | 2.666 | 1.333 | 2.520 | 0.063:0.025 | 743:1062 |
| G | 2.068 | 1.034 | 2.048 | 0.213:0.104 | 2490:4447 |
| N | 1.510 | 0.755 | 1.688 | 0.081:0.048 | 949:2051 |
| S | 1.236 | 0.618 | 1.535 | 0.152:0.099 | 1774:4213 |
| A | 0.280 | 0.140 | 1.102 | 0.065:0.059 | 762:2537 |
| D | 0.184 | 0.092 | 1.066 | 0.081:0.076 | 950:3254 |
| T | 0.000 | 0.000 | 1.000 | 0.040:0.040 | 467:1718 |
| Q | −0.076 | −0.038 | 0.974 | 0.038:0.039 | 440:1686 |
| K | −0.120 | −0.060 | 0.959 | 0.070:0.073 | 821:3136 |
| R | −0.590 | −0.295 | 0.815 | 0.044:0.054 | 518:2319 |
| P | −0.644 | −0.322 | 0.800 | 0.032:0.040 | 373:1726 |
| E | −1.288 | −0.644 | 0.640 | 0.048:0.075 | 560:3219 |
| V | −2.408 | −1.204 | 0.434 | 0.023:0.053 | 274:2260 |
| F | −2.558 | −1.279 | 0.412 | 0.014:0.034 | 169:1443 |
| H | −3.324 | −1.662 | 0.316 | 0.006:0.019 | 76:796 |
| C | −3.398 | −1.699 | 0.308 | 0.004:0.013 | 43:568 |
| Y | −4.792 | −2.396 | 0.190 | 0.004:0.021 | 50:890 |
| M | −5.012 | −2.506 | 0.176 | 0.003:0.017 | 39:743 |
| I | −5.030 | −2.515 | 0.175 | 0.007:0.040 | 79:1705 |
| L | −5.252 | −2.626 | 0.162 | 0.011:0.068 | 132:2925 |

The amino acids are sorted by score, from highest to lowest. The second and third columns were used in domain identification calculations. The last two columns give the frequency and count of each amino acid in the domain versus the entire protein sequence (format: domain:entire protein).
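
The relationship between the columns is not spelled out in the table, but the numbers are consistent with a log-odds score: the ratio appears to be the domain frequency divided by the whole-protein frequency, the score in bits its base-2 logarithm, and the score in half-bits twice that value. The Python sketch below checks this reading against a few rows. The function name `log_odds` and the `FREQUENCIES` dictionary are illustrative and do not come from the paper, and the formula is an inference from the tabulated values, not a statement of the authors' procedure.

```python
import math

# (domain frequency, whole-protein frequency) pairs copied from Table 1.
# Only a few rows are included to keep the example short.
FREQUENCIES = {
    "W": (0.063, 0.025),
    "G": (0.213, 0.104),
    "T": (0.040, 0.040),
    "L": (0.011, 0.068),
}


def log_odds(domain_freq, protein_freq):
    """Return (ratio, score in bits, score in half-bits).

    Assumed relationship, inferred from the table:
    ratio = domain_freq / protein_freq, bits = log2(ratio),
    half-bits = 2 * bits.
    """
    ratio = domain_freq / protein_freq
    bits = math.log2(ratio)
    return ratio, bits, 2 * bits


for amino_acid, (f_domain, f_protein) in FREQUENCIES.items():
    ratio, bits, half_bits = log_odds(f_domain, f_protein)
    print(f"{amino_acid}: ratio={ratio:.3f} bits={bits:.3f} half-bits={half_bits:.3f}")
```

Because the table lists frequencies rounded to three decimals, values recomputed this way can differ from the tabulated scores in the last digit or two.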