2010 May 14;5:21. doi: 10.1186/1748-7188-5-21

Table 1.

mBed performance on the ten biggest Pfam/HOMSTRAD families.

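| Name | Size | Len | %ID | Embedding time (s), (1) | Embedding time (s), (2) | Embedding time (s), (3) | Embedding time (s), (4) | Distance matrix calculation time (s), (1) | Distance matrix calculation time (s), (2) | Distance matrix calculation time (s), (3) | Distance matrix calculation time (s), (4) | Alignment Column Score (%), (1) | Alignment Column Score (%), (2) | Alignment Column Score (%), (3) | Alignment Column Score (%), (4) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PF01381 | 9993 | 53 | 23 | - | 25 | 55 | 136 | 764 | 57 | 55 | 175 | 13.3 | 26.7 | 25.3 | 34.7 |
| PF00006 | 9796 | 209 | 43 | - | 134 | 248 | 280 | 4364 | 48 | 49 | 88 | 42.8 | 36.6 | 36.6 | 38.0 |
| PF00989 | 9681 | 95 | 17 | - | 43 | 88 | 197 | 1281 | 50 | 51 | 159 | 46.5 | 33.3 | 31.8 | 34.1 |
| PF00486 | 9615 | 75 | 30 | - | 34 | 69 | 107 | 950 | 55 | 52 | 104 | 63.9 | 92.8 | 64.9 | 89.7 |
| PF00571 | 9551 | 119 | 19 | - | 73 | 143 | 268 | 1993 | 54 | 50 | 152 | 6.15 | 3.08 | 1.54 | 1.54 |
| PF00097 | 9423 | 41 | 33 | - | 18 | 38 | 94 | 517 | 44 | 43 | 115 | 53.2 | 54.8 | 61.3 | 54.8 |
| PF01479 | 9352 | 47 | 32 | - | 17 | 40 | 90 | 496 | 45 | 46 | 124 | 58.3 | 91.7 | 89.6 | 79.2 |
| PF00046 | 9305 | 54 | 35 | - | 20 | 43 | 85 | 651 | 41 | 42 | 77 | 59.4 | 44.9 | 46.4 | 60.9 |
| PF00550 | 9249 | 63 | 25 | - | 28 | 59 | 136 | 794 | 47 | 47 | 141 | 51.3 | 32.9 | 55.3 | 59.2 |
| PF00149 | 9072 | 198 | 14 | - | 133 | 256 | 552 | 3515 | 47 | 46 | 172 | 75.4 | 71.9 | 72.3 | 76.1 |
| Average | 9503 | 95 | 27 | 0 | 53 | 104 | 195 | 1533 | 49 | 48 | 131 | 47.0 | 48.9 | 48.5 | 52.8 |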

The ten largest Pfam entries that contain 9,000-10,000 sequences and have a corresponding HOMSTRAD alignment are used here. Four methods were applied to each entry to calculate a distance matrix: (1) the traditional process of calculating a full distance matrix from the sequence data using an alignment distance measure; (2) mBed with default settings; (3) mBed followed by the 'usePivotObjects' method; (4) mBed followed by the 'usePivotGroups' method. A UPGMA guide tree is constructed from each matrix and used for progressive alignment of the sequences. The resulting alignment is then scored against the corresponding HOMSTRAD structural alignment using the Column Score.
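The scoring step can be illustrated with a minimal sketch. It assumes one common definition of the Column Score: the percentage of reference columns that are reproduced exactly in the test alignment, considering only the sequences present in the reference. The function names, the gap character and the toy alignments are illustrative assumptions, not taken from the mBed software or the HOMSTRAD tooling.

```python
from typing import Dict, List, Optional, Tuple

Column = Tuple[Optional[int], ...]


def columns(alignment: Dict[str, str], names: List[str], gap: str = "-") -> List[Column]:
    """Describe each column by the residue index contributed by every
    sequence in `names` (None where that sequence has a gap)."""
    length = len(alignment[names[0]])
    next_residue = [0] * len(names)  # running residue counter per sequence
    cols: List[Column] = []
    for pos in range(length):
        col: List[Optional[int]] = []
        for k, name in enumerate(names):
            if alignment[name][pos] == gap:
                col.append(None)
            else:
                col.append(next_residue[k])
                next_residue[k] += 1
        cols.append(tuple(col))
    return cols


def column_score(reference: Dict[str, str], test: Dict[str, str]) -> float:
    """Percentage of non-empty reference columns found unchanged in `test`.

    Assumption: `test` contains every reference sequence (possibly among many
    others, which are ignored) and all rows of an alignment have equal length.
    """
    names = sorted(reference)
    ref_cols = [c for c in columns(reference, names)
                if any(i is not None for i in c)]
    test_cols = set(columns(test, names))
    matched = sum(1 for c in ref_cols if c in test_cols)
    return 100.0 * matched / len(ref_cols)


# Toy data (hypothetical): a 3-sequence reference and a larger test alignment.
reference = {"a": "AC-GT", "b": "ACTGT", "c": "A-TGT"}
test = {"a": "ACG-T", "b": "ACTGT", "c": "AT-GT", "extra": "ACTGT"}

if __name__ == "__main__":
    print(column_score(reference, reference))
    print(column_score(reference, test))
```

Restricting the comparison to the reference sequences mirrors the setting in the table, where each Pfam family has thousands of sequences but only a subset appears in the structural reference alignment. Other Column Score variants, for example ones that score only core blocks, would need a different column filter.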

(1) Full d(x, y) distance matrix; (2) mBed; (3) mBed + usePivotObjects; (4) mBed + usePivotGroups
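To compare the four methods on total cost, the short sketch below adds the average embedding time to the average distance matrix calculation time from the Average row of the table and reports each method's speed-up relative to method (1). Treating the two steps as additive, and treating the missing embedding time of method (1) as 0 s, are assumptions made here for illustration; the variable and function names are likewise hypothetical.

```python
# Average row of Table 1, in seconds, keyed by method number.
# Assumption: method (1) has no embedding step, so its embedding time is 0 s.
embedding_time = {1: 0, 2: 53, 3: 104, 4: 195}
distance_time = {1: 1533, 2: 49, 3: 48, 4: 131}

labels = {
    1: "Full d(x, y) distance matrix",
    2: "mBed",
    3: "mBed + usePivotObjects",
    4: "mBed + usePivotGroups",
}


def total_time(method: int) -> int:
    """Embedding plus distance matrix time (assumed to be additive)."""
    return embedding_time[method] + distance_time[method]


def speedup(method: int) -> float:
    """Total time of the full distance matrix divided by this method's total."""
    return total_time(1) / total_time(method)


if __name__ == "__main__":
    for m in sorted(labels):
        print(f"({m}) {labels[m]}: {total_time(m)} s, "
              f"speed-up {speedup(m):.1f}x")
```

Under these assumptions the averages give totals of 1533 s, 102 s, 152 s and 326 s for methods (1) to (4), so default mBed is roughly 15 times faster than the full distance matrix on these families, while the average Column Scores in the table stay within a few percentage points of each other.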