2007 Mar 16;35(7):2238–2246. doi: 10.1093/nar/gkm107

Table 1.

Consensus sequences improve the global quality of structural models*

| Data set | PSI-BLAST (Cα <2 Å) | PROFILE-CONSENSUS (Cα <2 Å) | MAMMOTH (Cα <2 Å) | PSI-BLAST (Cα <5 Å) | PROFILE-CONSENSUS (Cα <5 Å) | MAMMOTH (Cα <5 Å) |
|---|---|---|---|---|---|---|
| SCOP superfamily only | 15.8 (±0.2) | 18.1 (±0.2) | 19.6 (±0.2) | 22.6 (±0.2) | 27.2 (±0.3) | 35.2 (±0.3) |
| PSI-BLAST e-values 10⁻³–10 | 34.7 (±0.4) | 38.3 (±0.5) | 36.5 (±0.4) | 49.1 (±0.5) | 55.2 (±0.6) | 58.0 (±0.5) |

*For each protein in our data sets (query Q), we aligned Q to a similar protein in the PDB (template T) and used the experimental structure of T to model the structure of Q by simply copying the Cα backbone of T onto Q according to the alignment. Since the structures of all Qs in our experiment were known, we could assess the accuracy of each model by superposing it onto the known structure. For this superposition, we used the structural alignment method LGA. Accuracy was measured as the percentage of Cα atoms that lay closer to their positions in the real structure than a given distance cutoff (<2 Å for columns 2–4 and <5 Å for the three rightmost columns). Note that the set of residues below a distance threshold was not necessarily consecutive in sequence. We compared the consensus sequence-based approach (PROFILE-CONSENSUS) with regular PSI-BLAST. The data for MAMMOTH were generated by optimally superposing the structures of Q and T without considering their sequences; in principle, this approximated an upper limit for performance (see Results). The two rows distinguish data sets with different levels of alignment difficulty: ‘SCOP superfamily only’ comprised pairs of proteins that fell into different SCOP families within the same SCOP superfamily (coarse-grained structural relation), while ‘PSI-BLAST e-values 10⁻³–10’ comprised pairs of proteins with similar structure that fell into the corresponding interval of sequence similarity. Both rows therefore reflect performance on ‘non-trivial’ tasks. Standard errors are given in parentheses.
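
The accuracy measure described in the footnote can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `copy_backbone` and `percent_within`, the toy coordinates, and the choice of the query length as the denominator (so that unaligned residues count as misses) are assumptions made for illustration. The superposition step performed with LGA is not reproduced; the model and the known structure are assumed to be in the same coordinate frame already.

```python
import math


def copy_backbone(alignment, template_ca):
    """Model Q by copying the Cα coordinate of each aligned template residue.

    alignment   -- list of (query_index, template_index) pairs
    template_ca -- dict: template residue index -> (x, y, z) in Å
    Returns a dict: query residue index -> copied (x, y, z).
    """
    return {q: template_ca[t] for q, t in alignment if t in template_ca}


def percent_within(model_ca, native_ca, cutoff):
    """Percentage of query residues whose model Cα is closer than `cutoff` Å
    to the corresponding Cα of the known structure.

    Assumption: both coordinate sets are already superposed, and the
    denominator is the number of residues in the known structure.
    """
    if not native_ca:
        raise ValueError("known structure has no residues")
    hits = sum(
        1
        for q, xyz in native_ca.items()
        if q in model_ca and math.dist(model_ca[q], xyz) < cutoff
    )
    return 100.0 * hits / len(native_ca)


# Toy data (hypothetical): four query residues, three of them aligned.
native_ca = {0: (0.0, 0.0, 0.0), 1: (3.8, 0.0, 0.0),
             2: (7.6, 0.0, 0.0), 3: (11.4, 0.0, 0.0)}
template_ca = {10: (0.0, 0.0, 1.0), 11: (3.8, 0.0, 3.0), 12: (7.6, 6.0, 0.0)}
alignment = [(0, 10), (1, 11), (2, 12)]

model = copy_backbone(alignment, template_ca)
print(percent_within(model, native_ca, 2.0))  # 25.0
print(percent_within(model, native_ca, 5.0))  # 50.0
```

In this toy case the modelled residues deviate by 1 Å, 3 Å and 6 Å, and one residue is unaligned, so one of four residues passes the 2 Å cutoff and two of four pass the 5 Å cutoff. As the footnote notes, the residues that pass need not be consecutive in sequence.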