BMC Bioinformatics. 2009 Oct 19;10(Suppl 13):P2. doi: 10.1186/1471-2105-10-S13-P2

PAUL: protein structural alignment using integer linear programming and Lagrangian relaxation

Inken Wohlers 1, Lars Petzold 2, Francisco S Domingues 3, Gunnar W Klau 1
PMCID: PMC2764133

Background

Protein structural alignment determines the three-dimensional superposition of protein structures by aligning the proteins' residues. It is a basic method for identifying proteins of related structure or common evolutionary origin and for measuring three-dimensional similarity. Applications include the search for proteins with similar biological function and the classification of proteins based on their structural features.

Methods

We present a structural alignment approach that computes an alignment based on the proteins' inter-residue distances. Building upon the work of Caprara et al. [1] on the alignment of protein contact maps, we use these distances to formulate the problem as an integer linear program, which is subsequently solved using Lagrangian relaxation. One advantage of the integer linear programming formulation over heuristic methods is that, in many cases, the computed alignments are provably optimal. The bottleneck of the integer linear programming approach is its computational complexity, which does not allow all inter-residue distances to be incorporated into the problem description. We therefore select and score inter-residue distances efficiently. We develop and optimize a scoring function inspired by Holm and Sander [2] using a set of 200 pairwise HOMSTRAD [3] alignments with a sequence identity of less than 35%. Subsequently, we use this scoring function to assess the performance of PAUL on the more challenging SISY data set of 130 alignments [4,5]. On this data set we compare PAUL alignments to alignments computed by MATRAS [6], DALI [2], FATCAT [7], SHEBA [8], CA [9] and CE [10].
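To illustrate what scoring an alignment by inter-residue distances can look like, the following minimal sketch computes distance matrices from C-alpha coordinates and evaluates the elastic similarity score of Holm and Sander [2]. This is not PAUL's scoring function, which is only described as inspired by that work. The function names, the use of C-alpha atoms, and the parameter values theta = 0.2 and alpha = 20 Å (taken from the DALI formulation) are assumptions made for illustration.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (in angstroms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def elastic_score(dist_a, dist_b, alignment, theta=0.2, alpha=20.0):
    """DALI-style elastic score of an alignment (illustrative, not PAUL's function).

    alignment is a list of (i, k) pairs meaning residue i of structure A is
    aligned to residue k of structure B. Distinct residues are assumed to
    have distinct coordinates, so the mean distance below is never zero.
    """
    total = 0.0
    for i, k in alignment:
        for j, l in alignment:
            if i == j:
                total += theta  # a residue pair compared with itself
                continue
            mean = 0.5 * (dist_a[i][j] + dist_b[k][l])
            deviation = abs(dist_a[i][j] - dist_b[k][l]) / mean
            # Distant residue pairs are down-weighted by a Gaussian envelope.
            total += (theta - deviation) * math.exp(-(mean / alpha) ** 2)
    return total


# Hypothetical toy coordinates: three residues per structure.
coords_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
coords_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
identity = [(0, 0), (1, 1), (2, 2)]

d_a = distance_matrix(coords_a)
d_b = distance_matrix(coords_b)
print(elastic_score(d_a, d_a, identity))  # structure A against itself
print(elastic_score(d_a, d_b, identity))  # A against the bent structure B
```

In an integer linear programming formulation, a score of this kind would be the objective to maximize over all valid alignments; the sketch only evaluates it for one given alignment.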

Results and conclusion

Our novel, non-heuristic structural alignment algorithm is flexible and mathematically sound. On the SISY data set, PAUL alignments show higher mean and median alignment accuracies than those of all other methods (see Figure 1). In more than 30% of the cases, PAUL is the most accurate method. PAUL is thus competitive with other state-of-the-art algorithms and a beneficial tool for high-quality pairwise structural alignment.
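The abstract does not define alignment accuracy. A common definition, assumed here, is the percentage of residue pairs in a reference alignment that the computed alignment reproduces. The sketch below computes this value per alignment and then summarizes a set of alignments by mean and median; the function name and the toy data are hypothetical.

```python
from statistics import mean, median


def alignment_accuracy(computed, reference):
    """Percentage of reference residue pairs also found in the computed alignment."""
    reference_pairs = set(reference)
    if not reference_pairs:
        raise ValueError("reference alignment is empty")
    shared = reference_pairs & set(computed)
    return 100.0 * len(shared) / len(reference_pairs)


# Hypothetical toy data: (computed alignment, reference alignment) per protein pair.
cases = [
    ([(0, 0), (1, 1), (2, 3)], [(0, 0), (1, 1), (2, 2), (3, 3)]),
    ([(0, 1), (1, 2)], [(0, 1), (1, 2)]),
    ([(0, 0)], [(1, 1), (2, 2)]),
]

accuracies = [alignment_accuracy(c, r) for c, r in cases]
print(accuracies, mean(accuracies), median(accuracies))
```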

Figure 1.

Box-and-whisker plots of the distributions of alignment accuracies (in percent) on the SISY set for PAUL, MATRAS, DALI, FATCAT, SHEBA, CA and CE. The average alignment accuracies are additionally marked in blue.

References

  1. Caprara A, Carr R, Istrail S, Lancia G, Walenz B. 1001 optimal PDB structure alignments: integer programming methods for finding the maximum contact map overlap. J Comput Biol. 2004;11:27–52. doi: 10.1089/106652704773416876.
  2. Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993;233:123–138. doi: 10.1006/jmbi.1993.1489.
  3. Mizuguchi K, Deane CM, Blundell TL, Overington JP. Homstrad: a database of protein structure alignments for homologous families. Protein Sci. 1998;7:2469–2471. doi: 10.1002/pro.5560071126.
  4. Andreeva A, Prlic A, Hubbard TJ, Murzin AG. Sisyphus-structural alignments for proteins with non-trivial relationships. Nucleic Acids Res. 2007:253–259. doi: 10.1093/nar/gkl746.
  5. Mayr G, Domingues FS, Lackner P. Comparative analysis of protein structure alignments. BMC Struct Biol. 2007;7:50–50. doi: 10.1186/1472-6807-7-50.
  6. Kawabata T. Matras: A program for protein 3d structure comparison. Nucleic Acids Res. 2003;31:3367–3369. doi: 10.1093/nar/gkg581.
  7. Ye Y, Godzik A. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics. 2003;19:ii246–ii255. doi: 10.1093/bioinformatics/btg1086.
  8. Jung J, Lee B. Protein structure alignment using environmental profiles. Protein Eng. 2000;13:535–543. doi: 10.1093/protein/13.8.535.
  9. Bachar O, Fischer D, Nussinov R, Wolfson H. A computer vision based technique for 3-D sequence-independent structural comparison of proteins. Protein Eng. 1993;6:279–288. doi: 10.1093/protein/6.3.279.
  10. Shindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 1998;11:739–747. doi: 10.1093/protein/11.9.739.

Articles from BMC Bioinformatics are provided here courtesy of BMC