Abstract
The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on “the inverse protein folding problem” laid the foundation for protein structure prediction by protein threading. Using simple measures of how well different amino acid types fit local structural environments, defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple yet profoundly novel approach to assessing whether a protein sequence fits a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work of Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new class of powerful tools for protein structure prediction, which we now term “protein threading.” These computational tools have played a key role in extending the utility of the structures solved experimentally by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes sequenced to date.
Keywords: Energy Function, Query Sequence, Tree Decomposition, Protein Structure Prediction, Query Protein
Contributor Information
Ying Xu, Email: xyn@bmb.uga.edu.
Dong Xu, Email: xudong@missouri.edu.
Jie Liang, Email: jliang@uic.edu.
References
- Alexandrov N., Shindyalov I. PDP: protein domain parser. Bioinformatics. 2003;19:429–430. doi: 10.1093/bioinformatics/btg006. [DOI] [PubMed] [Google Scholar]
- Altschul S.F., Gish W. Local alignment statistics. Methods Enzymol. 1996;266:460–480. doi: 10.1016/S0076-6879(96)66029-7. [DOI] [PubMed] [Google Scholar]
- Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Andreeva A., Howorth D., Brenner S.E., Hubbard T.J., Chothia C., Murzin A.G. SCOP database in 2004: Refinements integrate structure and sequence family data. Nucleic Acids Res. 2004;32:D226–D229. doi: 10.1093/nar/gkh039. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Apic G., Gough J., Teichmann S.A. Domain combinations in archaeal, eubacterial and eukaryotic proteomes. J. Mol. Biol. 2001;310:311–325. doi: 10.1006/jmbi.2001.4776. [DOI] [PubMed] [Google Scholar]
- Apic G., Gough J., Teichmann S.A. An insight into domain combinations. Bioinformatics. 2001;17(Suppl. 1):83–89. doi: 10.1093/bioinformatics/17.suppl_1.s83. [DOI] [PubMed] [Google Scholar]
- Arnborg S., Proskurowski A. Linear time algorithms for NP-hard problems restricted to partial k-trees. Discrete Appl Math. 1989;23:11–24. doi: 10.1016/0166-218X(89)90031-0. [DOI] [Google Scholar]
- Bairoch A., Apweiler R., Wu C.H., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Natale D.A., O’Donovan C., Redaschi N., Yeh L.S. The Universal Protein Resource (UniProt). Nucleic Acids Res. 2005;33:D154–D159. doi: 10.1093/nar/gki070. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Baker D., Sali A. Protein structure prediction and structural genomics. Science. 2001;294:93–96. doi: 10.1126/science.1065659. [DOI] [PubMed] [Google Scholar]
- Barton G.J., Sternberg M.J. A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. J. Mol. Biol. 1987;198:327–337. doi: 10.1016/0022-2836(87)90316-0. [DOI] [PubMed] [Google Scholar]
- Barton G.J., Sternberg M.J. Flexible protein sequence patterns. A sensitive method to detect weak structural similarities. J. Mol. Biol. 1990;212:389–402. doi: 10.1016/0022-2836(90)90133-7. [DOI] [PubMed] [Google Scholar]
- Bodlaender H.L. A linear time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput. 1996;25:1305–1317. doi: 10.1137/S0097539793251219. [DOI] [Google Scholar]
- Bourne P.E., Weissig H., editors. Structural Bioinformatics. New York: Wiley-Liss; 2003. [Google Scholar]
- Bowie J.U., Luthy R., Eisenberg D. A method to identify protein sequences that fold into a known three-dimensional structure. Science. 1991;253:164–170. doi: 10.1126/science.1853201. [DOI] [PubMed] [Google Scholar]
- Branden C., Tooze J. Introduction to Protein Structure. 2nd ed. New York: Garland Publishing; 1999. [Google Scholar]
- Brassard G., Bratley P. Fundamentals of Algorithmics. Upper Saddle River, NJ: Prentice-Hall; 1996. pp. 265–266. [Google Scholar]
- Brenner S.E., Chothia C., Hubbard T.J., Murzin A.G. Understanding protein structure: Using scop for fold interpretation. Methods Enzymol. 1996;266:635–643. doi: 10.1016/S0076-6879(96)66039-X. [DOI] [PubMed] [Google Scholar]
- Bryant S.H., Altschul S.F. Statistics of sequence-structure threading. Curr. Opin. Struct. Biol. 1995;5:236–244. doi: 10.1016/0959-440X(95)80082-4. [DOI] [PubMed] [Google Scholar]
- Calland P.Y. On the structural complexity of a protein. Protein Eng. 2003;16:79–86. doi: 10.1093/proeng/gzg011. [DOI] [PubMed] [Google Scholar]
- Chen W., Mirny L., Shakhnovich E.I. Fold recognition with minimal gaps. Proteins. 2003;51:531–543. doi: 10.1002/prot.10402. [DOI] [PubMed] [Google Scholar]
- Clore G.M., Robien M.A., Gronenborn A.M. Exploring the limits of precision and accuracy of protein structures determined by nuclear magnetic resonance spectroscopy. J. Mol. Biol. 1993;231:82–102. doi: 10.1006/jmbi.1993.1259. [DOI] [PubMed] [Google Scholar]
- Cohen F.E., Sternberg M.J. On the use of chemically derived distance constraints in the prediction of protein structure with myoglobin as an example. J. Mol. Biol. 1980;137:9–22. doi: 10.1016/0022-2836(80)90154-0. [DOI] [PubMed] [Google Scholar]
- Coulson A.F., Moult J. A unifold, mesofold, and superfold model of protein fold use. Proteins. 2002;46:61–71. doi: 10.1002/prot.10011. [DOI] [PubMed] [Google Scholar]
- de Bakker P.I., Bateman A., Burke D.F., Miguel R.N., Mizuguchi K., Shi J., Shirai H., Blundell T.L. HOMSTRAD: Adding sequence information to structure-based alignments of homologous protein families. Bioinformatics. 2001;17:748–749. doi: 10.1093/bioinformatics/17.8.748. [DOI] [PubMed] [Google Scholar]
- de Haan C.A., Stadler K., Godeke G.J., Bosch B.J., Rottier P.J. Cleavage inhibition of the murine coronavirus spike protein by a furin-like enzyme affects cell-cell but not virus-cell fusion. J. Virol. 2004;78:6048–6054. doi: 10.1128/JVI.78.11.6048-6054.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- De Witte R.S., Shakhnovich E.I. SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 1. Methodology and supporting evidence. J. Am. Chem. Soc. 1996;118:11733–11744. doi: 10.1021/ja960751u. [DOI] [Google Scholar]
- Dietmann S., Holm L. Identification of homology in protein structure classification. Nat. Struct. Biol. 2001;8:953–957. doi: 10.1038/nsb1101-953. [DOI] [PubMed] [Google Scholar]
- Ding C.H., Dubchak I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics. 2001;17:349–358. doi: 10.1093/bioinformatics/17.4.349. [DOI] [PubMed] [Google Scholar]
- Doolittle R.F. The multiplicity of domains in proteins. Annu. Rev. Biochem. 1995;64:287–314. doi: 10.1146/annurev.bi.64.070195.001443. [DOI] [PubMed] [Google Scholar]
- Dutta S., Berman H.M. Large macromolecular complexes in the Protein Data Bank: A status report. Structure. 2005;13:381–388. doi: 10.1016/j.str.2005.01.008. [DOI] [PubMed] [Google Scholar]
- Ekman D., Bjorklund A.K., Frey-Skott J., Elofsson A. Multi-domain proteins in the three kingdoms of life: Orphan domains and other unassigned regions. J. Mol. Biol. 2005;348:231–243. doi: 10.1016/j.jmb.2005.02.007. [DOI] [PubMed] [Google Scholar]
- Elofsson A., Fischer D., Rice D.W., Le Grand S.M., Eisenberg D. A study of combined structure/sequence profiles. Fold. Des. 1996;1:451–461. doi: 10.1016/S1359-0278(96)00061-2. [DOI] [PubMed] [Google Scholar]
- Fetrow J.S., Giammona A., Kolinski A., Skolnick J. The protein folding problem: A biophysical enigma. Curr. Pharm. Biotechnol. 2002;3:329–347. doi: 10.2174/1389201023378120. [DOI] [PubMed] [Google Scholar]
- Finkelstein A.V., Ptitsyn O.B. Why do globular proteins fit the limited set of folding patterns? Prog. Biophys. Mol. Biol. 1987;50:171–190. doi: 10.1016/0079-6107(87)90013-7. [DOI] [PubMed] [Google Scholar]
- Fischer D. Hybrid fold recognition: Combining sequence derived properties with evolutionary information. Pacific Symp. Biocomputing, Hawaii. World Scientific; 2000. pp. 119–130. [PubMed]
- Fischer D. 3D-SHOTGUN: A novel, cooperative, fold-recognition metapredictor. Proteins. 2003;51:434–441. doi: 10.1002/prot.10357. [DOI] [PubMed] [Google Scholar]
- Fischer D., Eisenberg D. Protein fold recognition using sequence-derived predictions. Protein Sci. 1996;5:947–955. doi: 10.1002/pro.5560050516. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fischer D., Elofsson A., Rice D., Eisenberg D. Assessing the performance of fold recognition methods by means of a comprehensive benchmark. Pac. Symp. Biocomput. 1996a:300–318. [PubMed]
- Fischer D., Rice D., Bowie J.U., Eisenberg D. Assigning amino acid sequences to 3-dimensional protein folds. FASEB J. 1996;10:126–136. doi: 10.1096/fasebj.10.1.8566533. [DOI] [PubMed] [Google Scholar]
- Frederickson G.N. Planar graph decomposition and all pairs shortest paths. J. Assoc. Comput. Mach. 1991;38:162–204. [Google Scholar]
- Gaasterland T. Structural genomics: Bioinformatics in the driver’s seat. Nat. Biotechnol. 1998;16:625–627. doi: 10.1038/nbt0798-625. [DOI] [PubMed] [Google Scholar]
- Gelfand M.S., Koonin E.V., Mironov A.A. Prediction of transcription regulatory sites in Archaea by a comparative genomic approach. Nucleic Acids Res. 2000;28:695–705. doi: 10.1093/nar/28.3.695. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gerlt J.A., Babbitt P.C. Can sequence determine function? Genome Biol. 2000;1(5):reviews0005.1–0005.10. [DOI] [PMC free article] [PubMed]
- Gerstein M. A structural census of genomes: Comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure. J. Mol. Biol. 1997;274:562–576. doi: 10.1006/jmbi.1997.1412. [DOI] [PubMed] [Google Scholar]
- Gerstein M. How representative are the known structures of the proteins in a complete genome? A comprehensive structural census. Fold. Des. 1998;3:497–512. doi: 10.1016/S1359-0278(98)00066-2. [DOI] [PubMed] [Google Scholar]
- Gerstein M., Hegyi H. Comparing genomes in terms of protein structure: Surveys of a finite parts list. FEMS Microbiol. Rev. 1998;22:277–304. doi: 10.1111/j.1574-6976.1998.tb00371.x. [DOI] [PubMed] [Google Scholar]
- Godzik A. Fold recognition methods. Methods Biochem Anal. 2003;44:525–546. doi: 10.1002/0471721204.ch26. [DOI] [PubMed] [Google Scholar]
- Guo J.T., Ellrott K., Chung W.J., Xu D., Passovets S., Xu Y. PROSPECT-PSPP: An automatic computational pipeline for protein structure prediction. Nucleic Acids Res. 2004;32:W522–525. doi: 10.1093/nar/gkh414. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hobohm U., Scharf M., Schneider R., Sander C. Selection of representative protein data sets. Protein Sci. 1992;1:409–417. doi: 10.1002/pro.5560010313. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Holm L., Sander C. Mapping the protein universe. Science. 1996;273:595–603. doi: 10.1126/science.273.5275.595. [DOI] [PubMed] [Google Scholar]
- Holm L., Sander C. The FSSP database: Fold classification based on structure-structure alignment of proteins. Nucleic Acids Res. 1996;24:206–209. doi: 10.1093/nar/24.1.206. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jacobson M.P., Pincus D.L., Rapp C.S., Day T.J., Honig B., Shaw D.E., Friesner R.A. A hierarchical approach to all-atom protein loop prediction. Proteins. 2004;55:351–367. doi: 10.1002/prot.10613. [DOI] [PubMed] [Google Scholar]
- Jiang T., Xu Y., Zhang M., editors. Current Topics in Computational Molecular Biology. Cambridge, MA: MIT Press; 2002. [Google Scholar]
- Jones D.T. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 1999;292:195–202. doi: 10.1006/jmbi.1999.3091. [DOI] [PubMed] [Google Scholar]
- Jones D.T. GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 1999;287:797–815. doi: 10.1006/jmbi.1999.2583. [DOI] [PubMed] [Google Scholar]
- Jones D.T., Taylor W.R., Thornton J.M. A new approach to protein fold recognition. Nature. 1992;358:86–89. doi: 10.1038/358086a0. [DOI] [PubMed] [Google Scholar]
- Kim D., Xu D., Guo J.T., Ellrott K., Xu Y. PROSPECT II: Protein structure prediction program for genome-scale applications. Protein Eng. 2003;16:641–650. doi: 10.1093/protein/gzg081. [DOI] [PubMed] [Google Scholar]
- Kinch L.N., Wrabl J.O., Krishna S.S., Majumdar I., Sadreyev R.I., Qi Y., Pei J., Cheng H., Grishin N.V. CASP5 assessment of fold recognition target predictions. Proteins. 2003;53(Suppl.6):395–409. doi: 10.1002/prot.10557. [DOI] [PubMed] [Google Scholar]
- Koonin E.V., Wolf Y.I., Karev G.P. The structure of the protein universe and genome evolution. Nature. 2002;420:218–223. doi: 10.1038/nature01256. [DOI] [PubMed] [Google Scholar]
- Laskowski R.A., MacArthur M.W., Moss D.S., Thornton J.M. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 1993;26:283–291. doi: 10.1107/S0021889892009944. [DOI] [Google Scholar]
- Lathrop R.H. The protein threading problem with sequence amino acid interaction preferences is NP-complete. Protein Eng. 1994;7:1059–1068. doi: 10.1093/protein/7.9.1059. [DOI] [PubMed] [Google Scholar]
- Lesk A. Introduction to Protein Architecture: The Structural Biology of Proteins. London: Oxford University Press; 2001. [Google Scholar]
- Levitt M., Gerstein M. A unified statistical framework for sequence comparison and structure comparison. Proc. Natl. Acad. Sci. USA. 1998;95:5913–5920. doi: 10.1073/pnas.95.11.5913. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Helling R., Tang C., Wingreen N. Emergence of preferred structures in a simple model of protein folding. Science. 1996;273:666–669. doi: 10.1126/science.273.5275.666. [DOI] [PubMed] [Google Scholar]
- Li H., Tang C., Wingreen N.S. Are protein folds atypical? Proc. Natl. Acad. Sci. USA. 1998;95:4987–4990. doi: 10.1073/pnas.95.9.4987. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Tang C., Wingreen N.S. Designability of protein structures: A lattice-model study using the Miyazawa-Jernigan matrix. Proteins. 2002;49:403–412. doi: 10.1002/prot.10239. [DOI] [PubMed] [Google Scholar]
- Lu H., Skolnick J. A distance-dependent atomic knowledge-based potential for improved protein structure selection. Proteins. 2001;44:223–232. doi: 10.1002/prot.1087. [DOI] [PubMed] [Google Scholar]
- Li X., Liang J. Geometric cooperativity and anti-cooperativity of three-body interactions in native proteins. Proteins. 2005;60:46–65. doi: 10.1002/prot.20438. [DOI] [PubMed] [Google Scholar]
- Lu L., Arakaki A.K., Lu H., Skolnick J. Multimeric threading-based prediction of protein-protein interactions on a genomic scale: Application to the Saccharomyces cerevisiae proteome. Genome Res. 2003;13(6A):1146–1154. doi: 10.1101/gr.1145203. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lund O., Frimand K., Gorodkin J., Bohr H., Bohr J., Hansen J., Brunak S. Protein distance constraints predicted by neural networks and probability density functions. Protein Eng. 1997;10:1241–1248. doi: 10.1093/protein/10.11.1241. [DOI] [PubMed] [Google Scholar]
- Lundstrom J., Rychlewski L., Bujnicki J., Elofsson A. Pcons: A neural-network-based consensus predictor that improves fold recognition. Protein Sci. 2001;10:2354–2362. doi: 10.1110/ps.08501. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Madej T., Boguski M.S., Bryant S.H. Threading analysis suggests that the obese gene product may be a helical cytokine. FEBS Lett. 1995;373:13–18. doi: 10.1016/0014-5793(95)00977-H. [DOI] [PubMed] [Google Scholar]
- Makarova K.S., Aravind L., Galperin M.Y., Grishin N.V., Tatusov R.L., Wolf Y.I., Koonin E.V. Comparative genomics of the Archaea (Euryarchaeota): Evolution of conserved protein families, the stable core, and the variable shell. Genome Res. 1999;9:608–628. [PubMed] [Google Scholar]
- May R.M. How many species are there on Earth? Science. 1988;241:1441–1449. doi: 10.1126/science.241.4872.1441. [DOI] [PubMed] [Google Scholar]
- McGuffin L.J., Jones D.T. Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics. 2003;19:874–881. doi: 10.1093/bioinformatics/btg097. [DOI] [PubMed] [Google Scholar]
- McGuffin L.J., Street S.A., Bryson K., Sorensen S.A., Jones D.T. The Genomic Threading Database: A comprehensive resource for structural annotations of the genomes from key organisms. Nucleic Acids Res. 2004;32:D196–199. doi: 10.1093/nar/gkh043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Melo F., Feytmans E. Novel knowledge-based mean force potential at atomic level. J. Mol. Biol. 1997;267:207–222. doi: 10.1006/jmbi.1996.0868. [DOI] [PubMed] [Google Scholar]
- Mirny L.A., Finkelstein A.V., Shakhnovich E.I. Statistical significance of protein structure prediction by threading. Proc. Natl. Acad. Sci. USA. 2000;97:9978–9983. doi: 10.1073/pnas.160271197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Munson P.J., Singh R.K. Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence-structure alignment. Protein Sci. 1997;6:1467–1481. doi: 10.1002/pro.5560060711. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murzin A.G., Brenner S.E., Hubbard T., Chothia C. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995;247:536–540. doi: 10.1006/jmbi.1995.0159. [DOI] [PubMed] [Google Scholar]
- Orengo C.A., Jones D.T., Thornton J.M. Protein superfamilies and domain superfolds. Nature. 1994;372:631–634. doi: 10.1038/372631a0. [DOI] [PubMed] [Google Scholar]
- Orengo C.A., Michie A.D., Jones S., Jones D.T., Swindells M.B., Thornton J.M. CATH—A hierarchic classification of protein domain structures. Structure. 1997;5:1093–1108. doi: 10.1016/S0969-2126(97)00260-8. [DOI] [PubMed] [Google Scholar]
- Orengo C.A., Taylor W.R. A local alignment method for protein structure motifs. J. Mol. Biol. 1993;233:488–497. doi: 10.1006/jmbi.1993.1526. [DOI] [PubMed] [Google Scholar]
- Panchenko A., Marchler-Bauer A., Bryant S.H. Threading with explicit models for evolutionary conservation of structure and sequence. Proteins Suppl. 1999;3:133–140. doi: 10.1002/(SICI)1097-0134(1999)37:3+<133::AID-PROT18>3.0.CO;2-D. [DOI] [PubMed] [Google Scholar]
- Panchenko A.R., Marchler-Bauer A., Bryant S.H. Combination of threading potentials and sequence profiles improves fold recognition. J. Mol. Biol. 2000;296:1319–1331. doi: 10.1006/jmbi.2000.3541. [DOI] [PubMed] [Google Scholar]
- Papadimitriou C.H., Steiglitz K. Combinatorial Optimization: Algorithms and Complexity. New York: Dover Publications; 1998. [Google Scholar]
- Prestegard J.H. New techniques in structural NMR-anisotropic interactions. Nat. Struct. Biol. 1998;5:517–522. doi: 10.1038/756. [DOI] [PubMed] [Google Scholar]
- Qu Y., Guo J.T., Olman V., Xu Y. Protein fold recognition through application of residual dipolar coupling data. Pac. Symp. Biocomput. 2004a:459–470. [DOI] [PubMed]
- Qu Y., Guo J.T., Olman V., Xu Y. Protein structure prediction using sparse dipolar coupling data. Nucleic Acids Res. 2004;32:551–561. doi: 10.1093/nar/gkh204. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Richardson J.S. The anatomy and taxonomy of protein structure. Adv. Protein Chem. 1981;34:167–339. doi: 10.1016/S0065-3233(08)60520-3. [DOI] [PubMed] [Google Scholar]
- Robertson N., Seymour P.D. Graph minors. II. Algorithmic aspects of tree-width. J. Algorithms. 1986;7:309–322. doi: 10.1016/0196-6774(86)90023-4. [DOI] [Google Scholar]
- Rost B., Schneider R., Sander C. Protein fold recognition by prediction-based threading. J. Mol. Biol. 1997;270:471–480. doi: 10.1006/jmbi.1997.1101. [DOI] [PubMed] [Google Scholar]
- Sali A., Blundell T.L. Definition of general topological equivalence in protein structures. A procedure involving comparison of properties and relationships through simulated annealing and dynamic programming. J. Mol. Biol. 1990;212:403–428. doi: 10.1016/0022-2836(90)90134-8. [DOI] [PubMed] [Google Scholar]
- Samudrala R., Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J. Mol. Biol. 1998;275:895–916. doi: 10.1006/jmbi.1997.1479. [DOI] [PubMed] [Google Scholar]
- Shi J., Blundell T.L., Mizuguchi K. FUGUE: Sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J. Mol. Biol. 2001;310:243–257. doi: 10.1006/jmbi.2001.4762. [DOI] [PubMed] [Google Scholar]
- Sippl M.J. Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. J. Mol. Biol. 1990;213:859–883. doi: 10.1016/S0022-2836(05)80269-4. [DOI] [PubMed] [Google Scholar]
- Sippl M.J., Lackner P., Domingues F.S., Prlic A., Malik R., Andreeva A., Wiederstein M. Assessment of the CASP4 fold recognition category. Proteins. 2001;Suppl. 5:55–67. doi: 10.1002/prot.10006. [DOI] [PubMed] [Google Scholar]
- Skolnick J., Fetrow J.S., Kolinski A. Structural genomics and its importance for gene function analysis. Nat. Biotechnol. 2000;18:283–287. doi: 10.1038/73723. [DOI] [PubMed] [Google Scholar]
- Skolnick J., Kihara D. Defrosting the frozen approximation: PROSPECTOR: A new approach to threading. Proteins. 2001;42:319–331. doi: 10.1002/1097-0134(20010215)42:3<319::AID-PROT30>3.0.CO;2-A. [DOI] [PubMed] [Google Scholar]
- Sommer I., Zien A., von Ohsen N., Zimmer R., Lengauer T. Confidence measures for protein fold recognition. Bioinformatics. 2002;18:802–812. doi: 10.1093/bioinformatics/18.6.802. [DOI] [PubMed] [Google Scholar]
- Song Y., Ellrott K., Liu C., Guo J., Xu Y., Cai L. Tree decomposition based protein threading. 2005. Submitted.
- Sorenson J.M., Head-Gordon T. Redesigning the hydrophobic core of a model beta-sheet protein: Destabilizing traps through a threading approach. Proteins. 1999;37:582–591. doi: 10.1002/(SICI)1097-0134(19991201)37:4<582::AID-PROT9>3.0.CO;2-M. [DOI] [PubMed] [Google Scholar]
- Tatusov R.L., Galperin M.Y., Natale D.A., Koonin E.V. The COG database: A tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 2000;28:33–36. doi: 10.1093/nar/28.1.33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Taylor W.R., Orengo C.A. Protein structure alignment. J. Mol. Biol. 1989;208:1–22. doi: 10.1016/0022-2836(89)90084-3. [DOI] [PubMed] [Google Scholar]
- Tolman J.R., Flanagan J.M., Kennedy M.A., Prestegard J.H. Nuclear magnetic dipole interactions in field-oriented proteins: Information for structure determination in solution. Proc. Natl. Acad. Sci. USA. 1995;92:9279–9283. doi: 10.1073/pnas.92.20.9279. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tsigelny I.F., editor. Protein Structure Prediction: Bioinformatic Approach. La Jolla, CA: International University Line Publishers; 2002. [Google Scholar]
- Venclovas C., Zemla A., Fidelis K., Moult J. Assessment of progress over the CASP experiments. Proteins. 2003;53(Suppl. 6):585–595. doi: 10.1002/prot.10530. [DOI] [PubMed] [Google Scholar]
- von Grotthuss M., Wyrwicz L.S., Rychlewski L. mRNA cap-1 methyltransferase in the SARS genome. Cell. 2003;113:701–702. doi: 10.1016/S0092-8674(03)00424-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Vriend G. WHAT IF: A molecular modelling and drug design program. J. Mol. Graph. 1990;8:52–56. doi: 10.1016/0263-7855(90)80070-V. [DOI] [PubMed] [Google Scholar]
- Wan X.F., Ataman D., Xu D. Application of computational biology in understanding emerging infectious diseases: Inferring the biological function for S-M complex of SARS-CoV. In: Progress in Bioinformatics. New York: Nova Science Publishers; 2005. pp. 55–80. [Google Scholar]
- Wang G., Dunbrack R.L. PISCES: A protein sequence culling server. Bioinformatics. 2003;19:1589–1591. doi: 10.1093/bioinformatics/btg224. [DOI] [PubMed] [Google Scholar]
- Wang Z.X. How many fold types of protein are there in nature? Proteins. 1996;26:186–191. doi: 10.1002/(SICI)1097-0134(199610)26:2<186::AID-PROT8>3.0.CO;2-E. [DOI] [PubMed] [Google Scholar]
- Westhead D.R., Collura V.P., Eldridge M.D., Firth M.A., Li J., Murray C.W. Protein fold recognition by threading: Comparison of algorithms and analysis of results. Protein Eng. 1995;8:1197–1204. doi: 10.1093/protein/8.12.1197. [DOI] [PubMed] [Google Scholar]
- Wetlaufer D.B. Nucleation, rapid folding, and globular intrachain regions in proteins. Proc. Natl. Acad. Sci. USA. 1973;70:697–701. doi: 10.1073/pnas.70.3.697. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xu D., Baburaj K., Peterson C.B., Xu Y. Model for the three-dimensional structure of vitronectin: Predictions for the multi-domain protein from threading and docking. Proteins. 2001;44:312–320. doi: 10.1002/prot.1096. [DOI] [PubMed] [Google Scholar]
- Xu D., Kim D., Dam P., Shah M., Uberbacher E.C., Xu Y. Characterization of protein structure and function at genome scale with a computational prediction pipeline. In: Setlow J.K., editor. Genetic Engineering, Principles and Methods. New York: Kluwer Academic/Plenum Publishers; 2003. pp. 269–293. [DOI] [PubMed] [Google Scholar]
- Xu D., Unseren M.A., Xu Y., Uberbacher E.C. Sequence-structure specificity of a knowledge-based energy function at the secondary structure level. Bioinformatics. 2000;16:257–268. doi: 10.1093/bioinformatics/16.3.257. [DOI] [PubMed] [Google Scholar]
- Xu J., Jiao F., Berger B. A tree decomposition approach to protein structure prediction. Proceedings of 2005 IEEE Computational Systems Bioinformatics Conference; 2005. pp. 247–256. [DOI] [PubMed]
- Xu J., Li M. Assessment of RAPTOR’s linear programming approach in CAFASP3. Proteins. 2003;53(Suppl. 6):579–584. doi: 10.1002/prot.10531. [DOI] [PubMed] [Google Scholar]
- Xu J., Li M., Kim D., Xu Y. RAPTOR: Optimal protein threading by linear programming. J. Bioinform. Comput. Biol. 2003;1:95–117. doi: 10.1142/S0219720003000186. [DOI] [PubMed] [Google Scholar]
- Xu J., Li M., Lin G., Kim D., Xu Y. Protein threading by linear programming. Pac. Symp. Biocomput. 2003b:264–275. [PubMed]
- Xu Y., Uberbacher E.C. A polynomial-time algorithm for a class of protein threading problems. Comput. Appl. Biosci. 1996;12:511–517. doi: 10.1093/bioinformatics/12.6.511. [DOI] [PubMed] [Google Scholar]
- Xu Y., Xu D. Protein threading using PROSPECT: Design and evaluation. Proteins. 2000;40:343–354. doi: 10.1002/1097-0134(20000815)40:3<343::AID-PROT10>3.0.CO;2-S. [DOI] [PubMed] [Google Scholar]
- Xu Y., Xu D., Crawford O.H., Einstein J.R. A computational method for NMR-constrained protein threading. J. Comput. Biol. 2000;7:449–467. doi: 10.1089/106652700750050880. [DOI] [PubMed] [Google Scholar]
- Xu Y., Xu D., Crawford O.H., Einstein J.R., Larimer F., Uberbacher E., Unseren M.A., Zhang G. Protein threading by PROSPECT: A prediction experiment in CASP3. Protein Eng. 1999;12:899–907. doi: 10.1093/protein/12.11.899. [DOI] [PubMed] [Google Scholar]
- Xu Y., Xu D., Crawford O.H., Einstein J.R., Serpersu E. Protein structure determination using protein threading and sparse NMR data. Annual Conference on Research in Computational Molecular Biology; 2000b. pp. 299–307.
- Xu Y., Xu D., Gabow H.N. Protein domain decomposition using a graph-theoretic approach. Bioinformatics. 2000;16:1091–1104. doi: 10.1093/bioinformatics/16.12.1091. [DOI] [PubMed] [Google Scholar]
- Xu Y., Xu D., Olman V. A practical method for interpretation of threading scores: An application of neural network. Stat Sinica. 2002;12:159–177. [Google Scholar]
- Xu Y., Xu D., Uberbacher E.C. An efficient computational method for globally optimal threading. J. Comput. Biol. 1998;5:597–614. doi: 10.1089/cmb.1998.5.597. [DOI] [PubMed] [Google Scholar]
- Yan B., Pan C., Olman V.N., Hettich R.L., Xu Y. A graph-theoretic approach for the separation of b and y ions in tandem mass spectra. Bioinformatics. 2005;21:563–574. doi: 10.1093/bioinformatics/bti044. [DOI] [PubMed] [Google Scholar]
- Ye X., O’Neil P.K., Foster A.N., Gajda M.J., Kosinski J., Kurowski M.A., Bujnicki J.M., Friedman A.M., Bailey-Kellogg C. Probabilistic cross-link analysis and experiment planning for high-throughput elucidation of protein structure. Protein Sci. 2004;13:3298–3313. doi: 10.1110/ps.04846604. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Young M.M., Tang N., Hempel J.C., Oshiro C.M., Taylor E.W., Kuntz I.D., Gibson B.W., Dollinger G. High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proc. Natl. Acad. Sci. USA. 2000;97:5802–5806. doi: 10.1073/pnas.090099097. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang B., Jaroszewski L., Rychlewski L., Godzik A. Similarities and differences between nonhomologous proteins with similar folds: Evaluation of threading strategies. Fold. Des. 1997;2:307–317. doi: 10.1016/S1359-0278(97)00042-4. [DOI] [PubMed] [Google Scholar]
- Zhang C., DeLisi C. Estimating the number of protein folds. J. Mol. Biol. 1998;284:1301–1305. doi: 10.1006/jmbi.1998.2282. [DOI] [PubMed] [Google Scholar]
- Zhang Y., Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004;57:702–710. doi: 10.1002/prot.20264. [DOI] [PubMed] [Google Scholar]
- Zhou H.Y., Zhou Y.Q. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 2002;11:2714–2726. doi: 10.1110/ps.0217002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou H., Zhou Y. Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments. Proteins. 2005;58:321–328. doi: 10.1002/prot.20308. [DOI] [PMC free article] [PubMed] [Google Scholar]
