Protein Science : A Publication of the Protein Society. 1995 Oct;4(10):2179–2190. doi: 10.1002/pro.5560041024

Alignment of 700 globin sequences: extent of amino acid substitution and its correlation with variation in volume.

O H Kapp, L Moens, J Vanfleteren, C N Trotman, T Suzuki, S N Vinogradov
PMCID: PMC2142974  PMID: 8535255

Abstract

Seven hundred globin sequences, including 146 nonvertebrate sequences, were aligned on the basis of conservation of secondary structure and the avoidance of gap penalties. Of the 182 positions needed to accommodate all the globin sequences, only 84 are common to all, including the absolutely conserved PheCD1 and HisF8. The mean number of amino acid substitutions per position ranges from 8 to 13 for all globins and from 5 to 9 for internal positions. Although the total sequence volumes vary by only approximately 2-3%, the variation in volume per position ranges from approximately 13% for the internal positions to approximately 21% for the surface positions. Plausible correlations exist between amino acid substitution and the variation in volume per position for the 84 common positions and the internal positions, but not for the surface positions. The amino acid substitution matrix derived from the 84 common positions was used to evaluate sequence similarity within the globins, and between the globins and phycocyanins C and colicins A, via calculation of pairwise similarity scores. The scores for globin-globin comparisons over the 84 common positions overlap the globin-phycocyanin and globin-colicin scores, with the former being intermediate. For the subset of internal positions, overlap between the three groups of scores is minimal. These results imply a continuum of amino acid sequences able to assume the common three-on-three alpha-helical structure and suggest that the determinants of the latter include sites other than those inaccessible to solvent.
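
The per-position quantities described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the toy alignment is invented, the number of substitutions at a position is assumed to mean the number of distinct residue types observed there, the variation in volume is assumed to be a coefficient of variation, and the residue volumes are commonly quoted approximate values in cubic angstroms rather than the scale used in the paper.

```python
from statistics import mean, pstdev

# Approximate residue volumes in cubic angstroms (an assumed scale, chosen
# for illustration; the abstract does not say which volumes were used).
VOLUME = {"G": 60.1, "A": 88.6, "S": 89.0, "V": 140.0,
          "L": 166.7, "I": 166.7, "F": 189.9}


def column_stats(alignment):
    """Return (distinct residue count, volume variation in %) for each
    gap-free column of an alignment given as equal-length strings."""
    stats = []
    for column in zip(*alignment):
        if "-" in column:
            continue  # keep only positions common to all sequences
        volumes = [VOLUME[residue] for residue in column]
        # Coefficient of variation of residue volume at this position.
        variation = 100.0 * pstdev(volumes) / mean(volumes)
        stats.append((len(set(column)), variation))
    return stats


# Hypothetical four-sequence alignment; the third column contains a gap.
toy = ["AVLF", "GVIF", "SL-F", "ALLF"]
for kinds, variation in column_stats(toy):
    print(kinds, round(variation, 1))
```

Skipping columns that contain a gap mirrors the restriction to positions common to all sequences; in the toy alignment this leaves three of the four columns.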


