Genetics. 2000 May;155(1):431–449. doi: 10.1093/genetics/155.1.431

Codon-substitution models for heterogeneous selection pressure at amino acid sites.

Z Yang 1, R Nielsen 1, N Goldman 1, A M Pedersen 1
PMCID: PMC1461088  PMID: 10790415

Abstract

Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (ω = dN/dS) is an important indicator of selective pressure at the protein level, with ω = 1 meaning neutral mutations, ω < 1 purifying selection, and ω > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying ω ratios. We develop models that account for heterogeneous ω ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of ω among sites. In all data sets analyzed, the selective pressure indicated by the ω ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average ω ratio across sites is <1, but in which some sites are clearly under diversifying selection with ω > 1. Genes undergoing positive selection include the beta-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for ω and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of ω among sites from real data sets.
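The point that a gene can have an average ratio below 1 while some sites have a ratio above 1 can be illustrated with a small numerical sketch. The Python below is not the authors' implementation. It assumes a hypothetical discrete mixture of three site classes, with made-up proportions and ratios, and made-up per-site likelihoods that in a real analysis would come from a codon-substitution model evaluated on a phylogeny. The Bayes-rule step is one common way to assign sites to classes and is an assumption here, since the abstract does not spell out the identification procedure.

```python
def classify_omega(omega, tol=1e-9):
    """Map a dN/dS ratio to the selection regime named in the abstract."""
    if omega < 0:
        raise ValueError("omega must be non-negative")
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "purifying" if omega < 1.0 else "diversifying"


def mean_omega(proportions, omegas):
    """Average ratio across sites for a discrete mixture of site classes."""
    return sum(p * w for p, w in zip(proportions, omegas))


def class_posteriors(proportions, site_likelihoods):
    """Bayes' rule: P(class k | site data) is proportional to
    p_k * L(site data | omega_k)."""
    joint = [p * lik for p, lik in zip(proportions, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical mixture (illustrative values, not estimates from the paper).
proportions = [0.70, 0.25, 0.05]   # fraction of sites in each class
omegas = [0.1, 1.0, 4.0]           # ratio assumed for each class

avg = mean_omega(proportions, omegas)
print("average omega:", round(avg, 3), "->", classify_omega(avg))

# Hypothetical likelihoods of one site's data under each class.
site_likelihoods = [1e-6, 5e-6, 2e-4]
post = class_posteriors(proportions, site_likelihoods)
for w, p in zip(omegas, post):
    print(f"omega = {w}: posterior = {p:.3f} ({classify_omega(w)})")
```

With these illustrative inputs the gene-wide average falls in the purifying range even though the example site is assigned mostly to the class with a ratio above 1, which is the situation the abstract describes for several of the genes analyzed.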



Articles from Genetics are provided here courtesy of Oxford University Press
