Abstract
We propose models for describing replacement rate variation in genes and proteins, in which the profile of relative replacement rates along the length of a given sequence is defined as a function of the site number. We consider two types of functions, one derived from the cosine Fourier series and the other from discrete wavelet transforms. The number of parameters used to characterize the substitution rates along the sequences can be changed flexibly, and in their most parameter-rich versions both the Fourier and wavelet models become equivalent to the unrestricted-rates model, in which each site of a sequence alignment evolves at a unique rate. When applied to a few real data sets, the new models appeared to fit the data better than the discrete gamma model as judged by the Akaike information criterion and the likelihood-ratio test, although the parametric bootstrap version of the Cox test performed for one of the data sets indicated that the difference in likelihoods between the two models was not significant. The new models can be used to test biological hypotheses such as the statistical identity of rate variation profiles among homologous protein families. They are also useful for identifying regions in genes and proteins that evolve significantly faster or slower than the sequence average. We illustrate the application of the new method by analyzing human immunoglobulin and Drosophilid alcohol dehydrogenase sequences.
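The idea of a rate profile written as a function of site number can be illustrated with a minimal Python sketch. The abstract does not state the exact parameterization, so everything here is an assumption made for illustration: the DCT-II-style cosine basis, the exponential link that keeps rates positive, the normalization to a mean rate of 1, and the hypothetical function names `cosine_basis`, `rate_profile`, and `fit_coeffs`. The least-squares fit to a known profile stands in for the likelihood-based estimation used in the article; it only shows how the number of cosine terms controls flexibility.

```python
import numpy as np


def cosine_basis(n_sites, n_terms):
    """Return an (n_sites, n_terms) matrix whose column k is
    cos(pi * k * (i + 0.5) / n_sites) for site index i = 0..n_sites-1.
    Column 0 is constant; higher k adds faster oscillation along the sequence.
    (Assumed basis, not taken from the article.)"""
    i = np.arange(n_sites)[:, None]
    k = np.arange(n_terms)[None, :]
    return np.cos(np.pi * k * (i + 0.5) / n_sites)


def rate_profile(coeffs, n_sites):
    """Map cosine-series coefficients to relative rates per site.
    exp() keeps every rate positive; dividing by the mean makes the
    rates relative, so their average is 1. Both choices are assumptions."""
    basis = cosine_basis(n_sites, len(coeffs))
    raw = np.exp(basis @ np.asarray(coeffs, dtype=float))
    return raw / raw.mean()


def fit_coeffs(target_rates, n_terms):
    """Least-squares fit of the cosine series to log(target_rates).
    Illustrative only: the article estimates parameters by likelihood."""
    target = np.asarray(target_rates, dtype=float)
    basis = cosine_basis(len(target), n_terms)
    coeffs, *_ = np.linalg.lstsq(basis, np.log(target), rcond=None)
    return coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_sites = 16
    # Synthetic per-site rates (not real data), scaled to mean 1.
    target = rng.gamma(2.0, 0.5, size=n_sites)
    target /= target.mean()

    smooth = rate_profile(fit_coeffs(target, 3), n_sites)      # few parameters
    full = rate_profile(fit_coeffs(target, n_sites), n_sites)  # one per site

    print("max error with 3 terms: ", np.abs(smooth - target).max())
    print("max error with 16 terms:", np.abs(full - target).max())
```

With as many cosine terms as sites, the basis is square and invertible, so the sketch reproduces an arbitrary per-site profile up to floating-point error. This mirrors the statement that the most parameter-rich version is equivalent to the unrestricted-rates model, while a small number of terms gives a smooth, low-dimensional profile.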