Proceedings of the National Academy of Sciences of the United States of America. 1994 Feb 15;91(4):1455–1459. doi: 10.1073/pnas.91.4.1455

Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances.

J. A. Lake
PMCID: PMC43178  PMID: 8108430

Abstract

The reconstruction of phylogenetic trees from DNA and protein sequences is confounded by unequal rate effects. These effects can group rapidly evolving taxa with other rapidly evolving taxa, whether or not they are genealogically related. All algorithms are sensitive to these effects whenever the assumptions on which they are based are not met. The algorithm presented here, called paralinear distances, is valid for a much broader class of substitution processes than previous algorithms and is accordingly less affected by unequal rate effects. It may be used with all nucleic acid, protein, or other sequences, provided that their evolution may be modeled as a succession of Markov processes. The properties of the method have been proven both analytically and by computer simulations. Like all other methods, paralinear distances can fail when sequences are misaligned or when site-to-site sequence variation of rates is extensive. To examine the usefulness of paralinear distances, the "origin of the eukaryotes" has been investigated by the analysis of elongation factor Tu sequences with a variety of sequence alignments. It has been found that the order in which sequences are pairwise aligned strongly determines the topology which is reconstructed by paralinear distances (as it does for all other reconstruction methods tested). When the parts of the alignment that are unaffected by alignment order are analyzed, paralinear distances strongly select the eocyte topology. This provides evidence that the eocyte prokaryotes are the closest prokaryotic relatives of the eukaryotes.
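
The abstract names the method but does not state its formula. As a minimal sketch, assuming the commonly cited form of the paralinear distance, d = -ln( det J / sqrt(det D_x · det D_y) ), the calculation for one pair of aligned sequences might look as follows. Here J is the matrix of joint state counts at aligned sites, and D_x and D_y are diagonal matrices holding the state counts of each sequence. The function names, the four-letter alphabet, and the choice to skip sites with gaps or ambiguous characters are illustrative assumptions, not details taken from the article.

```python
import math
from typing import List, Sequence

ALPHABET = "ACGT"  # assumed nucleotide alphabet; a protein alphabet would also work


def determinant(matrix: Sequence[Sequence[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    m: List[List[float]] = [[float(v) for v in row] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return det


def paralinear_distance(seq_x: str, seq_y: str, alphabet: str = ALPHABET) -> float:
    """Paralinear distance between two aligned sequences (assumed formula)."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {state: i for i, state in enumerate(alphabet)}
    n = len(alphabet)

    # J[i][j] counts sites where seq_x has state i and seq_y has state j.
    joint = [[0] * n for _ in range(n)]
    for a, b in zip(seq_x, seq_y):
        if a in index and b in index:  # skip gaps and ambiguous characters
            joint[index[a]][index[b]] += 1

    # D_x and D_y are diagonal, so their determinants are products of totals.
    det_dx = math.prod(sum(row) for row in joint)
    det_dy = math.prod(sum(joint[i][j] for i in range(n)) for j in range(n))
    det_j = determinant(joint)
    if det_j <= 0 or det_dx == 0 or det_dy == 0:
        raise ValueError("distance undefined for these sequences")
    return -math.log(det_j / math.sqrt(det_dx * det_dy))


if __name__ == "__main__":
    print(paralinear_distance("AAAACCCCGGGGTTTT", "AAAGCCCCGGGGTTTT"))
```

Under this assumed formula, identical sequences give a distance of zero, and the distance is undefined when any state is absent from either sequence or when det J is not positive, which is why the sketch raises an error in those cases rather than returning a value.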



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
