Proceedings of the Royal Society B: Biological Sciences. 1998 Sep 22;265(1407):1779–1786. doi: 10.1098/rspb.1998.0502

Phylogenetic information and experimental design in molecular systematics.

N Goldman
PMCID: PMC1689363  PMID: 9787470

Abstract

Despite the widespread perception that evolutionary inference from molecular sequences is a statistical problem, there has been very little attention paid to questions of experimental design. Previous consideration of this topic has led to little more than an empirical folklore regarding the choice of suitable genes for analysis, and to dispute over the best choice of taxa for inclusion in data sets. I introduce what I believe are new methods that permit the quantification of phylogenetic information in a sequence alignment. The methods use likelihood calculations based on Markov-process models of nucleotide substitution allied with phylogenetic trees, and allow a general approach to optimal experimental design. Two examples are given, illustrating realistic problems in experimental design in molecular phylogenetics and suggesting more general conclusions about the choice of genomic regions, sequence lengths and taxa for evolutionary studies.
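As a rough illustration of what quantifying information from a likelihood based on a Markov substitution model can look like, the sketch below computes the expected (Fisher) information about a single pairwise distance under the Jukes-Cantor model. This is a toy example under stated assumptions, not the method of the paper: the choice of the Jukes-Cantor model, the restriction to two sequences, and the function names `jc69_diff_prob`, `jc69_site_information` and `approx_std_error` are all introduced here for illustration only.

```python
import math


def jc69_diff_prob(d):
    """Probability that a site differs between two sequences separated by
    d expected substitutions per site, under the Jukes-Cantor model
    (an assumed model, chosen here only for simplicity)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_site_information(d):
    """Expected (Fisher) information about d carried by one aligned site.

    Under Jukes-Cantor the likelihood depends only on whether the site
    differs, so each site acts as a Bernoulli trial with probability p(d)
    and the information is p'(d)^2 / (p(d) * (1 - p(d))). Requires d > 0.
    """
    p = jc69_diff_prob(d)
    dp = math.exp(-4.0 * d / 3.0)  # derivative of p(d) with respect to d
    return dp * dp / (p * (1.0 - p))


def approx_std_error(d, n_sites):
    """Large-sample standard error of the maximum-likelihood estimate of d
    from n_sites independent sites (inverse square root of total information)."""
    return 1.0 / math.sqrt(n_sites * jc69_site_information(d))


if __name__ == "__main__":
    # Compare how informative 1000 sites are at several hypothetical distances.
    for d in (0.05, 0.2, 0.5, 1.0, 2.0):
        se = approx_std_error(d, 1000)
        print(f"d={d:4.2f}  info/site={jc69_site_information(d):8.3f}  "
              f"SE={se:.4f}  SE/d={se / d:.4f}")
```

In this toy setting the standard error shrinks with the square root of the number of sites, and the relative error SE/d is large both for very small and for very large distances, which gives some intuition for why the choice of sequence length and of genomic region (through its rate of evolution) can be treated as a design question.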


