J. Math. Chem. 2008 Dec 9;46(4):1149–1157. doi: 10.1007/s10910-008-9511-3

A complexity-based measure and its application to phylogenetic analysis

Xiaoqi Zheng 1,2, Chun Li 3, Jun Wang 4,5
PMCID: PMC7088198  PMID: 32214590

Abstract

In this article, we propose two well-defined distance metrics for biological sequences based on a universal complexity profile. To illustrate the metrics, we construct phylogenetic trees of 18 Eutherian mammals from their mtDNA sequences and of 24 coronaviruses from their whole genomes. The resulting monophyletic clusters agree well with the established taxonomic groups.
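
The abstract does not define the complexity profile or the two metrics, so the following is only a generic illustration of how a complexity-based distance between two sequences can be computed, not the authors' method. As an assumption, it swaps in a Lempel-Ziv (1976) style phrase count as the complexity measure and a normalized distance built from the complexities of the two concatenations. The function names `lz_complexity` and `lz_distance` are hypothetical.

```python
def lz_complexity(s: str) -> int:
    """Count the components of a Lempel-Ziv (1976) exhaustive-history parse.

    Illustrative stand-in only: this is NOT the complexity profile
    proposed in the article.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        k = 1
        # Grow the candidate s[i:i+k] while it can be copied from the text
        # seen so far (the copy source may overlap the candidate itself).
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        count += 1  # s[i:i+k] is one new component
        i += k
    return count


def lz_distance(s: str, t: str) -> float:
    """Normalized complexity distance (assumed form, for illustration).

    Measures how many extra components each sequence needs when it is
    appended to the other, scaled by the larger individual complexity.
    """
    cs, ct = lz_complexity(s), lz_complexity(t)
    cst, cts = lz_complexity(s + t), lz_complexity(t + s)
    return max(cst - cs, cts - ct) / max(cs, ct)


if __name__ == "__main__":
    # Toy sequences, not data from the article.
    a = "ACGT" * 5
    b = "AAAAAAAA"
    print(lz_distance(a, a), lz_distance(a, b))
```

A matrix of such pairwise distances over a set of genomes could then be passed to a standard distance-based tree-building method. Which method and which metric to use are choices this sketch leaves open.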

Keywords: Sequence complexity, mtDNA, SARS-CoV, Phylogenetic analysis

Articles from Journal of Mathematical Chemistry are provided here courtesy of Nature Publishing Group
