HvrBase++: a phylogenetic database for primate species

Jochen Kohl; Ingo Paulsen; Thomas Laubach; Achim Radtke; Arndt von Haeseler

doi:10.1093/nar/gkj030

Nucleic Acids Res. 2005 Dec 28;34(Database issue):D700–D704. doi: 10.1093/nar/gkj030

PMCID: PMC1347393 PMID: 16381963

Abstract

HvrBase++ is the improved and extended version of HvrBase. Extensions are made by adding more population-based sequence samples from all primates including humans. The current collection comprises 13 873 hypervariable region I (HVRI) sequences and 4940 hypervariable region II (HVRII) sequences. In addition, we included 1376 complete mitochondrial genomes, 205 sequences from X-chromosomal loci and 202 sequences from autosomal chromosomes 1, 8, 11 and 16. In order to reduce the introduction of erroneous data into HvrBase++, we have developed a procedure that monitors GenBank for new versions of the current data in HvrBase++ and automatically updates the collection if necessary. For the stored sequences, supplementary information such as geographic origin, population affiliation and language of the sequence donor can be retrieved. HvrBase++ is Oracle based and easily accessible by a web interface (http://www.hvrbase.org). As a new key feature, HvrBase++ provides an interactive graphical tool to easily access data from dynamically created geographical maps.

INTRODUCTION

HvrBase was originally started as a compilation of hypervariable region I (HVRI) and hypervariable region II (HVRII) mitochondrial sequences (1,2). These regions are situated in the non-coding mitochondrial control region and play an important role in population genetics (3,4). With some exceptions, mitochondrial DNA (mtDNA) follows a maternal, clonal inheritance pattern without recombination (5–7). Population genetic analyses of mtDNA therefore allow the history of maternally inherited lineages to be studied. Furthermore, mtDNA variation correlates with the geographic origin of a population and has been linked to a wide range of degenerative diseases, preferentially affecting the central nervous system, heart, muscle, renal and endocrine systems; it is also widely used in forensic comparisons (8–10).

HvrBase++ focuses on aspects of population genetics and collects meta information such as ethnic group and spoken language for each individual. For this reason, sequences have only been included if a minimum of meta information was available. Moreover, meta information is linked to a geographical information system (GIS), which allows intuitive, map-supported searches and provides additional information about countries. This interactive map search and the availability of meta information make HvrBase++ well suited to population genetic analysis.

Alongside other databases, such as MITOMAP (11), a general resource for mtDNA-related data, and the ‘mtDNA Population Database’ (12) for forensic studies, HvrBase++ contributes to the broad field of mtDNA analysis.

Mitochondrial data reflect, at best, matrilineal history. Wherever they are used in a phylogenetic analysis, a closer look at nuclear DNA (nDNA) is therefore indispensable for answering questions about phylogenetic history in its entirety. Comparing the evolutionary histories of the pyruvate dehydrogenase E1α subunit gene (PDHA1) and mtDNA, J. Hey showed that ‘variation at nuclear genes and mtDNA are not both consistent with a common demographic history’ (13).

While both hypervariable regions of mtDNA are commonly used for phylogenetic studies, no equivalent sequence markers exist when dealing with nuclear DNA. With HvrBase++, we introduce a set of ready-to-use nDNA sequence markers. Genes that code for the human immune defence, and non-coding regions around microsatellite DNA markers, are promising candidates for nDNA sequence markers owing to their mutation rate (3).

Moreover, the detection of nuclear mitochondrial-like sequences, called ‘numts’, in the last decade has cast doubt on whether mtDNA data have always been classified correctly (14–16). The concern arises because mtDNA and numt fragments can share up to 94% sequence similarity.

To meet these concerns, researchers have begun to incorporate nuclear markers in their studies. HvrBase++ now carries nuclear markers as well.

Compilation of sequences

In HvrBase++ the term ‘sequence’ denotes a piece of DNA from one individual. A ‘lineage’, in contrast, denotes a nucleotide sequence that may be shared by several individuals. Meta information (supplementary information) about the sampled individuals was collected from publications. If different sequence sources were available, they were chosen in the following order: (i) public databases like GenBank (17), (ii) supplemental data from publications, (iii) data manually extracted from publications and (iv) data requested from authors.
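The distinction between sequences and lineages, and the source ordering, can be illustrated with a minimal Python sketch. The record layout, the source labels and the function names are assumptions made for this illustration only; they are not taken from the HvrBase++ implementation, which is written in Perl.

```python
from collections import defaultdict

# Hypothetical source labels, ranked in the order given in the text
# (lower number = preferred source).
SOURCE_PRIORITY = {
    "public_database": 0,
    "supplemental_data": 1,
    "extracted_from_publication": 2,
    "requested_from_authors": 3,
}


def pick_preferred(records):
    """Keep one record per individual, preferring the best-ranked source."""
    best = {}
    for rec in records:
        current = best.get(rec["individual"])
        if (current is None
                or SOURCE_PRIORITY[rec["source"]] < SOURCE_PRIORITY[current["source"]]):
            best[rec["individual"]] = rec
    return list(best.values())


def group_into_lineages(records):
    """Map each distinct nucleotide string to the individuals that carry it."""
    lineages = defaultdict(list)
    for rec in records:
        lineages[rec["sequence"].upper()].append(rec["individual"])
    return dict(lineages)


# Toy data: ind1 is available from two sources; ind1 and ind2 share a sequence.
records = [
    {"individual": "ind1", "source": "extracted_from_publication", "sequence": "ACGTTA"},
    {"individual": "ind1", "source": "public_database", "sequence": "ACGTTA"},
    {"individual": "ind2", "source": "supplemental_data", "sequence": "acgtta"},
    {"individual": "ind3", "source": "public_database", "sequence": "ACGTCA"},
]

chosen = pick_preferred(records)
lineages = group_into_lineages(chosen)
print(len(chosen), "sequences,", len(lineages), "lineages")  # 3 sequences, 2 lineages
```

In this toy example three individuals yield three sequences but only two lineages, because two individuals carry the same nucleotide string.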

After sequences and meta information for a gene or region had been collected and extracted, a global nucleotide alignment was created. For the HVRI and HVRII regions, HvrBase++ carries a manual alignment (2) and an alignment generated with MAFFT (18). For complete mitochondrial genomes and nuclear sequences, automatically calculated alignments can be obtained from HvrBase++. A procedure checks GenBank for altered sequences and updates the data in the next release. Every update step is logged in our database system and can be traced via the HvrBase++ web interface.
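The version check can be pictured as a comparison of stored and current GenBank identifiers of the form accession.version. The following Python sketch only shows that comparison and the resulting log entries; the function names and the example identifiers are hypothetical, retrieval from GenBank is left out, and every identifier is assumed to carry a version suffix.

```python
def split_accession(accession_version):
    """Split an identifier such as 'EX000001.2' into ('EX000001', 2)."""
    accession, _, version = accession_version.partition(".")
    return accession, int(version)


def find_outdated(stored, current):
    """Return one log entry per accession whose current version is newer."""
    latest = dict(split_accession(av) for av in current)
    log = []
    for av in stored:
        accession, version = split_accession(av)
        # Accessions missing from the current list are treated as unchanged.
        newest = latest.get(accession, version)
        if newest > version:
            log.append({"accession": accession, "old": version, "new": newest})
    return log


# Made-up identifiers: the first has a newer version, the second does not.
stored_ids = ["EX000001.1", "EX000002.3"]
current_ids = ["EX000001.2", "EX000002.3"]
update_log = find_outdated(stored_ids, current_ids)
print(update_log)
```

Keeping the old and new version numbers in each log entry is one simple way to make every update step traceable afterwards.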

Sequences enter HvrBase++ only if meta information can be retrieved from the corresponding publication. The meta information must be attributable to each sequence in the paper and cover: (i) geographic origin, (ii) population, (iii) spoken language and (iv) bibliographic information. Owing to these filtering criteria, some data from publications and most forensic data, for example the comprehensive forensic dataset from the ‘mtDNA Population Database’ (12), could not be integrated into HvrBase++.
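A strict reading of these criteria can be sketched as a simple filter. The field names and the example records below are assumptions for illustration. Since Table 1 lists individuals without assignable language information, the rule actually applied was evidently more lenient than this strict version.

```python
# Hypothetical field names for the four kinds of meta information in the text.
REQUIRED_FIELDS = ("geographic_origin", "population", "language", "reference")


def missing_fields(meta):
    """List the required meta fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not str(meta.get(f) or "").strip()]


def partition_by_eligibility(records):
    """Split records into (accepted, rejected) under the strict rule."""
    accepted, rejected = [], []
    for rec in records:
        gaps = missing_fields(rec)
        if gaps:
            rejected.append((rec, gaps))
        else:
            accepted.append(rec)
    return accepted, rejected


# Made-up records: the second lacks population and language information.
candidates = [
    {"id": "seq1", "geographic_origin": "Senegal", "population": "Mandenka",
     "language": "Mandinka", "reference": "hypothetical reference"},
    {"id": "seq2", "geographic_origin": "Senegal", "population": "",
     "language": None, "reference": "hypothetical reference"},
]

accepted, rejected = partition_by_eligibility(candidates)
for rec, gaps in rejected:
    print(rec["id"], "rejected, missing:", ", ".join(gaps))
```

Reporting which fields are missing, rather than only rejecting a record, makes it easier to decide whether the information can still be requested from the authors.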

Since there is no uniform way to obtain the above-named meta information from either publications or sequence files, it is difficult to build a fully automated tool that identifies meta information spread across different resources.

Synonyms and context-dependent meanings of a word pose a further challenge: associations that are easy for humans are hard for computers. For this reason, HvrBase++ categorizes ambiguous data to facilitate a broad range of complex search patterns. Bibliographic information, such as authors, publication date, journal and PubMed identifier, is standardized. Each country is assigned to exactly one continent; for example, in HvrBase++ Turkey is assigned to Asia, and the Canary Islands are treated as part of the sovereign territory of Spain.

All 258 language entries in HvrBase++ have been adapted to comply with the SIL (Summer Institute of Linguistics) and ISO/DIS 639-2 language code standards as given in Ethnologue vol. 14 (19). To avoid information loss and to compensate for the incompleteness of either standard, it was necessary to integrate both language codes (Table 1). The example below illustrates the difficulty of associating an individual's mother tongue, as deduced from a publication, with the SIL or ISO language codes.

Table 1.

Assignment of language names to the SIL and ISO/DIS 639-2 codes in HvrBase++ for the mitochondrial dataset

SIL	ISO	No. of individuals	Language family or population
Yes	Yes	7248	English
Yes	No	41	Mandenka (population from Senegal, ‘Mandinka’ in SIL)
No	Yes	1951	Bantu (Africa's largest language family)
No	No	454	Mbenzele (population from Central African Republic)
		4611	Language information missing or not assignable
Total		14 305

This year, the SIL and ISO/DIS 639-2 codes have converged. We will account for them in the next major release.

Suppose it is known only that a certain language belongs to the Niger-Kordofanian language family. Niger-Kordofanian is a collective language code used only in the ISO standard; its languages are found throughout Southern and Central Africa as well as in sub-Saharan West Africa. Since that language family does not have an SIL code, more detailed knowledge about the specific language (e.g. the language name and where the group lives) would be needed to find a suitable SIL code.
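The categories of Table 1 can be reproduced by recording, for each individual, whether an SIL code and an ISO code could be assigned. The Python sketch below uses placeholder code strings rather than real SIL or ISO codes, and the record layout and function names are assumptions for illustration.

```python
from collections import Counter


def coverage(language, sil_code, iso_code):
    """Return the Table 1 category for one individual.

    Individuals without any language name fall into the 'missing' category;
    otherwise the category is the (SIL, ISO) availability pair.
    """
    if not language:
        return "missing"
    return ("Yes" if sil_code else "No", "Yes" if iso_code else "No")


def tally(individuals):
    """Count individuals per category."""
    return Counter(
        coverage(ind.get("language"), ind.get("sil"), ind.get("iso"))
        for ind in individuals
    )


# One made-up individual per row of Table 1; all codes are placeholders.
individuals = [
    {"language": "English", "sil": "SIL-PLACEHOLDER-1", "iso": "ISO-PLACEHOLDER-1"},
    {"language": "Mandenka", "sil": "SIL-PLACEHOLDER-2", "iso": None},
    {"language": "Bantu", "sil": None, "iso": "ISO-PLACEHOLDER-2"},
    {"language": "Mbenzele", "sil": None, "iso": None},
    {"language": None, "sil": None, "iso": None},
]

counts = tally(individuals)
for category, n in counts.items():
    print(category, n)
```

Storing both codes side by side, as in this sketch, keeps an entry searchable even when only one of the two standards covers the language.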

Technical organization

HvrBase++ is managed in an Oracle 10g relational database system. Sequence data and accompanying information are extracted and stored in HvrBase++ via Perl programs that use object-oriented modules from the BioPerl-Project (20) and the Perl DBI module.

The web client is based on the Apache web server technology. For the geographical interface, a map server (MapServer version 4.6 from the University of Minnesota) is integrated into the web client using geographical maps from publicly available resources.

UMN MapServer provides the core functionality of a GIS, allowing intuitive data access from dynamically created geographical maps.

Description of the compilation

The HvrBase++ database now comprises not only HVRI and HVRII sequences but also mitochondrial genomes and nuclear sequences from several chromosomal loci. Not surprisingly, human sequences are overrepresented, with a total of 20 037 sequences (Table 2). Table 3 displays an excerpt from the human HVRI dataset, gathered from 103 publications, which encompasses sequences from 89 countries and 220 ethnic groups.

Table 2.

Number of sequences per sequence type in humans, great apes and Neanderthals in HvrBase++

Sequence type	Humans	Great apes	Neanderthals
HVRI	13 350	520	3
HVRII	4925	13	2
Mitochondrial genomes	1376	0	0
Nuclear sequences	386	21	0
Total	20 037	554	5

Table 3.

Human HVRI datasets over six continents

Continent	Lineages	Human samples	Countries	References	Populations	Languages	SIL	ISO
Europe	2033	4358	17	39	25	31	20	16
Africa	1046	1680	25	22	47	47	22	25
North America	824	1581	7	19	34	9	9	8
South America	267	473	7	10	11	19	7	7
Asia	2867	4778	23	49	102	67	31	47
Australia/Oceania	224	473	10	10	12	28	9	16
World	7036	13 343	89	103	220	194	81	118

Note that in columns 2 and 5–9 the last row is not the arithmetic sum of the rows above, because some of the counted items (for example lineages, references or languages) occur on more than one continent.
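This can be verified directly from the values in Table 3. The short Python check below sums each column over the six continents and compares the result with the World row; only the variable names are introduced here, the numbers are those of the table.

```python
columns = ["Lineages", "Human samples", "Countries", "References",
           "Populations", "Languages", "SIL", "ISO"]

# Values copied from Table 3, one list per continent, in column order.
rows = {
    "Europe":            [2033, 4358, 17, 39, 25, 31, 20, 16],
    "Africa":            [1046, 1680, 25, 22, 47, 47, 22, 25],
    "North America":     [824, 1581, 7, 19, 34, 9, 9, 8],
    "South America":     [267, 473, 7, 10, 11, 19, 7, 7],
    "Asia":              [2867, 4778, 23, 49, 102, 67, 31, 47],
    "Australia/Oceania": [224, 473, 10, 10, 12, 28, 9, 16],
}
world = [7036, 13343, 89, 103, 220, 194, 81, 118]

# Column-wise sums over the continents.
sums = [sum(col) for col in zip(*rows.values())]

# Columns in which the World row equals the plain sum.
additive = [name for name, s, w in zip(columns, sums, world) if s == w]

for name, s, w in zip(columns, sums, world):
    print(f"{name:14s} sum={s:6d} world={w:6d} overlap={s - w}")
print("additive columns:", additive)  # ['Human samples', 'Countries']
```

Only the human samples and countries add up exactly, since each sample and each country belongs to a single continent. The lineages, for instance, sum to 7261 against 7036 in the World row, so 225 counts come from lineages recorded on more than one continent.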

Table 4 describes the 10 loci for the 407 nuclear sequences. The number of nuclear markers in HvrBase++ is currently small because the compilation is at an early stage. We are confident that sequencing and analysing nuclear genes will become increasingly important for population genetic studies, owing to the possibly contradictory histories of nuclear genes and mtDNA (3,21). Figure 1 shows the increase in HVRI/II, mitochondrial genome and nuclear sequences across all available publications over the past 25 years; we expect this upward trend to continue.

Table 4.

Location, length and number of the 407 nuclear sequences in HvrBase++

No. of sequences	Gene	Gene function	Chromosome	Length in bp
8	pdh1	Pyruvate dehydrogenase E1-α subunit gene, partial seq.	X	1769
41	Factor ix	Factor IX gene, intron 4	X	3740
42	rrm2p4	Ribonucleotide reductase M2 pseudogene 4, partial seq.	X	2392
42	tnfsf5	Tumor necrosis factor ligand superfamily 5 gene, partial seq.	X	5239
1	amelx	Amelogenin X chromosome gene, complete seq.	X	5323
71	xq13.3	Xq13.3 non-coding region	X	10 178
56	mc1r promoter	Melanocortin 1 receptor gene, promoter	16	6599
1	mc1r	Melanocortin 1 receptor gene	16	953
8	lpl	Lipoprotein lipase gene, partial seq.	8	542–1636
61	ch1	Membrane protein CH1 gene, partial seq.	1	9626
59	β-globin	β-globin gene, complete seq.	11	3008
17	β-globin repl. init. reg.	β-globin gene, repl. ori. init. reg. and partial seq.	11	1312

Figure 1. Accumulation of HVRI, HVRII, mitochondrial genomes and nuclear sequences over the last 25 years.

User interface

The new geographic map search is the centrepiece of the web interface; it provides an intuitive search method and presents the results in a clearly structured way. For more systematic searches, the well-tried form-based search function from HvrBase is recommended. Supported sequence output formats are Phylip, GenBank, XML and plain text. In combination, the form-based and map searches make it possible to find any kind of sequence available from a sampled individual.
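As an illustration of one of these output formats, the following Python sketch writes a small alignment in strict sequential Phylip format, in which a header line gives the number of sequences and sites and each name occupies exactly 10 characters. The function name and the toy alignment are assumptions; the sketch does not reproduce the export code of HvrBase++.

```python
def to_phylip(alignment):
    """Format {name: aligned_sequence} as strict sequential Phylip text."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (n_sites,) = lengths
    lines = [f" {len(alignment)} {n_sites}"]
    for name, seq in alignment.items():
        # Strict Phylip reserves exactly 10 characters for the name:
        # longer names are truncated, shorter ones padded with spaces.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


# Toy alignment with one gap character.
phylip_text = to_phylip({"ind1": "ACGT-A", "individual_two": "ACGTTA"})
print(phylip_text, end="")
```

Because names are cut to 10 characters, identifiers that differ only after the tenth character would collide, so short unique identifiers are preferable when exporting in this format.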

Figure 2 shows the geographic map functionality in HvrBase++. It is possible to search for all genes in countries and continents. A more sophisticated search can be obtained by specifying populations and language codes. Sequence patterns can be detected within genes for a whole sequence or a given range. Moreover, regular expressions allow for complex motif searches. Each country (or continent) is pictured in the world map and colour-coded, depending on the number of sequences from the respective country. More detailed information is displayed at the bottom of the world map after choosing a country from the map.
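A motif search with regular expressions, restricted to a given range, can be sketched in a few lines of Python. The function name, the 1-based inclusive coordinates and the toy sequence are assumptions for illustration; the actual search syntax offered by the web interface may differ.

```python
import re


def find_motif(sequence, pattern, start=None, end=None):
    """Return (start, end, match) tuples for regex hits in a sequence.

    Positions are 1-based and inclusive. If start or end is given, only the
    region sequence[start..end] is searched. Matches do not overlap.
    """
    lo = 0 if start is None else start - 1
    hi = len(sequence) if end is None else end
    if lo < 0 or hi > len(sequence) or lo > hi:
        raise ValueError("range lies outside the sequence")
    window = sequence[lo:hi]
    return [
        (m.start() + lo + 1, m.end() + lo, m.group())
        for m in re.finditer(pattern, window, flags=re.IGNORECASE)
    ]


# Toy sequence: search for a run of at least four C followed by T.
seq = "TTCCCCCTAACCCCTGG"
print(find_motif(seq, "C{4,}T"))                   # [(3, 8, 'CCCCCT'), (11, 15, 'CCCCT')]
print(find_motif(seq, "C{4,}T", start=9, end=17))  # [(11, 15, 'CCCCT')]
```

Reporting positions relative to the whole sequence, even when only a range is searched, keeps the hits comparable between whole-sequence and range-limited queries.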

Figure 2. Geographical map interface in HvrBase++. The upper frame contains the elements for searching sequences; the search results are displayed in the map and below it. A country's colour represents the number of sequences for a given gene. The main table shows the results of all available genes for a selected country. Additional information for each gene is displayed in separate tables (data not shown). Sequences are accessible by selecting them from the table.

Quality and completeness of the data and future directions

Although HvrBase++ represents a large compilation of HVRI and HVRII sequences, completeness cannot be claimed. The collection of mitochondrial genomes and nuclear genes will be extended, and gaps will have to be closed in future releases.

We therefore invite everyone to submit new sequences and the associated information by e-mail. We would also be grateful to receive already published sequences that are missing from our collection.

This database gives easy access to freely available sequences without altering them in any way. That means we have not checked the data for typos or any other kind of sequence errors that might have occurred between their acquisition and their publication (22–24). It is not our intention to fix putative errors in other publications and thereby create yet another version of a dataset; doing so could cause confusion when sequences from two different sources are used in comparative analyses.

We recommend that colleagues check their datasets carefully and follow the procedure proposed by Bandelt et al. (25) to detect suspicious sequence positions.

New sequence versions in GenBank are detected automatically and continuously incorporated into HvrBase++. Since there is no common way to update sequences from non-public databases, we have done this manually. As we aim for high data quality, we welcome any hints regarding programming bugs, misinterpretations or other discrepancies.

Acknowledgments

We thank all colleagues who have provided their sequence data in computer-readable format and have given us additional information when needed. Funding to pay the Open Access publication charges for this article was provided by the Deutsche Forschungsgemeinschaft (DFG).

Conflict of interest statement. None declared.

REFERENCES

1. Handt O., Meyer S., von Haeseler A. Compilation of human mtDNA control region sequences. Nucleic Acids Res. 1998;26:126–129. doi: 10.1093/nar/26.1.126.
2. Burckhardt F., von Haeseler A., Meyer S. HvrBase: compilation of mtDNA control region sequences from primates. Nucleic Acids Res. 1999;27:138–142. doi: 10.1093/nar/27.1.138.
3. Zhang D.X., Hewitt G.M. Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Mol. Ecol. 2003;12:563–584. doi: 10.1046/j.1365-294x.2003.01773.x.
4. Avise J.C. The history and purview of phylogeography: a personal reflection. Mol. Ecol. 1998;7:371–379.
5. Kondo R., Satta Y., Matsuura E.T., Ishiwa H., Takahata N., Chigusa S.I. Incomplete maternal transmission of mitochondrial DNA in Drosophila. Genetics. 1990;126:657–663. doi: 10.1093/genetics/126.3.657.
6. Gyllensten U., Wharton D., Josefsson A., Wilson A.C. Paternal inheritance of mitochondrial DNA in mice. Nature. 1991;352:255–257. doi: 10.1038/352255a0.
7. Skibinski D.O.F., Gallagher C., Beynon C.M. Mitochondrial DNA inheritance. Nature. 1994;368:817–818. doi: 10.1038/368817b0.
8. Coskun P.E., Beal M.F., Wallace D.C. Alzheimer's brains harbor somatic mtDNA control-region mutations that suppress mitochondrial transcription and replication. Proc. Natl Acad. Sci. USA. 2004;29:10726–10731. doi: 10.1073/pnas.0403649101.
9. Sukernik R.I., Derbebeva O.A., Starikovskaya E.B., Volodko N.V., Mikhailovskaya I.E., Buychov I.Yu., Lott M., Brown M., Wallace D. The mitochondrial genome and human mitochondrial diseases. Russ. J. Genet. 2002;38:161–170.
10. Budowle B., Allard M.W., Wilson M.R., Chakraborty R. Forensics and mitochondrial DNA: applications, debates, and foundations. Annu. Rev. Genomics Hum. Genet. 2003;4:119–141. doi: 10.1146/annurev.genom.4.070802.110352.
11. Brandon M.C., Lott M.T., Nguyen K.C., Spolim S., Navanthe S.B., Baldi P., Wallace D.C. MITOMAP: a human mitochondrial genome database—2004 update. Nucleic Acids Res. 2004;33:D611–D613. doi: 10.1093/nar/gki079.
12. Monson K.L., Miller K.W.P., Wilson M.R., DiZinno J.A., Budowle B. The mtDNA population database: an integrated software and database resource for forensic comparison. Forensic Sci. Commun. 2002;4. Available at http://www.fbi.gov/hqlab/fsc/backissu/april2002/miller1.htm.
13. Hey J. Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 1997;14:166–172. doi: 10.1093/oxfordjournals.molbev.a025749.
14. Mishmar D., Ruiz-Pesini E., Brandon M., Wallace D.C. Mitochondrial DNA-like sequences in the nucleus (NUMTs): insights into our African origins and the mechanism of foreign DNA integration. Hum. Mutat. 2004;23:125–133. doi: 10.1002/humu.10304.
15. Bensasson D., Zhang D.X., Hartl D.L., Hewitt G.M. Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends Ecol. Evol. 2001;16:314–321. doi: 10.1016/s0169-5347(01)02151-6.
16. Thalman O., Hebler J., Poinar H.N., Pääbo S., Vigilant L. Unreliable mtDNA data due to nuclear insertions: a cautionary tale from analysis of humans and other great apes. Mol. Ecol. 2004;13:321–335. doi: 10.1046/j.1365-294x.2003.02070.x.
17. Benson D.A., Karsch-Mizrachi I., Lipman D.J., Ostell J., Wheeler D.L. GenBank: update. Nucleic Acids Res. 2004;32:D23–D26. doi: 10.1093/nar/gkh045.
18. Katoh K., Kuma K., Toh H., Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–518. doi: 10.1093/nar/gki198.
19. Grimes B.F. In: Grimes B.F., editor. 2000. Ethnologue: Volume 1 Languages of the World, (14th Edn) ISBN 1-55671-103-4.
20. Stajich J.E., Block D., Boulez K., Brenner S.E., Chervitz S.A., Dagdigian C., Fuellen G., Gilbert J.G.R., Korf I., Lapp H., et al. The Bioperl Toolkit: Perl modules for the life sciences. Genome Res. 2002;12:1161–1168. doi: 10.1101/gr.361602.
21. Petit R.J., Duminil J., Fineschi S., Hampe A., Salvini D., Vendramin G.G. Comparative organization of chloroplast, mitochondrial and nuclear diversity in plant populations. Mol. Ecol. 2005;14:689–701. doi: 10.1111/j.1365-294X.2004.02410.x.
22. Bandelt H.-J., Quintana-Murci L.L., Salas A., Macaulay V. The fingerprint of phantom mutations in mitochondrial DNA data. Am. J. Hum. Genet. 2002;71:1150–1160. doi: 10.1086/344397.
23. Herrnstadt C., Preston G., Howell N. Errors, phantom and otherwise, in human mtDNA sequences. Am. J. Hum. Genet. 2003;72:1585–1586. doi: 10.1086/375406.
24. Forster P. To err is human. Ann. Hum. Genet. 2003;67:2–4. doi: 10.1046/j.1469-1809.2003.00002.x.
25. Bandelt H.-J., Lahermo P., Richards M., Macaulay V. Detecting errors in mtDNA data by phylogenetic analysis. Int. J. Legal Med. 2001;115:64–69. doi: 10.1007/s004140100228.
