Genetics. 2003 Oct;165(2):915–928. doi: 10.1093/genetics/165.2.915

Analysis and exploration of the use of rule-based algorithms and consensus methods for the inferral of haplotypes.

Steven Hecht Orzack, Daniel Gusfield, Jeffrey Olson, Steven Nesbitt, Lakshman Subrahmanyan, Vincent P Stanton Jr
PMCID: PMC1462785  PMID: 14573498

Abstract

The difficulty of experimental determination of haplotypes from phase-unknown genotypes has stimulated the development of nonexperimental inferral methods. One well-known approach for a group of unrelated individuals involves using the trivially deducible haplotypes (those found in individuals with zero or one heterozygous sites) and a set of rules to infer the haplotypes underlying ambiguous genotypes (those with two or more heterozygous sites). Neither the manner in which this "rule-based" approach should be implemented nor the accuracy of this approach has been adequately assessed. We implemented eight variations of this approach that differed in how a reference list of haplotypes was derived and in the rules for the analysis of ambiguous genotypes. We assessed the accuracy of these variations by comparing predicted and experimentally determined haplotypes involving nine polymorphic sites in the human apolipoprotein E (APOE) locus. The eight variations resulted in substantial differences in the average number of correctly inferred haplotype pairs. More than one set of inferred haplotype pairs was found for each of the variations we analyzed, implying that the rule-based approach is not sufficient by itself for haplotype inferral, despite its appealing simplicity. Accordingly, we explored consensus methods in which multiple inferrals for a given ambiguous genotype are combined to generate a single inferral; we show that the set of these "consensus" inferrals for all ambiguous genotypes is more accurate than the typical single set of inferrals chosen at random. We also use a consensus prediction to divide ambiguous genotypes into those whose algorithmic inferral is certain or almost certain and those whose less certain inferral makes molecular inferral preferable.
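The rule-based and consensus ideas described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce any of their eight variations. The genotype encoding (`0` and `1` for homozygous sites, `2` for a heterozygous site), the function names, and the choice to pick a compatible reference haplotype at random are all assumptions made for illustration. Unambiguous genotypes seed a reference list, each ambiguous genotype is resolved by pairing a compatible reference haplotype with its complement, and a consensus is taken by majority vote over repeated randomized runs.

```python
import random
from collections import Counter


def resolve_trivial(genotype):
    """Return the haplotype pair if the genotype has at most one
    heterozygous site ('2'); otherwise return None."""
    if genotype.count("2") > 1:
        return None
    return tuple(sorted((genotype.replace("2", "0"), genotype.replace("2", "1"))))


def complement(genotype, haplotype):
    """Return the partner haplotype if `haplotype` is compatible with
    `genotype`; otherwise return None."""
    partner = []
    for g_site, h_site in zip(genotype, haplotype):
        if g_site == "2":            # heterozygous: partner carries the other allele
            partner.append("1" if h_site == "0" else "0")
        elif g_site == h_site:       # homozygous and matching
            partner.append(g_site)
        else:                        # homozygous but mismatched: incompatible
            return None
    return "".join(partner)


def rule_based_run(genotypes, rng):
    """One randomized pass of a rule-based inferral (illustrative only).
    Returns {genotype index: haplotype pair} for every genotype resolved."""
    resolved, reference = {}, []
    for i, g in enumerate(genotypes):
        pair = resolve_trivial(g)
        if pair is not None:
            resolved[i] = pair
            for h in pair:
                if h not in reference:
                    reference.append(h)
    progress = True
    while progress:
        progress = False
        pending = [i for i in range(len(genotypes)) if i not in resolved]
        rng.shuffle(pending)         # the outcome can depend on this order
        for i in pending:
            options = [(h, complement(genotypes[i], h)) for h in reference]
            options = [(h, c) for h, c in options if c is not None]
            if options:
                h, c = rng.choice(options)
                resolved[i] = tuple(sorted((h, c)))
                if c not in reference:
                    reference.append(c)   # new haplotype may resolve others
                progress = True
    return resolved


def consensus(genotypes, n_runs=200, seed=0):
    """Majority vote over repeated runs. Returns
    {genotype index: (most frequent pair, number of runs supporting it)}."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n_runs):
        for i, pair in rule_based_run(genotypes, rng).items():
            votes.setdefault(i, Counter())[pair] += 1
    return {i: counts.most_common(1)[0] for i, counts in votes.items()}
```

In this sketch, calling `consensus(["000", "002", "022"])` returns, for each genotype index, the most frequently inferred pair and how many runs supported it. The support count divided by `n_runs` gives a rough measure of how certain an inferral is, in the spirit of separating near-certain inferrals from those better settled by molecular methods.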



Articles from Genetics are provided here courtesy of Oxford University Press
