American Journal of Human Genetics. 1993 Nov;53(5):1137–1145.

Molecular and statistical approaches to the detection and correction of errors in genotype databases.

L M Brzustowicz, C Mérette, X Xie, L Townsend, T C Gilliam, J Ott
PMCID: PMC1682304  PMID: 8213837

Abstract

Errors in genotyping data have been shown to have a significant effect on the estimation of recombination fractions in high-resolution genetic maps. Previous estimates of errors in existing databases have been limited to the analysis of relatively few markers and have suggested rates in the range 0.5%-1.5%. The present study capitalizes on the fact that within the Centre d'Etude du Polymorphisme Humain (CEPH) collection of reference families, 21 individuals are members of more than one family, with separate DNA samples provided by CEPH for each appearance of these individuals. By comparing the genotypes of these individuals in each of the families in which they occur, an estimated error rate of 1.4% was calculated for all loci in the version 4.0 CEPH database. Removing those individuals who were clearly identified by CEPH as appearing in more than one family resulted in a 3.0% error rate for the remaining samples, suggesting that some error checking of the identified repeated individuals may occur prior to data submission. An error rate of 3.0% for version 4.0 data was also obtained for four chromosome 5 markers that were retyped through the entire CEPH collection. The effects of these errors on a multipoint map were significant, with a total sex-averaged length of 36.09 cM with the errors, and 19.47 cM with the errors corrected. Several statistical approaches to detect and allow for errors during linkage analysis are presented. One method, which identified families containing possible errors on the basis of the impact on the maximum lod score, showed particular promise, especially when combined with the limited retyping of the identified families. The impact of the demonstrated error rate in an established genotype database on high-resolution mapping is significant, raising the question of the overall value of incorporating such existing data into new genetic maps.
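The duplicate-sample comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the code below is an assumption rather than their method. It counts loci at which two independent typings of the same individual disagree and reports that fraction. The function names, the locus labels, and the toy genotypes are all hypothetical. The value returned is a discordance rate between two typings; relating it to a per-genotype error rate needs further assumptions that the abstract does not state.

```python
from typing import Dict, Optional, Tuple

# A genotype is an unordered pair of allele labels; None marks a missing typing.
Genotype = Optional[Tuple[str, str]]


def normalize(genotype: Tuple[str, str]) -> Tuple[str, ...]:
    """Sort the alleles so that ("1", "2") and ("2", "1") compare as equal."""
    return tuple(sorted(genotype))


def discordance(first: Dict[str, Genotype],
                second: Dict[str, Genotype]) -> Tuple[int, int, float]:
    """Compare two typings of one individual, locus by locus.

    Returns (discordant loci, loci compared, discordant fraction).
    Loci missing or untyped in either sample are skipped.
    """
    compared = 0
    discordant = 0
    for locus, g1 in first.items():
        g2 = second.get(locus)
        if g1 is None or g2 is None:
            continue
        compared += 1
        if normalize(g1) != normalize(g2):
            discordant += 1
    rate = discordant / compared if compared else float("nan")
    return discordant, compared, rate


# Hypothetical toy data: one individual typed in two different families.
typing_in_family_a = {
    "locusA": ("1", "2"),
    "locusB": ("3", "3"),
    "locusC": ("2", "4"),
    "locusD": None,          # untyped in this sample, so not compared
}
typing_in_family_b = {
    "locusA": ("2", "1"),    # same genotype, alleles listed in the other order
    "locusB": ("3", "4"),    # disagreement
    "locusC": ("2", "4"),
    "locusD": ("1", "1"),
}

n_discordant, n_compared, rate = discordance(typing_in_family_a, typing_in_family_b)
print(n_discordant, n_compared, rate)
```

In practice the counts would be pooled over all repeated individuals and all loci before dividing, so that individuals typed at many loci are weighted accordingly.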



Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
