Genetics. 2007 May;176(1):721–724. doi: 10.1534/genetics.106.067264

Survival Quantitative Trait Locus Fine Mapping by Measuring and Testing for Hardy–Weinberg and Linkage Disequilibrium

J Casellas
PMCID: PMC1893058  PMID: 17409083

Abstract

I show that fine-scale localization of a survival-related locus can be accomplished on the basis of deviations from Hardy–Weinberg equilibrium and linkage disequilibrium at closely linked marker loci. The method is based on χ2-tests, which can be performed on age-specific samples of alive (or dead) individuals as well as on combined samples of alive and dead individuals.


CONVENTIONAL tools for the analysis of QTL can locate loci underlying the variation of continuous quantitative traits to a genomic region of ∼30 cM (Deng et al. 2000). Fine-scale mapping (∼1 cM) is required to narrow these candidate genomic regions, and some appropriate techniques have been developed for complex diseases and quantitative traits under Gaussian distributions (Spielman et al. 1993; Feder et al. 1996; Nielsen et al. 1998; Deng et al. 2000, 2003). Although survival has become an emergent research field in human health (Puca et al. 2001) and animal breeding (Kleinbaum 1996), there are no appropriate fine-mapping techniques for survival traits. The objective of this article is to adapt the QTL fine-mapping method of Deng et al. (2000) to survival data.

Take as the starting point a survival-related QTL with two alleles, A1 and A2, and allelic frequencies p and q = 1 − p, respectively. Under the proportional hazards framework (Cox 1972), $\phi_{11} = S_0(t)^{\exp(a)}$ is the survival probability at time t for an individual with genotype A1A1, where S0(t) is the baseline survival function and a is the genotypic value of the A1A1 genotype. In a similar way, we define $\phi_{12} = S_0(t)^{\exp(d)}$ and $\phi_{22} = S_0(t)^{\exp(-a)}$, d and −a being the genotypic values for the A1A2 and A2A2 genotypes, respectively. Without loss of generality, we can assume that S0(t) represents a random variable for the combined effects of all the remaining polymorphic loci and all random environmental effects. As in the original research of Feder et al. (1996), Nielsen et al. (1998), and Deng et al. (2000), a large population under random mating is assumed and thus Hardy–Weinberg (HW) equilibrium holds in each generation of individuals at birth. The proportion of survivors at time t ($\phi_t$) is stated as $\phi_t = p^2\phi_{11} + 2pq\phi_{12} + q^2\phi_{22}$, with $p_{\mathrm{ALIVE}_t} = (p^2\phi_{11} + pq\phi_{12})/\phi_t$ and $P_{A_1A_1,\mathrm{ALIVE}_t} = p^2\phi_{11}/\phi_t$ being the allelic (A1) and genotypic (A1A1) frequencies within the group of alive individuals at time t (ALIVEt), respectively (the remaining frequencies can be easily derived following Deng et al. 2000). Deviation from HW equilibrium at the survival QTL can be measured by the disequilibrium coefficient $D_{A_1A_1}$ of Weir (1996), $D_{A_1A_1} = P_{A_1A_1,\mathrm{ALIVE}_t} - p_{\mathrm{ALIVE}_t}^2 = p^2q^2(\phi_{11}\phi_{22} - \phi_{12}^2)/\phi_t^2$, or, following Deng et al. (2000), by a function F of the observed and expected homozygosities.
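The QTL-level quantities in this paragraph can be checked numerically. The following is a minimal sketch, assuming the proportional hazards form S0(t) raised to exp(genotypic value) and a Weibull baseline S0(t) = exp[−(λt)^ρ] with the parameter values quoted in the caption of Figure 1. The function names are illustrative and do not come from the article.

```python
import math

def weibull_baseline(t, lam=0.001, rho=1.5):
    """Baseline survival S0(t) = exp(-(lam * t) ** rho) (assumed parameterization)."""
    return math.exp(-((lam * t) ** rho))

def genotype_survival(s0, a, d):
    """Survival of A1A1, A1A2 and A2A2 under proportional hazards."""
    return s0 ** math.exp(a), s0 ** math.exp(d), s0 ** math.exp(-a)

def survivor_qtl_summary(p, phi11, phi12, phi22):
    """QTL frequencies among survivors, with HW proportions assumed at birth."""
    q = 1.0 - p
    phi_t = p * p * phi11 + 2 * p * q * phi12 + q * q * phi22  # proportion alive
    freq_a1a1 = p * p * phi11 / phi_t                          # A1A1 among alive
    freq_a1 = (p * p * phi11 + p * q * phi12) / phi_t          # A1 among alive
    d_hw = freq_a1a1 - freq_a1 ** 2                            # Weir's coefficient
    return phi_t, freq_a1, freq_a1a1, d_hw

s0 = weibull_baseline(1500)
phi11, phi12, phi22 = genotype_survival(s0, a=-0.69, d=0.0)    # additive model
phi_t, freq_a1, freq_a1a1, d_hw = survivor_qtl_summary(0.5, phi11, phi12, phi22)
print(f"alive: {phi_t:.3f}  freq(A1 | alive): {freq_a1:.3f}  D_HW: {d_hw:.4f}")
```

Because A1A1 carries the lowest hazard in this setting, A1 becomes over-represented among survivors, and the HW coefficient departs from zero whenever the product of the two homozygote survival probabilities differs from the squared heterozygote survival probability.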

The previous derivations can be easily adapted to a marker locus closely linked to the survival QTL, with alleles M1 and M2 and allelic frequencies r and s = 1 − r. As in Deng et al. (2000), $r_{\mathrm{ALIVE}_t} = r + D_{A_1M_1}[p(\phi_{11} - \phi_{12}) + q(\phi_{12} - \phi_{22})]/\phi_t$ is the allelic frequency of M1 and $P_{M_1M_1,\mathrm{ALIVE}_t} = [(pr + D_{A_1M_1})^2\phi_{11} + 2(pr + D_{A_1M_1})(qr - D_{A_1M_1})\phi_{12} + (qr - D_{A_1M_1})^2\phi_{22}]/\phi_t$ is the genotypic frequency of M1M1 among ALIVEt individuals, where $\phi_t$ is the proportion of survivors at time t, $D_{A_1M_1} = P_{A_1M_1} - pr$ is the linkage disequilibrium (LD) measure between A1 and M1 (Crow and Kimura 1970), and PA1M1 is the frequency of haplotypes carrying both A1 and M1. According to Deng et al. (2000), the HW disequilibrium among ALIVEt individuals at the marker locus is $D_{M_1M_1} = D_{A_1M_1}^2(\phi_{11}\phi_{22} - \phi_{12}^2)/\phi_t^2$, it being nonzero when $D_{A_1M_1} \neq 0$ and $\phi_{11}\phi_{22} \neq \phi_{12}^2$. A wide range of combinations of φ11, φ12, and φ22 provide a value different from zero and, in practice, the HW disequilibrium at the marker locus solely reflects the LD in the whole generation (Deng et al. 2000). In a similar way, the homozygosity-based HW disequilibrium measure for the marker locus among alive individuals, FM1, can be derived as described by Feder et al. (1996) and Nielsen et al. (1998) for affected individuals of complex traits. Both the FM1 and DM1M1 statistics converge on the key point that HW disequilibrium at a marker locus reflects the whole-generation LD between the marker locus and the QTL (Deng et al. 2000). Alternatively, one could use a direct measure of LD such as the pexcess statistic proposed by Bengtsson and Thomson (1981). For a survival QTL, pexcess becomes $p_{\mathrm{excess}} = (r_{\mathrm{ALIVE}_t} - r_{\mathrm{DEAD}_t})/(1 - r_{\mathrm{DEAD}_t})$, where $r_{\mathrm{DEAD}_t}$ is the allelic frequency of M1 within the group of dead individuals at time t (DEADt). Therefore, pexcess is proportional to DA1M1 and reaches its maximum at the marker with the greatest LD with the QTL (Nielsen et al. 1998).
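The dependence of the marker-level measures on LD can be illustrated with a short sketch. It assumes HW proportions at birth and a biallelic QTL and marker, and it writes pexcess in the Bengtsson and Thomson (1981) form with alive individuals in the role of cases and dead individuals in the role of controls. That assignment, the function name, and the survival probabilities used below are illustrative assumptions, not values taken from the article.

```python
def survivor_marker_summary(p, r, d_ld, phi11, phi12, phi22):
    """Marker frequencies among survivors for a linked biallelic QTL.

    d_ld is the whole-generation LD coefficient P(A1M1) - p * r.
    """
    q = 1.0 - p
    # M1-bearing haplotypes, split by the QTL allele they carry.
    a1m1 = p * r + d_ld
    a2m1 = q * r - d_ld
    phi_t = p * p * phi11 + 2 * p * q * phi12 + q * q * phi22
    # Survival of an individual given the QTL allele on one of its gametes.
    surv_a1 = p * phi11 + q * phi12
    surv_a2 = p * phi12 + q * phi22
    m1_and_alive = a1m1 * surv_a1 + a2m1 * surv_a2
    freq_m1_alive = m1_and_alive / phi_t
    freq_m1_dead = (r - m1_and_alive) / (1.0 - phi_t)
    freq_m1m1_alive = (a1m1 ** 2 * phi11 + 2 * a1m1 * a2m1 * phi12
                       + a2m1 ** 2 * phi22) / phi_t
    d_hw_marker = freq_m1m1_alive - freq_m1_alive ** 2
    p_excess = (freq_m1_alive - freq_m1_dead) / (1.0 - freq_m1_dead)
    return freq_m1_alive, freq_m1_dead, d_hw_marker, p_excess

# Illustrative genotype-specific survival probabilities (hypothetical values).
phi11, phi12, phi22 = 0.40, 0.16, 0.026
for d_ld in (0.0, 0.10, 0.25):
    r_alive, r_dead, d_hw, p_exc = survivor_marker_summary(
        0.5, 0.5, d_ld, phi11, phi12, phi22)
    print(f"LD={d_ld:.2f}  D_HW(marker)={d_hw:+.5f}  p_excess={p_exc:+.4f}")
```

With no LD both measures vanish. As LD grows, the marker HW coefficient scales with the square of LD, whereas pexcess grows roughly linearly, which is the contrast the text draws between the two measures.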

To test for the statistical significance of the HW disequilibrium measures ($D_{M_1M_1}$ and $F_{M_1}$) and the LD measure (pexcess), two χ2-tests can be easily applied. Following Deng et al. (2000), the χ2-test statistic for HW disequilibrium is derived as

$$\chi^2_{\mathrm{HW}} = \frac{2n\,\tilde{D}_{M_1M_1}^2}{\tilde{r}^2(1 - \tilde{r})^2},$$

where $\tilde{r}$ is the allelic frequency of M1 in the sample, the tilde (∼) denotes an estimated value from the sample, and 2n is the total sample size of individuals. The $\chi^2_{\mathrm{HW}}$ test has m(m − 1)/2 = 1 d.f., m being the number of alleles at the marker locus being tested (m = 2). On the other hand, the χ2 for pexcess (Weir 1996; Deng et al. 2000) is stated as

[Equation unavailable in this copy: χ2 statistic for pexcess]

with m − 1 = 1 d.f.
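Both tests reduce to 1-d.f. χ2 statistics for biallelic markers, which can be sketched as follows. The HW statistic is written in the standard biallelic form N·D²/[r²(1 − r)²], with N individuals in the sample. The LD test is shown as a plain Pearson χ2 on the 2 × 2 table of allele counts in alive and dead individuals; this is a stand-in assumption and may differ from the exact pexcess statistic of Deng et al. (2000). Function names and counts are hypothetical.

```python
CRITICAL_1DF = 3.841  # 0.05 critical value of the chi-square distribution, 1 d.f.

def hw_chi_square(n_m1m1, n_m1m2, n_m2m2):
    """1-d.f. HW test from genotype counts at a biallelic marker."""
    n = n_m1m1 + n_m1m2 + n_m2m2
    r = (2 * n_m1m1 + n_m1m2) / (2 * n)   # sample frequency of M1
    d = n_m1m1 / n - r * r                # sample HW disequilibrium coefficient
    return n * d * d / (r * r * (1 - r) ** 2)

def allele_chi_square(m1_alive, m2_alive, m1_dead, m2_dead):
    """Pearson chi-square on the 2 x 2 table of allele counts (alive vs. dead)."""
    total = m1_alive + m2_alive + m1_dead + m2_dead
    cross = m1_alive * m2_dead - m2_alive * m1_dead
    row_alive = m1_alive + m2_alive
    row_dead = m1_dead + m2_dead
    col_m1 = m1_alive + m1_dead
    col_m2 = m2_alive + m2_dead
    return total * cross ** 2 / (row_alive * row_dead * col_m1 * col_m2)

# Hypothetical counts: 100 genotyped survivors, then 100 alleles per group.
hw_stat = hw_chi_square(30, 40, 30)
ld_stat = allele_chi_square(60, 40, 40, 60)
print(hw_stat > CRITICAL_1DF, ld_stat > CRITICAL_1DF)
```

The HW test needs genotype counts from a single group (alive or dead), whereas the allele-count test needs both groups, which mirrors the practical distinction between the two approaches.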

To illustrate the tests outlined above, extensive computer simulations were performed for a biallelic survival QTL and several biallelic markers. These simulations were carried out under a wide range of inheritance models (additive, dominant, recessive, partial dominant, and partial recessive), sample sizes, and ages, and under a Weibull assumption for the baseline survival function (Ducrocq et al. 1988a,b; Ibrahim et al. 2001). For the five genetic models, both tests showed reduced power at greater distances between the QTL and the marker loci, although the power decayed more quickly for the HW disequilibrium test than for the pexcess test, and it was higher for the pexcess test than for the HW disequilibrium test (Figure 1). These results agree with previous QTL fine-mapping research (Nielsen et al. 1998; Deng et al. 2000) and they are not surprising because, in models where both the survival QTL and the marker locus have only two alleles, HW disequilibrium is proportional to the square of LD (Nielsen et al. 1998). Whereas the pexcess test provided a similar power for all genetic models, the HW disequilibrium test showed substantial discrepancies. Within this context, the pexcess test seemed preferable if samples of both alive and dead individuals were accessible, although such samples could be unavailable if the study was not planned in advance. On the other hand, the average type I error of both tests was close to the expected level of 0.05, although slightly higher for one test than for the other (Figure 2); the larger variation in the former was consistent with previous research (Nielsen et al. 1998; Deng et al. 2000). As expected, the average power of both tests at the different marker positions increased as the amount of available information increased (e.g., a larger number of sampled individuals; Figure 3) or as the selection criterion became stricter (e.g., older ages used to differentiate between alive and dead individuals; Figure 4). These results agreed with those of Deng et al. (2000).
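The way power and type I error are estimated in such simulations can be sketched on a much smaller scale. The example below repeatedly samples 200 survivors from fixed marker genotype frequencies and records how often the 1-d.f. HW test rejects at the 0.05 level. The genotype frequencies, the number of replicates, and the seed are illustrative assumptions; the article's simulations additionally evolve 100 populations under drift and recombination before sampling.

```python
import random

CRITICAL_1DF = 3.841  # 0.05 critical value of the chi-square distribution, 1 d.f.

def hw_chi_square(counts):
    """1-d.f. HW statistic from (M1M1, M1M2, M2M2) counts."""
    n11, n12, n22 = counts
    n = n11 + n12 + n22
    r = (2 * n11 + n12) / (2 * n)
    if r in (0.0, 1.0):
        return 0.0  # monomorphic sample: the test is not defined, never reject
    d = n11 / n - r * r
    return n * d * d / (r * r * (1 - r) ** 2)

def rejection_rate(genotype_probs, sample_size=200, replicates=2000, seed=1):
    """Share of simulated samples in which HW equilibrium is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(replicates):
        draws = rng.choices((0, 1, 2), weights=genotype_probs, k=sample_size)
        counts = [draws.count(g) for g in (0, 1, 2)]
        if hw_chi_square(counts) > CRITICAL_1DF:
            rejected += 1
    return rejected / replicates

# Null case: HW proportions with an M1 frequency of 0.5 among survivors.
null_rate = rejection_rate((0.25, 0.5, 0.25))
# Alternative: M1 frequency 0.75 with HW coefficient -0.028 (hypothetical values).
alt_rate = rejection_rate((0.5345, 0.431, 0.0345))
print(f"type I error ~ {null_rate:.3f}, power ~ {alt_rate:.3f}")
```

Raising `sample_size` in this sketch increases the rejection rate under the alternative while leaving the null rate near the nominal level, which is the qualitative pattern reported for Figure 3.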

Figure 1.—

Comparison of the statistical power of the two χ2-tests (one drawn with open boxes, the other with solid boxes) under various genetic models: (A) additive (a = −0.69, −a = 0.69, d = 0), (B) dominant (a = −0.69, −a = 0.69, d = a), (C) recessive (a = −0.69, −a = 0.69, d = −a), (D) partial dominant (a = −0.69, −a = 0.69, d = a/2), and (E) partial recessive (a = −0.69, −a = 0.69, d = −a/2). The bottom and top edges of the boxes represent the sample 25th and 75th percentiles; the whiskers extend to the range of the results. Under a specific set of simulation parameters, the population started at generation zero with complete association between allele A1 (p = 0.5) at the QTL and marker allele M1 (r = 0.5) and evolved for 50 generations under random mating and genetic drift. A dense set of marker loci, positioned at 0.25-cM intervals and spanning 0–2 cM from the QTL, was simulated, with the recombination rate obtained from Haldane's map function (Ott 1991). The first 100 populations in which p at the start and at the end of evolution differed by no more than 5% were retained. The effective population size per generation was 15,000. Survival records were simulated under a proportional hazards model (Cox 1972), assuming a Weibull distribution with parameters ρ = 1.5 and λ = 0.001 for the baseline survival function ($S_0(t) = \exp[-(\lambda t)^{\rho}]$), and the threshold between alive and dead individuals was set at t = 1500. In each case, 5000 samples of 200 individuals (sampling with replacement) were drawn from each of the 100 simulated populations, and the statistical power was calculated as the percentage of times that the null hypothesis of no disequilibrium was rejected.
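The expected LD left at each simulated marker after 50 generations can be approximated with a short deterministic sketch. It uses Haldane's map function to convert map distance into a recombination fraction and the standard decay D0(1 − c)^g under random mating. Genetic drift is ignored here, which is a simplifying assumption; the simulations described in the caption include it. The function names are illustrative.

```python
import math

def haldane_recombination(distance_cm):
    """Haldane's map function: recombination fraction for a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def expected_ld(d0, distance_cm, generations):
    """Expected LD after random mating without drift: d0 * (1 - c) ** generations."""
    c = haldane_recombination(distance_cm)
    return d0 * (1.0 - c) ** generations

# Complete association with p = r = 0.5 gives D0 = 0.5 - 0.5 * 0.5 = 0.25.
d0 = 0.25
for k in range(9):  # markers every 0.25 cM from 0 to 2 cM
    distance = 0.25 * k
    print(f"{distance:4.2f} cM  expected D = {expected_ld(d0, distance, 50):.4f}")
```

Since the marker HW coefficient depends on the square of this quantity and pexcess on its first power, the sketch also suggests why the power of the HW test falls off faster with distance.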

Figure 2.—

Comparison of the type I error of the two χ2-tests (one drawn with an open square, the other with a solid square) under the five inheritance models: additive, dominant, recessive, partial dominant, and partial recessive. Type I error for both tests was calculated as the percentage of times that the null hypothesis of no disequilibrium was rejected when the simulations were performed under the null hypothesis of no linkage disequilibrium. As described in Figure 1, the simulation parameters were p = 0.5, 2n = 200, t = 1500, λ = 0.001, and ρ = 1.5. The square represents the average value and the whiskers extend to the range of the results.

Figure 3.—

Average power under different sample sizes (2n) for the two χ2-tests (panels A and B; the simulation process is described in Figure 1).

Figure 4.—

Comparison of statistical power for various temporal cut points (t). The average results for the two χ2-tests are presented (panels A and B; the simulation process is described in Figure 1).

In conclusion, LD is captured and magnified in extreme samples of elderly individuals, where QTL genotypes and alleles are disproportionately represented. The disequilibrium must be highest at the QTL itself, since the QTL is the underlying factor that determines the selection criterion, and it decreases as the degree of linkage between the QTL and the markers decreases. This relation between HW disequilibrium and/or LD and the physical distance between a panel of linked marker loci and a QTL is the key point: it provides a straightforward basis for QTL fine mapping using the peaks of the disequilibrium measures and/or test statistics.

References

  1. Bengtsson, B. O., and G. Thomson, 1981. Measuring the strength of association between HLA antigens and diseases. Tissue Antigens 18: 356–363.
  2. Cox, D. R., 1972. Regression models and life-tables. J. R. Stat. Soc. Ser. B 34: 187–220.
  3. Crow, J. F., and M. Kimura, 1970. An Introduction to Population Genetics Theory. Harper & Row, New York.
  4. Deng, H.-W., W.-M. Chen and R. R. Recker, 2000. QTL fine-mapping by measuring and testing for Hardy-Weinberg and linkage disequilibrium at a series of linked marker loci in extreme samples of populations. Am. J. Hum. Genet. 66: 1027–1045.
  5. Deng, H.-W., Y.-M. Li, M.-X. Li and P.-Y. Liu, 2003. Robust indices of Hardy-Weinberg disequilibrium for QTL fine mapping. Hum. Hered. 56: 160–165.
  6. Ducrocq, V., R. L. Quaas, E. J. Pollak and G. Casella, 1988a. Length of productive life of dairy cows. 1. Justification of a Weibull model. J. Dairy Sci. 71: 3061–3070.
  7. Ducrocq, V., R. L. Quaas, E. J. Pollak and G. Casella, 1988b. Length of productive life of dairy cows. 2. Variance component estimation and sire evaluation. J. Dairy Sci. 71: 3071–3079.
  8. Feder, J. N., A. Gnirke, W. Thomas, Z. Tsuchihashi et al., 1996. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat. Genet. 13: 399–408.
  9. Ibrahim, J. G., M. Chen and D. Sinha, 2001. Bayesian Survival Analysis. Springer, New York.
  10. Kleinbaum, D. G., 1996. Survival Analysis: A Self-learning Text. Springer, New York.
  11. Nielsen, D. M., M. G. Ehm and B. S. Weir, 1998. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am. J. Hum. Genet. 63: 1531–1540.
  12. Ott, J., 1991. Analysis of Human Genetic Linkage. Johns Hopkins University Press, Baltimore.
  13. Puca, A. A., M. J. Daly, S. J. Brewster, T. C. Matise, J. Barrett et al., 2001. A genome-wide scan for linkage to human exceptional longevity identifies a locus on chromosome 4. Proc. Natl. Acad. Sci. USA 98: 10505–10508.
  14. Spielman, R. S., R. E. McGinnis and W. J. Ewens, 1993. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 52: 506–516.
  15. Weir, B. S., 1996. Genetic Data Analysis II. Sinauer Associates, Sunderland, MA.

Articles from Genetics are provided here courtesy of Oxford University Press
