Author manuscript; available in PMC: 2014 Feb 5.
Published in final edited form as: Hum Genet. 2010 Apr 11;127(6):699–704. doi: 10.1007/s00439-010-0824-5

Power analysis for case–control association studies of samples with known family histories

Bo Peng 1, Biao Li 2, Younghun Han 3, Christopher I Amos 4
PMCID: PMC3914772  NIHMSID: NIHMS550858  PMID: 20383776

Abstract

Genome-wide case–control studies have been widely used to identify genetic variants that predispose to human diseases. Such studies are powerful in detecting common genetic variants with moderate effects, but quickly lose power as allele frequency and genotype relative risk decrease. Because patients with one or more affected relatives are more likely to inherit disease-predisposing alleles of a genetic disease than patients without family histories of the disease, sampling patients with affected relatives almost always increases the frequency of disease predisposing alleles in cases and improves the power of case–control association studies. This paper evaluates the power of case–control studies that select cases and/or controls according to their family histories of disease. Our results showed that this study design can dramatically increase the power of a case–control association study for a wide range of disease types. Because each additional affected relative of a patient reduces the required sample size roughly by a pair of case and control, inclusion of cases with affected relatives can dramatically decrease the required sample size and thus the cost of such studies.

Introduction

Genome-wide association studies (GWAS) have identified an enormous number of highly significant and reproducible genetic loci predisposing to complex diseases (Manolio et al. 2008). Most of the recently conducted studies have sampled cases based only on their phenotypes. This design ensures excellent power provided that the disease-predisposing alleles (DPA) occur sufficiently frequently in the population that they will be selected into the case population and with sufficient marginal effect on the occurrence of the disease so that they concentrate differentially in case and control populations.

Unfortunately, despite many past successes, much of the heritability of complex diseases has not yet been identified through genome-wide association studies (Maher 2008). For diseases with complex genetic architectures that include effects from multiple loci and/or environmental factors, the penetrance associated with each specific allele may be low, so selecting samples through their phenotype may not sufficiently enrich the concentration of DPA in cases compared with controls. For common alleles, power to detect associations may still be sufficient because the marker alleles that are included in most GWAS platforms are common and can adequately tag the DPA. For uncommon DPA that occur in less than 5% of the population, random selection of cases may not yield adequate power because of the low resulting linkage disequilibrium between the rare disease-predisposing allele and the more common marker allele. The resulting mismatch in allele frequency between the marker and DPA places an upper bound on the extent of correlation (r²) that can occur between the two loci (Amos 2007). This limits, sometimes dramatically, the power of randomly selected cases for genome-wide association studies.

For complex diseases, patients with one or more affected relatives are more likely to inherit a DPA than patients without a family history of the disease (Matakidou et al. 2005). Although it is expensive and sometimes impractical (e.g. for late-onset diseases such as most cancers) to collect genotypes of relatives of patients, the affection statuses of these relatives, namely the family histories of patients, are usually available and provide useful information about the occurrence of DPA for case–control association studies. More specifically, sampling patients with affected relatives almost always increases the frequency of DPA in cases and thus improves the statistical power of case–control association studies (Houlston and Peto 2003; Amos 2007). Although this strategy has been successfully used in practice (Rudd et al. 2006; Easton et al. 2007; Eeles et al. 2008; Gold et al. 2008), there has been only limited study of, and no explicit power calculator for, the statistical advantage of sampling cases based on their family histories. In this article, we developed a statistical model that evaluates the power of case–control studies that select cases and/or controls according to their family histories of disease. We assessed the relative usefulness of different types of family history by comparing the sample sizes that case–control studies with different family histories require, for different types of diseases, to achieve the same statistical power.

Methods

Following a statistical model of case–control association studies employed by Skol et al. (2006), we assume N1 cases and N2 controls are genotyped. A genetic disease is caused by a DPA x at a diallelic locus with disease allele frequency p. The penetrances of the genotypes XX, Xx and xx are derived from the disease prevalence, genotype relative risk, and genetic model (e.g. additive or dominant). In a standard case–control association study without family history information, the true risk allele frequencies in cases and controls (p1 = Pr(xx|affected) + Pr(Xx|affected)/2 and p2 = Pr(xx|unaffected) + Pr(Xx|unaffected)/2, respectively) can be determined from the genotype frequencies, the disease prevalence, and the genotype penetrances such as Pr(affected|xx) using Bayes' formula. We assume Hardy–Weinberg equilibrium at the disease-predisposing locus.
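The Bayes step above can be sketched in a few lines of Python. This is a minimal illustration, not the code of the published tool: the function names are hypothetical, and the penetrance parameterization (relative risks of 1, γ, 2γ − 1 for the additive model; 1, γ, γ for dominant; 1, 1, γ for recessive) is an assumption. For the additive example discussed in the Results (prevalence 0.01, allele frequency 0.1, genotype relative risk 1.5) it gives frequencies matching the quoted 14.09% and 9.96%.

```python
def penetrances(prevalence, p, grr, model="additive"):
    """Return [f_XX, f_Xx, f_xx]: penetrance for 0, 1 and 2 copies of x."""
    # Relative risks versus the XX baseline (assumed parameterization).
    rel = {
        "additive": (1.0, grr, 2.0 * grr - 1.0),
        "dominant": (1.0, grr, grr),
        "recessive": (1.0, 1.0, grr),
    }[model]
    hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)  # Hardy-Weinberg frequencies
    # Baseline penetrance chosen so that the population prevalence is matched.
    f0 = prevalence / sum(h * r for h, r in zip(hwe, rel))
    return [f0 * r for r in rel]


def allele_freqs(prevalence, p, grr, model="additive"):
    """Risk-allele frequency in cases (p1) and in controls (p2)."""
    f = penetrances(prevalence, p, grr, model)
    hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    # Bayes' formula: Pr(g | affected) = Pr(affected | g) Pr(g) / K
    case = [h * fg / prevalence for h, fg in zip(hwe, f)]
    ctrl = [h * (1 - fg) / (1 - prevalence) for h, fg in zip(hwe, f)]
    p1 = case[2] + case[1] / 2
    p2 = ctrl[2] + ctrl[1] / 2
    return p1, p2


p1, p2 = allele_freqs(prevalence=0.01, p=0.1, grr=1.5, model="additive")
print(f"cases: {p1:.4f}, controls: {p2:.4f}")  # cases: 0.1409, controls: 0.0996
```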

To evaluate the evidence for association, let $\hat{p}_1$ and $\hat{p}_2$ be the estimated risk allele frequencies in cases and controls, respectively, and define the test statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_1(1-\hat{p}_1)/(2N_1) + \hat{p}_2(1-\hat{p}_2)/(2N_2)}}$$

using a two-sample z test. z follows a standard normal distribution under the null hypothesis of no association. Because, under the alternative hypothesis, z approximately follows a normal distribution with mean

$$C = \frac{p_1 - p_2}{\sqrt{p_1(1-p_1)/(2N_1) + p_2(1-p_2)/(2N_2)}}$$

and variance 1, the power of this statistical test is $\Pr(|z| > z_{\alpha/2}) = 1 - \Phi(z_{\alpha/2} - C) + \Phi(-z_{\alpha/2} - C)$ at significance level α, where Φ(x) is the cumulative distribution function of the standard normal distribution evaluated at x and $z_{\alpha} = \Phi^{-1}(1 - \alpha)$. In all examples presented in this paper, we set $\alpha = 10^{-7}$, representing a genome-wide significance level of 0.05 with a Bonferroni correction for a genome-wide association study of 500,000 markers.
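The two-sided power of this z test can be evaluated with Python's standard library alone. The sketch below is illustrative rather than the published program; the function name `power` and its argument names are hypothetical. It takes the true allele frequencies as inputs, so it can be used with any study design once those frequencies are known.

```python
from math import sqrt
from statistics import NormalDist


def power(p1, p2, n_cases, n_controls, alpha=1e-7):
    """Two-sided power of the two-sample z test on allele frequencies."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    # Each individual contributes two alleles, hence the factor 2N.
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p2 * (1 - p2) / (2 * n_controls))
    c = (p1 - p2) / se  # mean of z under the alternative hypothesis
    # By symmetry, Phi(c - z_crit) equals 1 - Phi(z_crit - c).
    return norm.cdf(c - z_crit) + norm.cdf(-z_crit - c)


# Allele frequencies quoted in the Results for an additive disease
# (14.09% in regular cases, 9.96% in controls, 1,000 cases and controls).
print(round(power(0.1409, 0.0996, 1000, 1000), 3))
```

With these inputs the result is close to the 9.7% power quoted in the Results, and when p1 equals p2 the function returns α, as expected for a test of that size.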

Our power analysis differs from this standard analysis in a key aspect in that the family histories of samples are known so instead of calculating Pr (genotype|affected) for the probands, we calculate the probability of each genotype of a proband conditional on the affection statuses of all of his or her relatives. Theoretical calculation of such probabilities is possible for simple cases such as cases with one affected sibling, but quickly becomes formidable as the number of relatives increases. We developed a computer program to calculate this probability by accumulating probabilities across all possible genotypes with a given pedigree structure. More specifically, we calculated Pr(all genotype) and Pr(affection status | all genotype) for all possible genotypes of involved individuals. For example, conditioning on genotype of a proband, we calculated the probability of having a particular pedigree type as

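$$\Pr(\mathrm{AFF} \mid xx) = \frac{\Pr(\mathrm{AFF}, xx)}{\Pr(xx)} = \frac{\sum_{\text{geno with } xx} \Pr(\mathrm{AFF} \mid \text{geno}) \Pr(\text{geno})}{\Pr(xx)}$$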

where AFF is the pedigree affection status such as a case with two affected parents. Such probabilities were then used to calculate expected genotype frequencies such as

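$$\Pr(xx \mid \mathrm{AFF}) = \frac{\Pr(\mathrm{AFF} \mid xx) \Pr(xx)}{\Pr(\mathrm{AFF})} = \frac{\Pr(\mathrm{AFF} \mid xx) \Pr(xx)}{\sum_{\text{geno}} \Pr(\mathrm{AFF} \mid \text{geno}) \Pr(\text{geno})}$$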

where the sum in the denominator runs over all possible genotype configurations (geno) of the pedigree. If a sample has cases with different family histories, we calculated p2 = Pr(xx|AFF1, AFF2…) as a weighted average of p2i = Pr(xx|AFFi) according to the number of cases with each type of family history.
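The enumeration over pedigree genotypes can be sketched for two-generation families as follows. This is a simplified illustration under stated assumptions, not the published program: all function names are hypothetical, parents are drawn from Hardy–Weinberg proportions with random mating, affection statuses are independent given genotypes, and the penetrance parameterization (relative risks 1, γ, 2γ − 1 for the additive model) is assumed. Pedigrees are written as strings of `a`, `u` and `*` for father, mother, proband and siblings, for example `**aa` for a case with one affected sibling.

```python
from itertools import product


def penetrances(prevalence, p, grr, model="additive"):
    """Return [f_XX, f_Xx, f_xx] under an assumed parameterization."""
    rel = {"additive": (1.0, grr, 2.0 * grr - 1.0),
           "dominant": (1.0, grr, grr),
           "recessive": (1.0, 1.0, grr)}[model]
    hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    f0 = prevalence / sum(h * r for h, r in zip(hwe, rel))
    return [f0 * r for r in rel]


def proband_allele_freq(pedigree, prevalence, p, grr, model="additive"):
    """Risk-allele frequency of the proband given the pedigree's statuses."""
    f = penetrances(prevalence, p, grr, model)
    hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)

    def status_prob(status, g):
        # Pr(affection status | g copies of x); '*' means unknown status.
        return {"a": f[g], "u": 1 - f[g], "*": 1.0}[status]

    def child_dist(gf, gm):
        # Mendelian genotype distribution of a child given parental genotypes.
        tf, tm = gf / 2, gm / 2  # chance that each parent transmits x
        return ((1 - tf) * (1 - tm), tf * (1 - tm) + tm * (1 - tf), tf * tm)

    father, mother, proband, sibs = pedigree[0], pedigree[1], pedigree[2], pedigree[3:]
    joint = [0.0, 0.0, 0.0]  # Pr(AFF, proband carries g copies of x)
    for gf, gm in product(range(3), repeat=2):
        parents = (hwe[gf] * status_prob(father, gf)
                   * hwe[gm] * status_prob(mother, gm))
        kids = child_dist(gf, gm)
        # Siblings are independent given the parental genotypes, so summing
        # over each sibling's genotype separately equals the full enumeration.
        sib_term = 1.0
        for s in sibs:
            sib_term *= sum(kids[g] * status_prob(s, g) for g in range(3))
        for g in range(3):
            joint[g] += parents * sib_term * kids[g] * status_prob(proband, g)
    total = sum(joint)  # Pr(AFF), the denominator of the Bayes formula
    return (joint[2] + joint[1] / 2) / total


def mixture_freq(counts, prevalence, p, grr, model="additive"):
    """Weighted average over family-history types, e.g. {'**a': 1000, '**aa': 1000}."""
    n = sum(counts.values())
    return sum(k * proband_allele_freq(ped, prevalence, p, grr, model)
               for ped, k in counts.items()) / n


for ped in ("**u", "**a", "**aa"):
    print(ped, round(proband_allele_freq(ped, 0.01, 0.1, 1.5), 4))
```

For the additive example in the Results (prevalence 0.01, allele frequency 0.1, genotype relative risk 1.5), this sketch gives approximately 0.0996, 0.1409 and 0.1675 for controls, regular cases and cases with an affected sibling, in line with the percentages quoted there.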

We implemented our model in a power calculation tool named caseControlPower.py in Python. In the computer program, we allow for the existence of a marker locus in close vicinity to the disease predisposing locus. Marker allele frequency and linkage disequilibrium (D′ or R²) can be assigned so that statistical power is affected by linkage disequilibrium between marker and disease loci. We also allow for a mixture of pedigree types in the specification of cases and controls (e.g. 1,000 cases without family history information and 1,000 cases with an affected sibling). In addition to power analysis, this program can be used to calculate minimal sample sizes for a given power (with a fixed number of cases, controls or ratio of the number of cases to the number of controls), and the minimum detectable relative risk for given sample sizes. These features make the program practical in real-world studies, but they are not discussed in this article because they do not change our major results. We validated the non-familial cases with CaTS (Skol et al. 2006) and simple familial cases with results presented in Houlston and Peto (2003). This tool is available on the “miscellaneous” page of the simuPOP online cookbook (http://simupop.sourceforge.net/cookbook). It uses simuPOP (a general-purpose individual-based forward-time simulation environment) for its graphical user interface and probability functions and can be used with a graphical user interface, a command line, or by importing this module from another Python script.

Results

We aimed to compare the power of a case–control association study with known family histories with a standard case–control association study. We focused on family history of two generations, which we denoted according to affection status of father, mother, proband and his or her siblings. We used symbols u, a and * for unaffected, affected, and unknown affection status, respectively, with the first two places indicating the statuses of the two parents, followed by the statuses of the proband and then his or her siblings. For example, **a indicates standard patients with unknown parental affection statuses, whereas uuaa stands for a case with unaffected parents and an affected sibling.

For a relatively common disease (with a prevalence of 0.1) caused by a single DPA with an additive effect, a case–control study of probands having affected siblings or parents is much more powerful than a regular case–control association study (Fig. 1). For example, if a DPA with a relatively high genotype relative risk of 1.4 has a frequency of 0.05, the statistical power of detecting the DPA with 1,000 cases and 1,000 controls (at the significance level 10^−7) is 0.5% for a regular case–control association study, 12.7% for cases with one affected sibling, and 65.0% for cases with an additional affected parent. To achieve the same statistical power of 80%, these three study designs require 5,045, 2,169, and 1,166 cases, respectively, along with equal numbers of controls (Fig. 1).
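The required number of cases for a target power can be approximated in closed form by inverting the power formula of the z test. The sketch below is an illustration under stated assumptions, not the published program: the function name is hypothetical, equal numbers of cases and controls are assumed, and the negligible lower tail of the two-sided test is ignored. The allele frequencies passed in the example (about 0.0683 in cases and 0.0480 in controls) are our own derivation for the standard design above, assuming additive relative risks of 1, 1.4 and 1.8 for zero, one and two copies of the DPA.

```python
from math import ceil
from statistics import NormalDist


def cases_needed(p1, p2, power=0.8, alpha=1e-7):
    """Approximate number of cases (with as many controls) for a target power."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = norm.inv_cdf(power)
    # Variance of the allele-frequency difference for one case and one control.
    unit_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    # Solve C = z_alpha + z_power for N, where C = (p1 - p2) / sqrt(unit_var / N).
    return ceil((z_alpha + z_power) ** 2 * unit_var / (p1 - p2) ** 2)


# Standard design (**a cases, **u controls): prevalence 0.1, DPA frequency 0.05,
# genotype relative risk 1.4; frequencies derived under the assumed additive model.
print(cases_needed(0.06826923, 0.04797009))
```

The result should land within a few cases of the 5,045 reported for the standard design. Because the required sample size scales with the inverse square of the allele frequency difference, even a modest enrichment of the DPA in cases translates into a large saving.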

Fig. 1. Statistical power for 1,000 cases and 1,000 controls (top) and number of cases (with equal number of controls) required to achieve a power of 80% (bottom) for three different genotype relative risks (1.2, 1.3, and 1.4) using three case–control study designs: a standard design of samples with no family history (**a), a design of cases with one affected sibling (**aa), and a design of cases with one affected parent and one affected sibling (*aaa). The disease is assumed to be caused by a DPA having an additive effect. The population prevalence of the disease is 0.1.

Factors that influence the relative power of different family histories of disease include the type of family history and characteristics of the disease such as disease allele frequency, genotype relative risk, genetic model, and population prevalence. To quantify the relative usefulness of family history in improving the power in the identification of DPA, we calculated the sample sizes required for case–control studies with different family histories to achieve the same statistical power of 80% ($N_{\text{control}}^{\text{case}}$, with matching numbers of cases and controls) for different types of diseases. Table 1 lists the numbers of cases for standard case–control studies (with the family history types **a and **u) and the ratios of the standard sample sizes to the sample sizes required for different types of family histories ($\lambda_{\text{control}}^{\text{case}} = N_{u}^{a} / N_{\text{control}}^{\text{case}}$). Types of family histories considered include cases with unaffected parents (uua), cases with one affected sibling (**aa) as well as cases with an affected parent and one (auaa) or two (auaaa) affected siblings. Although vastly different numbers of samples are required and the ratios differ between different disease types, each additional affected parent or sibling reduced the required sample size roughly by a pair of case and control in a regular case–control association study.

Table 1.

Number of patients required for a standard case–control study design and case–control designs with nine family histories for 27 disease types with different prevalences, allele frequencies, disease types, and genotype relative risks

| Prevalence | Allele frequency | Disease type | Genotype relative risk | Nua | λuuua | λuuaa | λuaa | λuaa | λuauaa | λuauaaa | λuua | λuuua | λuuaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.10 | 0.10 | Additive | 1.1 | 36,853 | 0.80 | 2.01 | 2.16 | 2.16 | 3.59 | 5.66 | 1.10 | 1.21 | 2.31 |
| 0.10 | 0.10 | Additive | 1.2 | 9,877 | 0.79 | 2.05 | 2.21 | 2.21 | 3.73 | 5.96 | 1.11 | 1.22 | 2.36 |
| 0.10 | 0.10 | Additive | 1.4 | 2,809 | 0.77 | 2.12 | 2.28 | 2.28 | 3.93 | 6.40 | 1.11 | 1.22 | 2.44 |
| 0.05 | 0.02 | Additive | 1.2 | 48,501 | 0.89 | 2.25 | 2.33 | 2.33 | 4.18 | 6.82 | 1.05 | 1.11 | 2.41 |
| 0.05 | 0.02 | Additive | 1.4 | 13,309 | 0.88 | 2.37 | 2.46 | 2.46 | 4.60 | 7.78 | 1.05 | 1.11 | 2.54 |
| 0.05 | 0.02 | Additive | 1.8 | 3,925 | 0.86 | 2.57 | 2.67 | 2.67 | 5.30 | 9.41 | 1.06 | 1.11 | 2.76 |
| 0.01 | 0.02 | Additive | 1.5 | 9,743 | 0.97 | 2.59 | 2.61 | 2.61 | 5.16 | 8.91 | 1.01 | 1.02 | 2.62 |
| 0.01 | 0.02 | Additive | 2 | 2,979 | 0.97 | 2.86 | 2.88 | 2.88 | 6.09 | 11.07 | 1.01 | 1.02 | 2.90 |
| 0.01 | 0.02 | Additive | 4 | 584 | 0.95 | 3.48 | 3.52 | 3.52 | 8.00 | 13.58 | 1.01 | 1.03 | 3.54 |
| 0.10 | 0.10 | Recessive | 1.1 | 3,443,387 | 0.81 | 1.98 | 2.12 | 2.17 | 3.55 | 5.65 | 1.10 | 1.21 | 2.32 |
| 0.10 | 0.10 | Recessive | 1.2 | 865,602 | 0.80 | 1.99 | 2.14 | 2.24 | 3.68 | 6.04 | 1.10 | 1.21 | 2.40 |
| 0.10 | 0.10 | Recessive | 1.4 | 218,778 | 0.80 | 2.02 | 2.17 | 2.39 | 3.95 | 6.87 | 1.10 | 1.21 | 2.54 |
| 0.05 | 0.02 | Recessive | 1.2 | 109,715,270 | 0.90 | 2.11 | 2.18 | 2.32 | 3.92 | 6.41 | 1.05 | 1.10 | 2.40 |
| 0.05 | 0.02 | Recessive | 1.4 | 27,482,990 | 0.90 | 2.12 | 2.19 | 2.47 | 4.13 | 7.13 | 1.05 | 1.10 | 2.55 |
| 0.05 | 0.02 | Recessive | 1.8 | 6,897,833 | 0.90 | 2.14 | 2.21 | 2.78 | 4.59 | 8.80 | 1.05 | 1.10 | 2.86 |
| 0.01 | 0.02 | Recessive | 1.5 | 19,124,195 | 0.98 | 2.24 | 2.26 | 2.63 | 4.52 | 8.05 | 1.01 | 1.02 | 2.64 |
| 0.01 | 0.02 | Recessive | 2 | 4,805,552 | 0.98 | 2.27 | 2.28 | 3.05 | 5.16 | 10.48 | 1.01 | 1.02 | 3.06 |
| 0.01 | 0.02 | Recessive | 4 | 544,840 | 0.98 | 2.36 | 2.37 | 4.94 | 8.20 | 27.19 | 1.01 | 1.02 | 4.97 |
| 0.10 | 0.10 | Dominant | 1.1 | 45,260 | 0.80 | 2.00 | 2.14 | 2.14 | 3.53 | 5.52 | 1.10 | 1.21 | 2.29 |
| 0.10 | 0.10 | Dominant | 1.2 | 12,074 | 0.80 | 2.03 | 2.18 | 2.18 | 3.62 | 5.69 | 1.11 | 1.22 | 2.33 |
| 0.10 | 0.10 | Dominant | 1.4 | 3,407 | 0.78 | 2.07 | 2.23 | 2.23 | 3.74 | 5.88 | 1.11 | 1.22 | 2.38 |
| 0.05 | 0.02 | Dominant | 1.2 | 50,410 | 0.89 | 2.24 | 2.32 | 2.32 | 4.15 | 6.74 | 1.05 | 1.11 | 2.40 |
| 0.05 | 0.02 | Dominant | 1.4 | 13,812 | 0.88 | 2.36 | 2.44 | 2.44 | 4.54 | 7.61 | 1.05 | 1.11 | 2.53 |
| 0.05 | 0.02 | Dominant | 1.8 | 4,064 | 0.86 | 2.55 | 2.65 | 2.64 | 5.18 | 9.07 | 1.06 | 1.11 | 2.73 |
| 0.01 | 0.02 | Dominant | 1.5 | 10,104 | 0.97 | 2.57 | 2.59 | 2.59 | 5.08 | 8.68 | 1.01 | 1.02 | 2.60 |
| 0.01 | 0.02 | Dominant | 2 | 3,081 | 0.97 | 2.82 | 2.84 | 2.84 | 5.94 | 10.59 | 1.01 | 1.02 | 2.86 |
| 0.01 | 0.02 | Dominant | 4 | 600 | 0.95 | 3.43 | 3.45 | 3.45 | 7.59 | 12.50 | 1.01 | 1.03 | 3.47 |

$N_{u}^{a}$, number of cases (with equal number of controls) required to achieve a statistical power of 80% using a standard case–control design; $\lambda_{\text{control}}^{\text{case}} = N_{u}^{a} / N_{\text{control}}^{\text{case}}$, ratio of the sample size required for a standard case–control design to that required for a case–control study of cases and controls with specified family histories. Equal numbers of cases and controls were used for all analyses.

This result is quite striking, considering that each affected relative could replace roughly a pair of regular case and control, without using genotype information of these relatives. If all cases in a case–control study have three affected relatives, the numbers of cases and controls required to achieve the statistical power of 80% would be only one-sixth of those required for a regular case–control study. As a matter of fact, if the number of controls remains unchanged from that of a regular case–control study, only 7–35% of the cases are required to achieve the same 80% statistical power for the disease types listed in Table 1 (data not shown). This can be explained by the fact that cases with affected relatives have a higher concentration of DPA, resulting in a larger difference between the frequencies of DPA in cases and controls than in a regular case–control study. In contrast, in a regular case–control study, a larger sample size leads to smaller variances, which is a less efficient method of improving statistical power. For example, in the case of an additive disease with a prevalence of 0.01, a disease allele frequency of 10% and a genotype relative risk of 1.5, the expected disease allele frequencies in controls, regular cases, and cases with an affected sibling are 9.96, 14.09 and 16.75%, respectively. The availability of family history information increases the allele frequency difference between cases and controls from 4.13 to 6.79% and boosts the test statistic from 4.03 to 6.35 with a slight increase in the standard deviation from 0.0103 to 0.0107. This increases the statistical power from 9.7 to 84.7%. In contrast, adding another 1,000 standard cases and controls decreases the standard deviation from 0.0103 to 0.0073 and results in a test statistic of 5.69 and an improved power of 64.3%.

Discussion

The same mathematical analysis can be applied to controls, and it is easy to prove that using controls with no affected relatives (for example, families with history types uuu or **uu) provides slightly better statistical power than using controls without family history information. As shown in Table 1, smaller sample sizes are required for studies using controls with no affected relatives than for studies using controls without family history information. The percentage decrease in sample size for each unaffected relative is approximately equal to the disease prevalence, which can be high for some diseases.

For cases with affected relatives, one affected relative could reduce the required sample size by roughly one pair of case and control because $\lambda_{\text{control}}^{\text{case}} = N_{u}^{a} / N_{\text{control}}^{\text{case}}$, for all combinations of family histories and disease models, approximately equals twice the number of affected relatives. This is an approximate, and mostly conservative, estimate for cases with one or two affected relatives using typical parameter settings. When the number of affected relatives increases, this ratio increases rapidly, especially when the DPA has a relatively high genotype relative risk and a low allele frequency. The expected power and significance level also affect the number of samples required for both regular case–control studies and studies with known family histories, but only the expected statistical power markedly affects the ratio of required sample sizes (Fig. 2). Although patients with more than two family members affected by the disease being studied are relatively rare, they should be included in case–control association studies with the highest possible priority because of their exceptionally high contributions to the statistical power of such studies.

Fig. 2. The impact of family history (a), genotype relative risk (b), expected statistical power (c), significance level (d) and disease allele frequency (e) on the ratio of the sample size required for regular case–control studies and that required for case–control studies of patients with different family histories ($\lambda_{\text{control}}^{\text{case}} = N_{u}^{a} / N_{\text{control}}^{\text{case}}$) to achieve the same statistical power. We considered patients with one to four affected siblings and unknown parental affection status (from **aa to **aaaaa). For each type of family history, we considered additive diseases with prevalences (K) of 0.1, 0.02; disease allele frequencies from 0.02 to 0.14; genotype relative risks from 1.1 to 1.9, significance levels from 10^−4 to 10^−7, and expected statistical power from 30 to 90%. For clarity purposes, only selected cases are plotted in each figure. If unspecified, a disease prevalence of 0.02, disease allele frequency of 0.1, significance level of 10^−7, and an expected statistical power of 80% are assumed.

Although our power analysis is based on a single-locus disease model, the results are applicable to more complex disease models with multiple interacting genetic and environmental risk factors with different effect sizes. More specifically, our model captures the marginal effect of a single disease-predisposing locus, where all other genetic and environmental risk factors are considered to be confounding factors that tend to decrease the genotype relative risk at the locus under study. For example, if a common disease is caused by multiple rare alleles at different disease-predisposing loci, each with relatively high levels of penetrance [the common disease rare variant hypothesis (Fearnhead et al. 2005)], the disease would be caused by rare genetic variants with relatively high genotype relative risk at each disease-predisposing locus. Because such DPA tend to be site-specific (Simchoni et al. 2006), family history is an important indicator of whether a patient carries a DPA or not. Our analysis showed that the sample size required to detect such a rare but highly penetrant DPA can decrease dramatically if cases with multiple affected relatives are sampled.

Acknowledgments

This research was supported by the Kleberg Center for Molecular Markers at the M. D. Anderson Cancer Center and grants R01CA133996 and U01CA076293 from the US National Institutes of Health.

Contributor Information

Bo Peng, Email: bpeng@mdanderson.org, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center, 1155 Pressler St. Unit 1340, Houston, TX 77030, USA.

Biao Li, Email: Li.Biao@rice.edu, Department of Bioengineering, Rice University, MS-142, 6100 Main St., Houston, TX 77005, USA.

Younghun Han, Email: yhan@mdanderson.org, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center, 1155 Pressler St. Unit 1340, Houston, TX 77030, USA.

Christopher I. Amos, Email: camos@mdanderson.org, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center, 1155 Pressler St. Unit 1340, Houston, TX 77030, USA

References

  1. Amos CI. Successful design and conduct of genome-wide association studies. Hum Mol Genet. 2007;16(2):R220–R225. doi: 10.1093/hmg/ddm161.
  2. Easton DF, Pooley KA, Dunning AM, Pharoah PDP, Thompson D, Ballinger DG, Struewing JP, Morrison J, Field H, Luben R, Wareham N, Ahmed S, Healey CS, Bowman R, Meyer KB, Haiman CA, Kolonel LK, Henderson BE, Le Marchand L, Brennan P, Sangrajrang S, Gaborieau V, Odefrey F, Shen C-Y, Wu P-E, Wang H-C, Eccles D, Evans DG, Peto J, Fletcher O, Johnson N, Seal S, Stratton MR, Rahman N, Chenevix-Trench G, Bojesen SE, Nordestgaard BG, Axelsson CK, Garcia-Closas M, Brinton L, Chanock S, Lissowska J, Peplonska B, Nevanlinna H, Fagerholm R, Eerola H, Kang D, Yoo K-Y, Noh D-Y, Ahn SH, Hunter DJ, Hankinson SE, Cox DG, Hall P, Wedren S, Liu J, Low Y-L, Bogdanova N, Schurmann P, Dork T, Tollenaar RAEM, Jacobi CE, Devilee P, Klijn JGM, Sigurdson AJ, Doody MM, Alexander BH, Zhang J, Cox A, Brock IW, MacPherson G, Reed MWR, Couch FJ, Goode EL, Olson JE, Meijers-Heijboer H, van den Ouweland A, Uitterlinden A, Rivadeneira F, Milne RL, Ribas G, Gonzalez-Neira A, Benitez J, Hopper JL, McCredie M, Southey M, Giles GG, Schroen C, Justenhoven C, Brauch H, Hamann U, Ko Y-D, Spurdle AB, Beesley J, Chen X, Mannermaa A, Kosma V-M, Kataja V, Hartikainen J, Day NE, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447:1087–1093. doi: 10.1038/nature05887.
  3. Eeles RA, Kote-Jarai Z, Giles GG, Olama AA, Guy M, Jugurnauth SK, Mulholland S, Leongamornlert DA, Edwards SM, Morrison J, Field HI, Southey MC, Severi G, Donovan JL, Hamdy FC, Dearnaley DP, Muir KR, Smith C, Bagnato M, Ardern-Jones AT, Hall AL, O’Brien LT, Gehr-Swain BN, Wilkinson RA, Cox A, Lewis S, Brown PM, Jhavar SG, Tymrakiewicz M, Lophatananon A, Bryant SL, Horwich A, Huddart RA, Khoo VS, Parker CC, Woodhouse CJ, Thompson A, Christmas T, Ogden C, Fisher C, Jamieson C, Cooper CS, English DR, Hopper JL, Neal DE, Easton DF. Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet. 2008;40:316–321. doi: 10.1038/ng.90.
  4. Fearnhead NS, Winney B, Bodmer WF. Rare variant hypothesis for multifactorial inheritance: susceptibility to colorectal adenomas as a model. Cell Cycle. 2005;4:521–525. doi: 10.4161/cc.4.4.1591.
  5. Gold B, Kirchhoff T, Stefanov S, Lautenberger J, Viale A, Garber J, Friedman E, Narod S, Olshen AB, Gregersen P, Kosarin K, Olsh A, Bergeron J, Ellis NA, Klein RJ, Clark AG, Norton L, Dean M, Boyd J, Offit K. Genome-wide association study provides evidence for a breast cancer risk locus at 6q22.33. Proc Natl Acad Sci. 2008;105:4340–4345. doi: 10.1073/pnas.0800441105.
  6. Houlston RS, Peto J. The future of association studies of common cancers. Hum Genet. 2003;112:434–435. doi: 10.1007/s00439-002-0902-4.
  7. Maher B. Personal genomes: the case of the missing heritability. Nature. 2008;456:18–21. doi: 10.1038/456018a.
  8. Manolio TA, Brooks LD, Collins FS. A HapMap harvest of insights into the genetics of common disease. J Clin Invest. 2008;118:1590–1605. doi: 10.1172/JCI34772.
  9. Matakidou A, Eisen T, Houlston RS. Systematic review of the relationship between family history and lung cancer risk. Br J Cancer. 2005;93:825–833. doi: 10.1038/sj.bjc.6602769.
  10. Rudd MF, Webb EL, Matakidou A, Sellick GS, Williams RD, Bridle H, Eisen T, Houlston RS. Variants in the GH-IGF axis confer susceptibility to lung cancer. Genome Res. 2006;16:693–701. doi: 10.1101/gr.5120106.
  11. Simchoni S, Friedman E, Kaufman B, Gershoni-Baruch R, Orr-Urtreger A, Kedar-Barnes I, Shiri-Sverdlov R, Dagan E, Tsabari S, Shohat M, Catane R, King MC, Lahad A, Levy-Lahad E. Familial clustering of site-specific cancer risks associated with BRCA1 and BRCA2 mutations in the Ashkenazi Jewish population. Proc Natl Acad Sci USA. 2006;103:3770–3774. doi: 10.1073/pnas.0511301103.
  12. Skol AD, Scott LJ, Abecasis GR, Boehnke M. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet. 2006;38:209–213. doi: 10.1038/ng1706.
