Analytic power and sample size calculation for the genotypic transmission/disequilibrium test in case-parent trio studies

Christoph Neumann; Margaret A Taub; Samuel G Younkin; Terri H Beaty; Ingo Ruczinski; Holger Schwender

doi:10.1002/bimj.201300148

Author manuscript; available in PMC: 2015 Nov 1.

Published in final edited form as: Biom J. 2014 Aug 14;56(6):1076–1092. doi: 10.1002/bimj.201300148


PMCID: PMC4206700 NIHMSID: NIHMS633811 PMID: 25123830

Abstract

Case-parent trio studies considering genotype data from children affected by a disease and from their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests in this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, e.g., procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests have the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance as well as with that of the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time that the power of these tests has been compared based on equations, yielding instant results and removing the need for time-consuming simulation studies. This comparison reveals that the tests have almost the same power, with the score test being slightly more powerful.

Keywords: Case-parent trio design, Conditional logistic regression, Genome-wide association studies, Power calculation, Wald test

1 Introduction

Case-parent trio studies are frequently used to test SNPs for association with disease by analyzing the genotypes of children having this disease and their parents. Advantages of case-parent trio and other family-based designs over population-based case-control studies are their robustness against spurious findings due to population stratification and the possibility to test for association and linkage simultaneously (Spielman and Ewens, 1996; Gauderman et al., 1999; Laird and Lange, 2006).

One of the most popular tests for association in case-parent trio studies is the allelic transmission/disequilibrium test introduced by Spielman et al. (1993), which is equivalent to McNemar’s test comparing the numbers of alleles transmitted or not transmitted from the heterozygous parents to their offspring affected by disease. This allelic TDT thus allows the detection of alleles preferentially transmitted to the affected offspring, and hence, potentially associated with the disease of the children.

Instead of testing alleles (and thus, considering chromosomes as units in the analysis), genotypes (and therefore, individuals) can also be directly analyzed by employing a genotypic transmission/disequilibrium test. In the genotypic TDT, the genotype of an affected child is compared to the three other genotypes possible given the parents’ genotypes, but not shown by the affected offspring (Self et al., 1991; Schaid, 1996). This test is equivalent to a Wald test in a conditional logistic regression model in which each case-parent trio forms a stratum and the respective three not transmitted genotypes serve as controls (usually referred to as pseudo-controls, as these controls are artificial).

While the allelic TDT is based on the assumption of a multiplicative mode of inheritance, the genotypic TDT can be used to test a wide range of genetic models, considering, e.g., an additive, dominant, or recessive mode of inheritance (see, e.g., Fallin et al., 2002). Moreover, the genotypic TDT allows the determination of parameter estimates, relative risks, standard errors, and confidence intervals in addition to p-values. These estimates can, e.g., be used to combine results from different case-parent trio studies as well as in meta-analyses of case-parent trio with population-based case-control studies (see, e.g., Ludwig et al., 2012). By contrast, both the allelic TDT and the score test corresponding to the Wald test in the conditional logistic regression model only provide (scores and) p-values.

A disadvantage of the genotypic TDT, in particular in genome-wide association studies, over the allelic TDT and the score test was its high computation time, as the likelihood of the conditional logistic regression model had to be maximized by employing an iterative procedure to obtain the test statistic of the Wald test, and hence, the genotypic TDT statistic. However, it has recently been shown that when testing SNPs individually an analytic solution for the maximum-likelihood estimator in this model, and thus, for the genotypic TDT statistic exists, no matter whether an additive, dominant, or recessive mode of inheritance is assumed (Schwender et al., 2012). Therefore, this drawback has been eliminated, and genome-wide applications of the genotypic TDT are as fast as analyses with the allelic TDT or the score test (see, in particular, Table 4 in Schwender et al., 2012).

These closed-form solutions of the genotypic TDT also allow the analytic determination of the power and sample size of the genotypic TDT (and the score test) assuming different modes of inheritance. This hence avoids the need for time-consuming computations of required sample sizes or power based on simulation studies. In this article, we derive equations for these power and sample size determinations and compare the sample sizes required by the genotypic TDTs to reach a certain power with the ones needed by the corresponding score tests. This comparison also includes the allelic TDT proposed by Spielman et al. (1993) assuming a multiplicative mode of inheritance, since this test is equivalent to a score test assuming an additive mode of inheritance (cf. Schaid and Sommer, 1994).

For the allelic TDT under general modes of inheritance, equations for the approximations of power and sample sizes have already been devised by Knapp (1999). An alternative approach to the one of Knapp (1999) for power and sample size calculations for studies with a dichotomous outcome has been proposed by Lange and Laird (2002). Their procedure covers the wide range of general FBATs (Family-Based Association Tests) as suggested, e.g., by Laird et al. (2000) and Rabinowitz and Laird (2000) for different family-based designs and different situations in which, e.g., the genotypes of one or both parents are missing. This also includes the original TDT (i.e. the allelic TDT) proposed by Spielman et al. (1993). This method, however, does not cover the genotypic TDT, and hence, does not provide the possibility for analytic power calculation for the genotypic TDT. Moreover, a related approach has been devised by Lange et al. (2002) for power and sample size determinations for general FBATs considering quantitative traits.

This article is organized as follows: We first describe the analytic determination of the genotypic TDT statistics in Section 2. Afterwards, we derive in Sections 3 and 4 equations for power and sample size calculations, respectively, for the genotypic TDT. We focus in these sections on an additive mode of inheritance. Equations for power determinations for the dominant and recessive mode of inheritance as well as their derivation can be found in Appendix A.1. In Section 5, we furthermore present concise equations for the test statistics of the score test, assuming an additive mode of inheritance that can – analogously to the genotypic TDT statistics – be used for power and sample size calculation. Score tests for a dominant or a recessive mode of inheritance are discussed in Appendix A.2. In Section 6, the required sample sizes of these tests are compared with each other and with the ones of the allelic TDT determined based on the approach of Knapp (1999). Finally, we conduct a simulation study in Section 7 to validate the equations for the statistical power determination.

All the closed-form solutions for performing sample size and power calculation are implemented in the R-package trio freely available at http://www.bioconductor.org.

2 Analytic solution to the genotypic TDT

To test under a specified mode of inheritance (e.g., an additive, dominant, or recessive mode of inheritance) whether a SNP is associated with disease, the genotypic TDT assesses whether a genotype of this SNP is preferentially transmitted from the parents to their affected offspring. The genotypic TDT is based on a conditional logistic regression model consisting only of one explanatory variable X coding for the specified mode of inheritance. In this model, the genotype of the affected child is compared with the three not transmitted genotypes that would have also been possible given the genotypes of the parents.

As an example, assume that at a specific SNP one parent shows the homozygous reference genotype A₁A₂, and the other the heterozygous genotype A₃G (where the indices are only used to distinguish between the different alleles). Since each parent transmits one of these alleles to their offspring, the offspring will exhibit one of the genotypes A₁A₃, A₁G, A₂A₃, and A₂G (cf. first two columns of Table 1). In the conditional logistic regression model, the genotype of this offspring is considered as case and the other three not transmitted genotypes are used as pseudo-controls, where the dependency structure is taken into account by forming one stratum for each case-parent trio.
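The construction of the pseudo-controls in this example can be sketched in a few lines of Python. The helper names `pseudo_controls` and `minor_allele_count` and the allele labels are illustrative assumptions and are not part of the R package trio; the sketch only enumerates the four genotypes the parents can transmit and removes the one shown by the affected child.

```python
from itertools import product


def pseudo_controls(parent1, parent2, child):
    """Return the three genotypes the parents could have transmitted but did not.

    Genotypes are tuples of two allele labels; `child` lists the allele
    inherited from parent1 first. (Hypothetical helper for illustration.)
    """
    # Each parent transmits one of its two alleles: four equally likely genotypes.
    possible = list(product(parent1, parent2))
    # Drop exactly one copy of the genotype shown by the affected child.
    possible.remove(child)
    return possible


def minor_allele_count(genotype, minor="G"):
    """Additive coding X: number of minor alleles in a genotype."""
    return sum(allele == minor for allele in genotype)


# Example from the text: one parent homozygous for A, the other heterozygous,
# and a heterozygous affected child.
controls = pseudo_controls(("A", "A"), ("A", "G"), ("A", "G"))
x_case = minor_allele_count(("A", "G"))
x_controls = sorted(minor_allele_count(g) for g in controls)
print(x_case, x_controls)  # 1 [0, 0, 1]
```

The resulting codings (case 1, pseudo-controls 0, 0, 1) correspond to the trio type with parents carrying 0 and 1 minor alleles and a child carrying one minor allele.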

Table 1.

Number of minor alleles shown by the affected offspring, their parents, and the corresponding pseudo-controls for the ten possible genotype combinations case-parent trios can show as well as the weights of the trios in the maximization of the conditional likelihood assuming an additive mode of inheritance. The number of trios showing a specific genotype combination is denoted by $n_{c}^{(p_{1}, p_{2})}$ with c, p₁, p₂ ∈ {0, 1, 2} and p₁ ≤ p₂. This is a modified version of Table 1 from Schwender et al. (2012).

	Number of Minor Alleles			
Number of Trios	Parents	Offspring	Pseudo-Controls	Weights in the Likelihood
n_{0}^{(0, 1)}	0, 1	0	0, 1, 1	\frac{1}{2 + 2 exp (γ_{add})}
n_{1}^{(0, 1)}	0, 1	1	0, 0, 1	\frac{exp (γ_{add})}{2 + 2 exp (γ_{add})}
n_{1}^{(1, 2)}	1, 2	1	1, 2, 2	\frac{1}{2 + 2 exp (γ_{add})}
n_{2}^{(1, 2)}	1, 2	2	1, 1, 2	\frac{exp (γ_{add})}{2 + 2 exp (γ_{add})}
n_{0}^{(1, 1)}	1, 1	0	1, 1, 2	\frac{1}{{(1 + exp (γ_{add}))}^{2}}
n_{1}^{(1, 1)}	1, 1	1	0, 1, 2	\frac{exp (γ_{add})}{{(1 + exp (γ_{add}))}^{2}}
n_{2}^{(1, 1)}	1, 1	2	0, 1, 1	\frac{exp (2 γ_{add})}{{(1 + exp (γ_{add}))}^{2}}
n_{1}^{(0, 2)}	0, 2	1	1, 1, 1	\frac{1}{4}
n_{2}^{(2, 2)}	2, 2	2	2, 2, 2	\frac{1}{4}
n_{0}^{(0, 0)}	0, 0	0	0, 0, 0	\frac{1}{4}

Denoting the value of X for the affected offspring in case-parent trio i = 1, …, n, by x_i0, and the values for the corresponding three pseudo-controls by x_ik, k = 1, …, 3, the maximum-likelihood estimate for the parameter γ corresponding to X is determined by the value γ̂ maximizing the conditional likelihood

L (γ) = \prod_{i = 1}^{n} \frac{exp (γ x_{i 0})}{\sum_{k = 0}^{3} exp (γ x_{i k})}

(1)

of the conditional logistic regression model. The test statistic of the genotypic TDT is then given by the Wald statistic

g^{2} = \frac{{γ̂}^{2}}{Var (γ̂)} .

(2)

The likelihood (1) is the product of the n weights $w_{i} = exp (γ x_{i 0}) / \sum_{k = 0}^{3} exp (γ x_{i k})$ of the n trios and has to be maximized with respect to γ. Considering the above example trio and assuming that the offspring shows one of the heterozygous genotypes, the weight of this case-parent trio under an additive mode of inheritance (in which case, X codes for the number of minor alleles) is given by

w (γ_{add}) = \frac{exp (γ_{add} \cdot 1)}{2 exp (γ_{add} \cdot 0) + 2 exp (γ_{add} \cdot 1)} = \frac{exp (γ_{add})}{2 + 2 exp (γ_{add})} .
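As a minimal sketch, the weight of a single trio can be evaluated directly from the codings of the case and its three pseudo-controls. The helper `trio_weight` and the value γ = 0.3 are assumptions chosen purely for illustration.

```python
import math


def trio_weight(gamma, x_case, x_pseudo):
    """Contribution of one trio to the conditional likelihood (1)."""
    numerator = math.exp(gamma * x_case)
    denominator = numerator + sum(math.exp(gamma * x) for x in x_pseudo)
    return numerator / denominator


gamma = 0.3  # arbitrary illustrative value

# Heterozygous child of parents with 0 and 1 minor alleles (pseudo-controls 0, 0, 1).
w_general = trio_weight(gamma, 1, [0, 0, 1])
w_closed = math.exp(gamma) / (2 + 2 * math.exp(gamma))

# Trio without a heterozygous parent: case and pseudo-controls share one coding,
# so the weight equals 1/4 for every value of gamma.
w_constant = trio_weight(gamma, 1, [1, 1, 1])
```

Here `w_general` agrees with the closed-form expression above, while `w_constant` does not depend on γ, which is why such trios drop out of the maximization.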

However, there only exist ten possible genotype combinations for case-parent trios, and thus, (at most) ten different weights (see Table 1). Since three of these genotype combinations have weights not depending on γ_add, only seven of them (namely the ones comprising at least one heterozygous parent) contribute to the maximization of (1). In this situation, the logarithm of the conditional likelihood (1), therefore, reduces to a sum over seven numbers $n_{c}^{(p_{1}, p_{2})}$ of trios showing the respective genotype combination, each weighted by the logarithm of the corresponding weight w (γ_add), where c, p₁, p₂ ∈ {0, 1, 2} with p₁ ≤ p₂ are the numbers of minor alleles of the children and their parents in the respective trios. Using these numbers and the weights from Table 1, the reduced log-likelihood is thus given by

ℓ^{*} (γ_{add}) = (n_{0}^{(0, 1)} + n_{1}^{(1, 2)}) log (\frac{1}{2 + 2 exp (γ_{add})}) + (n_{1}^{(0, 1)} + n_{2}^{(1, 2)}) log (\frac{exp (γ_{add})}{2 + 2 exp (γ_{add})}) + \sum_{c = 0}^{2} log (\frac{exp (c γ_{add})}{{(1 + exp (γ_{add}))}^{2}}) n_{c}^{(1, 1)} = - (log (2) + log (1 + exp (γ_{add}))) (n_{0}^{(0, 1)} + n_{1}^{(0, 1)} + n_{1}^{(1, 2)} + n_{2}^{(1, 2)}) + (n_{1}^{(0, 1)} + n_{2}^{(1, 2)} + n_{1}^{(1, 1)} + 2 n_{2}^{(1, 1)}) γ_{add} - 2 log (1 + exp (γ_{add})) \sum_{c = 0}^{2} n_{c}^{(1, 1)} .

(3)

Noticing that

n_{het} = n_{0}^{(0, 1)} + n_{1}^{(0, 1)} + n_{1}^{(1, 2)} + n_{2}^{(1, 2)} + 2 \sum_{c = 0}^{2} n_{c}^{(1, 1)}

(4)

is the total number of heterozygous parents and

n_{not} = n_{1}^{(0, 1)} + n_{2}^{(1, 2)} + n_{1}^{(1, 1)} + 2 n_{2}^{(1, 1)}

(5)

is the total number of more frequent alleles not transmitted from the heterozygous parents to their affected offspring – or analogously, the total number of minor alleles transmitted by the heterozygous parents – the first derivative of (3) is given by

\frac{\partial ℓ^{*} (γ_{add})}{\partial γ_{add}} = n_{not} - \frac{exp (γ_{add})}{1 + exp (γ_{add})} n_{het} .

(6)

Setting (6) to zero and solving it for γ_add, the maximum-likelihood estimator for γ_add is given by

{γ̂}_{add} = logit (\frac{n_{not}}{n_{het}}) = log (\frac{n_{not}}{n_{het} - n_{not}}) .

(7)

The variance of γ̂_add can then be estimated by the value of the negative inverse of the second derivative

\frac{\partial^{2} ℓ^{*} (γ_{add})}{\partial γ_{add}^{2}} = - \frac{exp (γ_{add})}{{(1 + exp (γ_{add}))}^{2}} n_{het}

(8)

at γ_add = γ̂_add, i.e. by

\hat{Var} ({γ̂}_{add}) = \frac{n_{het}}{(n_{het} - n_{not}) n_{not}} .

For a more detailed discussion of the analytic solution for the genotypic TDT, see Schwender et al. (2012).
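The closed-form expressions above are easy to check numerically. The following sketch, with hypothetical counts and function names that are not taken from the trio package, computes the estimate (7), its estimated variance, and the Wald statistic (2), and confirms that the first derivative (6) vanishes at the estimate.

```python
import math


def gtdt_additive(n_het, n_not):
    """Closed-form estimate (7), its variance estimate, and the Wald statistic (2)."""
    gamma_hat = math.log(n_not / (n_het - n_not))
    var_hat = n_het / ((n_het - n_not) * n_not)
    return gamma_hat, var_hat, gamma_hat ** 2 / var_hat


def score_additive(gamma, n_het, n_not):
    """First derivative (6) of the reduced log-likelihood."""
    return n_not - math.exp(gamma) / (1 + math.exp(gamma)) * n_het


# Hypothetical counts: 200 heterozygous parents, 120 of whom transmitted the minor allele.
gamma_hat, var_hat, wald = gtdt_additive(200, 120)
residual = score_additive(gamma_hat, 200, 120)  # vanishes at the maximum
print(round(gamma_hat, 4), round(var_hat, 4), round(wald, 2))  # 0.4055 0.0208 7.89
```

With these counts, exp(γ̂_add) = 120/80 = 1.5 is the estimated relative risk per minor allele.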

Analogously, closed-form solutions for the genotypic TDT statistic (2) can be derived for the dominant and the recessive mode of inheritance (for these derivations, see Schwender et al., 2012), where in the dominant case the coding variable X is set to 0 if the subject shows the homozygous reference genotype, and to 1 otherwise. For a recessive mode of inheritance, X is set to 1 if both chromosomes show the minor allele, and to 0 otherwise. In both cases, the maximum likelihood estimate for γ takes the form

γ̂ = log (\sqrt{a + h^{2}} - h),

where, e.g., in a dominant model a and h are given by

a_{dom} = \frac{n_{1}^{(0, 1)} + n_{1}^{(1, 1)} + n_{2}^{(1, 1)}}{3 (n_{0}^{(0, 1)} + n_{0}^{(1, 1)})}

(9)

and

h_{dom} = (\frac{1 / 3 (n_{0}^{(0, 1)} - n_{1}^{(1, 1)} - n_{2}^{(1, 1)}) - n_{1}^{(0, 1)} + n_{0}^{(1, 1)}}{2 (n_{0}^{(0, 1)} + n_{0}^{(1, 1)})}),

(10)

respectively.
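A short numerical check of the dominant-model solution is sketched below. The counts are hypothetical, and the score function `dominant_score` is derived here from the dominant coding described above rather than taken from the source: strata with parents carrying (0, 1) minor alleles contain two genotypes with X = 1 among the four possible ones, and strata with parents carrying (1, 1) contain three.

```python
import math


def gtdt_dominant_mle(n0_01, n1_01, n0_11, n1_11, n2_11):
    """Closed-form estimate log(sqrt(a + h^2) - h) with a and h from (9) and (10)."""
    denom = n0_01 + n0_11
    a = (n1_01 + n1_11 + n2_11) / (3 * denom)
    h = ((n0_01 - n1_11 - n2_11) / 3 - n1_01 + n0_11) / (2 * denom)
    return math.log(math.sqrt(a + h * h) - h)


def dominant_score(gamma, n0_01, n1_01, n0_11, n1_11, n2_11):
    """Derivative of the conditional log-likelihood under dominant coding.

    Illustrative derivation: observed number of cases with X = 1 minus the
    number expected given the composition of each stratum.
    """
    e = math.exp(gamma)
    cases_with_x1 = n1_01 + n1_11 + n2_11
    expected_01 = e / (1 + e) * (n0_01 + n1_01)
    expected_11 = 3 * e / (1 + 3 * e) * (n0_11 + n1_11 + n2_11)
    return cases_with_x1 - expected_01 - expected_11


# Hypothetical trio counts n_c^(p1, p2).
counts = dict(n0_01=40, n1_01=60, n0_11=10, n1_11=25, n2_11=15)
gamma_hat = gtdt_dominant_mle(**counts)
residual = dominant_score(gamma_hat, **counts)  # should be numerically zero
```

A residual close to zero indicates that the closed-form value is indeed a stationary point of the conditional likelihood for these counts.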

3 Power calculation for the genotypic TDT

For the additive mode of inheritance, the power of the genotypic TDT can be determined by an approach analogous to the one used by Knapp (1999) for calculating the power of the allelic TDT. For this, we consider the random vector Z_het = (Z₁, …, Z₇)^T consisting of random variables Z_j, j = 1, …, 7, for the seven numbers of trios corresponding to the genotype combinations that influence the maximization of the log-likelihood (3). This random vector is a subvector of Z = (Z₁, …, Z₈)^T, where Z additionally contains the random variable Z₈ specifying the total number of trios belonging to the other three genotype combinations without heterozygous parents. This random vector Z is thus multinomially distributed with n observations (here, trios) and probability vector q = (q₁, …, q₈)^T.

We further define u = (u₁, …, u₇)^T and v = (v₁, …, v₇)^T as the vectors containing the numbers u_j and v_j of more frequent alleles transmitted and not transmitted, respectively, from the heterozygous parents to their offspring in trios with the j-th genotype combination. Using these specifications, it can be derived from (4) and (5) that

N_{het} = u^{T} Z_{het} + v^{T} Z_{het} and N_{not} = v^{T} Z_{het}

(11)

are the random variables generating n_het and n_not, respectively. Therefore, the square root of the test statistic of the genotypic TDT can be rewritten as

G = \sqrt{n} log (\frac{V}{U}) \sqrt{\frac{U V}{U + V}}

with U = u^TZ_het/n and V = v^TZ_het/n. Note that under the null hypothesis H₀ : γ = 0, G is asymptotically standard normally distributed so that G² is asymptotically χ²-distributed with one degree of freedom.

Following the same arguments as Knapp (1999) based on the theoretic results presented in Rao (1973) and setting ũ = u^T q_het and ṽ = v^T q_het with q_het = (q₁, …, q₇)^T, the test statistic G follows asymptotically a normal distribution with mean

μ_{add} = \sqrt{n} E_{1} (g_{add}) = \sqrt{n} log (\frac{υ̃}{ũ}) \sqrt{\frac{ũ υ̃}{ũ + υ̃}}

and variance

σ_{add}^{2} = u^{T} Σ u {(\frac{\partial g}{\partial ũ})}^{2} + 2 u^{T} Σ v \frac{\partial g}{\partial ũ} \frac{\partial g}{\partial υ̃} + v^{T} Σ v {(\frac{\partial g}{\partial υ̃})}^{2} .

Here, E₁ (g_add) is the expected value of the genotypic TDT statistic for n = 1, Σ is a 7 × 7 matrix with diagonal elements q_j (1 − q_j) and off-diagonal elements −q_jq_ℓ, and the two derivatives are given by

\frac{\partial g}{\partial ũ} = \frac{(\frac{1}{2} log (\frac{υ̃}{ũ}) - 1) \sqrt{(ũ + υ̃) \frac{υ̃}{ũ}} - \frac{1}{2} log (\frac{υ̃}{ũ}) \sqrt{\frac{ũ υ̃}{ũ + υ̃}}}{ũ + υ̃}

and

\frac{\partial g}{\partial υ̃} = \frac{(\frac{1}{2} log (\frac{υ̃}{ũ}) + 1) \sqrt{(ũ + υ̃) \frac{ũ}{υ̃}} - \frac{1}{2} log (\frac{υ̃}{ũ}) \sqrt{\frac{ũ υ̃}{ũ + υ̃}}}{ũ + υ̃} .

The statistical power β_add of the genotypic TDT assuming an additive mode of inheritance is thus asymptotically given by

β_{add} = Φ (\frac{z_{α / 2} - μ_{add}}{σ_{add}}) + 1 - Φ (\frac{z_{1 - α / 2} - μ_{add}}{σ_{add}}),

(12)

where Φ is the cumulative distribution function of the standard normal distribution and z_α is the α-quantile of the standard normal distribution.

In a case-parent trio study, the values q_j, j = 1, …, 7, can be estimated, and thus, the power of the genotypic TDT can be determined for each SNP from the data. It is, however, often also of interest to compute the power for a given number n of trios, a type I error rate α, and a relative risk RR. Since under the assumption of Hardy-Weinberg equilibrium in the parents there exists a direct relationship between the relative risk and the probabilities q_j for the different types of trios, q_j, j = 1, …, 7, can be computed from the relative risk RR (see Schaid, 1999, for general equations for these probabilities).

For such a determination in an additive model, we set r₀ = 1, r₁ = RR, and r₂ = 2RR − 1, where r_c is the risk to get the disease with c minor alleles relative to the disease risk with no minor allele (Schaid, 1999). Further denoting the minor allele frequency by m, the probabilities for the different types of trios can be computed by

q_{j} = \frac{2 m^{p_{1} + p_{2}} {(1 - m)}^{4 - p_{1} - p_{2}} r_{c}}{\max \{u_{j}, v_{j}\} \cdot (2 m (RR - 1) + 1)},

where p₁ and p₂ are the numbers of minor alleles of the parents and c is the number of minor alleles of the offspring (as defined in Section 2).
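The additive-model power calculation can be sketched in plain Python as follows. This is an illustrative reimplementation under stated assumptions, not the code of the R package trio: the names are hypothetical, the trio probabilities use Hardy-Weinberg proportions (a mating type with p₁ + p₂ minor alleles contributes m^{p₁+p₂}(1 − m)^{4−p₁−p₂}), and u_j and v_j count the more frequent and the minor alleles, respectively, transmitted by heterozygous parents.

```python
import math
from statistics import NormalDist

# (p1, p2, c, u_j, v_j) for the seven trio types with a heterozygous parent.
TRIO_TYPES = [
    (0, 1, 0, 1, 0), (0, 1, 1, 0, 1),
    (1, 2, 1, 1, 0), (1, 2, 2, 0, 1),
    (1, 1, 0, 2, 0), (1, 1, 1, 1, 1), (1, 1, 2, 0, 2),
]


def trio_probabilities(rr, maf):
    """Probabilities q_j under Hardy-Weinberg equilibrium and additive risks."""
    risks = [1.0, rr, 2 * rr - 1]
    norm = 2 * maf * (rr - 1) + 1
    return [
        2 * maf ** (p1 + p2) * (1 - maf) ** (4 - p1 - p2) * risks[c]
        / (max(u, v) * norm)
        for p1, p2, c, u, v in TRIO_TYPES
    ]


def g_and_gradient(u, v):
    """g = log(v/u) sqrt(uv/(u+v)) and its two partial derivatives."""
    log_ratio = math.log(v / u)
    root = math.sqrt(u * v / (u + v))
    dg_du = ((0.5 * log_ratio - 1) * math.sqrt((u + v) * v / u)
             - 0.5 * log_ratio * root) / (u + v)
    dg_dv = ((0.5 * log_ratio + 1) * math.sqrt((u + v) * u / v)
             - 0.5 * log_ratio * root) / (u + v)
    return log_ratio * root, dg_du, dg_dv


def power_additive(n, rr, maf, alpha):
    """Asymptotic power (12) of the additive genotypic TDT for n trios."""
    q = trio_probabilities(rr, maf)
    u_vec = [t[3] for t in TRIO_TYPES]
    v_vec = [t[4] for t in TRIO_TYPES]
    u_t = sum(a * p for a, p in zip(u_vec, q))
    v_t = sum(b * p for b, p in zip(v_vec, q))
    # Multinomial covariance: a' Sigma b = sum_j a_j b_j q_j - (a'q)(b'q).
    uu = sum(a * a * p for a, p in zip(u_vec, q)) - u_t ** 2
    vv = sum(b * b * p for b, p in zip(v_vec, q)) - v_t ** 2
    uv = sum(a * b * p for a, b, p in zip(u_vec, v_vec, q)) - u_t * v_t
    e1, dg_du, dg_dv = g_and_gradient(u_t, v_t)
    mu = math.sqrt(n) * e1
    sigma = math.sqrt(uu * dg_du ** 2 + 2 * uv * dg_du * dg_dv + vv * dg_dv ** 2)
    z_low = NormalDist().inv_cdf(alpha / 2)  # z_{alpha/2}; z_{1-alpha/2} = -z_low
    phi = NormalDist().cdf
    return phi((z_low - mu) / sigma) + 1 - phi((-z_low - mu) / sigma)


power = power_additive(2524, rr=1.5, maf=0.10, alpha=5e-8)
```

Table 2 lists 2,524 trios for RR = 1.50, MAF = 0.10, and α = 5 × 10⁻⁸, so the call above should return a power close to 0.80 if the sketch mirrors the equations of this section.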

Equations for the statistical power of the genotypic TDT assuming either a dominant or recessive mode of inheritance can be devised in a similar way as for the additive model. For the derivation of these equations, see Appendix A.1.

4 Sample size calculation for the genotypic TDT

An equation for the required sample size for a given type I error rate α and power β can be derived from equation (12) for the statistical power in the standard way. If α is small and the relative risk is not too close to 1, either $Φ (\frac{z_{α / 2} - μ_{add}}{σ_{add}})$ or $1 - Φ (\frac{z_{1 - α / 2} - μ_{add}}{σ_{add}})$ becomes virtually zero so that this term of (12) only very slightly influences the statistical power. Due to the symmetry of these terms, the sample size n required to gain a desired power β and to control the type I error rate at α can in both situations be determined by

n \approx {(\frac{σ z_{β} + z_{1 - α / 2}}{E_{1} (g)})}^{2} .

(13)
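Equation (13) translates directly into code. In the sketch below, the function names and the input values for E₁(g) and σ are illustrative assumptions (they are not taken from the paper); the round trip through the dominant term of (12), written for E₁(g) > 0, checks that the rounded-up n is the smallest number of trios reaching the target power under this approximation.

```python
import math
from statistics import NormalDist


def sample_size(e1, sigma, alpha, beta):
    """Number of trios from equation (13), rounded up to the next integer."""
    z_beta = NormalDist().inv_cdf(beta)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(((sigma * z_beta + z_alpha) / e1) ** 2)


def one_tail_power(n, e1, sigma, alpha):
    """Dominant term of the power equation (12), assuming e1 > 0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return 1 - NormalDist().cdf((z_alpha - math.sqrt(n) * e1) / sigma)


# Illustrative inputs (assumed values for E_1(g) and sigma).
e1, sigma, alpha, beta = 0.125, 0.96, 5e-8, 0.80
n = sample_size(e1, sigma, alpha, beta)
power_at_n = one_tail_power(n, e1, sigma, alpha)
power_below = one_tail_power(n - 1, e1, sigma, alpha)
```

Because n enters only through √n E₁(g), halving E₁(g) roughly quadruples the required number of trios.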

5 Power and sample size determination for the score tests

The test statistic of a score test for testing the null hypothesis H₀ : γ = 0 against the alternative H₁ : γ ≠ 0 is given by

s_{add}^{2} = \frac{D^{2} (0)}{I (0)}

with

D (γ) = \frac{\partial ℓ (γ)}{\partial γ} and I (γ) = - \frac{\partial^{2} ℓ (γ)}{\partial γ^{2}} .

Assuming an additive mode of inheritance, these two derivatives are given by (6) and (8) so that the test statistic assuming an additive genetic model can be determined by

s_{add}^{2} = \frac{{(n_{not} - 0.5 n_{het})}^{2}}{0.25 n_{het}} = \frac{{(2 n_{not} - n_{het})}^{2}}{n_{het}} .

(14)

As shown by Schaid and Sommer (1994), this test statistic is equivalent to the test statistic of the allelic TDT (Spielman et al., 1993). Using our notation, this can be shown by noting that n_het = n_A + n_a and n_not = n_A, where n_A and n_a are the numbers of the minor and the more frequent allele, respectively, transmitted from heterozygous parents to their offspring (cf. (11)). Therefore, (14) becomes

s_{add}^{2} = \frac{{(2 n_{A} - (n_{A} + n_{a}))}^{2}}{n_{A} + n_{a}} = \frac{{(n_{A} - n_{a})}^{2}}{n_{A} + n_{a}},

which is the test statistic of McNemar’s test, i.e. the allelic TDT. Power and required sample size of the score test assuming an additive mode of inheritance can hence be determined by exactly the same approach proposed for the allelic TDT assuming a multiplicative model by Knapp (1999).
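The equivalence can be verified with a few lines of Python. The transmission counts and function names below are hypothetical; the Wald statistic of the additive genotypic TDT is included only to show how close the two tests are for the same data.

```python
import math


def score_statistic(n_het, n_not):
    """Score statistic (14) for the additive model."""
    return (2 * n_not - n_het) ** 2 / n_het


def allelic_tdt(n_minor, n_major):
    """McNemar-type statistic from transmissions by heterozygous parents."""
    return (n_minor - n_major) ** 2 / (n_minor + n_major)


def wald_statistic(n_het, n_not):
    """Additive genotypic TDT statistic from the closed-form estimate."""
    gamma_hat = math.log(n_not / (n_het - n_not))
    return gamma_hat ** 2 * (n_het - n_not) * n_not / n_het


n_minor, n_major = 120, 80  # hypothetical transmission counts
s2 = score_statistic(n_minor + n_major, n_minor)
t2 = allelic_tdt(n_minor, n_major)
g2 = wald_statistic(n_minor + n_major, n_minor)
print(s2, t2, round(g2, 2))  # 8.0 8.0 7.89
```

For these counts the score statistic is slightly larger than the Wald statistic, in line with the small sample size advantage of the score test reported in Section 6.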

For a discussion of the score tests assuming either a dominant or recessive mode of inheritance and the power calculation for these tests, see Appendix A.2.

6 Comparison of genotypic TDT and score test

Based on the approaches presented in the previous sections the sample sizes required by the genotypic TDTs to gain a certain power can be compared with the required sample sizes of the corresponding score tests.

Using equation (13), we thus computed the required sample sizes for these tests assuming an additive, dominant, or recessive mode of inheritance, considering different values of the relative risk (RR = 1.05, 1.20, 1.30, 1.40, 1.50) lying in the range of the relative risks observed in association studies as well as different minor allele frequencies (MAF = 0.01, 0.10, 0.20, 0.50) also considered by Knapp (1999) and Schaid (1999). As type I error rate we chose α = 5 × 10⁻⁸, which is often used to call (genome-wide) significance in genome-wide association studies. Moreover, we considered a power of β = 0.80 often desired to be gained in a study.

The results of this comparison are summarized in Table 2. This table reveals that the sample size required by the score test is always slightly smaller than the one needed by the corresponding genotypic TDT. Compared to the total sample sizes, these differences are, however, virtually negligible. This table also supports the well-known fact that huge numbers of subjects are required to detect with a high power genome-wide significant SNPs with a realistic relative risk.

Table 2.

Sample sizes required to gain 80% power with a type I error rate of α = 5 × 10⁻⁸ for genotypic TDTs and score tests assuming different modes of inheritance.

		Additive		Dominant		Recessive
RR	MAF	gTDT	Score	gTDT	Score	gTDT	Score
1.05	0.01	1,642,576	1,642,381	1,662,620	1,662,423	216,037,547	216,012,183
	0.10	183,105	183,084	207,695	207,672	2,310,954	2,310,703
	0.20	104,520	104,508	136,124	136,110	631,272	631,207
	0.50	69,857	69,849	154,609	154,594	149,732	149,718

1.20	0.01	110,732	110,549	111,955	111,770	14,148,787	14,125,709
	0.10	12,821	12,800	14,360	14,338	152,006	151,777
	0.20	7,621	7,610	9,657	9,643	41,833	41,774
	0.50	5,704	5,696	11,620	11,604	10,310	10,296

1.30	0.01	51,639	51,463	52,174	51,996	6,482,074	6,460,251
	0.10	6,123	6,103	6,807	6,785	69,827	69,610
	0.20	3,730	3,718	4,651	4,637	19,308	19,252
	0.50	2,976	2,968	5,790	5,773	4,876	4,863

1.40	0.01	30,425	30,255	30,722	30,549	3,755,909	3,735,177
	0.10	3,690	3,671	4,075	4,054	40,564	40,358
	0.20	2,300	2,289	2,827	2,813	11,268	11,215
	0.50	1,942	1,934	3,629	3,612	2,913	2,900

1.50	0.01	20,363	20,198	20,550	20,383	2,474,435	2,454,659
	0.10	2,524	2,505	2,771	2,749	26,789	26,592
	0.20	1,607	1,596	1,950	1,935	7,475	7,424
	0.50	1,427	1,419	2,575	2,557	1,977	1,964


We also compared the sample sizes required by the genotypic TDTs and the score tests with the ones determined by the approach proposed by Knapp (1999) for approximating the power of the allelic TDT for general modes of inheritance. The latter sample sizes were previously published in Table 3 of Knapp (1999). For this comparison, we computed the sample sizes required by the genotypic TDT and the score test using the same settings considered in Knapp (1999). These sample sizes are summarized in Table 3.

Table 3.

Sample sizes required to gain 80% power with a type I error rate of α = 10⁻⁷ for the genotypic TDTs, the score tests, and the first approach to power approximation proposed by Knapp (1999) assuming a dominant or a recessive mode of inheritance. The sample sizes for the method by Knapp (1999) were obtained from Table 3 in Knapp (1999).

RR	MAF	Dominant			Recessive
		gTDT	Score	Knapp (1999)	gTDT	Score	Knapp (1999)
1.5	0.01	19,741	19,582	19,755	2,378,326	2,359,578	154,174,890
	0.10	2,661	2,641	2,897	25,747	25,561	174,694
	0.50	2,473	2,456	4,568	1,900	1,887	3,099
	0.80	11,636	11,553	50,826	2,047	2,032	2,356

2.0	0.01	6,031	5,890	5,947	680,674	665,193	38,654,522
	0.10	877	857	949	7,449	7,294	45,071
	0.50	969	950	1,839	622	610	957
	0.80	4,819	4,717	21,998	762	746	851

4.0	0.01	1,215	1,102	1,115	115,532	105,324	4,344,070
	0.10	225	204	231	1,306	1,200	5,631
	0.50	361	331	696	158	146	207
	0.80	1,975	1,802	9,384	256	234	259


In the recessive model in particular, the required sample sizes for both the genotypic TDT and the score test are much smaller than the ones computed for the allelic TDT based on the approach of Knapp (1999).

7 Simulation study

To validate the accuracy of the proposed sample size calculation, we performed a simulation study using the minor allele frequencies and the type I error rate considered in the previous section. Since the expected value of the estimate for the parameter γ in the conditional logistic regression model on which the genotypic TDT is based is the log relative risk (Schaid, 1996), we considered 1.05, 1.20, 1.30, 1.40, and 1.50 as values for exp(γ). For each combination of these minor allele frequencies and exp(γ), we simulated 10⁵ case-parent trio data sets of the respective sample size determined in the previous section. The power was then estimated by the proportion of data sets for which the null hypothesis was rejected at the α = 5 × 10⁻⁸ level. Because of the huge sample sizes for the recessive model, we only considered genotypic TDTs and score tests assuming an additive or dominant mode of inheritance.
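A scaled-down version of such a simulation can be sketched as follows. It is not the simulation code used for this article: the setting (300 trios, RR = 1.5, MAF = 0.2, α = 0.05, 1000 replicates) is chosen so that it runs in seconds, trio types are drawn from a multinomial distribution with Hardy-Weinberg based probabilities under additive risks, and all names are hypothetical.

```python
import math
import random
from statistics import NormalDist

# (p1, p2, c, u_j, v_j): parental and offspring minor allele counts, followed by
# the numbers of major (u) and minor (v) alleles transmitted by heterozygous parents.
TRIO_TYPES = [
    (0, 1, 0, 1, 0), (0, 1, 1, 0, 1),
    (1, 2, 1, 1, 0), (1, 2, 2, 0, 1),
    (1, 1, 0, 2, 0), (1, 1, 1, 1, 1), (1, 1, 2, 0, 2),
]


def trio_probabilities(rr, maf):
    """Trio type probabilities under Hardy-Weinberg equilibrium and additive risks."""
    risks = [1.0, rr, 2 * rr - 1]
    norm = 2 * maf * (rr - 1) + 1
    return [
        2 * maf ** (p1 + p2) * (1 - maf) ** (4 - p1 - p2) * risks[c]
        / (max(u, v) * norm)
        for p1, p2, c, u, v in TRIO_TYPES
    ]


def simulated_power(n, rr, maf, alpha, replicates, seed=1):
    """Proportion of simulated data sets in which the additive gTDT rejects H0."""
    rng = random.Random(seed)
    q = trio_probabilities(rr, maf)
    # Eighth category: trios without heterozygous parents (u = v = 0).
    weights = q + [1 - sum(q)]
    transmissions = [(u, v) for *_, u, v in TRIO_TYPES] + [(0, 0)]
    critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    rejections = 0
    for _ in range(replicates):
        sample = rng.choices(transmissions, weights=weights, k=n)
        n_major = sum(u for u, _ in sample)
        n_minor = sum(v for _, v in sample)
        if n_major == 0 or n_minor == 0:
            continue  # estimate undefined; counted as no rejection
        gamma_hat = math.log(n_minor / n_major)
        wald = gamma_hat ** 2 * n_major * n_minor / (n_major + n_minor)
        rejections += wald > critical
    return rejections / replicates


estimate = simulated_power(n=300, rr=1.5, maf=0.2, alpha=0.05, replicates=1000)
```

The estimate carries Monte Carlo error of roughly one to two percentage points with 1000 replicates, so far more replicates are needed for comparisons at the third decimal place.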

The results of this simulation study are summarized in Table 4, which shows that all estimated powers are very close to β = 0.80, the power used in Section 6. Therefore, the sample size and power equations derived in Sections 3–5 as well as in the Appendix prove to be very accurate even for relative risks close to 1 and small minor allele frequencies.

Table 4.

Simulated power of the genotypic TDT and the score test. The power was determined assuming either an additive or a dominant mode of inheritance and based on 10⁵ simulated data sets consisting of the number of trios summarized in Table 2 for each of the settings in Table 2.

		Genotypic TDT		Score Test
RR	MAF	Additive	Dominant	Additive	Dominant
1.05	0.01	0.801	0.801	0.800	0.800
	0.10	0.799	0.798	0.800	0.802
	0.20	0.801	0.800	0.799	0.799
	0.50	0.801	0.800	0.800	0.799

1.20	0.01	0.800	0.800	0.801	0.801
	0.10	0.801	0.799	0.799	0.800
	0.20	0.799	0.798	0.798	0.801
	0.50	0.799	0.801	0.799	0.800

1.30	0.01	0.800	0.800	0.800	0.800
	0.10	0.798	0.802	0.799	0.798
	0.20	0.799	0.800	0.801	0.798
	0.50	0.800	0.800	0.798	0.800

1.40	0.01	0.800	0.798	0.800	0.801
	0.10	0.800	0.798	0.800	0.800
	0.20	0.799	0.799	0.800	0.800
	0.50	0.802	0.800	0.799	0.801

1.50	0.01	0.800	0.803	0.798	0.799
	0.10	0.798	0.799	0.800	0.801
	0.20	0.797	0.800	0.802	0.798
	0.50	0.798	0.800	0.798	0.798


8 Discussion

We have presented equations for determining the statistical powers and the required sample sizes for the genotypic TDT and the corresponding score test, assuming an additive, dominant, or recessive mode of inheritance. These approaches allow the determination of the power of the genotypic TDT for several relative risks, minor allele frequencies, etc., in a fraction of a second, and therefore, avoid very time-consuming simulation-based power and sample size estimations.

A comparison of the genotypic TDTs with the corresponding score tests showed that both require about the same sample size to gain the same power with a very slight advantage for the score test. This comparison also implicitly contained the original allelic TDT, as this test (assuming a multiplicative mode of inheritance) is equivalent to the score test for an additive model.

This comparison also reconfirmed the well-known fact that a huge number of samples is required to gain an acceptable power to detect genome-wide significant SNPs with typically small relative risks, where the required sample size rapidly increases with decreasing minor allele frequency. One of the reasons for this is that the smaller the minor allele frequency, the fewer trios contribute to the maximization of the conditional likelihood considered when performing the genotypic TDT, or more generally, when computing the Wald statistic.

The power and sample size determinations presented in this paper are implemented in the R package trio version 3.1.2 or later freely available at http://www.bioconductor.org.

Supplementary Material

Supporting Information

NIHMS633811-supplement-Supporting_Information.R (5.5 KB, R)

Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft [SCHW 1508/3-1 to C.N. and H.S., SFB 823 “Statistical Modelling of Nonlinear Dynamic Processes” to C.N.] and the National Institutes of Health [R03 DE021437 to I.R.].

Appendix

A.1. Determination of the asymptotic normal distributions of genotypic TDT statistics

Equations for the statistical power of the genotypic TDT assuming either a dominant or recessive mode of inheritance can be derived in a similar way as for the additive model described in Section 3. In these cases, however, the number of components forming the genotypic TDT statistic cannot be reduced to two terms U and V as in the additive case, but four components have to be considered (e.g., in the dominant model presented in Section 2, just $n_{1}^{(1, 1)}$ and $n_{2}^{(1, 1)}$ can be combined). Denoting these components by b₁, …, b₄, the test statistic G of the genotypic TDT assuming either a dominant or recessive mode of inheritance also follows asymptotically a normal distribution with mean

μ = \sqrt{n} E_{1} (g)

and variance

σ^{2} = \sum_{i = 1}^{4} \sum_{k = 1}^{4} σ_{i k} \frac{\partial g}{\partial b_{i}} \frac{\partial g}{\partial b_{k}},

(15)

where σ_ik are the pairwise covariances of these terms (the variances for i = k), computed as described in Section 3 (see also Knapp, 1999).
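The double sum in (15) is a quadratic form in the gradient of g. The following minimal sketch evaluates it for a given covariance matrix. The function names, the example probabilities, and the example gradient are hypothetical, and the multinomial covariance used here is only one plausible choice for σ_ik, assumed for illustration rather than taken from Section 3.

```python
def multinomial_covariance(p):
    """Covariance diag(p) - p p^T of the indicators of one multinomial draw.

    Assumption for illustration: the components b_1, ..., b_4 are counts of
    disjoint trio types, so their proportions behave like multinomial cells.
    """
    k = len(p)
    return [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(k)]
            for i in range(k)]


def delta_method_variance(grad, cov):
    """Double sum of equation (15): sum_i sum_k cov[i][k] * grad[i] * grad[k]."""
    k = len(grad)
    return sum(cov[i][j] * grad[i] * grad[j] for i in range(k) for j in range(k))


# Hypothetical component probabilities. They need not sum to 1, because trios
# outside b_1, ..., b_4 do not contribute to the statistic.
p_example = [0.10, 0.20, 0.15, 0.25]
# Hypothetical partial derivatives of g with respect to b_1, ..., b_4.
grad_example = [1.0, -1.0, 0.5, 2.0]

sigma2_example = delta_method_variance(grad_example,
                                       multinomial_covariance(p_example))
```

With a multinomial covariance, the quadratic form reduces to Σ p_i g_i² − (Σ p_i g_i)², which gives a quick consistency check on the double sum.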

In the following, the derivatives $\frac{\partial g}{\partial b_{i}}$ , i = 1, …, 4, are determined for the dominant and the recessive mode of inheritance. To distinguish between these two cases, we use the notation d_i (instead of b_i) for the numbers of trios when considering a dominant mode of inheritance and the notation r_i in the recessive case (for the specification of these numbers, see Table 5).

Table 5.

Number of minor alleles shown by the affected offspring and their parents for the ten possible genotype combinations case-parent trios can exhibit as well as the codings for the variable in the conditional logistic regression model used to perform a genotypic TDT assuming either a dominant or recessive mode of inheritance. The number of trios showing a specific genotype combination is denoted by $n_{c}^{(p_{1}, p_{2})}$ with c, p₁, p₂ ∈ {0, 1, 2} and p₁ ≤ p₂. Additionally, the corresponding components of the genotypic TDT statistic used to derive the asymptotic normal distribution of this statistic are displayed, where – marks a genotype combination that does not contribute to the maximization of the conditional log-likelihood under the corresponding mode of inheritance.

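| Number of trios | Minor alleles: parents | Minor alleles: offspring | Coding for case / pseudo-controls: dominant | Coding for case / pseudo-controls: recessive | Component: dominant | Component: recessive |
|---|---|---|---|---|---|---|
| $n_{0}^{(0, 1)}$ | 0, 1 | 0 | 0 / 0, 1, 1 | 0 / 0, 0, 0 | d₁ | – |
| $n_{1}^{(0, 1)}$ | 0, 1 | 1 | 1 / 0, 0, 1 | 0 / 0, 0, 0 | d₂ | – |
| $n_{1}^{(1, 2)}$ | 1, 2 | 1 | 1 / 1, 1, 1 | 0 / 0, 1, 1 | – | r₁ |
| $n_{2}^{(1, 2)}$ | 1, 2 | 2 | 1 / 1, 1, 1 | 1 / 0, 0, 1 | – | r₂ |
| $n_{0}^{(1, 1)}$ | 1, 1 | 0 | 0 / 1, 1, 1 | 0 / 0, 0, 1 | d₃ | r₃ |
| $n_{1}^{(1, 1)}$ | 1, 1 | 1 | 1 / 0, 1, 1 | 0 / 0, 0, 1 | d₄ | r₃ |
| $n_{2}^{(1, 1)}$ | 1, 1 | 2 | 1 / 0, 1, 1 | 1 / 0, 0, 0 | d₄ | r₄ |
| $n_{1}^{(0, 2)}$ | 0, 2 | 1 | 1 / 1, 1, 1 | 0 / 0, 0, 0 | – | – |
| $n_{2}^{(2, 2)}$ | 2, 2 | 2 | 1 / 1, 1, 1 | 1 / 1, 1, 1 | – | – |
| $n_{0}^{(0, 0)}$ | 0, 0 | 0 | 0 / 0, 0, 0 | 0 / 0, 0, 0 | – | – |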
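The codings in Table 5 can be reproduced by enumerating the four equally likely offspring genotypes of a parental pair, removing the genotype of the affected child, and coding the remaining three pseudo-controls. The sketch below does this under the usual Mendelian transmission assumption. The function names `transmitted` and `codings` are hypothetical and not part of any package.

```python
from itertools import product


def transmitted(parent):
    """Minor-allele counts of the two gametes a parent with 0, 1 or 2 minor alleles can pass on."""
    return {0: (0, 0), 1: (0, 1), 2: (1, 1)}[parent]


def codings(p1, p2, c):
    """Return ((case, pseudo-controls) dominant, (case, pseudo-controls) recessive).

    p1, p2: numbers of minor alleles of the parents; c: number of minor
    alleles of the affected offspring.
    """
    children = sorted(a + b for a, b in product(transmitted(p1), transmitted(p2)))
    if c not in children:
        raise ValueError("offspring genotype is incompatible with the parents")
    children.remove(c)  # the three remaining genotypes are the pseudo-controls
    dominant = (int(c >= 1), sorted(int(g >= 1) for g in children))
    recessive = (int(c == 2), sorted(int(g == 2) for g in children))
    return dominant, recessive


# Row n_1^{(1, 2)} of Table 5: dominant 1 / 1, 1, 1 and recessive 0 / 0, 1, 1.
row_example = codings(1, 2, 1)
```

A genotype combination contributes to the conditional likelihood under a given mode of inheritance only if the coded values of the case and its pseudo-controls are not all identical, which is why several rows carry a dash in the component columns.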

A.1.1. Genotypic TDT assuming a dominant model

For the dominant mode of inheritance, the numerator and denominator of the genotypic TDT statistic $g_{dom}^{2} = {γ̂}_{dom}^{2} / Var ({γ̂}_{dom})$ are determined from

{γ̂}_{dom} = log (\sqrt{a_{dom} + h_{dom}^{2}} - h_{dom})

(16)

with a_dom and h_dom as specified by (9) and (10), respectively, as well as

V_{dom} = {Var}^{- 1} ({γ̂}_{dom}) = \frac{(n_{0}^{(0, 1)} + n_{1}^{(0, 1)}) exp ({γ̂}_{dom})}{{(exp ({γ̂}_{dom}) + 1)}^{2}} + \frac{\sum_{c = 0}^{2} n_{c}^{(1, 1)} exp ({γ̂}_{dom})}{3 {(exp ({γ̂}_{dom}) + 1 / 3)}^{2}} .

The square root of the genotypic TDT statistic can, thus, be written as $g_{dom} = {γ̂}_{dom} \sqrt{V_{dom}}$ , and the first derivatives of g_dom with respect to d_i, i = 1, …, 4, can, hence, be determined by

\frac{\partial g_{dom}}{\partial d_{i}} = \frac{\partial {γ̂}_{dom}}{\partial d_{i}} \cdot \sqrt{V_{dom}} + \frac{\partial V_{dom}}{\partial d_{i}} \cdot \frac{{γ̂}_{dom}}{2 \sqrt{V_{dom}}} .

To derive the variance of the asymptotic normal distribution, we, therefore, need to compute $\frac{\partial {γ̂}_{dom}}{\partial d_{i}}$ as well as $\frac{\partial V_{dom}}{\partial d_{i}}$ , where in the dominant model the components d₁, …, d₄ are given by

d_{1} = n_{0}^{(0, 1)}, d_{2} = n_{1}^{(0, 1)}, d_{3} = n_{0}^{(1, 1)}, d_{4} = n_{1}^{(1, 1)} + n_{2}^{(1, 1)}

(see Table 5).

Differentiating (16) with respect to d_i, i = 1, …, 4, leads to

\frac{\partial {γ̂}_{dom}}{\partial d_{i}} = \frac{1}{exp ({γ̂}_{dom})} (\frac{a_{dom}^{(i)} + 2 h_{dom} h_{dom}^{(i)}}{2 \sqrt{a_{dom} + h_{dom}^{2}}} - h_{dom}^{(i)}),

(17)

where

h_{dom}^{(i)} = \frac{\partial h_{dom}}{\partial d_{i}} and a_{dom}^{(i)} = \frac{\partial a_{dom}}{\partial d_{i}} .

More exactly,

h_{dom}^{(1)} = \frac{3 d_{2} - 2 d_{3} + d_{4}}{6 {(d_{1} + d_{3})}^{2}}, h_{dom}^{(2)} = - \frac{1}{2 (d_{1} + d_{3})}, h_{dom}^{(3)} = \frac{2 d_{1} + 3 d_{2} + d_{4}}{6 {(d_{1} + d_{3})}^{2}}, h_{dom}^{(4)} = - \frac{1}{6 (d_{1} + d_{3})},

and

a_{dom}^{(1)} = a_{dom}^{(3)} = - \frac{d_{2} + d_{4}}{3 {(d_{1} + d_{3})}^{2}}, a_{dom}^{(2)} = a_{dom}^{(4)} = \frac{1}{3 (d_{1} + d_{3})} .

Setting c₁ = c₂ = 1 and c₃ = c₄ = 1/3 as well as

t_{dom}^{(i)} = \frac{\partial}{\partial d_{i}} \sqrt{a_{dom} + h_{dom}^{2}} = \frac{a_{dom}^{(i)} + 2 h_{dom} h_{dom}^{(i)}}{2 \sqrt{a_{dom} + h_{dom}^{2}}},

the first derivative of V_dom with respect to d_i, i = 1, …, 4, can then be derived as

\frac{\partial V_{dom}}{\partial d_{i}} = \frac{c_{i} exp ({γ̂}_{dom})}{{(exp ({γ̂}_{dom}) + c_{i})}^{2}} + (t_{dom}^{(i)} - h_{dom}^{(i)}) \sum_{k = 1}^{4} \frac{c_{k} d_{k} (c_{k} - exp ({γ̂}_{dom}))}{{(exp ({γ̂}_{dom}) + c_{k})}^{3}} .
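The derivatives of this subsection can be assembled into the gradient of g_dom and checked against finite differences. The sketch below is a hedged illustration: equations (9) and (10) lie outside this appendix, so the forms of a_dom and h_dom used here are an assumption, reconstructed so that their partial derivatives match those listed above. The function names and the example counts are hypothetical.

```python
import math

C_DOM = (1.0, 1.0, 1.0 / 3.0, 1.0 / 3.0)  # c_1, ..., c_4 as set in the text


def a_h_dom(d):
    """Assumed forms of a_dom and h_dom (reconstructed, see lead-in)."""
    d1, d2, d3, d4 = d
    a = (d2 + d4) / (3.0 * (d1 + d3))
    h = (d1 - 3.0 * d2 + 3.0 * d3 - d4) / (6.0 * (d1 + d3))
    return a, h


def g_dom(d):
    """Square root of the genotypic TDT statistic, gamma_hat * sqrt(V_dom)."""
    a, h = a_h_dom(d)
    e = math.sqrt(a + h * h) - h  # exp(gamma_hat), equation (16)
    v = sum(c * dk * e / (e + c) ** 2 for c, dk in zip(C_DOM, d))
    return math.log(e) * math.sqrt(v)


def grad_g_dom(d):
    """Analytic gradient of g_dom with respect to d_1, ..., d_4."""
    d1, d2, d3, d4 = d
    a, h = a_h_dom(d)
    s = d1 + d3
    root = math.sqrt(a + h * h)
    e = root - h
    gamma = math.log(e)
    v = sum(c * dk * e / (e + c) ** 2 for c, dk in zip(C_DOM, d))
    h_i = ((3 * d2 - 2 * d3 + d4) / (6 * s * s), -1 / (2 * s),
           (2 * d1 + 3 * d2 + d4) / (6 * s * s), -1 / (6 * s))
    a_i = (-(d2 + d4) / (3 * s * s), 1 / (3 * s),
           -(d2 + d4) / (3 * s * s), 1 / (3 * s))
    # Sum appearing in the derivative of V_dom; it does not depend on i.
    tail = sum(c * dk * (c - e) / (e + c) ** 3 for c, dk in zip(C_DOM, d))
    grad = []
    for i in range(4):
        t_i = (a_i[i] + 2 * h * h_i[i]) / (2 * root)
        de = t_i - h_i[i]            # derivative of exp(gamma_hat)
        dgamma = de / e              # equation (17)
        dv = C_DOM[i] * e / (e + C_DOM[i]) ** 2 + de * tail
        grad.append(dgamma * math.sqrt(v) + dv * gamma / (2 * math.sqrt(v)))
    return grad


# Hypothetical trio counts d_1, ..., d_4.
d_example = (120.0, 150.0, 80.0, 300.0)
grad_example = grad_g_dom(d_example)
```

Comparing `grad_example` with central finite differences of `g_dom` is a cheap way to catch sign or index slips before the gradient is inserted into (15).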

A.1.2. Genotypic TDT assuming a recessive model

Under the assumption of a recessive mode of inheritance, the maximum likelihood estimator of γ_rec is given by

{γ̂}_{rec} = log (\sqrt{a_{rec} + h_{rec}^{2}} - h_{rec}),

(18)

where

a_{rec} = \frac{3 (r_{2} + r_{4})}{r_{1} + r_{3}} and h_{rec} = \frac{3 r_{1} - r_{2} + r_{3} - 3 r_{4}}{2 (r_{1} + r_{3})}

contain the four components

r_{1} = n_{1}^{(1, 2)}, r_{2} = n_{2}^{(1, 2)}, r_{3} = n_{0}^{(1, 1)} + n_{1}^{(1, 1)}, r_{4} = n_{2}^{(1, 1)}

(see also Table 5). Further, the inverse V_rec of the variance of (18) can be determined by

V_{rec} = {Var}^{- 1} ({γ̂}_{rec}) = (r_{1} + r_{2}) \frac{exp ({γ̂}_{rec})}{{(exp ({γ̂}_{rec}) + 1)}^{2}} + 3 (r_{3} + r_{4}) \frac{exp ({γ̂}_{rec})}{{(exp ({γ̂}_{rec}) + 3)}^{2}}

(19)

Analogously to the dominant case, the variance of the asymptotic normal distribution can be derived by computing the first derivatives of γ̂_rec and V_rec with respect to r_i, i = 1, …, 4.

Since γ̂_rec has the same form as γ̂_dom, its first derivatives are identical to (17), except that a_dom and h_dom are replaced by a_rec and h_rec, respectively. These first derivatives are thus given by

\frac{\partial {γ̂}_{rec}}{\partial r_{i}} = \frac{1}{exp ({γ̂}_{rec})} (\frac{a_{rec}^{(i)} + 2 h_{rec} h_{rec}^{(i)}}{2 \sqrt{a_{rec} + h_{rec}^{2}}} - h_{rec}^{(i)})

with

h_{rec}^{(1)} = \frac{r_{2} + 2 r_{3} + 3 r_{4}}{2 {(r_{1} + r_{3})}^{2}}, h_{rec}^{(2)} = - \frac{1}{2 (r_{1} + r_{3})}, h_{rec}^{(3)} = \frac{- 2 r_{1} + r_{2} + 3 r_{4}}{2 {(r_{1} + r_{3})}^{2}}, h_{rec}^{(4)} = - \frac{3}{2 (r_{1} + r_{3})},

and

a_{rec}^{(1)} = a_{rec}^{(3)} = - \frac{3 (r_{2} + r_{4})}{{(r_{1} + r_{3})}^{2}}, a_{rec}^{(2)} = a_{rec}^{(4)} = \frac{3}{r_{1} + r_{3}} .

The same applies to the first derivatives of (19), which take the same form as the first derivatives of V_dom in the dominant model. Setting c₁ = c₂ = 1, c₃ = c₄ = 3, and again

t_{rec}^{(i)} = \frac{\partial}{\partial r_{i}} \sqrt{a_{rec} + h_{rec}^{2}} = \frac{a_{rec}^{(i)} + 2 h_{rec} h_{rec}^{(i)}}{2 \sqrt{a_{rec} + h_{rec}^{2}}},

the first derivative of (19) with respect to r_i, i = 1, …, 4, is, thus, given by

\frac{\partial V_{rec}}{\partial r_{i}} = \frac{c_{i} exp ({γ̂}_{rec})}{{(exp ({γ̂}_{rec}) + c_{i})}^{2}} + (t_{rec}^{(i)} - h_{rec}^{(i)}) \sum_{k = 1}^{4} \frac{c_{k} r_{k} (c_{k} - exp ({γ̂}_{rec}))}{{(exp ({γ̂}_{rec}) + c_{k})}^{3}} .
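As a plausibility check of (18) and (19), the closed-form estimator can be compared with the conditional log-likelihood it is meant to maximize. The sketch below writes that log-likelihood out from the recessive codings in Table 5, which is an assumption made for illustration, since the likelihood itself is not restated in this appendix. Function names and counts are hypothetical.

```python
import math


def gamma_rec(r):
    """Closed-form estimator of equation (18)."""
    r1, r2, r3, r4 = r
    a = 3.0 * (r2 + r4) / (r1 + r3)
    h = (3.0 * r1 - r2 + r3 - 3.0 * r4) / (2.0 * (r1 + r3))
    return math.log(math.sqrt(a + h * h) - h)


def v_rec(r, gamma):
    """Inverse variance of gamma_hat_rec, equation (19)."""
    r1, r2, r3, r4 = r
    e = math.exp(gamma)
    return ((r1 + r2) * e / (e + 1.0) ** 2
            + 3.0 * (r3 + r4) * e / (e + 3.0) ** 2)


def cond_loglik(r, gamma):
    """Assumed conditional log-likelihood built from the codings in Table 5.

    Each trio contributes exp(gamma * x_case) divided by the sum of
    exp(gamma * x) over the case and its three pseudo-controls.
    """
    r1, r2, r3, r4 = r
    e = math.exp(gamma)
    return ((r2 + r4) * gamma
            - (r1 + r2) * math.log(2.0 + 2.0 * e)  # parents (1, 2): codings 0, 0, 1, 1
            - (r3 + r4) * math.log(3.0 + e))       # parents (1, 1): codings 0, 0, 0, 1


# Hypothetical trio counts r_1, ..., r_4.
r_example = (90.0, 140.0, 260.0, 110.0)
gamma_example = gamma_rec(r_example)
v_example = v_rec(r_example, gamma_example)
```

Under these assumptions the score of `cond_loglik` vanishes at `gamma_example`, and `v_example` agrees with the negative second derivative of the log-likelihood at that point.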

A.2. Determination of the asymptotic normal distributions of score test statistics

The score test statistic is more complex when considering a dominant or a recessive mode of inheritance than when assuming an additive mode of inheritance. In the dominant case, the score test statistic is given by

s_{dom}^{2} = \frac{{(2 (n_{1}^{(0, 1)} - n_{0}^{(0, 1)}) - 3 n_{0}^{(1, 1)} + n_{1}^{(1, 1)} + n_{2}^{(1, 1)})}^{2}}{4 \sum_{c = 0}^{1} n_{c}^{(0, 1)} + 3 \sum_{c = 0}^{2} n_{c}^{(1, 1)}} .

For a recessive model, this test statistic can be determined by

s_{rec}^{2} = \frac{{(2 (n_{2}^{(1, 2)} - n_{1}^{(1, 2)}) - n_{0}^{(1, 1)} - n_{1}^{(1, 1)} + 3 n_{2}^{(1, 1)})}^{2}}{4 \sum_{c = 1}^{2} n_{c}^{(1, 2)} + 3 \sum_{c = 0}^{2} n_{c}^{(1, 1)}}

(for an alternative representation of these test statistics, see Schaid and Sommer, 1994).

Since these test statistics have a simple closed form, they can be differentiated directly with respect to the same four components that were used in Appendix A.1 to derive the asymptotic normal distribution of the genotypic TDT statistic. Setting

e_{dom} = {(4 d_{1} + 4 d_{2} + 3 d_{3} + 3 d_{4})}^{3 / 2} and e_{rec} = {(4 r_{1} + 4 r_{2} + 3 r_{3} + 3 r_{4})}^{3 / 2}

the first derivatives of s_dom with respect to d₁, …, d₄ are given by

\begin{matrix} \frac{\partial s_{dom}}{\partial d_{1}} = - \frac{4 (d_{1} + 3 d_{2} + 2 d_{4})}{e_{dom}}, & \frac{\partial s_{dom}}{\partial d_{2}} = \frac{4 (3 d_{1} + d_{2} + 3 d_{3} + d_{4})}{e_{dom}}, \\ \frac{\partial s_{dom}}{\partial d_{3}} = - \frac{3 (6 d_{1} + 10 d_{2} + 3 d_{3} + 7 d_{4})}{2 e_{dom}}, & \frac{\partial s_{dom}}{\partial d_{4}} = \frac{14 d_{1} + 2 d_{2} + 15 d_{3} + 3 d_{4}}{2 e_{dom}}, \end{matrix}

and the first derivatives of s_rec with respect to r₁, …, r₄ by

\begin{matrix} \frac{\partial s_{rec}}{\partial r_{1}} = - \frac{4 (r_{1} + 3 r_{2} + r_{3} + 3 r_{4})}{e_{rec}}, & \frac{\partial s_{rec}}{\partial r_{2}} = \frac{4 (3 r_{1} + r_{2} + 2 r_{3})}{e_{rec}}, \\ \frac{\partial s_{rec}}{\partial r_{3}} = - \frac{2 r_{1} + 14 r_{2} + 3 r_{3} + 15 r_{4}}{2 e_{rec}}, & \frac{\partial s_{rec}}{\partial r_{4}} = \frac{3 (10 r_{1} + 6 r_{2} + 7 r_{3} + 3 r_{4})}{2 e_{rec}} . \end{matrix}

These derivatives can then be inserted into equation (15) to compute the variance of the asymptotic normal distribution of the square root S of the score test statistic.
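Because both score statistics are a linear form divided by the square root of another linear form, their gradients follow from a single application of the quotient rule. The sketch below computes them that way. The weight vectors are read off the numerators of s_dom and s_rec and the bases of e_dom and e_rec, expressed in the components d₁, …, d₄ and r₁, …, r₄. The function names and the example counts are hypothetical.

```python
import math

# Weights of the numerator N and the denominator D of s = N / sqrt(D).
NUM_DOM, DEN_DOM = (-2.0, 2.0, -3.0, 1.0), (4.0, 4.0, 3.0, 3.0)
NUM_REC, DEN_REC = (-2.0, 2.0, -1.0, 3.0), (4.0, 4.0, 3.0, 3.0)


def score_root(x, num, den):
    """Signed square root s of the score test statistic."""
    n = sum(w * xi for w, xi in zip(num, x))
    d = sum(w * xi for w, xi in zip(den, x))
    return n / math.sqrt(d)


def score_gradient(x, num, den):
    """Quotient rule: d/dx_i [N / sqrt(D)] = (N_i * D - N * D_i / 2) / D ** 1.5."""
    n = sum(w * xi for w, xi in zip(num, x))
    d = sum(w * xi for w, xi in zip(den, x))
    return [(ni * d - 0.5 * n * di) / d ** 1.5 for ni, di in zip(num, den)]


# Hypothetical trio counts for the dominant and the recessive components.
d_example = (120.0, 150.0, 80.0, 300.0)
r_example = (90.0, 140.0, 260.0, 110.0)
grad_dom = score_gradient(d_example, NUM_DOM, DEN_DOM)
grad_rec = score_gradient(r_example, NUM_REC, DEN_REC)
```

The resulting gradients can be passed, together with the covariances σ_ik, to the double sum in (15).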

Footnotes

Conflict of Interest

The authors have declared no conflict of interest.

References

Fallin D, Beaty T, Liang K-Y, Chen W. Power comparisons for genotypic vs. allelic TDT methods with > 2 alleles. Genetic Epidemiology. 2002;23:458–461. doi: 10.1002/gepi.10192.
Gauderman WJ, Witte JS, Thomas DC. Family-based association studies. Journal of the National Cancer Institute Monographs. 1999;26:31–37. doi: 10.1093/oxfordjournals.jncimonographs.a024223.
Knapp M. A note on power approximations for the transmission/disequilibrium test. American Journal of Human Genetics. 1999;64:1177–1185. doi: 10.1086/302334.
Laird NM, Lange C. Family-based designs in the age of large-scale gene-association studies. Nature Reviews Genetics. 2006;7:385–394. doi: 10.1038/nrg1839.
Laird NM, Horvath S, Xu X. Implementing a unified approach to family-based tests of association. Genetic Epidemiology. 2000;19:S36–S42. doi: 10.1002/1098-2272(2000)19:1+<::AID-GEPI6>3.0.CO;2-M.
Lange C, Laird NM. Power calculations for a general class of family-based association tests: Dichotomous traits. American Journal of Human Genetics. 2002;71:575–584. doi: 10.1086/342406.
Lange C, DeMeo DL, Laird NM. Power and design considerations for a general class of family-based association tests: Quantitative traits. American Journal of Human Genetics. 2002;71:1330–1341. doi: 10.1086/344696.
Ludwig KU, Mangold E, Herms S, Nowak S, Reutter H, Paul A, Becker J, Herberz R, AlChawa T, Nasser E, Boehmer AC, Mattheisen M, Alblas MA, Barth S, Kluck N, Lauster C, Braumann B, Reich RH, Hemprich A, Poetzsch S, Blaumeiser B, Daratsianos N, Kreusch T, Murray JC, Marazita ML, Ruczinski I, Scott AF, Beaty TH, Kramer FJ, Wienker TF, Steegers-Theunissen RP, Rubini M, Mossey PA, Hoffmann P, Lange C, Cichon S, Propping P, Knapp M, Noethen MM. Genome-wide meta-analyses of nonsyndromic cleft lip with or without cleft palate identify six new risk loci. Nature Genetics. 2012;44:968–971. doi: 10.1038/ng.2360.
Rabinowitz D, Laird N. A unified approach to adjusting association tests for population admixture with arbitrary pedigree structure and arbitrary missing marker information. Human Heredity. 2000;50:211–223. doi: 10.1159/000022918.
Rao CR. Linear Statistical Inference and its Applications. 2nd edition. New York: Wiley & Sons; 1973.
Schaid DJ. General score tests for associations of genetic markers with disease using cases and their parents. Genetic Epidemiology. 1996;13:423–449. doi: 10.1002/(SICI)1098-2272(1996)13:5<423::AID-GEPI1>3.0.CO;2-3.
Schaid DJ. Likelihoods and TDT for the case-parents design. Genetic Epidemiology. 1999;16:250–260. doi: 10.1002/(SICI)1098-2272(1999)16:3<250::AID-GEPI2>3.0.CO;2-T.
Schaid DJ, Sommer SS. Comparison of statistics for candidate-gene association studies using cases and parents. American Journal of Human Genetics. 1994;55:402–409.
Schwender H, Taub MA, Beaty TH, Marazita ML, Ruczinski I. Rapid testing of SNPs and gene-environment interactions in case-parent trio data based on exact analytic parameter estimation. Biometrics. 2012;68:766–773. doi: 10.1111/j.1541-0420.2011.01713.x.
Self SG, Longton G, Kopecky KJ, Liang KY. On estimating HLA/disease association with application to a study of aplastic anemia. Biometrics. 1991;47:53–61.
Spielman RS, Ewens WJ. The TDT and other family-based tests for linkage disequilibrium and association. American Journal of Human Genetics. 1996;59:983–989.
Spielman RS, McGinnis R, Ewens WJ. Transmission test for linkage disequilibrium: The insulin gene region and insulin-dependent diabetes mellitus (IDDM). American Journal of Human Genetics. 1993;52:506–516.
