American Journal of Human Genetics
Letter
2002 Nov;71(5):1235–1236. doi: 10.1086/344290

Using All Alleles in the Multiallelic Versions of the SDT and Combined SDT/TDT

Wendy Czika, Jack J Berry
PMCID: PMC385105  PMID: 12452175

To the Editor:

Horvath and Laird’s (1998) sibling disequilibrium test (SDT) provides a nonparametric approach to testing genetic markers for both linkage and association with a disease. Its advantage over parametric alternatives is that it remains valid as a test of association when sibships contain more than one affected sibling and/or more than one unaffected sibling. Horvath and Laird introduced an SDT for multiallelic markers and a biallelic combined SDT/transmission/disequilibrium test (TDT) for use when some parental genotypic information is available. Curtis et al. (1999) later developed a multiallelic combined SDT/TDT. The multiallelic versions of these tests are designed for situations in which there is no a priori knowledge of which allele at a marker might affect disease status; otherwise, a biallelic test can be performed on the allele of interest versus all other alleles collapsed into one. A problem with the multiallelic extensions is that the statistic varies depending on which allele is omitted from the analysis. We present an alternative multiallelic SDT (mSDT) that takes into account all the allelic information and is consistent with the biallelic approach. This method can also be applied to the combined SDT/TDT.

In calculating the multiallelic versions of both the SDT and combined SDT/TDT, the statistics $d_j$, $j=1,\ldots,m$, for a marker with $m$ alleles are used. In the SDT, $d_j=\sum_{i=1}^{N} d_{ji}$, where $d_{ji}$ represents the difference between the average number of times allele $j$ occurs in an affected sibling and the average number of times it occurs in an unaffected sibling within sibship $i$ (Horvath and Laird 1998); for the combined SDT/TDT, $d_j$ is the difference between the number of times allele $j$ is transmitted and the number of times it is not transmitted from a heterozygous parent to an affected child (Sham 1997). As discussed in Stuart (1955), a quadratic form of the $d_j$ can be used to create a statistic with an asymptotic $\chi^2$ distribution. Because $\sum_{j=1}^{m} d_j=0$, the distribution has $m-1$ df. Furthermore, using all $m$ rows and columns of the variance-covariance matrix makes the matrix singular and therefore noninvertible, so the natural solution is to eliminate one $d_j$ and the corresponding row and column of the variance-covariance matrix to make it full rank. Stuart (1955) demonstrated that the $\chi^2$ statistic is invariant to which variate ($d_j$) is omitted.
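As an illustration of the two facts used above, the following Python sketch (not part of the original analysis) first computes the within-sibship differences for one hypothetical sibship and then checks Stuart's invariance numerically. The genotype counts, the synthetic zero-sum differences, and the use of the sum of outer products as the variance-covariance estimate are all assumptions made for the example.

```python
import numpy as np

# Hypothetical sibship at a 3-allele marker (parents 1/2 x 1/3).
# Each row is one sibling; entry j is the number of copies of allele j.
affected = np.array([[2, 0, 0],     # genotype 1/1
                     [1, 1, 0]])    # genotype 1/2
unaffected = np.array([[0, 1, 1],   # genotype 2/3
                       [1, 0, 1]])  # genotype 1/3

# d_ji: mean allele count in affected minus mean count in unaffected siblings.
d_i = affected.mean(axis=0) - unaffected.mean(axis=0)   # [1, 0, -1], sums to 0

# Synthetic zero-sum differences for N sibships and m alleles. These are not
# real d_ji values; they only share the constraint sum_j d_ji = 0.
rng = np.random.default_rng(0)
N, m = 40, 4
D = rng.normal(size=(N, m))
D -= D.mean(axis=1, keepdims=True)

d = D.sum(axis=0)   # d_j = sum_i d_ji
V = D.T @ D         # assumed variance-covariance estimate, sum_i d_i d_i'

def drop_one_statistic(k):
    """Quadratic form after omitting variate k and its row and column of V."""
    keep = [j for j in range(m) if j != k]
    return float(d[keep] @ np.linalg.solve(V[np.ix_(keep, keep)], d[keep]))

stats = [drop_one_statistic(k) for k in range(m)]
```

Because every row of `D` sums to zero, the four entries of `stats` agree up to rounding error, whichever variate is omitted.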

To create a nonparametric test, $S_{ji}=\operatorname{sgn}(d_{ji})$ is used in place of $d_{ji}$, where $\operatorname{sgn}(d)=-1$, $0$, or $1$ for $d<0$, $d=0$, or $d>0$, respectively. Although the sum of the quantities $d_{ji}$, $j=1,\ldots,m$, is 0 for each sibship $i=1,\ldots,N$, and $S_{1i}=-S_{2i}$ in the biallelic case, for more than two alleles the sum over $j$ of the $S_{ji}$ is not similarly linearly constrained within a sibship. In fact, the $S_{ji}$ can sum over $j$ to $-1$, $0$, or $1$. Despite this fact, multiallelic extensions to the SDT and combined SDT/TDT are formed by arbitrarily dropping one of the $S_j=\sum_{i=1}^{N} S_{ji}$ from the analysis. The resulting $\chi^2_{m-1}$ test statistic is no longer invariant to which allele’s information has been omitted, since there is no linear dependency among the values of $S_j$; information is being discarded unnecessarily. Furthermore, the variance-covariance matrix $W$ for $S=(S_1,\ldots,S_m)$ is nonsingular (exceptions are discussed below) before any of the $m$ alleles are omitted. Thus, when all $m$ alleles are used, a valid test statistic can still be created as $S'W^{-1}S$, which has an asymptotic $\chi^2_m$ distribution (Hettmansperger 1984; Randles 1989).
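The loss of invariance can be seen in a small numerical sketch. The six sign vectors below are hypothetical, and the variance-covariance matrix is assumed here to be the sum over sibships of the outer products of the sign vectors (the text above does not write out W explicitly); neither is taken from the data analyzed in this letter.

```python
import numpy as np

# Hypothetical sign vectors (S_1i, S_2i, S_3i) for six sibships.
# Each row sums to -1, 0, or 1, as described in the text.
signs = np.array([
    [ 1, -1, -1],
    [ 1, -1, -1],
    [ 1, -1,  0],
    [ 1,  0, -1],
    [-1,  1,  1],
    [ 1,  1, -1],
])

S = signs.sum(axis=0)   # S_j = sum_i S_ji, here (4, -1, -3)
W = signs.T @ signs     # assumed form: W = sum_i S_i S_i'

def quad_form(keep):
    """S' W^{-1} S restricted to the alleles whose indices are in keep."""
    idx = list(keep)
    return float(S[idx] @ np.linalg.solve(W[np.ix_(idx, idx)], S[idx]))

m = signs.shape[1]
# Statistic obtained when allele j (1-based) is dropped.
drop_one = {j + 1: quad_form(k for k in range(m) if k != j) for j in range(m)}
# Statistic using all m alleles.
all_alleles = quad_form(range(m))
```

For these toy data the drop-one statistics are 38/21, 14/5, and 62/21 for alleles 1, 2, and 3, respectively, whereas the statistic using all three alleles is 3.25.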

There are, as mentioned, situations in which W will not be full rank. Among these are:

  1. the biallelic case, in which the $S_j$ are constrained, since there is a perfect negative correlation between $S_{1i}$ and $S_{2i}$ for all $i$ ($S_{1i}+S_{2i}=0$ for all $i$);
  2. the existence of at least one allele $j$ such that $S_{ji}=0$ for all $N$ sibships, so that this allele will have a row and column of 0s in $W$, creating a singularity; and
  3. $\sum_{j=1}^{m} S_{ji}=c$, the same constant, for all $N$ sibships.

For these situations, we recommend the use of the Moore-Penrose generalized inverse (g-inverse) of the variance-covariance matrix $W$, denoted $W^{-}$. This is the unique generalized inverse of $W$ that satisfies the following conditions (Rao and Mitra 1971; Searle 1971): $WW^{-}$ and $W^{-}W$ are symmetric; $W^{-}WW^{-}=W^{-}$; and $WW^{-}W=W$. It is worth noting that the last two scenarios listed for a singular variance-covariance matrix are possible with the original SDT statistic, even after one allele has been omitted from the analysis, in which case the statistic cannot be calculated, since $W$ is not invertible.
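These conditions can be checked numerically. The sketch below uses NumPy's `numpy.linalg.pinv`, which computes the Moore-Penrose inverse, on a hypothetical singular matrix of the second kind listed above: a fourth allele whose sign is 0 in every sibship contributes a row and column of 0s. The matrix entries are invented for illustration.

```python
import numpy as np

# Hypothetical 4-allele W. Allele 4 has S_ji = 0 in every sibship,
# so its row and column are all zeros and W is singular.
W = np.array([
    [ 6, -3, -5, 0],
    [-3,  5,  2, 0],
    [-5,  2,  5, 0],
    [ 0,  0,  0, 0],
], dtype=float)

W_ginv = np.linalg.pinv(W)   # Moore-Penrose g-inverse

# The defining conditions quoted in the text.
conditions = {
    "W W^- symmetric": np.allclose(W @ W_ginv, (W @ W_ginv).T),
    "W^- W symmetric": np.allclose(W_ginv @ W, (W_ginv @ W).T),
    "W^- W W^- = W^-": np.allclose(W_ginv @ W @ W_ginv, W_ginv),
    "W W^- W = W": np.allclose(W @ W_ginv @ W, W),
}

# The rank of W is what would serve as the df when W^- replaces W^{-1}.
rank = int(np.linalg.matrix_rank(W))
```

Here all four conditions hold and the rank is 3; the g-inverse equals the ordinary inverse of the nonsingular 3 × 3 block, padded with 0s for the uninformative allele.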

When $W^{-}$ is used in place of $W^{-1}$ in the quadratic form, the test statistic $S'W^{-}S$ still has an asymptotic $\chi^2$ distribution, now with df equal to the rank of $W$ (Rao and Mitra 1971). Note that, for the biallelic case, in Horvath and Laird’s notation (1998), the mSDT gives $S=(b-c,\,c-b)$, and the $W$ matrix will be of the form

$$W=\begin{pmatrix} b+c & -(b+c)\\ -(b+c) & b+c \end{pmatrix}.$$

The g-inverse is then calculated as

$$W^{-}=\frac{1}{4(b+c)}\begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix},$$

which yields a $\chi^2$ statistic of $(b-c)^2/(b+c)$ with 1 df, the same as the usual biallelic statistic.
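The agreement with the usual biallelic statistic can be verified numerically. In the sketch below, the counts `b`, `c`, and `ties` are hypothetical, and the variance-covariance matrix is again assumed to be the sum of outer products of the per-sibship sign vectors.

```python
import numpy as np

# Hypothetical numbers of sibships with d > 0, d < 0, and d = 0 for allele 1.
b, c, ties = 12, 5, 3

# Sign vectors (S_1i, S_2i); in the biallelic case S_2i = -S_1i.
signs = np.array([[1, -1]] * b + [[-1, 1]] * c + [[0, 0]] * ties)

S = signs.sum(axis=0)        # (b - c, c - b)
W = signs.T @ signs          # assumed W = sum_i S_i S_i'; singular, rank 1
W_ginv = np.linalg.pinv(W)   # Moore-Penrose g-inverse

msdt = float(S @ W_ginv @ S)            # statistic using both alleles
biallelic = (b - c) ** 2 / (b + c)      # usual biallelic statistic
df = int(np.linalg.matrix_rank(W))      # rank(W), used as the df
```

With these counts both `msdt` and `biallelic` equal 49/17, and `df` is 1.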

To summarize our approach, we suggest modifying Horvath and Laird’s SDT statistic (1998) and the combined SDT/TDT of Curtis et al. (1999) in the following manner to calculate the statistic for the mSDT:

  1. Use all $m$ alleles in the $S$ vector and $W$ matrix.
  2. Use $W^{-}$ in place of $W^{-1}$ to create the $\chi^2$ statistic (note that these are identical when $W$ is full rank).
  3. Use rank($W$) as the df for the $\chi^2$ distribution.
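The three steps can be sketched as a short Python function. This is a minimal illustration under stated assumptions, not the SAS/Genetics implementation: the function name `msdt_statistic` is hypothetical, the input is assumed to be an N × m array of the per-sibship signs, and the variance-covariance matrix is assumed to be the sum of outer products of the sign vectors.

```python
import numpy as np

def msdt_statistic(signs):
    """Return (chi-square statistic, df) from an N x m array of S_ji values."""
    signs = np.asarray(signs, dtype=float)
    S = signs.sum(axis=0)                # step 1: keep all m alleles
    W = signs.T @ signs                  # assumed W = sum_i S_i S_i'
    W_ginv = np.linalg.pinv(W)           # step 2: g-inverse in place of inverse
    df = int(np.linalg.matrix_rank(W))   # step 3: df = rank(W)
    return float(S @ W_ginv @ S), df

# Hypothetical sign vectors for six sibships at a 3-allele marker.
toy = [[1, -1, -1], [1, -1, -1], [1, -1, 0],
       [1, 0, -1], [-1, 1, 1], [1, 1, -1]]
stat, df = msdt_statistic(toy)

# An extra allele with S_ji = 0 in every sibship makes W singular,
# but the g-inverse still yields a statistic and the df stays at rank(W).
padded = [row + [0] for row in toy]
stat_padded, df_padded = msdt_statistic(padded)
```

For the toy input the statistic is 3.25 with 3 df, and it is unchanged when the uninformative fourth allele is added.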

We give an example here, using simulated data from GAW9 (Hodge 1995). As in Spielman and Ewens (1998) and Knapp (1999), we focus on multiallelic markers D1G31 and D5G23, which contain actual disease alleles, M8 and M7, respectively. Table 1 shows the results of analyzing the data using the original Horvath and Laird SDT method, in which each allele is dropped in turn. Also shown are the results from analyzing the data using our mSDT approach. Note that each marker has eight alleles, so P values from the SDT are based on a $\chi^2_7$ distribution, whereas the mSDT P values are from a $\chi^2_8$ distribution, since the variance-covariance matrices for both markers are full rank. This example is not intended as any sort of power comparison but merely to illustrate that there is not necessarily a loss of power by introducing an additional df. The table also shows how the SDT P values vary depending on which allele is dropped. Although all test statistics are highly significant for marker D5G23, there is a marked discrepancy for marker D1G31: dropping allele M8 gives an SDT statistic of 14.81 (P = .039), whereas the other seven SDT statistics lie between 23.12 and 23.66 (P between .0013 and .0016). The mSDT approach will always give a unique $\chi^2$ statistic, regardless of whether $W$ is full rank. This method will be available in a future release of SAS/Genetics™.

Table 1.

SDT and mSDT Statistics for Two Markers Linked and Associated with Disease

                        D1G31                      D5G23
Allele Dropped    $\chi^2$     P           $\chi^2$     P (a)
M1                23.115255    .001628     52.441075    .000048
M2                23.543802    .001370     52.365979    .000049
M3                23.239746    .001548     52.382481    .000049
M4                23.621073    .001328     51.086058    .000088
M5                23.661028    .001307     52.546616    .000046
M6                23.648748    .001313     53.238694    .000033
M7                23.417311    .001441     45.631132    .001031
M8                14.806102    .038567     51.811979    .000064
mSDT              23.667390    .002605     53.455015    .000088

(a) P values multiplied by $10^4$.
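The P values in Table 1 can be checked against their df with a short sketch. The function `chi2_sf` below is a hypothetical helper written for this illustration; it uses the closed-form upper-tail probability of a $\chi^2$ variate with integer df and only the Python standard library, and it is not the routine used to produce the table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series whose i-th term is
    # sqrt(2x/pi) * exp(-x/2) * x^(i-1) / (1 * 3 * ... * (2i - 1)).
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return math.erfc(math.sqrt(half)) + total

# D1G31: SDT with allele M1 dropped (7 df) and mSDT (8 df).
p_sdt_m1 = chi2_sf(23.115255, 7)
p_msdt = chi2_sf(23.667390, 8)

# D5G23: mSDT (8 df), multiplied by 10^4 as in the table footnote.
p_msdt_d5_scaled = chi2_sf(53.455015, 8) * 1e4
```

To six decimal places these reproduce the tabulated values .001628, .002605, and .000088, which confirms that the SDT rows use 7 df and the mSDT row uses 8 df.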

References

  1. Curtis D, Miller MB, Sham PC (1999) Combining the sibling disequilibrium test and transmission/disequilibrium test for multiallelic markers. Am J Hum Genet 64:1785–1786
  2. Hettmansperger TP (1984) Statistical inference based on ranks. John Wiley and Sons, New York
  3. Hodge SE (1995) An oligogenic disease displaying weak marker associations: a summary of contributions to problem 1 of GAW9. Genet Epidemiol 12:545–554
  4. Horvath S, Laird NM (1998) A discordant-sibship test for disequilibrium and linkage: no need for parental data. Am J Hum Genet 63:1886–1897
  5. Knapp M (1999) The transmission/disequilibrium test and parental-genotype reconstruction: the reconstruction-combined transmission/disequilibrium test. Am J Hum Genet 64:861–870
  6. Randles RH (1989) A distribution-free multivariate sign test based on interdirections. J Am Stat Assoc 84:1045–1050
  7. Rao CR, Mitra SK (1971) Generalized inverse of matrices and its applications. John Wiley and Sons, New York
  8. Searle SR (1971) Linear models. John Wiley and Sons, New York
  9. Sham P (1997) Transmission/disequilibrium tests for multiallelic loci. Am J Hum Genet 61:774–778
  10. Spielman RS, Ewens WJ (1998) A sibship test for linkage in the presence of association: the sib transmission/disequilibrium test. Am J Hum Genet 62:450–458
  11. Stuart A (1955) A test of homogeneity of the marginal distributions in a two-way classification. Biometrika 42:412–416

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
