Author manuscript; available in PMC: 2014 May 1.
Published in final edited form as: Genet Epidemiol. 2013 Mar 21;37(4):10.1002/gepi.21722. doi: 10.1002/gepi.21722

Figure 1. Two-dimensional rendering of the geometric framework for rare variant tests.

The graphs show simplified scenarios for two rare variant sites in a case-control dataset and are designed to provide intuition for the geometric interpretation of rare variant tests.

A) The vectors $f^+ = (f_1^+, f_2^+)$ and $f^- = (f_1^-, f_2^-)$ contain observed allele frequencies at two rare variant sites for cases and controls, respectively. $\|f^+\|_p$ and $\|f^-\|_p$ indicate the lengths of these frequency vectors with respect to the $L_p$ norm, $\theta$ is the angle between $f^+$ and $f^-$, and $\|f^+ - f^-\|_p$ is the distance between the endpoints of $f^+$ and $f^-$. The null hypothesis of no rare variant association ($H_0$: $F^+ = F^-$) can be tested using any of the following three null hypotheses related to the geometry of the frequency vectors: (i) $\|F^+\|_p = \|F^-\|_p$, the lengths of the vectors are equal, (ii) $\theta = 0$, the angle between the vectors is zero, or (iii) $\|F^+ - F^-\|_p = 0$, the distance between the endpoints of the vectors is zero. We refer to tests of the three geometric null hypotheses as, respectively, length, angle and joint tests. In the pictured scenario, the minor allele frequency is higher in cases for each variant ($f_1^+ > f_1^-$ and $f_2^+ > f_2^-$), indicating both as potential risk variants.
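As a rough illustration of the three geometric quantities in panel A, the following sketch computes the length difference, the angle, and the endpoint distance for a pair of frequency vectors. The function names (`lp_norm`, `angle`, `geometric_summaries`) and the example frequencies are hypothetical, and the angle is assumed to be defined through the usual Euclidean inner product; none of this is taken from the paper, and it computes descriptive summaries only, not test statistics.

```python
import math


def lp_norm(v, p=2):
    """L_p norm of a sequence of numbers (assumes p >= 1)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def angle(u, v):
    """Angle in radians between u and v (assumed Euclidean inner product)."""
    dot = sum(a * b for a, b in zip(u, v))
    cos_theta = dot / (lp_norm(u, 2) * lp_norm(v, 2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def geometric_summaries(f_case, f_ctrl, p=2):
    """Return the quantities behind the length, angle, and joint hypotheses."""
    diff = [a - b for a, b in zip(f_case, f_ctrl)]
    return {
        "length_diff": lp_norm(f_case, p) - lp_norm(f_ctrl, p),  # length
        "angle": angle(f_case, f_ctrl),                          # angle
        "distance": lp_norm(diff, p),                            # joint
    }


# Hypothetical observed frequencies: both variants more common in cases.
f_case = [0.02, 0.03]
f_ctrl = [0.01, 0.01]
summary = geometric_summaries(f_case, f_ctrl, p=2)
```

With these made-up frequencies all three quantities are nonzero, which matches the pictured scenario in which both variants look like risk variants.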

B) Under the null case of no association ($F^+ = F^-$), each of the geometric null hypotheses holds: (i) $\|F^+\|_p = \|F^-\|_p$, (ii) $\theta = 0$, and (iii) $\|F^+ - F^-\|_p = 0$.

C) Both variants are causative, with the case vector being a scalar multiple of the control vector ($F^+ = cF^-$ for some constant $c \neq 1$). This occurs if the ratio of case frequency to control frequency is the same at every variant site. The result is that $\|F^+\|_p \neq \|F^-\|_p$ and $\|F^+ - F^-\|_p \neq 0$, but the null hypothesis of $\theta = 0$ still holds. This scenario highlights the reason that angle tests are not powerful strategies and underscores why none have been proposed.
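The claim in panel C can be checked with a short derivation. It assumes that $\theta$ is defined through the usual Euclidean inner product, that the control vector is nonzero, and that $c > 0$ with $c \neq 1$; these assumptions are ours and are not stated in the caption.

```latex
\begin{align*}
\cos\theta
  &= \frac{\langle cF^-,\, F^- \rangle}{\|cF^-\|_2 \, \|F^-\|_2}
   = \frac{c\,\|F^-\|_2^2}{c\,\|F^-\|_2^2}
   = 1
   \;\Longrightarrow\; \theta = 0, \\
\|F^+\|_p - \|F^-\|_p
  &= (c - 1)\,\|F^-\|_p \neq 0, \\
\|F^+ - F^-\|_p
  &= |c - 1|\,\|F^-\|_p \neq 0.
\end{align*}
```

Under these assumptions the length and joint hypotheses are violated while the angle hypothesis is not, so an angle test would have nothing to detect.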

D) The scenario in which one rare variant is causative ($f_1^+ > f_1^-$) and the other is protective ($f_2^- > f_2^+$). In this case, it is possible that $\|F^+\|_p = \|F^-\|_p$, so that the signals from the two variants effectively cancel each other out, explaining the reduced performance of length tests in the presence of a mix of risk and protective variants. By contrast, $\|F^+ - F^-\|_p \neq 0$, and joint tests remain powerful.
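The cancellation in panel D can be reproduced numerically. The sketch below uses made-up frequencies in which the first variant is enriched in cases and the second is enriched in controls by the same amount; the helper `lp_norm` and all values are hypothetical and chosen only to make the cancellation exact.

```python
def lp_norm(v, p=1):
    """L_p norm of a sequence of numbers (assumes p >= 1)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


# Hypothetical frequencies: variant 1 is a risk variant, variant 2 is protective.
f_case = [0.03, 0.01]
f_ctrl = [0.01, 0.03]
diff = [a - b for a, b in zip(f_case, f_ctrl)]

results = {}
for p in (1, 2):
    results[p] = {
        # Length comparison: the two effects offset each other.
        "length_gap": lp_norm(f_case, p) - lp_norm(f_ctrl, p),
        # Joint comparison: the endpoint distance stays clearly nonzero.
        "distance": lp_norm(diff, p),
    }
```

For both choices of p the length gap is zero while the distance is not, which is the geometric reason a length-based comparison can miss a mix of risk and protective variants that a joint comparison still picks up.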