2015 Oct 28;24(5):767–773. doi: 10.1038/ejhg.2015.194

Table 1. Overview of combined test rationale and performance.

Test	Tests combined	Rationale	Average correlation (a)	Observed strengths	Observed weaknesses	Overall performance (b)
CT1	L(1), L(2), L(3), L(∞)	Assess robustness against non-causal variants and tradeoff with number of tests combined	0.53	Minimal	Poor performance with risk-reducing variants	Poor (Min(p): 38.6%) (Fisher's: 47.2%)
CT2	J(1), J(2), J(3), J(∞)	Assess robustness against non-causal variants and tradeoff with number of tests combined	0.87	Handles risk-reducing variants	Redundant	Good (Min(p): 8.9%) (Fisher's: 7.9%)
CT3	CMC, L(1)	Assess impact of combining highly correlated tests	0.92	Minimal	Poor performance with risk-reducing variants; redundant	Poor (Min(p): 42.7%) (Fisher's: 42.2%)
CT4	SKAT, J(2)	Assess impact of combining highly correlated tests	0.99	Handles risk-reducing variants	Redundant	Good (Min(p): 7.4%) (Fisher's: 7.4%)
CT5	SKAT, CMC	Assess a 'standard' combination of tests	0.46	Fairly robust	Lacks robustness to high proportion of non-causal variants	Good (Min(p): 7.5%) (Fisher's: 8.1%)
CT6	L(1), L(2), L(3), L(∞), J(1), J(2), J(3), J(∞)	Assess robustness against non-causal variants and tradeoff with number of tests combined	0.54	Fairly robust	Some poorly performing tests (e.g., length tests) make Fisher's perform suboptimally	Good (Min(p): 5.6%) (Fisher's: 9.2%)
CT7	L(1), L(4), J(1), J(4)	Assess robustness against non-causal variants and tradeoff with number of tests combined	0.50	Fairly robust	Fairly good performance, though Fisher's performs slightly worse because of the length tests	Very good (Min(p): 4.9%) (Fisher's: 6.5%)
CT8	SKAT-O, J(∞)	Assess ability to create a more robust SKAT-O	0.77	More robust to inclusion of non-causal variants	Slightly lower power than SKAT-O when few non-causal variants	Very good (Min(p): 4.6%) (Fisher's: 3.0%)

(a) Average pairwise correlation across all pairs of tests in the combined test. See Supplementary Figure 1 for the complete matrix of pairwise correlations.
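
As an illustration of how an average pairwise correlation of this kind could be computed, the following minimal Python sketch averages the entries above the diagonal of a square correlation matrix. The function name `average_pairwise_correlation` and the example matrix are hypothetical and are not taken from the paper; the sketch also does not assume which correlation measure the authors used or which quantities they correlated.

```python
from itertools import combinations


def average_pairwise_correlation(corr):
    """Mean of the correlations over all unordered pairs of tests.

    `corr` is a square, symmetric matrix (list of lists) whose entry
    [i][j] is the correlation between test i and test j.
    """
    k = len(corr)
    if k < 2 or any(len(row) != k for row in corr):
        raise ValueError("corr must be a square matrix with at least two tests")
    # Every unordered pair (i, j) with i < j, i.e. the strict upper triangle.
    pairs = list(combinations(range(k), 2))
    return sum(corr[i][j] for i, j in pairs) / len(pairs)


# Hypothetical correlation matrix for three tests (illustrative values only).
example = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.4],
    [0.6, 0.4, 1.0],
]
avg = average_pairwise_correlation(example)
```

For a combined test built from only two component tests there is a single pair, so the average is simply that one pairwise correlation.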

(b) Percentage of simulations in which the method had at least 5% lower power than other methods.
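
The Min(p) and Fisher's entries in the table refer to two standard ways of combining the p-values of the component tests, and the percentages summarize how often each combination underperformed. The sketch below shows the two combining statistics and one possible reading of the underperformance percentage. Several choices are assumptions rather than details from the paper: the function names, the example power values, the interpretation of "other methods" as the best of the remaining methods in each simulation setting, and the treatment of "5%" as an absolute difference of 0.05 in power.

```python
import math


def fisher_statistic(p_values):
    """Fisher's combining statistic: -2 * sum(ln p_i)."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be a non-empty sequence in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


def min_p_statistic(p_values):
    """Min(p) combining statistic: the smallest component p-value."""
    if not p_values:
        raise ValueError("p-values must be non-empty")
    return min(p_values)


def percent_underperforming(power_by_method, method, margin=0.05):
    """Percentage of settings where `method` trails the best other method.

    `power_by_method` maps a method name to a list of power estimates,
    one per simulation setting (same ordering for every method).
    ASSUMPTION: a setting counts when the best other method exceeds
    `method` by at least `margin` in absolute power.
    """
    own = power_by_method[method]
    others = [v for name, v in power_by_method.items() if name != method]
    if not others or not own:
        raise ValueError("need at least two methods and one setting")
    count = 0
    for s, power in enumerate(own):
        best_other = max(v[s] for v in others)
        if best_other - power >= margin:
            count += 1
    return 100.0 * count / len(own)


# Hypothetical power estimates for three methods over four settings.
powers = {
    "A": [0.80, 0.50, 0.30, 0.90],
    "B": [0.70, 0.56, 0.30, 0.60],
    "C": [0.82, 0.40, 0.10, 0.88],
}
pct_b = percent_underperforming(powers, "B")
```

Both statistics still have to be converted to a p-value for the combined test. With independent component tests, Fisher's statistic follows a chi-square distribution with 2k degrees of freedom for k tests, but the component tests in the table are correlated (average correlations of 0.46 to 0.99), so that reference distribution would not apply directly. This excerpt does not state how significance was assessed, and the sketch deliberately stops at the statistics.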