Table 1.
Comparison of different methods for GWAS with mixed effect models
| Model | Method | Does not require pre-computed genetic relationship matrix | Feasible for large sample sizes | Developed for binary traits | Accounts for unbalanced case-control ratio | Tests quantitative traits | Time complexity, Step 1 | Time complexity, Step 2 | Memory usage (Gbyte), Step 1 | Memory usage (Gbyte), Step 2 | UK Biobank coronary artery disease (PheCode 411) benchmark: time (CPU hrs) | UK Biobank coronary artery disease (PheCode 411) benchmark: memory |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Logistic mixed model | SAIGE | ✓ | ✓ | ✓ | ✓ | ✓ | $O(P M_1 N^{1.5})$\* | $O(MN)$ | $M_1 N/4$ | $N$ | 517 | 10.3G |
| Logistic mixed model | GMMAT | | | ✓ | | ✓ | $O(P N^3)$ | $O(M N^2)$ | $F N^2$ | $F N^2$ | NA | NA |
| Linear mixed model | BOLT-LMM | ✓ | ✓ | | | ✓ | $O(P M_1 N^{1.5})$\* | $O(MN)$ | $M_1 N/4$ | $N$ | 360 | 10.9G |
| Linear mixed model | GEMMA | | | | | ✓ | $O(N^3)$ | $O(M N^2)$ | $F N^2$ | $F N^2$ | NA | NA |

N: number of samples
P: number of iterations required to reach convergence
$M_1$: number of markers used to construct the kinship matrix
M: total number of markers to be tested
F: number of bytes per floating-point number
\* The number of iterations in the preconditioned conjugate gradient (PCG) method is assumed to be $O(N^{0.5})$ (ref. 8).
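
To give a feel for the gap between the two memory entries in the table, the following minimal sketch evaluates $M_1 N/4$ and $F N^2$ for one set of inputs. The values of `N`, `M1`, and `F` are illustrative assumptions, not figures from the table, and the sketch assumes both expressions give a size in bytes, which it converts to decimal gigabytes.

```python
# Illustrative inputs only: these are assumptions, not values from Table 1.
N = 400_000    # number of samples
M1 = 100_000   # markers used to construct the kinship matrix
F = 8          # bytes per floating-point number (double precision assumed)

BYTES_PER_GB = 1e9  # decimal gigabyte, an assumed convention


def m1n_over_4_bytes(m1: int, n: int) -> float:
    """Step 1 memory entry M1*N/4 (SAIGE and BOLT-LMM rows), taken as bytes."""
    return m1 * n / 4


def fn2_bytes(f: int, n: int) -> int:
    """Memory entry F*N^2 (GMMAT and GEMMA rows), taken as bytes."""
    return f * n * n


m1n_gb = m1n_over_4_bytes(M1, N) / BYTES_PER_GB
fn2_gb = fn2_bytes(F, N) / BYTES_PER_GB

print(f"M1*N/4 = {m1n_gb:.1f} GB")   # 10.0 GB for these inputs
print(f"F*N^2  = {fn2_gb:.1f} GB")   # 1280.0 GB for these inputs
```

Under these assumed inputs the $F N^2$ term is over a hundred times larger than $M_1 N/4$, and because it grows quadratically in N the gap widens as the sample size increases.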