Table 3:
Runtime metrics of the Association Testing Module of BIGwas (regression analysis with PLINK and mixed-model analysis with SAIGE) compared with the association testing module of the H3Agwas Pipeline Version 3 (regression analysis with PLINK) [12] for GWAS data sets of different sizes.
Sample No. | Variant No. | BIGwas Runtime | H3A Runtime |
---|---|---|---|
10,000 | 250,000 | 21 min | 8 min |
10,000 | 700,078 | 54 min | 16 min |
20,554 | 700,078 | 1 h 33 min | 1 h 23 min |
5,480 | 81,708,012 | 14 h 26 min | * |
487,409 | 92,775,302 | 4 d 18 h | * |
974,818 | 92,775,302 | 9 d 14 h | * |
An imputed genome-wide input data set with almost 1 million samples and >92 million genetic variants (i.e., twice the size of the imputed UKB GWAS data set) can be tested for association within 10 days with a single command of the BIGwas software, using only 150 parallel jobs (configurable; equivalent to ∼7 compute nodes on our HPC cluster system).

*H3A takes PLINK genotype files as input and does not accept imputed allele dosages.
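The scaling behind the two largest rows can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of BIGwas: the helper `to_minutes` and the variable names are hypothetical, and the only inputs are the sample counts and BIGwas runtimes taken from Table 3. It converts the runtime strings to minutes and compares the growth in runtime with the growth in sample number.

```python
import re

# Minutes per unit for the runtime notation used in Table 3.
UNIT_MINUTES = {"d": 24 * 60, "h": 60, "min": 1}


def to_minutes(runtime: str) -> int:
    """Convert a runtime such as '4 d 18 h' or '21 min' to minutes.

    Hypothetical helper for illustration; not part of BIGwas.
    """
    parts = re.findall(r"(\d+)\s*(d|h|min)\b", runtime)
    return sum(int(value) * UNIT_MINUTES[unit] for value, unit in parts)


# (number of samples, BIGwas runtime) for the two largest rows of Table 3.
ukb_sized = (487_409, "4 d 18 h")
doubled = (974_818, "9 d 14 h")

sample_ratio = doubled[0] / ukb_sized[0]
runtime_ratio = to_minutes(doubled[1]) / to_minutes(ukb_sized[1])

print(f"sample ratio:  {sample_ratio:.2f}")
print(f"runtime ratio: {runtime_ratio:.2f}")
```

With these table values, doubling the sample number (ratio 2.00) raises the runtime by a factor of about 2.02, i.e. the reported runtime grows roughly linearly with sample number at a fixed variant count.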