Table 1.
Summary of commonly applied methods and a description of findings from simulations.
Method & original ref | Description | Major assumptions | Simulation findings regarding h2 estimates | Computational issues |
---|---|---|---|---|
GREML-SC (ref. 5) | Often called the “GCTA approach.” Originally applied to common array SNPs only. Estimates SNP heritability, the amount of h2 caused by CVs tagged by the SNPs used to create the GRM. | 1) Genetic similarity is uncorrelated with environmental similarity; 2) an infinitesimal model; 3) SNP effects are normally distributed, independent of LD, and inversely proportional to MAF (α=−1). | Biased to the degree that the average LD among SNPs differs from the average LD between SNPs and CVs. This occurs in stratified samples and when the MAF and LD distributions of SNPs do not match those of CVs. | Simple model; tractable with large samples (>100K). |
GREML-MS (ref. 11) | The first multi-component approach, usually applied by binning SNPs according to their MAF, annotation, or physical regions in order to explore genetic architecture. | Requires that the assumptions of GREML-SC hold within each GRM. | Biased if CVs have generally higher or lower levels of LD than the SNPs used to make the GRM. Relatively large standard errors. | Run times and memory requirements are higher than for GREML-SC and increase with the number of variance components estimated. |
GREML-LDMS-R (ref. 7) | A multi-component approach that bins imputed SNPs by their MAF and regional LD. | Same as GREML-MS. | Use of regional LD scores can lead to biases if CVs have different LD on average than surrounding SNPs. Relatively large standard errors. | Same as GREML-MS. |
GREML-LDMS-I | A multi-component approach introduced here that bins imputed SNPs by their MAF and individual LD. | Same as GREML-MS. | Appears to be the least biased approach, even when traits have complex genetic architectures. Relatively large standard errors. | Same as GREML-MS. |
LDAK-SC (refs. 15, 20) | Introduced to account for redundant tagging of CVs by common SNPs. Recently modified to incorporate error due to imputation and to alter the MAF-effect size relationship. | Same as GREML-SC, except that allelic effects are a function of LD. Extended to assume that effects are also a function of imputation quality and weakly inversely proportional to MAF (α=−0.25). | Can correct for the overestimation observed in GREML-SC from redundant tagging of CVs, but otherwise about as biased as GREML-SC when assumptions are unmet, although the biases are sometimes in different directions. | Same as GREML-SC. |
LDAK-MS (ref. 15) | A multi-component extension of LDAK-SC that bins SNPs by MAF. | Requires that the assumptions of LDAK-SC hold within each GRM. | Less biased on average than LDAK-SC, but more biased than GREML-LDMS (-I or -R). Relatively large standard errors. | Same as GREML-MS. |
Threshold GRMs (ref. 24) | A multi-component approach with two GRMs: the normal (unthresholded) GRM built from all SNPs, and a second GRM with entries set to 0 if below a threshold. Conducted in samples that include close relatives. | Same as GREML-SC for the unthresholded GRM. Assumes no shared environmental influences among close relatives. | Estimates associated with the unthresholded GRM are similar to those of GREML-SC. When used in samples that include close relatives, the second GRM captures pedigree-associated variation but can be upwardly biased by shared environmental influences. | Same as GREML-SC. |
LD Score Regression (ref. 19) | Uses the slope from regressing GWAS χ2 statistics on SNPs’ LD scores to estimate the h2 due to CVs in LD with common SNPs. | Infinitesimal model with allelic effects normally distributed. | Largely robust to confounding due to stratification and shared environmental influences. Estimates h2 due to common CVs only, even when used on imputed or WGS data. Underestimates h2 if the trait is not highly polygenic. | The most computationally efficient of the methods compared; tractable for very large datasets. |
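
The LD Score Regression row describes estimating h2 from the slope of χ2 on LD score. As a rough illustration only, the sketch below fits that regression by ordinary least squares, assuming the standard model E[χ2_j] = 1 + N·a + (N·h2/M)·ℓ_j, where N is the GWAS sample size, M the number of SNPs, ℓ_j the LD score of SNP j, and a a confounding term absorbed by the intercept. The function name `ldsc_h2`, the toy values, and the use of an unweighted fit are assumptions made for this example; the published method uses weighted regression and block-jackknife standard errors, which are omitted here.

```python
import numpy as np


def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Unweighted LD score regression (illustrative simplification).

    Fits chi2_j = intercept + slope * ld_score_j by ordinary least squares
    and converts the slope to a heritability estimate via h2 = slope * M / N.
    """
    chi2 = np.asarray(chi2, dtype=float)
    ld_scores = np.asarray(ld_scores, dtype=float)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(ld_scores, chi2, deg=1)
    h2 = slope * n_snps / n_samples
    return h2, intercept


# Toy, noise-free data (hypothetical values chosen for illustration only).
rng = np.random.default_rng(0)
n_snps, n_samples, true_h2 = 5_000, 20_000, 0.4
ld_scores = rng.uniform(1.0, 200.0, size=n_snps)
# Expected chi2 under the model with no confounding (intercept of 1).
chi2 = 1.0 + n_samples * true_h2 * ld_scores / n_snps

h2_hat, intercept_hat = ldsc_h2(chi2, ld_scores, n_samples, n_snps)
print(f"h2 estimate: {h2_hat:.3f}, intercept: {intercept_hat:.3f}")
```

On this noise-free input the fit recovers the simulated h2 and an intercept of 1 up to floating-point error. With real summary statistics the χ2 values are noisy and heteroskedastic, so an unweighted fit like this one would be less efficient than the weighted procedure and should not be used for inference.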