Cell Genomics. 2022 Nov 4;2(12):100210. doi: 10.1016/j.xgen.2022.100210

Meta-analysis fine-mapping is often miscalibrated at single-variant resolution

Masahiro Kanai 1,2,3,4,5,7, Roy Elzur 1,2,3, Wei Zhou 1,2,3; Global Biobank Meta-analysis Initiative, Mark J Daly 1,2,3,6, Hilary K Finucane 1,2,3,∗∗
PMCID: PMC9839193  NIHMSID: NIHMS1858342  PMID: 36643910

Summary

Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWASs). Fine-mapping of meta-analysis studies is typically performed as in a single-cohort study. Here, we first demonstrate that heterogeneity (e.g., of sample size, phenotyping, imputation) hurts calibration of meta-analysis fine-mapping. We propose a summary statistics-based quality-control (QC) method, suspicious loci analysis of meta-analysis summary statistics (SLALOM), that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics. We validate SLALOM in simulations and the GWAS Catalog. Applying SLALOM to 14 meta-analyses from the Global Biobank Meta-analysis Initiative (GBMI), we find that 67% of loci show suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci are significantly depleted for having nonsynonymous variants as lead variant (2.7×; Fisher’s exact p = 7.3 × 10−4). We find limited evidence of fine-mapping improvement in the GBMI meta-analyses compared with individual biobanks. We urge extreme caution when interpreting fine-mapping results from meta-analysis of heterogeneous cohorts.

Keywords: genome-wide association study, GWAS, biobank, meta-analysis, fine-mapping, miscalibration, heterogeneity, summary statistics, linkage disequilibrium


Highlights

  • Extensive simulation of meta-analyses to show substantial fine-mapping miscalibration

  • SLALOM, a novel method that identifies suspicious loci for meta-analysis fine-mapping

  • Significant depletion of likely causal variants in SLALOM-predicted suspicious loci

  • Widespread suspicious loci for fine-mapping in current meta-analysis summary statistics


Genome-wide association studies (GWASs), often performed as meta-analyses, have identified tens of thousands of disease-associated loci. Kanai et al. demonstrate via large-scale simulations and real data analysis that standard tools for pinpointing the causal variants underlying these associations can produce unreliable results when applied to GWAS meta-analyses.

Introduction

Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWASs) from different cohorts.1 Previous GWAS meta-analyses have identified thousands of loci associated with complex diseases and traits, such as type 2 diabetes,2,3 schizophrenia,4,5 rheumatoid arthritis,6,7 body mass index,8 and lipid levels.9 These meta-analyses are typically conducted in large-scale consortia (e.g., the Psychiatric Genomics Consortium [PGC] and the Genetic Investigation of Anthropometric Traits [GIANT] consortium) to increase sample size while harmonizing analysis plans across participating cohorts in every possible aspect (e.g., phenotype definition, quality-control [QC] criteria, statistical model, and analytical software) by sharing summary statistics as opposed to individual-level data, thereby avoiding data protection issues and variable legal frameworks governing individual genome and medical data around the world. The Global Biobank Meta-analysis Initiative (GBMI)10 is one such large-scale, international effort, which aims to establish a collaborative network spanning 23 biobanks from four continents (total n = 2.2 million) for coordinated GWAS meta-analyses, while addressing the many benefits and challenges in meta-analysis and subsequent downstream analyses.

One such challenging downstream analysis is statistical fine-mapping.11,12,13 Despite the great success of past GWAS meta-analyses in locus discovery, individual causal variants in associated loci are largely unresolved. Identifying causal variants from GWAS associations (i.e., fine-mapping) is challenging due to extensive linkage disequilibrium (LD, the correlation among genetic variants), the presence of multiple causal variants, and limited sample sizes, but is rapidly becoming achievable with high confidence in individual cohorts14,15,16 owing to the recent development of large-scale biobanks17,18,19 and scalable fine-mapping methods20,21,22 that enable well-powered, accurate fine-mapping using in-sample LD from large-scale individual-level data.

After conducting GWAS meta-analysis, previous studies2,7,9,23,24,25,26,27,28,29 have applied existing summary statistics-based fine-mapping methods (e.g., approximate Bayes factor [ABF],30,31 CAVIAR,32 PAINTOR,33,34 FINEMAP,20,21 and SuSiE22) just as they are applied to single-cohort studies, without considering or accounting for the unavoidable heterogeneity among cohorts (e.g., differences in sample size, phenotyping, genotyping, or imputation). Such heterogeneity could lead to false positives and miscalibration in meta-analysis fine-mapping (Figure 1). For example, case-control studies enriched with more severe cases or ascertained with different phenotyping criteria may disproportionately contribute to genetic discovery, even when true causal effects for genetic liability are exactly the same between these studies and less severe or unascertained ones. Quantitative traits such as biomarkers could have phenotypic heterogeneity arising from different measurement protocols and errors across studies. There might be genuine biological mechanisms too, such as gene-gene (GxG) and gene-environment (GxE) interactions and (population-specific) dominance variation (e.g., rs671 and alcohol dependence35), that introduce additional heterogeneity across studies.36,37 In addition to phenotyping, differences in genotyping and imputation could dramatically undermine fine-mapping calibration and recall at single-variant resolution, because differential patterns of missingness and imputation quality across constituent cohorts of different sample sizes can disproportionately diminish association statistics of potentially causal variants. Finally, although more easily harmonized than phenotyping and genotyping data, subtle differences in QC criteria and analytical software may further exacerbate the effect of heterogeneity on fine-mapping.

Figure 1. Schematic overview of meta-analysis fine-mapping

An illustrative example of such issues can be observed in the TYK2 locus (19p13.2) in the recent meta-analysis from the COVID-19 Host Genetics Initiative (COVID-19 HGI; Figure S1).38 This locus is known for protective associations against autoimmune diseases,6,23 while a complete TYK2 loss of function results in a primary immunodeficiency.39 Despite strong LD (r2 = 0.82) with a lead variant in the locus (rs74956615; p = 9.7 × 10−12), a known functional missense variant rs34536443 (p.Pro1104Ala) that reduces TYK2 function40,41 did not achieve genome-wide significance and was assigned a very low posterior inclusion probability (PIP) in fine-mapping (p = 7.5 × 10−7; PIP = 9.5 × 10−4), primarily due to its missingness in two more cohorts than rs74956615. This serves as just one example of the major difficulties with meta-analysis fine-mapping at single-variant resolution. Indeed, the COVID-19 HGI cautiously avoided an in silico fine-mapping in the flagship to prevent spurious results.38

Only a few studies have carefully addressed these concerns in their downstream analyses. The Schizophrenia Working Group of PGC, for example, recently updated their largest meta-analysis of schizophrenia5 (69,369 cases and 236,642 controls), followed by a downstream fine-mapping analysis using FINEMAP.20 Unlike many other GWAS consortia, since PGC has access to individual-level genotypes for a majority of samples, they were able to apply standardized sample and variant QC criteria and impute variants using the same reference panel, all uniformly processed using the RICOPILI pipeline.42 This harmonized procedure was crucial for properly controlling inter-cohort heterogeneity and thus allowing more robust meta-analysis fine-mapping at single-variant resolution. Furthermore, PGC’s direct access to individual-level data enabled them to compute in-sample LD matrices for multiple-causal-variant fine-mapping, which prevents the significant miscalibration that results from using an external LD.14,15 A 2017 fine-mapping study of inflammatory bowel disease also benefited from access to individual-level genotypes and careful pre- and post-fine-mapping QC.43 For a typical meta-analysis consortium, however, many of these steps are infeasible as full genotype data from all cohorts are not available. For such studies, a new approach to meta-analysis fine-mapping in the presence of the many types of heterogeneity is needed. Until such a method is developed, QC of meta-analysis fine-mapping results deserves increased attention.

While existing variant-level QC procedures are effective for limiting spurious associations in GWAS (Data S1),44 they do not suffice for ensuring high-quality fine-mapping results. In some cases, they even hurt fine-mapping quality, because they can (1) cause or exacerbate differential patterns of missing variants across cohorts, and (2) remove true causal variants as well as suspicious variants. Thus, additional QC procedures that retain consistent variants across cohorts for consideration but limit poor-quality fine-mapping results are needed. A recently proposed method called DENTIST,45 for example, performs summary statistics QC to improve GWAS downstream analyses, such as conditional and joint analysis (GCTA-COJO46), by removing variants based on estimated heterogeneity between summary statistics and reference LD. Although DENTIST was also applied prior to fine-mapping (FINEMAP20), simulations only demonstrated that it could improve power for detecting the correct number of causal variants in a locus, not true causal variants. This motivated us to develop a new fine-mapping QC method for better calibration and recall at single-variant resolution and to demonstrate its performance in large-scale meta-analysis.

Here, we first demonstrate the effect of inter-cohort heterogeneity in meta-analysis fine-mapping via realistic simulations with multiple heterogeneous cohorts, each with different combinations of genotyping platforms, imputation reference panels, and genetic ancestries. We propose a summary statistics-based QC method, suspicious loci analysis of meta-analysis summary statistics (SLALOM), that identifies suspicious loci for meta-analysis fine-mapping by detecting association statistics outliers based on local LD structure, building on the DENTIST method. Applying SLALOM to 14 disease endpoints from the GBMI10 as well as 467 meta-analysis summary statistics from the GWAS Catalog,47 we demonstrate that suspicious loci for fine-mapping are widespread in meta-analysis and urge extreme caution when interpreting fine-mapping results from meta-analysis.

Results

Large-scale simulations demonstrate miscalibration in meta-analysis fine-mapping

Existing fine-mapping methods20,22,30 assume that all association statistics are derived from a single-cohort study, and thus do not model the per-variant heterogeneity in effect sizes and sample sizes that arises when meta-analyzing multiple cohorts (Figure 1). To evaluate how different characteristics of constituent cohorts in a meta-analysis affect fine-mapping calibration and recall, we conducted a series of large-scale GWAS meta-analysis and fine-mapping simulations (Tables S1–S4; STAR Methods). Briefly, we simulated multiple GWAS cohorts of different ancestries (10 European ancestry cohorts, one African ancestry cohort, and one East Asian ancestry cohort; n = 10,000 each) that were genotyped and imputed using different genotyping arrays (Illumina Omni2.5, Multi-Ethnic Global Array [MEGA], and Global Screening Array [GSA]) and imputation reference panels (the 1000 Genomes Project Phase 3 [1000GP3],48 the Haplotype Reference Consortium [HRC],49 and TOPMed50). For each combination of cohort, genotyping array, and imputation panel, we conducted 300 GWAS with randomly simulated causal variants that resemble the genetic architecture of a typical complex trait, including minor allele frequency (MAF) dependent causal effect sizes,51 total SNP heritability,52 functional consequences of causal variants,16 and levels of genetic correlation across cohorts (i.e., true effect size heterogeneity; rg = 1, 0.9, and 0.5; STAR Methods).
We then meta-analyzed the single-cohort GWAS results across 10 independent cohorts based on multiple configurations (different combinations of genotyping arrays and imputation panels for each cohort) to resemble realistic meta-analysis of multiple heterogeneous cohorts (Table S4). We applied ABF fine-mapping to compute a PIP for each variant and to derive 95% and 99% credible sets (CSs) that contain the smallest set of variants covering 95% and 99% of probability of causality. We evaluated the false discovery rate (FDR, defined as the proportion of variants with PIP > 0.9 that are non-causal) and compared against the expected proportion of non-causal variants if the meta-analysis fine-mapping method were calibrated, based on PIP. More details of our simulation pipeline are described in STAR Methods and visually summarized in Figure S2.
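The ABF step described above can be illustrated with a minimal sketch. It assumes Wakefield's approximate Bayes factor, one causal variant per locus, a flat prior across variants, and a prior effect-size variance of 0.04. The prior variance, the function names, and the toy effect sizes are illustrative assumptions, not values taken from the simulations.

```python
import math

def log_abf(beta, se, prior_var=0.04):
    """Wakefield's log approximate Bayes factor (association vs. null).

    prior_var is an assumed prior variance of the effect size.
    """
    v = se * se
    shrink = prior_var / (v + prior_var)
    z = beta / se
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink

def abf_pips(betas, ses, prior_var=0.04):
    """PIPs assuming exactly one causal variant and a flat prior."""
    labfs = [log_abf(b, s, prior_var) for b, s in zip(betas, ses)]
    m = max(labfs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in labfs]
    total = sum(weights)
    return [w / total for w in weights]

def credible_set(pips, coverage=0.95):
    """Indices of the smallest set of variants whose PIPs sum to >= coverage."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += pips[i]
        if cumulative >= coverage:
            break
    return chosen

# Toy locus with four variants (hypothetical effect sizes and standard errors)
betas = [0.10, 0.095, 0.02, 0.01]
ses = [0.012, 0.012, 0.012, 0.012]
pips = abf_pips(betas, ses)
cs95 = credible_set(pips, 0.95)
cs99 = credible_set(pips, 0.99)
```

In this toy locus the top variant alone covers 95% of the posterior, while the 99% CS also needs the second variant.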

We found that FDR varied widely over the different configurations, reaching as high as 37% for the most heterogeneous configurations (Figure 2). We characterized the contributing factors to the miscalibration. First, lower true effect size correlation rg (i.e., larger phenotypic heterogeneity) always caused higher miscalibration and lower recall. Second, when using the same imputation panel (1000GP3), use of less dense arrays (MEGA or GSA) led to moderately inflated FDR (up to FDR = 11% versus expected 1%), while use of multiple genotyping arrays did not cause further FDR inflation (Figure 2A). Third, when using the same genotyping array (Omni2.5), use of imputation panels (HRC or TOPMed) that do not match our simulation reference significantly increased miscalibration (up to FDR = 17% versus expected 1%), and using multiple imputation panels further increased miscalibration (up to FDR = 35% versus expected 2%; Figure 2C); this setup is as bad as the most heterogeneous configuration using multiple genotyping arrays and imputation panels (FDR = 37%). When TOPMed-imputed variants were lifted over from GRCh38 to GRCh37, we observed FDR increases of up to 10%, likely due to genomic build conversion failures (Data S1).53 Fourth, recall was not significantly affected by heterogeneous genotyping arrays or imputation panels (Figures 2B and 2D). Fifth, including multiple genetic ancestries did not affect calibration when using the same genotyping array and imputation panel (Omni2.5 and 1000GP3; Figure 2E) but significantly improved recall if African ancestry was included (Figure 2F).
This is expected, given the shorter LD length in the African population compared with other populations, which improves fine-mapping resolution.54 Finally, in the most heterogeneous configurations where multiple genotyping arrays and imputation panels existed, we observed an FDR of up to 37% and 28% for European and multi-ancestry meta-analyses, respectively (versus expected 2% for both), demonstrating that inter-cohort heterogeneity can substantially undermine meta-analysis fine-mapping (Figures 2G and 2H).
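The comparison between observed and expected FDR used throughout these results can be sketched as follows. The observed FDR is the fraction of non-causal variants among those with PIP > 0.9, and the expected FDR under perfect calibration is 1 minus their mean PIP. The function name and the toy inputs are hypothetical.

```python
def fdr_and_expected(pips, is_causal, threshold=0.9):
    """Return (observed FDR, expected FDR) among variants with PIP > threshold.

    Observed FDR: share of selected variants that are non-causal.
    Expected FDR: 1 - mean PIP of the selected variants (holds if calibrated).
    Returns (None, None) when no variant passes the threshold.
    """
    selected = [(p, c) for p, c in zip(pips, is_causal) if p > threshold]
    if not selected:
        return None, None
    n = len(selected)
    observed = sum(1 for _, causal in selected if not causal) / n
    expected = 1.0 - sum(p for p, _ in selected) / n
    return observed, expected

# Hypothetical PIPs and ground-truth causal labels from a simulation
toy_pips = [0.99, 0.95, 0.92, 0.97, 0.50]
toy_truth = [True, True, False, True, False]
observed_fdr, expected_fdr = fdr_and_expected(toy_pips, toy_truth)
```

An observed FDR well above the expected value, as in this toy input, is the signature of miscalibration.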

Figure 2. Evaluation of FDR and recall in meta-analysis fine-mapping simulations

(A–H) We evaluated FDR and recall in meta-analysis fine-mapping using different genotyping arrays (A and B), imputation reference panels (C and D), genetic ancestries (E and F), and more heterogeneous settings by combining these (G and H). As shown in top-right gray labels, the EUR ancestry, the Omni2.5 genotyping array, and/or the 1000GP3 reference were used unless otherwise stated. FDR is defined as the proportion of variants with PIP > 0.9 that are non-causal. Horizontal gray lines represent 1 – mean PIP, i.e., the FDR expected if the method were calibrated. Recall is defined as the proportion of true causal variants in the top 1% PIP bin. Shapes correspond to the true effect size correlation rg across cohorts, which represents phenotypic heterogeneity (the lower the rg, the higher the phenotypic heterogeneity). Error bars correspond to 95% confidence intervals.

To further characterize observed miscalibration in meta-analysis fine-mapping, we investigated the availability of GWAS variants in each combination of ancestry, genotyping array, and imputation panel (Figures S3–S5). Out of 3,285,617 variants on chromosome 3 that passed variant QC in at least one combination (per-combination MAF >0.001 and Rsq >0.6; STAR Methods), 574,261 variants (17%) showed population-level gnomAD MAF >0.001 in every ancestry that we simulated (African, East Asian, and European). Because we used a variety of imputation panels, we retrieved population-level MAF from gnomAD. Of these 574,261 variants, 389,219 variants (68%) were available in every combination (Figure S3A). This fraction increased from 68% to 73%, 74%, and 76% as we increased gnomAD MAF thresholds to >0.005, 0.01, and 0.05, respectively, but never reached 100% (Figure S5). Notably, we observed a substantial number of variants that are unique to a certain genotyping array and an imputation panel, even when we restricted to 344,497 common variants (gnomAD MAF >0.05) in every ancestry (Figure S3B). For example, there are 34,317 variants (10%) that were imputed in the 1000GP3 and TOPMed reference but not in the HRC. Likewise, we observed 33,106 variants (10%) that were specific to the 1000GP3 reference and even 3,066 variants (1%) that were imputed in every combination except for East Asian ancestry with the GSA array and the TOPMed reference. When using different combinations of gnomAD MAF thresholds (>0.001, 0.005, 0.01, or 0.05 in every ancestry) and Rsq thresholds (>0.2, 0.4, 0.6, or 0.8), we observed the largest fraction of shared variants (78%) was achieved with gnomAD MAF >0.01 and Rsq >0.2 while the largest number of the shared variants (427,494 variants) was achieved with gnomAD MAF >0.001 and Rsq >0.2, leaving it unclear which thresholds would be preferable in the context of fine-mapping (Figure S5).

The remaining 2,711,356 QC-passing variants in our simulations (gnomAD MAF ≤0.001 in at least one ancestry) further exacerbate variable coverage of the available variants (Figure S4A). Of these, the largest proportion of variants (39%) were only available in African ancestry, followed by African and European (but not in East Asian) available variants (7%), European-specific variants (6%), and East Asian-specific variants (5%). Furthermore, similar to the aforementioned common variants, we found a substantial number of variants that are unique to a certain combination. Altogether, we observed that only 393,471 variants (12%) out of all the QC-passing 3,285,617 variants were available in every combination (Figure S4B). These observations recapitulate that different combinations of genetic ancestry, genotyping array, imputation panels, and QC thresholds substantially affect the availability of common, well-imputed variants for association testing.55

Thus, the different combinations of genotyping and imputation cause each cohort in a meta-analysis to have a different set of variants, and consequently variants can have very different overall sample sizes. In our simulations with the most heterogeneous configurations, we found that 66% of the false-positive loci (where a non-causal [false-positive] variant was assigned PIP > 0.9) had different sample sizes for true causal and false-positive variants (median maximum/minimum sample size ratio = 1.4; Figure S6). Analytically, we found that at common meta-analysis sample sizes and genome-wide significant effect size regimes, when two variants have similar marginal effects, the one with the larger sample size will usually achieve a higher ABF PIP (Data S2; Figures S7–S9). This elucidates the mechanism by which sample size imbalance can lead to miscalibration.
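The sample size effect described above can be reproduced numerically with a small sketch. It assumes Wakefield's ABF with an assumed prior variance of 0.04, a standardized quantitative trait so that the standard error is approximately 1/sqrt(2·MAF·(1−MAF)·N), and a locus with only two candidate variants. The effect size, MAF, and sample sizes are hypothetical; only the sample size ratio of 1.4 echoes the median ratio reported above.

```python
import math

def log_abf(beta, se, prior_var=0.04):
    """Wakefield's log approximate Bayes factor; prior_var is an assumption."""
    v = se * se
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * (beta / se) ** 2 * shrink

def approx_se(n, maf):
    """Approximate SE for a standardized trait and a small per-variant effect."""
    return 1.0 / math.sqrt(2.0 * maf * (1.0 - maf) * n)

beta, maf = 0.05, 0.3          # identical marginal effect and MAF for both variants
n_large = 100_000              # variant present in every cohort
n_small = n_large / 1.4        # variant missing from some cohorts

labf_large = log_abf(beta, approx_se(n_large, maf))
labf_small = log_abf(beta, approx_se(n_small, maf))

# PIPs in a two-variant locus with a flat prior and one causal variant
pip_large = 1.0 / (1.0 + math.exp(labf_small - labf_large))
pip_small = 1.0 - pip_large
```

Although both variants have the same marginal effect, nearly all posterior mass goes to the variant with the larger sample size, regardless of which one is truly causal.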

Overview of the SLALOM method

To address the challenges in meta-analysis fine-mapping discussed above, we developed SLALOM, a method that flags suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics based on deviations from expectation, estimated with local LD structure (STAR Methods). SLALOM consists of three steps: (1) defining loci and lead variants based on a 1 Mb window, (2) detecting outlier variants in each locus using meta-analysis summary statistics and an external LD reference panel, and (3) identifying suspicious loci for meta-analysis fine-mapping (Figures 3A and 3B).
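Step (1) can be sketched as a greedy procedure: take the most significant remaining variant as a lead, assign every variant within 500 kb on either side to its locus, and repeat while genome-wide significant variants remain. The greedy order, the handling of neighboring windows, and the record layout (`chrom`, `pos`, `p`) are assumptions for illustration; the exact procedure is specified in STAR Methods.

```python
def define_loci(variants, window=1_000_000, p_threshold=5e-8):
    """Greedily define loci as windows centered on lead variants.

    variants: list of dicts with keys 'chrom', 'pos', and 'p' (hypothetical layout).
    Returns a list of {'lead': variant, 'variants': [variants in the window]}.
    """
    half = window // 2
    remaining = sorted(variants, key=lambda v: v["p"])
    loci = []
    while remaining and remaining[0]["p"] < p_threshold:
        lead = remaining[0]
        members, rest = [], []
        for v in remaining:
            near = v["chrom"] == lead["chrom"] and abs(v["pos"] - lead["pos"]) <= half
            (members if near else rest).append(v)
        loci.append({"lead": lead, "variants": members})
        remaining = rest  # variants already assigned to a locus are not reused
    return loci

# Toy summary statistics (hypothetical)
toy_variants = [
    {"chrom": "1", "pos": 1_000_000, "p": 1e-12},
    {"chrom": "1", "pos": 1_200_000, "p": 1e-9},
    {"chrom": "1", "pos": 3_000_000, "p": 1e-10},
    {"chrom": "1", "pos": 3_400_000, "p": 0.01},
    {"chrom": "2", "pos": 500_000, "p": 1e-6},
]
toy_loci = define_loci(toy_variants)
```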

Figure 3. Overview of the SLALOM method

(A and B) An illustrative example of the SLALOM application. (A) In an example locus, two independent association signals are depicted: (1) the most significant signal that contains a lead variant (purple diamond) and five additional variants that are in strong LD (r2 > 0.9) with the lead variant, and (2) an additional independent signal (r2 < 0.05). There is one outlier variant (orange diamond) in the first signal that deviates from the expected association based on LD. (B) Step-by-step procedure of the SLALOM method. For outlier variant detection in a locus, a diagnostic plot of r2 values to the lead variant versus marginal χ2 is shown to aid interpretation. Background color represents a theoretical distribution of −log10(P_DENTIST-S) values when a lead variant has a marginal χ2 of 50, assuming no allele flipping. Points represent the variants depicted in the example locus (A), where the lead variant (purple diamond) and the outlier variant (white diamond) are highlighted. The diagonal line represents an expected marginal association. Horizontal dotted lines represent the genome-wide significance threshold (p < 5.0 × 10−8).

(C) The receiver operating characteristic (ROC) curve of SLALOM prediction for identifying suspicious loci in the simulations. Positive conditions were defined as whether a true causal variant in a locus is (1) a lead PIP variant, (2) in 95% CS, and (3) in 99% CS. AUROC values are shown in the labels. Black points represent the performance of our adopted metric; i.e., whether a locus contains at least one outlier variant (P_DENTIST-S < 1.0 × 10−4 and r2 > 0.6).

(D) Calibration plot in the simulations under different PIP thresholds. Calibration was measured as the mean PIP minus the fraction of true causal variants among variants above the threshold. Shadows around the lines represent 95% confidence intervals.

(E) The fraction of variants in predicted suspicious and non-suspicious loci under different PIP thresholds. Gray shadows in the panels (D and E) represent a PIP ≤ 0.1 region as we excluded loci with maximum PIP ≤ 0.1 in the actual SLALOM analysis based on these panels.

To detect outlier variants, we first assume a single causal variant per associated locus. Then the marginal Z score $z_i$ for a variant $i$ should be approximately equal to $r_{i,c} z_c$, where $z_c$ is the Z score of the causal variant $c$ and $r_{i,c}$ is the correlation between variants $i$ and $c$. For each variant in meta-analysis summary statistics, we first test this relationship using a simplified version of the DENTIST statistic,45 DENTIST-S, based on the assumption of a single causal variant. The DENTIST-S statistic for a given variant $i$ is written as

$$T_i = \frac{(z_i - r_{i,c} z_c)^2}{1 - r_{i,c}^2} \qquad \text{(Equation 1)}$$

which approximately follows a χ2 distribution with 1 degree of freedom.45 Since the true causal variant and LD structure are unknown in real data, we approximate the causal variant as the lead PIP variant in the locus (the variant with the highest PIP) and use a large-scale external LD reference from gnomAD,56 either an ancestry-matched LD for a single-ancestry meta-analysis or a sample-size-weighted LD by ancestries for a multi-ancestry meta-analysis (STAR Methods). We note that the existence of multiple independent causal variants in a locus would not affect SLALOM precision but would decrease recall (see Discussion).
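As a minimal sketch of Equation 1, the statistic and its p value can be computed per variant from its Z score, the Z score of the lead PIP variant (standing in for the causal variant), and their correlation in the LD reference. The function name and the toy Z scores are hypothetical. The p value uses the identity that the upper tail of a 1-degree-of-freedom χ2 distribution at t equals erfc(sqrt(t/2)).

```python
import math

def dentist_s(z_i, z_lead, r):
    """DENTIST-S statistic (Equation 1) and its chi-square (1 df) p value.

    z_i:    marginal Z score of the tested variant
    z_lead: Z score of the lead PIP variant, used in place of the causal variant
    r:      correlation between the two variants in the LD reference
    """
    if abs(r) >= 1.0:
        # Undefined when r^2 = 1 (e.g., the lead variant itself); assumed to be skipped
        raise ValueError("DENTIST-S is undefined for |r| >= 1")
    t = (z_i - r * z_lead) ** 2 / (1.0 - r * r)
    p = math.erfc(math.sqrt(t / 2.0))  # survival function of chi-square with 1 df
    return t, p

z_lead = math.sqrt(50.0)  # lead variant with marginal chi-square of 50
t_ok, p_ok = dentist_s(6.7, z_lead, 0.95)    # close to the expected 0.95 * z_lead
t_out, p_out = dentist_s(4.0, z_lead, 0.95)  # much weaker than LD predicts
```

The second variant is in strong LD with the lead yet has a far weaker association than expected, which yields a very small DENTIST-S p value.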

SLALOM then evaluates whether each locus is “suspicious”; that is, has a pattern of meta-analysis statistics and LD that appear inconsistent and therefore call into question the fine-mapping accuracy. By training on loci with maximum PIP >0.9 in the simulations, we determined that the best-performing criterion for classifying loci as true- or false-positives is whether a locus has a variant with r2 > 0.6 to the lead and DENTIST-S p-value < 1.0 × 10−4 (STAR Methods). Using this criterion, we achieved an area under the receiver operating characteristic curve (AUROC) of 0.74, 0.76, and 0.80 for identifying whether a true causal variant is a lead PIP variant, in 95% CS, and in 99% CS, respectively (Figure 3C). Using different thresholds, we observed that the SLALOM performance is not very sensitive to thresholds near the threshold we chose (Figure S10). We further validated the performance of SLALOM using all the loci in the simulations and observed significantly higher miscalibration in predicted suspicious loci than in non-suspicious loci (up to 16% difference in FDR at PIP >0.9; Figure 3D). We found that SLALOM-predicted suspicious loci tend to be from more heterogeneous configurations and the SLALOM sensitivity and specificity depend on the level of heterogeneity (Table S5). Given the lower miscalibration and specificity at low PIP thresholds (Figures 3D and 3E), in subsequent real data analysis we restricted the application of SLALOM to loci with maximum PIP >0.1 (STAR Methods).
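The locus-level rule can be sketched as below, taking per-variant r2 to the lead variant and DENTIST-S p values as inputs. The thresholds (r2 > 0.6, p < 1.0 × 10−4, maximum PIP > 0.1) come from the text, and the SL/NSL/NA labels follow the Figure 5 legend. The function name and the toy inputs are hypothetical.

```python
def slalom_call(max_pip, r2_to_lead, dentist_s_pvalues,
                r2_threshold=0.6, p_threshold=1e-4, min_max_pip=0.1):
    """Classify a locus as 'SL' (suspicious), 'NSL' (non-suspicious), or 'NA'.

    A locus is not assessed ('NA') when its maximum PIP is at or below min_max_pip.
    Otherwise it is suspicious if at least one variant is an outlier, i.e., has
    r2 to the lead above r2_threshold and a DENTIST-S p value below p_threshold.
    Returns (label, number of outlier variants).
    """
    if max_pip <= min_max_pip:
        return "NA", 0
    n_outliers = sum(
        1 for r2, p in zip(r2_to_lead, dentist_s_pvalues)
        if r2 > r2_threshold and p < p_threshold
    )
    return ("SL" if n_outliers > 0 else "NSL"), n_outliers

# Toy loci (hypothetical values)
call_a = slalom_call(0.95, [0.95, 0.70, 0.30], [0.5, 1e-6, 1e-8])
call_b = slalom_call(0.95, [0.95, 0.70, 0.30], [0.5, 0.01, 1e-8])
call_c = slalom_call(0.05, [0.95, 0.70, 0.30], [0.5, 1e-6, 1e-8])
```

In the second toy locus, the variant with r2 = 0.30 has a tiny p value but does not count as an outlier because it falls below the r2 threshold.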

Widespread suspicious loci for fine-mapping in existing meta-analysis summary statistics

Having assessed the performance of SLALOM in simulations, we applied SLALOM to 467 meta-analysis summary statistics in the GWAS Catalog47 that are publicly available with a sufficient discovery sample size (N > 10,000; Table S6; STAR Methods) to quantify the prevalence of suspicious loci in existing studies. These summary statistics were mostly European-ancestry-only meta-analyses (63%), followed by multi-ancestry (31%), East Asian ancestry-only (3%), and African ancestry-only (2%) meta-analyses. Across 467 summary statistics from 96 publications, we identified 28,925 loci with maximum PIP >0.1 (out of 35,864 genome-wide significant loci defined based on 1-Mb window around lead variants; STAR Methods) for SLALOM analysis, of which 8,137 loci (28%) were predicted suspicious (Table S7).

To validate SLALOM performance in real data, we restricted our analysis to 6,065 loci that have maximum PIP >0.1 and that contain nonsynonymous coding variants (predicted loss of function [pLoF] and missense) in LD with the lead variant (r2 > 0.6). Given prior evidence16,43,57 that such nonsynonymous variants are highly enriched for being causal, we tested the validity of our method by whether they achieve the highest PIP in the locus (i.e., successful fine-mapping) in suspicious versus non-suspicious loci (STAR Methods). While 40% (1,557 out of 3,860) of non-suspicious loci successfully fine-mapped nonsynonymous variants, only 17% (384 out of 2,205) of suspicious loci did, demonstrating a significant depletion (2.3×) of successfully fine-mapped nonsynonymous variants in suspicious loci (Fisher’s exact p = 3.6 × 10−79; Figure 4A). We also tested whether nonsynonymous variants belonged to 95% and 99% CS and again observed significant depletion (1.4× and 1.3×, respectively; Fisher’s exact p < 4.6 × 10−100). In addition, when we used a more stringent r2 threshold (>0.8) for selecting loci that contain nonsynonymous variants, we also confirmed significant depletion (Fisher’s exact p < 6.1 × 10−65; Figure S11). To quantify potential fine-mapping miscalibration in the GWAS Catalog, we investigated the difference between mean PIP for lead variants and fraction of lead variants that are nonsynonymous; assuming that nonsynonymous variants in these loci are truly causal, this difference equals the difference between the true and reported fraction of lead PIP variants that are causal. We observed differences between 26%–51% and 10%–18% under different PIP thresholds in suspicious and non-suspicious loci, respectively (Figure 4B), marking 45% and 15% for high-PIP (>0.9) variants.
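The 2.3× depletion is a relative risk, i.e., a ratio of the two proportions reported above, and can be checked directly from the counts. The percentile bootstrap below is only one plausible way to obtain a confidence interval; the resampling scheme, replicate count, and seed are assumptions, since the exact bootstrap procedure is described in STAR Methods.

```python
import random

def relative_risk(k_ref, n_ref, k_cmp, n_cmp):
    """Ratio of proportions: (k_ref / n_ref) / (k_cmp / n_cmp)."""
    return (k_ref / n_ref) / (k_cmp / n_cmp)

def bootstrap_rr_ci(k_ref, n_ref, k_cmp, n_cmp, n_boot=200, seed=0):
    """Percentile 95% CI for the relative risk.

    Resampling loci with replacement within a group is equivalent to drawing
    the number of successes from a binomial with the observed proportion.
    """
    rng = random.Random(seed)
    p_ref, p_cmp = k_ref / n_ref, k_cmp / n_cmp
    reps = []
    for _ in range(n_boot):
        b_ref = sum(rng.random() < p_ref for _ in range(n_ref))
        b_cmp = sum(rng.random() < p_cmp for _ in range(n_cmp))
        if b_cmp > 0:  # skip replicates with an undefined ratio
            reps.append(relative_risk(b_ref, n_ref, b_cmp, n_cmp))
    reps.sort()
    low = reps[int(0.025 * len(reps))]
    high = reps[min(len(reps) - 1, int(0.975 * len(reps)))]
    return low, high

# Counts from the text: non-suspicious 1,557/3,860 versus suspicious 384/2,205
depletion = relative_risk(1557, 3860, 384, 2205)
ci_low, ci_high = bootstrap_rr_ci(1557, 3860, 384, 2205)
```

A Fisher's exact p value for the same 2×2 table could be obtained with a standard routine such as `scipy.stats.fisher_exact`, which is omitted here to keep the sketch free of third-party dependencies.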

Figure 4. Evaluation of SLALOM performance in the GWAS Catalog summary statistics

(A) Depletion of likely causal variants in predicted suspicious loci. We evaluated whether nonsynonymous coding variants (pLoF and missense) were lead PIP variants, in 95% CS, or in 99% CS in suspicious versus non-suspicious loci. Depletion was calculated by relative risk (i.e., a ratio of proportions; STAR Methods). Error bars, invisible due to their small size, correspond to 95% confidence intervals using bootstrapping. Significance represents a Fisher exact test p value (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 10−4).

(B) Plot of the estimated difference between true and reported proportion of causal variants in the loci tagging nonsynonymous variants (r2 > 0.6 with the lead variants) in the GWAS Catalog under different PIP thresholds. Analogous to Figure 3D, assuming nonsynonymous variants in these loci are truly causal, the mean PIP for lead variants minus the fraction of lead variants that are nonsynonymous above the threshold is equal to the difference between true and reported proportion of causal variants. Shadows around the lines represent 95% confidence intervals.

(C and D) Similar to (A), we evaluated whether (C) high-PIP (>0.9) complex trait variants in biobank fine-mapping and (D) high-PIP (>0.9) cis-eQTL variants in GTEx v8 and eQTL Catalog were lead PIP variants, in 95% CS, or in 99% CS in suspicious versus non-suspicious loci.

We further assessed SLALOM performance in the GWAS Catalog meta-analyses by leveraging high-PIP (>0.9) complex trait and cis-eQTL variants that were rigorously fine-mapped16 in large-scale biobanks (Biobank Japan [BBJ],58 FinnGen,19 and UK Biobank [UKBB]18) and eQTL resources (GTEx59 v8 and eQTL Catalog60). Among the 27,713 loci analyzed by SLALOM (maximum PIP >0.1) that contain a lead variant that was included in biobank fine-mapping, 17% (3,266 out of 19,692) of the non-suspicious loci successfully fine-mapped one of the high-PIP GWAS variants in biobank fine-mapping, whereas 7% (589 out of 8,021) of suspicious loci did, showing a significant depletion (2.3×) of the high-PIP complex trait variants in suspicious loci (Fisher’s exact p = 4.6 × 10−100; Figure 4C). Similarly, among 26,901 loci analyzed by SLALOM that contain a lead variant that was included in cis-eQTL fine-mapping, we found a significant depletion (1.9×) of the high-PIP cis-eQTL variants in suspicious loci, where 7% (1,247 out of 18,976) of non-suspicious loci versus 4% (281 out of 7,925) of suspicious loci successfully fine-mapped one of the high-PIP cis-eQTL variants (Fisher’s exact p = 2.6 × 10−24; Figure 4D). We observed the same significant depletions in suspicious loci when testing whether the high-PIP complex trait and cis-eQTL variants belonged to the 95% and 99% CSs (Figures 4C and 4D).

Suspicious loci for fine-mapping in the GBMI summary statistics

Next, we applied SLALOM to meta-analysis summary statistics of 14 disease endpoints from the GBMI.10 These summary statistics were generated from a meta-analysis of up to 1.8 million individuals in total across 18 biobanks for discovery, representing six different genetic ancestry groups of approximately 33,000 African, 18,000 admixed American, 31,000 Central and South Asian, 341,000 East Asian, 1.4 million European, and 1,600 Middle Eastern individuals (Table S8). Among 489 genome-wide significant loci across the 14 traits (excluding the major histocompatibility complex [MHC] region; STAR Methods), we found that 82 loci (17%) showed maximum PIP <0.1 and were thus not considered further by SLALOM. Of the remaining 407 loci with maximum PIP >0.1, SLALOM identified 272 loci (67%) as suspicious for fine-mapping (Figure 5A; Table S9). The fraction of suspicious loci and their maximum PIP varied by trait, reflecting different levels of statistical power (e.g., sample sizes, heritability, and local LD structure) as well as inter-cohort heterogeneity (Figures 5B–5O).

Figure 5.

SLALOM prediction results in the GBMI summary statistics

(A–O) For (A) all 14 traits and (B–O) individual traits, the numbers of predicted suspicious (SL), non-suspicious (NSL), and non-applicable (NA; maximum PIP <0.1) loci are summarized. Individual traits are ordered by the total number of loci. Color represents the maximum PIP in a locus. Label represents the fraction of loci in each prediction category. AAA, abdominal aortic aneurysm; AcApp, acute appendicitis; COPD, chronic obstructive pulmonary disease; HCM, hypertrophic cardiomyopathy; HF, heart failure; IPF, idiopathic pulmonary fibrosis; POAG, primary open-angle glaucoma; ThC, thyroid cancer; UtC, uterine cancer; VTE, venous thromboembolism.

While the fraction of suspicious loci (67%) in the GBMI meta-analyses is higher than in the GWAS Catalog (28%), there might be multiple reasons for this discrepancy, including association significance, sample size, ancestral diversity, and study-specific QC criteria. For example, the GBMI summary statistics were generated from multi-ancestry, large-scale meta-analyses of median sample size of 1.4 million individuals across six ancestries, while 63% of the 467 summary statistics from the GWAS Catalog were only in European-ancestry studies and 83% had less than 0.5 million discovery samples. Nonetheless, predicted suspicious loci for fine-mapping were prevalent in both the GWAS Catalog and the GBMI.

Using nonsynonymous (pLoF and missense) and high-PIP (>0.9) complex trait and cis-eQTL variants, we recapitulated a significant depletion of these likely causal variants in predicted suspicious loci (2.7×, 5.2×, and 5.1× for nonsynonymous, high-PIP complex trait, and high-PIP cis-eQTL variants being a lead PIP variant, respectively; Fisher’s exact p < 7.3 × 10−4), confirming our observation in the GWAS Catalog analysis (Figures 6A–6C).

Figure 6.

Evaluation of SLALOM performance in the GBMI summary statistics

(A–C) Similar to Figure 4, we evaluated whether (A) nonsynonymous coding variants (pLoF and missense), (B) high-PIP (>0.9) complex trait variants in biobank fine-mapping, and (C) high-PIP (>0.9) cis-eQTL variants in GTEx v8 and eQTL Catalog were lead PIP variants, in 95% CS, or in 99% CS in suspicious versus non-suspicious loci. Depletion was calculated by relative risk (i.e., a ratio of proportions; STAR Methods). Error bars correspond to 95% confidence intervals using bootstrapping. Significance represents a Fisher exact test p value (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 10−4).

(D) Locuszoom plot of the 1q23.3 locus for COPD. (Top) A Manhattan plot, where the lead variant rs2099684 (purple diamond) and a missense variant rs396991 (orange diamond) are highlighted. Color represents r2 values to the lead variant. Horizontal line represents a genome-wide significance threshold (p = 5.0 × 10−8). (Middle) PIP from ABF fine-mapping. Color represents whether variants belong to a 95% CS. (Bottom) r2 values with the lead variant in gnomAD populations.

(E) A diagnosis plot showing r2 values to the lead variant versus marginal χ2. Color represents –log10PDENTIST-S values. Outlier variants with PDENTIST-S < 10−4 are depicted in red with a diamond shape. Diagonal line represents an expected marginal association. Horizontal line represents a genome-wide significance threshold.

(F) Z scores of the lead variant (rs2099684) versus the missense variant (rs396991) in the constituent cohorts of the meta-analysis. Open and closed circles represent whether both variants exist in a cohort or rs396991 is missing. Circle size corresponds to an effective sample size. Color represents genetic ancestry.

In 15 out of 23 non-suspicious loci harboring a nonsynonymous variant, the nonsynonymous variant had the highest PIP. These included known missense variants such as rs116483731 (p.Arg20Gln) in SPDL1 for idiopathic pulmonary fibrosis (IPF)61,62 and rs28929474 (p.Glu366Lys) in SERPINA1 for chronic obstructive pulmonary disease (COPD).63,64 In addition, we observed successful fine-mapping in two novel loci for asthma: (1) rs41286560 (p.Pro558Thr) in RTL1, a missense variant known for decreasing height65,66; and (2) rs34187696 (p.Gly337Val) in ZSCAN5A, a known missense variant for increasing monocyte count.29

To characterize fine-mapping failures in suspicious loci, we examined suspicious loci in which a nonsynonymous variant did not achieve the highest PIP. For example, the FCGR2A/FCGR3A (1q23.3) locus for COPD contained a genome-wide significant lead intergenic variant rs2099684 (p = 1.7 × 10−11), which is in LD (r2 = 0.92) with a missense variant rs396991 (p.Phe176Val) of FCGR3A (Figure 6D). This locus was not previously reported for COPD but is known for associations with autoimmune diseases (e.g., inflammatory bowel disease,43 rheumatoid arthritis,7 and systemic lupus erythematosus67) and encodes the low-affinity human Fc-gamma receptors that bind to the Fc region of immunoglobulin (Ig) G and activate immune responses.68 Notably, this locus contains copy number variations that contribute to the disease associations in addition to single-nucleotide variants, which makes genotyping challenging.68,69 Despite strong LD with the lead variant, rs396991 did not achieve genome-wide significance (p = 9.1 × 10−3), showing a significant deviation from the expected association (PDENTIST-S = 5.3 × 10−41; Figure 6E). This is primarily due to missingness of rs396991 in eight biobanks out of 17 (Neff = 76,790 and 36,781 for rs2099684 and rs396991, respectively; Figure 6F), which is caused by its absence from major imputation reference panels (e.g., 1000GP,48 HRC,49 and UK10K70) despite having a high MAF in every population (MAF = 0.24–0.34 in African, admixed American, East Asian, European, and South Asian populations of gnomAD56).

Sample size imbalance across variants was pervasive in the GBMI meta-analyses,71 and was especially enriched in predicted suspicious loci: 84% of suspicious loci versus 24% of non-suspicious loci showed a maximum/minimum effective sample size ratio >2 among variants in LD (r2 > 0.6) with lead variants (a median ratio = 4.2 and 1.2 in suspicious and non-suspicious loci, respectively; Figure S12). These observations are consistent with our simulations, recapitulating that sample size imbalance results in miscalibration for meta-analysis fine-mapping. Notably, we observed a similar issue in other GBMI downstream analyses (e.g., polygenic risk score [PRS]71 and drug discovery72), where predictive performance improved significantly after filtering out variants whose Neff was below 50% of the maximum Neff. Although fine-mapping methods cannot simply take this approach because it inevitably reduces calibration and recall by removing true causal variants, other meta-analysis downstream analyses that primarily rely on polygenic signals rather than individual variants should consider this filtering as an extra QC step.
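The imbalance metric used above (maximum/minimum effective sample size among variants in LD with the lead variant) can be sketched as follows. This is a minimal illustration, not the analysis code: the dictionary keys `r2` and `neff` are hypothetical, and the two example variants use the Neff values quoted for the 1q23.3 COPD locus.

```python
def neff_ratio(variants, r2_min=0.6):
    """Max/min effective sample size among variants in LD with the lead.

    `variants` is a list of dicts with hypothetical keys `r2` (LD to the
    lead variant) and `neff` (effective sample size of that variant).
    Returns None when no variant passes the LD threshold.
    """
    in_ld = [v["neff"] for v in variants if v["r2"] > r2_min]
    if not in_ld:
        return None
    return max(in_ld) / min(in_ld)

# Two variants from the 1q23.3 COPD locus, using values quoted in the text.
locus = [
    {"id": "rs2099684", "r2": 1.0, "neff": 76790},   # lead variant (r2 = 1 with itself)
    {"id": "rs396991", "r2": 0.92, "neff": 36781},   # missense variant
]
ratio = neff_ratio(locus)
print(f"max/min Neff ratio = {ratio:.2f}; exceeds 2: {ratio > 2}")
```

Even with only these two variants, the ratio exceeds the threshold of 2 used to summarize imbalance in Figure S12.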

Comparison of fine-mapping results between the GBMI meta-analyses and individual biobanks

Motivated by successful validation of SLALOM performance, we investigated whether fine-mapping confidence and resolution were improved in the GBMI meta-analyses over individual biobanks. To this end, we used our fine-mapping results16 of nine disease endpoints (asthma,64 COPD,64 gout, heart failure,73 IPF,62 primary open-angle glaucoma,74 thyroid cancer, stroke,75 and venous thromboembolism76) in BBJ,58 FinnGen,19 and UKBB18 Europeans that also contributed to the GBMI meta-analyses for the same traits.

To perform an unbiased comparison of PIP between the GBMI meta-analysis and individual biobanks, we investigated functional enrichment of fine-mapped variants based on top PIP rankings in the GBMI and individual biobanks (top 0.5%, 0.1%, and 0.05% PIP variants in the GBMI versus maximum PIP across BBJ, FinnGen, and UKBB; STAR Methods). Previous studies have shown that high-PIP (>0.9) complex trait variants are significantly enriched for well-known functional categories, such as coding (pLoF, missense, and synonymous), 5′/3′ UTR, promoter, and cis-regulatory element (CRE) regions (DNase I hypersensitive sites and H3K27ac).16 Using these functional categories, we found no significant enrichment of variants in the top PIP rankings in the GBMI over individual biobanks (Fisher’s exact p > 0.05; Figure 7A) except for variants in the promoter region (1.8×; Fisher’s exact p = 4.9 × 10−4 for the top 0.1% PIP variants). We observed similar trends regardless of whether variants were in suspicious or non-suspicious loci (Figures 7B and 7C). To examine patterns of increased and decreased PIP for individual variants, we also calculated PIP difference between the GBMI and individual biobanks, defined as ΔPIP = PIP (GBMI) – maximum PIP across biobanks (Figures S13 and S14). We investigated functional enrichment based on ΔPIP bins and observed inconsistent enrichment results using different ΔPIP thresholds (Figure S15). Finally, to test whether fine-mapping resolution was improved in the GBMI over individual biobanks, we compared the size of 95% CS after restricting them to cases where a GBMI CS overlapped with an individual biobank CS (STAR Methods). We observed the median 95% CS size of 2 and 2 in non-suspicious loci for the GBMI and individual biobanks, respectively, and 5 and 14 in suspicious loci, respectively (Figure S16). The smaller CS size in suspicious loci in GBMI could be due to improved resolution or to increased miscalibration. 
These results provide limited evidence of overall fine-mapping improvement in the GBMI meta-analyses over what is achievable by taking the best result from individual biobanks.
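The ΔPIP definition above is simple enough to state in code. The sketch below applies it to PIP values reported for the asthma examples discussed later in this section; only the PIP of the single biobank named in each example is reported there, so each list holds one value. The function name `delta_pip` is ours.

```python
def delta_pip(pip_gbmi, pip_biobanks):
    """Delta PIP = PIP (GBMI) - maximum PIP across individual biobanks."""
    return pip_gbmi - max(pip_biobanks)

# (variant, PIP in GBMI, PIP in the biobank named in the text)
examples = [
    ("rs1888909", 1.0, [0.008]),     # GBMI versus FinnGen
    ("rs16903574", 1.0, [0.21]),     # GBMI versus UKBB Europeans
    ("rs61816761", 6.4e-5, [1.0]),   # GBMI versus UKBB Europeans
]
for rsid, gbmi, biobanks in examples:
    print(rsid, round(delta_pip(gbmi, biobanks), 2))
```

Positive values indicate a higher PIP in the meta-analysis, and negative values indicate that an individual biobank prioritized the variant more strongly.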

Figure 7.

Fine-mapping improvement and retrogression in the GBMI meta-analyses over individual biobanks

(A–C) Functional enrichment of variants in each functional category based on top PIP rankings in the GBMI and individual biobanks (maximum PIP of BBJ, FinnGen, and UKBB) using (A) all loci, (B) suspicious loci, or (C) non-suspicious loci. Shape corresponds to top PIP ranking (top 0.5%, 0.1%, and 0.05%). Enrichment was calculated by a relative risk (i.e., a ratio of proportions; STAR Methods). Error bars correspond to 95% confidence intervals using bootstrapping.

(D–G) Locuszoom plots for the same non-suspicious locus of asthma in the GBMI meta-analysis and an individual biobank (BBJ, FinnGen, or UKBB Europeans) that showed the highest PIP in our biobank fine-mapping. Colors in the Manhattan panels represent r2 values to the lead variant. In the PIP panels, only fine-mapped variants in the 95% CS are colored, where the same colors are applied between the GBMI meta-analysis and an individual biobank based on merged CS as previously described. Horizontal line represents a genome-wide significance threshold (p = 5.0 × 10−8).

(D) rs1888909 for asthma in the GBMI and FinnGen.

(E) rs16903574 for asthma in the GBMI and UKBB Europeans. Nearby rs528167451 was also highlighted, which was in strong LD (r2 = 0.86) and in the same 95% CS in UKBB Europeans, but not in the GBMI (r2 = 0.67).

(F) rs1295686 for asthma in the GBMI and UKBB Europeans. A nearby missense, rs20541, showed lower PIP than rs1295686 despite having strong LD (r2 = 0.96).

(G) rs12123821 for asthma in the GBMI and UKBB Europeans. Nearby stop-gained rs61816761 was independent of rs12123821 (r2 = 0.0) and not fine-mapped in the GBMI due to a single causal variant assumption in the ABF fine-mapping.

Individual examples, however, provide insights into the types of fine-mapping differences that can occur. To characterize the observed differences in fine-mapping confidence and resolution, we further examined non-suspicious loci with ΔPIP > 0.5 in asthma. In some cases, the increased power and/or ancestral diversity of GBMI led to improved fine-mapping: for example, an intergenic variant rs1888909 (∼18 kb upstream of IL33) showed ΔPIP = 0.99 (PIP = 1.0 and 0.008 in GBMI and FinnGen, respectively; Figure 7D), which was primarily owing to increased association significance in a meta-analysis (p = 3.0 × 10−86, 7.4 × 10−2, 3.6 × 10−16, and 1.9 × 10−53 in GBMI, BBJ, FinnGen, and UKBB Europeans, respectively) as well as a shorter LD length in the African population than in the European population (LD length = 4 versus 41 kb for variants with r2 > 0.6 with rs1888909 in the African and European populations, respectively; Neff = 4,270 for Africans in the GBMI asthma meta-analysis; Figure S17). This variant was also fine-mapped for eosinophil count in UKBB Europeans (PIP = 1.0; p = 1.3 × 10−314)16 and was previously reported to regulate IL33 gene expression in human airway epithelial cells via allele-specific transcription factor binding of OCT-1 (POU2F1).77 Likewise, we observed a missense variant rs16903574 (p.Phe319Leu) in OTULINL showed ΔPIP = 0.79 (PIP = 1.0 and 0.21 in GBMI and UKBB Europeans, respectively; Figure 7E) owing to improved association significance (p = 7.7 × 10−15 and 4.7 × 10−12 in GBMI and UKBB Europeans, respectively).

However, we also observed very high ΔPIP for variants that are not likely causal. For example, we observed that an intronic variant rs1295686 in IL13 showed ΔPIP = 0.56 (PIP = 0.56 and 0.0002 in GBMI and UKBB Europeans, respectively; Figure 7F), despite having strong LD with a nearby missense variant rs20541 (p.Gln144Arg; r2 = 0.96 with rs1295686), which only showed ΔPIP = 0.13 (PIP = 0.13 and 0.0001 in GBMI and UKBB Europeans, respectively). The missense variant rs20541 showed PIP = 0.23 and 0.15 for a related allergic disease, atopic dermatitis, in BBJ and FinnGen, respectively,16 and was previously shown to induce STAT6 phosphorylation and upregulate CD23 expression in monocytes, promoting IgE synthesis.78 Although the GBMI meta-analysis contributed to prioritizing these two variants (sum of PIP = 0.69 versus 0.0003 in GBMI and UKBB Europeans, respectively), the observed ΔPIP was higher for rs1295686 than for rs20541.

While increasing sample size in meta-analysis improves association significance, we also found negative ΔPIP due to losing the ability to model multiple causal variants. A stop-gained variant rs61816761 (p.Arg501Ter) in FLG showed ΔPIP = −1.0 (PIP = 6.4 × 10−5 and 1.0 in GBMI and UKBB Europeans, respectively; Figure 7G), which was primarily owing to a nearby lead variant rs12123821 (∼17 kb downstream of HRNR; r2 = 0.0 with rs61816761). This lead variant rs12123821 showed greater significance than rs61816761 in GBMI (p = 9.3 × 10−16 and 2.0 × 10−11 for rs12123821 and rs61816761, respectively) as well as in UKBB Europeans (p = 7.1 × 10−26 and 1.5 × 10−18). While our biobank fine-mapping16 assigned PIP = 1.0 for both variants based on multiple-causal-variant fine-mapping (i.e., FINEMAP20 and SuSiE22), our ABF fine-mapping in the GBMI meta-analysis was only able to assign PIP = 0.74 for the lead variant rs12123821 due to a single causal variant assumption. This recapitulates the importance of multiple-causal-variant fine-mapping in complex trait fine-mapping16; however, we note that multiple-causal-variant fine-mapping with an external LD reference is extremely error prone as previously reported.14,15

Discussion

In this study, we first demonstrated in simulations that meta-analysis fine-mapping is substantially miscalibrated when constituent cohorts are heterogeneous in phenotyping, genotyping, and imputation. To mitigate this issue, we developed SLALOM, a summary statistics-based QC method for identifying suspicious loci in meta-analysis fine-mapping. Applying SLALOM to 14 disease endpoints from the GBMI meta-analyses10 as well as 467 summary statistics from the GWAS Catalog,47 we observed widespread suspicious loci in meta-analysis summary statistics, suggesting that meta-analysis fine-mapping is often miscalibrated in real data too. Indeed, we demonstrated that the predicted suspicious loci were significantly depleted for having likely causal variants as a lead PIP variant, such as nonsynonymous variants, high-PIP (>0.9) GWAS, and cis-eQTL fine-mapped variants from our previous fine-mapping studies.16 Our method provides better calibration in non-suspicious loci for meta-analysis fine-mapping, generating a more reliable set of variants for further functional characterization.

We have found limited evidence of improved fine-mapping in the GBMI meta-analyses over individual biobanks. A few empirical examples in this study as well as other previous studies7,9,25,26,29 suggested that multi-ancestry, large-scale meta-analysis could have potential to improve fine-mapping confidence and resolution owing to increased statistical power in associations and differential LD pattern across ancestries. However, we have highlighted that the observed improvement in PIP could also be due to sample size imbalance in a locus, miscalibration, and technical confounding, which further emphasizes the importance of careful investigation of fine-mapped variants identified through meta-analysis fine-mapping. Given practical challenges in data harmonization across different cohorts, a large-scale biobank with multiple ancestries (e.g., UK Biobank18 and All of Us79) would likely benefit the most from meta-analysis fine-mapping across ancestries.

As high-confidence fine-mapping results in large-scale biobanks and molecular quantitative trait loci (QTLs) continue to become available,15,16,60 we propose alternative approaches for prioritizing candidate causal variants in a meta-analysis. First, these high-confidence fine-mapped variants have been a valuable resource to conduct a phenome-wide association study (PheWAS) to match with associated variants in a meta-analysis, which provides a narrower list of candidate variants assuming they would equally be functional and causal in related complex traits or tissues/cell types. Second, a traditional approach based on tagging variants (e.g., r2 > 0.6 with lead variants, or PICS57 fine-mapping approach that only relies on a lead variant and LD) can still be highly effective, especially for known functional variants such as nonsynonymous coding variants. As we highlighted in this and previous38 studies, potentially causal variants in strong LD with lead variants might not achieve genome-wide significance because of missingness and heterogeneity.

While using an external LD reference for fine-mapping has been shown to be extremely error prone,14,15 we find here that it can be useful for flagging suspicious loci, even when it does not perfectly represent the in-sample LD structure of the meta-analyzed individuals. However, our use of external LD reference comes with several limitations. For example, due to the finite sample size of external LD reference, rare or low-frequency variants have larger uncertainties around r2 than common variants. Moreover, our r2 values in a multi-ancestry meta-analysis are currently approximated based on a sample-size-weighted average of r2 across ancestries as previously suggested,80 but this can be different from actual r2. These uncertainties around r2 affect SLALOM prediction performance and should be modeled appropriately for further method development. On the other hand, we find it challenging to use an LD reference when true causal variants are located within a complex region (e.g., MHC), or are entirely missing from standard LD or imputation reference panels, especially for structural variants. These limitations are not specific to meta-analysis fine-mapping, and separate fine-mapping methods based on bespoke imputation references have been developed (e.g., human leukocyte antigen [HLA],81 killer cell immunoglobulin-like receptor [KIR],82 and variable numbers of tandem repeats83).

We have found evidence in our simulations and real data of severe miscalibration of fine-mapping results from GWAS meta-analysis; for example, we estimate that the difference between true and reported proportion of causal variants is 20% and 45% for high-PIP (>0.9) variants in suspicious loci from the simulations and the GWAS Catalog, respectively. Our SLALOM method helps to exclude spurious results from meta-analysis fine-mapping; however, even fine-mapping results in SLALOM-predicted non-suspicious loci remain somewhat miscalibrated, showing estimated differences between true and reported proportion of causal variants of 4% and 15% for high-PIP variants in the simulations and the GWAS Catalog, respectively. We thus urge extreme caution when interpreting PIPs computed from meta-analyses until improved methods are available. We recommend that researchers looking to identify likely causal variants employ complete synchronization of study design, case/control ascertainment, genomic profiling, and analytical pipeline, or rely more heavily on functional annotations, biobank fine-mapping, or molecular QTLs.

Limitations of the study

There are several methodological limitations of SLALOM. First, our simulations only include one causal variant per locus. Although additional independent causal variants would not affect SLALOM precision (but decrease recall), multiple correlated causal variants in a locus would violate SLALOM assumptions and could lead to some DENTIST-S outliers that are not due to heterogeneity or missingness but rather simply a product of tagging multiple causal variants in LD. In fact, our previous studies have illustrated infrequent but non-zero presence of such correlated causal variants in complex traits.16 Second, SLALOM prediction is not perfect. Although fine-mapping calibration is certainly better in non-suspicious versus suspicious loci, SLALOM has low precision, and we still observe some miscalibration in non-suspicious loci. Optimal thresholds for SLALOM prediction might be different for other datasets. Third, SLALOM does not model effect size heterogeneity. Although SLALOM is able to detect suspicious loci due to effect size heterogeneity as the method is agnostic to the source of heterogeneity, methods that model effect size heterogeneity, such as MR-MEGA,84 could improve SLALOM performance. Finally, SLALOM is a per-locus QC method and does not calibrate per-variant PIPs. Further methodological development that properly models heterogeneity, missingness, sample size imbalance, multiple causal variants, and LD uncertainty across multiple cohorts and ancestries is needed to refine per-variant calibration and recall in meta-analysis fine-mapping.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Deposited Data
GBMI summary statistics | Zhou, W. et al., 2022 (ref. 10) | https://www.globalbiobankmeta.org/resources
BBJ fine-mapping results | Kanai, M. et al., 2021 (ref. 16) | https://humandbs.biosciencedbc.jp/en/hum0197-latest#hum0197.v5.gwas.v1
FinnGen fine-mapping results | Kanai, M. et al., 2021 (ref. 16) | https://www.finngen.fi/en/access_results
UKBB fine-mapping results | Kanai, M. et al., 2021 (ref. 16) | https://www.finucanelab.org/data
GWAS Catalog | GWAS Catalog (as of January 12, 2022) | https://www.ebi.ac.uk/gwas/
Example outputs from the meta-analysis fine-mapping simulation pipeline | This study | https://doi.org/10.7910/DVN/M86OCQ

Software and Algorithms
SLALOM | This study | https://github.com/mkanai/slalom, https://doi.org/10.5281/zenodo.6984388
Meta-analysis fine-mapping simulation pipeline | This study | https://github.com/mkanai/meta-finemapping-simulation, https://doi.org/10.5281/zenodo.6984391
Analysis code | This study | https://github.com/mkanai/slalom-paper, https://doi.org/10.5281/zenodo.7010731
HAPGEN2 | Su, Z. et al., 2011 (ref. 85) | https://mathgen.stats.ox.ac.uk/genetics_software/hapgen/hapgen2.html
PLINK 2.0 | Chang, CC. et al., 2015 (ref. 86) | https://www.cog-genomics.org/plink/2.0/
Michigan Imputation Server | Das, S. et al., 2016 (ref. 87) | https://imputationserver.sph.umich.edu/
TOPMed Imputation Server | Taliun, D. et al., 2021 (ref. 50) | https://imputation.biodatacatalyst.nhlbi.nih.gov/
Hail | Hail team, 2022 | https://hail.is/

Resource availability

Lead contact

Further information and requests for resources and data should be directed to and will be fulfilled by the lead contact, Masahiro Kanai (mkanai@broadinstitute.org).

Materials availability

This study did not generate new unique reagents.

Method details

Meta-analysis fine-mapping simulation

To benchmark fine-mapping performance in meta-analysis, we simulated a large-scale, realistic GWAS meta-analysis and performed fine-mapping under different scenarios. An overview of our simulation pipeline is summarized in Figure S2.

Simulated true genotype

Using HAPGEN285 with the 1000 Genomes Project Phase 3 (ref. 48), we simulated “true” genotypes of chromosome 3 for multiple independent cohorts from African, East Asian, and European ancestries. For each independent cohort from a given ancestry, we simulated 10,000 individuals each using the default parameters, with an ancestry-specific effective population size set to 17,469, 14,269, and 11,418 for Africans, East Asians, and Europeans, respectively, as recommended.85 To mimic sample size imbalance of different ancestries in the current meta-analyses, we simulated 10 independent European cohorts, 1 African cohort, and 1 East Asian cohort.

To restrict our analysis to unrelated samples, we computed sample relatedness based on KING kinship coefficients88 using PLINK 2.0 (ref. 86) and removed monozygotic twins, duplicated individuals, or first-degree relatives with the coefficient threshold of 0.177. The detailed sample sizes of unrelated individuals for each cohort are summarized in Table S1.

Genotyping and imputation

To simulate realistic genotyping and imputation procedures, we first virtually genotyped each cohort by restricting variants to those that are available on different genotyping arrays. We selected three major genotyping arrays from Illumina, Inc. (Omni2.5, Multi-Ethnic Global Array [MEGA], and Global Screening Array [GSA]) that have different densities of genotyping probes (Table S2). For each cohort, we created three virtually genotyped datasets by retaining variants that are genotyped on each array. For the sake of simplicity, we assumed no genotyping errors occurred between true genotypes and virtually genotyped data—however, in practice, genotyping error is one of the major sources of unexpected confounding (e.g., see recent discussions here89,90) and should be treated carefully.

For each pair of cohort and genotyping array, we then imputed missing variants using different imputation reference panels. We used the Michigan Imputation Server (https://imputationserver.sph.umich.edu/)87 and the TOPMed Imputation Server (https://imputation.biodatacatalyst.nhlbi.nih.gov/)50 with the default parameters, using three publicly available reference panels: the 1000 Genomes Project Phase 3 (version 5; n = 2,504; 1000GP3),48 the Haplotype Reference Consortium (version r1.1; n = 32,470; HRC),49 and the TOPMed (version R2; n = 97,256).50 Briefly, for each input, the imputation server created chunks of 20 Mb, applied the standard QC, pre-phased each chunk with Eagle2 (ref. 91), and imputed non-genotyped variants using a specified reference panel with Minimac4 (https://genome.sph.umich.edu/wiki/Minimac4). The detailed documentation of the imputation pipeline is available on the Michigan and TOPMed websites and has been described elsewhere.87

We applied post-imputation QC by only keeping variants with MAF >0.001 and imputation Rsq >0.6. Because the TOPMed panel is based on GRCh38 while the 1000GP3 and the HRC panels are on GRCh37, we lifted over TOPMed variants from GRCh38 to GRCh37 to meta-analyze with other cohorts. We excluded any variants which were lifted over to different chromosomes or for which the conversion failed. The number of virtually genotyped and imputed variants for each combination of cohort, genotyping array, and imputation panel is summarized in Table S3.

True phenotype

We simulated 300 true phenotypes that resemble observed complex trait genetic architecture and phenotypic heterogeneity across cohorts. Based on previous literature, we set parameters as follows: 1) 50% of 1 Mb loci contain a true causal variant92; 2) probability of being causal is proportional to functional enrichments of variant consequences (pLoF, missense, synonymous, 5′/3′ UTR, promoter, cis-regulatory region, and non-genic) for fine-mapped variants as estimated in a previous complex trait fine-mapping study16; 3) per-allele causal effect sizes have a variance proportional to $[2p(1-p)]^{\alpha}$, where $p$ represents a maximum MAF across the three ancestries (AFR, EAS, and EUR) and $\alpha$ is set to be −0.38 (ref. 51); and 4) total SNP-heritability $h_g^2$ for chromosome 3 equals 0.03 (ref. 52). For the sake of simplicity, we randomly draw a single true causal variant per locus because ABF assumes a single causal variant.30,31 We draw true causal variants from 1,150,893 non-ambiguous single-nucleotide variants in 1000GP3 that showed MAF >0.01 in at least one of the three ancestries (AFR, EAS, or EUR) and were not located within conversion-unstable positions (CUP)53 between the human genome builds GRCh37 and GRCh38. To mimic phenotypic heterogeneity across cohorts in real-world meta-analysis (due to e.g., different ascertainment, measurement error, or true effect size heterogeneity), we introduced cross-cohort genetic correlation of true effect sizes $r_g$, which is set to be one of 1, 0.9, or 0.5. For a true causal variant $j$, true causal effect sizes $\beta_j$ across cohorts were randomly drawn from $\beta_j \sim \mathrm{MVN}(0, \Sigma)$, where diagonal elements of $\Sigma$ were set to be $\sigma_g^2 [2p(1-p)]^{\alpha}$ and off-diagonal elements of $\Sigma$ were set to be $r_g \sigma_g^2 [2p(1-p)]^{\alpha}$. $\sigma_g^2$ was determined by $\sigma_g^2 = h_g^2 / \sum_j [2p(1-p)]^{1+\alpha}$. For each cohort, true phenotype $y$ was computed via $y = X\beta + \varepsilon$, where $X$ is the above true genotype matrix from HAPGEN2 and $\varepsilon_i \sim N(0, 1 - \sigma_g^2)$ i.i.d. We simulated 100 true phenotypes for each of rg = 1, 0.9, and 0.5, respectively.
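To make the effect size model concrete, the sketch below computes the scale factor σg2 from a set of causal-variant MAFs and draws correlated per-cohort effects for one causal variant. It is an illustration under stated assumptions, not the simulation pipeline itself: the MAF values are hypothetical, the function names are ours, and the multivariate normal draw is implemented with a shared plus cohort-specific component, which yields the same covariance (variance v on the diagonal, rg·v off the diagonal) for 0 ≤ rg ≤ 1.

```python
import math
import random

def sigma_g2(mafs, h2g=0.03, alpha=-0.38):
    """Scale factor chosen so that expected per-variant variances sum to h2g."""
    return h2g / sum((2 * p * (1 - p)) ** (1 + alpha) for p in mafs)

def draw_cohort_effects(p, s_g2, rg, n_cohorts, alpha=-0.38, rng=random):
    """Draw one causal variant's true effect size in each cohort.

    Each cohort's effect has variance v = s_g2 * [2p(1-p)]^alpha, and any
    two cohorts have covariance rg * v.
    """
    v = s_g2 * (2 * p * (1 - p)) ** alpha
    shared = rng.gauss(0, 1)  # component common to all cohorts
    return [
        math.sqrt(v) * (math.sqrt(rg) * shared + math.sqrt(1 - rg) * rng.gauss(0, 1))
        for _ in range(n_cohorts)
    ]

rng = random.Random(0)
mafs = [0.05, 0.2, 0.4]  # hypothetical maximum MAFs of three causal variants
s_g2 = sigma_g2(mafs)

# Sanity check: the expected variance explained sums back to h2g = 0.03.
explained = sum(s_g2 * (2 * p * (1 - p)) ** (1 - 0.38) for p in mafs)
betas = draw_cohort_effects(mafs[0], s_g2, rg=0.9, n_cohorts=10, rng=rng)
print(round(explained, 6), [round(b, 3) for b in betas])
```

With rg = 1 the cohort-specific component vanishes and every cohort receives the same effect size, which corresponds to the homogeneous scenario.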

GWAS

For each combination of phenotype, cohort, genotyping chip, and imputation panel, we conducted GWAS via a standard linear regression as implemented in PLINK 2.0 using imputed dosages. For covariates, we included top 10 principal components that were calculated based on true genotypes after restricting to unrelated samples. We only used LD-pruned variants with MAF >0.01 for PCA.

Meta-analysis

To simulate meta-analyses that resemble real-world settings, we generated multiple configurations of the above GWAS results to meta-analyze across 10 independent cohorts. Briefly, we chose configurations based on the following settings: 1) 10 EUR cohorts are genotyped and imputed using the same genotyping array (one of GSA, MEGA, or Omni2.5) and the same imputation panel (one of 1000GP3, HRC, TOPMed, or TOPMed-liftover); 2) 10 cohorts consisting of multiple ancestries (9 EUR +1 AFR/EAS cohorts or 8 EUR +1 AFR +1 EAS cohorts), with all cohorts genotyped and imputed using the same array (Omni2.5) and the same panel (1000GP3); 3) 10 EUR or multi-ancestry cohorts are genotyped using the same array (Omni2.5) but imputed using different panels across cohorts; 4) 10 EUR or multi-ancestry cohorts are imputed using the same panel (1000GP3) but genotyped using different arrays across cohorts; 5) 10 EUR or multi-ancestry cohorts are genotyped and imputed using different arrays and panels across cohorts. For settings 3–5, we randomly draw a combination of a genotyping array and an imputation panel for each cohort five times each for 10 EUR and multi-ancestry cohorts. In total, we generated 45 configurations as summarized in Table S4.
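The total of 45 configurations can be recovered by enumerating the five settings. The sketch below shows one reading of the description above that reproduces the stated total; Table S4 remains the authoritative list, and the labels used here are only illustrative.

```python
from itertools import product

arrays = ["GSA", "MEGA", "Omni2.5"]
panels = ["1000GP3", "HRC", "TOPMed", "TOPMed-liftover"]

# Setting 1: 10 EUR cohorts sharing one array and one imputation panel.
setting1 = list(product(arrays, panels))

# Setting 2: multi-ancestry cohort mixes, all on Omni2.5 + 1000GP3.
setting2 = ["9 EUR + 1 AFR", "9 EUR + 1 EAS", "8 EUR + 1 AFR + 1 EAS"]

# Settings 3-5: five random draws each for EUR-only and multi-ancestry cohorts.
settings3to5 = [
    (setting, cohorts, draw)
    for setting in (3, 4, 5)
    for cohorts in ("EUR", "multi-ancestry")
    for draw in range(5)
]

total = len(setting1) + len(setting2) + len(settings3to5)
print(len(setting1), len(setting2), len(settings3to5), total)
```

Under this reading, the three groups contribute 12, 3, and 30 configurations, respectively.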

For each configuration, we conducted a fixed-effect meta-analysis based on inverse-variance weighted betas and standard errors using a modified version of PLINK 1.9 (https://github.com/mkanai/plink-ng/tree/add_se_meta).
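For reference, the inverse-variance weighted fixed-effect estimator for a single variant can be written in a few lines. This is a generic sketch of the standard estimator rather than the modified PLINK 1.9 implementation, and the per-cohort estimates are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis of one variant.

    Returns the pooled effect size, its standard error, and the z score.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    return beta, se, beta / se

# Hypothetical per-cohort estimates for a single variant.
beta, se, z = ivw_meta([0.10, 0.12, 0.08], [0.02, 0.03, 0.04])
print(f"beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}")
```

A variant that is missing from a cohort simply contributes no term to these sums, which is how per-variant effective sample sizes come to differ within a locus.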

Fine-mapping

For each meta-analysis, we defined fine-mapping regions based on a 1 Mb window around each genome-wide significant lead variant and applied ABF30,31 using a prior effect size variance of 0.04, a value that was taken from Wakefield et al.30 and is commonly used in meta-analysis fine-mapping studies.2,7 We computed posterior inclusion probability (PIP) and 95% credible set (CS) for each locus and evaluated whether true causal variants were correctly fine-mapped.
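The following sketch illustrates ABF fine-mapping under a single causal variant: a Wakefield-style approximate Bayes factor per variant, PIPs obtained by normalizing the Bayes factors within a locus, and a credible set built by accumulating PIPs in decreasing order. It assumes equal prior probability for every variant in the locus; the function names and the example summary statistics are hypothetical.

```python
import math

def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (alternative versus null) for one variant."""
    v = se ** 2
    shrink = prior_var / (v + prior_var)
    z = beta / se
    return 0.5 * math.log(1 - shrink) + 0.5 * z ** 2 * shrink

def pips_single_causal(betas, ses, prior_var=0.04):
    """PIPs assuming exactly one causal variant and equal prior weights."""
    labf = [log_abf(b, s, prior_var) for b, s in zip(betas, ses)]
    top = max(labf)
    weights = [math.exp(x - top) for x in labf]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]

def credible_set(pips, coverage=0.95):
    """Indices of the top-ranked variants whose PIPs sum to >= coverage."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += pips[i]
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical summary statistics for three variants in one locus.
pips = pips_single_causal([0.10, 0.05, 0.02], [0.02, 0.02, 0.02])
print([round(p, 4) for p in pips], credible_set(pips))
```

Because the PIPs are normalized to sum to 1 within a locus, a second independent causal variant cannot also receive a high PIP under this model.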

The SLALOM method

SLALOM takes GWAS summary statistics and external LD reference as input and predicts whether a locus is suspicious for fine-mapping. SLALOM consists of the following three steps:

Locus definition

Consistent with common fine-mapping region definition, we defined loci based on a 1 Mb window around each genome-wide significant lead variant and merged them if they overlapped. We excluded the major histocompatibility complex (MHC) region (chr 6: 25–36 Mb) from analysis due to extensive LD structure in the region.
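The locus definition can be sketched as an interval merge. This is a minimal sketch that assumes the 1 Mb window is centered on the lead variant (500 kb on each side), that lead variants are already given, and that a locus is dropped when it overlaps the MHC interval; these choices and the name `define_loci` are assumptions for illustration.

```python
MHC = ("6", 25_000_000, 36_000_000)  # chr 6: 25-36 Mb, excluded from analysis

def define_loci(leads, flank=500_000):
    """Merge overlapping windows around lead variants and drop MHC loci.

    leads: list of (chrom, pos) tuples for genome-wide significant lead variants.
    Returns a list of (chrom, start, end) tuples.
    """
    windows = sorted((c, max(0, p - flank), p + flank) for c, p in leads)
    merged = []
    for chrom, start, end in windows:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous window on the same chromosome: extend it.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return [
        (c, s, e) for c, s, e in merged
        if not (c == MHC[0] and s <= MHC[2] and e >= MHC[1])
    ]

loci = define_loci([("1", 1_000_000), ("1", 1_600_000), ("1", 5_000_000), ("6", 30_000_000)])
```

In this toy input, the first two chromosome 1 windows overlap and are merged into one locus, while the chromosome 6 window falls inside the MHC interval and is removed.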

DENTIST-S outlier detection

For each variant in a locus, we computed DENTIST-S statistics using Equation 1 based on the assumption of a single causal variant. DENTIST-S P-values (PDENTIST-S) were computed using the χ2 distribution with 1 degree of freedom. We applied ABF30,31 using a prior effect size variance of 0.04 and used the lead PIP variant (the variant with the highest PIP) as an approximation of the causal variant in the locus. To retrieve correlation r among the variants, we used publicly available LD matrices from gnomAD56 v2 as external LD reference for African, Admixed American, East Asian, Finnish, and non-Finnish European populations. When multiple populations exist, we computed a sample-size-weighted average of r2 using per-variant sample sizes for each population as previously suggested.80 We excluded variants without r2 available in gnomAD from the analysis. Since gnomAD v2 LD matrices are based on the human genome assembly GRCh37, variants were lifted over to GRCh38 if the input summary statistics were based on GRCh38.

We determined DENTIST-S outlier variants using two thresholds: 1) r2 > ρ to the lead and 2) PDENTIST-S < τ. The thresholds ρ and τ were set to ρ = 0.6 and τ = 1.0 × 10−4 based on the training in simulations as described below.
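The outlier test can be sketched as below. The sketch assumes that Equation 1 has the form T_i = (z_i - r_i * z_lead)^2 / (1 - r_i^2), where r_i is the correlation between variant i and the lead PIP variant; the P-value is the upper tail of a chi-square distribution with 1 degree of freedom, computed here with the identity P(X > t) = erfc(sqrt(t / 2)). Treating the lead variant and variants in perfect LD with it as non-outliers is a choice of this sketch, and the function names are hypothetical.

```python
import math

def dentist_s(z, r_to_lead, lead_idx):
    """Return (statistic, p-value) per variant, given z-scores and r to the lead."""
    z_lead = z[lead_idx]
    results = []
    for i, (zi, ri) in enumerate(zip(z, r_to_lead)):
        if i == lead_idx or ri * ri >= 1.0:
            results.append((0.0, 1.0))  # statistic undefined; treated as non-outlier
            continue
        t = (zi - ri * z_lead) ** 2 / (1.0 - ri * ri)
        p = math.erfc(math.sqrt(t / 2.0))  # chi-square (1 df) upper-tail probability
        results.append((t, p))
    return results

def outlier_indices(results, r_to_lead, rho=0.6, tau=1e-4):
    """Variants with r2 > rho to the lead and P < tau."""
    return [
        i for i, ((_, p), r) in enumerate(zip(results, r_to_lead))
        if r * r > rho and p < tau
    ]

z = [10.0, 8.0, 2.0]   # variant 0 is the lead PIP variant
r = [1.0, 0.8, 0.9]    # LD (r) with the lead variant
res = dentist_s(z, r, lead_idx=0)
flagged = outlier_indices(res, r)
```

Here variant 1 has exactly the z-score expected from its LD with the lead (0.8 × 10), so its statistic is zero, whereas variant 2 is in strong LD with the lead but has a much weaker association than expected and is flagged.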

Suspicious loci prediction

We predicted whether a locus is suspicious or non-suspicious for fine-mapping based on the number of DENTIST-S outlier variants in the locus > κ. To determine the best-performing thresholds (ρ, τ, and κ), we used loci with maximum PIP >0.9 in the simulations for training. Positive conditions were defined as whether a true causal variant in a locus is 1) a lead PIP variant, 2) in 95% CS, and 3) in 99% CS. We computed AUROC across different thresholds (ρ = 0, 0.1, 0.2, …, 0.9; –log10 τ = 0, 0.5, 1, …, 10; and κ = 0, 1, 2, …) and chose ρ = 0.6, τ = 1.0 × 10−4, and κ = 0 that showed the highest AUROC for all the aforementioned positive conditions. Using all the loci in the simulations, we then evaluated fine-mapping miscalibration (defined as mean PIP – fraction of true causal variants) at different PIP thresholds in suspicious and non-suspicious loci and decided to only apply SLALOM to loci with maximum PIP >0.1 owing to relatively lower miscalibration and specificity of SLALOM at lower PIP thresholds.
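The prediction rule and the miscalibration measure described above can be written compactly. This is a minimal sketch with hypothetical function names; returning `None` for loci with maximum PIP at or below 0.1 is a convention chosen for this example to indicate that SLALOM is not applied.

```python
def classify_locus(max_pip, n_outliers, kappa=0, min_max_pip=0.1):
    """Label a locus from its number of DENTIST-S outliers.

    Loci with maximum PIP <= min_max_pip are not classified (returns None).
    """
    if max_pip <= min_max_pip:
        return None
    return "suspicious" if n_outliers > kappa else "non-suspicious"

def miscalibration(pips, is_causal):
    """Mean PIP minus the fraction of variants that are truly causal."""
    n = len(pips)
    return sum(pips) / n - sum(is_causal) / n

# Four hypothetical high-PIP variants, three of which are truly causal.
gap = miscalibration([0.9, 0.95, 1.0, 0.95], [1, 1, 0, 1])
```

With κ = 0, a single outlier variant is enough to label a locus suspicious. A positive miscalibration value means that the PIPs overstate the probability of causality.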

GWAS catalog analysis

We retrieved full GWAS summary statistics publicly available on the GWAS Catalog.47 Out of 33,052 studies from 5,553 publications registered at the GWAS Catalog (as of January 12, 2022), we selected 467 studies from 96 publications that have 1) full harmonized summary statistics preprocessed by the GWAS Catalog with non-missing variant ID, marginal beta, and SE columns, 2) a discovery sample size of more than 10,000 individuals, 3) African (including African American, Afro-Caribbean, and Sub-Saharan African), admixed American (Hispanic and Latin American), East Asian, or European samples based on their broad ancestral category metadata, 4) at least one genome-wide significant association (p < 5.0 × 10−8), and 5) our manual annotation as a meta-analysis rather than a single-cohort study (Table S6). We applied SLALOM to the 467 summary statistics and identified 35,864 genome-wide significant loci (based on 1 Mb window around lead variants), of which 28,925 loci with maximum PIP >0.1 were further classified into suspicious and non-suspicious loci. Since per-variant sample sizes were not available, we used overall sample sizes of each ancestry (African, Admixed American, East Asian, and European) to calculate the weighted-average of r2. All the variants were harmonized into the human genome assembly GRCh38 by the GWAS Catalog.

GBMI analysis

We used meta-analysis summary statistics of 14 disease endpoints from the GBMI (Table S8). These meta-analyses were conducted using up to 1.8 million individuals across 18 biobanks for discovery, representing six different genetic ancestry groups (approximately 33,000 African, 18,000 Admixed American, 31,000 Central and South Asian, 341,000 East Asian, 1.4 million European, and 1,600 Middle Eastern individuals). Detailed procedures of the GBMI meta-analyses were described in the GBMI flagship publication.10

Across the 14 summary statistics, we used 489 out of 500 genome-wide significant loci (p < 5.0 × 10−8; 1 Mb window around each lead variant, as defined in the GBMI flagship publication10), excluding 11 loci that overlap with the MHC region. We applied SLALOM to 422 loci with maximum PIP >0.1 based on the ABF fine-mapping and predicted whether they were suspicious or non-suspicious for fine-mapping. We used per-variant sample sizes of each ancestry (African, Admixed American, East Asian, Finnish, and non-Finnish European) to calculate the weighted-average of r2. Since gnomAD LD matrices were not available for Central and South Asian and Middle Eastern, we did not use their sample sizes for the calculation. All the variants were processed on the human genome assembly GRCh38.
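The sample-size-weighted average of r2 can be sketched as follows. Populations without an available LD value simply do not contribute to the weights, mirroring the handling of Central and South Asian and Middle Eastern samples described above. The population labels and the name `weighted_r2` are assumptions for illustration.

```python
def weighted_r2(r2_by_pop, n_by_pop):
    """Sample-size-weighted average of r2 across populations with LD available.

    r2_by_pop: population label -> r2 (None or absent if LD is unavailable).
    n_by_pop:  population label -> per-variant sample size.
    """
    pops = [
        p for p, r2 in r2_by_pop.items()
        if r2 is not None and n_by_pop.get(p, 0) > 0
    ]
    total_n = sum(n_by_pop[p] for p in pops)
    if total_n == 0:
        return None  # no usable reference LD for this variant pair
    return sum(n_by_pop[p] * r2_by_pop[p] for p in pops) / total_n

# "csa" has samples but no reference LD, so it is ignored in the weighting.
avg = weighted_r2({"nfe": 0.8, "eas": 0.4}, {"nfe": 3000, "eas": 1000, "csa": 500})
```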

Fine-mapping results of complex traits and cis-eQTL

We retrieved our previous fine-mapping results for 1) complex traits in large-scale biobanks (BBJ,58 FinnGen,19 and UKBB18 Europeans)16 and 2) cis-eQTLs in GTEx59 v8 and eQTL Catalogue.60 Briefly, we conducted multiple-causal-variant fine-mapping (FINEMAP20,21 and SuSiE22) of complex trait GWAS (# unique traits = 148) and cis-eQTL gene expression (# unique tissues/cell-types = 69) using summary statistics and in-sample LD. Detailed fine-mapping methods are described elsewhere.16

In this study, we collected 1) high-PIP GWAS variants that achieved PIP >0.9 for any trait in any biobank and 2) high-PIP cis-eQTL variants that achieved PIP >0.9 for any gene expression in any tissue/cell-type. All the variants were originally processed on the human genome assembly GRCh37 and lifted over to GRCh38 for comparison.

Additional fine-mapping results

To compare with the GBMI meta-analyses, we additionally conducted multi-causal-variant fine-mapping of four additional endpoints (gout, heart failure, thyroid cancer, and venous thromboembolism) that were not fine-mapped in our previous study.16 We used exactly the same fine-mapping pipeline (FINEMAP20,21 and SuSiE22) as described previously.16 For UKBB Europeans, to use the exact same samples that contributed to the GBMI, we used individuals of European ancestry (n = 420,531) as defined in the Pan-UKBB project (https://pan.ukbb.broadinstitute.org), instead of those of “white British ancestry” (n = 361,194) used in our previous study.16

Enrichment analysis of likely causal variants

To validate SLALOM performance, we asked whether suspicious and non-suspicious loci were enriched for having likely causal variants as a lead PIP variant, and for containing them in the 95% and 99% CS. We defined likely causal variants using 1) nonsynonymous coding variants, i.e., pLoF and missense variants annotated93 by the Ensembl Variant Effect Predictor (VEP) v101 (using GRCh38 and GENCODE v35), 2) the high-PIP (>0.9) complex trait fine-mapped variants, and 3) the high-PIP (>0.9) cis-eQTL fine-mapped variants from our previous studies as described above.

We estimated enrichment for suspicious and non-suspicious loci as a relative risk (i.e., a ratio of proportion of variants) between being in suspicious/non-suspicious loci and having the annotated likely causal variants as a lead PIP variant (or containing them in the 95% or 99% CS). That is, a relative risk = (proportion of non-suspicious loci having the annotated variants as a lead PIP variant)/(proportion of suspicious loci having the annotated variants as a lead PIP variant). We computed 95% confidence intervals using bootstrapping.
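The relative risk and its bootstrap interval can be sketched as below. The text does not specify the bootstrap variant, so this sketch assumes a percentile bootstrap that resamples loci with replacement within each group; the function names, the number of replicates, and the toy data are all assumptions.

```python
import random

def relative_risk(nonsusp, susp):
    """Ratio of proportions: non-suspicious over suspicious loci.

    Each argument is a list of 0/1 flags, one per locus, indicating whether
    the lead PIP variant is an annotated likely causal variant.
    """
    return (sum(nonsusp) / len(nonsusp)) / (sum(susp) / len(susp))

def bootstrap_ci(nonsusp, susp, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the relative risk."""
    rng = random.Random(seed)
    rrs = []
    for _ in range(n_boot):
        a = rng.choices(nonsusp, k=len(nonsusp))  # resample loci with replacement
        b = rng.choices(susp, k=len(susp))
        if sum(b) == 0:
            continue  # relative risk undefined for this replicate
        rrs.append(relative_risk(a, b))
    rrs.sort()
    lower = rrs[int((alpha / 2) * len(rrs))]
    upper = rrs[int((1 - alpha / 2) * len(rrs)) - 1]
    return lower, upper

# Hypothetical data: 30% of non-suspicious vs. 10% of suspicious loci are annotated.
nonsusp = [1] * 30 + [0] * 70
susp = [1] * 10 + [0] * 90
rr = relative_risk(nonsusp, susp)
ci = bootstrap_ci(nonsusp, susp)
```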

Comparison of fine-mapping results between the GBMI and individual biobanks

To directly compare with fine-mapping results from the GBMI meta-analyses, we used our fine-mapping results of nine disease endpoints (asthma,64 COPD,64 gout, heart failure,73 IPF,62 primary open-angle glaucoma,74 thyroid cancer, stroke,75 and venous thromboembolism76) in BBJ,58 FinnGen,19 and UKBB18 Europeans that were also part of the GBMI meta-analyses for the same traits. For comparison, we computed the maximum PIP for each variant and the minimum size of 95% CS across BBJ, FinnGen, and UKBB. We restricted the 95% CS in biobanks to those that contain the lead variants from the GBMI. We defined the PIP difference between the GBMI and individual biobanks as ΔPIP = PIP (GBMI) – the maximum PIP across the biobanks.

We conducted functional enrichment analysis to compare the GBMI meta-analysis and individual biobanks because unbiased comparison of PIP requires conditioning on likely causal variants independent of the fine-mapping results, and functional annotations have been shown to be enriched for causal variants. Using functional categories (coding [pLoF, missense, and synonymous], 5′/3′ UTR, promoter, and CRE) from our previous study,16 we estimated functional enrichments of variants in each functional category based on 1) top PIP rankings and 2) ΔPIP bins. Since fine-mapping PIP in the GBMI meta-analysis can be miscalibrated, we performed a comparison based on top PIP rankings to assess whether the ordering given by GBMI PIPs is more informative than the ordering given by the biobanks. For the top PIP rankings, we took the top 0.5%, 0.1%, and 0.05% variants based on the PIP rankings in the GBMI and individual biobanks. We computed enrichment as a relative risk = (proportion of top X% PIP variants in the GBMI that are in the annotation)/(proportion of top X% PIP variants in the individual biobanks that are in the annotation). For ΔPIP bins, we defined three bins using different thresholds (θ = 0.01, 0.05, and 0.1): 1) decreased PIP bin, ΔPIP < –θ, 2) null bin, –θ ≤ ΔPIP ≤ θ, and 3) increased PIP bin, θ < ΔPIP. We computed enrichment as a relative risk = (proportion of variants in the decreased/increased PIP bin that are in the annotation)/(proportion of variants in the null PIP bin that are in the annotation). We combined coding, UTR, and promoter categories for this analysis due to the limited number of variants for each bin.
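The ΔPIP binning can be sketched as a small helper. The function name `delta_pip_bin` and the example values are hypothetical; the bin boundaries follow the definitions in the text.

```python
def delta_pip_bin(pip_gbmi, pips_biobanks, theta=0.05):
    """Assign a variant to a delta-PIP bin.

    delta = PIP (GBMI) - maximum PIP across the individual biobanks.
    """
    delta = pip_gbmi - max(pips_biobanks)
    if delta < -theta:
        return "decreased"
    if delta > theta:
        return "increased"
    return "null"  # -theta <= delta <= theta

# A variant with PIP 0.9 in the meta-analysis and at most 0.5 in any biobank.
label = delta_pip_bin(0.9, [0.2, 0.5, 0.1])
```

The per-bin enrichment is then the proportion of annotated variants in the decreased or increased bin divided by the corresponding proportion in the null bin.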

Quantification and statistical analysis

All statistical analyses were performed using R 4.0.3, Hail 0.2, and PLINK 1.9 and 2.0. All methodological details can be found in the method details, and all statistical tests are named as they are used.

Consortia

GBMI: Wei Zhou, Masahiro Kanai, Kuan-Han H. Wu, Humaira Rasheed, Kristin Tsuo, Jibril B. Hirbo, Ying Wang, Arjun Bhattacharya, Huiling Zhao, Shinichi Namba, Ida Surakka, Brooke N. Wolford, Valeria Lo Faro, Esteban A. Lopera-Maya, Kristi Läll, Marie-Julie Favé, Juulia J. Partanen, Sinéad B. Chapman, Juha Karjalainen, Mitja Kurki, Mutaamba Maasha, Ben M. Brumpton, Sameer Chavan, Tzu-Ting Chen, Michelle Daya, Yi Ding, Yen-Chen A. Feng, Lindsay A. Guare, Christopher R. Gignoux, Sarah E. Graham, Whitney E. Hornsby, Nathan Ingold, Said I. Ismail, Ruth Johnson, Triin Laisk, Kuang Lin, Jun Lv, Iona Y. Millwood, Sonia Moreno-Grau, Kisung Nam, Priit Palta, Anita Pandit, Michael H. Preuss, Chadi Saad, Shefali Setia-Verma, Unnur Thorsteinsdottir, Jasmina Uzunovic, Anurag Verma, Matthew Zawistowski, Xue Zhong, Nahla Afifi, Kawthar M. Al-Dabhani, Asma Al Thani, Yuki Bradford, Archie Campbell, Kristy Crooks, Geertruida H. de Bock, Scott M. Damrauer, Nicholas J. Douville, Sarah Finer, Lars G. Fritsche, Eleni Fthenou, Gilberto Gonzalez-Arroyo, Christopher J. Griffiths, Yu Guo, Karen A. Hunt, Alexander Ioannidis, Nomdo M. Jansonius, Takahiro Konuma, Ming Ta Michael Lee, Arturo Lopez-Pineda, Yuta Matsuda, Riccardo E. Marioni, Babak Moatamed, Marco A. Nava-Aguilar, Kensuke Numakura, Snehal Patil, Nicholas Rafaels, Anne Richmond, Agustin Rojas-Muñoz, Jonathan A. Shortt, Peter Straub, Ran Tao, Brett Vanderwerff, Manvi Vernekar, Yogasudha Veturi, Kathleen C. Barnes, Marike Boezen, Zhengming Chen, Chia-Yen Chen, Judy Cho, George Davey Smith, Hilary K. Finucane, Lude Franke, Eric R. Gamazon, Andrea Ganna, Tom R. Gaunt, Tian Ge, Hailiang Huang, Jennifer Huffman, Nicholas Katsanis, Jukka T. Koskela, Clara Lajonchere, Matthew H. Law, Liming Li, Cecilia M. Lindgren, Ruth J.F. Loos, Stuart MacGregor, Koichi Matsuda, Catherine M. Olsen, David J. Porteous, Jordan A. Shavit, Harold Snieder, Tomohiro Takano, Richard C. Trembath, Judith M. Vonk, David C. Whiteman, Stephen J. Wicks, Cisca Wijmenga, John Wright, Jie Zheng, Xiang Zhou, Philip Awadalla, Michael Boehnke, Carlos D. Bustamante, Nancy J. Cox, Segun Fatumo, Daniel H. Geschwind, Caroline Hayward, Kristian Hveem, Eimear E. Kenny, Seunggeun Lee, Yen-Feng Lin, Hamdi Mbarek, Reedik Mägi, Hilary C. Martin, Sarah E. Medland, Yukinori Okada, Aarno V. Palotie, Bogdan Pasaniuc, Daniel J. Rader, Marylyn D. Ritchie, Serena Sanna, Jordan W. Smoller, Kari Stefansson, David A. van Heel, Robin G. Walters, Sebastian Zöllner, Biobank of the Americas, Biobank Japan Project, BioMe, BioVU, CanPath - Ontario Health Study, China Kadoorie Biobank Collaborative Group, Colorado Center for Personalized Medicine, deCODE Genetics, Estonian Biobank, FinnGen, Generation Scotland, Genes & Health Research Team, LifeLines, Mass General Brigham Biobank, Michigan Genomics Initiative, National Biobank of Korea, Penn Medicine BioBank, Qatar Biobank, The QSkin Sun and Health Study, Taiwan Biobank, The HUNT Study, UCLA ATLAS Community Health Initiative, Uganda Genome Resource, UK Biobank, Alicia R. Martin, Cristen J. Willer, Mark J. Daly, Benjamin M. Neale. See the Supplemental PDF for consortium member affiliations.

Acknowledgments

We acknowledge all the participants and researchers of the 23 biobanks that have contributed to the GBMI. Biobank-specific acknowledgments are included in Data S3. We thank H. Huang, A.R. Martin, B.M. Neale, Y. Okada, K. Tsuo, J.C. Ulirsch, Y. Wang, and all the members of the Finucane and Daly labs for their helpful feedback. M.K. was supported by a Nakajima Foundation Fellowship and the Masason Foundation. H.K.F. was funded by NIH grant DP5 OD024582.

Author contributions

M.K., M.J.D., and H.K.F. designed the study. M.K., R.E., and W.Z. performed analyses. H.K.F. supervised this work. H.K.F. and M.K. obtained funding. M.K., R.E., M.J.D., and H.K.F. wrote the manuscript with input from all authors.

Declaration of interests

M.J.D. is a founder of Maze Therapeutics. All other authors declare no competing interests.

Published: November 3, 2022

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xgen.2022.100210.

Contributor Information

Masahiro Kanai, Email: mkanai@broadinstitute.org.

Hilary K. Finucane, Email: finucane@broadinstitute.org.


Supplemental information

Document S1. Figures S1–S17 and Data S1–S3
Table S1. Number of unrelated samples for simulated cohorts, related to STAR Methods
Table S2. Number of chromosome 3 variants in Illumina manifest and those extracted from 1000GP African, East Asian, and European populations, related to STAR Methods
Table S3. Number of imputed and QC-passing variants (MAF > 0.001 and Rsq > 0.6), related to STAR Methods
Table S4. List of configurations for meta-analysis simulation, related to STAR Methods
Table S5. SLALOM prediction in the simulations for loci from the most heterogeneous and homogeneous configurations, related to Figure 3
Table S6. List of studies used in the GWAS Catalog analysis, related to STAR Methods
Table S7. SLALOM prediction for the GWAS Catalog loci, related to Figure 4
Table S8. Overview of the GBMI meta-analyses, related to Figure 5 and STAR Methods
Table S9. SLALOM prediction for the GBMI loci, related to Figures 5 and 6
Document S2. TPR_Kanai
Document S3. Article plus supplemental information

Data and code availability

The GBMI summary statistics for the 14 endpoints are publicly available and are browsable at the GBMI PheWeb website (http://results.globalbiobankmeta.org/). Example outputs from the meta-analysis fine-mapping simulation pipeline have been deposited at Harvard Dataverse. All original code has been deposited at Zenodo and is publicly available as of the date of publication. DOIs and links are listed in the key resources table. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

References

  • 1.Evangelou E., Ioannidis J.P.A. Meta-analysis methods for genome-wide association studies and beyond. Nat. Rev. Genet. 2013;14:379–389. doi: 10.1038/nrg3472. [DOI] [PubMed] [Google Scholar]
  • 2.Mahajan A., Taliun D., Thurner M., Robertson N.R., Torres J.M., Rayner N.W., Payne A.J., Steinthorsdottir V., Scott R.A., Grarup N., et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 2018;50:1505–1513. doi: 10.1038/s41588-018-0241-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Spracklen C.N., Horikoshi M., Kim Y.J., Lin K., Bragg F., Moon S., Suzuki K., Tam C.H.T., Tabara Y., Kwak S.-H., et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature. 2020;582:240–245. doi: 10.1038/s41586-020-2263-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Schizophrenia Working Group of the Psychiatric Genomics Consortium. Neale B.M., Corvin A., Walters J.T.R., Farh K.-H., Holmans P.A., Lee P., Bulik-Sullivan B., Collier D.A., Huang H., et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511:421–427. doi: 10.1038/nature13595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Trubetskoy V., Pardiñas A.F., Qi T., Panagiotaropoulou G., Awasthi S., Bigdeli T.B., et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature. 2022;604:502–508. doi: 10.1038/s41586-022-04434-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Okada Y., Wu D., Trynka G., Raj T., Terao C., Ikari K., Kochi Y., Ohmura K., Suzuki A., Yoshida S., et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature. 2014;506:376–381. doi: 10.1038/nature12873. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Ishigaki K., Sakaue S., Terao C., Luo Y., Sonehara K., Yamaguchi K., et al. Trans-ancestry genome-wide association study identifies novel genetic mechanisms in rheumatoid arthritis. Preprint at medRxiv. 2021 doi: 10.1101/2021.12.01.21267132. [DOI] [Google Scholar]
  • 8.Locke A.E., Kahali B., Berndt S.I., Justice A.E., Pers T.H., Day F.R., Powell C., Vedantam S., Buchkovich M.L., Yang J., et al. Genetic studies of body mass index yield new insights for obesity biology. Nature. 2015;518:197–206. doi: 10.1038/nature14177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Graham S.E., Clarke S.L., Wu K.-H.H., Kanoni S., Zajac G.J.M., Ramdas S., Surakka I., Ntalla I., Vedantam S., Winkler T.W., et al. The power of genetic diversity in genome-wide association studies of lipids. Nature. 2021;600:675–679. doi: 10.1038/s41586-021-04064-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Zhou W., Kanai M., Wu K.-H.H., Rasheed H., Tsuo K., Hirbo J.B., et al. Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease. Cell Genomics. 2022;2 doi: 10.1016/j.xgen.2022.100192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Visscher P.M., Wray N.R., Zhang Q., Sklar P., McCarthy M.I., Brown M.A., Yang J. 10 Years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 2017;101:5–22. doi: 10.1016/j.ajhg.2017.06.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Shendure J., Findlay G.M., Snyder M.W. Genomic medicine-progress, pitfalls, and promise. Cell. 2019;177:45–57. doi: 10.1016/j.cell.2019.02.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Schaid D.J., Chen W., Larson N.B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 2018;19:491–504. doi: 10.1038/s41576-018-0016-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Ulirsch J.C., Lareau C.A., Bao E.L., Ludwig L.S., Guo M.H., Benner C., Satpathy A.T., Kartha V.K., Salem R.M., Hirschhorn J.N., et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 2019;51:683–693. doi: 10.1038/s41588-019-0362-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Weissbrod O., Hormozdiari F., Benner C., Cui R., Ulirsch J., Gazal S., Schoech A.P., van de Geijn B., Reshef Y., Márquez-Luna C., et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 2020;52:1355–1363. doi: 10.1038/s41588-020-00735-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Kanai M., Ulirsch J.C., Karjalainen J., Kurki M., Karczewski K.J., Fauman E., Wang Q.S., Jacobs H., Aguet F., Ardlie K.G., et al. Insights from complex trait fine-mapping across diverse populations. Preprint at medRxiv. 2021 doi: 10.1101/2021.09.03.21262975. [DOI] [Google Scholar]
  • 17.Nagai A., Hirata M., Kamatani Y., Muto K., Matsuda K., Kiyohara Y., Ninomiya T., Tamakoshi A., Yamagata Z., Mushiroda T., et al. Overview of the BioBank Japan project: study design and profile. J. Epidemiol. 2017;27:S2–S8. doi: 10.1016/j.je.2016.12.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Bycroft C., Freeman C., Petkova D., Band G., Elliott L.T., Sharp K., Motyer A., Vukcevic D., Delaneau O., O’Connell J., et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kurki M.I., Karjalainen J., Palta P., Sipilä T.P., Kristiansson K., Donner K., et al. FinnGen: unique genetic insights from combining isolated population and national health register data. Preprint at medRxiv. 2022 doi: 10.1101/2022.03.03.22271360. [DOI] [Google Scholar]
  • 20.Benner C., Spencer C.C.A., Havulinna A.S., Salomaa V., Ripatti S., Pirinen M. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics. 2016;32:1493–1501. doi: 10.1093/bioinformatics/btw018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Benner C., Havulinna A.S., Salomaa V., Ripatti S., Pirinen M. Refining fine-mapping: effect sizes and regional heritability. Preprint at bioRxiv. 2018 doi: 10.1101/318618. [DOI] [Google Scholar]
  • 22.Wang G., Sarkar A., Carbonetto P., Stephens M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. B. 2020;82:1273–1300. doi: 10.1111/rssb.12388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Onengut-Gumuscu S., Chen W.-M., Burren O., Cooper N.J., Quinlan A.R., Mychaleckyj J.C., Farber E., Bonnie J.K., Szpak M., Schofield E., et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat. Genet. 2015;47:381–386. doi: 10.1038/ng.3245. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Levey D.F., Stein M.B., Wendt F.R., Pathak G.A., Zhou H., Aslan M., Quaden R., Harrington K.M., Nuñez Y.Z., Overstreet C., et al. Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions. Nat. Neurosci. 2021;24:954–963. doi: 10.1038/s41593-021-00860-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Gharahkhani P., Jorgenson E., Hysi P., Khawaja A.P., Pendergrass S., Han X., Ong J.S., Hewitt A.W., Segrè A.V., Rouhana J.M., et al. Genome-wide meta-analysis identifies 127 open-angle glaucoma loci with consistent effect across ancestries. Nat. Commun. 2021;12:1258. doi: 10.1038/s41467-020-20851-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Chen J., Spracklen C.N., Marenne G., Varshney A., Corbin L.J., Luan J., Willems S.M., Wu Y., Zhang X., Horikoshi M., et al. The trans-ancestral genomic architecture of glycemic traits. Nat. Genet. 2021;53:840–860. doi: 10.1038/s41588-021-00852-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Zhou W., Brumpton B., Kabil O., Gudmundsson J., Thorleifsson G., Weinstock J., Zawistowski M., Nielsen J.B., Chaker L., Medici M., et al. GWAS of thyroid stimulating hormone highlights pleiotropic effects and inverse association with thyroid cancer. Nat. Commun. 2020;11:3981–4013. doi: 10.1038/s41467-020-17718-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Wightman D.P., Jansen I.E., Savage J.E., Shadrin A.A., Bahrami S., Holland D., Rongve A., Børte S., Winsvold B.S., Drange O.K., et al. A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer’s disease. Nat. Genet. 2021;53:1276–1282. doi: 10.1038/s41588-021-00921-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Chen M.-H., Raffield L.M., Mousas A., Sakaue S., Huffman J.E., Moscati A., Trivedi B., Jiang T., Akbari P., Vuckovic D., et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell. 2020;182:1198–1213.e14. doi: 10.1016/j.cell.2020.06.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 2007;81:208–227. doi: 10.1086/519024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Wakefield J. Bayes factors for genome-wide association studies: comparison with P-values. Genet. Epidemiol. 2009;33:79–86. doi: 10.1002/gepi.20359. [DOI] [PubMed] [Google Scholar]
  • 32.Hormozdiari F., Kostem E., Kang E.Y., Pasaniuc B., Eskin E. Identifying causal variants at loci with multiple signals of association. Genetics. 2014;198:497–508. doi: 10.1534/genetics.114.167908. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Kichaev G., Yang W.-Y., Lindstrom S., Hormozdiari F., Eskin E., Price A.L., Kraft P., Pasaniuc B. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014;10 doi: 10.1371/journal.pgen.1004722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Kichaev G., Pasaniuc B. Leveraging functional-annotation data in trans-ethnic fine-mapping studies. Am. J. Hum. Genet. 2015;97:260–271. doi: 10.1016/j.ajhg.2015.06.007.
  • 35. Li D., Zhao H., Gelernter J. Strong protective effect of the aldehyde dehydrogenase gene (ALDH2) 504lys (∗2) allele against alcoholism and alcohol-induced medical diseases in Asians. Hum. Genet. 2012;131:725–737. doi: 10.1007/s00439-011-1116-4.
  • 36. Brown B.C., Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye C.J., Price A.L., Zaitlen N. Transethnic genetic-correlation estimates from summary statistics. Am. J. Hum. Genet. 2016;99:76–88. doi: 10.1016/j.ajhg.2016.05.001.
  • 37. Shi H., Gazal S., Kanai M., Koch E.M., Schoech A.P., Siewert K.M., et al. Population-specific causal disease effect sizes in functionally important regions impacted by selection. Nat. Commun. 2021;12. doi: 10.1038/s41467-021-21286-1.
  • 38. COVID-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature. 2021;600:472–477. doi: 10.1038/s41586-021-03767-x.
  • 39. Dendrou C.A., Cortes A., Shipman L., Evans H.G., Attfield K.E., Jostins L., Barber T., Kaur G., Kuttikkatte S.B., Leach O.A., et al. Resolving TYK2 locus genotype-to-phenotype differences in autoimmunity. Sci. Transl. Med. 2016;8:363ra149. doi: 10.1126/scitranslmed.aag1974.
  • 40. Couturier N., Bucciarelli F., Nurtdinov R.N., Debouverie M., Lebrun-Frenay C., Defer G., Moreau T., Confavreux C., Vukusic S., Cournu-Rebeix I., et al. Tyrosine kinase 2 variant influences T lymphocyte polarization and multiple sclerosis susceptibility. Brain. 2011;134:693–703. doi: 10.1093/brain/awr010.
  • 41. Li Z., Gakovic M., Ragimbeau J., Eloranta M.-L., Rönnblom L., Michel F., Pellegrini S. Two rare disease-associated Tyk2 variants are catalytically impaired but signaling competent. J. Immunol. 2013;190:2335–2344. doi: 10.4049/jimmunol.1203118.
  • 42. Lam M., Awasthi S., Watson H.J., Goldstein J., Panagiotaropoulou G., Trubetskoy V., Karlsson R., Frei O., Fan C.-C., De Witte W., et al. RICOPILI: rapid imputation for COnsortias PIpeLIne. Bioinformatics. 2020;36:930–933. doi: 10.1093/bioinformatics/btz633.
  • 43. Huang H., Fang M., Jostins L., Umićević Mirkov M., Boucher G., Anderson C.A., Andersen V., Cleynen I., Cortes A., Crins F., et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature. 2017;547:173–178. doi: 10.1038/nature22969.
  • 44. Winkler T.W., Day F.R., Croteau-Chonka D.C., Wood A.R., Locke A.E., Mägi R., Ferreira T., Fall T., Graff M., Justice A.E., et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 2014;9:1192–1212. doi: 10.1038/nprot.2014.071.
  • 45. Chen W., Wu Y., Zheng Z., Qi T., Visscher P.M., Zhu Z., Yang J. Improved analyses of GWAS summary statistics by reducing data heterogeneity and errors. Nat. Commun. 2021;12:7117. doi: 10.1038/s41467-021-27438-7.
  • 46. Yang J., Ferreira T., Morris A.P., Medland S.E., Genetic Investigation of ANthropometric Traits (GIANT) Consortium, DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium, Madden P.A.F., Heath A.C., Martin N.G., Montgomery G.W., et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 2012;44:369–375, S1–S3. doi: 10.1038/ng.2213.
  • 47. Buniello A., MacArthur J.A.L., Cerezo M., Harris L.W., Hayhurst J., Malangone C., McMahon A., Morales J., Mountjoy E., Sollis E., et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. doi: 10.1093/nar/gky1120.
  • 48. 1000 Genomes Project Consortium, Auton A., Brooks L.D., Durbin R.M., Garrison E.P., Kang H.M., Korbel J.O., Marchini J.L., McCarthy S., McVean G.A., Abecasis G.R. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393.
  • 49. McCarthy S., Das S., Kretzschmar W., Delaneau O., Wood A.R., Teumer A., Kang H.M., Fuchsberger C., Danecek P., Sharp K., et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 2016;48:1279–1283. doi: 10.1038/ng.3643.
  • 50. Taliun D., Harris D.N., Kessler M.D., Carlson J., Szpiech Z.A., Torres R., Taliun S.A.G., Corvelo A., Gogarten S.M., Kang H.M., et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature. 2021;590:290–299. doi: 10.1038/s41586-021-03205-y.
  • 51. Schoech A.P., Jordan D.M., Loh P.-R., Gazal S., O’Connor L.J., Balick D.J., Palamara P.F., Finucane H.K., Sunyaev S.R., Price A.L. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat. Commun. 2019;10:790. doi: 10.1038/s41467-019-08424-6.
  • 52. Yang J., Manolio T.A., Pasquale L.R., Boerwinkle E., Caporaso N., Cunningham J.M., de Andrade M., Feenstra B., Feingold E., Hayes M.G., et al. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 2011;43:519–525. doi: 10.1038/ng.823.
  • 53. Ormond C., Ryan N.M., Corvin A., Heron E.A. Converting single nucleotide variants between genome builds: from cautionary tale to solution. Brief. Bioinform. 2021;22:bbab069. doi: 10.1093/bib/bbab069.
  • 54. Asimit J.L., Hatzikotoulas K., McCarthy M., Morris A.P., Zeggini E. Trans-ethnic study design approaches for fine-mapping. Eur. J. Hum. Genet. 2016;24:1330–1336. doi: 10.1038/ejhg.2016.1.
  • 55. Marchini J., Howie B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 2010;11:499–511. doi: 10.1038/nrg2796.
  • 56. Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P., et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7.
  • 57. Farh K.K.-H., Marson A., Zhu J., Kleinewietfeld M., Housley W.J., Beik S., Shoresh N., Whitton H., Ryan R.J.H., Shishkin A.A., et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015;518:337–343. doi: 10.1038/nature13835.
  • 58. Sakaue S., Kanai M., Tanigawa Y., Karjalainen J., Kurki M., Koshiba S., Narita A., Konuma T., Yamamoto K., Akiyama M., et al. A cross-population atlas of genetic associations for 220 human phenotypes. Nat. Genet. 2021;53:1415–1424. doi: 10.1038/s41588-021-00931-x.
  • 59. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369:1318–1330. doi: 10.1126/science.aaz1776.
  • 60. Kerimov N., Hayhurst J.D., Peikova K., Manning J.R., Walter P., Kolberg L., Samoviča M., Sakthivel M.P., Kuzmin I., Trevanion S.J., et al. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat. Genet. 2021;53:1290–1299. doi: 10.1038/s41588-021-00924-w.
  • 61. Koskela J.T., Häppölä P., Liu A., Partanen J., Genovese G., Artomov M., et al. Genetic variant in SPDL1 reveals novel mechanism linking pulmonary fibrosis risk and cancer protection. Preprint at medRxiv. 2021. doi: 10.1101/2021.05.07.21255988.
  • 62. Partanen J.J., Häppölä P., Zhou W., Lehisto A.A., Ainola M., Sutinen E., et al. Leveraging global multi-ancestry meta-analysis in the study of idiopathic pulmonary fibrosis genetics. Cell Genomics. 2022;2. doi: 10.1016/j.xgen.2022.100181.
  • 63. Foreman M.G., Wilson C., DeMeo D.L., Hersh C.P., Beaty T.H., Cho M.H., Ziniti J., Curran-Everett D., Criner G., Hokanson J.E., et al. Alpha-1 Antitrypsin PiMZ genotype is associated with chronic obstructive pulmonary disease in two racial groups. Ann. Am. Thorac. Soc. 2017;14:1280–1287. doi: 10.1513/AnnalsATS.201611-838OC.
  • 64. Tsuo K., Zhou W., Wang Y., Kanai M., Namba S., Gupta R., et al. Multi-ancestry meta-analysis of asthma identifies novel associations and highlights the value of increased power and diversity. Preprint at medRxiv. 2021. doi: 10.1101/2021.11.30.21267108.
  • 65. Benonisdottir S., Oddsson A., Helgason A., Kristjansson R.P., Sveinbjornsson G., Oskarsdottir A., Thorleifsson G., Davidsson O.B., Arnadottir G.A., Sulem G., et al. Epigenetic and genetic components of height regulation. Nat. Commun. 2016;7. doi: 10.1038/ncomms13490.
  • 66. Marouli E., Graff M., Medina-Gomez C., Lo K.S., Wood A.R., Kjaer T.R., Fine R.S., Lu Y., Schurmann C., Highland H.M., et al. Rare and low-frequency coding variants alter human adult height. Nature. 2017;542:186–190. doi: 10.1038/nature21039.
  • 67. Langefeld C.D., Ainsworth H.C., Cunninghame Graham D.S., Kelly J.A., Comeau M.E., Marion M.C., Howard T.D., Ramos P.S., Croker J.A., Morris D.L., et al. Transancestral mapping and genetic load in systemic lupus erythematosus. Nat. Commun. 2017;8. doi: 10.1038/ncomms16021.
  • 68. Hargreaves C.E., Rose-Zerilli M.J.J., Machado L.R., Iriyama C., Hollox E.J., Cragg M.S., Strefford J.C. Fcγ receptors: genetic variation, function, and disease. Immunol. Rev. 2015;268:6–24. doi: 10.1111/imr.12341.
  • 69. Franke L., el Bannoudi H., Jansen D.T.S.L., Kok K., Trynka G., Diogo D., Swertz M., Fransen K., Knevel R., Gutierrez-Achury J., et al. Association analysis of copy numbers of FC-gamma receptor genes for rheumatoid arthritis and other immune-mediated phenotypes. Eur. J. Hum. Genet. 2016;24:263–270. doi: 10.1038/ejhg.2015.95.
  • 70. UK10K Consortium, Walter K., Min J.L., Huang J., Crooks L., Memari Y., McCarthy S., Perry J.R.B., Xu C., Futema M., et al. The UK10K project identifies rare variants in health and disease. Nature. 2015;526:82–90. doi: 10.1038/nature14962.
  • 71. Wang Y., Namba S., Lopera-Maya E.A., Kerminen S., Tsuo K., Lall K., Kanai M., Zhou W., Wu K.-H.H., Fave M.-J., et al. Global biobank analyses provide lessons for computing polygenic risk scores across diverse cohorts. Preprint at medRxiv. 2021. doi: 10.1101/2021.11.19.21266436.
  • 72. Namba S., Konuma T., Wu K.-H., Zhou W., Okada Y., Global Biobank Meta-analysis Initiative. A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. Cell Genomics. 2022;2. doi: 10.1016/j.xgen.2022.100190.
  • 73. Wu K.-H.H., Douville N.J., Konerman M.C., Mathis M.R., Hummel S.L., Wolford B.N., et al. Polygenic risk score from a multi-ancestry GWAS uncovers susceptibility of heart failure. Preprint at medRxiv. 2021. doi: 10.1101/2021.12.06.21267389.
  • 74. Faro V.L., Bhattacharya A., Zhou W., Zhou D., Wang Y., Läll K., et al. Genome-wide association meta-analysis identifies novel ancestry-specific primary open-angle glaucoma loci and shared biology with vascular mechanisms and cell proliferation. Preprint at medRxiv. 2021. doi: 10.1101/2021.12.16.21267891.
  • 75. Surakka I., Wu K.-H., Hornsby W., Wolford B.N., Shen F., Zhou W., et al. Multi-ancestry meta-analysis identifies 2 novel loci associated with ischemic stroke and reveals heterogeneity of effects between sexes and ancestries. Preprint at medRxiv. 2022. doi: 10.1101/2022.02.28.22271647.
  • 76. Wolford B.N., Zhao Y., Surakka I., Wu K.-H.H., Yu X., Richter C.E., Bhatta L., Brumpton B., Desch K., Thibord F., et al. Multi-ancestry GWAS for venous thromboembolism identifies novel loci followed by experimental validation in zebrafish. Preprint at medRxiv. 2022. doi: 10.1101/2022.06.21.22276721.
  • 77. Aneas I., Decker D.C., Howard C.L., Sobreira D.R., Sakabe N.J., Blaine K.M., Stein M.M., Hrusch C.L., Montefiori L.E., Tena J., et al. Asthma-associated genetic variants induce IL33 differential expression through an enhancer-blocking regulatory region. Nat. Commun. 2021;12:6115. doi: 10.1038/s41467-021-26347-z.
  • 78. Vladich F.D., Brazille S.M., Stern D., Peck M.L., Ghittoni R., Vercelli D. IL-13 R130Q, a common variant associated with allergy and asthma, enhances effector mechanisms essential for human allergic inflammation. J. Clin. Invest. 2005;115:747–754. doi: 10.1172/JCI22818.
  • 79. All of Us Research Program Investigators, Denny J.C., Rutter J.L., Goldstein D.B., Philippakis A., Smoller J.W., Jenkins G., Dishman E. The “All of Us” Research Program. N. Engl. J. Med. 2019;381:668–676. doi: 10.1056/NEJMsr1809937.
  • 80. Wojcik G.L., Graff M., Nishimura K.K., Tao R., Haessler J., Gignoux C.R., Highland H.M., Patel Y.M., Sorokin E.P., Avery C.L., et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature. 2019;570:514–518. doi: 10.1038/s41586-019-1310-4.
  • 81. Luo Y., Kanai M., Choi W., Li X., Sakaue S., Yamamoto K., Ogawa K., Gutierrez-Arcelus M., Gregersen P.K., Stuart P.E., et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat. Genet. 2021;53:1504–1516. doi: 10.1038/s41588-021-00935-7.
  • 82. Sakaue S., Hosomichi K., Hirata J., Nakaoka H., Yamazaki K., Yawata M., Yawata N., Naito T., Umeno J., Kawaguchi T., et al. Decoding the diversity of killer immunoglobulin-like receptors by deep sequencing and a high-resolution imputation method. Cell Genomics. 2022;2:100101. doi: 10.1016/j.xgen.2022.100101.
  • 83. Mukamel R.E., Handsaker R.E., Sherman M.A., Barton A.R., Zheng Y., McCarroll S.A., Loh P.-R. Protein-coding repeat polymorphisms strongly shape diverse human phenotypes. Science. 2021;373:1499–1505. doi: 10.1126/science.abg8289.
  • 84. Mägi R., Horikoshi M., Sofer T., Mahajan A., Kitajima H., Franceschini N., McCarthy M.I., COGENT-Kidney Consortium, T2D-GENES Consortium, Morris A.P. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 2017;26:3639–3650. doi: 10.1093/hmg/ddx280.
  • 85. Su Z., Marchini J., Donnelly P. HAPGEN2: simulation of multiple disease SNPs. Bioinformatics. 2011;27:2304–2305. doi: 10.1093/bioinformatics/btr341.
  • 86. Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
  • 87. Das S., Forer L., Schönherr S., Sidore C., Locke A.E., Kwong A., Vrieze S.I., Chew E.Y., Levy S., McGue M., et al. Next-generation genotype imputation service and methods. Nat. Genet. 2016;48:1284–1287. doi: 10.1038/ng.3656.
  • 88. Manichaikul A., Mychaleckyj J.C., Rich S.S., Daly K., Sale M., Chen W.-M. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26:2867–2873. doi: 10.1093/bioinformatics/btq559.
  • 89. Wei X., Nielsen R. CCR5-Δ32 is deleterious in the homozygous state in humans. Nat. Med. 2019;25:909–910. doi: 10.1038/s41591-019-0459-6. [Retracted]
  • 90. Maier R., Akbari A., Wei X., Patterson N., Nielsen R., Reich D. No statistical evidence for an effect of CCR5-Δ32 on lifespan in the UK Biobank cohort. Nat. Med. 2020;26:178–180. doi: 10.1038/s41591-019-0710-1.
  • 91. Loh P.-R., Danecek P., Palamara P.F., Fuchsberger C., Reshef Y.A., Finucane H.K., Schoenherr S., Forer L., McCarthy S., Abecasis G.R., et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 2016;48:1443–1448. doi: 10.1038/ng.3679.
  • 92. Loh P.-R., Bhatia G., Gusev A., Finucane H.K., Bulik-Sullivan B.K., Pollack S.J., Schizophrenia Working Group of Psychiatric Genomics Consortium, de Candia T.R., Lee S.H., Wray N.R., et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 2015;47:1385–1392. doi: 10.1038/ng.3431.
  • 93. McLaren W., Gil L., Hunt S.E., Riat H.S., Ritchie G.R.S., Thormann A., Flicek P., Cunningham F. The Ensembl variant effect predictor. Genome Biol. 2016;17:122. doi: 10.1186/s13059-016-0974-4.

Associated Data


Supplementary Materials

Document S1. Figures S1–S17 and Data S1–S3
mmc1.pdf (2.7MB, pdf)
Table S1. Number of unrelated samples for simulated cohorts, related to STAR Methods
mmc2.xlsx (9.4KB, xlsx)
Table S2. Number of chromosome 3 variants in Illumina manifest and those extracted from 1000GP African, East Asian, and European populations, related to STAR Methods
mmc3.xlsx (9.3KB, xlsx)
Table S3. Number of imputed and QC-passing variants (MAF > 0.001 and Rsq > 0.6), related to STAR Methods
mmc4.xlsx (13.7KB, xlsx)
Table S4. List of configurations for meta-analysis simulation, related to STAR Methods
mmc5.xlsx (12.5KB, xlsx)
Table S5. SLALOM prediction in the simulations for loci from the most heterogeneous and homogeneous configurations, related to Figure 3
mmc6.xlsx (9.6KB, xlsx)
Table S6. List of studies used in the GWAS Catalog analysis, related to STAR Methods
mmc7.xlsx (106KB, xlsx)
Table S7. SLALOM prediction for the GWAS Catalog loci, related to Figure 4
mmc8.xlsx (3.1MB, xlsx)
Table S8. Overview of the GBMI meta-analyses, related to Figure 5 and STAR Methods
mmc9.xlsx (11.7KB, xlsx)
Table S9. SLALOM prediction for the GBMI loci, related to Figures 5 and 6
mmc10.xlsx (55.1KB, xlsx)
Document S2. TPR_Kanai
mmc11.pdf (1.5MB, pdf)
Document S3. Article plus supplemental information
mmc12.pdf (5.2MB, pdf)

Data Availability Statement

The GBMI summary statistics for the 14 endpoints are publicly available and browsable at the GBMI PheWeb website (http://results.globalbiobankmeta.org/). Example outputs from the meta-analysis fine-mapping simulation pipeline have been deposited at Harvard Dataverse. All original code has been deposited at Zenodo and is publicly available as of the date of publication. DOIs and links are listed in the key resources table. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from Cell Genomics are provided here courtesy of Elsevier
