Scientific Reports. 2025 Nov 4;15:38529. doi: 10.1038/s41598-025-19644-w

Calibrating genome wide significance by minor allele frequency across three major populations

Sandeep Chowdary Vejandla 1, Archish Sadeesh 2, Vinodh Srinivasasainagendra 1, Mary Appah 1, Hemant K Tiwari 1
PMCID: PMC12586462  PMID: 41188338

Abstract

Conventional genome-wide association study (GWAS) thresholds, notably 5 × 10⁻⁸, were established under assumptions that may not hold across diverse populations and whole-genome sequencing (WGS) analyses. Given the complex linkage disequilibrium structure of the human genome, a single fixed threshold risks inadequate type I error control. Here, we sought to derive minor allele frequency (MAF)-specific, population-tailored significance thresholds using the Li-Ji method across European, African, and Asian cohorts from the 1000 Genomes Project. We partitioned the genome into natural linkage disequilibrium (LD) blocks defined by the LDetect database and applied rigorous quality control measures before generating LD matrices. Using the Li-Ji method, we estimated the effective number of independent tests for each block across six MAF thresholds and then aggregated the results for each population. The resulting effective tests were used to calculate Bonferroni-adjusted significance thresholds. Our analysis revealed that for common variants (MAF ≥ 0.05), the significance thresholds in European and Asian populations were somewhat lower than the conventional 5 × 10⁻⁸ benchmark, whereas the African population required considerably more stringent corrections. The inclusion of rarer variants further increased the effective number of independent tests across all groups, thereby shifting the significance thresholds to levels even more stringent than the 5 × 10⁻⁸ benchmark. By applying the Li-Ji method, this study establishes that MAF-specific and population-specific significance thresholds provide a more accurate framework for GWAS analyses. Our findings suggest that the conventional 5 × 10⁻⁸ threshold may be suboptimal, particularly when evaluating rare variants or diverse populations, with important implications for future biobank-scale and precision genomic research.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-19644-w.

Keywords: Genome wide association studies (GWAS), Significance threshold, Linkage disequilibrium (LD), Li-Ji method, Population-specific analysis, 1000 genomes project

Introduction

The emergence of genome-wide association studies (GWAS) has transformed our comprehension of complex traits and diseases through its capacity to analyze millions of genetic variants simultaneously1,2. However, this high-throughput approach introduces significant statistical challenges, particularly in controlling the family-wise error rate (FWER) while maintaining adequate statistical power3,4.

The widely accepted GWAS significance threshold of 5 × 10⁻⁸ has been a subject of considerable debate in recent years5. This threshold was originally established based on Bonferroni correction, assuming approximately one million independent tests6, but has been increasingly scrutinized as potentially inappropriate. This scrutiny is particularly relevant given the availability of whole-genome sequencing (WGS) data and the extensive linkage disequilibrium (LD) structure of the human genome7, which creates complex patterns of correlation between genetic variants. This correlation structure means that the effective number of independent tests is substantially lower than the total number of variants tested8, suggesting that the conventional threshold may be overly conservative in some cases and potentially inadequate in others.

The inadequacy of a uniform threshold becomes particularly evident when considering population-specific differences in LD structure. Extensive research has demonstrated that LD block structures vary substantially among populations, even those from the same geographic region9. African populations consistently exhibit shorter LD blocks and lower overall LD compared to non-African populations, reflecting their greater genetic diversity and longer evolutionary history10. For instance, Wall and Pritchard10 reported that in genome-wide analyses, less than half of the total sequence was contained in identified haplotype blocks in African populations, whereas European and East Asian populations showed substantially higher block coverage and more extensive LD. These differences likely stem from demographic events, particularly the population bottleneck associated with the expansion of modern humans out of Africa, which resulted in longer-range LD in non-African populations9. Such population-specific variation in LD architecture suggests that a single genome-wide significance threshold derived from one population may be inappropriate for others, necessitating population-specific approaches to multiple testing correction.

Several approaches have been proposed to address this challenge. Gao et al.11 developed the SimpleM method, an approach based on principal component analysis (PCA) to calculate the effective number of independent tests, building upon earlier work on population stratification correction12. Li and Ji13 developed an alternative method based on the eigenvalues of a correlation matrix, offering a theoretical framework that potentially provides more accurate estimates while remaining computationally tractable for large-scale analyses. Hendricks et al. extensively compared these methodologies in the context of gene-based analyses, highlighting the relative strengths of different approaches for handling correlated genetic variants, and importantly, recommended partitioning the genome into smaller units to calculate the effective number of independent tests for each unit, which would make estimates more accurate14.

Modern GWAS faces additional challenges due to increasing study sizes and the incorporation of millions of imputed variants, as well as whole-genome sequencing (WGS) studies15,16. The traditional significance threshold has been questioned, particularly in the context of dense SNP data and resequencing studies17, with some researchers advocating for more stringent thresholds11. Various statistical approaches have been developed to address these challenges, including gene-based tests18 and advanced statistical methods19 to account for multiple comparisons while considering the unique characteristics of genomic data.

The emergence of large-scale genomic resources, particularly the 1000 Genomes Project20, has provided an opportunity to reassess these methodological approaches across diverse populations. Studies of metabolic traits in founder populations21 and large-scale disease association studies1 have demonstrated the importance of population-specific considerations in GWAS analysis. The complex interplay among factors influencing allelic association4 necessitates careful consideration of multiple testing approaches.

In this study, we apply the Li and Ji (Li-Ji) method to address the challenges associated with the conventional 5 × 10⁻⁸ significance threshold, which may not be appropriate across different populations and minor allele frequency (MAF) ranges. By creating LD matrices for distinct genomic regions defined by natural recombination boundaries, we apply the Li and Ji method to each region and aggregate the results across the genome. This approach allows us to account for the complex LD structure of the human genome, providing a more accurate estimation of the number of independent tests while maintaining computational efficiency. Additionally, we implement the SimpleM method11 for comparison, providing both theoretical and empirical evidence to guide the choice between these two approaches for estimating the effective number of independent tests in genomic studies.

This study contributes to the ongoing refinement of multiple testing correction methods in genomic research, offering both theoretical insights and practical guidance for setting appropriate significance thresholds in modern GWAS. Our results have important implications for balancing type I error control with statistical power across different population groups and variant frequency spectra, addressing key challenges identified in contemporary genetic epidemiology research6,7.

To empirically demonstrate the impact of our population-specific thresholds, we analyzed associations from the NHGRI-EBI GWAS Catalog22 across three major populations. This real-world application revealed that while the conventional 5 × 10⁻⁸ threshold is reasonably appropriate for European populations (with only 2.17% of associations failing to meet Li-Ji thresholds), it is markedly liberal for African populations, where 12.96% of reported associations would not reach significance under appropriate multiple testing correction. These findings underscore the urgent need for population-specific significance thresholds in the era of global genomic studies.

Materials and methods

Overview of multiple testing correction methods

In our study, we considered two major approaches for determining the effective number of independent tests in GWAS: the SimpleM method developed by Gao et al.11 and the Li-Ji method13. Both approaches aim to account for the correlation structure among genetic variants, but they differ fundamentally in their theoretical foundations and implementation.

The SimpleM method

The SimpleM method, introduced by Gao et al.11, uses principal component analysis (PCA) to estimate the effective number of independent tests.

This approach involves calculating the correlation matrix for the loci under investigation, followed by performing eigenvalue decomposition of this matrix. This methodology identifies the quantity of principal components required to account for a specified proportion (c) of the overall variance and subsequently uses this number as the effective number of independent tests.

The effective number ($M_{\mathrm{eff}}$) is calculated as:

$$M_{\mathrm{eff}} = \min\left\{ k : \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{M} \lambda_j} \ge c \right\}$$

where $\lambda_i$ are the eigenvalues of the correlation matrix sorted in decreasing order, k is the minimal number of eigenvalues whose cumulative sum reaches proportion c, j indexes all M eigenvalues in the correlation matrix, and c is typically set to 0.995.
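
The SimpleM calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `simplem_meff`, the clipping of slightly negative eigenvalues, and the small numerical tolerance are assumptions added here for robustness.

```python
import numpy as np

def simplem_meff(corr, c=0.995):
    """Smallest k such that the k largest eigenvalues of `corr`
    account for at least a fraction c of the total variance."""
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order
    eigvals = np.linalg.eigvalsh(corr)
    # Assumption: clip tiny negative values caused by rounding, then sort descending
    eigvals = np.clip(eigvals, 0.0, None)[::-1]
    cum_frac = np.cumsum(eigvals) / eigvals.sum()
    # Assumption: 1e-12 tolerance so exact ties with c are not lost to rounding
    return int(np.searchsorted(cum_frac, c - 1e-12) + 1)

if __name__ == "__main__":
    print(simplem_meff(np.eye(200)))        # independent variants
    print(simplem_meff(np.ones((50, 50))))  # perfectly correlated variants
```

For an identity matrix the function returns a value just below M, and for a matrix of all ones it returns 1, matching the behaviour expected from the definition.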

The Li-Ji method and its advantages

Li and Ji13 developed a method that improves upon Nyholt's earlier approach. Nyholt's method used $M_{\mathrm{eff}} = 1 + (M - 1)\left(1 - \frac{\mathrm{Var}(\lambda)}{M}\right)$, where M is the total number of eigenvalues and $\mathrm{Var}(\lambda)$ is the variance of the eigenvalues of the correlation matrix. Li and Ji demonstrated that this formula could overestimate the effective number of tests in certain scenarios.

The Li-Ji method calculates $M_{\mathrm{eff}}$ by separating each eigenvalue's contribution into two components:

$$M_{\mathrm{eff}} = \sum_{i=1}^{M} f(|\lambda_i|)$$

where:

$$f(x) = I(x \ge 1) + (x - \lfloor x \rfloor), \quad x \ge 0$$

Here $\lfloor x \rfloor$ denotes the floor function (greatest integer less than or equal to x), and I(x ≥ 1) is the indicator function that equals 1 when x ≥ 1 and 0 otherwise.

The formula decomposes each eigenvalue's contribution into two distinct components: an integer component and a fractional component. The integer component, $I(\lambda_i \ge 1)$, counts each eigenvalue greater than or equal to 1 as one completely independent test, while the fractional component, $\lambda_i - \lfloor \lambda_i \rfloor$, captures partial correlations by adding the decimal portion of the eigenvalue. For instance, an eigenvalue of 2.7 would contribute 1.7 to $M_{\mathrm{eff}}$ (1 from the integer component plus 0.7 from the fractional component), while an eigenvalue of 0.3 would contribute only its fractional component (0.3) to $M_{\mathrm{eff}}$.

This approach more accurately reflects the true correlation structure as it accounts for both strong and weak correlations through eigenvalue decomposition, avoids overcounting partially correlated tests, and provides more accurate estimates at both extremes of correlation (complete independence and complete dependence).
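
As an illustration of the Li-Ji calculation, the following NumPy sketch applies the indicator-plus-fractional-part rule to a set of eigenvalues. It is not the authors' R or C++ code. The function names are hypothetical, and the rounding of eigenvalues to 8 decimals is an assumption added here: the fractional part jumps at integer values, so an eigenvalue computed as 49.999999999 instead of 50 would otherwise be miscounted.

```python
import numpy as np

def li_ji_from_eigenvalues(eigvals, decimals=8):
    """Sum of I(|lambda| >= 1) + (|lambda| - floor(|lambda|)) over all eigenvalues."""
    lam = np.abs(np.asarray(eigvals, dtype=float))
    # Assumption: round away floating-point noise before taking the floor
    lam = np.round(lam, decimals)
    integer_part = (lam >= 1).astype(float)  # 1 for each eigenvalue >= 1
    fractional_part = lam - np.floor(lam)    # decimal portion of each eigenvalue
    return float(np.sum(integer_part + fractional_part))

def li_ji_meff(corr, decimals=8):
    """Effective number of tests for a symmetric correlation (LD) matrix."""
    return li_ji_from_eigenvalues(np.linalg.eigvalsh(corr), decimals)

if __name__ == "__main__":
    # The worked example from the text: 2.7 contributes 1.7, 0.3 contributes 0.3
    print(li_ji_from_eigenvalues([2.7, 0.3]))
    print(li_ji_meff(np.eye(10)))
```

In practice `corr` would be the LD matrix of one block; the toy inputs here only exercise the arithmetic.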

Theoretical comparison of methods

To illustrate the theoretical advantages of the Li-Ji method, we compared its performance against SimpleM across three idealized scenarios representing different extremes of linkage disequilibrium (LD). We considered a hypothetical $M \times M$ correlation matrix for M genetic variants.

  1. Scenario of Perfect Independence (No LD): In this case, the correlation matrix is the $M \times M$ identity matrix (all off-diagonal elements are 0). The eigenvalues are all equal to 1.
    • The Li-Ji method correctly calculates $M_{\mathrm{eff}} = M$, aligning with the intuition that if all variants are independent, the number of effective tests equals the total number of variants.
    • In contrast, SimpleM, using the standard c = 0.995 threshold, counts the number of components needed to explain 99.5% of the variance, yielding $M_{\mathrm{eff}} \approx 0.995M$. This slightly underestimates the true number of independent tests, but is still close.
  2. Scenario of Perfect Correlation (Perfect LD): Here, all off-diagonal elements of the correlation matrix are 1. This represents a region with a single independent signal. The eigenvalues are $\lambda_1 = M$ and $\lambda_2 = \dots = \lambda_M = 0$.
    • In this scenario, both the Li-Ji method and SimpleM correctly converge to the intuitive answer of $M_{\mathrm{eff}} = 1$, demonstrating that both methods can identify a single source of variation.
  3. Scenario of Uniform Moderate Correlation: This is the case where all off-diagonal elements are 1/2. The eigenvalues are $\lambda_1 = (M+1)/2$ and $\lambda_2 = \dots = \lambda_M = 1/2$.
    • The Li-Ji method yields
      $$M_{\mathrm{eff}} = \left[ I\!\left(\tfrac{M+1}{2} \ge 1\right) + \left(\tfrac{M+1}{2} - \left\lfloor \tfrac{M+1}{2} \right\rfloor\right) \right] + (M-1)\cdot\tfrac{1}{2}$$
      It follows that $M_{\mathrm{eff}} \approx M/2$ (exactly $(M+1)/2$ for odd M and $M/2 + 1$ for even M). This result aligns with the intuitive expectation that a moderate, uniform correlation across all variants should reduce the number of effective tests by about half.
    • SimpleM, however, requires including nearly all of the principal components to capture 99.5% of the total variance, resulting in $M_{\mathrm{eff}} \approx 0.99M$, significantly overestimating the number of truly independent tests.
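
The SimpleM figure for the moderate-correlation scenario can be checked with a short derivation. It relies on the standard result that an $M \times M$ matrix with unit diagonal and all off-diagonal entries equal to 1/2 has one eigenvalue $(M+1)/2$ and $M-1$ eigenvalues equal to 1/2, which sum to M. The fraction of variance explained by the k largest eigenvalues is then

$$\frac{\tfrac{M+1}{2} + (k-1)\cdot\tfrac{1}{2}}{M} = \frac{M+k}{2M}.$$

Requiring this fraction to reach c = 0.995 gives

$$\frac{M+k}{2M} \ge 0.995 \;\Longrightarrow\; k \ge 1.99M - M = 0.99M,$$

so SimpleM retains about 99% of the components even though every pair of variants is correlated.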

This theoretical analysis reveals a fundamental distinction between the two methods' approaches to independence. The Li-Ji method directly quantifies independence through eigenvalue decomposition, asking "how many independent tests are present?" In contrast, SimpleM's PCA-based approach asks "how many components explain 99.5% of variance?", a subtly but importantly different question. This distinction becomes particularly evident in the moderate correlation scenario (all off-diagonal correlations equal to 0.5), where Li-Ji correctly identifies that uniform moderate correlation reduces independent tests by half ($M_{\mathrm{eff}} \approx M/2$), while SimpleM's variance-focused approach captures nearly all components ($M_{\mathrm{eff}} \approx 0.99M$), effectively treating moderately correlated tests as independent.
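
The three idealized scenarios can also be reproduced numerically. The sketch below is illustrative only: the helper names, the block size M = 201, the eigenvalue rounding in `li_ji`, and the tolerance in `simplem` are assumptions made here, not details from the study.

```python
import numpy as np

def uniform_corr(m, rho):
    """m x m correlation matrix with every off-diagonal entry equal to rho."""
    return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))

def li_ji(corr, decimals=8):
    # Assumption: rounding guards against floating-point noise near integers
    lam = np.round(np.abs(np.linalg.eigvalsh(corr)), decimals)
    return float(np.sum((lam >= 1) + (lam - np.floor(lam))))

def simplem(corr, c=0.995):
    # Eigenvalues in descending order, tiny negatives clipped to zero
    lam = np.clip(np.linalg.eigvalsh(corr), 0.0, None)[::-1]
    cum_frac = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(cum_frac, c - 1e-12) + 1)

M = 201  # hypothetical number of variants in one block
scenarios = {"independent": 0.0, "moderate": 0.5, "perfect": 1.0}
results = {
    name: (li_ji(uniform_corr(M, rho)), simplem(uniform_corr(M, rho)))
    for name, rho in scenarios.items()
}

for name, (lj, sm) in results.items():
    print(f"{name:12s} Li-Ji = {lj:6.1f}   SimpleM = {sm}")
```

With these settings the moderate scenario gives a Li-Ji estimate of about half of M, while SimpleM stays close to M, which is the contrast the text describes.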

Based on these theoretical considerations, we selected the Li-Ji method for our primary analyses, as it provides a more direct and intuitive quantification of independence in genomic correlation structures. To empirically validate this theoretical prediction, we computed effective test estimates using both methods across all populations and MAF thresholds, with the comparative results presented in the "Comparison of Meff Using Li-Ji and SimpleM Methods" section of our Results.

Study data and population-specific LD blocks

We utilized data from the 1000 Genomes Project20, which serves as a comprehensive public catalog of human genetic variation, containing genotype data from 2,504 individuals across 26 populations. This resource has been historically used as a reference panel for imputation in genome-wide association studies. Our initial analysis included populations from three major continental groups: African (AFR, n = 661), European (EUR, n = 503), and Asian populations (comprising East Asian [EAS, n = 504] and South Asian [SAS, n = 489] groups), with detailed population descriptions provided in Supplementary Table 1.

Following Hendricks et al.’s recommendation to partition the genome into smaller units for more accurate estimation of effective independent tests14, we employed the LDetect database23, which partitions the genome based on population-specific recombination boundaries derived from the 1000 Genomes Project data. LDetect software24 employs a dynamic programming algorithm to identify chromosomal segments with minimal inter-block linkage disequilibrium, effectively partitioning the genome into regions of relatively independent inheritance patterns. These boundaries represent natural recombination hotspots and areas of historical recombination events specific to each population group.

The LDetect database provided distinct block definitions for three major population groups: 1,703 LD regions for European (EUR), 2,581 LD regions for African (AFR), and 1,445 LD regions for Asian populations. Notably, in the LDetect database, East Asian (EAS) and South Asian (SAS) populations are combined when defining LD blocks, reflecting shared ancestral patterns of linkage disequilibrium. To maintain consistency with these predefined block structures, we merged the EAS and SAS populations from the 1000 Genomes Project into a combined Asian cohort of 993 individuals for our analysis. The complete block definitions and their corresponding genomic coordinates are provided in Supplementary Tables 2–4.

To understand how LD block patterns differ by population, we compared the number and size of LD blocks across African, European, and Asian groups (Supplementary Table 25, Table A). Although all populations were analyzed using the same 2,796.01 Mb of genomic sequence, we found notable differences in LD structure. African populations had the highest number of LD blocks (2,581), with a smaller average size of 1,082.47 kb (median: 989.73 kb). In contrast, European populations had 1,703 blocks with a larger average size of 1,641.82 kb (median: 1,521.89 kb), and Asian populations had the fewest blocks (1,445) but the largest average size at 1,934.96 kb (median: 1,802.36 kb). When adjusted for genomic length, this corresponds to 92.4 LD blocks per 100 Mb in African populations, compared to 60.9 in Europeans and 51.7 in Asians. These results show that African populations have more, but smaller, LD blocks, which reflects shorter-range LD and a higher frequency of historical recombination events. This distinct LD architecture supports the need for more stringent genome-wide significance thresholds in African populations.

To directly visualize the overlap and divergence of LD block boundaries across populations, we examined chromosome 22 as a representative example (Supplementary Fig. 1). This visualization reveals striking population-specific patterns: African populations exhibit substantially more LD blocks (represented by more frequent boundary lines) compared to European and Asian populations, consistent with our genome-wide findings. Notably, the red lines connecting shared block boundaries across populations are sparse, indicating limited overlap in LD architecture. The majority of block boundaries are population-specific, with African populations showing the most unique boundaries. This minimal overlap in LD block boundaries between populations, particularly between African and non-African groups, provides compelling visual evidence for why a single genome-wide significance threshold derived from one population cannot adequately control type I error in another.

Results

Population-specific variant counts from the 1000 genomes project

Initial variant counts from the 1000 Genomes Project data revealed substantial population-specific differences across MAF thresholds as catalogued in Table 1. We examined variants across six MAF thresholds: 0.05 (common variants), 0.02, 0.01, 0.005, 0.001, and 0.0001 (rare variants). For common variants (MAF ≥ 0.05), we observed comparable counts in Europeans (6.85M) and Asians (6.86M), while Africans showed higher diversity with 9.49M variants. The inclusion of rarer variants led to a dramatic increase in counts across all populations, reaching 23.77M in Europeans, 41.50M in Africans, and 39.04M in Asians at MAF ≥ 0.0001. To facilitate subsequent discussion, Table 1 also includes variant counts from the All of Us Research Program25,26.

Table 1.

Initial variant counts across populations before LD matrix generation.

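| Ancestry | MAF threshold | 1000G variants | All of Us variants |
|---|---|---|---|
| European | ≥ 0.05 | 6,850,409 | 8,722,459 |
| European | ≥ 0.02 | 8,406,987 | 11,076,336 |
| European | ≥ 0.01 | 9,591,188 | 12,962,241 |
| European | ≥ 0.005 | 10,877,867 | 15,042,570 |
| European | ≥ 0.001 | 14,777,444 | 22,051,776 |
| European | ≥ 0.0001 | 23,774,609 | 36,004,625 |
| African | ≥ 0.05 | 9,487,879 | 12,031,351 |
| African | ≥ 0.02 | 13,409,422 | 17,180,889 |
| African | ≥ 0.01 | 16,421,933 | 21,335,307 |
| African | ≥ 0.005 | 19,863,626 | 25,754,887 |
| African | ≥ 0.001 | 28,741,563 | 38,671,508 |
| African | ≥ 0.0001 | 41,495,707 | 49,224,132 |
| Asian | ≥ 0.05 | 6,859,109 | 8,788,011 |
| Asian | ≥ 0.02 | 8,352,031 | 10,972,836 |
| Asian | ≥ 0.01 | 9,713,338 | 12,917,452 |
| Asian | ≥ 0.005 | 11,605,272 | 15,085,483 |
| Asian | ≥ 0.001 | 21,822,290 | 19,984,133 |
| Asian | ≥ 0.0001 | 39,043,678 | 32,349,561 |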

LD matrix generation and quality control

We generated LD matrices for three major population groups: European (EUR), African (AFR), and Asian (ASN). The process involved several key steps to ensure data quality. First, we established LD block definitions using predefined natural recombination boundaries obtained from the LDetect database as reported in Supplementary Tables 2–4. These boundaries were population-specific, comprising 1,703 LD regions for EUR, 2,581 LD regions for AFR, and 1,445 LD regions for ASN populations.

For each population and MAF threshold, we implemented a rigorous quality control (QC) pipeline to ensure the generation of valid LD matrices suitable for the Li-Ji method. This multi-step process was performed using PLINK 2.0 software. First, we excluded a small number of variants that fell outside the predefined population-specific LDetect LD regions. Next, we applied two crucial filters: we retained only variants with 100% genotype availability (i.e., 0% missingness) to prevent undefined values in the correlation matrices, which would have made the Li-Ji method's eigenvalue decomposition impossible. We subsequently removed variants showing significant deviation from Hardy–Weinberg Equilibrium (HWE p-value < 1 × 10⁻⁶), which serves dual purposes: preventing undefined values in correlation matrices and ensuring that our significance thresholds are calculated based on variants that would be retained in well-conducted GWAS analyses, as variants severely deviating from HWE often indicate technical artifacts, population substructure, or genotyping errors. A detailed schematic of this filtering pipeline is presented in Fig. 1, which, using the European population at MAF ≥ 0.01 as a representative example, shows that these combined QC steps removed a total of 1.43% of variants. The complete variant counts before and after each QC step for all populations and MAF thresholds are provided in Supplementary Table 5. For the European cohort, the total percentage of variants removed by this pipeline ranged from 0.84 to 1.69% across the different MAF thresholds analyzed. By applying these stringent QC measures, we ensured complete and robust analysis across all LD blocks for the subsequent calculation of effective independent tests.

Fig. 1.

Variant Filtering Pipeline: European Population (MAF ≥ 0.01). This flowchart illustrates the multi-step quality control (QC) process applied to the genomic data, using the European population at a MAF threshold of ≥ 0.01 as a representative example. The pipeline shows the sequential removal of variants that fall outside predefined LDetect LD regions, fail the genotype missingness filter (requiring a 100% call rate), or deviate from Hardy–Weinberg Equilibrium (HWE p-value < 1 × 10⁻⁶). The summary statistics box quantifies the total number of variants removed at each stage and as a cumulative percentage of the initial count.

To optimize computational efficiency, we implemented a pre-processing step to remove variants with identical genotypes (i.e., those in perfect LD, r² = 1) from each LD block before applying the Li-Ji method. While the Li-Ji method’s eigenvalue decomposition inherently accounts for perfect correlations, removing these redundant variants significantly reduces the dimensionality of the LD matrices. This step improved memory management and reduced computational time, particularly in regions with high linkage disequilibrium, without altering the final calculation of effective tests.
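
The de-duplication step can be illustrated with a small NumPy sketch. It assumes a complete samples-by-variants dosage matrix coded 0/1/2 with no missing values, and it removes only columns that are exactly identical. Variant pairs with r² = 1 but flipped allele coding would not be caught by this simple version. The function name and toy matrix are hypothetical.

```python
import numpy as np

def drop_identical_variants(genotypes):
    """Keep the first occurrence of each distinct genotype column.

    genotypes: (n_samples, n_variants) array of 0/1/2 dosages.
    Returns the reduced matrix and the indices of the retained columns.
    """
    # np.unique along axis=1 treats each column as one item;
    # return_index gives the first position of every distinct column
    _, first_idx = np.unique(genotypes, axis=1, return_index=True)
    keep = np.sort(first_idx)  # restore original genomic order
    return genotypes[:, keep], keep

if __name__ == "__main__":
    toy = np.array([[0, 0, 1, 2],
                    [1, 1, 0, 2],
                    [2, 2, 1, 0]])  # columns 0 and 1 are identical
    reduced, kept = drop_identical_variants(toy)
    print(kept, reduced.shape)
```

The retained column indices make it possible to map block-level results back to the original variant list.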

Application of the Li-Ji method

For each population and MAF threshold, we implemented a systematic approach that involved generating population-specific LD matrices for each predefined LD block and applying the Li-Ji method to calculate the effective number of independent tests within each block. The method generated region-specific counts of total variants, unique variants, and effective SNPs, along with their corresponding genomic coordinates defined by the LDetect database. These detailed, block-level results are provided in Supplementary Tables 6–11 (European), 12–17 (African), and 18–23 (Asian), with each table containing separate worksheets for the six MAF thresholds analyzed. We then aggregated results across all blocks to obtain genome-wide estimates and calculated corresponding significance thresholds using Bonferroni correction based on the effective number of tests.
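
The aggregation and Bonferroni step amounts to summing the block-level estimates and dividing the family-wise error rate by the total. The sketch below assumes α = 0.05, which is inferred from the reported thresholds rather than stated explicitly. The function name and the toy block values are hypothetical.

```python
def genome_wide_threshold(block_meff, alpha=0.05):
    """Sum per-block effective tests and return (total, alpha / total)."""
    total_meff = float(sum(block_meff))
    return total_meff, alpha / total_meff

# Hypothetical per-block Li-Ji estimates for three LD blocks
toy_blocks = [412.6, 958.3, 127.1]
toy_total, toy_threshold = genome_wide_threshold(toy_blocks)

# Check against the genome-wide European total reported in Table 2 (MAF >= 0.05)
_, eur_threshold = genome_wide_threshold([1_252_299])
print(f"{eur_threshold:.2e}")  # 3.99e-08
```

Applying the same division to any other row of Tables 2–4 reproduces the corresponding significance threshold.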

This approach maintained computational efficiency while accounting for the complex LD structure specific to each population group. Our quality control procedures ensured complete genome coverage with no LD blocks dropped from the analysis. The unaggregated results in Supplementary Tables 6–23 provide transparency in our methodology and serve as a valuable resource for researchers interested in region-specific patterns of genetic variation and linkage disequilibrium.

Computational requirements

The computational demands of our approach scaled with the complexity of population-specific LD structures. Each unique combination of population, MAF threshold, and LD block constituted an independent computational job. We processed 34,374 independent computational jobs across all populations: 10,218 for Europeans (1,703 LD regions × 6 MAF thresholds), 15,486 for Africans (2,581 LD regions × 6 MAF thresholds), and 8,670 for Asians (1,445 LD regions × 6 MAF thresholds).
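
The job counts follow directly from the number of LD regions per population and the six MAF thresholds, as this short check shows (the variable names are illustrative):

```python
# LD regions per population, as reported from the LDetect database
ld_regions = {"EUR": 1703, "AFR": 2581, "ASN": 1445}
n_maf_thresholds = 6

# One job per (population, MAF threshold, LD block) combination
jobs = {pop: n * n_maf_thresholds for pop, n in ld_regions.items()}
total_jobs = sum(jobs.values())

print(jobs)        # {'EUR': 10218, 'AFR': 15486, 'ASN': 8670}
print(total_jobs)  # 34374
```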

Execution times varied substantially with block size and variant density. Asian populations showed the highest computational demands (median: 0:02:16, mean: 0:07:37, maximum: 1:33:56; format hh:mm:ss), followed by Africans (median: 0:01:53, mean: 0:03:55, maximum: 1:02:20) and Europeans (median: 0:01:24, mean: 0:02:38, maximum: 0:40:53). The minimum execution time was 1 s across all populations.

Memory requirements were stratified by LD matrix size: 30 GB for smaller matrices (~200 MB), 70 GB for moderate matrices, and up to 150 GB for the largest matrices (13–20 GB) under the R implementation. For approximately 55 large LD matrices exceeding 20 GB (50 in the ASN population and 5 in the AFR population), primarily covering ultra-rare variants (MAF ≥ 0.0001 and a small number of MAF ≥ 0.001), the R code was unable to complete due to its internal memory management limits despite increased allocations (150–300 GB). We therefore implemented the Li-Ji procedure in C++, which processed these large matrices using 50–85 GB of memory and completed in 10–30 min each, including a 65 GB matrix that finished in ~30 min. By comparison, R required ~60–93 min for 13–20 GB matrices. Jobs were processed in parallel on a high-performance computing cluster, with memory-intensive jobs scheduled separately to optimize resource utilization. The complete analysis across all populations and MAF thresholds required approximately three weeks of computation time, with the most demanding jobs (large LD blocks with rare variants) contributing disproportionately to the total runtime.

Population-wise analysis of effective independent tests

We analyzed the effective number of independent tests across three major population groups from the 1000 Genomes Project data: European (EUR), African (AFR), and Asian (ASN). Our analysis revealed substantial variation in variant counts and effective tests across populations and MAF thresholds, as detailed in Tables 2–4. In these tables, the ‘Total Variants’ column represents the number of variants remaining in each LD block after applying our initial quality control filters for missingness and HWE. The ‘Unique Variants’ column shows the reduced count after we removed variants with identical genotypes (r² = 1) as a computational optimization step. This pre-processing step accounts for the notable difference between the ‘Total’ and ‘Unique’ variant counts.

Table 2.

Population-specific analysis of effective independent tests in Europeans.

| MAF threshold | LD regions | Total variants† | Unique variants‡ | Effective tests | Significance threshold |
|---|---|---|---|---|---|
| ≥ 0.05 | 1,703 | 6,734,755 | 3,770,097 | 1,252,299 | 3.99 × 10⁻⁸ |
| ≥ 0.02 | 1,703 | 8,278,073 | 4,590,040 | 1,704,858 | 2.93 × 10⁻⁸ |
| ≥ 0.01 | 1,703 | 9,453,678 | 5,298,013 | 2,158,129 | 2.32 × 10⁻⁸ |
| ≥ 0.005 | 1,703 | 10,731,889 | 6,113,484 | 2,723,317 | 1.84 × 10⁻⁸ |
| ≥ 0.001 | 1,703 | 14,607,301 | 8,252,877 | 4,243,875 | 1.18 × 10⁻⁸ |
| ≥ 0.0001 | 1,703 | 23,574,329 | 9,086,266 | 4,724,202 | 1.06 × 10⁻⁸ |

† Total Variants: Represents variants remaining after initial quality control (0% missingness and HWE p-value > 1 × 10⁻⁶).

‡ Unique Variants: Represents variants remaining after the removal of variants with identical genotypes (those in perfect LD, r² = 1). This step was performed to improve computational efficiency and does not alter the final Effective Tests calculation.

Table 4.

Population-specific analysis of effective independent tests in Asians.

| MAF threshold | LD regions | Total variants† | Unique variants‡ | Effective tests | Significance threshold |
|---|---|---|---|---|---|
| ≥ 0.05 | 1,445 | 6,473,690 | 4,406,213 | 1,297,958 | 3.85 × 10⁻⁸ |
| ≥ 0.02 | 1,445 | 7,950,124 | 5,233,335 | 1,724,586 | 2.9 × 10⁻⁸ |
| ≥ 0.01 | 1,445 | 9,300,298 | 6,110,520 | 2,299,072 | 2.17 × 10⁻⁸ |
| ≥ 0.005 | 1,445 | 11,179,173 | 7,477,923 | 3,262,059 | 1.53 × 10⁻⁸ |
| ≥ 0.001 | 1,445 | 21,330,945 | 14,232,210 | 7,944,079 | 6.29 × 10⁻⁹ |
| ≥ 0.0001 | 1,445 | 38,512,816 | 15,640,343 | 8,564,601 | 5.84 × 10⁻⁹ |

† Total Variants: Represents variants remaining after initial quality control (0% missingness and HWE p-value > 1 × 10⁻⁶).

‡ Unique Variants: Represents variants remaining after the removal of variants with identical genotypes (those in perfect LD, r² = 1). This step was performed to improve computational efficiency and does not alter the final Effective Tests calculation.

The reduction from the initial 1000 Genomes variant counts (Table 1) to the post-QC 'Total Variants' (Tables 2–4) is fully explained by our filtering pipeline (Fig. 1, Supplementary Table 5). For Europeans and Africans, this reduction was minimal across all MAF thresholds (0.84–1.69% and 0.95–1.83%, respectively). The filtering steps contributed differently: variants falling outside LDetect LD regions accounted for negligible losses (< 0.01% in most cases), while the missingness filter (requiring 100% call rate) and Hardy–Weinberg filter (p > 1 × 10⁻⁶) accounted for the majority of removed variants. This systematic quality control ensured valid LD matrix generation while retaining over 98% of variants in European and African populations.

For the Asian population, we combined East Asian (EAS) and South Asian (SAS) groups in accordance with the LDetect database, which provides LD blocks for a single combined Asian panel. Using this framework, all 1,445 LDetect LD regions were analyzed at every MAF cutoff. Losses attributable to restricting variants to LDetect LD regions were negligible across thresholds (e.g., at MAF ≥ 0.0001, 229 of 39,043,678 initial variants, or ~0.0006%, fell outside), indicating near-complete coverage of both common and rare variants within the defined blocks. After applying standard genotype and Hardy–Weinberg equilibrium filters, overall variant retention remained high, ranging from 94.38% at MAF ≥ 0.05 to 98.64% at MAF ≥ 0.0001, with modest increases in retention as lower MAF thresholds introduced larger variant sets (Supplementary Table 5).

In the European population, we analyzed 1,703 LD regions across different MAF thresholds as shown in Table 2. Our computational optimization strategy of removing variants with identical genotypes prior to eigenvalue decomposition reduced the input matrix dimensions by approximately 44% from 6.73 million to 3.77 million variants for common variants (MAF ≥ 0.05), significantly improving computational efficiency while maintaining the integrity of the Li-Ji method. Application of the Li-Ji method revealed an effective number of approximately 1.25 million independent tests, yielding a significance threshold of 3.99 × 10⁻⁸. This threshold became increasingly stringent as we included rarer variants, reaching 1.06 × 10⁻⁸ for variants with MAF ≥ 0.0001, where the effective number of independent tests increased to approximately 4.72 million.
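The deduplication step described above can be illustrated with a minimal sketch. It assumes genotypes are stored as a samples-by-variants matrix of 0/1/2 allele counts and that "identical genotypes" means exactly equal columns; the function name `drop_identical_variants` and the toy matrix are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np

def drop_identical_variants(genotypes: np.ndarray) -> np.ndarray:
    """Keep one representative per set of variants with identical genotype vectors.

    genotypes: (n_samples, n_variants) array of 0/1/2 allele counts (assumed layout).
    """
    # np.unique along axis=1 collapses identical columns;
    # return_index gives the first occurrence of each distinct column.
    _, first_idx = np.unique(genotypes, axis=1, return_index=True)
    keep = np.sort(first_idx)  # preserve the original variant order
    return genotypes[:, keep]

# Toy block: 5 samples, 4 variants; columns 0 and 2 are identical.
G = np.array([
    [0, 1, 0, 2],
    [1, 0, 1, 2],
    [2, 1, 2, 0],
    [0, 0, 0, 1],
    [1, 2, 1, 0],
])
G_unique = drop_identical_variants(G)
reduction = 1 - G_unique.shape[1] / G.shape[1]
print(f"{G.shape[1]} -> {G_unique.shape[1]} variants ({reduction:.0%} reduction)")
```

This sketch only catches exactly equal columns. Variants coded with the opposite reference allele also have r² = 1 but would need an extra check, which is omitted here.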

In the African population, we analyzed 2,581 LD regions across different MAF thresholds as shown in Table 3. For common variants (MAF ≥ 0.05), our analysis began with approximately 9.31 million total variants. After removing variants with identical genotypes prior to eigenvalue decomposition, the input matrix dimensions were reduced by approximately 28% to 6.75 million variants, improving computational efficiency. Application of the Li-Ji method yielded 2.73 million effective independent tests, resulting in a significance threshold of 1.83 × 10⁻⁸. As we included rarer variants (MAF ≥ 0.0001), the number of effective independent tests increased to 9.94 million, with a correspondingly more stringent threshold of 5.03 × 10⁻⁹.

Table 3.

Population-specific analysis of effective independent tests in Africans.

MAF threshold LD regions Total variants† Unique variants‡ Effective tests Significance threshold
 ≥ 0.05 2,581 9,314,426 6,749,067 2,732,064 1.83 × 10⁻⁸
 ≥ 0.02 2,581 13,195,980 9,137,756 3,841,503 1.3 × 10⁻⁸
 ≥ 0.01 2,581 16,185,933 10,900,363 4,780,474 1.05 × 10⁻⁸
 ≥ 0.005 2,581 19,595,401 12,815,104 5,949,424 8.4 × 10⁻⁹
 ≥ 0.001 2,581 28,394,895 17,190,122 8,923,178 5.6 × 10⁻⁹
 ≥ 0.0001 2,581 41,101,700 18,805,078 9,941,476 5.03 × 10⁻⁹

Total Variants: Represents variants remaining after initial quality control (0% missingness and HWE p-value > 1 × 10⁻⁶).

Unique Variants: Represents variants remaining after the removal of variants with identical genotypes (those in perfect LD, r² = 1). This step was performed to improve computational efficiency and does not alter the final Effective Tests calculation.
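The significance thresholds in Table 3 are consistent with a Bonferroni correction of the form α / Meff. The short sketch below assumes a family-wise α of 0.05, which is an inference from the tabulated values rather than a parameter stated in this section, and recomputes the thresholds from the effective-test counts copied from Table 3.

```python
ALPHA = 0.05  # assumed family-wise error rate; it reproduces the tabulated thresholds

# Effective tests (Li-Ji) for the African population, copied from Table 3.
meff_afr = {
    "0.05": 2_732_064,
    "0.02": 3_841_503,
    "0.01": 4_780_474,
    "0.005": 5_949_424,
    "0.001": 8_923_178,
    "0.0001": 9_941_476,
}

# Bonferroni-style threshold: alpha divided by the effective number of tests.
thresholds = {maf: ALPHA / meff for maf, meff in meff_afr.items()}

for maf, thr in thresholds.items():
    print(f"MAF >= {maf}: {thr:.2e}")
```

The same calculation applies to the European and Asian tables by swapping in their effective-test counts.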

In the Asian population, following the previously described combination of East Asian (EAS) and South Asian (SAS) sub-populations, we analyzed 1,445 LD regions across different MAF thresholds as shown in Table 4. For common variants (MAF ≥ 0.05), our analysis began with approximately 6.47 million total variants. After removing variants with identical genotypes prior to eigenvalue decomposition, the input matrix dimensions were reduced by approximately 32% to 4.41 million variants. Application of the Li-Ji method yielded 1.30 million effective independent tests, resulting in a significance threshold of 3.85 × 10⁻⁸. Including rarer variants (MAF ≥ 0.0001) increased the number of effective independent tests to 8.56 million, with a correspondingly more stringent threshold of 5.84 × 10⁻⁹.

Figure 2 provides a consolidated visualization of our findings, demonstrating the inadequacy of a fixed 5 × 10⁻⁸ significance threshold across different populations and MAF ranges. Panel A illustrates the substantial variation in both unique variant counts and the effective number of independent tests across the three populations. As expected, the number of unique variants and effective tests increases as rarer variants are included. Notably, the African population consistently shows the highest number of effective tests at every MAF threshold, reflecting its greater genetic diversity. Panel B plots the corresponding Li-Ji derived significance thresholds against the conventional 5 × 10⁻⁸ benchmark (dotted line). A clear pattern emerges: while the conventional threshold is reasonably appropriate for common variants (MAF ≥ 0.05) in European (3.99 × 10⁻⁸) and Asian (3.85 × 10⁻⁸) populations, it is already too liberal for the African population (1.83 × 10⁻⁸) at this same threshold. For all populations, the conventional threshold becomes increasingly liberal as rarer variants are considered. This is most pronounced in the African population, where at MAF ≥ 0.01, the required significance threshold is 1.05 × 10⁻⁸, nearly five times more stringent than the 5 × 10⁻⁸ standard.

Fig. 2.

Population-specific MAF analysis: variant counts and significance thresholds. This figure consolidates the primary findings across all three populations. (A) The upper panel displays the number of unique variants and the effective number of independent tests at each of the six minor allele frequency (MAF) thresholds. It highlights the greater genetic diversity in the African population, which consistently has the highest counts. (B) The lower panel plots the corresponding Li-Ji derived significance thresholds for each population against the conventional 5 × 10⁻⁸ threshold (dotted line). This panel visually demonstrates that the required thresholds become more stringent as rarer variants are included and differ substantially by population, with the African cohort requiring the most stringent correction.

Note for the Asian population in Panel B: there is a pronounced drop in the Li–Ji threshold between MAF ≥ 0.005 and ≥ 0.001. This pattern reflects the use of a combined EAS + SAS panel in LDetect. As the MAF cutoff is lowered, many subgroup-specific variants enter the analysis, substantially increasing the number of variants captured within LD regions and enlarging the LD matrices. The resulting rise in the effective number of independent tests yields a more stringent (lower) Li–Ji threshold.

Collectively, these results provide compelling visual evidence that population-specific and MAF-dependent thresholds are essential for accurate type I error control in modern GWAS.

Comparison of Meff using Li-Ji and SimpleM methods

To validate our methodological choice empirically, we calculated the effective number of independent tests (Meff) using both the Li-Ji and SimpleM methods across all populations and MAF thresholds. As shown in Supplementary Table 24, the SimpleM method consistently yielded substantially higher Meff estimates compared to the Li-Ji method, with the difference representing 20.7% to 26.2% of the total unique variants analyzed.

The magnitude of this difference varied by both population and MAF threshold. In the European population, SimpleM’s overestimation represented 22.9% of unique variants for common variants (MAF ≥ 0.05), calculating 2.12 million effective tests compared to Li-Ji’s 1.25 million. This proportion remained relatively stable across MAF thresholds, ranging from 21.1 to 22.9%, though the absolute difference increased substantially with rarer variants (reaching 2.02 million additional tests at MAF ≥ 0.0001).

The African population showed the most pronounced differences, with SimpleM's overestimation representing 24.4% to 26.2% of unique variants across MAF thresholds. At MAF ≥ 0.05, SimpleM estimated 4.38 million effective tests compared to Li-Ji’s 2.73 million (a difference of 24.4% of the 6.75 million unique variants). Similarly, the Asian population exhibited consistent overestimation, with differences ranging from 20.7 to 23.4% of unique variants, translating to nearly a million additional tests at MAF ≥ 0.05 (2.27 million vs. 1.30 million).

These empirical findings strongly support our theoretical analysis, demonstrating that SimpleM, by focusing on variance explained rather than independence, systematically overestimates the number of truly independent tests by approximately one-fifth to one-quarter of all unique variants tested. The consistent pattern across all populations and MAF thresholds reinforces our decision to use the Li-Ji method, which provides more conservative and theoretically grounded estimates of the multiple testing burden. Had we used SimpleM, the resulting significance thresholds would have been unnecessarily stringent, potentially reducing power to detect true associations without meaningful improvement in type I error control.

Population differences in LD structure and allele frequency distributions

To comprehensively quantify population differences underlying the need for ancestry-specific significance thresholds, we examined both linkage disequilibrium structure and allele frequency distributions across populations.

Analysis of LD strength and variant density within blocks for common variants (MAF ≥ 0.05) using the Li-Ji method revealed substantial population-specific patterns (Supplementary Table 25, Table B). African populations showed substantially weaker LD with an average LD ratio of 0.294 (ratio of effective SNPs to raw variants), compared to 0.187 in Europeans and 0.201 in Asians. This higher ratio indicates that variants within African LD blocks are less correlated, resulting in more effective independent tests. Additionally, variant density within LD blocks was highest in Africans at 3.61 variants per kb, compared to 2.60 in Europeans and 2.52 in Asians.

Pairwise population comparisons further revealed the magnitude of these structural differences (Supplementary Table 25, Table C). African populations exhibited 1.57-fold weaker LD and 1.38-fold higher variant density compared to Europeans. In contrast, Asian and European populations showed remarkably similar LD characteristics, with LD strength and variant density ratios near unity (1.07 and 0.97, respectively). The Asian-African comparison yielded intermediate values, with Asians showing stronger LD (ratio 0.68) and lower variant density (ratio 0.70) relative to Africans.
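The pairwise ratios quoted above can be approximately recovered from the rounded per-population values reported in the preceding paragraph. The sketch below is only a consistency check on those reported numbers; the dictionary names and the `fold` helper are illustrative. Because the LD ratio is effective tests divided by raw variants, a higher value means weaker LD, so "1.57-fold weaker LD" corresponds to the African LD ratio divided by the European one.

```python
# Values reported in the text for common variants (MAF >= 0.05).
ld_ratio = {"AFR": 0.294, "EUR": 0.187, "ASN": 0.201}  # effective tests / raw variants
density = {"AFR": 3.61, "EUR": 2.60, "ASN": 2.52}      # variants per kb within LD blocks

def fold(metric: dict, a: str, b: str) -> float:
    """Ratio of population a to population b for the given metric."""
    return metric[a] / metric[b]

for a, b in [("AFR", "EUR"), ("ASN", "EUR"), ("ASN", "AFR")]:
    print(f"{a}/{b}: LD ratio {fold(ld_ratio, a, b):.2f}, "
          f"density {fold(density, a, b):.2f}")
```

Recomputing from rounded inputs matches the reported ratios to within rounding. The African-to-European density ratio comes out near 1.39 rather than the reported 1.38, presumably because the published figure was derived from unrounded values.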

Complementing these LD structural differences, we observed striking divergence in allele frequency distributions across populations (Supplementary Fig. 2). European and Asian populations demonstrated the highest MAF correlations (mean r² = 0.69, range: 0.565–0.795), reflecting their shared demographic history and more recent divergence. In marked contrast, African populations showed substantially lower correlations with both European (mean r² = 0.38, range: 0.22–0.544) and Asian populations (mean r² = 0.40, range: 0.238–0.562). Notably, these correlations decreased systematically as we moved from rare to common variants, with negative slopes of −0.119 for EUR-AFR, −0.118 for AFR-ASN, and −0.079 for EUR-ASN pairs. This pattern indicates that common variants, precisely those used to establish the conventional 5 × 10⁻⁸ threshold, show the greatest frequency divergence across populations, with African-European correlations dropping to just 0.22 for variants with MAF ≥ 0.05.

Together, these quantitative measures of genetic architecture divergence—combining weaker LD, higher variant density, and low allele frequency correlations—directly explain why African populations require the most stringent significance threshold (1.83 × 10⁻⁸ for MAF ≥ 0.05) compared to Europeans (3.99 × 10⁻⁸) and Asians (3.85 × 10⁻⁸). The convergence of evidence from both LD structure and allele frequency distributions provides compelling justification for population-specific and MAF-dependent significance thresholds in modern genomic studies.

Discussion

We presented a comprehensive analysis of population-specific GWAS significance thresholds using the Li-Ji method, demonstrating substantial variation across different populations and MAF ranges. Our work provides strong evidence that the conventional 5 × 10⁻⁸ threshold may be inappropriate for modern genomic studies, particularly when analyzing rare variants or diverse populations. The population-specific thresholds we calculated offer more accurate type I error control while maintaining adequate statistical power.

The application of our methodology to the 1000 Genomes Project data revealed important insights about the effective number of independent tests across populations. For common variants (MAF ≥ 0.05), we found that the European population required approximately 1.25 million effective tests, yielding a significance threshold of 3.99 × 10⁻⁸, which is relatively close to the conventional threshold. Notably, European thresholds remained within the 10⁻⁸ order of magnitude across all MAF categories, ranging from 3.99 × 10⁻⁸ for common variants to 1.06 × 10⁻⁸ for ultra-rare variants. In stark contrast, non-European populations demonstrated markedly different patterns that diverged substantially from the conventional 5 × 10⁻⁸ threshold. The African population, with its greater genetic diversity, required 2.73 million effective tests even for common variants, resulting in a threshold of 1.83 × 10⁻⁸. More dramatically, as we included rarer variants, African thresholds crossed into the 10⁻⁹ order of magnitude, dropping to 5.6 × 10⁻⁹ for rare variants (MAF ≥ 0.001; Table 3) and 5.03 × 10⁻⁹ for ultra-rare variants (MAF ≥ 0.0001). Similarly, the Asian population showed comparable departure from convention, with thresholds of 6.29 × 10⁻⁹ and 5.84 × 10⁻⁹ for rare and ultra-rare variants, respectively.

To validate the practical implications of these population-specific thresholds, we analyzed 59,807 genome-wide significant associations from the GWAS Catalog [22], comprising 1,278 European studies, 78 African studies, and 357 Asian studies (Supplementary Table 27). This empirical analysis revealed striking population differences: while only 2.17% of European associations (834 of 38,384) failed to meet our Li-Ji thresholds, a substantial 12.96% of African associations (32 of 247) would not achieve genome-wide significance under appropriate correction. Asian populations showed intermediate results, with 2.65% of associations (56 of 2,111) failing to meet corrected thresholds. Importantly, when stratified by MAF, we observed that the impact of adjusted thresholds was most pronounced for rare variants across all populations. In Europeans, 7.26% of associations with MAF < 0.0001 failed to meet Li-Ji thresholds compared to only 1.12% for common variants (MAF ≥ 0.05). This pattern was even more striking in African populations, where 33.33% of associations with MAF between 0.005 and 0.01 would not reach significance under appropriate correction. These MAF-stratified results demonstrate that the conventional threshold becomes increasingly inadequate for rare variants, reinforcing the necessity of both population-specific and MAF-specific significance thresholds for robust genetic discovery.

To identify which phenotypes might be most affected by population-specific thresholds, we examined diseases and traits with the highest numbers of associations that would lose significance under Li-Ji correction (Supplementary Tables 28–30). Complex polygenic traits with large-scale GWAS efforts, such as body mass index, appendicular lean mass, and blood cell parameters in Europeans, showed the greatest impact, ranging from 20 to 44 associations per trait falling below adjusted thresholds. This pattern suggests that well-powered studies of polygenic traits may be particularly susceptible to false-positive findings when using conventional thresholds.

These findings underscore a critical limitation of the universal 5 × 10⁻⁸ threshold: while it may be reasonably appropriate for European populations across the allele frequency spectrum, it becomes increasingly inadequate for non-European populations, potentially leading to substantial type I error inflation when analyzing rare variants in diverse cohorts. The six-fold higher rate of potentially false-positive findings in African populations (12.96% vs. 2.17%) highlights the urgent need for ancestry-specific significance thresholds to ensure robust and reproducible genetic discoveries across all populations.
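The percentages and the roughly six-fold contrast follow directly from the association counts reported above. The sketch below simply recomputes them; the variable names are illustrative.

```python
# (associations failing the Li-Ji threshold, total associations), as reported in the text.
failed = {"EUR": (834, 38_384), "AFR": (32, 247), "ASN": (56, 2_111)}

# Percentage of catalogued associations that would lose significance.
rates = {pop: 100 * k / n for pop, (k, n) in failed.items()}
fold_afr_vs_eur = rates["AFR"] / rates["EUR"]

for pop, rate in rates.items():
    print(f"{pop}: {rate:.2f}% fail")
print(f"AFR vs EUR: {fold_afr_vs_eur:.1f}-fold")
```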

To validate our block-wise approach versus analyzing merged regions, we compared the effective number of tests calculated for individual LD blocks against merged super-blocks. Analysis of three representative genomic regions in the African population showed that merging adjacent LD blocks reduced the total effective tests by only 0.28–0.57%, indicating minimal redundancy between blocks. However, this marginal gain came at substantial computational cost: individual blocks completed in 6–8 s, while merged super-blocks required 6–8 min—a 60-fold increase in computation time (Supplementary Table 26). This analysis confirms that the predefined LDetect boundaries effectively capture independent genomic regions with minimal inter-block correlation, validating our block-wise strategy as both computationally efficient and statistically sound.

Our findings have particular relevance for large-scale biobank studies such as the All of Us Research Program. Comparing our 1000 Genomes Project results with variant counts from All of Us reveals interesting patterns. For common variants (MAF ≥ 0.05), All of Us shows consistently higher counts across all populations: European (8.72M vs 6.85M), African American (12.03M vs 9.49M), and Asian (8.79M vs 6.86M) (Table 1). This pattern persists across all MAF thresholds, with the difference becoming more pronounced for rarer variants. For instance, at MAF ≥ 0.001, All of Us identifies 22.05M variants in Europeans compared to 14.78M in 1000 Genomes, suggesting that our calculated thresholds may need to be even more stringent when applied to larger, more diverse biobank datasets.

For the Asian panel, the pronounced decline in the Li–Ji threshold between MAF ≥ 0.005 and ≥ 0.001 (Fig. 2) motivated a closer examination of post-QC variants on chromosome 22. Using the combined ASN LD blocks as the frame of reference, we contrasted the total ASN variants with the subset present in both EAS and SAS (i.e., the shared set) across MAF bins (Supplementary Table 31, “Comparison of ASN total variants with EAS–SAS shared variants within LDetect LD regions”; Supplementary Fig. 3, Panel A). The proportionate reduction from ASN totals to the EAS–SAS shared set was modest for MAF ≥ 0.05 to ≥ 0.005 (typically ~15–40%) but rose substantially for rarer bins (MAF ≥ 0.001 and ≥ 0.0001), where reductions commonly hovered around 60–70% and reached ~80% in some blocks. We then quantified cross-population LD similarity on the shared set (Supplementary Table 32, “Comparison of Linkage Disequilibrium (LD) metrics between EAS and SAS subpopulations for shared variants”; Supplementary Fig. 3, Panel B). Pearson correlations between r²_EAS and r²_SAS (the pairwise LD values computed in each subpopulation) were high for MAF ≥ 0.01 (≈0.9), declining modestly for rarer bins (≈0.85 at 0.001 and ≈0.80 at 0.0001), while the mean absolute r² difference (i.e., the average of |r²_EAS − r²_SAS| across shared variant pairs within each LD block) was small (≈0.01–0.02) across bins. Together, these results indicate that combining EAS and SAS is unlikely to materially affect Li–Ji outcomes for common to low-frequency variants (MAF ≥ 0.005), whereas at rarer thresholds the sharper drop in the ASN threshold is largely explained by the influx of subgroup-specific variants that are not shared across both ancestries.

Overall, our findings suggest a need for better computational tools. Current methods for defining LD blocks struggle with today's large genetic datasets and cannot easily create separate boundaries for specific populations like Finnish, Han Chinese, or Gujarati Indians. Future software development should focus on three goals: (1) handling datasets with millions of people efficiently, (2) creating LD blocks for specific populations rather than broad continental groups, and (3) working with modern cloud-based systems. While technically challenging, these improvements are essential to prevent losing important genetic variation and to ensure genetic studies benefit all populations equally. Developing these tools represents an important next step for making genome-wide studies more accurate and inclusive.

The implications for TOPMed [27,28], which has sequenced over 180,000 individuals with deep sequencing coverage, are also significant. While we do not have direct variant counts from TOPMed for each population, our results suggest that the conventional 5 × 10⁻⁸ threshold currently used in many TOPMed analyses may be too liberal, particularly for rare variant analyses. The progression we observe from 1000 Genomes to All of Us variant counts suggests that TOPMed, with its larger sample size and deeper coverage, likely captures even more genetic variation, especially in the rare variant spectrum.

Our work extends previous findings in important ways. While earlier studies using HapMap data suggested a threshold of 5 × 10⁻⁸ for European populations [6,8], our analysis demonstrates that this threshold becomes increasingly liberal as we include rarer variants and examine different populations. This is particularly relevant given that both All of Us and TOPMed include substantial numbers of participants from diverse ancestral backgrounds.

The Li-Ji method's ability to account for complex LD patterns while remaining computationally tractable makes it particularly suitable for large-scale genomic studies. As demonstrated in our theoretical analysis, the method provides an intuitive decomposition of eigenvalue contributions into integer and fractional components, allowing for more accurate handling of partial correlations compared to SimpleM. For instance, in our analysis of a hypothetical correlation matrix with uniform off-diagonal elements of 1/2, the Li-Ji method yielded Meff = M/2, aligning with intuitive expectations, while SimpleM with conventional parameters significantly overestimated the effective number of tests. This theoretical advantage, combined with our region-based approach using natural recombination boundaries, provides a robust framework for estimating the effective number of independent tests in genomic studies using the Li-Ji method.
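The uniform-correlation example can be reproduced numerically with a small sketch. It assumes the usual statement of the Li-Ji estimator, Meff = Σ f(|λᵢ|) with f(x) = I(x ≥ 1) + (x − ⌊x⌋), and a SimpleM rule that counts the leading eigenvalues needed to explain 99.5% of the total variance. The function names, the 99.5% default, and the rounding guard are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def li_ji_meff(corr: np.ndarray) -> float:
    """Li-Ji estimate: sum over eigenvalues of I(x >= 1) + fractional part of x."""
    eig = np.abs(np.linalg.eigvalsh(corr))
    eig = np.round(eig, 10)  # illustrative guard against floating-point noise near integers
    return float(np.sum((eig >= 1) + (eig - np.floor(eig))))

def simplem_meff(corr: np.ndarray, var_explained: float = 0.995) -> int:
    """SimpleM-style estimate: number of top eigenvalues reaching the variance cutoff."""
    eig = np.sort(np.clip(np.linalg.eigvalsh(corr), 0.0, None))[::-1]
    cum = np.cumsum(eig) / eig.sum()
    return int(np.searchsorted(cum, var_explained) + 1)

# Correlation matrix with uniform off-diagonal elements of 1/2.
M = 50
corr = np.full((M, M), 0.5)
np.fill_diagonal(corr, 1.0)

li = li_ji_meff(corr)
sm = simplem_meff(corr)
print(f"M = {M}: Li-Ji Meff = {li:.1f}, SimpleM Meff = {sm}")
```

Under these assumptions the matrix has one eigenvalue of (M + 1)/2 and M − 1 eigenvalues of 1/2, so for even M the Li-Ji sum comes to M/2 + 1, which is close to M/2 for large M, while the 99.5% rule returns a value at or near M. In a block-wise analysis, per-block estimates of this kind would be summed across LD blocks.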

Looking ahead, our methodology could be directly applied to All of Us and TOPMed data to derive study-specific significance thresholds. This would be particularly valuable given these projects' emphasis on rare variants and diverse populations. While our current thresholds provide good approximations, study-specific calculations would offer the most accurate type I error control for these larger, more diverse datasets.

In conclusion, our work provides a practical framework for determining appropriate significance thresholds in modern genomic studies while demonstrating the importance of considering population-specific genetic architecture and variant frequency spectra. The comparisons with All of Us variant counts suggest that conventional thresholds may need to be reconsidered for biobank-scale studies, particularly when analyzing rare variants or diverse populations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (1.9MB, xlsx)
Supplementary Material 2 (927.5KB, docx)

Acknowledgements

Our sincere gratitude goes to the All of Us Research Program and its participants for their significant contributions to the All of Us study. We also thank the National Institutes of Health’s All of Us Research Program for making available the participant data. The data supplied by the program was crucial in performing our analyses and deriving meaningful insights.

Author contributions

Conceptualization: H.K.T.; methodology: H.K.T., S.V., V.S.; formal analysis: S.V., A.S., and V.S.; writing—original draft preparation: S.V., A.S., V.S., M.A., and H.K.T.; All of Us analysis: S.V. and V.S.; writing—review and editing: All; supervision: H.K.T.; project administration: H.K.T.; funding acquisition: SOPH Dean’s fellowship for S.V. All authors have read and agreed to the published version of the manuscript.

Data availability

We used the publicly available 1000 Genomes data set for this study. The 1000 Genomes data is available on IGSR: The International Genome Sample Resource at https://www.internationalgenome.org/category/ftp/. All of Us (AoU) genetic data was accessed through All of Us Researcher Workbench deployed at https://workbench.researchallofus.org/. Only authors approved to use Controlled-Tier (Genomic and Phenotypic data) participated in AoU related analytics in this effort.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145), 661–678 (2007).
  • 2. Risch, N. & Merikangas, K. The future of genetic studies of complex human diseases. Science 273(5281), 1516–1517 (1996).
  • 3. Stephens, M. & Balding, D. J. Bayesian statistical methods for genetic association studies. Nat Rev Genet. 10(10), 681–690 (2009).
  • 4. Zondervan, K. T. & Cardon, L. R. The complex interplay among factors that influence allelic association. Nat Rev Genet. 5(2), 89–100 (2004).
  • 5. Benjamin, D. J. et al. Redefine statistical significance. Nat Hum Behav. 2(1), 6–10 (2018).
  • 6. Dudbridge, F. & Gusnanto, A. Estimation of significance thresholds for genomewide association scans. Genet Epidemiol. 32(3), 227–234 (2008).
  • 7. Reich, D. E. et al. Linkage disequilibrium in the human genome. Nature 411(6834), 199–204 (2001).
  • 8. Pe’er, I., Yelensky, R., Altshuler, D. & Daly, M. J. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol. 32(4), 381–385 (2008).
  • 9. Sawyer, S. L. et al. Linkage disequilibrium patterns vary substantially among populations. Eur J Hum Genet. 13(5), 677–686 (2005).
  • 10. Wall, J. D. & Pritchard, J. K. Haplotype blocks and linkage disequilibrium in the human genome. Nat Rev Genet. 4(8), 587–597 (2003).
  • 11. Gao, X., Starmer, J. & Martin, E. R. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol. 32(4), 361–369 (2008).
  • 12. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 38(8), 904–909 (2006).
  • 13. Li, J. & Ji, L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity (Edinb). 95(3), 221–227 (2005).
  • 14. Hendricks, A. E., Dupuis, J., Logue, M. W., Myers, R. H. & Lunetta, K. L. Correction for multiple testing in a gene region. Eur J Hum Genet. 22(3), 414–418 (2014).
  • 15. McCarthy, M. I. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 9(5), 356–369 (2008).
  • 16. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26(17), 2190–2191 (2010).
  • 17. Hoggart, C. J., Clark, T. G., De Iorio, M., Whittaker, J. C. & Balding, D. J. Genome-wide significance for dense SNP and resequencing data. Genet Epidemiol. 32(2), 179–185 (2008).
  • 18. Liu, J. Z. et al. A versatile gene-based test for genome-wide association studies. Am J Hum Genet. 87(1), 139–145 (2010).
  • 19. Johnson, R. C. et al. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics 11, 724 (2010).
  • 20. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526(7571), 68–74 (2015).
  • 21. Sabatti, C. et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat Genet. 41(1), 35–46 (2009).
  • 22. Cerezo, M. et al. The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 53(D1), D998–D1005 (2025).
  • 23. Research N. ldetect-data 2015–07–23. Available from: https://bitbucket.org/nygcresearch/ldetect-data/src/master/.
  • 24. Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32(2), 283–285 (2016).
  • 25. All of Us Research Program Investigators et al. The “All of Us” Research Program. N Engl J Med. 381(7), 668–676 (2019).
  • 26. Ramirez, A. H. et al. The All of Us Research Program: data quality, utility, and diversity. Patterns (N Y). 3(8), 100570 (2022).
  • 27. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590(7845), 290–299 (2021).
  • 28. Kowalski, M. H. et al. Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet. 15(12), e1008500 (2019).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
