Nucleic Acids Research. 2025 Oct 30;54(D1):D1415–D1424. doi: 10.1093/nar/gkaf1097

GENEasso: a curated resource of credible disease–gene associations across complex diseases from GWAS summary statistics

Tao Jiang, Mengting Shao, Junjie Wang, Pengze Wu, Changhui Zhu, Ranran Tang, Chen Cao, Ning Gu
PMCID: PMC12807607  PMID: 41166152

Abstract

Gene-based association analysis has become a powerful strategy to improve the biological interpretability of genome-wide association studies (GWAS) by aggregating variant-level signals at the gene level. Although several transcriptome-wide association study (TWAS)-specific databases have been developed, TWAS represents only one class of gene-based association methods and relies primarily on expression-mediated effects, whereas other gene-based approaches also play critical roles in identifying disease-associated genes. To address this limitation, we present GENEasso, a comprehensive platform that integrates multiple gene-level statistical frameworks to enable robust exploration of disease–gene associations across complex diseases. GENEasso systematically applies seven representative methods to 8226 curated GWAS summary statistics, generating 726 122 high-confidence disease–gene associations. The platform supports cross-method consensus scoring, tissue-specific enrichment prioritization, and ancestry-stratified analyses across five populations. Results can be interactively explored through Manhattan plots and ontology-based navigation, with full transparency and unrestricted access to data. A web server module allows users to upload their own GWAS summary statistics, select gene-based methods, and benchmark their findings against the GENEasso reference database. Concordance across methods increases the credibility of associations, improving reproducibility and supporting user-defined gene prioritization workflows. GENEasso is freely available as an open-access resource at https://www.geneasso.net.

Graphical Abstract


Introduction

Genome-wide association studies (GWAS) have uncovered thousands of genetic variants associated with complex traits and diseases, providing an important resource for understanding human genetics [1]. GWAS databases such as the GWAS Catalog [2], GWAS Central [3], and GWAS Atlas [4] have become indispensable resources in this field. However, traditional single-nucleotide polymorphism (SNP) level GWAS results are often enriched for noncoding, intronic, or intergenic variants, which pose significant challenges for biological interpretation [5]. Moreover, GWAS evaluates each variant independently; yet, many complex diseases are driven by the combined effects of multiple SNPs, thereby limiting the power and biological relevance of single-variant analyses [6, 7].

To address this limitation, gene-based association methods have been developed to aggregate the effects of multiple SNPs at the gene level, thereby improving statistical power and interpretability. These methods are particularly valuable for complex traits, which are frequently influenced by numerous modest-effect variants acting in concert. Recent studies have demonstrated the utility of gene-based approaches in elucidating the genetic architecture of conditions such as Alzheimer’s disease [8–10], cardiovascular diseases [11–13], and cancers [14–16], among others [17–19].

Despite this progress, resources supporting gene-based association analyses remain limited in scope and diversity. In contrast to GWAS, which typically follow a standard analytic framework, gene-based association studies encompass a broad array of statistical models and genetic assumptions. Several platforms have been developed, such as TWAS-hub [20], webTWAS [21], TWAS Atlas [22], DisGeNET [23], Brain Catalog [24], and Genebass [25], but each has limitations. TWAS-hub and webTWAS focus exclusively on transcriptome-wide association studies (TWAS), which represent only one category of gene-based association methods and rely on expression-mediated mechanisms. TWAS Atlas and DisGeNET are literature-curated resources, while Brain Catalog focuses exclusively on brain phenotypes. Genebass provides rare variant associations based on UK Biobank exome sequencing but lacks broader trait coverage. Importantly, these resources represent only a narrow slice of the gene-based methodological landscape.

To represent multiple gene-level genetic architectures comprehensively, GENEasso integrates seven complementary gene-based association methods, selected to reflect distinct statistical frameworks. Four of them (MAGMA [26], PASCAL [27], SMR [28], and DEPICT [29]) are widely adopted in the field. MAGMA performs multi-marker regression accounting for linkage disequilibrium (LD), PASCAL aggregates association signals analytically to improve pathway-level power, SMR links GWAS and eQTL data to infer causal expression-mediated effects, and DEPICT prioritizes genes through co-regulation and pathway enrichment. These methods collectively capture different aspects of gene-level association, from statistical signal aggregation to functional inference.

To improve methodological diversity and biological relevance, GENEasso also incorporates three additional approaches: RWAS [30], which links regulatory elements such as enhancers and promoters to trait-associated variants; CWAS [31], which leverages chromatin state annotations to improve gene prioritization; and LDAK-GBAT [32], a flexible linear mixed model framework that supports alternative heritability assumptions for robust gene-level inference. By combining these seven methods within a unified framework, GENEasso enables multi-method gene-based analysis, facilitates cross-method validation, and improves the credibility and interpretability of gene–trait associations derived from GWAS summary statistics.

Despite the growing number of gene-based association methods, no centralized web platform currently allows researchers to apply and compare these tools conveniently. Most existing methods require complex software installation, custom input formats, and prior knowledge of reference panels, which limits accessibility for many researchers. Furthermore, the credibility and reproducibility of disease–gene associations critically depend on the ability to validate results across multiple methods, GWAS sources, and populations. To address these challenges, we developed GENEasso, a unified platform that integrates seven representative gene-based association methods (12 models) applied to 8226 curated GWAS summary statistics. The platform comprises two key components: a database module, which hosts 726 122 significant disease–gene associations, and a web server module, which allows users to upload their own GWAS summary statistics, apply multiple methods, and compare their results against those in the curated database. GENEasso supports comparisons across methods and GWAS summary statistics, tissue-specific enrichment analysis, and population-stratified association evaluations across five ancestries. These features allow researchers to assess the consistency of results across analytic strategies, data sources, and populations, thereby facilitating the identification of high-confidence candidate genes and improving the reproducibility of post-GWAS findings.

Materials and methods

Data processing

To build the GENEasso database, we curated a total of 8226 GWAS summary statistics from publicly available sources, encompassing 14 complex human trait categories and spanning 245 original publications. GWAS summary statistics were collected from two sources: UK Biobank (UKBB) cohorts and non-UKBB cohorts (Fig. 1A). UKBB-based GWAS summary statistics were collected from three commonly used resources: Neale Lab UKBB v3 (http://www.nealelab.is/uk-biobank/), Gene ATLAS [33], and GWAS ATLAS [4]. While all three derive from the same underlying UKBB cohort, they differ in sample inclusion criteria, association models, and quality control procedures, resulting in distinct statistical outputs. For non-UKBB datasets, we integrated GWAS summary statistics from multiple reputable public databases and consortia, including the GWAS Catalog [34], LD Hub [35], GRASP [36], PhenoScanner [37], and dbGaP [38], as well as individual project sources such as PGC (https://pgc.unc.edu), MAGIC (http://www.magicinvestigators.org), SSGAC (https://www.thessgac.org), and JENGER (http://jenger.riken.jp/en).

Figure 1.

Overview of the GENEasso platform and analytical workflow. (A) GWAS summary statistics were collected from two major sources and then preprocessed to ensure data quality and standardized through trait, rsID, and population mapping. (B) Seven gene-based association methods, including PASCAL, MAGMA, SMR, DEPICT, CWAS, RWAS, and LDAK-GBAT, were applied to analyze the GWAS summary statistics. Tissue-specific enrichment analysis was performed using deTS. (C) The database module of GENEasso stores curated disease–gene associations and supports interactive exploration. (D) The web server module of GENEasso enables user-submitted analyses with customized pipelines.

To ensure data quality and harmonization, we applied a multistep curation and standardization pipeline. For each GWAS dataset, we extracted key metadata—sample size, ancestry, trait definition, and publication source—from either the file header or the original publication. Datasets lacking clearly documented population information or sample sizes were excluded. When multiple datasets from different sources represented the same trait, including those derived from the UKBB cohort, we retained all available versions. Each dataset was curated and stored separately with source annotations to allow cross-study comparisons, ensure reproducibility, support confidence estimation of gene-level associations, and maintain robustness across resources.

Before gene-based analyses, all GWAS summary statistics were standardized with a single pipeline. Variant coordinates were mapped to GRCh37 (GRCh38 coordinates were converted via liftOver [39], and non-bijective mappings were removed), and rsIDs were reconciled against dbSNP build 151; records missing both coordinate and rsID were discarded. Alleles were harmonized to 1000 Genomes Phase 3 ancestry-matched references, requiring a reported effect allele; when only the effect allele was present, the noneffect allele was inferred. Strand-ambiguous palindromic SNPs with minor allele frequency (MAF) >0.40 were excluded. Reported allele frequencies were converted to MAF or imputed from 1000 Genomes when absent; variants were removed when the allele frequency reported in the study differed from the ancestry-matched 1000 Genomes reference by >0.20. For partially incomplete datasets, Z-scores were imputed where appropriate. Specifically, when P-values and effect sizes and directions were available, Z-scores were calculated using $Z = \operatorname{sign}(\beta)\,\Phi^{-1}(1 - P/2)$, where $\Phi^{-1}$ is the inverse standard normal cumulative distribution function, $\beta$ denotes the effect size, and $P$ denotes the P-value. Datasets lacking both $\beta$ and $P$ were excluded. Per-variant imputation quality reported by the sources (INFO/R2) was honored, and INFO <0.90 was applied for Neale Lab UK Biobank datasets. Sample-size fields (N or N_cases/N_controls) were retained for reporting. This imputation procedure was applied consistently across all datasets, ensuring harmonized and reproducible inputs for downstream gene-based analyses.
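The Z-score imputation and the palindromic-SNP filter in this paragraph can be illustrated with a minimal sketch. It assumes the conventional conversion Z = sign(β) · Φ⁻¹(1 − P/2) for a two-sided P-value; the function names are hypothetical and are not taken from the GENEasso code base.

```python
from statistics import NormalDist

# Allele pairs that read the same on both strands (strand-ambiguous).
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


def z_from_p_and_beta(p_value: float, beta: float) -> float:
    """Signed Z-score from a two-sided P-value and the effect direction.

    Assumed formula: Z = sign(beta) * Phi^-1(1 - P/2). By symmetry of the
    normal distribution this equals -sign(beta) * Phi^-1(P/2), which is
    numerically safer for very small P-values.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in (0, 1]")
    magnitude = -NormalDist().inv_cdf(p_value / 2.0)
    if beta > 0:
        return magnitude
    if beta < 0:
        return -magnitude
    return 0.0


def is_excluded_palindrome(effect_allele: str, other_allele: str,
                           maf: float, maf_cutoff: float = 0.40) -> bool:
    """True for A/T or C/G SNPs whose MAF exceeds the cutoff (0.40 in the text)."""
    pair = frozenset((effect_allele.upper(), other_allele.upper()))
    return pair in AMBIGUOUS_PAIRS and maf > maf_cutoff


if __name__ == "__main__":
    print(round(z_from_p_and_beta(0.05, -0.12), 3))
    print(is_excluded_palindrome("A", "T", 0.45))
```

Using the lower tail `inv_cdf(P/2)` rather than `inv_cdf(1 - P/2)` is a design choice of this sketch: it avoids rounding `1 - P/2` to exactly 1 when P is extremely small.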

To improve trait interpretability and enable ontology-based querying, all reported traits from the original publications were manually mapped to the Experimental Factor Ontology (EFO) [40]. This standardization ensures the hierarchical organization of diseases and traits, reduces ambiguity, and facilitates the consistent integration of gene-based association results across studies.

Regarding population information, we mapped each GWAS summary statistic to one of the five super-populations defined by the 1000 Genomes Project [41] (AFR, AMR, EAS, EUR, SAS). We retained GWAS datasets from all available populations to support ancestry-specific analyses and enable cross-population comparisons of gene-level associations. Population assignments were based on metadata from the original studies and were standardized to ensure consistency across the database.

In addition to dataset-level filtering, we performed variant-level quality control (SNP-QC) on each GWAS summary statistic dataset. Specifically, we retained SNPs with a genotype calling confidence >0.9 and excluded those with MAF <0.01, Hardy–Weinberg equilibrium P-value <1 × 10^−7, or duplicated identifiers, with the underlying QC metrics taken directly from the original publication or accompanying documentation. These filters ensured that downstream gene-based analyses were based on high-confidence variant-level inputs and minimized potential false-positive signals arising from low-quality data.
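As an illustration of these SNP-QC rules, the following sketch applies the three thresholds from the text to a list of variant records. The `Variant` record, its field names, and the choice to drop every copy of a duplicated identifier (rather than keep the first) are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    rsid: str
    call_confidence: float  # genotype calling confidence
    maf: float              # minor allele frequency
    hwe_p: float            # Hardy-Weinberg equilibrium P-value


def snp_qc(variants: List[Variant],
           min_confidence: float = 0.9,
           min_maf: float = 0.01,
           min_hwe_p: float = 1e-7) -> List[Variant]:
    """Keep variants that pass all thresholds and have a unique rsID."""
    id_counts = Counter(v.rsid for v in variants)
    return [
        v for v in variants
        if v.call_confidence > min_confidence  # retain confidence > 0.9
        and v.maf >= min_maf                   # exclude MAF < 0.01
        and v.hwe_p >= min_hwe_p               # exclude HWE P < 1e-7
        and id_counts[v.rsid] == 1             # exclude duplicated identifiers
    ]


if __name__ == "__main__":
    demo = [
        Variant("rs1", 0.99, 0.20, 0.5),
        Variant("rs2", 0.85, 0.20, 0.5),    # low calling confidence
        Variant("rs3", 0.99, 0.005, 0.5),   # rare variant
        Variant("rs4", 0.99, 0.20, 1e-9),   # HWE violation
        Variant("rs5", 0.99, 0.20, 0.5),
        Variant("rs5", 0.99, 0.30, 0.5),    # duplicated identifier
    ]
    print([v.rsid for v in snp_qc(demo)])
```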

Trait-specific tissue calculation

To identify biologically relevant tissues for each trait, we used PASCAL [27] and the deTS [42] algorithms, which perform tissue-specific enrichment analysis based on gene-level association results (Fig. 1B). Using the GTEx reference panel [43] (GTEx v8, 47 tissues), PASCAL first calculated trait-associated gene scores, selecting those with P-value <.05. Subsequently, we applied Fisher’s exact test to evaluate whether trait-associated genes were significantly enriched in specific tissues. For each trait, the top-enriched tissue was selected as its putative trait-specific tissue, which was then used in downstream tissue-dependent analyses, such as SMR and CWAS. In the web server, all tissues for a submitted GWAS are reported, and users may choose multiple tissues for downstream analyses; GENEasso executes each analysis per tissue and displays the results stratified by tissue.
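The enrichment step can be sketched as a one-sided Fisher's exact test computed from the hypergeometric distribution. This is only an illustration under stated assumptions: the tissue gene sets, the background gene universe, and the function names are hypothetical, and the sketch does not reproduce the deTS reference data or its exact procedure.

```python
from math import comb
from typing import Dict, List, Set, Tuple


def fisher_enrichment_p(overlap: int, n_trait: int,
                        n_tissue: int, n_background: int) -> float:
    """One-sided Fisher's exact P-value, P(X >= overlap), for over-representation."""
    max_k = min(n_trait, n_tissue)
    total = comb(n_background, n_trait)
    tail = sum(
        comb(n_tissue, k) * comb(n_background - n_tissue, n_trait - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def rank_tissues(trait_genes: Set[str],
                 tissue_gene_sets: Dict[str, Set[str]],
                 background: Set[str]) -> List[Tuple[str, float]]:
    """Return (tissue, P-value) pairs sorted from most to least enriched."""
    trait_bg = trait_genes & background
    ranked = []
    for tissue, genes in tissue_gene_sets.items():
        tissue_bg = genes & background
        p = fisher_enrichment_p(len(trait_bg & tissue_bg), len(trait_bg),
                                len(tissue_bg), len(background))
        ranked.append((tissue, p))
    return sorted(ranked, key=lambda item: item[1])


if __name__ == "__main__":
    background = {f"g{i}" for i in range(20)}
    trait = {f"g{i}" for i in range(5)}          # e.g. genes with P < .05
    tissues = {
        "tissue_A": {f"g{i}" for i in range(5)},
        "tissue_B": {f"g{i}" for i in range(10, 15)},
    }
    for name, p in rank_tissues(trait, tissues, background):
        print(name, p)
```

The first entry of the ranked list plays the role of the putative trait-specific tissue; taking the first three entries would correspond to a top-3 selection.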

Database and web server architecture

GENEasso was implemented using a modular client–server architecture to support efficient data access and interactive analysis. The frontend was developed using the Vue.js framework, with user interface components styled via the Element UI library. The backend was built on the Spring Boot (Java) framework, enabling stable and scalable server-side operations. All curated GWAS summary statistics and gene-based association results are stored in a MySQL database, optimized for fast retrieval and filtering.

To handle user-submitted jobs, GENEasso employs an asynchronous task management system, allowing gene-based association analyses to be executed in the background without blocking the user interface. Each analysis task is assigned a unique job ID and tracked through a scheduling queue, with real-time progress and results accessible via the web interface. This design ensures responsive performance and supports multiple concurrent user analyses without compromising stability.
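The job-handling pattern described here (submit, receive a job ID, poll for status, fetch results later) can be sketched in a few lines. The production backend is Spring Boot (Java); the Python sketch below is a language-neutral illustration only, and the `JobManager` class, its method names, and the status labels are hypothetical.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobManager:
    """Runs analyses in background threads and tracks them by job ID."""

    def __init__(self, max_workers: int = 2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}  # job ID -> Future

    def submit(self, fn, *args) -> str:
        """Queue a task and return its unique job ID immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> str:
        future = self._jobs[job_id]
        if future.running():
            return "RUNNING"
        if future.done():
            return "FAILED" if future.exception() else "FINISHED"
        return "QUEUED"

    def result(self, job_id: str, timeout=None):
        """Block until the job finishes (or the timeout expires) and return its result."""
        return self._jobs[job_id].result(timeout=timeout)


if __name__ == "__main__":
    manager = JobManager()
    job = manager.submit(sum, [1, 2, 3])   # stand-in for a gene-based analysis
    print(job, manager.result(job, timeout=5), manager.status(job))
```

Because `submit` returns before the task runs, a caller can hand the job ID back to the user and look the job up later, which mirrors the job-search workflow described in the text.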

Gene-based association method

To systematically identify disease-associated genes, GENEasso integrates seven gene-based association methods, each capturing different aspects of gene-level genetic architecture (Fig. 1B). These include MAGMA [26], PASCAL [27], SMR [28], DEPICT [29], RWAS [30], CWAS [31], and LDAK-GBAT [32]. All methods were applied using standardized pipelines and default parameters unless otherwise noted. Specifically, we implemented 12 models across seven methods—MAGMA, PASCAL, DEPICT, LDAK-GBAT, RWAS, CWAS (whole blood), and six SMR variants leveraging different eQTL resources (CAGE, Geuvadis, PsychENCODE) and GTEx-derived trait-relevant tissues (top-1/top-2/top-3 per trait).
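The count of 12 models across seven methods can be made explicit with a small registry. The structure below is a hypothetical illustration that simply enumerates the models named in the text; the labels for the GTEx-derived SMR panels are placeholders.

```python
from collections import Counter

# Six SMR variants: three fixed eQTL resources plus the top three
# GTEx-derived trait-relevant tissues (chosen per trait).
SMR_PANELS = [
    "CAGE", "Geuvadis", "PsychENCODE",
    "GTEx top-1 tissue", "GTEx top-2 tissue", "GTEx top-3 tissue",
]

# (method, reference panel or tissue) pairs; None means no panel choice.
MODELS = [
    ("MAGMA", None),
    ("PASCAL", None),
    ("DEPICT", None),
    ("LDAK-GBAT", None),
    ("RWAS", None),
    ("CWAS", "whole blood"),
] + [("SMR", panel) for panel in SMR_PANELS]

models_per_method = Counter(method for method, _ in MODELS)

if __name__ == "__main__":
    print(len(MODELS), "models across", len(models_per_method), "methods")
```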

MAGMA performs regression-based multi-marker gene analysis while accounting for LD; PASCAL aggregates SNP-level association signals into gene scores using analytic approximations based on chi-square statistics; SMR leverages Mendelian randomization to integrate GWAS with eQTL data, identifying genes whose expression may be associated with the trait; DEPICT prioritizes genes through co-regulated expression profiles and evaluates tissue or cell type enrichment; RWAS and CWAS estimate gene–trait associations by modeling the relationship between genetic variants and regulatory features such as chromatin accessibility or epigenetic modification, followed by linking predicted regulatory activity to complex traits using GWAS summary statistics; and LDAK-GBAT estimates gene-level heritability contributions using a linear mixed model framework.

For tissue-aware methods such as SMR and CWAS, we used deTS to infer the top three enriched tissues per trait based on tissue-specific gene enrichment from GWAS results. Enriched reference panels were then selected accordingly to improve biological relevance. All computed gene-level associations were stored in the GENEasso database to facilitate multi-method, cross-trait, and cross-population comparisons (Fig. 1C).

Web server configuration and parameter settings

The GENEasso web server supports six of the seven gene-based association methods implemented in the database: MAGMA, PASCAL, SMR, DEPICT, CWAS, and LDAK-GBAT. RWAS is not included due to its computational intensity, which is not suitable for real-time web-based analysis. Instead, FUSION-TWAS is provided as a supplementary option (Fig. 1D). Like RWAS, it integrates GWAS signals with transcriptomic features but with substantially greater computational efficiency. Whereas RWAS emphasizes chromatin accessibility and epigenomic regulation, FUSION-TWAS uses transcriptomic reference panels to impute gene expression and link it to complex traits.

To improve usability and reduce user burden, the web server is designed to require minimal parameter input. By default, all methods are executed using standardized settings, and the only required user-defined parameter is a significance threshold (cutoff), which defaults to 0.05 divided by the number of tested genes, corresponding to the Bonferroni correction. For specific methods, additional parameters can be adjusted as needed: SMR, CWAS, and FUSION-TWAS require users to select a relevant tissue, which can either be provided directly or inferred using the integrated tissue calculation module in GENEasso; MAGMA and LDAK-GBAT allow customization of the gene window size (default: 0 bp); and LDAK-GBAT also includes a power parameter for its model (default: 0.05). This design balances flexibility with ease of use, allowing users to perform reproducible and biologically meaningful gene-level association analyses without requiring detailed knowledge of method-specific configurations.
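The default cutoff is a plain Bonferroni correction, which the following sketch makes concrete. The function names and the use of a strict `<` comparison are assumptions of the example.

```python
from typing import Dict, Optional


def bonferroni_cutoff(n_tested_genes: int, alpha: float = 0.05) -> float:
    """Default significance threshold: alpha divided by the number of tested genes."""
    if n_tested_genes <= 0:
        raise ValueError("n_tested_genes must be positive")
    return alpha / n_tested_genes


def significant_genes(gene_pvalues: Dict[str, float],
                      cutoff: Optional[float] = None) -> Dict[str, float]:
    """Genes whose P-value falls below the cutoff (Bonferroni by default)."""
    if cutoff is None:
        cutoff = bonferroni_cutoff(len(gene_pvalues))
    return {gene: p for gene, p in gene_pvalues.items() if p < cutoff}


if __name__ == "__main__":
    pvals = {"GENE_A": 1e-8, "GENE_B": 0.01, "GENE_C": 0.02, "GENE_D": 0.5}
    print(bonferroni_cutoff(len(pvals)))   # 0.05 / 4
    print(sorted(significant_genes(pvals)))
```

For a genome-wide run with roughly 20 000 tested genes, the same rule gives a cutoff of about 2.5 × 10^−6.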

Results

GENEasso is a comprehensive platform designed to facilitate gene-based association studies and exploration of complex disease–gene relationships. The resource integrates multiple statistical methods, curated GWAS summary statistics, and user-friendly visualization tools to support both large-scale database queries and customized online analyses.

The GENEasso resource is organized into seven primary modules—Home, Disease, Gene, Search, Downloads, Analysis, and Tutorial—providing intuitive navigation for users. The Analysis module is further subdivided into Analysis & Comparison, Trait-specific Tissue Calculation, and Job Search, enabling users to perform customized gene-based association analyses and enriched tissue calculation, and track job status.

Database statistics

The current version of GENEasso contains 8226 curated GWAS summary statistics, encompassing 2491 unique EFO traits and 14 complex human trait categories (Fig. 2A). In addition, we classify the diseases category into 22 sub-disease categories (Supplementary Fig. S1). These datasets cover five populations from the 1000 Genomes Project (AFR, AMR, EAS, EUR, SAS), enabling population-stratified gene-based analyses. Among these summary statistics, 4294 belong to UKBB cohorts and 3932 belong to non-UKBB cohorts, with 95.34% of studies from the EUR super-population and 4.66% of studies from the other four human super-populations. Each GWAS summary statistic is annotated with metadata such as publication source, sample size, population, number of variants, and ontology-mapped trait descriptions.

Figure 2.

Statistical overview of the GENEasso platform. (A) Number of trait types. (B) Distribution of traits by number of associated genes per trait. (C) Distribution of genes by number of associated traits per gene. (D) Distribution of associations identified by different numbers of methods. (E) Number of disease–gene associations identified by each gene-based method. (F) Top 20 recurrent genes across traits (counted once per trait if identified by any method).

Using seven gene-based association methods (PASCAL, MAGMA, SMR, DEPICT, CWAS, RWAS, and LDAK-GBAT), GENEasso has computed 726 122 significant disease–gene associations (Bonferroni-adjusted P <.05), of which 225 395 are unique disease–gene pairs. On average, each disease trait is associated with 90.44 significant genes (Fig. 2B).

Among the top 20 recurrently associated genes, six reside outside the MHC (major histocompatibility complex) region. The predominance of MHC genes is expected given the region’s dense immune-regulatory variation and long-range LD, which collectively yield many correlated signals across immune-related traits [44]. BIRC3 regulates inflammation and cell-death signaling, and MIS18BP1 controls centromere licensing and chromosome segregation; these core processes confer broad biological pleiotropy and explain their frequent recurrence across diverse non-MHC loci (Fig. 2C and F) [45, 46]. Across the 22 categories, immune and inflammatory diseases are dominated by extended-MHC genes (HLA-DR/DQ cluster, PSMB9, TNXB, C4A, NOTCH4, TSBP1), indicating a pervasive immunogenetic architecture. Cancer and reproductive disease add prominent non-MHC signals at TERT and CLPTM1L alongside pan-category genes such as MIS18BP1 and BIRC3. Cardiovascular disease highlights CDKN2B-AS1, LPA, and SH2B3; respiratory system disease features CHRNA3; and psychiatric disorders highlight APOE and APOC1. Digestive system disease, connective tissue disease, and integumentary system disease remain MHC-dense with accessory immune regulators (e.g. CFB), while recurrent non-MHC genes including MIS18BP1, BIRC3, LRMDA, and CDC42 appear across many categories, suggesting shared cellular processes beyond immune pathways.

We evaluated the overlap of significant trait–gene associations across the seven integrated gene-based methods. On average, 38.83% of associations were identified by at least two methods, and 20.08% were supported by at least three methods (Fig. 2D). The pairwise Jaccard index ranged from 0.01 (RWAS versus DEPICT) to 0.39 (MAGMA versus PASCAL), indicating generally modest overlap, with the highest similarity observed between MAGMA and PASCAL. These two methods co-identified an average of 30.93 genes per trait. In total, 14 859 high-confidence trait–gene associations were identified by four or more methods across traits (Fig. 2D and E). The modest overlap highlights that different methods often prioritize distinct signals because they assume different genetic architectures and model frameworks. This variability reinforces the need for a unified platform like GENEasso, which enables researchers to leverage the complementary strengths of multiple frameworks.
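The two overlap measures used in this paragraph, the pairwise Jaccard index and the share of associations supported by at least k methods, can be computed as in the sketch below. The toy input is invented for illustration and does not reproduce any GENEasso result.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple

Association = Tuple[str, str]  # (trait, gene)


def jaccard(a: Set[Association], b: Set[Association]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(hits: Dict[str, Set[Association]]) -> Dict[Tuple[str, str], float]:
    """Jaccard index for every unordered pair of methods."""
    return {
        (m1, m2): jaccard(hits[m1], hits[m2])
        for m1, m2 in combinations(sorted(hits), 2)
    }


def support_counts(hits: Dict[str, Set[Association]]) -> Counter:
    """Number of methods that report each association."""
    counts = Counter()
    for associations in hits.values():
        counts.update(associations)
    return counts


def fraction_supported(counts: Counter, min_methods: int) -> float:
    """Share of unique associations reported by at least min_methods methods."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c >= min_methods) / len(counts)


if __name__ == "__main__":
    toy = {
        "MAGMA": {("T1", "G1"), ("T1", "G2"), ("T1", "G3")},
        "PASCAL": {("T1", "G2"), ("T1", "G3"), ("T1", "G4")},
        "SMR": {("T1", "G3")},
    }
    print(pairwise_jaccard(toy))
    counts = support_counts(toy)
    print(fraction_supported(counts, 2), fraction_supported(counts, 3))
```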

Database user interface

GENEasso provides a friendly platform for exploring disease–gene associations from multiple perspectives, featuring disease-level and gene-level access through the Disease and Gene pages (Fig. 3A). On the Disease page, users can browse traits using the EFO ontology tree or the search box (Fig. 3B). Each dataset is annotated with a unique disease association ID, reported trait, standardized trait label, trait ontology ID, sample size, numbers of cases and controls, population, PubMed identifier (PMID), and number of gene associations. The GWAS summary statistics are available for download. Clicking a trait entry leads to a detailed view organized into four sections: trait information, all GWAS summary statistics associated with the trait, an interactive Manhattan plot for visualizing significant associations, and a sortable table listing associated genes across different methods with gene symbol, Ensembl ID, gene type, genomic location, synonyms, association method, P-value, and additional information (Fig. 3C).

Figure 3.

Main pages in GENEasso. (A) GENEasso navigation panel. (B) Overview of the Disease page. (C) Disease page example for the trait heart failure. (D) Overview of the Gene page. (E) Gene page example for the gene BIRC3. (F) Search page offering three query options: traits, genes, and publications. (G) Download page providing two categories of downloadable files.

The Gene page displays genes associated with at least one trait, with a sortable table that includes gene symbol, Ensembl ID, gene location, gene type, gene synonyms, and number of associated traits (Fig. 3D). Clicking on a gene leads to a detailed page displaying gene annotations, external database links (e.g. Ensembl [47], NCBI [48], and GeneCards [49]), and a table of disease–gene associations with P-values, trait label, trait ontology ID, and gene-based association methods (Fig. 3E). The Search page supports flexible querying across traits, gene symbol, Ensembl ID, gene location, and publications, with autocomplete suggestions to improve user experience (Fig. 3F). All tables are sortable, filterable, and fully downloadable (Fig. 3G). GENEasso also supports interactive visualizations that can be downloaded in high-resolution format. A unified global search bar enables users to locate relevant traits or genes efficiently across all modules.

In addition to trait- and gene-based browsing, GENEasso allows users to conveniently compare results across methods, studies, and populations for the same trait. For each disease entry, users can selectively display association results from different gene-based association methods, facilitating direct comparison of method-specific findings and identification of shared signals. This design helps users assess the reproducibility and robustness of associations. Furthermore, the database supports comparative analysis across multiple GWAS datasets, ancestries, or even distinct but related traits within a unified Manhattan plot, enabling users to evaluate the consistency and robustness of gene-level associations across studies and traits. For traits with a large number of associated genes, GENEasso provides a chromosome-level filtering option to simplify visualization and focus on the chromosome of interest.

Web server module user interface

To complement the GENEasso database, we developed a robust and user-friendly web server enabling online gene-based association analysis and trait-specific tissue inference from user-submitted GWAS summary statistics. The web server comprises two main modules: Gene Association Analysis and Trait-specific Tissue Calculation.

Users begin by uploading GWAS summary statistics in plain text format. The computation requires rsID, P-value, sample size, and beta fields; optional fields include effect allele, noneffect allele, and genomic coordinates. To facilitate ancestry-aware analysis, users specify the population background (EUR, EAS, AFR, SAS, AMR, or Mixed), UKBB inclusion, and whether the dataset is sex-stratified. GENEasso automatically detects column identifiers (e.g. rsID, P-value, beta, and CHR), minimizing user burden and reducing input errors. If ambiguous or unrecognized terms are encountered, the system issues a warning for user confirmation. For gene-based analysis, users can select one or more methods; results are visualized through an interactive Manhattan plot, with each method displayed in a distinct color. Hovering over a point reveals full association details, and results can be downloaded in Excel format.
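Automatic column detection of this kind is commonly implemented as an alias lookup on normalized header names. The sketch below shows that idea; the alias lists, field keys, and return format are assumptions for illustration and are not the identifiers GENEasso actually recognizes.

```python
from typing import Dict, List, Tuple

# Hypothetical alias table; the real server's accepted names are not documented here.
ALIASES = {
    "rsid": {"rsid", "snp", "snpid", "markername", "variant_id"},
    "p": {"p", "pval", "pvalue", "p_value", "p-value"},
    "beta": {"beta", "b", "effect", "effect_size"},
    "n": {"n", "samplesize", "sample_size"},
    "chr": {"chr", "chrom", "chromosome"},
}
REQUIRED = ("rsid", "p", "beta", "n")


def detect_columns(header: List[str]) -> Tuple[Dict[str, int], List[str], List[str]]:
    """Map header names to fields.

    Returns (field -> column index, missing required fields, columns needing
    user confirmation because they are unrecognized or duplicate a field).
    """
    mapping: Dict[str, int] = {}
    needs_confirmation: List[str] = []
    for index, raw_name in enumerate(header):
        key = raw_name.strip().lower()
        matches = [field for field, names in ALIASES.items() if key in names]
        if len(matches) == 1 and matches[0] not in mapping:
            mapping[matches[0]] = index
        else:
            needs_confirmation.append(raw_name)
    missing = [field for field in REQUIRED if field not in mapping]
    return mapping, missing, needs_confirmation


if __name__ == "__main__":
    print(detect_columns(["SNP", "CHR", "P-value", "Beta", "N", "FOO"]))
```

Returning the unresolved columns instead of raising an error leaves room for the confirmation step described in the text.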

The Trait-specific Tissue Calculation module uses the deTS algorithm, combined with the SMR and CWAS results in the database, to identify tissues likely to mediate trait-associated genetic signals. Upon submission of GWAS summary statistics, the system evaluates enrichment across 47 GTEx tissues. Results are presented in a table, where tissues with P <.05 are highlighted in red to emphasize statistically significant associations visually. This visualization facilitates intuitive prioritization of disease-relevant tissues, helping users to refine their interpretation and guide follow-up studies.

To facilitate efficient task tracking, GENEasso assigns a unique job ID upon submission and optionally notifies users via email. For each submission, the web server generates a per-method summary report capturing the selected identifiers (rsID, P-value, Beta, sample size; optional fields), the chosen method and parameters, and the significant genes with accompanying statistics. Results can be retrieved through the job ID on the “Job Search” page at any time. This design ensures users can flexibly monitor analysis progress and revisit results without maintaining an active session.

Discussion

GENEasso addresses several challenges in gene-based association analysis by offering a unified framework that incorporates seven gene-based association statistical methods, each designed to capture different aspects of gene-level genetic architecture. By integrating these methods into the resource, GENEasso enables users to explore multiple analytical perspectives without requiring specialized coding expertise or complex software installations. Unlike TWAS-specific resources, GENEasso integrates statistical, regulatory, and chromatin activity approaches, provides ancestry-aware analyses across five populations, and supports ontology-based trait organization. Together, these unique features improve reproducibility and expand the utility of gene-based association studies. Notably, the platform allows gene prioritization in biologically relevant tissues by decoding trait-specific tissues based on input GWAS summary statistics, thereby increasing the interpretability and contextual relevance of the results.

Importantly, the significant genes calculated by GENEasso enable benchmarking and replication of novel analyses against a large, precomputed knowledge base. Researchers can cross-reference their candidate genes with associations identified across different methods, GWAS sources, and populations. The inclusion of population-specific results across five ancestries further improves the interpretability and relevance of findings, particularly in the context of ancestry-aware analyses. The ability to trace association signals across multiple dimensions, such as traits, methods, tissues, and populations, makes GENEasso a powerful resource for meta-analysis, cross-cohort comparison, and reproducibility assessments.

The web server module of GENEasso complements the database by enabling users to perform gene-based association analyses on their own GWAS summary statistics using the same standardized pipeline applied to curated public datasets. It supports multiple well-established methods, provides tissue-aware options where applicable, and allows selection of population-matched reference panels. Users can compare their results across methods, traits, and ancestries, and benchmark them against the resource-derived associations in the database. The platform is designed to be user-friendly, automating input recognition (e.g. rsIDs and gene aliases) and offering interactive visualization tools such as Manhattan plots. By lowering computational and technical barriers, the web server improves accessibility for researchers and facilitates reproducible, customizable gene prioritization in a wide range of study settings.

Despite its advantages, the GENEasso resource has several limitations. First, although the database includes ancestry-specific results, data coverage remains skewed toward European-derived GWAS, limiting the resolution of non-European disease–gene associations. Second, GENEasso focuses on common-variant associations derived from GWAS summary statistics and does not yet incorporate results from rare-variant, burden-based analyses, which are increasingly important for understanding disease biology. Lastly, while several tissue-aware methods are supported, the accuracy of tissue-level analyses is limited by the availability and consistency of eQTL reference panels, particularly for tissues from non-European populations.

We aim to improve the GENEasso database by incorporating additional high-quality GWAS datasets from non-European populations, extending support to rare variant-based methods, and exploring consensus-building strategies across models. We also plan to integrate additional molecular annotations, such as methylation QTLs, chromatin accessibility data, and single-cell tissue specificity, to refine the resolution and biological interpretability of gene-level associations.

In summary, the GENEasso database module provides a useful and flexible resource for exploring disease–gene associations using multiple methods and diverse GWAS datasets. Although it does not capture all possible association scenarios, it can assist researchers in conducting cross-method evaluations and identifying candidate genes with greater confidence.

Supplementary Material

gkaf1097_Supplemental_Files

Acknowledgements

The computational resources generously provided by the High Performance Computing Center of Nanjing Medical University are greatly appreciated.

Author contributions: Tao Jiang: Conceptualization, Software, Formal analysis, Funding acquisition, Writing—original draft. Mengting Shao: Data curation, Visualization, Writing—original draft, Writing—review & editing. Junjie Wang: Software, Writing—review & editing. Pengze Wu: Writing—original draft, Writing—review & editing. Changhui Zhu: Writing—review & editing. Ranran Tang: Conceptualization, Funding acquisition, Writing—review & editing. Chen Cao: Conceptualization, Funding acquisition, Visualization, Writing—original draft, Writing—review & editing. Ning Gu: Conceptualization, Funding acquisition, Writing—review & editing.

Contributor Information

Tao Jiang, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Mengting Shao, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Junjie Wang, Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Pengze Wu, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Changhui Zhu, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Ranran Tang, Nanjing Women and Children’s Healthcare Institute, Women’s Hospital of Nanjing Medical University, Nanjing Women and Children’s Healthcare Hospital, Nanjing, Jiangsu 210004, China.

Chen Cao, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Ning Gu, Jiangsu Key Laboratory for Biomedical Electromagnetic Precision Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China; Department of Cardiology, Cardiovascular Disease Center, Jiangsu Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing, Jiangsu 210093, China; Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing, Jiangsu 210093, China.

Supplementary data

Supplementary data is available at NAR online.

Conflict of interest

None declared.

Funding

This work was supported by the National Natural Science Foundation of China (grant numbers 62471240 and 62231013 to C.C.) and the Frontier Fundamental Research Program of Jiangsu Province for Leading Technology (grant number BK20222002 to N.G.). Funding to pay the Open Access publication charges for this article was provided by the National Natural Science Foundation of China (NSFC; grant 62471240).

Data availability

The data underlying this article are available in GENEasso (https://www.geneasso.net) and can be freely downloaded. Scripts for running the gene-based association methods to generate disease–gene associations are provided on the Downloads page (https://www.geneasso.net/#/downloads). No registration or login is required.

References

  • 1. Loos  RJF. 15 years of genome-wide association studies and no signs of slowing down. Nat Commun. 2020;11:5900. 10.1038/s41467-020-19653-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Sollis  E, Mosaku  A, Abid  A  et al.  The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource. Nucleic Acids Res. 2023;51:D977–D985. 10.1093/nar/gkac1010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Beck  T, Rowlands  T, Shorter  T  et al.  GWAS Central: an expanding resource for finding and visualising genotype and phenotype data from genome-wide association studies. Nucleic Acids Res. 2023;51:D986–93. 10.1093/nar/gkac1017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Watanabe  K, Stringer  S, Frei  O  et al.  A global overview of pleiotropy and genetic architecture in complex traits. Nat Genet. 2019;51:1339–48. 10.1038/s41588-019-0481-0. [DOI] [PubMed] [Google Scholar]
  • 5. Watanabe  K, Taskesen  E, van Bochoven  A  et al.  Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8:1826. 10.1038/s41467-017-01261-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Long  E, Williams  J, Zhang  H  et al.  An evolving understanding of multiple causal variants underlying genetic association signals. Am J Hum Genet. 2025;112:741–50. 10.1016/j.ajhg.2025.01.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Yang  S, Ye  X, Ji  X  et al.  PGSFusion streamlines polygenic score construction and epidemiological applications in biobank-scale cohorts. Genome Med. 2025;17:77. 10.1186/s13073-025-01505-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Raj  T, Li  YI, Wong  G  et al.  Integrative transcriptome analyses of the aging brain implicate altered splicing in Alzheimer’s disease susceptibility. Nat Genet. 2018;50:1584–92. 10.1038/s41588-018-0238-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Gerring  ZF, Lupton  MK, Edey  D  et al.  An analysis of genetically regulated gene expression across multiple tissues implicates novel gene candidates in Alzheimer’s disease. Alzheimers Res Ther. 2020;12:43. 10.1186/s13195-020-00611-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Shao  M, Chen  K, Zhang  S  et al.  Multiome-wide association studies: novel approaches for understanding diseases. Genomics Proteomics Bioinformatics. 2024;22:qzae77. 10.1093/gpbjnl/qzae077. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Al-Barghouthi  BM, Rosenow  WT, Du  KP  et al.  Transcriptome-wide association study and eQTL colocalization identify potentially causal genes responsible for human bone mineral density GWAS associations. eLife. 2022;11:e77285. 10.7554/eLife.77285. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Thériault  S, Gaudreault  N, Lamontagne  M  et al.  A transcriptome-wide association study identifies PALMD as a susceptibility gene for calcific aortic valve stenosis. Nat Commun. 2018;9:988. 10.1038/s41467-018-03260-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Roselli  C, Chaffin  MD, Weng  LC  et al.  Multi-ethnic genome-wide association study for atrial fibrillation. Nat Genet. 2018;50:1225–33. 10.1038/s41588-018-0133-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Wu  L, Shi  W, Long  J  et al.  A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat Genet. 2018;50:968–78. 10.1038/s41588-018-0132-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Gusev  A, Lawrenson  K, Lin  X  et al.  A transcriptome-wide association study of high-grade serous epithelial ovarian cancer identifies new susceptibility genes and splice variants. Nat Genet. 2019;51:815–23. 10.1038/s41588-019-0395-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Barbeira  AN, Dickinson  SP, Bonazzola  R  et al.  Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun. 2018;9:1825. 10.1038/s41467-018-03621-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Gilchrist  JJ, Makino  S, Naranbhai  V  et al.  Natural Killer cells demonstrate distinct eQTL and transcriptome-wide disease associations, highlighting their role in autoimmunity. Nat Commun. 2022;13:4073. 10.1038/s41467-022-31626-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Khunsriraksakul  C, McGuire  D, Sauteraud  R  et al.  Integrating 3D genomic and epigenomic data to enhance target gene discovery and drug repurposing in transcriptome-wide association studies. Nat Commun. 2022;13:3258. 10.1038/s41467-022-30956-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Schmiedel  BJ, Rocha  J, Gonzalez-Colin  C  et al.  COVID-19 genetic risk variants are associated with expression of multiple genes in diverse immune cell types. Nat Commun. 2021;12:6760. 10.1038/s41467-021-26888-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Mancuso  N, Shi  H, Goddard  P  et al.  Integrating gene expression with summary association statistics to identify genes associated with 30 complex traits. Am J Hum Genet. 2017;100:473–87. 10.1016/j.ajhg.2017.01.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Cao  C, Shao  M, Wang  J  et al.  webTWAS 2.0: update platform for identifying complex disease susceptibility genes through transcriptome-wide association study. Nucleic Acids Res. 2025;53:D1261–D1269. 10.1093/nar/gkae1022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Lu  M, Zhang  Y, Yang  F  et al.  TWAS Atlas: a curated knowledgebase of transcriptome-wide association studies. Nucleic Acids Res. 2023;51:D1179–D1187. 10.1093/nar/gkac821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Piñero  J, Ramírez-Anguita  JM, Saüch-Pitarch  J  et al.  The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020;48:D845–D855. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Pan  S, Kang  H, Liu  X  et al.  Brain Catalog: a comprehensive resource for the genetic landscape of brain-related traits. Nucleic Acids Res. 2023;51:D835–D844. 10.1093/nar/gkac895. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Karczewski  KJ, Solomonson  M, Chao  KR  et al.  Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics. 2022;2:100168. 10.1016/j.xgen.2022.100168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. de Leeuw  CA, Mooij  JM, Heskes  T  et al.  MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol. 2015;11:e1004219. 10.1371/journal.pcbi.1004219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Lamparter  D, Marbach  D, Rueedi  R  et al.  Fast and rigorous computation of gene and pathway scores from SNP-based summary statistics. PLoS Comput Biol. 2016;12:e1004714. 10.1371/journal.pcbi.1004714. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Zhu  Z, Zhang  F, Hu  H  et al.  Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet. 2016;48:481–7. 10.1038/ng.3538. [DOI] [PubMed] [Google Scholar]
  • 29. Pers  TH, Karjalainen  JM, Chan  Y  et al.  Biological interpretation of genome-wide association studies using predicted gene functions. Nat Commun. 2015;6:5890. 10.1038/ncomms6890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Grishin  D, Gusev  A. Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms. Nat Genet. 2022;54:837–49. 10.1038/s41588-022-01075-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Baca  SC, Singler  C, Zacharia  S  et al.  Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Nat Genet. 2022;54:1364–75. 10.1038/s41588-022-01168-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Berrandou  TE, Balding  D, Speed  D. LDAK-GBAT: fast and powerful gene-based association testing using summary statistics. Am J Hum Genet. 2023;110:23–9. 10.1016/j.ajhg.2022.11.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Canela-Xandri  O, Rawlik  K, Tenesa  A. An atlas of genetic associations in UK Biobank. Nat Genet. 2018;50:1593–9. 10.1038/s41588-018-0248-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Cerezo  M, Sollis  E, Ji  Y  et al.  The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 2025;53:D998–D1005. 10.1093/nar/gkae1070. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Zheng  J, Erzurumluoglu  AM, Elsworth  BL  et al.  LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics. 2017;33:272–9. 10.1093/bioinformatics/btw613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Leslie  R, O’Donnell  CJ, Johnson  AD. GRASP: analysis of genotype-phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics. 2014;30:i185–94. 10.1093/bioinformatics/btu273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Kamat  MA, Blackshaw  JA, Young  R  et al.  PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics. 2019;35:4851–3. 10.1093/bioinformatics/btz469. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Tryka  KA, Hao  L, Sturcke  A  et al.  NCBI’s database of Genotypes and Phenotypes: dbGaP. Nucleic Acids Res. 2014;42:D975–D979. 10.1093/nar/gkt1211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Hinrichs  AS, Karolchik  D, Baertsch  R  et al.  The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34:D590–D598. 10.1093/nar/gkj144. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Malone  J, Holloway  E, Adamusiak  T  et al.  Modeling sample variables with an Experimental Factor Ontology. Bioinformatics. 2010;26:1112–8. 10.1093/bioinformatics/btq099. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Auton  A, Brooks  LD, Durbin  RM  et al.  A global reference for human genetic variation. Nature. 2015;526:68–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Pei  G, Dai  Y, Zhao  Z  et al.  deTS: tissue-specific enrichment analysis to decode tissue specificity. Bioinformatics. 2019;35:3842–5. 10.1093/bioinformatics/btz138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Lonsdale  J, Thomas  J, Salvatore  M  et al. , The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013;45:580–5. 10.1038/ng.2653. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. de Bakker  PI, McVean  G, Sabeti  PC  et al.  A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 2006;38:1166–72. 10.1038/ng1885. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Bertrand  MJ, Doiron  K, Labbé  K  et al.  Cellular inhibitors of apoptosis cIAP1 and cIAP2 are required for innate immunity signaling by the pattern recognition receptors NOD1 and NOD2. Immunity. 2009;30:789–801. 10.1016/j.immuni.2009.04.011. [DOI] [PubMed] [Google Scholar]
  • 46. Hori  T, Shang  WH, Hara  M  et al.  Association of M18BP1/KNL2 with CENP-A nucleosome is essential for centromere formation in non-mammalian vertebrates. Dev Cell. 2017;42:181–9. 10.1016/j.devcel.2017.06.019. [DOI] [PubMed] [Google Scholar]
  • 47. Dyer  SC, Austine-Orimoloye  O, Azov  AG  et al.  Ensembl 2025. Nucleic Acids Res. 2025;53:D948–D957. 10.1093/nar/gkae1071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Sayers  EW, Beck  J, Bolton  EE  et al.  Database resources of the National Center for Biotechnology Information in 2025. Nucleic Acids Res. 2025;53:D20–D29. 10.1093/nar/gkae979. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Stelzer  G, Rosen  N, Plaschkes  I  et al.  The GeneCards suite: from gene data mining to disease genome sequence analyses. Curr Protoc Bioinformatics. 2016;54:1.30.31–31.30.33. 10.1002/cpbi.5. [DOI] [PubMed] [Google Scholar]


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
