Abstract
Metabolic dysfunction–associated steatotic liver disease (MASLD) is a globally prevalent disease, yet its genetic architecture remains incompletely characterized. We integrated genome-wide association study data from multiple cohorts totaling nearly 3 million individuals of European ancestry and applied cross-trait genomic modeling of hepatic fat and seven cardiometabolic traits to construct an MASLD-specific polygenic architecture. We identified 128 risk variants across 100 loci and prioritized 55 effector genes, including established (e.g., PNPLA3 and TM6SF2) and previously unreported candidates (e.g., NRXN3 and FRMD5). A phenome-wide scan of the MASLD polygenic risk score revealed broad associations spanning hepatic, cardiometabolic, renal, endocrine, and neuropsychiatric systems. Using a two-step, proteome-wide Mendelian randomization across >4900 plasma proteins, we identified potential mediators linking MASLD to disease. Validation in a population-based cohort pinpointed seven proteins (e.g., FURIN, aldehyde dehydrogenase 2, and apolipoprotein M) mediating up to 50.6% of the cardiometabolic risk attributable to MASLD. Our findings delineate the polygenic architecture of MASLD, highlight its multisystem consequences, and nominate translational biomarkers for precision prevention.
A genome-to-protein map shows how common DNA variants drive fatty liver and body-wide risks, pinpointing targets for prevention.
INTRODUCTION
Metabolic dysfunction–associated steatotic liver disease (MASLD), previously known as nonalcoholic fatty liver disease (NAFLD), is a globally prevalent metabolic disorder with increasing prevalence (1–3). Ample evidence suggests that MASLD is not merely a liver-centered disorder but a multisystem condition that contributes to cardiometabolic, renal, and endocrine diseases (4, 5). Biologically, MASLD is a complex trait (6, 7), defined by the coexistence of hepatic steatosis and systemic cardiometabolic dysfunction. However, most genome-wide association studies (GWASs) to date have focused either on hepatic fat content, quantified by magnetic resonance imaging proton density fat fraction (MRI-PDFF), or on binary NAFLD phenotypes derived from electronic health records (8–11). While these studies have provided important genetic insights into hepatic fat accumulation and NAFLD (12–14), they fall short of capturing the broader biological architecture of MASLD, which integrates both liver-specific and systemic metabolic components. To advance mechanistic understanding and inform precision prevention, genetic studies tailored specifically to MASLD are therefore critically needed.
Unraveling the genetic architecture of MASLD may also clarify its biological links to extrahepatic complications and facilitate the development of targeted interventions (15, 16). Although epidemiological studies have consistently shown MASLD to be a multisystem disorder (1, 17–19), the molecular mechanisms underlying these systemic effects remain largely unresolved. Circulating proteins in plasma may represent promising intermediaries in the MASLD-disease continuum (20–23), given their accessibility, regulatory potential, and causal links to disease processes. However, few studies have systematically investigated the mediating role of plasma proteins in translating MASLD-related genetic risk into downstream health outcomes.
To address these gaps, we integrated genetic data across hepatic and cardiometabolic traits to derive an MASLD-specific polygenic architecture. We then examined its systemic associations and identified candidate protein mediators using Mendelian randomization (MR) and observational validation. This framework represents a step toward resolving the biological complexity of MASLD and translating genetic insights into clinically actionable strategies.
RESULTS
Study design
The overall study design is shown in Fig. 1. We conducted a comprehensive integrative genomic analysis of MASLD using GWAS summary statistics from large-scale cohorts totaling nearly 3 million individuals of European ancestry. To construct an MASLD-specific genetic architecture, we implemented a two-step cross-trait modeling approach that jointly incorporated genetic signals from hepatic fat (MRI-PDFF) and major cardiometabolic risk factors (CMRFs) (4), including adiposity, blood lipids, glycemic markers, and blood pressure. This framework enabled the identification of pleiotropic loci and the prioritization of candidate genes and pathways potentially involved in MASLD pathogenesis. To evaluate the clinical relevance of the resulting MASLD polygenic risk score (PRS), we performed a phenome-wide association study (PheWAS) to systematically map its associations across a broad range of disease outcomes in the UK Biobank (UKB) cohort. To investigate potential mediating mechanisms, we further integrated proteomic data through a proteome-wide MR analysis, complemented by prospective cohort validation, to identify circulating proteins that may link MASLD genetic risk to multisystem disease burden.
Fig. 1. Overview of study design.
Schematic summary of the integrative framework used to dissect the genetic architecture and systemic consequences of MASLD, including multitrait genetic modeling, effector gene prioritization, PRS construction, PheWAS, and proteome-wide MR analysis. Created in BioRender [J. Du (2025); https://BioRender.com/6x8e7o8].
Structural equation modeling of cardiometabolic trait GWAS
To capture the shared genetic architecture underlying MASLD-related CMRFs, we first constructed a composite CMRF GWAS using genomic structural equation modeling (Genomic SEM). We compiled publicly available GWAS summary statistics of European ancestry for seven MASLD-related CMRFs: body mass index (BMI), waist circumference (WC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), glycosylated hemoglobin (HbA1c), fasting glucose (FG), and hypertension (HTN), encompassing approximately 2.95 million individuals (table S1). All datasets were drawn from nonoverlapping cohorts independent of the UKB to minimize bias due to sample overlap. Standardized quality control procedures were applied to each GWAS (table S2).
Univariate linkage disequilibrium score regression (LDSC) analyses confirmed robust polygenicity across traits, with moderate single-nucleotide polymorphism (SNP)–based heritability estimates (h2 range: 0.11 to 0.13) and minimal inflation from population stratification (table S3). Pairwise genetic correlations ranged from moderate to high, with the strongest observed between BMI and WC (rg = 0.89) (Fig. 2A). To assess the suitability of the genetic correlation matrix for latent factor analysis, we conducted a Kaiser-Meyer-Olkin (KMO) test, which yielded a value of 0.683, indicating adequate factorability (table S4). Parallel analysis based on eigenvalue distributions supported the extraction of three latent factors (fig. S1). Accordingly, we performed exploratory factor analysis (EFA) from odd-numbered chromosomes using promax rotation and evaluated one- to three-factor solutions. The three-factor model explained 69.3% of the total variance across the seven CMRFs (table S5), informing the subsequent confirmatory factor analysis (CFA) via Genomic SEM. A single-factor model from the CFA showed suboptimal fit [comparative fit index (CFI) = 0.900; standardized root mean square residual (SRMR) = 0.106], suggesting inadequate capture of genetic heterogeneity (table S6). In contrast, a hierarchical three-factor model provided a better fit (χ2 = 242.44; CFI = 0.978; SRMR = 0.036), clustering traits into biologically coherent domains: F1 (obesity and HTN), F2 (dyslipidemia), and F3 (insulin resistance) (Fig. 2B and table S7).
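As an illustration of the factorability check, the overall KMO statistic can be computed from a correlation matrix as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared partial correlations. The sketch below is a minimal, dependency-free version; the function names (`invert`, `kmo`) and the 3 × 3 toy matrix are hypothetical and are not taken from the study, which applied the test to the seven-trait genetic correlation matrix.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right-hand side.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall KMO: sum r^2 / (sum r^2 + sum partial^2) over off-diagonal entries."""
    n = len(corr)
    inv = invert(corr)
    sum_r2 = 0.0
    sum_p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation of traits i and j given all other traits.
                partial = -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_p2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_p2)


# Toy correlation matrix (illustrative only, not study data).
toy_rg = [[1.0, 0.5, 0.5],
          [0.5, 1.0, 0.5],
          [0.5, 0.5, 1.0]]
kmo_value = kmo(toy_rg)
print(round(kmo_value, 3))
```

Values closer to 1 indicate that pairwise correlations are largely explained by shared variance rather than by trait-specific pairings, which is the property a latent factor model relies on.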
Fig. 2. Heritability and shared genetic architecture of MASLD-related traits.
(A) Heatmap of pairwise genetic correlations (rg) among seven MASLD-related CMRFs: BMI, WC, HDL-C, TG, FG, HbA1c, and HTN. Right panel displays SNP-based heritability (h2) estimates and standard errors for each trait, estimated via LDSC. (B) Three-factor Genomic SEM model and multitrait analysis of GWAS (MTAG)–based multitrait GWAS of MASLD. Path diagram illustrates standardized loadings of a hierarchical model in which CMRFs load onto three latent factors. (C) Cross-trait pleiotropy landscape and circular Manhattan plot. Top left bar plot shows the number of genome-wide significant loci per trait, with black bars indicating loci shared across multiple traits and colored bars denoting trait-specific signals. Circular Manhattan plot depicts genome-wide associations for MASLD, PDFF, and seven CMRFs. Each concentric track represents one trait. MASLD loci shared by five or more traits are highlighted in red and linked by black lines, indicating hotspots of cross-trait pleiotropy.
We then constructed a composite CMRF GWAS using multivariate Genomic SEM, which estimates SNP effects on shared genetic liability across traits (figs. S2 and S3). The effective sample size (neff) of the CMRF was 790,156 with mean χ2 of 2.59, and SNP-based heritability was 0.11 (Fig. 2B and table S8). The resulting CMRF GWAS demonstrated moderate to strong genetic correlations with individual component traits (median rg = 0.68), supporting its validity as a genetically informative composite phenotype (fig. S4).
Multitrait analysis of GWAS–based genome-wide analysis of MASLD
We obtained GWAS summary statistics for liver PDFF in 32,858 individuals (24). Univariate LDSC analysis indicated moderate polygenicity, with a mean χ2 of 1.12 and SNP-based heritability (h2) of 0.17 (Fig. 2B and table S3). To construct a genetic model that captures the integrated genetic architecture of hepatic fat and cardiometabolic traits, we applied multitrait analysis of GWAS (MTAG) to jointly analyze the PDFF and composite CMRF GWAS. This cross-trait meta-analysis yielded a composite PDFF-based GWAS (hereafter referred to as the MASLD GWAS) with an increased effective sample size of 122,644 and boosted association strength (mean χ2 increased from 1.12 to 1.35) while maintaining a maximum false discovery rate (FDR) of 0.18 (Fig. 2B, table S8, and figs. S5 and S6). The genetic architecture of MASLD was polygenic, evidenced by an elevated mean χ2 value (1.35) and a genomic inflation factor (λGC = 1.06), with minimal confounding from population stratification (LDSC regression intercept < 1) (table S8).
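The two summary quantities reported here, mean χ2 and the genomic inflation factor λGC, can both be derived from per-SNP z scores. The following sketch assumes 1-degree-of-freedom association tests; `inflation_summary` and the toy z scores are hypothetical names and values used only for illustration.

```python
from statistics import mean, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_summary(z_scores):
    """Return (mean chi-square, lambda_GC) for a list of per-SNP z scores."""
    chi2 = [z * z for z in z_scores]
    return mean(chi2), median(chi2) / CHI2_1DF_MEDIAN


# Toy z scores (illustrative only, not study data).
toy_z = [0.1, -0.5, 0.6745, 1.2, -2.0]
mean_chi2, lambda_gc = inflation_summary(toy_z)
print(round(mean_chi2, 3), round(lambda_gc, 3))
```

A mean χ2 above 1 together with a λGC near 1 is the pattern expected when a minority of SNPs carry true signal, because the median is far less sensitive to the associated tail than the mean.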
We identified 128 genome-wide significant and independent variants associated with MASLD, mapping to 100 loci, markedly exceeding the number of signals (<20) reported in earlier NAFLD-related GWAS (Fig. 2C and table S9) (8, 9). Of them, 98 loci exhibited pleiotropic associations with at least two MASLD traits, and 34 loci exhibited high pleiotropy, each being associated with five or more MASLD traits (Fig. 2C and table S10). These highly pleiotropic regions included well-established MASLD susceptibility genes such as PNPLA3, APOE, TRIB1, and GPAM (8). Beyond the expected overlap with cardiometabolic traits, most lead variants from the MASLD GWAS also exhibited pleiotropic associations with additional biological traits, such as estimated glomerular filtration rate and circulating vitamin D levels, based on previously published GWAS curated in the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalog (table S9).
Functional and cellular context of MASLD risk
To provide a holistic view of MASLD genetic architecture, we conducted integrative enrichment analyses across gene sets, regulatory elements, and tissue/cell-type–specific expression profiles. Multimarker Analysis of GenoMic Annotation (MAGMA)–based gene set enrichment analysis identified 37 pathways significantly associated with MASLD after Bonferroni correction (fig. S7 and tables S11 and S12), with top signals related to lipid localization, triglyceride metabolism, and lipoprotein remodeling, highlighting lipid dysregulation as a core feature of MASLD pathogenesis. Stratified LDSC revealed 13 categories with significant heritability enrichment, including conserved elements, human promoters, histone H3 lysine 9 acetylation, and super enhancers (fig. S8 and table S13), reflecting a complex regulatory landscape. Tissue/cell-type enrichment analysis via LDSC-specifically expressed genes (SEG) demonstrated significant enrichment in liver tissue and hepatocytes, as expected, as well as in visceral adipose tissue (omentum), supporting the role of adipose-liver interactions (fig. S9 and table S14). Notably, enrichment was also observed in brain regions such as the cerebral cortex, hippocampus, and limbic system, suggesting possible neuroendocrine involvement in MASLD-related systemic regulation (tables S14 and S15).
Effector gene prediction for MASLD
We prioritized effector genes for MASLD using the FLAMES (Fine-mapped Locus Annotation and Effector gene Scoring) framework, based on fine-mapping results for the 100 identified loci. Using SuSiEx, we fine mapped 88 credible sets across autosomes, encompassing 925 putative causal variants, including 29 with high posterior inclusion probability (PIP > 0.80) (Fig. 3A and table S16). Most fine-mapped variants resided in noncoding regulatory regions, particularly intronic (49.8%) and intergenic (27.8%) sites (Fig. 3B).
Fig. 3. Fine mapping and functional characterization of MASLD-associated loci.
(A) Genomic distribution of fine-mapped SNPs across 22 autosomes, with PIP derived from SuSiEx. Variants with PIP > 0.8 are labeled. (B) Functional annotation of 925 credible set variants. (C) Multimodal effector gene prioritization integrating scores from XGBoost, PoPS, and FLAMES. Heatmap displays standardized scores across methods and cumulative precision. Genes with cumulative precision > 0.75 were retained, yielding 55 high-confidence candidates. (D) Tissue-specific expression enrichment of prioritized genes based on GTEx v8 transcriptomes. Differentially expressed gene (DEG) analysis revealed significant up-regulation in liver and visceral adipose tissue (omentum). (E) Gene Ontology (GO) enrichment analysis of prioritized genes. Left: Sankey diagram linking representative genes to top enriched GO terms spanning biological process (BP), cellular component (CC), and molecular function (MF) categories. Right: Bubble plot showing gene ratios and FDR-adjusted P values for selected enriched terms.
We mapped the 925 causal variants to 1777 candidate genes and calculated FLAMES scores for each using an integrative framework combining SNP-to-gene annotations and convergence-based evidence [e.g., polygenic priority scores (PoPS)] (tables S17 and S18). We then selected 88 genes with the highest FLAMES score within each fine-mapped region and prioritized 55 with a cumulative precision > 0.75 (Fig. 3C and table S19). Top-ranked genes included NEGR1, FTO, BDNF, GNPDA2, and MC4R, known regulators of adiposity (25–29), as well as TRIB1, APOB, APOE, CETP, and LPL, which are involved in lipoprotein metabolism (30–33). Canonical MASLD genes such as PNPLA3, TM6SF2, and GPAM were also recovered (9), supporting the robustness of the prioritization framework. In addition, we noted a few less-characterized candidate genes such as MYO1F and FRMD5, whose roles in MASLD pathogenesis remain largely unexplored and merit further functional investigation.
Tissue expression profiling based on Genotype-Tissue Expression (GTEx) data showed that prioritized genes were most strongly up-regulated in liver and visceral adipose tissue (omentum) (Fig. 3D, fig. S10, and tables S20 and S21), aligning with MASLD’s central pathophysiology (34). Pathway enrichment analysis of prioritized genes using FUMA highlighted lipid homeostasis, sterol transport, and triglyceride metabolism as key biological processes (Fig. 3E and table S22). Enriched molecular functions included lipid binding and sterol transfer activity, while cellular components were dominated by terms related to lipoprotein particles and protein-lipid complexes. These findings underscore lipid metabolic regulation as a key axis linking genetic susceptibility to MASLD.
Systemic effects of MASLD genetic liability
To evaluate the clinical relevance of MASLD genetic risk, we constructed an MASLD PRS based on the MASLD GWAS summary statistics and assessed its performance in the UKB cohort. For comparison, a PDFF-based PRS was also derived from the univariate PDFF GWAS. MASLD cases were defined as individuals with a fatty liver index (FLI) ≥ 60 (fig. S11) (35, 36) plus ≥1 cardiometabolic trait, with exclusions for excessive alcohol use, chronic viral hepatitis, and Mendelian liver diseases (Materials and Methods). We assessed liver fibrosis risk in the MASLD cohort by comparing the fibrosis-4 (FIB-4) index (37), the aspartate aminotransferase-to-platelet ratio index (APRI) (38), and the Forns score (39) between cases and controls, with the Forns score indicating a significantly higher fibrosis proportion among cases (3.4% versus 1.6%; fig. S12). Among 373,622 European-ancestry participants (51,754 MASLD cases), the MASLD PRS was robustly associated with MASLD risk in a dose-response manner [odds ratio (OR) per SD = 1.11, 95% confidence interval (CI): 1.10 to 1.12, P < 2.05 × 10^−103] and explained an incremental 0.24% of variance (95% CI: 0.20 to 0.29%), outperforming the PDFF PRS (Fig. 4, A to C, and tables S23 and S24).
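For readers less familiar with PRS construction, a score is in its simplest form a weighted sum of effect-allele dosages, standardized so that effects can be quoted per SD. The sketch below shows only that generic form; the weights, dosages, and function names are hypothetical, and the specific weighting method used in the study is described in Materials and Methods and is not reproduced here.

```python
from statistics import mean, pstdev


def polygenic_scores(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2) for each individual."""
    return [sum(w * d for w, d in zip(weights, person)) for person in dosages]


def standardize(values):
    """Rescale scores to mean 0 and SD 1 so effects are expressed per SD."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Toy per-variant weights and genotype dosages (illustrative only).
toy_weights = [0.10, -0.05, 0.20]
toy_dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]

raw_scores = polygenic_scores(toy_dosages, toy_weights)
z_scores = standardize(raw_scores)

# With an OR of 1.11 per SD (as reported in the text), a score 2 SD above
# the mean corresponds to odds multiplied by 1.11 squared under a log-linear model.
odds_multiplier_2sd = 1.11 ** 2
print([round(s, 2) for s in raw_scores], round(odds_multiplier_2sd, 4))
```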
Fig. 4. Predictive performance and phenome-wide associations of MASLD PRS in the UKB.
(A) Distribution of MASLD-PRS among MASLD cases and controls in the UKB cohort. (B) Association between MASLD-PRS deciles and MASLD risk. Points show decile-specific estimates; the curve is a spline fit with a blue shaded band. (C) Comparison of incremental variance explained (ΔR2) by MASLD-PRS and PDFF-PRS. Boxes represent the ΔR2 across bootstrap iterations, with error bars indicating the 95% CIs derived from 1000 bootstrap replicates using the percentile method. (D) PheWAS of MASLD-PRS across 13 disease categories in individuals of European ancestry. Forest plot displays significant associations after FDR correction, with ORs, 95% CIs, case counts, and corresponding P values annotated for each phenotype. HCC, hepatocellular carcinoma; GERD, gastroesophageal reflux disease; IHD, ischemic heart disease; NEC, not elsewhere classified; NOS, not otherwise specified.
To characterize the broader disease impact of MASLD genetic liability, we performed a PheWAS using the MASLD PRS. Significant associations were observed across multiple organ systems (Fig. 4D and table S25). Beyond expected positive associations with hepatic and cardiometabolic conditions (e.g., liver cirrhosis and angina pectoris), the MASLD PRS was also positively associated with a broad range of disorders, including digestive, renal, endocrine, musculoskeletal, and dermatological conditions, some of which have been reported in previous studies (40–42). Conversely, a negative association with dementia and Alzheimer’s disease was observed, although this finding has been inconsistently reported in prior epidemiological studies (43, 44). We also identified associations not well characterized in the literature, such as positive associations with iron deficiency anemias and erythematosquamous dermatosis. In non-European participants, associations did not reach statistical significance for most diseases (table S26). These results underscore the multisystemic impact of MASLD genetic liability and highlight phenotypic domains that may warrant further mechanistic investigation.
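Because a PheWAS tests many phenotypes at once, significance is assessed after FDR correction (Fig. 4D). The sketch below implements the Benjamini-Hochberg step-up adjustment as one common way to do this; it is an assumption that this particular procedure matches the one used in the study, and the p values are toy inputs.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p values for four hypothetical phenotypes (illustrative only).
toy_p = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in q_values]
print([round(q, 4) for q in q_values], significant)
```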
Two-step proteome-wide MR analysis
To elucidate molecular mechanisms linking MASLD to multisystem diseases, we implemented a two-step proteome-wide MR framework (Fig. 5A). In step 1, we performed MR to assess the causal effects of MASLD on 4907 plasma proteins using data from the deCODE proteomic GWAS (45) (n = 35,559). Step 2 assessed the impact of MASLD-driven proteins on phenotypes that were identified in the MASLD PRS-PheWAS and had available GWAS summary data (table S27). We further validated these associations in the UKB cohort and quantified mediation effects of proteins in the MASLD-disease pathway.
Fig. 5. Two-step MR framework and cohort validation identify plasma proteins mediating MASLD-related disease risk.
(A) Overview of the two-step MR framework linking MASLD genetic liability to clinical outcomes via circulating proteins. (B) Circos plot illustrating 47 protein-phenotype pairs with concordant MR and PheWAS effects. (C) Summary of 13 high-confidence protein-phenotype pairs supported by both two-step MR and observational validation in UKB. (D) Cumulative risk curves stratified by baseline protein quartiles (Q1 to Q4) in the UKB participants. Curves represent cumulative incidence across quartiles; shaded areas denote 95% CIs. (E) Sankey diagram and scatter plot summarizing protein-mediated pathways from genetically predicted MASLD to multiple clinical outcomes in a two-step MR framework. Bubble size denotes the proportion of MASLD-associated disease risk explained by each protein. nSNP, number of SNPs.
The step 1 MR analysis revealed 514 proteins significantly influenced by MASLD (hereafter “MASLD-driven proteins”) based on inverse-variance weighted (IVW) method, with no evidence of weak instruments or horizontal pleiotropy (table S28 and fig. S13). Known MASLD-associated proteins such as apolipoprotein E (apoE), GCKR, and alcohol dehydrogenase family members were noted. In addition, proteins such as PHGDH, DCUN1D5, and CRYZL1 showed strong positive effects, while apolipoprotein M (apoM) and MENT were negatively associated with MASLD genetic liability.
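The IVW estimator used in step 1 combines per-variant information by regressing SNP-outcome effects on SNP-exposure effects through the origin, weighting each variant by the inverse variance of its outcome association. A minimal fixed-effect sketch follows; the function name and the three toy instruments are hypothetical, and the study's exact implementation (e.g., random-effects variants or software defaults) is not specified in this section.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error."""
    numerator = sum(bx * by / (se * se)
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    total_weight = sum(bx * bx / (se * se)
                       for bx, se in zip(beta_exposure, se_outcome))
    return numerator / total_weight, sqrt(1.0 / total_weight)


# Toy instruments: SNP effects on exposure (MASLD) and outcome (a protein).
# Values are illustrative only; here each outcome effect is half the exposure effect.
toy_bx = [0.10, 0.20, 0.15]
toy_by = [0.05, 0.10, 0.075]
toy_se = [0.01, 0.01, 0.01]

ivw_beta, ivw_se = ivw_estimate(toy_bx, toy_by, toy_se)
print(round(ivw_beta, 3), round(ivw_se, 4))
```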
In step 2, we used cis–protein quantitative trait loci (pQTLs) from a meta-curated proteomic resource (table S29) to assess the causal effects of MASLD-driven proteins on clinical outcomes. A total of 231 proteins with available cis-pQTLs and 58 diseases with GWAS summary statistics were included, resulting in 13,398 protein-phenotype pairs tested by MR. After stringent filtering, 82 associations remained significant, of which 47 associations involving 25 unique proteins were retained on the basis of directional consistency between MR estimates and PheWAS results (fig. S14 and tables S30 to S32). Proteins such as FURIN, aldehyde dehydrogenase 2 (ALDH2), SERPINA1, and GSS were linked to multiple endpoints, including liver disease, type 2 diabetes (T2D), and cardiovascular conditions (Fig. 5B).
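The number of tests and the directional-consistency filter can be made concrete with a short sketch. The pair count follows directly from the text (231 proteins × 58 diseases). The consistency rule shown, that the sign implied by the MASLD-to-protein and protein-to-disease effects must match the sign of the PRS-disease association, is one plausible reading of the filter and should be treated as an assumption; the function names are hypothetical.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def directionally_consistent(beta_masld_to_protein,
                             beta_protein_to_disease,
                             beta_prs_to_disease):
    """True when the protein-mediated path implies the same direction as the PheWAS."""
    implied = sign(beta_masld_to_protein * beta_protein_to_disease)
    return implied != 0 and implied == sign(beta_prs_to_disease)


# Counts taken from the text: every protein is paired with every disease.
n_proteins, n_diseases = 231, 58
n_pairs = n_proteins * n_diseases
print(n_pairs)

# Toy effect sizes (illustrative only).
print(directionally_consistent(0.2, 0.3, 0.1))    # both steps positive, PheWAS positive
print(directionally_consistent(-0.2, 0.3, 0.1))   # implied negative, PheWAS positive
```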
To validate these findings in a real-world population, we examined protein-disease associations in 52,633 UKB participants with baseline plasma protein data (46). Of the 25 MR-implicated proteins from the deCODE proteomic GWAS, only 16 were measurable in UKB owing to platform differences. Cox proportional hazard models, adjusted for demographic and lifestyle factors, confirmed 13 protein-disease pairs involving seven proteins [FURIN, ALDH2, RARRES1, lymphotoxin-alpha (LTA), apoM, thialysine N-epsilon-acetyltransferase (SAT2), and enoyl-CoA hydratase, mitochondrial (ECHS1)] (Fig. 5C and table S33). These proteins were linked to incident cardiometabolic and endocrine outcomes, including coronary atherosclerosis, angina, HTN, T2D, hypothyroidism, and hypoglycemia. Kaplan-Meier curves showed a clear dose-response pattern for all associations except the LTA–type 1 diabetes pair, with higher protein levels predicting greater disease incidence (Fig. 5D and table S34).
Mediation analysis quantified the proportion of the MASLD effect on disease risk explained by protein intermediates (Fig. 5E). FURIN (also known as PCSK3), an important mammalian proprotein convertase (47), emerged as a major mediator linking MASLD to cardiovascular diseases, with the mediated proportion ranging from 12.0% for angina pectoris to 50.6% for essential HTN. Other proteins also showed modest but significant mediating effects: ALDH2, apoM, RARRES1, and SAT2 mediated MASLD-associated risks of essential HTN, hypothyroidism, and T2D, with the proportion mediated ranging from 2.2 to 10.0% (table S35). For one pair (ECHS1-hypoglycemia), the mediation effect did not reach statistical significance despite consistent associations in MR and cohort analyses.
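In a two-step MR setting, the proportion mediated is commonly obtained with the product-of-coefficients approach: the indirect effect is the MASLD-to-protein effect multiplied by the protein-to-disease effect, and it is divided by the total MASLD-to-disease effect. The sketch below assumes that approach and a first-order delta-method standard error; the estimator actually used in the study is not detailed in this section, and all names and numbers are hypothetical.

```python
from math import sqrt


def proportion_mediated(beta_em, se_em, beta_mo, se_mo, beta_total):
    """Product-of-coefficients mediation from two-step MR estimates.

    beta_em: effect of exposure (MASLD) on mediator (protein)
    beta_mo: effect of mediator (protein) on outcome (disease)
    beta_total: total effect of exposure on outcome
    Returns (indirect effect, SE of indirect effect, proportion mediated).
    """
    indirect = beta_em * beta_mo
    # First-order delta-method SE, assuming the two estimates are independent.
    se_indirect = sqrt(beta_em ** 2 * se_mo ** 2 + beta_mo ** 2 * se_em ** 2)
    return indirect, se_indirect, indirect / beta_total


# Toy estimates on the log-odds scale (illustrative only, not study data).
indirect, se_indirect, prop = proportion_mediated(
    beta_em=0.2, se_em=0.05, beta_mo=0.5, se_mo=0.1, beta_total=0.4)
print(round(indirect, 3), round(se_indirect, 4), round(prop, 3))
```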
Genetic relationships between MASLD effector genes and plasma protein mediators
We next asked whether the 55 MASLD-prioritized genes exert regulatory effects on circulating levels of the 25 protein mediators. Using Bayesian colocalization between the MASLD GWAS locus corresponding to each effector gene and the plasma-protein GWAS (±250-kb windows), we tested whether each gene-protein pair shared a causal variant. Across all pairs, 13 loci showed strong evidence of colocalization [posterior probability for hypothesis 4 (PP.H4) > 0.8] spanning multiple proteins (fig. S15 and table S36). Several canonical MASLD genes displayed broad cross-protein signals; for example, APOE colocalized with apoM, CETP, endothelial lipase (LIPG), and HSPA1B. Regional locus plots further highlighted pleiotropic colocalization signals at representative loci, including APOE (rs429358), PNPLA3 (rs3747207), and FTO (rs62048402) (figs. S16 to S18), supporting shared genetic regulation between these genomic loci and plasma proteins.
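The PP.H4 statistic can be illustrated with the standard single-causal-variant colocalization calculation, in which per-SNP approximate Bayes factors for two traits are combined into posterior probabilities for five hypotheses (H0: no association; H1 and H2: association with one trait only; H3: distinct causal variants; H4: a shared causal variant). The sketch below assumes that formulation; the prior probabilities (1e-4, 1e-4, 1e-5), the prior effect SD of 0.15, and the toy summary statistics are assumptions for illustration and are not taken from the study.

```python
from math import exp, log


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    peak = max(values)
    return peak + log(sum(exp(v - peak) for v in values))


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP (prior_sd is an assumed value)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    z = beta / se
    return 0.5 * (log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities for H0..H4 under a single-causal-variant model."""
    n = len(labf1)
    lh1 = log(p1) + log_sum_exp(labf1)
    lh2 = log(p2) + log_sum_exp(labf2)
    # H3: distinct causal variants, so sum over all ordered pairs i != j.
    cross = [labf1[i] + labf2[j] for i in range(n) for j in range(n) if i != j]
    lh3 = log(p1) + log(p2) + log_sum_exp(cross)
    # H4: the same variant is causal for both traits.
    lh4 = log(p12) + log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    log_h = [0.0, lh1, lh2, lh3, lh4]
    denom = log_sum_exp(log_h)
    return [exp(v - denom) for v in log_h]


# Toy region of five SNPs, each with SE 0.01 (illustrative only).
se = 0.01
z_trait1 = [0.5, 8.0, 0.3, -0.2, 1.0]
z_shared = [0.2, 7.0, 0.1, 0.4, -0.5]    # peak at the same SNP as trait 1
z_distinct = [0.2, 0.4, 0.1, 7.0, -0.5]  # peak at a different SNP

labf_trait1 = [log_abf(z * se, se) for z in z_trait1]
pp_shared = coloc_posteriors(labf_trait1, [log_abf(z * se, se) for z in z_shared])
pp_distinct = coloc_posteriors(labf_trait1, [log_abf(z * se, se) for z in z_distinct])
print("PP.H4 shared peak > 0.8:", pp_shared[4] > 0.8)
print("PP.H3 distinct peaks > 0.8:", pp_distinct[3] > 0.8)
```

When both traits peak at the same SNP, nearly all posterior mass falls on H4; when the peaks sit at different SNPs, it shifts to H3, which is why a PP.H4 threshold separates shared from merely neighboring signals.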
To further delineate regulatory links between MASLD effector genes and circulating proteins, we performed two-sample MR using liver expression quantitative trait loci (eQTLs) as instruments and the 25 candidate proteins as outcomes. Among 21 MASLD-prioritized genes with valid instruments, 22 gene-protein pairs showed nominal causal effects, involving 10 distinct genes (table S37). Several genes (e.g., CETP and COLEC11) were linked to multiple protein targets, indicating potential pleiotropic regulation of plasma proteins. Notably, pairs such as CETP-apoM, COLEC11-BDH2, and COLEC11-SMPDL3A also exhibited strong colocalization (PP.H4 > 0.80; fig. S15), providing convergent genetic support for shared regulatory architecture between MASLD loci and plasma protein abundance.
DISCUSSION
MASLD is a common and multisystem condition. Despite its growing burden, a comprehensive understanding of the genetic and molecular mechanisms underlying MASLD is still lacking. To address these challenges, we developed an integrative cross-trait genomic modeling framework to systematically dissect the polygenic basis and systemic impact of MASLD. This integrative framework yielded four key insights. First, we developed a high-powered MASLD GWAS by jointly modeling hepatic fat and cardiometabolic traits, identifying more than 100 risk loci and improving locus discovery beyond previous GWASs. Second, we revealed widespread pleiotropy at MASLD loci, implicating shared genetic underpinnings across hepatic, cardiovascular, and endocrine traits. Third, we validated an MASLD PRS that not only predicted MASLD risk but also captured its systemic disease burden across organ systems. Fourth, by integrating proteome-wide MR with longitudinal validation, we identified circulating proteins that mediate MASLD’s impact on cardiometabolic and endocrine outcomes, providing proteomic insights into the MASLD systemic risk.
Our study addresses several persistent methodological challenges in MASLD genomic discovery. On the one hand, traditional MASLD GWASs, relying either on imaging-derived hepatic fat or clinical diagnostic codes (8, 9, 11), are limited by modest sample sizes, diagnostic heterogeneity, and potential misclassification, especially in capturing subclinical or metabolically driven cases. On the other hand, MASLD is not a single disease entity but a complex syndrome characterized by heterogeneous hepatic and extrahepatic manifestations (1). This syndromic nature challenges traditional, liver-centric definitions and highlights the limitations of prior studies in capturing its full systemic complexity. To overcome these limitations, we implemented a cross-trait genomic modeling strategy that jointly leverages genetically correlated hepatic and cardiometabolic traits to construct a refined phenotype framework. This approach enhances statistical power, mitigates misclassification bias, and better captures the multisystem complexity of MASLD (4). This strategy also enabled robust genetic discovery and functional downstream analyses (48, 49), including pleiotropy mapping and prioritization of causal genes and biological pathways, thereby deepening our understanding of MASLD as a multisystem disorder. By integrating hepatic fat with cardiometabolic traits, our approach redefines MASLD from a genomic perspective, offering a more biologically coherent disease model.
We uncovered three latent cardiometabolic factors, obesity-HTN, dyslipidemia, and insulin resistance, reflecting the interconnected systemic profile of MASLD (5). Many loci showed high pleiotropy—more than 30 influenced five or more MASLD-related traits. Key loci such as PNPLA3, FTO, TRIB1, APOE, and GPAM exhibited consistent effects across hepatic and extrahepatic domains, underscoring the polygenic and multisystem nature of MASLD. Through integrative fine mapping, we prioritized 55 effector genes enriched in pathways central to lipid metabolism, lipoprotein remodeling, and hepatic function. Established MASLD genes (PNPLA3, APOE, GPAM, and TM6SF2) were recovered alongside upstream regulators of adiposity and insulin (FTO, LPL, BDNF, NEGR1, MC4R, IRS1, and INSR), reinforcing the role of adipose and glucose overflow in hepatic lipid accumulation. Expression enrichment in liver and adipose tissue supported MASLD’s core pathogenic axis: cross-talk between hepatic steatosis and impaired adipose expandability (50). Some genes (e.g., EBF1 and COLEC11) also participate in immune and inflammatory regulation (51, 52), suggesting shared pathways between metabolic and immune dysfunction in MASLD (53). These findings offer mechanistic insights into MASLD pathogenesis and underscore the need for future studies to dissect the coordinated roles of lipid metabolism, adipose and glucose dysfunction, and immune signaling in disease initiation and progression.
We demonstrated the translational relevance of MASLD genetics through phenome-wide analyses. The MASLD PRS, derived from our cross-trait GWAS, was associated with a wide spectrum of clinical outcomes: hepatic (e.g., cirrhosis and liver cancer), cardiometabolic (e.g., myocardial infarction and T2D), renal (e.g., chronic kidney disease), and endocrine diseases (e.g., gout and hypothyroidism). Notably, the MASLD PRS outperformed a PDFF-derived PRS in predicting MASLD status and associated comorbidities, highlighting the added value of integrating systemic components in genetic modeling. Associations with musculoskeletal, hematologic, and dermatologic conditions—some rarely reported previously—warrant further prospective cohort study validation and mechanistic investigation. A negative association with Alzheimer’s disease was observed, echoing prior conflicting epidemiologic findings (44, 54). These findings not only reinforce the systemic nature of MASLD but also highlight the utility of cross-trait genetic approaches in uncovering unexpected disease links, offering potential avenues for mechanistic research and risk stratification in precision medicine.
To explore potential mechanisms linking MASLD to systemic disease, we implemented a two-step proteome-wide MR framework. We identified more than 500 proteins influenced by MASLD and tested their downstream effects on MASLD-related outcomes. After stringent filtering, we identified 47 high-confidence protein-phenotype pairs, of which 13 were validated in the UKB. Notably, FURIN emerged as a key mediator of cardiovascular risk, explaining up to 32.2% of MASLD’s effect on coronary atherosclerosis, consistent with its known role in vascular remodeling (55, 56). Other validated proteins included ALDH2 (essential HTN), SAT2 (T2D), and apoM (hypothyroidism), offering mechanistic insights and potential biomarker targets. Most associations showed dose-response patterns in Kaplan-Meier curves, with higher protein levels predicting increased disease risk. Mediation analysis confirmed partial but significant roles for these proteins, reinforcing their clinical and mechanistic relevance. Our integrative colocalization and gene-protein MR analyses further delineate the regulatory landscape linking MASLD effector genes to circulating proteins. APOE exhibited pleiotropic colocalization with several lipoprotein-related proteins (apoM, CETP, and LIPG), whereas PNPLA3 and FTO also showed shared causal signals with multiple proteins, consistent with their established functions in hepatic lipid metabolism and systemic metabolic regulation. Notably, most colocalized pairs were located on different chromosomes, suggesting trans-regulatory effects rather than local cis-mechanisms. Thus, colocalization likely reflects coordinated genetic regulation, complementing MR evidence of causal links between gene expression and protein abundance. This proteogenomic framework enhances our understanding of MASLD’s biological cascade, from genetic risk to circulating proteins to disease onset, bridging molecular intermediates and clinical endpoints. It also highlights the potential of plasma proteins as biomarkers and therapeutic targets, enabling protein-informed risk stratification and tailored intervention.
Several limitations should be acknowledged. First, our genetic analysis focused on common variants; rare or structural variants were not captured. Second, causal gene prioritization was based on statistical integration and requires experimental validation. Third, our analyses were largely restricted to individuals of European ancestry, potentially limiting generalizability. The diminished performance of PRSs trained in European cohorts when applied to non-European groups likely reflects differences in allele frequencies, linkage disequilibrium (LD) patterns, and GWAS sample sizes and may be further amplified by environmental and sociocultural factors (57, 58). Developing equitable, ancestry-diverse models will require expanding GWAS in underrepresented populations and conducting transethnic meta-analyses so that MASLD PRSs can be deployed broadly without worsening health disparities. Fourth, plasma pQTLs may not fully capture disease-relevant regulation in specific tissues—notably liver and adipose, which are central to MASLD pathophysiology—underscoring the need for integrative analyses that combine tissue-resolved eQTLs, pQTLs, and single-cell multiomics to enhance biological specificity and interpretability. Last, although our results genetically nominate FURIN and other proteins as key mediators of MASLD-related systemic outcomes, we acknowledge that direct mechanistic validation is still essential to establish causality. Targeted perturbation studies (e.g., viral or nanoparticle-mediated modulation of gene expression and protein abundance) in relevant MASLD and cardiovascular models will be required to rigorously confirm these causal links and to fully assess their therapeutic potential. This mechanistic validation represents an essential next step, as genetic inference alone, while powerful, cannot fully resolve the specific biological context or the precise cellular and temporal processes involved. 
Thus, the analytical framework established here provides a solid foundation and crucial prioritization tool for subsequent experimental validation and translational development.
In conclusion, our study presents a comprehensive framework for dissecting MASLD as a genetically complex and multisystem disease. By leveraging cross-trait genomic modeling integrated with proteomic and phenomic data, we enhance locus discovery, uncover systemic disease pathways, and identify circulating mediators with translational relevance. These findings support a shift from liver-centric models toward holistic, precision-guided strategies for MASLD prevention and treatment, addressing the full spectrum of metabolic dysfunction. Our work establishes a foundational resource for the field, demonstrating how integrative genomics can decode the biological complexity of MASLD and guide future efforts in mechanistic research, biomarker discovery, and therapeutic development.
MATERIALS AND METHODS
GWAS population
We investigated eight traits relevant to MASLD, including (i) liver PDFF and (ii) seven MASLD-related CMRFs: BMI, WC, HbA1c, FG, TG, HDL-C, and HTN. PDFF is the current gold-standard imaging biomarker for hepatic fat quantification, reflecting intrahepatic triglyceride accumulation (59). The seven CMRFs were selected on the basis of established MASLD diagnostic criteria (4). GWAS summary statistics for liver PDFF were obtained from the UKB (24), while those for the seven MASLD-related CMRFs were obtained from the largest available European-ancestry studies that did not include UKB participants.
GWAS quality control
We performed stringent quality control on GWAS summary statistics using EasyQC (v.23.8) and MungeSumstats (v.1.14.1), with the 1000 Genomes Project Phase 3 European panel as the reference (60). SNPs were excluded if they (i) were monomorphic or had a minor allele frequency (MAF) < 0.5%; (ii) had missing, invalid, or ambiguous alleles; (iii) were poorly imputed (INFO < 0.9, if available); (iv) were duplicated or multiallelic; or (v) were absent from or inconsistent with the reference panel. All datasets were harmonized to the GRCh37 (hg19) genome build.
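The exclusion rules above can be sketched in a few lines of Python. This is a minimal illustration, not the EasyQC or MungeSumstats implementation: the field names (`snp`, `a1`, `a2`, `maf`, `info`) and both helper functions are hypothetical, "invalid" alleles are taken to mean anything other than a single A, C, G, or T, and the consistency check against the reference panel is reduced to a simple ID lookup.

```python
# Hypothetical row format: {"snp": str, "a1": str, "a2": str, "maf": float, "info": float (optional)}
VALID_BASES = {"A", "C", "G", "T"}


def passes_qc(row, reference_ids, maf_min=0.005, info_min=0.9):
    """Return True if one summary-statistic row survives filters (i), (ii), (iii), and (v)."""
    a1, a2 = row.get("a1"), row.get("a2")
    # (ii) missing or invalid alleles (assumed here: not a single A/C/G/T, or identical alleles)
    if a1 not in VALID_BASES or a2 not in VALID_BASES or a1 == a2:
        return False
    # (i) monomorphic (MAF = 0) or rare variants
    maf = row.get("maf")
    if maf is None or maf < maf_min:
        return False
    # (iii) poor imputation, checked only when INFO is reported
    info = row.get("info")
    if info is not None and info < info_min:
        return False
    # (v) variant must be present in the reference panel
    return row["snp"] in reference_ids


def run_qc(rows, reference_ids):
    """Apply the per-row filters and drop every SNP ID that appears more than once (iv)."""
    counts = {}
    for row in rows:
        counts[row["snp"]] = counts.get(row["snp"], 0) + 1
    return [r for r in rows if counts[r["snp"]] == 1 and passes_qc(r, reference_ids)]
```

Dropping all copies of a duplicated ID, rather than keeping one, is a conservative choice made for this sketch only.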
Genetic correlation and SNP-based heritability
We applied LDSC (61) to evaluate genomic inflation and estimate SNP-based heritability. Intercepts near 1 and low attenuation ratios indicated that observed inflation was due to polygenicity rather than population stratification. Pairwise genetic correlations were computed across MASLD-related traits and between the composite CMRF construct and individual traits, validating the latent phenotype. Notably, the HDL-C GWAS was reverse coded to align risk directions. Analyses used precomputed LD scores from ~1.3 million high-quality HapMap3 SNPs in the European 1000 Genomes panel. SNPs were filtered for MAF > 0.01 and INFO > 0.9. For binary traits, total sample size (Ncases + Ncontrols) was used.
KMO test and parallel analysis
We used the KMO test to assess sampling adequacy for factor analysis; values > 0.5 were considered acceptable (48). Analyses were performed using the psych R package (v.2.4.12). We conducted parallel analysis (62) to determine the optimal number of factors to retain for subsequent EFA. Observed eigenvalues from the genetic correlation matrix were compared against those from Monte Carlo–simulated noise matrices generated using the sampling covariance matrix. Both matrices were derived from the LDSC function in Genomic SEM (v.0.0.5) (63) based on GWAS summary statistics for the seven MASLD traits. The number of retained factors was determined where observed eigenvalues exceeded simulated ones.
EFA and CFA
EFA was conducted using the factanal function (R stats v.4.0.5) on traits from odd-numbered chromosomes. Guided by parallel analysis results, we fitted models with one to three latent factors, applying promax rotation to account for factor correlations. Factors explaining ≥15% of the total variance were retained, following prior studies (48, 49). CFA was then performed using Genomic SEM on even-numbered chromosomes to validate the EFA-derived structure. Indicators with loadings ≥ 0.3 were retained. CFA models were specified under unit variance identification and estimated using diagonally weighted least squares. Model fit was evaluated using standard indices: χ2, CFI (with ≥0.95 indicating good fit), Akaike information criterion, and SRMR (with <0.08 indicating good fit) (64).
Multivariate Genomic SEM for CMRF GWAS
We applied Genomic SEM to jointly model seven MASLD-related CMRF traits. Stage 1 used cross-trait LDSC to estimate the genetic covariance matrix from univariate GWASs. Stage 2 incorporated SNP associations into a structural equation model to estimate a multivariate GWAS, reflecting shared genetic architecture across traits. A hierarchical factor model, scaled using unit loading identification, demonstrated superior fit and was selected for downstream analyses. The final model included 2,202,333 SNPs with MAF > 1%. Effective sample size (Neff) was calculated using the method of Mallard et al. (65).
Construction of MASLD GWAS using MTAG
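To enhance power for MASLD locus discovery, we applied MTAG (66) to integrate the liver PDFF GWAS with the CMRF latent trait GWAS. MTAG models the shared variance-covariance structure of SNP effects across traits to increase statistical power and reduce mean squared error. Harmonization steps included aligning alleles, excluding multiallelic variants, and retaining palindromic SNPs after strand unification to GRCh37. The resulting MASLD GWAS retained the meta-analytic format of the PDFF GWAS and showed modest inflation in false positives (MaxFDR reported). The effective sample size was calculated on the basis of the increase in the mean χ2 statistic using the formula

$$N_{\mathrm{eff}} = N_{\mathrm{PDFF}} \times \frac{\bar{\chi}^{2}_{\mathrm{MTAG}} - 1}{\bar{\chi}^{2}_{\mathrm{PDFF}} - 1}$$

where $N_{\mathrm{PDFF}}$ is the sample size of the input PDFF GWAS, and $\bar{\chi}^{2}_{\mathrm{PDFF}}$ and $\bar{\chi}^{2}_{\mathrm{MTAG}}$ are the mean χ2 statistics before and after MTAG, respectively.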
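As a worked illustration of this calculation, the sketch below assumes the GWAS-equivalent sample size definition used by MTAG, in which the input sample size is scaled by the ratio of (mean χ2 − 1) after versus before MTAG. The function names and all numerical inputs are hypothetical and are not values from this study.

```python
def mean_chi2(z_scores):
    """Mean chi-square statistic of a set of SNP z-scores."""
    return sum(z * z for z in z_scores) / len(z_scores)


def gwas_equivalent_n(n_gwas, mean_chi2_gwas, mean_chi2_mtag):
    """Scale the input GWAS sample size by the gain in mean chi-square after MTAG."""
    if mean_chi2_gwas <= 1.0:
        raise ValueError("mean chi-square of the input GWAS must exceed 1")
    return n_gwas * (mean_chi2_mtag - 1.0) / (mean_chi2_gwas - 1.0)


# Hypothetical inputs: 30,000 samples, mean chi-square rising from 1.10 to 1.25
n_eff = gwas_equivalent_n(30_000, 1.10, 1.25)
```

Under these made-up inputs, the MTAG result would carry roughly the power of a single-trait GWAS two and a half times larger than the input.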
Independent significant lead SNPs were identified using PLINK 1.9 (67) (--clump) with the European 1000 Genomes Phase 3 reference panel (P < 5 × 10−8, LD r2 < 0.001, 100-kb window). Nearby lead SNPs (<250 kb) were merged into a single locus, represented by the variant with the lowest P value. Loci were annotated using FUMA (v.1.8.0) (68) and cross-referenced with the NHGRI-EBI GWAS Catalog (69). The human leukocyte antigen region was excluded because of its complex LD structure.
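The locus-merging step can be sketched as follows. This is an illustrative reading of the rule, not the pipeline used in the study: it assumes that clumped lead SNPs are chained together whenever consecutive positions on the same chromosome are less than 250 kb apart, and the tuple layout and function name are hypothetical.

```python
def merge_lead_snps(lead_snps, max_gap_bp=250_000):
    """Merge clumped lead SNPs into loci.

    lead_snps: iterable of (snp_id, chrom, pos_bp, p_value) tuples.
    Returns one dict per locus; the representative is the SNP with the lowest P value.
    """
    loci = []
    for snp_id, chrom, pos, p in sorted(lead_snps, key=lambda s: (s[1], s[2])):
        last = loci[-1] if loci else None
        if last is not None and last["chrom"] == chrom and pos - last["end"] < max_gap_bp:
            # Close enough to the previous lead SNP: extend the current locus
            last["end"] = pos
            last["members"].append(snp_id)
            if p < last["lead_p"]:
                last["lead"], last["lead_p"] = snp_id, p
        else:
            # Otherwise open a new locus
            loci.append({"chrom": chrom, "start": pos, "end": pos,
                         "lead": snp_id, "lead_p": p, "members": [snp_id]})
    return loci
```

Measuring the gap from the most recently added SNP rather than from the locus lead is an assumption; other tools define the merging distance differently.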
Pleiotropy analysis
To identify pleiotropic loci, we evaluated overlap between MASLD-associated lead SNPs and those from PDFF or CMRF GWAS. MASLD loci within 1 Mb of any PDFF or CMRF lead SNP were considered shared and annotated with a common locus ID. Fuji plot [v.1.0.3; Kanai et al. (70)] was used to visualize trait-specific and pleiotropic loci. For pleiotropic regions, candidate effector genes were nominated using integrative evidence from Open Targets Genetics (71) and FUMA annotations; otherwise, the nearest protein-coding gene was reported.
Gene-based and gene set analyses
Gene-based and gene set analyses were performed using MAGMA (72), as implemented in FUMA. Gene-based tests aggregated SNP associations to 18,894 protein-coding genes using an SNP-wise mean model, accounting for LD structure. Gene set enrichment tested 17,023 curated and Gene Ontology (GO) gene sets from the Molecular Signatures Database (MSigDB; v.2023.1) (73), spanning biological processes, molecular functions, and cellular components. Enrichment was assessed via competitive testing with Bonferroni correction.
Functional, cell- and tissue-type enrichments
To characterize functional relevance of MASLD-associated loci, we applied stratified LDSC (S-LDSC) (74) to partition SNP-based heritability across 84 genomic annotations. Enrichment was defined as the proportion of heritability relative to the proportion of SNPs in each annotation, with significance threshold P < 5.95 × 10−4 (0.05/84). Tissue- and cell-type–specific enrichments were assessed using LDSC-SEG (75, 76) across 205 expression datasets from GTEx and Franke lab (77) and 489 epigenomic annotations from Roadmap Epigenomics (78) and the Encyclopedia of DNA Elements project (79). All analyses used precomputed European LD scores from the 1000 Genomes Phase 3 reference.
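The enrichment statistic and the multiple-testing threshold described above reduce to two ratios, sketched here in Python. The function names and the input dictionary layout are hypothetical; S-LDSC itself additionally estimates standard errors by block jackknife, which this sketch omits.

```python
def enrichment(prop_h2, prop_snps):
    """Enrichment = share of SNP heritability divided by share of SNPs in the annotation."""
    return prop_h2 / prop_snps


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def significant_annotations(results, n_tests):
    """results: dict name -> (prop_h2, prop_snps, p_value). Returns names passing the threshold."""
    cutoff = bonferroni_threshold(n_tests)
    return sorted(name for name, (_, _, p) in results.items() if p < cutoff)
```

With 84 annotations, `bonferroni_threshold(84)` gives approximately 5.95 × 10−4, matching the threshold stated in the text.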
Gene prioritization with FLAMES
To systematically prioritize effector genes underlying MASLD-associated genetic loci, we applied FLAMES (80), a supervised machine-learning framework that integrates locus-specific functional SNP-to-gene evidence and genome-wide gene convergence signals. We assumed that each true SNP-phenotype association reflects a single causal gene through which the SNP exerts its effect on MASLD. This gene is referred to as the effector gene of the corresponding SNP. FLAMES integrates two evidence streams: (i) SNP-to-gene functional annotations (e.g., eQTLs and enhancers) and (ii) PoPS (81), reflecting genome-wide convergence of gene-level functional features.
Fine mapping of causal variants
Significant MASLD loci were fine mapped using SuSiEx (82) with European UKB data as reference. We defined 500-kb windows centered on lead SNPs and assumed one causal variant per region to reduce the false-positive rate (80). For each locus, 95% credible sets were constructed on the basis of PIPs.
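Under the single-causal-variant assumption, a 95% credible set is the smallest group of SNPs whose PIPs sum to at least 0.95. The sketch below shows that construction only; it is not the SuSiEx algorithm, and the function name and the renormalization of PIPs to sum to 1 are assumptions made for illustration.

```python
def credible_set(pips, coverage=0.95):
    """Return the smallest set of SNPs whose cumulative (renormalized) PIP reaches `coverage`.

    pips: dict mapping SNP ID -> posterior inclusion probability.
    """
    total = sum(pips.values())
    ranked = sorted(pips.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, pip in ranked:
        chosen.append(snp)
        cumulative += pip / total
        if cumulative >= coverage:
            break
    return chosen
```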
Functional annotations of candidate effector genes
For each credible set, SNPs were linked to candidate genes using FLAMES annotate, which incorporates both SNP-specific (e.g., eQTLs) and regional (e.g., enhancer-gene links) locus-to-gene annotations. These locus-based gene annotations were used as inputs for an XGBoost classifier to generate locus-based SNP-to-gene annotation scores, reflecting the strength of combined functional evidence linking the gene to the credible set. Locus-based annotation scores were computed by multiplying the PIP of each SNP by the corresponding gene annotation score. For regional annotations, the score was calculated as the product of the SNP’s PIP and the annotation strength for the region-gene pair. Genes within the default window of 750 kb around the fine-mapped lead SNP were annotated to each locus.
PoPS integration and scoring to identify effector genes
Gene-level PoPS were computed on the basis of MASLD MAGMA z-scores using a weighted model of gene expression, coexpression, pathways, and protein-protein interactions. FLAMES combined locus-based scores and PoPS via XGBoost to generate a unified raw score per gene, linearly scaled within each locus. To evaluate confidence, FLAMES estimates cumulative precision via benchmarking against curated gene-trait pairs (e.g., exome-wide association study-validated loci). Specifically, a series of thresholds was evaluated on held-out datasets to identify the combination of raw and scaled score cutoffs that maximized recall while maintaining a minimum of 75% precision. A polynomial regression was then fitted to model the relationship between the scaled FLAMES score and cumulative precision, enabling FLAMES to report precision estimates for all gene predictions.
Pathway and tissue specificity of effector genes
We performed gene set enrichment analysis using the GENE2FUNC module in FUMA. Hypergeometric tests evaluated overrepresentation in curated gene sets from MSigDB (73), WikiPathways (83), and the GWAS Catalog (69), spanning canonical pathways and GO terms.
Tissue-specific expression was evaluated using GTEx v8 transcriptomic data (84). Prioritized genes were tested for enrichment in tissue-specific differentially expressed gene (DEG) sets, precomputed via pairwise t tests (Bonferroni-adjusted P ≤ 0.05 and |log2FC| ≥ 0.58). Up-regulated, down-regulated, and two-sided DEG sets were tested separately using hypergeometric tests, with Bonferroni correction applied. Background genes were restricted to those with average expression > 1 transcript per million in at least one tissue. We visualized the tissue-wide expression patterns of prioritized genes using a heatmap displaying log2-transformed average expression values across tissue labels.
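The hypergeometric overrepresentation test used for both the gene set and DEG set analyses can be written directly from binomial coefficients. The sketch below is a standard-library illustration with a hypothetical function name; it is not FUMA's GENE2FUNC code.

```python
from math import comb


def hypergeom_upper_tail(overlap, n_background, n_in_set, n_prioritized):
    """P(X >= overlap) when n_prioritized genes are drawn without replacement from
    n_background genes, of which n_in_set belong to the tested gene set."""
    upper = min(n_in_set, n_prioritized)
    favorable = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_prioritized - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(n_background, n_prioritized)
```

In this setting, `n_background` would be the number of expressed background genes, `n_in_set` the size of a tissue-specific DEG set, `n_prioritized` the number of prioritized effector genes, and `overlap` the number of prioritized genes found in that DEG set.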
UKB cohort
UKB is a large-scale, prospective cohort comprising more than 500,000 participants aged 40 to 69 years, recruited from 22 centers across the United Kingdom between 2006 and 2010 (85). Genotyping was performed using either the Applied Biosystems UK BiLEVE Axiom Array or the closely related UKB Axiom Array (86). Following stringent quality control, genotype data were imputed using the Haplotype Reference Consortium, UK10K, and 1000 Genomes Phase 3 reference panels. We excluded participants based on the following criteria: (i) low genotype quality or missing data, (ii) discordant genetic and reported sex, (iii) non-white ethnicity, (iv) relatedness, and (v) participation in the UKB imaging substudy used in the liver PDFF GWAS, to avoid overlap with the discovery cohort. After exclusions, 373,835 participants remained for downstream PRS analyses. The UKB has approval from the North West Multi-Centre Research Ethics Committee (ref: 11/NW/0382) and obtained written informed consent from all participants before the study.
MASLD case ascertainment
As the MASLD phenotype is not directly available in the UKB, we defined MASLD status based on a multisociety Delphi consensus statement (4). We excluded individuals reporting alcohol intake > 20 g/day (women) or > 30 g/day (men) (4). Daily alcohol intake (in grams of pure alcohol) was calculated on the basis of baseline questionnaire responses, following the approach used in our previous study (87). We also excluded individuals with chronic viral hepatitis (ICD-10 B18.0 to B18.2 and B18.8 to B18.9) based on hospital records, as well as those with Mendelian conditions including Wilson disease, hereditary fructose intolerance, familial partial lipodystrophy, glycogen storage disorders, abetalipoproteinemia, and α1-antitrypsin deficiency. Hepatic steatosis was assessed using the FLI (36), a widely validated noninvasive measure (88). An FLI score ≥ 60 was used as the threshold for likely steatosis. MASLD was defined by the presence of (i) hepatic steatosis, (ii) ≥1 CMRF, and (iii) nonexcessive alcohol consumption. On the basis of these criteria, a total of 51,754 individuals in the UKB cohort was identified as MASLD cases.
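The case definition can be illustrated with a short sketch. It assumes the commonly published FLI coefficients, with triglycerides in mg/dl, γ-glutamyltransferase (GGT) in U/liter, and waist circumference in centimeters; values recorded in mmol/liter would need converting first. The function names are hypothetical, and the exclusions for viral hepatitis and Mendelian conditions are not modeled.

```python
from math import exp, log


def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, waist_cm):
    """Fatty liver index on a 0-100 scale (assumed published coefficients)."""
    linear = (0.953 * log(tg_mg_dl) + 0.139 * bmi
              + 0.718 * log(ggt_u_l) + 0.053 * waist_cm - 15.745)
    return 100.0 * exp(linear) / (1.0 + exp(linear))


def is_masld(fli, n_cmrf, alcohol_g_day, sex):
    """Steatosis (FLI >= 60), at least one CMRF, and nonexcessive alcohol intake."""
    alcohol_limit = 20.0 if sex == "female" else 30.0
    return fli >= 60.0 and n_cmrf >= 1 and alcohol_g_day <= alcohol_limit


# Hypothetical participant: TG 200 mg/dl, BMI 32, GGT 50 U/liter, waist 105 cm
fli = fatty_liver_index(200, 32, 50, 105)
case = is_masld(fli, n_cmrf=2, alcohol_g_day=10, sex="female")
```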
To characterize fibrosis risk in MASLD, we computed three widely used noninvasive scores: the FIB-4 (37), APRI (38), and the Forns score (39). Participants with incomplete laboratory data were excluded. To limit outlier influence, biomarker values > 3 SD from the mean were winsorized to that threshold rather than removed. Advanced-fibrosis risk was classified as low, intermediate, or high using established cutoffs: FIB-4 (1.30, 2.67), APRI (0.5, 1.5), and Forns (4.2, 6.9) (88).
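A sketch of the scoring, winsorization, and risk classification steps is given below for FIB-4 and APRI only. It assumes the standard definitions of both scores, platelet counts in 10^9/liter, and an aspartate aminotransferase (AST) upper limit of normal of 40 U/liter; that limit, the handling of values exactly at a cutoff, and all function names are assumptions of this sketch.

```python
from math import sqrt


def fib4(age_years, ast_u_l, alt_u_l, platelets_e9_l):
    """FIB-4 = age x AST / (platelets x sqrt(ALT))."""
    return age_years * ast_u_l / (platelets_e9_l * sqrt(alt_u_l))


def apri(ast_u_l, platelets_e9_l, ast_uln=40.0):
    """APRI = (AST / upper limit of normal) x 100 / platelets."""
    return (ast_u_l / ast_uln) * 100.0 / platelets_e9_l


def classify(score, low_cutoff, high_cutoff):
    """Below the lower cutoff is low risk, above the upper cutoff is high risk."""
    if score < low_cutoff:
        return "low"
    if score > high_cutoff:
        return "high"
    return "intermediate"


def winsorize(values, n_sd=3.0):
    """Clamp values lying more than n_sd sample SDs from the mean to that threshold."""
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [min(max(v, lower), upper) for v in values]
```

For example, `classify(fib4(60, 40, 25, 200), 1.30, 2.67)` places a hypothetical 60-year-old with AST 40 U/liter, ALT 25 U/liter, and platelets 200 × 10^9/liter in the intermediate-risk group.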
PRS analysis
We constructed an MASLD PRS using the clumping and thresholding (C + T) approach (89). Independent genome-wide significant SNPs were first selected from the MASLD GWAS through LD-based clumping. The PRS was calculated as the weighted sum of risk alleles, with weights derived from GWAS effect sizes. Association between PRS and MASLD status was assessed by logistic regression across PRS deciles, using the lowest decile as reference. All models adjusted for age, sex, and the top 10 genetic principal components. To evaluate predictive performance, we compared two logistic regression models: a baseline model including age, sex, and principal components and a full model including PRS. Incremental coefficient of determination (R2) was computed as the difference in Nagelkerke’s pseudo-R2 between the two models, with 95% CIs estimated via 1000 bootstrap replicates.
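The scoring and decile assignment can be sketched as below. The function names, the dictionary-based data layout, and the treatment of missing genotypes as zero dosage are illustrative assumptions; in practice, dedicated tools handle allele alignment and missingness.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages.

    dosages: dict SNP -> dosage of the risk allele (0 to 2) for one individual.
    weights: dict SNP -> GWAS effect size for the same allele.
    """
    return sum(beta * dosages.get(snp, 0.0) for snp, beta in weights.items())


def decile_labels(scores):
    """Assign each score a decile from 1 (lowest) to 10 (highest) by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 10 // len(scores) + 1
    return labels
```

Individuals labeled 1 would form the reference group in the decile-based logistic regression described above.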
PheWAS of MASLD
We conducted a PheWAS to assess the associations between the MASLD PRS and a wide range of clinical outcomes in the European-ancestry UKB cohort (n = 373,835). Health outcomes were derived from inpatient hospital records, death registries, and primary care data, harmonized into phecodes. To ensure sufficient statistical power, we excluded outcomes with fewer than 20 cases, resulting in 1466 phenotypes included in the primary analysis. For each phenotype, we fit logistic regression models with the MASLD PRS as the independent variable, adjusting for age, sex, and the top 10 genetic principal components. To account for multiple testing, we applied the Benjamini-Hochberg FDR procedure, with an FDR-adjusted P < 0.05 considered statistically significant. To evaluate the generalizability of findings across ancestries, we additionally performed PRS-PheWAS in the non-white UKB cohort (n = 78,248), encompassing 1250 phenotypes.
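For reference, the Benjamini-Hochberg adjustment applied across phenotypes can be sketched in plain Python as follows. The function name is hypothetical, and the code is a generic implementation of the step-up procedure rather than the software used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A phenotype would then be declared significant when its adjusted value falls below 0.05.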
Two-step proteome-wide MR analyses
Step 1: MASLD as exposure, plasma proteins as outcomes
To identify circulating proteins potentially regulated by MASLD, we conducted two-sample MR using MASLD as the exposure and plasma protein levels as outcomes. Genetic instruments were derived from our MASLD GWAS, selecting genome-wide significant and independent SNPs. Protein GWAS summary statistics were obtained from the deCODE study, which measured 4907 plasma proteins in 35,559 individuals of European ancestry using the SomaScan v4 platform (SomaLogic) (45). Proteins were defined as MASLD driven if they met all the following criteria: (i) IVW MR P < 1.02 × 10−5 (Bonferroni correction for 4907 tests); (ii) consistent effect direction across IVW, MR-Egger, weighted median, and weighted mode methods; (iii) all instruments had F-statistics > 10; and (iv) no evidence of directional horizontal pleiotropy (MR-Egger intercept P > 0.05). Applying these stringent filters, we identified 514 proteins as putative downstream consequences of MASLD.
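Two of the quantities in these criteria, the instrument F-statistic and the IVW estimate, can be sketched as follows. The sketch uses the fixed-effect IVW estimator with first-order weights and the per-SNP approximation F = (β/SE)^2; these choices and the function names are assumptions and may differ from the defaults of the MR software used in the study.

```python
from math import sqrt


def f_statistic(beta_exposure, se_exposure):
    """Approximate per-instrument F-statistic from the SNP-exposure association."""
    return (beta_exposure / se_exposure) ** 2


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate and its standard error.

    Each argument is a list with one entry per instrument.
    """
    weights = [bx * bx / (sy * sy) for bx, sy in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    return estimate, sqrt(1.0 / total_weight)


# Bonferroni threshold for 4907 proteins, approximately 1.02e-5
bonferroni_p = 0.05 / 4907
```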
Step 2: Plasma proteins as exposure, disease phenotypes as outcomes
To assess the downstream effects of MASLD-driven proteins, we conducted a forward two-sample MR analysis linking plasma proteins to MASLD-associated clinical outcomes. We first curated a high-quality cis-pQTL database by harmonizing publicly available summary statistics from seven large-scale proteogenomic studies (45, 46, 90–95). For each study, we selected cis-pQTLs (within ±1 Mb of the transcription start site) with the largest F-statistics to ensure strong instruments and reduce weak instrument bias.
To exclude proteins that might causally influence MASLD (i.e., reverse causation), we first performed reverse MR using cis-pQTLs as instruments and MASLD as the outcome. Proteins with significant evidence of reverse causality (IVW or Wald ratio P < 2.22 × 10−4 after Bonferroni correction) were excluded (n = 4). Subsequently, we conducted a forward two-sample MR analysis using cis-pQTLs for the remaining 227 proteins as exposures and 58 MASLD-associated clinical outcomes (identified via PheWAS) as outcomes. Outcome GWAS summary statistics were sourced from the VA Million Veteran Program (MVP) (96) or FinnGen R10 (97), where available. For each protein-phenotype pair, IVW (or Wald ratio if only one SNP was available) was used as the primary estimator, and Bonferroni correction was applied on the basis of the number of proteins tested per outcome cohort.
To ensure robustness, we retained only results with consistent effect directions across all MR methods (IVW, MR-Egger, weighted median, and weighted mode). Instrumental variables associated with multiple proteins in cis were excluded to minimize pleiotropic bias. For models with ≥3 instruments, horizontal pleiotropy was assessed using the MR-Egger intercept test. In total, 82 protein-phenotype pairs passed all criteria for statistical significance, robustness, and pleiotropy control. The MR analyses were performed using the TwoSampleMR R package (v.0.6.10).
Directional concordance check
To further validate the biological relevance of the 82 significant protein-phenotype associations, we assessed directional concordance across the MASLD-protein-phenotype axis. Specifically, we evaluated whether (i) the effect of MASLD liability on protein levels (from step 1 MR) and (ii) the effect of protein levels on clinical outcomes (from step 2 MR) jointly predicted the direction of the MASLD-phenotype association observed in the PheWAS.
Directional concordance was defined as cases where the product of the two MR-derived effect estimates aligned with the direction of the corresponding PRS-phenotype association. This consistency supports a coherent biological cascade in which MASLD alters protein expression, subsequently contributing to disease development. A total of 47 protein-phenotype pairs (involving 25 unique proteins) satisfied this criterion, indicating potential mechanistic relevance.
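The concordance rule amounts to a sign comparison, sketched here with hypothetical function and argument names. Treating a zero product as non-concordant is an assumption of this sketch.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def is_concordant(beta_masld_to_protein, beta_protein_to_outcome, beta_prs_to_outcome):
    """True when the product of the two MR estimates has the same sign as the PRS association."""
    implied = beta_masld_to_protein * beta_protein_to_outcome
    return sign(implied) != 0 and sign(implied) == sign(beta_prs_to_outcome)
```

For instance, a protein that MASLD lowers (negative step 1 estimate) and that protects against an outcome (negative step 2 estimate) implies a positive MASLD-outcome effect, which is concordant with a positive PRS association.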
Validation in the UKB cohort
To evaluate the clinical relevance of MR-identified protein-phenotype associations, we analyzed 52,633 UKB participants with baseline plasma protein profiles measured using the Olink Explore 3072 platform. Among the 47 genetically predicted protein-phenotype pairs, 27 (involving 16 unique proteins) were available for validation. Incident disease outcomes were defined using ICD-10 codes; prevalent cases and participants with missing event dates were excluded.
We applied multivariable Cox proportional hazard models, adjusting for age, sex, ethnicity, household income, smoking, alcohol intake, BMI, and the top 10 genetic principal components. Associations were considered validated if the hazard ratio was statistically significant (P < 0.05) and directionally consistent with step 2 MR estimates. Thirteen protein-phenotype associations were validated, involving nine outcomes and seven unique proteins. Kaplan-Meier survival curves were plotted by quartiles of baseline protein levels, and log-rank tests were applied to assess statistical significance.
Mediation analysis
To quantify the extent to which these proteins mediate the relationship between MASLD and clinical outcomes, we performed mediation analysis for the 13 validated pairs using a product-of-coefficients approach implemented in the mediation R package (v.4.5.0). Linear regression was used for the MASLD-protein association and logistic regression for protein-outcome associations, adjusting for the same covariates. The indirect effects and proportion mediated were estimated using 1000 bootstrap samples.
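The product-of-coefficients idea can be summarized in a few lines. This is a deliberately simplified sketch with hypothetical names: it treats the exposure-mediator coefficient a, the mediator-outcome coefficient b, and the direct effect as being on compatible scales, which is only approximate when the outcome model is logistic, and it omits the bootstrap used to obtain confidence intervals.

```python
def indirect_effect(a, b):
    """Indirect (mediated) effect as the product of the two path coefficients."""
    return a * b


def proportion_mediated(a, b, direct_effect):
    """Share of the total effect (indirect + direct) that passes through the mediator."""
    indirect = indirect_effect(a, b)
    total = indirect + direct_effect
    if total == 0:
        raise ValueError("total effect is zero; proportion mediated is undefined")
    return indirect / total
```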
Integrative colocalization and MR analysis for gene-protein pairs
To determine whether MASLD-prioritized loci causally influence circulating proteins identified by proteome-wide MR, we performed an analysis that jointly applied colocalization and MR. We first evaluated whether MASLD-associated loci and the 25 candidate proteins from proteome-wide MR are driven by the same causal variant. Using the coloc R package (v.5.2.3), which estimates the posterior probability that two traits share an underlying genetic signal, we analyzed ±250-kb windows centered on MASLD lead SNPs with default priors (p1 = 1 × 10−4 and p2 = 1 × 10−4 for trait associations; p12 = 1 × 10−5 for a shared association). A PP.H4 > 0.80 was considered strong evidence of colocalization. For such loci, we identified MASLD signals colocalizing with multiple proteins and depicted regional patterns using LocusZoomr (v.0.3.8).
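To make the role of the priors and of PP.H4 concrete, the sketch below computes the five colocalization posteriors from per-SNP log approximate Bayes factors under a single-causal-variant assumption. It follows the general structure of the approximate Bayes factor method but is not the coloc package itself; the function names, the Wakefield prior SD of 0.15, and the toy effect sizes in the usage note are assumptions.

```python
from math import exp, expm1, log


def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + log(sum(exp(x - m) for x in xs))


def log_diff(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    gap = -expm1(b - a)  # equals 1 - exp(b - a)
    return a + log(gap) if gap > 0.0 else float("-inf")


def wakefield_labf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP (Wakefield approximation)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for two traits over the same SNPs."""
    joint = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(joint)
    log_h = [
        0.0,                                          # H0: no association
        log(p1) + s1,                                 # H1: trait 1 only
        log(p2) + s2,                                 # H2: trait 2 only
        log(p1) + log(p2) + log_diff(s1 + s2, s12),   # H3: two distinct variants
        log(p12) + s12,                               # H4: one shared variant
    ]
    denom = log_sum(log_h)
    return [exp(h - denom) for h in log_h]
```

With three toy SNPs and a standard error of 0.02, giving both traits an effect of 0.12 at the same SNP yields a PP.H4 close to 1, whereas placing the two effects at different SNPs shifts nearly all posterior mass to H3.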
We next evaluated regulatory effects of MASLD-prioritized gene expression on candidate plasma proteins using two-sample MR, treating liver eQTLs as exposures and plasma-protein GWAS as outcomes. Liver eQTL summary data were obtained from Broadway et al. (n = 1183) (98). We derived independent instruments per gene (LD r2 < 0.01; association P < 1 × 10−5; F-statistic > 10) and estimated causal effects of gene expression on circulating abundance for the 25 proteins nominated by proteome-wide MR. Associations were deemed nominally significant at P < 0.05 in light of the moderate eQTL sample size.
Acknowledgments
Funding:
This work was funded by the Noncommunicable Chronic Diseases-National Science and Technology Major Project grant 2023ZD0510000 (X.C. and C.S.), the National Key Research and Development Program of China grant 2022YFC3400700 (Z.L.), the National Natural Science Foundation of China grants 82204125 and 82473700 (Z.L. and X.C.), and the Shanghai Rising-star Program grant 24QA2701000 (Z.L.).
Author contributions:
Conceptualization: Z.L., M.D., X.C., and H.Y. Methodology: M.D., Z.L., and T.W. Formal analysis: M.D. and Z.L. Investigation: M.D., Z.L., and X.C. Resources: M.D., H.Y., and C.S. Data curation: C.S. Writing—original draft: M.D. and Z.L. Writing—review and editing: Z.L., Y.J., X.C., T.Z., H.Y., and L.J. Visualization: M.D. and Z.L. Supervision: Z.L., L.J., T.Z., and X.C. Project administration: Z.L. and X.C. Funding acquisition: Z.L., X.C., and C.S.
Competing interests:
The authors declare that they have no competing interests.
Data and materials availability:
The MASLD GWAS summary statistics were generated using a cross-trait approach integrating MRI-PDFF and cardiometabolic risk factors (see “Construction of MASLD GWAS using MTAG” in Materials and Methods) and are available via the NHGRI-EBI GWAS Catalog (GCP ID: GCP001525; accession: GCST90728570). All data and code needed to evaluate and reproduce the results in the paper are present in the paper and/or the Supplementary Materials. No new physical materials were generated. The individual-level phenotypic and genetic data for UKB (www.ukbiobank.ac.uk/) can be accessed through application. The UKB received ethical approval from the research ethics committee (REC reference for UKB 11/NW/0382), and participants provided written informed consent. The GWAS summary statistics from the following consortia and biobanks are publicly available at the corresponding URLs: NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas/), GIANT (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files), GLGC (https://csg.sph.umich.edu/willer/public/glgc-lipids2021/), MAGIC (https://magicinvestigators.org/), and FinnGen (www.finngen.fi/en/access_results). The GWAS summary statistics of the MVP cohort were obtained from the GWAS Catalog (www.ebi.ac.uk/gwas/publications/39024449). Plasma proteome GWAS summary statistics were obtained from deCODE (www.deCODE.com/summarydata/; https://doi.org/10.1038/s41588-021-00978-w). Files required to run partitioned heritability with S-LDSC can be found at https://zenodo.org/records/7768714.
Supplementary Materials
The PDF file includes:
Figs. S1 to S18
Legends for tables S1 to S37
References
Other Supplementary Material for this manuscript includes the following:
Tables S1 to S37
REFERENCES
- 1. Targher G., Byrne C. D., Tilg H., MASLD: A systemic metabolic disorder with cardiovascular and malignant complications. Gut 73, 691–702 (2024).
- 2. Liu Z., Huang J., Dai L., Yuan H., Jiang Y., Suo C., Jin L., Zhang T., Chen X., Steatotic liver disease prevalence in China: A population-based study and meta-analysis of 17.4 million individuals. Aliment. Pharmacol. Ther. 61, 1110–1122 (2025).
- 3. Lee B. P., Dodge J. L., Terrault N. A., National prevalence estimates for steatotic liver disease and subclassifications using consensus nomenclature. Hepatology 79, 666–673 (2024).
- 4. Rinella M. E., Lazarus J. V., Ratziu V., Francque S. M., Sanyal A. J., Kanwal F., Romero D., Abdelmalek M. F., Anstee Q. M., Arab J. P., Arrese M., Bataller R., Beuers U., Boursier J., Bugianesi E., Byrne C. D., Castro Narro G. E., Chowdhury A., Cortez-Pinto H., Cryer D. R., Cusi K., El-Kassas M., Klein S., Eskridge W., Fan J., Gawrieh S., Guy C. D., Harrison S. A., Kim S. U., Koot B. G., Korenjak M., Kowdley K. V., Lacaille F., Loomba R., Mitchell-Thain R., Morgan T. R., Powell E. E., Roden M., Romero-Gómez M., Silva M., Singh S. P., Sookoian S. C., Spearman C. W., Tiniakos D., Valenti L., Vos M. B., Wong V. W., Xanthakos S., Yilmaz Y., Younossi Z., Hobbs A., Villota-Rivas M., Newsome P. N., A multisociety Delphi consensus statement on new fatty liver disease nomenclature. Hepatology 78, 1966–1986 (2023).
- 5. Byrne C. D., Armandi A., Pellegrinelli V., Vidal-Puig A., Bugianesi E., Metabolic dysfunction-associated steatotic liver disease: A condition of heterogeneous metabolic risk factors, mechanisms and comorbidities requiring holistic treatment. Nat. Rev. Gastroenterol. Hepatol. 22, 314–328 (2025).
- 6. Valenti L. V. C., Moretti V., Implications of the evolving knowledge of the genetic architecture of MASLD. Nat. Rev. Gastroenterol. Hepatol. 21, 5–6 (2024).
- 7. Sookoian S., Rotman Y., Valenti L., Genetics of metabolic dysfunction-associated steatotic liver disease: The state of the art update. Clin. Gastroenterol. Hepatol. 22, 2177–2187.e3 (2024).
- 8. Sveinbjornsson G., Ulfarsson M. O., Thorolfsdottir R. B., Jonsson B. A., Einarsson E., Gunnlaugsson G., Rognvaldsson S., Arnar D. O., Baldvinsson M., Bjarnason R. G., Eiriksdottir T., Erikstrup C., Ferkingstad E., Halldorsson G. H., Helgason H., Helgadottir A., Hindhede L., Hjorleifsson G., Jones D., Knowlton K. U., Lund S. H., Melsted P., Norland K., Olafsson I., Olafsson S., Oskarsson G. R., Ostrowski S. R., Pedersen O. B., Snaebjarnarson A. S., Sigurdsson E., Steinthorsdottir V., Schwinn M., Thorgeirsson G., Thorleifsson G., Jonsdottir I., Bundgaard H., Nadauld L., Bjornsson E. S., Rulifson I. C., Rafnar T., Norddahl G. L., Thorsteinsdottir U., Sulem P., Gudbjartsson D. F., Holm H., Stefansson K., Multiomics study of nonalcoholic fatty liver disease. Nat. Genet. 54, 1652–1663 (2022).
- 9. Chen Y., Du X., Kuppa A., Feitosa M. F., Bielak L. F., O’Connell J. R., Musani S. K., Guo X., Kahali B., Chen V. L., Smith A. V., Ryan K. A., Eirksdottir G., Allison M. A., Bowden D. W., Budoff M. J., Carr J. J., Chen Y. I., Taylor K. D., Oliveri A., Correa A., Crudup B. F., Kardia S. L. R., Mosley T. H. Jr., Norris J. M., Terry J. G., Rotter J. I., Wagenknecht L. E., Halligan B. D., Young K. A., Hokanson J. E., Washko G. R., Gudnason V., Province M. A., Peyser P. A., Palmer N. D., Speliotes E. K., Genome-wide association meta-analysis identifies 17 loci associated with nonalcoholic fatty liver disease. Nat. Genet. 55, 1640–1650 (2023).
- 10.Li Y., van den Berg E. H., Kurilshikov A., Zhernakova D. V., Gacesa R., Hu S., Lopera-Maya E. A., Zhernakova A., de Meijer V. E., Sanna S., Dullaart R. P. F., Blokzijl H., Festen E. A. M., Fu J., Weersma R. K., Genome-wide studies reveal genetic risk factors for hepatic fat content. Genomics Proteomics Bioinformatics 22, qzae031 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Ghodsian N., Abner E., Emdin C. A., Gobeil É., Taba N., Haas M. E., Perrot N., Manikpurage H. D., Gagnon É., Bourgault J., St-Amand A., Couture C., Mitchell P. L., Bossé Y., Mathieu P., Vohl M. C., Tchernof A., Thériault S., Khera A. V., Esko T., Arsenault B. J., Electronic health record-based genome-wide meta-analysis provides insights on the genetic architecture of non-alcoholic fatty liver disease. Cell Rep. Med. 2, 100437 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Salomone F., Pipitone R. M., Longo M., Malvestiti F., Amorini A. M., Distefano A., Casirati E., Ciociola E., Iraci N., Leggio L., Zito R., Vicario N., Saoca C., Pennisi G., Cabibi D., Lazzarino G., Fracanzani A. L., Dongiovanni P., Valenti L., Petta S., Volti G. L., Grimaudo S., SIRT5 rs12216101 T>G variant is associated with liver damage and mitochondrial dysfunction in patients with non-alcoholic fatty liver disease. J. Hepatol. 80, 10–19 (2024). [DOI] [PubMed] [Google Scholar]
- 13.Lindén D., Romeo S., Therapeutic opportunities for the treatment of NASH with genetically validated targets. J. Hepatol. 79, 1056–1064 (2023). [DOI] [PubMed] [Google Scholar]
- 14.Verschuren L., Mak A. L., van Koppen A., Özsezen S., Difrancesco S., Caspers M. P. M., Snabel J., van der Meer D., van Dijk A. M., Rashu E. B., Nabilou P., Werge M. P., van Son K., Kleemann R., Kiliaan A. J., Hazebroek E. J., Boonstra A., Brouwer W. P., Doukas M., Gupta S., Kluft C., Nieuwdorp M., Verheij J., Gluud L. L., Holleboom A. G., Tushuizen M. E., Hanemaaijer R., Development of a novel non-invasive biomarker panel for hepatic fibrosis in MASLD. Nat. Commun. 15, 4564 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Jamialahmadi O., De Vincentis A., Tavaglione F., Malvestiti F., Li-Gao R., Mancina R. M., Alvarez M., Gelev K., Maurotti S., Vespasiani-Gentilucci U., Rosendaal F. R., Kozlitina J., Pajukanta P., Pattou F., Valenti L., Romeo S., Partitioned polygenic risk scores identify distinct types of metabolic dysfunction-associated steatotic liver disease. Nat. Med. 30, 3614–3623 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Chen V. L., Brady G. F., Recent advances in MASLD genetics: Insights into disease mechanisms and the next frontiers in clinical application. Hepatol. Commun. 9, e0618 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Hutchison A. L., Tavaglione F., Romeo S., Charlton M., Endocrine aspects of metabolic dysfunction-associated steatotic liver disease (MASLD): Beyond insulin resistance. J. Hepatol. 79, 1524–1541 (2023). [DOI] [PubMed] [Google Scholar]
- 18.Mantovani A., Byrne C. D., Bonora E., Targher G., Nonalcoholic fatty liver disease and risk of incident type 2 diabetes: A meta-analysis. Diabetes Care 41, 372–382 (2018). [DOI] [PubMed] [Google Scholar]
- 19.Mantovani A., Petracca G., Beatrice G., Csermely A., Lonardo A., Schattenberg J. M., Tilg H., Byrne C. D., Targher G., Non-alcoholic fatty liver disease and risk of incident chronic kidney disease: An updated meta-analysis. Gut 71, 156–162 (2022). [DOI] [PubMed] [Google Scholar]
- 20.Abozaid Y. J., Ayada I., van Kleef L. A., Vallerga C. L., Pan Q., Brouwer W. P., Ikram M. A., Van Meurs J., de Knegt R. J., Ghanbari M., Plasma proteomic signature of fatty liver disease: The Rotterdam study. Hepatology 78, 284–294 (2023). [DOI] [PubMed] [Google Scholar]
- 21.Mia S., Siokatas G., Sidiropoulou R., Hoffman M., Fragkiadakis K., Markopoulou E., Elesawy M. I., Roy R., Blair S., Kuwabara Y., Rapushi E., Chaudhuri D., Makarewich C. A., Gao E., Koch W. J., Schilling J. D., Molkentin J. D., Marketou M., Drosatos K., Hepato-cardiac interorgan communication controls cardiac hypertrophy via combined endocrine-autocrine FGF21 signaling. Cell Rep. Med. 6, 102125 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Watt M. J., Miotto P. M., De Nardo W., Montgomery M. K., The liver as an endocrine organ-linking NAFLD and insulin resistance. Endocr. Rev. 40, 1367–1393 (2019). [DOI] [PubMed] [Google Scholar]
- 23.D’Erasmo L., Di Martino M., Neufeld T., Fraum T. J., Kang C. J., Burks K. H., Di Costanzo A., Minicocci I., Bini S., Maranghi M., Pigna G., Labbadia G., Zheng J., Fierro D., Montali A., Ceci F., Catalano C., Davidson N. O., Lucisano G., Nicolucci A., Arca M., Stitziel N. O., ANGPTL3 deficiency and risk of hepatic steatosis. Circulation 148, 1479–1489 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Liu Y., Basty N., Whitcher B., Bell J. D., Sorokin E. P., van Bruggen N., Thomas E. L., Cule M., Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning. eLife 10, e65554 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Dennis E. L., Jahanshad N., Braskie M. N., Warstadt N. M., Hibar D. P., Kohannim O., Nir T. M., McMahon K. L., de Zubicaray G. I., Montgomery G. W., Martin N. G., Toga A. W., Wright M. J., Thompson P. M., Obesity gene NEGR1 associated with white matter integrity in healthy young adults. Neuroimage 102, 548–557 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Elks C. E., Loos R. J., Sharp S. J., Langenberg C., Ring S. M., Timpson N. J., Ness A. R., Davey Smith G., Dunger D. B., Wareham N. J., Ong K. K., Genetic markers of adult obesity risk are associated with greater early infancy weight gain and growth. PLoS Med. 7, e1000284 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Frayling T. M., Timpson N. J., Weedon M. N., Zeggini E., Freathy R. M., Lindgren C. M., Perry J. R., Elliott K. S., Lango H., Rayner N. W., Shields B., Harries L. W., Barrett J. C., Ellard S., Groves C. J., Knight B., Patch A. M., Ness A. R., Ebrahim S., Lawlor D. A., Ring S. M., Ben-Shlomo Y., Jarvelin M. R., Sovio U., Bennett A. J., Melzer D., Ferrucci L., Loos R. J., Barroso I., Wareham N. J., Karpe F., Owen K. R., Cardon L. R., Walker M., Hitman G. A., Palmer C. N., Doney A. S., Morris A. D., Smith G. D., Hattersley A. T., McCarthy M. I., A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Thorleifsson G., Walters G. B., Gudbjartsson D. F., Steinthorsdottir V., Sulem P., Helgadottir A., Styrkarsdottir U., Gretarsdottir S., Thorlacius S., Jonsdottir I., Jonsdottir T., Olafsdottir E. J., Olafsdottir G. H., Jonsson T., Jonsson F., Borch-Johnsen K., Hansen T., Andersen G., Jorgensen T., Lauritzen T., Aben K. K., Verbeek A. L., Roeleveld N., Kampman E., Yanek L. R., Becker L. C., Tryggvadottir L., Rafnar T., Becker D. M., Gulcher J., Kiemeney L. A., Pedersen O., Kong A., Thorsteinsdottir U., Stefansson K., Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat. Genet. 41, 18–24 (2009). [DOI] [PubMed] [Google Scholar]
- 29.Willer C. J., Speliotes E. K., Loos R. J., Li S., Lindgren C. M., Heid I. M., Berndt S. I., Elliott A. L., Jackson A. U., Lamina C., Lettre G., Lim N., Lyon H. N., McCarroll S. A., Papadakis K., Qi L., Randall J. C., Roccasecca R. M., Sanna S., Scheet P., Weedon M. N., Wheeler E., Zhao J. H., Jacobs L. C., Prokopenko I., Soranzo N., Tanaka T., Timpson N. J., Almgren P., Bennett A., Bergman R. N., Bingham S. A., Bonnycastle L. L., Brown M., Burtt N. P., Chines P., Coin L., Collins F. S., Connell J. M., Cooper C., Smith G. D., Dennison E. M., Deodhar P., Elliott P., Erdos M. R., Estrada K., Evans D. M., Gianniny L., Gieger C., Gillson C. J., Guiducci C., Hackett R., Hadley D., Hall A. S., Havulinna A. S., Hebebrand J., Hofman A., Isomaa B., Jacobs K. B., Johnson T., Jousilahti P., Jovanovic Z., Khaw K. T., Kraft P., Kuokkanen M., Kuusisto J., Laitinen J., Lakatta E. G., Luan J., Luben R. N., Mangino M., McArdle W. L., Meitinger T., Mulas A., Munroe P. B., Narisu N., Ness A. R., Northstone K., O’Rahilly S., Purmann C., Rees M. G., Ridderstråle M., Ring S. M., Rivadeneira F., Ruokonen A., Sandhu M. S., Saramies J., Scott L. J., Scuteri A., Silander K., Sims M. A., Song K., Stephens J., Stevens S., Stringham H. M., Tung Y. C., Valle T. T., Van Duijn C. M., Vimaleswaran K. S., Vollenweider P., Waeber G., Wallace C., Watanabe R. M., Waterworth D. M., Watkins N., Witteman J. C., Zeggini E., Zhai G., Zillikens M. C., Altshuler D., Caulfield M. J., Chanock S. J., Farooqi I. S., Ferrucci L., Guralnik J. M., Hattersley A. T., Hu F. B., Jarvelin M. R., Laakso M., Mooser V., Ong K. K., Ouwehand W. H., Salomaa V., Samani N. J., Spector T. D., Tuomi T., Tuomilehto J., Uda M., Uitterlinden A. G., Wareham N. J., Deloukas P., Frayling T. M., Groop L. C., Hayes R. B., Hunter D. J., Mohlke K. L., Peltonen L., Schlessinger D., Strachan D. P., Wichmann H. E., McCarthy M. I., Boehnke M., Barroso I., Abecasis G. R., Hirschhorn J. N., Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat. Genet. 41, 25–34 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Varbo A., Benn M., Tybjærg-Hansen A., Grande P., Nordestgaard B. G., TRIB1 and GCKR polymorphisms, lipid levels, and risk of ischemic heart disease in the general population. Arterioscler. Thromb. Vasc. Biol. 31, 451–457 (2011). [DOI] [PubMed] [Google Scholar]
- 31.Khalil Y. A., Rabès J. P., Boileau C., Varret M., APOE gene variants in primary dyslipidemia. Atherosclerosis 328, 11–22 (2021). [DOI] [PubMed] [Google Scholar]
- 32.Cupido A. J., Reeskamp L. F., Hingorani A. D., Finan C., Asselbergs F. W., Hovingh G. K., Schmidt A. F., Joint genetic inhibition of PCSK9 and CETP and the association with coronary artery disease: A factorial mendelian randomization study. JAMA Cardiol. 7, 955–964 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Li Y., He P. P., Zhang D. W., Zheng X. L., Cayabyab F. S., Yin W. D., Tang C. K., Lipoprotein lipase: From gene to atherosclerosis. Atherosclerosis 237, 597–608 (2014). [DOI] [PubMed] [Google Scholar]
- 34.Haas J. T., Francque S., Staels B., Pathophysiology and mechanisms of nonalcoholic fatty liver disease. Annu. Rev. Physiol. 78, 181–205 (2016). [DOI] [PubMed] [Google Scholar]
- 35.Crudele L., De Matteis C., Novielli F., Di Buduo E., Petruzzelli S., De Giorgi A., Antonica G., Berardi E., Moschetta A., Fatty liver index (FLI) is the best score to predict MASLD with 50% lower cut-off value in women than in men. Biol. Sex Differ. 15, 43 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Bedogni G., Bellentani S., Miglioli L., Masutti F., Passalacqua M., Castiglione A., Tiribelli C., The fatty liver index: A simple and accurate predictor of hepatic steatosis in the general population. BMC Gastroenterol. 6, 33 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Sterling R. K., Lissen E., Clumeck N., Sola R., Correa M. C., Montaner J., Sulkowski M. S., Torriani F. J., Dieterich D. T., Thomas D. L., Messinger D., Nelson M., Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 43, 1317–1325 (2006). [DOI] [PubMed] [Google Scholar]
- 38.Wai C. T., Greenson J. K., Fontana R. J., Kalbfleisch J. D., Marrero J. A., Conjeevaram H. S., Lok A. S., A simple noninvasive index can predict both significant fibrosis and cirrhosis in patients with chronic hepatitis C. Hepatology 38, 518–526 (2003). [DOI] [PubMed] [Google Scholar]
- 39.Forns X., Ampurdanès S., Llovet J. M., Aponte J., Quintó L., Martínez-Bauer E., Bruguera M., Sánchez-Tapias J. M., Rodés J., Identification of chronic hepatitis C patients without hepatic fibrosis by a simple predictive model. Hepatology 36, 986–992 (2002). [DOI] [PubMed] [Google Scholar]
- 40.Liang Y., Chen H., Liu Y., Hou X., Wei L., Bao Y., Yang C., Zong G., Wu J., Jia W., Association of MAFLD with diabetes, chronic kidney disease, and cardiovascular disease: A 4.6-year cohort study in China. J. Clin. Endocrinol. Metab. 107, 88–97 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Qiu P., Du J., Zhang C., Li M., Li H., Chen C., Increased risk of reflux esophagitis in non-obese individuals with nonalcoholic fatty liver disease: A cross-sectional study. Ann. Med. 55, 2294933 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Lu Y., Zhang J., Li H., Li T., Association of non-alcoholic fatty liver disease with self-reported osteoarthritis among the US adults. Arthritis Res. Ther. 26, 40 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Huang H., Liu Z., Xie J., Xu C., NAFLD does not increase the risk of incident dementia: A prospective study and meta-analysis. J. Psychiatr. Res. 161, 435–440 (2023). [DOI] [PubMed] [Google Scholar]
- 44.Bao X., Kang L., Yin S., Engström G., Wang L., Xu W., Xu B., Zhang X., Zhang X., Association of MAFLD and MASLD with all-cause and cause-specific dementia: A prospective cohort study. Alzheimer’s Res. Ther. 16, 136 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Ferkingstad E., Sulem P., Atlason B. A., Sveinbjornsson G., Magnusson M. I., Styrmisdottir E. L., Gunnarsdottir K., Helgason A., Oddsson A., Halldorsson B. V., Jensson B. O., Zink F., Halldorsson G. H., Masson G., Arnadottir G. A., Katrinardottir H., Juliusson K., Magnusson M. K., Magnusson O. T., Fridriksdottir R., Saevarsdottir S., Gudjonsson S. A., Stacey S. N., Rognvaldsson S., Eiriksdottir T., Olafsdottir T. A., Steinthorsdottir V., Tragante V., Ulfarsson M. O., Stefansson H., Jonsdottir I., Holm H., Rafnar T., Melsted P., Saemundsdottir J., Norddahl G. L., Lund S. H., Gudbjartsson D. F., Thorsteinsdottir U., Stefansson K., Large-scale integration of the plasma proteome with genetics and disease. Nat. Genet. 53, 1712–1721 (2021). [DOI] [PubMed] [Google Scholar]
- 46.Sun B. B., Chiou J., Traylor M., Benner C., Hsu Y. H., Richardson T. G., Surendran P., Mahajan A., Robins C., Vasquez-Grinnell S. G., Hou L., Kvikstad E. M., Burren O. S., Davitte J., Ferber K. L., Gillies C. E., Hedman Å. K., Hu S., Lin T., Mikkilineni R., Pendergrass R. K., Pickering C., Prins B., Baird D., Chen C. Y., Ward L. D., Deaton A. M., Welsh S., Willis C. M., Lehner N., Arnold M., Wörheide M. A., Suhre K., Kastenmüller G., Sethi A., Cule M., Raj A., Burkitt-Gray L., Melamud E., Black M. H., Fauman E. B., Howson J. M. M., Kang H. M., McCarthy M. I., Nioi P., Petrovski S., Scott R. A., Smith E. N., Szalma S., Waterworth D. M., Mitnaul L. J., Szustakowski J. D., Gibson B. W., Miller M. R., Whelan C. D., Plasma proteomic associations with genetics and health in the UK Biobank. Nature 622, 329–338 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Zhang Y., Gao X., Bai X., Yao S., Chang Y. Z., Gao G., The emerging role of furin in neurodegenerative and neuropsychiatric diseases. Transl. Neurodegener. 11, 39 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Park S., Kim S., Kim B., Kim D. S., Kim J., Ahn Y., Kim H., Song M., Shim I., Jung S. H., Cho C., Lim S., Hong S., Jo H., Fahed A. C., Natarajan P., Ellinor P. T., Torkamani A., Park W. Y., Yu T. Y., Myung W., Won H. H., Multivariate genomic analysis of 5 million people elucidates the genetic architecture of shared components of the metabolic syndrome. Nat. Genet. 56, 2380–2391 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Karlsson Linnér R., Mallard T. T., Barr P. B., Sanchez-Roige S., Madole J. W., Driver M. N., Poore H. E., de Vlaming R., Grotzinger A. D., Tielbeek J. J., Johnson E. C., Liu M., Rosenthal S. B., Ideker T., Zhou H., Kember R. L., Pasman J. A., Verweij K. J. H., Liu D. J., Vrieze S., Kranzler H. R., Gelernter J., Harris K. M., Tucker-Drob E. M., Waldman I. D., Palmer A. A., Harden K. P., Koellinger P. D., Dick D. M., Multivariate analysis of 1.5 million people identifies genetic associations with traits related to self-regulation and addiction. Nat. Neurosci. 24, 1367–1376 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Duan Y., Yang Y., Zhao S., Bai Y., Yao W., Gao X., Yin J., Crosstalk in extrahepatic and hepatic system in NAFLD/NASH. Liver Int. 44, 1856–1871 (2024). [DOI] [PubMed] [Google Scholar]
- 51.Wu K. Y., Cao B., Chen W. B., Wu W., Zhao S., Min X. Y., Yang J., Han J., Dong X., Wang N., Wu Y., Garred P., Sacks S. H., Zhou W., Li K., Collectin 11 has a pivotal role in host defense against kidney and bladder infection in mice. Kidney Int. 105, 524–539 (2024). [DOI] [PubMed] [Google Scholar]
- 52.Nechanitzky R., Akbas D., Scherer S., Györy I., Hoyler T., Ramamoorthy S., Diefenbach A., Grosschedl R., Transcription factor EBF1 is essential for the maintenance of B cell identity and prevention of alternative fates in committed cells. Nat. Immunol. 14, 867–875 (2013). [DOI] [PubMed] [Google Scholar]
- 53.Sawada K., Chung H., Softic S., Moreno-Fernandez M. E., Divanovic S., The bidirectional immune crosstalk in metabolic dysfunction-associated steatotic liver disease. Cell Metab. 35, 1852–1871 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Xiao T., van Kleef L. A., Ikram M. K., de Knegt R. J., Ikram M. A., Association of nonalcoholic fatty liver disease and fibrosis with incident dementia and cognition: The Rotterdam study. Neurology 99, e565–e573 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Lind L., Mazidi M., Clarke R., Bennett D. A., Zheng R., Measured and genetically predicted protein levels and cardiovascular diseases in UK Biobank and China Kadoorie Biobank. Nat. Cardiovasc. Res. 3, 1189–1198 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Mishra A., Malik R., Hachiya T., Jürgenson T., Namba S., Posner D. C., Kamanu F. K., Koido M., Le Grand Q., Shi M., He Y., Georgakis M. K., Caro I., Krebs K., Liaw Y. C., Vaura F. C., Lin K., Winsvold B. S., Srinivasasainagendra V., Parodi L., Bae H. J., Chauhan G., Chong M. R., Tomppo L., Akinyemi R., Roshchupkin G. V., Habib N., Jee Y. H., Thomassen J. Q., Abedi V., Cárcel-Márquez J., Nygaard M., Leonard H. L., Yang C., Yonova-Doing E., Knol M. J., Lewis A. J., Judy R. L., Ago T., Amouyel P., Armstrong N. D., Bakker M. K., Bartz T. M., Bennett D. A., Bis J. C., Bordes C., Børte S., Cain A., Ridker P. M., Cho K., Chen Z., Cruchaga C., Cole J. W., de Jager P. L., de Cid R., Endres M., Ferreira L. E., Geerlings M. I., Gasca N. C., Gudnason V., Hata J., He J., Heath A. K., Ho Y. L., Havulinna A. S., Hopewell J. C., Hyacinth H. I., Inouye M., Jacob M. A., Jeon C. E., Jern C., Kamouchi M., Keene K. L., Kitazono T., Kittner S. J., Konuma T., Kumar A., Lacaze P., Launer L. J., Lee K. J., Lepik K., Li J., Li L., Manichaikul A., Markus H. S., Marston N. A., Meitinger T., Mitchell B. D., Montellano F. A., Morisaki T., Mosley T. H., Nalls M. A., Nordestgaard B. G., O’Donnell M. J., Okada Y., Onland-Moret N. C., Ovbiagele B., Peters A., Psaty B. M., Rich S. S., Rosand J., Sabatine M. S., Sacco R. L., Saleheen D., Sandset E. C., Salomaa V., Sargurupremraj M., Sasaki M., Satizabal C. L., Schmidt C. O., Shimizu A., Smith N. L., Sloane K. L., Sutoh Y., Sun Y. V., Tanno K., Tiedt S., Tatlisumak T., Torres-Aguila N. P., Tiwari H. K., Trégouët D. A., Trompet S., Tuladhar A. M., Tybjærg-Hansen A., van Vugt M., Vibo R., Verma S. S., Wiggins K. L., Wennberg P., Woo D., Wilson P. W. F., Xu H., Yang Q., Yoon K., Millwood I. Y., Gieger C., Ninomiya T., Grabe H. J., Jukema J. W., Rissanen I. L., Strbian D., Kim Y. J., Chen P. H., Mayerhofer E., Howson J. M. M., Irvin M. R., Adams H., Wassertheil-Smoller S., Christensen K., Ikram M. A., Rundek T., Worrall B. B., Lathrop G. M., Riaz M., Simonsick E. M., Kõrv J., França P. H. C., Zand R., Prasad K., Frikke-Schmidt R., de Leeuw F. E., Liman T., Haeusler K. G., Ruigrok Y. M., Heuschmann P. U., Longstreth W. T., Jung K. J., Bastarache L., Paré G., Damrauer S. M., Chasman D. I., Rotter J. I., Anderson C. D., Zwart J. A., Niiranen T. J., Fornage M., Liaw Y. P., Seshadri S., Fernández-Cadenas I., Walters R. G., Ruff C. T., Owolabi M. O., Huffman J. E., Milani L., Kamatani Y., Dichgans M., Debette S., Stroke genetics informs drug discovery and risk prediction across ancestries. Nature 611, 115–123 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Peterson R. E., Kuchenbaecker K., Walters R. K., Chen C. Y., Popejoy A. B., Periyasamy S., Lam M., Iyegbe C., Strawbridge R. J., Brick L., Carey C. E., Martin A. R., Meyers J. L., Su J., Chen J., Edwards A. C., Kalungi A., Koen N., Majara L., Schwarz E., Smoller J. W., Stahl E. A., Sullivan P. F., Vassos E., Mowry B., Prieto M. L., Cuellar-Barboza A., Bigdeli T. B., Edenberg H. J., Huang H., Duncan L. E., Genome-wide association studies in ancestrally diverse populations: Opportunities, methods, pitfalls, and recommendations. Cell 179, 589–603 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Kachuri L., Chatterjee N., Hirbo J., Schaid D. J., Martin I., Kullo I. J., Kenny E. E., Pasaniuc B., Auer P. L., Conomos M. P., Conti D. V., Ding Y., Wang Y., Zhang H., Zhang Y., Witte J. S., Ge T., Principles and methods for transferring polygenic risk scores across global populations. Nat. Rev. Genet. 25, 8–25 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Xia T., Du M., Li H., Wang Y., Zha J., Wu T., Ju S., Association between liver MRI proton density fat fraction and liver disease risk. Radiology 309, e231007 (2023). [DOI] [PubMed] [Google Scholar]
- 60.Auton A., Brooks L. D., Durbin R. M., Garrison E. P., Kang H. M., Korbel J. O., Marchini J. L., McCarthy S., McVean G. A., Abecasis G. R., A global reference for human genetic variation. Nature 526, 68–74 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Bulik-Sullivan B. K., Loh P. R., Finucane H. K., Ripke S., Yang J., Patterson N., Daly M. J., Price A. L., Neale B. M., LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Horn J. L., A rationale and test for the number of factors in factor analysis. Psychometrika 30, 179–185 (1965). [DOI] [PubMed] [Google Scholar]
- 63.Grotzinger A. D., Rhemtulla M., de Vlaming R., Ritchie S. J., Mallard T. T., Hill W. D., Ip H. F., Marioni R. E., McIntosh A. M., Deary I. J., Koellinger P. D., Harden K. P., Nivard M. G., Tucker-Drob E. M., Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Wolf M. G., McNeish D., dynamic: An R package for deriving dynamic fit index cutoffs for factor analysis. Multivariate Behav. Res. 58, 189–194 (2023). [DOI] [PubMed] [Google Scholar]
- 65.Mallard T. T., Linnér R. K., Grotzinger A. D., Sanchez-Roige S., Seidlitz J., Okbay A., de Vlaming R., Meddens S. F. W., Palmer A. A., Davis L. K., Tucker-Drob E. M., Kendler K. S., Keller M. C., Koellinger P. D., Harden K. P., Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities. Cell Genom. 2, 100140 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Turley P., Walters R. K., Maghzian O., Okbay A., Lee J. J., Fontana M. A., Nguyen-Viet T. A., Wedow R., Zacher M., Furlotte N. A., Magnusson P., Oskarsson S., Johannesson M., Visscher P. M., Laibson D., Cesarini D., Neale B. M., Benjamin D. J., Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M. A., Bender D., Maller J., Sklar P., de Bakker P. I., Daly M. J., Sham P. C., PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Watanabe K., Taskesen E., van Bochoven A., Posthuma D., Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Sollis E., Mosaku A., Abid A., Buniello A., Cerezo M., Gil L., Groza T., Güneş O., Hall P., Hayhurst J., Ibrahim A., Ji Y., John S., Lewis E., MacArthur J. A. L., McMahon A., Osumi-Sutherland D., Panoutsopoulou K., Pendlington Z., Ramachandran S., Stefancsik R., Stewart J., Whetzel P., Wilson R., Hindorff L., Cunningham F., Lambert S. A., Inouye M., Parkinson H., Harris L. W., The NHGRI-EBI GWAS Catalog: Knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Kanai M., Akiyama M., Takahashi A., Matoba N., Momozawa Y., Ikeda M., Iwata N., Ikegawa S., Hirata M., Matsuda K., Kubo M., Okada Y., Kamatani Y., Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. 50, 390–400 (2018). [DOI] [PubMed] [Google Scholar]
- 71.Ochoa D., Hercules A., Carmona M., Suveges D., Baker J., Malangone C., Lopez I., Miranda A., Cruz-Castillo C., Fumis L., Bernal-Llinares M., Tsukanov K., Cornu H., Tsirigos K., Razuvayevskaya O., Buniello A., Schwartzentruber J., Karim M., Ariano B., Martinez Osorio R. E., Ferrer J., Ge X., Machlitt-Northen S., Gonzalez-Uriarte A., Saha S., Tirunagari S., Mehta C., Roldán-Romero J. M., Horswell S., Young S., Ghoussaini M., Hulcoop D. G., Dunham I., McDonagh E. M., The next-generation open targets platform: Reimagined, redesigned, rebuilt. Nucleic Acids Res. 51, D1353–D1359 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.de Leeuw C. A., Mooij J. M., Heskes T., Posthuma D., MAGMA: Generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J. P., Tamayo P., The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Finucane H. K., Bulik-Sullivan B., Gusev A., Trynka G., Reshef Y., Loh P. R., Anttila V., Xu H., Zang C., Farh K., Ripke S., Day F. R., Purcell S., Stahl E., Lindstrom S., Perry J. R., Okada Y., Raychaudhuri S., Daly M. J., Patterson N., Neale B. M., Price A. L., Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Finucane H. K., Reshef Y. A., Anttila V., Slowikowski K., Gusev A., Byrnes A., Gazal S., Loh P. R., Lareau C., Shoresh N., Genovese G., Saunders A., Macosko E., Pollack S., Perry J. R. B., Buenrostro J. D., Bernstein B. E., Raychaudhuri S., McCarroll S., Neale B. M., Price A. L., Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Fehrmann R. S., Karjalainen J. M., Krajewska M., Westra H. J., Maloney D., Simeonov A., Pers T. H., Hirschhorn J. N., Jansen R. C., Schultes E. A., van Haagen H. H., de Vries E. G., te Meerman G. J., Wijmenga C., van Vugt M. A., Franke L., Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat. Genet. 47, 115–125 (2015). [DOI] [PubMed] [Google Scholar]
- 77.Pers T. H., Karjalainen J. M., Chan Y., Westra H. J., Wood A. R., Yang J., Lui J. C., Vedantam S., Gustafsson S., Esko T., Frayling T., Speliotes E. K., Boehnke M., Raychaudhuri S., Fehrmann R. S., Hirschhorn J. N., Franke L., Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Kundaje A., Meuleman W., Ernst J., Bilenky M., Yen A., Heravi-Moussavi A., Kheradpour P., Zhang Z., Wang J., Ziller M. J., Amin V., Whitaker J. W., Schultz M. D., Ward L. D., Sarkar A., Quon G., Sandstrom R. S., Eaton M. L., Wu Y. C., Pfenning A. R., Wang X., Claussnitzer M., Liu Y., Coarfa C., Harris R. A., Shoresh N., Epstein C. B., Gjoneska E., Leung D., Xie W., Hawkins R. D., Lister R., Hong C., Gascard P., Mungall A. J., Moore R., Chuah E., Tam A., Canfield T. K., Hansen R. S., Kaul R., Sabo P. J., Bansal M. S., Carles A., Dixon J. R., Farh K. H., Feizi S., Karlic R., Kim A. R., Kulkarni A., Li D., Lowdon R., Elliott G., Mercer T. R., Neph S. J., Onuchic V., Polak P., Rajagopal N., Ray P., Sallari R. C., Siebenthall K. T., Sinnott-Armstrong N. A., Stevens M., Thurman R. E., Wu J., Zhang B., Zhou X., Beaudet A. E., Boyer L. A., De Jager P. L., Farnham P. J., Fisher S. J., Haussler D., Jones S. J., Li W., Marra M. A., McManus M. T., Sunyaev S., Thomson J. A., Tlsty T. D., Tsai L. H., Wang W., Waterland R. A., Zhang M. Q., Chadwick L. H., Bernstein B. E., Costello J. F., Ecker J. R., Hirst M., Meissner A., Milosavljevic A., Ren B., Stamatoyannopoulos J. A., Wang T., Kellis M., Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.ENCODE Project Consortium, An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Schipper M., de Leeuw C. A., Maciel B., Wightman D. P., Hubers N., Boomsma D. I., O’Donovan M. C., Posthuma D., Prioritizing effector genes at trait-associated loci using multimodal evidence. Nat. Genet. 57, 323–333 (2025). [DOI] [PubMed] [Google Scholar]
- 81.Weeks E. M., Ulirsch J. C., Cheng N. Y., Trippe B. L., Fine R. S., Miao J., Patwardhan T. A., Kanai M., Nasser J., Fulco C. P., Tashman K. C., Aguet F., Li T., Ordovas-Montanes J., Smillie C. S., Biton M., Shalek A. K., Ananthakrishnan A. N., Xavier R. J., Regev A., Gupta R. M., Lage K., Ardlie K. G., Hirschhorn J. N., Lander E. S., Engreitz J. M., Finucane H. K., Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Yuan K., Longchamps R. J., Pardiñas A. F., Yu M., Chen T. T., Lin S. C., Chen Y., Lam M., Liu R., Xia Y., Guo Z., Shi W., Shen C., Daly M. J., Neale B. M., Feng Y. A., Lin Y. F., Chen C. Y., O’Donovan M. C., Ge T., Huang H., Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases. Nat. Genet. 56, 1841–1850 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Martens M., Ammar A., Riutta A., Waagmeester A., Slenter D. N., Hanspers K., Miller R. A., Digles D., Lopes E. N., Ehrhart F., Dupuis L. J., Winckers L. A., Coort S. L., Willighagen E. L., Evelo C. T., Pico A. R., Kutmon M., WikiPathways: Connecting communities. Nucleic Acids Res. 49, D613–D621 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.GTEx Consortium, The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Sudlow C., Gallacher J., Allen N., Beral V., Burton P., Danesh J., Downey P., Elliott P., Green J., Landray M., Liu B., Matthews P., Ong G., Pell J., Silman A., Young A., Sprosen T., Peakman T., Collins R., UK biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Bycroft C., Freeman C., Petkova D., Band G., Elliott L. T., Sharp K., Motyer A., Vukcevic D., Delaneau O., O’Connell J., Cortes A., Welsh S., Young A., Effingham M., McVean G., Leslie S., Allen N., Donnelly P., Marchini J., The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87. Liu Z., Song C., Suo C., Fan H., Zhang T., Jin L., Chen X., Alcohol consumption and hepatocellular carcinoma: Novel insights from a prospective cohort study and nonlinear Mendelian randomization analysis. BMC Med. 20, 413 (2022).
- 88. De Vincentis A., Tavaglione F., Jamialahmadi O., Picardi A., Antonelli Incalzi R., Valenti L., Romeo S., Vespasiani-Gentilucci U., A polygenic risk score to refine risk stratification and prediction for severe liver disease by clinical fibrosis scores. Clin. Gastroenterol. Hepatol. 20, 658–673 (2022).
- 89. Privé F., Vilhjálmsson B. J., Aschard H., Blum M. G. B., Making the most of clumping and thresholding for polygenic scores. Am. J. Hum. Genet. 105, 1213–1221 (2019).
- 90. Sun B. B., Maranville J. C., Peters J. E., Stacey D., Staley J. R., Blackshaw J., Burgess S., Jiang T., Paige E., Surendran P., Oliver-Williams C., Kamat M. A., Prins B. P., Wilcox S. K., Zimmerman E. S., Chi A., Bansal N., Spain S. L., Wood A. M., Morrell N. W., Bradley J. R., Janjic N., Roberts D. J., Ouwehand W. H., Todd J. A., Soranzo N., Suhre K., Paul D. S., Fox C. S., Plenge R. M., Danesh J., Runz H., Butterworth A. S., Genomic atlas of the human plasma proteome. Nature 558, 73–79 (2018).
- 91. Pietzner M., Wheeler E., Carrasco-Zanini J., Cortes A., Koprulu M., Wörheide M. A., Oerton E., Cook J., Stewart I. D., Kerrison N. D., Luan J., Raffler J., Arnold M., Arlt W., O’Rahilly S., Kastenmüller G., Gamazon E. R., Hingorani A. D., Scott R. A., Wareham N. J., Langenberg C., Mapping the proteo-genomic convergence of human diseases. Science 374, eabj1541 (2021).
- 92. Suhre K., Arnold M., Bhagwat A. M., Cotton R. J., Engelke R., Raffler J., Sarwath H., Thareja G., Wahl A., DeLisle R. K., Gold L., Pezer M., Lauc G., El-Din Selim M. A., Mook-Kanamori D. O., Al-Dous E. K., Mohamoud Y. A., Malek J., Strauch K., Grallert H., Peters A., Kastenmüller G., Gieger C., Graumann J., Connecting genetic risk to disease end points through the human blood plasma proteome. Nat. Commun. 8, 14357 (2017).
- 93. Folkersen L., Gustafsson S., Wang Q., Hansen D. H., Hedman Å. K., Schork A., Page K., Zhernakova D. V., Wu Y., Peters J., Eriksson N., Bergen S. E., Boutin T. S., Bretherick A. D., Enroth S., Kalnapenkis A., Gådin J. R., Suur B. E., Chen Y., Matic L., Gale J. D., Lee J., Zhang W., Quazi A., Ala-Korpela M., Choi S. H., Claringbould A., Danesh J., Smith G. D., de Masi F., Elmståhl S., Engström G., Fauman E., Fernandez C., Franke L., Franks P. W., Giedraitis V., Haley C., Hamsten A., Ingason A., Johansson Å., Joshi P. K., Lind L., Lindgren C. M., Lubitz S., Palmer T., Macdonald-Dunlop E., Magnusson M., Melander O., Michaelsson K., Morris A. P., Mägi R., Nagle M. W., Nilsson P. M., Nilsson J., Orho-Melander M., Polasek O., Prins B., Pålsson E., Qi T., Sjögren M., Sundström J., Surendran P., Võsa U., Werge T., Wernersson R., Westra H. J., Yang J., Zhernakova A., Ärnlöv J., Fu J., Smith J. G., Esko T., Hayward C., Gyllensten U., Landen M., Siegbahn A., Wilson J. F., Wallentin L., Butterworth A. S., Holmes M. V., Ingelsson E., Mälarstig A., Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals. Nat. Metab. 2, 1135–1148 (2020).
- 94. Zhao J. H., Stacey D., Eriksson N., Macdonald-Dunlop E., Hedman Å. K., Kalnapenkis A., Enroth S., Cozzetto D., Digby-Bell J., Marten J., Folkersen L., Herder C., Jonsson L., Bergen S. E., Gieger C., Needham E. J., Surendran P., Paul D. S., Polasek O., Thorand B., Grallert H., Roden M., Võsa U., Esko T., Hayward C., Johansson Å., Gyllensten U., Powell N., Hansson O., Mattsson-Carlgren N., Joshi P. K., Danesh J., Padyukov L., Klareskog L., Landén M., Wilson J. F., Siegbahn A., Wallentin L., Mälarstig A., Butterworth A. S., Peters J. E., Genetics of circulating inflammatory proteins identifies drivers of immune-mediated disease risk and therapeutic targets. Nat. Immunol. 24, 1540–1551 (2023).
- 95. Gudjonsson A., Gudmundsdottir V., Axelsson G. T., Gudmundsson E. F., Jonsson B. G., Launer L. J., Lamb J. R., Jennings L. L., Aspelund T., Emilsson V., Gudnason V., A genome-wide association study of serum proteins reveals shared loci with common diseases. Nat. Commun. 13, 480 (2022).
- 96. Verma A., Huffman J. E., Rodriguez A., Conery M., Liu M., Ho Y. L., Kim Y., Heise D. A., Guare L., Panickan V. A., Garcon H., Linares F., Costa L., Goethert I., Tipton R., Honerlaw J., Davies L., Whitbourne S., Cohen J., Posner D. C., Sangar R., Murray M., Wang X., Dochtermann D. R., Devineni P., Shi Y., Nandi T. N., Assimes T. L., Brunette C. A., Carroll R. J., Clifford R., Duvall S., Gelernter J., Hung A., Iyengar S. K., Joseph J., Kember R., Kranzler H., Kripke C. M., Levey D., Luoh S. W., Merritt V. C., Overstreet C., Deak J. D., Grant S. F. A., Polimanti R., Roussos P., Shakt G., Sun Y. V., Tsao N., Venkatesh S., Voloudakis G., Justice A., Begoli E., Ramoni R., Tourassi G., Pyarajan S., Tsao P., O’Donnell C. J., Muralidhar S., Moser J., Casas J. P., Bick A. G., Zhou W., Cai T., Voight B. F., Cho K., Gaziano J. M., Madduri R. K., Damrauer S., Liao K. P., Diversity and scale: Genetic architecture of 2068 traits in the VA Million Veteran Program. Science 385, eadj1182 (2024).
- 97. Kurki M. I., Karjalainen J., Palta P., Sipilä T. P., Kristiansson K., Donner K. M., Reeve M. P., Laivuori H., Aavikko M., Kaunisto M. A., Loukola A., Lahtela E., Mattsson H., Laiho P., Parolo P. D. B., Lehisto A. A., Kanai M., Mars N., Rämö J., Kiiskinen T., Heyne H. O., Veerapen K., Rüeger S., Lemmelä S., Zhou W., Ruotsalainen S., Pärn K., Hiekkalinna T., Koskelainen S., Paajanen T., Llorens V., Gracia-Tabuenca J., Siirtola H., Reis K., Elnahas A. G., Sun B., Foley C. N., Aalto-Setälä K., Alasoo K., Arvas M., Auro K., Biswas S., Bizaki-Vallaskangas A., Carpen O., Chen C. Y., Dada O. A., Ding Z., Ehm M. G., Eklund K., Färkkilä M., Finucane H., Ganna A., Ghazal A., Graham R. R., Green E. M., Hakanen A., Hautalahti M., Hedman Å. K., Hiltunen M., Hinttala R., Hovatta I., Hu X., Huertas-Vazquez A., Huilaja L., Hunkapiller J., Jacob H., Jensen J. N., Joensuu H., John S., Julkunen V., Jung M., Junttila J., Kaarniranta K., Kähönen M., Kajanne R., Kallio L., Kälviäinen R., Kaprio J., Kerimov N., Kettunen J., Kilpeläinen E., Kilpi T., Klinger K., Kosma V. M., Kuopio T., Kurra V., Laisk T., Laukkanen J., Lawless N., Liu A., Longerich S., Mägi R., Mäkelä J., Mäkitie A., Malarstig A., Mannermaa A., Maranville J., Matakidou A., Meretoja T., Mozaffari S. V., Niemi M. E. K., Niemi M., Niiranen T., CJ O. D., Obeidat M. E., Okafo G., Ollila H. M., Palomäki A., Palotie T., Partanen J., Paul D. S., Pelkonen M., Pendergrass R. K., Petrovski S., Pitkäranta A., Platt A., Pulford D., Punkka E., Pussinen P., Raghavan N., Rahimov F., Rajpal D., Renaud N. A., Riley-Gillis B., Rodosthenous R., Saarentaus E., Salminen A., Salminen E., Salomaa V., Schleutker J., Serpi R., Shen H. Y., Siegel R., Silander K., Siltanen S., Soini S., Soininen H., Sul J. H., Tachmazidou I., Tasanen K., Tienari P., Toppila-Salmi S., Tukiainen T., Tuomi T., Turunen J. A., Ulirsch J. C., Vaura F., Virolainen P., Waring J., Waterworth D., Yang R., Nelis M., Reigo A., Metspalu A., Milani L., Esko T., Fox C., Havulinna A. S., Perola M., Ripatti S., Jalanko A., Laitinen T., Mäkelä T. P., Plenge R., McCarthy M., Runz H., Daly M. J., Palotie A., FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
- 98. Broadaway K. A., Brotman S. M., Rosen J. D., Currin K. W., Alkhawaja A. A., Etheridge A. S., Wright F., Gallins P., Jima D., Zhou Y.-h., Love M. I., Innocenti F., Mohlke K. L., Liver eQTL meta-analysis illuminates potential molecular mechanisms of cardiometabolic traits. Am. J. Hum. Genet. 111, 1899–1913 (2024).
Supplementary Materials
Figs. S1 to S18
Legends for tables S1 to S37
References
Tables S1 to S37
Data Availability Statement
The MASLD GWAS summary statistics were generated using a cross-trait approach integrating MRI-PDFF and cardiometabolic risk factors (see “Construction of MASLD GWAS using MTAG” in Materials and Methods) and are available via the NHGRI-EBI GWAS Catalog (GCP ID: GCP001525; accession: GCST90728570). All data and code needed to evaluate and reproduce the results in the paper are present in the paper and/or the Supplementary Materials. No new physical materials were generated. Individual-level phenotypic and genetic data from UKB (www.ukbiobank.ac.uk/) can be accessed by application. UKB received ethical approval from the research ethics committee (REC reference: 11/NW/0382), and participants provided written informed consent. GWAS summary statistics from the following consortia and biobanks are publicly available at the corresponding URLs: NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas/), GIANT (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files), GLGC (https://csg.sph.umich.edu/willer/public/glgc-lipids2021/), MAGIC (https://magicinvestigators.org/), and FinnGen (www.finngen.fi/en/access_results). GWAS summary statistics for the MVP cohort were obtained from the GWAS Catalog (www.ebi.ac.uk/gwas/publications/39024449). Plasma proteome GWAS summary statistics were obtained from deCODE (www.deCODE.com/summarydata/; https://doi.org/10.1038/s41588-021-00978-w). Files required to run partitioned heritability analysis with S-LDSC can be found at https://zenodo.org/records/7768714.





