Abstract
Dilated cardiomyopathy (DCM) is a leading cause of heart failure and cardiac transplantation. We report a genome-wide association study and multi-trait analysis of DCM (14,256 cases) and three left ventricular traits (36,203 UK Biobank participants). We identified 80 genomic risk loci and prioritized 62 putative effector genes, including several with rare variant DCM associations (MAP3K7, NEDD4L and SSPN). Using single-nucleus transcriptomics, we identify cellular states, biological pathways, and intracellular communications that drive pathogenesis. We demonstrate that polygenic scores predict DCM in the general population and modify penetrance in carriers of rare DCM variants. Our findings may inform the design of genetic testing strategies that incorporate polygenic background. They also provide insights into the molecular etiology of DCM that may facilitate the development of targeted therapeutics.
Subject terms: Genome-wide association studies, Cardiomyopathies
Genome-wide association analyses comprising 14,256 cases and 1,199,156 controls and incorporating correlated cardiac magnetic resonance imaging traits provide insights into the molecular etiology of dilated cardiomyopathy.
Main
Dilated cardiomyopathy (DCM) describes a spectrum of heart muscle diseases that are characterized by impaired left ventricular (LV) myocardial contractility and dilatation, in the absence of coronary artery disease (CAD) or abnormal loading conditions1,2. DCM affects approximately one in 250 individuals and is among the primary etiologies of heart failure, as well as the leading cause of cardiac transplantation3. Pathogenic variants in relevant genes can cause DCM via monogenic disease mechanisms; however, recent evidence suggests important direct and indirect effects of polygenic background on DCM risk4. Characterization of the complex genetic architecture underlying DCM provides opportunities for improved clinical genetic testing and the discovery of pathways and genes to inform therapeutic development.
Results
Genome-wide association study and multitrait analysis of dilated cardiomyopathy identifies novel genomic risk loci
We performed a meta-analysis of case–control genome-wide association studies (GWASs) comprising 14,256 DCM cases and 1,199,156 controls, from 16 studies participating in the Heart Failure Molecular Epidemiology for Therapeutic Targets (HERMES) Consortium5 (Fig. 1, Extended Data Fig. 1, Supplementary Tables 1 and 2, and Supplementary Information 1). Patients who meet guideline definitions of DCM may not carry the disease label, leading to incomplete ascertainment of cases6. To improve DCM ascertainment in large research cohorts and health record-based biobanks, we developed a phenotyping algorithm without a requirement for data on LV chamber dimensions (Supplementary Information 2), which are frequently not available in studies. Of the 16 studies, six included cases recruited from specialist clinical cohorts or unequivocal DCM diagnostic codes (DCMNarrow: 6,001 cases (76.2% recruited from specialist clinical cohorts) and 449,382 controls), whereas 11 ascertained cases based on an inclusive definition of LV systolic dysfunction in the absence of secondary causes, without specific requirements for ventricular dilatation (DCMBroad: 9,299 cases and 1,157,145 controls). We found complete genetic correlation between DCMNarrow and DCMBroad (rg = 1.00), highlighting the shared genetic architecture between these phenotype definitions, and all studies were therefore combined for meta-analysis (DCM GWAS).
Among 9,656,392 common variants (minor allele frequency (MAF) > 0.01) included in the meta-analysis, we identified 27 independent variants at 26 genomic loci passing genome-wide significance (P < 5 × 10−8) (Fig. 2, Extended Data Fig. 2 and Supplementary Table 3). Eighteen of the 26 loci were associations that had not been previously reported for DCM (Supplementary Tables 3 and 4). An additional 36 variants at 36 loci met the criterion of a 1% false discovery rate (FDR) (equivalent to P < 2.2 × 10−6).
Next, we compared the effect estimates from DCM GWAS against the subset of six studies with cases carrying a clinical diagnosis (DCMNarrow GWAS, Extended Data Fig. 3). All 62 DCM GWAS loci identified using the 1% FDR threshold had directionally concordant effects in DCMNarrow GWAS. Of these, ten loci reached the genome-wide significance threshold (P < 5 × 10−8), with most having a larger effect size in DCMNarrow GWAS (Supplementary Table 3 and Extended Data Fig. 3). Using linkage disequilibrium (LD)-adjusted kinships (LDAK) with summary statistics from GWAS7, we estimated the heritability explained by common single-nucleotide polymorphisms (SNPs; h2SNP) on the liability scale as 20% (2.1% s.d.) for DCMNarrow GWAS and 11% (1% s.d.) for DCM GWAS.
To explore shared genetic etiology with quantitative LV traits and to evaluate the potential of combining traits through multitrait analysis of GWAS (MTAG), we estimated the pairwise genetic correlation (rg) between DCM and ten cardiac magnetic resonance imaging-derived (CMR) traits from 36,203 participants in the UK Biobank (UKB), using bivariate LD score regression8,9. Three LV traits were highly correlated with DCM: end-systolic volume (LVESV), rg = 0.73; global circumferential strain, rg = 0.71; and ejection fraction (LVEF), rg = −0.70 (Supplementary Table 5). These traits were included in a DCM-anchored MTAG (DCM MTAG), allowing for a joint analysis to increase statistical power10. Fifty-eight sentinel variants at 54 loci were identified at P < 5 × 10−8 by DCM MTAG, including 18 loci not identified in our GWAS at FDR < 1%. Twenty-eight of the 54 loci were associations not previously reported for DCM or any of the three LV traits included in the MTAG (Supplementary Tables 3 and 4).
A total of 59 genomic risk loci reached genome-wide significance in DCM GWAS or DCM MTAG, 31 of which had not been previously reported to be associated with DCM or related cardiac traits (Supplementary Tables 3 and 4). Among loci identified in the DCM GWAS, 25 FDR-significant loci were not significant in DCM MTAG; however, all uniquely significant loci (DCM GWAS and DCM MTAG) had directionally concordant effects (Extended Data Fig. 3). For subsequent locus- and gene-based analyses, we investigated a discovery set of 80 genomic loci, identified through either DCM GWAS (FDR < 1%) or DCM MTAG (P < 5 × 10−8), applying a range of orthogonal approaches to prioritize potential effector genes.
Using functionally informed fine-mapping, we identified 100 credible sets of likely causal variants at 63 of 80 loci. The credible sets consisted of 1,392 variants (60.6% intronic, 25.4% intergenic and 4.8% exonic). Among these, 83 variants identified at 43 loci had a posterior inclusion probability (PIP) > 0.5 (Extended Data Fig. 4 and Supplementary Table 6). Several fine-mapped coding variants were found within known DCM genes (FLNC, BAG3 and TTN) and genes with plausible effects on cardiac function (NEXN and MYBPC3), including deleterious missense variants (combined annotation-dependent depletion Phred score >15) in TTN, BAG3 and MYBPC3.
Effector gene prioritization and pathway enrichment analysis identify molecular mechanisms
To prioritize effector genes for DCM, we assessed functional evidence for 1,970 protein-coding genes situated within or overlapping the identified genomic risk loci (Fig. 3a and Supplementary Table 7). First, using a combination of nearest gene, locus-based (variant-to-gene (V2G)) and similarity-based (polygenic priority score (PoPS)) methods, we identified 380 candidate genes for further prioritization (median 5 per locus; interquartile range 4–6). Second, by incorporating additional evidence from five complementary methods—coding variants, colocalization with expression quantitative trait loci (eQTL), transcriptome-wide association studies (TWAS), activity-by-contact (ABC) model, and established Mendelian cardiomyopathy- or muscle-disease-causing genes—along with results from the three initial methods, we identified a single prioritized gene at 62 of 80 loci (Fig. 3b, Extended Data Fig. 5 and Supplementary Table 8). The highest prioritization scores were for MYPN (prioritized by seven of the maximum of eight predictors), followed by HSPB8 and ALPK3 (six predictors), and ACTN2, SPATS2L and BAG3 (five predictors). Highlighting the robustness of this framework, all ClinGen genes with definitive evidence for Mendelian cardiomyopathy, except LMNA, were prioritized at their respective loci. Genes associated with Mendelian forms of hypertrophic cardiomyopathy (HCM) (MYBPC3, ALPK3 and FHOD3) were also identified at genomic risk loci for DCM, a finding consistent with evidence that these disorders represent opposing extremes of a continuum of ventricular structure and systolic function9,11. We also identified PITX2, which has been previously shown to be strongly associated with atrial fibrillation (AF)12. To estimate the extent to which the DCM risk effects of PITX2, and the other identified risk loci, were related to AF, we conditioned the DCM GWAS summary statistics on AF using multitrait conditional and joint analysis (mtCOJO). 
Conditioning on AF partially attenuated the association signal at the PITX2 locus, implying some genetic effects on DCM risk independent of AF. Genetic association estimates for all other loci were robust to conditional analysis on AF, suggesting that the genes identified primarily influence DCM risk (Extended Data Fig. 6).
Pathway analysis of prioritized genes identified enrichment of 72 cellular components and functions, including sarcomeric and cytoskeletal function, cellular adhesion and junction organization, aggrephagy, and Wnt and TGFβ signaling (Fig. 3b,c and Supplementary Table 9). Novel prioritized GWAS genes MAPT (ref. 13) and MYL6 (ref. 14) contributed to the enrichment of pathways for contractile and cytoskeletal functions. The important role of cell-to-cell adhesion and cell-to-matrix interaction in DCM pathogenesis is underscored by the many effector genes acting at these interfaces. STRN encodes the desmosomal protein striatin, the canine ortholog of which has been implicated in dilated and arrhythmic cardiomyopathy15. SSPN encodes sarcospan, a key component of the dystrophin glycoprotein complex that has been linked to severe skeletal and cardiac muscle disorders. Other effector genes identified that act at the cell membrane include MTSS1 (ref. 16), PDLIM5 (refs. 17,18), THBS1 and TMEM182 (ref. 19).
Cell signaling components were prominently featured among the prioritized genes, including members of the TGFβ (BAMBI, INHBB, PITX2 and THBS1) and Wnt (CAMK2D, MAP3K7, NEDD4L, NFATC1, PRKCA and RNF207) signaling pathways. INHBB encodes a secreted factor, and THBS1 a transmembrane glycoprotein, both of which activate the TGFβ receptor, while BAMBI encodes a TGFβ-like pseudoreceptor that acts as a negative regulator of TGFβ signaling20. TGFβ signaling has been shown to be important in the development of fibrosis in cardiomyopathy models21. Several genes encoding heat-shock proteins (HSPA4, HSPB7 and HSPB8) were also identified, expanding on the established role of BAG3 and the unfolded protein response and endoplasmic reticular stress on DCM pathogenesis. Additionally, FBXO32 encodes a muscle-specific ubiquitin ligase involved in protein degradation that has been proposed as a rare cause of DCM22.
For genomic loci where a single high-confidence gene could not be identified, we manually curated the locus by integrating information from enriched biological pathways. The identified candidate genes were associated with cytoskeleton function (ROCK2 (ref. 23) at locus 13), cell adhesion (ITGA5 at locus 52), MAPK signaling (EPHB1 at locus 23), and the unfolded protein response (DNAJC18 at locus 31 and CRYAB at locus 50). Other notable genes included: the taurine transporter SLC6A6 (locus 20), with existing evidence of taurine deficiency causing feline DCM24; the cardiac-expressed K+ channel KCNIP2, which has been implicated in Brugada syndrome and conduction abnormalities25; RRAS2, where gain of function variants are a cause of Noonan syndrome and accompanying hypertrophic cardiomyopathy26,27; and several genes implicated in myopathy, including CHCHD10 (locus 80) and DMPK (locus 76).
Rare variant burden association analysis of putative DCM effector genes
Within the identified DCM loci were seven Mendelian cardiomyopathy genes cataloged in ClinGen, a curated database of Mendelian-disease causing genes, with definitive evidence (DCM: TTN, FLNC, LMNA, BAG3; HCM: MYBPC3, ALPK3, FHOD3) and seven genes with moderate or limited evidence (DCM: PRDM16, LDB3; DCM or HCM: OBSCN, VCL, NEXN, MYPN; intrinsic cardiomyopathy: ACTN2). Emphasizing the role of gene dosage as a likely mechanism of action at GWAS genes28 and the continuum of disease risk, four of the seven definitive evidence Mendelian DCM genes, established to act through mechanisms involving reduced gene product29, were identified through GWAS: TTN, FLNC, LMNA and BAG3. We observed a tenfold enrichment of Mendelian cardiomyopathy genes within GWAS loci (odds ratio (OR) = 9.7, P = 1.1 × 10−6).
Next, we performed rare variant (MAF < 0.001) burden association analysis (RVAS), focusing on protein truncating variants (PTVs). This analysis was applied to (1) all DCM genes with definitive or moderate evidence for Mendelian DCM30, to characterize the overall genetic architecture of DCM; and (2) genes prioritized at the identified GWAS loci through functional genomics analysis, to identify potential novel causes of Mendelian DCM and cardiomyopathy. In 453,455 participants with whole-exome sequencing from the UKB, a population-based cohort recruiting middle-aged and older individuals, the combined risk effects of rare variants in ClinGen definitive- or moderate-evidence DCM genes were orders of magnitude higher than those of GWAS sentinel variants mapping to the same genes (Fig. 4a and Supplementary Table 10).
To identify genes with a potential role in Mendelian DCM and cardiomyopathy, we investigated the effects of rare PTVs in the 62 prioritized genes with binary disease outcomes (cardiomyopathy and heart failure phenotypes) and quantitative CMR traits. Analysis was performed using whole-genome data in 78,142 individual participants of Genomics England (GeL), a rare disease and cancer cohort that recruited probands and their relatives from clinical centers, and with whole-exome sequencing in the UKB (including a subset of 36,104 with CMR). PTVs in three genes with limited or moderate evidence for Mendelian cardiomyopathy were nominally associated with DCM in GeL (MYPN: OR = 15.0, P = 0.03; PRDM16: OR = 40.3, P = 0.008) and with HCM in UKB (NEXN: OR = 24.1, P = 0.01) (Supplementary Tables 11 and 12). No carriers of MYPN or PRDM16 PTVs were identified in UKB DCM cases, and only one case carried a NEXN PTV among HCM cases in GeL (OR = 1.3, P = 0.8) (Supplementary Tables 11 and 12). Rare PTVs in three prioritized genes, not established causes of cardiomyopathy, were found to be associated with binary disease outcomes (MAP3K7 and NEDD4L with DCM) in at least one cohort (Fig. 4b and Supplementary Tables 11 and 12) and with quantitative traits (NEDD4L, MAP3K7 and SSPN) in UKB (Fig. 4b and Supplementary Table 13). PTVs in MAP3K7 were associated with DCM in GeL (OR = 24.2, Benjamini–Hochberg adjusted P value (Padj) = 0.02) and also with increased LV volumes (LV end-diastolic volume (LVEDV) = +54 ml, Padj = 0.01; LVESV = +38 ml, Padj = 4.4 × 10−4) in UKB. The importance of MAP3K7 in DCM pathogenesis was further underscored by the prioritization of additional pathway genes, including RNF207 (ref. 31), a regulator of MAP3K7 activation, which has been identified as a possible cause of canine DCM32.
PTVs in membrane receptor regulator NEDD4L were associated with DCM (OR = 10.4, Padj = 0.01) and with quantitative traits in UKB (PTV: LVEDV = +29.7 ml, Padj = 0.02; LVESV = +19.8 ml, Padj = 0.005), with replication in GeL (heart failure OR = 13.0, P = 0.01). PTVs in SSPN were associated with significant changes in quantitative LV traits (LVEF −5.9%, Padj = 0.004 and LVESV +13.0 ml, Padj = 0.02). Within a local DCM cohort, three of 337 cases (0.9%) carried PTVs in SSPN, compared with 80 of 352,564 (0.02%) among UKB controls (P = 1 × 10−5). SSPN is a critical protein located within the dystrophin glycoprotein complex of muscle cells, including cardiomyocytes. Its activity protects against impairment of cardiac contractility resulting from dystrophin deficiency in Duchenne muscular dystrophy, whereas loss of function destabilizes muscle adhesion and force generation33,34. An exploratory analysis of ultrarare variants (MAF < 1 × 10−5), which did not meet the minor allele threshold in UKB for the main RVAS, identified additional associations with DCM, specifically with SLC38A6 and SSPN (Supplementary Table 14).
Identifying key cell types and cellular processes using single-cell transcriptomics
To identify the organs, tissues and cell types mediating genetic risk of DCM, we performed bulk tissue-level heritability enrichment analysis. Cardiac and other muscle-related tissues (including vascular and gastrointestinal smooth muscle) showed the highest levels of enrichment (Fig. 5a and Supplementary Table 15). Cell type heritability was assessed using the sc-linker framework35, integrating single-nucleus RNA sequencing (snRNA-seq)36 of LV tissue from 52 DCM patients with end-stage heart failure undergoing cardiac transplantation and 18 controls, and genome-wide enhancer–promoter contact in the LV, with GWAS heritability. We identified biologically relevant cell types and disease-specific relationships by identifying enrichments in basal gene expression profiles within cardiomyocytes and DCM-specific differentially expressed genes (DEGs) in cardiomyocytes, fibroblasts and mural cells (Fig. 5b and Supplementary Tables 16 and 17). When gene expression in control hearts was evaluated, most prioritized genes had the highest levels of expression in cardiomyocytes (Fig. 5c). Several of the prioritized DCM genes, including SSPN, MAP3K7 and NEDD4L, were differentially expressed in cardiomyocytes in DCM (Fig. 5d). Supporting the important role of noncardiomyocytes in DCM pathogenesis, fibroblasts and mural cells (primarily pericytes) consistently had higher proportions of DEGs in enriched biological pathways (Extended Data Fig. 7), with most prioritized genes being DEGs in noncardiomyocytes.
To explore cardiomyocyte and cardiomyocyte cell-nonautonomous mechanisms, as well as the role of prioritized genes encoding ligands or receptors, we investigated intercellular signaling pathways using CellChat37. This method combines cellular transcriptomics, a priori knowledge of ligand–receptor–cofactor interactions and the law of mass action to quantify communication networks. In DCM, we observed an overall increase in global signaling, with notable reductions in cardiomyocyte–cardiomyocyte interaction strength (Extended Data Fig. 7). Additionally, there was an increase in interactions of prioritized genes enriched in the TGFβ signaling pathway, along with specific changes in pathways containing specific prioritized genes. For example, interactions of COL4A1 and EPHB1 increased, while those of THBS1 decreased (Extended Data Fig. 7). Modest increases in overall collagen signaling were also found in DCM. Specifically, COL4A1 expression was increased in fibroblasts (Fig. 5d), with enhanced signaling to cardiomyocytes, fibroblasts and mural cells via integrins (Fig. 5e). EPHB1 (encoding Ephrin type-B receptor 1) expression was highest in cardiomyocytes, while its cognate ligand, EFNB2 (encoding Ephrin-B2), was expressed in endothelial cells. In DCM, the levels of the ligand increased, while there was a corresponding decrease in receptor production (Extended Data Fig. 7). Similar findings were reported in a single-nucleus RNA-sequencing study of pressure-overloaded human hearts38. BMPR1A was predominantly expressed in cardiomyocytes (Extended Data Fig. 7), with increased expression in mural cells and fibroblasts. This was associated with increased BMP6–BMPR1A signaling from endocardial cells to cardiomyocytes and fibroblasts (Fig. 5f and Extended Data Fig. 7), as previously reported36.
Polygenic burden predicts risk and modifies penetrance in carriers of monogenic variants
Given the important contribution of common genetic variation to DCM heritability, we generated a polygenic score (PGSDCM) using 541,841 SNP predictors and evaluated it in 347,585 unrelated participants of White British ancestry from the UKB (Fig. 6a). The PGS was significantly associated with DCM (OR per PGS s.d. 1.76, 95% CI 1.64 to 1.90, P < 2 × 10−16; area under the receiver operating characteristic curve (AUROC) = 0.71) in the general population. The top centile had a fourfold increased risk compared with the median (OR = 3.83, 95% CI 2.52 to 5.79, P = 2.1 × 10−10), and a sevenfold increased risk compared with the bottom centile (OR = 7.04, 95% CI 2.42 to 20.52, P = 3.5 × 10−4) (Fig. 6b,c). In 25,443 individuals from the UKB with CMR imaging, PGSDCM was associated with cardiac traits concordant with DCM (Supplementary Table 18). These included reduced contractility (LVEF: per PGS s.d. −0.7%, Padj = 8.1 × 10−78; top versus bottom centile 57.6 versus 60.8, Padj = 1.7 × 10−6) and increased volumes (LVEDV: +2.1 ml, Padj = 2.5 × 10−45; top versus bottom centile: 158.1 versus 143.4, P = 3.1 × 10−6; LVESV: +1.9 ml, P = 1.6 × 10−93; top versus bottom centile: 67.7 versus 56.6, P = 1.4 × 10−9). Given the variability in penetrance and expressivity of DCM in carriers of rare pathogenic variants39, we next evaluated whether common variants affected penetrance of rare variants, as has previously been demonstrated in HCM11. In 1,546 carriers of pathogenic variants in DCM-causing genes in UKB (prevalence 0.5%), PGSDCM stratified DCM prevalence (top quintile: 7.3%, bottom quintile: 1.7%, P = 0.005), including in 1,166 carriers of rare TTN PTVs (Fig. 6d). DCM risk was higher in carriers of pathogenic variants in DCM-causing genes compared with gene-negative individuals in the top centile of PGS risk (OR = 6.4, 95% CI 4.0 to 10.3, P = 6 × 10−14).
Finally, we conducted a phenome-wide association study (pheWAS) of PGSDCM to explore genetic relationships between common variant risk and other traits and diseases. We identified significant associations with heart failure and several related cardiovascular phenotypes (electrophysiologic and valvular), as well as established risk factors for impaired cardiac function (hypertension and obesity) (Fig. 6e). We also found significant associations with cardiac ischemic phenotypes and inverse associations with HCM, as previously described9. Genetic association estimates for all DCM loci were robust to conditional analysis on CAD and systolic blood pressure (SBP) using mtCOJO, suggesting that the identified genes primarily affect DCM risk (Extended Data Fig. 6). The pheWAS associations were robust to adjustment for measured hypertension, while adjustment for DCM and heart failure diagnoses resulted in loss of associations with ischemic phenotypes and obesity (Extended Data Fig. 8).
Discussion
In conclusion, through GWAS meta-analysis and multitrait analysis with LV traits, we identified 59 genomic loci for DCM, 31 of which had not been previously reported. These loci, along with an additional 21 loci significant at an FDR of 1% (80 loci in total), were investigated using a systematic approach for locus annotation and gene prioritization. We prioritized 62 effector genes for DCM, which were associated with key biological pathways in disease pathogenesis. Using single-nucleus transcriptomics from explanted end-stage DCM hearts, we demonstrated the importance of these pathways and highlighted the key role of noncardiomyocyte cell types and noncell-autonomous effects, including Ephrin-B and BMP6 signaling. Rare variant association analysis of the prioritized genes also identified previously unrecognized potential causes of Mendelian DCM, including MAP3K7, NEDD4L and SSPN. Finally, we demonstrate that a DCM polygenic score directly affects DCM risk and modifies disease penetrance in carriers of rare pathogenic variants. These findings provide mechanistic insights into the genetic architecture and molecular etiology of DCM and may inform therapeutic strategies for both DCM patients and at-risk individuals.
Methods
Ethics statement
This research complied with all relevant ethical regulations. All patients gave written informed consent, and all studies were approved by the relevant regional research ethics committees and adhered to the principles set out in the Declaration of Helsinki. Details of ethics approvals for individual studies are provided in the Supplementary Information.
Phenotype and study populations
DCM was defined in each participating study using a harmonized, rule-based, multimodal phenotyping algorithm as a guide. DCM was defined as LV systolic dysfunction with or without LV dilatation in the absence of secondary causes of heart failure (CAD, valvular heart disease or congenital heart disease); see Supplementary Information 1 for full definitions. Individuals with CAD, valvular heart disease or congenital heart disease were excluded from the control group. Imaging evidence or physician adjudication was preferred, but, where this was unavailable, classifiers were defined as the presence of at least one relevant diagnosis or procedural code from the patient’s medical records.
Discovery GWAS and multitrait analysis of GWAS
The current GWAS meta-analysis included 14,256 cases and 1,199,156 controls of European ancestry from 16 studies in the HERMES Consortium (cohorts described in Supplementary Information 2 and Supplementary Table 1). Genotyping for 15 of 16 studies was performed locally in each participating study using high-density genotyping arrays imputed against reference whole-genome sequencing panels from the Haplotype Reference Consortium (14 studies), 1000 Genomes Project (ref. 40) or population-specific reference panels (Estonian Biobank and deCODE) (Supplementary Table 2). Genotyping for the GeL cohort was done using whole-genome sequencing. Genetic association tests were performed per study per phenotype, using a logistic regression model assuming additive genetic effects with adjustments for age, sex, genetic principal components (PCs) and study-specific covariates. Full details of study-level GWAS methods are available in Supplementary Information 3 and Supplementary Table 2. Descriptions of studies and participant characteristics are provided in Supplementary Table 1. Sensitivity analysis GWAS and meta-analysis of strictly defined DCM (Supplementary Information 1) were performed using the same workflow. To assess the effects of ascertainment of DCM using the different criteria, GWAS meta-analysis was performed for the studies that used narrow (DCMNarrow GWAS) or broad (DCMBroad GWAS) criteria (Supplementary Table 1), and genetic correlations were assessed using bivariate LD score regression with LDSC v.1.0.1 (ref. 41).
GWAS meta-analysis was performed centrally using METAL v.2020-05-05 (ref. 42) with an inverse-variance weighted fixed-effect model. To boost discovery power, we further conducted a multitrait analysis of GWAS (MTAG), a method for jointly analyzing summary statistics from multiple overlapping GWAS of genetically correlated traits. GWAS in the UK Biobank of ten CMR-derived LV traits (LVEF, LVESV, LVEDV, stroke volume, global circumferential, longitudinal and radial strains, mass, concentricity, and maximum wall thickness) from 36,083 unrelated participants of White British ancestry and without heart failure, cardiomyopathy, previous myocardial infarction or structural heart disease8 were tested for genetic correlation with primary GWAS using LDSC v.1.0.1 (refs. 43,44). MTAG of the primary GWAS was then performed with CMR traits with high genetic correlation (|rg| > 0.7) using mtag v.1.0.8 (ref. 10). The maximum FDR was estimated by mtag to be 2.7%.
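The inverse-variance weighted fixed-effect model named above can be illustrated for a single variant. This is a minimal sketch of the standard estimator, not the METAL implementation; the function name `ivw_fixed_effect` and the per-study values are hypothetical and do not come from this study.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Pool per-study effect estimates (e.g. log odds ratios) for one variant.

    Each study is weighted by the inverse of its sampling variance, so more
    precise studies contribute more to the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical estimates for one variant in three studies (illustration only).
pooled_beta, pooled_se, pooled_z, pooled_p = ivw_fixed_effect(
    betas=[0.10, 0.14, 0.08], ses=[0.04, 0.05, 0.03]
)
```

A fixed-effect model assumes that all studies estimate the same underlying effect; heterogeneity between studies is not modeled in this sketch.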
SNP-based heritability estimation
The proportion of variance in DCM risk explained by common SNPs, that is, SNP-based heritability (h2SNP), was estimated from GWAS meta-analysis summary statistics using LDAK SumHer software v.5.2 with the BLD-LDAK heritability model7. The h2SNP estimates were calculated on a liability scale, which assumes that a binary phenotype has an underlying continuous liability, and that above a certain liability threshold, an individual becomes affected45. To model the expected heritability tagged by each SNP, we used precomputed tagging files derived from 2,000 White British individuals, and we used a correction for sample prevalence by calculating the effective sample size assuming equal numbers of cases and controls46. The conversion to liability scale was calculated using a population prevalence of 0.004 for DCMNarrow (based on an estimated prevalence of 1 in 250 individuals2,3) and 0.008 for DCM (assuming twice the prevalence of DCMNarrow).
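The two quantities described above, the effective sample size and the observed-to-liability scale conversion, can be sketched as follows. This assumes the widely used liability-threshold transformation, with sample prevalence fixed at 0.5 to match the equal-cases-and-controls assumption; it is not taken from the LDAK source code, and the function names are hypothetical.

```python
from statistics import NormalDist

def effective_sample_size(n_cases, n_controls):
    """Effective N of a balanced design with the same statistical information."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

def liability_scale_factor(population_prevalence, sample_prevalence=0.5):
    """Multiplier converting observed-scale h2 to liability-scale h2.

    Assumes a standard normal liability with affected individuals above
    the threshold t, where P(liability > t) = population prevalence.
    """
    k = population_prevalence
    p = sample_prevalence
    normal = NormalDist()
    t = normal.inv_cdf(1.0 - k)   # liability threshold
    z = normal.pdf(t)             # normal density at the threshold
    return (k ** 2 * (1.0 - k) ** 2) / (z ** 2 * p * (1.0 - p))

# Case and control counts and prevalences as reported in the text.
n_eff = effective_sample_size(14_256, 1_199_156)
factor_narrow = liability_scale_factor(0.004)  # DCMNarrow
factor_all = liability_scale_factor(0.008)     # DCM
```

Multiplying an observed-scale estimate by the factor gives the liability-scale value under these assumptions, and a higher assumed population prevalence yields a larger factor.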
Locus identification
To identify genetic susceptibility loci for DCM, we first identified conditionally independent genetic variants using a chromosome-wide stepwise conditional-joint analysis implemented in the Genome-wide Complex Trait Analysis software (v.1.92.4)47 at a genome-wide significance threshold of P < 5 × 10−8 in all GWAS and additionally at FDR < 1% (estimated using qvalue) for DCM GWAS. To define a genomic locus, conditionally independent genetic variants across both DCM GWAS and DCM MTAG that were located within 500 kb of each other were aggregated, and an additional 500 kb was added to flank the variants at the extremes within each set. A genomic locus was considered to be novel if all conditionally independent variants within the locus were located ≥250 kb away and not in LD (R2) with any sentinel variant with a P < 5 × 10−8 reported in previously published GWAS of DCM for DCM GWAS or GWAS of any of the three traits included for MTAG in DCM MTAG (Supplementary Table 3).
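The locus definition above (aggregate independent variants within 500 kb of each other, then add 500 kb flanks) can be sketched as a simple interval-merging routine. This sketch assumes that "within 500 kb of each other" means chaining each variant to the previous one in the same set; the function name and the example positions are hypothetical.

```python
def define_loci(variants, merge_kb=500, flank_kb=500):
    """Group variants into loci and return (chrom, start_bp, end_bp) tuples.

    variants: iterable of (chromosome, position_bp) for conditionally
    independent variants pooled across analyses.
    """
    merge_bp = merge_kb * 1000
    flank_bp = flank_kb * 1000
    groups = []  # each entry: [chrom, first_pos, last_pos]
    for chrom, pos in sorted(variants):
        same_set = (
            groups
            and groups[-1][0] == chrom
            and pos - groups[-1][2] <= merge_bp
        )
        if same_set:
            groups[-1][2] = pos  # extend the current set
        else:
            groups.append([chrom, pos, pos])  # start a new set
    # Flank the extreme variants of each set, clipping at the chromosome start.
    return [(c, max(0, lo - flank_bp), hi + flank_bp) for c, lo, hi in groups]

# Hypothetical positions (illustration only).
example_loci = define_loci(
    [(1, 1_000_000), (1, 1_400_000), (1, 2_500_000), (2, 1_200_000)]
)
```

In this example the first two variants on chromosome 1 fall into one locus, while the third is more than 500 kb from its neighbor and forms its own locus.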
Enrichment of Mendelian cardiomyopathy genes within GWAS loci
To estimate the enrichment of Mendelian cardiomyopathy genes within GWAS loci, we first extracted 3,404 genes that had been linked to a Mendelian disorder with at least moderate evidence as listed in the ClinGen and GenCC databases (accessed February 2023). We annotated whether each gene was located in a GWAS locus and whether it was listed as one of the 38 Mendelian cardiomyopathy genes (Supplementary Information 4). We then cross-tabulated these annotations and performed a one-sided Fisher’s exact test to calculate the OR of cardiomyopathy genes being situated within GWAS loci. Fisher’s exact test was performed using the fisher.test function in R.
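The cross-tabulation and one-sided test described above can be sketched in Python using the hypergeometric distribution directly. This is an illustration rather than the R analysis itself: the function name is hypothetical, the counts are toy values, and the odds ratio returned is the simple sample odds ratio, whereas R's fisher.test reports a conditional maximum-likelihood estimate.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    a: cardiomyopathy genes inside GWAS loci
    b: cardiomyopathy genes outside GWAS loci
    c: other Mendelian genes inside GWAS loci
    d: other Mendelian genes outside GWAS loci
    Returns (sample odds ratio, P value for observing >= a by chance).
    """
    row1 = a + b          # all cardiomyopathy genes
    col1 = a + c          # all genes inside loci
    n = a + b + c + d
    denominator = comb(n, col1)
    k_max = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, k_max + 1)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, numerator / denominator

# Toy counts for illustration only; not the counts from this study.
toy_or, toy_p = fisher_exact_greater(3, 1, 1, 3)
```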
Functionally informed fine-mapping of genomic loci
To prioritize likely causal variants at each genomic locus, we performed functionally informed fine-mapping using PolyFun v.2020-11-14 (ref. 48) and SuSiE v.0.11.92 (ref. 49). Using precomputed prior causal probabilities of 19 million imputed SNPs with MAF > 0.001 based on meta-analysis of 15 traits in UKB from PolyFun, we first estimated per-SNP heritability. These results were then passed to SuSiE to calculate per-SNP posterior inclusion probabilities and to identify 95% credible sets of likely causal variants, assuming at most five causal variants per locus. To run fine-mapping, we used LD reference panels from 10,000 randomly selected UKB European ancestry participants. The procedure was performed separately for loci identified from DCM GWAS and DCM MTAG using the respective summary statistics. For each locus, variants within the identified 95% credible sets in either DCM GWAS or DCM MTAG were aggregated, and annotated with nearest gene(s), genic functions, and Combined Annotation-Dependent Depletion Phred score50 extracted from ANNOVAR v.2020-06-07 (ref. 51) and OpenTargets Genetics52.
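The notion of a 95% credible set can be illustrated with a simplified single-signal case: rank variants by posterior probability and keep the smallest set whose cumulative probability reaches the coverage. This is a sketch of the concept only. SuSiE builds one set per modeled effect and applies further filters, which are not reproduced here; the function name and variant labels are hypothetical.

```python
def credible_set(posterior_probabilities, coverage=0.95):
    """Smallest set of variants whose summed posterior reaches `coverage`.

    posterior_probabilities: dict mapping variant ID to its posterior
    probability of being the causal variant for one association signal.
    """
    ranked = sorted(
        posterior_probabilities.items(), key=lambda item: item[1], reverse=True
    )
    selected, cumulative = [], 0.0
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected

# Hypothetical posterior probabilities for four variants at one signal.
example_set = credible_set(
    {"variant_a": 0.60, "variant_b": 0.30, "variant_c": 0.07, "variant_d": 0.03}
)
```

Smaller credible sets indicate better resolution; a variant with PIP > 0.5, as highlighted in the Results, is more likely than not to be causal under the model.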
Prioritization of effector genes at DCM loci
To systematically identify and prioritize effector genes at each locus, we followed a two-step approach. First, the nearest gene and the top three genes prioritized by either PoPS53 or V2G54 were selected as candidate genes. Second, the totality of evidence including nearest gene, PoPS, V2G and five additional approaches (coding variant, colocalization with gene expression, TWAS, ABC model, and established Mendelian cardiomyopathy- and muscle-disease-causing genes) was summarized by identifying the number of individual approaches that identified each candidate gene as the most likely, assuming that it met each method’s minimum threshold for significance or relevance. Each method received equal weighting, with a maximum score of 8, and the candidate gene with the highest score at each genomic locus was determined to be the prioritized gene. Loci in which gene scores were tied for the highest score were determined not to have a single high-confidence candidate gene.
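The scoring rule described above can be sketched as follows. The method labels, gene names and evidence sets are hypothetical placeholders; the sketch only encodes the stated logic of equal weighting across eight approaches and no prioritized gene when the top score is tied.

```python
from typing import Dict, Optional, Set

# The eight approaches named in the text (labels are illustrative).
METHODS = frozenset({
    "nearest_gene", "pops", "v2g", "coding_variant",
    "coloc", "twas", "abc", "mendelian",
})


def prioritize_gene(evidence: Dict[str, Set[str]]) -> Optional[str]:
    """Return the candidate gene supported by the most approaches at a locus,
    or None when the highest score is shared by more than one gene."""
    scores = {gene: len(methods & METHODS) for gene, methods in evidence.items()}
    if not scores:
        return None
    best = max(scores.values())
    top = [gene for gene, score in scores.items() if score == best]
    return top[0] if len(top) == 1 else None


# Hypothetical locus with three candidate genes.
locus = {
    "GENE_A": {"nearest_gene", "pops", "coloc", "twas"},
    "GENE_B": {"v2g", "abc"},
    "GENE_C": {"pops"},
}
print(prioritize_gene(locus))
```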
Transcriptome-wide association study
We estimated the associations between overall gene expression across tissues and DCM through a multitissue TWAS using eQTL data across 49 human tissues from GTEx v.8 and the DCM GWAS summary statistics implemented in S-MulTiXcan v.0.7.3 with the MASH-R model55.
Colocalization with gene expression
To test the hypothesis that genetic associations with gene expression in a given tissue and with DCM are driven by the same causal variants, we performed a statistical colocalization analysis using R coloc v.5.2.3 (ref. 49) allowing for multiple causal variants. The colocalization analysis was performed for all genes overlapping with the identified DCM genetic loci using summary-level eQTL data from GTEx v.8 (ref. 56) in the tissues with the lowest TWAS P value and the DCM GWAS summary statistics.
Polygenic priority score
We computed the polygenic enrichment of gene features derived from cell-type-specific gene expression, biological pathways and protein–protein interactions for all protein-coding genes within the human genome using PoPS v.0.1 (ref. 53). A higher score implies a higher probability of a gene being causal for the trait under study, given feature similarities to other predicted causal genes.
Variant-to-gene
The V2G model aggregates data from molecular phenotype quantitative trait locus (QTL) experiments including gene expression (eQTL), protein abundance (pQTL) and alternative protein splicing (sQTL), chromatin interaction experiments, in silico functional predictions and genomic distance (between the variant and a gene’s canonical transcriptional start site) to compute a variant-level score, with a higher value reflecting greater functional relevance on a given gene54. To map variant-level V2G scores onto gene-level scores for gene prioritization, we extracted the V2G score using V2G v.1.1 for all variants that were in LD (R2 > 0.8) with conditionally independent variants or within the fine-mapped variant set for a given locus and took the maximum V2G for a given gene.
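The mapping from variant-level to gene-level V2G scores amounts to taking a maximum per gene over the selected variants. A minimal sketch follows, with hypothetical variant identifiers, gene names and scores; it assumes the input has already been restricted to variants in LD (R2 > 0.8) with the conditionally independent variants or in the fine-mapped set.

```python
from typing import Dict, Iterable, Tuple


def gene_level_v2g(records: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Collapse (variant, gene, score) records to the maximum score per gene."""
    best: Dict[str, float] = {}
    for _variant, gene, score in records:
        if score > best.get(gene, float("-inf")):
            best[gene] = score
    return best


# Hypothetical variant-level scores at one locus.
records = [
    ("var1", "GENE_A", 0.12),
    ("var2", "GENE_A", 0.31),
    ("var1", "GENE_B", 0.08),
    ("var3", "GENE_B", 0.05),
]
print(gene_level_v2g(records))
```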
ABC model
The ABC model uses experimental estimates of enhancer activity (assay for transposase-accessible chromatin using sequencing, DNase I hypersensitive site sequencing, or histone 3 K27 acetylation chromatin immunoprecipitation followed by sequencing) and enhancer–promoter contact frequency (high-throughput chromatin conformation capture) to predict enhancer–gene interactions57. Precomputed ABC scores generated from experimental data of cardiac left ventricles in ENCODE58 were identified for the genomic coordinates of fine-mapped and lead variants, with scores >0.02 indicating important interactions.
Conditional GWAS analysis
Conditional GWAS analysis was performed using a multitrait-based conditional and joint analysis (mtCOJO) method59 implemented in GCTA v.1.92.4, which we used to estimate the genetic effects of disease conditioning on AF, CAD, and SBP. To perform the analysis, we used summary statistics from GWAS of AF in 77,690 cases and 1,167,040 controls60, CAD in 181,522 cases and 984,168 controls60 and SBP in 757,601 individuals61. For AF and CAD, we calculated the sample prevalence by dividing the number of cases by the number of samples reported in the GWAS, and we used a population prevalence of 2.2% for AF and 7.2% for CAD62,63. Given that the vast majority of the GWAS summary statistics used were derived from European ancestry samples, we used 1000G European ancestry to model LD between variants.
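The sample prevalence calculation can be reproduced from the case and control counts given above. This short sketch assumes that "the number of samples" is the sum of cases and controls; the function name is illustrative.

```python
def sample_prevalence(cases: int, controls: int) -> float:
    """Proportion of cases among all samples (assumed to be cases + controls)."""
    return cases / (cases + controls)


# Counts as reported in the text.
af_prevalence = sample_prevalence(77_690, 1_167_040)
cad_prevalence = sample_prevalence(181_522, 984_168)
print(f"AF sample prevalence:  {af_prevalence:.4f}")
print(f"CAD sample prevalence: {cad_prevalence:.4f}")
```

These sample prevalences are distinct from the population prevalences of 2.2% (AF) and 7.2% (CAD) that were also supplied to the analysis.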
Rare variant gene-based association testing
Gene-based association testing was performed in the UKB and 100,000 Genomes Project for all genes located within genomic loci, using the genome-wide regression test implemented in REGENIE v.3.2.4. A whole-genome regression model was fitted to allow handling of polygenicity, relatedness and ancestry, using directly genotype-arrayed variants passing quality control (MAF > 0.01, <10% missingness, Hardy–Weinberg equilibrium test P > 10−15) in UKB, or directly sequenced variants in the 100,000 Genomes Project (GeL). Next, a gene-based burden test was performed conditional upon the phenotype-specific predictors from the genome-wide regression model and adjusting for sex, age, age2 and first ten genetic PCs, with body surface area and SBP included as additional covariates for quantitative traits. The outcomes tested were binary case–control status (DCM (narrow and broad definition), heart failure and HCM) and, in the UKB, related CMR quantitative traits (LVESV, LVEDV, LVEF, LV stroke volume and maximum LV wall thickness). Firth correction was applied to account for case–control imbalance. Burden tests collapse variants into a single variable that can be tested for association with a phenotype or trait, thereby reducing computational cost and the test statistic inflation that is seen with other gene-based rare variant tests (for example, SKAT and SKAT-O). Individuals with missing phenotype data were dropped from analysis. For consistency across UKB and GeL, one rare variant mask of PTVs (start lost, stop gained, frameshift, splice acceptor or donor lost) with a MAF < 1 × 10−3 was tested. To minimize the false positive rate resulting from genes with very low allele counts, a minimum allele count (MAC) threshold was applied that considered the approximate sample size: analysis in UKB required MAC ≥ 20 for binary traits, and MAC ≥ 3 for quantitative traits; and analysis in GeL required MAC ≥ 3. 
P values were FDR-adjusted using the Benjamini–Hochberg method across all tested genes that passed the MAC threshold. A significant association (Padj < 0.05) in one cohort was considered validated if the same gene–trait association was directionally concordant and nominally significant (P < 0.05) in the other cohort. The effect of ultrarare (MAF < 1 × 10−5) variants on binary outcomes in UKB was also tested in exploratory analyses.
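The filtering and multiple-testing steps can be sketched as below: genes are first filtered by minimum allele count, P values are then adjusted with the Benjamini–Hochberg procedure, and a simple check encodes the validation rule. Gene names, allele counts, P values and function names are hypothetical, and the sketch does not reproduce REGENIE itself.

```python
from typing import Dict, Tuple


def benjamini_hochberg(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Benjamini-Hochberg adjusted P values, keyed by gene."""
    m = len(pvalues)
    ranked = sorted(pvalues.items(), key=lambda kv: kv[1])
    adjusted: Dict[str, float] = {}
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        gene, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[gene] = running_min
    return adjusted


def adjust_tested_genes(results: Dict[str, Tuple[int, float]], mac_min: int) -> Dict[str, float]:
    """Keep genes with MAC >= mac_min, then adjust their P values."""
    kept = {gene: p for gene, (mac, p) in results.items() if mac >= mac_min}
    return benjamini_hochberg(kept)


def is_validated(padj_a: float, beta_a: float, p_b: float, beta_b: float) -> bool:
    """Significant in one cohort, nominally significant and directionally
    concordant in the other."""
    return padj_a < 0.05 and p_b < 0.05 and beta_a * beta_b > 0


# Hypothetical gene -> (minor allele count, P value) for a binary trait.
results = {
    "GENE_A": (25, 0.005), "GENE_B": (30, 0.01), "GENE_C": (22, 0.03),
    "GENE_D": (40, 0.04), "GENE_E": (5, 0.0001),  # GENE_E fails MAC >= 20
}
print(adjust_tested_genes(results, mac_min=20))
```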
To characterize the overall genetic architecture of DCM, gene-based burden testing of rare PTVs (MAF < 1 × 10−3) was also performed for 16 DCM genes with moderate or definitive evidence30 in UKB to generate risk estimates for carriers of rare variants with DCM and heart failure.
Tissue, cell type and cell state heritability enrichment
Tissue-level heritability enrichment analysis was performed using precalculated LD scores of gene expression data from GTEx56 and chromatin data from the Roadmap Epigenomics64 and ENCODE58 projects, with LDSC v.1.0.1 (ref. 65). For cell type and state heritability enrichment, we used the sc-linker framework35 to link transcriptome-wide gene programs derived from snRNA-seq datasets with GWAS summary statistics. This approach uses snRNA-seq data to generate gene programs that characterize individual cell types and states. These programs are then linked to genomic regions and the SNPs that regulate them by incorporating Roadmap Enhancer-Gene Linking64,66 and ABC models57,67. Finally, the disease informativeness of the resulting SNP annotations is tested using stratified LD score regression68, conditional on broad sets of annotations from the baseline LD model41,69, and enrichment statistics and τ coefficients are reported.
Cell-type-specific gene programs were generated from snRNA-seq data of ventricular tissue from 18 control subjects, with cell type annotations made as part of a larger study of ~880,000 nuclei (samples from 52 DCM and 18 control subjects)36. Cells that may not have represented true biological states (for example, technical doublets) were excluded from the analysis. For cell type disease-specific programs, pseudobulked counts were used to compare expression levels in DCM and control LV samples within all annotated cell types, using edgeR v.3.32.1 (ref. 70) and methods previously described36. Significant DEGs were defined as those with FDR-adjusted P < 0.05 and absolute(log2 fold change) > 0.5, requiring a minimum normalized log2 count of >0.0125 per nucleus (equivalent to 1 count in a nucleus with 10,000 total counts) in either control or DCM samples.
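The DEG definition above combines three thresholds, which can be written as a single predicate. The sketch below is illustrative only: the argument names are hypothetical, and the inputs are assumed to be the FDR-adjusted P value, the log2 fold change and the normalized per-nucleus expression in each group, as produced upstream by the differential expression analysis.

```python
def is_significant_deg(
    fdr: float,
    log2_fc: float,
    expr_control: float,
    expr_dcm: float,
    fdr_max: float = 0.05,
    lfc_min: float = 0.5,
    expr_min: float = 0.0125,
) -> bool:
    """Apply the three stated criteria for a significant DEG.

    The expression filter passes when either group exceeds `expr_min`.
    """
    expressed = expr_control > expr_min or expr_dcm > expr_min
    return fdr < fdr_max and abs(log2_fc) > lfc_min and expressed


# Hypothetical genes: (FDR, log2 fold change, control expression, DCM expression).
print(is_significant_deg(0.01, -0.8, 0.002, 0.05))   # passes all three criteria
print(is_significant_deg(0.01, 0.3, 0.10, 0.12))     # fold change too small
```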
Pathway enrichment analysis of effector genes, DEGs and intercellular communication in DCM single-nucleus transcriptomics
Gene ontology (GO) pathway enrichment of effector genes and DEGs in DCM was determined at the cell type level. Driver GO terms among GWAS effector genes were identified using a two-stage algorithm implemented with gprofiler2 v.0.2.3 (ref. 71). These GO terms were further examined in the DCM single-nucleus dataset by exploring enrichment among DCM DEGs in all cell types. Functional enrichment analysis was performed using a cumulative hypergeometric probability, with Bonferroni-adjusted P values reported.
To determine the importance of cardiomyocyte and noncardiomyocyte cell types in DCM and the roles of candidate genes and effector-gene-enriched signaling pathways, we explored disease-specific intercellular communication. The single-nucleus transcriptomes of DCM and control samples were interrogated using CellChat v.1.0 for manually curated ligand–receptor interactions (CellChatDB)37. In brief, this method identifies overexpressed genes within cell types and states, quantifies the probability of receptor–ligand communication between cells using the law of mass action, and infers statistically and biologically important cellular communications37. CellChat was run using default program settings, and the results were analyzed at the cell type level. Endocardial cells were separated from other endothelial cells owing to previously reported important biological effects on ligand–receptor signaling36. All analyses were performed in R v.4.0.3.
Polygenic risk score generation and testing
PGS were generated using a Bayesian framework that models ancestry-specific LD with an external reference set and uses a continuous shrinkage prior, implemented using the PRS-CS v.1.0 package72. The phi constant was automatically selected by PRS-CS in an unsupervised approach (PRS-CS auto). Whole-genome PGS for all included UKB individuals were calculated using the PLINK 1.9 --score function73. Individual SNP weights were generated from a DCM GWAS that excluded the UKB cohort, and a subsequent MTAG, to avoid the substantial inflation that occurs when there is overlap of individuals between the GWAS and testing cohorts74. The base GWAS summary statistics were filtered to exclude rare and uncommon variants (MAF < 0.01) and ambiguous SNPs that were not resolvable by strand-flipping. We calculated a PGS for unrelated (no relatives of third degree or closer) White British participants in the UKB (application number 47602) using variants that passed genotyping quality control (MAF > 0.01, genotyping rate >0.99, Hardy–Weinberg equilibrium test P > 1 × 10−6). Variants overlapping the base, target and LD reference set (1000 Genomes Project phase 3 European ancestry) were included. PGS predictive performance was assessed on the basis of AUROC and association with DCM and associated CMR traits (OR per PGS standard deviation and comparing top quantiles with the median) in the UKB, and in carriers of rare variants predicted to cause DCM30 (see Supplementary Information 5 for full details of variant curation and genes tested). All models included age, age2, sex and the first ten genetic PCs as covariates. AUROC was calculated for logistic regression models using pROC v.1.18.4, after randomly splitting the cohort into 70% for model generation and 30% for evaluation. Nagelkerke’s R2 was calculated using fmsb v.0.7.5, with the null model including only age, age2, sex and the first ten genetic PCs as covariates.
Time-to-event analysis was performed using survival v.3.5.7, and cumulative incidence curves were generated using survminer v.0.4.9. All statistical analyses were performed in R v.4.0.3.
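The core PGS calculation and its evaluation by AUROC can be sketched in a few lines. This is a simplified illustration rather than the pipeline used here: the score is a plain weighted sum of allele dosages, the AUROC is computed directly on the standardized score (the study used logistic regression models with covariates), and all SNP identifiers, weights, dosages and labels are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over SNPs present in both inputs."""
    return sum(weights[snp] * dose for snp, dose in dosages.items() if snp in weights)


def standardize(scores: Sequence[float]) -> List[float]:
    """Convert scores to standard deviation units (mean 0, s.d. 1)."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case scores higher than a random control
    (ties count as one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical posterior SNP weights and four individuals' dosages.
weights = {"snp1": 0.20, "snp2": -0.10, "snp3": 0.05}
cohort = [
    {"snp1": 2, "snp2": 0, "snp3": 1},
    {"snp1": 1, "snp2": 1, "snp3": 2},
    {"snp1": 0, "snp2": 2, "snp3": 0},
    {"snp1": 0, "snp2": 1, "snp3": 1},
]
labels = [1, 1, 0, 0]  # 1 = case, 0 = control

raw = [polygenic_score(person, weights) for person in cohort]
print(auroc(standardize(raw), labels))
```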
Phenome-wide association study
The pleiotropic effects of genetic risk arising from common variants were tested by performing a phenome-wide association study (PheWAS) of PGS in the UKB. ICD-9 and ICD-10 codes from death records and hospital admission episodes were translated to Phecodes (Phecode Map 1.2)75. For binary phenotypes with at least 20 cases, PGS–phenotype association was tested using logistic regression adjusted for age, age2, sex and the first ten genetic PCs. Sensitivity analyses adjusting for DCM or heart failure and hypertension status in the regression model were performed to identify independent effects. The significance threshold was adjusted for the total number of phenotypes tested (P < 2.72 × 10−5), and data were presented using Manhattan plots, grouped by body system. PheWAS were performed using PheWAS v.2018-03-12 (ref. 76) in R v.4.0.3.
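The relationship between the reported threshold and the number of phenotypes can be sketched as a Bonferroni calculation. Two points are assumptions rather than statements from the text: that the correction was Bonferroni with a family-wise α of 0.05, and the resulting count of roughly 1,838 phenotypes, which is back-calculated from the reported threshold.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05                   # assumed family-wise error rate (not stated in the text)
REPORTED_THRESHOLD = 2.72e-5   # threshold reported in the text

# Number of tests implied by the reported threshold under the assumptions above.
implied_tests = round(ALPHA / REPORTED_THRESHOLD)
print(implied_tests)
print(f"{bonferroni_threshold(ALPHA, implied_tests):.3g}")
```

Because the reported threshold is rounded, any count from about 1,835 to 1,841 phenotypes would give the same three-significant-figure value.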
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-024-01952-y.
Acknowledgements
We acknowledge contributions from the 100,000 Genomes Project, COVIDsortium, DBDS Genomic Consortium, Estonian Biobank and HERMES Consortium. This work was supported by funding from the British Heart Foundation (RE/18/4/34215, FS/IPBSRF/22/27059, FS/15/81/31817, FS/ICRF/21/26019, RG/19/6/34387, BC/F/21/220106, FS/18/65/34186, SP/19/1/34461, SP/17/11/32885, CH/P/23/80008, RE/24/130023), the Medical Research Council (MC_UP_1605/13), Wellcome Trust (107469/Z/15/Z); National Institute for Health Research (NIHR) Imperial College Biomedical Research Centre, NIHR Royal Brompton Cardiovascular Biomedical Research Unit, Sir Jules Thorn Charitable Trust (21JTA), National Heart and Lung Foundation, Royston Centre for Cardiomyopathy Research, Rosetrees Trust, GenMED LABEX, UCL British Heart Foundation Research Accelerator and NIHR University College London Biomedical Research Centre. This research was conducted in part using the UKB resource under application numbers 9922, 15422, 18545, 40616 and 47602 and was made possible through access to data in the National Genomic Research Library, which is managed by Genomics England Limited (a wholly owned company of the Department of Health and Social Care). The National Genomic Research Library holds data provided by patients and collected by the NHS as part of their care and data collected as part of their participation in research. The National Genomic Research Library is funded by the NIHR and NHS England; the Wellcome Trust, Cancer Research UK and the Medical Research Council have also funded research infrastructure. Individual study acknowledgements are reported in Supplementary Information 6. The views expressed in this work are those of the authors and not necessarily those of the funders. For the purpose of open access, the authors have applied a Creative Commons Attribution (CC BY) license to any author accepted manuscript version arising from this submission.
Author contributions
S.L.Z. and A.H. conceived, designed and performed the experiments, performed statistical analysis, analyzed the data, wrote the paper with input from all authors and contributed equally to this work. J.S.W. and R.T.L. conceived and designed the experiments, contributed data, wrote the paper and jointly supervised this work. M.L., K.M., X.X. and C.F. performed statistical analysis and analyzed the data. D.C., D.M., I.B., H.I., A.d.M., P.I., R.B., D.S., E.A., L.J.A., K.G.A., J.A., J.B., A.J.B., P.J.R.B., K.J.B., E.B., J.B., S.B., H.B., D.J.C., P.C., J.P.C., S.A.C., S.D., J.-F.D., A.D., P.E., T.E., C.E., E.H.F.-F., C.F., S.G., J.G., V.G., D.G., C.M.H., B.P.H., A.H., H. Hemingway, H.L.H., L.L., C.M.L., B.D.L., K.M., I.R.M., M.P.M., A.D.M., A.P.M., L.M., C.M., J.C.M., M. Noursadeghi, A.T.O., S.R.O., C.N.A.P., A.P., S.K.P., O.B.P., A.A.R., A.S., D.T.S., S.S., K.S., G.S., P.S., M.L.-T., U.T., T.A.T., M.T.-L., G.T., U.T., V.T., D.-A.T., H.U., A.M.V., J.v.S., M.v.V., A.V., M.V., E.V., COVIDsortium, DBDS Consortium, HERMES Consortium and Genomics England Research Consortium contributed data. T.P.C., M.-P.D., M.D., P.T.E., A.D.H., C.C.L., N.J.S., S.H.S., J.G.S., R.S.V., D.P.O.’R., H. Holm, M. Noseda and Q.S.W. conceived and designed experiments and contributed data.
Peer review
Peer review information
Nature Genetics thanks Shoa Clarke and Guillaume Paré for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Data from UKB can be requested from the UKB Access Management System (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access). Data from the 100,000 Genomes Project can be accessed following an application to join the Genomics England Clinical Interpretation Partnership (https://www.genomicsengland.co.uk/research/academic/join-research-network). The ClinGen (https://www.clinicalgenome.org) and GenCC (https://search.thegencc.org) databases can be directly accessed. GWAS summary statistics are available on the Cardiovascular Disease Knowledge Portal (https://cvd.hugeamp.org/dinspector.html?dataset=Zheng2024_DCM_EU). Regional association plots for all 80 risk loci are available online (https://hermes-dcm-locus.netlify.app). The PGS are available for download at the Polygenic Score Catalog (https://www.pgscatalog.org/) under accession IDs PGS004861 and PGS004862. The raw single-nucleus gene expression dataset is available for download from the European Phenome-Genome Archive (dataset ID EGAD00001009292).
Code availability
Custom analysis code to perform the main GWAS analyses is available via Zenodo at 10.5281/zenodo.11204854 (ref. 77). Additional analyses were performed using publicly available software as described in the Methods section.
Competing interests
S.L.Z. has acted as a consultant for Health Lumen. A.H. and R.T.L. have received funding from Pfizer Inc. R.T.L. has performed paid consultancy for Health Lumen and Fitfile Ltd. J.S.W. has acted as a consultant for MyoKardia, Pfizer, Foresite Labs and Health Lumen and received institutional support from Bristol Myers Squibb and Pfizer Inc. P.C. has received personal fees for consultancies, outside the present work, for Amicus, Pfizer Inc., Owkin and Bristol Myers Squibb. M.-P.D. declares holding equity in Dalcor Pharmaceuticals, unrelated to this work. The authors who are affiliated with deCODE genetics/Amgen Inc. and Regeneron Pharmaceuticals declare competing financial interests as employees. The other authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Sean L. Zheng, Albert Henry.
These authors jointly supervised this work: James S. Ware, R. Thomas Lumbers.
A full list of members and their affiliations appears in the Supplementary Information.
Contributor Information
James S. Ware, Email: j.ware@imperial.ac.uk
R. Thomas Lumbers, Email: t.lumbers@ucl.ac.uk
COVIDsortium:
Charlotte Manisty, James C. Moon, Thomas A. Treibel, Mahdad Noursadeghi, and Aroon D. Hingorani
DBDS Genomic Consortium:
Søren Brunak, Christian Erikstrup, Daniel F. Guðbjartsson, Ole B. V. Pedersen, Kari Stefansson, Unnur Thorsteinsdottir, and Henrik Ullum
Estonian Biobank Research Team:
Erik Abner and Tõnu Esko
HERMES Consortium:
Sean L. Zheng, Albert Henry, Douglas Cannie, Michael Lee, David Miller, Kathryn A. McGurk, Isabelle Bond, Xiao Xu, Hanane Issa, Catherine Francis, Pantazis I. Theotokis, Rachel J. Buchan, Doug Speed, Erik Abner, Lance Adams, Krishna G. Aragam, Johan Ärnlöv, Joshua D. Backman, John Baksi, Paul J. R. Barton, Kiran J. Biddinger, Eric Boersma, Jeffrey Brandimarto, David J. Carey, Philippe Charron, James P. Cook, Stuart A. Cook, Spiros Denaxas, Alexander S. Doney, Perry Elliott, Tõnu Esko, Eric H. Farber-Eger, Chris Finan, Jonas Ghouse, Vilmantas Giedraitis, Daniel F. Guðbjartsson, Christopher M. Haggerty, Brian P. Halliday, Anna Helgadottir, Harry Hemingway, Hans L. Hillege, Isabella Kardys, Lars Lind, Cecilia M. Lindgren, Brandon D. Lowery, Kenneth B. Margulies, Ify R. Mordi, Michael P. Morley, Andrew D. Morris, Anjali T. Owens, Antonis Pantazis, Sanjay K. Prasad, Diane T. Smelser, Garðar Sveinbjörnsson, Petros Syrris, Mari-Liis Tammesoo, Upasana Tayal, Maris Teder-Laving, Vinicius Tragante, Yifan Yang, Kari Stefansson, Unnur Thorsteinsdottir, Folkert W. Asselbergs, Antonio De Marvao, Marie-Pierre Dube, Michael E. Dunn, Patrick T. Ellinor, Sophie Garnier, Chim C. Lang, Andrew P. Morris, Lori Morton, Colin N. A. Palmer, Nilesh J. Samani, Svati H. Shah, Akshay Shekhar, J. Gustav Smith, Sundarajan Srinivasan, Guðmundur Thorgeirsson, Ramachandran S. Vasan, Jessica van Setten, Marion van Vugt, Abirami Veluchamy, W. M. Monique Verschuuren, Eric Villard, Quinn Wells, Thomas P. Cappola, Aroon D. Hingorani, Declan P. O’Regan, Hilma Holm, Michela Noseda, James S. Ware, and R. Thomas Lumbers
Extended data is available for this paper at 10.1038/s41588-024-01952-y.
Supplementary information
The online version contains supplementary material available at 10.1038/s41588-024-01952-y.
References
- 1. Pinto, Y. M. et al. Proposal for a revised definition of dilated cardiomyopathy, hypokinetic non-dilated cardiomyopathy, and its implications for clinical practice: a position statement of the ESC working group on myocardial and pericardial diseases. Eur. Heart J. 37, 1850–1858 (2016).
- 2. Arbelo, E. et al. 2023 ESC Guidelines for the management of cardiomyopathies. Eur. Heart J. 44, 3503–3626 (2023).
- 3. Seferović, P. M. et al. Heart failure in cardiomyopathies: a position paper from the Heart Failure Association of the European Society of Cardiology. Eur. J. Heart Fail. 21, 553–576 (2019).
- 4. Pirruccello, J. P. et al. Analysis of cardiac magnetic resonance imaging in 36,000 individuals yields genetic insights into dilated cardiomyopathy. Nat. Commun. 11, 2254 (2020).
- 5. Lumbers, R. T. et al. The genomics of heart failure: design and rationale of the HERMES consortium. ESC Heart Fail. 8, 5531–5541 (2021).
- 6. Hershberger, R. E., Hedges, D. J. & Morales, A. Dilated cardiomyopathy: the complexity of a diverse genetic architecture. Nat. Rev. Cardiol. 10, 531–547 (2013).
- 7. Speed, D. & Balding, D. J. SumHer better estimates the SNP heritability of complex traits from summary statistics. Nat. Genet. 51, 277–284 (2019).
- 8. Tadros, R. et al. Large scale genome-wide association analyses identify novel genetic loci and mechanisms in hypertrophic cardiomyopathy. Preprint at medRxiv www.medrxiv.org/content/10.1101/2023.01.28.23285147 (2023).
- 9. Tadros, R. et al. Shared genetic pathways contribute to risk of hypertrophic and dilated cardiomyopathies with opposite directions of effect. Nat. Genet. 53, 128–134 (2021).
- 10. Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).
- 11. Zheng, S. L. et al. Evaluation of polygenic score for hypertrophic cardiomyopathy in the general population and across clinical settings. Preprint at medRxiv www.medrxiv.org/content/10.1101/2023.03.14.23286621 (2023).
- 12. Tao, Y. et al. Pitx2, an atrial fibrillation predisposition gene, directly regulates ion transport and intercalated disc genes. Circ. Cardiovasc. Genet. 7, 23–32 (2014).
- 13. Betrie, A. H. et al. Evidence of a cardiovascular function for microtubule-associated protein tau. J. Alzheimers Dis. 56, 849–860 (2017).
- 14. England, J. & Loughna, S. Heavy and light roles: myosin in the morphogenesis of the heart. Cell. Mol. Life Sci. 70, 1221–1239 (2013).
- 15. Meurs, K. M. et al. Association of dilated cardiomyopathy with the striatin mutation genotype in boxer dogs. J. Vet. Intern. Med. 27, 1437–1440 (2013).
- 16. Dawson, J. C., Bruche, S., Spence, H. J., Braga, V. M. & Machesky, L. M. Mtss1 promotes cell-cell junction assembly and stability through the small GTPase Rac1. PLoS ONE 7, e31141 (2012).
- 17. Huang, X., Qu, R., Ouyang, J., Zhong, S. & Dai, J. An overview of the cytoskeleton-associated role of PDLIM5. Front. Physiol. 11, 975 (2020).
- 18. Cheng, H. et al. Loss of enigma homolog protein results in dilated cardiomyopathy. Circ. Res. 107, 348–356 (2010).
- 19. Luo, W. et al. TMEM182 interacts with integrin beta 1 and regulates myoblast differentiation and muscle regeneration. J. Cachexia Sarcopenia Muscle 12, 1704–1723 (2021).
- 20. Villar, A. V. et al. BAMBI (BMP and activin membrane-bound inhibitor) protects the murine heart from pressure-overload biomechanical stress by restraining TGF-β signaling. Biochim. Biophys. Acta 1832, 323–335 (2013).
- 21. Bhandary, B. et al. Cardiac fibrosis in proteotoxic cardiac disease is dependent upon myofibroblast TGF‐β signaling. J. Am. Heart Assoc. 7, e010013 (2018).
- 22. Al-Yacoub, N. et al. Mutation in FBXO32 causes dilated cardiomyopathy through up-regulation of ER-stress mediated apoptosis. Commun. Biol. 4, 884 (2021).
- 23. Shimokawa, H., Sunamura, S. & Satoh, K. RhoA/Rho-kinase in the cardiovascular system. Circ. Res. 118, 352–366 (2016).
- 24. McGurk, K. A., Kasapi, M. & Ware, J. S. Effect of taurine administration on symptoms, severity, or clinical outcome of dilated cardiomyopathy and heart failure in humans: a systematic review. Wellcome Open Res. 7, 9 (2022).
- 25. Veerman, C. C. et al. The Brugada syndrome susceptibility gene HEY2 modulates cardiac transmural ion channel patterning and electrical heterogeneity. Circ. Res. 121, 537–548 (2017).
- 26. Niihori, T. et al. Germline-activating RRAS2 mutations cause Noonan syndrome. Am. J. Hum. Genet. 104, 1233–1240 (2019).
- 27. Capri, Y. et al. Activating mutations of RRAS2 are a rare cause of Noonan syndrome. Am. J. Hum. Genet. 104, 1223–1232 (2019).
- 28. Connally, N. J. et al. The missing link between genetic association and regulatory function. eLife 11, e74970 (2022).
- 29. Josephs, K. S. et al. Beyond gene-disease validity: capturing structured data on inheritance, allelic-requirement, disease-relevant variant classes, and disease mechanism for inherited cardiac conditions. Genome Med. 15, 86 (2023).
- 30. Jordan, E. et al. Evidence-based assessment of genes in dilated cardiomyopathy. Circulation 144, 7–19 (2021).
- 31. Yuan, L. et al. RNF207 exacerbates pathological cardiac hypertrophy via post-translational modification of TAB1. Cardiovasc. Res. 119, 183–194 (2023).
- 32. Niskanen, J. E. et al. Identification of novel genetic risk factors of dilated cardiomyopathy: from canine to human. Genome Med. 15, 73 (2023).
- 33. Parvatiyar, M. S. et al. Stabilization of the cardiac sarcolemma by sarcospan rescues DMD-associated cardiomyopathy. JCI Insight 4, e123855 (2019).
- 34. Parvatiyar, M. S. et al. Sarcospan regulates cardiac isoproterenol response and prevents Duchenne muscular dystrophy-associated cardiomyopathy. J. Am. Heart Assoc. 4, e002481 (2015).
- 35. Jagadeesh, K. A. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat. Genet. 54, 1479–1492 (2022).
- 36. Reichart, D. et al. Pathogenic variants damage cell composition and single cell transcription in cardiomyopathies. Science 377, eabo1984 (2022).
- 37. Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
- 38. Nicin, L. et al. A human cell atlas of the pressure-induced hypertrophic heart. Nat. Cardiovasc. Res. 1, 174–185 (2022).
- 39. Shah, R. A. et al. Frequency, penetrance, and variable expressivity of dilated cardiomyopathy-associated putative pathogenic gene variants in UK Biobank participants. Circulation 146, 110–124 (2022).
- 40. Garnier, S. et al. Genome-wide association analysis in dilated cardiomyopathy reveals two new players in systolic heart failure on chromosomes 3p25.1 and 22q11.23. Eur. Heart J. 42, 2000–2011 (2021).
- 41. Gazal, S., Marquez-Luna, C., Finucane, H. K. & Price, A. L. Reconciling S-LDSC and LDAK functional enrichment estimates. Nat. Genet. 51, 1202–1204 (2019).
- 42. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
- 43. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
- 44. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
- 45. Ojavee, S. E., Kutalik, Z. & Robinson, M. R. Liability-scale heritability estimation for biobank studies of low-prevalence disease. Am. J. Hum. Genet. 109, 2009–2017 (2022).
- 46. Grotzinger, A. D., Fuente, J., Privé, F., Nivard, M. G. & Tucker-Drob, E. M. Pervasive downward bias in estimates of liability-scale heritability in genome-wide association study meta-analysis: a simple solution. Biol. Psychiatry 93, 29–36 (2023).
- 47. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
- 48. Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
- 49. Wang, G., Sarkar, A. K., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. Series B Stat. Methodol. 82, 1273–1300 (2020).
- 50. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
- 51. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
- 52. Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).
- 53. Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023).
- 54. Ochoa, D. et al. The next-generation Open Targets Platform: reimagined, redesigned, rebuilt. Nucleic Acids Res. 51, D1353–D1359 (2022).
- 55. Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
- 56.Aguet, F. et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science369, 1318–1330 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet.51, 1664–1669 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Consortium, E. P. An integrated encyclopedia of DNA elements in the human genome. Nature489, 57 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun.9, 224 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Miyazawa, K. et al. Cross-ancestry genome-wide analysis of atrial fibrillation unveils disease biology and enables cardioembolic risk prediction. Nat. Genet.55, 187–197 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Evangelou, E. et al. Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nat. Genet.50, 1412–1425 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet.54, 1803–1815 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Virani, S. S. et al. Heart disease and stroke statistics–2021 update: a report from the American Heart Association. Circulation143, e254–e743 (2021). [DOI] [PubMed] [Google Scholar]
- 64.Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature518, 317–330 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet.50, 621–629 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature473, 43–49 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature593, 238–243 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet.47, 1228–1235 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Gazal, S. et al. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet.49, 1421–1427 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics26, 139–140 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res.47, W191–W198 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun.10, 1776 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet.81, 559–575 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Wray, N. R. et al. Pitfalls of predicting complex traits from SNPs. Nat. Rev. Genet.14, 507–515 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Wei, W.-Q. et al. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS ONE12, e0175508 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Carroll, R. J., Bastarache, L. & Denny, J. C. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics30, 2375–2376 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Henry, A. ihi-comp-med/hermes2-gwas: manuscript release. Zenodo10.5281/zenodo.11204854 (2024).
Associated Data
Data Availability Statement
Data from UKB can be requested from the UKB Access Management System (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access). Data from the 100,000 Genomes Project can be accessed following an application to join the Genomics England Clinical Interpretation Partnership (https://www.genomicsengland.co.uk/research/academic/join-research-network). The ClinGen (https://www.clinicalgenome.org) and GenCC (https://search.thegencc.org) databases can be accessed directly. GWAS summary statistics are available on the Cardiovascular Disease Knowledge Portal (https://cvd.hugeamp.org/dinspector.html?dataset=Zheng2024_DCM_EU). Regional association plots for all 80 risk loci are available online (https://hermes-dcm-locus.netlify.app). The PGS are available for download from the Polygenic Score Catalog (https://www.pgscatalog.org/) under accession IDs PGS004861 and PGS004862. The raw single-nucleus gene expression dataset is available for download from the European Genome-phenome Archive (dataset ID EGAD00001009292).
Custom analysis code to perform the main GWAS analyses is available via Zenodo at https://doi.org/10.5281/zenodo.11204854 (ref. 77). Additional analyses were performed using publicly available software as described in the Methods section.