Summary
Atrial fibrillation (AF) is the most common arrhythmia in the world. Human genetics can provide strong AF therapeutic candidates, but the identification of the causal genes and their functions remains challenging. Here, we applied an AF fine-mapping strategy that leverages results from a previously published cross-ancestry genome-wide association study (GWAS), expression quantitative trait loci (eQTLs) from left atrial appendages (LAAs) obtained from two cohorts with distinct ancestry, and a paired RNA sequencing (RNA-seq) and ATAC sequencing (ATAC-seq) LAA single-nucleus assay (sn-multiome). At nine AF loci, our co-localization and fine-mapping analyses implicated 14 genes. Data integration identified several candidate causal AF variants, including rs7612445 at GNB4 and rs242557 at MAPT. Finally, we showed that the repression of the strongest AF-associated eQTL gene, LINC01629, in human embryonic stem cell-derived cardiomyocytes using CRISPR inhibition results in the dysregulation of pathways linked to genes involved in the development of atrial tissue and the cardiac conduction system.
Subject areas: Techniques in genetics, quantitative genetics, expression study, genomic analysis, association analysis, Transcriptomics
Highlights
• Bulk and single-nucleus RNA-seq data from human atria help interpret AF GWAS results
• Co-localization and fine-mapping implicate 14 genes at 9 AF GWAS loci
• LINC01629 is involved in the development of atrial tissue and the conduction system
Introduction
Atrial fibrillation (AF) is the most common cardiac arrhythmia, significantly impacting health outcomes. It increases the risk of death by 1.5–3.5 times and the risk of stroke by ∼5 times.1 In the United States, AF affects a sizable portion of the population: 1 in 3 European Americans and 1 in 5 African Americans are projected to develop AF during their lifetime.2 AF onset is strongly associated with age, increasing rapidly after age 50: below the age of 50, prevalence is less than 0.5%, whereas by age 80, it exceeds 10%.3 Given the current aging global population trend, the prevalence of AF is projected to more than double from 2010 to 2030.4
Substantial advances have been made in our understanding of AF in the last two decades, leading to innovations such as catheter ablation, and improvements in prevention and stroke management. However, current therapies have important limitations. Invasive treatments such as catheter ablation have significant risks, with a complication rate of 4–14%.1 Post-ablation AF recurrence is also common, with a 2-year recurrence rate of 44% in paroxysmal patients5 and a 1-year recurrence rate of 70% for persistent patients.1 As a result, repeated procedures are often required. Additionally, pharmacological treatments for this disease remain largely ineffective, failing to reduce onset or progression in up to 85% of patients.6 A better understanding of the mechanisms of AF is imperative to improve prediction, prevention and the development of new, more specific pharmacological therapies.
Rare mutations in 50 genes have been reported in familial AF.6 These mutations predominantly occur in ion channels such as HCN4 and the potassium channels (KCN) group, but also in cardiac transcription factors such as NKX2-5, PITX2, and TBX5, and cytoskeletal proteins such as TTN. AF is also a complex disease with an important genetic component (heritability ∼22%).7 Recent large-scale genome-wide association studies (GWAS) have identified 150 loci associated with AF.8,9,10 However, most of these genetic associations have not yet been functionally characterized. Expression quantitative trait locus (eQTL) analysis can provide a mechanistic interpretation for AF-associated single nucleotide polymorphisms (SNPs) beyond the closest-gene approach, and such a strategy has already been employed to characterize AF GWAS loci.9,11,12 With the advent of single-nucleus transposase-accessible chromatin with sequencing (snATAC-seq), investigators have shown a strong enrichment of AF-associated SNPs in cardiomyocyte-specific open chromatin regions.13,14,15 Thus, while other cell-types play a role in AF (e.g., fibroblasts16 and macrophages17), it is expected that most AF-associated SNPs mediate their effect through cardiomyocyte-specific non-coding regulatory sequences. Yet, linkage disequilibrium (LD) remains a barrier to the identification of causal variants. This is exacerbated by the strong European-ancestry bias observed in large GWAS and eQTL cohorts.8,18 Furthermore, studies have revealed that most genes (95% protein-coding and 67% long non-coding RNA [lncRNA]) have one or more eQTLs.18 Therefore, more information is generally required to accurately fine-map GWAS associations.
To fine-map AF GWAS signals and identify causal genes and variants, we performed a cross-ancestry eQTL study using left atrial appendages (LAAs) from AF patients and controls in normal sinus rhythm (SR). We combined GWAS-eQTL co-localization and Bayesian fine-mapping analyses to prioritize candidate causal genes and variants, and leveraged a new LAA sn-multiome dataset (snATAC-seq + snRNA-seq) to link regulatory sequences and AF genes. Finally, we performed a CRISPR inhibition (CRISPRi) experiment in human embryonic stem cell-derived cardiomyocytes (hESC-CMs) to explore how the most strongly associated lncRNA gene implicated by our eQTL study can modulate AF risk.
Results
AF-associated cis-eQTLs are concordant across European and East Asian ancestries
To prioritize causal AF variants and genes using a cross-ancestry approach, we profiled two cohorts of participants with or without persistent AF that were recruited on two different continents. We genotyped participants from the cardiothoracic surgical trials network (CTSN, N = 84), a cohort of patients recruited in North America, and a cohort recruited at the University of Harbin in China (Harbin, N = 67). We imputed genotypes using TOPMed reference haplotypes and obtained 17,649,215 and 10,537,217 variants in the CTSN and Harbin cohorts, respectively. This difference in the number of high-quality imputed variants is likely due to an over-representation of European haplotypes in the reference imputation panel (STAR Methods), as well as more genetic diversity being present in the CTSN cohort (see the following text), thus enabling the imputation of more non-monomorphic common variants. We projected the genotype data from the CTSN and Harbin cohorts against populations from the 1000 Genomes Project (1000G). As expected, most participants from the CTSN cohort are of European ancestry, although we identified a few participants with other genetically defined ancestries (Figures S1A and S1B). All participants from the Harbin cohort clustered with the Han Chinese in Beijing population from the 1000G dataset (Figures S1A–S1C). For bulk RNA sequencing (RNA-seq) analyses, we obtained LAAs from the same CTSN and Harbin participants. LAA is an ideal tissue to study AF as it is easily accessible during open heart surgery, and a previous study showed that open chromatin sites found in atrial cardiomyocytes capture most of the AF heritability.14 After quality-control, we obtained paired genotype and RNA-seq data for 31 AF and 31 sinus rhythm (SR, controls) individuals in the CTSN cohort (82% European-ancestry), and 28 AF and 37 SR individuals in the Harbin cohort (Table S1).
We focused our cis-eQTL analyses on the 150 sentinel variants recently identified in a cross-ancestry meta-analysis of AF GWAS data (which included 67,864 European and 9,826 Japanese AF cases; and 1,026,594 European and 140,446 Japanese controls).10 We restricted our analyses to genes located within one megabase (Mb) from the sentinel AF variants. AF usually initiates on the left atrial posterior wall, and important transcriptomic differences have been noted between human left and right atria.19 For this reason, and to minimize heterogeneity in our results, we decided to profile the transcriptome of LAAs from participants recruited at both sites in our study (our sn-multiome dataset also only includes LAAs [see the following text]). We found 25 and 17 significant cis-eQTLs (false discovery rate [FDR] <5%) in the CTSN and Harbin cohorts, respectively, of which 11 were significant in both (Figures 1A–1C; Table S2). We found evidence of co-localization between the AF GWAS and eQTL signals (posterior probability [PP] that both AF and gene expression share a single causal variant; H4 ≥ 0.4) at 20 loci (Table 1).
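The cis-window restriction described above can be illustrated with a minimal sketch. It assumes that distance is measured from the variant to the nearest edge of the gene body (the text does not say whether the gene body or the transcription start site was used), and the gene names and gene coordinates are hypothetical placeholders. Only the variant position (rs7612445, chr3:179455191) comes from Table 1.

```python
from dataclasses import dataclass

WINDOW_BP = 1_000_000  # one megabase, as stated in the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int


def distance_to_gene(chrom, pos, gene):
    """Distance in bp from a variant to a gene body (0 if inside).

    Returns None when the variant and the gene are on different chromosomes.
    Measuring to the gene body is an assumption made for this sketch.
    """
    if chrom != gene.chrom:
        return None
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def genes_in_cis_window(chrom, pos, genes, window=WINDOW_BP):
    """Names of genes located less than `window` bp from the variant."""
    selected = []
    for gene in genes:
        d = distance_to_gene(chrom, pos, gene)
        if d is not None and d < window:
            selected.append(gene.name)
    return selected


# Hypothetical genes (placeholder names and coordinates).
genes = [
    Gene("GENE_A", "3", 179_400_000, 179_450_000),  # about 5 kb from the variant
    Gene("GENE_B", "3", 180_600_000, 180_650_000),  # more than 1 Mb away
    Gene("GENE_C", "4", 179_455_000, 179_456_000),  # different chromosome
    Gene("GENE_D", "3", 179_450_000, 179_460_000),  # variant lies inside the gene
]

cis_genes = genes_in_cis_window("3", 179_455_191, genes)
print(cis_genes)
```

Each sentinel variant would be paired only with the genes returned by such a filter before testing for association with expression.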
Figure 1.
Expression quantitative trait locus (eQTL) mapping in human atrial tissue using RNA-sequencing
(A and B) Quantile-quantile (QQ) plots of cis-eQTL –log10 (p-values) for 150 sentinel atrial fibrillation (AF) SNPs and nearby genes (<1 Mb) in the (A) CTSN and (B) Harbin datasets. Dotted lines represent the 95% confidence interval of randomly generated normally distributed p-values. (C-E) Scatterplots comparing the eQTLs betas and their standard deviations from left atrial appendages (CTSN and Harbin) and right atrial appendages (Genotype-Tissue Expression; GTEx).
(C) CTSN vs. Harbin, (D) CTSN vs. GTEx, (E) Harbin vs. GTEx. We attributed the value 0 to the eQTL if it was not tested in that cohort. (C–E) Colors indicate in which dataset(s) the eQTL is significant. eQTLs that were not significant in either of the paired cohorts are not shown for clarity. We added gene labels for the 20 most significant eQTLs in each cohort. Dotted lines represent x = y coordinates.
Table 1.
Expression quantitative trait loci (eQTLs) for atrial fibrillation (AF)-associated variants in the CTSN and Harbin cohorts
rsID | Chr:Pos | REF/ALT | eGene | eQTL CTSN Alt.AF | eQTL CTSN BETA | eQTL CTSN FDR | eQTL Harbin Alt.AF | eQTL Harbin BETA | eQTL Harbin FDR | AF GWAS Nielsen et al. Alt.AF | AF GWAS Nielsen et al. BETA | AF GWAS Nielsen et al. P | AF GWAS BBJ Alt.AF | AF GWAS BBJ BETA | AF GWAS BBJ P |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
rs4970418 | 1:983237 | G/A | PERM1 | 0.15 | 0.40 | 0.0086 | 0.07 | 0.41 | 0.82 | 0.17 | 0.044 | 7.5E-6 | 0.076 | 0.062 | 0.029 |
rs2885697 | 1:41078607 | G/T | AC093151.3^a | 0.64 | −0.60 | 0.050 | 0.74 | −0.78 | 2.5E-9 | 0.65 | −0.044 | 2.9E-10 | 0.69 | −0.005 | 0.75 |
| | | AC093151.8^a | | −0.19 | 0.012 | | −0.37 | 1.5E-5 | | | | | | |
rs4951258 | 1:205722188 | G/A | RAB29 | 0.33 | −0.14 | 0.047 | 0.27 | −0.12 | 0.40 | 0.42 | 0.038 | 2.1E-8 | 0.36 | 0.02 | 0.20 |
rs2540949 | 2:65057097 | A/T | CEP68 | 0.45 | −0.20 | 9.1E-8 | 0.32 | −0.22 | 4.5E-8 | 0.39 | −0.066 | 3.0E-22 | 0.33 | −0.089 | 3.1E-8 |
rs3820888 | 2:200315300 | T/C | SPATS2L | 0.46 | −0.18 | 0.0081 | 0.74 | −0.22 | 2.8E-5 | 0.39 | 0.068 | 5.8E-24 | 0.69 | 0.093 | 1.4E-8 |
rs34080181 | 3:66403767 | G/A | SLC25A26 | 0.38 | 0.09 | 0.44 | 0.19 | 0.17 | 0.027 | 0.38 | −0.045 | 1.3E-10 | 0.22 | −0.062 | 9.1E-4 |
rs1278493 | 3:136095167 | G/A | AC072039.1^a | 0.57 | −0.33 | 6.9E-7 | 0.14 | −0.14 | 1.00 | 0.57 | −0.039 | 8.8E-9 | 0.20 | −0.04 | 0.048 |
rs7612445 | 3:179455191 | G/T | GNB4 | 0.28 | 0.62 | 6.7E-13 | 0.12 | 0.48 | 3.0E-7 | 0.19 | 0.049 | 4.8E-9 | 0.18 | 0.068 | 9.2E-4 |
rs223449 | 4:102791180 | A/T | BDH2 | 0.44 | 0.15 | 0.023 | 0.41 | 0.03 | 1.00 | 0.49 | 0.036 | 7.1E-8 | 0.46 | 0.039 | 9.9E-3 |
rs2012809 | 5:128854670 | A/G | SLC27A6 | 0.82 | −0.58 | 3.1E-8 | 0.93 | −0.46 | 0.70 | 0.79 | 0.058 | 4.9E-10 | 0.92 | 0.039 | 0.16 |
rs3756687 | 5:137866004 | A/G | FAM13B | 0.17 | 0.21 | 9.0E-5 | 0.060 | −0.05 | 1.00 | 0.19 | 0.099 | 1.1E-31 | 0.017 | 0.036 | 0.52 |
rs34969716 | 6:18209878 | G/A | KDM1B | 0.27 | 0.15 | 0.0044 | 0.31 | 0.09 | 0.052 | 0.31 | 0.07 | 1.6E-19 | 0.35 | 0.071 | 2.6E-5 |
rs60212594 | 10:73654586 | G/C | MYOZ1 | 0.15 | 1.75 | 1.6E-7 | 0.19 | 1.03 | 6.2E-5 | 0.14 | −0.118 | 9.2E-35 | 0.16 | −0.065 | 0.0014 |
rs2316443 | 13:113210523 | G/A | F10 | 0.22 | −0.25 | 0.073 | 0.36 | −0.29 | 8.3E-4 | 0.23 | −0.045 | 2.4E-8 | 0.44 | −0.021 | 0.16 |
rs11156751 | 14:32521231 | T/C | AKAP6 | 0.28 | 0.08 | 0.81 | 0.41 | 0.15 | 0.0055 | 0.29 | 0.072 | 6.9E-21 | 0.35 | 0.099 | 7.6E-10 |
rs10873298 | 14:76960182 | C/T | AC007686.1 | 0.53 | −1.77 | 5.0E-15 | 0.69 | −0.90 | 1.7E-6 | 0.63 | −0.04 | 7.1E-9 | 0.66 | −0.05 | 0.0014 |
| | | LINC01629 | | −1.92 | 2.5E-17 | | −1.57 | 1.9E-8 | | | | | | |
rs12908004 | 15:80384583 | A/G | AC016705.2^a | 0.17 | 0.53 | 7.4E-5 | 0.05 | 0.77 | 9.2E-4 | 0.16 | 0.073 | 4.1E-16 | 0.063 | 0.142 | 1.5E-6 |
| | | ARNT2 | | 0.53 | 1.5E-4 | | 0.61 | 0.0093 | | | | | | |
| | | CTXND1 | | 0.82 | 0.0015 | | 1.03 | 0.074 | | | | | | |
rs242557 | 17:45942346 | G/A | MAPT | 0.38 | 0.75 | 1.4E-9 | 0.61 | 0.50 | 7.1E-8 | 0.38 | −0.031 | 1.4E-5 | 0.46 | −0.078 | 2.0E-7 |
| | | MAPT-IT1^a | | 0.37 | 7.8E-6 | | 0.08 | 1.00 | | | | | | |
| | | STH^a | | 0.30 | 3.8E-5 | | 0.10 | 1.00 | | | | | | |
rs6089752 | 20:62557186 | C/T | MIR1-1HG-AS1 | 0.60 | −0.23 | 0.0089 | 0.51 | −0.19 | 0.50 | 0.52 | 0.033 | 2.2E-6 | 0.49 | 0.071 | 4.0E-6 |
rs5754508 | 22:21644940 | C/G | UBE2L3^a | 0.17 | 0.07 | 0.0071 | 0.31 | 0.02 | 1.00 | 0.19 | 0.036 | 1.0E-4 | 0.36 | 0.069 | 1.4E-4 |
We report eQTL results for AF variants that are associated (false discovery rate [FDR] < 0.05) with the expression of nearby genes in left atrial appendages from 62 and 65 participants recruited in the CTSN (mostly European-ancestry) and Harbin (East Asian-ancestry) cohorts, respectively, and that also show evidence of co-localization with the AF GWAS signals (coloc H4 posterior probability > 0.4). To compare allele frequencies and effect sizes (beta) between our eQTL study and genome-wide association study (GWAS) results for AF, we also report AF summary statistics from the study by Nielsen et al. (N = 1,030,836, European-ancestry) and Biobank Japan (BBJ, N = 150,272, East Asian-ancestry). Genomic coordinates are on build hg38. Allele frequencies and effect sizes are for the alternate allele. We define statistical significance as: FDR < 0.05 for the eQTL studies and P < 5E-8 for the AF GWAS.
^a eGenes that are novel cis-eQTLs when compared to GTEx results from right atrial appendages.
While our downstream analyses focus on our LAA eQTL findings, we further confirmed our results against cis-eQTLs from right atrial appendages (RAAs) from the Genotype-Tissue Expression (GTEx) dataset (Table S3).18 Overall, eQTL results were very concordant when comparing these datasets: this strong replication validates our experiment, but also alleviates concerns of false positive genetic associations due to the multi-ancestry component of the CTSN cohort (Figures 1C–1E). We found seven novel AF SNP-eGene pairs that were not identified in GTEx (Table 1), including six that were also absent from a recent AF candidate gene survey.20 We further compared eGenes that showed evidence of co-localization in our study (Table 1) with candidate genes prioritized in seven other AF GWAS-related studies.11,13,14,15,19,21 Of the 26 eGenes that we report in Table 1, 14 have been highlighted previously (Figure S2; Table S4). Multiple reasons can explain why 12 genes were not prioritized before: we used the most recent AF GWAS, we included non-coding genes, we focused on a single tissue, and our eQTL strategy would not pinpoint candidate causal genes with coding variants. To enable future studies, we meta-analyzed cis-eQTL results for the 150 AF sentinel SNPs across the CTSN, Harbin, and GTEx datasets (including 127 LAA and 372 RAA samples): this analysis identified significant cis-eQTL results for an additional 18 AF variants that were not found in any of the datasets taken individually (Table S5). Lastly, in the CTSN and Harbin cohorts, most of the genes implicated by the eQTL studies were not differentially expressed between AF cases and SR controls (Table S6), and we did not find significant AF×SNP interactions for the tested cis-eQTLs, although we acknowledge limited power given our small sample size (Figure S3).
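As a rough illustration of how per-cohort eQTL effects can be combined, the sketch below uses a fixed-effect, inverse-variance-weighted meta-analysis. This section does not state which meta-analysis model was actually used, so the choice of model is an assumption, and the betas and standard errors are hypothetical values rather than results from the CTSN, Harbin, or GTEx datasets.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    Each cohort is weighted by 1 / SE^2. Returns the pooled beta, its
    standard error, the z-score, and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return beta, se, z, p


# Hypothetical per-cohort effect sizes and standard errors for one SNP-gene pair.
betas = [0.6, 0.5, 0.4]
ses = [0.1, 0.1, 0.2]

beta, se, z, p = ivw_meta(betas, ses)
print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

Because precise cohorts receive larger weights, a pooled estimate of this kind can reach significance for a variant whose effect is consistent in direction but under-powered in each dataset taken individually.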
Multi-ancestry fine-mapping of AF-associated loci
Statistical fine-mapping should help prioritize candidate causal variants at the AF-associated loci with evidence of co-localization (Table 1). We derived 95% credible sets using association summary statistics from the AF GWAS and the CTSN/Harbin eQTL studies (Table S7). We further filtered this list and focused on nine AF loci with GWAS-eQTL co-localization and at least one strong candidate causal variant (posterior inclusion probability [PIP] > 0.1) (Tables 2 and S8). These nine loci implicate 14 different genes; we discuss their biology and potential link to AF pathophysiology in Table S9.
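The notion of a 95% credible set and of the PIP > 0.1 filter can be sketched as follows. This is a generic construction that assumes a single causal variant per locus, so that the PIPs at a locus sum to one. The fine-mapping software used in the study is described in the STAR Methods and may differ, and the variant names and PIP values here are hypothetical.

```python
def credible_set(pips, coverage=0.95):
    """Smallest set of variants whose cumulative PIP reaches `coverage`.

    `pips` maps variant IDs to posterior inclusion probabilities at one locus.
    Variants are added in decreasing order of PIP.
    """
    ranked = sorted(pips.items(), key=lambda item: item[1], reverse=True)
    chosen = []
    cumulative = 0.0
    for variant, pip in ranked:
        chosen.append(variant)
        cumulative += pip
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical PIPs for five variants at one locus.
pips = {"var1": 0.60, "var2": 0.25, "var3": 0.08, "var4": 0.04, "var5": 0.03}

cs95 = credible_set(pips)
strong = [v for v in cs95 if pips[v] > 0.1]  # strong candidates, as in the text

print(cs95)
print(strong)
```

In this toy locus the credible set contains four variants, but only two of them pass the PIP > 0.1 filter used to define strong candidate causal variants.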
Table 2.
Functional annotation of AF-associated and eQTL variants prioritized by Bayesian fine-mapping
Sentinel GWAS variant | eGene | Prioritized variant | CHR:POS (hg38) | PIP AF GWAS | PIP CTSN | PIP Harbin | LAA multiome: prioritized variant in ATAC peak | LAA multiome: link between ATAC peak and eGene promoter | CATlas: prioritized variant in cardiomyocyte ATAC peak | EpiMap: prioritized variant in element linked to eGene in heart | ENCODE |
---|---|---|---|---|---|---|---|---|---|---|---|
rs4970418 | PERM1 | rs74045046 | 1:976536 | 0.141 | 0.282 | ND | enhD | ||||
rs56028034 | 1:981282 | 0.1 | 0.202 | ND | |||||||
rs7612445 | GNB4 | rs7612445 | 3:179455191 | 0.422 | 0.5 | 0.333 | Yes | Yes | Yes | Yes | |
rs7634416 | 3:179455436 | 0.348 | 0.5 | 0.333 | |||||||
rs3756687 | FAM13B | rs3756687 | 5:137866004 | 0.791 | 0.333 | ND | |||||
rs7722600 | 5:137859073 | 0.105 | 0.333 | ND | Yes | enhD | |||||
rs34969716 | KDM1B | rs34969716 | 6:18209878 | 0.999 | 0.137 | ND | Yes | Yes | Yes | enhD | |
rs11156751 | AKAP6 | rs7140396 | 14:32514611 | 0.283 | ND | 0.307 | |||||
rs10873298 | AC007686.1 | rs12889775 | 14:76959734 | 0.134 | 0.25 | NC | |||||
LINC01629 | rs12889775 | 14:76959734 | 0.134 | 0.25 | NC | ||||||
AC007686.1 | rs10873298 | 14:76960182 | 0.264 | 0.25 | NC | enhP | |||||
LINC01629 | rs10873298 | 14:76960182 | 0.264 | 0.25 | NC | enhP | |||||
AC007686.1 | rs10873299 | 14:76960368 | 0.21 | 0.25 | NC | enhP | |||||
LINC01629 | rs10873299 | 14:76960368 | 0.21 | 0.25 | NC | enhP | |||||
AC007686.1 | rs8181996 | 14:77427469 | 0.264 | 0.25 | NC | ||||||
LINC01629 | rs8181996 | 14:77427469 | 0.264 | 0.25 | NC | ||||||
rs12908004 | AC016705.2 | rs12908004 | 15:80384583 | 1 | 0.989 | 0.792 | DNase and H3K4me3 mark | ||||
ARNT2 | rs12908004 | 15:80384583 | 1 | 0.984 | 0.887 | DNase and H3K4me3 mark | |||||
CTXND1 | rs12908004 | 15:80384583 | 1 | 0.222 | ND | DNase and H3K4me3 mark | |||||
rs242557 | MAPT | rs242557 | 17:45942346 | 0.988 | 1 | 1 | Yes | Yes | Yes | Yes | enhD |
MAPT-IT1 | rs242557 | 17:45942346 | 0.988 | 0.997 | ND | Yes | Yes | Yes | enhD | ||
STH | rs242557 | 17:45942346 | 0.988 | 0.991 | ND | Yes | Yes | Yes | enhD | ||
rs6089752 | MIR1-1HG-AS1 | rs6089753 | 20:62556900 | 0.139 | 0.393 | ND |
For variants with posterior inclusion probability (PIP) > 0.1 in the AF GWAS and at least one of the eQTL studies, we retrieved functional annotations from our single-nucleus multiome RNA-sequencing and ATAC-sequencing experiment, the cis-element ATLAS (CATlas), EpiMap, and ENCODE. ND, not determined because the association was not significant in this dataset; NC, not in the 95% credible set; enhD, distal enhancer; enhP, proximal enhancer.
Because the eQTL studies are small in comparison to the sample size of the AF GWAS, the 95% credible sets for the eQTL signals were often larger than the 95% credible sets for the AF signals (Tables S7 and S8). However, since there is evidence of co-localization and the two eQTL studies have different main ancestries (and thus likely different linkage disequilibrium patterns), we reasoned that intersecting the 95% credible sets would enrich for candidate causal AF variants. Importantly, because genotype imputation quality could vary between datasets and impact fine-mapping, we elected to use a comprehensive strategy and consider as potentially causal variants those that are included in the AF GWAS 95% credible sets and at least one of the two LAA eQTL 95% credible sets (CTSN and/or Harbin). Using this approach, the number of candidate causal AF variants ranged from one to 178, including three loci with only one candidate variant (KDM1B, ARNT2/CTXND1/AC016705.2, and MAPT/STH/MAPT-IT1) (Tables 2, S7, and S8).
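The selection rule described above (in the AF GWAS credible set and in at least one of the two eQTL credible sets) reduces to simple set operations. The sketch below contrasts it with a stricter rule that requires all three sets. The variant identifiers are hypothetical placeholders, not members of the actual credible sets.

```python
def candidate_causal_variants(gwas_cs, ctsn_cs, harbin_cs):
    """Variants in the GWAS credible set AND in the CTSN and/or Harbin set."""
    return set(gwas_cs) & (set(ctsn_cs) | set(harbin_cs))


def strict_candidates(gwas_cs, ctsn_cs, harbin_cs):
    """Stricter alternative: variants present in all three credible sets."""
    return set(gwas_cs) & set(ctsn_cs) & set(harbin_cs)


# Hypothetical 95% credible sets at one locus.
gwas_cs = {"varA", "varB", "varC"}
ctsn_cs = {"varA", "varB", "varD", "varE"}
harbin_cs = {"varA", "varC", "varF"}

comprehensive = candidate_causal_variants(gwas_cs, ctsn_cs, harbin_cs)
strict = strict_candidates(gwas_cs, ctsn_cs, harbin_cs)

print(sorted(comprehensive))
print(sorted(strict))
```

The comprehensive rule keeps a variant that is missing from one eQTL credible set, for example because it was poorly imputed in that cohort, whereas the strict rule would discard it.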
Variant-to-gene (V2G) prioritization using single-nucleus multiomic data
To gain insights into the regulatory mechanisms by which AF-associated variants modulate disease risk, we integrated the fine-mapped AF variants described above with our sn-multiome dataset (paired ATAC and RNA-seq in the same nuclei) generated from LAAs obtained from three AF and four SR human donors (STAR Methods). For completeness, we also queried publicly available data from the cis-element Atlas (CATlas, a large compendium of snATAC-seq data),14 EpiMap,22 and ENCODE.23 For this annotation, we focused on variants with fine-mapping PIP >0.1 in the AF GWAS and at least one of the two eQTL studies; this represented 15 variants at nine loci (Tables 2 and S10). Adding annotations from a promoter capture Hi-C map generated in induced pluripotent stem cell-derived cardiomyocytes did not yield additional insights, possibly because many eQTLs are not captured by this system.24 Ten of these 15 AF-associated variants had been prioritized as candidate causal variants in at least one previously published AF study (Figure S4; Table S11).11,13,14,15,19,25 For all these loci, the graphical representation of the association results and functional annotations is in Figures S5–S14. Below, we emphasize two compelling examples of V2G prioritization for AF.
GNB4, which encodes G protein subunit beta 4, is one of the strongest cis-eQTLs that we detected, and it co-localizes with the AF GWAS signal in both the CTSN and Harbin cohorts. Intersection of the GWAS and eQTL fine-mapped results at this locus prioritized two variants, rs7612445 and rs7634416, with PIP >0.1 (Figure 2A). In our LAA snRNA-seq data, GNB4 is expressed in most cell types, with high expression in pericytes and endothelial, endocardial and myeloid cells, but relatively low levels in cardiomyocytes (Figure 2B). However, one of the fine-mapped variants, rs7612445, overlaps with a cardiomyocyte-specific ATAC-seq peak (Figure 2C), and the ATAC-seq signal at this peak is correlated with GNB4 expression in cardiomyocytes (Spearman’s rho = 0.28, p = 5.7 × 10^−9) (Figure 2D).26 The same variant is also prioritized in the CATlas and EpiMap databases (Table 2).
Figure 2.
Fine-mapping and annotation of the GNB4 locus
(A) The top panels show –log10 (p-values) from the cross-ancestry atrial fibrillation (AF) GWAS (y axis) against genomic coordinates (x axis, hg38) for a 500 kb window centered on the sentinel AF SNP. SNPs are colored based on the 1000 Genomes Project linkage disequilibrium (LD) r² with the lead SNP in the European (left) and East Asian (right) super-populations. In the bottom panels, we report the eQTL –log10 (p-values) in the CTSN and Harbin cohorts for GNB4 expression in left atrial appendages (LAAs).
(B) Uniform manifold approximation and projection (UMAP) of LAA single-nucleus multiome cell-types (left) and GNB4 expression density (right).
(C) LAA single-nucleus multiome genomic context of the prioritized AF-associated variants. From top to bottom: The first track shows ATAC-seq fragment coverage at the GNB4 locus (left) paired with violin plots of GNB4 expression aggregated by cell-type (right). The second track shows ATAC-seq peaks. The third track shows the gene annotation (exon, intron) for the genes found at the locus. The fourth and fifth tracks highlight links between ATAC-seq peaks and gene promoters identified in either all cell-types (Links) or specifically in cardiomyocytes (Links CM). We used Pearson correlation tests between ATAC-seq peak accessibility and gene expression (in the same nucleus) to derive links. Only links with |Pearson R| > 0.2 are shown. Link heights are proportional to their |Pearson R| in the range indicated in the legend. In the final track, we report AF GWAS and eQTL fine-mapping posterior inclusion probabilities (PIP).
(D) Scatterplots of the chr3:179454910-179455284 peak accessibility against GNB4 expression colored by the genotype of the prioritized variant (rs7612445) in the six genotyped individuals of our single-nucleus multiome LAA data (one GT and five GG). Spearman’s rho p values were 0.0013 and 5.7 × 10^−9 when including all cells or only CM, respectively.
(E) ATAC-seq read coverage of the chr3:179454910-179455284 peak (across all cell-types) aggregated by genotype showing greater accessibility for the individual with the T-allele. The dotted lines show the positions of rs7612445 (left) and rs7634416 (right).
(F) rs7612445-GNB4 eQTL boxplots in the CTSN and Harbin bulk RNA-seq cohorts. Adipo, adipocytes; CM, cardiomyocytes; EC, endothelial cells; FB, fibroblasts; PC, pericytes; SMC, smooth muscle cells; POS, chromosomal position.
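The peak-to-gene links described in panel (C) can be illustrated with a minimal sketch: for each peak, compute the Pearson correlation between accessibility and gene expression across nuclei, then keep links with |R| > 0.2. The per-nucleus values and peak names are hypothetical toy data, and the actual linking software and its normalization steps are not described in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def link_peaks_to_gene(expression, peak_accessibility, min_abs_r=0.2):
    """Return {peak: R} for peaks whose |R| with gene expression exceeds the cutoff."""
    links = {}
    for peak, accessibility in peak_accessibility.items():
        r = pearson_r(accessibility, expression)
        if abs(r) > min_abs_r:
            links[peak] = r
    return links


# Toy data: gene expression in five nuclei and accessibility of three peaks.
expression = [0, 1, 2, 3, 4]
peak_accessibility = {
    "peak_1": [1, 3, 2, 5, 4],  # positively correlated with expression
    "peak_2": [2, 1, 3, 1, 2],  # uncorrelated with expression
    "peak_3": [4, 5, 2, 3, 1],  # negatively correlated with expression
}

links = link_peaks_to_gene(expression, peak_accessibility)
print(links)
```

Restricting the nuclei to a single cell type before computing the correlations would give the analogue of the cardiomyocyte-specific links (Links CM).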
When we genotyped rs7612445 in six out of seven donors who provided LAAs for the sn-multiome experiment, all but one individual were homozygous for the G-allele. Interestingly, the one individual with the GT genotype had increased chromatin accessibility and higher GNB4 expression in cardiomyocytes (one-sample t test P_ATAC-seq = 1.9 × 10^−4 and P_RNA-seq = 2.1 × 10^−4 in cardiomyocytes, Figures 2D and 2E), consistent with the cis-eQTL effect detected by bulk RNA-seq in the CTSN and Harbin cohorts (Figure 2F). While rs7612445 and rs7634416 are in strong LD (r² ∼ 1 in European and East Asian populations), our results suggest that rs7612445 is the more likely AF causal variant. Our finding is also consistent with a previous report that used electrophoretic mobility shift assay in cardiomyocytes derived from induced pluripotent stem cells to show that the rs7612445-T allele increases binding with the NKX2-5 transcription factor.27 Thus, we posit that rs7612445 is the causal variant at this AF locus and mediates its effect on disease through the regulation of GNB4 in cardiomyocytes.
At the MAPT locus on chromosome 17 (encoding the microtubule-associated protein tau), we detected a strong co-localization between the AF GWAS signal and the expression of three genes—MAPT, MAPT-IT1, and STH—in the CTSN and Harbin datasets (H4 PP > 0.9, Table S8). We focused our downstream analyses on MAPT given that MAPT-IT1 and STH are expressed at very low levels in our sn-multiome data (Figures S11 and S12) and have not been implicated in cardiac phenotypes in the past (Table S9). Statistical fine-mapping of the AF GWAS and eQTL datasets prioritized a single variant in the 95% credible sets (rs242557) with high confidence (PIP > 0.95, Figure 3A; Tables 2 and S8). In the LAA sn-multiome data, MAPT is highly expressed in cardiomyocytes (Figure 3B). rs242557 intersects with an ATAC-seq peak correlated with the expression of MAPT when we considered all cell-types (Figure 3C), as well as with an ATAC-seq peak opened in a broad range of cell-types in CATlas (including cardiomyocytes) and a distal enhancer element in ENCODE (Tables 2 and S10). Among the six donors who provided LAA for the sn-multiome experiment and that we could genotype, four were heterozygous and two were homozygous for the reference G-allele. GA heterozygous donors showed higher MAPT expression in cardiomyocytes (two-sample t test p = 0.043, Figure 3D), consistent with the bulk eQTL results in the CTSN and Harbin cohorts. However, we found no difference in chromatin accessibility for this ATAC-seq peak based on genotypes at rs242557 (two-sample t test p > 0.7, Figures 3D–3F), maybe because our sample size is too small or because the variant affects gene expression without modulating chromatin accessibility.
Figure 3.
Fine-mapping and annotation of the MAPT locus
(A) The top panels show –log10 (p-values) from the cross-ancestry atrial fibrillation (AF) GWAS (y axis) against genomic coordinates (x axis, hg38) for a 500 kb window centered on the sentinel AF SNP. SNPs are colored based on the 1000 Genomes Project linkage disequilibrium (LD) r² with the lead SNP in the European (left) and East Asian (right) super-populations. In the bottom panels, we report the eQTL –log10 (p-values) in the CTSN and Harbin cohorts for MAPT expression in left atrial appendages (LAAs).
(B) Uniform manifold approximation and projection (UMAP) of LAA single-nucleus multiome cell-types (left) and MAPT expression density (right).
(C) LAA single-nucleus multiome genomic context of the prioritized AF-associated variants. From top to bottom: The first track shows ATAC-seq fragment coverage at the MAPT locus (left) paired with violin plots of MAPT expression aggregated by cell-type (right). The second track shows ATAC-seq peaks. The third track shows the gene annotation (exon, intron) for the genes found at the locus. The fourth and fifth tracks highlight links between ATAC-seq peaks and gene promoters identified in either all cell-types (Links) or specifically in cardiomyocytes (Links CM). We used Pearson correlation tests between ATAC-seq peak accessibility and gene expression (in the same nucleus) to derive links. Only links with |Pearson R| > 0.2 are shown. Link heights are proportional to their |Pearson R| in the range indicated in the legend. In the final track, we report AF GWAS and eQTL fine-mapping posterior inclusion probabilities (PIP).
(D) Scatterplots of the chr17:45942197-45942667 ATAC-seq peak accessibility against MAPT expression colored by the genotype of the prioritized variant (rs242557) in the six genotyped individuals of our single-nucleus multiome LAA data (four GA and two GG). Spearman’s rho p values were <2.2 × 10^−16 and 0.034 when including all cells or only CM, respectively.
(E) ATAC-seq read coverage of the chr17:45942197-45942667 ATAC-seq peak aggregated by genotype (across all cell-types). The dotted line shows the position of rs242557.
(F) rs242557-MAPT eQTL boxplots in the CTSN and Harbin bulk RNA-seq datasets. Adipo, adipocytes; CM, cardiomyocytes; EC, endothelial cells; FB, fibroblasts; PC, pericytes; SMC, smooth muscle cells; POS, chromosomal position.
LINC01629 repression alters key AF genes expression in hESC-CMs
Intriguingly, the strongest eQTL in both the CTSN and Harbin datasets implicated genotypes at the sentinel SNP rs10873298 and the expression of an uncharacterized lncRNA (LINC01629) and a pseudogene (AC007686.1) in LAAs (Figure 1; Table 1). The eQTLs for both genes co-localize with the AF GWAS signal at the locus (H4 > 0.9, Table S8), and we could resolve the 95% credible set to four variants in CTSN (each with PIP > 0.1, Table 2). Two of these variants (rs10873298 and rs10873299) overlap with a generic proximal enhancer cataloged by ENCODE (Table 2). A third variant in the 95% credible set, rs12889775, is also of interest for two reasons. First, it is located just next to a CM-specific ATAC-seq peak and could therefore directly affect the expression of LINC01629 (Figure S7B). Second, it also maps to a LINC01629 exon, so it might influence the stability of this lncRNA.
We were able to detect the expression of LINC01629 in cardiomyocytes in the LAA sn-multiome data (Figure S7B). To gain molecular insights into the role that LINC01629 can play in AF etiology, we knocked down its expression in hESC-CMs using CRISPRi and performed bulk RNA-seq on three biological replicates (STAR Methods and Figure S15). One important limitation of this system is that it would not allow us to distinguish between a scenario where LINC01629 has a generic effect on CM proliferation and/or maturation vs. an effect on the expression patterns of genes with specific functions in CMs. Arguably, however, it remains the best system to study whether a long non-coding RNA can impact the phenotypes of human CMs. For these CRISPRi experiments, we used a guide RNA (gRNA) that maps to the LINC01629 promoter. We confirmed that CRISPRi repressed LINC01629 expression (∼92% decrease) when compared to a non-targeting gRNA (Figure 4A). Differential gene expression analysis identified 217 up- and 299 downregulated genes (FDR <0.1) upon CRISPRi on the LINC01629 promoter (Figure 4B; Table S12). Many of these genes play key roles in CM functions and/or have already been implicated in AF by GWAS (e.g., TBX5, PITX2, HCN4, and SCN5A).21,28,29 Notably, the most downregulated gene in this LINC01629 CRISPRi experiment is FOXP2, a transcription factor that was recently implicated in the control of regulatory networks found in pacemaker cells.30 Our pathway analyses confirmed these observations, for instance highlighting genes implicated in cardiac conduction among the CRISPRi downregulated genes (Figure 4C). The expression of AC007686.1 was not significantly modulated in the CRISPRi experiment (log2 fold-change = −0.036, nominal p = 0.16 [due to its low expression level, we could not generate an accurate FDR using DESeq2]), although we cannot formally exclude that this pseudogene also contributes to the AF GWAS signal.
Figure 4.
Knockdown of LINC01629 in cardiomyocytes
(A) Boxplot of raw read counts for LINC01629 in human embryonic stem cell-derived cardiomyocytes (hESC-CMs) with CRISPRi targeting the promoter of LINC01629 (LINC01629KRAB) vs. a non-targeting control (NTC) guide RNA (gRNA).
(B) Volcano plot of the differential gene expression analysis carried out by comparing the transcriptome of hESC-CMs treated with a gRNA against the promoter of LINC01629 vs. a negative control (NTC gRNA). We labeled the top 10 genes (black dot and label), the AF genes prioritized by Open Targets (red dot and label), as well as LINC01629 (green dot and black label).
(C) Pathway analyses with genes downregulated (top panel) or upregulated (bottom panel) (false discovery rate < 0.1) in the CRISPRi experiment.
Discussion
One strength of our study is that we focused on AF-associated variants identified in a large cross-ancestry GWAS study and that we included LAA samples from different ancestries. Our initial conclusion is that there is little evidence of heterogeneity when comparing the effect of AF-associated variants between European- and East Asian-ancestry populations (Tables 1 and S5). This observation is largely consistent with recent studies that suggest that the genetic architecture of common human diseases, including AF, is concordant between populations (at least when considering common variants).31 However, it is important to nuance these conclusions with two important considerations. First, most cross-ancestry GWAS published to date are still very much European-ancestry-centric, and are therefore less likely to yield genetic associations specific to non-European samples (which would increase heterogeneity). Second, the limited sample size of our eQTL datasets (CTSN, N = 62; Harbin, N = 65) would not allow us to detect small differences in effect sizes on gene expression for the AF-associated variants. Thus, there is still great value in continuing to increase the sample size of non-European-ancestry GWAS and eQTL studies for complex human diseases, including AF, to discover population-specific biology and to enable more powerful fine-mapping experiments. To illustrate this important point, we note that three of the variant-gene eQTL pairs that co-localize with the AF GWAS were found only in the East-Asian Harbin cohort (SLC25A26, F10, and AKAP6; Table 1). However, replication studies and additional functional experiments are needed to determine if these genetic variants mostly modulate AF risk in East Asians.
One interesting finding of our study is the discovery that downregulating the expression of LINC01629 in hESC-CMs using CRISPRi modifies the expression of many key AF genes, including candidate causal genes implicated in AF GWAS (like PITX2, Figure 4B). Pathway analyses confirm this observation by revealing that many of the downregulated genes impact cardiac conduction, which might be a specific effect or the non-specific impact of LINC01629 on CM proliferation/maturation. The T-allele at rs10873298, which is associated with lower LINC01629 expression in LAAs, is protective against AF (Table 1). This observation, combined with results from our CRISPRi experiment, suggests that the downregulation of key genes like PITX2, GJA5, and TBX5 in CMs might prevent AF. While it is known that the upregulation of these genes can promote arrhythmia and AF,28,32,33,34,35 loss-of-function and haploinsufficiency of these genes have also been associated with AF.35,36,37,38 This observation underscores the intricate balance necessary for normal conduction in the heart, and that too little or too much expression of a given gene can lead to heart diseases. It is also important to emphasize that while our acute treatment of CMs with CRISPRi links LINC01629 to pathways of genes involved in atrial development and cardiac conduction, it does not recapitulate the chronic impact of LINC01629 over (or under)-expression during a lifetime. Additional work is needed to clarify whether this long non-coding RNA promotes or protects against AF, and how molecularly it modulates the expression of key AF genes.
As for most other complex human diseases tackled by GWAS, progress toward a better understanding of AF pathophysiology has been hampered by challenges to move from genetic associations to genes and variants. In this study, we combined statistical methodologies (co-localization and Bayesian fine-mapping), eQTL analyses in a disease-relevant tissue obtained from donors of different ancestries, and CRISPRi in hESC-CMs to prioritize new variants, regulatory sequences and genes that modulate the risk of developing AF. While we uncovered strong variant and gene candidates for further downstream analyses, we recognize that larger eQTL studies, potentially including other tissues, are required to functionally dissect the ∼150 GWAS loci associated with AF. Because we made these discoveries by studying human genetic and phenotypic variation, they promise to yield insights into the causes of AF in humans. Given the unmet need to develop and characterize new molecules to treat (or prevent) AF, this is particularly exciting since candidate drug targets that are supported by genetic evidence are twice as likely to yield effective therapies.39,40
Limitations of the study
Many GWAS variants are eQTLs, yet this overlap does not necessarily imply that the corresponding eGenes are involved in the disease. To increase the specificity of our strategy, we only considered (1) loci with strong evidence of co-localization between the AF GWAS signals and eQTLs in LAAs and (2) variants with high PIP (Table 2). We acknowledge that these stringent criteria could make us miss interesting loci with more complex genetic architecture. Nonetheless, we prioritized many interesting genes for a role in AF (Table S9). Some of these genes have previously been linked to AF (e.g., GNB4, MAPT, FAM13B, and ARNT2) because of roles in heart rhythm, cardiac conduction, or other aspects of cardiomyocyte biology, while others remain largely unexplored in the context of AF (Table S4). Another limitation of our study is that we did not functionally validate the prioritized variants. For instance, while our study agrees with a previous report that FAM13B is a likely causal AF gene,41 the predicted causal variants are different. More systematic perturbation analyses using genome editing in relevant cell types could help resolve these issues.
STAR★Methods
Key resources table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Bacterial and virus strains | ||
OneShot Stbl3 E. coli competent cells | Invitrogen | Catalog #C737303 |
Deposited data | ||
Single-nucleus multiome data | This paper | NCBI GEO accession number GSE238242 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE238242) |
Bulk RNA-seq data (CTSN, Harbin, lncRNA LINC01629) | This paper | NCBI GEO accession number GSE271839 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE271839); http://www.mhi-humangenetics.org/en/resources/ |
Experimental models: Cell lines | ||
HEK293T | ATCC | Catalog #CRL-3216 |
H1-dCas9-KRAB | This paper | N.A. |
Oligonucleotides | ||
Non-targeting control gRNA (Forward) | This paper | 5′-CACCGAAAACAGGACGATGTGCGGC-3′ |
Non-targeting control gRNA (Reverse) | This paper | 5′-AAACGCCGCACATCGTCCTGTTTTC-3′ |
LINC01629 gRNA (Forward) | This paper | 5′-CACCGTAGAAAAAGACACTTCCAA-3′ |
LINC01629 gRNA (Reverse) | This paper | 5′-AAACTTGGAAGTGTCTTTTTCTAC-3′ |
Recombinant DNA | ||
Lenti-dCas9-KRAB-blast | Addgene | Catalog #89567 |
Software and algorithms | ||
PLINK1.9 | Chang et al., Gigascience, 201542 | https://www.cog-genomics.org/plink/ |
TOPMed Imputation Server | N.A. | https://imputation.biodatacatalyst.nhlbi.nih.gov/#! |
hdWGCNA | Morabito et al., bioRxiv, 202243 | https://smorabit.github.io/hdWGCNA/ |
Kallisto | Bray et al., Nature Biotech., 201644 | https://pachterlab.github.io/kallisto/ |
DESeq2 | Love et al., Genome Biol., 201445 | https://bioconductor.org/packages/release/bioc/html/DESeq2.html |
scDblFinder | Germain et al., F1000Research, 202146 | https://github.com/plger/scDblFinder |
Code specific to this manuscript | This paper | https://github.com/lebf3/AF_V2G_multiome |
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Guillaume Lettre (guillaume.lettre@umontreal.ca).
Materials availability
This study did not generate new unique reagents.
Data and code availability
• The sn-multiome and LINC01629 RNAseq data discussed in this publication have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession numbers GSE238242 and GSE271839, respectively. The bulk RNAseq data (gene count tables) for the CTSN and Harbin cohorts are also available at: http://www.mhi-humangenetics.org/en/resources/.
• The code to reproduce the results in this article is available at: https://github.com/lebf3/AF_V2G_multiome.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Experimental model and study participant details
All participants provided written informed consent. The project was approved by the Montreal Heart Institute ethical committee (project number: #2011-209), and studies involving human participants recruited at the Oxford site were approved by the local Research Ethics Committee (South Central - Berkshire B Research Ethics Committee, UK; ref. 18/SC/0404). Knockdown experiments were carried out in cardiomyocytes differentiated from the H1 hESC cell line. The demographic information for the study participants can be found in Table S1.
Method details
RNA extraction and sequencing
Harbin
RNA isolation
We extracted total RNA with the Qiagen kit. We then processed the total RNA as follows: (1) We tested the RNA samples for possible contamination and degradation using 1% agarose gel electrophoresis; (2) We examined RNA purity and concentration using the NanoPhotometer spectrophotometer; and (3) We measured RNA integrity and quantity using the RNA Nano 6000 Assay Kit on the Bioanalyzer 2100 system.
Library preparation and sequencing
We used the Nugen kit with the ribosomal RNA (rRNA) depletion and stranded method to construct the RNA libraries for RNA-seq. Briefly, we depleted rRNA from total RNA using the rRNA Removal Kit as per the manufacturer’s instructions. Next, we fragmented the RNA into ∼250-∼300 bp fragments and reverse transcribed the first strand cDNA using fragmented RNA and dNTPs (dATP, dTTP, dCTP and dGTP). We degraded the RNA using RNase H and synthesized the second strand cDNA using DNA polymerase I and dNTPs (dATP, dUTP, dCTP and dGTP). We then converted the remaining overhangs of double-stranded cDNA into blunt ends using exonuclease/polymerase activities. After adenylation of the 3′ ends of DNA fragments, we ligated sequencing adaptors to the cDNA. To select cDNA fragments of preferentially ∼250-∼300 bp in length, we purified the library fragments using the AMPure XP system. We performed uridine digestion using Uracil-N-Glycosylase, followed by cDNA amplification using PCR. After library construction, we measured the concentration of the library using the Qubit fluorometer and adjusted it to 1 ng/μL. We used the Agilent 2100 Bioanalyzer to measure the insert size of the acquired library. We then determined the precise concentration of the cDNA library using qPCR. Finally, we sequenced the samples on the Illumina NovaSeq 6000 S4 using a paired-end 150 bp protocol.
CTSN
RNA isolation
We blended samples with the Bullet Blender Storm method using Green RINO Lysis tubes and added 200uL of Qiazol per tube for ∼50 mg of tissue, following the manufacturer’s protocol for heart tissue. We then extracted RNA using the miRNeasy Qiagen kit (Cat No./ID: 217004), following the manufacturer’s protocol.
Library preparation and sequencing
We quantified total RNA using a NanoDrop Spectrophotometer ND-1000 (NanoDrop Technologies, Inc.) and assessed its integrity on a 2100 Bioanalyzer (Agilent Technologies). We depleted rRNA from 250 ng of total RNA using the QIAseq FastSelect - rRNA HMR Kit. We performed cDNA synthesis using the NEBNext RNA First Strand Synthesis and NEBNext Ultra Directional RNA Second Strand Synthesis Modules (New England BioLabs). We carried out the remaining steps of library preparation using the NEBNext Ultra II DNA Library Prep Kit for Illumina (New England BioLabs), and we purchased adapters and PCR primers from New England BioLabs. We quantified libraries using the Kapa Illumina GA with Revised Primers-SYBR Fast Universal kit (Kapa Biosystems) and determined the average fragment size using a LabChip GX (PerkinElmer) instrument. We conducted paired-end 100-bp sequencing on an Illumina NovaSeq 6000 S4, aiming for a minimum of 100 million reads per sample.
DNA extraction and genotyping
Harbin
We isolated DNA using a Qiagen kit. We measured the DNA concentration (≥60 ng/μL) and volume (≥30μL) using the Qubit DNA Assay Kit on a Qubit 3.0 Fluorometer (Invitrogen, USA). We performed genotyping using the Illumina GSA v3 or ASA genotyping arrays.
CTSN
We isolated DNA from blood, when available, using the Qiagen DNeasy kit (Cat No./ID: 69504), following the manufacturer’s protocol. We quantified the genomic DNA using a 2100 Bioanalyzer (Agilent Technologies) and performed genotyping using the Illumina GSA-24v3 genotyping array.
Genotype quality-control and imputation
We used PLINK 1.942 to process and quality-control the genotype files. We removed individuals with >5% missingness (only one individual in the Harbin cohort). We found no evidence of relatedness (third degree relative or higher) nor heterozygosity outliers. In the Harbin cohort, we also removed monomorphic SNPs and those with a Hardy-Weinberg equilibrium exact test P-value <1E-6, resulting in 417,862 genotyped SNPs. In the CTSN cohort, we removed monomorphic SNPs, resulting in 531,541 genotyped SNPs (we did not filter based on Hardy-Weinberg equilibrium in CTSN due to the multi-ancestry nature of the cohort). Next, we flipped alleles to match hg19 using Snpflip (https://github.com/biocore-ntnu/snpflip). We performed genotype imputation on the TOPMed Imputation Server using the TOPMed-r2 reference. SNPs with low call rates (<90%) were removed during this step. We retained variants with an imputation R2 >0.3 for downstream analyses. In total, we obtained 10,537,217 and 17,649,215 imputed variants in the Harbin and CTSN cohorts, respectively.
Genetically-defined continental ancestry
We used 2,722 common ancestry informative markers (https://genome.sph.umich.edu/wiki/Exome_Chip_Design) to perform principal component analyses (PCA) on a combined dataset that included samples from the Harbin and CTSN cohorts, as well as populations from the 1000 Genomes Project. We visualized participants using axes of variation obtained by PCA and Uniform Manifold Approximation and Projection (UMAP) calculated with the first 10 principal components (PCs).
RNAseq processing and differential expression analysis
We performed the following steps independently for the CTSN and Harbin datasets. We pseudoaligned RNAseq reads to Gencode v32 using Kallisto44 with the options quant -b 100 --rf-stranded. We aggregated transcripts by genes and quantified them with DESeq2.45 We removed genes with fewer than 10 reads. We used DESeq2’s Wald test with sex as a covariate, followed by a log2 shrunken transformation (ashr shrinkage estimator47) to compare AF differentially expressed genes between Harbin and CTSN. Lastly, we determined the number of PCs to use as hidden variables (covariates) with the runElbow() function from the PCAForQTL package.48
Single-nucleus multiome
Sample collection from LAA
A total of eight patients were initially included in the study; all patients underwent cardiac surgery (coronary artery bypass grafting) at the John Radcliffe Hospital in Oxford. Left atrial appendage biopsies were collected before cardiopulmonary bypass and immediately rinsed of blood, towel-dried, and snap-frozen in liquid nitrogen until use for subsequent experiments.
Single nuclei sample preparation and sequencing
To isolate nuclei from LAA samples we used a modified version of the 10X multiome nuclei isolation protocol. Unless specifically mentioned in our description below, we re-suspended nuclei by pipette mixing slowly 10 times and we used the buffer described by 10X here: https://www.10xgenomics.com/support/single-cell-multiome-atac-plus-gene-expression/documentation/steps/sample-prep/nuclei-isolation-from-complex-tissues-for-single-cell-multiome-atac-plus-gene-expression-sequencing. All steps were performed on ice or maintained at 4°C. We extracted nuclei from tissues using a Singulator100 with the manufacturer's nuclei protocol (fast extraction setting). For each sample, we loaded approximately 25 mg of tissue with 1.5 mL of nuclei lysis buffer (Tris-HCl pH 7.4 10 mM, NaCl 10 mM, MgCl2 3 mM, Tween 20 0.1%, Nonidet P40 Substitute 0.1%, Digitonin 0.01%, BSA 2%, DTT 1 mM, 0.5 U/μL Protector RNase inhibitor [Sigma catalog no. 3335402001], in nuclease-free water) in the Singulator100. Once the run completed, we rinsed the cartridge with an additional 1 mL of lysis buffer, passed the solution through two 20 μm filters (Miltenyi catalog no. 130-101-812) sequentially, and performed a first round of centrifugation at 500g and 4°C for 5 min in a swing bucket centrifuge. We then removed the supernatant, added 1 mL of suspension buffer (PBS, 2% BSA, 0.5 U/μL Protector RNase inhibitor), waited 5 min for buffer exchange, re-suspended the nuclei, passed the solution through a 20 μm filter, rinsed with an additional 1 mL of suspension buffer and performed a second centrifugation at 500g and 4°C for 5 min. We then removed the supernatant and re-suspended the nuclei in 100 μL of 0.1X lysis buffer (1 U/μL Protector RNase inhibitor) by pipette mixing 5 times, waited 2 min, added 1 mL of wash buffer, pipette mixed 5 times and centrifuged at 500g and 4°C for 5 min.
Finally, we re-suspended the nuclei in diluted nuclei buffer, quantified them on a Countess II FL, and proceeded to load the chip on the Chromium controller and to the downstream steps of the manufacturer's protocol. We sequenced libraries on a NovaSeq 6000 S4 PE100 with a target of 30,000 paired reads per cell for the RNA libraries and 60,000 paired reads per cell for the ATAC libraries.
Alignment and pre-processing
We aligned FASTQ files to Cellranger’s GRCh38-2020-A reference using the count function of cellranger-arc-2.0.0 for each sample. We then aggregated all samples using the cellranger-arc aggr function. We removed one sample with a low number of detected genes, linked genes and cells. For downstream analyses, we used Seurat v4 and Signac.49 We kept ATAC peaks present in at least 10 cells. We removed low-quality cells by retaining only cells that met all of the following thresholds: >200 detected genes, >400 detected peaks, <10% mitochondrial reads, >2 transcription start site enrichment score and >10% ATAC reads in peaks. We then annotated cells by co-embedding our data with the heart atlas left atrial nuclei.50 We used SCTransform for normalization and for regressing out mitochondrial read percentages. We integrated the data using Harmony on 30 PCs. We used the heart cell atlas labels to assign cell types to the resulting clusters. We removed doublets using both scDblFinder46 scores and manual sub-clustering curation. We first calculated a doublet score using scDblFinder on both the RNA and ATAC data. We removed cells for which the product of the two scores was greater than 0.5 (scDblFinder RNA score ∗ scDblFinder ATAC score, labeled high confidence doublets). We then sub-clustered each cell type and removed sub-clusters that showed an enrichment of the scDblFinder scores and the top marker gene of another cell type (labeled sub-cell-type doublets). We then re-clustered cells using Seurat’s function FindMultiModalNeighbors() with the first 20 PCs of the RNAseq data and PCs 2–20 of the ATACseq data. Finally, we refined ATACseq peaks with cell-type labels using Signac’s CallPeaks function.
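As an illustration only, the per-nucleus quality thresholds and the high confidence doublet rule described above can be sketched in Python. This is not the code used in the study (which relies on Seurat, Signac and scDblFinder in R); the function names and dictionary keys below are hypothetical, and only the numeric cutoffs come from the text.

```python
def passes_qc(cell):
    """Return True if a nucleus meets every QC threshold listed in the text.

    `cell` is a dict of per-nucleus metrics; the key names are hypothetical.
    """
    return (
        cell["n_genes"] > 200                  # >200 detected genes
        and cell["n_peaks"] > 400              # >400 detected peaks
        and cell["pct_mito"] < 10.0            # <10% mitochondrial reads
        and cell["tss_enrichment"] > 2.0       # TSS enrichment score >2
        and cell["pct_reads_in_peaks"] > 10.0  # >10% ATAC reads in peaks
    )


def is_high_confidence_doublet(rna_score, atac_score, cutoff=0.5):
    """Flag a nucleus when the product of its RNA and ATAC doublet scores exceeds the cutoff."""
    return rna_score * atac_score > cutoff


# Hypothetical nucleus used only to show the call pattern.
example = {
    "n_genes": 1500,
    "n_peaks": 3000,
    "pct_mito": 2.5,
    "tss_enrichment": 4.1,
    "pct_reads_in_peaks": 35.0,
}
keep = passes_qc(example) and not is_high_confidence_doublet(0.3, 0.4)
```

Requiring both modalities to score highly (through the product) is more conservative than flagging on either score alone, which is why a nucleus with scores of 0.9 and 0.5 would be retained under this rule.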
Prioritized variants overlap with other genomic datasets
For each annotation, when necessary, we recovered hg38 positions as mentioned above. We obtained ENCODE regulatory elements from the genome.ucsc.edu table browser, table encodeCcreCombined (https://genome.ucsc.edu/cgi-bin/hgTables?hgta_doMainPage=1&hgta_group=regulation&hgta_track=encodeCcreCombined&hgta_table=encodeCcreCombined&hgsid=1439910105_RsimqAdh3sPECjdmse1QPtYFPY3c). We obtained CATlas ATAC peaks from http://catlas.org/catlas_downloads/humantissues/cCRE_by_cell_type. We obtained EpiMap links in the heart from https://personal.broadinstitute.org/cboix/epimap/links/pergroup/links_by_group.heart.tsv.gz.
Prioritized AF genes and variants comparison across studies
We retrieved prioritized genes from seven datasets to compare to those we prioritized with co-localization in Table 1. We created two sets of genes based on the L2G and Colocalization columns of the Open Targets for the Nielsen et al. GWAS here: https://genetics.opentargets.org/Study/GCST006414/associations. We retrieved eQTL and pQTL genes from Table S5 of Assum et al.11 For Hocker et al. we used genes from Table S19.13 For Selewa et al. we used genes with PIP >0.8 from Table S8.15 For Ouwerkerk et al. 2019 we used genes with a `score p < 10-6` ≥ 11 from Table S1.19 For Zhang et al. 2021, we used AF genes with a PPA >0.1 target genes from the `ABC score >0.015, any cell type` column.14 We retrieved prioritized variants from six datasets to compare to those we prioritized by fine-mapping in Table 2. We retrieved eQTL and pQTL variants from Table S5 of Assum et al.11 For Hocker et al. we used variants from Table S19.13 For Selewa et al. we used variants with PIP >0.5 from Table S3.15 For Ouwerkerk et al. 2019 we used variants from Table S2.19 For Ouwerkerk et al. 2020 we used variants from Table S5.25 For Zhang et al. 2021, we used AF variants in Table S5.14
LINC01629 CRISPRi
Cloning of gRNA plasmid and lentivirus production
Non-Targeting Control (NTC) gRNA sequence: Forward (5′-CACCGAAAACAGGACGATGTGCGGC-3′) and Reverse (5′- AAACGCCGCACATCGTCCTGTTTTC-3′); LINC01629 gRNA sequence: Forward (5′-CACCGTAGAAAAAGACACTTCCAA-3′) and Reverse (5′-AAACTTGGAAGTGTCTTTTTCTAC-3′). The above gRNA oligonucleotides were annealed at a final concentration of 0.4uM and were cloned into 500ng of Esp3I-digested LentiGuide-Puro plasmid using T4 DNA Ligase (Catalog #M0202, NEB). Ligated plasmids were transformed into OneShot Stbl3 E. coli competent cells (Catalog #C737303, Invitrogen) as per the manufacturer’s protocol. Successful plasmid clones were screened using Sanger sequencing and plasmids were extracted using FavorPrep plasmid miniprep kit (Catalog #FAPDE 300). For lentivirus production, lentiviral particles were produced in a 10cm plate format using HEK293T cells cultured in DMEM with 10% FBS. Briefly, 10 μg of Lentiguide-puro gRNA plasmid for LINC01629 or NTC (Addgene #52963), 7.5 μg of pMDLg/pRRE, 2.5 μg of pRSV-REV, and 2.5 μg pMD2.G (Addgene #12251, #12253 & #12259) were co-transfected using 50 μl of PEI (1 mg/ml) and 3mL of Opti-MEM I Reduced Serum Medium (Catalog #31985070, ThermoFisher Scientific). Following overnight incubation, the media was changed to DMEM with 5% FBS. Viral supernatant for the next 48 h was collected, pooled, and filtered through 0.22um PES filter. Viral particles were concentrated using Lenti-Pac Lentivirus Concentration Solution (Catalog #LPR-LCS-01, GeneCopoeia), according to the manufacturer’s instructions. Final concentrated viral particles were then resuspended in 100 μL of PBS solution.
Cell culture and CRISPRi knockdown of LINC01629
The H1-dCas9-KRAB hESC line was engineered by lentiviral infection of the Lenti-dCas9-KRAB-blast plasmid (Addgene #89567), and monoclones with stable constitutive dCas9-KRAB expression were isolated and expanded for targeting. All stem cell cultures were maintained in mTesR1 (Catalog #85857, STEMCELL Technologies), seeded in Geltrex (Catalog #A1413202, ThermoFisher Scientific) coated plates at 37°C with 5% CO2 in the incubator. Cells were routinely tested for mycoplasma prior to culture. For the lentiviral knockdown of LINC01629, the H1-dCas9-KRAB hESCs were cultured in 12-well plates and treated with 30 μL of lentiviral particles in 8 μg/mL polybrene per well. After 24 h, infected cells were positively selected with both 1 μg/mL puromycin (to select for the gRNA) and 10 μg/mL blasticidin (to retain only cells expressing dCas9-KRAB). The knockdown of LINC01629 was validated by qPCR, and three independent experiments per group were performed.
Cardiomyocyte differentiation
We performed cardiomyocyte differentiation using the GiWi protocol as previously described.51 Briefly, both H1-dCas9-KRAB hESC lines infected with NTC and LINC01629 gRNAs were dissociated with Accutase (Catalog: #07920, STEMCELL Technology), and seeded in 12-well plates containing mTesR1 with Y27632 (Catalog #72307, STEMCELL Technology) until 80% confluency. Following 48 h, cells were induced with CHIR99021 for 24 h (Catalog #72054, STEMCELL Technology) and media was subsequently changed to RPMI+B27 without insulin (Catalog A1895601, ThermoFisher). Three days after CHIR99021 induction, media was changed to RPMI without insulin (Catalog #A1895601, ThermoFisher) and 5uM IWP2 (Catalog #72122, STEMCELL Technology) for another 48 h. On Day 5 post-induction, media was refreshed in RPMI+B27 without insulin. RPMI+B27 supplement with insulin (Catalog 17504044, ThermoFisher) was added only from day 7 thereafter when there were beating cardiomyocyte clusters. Cardiomyocytes were cultured and matured for 60 days before harvesting for downstream experiments.
RNA extraction and RNAseq
Cardiomyocytes targeted with either NTC or LINC01629 gRNA were isolated and harvested in 400 μL Trizol reagent (ThermoFisher, 15596026). Briefly, RNA from three independent biological replicates was isolated using the Direct-Zol RNA Miniprep kit (Zymo Research, R2050). RNA quality and yield were assessed using the Agilent RNA 6000 Pico kit (Agilent, 50167-1513). Total RNA libraries were prepared using the TruSeq Stranded Total RNA Library Prep HMR kit (Illumina, 20020596), and the respective cDNA libraries were prepared by Macrogen Asia Pacific Pte. Ltd. RNA libraries were sequenced on the Illumina HiSeq 4000 platform to achieve a sequencing depth of at least 50 million paired-end reads per biological sample.
RNAseq processing
We pseudoaligned RNAseq reads to Gencode v32 using Kallisto with the options quant -b 100 --rf-stranded. We aggregated transcripts by genes and quantified them with DESeq2. We calculated principal components using the 500 most variable genes.
Quantification and statistical analysis
eQTL calling and meta-analyses
For the eQTL datasets, we obtained 65 and 62 samples in Harbin and CTSN, respectively, for which we had genotypic and transcriptomic data. We performed downstream analyses on variants with MAF >0.05. We first tested eQTLs for 150 sentinel variants recently associated with AF in a cross-ancestry meta-analysis.10 Later, for co-localization analyses, we retrieved all genetic variants in a 500 kilobase (kb) window centered on the sentinel variants that were significant eQTLs (FDR <0.05). At the FAM13B locus, we discovered that rs529526 was mis-annotated as the sentinel variant (its GWAS meta-analysis P-value was less significant than those of rs3756687 and rs7722600) and changed it to rs3756687 for all downstream analyses.
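The variant selection step can be sketched as follows, purely as an illustration. The function name, the dictionary keys and the example coordinates are hypothetical, and we assume that a 500 kb window "centered" on the sentinel means 250 kb on each side with inclusive boundaries; the study's actual implementation is in the repository listed in the key resources table.

```python
def select_cis_window(variants, sentinel, window_bp=500_000, min_maf=0.05):
    """Keep variants near a sentinel variant for co-localization.

    variants: list of dicts with hypothetical keys "id", "chrom", "pos", "maf".
    sentinel: dict with keys "chrom" and "pos".
    Assumption: the window extends window_bp / 2 on each side of the sentinel.
    """
    half_window = window_bp // 2
    return [
        v
        for v in variants
        if v["chrom"] == sentinel["chrom"]
        and abs(v["pos"] - sentinel["pos"]) <= half_window
        and v["maf"] > min_maf  # MAF > 0.05, as stated in the text
    ]


# Made-up coordinates, used only to show the call pattern.
sentinel = {"chrom": "14", "pos": 1_000_000}
variants = [
    {"id": "varA", "chrom": "14", "pos": 1_200_000, "maf": 0.20},  # kept
    {"id": "varB", "chrom": "14", "pos": 1_300_000, "maf": 0.20},  # outside window
    {"id": "varC", "chrom": "14", "pos": 900_000, "maf": 0.01},    # MAF too low
    {"id": "varD", "chrom": "5", "pos": 1_000_000, "maf": 0.30},   # other chromosome
]
selected = [v["id"] for v in select_cis_window(variants, sentinel)]
```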
To compute eQTLs, we used transformed gene expression matrices derived from the vst() function of DESeq2. We accounted for hidden variables in the RNAseq datasets using PCs48 with the PCAForQTL package. We then used the MatrixEQTL package52 with sex and 7 PCs as covariates to find eQTLs with less than one megabase (Mb) between the gene and the SNP. To test for statistical interaction between genotype and disease status, we used the following linear model for sentinel variants only: gene expression levels ∼ sex + disease_status + SNP + PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 + SNP:disease_status.
We performed a meta-analysis of nominal p-values for the 150 AF GWAS sentinel variant eQTLs using GTEx RAA V8 (https://console.cloud.google.com/storage/browser/gtex-resources) and our two cohorts (CTSN and Harbin) using METAL.53 We corrected for multiple testing (N = 7907 tests) using FDR.
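For readers unfamiliar with p-value-based meta-analysis, the sketch below illustrates a sample-size-weighted Z-score combination followed by a false discovery rate adjustment. It is an illustration under assumptions, not the study's pipeline: we assume METAL was run with its sample-size weighting scheme (weights proportional to the square root of each cohort's N) and that the FDR refers to the Benjamini-Hochberg procedure; all function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def signed_z(p_value, beta):
    """Convert a two-sided p-value (0 < p <= 1) into a Z-score carrying the sign of beta."""
    magnitude = -_STD_NORMAL.inv_cdf(p_value / 2.0)
    return math.copysign(magnitude, beta)


def sample_size_weighted_meta(studies):
    """Combine studies given as (p_value, beta, n_samples) tuples.

    Z = sum(sqrt(N_i) * Z_i) / sqrt(sum(N_i)); returns (Z, two-sided p-value).
    """
    numerator = sum(math.sqrt(n) * signed_z(p, b) for p, b, n in studies)
    denominator = math.sqrt(sum(n for _, _, n in studies))
    z = numerator / denominator
    return z, 2.0 * _STD_NORMAL.cdf(-abs(z))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical eQTL results for one variant-gene pair in two cohorts.
z_meta, p_meta = sample_size_weighted_meta([(0.05, 0.4, 62), (0.05, 0.3, 65)])
```

Two cohorts with concordant, nominally significant effects yield a smaller combined p-value, whereas discordant directions cancel out; this is the behavior that makes direction-aware weighting preferable to combining unsigned p-values.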
Co-localization
We retrieved the hg38 positions using the AnnotationHub package (Annotationhub chain: hg19ToHg38.over.chain.gz). For CTSN and Harbin, we merged eQTL summary statistics overlapping with the GWAS summary statistics and ran coloc (https://github.com/chr1swallace/coloc) on each locus with a significant eQTL. We used 87,516 as the number of cases and 1,395,002 as the sample size for the GWAS.10 We used H4 posterior probability >0.4 as evidence of co-localization between the AF and eQTL summary statistics.54
Visualization of fine-mapped AF-associated variants in single-nucleus multiome data
We calculated approximate Bayes factors (aBF) for each dataset using a previously described fine-mapping algorithm.55 Briefly, aBF was calculated with summary statistics of the GWAS meta-analysis and both eQTL datasets using the following equation:
$$\mathrm{aBF} = \sqrt{\frac{SE^{2}}{SE^{2}+\omega}}\,\exp\left[\frac{\omega\beta^{2}}{2\,SE^{2}\left(SE^{2}+\omega\right)}\right]$$
where β and SE are the variant’s effect size and standard error, respectively, and ω is the prior variance in allelic effects, taken here to be 0.04. We calculated the posterior inclusion probability (PIP) of a given variant by dividing the variant’s aBF by the sum of the aBF for all the variants in the locus. We generated the 95% credible sets by including variants, starting with the variants with the largest PIP, until the sum of the variants’ PIP in the credible set reached ≥95%. We calculated PIP for variants in the 95% credible sets for each dataset. We report credible set sizes and their overlap using the eQTL datasets in which the eQTL was significant and where the GWAS and eQTL signals co-localized (defined as H4 posterior probability >0.4). To prioritize variants based on the aBF fine-mapping, we used a PIP threshold >0.1 as we and others have shown that it enriches for potential functional variants.56,57,58 When both Harbin and CTSN credible sets were included, we overlapped their union with the GWAS credible sets; otherwise, we overlapped the GWAS credible set with the one significant eQTL dataset. In LocusZoom panels, we plotted P-values for each locus with 1000 Genomes European (EUR) and East Asian (EAS) populations linkage disequilibrium patterns using the locuscomparer package for CTSN and Harbin.
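The fine-mapping steps described above can be sketched in Python as a minimal illustration. We assume here that the aBF takes the commonly used Wakefield form, aBF = sqrt(SE²/(SE² + ω)) · exp[ωβ²/(2·SE²·(SE² + ω))], with ω = 0.04 as stated in the text; the function names and the example summary statistics are hypothetical, and the study's own code is available in its repository.

```python
import math

PRIOR_VARIANCE = 0.04  # omega, the prior variance in allelic effects given in the text


def log_abf(beta, se, omega=PRIOR_VARIANCE):
    """Natural log of the approximate Bayes factor (assumed Wakefield form)."""
    v = se ** 2
    return 0.5 * math.log(v / (v + omega)) + (omega * beta ** 2) / (2 * v * (v + omega))


def posterior_inclusion_probabilities(variants, omega=PRIOR_VARIANCE):
    """variants: dict mapping variant ID -> (beta, se). Returns variant ID -> PIP."""
    log_bfs = {vid: log_abf(b, s, omega) for vid, (b, s) in variants.items()}
    # Subtract the largest log aBF before exponentiating to avoid overflow.
    top = max(log_bfs.values())
    scaled = {vid: math.exp(value - top) for vid, value in log_bfs.items()}
    total = sum(scaled.values())
    return {vid: x / total for vid, x in scaled.items()}


def credible_set(pips, coverage=0.95):
    """Add variants by decreasing PIP until the cumulative PIP reaches the coverage."""
    selected, cumulative = [], 0.0
    for vid, pip in sorted(pips.items(), key=lambda item: item[1], reverse=True):
        selected.append(vid)
        cumulative += pip
        if cumulative >= coverage:
            break
    return selected


# Made-up summary statistics for three variants at one locus.
locus = {"v1": (0.30, 0.05), "v2": (0.10, 0.05), "v3": (0.02, 0.05)}
pips = posterior_inclusion_probabilities(locus)
cs95 = credible_set(pips)
prioritized = [vid for vid in cs95 if pips[vid] > 0.1]  # PIP > 0.1 threshold from the text
```

Working on the log scale matters in practice because strongly associated variants can have aBF values too large to represent directly as floating-point numbers.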
To evaluate the genomic context of each locus, we retrieved the overlapping ATACseq peaks and coverage, peak-gene links and the eQTL gene expression from our sn-multiome dataset. To calculate peak-gene links, we first created MetaCells within each sample using the hdWGCNA package.43 We calculated MetaCell gene and peak counts by averaging the genes/cells and peaks/cells matrices using 30 neighbor cells in the gene expression harmony space and limited the number of overlapping cells to 15. We then calculated two Pearson correlation scores between the eQTL genes and peaks within 1 Mb of the gene. The first correlation score was calculated using all cells. The second was calculated using cardiomyocytes only. In both cases, we kept peak-gene links with a |Pearson R| > 0.2. For clarity, we only display links for eQTL genes found in Table 1.
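The peak-gene linking rule can be illustrated with a short Python sketch. It assumes that MetaCell-level gene expression and peak accessibility values have already been computed (in the study, with hdWGCNA); the function names and the toy values are hypothetical, and only the |R| > 0.2 threshold comes from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def peak_gene_links(gene_expression, peak_accessibility, min_abs_r=0.2):
    """Return {peak_id: R} for peaks whose |Pearson R| with the gene exceeds min_abs_r.

    gene_expression: per-MetaCell expression values for one gene.
    peak_accessibility: dict mapping peak ID -> per-MetaCell accessibility values.
    """
    links = {}
    for peak_id, values in peak_accessibility.items():
        r = pearson_r(gene_expression, values)
        if abs(r) > min_abs_r:  # NaN fails this comparison and is dropped
            links[peak_id] = r
    return links


# Toy MetaCell values used only to show the call pattern.
expression = [1.0, 2.0, 3.0, 4.0]
peaks = {
    "peak_up": [2.0, 4.0, 6.0, 8.0],
    "peak_flat": [1.0, 1.0, 1.0, 1.0],
    "peak_down": [4.0, 3.0, 2.0, 1.0],
}
links = peak_gene_links(expression, peaks)
```

Running the same function once on all MetaCells and once on cardiomyocyte MetaCells only would give the two sets of links described in the text.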
RNA-seq data analysis for the CRISPRi experiment
We used DESeq2’s Wald test for condition (CRISPRi targeting the promoter of LINC01629 vs. NTC), followed by a log2 shrunken transformation (ashr shrinkage estimator). We performed an over-representation analysis on Gene Ontology Biological Processes using the rbioapi package59 rba_panther_enrich() function. We used all genes with a baseMean value above 1 as background and corrected for multiple testing using the Bonferroni method. We report gene set enrichments for up-regulated and down-regulated genes (FDR <0.1 and log2-fold change >0 and <0, respectively).
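To make the over-representation logic concrete, the sketch below computes a one-sided hypergeometric enrichment p-value for a gene set against a background, followed by a Bonferroni correction. This is an assumption-laden illustration rather than the PANTHER implementation called through rbioapi: the exact test statistic used by that service may differ, and all function names and gene identifiers below are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which are in the set."""
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return favorable / total


def enrichment_p(hit_genes, pathway_genes, background_genes):
    """One-sided enrichment p-value of a pathway among the hit genes."""
    background = set(background_genes)
    hits = set(hit_genes) & background        # e.g., downregulated genes
    pathway = set(pathway_genes) & background
    overlap = len(hits & pathway)
    return hypergeom_upper_tail(overlap, len(pathway), len(hits), len(background))


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy example with made-up gene identifiers.
background = [f"g{i}" for i in range(20)]
downregulated = ["g0", "g1", "g2", "g3"]
pathway = ["g0", "g1", "g2", "g15"]
p_raw = enrichment_p(downregulated, pathway, background)
p_adj = bonferroni(p_raw, n_tests=10)
```

Restricting both the hit list and the pathway to the expressed background (baseMean above 1 in the text) avoids inflating enrichment for pathways whose genes are simply not expressed in cardiomyocytes.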
Acknowledgments
We thank all participants who contributed biosamples to this study. This work was funded by the Fonds de Recherche en Santé du Québec (FRQS), the Canada Research Chair Program and the Montreal Heart Institute Foundation (to S.N. and G.L.), and by the National Natural Science Foundation of China (81861128022, U21A20339 to B.Y.). S.R. was funded by the British Heart Foundation Intermediate and Senior Fellowships, the British Research Council (BRC4) NIHR (Oxford) grant, the Wellcome Trust Institutional Strategic Individual Career Support grant and the John Fell Foundation Fund (Oxford). This research was enabled in part by support provided by Calcul Quebec (https://www.calculquebec.ca/en/) and Compute Canada (www.computecanada.ca). We thank Génome Québec for performing next-generation DNA sequencing for this project.
Author contributions
Conceived and designed the analyses: F.J.A.L. and G.L.; collected the data: X.J., K.K., and N.M.; contributed data: X.J., K.K., C.J.M.L., J.X., L.X., W.M., H.B., N.M., R.S.-Y.F., S.R., C.G.A.-N., Z.P., S.N., and B.Y.; performed analyses: F.J.A.L., C.J.M.L., J.X., C.G.A.-N., and G.L.; secured funding and supervised the work: S.R., C.G.A.-N., Z.P., S.N., B.Y., and G.L.; wrote the manuscript: F.J.A.L. and G.L., with contributions from all authors.
Declaration of interests
The authors declare no competing interests.
Published: August 5, 2024
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.isci.2024.110660.
Supplemental information
MA, minor allele; MAF, MA frequency; FDR, false discovery rate; snps_gene name, snp hg38 position followed by the eQTL gene name. We reported the betas for the minor allele.
1 = the gene was prioritized in the study, 0 = it was not. OT_coloc, OpenTarget prioritized genes from colocalization; OT_L2G, OpenTarget prioritized genes from locus to gene method; sum_overlap, number of studies prioritizing this gene.
snps_gene, snp hg38 position followed by the eQTL gene ID; tss_distance, transcription start site distance to the snp; statistic, t-value from the eQTL linear model; FDR, false discovery rate; MAF, minor allele frequency; A1, reference allele; A2, effect allele; beta se, beta standard error; Weights, N samples in the meta-analysis; meta novel, boolean values for meta-analysis eQTLs below FDR <0.05 that were not significant in any other cohorts. We reported the betas for A2. The direction of effect in the meta-analysis shows cohorts in the same order presented in the table (CTSN, Harbin, GTEx). Question mark denotes missing values and “0” indicates a beta of 0.
eQTL significance; we report if the eQTL signal is significant (FDR <0.05) in the CTSN or Harbin cohorts, or both. We used approximate Bayesian fine-mapping to determine the size of the 95% credible sets (cs) for the AF GWAS, CTSN eQTL and Harbin eQTL signals. The “Overlap 95%.cs.size” and “Variants in the overlapping sets” include variants found in the AF GWAS cs and in at least one of the eQTL study cs. ND, not determined because the eQTL signal is not significant for the sentinel and gene in this cohort.
1 = the SNP was prioritized in the study, 0 = it was not. Sum_overlap, number of studies prioritizing this variant.
BaseMean, mean normalized read count; L2FC, shrunken log2 fold change; FDR, false discovery rate. A positive L2FC indicates higher expression in hESC-CMs treated with the gRNA against the promoter of LINC01629. Empty FDR cells are due to lowly expressed genes producing NA in the DESeq2 differential expression analysis.
References
- 1. Hindricks G., Potpara T., Dagres N., Arbelo E., Bax J.J., Blomström-Lundqvist C., Boriani G., Castella M., Dan G.-A., Dilaveris P.E., et al. 2020 ESC Guidelines for the diagnosis and management of atrial fibrillation developed in collaboration with the European Association for Cardio-Thoracic Surgery (EACTS) The Task Force for the diagnosis and management of atrial fibrillation of the European Society of Cardiology (ESC) Developed with the special contribution of the European Heart Rhythm Association (EHRA) of the ESC. Eur. Heart J. 2021;42:373–498. doi: 10.1093/eurheartj/ehaa612.
- 2. Delling L.D., Mitchell S.V.E., Jane F.F., Myriam F., Sadiya S.K., Brett M.K., Kristen L.K., Tak W.K., Daniel T.L. Heart Disease and Stroke Statistics—2020 Update: A Report from the American Heart Association. Circulation. 2020;141:e139–e596. doi: 10.1161/CIR.0000000000000757.
- 3. Feinberg W.M., Blackshear J.L., Laupacis A., Kronmal R., Hart R.G. Prevalence, age distribution, and gender of patients with atrial fibrillation: analysis and implications. Arch. Intern. Med. 1995;155:469–473.
- 4. Colilla S., Crow A., Petkun W., Singer D.E., Simon T., Liu X. Estimates of current and future incidence and prevalence of atrial fibrillation in the US adult population. Am. J. Cardiol. 2013;112:1142–1147. doi: 10.1016/j.amjcard.2013.05.063.
- 5. Ayzenberg O., Swissa M., Shlezinger T., Bloch S., Katzir I., Chodick G., Caspi A., Vered Z. Atrial Fibrillation Ablation Success Rate-A Retrospective Multicenter Study. Curr. Probl. Cardiol. 2023;48:101161. doi: 10.1016/j.cpcardiol.2022.101161.
- 6. Xun A., True H.M., Kuipers M.F., YH L.G., de Groot Natasja M. Atrial fibrillation (Primer) Nat. Rev. Dis. Prim. 2022;8. doi: 10.1038/s41572-022-00347-9.
- 7. Weng L.-C., Choi S.H., Klarin D., Smith J.G., Loh P.-R., Chaffin M., Roselli C., Hulme O.L., Lunetta K.L., Dupuis J., et al. Heritability of atrial fibrillation. Circ. Cardiovasc. Genet. 2017;10:e001838. doi: 10.1161/CIRCGENETICS.117.001838.
- 8. Nielsen J.B., Thorolfsdottir R.B., Fritsche L.G., Zhou W., Skov M.W., Graham S.E., Herron T.J., McCarthy S., Schmidt E.M., Sveinbjornsson G., et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat. Genet. 2018;50:1234–1239. doi: 10.1038/s41588-018-0171-3.
- 9. Roselli C., Chaffin M.D., Weng L.-C., Aeschbacher S., Ahlberg G., Albert C.M., Almgren P., Alonso A., Anderson C.D., Aragam K.G., et al. Multi-ethnic genome-wide association study for atrial fibrillation. Nat. Genet. 2018;50:1225–1233. doi: 10.1038/s41588-018-0133-9.
- 10. Miyazawa K., Ito K., Ito M., Zou Z., Kubota M., Nomura S., Matsunaga H., Koyama S., Ieki H., Akiyama M., et al. Cross-ancestry genome-wide analysis of atrial fibrillation unveils disease biology and enables cardioembolic risk prediction. Nat. Genet. 2023;55:187–197. doi: 10.1038/s41588-022-01284-9.
- 11. Assum I., Krause J., Scheinhardt M.O., Müller C., Hammer E., Börschel C.S., Völker U., Conradi L., Geelhoed B., Zeller T., et al. Tissue-specific multi-omics analysis of atrial fibrillation. Nat. Commun. 2022;13:441. doi: 10.1038/s41467-022-27953-1.
- 12. Hsu J., Gore-Panter S., Tchou G., Castel L., Lovano B., Moravec C.S., Pettersson G.B., Roselli E.E., Gillinov A.M., McCurry K.R., et al. Genetic control of left atrial gene expression yields insights into the genetic susceptibility for atrial fibrillation. Circ. Genom. Precis. Med. 2018;11:e002107. doi: 10.1161/CIRCGEN.118.002107.
- 13. Hocker J.D., Poirion O.B., Zhu F., Buchanan J., Zhang K., Chiou J., Wang T.-M., Zhang Q., Hou X., Li Y.E., et al. Cardiac cell type–specific gene regulatory programs and disease risk association. Sci. Adv. 2021;7:eabf1444. doi: 10.1126/sciadv.abf1444.
- 14. Zhang K., Hocker J.D., Miller M., Hou X., Chiou J., Poirion O.B., Qiu Y., Li Y.E., Gaulton K.J., Wang A., et al. A single-cell atlas of chromatin accessibility in the human genome. Cell. 2021;184:5985–6001.e19. doi: 10.1016/j.cell.2021.10.024.
- 15. Selewa A., Luo K., Wasney M., Smith L., Tang C., Eckart H., Moskowitz I., Basu A., He X., Pott S. Single-cell genomics improves the discovery of risk variants and genes of cardiac traits. Preprint at medRxiv. 2022. doi: 10.1101/2022.02.02.22270312.
- 16. Li X., Garcia-Elias A., Benito B., Nattel S. The effects of cardiac stretch on atrial fibroblasts: analysis of the evidence and potential role in atrial fibrillation. Cardiovasc. Res. 2022;118:440–460. doi: 10.1093/cvr/cvab035.
- 17. Hulsmans M., Schloss M.J., Lee I.-H., Bapat A., Iwamoto Y., Vinegoni C., Paccalet A., Yamazoe M., Grune J., Pabel S., et al. Recruited macrophages elicit atrial fibrillation. Science. 2023;381:231–239. doi: 10.1126/science.abq3061.
- 18. Consortium G.T.E. The GTEx Consortium atlas of genetic regulatory effects across human tissues The Genotype Tissue Expression Consortium. Science. 2019;369:1318–1330. doi: 10.1126/science.aaz1776.
- 19. van Ouwerkerk A.F., Bosada F.M., van Duijvenboden K., Hill M.C., Montefiori L.E., Scholman K.T., Liu J., de Vries A.A.F., Boukens B.J., Ellinor P.T., et al. Identification of atrial fibrillation associated genes and functional non-coding variants. Nat. Commun. 2019;10:4755. doi: 10.1038/s41467-019-12721-5.
- 20. Wass S.Y., Offerman E.J., Sun H., Hsu J., Rennison J.H., Cantlay C.C., McHale M.L., Gillinov A.M., Moravec C., Smith J.D. Novel functional atrial fibrillation risk genes and pathways identified from coexpression analyses in human left atria. Heart Rhythm. 2023;20:1219–1226. doi: 10.1016/j.hrthm.2023.05.035.
- 21. Ghoussaini M., Mountjoy E., Carmona M., Peat G., Schmidt E.M., Hercules A., Fumis L., Miranda A., Carvalho-Silva D., Buniello A., et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 2021;49:D1311–D1320. doi: 10.1093/nar/gkaa840.
- 22. Boix C.A., James B.T., Park Y.P., Meuleman W., Kellis M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature. 2021;590:300–307. doi: 10.1038/s41586-020-03145-z.
- 23. Epstein C.B., Shoresh N., Adrian J., Kawli T., Davis C.A., Dobin A., Kaul R., Halow J., Van Nostrand E.L., Freese P. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699. doi: 10.1038/s41586-020-2493-4.
- 24. Montefiori L.E., Sobreira D.R., Sakabe N.J., Aneas I., Joslin A.C., Hansen G.T., Bozek G., Moskowitz I.P., McNally E.M., Nóbrega M.A. A promoter interaction map for cardiovascular disease genetics. Elife. 2018;7:e35788. doi: 10.7554/eLife.35788.
- 25. van Ouwerkerk A.F., Bosada F.M., Liu J., Zhang J., van Duijvenboden K., Chaffin M., Tucker N.R., Pijnappels D., Ellinor P.T., Barnett P., et al. Identification of functional variant enhancers associated with atrial fibrillation. Circ. Res. 2020;127:229–243. doi: 10.1161/CIRCRESAHA.119.316006.
- 26. Leblanc F.J.A., Lettre G. Major cell-types in multiomic single-nucleus datasets impact statistical modeling of links between regulatory sequences and target genes. Sci. Rep. 2023;13:3924. doi: 10.1038/s41598-023-31040-w.
- 27. Benaglio P., D’Antonio-Chronowska A., Ma W., Yang F., Young Greenwald W.W., Donovan M.K.R., DeBoever C., Li H., Drees F., Singhal S., et al. Allele-specific NKX2-5 binding underlies multiple genetic associations with human electrocardiographic traits. Nat. Genet. 2019;51:1506–1517. doi: 10.1038/s41588-019-0499-3.
- 28. Bosada F.M., van Duijvenboden K., Giovou A.E., Rivaud M.R., Uhm J.-S., Verkerk A.O., Boukens B.J., Christoffels V.M. An atrial fibrillation-associated regulatory region modulates cardiac Tbx5 levels and arrhythmia susceptibility. Elife. 2023;12:e80317. doi: 10.7554/eLife.80317.
- 29. Postma A.V., van de Meerakker J.B.A., Mathijssen I.B., Barnett P., Christoffels V.M., Ilgun A., Lam J., Wilde A.A.M., Lekanne Deprez R.H., Moorman A.F.M. A gain-of-function TBX5 mutation is associated with atypical Holt–Oram syndrome and paroxysmal atrial fibrillation. Circ. Res. 2008;102:1433–1442. doi: 10.1161/CIRCRESAHA.107.168294.
- 30. Kanemaru K., Cranley J., Muraro D., Miranda A.M.A., Ho S.Y., Wilbrey-Clark A., Patrick Pett J., Polanski K., Richardson L., Litvinukova M., et al. Spatially resolved multiomics of human cardiac niches. Nature. 2023;619:801–810. doi: 10.1038/s41586-023-06311-1.
- 31. Hou K., Ding Y., Xu Z., Wu Y., Bhattacharya A., Mester R., Belbin G.M., Buyske S., Conti D.V., Darst B.F., et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 2023;55:549–558. doi: 10.1038/s41588-023-01338-6.
- 32. Bai J., Lu Y., Lo A., Zhao J., Zhang H. PITX2 upregulation increases the risk of chronic atrial fibrillation in a dose-dependent manner by modulating IKs and ICaL—insights from human atrial modelling. Ann. Transl. Med. 2020;8:191. doi: 10.21037/atm.2020.01.90.
- 33. Wirka R.C., Gore S., Van Wagoner D.R., Arking D.E., Lubitz S.A., Lunetta K.L., Benjamin E.J., Alonso A., Ellinor P.T., Barnard J., et al. A common connexin-40 gene promoter variant affects connexin-40 expression in human atria and is associated with atrial fibrillation. Circ. Arrhythm. Electrophysiol. 2011;4:87–93. doi: 10.1161/CIRCEP.110.959726.
- 34. Perez-Hernandez M., Matamoros M., Barana A., Amoros I., Gomez R., Nunez M., Sacristan S., Pinto A., Fernandez-Aviles F., Tamargo J. Pitx2c increases in atrial myocytes from chronic atrial fibrillation patients enhancing I Ks and decreasing I Ca, L. Cardiovasc. Res. 2016;109:431–441. doi: 10.1093/cvr/cvv280.
- 35. Syeda F., Kirchhof P., Fabritz L. PITX2-dependent gene regulation in atrial fibrillation and rhythm control. J. Physiol. 2017;595:4019–4026. doi: 10.1113/JP273123.
- 36. Guo D.F., Li R.G., Yuan F., Shi H.Y., Hou X.M., Qu X.K., Xu Y.J., Zhang M., Liu X., Jiang J.Q., et al. TBX5 loss-of-function mutation contributes to atrial fibrillation and atypical Holt-Oram syndrome. Mol. Med. Rep. 2016;13:4349–4356. doi: 10.3892/mmr.2016.5043.
- 37. Kirchhoff S., Nelles E., Hagendorff A., Krüger O., Traub O., Willecke K. Reduced cardiac conduction velocity and predisposition to arrhythmias in connexin40-deficient mice. Curr. Biol. 1998;8:299–302. doi: 10.1016/s0960-9822(98)70114-9.
- 38. Gollob M.H., Jones D.L., Krahn A.D., Danis L., Gong X.-Q., Shao Q., Liu X., Veinot J.P., Tang A.S.L., Stewart A.F.R., et al. Somatic mutations in the connexin 40 gene (GJA5) in atrial fibrillation. N. Engl. J. Med. 2006;354:2677–2688. doi: 10.1056/NEJMoa052800.
- 39. Nelson M.R., Tipney H., Painter J.L., Shen J., Nicoletti P., Shen Y., Floratos A., Sham P.C., Li M.J., Wang J., et al. The support of human genetic evidence for approved drug indications. Nat. Genet. 2015;47:856–860. doi: 10.1038/ng.3314.
- 40. King E.A., Davis J.W., Degner J.F. Are drug targets with genetic support twice as likely to be approved? Revised estimates of the impact of genetic support for drug mechanisms on the probability of drug approval. PLoS Genet. 2019;15:e1008489. doi: 10.1371/journal.pgen.1008489.
- 41. Tchou G., Ponce-Balbuena D., Liu N., Gore-Panter S., Hsu J., Liu F., Opoku E., Brubaker G., Schumacher S.M., Moravec C.S., et al. Decreased FAM13B expression increases atrial fibrillation susceptibility by regulating sodium current and calcium handling. JACC. Basic Transl. Sci. 2023;8:1357–1378. doi: 10.1016/j.jacbts.2023.05.009.
- 42. Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
- 43. Morabito S., Reese F., Rahimzadeh N., Miyoshi E., Swarup V. High dimensional co-expression networks enable discovery of transcriptomic drivers in complex biological systems. Preprint at bioRxiv. 2022. doi: 10.1101/2022.09.22.509094.
- 44. Bray N.L., Pimentel H., Melsted P., Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 2016;34:525–527. doi: 10.1038/nbt.3519.
- 45. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 46. Germain P.-L., Lun A., Meixide C.G., Macnair W., Robinson M.D. Doublet identification in single-cell sequencing data using scDblFinder. F1000Res. 2021;10:979. doi: 10.12688/f1000research.73600.1.
- 47. Stephens M. False discovery rates: a new deal. Biostatistics. 2017;18:275–294. doi: 10.1093/biostatistics/kxw041.
- 48. Zhou H.J., Li L., Li Y., Li W., Li J.J. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome Biol. 2022;23:210. doi: 10.1186/s13059-022-02761-4.
- 49. Stuart T., Srivastava A., Madad S., Lareau C.A., Satija R. Single-cell chromatin state analysis with Signac. Nat. Methods. 2021;18:1333–1341. doi: 10.1038/s41592-021-01282-5.
- 50. Litviňuková M., Talavera-López C., Maatz H., Reichart D., Worth C.L., Lindberg E.L., Kanda M., Polanski K., Heinig M., Lee M., et al. Cells of the adult human heart. Nature. 2020;588:466–472. doi: 10.1038/s41586-020-2797-4.
- 51. Lian X., Hsiao C., Wilson G., Zhu K., Hazeltine L.B., Azarin S.M., Raval K.K., Zhang J., Kamp T.J., Palecek S.P. Robust cardiomyocyte differentiation from human pluripotent stem cells via temporal modulation of canonical Wnt signaling. Proc. Natl. Acad. Sci. USA. 2012;109:E1848–E1857. doi: 10.1073/pnas.1200250109.
- 52. Shabalin A.A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics. 2012;28:1353–1358. doi: 10.1093/bioinformatics/bts163.
- 53. Willer C.J., Li Y., Abecasis G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 54. Giambartolomei C., Vukcevic D., Schadt E.E., Franke L., Hingorani A.D., Wallace C., Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383.
- 55. Wellcome Trust Case Control Consortium. Maller J.B., McVean G., Byrnes J., Vukcevic D., Palin K., Su Z., Howson J.M.M., Auton A., Myers S., et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 2012;44:1294–1301. doi: 10.1038/ng.2435.
- 56. Ulirsch J.C., Lareau C.A., Bao E.L., Ludwig L.S., Guo M.H., Benner C., Satpathy A.T., Kartha V.K., Salem R.M., Hirschhorn J.N., et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 2019;51:683–693. doi: 10.1038/s41588-019-0362-6.
- 57. Vuckovic D., Bao E.L., Akbari P., Lareau C.A., Mousas A., Jiang T., Chen M.-H., Raffield L.M., Tardaguila M., Huffman J.E., et al. The polygenic and monogenic basis of blood traits and diseases. Cell. 2020;182:1214–1231.e11. doi: 10.1016/j.cell.2020.08.008.
- 58. Chen M.-H., Raffield L.M., Mousas A., Sakaue S., Huffman J.E., Moscati A., Trivedi B., Jiang T., Akbari P., Vuckovic D., et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell. 2020;182:1198–1213.e14. doi: 10.1016/j.cell.2020.06.045.
- 59. Rezwani M., Pourfathollah A.A., Noorbakhsh F. rbioapi: user-friendly R interface to biologic web services’ API. Bioinformatics. 2022;38:2952–2953. doi: 10.1093/bioinformatics/btac172.
Data Availability Statement
• The sn-multiome and LINC01629 RNAseq data discussed in this publication have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession numbers GSE238242 and GSE271839, respectively. The bulk RNAseq data (gene count tables) for the CTSN and Harbin cohorts are also available at: http://www.mhi-humangenetics.org/en/resources/.
• The code to reproduce the results in this article is available at: https://github.com/lebf3/AF_V2G_multiome.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.