
Keywords: congenital heart disease, genetics, mutations, pathway analysis, single-cell transcriptomics
Abstract
Congenital heart disease (CHD) is one of the most prevalent neonatal congenital anomalies. To catalog the putative candidate CHD risk genes, we collected 16,349 variants [single-nucleotide variants (SNVs) and indels] impacting 8,308 genes in 3,166 CHD cases for a comprehensive meta-analysis. Using American College of Medical Genetics (ACMG) guidelines, we excluded the 0.1% of benign/likely benign variants and the resulting dataset consisted of 83% predicted loss of function variants and 17% missense variants. Seventeen percent were de novo variants. A stepwise analysis identified 90 variant-enriched CHD genes, of which six (GPATCH1, NYNRIN, TLCD2, CEP95, MAP3K19, and TTC36) were novel candidate CHD genes. Single-cell transcriptome cluster reconstruction analysis on six CHD tissues and four controls revealed upregulation of the top 10 frequently mutated genes primarily in cardiomyocytes. NOTCH1 (highest number of variants) and MYH6 (highest number of recurrent variants) expression was elevated in endocardial cells and cardiomyocytes, respectively, and 60% of these gene variants were associated with tetralogy of Fallot and coarctation of the aorta, respectively. Pseudobulk analysis using the single-cell transcriptome revealed significant (P < 0.05) upregulation of both NOTCH1 (endocardial cells) and MYH6 (cardiomyocytes) in the control heart data. We observed nine different subpopulations of CHD heart cardiomyocytes of which only four were observed in the control heart. This is the first comprehensive meta-analysis combining genomics and CHD single-cell transcriptomics, identifying the most frequently mutated CHD genes, and demonstrating CHD gene heterogeneity, suggesting that multiple genes contribute to the phenotypic heterogeneity of CHD. Cardiomyocytes and endocardial cells are identified as major CHD-related cell types.
NEW & NOTEWORTHY Congenital heart disease (CHD) is one of the most prevalent neonatal congenital anomalies. We present a comprehensive analysis combining genomics and CHD single-cell transcriptomics. Our study identifies 90 potential candidate CHD risk genes, of which 6 are novel. The risk genes have heterogeneous expression, suggesting that multiple genes contribute to the phenotypic heterogeneity of CHD. Cardiomyocytes and endocardial cells are identified as major CHD-related cell types.
Listen to this article’s corresponding podcast at https://apspublicationspodcast.podbean.com/e/cellular-diversity-in-congenital-heart-disease/.
INTRODUCTION
Congenital heart disease (CHD) is a common and often severe birth defect with structural and functional consequences for the heart and great vessels (1). CHD is one of the leading causes of morbidity and mortality in children around the world. CHD can co-occur with neurodevelopmental disorders (2). Due to significant advances in disease identification and improved medical and surgical care, there is a rapid increase in the number of adults living with CHD (3). There are a number of different types of CHD affecting either the heart’s structure, its function, or both. The condition can range from mild to critical, with the latter having a significant impact on the life expectancy of the infant. Tetralogy of Fallot (TOF), coarctation of the aorta (CoA), and hypoplastic left heart syndrome (HLHS) are a few of the critical forms prevalent among patients with CHD. TOF, for example, results in abnormal heart development and cyanosis and consists of four defects: ventricular septal defect, pulmonary stenosis, overriding aorta, and right ventricular hypertrophy. In CoA, the aorta is constricted, reducing the blood flow to lower parts of the body. HLHS is another severe defect in which the left side of the heart is not fully developed. All three of these and many other critical conditions require a series of surgical repairs to correct the defects (1, 4). Depending on the anatomy of the defect, CHDs are divided into different subtypes. This anatomic heterogeneity further adds to the complexity of the disease. Because of this heterogeneity, it is important to understand the diverse molecular and cellular pathophysiology and etiology of CHD with the aim of improving CHD diagnosis and therapy.
Environmental risk factors including maternal alcohol consumption or consumption of antiretroviral or anticonvulsant drugs, maternal nutritional deficiencies, and maternal factors such as gestational diabetes, maternal obesity, rubella, or other viral infections contribute to around 10% of CHD cases (5–7). The majority of CHD cases are presumably attributable to genetic predispositions as indicated by epidemiological studies (1, 8). Specific genetic causes of syndromic CHD have been identified such as ELN gene variants leading to Williams-Beuren Syndrome, TBX5 variants in Holt-Oram syndrome, variants in PTPN11, SOS1, RAF1, KRAS, BRAF, MEK1, MEK2, and HRAS frequently observed in Noonan syndrome, and FBN1 variants in Marfan syndrome. Advances in sequencing technology, such as high-throughput genome sequencing, have further enabled the discovery of new candidate genes due to rare variants that are likely to contribute to nonsyndromic CHD (9–11). Of these, different types of gene families have been implicated: transcription factors, signaling pathways, chromatin modifiers, ciliary function, and structural genes and proteins have been primarily associated with CHD (1, 6, 12, 13).
Despite the identification of an increasing number of genetic risk factors for CHD, the majority of congenital heart patients remain unexplained due to genetic heterogeneity, incomplete penetrance, and the possibility of oligogenic or polygenic contributions (14, 15). The recent advances in single-cell sequencing technologies allow the assessment of cellular heterogeneity and can as such identify molecular markers in specific cell subsets that drive disease (2, 16–19). Therefore, this study sought to establish a link between the genotype, the phenotype, and the cell type of CHD (Fig. 1). To further understand the molecular etiology of CHD, we constructed a database with the clinically relevant rare variants associated with CHD and used single-cell transcriptome data from CHD tissue and healthy individuals for a comprehensive meta-analysis. We identified 90 candidate genes harboring multiple variants in unrelated individuals with CHD, six of which had not been reported previously. We then mapped the single-cell transcriptional profiles of the CHD heart cells to identify specific CHD-associated cell types.
Figure 1.

An overview of the analysis framework for identifying CHD candidate genes from genotypes and characterizing their CHD phenotypes and cell types. CHD, congenital heart disease.
METHODS
Ethical Statement
This study used only published data from deidentified original genomic and single-cell transcriptomic studies and did not require ethical review or approval.
Data Collection
A literature search was performed on PubMed and Google Scholar using the key MeSH terms “Heart Defects, Congenital” AND “Variant” AND “Cohort” to extract all possibly pertinent articles published between 2000 and 2020. Screening consisted of two steps, namely, title and abstract screening and full-text screening. English-language articles reporting new or known clinically relevant variants for CHD were included. Duplicate publications and publications that did not match the inclusion criteria were excluded. Secondary referencing (cross-referencing the selected articles) was used to include additional studies. Based on 29 studies published between January 2000 and December 2020, we identified 16,349 CHD-associated genetic variants (Supplemental Tables S1 and S2 and Fig. 1).
Variant Annotation and ACMG Classification
ANNOVAR (ANNOtate VARiation) software was used to annotate variants against genome build hg38 on our in-house server (AMD EPYC 7402 24-Core Processor, 32GB RAM) (20). Of the total variants, 16,266 were exon/splicing variants, from which 60 variants with an ExAC_ALL frequency of > 0.001 were removed, retaining only rare exonic variants. Next, we filtered out the unknown and synonymous variants and used only the remaining 15,315 variants for the next steps. Based on American College of Medical Genetics (ACMG) guidelines, we then classified the variants as benign/likely benign (B/LB), pathogenic/likely pathogenic (P/LP), or variants of uncertain significance (VOUS) (21). A variant was considered B/LB if it was reported to be nondamaging by ClinVar and had a benign score as calculated by Combined Annotation-Dependent Depletion (CADD) and Sorting Intolerant From Tolerant (SIFT). Variants were considered P/LP if they were rare, had a CHD phenotype, and were considered deleterious according to one of the following three rules: 1) recurrent variants, 2) P/LP by ClinVar (database of human variants and clinical phenotype with supporting evidence)/CADD and SIFT (computational predictions of deleteriousness) and de novo, and 3) P/LP by ClinVar and CADD and SIFT. The remainder of the variants were classified as VOUS. Descriptive statistics were performed on the filtered pathogenic/likely pathogenic and VOUS data (Fig. 1).
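The classification rules above can be sketched as a small decision function. This is a minimal illustration rather than the authors' pipeline: the `Variant` fields are hypothetical boolean flags, the CADD and SIFT score thresholds are not given in the text and are therefore left outside the code, and the "/" in rule 2 is read as "ClinVar or in silico evidence", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Evidence flags for one variant; every field name is hypothetical."""
    rare: bool               # passed the population-frequency filter
    chd_phenotype: bool      # carrier has a CHD phenotype
    recurrent: bool          # observed in more than one unrelated individual
    de_novo: bool
    clinvar_plp: bool        # ClinVar reports pathogenic/likely pathogenic
    clinvar_benign: bool     # ClinVar reports the variant as nondamaging
    insilico_damaging: bool  # CADD and SIFT both predict a deleterious effect
    insilico_benign: bool    # CADD and SIFT both give a benign score


def classify(v: Variant) -> str:
    """Return 'B/LB', 'P/LP', or 'VOUS' following the three rules in the text."""
    if v.clinvar_benign and v.insilico_benign:
        return "B/LB"
    if v.rare and v.chd_phenotype:
        rule1 = v.recurrent
        # Assumption: "/" in rule 2 means either line of evidence suffices.
        rule2 = (v.clinvar_plp or v.insilico_damaging) and v.de_novo
        rule3 = v.clinvar_plp and v.insilico_damaging
        if rule1 or rule2 or rule3:
            return "P/LP"
    return "VOUS"
```

Under this reading, a rare de novo variant with damaging CADD and SIFT predictions but no ClinVar record is P/LP by rule 2, whereas the same variant inherited from a parent falls back to VOUS.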
Prioritizing and Identifying CHD Genes
Genes with variants (recurrent or multiple) in three or more unrelated individuals were prioritized as potential CHD genes. In the case of recurrent variants, gnomAD frequency was used to consider the variant as rare (fewer variants at this genic position in gnomAD compared with our CHD database). All recurrent variants were extremely rare and highly significant (all P < 0.0008; Fisher exact test). Next, we scrutinized this gene set to identify known and novel genes associated with CHD. To this end, we constructed a database consisting of 275 CHD-associated genes combining literature (22–24) and the CHDgene (25) database. We overlapped our gene set with this database and the Online Mendelian Inheritance in Man (OMIM) database (26) to identify known CHD genes (Fig. 1). We next performed descriptive statistics (using the R programming language) on this prioritized candidate CHD gene set and analyzed the variant type and inheritance associated with these genes.
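The prioritization rule (variants in three or more unrelated individuals) amounts to counting distinct carriers per gene. The sketch below assumes a hypothetical flat record layout of `(gene, individual_id, variant_key)` tuples and assumes that relatedness has already been resolved upstream, so that each `individual_id` represents an unrelated person.

```python
from collections import defaultdict


def prioritize_genes(records, min_individuals=3):
    """Return genes carrying variants in at least `min_individuals` distinct people.

    records: iterable of (gene, individual_id, variant_key) tuples (hypothetical layout).
    """
    carriers = defaultdict(set)
    for gene, individual, _variant in records:
        carriers[gene].add(individual)  # a set, so each person counts once per gene
    return sorted(g for g, ids in carriers.items() if len(ids) >= min_individuals)
```

Counting individuals rather than variants matters here: a gene with several variants in the same person is not prioritized, whereas one recurrent variant seen in three people is.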
Single-Cell Transcriptome Analysis
CHD single-nucleus RNA-sequencing (RNA-Seq) data from six pediatric CHD heart samples (three TOF and three HLHS) and four control pediatric hearts (3) were used to determine the cellular heterogeneity of the identified CHD genes (Fig. 1). Four of the six CHD samples were from males (aged 2 mo to 4 yr) and two from females (7–8 mo). One male (age 11 yr) and three female (3 yr–11 yr) control samples were taken from terminal organ donors whose hearts could not be transplanted due to technical difficulties. In this work, right and left ventricle tissue samples were analyzed. The expression matrix and metadata were extracted from GSE203274 (3). After normalization, scaling, identification of variable genes, and principal component analysis (PCA), 20 principal components (PCs) were used to calculate clusters using the unsupervised Louvain clustering method from the Seurat package following batch correction with Harmony (27). A total of 16 clusters were identified at 0.1 resolution. Cluster identity was mapped using cardiac cell markers from the literature (3, 28). After clustering, the 53,208 and 54,261 cells that made up the CHD and control data, respectively, were isolated and examined separately. The combined gene set activity was computed using the “AddModuleScore” function, and gene expression was analyzed using feature plots. Furthermore, we specifically analyzed the subset of cardiomyocytes and reclustered them using similar steps. Differentially expressed genes (DEGs) were calculated for each cluster using the “FindMarkers” function from Seurat, and the most restrictive positive and negative marker genes were identified using the highest (positive) and lowest (negative) fold change (avg_log2FC) values for the CHD transcriptome. These markers were used for marking both CHD and control heart cardiomyocyte subclusters. A Student’s t test was used to compare gene expression between control and CHD samples.
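To make the combined gene set activity more concrete, the following sketch shows the idea behind a module score: the mean expression of the gene set in each cell minus the mean expression of control genes drawn from the same average-expression bins. It is a simplified pure-Python illustration and not Seurat's implementation; the function name, the `expr` dictionary layout, and the small default bin and control counts are assumptions chosen for toy data.

```python
import random
from statistics import mean


def module_score(expr, gene_set, n_bins=4, n_ctrl=2, seed=0):
    """Per-cell gene set score: mean(target genes) - mean(matched control genes).

    expr: dict mapping gene -> list of normalized expression values, one per cell.
    """
    rng = random.Random(seed)
    targets = [g for g in gene_set if g in expr]
    if not targets:
        raise ValueError("none of the gene-set genes are in the expression table")
    # Rank all genes by average expression and cut the ranking into bins.
    ranked = sorted(expr, key=lambda g: mean(expr[g]))
    bin_size = -(-len(ranked) // n_bins)  # ceiling division
    bins = [ranked[i:i + bin_size] for i in range(0, len(ranked), bin_size)]
    # For every target gene, draw control genes from its own expression bin.
    controls = []
    for g in targets:
        home_bin = next(b for b in bins if g in b)
        pool = [c for c in home_bin if c not in targets]
        controls.extend(rng.sample(pool, min(n_ctrl, len(pool))))
    n_cells = len(expr[targets[0]])
    scores = []
    for i in range(n_cells):
        target_mean = mean(expr[g][i] for g in targets)
        control_mean = mean(expr[g][i] for g in controls) if controls else 0.0
        scores.append(target_mean - control_mean)
    return scores
```

Subtracting expression-matched controls keeps the score from simply tracking how many transcripts a cell contains, which is why a positive score can be read as enrichment of the gene set in that cell.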
Pathway Enrichment and Expression Mapping
Gene enrichment analyses were performed using the GeneOverlap package in R, followed by a Cytoscape analysis to explore the pathways involved (Fig. 1). Gene overlap was limited to eight, and false discovery rates (FDRs) and P values were cut off at 0.01 and 0.001, respectively. Kyoto Encyclopedia of Genes and Genomes (KEGG) (29) and Gene Ontology (GO) (30) databases were used for both gene enrichment and Cytoscape analysis. Nodes were colored from orange to red according to P value and sized according to odds ratio. Next, we computed the number of genes overlapping between each of the identified pathways and single-cell clusters. The top 20 DEGs (greater than threefold change) from the clusters were used. An overlap heatmap indicating the intersection of the network genes and CHD and control cluster DEGs was produced in R.
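Gene-list overlap enrichment of this kind is usually a one-sided Fisher exact test on a 2 × 2 table. The sketch below is a standard-library illustration of that test, not a reimplementation of the R package: the function names and the `universe_size` argument are hypothetical, the odds ratio is the simple sample odds ratio, and the Benjamini-Hochberg adjustment is an assumption because the text does not state which FDR procedure was used.

```python
from math import comb


def overlap_enrichment(query, pathway, universe_size):
    """One-sided Fisher exact test for over-representation of `pathway` in `query`.

    Returns (overlap count, sample odds ratio, P value).
    """
    query, pathway = set(query), set(pathway)
    k = len(query & pathway)
    a, b = k, len(query) - k           # query genes inside / outside the pathway
    c = len(pathway) - k               # pathway genes not in the query
    d = universe_size - a - b - c      # genes in neither list
    # Hypergeometric upper tail: probability of an overlap of k or more by chance.
    tail = sum(
        comb(len(pathway), i) * comb(universe_size - len(pathway), len(query) - i)
        for i in range(k, min(len(query), len(pathway)) + 1)
    )
    p_value = tail / comb(universe_size, len(query))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return k, odds_ratio, p_value


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):       # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In practice each pathway would be tested against the candidate gene list, the resulting P values adjusted together, and the pathways passing both cutoffs carried forward to network construction.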
Databases and Tools Used
The following databases and tools were used: Databases: PubMed (https://pubmed.ncbi.nlm.nih.gov/ - RRID:SCR_004846),
Google Scholar (https://scholar.google.com/ - RRID:SCR_008878),
CHDgene (https://chdgene.victorchang.edu.au/),
ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/ - RRID:SCR_006169),
gnomAD v2.1.1 (https://gnomad.broadinstitute.org/ - RRID:SCR_014964),
OMIM (https://www.omim.org/ - RRID:SCR_006437),
GEO (https://www.ncbi.nlm.nih.gov/geo/ - RRID:SCR_004584),
KEGG (https://www.genome.jp/kegg/ - RRID:SCR_012773),
GO (https://geneontology.org/ - RRID:SCR_002811).
Tools: ANNOVAR (RRID:SCR_012821), SIFT (RRID:SCR_012813), CADD (RRID:SCR_018393), R 4.0.2 (RRID:SCR_001905), Cytoscape (version 3.8 - RRID:SCR_003032), Seurat (version 4.1.0 - RRID:SCR_007322). Figures were made using different software tools and edited in MS PowerPoint and Adobe Illustrator (RRID:SCR_010279).
RESULTS
CHD-Variant Database Construction and ACMG Classification
In this study, the combined, curated CHD dataset consisted of 16,349 variants from 3,166 samples collected from 29 different articles. Different forms of CHDs were included, which were broadly classified into four categories following the Jin et al. classification scheme (31). In the combined dataset, the distribution of CHD forms was as follows: conotruncal defects (CTD 39%), left ventricular outflow tract obstruction (LVO 26%), heterotaxy (HTX 9%), and others (22%) including pulmonary valve abnormalities, anomalous pulmonary venous drainage, atrial septal defects (ASDs), atrioventricular canal defects, double inlet left ventricle (DILV), and tricuspid valve atresia (TA), and 4% where complete information was not available for classification (NA) (Fig. 2A). Specifically, CTD represents defects such as TOF, double-outlet right ventricle (DORV), truncus arteriosus, membranous ventricular septal defects (VSD), and aortic arch abnormalities. LVO includes HLHS, CoA, and aortic stenosis/bicuspid aortic valve (AS/BAV). HTX includes situs abnormalities such as dextrocardia, left or right isomerism (LAI, RAI) as the major malformation, and may include other defects such as L-transposition of the great arteries (L-TGA), atrioventricular canal defects (AVC), anomalous pulmonary venous drainage (TAPVR, PAPVR), and double outlet right ventricle (31). Variants were present in all of the 22 autosomes and the X chromosome (Fig. 2B). The variants were annotated with ANNOVAR and processed (as explained in methods) retaining 15,315 variants for further analysis. ACMG criteria were used to classify variants into B/LB, P/LP, and VOUS. Accordingly, 13 B/LB variants were removed and the final set of 15,302 variants consisting of 2,305 P/LP (15.1%) and 12,997 VOUS (84.9%) variants was retained (Fig. 2C). These variants were present in 8,308 genes across 3,086 individual CHD samples. 
The annotated variants were categorized into three major groups: loss-of-function variants (LOF: frameshift substitution, stop gain/loss, and splicing), missense variants (nonsynonymous SNVs), and nonframeshift variants. As shown in Fig. 2D, the majority of variants were LOF (12,642, 82.6%), followed by missense (2,593, 17.0%) and nonframeshift (67, 0.4%). Of the 14,211 variants for which family history was available, 17% were de novo and the remainder were inherited (Fig. 2E). Sex and age were not reported individually for a majority of the samples (Supplemental Table S2). Four percent of all variants were accounted for by 293 recurrent variants (Fig. 2F). These 293 variants were divided into two groups: 1) the same variant repeated at the same position and 2) different variants occurring at the same position. Nine positions fell into the latter group, as shown in Table 1.
Figure 2.

Characterization of the CHD mutation database. A: pie chart showing the type of CHD as a percentage (for 3,166 individuals) following the Jin et al. (31) classification scheme. B: percentage of mutations distributed across the chromosomes. C: CHD mutations categorized as pathogenic/likely pathogenic (P/LP) and variant of uncertain significance (VOUS) using ACMG guidelines. D: graph depicting the percentages (y-axis) of total number of mutations where red and yellow represent loss of function (LOF) and missense mutations, respectively. E: pie chart showing the percentage of familial and de novo mutations (excluding the NA data). F: recurrent mutation distribution. x-axis shows the number of repeats, and y-axis indicates the percent of mutations. ACMG, American College of Medical Genetics; CHD, congenital heart disease.
Table 1.
List of clinically relevant recurrent variants in congenital heart disease
| Number | Chr | Start | Stop | Ref* | Alt* | Amino Acid | VT** | Occurrences | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | 25952279 | 25952286 | GTAAGTCA | G | | Splice | 1 | MYO3A |
| 1 | 10 | 25952279 | 25952279 | G | GT | | Splice | 1 | MYO3A |
| 2 | 10 | 28125008 | 28125009 | AC | A | | Splice | 1 | MPP7 |
| 2 | 10 | 28125008 | 28125008 | A | G | | Splice | 1 | MPP7 |
| 3 | 16 | 31139213 | 31139215 | TGG | T | P728Rfs*28 | FS | 1 | PRSS36 |
| 3 | 16 | 31139213 | 31139213 | T | TG | E729Rfs*28 | FS | 1 | PRSS36 |
| 4 | 6 | 43072242 | 43072242 | G | A | R415Q | MS | 1 | KLC4 |
| 4 | 6 | 43072242 | 43072242 | G | GGTCCC | R418Pfs*35 | FS | 1 | KLC4 |
| 5 | 2 | 44312754 | 44312754 | G | A | | Splice | 1 | SLC3A1 |
| 5 | 2 | 44312754 | 44312754 | G | T | | Splice | 1 | SLC3A1 |
| 6 | 9 | 77425561 | 77425561 | C | T | | Splice | 1 | GNA14 |
| 6 | 9 | 77425561 | 77425561 | C | G | | Splice | 1 | GNA14 |
| 7 | 5 | 141644646 | 141644646 | A | C | Y523X | SG | 1 | FCHSD1 |
| 7 | 5 | 141644646 | 141644646 | A | AT | Y523* | SG | 1 | FCHSD1 |
| 8 | 5 | 180628896 | 180628896 | C | CG | P364Afs*63 | FS | 1 | FLT4 |
| 8 | 5 | 180628896 | 180628897 | CG | C | P363Rfs*27 | FS | 1 | FLT4 |
| 9 | 12 | 112477720 | 112477720 | A | G | N308S | MS | 3 | PTPN11 |
| 9 | 12 | 112477720 | 112477720 | A | C | N308T | MS | 1 | PTPN11 |
*Ref, allele in the reference genome; Alt*, mutated allele at the same locus. Mutations resulted in the sequence in the “Ref” column being replaced by the sequence in the “Alt” column.
**VT is the variant type: Splice, splicing; FS, frameshift; MS, missense; and SG, stopgain.
CHD Variant-Enriched Gene Detection
Of the 8,308 genes with P/LP or VOUS variants, we retained the genes that were mutated in more than two individuals. Two sets of genes were retained: 1) multiple variants in the same gene in different individuals (>2) and 2) recurrent variants in different individuals (>2). We confirmed the rare occurrence of these recurrent variants in gnomAD by their significant probabilities, with P values in the range of 0.0008–4.16e−88 (Fisher exact test). In this step, we identified 90 genes as CHD candidate genes (Supplemental Table S3 and Fig. 3D). Based on the functional annotation of each gene, we categorized them into signaling pathway-related genes (n = 31), structural genes (n = 25) related to muscle, cilia, extracellular matrix, membrane, or microtubules, nucleic acid-binding genes (n = 13), transcription factors/regulators (n = 10), ion channel-related genes (n = 9), and growth factor-related genes (n = 2) (Supplemental Table S3). Next, we overlapped these 90 genes with the CHD database and OMIM (Supplemental Table S4) to identify novel genes that had not been previously associated with CHD. A total of six novel candidate CHD genes were identified, namely, GPATCH1, NYNRIN, TLCD2, CEP95, MAP3K19, and TTC36 (Fig. 3A). We further analyzed the 10 most frequently mutated genes of these CHD genes. As shown in Fig. 3B, NOTCH1 was mutated in 60 different individuals with 62 variants, followed by 45 TTN and 41 MYH6 variants in 36 different individuals, respectively. FLT4, OBSCN, PTPN11, KMT2D, CHD7, DNAH5, and GDF1 genes were mutated in 33, 29, 29, 18, 15, 15, and 14 different individuals, respectively. For the MYH6 gene, the maximum number (24) of recurrent variants was identified at position 2161 (C > T) in exon 18. Figure 3C shows the distribution of de novo and familial cases for each of these genes (depending on availability of family history).
Figure 3.

Candidate CHD genes and their characterization. A: Venn diagram depicting the overlap between OMIM and 90 prioritized candidate CHD genes. The non-OMIM genes, which do not have any prior association to CHD, are listed to the right. B: graph depicting the number of mutations (y-axis) in the top 10 hypermutated CHD genes. C: familial and de novo mutation distribution in the top 10 genes. D: chromosome map showing the description of CHD genes. The map was built using the “karyoploteR” package in R. CHD, congenital heart disease; OMIM, Online Mendelian Inheritance in Man.
Cell Type Mapping of CHD-Enriched Genes
To uncover the cellular identity of the CHD genes, we used the cardiac single-nucleus sequencing data (3), the only available CHD single-nucleus RNA-Seq data. It consisted of heart tissues from right and left ventricles of three TOF, three HLHS, one hypertrophic cardiomyopathy, two dilated cardiomyopathy, and four donated control pediatric hearts. The expression matrix (GSE203274) consisting of 157,293 cells was processed in Seurat. Once the data were normalized, scaled, batch corrected, and clustered, we segregated the CHD (53,207 cells from patients with TOF and HLHS, Fig. 4A) and control (54,260 cells from healthy donors, Fig. 4D) transcriptome for subsequent downstream analysis. Using standard cardiac cell markers, we identified 16 clusters (Supplemental Fig. S1). The major cell types and the overall topology were similar to Hill et al. (Fig. 4, A and D) (3). The 90 CHD genes exhibited heterogeneous expression patterns across cell types (Supplemental Fig. S2). As shown in Fig. 4, B and E, there was enrichment of the top 10 mutated (with >15 variants) genes in CHD cardiomyocytes, which was reduced in control cardiomyocytes. The CHD and control transcriptomes, however, were dominated by cardiomyocytes (54% and 52%, respectively), which were more enriched for CHD genes than other clusters (Fig. 4, B and E). Therefore, we subclustered and reanalyzed the cardiomyocytes from both datasets. First, we defined the cell identities for the CHD cardiomyocytes using the highest and lowest fold change and the percent of expression of the protein-coding DEGs. The same set of marker genes was used to distinguish the control cardiomyocytes. We observed nine subclusters in the CHD dataset: ALDOA+, TSHZ2+/KCNMB2−, EDIL3+/MYL7−, KCNMB4+/CPED1−, RASD1+/DOCK4−, OTOGL+/OBSCN−, UGT2B4+/KIF26B−, SLC26A3+/TTTY14−, and ARL17B+/TTTY14− (Fig. 4C and Supplemental Fig. S3). 
Four subclusters were observed for the control dataset: UGT2B4+/KIF26B−, TSHZ2+/KCNMB2−, ARL17B+/TTTY14−, and ALDOA+ (Fig. 4F and Supplemental Fig. S4). In both datasets, the combined gene set activity for the top 10 genes was ubiquitous throughout the subclusters (Fig. 4, C and F).
Figure 4.

CHD pediatric and control cardiac single-cell UMAP and CHD-enriched gene expression. A: overall UMAP plot for the CHD cardiac single-cell transcriptomic data, which involved 53,207 cells from six pediatric CHD heart samples. B: box plot showing the combined expression of the top 10 CHD genes. The red dashed line indicates the 90th percentile value (0.26). C: feature plot showing the combined expression of the top 10 CHD genes in the cardiomyocyte subsets of the CHD cardiac single-cell UMAP. The gene set activity was calculated using the AddModuleScore function of Seurat. The UMAP plot for the CHD cardiomyocytes was constructed using all the cardiomyocytes (29,166 cells), which were extracted and reclustered in Seurat resulting in nine clusters. The clusters were marked using the DEGs restricted to individual clusters where “+” denotes an upregulated gene (fold change > 1) and “−” indicates a downregulated gene (fold change < 0.5). D: overall UMAP plot for the control cardiac single-cell transcriptomic data, which involved 54,260 cells from four donated healthy heart samples. E: box plot showing the combined expression of the top 10 CHD genes in the control dataset. The red dashed line indicates the 90th percentile value (0.26). F: feature plot showing the combined expression of the top 10 CHD genes in the cardiomyocyte subsets of the control heart single-cell UMAP. The gene set activity was calculated using the AddModuleScore function of Seurat. The UMAP plot for the control cardiomyocytes was constructed using all the cardiomyocytes (28,508 cells), which were extracted and reclustered in Seurat. Identities of the CHD cardiomyocyte clusters were used to map those of the control cardiomyocyte clusters. CHD, congenital heart disease; DEGs, differentially expressed genes; UMAP, uniform manifold approximation and projection.
Next, we focused on the top two genes: NOTCH1, which had the highest number of variants, and MYH6, which had the highest number of recurrent variants in this dataset. More than 60% of NOTCH1 and MYH6 variants were associated with TOF and CoA, respectively (Fig. 5, A and E). Upregulation of NOTCH1 was observed in endothelial and endocardial cells, and that of MYH6 in cardiomyocytes (Fig. 5, B, C, F, and G). NOTCH1 was expressed in 42% and 36% of the endocardial cells in the CHD and control datasets, respectively, and in 25% and 30% of the endothelial cells in the CHD and control datasets, respectively (Fig. 5C), whereas MYH6 was expressed in 97% of CHD cardiomyocytes and 93% of control cardiomyocytes (Fig. 5G). Furthermore, we found that both NOTCH1 (P value = 4.4e−15) and MYH6 (P value = 0) expression was considerably higher in control samples compared with CHD (Fig. 5, D and H). We then analyzed the CHD and control cardiomyocyte subclusters and observed that MYH6 was upregulated in EDIL3+/MYL7−, OTOGL+/OBSCN−, UGT2B4+/KIF26B−, and ARL17B+/TTTY14− CHD subclusters (Fig. 5, I and J). Similar upregulation of MYH6 was observed in ALDOA+ and ARL17B+/TTTY14− control cardiomyocytes (Fig. 5, K and L).
Figure 5.

NOTCH1 and MYH6 mutations. A: phenotype distribution for NOTCH1 mutations. “CTD” represents conotruncal defects, “LVO” is left ventricular outflow tract obstruction, and “O” represents others. Listed at the top are the percentages and numbers of mutations (in parentheses) associated with each phenotype. Slant lines within the CTD bar indicate the number of mutations with the TOF phenotype (32). Dots within the LVO bar indicate the number of mutations with the CoA phenotype (6). B: feature plot comparing the restricted expression of NOTCH1 in the CHD (top) and control (bottom) dataset. C: comparison of NOTCH1 expression between CHD and control using percent expression (primary y-axis) and fold change (secondary y-axis). A fold change value of 1.5 is indicated by the red dashed line. Also shown on the bar graph is the maximum percentage of NOTCH1 expression. D: mean normalized expression of NOTCH1 (P value = 1.17e−6) in the CHD and control sample shown for endocardial cells. This plot was constructed using the top 10% expression data. *Significant P value (P < 0.05), which was calculated using Student’s t tests. E: phenotype distribution for MYH6 mutations. Listed at the top are the percentages and numbers of mutations (in parentheses) associated with each phenotype. Slant lines within the CTD bar indicate the number of mutations with the TOF phenotype (1). Dots within the LVO bar indicate the number of mutations with the CoA phenotype (29). F: feature plot comparing the restricted expression of MYH6 in the CHD (top) and control (bottom) dataset. G: comparison of MYH6 expression between CHD and control using percent expression (primary y-axis) and fold change (secondary y-axis). A fold change value of 1.5 is indicated by the red dashed line. Also shown on the bar graph is the maximum percentage of MYH6 expression. H: mean normalized expression of MYH6 (P value = 0) in the CHD and control sample shown for cardiomyocytes. 
This plot was constructed using the top 10% expression data. *Significant P value (P < 0.05), which was calculated using Student’s t tests. I and J: MYH6 expression in the cardiomyocyte subset of the CHD heart single-cell UMAP using feature plot (I) and bar chart and scatter plot (J). The bold red arrows on the feature plot indicate downregulation. A fold change value of 1 is indicated by the red dashed line on the graph. K and L: MYH6 expression in the cardiomyocyte subset of the control heart single-cell UMAP using feature plot (K) and bar chart and scatter plot (L). The bold red arrows on the feature plot indicate downregulation. A fold change value of 1 is indicated by the red dashed line on the graph. CoA, coarctation of the aorta; TOF, tetralogy of Fallot; UMAP, uniform manifold approximation and projection.
We also analyzed the expression of the six novel, non-OMIM CHD genes in the CHD and control datasets (Supplemental Fig. S5). Specific clusters of the CHD dataset were enriched for CEP95 (50% of valvar cells), GPATCH1 (7% of T cells), MAP3K19 (12% of adipocytes), NYNRIN (2% of endothelial cells), TLCD2 (2% of adipocytes), and TTC36 (2% of cardiomyocytes). In the control dataset, 45% of cardiomyocyte clusters were enriched with CEP95, 17% of adipocyte clusters were enriched with GPATCH1, 4% of endocardial clusters were enriched with MAP3K19, 3.7% of endothelial clusters were enriched with NYNRIN, 1.8% of adipocyte clusters were enriched with TLCD2, and 1.5% of adipocyte clusters were enriched with TTC36.
Pathway Analysis and Expression Mapping
The GeneOverlap package in R was used to compute the enrichment of the 90 CHD genes. The resulting 247 pathways had P values between 5.14e−17 and 0.007 and FDR values between 1.84e−14 and 0.01. The top two pathways were GO:0007507 (heart development) and GO:0072359 (circulatory system development). Fourteen networks were detected, of which the largest was heart development and the smallest were cell motility regulation and extracellular matrix (Fig. 6A and Supplemental Table S5). Next, we overlapped the gene list from each network with the individual cell clusters from the CHD and control single-cell transcriptome datasets using the top 20 DEGs (greater than threefold change) from the clusters (Fig. 6, B and C). For both the CHD (14 genes) and control (13 genes) datasets, we observed maximum overlap between the extracellular matrix network (N1) and the valvar cluster. Similarly, an overlap of 10 genes for the CHD dataset and 11 genes for the control was observed between the heart development network (N3) and the valvar cluster. A prominent overlap was detected between cardiomyocytes and the cardiac muscle development network (N2; 10 genes in both CHD and donor) and the heart development network (10 genes in CHD and 11 genes in donor). Also, 10 genes from the smooth muscle cluster of the CHD dataset overlapped with the heart development network. Across the CHD and control datasets, the interaction pattern was very similar, with the strongest overlap found with the heart development network.
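The network-by-cluster comparison described above reduces to counting shared genes for every pair, which yields the matrix drawn as a heatmap. A minimal sketch follows; the function name and the dictionary layout are assumptions, and the gene identifiers in the usage example are placeholders rather than results from this study.

```python
def overlap_matrix(networks, cluster_degs):
    """Count shared genes for every (cluster, network) pair.

    networks: dict mapping network name -> iterable of genes in that network.
    cluster_degs: dict mapping cluster name -> iterable of top DEGs for that cluster.
    Returns a nested dict: matrix[cluster][network] -> number of shared genes.
    """
    return {
        cluster: {
            name: len(set(degs) & set(genes)) for name, genes in networks.items()
        }
        for cluster, degs in cluster_degs.items()
    }


# Toy usage with placeholder identifiers (not data from the study).
networks = {"N1": ["geneA", "geneB", "geneC"], "N3": ["geneC", "geneD"]}
cluster_degs = {"valvar": ["geneA", "geneC", "geneX"], "cardiomyocyte": ["geneD"]}
matrix = overlap_matrix(networks, cluster_degs)
```

Each row of `matrix` corresponds to one cell cluster and each column to one network, so the cell with the largest count marks the strongest cluster-to-network association.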
Figure 6.

Pathway analysis using the 90 prioritized CHD genes. A: pathways built by Cytoscape. The color gradient (orange to red) for the nodes is on the basis of P values, and the size of the nodes is proportional to odds ratio. B: heatmap depicting the overlap between pathway genes (horizontal axis) and CHD heart clusters represented by top 20 DEGs (vertical axis). C: heatmap depicting the overlap between pathway genes (horizontal axis) and control heart clusters represented by top 20 DEGs (vertical axis). The color intensity represents the gene overlap count. CHD, congenital heart disease; DEGs, differentially expressed genes.
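The heatmaps in Fig. 6, B and C, report a gene overlap count for each pairing of a network and a cell cluster. A minimal Python sketch of such a count matrix is given below; the function name and the placeholder gene identifiers are hypothetical, and the real inputs would be the network gene lists and the top 20 DEGs of each cluster.

```python
def overlap_matrix(networks, cluster_degs):
    """Count shared genes for every (cluster, network) pair."""
    return {
        cluster: {
            name: len(set(genes) & set(degs))
            for name, genes in networks.items()
        }
        for cluster, degs in cluster_degs.items()
    }

# Toy input with placeholder gene identifiers.
toy_networks = {
    "N1": ["g1", "g2", "g3", "g4"],
    "N3": ["g3", "g4", "g5"],
}
toy_clusters = {
    "valvar": ["g1", "g2", "g3", "g9"],
    "cardiomyocyte": ["g5", "g8"],
}
counts = overlap_matrix(toy_networks, toy_clusters)
```

Each inner dictionary corresponds to one row of the heatmap, with one count per network.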
DISCUSSION
In this study, we provide a comprehensive list of clinically relevant, rare coding variants associated with CHD and highlight the most significant ones. Our CHD-variant database consisted of variants in ∼40% of human genes. This can be attributed to the fact that CHD is caused by errors during the complex developmental process of heart formation, which is probably sensitive to alterations in the expression of a large number of essential genes and their products. Indeed, a large number of biomolecular signals and interactions are involved in cardiac embryogenesis (1, 33), and ∼69% of all proteins are expressed in the adult human heart (34). Our dataset consisted of 83% LOF variants and 17% missense variants. Frameshift, stop, and splicing variants constituted the LOF variants, which are expected to have substantial effects on the levels of the mRNA transcript and the translated protein. De novo variants were also frequent, contributing 17% of all variants, a majority of which were previously detected in the large cohort study of the Pediatric Cardiac Genomics Consortium (PCGC) and Pediatric Heart Network (PHN) (31). De novo variants have been shown to be responsible for CHD cases accompanied by extracardiac congenital anomalies or neurodevelopmental disorders (35, 36). Next, we detected 648 recurrent variant occurrences (contributed by 293 distinct variants) in our database. The recurrence counts were 2, 3, 4, 5, 10, and 24, observed for 268, 16, 5, 2, 1, and 1 variants, respectively. Recurrent variants in unrelated individuals are particularly important indicators of highly susceptible genomic regions or disease specificity (37–39). We identified 90 CHD risk genes from the P/LP and VOUS CHD variants; these genes carried variants, either at different positions or at the same position (recurrent), in three or more individuals. A majority of these genes were signaling pathway genes or structural genes. 
This is not surprising as cardiac development requires the intricate interaction of various cardiac-specific genes, transcription factors, chromatin modification-related genes, and signaling pathways (13, 40).
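The recurrence figures reported above are internally consistent, which can be checked with a short calculation. The Python sketch below first shows how a recurrence histogram could be derived from per-individual variant calls (the function name and the toy variant identifiers are hypothetical) and then verifies the totals using only the counts given in the text.

```python
from collections import Counter

def recurrence_histogram(variant_calls):
    """Map recurrence count -> number of distinct variants seen that many times.

    `variant_calls` holds one variant identifier per affected individual;
    variants seen only once are not recurrent and are left out.
    """
    per_variant = Counter(variant_calls)
    return Counter(n for n in per_variant.values() if n >= 2)

# Toy calls with hypothetical identifiers: v1 seen 3 times, v2 twice, v3 once.
toy_hist = recurrence_histogram(["v1", "v1", "v1", "v2", "v2", "v3"])

# Counts reported in the text: recurrence count -> number of variants.
reported = {2: 268, 3: 16, 4: 5, 5: 2, 10: 1, 24: 1}
distinct_variants = sum(reported.values())
total_occurrences = sum(times * n for times, n in reported.items())
```

The reported histogram sums to 293 distinct variants and 648 occurrences, matching the totals in the text.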
Of the 90 genes, we identified six novel CHD risk genes: GPATCH1, NYNRIN, TLCD2, CEP95, MAP3K19, and TTC36. GPATCH1, or G patch domain-containing protein 1, is predicted to be involved in RNA binding. Not much is known about this protein; however, it has been linked to bone mineral density and is considered a possible osteoporosis risk factor in Japanese women (32). A strong association was also observed between an exonic variant of GPATCH1 and coronary thrombosis after liver transplant (41). NYNRIN, or NYN domain and retroviral integrase catalytic domain-containing gene, is also predicted to be involved in mRNA binding. Variants in this long terminal repeat (LTR) retrotransposon-derived gene have been linked to the development of Wilms tumors at an early age (42), and comparative evolutionary genomic and transcriptomic analyses have shown this gene to be responsible for placental emergence in therian mammals (43). NYNRIN variants have also been associated with serum calcium levels (which have causal effects on ventricular repolarization) (44–46). TLCD2, or TLC domain-containing 2 gene, is primarily involved in membrane assembly and phospholipid homeostasis. It is associated with chromosome 17p13.3 microdeletions and microduplications (47). A family-based association study demonstrated a significant association between a SNP in the TLCD2/miR22 region and left ventricular mass (48). Centrosomal protein 95 (CEP95), located in the centrosome and spindle pole, has been associated with longevity (49, 50) and was identified as one of the hub genes for lung adenocarcinoma using weighted gene coexpression network analysis (WGCNA) (51). Also, an intronic variant of CEP95 (rs148050755) showed a genome-wide significant association with cardiac troponin I, a marker of cardiac muscle damage that helps to detect myocardial infarction (52). 
Overexpression of the relatively understudied kinase MAP3K19 has been observed in idiopathic pulmonary fibrosis (IPF) and chronic obstructive pulmonary disease (COPD) (53–55), and pathogenic missense mutations in MAP3K19 were observed in a WES analysis of 20 patients with bicuspid aortic valve (56). TTC36, or tetratricopeptide repeat domain 36, is a molecular chaperone highly expressed in the liver. Mouse knockout studies have established the association of TTC36 with tyrosinemia and neurological damage (57). Using a combination of in silico, in vivo, and in vitro methods, Song et al. (58) established an association of the TTC36 gene with gastric carcinoma. Recently, a study involving multitrait analysis of GWAS showed a significant association between a TTC36 variant and triglycerides (59). Moreover, the CHD and control single-cell datasets showed reduced expression of these six genes, in terms of both expression level and percentage of cells expressing each gene per cluster. Overall, these six genes have minimal functional annotation available so far and have predominantly been linked to organs other than the heart, which might have concealed their significance as high-risk candidate CHD genes. All other 84 genes overlapped with known CHD and OMIM genes. Next, we analyzed the expression of the top 10 most frequently mutated genes in the cardiac single-cell transcriptome from CHD hearts. Cardiomyocytes were enriched with this subset of genes. Cardiomyocytes are the major cell type in the heart, accounting for ∼30% of the total cell number and 70% of the total cardiac mass, and cardiomyocyte cell cycle activity is known to play a cardinal role in heart ontogeny (60, 61). Consistently, cardiomyocytes constituted <50% of the total cardiac cells in the CHD and control heart transcriptome in the present study. 
We reconstructed the clusters and the cardiomyocytes and observed five subclusters detected only in the CHD samples: EDIL3+/MYL7−, KCNMB4+/CPED1−, RASD1+/DOCK4−, OTOGL+/OBSCN−, and SLC26A3+/TTTY14−, indicating differences in the transcriptome of patients with CHD relative to healthy controls. In previous work, Kitani and colleagues (62) also reported distinct transcriptomes of patients with CHD relative to healthy controls using bulk RNA-Seq of induced pluripotent stem cell-derived cardiomyocytes from patients with CHD. Next, we probed our single-cell datasets for the expression of NOTCH1 (the most frequently mutated gene) and MYH6 (one of the top 10 genes having the maximum number of recurrent variants). NOTCH1 is a member of the Notch signaling pathway involved in regulating cardiogenesis. In our dataset, we observed that a majority of NOTCH1 variants were associated with the CTD phenotype, which includes TOF, double-outlet right ventricle (DORV), truncus arteriosus, membranous ventricular septal defects (VSD), and aortic arch abnormalities. Several previous studies have also linked pathogenic NOTCH1 variants to TOF (9, 63, 64). In line with the well-described expression of NOTCH1 in endothelial cells lining the cardiac outflow tract during cardiogenesis (65), we detected overexpression of this gene primarily in endothelial and endocardial cells across the two datasets. In particular, NOTCH1 was significantly upregulated in control compared with CHD endocardial cells. On the other hand, cardiomyocytes were enriched for the myosin heavy chain 6 (MYH6) gene in both datasets. In this database, we detected 24 recurrences of a single rare MYH6 missense variant, p.Arg721Trp. This variant has been previously reported to be prevalent in an Icelandic population with CoA (66). Various reports have established MYH6 variants as high-risk factors for ventricular septal defects and hypoplastic left heart syndrome (67, 68). 
We also observed that the dominant phenotype for individuals with MYH6 variants was LVO, which included hypoplastic left heart syndrome (HLHS), CoA, and aortic stenosis/bicuspid aortic valve (AS/BAV). Similar to NOTCH1, MYH6 was also significantly upregulated in control compared with CHD samples. We observed significantly lower expression of both genes in the CHD heart transcriptome, which might be due to the different types of CHD samples used (TOF and neonatal HLHS, dilated cardiomyopathy, hypertrophic cardiomyopathy, and failing HLHS) (3), the difference in age between the CHD (0 yr–4 yr) and control (3 yr–11 yr) samples, and a possible specificity of gene variants to specific populations (66, 69). This may also be due to the fact that both genes play a major role in cardiac morphogenesis, so they are highly expressed in healthy controls. CHD, however, may reduce their expression directly or indirectly by disrupting their function.
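The control-versus-CHD comparison of NOTCH1 and MYH6 rests on the pseudobulk analysis described in the abstract, in which single-cell counts are aggregated per sample before testing. The Python sketch below illustrates only the aggregation step, assuming raw counts are summed per sample within one cell type and scaled to counts per million; the function name, sample labels, and toy counts are hypothetical, and the statistical test applied by the authors is not specified here.

```python
from collections import defaultdict

def pseudobulk_cpm(cells, cell_type, gene):
    """Sum raw counts per sample within one cell type, then scale to counts per million."""
    gene_sum = defaultdict(int)
    library_size = defaultdict(int)
    for sample, ctype, counts in cells:
        if ctype != cell_type:
            continue
        gene_sum[sample] += counts.get(gene, 0)
        library_size[sample] += sum(counts.values())
    return {
        sample: 1e6 * gene_sum[sample] / library_size[sample]
        for sample in library_size
        if library_size[sample] > 0
    }

# Toy cells (hypothetical counts): (sample id, cell type, {gene: raw count}).
toy_cells = [
    ("chd_1", "cardiomyocyte", {"MYH6": 2, "OTHER": 8}),
    ("chd_1", "cardiomyocyte", {"MYH6": 1, "OTHER": 9}),
    ("ctrl_1", "cardiomyocyte", {"MYH6": 6, "OTHER": 4}),
    ("ctrl_1", "endocardial", {"NOTCH1": 5, "OTHER": 5}),
]
myh6_cpm = pseudobulk_cpm(toy_cells, "cardiomyocyte", "MYH6")
```

Aggregating to one value per sample treats the individual, rather than the cell, as the unit of replication when groups are compared.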
Conclusions
This is the first comprehensive study linking CHD genotype, phenotype, and cell type. In this work, we performed a meta-analysis of CHD and control datasets to delineate CHD genes with the highest number of variants. Signaling pathway genes and structural genes constituted the majority of CHD genes, and six previously unrecognized genes were identified. These 90 candidate CHD genes form the most up-to-date list, as we constructed it using the curated CHD dataset consisting of 16,349 variants from 3,166 samples and compared it with the CHD gene panels from the studies by Blue et al. (22) and Hu et al. (24), the 143 genes from the CHDgene database (25), and 16,491 OMIM genes (26). Having access to a constant source of genetic information on CHD is crucial to finding genetic causes of the disease, which is what we aimed to achieve. Also, the identification of new genes supports the idea that CHD is due not only to pathogenic expression of cardiac genes but also to genes that are not primarily expressed in the heart. Furthermore, our findings are supported by recent research by Robbe et al. (70) demonstrating how reduced expression of noncardiac genes leads to specific cardiac disease states. We have also described for the first time an algorithm for identifying and annotating cellular subsets (cardiomyocyte subsets) and observed a difference in cellular composition between control and CHD tissues. Genes associated with CHD showed phenotypic specificity as well. Our single-cell heart transcriptome analysis revealed that cardiomyocytes and endocardial cells are the main cell types associated with CHD. Overall, our study highlights that CHD genes exhibit heterogeneous expression patterns, suggesting that diverse causative genes are expressed in distinct cell types, resulting in different forms of CHD. 
Further studies are warranted to determine the functional role of the six novel genes identified in our study in the pathophysiology of CHD and to examine the contributions of both cardiomyocyte and noncardiomyocyte cell types and cardiac and noncardiac genes in CHD.
DATA AVAILABILITY
The data used in this study were obtained from deidentified original genomic studies related to CHD and have been incorporated into the article and its online Supplemental Material.
SUPPLEMENTAL DATA
Supplemental Figs. S1–S5 and Supplemental Tables S1–S5: https://doi.org/10.6084/m9.figshare.23598732.v4.
GRANTS
This work was supported, in whole or in part, by the Al Jalila Foundation, National Institutes of Health Grant U01HL131003, internal grant awards from Mohammed Bin Rashid University of Medicine and Health Sciences (MBRU) College of Medicine (MBRU-CM-RG2018-04, MBRU-CM-RG2020-02, MBRU-CM-RG2020-12, MBRU-CM-RG2021-04, and MBRU-CM-RG2022-12), Al-Mahmeed Collaborative Research Awards ALM1801 and ALM20-0074, and Pediatric Cardiac Genomics Consortium Grant U01HL131003. R.T. was supported by MBRU Postdoctoral Fellow award MBRU-2020-04 and PCGC/CDDRC fellowship 5U01HL131003-08 subaward no. OS00000703/313946. B.Z. was supported by a Saudi Heart Association Dr. Wael Al-Mahmeed Research Grant and MBRU Post-Doctoral Fellow Award MBRU-PD-2020-02.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
R.T., M.U., and B.K.B. conceived and designed research; R.T., S.N., S.S., F.K., A.A., R.M.D.D., M.U., and B.K.B. analyzed data; R.T., S.A., M.B., W.M.K., W.K.C., A.A., B.A., S.N., S.S., F.K., N.N., N.M., A.A.A., R.A.H., R.M.D.D., M.U., and B.K.B. interpreted results of experiments; R.T. prepared figures; R.T. drafted manuscript; R.T., S.A., M.B., W.M.K., W.C., A.A.A., B.A., S.N., S.S., F.K., N.N., N.M., A.A., R.A.H., R.M.D.D., M.U., and B.K.B. edited and revised manuscript; R.T., S.A., M.B., W.M.K., W.C., A.A., B.A., S.N., S.S., F.K., N.N., N.M., A.A.A., R.A.H., R.M.D.D., M.U., and B.K.B. approved final version of manuscript.
REFERENCES
- 1. Zaidi S, Brueckner M. Genetics and genomics of congenital heart disease. Circ Res 120: 923–940, 2017. doi: 10.1161/CIRCRESAHA.116.309140.
- 2. Safizadeh Shabestari SA, Nassir N, Sopariwala S, Karimov I, Tambi R, Zehra B, Kosaji N, Akter H, Berdiev BK, Uddin M. Overlapping pathogenic de novo CNVs in neurodevelopmental disorders and congenital anomalies impacting constraint genes regulating early development. Hum Genet 142: 1201–1213, 2023. doi: 10.1007/s00439-022-02482-5.
- 3. Hill MC, Kadow ZA, Long H, Morikawa Y, Martin TJ, Birks EJ, Campbell KS, Nerbonne J, Lavine K, Wadhwa L, Wang J, Turaga D, Adachi I, Martin JF. Integrated multi-omic characterization of congenital heart disease. Nature 608: 181–191, 2022. doi: 10.1038/s41586-022-04989-3.
- 4. Warnes CA, Liberthson R, Danielson GK, Dore A, Harris L, Hoffman JI, Somerville J, Williams RG, Webb GD. Task force 1: the changing profile of congenital heart disease in adult life. J Am Coll Cardiol 37: 1170–1175, 2001. doi: 10.1016/s0735-1097(01)01272-4.
- 5. Kalisch-Smith JI, Ved N, Sparrow DB. Environmental risk factors for congenital heart disease. Cold Spring Harb Perspect Biol 12: a037234, 2020. doi: 10.1101/cshperspect.a037234.
- 6. Diab NS, Barish S, Dong W, Zhao S, Allington G, Yu X, Kahle KT, Brueckner M, Jin SC. Molecular genetics and complex inheritance of congenital heart disease. Genes 12: 1020, 2021. doi: 10.3390/genes12071020.
- 7. Zhang T-N, Wu Q-J, Liu Y-S, Lv J-L, Sun H, Chang Q, Liu C-F, Zhao Y-H. Environmental risk factors and congenital heart disease: an umbrella review of 165 systematic reviews and meta-analyses with more than 120 million participants. Front Cardiovasc Med 8: 640729, 2021. doi: 10.3389/fcvm.2021.640729.
- 8. Yuan S, Zaidi S, Brueckner M. Congenital heart disease: emerging themes linking genetics and development. Curr Opin Genet Dev 23: 352–359, 2013. doi: 10.1016/j.gde.2013.05.004.
- 9. Page DJ, Miossec MJ, Williams SG, Monaghan RM, Fotiou E, Cordell HJ, et al. Whole exome sequencing reveals the major genetic contributors to nonsyndromic tetralogy of Fallot. Circ Res 124: 553–563, 2019. doi: 10.1161/CIRCRESAHA.118.313250.
- 10. LaHaye S, Corsmeier D, Basu M, Bowman JL, Fitzgerald-Butt S, Zender G, Bosse K, McBride KL, White P, Garg V. Utilization of whole exome sequencing to identify causative mutations in familial congenital heart disease. Circ Cardiovasc Genet 9: 320–329, 2016. doi: 10.1161/CIRCGENETICS.115.001324.
- 11. Zahavich L, Bowdin S, Mital S. Use of clinical exome sequencing in isolated congenital heart disease. Circ Cardiovasc Genet 10: e001581, 2017. doi: 10.1161/CIRCGENETICS.116.001581.
- 12. Pierpont ME, Brueckner M, Chung WK, Garg V, Lacro RV, McGuire AL, Mital S, Priest JR, Pu WT, Roberts A, Ware SM, Gelb BD, Russell MW; American Heart Association Council on Cardiovascular Disease in the Young; Council on Cardiovascular and Stroke Nursing; Council on Genomic and Precision Medicine. Genetic basis for congenital heart disease: revisited: a scientific statement from the American Heart Association. Circulation 138: e653–e711, 2018. [Erratum in Circulation 138: e713, 2018]. doi: 10.1161/CIR.0000000000000606.
- 13. Morton SU, Quiat D, Seidman JG, Seidman CE. Genomic frontiers in congenital heart disease. Nat Rev Cardiol 19: 26–42, 2022. doi: 10.1038/s41569-021-00587-4.
- 14. Garg V. Molecular genetics of aortic valve disease. Curr Opin Cardiol 21: 180–184, 2006. doi: 10.1097/01.hco.0000221578.18254.70.
- 15. Garg V. Insights into the genetic basis of congenital heart disease. Cell Mol Life Sci 63: 1141–1148, 2006. doi: 10.1007/s00018-005-5532-2.
- 16. Tambi R, Abdel Hameid R, Bankapur A, Nassir N, Begum G, Alsheikh-Ali A, Uddin M, Berdiev BK. Single-cell transcriptomics trajectory and molecular convergence of clinically relevant mutations in Brugada syndrome. Am J Physiol Heart Circ Physiol 320: H1935–H1948, 2021. doi: 10.1152/ajpheart.00061.2021.
- 17. Nassir N, Bankapur A, Samara B, Ali A, Ahmed A, Inuwa IM, Zarrei M, Safizadeh Shabestari SA, AlBanna A, Howe JL, Berdiev BK, Scherer SW, Woodbury-Smith M, Uddin M. Single-cell transcriptome identifies molecular subtype of autism spectrum disorder impacted by de novo loss-of-function variants regulating glial cells. Hum Genomics 15: 68, 2021. doi: 10.1186/s40246-021-00368-7.
- 18. Nassir N, Tambi R, Bankapur A, Karuvantevida N, Khansaheb HH, Zehra B, Begum G, Hameid RA, Ahmed A, Deesi Z, Alkhajeh A, Uddin KMF, Akter H, Safizadeh Shabestari SA, Gaudet M, Hachim MY, Alsheikh-Ali A, Berdiev BK, Al Heialy S, Uddin M. Analyzing single cell transcriptome data from severe COVID-19 patients. STAR Protoc 3: 101379, 2022. doi: 10.1016/j.xpro.2022.101379.
- 19. Nassir N, Tambi R, Bankapur A, Al Heialy S, Karuvantevida N, Khansaheb HH, Zehra B, Begum G, Hameid RA, Ahmed A, Deesi Z, Alkhajeh A, Uddin KMF, Akter H, Safizadeh Shabestari SA, Almidani O, Islam A, Gaudet M, Kandasamy RK, Loney T, Tayoun AA, Nowotny N, Woodbury-Smith M, Rahman P, Kuebler WM, Yaseen Hachim M, Casanova JL, Berdiev BK, Alsheikh-Ali A, Uddin M. Single-cell transcriptome identifies FCGR3B upregulated subtype of alveolar macrophages in patients with critical COVID-19. iScience 24: 103030, 2021. doi: 10.1016/j.isci.2021.103030.
- 20. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38: e164, 2010. doi: 10.1093/nar/gkq603.
- 21. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E, Voelkerding K, Rehm HL; ACMG Laboratory Quality Assurance Committee. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17: 405–424, 2015. doi: 10.1038/gim.2015.30.
- 22. Blue GM, Kirk EP, Giannoulatou E, Dunwoodie SL, Ho JW, Hilton DC, White SM, Sholler GF, Harvey RP, Winlaw DS. Targeted next-generation sequencing identifies pathogenic variants in familial congenital heart disease. J Am Coll Cardiol 64: 2498–2506, 2014. doi: 10.1016/j.jacc.2014.09.048.
- 23. Andelfinger G. Next-generation sequencing in congenital heart disease: do new brooms sweep clean? J Am Coll Cardiol 64: 2507–2509, 2014. doi: 10.1016/j.jacc.2014.09.049.
- 24. Hu P, Qiao F, Wang Y, Meng L, Ji X, Luo C, Xu T, Zhou R, Zhang J, Yu B, Wang L, Wang T, Pan Q, Ma D, Liang D, Xu Z. Clinical application of targeted next-generation sequencing in fetuses with congenital heart defect. Ultrasound Obstet Gynecol 52: 205–211, 2018. doi: 10.1002/uog.19042.
- 25. Yang A, Alankarage D, Cuny H, Ip EKK, Almog M, Lu J, Das D, Enriquez A, Szot JO, Humphreys DT, Blue GM, Ho JWK, Winlaw DS, Dunwoodie SL, Giannoulatou E. CHDgene: a curated database for congenital heart disease genes. Circ Genom Precis Med 15: e003539, 2022. doi: 10.1161/CIRCGEN.121.003539.
- 26. Amberger JS, Bocchini CA, Scott AF, Hamosh A. OMIM.org: leveraging knowledge across phenotype-gene relationships. Nucleic Acids Res 47: D1038–D1043, 2019. doi: 10.1093/nar/gky1151.
- 27. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner M, Loh PR, Raychaudhuri S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16: 1289–1296, 2019. doi: 10.1038/s41592-019-0619-0.
- 28. Litviňuková M, Talavera-López C, Maatz H, Reichart D, Worth CL, Lindberg EL, Kanda M, Polanski K, Heinig M, Lee M, Nadelmann ER, Roberts K, Tuck L, Fasouli ES, DeLaughter DM, McDonough B, Wakimoto H, Gorham JM, Samari S, Mahbubani KT, Saeb-Parsy K, Patone G, Boyle JJ, Zhang H, Zhang H, Viveiros A, Oudit GY, Bayraktar OA, Seidman JG, Seidman CE, Noseda M, Hubner N, Teichmann SA. Cells of the adult human heart. Nature 588: 466–472, 2020. doi: 10.1038/s41586-020-2797-4.
- 29. Kanehisa M. The KEGG database. Novartis Found Symp 247: 91–101, 2002.
- 30. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, et al. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 32: D258–D261, 2004. doi: 10.1093/nar/gkh036.
- 31. Jin SC, Homsy J, Zaidi S, Lu Q, Morton S, DePalma SR, et al. Contribution of rare inherited and de novo variants in 2,871 congenital heart disease probands. Nat Genet 49: 1593–1601, 2017. doi: 10.1038/ng.3970.
- 32. Zhou H, Mori S, Ishizaki T, Takahashi A, Matsuda K, Koretsune Y, Minami S, Higashiyama M, Imai S, Yoshimori K, Doita M, Yamada A, Nagayama S, Kaneko K, Asai S, Shiono M, Kubo M, Ito H. Genetic risk score based on the prevalence of vertebral fracture in Japanese women with osteoporosis. Bone Rep 5: 168–172, 2016. doi: 10.1016/j.bonr.2016.07.001.
- 33. Sun C, Kontaridis MI. Physiology of cardiac development: from genetics to signaling to therapeutic strategies. Curr Opin Physiol 1: 123–139, 2018. doi: 10.1016/j.cophys.2017.09.002.
- 34. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science 347: 1260419, 2015. doi: 10.1126/science.1260419.
- 35. Homsy J, Zaidi S, Shen Y, Ware JS, Samocha KE, Karczewski KJ, et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. Science 350: 1262–1266, 2015. doi: 10.1126/science.aac9396.
- 36. Zaidi S, Choi M, Wakimoto H, Ma L, Jiang J, Overton JD, et al. De novo mutations in histone-modifying genes in congenital heart disease. Nature 498: 220–223, 2013. doi: 10.1038/nature12141.
- 37. Wagnon JL, Meisler MH. Recurrent and non-recurrent mutations of SCN8A in epileptic encephalopathy. Front Neurol 6: 104, 2015. doi: 10.3389/fneur.2015.00104.
- 38. Wilfert AB, Sulovari A, Turner TN, Coe BP, Eichler EE. Recurrent de novo mutations in neurodevelopmental disorders: properties and clinical implications. Genome Med 9: 101, 2017. doi: 10.1186/s13073-017-0498-x.
- 39. Yu J, Li Y, Zhang D, Wan D, Jiang Z. Clinical implications of recurrent gene mutations in acute myeloid leukemia. Exp Hematol Oncol 9: 4, 2020. doi: 10.1186/s40164-020-00161-7.
- 40. Williams K, Carson J, Lo C. Genetics of congenital heart disease. Biomolecules 9: 879, 2019. doi: 10.3390/biom9120879.
- 41. Li Y, Nieuwenhuis LM, Voskuil MD, Gacesa R, Hu S, Jansen BH, Venema WTU, Hepkema BG, Blokzijl H, Verkade HJ, Lisman T, Weersma RK, Porte RJ, Festen EAM, de Meijer VE. Donor genetic variants as risk factors for thrombosis after liver transplantation: a genome-wide association study. Am J Transplant 21: 3133–3147, 2021. doi: 10.1111/ajt.16490.
- 42. Mahamdallie S, Yost S, Poyastro-Pearson E, Holt E, Zachariou A, Seal S, Elliott A, Clarke M, Warren-Perry M, Hanks S, Anderson J, Bomken S, Cole T, Farah R, Furtwaengler R, Glaser A, Grundy R, Hayden J, Lowis S, Millot F, Nicholson J, Ronghe M, Skeen J, Williams D, Yeomanson D, Ruark E, Rahman N. Identification of new Wilms tumour predisposition genes: an exome sequencing study. Lancet Child Adolesc Health 3: 322–331, 2019. doi: 10.1016/S2352-4642(19)30018-5.
- 43. Plianchaisuk A, Kusama K, Kato K, Sriswasdi S, Tamura K, Iwasaki W. Origination of LTR retroelement-derived NYNRIN coincides with therian placental emergence. Mol Biol Evol 39: msac176, 2022. doi: 10.1093/molbev/msac176.
- 44. Sakaue S, Kanai M, Tanigawa Y, Karjalainen J, Kurki M, Koshiba S, et al. A cross-population atlas of genetic associations for 220 human phenotypes. Nat Genet 53: 1415–1424, 2021. doi: 10.1038/s41588-021-00931-x.
- 45. Sinnott-Armstrong N, Tanigawa Y, Amar D, Mars N, Benner C, Aguirre M, Venkataraman GR, Wainberg M, Ollila HM, Kiiskinen T, Havulinna AS, Pirruccello JP, Qian J, Shcherbina A; FinnGen; Rodriguez F, Assimes TL, Agarwala V, Tibshirani R, Hastie T, Ripatti S, Pritchard JK, Daly MJ, Rivas MA. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat Genet 53: 185–194, 2021. doi: 10.1038/s41588-020-00757-z.
- 46. Young WJ, Warren HR, Mook-Kanamori DO, Ramírez J, van Duijvenboden S, Orini M, Tinker A, van Heemst D, Lambiase PD, Jukema JW, Munroe PB, Noordam R. Genetically determined serum calcium levels and markers of ventricular repolarization: a Mendelian Randomization Study in the UK Biobank. Circ Genom Precis Med 14: e003231, 2021. doi: 10.1161/CIRCGEN.120.003231.
- 47. Blazejewski SM, Bennison SA, Smith TH, Toyo-Oka K. Neurodevelopmental genetic diseases associated with microdeletions and microduplications of chromosome 17p13.3. Front Genet 9: 80, 2018. doi: 10.3389/fgene.2018.00080.
- 48. Harper AR, Mayosi BM, Rodriguez A, Rahman T, Hall D, Mamasoula C, Avery PJ, Keavney BD. Common variation neighbouring micro-RNA 22 is associated with increased left ventricular mass. PloS One 8: e55061, 2013. doi: 10.1371/journal.pone.0055061.
- 49. Shadyab AH, LaCroix AZ. Genetic factors associated with longevity: a review of recent findings. Ageing Res Rev 19: 1–7, 2015. doi: 10.1016/j.arr.2014.10.005.
- 50. Lee JH, Cheng R, Honig LS, Feitosa M, Kammerer CM, Kang MS, Schupf N, Lin SJ, Sanders JL, Bae H, Druley T, Perls T, Christensen K, Province M, Mayeux R. Genome wide association and linkage analyses identified three loci-4q25, 17q23.2, and 10q11.21-associated with variation in leukocyte telomere length: the Long Life Family Study. Front Genet 4: 310, 2013. doi: 10.3389/fgene.2013.00310.
- 51. Yan X, Zhao X, Yan Q, Wang Y, Zhang C. Analysis of the role of METTL5 as a hub gene in lung adenocarcinoma based on a weighted gene co-expression network. Math Biosci Eng 18: 6608–6619, 2021. doi: 10.3934/mbe.2021327.
- 52. Welsh P, Preiss D, Hayward C, Shah ASV, McAllister D, Briggs A, Boachie C, McConnachie A, Padmanabhan S, Welsh C, Woodward M, Campbell A, Porteous D, Mills NL, Sattar N. Cardiac troponin T and troponin I in the general population. Circulation 139: 2754–2764, 2019. doi: 10.1161/CIRCULATIONAHA.118.038529.
- 53. Jones IC, Espindola MS, Narayanan R, Coelho AL, Habiel DM, Boehme SA, Ly TW, Bacon KB, Hogaboam CM. Targeting MAP3K19 prevents human lung myofibroblast activation both in vitro and in a humanized SCID model of idiopathic pulmonary fibrosis. Sci Rep 9: 19796, 2019. doi: 10.1038/s41598-019-56393-z.
- 54. Boehme SA, Franz-Bacon K, Ludka J, DiTirro DN, Ly TW, Bacon KB. MAP3K19 is overexpressed in COPD and is a central mediator of cigarette smoke-induced pulmonary inflammation and lower airway destruction. PloS One 11: e0167169, 2016. doi: 10.1371/journal.pone.0167169.
- 55. Boehme SA, Franz-Bacon K, DiTirro DN, Ly TW, Bacon KB. MAP3K19 is a novel regulator of TGF-β signaling that impacts bleomycin-induced lung injury and pulmonary fibrosis. PloS One 11: e0154874, 2016. doi: 10.1371/journal.pone.0154874.
- 56. Chen S, Jin Q, Hou S, Li M, Zhang Y, Guan L, Pan W, Ge J, Zhou D. Identification of recurrent variants implicated in disease in bicuspid aortic valve patients through whole-exome sequencing. Hum Genomics 16: 36, 2022. doi: 10.1186/s40246-022-00405-z.
- 57. Xie Y, Lv X, Ni D, Liu J, Hu Y, Liu Y, Liu Y, Liu R, Zhao H, Lu Z, Zhou Q. HPD degradation regulated by the TTC36-STK33-PELI1 signaling axis induces tyrosinemia and neurological damage. Nat Commun 10: 4266, 2019. doi: 10.1038/s41467-019-12011-0.
- 58. Song L, Guo X, Zhao F, Wang W, Zhao Z, Jin L, Wu C, Yao J, Ma Z. TTC36 inactivation induce malignant properties via Wnt-β-catenin pathway in gastric carcinoma. J Cancer 12: 2598–2609, 2021. doi: 10.7150/jca.47292.
- 59. Koskeridis F, Evangelou E, Said S, Boyle JJ, Elliott P, Dehghan A, Tzoulaki I. Pleiotropic genetic architecture and novel loci for C-reactive protein levels. Nat Commun 13: 6939, 2022. doi: 10.1038/s41467-022-34688-6.
- 60. Doll S, Dreßen M, Geyer PE, Itzhak DN, Braun C, Doppler SA, Meier F, Deutsch MA, Lahm H, Lange R, Krane M, Mann M. Region and cell-type resolved quantitative proteomic map of the human heart. Nat Commun 8: 1469, 2017. doi: 10.1038/s41467-017-01747-2.
- 61. Bergmann O, Zdunek S, Felker A, Salehpour M, Alkass K, Bernard S, Sjostrom SL, Szewczykowska M, Jackowska T, Dos Remedios C, Malm T, Andrä M, Jashari R, Nyengaard JR, Possnert G, Jovinge S, Druid H, Frisén J. Dynamics of cell generation and turnover in the human heart. Cell 161: 1566–1575, 2015. doi: 10.1016/j.cell.2015.05.026. [DOI] [PubMed] [Google Scholar]
- 62. Kitani T, Tian L, Zhang T, Itzhaki I, Zhang JZ, Ma N, Liu C, Rhee J-W, Romfh AW, Lui GK, Wu JC. RNA sequencing analysis of induced pluripotent stem cell-derived cardiomyocytes from congenital heart disease patients. Circ Res 126: 923–925, 2020. doi: 10.1161/CIRCRESAHA.119.315653. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63. Durbin MD, Cadar AG, Williams CH, Guo Y, Bichell DP, Su YR, Hong CC. Hypoplastic left heart syndrome sequencing reveals a novel NOTCH1 mutation in a family with single ventricle defects. Pediatr Cardiol 38: 1232–1240, 2017. doi: 10.1007/s00246-017-1650-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Debiec RM, Hamby SE, Jones PD, Safwan K, Sosin M, Hetherington SL, Sprigings D, Sharman D, Lee K, Salahshouri P, Wheeldon N, Chukwuemeka A, Boutziouka V, Elamin M, Coolman S, Asiani M, Kharodia S, Skinner GJ, Samani NJ, Webb TR, Bolger AP. Contribution of NOTCH1 genetic variants to bicuspid aortic valve and other congenital lesions. Heart 108: 1114–1120, 2022. doi: 10.1136/heartjnl-2021-320428. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Koenig SN, Bosse K, Majumdar U, Bonachea EM, Radtke F, Garg V. Endothelial Notch1 is required for proper development of the semilunar valves and cardiac outflow tract. J Am Heart Assoc 5: e003075, 2016. doi: 10.1161/JAHA.115.003075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66. Bjornsson T, Thorolfsdottir RB, Sveinbjornsson G, Sulem P, Norddahl GL, Helgadottir A, Gretarsdottir S, Magnusdottir A, Danielsen R, Sigurdsson EL, Adalsteinsdottir B, Gunnarsson SI, Jonsdottir I, Arnar DO, Helgason H, Gudbjartsson T, Gudbjartsson DF, Thorsteinsdottir U, Holm H, Stefansson K. A rare missense mutation in MYH6 associates with non-syndromic coarctation of the aorta. Eur Heart J 39: 3243–3249, 2018. doi: 10.1093/eurheartj/ehy142. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67. Theis JL, Hu JJ, Sundsbak RS, Evans JM, Bamlet WR, Qureshi MY, O'Leary PW, Olson TM. Genetic association between hypoplastic left heart syndrome and cardiomyopathies. Circ Genom Precis Med 14: e003126, 2021. doi: 10.1161/CIRCGEN.120.003126. [DOI] [PubMed] [Google Scholar]
- 68. Zuo J-Y, Chen H-X, Liu Z-G, Yang Q, He G-W. Identification and functional analysis of variants of MYH6 gene promoter in isolated ventricular septal defects. BMC Med Genomics 15: 213, 2022. doi: 10.1186/s12920-022-01365-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69. Kosaji N, Zehra B, Nassir N, Tambi R, Orszulak AR, Lim ET, Berdiev BK, Woodbury-Smith M, Uddin M. Lack of ethnic diversity in single-cell transcriptomics hinders cell type detection and precision medicine inclusivity. Med 4: 217–219, 2023. doi: 10.1016/j.medj.2023.03.002. [DOI] [PubMed] [Google Scholar]
- 70. Robbe ZL, Shi W, Wasson LK, Scialdone AP, Wilczewski CM, Sheng X, Hepperla AJ, Akerberg BN, Pu WT, Cristea IM, Davis IJ, Conlon FL. CHD4 is recruited by GATA4 and NKX2-5 to repress noncardiac gene programs in the developing heart. Genes Dev 36: 468–482, 2022. doi: 10.1101/gad.349154.121. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Supplemental Figs. S1–S5 and Supplemental Tables S1–S5: https://doi.org/10.6084/m9.figshare.23598732.v4.
Data Availability Statement
The data used in this study were obtained from deidentified original genomic studies related to CHD and have been incorporated into the article and its online Supplemental Material.
