Abstract
Objective
Genetic approaches have identified numerous loci associated with coronary heart disease (CHD). The molecular mechanisms underlying CHD gene-disease associations, however, remain unclear. We hypothesized that genetic variants with both strong and subtle effects drive gene subnetworks that in turn affect CHD.
Approach and Results
We surveyed CHD-associated molecular interactions by constructing coexpression networks using whole blood gene expression profiles from 188 CHD cases and 188 age- and sex-matched controls. Twenty-four coexpression modules were identified, including one case-specific and one control-specific differential module (DM). The DMs were enriched for genes involved in B-cell activation, immune response, and ion transport. Integration of the DMs with SNPs associated with altered gene expression (eSNPs) and with results of GWAS of CHD and its risk factors implicated the control-specific DM as CHD-causal, based on its significant enrichment for both CHD and lipid eSNPs. This causal DM was further integrated with tissue-specific Bayesian networks and protein-protein interaction networks to identify regulatory key driver (KD) genes. Multi-tissue KDs (SPIB and TNFRSF13C) and tissue-specific KDs (e.g. EBF1) were identified.
Conclusions
Our network-driven integrative analysis not only identified CHD-related genes, but also defined network structure that sheds light on the molecular interactions of genes associated with CHD risk.
Keywords: Gene expression, coronary heart disease, systems biology, coexpression network
Introduction
Atherosclerotic coronary heart disease (CHD) is a multi-factorial disease with a prominent inflammatory vascular component that remains the leading cause of death in most developed countries and will soon become the leading cause of death in developing countries 1. Atherosclerosis involves an interaction between modified lipoproteins, monocyte-derived macrophages, T cells, and the normal cellular elements of the arterial wall, resulting in the formation of atherosclerotic plaques. Plaque rupture and thrombosis can result in myocardial infarction (MI) and stroke 2. Despite rapid advances in the genetic and genomic analysis of CHD in recent years, the molecular elements and the mechanisms involved in CHD pathogenesis remain unclear. Understanding the molecular basis of this disease can help to identify biomarkers for more accurate clinical diagnosis and point to candidate drug targets for more effective therapy.
At the genetic level, recent genome-wide association studies (GWAS) have identified 27 loci associated with CHD and MI, with many representing novel risk loci 3, 4. Identifying the causal genes and mechanisms underlying GWAS signals, however, remains a challenging task. In addition, even large-scale GWAS involving tens of thousands of cases and controls are unlikely to explain more than 15–20% of the heritability of CHD and MI 5. Therefore, further experiments and integrative analyses are needed to elucidate functional mechanisms underlying CHD risk loci and to identify the missing heritability.
At the gene expression level, peripheral blood has been investigated widely due to its ease of collection and its pathological relevance to CHD, as multiple blood constituents appear to play a role in plaque accumulation, rupture, and thrombus formation 6. Gene expression changes in whole blood of CHD patients may reflect the interaction of genetic predisposition, disease activity, immunity, thrombosis, metabolism, and environmental modifiers underlying disease mechanisms. In recent studies 7-9, mRNA profiling of tens to more than one hundred CHD cases has yielded lists of differentially expressed genes in an effort to identify effective markers of disease severity. Most of these studies, however, identified relatively few significantly differentially expressed genes between cases and controls at a stringent statistical cutoff, and there was little overlap in the gene signatures of different CHD expression profiling studies. For instance, there is no common CHD gene signature among the 160 genes reported by Sinnaeve et al 7, the 50 genes reported by Wingrove et al 8, and the 23 genes identified by Rosenberg et al 9. This failure to replicate may be due to small sample sizes, differences in phenotypes or experimental platforms, and the low levels of differential expression of single genes.
In order to address the limitations of previous studies, we designed a systems biology framework to explore the molecular underpinnings of CHD via integration of whole blood gene expression profiles of RNA collected from Framingham Heart Study (FHS) participants with network approaches, GWAS, and genetics of gene expression (Figure 1). The FHS case-control study includes 188 pairs of prevalent CHD cases and age- and sex-matched controls. In our companion paper, we identified 35 differentially expressed genes between CHD cases and controls at FDR<50% 10. To extend the results of that differential gene expression analysis, in this investigation we first constructed coexpression networks and compared the robust coexpression patterns of cases versus controls to identify differential coexpression modules (DMs) that demonstrate gene co-regulation in cases but not in controls or vice versa. The CHD DMs were then integrated with SNPs associated with altered gene expression (eSNPs or eQTLs) from CHD-relevant tissues as well as CHD and CHD risk factor GWAS association results to look for enrichment of CHD risk eSNPs among the DM genes. This integration allowed us to differentiate causal (i.e. upstream) from reactive (i.e. downstream) DMs. Last, using a key driver analysis (KDA) that leverages DMs and graphical networks, including external tissue-specific Bayesian networks (BNs) and protein-protein interaction (PPI) networks, we identified putative regulators of the DMs and derived a CHD subnetwork.
Figure 1. Systems Biology Analysis Flow Chart Depicting the Process of Identifying CHD Causal Modules and Key Drivers (KDs).
The overall analysis includes 3 steps. First, coexpression networks are constructed from CHD cases and controls separately, and CHD differential modules (DMs) are identified by comparing the network structures between cases and controls. Second, the DMs are integrated with GWAS of CHD and related traits using a SNP Set Enrichment Analysis (SSEA) to identify causal DMs. Third, key regulatory genes or KDs are identified for the causal DMs based on directional networks derived from independent studies using a Key Driver Analysis (KDA) algorithm.
Materials and Methods
Materials and Methods are available in the online-only Supplement.
Results
Clinical characteristics of the study sample
The case-control study included 188 CHD cases and 188 matched controls. Demographic data and the clinical and laboratory characteristics of the study participants are summarized in Table 1. CHD cases had a higher prevalence of hypertension than controls (p=0.019).
Table 1.
Clinical Characteristics of the Case-control Study Groups
Characteristic | CHD Cases N=188 | CHD Controls N=188 |
---|---|---|
Age (yr ) | 71±7.9 | 71±8.0 |
Sex, male (female) | 140(48) | 140(48) |
Smoking, n (%) | 15 (8) | 2 (1) |
Alcohol, n (%) | 31(16) | 34(18) |
Glucose (mg/dl) | 115±33 | 111±26 |
Diabetes, n (%) | 51(27) | 37(20) |
Hypertension, n (%) | 158 (84) | 130(69) |
BMI (kg/m2) | 28.9±4.7 | 28.7±4.8 |
Systolic BP (mm Hg) | 127 ±18 | 129±17 |
Diastolic BP (mm Hg) | 68±10 | 72±10 |
Total cholesterol (mg/dl) | 151±33 | 174±31 |
Triglycerides (mg/dl) | 126±85 | 116±70 |
HDL cholesterol (mg/dl) | 48±14 | 53±15 |
Hypertension treatment, n (%) | 154(82) | 109(58) |
Diabetes treatment, n (%) | 41(22) | 35(19) |
Lipid treatment, n (%) | 175(93) | 125(66) |
Values are mean ± SD or number (percentage).
BMI= body mass index
BP=blood pressure
CHD=coronary heart disease
Construction of coexpression networks for CHD and identification of DMs
Using differential expression analysis, we identified 35 genes that were differentially expressed between CHD cases and controls at a false discovery rate <50%; these results are summarized in our companion report 10. We hypothesized that alterations in the gene regulatory network architecture may better reflect the underlying molecular differences between groups. Therefore, we compared coexpression networks between CHD cases and controls to identify differential coexpression modules (DMs).
Using the previously developed Weighted Gene Coexpression Network Analysis (WGCNA) method 11, we constructed gene coexpression networks for CHD cases and their corresponding controls, separately. A total of 12 coexpression network modules were identified for CHD cases and 12 for controls (Figure SIII). We tested whether modules were conserved between case and control networks by examining the overlap in gene memberships using Fisher's exact test. Modules that overlapped at p <2.0×10-3 (corresponding to Bonferroni-corrected p<0.05) were defined as conserved between networks. If a module in one network did not have a corresponding conserved module in the other network, it was considered a DM. Two DMs – the CHD_tan module and the control_tan module containing 38 and 79 genes, respectively – differed between CHD cases and controls (Figure SIV and Table SIII). The genes in these two CHD DMs did not show any overlap with the 35 differential expression signatures 10, suggesting that the altered gene coexpression pattern and differential expression of individual genes captured different biological signals.
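The conservation test described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the toy gene sets, and the background size are hypothetical, and the one-sided Fisher's exact test is computed from the hypergeometric tail using only the Python standard library. The 2.0×10-3 cutoff is taken from the text (it is close to 0.05 divided by the 24 modules, although the exact number of tests used for the correction is not stated here).

```python
from math import comb


def overlap_pvalue(set_a, set_b, n_background):
    """One-sided Fisher's exact p value (hypergeometric tail) for the overlap
    of two gene sets drawn from a shared background of n_background genes."""
    k = len(set_a & set_b)
    a, b = len(set_a), len(set_b)
    # P(X >= k), where X counts how many of the b genes fall inside set_a
    tail = sum(comb(a, i) * comb(n_background - a, b - i)
               for i in range(k, min(a, b) + 1))
    return tail / comb(n_background, b)


def differential_modules(case_modules, control_modules, n_background, cutoff=2.0e-3):
    """Return (case-specific, control-specific) module names, i.e. modules
    with no significantly overlapping counterpart in the other network."""
    def unmatched(own, other):
        return [name for name, genes in own.items()
                if all(overlap_pvalue(genes, g, n_background) >= cutoff
                       for g in other.values())]
    return unmatched(case_modules, control_modules), unmatched(control_modules, case_modules)


# Toy example with invented gene identifiers (not data from the study).
case_modules = {
    "case_blue": {f"g{i}" for i in range(50)},
    "case_tan": {f"g{i}" for i in range(500, 520)},
}
control_modules = {
    "control_blue": {f"g{i}" for i in range(40)} | {f"g{i}" for i in range(60, 70)},
    "control_tan": {f"g{i}" for i in range(800, 830)},
}
case_dms, control_dms = differential_modules(case_modules, control_modules, n_background=1000)
print(case_dms, control_dms)
```

In this toy setting the two "blue" modules share most of their genes and are treated as conserved, whereas each "tan" module has no counterpart and is flagged as a DM.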
Functional annotation of the DMs
In order to understand the biological pathways and functional categories of the DMs, we conducted functional enrichment analysis. As shown in Table 2, the top enriched GO biological process term in the control_tan module is B cell activation, with 9 of the 51 module genes present in the GO database overlapping with the 103 genes in this category (19.8-fold enrichment; enrichment p=5.65×10-10, Bonferroni-corrected p=5.65×10-7). These B cell-related genes demonstrated strong coexpression in controls, but not in CHD cases, indicating that B cell dysregulation may be a major factor in atherosclerotic CHD. In contrast to the control_tan module, the CHD_tan module was identified in CHD cases but not in controls. The CHD_tan module is enriched for genes involved in ion transport (GO category; Bonferroni-corrected enrichment p=0.025). These results suggest a re-organization and shift of immune response and ion transporter pathways, with the former disrupted and the latter enhanced in CHD.
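As a worked illustration of the fold-enrichment figure, fold enrichment is the fraction of module genes that fall in a category divided by the fraction expected by chance. The background size is not given in this section; the value of 11,557 annotated genes used below is back-calculated from the reported numbers (9, 51, 103, and 19.8) and is therefore an assumption, not a figure from the paper.

```python
def fold_enrichment(overlap, module_size, category_size, background_size):
    """Observed overlap fraction divided by the fraction expected by chance."""
    observed = overlap / module_size
    expected = category_size / background_size
    return observed / expected


# 9 of 51 module genes fall in a 103-gene category; the background size is assumed.
ASSUMED_BACKGROUND = 11557
fold = fold_enrichment(9, 51, 103, ASSUMED_BACKGROUND)
print(round(fold, 1))
```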
Table 2.
Gene Ontology Enrichment of CHD Differential Modules.
Ontology category | Overlap | Fold enrichment | P value | Corrected P* |
---|---|---|---|---|
--Control_tan module | ||||
| ||||
B cell activation | 9 | 19.8 | 5.65×10-10 | 5.65×10-7 |
B cell differentiation | 7 | 23.6 | 1.53×10-8 | 1.53×10-5 |
B cell receptor signaling pathway | 6 | 22.6 | 2.30×10-7 | 2.30×10-4 |
immune response | 12 | 6.3 | 2.86×10-6 | 2.86×10-3 |
lipid transport | 5 | 9.6 | 1.66×10-4 | 0.166 |
hemopoiesis | 9 | 4.2 | 2.23×10-4 | 0.229 |
| ||||
--CHD_tan module | ||||
| ||||
ion transport | 10 | 4.7 | 2.53×10-5 | 0.025 |
metal ion transport | 7 | 6.0 | 1.21×10-4 | 0.121 |
cell motility | 7 | 3.0 | 0.007 | 1 |
*Bonferroni correction for testing of multiple pathways and functional categories in the GO biological process and KEGG pathway databases.
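The corrected column of Table 2 can be approximated with a plain Bonferroni adjustment. The number of tests is not stated in this section; the factor of 1,000 used below is inferred from the ratio between the two P value columns for most rows of the table and should be read as an assumption.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


N_TESTS = 1000  # assumption inferred from Table 2, not stated in the text

raw_p = {
    "B cell activation": 5.65e-10,
    "ion transport": 2.53e-5,
    "cell motility": 0.007,
}
corrected = {term: bonferroni(p, N_TESTS) for term, p in raw_p.items()}
for term, p in corrected.items():
    print(f"{term}: {p:.3g}")
```

The cap at 1 explains why a raw p value of 0.007 is reported as a corrected value of 1.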
Differential modules are enriched for CHD risk eSNPs
By virtue of the case-control design of this study, blood samples from the CHD cases used for gene expression profiling were drawn after the CHD events, thus some of the DMs we identified are potentially downstream of CHD rather than being causal. In order to identify causal DMs, we integrated the CHD DMs with CHD-related GWAS databases to look for enrichment of risk SNPs among DMs, with the rationale that genes harboring disease-risk SNPs are putatively causal for the disease of interest 12, 13, because genotype does not change in response to disease occurrence.
We used SNP set enrichment analysis (SSEA; see description in the Methods section) to test if the DMs for CHD are likely to play causal roles in disease development by evaluating the overall association p value distribution of putative functional SNPs among DM genes in comparison to the null distribution. All 5 CHD gene sets from different resources (details in Materials and Methods) were highly enriched with low p value CHD association eSNPs in the CARDIoGRAM CHD GWAS 3 (Table SIV). These results confirm the sensitivity of SSEA to identify gene sets suspected of being causal for CHD.
The control-specific DM, control_tan, was not only enriched for risk eSNPs for CHD in the CARDIoGRAM CHD GWAS (1.6-fold enrichment; p=0.04 and 1.5×10-5 from Kolmogorov-Smirnov (KS) and Fisher's tests, respectively), but was also enriched for low p value associations for lipid traits, including HDL cholesterol (KS p=4.8×10-12 and Fisher's p=7.8×10-9), LDL cholesterol (KS p=9.0×10-19 and Fisher's p=1.9×10-26), and total cholesterol (KS p=4.2×10-21 and Fisher's p=7.5×10-28) in the GLGC lipid GWAS 14, suggesting that this DM may play a role in CHD pathogenesis via effects on atherogenic lipids (Table SIV, SSEA results for CHD-related gene sets; Table SV, CHD risk eSNPs for the control_tan module). The CHD_tan DM, on the other hand, did not show enrichment for CHD risk SNPs (enrichment p=0.49 and 0.85 from Fisher's and KS tests, respectively), and thus may be reactive rather than causal. The differential gene expression signatures identified at FDR<50% also failed to show any enrichment for CHD risk SNPs.
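The two kinds of enrichment test named above can be illustrated with a simplified sketch. The actual SSEA procedure is described in the Supplement and is not reproduced here; the code below only assumes that the GWAS p values of a module's eSNPs are compared against a background set using (i) a one-sided Fisher's exact test on the number of SNPs below a cutoff and (ii) a one-sided two-sample Kolmogorov-Smirnov statistic with its asymptotic p value. The cutoff, the function names, and the toy p values are hypothetical.

```python
from bisect import bisect_right
from math import comb, exp


def fisher_enrichment(module_p, background_p, cutoff=0.05):
    """Fold enrichment and one-sided hypergeometric p value for the number of
    module SNPs with GWAS p < cutoff, relative to module + background pooled."""
    m = len(module_p)
    k = sum(p < cutoff for p in module_p)
    n_hits = k + sum(p < cutoff for p in background_p)
    n_total = m + len(background_p)
    tail = sum(comb(n_hits, i) * comb(n_total - n_hits, m - i)
               for i in range(k, min(m, n_hits) + 1))
    fold = (k / m) / (n_hits / n_total) if n_hits else float("nan")
    return fold, tail / comb(n_total, m)


def ks_one_sided(module_p, background_p):
    """One-sided KS statistic D+ (module p values shifted toward zero) and the
    asymptotic p value exp(-2 * D+^2 * m * b / (m + b))."""
    mod, bg = sorted(module_p), sorted(background_p)
    m, b = len(mod), len(bg)
    # The module ECDF only rises at module points, so the maximum is reached there.
    d_plus = max(0.0, max(bisect_right(mod, x) / m - bisect_right(bg, x) / b for x in mod))
    return d_plus, exp(-2.0 * d_plus ** 2 * m * b / (m + b))


# Toy p values (invented): module eSNPs skew low, the background is uniform.
module_p = [0.001, 0.004, 0.01, 0.02, 0.03, 0.2, 0.5, 0.9]
background_p = [i / 100 for i in range(1, 101)]

fold, fisher_p = fisher_enrichment(module_p, background_p)
d_plus, ks_p = ks_one_sided(module_p, background_p)
print(f"fold={fold:.2f} Fisher p={fisher_p:.2g} D+={d_plus:.3f} KS p={ks_p:.2g}")
```

The Fisher-type test is sensitive to the choice of cutoff, whereas the KS-type test compares the whole p value distribution, which is one reason the two can give different levels of significance for the same module.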
Identification of key drivers of the CHD causal DM and their associated subnetworks via network models derived from orthogonal studies
We speculated that the CHD causal DM we identified (control_tan) was driven by key regulatory genes with broad impact on this network module when perturbed. Such key regulatory genes or key drivers (KDs) can serve as candidate genes for therapeutic intervention. To identify KDs, we projected the genes within the CHD causal DM onto the pre-compiled directional networks including tissue-specific Bayesian networks (BNs) (Table SII) 15, 16 and non-directional PPI networks 17 from previous studies. At a Bonferroni-corrected p<0.05, we identified 59 unique KDs for the CHD causal module in the 6 tissue-specific BNs (59 KDs for blood, 0 for liver, 1 for fat, 1 for kidney, 0 for heart, and 0 for muscle) and one PPI network (0 KD). The human blood BN gave rise to the largest number of KDs (n=59) in contrast to the limited number of KDs in other tissue-specific networks. This is not surprising given that the DMs were identified from human blood expression profiles, confirming the tightly co-regulated relationships among the DM genes in blood rather than in other tissues. Furthermore, as the DMs were from the Framingham Heart Study and the blood BN used for KD identification was constructed from an independent Icelandic cohort 15, the consistency between the two networks cross-validated our results. Among the 59 KDs, 27 KDs contained eSNPs showing association with CHD or its risk factors at p <0.05; 7 contained eSNPs at p <1.0×10-3, and 3 contained eSNPs at p <1.0×10-4 in GWAS (Table SVI). Two of the KDs (CD79B and SPIB) overlapped with the 23-gene signature of CHD reported by Rosenberg et al 9 (enrichment p=5.5×10-5).
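The idea behind key driver identification can be sketched as follows. The KDA algorithm itself is described in the Supplement; the code below is a simplified illustration that assumes a gene is called a KD when its downstream neighborhood in a directed network (here within a hypothetical limit of two edges) is enriched for module genes by a Bonferroni-corrected hypergeometric test. The toy network, the gene names, and the parameter values are invented for illustration.

```python
from collections import deque
from math import comb


def downstream(network, source, max_hops):
    """Genes reachable from source within max_hops directed edges (source excluded)."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for child in network.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen - {source}


def hypergeom_tail(k, n_draws, n_hits, n_total):
    """P(X >= k) when n_draws genes are drawn from n_total, of which n_hits are module genes."""
    tail = sum(comb(n_hits, i) * comb(n_total - n_hits, n_draws - i)
               for i in range(k, min(n_draws, n_hits) + 1))
    return tail / comb(n_total, n_draws)


def key_drivers(network, module, max_hops=2, alpha=0.05):
    """Return {gene: Bonferroni-adjusted p} for genes whose neighborhood is enriched."""
    nodes = set(network) | {c for children in network.values() for c in children}
    module = module & nodes
    candidates = [g for g in nodes if network.get(g)]  # genes with outgoing edges
    drivers = {}
    for gene in candidates:
        hood = downstream(network, gene, max_hops)
        p = hypergeom_tail(len(hood & module), len(hood), len(module), len(nodes))
        p_adj = min(1.0, p * len(candidates))
        if p_adj < alpha:
            drivers[gene] = p_adj
    return drivers


# Toy directed network: one regulator upstream of six module genes, plus an
# unrelated chain of 41 background genes.
network = {"REG": [f"m{i}" for i in range(1, 7)]}
network.update({f"b{i}": [f"b{i + 1}"] for i in range(40)})
module = {f"m{i}" for i in range(1, 7)}

kds = key_drivers(network, module)
print(kds)
```

Only the hypothetical regulator REG is returned, because none of the background genes has a module gene in its neighborhood.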
We further prioritized the 59 KDs based on 4 elements: 1) consistency of the KD across multiple directional networks, i.e., a higher priority is given to KDs driving the CHD causal DM in more than one tissue-specific/PPI network; 2) whether the KD is a DM gene itself; 3) whether the KD contains eSNPs showing evidence of association with CHD and/or its risk factors at p <1.0×10-3 (the liberal GWAS p value cutoff was used here because the GWAS information was not considered alone but integrated with several levels of additional data to prioritize genes and reduce false discoveries); and 4) the p value of the KD based on the KDA algorithm (Table SVI). Among the top 20 KDs (Table 3), TNFRSF13C and SPIB each appeared in 2 networks (TNFRSF13C in adipose and blood, SPIB in kidney and blood) and were considered multi-tissue KDs; the remaining KDs were blood-specific. In fact, Sage et al. reported that in Ldlr knockout mice, deletion of the BAFF receptor (BAFFR, encoded by the multi-tissue KD TNFRSF13C) reduced aortic root atherosclerosis significantly 18, 19. In addition, Fretz et al. observed that knockout of another top KD, EBF1, in mice induced defects in lipid metabolism, adipose tissue deposition, and impaired glucose mobilization 20. Therefore, some of the top KDs we identified are supported by existing experimental evidence for their relevance to metabolism, atherosclerosis, or other CHD-related phenotypes.
Table 3.
Top 20 Key Drivers Derived from the CHD Causal Module from Tissue-specific Bayesian and Protein-protein Interaction Networks.
Rank | KD | Tissue | Is DM gene | Adjusted KD p value by subnet size | GWAS source where KD eSNPs show association at p <1×10-3 | Literature evidence for CHD association |
---|---|---|---|---|---|---|
1 | SPIB | Blood / Kidney | 0 | Blood p=1.76×10-25 / Kidney p=0.045 | | One of the CHD biomarkers of the 23 gene signatures 9 |
2 | TNFRSF13C | Blood / Adipose | 0 | Blood p=1.94×10-24 / Adipose p=0.041 | | Deletion of BAFF receptor reduced atherosclerosis in Ldlr knockout mice 18, 19 |
3 | PPAPDC1B | Blood | 1 | 2.1 ×10-30 | ICBP_SBP | |
4 | EBF1 | Blood | 1 | 3.2 ×10-18 | ICBP_DBP | EBF1-/- mice resulted in lipid abnormalities20 |
5 | RALGPS2 | Blood | 1 | 1.8 ×10-16 | GLGC_HDL | |
6 | CD72 | Blood | 1 | 4.6 ×10-25 | ||
7 | RASGRP3 | Blood | 1 | 1.7 ×10-22 | ||
8 | CD200 | Blood | 1 | 3.2 ×10-22 | ||
9 | TNFRSF17 | Blood | 1 | 5.8 ×10-22 | ||
10 | BACH2 | Blood | 1 | 1.0 ×10-04 | ||
11 | COBLL1 | Blood | 0 | 1.2 ×10-28 | GLGC_HDL, TC, TG | |
12 | CD79B | Blood | 0 | 1.6 ×10-28 | GIANT_BMI | One of the CHD biomarkers of the 23 gene signatures9 |
13 | SAV1 | Blood | 0 | 8.8×10-22 | GLGC_TC | |
14 | FAM3C | Blood | 0 | 2.2×10-16 | CARDIoGRAM | |
15 | LARGE | Blood | 0 | 9.3 ×10-32 | ||
16 | PKIG | Blood | 0 | 2.6 ×10-31 | ||
17 | TSPAN13 | Blood | 0 | 2.9 ×10-30 | ||
18 | CELSR1 | Blood | 0 | 4.3 ×10-30 | ||
19 | CTGF | Blood | 0 | 3.0 ×10-29 | | Increased expression in atrial fibrillation patients and animals 39 |
20 | BANK1 | Blood | 0 | 4.1 ×10-29 | ||
The key drivers (KDs) were ranked based on 4 elements: 1) consistency of the KD across multiple directional networks; 2) whether the KD is a DM gene itself; 3) if the KD contains eSNPs showing evidence of association with CHD and/or its risk factors at p <1.0×10-3; and 4) the p value of the KD based on the Key driver analysis algorithm.
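One way to combine the 4 elements is a lexicographic sort, sketched below. The text does not state how the elements were weighted, so the strict ordering used here (multi-network first, then DM membership, then GWAS support, then KDA p value) is an assumption; it happens to be consistent with the order of the rows in Table 3. The class and function names are hypothetical, and only a subset of the table rows is included.

```python
from typing import NamedTuple


class KeyDriver(NamedTuple):
    gene: str
    n_networks: int         # number of networks in which the gene is a KD
    is_dm_gene: bool        # "Is DM gene" column of Table 3
    has_gwas_support: bool  # a GWAS source is listed at p < 1.0e-3
    kd_p: float             # KDA p value (blood value for multi-tissue KDs)


def priority_key(kd):
    """Sort key: smaller tuples rank first."""
    return (-kd.n_networks, not kd.is_dm_gene, not kd.has_gwas_support, kd.kd_p)


# Subset of Table 3, deliberately listed out of order.
candidates = [
    KeyDriver("LARGE", 1, False, False, 9.3e-32),
    KeyDriver("EBF1", 1, True, True, 3.2e-18),
    KeyDriver("COBLL1", 1, False, True, 1.2e-28),
    KeyDriver("TNFRSF13C", 2, False, False, 1.94e-24),
    KeyDriver("CD72", 1, True, False, 4.6e-25),
    KeyDriver("SPIB", 2, False, False, 1.76e-25),
    KeyDriver("PPAPDC1B", 1, True, True, 2.1e-30),
]

ranked = [kd.gene for kd in sorted(candidates, key=priority_key)]
print(ranked)
```

Under this ordering a gene such as LARGE ranks below EBF1 despite a much smaller KDA p value, because the p value only breaks ties among KDs that agree on the first three elements.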
As shown in Figure 2, these top KDs constitute a tightly linked subnetwork with certain elements being tissue-specific and others showing cross-tissue connections. The KDs are at the center of the subnetwork and hence represent master regulators of this subnetwork. Not surprisingly, this subnetwork, which includes 152 genes depicted in Figure 2, is significantly enriched for immune response (GO functional category) and B cell receptor signaling pathways (Bonferroni-corrected p=1.9×10-6 and p=5.1×10-6, respectively). In addition, the subnetwork is highly enriched for genes whose eSNPs show low p value associations for both CHD in the CARDIoGRAM GWAS (enrichment p= 9.2×10-3 and 3.8×10-3 from Fisher's and KS tests, respectively) and CHD risk traits including SBP and DBP in the ICBP GWAS and lipid traits (HDL, LDL, and total cholesterol) in the GLGC GWAS (Table SIV). Furthermore, 15 genes were candidate cardiovascular genes curated in the Genetic Association Database (http://geneticassociationdb.nih.gov/). All these lines of evidence support the relevance of the KDs and their related subnetwork to atherosclerotic CHD.
Figure 2. Top KDs of the CHD causal module and the associated subnetwork.
The subnetwork was derived from the top 20 KDs using the tissue-specific networks from which the KDs were identified. The nodes of the largest size in the network are the top 20 KDs, and the middle-sized nodes represent other KDs and genes in the CHD causal module. Red rectangular, red circular, and green circular nodes are multi-tissue KDs, blood-specific KDs, and causal DM genes, respectively. The network graph was drawn using ProteoLens 38.
Discussion
We conducted a systems biology investigation of gene expression in a case-control study of CHD using network approaches. To our knowledge, this is the largest gene expression study to date of CHD incorporating multidimensional genome-wide data from multiple sources including GWAS, gene expression, eSNPs, and molecular networks.
Extending a traditional approach that targeted differentially expressed individual genes 10, we compared the coexpression network structure in CHD cases with that in matched controls to derive differential network modules. The advantage of a network approach is that it provides a contextual framework to explain how genes interact with each other differently in CHD cases versus controls. Previous research has reported strong correlations of gene coexpression patterns with causal disease mechanisms. For example, a conserved coexpression module discovered in mouse and human liver and adipose tissues was found to be correlated with multiple metabolic traits 15, 21 and its causal relationship with metabolic diseases was further validated experimentally 21-23.
Using coexpression network analyses, we identified 2 DMs that demonstrated differential coexpression patterns between cases and controls. The 79 genes in the control_tan module (Table SIII) showed a tightly co-regulated pattern in controls but a disrupted coexpression pattern in CHD cases. B cell-centered immune genes were highly enriched in this module, along with a suggestive enrichment for lipid transport genes (ABCA6, ABCA9, ABCB4, APOD, and OSBPL10). This is consistent with the known roles of immune response and lipid related pathways in atherogenesis. At a cellular level, atherosclerosis involves lipid accumulation in the artery wall, which can lead to plaque rupture, platelet aggregation, and coagulation leading to occlusive thrombus and MI. Immune responses are involved in multiple phases of atherogenesis 6, 24. From the inception of atherogenesis, multiple immunocytes including macrophages, T cells, and mast cells are recruited by adhesion molecules or leukocyte receptors expressed on the surface of endothelial cells. The precise role of B cells in atherosclerosis, however, is relatively less well studied. In the traditional view, B cells are thought to play a protective role against atherosclerosis 25. Kyaw et al. found that innate-like B1 cells mediate protection from atherosclerosis via the secretion of natural IgM antibodies 26. The tightly co-regulated pattern of B cell-related genes in controls and the disrupted pattern in CHD cases found in our study are consistent with an athero-protective role of B cells. A recent study, however, showed reduced atherosclerosis in mice after B-cell depletion 27, suggesting that B-cell subpopulations may have contrasting roles in the pathogenesis of atherosclerosis 28.
It is of note that the blood samples used in this study were collected after the onset of clinical CHD events. Therefore, the differential gene signatures and DMs identified here could be a consequence of CHD or its treatment (i.e. downstream signals) rather than being causal for CHD (i.e. upstream signals). GWAS has identified scores of loci associated with CHD and CHD risk factors, with about 27 loci for CHD or MI 3, 4, 30 for blood pressure 29-31, 95 for lipids 14, 28 for BMI 32, and 27 for type 2 diabetes 33. Traditional GWAS, however, can only detect common loci yielding stringent p value associations, typically p<5×10-8, and susceptibility loci with more subtle effects are missed. In an attempt to separate the causal and reactive effects of the DMs that differed between CHD cases and controls, we used SSEA to integrate our gene expression findings with publicly available GWAS results for CHD and its risk factors, based on the reasoning that genetic variation is unaltered over time and that converging genetic associations with CHD gene expression signatures will highlight causal genes and pathways. SSEA is a complementary approach to single-point analysis, as it tests the overall association of a set of SNPs derived from a given gene set with a trait rather than a single SNP or gene association with a trait 12, thus empowering the identification of biological pathways, functional categories, or networks. Whereas individual genes may only exert subtle effects on a trait, collectively they may contribute to disease susceptibility on a larger scale. Another advantage of SSEA lies in the fact that it utilizes eSNPs (SNPs associated with the expression levels of genes) from disease-relevant tissues to map genes to their functional SNPs. This is in contrast to distance-based gene-to-SNP mapping employed in other studies 34-36.
The control_tan DM (enriched for genes involved in B cell activation/differentiation and immune function), but not the CHD_tan DM (enriched for ion transport genes) was identified as a causal module showing enrichment for eSNPs with low p value associations from CHD GWAS in CARDIoGRAM 3 and lipid traits GWAS in GLGC 14. These results indicate that B cell-related immune function may exert a causal effect on CHD and on lipids. As discussed above, prior studies support a causal role of B cells in atherogenesis. In addition, recent research by Fretz et al. revealed direct evidence that B cell dysfunction (Ebf1-/- mice) results in lipid abnormalities 20. Therefore, our results not only support previous phenotypic observations linking B cells to both atherogenesis and lipid dysregulation, but also provide network and genetic support at the molecular level.
We further identified and prioritized putative key regulatory genes or KDs for the CHD causal DM based on the global networks derived from tissue-specific BNs and PPI networks. We identified 59 KDs for the control_tan module, 2 from multiple tissue networks and 57 from the blood-specific network. After prioritization of the 59 KDs, we retrieved a subnetwork derived from the top KDs and found that the KDs and other control_tan DM genes form a tightly linked subnetwork. Perturbations of the multi-tissue KD, TNFRSF13C, and a tissue-specific KD, EBF1, in animal models have been shown to affect aortic root atherosclerosis and other CHD-related phenotypes 18, 20. Therefore, the subnetwork structure and the KDs we uncovered provide both known CHD candidate genes with experimental support and novel candidate genes that warrant further validation.
In summary, we systematically compared the gene expression features of 188 CHD cases and 188 matched controls from the FHS via network modeling. Orthogonal studies incorporating CHD and CHD risk factor GWAS and data-driven gene regulatory networks from human and mouse models were integrated in this analysis to infer causal effects and possible pathological molecular mechanisms. We found B cell-centered immune function to be related to CHD pathogenesis. We hypothesize that disruption of gene co-regulation of a B cell-related DM via KDs contributes to CHD susceptibility. Our results provide further evidence supporting a critical role of adaptive immunity in atherosclerosis, especially aspects involving B cell function. Consequently, we speculate that immunomodulatory therapy may be useful to prevent or treat atherosclerotic CHD 37.
Supplementary Material
Significance.
Atherosclerotic coronary heart disease (CHD) as a multi-factorial disease is influenced by both genetic and environmental factors. This study utilized a systems biology framework to identify genes and networks associated with CHD via integration of whole blood gene expression profiles with network approaches, GWAS, and genetics of gene expression. This study demonstrated a CHD differential coexpression module (DM), involving B-cell centered immune functions with a disrupted coexpression pattern in CHD cases but not in matched controls. This DM was implicated as causal based on its significant enrichment for both CHD and lipid eSNPs. This causal DM was further integrated with networks to identify regulatory key driver (KD) genes and their driving subnetwork structure. These findings provided further evidence of a critical role of adaptive immunity in atherosclerosis, especially aspects involving B cell function, and may help to identify new drug targets for the treatment or prevention of CHD.
Acknowledgments
D. L. and X. Y. designed, directed, and supervised the experiment. D. L. was responsible for funding of the project. T. H., X. Y. and D. L. drafted the manuscript. P. C. organized the experiment material and data exchange. All authors participated in revising and editing the manuscript. All authors have read and approved the final version of the manuscript.
We thank Xavier Schildwachter, Jeanette Erdmann, and Christina Willenborg for technical assistance.
Sources of Funding: The study was supported by the Intramural Research Program of NHLBI (Daniel Levy, PI). The Framingham Heart Study is supported by NIH Contract N01-HC-25195. Nilesh J. Samani holds a Chair funded by the British Heart Foundation and is an NIHR Senior Investigator.
Abbreviations
- BMI
Body Mass Index
- CABG
coronary artery bypass grafting
- CHD
coronary heart disease
- CARDIoGRAM
Coronary ARtery DIsease Genome wide Replication and Meta-analysis
- CVD
cardiovascular disease
- DBP
diastolic blood pressure
- DIAGRAM
DIAbetes Genetics Replication and Meta-analysis
- FHS
Framingham Heart Study
- GIANT
Genetic Investigation of Anthropometric Traits
- GLGC
Global Lipid Genetics Consortium
- GO
gene ontology
- GWAS
Genome-wide association studies
- ICBP
International Consortium of Blood Pressure
- MI
myocardial infarction
- PPI
protein-protein interaction
- SBP
systolic blood pressure
- SSEA
SNP set enrichment analysis
- PTCA
percutaneous transluminal coronary angioplasty with or without a stent
- WGCNA
Weighted Gene Coexpression Network Analysis
Footnotes
Disclosure: None.
References
- 1.Lozano R, Naghavi M, Foreman K, et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2013;380:2095–2128. doi: 10.1016/S0140-6736(12)61728-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Glass CK, Witztum JL. Atherosclerosis. the road ahead. Cell. 2001;104:503–516. doi: 10.1016/s0092-8674(01)00238-0. [DOI] [PubMed] [Google Scholar]
- 3. Schunkert H, Konig IR, Kathiresan S, et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat Genet. 2011;43:333–338. doi: 10.1038/ng.784.
- 4. Lotta LA, Peyvandi F. Addressing the complexity of cardiovascular disease by design. Lancet. 2011;377:356–358. doi: 10.1016/S0140-6736(10)62240-4.
- 5. Park JH, Wacholder S, Gail MH, Peters U, Jacobs KB, Chanock SJ, Chatterjee N. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat Genet. 2010;42:570–575. doi: 10.1038/ng.610.
- 6. Libby P, Ridker PM, Hansson GK. Progress and challenges in translating the biology of atherosclerosis. Nature. 2011;473:317–325. doi: 10.1038/nature10146.
- 7. Sinnaeve PR, Donahue MP, Grass P, Seo D, Vonderscher J, Chibout SD, Kraus WE, Sketch M Jr, Nelson C, Ginsburg GS, Goldschmidt-Clermont PJ, Granger CB. Gene expression patterns in peripheral blood correlate with the extent of coronary artery disease. PLoS One. 2009;4:e7037. doi: 10.1371/journal.pone.0007037.
- 8. Wingrove JA, Daniels SE, Sehnert AJ, Tingley W, Elashoff MR, Rosenberg S, Buellesfeld L, Grube E, Newby LK, Ginsburg GS, Kraus WE. Correlation of peripheral-blood gene expression with the extent of coronary artery stenosis. Circ Cardiovasc Genet. 2008;1:31–38. doi: 10.1161/CIRCGENETICS.108.782730.
- 9. Rosenberg S, Elashoff MR, Beineke P, et al. Multicenter validation of the diagnostic accuracy of a blood-based gene expression test for assessing obstructive coronary artery disease in nondiabetic patients. Ann Intern Med. 2010;153:425–434. doi: 10.7326/0003-4819-153-7-201010050-00005.
- 10. Joehanes R, Ying S, Huan T, Johnson AD, Raghavachari N, Wang R, Liu P, Woodhouse KA, Tanriverdi K, Courchesne P, Freedman JE, O'Donnell CJ, Levy D, Munson PJ. Gene Expression Signatures of Coronary Heart Disease. Arterioscler Thromb Vasc Biol (In Press). doi: 10.1161/ATVBAHA.112.301169.
- 11. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559.
- 12. Zhong H, Yang X, Kaplan LM, Molony C, Schadt EE. Integrating pathway analysis and genetics of gene expression for genome-wide association studies. Am J Hum Genet. 2010;86:581–591. doi: 10.1016/j.ajhg.2010.02.020.
- 13. Zhong H, Beaulaurier J, Lum PY, et al. Liver and adipose expression associated SNPs are enriched for association to type 2 diabetes. PLoS Genet. 2010;6:e1000932. doi: 10.1371/journal.pgen.1000932.
- 14. Teslovich TM, Musunuru K, Smith AV, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010;466:707–713. doi: 10.1038/nature09270.
- 15. Emilsson V, Thorleifsson G, Zhang B, et al. Genetics of gene expression and its effect on disease. Nature. 2008;452:423–428. doi: 10.1038/nature06758.
- 16. Derry JM, Zhong H, Molony C, et al. Identification of genes and networks driving cardiovascular and metabolic phenotypes in a mouse F2 intercross. PLoS One. 2010;5:e14319. doi: 10.1371/journal.pone.0014319.
- 17. Mishra GR, Suresh M, Kumaran K, et al. Human protein reference database--2006 update. Nucleic Acids Res. 2006;34:D411–414. doi: 10.1093/nar/gkj141.
- 18. Sage AP, Harrison JE, Baker LA, Masters L, Tsiantoulas D, Binder CJ, Mallat Z. Depletion of Mature B cells by Deletion of BAFF Receptor Reduces Atherosclerosis in ldlr Knockout Mice. Circulation. 2011;124:A9224.
- 19. Sage AP, Tsiantoulas D, Baker L, Harrison J, Masters L, Murphy D, Loinard C, Binder CJ, Mallat Z. BAFF Receptor Deficiency Reduces the Development of Atherosclerosis in Mice--Brief Report. Arterioscler Thromb Vasc Biol. 2012;32:1573–1576. doi: 10.1161/ATVBAHA.111.244731.
- 20. Fretz JA, Nelson T, Xi Y, Adams DJ, Rosen CJ, Horowitz MC. Altered metabolism and lipodystrophy in the early B-cell factor 1-deficient mouse. Endocrinology. 2010;151:1611–1621. doi: 10.1210/en.2009-0987.
- 21. Chen Y, Zhu J, Lum PY, et al. Variations in DNA elucidate molecular networks that cause disease. Nature. 2008;452:429–435. doi: 10.1038/nature06757.
- 22. Yang X, Deignan JL, Qi H, et al. Validation of candidate causal genes for obesity that affect shared metabolic pathways and networks. Nat Genet. 2009;41:415–423. doi: 10.1038/ng.325.
- 23. Yang X, Peterson L, Thieringer R, et al. Identification and validation of genes affecting aortic lesions in mice. J Clin Invest. 2010;120:2414–2422. doi: 10.1172/JCI42742.
- 24. Hansson GK, Libby P. The immune response in atherosclerosis: a double-edged sword. Nat Rev Immunol. 2006;6:508–519. doi: 10.1038/nri1882.
- 25. Caligiuri G, Nicoletti A, Poirier B, Hansson GK. Protective immunity against atherosclerosis carried by B cells of hypercholesterolemic mice. J Clin Invest. 2002;109:745–753. doi: 10.1172/JCI07272.
- 26. Kyaw T, Tay C, Krishnamurthi S, Kanellakis P, Agrotis A, Tipping P, Bobik A, Toh BH. B1a B lymphocytes are atheroprotective by secreting natural IgM that increases IgM deposits and reduces necrotic cores in atherosclerotic lesions. Circ Res. 2011;109:830–840. doi: 10.1161/CIRCRESAHA.111.248542.
- 27. Ait-Oufella H, Herbin O, Bouaziz JD, et al. B cell depletion reduces the development of atherosclerosis in mice. J Exp Med. 2010;207:1579–1587. doi: 10.1084/jem.20100155.
- 28. Perry HM, McNamara CA. Refining the role of B cells in atherosclerosis. Arterioscler Thromb Vasc Biol. 2012;32:1548–1549. doi: 10.1161/ATVBAHA.112.249235.
- 29. Newton-Cheh C, Johnson T, Gateva V, et al. Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet. 2009;41:666–676. doi: 10.1038/ng.361.
- 30. Adeyemo A, Gerry N, Chen G, Herbert A, Doumatey A, Huang H, Zhou J, Lashley K, Chen Y, Christman M, Rotimi C. A genome-wide association study of hypertension and blood pressure in African Americans. PLoS Genet. 2009;5:e1000564. doi: 10.1371/journal.pgen.1000564.
- 31. Levy D, Ehret GB, Rice K, et al. Genome-wide association study of blood pressure and hypertension. Nat Genet. 2009;41:677–687. doi: 10.1038/ng.384.
- 32. Speliotes EK, Willer CJ, Berndt SI, et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet. 2010;42:937–948. doi: 10.1038/ng.686.
- 33. Voight BF, Scott LJ, Steinthorsdottir V, et al. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet. 2010;42:579–U155. doi: 10.1038/ng.609.
- 34. Segre AV, Groop L, Mootha VK, Daly MJ, Altshuler D; DIAGRAM Consortium; MAGIC Investigators. Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits. PLoS Genet. 2010;6:e1001058. doi: 10.1371/journal.pgen.1001058.
- 35. Weng LJ, Macciardi F, Subramanian A, Guffanti G, Potkin SG, Yu ZX, Xie XH. SNP-based pathway enrichment analysis for genome-wide association studies. BMC Bioinformatics. 2011;12:99. doi: 10.1186/1471-2105-12-99.
- 36. Holden M, Deng SW, Wojnowski L, Kulle B. GSEA-SNP: applying gene set enrichment analysis to SNP data from genome-wide association studies. Bioinformatics. 2008;24:2784–2785. doi: 10.1093/bioinformatics/btn516.
- 37. Lahoute C, Herbin O, Mallat Z, Tedgui A. Adaptive immunity in atherosclerosis: mechanisms and future therapeutic targets. Nat Rev Cardiol. 2011;8:348–358. doi: 10.1038/nrcardio.2011.62.
- 38. Huan T, Sivachenko AY, Harrison SH, Chen JY. ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining. BMC Bioinformatics. 2008;9(Suppl 9):S5. doi: 10.1186/1471-2105-9-S9-S5.
- 39. Nalls MA, Couper DJ, Tanaka T, et al. Multiple Loci Are Associated with White Blood Cell Phenotypes. PLoS Genet. 2011;7:e1002113. doi: 10.1371/journal.pgen.1002113.