SUMMARY
Inferring molecular networks can reveal how genetic perturbations interact with environmental factors to cause common complex diseases. We analyzed genetic and gene expression data from seven tissues relevant to coronary artery disease (CAD) and identified regulatory gene networks (RGNs) and their key drivers. By integrating data from genome-wide association studies, we identified 30 CAD-causal RGNs interconnected in vascular and metabolic tissues, and we validated them with corresponding data from the Hybrid Mouse Diversity Panel. As proof of concept, by targeting the key drivers AIP, DRAP1, POLR2I, and PQBP1 in a cross-species-validated, arterial-wall RGN involving RNA-processing genes, we re-identified this RGN in THP-1 foam cells and independent data from CAD macrophages and carotid lesions. This characterization of the molecular landscape in CAD will help better define the regulation of CAD candidate genes identified by genome-wide association studies and is a first step toward achieving the goals of precision medicine.
INTRODUCTION
Coronary artery disease (CAD) is a heritable complex disease caused by the interactions of multiple genetic and environmental risk factors that change the molecular landscape of vascular and metabolic tissues to accelerate atherosclerosis. Despite lifestyle improvements and the successful targeting of CAD risk factors, such as hypercholesterolemia (Samani and de Bono, 1996) and hypertension (Bangalore et al., 2012), CAD still accounts for the majority of cardiovascular diseases. In fact, the clinical manifestations of CAD and atherosclerosis (myocardial infarction [MI] and stroke) are responsible for nearly 50% of all deaths globally (Mensah et al., 2014). New research strategies are urgently needed to battle CAD. One promising strategy is systems genetics (Barabási et al., 2011; Björkegren et al., 2015; Civelek and Lusis, 2014; Schadt, 2009; Schadt and Björkegren, 2012), which will help achieve a global understanding of how regulatory gene networks (RGNs) act within and across tissues to cause CAD. Such knowledge is central to tailoring therapies to the specific molecular pathology of the individual patient (Collins and Varmus, 2015).
An important part of systems genetics is genome-wide association studies (GWASs), the predominant approach to genetic analysis of complex diseases for the last decade, which have led to the discovery of more than 150 genetic risk loci for CAD alone (Deloukas et al., 2013; Peden and Farrall, 2011). However, these information-rich datasets have been analyzed only from the perspective of single DNA variants and, therefore, are a largely untapped resource to further elucidate the genetic basis of complex diseases. We and others (Barabási et al., 2011; Björkegren et al., 2015; Civelek and Lusis, 2014; Schadt and Björkegren, 2012) propose to use systems genetics to integrate the analyses of GWASs with functional genomic datasets where the combined effects of many, sometimes subtle, genetic and environmental influences are captured within molecular networks. In this study, we applied a systems genetics pipeline (Figures 1 and S1), including integrative multi-tissue, GWAS, and cross-species analyses to robustly identify RGNs in CAD. In sum, we identified 30 CAD-causal RGNs harboring 59 CAD-related GWA candidate genes (Deloukas et al., 2013); of the 26 RGNs that could be tested in corresponding gene expression and phenotypic data from the Hybrid Mouse Diversity Panel (HMDP) (Bennett et al., 2010), 12 were validated. As proof of concept, key drivers in CAD-causal RGNs active in both the human and mouse atherosclerotic arterial wall (AAW) were further evaluated in a THP-1 foam cell model (Figure 1G) and in independent data from primary CAD macrophages and carotid lesions.
Figure 1. Schematic Flow of Analytic Steps.
(A) STAGE tissue sampling. The STAGE study was undertaken at the Karolinska University Hospital, Stockholm, Sweden. Patients eligible for coronary artery bypass grafting were included, and seven metabolic and vascular tissues were biopsied during surgery. RNA samples were screened with Affymetrix Gene Chips rendering expression values for 19,610 genes. DNA was screened for 909,622 SNPs (Experimental Procedures).
(B) Weighted gene co-expression network analysis (WGCNA) to generate tissue-specific and cross-tissue co-expression modules. Gene expression values across all seven tissues were considered together to identify tissue-specific and cross-tissue co-expression modules. In the illustrated topological overlap matrix (TOM, also in Figure 2A), rows and columns are genes expressed in specific tissues (i.e., the same gene may occur up to seven times), and the seven tissues are intermixed. The diagonal indicates modules of highly co-expressed genes (Experimental Procedures).
(C) Module association with CAD phenotypes. To define modules of possible relevance for CAD, we sought both linear and nonlinear relationships between the activity of the identified co-expression modules and four main CAD phenotypes as follows: (1) the extent of coronary atherosclerosis as assessed in preoperative angiograms (atherosclerosis modules), (2) plasma measures of cholesterol (i.e., total, VLDL, LDL, or HDL cholesterol levels; cholesterol modules), (3) glucose metabolism (i.e., fasting glucose, HbA1c, insulin, or pro-insulin; glucose modules), and (4) high-sensitivity C-reactive protein (CRP modules). Heatmap (left) shows an example of how genes in a second-step clustering segregate the patients into two groups that differ phenotypically. Shown are examples of regression plots (right) between individual eigengene values (x axis) and individual phenotypic values (y axis).
(D) Assessment of module causality with STAGE eQTLs and GWASs. Causal co-expression modules were sought by (1) assessing the associations of module eQTLs (or of SNPs within ±500 kb of the transcription start or end site if the module had fewer than ten eQTLs) with CAD using CARDIoGRAM data and (2) screening the module contents for CAD candidate genes identified by GWASs. The eQTLs were sought to match the tissue-specific gene expression (left); GWA genes are those mapped to regions of genome-wide significant loci as exemplified in the Manhattan plot (right).
(E) Bayesian network reconstruction of RGNs and key driver analysis. Co-expression modules indicate correlations between genes without directions. In contrast, inference of RGNs considers the causal roles of eQTLs, transcription factors, and CAD GWA candidate genes to construct directed gene-gene connections. As a consequence, the hierarchical order among network genes can be determined, and the genes most essential for network regulation can be identified in the form of key drivers.
(F) Cross-species validation with the HMDP. Many metabolic and vascular processes are believed to be conserved between mice and humans. To validate the human RGN modules, we related gene expression in corresponding mouse tissues to corresponding mouse phenotypes (Experimental Procedures).
(G) The siRNA targeting of key drivers in THP-1 macrophages and foam cells. As proof of concept, we tested key drivers of the cross-species-validated atherosclerosis RGN modules in human THP-1 monocytes differentiated into THP-1 macrophages and foam cells. First, to test key driver specificity in activating the network module, we used custom microarrays to assess the response in network gene activity in THP-1 macrophages (left). Then we determined whether siRNA targeting of the key drivers also affected THP-1 foam cell formation (right). (Bottom right) The same analytical flow is presented in a triangle, ranging from WGCNA to THP-1 foam cells.
RESULTS
Inference of Co-expression Network Modules
In the first step of our analysis (Figure 1), we used weighted gene co-expression network analysis (WGCNA) (Zhang and Horvath, 2005) to infer functionally related genes in the form of co-expression modules from 612 newly generated gene expression profiles from seven tissues of the late-stage CAD patients of the Stockholm Atherosclerosis Gene Expression (STAGE) study: AAW and non-AAW (internal mammary artery [IMA]), liver, skeletal muscle (SM), visceral fat (VF), subcutaneous fat (SF), and whole blood (Hägg et al., 2009; Table S1; Figures 1A, 1B, and S1). We identified 94 tissue-specific and 77 cross-tissue modules (Figure 2A; Tables S2 and S3). The relevance of these 171 modules for CAD was reflected in the top ten biological processes and molecular functions according to gene ontology (Table S3; Supplemental Experimental Procedures). Moreover, we found 2,467 genes previously related to CAD/atherosclerosis (33%, p = 1.5 × 10−24; Supplemental Experimental Procedures) and 147 genes of 309 candidate genes proposed for risk loci identified by GWASs (Table S4; GWA genes) for CAD (n = 33 of 53 GWA genes) (Deloukas et al., 2013), plasma lipids (n = 87 of 183), and measures of plasma glucose metabolism (n = 22 of 43), including type 2 diabetes (n = 45 of 98) (Welter et al., 2014).
Figure 2. Co-expression Modules Associated with CAD Phenotypes.
(A) TOM of gene activity sorted according to co-expression within and across tissues forming co-expression modules. The TOM shows the 20% most strongly co-expressed genes within and across tissues in 61 CAD-associated modules. Color-coded bars on the y axis and x axis indicate individual modules. The red rectangle encompasses modules related to coronary atherosclerosis; the two green rectangles encompass the modules related to measures of plasma cholesterol; the two blue rectangles encompass the modules related to measures of plasma glucose metabolism; and the two purple rectangles encompass the modules related to plasma CRP. Overlapping rectangles reflect modules associated with more than one of the four main phenotypes. The yellow-red color code (bottom right) indicates the strength of TOM associations (i.e., a weighted combination of direct co-expression and shared co-expression to other genes in the network).
(B) The sixty-one modules (module ID/row) were associated with the extent of coronary atherosclerosis assessed in preoperative angiograms (first column, ATH); plasma levels of total, VLDL, LDL, and HDL cholesterol and triglycerides (second column, CHOL); fasting glucose, HbA1c, insulin, or pro-insulin levels (third column, GLUC); and CRP level (fourth column, CRP). The module-phenotype association was assessed by second-step clustering of the STAGE patients, based on expression of module genes, and by correlating eigengene module values for each STAGE patient with phenotypic values; red colors indicate strength of associations (−log10(P)) (Experimental Procedures). The second and third columns indicate enrichment in disease association (risk) determined from the CARDIoGRAM dataset of module eQTLs (RISK, from 1- to 9-fold, green) and the presence of established GWA genes for CAD (GWAS, from 1–6 GWA genes, blue), respectively. (Right) The lengths of the horizontal bars indicate module size (i.e., gene numbers); the colors indicate the tissues where the module is active.
To test the robustness of cross-tissue modules against the contribution of gene expression in individual tissues, we removed gene expression data from one tissue and re-clustered the data. The 171 modules could largely be retrieved, except when data from dominant tissues in cross-tissue modules were removed (Supplemental Experimental Procedures).
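The topological overlap underlying the co-expression modules can be illustrated with a minimal sketch. It assumes the unsigned soft-threshold adjacency and topological overlap formulas of Zhang and Horvath (2005); the power `beta = 6`, the toy expression values, and all function names are illustrative assumptions, not the settings used for STAGE.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta, with a zero diagonal."""
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj

def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums
    a_iu * a_uj over shared neighbors u and k_i is the connectivity of gene i."""
    g = len(adj)
    k = [sum(row) for row in adj]  # diagonal is zero, so this excludes self
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u != i and u != j)
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / (
                min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

# Toy data (hypothetical): rows are genes, columns are patients.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [1.1, 1.9, 3.2, 3.9, 5.1],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
tom = topological_overlap(adjacency(expr))
```

In this toy matrix the first three genes share a trend and therefore overlap strongly, whereas the fourth gene overlaps weakly with all of them; hierarchical clustering on 1 − TOM would then group the first three into one module.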
Identification of Modules Associated with Main CAD Phenotypes
In the third step of the analysis pipeline (Figures 1C and S1), we used regression analysis and second-step clustering to identify linear and non-linear associations between module gene expression and CAD phenotypes (Supplemental Experimental Procedures). We found 61 modules that were associated with at least one of the following four major CAD phenotypes: (1) extent of coronary atherosclerosis as measured in preoperative angiograms (Hägg et al., 2009) (resulting in 14 atherosclerosis modules); (2) plasma levels of total cholesterol (TC), very-low-density lipoprotein (VLDL), low-density lipoprotein (LDL), or high-density lipoprotein (HDL) (29 cholesterol modules); (3) plasma glucose (fasting), HbA1c, insulin, or pro-insulin levels (14 glucose modules); and (4) the inflammatory marker C-reactive protein (CRP) (15 CRP modules) (Figure 2B). Ten modules were associated with more than one of the four phenotypes (Figure 2B; Table S2; Supplemental Experimental Procedures). Manual analysis of the atherosclerosis-, cholesterol-, glucose-, and CRP-associated modules suggested that, depending on phenotype association, the modules were more likely to be tissue specific (atherosclerosis, cholesterol, and CRP modules) or cross-tissue (glucose modules).
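The linear part of the module-phenotype association can be sketched as follows. This is a minimal illustration that assumes the module eigengene is the first principal component of the standardized module expression (computed here by power iteration) and that association is measured by Pearson correlation; the expression and phenotype values are invented, and second-step clustering is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

def module_eigengene(expr, n_iter=100):
    """First principal component across patients, by power iteration.

    Starts from the mean standardized profile, so the sign of the result
    follows the average expression of the module. Assumes that this mean
    profile is not identically zero.
    """
    z = [standardize(gene) for gene in expr]
    n_samples = len(z[0])
    v = [sum(gene[s] for gene in z) / len(z) for s in range(n_samples)]
    for _ in range(n_iter):
        loadings = [sum(a * b for a, b in zip(gene, v)) for gene in z]
        w = [sum(load * gene[s] for load, gene in zip(loadings, z))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy module (hypothetical): three genes measured in six patients.
expr = [
    [1.0, 2.1, 2.9, 4.2, 5.1, 5.8],
    [0.5, 1.1, 1.4, 2.2, 2.4, 3.1],
    [2.0, 2.2, 3.1, 3.0, 4.1, 4.4],
]
phenotype = [10, 14, 19, 25, 28, 35]  # hypothetical phenotype values

eigengene = module_eigengene(expr)
r = pearson(eigengene, phenotype)
```

One eigengene value per patient summarizes the activity of the whole module, which is what allows a single regression per module and phenotype.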
Inferring Module Causality by Using GWAS of CAD
The associations between the 61 co-expression modules and the four CAD phenotypes do not reveal whether the module activity alters the phenotypic values (causal) or is caused by changes in phenotypic values (reactive). Therefore, in the fourth step of our analysis (Figures 1D and S1), we assessed if the phenotype-associated modules were CAD causal or not. As in previous studies (Emilsson et al., 2008; Foroughi Asl et al., 2015; Schadt and Björkegren, 2012), we defined causal modules as those containing either of the following: (1) cis-acting expression quantitative trait loci (cis-eQTLs) (or, if a module had <10 eQTLs, cis-located SNPs) enriched in association with CAD according to CAD GWAS (Deloukas et al., 2013; Figure 2B, column 2); or (2) one or more CAD candidate genes (Brænne et al., 2015) already identified for CAD GWAS risk loci (Figure 2B, column 3; Table S4).
Using these definitions, we identified 30 CAD-causal modules (Figure 2B): eight atherosclerosis modules (of 14), ten cholesterol modules (of 29), five glucose modules (of 14), four CRP modules (of 15), and three additional CAD-causal modules that were associated with more than one of the four CAD phenotypes (Figure 2B). By integrating GWASs also for CAD risk factors (Welter et al., 2014), we found that the 30 CAD-causal modules contained 59 unique GWA genes for CAD (n = 19), plasma cholesterol (n = 39), and glucose metabolism (n = 14, including type 2 diabetes GWA genes) (Table S4) and 42% CAD/atherosclerosis genes (p = 1.2 × 10−5). Of the CAD-causal modules, 18 were also causal for plasma cholesterol levels (n = 15) or glucose metabolism (n = 12), based on their eQTL risk enrichment or the presence of GWA genes (Table S2).
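The two-part definition of a CAD-causal module can be written as a simple decision rule. The sketch below is only an illustration: it assumes that enrichment is summarized as the fraction of module SNPs with a GWAS p value below a cutoff divided by the same fraction among background SNPs, and the cutoff, the minimum fold (`min_fold`), the gene names, and the p values are hypothetical. The significance testing of the enrichment described in the Supplemental Experimental Procedures is not reproduced.

```python
def fold_enrichment(module_pvals, background_pvals, cutoff=0.05):
    """Ratio of the fraction of nominally associated SNPs in the module
    to the corresponding fraction among background SNPs."""
    module_frac = sum(p < cutoff for p in module_pvals) / len(module_pvals)
    background_frac = (sum(p < cutoff for p in background_pvals)
                       / len(background_pvals))
    return module_frac / background_frac

def is_cad_causal(module_genes, eqtl_pvals, cis_snp_pvals, background_pvals,
                  cad_gwa_genes, min_eqtls=10, min_fold=2.0):
    """Causal if the module SNPs are risk enriched OR the module contains
    at least one CAD GWA candidate gene."""
    # Fall back to cis-located SNPs when the module has fewer than ten eQTLs.
    snp_pvals = eqtl_pvals if len(eqtl_pvals) >= min_eqtls else cis_snp_pvals
    risk_enriched = fold_enrichment(snp_pvals, background_pvals) >= min_fold
    has_gwa_gene = any(gene in cad_gwa_genes for gene in module_genes)
    return risk_enriched or has_gwa_gene

# Hypothetical inputs for three toy modules.
background = [0.01, 0.2, 0.5, 0.8, 0.3, 0.04, 0.6, 0.9, 0.7, 0.45]
gwa_genes = {"GENE_X"}

enriched_eqtls = [0.001, 0.002, 0.01, 0.03, 0.04, 0.02,
                  0.6, 0.004, 0.3, 0.01, 0.049, 0.8]
few_eqtls = [0.2, 0.7, 0.9]
flat_cis_snps = [0.5, 0.6, 0.2, 0.9, 0.03]

module_a = is_cad_causal(["GENE_A", "GENE_B"], enriched_eqtls, [],
                         background, gwa_genes)
module_b = is_cad_causal(["GENE_C"], few_eqtls, flat_cis_snps,
                         background, gwa_genes)
module_c = is_cad_causal(["GENE_X", "GENE_D"], few_eqtls, flat_cis_snps,
                         background, gwa_genes)
```

Here module A qualifies through risk enrichment, module C through its GWA gene, and module B through neither, which corresponds to a phenotype-associated but reactive module.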
Inference of RGNs and Key Drivers from the Co-expression Modules
In the next step of the analysis (Figures 1E and S1), to define the directions of edges between genes in the 30 CAD-causal modules, we inferred RGNs including key drivers by Bayesian network modeling, using transcription factors, eQTL genes, and GWA genes as prior candidate regulators (Figures 3 and S2–S5; Tables S5 and S6; Supplemental Experimental Procedures).
Figure 3. CAD-Causal RGNs Associated with Coronary Atherosclerosis.
(A–H) Bayesian RGNs with key drivers inferred (Experimental Procedures) from CAD-causal modules (i.e., module eQTLs are risk enriched or contain CAD GWA candidate genes) linked to extent of coronary atherosclerosis. Fold enrichment for CAD association was assessed for network eQTLs/SNPs using the CARDIoGRAM dataset. The molecular process with the strongest functional enrichment assessed by gene ontology is indicated. KD, key driver; GWA genes, total and CAD candidate genes identified in GWASs of CAD, plasma lipid/glucose levels, and type 2 diabetes.
Of the RGNs inferred from the 30 CAD-causal modules, eight associated with the extent of atherosclerosis (i.e., atherosclerosis modules) are shown in Figure 3. A nearly SF-specific RGN related to steroid and lipid metabolism (n = 231 genes, including one liver and one AAW gene) contained ten GWA genes identified for CAD, cholesterol, or glucose traits (associated traits in brackets: APOA1 [CAD, LDL, HDL, TC], ANGPTL3 [LDL, TC], HPR [LDL, TC], LIPC [HDL, TC], LIPG [HDL, TC], NPC1L1 [LDL, TC], PPP1R3B [LDL, HDL, TC, insulin, glucose], PROX1 [glucose], SLC2A2 [glucose], and TMEM195 [glucose]). Of these ten GWA genes, only APOA1 was a key driver; the other key drivers were CYP4F2, F11, KLKB1, KMO, MYCL1, and ZGPAT. This RGN also was highly enriched in association with fasting glucose (8.0-fold, p < 3.3 × 10−272) and TC (8.7-fold, p < 1.0 × 10−300) (Figure 3A). Another truly SF-specific RGN, related to metabolic processes and oxidoreductase activity (n = 195 genes), contained as many as 15 GWA genes (ABCG5 [LDL, CAD, TC], ABCG8 [LDL, CAD, TC], ANXA9 [LDL], APOA5 [LDL, HDL, CAD, TC-TG], APOC4 [LDL], CPS1 [HDL], CYP7A1 [LDL, TC], FOXA2 [glucose], GCKR [TC, glucose, insulin], HNF1A [TC, LDL], HNF1B [T2DM], LPA [LDL, CAD, TC], NR0B2 [LDL, HDL], PLG [CAD], and PCSK9 [LDL, CAD, TC]). Again, besides PLG, the key drivers of this network were not these GWA genes; they were FOXA2, NR1I2, ONECUT2, and UROC1 (Figure 3B).
Three CAD-causal atherosclerosis-related RGNs (Figures 3C, 3D, and 3H) were AAW specific. The first, an RGN (n = 131 genes) enriched in immune system processes, contained the macrophage lipid and inflammatory regulators and GWA genes APOE [CAD] and NR1H3 [HDL] as its key drivers, together with CD80, E2F2, HCLS1, and PDE8B (Figure 3C). The second, an RGN (n = 109 genes) also linked to HDL cholesterol levels, contained RNA-processing genes highly enriched in CAD association (2.1-fold, p < 1.0 × 10−300) with seven key drivers (AIP, DRAP1, MRPL28, PCBD1, POLR2I, PQBP1, and ZNF91) and one GWA gene (ARF5 [T2DM]) (Figure 3D). The third AAW-specific RGN was small (n = 31 genes), even further enriched in CAD association (4.6-fold, p < 1.0 × 10−300), and mainly contained genes involved in inflammatory responses and the GWA gene ADRA2A [glucose] (Figure 3H).
The remaining three CAD-causal, atherosclerosis-related RGNs contained genes from multiple tissues (Figures 3E–3G). The first was dominated by AAW genes (n = 53/81), but it had at least three genes active in all of the CAD-relevant tissues except in blood. This RGN contained two GWA genes (ABCA1 [HDL, TC] and CETP [LDL, HDL, TG-TC]); was highly enriched in association with CAD (3.8-fold, p < 8.8 × 10−49), plasma triglyceride levels (5.3-fold, p < 2.3 × 10−103), and fasting glucose levels (1.9-fold, p < 5.6 × 10−06); and had seven key drivers (four in AAW and three in liver) with diverse functions, including transmembrane signaling (Figure 3E). The second RGN was smaller (n = 44 genes) and active across all tissues except liver (Figure 3F). Besides the Chr9p21 risk locus GWA gene CDKN2A (Welter et al., 2014) in VF, it had three additional VF key drivers: CLDN4, CTNNAL1, and MYSM1. The third multi-tissue RGN (n = 37 genes), dominated by SF and AAW genes (n = 12 and 22, respectively), was linked to HbA1c levels and enriched in associations with CAD (4.1-fold, p < 3.4 × 10−16), HbA1c levels (3.3-fold, p < 6.8 × 10−10), and insulin levels (3.0-fold, p < 1.8 × 10−8) (Figure 3G). This RGN had three key drivers (CLEC12A, CTSS, and FOLR3) implicated in cell adhesion, cell-cell signaling, glycoprotein turnover, inflammation, and immune responses (e.g., antigen presentation). The RGNs of the remaining 22 CAD-causal cholesterol, glucose, and CRP modules (Figure 2B; Table S2) are shown in Figures S2–S5; the characteristics and directions of gene-gene interactions of the remaining RGNs inferred from all 171 modules are described in Tables S5 and S6.
The CAD GWA genes (Brænne et al., 2015) were included as regulatory priors in the RGN inference. However, compared to eQTLs and transcription factors, their role as transcriptional regulators can be debated. Of 25 GWA genes in 17 of the 30 CAD-causal modules, 13 were identified as key driver genes. To test whether the RGN topology and the identification of the non-GWA key drivers were dependent on including GWA genes as regulatory priors, we repeated the RGN inference without these; 16 RGNs were re-identified and contained, on average, 89% of the same edges and 91% of the same non-GWA key drivers (Supplemental Experimental Procedures).
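How directed edges make it possible to rank genes within an RGN can be illustrated with a small sketch. It is not the key driver analysis used in this study (see Supplemental Experimental Procedures); it assumes, purely for illustration, that a key driver is a gene whose number of downstream genes exceeds the network mean by more than one standard deviation, and the toy network and node names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

def downstream(graph, start):
    """All genes reachable from `start` along directed edges (breadth first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return seen

def key_drivers(graph):
    """Genes whose downstream reach exceeds mean + 1 SD (assumed criterion)."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    reach = {n: len(downstream(graph, n)) for n in nodes}
    cutoff = mean(reach.values()) + pstdev(reach.values())
    return sorted(n for n, size in reach.items() if size > cutoff)

# Toy directed network (hypothetical): regulator -> list of regulated genes.
toy_rgn = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "G": ["C"],
}
drivers = key_drivers(toy_rgn)
```

In the toy network only gene A sits upstream of most other genes, so it is the only gene returned; an undirected co-expression module would not allow this kind of ranking.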
Cross-Species Validation of CAD-Causal RGNs
Mouse models have been used successfully to study atherosclerosis in relation to plasma levels of cholesterol (Björkegren et al., 2014; Skogsberg et al., 2008), glucose (Guo, 2014), and CRP (Kovacs et al., 2007), suggesting that some vascular and metabolic processes leading to CAD/atherosclerosis are shared between mice and humans. This might be particularly true for RGNs thought to be well conserved throughout evolution (Hinman et al., 2003). Therefore, in the next step of the pipeline (Figures 1F and S1), we tested the phenotypic associations of 26 of the 30 CAD-causal RGNs (Figures 3, S2, S3, and S5) against the HMDP (Bennett et al., 2010); the four CRP RGNs (Figure S4) could not be tested, as CRP was not assessed in the HMDP.
The HMDP consists of up to 105 different strains of mice with considerable phenotypic variation and global gene expression from relevant tissues (such as aortic atherosclerotic lesions, liver, heart [used for STAGE SM], and adipose tissue) as well as phenotypic characteristics similar to those of the STAGE study (Supplemental Experimental Procedures). In examining mouse orthologs of genes in the 26 CAD-causal RGNs (Table 1), we found that the activity in three of eight CAD-causal, atherosclerosis-specific modules (Figures 3D–3F) also segregated the 105 strains of mice according to their extent of atherosclerosis. Similarly, three of ten cholesterol-specific modules (Figures S2C, S2D, and S2F), three of five glucose-specific modules (Figures S3A, S3C, and S3E), and at least one phenotype in three modules associated with multiple phenotypes (Figure S5) also segregated the 105 strains of mice of the HMDP according to the corresponding mouse phenotypes (Table 1). Thus, 46% (12/26) of the CAD-causal RGNs/modules were related to corresponding phenotypes in mice.
Table 1.
RGN Modules Causally Related to CAD Segregating the STAGE Patients and up to 105 Strains of Mice of the HMDP According to Similar Phenotypes
Module ID | STAGE Phenotype | STAGE Tissue | STAGE Combined p Value | HMDP Phenotype | HMDP Tissue | HMDP Combined p Value |
---|---|---|---|---|---|---|
Atherosclerosis modules | ||||||
42 (Figure 3D) | coronary atherosclerosis | AAW | 0.006 | atherosclerosis | aorta | 0.01 |
58 (Figure 3E) | coronary atherosclerosis | AAW | 0.001 | atherosclerosis | aorta | 0.0007 |
98 (Figure 3F) | coronary atherosclerosis | AAW | 0.03 | atherosclerosis | aorta | 0.005 |
Cholesterol modules | ||||||
87 (Figure S2C) | plasma-cholesterol | SF | 0.01 | LDL/TC | adipose | 0.008 |
95 (Figure S2D) | HDL cholesterol | VF | 0.002 | LDL cholesterol | adipose | 0.009 |
100 (Figure S2F) | HDL cholesterol | SF | 0.006 | LDL cholesterol | adipose | 0.0002 |
Glucose modules | ||||||
64 (Figure S3A) | pro-insulin | SF | 7.0 × 10−05 | insulin | adipose | 0.02 |
112 (Figure S3C) | pro-insulin | liver | 0.0007 | glucose | liver | 0.001 |
140 (Figure S3E) | blood insulin | SF | 2.0 × 10−05 | insulin | adipose | 0.009 |
Multi-phenotype modules | ||||||
27 (Figure S5A) | LDL cholesterol | SF | 0.006 | LDL cholesterol | adipose | 0.0001 |
36 (Figure S5B) | plasma-cholesterol | SM | 5.0 × 10−08 | TC | heart | 0.02 |
105 (Figure S5C) | blood insulin | SF | 8.0 × 10−05 | insulin | adipose | 0.02 |
AAW expression of mouse orthologs for genes in the STAGE atherosclerosis modules was tested against the extent of aortic root lesions in up to 105 strains of mice cross-bred with the ApoELeiden transgenic pro-atherogenic mouse model. Adipose tissue, heart, and liver expression of orthologs of genes in the STAGE cholesterol and glucose modules was tested against corresponding phenotypes in up to 105 strains of wild-type mice. The extent of human coronary atherosclerosis was assessed from coronary angiograms; mouse atherosclerosis was assessed by Oil-Red-O staining of the aortic roots. The p values are combined from second-step clustering and eigengene module correlations in both STAGE and HMDP (Supplemental Experimental Procedures).
In addition, we used aortic root lesion eQTLs and SNP p values from a GWAS of the extent of aortic root lesions in the HMDP (Supplemental Experimental Procedures) to assess the extent to which the three cross-species-validated, atherosclerosis-specific modules (Figures 3D–3F) also were causal for the mouse atherosclerosis phenotype. The mouse eQTLs of orthologs of genes in RGN 42 (Figure 3D), but not in RGN 58 (0.2-fold, p = not applicable) and RGN 98 (0.0-fold, p = not applicable) (Figures 3E and 3F), were highly enriched in association with mouse atherosclerosis (2.5-fold, p < 4.3 × 10−84) and, thus, causal for mouse atherosclerosis.
The Role of Cross-Species-Validated Atherosclerosis RGNs in THP-1 Foam Cells
Foam cells form continuously throughout all stages of atherosclerosis development (Lusis, 2000) and, thus, may be affected by CAD-causal RGNs related to the extent of human and mouse atherosclerosis. Accordingly, in the final step of our analysis (Figures 1G and S1), we used differential gene expression and cholesterol-ester accumulation measures (Supplemental Experimental Procedures) to examine the effects of small interfering RNA (siRNA) silencing of key drivers in the three cross-species-validated CAD-causal atherosclerosis-related RGNs (Figures 3D–3F; modules 42, 58, and 98 in Table S2) in an in vitro model of atherosclerosis: THP-1 cells differentiated into macrophages and incubated with acetylated LDL to generate foam cells (Skogsberg et al., 2008). In RGN 42, which is enriched in RNA-processing genes (Figure 3D), individual silencing of four of seven key drivers (AIP, DRAP1, POLR2I, and PQBP1) specifically activated RGN 42 network genes in THP-1 macrophages and altered cholesterol-ester accumulation in THP-1 foam cells (decreased for POLR2I, PQBP1, and DRAP1; increased for AIP; Table 2). In contrast, silencing of key drivers in RGNs 58 and 98 did not specifically activate genes in these RGNs (Supplemental Experimental Procedures; Table S7).
Table 2.
Effects of siRNA Targeting of the AAW Key Drivers in RGN Module 42 (Figure 3D) on Network Gene Activity and Cholesterol-Ester Accumulation in THP-1 Macrophages
Key Driver in Module 42 | Affected Network Genes at Nominal p < 0.05 (of n = 109; THP-1 Macrophages) | Hypergeometric Test p (Nominal p < 0.05) | Affected Network Genes at FDR < 10% (THP-1 Macrophages) | Hypergeometric Test p (FDR < 10%) | Control CE (Relative Levels, Mean ± SD; THP-1 Foam Cells) | siRNA CE (Relative Levels, Mean ± SD; THP-1 Foam Cells) | CE Content (% Relative to Control) | p Value |
---|---|---|---|---|---|---|---|---|
POLR2I | 62 | 2.2 × 10−06 | 59 | 1.0 × 10−06 | 100 ± 48 | 38 ± 24 | −62 | 0.001 |
PQBP1 | 41 | 0.0002 | 24 | 0.002 | 100 ± 15 | 69 ± 11 | −31 | 7.0 × 10−05 |
AIP | 25 | 0.0005 | 9 | 0.003 | 100 ± 15 | 149 ± 81 | +49 | 0.02 |
DRAP1 | 32 | 0.003 | 12 | 0.02 | 100 ± 15 | 70 ± 17 | −30 | 0.0007 |
MRPL28 | 15 | 0.03 | 1 | 0.44 | – | – | – | – |
PCBD1 | 13 | 0.03 | 0 | 1.0 | – | – | – | – |
ZNF91 | 24 | 0.07 | 0 | 1.0 | – | – | – | – |
All siRNA experiments achieved >80% gene silencing except for POLR2I (>50%). CE, cholesterol ester; FDR, false discovery rate; –, not tested.
Values are mean ± SD.
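The hypergeometric test reported in Table 2 asks whether more network genes respond to silencing than expected by chance. A minimal sketch of the upper-tail probability is given below; the function and argument names are illustrative, and the background quantities the test needs (the number of genes on the array and the total number of genes affected by each siRNA) are not stated here, so only a small worked example with made-up numbers is shown.

```python
from math import comb

def hypergeom_upper_tail(k, n_drawn, n_success, n_total):
    """P(X >= k) for X ~ Hypergeometric(n_total, n_success, n_drawn).

    In the Table 2 setting, n_total would be all genes measured, n_success
    the network genes (109 for module 42), n_drawn all genes affected by the
    siRNA, and k the affected genes that belong to the network.
    """
    upper = min(n_drawn, n_success)
    favorable = sum(
        comb(n_success, i) * comb(n_total - n_success, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(n_total, n_drawn)

# Worked toy example: 10 genes in total, 4 in the network, 3 affected,
# of which at least 2 are network genes.
# P = [C(4,2)*C(6,1) + C(4,3)*C(6,0)] / C(10,3) = (36 + 4) / 120 = 1/3
p_toy = hypergeom_upper_tail(2, 3, 4, 10)
```

Exact integer arithmetic via `math.comb` avoids rounding problems in the tail sum; the final division converts the ratio to a float.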
Re-identification of RGN 42 in Independent Gene Expression Data
Since RGN 42 was the sole RGN to emerge from the analysis pipeline (Figures 1A–1G and S1), we used independent gene expression datasets to examine the extent to which it could be re-identified in the following: (1) in-house global gene expression datasets from primary blood macrophage samples (n = 36) and carotid lesions (n = 25) isolated from patients before and during carotid surgery, respectively (Hägg et al., 2009); and (2) datasets generated by others from lipopolysaccharide-stimulated monocytes (n = 18) and macrophages (n = 18) from CAD patients (Schirmer et al., 2009). Reassuringly, the 109 genes in RGN 42 were more strongly co-expressed than sets of 109 random genes, especially in the carotid lesions (9.5-fold higher connectivity of RGN 42 genes than of genes in 10,000 random sets, p < 1.0 × 10−300) and also in macrophages (3.7-fold, p = 6.2 × 10−37) (Hägg et al., 2009), CAD lipopolysaccharide-stimulated monocytes (2.2-fold, p = 1.1 × 10−13), and CAD macrophages (2.0-fold, p = 1.1 × 10−11) (Schirmer et al., 2009; Supplemental Experimental Procedures).
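The comparison of a gene set against random gene sets of the same size can be sketched as follows. The sketch assumes that connectivity is the mean absolute pairwise Pearson correlation and reports an empirical p value from the random sets; the connectivity measure and the statistical test actually used are described in the Supplemental Experimental Procedures, and the simulated data, set sizes, and seeds here are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_connectivity(corr, genes):
    """Mean absolute correlation over all pairs of genes in the set."""
    pairs = [(i, j) for a, i in enumerate(genes) for j in genes[a + 1:]]
    return sum(abs(corr[i][j]) for i, j in pairs) / len(pairs)

def connectivity_vs_random(corr, gene_set, n_random=200, seed=0):
    """Fold connectivity over random same-size sets and an empirical p value."""
    rng = random.Random(seed)
    observed = mean_connectivity(corr, gene_set)
    all_genes = list(range(len(corr)))
    null = [mean_connectivity(corr, rng.sample(all_genes, len(gene_set)))
            for _ in range(n_random)]
    fold = observed / (sum(null) / len(null))
    p_emp = (1 + sum(v >= observed for v in null)) / (n_random + 1)
    return fold, p_emp

# Simulated data (hypothetical): 30 genes, 12 samples; genes 0-4 share a signal.
data_rng = random.Random(1)
signal = [data_rng.gauss(0, 1) for _ in range(12)]
expr = []
for g in range(30):
    if g < 5:
        expr.append([s + 0.3 * data_rng.gauss(0, 1) for s in signal])
    else:
        expr.append([data_rng.gauss(0, 1) for _ in range(12)])
corr = [[pearson(a, b) for b in expr] for a in expr]

fold, p_emp = connectivity_vs_random(corr, [0, 1, 2, 3, 4])
```

With only 200 random sets the smallest attainable empirical p value is 1/201, so p values as small as those reported above would require a parametric test rather than this resampling count.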
CAD-Causal Arterial Wall RGNs and Gene Connectivity
Finally, we analyzed the CAD phenotype-associated modules outside the main analysis pipeline (Figures 1A–1G and S1). First, we utilized a unique feature of the STAGE cohort, a nearly atherosclerosis-free arterial wall sample (i.e., IMA), as an internal control for the AAW. When molecular networks reflecting physiological events are challenged with the effects of genetic and environmental disease perturbations, gene-gene connectivity increases for disease-relevant processes (Zhang et al., 2013). Accordingly, to look for changes in connectivity in the STAGE modules, we compared AAW (disease) and IMA (healthy).
For the 13 AAW/IMA-specific or dominant modules associated with CAD phenotypes (Figure 2B), we calculated the gain/loss in module connectivity (Figure 4). Interestingly, all five arterial wall modules with >2-fold increases in connectivity also were CAD causal (Figure 4). Of the eight arterial wall modules with <2-fold increases in connectivity (seven AAW and one IMA), only two were CAD causal (both in AAW). In both of these, gene connectivity was already high in IMA (e.g., module 42, Figure 4). Thus, CAD-causal modules appear to be characterized by high or increased connectivity, whereas CAD-reactive modules either do not change or lose connectivity in the arterial wall as CAD develops.
Figure 4. Differential Connectivity in Arterial Wall Modules with and without Atherosclerosis.
Differential connectivity plots for AAW- and IMA-dominant, phenotype-associated modules. Each plot shows the adjacency values (weighted correlations) between all pairs of module genes in IMA (lower triangle) and AAW (upper triangle). Modules are sorted from high to low according to module differential connectivity (MDC) (ratios of sum of connectivity values in AAW to IMA). Red bars indicate CAD-causal modules. Green bars indicate non-causal modules. MDC values >1 correspond to modules with a gain of connectivity in AAW versus IMA.
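The module differential connectivity defined in this legend is a simple ratio, sketched below for two small adjacency matrices. The matrices are invented for illustration, and the function name is an assumption.

```python
def module_differential_connectivity(adj_disease, adj_control):
    """MDC: sum of pairwise adjacencies in the disease tissue (AAW) divided
    by the sum of pairwise adjacencies in the control tissue (IMA)."""
    n = len(adj_disease)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    disease_sum = sum(adj_disease[i][j] for i, j in pairs)
    control_sum = sum(adj_control[i][j] for i, j in pairs)
    return disease_sum / control_sum

# Hypothetical adjacencies for a three-gene module.
adj_aaw = [
    [0.0, 0.8, 0.6],
    [0.8, 0.0, 0.7],
    [0.6, 0.7, 0.0],
]
adj_ima = [
    [0.0, 0.3, 0.2],
    [0.3, 0.0, 0.2],
    [0.2, 0.2, 0.0],
]
mdc = module_differential_connectivity(adj_aaw, adj_ima)
gain_over_twofold = mdc > 2.0
```

In this toy module the pairwise adjacencies sum to about 2.1 in AAW and 0.7 in IMA, giving an MDC of about 3, which would place it among the modules with a >2-fold gain of connectivity.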
To identify potential causes of the connectivity differences, we analyzed cell type enrichment and transcription factor-binding site enrichment (Roider et al., 2009) for the seven CAD-causal arterial wall modules (modules 20, 37, 42, 58, 69, 133, and 138) (Figure 4). For the cell type enrichment, we examined 321 of 409 genes with matches in CTen (Shoemaker et al., 2012) that were responsible for the increased connectivity in the five modules with >2-fold increases in connectivity. CD14-positive monocytes stood out with an enrichment score of 65 (CD33-positive myeloid cells had the second highest enrichment score, 49), suggesting that connectivity is increased by the infiltration, and possibly differentiation, of circulating monocytes in the atherosclerotic plaque.
For the transcription factor-binding site enrichment, genes in module 58 were enriched for binding sites of MEF2A, a transcriptional activator implicated in CAD (Bhagavatula et al., 2004; Wang et al., 2013). Interestingly, among MEF2A’s target genes were the first (CLEC2D) and fifth (CCRN4L) order key drivers in module 58; CCRN4L has been linked to cardiovascular diseases (Binns et al., 2009).
A Super-Network of 30 CAD-Causal RGNs
The second analysis we performed outside the main pipeline (Figures 1A–1G and S1) was to identify a super-network of the 30 CAD-causal RGNs/modules by calculating the correlation values between the module eigengenes (Supplemental Experimental Procedures; Langfelder and Horvath, 2007). We found that all 30 CAD-causal RGNs/modules were interconnected and together formed a super-network connected across all seven tissues (r > 0.4, p < 0.05, Figure 5). Within this super network, SF dominated with 13 modules, AAW with seven, and SM and VF with four modules each; notably, blood and liver had only one CAD-causal module each. Nine modules were connected only within a given tissue as follows: modules 14, 23, 148, and 152 in SF; module 8 in VF; modules 89 and 99 in SM; and modules 69 and 133 in AAW (Figure 5). In contrast, the cross-tissue modules 98 (dominated by VF genes including its key drivers) and 113 (dominated by SF genes including its key drivers) appeared to act as hub modules, mediating all but one (module 27 in SF) connection from SM and liver (module 98) and from SF (module 113) to the AAW (none of the CAD-causal modules were identified in the IMA).
Figure 5. A Super-Network of CAD-Causal Modules.
Eigengene associations (r > 0.4, p < 0.05) were used to link the 30 CAD-causal modules (Experimental Procedures). RGNs/modules (circles) are oriented according to dominating tissue-belonging of genes and are color coded accordingly. Numbers in circles are the module IDs. Circle size corresponds to the number of RGN/module genes. Next to each box, the names of GWA genes in the RGNs/modules and related trait (in superscript) are indicated. TS, tissue-specific RGN/module; CT, cross-tissue RGN/module; colored circle circumferences, phenotype associations; dotted lines, RGNs/modules that are reactive to indicated phenotype; solid line, RGNs/modules that are causal for indicated phenotype (i.e., GWA gene[s] for the phenotype or eQTLs enriched with association for the phenotype); orange, plasma cholesterol measures; blue, plasma glucose metabolism measures; purple, plasma CRP levels.
The notion that cross-tissue modules can serve as hubs in this super-network of CAD-causal modules was reinforced by the greater and stronger connectivity in the super-network of cross-tissue modules than of tissue-specific modules (mean number of connections: 5.33 ± 1.73 versus 3.43 ± 2.04, p < 0.02; mean connectivity strength: 2.96 ± 0.98 versus 1.98 ± 1.20, p < 0.03). Notably, module 98 contained genes involved in endopeptidase activity (Table S3) previously implicated in atherosclerosis (Kugiyama et al., 1996), and one of its four key drivers, CDKN2A, is a CAD candidate gene for the Chr9p21 risk locus (Welter et al., 2014; Table S2). Interestingly, the allelic variation of the Chr9p21 risk SNP (rs4977574) was associated with the connectivity of the VF genes in module 98: connectivity was higher in carriers of the risk genotype (G/G, n = 20 STAGE patients; mean connectivity ± SD, 2.34 ± 1.01) than in the non-risk heterozygotes (A/G, n = 47 STAGE patients; mean connectivity, 1.37 ± 0.33) and the non-risk homozygotes (A/A, n = 21 STAGE patients; mean connectivity, 1.43 ± 0.34) (p = 0.002, Kruskal-Wallis test).
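A comparison of connectivity across three genotype groups with the Kruskal-Wallis test can be sketched as follows. This is a minimal illustration restricted to exactly three groups, for which the chi-square approximation has two degrees of freedom and the p value reduces to exp(-H/2). The function names and the per-patient values are hypothetical and are not taken from the STAGE data.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_3(g1, g2, g3):
    """Tie-corrected H statistic and chi-square p value for three groups.

    Undefined (division by zero) if every pooled value is identical.
    """
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    h /= 1 - ties / (n ** 3 - n)
    # With three groups, df = 2 and the chi-square survival function is exp(-H/2).
    return h, math.exp(-h / 2)


# Hypothetical per-patient connectivity values for G/G, A/G, and A/A carriers.
h_stat, p_value = kruskal_wallis_3(
    [2.9, 3.1, 2.2, 1.9], [1.3, 1.5, 1.1, 1.6], [1.4, 1.7, 1.2, 1.0]
)
```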
DISCUSSION
In this study, we used systems genetics integrating the analysis of (1) DNA genotypes and gene expression profiles in seven CAD-relevant tissues from CAD patients (STAGE cohort) and (2) CAD GWA datasets (primarily Coronary Artery Disease Genome-wide Replication and Meta-analysis [CARDIoGRAM]). Our findings provide a preliminary view of the regulatory landscape of causal molecular processes active within and across a majority of tissues believed to be central to advanced CAD. The identified RGNs included both established (Brænne et al., 2015) and previously unreported CAD candidate genes in the form of key drivers. These candidate genes participate in diverse molecular processes and established pathways of atherosclerosis, cholesterol and glucose metabolism, and acute inflammation, and they were regulated in both tissue-specific and cross-tissue networks. Importantly, we found that nearly half of the RGNs (where similar mouse phenotypes were available, 12/26, 46%) were evolutionarily conserved, as judged from validation against the HMDP (Bennett et al., 2010). As proof of concept, in one cross-species-validated, mouse atherosclerosis- and CAD-causal network active in AAW and involving RNA-processing genes, four key drivers (AIP, DRAP1, POLR2I, and PQBP1) specifically activated the same network genes and affected THP-1 foam cell formation, and the entire network also was re-identified in independent gene expression data from both CAD macrophages and carotid lesions.
It is increasingly recognized that to defeat complex diseases like CAD it is not sufficient to focus on individual DNA variants or CAD candidate genes. In parallel, we need to use systems genetics to understand how these disease genes operate in complex regulatory networks (Schadt, 2009). In examining the 171 RGNs inferred from all STAGE modules (Figures 3 and S2–S5; Tables S5 and S6) in greater detail, we found a good representation of established genes, pathways, and molecular functions in CAD. For instance, these modules contained 147 of the 309 GWA genes thus far proposed for CAD and CAD risk factors, such as lipid and glucose metabolism, including type 2 diabetes (Table S4). Thus, our results should be useful in gaining a better understanding of the tissue-specific or cross-tissue molecular contexts of established CAD candidate genes identified by GWASs. Our results will be particularly useful in providing clues about their upstream regulations by key disease drivers, which frequently were not the GWA genes themselves.
Although it is vital to identify individual CAD candidate genes in the context of regulatory networks, RGNs do not act in isolation but interact. Indeed, we identified a super-network containing all 30 CAD-causal RGNs across all the main CAD tissues. The tissue distribution of RGN interactions is significant, as it links a good portion of disease-driving molecular processes in CAD, including their relation to key metabolic risk factors (e.g., cholesterol and glucose metabolism), and it provides a preliminary overview of the gene regulatory landscape in CAD. Mapping the regulatory framework of complex diseases in this fashion provides a starting point to assess the overall molecular status of individual patients (Björkegren et al., 2015). In fact, more detailed versions of regulatory maps like the one presented here (Figure 5) are required to achieve the goals of precision medicine.
Notably, most of the CAD-causal RGNs in this super-network were active in nonhepatic peripheral organs, particularly SF (Figure 5); only one was in the liver. The paucity of hepatic modules/RGNs and the unexpectedly high number of fat modules/RGNs may be consistent with the emerging notion that key hepatic processes in cardiometabolic diseases are governed by metabolic processes and gene expression activity in nonhepatic peripheral tissues (Lomonaco et al., 2012). In fact, the connectivity map of the super-network of CAD-causal modules suggests that causal effects of metabolic processes in the SM and liver on the CAD-causal modules of the AAW are largely mediated by a single cross-tissue RGN primarily active in VF. This RGN also was found to contain CDKN2A, a candidate gene for the well-established Chr9p21 CAD risk locus, as one of its four key drivers (Figure 5). Moreover, the connectivity of this RGN was found to be associated with the allelic variation of the Chr9p21 CAD risk SNP. Similarly, effects on the CAD-causal modules of the AAW of the many RGNs identified in SF also were predominantly mediated by one cross-tissue module (Figure 5).
Although a systems genetics approach has the significant advantage of simultaneously considering many genes (i.e., RGNs) in relation to disease, there are also challenges (Civelek and Lusis, 2014). In brief, cell diversity in the tissue biopsies can be a problem, as changes in gene activity also will reflect changes in cell type content. Furthermore, the inference of causal (i.e., directed) RGNs from omics data can be hampered by data noise, limited sample size, and the inherent limitations of representing complex biological processes by statistical models, which inevitably leads to false positives even among the most significant predictions (Marbach et al., 2010). Thus, as with gene-by-gene approaches, RGNs inferred from genetics of gene expression studies must be examined and validated extensively.
To this point, we validated three atherosclerosis-related RGNs using an in vitro model of atherosclerosis (Skogsberg et al., 2008; Figures 1 and S1, Step G). Silencing of the key drivers of RGN 42, AIP, DRAP1, POLR2I, and PQBP1, which involve RNA processing, specifically reactivated RGN 42 genes in THP-1 macrophages and affected cholesterol-ester accumulation (i.e., foam cell formation) (Table 2). Although none of these four key drivers have been directly linked to CAD, atherosclerosis, or foam cell formation, AIP affects the expression of peroxisome proliferator-activated receptors α and β (Sumanasekera et al., 2003a, 2003b) and, thus, may prevent foam cell formation via the HDL/ABCA1 pathway (Chinetti et al., 2001) or the CD36 scavenger receptor pathway (Li et al., 2004). Similarly, DRAP1 represses TGF-β signaling by interfering with the binding of FoxH1-Smad2/3/4 to its DNA targets during transcription (Iratni et al., 2002). TGF-β signaling prevents foam cell formation by activating ABC1 and reverse cholesterol transport (Kozaki et al., 1997). POLR2I and PQBP1, however, have not even been indirectly associated with atherosclerosis or foam cell formation, returning only 93 and 69 hits, respectively, in PubMed. POLR2I encodes a subunit of RNA polymerase II, the polymerase responsible for synthesizing mRNA in eukaryotes and also targeted by HIV-1 during lytic and latent viral stages (Nilson and Price, 2011). PQBP1 is a key regulator of mRNA processing and gene transcription. In fact, mutations in PQBP1 have been reported in several X chromosome-linked intellectual disability disorders, including Golabi-Ito-Hall syndrome (Sudol et al., 2012). PQBP1 also may be linked to fat metabolism (Takahashi et al., 2009), and, as a splicing factor, it alters fibroblast growth factor signaling (Wang et al., 2013).
The fact that RGN 42 is related to RNA processing may be significant in view of the gene-regulatory roles of noncoding RNAs in general (Mihailescu, 2015) and in CAD (Bronze-da-Rocha, 2014).
In summary, our identification of several high-hierarchy, evolutionarily conserved, strongly inherited risk-enriched, and reproducible RGNs is a step toward understanding the molecular landscape causing CAD across tissues. Specifically, our results suggest that RNA-processing genes are central in causing CAD. Key regulatory genes in this and other RGNs may prove useful as targets for novel CAD therapies.
EXPERIMENTAL PROCEDURES
The STAGE Genetics of Gene Expression Study
In the STAGE study, 612 tissue samples were obtained during coronary artery bypass grafting surgery from AAW (n = 73), IMA (n = 88), liver (n = 87), SM (n = 89), SF (n = 72), and VF (n = 98) of well-characterized CAD patients (Hägg et al., 2009). Fasting whole blood was obtained for isolation of DNA (n = 109) and RNA (n = 105) and biochemical analyses. RNA samples were used for gene expression profiling with a custom Affymetrix array (HuRSTA-2a520709). Blood DNA was genotyped for 909,622 SNPs with the Affymetrix GenomewideSNP_6 array. The carotid lesion and CAD macrophage gene expression data were generated with the same custom Affymetrix arrays mentioned above (Shang et al., 2014).
Multi-tissue Weighted Gene Co-expression Network Reconstruction
WGCNA (Zhang and Horvath, 2005) was used to identify co-expression network modules that simultaneously captured gene-gene relations within and across tissues (Table S8), using the following most variably expressed genes (SD > 0.5 across samples): 3,369 in AAW; 2,761 in IMA; 1,588 in liver; 2,492 in SM; 3,596 in SF; 5,215 in VF; and 1,891 in blood. In WGCNA, values of β = 6 and β = 3 were used to weight within-tissue and across-tissue correlations, respectively, to ensure that both tissue-specific and cross-tissue sub-networks, as well as whole networks, were scale free.
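The two steps in this paragraph, filtering variably expressed genes and soft-thresholding correlations with a tissue-dependent power, can be sketched as below. The sketch assumes the unsigned WGCNA adjacency |cor|^β; the function names and toy inputs are hypothetical, and the code is not the WGCNA implementation used in the study.

```python
import statistics


def variable_genes(expression, sd_min=0.5):
    """Keep genes whose sample SD across samples exceeds sd_min."""
    return [gene for gene, values in expression.items()
            if statistics.stdev(values) > sd_min]


def adjacency(cor, tissue_a, tissue_b, beta_within=6, beta_across=3):
    """Soft-thresholded adjacency |cor|**beta.

    beta = 6 is applied to gene pairs from the same tissue and beta = 3 to
    pairs from different tissues (unsigned network, an assumption here).
    """
    beta = beta_within if tissue_a == tissue_b else beta_across
    return abs(cor) ** beta


# The same correlation is down-weighted more strongly within a tissue.
within = adjacency(0.5, "AAW", "AAW")   # 0.5 ** 6
across = adjacency(0.5, "AAW", "VF")    # 0.5 ** 3
```

Using a smaller power across tissues keeps cross-tissue correlations, which tend to be weaker, from being thresholded away entirely.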
Associations of Modules and CAD Phenotypes
Module associations with four main CAD phenotypes (Results) were tested as follows: (1) by a rank-sum test of the mean phenotype values of the two most distinct patient subsets, defined by clustering the gene activity within each module; and (2) by calculating Pearson correlations between phenotype and module eigengene values (Langfelder and Horvath, 2007). Integrated p values from both tests and from multiple measurements for each main phenotype (e.g., TC, LDL, VLDL, and HDL for plasma cholesterol levels) were computed with a weighted data integration method (Hwang et al., 2005) and corrected for multiple testing by estimating a false discovery rate (FDR < 0.2) (Storey, 2002).
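As a rough illustration of the multiple-testing step, the sketch below adjusts a list of module-level p values and keeps those below 0.2. Note that it uses the Benjamini-Hochberg procedure as a simpler stand-in; the study used the Storey (2002) approach, which additionally estimates the proportion of true null hypotheses and can therefore give smaller values. The function name and input p values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical integrated p values for four modules.
p_integrated = [0.01, 0.04, 0.03, 0.5]
q_values = bh_adjust(p_integrated)
significant = [i for i, q in enumerate(q_values) if q < 0.2]
```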
eQTL Enrichment for Risk Associations and CAD Candidate Genes According to GWASs
To examine the extent to which modules were CAD causal, we used enrichment in inherited CAD association or GWAS candidate genes (Deloukas et al., 2013). In brief, disease associations of cis-eQTLs detected in the STAGE patients (or, if the number of cis-eQTLs was less than ten, SNPs located within ±500 kb of the transcription start or end site of each module gene) were used to calculate fold enrichment and statistical significance compared to 10,000 random SNP groups corrected for chromosome, gene density, and minor allele frequency (Foroughi Asl et al., 2015). CAD candidate genes (n = 53) were obtained from CARDIoGRAM. Similarly, eQTL disease associations and additional GWA candidate genes for CAD risk factors were computed using additional GWASs (Supplemental Experimental Procedures) and data from the HMDP.
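The comparison against random SNP groups can be sketched as an empirical fold enrichment and p value. The sketch assumes that a single summary statistic (for example, the fraction of SNPs with a nominally significant disease association) has been computed for the module SNPs and for each random group; the matching on chromosome, gene density, and allele frequency is not implemented, and all names and numbers are hypothetical.

```python
def empirical_enrichment(observed, null_values):
    """Fold enrichment and one-sided empirical p value against a null set.

    observed: statistic for the module's SNP set.
    null_values: the same statistic for each random SNP group.
    """
    n = len(null_values)
    fold = observed / (sum(null_values) / n)
    exceed = sum(1 for v in null_values if v >= observed)
    # Add-one correction: the smallest attainable p value is 1 / (n + 1).
    p = (exceed + 1) / (n + 1)
    return fold, p


# Hypothetical example with 10 random groups (the study used 10,000).
fold, p = empirical_enrichment(0.2, [0.1] * 9 + [0.3])
```

With 10,000 random groups, the smallest empirical p value this estimator can return is 1/10,001.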
RGN Reconstruction
Gaussian Bayesian networks were reconstructed for each module by imposing a prior that only genes with eQTLs, transcription factors, or CAD GWA genes could be parents of other genes. The Bayesian Information Criterion (BIC) was used to score models; a multiple-restart greedy hill-climbing algorithm, with edge additions, deletions, and reversals, was used to search for locally optimal models (Schmidt et al., 2007).
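The structural prior can be made concrete with a short sketch that enumerates which directed edges a search would be permitted to consider. It covers only the parent constraint, not the BIC scoring or the hill-climbing search, and the function name and gene labels are hypothetical placeholders.

```python
def allowed_edges(module_genes, eqtl_genes, transcription_factors, gwa_genes):
    """Directed (parent, child) pairs permitted by the prior.

    A gene may act as a parent only if it has an eQTL, is a transcription
    factor, or is a CAD GWA gene; any other module gene can only be a child.
    """
    candidates = set(eqtl_genes) | set(transcription_factors) | set(gwa_genes)
    parents = set(module_genes) & candidates
    return sorted((p, c) for p in parents for c in module_genes if p != c)


# Placeholder labels: geneA has an eQTL, geneB is a transcription factor.
edges = allowed_edges(
    module_genes=["geneA", "geneB", "geneC"],
    eqtl_genes=["geneA"],
    transcription_factors=["geneB"],
    gwa_genes=[],
)
```

Restricting the parent set in this way shrinks the search space and orients edges away from genes with independent evidence of regulatory or genetic influence.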
Key Driver Analysis and Validation
Key drivers were identified from the Bayesian networks as described previously (Zhang and Zhu, 2013), and they were validated by siRNA silencing in THP-1 macrophages incubated with Ac-LDL to induce foam cell formation (Table S9), as described previously (Skogsberg et al., 2008). Gene expression data from siRNA-treated and control THP-1 cells were generated by Agilent Human Custom Gene Expression Microarray 8×15 and compared to assess the effect of a key driver on its RGN, as described previously (Björkegren et al., 2014).
Mouse Phenotype Associations
To validate the phenotype associations of CAD-causal modules, we assessed the association of orthologous genes in corresponding mouse tissues and phenotypes in the HMDP (Bennett et al., 2010), using the same methods described above.
Module Connectivity and Differential Connectivity
Total module connectivity was calculated as the sum of adjacency values (i.e., weighted correlation coefficients) between all pairs of genes in a module. Module differential connectivity between AAW and IMA was calculated as the ratio of total module connectivity in AAW to total module connectivity in IMA.
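These two definitions translate directly into a few lines of code. The sketch below assumes that adjacency values are stored per unordered gene pair; the data layout, function names, and toy values are illustrative assumptions.

```python
from itertools import combinations


def total_connectivity(adjacency, genes):
    """Sum of adjacency values over all unordered gene pairs in a module."""
    return sum(adjacency[frozenset(pair)] for pair in combinations(genes, 2))


def differential_connectivity(adj_aaw, adj_ima, genes):
    """Ratio of total module connectivity in AAW to that in IMA."""
    return total_connectivity(adj_aaw, genes) / total_connectivity(adj_ima, genes)


# Toy module of three genes with hypothetical adjacency values per tissue.
genes = ["g1", "g2", "g3"]
adj_aaw = {frozenset(("g1", "g2")): 0.5,
           frozenset(("g1", "g3")): 0.25,
           frozenset(("g2", "g3")): 0.25}
adj_ima = {frozenset(("g1", "g2")): 0.25,
           frozenset(("g1", "g3")): 0.125,
           frozenset(("g2", "g3")): 0.125}
ratio = differential_connectivity(adj_aaw, adj_ima, genes)
```

A ratio above 2 in this toy example would correspond to the >2-fold increase in connectivity used to select modules in the Results.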
Supplementary Material
Highlights.
We reconstruct regulatory gene networks across seven vascular and metabolic tissues
Integrative analysis using GWASs reveals 30 networks causally related to CAD
12 CAD-causal networks are indicated to be evolutionarily conserved from mouse
An atherosclerotic arterial wall RNA-processing network affects foam cell formation
Acknowledgments
We thank Stephen Ordway for editorial assistance. This work was supported by the Swedish Heart-Lung Foundation (J.L.M.B. and J.S.), the Swedish Research Council (J.L.M.B. and J.S.), the King Gustaf V and Queen Victoria’s Foundation of Freemasons (J.L.M.B. and J.S.), the Astra-Zeneca Translational Science Centre-Karolinska Institutet (J.L.M.B.), the University of Tartu (SP1GVARENG; J.L.M.B.), the Estonian Research Council (ETF grant 8853; A.R. and J.L.M.B.), the Biotechnology and Biological Sciences Research Council (BBSRC, BB/J004235/1 and BB/M020053/1; T.M.), the American Heart Association (14SFRN20490315 and 14SFRN20840000; J.B. and E.E.S.), and the NIH (National Heart, Lung, and Blood Institute [NHLBI], R01HL71207 [J.L.M.B.], K23HL111339 [C.G.], and K99HL121172 [M.C.]). Clinical Gene Networks AB (CGN) supported this work as a small and medium-sized enterprise (SME) of the EU FP6/FP7 project CVgenes@target (HEALTH-F2-2013-601456). This work was undertaken as part of the Leducq Consortium CAD Genomics (J.L.M.B., E.E.S., M.C., and A.J.L.). J.L.M.B., A.R., and T.M. are shareholders of CGN. J.L.M.B., E.E.S., and A.R. are members of the board of directors. CGN has a vested interest in the STAGE data. However, CGN has expressed no claims or sought any patents related to the results presented in this manuscript.
Footnotes
ACCESSION NUMBERS
The accession number for the STAGE data reported in this paper is GEO: GSE40231. The accession numbers for the HMDP data reported in this paper are GEO: GSE66570, GSE64768, and GSE64769.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Supplemental Experimental Procedures, five figures, and nine tables and can be found with this article online at http://dx.doi.org/10.1016/j.cels.2016.02.002.
AUTHOR CONTRIBUTIONS
Conceptualization, H.A.T., T.M., and J.L.M.B.; Methodology, H.A.T., J.S., T.M., and J.L.M.B.; Software, H.A.T., H.F.A., and T.M.; Formal Analysis, H.A.T., H.F.A., O.F., and T.M.; Investigation, H.A.T., R.K.J., J.S., T.M., and J.L.M.B.; Resources, R.E., A.R., B.A.K., B.R., C.G., T.I., J.T.D., M.C., A.J.L., J.S., and J.L.M.B.; Writing – Original Draft, H.A.T., J.S., T.M., and J.L.M.B.; Writing – Review & Editing, H.A.T., J.C.K., J.S., T.M., and J.L.M.B.; Supervision, E.E.S., J.S., T.M., and J.L.M.B.
References
- Bangalore S, Steg G, Deedwania P, Crowley K, Eagle KA, Goto S, Ohman EM, Cannon CP, Smith SC, Zeymer U, et al. β-Blocker use and clinical outcomes in stable outpatients with and without coronary artery disease. JAMA. 2012;308:1340–1349. doi: 10.1001/jama.2012.12559.
- Barabási AL, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12:56–68. doi: 10.1038/nrg2918.
- Bennett BJ, Farber CR, Orozco L, Kang HM, Ghazalpour A, Siemers N, Neubauer M, Neuhaus I, Yordanova R, Guan B, et al. A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res. 2010;20:281–290. doi: 10.1101/gr.099234.109.
- Bhagavatula MR, Fan C, Shen GQ, Cassano J, Plow EF, Topol EJ, Wang Q. Transcription factor MEF2A mutations in patients with coronary artery disease. Hum Mol Genet. 2004;13:3181–3188. doi: 10.1093/hmg/ddh329.
- Binns D, Dimmer E, Huntley R, Barrell D, O’Donovan C, Apweiler R. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics. 2009;25:3045–3046. doi: 10.1093/bioinformatics/btp536.
- Björkegren JL, Hägg S, Talukdar HA, Foroughi Asl H, Jain RK, Cedergren C, Shang MM, Rossignoli A, Takolander R, Melander O, et al. Plasma cholesterol-induced lesion networks activated before regression of early, mature, and advanced atherosclerosis. PLoS Genet. 2014;10:e1004201. doi: 10.1371/journal.pgen.1004201.
- Björkegren JL, Kovacic JC, Dudley JT, Schadt EE. Genome-wide significant loci: how important are they? Systems genetics to understand heritability of coronary artery disease and other common complex disorders. J Am Coll Cardiol. 2015;65:830–845. doi: 10.1016/j.jacc.2014.12.033.
- Brænne I, Civelek M, Vilne B, Di Narzo A, Johnson AD, Zhao Y, Reiz B, Codoni V, Webb TR, Foroughi Asl H, et al. Prediction of causal candidate genes in coronary artery disease loci. Arterioscler Thromb Vasc Biol. 2015;35:2207–2217. doi: 10.1161/ATVBAHA.115.306108.
- Bronze-da-Rocha E. MicroRNAs expression profiles in cardiovascular diseases. BioMed Res Int. 2014;2014:985408. doi: 10.1155/2014/985408.
- Chinetti G, Lestavel S, Bocher V, Remaley AT, Neve B, Torra IP, Teissier E, Minnich A, Jaye M, Duverger N, et al. PPAR-alpha and PPAR-gamma activators induce cholesterol removal from human macrophage foam cells through stimulation of the ABCA1 pathway. Nat Med. 2001;7:53–58. doi: 10.1038/83348.
- Civelek M, Lusis AJ. Systems genetics approaches to understand complex traits. Nat Rev Genet. 2014;15:34–48. doi: 10.1038/nrg3575.
- Collins FS, Varmus H. A new initiative on precision medicine. N Engl J Med. 2015;372:793–795. doi: 10.1056/NEJMp1500523.
- Deloukas P, Kanoni S, Willenborg C, Farrall M, Assimes TL, Thompson JR, Ingelsson E, Saleheen D, Erdmann J, Goldstein BA, et al. Large-scale association analysis identifies new risk loci for coronary artery disease. Nat Genet. 2013;45:25–33. doi: 10.1038/ng.2480.
- Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. Genetics of gene expression and its effect on disease. Nature. 2008;452:423–428. doi: 10.1038/nature06758.
- Foroughi Asl H, Talukdar HA, Kindt AS, Jain RK, Ermel R, Ruusalepp A, Nguyen KD, Dobrin R, Reilly DF, Schunkert H, et al. Expression quantitative trait Loci acting across multiple tissues are enriched in inherited risk for coronary artery disease. Circ Cardiovasc Genet. 2015;8:305–315. doi: 10.1161/CIRCGENETICS.114.000640.
- Guo S. Insulin signaling, resistance, and the metabolic syndrome: insights from mouse models into disease mechanisms. J Endocrinol. 2014;220:T1–T23. doi: 10.1530/JOE-13-0327.
- Hägg S, Skogsberg J, Lundström J, Noori P, Nilsson R, Zhong H, Maleki S, Shang MM, Brinne B, Bradshaw M, et al. Multi-organ expression profiling uncovers a gene module in coronary artery disease involving transendothelial migration of leukocytes and LIM domain binding 2: the Stockholm Atherosclerosis Gene Expression (STAGE) study. PLoS Genet. 2009;5:e1000754. doi: 10.1371/journal.pgen.1000754.
- Hinman VF, Nguyen AT, Cameron RA, Davidson EH. Developmental gene regulatory network architecture across 500 million years of echinoderm evolution. Proc Natl Acad Sci USA. 2003;100:13356–13361. doi: 10.1073/pnas.2235868100.
- Hwang D, Rust AG, Ramsey S, Smith JJ, Leslie DM, Weston AD, de Atauri P, Aitchison JD, Hood L, Siegel AF, Bolouri H. A data integration methodology for systems biology. Proc Natl Acad Sci USA. 2005;102:17296–17301. doi: 10.1073/pnas.0508647102.
- Iratni R, Yan YT, Chen C, Ding J, Zhang Y, Price SM, Reinberg D, Shen MM. Inhibition of excess nodal signaling during mouse gastrulation by the transcriptional corepressor DRAP1. Science. 2002;298:1996–1999. doi: 10.1126/science.1073405.
- Kovacs A, Tornvall P, Nilsson R, Tegnér J, Hamsten A, Björkegren J. Human C-reactive protein slows atherosclerosis development in a mouse model with human-like hypercholesterolemia. Proc Natl Acad Sci USA. 2007;104:13768–13773. doi: 10.1073/pnas.0706027104.
- Kozaki K, Akishita M, Eto M, Yoshizumi M, Toba K, Inoue S, Ishikawa M, Hashimoto M, Kodama T, Yamada N, et al. Role of activin-A and follistatin in foam cell formation of THP-1 macrophages. Arterioscler Thromb Vasc Biol. 1997;17:2389–2394. doi: 10.1161/01.atv.17.11.2389.
- Kugiyama K, Sugiyama S, Matsumura T, Ohta Y, Doi H, Yasue H. Suppression of atherosclerotic changes in cholesterol-fed rabbits treated with an oral inhibitor of neutral endopeptidase 24.11 (EC 3.4.24.11). Arterioscler Thromb Vasc Biol. 1996;16:1080–1087. doi: 10.1161/01.atv.16.8.1080.
- Langfelder P, Horvath S. Eigengene networks for studying the relationships between co-expression modules. BMC Syst Biol. 2007;1:54. doi: 10.1186/1752-0509-1-54.
- Li AC, Binder CJ, Gutierrez A, Brown KK, Plotkin CR, Pattison JW, Valledor AF, Davis RA, Willson TM, Witztum JL, et al. Differential inhibition of macrophage foam-cell formation and atherosclerosis in mice by PPARalpha, beta/delta, and gamma. J Clin Invest. 2004;114:1564–1576. doi: 10.1172/JCI18730.
- Lomonaco R, Ortiz-Lopez C, Orsak B, Webb A, Hardies J, Darland C, Finch J, Gastaldelli A, Harrison S, Tio F, Cusi K. Effect of adipose tissue insulin resistance on metabolic parameters and liver histology in obese patients with nonalcoholic fatty liver disease. Hepatology. 2012;55:1389–1397. doi: 10.1002/hep.25539.
- Lusis AJ. Atherosclerosis. Nature. 2000;407:233–241. doi: 10.1038/35025203.
- Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci USA. 2010;107:6286–6291. doi: 10.1073/pnas.0913357107.
- Mensah GA, Moran AE, Roth GA, Narula J. The global burden of cardiovascular diseases, 1990–2010. Glob Heart. 2014;9:183–184. doi: 10.1016/j.gheart.2014.01.008.
- Mihailescu R. Gene expression regulation: lessons from noncoding RNAs. RNA. 2015;21:695–696. doi: 10.1261/rna.050815.115.
- Nilson KA, Price DH. The role of RNA polymerase II elongation control in HIV-1 gene expression, replication, and latency. Genet Res Int. 2011;2011:726901. doi: 10.4061/2011/726901.
- Peden JF, Farrall M. Thirty-five common variants for coronary artery disease: the fruits of much collaborative labour. Hum Mol Genet. 2011;20(R2):R198–R205. doi: 10.1093/hmg/ddr384.
- Roider HG, Manke T, O’Keeffe S, Vingron M, Haas SA. PASTAA: identifying transcription factors associated with sets of co-regulated genes. Bioinformatics. 2009;25:435–442. doi: 10.1093/bioinformatics/btn627.
- Samani NJ, de Bono DP. Prevention of coronary heart disease with pravastatin. N Engl J Med. 1996;334:1333–1334; author reply 1334–1335.
- Schadt EE. Molecular networks as sensors and drivers of common human diseases. Nature. 2009;461:218–223. doi: 10.1038/nature08454.
- Schadt EE, Björkegren JL. NEW: network-enabled wisdom in biology, medicine, and health care. Sci Transl Med. 2012;4:115rv1. doi: 10.1126/scitranslmed.3002132.
- Schirmer SH, Fledderus JO, van der Laan AM, van der Pouw-Kraan TC, Moerland PD, Volger OL, Baggen JM, Böhm M, Piek JJ, Horrevoets AJG, van Royen N. Suppression of inflammatory signaling in monocytes from patients with coronary artery disease. J Mol Cell Cardiol. 2009;46:177–185. doi: 10.1016/j.yjmcc.2008.10.029.
- Schmidt M, Niculescu-Mizil A, Murphy K. Learning graphical model structure using L1-regularization paths. In: Cohn A, editor. Proceedings of the 22nd National Conference on Artificial Intelligence. Vol. 2. AAAI Press; 2007. pp. 1278–1283.
- Shang MM, Talukdar HA, Hofmann JJ, Niaudet C, Asl HF, Jain RK, Rossignoli A, Cedergren C, Silveira A, Gigante B, et al. Lim domain binding 2: a key driver of transendothelial migration of leukocytes and atherosclerosis. Arterioscler Thromb Vasc Biol. 2014;34:2068–2077. doi: 10.1161/ATVBAHA.113.302709.
- Shoemaker JE, Lopes TJS, Ghosh S, Matsuoka Y, Kawaoka Y, Kitano H. CTen: a web-based platform for identifying enriched cell types from heterogeneous microarray data. BMC Genomics. 2012;13:460. doi: 10.1186/1471-2164-13-460.
- Skogsberg J, Lundström J, Kovacs A, Nilsson R, Noori P, Maleki S, Köhler M, Hamsten A, Tegnér J, Björkegren J. Transcriptional profiling uncovers a network of cholesterol-responsive atherosclerosis target genes. PLoS Genet. 2008;4:e1000036. doi: 10.1371/journal.pgen.1000036.
- Storey JD. A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol. 2002;64:479–498.
- Sudol M, McDonald CB, Farooq A. Molecular insights into the WW domain of the Golabi-Ito-Hall syndrome protein PQBP1. FEBS Lett. 2012;586:2795–2799. doi: 10.1016/j.febslet.2012.03.041.
- Sumanasekera WK, Tien ES, Davis JW, 2nd, Turpey R, Perdew GH, Vanden Heuvel JP. Heat shock protein-90 (Hsp90) acts as a repressor of peroxisome proliferator-activated receptor-alpha (PPARalpha) and PPARbeta activity. Biochemistry. 2003a;42:10726–10735. doi: 10.1021/bi0347353.
- Sumanasekera WK, Tien ES, Turpey R, Vanden Heuvel JP, Perdew GH. Evidence that peroxisome proliferator-activated receptor alpha is complexed with the 90-kDa heat shock protein and the hepatitis virus B X-associated protein 2. J Biol Chem. 2003b;278:4467–4473. doi: 10.1074/jbc.M211261200.
- Takahashi K, Yoshina S, Masashi M, Ito W, Inoue T, Shiwaku H, Arai H, Mitani S, Okazawa H. Nematode homologue of PQBP1, a mental retardation causative gene, is involved in lipid metabolism. PLoS ONE. 2009;4:e4104. doi: 10.1371/journal.pone.0004104.
- Wang Q, Moore MJ, Adelmant G, Marto JA, Silver PA. PQBP1, a factor linked to intellectual disability, affects alternative splicing associated with neurite outgrowth. Genes Dev. 2013;27:615–626. doi: 10.1101/gad.212308.112.
- Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–D1006. doi: 10.1093/nar/gkt1229.
- Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005;4:Article 17. doi: 10.2202/1544-6115.1128.
- Zhang B, Zhu J. Identification of key causal regulators in gene networks. Proceedings of the World Congress on Engineering & Computer Science. 2013;II:1309–1312.
- Zhang B, Gaiteri C, Bodea LG, Wang Z, McElwee J, Podtelezhnikov AA, Zhang C, Xie T, Tran L, Dobrin R, et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell. 2013;153:707–720. doi: 10.1016/j.cell.2013.03.030.