PLOS Computational Biology. 2022 Feb 25;18(2):e1009903. doi: 10.1371/journal.pcbi.1009903

Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics

Sara Brin Rosenthal 1,*,#, Hao Wang 2,*,#, Da Shi 3,4, Cin Liu 2, Ruben Abagyan 3, Linda K McEvoy 2,5, Chi-Hua Chen 2,*
Editor: Feixiong Cheng 6
PMCID: PMC8906581  PMID: 35213535

Abstract

Integration of multi-omics data with molecular interaction networks enables elucidation of the pathophysiology of Alzheimer’s disease (AD). Using the latest genome-wide association studies (GWAS) including proxy cases and the STRING interactome, we identified an AD network of 142 risk genes and 646 network-proximal genes, many of which were linked to synaptic functions annotated by mouse knockout data. The proximal genes were confirmed to be enriched in a replication GWAS of autopsy-documented cases. By integrating the AD gene network with transcriptomic data of AD and healthy temporal cortices, we identified 17 gene clusters of pathways, such as up-regulated complement activation and lipid metabolism, down-regulated cholinergic activity, and dysregulated RNA metabolism and proteostasis. The relationships among these pathways were further organized by a hierarchy of the AD network pinpointing major parent nodes in graph structure including endocytosis and immune reaction. Control analyses were performed using transcriptomics from cerebellum and a brain-specific interactome. Further integration with cell-specific RNA sequencing data demonstrated genes in our clusters of immunoregulation and complement activation were highly expressed in microglia.

Author summary

Alzheimer’s disease (AD) is recognized as the leading primary cause of dementia, resulting in a high socioeconomic burden. Understanding the disease pathogenesis serves as the cornerstone of exploring potential drug targets, therapeutic strategies and clinical intervention. As a complex disease, the development of AD involves pathological changes in multiple biological processes, and is impacted significantly by genetic factors. Through integration of the available genomic, protein-protein interactions (interactomic) and transcriptomic data, we identified a disease gene network that includes a total of 788 genes, and annotated 17 major gene clusters which encompassed the main categories of biological pathways with reported alterations in AD. The results revealed a landscape of AD etiology, with major pathological changes that extend from gene transcription and RNA metabolism, proteostasis, lipid metabolism, immune reactions to synaptic dysfunction. The systems-level approach of the present study can also be applied to other complex diseases with a significant genetic component.

Introduction

Late-onset Alzheimer’s disease (AD) is a neurodegenerative disorder recognized as the leading primary cause of dementia, with a heritability estimate of 50–80% [1,2]. The advent of large-scale genome-wide association studies (GWAS) has revealed associations between single nucleotide polymorphisms (SNPs) and risk of AD, allowing for new insight into the genetic basis of this disease. The latest large AD GWAS have identified over 40 risk loci [3–9].

Beyond GWAS, recent large transcriptomic studies have also begun to yield converging findings on differentially expressed genes associated with AD. The success of gene discovery, combined with network-based approaches to human diseases [10], creates opportunities to elaborate AD pathophysiology at multiple omic levels via embedded molecular interactions (the interactome). A key tenet of this approach is to address the overall dysfunction of disease genes within the context of their molecular interactions. Interactomes enable the study of collective biological interactions together with the basic units of protein-protein interaction and the mediators of various intracellular signaling and regulation processes [11]. This approach has been successfully used to provide novel insight into many diseases [10,12,13].

Emerging evidence from systems biology and network studies has revealed the association between AD and an increasing number of molecular networks, most notably including lipid metabolism with the APOE ε4 allele, and immunological dysfunction involving microglial cells [14–20]. These studies identified gene networks using transcriptomic or proteomic data but integration with AD GWAS is lacking. AD GWAS have made substantial progress identifying multiple SNPs associated with AD risk that have been replicated across studies [4,5]. Genetic variations captured by GWAS are inherited, and are generally not confounded by secondary changes from disease progression, whereas transcriptomics provides information on factors that are both inherited and non-inherited (e.g. affected by environmental exposure and comorbidities). These factors can reflect dynamic and tissue-specific patterns, such as the distinctive neural involvement at a late stage of life in AD [21]. One prior study did integrate AD GWAS with transcriptomics [22]. However, normal rather than AD brains were used in the analyses, and the sample size of AD GWAS was small (17,008 cases and 37,646 controls), compared to the two recent larger AD GWAS with AD-by-proxy samples in our main analysis (71,880 proxy cases and 383,378 controls) [4] and ancillary analysis of autopsy-documented AD samples (35,274 cases and 59,163 controls) [5].

We further leverage the power of the interactome to reveal AD pathophysiology by identifying the significantly proximal neighbors of AD GWAS genes in the interactome to form an expanded AD disease module. The rationale for including proximal genes is that previous studies showed that new disease loci or drug targets are often enriched among interactome neighbors of existing GWAS genes [19,23,24]. We then integrate data on transcriptomic dysregulation in AD brains with the expanded AD disease module, considering transcriptomic data from both the temporal cortex, as the major disease-relevant tissue, and the cerebellum, as the hypothesized control. This choice is based on evidence that tau pathology, the pathological feature of AD that is closely associated with clinical symptoms, emerges in the temporal cortex early in the course of the disease before spreading to other cortical areas in later stages, with minimal involvement of the cerebellum [25–29]. Furthermore, application of multiscale community detection enables identification, annotation and mapping of the hierarchical substructure of biological pathways associated with the AD gene network [30].

Results

AD disease module

Our analysis workflow is illustrated in Fig 1. 192 AD risk genes were identified in a large AD-by-proxy GWAS, which identifies both AD patients and individuals with parental history of AD as cases [4]. In the AD-by-proxy GWAS, these risk genes were selected by using positional, expression quantitative trait loci (eQTL) and chromatin information, through the functional mapping and annotation (FUMA) tool [31], on AD GWAS summary statistics [4]. Out of the 192 genes, 142 were found in the interactome (STRING database) and included in our subsequent analyses. As genes not found in the interactome are poorly characterized, they were excluded. New genes with no known information of interaction were excluded as well (Table D in S1 Data).

Fig 1. Analysis workflow.

142 GWAS-identified AD genes were included as the seed genes of the AD disease module, and propagated to an expanded network consisting of another 646 predicted proximal genes based on the background protein-protein interactome (STRING). The expanded AD module was used for further functional annotation, pathway clustering and gene expression analyses integrating with transcriptomic data.

The 142 AD-associated risk genes were significantly localized in the interactome, compared with random gene sets (p = 3.8×10^−9), with 59 of 142 risk genes (42%) directly connected with at least one other risk gene (i.e., interconnected risk genes). The largest connected component consisted of 14 risk genes (Fig A in S1 Text). We took node degree into account for the selection of random gene sets to control for potential bias that disease genes may have a high degree of connectivity, resulting from the fact that they are well-studied rather than from their biological properties. We also replicated our analysis using a brain-specific network from GIANT [24]. 191 out of 192 AD risk genes were present in this interactome and also significantly localized compared to random gene sets (p < 10^−16).
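
The degree-matched comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the localization statistic (the number of interactions among the seed genes), the requirement of identical rather than binned degree, and all function names are hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def edges_within(adj, genes):
    """Count interactions whose two endpoints are both in `genes`."""
    genes = set(genes)
    return sum(len(adj[g] & genes) for g in genes) // 2


def localization_p(adj, seeds, n_perm=1000, seed=0):
    """Empirical p-value that degree-matched random gene sets are at
    least as interconnected as the seed set (assumed statistic)."""
    rng = random.Random(seed)
    by_degree = defaultdict(list)
    for g in sorted(adj):
        by_degree[len(adj[g])].append(g)
    observed = edges_within(adj, seeds)
    hits = 0
    for _ in range(n_perm):
        chosen = set()
        for s in seeds:
            # Replace each seed by a random gene with the same degree.
            pool = [g for g in by_degree[len(adj[s])] if g not in chosen]
            chosen.add(rng.choice(pool))
        if edges_within(adj, chosen) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

In a real interactome, exact degree matching is often too strict for hub genes, so random genes are usually drawn from bins of similar degree instead.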

Expanding the AD disease module

As a quantitative way of expanding the AD module to include genes proximal in network space, we used network propagation, a tool which has repeatedly driven novel biological discoveries (Fig 1) [32]. By seeding the network propagation algorithm with the 142 AD risk genes, we identified an expanded AD disease module of 788 genes significantly proximal to one or more seed genes (including 142 seed genes and 646 proximal genes, Fig 2). The genes in the AD disease module, including both seed genes and proximal genes, were highly enriched for genes related to synaptic function, especially abnormal synaptic transmission, based on mouse knockout data (Fig 3) [33,34]. Using the 646 proximal genes alone, the gene-disease enrichment analysis also demonstrated that AD was the top enriched disease (Fig F in S1 Text).
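
To make the idea of "significantly proximal" concrete, the sketch below propagates heat from a seed set and converts each gene's heat into a z-score against randomly chosen seed sets. It is an illustration only: the random-walk-with-restart update, the value of `alpha`, the uniform (not degree-matched) null model, and the toy ring-of-cliques network are all assumptions, and the names are hypothetical rather than taken from the study's software.

```python
import random
import statistics
from collections import defaultdict
from itertools import combinations


def propagate(adj, seeds, alpha=0.5, n_iter=50):
    """Iterate F <- alpha * W F + (1 - alpha) * Y, where W spreads each
    gene's heat equally over its neighbours and Y marks the seed genes."""
    y = {g: (1.0 if g in seeds else 0.0) for g in adj}
    f = dict(y)
    for _ in range(n_iter):
        nxt = {g: (1 - alpha) * y[g] for g in adj}
        for g, nbrs in adj.items():
            share = alpha * f[g] / len(nbrs)
            for n in nbrs:
                nxt[n] += share
        f = nxt
    return f


def proximity_z(adj, seeds, n_rand=200, seed=0):
    """z-score of each gene's heat relative to random seed sets of equal size."""
    rng = random.Random(seed)
    observed = propagate(adj, seeds)
    nodes = sorted(adj)
    null = {g: [] for g in adj}
    for _ in range(n_rand):
        rand_heat = propagate(adj, set(rng.sample(nodes, len(seeds))))
        for g in adj:
            null[g].append(rand_heat[g])
    z = {}
    for g in adj:
        sd = statistics.pstdev(null[g])
        z[g] = (observed[g] - statistics.mean(null[g])) / sd if sd > 0 else 0.0
    return z


def ring_of_cliques(n_cliques=10, size=5):
    """Toy network: cliques joined in a ring by single bridging edges."""
    adj = defaultdict(set)
    for c in range(n_cliques):
        members = [f"g{c}_{i}" for i in range(size)]
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
        nxt = f"g{(c + 1) % n_cliques}_0"
        adj[members[-1]].add(nxt)
        adj[nxt].add(members[-1])
    return dict(adj)
```

With seeds `{"g0_1", "g0_2"}` on the toy network, a gene in the same clique (`"g0_3"`) receives a higher z-score than a gene on the far side of the ring (`"g5_2"`). A threshold such as z > 2, as used in Fig 2, would then define the proximal genes.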

Fig 2. Expanded AD disease module and highlighted distinctive clusters annotating different biological pathways possibly involved in AD.

Clusters identified in the expanded AD module are labeled in the center. Genes comprising 5 selected clusters are depicted around the edges of the figure. The temporal cortex RNAseq beta statistic is mapped to the gene color, and shape indicates AD GWAS seed gene (triangle) or network proximal gene (z > 2, circle). Bold black outlines indicate AGORA proposed drug targets (e.g. CHRNA2, PTK2B). Seed genes or genes with AGORA targets are labeled with a larger font, while other genes identified by network propagation are labeled with a smaller font.

Fig 3. Annotations of Mammalian Phenotype Ontology on the AD network genes.

The figure shows the odds ratios and 95% confidence intervals for the 9 significantly enriched brain-related phenotypes, and one negative control phenotype (abnormal skeleton physiology). The circle size indicates the number of genes which result in that phenotype when knocked out in mice.

A recent AD GWAS of diagnosed AD patients identified 400 AD-related genes, with varying levels of support, such as functional consequence, eQTL and tissue expression [5]. Of these 400 genes, 103 were also identified in the expanded AD module. Most of these were also found in the AD-by-proxy GWAS [4]; since the data partially overlap between the two GWAS, this is not unexpected. However, 26 proximal genes identified by our network, which were not among the risk genes of the AD-by-proxy GWAS [4], were identified in the more recent AD GWAS [5]. This overlap is highly significant (OR = 2.0, p = 0.0008, Fisher’s exact test). Although some of the underlying data overlap, our network analysis approach therefore identified genes missed by the first GWAS [4].

We sought to interpret the structure and content of the expanded AD module, to identify major biological pathways and functions represented in the disease. For this purpose, we applied a graph-based clustering algorithm to the expanded AD module [35], which revealed a strong clustering structure in the network, identifying 18 distinct clusters of 10 genes or more (Table 1). Additionally, we found 15 smaller clusters containing 2–9 genes, and 15 orphan genes forming their own clusters. We focused our analysis on the 18 largest clusters with at least 10 genes (Fig 2). An alternative clustering strategy, multiscale community detection, revealed largely similar clusters (Fig 4 and Table C in S1 Data).

Table 1. Clusters in the expanded AD disease module.

Cluster* | Number of genes | Enrichment of dysregulated genes in AD cortex (BH FDR) | Pathway p-value | Function
Class 1. Immune reactions
1 | 75 | ns | 1.13×10^−12 | Immunoregulatory interactions between a lymphoid and non-lymphoid cell
3 | 55 | 6.80×10^−5 | 4.84×10^−33 | Complement activation
8 | 48 | ns | 3.42×10^−26 | EPH-Ephrin signaling
11 | 31 | 0.08 | 2.48×10^−8 | Interleukin-1 signaling
15 | 20 | 0.06 | 0.00017 | ZAG-PIP complex
Class 2. Gene transcription and RNA metabolism
2 | 73 | ns | 6.76×10^−17 | RNA metabolic process
9 | 33 | 0.06 | 1.36×10^−13 | DNA-binding transcription factor activity
Class 3. Vesicular transport, post-translational protein modifications, trafficking and proteostasis
4 | 55 | ns | 6.30×10^−14 | Protein modification by small protein conjugation
5 | 53 | ns | 1.56×10^−46 | Clathrin-mediated endocytosis
6 | 52 | ns | 2.88×10^−15 | SNARE binding
13 | 24 | ns | 1.06×10^−15 | AP-type membrane coat adaptor complex
14 | 21 | ns | 2.49×10^−7 | Ubiquitin-like protein-specific protease activity
Class 4. Synaptic function
10 | 33 | 0.18 | 5.81×10^−22 | Acetylcholine-gated cation-selective channel activity
12 | 31 | ns | 1.18×10^−5 | GABAergic synapse
Class 5. Substance metabolism
7 | 49 | 0.18 | 6.05×10^−16 | Regulation of plasma lipoprotein particle levels
16 | 18 | ns | 7.87×10^−12 | Porphyrin and chlorophyll metabolism
17 | 11 | 0.18 | 3.03×10^−12 | Cellular iron homeostasis

*Cluster 18 genes were unclassified by functional annotation and not included.

†ns: not significant.

Fig 4. Hierarchical graph of the AD gene network.

Pie charts indicate fractions of up- (red) and down- (blue) regulated genes.

Functional annotation of these clusters demonstrated significant association with multiple well-characterized biological pathways. In addition to the many clusters that represent previously reported mechanisms in AD pathogenesis (Table 1 and B in S1 Data) [36–48], we also identified pathways whose roles in AD are as yet unclear, including the zinc α2-glycoprotein-prolactin-inducible protein (ZAG-PIP) complex as well as porphyrin and chlorophyll metabolism. Clusters highlighted in Table 1 and Fig 2 with significant clinical implications include immunoregulation (cluster 1, p = 1×10^−12), complement activation (cluster 3, p = 5×10^−33), RNA metabolism (cluster 2, p = 7×10^−17), acetylcholine-gated cation channel activity (cluster 10, p = 6×10^−22) and GABAergic synapse (cluster 12, p = 1×10^−5). Based on correlation in biological functions, the clusters were further grouped into 5 classes that cover various aspects of AD pathophysiology, including immune reactions, gene transcription and RNA metabolism, proteostasis, synaptic function and substance metabolism (Table 1). We note that observing highly significant associations with known biological pathways is expected, as genes within the same pathway are likely to interact. Enrichment p-values indicate confidence of the observed associations. Identified clusters contain both AD GWAS genes and proximal genes in the expanded AD module (Tables A and B in S1 Data).

Transcriptomic dysregulation proximal to AD genes

To further characterize AD-proximal genes, we used data from the Mayo Clinic RNAseq study. These data identify differential expression of genes in AD patients compared to controls from the temporal cortex (Fig B in S1 Text) and cerebellum [49]. Of 1,213 genes that were significantly upregulated in the temporal cortex (adjusted p < 0.05, beta > 0.5), 85 were also found in the expanded AD module (hypergeometric p = 0.002). Downregulated genes were not similarly enriched (hypergeometric p = 0.5). This effect was even more pronounced when we examined the full distribution of network proximity z-scores of the up-regulated genes, with this gene set having significantly higher z-scores than 13,135 genes not up-regulated in AD (p = 2×10^−14, K-S 2-sample test), as shown in Fig C in S1 Text. This was in contrast to the data from cerebellum, where there was no significant overlap of up-regulated genes (hypergeometric p = 0.116), or difference in z-scores between up-regulated genes and the rest in the network (p = 0.069) (Fig D in S1 Text).
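
The hypergeometric overlap test used here can be written in a few lines. The sketch below is illustrative and makes assumptions that the text does not state: the background is taken to be the 1,213 up-regulated genes plus the 13,135 genes not up-regulated, all 788 module genes are treated as measured, and the test is one-sided. Because these choices may differ from the authors', the result need not match the reported p-value.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric: N genes drawn without
    replacement from M background genes, n of which are 'hits'."""
    upper = min(n, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / comb(M, N)


# Assumed background: up-regulated genes plus genes not up-regulated.
background = 1213 + 13135
p_overlap = hypergeom_sf(k=85, M=background, n=1213, N=788)
```

Exact integer arithmetic via `math.comb` avoids the underflow that can occur when individual probabilities are multiplied as floats.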

Clusters with consistent up- or down-regulation of genes were identified by integration of the RNA sequencing (RNAseq) data on differential gene expression in the temporal cortex. Eight clusters demonstrated some evidence of dysregulation (Benjamini-Hochberg FDR < 0.2, hypergeometric test; Table 1 and B in S1 Data). In particular, cluster 10, annotated for acetylcholine-gated cation-selective channel activity, was down-regulated in AD compared to healthy controls, with 7 significantly downregulated genes, relative to only 1 upregulated gene (FDR = 0.18, hypergeometric test). These include CHRNA2 and PTK2B, both of which were implicated at the GWAS level as well as the transcriptomic level. Cluster 3, annotated for complement activation, was strongly upregulated in AD, with 20 significantly up-regulated genes, compared to only 2 downregulated genes (p = 6.8×10^−5, hypergeometric test). The upregulated genes include C4B and C4A, which were AD-GWAS genes, as well as CFI, which was not implicated by the GWAS, but is found in a list of expert-curated potential AD targets (Agora database). Importantly, we found that randomly selected gene clusters with similar properties to the AD gene clusters are much less dysregulated in the RNAseq data, with only one random gene cluster marginally enriched for AD differentially expressed genes (FDR < 0.2). This suggests that the network proximal genes to AD GWAS hits are more strongly dysregulated than randomly selected regions of the interactome, pointing to a possible link between genomic variants and transcriptomic dysregulation in network proximal biological pathways.
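
For readers unfamiliar with the Benjamini-Hochberg adjustment applied to the per-cluster p-values, a minimal sketch follows. The cluster labels and p-values in the example are hypothetical and are not taken from the study; only the FDR < 0.2 cutoff comes from the text.

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values (q-values) in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-cluster enrichment p-values, for illustration only.
cluster_p = {"A": 0.001, "B": 0.04, "C": 0.30, "D": 0.08}
q_values = dict(zip(cluster_p, benjamini_hochberg(list(cluster_p.values()))))
dysregulated = [c for c, q in q_values.items() if q < 0.2]
```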

Cell-specific preferential expression

Cross-referencing with brain tissue expression data demonstrated unique patterns of cell-specific gene expression across the identified clusters, with clear association with their annotated biological functions (Fig 5A and E in S1 Text). In particular, genes in clusters 1 (immunoregulation) and 3 (complement activation) were preferentially expressed in microglial cells (p = 1×10^−4 and 9×10^−3 respectively, Wilcoxon rank-sum test). A trend (not statistically significant) was also observed that genes in cluster 5 (Clathrin-mediated endocytosis) were preferentially expressed in oligodendrocytes (p = 0.73), in cluster 10 (acetylcholine-gated cation-selective channel activity) in mature astrocytes (p = 0.64), and in cluster 12 (IL-1 signaling) mostly in neurons (p = 0.25) followed by mature astrocytes (p = 0.25).
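
A rank-sum comparison of this kind can be sketched as below. The text does not specify which two groups are compared or whether the test is one-sided, so the example assumes a one-sided comparison of expression values in a cluster (`x`) against a reference group (`y`), with midranks for ties and a plain normal approximation (no continuity or tie correction). Function names are hypothetical.

```python
from math import erfc, sqrt


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_greater(x, y):
    """One-sided test that values in x tend to exceed values in y.
    Returns (Mann-Whitney U for x, approximate upper-tail p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, 0.5 * erfc(z / sqrt(2))
```

For small groups or heavily tied expression values, an exact or tie-corrected test would be more appropriate than this approximation.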

Fig 5. Cell-specific gene expression across clusters and schematic model.

A) Mean expression (FPKM) from brain cell-types averaged across genes in each cluster (functional annotations for identified clusters are 1: Immunoregulatory interactions between a lymphoid and non-lymphoid cell, 2: RNA metabolic process, 3: Complement activation, 4: Protein modification by small protein conjugation, 5: Clathrin-mediated endocytosis, 6: SNARE binding, 7: Regulation of plasma lipoprotein particle levels, 8: EPH-Ephrin signaling, 9: DNA-binding transcription factor activity, 10: Acetylcholine-gated cation-selective channel activity, 11: Interleukin-1 signaling, 12: GABAergic synapse). B) Classes of identified pathways that are functionally related are presented as inner circles, with circle size roughly indicating relative class sizes. The overlying outer circle illustrates the types of cells in the central nervous system with overall preferential gene expression in each major class (e.g., microglia overlay the classes of Immune reactions and Substance metabolism because genes in multiple clusters of these two classes were highly expressed in microglia).

Discussion

There was significant clustering among the 788 identified network genes, forming a distinct disease module in the interactome. The expanded AD module was enriched for genes involved in several biological pathways implicated in AD, including lipid metabolism, the immune system, endocytosis, the cholinergic and GABAergic pathways of the central nervous system (CNS), which are also annotated in the AD-by-proxy GWAS paper as expected. Our hierarchical network analysis revealed the multiscale structure of the AD gene network.

By integrating the AD interactome module with information on differential RNA expression between AD patients and healthy controls, we identified clusters of genes within the AD module that were primarily up- or down-regulated at the transcriptomic level [49]. Significant enrichment of dysregulated genes in AD was observed in 8 out of the 17 annotated clusters. Several factors could explain the discrepancies between the transcriptomic and GWAS data in the other 9 clusters: genes with small effects may not have been consistently identified in the two data sets, and/or more complicated epigenomic mechanisms involving differential gene expression may not have been detected by GWAS.

Finally, we further profiled cell-specific gene expression of these clusters with data from RNA sequencing of purified cells [49]. We select potentially clinically-relevant clusters to discuss in detail as follows.

The Cholinergic pathway

Involvement of the cholinergic system in AD has been known since the 1970s [43]. Our data-driven approach identified a cluster of 33 genes (Table A in S1 Data), the majority of which are implicated in cholinergic function [43]. Consistent with decreased cholinergic activity in AD, genes in this cluster were strongly down-regulated in AD compared to the control group, with 7 significantly down-regulated genes, and only 1 up-regulated gene. This cluster includes 15 highly interconnected genes encoding receptor subunits (CHRNA, CHRNB, CHRND, CHRNE and CHRNG) for the excitatory neurotransmitter acetylcholine, which is strongly involved in memory function [43]. This cluster also contains genes involved with the synthesis (CHAT, CPT1B) or breakdown (ACHE) of acetylcholine. Interestingly, PTK2B, which was identified from the GWAS [4], is a member of this cluster. It encodes PTK2B, a tyrosine kinase that is involved in regulation of long-term potentiation in the hippocampus as a likely neural substrate of memory formation [50]. PTK2B was also significantly down-regulated in AD compared to controls. Although cholinergic dysfunction is no longer believed to be the primary pathologic mechanism underlying AD, these results show that cholinergic dysregulation and disruption of hippocampal signaling are important factors in AD pathogenesis.

Complement activation

Emerging evidence suggests the importance of neuroinflammation in the pathogenesis of AD. Complement activation has been observed in the brain tissue of AD patients and seems to contribute to an important local inflammatory state, with increased expression of C4 observed in AD patients. Other members (C3 and C1q) of the pathway are also implicated in AD, implying a role for the entire complement activation cascade in AD pathogenesis [51]. Coding genes for these proteins, such as C4, CR1, CR1L, were identified from GWAS and the majority showed up-regulated expression patterns in our cluster.

From the cell-specific data, we found that genes in this cluster were highly expressed in microglial cells (Fig 5A). A mechanistic cascade has been suggested in which amyloid β (Aβ) oligomers increase expression of C3 in microglia and astrocytes; C3 then tags synapses, promotes recruitment of microglia, and mediates elimination of the tagged synapses [52]. Given that these clustered genes are associated with AD risk in GWAS, up-regulated in AD, and preferentially expressed in cells with a close etiological relationship to AD, these findings suggest that the genomic and transcriptional alterations in this cluster likely play a causal role in AD pathogenesis, rather than being secondary consequences of disease progression, as previously suggested [53].

Immunoregulatory interactions and ZAG-PIP complex

The crucial role that microglia-related immunoregulation plays in AD is supported by our network analyses and the preferential expression of these genes in microglia (Fig 5A) [54,55]. Some of the key genes in our study are consistent with current knowledge in this field. TREM2 is a cell surface receptor that is up-regulated in AD and exclusively expressed in immune cells, including macrophages, dendritic cells and microglia in the CNS [54]. The protein encoded by this gene has been found to be involved in activation of microglia via its soluble fragment around Aβ plaques, maintaining plaque morphology and neurotoxicity [56,57]. In addition, important genes in the phosphatidylinositol-3 kinase (PIK3) pathway that are closely related to downstream effects of TREM2, such as INPP5D, INPPL1 and PIK3CA, were also universally up-regulated in our study.

We observed a cluster of genes associated with the ZAG-PIP complex (Table A in S1 Data), whose role in AD is not yet fully described. Both proteins are multifunctional, being involved in such processes as immunoregulation, fertilization and lipolysis [58,59]. ZAG is structurally similar to a truncated secretory MHC-I-like protein, and interacts with PIP, which resembles a light chain and binds to both CD4 and IgG [60]. The ZAG (also known as AZGP1) gene has been identified as significant for AD by both reference GWAS used in the present study [4,5]. We observed up-regulation of ZAG but overall down-regulation of other genes in this cluster among AD patients, implying possible dysregulation of immune reactions associated with this complex in the disease.

Lipoprotein regulation

The Apolipoprotein E (APOE) gene, the strongest genetic risk factor for late-onset AD [61], was upregulated in our study. APOE is a lipid transport protein that transports cholesterol and lipids to neurons through the low-density lipoprotein (LDL) receptor family for use in cell membrane maintenance and neuronal repair [41].

In our disease module, APOE was located within a cluster of 49 genes (Table A in S1 Data) that are mainly involved in lipid metabolism, including APOA, APOB, APOC, and APOF genes that showed mixed patterns of up- or down-regulation. The APOA genes, which code for proteins in the high-density lipoprotein, were all down-regulated in AD whereas the APOBR and most of the APOC genes were up-regulated. Most of the LDL receptor encoding genes in this cluster were up-regulated, but the very low-density lipoprotein receptor gene was down-regulated. Also included in this cluster are genes that code for heparan sulfate biosynthetic enzymes (HS3ST) which are thought to facilitate binding of APOE to the LDL receptor–related protein 1 (LRP1) [62]. Most of these genes (4 out of 6) were down-regulated. Other genes in this cluster are related to the immune system (ANKRD17, NR1H4, TCN1) and to inflammation (CHST1, CHST5, GPC2, MDK, MMP10, MSR1).

RNA metabolism

AD has been associated with reduced levels of RNA-binding proteins (RBP), which are crucial in RNA metabolism and maintaining liquid-liquid phase separation for dynamic formation of supramolecular assemblies [63]. Disrupted homeostasis of RBPs leads to increased propensity of aggregation and sequestration of RBPs by abnormal RNAs, and subsequent formation of neurotoxic deposits such as Tau proteins [64]. In addition to genes identified by previous GWAS (SCARA3, GEMIN7, TSC22D4, TAF6, CPSF2), additional RBP genes, involved in multiple stages of the RNA life cycle, were found in our AD disease module (ZC3HAV1, DNAJC2, SYMPK, C8orf34, CPSF3, HEXIM2, HOXA5, ESX1, HOXA10, SOX3, HOXB5, LARP7), the majority of which were significantly down-regulated in AD (Table A in S1 Data).

Proteostasis

Proteostasis, or protein homeostasis, involves clearing aberrant, mis-localized or excessive proteins, the imbalance of which is evident in many neurodegenerative disorders [65]. We identified a large class of functionally correlated biological pathways that are associated with this process, including post-translational modifications, intracellular trafficking, vesicular transport and degradation of protein (Table 1).

Consistent with published evidence on impaired ubiquitination-proteasome system (UPS) in AD due to ineffective protein clearing [66], a large set of relevant genes was enriched in our disease module, the majority of which were down-regulated in AD. Interestingly, associated genes of UBE2V1-UBE2N and UBE2V2-UBE2N heterodimers (UBE2V1, UBE2V2 and UBE2NL) were also found among this cluster, which catalyzes synthesis of non-canonical polyubiquitination that does not lead to degradation by proteasome but rather downstream inflammatory responses [67], suggesting the mechanisms of UPS dysfunction in the development of AD are likely more complicated. In addition, we identified multiple significantly down-regulated deubiquitinating enzyme genes in AD, suggesting a role for the imbalance of ubiquitination in AD pathophysiology.

Interestingly, the clustered genes involved in Clathrin-mediated endocytosis were expressed preferentially in oligodendrocytes (Fig 5A and E in S1 Text). Although most studies on the association between dysregulated endocytosis and the formation of Aβ proteins focused on the neurons [39], the above observation may imply that oligodendrocytes are also involved. Furthermore, as differentiation and signaling of oligodendrocytes rely on endocytosis for internalization of transferrin, an iron transporter [68], it is possible that the well-known iron dyshomeostasis in AD affects not only neurons, but also oligodendrocytes and oligodendrocyte progenitor cells [48].

Summary of molecular pathways in AD

Based on the results from our AD gene network and pathway annotation, we propose 5 classes of functionally related molecular pathways that are significantly associated with AD: 1) immune reactions, 2) gene transcription and RNA metabolism, 3) vesicular transport, post-translational protein modifications, trafficking and proteostasis, 4) synaptic function (including acetylcholine-gated channel activity and GABAergic synapse) and 5) substance metabolism (including lipid, iron, porphyrin and chlorophyll) (Table 1). We note that this classification is not rigid, because some pathways have multiple functions and can be assigned to more than one class. An example is the EPH-Ephrin signaling pathway, which is associated both with synaptic dysfunction and CNS immune dysregulation [42,69].

A schematic model of gene network in AD pathophysiology based on results of the present study is illustrated in Fig 5B. Although alterations in gene expression are complicated in many of the above classes, there is likely predominant up-regulation of immune reactions (IL-1 signaling and complement activation) and lipid metabolism, as well as down-regulation of cholinergic neurotransmission.

Our hierarchical network analysis generally verified the five classes proposed above, where the identified parent nodes included: 1) immune responses (though ephrin receptor signaling pathway was labelled and likely driven by the number of overlapping genes), 2) RNA metabolism, 3) endocytosis, 4) excitatory postsynaptic potential. The pathways for the class of substance metabolism proposed above were scattered in different nodes (Fig 4 and Table C in S1 Data).

The above results demonstrate a landscape of AD etiology, with major pathological changes that range from molecular (gene transcription and RNA metabolism, proteostasis, and substance metabolism) and cellular (immune reactions) to tissue-level (synapses) dysfunction. This approach can be applied to other complex diseases with a significant genetic component. Future studies on modification and verification of the disease network may provide further insight into prevention and therapeutic intervention in AD.

Potential limitations

The main limitation of the present study is that the dataset for AD risk genes from GWAS is incomplete, although it still covers a significant amount of information. Using network propagation can mitigate this issue as this method is an amplifier of genetic associations [32]. There are multiple choices of interactome, each with its own pros and cons, and many tissue-specific networks are available (e.g., GIANT [24]). We chose the STRING interactome based on the results of a systematic evaluation of the performance of diverse networks [70], although other interactomes may yield different results and should be considered in future studies. We also used a brain-specific network from GIANT to test consistency of the localization analysis. Our chosen networks, derived from high-throughput experiments or computational prediction, are potentially less susceptible to the literature bias toward well-studied genes. In addition, the present study did not include 50 GWAS-identified genes that are not in the STRING interactome (Table D in S1 Data). Their roles in AD etiology may also be important, and warrant further investigation.

Materials and methods

AD risk genes

To identify genes related to AD, summary statistics from a large-scale GWAS were used [4]. AD risk genes were derived from a GWAS of 455,258 individuals with 71,880 proxy cases and 383,378 controls [4]. This is one of the largest GWAS of AD to date, having identified 29 independent loci and 192 genes associated with AD. A second AD GWAS was used as a replication GWAS, and included 35,274 clinical and autopsy-documented AD cases and 59,163 controls. This GWAS identified 400 candidate genes associated with AD [5].

Molecular interaction network (interactome)

The STRING database of protein-protein interactions was selected as the background interactome for the analysis. STRING consists of both physical and functional interactions, derived through co-expression, biological knowledge databases, and computational techniques. Interactions are scored based on accumulation of different types of evidence [71]. In our analysis we used interactions classified as ‘high confidence’ (combined score > 0.7) from the human interactome, version 10.5, containing 15,131 proteins and 359,776 interactions.

Network propagation

A network propagation algorithm was used to explore the network proximity to a set of genes identified as significantly associated with AD [32]. Network propagation amplifies biological signals in networks, enabling exploration of genes significantly nearby in network space, and improving on simpler measures such as first nearest neighbors [32]. The network propagation algorithm simulates how heat would spread, starting from a set of ‘hot’ seed genes (GWAS-discovered seed nodes). In the simulation, the heat spreads from gene to gene along the interactions in the network. The result is a set of ‘hot’ genes, which are likely related to the starting seed genes in biological process or pathway, as the adjacent genes in a network likely have similar biological functions. This process enables identification of genes related to multiple seed genes rather than a single one, because more heat will accumulate in genes that are close to multiple seed genes. This process is described in the following equation [72]:

$$F^{t} = \alpha W' F^{t-1} + (1 - \alpha) Y$$

Where $F^{t}$ is the heat vector at time t, Y is the initial value of the heat vector (1/S for each seed gene and 0 for each non-seed gene, where S is the total number of seed genes), W′ is the normalized adjacency matrix, and α ∈ (0, 1) represents the fraction of total heat which is dissipated at every timestep. We chose an α value of 0.5, based on previous work which demonstrated that the propagation algorithm is not sensitive to the choice of α as long as α ≥ 0.5 [71]. We refer the reader to our Jupyter notebooks (https://zenodo.org/record/5786722#.Ybtti73MKC8 (DOI: 10.5281/zenodo.5786722)) and the original publication for more details on network propagation [72].
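The iteration above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (their notebooks are on Zenodo). The function names are hypothetical, and the symmetric degree normalization W′ = D^(-1/2) W D^(-1/2) is an assumption, since the text only says that the adjacency matrix is normalized.

```python
import math

def normalize_adjacency(adj):
    """Assumed normalization: W'[i][j] = W[i][j] / sqrt(deg_i * deg_j)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def propagate(adj, seeds, alpha=0.5, tol=1e-10, max_iter=1000):
    """Iterate F^t = alpha * W' * F^(t-1) + (1 - alpha) * Y until it stops changing."""
    n = len(adj)
    w = normalize_adjacency(adj)
    # Y: each seed starts with heat 1/S, every other node with 0.
    y = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(max_iter):
        f_next = [alpha * sum(w[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
                  for i in range(n)]
        delta = max(abs(a - b) for a, b in zip(f_next, f))
        f = f_next
        if delta < tol:
            break
    return f

# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed (hypothetical data).
toy_adj = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
heat = propagate(toy_adj, seeds={0})
```

On this toy path the heat decreases with distance from the seed, which is the behavior the propagation scores rely on when ranking network-proximal genes.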

We compared the network propagation z-scores to a null model to find genes which were significantly more proximal to the seed genes than would be expected by chance, defined as more than 2 standard deviations above the null mean (z > 2), where a z-score of 2 corresponds to p ≈ 0.02 (one-sided). We constructed a null model by selecting random sets of genes with similar degree distributions to the seed set, using the binning approach [73]. Nodes were grouped into bins where each bin had at least 10 nodes of similar degree. 5,000 such gene sets were randomly selected to build up the null distribution. We computed a node-level z-score comparing the network proximity values from the seed set to the mean and standard deviation from the null model network proximities.

$$z_{n} = \frac{\log(F_{n,HC}) - \langle \log(F_{n,rand}) \rangle}{\sigma(\log F_{n,rand})}$$

Where $F_{n,HC}$ is the propagation score of gene n for the high-confidence seed genes, $F_{n,rand}$ is the propagation score of gene n for randomly selected degree-matched genes, ⟨·⟩ denotes an average of gene n’s propagation score over N randomly sampled sets, and σ denotes the standard deviation of the random distribution. The proximity vectors were log transformed so they are approximately normally distributed.
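A rough sketch of the degree-binned null model and the log-transformed z-score follows. All function names are hypothetical, the binning rule is one plausible reading of "at least 10 nodes of similar degree", and the use of the sample standard deviation is an assumption, because the text does not say which estimator was used.

```python
import math
import random
import statistics

def degree_bins(degree, min_size=10):
    """Group nodes of similar degree into bins holding at least min_size nodes."""
    nodes = sorted(degree, key=lambda n: degree[n])
    bins, current = [], []
    for i, node in enumerate(nodes):
        current.append(node)
        # Close a bin only at a degree boundary, so equal-degree nodes stay together.
        next_differs = i + 1 == len(nodes) or degree[nodes[i + 1]] != degree[node]
        if len(current) >= min_size and next_differs:
            bins.append(current)
            current = []
    if current:  # leftover high-degree nodes join the last bin
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)
    return bins

def sample_degree_matched(seeds, bins, rng):
    """Draw one random node set by swapping each seed for a node from its own bin."""
    bin_of = {node: b for b in bins for node in b}
    chosen = set()
    for seed in seeds:
        candidates = [n for n in bin_of[seed] if n not in chosen]
        chosen.add(rng.choice(candidates))
    return chosen

def log_zscore(observed, null_scores):
    """z_n = (log F_obs - mean(log F_rand)) / sd(log F_rand) for one gene."""
    logs = [math.log(s) for s in null_scores]
    return (math.log(observed) - statistics.mean(logs)) / statistics.stdev(logs)

# Hypothetical toy network: 30 nodes with degrees 0..9 (three nodes per degree).
toy_degree = {node: node // 3 for node in range(30)}
toy_bins = degree_bins(toy_degree)
random_set = sample_degree_matched({0, 29}, toy_bins, random.Random(0))
```

In a full analysis, each random set would be propagated in the same way as the real seed set, and `log_zscore` would be applied per gene across the resulting null scores. Genes with z > 2 would then be kept as network-proximal.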

Network localization

We measured the localization of the AD gene set by calculating the number of edges shared between the genes in the focal set. This is similar to the ‘significance’ measure used in string-DB [71]. To measure significance, we calculated the localization of the full gene set, and compared this to the distribution of localization on 5,000 randomly selected, degree-matched gene sets of size equal to the number of disease risk genes. To ensure the localization is not dependent on a small number of hub nodes, we built up a distribution using a sampling procedure. We measured the number of edges connecting a randomly sampled set of 80% of the full set of AD genes, and compared it to a degree-matched random node set. We conducted 5,000 random samplings to build up the distribution, and did not find any effect from hub genes (Fig A panel B in S1 Text).
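The localization measure and the 80% subsampling step can be sketched as follows. This is an illustration only: the function names are hypothetical, and the empirical p-value formula is an assumed way of comparing the observed count with the null distribution, since the text does not state the exact test.

```python
import random

def count_internal_edges(edges, genes):
    """Localization: number of network edges with both endpoints in the gene set."""
    genes = set(genes)
    return sum(1 for u, v in edges if u in genes and v in genes)

def subsample_edge_counts(edges, genes, fraction=0.8, n_samples=5000, seed=0):
    """Edge counts for repeated random subsets holding `fraction` of the gene set."""
    rng = random.Random(seed)
    pool = sorted(genes)
    k = round(fraction * len(pool))
    return [count_internal_edges(edges, rng.sample(pool, k)) for _ in range(n_samples)]

def empirical_p(observed, null_counts):
    """Assumed significance measure: share of null sets at least as localized."""
    hits = sum(1 for c in null_counts if c >= observed)
    return (1 + hits) / (1 + len(null_counts))

# Hypothetical toy network: a 5-gene clique (the focal set) plus a sparse path.
clique = [(i, j) for i in range(5) for j in range(i + 1, 5)]
path = [(5, 6), (6, 7), (7, 8), (8, 9)]
toy_edges = clique + path
observed = count_internal_edges(toy_edges, range(5))
subsampled = subsample_edge_counts(toy_edges, range(5), n_samples=100)
```

The null counts passed to `empirical_p` would come from degree-matched random gene sets of the same size. If the subsampled counts stay well above the null counts, the localization is not driven by a few hub genes.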

Gene enrichment analysis

Integration of the AD network with mouse knockout data was performed using the mammalian phenotype ontology, with data from the Jackson Laboratory [74]. We identified 9 brain-related phenotypes which were significantly enriched for genes in the AD network, using Fisher’s exact test. A control phenotype (abnormal skeleton physiology) was not similarly enriched. We also evaluated the association of the 646 predicted proximal genes with AD by gene-disease enrichment analysis (DisGeNET), using the disgenet2r package for R [75].

Clustering

Clusters of highly connected genes were identified in the AD network using a graph-based modularity maximization algorithm [35], commonly referred to as the Louvain algorithm, which iteratively identifies groups of genes which have many connections within the group and few connections between groups.
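To make the clustering objective concrete, the sketch below computes the modularity score that the Louvain algorithm seeks to maximize for a given partition. It is not a Louvain implementation. The function name and toy graph are hypothetical, and the graph is assumed to be unweighted, undirected, and to contain at least one edge.

```python
from collections import defaultdict

def modularity(edges, community):
    """Q = sum over communities c of L_c / m - (d_c / (2m))^2.

    L_c is the number of edges inside community c, d_c is the summed degree
    of its nodes, and m is the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}
q_split = modularity(toy_edges, two_clusters)
q_merged = modularity(toy_edges, one_cluster)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, reflecting many connections within groups and few between them.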

Functional enrichment of the gene clusters was conducted using the g:Profiler tool [76], using all genes in the full AD network as the background gene set, with adjustments for multiple tests (the Benjamini-Hochberg procedure). GO terms and KEGG and REACTOME pathways were tested for functional enrichment.
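For readers unfamiliar with the Benjamini-Hochberg adjustment, a minimal standard-library sketch is given here. It illustrates the standard procedure rather than g:Profiler's internal code, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four tested terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adj) if p < 0.05]
```

Each adjusted value is the smallest false discovery rate at which that term would still be called significant.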

To verify the clustering results, we built an alternative hierarchical AD network by multiscale community detection performed in Cytoscape using the CDAPS (Community Detection APplication and Service) application [29], with the HiDeF community detection algorithm [77]. We used the 788 genes of the AD expanded disease module. Communities were annotated with significantly enriched GO terms and pathways from the g:Profiler tool.

Agora database

Putative Alzheimer’s disease genes that may be candidates for drug targets were downloaded from the Agora database on 1/9/2019 (https://agora.ampadportal.org/genes/). The database was contributed in part by the Accelerating Medicines Partnership–Alzheimer’s Disease (AMP-AD) consortium.

Transcriptomic study

To investigate whether a gene or pathway of the AD module is up- or down-regulated, we overlaid the differentially expressed genes provided by the Mayo Clinic RNAseq Study [49] from temporal cortex tissue and cerebellum of AD case-control post-mortem brains (84 cases and 80 controls) on the AD module to infer the effect direction of individual pathways. The data were accessed through the AMP-AD Knowledge Portal. The full AD network was tested for enrichment of significantly up- or down-regulated genes (hypergeometric test), using a total of 14,631 genes that were found both in the interactome network and in the RNAseq data. The significance of the overlap between dysregulated RNAseq genes and genes in each cluster was also assessed (hypergeometric test). We compared the observed number of differentially expressed genes found in each cluster to the expected number given the observed 2,399 significantly differentially expressed genes (adjusted p < 0.05, abs(Beta) > 0.5). We applied the Benjamini-Hochberg procedure for multiple tests across all clusters. Additionally, we computed the enrichment of RNAseq dysregulated genes in random clusters, generated by seeding the network propagation algorithm with randomly selected seeds with a similar degree distribution to the AD seed genes, and then applying the graph-based clustering algorithm to the genes significantly proximal to these random genes. This yielded gene clusters with similar properties to the AD gene clusters, but with no expected relationship to AD. We computed the enrichment of the RNAseq dysregulated genes in these random gene clusters in the same way as for the AD gene clusters, and found only one gene cluster marginally dysregulated (FDR < 0.2), compared to 8 strongly or marginally dysregulated gene clusters in the expanded AD module (Table 1 and Table B in S1 Data).
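The per-cluster overlap test can be sketched with the standard library alone. The background size (14,631) and the number of differentially expressed genes (2,399) come from the text. The cluster size and observed overlap below are hypothetical placeholders, and the function names are illustrative.

```python
from math import comb

def expected_overlap(population, successes, draws):
    """Expected number of differentially expressed genes in a cluster by chance."""
    return draws * successes / population

def hypergeom_sf(k, population, successes, draws):
    """One-sided hypergeometric p-value P(X >= k)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return tail / total

BACKGROUND = 14631   # genes present in both the interactome and the RNAseq data
DE_GENES = 2399      # significantly differentially expressed genes
cluster_size = 50    # hypothetical cluster size
observed = 20        # hypothetical number of DE genes in that cluster

expected = expected_overlap(BACKGROUND, DE_GENES, cluster_size)
p_value = hypergeom_sf(observed, BACKGROUND, DE_GENES, cluster_size)
```

The p-values obtained this way for all clusters would then be corrected together with the Benjamini-Hochberg procedure.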

Brain tissue specific analysis

Brain tissue expression data for neuronal, glial and endothelial tissue types were downloaded from the Brain RNA-seq Database [77]. FPKM values averaged over all genes per cluster were used to make the heatmap.

Supporting information

S1 Text

Supplementary Figures:

Fig A: Network localization of AD GWAS genes. (A) AD gene network identified by GWAS. Gene colors represent the differential expression beta statistic in the temporal cortex between AD and healthy controls. Edges represent high confidence interactions in the STRING database. (B) Distribution of number of edges interconnecting AD GWAS genes (blue) or randomly selected gene sets (yellow). 80% of the AD GWAS genes were sampled 5,000 times to create the distribution.

Fig B: Heatmap of relative gene expression between AD patients and healthy controls, temporal cortex. This figure shows the top 100 most differentially expressed genes. Note that the patients (columns) were not clustered here; they are sorted by healthy and AD status. Only the genes (rows) are clustered.

Fig C: Transcriptomic study of AD genes in the temporal cortex. (A) Overlap of up-regulated genes in the Mayo Clinic RNAseq data and the expanded AD disease module. (B) Significant difference of Z-scores between up-regulated genes and the rest of the genes.

Fig D: Transcriptomic study of AD genes in the cerebellum. (A) No significant overlap of up-regulated genes in the Mayo Clinic RNAseq data and the expanded AD disease module. (B) No significant difference of Z-scores between up-regulated genes and the rest of the genes.

Fig E: Brain cell-specific mean expression (FPKM) of genes in identified clusters. Functional annotations for the clusters are 1: Immunoregulatory interactions between a lymphoid and non-lymphoid cell, 2: RNA metabolic process, 3: Complement activation, 4: Protein modification by small protein conjugation, 5: Clathrin-mediated endocytosis, 6: SNARE binding, 7: Regulation of plasma lipoprotein particle levels, 8: EPH-Ephrin signaling, 9: DNA-binding transcription factor activity, 10: Acetylcholine-gated cation-selective channel activity, 11: Interleukin-1 signaling, 12: GABAergic synapse.

Fig F: Gene-enrichment analysis using the 646 predicted proximal genes. The plot shows the ratio of proximal genes overlapping with each disease-related gene set in the available databases.

(DOCX)

S1 Data

Supplementary Tables: Table A: Annotation of genes in the AD disease module. Table B: Functional annotation of the genes in the AD disease module. Table C: Genes in the hierarchical network. Table D: GWAS-identified significant genes that are not in the STRING interactome.

(XLSX)

Acknowledgments

The results published here are in whole or in part based on data obtained from Agora and the AD Knowledge Portal. Complete statements are detailed on their websites (https://agora.ampadportal.org/about/, https://adknowledgeportal.synapse.org/DataAccess/AcknowledgmentStatements/). The Mayo Clinic RNAseq Study used samples provided by the Mayo Clinic Brain Bank and Banner Sun Health Research Institute (https://www.synapse.org/#!Synapse:syn20818651).

Data Availability

All relevant data are within the paper, its Supporting Information files, and on Zenodo at https://zenodo.org/record/5786722#.Ybtti73MKC8 (DOI: 10.5281/zenodo.5786722).

Funding Statement

CHC was supported by grants R01MH118281 and R56AG061163 from the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Lane CA, Hardy J, Schott JM. Alzheimer’s disease. Eur J Neurol. 2018;25: 59–70. doi: 10.1111/ene.13439
  • 2. Gatz M, Reynolds CA, Fratiglioni L, Johansson B, Mortimer JA, Berg S, et al. Role of genes and environments for explaining Alzheimer disease. Arch Gen Psychiatry. 2006;63: 168–174. doi: 10.1001/archpsyc.63.2.168
  • 3. Hollingworth P, Harold D, Sims R, Gerrish A, Lambert J-C, Carrasquillo MM, et al. Common variants at ABCA7, MS4A6A/MS4A4E, EPHA1, CD33 and CD2AP are associated with Alzheimer’s disease. Nat Genet. 2011;43: 429–435. doi: 10.1038/ng.803
  • 4. Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat Genet. 2019;51: 404–413. doi: 10.1038/s41588-018-0311-9
  • 5. Kunkle BW, Grenier-Boley B, Sims R, Bis JC, Damotte V, Naj AC, et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat Genet. 2019;51: 414–430. doi: 10.1038/s41588-019-0358-2
  • 6. Lambert J-C, Heath S, Even G, Campion D, Sleegers K, Hiltunen M, et al. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat Genet. 2009;41: 1094–1099. doi: 10.1038/ng.439
  • 7. Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat Genet. 2013;45: 1452–1458. doi: 10.1038/ng.2802
  • 8. Marioni RE, Harris SE, Zhang Q, McRae AF, Hagenaars SP, Hill WD, et al. GWAS on family history of Alzheimer’s disease. Transl Psychiatry. 2018;8: 99. doi: 10.1038/s41398-018-0150-6
  • 9. Naj AC, Jun G, Beecham GW, Wang L-S, Vardarajan BN, Buros J, et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat Genet. 2011;43: 436–441. doi: 10.1038/ng.801
  • 10. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12: 56–68. doi: 10.1038/nrg2918
  • 11. Soler-López M, Zanzoni A, Lluís R, Stelzl U, Aloy P. Interactome mapping suggests new mechanistic details underlying Alzheimer’s disease. Genome Res. 2011;21: 364–376. doi: 10.1101/gr.114280.110
  • 12. Parikshak NN, Gandal MJ, Geschwind DH. Systems biology and gene networks in neurodevelopmental and neurodegenerative disorders. Nat Rev Genet. 2015;16: 441–458. doi: 10.1038/nrg3934
  • 13. Vidal M, Cusick ME, Barabási A-L. Interactome networks and human disease. Cell. 2011;144: 986–998. doi: 10.1016/j.cell.2011.02.016
  • 14. Mostafavi S, Gaiteri C, Sullivan SE, White CC, Tasaki S, Xu J, et al. A molecular network of the aging human brain provides insights into the pathology and cognitive decline of Alzheimer’s disease. Nat Neurosci. 2018;21: 811–819. doi: 10.1038/s41593-018-0154-9
  • 15. Raj T, Shulman JM, Keenan BT, Chibnik LB, Evans DA, Bennett DA, et al. Alzheimer disease susceptibility loci: evidence for a protein network under natural selection. Am J Hum Genet. 2012;90: 720–726. doi: 10.1016/j.ajhg.2012.02.022
  • 16. Seyfried NT, Dammer EB, Swarup V, Nandakumar D, Duong DM, Yin L, et al. A Multi-network Approach Identifies Protein-Specific Co-expression in Asymptomatic and Symptomatic Alzheimer’s Disease. Cell Syst. 2017;4: 60–72.e4. doi: 10.1016/j.cels.2016.11.006
  • 17. Swarup V, Chang TS, Duong DM, Dammer EB, Dai J, Lah JJ, et al. Identification of Conserved Proteomic Networks in Neurodegenerative Dementia. Cell Rep. 2020;31: 107807. doi: 10.1016/j.celrep.2020.107807
  • 18. Yu L, Petyuk VA, Gaiteri C, Mostafavi S, Young-Pearse T, Shah RC, et al. Targeted brain proteomics uncover multiple pathways to Alzheimer’s dementia. Ann Neurol. 2018;84: 78–88. doi: 10.1002/ana.25266
  • 19. Zhang B, Gaiteri C, Bodea L-G, Wang Z, McElwee J, Podtelezhnikov AA, et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell. 2013;153: 707–720. doi: 10.1016/j.cell.2013.03.030
  • 20. Zhang Q, Ma C, Gearing M, Wang PG, Chin L-S, Li L. Integrated proteomics and network analysis identifies protein hubs and network alterations in Alzheimer’s disease. Acta Neuropathol Commun. 2018;6: 19. doi: 10.1186/s40478-018-0524-2
  • 21. Hadar A, Gurwitz D. Peripheral transcriptomic biomarkers for early detection of sporadic Alzheimer disease? Dialogues Clin Neurosci. 2018;20: 293–300. doi: 10.31887/DCNS.2018.20.4/dgurwitz
  • 22. International Genomics of Alzheimer’s Disease Consortium (IGAP). Convergent genetic and expression data implicate immunity in Alzheimer’s disease. Alzheimers Dement J Alzheimers Assoc. 2015;11: 658–671. doi: 10.1016/j.jalz.2014.05.1757
  • 23. Cao C, Moult J. GWAS and drug targets. BMC Genomics. 2014;15 Suppl 4: S5. doi: 10.1186/1471-2164-15-S4-S5
  • 24. Greene CS, Krishnan A, Wong AK, Ricciotti E, Zelaya RA, Himmelstein DS, et al. Understanding multicellular function and disease with human tissue-specific networks. Nat Genet. 2015;47: 569–576. doi: 10.1038/ng.3259
  • 25. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol (Berl). 1991;82: 239–259. doi: 10.1007/BF00308809
  • 26. Brettschneider J, Del Tredici K, Lee VM-Y, Trojanowski JQ. Spreading of pathology in neurodegenerative diseases: a focus on human studies. Nat Rev Neurosci. 2015;16: 109–120. doi: 10.1038/nrn3887
  • 27. Lin C-Y, Chen C-H, Tom SE, Kuo S-H, Alzheimer’s Disease Neuroimaging Initiative. Cerebellar Volume Is Associated with Cognitive Decline in Mild Cognitive Impairment: Results from ADNI. Cerebellum Lond Engl. 2020;19: 217–225. doi: 10.1007/s12311-019-01099-1
  • 28. Xu J, Patassini S, Rustogi N, Riba-Garcia I, Hale BD, Phillips AM, et al. Regional protein expression in human Alzheimer’s brain correlates with disease severity. Commun Biol. 2019;2: 43. doi: 10.1038/s42003-018-0254-9
  • 29. Hu W, Wu F, Zhang Y, Gong C-X, Iqbal K, Liu F. Expression of Tau Pathology-Related Proteins in Different Brain Regions: A Molecular Basis of Tau Pathogenesis. Front Aging Neurosci. 2017;9: 311. doi: 10.3389/fnagi.2017.00311
  • 30. Singhal A, Cao S, Churas C, Pratt D, Fortunato S, Zheng F, et al. Multiscale community detection in Cytoscape. PLoS Comput Biol. 2020;16: e1008239. doi: 10.1371/journal.pcbi.1008239
  • 31. Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8: 1826. doi: 10.1038/s41467-017-01261-5
  • 32. Cowen L, Ideker T, Raphael BJ, Sharan R. Network propagation: a universal amplifier of genetic associations. Nat Rev Genet. 2017;18: 551–562. doi: 10.1038/nrg.2017.38
  • 33. Smith CL, Eppig JT. The mammalian phenotype ontology: enabling robust annotation and comparative analysis. Wiley Interdiscip Rev Syst Biol Med. 2009;1: 390–399. doi: 10.1002/wsbm.44
  • 34. Bult CJ, Blake JA, Smith CL, Kadin JA, Richardson JE, Mouse Genome Database Group. Mouse Genome Database (MGD) 2019. Nucleic Acids Res. 2019;47: D801–D806. doi: 10.1093/nar/gky1056
  • 35. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008;2008: P10008. doi: 10.1088/1742-5468/2008/10/P10008
  • 36. Akiyama H, Barger S, Barnum S, Bradt B, Bauer J, Cole GM, et al. Inflammation and Alzheimer’s disease. Neurobiol Aging. 2000;21: 383–421. doi: 10.1016/s0197-4580(00)00124-x
  • 37. Hong S, Beja-Glasser VF, Nfonoyim BM, Frouin A, Li S, Ramakrishnan S, et al. Complement and microglia mediate early synapse loss in Alzheimer mouse models. Science. 2016;352: 712–716. doi: 10.1126/science.aad8373
  • 38. Kontaxi C, Piccardo P, Gill AC. Lysine-Directed Post-translational Modifications of Tau Protein in Alzheimer’s Disease and Related Tauopathies. Front Mol Biosci. 2017;4: 56. doi: 10.3389/fmolb.2017.00056
  • 39. Wu F, Yao PJ. Clathrin-mediated endocytosis and Alzheimer’s disease: an update. Ageing Res Rev. 2009;8: 147–149. doi: 10.1016/j.arr.2009.03.002
  • 40. Costa AS, Guerini FR, Arosio B, Galimberti D, Zanzottera M, Bianchi A, et al. SNARE Complex Polymorphisms Associate with Alterations of Visual Selective Attention in Alzheimer’s Disease. J Alzheimers Dis JAD. 2019;69: 179–188. doi: 10.3233/JAD-190147
  • 41. Hauser PS, Narayanaswami V, Ryan RO. Apolipoprotein E: from lipid transport to neurobiology. Prog Lipid Res. 2011;50: 62–74. doi: 10.1016/j.plipres.2010.09.001
  • 42. Chen Y, Fu AKY, Ip NY. Eph receptors at synapses: implications in neurodegenerative diseases. Cell Signal. 2012;24: 606–611. doi: 10.1016/j.cellsig.2011.11.016
  • 43. Ferreira-Vieira TH, Guimaraes IM, Silva FR, Ribeiro FM. Alzheimer’s disease: Targeting the Cholinergic System. Curr Neuropharmacol. 2016;14: 101–115. doi: 10.2174/1570159x13666150716165726
  • 44. Calvo-Flores Guzmán B, Vinnakota C, Govindpani K, Waldvogel HJ, Faull RLM, Kwakowsky A. The GABAergic system as a therapeutic target for Alzheimer’s disease. J Neurochem. 2018;146: 649–669. doi: 10.1111/jnc.14345
  • 45. Mrak RE, Griffin WS. Interleukin-1 and the immunogenetics of Alzheimer disease. J Neuropathol Exp Neurol. 2000;59: 471–476. doi: 10.1093/jnen/59.6.471
  • 46. Burgos PV, Mardones GA, Rojas AL, daSilva LLP, Prabhu Y, Hurley JH, et al. Sorting of the Alzheimer’s disease amyloid precursor protein mediated by the AP-4 complex. Dev Cell. 2010;18: 425–436. doi: 10.1016/j.devcel.2010.01.015
  • 47. Wei X, Liu X, Tan C, Mo L, Wang H, Peng X, et al. Expression and Function of Zinc-α2-Glycoprotein. Neurosci Bull. 2019;35: 540–550. doi: 10.1007/s12264-018-00332-x
  • 48. Lane DJR, Ayton S, Bush AI. Iron and Alzheimer’s Disease: An Update on Emerging Mechanisms. J Alzheimers Dis JAD. 2018;64: S379–S395. doi: 10.3233/JAD-179944
  • 49. Allen M, Carrasquillo MM, Funk C, Heavner BD, Zou F, Younkin CS, et al. Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci Data. 2016;3: 160089. doi: 10.1038/sdata.2016.89
  • 50. Salazar SV, Cox TO, Lee S, Brody AH, Chyung AS, Haas LT, et al. Alzheimer’s Disease Risk Factor Pyk2 Mediates Amyloid-β-Induced Synaptic Dysfunction and Loss. J Neurosci Off J Soc Neurosci. 2019;39: 758–772. doi: 10.1523/JNEUROSCI.1873-18.2018
  • 51. Zorzetto M, Datturi F, Divizia L, Pistono C, Campo I, De Silvestri A, et al. Complement C4A and C4B Gene Copy Number Study in Alzheimer’s Disease Patients. Curr Alzheimer Res. 2017;14: 303–308. doi: 10.2174/1567205013666161013091934
  • 52. Rajendran L, Paolicelli RC. Microglia-Mediated Synapse Loss in Alzheimer’s Disease. J Neurosci Off J Soc Neurosci. 2018;38: 2911–2919. doi: 10.1523/JNEUROSCI.1136-17.2017
  • 53. Hansen DV, Hanson JE, Sheng M. Microglia in Alzheimer’s disease. J Cell Biol. 2018;217: 459–472. doi: 10.1083/jcb.201709069
  • 54. Villegas-Llerena C, Phillips A, Garcia-Reitboeck P, Hardy J, Pocock JM. Microglial genes regulating neuroinflammation in the progression of Alzheimer’s disease. Curr Opin Neurobiol. 2016;36: 74–81. doi: 10.1016/j.conb.2015.10.004
  • 55. Wolfe CM, Fitz NF, Nam KN, Lefterov I, Koldamova R. The Role of APOE and TREM2 in Alzheimer’s Disease-Current Understanding and Perspectives. Int J Mol Sci. 2018;20: E81. doi: 10.3390/ijms20010081
  • 56. Wang Y, Cella M, Mallinson K, Ulrich JD, Young KL, Robinette ML, et al. TREM2 lipid sensing sustains the microglial response in an Alzheimer’s disease model. Cell. 2015;160: 1061–1071. doi: 10.1016/j.cell.2015.01.049
  • 57. Yuan P, Condello C, Keene CD, Wang Y, Bird TD, Paul SM, et al. TREM2 Haplodeficiency in Mice and Humans Impairs the Microglia Barrier Function Leading to Decreased Amyloid Compaction and Severe Axonal Dystrophy. Neuron. 2016;90: 724–739. doi: 10.1016/j.neuron.2016.05.003
  • 58. Hassan MI, Waheed A, Yadav S, Singh TP, Ahmad F. Zinc alpha 2-glycoprotein: a multidisciplinary protein. Mol Cancer Res MCR. 2008;6: 892–906. doi: 10.1158/1541-7786.MCR-07-2195
  • 59. Hassan MI, Waheed A, Yadav S, Singh TP, Ahmad F. Prolactin inducible protein in cancer, fertility and immunoregulation: structure, function and its clinical implications. Cell Mol Life Sci CMLS. 2009;66: 447–459. doi: 10.1007/s00018-008-8463-x
  • 60. Hassan I, Ahmad F. Structural diversity of class I MHC-like molecules and its implications in binding specificities. Adv Protein Chem Struct Biol. 2011;83: 223–270. doi: 10.1016/B978-0-12-381262-9.00006-9
  • 61. Gamba P, Testa G, Sottero B, Gargiulo S, Poli G, Leonarduzzi G. The link between altered cholesterol metabolism and Alzheimer’s disease. Ann N Y Acad Sci. 2012;1259: 54–64. doi: 10.1111/j.1749-6632.2012.06513.x
  • 62. Mahley RW. Central Nervous System Lipoproteins: ApoE and Regulation of Cholesterol Metabolism. Arterioscler Thromb Vasc Biol. 2016;36: 1305–1315. doi: 10.1161/ATVBAHA.116.307023
  • 63. Nussbacher JK, Tabet R, Yeo GW, Lagier-Tourenne C. Disruption of RNA Metabolism in Neurological Diseases and Emerging Therapeutic Interventions. Neuron. 2019;102: 294–320. doi: 10.1016/j.neuron.2019.03.014
  • 64. Ambadipudi S, Biernat J, Riedel D, Mandelkow E, Zweckstetter M. Liquid-liquid phase separation of the microtubule-binding repeats of the Alzheimer-related protein Tau. Nat Commun. 2017;8: 275. doi: 10.1038/s41467-017-00480-0
  • 65. Labbadia J, Morimoto RI. The biology of proteostasis in aging and disease. Annu Rev Biochem. 2015;84: 435–464. doi: 10.1146/annurev-biochem-060614-033955
  • 66. Harris LD, Jasem S, Licchesi JDF. The Ubiquitin System in Alzheimer’s Disease. Adv Exp Med Biol. 2020;1233: 195–221. doi: 10.1007/978-3-030-38266-7_8
  • 67. Cadena C, Ahmad S, Xavier A, Willemsen J, Park S, Park JW, et al. Ubiquitin-Dependent and -Independent Roles of E3 Ligase RIPLET in Innate Immunity. Cell. 2019;177: 1187–1200.e16. doi: 10.1016/j.cell.2019.03.017
  • 68. Pérez MJ, Fernandez N, Pasquini JM. Oligodendrocyte differentiation and signaling after transferrin internalization: a mechanism of action. Exp Neurol. 2013;248: 262–274. doi: 10.1016/j.expneurol.2013.06.014
  • 69. Huang JK, Carlin DE, Yu MK, Zhang W, Kreisberg JF, Tamayo P, et al. Systematic Evaluation of Molecular Networks for Discovery of Disease Genes. Cell Syst. 2018;6: 484–495.e5. doi: 10.1016/j.cels.2018.03.001
  • 70. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43: D447–452. doi: 10.1093/nar/gku1003
  • 71. Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol. 2010;6: e1000641. doi: 10.1371/journal.pcbi.1000641
  • 72. Guney E, Menche J, Vidal M, Barabási A-L. Network-based in silico drug efficacy screening. Nat Commun. 2016;7: 10331. doi: 10.1038/ncomms10331
  • 73. Eppig JT. Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse. ILAR J. 2017;58: 17–41. doi: 10.1093/ilar/ilx013
  • 74. Piñero J, Ramírez-Anguita JM, Saüch-Pitarch J, Ronzano F, Centeno E, Sanz F, et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020;48: D845–D855. doi: 10.1093/nar/gkz1021
  • 75. Raudvere U, Kolberg L, Kuzmin I, Arak T, Adler P, Peterson H, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019;47: W191–W198. doi: 10.1093/nar/gkz369
  • 76. Zheng F, Zhang S, Churas C, Pratt D, Bahar I, Ideker T. HiDeF: identifying persistent structures in multiscale ‘omics data. Genome Biol. 2021;22: 21. doi: 10.1186/s13059-020-02228-4
  • 77. Zhang Y, Sloan SA, Clarke LE, Caneda C, Plaza CA, Blumenthal PD, et al. Purification and Characterization of Progenitor and Mature Human Astrocytes Reveals Transcriptional and Functional Differences with Mouse. Neuron. 2016;89: 37–53. doi: 10.1016/j.neuron.2015.11.013
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009903.r001

Decision Letter 0

Ilya Ioshikhes, Feixiong Cheng

1 Oct 2021

Dear Dr. Wang,

Thank you very much for submitting your manuscript "Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. In your revisions please in particular address the comments from Reviewer 1.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Feixiong Cheng, Ph.D.

Guest Editor

PLOS Computational Biology

Ilya Ioshikhes

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The manuscript titled ‘Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics’ (Manuscript No: PCOMPBIOL-D-21-01541) integrates multi-omics data to explore the key landscape of AD, which offers new insight for understanding the pathophysiology of Alzheimer’s disease. In particular, the starting point is the set of AD risk genes from GWAS analysis of a larger number of AD-by-proxy samples and autopsy-documented AD samples.

I have some points to discuss with authors:

Major comments:

1. The pathway analysis is based on the network expanded by the network propagation algorithm from AD risk genes identified by GWAS. As mentioned in the introduction, genetic variations captured by GWAS are inherited, whereas transcriptomics provides information on factors that are both inherited and non-inherited.

a. In Table 1, it is interesting that the key pathways from the expanded network module are not significantly enriched in the AD cortex RNA-seq enrichment (BH FDR). The difference between genes from GWAS data and genes from transcriptomic data should be discussed, or a further explanation should be given.

b. This work ran the network propagation algorithm on AD risk genes from GWAS analysis and validated the resulting module clusters using significantly changed genes from transcriptomic data. What if network propagation were first performed on genes from transcriptomic data and then integrated with GWAS data? Would the final pathways and conclusions be different?

2. In the section "AD disease module" of the Results, the manuscript states that ‘Out of the 192 genes, 142 were found in the interactome (STRING database) and included in our subsequent analyses’. This means that 50 risk genes were excluded because they are missing from the PPI network. However, is there a possibility that some of these 50 genes are key to the pathophysiology of Alzheimer’s disease? If so, this might be a limitation of the work.

The authors should state which version of the STRING database they used (the latest one or another?). In addition, the influence of genes that cannot be mapped to the interactome network should be discussed.

Minor comments:

1. In the transcriptomic study in the Methods section, a hypergeometric test was used to assess the overlap between significantly regulated genes from the transcriptomic data and the genes in the expanded AD disease module. The total number of genes considered in the test (the background) should be given. Is it the number of all genes in the GWAS analysis or all genes in the interactome network?

2. In Figure 5, mean expression (FPKM) was used to show the cell-specific gene expression in each cluster. With gene expression data, GSEA (gene set enrichment analysis) could also be considered to calculate the enrichment score of each cell type in specific clusters.
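The question in minor comment 1 about the background size matters because the hypergeometric p-value depends directly on it. The following is a minimal sketch using only the Python standard library; all counts are hypothetical and chosen for illustration, and none of them come from the manuscript.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: background size (all genes considered in the test)
    K: number of "success" genes in the background (e.g. DE genes)
    n: number of genes drawn (e.g. module size)
    k: observed overlap
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only (not from the manuscript).
module_size = 50   # genes in the module
de_genes = 200     # differentially expressed genes in the background
overlap = 8        # genes that are both in the module and DE

for background in (2000, 20000):
    p = hypergeom_sf(overlap, background, de_genes, module_size)
    print(f"background={background:>6}  p={p:.3g}")
```

With the same overlap, a larger background makes the overlap look far more surprising, which is why the choice between "all genes in the GWAS analysis" and "all genes in the interactome" should be reported.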

Reviewer #2: This paper integrated AD GWAS data with a protein-protein interaction network and identified 142 AD risk genes plus 646 network-proximal genes. The authors also used independent data such as mouse knockout data to validate the functions of those genes. Further, they related their PPI-based AD gene network to population gene expression data (temporal cortex, AD vs. healthy), and found 17 gene clusters enriched with various functions/pathways. Using those enriched pathways, they also found a hierarchical structure in their AD network, implying higher-order sequential interactions among AD gene functions. Finally, they associated the clusters with recent single-cell data and found microglial functions in clusters such as immunoregulation. In general, the paper reads logically and is well organized. However, I have the following concerns and suggestions:

● Under summary of molecular pathways of AD, the authors mention that the 5 classes of functionally related molecular pathways that are associated with AD could be changed slightly due to some pathways having multiple functions and being involved in various diseases. It may be helpful to indicate additional functions and how those 5 classes could change. Perhaps, providing this in supplementary information or elaborating a bit more could provide more informative context for the reader.

● In Introduction, the authors do mention 4 papers (25 to 28) which potentially explain evidence of pathological progression and different regional vulnerabilities in AD, but it may be helpful to elaborate a bit more on whether those studies utilized the cerebellum as a hypothesized control and temporal cortex as major disease-relevant tissue. Perhaps adding some more literature and context on the brain region-specific progression of AD would help motivate this approach. Maybe, for example, the cerebellum is never impacted in AD (and a line on its functioning in the brain and statistics on why it is never involved), can be powerful to include. For instance, while it is great that the authors have done literature review for brain regions in AD, it may help to also synthesize and weave in that literature (as it may be too tedious for the reader to go through the literature to validate the support).

● In Introduction, it may help to add a parentheses definition of multiscale community detection to help the reader better understand.

● While the authors mention a lot of citations to support methodology (that is great and does boost confidence in the approaches), it may be helpful to elaborate on what “poorly characterized or new genes are”, especially since the authors removed such genes (please see AD disease module section, end of 1st paragraph, under results).

● From the manuscript, it is implied that 50 AD risk genes are excluded (that were previously identified in the large AD-by-proxy GWAS). Is there any other analysis that was done using these genes? Could there be other interactome data including these genes that may have been considered. What are the risks involved or cons of this approach? The authors only mention a very brief limitation in the discussion (i.e., “There are multiple choices of interactome, each with their own pros and cons, and many tissue-specific networks that can be obtained”). They could potentially add in details on losing the 50 AD risk genes using STRING interactome as a potential limitation. Could the genes have been uncovered in other interactomes? Elaborating more about the potential limitations can help as well and explaining more about the “pros and cons” of tissue-specific networks can help. In addition, authors may consider adding a few more details on why the STRING interactome was selected (i.e. what were the results of a systemic evaluation of the performance of diverse networks).

● It is a bit confusing why the authors first use STRING interactome for the analysis (if only 142 AD GWAS risk genes are identified instead of the 191 that are identified in GIANT). Why replicate the analysis using GIANT to test consistency of the localization analysis and not the other way around, for example? Are the results of STRING in other applications much better than those of GIANT?

● Under “Network propagation”, the authors mention that this process “simulates how heat would travel, through a network starting from a set of seed nodes.” What does heat represent (in terms of exploring the network proximity to a set of genes identified as significantly associated with AD), and why is this important?

● In Discussion, the authors mention that the “dataset for AD risk genes from GWAS is incomplete”, but what does this mean and how can it be more complete?

● In Fig. 5B, the authors mention that the inner circles are classes of identified pathways that are functionally related. What exactly does the size of the inner circles mean? It may be helpful to add a legend or a little more explanation to the figure caption. Also, if “Immune reactions” is right next to Microglia (which is expected, as microglia are immune cells in the brain), does this mean that it is only in microglia (and the same for Substance metabolism)? If that is the case, then what about endothelial cells, as they do not have a unique circle in front of them? What determines the size of each cell type in the outer circle? In other words, this “circos-like” plot can be a bit confusing and may need to be rethought, as it may currently be difficult for the reader to appreciate the novel findings. The main text gives only one line for the figure, “A schematic model of gene network in AD pathophysiology based on results of the present study is illustrated in Fig. 5B”, which does not explain much either.

● In terms of visuals, Fig. 2 was superb in terms of detail, annotation, and legends, and innovation. Nonetheless, in the caption, the authors could add some example of proposed drug targets that have the bold black outlines. They could consider saying, (e.g. CHRNA2 and PTK2B are bolded examples). Also, what does the font size mean: why does CHRNA2 have a larger font size than TPM1 or ACTA2, for example? This explanation of font size can be added to Figure 2.

● Some of the other figures can be similar to Fig. 2 in terms of the approach. For instance, in Fig. 3, it may help to provide a legend (similar to the “cluster size” annotation in Fig. 2) to explain how the circle height relates to the # of genes that result in the phenotype when knocked out in mice. That is, for example, is the small circle just 1 gene and the large circle 100 genes? Fig. 4 caption explains pie chart colors but still having a legend (similar to Fig. 2 “DE Beta Statistic”) explaining the colors can help.

● Some of the methods may need more explanation, especially in terms of the math. What is alpha in “Network propagation”? Where do we get the heat vector at time t (F^t) and the value of Y? It may help if there is publicly available supporting code for readers to go through (e.g., an R Markdown file or IPython notebook tutorial) to help them better understand exactly how the network methods were implemented.

● The authors also mention trends that are not statistically significant under the section “Cell-specific preferential expression”. It may help to report p-values to better understand if the results were close to statistical significance. Also, providing more biological interpretation of the cell-specific results would be helpful as the authors provide references to Figures and Supplements but could also list them out in the results. For instance, microglia have been recently implicated in AD neuroinflammatory processes.

● Under the statement: “We observed a cluster of genes associated with the ZAG-PIP complex”, it may be helpful to list out these genes that are involved in the ZAG-PIP complex. It appears as though ZAG-PIP complex is #15 module in Fig. 2 but there are no distinct clusters of genes there that point to it. The same for RNA metabolic process, which has no module of genes pointing to it but has some mention of genes like ZC3HAV1, DNAJC2, SYMPK, etc. in the Discussion section.

● It was great that the authors mention the different pathways in the Discussion section and the biology of the findings. It may be helpful to cite the Figures and supplementary figures associated with the different pathways. For instance, it is unclear where we could find the “cluster of 33 genes” identified by the data-driven approach.

● The authors mention Fig 4A under Proteostasis, but there is no figure 4A.

● The organization of the discussion is unclear, nonetheless. The authors mention potentially clinically-relevant clusters to discuss, and it appears as though those could belong to Fig. 2, but the authors do not really follow a flow for Fig 2. For instance, where is proteostasis in Fig 2? It is a section in the discussion but not really mentioned in the figure. Also, Complement Activation is mentioned in Discussion, and is #3 cluster (Red) in Fig. 2. Please clarify the logic for the Discussion. It may be helpful to allude to the pathways based on Fig 2 (or respective figure) along with the cluster #. Another example is APOE and Lipid Metabolism, which seems to potentially come from Cluster # 7 (purple, lipoprotein regulation) in Fig. 2; in that case, it may help to use consistent subheading with the cluster name or annotate/explain the location of the corresponding figure.

● It may help to really emphasize key findings in a paragraph in “Summary of Molecular Pathways in AD” as there is a lot going on in this paper (which is all great work), but the reader may be left wondering what to fully take away from this study. For instance, is this an approach that can be used in other diseases/contexts? Is this shedding light on future research in this area and where do we go from here? What can biologists grasp from these modules and findings? What is the key importance of these 5 classes of functionally-related molecular pathways that are significantly associated with AD? These findings are useful but it felt like they were not presented to the reader with enough justification of their importance. What is the future work?
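The questions above about alpha, F^t and Y can be made concrete with a small sketch. It assumes the iterative update from the network propagation formulation of Vanunu et al. (reference 71), F^t = alpha * W' * F^(t-1) + (1 - alpha) * Y, where W' is a degree-normalized adjacency matrix and Y marks the seed genes. The toy graph, the node names, the symmetric normalization, and the alpha values are illustrative assumptions, not the authors' settings or code.

```python
def propagate(adj, seeds, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F^t = alpha * W' F^(t-1) + (1 - alpha) * Y until convergence.

    adj:   dict mapping each node to the set of its neighbours
           (undirected, no isolated nodes assumed)
    seeds: set of seed nodes; Y puts equal "heat" on each seed
    alpha: fraction of heat passed to neighbours at each step (0 < alpha < 1)
    """
    nodes = sorted(adj)
    deg = {u: len(adj[u]) for u in nodes}
    y = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    f = dict(y)  # F^0 = Y
    for _ in range(max_iter):
        new_f = {}
        for u in nodes:
            # Symmetric normalization (assumed): W'[u][v] = 1 / sqrt(deg(u) * deg(v))
            spread = sum(f[v] / (deg[u] * deg[v]) ** 0.5 for v in adj[u])
            new_f[u] = alpha * spread + (1 - alpha) * y[u]
        if max(abs(new_f[u] - f[u]) for u in nodes) < tol:
            return new_f
        f = new_f
    return f


# Hypothetical toy network: a path A - B - C - D - E with one seed gene, A.
toy_graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

heat = propagate(toy_graph, seeds={"A"}, alpha=0.5)
for gene in sorted(heat, key=heat.get, reverse=True):
    print(gene, round(heat[gene], 4))
```

In this picture "heat" is simply a proximity score: nodes close to the seeds in the network end up with high values, and a larger alpha lets the score spread further from the seeds before it is pulled back toward Y.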

Minors

● Grammatical errors and typos:

○ For example, under “Summary of Molecular Pathways in AD”, the first sentence contains an error, where “of” should probably be removed.

○ Another is in the introduction for the sentence: “We the integrate data of transcriptomic dysregulation in AD brains with the expanded AD disease module…”. Here, “the” should be replaced with “then”

○ In Fig. 5B, “Endothelia” could be a typo for “Endothelial”. If not enough space, perhaps abbreviating the name can help.

○ In Fig 3. Perhaps the x-axis label could be “log(OR) +/- 95% CI” with the / added between + and -.

○ Under Summary of Molecular Pathways in AD, 1st sentence may need hyphen “-” between functionally related, so it is: functionally-related

● Fig. 1 caption only mentions 2 words (Analysis workflow) and it may help to paraphrase the figure and the connections again so that the reader can recall the workflow. It may make the figure friendlier for readers then, as some could rely on the caption to help summarize the figure.

● In the introduction, it is mentioned that the latest large AD GWAS have identified over 40 risk loci. It is unclear if this refers to SNPs from one GWAS or from multiple GWAS combined (e.g., Jansen et al., Kunkle et al.). There was also a recent AD GWAS that has been published. The authors had 7 different references to GWAS studies for AD for this sentence, making it a bit confusing. It may help to refer to each AD GWAS by name, for instance “Jansen et al.” or “Kunkle et al.”. Again, a minor point, but it may be informative for readers.

● Figure 3: it is really great and a reader-friendly figure with few key terms, which is really thoughtful. Nonetheless, it seems as though abnormal synaptic transmission and abnormal CNS synaptic transmission are very similar (even visually, in terms of log(Odds Ratio) +/- 95% CI) and from the terms, perhaps 1 of the terms is a subset of the other. It may be better to include another term instead or remove altogether or provide more context on why both were included.

● This may be minor, but the number selected for random samplings could be consistent across approaches or explained more. For instance, why were 5,000 gene sets randomly selected to build up the null distribution for Network Propagation but only 1,000 random samplings used to build up the distribution for network localization? Is it due to computational resource constraints?

● Under network propagation, is z > 2 meant to be z > 1.96 (significance level of 5%) given that these are network propagation z-scores? Otherwise, what is the intuition for the cut-off of z > 2 for chance?

● In the Fig. S2 heatmap, the clustering and labeling is great, but it is a bit difficult to read the gene names properly. More height for the heatmap would help. A great resource in R could be the complexHeatmap package, which can really make beautiful heatmaps with annotations and clustering as well. Moreover, complexHeatmap could make the division in the heatmap between healthy and controls more prominent and easier to visualize.

● Sharing the code on github or a publicly-available location can greatly help.

● Instead of listing the URL for Agora directly (in section: “Transcriptomic dysregulation proximal to AD genes”), it may help to just cite it as the authors provide the full URL elsewhere in the methods.

● May be helpful to provide a citation for “Involvement of the cholinergic system in AD has been known since the 1970s”.

● May also consider adding in GWAS p-values if it helps motivate findings.

Reviewer #3: The paper used a network propagation method to build an AD disease module from GWAS inputs. By then combining it with transcriptome data, the authors showed that the identified AD disease module is enriched with multiple disease-associated pathways. Cell-specific expression data demonstrated that genes in the AD disease module are enriched for microglia-expressed genes related to immune and complement activation functions.

I think the overall manuscript design is rigorous and systematic.

1. At the end of page 21 and the beginning of page 22, the network propagation formula contains a parameter alpha. Can the authors provide some metric to explain which alpha value was used and why?

2. Second paragraph of page 22, node z-score computation: the method description mentions the node z-score but does not mention the corresponding p-value. Did the authors consider the p-value as well, or only the z-score?

3. The final AD disease module includes 788 genes: 142 risk genes and 646 network-proximal genes (without GWAS evidence). The authors have systematically discussed the 142 risk genes but say little about the remaining 646 network-proximal genes. The authors could provide statistical evidence for these 646 genes using publicly available AD knowledge databases, e.g., the Open Targets Platform or DisGeNET, to demonstrate that the 646 predicted genes are more AD-associated than 646 randomly selected genes. This would make it more convincing that the method can predict potential or likely AD-associated risk genes from GWAS inputs, or that the network module provides useful information. I think this could also help improve the novelty of the manuscript.
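The questions from Reviewers #2 and #3 about the z > 2 cut-off and the corresponding p-value can be illustrated with a short sketch. It assumes that a node's z-score is computed against a null distribution of propagation scores from randomly sampled seed sets (the reviews mention 5,000 random gene sets) and that the null is roughly normal when converting z to p. The simulated null values and the observed score are made up for illustration only.

```python
import math
import random
import statistics


def z_score(observed, null_values):
    """Standardize an observed score against a null distribution."""
    mu = statistics.mean(null_values)
    sd = statistics.stdev(null_values)
    return (observed - mu) / sd


def one_sided_p_from_z(z):
    """Upper-tail p-value of a standard normal, P(Z >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def empirical_p(observed, null_values):
    """Permutation-style p-value that does not assume normality."""
    hits = sum(v >= observed for v in null_values)
    return (hits + 1) / (len(null_values) + 1)


# Simulated null scores for one node (hypothetical numbers).
random.seed(0)
null_scores = [random.gauss(0.10, 0.02) for _ in range(5000)]
observed_score = 0.15

z = z_score(observed_score, null_scores)
print("z =", round(z, 2))
print("normal-approximation p =", one_sided_p_from_z(z))
print("empirical p =", empirical_p(observed_score, null_scores))

# z = 2 corresponds to a one-sided p of about 0.023;
# z = 1.96 corresponds to about 0.025 one-sided (0.05 two-sided).
print(round(one_sided_p_from_z(2.0), 4), round(one_sided_p_from_z(1.96), 4))
```

Under the normal approximation, z > 2 is therefore slightly stricter than the conventional z > 1.96, while an empirical p-value avoids the normality assumption at the cost of being limited in resolution by the number of random samplings.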

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No: open-source codes are missing for reproducibility

Reviewer #3: No: no code provided

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009903.r003

Decision Letter 1

Ilya Ioshikhes, Feixiong Cheng

25 Jan 2022

Dear Dr. Wang,

Thank you very much for submitting your manuscript "Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please take careful note of the minor comments from Reviewer #2 and try to submit the revised manuscript in 2-4 weeks.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Feixiong Cheng, Ph.D.

Guest Editor

PLOS Computational Biology

Ilya Ioshikhes

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The responses from authors have answered my questions well.

Reviewer #2: The authors have addressed most of my concerns. I just have some suggestions as follows.

6. The remark by the authors to point 1b is helpful and it is duly noted that presenting an alternative network could complicate the paper. Perhaps the limitations section could incorporate this limiting factor and mention in future work other networks could be considered. The authors could still reiterate the power of the STRING interactome as being the most robust interactome while still presenting the limitations of the approach.

7. It is great that the authors added in a paragraph describing the network propagation and what. It is helpful to perhaps elaborate a little further on what it means for heat to diffuse through a network, as that may be unclear still; at the moment, it sounds like a physics scenario with terms like heat, diffusion, etc. Nonetheless, it is great that the authors provided this helpful paragraph.

9. Revision made to the figure caption. Nonetheless, it may help to add these same comments and clarifications (e.g. what was mentioned about Microglia and “immune reactions” and “substance metabolism”) to the figure text or the manuscript (or some supplementary documentation) as this clarification could help potential readers better understand the plot.

Reviewer #3: I checked my review feedback modifications; I am satisfied with all answers. Great work, I also learned multiple things.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: None

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

References:

Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009903.r005

Decision Letter 2

Ilya Ioshikhes, Feixiong Cheng

8 Feb 2022

Dear Dr. Cheng,

We are pleased to inform you that your manuscript 'Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, including improvement of figure resolution, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Feixiong Cheng, Ph.D.

Guest Editor

PLOS Computational Biology

Ilya Ioshikhes

Deputy Editor

PLOS Computational Biology

***********************************************************

Comments from the Editors: The authors are strongly encouraged to improve the quality and resolution of all main figures during the production stage.

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009903.r006

Acceptance letter

Ilya Ioshikhes, Feixiong Cheng

18 Feb 2022

PCOMPBIOL-D-21-01541R2

Mapping the gene network landscape of Alzheimer’s disease through integrating genomics and transcriptomics

Dear Dr Wang,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text

    Supplementary Figures:

    Fig A: Network localization of AD GWAS genes. (A) AD gene network identified by GWAS. Gene colors represent the differential expression beta statistic in the temporal cortex between AD and healthy controls. Edges represent high confidence interactions in the STRING database. (B) Distribution of the number of edges interconnecting AD GWAS genes (blue) or randomly selected gene sets (yellow). 80% of the AD GWAS genes were sampled 5000 times to create the distribution.

    Fig B: Heatmap of relative gene expression between AD patients and healthy controls, temporal cortex. This figure shows the top 100 most differentially expressed genes. Note that the patients (columns) were not clustered here; they are sorted by healthy and AD status. Only the genes (rows) are clustered.

    Fig C: Transcriptomic study of AD genes in the temporal cortex. (A) Overlap of up-regulated genes in the Mayo Clinic RNAseq data and the expanded AD disease module. (B) Significant difference of Z-scores between up-regulated genes and the rest of the genes.

    Fig D: Transcriptomic study of AD genes in the cerebellum. (A) No significant overlap of up-regulated genes in the Mayo Clinic RNAseq data and the expanded AD disease module. (B) No significant difference of Z-scores between up-regulated genes and the rest of the genes.

    Fig E: Brain cell-specific mean expression (FPKM) of genes in identified clusters. Functional annotations for the clusters are 1: Immunoregulatory interactions between a lymphoid and non-lymphoid cell, 2: RNA metabolic process, 3: Complement activation, 4: Protein modification by small protein conjugation, 5: Clathrin-mediated endocytosis, 6: SNARE binding, 7: Regulation of plasma lipoprotein particle levels, 8: EPH-Ephrin signaling, 9: DNA-binding transcription factor activity, 10: Acetylcholine-gated cation-selected channel activity, 11: Interleukin-1 signaling, 12: GABAergic synapse.

    Fig F: Gene-enrichment analysis using the 646 predicted proximity genes. The plot shows the ratio of proximity genes overlapping with each disease-related gene set in the available databases.

    (DOCX)

    S1 Data

    Supplementary Tables: Table A: Annotation of genes in the AD disease module. Table B: Functional annotation of the genes in the AD disease module. Table C: Genes in the hierarchical network. Table D: GWAS-identified significant genes that are not in the STRING interactome.

    (XLSX)

    Attachment

    Submitted filename: PLOS_Comput_Biol_response.docx

    Attachment

    Submitted filename: PLOS_Comput_Biol_response_2.docx

    Data Availability Statement

    All relevant data are within the paper, its Supporting Information files, and on Zenodo at https://zenodo.org/record/5786722#.Ybtti73MKC8 (DOI: 10.5281/zenodo.5786722).


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
