Abstract
Alzheimer’s disease (AD) and diabetes mellitus (DM) are known to have a shared molecular mechanism. We aimed to identify shared blood transcriptomic signatures between AD and DM. Blood expression datasets for each disease were combined and a co-expression network was used to construct modules consisting of genes with similar expression patterns. For each module, a gene regulatory network based on gene expression and protein-protein interactions was established to identify hub genes. We selected one module, where COPS4, PSMA6, GTF2B, GTF2F2, and SSB were identified as dysregulated transcription factors that were common between AD and DM. These five genes were also differentially co-expressed in disease-related tissues, such as the brain in AD and the pancreas in DM. Our study identified gene modules that were dysregulated in both AD and DM blood samples, which may contribute to reveal common pathophysiology between two diseases.
Keywords: Alzheimer’s disease, diabetes mellitus, blood gene expression
1. Introduction
Alzheimer’s disease (AD), the most common form of dementia, has pathophysiological characteristics that include amyloid plaques, which are extracellular deposits of beta-amyloid (Aβ) in the brain parenchyma; cerebral amyloid angiopathy, which is the accumulation of Aβ in the cerebral blood vessels; and neurofibrillary tangles, which are deposits of hyper-phosphorylated tau protein inside neurons [1]. Recent evidence has suggested that AD is not only a brain-specific disease but also a systemic disorder associated with inflammation, obesity, and other chronic diseases [2]. Specifically, substantial epidemiologic studies have reported that diabetes mellitus (DM) is a crucial risk factor for AD [3,4]. Huang et al. [3] analyzed approximately 140,000 members of the Taiwanese population and reported that newly diagnosed patients with type 2 DM (T2D) have an increased risk of AD. Furthermore, a study analyzing the Hisayama cohort suggested that DM is a significant risk factor for all-cause dementia and AD [4].
Accumulating evidence indicates that AD and DM have shared molecular mechanisms, such as oxidative stress, mitochondrial dysfunction, and inflammation [5]. Therefore, AD is referred to as “type 3 diabetes” [6]. Furthermore, several studies based on whole transcriptome analysis have identified a common pathogenesis between AD and DM. Hokama et al. [7] analyzed the postmortem brain tissues of AD patients, in Hisayama, and reported that dysregulated genes in the AD hippocampus were associated with T2D-related genes. Caberlotto et al. [8] performed microarray analyses of brain tissues from AD and DM patients, revealing several common biosignatures between the two, such as autophagy. A microarray meta-analysis of brain tissues from patients with AD and of multiple tissues (pancreas, liver, muscle, and blood) from T2D patients showed that several pathways, including the ephrin receptor, liver X receptor/retinoid X receptor, and interleukin 6, were dysregulated in both AD and T2D [9].
We previously performed a microarray meta-analysis of blood samples collected from AD patients participating in the Europe-based AddNeuroMed and US-based Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohorts [10,11]. This analysis identified the immune and inflammation-, energy-, WNT-, endoplasmic reticulum-, and cancer-related pathways as the underlying mechanisms associated with the AD status [12]. Moreover, RNA-sequencing data of blood samples collected from AD Japanese subjects stored at the National Center for Geriatrics and Gerontology (NCGG) Biobank also revealed several other pathways (i.e., signal-recognition particle (SRP)-dependent cotranslational protein targeting to membrane, translational initiation, and ribosome) associated with the AD status [13].
Grayson et al. [14] performed microarray transcriptomic analyses of blood samples from T2D patients, identifying differentially expressed genes involved in the biological function of T-cell activation and signaling. Kaizer et al. [15] also reported that type 1 DM and T2D shared pathogenic mechanisms, such as overexpression of IL1B (probably due to hyperglycemia). Moreover, Lin et al. [16] performed RNA sequencing for neutrophils isolated from the blood of T2D patients and found that genes dysregulated in these patients were enriched in cytokine–cytokine receptor interactions, as well as nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB), tumor necrosis factor, cell adhesion molecule, Toll-like receptor, and chemokine signaling. In the OPTIMED study, whole transcriptome of T2D blood samples further revealed that genes with altered expression after anti-diabetic treatment with metformin were involved in the improvement of energy metabolism, the modulation of inflammation, and the inhibition of cancer progression [17].
Although numerous pieces of evidence suggest a shared pathophysiology between AD and DM in the brain and pancreatic tissues, few studies have compared their whole blood transcriptome signatures [18]. Santiago et al. [18] performed a microarray meta-analysis of blood RNA expression in the blood of AD and DM patients and mainly conducted differential expression (DE) analysis to identify common mechanisms between the two diseases. Another brain microarray meta-analysis study conducted a differential connectivity (DC) analysis, yielding TYROBP as an upstream regulator of AD [19]. The aim of our study was to identify shared blood transcriptomic signatures between AD and DM using both DE and DC analyses.
As shown in Figure 1, we analyzed publicly available datasets for the two diseases and selected representative datasets for each disease. The two selected datasets were integrated into one large dataset, which was used for the construction of a co-expression network, yielding several modules. We selected AD- and DM-related modules based on DE and DC analyses between the disease and control groups. From the selected module, we identified common transcription factors for AD and DM using a gene regulatory network (GRN) construction method (GENIE3) [20], a transcription factor (TF) database, gene expression correlation, and a protein-protein interaction (PPI) database.
2. Methods
2.1. Datasets and Preprocessing
Initially, we collected publicly available gene expression microarray data from blood samples taken from patients with AD and DM (Table S1): Four AD datasets (ADNI, GSE63060, GSE63061, and GSE85426) and four DM datasets (GSE9006, GSE23561, GSE87005, and E-MTAB-6667). We selected appendicitis (AP) as a control disease using disease networks (Supplementary Materials) [21,22,23], and GSE9579 was selected as a transcriptomic dataset for AP. The normalization and probe selection for all datasets are described in the Supplementary Materials.
2.2. Selection of Representative Datasets for AD and DM
We selected representative datasets for AD and DM using a MetaQC method that provides quantitative measures for assessing the quality and consistency of microarray data for meta-analysis in the form of six indices and one summarized rank of quality for each RNA dataset [24]. The six quality control (QC) measures comprised an internal QC (IQC), an external QC (EQC), two accuracy QC indices (AQCg and AQCp), and two consistency QC indices (CQCg and CQCp). The standardized mean rank value (SMR) is the average rank of all QC indices (Supplementary Materials).
2.3. Comparison of Selected Blood Datasets with the Brain and Pancreas Datasets
To investigate whether the expression levels of genes in blood samples, instead of those in disease-specific tissues, can be used to show the association between diseases, we compared the expression values of all genes in a selected blood sample set with those in disease-specific tissues (brain tissues for AD and pancreas tissues for T2D). A study by Mathys et al. [25] profiled 80,660 single-cell transcriptomes of brain tissue obtained from 25 and 24 subjects with and without AD pathology, respectively. We used the summary statistics of that study (Supplementary Table S2 in the study by Mathys et al. [25]), which provided the fold changes (FCs) of the expression levels of 10,520 to 16,966 genes between AD-pathology and no-pathology tissues for five brain cell types. The pancreas dataset (GSE81608) included the whole transcriptomes of 1600 human pancreatic α, β, δ, and pancreatic polypeptide (PP) cells obtained from 6 T2D patients and 12 non-diabetic individuals [26]. We scaled the GSE81680 dataset using log-transformation.
2.4. Differential Expression and Pathway Analyses
We investigated differentially expressed genes (DEGs) that were common between AD and DM. DE analysis between each disease (AD or DM) and its matched control (CNAD or CNDM) was conducted using the “lmFit” and “eBayes” functions in the limma package [27]. Results from the limma package analysis consisted of logFC values between the two conditions (disease (AD or DM) vs. healthy control (CNAD or CNDM)) and the p-values for each gene. We defined genes with a false discovery rate (FDR)-adjusted p-value of less than 0.05 as DEGs. We then selected DEGs that were common between AD and DM.
We used the Dataset for Annotation, Visualization and Integrated Discovery program (DAVID) to identify enriched pathways of the DEGs [28]. We used Gene Ontology (GO) [29] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [30] databases and established an FDR-adjusted p-value of less than 0.05 as the significance threshold. Using Spearman’s correlation coefficient, we compared the logFC values of all genes across AD, DM, and the control disease AP. A significance threshold for correlation across datasets was determined using a permutation test (Supplementary Materials).
2.5. Construction of Modules and Selection of Disease-Related Modules
First, blood RNA expression datasets for AD and DM were combined into one large dataset, after the batch effect between the AD and DM datasets was removed via the “ComBat” function in the sva package [31,32]. Subsequently, modules with the combined expression profiles were constructed using the weighted gene co-expression network analysis (WGCNA) package [33]. Specifically, a robust version of WGCNA (rWGCNA) [34,35] was used to minimize the influence of potential outlier samples. Detailed information on several hyper-parameters of the rWGCNA are described in the Supplementary Materials (Figures S1 and S2). Finally, we selected co-expression modules related to both AD and DM from the set of modules constructed with rWGCNA, using three previously established approaches. The first approach is a regression method using disease status and gene expression as independent and dependent variables, respectively [35]. The second involves selecting modules consisting of a significantly high number of DEGs [36], and the third involves selecting modules with DC between two statuses (diseased and healthy) across all pairs of genes in a module [19].
For the first approach, we used a linear mixed model (LMM), where the eigengene, which is the first principal component of the genes in a given module, and the disease status (disease or control) were used as the dependent and independent variables, respectively. We selected a module with a Bonferroni-corrected p-value < 0.05 for eigengene-disease associations from LMM and defined it as the disease-related module. For the second approach, Fisher’s exact test was used to evaluate whether DEGs were significantly enriched in the modules. We selected modules with a Bonferroni-corrected p-value < 0.05 for the number of common genes between the module genes and the DEGs from Fisher’s exact test and defined them as the DEG-enriched modules. For the third approach, the measurement of differential connectivity between the two conditions is described in the Supplementary Materials. We selected a module with a significant differential connectivity between the two conditions (Bonferroni-corrected p-value < 0.05) and defined it as the modular differential connectivity (MDC) module.
2.6. Construction of GRN and Identification of Hub Genes
For the selected module, we attempted to identify genes that were highly correlated with other genes, using the “GENIE3” function in the GENIE3 package based on the random forest algorithm with a regression tree, and constructed a GRN [20]. Two pieces of input data were needed to run GENIE3: One was an expression matrix (called exprMatrix) consisting of all candidate genes, and the other included lists of candidate upstream genes (regulators). GENIE3 generates a table with three columns: “regulatoryGene”, “targetGene”, and “weight”. The “weight” is the average of variable importance measured by the degree of variance reduction of the output variable [20]. To construct the GRN, we selected edges between two genes (regulatoryGene and targetGene) with weight values greater than the mean plus one standard deviation (SD) of the weight values.
Although GENIE3 is considered a reliable method for constructing GRNs [37], this algorithm only uses data-specific information (RNA expression) and has characteristics of randomness; therefore, it may generate biologically irrelevant regulations. To overcome these issues, we used the STRING database and correlation between two genes (Figure 1) [38]. From the edges between genes identified using GENIE3, we selected an edge when the two genes had a relationship in the STRING database or a high correlation value from the biweight midcorrelation measurement (≥95th percentile from the correlation matrices obtained from analyzing AD and DM gene expression data).
3. Results
3.1. Comparisons of Disease-Related RNA Alterations across All Blood Dataset Pairs for Each Disease
We examined whether blood expression values of different datasets from the same disease were correlated with each other. For this task, we calculated the logFC values of all genes between disease and control samples for each dataset, and then assessed their correlation between the datasets. For the four AD datasets, we measured the logFC between AD and CNAD for all genes using the limma package, yielding four lists of logFC values. Then, we compared pairs of these four lists of logFC values using Spearman’s correlation. Among the four AD datasets, two AddNeuroMed datasets (GSE63060 and GSE63061) had the most highly correlated FC values (Figure S2A, Spearman’s correlation coefficient = 0.78, p < 0.001).
Furthermore, for the four DM datasets, we measured the logFC between DM and CNDM for all genes using the limma package and then compared the logFC value of each gene for all pairs in the DM blood datasets. Three pairs exhibited positive correlations, and three pairs exhibited negative Spearman coefficient values (Figure S2B). As such, correlations among the blood datasets were not always strong, even for the same disease.
3.2. Selection of a Representative Dataset for Each Disease
Instead of using all the RNA expression datasets, as has been done in other studies [35,36], we selected representative datasets for each disease using MetaQC [24]. We measured the six QC indices from three datasets iteratively selected from the four AD datasets, resulting in an AD study having three values for each QC index. Figure 2 shows the average values of the six QC indices and the SMR scaled by min-max normalization. Based on the average values of SMR, GSE63060 and MTAB6667 had the lowest values for AD and DM, respectively. Therefore, we selected GSE63060 and MTAB667 as representative blood RNA expression datasets for AD and DM, respectively.
3.3. Comparison of Disease-Related RNA Alterations of the Blood with Those of the Brain and Pancreas
We compared the FC values for all genes between each disease (AD or DM) and its respective control (CNAD or CNDM) from the selected blood dataset with those from other disease-related tissues (the brain and pancreas) affected by AD and DM. When we compared the FC values between AD and CNAD for all genes in the AD blood dataset with the FC values in the AD brain dataset comprising data for five cell types [25], the transcriptomic signatures of the AD blood data exhibited the highest consistency with those of AD inhibitory neurons, followed by the transcriptomic signatures of AD excitatory neurons and AD astrocytes (Figure 3A) (p < 0.05, Spearman’s correlation). When comparing the FC values between DM and CNDM of the blood dataset (MTAB6667) with the FC values of the pancreas dataset comprising data for the four cell types [26], the RNA alteration patterns in the DM blood data were similar to those in the α and β cells of DM islets (p < 0.05, Spearman’s correlation analysis; Figure 3B). These results indicated that gene expression in the blood reflected that in the disease-related tissues.
3.4. Comparison of DEGs between AD and DM Blood Data
The blood datasets for AD and DM consisted of 11,342 and 16,592 genes, respectively, and the number of genes shared between the two datasets was 10,172. We identified 3474 and 3498 DEGs between AD and CNAD and DM and CNDM, respectively (Figure 4A). The DEGs between AD and CNAD (AD-DEGs) significantly overlapped with the DEGs between DM and CNDM (DM-DEGs). The number of DEGs shared between the AD-DEGs and DM-DEGs was 1264 (p = 0.002, hypergeometric test). In the pathway analyses, 442 and 423 pathways were enriched by AD- and DM-DEGs, respectively (Figure 4A). The pathways enriched by the AD-DEGs also significantly overlapped with those enriched by the DM-DEGs. The number of shared pathways was 316 (p < 5.92 × 10−310). Figure 4B shows notable pathways enriched by AD- and DM-DEGs.
The logFC values for all genes between AD and CNAD were significantly correlated with those for all genes between DM and CNDM (Figure 4C; Spearman’s correlation coefficient = 0.332, p < 0.001, permutation test). In contrast, the logFC values of the AD and DM blood datasets were not significantly related to those of the AP blood dataset (the control).
3.5. Identification of the AD and DM Co-Related Module
By applying rWGCNA to the integrated blood dataset, we constructed 11 modules consisting of genes with similar expression patterns across samples (Figure 5A). We examined these 11 modules using the three approaches described in the Section 2 and selected regression-based disease-related modules, DEG-enriched modules, and MDC modules (Figure 5B). For AD, we selected green, turquoise, and brown modules that were related to AD (had a significant association with AD status and module-eigengene), significantly overlapped with the AD-DEGs, and demonstrated a significant differential connectivity between AD and CNAD. For DM, we selected seven modules (green, pink, black, greenyellow, magenta, blue, and red) that were shown via the three approaches to be related to DM. The green module was selected for both AD and DM; therefore, we defined the green module as the co-related module between AD and DM (Figure 5C).
The green module had 467 genes, of which 452 and 306 genes were AD- and DM-DEGs, respectively (middle Venn diagram in Figure 5C). The eigengene of the 467 genes in the green module had a negative relationship with the presence of AD or DM (left bar plot in Figure 5C), indicating that most genes in the green module had higher expressions in AD and DM than in CNAD and CNDM. The 467 genes in the green module had lower connectivity in AD and DM (according to the biweight midcorrelation) than in CNAD and CNDM (right matrix plots in Figure 5C).
The 467 genes in the green module were enriched in 252 KEGG and GO pathways (Table S2), and the representative pathways included Alzheimer’s disease (hsa05010), ribosome (hsa03010), oxidative phosphorylation (hsa00190), mitochondrial respiratory chain (GO:0005746), and protein targeting to endoplasmic reticulum (ER) (GO:0045047). These pathways are known to be related to AD and DM [39,40,41,42,43,44].
3.6. GRN and Identification of Hub Genes
Using GENIE3 [20], we constructed a GRN for the green module, which was related to both AD and DM. The green module contained 42 of the 2764 known TF-related genes [45]. Four GRNs for AD, CNAD, DM, and CNDM were constructed using 467 genes in the green module and 42 TF-related genes as candidate genes and regulators (input features of GENIE3), respectively. Among the edges obtained using GENEI3, we only selected edges when an edge had PPI evidence in the STRING database or a high biweight midcorrelation value (≥95th percentile).
Among the 42 TF-related genes, 24 AD-GRN genes had fewer edges than the CNAD-GRN genes, while 18 AD-GRN genes had greater than or equal the number of edges of the CNAD-GRN genes. Among the 42 TF-related genes, 11 DM-GRN genes had fewer edges than the CNDM-GRN genes, while 31 DM-GRN genes had greater than or equal the number of edges of the CNDM-GRN genes. Because the genes in the green module of the AD- and DM-GRNs generally had lower connectivity or correlational values than those of the matched CN-GRNs (Figure 5), we focused on TF-related genes with fewer edges in the disease-GRN than in the CN-GRN. The following five genes were shared between the 24 and 11 genes with fewer edges in the AD- and DM-GRNs, respectively, than in the matched CN-GRNs: COPS4, PSMA6, GTF2B, GTF2F2, and SSB (Figure 6). This finding suggested that these five genes were dysregulated TFs shared by AD and DM.
3.7. Validation of the Green Module and the Five Hub Genes
To compare the differential connectivity of the green module between the disease and control blood samples with that of other tissues, we analyzed gene expression in datasets for AD and CNAD brain tissue samples (GSE5281) and DM and CNDM pancreas tissue samples (GSE20966). Detailed information on the pre-processing of the two datasets is presented in the Supplementary Materials. The brain tissue dataset (GSE5281) contained gene expression values for 420 of the 467 genes in the green module. Using all pairs of the 420 genes, we constructed two separate biweight midcorrelation matrices for AD and CNAD samples. The results showed that the 420 genes in the AD brain had lower correlation values than those in the CNAD brain (p < 2.2 × 10−16, t-test). Specifically, the five genes COPS4, PSMA6, GTF2B, GTF2F2, and SSB had lower correlation values in the AD dataset than in the CNAD dataset (Figure 7). The pancreas dataset (GSE20966) contained gene expression values for 435 of the 467 green module genes. We also constructed two separate biweight midcorrelation matrices for DM and CNDM using all pairs of the 435 genes. The 435 genes had lower correlational values in the DM pancreas dataset than in the CNDM pancreas dataset (p < 2.2 × 10−16, t-test). Additionally, the five TF-related genes had lower correlation values in the DM dataset than in the CNDM dataset. These results confirmed that these five genes had differential connectivity in disease-related tissues.
4. Discussion
In this study, we identified five TF-related genes (COPS4, PSMA6, GTF2B, GTF2F2, and SSB) in the blood that were dysregulated in both AD and DM. These results were consistent in disease-related tissues (the brain for AD and the pancreas for DM).
COPS4 is a component of the COP9 signalosome and removes the ubiquitin-like protein Nedd8, mediating the reduction of the misfolded protein burden, which is the crucial mechanism of AD [46]. Additionally, the COP9 signalosome is related to obesity, which is the major risk factor for T2D, by mediating the expression of the C/EBP homologous protein and regulating the differentiation of pre-adipocytes. Similarly, PSMA6 is a component of the 20S proteasome and is related to both AD and DM [47,48]. Several studies have reported that PSMA6 has a possible role in DM complications, such as myocardial infarction and nephropathy [49,50].
A text-mining study that predicted physical interactions among candidate disease genes and constructed a molecular network analysis reported GTF2B as the top AD-related gene [51]. Nilsson et al. [52] analyzing transcriptomic data from adipose tissues in a twin cohort, reported GTF2B as a down-regulated TF-related gene in the T2D group. GTF2F2, which exists in many tissues and organs, plays a role in both the initiation and elongation of transcription by controlling RNA polymerase II [53]. Recently, GTF2F2 was reported to be involved in neurogenesis, neuroplasticity, and synaptogenesis by mediating NRF1 [54].
Numerous studies have focused on the autoantibody for SSB (anti-SSB), whose presence characterizes the main pathophysiology of Sjogren’s syndrome and is not part of the normal function of SSB. SSB is known as a transcription termination factor that facilitates the termination of transcription by RNA polymerase III [55]. Xu et al. [36] performed a meta-microarray analysis of brain tissue and reported that the SSB gene was dysregulated in AD patients. A separate study analyzing blood gene expression reported that SSB was also dysregulated in AD samples [56].
In this study, we selected a representative gene expression dataset for each disease instead of integrating several gene expression datasets, for the following three reasons: First, the disease-related gene expression values were variable, even across samples of the same disease. Among all datasets in each disease, only one pair of the AD datasets (GSE63060/GSE63061) and two pairs of the DM blood datasets had SCC > 0.15 when comparing their FC values for all genes (Figure S3). Second, when we compared the logFC changes between AD and CNAD in the large AD blood dataset with the logFC changes between DM and CNDM in the large DM blood dataset, the correlation between AD and DM was insignificant (Figure S4). Third, too many genes were excluded when generating a large dataset given that the platforms used (e.g., microarray) had different coverage for transcripts or genes. For example, we integrated seven different microarray datasets that resulted in only about 1000 genes found to be present in all datasets (Figure S4).
In this study, we identified the shared transcriptomic signatures between AD and DM by integrating co-expression, DE, and DC analyses. Numerous studies used the DE method between two statuses (disease and control) to identify genes related with a specific disease [7,8,9,12,13,14]. However, the DE-based methods cannot reveal two genes that are simultaneously upregulated or downregulated. Therefore, alternative analyses, such as co-expression and DC analyses, have been used to uncover coordinated and regulatory biological signatures [19,35]. This combination of DE and DC analyses allowed us identify a shared module between AD and DM containing highly correlated AD- and DM-related genes.
The selection of upstream genes, a crucial task in numerous studies, has been performed using various methods. Zhang et al. [19] selected the upstream or parent genes by integrating multi-omics datasets. Briefly, they implemented a Bayesian approach by allowing genes with cis-expression quantitative trait loci as the upstream or parent genes [19]. In turn, the Stockholm Atherosclerosis Gene Expression study used genes in TF-related GO terms as the upstream or hub genes [57]. Herein, we used 2765 TF-related genes, which were manually curated from previously validated studies and several databases [45], as well as the PPI database [38] to reduce false positive findings.
Lastly, we selected the five genes (COPS4, PSMA6, GTF2B, GTF2F2, and SSB) by strict criteria (e.g., statistically significance in DC and DE analyses, the presence of interaction with other genes in the PPI network, and the high correlation with other genes). However, there were no previous in vitro or in vivo studies reporting associations between the five genes and AD or DM. This could be due to the small number of previous studies that have investigated shared transcriptome changes between AD and DM. However, because our study was performed based on integration of multiple datasets and unbiased analyses, we expect that the identified genes are biomarkers of AD and DM.
5. Conclusions
In summary, our study revealed a gene module in which genes were commonly dysregulated in both AD and DM blood samples. These genes belonged to AD- and DM-related pathways.
Acknowledgments
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
Supplementary Materials
The following are available online at https://www.mdpi.com/2227-9059/9/1/34/s1, Figure S1: Soft Threshold Power (Hyper-parameter) of Co-Expression Network (rWGCNA), Figure S2: DS, MMS, DCOR, PAM (Hyper-parameters) of Co-Expression Network (rWGCNA), Figure S3: Correlation matrices from Spearman correlation for all pairwise comparisons among blood datasets in each disease, Figure S4: Comparison of gene expressions between integrated AD blood and integrated DM blood datasets, Table S1: All analyzed datasets (AD and DM), Table S2: Enriched pathways of genes in green module.
Author Contributions
Conceptualization, H.L. and T.L.; methodology, T.L.; formal analysis, T.L.; investigation, T.L.; data curation, T.L.; writing—original draft preparation, T.L.; writing—review and editing, H.L.; visualization, T.L.; supervision, H.L.; project administration, H.L.; funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.
Funding
This work has been supported by the Bio & Medical Technology Development Program of NRF funded by the Korean government (MSIT) (NRF-2018M3C7A1054935) and a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant Number: HI18C0460).
Data Availability Statement
Gene expression datasets are publicly available (ADNI, http://adni.loni.usc.edu/; GEO, https://www.ncbi.nlm.nih.gov/geo/; ArrayExpress, https://www.ebi.ac.uk/arrayexpress/). A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
Conflicts of Interest
The authors declare no conflict of interest.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Kumar A., Singh A. A review on Alzheimer’s disease pathophysiology and its management: An update. Pharmacol. Rep. 2015;67:195–203. doi: 10.1016/j.pharep.2014.09.004. [DOI] [PubMed] [Google Scholar]
- 2.Morris J.K., Honea R.A., Vidoni E.D., Swerdlow R.H., Burns J. Is Alzheimer’s disease a systemic disease? Biochim. Biophys. Acta Mol. Basis Dis. 2014;1842:1340–1349. doi: 10.1016/j.bbadis.2014.04.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Huang C.-C., Chung C.-M., Leu H.-B., Lin L.-Y., Chiu C.-C., Hsu C.-Y., Chiang C.-H., Huang P.-H., Chen T.-J., Lin S.-J., et al. Diabetes Mellitus and the Risk of Alzheimer’s Disease: A Nationwide Population-Based Study. PLoS ONE. 2014;9:e87095. doi: 10.1371/journal.pone.0087095. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Ohara T., Doi Y., Ninomiya T., Hirakawa Y., Hata J., Iwaki T., Kanba S., Kiyohara Y. Glucose tolerance status and risk of dementia in the community: The Hisayama Study. Neurology. 2011;77:1126–1134. doi: 10.1212/WNL.0b013e31822f0435. [DOI] [PubMed] [Google Scholar]
- 5.Kandimalla R., Thirumala V., Reddy P.H. Is Alzheimer’s disease a Type 3 Diabetes? A critical appraisal. Biochim. Biophys. Acta Mol. Basis Dis. 2017;1863:1078–1089. doi: 10.1016/j.bbadis.2016.08.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.De La Monte S.M., Wands J.R. Alzheimer’s Disease is Type 3 Diabetes—Evidence Reviewed. J. Diabetes Sci. Technol. 2008;2:1101–1113. doi: 10.1177/193229680800200619. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Hokama M., Oka S., Leon J., Ninomiya T., Honda H., Sasaki K., Iwaki T., Ohara T., Sasaki T., LaFerla F.M., et al. Altered Expression of Diabetes-Related Genes in Alzheimer’s Disease Brains: The Hisayama Study. Cereb. Cortex. 2014;24:2476–2488. doi: 10.1093/cercor/bht101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Caberlotto L., Nguyen T.-P., Lauria M., Priami C., Rimondini R., Maioli S., Cedazo-Minguez A., Sita G., Morroni F., Corsi M., et al. Cross-disease analysis of Alzheimer’s disease and type-2 Diabetes highlights the role of autophagy in the pathophysiology of two highly comorbid diseases. Sci. Rep. 2019;9:1–13. doi: 10.1038/s41598-019-39828-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Mirza Z., Kamal M.A., Buzenadah A.M., Al-Qahtani M.H., Karim S. Establishing Genomic/Transcriptomic Links Between Alzheimer’s Disease and Type 2 Diabetes Mellitus by Meta-Analysis Approach. CNS Neurol. Disord. Drug Targets. 2014;13:501–516. doi: 10.2174/18715273113126660154. [DOI] [PubMed] [Google Scholar]
- 10.Lovestone S., Francis P., Kloszewska I., Mecocci P., Simmons A., Soininen H., Spenger C., Tsolaki M., Vellas B., Wahlund L.O. AddNeuroMed—the European collaboration for the discovery of novel biomarkers for Alzheimer’s disease. Ann. N. Y. Acad. Sci. 2009;1180:36–46. doi: 10.1111/j.1749-6632.2009.05064.x. [DOI] [PubMed] [Google Scholar]
- 11.Mueller S.G., Weiner M., Thal L.J., Petersen R.C., Jack C.R., Jagust W., Trojanowski J.Q., Toga A.W., Beckett L. Ways toward an early diagnosis in Alzheimer’s disease: The Alzheimer’s Disease Neuroimaging Initiative (ADNI) Alzheimer Dement. 2005;1:55–66. doi: 10.1016/j.jalz.2005.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Lee T., Lee H. Prediction of Alzheimer’s disease using blood gene expression data. Sci. Rep. 2020;10:1–13. doi: 10.1038/s41598-020-60595-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Shigemizu D., Mori T., Akiyama S., Higaki S., Watanabe H., Sakurai T., Niida S., Ozaki K. Identification of potential blood biomarkers for early diagnosis of Alzheimer’s disease through RNA sequencing analysis. Alzheimer Res. Ther. 2020;12:1–12. doi: 10.1186/s13195-020-00654-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Grayson B.L., Wang L., Aune T.M. Peripheral blood gene expression profiles in metabolic syndrome, coronary artery disease and type 2 diabetes. Genes Immun. 2011;12:341–351. doi: 10.1038/gene.2011.13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Kaizer E.C., Glaser C.L., Chaussabel D., Banchereau J., Pascual V., White P.C. Gene Expression in Peripheral Blood Mononuclear Cells from Children with Diabetes. J. Clin. Endocrinol. Metab. 2007;92:3705–3711. doi: 10.1210/jc.2007-0979. [DOI] [PubMed] [Google Scholar]
- 16.Lin Q., Zhou W., Wang Y., Huang J., Hui X., Zhou Z., Xiao Y. Abnormal Peripheral Neutrophil Transcriptome in Newly Diagnosed Type 2 Diabetes Patients. J. Diabetes Res. 2020;2020:9519072. doi: 10.1155/2020/9519072. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Ustinova M., Ansone L., Silamikelis I., Rovite V., Elbere I., Silamikele L., Kalnina I., Fridmanis D., Sokolovska J., Konrade I., et al. Whole-blood transcriptome profiling reveals signatures of metformin and its therapeutic response. PLoS ONE. 2020;15:e0237400. doi: 10.1371/journal.pone.0237400. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Santiago J.A., Bottero V., Potashkin J.A. Transcriptomic and Network Analysis Highlight the Association of Diabetes at Different Stages of Alzheimer’s Disease. Front. Neurosci. 2019;13:1273. doi: 10.3389/fnins.2019.01273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Zhang B., Gaiteri C., Bodea L.-G., Wang Z., McElwee J., Podtelezhnikov A.A., Zhang C., Xie T., Tran L., Dobrin R., et al. Integrated Systems Approach Identifies Genetic Nodes and Networks in Late-Onset Alzheimer’s Disease. Cell. 2013;153:707–720. doi: 10.1016/j.cell.2013.03.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Huynh-Thu V.A., Irrthum A., Wehenkel L., Geurts P. Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. PLoS ONE. 2010;5:e12776. doi: 10.1371/journal.pone.0012776. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Zhou X., Menche J., Barabási A.-L., Sharma A. Human symptoms-disease network. Nat. Commun. 2014;5:4212. doi: 10.1038/ncomms5212. [DOI] [PubMed] [Google Scholar]
- 22.Menche J., Sharma A., Kitsak M., Ghiassian S.D., Vidal M., Loscalzo J., Barabási A.-L. Uncovering disease-disease relationships through the incomplete interactome. Science. 2015;347:1257601. doi: 10.1126/science.1257601. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Kim J., Kim J.-J., Lee H. An analysis of disease-gene relationship from Medline abstracts by DigSee. Sci. Rep. 2017;7:40154. doi: 10.1038/srep40154. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Kang D.D., Sibille E., Kaminski N., Tseng G. MetaQC: Objective quality control and inclusion/exclusion criteria for genomic meta-analysis. Nucleic Acids Res. 2011;40:e15. doi: 10.1093/nar/gkr1071. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Mathys H., Davila-Velderrain J., Peng Z., Gao F., Mohammadi S., Young J.Z., Menon M., He L., Abdurrob F., Jiang X., et al. Single-cell transcriptomic analysis of Alzheimer’s disease. Nat. Cell Biol. 2019;570:332–337. doi: 10.1038/s41586-019-1195-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Xin Y., Kim J., Okamoto H., Ni M., Wei Y., Adler C., Murphy A.J., Yancopoulos G.D., Lin C., Gromada J. RNA Sequencing of Single Human Islet Cells Reveals Type 2 Diabetes Genes. Cell Metab. 2016;24:608–615. doi: 10.1016/j.cmet.2016.08.018. [DOI] [PubMed] [Google Scholar]
- 27.Smyth G.K. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer; Berlin/Heidelberg, Germany: 2005. Limma: Linear Models for Microarray Data; pp. 397–420. [Google Scholar]
- 28.Huang D.W., Sherman B.T., Lempicki R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4:44. doi: 10.1038/nprot.2008.211. [DOI] [PubMed] [Google Scholar]
- 29.The Gene Ontology Consortium The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 2019;47:D330–D338. doi: 10.1093/nar/gky1055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Kanehisa M., Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Leek J.T., Johnson W.E., Parker H.S., Jaffe A.E., Storey J.D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28:882–883. doi: 10.1093/bioinformatics/bts034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Johnson W.E., Li C., Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2006;8:118–127. doi: 10.1093/biostatistics/kxj037. [DOI] [PubMed] [Google Scholar]
- 33.Langfelder P., Horvath S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559. doi: 10.1186/1471-2105-9-559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Parikshak N.N., Luo R., Zhang A., Won H., Lowe J.K., Chandran V., Horvath S., Geschwind D.H. Integrative Functional Genomic Analyses Implicate Specific Molecular Pathways and Circuits in Autism. Cell. 2013;155:1008–1021. doi: 10.1016/j.cell.2013.10.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Gandal M.J., Haney J.R., Parikshak N.N., Leppa V., Ramaswami G., Hartl C., Schork A.J., Appadurai V., Buil A., Werge T., et al. Shared molecular neuropathology across major psychiatric disorders parallels polygenic overlap. Science. 2018;359:693–697. doi: 10.1126/science.aad6469. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Xu M., Zhang D.-F., Luo R., Wu Y., Zhou H., Kong L.-L., Bi R., Yao Y.-G. A systematic integrated analysis of brain expression profiles reveals YAP1 and other prioritized hub genes as important upstream regulators in Alzheimer’s disease. Alzheimers Dement. 2018;14:215–229. doi: 10.1016/j.jalz.2017.08.012. [DOI] [PubMed] [Google Scholar]
- 37.Pratapa A., Jalihal A.P., Law J.N., Bharadwaj A., Murali T.M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods. 2020;17:147–154. doi: 10.1038/s41592-019-0690-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Szklarczyk D., Gable A.L., Lyon D., Junge A., Wyder S., Huerta-Cepas J., Simonovic M., Doncheva N.T., Morris J.H., Bork P., et al. STRING v11: Protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607–D613. doi: 10.1093/nar/gky1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Ding Q., Markesbery W.R., Chen Q., Li F., Keller J.N. Ribosome Dysfunction Is an Early Event in Alzheimer’s Disease. J. Neurosci. 2005;25:9171–9175. doi: 10.1523/JNEUROSCI.3040-05.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Ashford A.J., Pain V.M. Effect of diabetes on the rates of synthesis and degradation of ribosomes in rat muscle and liver in vivo. J. Biol. Chem. 1986;261:4059–4065. [PubMed] [Google Scholar]
- 41.Liang W.S., Reiman E.M., Valla J., Dunckley T., Beach T.G., Grover A., Niedzielko T.L., Schneider L.E., Mastroeni D., Caselli R. Alzheimer’s disease is associated with reduced expression of energy metabolism genes in posterior cingulate neurons. Proc. Natl. Acad. Sci. USA. 2008;105:4441–4446. doi: 10.1073/pnas.0709259105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Sivitz W.I., Yorek M.A. Mitochondrial Dysfunction in Diabetes: From Molecular Mechanisms to Functional Significance and Therapeutic Opportunities. Antioxid. Redox Signal. 2010;12:537–577. doi: 10.1089/ars.2009.2531. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Chadwick W., Mitchell N., Martin B., Maudsley S. Therapeutic targeting of the endoplasmic reticulum in Alzheimer’s disease. Curr. Alzheimer Res. 2012;9:110–119. doi: 10.2174/156720512799015055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Dong Y., Fernandes C., Liu Y., Wu Y., Wu H., Brophy M.L., Deng L., Song K., Wen A., Wong S., et al. Role of endoplasmic reticulum stress signalling in diabetic endothelial dysfunction and atherosclerosis. Diabetes Vasc. Dis. Res. 2017;14:14–23. doi: 10.1177/1479164116666762. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Lambert S.A., Jolma A., Campitelli L.F., Das P.K., Yin Y., Albu M., Chen X., Taipale J., Hughes T.R., Weirauch M.T. The Human Transcription Factors. Cell. 2018;172:650–665. doi: 10.1016/j.cell.2018.01.029. [DOI] [PubMed] [Google Scholar]
- 46.Parry T.L., Melehani J.H., Ranek M.J., Willis M.S. Functional Amyloid Signaling via the Inflammasome, Necrosome, and Signalosome: New Therapeutic Targets in Heart Failure. Front. Cardiovasc. Med. 2015;2:25. doi: 10.3389/fcvm.2015.00025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Upadhya S.C., Hegde A.N. Role of the ubiquitin proteasome system in Alzheimer’s disease. BMC Biochem. 2007;8:S12. doi: 10.1186/1471-2091-8-S1-S12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Broca C., Varin E., Armanet M., Tourrel-Cuzin C., Bosco M., Dalle S., Wojtusciszyn A. Proteasome Dysfunction Mediates High Glucose-Induced Apoptosis in Rodent Beta Cells and Human Islets. PLoS ONE. 2014;9:e92066. doi: 10.1371/journal.pone.0092066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Barbieri M., Marfella R., Rizzo M.R., Boccardi V., Siniscalchi M., Schiattarella C., Siciliano S., Lemme P., Paolisso G. The—8 UTR C/G polymorphism of PSMA6 gene is associated with susceptibility to myocardial infarction in type 2 diabetic patients. Atherosclerosis. 2008;201:117–123. doi: 10.1016/j.atherosclerosis.2008.01.005. [DOI] [PubMed] [Google Scholar]
- 50.Feng Y., Jin M.-Y., Liu D.-W., Dong-Wei L. Proteasome subunit-α type-6 protein is post-transcriptionally repressed by the microRNA-4490 in diabetic nephropathy. Biosci. Rep. 2018;38 doi: 10.1042/BSR20180815. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Krauthammer M., Kaufmann C.A., Gilliam T.C., Rzhetsky A. Molecular triangulation: Bridging linkage and molecular-network information for identifying candidate genes in Alzheimer’s disease. Proc. Natl. Acad. Sci. USA. 2004;101:15148–15153. doi: 10.1073/pnas.0404315101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Nilsson E.A., Jansson P.A., Perfilyev A., Volkov P., Pedersen M., Svensson M.K., Poulsen P., Ribel-Madsen R., Pedersen N.L., Almgren P., et al. Altered DNA Methylation and Differential Expression of Genes Influencing Metabolism and Inflammation in Adipose Tissue From Subjects With Type 2 Diabetes. Diabetes. 2014;63:2962–2976. doi: 10.2337/db13-1459. [DOI] [PubMed] [Google Scholar]
- 53.Tan S., Aso T., Conaway R.C., Conaway J.W. Roles for both the RAP30 and RAP74 subunits of transcription factor IIF in transcription initiation and elongation by RNA polymerase II. J. Biol. Chem. 1994;269:25684–25691. [PubMed] [Google Scholar]
- 54.Preciados M., Yoo C., Roy D. Estrogenic Endocrine Disrupting Chemicals Influencing NRF1 Regulated Gene Networks in the Development of Complex Human Brain Diseases. Int. J. Mol. Sci. 2016;17:2086. doi: 10.3390/ijms17122086. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Gottlieb E., Steitz J.A. Function of the mammalian La protein: Evidence for its action in transcription termination by RNA polymerase III. EMBO J. 1989;8:851–861. doi: 10.1002/j.1460-2075.1989.tb03446.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Naughton B.J., Duncan F.J., Murrey D.A., Meadows A.S., Newsom D.E., Stoicea N., White P., Scharre D.W., Mccarty D.M., Fu H. Blood Genome-Wide Transcriptional Profiles Reflect Broad Molecular Impairments and Strong Blood-Brain Links in Alzheimer’s Disease. J. Alzheimer Dis. 2014;43:93–108. doi: 10.3233/JAD-140606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Hägg S., Skogsberg J., Lundström J., Noori P., Nilsson R., Zhong H., Maleki S., Shang M.-M., Brinne B., Bradshaw M., et al. Multi-Organ Expression Profiling Uncovers a Gene Module in Coronary Artery Disease Involving Transendothelial Migration of Leukocytes and LIM Domain Binding 2: The Stockholm Atherosclerosis Gene Expression (STAGE) Study. PLoS Genet. 2009;5:e1000754. doi: 10.1371/journal.pgen.1000754. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
Gene expression datasets are publicly available (ADNI, http://adni.loni.usc.edu/; GEO, https://www.ncbi.nlm.nih.gov/geo/; ArrayExpress, https://www.ebi.ac.uk/arrayexpress/). A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.