Skip to main content
Scientific Reports logoLink to Scientific Reports
. 2025 Jan 15;15:2093. doi: 10.1038/s41598-025-85947-7

DeepDrug as an expert guided and AI driven drug repurposing methodology for selecting the lead combination of drugs for Alzheimer’s disease

Victor O K Li 1,✉,#, Yang Han 1,#, Tushar Kaistha 1,#, Qi Zhang 1, Jocelyn Downey 1, Illana Gozes 2, Jacqueline C K Lam 1,✉,#
PMCID: PMC11735786  PMID: 39814937

Abstract

Alzheimer’s Disease (AD) significantly aggravates human dignity and quality of life. While newly approved amyloid immunotherapy has been reported, effective AD drugs remain to be identified. Here, we propose a novel AI-driven drug-repurposing method, DeepDrug, to identify a lead combination of approved drugs to treat AD patients. DeepDrug advances drug-repurposing methodology in four aspects. Firstly, it incorporates expert knowledge to extend candidate targets to include long genes, immunological and aging pathways, and somatic mutation markers that are associated with AD. Secondly, it incorporates a signed directed heterogeneous biomedical graph encompassing a rich set of nodes and edges, and node/edge weighting to capture crucial pathways associated with AD. Thirdly, it encodes the weighted biomedical graph through a Graph Neural Network into a new embedding space to capture the granular relationships across different nodes. Fourthly, it systematically selects the high-order drug combinations via diminishing return-based thresholds. A five-drug lead combination, consisting of Tofacitinib, Niraparib, Baricitinib, Empagliflozin, and Doxercalciferol, has been selected from the top drug candidates based on DeepDrug scores to achieve the maximum synergistic effect. These five drugs target neuroinflammation, mitochondrial dysfunction, and glucose metabolism, which are all related to AD pathology. DeepDrug offers a novel AI-and-big-data, expert-guided mechanism for new drug combination discovery and drug-repurposing across AD and other neuro-degenerative diseases, with immediate clinical applications.

Keywords: DeepDrug, Alzheimer’s Disease, Expert-led AI drug-repurposing, Graph neural network, Somatic and germline mutations, Long genes, Lead combination of AD drugs, Directed biomedical graph, Neuro-degenerative, Inflammation, Immunological, Aging pathways, Pathway convergence

Subject terms: Computational biology and bioinformatics, Alzheimer's disease

Introduction

Alzheimer’s disease (AD) significantly deteriorates human health and quality of life18. Globally, around 50 million people have suffered from AD and related forms of dementia, generating 28.8 million disability-adjusted life-years1. In general, drug development is a hugely expensive endeavor. It takes USD$2.6 billion for a candidate drug to go through the developmental process prior to regulatory approval. At present, no effective disease-modifying treatment or preventative therapies have been found, while the process of screening effective AD drug candidates has remained lengthy and data-constrained913. Only six drugs and one drug combination comprising two of the six drugs have been approved by the US Food and Drug Administration (FDA) for AD treatment. Existing approved non-immunotherapy treatments for AD include acetylcholinesterase (AChE) inhibitors (Donepezil, Galantamine, and Rivastigmine) and N-methyl-D-aspartate receptor (NMDA) receptor antagonists (Memantine). A systematic meta-analysis14 has highlighted that these treatments can improve cognitive scores but not neuropsychiatric symptoms. These treatments primarily target symptoms like cognitive decline without addressing the underlying neurodegenerative processes15, making them less helpful in managing the progression of the disease.

In our study, instead of focusing individually on cholinergic deficit, excitotoxicity, or amyloid deposition, the key hallmarks of AD encompassing specific pharmaceutical intervention, an artificial intelligence (AI)-based holistic approach for multi-drug repurposing has been targetted. Drug-repurposing is a strategy that identifies new uses for approved drugs. It sheds new hopes for the discovery of treatments for AD and other neurodegenerative diseases, by repurposing existing clinical or non-clinical drugs reported in the drug banks1623. As such, network-based statistical methods were proposed to identify putative therapeutics that can be repurposed for cancer, based on the distance (the shortest path in the network) between the drug targets (proteins) and disease proteins of cancer, assuming that the mutated genes that generate the disease proteins are connected with cancer through protein–protein interactions24. Moreover, drug combinations, which consist of more than one drug with different mechanisms of action and in a single dosage form, were reported to increase the chance of effective therapy25,26. Statistical methods were previously used to identify repurposed two-drug combinations for hypertension and cancer by exploring the network-based relationship between two drugs (and their targets) and disease proteins27. Given the large search space of high-order drug combinations, a random forest was proposed to identify effective drug combinations (from three to five drugs) for tuberculosis therapy based on pairwise drug response data only28. More recently, network-based drug repurposing has been exploited to combat COVID-19 infection. Belyaeva et al.29 integrated multi-omics data with aging signatures into a causal network model to identify drug candidates to treat COVID-19. Howell et al.30 constructed a virus-host interactome as a discrete qualitative network and used node perturbations to identify two-drug combinations for different stages of COVID-19 infection. Further, Gysi et al. (2021) developed a graph neural network (GNN) for COVID-19 treatment recommendations based on a link prediction problem on a biomedical graph consisting of nodes (drugs, proteins, diseases) and edges (protein–protein interactions, drug-target associations, disease-protein associations, and drug-disease indications)31. Hsieh et al.32 developed a GNN model to encode graph nodes (drugs and proteins) into a lower-dimensional space and determined COVID-19 treatment based on the closeness of the drugs to COVID-19 disease proteins32.

In the context of AD drug-repurposing, a few studies have demonstrated several successful applications of computational drug-repurposing methods. Siavelis et al.33 combined a network-based approach with traditional statistical measures, capitalizing on the results obtained from an AD-related microarray study to demonstrate that genes are differentially expressed when AD-related cells are exposed to a test panel of drugs. Taubes et al.34 performed computational AD drug-repurposing based on transcriptomic perturbations of drugs best reversing APOE4-related AD signatures, in combination with experimental evidence observed in mouse models, to demonstrate the efficacy of bumetanide in modulating pathways associated with APOE4, a major genetic risk factor for AD. Bayraktar et al.35 identified glutaminase as an AD drug target and performed computational drug-repurposing to reveal drugs that can reduce glutamate production based on gene expression data. Furthermore, multi-omics data and systems biology approaches have become increasingly adopted in AD drug-repurposing. Zhou et al.36 integrated genetic, transcriptomic, and proteomic data to identify potential drug targets for AD treatment through network-based statistical analysis. Azizan et al.37 explored the molecular links between metabolic disorders and AD through a systems biology approach, demonstrating the common pathways and molecular targets shared by metabolic conditions (hypertension, non-alcoholic fatty liver disease, and diabetes) and AD, and providing new insights into AD pathophysiology and drug-repurposing. Even though these computational approaches are yet to exploit the full potential of AI in capturing the complex relationships of high-dimensional biomedical data, they have demonstrated the significance of synthesizing diverse biological data sources to uncover the underlying disease mechanisms and improve the efficacy of drug-repurposing for AD.

More recently, the landscape of AD drug-repurposing has rapidly evolved given the advancement in AI-driven computational methods38,39. Rodriguez et al. (2021) developed a machine learning framework to predict a list of genes that associate with different stages of AD, based on gene expression data from multiple datasets, for drug repurposing. Ghiam et al.40 examined the role of long non-coding RNAs (lncRNAs) in the cross-talk between diabetes and AD using two bipartite networks, including (1) an mRNA-miRNA network with common mRNAs in diabetes and AD and (2) an miRNA-lncRNA network with high-degree miRNAs obtained from the mRNA-miRNA network. They utilized machine learning to predict the effects of drugs on the expression of high-degree lncRNAs selected from the miRNA-lncRNA network. Furthermore, a few studies have utilized GNNs to model the complex interactions between different entities within biological networks, integrating various biomedical datasets to address the complexity and heterogeneity of AD. Zeng et al.41 proposed DeepDR, a GNN model to learn high-level representations of drugs and diseases from biomedical networks and predict drug-disease associations. Xu et al.42 proposed NETTAG, integrating genome-wide association studies (GWAS) findings and multi-omics data to predict AD-risk genes and identify druggable targets, with the help of a modified GNN model capturing the sparse structure and clustering of a protein–protein interaction (PPI) network. Pan et al.43 developed AI-DrugNet to learn the representation of individual drug-target pair, using a GNN model based on a drug-target pair network and predict drug combinations for AD based on the combined features of two drug-target pairs. Hsieh et al.44 further used a graph autoencoder model to learn node representations from an AD knowledge graph consisting of four node types (drugs, genes, pathways, and gene ontology) and their interactions, identifying top-ranking drugs according to multi-level drug efficacy evidence, ranging from transcriptomic patterns to clinical trials, and two-drug combinations selected from top-ranking drugs with synergistic effects in targeting multiple disease pathways.

Here, we propose DeepDrug, an expert-led AI-driven drug-repurposing framework, to identify a lead combination of approved drugs for AD treatment. While the underlying mechanism of AD remains largely unknown12,4547, the novel DeepDrug model features the incorporation of domain-specific knowledge from neuroscience experts into AI-driven drug-repurposing. Specifically, DeepDrug has four novelties as compared to previous studies: (i) It integrates expert-led knowledge (including long genes, immunological and aging pathways, and somatic mutation markers identified from the blood and the brain with AD) to select candidate genes and drugs relevant to AD, using their GNN-based high-level representations to calculate drug-gene scores. In contrast, previous GNN-based approaches focused primarily on drug-disease association41, protein–protein interaction42, or drug-target interaction43; (ii) It generates an expert-led signed directed heterogeneous biomedical graph to encompass a rich set of nodes and edges, carefully incorporating expert knowledge in graph construction to capture crucial pathways that associate with AD. The expert-led biomedical graph distinguishes itself from previous AD biomedical graphs44, allowing (a) nodes/edges weight assignments and (b) positive/negative directed edges, better capturing AD domain-specific knowledge; (iii) It encodes the expert-led biomedical graph through a signed directed GNN into a new embedding space to better capture the rich relationships between different types of nodes, deviating from traditional statistical measures relying solely on the shortest path metric based on signed directed biomedical graphs48 or recent graph representation learning models44 that do not account for the weights of nodes/edges and the signs/directions of edges. Building on top of the domain-specific node/edge weight assignments, such as drug-target affinity, and the signed and directed edges, such as drug-target activation or inhibition, DeepDrug can further enhance the accuracy of prediction of successful drug candidates; (iv) It optimizes the selection of repurposed drug-combinations systematically, prioritizing single drug candidates via diminishing return-determined thresholds and identifying the lead drug combinations beyond two drugs43,44.

Our DeepDrug drug-repurposing framework consists of a three-stage methodology. Firstly, the construction of a heterogeneous directed biomedical graph consisting of interconnected genes, proteins, and drugs to capture the network characteristics of AD pathology, accounting for the known associations across the expert-led domain-specific AD pathways. Secondly, the biomedical graph is taken as the input to an artificial intelligence (AI)-driven GNN framework, which creates the embeddings of the drug and gene nodes in a lower dimensional embedding space as the outputs. Thirdly, a drug scoring and selection analysis is conducted to generate the drug-gene scores and identify a lead combination of repurposed AD drug candidates with the top drug-gene scores for clinical verification. A five-drug combination, consisting of Tofacitinib, Niraparib, Diltiazem, Pantoprazole, and Pravastatin, has been selected from the top repurposed drugs to achieve the maximum synergistic effect. These five drugs target neuroinflammation, mitochondrial dysfunction, and cholesterol metabolism, which are all related to AD pathology. Figure 1 summarizes our proposed DeepDrug methodology and novelties.

Fig. 1.

Fig. 1

DeepDrug Framework and Novelties.

Results

Identification of the top approved drug candidates from DeepDrug

By means of our DeepDrug methodology (see “Methods”), we have generated a priority list of repurposed drugs as well as a lead combination of five repurposed drugs for AD treatment. Table 1 shows the top 15 drug candidates and their respective scores based on the findings generated by DeepDrug (see “Scoring and Selection of Drug and Drug Combination”). All these drugs are potentially relevant candidates for AD treatment based on evidence from previous literature. Specifically, the top drug, Tofacitinib, was identified as one of the top repurposed drugs in AI-driven AD drug-repurposing research49, subsequently tested in mouse models50. Three of the drugs (Tofacitinib, Filgotinibm, and Baricitinib) are JAK kinase inhibitors and provide a level of validation to our work since it has previously been noted that the JAK/STAT pathway plays an important role in AD-linked neurodegeneration51. While the administration of JAK kinase inhibitors such as Tofacitinib has not been shown to reduce the incidence of AD in Rheumatoid Arthritis patients52, its potential benefits in AD when used in combinatorial therapy have yet to be elucidated. Notably, Tofacitinib has been reported to decrease rapamycin-induced autophagy53, which may raise concerns for AD therapy, as increased autophagy has been associated with enhanced cognitive function54. However, it is important to consider the stage-specific context of AD. As noted with rapamycin later, excessive autophagy may have detrimental outcomes in the later stage. Therefore, JAK inhibitors could be clinically relevant as part of a tailored therapeutic strategy to balance autophagy modulation depending on disease progression. Of the remaining two candidates, Niraparib is a protein poly (ADP-ribose) polymerase (PARP) inhibitor, and Palbociclib is a cyclin-dependent kinase (CDK)4/6 inhibitor. Over-activation of PARP in response to oxidative damage has been associated with increased cell necrosis, and potential neurodegeneration in the context of AD55. Unsurprisingly, PARP has previously been identified as a viable target for AD therapy, along with CDK4/6. CDK4/6 complexes with cyclin 1, facilitating cell progression from G0- > G1 phase, is intricately involved in cell-cycle control in response to oxidative stress56. Our earlier studies also confirmed the positive involvement of PARP in brain plasticity plus learning and memory57,58. Among the remaining ten drugs, early intervention with Sacrolimus (rapamycin) has been shown to reduce Alzheimer’s neurodegeneration in mouse models59. Carosi and Sargeant59 have also noted that intervention via rapamycin during the later course development of the disease might be counterproductive, as it might promote neurological degeneration at a point when the lysosomal system within the brain is damaged. Empagliflozin, Febuxostat, Doxercalciferol, and Zinc Acetate can be relevant candidates for AD treatment because they were formerly approved for treating AD-associated comorbidities or risk factors, including diabetes60, gout61, Vitamin D deficiency62, and Zinc deficiency63. Raloxifene was previously identified to treat women with AD64, a sub-population that carries a higher risk of AD. Valproic Acid and Olanzapine were formerly approved for treating neurological diseases such as seizures and schizophrenia and identified as promising agents for AD treatment65,66. Other candidates, including Adenosine67 and Miconazole68, carry the potentials to reduce neuroinflammation responses.

Table 1.

Top Repurposed Drug Candidates for AD by DeepDrug.

DeepDrug Ranking DeepDrug Score Drug Approved Treatment Description
1 0.2692 Tofacitinib Rheumatic conditions Janus kinase (JAK) inhibitor
2 0.1528 Niraparib Cancer Poly-ADP ribose polymerase (PARP) inhibitor
3 0.1454 Palbociclib Cancer Cyclin-dependent kinase 4/6 (CDK4/6) inhibitor
4 0.1446 Filgotinib Rheumatic conditions JAK inhibitor
5 0.1436 Baricitinib Rheumatic conditions JAK inhibitor
6 0.1318 Sirolimus Organ transplant rejection Mammalian target of rapamycin (mTOR) inhibitor
7 0.1195 Empagliflozin Diabetes Sodium-glucose co-transporter-2 (SGLT2) inhibitor
8 0.1096 Febuxostat

Hyperuricemia

Gout

Xanthine oxidase (XO) inhibitor
9 0.1084 Raloxifene Osteoporosis Estrogen receptor modulator (SERM)
10 0.0964 Doxercalciferol Hyperparathyroidism Vitamin D2 analog
11 0.0939 Valproic Acid Seizures Fatty acid derivative and anticonvulsant
12 0.0938 Miconazole Fungal infections Azole antifungal
13 0.0937 Zinc Acetate

Zinc deficiency

Wilson’s disease

Zinc
14 0.0936 Adenosine Supraventricular tachycardia Antiarrhythmic agent
15 0.0917 Olanzapine

Schizophrenia

Bipolar I disorder

Antipsychotic agent

Identification of a lead combination of drugs based on the top approved drugs

Based on the 15 drug candidates identified, drug combinations, ranging from two-drug to six-drug combinations, have been generated by DeepDrug (see “Drug Combination Optimisation”). Figure 2 shows the distribution of the drug combination scores from two-drug combinations up to six-drug combinations.

Fig. 2.

Fig. 2

Box Plot of Drug Combination Scores. This figure shows the scores of all drug combinations, from two-drug combinations up to six-drug combinations, grouped by the number of drugs in the drug combination, including the minimum, 25th percentile, median, 75th percentile, and maximum scores. The leading drug combination has been selected from the top five-drug combinations, which is the optimal number of drugs in drug combinations based on the cut-off point where the maximum score has been reached.

The top ten five-drug combinations for AD among 3,003 combinations (choosing 5 from 15 drugs) are shown in Table 2. The small variations in the top five-drug combination scores align with previous experimental and simulation results, which suggest that an effective high-order drug combination is not “a needle in a haystack” and that many combinations of novel agents may enhance efficacy69. Moreover, although Valproic Acid and Miconazole have not been found in the top ten drugs (though they are in the top 15), they can be found in the top ten five-drug combination list, potentially due to their complementary interactions with other drugs. For example, Valproic Acid can reduce the expression of pro-inflammatory cytokines through its histone deacetylase (HDAC) inhibition70, while the other top drugs, including JAK inhibitors such as Tofacitinib and Baricitinib, can further regulate cytokine signaling51, potentially providing a synergistic reduction in neuroinflammation and a more comprehensive treatment strategy for AD.

Table 2.

The Top Ten Five-drug Combinations for AD.

Drug 1 Drug 2 Drug 3 Drug 4 Drug 5 DeepDrug Score
Tofacitinib Niraparib Baricitinib Empagliflozin Doxercalciferol 0.3876
Tofacitinib Niraparib Empagliflozin Doxercalciferol Valproic acid 0.3857
Tofacitinib Palbociclib Empagliflozin Doxercalciferol Valproic acid 0.3847
Tofacitinib Niraparib Sirolimus Doxercalciferol Miconazole 0.3844
Tofacitinib Niraparib Palbociclib Empagliflozin Doxercalciferol 0.3843
Tofacitinib Baricitinib Empagliflozin Doxercalciferol Valproic acid 0.3842
Tofacitinib Palbociclib Empagliflozin Febuxostat Valproic acid 0.3839
Tofacitinib Palbociclib Sirolimus Valproic acid Miconazole 0.3837
Tofacitinib Niraparib Baricitinib Sirolimus Miconazole 0.3836
Tofacitinib Baricitinib Empagliflozin Febuxostat Valproic acid 0.3836

To further evaluate the robustness of the drug combination selection process, we used the top ten drug candidates (covering more than 80% of drug targets), instead of the top 15 drug candidates (covering more than 90% of drug targets), to identify the lead drug combination. Results showed that the five drugs in the lead drug combination identified from the top 15 drug candidates (see Table 2) were also in the best drug combination or the second-best drug combination identified using the top ten drug candidates (see Table S1 in Supplementary Information). The high overlap in the lead drug combinations identified using different target coverage threshold values demonstrates the robustness of the drug combination selection process.

Given that the drug combination score was the average of tenfold cross-validation results (see “Scoring and Selection of Drug and Drug Combination”), we performed a Wilcoxon signed-rank test to examine the score difference between the lead drug combination and the other drug combinations. The statistical result showed that the score of the lead drug combination was significantly different from the average scores of the other nine drug combinations in Table 2 across the tenfold cross-validation (Wilcoxon test statistic = 2.0; p = 0.006), suggesting the significant synergistic effect of the lead drug combination compared to other combinations.

The lead five-drug combination selected using DeepDrug comprises Tofacitinib, Niraparib, Baricitinib, Empagliflozin, and Doxercalciferol. The mechanism of these five drugs and their involvement in different stages of AD drug discovery are listed in Table 3. These drugs have been previously approved for the treatment of rheumatic conditions, cancer, diabetes, and hyperparathyroidism. As shown in Table 3, each of the five drug candidates in the lead drug combination is relevant for inhibiting the AD pathology. Specifically, Tofacitinib is a JAK inhibitor targeting the JAK-STAT pathway, which promotes neuroinflammation responses mediated by microglia and astrocytes51. Microglia dysfunction characterized by proinflammatory cytokines is related to the clearance/accumulation of amyloid beta, contributing to AD pathology71. Niraparib inhibits PARP enzymes, which may contribute to neurodegeneration through metabolic impairment and cell death regulation72. Baricitinib is also a JAK inhibitor, which can potentially reinforce Tofacitinib by inhibiting JAK and reducing the production of these cytokines, attenuating neuroinflammatory processes. Empagliflozin is an SGLT2 inhibitor targeting the shared pathways between diabetes and AD, including the insulin signaling pathway73. Doxercalciferol is a Vitamin D analog triggering neuronal protection actions against inflammatory and oxidative stress74. Moreover, the relevance of these five drug candidates in AD pathology has been demonstrated by their individual involvements in different stages of AD drug discovery; whilst a cited review highlighted the use of Niraparib for AD drug-repurposing75, an epidemiological study linked Doxercalciferol to reduced AD risk, based on observational medical records data76 and animal models that used Tofacitinib and Empagliflozin to treat AD in mice50,73, an interventional human clinical trial (Phase II) investigated the effectiveness of Baricitinib on AD treatment77.

Table 3.

A Lead Five-drug Combination of FDA-Approved Drugs for AD.

Drug Tofacitinib Niraparib Baricitinib Empagliflozin Doxercalciferol
Description JAK inhibitor PRAP inhibitor JAK inhibitor SGLT2 inhibitor Vitamin D2 analog
Target Disease Rheumatic conditions Cancer Rheumatic conditions Diabetes Hyperparathyroidism
AD Drug Discovery Mouse model50 Literature review75 Clinical trial (Phase II)77 Mouse model73 Epidemiological study76
Mechanism Tofacitinib is a JAK inhibitor that prevents the body from responding to cytokine signals, inhibiting the JAK-STAT pathway to decrease the inflammatory response Niraparib is a potent and selective inhibitor of PARP enzymes, which plays an important role in DNA repair Baricitinib inhibits JAK proteins, modulates interleukin, interferon, and growth factor signaling pathways, reduces JAK1/JAK2 expression in mutated cells, and induces cell apoptosis Empagliflozin is a potent inhibitor of renal SGLT2 transporters located in the proximal tubules of the kidneys and works to lower blood glucose levels via an increase in glucosuria Doxercalciferol is a synthetic Vitamin D2 analog; its biologically active form controls the absorption and mobilization of calcium in the body

Figure 3 visualizes the lead five-drug combination in the embedding space of the signed directed biomedical graph to understand how the lead combination can work together for AD treatment. The projection is based on the t-SNE technique, which maps high-dimensional data into a three-dimensional scatter plot78. The AD-associated gene embeddings and the reversed drug embeddings are shown in Fig. 3, where a shorter distance between a drug and a gene in the embedding space suggests a higher chance that the drug can “treat” the gene, either by inhibiting the AD-risk protein or by enhancing AD-protective protein function. On average, the five drugs in the lead combination are closer to the AD-associated genes. These five drugs can be grouped into three clusters, and the three clusters are distributed across different parts of the embedding space. This pattern suggests that some drugs can potentially reinforce others while others can be complementary. Specifically, Tofacitinib (JAK inhibitor) and Baricitinib (JAK inhibitor) are the two closest drugs in the embedding space when compared to other drugs, highlighting the importance of addressing dysfunction within immunological pathways, given that both drugs target neuroinflammation-related pathways (in particular, the JAK-STAT pathway51). Moreover, Niraparib (PARP inhibitor) and Doxercalciferol (vitamin D2 analog) are closer in the embedding space. One possible explanation is that the active form of vitamin D may also inhibit PARP79, and the two drugs may provide anti-inflammatory action and reduced oxidative stress through PARP inhibition and vitamin D pathways74,80. In contrast, Empagliflozin (SGLT2 inhibitor) is distant from the remaining four drugs. One possible explanation being that Empagliflozin may play an independent complementary role in AD treatment, by targeting the insulin signaling pathway related to glucose, lipid, and energy homeostasis73.

Fig. 3.

Fig. 3

DeepDrug Prediction of the Five-drug Combination in the Embedding Space.

Figure 4 further illustrates the potential mechanism and biological relevance of the lead five-drug combination in AD treatment using network and pathway analysis. Specifically, Fig. 4 shows the protein targets of the lead combination (which achieves the highest DeepDrug score), and how these targets are linked to the top ten AD-associated genes (among the 310 genes) via the shortest paths of individual biomedical graphs of a drug-target-gene interaction network. Although prominent genes involved in AD (such as APP, APOE, PSEN1, and MAPT) are not directly targeted by the five drugs in combination, these drugs can still target the AD-associated genes through the PPI network.

Fig. 4.

Fig. 4

DeepDrug Prediction of Five-drug Combinations and Drug-target-gene Interaction Network.

Moreover, using STRING pathway enrichment analysis81, Fig. 4 reveals the top five KEGG82 pathways enriched in the PPI subnetwork according to their statistical significance levels. The significance level is measured by the false discovery rate (FDR), and the threshold is set at 5%. The KEGG AD pathway ranks the top 17th among all significant pathways (44 in total), which includes APOE, APP, CASP3, GSK3B, PSEN1, MAPT, and CDK5 in the PPI subnetwork. The top five signaling pathways are all related to AD pathology, covering aging, oxidative stress, neuroinflammation, amyloid beta production, and Tau hyperphosphorylation. Specifically, the MAPK signaling pathway is linked to Tau hyperphosphorylation implicated in brain somatic mutations83 and neuroinflammation84. The FOXO signaling involves FOXO genes, which are transcription factors that regulate reactive oxygen species (ROS) detoxification, cell cycle/death, and energy/glucose metabolism85, and has been linked to longevity, aging86, and oxidative stress87, with the latter being linked to AD pathogenesis. The neurotrophin signaling pathway is connected to a variety of signaling cascades, including the MAPK and PI3K signaling pathways, which are also linked to Tau hyperphosphorylation83 and neuroinflammation84. The Wnt signaling pathway inhibits GSK3B, a key kinase involved in Tau phosphorylation, reinforces amyloid beta production via BACE1 activation88, and regulates microglial inflammation89. In addition to the MAPK and Wnt signaling pathways, the Rap 1 signaling pathway is also considered one of the core neuroinflammation signaling pathways90.

Furthermore, Fig. 4 reveals how the five drugs in combination can be used to treat AD at the pathway level. Tofacitinib and Baricitinib directly hits all the top five pathways via protein targets such as AKT1, CDK5, EGFR, GSK3B, MAPK8, and MAPK10. In contrast to Tofacitinib and Baricitinib, the other three drugs do not directly target any of the top pathways. The reason why including the other three drugs may be beneficial for AD treatment can be explained in two ways. On the one hand, some of the other three drugs can complement Tofacitinib and Baricitinib by indirectly targeting different targets of the same top pathways, thus increasing the synergistic effect of the drug combination91. For example, Empagliflozin can target EGFR in the MAPK, FOXO, and Rap1 signaling pathways via SLC5A1. Similarly, Niraparib can target SIRT1 in the FOXO signaling pathway via PARP2, and Doxercalciferol can target FGF23 in the MAPK and Wnt signaling pathways via VDR. The synergistic effect can increase when different targets in the same top pathways crosstalk other AD-associated pathways91. For example, in the MAPK signaling pathway, EGFR targeted by Empagliflozin is also involved in the FOXO and the Rap1 signaling pathway, which are linked to neuroinflammation and oxidative stress. On the other hand, some of the other three drugs can serve as complementary drugs by targeting different AD-associated pathways beyond the identified top pathways. For example, to address Tau hyperphosphorylation implicated in different pathways, Doxercalciferol can target the PAX6 signaling pathway92 via VDR and its first-order neighbor CREBBP. In parallel, Niraparib can target pathways related to cell cycle and DNA repair in age-related diseases93 via PARP1/2 and its first-order neighbor PCNA. According to the parallel pathway theory of drug synergy, which argues that drugs (in practice, usually assuming that they have useful pharmacokinetics and no interactions) targeting separate pathways to a common biological function are synergistic94, these complementary drugs can greatly increase the effectiveness of AD treatment to be further examined in experimental validation.

Discussion

Recent developments in AI technologies, such as GNN95 and causal AI9698, and the accumulation of large volumes of relevant datasets such as KEGG, DrugBank, CMap, SPIED and DGIdb, have provided new opportunities for an AI-driven drug-repurposing approach to greatly improve the probability of finding effective AD drug candidates, vastly reducing developmental costs and accelerating precision drug identification.

However, three challenges remain to be overcome. Firstly, existing data-driven computational drug-repurposing methods have yet to fully incorporate expert-led domain-specific AD knowledge into a heterogeneous signed directed biomedical graph, including a rich set of nodes and edges, carefully incorporating expert knowledge in graph construction and node/edge weighting to capture the convergence and divergence of crucial pathways associated with AD. Secondly, most existing network-based models determine the efficacy of a repurposed drug by estimating the distance between the drug and disease proteins, based on a simple shortest path distance. Thirdly, the systematic identification of high-order drug combinations to maximize the synergistic effect remains challenging due to high computational costs and the lack of experimental data.

To address these challenges, this study proposes an expert-led AI-driven framework to determine a lead five-drug combination for AD treatments based on a heterogeneous signed directed biomedical graph. Even though the underlying mechanism of AD is less well understood, our proposed GNN model incorporates expert-led domain-specific insights. Such insights include knowledge on long genes, immunological and aging pathways, and somatic mutation markers identified in the blood identified from the previous work99 to be closely associated with AD, as well as how to assign weights to the nodes and edges in graph construction. Further, by encoding our biomedical graph via GNN into a new embedding space, DeepDrug can better capture the rich relationship between different nodes in the original graph as compared to what a pure shortest path metric can provide32. Nevertheless, with our signed directed edges and with node and edge weight assignments, a more accurate prediction of successful drug candidates can be provided. Finally, by using diminishing return-determined thresholds, we can better select the high-order drug combinations from the top drug candidates to achieve the maximum synergistic combinatory effect for improving the effectiveness of AD treatment.

Capitalizing on the recent findings that long genes are more likely to be affected by somatic mutations100, and accumulating somatic mutations are implicated in brain pathology99 and AD-risk blood101 (see Downey, Lam, Li, and Gozes (2022)102 for a detailed review of somatic mutations and AD), we have developed a novel DeepDrug framework and identified the top repurposed five-drug combination candidates. The top 15 drug candidates identified by DeepDrug (see Table 1 in “Results”) are consistent with prior biological and clinical knowledge. Tofacitinib, a top candidate, was previously highlighted in AI-driven AD research and tested in mice models49,50. Three of the top five drugs are JAK inhibitors, which can reduce neuroinflammation contributing to AD pathology51. For the remaining top drugs, Niraparib, Palbociclib, and Sirolimus can be potential therapeutic agents, given their role in autophagy in AD75. Empagliflozin, Febuxostat, Doxercalciferol, and Zinc Acetate can address AD-associated comorbidities and risk factors (diabetes60, gout61, vitamin D deficiency62, and zinc deficiency63). Other promising candidates include Raloxifene for women with AD64, Valproic Acid and Olanzapine originally for other neurological diseases but repurposed for AD65,66, and Adenosine and Miconazole for reducing neuroinflammation67,68.

A lead combination of five drugs, comprising Tofacitinib, Niraparib, Baricitinib, Empagliflozin, and Doxercalciferol, selected from drug combinations of two to six drugs from the top 15 drug candidates, can serve a promising drug combination for AD treatment (see Tables 2 and 3 in “Results”). The selected five-drug combination for AD treatment has significant biological relevance and clinical potential. These drugs, previously approved for treating conditions such as rheumatic diseases, cancer, diabetes, and hyperparathyroidism, are shown to inhibit AD pathology through various mechanisms. Specifically, Tofacitinib and Baricitinib are JAK inhibitors targeting the JAK-STAT pathway, reducing neuroinflammation by inhibiting proinflammatory cytokines mediated by immune cells51. Niraparib inhibits PARP enzymes, addressing neurodegeneration by preventing metabolic impairment and regulating cell death72. Empagliflozin, an SGLT2 inhibitor, regulates the insulin signaling pathway linking diabetes and AD73. Doxercalciferol, a vitamin D analog, protects neurons against inflammation and oxidative stress74. The clinical relevance of these drugs in AD treatment is also supported by their involvement in various stages of AD drug discovery, including literature review75, epidemiological study76, animal models50,73, and human clinical trial (Phase II)77, highlighting their potential effectiveness in treating AD.

Moreover, t-SNE analysis has revealed the position of the five drugs in the lead combination mapped onto the biomedical embedding space (see Fig. 3 in “Results”). On average, this lead combination is closer to AD-associated genes compared to other drug combinations, indicating a higher likelihood of effectively treating AD. These five drugs have been observed in three clusters, suggesting shared and complementary effects. Specifically, Tofacitinib and Baricitinib, both JAK inhibitors, are closely positioned, indicating a strong combinatory effect on neuroinflammation pathways. Niraparib, a PARP inhibitor, and Doxercalciferol, a vitamin D2 analog, are also closely positioned, suggesting anti-inflammatory and oxidative stress reduction through their combined actions. Empagliflozin, an SGLT2 inhibitor to treat diabetes, is distant from the others, indicating its complementary role in targeting the insulin signaling pathway, which are related to glucose, lipid, and energy homeostasis.

Further, network and pathway analysis has provided the mechanism insights into the lead combination of drugs, showing how the drugs and their targets interact with the key AD-associated genes (see Fig. 4 in “Results”). Despite not directly targeting the primary genes in the KEGG AD pathway, the selected drug combination can target these genes through PPIs. STRING pathway enrichment analysis has identified the top five KEGG pathways enriched in the PPI subnetwork. These pathways include the MAPK, FOXO, neurotrophin, Wnt, and Rap1 signaling pathways, which are all related to AD mechanisms from different perspectives, including aging, oxidative stress, neuroinflammation, amyloid beta production, and Tau hyperphosphorylation. Tofacitinib and Baricitinib target all key pathways directly, involving key proteins such as AKT1, CDK5, EGFR, GSK3B, MAPK8, and MAPK10. The other three drugs, Empagliflozin, Niraparib, and Doxercalciferol, do not directly target these pathways but complement Tofacitinib and Baricitinib. Their complementary effects can be achieved by indirectly targeting the same key pathways or additional AD-related pathways, thus addressing broader aspects of AD pathology and enhancing the synergistic effect. By targeting multiple interconnected pathways, the selected drug combination further reveals its potential efficacy in AD treatment, offering a comprehensive approach to tackling the complex mechanisms of AD.

Based on the calculated drug-gene scores, we have also investigated the rankings of existing approved AD drugs, including Lecanemab and Aducanumab, the new drugs approved by FDA in 2021 and 2023, respectively, though not approved by the EU. Aducanumab was subsequently discontinued in 2024103. The rankings of the approved AD drugs (out of 503 drugs) in our GNN model are: Memantine (55th), Donepezil (83rd), Galantamine (159th), Rivastigmine (264th), Lecanemab (403rd), and Aducanumab (405th). They do not rank high in our GNN model, most probably because they are targeting at AD symptoms or events that are insufficient in-and-of-themselves to trigger a disease. For example, the monoclonal antibodies, Aducanumab and Lecanemab, target amyloid beta104, an upstream biomarker in the AD pathology, which may explain why they are not effective in treating progressed AD. In contrast, the upstream vs. downstream effects of mutations in the genes (such as tau vs. amyloid beta), along with different AD-related pathways implicated in AD brain and blood, have been incorporated into the proposed framework during the biomedical graph construction. The low rankings of approved AD drugs can be attributed to the expert-led knowledge incorporated into the GNN model input (which did not focus on amyloid beta only), such as long genes, co-morbidities, and the importance of AD-specific pathways. The findings from our GNN model are also consistent with the observations from existing AD drug experiments. Until now, no drugs have been proven to treat AD effectively. The repeated failures of the approaches targeting amyloid in AD clinical trials have increasingly pointed to other potential directions in AD drug development, such as the role of neuroinflammation and the immune system105. These results, taken together, highlight the importance of incorporating expert-led insights into our proposed AD drug repurposing framework.

Nevertheless, translating AI-driven drug repurposing findings into clinical practice presents significant challenges. Firstly, the limited availability of comprehensive datasets for AD and the lack of effective drugs for its treatment may hinder the training of AI models, potentially leading to biased and inaccurate predictions. Secondly, the expert-led biomedical graph structure simplifies complex biological systems, possibly resulting in an inadequate representation of drug combinations, particularly at the molecular level. Thirdly, AI-driven repurposing models operate as “black boxes”, making it difficult to understand the rationale behind their drug prediction processes. This lack of interpretability may impede the adoption of AI-driven drug-repurposing in clinical settings. Finally, validating AI-driven findings in rigorous experimental and clinical studies is often costly and time-consuming, requiring substantial effort and resource.

To tackle these challenges, our current AI-driven GNN model can be improved further in the following directions: Firstly, given that no effective drug is yet available for AD, to facilitate the GNN model’s learning towards which drug can “treat” which gene, multi-task transfer learning can be used to simultaneously repurpose drugs for AD and other related diseases (where ground truth drugs may be available). Secondly, the GNN model structure can utilize hierarchical graph neural networks106,107 to better represent drug combinations at the molecular level (e.g., a drug consisting of atoms and bonds108 and a drug combination consisting of multiple drugs) while capturing the relationships of nodes at different layers through graph attention mechanisms. The fine-grained molecular representation can also improve the calculation of suitable drug doses to prevent adverse effects109, and facilitate the prediction of drug synergy110,111 and blood–brain barrier permeability112, making it possible to deliver effective drugs to treat AD brains. Thirdly, the interpretability of the GNN model can be improved. Instead of starting with the 310 genes associated with AD in our biomedical graph (including long genes, GWAS genes, and expert-led genes), we may start with the genes most causal to AD. These genes may be determined as the most upstream genes in AD-associated pathways while accounting for environmental effects. In addition to the pathway enrichment analysis, the contribution of local subgraphs113, in particular, genetic pathways, in calculating the drug-gene score can also be investigated to better understand the importance of different pathways in AD drug-repurposing. Such interpretations from a “black box”, combined with expert insights, will promote the feedback loop in the proposed expert-led AI-driven approach.

Finally, having identified the lead combination of drugs in silico, the last step—testing of drug efficacy in vitro and in vivo, remains to be undertaken. In the future, our DeepDrug framework can be applied to repurpose drugs for other neuro-degenerative and neuro-developmental diseases, such as Parkinson’s disease, autism spectrum disorders, PTSD, multiple sclerosis, and schizophrenia.

Methods

DeepDrug, an AD drug-repurposing framework, consists of a three-stage development. First, a biomedical graph was created, linking genes, proteins, drug targets, and drugs to capture the network characteristics of AD pathology (see “Heterogeneous Signed Directed Biomedical Graph Construction”). Second, a GNN-based framework was developed to learn the biomedical graph structure and represent drug and gene nodes as lower-dimensional vectors (see “Graph Neural Network to Embed Drug and Gene Nodes” and “GNN Model Training and Evaluation”). Third, drug scoring and selection were performed to identify the top drug candidates and a lead drug combination (see “Scoring and Selection of Drug and Drug Combination” and “Drug Combination Optimisation”).

Figure 5 provides an overview of DeepDrug, an AD drug-repurposing framework, from two complementary perspectives, namely, a domain-specific knowledge-inspired approach, and an AI-driven approach. On the left side of Fig. 5, the domain-specific knowledge-inspired approach instructs the construction of a heterogeneous signed directed biomedical graph. The biomedical graph consists of four types of nodes, including genes, proteins, targets, and drugs, and four types of edges between nodes, including gene-protein, protein–protein, drug-target, and drug-drug edges. It starts from identifying key genes implicated in AD and approved drugs relevant to AD. Protein targets bound to the selected drugs are connected to the proteins corresponding to the identified genes through protein–protein interactions (PPIs). The identified genes also undergo pathway enrichment analysis to determine their roles in AD-associated pathways in the PPI network. Finally, expert-guided knowledge is integrated into the biomedical graph by determining edge directions (e.g., whether one protein acts on another protein) and signs (e.g., whether a drug activates or inhibits its target) and assigning weights to nodes and edges (e.g., gene nodes weighted by their importance to AD). The details regarding how domain-specific knowledge is integrated into the biomedical graph are elaborated in “Heterogeneous Signed Directed Biomedical Graph Construction”.

Fig. 5.

Fig. 5

Expert-guided AI-driven DeepDrug Framework.

On the right side of Fig. 5, the AI-driven approach leverages on the expert-guided biomedical graph plus the rich representation of the nodes (gene sequences, protein sequences, and drug structures), mapping them onto a lower-dimensional space (see “Graph Neural Network to Embed Drug and Gene Nodes”). The AI model is trained to understand complex relationships within the biomedical graph, using domain-specific pathway-guided regularization to improve biological relevance in the lower-dimensional space (see “GNN Model Training and Evaluation”).

Furthermore, the iterative feedback loop between the two approaches can improve the accuracy and reliability of the proposed framework. This dynamic collaboration between human expertise and AI involves incorporating expert insights into the initial setup and continuous adjustments of the AI methods, ensuring that the automated processes are consistently refined and aligned with the latest domain knowledge.

Heterogeneous signed directed biomedical graph construction

A heterogeneous signed directed biomedical graph was constructed, consisting of four types of nodes, including genes, proteins, drugs, and drug targets, and four types of edges between nodes, including gene-protein, protein–protein, drug-target, and drug-drug edges. After graph construction, isolated nodes were removed from the graph.

Table S2 in Supplementary Information summarizes how the biomedical graph construction was guided by expert knowledge to ensure its relevance to AD. Expert knowledge was provided to highlight potential biases and weaknesses of gene association studies and was sought to prevent the introduction of data biases and highlight potential drug targets. It provides two cases where the insights from our co-authors, who are experts in molecular neuroscience and biology, led to the inclusion of relevant (1) genes and (2) drugs in the biomedical graph. 10 expert-led genes were chosen based on the expert knowledge of our neuroscience co-authors, including APP, MAPT, FOXO3, PSEN1, IGF1, ACT1, SIRT1, PAX6, PSEN2, and APOE. Moreover, the keywords used to find the relevant drugs based on drug description, indication, pharmacodynamics, and mechanism-of-action from DrugBank114 have included not only AD risk factors and comorbidities but also incorporated expert insights, including ‘Reactive oxygen species’, ‘Vitamin D’, ‘immunological’, and ‘autoimmune’.

The benefit of experts (1) is to highlight the potential biases and weaknesses in any particular method – and, in this sense, they should work not to introduce problematic biases, but to prevent their introduction; (2) Make researchers aware of the potential for drug targets that may not show up in some studies; (3) provide insight into potential pharmacological implications. Thus, in the case of gene-pathways linked to AD, an expert can critically examine data derived from GWAS methodology (for example, looking at sample sizes, risks of assigning links to non-coding regions to nearest gene neighbours, etc.). They are also able to analyze the data from such studies and recognize the potential biological systems that they impact.

Based on the selected genes and drugs relevant to AD, validated by our molecular neuroscience and biology co-authors, the complete biomedical graph was constructed using authoritative biomedical databases (see Table S3 in Supplementary Information) with drug-target, drug-drug, and protein–protein interaction information. Table S4 in in Supplementary Information further explains how the biomedical graph construction handles conflicting or incomplete expert knowledge by assigning normalized weights to the inputs (i.e., nodes and edges in the biomedical graph) based on their relevance and confidence levels. This weighting assignment mechanism ensures a consistent integration of varying input levels from experts.

Figure 6 shows a simplified version of the signed directed biomedical graph. This biomedical graph consists of four node types and their edges (interactions). This simplified version includes the largest connected component of the graph consisting of only the top 50 nodes of each node type. This figure illustrates how the AD-relevant genes (squares) are connected to the proteins (diamonds), then to the drug targets (diamonds), and finally to the drugs (circles). The detailed biomedical graph construction procedure is described in the rest of this section.

Fig. 6.

Fig. 6

A Simplified Version of Signed Directed Biomedical Graph.

Gene nodes

Two types of gene mutations relevant to AD, namely, the somatic mutations and the germline mutations, were considered. For the somatic mutations, their gene nodes were obtained from the long genes100 and somatic mutation markers identified in the blood101. For the germline mutations, their gene nodes were obtained from the high-risk genes identified from GWAS meta-analysis data115. Other important genes relevant to (i) AD pathology and (ii) healthy aging were also selected from expert-led insights. The gene nodes were weighted according to their chance of mutation and importance to AD progression (see Table S7 in Supplementary Data for the selected genes and their weights).

Specifically, data on the somatic mutations implicated in the AD pathology was obtained from Soheili-Nezhad et al.100, constituting a total of 272 long genes. GWAS data was obtained from the Phase 3 summary statistics in Jansen et al.115, for a total of 30 significant genes. 10 expert-led genes, including APP, MAPT, FOXO3, PSEN1, IGF1, ACT1, SIRT1, PAX6, PSEN2, and APOE, were chosen based on the expert knowledge of our neuroscience co-authors, who recently discovered ADNP and SHANK3 mutations as the key somatic mutation markers of autism linked to AD116. In addition, her team has proven that critical somatic mutations in ADNP correlate with AD Tau pathology99, and verified them as driving an early AD-like Tau pathology in a genome-edited (CRISPR/Cas9) mouse model117. In total, 310 genes were selected. Only two overlaps between the three sets of genes could be identified, including APOE, between GWAS and the expert-led, and CNTNAP2, between the long gene set and GWAS (see Fig. 7).

Fig. 7.

Fig. 7

The 310 Gene Nodes Incorporating Expert-led Knowledge.

The weights of the 310 gene nodes (a total of three sets of genes) were calculated based on their lengths and normalized to [1, 2]. For all GWAS genes, their normalized -log(p-value), normalized to [1, 2], was added to their existing normalized weight based on gene length. For all expert-led genes, their existing weights (based on normalized gene length plus additional normalized weights from the GWAS step) were increased by 1, to reflect the importance of these genes. Finally, all the gene nodes were renormalized to [1, 2].

In addition to the node weights guided by AD domain-specific knowledge, a high-dimensional weight vector was automatically generated for each gene node based on the corresponding protein-coding sequencing data. Similar to n-grams and bag-of-words used in natural language processing tasks, the k-mer method was used to (i) decompose each mRNA sequence into a sequence of all possible k nucleotides, (ii) calculate the number of times that each k-mer appeared in each sequence, and (iii) convert each sequence into a count vector. Finally, the singular value decomposition (SVD) method was used to compress each gene vector into a fixed-length vector.

Drug nodes

Based on the approved drugs (including drugs approved by the FDA or other drug regulatory authorities outside the US), drug nodes were selected using the top co-morbidities and risk factors related to AD as well as expert-led insights. The drug nodes were weighted according to AD co-morbidities (see Table S2 in Supplementary Data for the list of drugs identified and their weights).

Specifically, data on the drugs were obtained from DrugBank (version 5.1.10)114, an online database containing information on drugs, drug targets, and drug-drug interactions. The keywords used to find the relevant drugs based on drug description, indication, pharmacodynamics, and mechanism-of-action were (i) the top two risk factors of AD: ‘senescence’, ‘female’, (ii) the top four co-morbidities of AD: ‘Diabetes’, ‘Cardiovascular disease’, ‘Depression’, ‘Inflammatory bowel disease’, (iii) miscellaneous terms guided by expert insights: ‘Reactive oxygen species’, ‘Vitamin D’, ‘immunological’, ‘autoimmune’, and (iv) the disease itself: ‘dementia’ (given that AD is the most common form of dementia) and ‘Alzheimer’s’. After (i) filtering out the vaccines, nutrition supplements, and agents for medical imaging, (ii) selecting drugs whose target organism is human or not specified, and (iii) choosing only the drugs that are approved and not withdrawn or prohibited, the total number of drug candidates was 503.

The drug nodes were weighted by the odds ratio of the key comorbidities they can treat. Lin et al.118 reported the odds ratio of the three co-morbidities of ‘diabetes’, ‘cardiovascular disease’, and ‘depression’ in various combinations. This study, with 49,747 subjects aged ≥ 65 years in 2000, was based on the Taiwan National Health Insurance Research Database, with subjects stratified by the presence and absence of dementia from 2000 to 2010. Dementia subjects (4749 out of the total cohorts) were those first diagnosed with dementia from 2000 to 2009. Table 4 shows the weights assigned to drugs that treat various combinations of these three comorbidities. The rationale is that, if a drug treats the key co-morbidities of dementia, the most prevalent form of AD, it shall top the search list and be accorded a higher weighting subsequently. Meanwhile, the weight of a combination of co-morbidities may not be equivalent to the sum of the individual co-morbidities. In addition, the weight of a single co-morbidity can be higher than a combination including this co-morbidity. Based on the weighting rationale, the weights of the drug nodes were normalized to [1, 2].

Table 4.

Weights Assigned to Drug Nodes Based on Different Combinations of Comorbidities Being Treated.

Combination of Co-morbidities Treated Weights Assigned (Based on Odds Ratio obtained from Lin et al.118)
Depression 4.938
Vascular Disease 3.261
Diabetes 2.321
Depression + Vascular Disease 6.726
Depression + Diabetes 4.787
Vascular Disease + Diabetes 3.161
Depression + Vascular Disease + Diabetes 6.521

In addition to the expert-guided node weights, a high-dimensional weight vector was automatically generated for each drug node, based on the corresponding chemical structure data. Drugs were small molecules or proteins (antibodies) in this study. For small molecules, a fingerprint was generated for each drug compound using RDKit119. For protein/peptide drugs, their weight vectors were calculated along with the protein nodes (see below). Finally, the SVD method was used to compress each drug vector into a fixed-length vector.

Target nodes

Target nodes were obtained along with the drug nodes, as each drug has one or more specific targets. The target nodes were weighted according to AD co-morbidities. (see Table S3 in Supplementary Data for the list of protein targets identified and their weights).

Specifically, drug targets were obtained from four major biochemical databases, including DrugBank (version 5.1.10), DrugCentral (as of Aug 2022), ChEMBL (version 31), and BindingDB (version 2023m0). For targets derived from DrugBank, four target types were provided, including targets, enzymes, carriers, and transporters. Targets and enzymes were selected, while carriers and transporters were discarded. All drug targets were proteins in this study. Drug targets from different databases were combined, resulting in 1,525 drug targets in total.

The target nodes were weighted by their involvement in the top three co-morbidities of AD and were later normalized to [1, 2] along with the other proteins (see below).

In addition to the expert-guided node weights, a high-dimensional weight vector was automatically generated for each target node based on the corresponding protein sequencing data (see below).

Protein nodes

Protein nodes were initially obtained according to the gene and the target nodes, as each gene had an equivalent protein and all drug targets were proteins in this study. Next, the first-order neighbours of the initial protein nodes were included in the graph according to the human protein–protein interaction (PPI) data. The protein nodes were weighted by the same method as that of the target nodes (see Table S3 in Supplementary Data for the list of proteins and their weights).

Specifically, data on the proteins were obtained from STRING (version 11.0), an online curated biological database integrating different sources of evidence and providing a combined score (ranging from zero to one) to indicate whether a PPI is biologically meaningful81. The data were first filtered to include only the proteins in human beings. Next, using the 310 proteins arising from the 310 genes and the 1,525 protein targets resulted in 1,804 proteins. Next, the 1,804 proteins’ first-order neighbours, with high PPI confidence of more than 0.7, were added, giving a total of 11,222 proteins.

The protein nodes were weighted by their involvement in the top three co-morbidities. The weights were then normalized to [1, 2] along with the target nodes.

In addition to the expert-guided node weights, a high-dimensional weight vector was automatically generated for each protein/target node and protein/peptide drug node, based on the corresponding amino acid sequencing data. The k-mer method was used to (i) decompose each protein sequence into a sequence of all possible k amino acids, (ii) count the number of times that each k-mer appeared in each sequence, and (iii) convert each sequence into a count vector. Finally, the SVD method was used to compress each protein vector into a fixed-length vector.

Next, we determined the edges between the nodes and the associated weights. There were four types of edges, namely, gene-protein, drug-drug, drug-target, and protein–protein. All edges were directed, and some were signed. Gene-protein and drug-target edges were assumed to be signed and unidirectional. A positive or negative gene-protein edge indicated that mutations in the gene and resulting protein dysfunction would be positively or negatively associated with AD. A positive or negative drug-target edge indicated that the drug would activate or inhibit its binding protein. Protein–protein edges were considered bidirectional by default, except that some were unidirectional according to the PPI data. Similarly, protein–protein edges were considered unsigned by default, except that protein action types were specified in the PPI data. A positive or negative protein–protein edge represented activation or inhibition, respectively. All drug-drug edges were assumed to be unsigned and bidirectional to account for the magnitude of side effects when two drugs are taken together. Detailed descriptions of how the signs, directions, and weights of these edges were determined can be found below, including the list and the signs/directions of these edges, plus their associated weights (see Tables S4-7 in Supplementary Data).

Gene-protein edges

For the gene-protein edges, a unidirectional relationship between the genes and the proteins was established directly, as one gene is matched to one single protein. The gene-protein edges were weighted uniformly. It can be further weighted when the mRNA stability data is available (e.g., how many proteins can one mRNA create before degrading). The sign of the gene-protein edges was determined based on how mutations in the genes were linked to AD.

With the 310 gene nodes, 310 gene-protein edges could be identified. Due to a lack of mRNA stability data, the gene-protein edges were weighted uniformly with a default average weight of 1.5 (after normalizing to [1, 2]). For the 272 long genes, the sign was set to positive (+ 1), given that somatic mutations in these long genes were positively linked to AD by showing reduced gene expression and inhibited synaptic-related pathways in AD brains. For the GWAS genes, the sign was determined by the corresponding Z score – a negative Z score suggested a negative link between the GWAS gene and AD risk, resulting in a negative sign (-1), and vice versa. For the expert-led AD-risk genes, the sign was set to positive (+ 1).

Drug-drug edges

A default bidirectional relationship was established for the edges between the drugs. The drug-drug edges were weighted by text-mining the drug-drug interaction documents (i.e., whether two drugs increased or decreased risk). The drug-drug edges were unsigned given that only the magnitude of side effects was considered.

Drug-drug interaction data were obtained from the DrugBank database (version 5.1.10)114. A total of 36,532 bidirectional drug-drug edges were established. The edges were weighted using the text information of available detailed drug-drug interactions from DrugBank. Specifically, a default weight of 1.5 was given, and, through text-mining for the words ‘increase risk’ or ‘decrease risk’, if the interaction between two drugs was positive (reduced risk), 1 was added to the weight, else 1 was subtracted from the weight (increased risk). The weights of the drug edges were then normalized to [1, 2]. The sign of drug-drug edges was set to 0.

Drug-target edges

A unidirectional relationship between drugs and targets was established directly, as each drug has its own (one or more) specific targets. The signs and weights of drug-target edges were determined by the drug-target binding affinity data.

Specifically, the drug-target interactions from DrugBank (version 5.1.10)114, DrugCentral (as of August 2022)120, ChEMBL (version 31)121, and BindingDB (version 2023m0)122 were obtained and combined, resulting in 7,379 drug-target edges. The corresponding drug-target affinity measures, including IC50, EC50, Ki, and Kd, were available from DrugCentral, ChEMBL, and BindingDB. For all affinity metrics, the unit was first normalized to nanomolar (nM). Repeated measurements from different data sources and experiments were then aggregated based on the median values to reduce the impacts of outliers. The sign of drug-target edges was determined by the drug-target action type. A positive sign (+ 1) was given if the drug-target action had keywords including “activator”, “agonist”, “opener”, or “positive”, while a negative sign (− 1) was given if the drug-target action had keywords including “inhibitor”, “antagonist”, “blocker”, or “negative”. For a negative drug-target relationship, the drug affinity score was calculated based on the KIBA score (an integrated inhibition metric based on IC50, Ki, and Kd123). For a positive drug-target relationship, the drug affinity score was Kd or EC50, depending on which metric was available. Drug-target weights were then calculated using a − log(x) transformation on the affinity scores119.

Missing drug-target signs and weights were predicted by the high-dimensional drug and target weight vectors using gradient-boosting decision trees, with the assumption that the drug affinity measures between similar drugs and targets (measured by their structure information) are related. Finally, the weights of the drug-target edges were normalized to [1, 2]. A total of 7,379 drug-target edges were identified.

Protein–protein edges

A mixture of unidirectional and bidirectional relationships was established between proteins. The signs and weights of protein–protein edges were determined by the PPI data.

Specifically, the direction of protein–protein interaction was set to bidirectional by default, except that the directionality was indicated (i.e., one protein is acting on another protein) in the STRING database. The protein–protein edges were weighted by the confidence of the two involved proteins, a combined score integrating evidence from different data sources, without being normalized. The protein–protein edges were unsigned by default (the sign was set to 0). The sign was further determined by the protein–protein action type if such information was available. A positive sign (+ 1) was adopted if the action type was “activation”, while a negative sign (-1) was adopted if the action type was “inhibition”. A total of 243,413 protein–protein edges were established.

Pathways

Based on the expert-led domain-specific knowledge, the weights of some protein–protein edges were further increased according to the enrichment value of certain AD-associated pathways if they were part of those pathways. This also covers the overlapping pathways, whilst the protein–protein edges present in multiple pathways were given the additional weight for each pathway, with the assumption that these edges can be considered to be the common denominators of AD.

Specifically, all genes (long, GWAS, and expert-led genes) were used to generate the AD-related pathways, using the STRING pathway analysis81. A total of 16 statistically significant pathways, with a false discovery rate (FDR) less than 0.05, were obtained from the KEGG82, Reactome, and WiKiPathways databases. The log(-FDR) values were normalized to [1, 2]. Next, for those not-yet-normalized PPI edges which contained proteins involved in one/some significant pathway(s), their weights were further added by the normalized log(-FDR) value of the corresponding pathway(s). Finally, the protein–protein edge weights were renormalized to [1, 2].

Graph neural network to embed drug and gene nodes

A signed directed GNN framework was developed to map all nodes in the heterogeneous biomedical graph, including gene and drug nodes, to the same embedding space. The choice of a GNN model for graph node encoding is driven by its unique ability to leverage the structural information inherent in biomedical graphs, which is essential for deciphering complex biochemical interactions. Traditional deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been applied to encode biomedical data (e.g., molecular fingerprints); however, these models often struggle to capture the structural dependencies and higher-order connectivity information critical in systems biology124. In contrast, GNNs are specifically designed to handle graph-structured data by utilizing node-level and edge-level features, enabling them to inherently encode relationships within the biomedical graph topologies. In particular, the signed GNN and the directed GNN models125,126, taken together, are exceptionally suited for biomedical applications, as they incorporate both the sign (e.g., activation or inhibition) and directionality (e.g., drug-target interactions) of edges. This signed directed GNN encoding method makes it an ideal choice for encoding gene and drug nodes in the current study, ensuring the graph representation accurately reflects the underlying biological mechanisms.

At a high level, a graph autoencoder (GAE) framework127 was adopted to learn a low-dimensional representation of each node based on the node features and a signed directed graph adjacency matrix measuring how nodes are inter-connected (see Fig. 8). The novelties of the proposed GNN framework are four-fold (see Fig. 8). Firstly, expert-guided weights and data-driven high-dimensional features from the biological structure information are incorporated into node features. Secondly, expert-guided weights, positive/negative edges, and directed edges are incorporated into the graph adjacency matrix. Thirdly, a signed directed GAE is used to embed gene and drug nodes while accounting for positive/negative edges and directed edges. Fourthly, domain-specific pathway-guided regularization is adopted to align gene nodes along the same pathway in the embedding space.

Fig. 8.

Fig. 8

Graph Neural Network to Embed Drug and Gene Nodes and Its Novelties.

Specifically, the signed directed GAE consists of two parts, namely, an encoder and a decoder. The encoder projects all nodes to a low-dimensional embedding space using signed and directed GCN layers based on the node features and edge weights while the decoder performs an inner product operation on the node embeddings to reconstruct the graph. The GAE framework can be used for unsupervised graph representation learning without ground truths from downstream tasks (i.e., which drugs can treat AD).

An adjacency matrix Inline graphic, which represents how nodes are connected and how edges are weighted, is obtained from the multimodal graph Inline graphic consisting of Inline graphic nodes. The adjacency matrix is further converted to two matrices Inline graphic and Inline graphic accounting for positive edges and negative edges. Unsigned edges are incorporated into both matrices. An expert-guided node feature matrix Inline graphic is obtained from Inline graphic, where each node is represented by a four-dimension feature vector. The four-dimension vector is the product of a node’s weight and a one-hot vector representing the node’s type (e.g., [1, 0, 0, 0] for a gene node). In addition, a data-driven node feature matrix Inline graphic is obtained from Inline graphic, where each node is represented by a K-dimension feature vector calculated from its biological structure information. The final node feature matrix Inline graphic is the concatenation of Inline graphic and Inline graphic. The encoder first takes Inline graphic and Inline graphic as the input to produce positive and negative embeddings for each node via a signed GCN125. The positive and negative embeddings are then used to generate source and target embeddings via directed GCNs126. The decoder model optimizes low-dimensional node embeddings (accounting for edge sign and direction) to reconstruct the network structure. Conceptually, the GNN framework is described as follows:

graphic file with name M15.gif 1
graphic file with name M16.gif 2
graphic file with name M17.gif 3
graphic file with name M18.gif 4
graphic file with name M19.gif 5
graphic file with name M20.gif 6

where, SignedGCN is a signed GCN with the ReLU activation function, DirectedGCN is a directed GCN with the ReLU activation function, Inline graphic represents low-dimensional node embeddings, Inline graphic is the sigmoid activation function, and Inline graphic is the edge probability matrix of all node pairs.

GNN model training and evaluation

After the biomedical graph construction, the GNN model was trained through a link prediction task on the biomedical graph to optimize its parameters. The task is a binary classification problem, which predicts the closeness between two nodes on the graph, thus allowing one to indicate which drugs are candidates for a certain gene node. Furthermore, to account for the convergence and divergence of AD-associated pathways, a pathway-guided regularization term is incorporated into the training process to force the gene nodes along the same pathway to stay closer to each other in the embedding space, while preserving the convergence and divergence information. Specifically, the regularization term consists of two classification tasks based on node embeddings: a cross-entropy loss for node type classification (one node must belong to one of the four node types) and a binary cross-entropy loss for pathway classification (one node can be in zero, one, or more pathways). The detailed training procedure is shown in Algorithm 1.

Algorithm 1.

Algorithm 1

Model Training.

The hyperparameter settings are listed as follows. The dimension of node feature vectors was 260, including four dimensions from the expert-guided node weight vector (representing gene, protein, drug, and target nodes) and 256 dimensions derived from the corresponding biological structure (gene sequence, protein sequence, or drug compound). Table S5 in Supplementary Information lists the model parameters in more detail. In the proposed GNN framework, a one-layer signed graph convolutional network was first used to obtain the positive and negative node embeddings (embedding dimension size of 32) for the source and target nodes. The node embeddings were further processed by two-layer directed graph convolutional networks (with a hidden unit size of 128). Two-layer fully connected neural networks (where the hidden unit size was half of the input dimension size) were also included for node and pathway classification tasks to implement the pathway-guided regularization term. The GNN framework included 119,334 trainable parameters in total. The number of training epochs Inline graphic was 100. The Adam optimizer was used to learn the model parameters, with a learning rate Inline graphic of 0.01. For the directed GCNs, two hyperparameters Inline graphic and Inline graphic were used ([0.4, 0.6], [0.5, 0.5], or [0.6, 0.4]) to control the relative importance of source and target information. The weight of the pathway-guided regularization term Inline graphic was 0.1, 0.5, or 1.0.

We performed tenfold cross-validation for model development (i.e., training and validation) and testing. Specifically, the whole dataset (i.e., all existing edges on the biomedical graph) was randomly split into ten folds of equal size. We held out one of the ten folds (i.e., 10% of all samples) as the model testing set. We used the remaining data for model development, where 10% were selected as a nested validation set for hyperparameter tuning with early stopping to reduce computational costs and avoid overfitting. Early stopping was triggered when the validation AUC score was not improved over ten epochs. The best hyperparameters were determined (see Table S6 in Supplementary Information for the final hyperparameter selection results). This procedure was repeated ten times, resulting in ten evaluations of the model performance, one for each data fold, allowing for held-out testing on every sample and improving the robustness of model evaluation.

During cross-validation, the trained model was used to predict the existence of edges from the testing set based on the optimized node embeddings (see Eq. (6)). This link prediction task is a binary classification problem, where positive and negative labels represent the existence and non-existent edges, respectively. The Area Under the ROC Curve (AUC) and Average Precision (AP), which are two commonly used evaluation metrics for link prediction128, were selected for model evaluation. AUC is the area under the receiver operating characteristic (ROC) curve, summarizing the true positive rate (sensitivity or recall) and the false positive rate (1-specificity) at various thresholds into a single value. AUC ranges from 0 to 1, with an AUC of 1 indicating perfect performance. AP is the area under the precision-recall curve, summarizing the trade-off between precision and recall at various thresholds into a single value. AP ranges from 0 to 1, with an AP of 1 indicating perfect performance.

Figure S1 in Supplementary Information shows the nested validation performance during model training based on the selected best hyperparameters. As shown in Figure S1, the training dynamics became more stable after 20 epochs. Moreover, as shown in some subplots in Figure S1, the number of training epochs could be less than the maximum number allowed (i.e., 100). This is because the nested validation performance was used to closely monitor the model training process, enabling early stopping when the model performance did not improve over time to avoid model overfitting. The final model testing performance was evaluated on the independent testing sets (not used during model training and nested validation) across ten folds. The mean AUC and AP scores and their standard deviations were calculated across all folds (see Fig. 9). Results have shown that the signed directed GNN model can achieve more than 97% classification accuracy on average (AUC = 0.977 and AP = 0.979). The classification performance is consistent across different folds, suggesting the robustness of the proposed model in predicting edges on the biomedical graph and capturing the underlying representations of the biomedical graph structure.

Fig. 9.

Fig. 9

ROC and Precision-recall Curves of the GNN Model.

Scoring and selection of drug and drug combination

After cross-validation, we calculated drug scores and selected the top drug candidates based on the ten models trained on all folds. The drug scoring was performed in two steps. First, all data were fed into each trained model to obtain an embedding vector for each drug and gene node. We calculated the drug-gene score Inline graphic for each drug Inline graphic and gene Inline graphic using model Inline graphic. Specifically, the closer a drug is to the AD-associated gene nodes, the more likely it is a higher-ranking AD drug candidate. Based on the signs of gene-protein edges and drug-target edges, we first calculated the drug-gene scores based on the anti-correlation between the drug embedding and the gene embedding, with the aim to find drug inhibitors close to the AD-risk genes or drug activators close to the AD-protective genes (see Eq. (7)).

graphic file with name M33.gif 7

where Inline graphic represents one drug, Inline graphic represents one gene, and Inline graphic is the node’s embedding vector accounting for the drug-target direction or gene-protein direction using the model trained on fold Inline graphic.

Next, we calculated the average DeepDrug score of each single drug candidate Inline graphic based on the average drug-gene score across all AD-associated genes and all trained models (see Eq. (8)). The average scores were calculated to address the potential uncertainties or variances in predictions using models trained on different cross-validation data.

graphic file with name M39.gif 8

where Inline graphic represents one drug, Inline graphic represents one gene, Inline graphic represents the model trained on fold Inline graphic, Inline graphic is the total number AD-associated genes, and Inline graphic is the total number of cross-validation folds.

Further, we selected the top Inline graphic single drug candidates. The top Inline graphic single drug candidates were used to construct Inline graphic drug combinations for optimisation. Inline graphic was chosen to select complementary drugs as much as possible (in terms of the coverage of proteins they could target) while accounting for the combinatorial cost. The intuition was that the marginal benefit of including more drugs to cover more drug targets would decrease at a certain point. Specifically, the optimal Inline graphic was determined by a drug-target coverage curve, an increasing function characterizing the trade-off relationship between the benefit and the cost. The benefit was the number of targets to be covered, including the direct targets via drug-target interactions and indirect targets through the first-order PPI. The cost was the number of drug candidates to be incorporated into drug combinations. The number of combinations was an exponentially large number, e.g., 1,225 combinations when choosing two out of 50 drugs and more than 15 million combinations when choosing six out of 50 drugs.

Drug combination optimisation

Based on the trade-offs between deriving a higher drug-target coverage and a minimum number of drug candidates, the top 15 drug candidates, covering more than 90% of all targets, were chosen based on the cut-off point of diminishing returns. The cut-off point refers to the point when a high coverage of drug targets has been reached to capture the complementary effects of different drugs, while the number of drug candidates selected is minimal to reduce the computational cost of drug combination optimization (see Fig. 10).

Fig. 10.

Fig. 10

The Cut-off Point to Determine the Number of Top Drug Candidates.

Based on the top 15 drug candidates, we selected a lead combination more probable to hit AD-associated genes than others and achieve the maximum synergistic effect. Specifically, we calculated the DeepDrug score of drug combinations (from 2 to n drugs). The optimal number of drugs in drug combinations was determined by the point when the lead drug combination received the maximum DeepDrug score. The DeepDrug score of a drug combination was calculated by merging individual drug nodes into super-drug nodes in the biomedical graph, accounting for (i) the DeepDrug scores of individual drug candidates and their pairwise interactions, (ii) the chemical structures of individual drug candidates and their pairwise interactions, and (iii) the shared and complementary drug targets. The super-drug nodes, super-drug target edges, and super-drug-drug edges were constructed and weighted as follows.

Super-drug nodes

Each super-drug node represented one drug combination consisting of multiple drugs. Specifically, if the DeepDrug score can measure a single drug’s AD treatment effect (ranging from 0 to 1), the weight of an n-drug combination, indicating the expected combination effect, was calculated as follows. First, the weight of a two-drug combination (i.e., the pairwise effect) was calculated based on the Bliss model, a commonly used reference model for two-drug combinations108:

graphic file with name M51.gif 9

where Inline graphic represents the weight of one drug pair Inline graphic, and Inline graphic represents the DeepDrug score of one drug Inline graphic.

Second, for a drug combination consisting of more than two drugs, the weight was calculated based on a mechanism-independent reference model, utilizing an Isserlis-like formula considering pairwise effects only while ignoring high-order interactions129. The original formulas proposed in Wood et al.129 focused on three-drug and four-drug combinations and were extended to n-drug combinations here:

graphic file with name M56.gif 10

where Inline graphic represents the weight of an n-drug combination, Inline graphic represents all unique ways of partitioning Inline graphic into pairs, Inline graphic represents all unique ways of partitioning Inline graphic without the element Inline graphic into pairs, Inline graphic represents the weight of one drug pair Inline graphic, and Inline graphic represents the DeepDrug score of one drug Inline graphic.

For example, the weight of a three-drug combination was: Inline graphic), and the weight of a four-drug combination was: Inline graphic.

In addition, the super-drug node’s high-dimensional weight vector was calculated based on each drug node’s high-dimensional weight vector. The intuition was that one part of a drug’s chemical structure could complement another drug and the molecular representation of a drug combination could be related to its combination effect. Each drug’s weight vector was represented by its fingerprint (bit string), capturing different fragments of the drug’s molecular structure, and enabling zero-shot prediction for drug combination activity with previously unseen molecular information130. Specifically, for a two-drug combination, the same Bliss-based formula108 was applied to single-drug fingerprints to calculate the corresponding combination’s fingerprint: Inline graphic, where Inline graphic denotes element-wise multiplication. Similarly, for an n-drug combination with more than two drugs, the same Isserlis-like formula129 was applied to single-drug fingerprints to calculate the corresponding combination’s fingerprint. For example, for a three-drug combination, its fingerprint was calculated as follows: Inline graphic), where Inline graphic denotes element-wise multiplication. Finally, the SVD method was used to compress each super-drug fingerprint into a fixed-length vector.

Super-drug target edges

Each super-drug-target edge was constructed based on the interactions between the corresponding single drugs and their targets. As a result, a super-drug node was more likely to cover more diverse targets. The weight of super-drug-target edges were calculated by summing the corresponding single drug-target edge weights (normalized to [1, 2]) while considering their signs. The sign of super-drug-target edges were determined based on the sign of aggregated weights. The rationale was to account for the additive effects (e.g., an increase in the total effect if two drugs inhibited the same protein) while addressing the contradicted effects (e.g., a decrease in the total effect if one drug activated a target while another inhibited the same target).

Super-drug-drug edges

All super-drug-drug edges (i.e., interactions between the super-drug nodes) were ignored, given that higher-order drug-drug interactions could often be ignored when modeling their combination effects129 and addressing the side effects of taking a combination of drug combinations were beyond the scope of identifying a lead drug combination.

Finally, we selected a super-drug node (i.e., a lead combination of drugs) achieving the best score. We evaluated the drug combinations selected from the top 15 drugs, ranging from two-drug combinations to n-drug combinations. The detailed procedure is shown in Algorithm 2.

Algorithm 2.

Algorithm 2

Lead Drug Combination Selection.

Supplementary Information

Acknowledgements

This research was supported in part by the US National Academy of Medicine (NAM) Healthy Longevity Catalyst Award 2021 and “An Artificial Intelligence (AI)-driven Causal Approach for Early Diagnosis and Treatment of Late Onset Alzheimer’s Disease” under Seed Fund for Collaborative Research, The University of Hong Kong, June 2023. The authors thank Dr. Ruiqiao Bai of the University of Hong Kong for assisting with part of the literature review.

Author contributions

VL and JL developed the overall framework and identified the key novelties and significance and are the main drivers of this work. They also obtained research funding. YH, QZ, VL, JL designed, and YH, QZ and TK operationalized the GNN model and the biomedical graph, TK and YH collected data on nodes and edges of the biomedical graph, YH, TK, VL and JL wrote the paper, JL, VL and YH revised all drafts and finalized the manuscript. YH prepared all figures. IG and JD provided the domain-specific expert knowledge on the biomedical graph and IG initiated the idea of drug combinations. VL and JL are the senior and the corresponding authors of the funded project and provide the funding support for this study.

Data availability

The drug and drug-drug interaction data were obtained from DrugBank (https://go.drugbank.com). The drug target data were obtained from DrugBank (https://go.drugbank.com), DrugCentral (https://drugcentral.org), ChEMBL (https://www.ebi.ac.uk/chembl/), and BindingDB (https://www.bindingdb.org/). The protein–protein interaction and pathway data were obtained from STRING (https://string-db.org). The preprocessed biomedical network datasets, including genes and their weights (Table S1), drugs and their weights (Table S2), proteins/targets and their weights (Table S3), and edges and their weights (Tables S4-7), are available in Supplementary Data.

Declarations

Competing interests

This work, including contribution of six of the co-authors (VL, YH, TK, JD, IG, and JL), has been submitted for patent protection (University of Hong Kong and Tel Aviv University). All other authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Victor O. K. Li and Jacqueline C. K. Lam are both corresponding and senior authors.

Victor O. K. Li, Yang Han, Tushar Kaistha and Jacqueline C. K. Lam contributed equally to this work.

Contributor Information

Victor O. K. Li, Email: vli@eee.hku.hk

Jacqueline C. K. Lam, Email: jcklam@eee.hku.hk

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-85947-7.

References

  • 1.Nichols, E. et al. Global, regional, and national burden of Alzheimer’s disease and other dementias, 1990–2016: A systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol.18, 88–106 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Prince, M. J. et al. World Alzheimer Report 2015-The Global Impact of Dementia: An analysis of prevalence, incidence, cost and trends. (2015).
  • 3.Prince, M. et al. The global prevalence of dementia: A systematic review and metaanalysis. Alzheimer’s Dement.9, 63-75.e62 (2013). [DOI] [PubMed] [Google Scholar]
  • 4.Wu, Y.-T. et al. The changing prevalence and incidence of dementia over time—Current evidence. Nat. Rev. Neurol.13, 327–339 (2017). [DOI] [PubMed] [Google Scholar]
  • 5.Mattson, M. P. Pathways towards and away from Alzheimer’s disease. Nature430, 631–639 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Citron, M. Strategies for disease modification in Alzheimer’s disease. Nat. Rev. Neurosci.5, 677–685 (2004). [DOI] [PubMed] [Google Scholar]
  • 7.Weiner, H. L. & Frenkel, D. Immunology and immunotherapy of Alzheimer’s disease. Nat. Rev. Immunol.6, 404–416 (2006). [DOI] [PubMed] [Google Scholar]
  • 8.Knopman, D. S. et al. Alzheimer disease. Nat. Rev. Dis. Primers7, 1–21 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Melnikova, I. Therapies for Alzheimer’s disease. Nat. Rev. Drug Discov.6, 341–342 (2007). [DOI] [PubMed] [Google Scholar]
  • 10.Congdon, E. E. & Sigurdsson, E. M. Tau-targeting therapies for Alzheimer disease. Nat. Rev. Neurol.14, 399–415 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Athar, T., Al Balushi, K. & Khan, S. A. Recent advances on drug development and emerging therapeutic agents for Alzheimer’s disease. Mol. Biol. Rep.48, 5629–5645 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Bogdanovic, B., Eftimov, T. & Simjanoska, M. In-depth insights into Alzheimer’s disease by using explainable machine learning approach. Sci. Rep.12, 1–26 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Mullard, A. Failure of first anti-tau antibody in Alzheimer disease highlights risks of history repeating. Nat. Rev. Drug Discov.20, 3–6 (2021). [DOI] [PubMed] [Google Scholar]
  • 14.Dou, K.-X. et al. Comparative safety and effectiveness of cholinesterase inhibitors and memantine for Alzheimer’s disease: A network meta-analysis of 41 randomized controlled trials. Alzheimer’s Res. Ther.10, 1–10 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.National Institute for Health and Care Excellence (NICE). In: Dementia: Assessment, management and support for people living with dementia and their carers (2018). [PubMed]
  • 16.Pushpakom, S. et al. Drug repurposing: Progress, challenges and recommendations. Nat. Rev. Drug Discov.18, 41–58 (2019). [DOI] [PubMed] [Google Scholar]
  • 17.Ekins, S. et al. Exploiting machine learning for end-to-end drug discovery and development. Nat. Mater.18, 435–441 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Ballard, C. et al. Drug repositioning and repurposing for Alzheimer disease. Nat. Rev. Neurol.16, 661–673 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Ancidoni, A. et al. Anticancer drugs repurposed for Alzheimer’s disease: A systematic review. Alzheimer’s Res. Ther.13, 1–15 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Corbett, A. et al. Drug repositioning for Alzheimer’s disease. Nat. Rev. Drug Discov.11, 833–846 (2012). [DOI] [PubMed] [Google Scholar]
  • 21.Strittmatter, S. M. Old drugs learn new tricks. Nat. Med.20, 590–591 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Wang, Z.-Y. & Zhang, H.-Y. Rational drug repositioning by medical genetics. Nat. Biotechnol.31, 1080–1082 (2013). [DOI] [PubMed] [Google Scholar]
  • 23.Uddin, M., Wang, Y. & Woodbury-Smith, M. Artificial intelligence for precision medicine in neurodevelopmental disorders. Npj Digit. Med.2, 112 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Cheng, F. et al. A genome-wide positioning systems network algorithm for in silico drug repurposing. Nat. Commun.10, 1–14 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Raynal, N.J.-M. Drug Discovery in Cancer Epigenetics 347–371 (Elsevier, 2016). [Google Scholar]
  • 26.Sun, W., Sanderson, P. E. & Zheng, W. Drug combination therapy increases successful drug repositioning. Drug Discov. Today21, 1189–1195 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Cheng, F., Kovács, I. A. & Barabási, A.-L. Network-based prediction of drug combinations. Nat. Commun.10, 1–11 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Larkins-Ford, J., Degefu, Y. N., Van, N., Sokolov, A. & Aldridge, B. B. Design principles to assemble drug combinations for effective tuberculosis therapy using interpretable pairwise drug response measurements. Cell Rep. Med.3, 100737 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Belyaeva, A. et al. Causal network models of SARS-CoV-2 expression and aging to identify candidates for drug repurposing. Nat. Commun.12, 1–13 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Howell, R. et al. Executable network of SARS-CoV-2-host interaction predicts drug combination treatments. Npj Digit. Med.5, 18 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Gysi, D. M. et al. Network medicine framework for identifying drug-repurposing opportunities for COVID-19. Proc. Natl. Acad. Sci.118 (2021). [DOI] [PMC free article] [PubMed]
  • 32.Hsieh, K. et al. Drug repurposing for COVID-19 using graph neural network and harmonizing multiple evidence. Sci. Rep.11, 23179. 10.1038/s41598-021-02353-5 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Siavelis, J. C., Bourdakou, M. M., Athanasiadis, E. I., Spyrou, G. M. & Nikita, K. S. Bioinformatics methods in drug repurposing for Alzheimer’s disease. Brief. Bioinform.17, 322–335 (2016). [DOI] [PubMed] [Google Scholar]
  • 34.Taubes, A. et al. Experimental and real-world evidence supporting the computational repurposing of bumetanide for APOE4-related Alzheimer’s disease. Nat. Aging1, 932–947 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Bayraktar, A. et al. Drug repositioning targeting glutaminase reveals drug candidates for the treatment of Alzheimer’s disease patients. J. Transl. Med.21, 332 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Zhou, Y. et al. AlzGPS: A genome-wide positioning systems platform to catalyze multi-omics for Alzheimer’s drug discovery. Alzheimer’s Res. Ther.13, 1–13 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Azizan, Z., Zali, H., Mirmotalebisohi, S. A., Bazrgar, M. & Ahmadiani, A. Deciphering molecular bridges: Unveiling the interplay between metabolic syndrome and Alzheimer’s disease through a systems biology approach and drug repurposing. Plos One19, e0304410 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Cheng, F. et al. Artificial intelligence and open science in discovery of disease-modifying medicines for Alzheimer’s disease. Cell Rep. Med.5 (2024). [DOI] [PMC free article] [PubMed]
  • 39.Li, V. O. K. et al. Designing a protocol adopting an artificial intelligence (AI)–driven approach for early diagnosis of late-onset Alzheimer’s disease. J. Mol. Neurosci.71, 1329–1337 (2021). [DOI] [PubMed] [Google Scholar]
  • 40.Ghiam, S., Eslahchi, C., Shahpasand, K., Habibi-Rezaei, M. & Gharaghani, S. Identification of repurposed drugs targeting significant long non-coding RNAs in the cross-talk between diabetes mellitus and Alzheimer’s disease. Sci. Rep.12, 18332 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Zeng, X. et al. deepDR: A network-based deep learning approach to in silico drug repositioning. Bioinformatics35, 5191–5198 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Xu, J. et al. Interpretable deep learning translation of GWAS and multi-omics findings to identify pathobiology and drug repurposing in Alzheimer’s disease. Cell Rep.41, 111717 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Pan, X. et al. AI-DrugNet: A network-based deep learning model for drug repurposing and combination therapy in neurological disorders. Computat. Struct. Biotechnol. J.21, 1533–1542 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Hsieh, K.-L. et al. Synthesize heterogeneous biological knowledge via representation learning for Alzheimer’s disease drug repurposing. iScience26 (2023). [DOI] [PMC free article] [PubMed]
  • 45.Sims, R., Hill, M. & Williams, J. The multiplex model of the genetics of Alzheimer’s disease. Nat. Neurosci.23, 311–322 (2020). [DOI] [PubMed] [Google Scholar]
  • 46.Bellenguez, C. et al. New insights into the genetic etiology of Alzheimer’s disease and related dementias. Nat. Genet. 1–25 (2022). [DOI] [PMC free article] [PubMed]
  • 47.Herrup, K. Fallacies in neuroscience: The Alzheimer’s edition. Eneuro9 (2022). [DOI] [PMC free article] [PubMed]
  • 48.Yu, H. et al. Prediction of drugs having opposite effects on disease genes in a directed network. BMC Syst. Biol.10, S2 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Rodriguez, S. et al. Machine learning identifies candidates for drug repurposing in Alzheimer’s disease. Nat. Commun.12, 1–13 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Sui, S. & Lv, H. Cognitive Improving Actions of Tofacitinib in a mouse model of Alzheimer disease involving TNF-α, IL-6, PI3K-Akt and GSK-3β signalling pathway. Int. J. Neurosci.10.1080/00207454.2022.2151712 (2022). [DOI] [PubMed] [Google Scholar]
  • 51.Jain, M. et al. Role of JAK/STAT in the neuroinflammation and its association with neurological disorders. Ann. Neurosci.28, 191–200 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Desai, R. J. et al. Comparative risk of Alzheimer disease and related dementia among medicare beneficiaries with rheumatoid arthritis treated with targeted disease-modifying antirheumatic agents. JAMA Netw. Open5, e226567. 10.1001/jamanetworkopen.2022.6567 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Vomero, M. et al. Tofacitinib decreases autophagy of fibroblast-like synoviocytes from rheumatoid arthritis patients. Front. Pharmacol.13, 852802 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Barmaki, H., Nourazarian, A. & Khaki-Khatibi, F. Proteostasis and neurodegeneration: A closer look at autophagy in Alzheimer’s disease. Front. Aging Neurosci.15, 1281338 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Shallice, T. `Theory of mind’ and the prefrontal cortex. Brain124, 247–248. 10.1093/brain/124.2.247 (2001). [DOI] [PubMed] [Google Scholar]
  • 56.Advani, D. & Kumar, P. Uncovering cell cycle dysregulations and associated mechanisms in cancer and neurodegenerative disorders: A Glimpse of Hope for repurposed drugs. Mol. Neurobiol.10.1007/s12035-024-04130-7 (2024). [DOI] [PubMed] [Google Scholar]
  • 57.Goldberg, S., Visochek, L., Giladi, E., Gozes, I. & Cohen-Armon, M. PolyADP-ribosylation is required for long-term memory formation in mammals. J. Neurochem.111, 72–79 (2009). [DOI] [PubMed] [Google Scholar]
  • 58.Visochek, L. et al. PolyADP-ribosylation is involved in neurotrophic activity. J. Neurosci.25, 7420–7428 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Carosi, J. M. & Sargeant, T. J. Rapamycin and Alzheimer disease: A double-edged sword?. Autophagy15, 1460–1462. 10.1080/15548627.2019.1615823 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Hu, Z. et al. Shared causal paths underlying Alzheimer’s dementia and type 2 diabetes. Sci. Rep.10, 4107 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Topiwala, A. et al. Association of gout with brain reserve and vulnerability to neurodegenerative disease. Nat. Commun.14, 2844 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Zhang, X.-X. et al. Association of Vitamin D Levels with Risk of Cognitive Impairment and Dementia: A Systematic Review and Meta-Analysis of Prospective Studies. J. Alzheimer’s Dis. 1–13 (2024). [DOI] [PubMed]
  • 63.de Oliveira Neto, L. et al. Factors associated with inflamm-aging in institutionalized older people. Sci. Rep.11, 18333 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Henderson, V. W. et al. Raloxifene for women with Alzheimer disease: A randomized controlled pilot trial. Neurology85, 1937–1944 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Johnson, N. R. et al. Imipramine and olanzapine block apoE4-catalyzed polymerization of Aβ and show evidence of improving Alzheimer’s disease cognition. Alzheimer’s Res. Ther.14, 88 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Zhang, X.-Z., Li, X.-J. & Zhang, H.-Y. Valproic acid as a promising agent to combat Alzheimer’s disease. Brain Res. Bull.81, 3–6 (2010). [DOI] [PubMed] [Google Scholar]
  • 67.Rivera-Oliver, M. & Díaz-Ríos, M. Using caffeine and other adenosine receptor antagonists and agonists as therapeutic tools against neurodegenerative diseases: A review. Life Sci.101, 1–9 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Yeo, I. J., Yun, J., Son, D. J., Han, S.-B. & Hong, J. T. Antifungal drug miconazole ameliorated memory deficits in a mouse model of LPS-induced memory loss through targeting iNOS. Cell Death Dis.11, 623 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Patterson, S. & Palmer, A. Predicting in vivo activity of combination therapies from in vitro drug pairs in diverse environments. Cell Rep. Med.3, 100745 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Pereira, M., Cruz, M. T., Fortuna, A. & Bicker, J. Restoring the epigenome in Alzheimer’s disease: Advancing HDAC inhibitors as therapeutic agents. Drug Discov. Today29, 104052 (2024). [DOI] [PubMed] [Google Scholar]
  • 71.Qiu, Y. et al. Definition of the contribution of an Osteopontin-producing CD11c+ microglial subset to Alzheimer’s disease. Proc. Natl. Acad. Sci.120, e2218915120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Martire, S., Mosca, L. & d’Erme, M. PARP-1 involvement in neurodegeneration: A focus on Alzheimer’s and Parkinson’s diseases. Mech. Ageing Dev.146, 53–64 (2015). [DOI] [PubMed] [Google Scholar]
  • 73.Hierro-Bujalance, C. et al. Empagliflozin reduces vascular damage and cognitive impairment in a mixed murine model of Alzheimer’s disease and type 2 diabetes. Alzheimer’s Res. Ther.12, 1–13 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Annweiler, C., Karras, S. N., Anagnostis, P. & Beauchet, O. Vitamin D supplements: A novel therapeutic approach for Alzheimer patients. Front. Pharmacol.5, 78104 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Eshraghi, M. et al. Enhancing autophagy in Alzheimer’s disease through drug repositioning. Pharmacol. Ther.237, 108171 (2022). [DOI] [PubMed] [Google Scholar]
  • 76.Mathur, A. et al. Secondary hyperparathyroidism (CKD-MBD) treatment and the risk of dementia. Nephrol. Dial. Transplant.37, 2111–2118 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Cummings, J. et al. Alzheimer’s disease drug development pipeline: 2023. Alzheimer’s Dementia9, e12385 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9 (2008).
  • 79.Banasik, M., Stedeford, T. & Strosznajder, R. P. Natural inhibitors of poly (ADP-ribose) polymerase-1. Mol. Neurobiol.46, 55–63 (2012). [DOI] [PubMed] [Google Scholar]
  • 80.Komirishetty, P. et al. PARP inhibition attenuates neuroinflammation and oxidative stress in chronic constriction injury induced peripheral neuropathy. Life Sci.150, 50–60 (2016). [DOI] [PubMed] [Google Scholar]
  • 81.Szklarczyk, D. et al. The STRING database in 2021: Customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res.49, D605–D612 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. & Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res.51, D587–D592 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Park, J. S. et al. Brain somatic mutations observed in Alzheimer’s disease associated with aging and dysregulation of tau phosphorylation. Nat. Commun.10, 3090 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.De Felice, F. G., Gonçalves, R. A. & Ferreira, S. T. Impaired insulin signalling and allostatic load in Alzheimer disease. Nat. Rev. Neurosci.23, 215–230 (2022). [DOI] [PubMed] [Google Scholar]
  • 85.Carter, M. E. & Brunet, A. FOXO transcription factors. Curr. Biol.17, R113–R114 (2007). [DOI] [PubMed] [Google Scholar]
  • 86.Somel, M., Thornton, J. M. & Dönertaş, H. M. Temporal changes in the gene expression heterogeneity during brain development and aging. Sci. Rep.10, 1–15 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Manolopoulos, K., Klotz, L., Korsten, P., Bornstein, S. & Barthel, A. Linking Alzheimer’s disease to insulin resistance: The FoxO response to oxidative stress. Mol. Psychiatry15, 1046–1052 (2010). [DOI] [PubMed] [Google Scholar]
  • 88.Park, J.-C. et al. A logical network-based drug-screening platform for Alzheimer’s disease representing pathological features of human brain organoids. Nat. Commun.12, 280 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Yang, Y. & Zhang, Z. Microglia and Wnt pathways: Prospects for inflammation in Alzheimer’s disease. Front. Aging Neurosci.12, 110 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Li, F. et al. Weakly activated core neuroinflammation pathways were identified as a central signaling mechanism contributing to the chronic neurodegeneration in Alzheimer’s disease. Front. Aging Neurosci.14, 935279 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Jia, J. et al. Mechanisms of drug combinations: Interaction and network perspectives. Nat. Rev. Drug Discov.8, 111–128 (2009). [DOI] [PubMed] [Google Scholar]
  • 92.Zhang, Y. et al. Amyloid-β toxicity modulates tau phosphorylation through the PAX6 signalling pathway. Brain144, 2759–2770 (2021). [DOI] [PubMed] [Google Scholar]
  • 93.Lei, Q. et al. Extracellular vesicles deposit PCNA to rejuvenate aged bone marrow–derived mesenchymal stem cells and slow age-related degeneration. Science Translational Medicine13, eaaz8697 (2021). [DOI] [PubMed] [Google Scholar]
  • 94.Kuenzi, B. M. et al. Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer Cell38, 672-684.e676 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw.20, 61–80 (2008). [DOI] [PubMed] [Google Scholar]
  • 96.Belthangady, C., Stedden, W. & Norgeot, B. Minimizing bias in massive multi-arm observational studies with BCAUS: Balancing covariates automatically using supervision. BMC Med. Res. Methodol.21, 1–10 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Wu, D. & Wang, G. G. Causal artificial neural network and its applications in engineering design. Eng. Appl. Artif. Intell.97, 104089 (2021). [Google Scholar]
  • 98.Zheng, H. et al. Data-driven causal model discovery and personalized prediction in Alzheimer’s disease. NPJ Digit. Med.5, 137 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Ivashko-Pachima, Y. et al. Discovery of autism/intellectual disability somatic mutations in Alzheimer’s brains: Mutated ADNP cytoskeletal impairments and repair as a case study. Mol. Psychiatry26, 1619–1633 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Soheili-Nezhad, S., van der Linden, R. J., Olde Rikkert, M., Sprooten, E. & Poelmans, G. Long genes are more frequently affected by somatic mutations and show reduced expression in Alzheimer’s disease: Implications for disease etiology. Alzheimer’s Dementia17, 489–499 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Sragovich, S., Gershovits, M., Lam, J. C. K., Li, V. O. K. & Gozes, I. Putative blood somatic mutations in post-traumatic stress disorder-symptomatic soldiers: High impact of cytoskeletal and inflammatory proteins. J. Alzheimer’s Dis.79, 1723–1734 (2021). [DOI] [PubMed] [Google Scholar]
  • 102.Downey, J., Lam, J. C. K., Li, V. O. K. & Gozes, I. Somatic mutations and Alzheimer’s disease. J. Alzheimer’s Dis.90, 475–493 (2022). [DOI] [PubMed] [Google Scholar]
  • 103.Alzforum. Adieu to Aduhelm: Biogen Stops Marketing Antibody, https://www.alzforum.org/news/community-news/adieu-aduhelm-biogen-stops-marketing-antibody (2024).
  • 104.Alexander, G. C., Emerson, S. & Kesselheim, A. S. Evaluation of aducanumab for Alzheimer disease: Scientific evidence and regulatory review involving efficacy, safety, and futility. Jama325, 1717–1718 (2021). [DOI] [PubMed] [Google Scholar]
  • 105.Mullane, K. & Williams, M. Alzheimer’s disease beyond amyloid: Can the repetitive failures of amyloid-targeted therapeutics inform future approaches to dementia drug discovery?. Biochem. Pharmacol.177, 113945 (2020). [DOI] [PubMed] [Google Scholar]
  • 106.Ying, Z. et al. Hierarchical graph representation learning with differentiable pooling. Adv. Neural Inf. Process. Syst. 31 (2018).
  • 107.Zhang, Q. et al. An Expert-guided Hierarchical Graph Attention Network for Post-traumatic Stress Disorder Highly-associative Genetic Biomarkers Identification. medRxiv, 2023.2001. 2030.23285175 (2023).
  • 108.Jin, W. et al. Deep learning identifies synergistic drug combinations for treating COVID-19. Proc. Natl. Acad. Sci.118, e2105070118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.Obrezanova, O. Artificial intelligence for compound pharmacokinetics prediction. Curr. Opin. Struct. Biol.79, 102546 (2023). [DOI] [PubMed] [Google Scholar]
  • 110.Abbasi, F. & Rousu, J. New methods for drug synergy prediction: A mini-review. Curr. Opin. Struct. Biol.86, 102827 (2024). [DOI] [PubMed] [Google Scholar]
  • 111.Torkamannia, A., Omidi, Y. & Ferdousi, R. SYNDEEP: A deep learning approach for the prediction of cancer drugs synergy. Sci. Rep.13, 6184 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112.Liu, L. et al. Prediction of the blood–brain barrier (BBB) permeability of chemicals based on machine-learning and ensemble methods. Chem. Res. Toxicol.34, 1456–1467 (2021). [DOI] [PubMed] [Google Scholar]
  • 113.Ying, Z., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: Generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32 (2019). [PMC free article] [PubMed]
  • 114.Wishart, D. S. et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res.46, D1074–D1082 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet.51, 404–413 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116.Ivashko-Pachima, Y. et al. SH3-and actin-binding domains connect ADNP and SHANK3, revealing a fundamental shared mechanism underlying autism. Mol. Psychiatry10.1038/s41380-022-01603-w (2022). [DOI] [PubMed] [Google Scholar]
  • 117.Karmon, G. et al. Novel ADNP syndrome mice reveal dramatic sex-specific peripheral gene expression with brain synaptic and Tau pathologies. Biol. Psychiatry92, 81–95 (2021). [DOI] [PubMed] [Google Scholar]
  • 118.Lin, C. H. R. et al. Quantitative comorbidity risk assessment of dementia in Taiwan: A population-based cohort study. Medicine97, e0298 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119.Olier, I. et al. Transformational machine learning: Learning how to learn from many related scientific problems. Proc. Natl. Acad. Sci.118, e2108013118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120.Avram, S. et al. DrugCentral 2021 supports drug discovery and repositioning. Nucleic Acids Res.49, D1160–D1169 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Res.45, D945–D954 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.Gilson, M. K. et al. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res.44, D1045–D1053 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Tang, J. et al. Making sense of large-scale kinase inhibitor bioactivity data sets: A comparative and integrative analysis. J. Chem. Inf. Model.54, 735–743 (2014). [DOI] [PubMed] [Google Scholar]
  • 124.Pan, X. et al. Deep learning for drug repurposing: Methods, databases, and applications. Wiley Interdiscip. Rev. Computat. Mol. Sci.12, e1597 (2022). [Google Scholar]
  • 125.Derr, T., Ma, Y. & Tang, J. 2018 IEEE International Conference on Data Mining (ICDM). 929–934 (IEEE).
  • 126.Kollias, G., Kalantzis, V., Idé, T., Lozano, A. & Abe, N. In: Proceedings of the AAAI Conference on Artificial Intelligence. 7211–7219.
  • 127.Kipf, T. N. & Welling, M. Variational graph auto-encoders. Preprint at arXiv:1611.07308 (2016).
  • 128.Zhang, M. & Chen, Y. Link prediction based on graph neural networks. Adv. Neural Inf. Process. Syst. 31 (2018).
  • 129.Wood, K., Nishida, S., Sontag, E. D. & Cluzel, P. Mechanism-independent method for predicting response to multidrug combinations in bacteria. Proc. Natl. Acad. Sci.109, 12254–12259 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Pichotta, K. & Tansey, W. The 2022 ICML Workshop on Computational Biology (Maryland, 2022). [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Data Availability Statement

The drug and drug-drug interaction data were obtained from DrugBank (https://go.drugbank.com). The drug target data were obtained from DrugBank (https://go.drugbank.com), DrugCentral (https://drugcentral.org), ChEMBL (https://www.ebi.ac.uk/chembl/), and BindingDB (https://www.bindingdb.org/). The protein–protein interaction and pathway data were obtained from STRING (https://string-db.org). The preprocessed biomedical network datasets, including genes and their weights (Table S1), drugs and their weights (Table S2), proteins/targets and their weights (Table S3), and edges and their weights (Tables S4-7), are available in Supplementary Data.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group

RESOURCES