Briefings in Bioinformatics. 2024 Aug 23;25(5):bbae414. doi: 10.1093/bib/bbae414

HTINet2: herb–target prediction via knowledge graph embedding and residual-like graph neural network

Pengbo Duan 1, Kuo Yang 2, Xin Su 3, Shuyue Fan 4, Xin Dong 5, Fenghui Zhang 6, Xianan Li 7, Xiaoyan Xing 8, Qiang Zhu 9, Jian Yu 10, Xuezhong Zhou 11
PMCID: PMC11341278  PMID: 39175133

Abstract

Target identification is one of the crucial tasks in drug research and development, as it aids in uncovering the action mechanisms of herbs/drugs and discovering new therapeutic targets. Although multiple algorithms for herb target prediction have been proposed, due to the incompleteness of clinical knowledge and the limitations of unsupervised models, accurate identification of herb targets still faces huge challenges in both data and models. To address this, we proposed a deep learning-based target prediction framework termed HTINet2, which comprises three key modules, namely, traditional Chinese medicine (TCM) and clinical knowledge graph embedding, residual graph representation learning, and supervised target prediction. In the first module, we constructed a large-scale knowledge graph that covers the TCM properties and clinical treatment knowledge of herbs, and designed a deep knowledge embedding component to learn the knowledge embeddings of herbs and targets. In the remaining two modules, we designed a residual-like graph convolution network to capture the deep interactions between herbs and targets, and a Bayesian personalized ranking loss to conduct supervised training and target prediction. Finally, we designed comprehensive experiments: comparison with baselines indicated the excellent performance of HTINet2 (HR@10 increased by 122.7% and NDCG@10 by 35.7%), ablation experiments illustrated the positive effect of the designed modules of HTINet2, and a case study demonstrated the reliability of the predicted targets of Artemisia annua and Coptis chinensis based on knowledge bases, literature, and molecular docking.

Keywords: drug–target prediction, network embedding, graph neural network, knowledge graph

Introduction

The formulas of traditional Chinese medicine (TCM), characterized by their multi-component and multi-target effects, naturally excel in regulating complex diseases. TCM has been widely practiced in clinical diagnosis and treatment in China and is gradually gaining global application [1, 2]. The targets of drugs/herbs are the origin of a drug’s therapeutic action in the human body. Therefore, identifying herb–target interactions (HTIs) is crucial, as it aids in understanding the mechanisms of TCM, guiding rational clinical medication, and developing new drugs. Traditionally, the discovery of herb targets primarily relied on wet experiments or clinical trials. However, most herbs contain dozens to thousands of components, making experimental technologies for discovering herb targets, which require simultaneous identification of the effective components of herbs and their corresponding biological targets, extremely costly and time-consuming [3, 4]. In recent years, with the advancement of informatics technology and the accumulation of vast amounts of biomedical data, computational techniques such as machine learning and deep learning are increasingly used to predict the targets of herbs, becoming a mainstream approach [5]. Compared with traditional methods, computational prediction significantly reduces the cost of target discovery, enhances prediction accuracy, and accelerates the pace of new drug development [6].

In recent years, researchers have designed various computational methods for predicting the targets of herbs, primarily employing techniques such as network propagation, network embedding, and random walks, among other machine learning algorithms. Yang et al. [7] developed a network propagation-based model for herb–target prediction (HTP), motivated by the idea that herbs with similar efficacy also have similar targets. This model used efficacy-based herb similarity to initialize node values in a protein–protein interaction (PPI) network, and employed random walks on the network to score candidate proteins. Additionally, Yang et al. [8] proposed a model named heNetRW, which leverages random walks on a heterogeneous herb–target network, demonstrating commendable predictive performance. Wang et al. [9] introduced an HTP method named HTINet based on heterogeneous network embedding. They constructed a heterogeneous symptom–herb–target network and used network embedding to learn low-dimensional vectors of herbs and proteins, ultimately achieving precise target prediction.

Furthermore, with the advancement of deep learning, various drug prediction methods based on deep neural networks have been proposed. Zhang et al. [10] introduced a multi-view deep learning model, DrugAI, which combines graph neural networks, network embedding, and multi-view learning to predict activation–inhibition relationships between drugs and targets. Yang et al. [11] proposed a drug repositioning algorithm named DRONet based on network embedding and ranking learning, aimed at discovering new indications for drugs. Lin et al. [12] developed an end-to-end algorithm based on deep neural networks, named GraphCPI, which employs graph neural networks to predict the targets of compounds. Concurrently, several pharmacological databases have been developed. Mohanraj et al. [13, 14] constructed the IMPPAT digital database, which focuses on the phytochemicals of Indian medicinal plants and integrates information from traditional books and published research articles. Wu et al. [15] developed SymMap, an integrative database of TCM enhanced by symptom mapping, which aligns TCM with modern medicine at both the phenotypic and molecular levels. These pharmacological databases support and facilitate the discovery of drug–target relationships.

Although various prediction algorithms for drug/herb targets have been proposed, the HTI prediction task still faces several challenges. For example, existing HTI models do not make full use of TCM knowledge. On the one hand, HTI relationships are closely associated with the TCM properties of herbs, such as their property (nature), flavor, and meridian, as well as their efficacy. On the other hand, in clinical practice, the therapeutic action of herbs is manifested through the diseases and symptoms they treat. However, to date, only HTINet [9] has considered disease and symptom information in target prediction, and it does not take into account the TCM properties of herbs. Therefore, integrating both the TCM properties and the clinical treatment knowledge of herbs into HTI modeling to achieve more accurate predictions remains a challenge.

Additionally, existing models predict HTI relationships primarily using unsupervised learning techniques, such as random walks and network embedding [8, 9]. These methods are unable to use known HTIs as supervisory signals and fail to capture the full spectrum of known interactions, which hinders further improvement in prediction accuracy. Therefore, designing more efficient supervised deep learning models to enhance the accuracy of HTI prediction poses another challenge.

In this study, we proposed a deep learning-based HTP framework termed HTINet2, which includes three key modules, i.e. TCM knowledge graph embedding (KGE), residual graph representation learning, and supervised target prediction. In the KGE module, we constructed a knowledge graph (termed TMKG) that fuses the TCM property and clinical treatment knowledge of herbs, and learned deep knowledge embeddings (KE) of herbs and targets. In the modules of graph representation learning and target prediction, we designed a residual graph convolutional neural network to capture the deep interactions between herbs and targets, and a Bayesian personalized ranking (BPR) loss to achieve supervised model training. Finally, we designed comprehensive experiments, including performance comparison, ablation experiments, hyper-parameter analysis, and a case study, which indicated the excellent performance and the reliability of the predicted results of HTINet2.

Materials and methods

Dataset

Herb–target interactions

The HTI data are derived from SymMap [15], a well-known database centered on symptoms, herbs, and molecules. First, we downloaded all the HTIs, totaling 267 809 relationships. To ensure the reliability of the relationships, we filtered a subset of relations whose inferred evidence score exceeded the average value of 0.4616 and whose P-value was <0.05. Finally, we obtained a benchmark dataset of 38 002 HTI relationships, including 563 distinct herbs and 2106 distinct targets.
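The filtering step above can be sketched in a few lines; the field names and example records below are illustrative assumptions, not the actual SymMap schema:

```python
# Hypothetical HTI records: (herb, target, evidence_score, p_value).
raw = [
    ("herb_A", "TP53", 0.81, 0.001),
    ("herb_A", "EGFR", 0.30, 0.200),   # fails both filters
    ("herb_B", "STAT3", 0.55, 0.010),
]

SCORE_MEAN = 0.4616  # average evidence score reported in the paper

# Keep relations whose score exceeds the mean and whose P-value is < 0.05.
benchmark = [
    (herb, target) for (herb, target, score, p) in raw
    if score > SCORE_MEAN and p < 0.05
]
print(benchmark)  # [('herb_A', 'TP53'), ('herb_B', 'STAT3')]
```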

Knowledge graph of TCM and western medicine (TMKG)

To make full use of medical knowledge in modeling HTI prediction, we constructed a knowledge graph of TCM and western medicine through extraction and integration from multiple biomedical knowledge bases and TCM books. These databases include SymMap [15], soFDA [16] centered on the ontology of TCM syndromes, STRING [17] v11 containing large-scale PPIs, the genomic encyclopedia KEGG [18], and the Gene Ontology (GO) [19]. The TCM books consist of "Pharmacopoeia of the People's Republic of China 2020", "100 Classic Famous Formulas", "Traditional Chinese Medicine Diagnostics", "Treatise on Febrile Diseases", "Formulas of Traditional Chinese Medicine", and "Internal and External Women and Children's Diseases". To ensure the quality of the constructed knowledge graph, we used programs or scripts for data cleaning, and we invited medical experts to review and correct the data, as detailed in Section 1.1 of Supplemental Materials file 1.

Finally, the TMKG we constructed comprises 74 529 entities of 15 types, including herbs, efficacy, meridians, categories, diseases, components, properties, symptoms, syndromes, prescriptions, Gene Ontology (GO) terms, pathways, and various levels of syndromes. It includes 77 008 head entities and 130 402 tail entities, spanning 31 types of relations such as herb–efficacy, herb–symptom, herb–prescription, herb–ingredient, herb–property, and herb–meridian, totaling 1 920 415 triples (Section 1.1 of Supplemental Materials file 1).

HTINet2 framework to predict herb’s targets

Overall architecture of HTINet2

We proposed HTINet2 (Fig. 1), an HTI prediction framework comprising three key modules, namely KG construction and embedding learning, graph representation learning, and target prediction. In the first module, we constructed a knowledge graph, TMKG, centered on molecules, TCM theories, and clinical treatments of herbs. We then applied KE learning algorithms to learn the complex relationships among entities in TMKG, resulting in their respective embedding vectors. In the second module, we constructed a heterogeneous network specifically tailored for target prediction and introduced a residual graph convolutional network (GCN) learning method. This supervised method builds upon the KE features to further learn intensified representations of herbs and proteins. In the final module, target prediction, we implemented a BPR loss function to optimize the parameters of the residual GCN, facilitating the prediction of potential targets.

Figure 1. Overall architecture of HTINet2. HTINet2 consists of three key modules, i.e. KG construction and embedding learning (A and D), graph representation learning (B and E), and target prediction (C).

KE learning of TMKG

To obtain implicit knowledge of the TMKG, we utilized KE algorithms to learn the embedding representations of all entities within the KG. KE algorithms are adept at learning the latent relationships between various entities in the KG (i.e. implicit local features) as well as the overall network structure of the KG (i.e. implicit global features). Subsequently, the learned implicit features are presented as the embedding vectors of nodes, which can be used for downstream tasks such as classification or link prediction, particularly predicting HTI in this study.

Among various KE algorithms, DeepWalk is particularly noteworthy. DeepWalk uses random walks to explore the KG, thereby learning sequences of nodes that approximate the structural context of each node. Given a graph $G = (V, E)$, where $V$ represents the set of vertices and $E$ the set of edges, let $v_i \in V$ denote a node. The loss function of DeepWalk is mathematically formulated as

$$\mathcal{L} = -\log \Pr\left(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\right) \quad (1)$$

$$\Pr\left(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\right) = \prod_{\substack{j = i-w \\ j \neq i}}^{i+w} \Pr\left(v_j \mid \Phi(v_i)\right) \quad (2)$$

where $\mathcal{L}$ is the loss function, $v_i$ is the current node in the graph, $\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i$ represents the neighborhood of the current node within a walk, $w$ is the size of the window, and $\Phi$ is the function mapping nodes to their vector embeddings (as detailed in Section 1.2 of Supplemental Materials file 1).
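As a rough illustration of the random-walk stage of DeepWalk (the subsequent skip-gram training is omitted), a minimal sketch assuming an adjacency-list representation of the KG; the toy graph and parameter values are assumptions:

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=2, seed=0):
    """Generate uniform random walks (DeepWalk-style node 'sentences')."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:          # dead end: stop the walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# Toy KG fragment: herb -- target -- pathway.
adj = {
    "herb:ginseng": ["target:TP53"],
    "target:TP53": ["herb:ginseng", "pathway:apoptosis"],
    "pathway:apoptosis": ["target:TP53"],
}
walks = random_walks(adj)
# Each walk is a node sequence; in DeepWalk these sequences are fed to a
# skip-gram model (e.g. gensim's Word2Vec) with context window w.
```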

Additionally, owing to the impact of different KE algorithms and embedding dimensions on these downstream tasks, we compared five KE algorithms, namely LINE [20], GraRep [21], node2vec [22], DeepWalk [23], and HOPE [24], along with different dimensions (i.e. 32, 64, 96, 128, 256, and 512). Ultimately, the most effective KGE algorithms and embedding dimensions for the task of HTI prediction are selected.

Graph representation learning for HTI network

Graph representation learning, particularly GCNs, has emerged as one of the key methodologies for learning and representing complex relationships within graph-structured data. To fully leverage the intricate interactions between herbs and targets, thereby forming more precise representation vectors of herbs and targets, we have developed a residual-like graph convolutional neural network [25]. This network takes the HTI network as its initial input, with the embedding vectors derived from KE learning module serving as the initial node features. Interaction information of herbs and targets could be propagated through residual-like GCN operations, ultimately learning deeper representation features of them.

Formally, based on the known herb–target relationships in the training set, we constructed a heterogeneous herb–target graph $G = (V, A)$, where $V$ is the set of nodes (including herbs and targets), and the adjacency matrix $A$ encodes the interaction relationships between herbs and targets: $A_{ij} = 1$ if herb $h_i$ interacts with target $t_j$, and $A_{ij} = 0$ otherwise. Subsequently, we computed the normalized matrix $\hat{A} = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ to weight the influence of each node's degree, where $\tilde{A} = A + I$ is the adjacency matrix with self-loops ($I$ being the identity matrix), and $\tilde{D}$ is the degree matrix of $\tilde{A}$.
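The symmetric normalization described above can be computed directly; the toy 4-node HTI graph below is an illustrative assumption:

```python
import numpy as np

# Toy HTI graph with 2 herbs (rows/cols 0-1) and 2 targets (rows/cols 2-3).
A = np.array([
    [0, 0, 1, 1],   # herb 0 interacts with targets 2 and 3
    [0, 0, 1, 0],   # herb 1 interacts with target 2
    [1, 1, 0, 0],
    [1, 0, 0, 0],
], dtype=float)

A_tilde = A + np.eye(4)                           # add self-loops
deg = A_tilde.sum(axis=1)                         # degrees of A_tilde
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))          # D^{-1/2}
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt         # symmetric normalization

# A_hat is symmetric because A_tilde is, and entry (i, j) equals
# A_tilde[i, j] / sqrt(deg_i * deg_j).
```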

In the multi-layer GCN we constructed, if the node feature matrix at the $k$-th layer is denoted as $E^{(k)}$, then after the propagation and learning of node features, the features at the $(k+1)$-th layer can be represented as

$$E^{(k+1)} = \sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} E^{(k)} W^{(k)}\right) \quad (3)$$

where $W^{(k)}$ represents the weight matrix; it adjusts the relative importance of different input variables and is continuously optimized during training. To maintain the integrity of neighborhood information and reduce computational cost, we replaced the nonlinear activation function $\sigma$ in each layer with a linear one for feature propagation. In the HTI network, the neighbor node sets of an herb $h$ and a target $t$ are denoted $N_h$ and $N_t$, with node degrees $|N_h|$ and $|N_t|$, respectively. Their vector representations in the $k$-th layer of the graph neural network are $e_h^{(k)}$ and $e_t^{(k)}$. Consequently, after one layer of GCN feature propagation, their representations at the $(k+1)$-th layer can be obtained:

$$e_h^{(k+1)} = \sum_{t \in N_h} \frac{1}{\sqrt{|N_h|}\sqrt{|N_t|}}\, e_t^{(k)} \quad (4)$$

$$e_t^{(k+1)} = \sum_{h \in N_t} \frac{1}{\sqrt{|N_t|}\sqrt{|N_h|}}\, e_h^{(k)} \quad (5)$$

Overly deep network architectures may lead to gradient vanishing or explosion in GCNs. Inspired by the residual operation of ResNet, we obtained the final embedding vectors of herbs/targets, $e_h$ and $e_t$, by fusing the vectors after $k$ layers of graph convolution, $e_h^{(k)}$ and $e_t^{(k)}$, with the initial node representations $e_h^{(0)}$ and $e_t^{(0)}$:

$$e_h = \frac{1}{K+1}\sum_{k=0}^{K} e_h^{(k)} \quad (6)$$

$$e_t = \frac{1}{K+1}\sum_{k=0}^{K} e_t^{(k)} \quad (7)$$

The residual-like feature aggregation captures more complex features by taking the feature outputs of every layer into account, and makes information transfer between network layers more efficient, finally yielding more precise representations of herbs and targets.
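Putting the linear (activation-free) propagation and the residual-style fusion together, a minimal NumPy sketch; averaging the layer-0..K representations is one plausible reading of the fusion step, not necessarily the authors' exact choice, and the toy graph and feature sizes are assumptions:

```python
import numpy as np

def residual_gcn(A_hat, E0, K=3):
    """Linear propagation with residual-style fusion: propagate K times
    through the normalized adjacency, then average all layer outputs
    (including the initial KE features E0)."""
    layers = [E0]
    E = E0
    for _ in range(K):
        E = A_hat @ E                 # one round of neighborhood aggregation
        layers.append(E)
    return np.mean(layers, axis=0)    # fuse layers 0..K

# Toy example: 4 nodes with 8-dimensional pretrained KE features.
rng = np.random.default_rng(0)
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
A_tilde = A + np.eye(4)
deg = A_tilde.sum(axis=1)
A_hat = A_tilde / np.sqrt(np.outer(deg, deg))   # D^{-1/2} (A+I) D^{-1/2}
E_final = residual_gcn(A_hat, rng.normal(size=(4, 8)))
```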

Herb–target prediction

Building upon graph representation learning module, we designed an optimization loss of BPR for predicting HTI relationships. As a commonly used method in the field of recommendations, BPR loss facilitates learning the interactions between herbs and targets by maximizing the margin between positive and negative samples. This margin is assessed by comparing the scores of herb–target pairs, where the score signifies the level of similarity or matching of a herb–target pair.

Mathematically, for a triple $(h, t^{+}, t^{-})$, $h$ represents an herb, $t^{+}$ a target associated with the herb (i.e. positive sample, drawn from the targets associated with the herb in the training set), and $t^{-}$ a target not associated with the herb (i.e. negative sample, obtained by negative sampling of the positive samples). We utilized vector distance, simplified to the dot product in this study, to calculate the scores of both the positive and negative sample pairs based on the embedding vectors of herbs and targets. We then constructed a pairwise ranking-based objective loss function, which seeks to maximize the score of positive samples while minimizing that of negative samples, thereby updating and optimizing the parameters of the neural network, as follows:

$$\mathcal{L}_{BPR} = -\frac{1}{M}\sum_{(h, t^{+}, t^{-}) \in O} \ln \sigma\left(e_h \cdot e_{t^{+}} - e_h \cdot e_{t^{-}}\right) + \lambda \|\Theta\|_2^2 \quad (8)$$

Here, $\sigma$ signifies the Sigmoid function, $O$ encompasses all the HTI triples in the training set, $\Theta$ represents all the parameters of the neural network, $\|\Theta\|_2^2$ is the regularization term for the parameters, $\lambda$ denotes the weight of the regularization term, and $M$ is the number of herbs. Ultimately, based on the trained neural network, a score for each herb–target pair can be obtained, allowing targets to be ranked by these scores.
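A sketch of the BPR objective with dot-product scoring as described above; batch shapes, the averaging convention, and the regularization handling are illustrative assumptions:

```python
import numpy as np

def bpr_loss(e_h, e_pos, e_neg, theta_sq=0.0, lam=1e-4):
    """BPR loss over a batch of (herb, positive target, negative target)
    embedding triples; pair scores are dot products."""
    pos = np.sum(e_h * e_pos, axis=1)        # scores of positive pairs
    neg = np.sum(e_h * e_neg, axis=1)        # scores of negative pairs
    sig = 1.0 / (1.0 + np.exp(-(pos - neg)))  # sigmoid of the margin
    return -np.mean(np.log(sig)) + lam * theta_sq

# Toy batch: 16 triples of 32-dimensional embeddings.
rng = np.random.default_rng(1)
e_h, e_pos, e_neg = (rng.normal(size=(16, 32)) for _ in range(3))
loss = bpr_loss(e_h, e_pos, e_neg)
```

Minimizing this loss pushes positive-pair scores above negative-pair scores, which is all the ranking objective needs; absolute score values are irrelevant.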

Experimental settings

In our experiments, we divided the 38 002 HTI relationships into training, validation, and test sets in an 8:1:1 ratio. We compared HTINet2 with two categories of baseline methods. The first comprises traditional link prediction methods based on common neighbors, including CN, Salton [26], Jaccard [27], HPI [28], LHN-1 [29], AA [30], and RA [31]. The other comprises previously proposed models for HTI prediction, namely heNetRW [8] and PRINCE [32]. A detailed introduction to these baselines is given in Section 1.3 of Supplemental Materials file 1. Furthermore, we selected hit ratio (HR) and normalized discounted cumulative gain (NDCG) as the evaluation metrics [33]. For a given herb $h$, let $R_h^K$ denote the top-$K$ predicted candidate genes and $T_h$ the known genes of $h$ in the test set. HR@K is given by

$$\mathrm{HR@}K = \frac{|R_h^K \cap T_h|}{|T_h|} \quad (9)$$

The normalized discounted cumulative gain (NDCG@K) can be calculated as follows:

$$\mathrm{NDCG@}K = \frac{\mathrm{DCG@}K}{\mathrm{IDCG@}K}, \qquad \mathrm{DCG@}K = \sum_{i=1}^{K} \frac{rel_i}{\log_2(i+1)} \quad (10)$$

where DCG denotes the discounted cumulative gain, IDCG the ideal discounted cumulative gain obtained by sorting the results by relevance, and $rel_i$ the relevance score of the gene at position $i$ in the ranked list.
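The two metrics can be sketched as follows, assuming binary relevance ($rel_i = 1$ when the $i$-th ranked gene is a known test-set target), which may differ in detail from the authors' implementation; the ranked list and known set are toy data:

```python
import numpy as np

def hr_at_k(ranked, known, k):
    """Fraction of the herb's known test-set targets recovered in the top-k."""
    return len(set(ranked[:k]) & set(known)) / len(known)

def ndcg_at_k(ranked, known, k):
    """Binary-relevance NDCG: DCG over the top-k, normalized by the ideal
    DCG of a ranking that places all known targets first."""
    rel = [1.0 if g in known else 0.0 for g in ranked[:k]]
    dcg = sum(r / np.log2(i + 2) for i, r in enumerate(rel))
    ideal = [1.0] * min(len(known), k)
    idcg = sum(r / np.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

ranked = ["TP53", "EGFR", "STAT3", "IL6", "MYC"]   # model's top-5 genes
known = {"EGFR", "MYC"}                            # test-set targets
print(hr_at_k(ranked, known, 5))   # 1.0 -> both known targets in the top-5
```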

Results

TMKG and KE

We constructed a large-scale knowledge graph of traditional Chinese and modern medicine (TMKG) by information extraction and curation from multiple public databases. TMKG comprises 15 types of entities totaling 74 529, and 31 types of triples totaling 1 920 415. For subsequent HTP, TMKG includes two types of vital herb-related information. The first is the TCM properties of herbs, including herbal property, class, taste, meridian, and efficacy. The second is the clinical treatment information of the herbs, including diseases and symptoms. As shown in Fig. 2A and B, we took Artemisia annua and Coptis as examples and illustrated their connections to syndromes, diseases, efficacy, ingredients, and proteins.

Figure 2. (A and B) Schematic representation centered on Artemisia annua and Coptis in the knowledge graph TMKG; (C–F) visualization of embedding vectors of entities in TMKG by different KE methods.

To capture the complex and deep relationships among biological entities (especially herbs and targets) in TMKG, we utilized multiple KE methods to learn the embedding representations of entities, which encode extensive and accurate medical knowledge, e.g. the TCM properties and clinical treatment information of herbs. These embedding vectors encapsulate information pertaining to each node and its neighboring nodes within the context of a complex knowledge graph. We then used t-SNE [34] to reduce the dimension of the embedding vectors learned by the different KE models, and visualized them on a two-dimensional plane. We calculated the silhouette score (SC) for the various entity types, as shown in Fig. 2C–F and Section 2.1 of Supplemental Materials file 1. These entities, including herbs, proteins, symptoms, and diseases, are grouped into distinct clusters.
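A sketch of this visualization and evaluation step with scikit-learn; the synthetic two-cluster data stands in for the real KE vectors, and here the silhouette score is computed on the original vectors (the paper may compute it on the 2D coordinates instead):

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Stand-in for pretrained KE vectors: two entity types, 64-d, offset means.
herbs = rng.normal(loc=0.0, size=(30, 64))
targets = rng.normal(loc=3.0, size=(30, 64))
X = np.vstack([herbs, targets])
labels = np.array([0] * 30 + [1] * 30)   # entity-type labels

# Project to 2D for plotting (perplexity must be < number of samples).
X2d = TSNE(n_components=2, perplexity=10, init="random",
           random_state=0).fit_transform(X)
# Silhouette score on the original vectors: > 0 means the entity types
# form separable clusters in embedding space.
sc = silhouette_score(X, labels)
```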

Second, the results also showed obvious differences among the embedding visualizations of KE methods with different model characteristics. For example, in the visualization of GraRep, the herb cluster is relatively close to the clusters of diseases, genes, and efficacy, while relatively far from the ingredient cluster. In the result of LINE, the herb cluster lies in the center; apart from its distance to the prescription cluster, it is relatively close to the clusters of the other entities. Overall, the KE models can capture the complex relationships between entities and obtain good embedding representations, which provides a basis for fine-tuning on the downstream task of HTI prediction.

Overall performance comparison with baselines

In the experiments, we compared the performance of HTINet2 with multiple baseline methods, which include traditional link prediction algorithms and the known HTI prediction methods, i.e. heNetRW and PRINCE.

The experimental results (Table 1 and Fig. 3) showed that, among the local-similarity baselines, CN and AA, both network topology-based methods, achieved the best performance (HR@5 = 0.1209 and NDCG@10 = 0.276), followed by RA, while HPI and LHN-1 performed worst. Of the two published baselines, heNetRW obtained the higher performance (HR@10 = 0.0915 and NDCG@10 = 0.0942). The local-similarity baselines mostly outperformed the two published baselines. For example, the HR@10 and NDCG@10 of CN are 118.7% and 193.0% higher than those of heNetRW, respectively, indicating that local-similarity methods can effectively capture the structural information of the HTI network and demonstrate superior performance in HTI prediction.

Table 1.

Performance comparison of HTP methods.

Top@1 Top@3 Top@5 Top@10
Models HR NDCG HR NDCG HR NDCG HR NDCG
CN 0.0274 0.0812 0.0799 0.1705 0.1206 0.2131 0.2001 0.276
Salton 0.0097 0.0294 0.0284 0.0614 0.0438 0.0792 0.0805 0.1117
Jaccard 0.0113 0.0305 0.0335 0.0667 0.053 0.085 0.0983 0.1255
HPI 0.0003 0.0006 0.003 0.0073 0.0067 0.0162 0.0201 0.04
LHN-1 0.0003 0.0005 0.0012 0.0015 0.0022 0.0031 0.0062 0.0079
AA 0.0278 0.0831 0.0787 0.1669 0.1209 0.2148 0.1996 0.2756
RA 0.027 0.0795 0.0763 0.1569 0.117 0.2118 0.1929 0.2667
heNetRW 0.0080 0.0186 0.0254 0.0436 0.0455 0.0597 0.0915 0.0942
PRINCE 0.0032 0.0065 0.0134 0.0174 0.0269 0.0282 0.0495 0.0422
HTINet2(ours) 0.3536 0.3536 0.3448 0.3398 0.3706 0.35 0.4458 0.3746

Figure 3. Performance comparison of HTP methods.

Comparing HTINet2 with all baseline methods, the results showed that HTINet2 achieves the best performance (HR@10 = 0.4458 and NDCG@10 = 0.3746), significantly higher than all baselines. For example, the HR@10 and NDCG@10 of HTINet2 are 122.8% and 35.7% higher than those of the best baseline (i.e. CN). The excellent performance of HTINet2 benefits from the pre-training of KE learning and the fine-tuning of the residual-like GCN.

To obtain deep knowledge representations for herbs and targets, we utilized different pre-training KE algorithms to learn pre-trained features of herbs and targets from TMKG. As shown in Fig. 4 and Section 2.2 of Supplemental Materials file 1, the results indicate that the pre-training of HTINet2 is robust: all pre-training methods improve prediction performance compared with no pre-training. Specifically, DeepWalk and HOPE exhibit the best performance in HTI prediction. This may be attributed to the graph structure of TMKG; both DeepWalk and node2vec employ random walks to capture structural information, but the uniform random walk used in DeepWalk might more accurately reflect the true structure of the graph in the context of HTI prediction. In addition, HOPE is designed to capture high-order proximities in a graph, and capturing deep interactions in the HTI network may be more crucial than considering direct neighbors alone. We have presented the target predictions of the HTINet2 model for all drugs in the dataset in Supplemental Materials file 2.

Figure 4. Performance comparison and parameter sensitivity analysis of HTINet2: (A) performance comparison of different KE methods; (B) performance comparison of different components of HTINet2; (C and D) HTINet2's performance with different numbers of GCN layers; (E and F) HTINet2's performance with different dimensions of KE; (G and H) HTINet2's performance with different window sizes.

Performance influence of key components in HTINet2

To evaluate the contribution of the different components of HTINet2, we performed ablation experiments covering KE pre-training (termed PT) and the residual GCN (termed RES). The results (Table 2 and Fig. 4B) showed that, except for rare metrics (i.e. HR@1 and NDCG@1), removing any component from HTINet2 leads to a performance decline in most metrics. Specifically, performance decreases the most (HR@10 by 8.69% and NDCG@10 by 7.89%) when the residual and KG pre-training components are removed simultaneously, and the least (HR@10 by 2.42% and NDCG@10 by 0.51%) when only the pre-training component is removed. This indicates that each component of HTINet2 contributes to its performance, with the residual component of the GCN contributing more than KG pre-training.

Table 2.

Performance comparison of key modules of HTINet2.

Top@1 Top@3 Top@5 Top@10
Models HR NDCG HR NDCG HR NDCG HR NDCG
HTINet2 0.3536 0.3536 0.3448 0.3398 0.3706 0.35 0.4458 0.3746
HTINet2 w/o PT 0.3556 0.3556 0.3389 0.3389 0.3663 0.3447 0.435 0.3727
(+0.57%) (+0.57%) (-1.71%) (-0.26%) (-1.16%) (-1.51%) (-2.42%) (-0.51%)
HTINet2 w/o RES&PT 0.3285 0.3285 0.3169 0.3172 0.3331 0.32 0.3972 0.3433
(-7.62%) (-7.62%) (-6.49%) (-6.40%) (-9.06%) (-7.17%) (-8.69%) (-7.89%)

The above results indicate that pre-training on TMKG enhances the expressive capability of the subsequent components. Moreover, the interactions between herbs and genes within the KG provide additional context for the model. By utilizing the information propagated through the residual-like GCN, the model is better equipped to capture the latent characteristics of the nodes, improving prediction performance.

Interpretability of HTI prediction model

To investigate the interpretability of HTI prediction models, we used t-SNE [34] to reduce the dimension of the embedding vectors of herbs and targets learned by different prediction models, and visualized them on a two-dimensional plane. First, the visualization results (Fig. 5A–F) indicated that the five KE methods (i.e. LINE, GraRep, node2vec, DeepWalk, and HOPE) effectively divided herbs and targets into two distinct clusters, showing a clear advantage over random embedding vectors. The SC results for these clusters show high scores for GraRep, DeepWalk, and HOPE, two of which (HOPE and DeepWalk) also achieve high prediction performance, indicating a positive correlation between the performance of HTINet2 and the clustering quality of the KE models.

Figure 5. (A–E) Visualization of the pretrained features of herbs and targets in TMKG by different KE methods; (F) visualization of embedding features of herbs and targets initialized from a normal distribution; (G) visualization of embedding features of herbs and targets by HTINet2 and several examples of HTI; (H) relationships between Rehmanniae Radix, PRKAA1, and PRKAA2 in TMKG; (I) relationships between Polygalae Radix and TRPV3 in TMKG.

To intuitively display and further explain the predicted results of HTINet2, we also showed the visualization (Fig. 5G) after the KE and residual-like GCN stages of HTINet2. We then selected some HTI relationships in which the herbs and targets are very close on the two-dimensional plane. For example, Rehmanniae Radix is close to two known targets, PRKAA1 and PRKAA2, with distances of 0.1604 and 0.1521, respectively. PRKAA1 and PRKAA2, which almost coincide in position, are also proteins from the same family. There is an even closer distance (0.0545) between Polygalae Radix and its target TRPV3. We revisited TMKG, analyzing the selected HTI relationships and the neighbors of these entities. Figure 5H and I demonstrate these relationships within the TMKG we constructed, where entities with closer vector distances show strong relevance; some possible paths are highlighted with red lines. For example, PRKAA1 and PRKAA2 are second-order neighbors of Rehmanniae Radix, and TRPV3 is a second-order neighbor of Polygalae Radix. On the whole, most herb–target pairs with existing or potential relationships are closely spaced in the embedding visualizations of HTINet2.

Hyper-parameter sensitivity of HTINet2

To evaluate the robustness of HTINet2, we conducted extensive hyper-parameter sensitivity experiments. The main hyper-parameters of HTINet2 include the number of GCN layers in the graph representation learning module, the embedding dimension of entities (e.g. herbs and targets), and the window size of DeepWalk in the KE learning stage.

When experimenting on the number of GCN layers in the graph representation learning component, we fixed the embedding dimension at 64. The results (Fig. 4C and D, Table 3 of Supplemental Materials file 1) show a noticeable impact of the number of layers on the performance of HTINet2. Specifically, as the number of layers increased, the HR and NDCG metrics first improved significantly, then declined, and finally stabilized.

Table 3.

Prediction analysis of Artemisia annua and Coptis chinensis.

| Rank | Artemisia annua: predicted target | Target validation | Coptis chinensis: predicted target | Target validation |
| --- | --- | --- | --- | --- |
| 1 | TLR4 | ETCM | CHRM3 | Docking (BE = -5.99) |
| 2 | PLB1 | | SLC2A4 | Tang et al. |
| 3 | ADH1A | Docking (BE = -6.56) | PLB1 | |
| 4 | STAT3 | Gao et al.; Ilamathi et al. | SOD1 | |
| 5 | MC1R | Docking (BE = -6.58) | POR | Docking (BE = -6.61) |
| 6 | TRPM8 | ETCM | NQO1 | Shou et al. |
| 7 | CDH1 | Xu et al. | CHRM2 | HERB and ETCM |
| 8 | EIF6 | HERB | PPARA | Docking (BE = -7.61) |
| 9 | NOS3 | | RHO | Tang et al. |
| 10 | NR3C1 | Docking (BE = -8.15) | PRSS3 | Docking (BE = -6.9) |
| 11 | EGFR | Liu et al. | AHSA1 | |
| 12 | CD86 | Docking (BE = -6.34) | IL18 | Huang et al. |
| 13 | AHR | Wang et al. | GCLC | |
| 14 | MYC | Hu et al. | MYC | |
| 15 | IL2 | Yang et al. | PSMD3 | |

Note: Target validation includes three types, namely database, literature, and molecular docking. Database validation is based on two authoritative Chinese herb databases, ETCM and HERB. Literature validation is based on recently published medical literature from PubMed. Docking validation is based on docking-based virtual screening; BE denotes the binding energy.

We then fixed the number of GCN layers and varied the embedding dimension. The results (Fig. 4E–F, Table 4 of Supplemental Materials file 1) indicate that performance gradually improved across all metrics as the embedding dimension increased. This can be attributed to the fact that higher-dimensional vectors encapsulate more information, enabling a more comprehensive representation of herbs and targets and thereby improving the prediction performance of HTINet2. Finally, we adjusted the window size of DeepWalk during the KE learning stage; the results (Fig. 4G–H, Table 5 of Supplemental Materials file 1) show that the model is only slightly sensitive to the window size. As the window size increases, overall performance exhibits a declining trend, with a slight rebound when the window size is set to 10.
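The role of the DeepWalk window size can be seen in a minimal sketch of its two steps: truncated random walks over the graph, then skip-gram context pairs extracted within a window. This is our simplification with a hypothetical toy graph, not the HTINet2 pipeline; the window controls how far apart two entities on the same walk may be while still forming a training pair, so a larger window ties a herb to more remote TMKG neighbors.

```python
import random

def random_walks(graph, num_walks=10, walk_length=5, seed=0):
    """Generate truncated random walks over an adjacency-list graph (DeepWalk step 1)."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for node in graph:
            walk = [node]
            while len(walk) < walk_length:
                nbrs = graph[walk[-1]]
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

def skipgram_pairs(walks, window=2):
    """Context pairs fed to word2vec; the window bounds co-occurrence distance."""
    pairs = []
    for walk in walks:
        for i, u in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    pairs.append((u, walk[j]))
    return pairs

# Toy undirected graph (hypothetical herb and target nodes).
kg = {"HerbA": ["T1", "T2"], "T1": ["HerbA", "T2"], "T2": ["HerbA", "T1"]}
walks = random_walks(kg)
print(len(skipgram_pairs(walks, window=1)), len(skipgram_pairs(walks, window=2)))
```

In the full method these pairs are fed to a word2vec-style model to produce the KE vectors; widening the window mixes in more distant context, which is consistent with the slight performance decline observed above.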

Case study

To demonstrate the reliability of HTINet2’s prediction results, we selected two herbs, Artemisia annua and Coptis chinensis, and examined the top 15 targets predicted by HTINet2, excluding targets already present in the test set. To verify the potential relationships between the herbs and their predicted targets, we conducted three types of validation: database, literature, and molecular docking. Database validation is based on two authoritative Chinese herb databases, ETCM [35] and HERB [36]. Literature validation is based on recently published medical literature from PubMed to identify possible associations. In docking validation, we selected the key compounds of Artemisia annua and Coptis chinensis, namely artemisinin and berberine, and conducted docking-based virtual screening (using the molecular docking software AutoDock 1.5.7) to investigate the potential relationships between the predicted targets and these two compounds. As detailed in Table 3, the targets highlighted in bold are those verified by the three types of evidence.

  • Target validation of Artemisia annua. Among the top 15 candidate targets of Artemisia annua, evidence from databases, literature, or docking supports 13. The ETCM database developed by Zhang et al. [35] records TLR4 (ranked first) and TRPM8 (ranked sixth) as targets of Artemisia annua, and the HERB database by Fang et al. [36] records EIF6 (ranked eighth) as a target of this herb. Gao et al. [37] showed that dihydroartemisinin inhibits endothelial cell tube formation by suppressing the STAT3 (ranked fourth) signaling pathway, and Ilamathi et al. [38] demonstrated that artesunate, an anti-cancer agent, targets STAT3 and effectively suppresses hepatocellular carcinoma. Liu et al. [39] found that EGFR (ranked 11th) is related to the treatment of cervical cancer: oral administration of dihydroartemisinin for 28 days reduced the expression of p53, EGFR, and Ki-67 antigens. Furthermore, several studies [40–43] have reported potential targets such as CDH1, AHR, MYC, and IL-2, which may be associated with Artemisia annua or its derivatives. In addition, docking results indicated strong binding energies between artemisinin and four potential targets: ADH1A (ranked third, BE=-6.56), MC1R (ranked fifth, BE=-6.58), NR3C1 (ranked 10th, BE=-8.15), and CD86 (ranked 12th, BE=-6.34). For example, artemisinin docked onto the ADH1A residue LYS-188 (Fig. 6A). Other docking results are shown in Section 2.4 of Supplemental Materials file 1.

  • Target validation of Coptis chinensis. Among the top 15 candidate targets of Coptis chinensis, evidence supports nine. The study of Tang et al. [44] demonstrated that berberine, a major component of Coptis chinensis, may improve insulin resistance by affecting the expression of the SLC2A4 gene (ranked second) and increasing GLUT4 levels. Shou et al. [45] found that berberine activates PPAR, initiating transcriptional regulatory functions and promoting the expression of NQO1 (ranked sixth). The CHRM2 gene (ranked seventh), a target of Coptis chinensis, is recorded in both the HERB and ETCM databases. Additionally, several studies [46, 47] have reported potential targets, such as RHO and IL18, which may be related to Coptis chinensis or its derivatives. The docking results likewise showed strong binding energies between berberine and four potential targets: CHRM3 (ranked first, BE=-5.99), POR (ranked fifth, BE=-6.61), PPARA (ranked eighth, BE=-7.61), and PRSS3 (ranked 10th, BE=-6.9). Taking CHRM3 as an example, berberine docked onto three residues of the protein, namely HIS-62, ARG-150, and LEU-192 (Fig. 6B). Other docking results are shown in Section 2.4 of Supplemental Materials file 1.

Figure 6.

Figure 6

3D diagrams of molecular docking results; (A) artemisinin docked onto the amino acid residue of ADH1A, namely LYS-188; (B) berberine docked onto the three amino acid residues of the protein, namely HIS-62, ARG-150 and LEU-192.

The above case results indicate that HTINet2 produces reliable predictions, not only recovering existing drug targets but also suggesting new candidate targets for which evidence has not yet been found.

Discussion

Benefiting from efficient computational methods, in-silico prediction of drug targets has emerged as a hot topic. In this study, we proposed a deep learning framework to predict herbal targets, which includes TCM and clinical KGE, residual graph representation learning, and supervised target prediction. Comprehensive experiments and case studies showed the excellent performance and the reliability of the predicted results of HTINet2.

The high performance of HTINet2 stems primarily from two factors. On the one hand, the knowledge graph TMKG we constructed contains rich knowledge, covering both the TCM properties and the clinical treatment of herbs, which helps to learn informative KE vectors of herbs and targets. TMKG is a foundational knowledge graph that can be further applied to various scenarios, e.g. biomedical relationship inference (e.g. genes of diseases, symptoms, or syndromes) and intelligent auxiliary diagnosis and treatment (e.g. disease diagnosis or drug/treatment recommendation). On the other hand, the residual-like graph convolution we designed effectively captures the deep interactions among herbs and targets and integrates complex interaction data from different biomolecules, and the BPR loss is a supervised objective designed specifically for HTP; this supervised neural network framework performs better on target prediction. Meanwhile, the flexible design of HTINet2 is not only suitable for predicting drug targets but can also be extended to other graph-based bioinformatics problems such as disease gene identification and protein function prediction. We describe the process for applying HTINet2 to other tasks or datasets in Section 2.5 of Supplemental Materials file 1. This wide applicability makes HTINet2 a versatile tool for biomedical research.
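As a concrete illustration of the BPR objective mentioned above, the loss pushes each observed herb-target pair to score higher than a sampled unobserved one. The sketch below is a minimal stdlib version with toy scores, not the HTINet2 implementation, where scores would be dot products of the learned herb and target embeddings.

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """BPR loss: mean of -log(sigmoid(pos - neg)) over sampled pairs."""
    eps = 1e-10  # numerical guard against log(0)
    losses = [
        -math.log(1.0 / (1.0 + math.exp(n - p)) + eps)  # sigmoid(p - n)
        for p, n in zip(pos_scores, neg_scores)
    ]
    return sum(losses) / len(losses)

# Toy scores: each positive (observed) pair vs. a sampled negative pair.
print(round(bpr_loss([2.0, 1.5], [0.5, 1.0]), 4))
```

The loss shrinks toward zero as positive pairs out-score negatives by a wider margin, which is what drives the supervised ranking of candidate targets.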

Several directions remain for future work. First, the current herb-target relationship data are still noisy; maintaining consistency and quality across multiple sources remains a problem, and the coverage of high-quality herb-target relationships is still a limitation. In the future, we will combine two high-quality datasets built in our previous studies, SymMap [15] and SympGAN [48], and integrate techniques such as entity alignment and semantic disambiguation to form a large-scale, high-quality herb-target dataset. Second, HTINet2 is still a two-stage rather than an end-to-end framework: the optimization objective (i.e. the loss function) can update the parameters of the residual-like GCN but not those of the KE models. We will design an end-to-end unified framework to reduce redundancy and improve performance. Third, most neural network models are black boxes, making interpretability a critical and challenging issue, especially in bioinformatics [49]; we aim to incorporate explainable models and techniques to better understand the decision-making processes of the models [50]. Finally, clinical translation generally takes several years. We are committed to shortening this process through computational biology by narrowing the scope of candidate drug targets, and we will conduct wet-lab experiments to validate the predictions of HTINet2 and discover new, reliable herb targets.

Key Points

  • We proposed a deep learning-based target prediction framework termed HTINet2, which comprises three key modules, namely TCM and clinical KGE, residual graph representation learning, and supervised target prediction.

  • We constructed a large-scale knowledge graph that covers the TCM properties and clinical treatment knowledge of herbs, and designed a deep knowledge embedding component to learn the KE vectors of herbs and targets.

  • We designed a residual-like graph convolution network to capture the deep interactions among herbs and targets, and a BPR loss to conduct supervised training and target prediction.

  • Comparison experiments indicated the excellent performance of HTINet2 (HR@10 increased by 122.7% and NDCG@10 by 35.7%), ablation experiments illustrated the positive effect of our designed modules of HTINet2, and the case study demonstrated the reliability of the predicted targets of Artemisia annua and Coptis chinensis based on knowledge base, literature, and molecular docking.

Abbreviations

Below are the abbreviations and full names used in this study.

  1. TCM: Traditional Chinese Medicine

  2. HTI: Herb–Target Interaction

  3. HTP: Herb–Target Prediction

  4. PPI: Protein–Protein Interaction

  5. TMKG: Knowledge Graph of TCM and Western Medicine

  6. IES: Inferred Evidence Score

  7. GO: Gene Ontology

  8. GCN: Graph Convolutional Network

  9. BPR: Bayesian Personalized Ranking

  10. KE: Knowledge Embedding

  11. KGE: Knowledge Graph Embedding

  12. HR: Hit Ratio

  13. NDCG: Normalized Discounted Cumulative Gain

  14. SC: Silhouette Score

Supplementary Material

Supplemental_materials_file_1_bbae414
Supplemental_materials_file_2_bbae414

Contributor Information

Pengbo Duan, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Kuo Yang, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Xin Su, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Shuyue Fan, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Xin Dong, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Fenghui Zhang, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Xianan Li, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Xiaoyan Xing, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China.

Qiang Zhu, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Jian Yu, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Xuezhong Zhou, Institute of Medical Intelligence, Department of Artificial Intelligence, Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer Science & Technology, Beijing Jiaotong University, Beijing 100044, China.

Funding

This work was partially supported by the National Natural Science Foundation of China (Nos. 82204941, 82174533, 82374302, and U23B2062), the National Key Research and Development Program (Nos. 2023YFC3502604 and 2021YFC1712901), the Natural Science Foundation of Beijing (No. L232033), the Key R&D Program of Ningxia Hui Autonomous Region (No. 2022BEG02036), and the Fundamental Research Funds for the Central Universities (No. 2022RC022).

Conflict of interest: None declared.

Data availability

The code of this study is available on the Github repository: https://github.com/2020MEAI/HTINet2.

References

  • 1. Cheung F. TCM: made in China. Nature 2011;480:S82–3. 10.1038/480S82a.
  • 2. Kong D-X, Li X-J, Zhang H-Y. Where is the hope for drug discovery? Let history tell the future. Drug Discov Today 2009;14:115–9. 10.1016/j.drudis.2008.07.002.
  • 3. Qiu J. ‘Back to the future’ for Chinese herbal medicines. Nat Rev Drug Discov 2007;6:506–7. 10.1038/nrd2350.
  • 4. Zhiguo X. Modernization: one step at a time. Nature 2011;480:S90–2.
  • 5. Bagherian M, Sabeti E, Wang K. et al. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief Bioinform 2021;22:247–69. 10.1093/bib/bbz157.
  • 6. Ezzat A, Min W, Li X-L. et al. Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief Bioinform 2019;20:1337–57. 10.1093/bib/bby002.
  • 7. Yang K, Zhou X, Zhang R. et al. Integrating herb effect similarity for network-based herb target prediction. In: 2015 8th International Conference on Biomedical Engineering and Informatics (BMEI). IEEE, Shenyang, China, 2015, pp. 483–8.
  • 8. Yang K, Liu G, Wang N. et al. Heterogeneous network propagation for herb target identification. BMC Med Inform Decis Mak 2018;18:27–37.
  • 9. Wang N, Li P, Hu X. et al. Herb target prediction based on representation learning of symptom related heterogeneous network. Comput Struct Biotechnol J 2019;17:282–90. 10.1016/j.csbj.2019.02.002.
  • 10. Zhang S, Yang K, Liu Z. et al. DrugAI: a multi-view deep learning model for predicting drug–target activating/inhibiting mechanisms. Brief Bioinform 2023;24:bbac526.
  • 11. Yang K, Yang Y, Fan S. et al. DRONet: effectiveness-driven drug repositioning framework using network embedding and ranking learning. Brief Bioinform 2023;24:bbac518.
  • 12. Lin X, Quan Z, Wang Z-J. et al. Effectively identifying compound-protein interaction using graph neural representation. IEEE/ACM Trans Comput Biol Bioinform 2022;20:932–43. 10.1109/TCBB.2022.3198003.
  • 13. Mohanraj K, Karthikeyan BS, Vivek-Ananth RP. et al. IMPPAT: a curated database of Indian medicinal plants, phytochemistry and therapeutics. Sci Rep 2018;8:1–17. 10.1038/s41598-018-22631-z.
  • 14. Vivek-Ananth RP, Mohanraj K, Sahoo AK. et al. IMPPAT 2.0: an enhanced and expanded phytochemical atlas of Indian medicinal plants. ACS Omega 2023;8:8827–45. 10.1021/acsomega.3c00156.
  • 15. Yang W, Zhang F, Yang K. et al. SymMap: an integrative database of traditional Chinese medicine enhanced by symptom mapping. Nucleic Acids Res 2019;47:D1110–7.
  • 16. Zhang Y, Wang N, Xia D. et al. SoFDA: an integrated web platform from syndrome ontology to network-based evaluation of disease-syndrome-formula associations for precision medicine. Science Bulletin 2022;67:1097–101. 10.1016/j.scib.2022.03.013.
  • 17. Szklarczyk D, Kirsch R, Koutrouli M. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 2023;51:D638–46. 10.1093/nar/gkac1000.
  • 18. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27–30. 10.1093/nar/28.1.27.
  • 19. Ashburner M, Ball CA, Blake JA. et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9. 10.1038/75556.
  • 20. Tang J, Qu M, Wang M. et al. LINE: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, Florence, Italy, 2015, pp. 1067–77.
  • 21. Cao S, Lu W, Xu Q. GraRep: learning graph representations with global structural information. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. Association for Computing Machinery, Melbourne, Australia, 2015, pp. 891–900.
  • 22. Grover A, Leskovec J. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, San Francisco, California, USA, 2016, pp. 855–64.
  • 23. Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, New York, New York, USA, 2014, pp. 701–10.
  • 24. Ou M, Cui P, Pei J. et al. Asymmetric transitivity preserving graph embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, San Francisco, California, USA, 2016, pp. 1105–14.
  • 25. Chen L, Wu L, Hong R. et al. Revisiting graph based collaborative filtering: a linear residual graph convolutional network approach. Proceedings of the AAAI Conference on Artificial Intelligence 2020;34:27–34. 10.1609/aaai.v34i01.5330.
  • 26. Salton G, Fox EA, Wu H. Extended Boolean information retrieval. Commun ACM 1983;26:1022–36. 10.1145/182.358466.
  • 27. Jaccard P. Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull Soc Vaudoise Sci Nat 1901;37:547–79.
  • 28. Ravasz E, Somera AL, Mongru DA. et al. Hierarchical organization of modularity in metabolic networks. Science 2002;297:1551–5. 10.1126/science.1073374.
  • 29. Leicht EA, Holme P, Newman MEJ. Vertex similarity in networks. Physical Review E 2006;73:026120. 10.1103/PhysRevE.73.026120.
  • 30. Adamic LA, Adar E. Friends and neighbors on the web. Social Networks 2003;25:211–30. 10.1016/S0378-8733(03)00009-1.
  • 31. Zhou T, Lü L, Zhang Y-C. Predicting missing links via local information. The European Physical Journal B 2009;71:623–30. 10.1140/epjb/e2009-00335-8.
  • 32. Vanunu O, Magger O, Ruppin E. et al. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol 2010;6:e1000641. 10.1371/journal.pcbi.1000641.
  • 33. Wu S, Sun F, Zhang W. et al. Graph neural networks in recommender systems: a survey. ACM Comput Surv 2022;55:1–37.
  • 34. Van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research 2008;9:2579–605.
  • 35. Zhang Y, Li X, Shi Y. et al. ETCM v2.0: an update with comprehensive resource and rich annotations for traditional Chinese medicine. Acta Pharmaceutica Sinica B 2023;13:2559–71. 10.1016/j.apsb.2023.03.012.
  • 36. Fang S, Dong L, Liu L. et al. HERB: a high-throughput experiment- and reference-guided database of traditional Chinese medicine. Nucleic Acids Res 2021;49:D1197–206. 10.1093/nar/gkaa1063.
  • 37. Gao P, Wang L-L, Liu J. et al. Dihydroartemisinin inhibits endothelial cell tube formation by suppression of the STAT3 signaling pathway. Life Sci 2020;242:117221. 10.1016/j.lfs.2019.117221.
  • 38. Ilamathi M, Santhosh S, Sivaramakrishnan V. Artesunate as an anti-cancer agent targets STAT-3 and favorably suppresses hepatocellular carcinoma. Curr Top Med Chem 2016;16:2453–63. 10.2174/1568026616666160212122820.
  • 39. Liu Q, Yang Z. Progress of anti-tumor activities of artemisinin and its derivatives. Chinese Bulletin of Life Sciences 2020;32:62–9.
  • 40. Na X, Zhou X, Wang S. et al. Artesunate induces SKM-1 cells apoptosis by inhibiting hyperactive β-catenin signaling pathway. Int J Med Sci 2015;12:524.
  • 41. Wang Q, Guo W-Y, Liu L-L. et al. Inhibitory effect of artesunate on bone destruction in rheumatoid arthritis: an exploration based on AhR/ARNT/NQO1 signaling pathway. China Journal of Chinese Materia Medica 2022;47:2698–704. 10.19540/j.cnki.cjcmm.20220110.401.
  • 42. Hu X, Fatima S, Chen M. et al. Dihydroartemisinin is potential therapeutics for treating late-stage CRC by targeting the elevated c-Myc level. Cell Death Dis 2021;12:1053. 10.1038/s41419-021-04247-w.
  • 43. Yang S-X, Xie S-S, Ma D-L. et al. Enhancement of interleukin-2 production and its mRNA expression by dihydroartemisinin. Acta Pharmacol Sin 1994;15:515–20.
  • 44. Tang W-K, Li Q-Y, Liu J. et al. Research progress on signaling pathway of berberine in relieving insulin resistance. Drugs & Clinic 2022;37:1409–13.
  • 45. Shou J-W, Li X-X, Tang Y-S. et al. Novel mechanistic insight on the neuroprotective effect of berberine: the role of PPAR for antioxidant action. Free Radical Biology and Medicine 2022;181:62–71. 10.1016/j.freeradbiomed.2022.01.022.
  • 46. Tang F, Wang D, Duan C. et al. Berberine inhibits metastasis of nasopharyngeal carcinoma 5-8F cells by targeting Rho kinase-mediated Ezrin phosphorylation at threonine 567. J Biol Chem 2009;284:27456–66. 10.1074/jbc.M109.033795.
  • 47. Huang X, Li Y-F, Kou G-J. et al. Research progress in mechanism of action of berberine on atherosclerosis. Drug Evaluation Research 2016;39:469–73.
  • 48. Lu K, Yang K, Sun H. et al. SympGAN: a systematic knowledge integration system for symptom–gene associations network. Knowledge-Based Systems 2023;276:110752.
  • 49. Wong F, Zheng EJ, Valeri JA. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 2024;626:177–85. 10.1038/s41586-023-06887-8.
  • 50. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems 2017;30:4765–74.



Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press
