Bioinformatics Advances. 2023 Aug 26;3(1):vbad116. doi: 10.1093/bioadv/vbad116

SAGDTI: self-attention and graph neural network with multiple information representations for the prediction of drug–target interactions

Xiaokun Li 1,2, Qiang Yang 3,4, Gongning Luo 5, Long Xu 6,7, Weihe Dong 8,9, Wei Wang 10, Suyu Dong 11, Kuanquan Wang 12, Ping Xuan 13,14, Xin Gao 15
Editor: Shanfeng Zhu
PMCID: PMC10818136  PMID: 38282612

Abstract

Motivation

Accurate identification of target proteins that interact with drugs is a vital step in silico, which can significantly foster the development of drug repurposing and drug discovery. In recent years, numerous deep learning-based methods have been introduced to treat drug–target interaction (DTI) prediction as a classification task. The output of this task is binary identification suggesting the absence or presence of interactions. However, existing studies often (i) neglect the unique molecular attributes when embedding drugs and proteins, and (ii) determine the interaction of drug–target pairs without considering biological interaction information.

Results

In this study, we propose an end-to-end attention-derived method based on the self-attention mechanism and graph neural network, termed SAGDTI. The aim of this method is to overcome the aforementioned drawbacks in the identification of DTI. SAGDTI is the first method to sufficiently consider the unique molecular attribute representations for both drugs and targets in the input form of the SMILES sequences and three-dimensional structure graphs. In addition, our method aggregates the feature attributes of biological information between drugs and targets through multi-scale topologies and diverse connections. Experimental results illustrate that SAGDTI outperforms existing prediction models, which benefit from the unique molecular attributes embedded by atom-level attention and biological interaction information representation aggregated by node-level attention. Moreover, a case study on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) shows that our model is a powerful tool for identifying DTIs in real life.

Availability and implementation

The data and code underlying this article are available on GitHub at https://github.com/lixiaokun2020/SAGDTI.

1 Introduction

Research on drug–target interaction (DTI) prediction is a crucial step in the repurposing of current drugs and the discovery of new drugs (Ashburn and Thor 2004, Rifaioglu et al. 2019, Zheng et al. 2020). Accurate identification of potential DTI greatly reduces the requirement of costly and time-consuming high-throughput screening (da Silva Rocha et al. 2019, Bagherian et al. 2021). However, it is impractical to rapidly distinguish every possible compound-target pair because of the large-scale chemical space of molecular compounds, proteins, and protein–ligand complexes (Maia et al. 2020, Zhao et al. 2022). Based on this observation, various computational methods have been introduced to identify potential DTI. In the incipient stage of using computational approaches for DTI identification, molecular simulation, and molecular docking are the mainstream methods (Csermely et al. 2013, Śledź and Caflisch 2018). These methods strongly depend on the three-dimensional (3D) structural information of proteins. However, the performance of these structure-based methods is limited due to the inadequate spatial structure for many proteins. In the past decade, machine learning-based approaches were introduced to overcome some of these difficulties in the process of identifying DTI. For example, Ezzat et al. (2017) establish a matrix factorization framework with k-nearest known neighbors for DTI prediction that incorporates both drug and target similarities. Conventional shallow machine learning-based methods, such as NRLMF (Liu et al. 2016) and KronRLS (Pahikkala et al. 2015), only considered the similarity features of drug–target pairs. Consequently, these approaches cannot fully exploit the comprehensive relationships between drugs and proteins.

With the rapid development of computing power and data mining, deep learning has been widely adopted in natural language processing and computer vision (Bharath Ramsundar et al. 2019, Hazra et al. 2021). Deep learning techniques have improved on traditional computational approaches because they can comprehensively extract latent feature representations, making DTI prediction a research hotspot (Ru et al. 2021). Deep learning methods for inferring DTI fall into two main types. One type handles a sequential input representation that transforms all feature information into a vector. For instance, DeepDTA (Öztürk et al. 2018) uses the Simplified Molecular Input Line Entry System (SMILES) strings of ligands and the amino acid sequences of proteins to predict the binding affinity of drug–target pairs through convolutional neural networks (CNNs). Similarly, molecular transformer-based models (Shin et al. 2019, Huang et al. 2021) were introduced to determine the high-dimensional structure of a molecule from its SMILES string and character-embedding sequence through the self-attention mechanism. However, these methods cannot model the potential relationships of compounds since the atoms' positional distribution and relative atoms are fixed, thereby limiting the prediction performance. A more flexible input representation of drug–target pairs is therefore required, and a second type of deep learning model, namely graph-based neural networks, is frequently discussed. The application of graph descriptions in DTI prediction involves treating atoms as nodes and chemical bonds as the corresponding edges. Typically, GraphDTA (Nguyen et al. 2021) and IGT (Liu et al. 2022) are outstanding predictors of DTI based on the graph representations of ligands and target receptors.
Regrettably, most graph neural network (GNN)-based methods exhibit poor performance in extracting the spatial information of proteins through amino acid sequence representations, which is a key factor in DTI determination. Proteins are composed of a large number of atoms and require a large-scale sparse 3D matrix to capture the entire constrained spatial structure. Hence, it is difficult to obtain an accurate high-resolution 3D protein structure. An alternative strategy was developed to handle this problem. In this strategy, the target proteins are expressed as a contact/distance map, which transforms the interaction among proteins into a matrix (Jiang et al. 2020, Zheng et al. 2020). However, this contact/distance map is based on a heuristic approach. Therefore, it provides only an abstraction of the true structure of proteins, which is generally determined by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy (Tradigo 2013).

Several previously reported prediction models convert multi-scale neighboring topologies and diverse connections (i.e. interactions, associations, and similarities) among biological entities as feature representations to predict DTI. For example, GCDTI was proposed (Xuan et al. 2022a) to capture and fuse the neighboring topologies and diverse connections information based on three-way GCNs and CNN with multi-level attention. Similarly, Xuan et al. (2022b) used graph convolutional and variational autoencoders to encode multiple pairwise representations between a drug and its targets, such as attention-enhanced topology, attribution, and distribution. Unfortunately, the precision of these models in DTI prediction is relatively low because they ignored the molecular and chemical features of drug-protein pairs. Furthermore, existing deep learning methods always simply construct the interaction of drugs toward related targets by concatenating feature representations that cannot sufficiently express the interactions. Due to space constraints, we included more relevant studies in the Supplementary Materials.

Considering the above information, in this study, an end-to-end attention-derived method, SAGDTI, is introduced. SAGDTI accepts three input representations from multiple molecular and biological data sources: the SMILES strings of drugs with relative atom distances, the graph embedding of proteins that contains spatial information on binding sites, and graph-based interaction information from a biological heterogeneous network. First, we introduce a molecular transformer module based on multi-head self-attention (MSA) to extract the feature embeddings of SMILES sequences and protein graphs. The feature representations of each drug–target pair are subsequently transformed into an interaction pairing map, which is fed into CNNs to represent the molecular attributes at the atom level. Second, graph attention networks (GAT) are constructed to capture the interaction information that covers the multi-scale topologies and diverse connections in the form of a heterogeneous network (matrix) via graph-based attention. Finally, we use convolutional-pooling (C-P) networks to weight the two obtained attribute representations and fully connected neural networks (FNNs) to predict the interaction score for drug–target pairs.

The main contributions of our study are summarized below.

  • To accurately capture the unique molecular information of both compounds and receptors, a molecular transformer module is designed to improve the relative element information among atoms in drug compounds and 3D structural features of target proteins’ binding pockets.

  • Our proposed model is the first DTI prediction method to transform the input representation of molecular drugs and targets into totally different forms. Moreover, it aggregates the attributes of interaction information from biological entities based on graph attention networks.

  • Owing to its self-attention, the proposed method is attention-derived and offers high interpretability when aggregating the attribute representations from both molecular and biological feature information.

  • To the best of our knowledge, our experimental results are state-of-the-art (SoTA) among feature representation learning models for DTI prediction, as confirmed using three benchmark datasets.

2 Methods

In this study, we regarded the problem of predicting DTI as a binary classification issue. We model an objective function F(·) with multiple information representations for drugs and targets to forecast the presence or absence of DTI. The visualized framework of SAGDTI is shown in Fig. 1. Our model consists of four main parts: molecules input embedding; molecular transformer module; biological interaction information aggregating module and potential DTI classification module. Specifically, in molecule input embedding (Fig. 1B), we transformed the unique molecular information of drugs and proteins into different input forms. The SMILES strings of small compounds were embedded as sequential vectors that cover long-distance relative position information among atoms using the strategy introduced in (Zeng et al. 2021). Inspired by the graph input representation of proteins (Yazdani-Jahromi et al. 2022), the protein pockets with binding sites that interact with compounds were embedded as 3D graphs. Feature representations of drugs and proteins were subsequently fused to achieve the latent drug–target complex attributes that include their unique characteristics through the molecular transformer (Fig. 1C). Next, we modeled and aggregated the biological attributes of interaction information between drugs and targets by using GATs (Fig. 1D). Finally, the two obtained attribute representations were incorporated and then fed into C-Ps and FNNs for DTI prediction (Fig. 1E).

Figure 1.

The framework of our proposed model. (A) Open-source data for drug SMILES strings and 3D structures of proteins. (B) Molecules input embedding module that identifies the noncovalent intermolecular attributes of drugs and proteins, and subsequently represents the features in the forms of sequences and graphs. (C) Molecular transformer module, in which we construct sequential transformer encoders to learn the unique feature representations of drugs SMILES and the binding sites of proteins, transform them into an interaction pairing map, and then feed them into a convolutional layer to extract molecular attributes. (D) Biological interaction information aggregating module that models the multi-scale topologies and diverse connections; the biological attributes are represented using a graph attention network. (E) Potential DTI classification module, where we predict unknown interactions in a drug–target pair.

2.1 Molecules input embedding

2.1.1 Drug SMILES strings

Small-molecule drugs are expressed as SMILES strings. Motivated by previous research (Zeng et al. 2021, Li et al. 2022), the SMILES strings were converted to continuous numerical vectors representing atoms and chemical information (e.g. $\{C, C, l, \ldots, F, ), C\} \rightarrow \{42, 42, 25, \ldots, 11, 31, 42\}$). For a given drug:

$$D=\{a_1^d, a_2^d, \ldots, a_i^d, \ldots, a_{N^*}^d\} \tag{1}$$

where $a_i$ is the $i$th atom and $N^*$ is a flexible sequence length that depends on the number of atoms in the compound. In this study, we restrict the maximal SMILES sequence length to a hyperparameter $d$. According to the traditional transformer model (Vaswani et al. 2017), the initial input is an embedding vector composed of a tokenized symbol and a position signal, which can be formulated as:

$$X^D=E_T^D+E_P^D \tag{2}$$

where $E_T^D\in\mathbb{R}^{d\times\varpi_d}$ and $E_P^D\in\mathbb{R}^{d\times\varpi_d}$ represent the token embedding and position embedding, respectively, $\varpi_d$ is the hidden feature dimension of atom $a_i$, and $X^D$ denotes the embedding vector of the atom sequence. To model the relative distance between atoms, we describe the unique molecular characteristics using $m$ kinds of relative correlations between atoms. Inspired by Shaw et al. (2018), the input embedding of drugs with spatial information, $D^{In}$, can be mathematically expressed as:

$$D^{In}=\mathrm{SoftMax}\left(\frac{(X^D W^Q)(X^D W^K)^T+A^R}{\sqrt{\varpi_d}}\right)(X^D W^V) \tag{3}$$

$$A_{ij}^R=(X_i^D W^Q)\left(W^R_{\min(k,|i-j|)}\right) \tag{4}$$

where $W^Q, W^K, W^V\in\mathbb{R}^{\varpi_d\times\varpi_d}$ denote the parameter matrices of the query, key and value in the attention layer. $A^R\in\mathbb{R}^{d\times d}$ is the relative-aware relationship matrix between atoms and $X_i^D$ is the $i$th atom embedding in $X^D$. $W^R\in\mathbb{R}^{m\times\varpi_d}$ contains the learnable parameters for the $m$ kinds of relative relationship between atoms, and the virtual “nodes” (e.g. @, =, [, etc.) in $W^R$ are set to zero. Thus, when we model the relative distance between atoms, the distance between special characters in SMILES strings is 0.
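As an illustration of Equations (3) and (4), the following NumPy sketch computes single-head self-attention with a clipped relative-distance term on a toy sequence. It is not the authors' implementation: the function name `relative_attention`, the random weights, the toy dimensions, and the choice of one row of the relative-relationship matrix per clipped distance 0..k are all assumptions made for illustration. The relative term is computed as a dot product so that each entry of the relationship matrix is a scalar.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_attention(X, WQ, WK, WV, WR, k):
    """Single-head attention with a clipped relative-distance bias.

    X  : (d, w) token + position embeddings, as in Eq. (2)
    WR : (k + 1, w) one learnable row per clipped distance 0..k (assumption)
    """
    d, w = X.shape
    Q, K, V = X @ WQ, X @ WK, X @ WV
    # Clipped distance min(k, |i - j|) for every pair of positions.
    pos = np.arange(d)
    idx = np.minimum(k, np.abs(pos[:, None] - pos[None, :]))
    # A[i, j] = Q[i] . WR[idx[i, j]]  (relative-aware term, Eq. 4)
    A = np.einsum("iw,ijw->ij", Q, WR[idx])
    scores = (Q @ K.T + A) / np.sqrt(w)      # Eq. (3), inside the SoftMax
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
d, w, k = 6, 8, 3  # toy sizes, not the paper's hyperparameters
X = rng.normal(size=(d, w))
WQ, WK, WV = (rng.normal(size=(w, w)) for _ in range(3))
WR = rng.normal(size=(k + 1, w))
D_in = relative_attention(X, WQ, WK, WV, WR, k)
print(D_in.shape)  # (6, 8)
```

Setting every row of `WR` to zero removes the relative term, and the function reduces to ordinary scaled dot-product attention. This mirrors the role the zeroed rows play for special SMILES characters in the text.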

2.1.2 Protein graphs

We utilized the spatial structures of proteins collected from the Protein Data Bank (PDB) (Berman et al. 2000). This is a worldwide database that provides experimental evidence for proteins (e.g. through X-ray diffraction, cryo-electron microscopy, NMR). We sought to capture the binding sites of protein pockets from 3D structures. For this purpose, we adopted the algorithm introduced by Saberi Fathi and Tuszynski (2014) to transform the bounding box coordinates of each protein’s binding site into a set of peptide fragments. For a given protein:

$$P=\{b_1^p, b_2^p, \ldots, b_j^p, \ldots, b_{M^*}^p\} \tag{5}$$

where $b_j^p\in\mathbb{R}^{p\times\upsilon}$ is the $j$th binding site in the protein, and $p$ and $\upsilon$ are the maximum number of binding sites and the maximum number of atoms in a binding site, respectively. $M^*$ denotes the varying number of protein binding sites. Once the binding sites of a protein were recognized, we drew individual graphs to represent each atom $a^p$ (node) and the relationships between atoms (edges, denoted by $e^p$) for each binding site. For each atom $a^p$, we created a feature vector of size $k$ (see Supplementary Table S1 for details), which comprised the atom symbol, degree, electrical charge, hydrogen count, etc. Thus, each binding site graph’s node matrix is denoted as $X^P\in\mathbb{R}^{\upsilon\times k}$. Moreover, we use a simple linear transformation to encode $P^{In}=W^P(X^P)^T$, leading to a real-valued dense matrix $P^{In}\in\mathbb{R}^{\upsilon\times D_p}$ as the input graph embedding, where $D_p$ is the high-level feature dimension.

2.2 Molecular transformer module

In recent years, many attention-based algorithms have been developed to address the identification of drug-related targets, such as transformer (Vaswani et al. 2017) and BERT (Kang et al. 2022). Inspired by previous studies (Shin et al. 2019, Huang et al. 2021), we introduce a molecular transformer module to extract the feature representations of drugs and proteins, which cover their unique molecular characteristics.

Specifically, we fed the drug embedding representation $D^{In}$ into an MSA layer to acquire the attention scores of each atom in the drug. We subsequently passed the output of the MSA layer into an FNN layer; both the MSA and FNN layers are combined with a residual operation (He et al. 2015) and layer normalization. The structure of the molecular transformer encoder is shown in Supplementary Fig. S1. Mathematically, the drug feature representation can be formulated as:

$$MTs(D^{In})=\mathrm{FNN}(X^{MSA}+D^{In}) \tag{6}$$

$$X^{MSA}=\mathrm{Concate}(head_1, head_2, \ldots, head_m)W^o \tag{7}$$

$$head_i=\mathrm{Att}(Q,K,V) \tag{8}$$

$$\mathrm{Att}(Q,K,V)=\mathrm{SoftMax}\left(\frac{QK^T}{\sqrt{e_d}}\right)V \tag{9}$$

where $Q\in\mathbb{R}^{p\times e_d}$, $K\in\mathbb{R}^{p\times e_d}$ and $V\in\mathbb{R}^{q\times e_v}$ describe the query and the key-value pairs used by the attention function $\mathrm{Att}(\cdot)$. $head_i$ and $W^o$ are the $i$th head and the learnable parameter matrix in the MSA layer, respectively. $X^{MSA}$ is the output feature of the MSA layer. The molecular transformer encoders $MTs$ were stacked using the residual operation to enhance molecular feature extraction, which is expressed as:

$$D^{R_L}=MTs(D^{R_{L-1}})+D^{R_{L-1}} \tag{10}$$

where $D^{R_{L-1}}$ is the input of the $L$th molecular transformer encoder; when $L-1=0$, $D^{R_{L-1}}$ equals $D^{In}$. Similarly, we model the latent feature representation of proteins, $P^{R_L}$, from the molecular graph embedding $P^{In}$ using the transformer encoders.
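To make the stacking in Equations (6) to (10) concrete, here is a minimal NumPy sketch of a multi-head encoder applied twice with the outer residual link of Equation (10). It makes several assumptions that the text does not specify: a two-layer feed-forward network with ReLU, the placement of layer normalization after each residual sum, random weights, and toy sizes. The helper names `encoder` and `init` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def attention(Q, K, V):
    # Eq. (9): scaled dot-product attention.
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def encoder(D, params):
    """One encoder: multi-head self-attention followed by a feed-forward layer."""
    heads = [attention(D @ WQ, D @ WK, D @ WV) for WQ, WK, WV in params["heads"]]
    x_msa = np.concatenate(heads, axis=-1) @ params["Wo"]    # Eq. (7)
    h = layer_norm(x_msa + D)                                 # residual + norm
    ffn = np.maximum(0.0, h @ params["W1"]) @ params["W2"]    # assumed ReLU FNN
    return layer_norm(ffn + h)

def init(rng, w, n_heads, hidden):
    e = w // n_heads  # per-head feature size
    return {
        "heads": [tuple(rng.normal(scale=0.1, size=(w, e)) for _ in range(3))
                  for _ in range(n_heads)],
        "Wo": rng.normal(scale=0.1, size=(w, w)),
        "W1": rng.normal(scale=0.1, size=(w, hidden)),
        "W2": rng.normal(scale=0.1, size=(hidden, w)),
    }

rng = np.random.default_rng(0)
D = rng.normal(size=(5, 16))            # 5 tokens, 16 features (toy sizes)
layers = [init(rng, 16, 4, 32) for _ in range(2)]
for p in layers:
    D = encoder(D, p) + D               # Eq. (10): stacked encoders with residual
print(D.shape)  # (5, 16)
```

The same routine could be applied to the protein graph embedding, since the text describes the protein branch as using the same encoders.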

To further learn the molecular attributes of drug-protein pairs at the atom level, the two obtained feature representations for a given drug and its binding sites were converted into an interaction pairing map. Specifically, for each atom $a_i^d$ in a given drug and atom $a_j^p$ in a target protein, we constructed pairwise interactions via the following formula:

$$I^M=\Lambda(D^{R_L},P^{R_L}) \tag{11}$$

where $I^M$ is the interaction pairing map and $\Lambda(\cdot)$ denotes the scalar product operation. Scalar products provide an interpretable way of measuring the intensity of interaction between individual drug-protein pairs at the atom level. In $I^M$, a value close to 1 indicates that the corresponding pair of atoms is very likely to bind, whereas a value close to 0 indicates no binding. Thus, by utilizing this map, we explicitly determined the atoms of drug-protein pairs that contributed to the interaction result.

A CNN layer was used to extract the adjacent information of neighboring atoms for DTI in $I^M$. Let $W^{Conv}\in\mathbb{R}^{n_w\times n_l}$ be the convolutional filter, where $n_w$ and $n_l$ correspond to the width and length of the filter, respectively. There are $n_f$ channels to describe the molecular attributes of a drug and its binding protein in the CNN layer. Zero padding is used to preserve the marginal information of the atom binding features. The molecular attribute representations $X^{Mol}$ between drugs and targets are then obtained via a flattening operation.

$$X^{Mol}=\mathrm{Flatten}(\mathrm{Conv}(W^{Conv}I^M,n_f)) \tag{12}$$
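The pairing map of Equation (11) and the convolution and flattening of Equation (12) can be sketched as follows. This is a toy NumPy illustration under stated assumptions: random feature matrices stand in for the encoder outputs, the convolution is a plain single-channel cross-correlation with zero padding written out as loops, and three random 3×3 filters play the role of the channels. The function names `pairing_map` and `conv2d_same` are hypothetical.

```python
import numpy as np

def pairing_map(D, P):
    # Eq. (11): entry (i, j) is the scalar product of drug atom i and protein atom j.
    return D @ P.T

def conv2d_same(img, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
D = rng.normal(size=(4, 8))             # 4 drug atoms, 8 features (toy sizes)
P = rng.normal(size=(6, 8))             # 6 binding-site atoms, 8 features
IM = pairing_map(D, P)                  # shape (4, 6)
filters = rng.normal(size=(3, 3, 3))    # 3 channels, each a 3x3 filter
# Eq. (12): convolve with every filter, then flatten into one vector.
X_mol = np.concatenate([conv2d_same(IM, f).ravel() for f in filters])
print(IM.shape, X_mol.shape)  # (4, 6) (72,)
```

Because of the zero padding, each channel keeps the 4×6 shape of the pairing map, so atoms at the border of the map still contribute a full set of features.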

2.3 Biological interaction information aggregating module

In this module, we introduced GATs to model the multi-scale neighboring topologies and diverse connections from biological interaction information to improve the performance of DTI prediction. Generally, existing DTI prediction models only consider the noncovalent intermolecular interactions among SMILES strings and amino acid sequences. When unexpected noise occurs, their accuracy may drop significantly because the biological feature information of drug-protein pairs is unavailable. Therefore, we incorporated the biological characteristics of drug-protein pairs to increase the stability and robustness of the model. Numerous studies have demonstrated that the biological interaction information of drugs and targeted proteins plays an important role in predicting DTI (Peng et al. 2021, Zhao et al. 2021). The schematic of the GAT mechanism is shown in Supplementary Fig. S2.

Specifically, we first constructed a heterogeneous network, which contained all interaction information of multi-scale topologies and diverse correlations at the node level. We converted the heterogeneous network into an undirected graph $G=(V,E)$, where $V$ represents the different nodes and $E$ represents the different edges between nodes. Afterward, the GAT layer was used to extract the input representation $h$ of each node:

$$h=\{h_1, h_2, \ldots, h_i, \ldots, h_{K^*}\},\quad h_i\in\mathbb{R}^Z \tag{13}$$

where $h_i$ and $K^*$ are the $i$th node and the number of adjacent nodes, and $Z$ denotes the feature dimension. For each node $h_i$, we used masked attention to calculate the attention coefficients, which means we only focus on the node $h_i$ and its first-order neighboring nodes $h_j$, $j\in N_i$:

$$e_{ij}=\beta\left(\alpha^T\left[Wh_i\,\Vert\,Wh_j\right]\right) \tag{14}$$

where $e_{ij}$ and $W$ denote the attention coefficients and the weight matrix of the node feature transformation, respectively. The weights in the GAT layer are shared, which greatly improves computational efficiency. The symbol $\Vert$ denotes the concatenation of the two node representations, and $\alpha\in\mathbb{R}^{2Z}$ is a weight vector. $\beta(\cdot)$ is the LeakyReLU activation function with a negative input slope of 0.2. The value of $e_{ij}$ reveals the importance of $h_j$ to $h_i$. To better allocate the weights and calculate the correlation strength between the central node $h_i$ and all its neighbors, the attention coefficients are then normalized using a SoftMax function:

$$a_{ij}=\mathrm{SoftMax}_j(e_{ij}) \tag{15}$$

We sought to represent the interaction features more comprehensively between nodes. For this purpose, we also adopted the MSA strategy when normalizing the attention coefficients by using a nonlinear activation function. The output feature can be formulated as:

$$h_i=\mathop{\Vert}\limits_{k=1}^{Head}\beta\left(\sum_{j\in N_i}a_{ij}^k W^k h_j\right) \tag{16}$$

where $k$ denotes the $k$th head of the attention coefficients $a_{ij}^k$.

After extracting the complex feature representation of interaction information between a drug and its target through consecutive GAT layers, we then utilized global max pooling to output the biological attribute representation vector. The output representation $X^{Bio}$ is expressed as:

$$X^{Bio}=\mathrm{MaxPooling}(h_i) \tag{17}$$
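Equations (14) to (17) can be illustrated with a small multi-head graph attention layer followed by global max pooling. The NumPy sketch below is an assumption-laden toy rather than the authors' code: the graph is a five-node path, self-loops are included in each neighborhood, the weights are random, and the names `gat_head` and `gat_layer` are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # For 0 < slope < 1 this equals LeakyReLU with the given negative slope.
    return np.maximum(x, slope * x)

def gat_head(H, adj, W, a):
    """One attention head over a graph with node features H of shape (K, Z_in).

    adj : (K, K) boolean adjacency with self-loops (assumption)
    W   : (Z_in, Z) shared node transformation
    a   : (2 * Z,) attention weight vector
    """
    Wh = H @ W
    Z = Wh.shape[1]
    # a^T [Wh_i || Wh_j] splits into a source part and a neighbour part (Eq. 14).
    e = leaky_relu((Wh @ a[:Z])[:, None] + (Wh @ a[Z:])[None, :])
    e = np.where(adj, e, -np.inf)              # masked attention: neighbours only
    e = e - e.max(axis=1, keepdims=True)
    alpha = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)   # Eq. (15)
    return leaky_relu(alpha @ Wh)              # weighted neighbour sum, then beta

def gat_layer(H, adj, heads):
    # Eq. (16): concatenate the outputs of all heads.
    return np.concatenate([gat_head(H, adj, W, a) for W, a in heads], axis=1)

rng = np.random.default_rng(0)
K, Z_in, Z, n_heads = 5, 4, 3, 2               # toy sizes
H = rng.normal(size=(K, Z_in))
adj = np.eye(K, dtype=bool)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:  # toy undirected path graph
    adj[i, j] = adj[j, i] = True
heads = [(rng.normal(size=(Z_in, Z)), rng.normal(size=2 * Z)) for _ in range(n_heads)]
H_out = gat_layer(H, adj, heads)               # shape (5, 6)
X_bio = H_out.max(axis=0)                      # Eq. (17): global max pooling
print(H_out.shape, X_bio.shape)  # (5, 6) (6,)
```

Splitting the weight vector into two halves avoids building every concatenated pair explicitly. It gives the same scores as applying the vector to the concatenation of the two transformed node features.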

2.4 DTI classification module

Most existing methods simply concatenate the learned feature representations or pass them through several FNN layers. Such approaches cannot fully learn the real-world interaction knowledge between a drug and its binding target. Inspired by the DTI feature modeling of Zeng et al. (2021), C-P and FNN layers are applied to map the extracted features into a final classification output. First, the molecular attribute representations $X^{Mol}$ and the biological attribute representations $X^{Bio}$ were concatenated side by side to form the ultimate feature representation $X^{Ult}$.

$$X^{Ult}=[X^{Mol};X^{Bio}] \tag{18}$$

Second, the fused attribute representation $X^{Ult}$ of a drug–target pair was fed into C-P and FNN layers to learn the whole DTI features, and a Sigmoid function was used to output the final interaction score:

$$Y^{Out}=S(\mathrm{FNN}(\mathrm{Pool}(\mathrm{Conv}(W^{Conv}X^{Ult},n_f)))) \tag{19}$$

where $W^{Conv}$ and $n_f$ denote the convolutional kernel and the corresponding channels, respectively, and $S$ represents the Sigmoid function, $S(x)=\frac{1}{1+\exp(-x)}$. We applied the Adam algorithm to optimize the objective and used the binary cross-entropy (BCE) loss function to minimize the difference between the ground truth and the predicted probability, which can be calculated as:

$$Loss=-\frac{1}{\lambda}\sum_{i=1}^{\lambda}\left[Y_i^*\cdot\log(Y_i^{Out})+(1-Y_i^*)\cdot\log(1-Y_i^{Out})\right] \tag{20}$$

where $\lambda$ is the number of training samples and $Y_i^*$ represents the real label of a drug–target pair. A value of $Loss$ close to 0 indicates that the SAGDTI model predicts the DTI with high accuracy.
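For readers who want to check the loss numerically, here is a minimal pure-Python sketch of the Sigmoid function and the mean binary cross-entropy of Equation (20). The logits and labels are made-up values, and the small clipping constant `eps` is an added safeguard against taking the logarithm of zero, not something stated in the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy as in Eq. (20); eps guards against log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

logits = [2.0, -1.0, 0.5, -3.0]   # hypothetical FNN outputs before the Sigmoid
labels = [1, 0, 1, 0]             # hypothetical ground-truth interactions
probs = [sigmoid(z) for z in logits]
print(round(bce_loss(labels, probs), 4))
```

A classifier that outputs 0.5 for every pair has a loss of log 2 (about 0.693), which is a useful reference point when reading training curves.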

3 Results

3.1 Benchmark datasets

In this study, we evaluated the merit of our proposed model using three benchmark datasets: the BindingDB dataset (Gilson et al. 2016), the kinase dataset Davis (Davis et al. 2013), and the Kinase Inhibitor BioActivity (KIBA) dataset (Tang et al. 2014), which have been widely used in previous studies (Cheng et al. 2022, Li et al. 2022). The SMILES strings of molecular drugs are collected from the PubChem database (Kim et al. 2021) based on their PubChem IDs, the 3D structure information of proteins is derived from the PDB database (Berman et al. 2000), and the interaction information of drug–target pairs is obtained according to prior work (Zhao et al. 2021). When the structure information of a protein is not available in the PDB database, we remove that protein so that the model can fully learn the interaction features of drug-protein pairs. Table 1 summarizes the statistical information for these datasets. The datasets are preprocessed to filter out invalid DTIs (see Supplementary Materials).

Table 1.

Statistics of the three benchmark datasets: BindingDB dataset, Davis and KIBA.

Datasets | Metric | Target | Drug | Interaction
BindingDB | KD | 587 | 6704 | 31 239
BindingDB | KI | 1225 | 126 122 | 249 598
BindingDB | IC50 | 2582 | 407 462 | 653 344
BindingDB | EC50 | 699 | 75 142 | 105 364
Davis | KD | 442 | 68 | 30 056
KIBA | KIBA score | 229 | 2111 | 118 254

3.2 Evaluation criteria

To estimate the utility of our DTI prediction model, we used five metrics in the experiments: area under the receiver operating characteristic curve (AUROC), area under the precision–recall curve (AUPR), Matthews correlation coefficient (MCC) (He et al. 2022), F1-score, and balanced accuracy (B.Acc, composed of sensitivity and specificity). The mathematical formulas of these five metrics are given in the Supplementary Materials.
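As a quick reference for the three threshold-based metrics, the sketch below computes MCC, F1-score and balanced accuracy from confusion-matrix counts. It uses the standard textbook definitions, which are assumed here to match the formulas in the Supplementary Materials. The labels and predictions are made-up toy values.

```python
import math

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1_score(tp, tn, fp, fn):
    return 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0

def balanced_accuracy(tp, tn, fp, fn):
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return (sensitivity + specificity) / 2

y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # toy ground truth
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]  # toy predictions
counts = confusion(y_true, y_pred)  # (2, 4, 1, 1)
print(round(mcc(*counts), 3), round(f1_score(*counts), 3),
      round(balanced_accuracy(*counts), 3))  # 0.467 0.667 0.733
```

Balanced accuracy averages sensitivity and specificity, so it is less affected by class imbalance than plain accuracy.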

3.3 Experiments setup

To comprehensively evaluate the performance of DTI prediction, we adopted three experimental settings, i.e. a new-target setting, a new-drug setting, and a pairwise setting. In the new-target setting, targets are randomly split into five equal-sized subsets; four of these are regarded as inference targets, while the remaining subset is treated as test targets. During training, the SAGDTI model learns from the inference data, which contain the inference targets and all the drugs. Subsequently, the trained model is used to predict the pairwise interactions between the test targets and all the drugs. The new-drug setting is analogous to the new-target setting: we randomly divide the drugs into five groups, and the SAGDTI model is trained and applied to predict the interactions between new drugs and all the targets. In the pairwise setting, the known DTIs are treated in the same way as the drugs or targets and are grouped into five equal folds.
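The three settings differ only in what is held out: targets, drugs, or individual pairs. The following pure-Python sketch shows one way such splits could be produced. The function names `entity_folds` and `split_pairs`, the tuple layout `(drug, target, label)`, the fixed seed, and the toy data are all hypothetical choices for illustration.

```python
import random

def entity_folds(entities, n_folds=5, seed=0):
    """Shuffle the unique entities and deal them into n_folds near-equal groups."""
    items = sorted(set(entities))
    random.Random(seed).shuffle(items)
    return [items[i::n_folds] for i in range(n_folds)]

def split_pairs(pairs, setting, test_fold=0, n_folds=5, seed=0):
    """pairs: list of (drug, target, label) tuples. Returns (train, test)."""
    if setting == "pairwise":
        # Hold out individual drug-target pairs.
        folds = entity_folds(range(len(pairs)), n_folds, seed)
        held = set(folds[test_fold])
        test = [p for i, p in enumerate(pairs) if i in held]
        train = [p for i, p in enumerate(pairs) if i not in held]
        return train, test
    # Hold out whole drugs (column 0) or whole targets (column 1).
    col = {"new-drug": 0, "new-target": 1}[setting]
    folds = entity_folds([p[col] for p in pairs], n_folds, seed)
    held = set(folds[test_fold])
    test = [p for p in pairs if p[col] in held]
    train = [p for p in pairs if p[col] not in held]
    return train, test

# Hypothetical toy data: 10 drugs x 10 targets with dummy labels.
pairs = [(f"D{i}", f"T{j}", (i + j) % 2) for i in range(10) for j in range(10)]
train, test = split_pairs(pairs, "new-target")
print(len(train), len(test))  # 80 20
```

In the new-target split, no target in the test set ever appears in training, which is what makes this setting harder than the pairwise one.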

Similar to Co-VAE, we split the dataset into training data and testing data at a 5:1 ratio. The training data were divided into a training set and a validation set to identify the optimal parameters through a 5-fold cross-validation strategy. For each setting, we applied the trained SAGDTI model to the test set and repeated this process five times to obtain the final mean and standard deviation. We adopted identical settings for an impartial comparison between our proposed model and existing methods. The hyperparameters used to run the SAGDTI model are listed in Table 2 (see Supplementary Materials for training details).

Table 2.

Hyperparameter settings of the SAGDTI model.

Parameters | Range
Max length (drug) | 150
Max length (target) | 1200
Attention head in MTs (drug) | [4–12]
Attention head in MTs (target) | [4–12]
Attention head in GAT | [8–12]
Filter size | 3 × 3
Number of filters | [32, 64, 96]
Kernel size of the pooling layer | 2 × 2
Dropout | [0.1–0.6]
Optimizer | Adam
Learning rate | [0.1, 0.01, 0.001, 0.0001]
Epoch | [30, 50, 100]
Batch size | [32, 64, 128, 256]
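The discrete candidate values in Table 2 can be written down as a small search space. The sketch below enumerates their combinations with `itertools.product`. The text does not state how the candidates were actually searched, so exhaustive enumeration is only an assumption, and the dictionary keys are hypothetical names. Ranges given without a step size (attention heads, dropout) are left out.

```python
from itertools import product

# Discrete values copied from Table 2; ranges such as dropout [0.1-0.6] are
# omitted because their step size is not stated.
grid = {
    "num_filters": [32, 64, 96],
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
    "epochs": [30, 50, 100],
    "batch_size": [32, 64, 128, 256],
}

def configurations(grid):
    """Yield one dict per combination of candidate values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(configurations(grid))
print(len(configs))  # 144
```

Even this reduced grid yields 144 configurations, which gives a sense of the cost of tuning under 5-fold cross-validation.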

3.4 Comparison with SoTA methods

The performance of the proposed SAGDTI for DTI prediction was compared with several SoTA methods, including DDR (Olayan et al. 2018), DeepDTA (Öztürk et al. 2018), GraphDTA (Nguyen et al. 2021), IGT (Liu et al. 2022), Moltrans (Huang et al. 2021), AttentionSiteDTI (Yazdani-Jahromi et al. 2022), MATTDTI (Zeng et al. 2021), and Co-VAE (Li et al. 2022). For a fair comparison, we used the same experimental settings for all the models and the original hyperparameters reported in the corresponding publications. Details of the compared models can be found in the Supplementary Materials.

We first used the BindingDB dataset to evaluate the SAGDTI model by drawing the receiver operating characteristic curve and the precision–recall curve, which record the highest performance in the experimental results. As plotted in Fig. 2, SAGDTI achieved the best AUROC and AUPR values compared with other advanced methods. Specifically, SAGDTI obtained the best performance with 0.967 (AUROC) and 0.917 (AUPR), which are 0.008 and 0.017 higher than those of the second-best model, respectively. These results indicate that SAGDTI is well suited to inferring the interactions of drug–target pairs. In the BindingDB dataset, the training set and test set were extremely unbalanced, indicating that the AUPR metric is more credible for expressing the prediction performance. The results for the three experimental settings are presented in Table 3. According to the AUROC performance, SAGDTI achieved the best values and was 1.8%, 0.97%, and 1.9% better than other prediction models in the three settings (i.e. new-target setting, new-drug setting and pairwise setting). Furthermore, we computed the average performance of the five metrics over the three settings on the BindingDB dataset (Table 4). This demonstrates that SAGDTI is competent in DTI prediction, as reflected by the average metric values obtained under the three experimental settings: AUROC (0.967), AUPR (0.914), MCC (0.649), F1-score (0.861), and B.Acc (0.901).

Figure 2.

Comparison of the SAGDTI with eight SoTA models in AUROC and AUPR metrics. Left: ROC curves of different DTI models. Right: PR curves of different DTI models.

Table 3.

Comparison of SAGDTI with other eight SoTA methods in AUROC and AUPR metrics under three different experimental conditions, using three benchmark datasets.a

Datasets Methods New-target
New-drug
Pairwise
AUROC (std) AUPR (std) AUROC (std) AUPR (std) AUROC (std) AUPR (std)
BindingDB DDR 0.834 (0.033) 0.747 (0.051) 0.827 (0.037) 0.725 (0.062) 0.823 (0.031) 0.719 (0.054)
DeepDTA 0.868 (0.007) 0.823 (0.011) 0.864 (0.006) 0.776 (0.0011) 0.842 (0.006) 0.811 (0.009)
GraphDTA 0.875 (0.008) 0.835 (0.0010) 0.852 (0.009) 0.783 (0.013) 0.851 (0.006) 0.829 (0.009)
Moltrans 0.893 (0.011) 0.842 (0.015) 0.878 (0.012) 0.812 (0.013) 0.858 (0.009) 0.847 (0.011)
IGT 0.912 (0.007) 0.855 (0.011) 0.891 (0.009) 0.833 (0.015) 0.887 (0.006) 0.852 (0.009)
AttentionSiteDTI 0.937 (0.005) 0.883 (0.009) 0.912 (0.006) 0.853 (0.011) 0.908 (0.005) 0.871 (0.007)
MATTDTI 0.928 (0.004) 0.875 (0.008) 0.907 (0.005) 0.864 (0.007) 0.906 (0.003) 0.869 (0.005)
Co-VAE 0.955 (0.002) 0.897 (0.004) 0.921 (0.001) 0.893 (0.002) 0.914 (0.002) 0.884 (0.003)
SAGDTI 0.967 (0.001) 0.914 (0.002) 0.934 (0.002) 0.886 (0.002) 0.946 (0.001) 0.901 (0.001)
Davis DDR 0.784 (0.035) 0.451 (0.067) 0.763 (0.033) 0.407 (0.056) 0.739 (0.023) 0.411 (0.049)
DeepDTA 0.799 (0.013) 0.526 (0.021) 0.773 (0.015) 0.491 (0.022) 0.752 (0.011) 0.485 (0.018)
GraphDTA 0.816 (0.015) 0.547 (0.026) 0.802 (0.011) 0.514 (0.019) 0.783 (0.013) 0.501 (0.019)
Moltrans 0.849 (0.016) 0.577 (0.022) 0.826 (0.014) 0.531 (0.021) 0.815 (0.012) 0.524 (0.17)
IGT 0.873 (0.013) 0.593 (0.023) 0.849 (0.013) 0.546 (0.019) 0.839 (0.011) 0.537 (0.018)
AttentionSiteDTI 0.902 (0.009) 0.601 (0.017) 0.872 (0.011) 0.578 (0.020) 0.865 (0.008) 0.561 (0.0.015)
MATTDTI 0.917 (0.006) 0.625 (0.014) 0.905 (0.007) 0.612 (0.019) 0.873 (0.005) 0.594 (0.009)
Co-VAE 0.922 (0.003) 0.644 (0.009) 0.911 (0.005) 0.641 (0.009) 0.886 (0.002) 0.632 (0.003)
SAGDTI 0.937 (0.002) 0.645 (0.005) 0.921 (0.003) 0.636 (0.004) 0.903 (0.001) 0.627 (0.002)
KIBA DDR 0.792 (0.024) 0.446 (0.041) 0.773 (0.027) 0.416 (0.040) 0.746 (0.019) 0.404 (0.035)
DeepDTA 0.804 (0.007) 0.517 (0.015) 0.784 (0.008) 0.485 (0.017) 0.765 (0.008) 0.492 (0.015)
GraphDTA 0.824 (0.009) 0.556 (0.014) 0.806 (0.007) 0.512 (0.013) 0.799 (0.006) 0.516 (0.011)
Moltrans 0.856 (0.008) 0.583 (0.015) 0.831 (0.006) 0.558 (0.014) 0.827 (0.007) 0.567 (0.010)
IGT 0.886 (0.008) 0.607 (0.013) 0.853 (0.007) 0.601 (0.015) 0.841 (0.007) 0.591 (0.013)
AttentionSiteDTI 0.907 (0.006) 0.625 (0.012) 0.886 (0.007) 0.615 (0.013) 0.864 (0.006) 0.586 (0.009)
MATTDTI 0.921 (0.005) 0.632 (0.009) 0.914 (0.005) 0.627 (0.010) 0.889 (0.004) 0.619 (0.007)
Co-VAE 0.932 (0.002) 0.645 (0.003) 0.918 (0.001) 0.651 (0.003) 0.894 (0.001) 0.645 (0.002)
SAGDTI 0.945 (0.001) 0.657 (0.002) 0.927 (0.001) 0.643 (0.002) 0.911 (0.001) 0.645 (0.001)
a Bold: best results.
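
Each cell in the table above reports a mean AUROC or AUPR with its standard deviation in parentheses. As a minimal sketch of how such an entry could be produced, the following Python snippet computes AUROC from its rank-based definition (the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair) and summarizes repeated runs as "mean (std)". The function names, the toy labels and scores, and the use of the population standard deviation are illustrative assumptions; they are not taken from the SAGDTI implementation.

```python
from statistics import mean, pstdev


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Format repeated-run results as 'mean (std)', mirroring the table cells."""
    # Assumption: population std; the paper does not say which estimator was used.
    return f"{mean(values):.3f} ({pstdev(values):.3f})"


# Hypothetical toy data: 1 = interaction, 0 = no interaction.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.4]
print(auroc(labels, scores))

# Hypothetical AUROC values from three repeated runs.
print(summarize([0.90, 0.92, 0.94]))
```

The pairwise formulation is quadratic in the number of samples, so it is only suitable for illustration; a sorting-based implementation would be preferable on full benchmark datasets.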

Table 4.

Performance evaluation for predicting DTI on the BindingDB dataset using the average values of five metrics.a

Methods AUROC AUPR MCC F1 score B.Acc
DDR 0.805 (0.037) 0.684 (0.057) 0.425 (0.071) 0.529 (0.043) 0.758 (0.035)
DeepDTA 0.883 (0.009) 0.817 (0.011) 0.597 (0.046) 0.654 (0.037) 0.811 (0.026)
GraphDTA 0.886 (0.007) 0.821 (0.010) 0.584 (0.041) 0.676 (0.034) 0.819 (0.027)
Moltrans 0.917 (0.007) 0.834 (0.013) 0.605 (0.031) 0.669 (0.035) 0.823 (0.031)
IGT 0.926 (0.008) 0.857 (0.011) 0.631 (0.035) 0.817 (0.031) 0.842 (0.028)
AttentionSiteDTI 0.925 (0.006) 0.874 (0.009) 0.627 (0.029) 0.840 (0.027) 0.857 (0.019)
MATTDTI 0.931 (0.005) 0.886 (0.004) 0.644 (0.021) 0.833 (0.021) 0.849 (0.016)
Co-VAE 0.959 (0.002) 0.897 (0.003) 0.653 (0.012) 0.856 (0.019) 0.871 (0.011)
SAGDTI 0.967 (0.002) 0.914 (0.002) 0.649 (0.013) 0.861 (0.016) 0.901 (0.011)
a Bold: best results.
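
Three of the five metrics in the table above (MCC, F1 score, and balanced accuracy, B.Acc) are threshold-based and can be derived from the confusion matrix of binary predictions. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical, and the decision threshold used by the authors to binarize predictions is not assumed here.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return MCC, F1 score, and balanced accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    balanced_acc = (recall + specificity) / 2
    return {"MCC": mcc, "F1": f1, "B.Acc": balanced_acc}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can rank two models differently from AUROC or F1, which is consistent with a model leading on most metrics in the table without leading on every one.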

Existing prediction models (e.g. Co-VAE, MATTDTI, and AttentionSiteDTI) also exhibited excellent performance. The Co-VAE model introduced a novel co-regularized variational autoencoder framework with a strong generative capability. Co-VAE can retain all the learned DTI features by generating the SMILES strings and amino acid sequences to model the original drug or target features. Thus, Co-VAE demonstrated excellent performance in the identification of DTIs. However, the Co-VAE model ignored the unique molecular information and simply fused the features of drugs and targets through concatenation. MATTDTI considered the long-distance relative relationships among atoms in drugs and used multi-head attention to compute the DTI similarities. The AttentionSiteDTI method took advantage of the 3D structure information based on the binding sites of proteins and utilized an attention-based module to represent the output vector. All these models captured the DTI features without fully exploiting the unique molecular feature attributes. We speculate that SAGDTI outperformed the other SoTA models because it incorporated the full set of unique characteristics of both drugs and targets and aggregated the neighboring topologies and diverse connections. In summary, we compared the prediction capability of SAGDTI and the eight baseline models using curve plots and different experimental settings, and SAGDTI exhibited the highest performance overall.

We further evaluated the prediction performance of our SAGDTI model on the Davis and KIBA datasets. On the Davis dataset, our model exhibited the best performance in almost every experimental setting. In the new-drug setting, it ranked second only to the Co-VAE model in terms of the AUPR value. As shown in Table 3, SAGDTI obtained 1.6%, 1.1%, and 1.9% enhancement in the AUROC metric over the best baseline (Co-VAE) under the new-target, new-drug, and pairwise settings, respectively. In terms of AUPR values, the performance of all methods decreased markedly owing to the shortage of DTI training samples in the Davis dataset. Nevertheless, our proposed model remained on par with the SoTA: its AUPR was comparable to that of Co-VAE, which scored slightly higher in the new-drug and pairwise settings. We also noted excellent performance on the KIBA dataset, with AUROC values of 0.945, 0.927, and 0.911 and AUPR values of 0.657, 0.643, and 0.645 for the new-target, new-drug, and pairwise settings, respectively.
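
The AUROC enhancements quoted above for the Davis dataset can be checked against Table 3. The short calculation below assumes that the percentages are relative gains over Co-VAE, the strongest baseline in that table; this reading is an interpretation, since the text does not state the reference point, but it reproduces the quoted figures. The variable and function names are illustrative.

```python
# AUROC values on the Davis dataset, copied from Table 3
# (order: new-target, new-drug, pairwise).
SETTINGS = ["new-target", "new-drug", "pairwise"]
sagdti_auroc = [0.937, 0.921, 0.903]
covae_auroc = [0.922, 0.911, 0.886]


def relative_gain_percent(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100


gains = [
    round(relative_gain_percent(ours, base), 1)
    for ours, base in zip(sagdti_auroc, covae_auroc)
]
for setting, gain in zip(SETTINGS, gains):
    print(f"{setting}: {gain}%")
```

Under this assumption the gains round to the quoted values, whereas the absolute differences (0.015, 0.010, and 0.017) do not, which supports the relative interpretation.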

The high accuracy obtained under random-split settings is generally known to overestimate a model’s real-world prediction performance. When little information is available about the interactions of a protein or a drug, the learning performance of most models drops sharply. Thus, we adopted a nonoverlapping sampling strategy (i.e. cold start splitting) for a more realistic evaluation. Inspired by Atas Guvenilir and Doğan (2023), we used the modified Davis (mDavis) dataset to ensure that the model learns general principles from the data rather than memorizing it. We randomly selected 10% or 15% of the DTI pairs as test samples and removed all of their associated drugs and proteins from the training samples. The results in Table 5 indicate that the performance of all methods declines considerably under cold start splitting. However, SAGDTI still achieves the best performance among the SoTA baselines and, interestingly, its margin is even larger. Taken together, the experimental results indicate that SAGDTI outperforms existing SoTA models in the task of DTI prediction. We attribute this improvement to the full consideration of the unique noncovalent intermolecular attributes at the atom-level and the biological interaction attributes at the node-level between drugs and targets.
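
The cold start splitting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the `(drug, protein, label)` tuple layout, and the toy data are hypothetical. A fraction of the pairs is drawn as the test set, and every remaining pair that shares a drug or a protein with any test pair is dropped from training.

```python
import random


def cold_start_split(pairs, test_fraction=0.10, seed=0):
    """Split (drug, protein, label) triples so that no test drug or protein appears in training."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    test = shuffled[:n_test]
    test_drugs = {drug for drug, _, _ in test}
    test_proteins = {protein for _, protein, _ in test}
    # Drop every remaining pair that shares a drug or a protein with the test set.
    train = [
        (drug, protein, label)
        for drug, protein, label in shuffled[n_test:]
        if drug not in test_drugs and protein not in test_proteins
    ]
    return train, test


# Hypothetical sparse toy data: 100 drugs, each paired with 3 of 100 proteins.
pairs = [(f"D{i}", f"P{(i + k) % 100}", k % 2) for i in range(100) for k in range(3)]
train, test = cold_start_split(pairs, test_fraction=0.10, seed=42)
print(len(train), len(test))
```

Note that the training set shrinks by more than the test fraction, because removing a drug or protein discards all of its pairs; on densely connected data this filter can remove most of the training pairs.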

Table 5.

Results of cold start splitting on the mDavis dataset.a

Methods Remove 10% AUROC Remove 10% AUPR Remove 15% AUROC Remove 15% AUPR
DeepDTA 0.780 0.501 0.728 0.453
GraphDTA 0.778 0.526 0.732 0.511
Moltrans 0.805 0.564 0.773 0.547
MATTDTI 0.809 0.584 0.785 0.564
Co-VAE 0.816 0.605 0.796 0.585
SAGDTI 0.863 0.617 0.843 0.604
a Bold: best results.

3.5 Ablation study

To assess stability and robustness, we also conducted a comprehensive ablation study on the contribution of the different modules in SAGDTI. AUROC and AUPR were used to evaluate the effectiveness of five SAGDTI variants (see Supplementary Materials for detailed descriptions). For example, SAGDTI II denotes the variant in which the biological interaction information aggregating module was removed from the proposed model. This variant disregards the attributes of topologies and diverse connections between drugs and targets and considers only the molecular attributes. Consequently, the performance of SAGDTI without GATs declined markedly (see Supplementary Tables S3 and S4). In our work, we introduce multi-source input information for SAGDTI, and thus prior knowledge (e.g. the drug SMILES sequence, protein 3D structure, and multi-scale DTI information) may be introduced. To avoid an over-optimistic evaluation of the proposed method, we removed the different input elements in turn. The performance of SAGDTI with different input information is shown in Supplementary Table S5. As expected, the performance of our proposed model declines to some extent without adequate input sources. However, SAGDTI remains competitive in identifying the interactions between drugs and targets even when the protein 3D structure and multi-scale biological information are both omitted. Supplementary Table S5 indicates that comprehensively considering the molecular and biological information is beneficial for learning high-level feature representations of drugs and targets and for improving prediction performance. These results also demonstrate that SAGDTI has good generalization ability.

3.6 Case study

Since December 2019, the new coronavirus (SARS-CoV-2) has rapidly spread worldwide, leading to a pandemic (Kirtipal et al. 2020). Hence, there is an urgent need to identify effective drugs for patients infected with SARS-CoV-2. We therefore conducted case studies on five SARS-CoV-2-related drugs (i.e. remdesivir, lopinavir, budesonide, dexamethasone, and aripiprazole) to further verify the capability of SAGDTI for identifying possible DTIs. Each drug interacts with at least 12 different targets. Owing to space limitations, the top 10 candidate proteins identified by SAGDTI for each drug are listed in Supplementary Table S6.

For brevity, we discuss below our findings for the drugs remdesivir and lopinavir. Remdesivir (a nucleoside analog) possesses antiviral activity, with EC50 values of 74 nM against SARS-CoV and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) in human airway epithelial (HAE) cells and 30 nM against murine hepatitis virus in delayed brain tumor cells (Olender et al. 2021). In animal and in vitro models, remdesivir has demonstrated activity against the viral pathogens of SARS and MERS, which are also coronaviruses and are structurally very similar to SARS-CoV-2 (Kokic et al. 2021). We ranked the top 10 targets that are verified to interact with remdesivir. The top 2 targets, Replicase polyprotein 1ab (Rep) and RNA-directed RNA polymerase L (L), happen to be encoded by the genomes of SARS-CoV-2-related viruses. Lopinavir is a protease inhibitor of the human immunodeficiency virus (HIV)-1 and HIV-2. It is combined with ritonavir for the treatment of HIV infection (Rebello et al. 2018). In our experiments, we identified a SARS-CoV-2-related target, Pol polyprotein (Pol), that interacts with lopinavir. These results suggest that our proposed prediction model can assist scientists in drug development and clinicians in real-life practice.

4 Discussion

We have demonstrated the advantageous performance of SAGDTI through evaluations on the BindingDB, Davis, and KIBA datasets under three experimental settings and through the SARS-CoV-2 case study. The experimental results show that our proposed SAGDTI model performs well in DTI prediction and may be useful in practical applications. However, some problems remain unexplained and some difficulties still need to be solved, such as how to accurately quantify the binding affinity (interaction strength) when an interaction occurs, where the docking position lies in 3D space, and whether the docking posture matters to DTIs in real-life situations (Wei et al. 2022). In this article, we treat DTI prediction as a binary classification problem, which simply indicates the absence or presence of an interaction. Therefore, we intend to extend our research to drug–target binding affinity and investigate the mechanisms through which the docking position and posture influence the results. We believe that our proposed model can perform well in practical DTI prediction and can support future work on predicting drug–target binding affinities.

5 Conclusion

In this article, we proposed an end-to-end attention-derived model, termed SAGDTI. The aim of this model is to capture the unique molecular attributes of drugs and targets and to aggregate the biological interaction information of drug–target pairs to predict DTIs. Through the molecular transformer module, the proposed framework learns, at the atom-level, the SMILES sequence with the relative distance among atoms of a given drug and the 3D structural features of proteins from their binding sites. SAGDTI also aggregates neighboring topologies and diverse connections (i.e. interactions, associations, and similarities) at the node-level based on the GATs. Experimental results demonstrated that, in most cases, our proposed SAGDTI model predicts DTIs with performance superior to that of other advanced models (e.g. Co-VAE, MATTDTI, and AttentionSiteDTI) under three experimental settings. Furthermore, we used the trained model to predict the interactions of five SARS-CoV-2-related drugs with potential targets to assist scientists. Our model displayed SoTA performance, was highly interpretable based on the self-attention mechanism, and performed well in a real-life case study. The limitations and advantages of SAGDTI have been discussed. We believe that this work could contribute to research on drug discovery and drug repurposing and serve as a powerful tool in practical DTI prediction.

Supplementary Material

vbad116_Supplementary_Data

Contributor Information

Xiaokun Li, School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China.

Qiang Yang, School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China.

Gongning Luo, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

Long Xu, School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China.

Weihe Dong, Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China; College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.

Wei Wang, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

Suyu Dong, College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.

Kuanquan Wang, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

Ping Xuan, School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Department of Computer Science, School of Engineering, Shantou University, Shantou 515063, China.

Xin Gao, Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia.

Supplementary data

Supplementary data are available at Bioinformatics Advances online.

Conflict of interest

The authors declare no conflicts of interest.

Funding

The project is supported by Interdisciplinary Research Foundation of HIT under Grant IR2021230, the National Natural Science Foundation of China (Nos. 62001144, 62272135, 62372135), the China Postdoctoral Science Foundation (Nos. 2021M690574, 2020M670911, 2021T140162), Heilongjiang Postdoctoral Fund under Grant LBH-Z20066 and LBH-TZ2202, Science and Technology Innovation Committee of Shenzhen Municipality under Grant JCYJ20210324131800002 and RCBS20210609103820029, Fund from China Scholarship Council (CSC), and the King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA) under Award No FCC/1/1976-44-01, FCC/1/1976-45-01, and REI/1/5234-01-01.

References

  1. Ashburn TT, Thor KB. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov 2004;3:673–83.
  2. Atas Guvenilir H, Doğan T. How to approach machine learning-based prediction of drug/compound-target interactions. J Cheminform 2023;15:16.
  3. Bagherian M, Sabeti E, Wang K. et al. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief Bioinform 2021;22:247–69.
  4. Berman HM, Westbrook J, Feng Z. et al. The protein data bank. Nucleic Acids Res 2000;28:235–42.
  5. Bharath Ramsundar PE, Walters P, Pande V. et al. Deep learning for the life sciences. In: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More. Sebastopol, CA: O’Reilly Media, 2019.
  6. Cheng Z, Zhao Q, Li Y. et al. IIFDTI: predicting drug–target interactions through interactive and independent features based on attention mechanism. Bioinformatics 2022;38:4153–61.
  7. Csermely P, Korcsmáros T, Kiss HJM. et al. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013;138:333–408.
  8. Davis AP, Murphy CG, Johnson R. et al. The comparative toxicogenomics database: update 2013. Nucleic Acids Res 2013;41:D1104–14.
  9. Ezzat A, Zhao P, Wu M. et al. Drug–target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans Comput Biol Bioinform 2017;14:646–56.
  10. Gilson MK, Liu T, Baitaluk M. et al. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res 2016;44:D1045–53.
  11. Hazra A, Choudhary P, Sheetal Singh M. Recent advances in deep learning techniques and its applications: an overview. In: Rizvanov AA et al. (eds) Advances in Biomedical Engineering and Technology. Singapore: Springer Singapore, 2021, 103–122.
  12. He W, Jiang Y, Jin J. et al. Accelerating bioactive peptide discovery via mutual information-based meta-learning. Brief Bioinform 2022;23:bbab499.
  13. He K, Zhang X, Ren S. et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 1017–24.
  14. Huang K, Xiao C, Glass LM. et al. MolTrans: molecular interaction transformer for drug–target interaction prediction. Bioinformatics 2021;37:830–6.
  15. Jiang M, Li Z, Zhang S. et al. Drug–target affinity prediction using graph neural network and contact maps. RSC Adv 2020;10:20701–12.
  16. Kim S, Chen J, Cheng T. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 2021;49:D1388–95.
  17. Kirtipal N, Bharadwaj S, Kang SG. From SARS to SARS-CoV-2, insights on structure, pathogenicity and immunity aspects of pandemic human coronaviruses. Infect Genet Evol 2020;85:104502.
  18. Kang H, Goo S, Lee H. et al. Fine-tuning of BERT model to accurately predict drug–target interactions. Pharmaceutics 2022;14:1710.
  19. Kokic G, Hillen HS, Tegunov D. et al. Mechanism of SARS-CoV-2 polymerase stalling by remdesivir. Nat Commun 2021;12:279.
  20. Li T, Zhao XM, Li L. Co-VAE: drug–target binding affinity prediction by co-regularized variational autoencoders. IEEE Trans Pattern Anal Mach Intell 2022;44:8861–73.
  21. Liu S, Wang Y, Deng Y. et al. Improved drug–target interaction prediction with intermolecular graph transformer. Brief Bioinform 2022;23:bbac162.
  22. Liu Y, Wu M, Miao C. et al. Neighborhood regularized logistic matrix factorization for drug–target interaction prediction. PLoS Comput Biol 2016;12:e1004760.
  23. Maia EHB, Assis LC, de Oliveira TA. et al. Structure-based virtual screening: from classical to artificial intelligence. Front Chem 2020;8:343.
  24. Nguyen T, Le H, Quinn TP. et al. GraphDTA: predicting drug–target binding affinity with graph neural networks. Bioinformatics 2021;37:1140–7.
  25. Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug–target interactions using graph mining and machine learning approaches. Bioinformatics 2018;34:1164–73.
  26. Olender SA, Perez KK, Go AS. et al.; GS-US-540–5773 and GS-US-540–5807 Investigators. Remdesivir for severe coronavirus disease 2019 (COVID-19) versus a cohort receiving standard of care. Clin Infect Dis 2021;73:e4166–74.
  27. Öztürk H, Özgür A, Ozkirimli E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 2018;34:i821–9.
  28. Pahikkala T, Airola A, Pietilä S. et al. Toward more realistic drug–target interaction predictions. Brief Bioinform 2015;16:325–37.
  29. Peng J, Wang Y, Guan J. et al. An end-to-end heterogeneous graph representation learning-based framework for drug–target interaction prediction. Brief Bioinform 2021;22:bbaa430.
  30. Rebello KM, Andrade-Neto VV, Zuma AA. et al. Lopinavir, an HIV-1 peptidase inhibitor, induces alteration on the lipid metabolism of Leishmania amazonensis promastigotes. Parasitology 2018;145:1304–10.
  31. Rifaioglu AS, Atas H, Martin MJ. et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 2019;20:1878–912.
  32. Ru X, Ye X, Sakurai T. et al. Current status and future prospects of drug–target interaction prediction. Brief Funct Genomics 2021;20:312–22.
  33. Saberi Fathi SM, Tuszynski JA. A simple method for finding a protein’s ligand-binding pockets. BMC Struct Biol 2014;14:18.
  34. Shaw P, Uszkoreit J, Vaswani A. Self-attention with relative position representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1–6, 2018, 464–8.
  35. Shin B, Park S, Kang K. et al. Self-attention based molecule representation for predicting drug–target interaction. In: Proceedings of the 4th Machine Learning for Healthcare Conference, Vol. 106, Ann Arbor, Michigan, USA, 2019, 230–48.
  36. Śledź P, Caflisch A. Protein structure-based drug design: from docking to molecular dynamics. Curr Opin Struct Biol 2018;48:93–102.
  37. da Silva Rocha SFL, Olanda CG, Fokoue HH. et al. Virtual screening techniques in drug discovery: review and recent applications. Curr Top Med Chem 2019;19:1751–67.
  38. Tang J, Szwajda A, Shakyawar S. et al. Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis. J Chem Inf Model 2014;54:735–43.
  39. Tradigo G. Protein Contact Maps. New York, NY: Springer New York, 2013, 1771–3.
  40. Vaswani A, Shazeer N, Parmar N. et al. Attention is all you need. In: Advances in Neural Information Processing Systems, NIPS, 2017, 5998–6008.
  41. Wei J, Chen S, Zong L. et al. Protein–RNA interaction prediction with deep learning: structure matters. Brief Bioinform 2022;23:bbab540.
  42. Xuan P, Zhang X, Zhang Y. et al. Multi-type neighbors enhanced global topology and pairwise attribute learning for drug–protein interaction prediction. Brief Bioinform 2022a;23:bbac120.
  43. Xuan P, Fan M, Cui H. et al. GVDTI: graph convolutional and variational autoencoders with attribute-level attention for drug-protein interaction prediction. Brief Bioinform 2022b;23:bbab453.
  44. Yazdani-Jahromi M, Yousefi N, Tayebi A. et al. AttentionSiteDTI: an interpretable graph-based model for drug–target interaction prediction using NLP sentence-level relation classification. Brief Bioinform 2022;23:bbac272.
  45. Zhao Q, Yang M, Cheng Z. et al. Biomedical data and deep learning computational models for predicting compound–protein relations. IEEE/ACM Trans Comput Biol Bioinform 2022;19:2092–110.
  46. Zeng Y, Chen X, Luo Y. et al. Deep drug–target binding affinity prediction with multiple attention blocks. Brief Bioinform 2021;22:bbab117.
  47. Zheng S, Li Y, Chen S. et al. Predicting drug protein interaction using quasi-visual question answering system. Nat Mach Intell 2020;2:134–40.
  48. Zhao T, Hu Y, Valsdottir LR. et al. Identifying drug–target interactions based on graph convolutional network and deep neural network. Brief Bioinform 2021;22:2141–50.


Articles from Bioinformatics Advances are provided here courtesy of Oxford University Press
