Briefings in Bioinformatics. 2025 Oct 14;26(5):bbaf542. doi: 10.1093/bib/bbaf542

Designing high-affinity 3D drug molecules via geometric spatial perception diffusion model

Hao Lu 1,#, Zhiqiang Wei 2,#, Jiaming Liu 3, Jiangrui Li 4, Qian Wang 5, Hao Liu 6,
PMCID: PMC12526910  PMID: 41092059

Abstract

Designing high-affinity molecules for specific proteins is a fundamental and challenging problem in drug discovery, particularly when considering atomic interactions between molecules and proteins in 3D space. Current 3D molecular design methods are limited because they do not adequately capture the positional information of ligand molecules in Euclidean space. We propose a diffusion model based on SE(3)-equivariant graph neural networks that enhances the binding affinity of generated molecules to protein targets by mixing long-range and distance-aware attention heads. We also present a molecular geometry feature enhancement strategy that further strengthens the model's perception of the spatial size of ligand molecules. On the CrossDocked2020 dataset, our model outperforms existing state-of-the-art models across various affinity-related metrics, including the Vina Score, while preserving essential drug-like properties. Our model excels in designing ligand molecules with macrocyclic structures. Additionally, it offers a moderate level of interpretability, aiding in understanding the binding interactions between 3D drug molecules and protein pockets.

Keywords: structure-based drug design, diffusion model, docking-based affinity, geometric neural network

Introduction

AlphaFold has enabled a more precise understanding of protein structure, and its development was recognized with the 2024 Nobel Prize in Chemistry [1–3]. Because 3D protein structures can now be obtained, designing 3D drug molecules for specific protein pockets has garnered substantial attention as a downstream task with the potential to speed up drug development. Recent advances in generative artificial intelligence (AI) have shown remarkable potential in real-world drug design applications. For instance, deep generative models combined with molecular dynamics have been used to accelerate the discovery of antimicrobials [4], small-molecule inhibitors [5], and malaria-inhibiting compounds [6], and integrative platforms such as PandaOmics have been employed to identify fibrosis-related targets using multi-omics and natural language processing-driven target prioritization [7]. Designing a 3D molecular structure for a given protein pocket can be viewed as a two-body problem in which the protein pocket and the drug molecule interact. Diffusion models have steadily taken the lead in 3D molecular generation [8–11]. The fundamental concept of a diffusion model is to add noise to real molecules and then train a model to denoise iteratively, starting from Gaussian noise, so that it produces molecular structures that satisfy particular properties [12–14]. In contrast to models based on 1D molecular sequences and 2D molecular graphs, diffusion models can produce molecules with physically realistic 3D structures, which allows them to capture the intricate interactions between molecules and protein pockets [14–16].

Frameworks such as SurfGen [17], ClickGen [18], and PMDM [19] highlight ongoing efforts to improve synthesizability and structural fidelity, but the affinity between molecules and pockets remains suboptimal [20–22]. A significant obstacle still exists: precisely describing the intricate geometric and chemical relationships between molecules and their biological targets in Euclidean space in order to find more viable drug-like compounds. The final activity and physicochemical properties of drugs are all associated with the Euclidean distances between atoms at the molecular scale. Therefore, 3D molecular generation methods should pay particular attention to Euclidean distances [10, 23, 24]. The molecular generation process must consider the interaction with the target, which may involve covalent bonds, hydrogen bonds, van der Waals forces, and other microscopic mechanical interactions, all of which are strongly correlated with the Euclidean distances between atoms [25, 26]. Meanwhile, most existing 3D diffusion models first generate atoms and then form bonds based on the geometric distances between atoms [27–29]. Treating atoms at different distances from a given atom as equally influential may degrade drug-related properties such as molecular affinity.

In addition, according to the classical lock-and-key theory of molecular docking, the geometric shapes of the target binding pocket and the ligand need to match sufficiently well to allow effective binding and interaction [30–32]. This implies that molecules capable of binding to the same target tend to have similar geometric structures. However, existing molecular generation models are often trained on the interaction between a single “target–ligand pair” and tend to neglect the geometric similarity among different ligands for the same target, overlooking how small shape differences between ligands can influence their binding affinity and selectivity. Delete [33] is effective in molecular design but centers on structure optimization instead of de novo design.

This study focuses on 3D drug molecular generation for a given protein target to design molecules with higher docking affinity. We propose a distance-aware mixed attention (DMA) equivariant graph neural network model, named DMDiff (Distance-aware Mixed Attention Diffusion). This approach incorporates a DMA mechanism, which is crucial for modeling 3D molecular interactions. Additionally, we introduce a molecular geometric feature enhancement strategy to strengthen the relationship between ligand spatial structures and docking-based affinity. Our contributions are as follows: (i) We propose a DMA equivariant graph neural network that combines long-range and distance-aware attention heads. The long-range attention captures the long-distance dependencies between distant atoms, while the distance-aware attention focuses on short-range interactions. (ii) We introduce a molecular geometric feature enhancement strategy, which allows the model to better generate molecules based on the pocket structure, further improving the docking-based affinity of the molecules. (iii) To the best of our knowledge, DMDiff is state-of-the-art for designing high-affinity 3D molecules. The median docking score of the generated molecules reached −10.01, outperforming existing models, as evidenced by molecular docking and related results.

Materials and Methods

Task definition

First, we designate an atom in 3D space as Inline graphic, where Inline graphic represents the Euclidean coordinates, and Inline graphic represents the Inline graphic-dimensional atomic features. Subsequently, a protein pocket can be defined as Inline graphic, and a molecule can be expressed as Inline graphic, where Inline graphic and Inline graphic denote the number of atoms in a protein pocket and a molecule, respectively. Given a specific protein Inline graphic, the objective is to find Inline graphic, where Inline graphic is a scoring function representing the binding affinity between molecule and protein, and Inline graphic represents the set of all valid molecules in chemical space. In other words, for a given protein, the goal is to determine the molecules that maximize binding affinity.
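As a compact restatement of the task definition above, one possible notation is sketched below. The symbols (x, v, K, P, M, N_P, N_M, f, and the calligraphic M for chemical space) are assumptions chosen for illustration and are not necessarily the authors' own.

```latex
% Illustrative notation only; all symbol names are assumptions.
\[
\begin{aligned}
a &= (x, v), \qquad x \in \mathbb{R}^{3}, \quad v \in \mathbb{R}^{K} \\
P &= \{(x_i^{P}, v_i^{P})\}_{i=1}^{N_P}, \qquad
M = \{(x_i^{M}, v_i^{M})\}_{i=1}^{N_M} \\
M^{*} &= \operatorname*{arg\,max}_{M \in \mathcal{M}} \; f(M, P)
\end{aligned}
\]
```

Here f scores the binding affinity of a molecule M to the pocket P, so the last line states the goal of finding the valid molecule with the highest affinity for the given protein.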

Diffusion and denoising processes

The diffusion model gradually adds noise to the initial data and trains the model to learn how to recover the target data from the noise, thus generating molecules with real-world relevance (see Fig. 1). We adopt the diffusion method from TargetDiff [34]. This approach includes a forward diffusion process and a reverse generation process, both of which are defined as Markov chains (see Equations (1) and (2)). The diffusion process progressively injects noise into the data, while the generation process learns to recover the data distribution from the noise distribution using a network parameterized by Inline graphic. Further details of this model can be found in the Supplementary data Algorithm S1.

Figure 1.

Alt text: Pipeline of the DMDiff model.

Pipeline of the DMDiff model, which includes the forward diffusion process and the reverse (denoising) process.

graphic file with name DmEquation1.gif (1)
graphic file with name DmEquation2.gif (2)
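To make the forward (noising) half of the Markov chain concrete, the sketch below applies the standard closed-form Gaussian marginal used by denoising diffusion models to ligand coordinates. It is a minimal illustration under stated assumptions: the linear noise schedule, the step count, and all function names are hypothetical, and atom-type features (which need a separate treatment) are not covered.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the schedule used in the paper may differ."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per diffusion step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_coordinates(x0, alpha_bar_t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [[signal * c + noise * rng.gauss(0.0, 1.0) for c in atom] for atom in x0]


# Usage: noise a two-atom toy ligand at an intermediate step.
rng = random.Random(0)
schedule = alpha_bars(linear_betas(1000))
noisy = noise_coordinates([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]], schedule[500], rng)
```

The reverse process would train a network to undo these steps one at a time; only the forward direction is shown because it has a closed form.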

Distance-aware mixed attention geometric neural network

This study proposes a DMA geometric neural network for recovering molecular data from noise. The network is specifically designed for 3D equivariant graph data and aims to capture the geometric features and spatial relationships within molecular 3D structures. DMA integrates a mixed attention strategy that includes a distance-aware mechanism to enhance performance in tasks involving irregular 3D atomic point clouds and mesh data.

3D geometric graph attention message passing

To enhance the ability to capture geometric structures, DMA employs a 3D equivariant graph attention update operation. This operation leverages the spatial relationships between each node and its neighbors by considering their positions in 3D space, facilitating effective information aggregation. Because the operation is equivariant, the network maintains stable performance and captures geometric features consistently, regardless of how the input structure is rotated or translated.

The hidden embeddings Inline graphic of atoms and the coordinates Inline graphic are alternately updated as Equations (3) and (4) show.

graphic file with name DmEquation3.gif (3)
graphic file with name DmEquation4.gif (4)

Here, Inline graphic represents the Euclidean distance between two atoms Inline graphic and Inline graphic, and Inline graphic denotes an additional feature indicating whether the connection exists between protein atoms, ligand atoms, or across protein–ligand pairs. Inline graphic is the ligand molecule mask, as we do not want to update the protein atom coordinates. The initial atomic hidden embedding Inline graphic is obtained from an embedding layer that encodes atomic features. Finally, the hidden embeddings Inline graphic of atoms are passed into a multilayer perceptron and a softmax function.
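The role of the ligand mask and the equivariance of the coordinate update can be illustrated with a small sketch. It assumes an EGNN-style update in which each ligand atom moves along the directions to the other atoms, weighted by a scalar that depends only on rotation-invariant quantities. The weight function below is a hypothetical stand-in for the learned attention output, not the authors' network.

```python
import math


def pairwise_weight(dist):
    """Hypothetical stand-in for the learned scalar weight; depends only on distance."""
    return math.exp(-dist)


def update_coordinates(coords, is_ligand):
    """Move ligand atoms along weighted relative vectors; protein atoms stay fixed."""
    new_coords = []
    for i, xi in enumerate(coords):
        if not is_ligand[i]:
            new_coords.append(list(xi))  # mask: protein coordinates are not updated
            continue
        delta = [0.0, 0.0, 0.0]
        for j, xj in enumerate(coords):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xj, xi)]
            weight = pairwise_weight(math.sqrt(sum(c * c for c in diff)))
            for k in range(3):
                delta[k] += weight * diff[k]
        new_coords.append([a + d for a, d in zip(xi, delta)])
    return new_coords


def rotate_z90(point):
    """Rotate a point by 90 degrees about the z-axis."""
    x, y, z = point
    return [-y, x, z]
```

Because the weights depend only on distances and the update is a sum of relative vectors, rotating the input and then updating gives the same result as updating and then rotating, which is the property the text calls equivariance.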

Distance-aware attention head

DMA introduces a distance-aware mechanism specifically designed to model geometric relationships between nodes in 3D space. Unlike traditional attention mechanisms, DMA computes Euclidean distances between nodes in 3D space to dynamically adjust the attention weights. This design enables the network to be more sensitive to spatial relationships, thereby enhancing its ability to capture both local and global features in context. As shown in the example in Fig. 2, we update the feature of Inline graphic. For two adjacent nodes, Inline graphic and Inline graphic, since Inline graphic has a smaller distance, the model assigns higher attention weights to it, while the more distant node, Inline graphic, receives a lower weight. This mechanism effectively reflects the interdependencies within the geometric structure.

Figure 2.

Alt text: Distance-aware mixed attention.

DMA mechanism.

Specifically, the distance-aware head considers the Euclidean distance between adjacent atoms in a molecule, and the attention calculation is shown in Equation (5), where Inline graphic, Inline graphic, and Inline graphic are the projection matrices for calculating the query Inline graphic, key Inline graphic, and value Inline graphic, respectively, and Inline graphic is the dimension of the atomic latent representation features. The original attention mechanism is further modulated by the inter-atomic distance. The distance calculation is shown in Equation (6), where Inline graphic = 1, 2, 3, Inline graphic represents the position in Euclidean space, and Inline graphic is the scaling factor.

graphic file with name DmEquation5.gif (5)
graphic file with name DmEquation6.gif (6)
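A minimal sketch of a distance-aware attention head is given below. The exact way the distance enters the attention is an assumption: here the scaled dot-product logit is reduced by the Euclidean distance divided by a scaling factor `tau` before the softmax. The function and parameter names are hypothetical, and the authors' formulation may differ.

```python
import math


def distance_aware_weights(query, keys, center_pos, neighbor_pos, tau=10.0):
    """Softmax attention over neighbors, with logits penalized by distance / tau.

    The subtraction of distance / tau is an assumed form of the modulation.
    """
    d_model = len(query)
    logits = []
    for key, pos in zip(keys, neighbor_pos):
        score = sum(q * k for q, k in zip(query, key)) / math.sqrt(d_model)
        logits.append(score - math.dist(center_pos, pos) / tau)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: two neighbors with identical keys, at 1 and 5 angstroms from the center.
weights = distance_aware_weights(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    center_pos=[0.0, 0.0, 0.0],
    neighbor_pos=[[1.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
)
```

With identical keys, the only difference between the two neighbors is their distance, so the nearer neighbor receives the larger weight, matching the behavior described for Fig. 2.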

Mixed attention

The mixed attention strategy in DMA integrates long-range and distance-aware attention to enhance feature extraction. Each attention head focuses on representing the features of neighboring atoms around a central atom, rather than embedding all atoms. Long-range attention enables the model to learn how distant atoms contribute to the central atom’s update, ensuring that distant interactions are considered during denoising. Distance-aware attention differentiates the importance of neighboring atoms based on their Euclidean distances. The hyperparameter Inline graphic in Equation (6), set to 10, reduces attention for atom pairs beyond 10 Å. This hybrid approach allows the network to dynamically balance the influence of local and global features, optimizing the final output.

Specifically, the computed representations from the long-range and distance-aware attention heads are concatenated along the hidden dimension and passed through a linear layer (see Equation (7)). This layer represents the updated atomic features Inline graphic and coordinate features Inline graphic.

graphic file with name DmEquation7.gif (7)
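The concatenate-then-project step can be sketched as follows. The helper names, the vector sizes, and the example weights are hypothetical; in the model the weight matrix and bias would be learned parameters.

```python
def linear(vec, weight, bias):
    """Apply a dense layer: one output per row of the weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def mix_heads(long_range_out, distance_aware_out, weight, bias):
    """Concatenate the two head outputs along the hidden dimension, then project."""
    concatenated = list(long_range_out) + list(distance_aware_out)
    return linear(concatenated, weight, bias)


# Usage: two 2-dimensional head outputs mixed back into a 2-dimensional feature.
mixed = mix_heads(
    long_range_out=[1.0, 2.0],
    distance_aware_out=[3.0, 4.0],
    weight=[[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]],
    bias=[0.0, 0.0],
)
```

With these illustrative weights the projection simply averages the two heads; a learned projection can instead favor the long-range or the distance-aware signal per feature.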

Molecular geometry enhancement

Molecules that exhibit high affinity with the same target are generally believed to share similar molecular properties, which include both chemical characteristics and spatial dimensions. Qian et al. [35] demonstrated that the size of the target pocket influences the quality of the generated molecules. The classical lock-and-key theory in drug discovery and Supplementary data Fig. 3 also confirm that molecules from different targets have varying volumes. In this work, we aim to enable the model to learn molecular volumes in 3D space. However, as molecules have irregular geometric shapes, directly calculating their precise volume is a computationally intensive task.

Figure 3.

Alt text: Target structure with docking molecules.

Docking molecules with similar geometries capable of binding the same target.

The first step is to reasonably simplify the molecular geometry so that the model can learn the volume and shape of molecules in 3D space. We employ an efficient approximation method that abstracts the molecule into a rectangular cuboid. This simplifying assumption allows us to represent the spatial features of the molecule as a vector of three dimensions: length, width, and height. These dimensions are then used to capture the overall structure of the molecule in Euclidean space.

Formally, for a given molecule Inline graphic, we transform its geometric information in 3D space into a rectangular representation as shown in Equation (8), and the box that can hold the molecule can be obtained by subtracting the minimum value from the maximum value in Euclidean coordinates. Learning is performed through the mean squared error loss function represented in Equation (9), with Inline graphic denoting the total number of molecules.

graphic file with name DmEquation8.gif (8)
graphic file with name DmEquation9.gif (9)
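The box construction and its loss can be sketched in a few lines. This is a minimal illustration assuming an axis-aligned box (maximum minus minimum coordinate along each axis) and a squared error summed over the three dimensions and averaged over molecules; the function names are hypothetical, and the authors' normalization may differ.

```python
def bounding_box(coords):
    """Axis-aligned box size: maximum minus minimum coordinate along x, y, and z."""
    return [max(atom[k] for atom in coords) - min(atom[k] for atom in coords) for k in range(3)]


def box_mse(predicted_boxes, true_boxes):
    """Mean over molecules of the squared error between predicted and true box sizes."""
    total = 0.0
    for pred, true in zip(predicted_boxes, true_boxes):
        total += sum((p - t) ** 2 for p, t in zip(pred, true))
    return total / len(true_boxes)


# Usage: a three-atom toy molecule.
box = bounding_box([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [-1.0, 0.5, 1.0]])
```

Note that an axis-aligned box depends on how the molecule is oriented in the coordinate frame, so the same ligand in two different poses can have different box vectors.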

Optimization objectives

We use Equation (10) as the optimization objective of our model. Inline graphic represents the docking affinity value and is given in Equation (11). During the denoising process of the diffusion model, we obtain an embedding for each atom in the molecule. These embeddings are fed into a module that predicts the molecular binding affinity, which we refer to as the expert network. The expert network is composed of fully connected layers and outputs the predicted affinity, which is supervised through a regression loss against the binding affinity calculated by AutoDock Vina. The shifted Softplus activation function is denoted as Inline graphic, and the Sigmoid activation function [36] is denoted as Inline graphic. This computation is consistent with that in KGDiff [35]. The scaling factors for the loss, Inline graphic, Inline graphic, and Inline graphic, are set to 100, 1, and 0.001, respectively.

graphic file with name DmEquation10.gif (10)
graphic file with name DmEquation11.gif (11)

Results

Dataset and preprocessing

We use the CrossDocked2020 dataset [37] to train and evaluate our model. This dataset is a benchmark 3D structure-based molecular generation dataset derived from the PDBbind [38] dataset, where only conformations with a root-mean-square deviation (RMSD) below 1 Å and a sequence identity <30% are selected. Specifically, the RMSD is computed between the docked ligand pose and its corresponding crystallographic pose, while the sequence identity is calculated between the protein sequences of different complexes using MMseqs2 [39]. This selection ensures that the conformations are geometrically similar to the reference structure while maintaining sufficient variability in protein sequences to explore more binding modes. We have 99 900 training complexes and 100 new pockets for testing. PyMOL [40] is used to visualize protein and molecular structures.

Metrics

We categorized the evaluation metrics into those related to affinity and molecular properties. The affinity-related metrics are implemented using AutoDock Vina [41], and the molecular properties are calculated using RDKit [42].

Affinity-related metrics: (i) Vina Score: Estimates the binding affinity between the generated molecular conformation and the target protein. (ii) Vina Minimize: Estimates the binding affinity between the locally optimized generated conformation and the target protein. (iii) Vina Dock: Estimates the binding affinity between the generated molecule and the target protein after re-docking. (iv) High Affinity: The percentage of generated molecules whose Vina score is at least as good as (i.e. no higher than) that of the reference compound for the same target.
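The High Affinity metric can be sketched as below. Since lower Vina scores indicate stronger predicted binding, "at least as good as the reference" is interpreted here as a score less than or equal to the reference score; that reading, the function name, and the example numbers are assumptions for illustration.

```python
def high_affinity_percent(generated_scores, reference_score):
    """Percentage of generated molecules whose Vina score is <= the reference score."""
    hits = sum(1 for score in generated_scores if score <= reference_score)
    return 100.0 * hits / len(generated_scores)


# Usage with made-up scores for one target (kcal/mol, lower is better).
percent = high_affinity_percent([-8.0, -7.0, -5.0, -6.36], reference_score=-6.36)
```

Per-target percentages computed this way can then be averaged or summarized by their median across the test targets.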

Molecular property metrics: (i) Quantitative estimate of drug-likeness (QED). This metric evaluates a molecule’s drug-likeness by reflecting the typical distribution of molecular properties in successful drug candidates. (ii) Synthetic accessibility (SA). SA assesses the difficulty of synthesizing the generated molecules. (iii) LogP. This metric evaluates the lipophilicity of the generated molecules. (iv) Topological polar surface area (TPSA). This metric evaluates the topological polar surface area of the generated molecules.

Baselines

We compared our method with various baselines: liGAN [37], AR [43], GraphBP [44], Pocket2Mol [45], KGDiff [35], TAGMol [46], AUTODIFF [47], TargetDiff [34], IRDiff [22], BINDDM [48], DecompDiff [49], IPDiff [50], DecompOpt [51], PocketFlow [52], and DeepICL [53]. liGAN is a 3DCNN-based method that generates 3D molecular images using a conditional variational autoencoder framework. AR, Pocket2Mol, and GraphBP are GNN-based methods that generate 3D molecules by sequentially placing atoms into the protein binding pocket. The diffusion models represented in this comparison include TargetDiff, KGDiff, IRDiff, AUTODIFF, BINDDM, DecompDiff, IPDiff, and DecompOpt.

Docking-based molecular affinity

Average performance across multiple targets

We first evaluate the affinity of the generated molecules for their binding pockets, which is more challenging than evaluating molecular properties that depend only on the molecule itself. Table 1 reports three metrics that show how well the different methods perform in terms of target binding affinity: Vina Score, Vina Minimize, and Vina Dock. Overall, DMDiff demonstrates excellent affinity, with average values of −8.73 for Vina Score, −9.45 for Vina Minimize, and −9.90 for Vina Dock, all of which are lower (better) than those of every other method. For instance, the Reference has a Vina Score of −6.36, a Vina Minimize of −6.71, and a Vina Dock of −7.45, which are much worse than the values achieved by DMDiff. KGDiff also performs well in terms of affinity, with a Vina Dock of −9.43, because it likewise employs an affinity expert network to guide affinity during the molecular generation stage; yet it still does not surpass DMDiff. This shows the efficacy of our approach, especially concerning the precision and stability of molecule–target binding, and suggests that DMDiff can offer better docking-based affinity.

Table 1.

Docking-based affinity of the DMDiff model and baseline models

| Method | Vina Score Ave. (↓) | Vina Score Med. (↓) | Vina Minimize Ave. (↓) | Vina Minimize Med. (↓) | Vina Dock Ave. (↓) | Vina Dock Med. (↓) | High Affinity Ave. (%) (↑) | High Affinity Med. (%) (↑) | QED Ave. (↑) | QED Med. (↑) | SA Ave. (↑) | SA Med. (↑) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reference | −6.36 | −6.46 | −6.71 | −6.49 | −7.45 | −7.26 | | | 0.48 | 0.47 | 0.73 | 0.74 |
| LiGAN | | | | | −6.33 | −6.20 | 21.1 | 11.1 | 0.39 | 0.39 | 0.59 | 0.57 |
| GraphBP | | | | | −4.80 | −4.70 | 14.2 | 6.7 | 0.43 | 0.45 | 0.49 | 0.48 |
| AR | −5.75 | −5.64 | −6.18 | −5.88 | −6.75 | −6.62 | 37.9 | 31.0 | 0.51 | 0.50 | 0.63 | 0.63 |
| Pocket2Mol | −5.14 | −4.7 | −6.42 | −5.82 | −7.15 | −6.79 | 48.4 | 51.0 | 0.56 | 0.57 | 0.74 | 0.75 |
| AUTODIFF | −5.25 | −5.33 | −6.91 | −7.06 | −8.84 | −8.94 | 73.0 | 77.0 | 0.57 | 0.58 | 0.76 | 0.77 |
| TargetDiff | −5.50 | −6.32 | −6.69 | −6.86 | −7.83 | −7.92 | 59.2 | 60.4 | 0.49 | 0.49 | 0.59 | 0.58 |
| TAGMol | −7.02 | −7.77 | −7.95 | −8.07 | −8.59 | −8.69 | 69.8 | 76.4 | 0.55 | 0.56 | 0.56 | 0.56 |
| KGDiff | −8.04 | −8.61 | −8.78 | −8.85 | −9.43 | −9.43 | 79.2 | 87.0 | 0.51 | 0.51 | 0.54 | 0.54 |
| BINDDM | −5.92 | −6.81 | −7.29 | −7.34 | −8.41 | −8.37 | 64.8 | 71.6 | 0.51 | 0.52 | 0.58 | 0.58 |
| DecompOpt | −5.87 | −6.81 | −7.35 | −7.72 | −8.95 | −9.01 | 73.5 | 93.3 | 0.48 | 0.45 | 0.65 | 0.65 |
| DecompDiff | −5.67 | −6.04 | −7.04 | −7.09 | −8.39 | −8.43 | 64.4 | 71.0 | 0.45 | 0.43 | 0.61 | 0.60 |
| IRDiff | −6.03 | −6.89 | −7.27 | −7.37 | −8.42 | −8.42 | 67.4 | 72.7 | 0.53 | 0.54 | 0.59 | 0.58 |
| IPDiff | −6.42 | −7.01 | −7.45 | −7.48 | −8.57 | −8.51 | 69.5 | 75.5 | 0.52 | 0.53 | 0.61 | 0.69 |
| PocketFlow | −3.02 | −4.10 | −5.49 | −5.55 | −7.18 | −7.19 | 51.3 | 53.1 | 0.53 | 0.53 | 0.81 | 0.82 |
| DeepICL | −3.79 | −4.12 | −5.88 | −5.78 | −7.13 | −7.29 | 53.7 | 59.0 | 0.61 | 0.62 | 0.33 | 0.30 |
| DMDiff (ours) | −8.73 | −9.47 | −9.45 | −9.66 | −9.90 | −10.01 | 80.5 | 91.1 | 0.53 | 0.55 | 0.54 | 0.54 |

The bold values indicate the best performance results. The SA values here are normalized. ↑ means the higher the better and ↓ means the lower the better.

Table 1 also provides other important drug molecule properties, including QED and SA. The average QED value of DMDiff is 0.53, which, although lower than that of the DeepICL model (0.61), is still reasonable. This indicates that the generated molecules possess good drug-likeness and physicochemical properties, meeting the standards for molecular drug design. Regarding SA, DMDiff is comparable to other methods, with a value of 0.54, suggesting that the generated molecular structures have moderate synthetic complexity, in line with the requirements of molecular design. In summary, DMDiff strikes a good balance between generating high-affinity molecules and maintaining drug molecule quality.

Performance of a single target

Based on the median docking score results and ranked by the lowest target score of our model, as shown in Fig. 4, our method outperforms all others, with 64% of the targets in the test set yielding the optimal molecules. In comparison, the KGDiff model follows closely, achieving optimal results for 24% of the targets, while Pocket2Mol and AR account for 7% and 3%, respectively. TargetDiff achieves optimal results for only 2% of the targets.
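The head-to-head comparison above can be reproduced in outline as follows: for each target, take the median docking score per method, give the target to the method with the lowest median, and report the share of targets won by each method. The data layout, the method labels "A" and "B", and the scores are made up for illustration.

```python
from statistics import median


def best_method_share(scores_by_target):
    """Map {target: {method: [docking scores]}} to {method: % of targets won}.

    A method wins a target when its median score is the lowest; ties go to the
    method listed first for that target.
    """
    wins = {}
    for per_method in scores_by_target.values():
        medians = {method: median(scores) for method, scores in per_method.items()}
        winner = min(medians, key=medians.get)
        wins[winner] = wins.get(winner, 0) + 1
    n_targets = len(scores_by_target)
    return {method: 100.0 * count / n_targets for method, count in wins.items()}


# Usage with hypothetical scores for four targets and two methods.
toy = {
    "t1": {"A": [-9.0, -8.0, -10.0], "B": [-7.0, -6.0, -8.0]},
    "t2": {"A": [-5.0, -6.0, -4.0], "B": [-7.0, -8.0, -6.0]},
    "t3": {"A": [-9.5, -9.0, -10.0], "B": [-9.0, -8.0, -7.0]},
    "t4": {"A": [-8.0, -8.0, -8.0], "B": [-6.0, -6.0, -6.0]},
}
share = best_method_share(toy)
```

Using the median rather than the mean keeps a few unusually good or bad docking poses from deciding which method wins a target.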

Figure 4.

Alt text: Head-to-head evaluation of molecular docking performance across test targets.

A head-to-head evaluation of docking performance was performed by comparing median scores with the baseline and ranking targets by the lowest docking score of our model.

This distribution demonstrates that our method achieves a significant advantage in targeted molecular design. Our approach provides lower median scores for the vast majority of targets, indicating superior accuracy and stability in target binding affinity prediction compared with other models. In comparison to baseline models, our method not only better captures the interactions between molecules and targets but also offers stronger molecular design guidance for most targets, showcasing higher predictive capability and generalizability.

Pocket-independent molecular properties

To evaluate the basic drug properties of molecules generated by the model, we compare the density distributions of four key drug property indicators (QED, LogP, SAS, and TPSA) between the generated samples and the real training set, analyzing their similarities and differences to assess the quality of the generated molecules.

As shown in Fig. 5a–d, the generated samples (green area) exhibit a high density of QED values between 0.4 and 0.65, which is similar to the distribution in the training set, indicating that the generative model can produce compounds with good drug-like properties. The LogP values of the generated samples span both positive and negative values, showing that the generated molecules cover a diverse range of lipophilic and hydrophilic characteristics. For SAS, the generated samples are concentrated primarily in the range of 3.5–6.5, whereas the training set is concentrated in the 1–5 range. This suggests that the generative model may be inclined to produce compounds that are more difficult to synthesize. The TPSA distribution shows that the generated samples overlap with the training set in the range of 0–200, with little distinction between the two. Additionally, we observed that the properties of the generated molecules generally follow a normal distribution, while the property distribution of the training samples lacks a clear pattern. We hypothesize that this is due to the presence of single molecules docked to multiple targets in the dataset and the fact that the training set was manually selected from the PDBbind dataset.

Figure 5.

Alt text: Drug-like properties of generated molecules.

Drug-like properties of generated molecules. Panels (a–d) show probability distributions of QED, SAS, LogP, and TPSA in the generated, training, and reference molecules. Panels (e and f) show the ECFP, MACCS, Morgan, and RDF molecular fingerprints of reference and generated molecules after dimensionality reduction to 1D and 2D property distributions.

Furthermore, we apply dimensionality reduction to four molecular fingerprints to compare the generated molecules with the reference molecules, because molecular fingerprints take more drug-like properties into account. Figure 5e and f show that the property distributions of the generated and reference molecules lie roughly in the same range, indicating that the DMDiff model has learned the molecular property distributions.

Ring and bond sizes in generated 3D molecules

Figure 6a shows the proportion of molecular ring sizes in the training data, reference data, and generated data. We observe that six-membered rings make up the vast majority. The rings in the molecules generated by the DMDiff model have an average of 6.34 atoms, which suggests that our model is capable of generating macrocyclic compounds. Figure 6b and Table 2 show the differences among methods for various ring sizes. For smaller rings (such as those of size 3 or 4), the proportions vary considerably between methods; for a ring size of 4, liGAN and TargetDiff produce notable proportions (15.7% and 2.6%, respectively). Our method produces 52.2% six-membered rings. For larger rings (with size >6), TargetDiff and our method produce the highest proportions. Overall, DMDiff shows a relatively stable improvement across different ring sizes, with particularly outstanding performance for larger rings. Figure 6c compares DMDiff with the baseline model in terms of molecular bond lengths. Jensen–Shannon divergence is used to compare the distribution of bond distances between the reference and generated molecules. DMDiff achieves the lowest values across all eight common bond types, showing a more stable ability to generate 3D molecules. Although six-membered rings are the most abundant in drug molecules, macrocyclic compounds have unique physicochemical properties, and Diao et al. [54] specifically designed models to generate macrocyclic compounds using deep learning methods. The DMDiff model is expected to allow the generation of macrocyclic drug-like molecules.
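The bond-length comparison can be sketched as a Jensen–Shannon divergence between two histograms of bond distances. The bin range, bin count, natural-log base, and function names below are assumptions; some works report the square root of this quantity (the Jensen–Shannon distance) or use base-2 logarithms instead.

```python
import math


def histogram(values, lo, hi, bins):
    """Normalized histogram of the values that fall inside [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in values:
        if lo <= value < hi:
            counts[min(int((value - lo) / width), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence, skipping bins where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log); 0 for identical distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Usage with made-up C-C bond lengths in angstroms.
reference = histogram([1.52, 1.53, 1.54, 1.51], lo=1.0, hi=2.0, bins=10)
generated = histogram([1.48, 1.55, 1.62, 1.53], lo=1.0, hi=2.0, bins=10)
divergence = js_divergence(reference, generated)
```

A value of 0 means the two bond-length distributions are identical, and the natural-log version is bounded above by ln 2, reached when the distributions do not overlap at all.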

Figure 6.

Alt text: Ring and bond lengths in generated 3D molecules.

Ring and bond lengths in generated 3D molecules. Panel (a) shows the proportion of rings in the training data, reference data, and generated molecules; panel (b) shows the proportion of different ring sizes compared with the baseline models; and panel (c) compares our model with the baseline model in terms of molecular bond lengths. Jensen–Shannon divergence is used to compare the distribution of bond distances between the reference and generated molecules; lower values indicate better performance. The symbol - denotes a single bond, = a double bond, and @ an aromatic bond.

Table 2.

Molecular ring sizes in training set and those generated by the different models

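| Ring size | Ref. (%) | liGAN (%) | AR (%) | Pocket2Mol (%) | TargetDiff (%) | DMDiff (ours) (%) |
|---|---|---|---|---|---|---|
| 3 | 1.7 | 28.1 | 29.9 | 0.1 | 0.0 | 0.0 |
| 4 | 0.0 | 15.7 | 0.0 | 0.0 | 2.6 | 1.8 |
| 5 | 30.2 | 29.8 | 16.0 | 16.4 | 30.8 | 20.6 |
| 6 | 67.4 | 22.7 | 51.2 | 80.4 | 50.7 | 52.2 |
| 7 | 0.7 | 2.6 | 1.7 | 2.6 | 12.1 | 18.0 |
| 8 | 0.0 | 0.8 | 0.7 | 0.3 | 2.7 | 5.1 |
| 9 | 0.0 | 0.3 | 0.5 | 0.1 | 0.9 | 2.3 |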

Case study of the atomistic affinity interpretation

We performed docking analysis of the generated molecules with respective protein targets. Figure 7a shows the structures of the docking pockets of PDB ID: 2Z3H and 4AAW, highlighting the binding sites of the targets. The 2D interaction maps illustrate the spatial distribution of the molecules in the binding pockets and their interaction characteristics. In addition, the affinity expert prediction network provides a quantitative analysis of the contribution of different atoms to the binding affinity. The red atoms in the figure indicate that their affinity contribution to the target is greater than the baseline level, implying that these atoms may occupy an important position in the binding process and enhance the affinity of the molecule to the target, whereas the blue atoms indicate that their affinity contribution is lower than the baseline, and may not play an active role in the binding. This analysis not only reveals the key atom–atom interactions for target docking, but also provides important clues for the structural optimization of the molecule, guiding the subsequent drug design and functional improvement. DMDiff exhibits partial explainability by linking generated molecular features with binding affinity, which helps pharmaceutical experts to understand why molecules are generated, and this explainability of molecule generation is expected to link residues of protein targets to further understand the binding mechanism of molecules on the protein pocket.

Figure 7.

Alt text: Part of the generated molecules were docked to the target.

A subset of the generated molecules docked to their targets. Panels (a and b) show, for PDB ID: 2Z3H and 4AAW respectively, the structure of the docked pocket, the 3D and 2D interaction plots of the pocket with the molecule after docking, and the per-atom affinity contributions estimated by the affinity expert prediction network. The interaction plots were produced using LigPlot+ [55].

Impact of protein pockets size on docking-based affinity

The size and shape of protein interface pockets significantly influence the binding affinity of generated molecules, with larger pockets often providing more favorable interactions. Models rarely generate high-affinity molecules when confronted with smaller pockets. For comparison, the target on the left side of Fig. 8a (PDB ID: 1H0I) yields weaker affinity for molecules generated by DMDiff, with a score of −4.52, whereas the target on the right (PDB ID: 4IWQ) shows an average affinity score of −13.28 for generated molecules.

Figure 8.

Relationship between protein pocket characteristics and the affinity of generated molecules. Panel (a) illustrates the relationship between interface pocket volume and surface area and the affinity of generated molecules, where PDB ID: 1H0I corresponds to a small interface pocket and 4IWQ to a large one. Panel (b) displays the Pearson correlation between pocket surface area and binding affinity for all test targets, the worst docking result, and the worst docking result excluding DMA and MGE. Panel (c) shows the Pearson correlation between pocket volume and binding affinity for the same three cases. Panel (d) compares the docking results generated by DMDiff and the three baseline models across different protein pocket volumes. The docking pocket properties were obtained using KVFinder [56].

To examine the performance of our model on smaller protein pockets, we analyzed scatter plots and Pearson correlation coefficients between binding affinity and docking pocket volume or area for three cases: all targets, the targets with the worst docking results, and the targets with the worst docking results without DMA and MGE. As shown in Fig. 8b and c, both the area and the volume of the docking pocket are negatively correlated with the docking-based affinity score of the generated molecules. The contact area of the docking pocket correlates more strongly with molecular affinity than the volume does. Furthermore, for both contact area and volume, the affinity of the generated molecules improved when DMA and MGE were used. The scatter plots also indicate that the generated molecules have lower docking scores, supporting the conclusion that our model can produce high-affinity molecules even for small protein pockets. We also compared DMDiff with three baseline models on their docking results across protein pockets of varying sizes. As illustrated in Fig. 8d, DMDiff consistently produces high-quality ligands even for smaller binding pocket volumes.
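The correlation analysis described above can be illustrated with a minimal sketch. The pocket volumes and docking scores below are hypothetical placeholder values, not data from this study, and the helper name `pearson_r` is our own. The sketch only shows how a Pearson coefficient between a pocket property and a docking score is computed, and why a negative coefficient means that larger pockets go with lower (more favorable) scores.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1D arrays of equal length >= 2")
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return float((xc * yc).sum() / denom)


# Hypothetical values for illustration only (not taken from the paper).
pocket_volume = [250.0, 400.0, 650.0, 900.0, 1200.0]
docking_score = [-5.1, -6.4, -7.9, -9.6, -11.8]

r = pearson_r(pocket_volume, docking_score)
# A negative r means the docking score decreases as the pocket grows.
print(f"Pearson r between pocket volume and docking score: {r:.3f}")
```

In practice, the pocket properties would come from a cavity detection tool and the scores from a docking program; the same function applies to pocket surface area.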

Discussion

DMDiff generates 3D molecules with high affinity for a given protein target, which may accelerate the drug discovery process. This method relies on static molecular structures and simplified interaction models, but it can capture the structural information of molecules in 3D space and thus predict the binding affinity of drugs to targets more accurately [57, 58]. One critical challenge that remains is the limited accuracy of commonly used scoring functions. Many traditional scoring functions, including empirical and knowledge-based methods, may fail to fully account for complex physical interactions such as solvation effects, entropic contributions, or induced fit during binding. This can result in misleading affinity estimates and suboptimal candidate selection. Moreover, the method has limitations when dealing with protein dynamics. Combining it with molecular dynamics simulation would provide a dynamic perspective for drug design, allowing the modeling of ligand and target interactions to go beyond static structure prediction. In future work, we will try to combine DMDiff with advanced machine learning potential molecular dynamics models, such as AI²BMD [59], to account for the dynamic interactions between targets and ligands. Meanwhile, AI-based drug design is also evolving toward more advanced evaluation methods. For example, combining multiscale simulation and virtual screening with wet-lab experimental data makes the screening of drug candidates more comprehensive and precise.

Conclusion

In this study, we focus on the task of structure-based 3D drug molecule generation, aiming to generate high-affinity molecules using a diffusion model. Existing models for docking in 3D space are inadequate in both distance perception and the geometric description of the ligand. The central innovation of DMDiff is a mixed attention mechanism that combines long-range and distance-aware attention heads. Long-range attention captures dependencies between distant atoms, whereas distance-aware attention focuses on short-range interactions guided by distance-aware factors. This mixed attention mechanism allows the model to attend to both local geometries and global molecular features, which are essential for accurate prediction of molecular properties and interactions with target proteins. Our proposed molecular geometry feature enhancement strategy further enhances the affinity of ligand molecules. The DMDiff model can generate high-affinity 3D drug molecules based on the structure of protein pockets, outperforming existing models. The generated molecules also contain larger rings, which holds promise for designing more complex drugs. Bond analysis experiments demonstrate that the generated 3D drug molecules are more stable. The generated molecules are interpretable and enable further understanding of the binding process between protein pockets and ligand molecules. Overall, our proposed model is expected to accelerate structure-based molecular design.
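To make the idea of mixing long-range and distance-aware attention heads concrete, the following is a minimal NumPy sketch and not the DMDiff implementation. It assumes that long-range heads use plain scaled dot-product attention over all atoms, and that distance-aware heads subtract a Gaussian-style penalty, controlled by a hypothetical width `sigma`, from the attention logits so that nearby atoms receive more weight. The function name `mixed_attention`, the head counts, the form of the penalty, and the random projection weights standing in for learned parameters are all illustrative assumptions.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mixed_attention(h, pos, n_long=2, n_dist=2, sigma=2.0, seed=0):
    """Mix long-range and distance-aware heads (illustrative sketch).

    h:   (N, d) atom features; pos: (N, 3) atom coordinates.
    Returns updated (N, d) features.
    """
    n, d = h.shape
    n_heads = n_long + n_dist
    if d % n_heads != 0:
        raise ValueError("feature size must be divisible by the head count")
    d_head = d // n_heads
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned weights (assumption).
    wq, wk, wv = (
        rng.standard_normal((n_heads, d, d_head)) / np.sqrt(d) for _ in range(3)
    )
    # Pairwise Euclidean distances between atoms, shape (N, N).
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    outputs = []
    for i in range(n_heads):
        q, k, v = h @ wq[i], h @ wk[i], h @ wv[i]
        logits = q @ k.T / np.sqrt(d_head)
        if i >= n_long:
            # Distance-aware head: penalize far-apart atom pairs (assumed form).
            logits = logits - dist ** 2 / (2.0 * sigma ** 2)
        outputs.append(softmax(logits, axis=-1) @ v)
    return np.concatenate(outputs, axis=-1)


rng = np.random.default_rng(1)
features = rng.standard_normal((5, 8))   # 5 atoms, 8 feature channels
coords = rng.standard_normal((5, 3))     # 3D coordinates
updated = mixed_attention(features, coords)
print(updated.shape)
```

Because the coordinates enter this sketch only through pairwise distances, the updated features do not change when the whole structure is rotated or translated. A full SE(3)-equivariant network would additionally update the coordinates, which is omitted here.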

Key Points

  • A mixed attention geometric neural network based on Euclidean distance is proposed for extracting 3D atomic information.

  • A molecular geometry feature enhancement strategy is presented to strengthen the relationship between small-molecule volume and the protein pocket.

  • Our proposed model enables structure-based 3D drug molecular design, achieves state-of-the-art docking-based affinity, and maintains essential molecular drug properties.

Supplementary Material

20250920_Supplemental_Information_bbaf542

Acknowledgements

The authors thank the anonymous reviewers for their valuable suggestions.

Contributor Information

Hao Lu, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Zhiqiang Wei, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Jiaming Liu, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Jiangrui Li, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Qian Wang, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Hao Liu, College of Computer Science and Technology, Ocean University of China, Songling Road, 266100 Shandong Province, China.

Author contributions

H.Lu was responsible for the design and development of the software, conducted the experiments, and contributed to manuscript editing. Z.W. proposed the method for model design and made revisions to the manuscript. J.L. and J.L. assisted with the implementation of the case study and the creation of several figures. Q.W. gave valuable recommendations. H.Liu designed the overall architecture of the project and made revisions to the manuscript.

Conflict of interest

None declared.

Funding

This work was supported by Shandong Provincial Key Research and Development Program projects (grant no. 2024TSGC0226) and overseas joint training of PhD students from Ocean University of China.

Data availability

The CrossDocked2020 dataset is openly available. Our model is available via: https://github.com/luhao27/DMDiff.

References

  • 1. Abramson J, Adler J, Dunger J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold3. Nature 2024;630:493–500. 10.1038/s41586-024-07487-w
  • 2. Jumper J, Evans R, Pritzel A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583–9. 10.1038/s41586-021-03819-2
  • 3. Springer Nature. Nobel Prize in Chemistry campaign. https://www.springernature.com/gp/researchers/campaigns/nobel-prize/chemistry. (3 January 2025, date last accessed).
  • 4. Das P, Sercu T, Wadhawan K. et al. Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nat Biomed Eng 2021;5:613–23. 10.1038/s41551-021-00689-x
  • 5. Li Y, Zhang L, Wang Y. et al. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat Commun 2022;13:6891. 10.1038/s41467-022-34692-w
  • 6. Godinez WJ, Ma EJ, Chao Alexander T. et al. Design of potent antimalarials with generative chemistry. Nat Mach Intell 2022;4:180–6. 10.1038/s42256-022-00448-w
  • 7. Ren F, Alex A, Chen J. et al. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models. Nat Biotechnol 2025;43:63–75. 10.1038/s41587-024-02143-0
  • 8. Alakhdar A, Poczos B, Washburn N. et al. Diffusion models in de novo drug design. arXiv, https://arxiv.org/abs/2406.08511, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 9. Guo Z, Liu J, Wang Y. et al. Diffusion models in bioinformatics and computational biology. Nat Rev Bioeng 2023;2:136–54. 10.1038/s44222-023-00114-9
  • 10. Soleymani F, Paquet E, Viktor HL. et al. Structure-based protein and small molecule generation using EGNN and diffusion models: a comprehensive review. Comput Struct Biotechnol J 2024;23:2779–97. 10.1016/j.csbj.2024.06.021
  • 11. Yang L, Zhang Z, Song Y. et al. Diffusion models: a comprehensive survey of methods and applications. arXiv, http://arxiv.org/abs/2209.00796, 2022, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 12. Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, pp. 6840–51, 2020.
  • 13. Nichol AQ, Dhariwal P. Improved denoising diffusion probabilistic models. In: Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual, pp. 8162–71, 2021.
  • 14. Wu K, Karapetyan E, Schloss J. et al. Advancements in small molecule drug design: a structural perspective. Drug Discov Today 2023;28:103730. 10.1016/j.drudis.2023.103730
  • 15. Sadybekov AV, Katritch V. Computational approaches streamlining drug discovery. Nature 2023;616:673–85. 10.1038/s41586-023-05905-z
  • 16. Bilodeau C, Jin W, Jaakkola T. et al. Generative models for molecular discovery: recent advances and challenges. Wiley Interdiscip Rev Comput Mol Sci 2022;13:e1635. 10.1002/wcms.1635
  • 17. Zhang O, Wang T, Weng G. et al. Learning on topological surface and geometric structure for 3D molecular generation. Nat Comput Sci 2023;3:849–59. 10.1038/s43588-023-00530-2
  • 18. Wang M, Li S, Wang J. et al. ClickGen: directed exploration of synthesizable chemical space via modular reactions and reinforcement learning. Nat Commun 2024;15:10127. 10.1038/s41467-024-54456-y
  • 19. Huang L, Xu T, Yu Y. et al. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat Commun 2024;15:2657. 10.1038/s41467-024-46569-1
  • 20. Schneuing A, Du Y, Harris C. et al. Structure-based drug design with equivariant diffusion models. arXiv, http://arxiv.org/abs/2210.13695, 2022, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 21. Igashov I, Stärk H, Vignac C. et al. Equivariant 3d-conditional diffusion model for molecular linker design. Nat Mach Intell 2024;6:417–27. 10.1038/s42256-024-00815-9
  • 22. Huang Z, Yang L, Zhou X. et al. Interaction-based retrieval-augmented diffusion models for protein-specific 3D molecule generation. In: Proceedings of the 41st International Conference on Machine Learning (ICML), Vienna, Austria, 2024.
  • 23. Han J, Cen J, Wu L. et al. A survey of geometric graph neural networks: data structures, models and applications. arXiv, https://arxiv.org/abs/2403.00485, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 24. Han J, Rong Y, Xu T. et al. Geometrically equivariant graph neural networks: a survey. arXiv, http://arxiv.org/abs/2202.07230, 2022, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 25. Huang L, Zhang H, Xu T. et al. MDM: molecular diffusion model for 3D molecule generation. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI), Washington, USA, 2023;37.
  • 26. Li X, Wang L, Luo Y. et al. Geometry informed tokenization of molecules for language model generation. arXiv, https://arxiv.org/abs/2408.10120, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 27. Vignac C, Osman N, Toni L. et al. MiDi: mixed graph and 3D denoising diffusion for molecule generation. In: Proceedings of the International Conference on Learning Representations (ICLR), Kigali, Rwanda, 2023.
  • 28. Weiss T, Mayo Yanes E, Chakraborty S. et al. Guided diffusion for inverse molecular design. Nat Comput Sci 2023;3:873–82. 10.1038/s43588-023-00532-0
  • 29. Tripp BL, Yim J, Tischer D. et al. Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem. arXiv, http://arxiv.org/abs/2206.04119, 2023, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 30. Tang Q, Ratnayake R, Seabra G. et al. Morphological profiling for drug discovery in the era of deep learning. Brief Bioinform 2024;25:bbae284. 10.1093/bib/bbae284
  • 31. Mullard A. When can AI deliver the drug discovery hits? Nat Rev Drug Discov 2024;23:159–61. 10.1038/d41573-024-00036-0
  • 32. Du Y, Jamasb AR, Guo J. et al. Machine learning-aided generative molecular design. Nat Mach Intell 2024;6:589–604. 10.1038/s42256-024-00843-5
  • 33. Chen S, Zhang O, Jiang C. et al. Deep lead optimization enveloped in protein pocket and its application in designing potent and selective ligands targeting LTK protein. Nat Mach Intell 2025;7:448–58. 10.1038/s42256-025-00997-w
  • 34. Guan J, Qian WW, Peng X. et al. 3D equivariant diffusion for target-aware molecule generation and affinity prediction. In: Proceedings of the International Conference on Learning Representations (ICLR), Kigali, Rwanda, 2023.
  • 35. Qian H, Huang W, Tu S. et al. KGDiff: towards explainable target-aware molecule generation with knowledge guidance. Brief Bioinform 2023;25:bbad435. 10.1093/bib/bbad435
  • 36. Elfwing S, Uchibe E, Doya K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. arXiv, http://arxiv.org/abs/1702.03118, 2017, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 37. Francoeur PG, Masuda T, Sunseri J. et al. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J Chem Inf Model 2020;60:4200–15. 10.1021/acs.jcim.0c00411
  • 38. Wang R, Fang X, Lu Y. et al. The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J Med Chem 2004;47:2977–80. 10.1021/jm030580l
  • 39. Steinegger M, Söding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 2017;35:1026–8. 10.1038/nbt.3988
  • 40. Schrödinger LLC. The PyMOL Molecular Graphics System. https://www.pymol.org/. (15 January 2025, date last accessed).
  • 41. Eberhardt J, Santos-Martins D, Tillack AF. et al. AutoDock Vina 1.2.0: new docking methods, expanded force field, and python bindings. J Chem Inf Model 2021;61:3891–8. 10.1021/acs.jcim.1c00203
  • 42. RDKit. RDKit: Open-Source Cheminformatics Software. https://www.rdkit.org/. (15 January 2025, date last accessed).
  • 43. Luo S, Guan J, Ma J. et al. A 3D generative model for structure-based drug design. In: Advances in Neural Information Processing Systems (NeurIPS), Virtual, 2021;34:6229–39.
  • 44. Liu M, Luo Y, Uchino K. et al. Generating 3D molecules for target protein binding. In: Proceedings of the International Conference on Machine Learning (ICML), Baltimore, Maryland, USA, 2022.
  • 45. Peng X, Luo S, Guan J. et al. Pocket2Mol: efficient molecular sampling based on 3D protein pockets. In: Proceedings of the International Conference on Machine Learning (ICML), Baltimore, Maryland, USA, pp. 17644–55, 2022.
  • 46. Dorna V, Subhalingam D, Kolluru K. et al. TAGMol: target-aware gradient-guided molecule generation. arXiv, https://arxiv.org/abs/2406.01650, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 47. Li X, Wang P, Fu T. et al. AUTODIFF: autoregressive diffusion modeling for structure-based drug design. arXiv, https://arxiv.org/abs/2404.02003, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 48. Huang Z, Yang L, Zhang Z. et al. Binding-adaptive diffusion models for structure-based drug design. arXiv, https://arxiv.org/abs/2402.18583, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 49. Guan J, Zhou X, Yang Y. et al. DecompDiff: diffusion models with decomposed priors for structure-based drug design. arXiv, https://arxiv.org/abs/2403.07902, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 50. Huang Z, Yang L, Zhou X. et al. Protein-ligand interaction prior for binding-aware 3D molecule diffusion models. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, Canada, 2024.
  • 51. Zhou X, Cheng X, Yang Y. et al. DecompOpt: controllable and decomposed diffusion models for structure-based molecular optimization. arXiv, https://arxiv.org/abs/2403.13829, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 52. Jiang Y, Zhang G, You J. et al. PocketFlow: a data-and-knowledge-driven structure-based molecular generative model. arXiv, https://arxiv.org/abs/2403.13829, 2024, preprint: not peer reviewed. (15 January 2025, date last accessed).
  • 53. Zhong W, Kim H, Kim W. 3D molecular generative framework for interaction-guided drug design. Nat Commun 2024;15:2688. 10.1038/s41467-024-47011-2
  • 54. Diao Y, Liu D, Ge H. et al. Macrocyclization of linear molecules by deep learning to facilitate macrocyclic drug candidates discovery. Nat Commun 2023;14:4552. 10.1038/s41467-023-40219-8
  • 55. Laskowski RA, Swindells MB. LigPlot+: multiple ligand–protein interaction diagrams for drug discovery. J Chem Inf Model 2011;51:2778–86. 10.1021/ci200227u
  • 56. Guerra JVS, Ribeiro-Filho HV, Pereira JGC. et al. KVFinder-web: a web based application for detecting and characterizing biomolecular cavities. Nucleic Acids Res 2023;51:W289–97. 10.1093/nar/gkad324
  • 57. Mirarchi A, Giorgino T, De Fabritiis G. mdCATH: a large-scale MD dataset for data-driven computational biophysics. Sci Data 2024;11:1299. 10.1038/s41597-024-04140-z
  • 58. Siebenmorgen T, Menezes F, Benassou S. et al. MISATO: machine learning dataset of protein-ligand complexes for structure-based drug discovery. Nat Comput Sci 2024;4:367–78. 10.1038/s43588-024-00627-2
  • 59. Wang T, He X, Li M. et al. Ab initio characterization of protein molecular dynamics with AI²BMD. Nature 2024;635:1–9. 10.1038/s41586-024-08127-z

Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press
