Abstract
Accurate prediction of protein–ligand binding affinity (PLA) is important for drug discovery. Recent advances in applying graph neural networks have shown great potential for PLA prediction. However, existing methods usually neglect geometric information (e.g. bond angles), which makes it difficult to accurately distinguish different molecular structures. In addition, these methods are limited in representing the binding process of protein–ligand complexes. To address these issues, we propose a novel geometry-enhanced mid-fusion network, named GEMF, to learn comprehensive molecular geometry and interaction patterns. Specifically, GEMF consists of a graph embedding layer, a message passing phase, and a multi-scale fusion module. GEMF represents protein–ligand complexes as graphs, with graph embeddings based on physicochemical and geometric properties. Moreover, our dual-stream message passing framework models both covalent and non-covalent interactions. In particular, the edge-update mechanism, which is based on line graphs, fuses both distance and angle information in the covalent branch. In addition, a communication branch consisting of multiple heterogeneous interaction modules is developed to learn intricate interaction patterns. Finally, we fuse the multi-scale features from the covalent, non-covalent, and heterogeneous interaction branches. Extensive experimental results on several benchmarks demonstrate the superiority of GEMF compared with other state-of-the-art methods.
Keywords: binding affinity, message passing, geometric-enhanced, bond angle information, mid-fusion, binding process
Introduction
Protein–ligand binding affinity (PLA) is a measurable indicator of the strength of the interaction between a ligand (drug) and a protein (target). The precise determination of PLA plays an important role in drug discovery [1]. In practice, traditional biochemical experiments such as isothermal titration calorimetry [2] and surface plasmon resonance [3] remain the mainstream for measuring binding affinity, offering the most accurate and reliable approach. However, these approaches are both costly and time-consuming in high-throughput screening. To overcome these limitations, many in silico methods have been proposed to accelerate the drug screening process by prioritizing drug candidates using scoring functions. Among these, machine learning (ML) scoring functions are particularly favored for their broad applicability and efficiency.
Early studies in classic ML [4–6] represent the pioneering research in predicting PLA, and work in a way similar to the quantitative structure–activity relationship [7] in cheminformatics. However, these approaches depend heavily on hand-crafted features, which limits their generalization capabilities. Deep learning models, on the other hand, can automatically extract latent information from raw data, alleviating the need for knowledge-based feature engineering. Previous models based on convolutional neural networks (CNNs) [8, 9] used different 1D kernels to extract features separately from protein residue sequences and ligand SMILES (Simplified Molecular Input Line Entry System) strings and then concatenated these features for further analysis. DeepCDA [10] proposed a bidirectional attention mechanism to capture the binding sites between protein modifications and ligands, and used a stacked network of convolutional and Long Short-Term Memory (LSTM) layers to learn deep representations. To make better use of the graph structure of molecules, several approaches [11, 12] represent ligands as graphs and employ graph neural networks (GNNs) to learn the molecular topology while still using 1D-CNNs to extract features at the protein level. Some studies [13–15] use residue contact maps predicted from sequences to construct protein graphs, offering an alternative to sequence representations. TripletMultiDTI [16] combines sequence data with multimodal knowledge of drug–drug and protein–protein profiles to take full advantage of the different data types. To better separate the features of different samples, a triplet loss function is also introduced and combined with the predictive loss for training. CCL-DTI [17] enhances TripletMultiDTI by incorporating an attention mechanism for multimodality fusion, and introduces three loss functions to further strengthen the model’s contrastive learning capability.
Unlike sequence models, molecular structural data explicitly represent a molecule’s three-dimensional conformation, which has inspired the development of 3D convolution-based models for structural data analysis. Some methods implement localized spatial sampling [18, 19], rasterizing the structure surrounding the ligand molecule and using 3D convolution kernels to obtain voxel-wise atom samples. However, incorporating coordinate information into atomic features undermines the model’s capacity to maintain the translation, rotation, and substitution invariance crucial for molecular modeling. Other methods [20, 21] are based on atom or residue type counting: they partition the space around the ligand into shells defined by varying distances from the ligand center and count the element pairs formed by protein atoms/residues and ligand atoms within each shell. A 2D-CNN is then applied for feature extraction. However, relying solely on atom/residue pair counting fails to capture the spatial relationships between protein atoms/residues and ligand atoms.
In recent years, some GNN models based on spatial encoding have been proposed to exploit the positional information of atoms. For example, Li et al. [22] constructed protein–ligand complexes as interactive graphs based on atomic pairwise distances, with the corresponding distance embeddings serving as edge attributes. The graph is then processed by a Message Passing Neural Network, which enables the model to perceive the geometric structure of complexes. GNN-DTI [23] developed a dedicated adjacency matrix for the complex graph that serves as the message weight. Its components are as follows: covalent connections are represented by the value one, non-covalent connections take values derived from a Gaussian kernel that decays with distance (smaller values for greater distances, indicating weaker non-covalent interactions), and the remaining elements are filled with zeros. IGN [24] is another study that first learns covalent representations of atoms in residue contact maps and ligand graphs through a shared GNN module. This module is then sequentially linked to a bipartite graph representing two distinct types of atomic sets, to further learn the non-covalent representations of the atoms.
Although existing GNN-based approaches have already shown good performance in PLA prediction, they often overlook the bond angles within proteins and ligands, which consequently limits their ability to learn molecular geometry. Theoretically, bond lengths and angles in molecules are critical factors affecting their electronic properties and orbital configurations. ALIGNN [25] has proven that bond angles are important spatial information for molecules in the field of material property prediction. Furthermore, previous evidence suggests that incorporating explicit angular information can enhance classical force-field inspired descriptors [26]. Hence, comprehensive geometric information is of practical significance for a deeper understanding of molecular properties and an enhanced accuracy in PLA prediction. In recent years, some affinity prediction methods have recognized the significance of angle information. For example, graphLambda [27] uses radial symmetry functions (BPS functions) to describe pairwise distances and angles. This model benefits from the addition of angle information but does not fully utilize the relationships between nodes, edges, and angles.
In addition, current GNN methods are limited in representing the interactions of protein–ligand complexes. Existing topology-based models usually model only covalent interactions [11, 12], while distance-aware 3D-GNNs [22–24] tend to treat covalent and non-covalent interactions as the same type. However, protein–ligand complexes, unlike general molecules, rely on intermolecular non-covalent interactions to maintain stability. Therefore, these approaches might obscure the physicochemical properties dominated by covalent interactions. Although IGN [24] uses a hierarchical learning strategy to model different interactions, the ‘induced fit’ theory suggests that non-covalent interactions during the binding process can alter protein conformation. Such a hierarchical representation therefore makes it difficult to provide insights into the dynamic binding process.
To overcome the aforementioned limitations, we propose a novel network, the Geometry-Enhanced and Mid-Fusion network (GEMF). Our network first separates the protein–ligand complex into graphs representing covalent and non-covalent interactions. In detail, GEMF is built on a Three-Branch Message Passing Framework (TBMP), which learns both covalent and non-covalent interactions, along with the interaction features specific to different time intervals of the binding process. TBMP constructs feature embeddings from the physicochemical properties and geometric factors (including distances and bond angles) of protein and ligand molecules. In the prediction stage, we fuse the outputs of all branches to generate a multi-scale graph vector and output the PLA prediction. In a nutshell, we summarize the contributions and findings of this work as follows:
We propose a model named GEMF, based on a TBMP, which simultaneously learns covalent and non-covalent interactions of protein–ligand complexes. GEMF reinforces the geometric representation of molecules while preserving the fused information from heterogeneous interactions across different stages.
We propose an Angle-aware Edge Enhancement Module (A-EEM) for molecular representation learning, which simultaneously models the influence of atoms, chemical bonds, and bond angles through three stages of node-edge, edge-edge, and edge-node message passing to comprehensively encode the geometric features and covalent interactions of molecules.
We incorporate multiple heterogeneous interaction modules (HIMs) within the communication branch to capture fused information from various stages, thereby enhancing the perception of the entire binding process.
GEMF shows outstanding performance on the three test sets CASF2016, CASF2013, and Holdout2019. In addition, the visualization results show that the model can accurately capture the ligand atoms that are highly correlated with hydrogen bonding, which can provide certain biological insights. We have released all the details and replication package at https://github.com/Yuke-Qin/GEMF/tree/master.
Methodology
In this section, we will present the pipeline of our GEMF, which includes graph construction, embedding representation, and the backbone architecture.
Datasets
PDBbind [29], a widely acknowledged dataset, plays an important role in developing scoring functions for PLA prediction. The raw data include experimentally determined 3D conformations of proteins and ligands, along with their binding affinities. In this context, binding affinity is typically quantified as the dissociation constant (Kd), the inhibition constant (Ki), or the half maximal inhibitory concentration (IC50) [1]. In ML tasks, the raw affinity is usually mapped to logarithmic space [20]. Consistent with most affinity prediction methods, we use the pKa value (-logKi, -logKd, or -logIC50) as the affinity evaluation index. The larger the pKa value, the higher the PLA.
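As a minimal sketch of the logarithmic mapping described above, the snippet below converts a raw affinity into its negative base-10 logarithm. The function name `to_pka` and the unit table are illustrative assumptions and not part of the GEMF code; the only requirement is that the value is expressed in mol/L before the logarithm is taken.

```python
import math

# Hypothetical helper table: unit prefix -> scale factor to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12, "fM": 1e-15}

def to_pka(value: float, unit: str = "M") -> float:
    """Return -log10 of an affinity (Kd, Ki or IC50) given in `unit`."""
    if value <= 0:
        raise ValueError("affinity must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])

# A 1 nM binder maps to about 9, a 10 uM binder to about 5:
# the stronger the binding, the larger the value.
strong = to_pka(1.0, "nM")
weak = to_pka(10.0, "uM")
```

Working in log space compresses affinities that span many orders of magnitude into a range of roughly 2 to 12, which is better suited to regression with a squared-error loss.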
Here, to compare with advanced ML methods, we used the 2016 version of PDBbind (PDBbind v.2016) for training and optimizing our model. The whole dataset contains two subsets: the general-set, the entire collection covering a total of 13 283 complexes; and the refined-set, sampled from the general-set, which contains 4057 high-quality complexes. We used two additional high-quality benchmarks for model performance testing: CASF2016 [30] and CASF2013 [31], which include 285 and 195 protein–ligand complexes, respectively. To prevent data leakage into the test set, once CASF2016 was fixed as the test set we removed the samples that were either duplicates or could not be processed with chemistry software, which left 12 904 complexes. Next, we randomly sampled 1000 complexes from the retained entries to construct the validation set and allocated the rest to the training set. In addition, we used PDBbind v.2019 to construct a realistic "hold-out" set of 4366 complexes, referred to as Holdout2019. We ensured that no sample from the previous datasets appears in Holdout2019, so that this set evaluates the model’s efficacy in predicting the binding affinity of complexes with unseen structures.
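The random train/validation split described above can be sketched as follows. The identifiers are placeholders and the seed is an assumption made here for reproducibility; the source does not state which seed or sampling routine was used.

```python
import random

def split_train_valid(ids, n_valid=1000, seed=0):
    """Randomly hold out `n_valid` complexes for validation; the rest form the training set."""
    ids = sorted(ids)              # fixed order so the seed fully determines the split
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[n_valid:], ids[:n_valid]

# 12 904 placeholder identifiers standing in for the filtered PDBbind v.2016 entries.
all_ids = [f"complex_{i}" for i in range(12904)]
train_ids, valid_ids = split_train_valid(all_ids)
print(len(train_ids), len(valid_ids))  # 11904 1000
```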
Graph construction
Molecules, composed of atoms and chemical bonds, are non-Euclidean structured data and can be naturally represented using graphs $G = (V, E)$, where $V$ and $E$ denote the sets of nodes and edges, respectively. Each atom is considered a node $v_i \in V$, and a chemical bond can be viewed as an edge $e_{ij} \in E$ connecting atoms $v_i$ and $v_j$. However, the structure of protein–ligand complexes is more sophisticated than that of single molecules. Figure 1 is a schematic of the interactions within the complex. To address this, we constructed two types of molecular graphs according to the different interactions, namely a covalent interaction graph and a non-covalent interaction graph. Considering that the actual binding site is located within the protein pocket region, we retain only those protein atoms that lie within a cutoff radius of the ligand. This cutoff radius is empirical and tunable; we set it to 5 Å. The adjacency matrices of the covalent and non-covalent interaction graphs are defined in Eq. 1 and 2.
Figure 1.

Process of constructing covalent and non-covalent interaction graphs of protein–ligand complexes.
![]() |
(1) |
![]() |
(2) |
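A minimal sketch of the pocket selection and of a distance-based non-covalent adjacency matrix is given below. The 5 Å pocket cutoff comes from the text; the `threshold` used for non-covalent edges, the function names, and the rule "link every intermolecular pair closer than the threshold" are assumptions made for illustration, since Eq. 2 is not reproduced here.

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pocket_atoms(protein_xyz, ligand_xyz, cutoff=5.0):
    """Indices of protein atoms within `cutoff` angstroms of any ligand atom."""
    return [i for i, p in enumerate(protein_xyz)
            if any(dist(p, l) <= cutoff for l in ligand_xyz)]

def noncovalent_adjacency(ligand_xyz, pocket_xyz, threshold=5.0):
    """0/1 matrix over [ligand atoms + pocket atoms].

    Only intermolecular pairs closer than `threshold` (an assumed value)
    are connected; ligand-ligand and protein-protein entries stay zero.
    """
    n_l, n_p = len(ligand_xyz), len(pocket_xyz)
    adj = [[0] * (n_l + n_p) for _ in range(n_l + n_p)]
    for i, l in enumerate(ligand_xyz):
        for j, p in enumerate(pocket_xyz):
            if dist(l, p) < threshold:
                adj[i][n_l + j] = adj[n_l + j][i] = 1
    return adj

# Toy coordinates: two ligand atoms and three protein atoms on the x axis.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
keep = pocket_atoms(protein, ligand)          # the atom at x = 12 is dropped
adj = noncovalent_adjacency(ligand, [protein[i] for i in keep])
```

The covalent graph would instead take its edges from the bond table of each molecule, so the two adjacency matrices share the same node set but have disjoint edge sets.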
Embedding layer
This section describes how node and edge features are constructed for the covalent and non-covalent interaction graphs of protein–ligand complexes. As shown in Figure 2(b), the embedding layer takes the covalent interaction graph and the non-covalent interaction graph from Figure 2(a) as inputs and projects them into a higher-dimensional space. In most prior work, the physicochemical features of atoms and chemical bonds have been shown to be meaningful for predicting binding affinity. Here, we add geometric information, including distances and angles, in order to capture the intricate details of molecular structures. The details of the graph representations are as follows.
Figure 2.
The pipeline of our GEMF. (a) Constructing input graphs for our model; (b) Constructing input embeddings using physicochemical properties and geometric factors of complexes; (c) Our backbone network, which contains a dual-stream framework designed to learn the covalent and non-covalent interactions of complexes separately. Additionally, intermediate HIMs facilitate communication of heterogeneous interaction information. Ultimately, we fuse multi-scale features from the three branches; (d) Predicting the binding affinity using a fully connected network.
Physicochemical feature
In this study, we use RDKit [32] to extract molecular physicochemical attributes, listed in Table 1. Specifically, each atom is represented as a 35D binary feature vector containing six pieces of information: atom symbol, atom degree, implicit valence, hybridization, aromaticity, and the number of neighboring hydrogen atoms. Similarly, each chemical bond is represented as a 6D binary vector encoding the bond type (single, double, triple, or aromatic), whether the bond is conjugated, and whether it is in a ring.
Table 1.
List of physicochemical properties of molecules
| Type | Feature Name | Descriptor | Size |
|---|---|---|---|
| Atom Features | Atom Type | one-hot from [‘C’, ‘N’, ‘O’, ‘S’, ‘F’, ‘P’, ‘Cl’, ‘Br’, ‘I’, ‘others’] | 10 |
| | Atom Degree | one-hot from [0,1,2,3,4,5,6] | 7 |
| | Implicit Valence | one-hot from [0,1,2,3,4,5,6] | 7 |
| | Hybridization Type | one-hot from [‘SP’, ‘SP2’, ‘SP3’, ‘SP3D’, ‘SP3D2’] | 5 |
| | Aromatic Hydrocarbon | zero or one | 1 |
| | Number of Adjacent Hydrogen Atoms | one-hot from [0,1,2,3,4] | 5 |
| Bond Features | Bond Type | one-hot from [‘SINGLE’, ‘DOUBLE’, ‘TRIPLE’, ‘AROMATIC’] | 4 |
| | Whether is Conjugated | zero or one | 1 |
| | Whether is in Ring | zero or one | 1 |
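The encoding in Table 1 can be sketched in plain Python as below. The function names and the handling of values outside the listed choices (all-zero block, except for atom types, which fall back to ‘others’) are assumptions; in practice the raw attributes would be read from RDKit atom and bond objects rather than passed in by hand.

```python
ATOM_TYPES = ["C", "N", "O", "S", "F", "P", "Cl", "Br", "I", "others"]
HYBRIDIZATIONS = ["SP", "SP2", "SP3", "SP3D", "SP3D2"]
BOND_TYPES = ["SINGLE", "DOUBLE", "TRIPLE", "AROMATIC"]

def one_hot(value, choices, fallback=None):
    """One-hot encode `value`; unknown values map to `fallback` if given, else to all zeros."""
    if value not in choices:
        value = fallback
    return [int(value == c) for c in choices]

def atom_features(symbol, degree, implicit_valence, hybridization, is_aromatic, num_hs):
    """35-dimensional binary atom vector following the sizes in Table 1."""
    return (
        one_hot(symbol, ATOM_TYPES, fallback="others")   # 10
        + one_hot(degree, list(range(7)))                 # 7
        + one_hot(implicit_valence, list(range(7)))       # 7
        + one_hot(hybridization, HYBRIDIZATIONS)          # 5
        + [int(is_aromatic)]                              # 1
        + one_hot(num_hs, list(range(5)))                 # 5
    )

def bond_features(bond_type, is_conjugated, in_ring):
    """6-dimensional binary bond vector following the sizes in Table 1."""
    return one_hot(bond_type, BOND_TYPES) + [int(is_conjugated), int(in_ring)]

# Illustrative attribute values, not computed from a real molecule.
carbon = atom_features("C", 4, 0, "SP3", False, 3)
selenium = atom_features("Se", 2, 0, "SP3", False, 0)
ring_bond = bond_features("AROMATIC", True, True)
```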
Geometric feature
Previous works [24, 33] introduce distance information as the edge attributes of a molecular graph to capture the spatial relationships between atoms, thereby enhancing the molecular representation. Here, we calculate interatomic distances from Cartesian coordinates and use the cosine theorem to calculate the angle between each pair of neighboring chemical bonds, as shown in Eq. 3. Following the previous work [34], we adopt several Radial Basis Functions (RBFs) to obtain geometrically sensitive distance and bond angle embeddings, respectively denoted as
and
, where
denotes the triplet atoms that form a bond angle and
denotes the number of RBFs.
![]() |
(3) |
![]() |
(4) |
![]() |
(5) |
where
and
represent the atom displacement vector, symbol
denotes vector dot product, and
denotes 2-norm. In addition, || denotes concatenating operation,
is equal to
,
is equal to
,
and
follow the uniform distribution of the representation interval of the
RBF.
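The geometric features described above can be sketched as follows: the bond angle is obtained from the dot product and 2-norms of two displacement vectors, and a scalar (distance or angle) is expanded over evenly spaced RBF centers. The Gaussian form of the basis, the width parameter `gamma`, and the function names are assumptions; the number of centers defaults to 9, the value reported later in the hyperparameter settings.

```python
import math

def bond_angle(center, a, b):
    """Angle in radians at `center` between the bonds center-a and center-b."""
    u = [x - c for x, c in zip(a, center)]
    v = [x - c for x, c in zip(b, center)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    cos = max(-1.0, min(1.0, dot / norm))   # clamp against rounding error
    return math.acos(cos)

def rbf_expand(value, low, high, n_centers=9, gamma=10.0):
    """Gaussian RBF expansion with centers evenly spaced on [low, high]."""
    step = (high - low) / (n_centers - 1)
    return [math.exp(-gamma * (value - (low + k * step)) ** 2) for k in range(n_centers)]

# Two perpendicular bonds meeting at the origin give an angle of pi / 2.
theta = bond_angle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
angle_embedding = rbf_expand(theta, 0.0, math.pi)
```

Expanding a scalar over several centers turns one number into a smooth, localized vector, so that small changes in distance or angle move weight between neighboring basis functions instead of being lost in a single input dimension.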
Finally, we employ a Multilayer Perceptron (MLP) to transform all features into embeddings within a consistent vector space. Beforehand, the physicochemical and RBF feature of bond length were merged to construct the bond embedding. Therefore, the atom, bond, intermolecular distance, and bond angle embeddings are symbolized as
,
,
, and
, where
denotes the dimension of the embedding layer.
Message passing mechanism
CNNs have performed remarkably in processing grid data, such as images and texts, yet they are not well suited for non-Euclidean data such as molecular graphs. In contrast, GNNs operate based on the message passing mechanism, enabling the natural perception of molecular topology. Message passing frameworks update node features by aggregating neighboring messages, and successful applications in the field of molecular property prediction have exhibited substantial potential for molecular modeling [35]. In this process, information from neighboring nodes, along with that from connecting edges (an optional operation), is propagated to the target node to generate the new node representation. The formalized representation is illustrated in Eq. 6 and 7.
![]() |
(6) |
![]() |
(7) |
where
represents the current layer of the network,
denotes the neighboring atoms of atom
.
and
, respectively, represent the message function and update function.
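One round of the generic scheme (message function followed by update function) can be sketched as below. Sum aggregation, the dictionary-based graph layout, and the toy message and update functions are assumptions chosen to keep the example small; they are not the learned functions used in GEMF.

```python
def message_passing_step(h, edges, message_fn, update_fn):
    """One generic round: sum the messages arriving at each node, then update it.

    h:     dict node -> feature vector (list of floats)
    edges: dict (source, target) -> edge feature
    """
    dim = len(next(iter(h.values())))
    messages = {i: [0.0] * dim for i in h}
    for (j, i), e_ji in edges.items():            # message flows from j to i
        m = message_fn(h[i], h[j], e_ji)
        messages[i] = [a + b for a, b in zip(messages[i], m)]
    return {i: update_fn(h[i], messages[i]) for i in h}

# Toy choices: the message is the neighbor feature scaled by a scalar edge weight,
# and the update adds the aggregated message to the old feature.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 0.5, (2, 1): 0.5}
h_new = message_passing_step(
    h, edges,
    message_fn=lambda h_i, h_j, e: [e * x for x in h_j],
    update_fn=lambda h_i, m: [a + b for a, b in zip(h_i, m)],
)
```

After one round each node only knows about its direct neighbors; stacking rounds widens the receptive field by one hop per layer.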
Three-branch message passing framework
Figure 2(c) illustrates the backbone architecture of our model, the Three-branch Message Passing (TBMP) framework. Concretely, the embeddings of the covalent and non-covalent interaction graphs are first fed into a dual-stream framework for parallel learning. Because a bond angle is defined by a triplet of atoms, computing it in a general message passing framework requires 2-hop neighborhood information. We therefore implement an A-EEM in the covalent branch. This module operates on the line graph and propagates angle information through an edge-to-edge aggregation mechanism. The enhanced edge features are then fed into a cascaded Atomic Graph Module (AGM) to obtain further updated atomic features. This two-stage strategy enables our model to perceive the fine local structure of molecules. In the non-covalent branch, by contrast, only an AGM is used to update atom features, since non-covalent interactions are defined by the pairwise distances between intermolecular atoms. To fuse the heterogeneous covalent and non-covalent features of atoms and accurately represent the binding process of complexes, we propose a centrally located communication branch that includes multiple HIMs; each HIM corresponds one-to-one with a layer of the other branches. After the iterations, we integrate the pooled outputs of all branches to construct a multi-scale graph vector. Before that, skip connections are applied to all HIMs to represent the binding state at different time intervals.
The more detailed description of each module is provided below.
Angle-aware edge enhancement module
Existing message passing frameworks may not propagate bond angle information effectively, leading to insufficient representations of local molecular geometry. To this end, we use line graphs derived from atomic graphs, where each node corresponds to a bond and each edge is defined by a pair of connected bonds. Figure 3 illustrates the construction of a line graph for methane. On this line graph, we develop an edge-level message passing framework, namely A-EEM.
Figure 3.

A typical example of constructing a line graph from a methane molecular graph is presented. Here the red and blue nodes correspond, respectively, to atoms and chemical bonds, and the yellow lines represent line graph edges defined by bond angles.
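The line graph construction can be sketched as below for the methane example. The function name and the triple layout `(bond_a, bond_b, shared_atom)` are illustrative choices, not the authors' data structure.

```python
from itertools import combinations

def line_graph(bonds):
    """Return line-graph edges as (bond_a, bond_b, shared_atom) triples.

    Each bond (i, j) of the atom graph becomes a node of the line graph; two
    bonds are linked when they share exactly one atom, which is the vertex of
    the bond angle between them.
    """
    lg_edges = []
    for a, b in combinations(range(len(bonds)), 2):
        shared = set(bonds[a]) & set(bonds[b])
        if len(shared) == 1:
            lg_edges.append((a, b, shared.pop()))
    return lg_edges

# Methane: carbon is atom 0, hydrogens are atoms 1 to 4.
methane_bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
angles = line_graph(methane_bonds)
print(len(methane_bonds), len(angles))  # 4 6
```

The four C–H bonds become four line-graph nodes, and every pair of them meets at the carbon, giving the six H–C–H angles as line-graph edges.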
Specifically, we first create intermediate representations of covalent edge
by fusing the features of bonds and their end atoms using MLP:
![]() |
(8) |
Eq. 9 shows how messages are generated, that is, by aggregating neighboring edges and the vector inner products of the relevant bond angles. Next, the feature of edge
is fused with such messages to generate enhanced representations. This process, coupled with a connection to the AGM, allows for a single update of atoms to perceive the surrounding structure within a 2-hop range.
![]() |
(9) |
![]() |
(10) |
The following equation describes an additional formula with residual connections for updating angle features.
![]() |
(11) |
where
represents the current layer,
is the set of adjacent atoms of atom
, symbol
and
denote the ReLU activation function and Hadamard product, respectively.
is a learnable matrix.
Atomic graph module
This module is designed to learn atom representations of protein and ligand molecules based on the message passing framework. Because adjacent nodes with different features affect the target node to different degrees, we draw on GAT [36] and employ an AGM with a multi-head attention mechanism to discern the unique contributions of distinct atoms.
For clarity here,
and
represent node and edge features uniformly in covalent and non-covalent channels, respectively. In detail, for atom
, we first calculate the attention coefficient
with adjacent atom
and edge
.
![]() |
(12) |
The attention coefficient is then normalized through softmax and used as the weight for aggregating all the related messages. The outputs of multiple attention heads are then integrated, enabling the capture of message vectors within a multi-scale feature space. Similar to A-EEM, the feature of atom
will fuse with message
and subsequently generate the latest atom feature
.
![]() |
(13) |
![]() |
(14) |
We also adopt a formula of distance features updating based on residual connection, as follows:
![]() |
(15) |
where
and
represent the number of attention heads, and the attention number.
is a parameter matrix, and LeakyReLU is an activation function.
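A minimal, GAT-style sketch of the attention step for a single target atom is shown below. It omits the learned linear transformations and the residual distance update, and the scoring rule (a weight vector applied to the concatenation of target, neighbor, and edge features, followed by LeakyReLU) is an assumption modeled on GAT rather than a reproduction of Eq. 12 to 15.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    m = max(scores)                         # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(h_i, neighbors, attn_vec):
    """Single-head aggregation; `neighbors` is a list of (h_j, e_ij) pairs."""
    scores = []
    for h_j, e_ij in neighbors:
        z = h_i + h_j + e_ij                # list concatenation [h_i || h_j || e_ij]
        scores.append(leaky_relu(sum(w * x for w, x in zip(attn_vec, z))))
    alpha = softmax(scores)
    message = [sum(a * h_j[d] for a, (h_j, _) in zip(alpha, neighbors))
               for d in range(len(h_i))]
    return alpha, message

def multi_head(h_i, neighbors, attn_vecs):
    """Concatenate the messages produced by several independent heads."""
    out = []
    for vec in attn_vecs:
        out += attention_aggregate(h_i, neighbors, vec)[1]
    return out

h_i = [1.0, 0.0]
neighbors = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [1.0])]
alpha, message = attention_aggregate(h_i, neighbors, [0.0, 0.0, 1.0, -1.0, 0.0])
heads = multi_head(h_i, neighbors, [[0.0, 0.0, 1.0, -1.0, 0.0]] * 4)  # 4 heads
```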
Heterogeneous interaction module
Given the dynamic characteristics of the protein–ligand binding process, GEFA [37] integrated the ligand embedding, after GNN iterations, as a virtual node within the residue contact map for collaborative training, thus simulating the fitting process. This early fusion strategy inspired our design: we place an HIM in each layer. Unlike GEFA, we aim to capture the information of covalent and non-covalent interactions across various phases at the atomic scale. As shown in Fig. 2(c), our module contains two components, an Information Shared Unit (ISU) based on GRU [38] and a gate mechanism. As outlined in Eq. 16, the atom features within both covalent and non-covalent channels are first associated with each other, and the following gated units
and
are used to update atom features
and
. Here, sigmoid is an activation function.
![]() |
(16) |
![]() |
(17) |
![]() |
(18) |
![]() |
(19) |
![]() |
(20) |
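The idea of gated exchange between the two channels can be illustrated with the sketch below. It is not the HIM itself: the shared state is a plain element-wise mean standing in for the GRU-based ISU, and the per-dimension gate weights `w_cov` and `w_non` are hypothetical parameters. The only element taken from the text is that a sigmoid gate per channel decides how much shared information each channel admits.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_exchange(h_cov, h_non, w_cov, w_non):
    """Exchange information between the covalent and non-covalent features of one atom."""
    # Stand-in for the ISU: element-wise mean of both channels.
    shared = [(a + b) / 2.0 for a, b in zip(h_cov, h_non)]
    # One gate per channel, computed from the shared state.
    g_cov = [sigmoid(w * s) for w, s in zip(w_cov, shared)]
    g_non = [sigmoid(w * s) for w, s in zip(w_non, shared)]
    # Each channel mixes its own feature with the shared state.
    new_cov = [g * s + (1.0 - g) * h for g, s, h in zip(g_cov, shared, h_cov)]
    new_non = [g * s + (1.0 - g) * h for g, s, h in zip(g_non, shared, h_non)]
    return new_cov, new_non, shared

# With zero gate weights every gate equals 0.5, so each channel moves halfway
# towards the shared state.
new_cov, new_non, shared = gated_exchange([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0])
```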
Prediction layer
Referring to Fig. 2(c) and the generalized formula Eq. 21, we employ Global Add Pooling in the last layer of our backbone (labeled as L) to transform the atom features from the three branches into one-dimensional vectors, respectively denoted as
. Prior to this, we first concatenate the contents of all ISUs via a skip-connection mechanism, aimed at preserving the heterogeneous interaction information over the entire binding process. Then, we accumulate the pooling results of the three branches to construct a multi-scale embedding of the complex graph. Finally, a fully connected layer at the end of our model maps the graph embedding to a scalar, denoted as
, as shown in Eq. 22.
![]() |
(21) |
![]() |
(22) |
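The readout can be sketched as below: global add pooling sums node vectors per branch, and the per-layer communication features are concatenated per node before pooling. Combining the three pooled vectors by concatenation is an assumption made here; element-wise summation would be another reading of "accumulate", and the final fully connected layer is omitted.

```python
def add_pool(node_features):
    """Global add pooling: sum node feature vectors into one graph vector."""
    dim = len(node_features[0])
    return [sum(h[d] for h in node_features) for d in range(dim)]

def readout(cov_nodes, non_nodes, isu_layers):
    """Pool each branch and concatenate the results into one graph vector.

    isu_layers: per-layer node features of the communication branch; they are
    concatenated per node first (skip connection) and pooled afterwards.
    """
    n = len(isu_layers[0])
    comm_nodes = [sum((layer[i] for layer in isu_layers), []) for i in range(n)]
    return add_pool(cov_nodes) + add_pool(non_nodes) + add_pool(comm_nodes)

# Two atoms, feature size 2, three layers in the communication branch.
cov = [[1, 2], [3, 4]]
non = [[0, 1], [1, 0]]
isu = [[[1, 1], [2, 2]], [[0, 1], [1, 0]], [[1, 0], [0, 1]]]
graph_vector = readout(cov, non, isu)
```

Add pooling keeps the readout invariant to atom ordering while letting the graph vector grow with the number of interacting atoms.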
Experiment settings
Hyperparameter settings
This section delineates the hyperparameters employed in our model. For loss function definition and training protocol, we utilize Mean Square Error (MSE) and the Adam optimizer, respectively. The foundational parameters, including learning rate, epoch number, batch size, and iteration number, are configured as 0.0005, 200, 128, and 3. Additionally, the number of RBFs denoted as
, which are used to represent distances or bond angles, is set to 9. The number of attention heads of the AGM is set to 4. To mitigate overfitting, we apply L2 regularization in conjunction with the Adam optimizer, with a regularization factor of 0.000001. Furthermore, our model applies dropout with a rate of 0.1 in the fully connected layers.
Metrics
This section details the four metrics used to evaluate the performance of our model. Here, Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) are utilized to measure the discrepancy between predicted values and ground truth labels. In addition, the Pearson Correlation Coefficient (R) and Standard Deviation (SD) are used to assess the consistency of the predictions. The formulas for these metrics are defined as follows:
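$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \tag{23}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| \tag{24}$$

$$R = \frac{\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \tag{25}$$

$$\mathrm{SD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(y_i - \left(a + b\,\hat{y}_i\right)\right)^2} \tag{26}$$

where $N$ denotes the number of samples, and $\hat{y}_i$ and $y_i$ denote the predicted and true values of affinity for sample $i$, respectively. $\bar{\hat{y}}$ and $\bar{y}$ represent the averages of the predictions and the true values. The symbols $a$ and $b$ are the intercept and slope of the linear regression line between predicted and true values.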
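The four metrics can be computed with the standard definitions sketched below. The SD follows the convention common in CASF-style evaluations, where the true values are regressed on the predictions and the residual sum of squares is divided by N - 1; treating that as the definition used here is an assumption.

```python
import math

def _moments(pred, true):
    """Means, covariance sum and variance sums shared by R and SD."""
    mp, mt = sum(pred) / len(pred), sum(true) / len(true)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in true)
    return mp, mt, cov, var_p, var_t

def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def mae(pred, true):
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def pearson_r(pred, true):
    _, _, cov, var_p, var_t = _moments(pred, true)
    return cov / math.sqrt(var_p * var_t)

def sd(pred, true):
    """Standard deviation of the residuals after regressing true values on predictions."""
    mp, mt, cov, var_p, _ = _moments(pred, true)
    b = cov / var_p                  # slope
    a = mt - b * mp                  # intercept
    resid = sum((t - (a + b * p)) ** 2 for p, t in zip(pred, true))
    return math.sqrt(resid / (len(true) - 1))

# Predictions that are perfectly correlated with the labels but badly scaled:
# R is 1 and SD is 0, while RMSE and MAE remain large.
pred, true = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
scores = (rmse(pred, true), mae(pred, true), pearson_r(pred, true), sd(pred, true))
```

The toy example shows why both groups of metrics are reported: R and SD reward a consistent linear relationship, whereas RMSE and MAE penalize any absolute deviation from the labels.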
Result and discussion
Comparison with baselines
To evaluate the performance of GEMF in predicting PLA, we compare it with the following representative methods: DeepDTA [8] relies entirely on 1D-CNNs that extract local information from sequences; MGraphDTA [12] employs a deep Graph Convolutional Network (GCN) to extract atom features from ligands’ 2D topological graphs and stacked convolutional kernels for protein sequence features; GNN-DTI [23], a distance-aware model, uses a newly proposed second-order adjacency matrix to differentiate each interaction’s contribution; IGN [24] uses two interconnected GNN modules to learn covalent and non-covalent interactions sequentially. Notably, all baseline models share the same data split, and we report the mean and standard deviation of three independently repeated experiments.
Table 2 shows the comparison results on the CASF2016, CASF2013, and Holdout2019 test sets, with the best metrics underlined. Furthermore, Fig. 4 shows the predicted values against the true labels for one of the experiments, illustrating the model’s fitting ability. Specifically, GEMF outperformed the second-best model, IGN, on CASF2016, with improvements of 8.92%, 8.76%, 5.0%, and 9.45% in RMSE, MAE, R, and SD, respectively. A similar trend can be observed on CASF2013, where GEMF leads the second-best model, GNN-DTI, by 0.099 in RMSE and 0.025 in R. The results indicate that sequence models underperform 3D-GNNs primarily because they lack structural information. GEMF additionally integrates bond angle information into the message-passing process, addressing the neglect of molecular geometry in purely distance-based approaches and thereby improving performance.
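The relative improvements on CASF2016 quoted above follow directly from the mean values of GEMF and IGN in Table 2, as the short check below shows. The helper name `relative_gain` is illustrative; for error metrics a lower value is better, for R a higher value is better.

```python
def relative_gain(ours, baseline, lower_is_better=True):
    """Percentage improvement of `ours` over `baseline`."""
    diff = baseline - ours if lower_is_better else ours - baseline
    return 100.0 * diff / baseline

# metric: (GEMF mean, IGN mean, lower_is_better), CASF2016 rows of Table 2
casf2016 = {
    "RMSE": (1.195, 1.312, True),
    "MAE": (0.917, 1.005, True),
    "R": (0.840, 0.800, False),
    "SD": (1.179, 1.302, True),
}
gains = {name: relative_gain(o, b, low) for name, (o, b, low) in casf2016.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}%")   # RMSE: 8.92%, MAE: 8.76%, R: 5.00%, SD: 9.45%
```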
Table 2.
Comparison results with baseline models on three benchmarks (mean ± standard deviation)
| Test Set | Model | RMSE | MAE | R | SD |
|---|---|---|---|---|---|
| CASF2016 | DeepDTA | 1.357 ± 0.011 | 1.060 ± 0.020 | 0.784 ± 0.006 | 1.350 ± 0.018 |
| | MGraphDTA | 1.442 ± 0.014 | 1.109 ± 0.019 | 0.752 ± 0.006 | 1.434 ± 0.014 |
| | GNN-DTI | 1.375 ± 0.024 | 1.103 ± 0.027 | 0.785 ± 0.004 | 1.348 ± 0.011 |
| | IGN | 1.312 ± 0.010 | 1.005 ± 0.004 | 0.800 ± 0.002 | 1.302 ± 0.007 |
| | GEMF(ours) | 1.195 ± 0.016 | 0.917 ± 0.011 | 0.840 ± 0.004 | 1.179 ± 0.014 |
| CASF2013 | DeepDTA | 1.568 ± 0.023 | 1.238 ± 0.036 | 0.748 ± 0.006 | 1.350 ± 0.018 |
| | MGraphDTA | 1.611 ± 0.020 | 1.259 ± 0.038 | 0.727 ± 0.013 | 1.602 ± 0.033 |
| | GNN-DTI | 1.479 ± 0.002 | 1.213 ± 0.030 | 0.791 ± 0.008 | 1.427 ± 0.023 |
| | IGN | 1.517 ± 0.024 | 1.170 ± 0.038 | 0.768 ± 0.005 | 1.493 ± 0.015 |
| | GEMF(ours) | 1.380 ± 0.026 | 1.056 ± 0.025 | 0.816 ± 0.008 | 1.349 ± 0.027 |
| Holdout2019 | DeepDTA | 1.512 ± 0.026 | 1.194 ± 0.024 | 0.568 ± 0.017 | 1.468 ± 0.021 |
| | MGraphDTA | 1.616 ± 0.013 | 1.264 ± 0.011 | 0.502 ± 0.011 | 1.543 ± 0.011 |
| | GNN-DTI | 1.445 ± 0.019 | 1.135 ± 0.010 | 0.620 ± 0.008 | 1.400 ± 0.011 |
| | IGN | 1.452 ± 0.024 | 1.145 ± 0.008 | 0.613 ± 0.006 | 1.410 ± 0.008 |
| | GEMF(ours) | 1.420 ± 0.005 | 1.111 ± 0.005 | 0.628 ± 0.003 | 1.388 ± 0.004 |
Figure 4.

Regression lines of predicted value and ground truth of GEMF on Validation set, CASF2016, CASF2013, and Holdout2019, respectively.
We also note from the original literature that DeepDTA and MGraphDTA were developed on the Davis [39] and KIBA [40] datasets, and that the latter reported superior results. However, our experiments show that GNNs relying on 2D ligand representations fail to outperform the CNN framework. Our analysis shows that the Davis and KIBA datasets have 30056 and 118254 samples, respectively, but only a limited number of unique proteins (442 and 229) and ligands (68 and 2111). In contrast, PDBbind v.2016, with a total of 13283 complexes, includes 8792 unique proteins and 2977 unique ligands. Davis and KIBA thus have a much higher sample overlap rate, and we believe that GNNs working on ligand topological representations might be less effective in learning from datasets with low ligand diversity. Furthermore, we used Holdout2019 as an external test set to simulate a real-world temporal split scenario, aimed at predicting the binding affinity of the most recent complexes with previously unknown structures. GEMF still achieves the best performance, but its advantage is significantly reduced.
Cold start setup
To identify the reasons behind the observed performance decline, we investigate the Holdout2019 set. Our analysis reveals that the protein and ligand overlap rates with the training set are 68.9% and 24.8%, respectively. Li et al. [40] have pointed out that ML models built on a training set with no overlap with their test sets perform worse than some basic linear scoring functions.
Here, we utilized MMseq2 [41] to cluster Holdout2019 and the training set together based on 30% protein sequence identity, resulting in 1907 clusters. Then, all clusters containing proteins that appearing in the training set were removed, and the remaining 313 clusters (with a total of 672 complexes) formed a de-redundant set named Holdout2019-refined. Finally, we compare the proposed model with baselines on the new set to examine the actual generalization ability; Table 3 records the experimental results. Obviously, the metrics of all baseline models decrease overall, but are consistent with the ranking on the original Holdout2019. Notably, the R-value of DeepDTA and MGraphDTA based on sequence droppped below 0.3 and became almost unusable.
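The de-redundancy step can be sketched as follows. This is a minimal illustration rather than the authors' script: it assumes the clustering result is already available as (representative, member) pairs, which is the two-column layout of a typical cluster table, and the function name `cold_start_filter` and all identifiers in the toy example are hypothetical.

```python
from collections import defaultdict


def cold_start_filter(cluster_pairs, train_ids, test_ids):
    """Keep only test entries whose cluster contains no training entry.

    cluster_pairs: iterable of (representative_id, member_id) tuples
                   (assumed layout of a clustering result).
    train_ids, test_ids: sets of identifiers for the two splits.
    Returns (kept_test_ids, number_of_kept_clusters).
    """
    clusters = defaultdict(set)
    for representative, member in cluster_pairs:
        clusters[representative].add(member)

    kept_ids = set()
    kept_clusters = 0
    for members in clusters.values():
        # Drop the whole cluster if any member comes from the training set.
        if members & train_ids:
            continue
        test_members = members & test_ids
        if test_members:
            kept_ids |= test_members
            kept_clusters += 1
    return kept_ids, kept_clusters


# Toy example with hypothetical identifiers.
pairs = [
    ("A", "A"), ("A", "t1"),  # cluster A mixes training and test entries
    ("B", "B"), ("B", "t2"),  # cluster B contains only test entries
    ("C", "t3"),              # cluster C is a test-only singleton
]
train = {"A"}
test = {"t1", "t2", "t3", "B"}
kept, n_clusters = cold_start_filter(pairs, train, test)
print(sorted(kept), n_clusters)
```

Filtering at the cluster level, rather than removing only exact duplicates, is what makes the split a protein cold start: a test protein is discarded even if it merely resembles a training protein above the clustering threshold.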
Table 3.
Model performance on Holdout2019-refined set under protein cold-start setting
| Model | RMSE | MAE | R | SD |
|---|---|---|---|---|
| DeepDTA | 1.672 ± 0.014 | 1.330 ± 0.015 | 0.253 ± 0.015 | 1.501 ± 0.006 |
| MGraphDTA | 1.634 ± 0.015 | 1.300 ± 0.003 | 0.244 ± 0.025 | 1.505 ± 0.010 |
| GNN-DTI | 1.467 ± 0.036 | 1.182 ± 0.026 | 0.463 ± 0.021 | 1.375 ± 0.017 |
| IGN | 1.557 ± 0.016 | 1.249 ± 0.013 | 0.393 ± 0.008 | 1.427 ± 0.005 |
| GEMF | 1.448 ± 0.028 | 1.150 ± 0.024 | 0.464 ± 0.022 | 1.374 ± 0.018 |
Notably, GEMF’s performance does not show significant degradation, suggesting that its superiority is not merely due to the recognition of similar samples. Rather, this shows that GEMF has truly learned the underlying mechanisms of PLA, including molecular properties and interaction patterns.
Ablation study
To assess the contribution of our proposed modules, we develop four variants of GEMF and conduct ablation studies on the CASF2016 and CASF2013 datasets. The configurations of the variants are as follows:
w/o A-EEM: This variant omits the line graph module of the covalent interaction channel, retaining only the atomically updated module (AGM).
w/o HIM: This variant removes all HIMs while keeping the other components, to validate the benefit of the communication branch.
GEMF-HR: This variant learns covalent and non-covalent interactions sequentially and drops all intermediate HIMs, to compare the efficacy of mid-fusion and hierarchical representation strategies.
GAT: This variant replaces the A-EEM and AGM with graph attention networks (GAT), to verify the superiority of our proposed message-passing framework.
Table 4 and Figure 5 present the results of the aforementioned ablation experiments. Consistent with expectations, GEMF outperformed all variants. The w/o HIM and GEMF-HR variants perform similarly: their RMSEs are 1.263 (+5.69%) and 1.266 (+5.94%) on CASF2016, and 1.458 (+5.65%) and 1.453 (+5.29%) on CASF2013, respectively, where the percentages are relative to the complete model. The GAT variant exhibited the largest RMSE increase compared with our complete model.
Table 4.
Ablation results of the GEMF variants on the CASF2016 and CASF2013 benchmarks
| Model | RMSE | MAE | R | SD |
|---|---|---|---|---|
| CASF2016 | ||||
| w/o A-EEM | 1.228 | 0.927 | 0.833 | 1.203 |
| w/o HIM | 1.263 | 0.944 | 0.818 | 1.252 |
| GEMF-HR | 1.266 | 0.957 | 0.812 | 1.266 |
| GAT | 1.391 | 1.076 | 0.802 | 1.298 |
| GEMF | 1.195 | 0.917 | 0.840 | 1.179 |
| CASF2013 | ||||
| w/o A-EEM | 1.416 | 1.073 | 0.801 | 1.378 |
| w/o HIM | 1.458 | 1.103 | 0.786 | 1.443 |
| GEMF-HR | 1.453 | 1.096 | 0.782 | 1.452 |
| GAT | 1.508 | 1.154 | 0.795 | 1.414 |
| GEMF | 1.380 | 1.056 | 0.816 | 1.349 |
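The relative RMSE changes quoted for the w/o HIM and GEMF-HR variants can be recomputed from the Table 4 values. The short sketch below does so; the helper name `relative_change` is ours, and the change is taken relative to the complete GEMF model on the same benchmark.

```python
def relative_change(variant, full):
    """Percentage change of a variant's metric relative to the full model."""
    return (variant - full) / full * 100.0


# RMSE values copied from Table 4.
RMSE = {
    "CASF2016": {"GEMF": 1.195, "w/o HIM": 1.263, "GEMF-HR": 1.266},
    "CASF2013": {"GEMF": 1.380, "w/o HIM": 1.458, "GEMF-HR": 1.453},
}

changes = {}
for benchmark, values in RMSE.items():
    full = values["GEMF"]
    for name, value in values.items():
        if name == "GEMF":
            continue
        changes[(benchmark, name)] = round(relative_change(value, full), 2)
        print(f"{benchmark} {name}: {changes[(benchmark, name)]:+.2f}%")
```

All four changes come out positive, that is, each variant has a higher RMSE than the complete model on both benchmarks.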
Figure 5.

Ablation results of the GEMF variants on CASF2016 (left) and CASF2013 (right) under four metrics.
Removing either the A-EEM or the HIMs reduces the model’s predictive ability. First, the A-EEM enriches the geometric representation of molecules, indicating the significance of bond and bond angle information. Second, without the HIMs the metrics decay because information from the different interaction types can no longer be fused and exchanged. GEMF also outperformed GEMF-HR, suggesting the advantage of mid-fusion over hierarchical representations in feature fusion. Moreover, substituting our message-passing framework with GAT resulted in a notable decrease in performance, underscoring the efficacy of our message-passing mechanism. Importantly, all variants, except for GAT, still surpassed the baseline models in performance, demonstrating the efficacy of the overall GEMF architecture.
Model interpretability analysis
Deep learning is well known as a black-box model, primarily focusing on whether the feature extractor meets the desired mapping requirements for the input. However, identifying which features are important and which are redundant is often challenging. The lack of interpretability frequently limits the application of DL models, particularly in CADD.
First, a well-performing model should be able to distinguish differential features. To this end, we randomly sampled 256 complexes from the training set and captured their graph vector representations at the 1st, 10th, 100th, and 200th epochs. Subsequently, we mapped these representations to 2D using t-SNE [42]. Figure 6 displays the visualization result, where a warm colormap is used to represent the normalized binding affinities from low to high. Samples with similar labels are clustered together, indicating that the learned representations are meaningful.
Figure 6.

Visualization of complex embeddings using t-SNE dimensionality reduction at four epochs: (a) 1, (b) 10, (c) 100, and (d) 200.
To further analyze which ligand atoms contribute most to affinity prediction, we use Grad-AAM to visualize the importance of atoms in the molecule. Comparing these importances with the actual interactions between the binding pocket and the ligand indicates whether the model captures the underlying binding mode of the complex. Figure 7 shows the visualization results, where the ligand atoms forming hydrogen bonds with pocket residues are marked with red circles. The relevant pocket residues are depicted in a ball-and-stick model and labeled with residue names such as "ASN-165". Here, the hydrogen bonds, represented by the yellow dotted lines, are important components of non-covalent interactions and are closely related to maintaining the stability of the complex structure. In this experiment, ligand atoms involved in hydrogen bonds are given larger weights (the darker the color, the more important the atom), where the weights are obtained by multiplying the gradient calculated by the model with the atomic features. The larger the weight, the greater the atom's impact on the model's prediction. The atomic weights are highly consistent with the binding mode, and the importance of hydrogen bonds is clearly reflected.
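The weighting described above, gradient multiplied by atomic features, can be illustrated with a small stand-alone sketch. It is not the Grad-AAM implementation: how the gradients are obtained, the summation over feature channels, the clipping of negative scores, and the scaling to [0, 1] are all assumptions made for illustration, and the numbers are made up.

```python
def atom_importance(gradients, features):
    """Gradient-times-feature importance per atom, scaled to [0, 1].

    gradients, features: lists of equal-length rows, one row per atom.
    """
    raw = []
    for grad_row, feat_row in zip(gradients, features):
        # Element-wise product, summed over the feature dimension.
        score = sum(g * f for g, f in zip(grad_row, feat_row))
        # Keep only positive evidence (an assumption of this sketch).
        raw.append(max(score, 0.0))
    top = max(raw)
    if top == 0.0:
        return [0.0 for _ in raw]
    return [score / top for score in raw]


# Toy ligand with three atoms and two features per atom (made-up numbers).
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
gradients = [[0.2, 0.1], [0.8, 0.4], [-0.3, -0.1]]
weights = atom_importance(gradients, features)
print(weights)
```

In this toy case the second atom receives the largest weight and the third atom, whose gradient is negative, receives zero; in a visualization such as the one described, the second atom would be drawn in the darkest color.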
Figure 7.

Visualization of the importance of ligand atoms. The result is obtained using Grad-AAM for the complexes with PDB IDs 4qac and 2cet. On the left, the atom weights of the ligand are displayed, while the right side illustrates the hydrogen bond interaction patterns between the pocket residues and the ligands.
Figure 8.

Effects of model parameters (left) and model performance analysis (right) under two benchmarks.
Model parameter analysis
In this experiment, we investigate the impact of varying the number of layers and the feature dimension of the hidden layers while keeping other parameters constant. The result, as illustrated on the left side of Fig. 8, indicates that the R-value exhibits an increasing trend when the number of layers ranges from 1 to 3. Theoretically, as the number of layers in a GNN increases, the receptive field of the nodes expands, enabling the recognition of larger molecular substructures. However, a notable decline in the R-value is observed when the model reaches four layers, particularly on the CASF2013 dataset, suggesting that generalization suffers. This phenomenon could be caused by over-smoothing or over-squashing in the GNN framework as the number of layers increases. Over-smoothing means that, after repeated aggregation of messages from neighboring nodes, the state of a node becomes increasingly similar to the features of its neighbors. Furthermore, we analyze the impact of the hidden dimension by setting it to various values: 32, 64, 128, 192, 256, and 320. As shown in the right half of Fig. 8, our model reaches its peak performance at a dimension of 256. When the dimension is equal to 32 or 64, the performance indicators rapidly decline on the CASF2013 dataset, yet show an upward trend on CASF2016. We infer that this fluctuation could be attributed to insufficient feature representation capability at the smaller dimensions. Additionally, a decrease in the R-value is observed when the dimension reaches 320, suggesting the onset of overfitting in the model.
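The over-smoothing effect mentioned above can be reproduced on a toy graph. The sketch below is not GEMF's message passing: it assumes plain mean aggregation over a node and its neighbors, with no learned weights, on a hypothetical four-node path graph with scalar features, and tracks how the spread of node states shrinks as layers are stacked.

```python
def mean_aggregate(features, neighbors):
    """One round of message passing: each node averages itself and its neighbors."""
    updated = []
    for node, value in enumerate(features):
        group = [value] + [features[n] for n in neighbors[node]]
        updated.append(sum(group) / len(group))
    return updated


def spread(features):
    """Difference between the largest and smallest node state."""
    return max(features) - min(features)


# Path graph 0-1-2-3 with scalar node features (made-up numbers).
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = [0.0, 1.0, 2.0, 3.0]

spreads = [spread(features)]
for layer in range(1, 5):
    features = mean_aggregate(features, neighbors)
    spreads.append(spread(features))
    print(f"after layer {layer}: spread = {spreads[-1]:.3f}")
```

Each additional layer pulls the node states closer together, so distinct atoms become harder to tell apart; this is one plausible reason why adding a fourth layer may hurt rather than help.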
Conclusion
In conclusion, our work introduces the geometric-enhanced mid-fusion network (GEMF), a novel approach designed to improve PLA prediction. Our model constructs input features from the physicochemical properties of atoms and chemical bonds, together with geometric factors such as bond lengths and bond angles. We then propose a dual-stream framework to simultaneously learn patterns of covalent and non-covalent interactions in separate channels. We propose the A-EEM, which incorporates bond angle information into the bond representations using a line graph network. This approach enables the extraction of comprehensive geometric information to obtain advanced protein and ligand molecular representations. Additionally, we develop a central communication branch to facilitate the mid-fusion of heterogeneous interactions, aiming to more precisely simulate the dynamic binding process in complexes. Our quantitative evaluations on standard benchmarks demonstrate GEMF’s state-of-the-art performance in comparison with other DL methods, marking a significant advancement in PLA prediction.
Key Points
We propose a TBMP to learn covalent and non-covalent interactions of protein–ligand complexes, as well as to record and transmit the heterogeneous interaction information. In summary, our framework encodes complete geometry and interaction information.
Unlike previous GNN-based methods, we propose an A-EEM, implemented on line graphs. This module embeds bond angle information into associated edge features and then cascades to the AGM to identify fine-grained molecular geometric structures.
We set several HIMs in the communication branch to capture fusion information from different stages to perceive the whole binding process.
We conduct comparative experiments with the most recent baselines across multiple test sets, and the results indicate that our model achieves state-of-the-art performance.
Author Biographies
Guoqiang Zhou is currently with School of Computer Science, Nanjing University of Posts and Telecommunications. His main research interests are machine learning and distributed computing.
Yuke Qin is a Master's graduate student in School of Computer Technology, Nanjing University of Posts and Telecommunications. His research interests are deep learning and bioinformatics.
Qiansen Hong is a Master's graduate student in School of Computer Technology, Nanjing University of Posts and Telecommunications. His research interests are deep learning and bioinformatics.
Haoran Li is a PhD candidate in School of Computing and Information Technology, University of Wollongong. His research interests focus on deep learning.
Huaming Chen is currently with School of Electrical and Computer Engineering, University of Sydney. His research interests are trustworthy machine learning, AI4Science and software security.
Jun Shen is currently with School of Computing and Information Technology, University of Wollongong. His research interest is AI4Science.
Contributor Information
Guoqiang Zhou, School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China.
Yuke Qin, School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China.
Qiansen Hong, School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China.
Haoran Li, School of Computing and Information Technology, University of Wollongong, Northfields Avenue, NSW 2522, Australia.
Huaming Chen, School of Electrical and Computer Engineering, University of Sydney, Camperdown, NSW 2050, Australia.
Jun Shen, School of Computing and Information Technology, University of Wollongong, Northfields Avenue, NSW 2522, Australia.
Funding
This work was supported by the Open Research Fund of the National Mobile Communications Research Laboratory, Southeast University (No.2023D15); Ningbo Clinical Research Center for Medical Imaging (No.2022LYKFYB01); the National Program on Key Basic Research Project (2020YFA0713600); and the National Natural Science Foundation of China (62272214).
Data availability
Data, code, scripts, and instructions for GEMF are available at https://github.com/Yuke-Qin/GEMF/tree/master.
Author contributions
Yuke Qin (Conceptualization, Methodology, Software, Data Curation, Writing—Original Draft), Qiansen Hong (Writing—Review & Editing), Guoqiang Zhou (Resource, Writing—Review & Editing, Supervision), Haoran Li (Validation & Editing), Huaming Chen (Writing Review & Finalization), Jun Shen (Coordination & Cosupervision).
References
- 1. Tang J, Szwajda A, Shakyawar S. et al. Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis. J Chem Inf Model 2014;54:735–43. 10.1021/ci400709d.
- 2. Velazquez-Campoy A, Freire E. Isothermal titration calorimetry to determine association constants for high-affinity ligands. Nat Protoc 2006;1:186–91. 10.1038/nprot.2006.28.
- 3. Maynard JA, Lindquist NC, Sutherland JN. et al. Surface plasmon resonance for high-throughput ligand screening of membrane-bound proteins. Biotechnol J 2009;4:1542–58. 10.1002/biot.200900195.
- 4. Pahikkala T, Airola A, Pietilä S. et al. Toward more realistic drug–target interaction predictions. Brief Bioinform 2015;16:325–37. 10.1093/bib/bbu010.
- 5. Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein–ligand binding affinity with applications to molecular docking. Bioinformatics 2010;26:1169–75. 10.1093/bioinformatics/btq112.
- 6. Kinnings SL, Liu N, Tonge PJ. et al. A machine learning-based method to improve docking scoring functions and its application to drug repurposing. J Chem Inf Model 2011;51:408–19. 10.1021/ci100369f.
- 7. Keiser M, Roth BL, Armbruster BN. et al. Relating protein pharmacology by ligand chemistry. Nat Biotechnol 2007;25:197–206. 10.1038/nbt1284.
- 8. Öztürk H, Özgür A, Ozkirimli E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 2018;34:i821–9. 10.1093/bioinformatics/bty593.
- 9. Wang K, Zhou R, Li Y. et al. DeepDTAF: a deep learning method to predict protein–ligand binding affinity. Brief Bioinform 2021;22:bbab072.
- 10. Abbasi K, Razzaghi P, Poso A. et al. DeepCDA: deep cross-domain compound–protein affinity prediction through LSTM and convolutional neural networks. Bioinformatics 2020;36:4633–42. 10.1093/bioinformatics/btaa544.
- 11. Nguyen T, Le H, Quinn TP. et al. GraphDTA: predicting drug-target binding affinity with graph neural networks. Bioinformatics 2021;37:1140–7.
- 12. Yang Z, Zhong W, Zhao L. et al. MGraphDTA: deep multiscale graph neural network for explainable drug–target binding affinity prediction. Chem Sci 2022;13:816–33. 10.1039/D1SC05180F.
- 13. Jiang M, Li Z, Zhang S. et al. Drug–target affinity prediction using graph neural network and contact maps. RSC Adv 2020;10:20701–12. 10.1039/D0RA02297G.
- 14. Li F, Zhang Z, Guan J. et al. Effective drug–target interaction prediction with mutual interaction neural network. Bioinformatics 2022;38:3582–9. 10.1093/bioinformatics/btac377.
- 15. Liao J, Chen H, Wei L. et al. GSAML-DTA: an interpretable drug-target binding affinity prediction model based on graph neural networks with self-attention mechanism and mutual information. Comput Biol Med 2022;150:106145. 10.1016/j.compbiomed.2022.106145.
- 16. Dehghan A, Razzaghi P, Abbasi K. et al. TripletMultiDTI: multimodal representation learning in drug-target interaction prediction with triplet loss function. Expert Syst Appl 2023;232:120754. 10.1016/j.eswa.2023.120754.
- 17. Dehghan A, Abbasi K, Razzaghi P. et al. CCL-DTI: contributing the contrastive loss in drug–target interaction prediction. BMC Bioinformatics 2024;25:48. 10.1186/s12859-024-05671-3.
- 18. Stepniewska-Dziubinska M, Zielenkiewicz P, Siedlecki P. Development and evaluation of a deep learning model for protein–ligand binding affinity prediction. Bioinformatics 2018;34:3666–74. 10.1093/bioinformatics/bty374.
- 19. Ragoza M, Hochuli J, Idrobo E. et al. Protein–ligand scoring with convolutional neural networks. J Chem Inf Model 2017;57:942–57. 10.1021/acs.jcim.6b00740.
- 20. Zheng L, Fan J, Mu Y. OnionNet: a multiple-layer intermolecular-contact-based convolutional neural network for protein–ligand binding affinity prediction. ACS Omega 2019;4:15956–65. 10.1021/acsomega.9b01997.
- 21. Wang Z, Zheng L, Liu Y. et al. OnionNet-2: a convolutional neural network model for predicting protein-ligand binding affinity based on residue-atom contacting shells. Front Chem 2021;9. 10.3389/fchem.2021.753002.
- 22. Zhou J, Li S, Huang L. et al. Distance-aware molecule graph attention network for drug-target binding affinity prediction. arXiv preprint arXiv:2012.09624. 2020.
- 23. Lim J, Ryu S, Park K. et al. Predicting drug–target interaction using a novel graph neural network with 3D structure-embedded graph representation. J Chem Inf Model 2019;59:3981–8. 10.1021/acs.jcim.9b00387.
- 24. Jiang D, Hsieh C-Y, Wu Z. et al. InteractionGraphNet: a novel and efficient deep graph representation learning framework for accurate protein–ligand interaction predictions. J Med Chem 2021;64:18209–32. 10.1021/acs.jmedchem.1c01830.
- 25. Choudhary K, DeCost B. Atomistic line graph neural network for improved materials property predictions. npj Comput Mater 2021;7. 10.1038/s41524-021-00650-1.
- 26. Choudhary K, DeCost B, Tavazza F. Machine learning with force-field-inspired descriptors for materials: fast screening and mapping energy landscape. Phys Rev Mater 2018;2:083801. 10.1103/PhysRevMaterials.2.083801.
- 27. Mqawass G, Popov P. graphLambda: fusion graph neural networks for binding affinity prediction. J Chem Inf Model 2024;64:2323–30. 10.1021/acs.jcim.3c00771.
- 28. Teague SJ. Implications of protein flexibility for drug discovery. Nat Rev Drug Discov 2003;2:527–41. 10.1038/nrd1129.
- 29. Wang R, Fang X, Lu Y. et al. The PDBbind database: methodologies and updates. J Med Chem 2005;48:4111–9. 10.1021/jm048957q.
- 30. Su M, Yang Q, Du Y. et al. Comparative assessment of scoring functions: the CASF-2016 update. J Chem Inf Model 2018;59:895–913. 10.1021/acs.jcim.8b00545.
- 31. Li Y, Su M, Liu Z. et al. Assessing protein–ligand interaction scoring functions with the CASF-2013 benchmark. Nat Protoc 2018;13:666–80. 10.1038/nprot.2017.114.
- 32. Landrum G. Rdkit documentation. Release 2013;1:4.
- 33. Li S, Zhou J, Xu T. et al. Structure-aware interactive graph neural networks for the prediction of protein-ligand binding affinity. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD'21). Singapore, pages 975–985, 2021.
- 34. Schütt K, Kindermans P-J, Felix S. et al. Schnet: a continuous-filter convolutional neural network for modeling quantum interactions. Advances in neural information processing systems 2017;30.
- 35. Gilmer J, Schoenholz SS, Riley PF. et al. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning (ICML'17). Vol. 70. Sydney, Australia: JMLR.org, 2017; 1263–72.
- 36. Veličković P, Cucurull G, Casanova A. et al. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), arXiv preprint arXiv:1710.10903. 2017. https://openreview.net/forum?id=rJXMpikCZ.
- 37. Nguyen TM, Nguyen T, Le TM. et al. Gefa: early fusion approach in drug-target affinity prediction. IEEE/ACM Trans Comput Biol Bioinform 2021;19:718–28.
- 38. Chung J, Gulcehre C, Cho KH. et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. The Twenty-eighth Annual Conference on Neural Information Processing Systems (NIPS 2014) Workshop on Deep Learning. Montréal, Canada, 2014.
- 39. Davis MI, Hunt JP, Herrgard S. et al. Comprehensive analysis of kinase inhibitor selectivity. Nat Biotechnol 2011;29:1046–51. 10.1038/nbt.1990.
- 40. Li Y, Yang J. Structural and sequence similarity makes a significant impact on machine-learning-based scoring functions for protein–ligand interactions. J Chem Inf Model 2017;57:1007–12. 10.1021/acs.jcim.7b00049.
- 41. Steinegger M, Söding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 2017;35:1026–8. 10.1038/nbt.3988.
- 42. Van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008;9:2579–2605.