Author manuscript; available in PMC: 2026 Apr 11.
Published in final edited form as: J Chem Inf Model. 2026 Jan 9;66(2):1003–1012. doi: 10.1021/acs.jcim.5c03052

Multimodal Bond Reconstruction toward Generative Molecular Design

Jian Wang 1, Nikolay V Dokholyan 2
PMCID: PMC13067372  NIHMSID: NIHMS2158870  PMID: 41511953

Abstract

Generative models such as diffusion-based approaches have transformed de novo drug design by enabling rapid generation of novel molecular structures in both 2D and 3D formats. However, accurate reconstruction of chemical bonds, especially from distorted geometries produced by generative models, remains a critical challenge. Here, we present YuelBond, a multimodal graph neural network framework for robust bond reconstruction across three key scenarios: (i) recovery of bonds from accurate 3D atomic coordinates, (ii) reconstruction of chemically valid bonds in crude de novo generated compounds (CDGs) with perturbed geometries, and (iii) reassignment of bond orders in 2D topological graphs. YuelBond outperforms traditional rule-based methods such as RDKit, achieving 98.4% F1 score on standard 3D structures and maintaining strong performance (92.7% F1 score) on distorted CDGs, even when RDKit fails in most cases. Our results demonstrate that YuelBond enables accurate and reliable bond reconstruction from imperfect molecular data, bridging a critical gap in generative drug discovery pipelines.

Graphical Abstract


INTRODUCTION

In recent years, generative artificial intelligence—particularly generative adversarial networks1 (GANs) and diffusion models2—has revolutionized de novo drug discovery3 by enabling the rapid generation of vast numbers of novel molecular structures. These generative approaches are broadly categorized into two-dimensional (2D) molecular topology generation and three-dimensional (3D) molecular conformation generation. 2D molecular topology generation focuses on constructing molecular graphs or string-based representations (e.g., SMILES4/SELFIES5), while 3D molecular conformation generation aims to predict atomic coordinates in 3D space. For 2D generation, widely used methods include autoregressive models such as DiGress,6 MolRNN,7 BIMODAL,8 and MolecularRNN,9 variational autoencoders such as the junction tree variational autoencoder,10 GraphVAE,11 and CGVAE,12 GAN-based frameworks like LatentGAN,13 druGAN,14 and MOLGAN,15 and flow-based architectures like GraphNVP16 and MoFlow.17 Meanwhile, 3D generation primarily relies on diffusion models, such as DecompDiff,18 PMDM,19 DiffSBDD20 and DiffBP,21 which often employ equivariant graph neural networks22 (EGNNs) to model molecular structures within protein binding pockets.

Despite their promise, these generative methods face critical limitations, particularly concerning chemical bond order accuracy. In 2D generation, predicted bond orders may be chemically invalid, while 3D generation typically outputs only atomic coordinates without explicit bond annotations. This workflow leads to two major challenges: (1) validation difficulties, as incorrect bond orders hinder reliable assessment of molecular validity,23,24 and (2) degraded performance in downstream tasks, including molecular dynamics simulations,25,26 docking studies27,28 and protein-small molecule binding prediction,29-31 where bond order errors propagate into inaccurate force field parametrization and binding affinity predictions. Addressing these issues is crucial for bridging the gap between in silico molecule generation and real-world drug development applications.

Typically, bond orders are determined by hybridization analysis using bond lengths and angles32 or by comparison of atomic pairs with a known database.33,34 They can also be assigned using chemical or length rules.35 RDKit,36 the most popular cheminformatics toolkit, uses the xyz2mol37 program to predict the bond order of the molecule from the 3D coordinates. xyz2mol determines bond orders from 3D coordinates through a rule-based approach. First, it identifies connected atoms using covalent radii. Then it calculates possible valence states and systematically tests bond order distributions between unsaturated atoms while respecting valence rules. The solution uses either combinatorial pairing or graph theory to find chemically valid configurations, optionally employing Hückel theory for ambiguous cases.
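As an illustration, the first connectivity-detection step of such a rule-based scheme can be sketched as follows. The covalent radii are standard tabulated values, but the 1.3 tolerance factor is an illustrative assumption, not xyz2mol's exact parameterization:

```python
import math

# Single-bond covalent radii in angstroms (standard tabulated values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def connectivity_from_coords(symbols, coords, tolerance=1.3):
    """Rule-based connectivity detection: two atoms are considered
    bonded when their distance falls below the sum of their covalent
    radii scaled by a tolerance factor (the 1.3 here is an assumption)."""
    bonds = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            d = math.dist(coords[i], coords[j])
            cutoff = tolerance * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
            if d < cutoff:
                bonds.append((i, j))
    return bonds

# A C-O pair at 1.13 A (carbon monoxide-like) is detected as bonded.
print(connectivity_from_coords(["C", "O"], [(0.0, 0.0, 0.0), (1.13, 0.0, 0.0)]))
```

Bond *orders* are then assigned on top of this connectivity by testing valence-consistent distributions, which is the step that becomes fragile when geometries are distorted.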

However, existing methods are not robust to geometric distortions in generated molecules. When the geometry is distorted, i.e., bond lengths and angles deviate from ideal values, even determining atomic connectivity becomes challenging. Predicting bond orders under such conditions is even more difficult. This problem was less critical before generative models were widely used in drug discovery, but as these models become increasingly prevalent, the issue is becoming more serious. On one hand, generative models must produce increasingly accurate 3D molecular structures. On the other hand, we must develop more robust approaches to infer molecular connectivity and bond orders directly from distorted 3D coordinates.

To address these limitations, we introduce YuelBond, a graph neural network (GNN)-based framework that infers bond orders from molecular representations, whether they are accurate 3D coordinates, generated noisy structures, or even mere 2D topological graphs. GNN models such as GIN, GCN, SchNet, and GraphMVP are primarily designed for general graph representation learning and molecular property prediction tasks, focusing on learning node-level or graph-level representations. In contrast, YuelBond is explicitly designed for edge-level (bond) prediction with specialized architecture modifications, including edge-centric message passing and chain sampling strategies, which distinguish it from these general-purpose models. This allows the model to learn subtle patterns associated with different bond types and generalize beyond handcrafted rules. YuelBond is evaluated across three increasingly challenging scenarios: (1) reconstruction of bond orders from accurate 3D atomic coordinates; (2) prediction from noisy 3D structures simulating conformations generated by generative models; and (3) reassignment of bond orders given only 2D molecular connectivity. Across these tasks, YuelBond demonstrates robust and accurate performance, surpassing traditional methods, especially under structurally noisy conditions where existing tools often fail. Furthermore, by outputting class probabilities, YuelBond provides a nuanced view of prediction uncertainty, which is an essential feature for downstream workflows that must reason over ambiguous or probabilistic bonding assignments. Taken together, our results show that YuelBond is a flexible and reliable tool for bond order inference across multiple molecular representations.

RESULTS

Bond Reconstruction from Accurate 3D Coordinates

YuelBond is built on a GNN, in which each molecule is represented as a graph with atoms serving as nodes. For every atom, we calculate its distances to all other atoms in the molecule, and any pair of atoms within 3 Å is connected by an edge (Figure 1). These pairwise distances are assigned as edge features. While conventional GNNs are typically used for node-level predictions,38-41 we adapt the architecture to focus on predicting edge-level features, specifically the bond orders between atoms. To accomplish this, we modify the aggregation mechanism for edge updates (Methods Section). Instead of relying solely on the concatenation of features from the connected nodes, we additionally incorporate the distance between atoms and the previous edge features into the update function. This design allows the model to capture the local chemical environment around each bond more effectively. With each layer of message passing, the edge features are refined, thereby expanding the receptive field. At the final stage, a linear layer maps the updated edge features to one of four bond order categories: single, double, aromatic, or triple (Figure 2).
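The edge-construction step can be sketched as follows. Only the 3 Å cutoff and the distance-as-edge-feature choice follow the description above; the atom feature encoding is omitted for brevity:

```python
import numpy as np

def build_edges(coords, cutoff=3.0):
    """Construct the molecular graph used as model input: every atom
    pair closer than `cutoff` angstroms becomes an edge, and the
    pairwise distance is stored as that edge's initial feature."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # (N, N) distance matrix
    mask = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    src, dst = np.where(mask)
    return np.stack([src, dst]), dist[src, dst]   # edge index + distances

edge_index, edge_dist = build_edges([(0, 0, 0), (1.5, 0, 0), (5.0, 0, 0)])
print(edge_index.T.tolist())  # only the pair within 3 A, in both directions
```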

Figure 1.

Workflow and architecture for bond reconstruction using a graph neural network. (a) The process begins with 3D atomic coordinates as input, followed by edge construction to form a molecular graph, where edges are defined between atoms within a 3 Å cutoff distance. The graph, containing node (atom) and edge (interatomic) features, is then processed by a GNN to predict bond orders (single, double, triple, or aromatic). (b) The GNN architecture consists of graph input, linear layers, and multiple message-passing (MP) layers that iteratively update edge embeddings (eij) and node states (hi) through message passing. (c) The edge features and node features are aggregated using the edge model and node model.

Figure 2.

Performance evaluation of bond order prediction in molecular structures. (a) Average accuracy, precision, recall, and F1 score. (b) Examples of accurately predicted bond orders. (c) Per bond type prediction performance.

We trained and evaluated YuelBond using the Geometric Ensemble of Molecules (GEOM)42 data set, which includes over 450,000 molecular structures. The data set was randomly split into training, validation, and test sets. The validation set was used for tuning hyperparameters during training. To evaluate bond reconstruction performance, we first removed all bond annotations from the molecules and used YuelBond to reconstruct the bonds based on their 3D atomic coordinates. We then calculated accuracy, precision, recall, and F1-score on the test set. The model achieved an average accuracy of 98.2%, precision of 98.8%, recall of 98.2%, and F1-score of 98.4%, demonstrating a strong capability in recovering molecular bonding information (Figure 2a and Table S1).
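The evaluation metrics can be computed from simple per-class confusion counts; the exact averaging scheme used in the paper is not specified, so this sketch uses straightforward per-class bookkeeping as an assumption:

```python
def bond_metrics(y_true, y_pred, labels=("single", "double", "aromatic", "triple")):
    """Overall accuracy plus per-class (precision, recall, F1) for
    bond order predictions. The averaging convention here is an
    illustrative choice, not necessarily the paper's exact protocol."""
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    out = {"accuracy": acc}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(p == c and t != c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[c] = (prec, rec, f1)
    return out

m = bond_metrics(["single", "double", "single"], ["single", "single", "single"])
```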

We also analyzed YuelBond’s prediction performance across different bond types (Figure 2c and Table S1). For single bonds, the model achieved an average accuracy of 98.4%, precision of 95.4%, recall of 95.6%, and F1-score of 95.0%. For double bonds, the corresponding metrics were 99.6% accuracy, 95.4% precision, 92.6% recall, and 91.4% F1-score. Triple bond prediction yielded an accuracy of 99.94%, precision of 98.75%, recall of 89.4%, and F1-score of 89.4%. Aromatic bond prediction achieved 98.6% accuracy, 90.1% precision, 95.5% recall, and 89.4% F1-score. These results confirm that YuelBond is capable of accurately reconstructing diverse bond types from 3D atomic arrangements, including those that are challenging for traditional rule-based approaches.

The high performance on single and double bonds suggests that the model effectively learns typical interatomic distances and local environments associated with common covalent bonding. The slightly lower recall for triple bonds, despite high precision, indicates that the model tends to be conservative in assigning triple bonds, possibly due to their rarity and the strict geometric constraints required for their identification. This conservative behavior is preferable as misclassifying a nontriple bond as triple could result in larger errors in downstream tasks.

Bond Reconstruction from Crude de Novo Generated Compounds

The goal of this work is to develop a model capable of reconstructing chemical bonds in crude de novo generated compounds (CDG), which often exhibit noisy or imprecise 3D structures. To simulate this scenario, we introduce controlled noise into the molecular coordinates of the GEOM data set (Figure 3a and Methods Section), mimicking the structural distortion commonly found in generated molecules. We then employ our model to reconstruct bonds from these perturbed coordinates. Evaluation on the test set shows that the model maintains strong performance under noisy conditions, achieving average accuracy, precision, recall, and F1-score of 92.7%, 93.6%, 92.7%, and 92.7%, respectively (Figure 3b and Table S2).

Figure 3.

Performance evaluation of YuelBond on the CDG bond reconstruction task. (a) Noise is introduced to the atomic coordinates in the compound, and edges are constructed from the perturbed atoms. Bonds are then reconstructed from these distorted compounds. (b) Evaluation of YuelBond’s performance in reconstructing the CDG bonds. (c) Performance of YuelBond for predicting each bond type. (d) Comparison of YuelBond’s performance with RDKit. YuelBond successfully processes all 1000 compounds in the test set, while RDKit can only process 217 compounds due to failures caused by distorted bond lengths and angles. YuelBond achieves approximately 0.8 in accuracy, precision, recall, and F1 score across all 1000 compounds, while RDKit obtains ~0.6 for these metrics on the 217 compounds.

We further analyzed the performance of YuelBond across different bond types (Figure 3c and Table S2). For single bonds, the model achieved an accuracy of 93.5%, precision of 83.4%, recall of 86.0%, and F1-score of 83.3%. Double bond prediction yielded 98.3% accuracy, 85.9% precision, 55.9% recall, and 56.3% F1-score. For triple bonds, the model achieved 99.8% accuracy, 86.5% precision, 63.9% recall, and 59.9% F1-score. Aromatic bond prediction produced 95.2% accuracy, 78.0% precision, 83.2% recall, and 75.5% F1-score.

These results show that the model retains high predictive accuracy for all bond types, but precision and recall vary with bond complexity. Single bonds are predicted most reliably, due to their high frequency and distinctive local geometry, even after perturbation. Aromatic bonds, often embedded in rings with recognizable topological features, are also well predicted despite coordinate noise. In contrast, double and triple bonds present greater challenges under distortion. Their prediction relies heavily on precise bond lengths and angles, which are features that become less informative when the structure is noisy. Consequently, the recall and F1-scores for these bond types are lower, reflecting increased false negatives.

To benchmark our model, we compared its performance with RDKit (Figure 3d and Table S2). Out of 1000 test molecules with noisy coordinates, RDKit was able to process only 217, while it failed on the remaining 783 molecules. This limitation arises because RDKit relies on chemically valid input geometries and rule-based heuristics that become unreliable when the structure is distorted. Even among the 217 successfully processed molecules, RDKit achieved only 65.3% accuracy, 70.1% precision, 65.3% recall, and 64.2% F1-score, substantially lower than YuelBond. These results highlight the robustness of YuelBond in reconstructing bond orders from imprecise 3D structures, where traditional rule-based methods often fail.

Bond Order Reassignment for 2D Generated Compounds

In the third scenario, we consider cases where generated molecules are represented solely as 2D graphs—that is, with atom connectivity but without 3D coordinate information (Figure 4a). This format is common for generative models that produce SMILES strings or molecular graphs, where the presence of a bond is known but the bond order is either missing or ambiguous. In such cases, reassignment of bond orders is a key step toward constructing chemically valid structures.

Figure 4.

Performance evaluation of YuelBond on the bond order reassignment task. (a) The bond connectivity is preserved, while the bond orders are reassigned across all bonds. (b) Evaluation of YuelBond’s performance in reassigning bond orders. (c) Performance of YuelBond for predicting each bond type.

To address this challenge, we applied YuelBond to infer bond orders based only on the atomic connectivity. When evaluated on the test set, the model achieved average accuracy, precision, recall, and F1-score of 80.1%, 81.1%, 80.1%, and 78.3%, respectively (Figure 4b and Table S3). Compared to the previous scenarios that incorporated 3D information, these values are lower, reflecting the increased complexity of bond order prediction without geometric features such as bond lengths and angles.

We further examined the model’s behavior across different bond types (Figure 4c and Table S3). For single bonds, YuelBond achieved high recall (91.7%) and a solid F1-score (82.4%), reflecting the consistency and prevalence of these bonds in molecular graphs. For double and triple bonds, accuracy remained high (94.6% and 99.2%, respectively), but recall was lower due to their relatively low frequency. This is expected, as higher-order bonds often exhibit subtler patterns that are more challenging to distinguish based on connectivity alone. Notably, the high accuracy for triple bonds indicates strong capability in avoiding false positives, while the lower recall highlights a more conservative prediction strategy that favors precision over overprediction. The lower recall for double and triple bonds, relative to single and aromatic bonds, is largely attributable to their rarity in the data set. Since the overall number of negative cases (e.g., bonds that are not triple) is much larger, even a conservative predictor can yield high accuracy. Aromatic bonds were predicted with balanced performance, achieving 86.0% accuracy and an F1-score of 71.6%, suggesting that their unique connectivity patterns are reasonably well captured by the model.

Molecules with identical 2D connectivity may correspond to different chemically valid bond order assignments due to variations in local electronic environments. A simple example is the distinction between pyridine and its N-oxide form, where the connectivity remains unchanged but the bond orders around the nitrogen atom differ. Similarly, consider the case of a carbon–carbon bond, where a single bond, a double bond, and a triple bond could all appear as valid assignments depending on the electronic environment and the molecule’s resonance structure. In compounds like alkynes or alkenes, the bonding pattern can shift between single, double, or triple bonds, even if the connectivity graph appears identical in a 2D representation. These subtle differences are inherently challenging to resolve based on connectivity alone. Nonetheless, YuelBond demonstrates the ability to make chemically meaningful and robust predictions for bond order reassignment.

Bond Order Probability Prediction

While the overall prediction accuracy for double and triple bonds is lower in the bond order reassignment scenario (Figure 5a,b), it is informative to examine the confidence in the predictions. Since YuelBond outputs raw logits for each bond order class, we apply the softmax function to convert them into probabilities to reflect the confidence in assigning each possible bond order. These probabilities are meaningful as multiple bond types may appear plausible based on 2D connectivity alone.
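The logit-to-probability conversion is a standard softmax; a self-contained sketch follows (the logit values are illustrative, not actual model outputs):

```python
import math

def bond_probabilities(logits, classes=("single", "double", "aromatic", "triple")):
    """Convert per-bond raw logits into a probability distribution
    over bond order classes via softmax."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}

# Illustrative logits: "single" wins, but other classes keep
# non-negligible probability mass.
probs = bond_probabilities([2.0, 1.35, 0.9, 1.44])
print(max(probs, key=probs.get))
```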

Figure 5.

Illustration of bond order probability prediction. (a, b) Two examples of failed predictions for double and triple bonds. (c) Despite incorrectly predicting the triple and double bonds as single bonds, the probabilities for these bond orders remain non-negligible.

To illustrate this, we analyzed a representative compound with the SMILES O=CCC(=O)CCC#N, which contains two double bonds and one triple bond. We calculated the predicted probabilities for each bond in the molecule (Figure 5c). For example, bond 1, which is a true triple bond, is misclassified as a single bond. However, the model assigns a 27.3% probability to the triple bond class, compared to 47.7% for single and 24.9% for double. This relatively high probability for the correct class indicates that the model retains partial confidence in the correct prediction, even if it does not appear as the top choice. In contrast, bond 2, a correct single bond prediction, is associated with a confident 90.3% probability, showing the model’s strong certainty when the context is less ambiguous.

A similar pattern is seen in bond 8, which is a double bond but predicted as single. Here, the probability for double bonds reaches 25.0%, while single and triple bonds receive 55.0% and 19.9%, respectively. These examples show that even when the final classification is incorrect, the predicted probability distribution often captures meaningful chemical ambiguity.

Impact of YuelBond Optimization on Generated Molecules from 2D and 3D Models

To further validate the practical utility of YuelBond in generative drug discovery pipelines, we evaluated its ability to optimize molecular structures generated by state-of-the-art 2D and 3D generative models. To quantitatively assess the quality of bond generation, we introduced the sanitize rate as a key metric, defined as the proportion of molecules that pass chemical validity checks (see Methods Section). We used DiGress (a 2D molecular graph generator) and DecompDiff (a 3D conformation generator) to produce 2000 raw molecular structures each (see Methods Section), then applied YuelBond to reassign bond orders and assessed improvements via both the sanitize rate and log probability scores (see Methods Section). The latter are computed as the sum of log-transformed probabilities of the individual bond order assignments and serve as a complementary quantitative measure of a molecular structure’s chemical consistency.

For 2D-generated molecules, DiGress outputs molecular graphs with defined atom connectivity, but its raw outputs exhibited a suboptimal sanitize rate of 0.81 (Figure 6a), indicating that nearly 20% of generated molecules contain chemically invalid bond configurations. After YuelBond optimization, the sanitize rate improved to 0.95. Concomitantly, the log probability scores of the optimized molecules were consistently higher across all tested samples (Figure 6b), confirming that the bond orders reassigned by YuelBond enhance the chemical rationality of 2D-generated structures.

For 3D-generated molecules, DecompDiff produces atomic coordinates but often suffers from geometric distortions, which lead to unreliable bond annotations. This limitation is directly reflected in its low sanitize rate of 0.76 (Figure 6d), demonstrating that bond generation remains a critical bottleneck for 3D generative models. YuelBond’s bond reconstruction elevated the sanitize rate to 0.94 and increased the log probability scores (Figure 6e).
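Both metrics are straightforward to compute from the definitions above; in this sketch, the boolean flags stand in for RDKit sanitization checks and the probability values are illustrative:

```python
import math

def log_probability_score(bond_probs):
    """Sum of log-transformed probabilities of the assigned bond
    orders; higher (closer to 0) indicates a more chemically
    consistent structure under the model."""
    return sum(math.log(p) for p in bond_probs)

def sanitize_rate(validity_flags):
    """Fraction of generated molecules that pass chemical validity
    checks (booleans here stand in for actual RDKit sanitization)."""
    return sum(validity_flags) / len(validity_flags)

score = log_probability_score([0.9, 0.8, 0.95])   # one molecule's bond probs
rate = sanitize_rate([True, True, False, True])   # 3 of 4 molecules valid
```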

Figure 6.

Impact of YuelBond optimization on 2D and 3D generated molecules. (a) Sanitize rate of DiGress-generated molecules and YuelBond-optimized molecules. (b) Log probability scores of DiGress-generated molecules and YuelBond-optimized molecules. The dots denote distinct molecular samples. (c) Example Digress-generated molecule before and after YuelBond optimization. Differences in bonds are shown in the black dashed box. (d) Sanitize rate of DecompDiff-generated molecules and YuelBond-optimized molecules. (e) Log probability scores of DecompDiff-generated molecules and YuelBond-optimized molecules. The dots denote distinct molecular samples. (f) Example DecompDiff-generated molecule before and after YuelBond optimization. Differences in bonds are shown in the black dashed box.

The sanitize rate data from both DiGress and DecompDiff reveal that even state-of-the-art 2D and 3D molecular generation methods fail to produce fully chemically valid bonds in their raw outputs. We also evaluated whether RDKit could process these generated molecules. For DiGress-generated molecules, RDKit is entirely unable to reconstruct bonds because it requires 3D coordinates for bond generation, while DiGress outputs only 2D molecular graphs. For DecompDiff-generated molecules, which provide 3D coordinates, RDKit was able to process 1284 of the 2000 molecules (64.2%) but failed on the remaining 716 (35.8%) due to geometric distortions. In contrast, YuelBond successfully processed all molecules. Collectively, these results highlight YuelBond’s value as a useful postprocessing tool for generative models.

Test on the Kekulized Data Set

In previous scenarios, YuelBond was trained and evaluated using molecular representations that retained aromatic bonds as explicit aromatic types. However, in many cheminformatics applications, particularly in graph-based processing, it is common practice to kekulize aromatic systems, that is, to convert aromatic bonds into alternating single and double bonds (Figure 7a). To assess the robustness of our model under such settings, we reprocessed the GEOM data set by kekulizing all aromatic bonds and retrained YuelBond using this modified data set. We evaluated the performance across all three prediction scenarios. When reconstructing bonds from exact 3D coordinates, YuelBond achieved an average accuracy of 96.5%, precision of 96.5%, recall of 96.5%, and F1-score of 96.4% on the test set (Figure 7b and Table S4). In the scenario of CDG, performance remained high, with 91.9% accuracy, 91.0% precision, 91.9% recall, and an F1-score of 90.9% (Figure 7c and Table S5). In the most challenging bond order reassignment, YuelBond achieved 80.3% accuracy, 76.8% precision, 80.3% recall, and an F1-score of 76.6% (Figure 7d and Table S6).

Figure 7.

Performance of YuelBond on Kekulized data set. (a) In the Kekulized data set, aromatic bonds are represented as alternating single and double bonds. (b) Performance of YuelBond on bond reconstruction from 3D coordinates. (c) Performance of YuelBond on bond reconstruction from CDG. (d) Performance of YuelBond on bond order reassignment.

Despite the structural transformation introduced by kekulization, the overall performance of YuelBond remains consistent with its original version trained on nonkekulized data, suggesting that YuelBond is not overly reliant on explicit aromatic bond annotations and can generalize across chemically equivalent but representationally different data sets. The minor drop in F1-score for bond order reassignment (from 78.3% to 76.6%) may reflect increased ambiguity in distinguishing alternating single and double bonds in kekulized rings. These results confirm that YuelBond maintains strong predictive performance even when trained on kekulized data.

DISCUSSION

In the idealized case with accurate 3D atomic coordinates, YuelBond achieves near-perfect reconstruction. This reflects the high fidelity of 3D geometry in encoding bond-specific information such as distance and local spatial configuration. The slight conservatism in triple bond classification, manifested as high precision but lower recall, suggests a preference for avoiding false positives in ambiguous scenarios. This is desirable behavior for downstream applications in molecular modeling, where incorrectly assigning a high bond order may introduce substantial artifacts.

The second set of experiments, which introduces geometric distortion to mimic the imperfections of CDG, underscores the resilience to structural noise. YuelBond not only remains functional under such perturbation but also maintains strong predictive power. While there is a drop in performance, especially for bonds that are sensitive to precise geometry, such as double and triple bonds, the results remain significantly superior to RDKit. The RDKit benchmark shows that rule-based systems are fragile when assumptions about molecular structure are violated. YuelBond, trained on varied data and designed to incorporate learned spatial heuristics, proves more tolerant of noise, making it better suited for emerging applications in generative chemistry and structure prediction pipelines.

In the 2D-only setting, the problem shifts from one of geometric reasoning to one of chemical inference, as the model is forced to predict bond orders without direct access to atomic positions. Despite the lack of geometric cues, YuelBond still performs well, implying the capacity to learn structural priors purely from connectivity patterns. The disparity in performance across bond types, especially the lower recall for double and triple bonds, points to inherent ambiguities in 2D representations. Many different electronic or resonance states can yield the same connectivity graph, complicating prediction. The observed high accuracy but low recall for triple bonds in this scenario indicates a conservative bias of YuelBond, which tends to require strong evidence before assigning rare or chemically specific bond types.

The bond order probability highlights the ability of YuelBond to express prediction uncertainty. Unlike deterministic rule-based systems, YuelBond outputs a distribution over possible bond orders, which offers a more nuanced understanding of model confidence. This feature is especially valuable in ambiguous or edge cases, such as the analysis of a molecule with multiple functional groups, where assigning a single bond order may be chemically limiting. These probabilistic outputs can be incorporated into downstream workflows, e.g., during molecule generation, refinement, or validation, offering users a mechanism for prioritizing predictions, flagging uncertain assignments, or sampling alternative structures.

Taken together, YuelBond presents a unified framework capable of learning and generalizing chemical bonding rules across diverse input representations. It excels in accurate 3D settings, tolerates geometric distortion, adapts reasonably well to 2D graphs, and provides interpretable probabilistic predictions. Rather than aiming to replace rule-based methods entirely, YuelBond can be seen as a complementary tool, which brings the flexibility and contextual awareness of machine learning into scenarios where traditional approaches falter.

METHODS

Data Set and Preprocessing

We used the GEOM data set for training, validation, and testing. GEOM is a large-scale collection of molecular conformations annotated with energy and statistical weight information, designed to support machine learning applications in computational chemistry. It comprises approximately 37 million conformers for over 450,000 molecules, including 133,000 species from the QM943 data set and 317,000 species with experimental data related to biophysics, physiology, and physical chemistry. Each molecule’s conformers were generated using advanced sampling techniques and semiempirical density functional theory (DFT) methods and further processed to explore conformational space. For training, validation, and testing, we used the lowest-energy conformer of each molecule.

We extracted the SMILES strings and the lowest-energy 3D conformer of each molecule and stored them in a PostgreSQL database for efficient retrieval. Each molecule undergoes two parallel processing streams: (1) sanitization (validation via RDKit’s standard chemical checks) and (2) kekulization (aromatic bonds explicitly represented as alternating single/double bonds). For each molecule, atomic positions are extracted from the conformer, while atoms and bonds are encoded as one-hot vectors using predefined mappings. Invalid molecules and those outside the size range (<2 atoms or >150 atoms) are filtered out.

Molecular graphs are constructed by representing atoms as nodes (with one-hot features and 3D coordinates) and bonds as edges (with one-hot bond types). Key molecular features, including atomic positions, one-hot encoded atom types, and bond types, are extracted. In the scenario of predicting the bond order of the molecule from 3D coordinates, edges are created between atoms within a 3 Å distance cutoff. In the scenario of predicting the bond order of the molecule from the connectivity of the atoms, edges are created as the bonds in the molecule, but the bond order is not included. In the scenario of predicting the bond order of CDG, Gaussian noise (σ = 0.2 Å) is added to the atomic positions, and then edges are created between atoms within a 3 Å distance cutoff.
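The CDG perturbation step can be sketched as follows, using the σ = 0.2 Å Gaussian noise and 3 Å cutoff described above (the feature encoding is again omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_and_connect(coords, sigma=0.2, cutoff=3.0):
    """Simulate a crude de novo generated compound: add isotropic
    Gaussian noise (sigma in angstroms) to atomic positions, then
    rebuild edges between atoms within the distance cutoff."""
    coords = np.asarray(coords, dtype=float)
    noisy = coords + rng.normal(0.0, sigma, size=coords.shape)
    dist = np.linalg.norm(noisy[:, None] - noisy[None, :], axis=-1)
    src, dst = np.where((dist < cutoff) & (dist > 0.0))
    return noisy, list(zip(src.tolist(), dst.tolist()))

# A bonded pair at 1.5 A remains connected after mild perturbation.
noisy, edges = perturb_and_connect([(0, 0, 0), (1.5, 0, 0)])
```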

To evaluate YuelBond’s performance on molecules generated by state-of-the-art generative models and to assess its robustness to out-of-distribution (OOD) data, we constructed two additional data sets using DiGress (a 2D molecular graph generator) and DecompDiff (a 3D molecular conformation generator). We generated 2000 molecular structures with each method. For DiGress-generated molecules, which provide only 2D molecular graphs with atom connectivity but no 3D coordinates, YuelBond was evaluated on bond order reassignment tasks. For DecompDiff-generated molecules, which provide 3D atomic coordinates but may contain geometric distortions, YuelBond was evaluated on bond reconstruction tasks.

Architecture of the Bond Reconstruction Model

The neural network is a graph neural network designed to predict bond orders in molecular compounds by processing graph-structured data, where nodes represent atoms and edges represent bonds. The model begins by embedding input node and edge features into a higher-dimensional space using linear transformations. The core of the network is a stack of message-passing layers, which iteratively refine node and edge features through neighborhood aggregation. Each message-passing layer contains two key components: an edge model and a node model. The edge model processes edge features by combining source and target node features with existing edge attributes, passing them through a multilayer perceptron (MLP)44 with layer normalization and a SiLU45 activation function. A key innovation is the incorporation of existing edge features into the edge model: whereas previous GNNs use only the features of the connected nodes to update edge features, we use the concatenation of the connected nodes’ features, the distance between the nodes, and the previous edge features. This allows the model to expand its receptive field more effectively. The node model aggregates incoming edge features for each node, combines them with the node’s current features, and applies another MLP to update the node representation. The forward pass handles batched graphs by merging them into a single large graph, applying masked operations to respect variable-sized inputs, and then splitting the results back into batches. The final output is produced by projecting the hidden features to the desired output dimensions (e.g., bond orders for edges).

Implementation of the Graph Neural Network with Edge Updates

We model the molecular system as a fully connected undirected graph G = (V, E), where each node v_i ∈ V represents an entity (e.g., an atom or residue) with a feature vector h_i ∈ R^d, and each edge e_ij ∈ E represents an interaction or relationship between nodes v_i and v_j, with associated edge attributes e_ij ∈ R^k.

Our framework builds on the message-passing neural network (MPNN)46 paradigm but introduces a dedicated edge update mechanism that plays a central role in learning the relational structure of the data. Unlike standard GNNs that update edge features implicitly or keep them static, we explicitly model and update e_ij during each layer of the network.

At each layer l, node features and edge features are updated as follows:

Message construction: For each pair of nodes (i, j), a message is constructed using a learned function φm:

m_ij^(l) = φ_m(h_i^(l), h_j^(l), e_ij^(l))  (1)

Edge update: The edge features are updated using the current node representations and previous edge features through a function φe:

e_ij^(l+1) = φ_e(h_i^(l), h_j^(l), e_ij^(l), d_ij)  (2)

Node update: Each node aggregates incoming messages from its neighbors 𝒩(i) using an aggregation function ρ, followed by an update function φh:

m_i^(l) = ρ({m_ij^(l) : j ∈ 𝒩(i)}),  h_i^(l+1) = φ_h(h_i^(l), m_i^(l))  (3)

All functions φ_m, φ_e, and φ_h are implemented as two-layer MLPs with SiLU45 activations and layer normalization. To ensure stability during training, we use residual connections where applicable.
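Eqs 1–3 can be sketched numerically as below. This is a simplified illustration, not the published implementation: each two-layer MLP is stood in for by a single linear map followed by SiLU, the weight matrices `Wm`, `We`, `Wh` are placeholders for φ_m, φ_e, φ_h, and messages are sum-aggregated to both endpoints of each undirected edge.

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

def mp_layer_with_edge_update(h, e, d, edges, Wm, We, Wh):
    """One message-passing layer with explicit edge updates (eqs 1-3).

    h: (N, dh) node features; e: (E, de) edge features;
    d: (E,) interatomic distances; edges: list of (i, j) node pairs.
    Wm, We, Wh: weight matrices standing in for the MLPs phi_m, phi_e, phi_h.
    """
    N = h.shape[0]
    # Eq 1: messages from concatenated endpoint and edge features.
    msgs = np.stack([silu(np.concatenate([h[i], h[j], e[k]]) @ Wm)
                     for k, (i, j) in enumerate(edges)])
    # Eq 2: the edge update also sees the interatomic distance d_ij.
    e_new = np.stack([silu(np.concatenate([h[i], h[j], e[k], [d[k]]]) @ We)
                      for k, (i, j) in enumerate(edges)])
    # Eq 3: sum-aggregate incoming messages, then update node features.
    agg = np.zeros((N, msgs.shape[1]))
    for k, (i, j) in enumerate(edges):
        agg[i] += msgs[k]
        agg[j] += msgs[k]  # undirected graph: both endpoints receive
    h_new = silu(np.concatenate([h, agg], axis=1) @ Wh)
    return h_new, e_new
```

Stacking several such layers, with the paper’s residual connections and layer normalization added back in, gives the core of the described architecture.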

Chain Sampling Strategy for Bond Prediction

We use chain sampling for bond order prediction. Unlike standard approaches that predict all bond types simultaneously, we employ an autoregressive strategy that predicts bonds sequentially, allowing the model to leverage previously predicted bond information. For each molecule, we generate a random edge traversal sequence S = [e_1, e_2, …, e_n] that visits all edges in the graph. This sequence is generated using a random traversal algorithm that ensures all edges are visited exactly once. The algorithm starts from a randomly selected edge and iteratively selects a next edge that shares at least one node with the previously visited edges, ensuring that the traversal maintains local connectivity in the graph structure. This connectivity-preserving property is crucial for the model, as it allows the sequential prediction to leverage information from chemically adjacent bonds (those sharing atoms), which are more likely to be correlated in their bond types.
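A minimal version of such a connectivity-preserving random traversal might look like the following. This is our own sketch (function name, seeding, and tie-breaking are ours); the published algorithm may differ in details such as how disconnected components are handled.

```python
import random

def connected_edge_traversal(edges, seed=0):
    """Random edge traversal that visits every edge exactly once, where each
    new edge shares an atom with an already-visited edge whenever the graph
    allows it. `edges` is a list of (i, j) atom-index pairs."""
    rng = random.Random(seed)
    remaining = list(edges)
    order = [remaining.pop(rng.randrange(len(remaining)))]
    visited_atoms = set(order[0])
    while remaining:
        # Prefer edges that touch an already-visited atom.
        adjacent = [e for e in remaining
                    if e[0] in visited_atoms or e[1] in visited_atoms]
        pool = adjacent if adjacent else remaining  # fall back if disconnected
        nxt = pool[rng.randrange(len(pool))]
        remaining.remove(nxt)
        order.append(nxt)
        visited_atoms.update(nxt)
    return order

order = connected_edge_traversal([(0, 1), (1, 2), (2, 3), (3, 0)])
```

For a connected molecular graph, every edge after the first shares an atom with the set already visited, which is exactly the property the sequential predictor exploits.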

During training, we randomly select an integer n ∈ [1, n_edges], where n_edges is the total number of edges in the molecule. The first n − 1 edges in the traversal sequence S are set to their ground-truth bond types (known bonds), while the n-th edge is designated as the target for prediction. This creates a self-supervised learning setup where the model learns to predict bond types conditioned on partial bond information. During inference, bonds are predicted sequentially following the traversal order. At each step, the model predicts the bond type for the current target edge based on (1) node features, (2) previously predicted bond types (via edge_attr), (3) interatomic distances (edge_dist), and (4) the masking information. The predicted bond type is then incorporated into the edge features for subsequent predictions, enabling the model to capture dependencies between bond types. During inference, we support temperature-controlled sampling, where temperature = 1.0 corresponds to argmax (deterministic) prediction and higher temperatures introduce stochasticity for exploring diverse bond configurations.

The chain sampling strategy enables YuelBond to generate multiple valid bond configurations for a given molecular structure. By sampling from the probability distribution at each step (using temperature >1.0), the model can explore different bond order assignments, generating alternative chemically valid solutions. This capability is particularly valuable when the probability distribution is flat (indicating uncertainty) or when multiple chemically valid configurations exist for the same molecular connectivity. Users can leverage probability thresholds to identify uncertain predictions and explore alternative bond assignments, allowing the generation of multiple bond combinations for ambiguous cases. All accuracy metrics reported in this manuscript are based on the most probable bond type determination (argmax prediction). During evaluation, we use the bond type with the highest probability from the model’s output distribution as the predicted bond type for each edge.
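The per-step sampling rule can be sketched as below, following the convention stated above (temperature = 1.0 means deterministic argmax; temperatures above 1.0 flatten the softmax and sample from it). The function name and seeding are ours, not from the YuelBond code.

```python
import numpy as np

def sample_bond_type(logits, temperature=1.0, rng=None):
    """Temperature-controlled bond-type selection for one edge.

    Convention from the text: temperature == 1.0 -> argmax (deterministic);
    temperature > 1.0 -> sample from a temperature-flattened softmax."""
    logits = np.asarray(logits, dtype=float)
    if temperature <= 1.0:
        return int(np.argmax(logits))
    rng = rng if rng is not None else np.random.default_rng(0)
    z = logits / temperature
    p = np.exp(z - z.max())   # numerically stable softmax
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

print(sample_bond_type([2.0, 0.5, -1.0]))  # 0 (argmax of the logits)
```

Repeating the stochastic call with temperature > 1.0 yields alternative bond assignments for the same edge, which is how multiple valid configurations are explored.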

The chain sampling framework not only enables bond generation but also provides a principled way to score and evaluate existing molecular configurations. For each edge in the traversal sequence, the model outputs a probability distribution over bond types. The probability of a specific bond type assignment for edge e_i is denoted as p_i. The overall probability of a complete molecular configuration is computed as the product of individual edge probabilities, P = ∏_{i=1}^{n} p_i. However, to avoid numerical underflow when dealing with many edges, we use the log probability instead: log P = Σ_{i=1}^{n} log p_i. This scoring mechanism allows us to rank and compare different molecular configurations, assess the likelihood of proposed bond assignments, and identify chemically plausible structures. The log probability is particularly useful for comparing molecules of different sizes, as it naturally accounts for the varying number of edges while maintaining numerical stability.
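The scoring rule amounts to a one-line sum of logs; a minimal sketch (function name is ours):

```python
import math

def configuration_log_prob(edge_probs):
    """Score a full bond-order assignment as log P = sum_i log p_i,
    avoiding the numerical underflow of the raw product for molecules
    with many edges."""
    return sum(math.log(p) for p in edge_probs)

# Fifty moderately confident edges: the raw product 0.9**50 shrinks toward
# zero, while the log-probability stays at a well-scaled magnitude.
log_p = configuration_log_prob([0.9] * 50)
```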

Training Process

The model is trained using PyTorch47 Lightning, which streamlines the training loop, validation, and logging. The training step involves passing molecular graphs through the GNN to predict bond orders, followed by computing the loss between predicted and true bond types. The loss function is masked cross-entropy, where only valid edges (nonpadded bonds) contribute to the gradient updates. The optimizer used is AdamW48 with a learning rate of 10−4, AMSGrad enabled, and weight decay (10−12) for regularization. Training metrics (e.g., loss) are logged at specified intervals and tracked using Weights & Biases49 for visualization. The model supports data augmentation and processes batches of variable-sized graphs using a custom collate function to handle padding and masking.

Loss Function

The objective of the model is to accurately predict the presence and type of bonds (or edges) between node pairs in a graph. To this end, we define a supervised learning loss based on the categorical cross-entropy between the predicted bond type distribution and the ground-truth bond type labels.

Let p̂_ij ∈ R^C be the predicted probability distribution over C bond types for the edge between nodes i and j, and let y_ij ∈ {0, 1}^C be the one-hot encoded ground-truth label. The edge prediction loss is given by

L_edge = (1 / Σ_{i,j} m_ij) Σ_{i,j} m_ij · CE(p̂_ij, y_ij)  (4)

where CE(p̂_ij, y_ij) = −Σ_{c=1}^{C} y_ij^(c) log p̂_ij^(c) is the standard cross-entropy loss for a single edge, and m_ij ∈ {0, 1} is a binary mask indicating whether the edge (i, j) is the target edge for prediction (1) or not (0). In our training setup, only one edge per molecule is masked as the target, ensuring that the model learns to predict bond types conditioned on partial bond information. The denominator Σ_{i,j} m_ij normalizes the loss by the number of target edges (typically 1 per molecule) to ensure stability regardless of graph size or padding.

The masked and normalized formulation prevents the model from being biased by graph sparsity or the inclusion of padded entries during batching. It also ensures consistent gradients during training across variable-sized graphs.
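Eq 4 can be sketched in a few lines of numpy. This is an illustrative stand-alone version (function name and the small `eps` stabilizer are ours), not the PyTorch training code:

```python
import numpy as np

def masked_edge_cross_entropy(p_hat, y_onehot, mask, eps=1e-12):
    """Masked cross-entropy of eq 4: only edges with mask == 1 (the target
    edges) contribute, and the sum is normalized by the number of targets.

    p_hat: (E, C) predicted probabilities; y_onehot: (E, C) one-hot labels;
    mask: (E,) binary target mask."""
    p_hat = np.asarray(p_hat, dtype=float)
    y = np.asarray(y_onehot, dtype=float)
    m = np.asarray(mask, dtype=float)
    ce = -(y * np.log(p_hat + eps)).sum(axis=1)  # per-edge CE
    return float((m * ce).sum() / m.sum())
```

With one target edge per molecule, the masked sum reduces to that single edge’s cross-entropy, so padded and known edges contribute no gradient.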

Evaluation Metrics

The model’s performance is evaluated using standard multiclass classification metrics: accuracy, precision, recall, and F1-score. They are computed on a per-molecule basis and aggregated across the test set. The classification targets are the bond types between atom pairs, such as single, double, triple, and aromatic. Padded or invalid edges are masked and excluded from all metric computations.

Let ŷ_ij ∈ {1, …, C} denote the predicted bond type for an atom pair (i, j), y_ij the corresponding ground-truth bond type, m_ij ∈ {0, 1} the mask for valid edges, C the number of bond classes, and I the indicator function; then:

accuracy = Σ_{i,j} m_ij I[ŷ_ij = y_ij] / Σ_{i,j} m_ij  (5)
precision_c = Σ_{i,j} I(ŷ_ij = c ∧ y_ij = c) m_ij / Σ_{i,j} I(ŷ_ij = c) m_ij  (6)
recall_c = Σ_{i,j} I(ŷ_ij = c ∧ y_ij = c) m_ij / Σ_{i,j} I(y_ij = c) m_ij  (7)
F1_c = 2 · precision_c · recall_c / (precision_c + recall_c)  (8)
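Eqs 5–8 translate directly into masked counting; a minimal sketch (our function name, with zero-denominator cases returned as 0.0 by convention):

```python
import numpy as np

def masked_bond_metrics(y_pred, y_true, mask, num_classes):
    """Accuracy plus per-class precision/recall/F1 over valid (masked-in)
    edges only, following eqs 5-8."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    m = np.asarray(mask).astype(bool)
    yp, yt = y_pred[m], y_true[m]          # drop padded/invalid edges
    acc = float((yp == yt).mean())
    prec, rec, f1 = [], [], []
    for c in range(num_classes):
        tp = float(((yp == c) & (yt == c)).sum())
        pred_c = float((yp == c).sum())    # predicted as class c
        true_c = float((yt == c).sum())    # truly class c
        p = tp / pred_c if pred_c else 0.0
        r = tp / true_c if true_c else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        prec.append(p); rec.append(r); f1.append(f)
    return acc, prec, rec, f1
```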

To quantitatively assess the quality of bond generation in molecules produced by generative models, we introduced the sanitize rate as a key metric. The sanitize rate is defined as the proportion of molecules that pass chemical validity checks, with a specific focus on bond order correctness and compliance with atomic valence rules. The sanitize rate is computed as

sanitize rate = N_valid / N_total  (9)

where Nvalid is the number of molecules that pass RDKit’s standard chemical validity checks (including bond order validation and atomic valence constraints), and Ntotal is the total number of molecules in the test set. A molecule is considered valid if it can be successfully sanitized by RDKit without errors, indicating that all bond orders are chemically plausible and all atoms satisfy their expected valence states.
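Eq 9 is a simple ratio; the sketch below (our own generic version) takes the validity check as an injected predicate so it stays dependency-free. In practice the predicate would wrap RDKit, e.g. attempting Chem.SanitizeMol(mol) and returning False when it raises.

```python
def sanitize_rate(mols, is_valid):
    """Eq 9: fraction of generated molecules passing validity checks.

    `is_valid` is any predicate on a molecule; for the paper's metric it
    would call RDKit sanitization and catch failures."""
    total = len(mols)
    if total == 0:
        return 0.0
    return sum(1 for m in mols if is_valid(m)) / total
```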

Baseline Method: RDKit’s DetermineBonds Algorithm

To assess the performance of YuelBond, we benchmark it against the built-in bond type assignment function of RDKit,36 rdkit.Chem.rdDetermineBonds.DetermineBonds. This method is a widely used rule-based algorithm that infers bond orders directly from molecular 3D geometries. Internally, it leverages the xyz2mol37 program, which applies a series of chemically motivated heuristics to reconstruct the bond network of a molecule from atomic coordinates.

The xyz2mol algorithm proceeds in multiple stages. First, it detects bonded atom pairs based on interatomic distances and covalent radii thresholds. Next, it computes potential valence states for each atom and iteratively searches for a valid bond order configuration. This process prioritizes chemical plausibility by enforcing standard valency constraints and ensures that no atom exceeds its typical bonding capacity. For cases involving multiple plausible configurations, particularly in conjugated or aromatic systems, xyz2mol employs additional reasoning strategies, including Hückel theory or graph-theoretic techniques, to resolve ambiguities and assign aromaticity. While DetermineBonds offers a fast and interpretable solution, its performance is inherently limited by the rigidity of handcrafted rules and its reliance on idealized geometry.
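The first stage of such a rule-based pipeline, distance-based connectivity detection, can be sketched as follows. The radii values and the 1.3 tolerance factor below are approximate illustrations of the idea, not the actual thresholds used by xyz2mol or RDKit.

```python
import numpy as np

# Illustrative covalent radii in Å (approximate values, for this sketch only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def detect_bonded_pairs(symbols, coords, tolerance=1.3):
    """Rule-based connectivity perception: atoms i and j are considered
    bonded when their distance is below the sum of their covalent radii
    times a tolerance factor. Later stages of a full pipeline would then
    search for a valence-consistent bond-order assignment."""
    pos = np.asarray(coords, dtype=float)
    pairs = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            cutoff = tolerance * (COVALENT_RADII[symbols[i]]
                                  + COVALENT_RADII[symbols[j]])
            if np.linalg.norm(pos[i] - pos[j]) <= cutoff:
                pairs.append((i, j))
    return pairs
```

This hard geometric cutoff is precisely what makes rule-based perception brittle on distorted CDG geometries, where bonded atoms can drift outside the threshold.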

Supplementary Material

supporting information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.5c03052.

  • Table S1: test performance of reconstructing bonds from accurate 3D coordinates; Table S2: test performance of reconstructing bonds from CDG; Table S3: test performance of reassigning bond orders for 2D atom graphs; Table S4: test performance of reconstructing bonds from accurate 3D coordinates in the Kekulized data set; Table S5: test performance of reconstructing bonds from CDG in the Kekulized data set; Table S6: test performance of reassigning bond orders for 2D atom graphs in the Kekulized data set (PDF)

ACKNOWLEDGMENTS

We acknowledge support from the National Institutes of Health (1R35 GM134864), the Huck Institutes of the Life Sciences, and the Passan Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This project was supported by the Penn State College of Medicine’s Artificial Intelligence and Biomedical Informatics Program.

Footnotes

The authors declare no competing financial interest.

Contributor Information

Jian Wang, Department of Neurology and Neuroscience, University of Virginia, School of Medicine, Charlottesville, Virginia 22903, United States.

Nikolay V. Dokholyan, Department of Neurology and Neuroscience and Department of Biomedical Engineering, University of Virginia, School of Medicine, Charlottesville, Virginia 22903, United States

Data Availability Statement

Source codes and test data are deposited at: https://github.com/dokhlab/yuel_bond or https://github.com/hust220/yuel_bond.

REFERENCES

  • (1) Goodfellow I.; et al. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
  • (2) Ho J.; Jain A.; Abbeel P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851.
  • (3) Martinelli D. D. Generative machine learning for de novo drug discovery: A systematic review. Comput. Biol. Med. 2022, 145, No. 105403.
  • (4) Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36.
  • (5) Krenn M.; et al. SELFIES and the future of molecular string representations. Patterns 2022, 3, No. 100588.
  • (6) Vignac C.; et al. DiGress: Discrete Denoising Diffusion for Graph Generation. arXiv preprint, 2023.
  • (7) Li Y.; Zhang L.; Liu Z. Multi-objective de novo drug design with conditional graph generative model. J. Cheminformatics 2018, 10, 33.
  • (8) Grisoni F.; Moret M.; Lingwood R.; Schneider G. Bidirectional Molecule Generation with Recurrent Neural Networks. J. Chem. Inf. Model. 2020, 60, 1175–1183.
  • (9) Popova M.; Shvets M.; Oliva J.; Isayev O. MolecularRNN: Generating realistic molecular graphs with optimized properties. arXiv preprint, 2019.
  • (10) Jin W.; Barzilay R.; Jaakkola T. Junction Tree Variational Autoencoder for Molecular Graph Generation. In Proceedings of the 35th International Conference on Machine Learning; PMLR, 2018; pp 2323–2332.
  • (11) Simonovsky M.; Komodakis N. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders. In Artificial Neural Networks and Machine Learning – ICANN 2018; Kůrková V., Manolopoulos Y., Hammer B., Iliadis L., Maglogiannis I., Eds.; Vol. 11139; Springer International Publishing: Cham, 2018; pp 412–422.
  • (12) Liu Q.; Allamanis M.; Brockschmidt M.; Gaunt A. Constrained graph variational autoencoders for molecule design. Adv. Neural Inf. Process. Syst. 2018, 31.
  • (13) Prykhodko O.; Johansson S. V.; Kotsias P. C.; Arús-Pous J.; Bjerrum E. J.; Engkvist O.; Chen H. A de novo molecular generation method using latent vector based generative adversarial network. J. Cheminformatics 2019, 11, 74.
  • (14) Kadurin A.; Nikolenko S.; Khrabrov K.; Aliper A.; Zhavoronkov A. druGAN: An Advanced Generative Adversarial Autoencoder Model for de Novo Generation of New Molecules with Desired Molecular Properties in Silico. Mol. Pharmaceutics 2017, 14, 3098–3104.
  • (15) Cao N. D.; Kipf T. MolGAN: An implicit generative model for small molecular graphs. arXiv preprint, 2022.
  • (16) Madhawa K.; Ishiguro K.; Nakago K.; Abe M. GraphNVP: An Invertible Flow Model for Generating Molecular Graphs. arXiv preprint, 2019.
  • (17) Zang C.; Wang F. MoFlow: An Invertible Flow Model for Generating Molecular Graphs. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; ACM: Virtual Event, CA, USA, 2020; pp 617–626.
  • (18) Guan J.; et al. DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design. arXiv preprint, 2024.
  • (19) Huang L.; Xu T.; Yu Y.; Zhao P.; Chen X.; Han J.; Xie Z.; Li H.; Zhong W.; Wong K. C.; Zhang H. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat. Commun. 2024, 15, 2657.
  • (20) Schneuing A.; et al. Structure-based drug design with equivariant diffusion models. Nat. Comput. Sci. 2024, 4, 899–909.
  • (21) Lin H.; et al. DiffBP: Generative diffusion of 3D molecules for target protein binding. Chem. Sci. 2025, 16, 1417–1431.
  • (22) Satorras V. G.; Hoogeboom E.; Welling M. E(n) equivariant graph neural networks. In International Conference on Machine Learning; PMLR, 2021; pp 9323–9332.
  • (23) Bickerton G. R.; Paolini G. V.; Besnard J.; Muresan S.; Hopkins A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90–98.
  • (24) Lipinski C. A. Lead- and drug-like compounds: the rule-of-five revolution. Drug Discovery Today Technol. 2004, 1, 337–341.
  • (25) Ding F.; Tsao D.; Nie H.; Dokholyan N. V. Ab Initio Folding of Proteins with All-Atom Discrete Molecular Dynamics. Structure 2008, 16, 1010–1018.
  • (26) Dokholyan N. V.; Buldyrev S. V.; Stanley H. E.; Shakhnovich E. I. Discrete molecular dynamics studies of the folding of a protein-like model. Fold. Des. 1998, 3, 577–587.
  • (27) Ding F.; Dokholyan N. V. Incorporating backbone flexibility in MedusaDock improves ligand-binding pose prediction in the CSAR2011 docking benchmark. J. Chem. Inf. Model. 2013, 53, 1871–1879.
  • (28) Wang J.; Dokholyan N. V. MedusaDock 2.0. J. Chem. Inf. Model. 2019, 59, 2509–2515.
  • (29) Wang J.; Dokholyan N. V. Leveraging Transfer Learning for Predicting Protein–Small-Molecule Interaction Predictions. J. Chem. Inf. Model. 2025, 65, 3262–3269.
  • (30) Wang J.; Dokholyan N. V. Yuel: Improving the Generalizability of Structure-Free Compound–Protein Interaction Prediction. J. Chem. Inf. Model. 2022, 62, 463–471.
  • (31) Chirasani V. R.; et al. Whole proteome mapping of compound–protein interactions. Curr. Res. Chem. Biol. 2022, 2, No. 100035.
  • (32) Hendlich M.; Rippmann F.; Barnickel G. BALI: Automatic Assignment of Bond and Atom Types for Protein Ligands in the Brookhaven Protein Databank. J. Chem. Inf. Comput. Sci. 1997, 37, 774–778.
  • (33) Baber J. C.; Hodgkin E. E. Automatic assignment of chemical connectivity to organic molecules in the Cambridge Structural Database. J. Chem. Inf. Comput. Sci. 1992, 32, 401–406.
  • (34) Labute P. On the Perception of Molecules from 3D Atomic Coordinates. J. Chem. Inf. Model. 2005, 45, 215–221.
  • (35) Zhang Q.; Zhang W.; Li Y.; Wang J.; Zhang L.; Hou T. A rule-based algorithm for automatic bond type perception. J. Cheminformatics 2012, 4, 26.
  • (36) RDKit: Open-source cheminformatics. https://www.rdkit.org.
  • (37) Kim Y.; Kim W. Y. Bull. Korean Chem. Soc. 2015, 36, 1769–1777.
  • (38) Cho H.; Choi I. S. Enhanced Deep-Learning Prediction of Molecular Properties via Augmentation of Bond Topology. ChemMedChem 2019, 14, 1604–1609.
  • (39) Sun F.-Y.; Hoffmann J.; Verma V.; Tang J. InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. arXiv preprint, 2020.
  • (40) Wieder O.; et al. A compact review of molecular property prediction with graph neural networks. Drug Discovery Today Technol. 2020, 37, 1–12.
  • (41) Zhou J.; et al. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81.
  • (42) Axelrod S.; Gómez-Bombarelli R. GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Sci. Data 2022, 9, 185.
  • (43) Ramakrishnan R.; Dral P. O.; Rupp M.; von Lilienfeld O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 2014, 1, No. 140022.
  • (44) Haykin S. Neural Networks: A Comprehensive Foundation; Prentice Hall PTR: USA, 1994.
  • (45) Elfwing S.; Uchibe E.; Doya K. Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. arXiv preprint, 2017.
  • (46) Gilmer J.; Schoenholz S. S.; Riley P. F.; Vinyals O.; Dahl G. E. Neural Message Passing for Quantum Chemistry. arXiv preprint, 2017.
  • (47) Paszke A.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv preprint, 2019.
  • (48) Loshchilov I.; Hutter F. Decoupled Weight Decay Regularization. arXiv preprint, 2019.
  • (49) Biewald L. Experiment Tracking with Weights and Biases, 2020.
