Abstract
In recent years, generative deep learning has emerged as a transformative approach in drug design, promising to explore the vast chemical space and generate novel molecules with desired biological properties. This perspective examines the challenges and opportunities of applying generative models to drug discovery, focusing on the intricate tasks related to small molecule generation, evaluation, and prioritization. Central to this process is navigating conflicting information from diverse sources: balancing chemical diversity, synthesizability, and bioactivity. We discuss the current state of generative methods, their optimization, and the critical need for robust evaluation protocols. By mapping this evolving landscape, we outline key building blocks, inherent dilemmas, and future directions in the journey to fully harness generative deep learning in the “chemical odyssey” of drug design.


1. Introduction
Generative deep learning has emerged as a transformative tool in de novo drug design, and in the molecular sciences at large. By leveraging existing molecular data, generative deep learning enables the on-demand generation of molecules that possess desirable properties. With the “chemical universe” estimated to contain up to $10^{60}$ drug-like molecules, generative deep learning has accelerated the discovery of novel compounds compared to traditional rule-based molecular assembly and enumeration approaches. In less than a decade since its introduction in drug discovery, generative deep learning has been extensively applied in prospective wet-lab studies and has demonstrated its potential in real-world applications. Meanwhile, the development of approaches to better explore the chemical universe is gaining momentum in the cheminformatics community. Currently, we are observing a productive synergy between computer science advances and insights rooted in chemical and biological knowledge.
While generative deep learning is making great strides in designing promising molecules, the understanding of which model to choose and how to prioritize molecular candidates remains limited. With an ever-growing pool of methods available for molecular generation and optimization, choosing the most suitable approach might not always be straightforward. Despite the availability of benchmarks, retrospective comparisons are bound to be incomplete for new, prospective applications. Researchers might face many “traps”, such as overfitting to specific datasets or overlooking key molecular properties. Moreover, the generated designs can exhibit limited synthesizability, and current synthetic accessibility scores might struggle to capture the effect of subtle structural variations, reaction selectivity, and the availability of building blocks essential to chemical reactions. Finally, optimizing conflicting properties (such as balancing pharmacophore similarity with structural diversity) presents another layer of complexity. As a result, de novo molecule design through generative methods continues to be an odyssey (from Greek “Ὀδύσσεια”, meaning “the journey of Odysseus” or “a long and adventurous journey”) in chemical space.
This perspective highlights open questions, “known unknowns”, and key challenges in generative small molecule design, with a particular focus on ligand-based approaches. After reviewing current generative approaches for de novo design, we offer insights and propose potential future directions. By highlighting both the caveats and advantages of these methods, we venture to forecast the exciting possibilities that lie ahead.
2. Charting the Chemical Cosmos with Generative AI
2.1. Molecular Representations for Deep Learning
How to represent molecular information has been a central scientific question for over a century. Molecular representations are “a symbolic transformation of [chemical] reality”, initially devised for the purpose of communication and human reasoning. Examples are the ball-and-stick model, Kekulé’s representation, and the Lewis structure of molecules. Such models of chemical reality were progressively developed to address the complexity of molecular structures at different levels of abstraction. From the mid-20th century, with the advent of computers and the need for machine-readable formats, string-based molecular representations (notations) started to be developed, such as the Wiswesser Line Notation and SMILES (Simplified Molecular Input Line Entry System) strings. These representations were initially proposed to complement systematic chemical nomenclature, and enabled the encoding of chemical structures in a compact format suitable for information storage, processing, and retrieval.
With the introduction of machine learning to cheminformatics, the role of molecular representations has also evolved. Molecular representations have become a fundamental step to encode chemical information for machine learning and data analysis. Their role has become even more prominent since the advent of deep learning. Certain deep learning algorithms can handle non-structured (i.e., non-tabular) input, allowing the direct usage of some of these molecular representations without the need for feature engineering. This shift enables generative deep learning to use some molecular representations as inputs to generate novel chemical structures in an end-to-end fashion. In other words, models can be trained to directly learn from and generate molecules in the form of a chosen molecular representation. To date, the following molecular representations have played a key role in the field of generative deep learning for de novo design:
Molecular strings. Molecular strings represent a molecule as a sequence of characters. The most popular string notation to date is the SMILES notation. SMILES strings are obtained by traversing a two-dimensional molecular representation (molecular graph) and annotating the atomic symbols and bond types along the traversal path. Rings and branches are encoded using numeric labels and parentheses, respectively, to capture molecular connectivity while maintaining a linear format (Figure 1b). SMILES can be seen as a chemical language, possessing its own syntax rules to map this string representation back to a chemically valid molecule. Since the early successes of SMILES strings as a representation for de novo design, several extensions to this notation have been developed. DeepSMILES (Table 1) were proposed to address aspects that often caused invalidity of the generated SMILES (brackets and ring characters), but have found limited application due to their difficult syntactic rules. Self-referencing embedded strings (SELFIES) are built upon “semantically constrained graphs”, in such a way that a SELFIES string always corresponds to a valid molecule (Table 1). While SMILES and SELFIES achieve a similar performance overall, they come with distinct advantages and drawbacks. The enforced validity of SELFIES might be ideal to generate complex macromolecular entities, whose structural complexity increases the likelihood of generating invalid SMILES (e.g., natural products). SMILES strings might be better suited for distribution learning, as the generation of invalid SMILES makes it possible to filter out low-quality molecular designs, which would not be possible when validity is enforced, as in the SELFIES language. Recently, great attention has been given to string representations that are based on molecular fragments, such as Sequential Attachment-based Fragment Embedding (SAFE), GroupSELFIES, and fragSMILES. These representations have been developed to provide “chemically-rich” molecular representations, to better learn the properties of the training molecules, and/or to provide a better representation of molecular chirality (fragSMILES, Table 1).
Two- and three-dimensional molecular graphs. Graphs are among the most intuitive ways to represent molecular structures (Figure 1c). Any molecule can be represented as a mathematical graph $G = (V, E)$, whose vertices ($v_i \in V$) represent atoms, and whose edges ($e_{ij} \in E$) constitute their connections (bonds). Additionally, features of atoms and/or edges can be added to further characterize the graph. Two-dimensional (2D) molecular graphs usually consider only topological and chemical features (e.g., bond and atom types). Three-dimensional (3D) information, such as torsional angles or three-dimensional coordinates, can also be added to the features of the molecular graph. Generating molecules in the form of graphs has attracted a lot of attention, for both ligand- and structure-based drug discovery. A particularly recent direction in the field is the generation of 3D molecular graphs, with the aim of better capturing properties that depend on the spatial arrangement of atoms (e.g., protein binding). Moreover, 3D graphs can also be encoded as so-called point clouds, i.e., by omitting bond information and encoding only the spatial positions of atoms (absolute or relative coordinates). Once the geometry is generated in the form of a point cloud, bond types can be assigned by dedicated chemistry software (e.g., Open Babel).
Molecular surfaces. In addition to the 3D graphs and point clouds mentioned above, a limited number of studies have represented molecules via their molecular surface, which encloses the 3D structure of a molecule at a certain distance from each atom (Figure 1d). Surfaces are then usually represented as (a) 3D meshes, that is, a set of polygons describing the coordinates in 3D space, (b) 3D point clouds, that is, a collection of discrete points in 3D space that define the surface geometry without explicit connectivity between them, or (c) 3D voxels, that is, a three-dimensional grid of cubic cells where each voxel represents a volume element in space, allowing for the discretization of the surface and surrounding molecular structure into a regular grid format. Surfaces can then be further characterized by additional chemical (for example, hydrophobic, electrostatic) and geometric (such as local shape, curvature) features. While surface representations have mostly gained popularity for capturing information on protein surfaces or cavities, and on protein–protein interactions, they have also been used to generate molecules matching desired ligand shapes, or conditioned on surface features.
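The SMILES syntax rules mentioned above for molecular strings (parentheses for branches, numeric labels for rings) can be illustrated with a minimal sketch. The function name `check_smiles_syntax` is hypothetical, and the check is deliberately incomplete: it covers only these two rules plus bracket atoms, handles single-digit ring labels only, and says nothing about chemical validity (valence, aromaticity).

```python
def check_smiles_syntax(smiles: str) -> bool:
    """Check two SMILES syntax rules: balanced branches and paired ring labels.

    Assumption: single-digit ring labels only (no '%nn' labels).
    This is not a chemical validity check.
    """
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring labels opened but not yet closed
    in_bracket = False   # True while inside a bracket atom such as [nH]

    for ch in smiles:
        if ch == "[":
            if in_bracket:
                return False
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return False
            in_bracket = False
        elif in_bracket:
            # Digits inside brackets are isotopes, charges, or H counts.
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            # First occurrence opens a ring, second occurrence closes it.
            open_rings.symmetric_difference_update({ch})
    return depth == 0 and not open_rings and not in_bracket


# Caffeine passes; a string with an unclosed ring label does not.
caffeine_ok = check_smiles_syntax("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
unclosed_ring_ok = check_smiles_syntax("C1CCCCC")
```

A generated string that fails such a check cannot be mapped back to a molecule, which is the kind of invalid output that can be filtered out when working with SMILES.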
Figure 1. Representing molecular information for deep learning. (a) Selected molecular example (caffeine). (b–d) Commonly used “raw” molecular representations. (b) Simplified Molecular Input Line Entry System (SMILES) strings, which capture two-dimensional molecular information as a set of textual characters (“tokens”). (c) Molecular graph, whose vertices represent atoms, and edges constitute their connections (bonds). Additionally, features of atoms and/or edges can be added to further characterize the graph, such as chemical and bond properties, as well as three-dimensional information. (d) Molecular surface, which encloses the 3D structure of a molecule at a certain distance from each atom. (e,f) Encoding of a molecular representation into a “machine readable” format. (e) Encoding of molecular strings (SMILES strings, in the selected example). One-hot encoding represents each token with a unique binary vector. Learnable embeddings start from a random vector per token and are updated during training to improve the model performance. (f) Encoding of molecular graphs. The adjacency matrix captures information about atoms bound to each other (i.e., connected by one edge). Node features can be specified for each atom (e.g., atom type, in this example). Features for edges can also be specified (in this example: p1 = bond order, p2 = ring membership, p3 = aromaticity).
Table 1. Examples of Molecular String Notations for the Molecule Caffeine (Figure 1a). SMILES and canonical SMILES differ in the atom ordering used to write the SMILES string (deterministic when canonicalization is applied).
| Notation | String |
|---|---|
| InChI | 1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3 |
| Canonical SMILES | Cn1cnc2c1c(=O)n(c(=O)n2C)C |
| SMILES | CN1C=NC2=C1C(N(C(N2C)=O)C)=O |
| DeepSMILES | CN(C(O)N(C(O)CNC)C)C |
| SELFIES | [C][N][C][C][C][Branch1_1][C][O][C][Branch1_1][C][O][C][Branch1_1][C][Ring1][Branch1_1][C][Ring1][Branch1_1][C] |
| fragSMILES | C.<2>O=c1[nH]c(=O)c2[nH]cnc2[nH]1<6>.<10>(C.).C. |
Once a molecular representation has been chosen, a fundamental step is molecule encoding. Encoding is the transformation of molecular structures into numerical representations that can be processed by machine learning models. Encoding ensures that the chosen molecular representation is converted into a format suitable for deep learning without loss of information. For molecular strings (Figure 1e), common methods include one-hot encoding, which converts each token into a unique binary vector, and learnable embeddings, which represent tokens as numerical vectors that are updated during training. For molecular graphs (Figure 1f), encoding involves constructing an adjacency matrix that defines atomic connectivity and a node feature matrix that describes atomic properties. Optionally, edge features can also be included to specify bond characteristics (e.g., bond type or order). Recent studies on bioactivity prediction show that the encoding strategy chosen for molecular strings has little impact on model performance, whereas the choice of edge features is crucial and task-specific.
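The two encodings described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the regular expression is a simplified tokenizer (bracket atoms, the two-letter halogens Cl and Br, then single characters), the function names are hypothetical, and the example graph is the hand-written heavy-atom graph of ethanol rather than one parsed from a SMILES string.

```python
import re

# Simplified SMILES tokenizer (assumption): bracket atoms, two-letter
# halogens, then any single character. Real tokenizers cover more cases.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN_PATTERN.findall(smiles)


def one_hot_encode(tokens, vocabulary):
    """Return one binary vector per token, with a single 1 at the token's index."""
    index = {token: i for i, token in enumerate(vocabulary)}
    rows = []
    for token in tokens:
        row = [0] * len(vocabulary)
        row[index[token]] = 1
        rows.append(row)
    return rows


def graph_matrices(atoms, bonds, atom_types):
    """Build a symmetric adjacency matrix and a one-hot node feature matrix."""
    n = len(atoms)
    adjacency = [[0] * n for _ in range(n)]
    for i, j in bonds:
        adjacency[i][j] = 1
        adjacency[j][i] = 1
    node_features = one_hot_encode(atoms, atom_types)
    return adjacency, node_features


# String encoding of ethanol ("CCO") with a vocabulary built from its tokens.
tokens = tokenize("CCO")
vocabulary = sorted(set(tokens))           # ['C', 'O']
one_hot = one_hot_encode(tokens, vocabulary)

# Graph encoding of ethanol: atoms C0-C1-O2, atom types C, N, O.
adjacency, node_features = graph_matrices(
    atoms=["C", "C", "O"], bonds=[(0, 1), (1, 2)], atom_types=["C", "N", "O"]
)
```

A learnable embedding would replace each one-hot row with a trainable vector; edge features (e.g., bond order) would be stored in an additional array indexed by atom pairs.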
Every molecular representation (and corresponding encoding) requires ad hoc neural network architectures to learn and aggregate molecular information, and shows distinct advantages and disadvantages. To date, models based on molecular string representations (often referred to as chemical language models) are the ones that have found extensive experimental validation to generate bioactive molecules (see Section ; Table 4). Early studies have demonstrated that relying on string representations for de novo design better captures complex molecular properties compared to molecular graphs. Nonetheless, there is ample room for improvement when it comes to representing and generating molecules de novo, since molecular strings capture only 2D information (molecular topology and composition). In this context, methods that learn sophisticated three-dimensional information (e.g., geometry and chemistry of binding pockets or known molecular conformers) to generate molecules matching relevant binding motifs are a particularly enticing direction. However, to date, it might be difficult for deep learning models to generate molecules with valid 3D conformations, for instance, in terms of torsional angles and bond lengths, although current advances show great promise to overcome such issues.
Table 4. Experimentally Validated Studies Involving Generative Deep Learning for Hit Design (Updated May 2025).
| Superfamily | Target | Model type | Model(s) | Potency range | Reference |
|---|---|---|---|---|---|
| Kinases | Janus Kinase 3 (JAK3) | CLM (SMILES) | Entangled Conditional Adversarial Autoencoder | IC50 = 6.73 μM | Polykovskiy et al. 2018 |
| Kinases | Discoidin Domain Receptor 1 (DDR1) | CLM (SMILES) | VAE | IC50: 0.010–0.278 μM | Zhavoronkov et al. 2019 |
| Kinases | Discoidin Domain Receptor 1 (DDR1) | CLM (SMILES) | LSTM | IC50: 0.092–2.239 μM | Yoshimori et al. 2020 |
| Kinases | FMS-like tyrosine kinase 3 (FLT-3) | CLM (SMILES) | LSTM | IC50 = 1.98 μM | Jang et al. 2022 |
| Kinases | Epidermal growth factor receptor (EGFR) | CLM (SMILES) | RNN | pIC50: 5.9–7.4 | Korshunova et al. 2022 |
| Kinases | Receptor-interacting protein kinase 1 (RIPK1) | CLM (SMILES) | Conditional LSTM | IC50: 0.035–0.463 μM | Li et al. 2022 |
| Kinases | Phosphoinositide 3-kinase gamma (PI3Kγ) | CLM (SMILES) | LSTM | Kd: 0.013–0.29 μM | Moret et al. 2023 |
| Kinases | Cyclin-dependent Kinases (CDK1/2) | CLM (SMILES) | Fragment-Based VAE | IC50: 0.0015–8 μM | Yu et al. 2023 |
| Kinases | Cyclin-dependent kinase 8 (CDK8) | Mix | Ensemble of generative models<sup>c</sup> | IC50 = 0.0004 μM | Li et al. 2023 |
| Kinases | Salt-inducible kinase 2 (SIK2) | Mix | Ensemble of generative models<sup>c</sup> | IC50: 0.14–9.8 μM | Zu et al. 2023 |
| Kinases | Cyclin-dependent Kinase 2 (CDK2) | Geometry-based (3D atom positioning) | Diffusion | IC50 < 0.001 μM | Huang et al. 2024 |
| Kinases | Cyclin-dependent kinase 20 (CDK20) | Mix | Ensemble of generative models<sup>c</sup> | Kd: 0.034–9.2 μM | Ren et al. 2024 |
| Kinases | Polo-like kinase 1 (PLK1) | CLM (SMILES) | Transformer conditioned on pharmacophores | IC50: 0.0051–0.037 μM | Xie et al. 2025 |
| Kinases | TRAF2- and NCK-interacting kinase (TNIK) | Mix | Ensemble of generative models<sup>c</sup> | IC50: 0.0048–0.0078 μM | Ren et al. 2025 |
| Nuclear receptors | Peroxisome Proliferator Activated Receptors (PPAR); Retinoid X Receptors (RXR) | CLM (SMILES) | LSTM | EC50: 0.06–10.1 μM | Merk et al. 2018 |
| Nuclear receptors | Retinoid X Receptors (RXR) | CLM (SMILES) | LSTM | EC50: 16.9–15.7 μM | Merk et al. 2018 |
| Nuclear receptors | Liver X receptor α (LXRα) | CLM (SMILES) | LSTM | EC50: 0.183–1.31 μM | Grisoni et al. 2021 |
| Nuclear receptors | Retinoic acid receptor-related orphan receptor γ (RORγ) | CLM (SMILES) | LSTM | EC50: 0.68–4.6 μM | Moret et al. 2021 |
| Nuclear receptors | Orphan nuclear receptor related 1 (Nurr1) | CLM (SMILES) | LSTM | EC50: 0.04–2.1 μM | Ballarotto et al. 2023 |
| Nuclear receptors | Peroxisome Proliferator-Activated Receptors (PPAR) | CLM (SMILES) with graph conditioning | GTNN + LSTM<sup>c</sup> | EC50: 0.24–2.3 μM | Atz et al. 2024 |
| Nuclear receptors and enzymes | Peroxisome Proliferator-Activated Receptor δ (PPARδ), and soluble epoxide hydrolase (sEH) | CLM (SMILES) | LSTM | EC50: 0.009–0.022 μM; IC50: 0.005–0.097 μM | Isigkeit et al. 2024 |
| Other | κ-opioid receptor (KOR) | CLM (SMILES) | VAE | Ki: 6.46–7.59 μM | Salas-Estrada et al. 2023 |
| Other | Tuberculosis ClpP | CLM (SMILES) | GPT-like<sup>c</sup> | IC50: 1.88–35.2 μM | Wu et al. 2024 |
| Other | Histone acetyltransferase 1 (HAT1) and YTH domain-containing protein 1 (YTHDC1) | Geometry-based (3D atom and bond positioning) | Autoregressive flow<sup>c</sup> | IC50 (HAT1) = 72.36 μM; IC50 (YTHDC1) = 32.6 μM | Jiang et al. 2024 |
| Other | Bacteria (A. baumannii) | Fragment-based | MCTS | MIC ≤ 8 μg mL⁻¹ | Swanson et al. 2024 |
| Other | Prolyl hydroxylase domain (PHD) enzymes | Mix | Ensemble of generative models<sup>c</sup> | IC50 = 0.004 μM | Xu et al. 2024 |
| Other | Kirsten rat sarcoma virus (KRAS) | CLM (SMILES) | LSTM and QCBM | EC50: 0.9–24.6 μM | Ghazi Vakili et al. 2025 |
| Other | Monoglyceride lipase (MGLL) | Graphs | GTNN | pIC50: 3.66–5.99 | Hassen et al. 2025 |
GPT = generative pre-trained transformer; GTNN = graph transformer neural network; LSTM = long short-term memory; MCTS = Monte Carlo tree search; QCBM = quantum circuit Born machine; RNN = recurrent neural network; VAE = variational autoencoder.
EC50 = half maximal effective concentration; IC50 = half maximal inhibitory concentration; Kd = dissociation constant; Ki = inhibition constant; MIC = minimum inhibitory concentration; pIC50 = negative logarithm of IC50.
Only studies reporting fully de novo designed molecules, without subsequent structural modifications, are included. All models are ligand-based, unless specified.
2.2. Deep Learning Architectures for Drug Discovery
A plethora of generative deep learning architectures have been proposed for designing small molecules with desired properties (e.g., bioactivity toward a pharmacologically relevant target). In its simplest form, generative drug discovery can be cast as a ligand-based task, where models learn from the molecular structures of bioactive molecules and generate novel compounds with desirable characteristics, without requiring explicit knowledge of the target protein’s structure or sequence. Structure-based approaches also exist, which leverage structural (or sequence) information about the target protein to guide molecular design. For simplicity, in this perspective, we focus mostly on ligand-based approaches; strategies for structure-based de novo design are discussed elsewhere. Regardless of whether ligand- or structure-based strategies are employed, the effectiveness of generative models hinges on how molecular information is encoded and processed by deep learning approaches for follow-up molecular generation. In general, different molecular representations require tailored neural architectures to suitably process and leverage the molecular information they encode.
2.2.1. Molecular Strings
One of the earliest and most widely adopted approaches in ligand-based drug design is chemical language modeling (CLM). CLMs take inspiration from natural language processing and treat molecule design as a sequence generation problem. CLMs are trained to predict the next token in a molecular string given the preceding ones. This self-supervised learning approach enables models to generate valid molecular sequences while capturing underlying chemical patterns. Moreover, CLMs have often been applied in combination with “data augmentation”, where multiple SMILES (or SELFIES) strings are used to represent the same molecule. This is usually achieved by enumeration, where multiple molecular strings (obtained by traversing the molecular graph in different directions or starting from different atoms) are used. Molecular string enumeration can improve generative drug design, especially in low-data regimes. Recently, strategies inspired by natural language processing (such as atom masking or token deletion) have proven to be valuable alternatives to enumeration for de novo drug design.
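To make the next-token objective concrete, the sketch below replaces the neural network with a count-based bigram model, which is a deliberate simplification: it conditions on the previous token only, whereas CLMs condition on all preceding tokens. Character-level tokens, the start and end symbols, and the function names are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical start-of-string and end-of-string tokens


def fit_bigram_model(smiles_list):
    """Estimate P(next token | previous token) from counts of adjacent tokens."""
    counts = defaultdict(Counter)
    for smiles in smiles_list:
        tokens = [START] + list(smiles) + [END]  # character-level tokens
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return {
        previous: {tok: c / sum(counter.values()) for tok, c in counter.items()}
        for previous, counter in counts.items()
    }


def sample_string(model, rng, max_len=20):
    """Generate a string token by token until END is drawn or max_len is reached."""
    tokens, previous = [], START
    while len(tokens) < max_len:
        choices = model[previous]
        following = rng.choices(list(choices), weights=list(choices.values()))[0]
        if following == END:
            break
        tokens.append(following)
        previous = following
    return "".join(tokens)


model = fit_bigram_model(["CCO", "CCN"])
designs = [sample_string(model, random.Random(seed)) for seed in range(5)]
```

In this toy training set, `model["C"]` assigns probability 0.5 to another carbon and 0.25 each to oxygen and nitrogen. A neural CLM learns the same kind of conditional distribution, but over the full preceding context and with far fewer independence assumptions.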
Early CLMs employed recurrent neural networks (RNNs). RNNs iterate over the input symbols in the sequence stepwise and compress the past information into a single memory vector called the hidden state (Figure 2a). Different RNN variants exist based on the memory update rules, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs). LSTMs and GRUs were successfully used to generate syntactically valid molecules with bespoke properties.
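For reference, the hidden-state update can be written for the simplest (vanilla, or Elman) RNN as follows. The symbols are assumptions introduced here: $x_t$ is the encoded token at step $t$, $h_t$ the hidden state, and $W_{xh}$, $W_{hh}$, $W_{hy}$, $b_h$, $b_y$ the learned parameters. LSTMs and GRUs replace the first equation with gated update rules.

```latex
\begin{aligned}
h_t &= \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right), \\
p\left(x_{t+1} \mid x_1, \dots, x_t\right) &= \operatorname{softmax}\left(W_{hy}\, h_t + b_y\right).
\end{aligned}
```

The second equation is the next-token distribution from which a new token is sampled during generation.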
Figure 2. Overview of popular deep learning architectures for de novo design. (a) Recurrent Neural Networks, which learn to predict the next token in a SMILES string, using information on all the previous tokens. The network hidden state is updated in a recurrent way, to perform a prediction at any step while keeping track of the preceding portions of the string. (b) Transformers, which learn all pair relationships between sequence tokens to perform a prediction. (c) Variational Autoencoder, where an encoder is trained to transform an input molecule (e.g., a graph or a string) into a fixed-dimension latent vector, and a decoder is trained to convert such vectors back into molecular representations.
Transformers, an architecture that learns all pair relationships between sequence elements (Figure 2b), were introduced to molecule design due to their transformative impact in natural language processing. Transformers have found widespread application in computer-assisted synthesis planning, whereas in de novo design they have found more limited application. Recently, a comparison between transformers and RNNs has shown that the former better capture complex properties of molecules, while the latter generate more structurally diverse design libraries. More recently, structured state space sequence models (S4s) emerged as an alternative to “make the best of both worlds” by leveraging full-sequence learning (like transformers) and autoregressive generation (like RNNs). Early results suggest that S4 might enhance the diversity of design libraries compared to transformers, while better capturing complex molecular properties than both architectures.
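The “all pair relationships” learned by transformers are computed with scaled dot-product attention. The sketch below is a minimal, dependency-free illustration under simplifying assumptions: the token vectors are used directly as queries, keys, and values (real transformers apply learned projections and multiple heads), and the optional causal mask restricts each position to earlier tokens, as needed for autoregressive generation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values, causal=False):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V, row by row."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for i, query in enumerate(queries):
        # With a causal mask, position i only attends to positions 0..i.
        last = i + 1 if causal else len(keys)
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
            for key in keys[:last]
        ]
        row_weights = softmax(scores)
        output = [
            sum(w * value[c] for w, value in zip(row_weights, values[:last]))
            for c in range(len(values[0]))
        ]
        weights.append(row_weights)
        outputs.append(output)
    return outputs, weights


# Three toy token vectors (hypothetical embeddings of dimension 2).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x, causal=True)
```

Each row of `weights` sums to one and states how much every visible token contributes to the updated representation of the current token.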
Variational autoencoders (VAEs) introduce a new component to de novo design. Rather than predicting the next symbol based on the memory, they decompose the task into two subtasks: (a) learning how to encode molecules into a structured, continuous representation, and (b) reconstructing the input molecule from such a learned representation. In a VAE (Figure 2c), an encoder network transforms an input molecule into a probability distribution over a latent space (typically modeled as a Gaussian distribution). From this latent representation, another network (the decoder) learns to reconstruct molecules by sampling points from the learned distribution and mapping them back to valid input representations. The encoder and decoder can be any deep learning model, with convolutional neural networks (CNNs) and RNNs being typical choices for molecular sequences. Compared to architectures that rely solely on hidden states (e.g., LSTMs and GRUs), VAEs facilitate smoother latent space navigation, thanks to their explicit learning of latent molecule representations. These representations can be manipulated to enable tasks such as molecule optimization and property-guided design.
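The two subtasks correspond to the two terms of the standard VAE training objective (the evidence lower bound), shown here with a Gaussian prior. The notation is an assumption introduced for illustration: $x$ is the input molecule, $z$ its latent vector, $q_\phi$ the encoder, and $p_\theta$ the decoder.

```latex
\begin{aligned}
\mathcal{L}(\theta, \phi; x) &=
  \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]
  - D_{\mathrm{KL}}\left(q_\phi(z \mid x) \,\|\, p(z)\right),
  \qquad p(z) = \mathcal{N}(0, I), \\
z &= \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I).
\end{aligned}
```

The first term rewards accurate reconstruction, the Kullback-Leibler term keeps the latent space close to the prior (which is what makes it smooth to navigate), and the second line is the reparameterization commonly used to sample $z$ during training.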
Encoder-decoder architectures can also be incorporated in generative adversarial networks (GANs) for molecule design. Originally developed for image generation, GANs consist of (a) a generator network that learns to convert random noise into realistic instances, and (b) a discriminator network that scores the quality of the generations. These two neural networks are trained to compete with each other, with the generator aiming to produce increasingly realistic samples to “fool” the discriminator. Early applications of GANs showed promise for molecule generation, by improving drug-like properties. However, due to the challenges of training GANs (e.g., mode collapse, where the generator learns only a few modes of the input distribution), other score-based generators, namely diffusion models, have been replacing GANs for molecular sequence design. Initially successful in generating realistic images, diffusion models learn to iteratively refine random noise into samples that resemble the input distribution. Diffusion can be applied to sequences by selecting discrete probability distributions to model the inputs, or by applying diffusion to latent molecule representations, a trend gaining popularity for sequence modeling. Recently developed molecular diffusion sequence models offer more controllability and more drug-like designs than initial chemical language models, and form an active research direction.
2.2.2. Molecular Graphs
Like string representations, molecular graphs have also gained widespread interest in de novo design using deep learning. Graph neural networks (GNNs) are the predominant deep learning architecture to learn from graphs. Starting from random vectors, GNNs iteratively update atom and bond representations to learn meaningful molecular representations. GNNs were first applied in combination with VAEs and GANs. While using GNNs as encoders is relatively straightforward, decoding graphs poses a greater challenge, as it requires reconstructing molecular structures while preserving chemical validity and similarity to the input. Calculating the reconstruction quality becomes increasingly compute-intensive as molecular size grows. For large molecules, using coarse-grained graphs (e.g., fragments as nodes) to shrink the graph size, and formulating graph generation as stepwise node extension, can mitigate the computational burden. More recently, flow-based and diffusion models have been proposed for molecular graph generation. Flow-based models learn invertible transformations of simple probability distributions, such as Gaussians, to model the complex distribution of molecular datasets. Unlike VAEs and diffusion models, flow-based models can provide exact likelihood estimates of the generations, and thereby an internal score for the designs. Diffusion models on molecular graphs learn to generate chemically valid topologies from random noise. Recent research on diffusion and flow-based graph models focuses on improving chemical validity, scalability, and design diversity to match the performance of string generation models.
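The iterative update performed by GNNs can be illustrated with a single round of message passing. This is a bare-bones sketch under stated assumptions: node features are one-hot atom types rather than random or learned vectors, aggregation is a plain sum over neighbors plus the node itself, and no learned weights, nonlinearities, or bond features are included. The function name is hypothetical.

```python
def message_passing_step(node_features, adjacency):
    """One round of sum aggregation: each atom adds its neighbors' features to its own."""
    n = len(node_features)
    dim = len(node_features[0])
    updated = []
    for i in range(n):
        new_features = list(node_features[i])  # keep the atom's own features
        for j in range(n):
            if adjacency[i][j]:
                for c in range(dim):
                    new_features[c] += node_features[j][c]
        updated.append(new_features)
    return updated


# Ethanol heavy-atom graph C0-C1-O2 with one-hot atom types over (C, N, O).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
atom_features = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]

after_one_round = message_passing_step(atom_features, adjacency)
# Molecule-level readout: sum the atom vectors into one graph vector.
graph_vector = [sum(column) for column in zip(*after_one_round)]
```

After one round, each atom vector counts the atom types in its immediate neighborhood; repeating the step widens the neighborhood each atom “sees”, which is how deeper GNNs capture larger substructures.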
2.2.3. 3D Geometry
3D approaches represent molecules as atom types and their spatial coordinates (and, optionally, bond types). Training models on atom-level 3D information allows learning from proteins and small molecules together, and conditioning generations on the 3D topology of protein binding pockets. While this advantage is mostly harnessed for structure-based studies, 3D approaches have also been developed for unconditional small molecule design with diffusion. Equivariant graph neural networks (EGNNs) are a popular architecture to learn from point clouds. Equivariance guarantees the same, or a predictably different, output when the input molecule is rotated, translated, or reflected, and provides performance gains over non-equivariant GNNs. The initial EGNNs struggled to generate valid 3D structures that satisfy bond length and atom stability constraints; recent work has integrated laws of physics and learned jointly from molecular graphs and 3D coordinates to achieve the validity level of sequence models. Message passing transformers that respect the geometry of molecules have recently become popular, as they provide higher validity in generations. A recent study unified 3D generation and property prediction within a two-stage generation framework and improved performance in both tasks. A bottleneck for 3D molecule generation models is the availability of datasets and benchmarks. The QM9 dataset consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k molecules with up to 9 heavy atoms (C, N, O, F), forming a narrow chemical space for drug discovery studies. GEOM-Drugs is an alternative used in 3D generation studies, and contains molecules with up to 91 heavy atoms and their 3D conformers. A recent study improved GEOM-Drugs benchmarking by reimplementing a flawed stability metric and re-evaluating the models trained on this dataset.
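The difference between an output that stays “the same” (invariance) and one that changes “predictably” (equivariance) under rotation and translation can be checked numerically. The sketch below uses hypothetical toy coordinates and two hand-picked functions rather than a neural network: interatomic distances are invariant, while the centroid is equivariant.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points around the z axis by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def translate(points, shift):
    """Shift all points by the same vector."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in points]


def pairwise_distances(points):
    """Invariant feature: distances do not change under rotation or translation."""
    return [[math.dist(p, q) for q in points] for p in points]


def centroid(points):
    """Equivariant feature: the centroid moves together with the molecule."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


# Hypothetical coordinates of a three-atom fragment (arbitrary units).
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.2, 0.3)]
angle, shift = 0.7, (1.0, -2.0, 0.5)
moved = translate(rotate_z(points, angle), shift)

distances_before = pairwise_distances(points)
distances_after = pairwise_distances(moved)
# Applying the same transformation to the original centroid gives the new one.
expected_centroid = translate(rotate_z([centroid(points)], angle), shift)[0]
```

An equivariant network is constructed so that its internal features behave like one of these two cases by design, instead of having to learn this behavior from rotated copies of the training data.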
2.2.4. Hybrid Approaches
Each molecular representation has distinctive strengths for molecule generation, motivating approaches that combine the advantages of several. One such example is a VAE with a GNN encoder, to explicitly represent chemical structures, and a CLM decoder, for fast generation, which results in a higher diversity of the molecular designs. Similarly, multimodal models, which integrate multiple data types, can also be used for de novo design. Such approaches expand the information available to the models and can even incorporate additional data sources, such as gene expression, the protein interactome, and textual information, allowing zero-shot molecule design combined with human-understandable explanations in natural language. Each additional modality, however, increases the architectural complexity, which can pose challenges related to computational efficiency, interpretability, and long-term scalability.
2.3. Focused Molecule Design with Deep Learning
Distribution learning (Figure 3a), where a generative model is trained to learn and replicate the property distribution of a molecular dataset (e.g., physicochemical and biological properties). Distribution learning is usually achieved via transfer learning, where a model is “pretrained” on a large, diverse dataset, and then “fine-tuned” on a smaller, task-specific dataset, with the same training strategy. In de novo design, a particularly useful training strategy is next-token generation in molecular strings, but other strategies involve reconstructing molecular structures from corrupted inputs, or predicting missing bonds/atoms. An extensive body of literature using SMILES strings has shown the benefit of pretraining to learn the “syntax” of the SMILES language and global physico-chemical properties, and of fine-tuning to capture bioactivity, as validated in the wet lab. Moreover, by controlling the extent of transfer learning, it is possible to indirectly navigate between the properties of the pretraining and fine-tuning molecules.
Goal-directed learning (Figure 3b), where a model is trained to optimize an external objective. This is typically implemented through reinforcement learning, in which a generative model iteratively designs molecules and receives a score (also called reward) based on an external evaluation function (e.g., docking scores, or predicted properties and bioactivity). Usually, these generative models are pretrained on large corpora of data. During reinforcement learning, the model updates its generation strategy based on this reward signal, “reinforcing” molecular structures that receive higher scores and discouraging those with lower scores. This iterative process helps the model refine its output toward molecules with the desired properties. Compared to distribution learning, goal-directed learning allows for a more flexible steering of molecule design toward desired regions of the chemical space, e.g., for multi-objective molecular design, scaffold hopping, and hit-to-lead optimization.
Conditional generation (Figure 3c), where a single generative model is trained to explicitly produce molecules matching one or more desired properties. Unlike traditional goal-directed generation (which scores the de novo designs of a generation), the desirable conditions are explicitly used to train the generator in a supervised manner. This is achieved by learning a shared vector space that jointly encodes the desired conditions (e.g., experimentally determined properties) and the corresponding molecular structures (Figure 3c). The desired set of properties can then be used as a conditioning input to generate molecular structures that are likely to possess those properties. Conditional generation has been used to design molecules that match a desired three-dimensional shape, possess a desirable set of physico-chemical properties or substructures, or are constrained by a target protein sequence or gene-expression signatures.
Figure 3. Approaches for model training and molecular generation. (a) Distribution learning, where models generate molecules that statistically resemble those in the training set in terms of physico-chemical and biological properties. (b) Goal-directed generation, which optimizes molecules toward a predefined objective, often using reinforcement learning, guided by a scoring function. (c) Conditional generation, where models are explicitly trained to design molecules with specified properties, by incorporating property constraints into the generation process.
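To make the next-token strategy used in distribution learning more concrete, the following minimal sketch splits a SMILES string into tokens and derives the (context, target) pairs that a chemical language model would be trained on. The regular expression is a commonly used tokenization pattern rather than a complete SMILES grammar, and the `<bos>`/`<eos>` markers and function names are assumptions made for illustration.

```python
import re

# Pattern covering bracket atoms, two-letter halogens, two-digit ring labels,
# organic-subset atoms, bonds/branches, and single-digit ring closures.
# It is an illustrative approximation, not a full SMILES parser.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFIbcnops]|[=#\-\+\\/\(\)\.:~@\?\*\$]|\d)"
)

BOS, EOS = "<bos>", "<eos>"  # hypothetical start/end-of-sequence markers


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens, failing if any character is dropped."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


def next_token_pairs(smiles: str) -> list[tuple[list[str], str]]:
    """Return (context, target) pairs: each token is predicted from its prefix."""
    sequence = [BOS] + tokenize(smiles) + [EOS]
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


if __name__ == "__main__":
    for context, target in next_token_pairs("CC(=O)Cl"):
        print(context, "->", target)
```

Under this framing, pretraining and fine-tuning share the same objective and differ only in the dataset from which the pairs are drawn.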
Each molecular generation approach presents unique benefits and limitations, and the choice depends on a multitude of factors, such as the availability and quality of training data, the complexity of the target molecular properties, and specific design needs (e.g., diversity vs property optimization).
Distribution-learning methods enable end-to-end learning and molecule generation without the need for explicit scoring functions to guide the design process. Even with relatively small pretraining datasets (e.g., in the order of tens of thousands of molecules) and/or fine-tuning sets (e.g., in the order of just a few dozen molecules), these approaches can successfully generate molecules with desirable molecular properties, including bioactivity. Moreover, the fine-tuning process is particularly beneficial in low-data scenarios, where the molecules having the desired properties are too few to train a deep learning model from scratch, and the target property is resource-intensive and/or challenging to evaluate in silico. However, while these models are evaluated based on how well the generated molecules resemble the training set in terms of properties, they do not inherently assess the quality of individual molecules. As a result, additional scoring and filtering steps are often required to identify the most promising candidates, introducing manual intervention into otherwise “rule-free” pipelines. Finally, distribution learning might fail at property optimization if the desired characteristics are underrepresented in the training data (out-of-distribution generation).
Conversely, goal-directed strategies offer direct feedback on both individual molecules and the entire generated population through an external scoring function. Compared to distribution learning, they can, in principle, steer the generation of molecules toward previously unexplored regions of the chemical space, if such exploration is incorporated into a suitable scoring function. However, goal-directed generation poses several challenges, including (a) the difficulty of accurately capturing complex molecular attributes (such as bioactivity, drug-likeness, and synthetic accessibility) in a single scoring function, which might lead to molecules that do not realistically match the intended objectives; (b) the risk of models exploiting spurious correlations or unintended biases within the scoring function rather than learning meaningful structure–property relationships (a phenomenon known as “reward hacking”); and (c) reduced chemical diversity due to biases imposed by the optimization process, which can even result in mode collapse. The reliability of goal-directed methods is therefore heavily dependent on the quality and diversity of the data and models used to define scoring functions. Additionally, molecules generated by goal-directed strategies may exhibit lower synthetic accessibility than those designed via distribution learning approaches. Several strategies have been explored to retain broad chemical diversity while achieving the generation objectives, such as incorporating a memory of previously produced molecules, penalizing structurally redundant designs, or integrating transfer and reinforcement learning. Overall, whenever dealing with goal-directed generation, scoring functions play a key role in determining the properties of the de novo designs. As such, scoring functions should be designed carefully, by balancing computational cost and reliability.
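The idea of keeping a memory of previously produced molecules and penalizing redundant designs can be sketched as a thin wrapper around a scoring function. This is an illustrative toy, not a specific published method: the class name, the `bucket_size` parameter, and the grouping function (in practice one might group by molecular scaffold) are all assumptions.

```python
from collections import Counter
from typing import Callable, Hashable


class MemoryPenalizedScore:
    """Wrap a scoring function and zero the reward of over-represented groups."""

    def __init__(
        self,
        score_fn: Callable[[str], float],
        group_fn: Callable[[str], Hashable],
        bucket_size: int = 3,
    ) -> None:
        self.score_fn = score_fn        # external objective (hypothetical)
        self.group_fn = group_fn        # maps a molecule to a group, e.g. a scaffold
        self.bucket_size = bucket_size  # rewarded designs allowed per group
        self.memory: Counter = Counter()

    def __call__(self, molecule: str) -> float:
        raw_score = self.score_fn(molecule)
        group = self.group_fn(molecule)
        if self.memory[group] >= self.bucket_size:
            return 0.0                  # group already exploited: no reward
        self.memory[group] += 1
        return raw_score


if __name__ == "__main__":
    # Toy setup: constant objective, molecules grouped by their first character.
    reward = MemoryPenalizedScore(lambda m: 1.0, lambda m: m[0], bucket_size=2)
    print([reward(m) for m in ["CCO", "CCN", "CCC", "NCC"]])
```

Once a group has filled its bucket, further members receive no reward, which nudges the generator toward other regions instead of repeatedly exploiting the same chemotype.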
Finally, conditional generation has been explored less extensively in drug discovery compared to distribution-learning and goal-directed approaches. In general, however, since these models do not rely on externally computed scoring functions, they offer several potential advantages, such as (a) mitigating biases associated with predictive models (e.g., bioactivity prediction models), (b) capturing complex structure–property/activity relationships by directly associating molecular structures with desired properties within latent space, and (c) maintaining the end-to-end learning characteristics of distribution-learning methods. Despite these advantages, conditional generation has yet to be applied extensively and experimentally validated. This might be due to the fact that conditional generation is well-suited for scenarios where large molecular databases with accurate and reliable labels are used. However, such datasets are relatively uncommon in drug discovery, which might explain the limited application of conditional approaches. One potential way to overcome this limitation is to incorporate self-supervised or semisupervised approaches to enhance learning from smaller datasets. Finally, it remains uncertain whether conditional generation can effectively explore regions of chemical space beyond those represented in the training set, raising questions about its utility relative to more established generative approaches.
2.4. Evaluating the Quality of de Novo Designs
While generative deep learning models can rapidly produce millions of molecular designs, the evaluation of molecular quality, whether for follow-up experiments or for model analysis and comparison, remains an open challenge. Despite remarkable strides in generative drug discovery, determining whether a design is “good” or “bad” is difficult, as it often involves balancing multiple, sometimes conflicting, objectives in a context-dependent manner. While no universally accepted guidelines exist for evaluating de novo design studies, assessment typically considers multiple factors, including chemical validity, diversity, and alignment with the intended design objectives, as outlined below.
2.4.1. Chemical Validity and Non-redundancy
One of the principal requirements of a good molecular generator is its ability to generate molecules that are “chemically plausible”, in other words, molecules that satisfy valency, aromaticity, and charge constraints. These aspects are commonly referred to as “chemical validity”. While validity only considers two-dimensional information, it has recently been extended to three-dimensional molecular generation (“validity3D”), which evaluates the conformation quality of bond lengths and valence angles. Additionally, generators should be able to produce non-redundant molecules, by limiting duplicated designs (“uniqueness”) and the overlap with the training set (“novelty”). These metrics should always be reported, as they can reveal significant issues in the model training procedure. However, validity, uniqueness, and novelty are vulnerable to trivial modifications (e.g., the random insertion of a carbon atom), may depend on the number of generated designs, and are easy to optimize by simple heuristic algorithms. Therefore, they should be considered a diagnostic check rather than a conclusive evaluation metric.
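As a minimal sketch of how these three diagnostics are often computed, the function below follows one common convention: validity over all generated strings, uniqueness over the valid ones, and novelty over the unique ones. The `canonicalize` argument is a hypothetical callable that returns a canonical string for a valid molecule and `None` otherwise; in practice a cheminformatics toolkit would play this role, and the toy checker in the example is only a stand-in.

```python
from typing import Callable, Iterable, Optional


def generation_metrics(
    generated: Iterable[str],
    training_set: Iterable[str],
    canonicalize: Callable[[str], Optional[str]],
) -> dict[str, float]:
    """Compute validity, uniqueness, and novelty of a set of generated strings."""
    generated = list(generated)
    canonical = [canonicalize(s) for s in generated]
    valid = [c for c in canonical if c is not None]      # parsable designs
    unique = set(valid)                                  # duplicates removed
    reference = {canonicalize(s) for s in training_set} - {None}
    novel = unique - reference                           # not seen in training
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_canonicalize(smiles: str) -> Optional[str]:
    """Toy stand-in: a string is 'valid' if its parentheses are balanced."""
    return smiles if smiles.count("(") == smiles.count(")") else None


if __name__ == "__main__":
    designs = ["CCO", "CCO", "CC(", "CCN"]
    print(generation_metrics(designs, ["CCO"], toy_canonicalize))
```

Because all three ratios depend on the number of generated strings, they are best compared across models at a fixed sample size.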
2.4.2. Internal Diversity of de Novo Design Libraries
Assessing the structural diversity of generated molecules is crucial to ensure that a generative model does not produce structurally redundant or overly similar compounds. High diversity is often desirable, particularly in early-stage drug discovery, as it increases the chances of identifying novel bioactive chemotypes. Evaluating and quantifying molecular diversity is not straightforward, as “similarity is in the eye of the beholder”: it depends on the chosen molecular features (molecular descriptors) and similarity/diversity metrics. Often, molecular diversity is captured by computing the pairwise Tanimoto similarity (the lower the similarity, the higher the diversity) on extended-connectivity fingerprints (ECFPs), which capture the presence of shared substructures. Average pairwise similarities are frequently reported, at the risk of obscuring important nuances in molecular diversity. The presence of diverse molecular scaffolds is also commonly used to capture diversity. While scaffold analysis provides a more abstract perspective on diversity compared to ECFP similarity, it can be skewed by the inclusion of scaffolds that differ only minimally. Recently, the number of circles (“#Circles”) has been introduced, which leverages sphere exclusion clustering to quantify the internal diversity of molecular datasets and measure the chemical space covered by databases and generative models. #Circles captures the structural diversity of molecular sets well and helps distinguish generative models with varying exploration capacities. Because the #Circles metric is computationally expensive, the number of unique substructures (computed via the Morgan algorithm) was recently suggested as a cost-effective alternative to capture internal diversity.
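The two similarity-based views discussed above can be illustrated with a short sketch. Fingerprints are represented here as plain sets of integer feature identifiers (an assumption standing in for ECFP bits), and the greedy sphere-exclusion count is only a simple lower-bound heuristic in the spirit of #Circles, not the reference implementation; the default threshold is likewise an arbitrary choice.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two feature sets (shared / total features)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def mean_pairwise_similarity(fingerprints: list[set]) -> float:
    """Average Tanimoto similarity over all pairs (lower means more diverse)."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("At least two fingerprints are required.")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def greedy_sphere_exclusion(fingerprints: list[set], threshold: float = 0.75) -> int:
    """Count greedily chosen centers whose mutual Tanimoto distance exceeds threshold."""
    centers: list[set] = []
    for fp in fingerprints:
        # Keep a molecule only if it lies outside every existing sphere.
        if all(1.0 - tanimoto(fp, center) > threshold for center in centers):
            centers.append(fp)
    return len(centers)


if __name__ == "__main__":
    library = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}]
    print(mean_pairwise_similarity(library))
    print(greedy_sphere_exclusion(library, threshold=0.75))
```

The example shows why a single average can hide structure: the mean similarity is low, yet two of the three toy molecules fall in the same sphere once a distance threshold is applied.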
2.4.3. Similarity to Reference Molecules
In de novo drug design tasks, generated molecules are often required to share key, desirable properties with known compounds, such as binding affinity or solubility. Consequently, their similarity to these reference compounds (often used for model training and validation) is computed as a measure of how well the designed molecules align with the desired property distributions. As with measures of internal diversity, similarity can consider information on shared substructures, as captured by extended connectivity fingerprints (ECFPs) or Molecular ACCess System (MACCS) keys. Often, evaluations of molecular similarity are based on physicochemical descriptors, such as molecular weight, octanol–water partitioning coefficient (logP), topological polar surface area (TPSA), and number of hydrogen bond donors and acceptors. Once these descriptors are obtained for the reference molecules (e.g., training set) and the molecular designs, the degree of similarity between their distributions is computed using dedicated metrics, such as the Kullback–Leibler (KL) divergence or Kolmogorov–Smirnov (KS) distance (the lower the values, the more similar the distributions). Another popular measure of distance between distributions is the Fréchet ChemNet Distance (FCD). FCD is computed from the internal representations of the penultimate layer of a model for bioactivity prediction, capturing both chemical and biological information. Lower FCD values indicate greater similarity between molecular sets in terms of structure and bioactivity. An important caveat is that distribution-based similarities require a minimum number of molecules to be reliably computed, estimated to be around 100,000.
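For a single descriptor such as molecular weight or logP, the two distribution distances mentioned above can be sketched with the standard library alone. The histogram-based KL estimate, the number of bins, and the smoothing constant `eps` are assumptions made for illustration; other estimators and the opposite direction of the divergence are equally possible.

```python
import math
from bisect import bisect_right


def ks_distance(x: list[float], y: list[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )


def kl_divergence(
    reference: list[float], designs: list[float], bins: int = 10, eps: float = 1e-9
) -> float:
    """Histogram estimate of KL(reference || designs) on a shared binning."""
    low = min(min(reference), min(designs))
    high = max(max(reference), max(designs))
    width = (high - low) / bins or 1.0  # avoid zero width for constant data

    def histogram(samples: list[float]) -> list[float]:
        counts = [0] * bins
        for value in samples:
            counts[min(int((value - low) / width), bins - 1)] += 1
        # Smoothing keeps empty bins from producing log(0) or division by zero.
        return [(c + eps) / (len(samples) + eps * bins) for c in counts]

    p, q = histogram(reference), histogram(designs)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    train_logp = [1.0, 1.5, 2.0, 2.5, 3.0]
    design_logp = [1.2, 1.8, 2.2, 2.9, 4.0]
    print(ks_distance(train_logp, design_logp))
    print(kl_divergence(train_logp, design_logp))
```

Both values are zero for identical samples and grow as the distributions drift apart; with small sample sizes the estimates become noisy, which is the caveat raised at the end of the paragraph above.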
2.4.4. Predicted Molecular Suitability
Ultimately, de novo drug design aims to find molecular candidates for hit and lead discovery that possess an array of desirable properties, such as potency, selectivity, pharmacokinetics, and safety. These properties are inherently complex and resource-intensive to determine experimentally. This is why surrogate computational approaches of different levels are used for molecule evaluation and prioritization. Commonly used computational approaches are: (a) quantitative structure–activity relationship (QSAR) approaches, to predict biological properties such as potency, selectivity and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties; (b) pharmacokinetic models, which aid in predicting bioavailability, clearance, and systemic exposure; (c) estimation of synthetic feasibility at varying levels of complexity; and (d) biophysics-based approaches, such as docking and molecular dynamics. Each computational approach offers distinct advantages and limitations in molecule evaluation. QSAR models enable rapid, cost-effective predictions of biological properties but rely heavily on high-quality training data, which are often not available. At the same time, pharmacokinetic models provide physiologically relevant insights into drug disposition, yet they require accurate input parameters, which can introduce uncertainties. Synthetic feasibility estimations aid in prioritizing accessible compounds, but may bias toward well-established chemistries, potentially overlooking innovative or unconventional synthetic routes (see Section 3.3). Biophysics-based methods provide mechanistic insights into molecular interactions, with docking offering rapid but often simplified predictions of binding poses, while molecular dynamics (MD) captures conformational flexibility and stability at the cost of significantly higher computational expense. However, both approaches can suffer from scoring inaccuracies, particularly in ranking binding affinities. Despite these challenges, these computational tools have become indispensable to navigate and rationalize the large virtual libraries designed by generative deep learning.
Evaluating the quality of de novo designs remains one of the most challenging open questions in the field. This complexity arises from the inherently multi-objective nature of drug discovery, the unfeasibility of large-scale synthesis and biological testing, and the absence of perfect models to predict complex in vitro and in vivo properties. Nonetheless, the cheminformatics community has been particularly active in delineating guidelines and benchmarks to aid in the evaluation of the designs produced by generative deep learning algorithms (e.g., Table 2). Overall, the specific approaches adopted to evaluate a generator’s designs depend on the intended goal: whether to develop and compare generative models or to prioritize compounds for experimental validation.
Table 2. Overview of Existing Benchmarks for de Novo Molecular Design.
| Name | Evaluation | Tasks |
|---|---|---|
| GuacaMol | General drug-like property optimization. | Distribution learning, exploration/exploitation of chemical space, single and multi-objective optimization tasks |
| MOSES | Molecule design in general. | Distribution learning, molecular diversity, avoidance of forbidden structures. |
| MolOpt | Sample efficiency in reinforcement learning. | Molecule optimization via 23 oracle functions. |
| MolScore | Collection and reimplementation of GuacaMol, MOSES, and MolOpt, with added evaluations. | Distribution learning, biological activity, synthetic accessibility, among others. |
| MolExp | Chemical space exploration. | Capacity to discover dissimilar molecules that possess similar bioactivity. |
| QM9 | Unconditional molecule design. | 3D properties of molecules with up to 9 heavy atoms, computed via density functional theory (DFT) simulations. |
| GEOM-Drugs | Unconditional molecule design. | Computed 3D properties of molecular conformers. |
| SMINA-benchmark, DOCKSTRING | Docking scores. | Docking scores with various software. |
| GenBench3D | Ligand conformational quality. | Likelihood of bond lengths and valence angles based on reference values from the Cambridge Structural Database (CSD). |
| CBGBench | Quality of structure-based de novo design. | Evaluation of structure-based de novo design across four generation tasks. |
only for 3D generation approaches.
When comparing generative models or developing new ones, large-scale assessment is recommended, typically involving at least 100,000 molecular designs. In this context, evaluating chemical validity and non-redundancy is necessary, but not sufficient, to monitor the correct learning of chemical information. Furthermore, the analysis of similarity and diversity patterns (e.g., number of circles or substructures, descriptor similarity and FCD) is required to provide insights into a model’s potential and limitations. Practitioners should ensure that the same number of generated designs is used across evaluations, to prevent unintended confounding factors that could bias model comparison.
For applications aiming to bring generative models closer to experimental validation, a more refined evaluation strategy might be required. While similarity vs. diversity analysis remains relevant for optimizing the exploration-exploitation trade-off, additional layers of molecular assessment might come into play. Early-stage filters, such as simple synthetic accessibility estimations, help narrow down the chemical space before applying more sophisticated and computationally demanding retrosynthesis predictions. Similarly, biophysics-based evaluations may start with docking for rapid binding predictions, followed by more resource-intensive molecular dynamics simulations to refine binding mode stability and interaction profiles.
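The tiered strategy described above, in which inexpensive filters run first and costly evaluations are reserved for the survivors, can be sketched as a simple cascade. The stage names, scoring functions, and cut-offs below are placeholders (assumptions), and the sketch assumes that a higher score is better at every stage.

```python
from typing import Callable, Sequence

Stage = tuple[str, Callable[[str], float], int]  # (name, scoring function, keep_top)


def tiered_selection(
    candidates: Sequence[str], stages: Sequence[Stage]
) -> tuple[list[str], list[tuple[str, int]]]:
    """Apply stages from cheapest to most expensive, keeping the top designs each time."""
    pool = list(candidates)
    log: list[tuple[str, int]] = []
    for name, score_fn, keep_top in stages:
        # Only molecules that survived earlier stages are scored here,
        # so expensive functions are evaluated on a shrinking pool.
        ranked = sorted(pool, key=score_fn, reverse=True)
        pool = ranked[:keep_top]
        log.append((name, len(pool)))
    return pool, log


if __name__ == "__main__":
    designs = ["CCO", "CCN", "CCCCCCCC", "CNN", "C"]
    stages = [
        ("cheap_filter", lambda s: -len(s), 3),      # toy proxy for a fast score
        ("costly_score", lambda s: s.count("N"), 1),  # toy proxy for a slow score
    ]
    print(tiered_selection(designs, stages))
```

In a real pipeline the cheap stage might be a heuristic synthetic accessibility estimate or docking, and the costly stage a retrosynthesis prediction or molecular dynamics simulation; the cut-offs then set how much compute each tier consumes.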
To date, the effectiveness of de novo design evaluation, especially for prospective applications, remains dependent on the expertise and viewpoint of the researchers analyzing the results, highlighting the need for a strong synergy between machine learning specialists and medicinal chemists to ensure that computationally generated designs align with real-world drug discovery objectives.
3. With Complex Objectives Come Complex Responsibilities
3.1. The Similarity-Diversity Paradox
Designing new drug candidates faces a persistent challenge: balancing structural diversity with molecular similarity to known molecules. On one hand, similarity to known bioactive compounds increases the likelihood of identifying viable drug candidates, as structurally related molecules often share biological activity. On the other hand, excessive similarity can restrict chemical innovation and the likelihood of charting novel chemical space. This not only hampers the discovery of molecules with improved therapeutic properties (such as enhanced potency, selectivity, or safety) but also raises concerns regarding the patentability of compounds that closely resemble existing drugs. In drug discovery, innovation beyond “me-too” compounds requires striking a balance between leveraging known bioactivity and exploring sufficiently distinct chemical matter to uncover new mechanisms of action and therapeutics. We term this tension the “similarity-diversity paradox”, to underscore the difficulty in simultaneously achieving the two.
How to balance similarity and diversity has been a long-standing question in the cheminformatics and drug discovery communities. This paradox can be mitigated by aiming to minimize substructure and scaffold similarity, while preserving three-dimensional information. In the context of protein–ligand binding, in fact, bioactivity is primarily driven by three-dimensional shape and electrostatic complementarity, along with other molecular recognition factors. Based on this concept, several generative approaches have incorporated shape and/or electrostatic complementarity in the design process, e.g., by conditioning the generation using pharmacophore or three-dimensional pocket information, or via ad-hoc scoring for reinforcement learning. These approaches could preserve core features for binding (i.e., similarity to pharmacophore and/or 3D shape and electrostatics) while exploring new regions of the chemical space in terms of scaffolds and substructures (ensuring “two-dimensional” diversity). Another strategy to increase diversity is to draw inspiration from natural products. Natural products have been a rich source for medicinal chemists due to their structural complexity and diverse scaffolds, which often translate into a wide range of biological activities. Recognizing this, several studies have employed deep learning techniques to generate natural product-inspired compounds via transfer learning or fragment-based generation. These studies show that incorporating natural-product-derived information expands the chemical space accessible for drug discovery while preserving desirable properties such as bioactivity.
Exploring previously uncharted regions of chemical space is nevertheless not easy. In fact, most available molecule evaluation tools (such as docking algorithms, QSAR models and ADMET prediction tools) might struggle with highly novel scaffolds. These tools often rely on statistical correlations derived from existing datasets, meaning that molecules with unprecedented cores or non-traditional structural features may fall outside their applicability domain. For instance, docking algorithms typically assume that ligands share similarities with known binders, leading to inaccurate scoring for highly distinct chemotypes. Similarly, machine learning-based bioactivity predictors are known to provide unreliable predictions when applied to molecules that are too different from the molecules used for training. Finally, synthetic accessibility scores often penalize novel molecular structures, and might not consider current or future advancements in organic synthesis.
Navigating the similarity-diversity paradox with the current technologies ultimately requires combining several techniques. De novo designs occupying the “familiar” chemical space can be scored with the available data-driven tools such as docking approaches and predictive models, and synthesis routes can be devised for prospective studies. For diverse designs outside the applicability domain, physics-based molecular dynamics simulations of increasing computational complexities can be run. Initial short or coarse-grained simulations can help preselect a smaller pool of de novo designs. For these selected candidates, performing more extensive and accurate simulations (potentially with multiple replicates) might help the refinement of the compound selection. Ultimately, overcoming the similarity-diversity paradox will be only possible by combining efforts from multiple disciplines, such as improving generalizability in molecular machine learning, and finding synergies, complementarities and overlaps between physics-based approaches and data-driven predictions.
3.2. Benefits and Limitations of Molecular Benchmarks for de Novo Design
Evaluating generative models poses more challenges than evaluating predictive models (e.g., to predict bioactivity or toxicity). While predictive models can be evaluated via held-out molecules that were previously tested, de novo designs are, by definition, molecules that have not been previously tested for their properties. This makes it difficult to evaluate different model architectures, choose which model should be used for prospective studies, and identify learning gaps. To overcome this limitation, in the past few years, several notable benchmarking efforts, such as GuacaMol and MOSES, have strived to provide standardized datasets and metrics to evaluate and compare generative models (Table 2).
GuacaMol and MOSES are to date the most well-known benchmarking platforms. They include datasets for testing, along with benchmarking metrics for comparing generative models, with GuacaMol covering both distribution-learning and goal-directed models, and MOSES focusing on distribution-learning models. These benchmarks have enabled researchers to compare diverse approaches systematically, facilitating rapid progress in algorithmic development and model optimization. By offering well-defined tasks (e.g., optimizing molecular properties, generating diverse compounds, or mimicking training distributions), these benchmarks have become indispensable tools for assessing the performance and utility of generative algorithms in chemistry. While molecular benchmarks are undeniably valuable, there are inherent limitations and potential pitfalls associated with their use. Prominent among these is the risk of overfitting models to benchmark-specific tasks, metrics, or datasets. Such overoptimization can lead to “tunnel vision”, where models excel at predefined tasks but fail to generalize to real-world drug design challenges. In this context, excessive reliance on benchmarks has been suggested to stifle creativity, as researchers prioritize improving benchmark scores over addressing broader, more impactful scientific questions. Moreover, molecular benchmarks often simplify the complexity of drug discovery. Real-world drug design involves multifaceted objectives, iterative processes, and the integration of experimental validation, aspects that are not easily captured in benchmark tasks. The reliance on computational metrics, such as similarity or property optimization, may inadvertently bias research toward generating compounds that are easy to evaluate computationally but less relevant for practical applications.
To address these challenges, the field must strike a balance between leveraging the advantages of benchmarks and fostering innovative, application-driven research. This includes testing approaches on more than one benchmark, continuing the development of progressively more realistic evaluation metrics and datasets, and, ultimately, incorporating experimental feedback whenever possible. By acknowledging both the benefits and limitations of molecular benchmarks, we wish for the community to harness their potential while remaining vigilant against their unintended consequences.
3.3. Navigating Feasibility in Molecular Generation
While generative deep learning models can produce previously unseen molecules, difficulties arise in addressing synthetic feasibility. Defining and incorporating synthesizability is a challenging task, as it may depend on subtle structural variations, reaction selectivity, and the commercial availability of building blocks essential to chemical reactions. Moreover, several studies have shown that the synthesizability of molecules proposed by deep learning may be limited, depending on the chosen approach.
A wealth of research has focused on assessing the synthetic feasibility of molecules. Commonly used strategies generally assess one of the following aspects (Table 3):
Synthetic complexity scores. These scores identify molecular characteristics that could challenge the synthesis. Metrics such as the synthetic accessibility score (SASscore) estimate synthetic complexity based on the presence of predefined molecular motifs, such as the number of bridged and spiro systems, stereocenters, or macrocycles. Other complexity metrics account for functional group distribution, molecular size, and ring system strain. More recently, scores of synthetic complexity based on deep learning have been introduced, for instance, based on the number of estimated reaction steps required.
Predicted retrosynthesis routes. Determining retrosynthesis routes involves decomposing a target molecule into building blocks that are either commercially available or easy to synthesize. Computationally, retrosynthesis planning can be approached by using either rule-based methods or machine learning. Rule-based approaches identify functional groups and reaction motifs, then apply a library of predefined reaction templates to suggest potential precursors. Machine learning approaches are used to learn retrosynthetic feasibility directly from large databases of annotated reactions. This can be performed via (a) template-based retrosynthesis, where the model selects reaction templates from training data, or (b) template-free retrosynthesis, where sequence-to-sequence learning or graph-based transformations are applied to generate precursors directly, without predefined templates.
Table 3. Selected Approaches to Assess and Incorporate Synthetic Feasibility in Molecular Generation. The approaches are grouped by type and subtype, along with selected examples.
| Type | Description | Examples |
|---|---|---|
| Synthetic complexity scores | Penalizes complex structures, such as fused rings or stereocenters. | Synthetic Accessibility Score (SASscore) |
| | Determines a synthesis tree to choose the most favorable route. | RASA |
| | Based on deep learning and the estimated number of reaction steps. | SCscore, FSscore |
| Retrosynthetic Analysis of designed compounds | Rule-based planning using predefined reaction rules. | CAS Scifinder, Chematica |
| | Template-free models predicting reactants from the product (or vice versa) using deep learning. | Wan et al., Tetko et al., Yao et al. |
| | Template-based deep learning approaches based on learned transformations. | AiZynthFinder, IBM RXN, Synthia |
| Synthesizability by design | Systematic generation of possible candidates with predefined reaction rules. | RENATE, DINGOS |
| | Constrained molecular generation to favor synthesizable compounds, often via reinforcement learning. | SynFormer, TANGO, ClickGen, REACTOR, Guo and Schwaller |
Each strategy has distinct advantages and disadvantages. Synthetic complexity scores provide a rapid assessment of synthetic feasibility, which makes them particularly suited to evaluate large molecular libraries, or to be included in reinforcement learning pipelines. However, synthetic complexity scores that rely on predefined heuristics are static as they may not include new synthetic methodologies and cannot easily be updated based on new reaction data. Furthermore, these strategies condense vast information into a single score, which can obscure important nuances, such as whether a high complexity score is due to challenging stereochemical features, strained ring systems, or uncommon functional groups. As a result, these scores may sometimes penalize molecules that are accessible via modern synthetic routes.
In contrast, machine learning-based approaches offer a dynamic and data-driven alternative to static complexity scores. By learning from large reaction databases, they can capture nuanced relationships between molecular structures and their synthetic feasibility while adapting to new reaction data. Lastly, retrosynthesis-based models further provide explicit synthetic routes, with template-based methods using established transformations and template-free approaches offering greater flexibility for novel chemistry. However, these models are limited by data bias and reaction coverage.
For this reason, the choice of feasibility assessment method depends on the application. Complexity scores remain valuable for rapid screening of large molecular libraries and for pipelines where computational efficiency is crucial. Meanwhile, machine learning-based retrosynthesis models provide more practical feedback for experimental applications by suggesting synthesis routes, but they require careful curation of reaction data to ensure reliability. While synthetic feasibility is typically assessed after molecule generation, approaches that generate molecules while considering synthesizability constraints have also been developed. They are divided into the following groups:
Enumeration-based methods, which refer to the systematic generation of (all) possible candidate molecules from building blocks by following a set of predefined reaction rules. These methods explore the synthetically accessible space, often using reaction vectors as the reaction rules.
Synthesizability-constrained molecular generation, which uses reaction templates or reactivity predictions to generate experimentally accessible molecules together with their synthetic pathways. Another promising approach is to integrate synthetic accessibility as an optimization objective within reinforcement learning, to steer the generation toward drug-like, synthetically accessible molecules. Other approaches have used autoencoders for molecule optimization that considers both desirable molecular properties and elements of synthetic accessibility, or have optimized the score of oracle retrosynthesis prediction models.
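As a minimal illustration of the enumeration idea described above, the sketch below pairs building blocks according to predefined reaction rules. The building blocks, the functional-group labels, and the two rules are hypothetical placeholders chosen for readability; a real pipeline would apply reaction templates or reaction vectors to molecular structures with a cheminformatics toolkit rather than match text labels.

```python
from itertools import product

# Hypothetical building blocks as (name, reactive group); not real inventory data.
BUILDING_BLOCKS = [
    ("benzoic acid", "carboxylic_acid"),
    ("acetic acid", "carboxylic_acid"),
    ("aniline", "amine"),
    ("benzylamine", "amine"),
    ("phenylboronic acid", "boronic_acid"),
    ("bromobenzene", "aryl_halide"),
]

# Hypothetical reaction rules: reaction name -> the two reactive groups it needs.
REACTION_RULES = {
    "amide coupling": ("carboxylic_acid", "amine"),
    "Suzuki coupling": ("boronic_acid", "aryl_halide"),
}


def enumerate_routes(blocks, rules):
    """Return every (reaction, reagent_a, reagent_b) combination allowed by the rules."""
    routes = []
    for reaction, (group_a, group_b) in rules.items():
        side_a = [name for name, group in blocks if group == group_a]
        side_b = [name for name, group in blocks if group == group_b]
        # Systematic enumeration: all pairings of compatible building blocks.
        for reagent_a, reagent_b in product(side_a, side_b):
            routes.append((reaction, reagent_a, reagent_b))
    return routes


routes = enumerate_routes(BUILDING_BLOCKS, REACTION_RULES)
for reaction, reagent_a, reagent_b in routes:
    print(f"{reagent_a} + {reagent_b} via {reaction}")
```

Every enumerated candidate carries its one-step route by construction, which is what makes this family of methods synthesis-aware; the cost is that the explored space is bounded by the chosen building blocks and rules.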
Last, estimating synthesizability adds to the challenge of designing molecules that are structurally diverse (and out-of-distribution) yet synthetically feasible. In these cases, the designs might fall outside the distribution covered not only by the approaches used to assess their physico-chemical and biological properties, but also by the approaches used to estimate their synthesizability. To bridge this gap, recent research has addressed the generalizability of synthesis prediction tools when applied to newly reported patents or reactions. Advancing synthesis estimation models in parallel with molecule generation models is expected to be a key strategy for charting the chemical universe efficiently.
3.4. The Experimental Validation Dilemma
Unlike bioactivity or molecular property prediction, where models are usually evaluated on well-defined test sets, molecular generation aims to propose novel, previously unseen molecules whose properties are unknown. Moreover, generative approaches tend to show low “molecular rediscovery” rates, making it difficult to use existing, held-out molecules for the evaluation. These aspects make direct evaluation of generative approaches difficult, as there is no ground truth to compare against. Compounding this challenge, existing tools for assessing de novo designs (such as similarity metrics, bioactivity prediction, or heuristic scores) are inherently imprecise and/or only partially capture complex design objectives.
Arguably, the only definitive “proof of the pudding” for bioactivity is experimental validation. However, experimental testing is expensive and time-consuming, meaning that only a small fraction of generated molecules can be synthesized and evaluated. Hence, this process faces two major bottlenecks:
Selection of molecules to make and test. With thousands of generated designs scoring similarly on computational metrics, choosing which ones to validate experimentally is challenging. Many of these scores contain inherent noise, and small variations in ranking may not correspond to meaningful differences in experimental properties. At the same time, errors in the scoring functions (e.g., for out-of-distribution molecules) might rule out suitable molecules. Ensuring that promising candidates are not overlooked requires robust selection strategies; yet defining optimal criteria for this process remains an open challenge.
Sufficiency of experimental validation. In machine learning, test sets used to validate predictive models typically contain a substantial number of molecules to provide statistically robust evaluations (e.g., on the order of hundreds to thousands of compounds). In de novo design, however, synthesizing and testing even a small fraction of generated molecules is prohibitively expensive. This raises a fundamental challenge: how many designs are sufficient to reliably measure the effectiveness of (and compare) generative deep learning models? To date, balancing practical feasibility with robust evaluation remains an unresolved issue.
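To make the question of "how many designs are sufficient" more concrete, the sketch below computes a Wilson score interval for an experimentally observed hit rate. This is one standard way to quantify the uncertainty of a proportion, chosen here as an assumption for illustration rather than a protocol taken from the studies discussed; the campaign sizes and the 30% hit rate are hypothetical.

```python
import math


def wilson_interval(hits, tested, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95% confidence)."""
    if tested <= 0 or not 0 <= hits <= tested:
        raise ValueError("need tested > 0 and 0 <= hits <= tested")
    p = hits / tested
    denom = 1 + z**2 / tested
    center = (p + z**2 / (2 * tested)) / denom
    half_width = z * math.sqrt(p * (1 - p) / tested + z**2 / (4 * tested**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical campaigns with the same observed hit rate (30%) but different sizes.
widths = {}
for tested in (10, 30, 100, 300):
    hits = round(0.3 * tested)
    low, high = wilson_interval(hits, tested)
    widths[tested] = high - low
    print(f"{hits}/{tested} hits: 95% interval {low:.2f} to {high:.2f}")
```

With only ten tested compounds, the interval for 3 hits spans roughly 0.11 to 0.60, so two generative models would need very different hit rates before such a small campaign could tell them apart; the interval narrows only slowly as more compounds are made and tested.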
In addition to cost and time constraints, experimental validation of generative deep learning approaches demands expertise across multiple disciplines, including synthetic and medicinal chemistry, computational modeling, assay development, and data analysis. This interdisciplinary challenge likely contributes to the relatively scarce experimental validation of generative models (Table ), especially when compared to the vast number of proposed approaches. It might also explain why the field tends to validate new methods predominantly on well-explored targets, like kinases and nuclear receptors (Table ). Finally, the validation cost and complexity escalate as the targeted property becomes more clinically relevant, requiring more advanced assays, in vivo studies, or even early-stage pharmacokinetics and toxicity evaluations. These challenges underscore the difficulty of translating de novo molecular design from computational predictions to real-world applications.
The integration of automated synthesis and self-driving laboratories offers a promising solution to this “experimental validation dilemma”. By combining robotic synthesis, high-throughput screening, and deep-learning-driven molecule design and synthesis planning, these platforms can accelerate experimental validation while mitigating the bottlenecks of manual experimentation. Automated synthesis not only enables a more systematic assessment of the capabilities of generative drug discovery approaches, but also enhances the efficiency of design-make-test-analyze cycles. Incorporating adaptive learning strategies (e.g., active learning) could further refine molecular generation strategies, guiding the exploration of the chemical space with greater precision and versatility.
4. Future Directions and Opportunities
Generative deep learning has expanded the opportunities to design molecules on demand in the search for drug hit and lead candidates. The field is advancing rapidly, offering both exciting prospects and notable challenges.
One of the biggest challenges remains the evaluation of design quality. In complex design tasks, such as the design of bioactive molecules, accurate and scalable quality metrics are lacking. Fast evaluation metrics, e.g., quantitative estimate of drug-likeness and QSAR model predictions, trade accuracy for speed and can misguide molecule selection. On the other hand, relying on accurate but time-consuming simulations is practically infeasible for large libraries, and hence inevitably constrains the breadth of the analysis. We see accelerating these simulations with deep learning as a promising research direction, which could remove a major obstacle to realizing the full potential of generative deep learning. Until speed and accuracy are effectively combined, new research should report multiple and multifaceted evaluation metrics for large design libraries to better reflect design quality.
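As a small example of reporting more than one metric for a design library, the sketch below computes uniqueness and novelty, two commonly reported distribution-level metrics. It assumes that all molecules are given as already canonicalized SMILES strings, so that string equality stands in for molecular identity; the function name and the toy inputs are hypothetical, and a fuller report would add further metrics (e.g., validity, diversity, and property-based scores).

```python
def library_metrics(designs, training_set):
    """Uniqueness and novelty of a design library.

    Assumes canonical SMILES strings, so equal strings mean equal molecules.
    """
    if not designs:
        raise ValueError("designs must not be empty")
    unique = set(designs)
    novel = unique - set(training_set)
    return {
        "n_designs": len(designs),
        # Fraction of generated designs that are distinct from each other.
        "uniqueness": len(unique) / len(designs),
        # Fraction of distinct designs that do not appear in the training set.
        "novelty": len(novel) / len(unique),
    }


# Toy, hypothetical inputs for illustration only.
training_set = ["CCO", "c1ccccc1", "CC(=O)O"]
designs = ["CCO", "CCN", "CCN", "c1ccncc1"]
metrics = library_metrics(designs, training_set)
print(metrics)
```

Neither number says anything about bioactivity or synthesizability, which is why such metrics are best read alongside the other evaluation layers rather than in isolation.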
A direct consequence of the challenges in design evaluation is the increased risk involved in testing the proposed de novo designs in the lab. While computational scoring functions (e.g., docking and QSAR models) can prioritize de novo designs, they are reliable only within their respective applicability domains, which can lead to structurally novel designs being filtered out. Hence, training models with better out-of-distribution capabilities can extend these limits and represents a promising research direction. We expect the integration of predictive approaches with higher generalization capabilities into reinforcement learning to enable fast chemical space exploration in the search for structurally novel molecular entities.
Another fundamental challenge for de novo design is data scarcity. Even “large” bioactivity datasets usually contain only a few thousand molecules, which constitutes a bottleneck for deep learning. Transfer learning helps mitigate this problem by leveraging unlabeled datasets; however, current approaches also face issues such as catastrophic forgetting and mode collapse, resulting in reduced design diversity. Designing new training strategies has attracted increasing attention from the research community. In-context learning for molecules and test-time scaling of generations are two promising examples; further research is needed to validate their applicability. We also view inactive molecules as an invaluable resource for low-data drug discovery research, since inactive molecules are typically more abundant than active ones.
It is impossible to ignore the recent rise of large language models (LLMs) and their potential to accelerate molecular design. While open questions remain regarding their chemical representation capabilities, these models offer a strong ability to integrate diverse sources of information, from textual descriptions of chemical reactions to structured datasets, potentially leading to faster and/or more informed molecular generation and design evaluation. However, despite their promise, LLMs have yet to fully permeate drug discovery, and their role in addressing key challenges such as generalization, zero-shot learning, and structure-based design remains an open question.
We currently see two parallel but increasingly intertwined movements. The deep learning community continues to push the boundaries of generative modeling, developing more powerful architectures to explore chemical space. Meanwhile, the drug discovery community is working to integrate key factors (e.g., chemistry, bioactivity, synthesizability) into the evaluation of generative models, to ensure their relevance for therapeutic applications. Historically, these efforts have often progressed in isolation, but they are now converging as both fields recognize the need for models that are not only innovative but also grounded in medicinal chemistry and pharmacology. Bridging this gap will require strong interdisciplinary collaborations between these communities. Moreover, advances in model interpretability, evaluation frameworks that better reflect real-world constraints, and tighter integration between computational design and experimental validation will help close this gap. As these two domains continue to evolve, the true measure of success will not be in generating molecules alone, but in producing candidates that can withstand the demands of drug development and make a tangible impact. Here, publishing not only success stories but also negative experimental results is key, as it helps researchers better understand where research pipelines involving computational methods need to be adjusted.
R.Ö. and H.B. contributed equally to this work. Conceptualization: all authors; Visualization: F.G., with contributions from R.Ö.; Writing - Original Draft: R.Ö., H.B., F.G.; Writing - Review & Editing: all authors.
This research was funded by the European Union (ERC, ReMINDER, 101077879 to FG) and the Centre for Living Technologies. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council.
The authors declare no competing financial interest.
References
- Sousa T., Correia J., Pereira V., Rocha M.. Generative Deep Learning for Targeted Compound Design. J. Chem. Inf. Model. 2021;61(11):5343–5361. doi: 10.1021/acs.jcim.0c01496. [DOI] [PubMed] [Google Scholar]
- Anstine D. M., Isayev O.. Generative Models as an Emerging Paradigm in the Chemical Sciences. J. Am. Chem. Soc. 2023;145(16):8736–8750. doi: 10.1021/jacs.2c13467. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bilodeau C., Jin W., Jaakkola T., Barzilay R., Jensen K. F.. Generative Models for Molecular Discovery: Recent Advances and Challenges. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2022;12(5):e1608. doi: 10.1002/wcms.1608. [DOI] [Google Scholar]
- Grisoni F.. Chemical Language Models for de Novo Drug Design: Challenges and Opportunities. Curr. Opin. Struct. Biol. 2023;79:102527. doi: 10.1016/j.sbi.2023.102527. [DOI] [PubMed] [Google Scholar]
- Bohacek R. S., McMartin C., Guida W. C.. The Art and Practice of Structure-Based Drug Design: A Molecular Modeling Perspective. Med. Res. Rev. 1996;16(1):3–50. doi: 10.1002/(SICI)1098-1128(199601)16:1<3::AID-MED1>3.0.CO;2-6. [DOI] [PubMed] [Google Scholar]
- Ruddigkeit L., van Deursen R., Blum L. C., Reymond J.-L.. Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17. J. Chem. Inf. Model. 2012;52(11):2864–2875. doi: 10.1021/ci300415d. [DOI] [PubMed] [Google Scholar]
- Schneider G., Lee M.-L., Stahl M., Schneider P.. De Novo Design of Molecular Architectures by Evolutionary Assembly of Drug-Derived Building Blocks. J. Comput.-Aided Mol. Des. 2000;14(5):487–494. doi: 10.1023/A:1008184403558. [DOI] [PubMed] [Google Scholar]
- Arus-Pous J., Blaschke T., Ulander S., Reymond J.-L., Chen H., Engkvist O.. Exploring the GDB-13 Chemical Space Using Deep Generative Models. J. Cheminf. 2019;11(1):20. doi: 10.1186/s13321-019-0341-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Skinnider M. A., Stacey R. G., Wishart D. S., Foster L. J.. Chemical Language Models Enable Navigation in Sparsely Populated Chemical Space. Nat. Mach. Intell. 2021;3(9):759–770. doi: 10.1038/s42256-021-00368-1. [DOI] [Google Scholar]
- Brown N., Fiscato M., Segler M. H. S., Vaucher A. C.. GuacaMol: Benchmarking Models for de Novo Molecular Design. J. Chem. Inf. Model. 2019;59(3):1096–1108. doi: 10.1021/acs.jcim.8b00839. [DOI] [PubMed] [Google Scholar]
- Segler M. H. S., Kogej T., Tyrchan C., Waller M. P.. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks. ACS Cent. Sci. 2018;4(1):120–131. doi: 10.1021/acscentsci.7b00512. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merk D., Friedrich L., Grisoni F., Schneider G.. De Novo Design of Bioactive Small Molecules by Artificial Intelligence. Mol. Inf. 2018;37(1–2):1700153. doi: 10.1002/minf.201700153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merk D., Grisoni F., Friedrich L., Schneider G.. Tuning Artificial Intelligence on the de Novo Design of Natural-Product-Inspired Retinoid X Receptor Modulators. Commun. Chem. 2018;1(1):68. doi: 10.1038/s42004-018-0068-1. [DOI] [Google Scholar]
- Ballarotto M., Willems S., Stiller T., Nawa F., Marschner J. A., Grisoni F., Merk D.. De Novo Design of Nurr1 Agonists via Fragment-Augmented Generative Deep Learning in Low-Data Regime. J. Med. Chem. 2023;66(12):8170–8177. doi: 10.1021/acs.jmedchem.3c00485. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grisoni F., Huisman B. J. H., Button A. L., Moret M., Atz K., Merk D., Schneider G.. Combining Generative Artificial Intelligence and On-Chip Synthesis for de Novo Drug Design. Sci. Adv. 2021;7(24):3338. doi: 10.1126/sciadv.abg3338. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Moret M., Pachon Angona I., Cotos L., Yan S., Atz K., Brunner C., Baumgartner M., Grisoni F., Schneider G.. Leveraging Molecular Structure and Bioactivity with Chemical Language Models for de Novo Drug Design. Nat. Commun. 2023;14(1):114. doi: 10.1038/s41467-022-35692-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Moret M., Helmstädter M., Grisoni F., Schneider G., Merk D.. Beam Search for Automated Design and Scoring of Novel ROR Ligands with Machine Intelligence. Angew. Chem., Int. Ed. 2021;60(35):19477–19482. doi: 10.1002/anie.202104405. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thomas M., Matricon P. G., Gillespie R. J., Napiórkowska M., Neale H., Mason J. S., Brown J., Fieldhouse C., Swain N. A., Geng T., O’Boyle N. M., Deflorian F., Bender A., de Graaf C.. Modern Hit-Finding with Structure-Guided de Novo Design: Identification of Novel Nanomolar A2A Receptor Ligands Using Reinforcement Learning. ChemRxiv. 2024:ChemRxiv: wh7zw. doi: 10.26434/chemrxiv-2024-wh7zw. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yuan W., Jiang D., Nambiar D. K., Liew L. P., Hay M. P., Bloomstein J., Lu P., Turner B., Le Q.-T., Tibshirani R., Khatri P., Moloney M. G., Koong A. C.. Chemical Space Mimicry for Drug Discovery. J. Chem. Inf. Model. 2017;57(4):875–882. doi: 10.1021/acs.jcim.6b00754. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Özçelik R., de Ruiter S., Criscuolo E., Grisoni F.. Chemical Language Modeling with Structured State Space Sequence Models. Nat. Commun. 2024;15(1):6176. doi: 10.1038/s41467-024-50469-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hoogeboom, E. ; Satorras, V. G. ; Vignac, C. ; Welling, M. . Equivariant Diffusion for Molecule Generation in 3D. In Proceedings of the 39th International Conference on Machine Learning; Proceedings of Machine Learning Research, 2022, pp 8867–8887. [Google Scholar]
- Huang L., Xu T., Yu Y., Zhao P., Chen X., Han J., Xie Z., Li H., Zhong W., Wong K.-C., Zhang H.. A Dual Diffusion Model Enables 3D Molecule Generation and Lead Optimization Based on Target Pockets. Nat. Commun. 2024;15(1):2657. doi: 10.1038/s41467-024-46569-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Svensson H. G., Tyrchan C., Engkvist O., Chehreghani M. H.. Diversity-Aware Reinforcement Learning for de Novo Drug Design. arXiv. 2024:arXiv: 2410.10431. doi: 10.48550/arXiv.2410.10431. [DOI] [Google Scholar]
- Polykovskiy D., Zhebrak A., Sanchez-Lengeling B., Golovanov S., Tatanov O., Belyaev S., Kurbanov R., Artamonov A., Aladinskiy V., Veselov M., Kadurin A., Johansson S., Chen H., Nikolenko S., Aspuru-Guzik A., Zhavoronkov A.. Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. Front. Pharmacol. 2020;11:565644. doi: 10.3389/fphar.2020.565644. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langevin M., Vuilleumier R., Bianciotto M.. Explaining and Avoiding Failure Modes in Goal-Directed Generation of Small Molecules. J. Cheminf. 2022;14(1):20. doi: 10.1186/s13321-022-00601-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Renz P., Van Rompaey D., Wegner J. K., Hochreiter S., Klambauer G.. On Failure Modes in Molecule Generation and Optimization. Drug Discovery Today Technol. 2019;32–33:55–63. doi: 10.1016/j.ddtec.2020.09.003. [DOI] [PubMed] [Google Scholar]
- Gao W., Coley C. W.. The Synthesizability of Molecules Proposed by Generative Models. J. Chem. Inf. Model. 2020;60(12):5714–5723. doi: 10.1021/acs.jcim.0c00174. [DOI] [PubMed] [Google Scholar]
- Luo S., Gao W., Wu Z., Peng J., Coley C. W., Ma J.. Projecting Molecules into Synthesizable Chemical Spaces. arXiv. 2024:arXiv: 2406.04628. doi: 10.48550/arXiv.2406.04628. [DOI] [Google Scholar]
- Ritter, C. An Early History of Alexander Crum Brown’s Graphical Formulas. In Tools and Modes of Representation in the Laboratory Sciences; Klein, U. , Ed.; Springer Netherlands: Dordrecht, 2001; pp 35–46. 10.1007/978-94-015-9737-1_3. [DOI] [Google Scholar]
- Gass G.. Spheres of Influence: Illustration, Notation, and John Dalton’s Conceptual Toolbox, 1803–1835. Ann. Sci. 2007;64(3):349–382. doi: 10.1080/00033790601148668. [DOI] [Google Scholar]
- Kekulé A.. Ueber Die Constitution Und Die Metamorphosen Der Chemischen Verbindungen Und Über Die Chemische Natur Des Kohlenstoffs. Justus Liebigs Ann. Chem. 1858;106(2):129–159. doi: 10.1002/jlac.18581060202. [DOI] [Google Scholar]
- Bohr N. I.. On the Constitution of Atoms and Molecules. London Edinb. Dublin Philos. Mag. J. Sci. 1913;26(151):1–25. doi: 10.1080/14786441308634955. [DOI] [Google Scholar]
- Hoffmann R., Laszlo P.. Representation in Chemistry. Angew. Chem., Int. Ed. Engl. 1991;30(1):1–16. doi: 10.1002/anie.199100013. [DOI] [Google Scholar]
- Kekulé A.. Sur La Constitution Des Substances Aromatiques. Bull. Mens. Société Chim. Paris. 1865;3:98. [Google Scholar]
- Muller P.. Glossary of Terms Used in Physical Organic Chemistry (IUPAC Recommendations 1994) Pure Appl. Chem. 1994;66(5):1077–1184. doi: 10.1351/pac199466051077. [DOI] [Google Scholar]
- Wiswesser W. J.. 107 Years of Line-Formula Notations (1861–1968) J. Chem. Doc. 1968;8(3):146–150. doi: 10.1021/c160030a007. [DOI] [Google Scholar]
- Weininger D.. SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Comput. Sci. 1988;28(1):31–36. doi: 10.1021/ci00057a005. [DOI] [Google Scholar]
- Atz K., Grisoni F., Schneider G.. Geometric Deep Learning on Molecular Representations. Nat. Mach. Intell. 2021;3(12):1023–1032. doi: 10.1038/s42256-021-00418-8. [DOI] [Google Scholar]
- LeCun Y., Bengio Y., Hinton G.. Deep Learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539. [DOI] [PubMed] [Google Scholar]
- Li Z., Jiang M., Wang S., Zhang S.. Deep Learning Methods for Molecular Representation and Property Prediction. Drug Discovery Today. 2022;27(12):103373. doi: 10.1016/j.drudis.2022.103373. [DOI] [PubMed] [Google Scholar]
- O’Boyle N., Dalke A.. DeepSMILES: An Adaptation of SMILES for Use in Machine-Learning of Chemical Structures. ChemRxiv. 2018:ChemRxiv: 7097960. doi: 10.26434/chemrxiv.7097960.v1. [DOI] [Google Scholar]
- Arus-Pous J., Johansson S. V., Prykhodko O., Bjerrum E. J., Tyrchan C., Reymond J. L., Chen H., Engkvist O.. Randomized SMILES Strings Improve the Quality of Molecular Generative Models. J. Cheminf. 2019;11(1):71. doi: 10.1186/s13321-019-0393-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Krenn M., Häse F., Nigam A., Friederich P., Aspuru-Guzik A.. Self-Referencing Embedded Strings (SELFIES): A 100% Robust Molecular String Representation. Mach Learn Sci. Technol. 2020;1(4):045024. doi: 10.1088/2632-2153/aba947. [DOI] [Google Scholar]
- Sakano K., Furui K., Ohue M.. NPGPT: Natural Product-like Compound Generation with GPT-Based Chemical Language Models. J. Supercomput. 2025;81(1):352. doi: 10.1007/s11227-024-06860-w. [DOI] [Google Scholar]
- Skinnider M. A.. Invalid SMILES Are Beneficial Rather than Detrimental to Chemical Language Models. Nat. Mach. Intell. 2024;6(4):437–448. doi: 10.1038/s42256-024-00821-x. [DOI] [Google Scholar]
- Noutahi E., Gabellini C., Craig M., Lim J. S. C., Tossou P.. Gotta Be SAFE: A New Framework for Molecular Design. Digital Discovery. 2024;3(4):796–804. doi: 10.1039/D4DD00019F. [DOI] [Google Scholar]
- Cheng A. H., Cai A., Miret S., Malkomes G., Phielipp M., Aspuru-Guzik A.. Group SELFIES: A Robust Fragment-Based Molecular String Representation. Digital Discovery. 2023;2(3):748–758. doi: 10.1039/D3DD00012E. [DOI] [Google Scholar]
- Mastrolorito F., Ciriaco F., Togo M. V., Gambacorta N., Trisciuzzi D., Altomare C. D., Amoroso N., Grisoni F., Nicolotti O.. fragSMILES as a Chemical String Notation for Advanced Fragment and Chirality Representation. Commun. Chem. 2025;8(1):1–9. doi: 10.1038/s42004-025-01423-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wojtuch A., Danel T., Podlewska S., Maziarka Ł.. Extended Study on Atomic Featurization in Graph Neural Networks for Molecular Property Prediction. J. Cheminf. 2023;15(1):81. doi: 10.1186/s13321-023-00751-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garg V.. Generative AI for Graph-Based Drug Design: Recent Advances and the Way Forward. Curr. Opin. Struct. Biol. 2024;84:102769. doi: 10.1016/j.sbi.2023.102769. [DOI] [PubMed] [Google Scholar]
- Özçelik R., van Tilborg D., Jiménez-Luna J., Grisoni F.. Structure-Based Drug Discovery with Deep Learning. ChemBioChem. 2023;24(13):e202200776. doi: 10.1002/cbic.202200776. [DOI] [PubMed] [Google Scholar]
- Zhou Z., Kearnes S., Li L., Zare R. N., Riley P.. Optimization of Molecules via Deep Reinforcement Learning. Sci. Rep. 2019;9(1):10752. doi: 10.1038/s41598-019-47148-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jin, W. ; Barzilay, R. ; Jaakkola, T. . Junction Tree Variational Autoencoder for Molecular Graph Generation. In Proceedings of the 35th International Conference on Machine Learning; PMLR, 2018; pp 2323–2332. [Google Scholar]
- Isert C., Atz K., Schneider G.. Structure-Based Drug Design with Geometric Deep Learning. Curr. Opin. Struct. Biol. 2023;79:102548. doi: 10.1016/j.sbi.2023.102548. [DOI] [PubMed] [Google Scholar]
- Huang H., Sun L., Du B., Lv W.. Learning Joint 2-D and 3-D Graph Diffusion Models for Complete Molecule Generation. IEEE Trans. Neural Netw. Learn. Syst. 2024;35(9):11857–11871. doi: 10.1109/TNNLS.2024.3416328. [DOI] [PubMed] [Google Scholar]
- Liu M., Luo Y., Uchino K., Maruhashi K., Ji S.. Generating 3D Molecules for Target Protein Binding. arXiv. 2022:arXiv: 2204.09410. doi: 10.48550/arXiv.2204.09410. [DOI] [Google Scholar]
- Zhang O., Huang Y., Cheng S., Yu M., Zhang X., Lin H., Zeng Y., Wang M., Wu Z., Zhao H., Zhang Z., Hua C., Kang Y., Cui S., Pan P., Hsieh C.-Y., Hou T.. Deep Geometry Handling and Fragment-Wise Molecular 3D Graph Generation. arXiv. 2024:arXiv: 2404.00014. doi: 10.48550/arXiv.2404.00014. [DOI] [Google Scholar]
- Wu L., Gong C., Liu X., Ye M., Liu Q.. Diffusion-Based Molecule Generation with Informative Prior Bridges. Adv. Neural Inf. Process. Syst. 2022;35:36533–36545. [Google Scholar]
- O’Boyle N. M., Banck M., James C. A., Morley C., Vandermeersch T., Hutchison G. R.. Open Babel: An Open Chemical Toolbox. J. Cheminf. 2011;3(1):33. doi: 10.1186/1758-2946-3-33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gainza P., Sverrisson F., Monti F., Rodolà E., Boscaini D., Bronstein M. M., Correia B. E.. Deciphering Interaction Fingerprints from Protein Molecular Surfaces Using Geometric Deep Learning. Nat. Methods. 2020;17(2):184–192. doi: 10.1038/s41592-019-0666-6. [DOI] [PubMed] [Google Scholar]
- Sverrisson, F. ; Feydy, J. ; Correia, B. E. ; Bronstein, M. M. . Fast End-to-End Learning on Protein Surfaces. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp 15267–15276. [Google Scholar]
- Douguet D., Payan F.. SENSAAS (SENsitive Surface As A Shape): Utilizing Open-Source Algorithms for 3D Point Cloud Alignment of Molecules. arXiv. 2019:arXiv: 1908.11267. doi: 10.48550/arXiv.1908.11267. [DOI] [Google Scholar]
- Yan X., Lu Y., Li Z., Wei Q., Gao X., Wang S., Wu S., Cui S.. PointSite: A Point Cloud Segmentation Tool for Identification of Protein Ligand Binding Atoms. J. Chem. Inf. Model. 2022;62(11):2835–2845. doi: 10.1021/acs.jcim.1c01512. [DOI] [PubMed] [Google Scholar]
- Liu Q., Wang P.-S., Zhu C., Gaines B. B., Zhu T., Bi J., Song M.. OctSurf: Efficient Hierarchical Voxel-Based Molecular Surface Representation for Protein-Ligand Affinity Prediction. J. Mol. Graph. Model. 2021;105:107865. doi: 10.1016/j.jmgm.2021.107865. [DOI] [PubMed] [Google Scholar]
- Mylonas S. K., Axenopoulos A., Daras P.. DeepSurf: A Surface-Based Deep Learning Approach for the Prediction of Ligand Binding Sites on Proteins. Bioinformatics. 2021;37(12):1681–1690. doi: 10.1093/bioinformatics/btab009. [DOI] [PubMed] [Google Scholar]
- Skalic M., Jiménez J., Sabbadin D., De Fabritiis G.. Shape-Based Generative Modeling for de Novo Drug Design. J. Chem. Inf. Model. 2019;59(3):1205–1214. doi: 10.1021/acs.jcim.8b00706. [DOI] [PubMed] [Google Scholar]
- Skalic M., Sabbadin D., Sattarov B., Sciabola S., De Fabritiis G.. From Target to Drug: Generative Modeling for the Multimodal Structure-Based Ligand Design. Mol. Pharmaceutics. 2019;16(10):4282–4291. doi: 10.1021/acs.molpharmaceut.9b00634. [DOI] [PubMed] [Google Scholar]
- Marchand A., Buckley S., Schneuing A., Pacesa M., Elia M., Gainza P., Elizarova E., Neeser R. M., Lee P.-W., Reymond L., Miao Y., Scheller L., Georgeon S., Schmidt J., Schwaller P., Maerkl S. J., Bronstein M., Correia B. E.. Targeting Protein-Ligand Neosurfaces with a Generalizable Deep Learning Tool. Nature. 2025;639:522–531. doi: 10.1038/s41586-024-08435-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Özçelik R., Grisoni F.. A Hitchhiker’s Guide to Deep Chemical Language Processing for Bioactivity Prediction. Digital Discovery. 2025;4(2):316–325. doi: 10.1039/D4DD00311J. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Atz K., Cotos Muñoz L., Isert C., Håkansson M., Focht D., Nippa D. F., Hilleke M., Iff M., Ledergerber J., Schiebroek C. C. G.. et al. Deep interactome learning for de novo drug design. ChemRxiv. 2023:ChemRxiv: cbq9k. doi: 10.26434/chemrxiv-2023-cbq9k. [DOI] [Google Scholar]
- Flam-Shepherd D., Zhu K., Aspuru-Guzik A.. Language Models Can Learn Complex Molecular Distributions. Nat. Commun. 2022;13(1):3293. doi: 10.1038/s41467-022-30839-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Luo S., Guan J., Ma J., Peng J.. A 3D Generative Model for Structure-Based Drug Design. arXiv. 2022:arXiv: 2203.10446. doi: 10.48550/arXiv.2203.10446. [DOI] [Google Scholar]
- Peng, X. ; Luo, S. ; Guan, J. ; Xie, Q. ; Peng, J. ; Ma, J. . Pocket2mol: Efficient Molecular Sampling Based on 3d Protein Pockets. In International Conference on Machine Learning; PMLR, 2022; pp 17644–17655. [Google Scholar]
- Guan J., Qian W. W., Peng X., Su Y., Peng J., Ma J.. 3d Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction. arXiv. 2023:arXiv: 2303.03543. doi: 10.48550/arXiv.2303.03543. [DOI] [Google Scholar]
- Schneuing A., Harris C., Du Y., Didi K., Jamasb A., Igashov I., Du W., Gomes C., Blundell T. L., Lio P., Welling M., Bronstein M., Correia B.. Structure-Based Drug Design with Equivariant Diffusion Models. Nat. Comput. Sci. 2024;4(12):899–909. doi: 10.1038/s43588-024-00737-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang O., Zhang J., Jin J., Zhang X., Hu R., Shen C., Cao H., Du H., Kang Y., Deng Y., Liu F., Chen G., Hsieh C.-Y., Hou T.. ResGen Is a Pocket-Aware 3D Molecular Generation Model Based on Parallel Multiscale Modelling. Nat. Mach. Intell. 2023;5(9):1020–1030. doi: 10.1038/s42256-023-00712-7. [DOI] [Google Scholar]
- Baillif B., Cole J., McCabe P., Bender A.. Benchmarking Structure-Based Three-Dimensional Molecular Generative Models Using GenBench3D: Ligand Conformation Quality Matters. arXiv. 2024:arXiv: 2407.04424. doi: 10.48550/arXiv.2407.04424. [DOI] [Google Scholar]
- Buttenschoen M. M., Morris G. M., Deane C.. PoseBusters: AI-Based Docking Methods Fail to Generate Physically Valid Poses or Generalise to Novel Sequences. Chem. Sci. 2024;15:3130. doi: 10.1039/D3SC04185A. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peng X., Guan J., Liu Q., Ma J. MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation. arXiv. 2023:arXiv: 2305.07508. doi: 10.48550/arXiv.2305.07508.
- Vost L., Chenthamarakshan V., Das P., Deane C. M. Improving Structural Plausibility in Diffusion-Based 3D Molecule Generation via Property-Conditioned Training with Distorted Molecules. Digital Discovery. 2025;4(4):1092–1099. doi: 10.1039/D4DD00331D.
- Chen Z., Peng B., Zhai T., Adu-Ampratwum D., Ning X. Generating 3D Binding Molecules Using Shape-Conditioned Diffusion Models with Guidance. arXiv. 2025:arXiv: 2502.06027. doi: 10.48550/arXiv.2502.06027.
- Gómez-Bombarelli R., Wei J. N., Duvenaud D., Hernández-Lobato J. M., Sánchez-Lengeling B., Sheberla D., Aguilera-Iparraguirre J., Hirzel T. D., Adams R. P., Aspuru-Guzik A. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Cent. Sci. 2018;4(2):268–276. doi: 10.1021/acscentsci.7b00572.
- Bagal V., Aggarwal R., Vinod P. K., Priyakumar U. D. MolGPT: Molecular Generation Using a Transformer-Decoder Model. J. Chem. Inf. Model. 2022;62(9):2064–2076. doi: 10.1021/acs.jcim.1c00600.
- Bjerrum E. J. SMILES Enumeration as Data Augmentation for Neural Network Modeling of Molecules. arXiv. 2017:arXiv: 1703.07076. doi: 10.48550/arXiv.1703.07076.
- Moret M., Friedrich L., Grisoni F., Merk D., Schneider G. Generative Molecular Design in Low Data Regimes. Nat. Mach. Intell. 2020;2(3):171–180. doi: 10.1038/s42256-020-0160-y.
- Brinkmann H., Argante A., Steege H., Grisoni F. Going beyond SMILES Enumeration for Generative Deep Learning in Low Data Regimes. ChemRxiv. 2025:ChemRxiv: fdnnq. doi: 10.26434/chemrxiv-2025-fdnnq.
- Hopfield J. J. Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. Natl. Acad. Sci. U.S.A. 1982;79(8):2554–2558. doi: 10.1073/pnas.79.8.2554.
- Hochreiter S., Schmidhuber J. Long Short-Term Memory. Neural Comput. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735.
- Chung J., Gulcehre C., Cho K., Bengio Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv. 2014:arXiv: 1412.3555. doi: 10.48550/arXiv.1412.3555.
- Vaswani, A.; et al. Attention Is All You Need. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2017.
- Devlin J., Chang M.-W., Lee K., Toutanova K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv. 2019:arXiv: 1810.04805. doi: 10.48550/arXiv.1810.04805.
- Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training, 2018.
- Öztürk H., Özgür A., Schwaller P., Laino T., Ozkirimli E. Exploring Chemical Space Using Natural Language Processing Methodologies for Drug Discovery. Drug Discovery Today. 2020;25(4):689–705. doi: 10.1016/j.drudis.2020.01.020.
- Lim, S.; Lee, Y. O. Predicting Chemical Properties Using Self-Attention Multi-Task Learning Based on SMILES Representation. In 2020 25th International Conference on Pattern Recognition (ICPR); IEEE, 2021; pp 3146–3153.
- Jiang J., Zhang R., Ma J., Liu Y., Yang E., Du S., Zhao Z., Yuan Y. TranGRU: Focusing on Both the Local and Global Information of Molecules for Molecular Property Prediction. Appl. Intell. 2023;53(12):15246–15260. doi: 10.1007/s10489-022-04280-y.
- Chen Y., Wang Z., Zeng X., Li Y., Li P., Ye X., Sakurai T. Molecular Language Models: RNNs or Transformer? Briefings Funct. Genomics. 2023;22:392–400. doi: 10.1093/bfgp/elad012.
- Gu A., Goel K., Ré C. Efficiently Modeling Long Sequences with Structured State Spaces. arXiv. 2022:arXiv: 2111.00396. doi: 10.48550/arXiv.2111.00396.
- Kingma D. P., Welling M. Auto-Encoding Variational Bayes. arXiv. 2022:arXiv: 1312.6114. doi: 10.48550/arXiv.1312.6114.
- Pravalphruekul N., Piriyajitakonkij M., Phunchongharn P., Piyayotai S. De Novo Design of Molecules with Multiaction Potential from Differential Gene Expression Using Variational Autoencoder. J. Chem. Inf. Model. 2023;63(13):3999–4011. doi: 10.1021/acs.jcim.3c00355.
- Blaschke T., Olivecrona M., Engkvist O., Bajorath J., Chen H. Application of Generative Autoencoder in De Novo Molecular Design. Mol. Inf. 2018;37(1–2):1700123. doi: 10.1002/minf.201700123.
- Sattarov B., Baskin I. I., Horvath D., Marcou G., Bjerrum E. J., Varnek A. De Novo Molecular Design by Combining Deep Autoencoder Recurrent Neural Networks with Generative Topographic Mapping. J. Chem. Inf. Model. 2019;59(3):1182–1196. doi: 10.1021/acs.jcim.8b00751.
- Prykhodko O., Johansson S. V., Kotsias P. C., Arus-Pous J., Bjerrum E. J., Engkvist O., Chen H. A de novo molecular generation method using latent vector based generative adversarial network. J. Cheminf. 2019;11(1):74. doi: 10.1186/s13321-019-0397-9.
- Abbasi M., Santos B. P., Pereira T. C., Sofia R., Monteiro N. R. C., Simões C. J. V., Brito R. M. M., Ribeiro B., Oliveira J. L., Arrais J. P. Designing Optimized Drug Candidates with Generative Adversarial Network. J. Cheminf. 2022;14(1):40. doi: 10.1186/s13321-022-00623-6.
- Guimaraes G. L., Sanchez-Lengeling B., Outeiral C., Farias P. L. C., Aspuru-Guzik A. Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models. arXiv. 2018:arXiv: 1705.10843. doi: 10.48550/arXiv.1705.10843.
- Goodfellow, I.; et al. Generative Adversarial Nets. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2014.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the 34th International Conference on Machine Learning; PMLR, 2017; pp 214–223.
- Jiang, Z.; Wang, Z.; Zhang, J.; Wub, M.; Li, C.; Yamanishi, Y. Mode Collapse Alleviation of Reinforcement Learning-Based GANs in Drug Design. In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2023; pp 3045–3052. doi: 10.1109/BIBM58861.2023.10385948.
- Kossale, Y.; Airaj, M.; Darouichi, A. Mode Collapse in Generative Adversarial Networks: An Overview. In 2022 8th International Conference on Optimization and Applications (ICOA); 2022; pp 1–6. doi: 10.1109/ICOA55659.2022.9934291.
- Sohl-Dickstein, J.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning; PMLR, 2015; pp 2256–2265.
- Nie S., Zhu F., You Z., Zhang X., Ou J., Hu J., Zhou J., Lin Y., Wen J.-R., Li C. Large Language Diffusion Models. arXiv. 2025:arXiv: 2502.09992. doi: 10.48550/arXiv.2502.09992.
- Zhu Y., Zhao Y. Diffusion Models in NLP: A Survey. arXiv. 2023:arXiv: 2303.07576. doi: 10.48550/arXiv.2303.07576.
- Lyu Y., Luo T., Shi J., Hollon T. C., Lee H. Fine-Grained Text Style Transfer with Diffusion-Based Language Models. arXiv. 2023:arXiv: 2305.19512. doi: 10.48550/arXiv.2305.19512.
- He Z., Sun T., Wang K., Huang X., Qiu X. DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models. arXiv. 2022:arXiv: 2211.15029. doi: 10.48550/arXiv.2211.15029.
- Lee S., Kreis K., Veccham S. P., Liu M., Reidenbach D., Peng Y., Paliwal S., Nie W., Vahdat A. GenMol: A Drug Discovery Generalist with Discrete Diffusion. arXiv. 2025:arXiv: 2501.06158. doi: 10.48550/arXiv.2501.06158.
- Chang J., Ye J. C. LDMol: Text-Conditioned Molecule Diffusion Model Leveraging Chemically Informative Latent Space. arXiv. 2024:arXiv: 2405.17829. doi: 10.48550/arXiv.2405.17829.
- Gong H., Liu Q., Wu S., Wang L. Text-Guided Molecule Generation with Diffusion Language Model. arXiv. 2024:arXiv: 2402.13040. doi: 10.48550/arXiv.2402.13040.
- Wang Z., Chen Y., Guo X., Li Y., Li P., Li C., Ye X., Sakurai T. DiffSeqMol: A Non-Autoregressive Diffusion-Based Approach for Molecular Sequence Generation and Optimization. Curr. Bioinf. 2025;20(1):46–58. doi: 10.2174/0115748936285493240307071916.
- Liu, Q.; Allamanis, M.; Brockschmidt, M.; Gaunt, A. Constrained Graph Variational Autoencoders for Molecule Design. In Advances in Neural Information Processing Systems, 2018.
- Simonovsky, M.; Komodakis, N. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders. In Artificial Neural Networks and Machine Learning – ICANN 2018: 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4–7, 2018, Proceedings, Part I; Springer, 2018; pp 412–422.
- Kwon Y., Yoo J., Choi Y.-S., Son W.-J., Lee D., Kang S. Efficient Learning of Non-Autoregressive Graph Variational Autoencoders for Molecular Graph Generation. J. Cheminf. 2019;11(1):70. doi: 10.1186/s13321-019-0396-x.
- Xia X., Hu J., Wang Y., Zhang L., Liu Z. Graph-Based Generative Models for de Novo Drug Design. Drug Discovery Today Technol. 2019;32:45–53. doi: 10.1016/j.ddtec.2020.11.004.
- Abate C., Decherchi S., Cavalli A. Graph Neural Networks for Conditional de Novo Drug Design. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2023;13(4):e1651. doi: 10.1002/wcms.1651.
- De Cao N., Kipf T. MolGAN: An Implicit Generative Model for Small Molecular Graphs. arXiv. 2022:arXiv: 1805.11973. doi: 10.48550/arXiv.1805.11973.
- Macedo B., Ribeiro Vaz I., Taveira Gomes T. MedGAN: Optimized Generative Adversarial Network with Graph Convolutional Networks for Novel Molecule Design. Sci. Rep. 2024;14(1):1212. doi: 10.1038/s41598-023-50834-6.
- Ochiai T., Inukai T., Akiyama M., Furui K., Ohue M., Matsumori N., Inuki S., Uesugi M., Sunazuka T., Kikuchi K., et al. Variational Autoencoder-Based Chemical Latent Space for Large Molecular Structures with 3D Complexity. Commun. Chem. 2023;6(1):249. doi: 10.1038/s42004-023-01054-6.
- Rezende, D.; Mohamed, S. Variational Inference with Normalizing Flows. In Proceedings of the 32nd International Conference on Machine Learning; PMLR, 2015; pp 1530–1538.
- Zang, C.; Wang, F. MoFlow: An Invertible Flow Model for Generating Molecular Graphs. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; KDD '20; Association for Computing Machinery: New York, NY, USA, 2020; pp 617–626. doi: 10.1145/3394486.3403104.
- Verma, Y.; Kaski, S.; Heinonen, M.; Garg, V. Modular Flows: Differential Molecular Generation. Advances in Neural Information Processing Systems 2022, 35, 12409–12421.
- Luo, Y.; Yan, K.; Ji, S. GraphDF: A Discrete Flow Model for Molecular Graph Generation. In Proceedings of the 38th International Conference on Machine Learning; PMLR, 2021; pp 7192–7203.
- Shi C., Xu M., Zhu Z., Zhang W., Zhang M., Tang J. GraphAF: A Flow-Based Autoregressive Model for Molecular Graph Generation. arXiv. 2020:arXiv: 2001.09382. doi: 10.48550/arXiv.2001.09382.
- Wang L., Song C., Liu Z., Rong Y., Liu Q., Wu S., Wang L. Diffusion Models for Molecules: A Survey of Methods and Tasks. arXiv. 2025:arXiv: 2502.09511. doi: 10.48550/arXiv.2502.09511.
- Zhang, M.; Qamar, M.; Kang, T.; Jung, Y.; Zhang, C.; Bae, S.-H.; Zhang, C. A Survey on Graph Diffusion Models: Generative AI in Science for Molecule, Protein and Material. 2023. doi: 10.13140/RG.2.2.26493.64480.
- Jo J., Lee S., Hwang S. J. Score-Based Generative Modeling of Graphs via the System of Stochastic Differential Equations. arXiv. 2022:arXiv: 2202.02514. doi: 10.48550/arXiv.2202.02514.
- Lee S., Jo J., Hwang S. J. Exploring Chemical Space with Score-Based Out-of-Distribution Generation. arXiv. 2023:arXiv: 2206.07632. doi: 10.48550/arXiv.2206.07632.
- Liu G., Xu J., Luo T., Jiang M. Graph Diffusion Transformers for Multi-Conditional Molecular Generation. arXiv. 2024:arXiv: 2401.13858. doi: 10.48550/arXiv.2401.13858.
- Vignac C., Krawczuk I., Siraudin A., Wang B., Cevher V., Frossard P. DiGress: Discrete Denoising Diffusion for Graph Generation. arXiv. 2023:arXiv: 2209.14734. doi: 10.48550/arXiv.2209.14734.
- Pombala P., Grossmann G., Wolf V. Exploring Molecule Generation Using Latent Space Graph Diffusion. arXiv. 2025:arXiv: 2501.03696. doi: 10.48550/arXiv.2501.03696.
- Baillif B., Cole J., McCabe P., Bender A. Deep Generative Models for 3D Molecular Structure. Curr. Opin. Struct. Biol. 2023;80:102566. doi: 10.1016/j.sbi.2023.102566.
- Xu M., Yu L., Song Y., Shi C., Ermon S., Tang J. GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation. arXiv. 2022:arXiv: 2203.02923. doi: 10.48550/arXiv.2203.02923.
- Jing, B.; Corso, G.; Chang, J.; Barzilay, R.; Jaakkola, T. Torsional Diffusion for Molecular Conformer Generation. In Advances in Neural Information Processing Systems, 2022, pp 24240–24253.
- You, Y.; Zhou, R.; Park, J.; Xu, H.; Tian, C.; Wang, Z.; Shen, Y. Latent 3D Graph Diffusion, 2023.
- Xu, M.; Powers, A. S.; Dror, R. O.; Ermon, S.; Leskovec, J. Geometric Latent Diffusion Models for 3D Molecule Generation. In Proceedings of the 40th International Conference on Machine Learning; PMLR, 2023; pp 38592–38610.
- Feng S., Ni Y., Lu Y., Ma Z.-M., Ma W.-Y., Lan Y. UniGEM: A Unified Approach to Generation and Property Prediction for Molecules. arXiv. 2025:arXiv: 2410.10516. doi: 10.48550/arXiv.2410.10516.
- Satorras V. G., Hoogeboom E., Welling M. E(n) Equivariant Graph Neural Networks. arXiv. 2022:arXiv: 2102.09844. doi: 10.48550/arXiv.2102.09844.
- Vignac C., Osman N., Toni L., Frossard P. MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation. arXiv. 2023:arXiv: 2302.09048. doi: 10.48550/arXiv.2302.09048.
- Xu C., Wang H., Wang W., Zheng P., Chen H. Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation. Proc. AAAI Conf. Artif. Intell. 2024;38(1):338–346. doi: 10.1609/aaai.v38i1.27787.
- Peng X., Zhu F. Hitting Stride by Degrees: Fine Grained Molecular Generation via Diffusion Model. Expert Syst. Appl. 2024;244:122949. doi: 10.1016/j.eswa.2023.122949.
- Hua C., Luan S., Xu M., Ying R., Fu J., Ermon S., Precup D. MUDiff: Unified Diffusion for Complete Molecule Generation. arXiv. 2024:arXiv: 2304.14621. doi: 10.48550/arXiv.2304.14621.
- Ramakrishnan R., Dral P. O., Rupp M., von Lilienfeld O. A. Quantum Chemistry Structures and Properties of 134 Kilo Molecules. Sci. Data. 2014;1(1):140022. doi: 10.1038/sdata.2014.22.
- Axelrod S., Gómez-Bombarelli R. GEOM, Energy-Annotated Molecular Conformations for Property Prediction and Molecular Generation. Sci. Data. 2022;9(1):185. doi: 10.1038/s41597-022-01288-4.
- Nikitin F., Dunn I., Koes D. R., Isayev O. GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation. arXiv. 2025:arXiv: 2505.00169. doi: 10.48550/arXiv.2505.00169.
- Nguyen T., Karolak A. Transformer Graph Variational Autoencoder for Generative Molecular Design. Biophys. J. 2025. doi: 10.1016/j.bpj.2025.01.022.
- Gao J., Li P., Chen Z., Zhang J. A Survey on Deep Learning for Multimodal Data Fusion. Neural Comput. 2020;32:829. doi: 10.1162/neco_a_01273.
- Cheng J., Pan X., Fang Y., Yang K., Xue Y., Yan Q., Yuan Y. GexMolGen: Cross-Modal Generation of Hit-like Molecules via Large Language Model Encoding of Gene Expression Signatures. Briefings Bioinf. 2024;25(6):bbae525. doi: 10.1093/bib/bbae525.
- Atz K., Cotos L., Isert C., Håkansson M., Focht D., Hilleke M., Nippa D. F., Iff M., Ledergerber J., Schiebroek C. C. G., Romeo V., Hiss J. A., Merk D., Schneider P., Kuhn B., Grether U., Schneider G. Prospective de Novo Drug Design with Deep Interactome Learning. Nat. Commun. 2024;15(1):3408. doi: 10.1038/s41467-024-47613-w.
- Lv L., Li H., Wang Y., Yan Z., Chen Z., Lin Z., Yuan L., Tian Y. Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding. arXiv. 2024:arXiv: 2412.20888. doi: 10.48550/arXiv.2412.20888.
- Zheng K., Liang S., Yang J., Feng B., Liu Z., Ju W., Xiao Z., Zhang M. SMI-Editor: Edit-Based SMILES Language Model with Fragment-Level Supervision. arXiv. 2024:arXiv: 2412.05569. doi: 10.48550/arXiv.2412.05569.
- Rong, Y.; Bian, Y.; Xu, T.; Xie, W.; Wei, Y.; Huang, W.; Huang, J. Self-Supervised Graph Transformer on Large-Scale Molecular Data. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2020, pp 12559–12571.
- Zhang, Z.; Liu, Q.; Wang, H.; Lu, C.; Lee, C.-K. Motif-Based Graph Self-Supervised Learning for Molecular Property Prediction. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2021, pp 15870–15882.
- Wang Y., Wang J., Cao Z., Barati Farimani A. Molecular Contrastive Learning of Representations via Graph Neural Networks. Nat. Mach. Intell. 2022;4(3):279–287. doi: 10.1038/s42256-022-00447-x.
- Loeffler H. H., He J., Tibo A., Janet J. P., Voronov A., Mervin L. H., Engkvist O. Reinvent 4: Modern AI-Driven Generative Molecule Design. J. Cheminf. 2024;16(1):20. doi: 10.1186/s13321-024-00812-5.
- Thomas M., O’Boyle N. M., Bender A., De Graaf C. Augmented Hill-Climb Increases Reinforcement Learning Efficiency for Language-Based de Novo Molecule Generation. J. Cheminf. 2022;14(1):68. doi: 10.1186/s13321-022-00646-z.
- Dodds M., Guo J., Löhr T., Tibo A., Engkvist O., Janet J. P. Sample Efficient Reinforcement Learning with Active Learning for Molecular Design. Chem. Sci. 2024;15(11):4146–4160. doi: 10.1039/D3SC04653B.
- Olivecrona M., Blaschke T., Engkvist O., Chen H. Molecular de-novo design through deep reinforcement learning. J. Cheminf. 2017;9(1):48. doi: 10.1186/s13321-017-0235-x.
- Horwood J., Noutahi E. Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning. ACS Omega. 2020;5(51):32984–32994. doi: 10.1021/acsomega.0c04153.
- Bjerrum E. J., Margreitter C., Blaschke T., de Castro R. L.-R. Faster and More Diverse de Novo Molecular Optimization with Double-Loop Reinforcement Learning Using Augmented SMILES. arXiv. 2022:arXiv: 2210.12458. doi: 10.48550/arXiv.2210.12458.
- Liu X., Ye K., van Vlijmen H. W. T., Emmerich M. T. M., Ijzerman A. P., van Westen G. J. P. DrugEx v2: De Novo Design of Drug Molecules by Pareto-Based Multi-Objective Reinforcement Learning in Polypharmacology. J. Cheminf. 2021;13(1):85. doi: 10.1186/s13321-021-00561-9.
- Al-Jumaily A., Mukaidaisi M., Vu A., Tchagang A., Li Y. Examining Multi-Objective Deep Reinforcement Learning Frameworks for Molecular Design. Biosystems. 2023;232:104989. doi: 10.1016/j.biosystems.2023.104989.
- Ståhl N., Falkman G., Karlsson A., Mathiason G., Boström J. Deep Reinforcement Learning for Multiparameter Optimization in de Novo Drug Design. J. Chem. Inf. Model. 2019;59(7):3166–3176. doi: 10.1021/acs.jcim.9b00325.
- Guo J., Knuth F., Margreitter C., Janet J. P., Papadopoulos K., Engkvist O., Patronov A. Link-INVENT: Generative Linker Design with Reinforcement Learning. Digital Discovery. 2023;2(2):392–408. doi: 10.1039/D2DD00115B.
- Rossen L., Sirockin F., Schneider N., Grisoni F. Scaffold Hopping with Generative Reinforcement Learning. ChemRxiv. 2024:ChemRxiv: gd3j4. doi: 10.26434/chemrxiv-2024-gd3j4.
- Erikawa D., Yasuo N., Sekijima M. MERMAID: An Open Source Automated Hit-to-Lead Method Based on Deep Reinforcement Learning. J. Cheminf. 2021;13(1):94. doi: 10.1186/s13321-021-00572-6.
- Yang S., Hwang D., Lee S., Ryu S., Hwang S. J. Hit and Lead Discovery with Explorative RL and Fragment-Based Molecule Generation. arXiv. 2021:arXiv: 2110.01219. doi: 10.48550/arXiv.2110.01219.
- Kotsias P.-C., Arus-Pous J., Chen H., Engkvist O., Tyrchan C., Bjerrum E. J. Direct Steering of de Novo Molecular Generation with Descriptor Conditional Recurrent Neural Networks. Nat. Mach. Intell. 2020;2(5):254–265. doi: 10.1038/s42256-020-0174-5.
- Huang L., Yuan Z., Yan H., Sheng R., Liu L., Wang F., Xie W., Chen N., Huang F., Huang S., Wong K.-C., Zhang Y. A Unified Conditional Diffusion Framework for Dual Protein Targets-Based Bioactive Molecule Generation. IEEE Trans. Artif. Intell. 2024;5(9):4595–4606. doi: 10.1109/TAI.2024.3387402.
- Grechishnikova D. Transformer Neural Network for Protein-Specific de Novo Drug Generation as a Machine Translation Problem. Sci. Rep. 2021;11(1):321. doi: 10.1038/s41598-020-79682-4.
- Méndez-Lucio O., Baillif B., Clevert D.-A., Rouquié D., Wichard J. De Novo Generation of Hit-like Molecules from Gene Expression Signatures Using Artificial Intelligence. Nat. Commun. 2020;11(1):10. doi: 10.1038/s41467-019-13807-w.
- Das D., Chakrabarty B., Srinivasan R., Roy A. Gex2SGen: Designing Drug-like Molecules from Desired Gene Expression Signatures. J. Chem. Inf. Model. 2023;63(7):1882–1893. doi: 10.1021/acs.jcim.2c01301.
- Mokaya M., Imrie F., van Hoorn W. P., Kalisz A., Bradley A. R., Deane C. M. Testing the Limits of SMILES-Based de Novo Molecular Generation with Curriculum and Deep Reinforcement Learning. Nat. Mach. Intell. 2023;5(4):386–394. doi: 10.1038/s42256-023-00636-2.
- Yoshizawa T., Ishida S., Sato T., Ohta M., Honma T., Terayama K. A Data-Driven Generative Strategy to Avoid Reward Hacking in Multi-Objective Molecular Design. Nat. Commun. 2025;16(1):2409. doi: 10.1038/s41467-025-57582-3.
- Blaschke T., Engkvist O., Bajorath J., Chen H. Memory-Assisted Reinforcement Learning for Diverse Molecular de Novo Design. J. Cheminf. 2020;12(1):68. doi: 10.1186/s13321-020-00473-0.
- Özçelik R., Grisoni F. The Jungle of Generative Drug Discovery: Traps, Treasures, and Ways Out. arXiv. 2025:arXiv: 2501.05457. doi: 10.48550/arXiv.2501.05457.
- Tripp A., Hernández-Lobato J. M. Genetic Algorithms Are Strong Baselines for Molecule Generation. arXiv. 2023:arXiv: 2310.09267. doi: 10.48550/arXiv.2310.09267.
- Thimm M., Goede A., Hougardy S., Preissner R. Comparison of 2D Similarity and 3D Superposition. Application to Searching a Conformational Drug Database. J. Chem. Inf. Comput. Sci. 2004;44(5):1816–1822. doi: 10.1021/ci049920h.
- Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; John Wiley & Sons, 2009.
- Todeschini R., Ballabio D., Consonni V. Distances and Other Dissimilarity Measures in Chemometrics. Encycl. Anal. Chem. 2015:1–34. doi: 10.1002/9780470027318.a9438.
- Rogers D., Hahn M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010;50(5):742–754. doi: 10.1021/ci100050t.
- Bemis G. W., Murcko M. A. The Properties of Known Drugs. 1. Molecular Frameworks. J. Med. Chem. 1996;39(15):2887–2893. doi: 10.1021/jm9602928.
- Renz P., Luukkonen S., Klambauer G. Diverse Hits in De Novo Molecule Design: Diversity-Based Comparison of Goal-Directed Generators. J. Chem. Inf. Model. 2024;64(15):5756–5761. doi: 10.1021/acs.jcim.4c00519.
- Xie Y., Xu Z., Ma J., Mei Q. How Much Space Has Been Explored? Measuring the Chemical Space Covered by Databases and Machine-Generated Molecules. arXiv. 2023:arXiv: 2112.12542. doi: 10.48550/arXiv.2112.12542.
- Durant J. L., Leland B. A., Henry D. R., Nourse J. G. Reoptimization of MDL Keys for Use in Drug Discovery. J. Chem. Inf. Comput. Sci. 2002;42(6):1273–1280. doi: 10.1021/ci010132r.
- Özçelik R., De Ruiter S., Criscuolo E., Grisoni F. Chemical Language Modeling with Structured State Space Sequence Models. Nat. Commun. 2024;15(1):6176. doi: 10.1038/s41467-024-50469-9.
- Kullback S., Leibler R. A. On Information and Sufficiency. Ann. Math. Stat. 1951;22(1):79–86. doi: 10.1214/aoms/1177729694.
- Massey F. J. Jr. The Kolmogorov-Smirnov Test for Goodness of Fit. J. Am. Stat. Assoc. 1951;46(253):68–78. doi: 10.1080/01621459.1951.10500769.
- Preuer K., Renz P., Unterthiner T., Hochreiter S., Klambauer G. Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery. J. Chem. Inf. Model. 2018;58(9):1736–1741. doi: 10.1021/acs.jcim.8b00234.
- Goh, G. B.; Siegel, C.; Vishnu, A.; Hodas, N. Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; KDD '18; Association for Computing Machinery: New York, NY, USA, 2018; pp 302–310. doi: 10.1145/3219819.3219838.
- Muratov E. N., Bajorath J., Sheridan R. P., Tetko I. V., Filimonov D., Poroikov V., Oprea T. I., Baskin I. I., Varnek A., Roitberg A., Isayev O., Curtarolo S., Fourches D., Cohen Y., Aspuru-Guzik A., Winkler D. A., Agrafiotis D., Cherkasov A., Tropsha A. QSAR without Borders. Chem. Soc. Rev. 2020;49(11):3525–3564. doi: 10.1039/D0CS00098A.
- Wu F., Zhou Y., Li L., Shen X., Chen G., Wang X., Liang X., Tan M., Huang Z. Computational Approaches in Preclinical Studies on Drug Discovery and Development. Front. Chem. 2020;8:726. doi: 10.3389/fchem.2020.00726.
- Ertl P., Schuffenhauer A. Estimation of Synthetic Accessibility Score of Drug-like Molecules Based on Molecular Complexity and Fragment Contributions. J. Cheminf. 2009;1(1):8. doi: 10.1186/1758-2946-1-8.
- Coley C. W., Rogers L., Green W. H., Jensen K. F. SCScore: Synthetic Complexity Learned from a Reaction Corpus. J. Chem. Inf. Model. 2018;58(2):252–261. doi: 10.1021/acs.jcim.7b00622.
- Genheden S., Thakkar A., Chadimová V., Reymond J.-L., Engkvist O., Bjerrum E. AiZynthFinder: A Fast, Robust and Flexible Open-Source Software for Retrosynthetic Planning. J. Cheminf. 2020;12(1):70. doi: 10.1186/s13321-020-00472-1.
- Renaud J.-P., Chung C., Danielson U. H., Egner U., Hennig M., Hubbard R. E., Nar H. Biophysics in Drug Discovery: Impact, Challenges and Opportunities. Nat. Rev. Drug Discovery. 2016;15(10):679–698. doi: 10.1038/nrd.2016.123.
- Vogt M. Exploring Chemical Space – Generative Models and Their Evaluation. Artif. Intell. Life Sci. 2023;3:100064. doi: 10.1016/j.ailsci.2023.100064.
- Bender A., Glen R. C. Molecular Similarity: A Key Technique in Molecular Informatics. Org. Biomol. Chem. 2004;2(22):3204–3218. doi: 10.1039/b409813g.
- Aronson J. K., Green A. R. Me-Too Pharmaceutical Products: History, Definitions, Examples, and Relevance to Drug Shortages and Essential Medicines Lists. Br. J. Clin. Pharmacol. 2020;86(11):2114–2122. doi: 10.1111/bcp.14327.
- Schneider G., Schneider P., Renner S. Scaffold-Hopping: How Far Can You Jump? QSAR Comb. Sci. 2006;25(12):1162–1171. doi: 10.1002/qsar.200610091.
- Garcia-Castro M., Kremer L., Reinkemeier C. D., Unkelbach C., Strohmann C., Ziegler S., Ostermann C., Kumar K. De Novo Branching Cascades for Structural and Functional Diversity in Small Molecules. Nat. Commun. 2015;6(1):6516. doi: 10.1038/ncomms7516.
- Cons B. D., Twigg D. G., Kumar R., Chessari G. Electrostatic Complementarity in Structure-Based Drug Design. J. Med. Chem. 2022;65(11):7476–7488. doi: 10.1021/acs.jmedchem.2c00164.
- Klebe, G. Protein-Ligand Interactions as the Basis for Drug Action. In Drug Design: From Structure and Mode-of-Action to Rational Design Concepts; Klebe, G., Ed.; Springer: Berlin, Heidelberg, 2024; pp 39–65. doi: 10.1007/978-3-662-68998-1_4.
- Yu J.-L., Zhou C., Ning X.-L., Mou J., Meng F.-B., Wu J.-W., Chen Y.-T., Tang B.-D., Liu X.-G., Li G.-B. Knowledge-Guided Diffusion Model for 3D Ligand-Pharmacophore Mapping. Nat. Commun. 2025;16(1):2269. doi: 10.1038/s41467-025-57485-3.
- Xie W., Zhang J., Xie Q., Gong C., Ren Y., Xie J., Sun Q., Xu Y., Lai L., Pei J. Accelerating Discovery of Bioactive Ligands with Pharmacophore-Informed Generative Models. Nat. Commun. 2025;16(1):2391. doi: 10.1038/s41467-025-56349-0.
- Adams K., Abeywardane K., Fromer J., Coley C. W. ShEPhERD: Diffusing Shape, Electrostatics, and Pharmacophores for Bioisosteric Drug Design. arXiv. 2025:arXiv: 2411.04130. doi: 10.48550/arXiv.2411.04130.
- Cremer J., Le T., Noé F., Clevert D. A., Schütt K. T. PILOT: Equivariant Diffusion for Pocket-Conditioned de Novo Ligand Generation with Multi-Objective Guidance via Importance Sampling. Chem. Sci. 2024;15(36):14954–14967. doi: 10.1039/d4sc03523b.
- Bolcato G., Heid E., Boström J. On the Value of Using 3D Shape and Electrostatic Similarities in Deep Generative Methods. J. Chem. Inf. Model. 2022;62(6):1388–1398. doi: 10.1021/acs.jcim.1c01535.
- McNaughton A. D., Bontha M. S., Knutson C. R., Pope J. A., Kumar N. De Novo Design of Protein Target Specific Scaffold-Based Inhibitors via Reinforcement Learning. arXiv. 2022:arXiv: 2205.10473. doi: 10.48550/arXiv.2205.10473.
- Beghyn T., Deprez-Poulain R., Willand N., Folleas B., Deprez B. Natural Compounds: Leads or Ideas? Bioinspired Molecules for Drug Discovery. Chem. Biol. Drug Des. 2008;72(1):3–15. doi: 10.1111/j.1747-0285.2008.00673.x.
- Rodrigues T., Reker D., Schneider P., Schneider G. Counting on Natural Products for Drug Design. Nat. Chem. 2016;8(6):531–541. doi: 10.1038/nchem.2479.
- Harvey A. L., Edrada-Ebel R., Quinn R. J. The Re-Emergence of Natural Products for Drug Discovery in the Genomics Era. Nat. Rev. Drug Discovery. 2015;14(2):111–129. doi: 10.1038/nrd4510.
- Henkel T., Brunne R. M., Müller H., Reichel F. Statistical Investigation into the Structural Complementarity of Natural Products and Synthetic Compounds. Angew. Chem., Int. Ed. 1999;38(5):643–647. doi: 10.1002/(SICI)1521-3773(19990301)38:5<643::AID-ANIE643>3.0.CO;2-G.
- Lee M.-L., Schneider G. Scaffold Architecture and Pharmacophoric Properties of Natural Products and Trade Drugs: Application in the Design of Natural Product-Based Combinatorial Libraries. J. Comb. Chem. 2001;3(3):284–289. doi: 10.1021/cc000097l.
- Shen X., Zeng T., Chen N., Li J., Wu R. NIMO: A Natural Product-Inspired Molecular Generative Model Based on Conditional Transformer. Molecules. 2024;29(8):1867. doi: 10.3390/molecules29081867.
- Davis A. M., Riley R. J. Predictive ADMET Studies, the Challenges and the Opportunities. Curr. Opin. Chem. Biol. 2004;8(4):378–386. doi: 10.1016/j.cbpa.2004.06.005.
- Chen Y.-C. Beware of Docking! Trends Pharmacol. Sci. 2015;36(2):78–95. doi: 10.1016/j.tips.2014.12.001.
- Fooladi H., Vu T. N. L., Kirchmair J. Evaluating Machine Learning Models for Molecular Property Prediction: Performance and Robustness on Out-of-Distribution Data. ChemRxiv. 2025:ChemRxiv: g1vjf. doi: 10.26434/chemrxiv-2025-g1vjf-v2.
- Tilborg D. v., Rossen L., Grisoni F.. Molecular Deep Learning at the Edge of Chemical Space. ChemRxiv. 2025:ChemRxiv: qj4k3. doi: 10.26434/chemrxiv-2025-qj4k3. [DOI] [Google Scholar]
- Tropsha A.. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inf. 2010;29(6–7):476–488. doi: 10.1002/minf.201000061. [DOI] [PubMed] [Google Scholar]
- Gao, W. ; Fu, T. ; Sun, J. ; Coley, C. . Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization. In Advances in neural information processing systems, 2022, pp 21342–21357. [Google Scholar]
- Thomas M., O’Boyle N. M., Bender A., De Graaf C.. MolScore: A Scoring, Evaluation and Benchmarking Framework for Generative Models in de Novo Drug Design. J. Cheminf. 2024;16(1):64. doi: 10.1186/s13321-024-00861-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Raji I. D., Bender E. M., Paullada A., Denton E., Hanna A.. AI and the Everything in the Whole Wide World Benchmark. arXiv. 2021:arXiv: 2111.15366. doi: 10.48550/arXiv.2111.15366. [DOI] [Google Scholar]
- Thomas M., Bou A., Fabritiis G.. Test-Time Training Scaling for Chemical Exploration in Drug Design. arXiv. 2025:arXiv: 2501.19153. doi: 10.48550/arXiv.2501.19153. [DOI] [Google Scholar]
- Cieplinski T., Danel T., Podlewska S., Jastrzebski S.. We Should at Least Be Able to Design Molecules That Dock Well. J. Chem. Inf. Model. 2023;63(11):3238–3247. doi: 10.1021/acs.jcim.2c01355. [DOI] [PMC free article] [PubMed] [Google Scholar]
- García-Ortegón M., Simm G. N. C., Tripp A. J., Hernández-Lobato J. M., Bender A., Bacallado S.. DOCKSTRING: Easy Molecular Docking Yields Better Benchmarks for Ligand Design. J. Chem. Inf. Model. 2022;62(15):3486–3502. doi: 10.1021/acs.jcim.1c01334. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lin H., Zhao G., Zhang O., Huang Y., Wu L., Liu Z., Li S., Tan C., Gao Z., Li S. Z.. CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph. arXiv. 2024:arXiv: 2406.10840. doi: 10.48550/arXiv.2406.10840. [DOI] [Google Scholar]
- Stanley M., Segler M.. Fake It until You Make It? Generative de Novo Design and Virtual Screening of Synthesizable Molecules. Curr. Opin. Struct. Biol. 2023;82:102658. doi: 10.1016/j.sbi.2023.102658. [DOI] [PubMed] [Google Scholar]
- Boda K., Seidel T., Gasteiger J.. Structure and Reaction Based Evaluation of Synthetic Accessibility. J. Comput.-Aided Mol. Des. 2007;21(6):311–325. doi: 10.1007/s10822-006-9099-2. [DOI] [PubMed] [Google Scholar]
- Zhong W., Yang Z., Chen C. Y.-C.. Retrosynthesis Prediction Using an End-to-End Graph Generative Architecture for Molecular Graph Editing. Nat. Commun. 2023;14(1):3009. doi: 10.1038/s41467-023-38851-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- CAS SciFinderChemical Compound Database. https://www.cas.org/solutions/cas-scifinder-discovery-platform/cas-scifinder (accessed 2025–03–19).
- Grzybowski B. A., Szymkuć S., Gajewska E. P., Molga K., Dittwald P., Wołos A., Klucznik T.. Chematica: A Story of Computer Code That Started to Think like a Chemist. Chem. 2018;4(3):390–398. doi: 10.1016/j.chempr.2018.02.024. [DOI] [Google Scholar]
- Segler M. H. S., Preuss M., Waller M. P.. Planning Chemical Syntheses with Deep Neural Networks and Symbolic AI. Nature. 2018;555(7698):604–610. doi: 10.1038/nature25978. [DOI] [PubMed] [Google Scholar]
- Seidl P., Renz P., Dyubankova N., Neves P., Verhoeven J., Wegner J. K., Segler M., Hochreiter S., Klambauer G.. Improving Few- and Zero-Shot Reaction Template Prediction Using Modern Hopfield Networks. J. Chem. Inf. Model. 2022;62(9):2111–2120. doi: 10.1021/acs.jcim.1c01065. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wan Y., Liao B., Hsieh C.-Y., Zhang S.. Retroformer: Pushing the Limits of Interpretable End-to-End Retrosynthesis Transformer. arXiv. 2022:arXiv: 2201.12475. doi: 10.48550/arXiv.2201.12475. [DOI] [Google Scholar]
- Tetko I. V., Karpov P., Van Deursen R., Godin G.. State-of-the-Art Augmented NLP Transformer Models for Direct and Single-Step Retrosynthesis. Nat. Commun. 2020;11(1):5575. doi: 10.1038/s41467-020-19266-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yao L., Guo W., Wang Z., Xiang S., Liu W., Ke G.. Node-Aligned Graph-to-Graph: Elevating Template-Free Deep Learning Approaches in Single-Step Retrosynthesis. JACS Au. 2024;4(3):992–1003. doi: 10.1021/jacsau.3c00737. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Raghavan P., Haas B. C., Ruos M. E., Schleinitz J., Doyle A. G., Reisman S. E., Sigman M. S., Coley C. W.. Dataset Design for Building Models of Chemical Reactivity. ACS Cent. Sci. 2023;9(12):2196–2204. doi: 10.1021/acscentsci.3c01163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guo J., Schwaller P.. Directly Optimizing for Synthesizability in Generative Molecular Design Using Retrosynthesis Models. Chem. Sci. 2025;16(16):6943–6956. doi: 10.1039/D5SC01476J. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ghiandoni G. M., Flanagan S. R., Bodkin M. J., Nizi M. G., Galera-Prat A., Brai A., Chen B., Wallace J. E. A., Hristozov D., Webster J., Manfroni G., Lehtiö L., Tabarrini O., Gillet V. J.. Synthetically Accessible de Novo Design Using Reaction Vectors: Application to PARP1 Inhibitors**. Mol. Inf. 2024;43(4):e202300183. doi: 10.1002/minf.202300183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gao W., Luo S., Coley C. W.. Generative Artificial Intelligence for Navigating Synthesizable Chemical Space. arXiv. 2024:arXiv: 2410.03494. doi: 10.48550/arXiv.2410.03494. [DOI] [Google Scholar]
- Liu J., Yan C., Yu Y., Lu C., Huang J., Ou-Yang L., Zhao P.. MARS: A Motif-Based Autoregressive Model for Retrosynthesis Prediction. Bioinformatics. 2024;40(3):btae115. doi: 10.1093/bioinformatics/btae115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guo J., Schwaller P.. It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design. arXiv. 2024:arXiv: 2410.11527. doi: 10.48550/arXiv.2410.11527. [DOI] [Google Scholar]
- Wang M., Li S., Wang J., Zhang O., Du H., Jiang D., Wu Z., Deng Y., Kang Y., Pan P., Li D., Wang X., Yao X., Hou T., Hsieh C.-Y.. ClickGen: Directed Exploration of Synthesizable Chemical Space via Modular Reactions and Reinforcement Learning. Nat. Commun. 2024;15(1):10127. doi: 10.1038/s41467-024-54456-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bradshaw, J. ; Paige, B. ; Kusner, M. J. ; Segler, M. ; Hernández-Lobato, J. M. . Barking up the Right Tree: An Approach to Search over Molecule Synthesis DAGs. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2020, pp 6852–6866. [Google Scholar]
- Bradshaw, J. ; Paige, B. ; Kusner, M. J. ; Segler, M. ; Hernández-Lobato, J. M. . A Model to Search for Synthesizable Molecules. In Advances in Neural Information Processing Systems; Curran Associates, Inc., 2019. [Google Scholar]
- Pedawi, A. ; Gniewek, P. ; Chang, C. ; Anderson, B. ; Bedem, H. v. d. . An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022). [Google Scholar]
- Huang Q., Li L.-L., Yang S.-Y.. RASA: A Rapid Retrosynthesis-Based Scoring Method for the Assessment of Synthetic Accessibility of Drug-like Molecules. J. Chem. Inf. Model. 2011;51(10):2768–2777. doi: 10.1021/ci100216g. [DOI] [PubMed] [Google Scholar]
- Neeser R. M., Correia B., Schwaller P.. FSscore: A Personalized Machine Learning-Based Synthetic Feasibility Score. Chem.: Methods. 2024;4(11):e202400024. doi: 10.1002/cmtd.202400024. [DOI] [Google Scholar]
- Discovery at Your Fingertips. SYNTHIA Retrosynthesis Software. https://www.synthiaonline.com/(accessed 2025–03–24). [Google Scholar]
- Schwaller P., Probst D., Vaucher A. C., Nair V. H., Kreutter D., Laino T., Reymond J.-L.. Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks. Nat. Mach. Intell. 2021;3(2):144–152. doi: 10.1038/s42256-020-00284-w. [DOI] [Google Scholar]
- Ghiandoni G. M., Bodkin M. J., Chen B., Hristozov D., Wallace J. E. A., Webster J., Gillet V. J.. RENATE: A Pseudo-Retrosynthetic Tool for Synthetically Accessible de Novo Design. Mol. Inf. 2022;41(4):2100207. doi: 10.1002/minf.202100207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Button A., Merk D., Hiss J. A., Schneider G.. Automated de Novo Molecular Design by Hybrid Machine Intelligence and Rule-Driven Chemical Synthesis. Nat. Mach. Intell. 2019;1(7):307–315. doi: 10.1038/s42256-019-0067-7. [DOI] [Google Scholar]
- Bradshaw J., Zhang A., Mahjour B., Graff D. E., Segler M. H. S., Coley C. W.. Challenging Reaction Prediction Models to Generalize to Novel Chemistry. arXiv. 2025:arXiv: 2501.06669. doi: 10.48550/arXiv.2501.06669. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Handa K., Thomas M. C., Kageyama M., Iijima T., Bender A.. On the Difficulty of Validating Molecular Generative Models Realistically: A Case Study on Public and Proprietary Data. J. Cheminf. 2023;15(1):112. doi: 10.1186/s13321-023-00781-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- van Tilborg D., Alenicheva A., Grisoni F.. Exposing the Limitations of Molecular Machine Learning with Activity Cliffs. J. Chem. Inf. Model. 2022;62(23):5938–5951. doi: 10.1021/acs.jcim.2c01073. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu Z., Ramsundar B., Feinberg E., Gomes J., Geniesse C., Pappu A. S., Leswing K., Pande V.. MoleculeNet: A Benchmark for Molecular Machine Learning. Chem. Sci. 2018;9(2):513–530. doi: 10.1039/C7SC02664A. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Abolhasani M., Kumacheva E.. The Rise of Self-Driving Labs in Chemical and Materials Sciences. Nat. Synth. 2023;2(6):483–492. doi: 10.1038/s44160-022-00231-0. [DOI] [Google Scholar]
- Filella-Merce I., Molina A., Orzechowski M., Díaz L., Zhu Y. M., Mor J. V., Malo L., Yekkirala A. S., Ray S., Guallar V.. Optimizing Drug Design by Merging Generative AI With Active Learning Frameworks. arXiv. 2023:arXiv: 2305.06334. doi: 10.48550/arXiv.2305.06334. [DOI] [Google Scholar]
- van Tilborg D., Grisoni F.. Traversing Chemical Space with Active Deep Learning for Low-Data Drug Discovery. Nat. Comput. Sci. 2024;4(10):786–796. doi: 10.1038/s43588-024-00697-2. [DOI] [PubMed] [Google Scholar]
- Polykovskiy D., Zhebrak A., Vetrov D., Ivanenkov Y., Aladinskiy V., Mamoshina P., Bozdaganyan M., Aliper A., Zhavoronkov A., Kadurin A.. Entangled Conditional Adversarial Autoencoder for de Novo Drug Discovery. Mol. Pharmaceutics. 2018;15(10):4398–4405. doi: 10.1021/acs.molpharmaceut.8b00839. [DOI] [PubMed] [Google Scholar]
- Zhavoronkov A., Ivanenkov Y. A., Aliper A., Veselov M. S., Aladinskiy V. A., Aladinskaya A. V., Terentiev V. A., Polykovskiy D. A., Kuznetsov M. D., Asadulaev A., Volkov Y., Zholus A., Shayakhmetov R. R., Zhebrak A., Minaeva L. I., Zagribelnyy B. A., Lee L. H., Soll R., Madge D., Xing L., Guo T., Aspuru-Guzik A.. Deep Learning Enables Rapid Identification of Potent DDR1 Kinase Inhibitors. Nat. Biotechnol. 2019;37(9):1038–1040. doi: 10.1038/s41587-019-0224-x. [DOI] [PubMed] [Google Scholar]
- Yoshimori A., Asawa Y., Kawasaki E., Tasaka T., Matsuda S., Sekikawa T., Tanabe S., Neya M., Natsugari H., Kanai C.. Design and Synthesis of DDR1 Inhibitors with a Desired Pharmacophore Using Deep Generative Models. ChemMedChem. 2021;16(6):955–958. doi: 10.1002/cmdc.202000786. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jang S. H., Sivakumar D., Mudedla S. K., Choi J., Lee S., Jeon M., Bvs S. K., Hwang J., Kang M., Shin E. G., Lee K. M., Jung K.-Y., Kim J.-S., Wu S.. PCW-A1001, AI-Assisted de Novo Design Approach to Design a Selective Inhibitor for FLT-3(D835Y) in Acute Myeloid Leukemia. Front. Mol. Biosci. 2022;9:1072028. doi: 10.3389/fmolb.2022.1072028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Korshunova M., Huang N., Capuzzi S., Radchenko D. S., Savych O., Moroz Y. S., Wells C. I., Willson T. M., Tropsha A., Isayev O.. Generative and Reinforcement Learning Approaches for the Automated de Novo Design of Bioactive Compounds. Commun. Chem. 2022;5(1):1–11. doi: 10.1038/s42004-022-00733-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li Y., Zhang L., Wang Y., Zou J., Yang R., Luo X., Wu C., Yang W., Tian C., Xu H., Wang F., Yang X., Li L., Yang S.. Generative Deep Learning Enables the Discovery of a Potent and Selective RIPK1 Inhibitor. Nat. Commun. 2022;13(1):6891. doi: 10.1038/s41467-022-34692-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yu Y., Huang J., He H., Han J., Ye G., Xu T., Sun X., Chen X., Ren X., Li C., Li H., Huang W., Liu Y., Wang X., Gao Y., Cheng N., Guo N., Chen X., Feng J., Hua Y., Liu C., Zhu G., Xie Z., Yao L., Zhong W., Chen X., Liu W., Li H.. Accelerated Discovery of Macrocyclic CDK2 Inhibitor QR-6401 by Generative Models and Structure-Based Drug Design. ACS Med. Chem. Lett. 2023;14(3):297–304. doi: 10.1021/acsmedchemlett.2c00515. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li Y., Liu Y., Wu J., Liu X., Wang L., Wang J., Yu J., Qi H., Qin L., Ding X., Ren F., Zhavoronkov A.. Discovery of Potent, Selective, and Orally Bioavailable Small-Molecule Inhibitors of CDK8 for the Treatment of Cancer. J. Med. Chem. 2023;66(8):5439–5452. doi: 10.1021/acs.jmedchem.2c01718. [DOI] [PubMed] [Google Scholar]
- Zhu W., Liu X., Li Q., Gao F., Liu T., Chen X., Zhang M., Aliper A., Ren F., Ding X., Zhavoronkov A.. Discovery of Novel and Selective SIK2 Inhibitors by the Application of AlphaFold Structures and Generative Models. Bioorg. Med. Chem. 2023;91:117414. doi: 10.1016/j.bmc.2023.117414. [DOI] [PubMed] [Google Scholar]
- Ren F., Ding X., Zheng M., Korzinkin M., Cai X., Zhu W., Mantsyzov A., Aliper A., Aladinskiy V., Cao Z., Kong S., Long X., Man Liu B. H., Liu Y., Naumov V., Shneyderman A., Ozerov I. V., Wang J., Pun F. W., Polykovskiy D. A., Sun C., Levitt M., Aspuru-Guzik A., Zhavoronkov A.. AlphaFold Accelerates Artificial Intelligence Powered Drug Discovery: Efficient Discovery of a Novel CDK20 Small Molecule Inhibitor. Chem. Sci. 2023;14(6):1443–1452. doi: 10.1039/D2SC05709C. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ren F., Aliper A., Chen J., Zhao H., Rao S., Kuppe C., Ozerov I. V., Zhang M., Witte K., Kruse C., Aladinskiy V., Ivanenkov Y., Polykovskiy D., Fu Y., Babin E., Qiao J., Liang X., Mou Z., Wang H., Pun F. W., Torres-Ayuso P., Veviorskiy A., Song D., Liu S., Zhang B., Naumov V., Ding X., Kukharenko A., Izumchenko E., Zhavoronkov A.. A Small-Molecule TNIK Inhibitor Targets Fibrosis in Preclinical and Clinical Models. Nat. Biotechnol. 2025;43(1):63–75. doi: 10.1038/s41587-024-02143-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Isigkeit L., Hörmann T., Schallmayer E., Scholz K., Lillich F. F., Ehrler J. H. M., Hufnagel B., Büchner J., Marschner J. A., Pabel J., Proschak E., Merk D.. Automated Design of Multi-Target Ligands by Generative Deep Learning. Nat. Commun. 2024;15(1):7946. doi: 10.1038/s41467-024-52060-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Salas-Estrada L., Provasi D., Qiu X., Kaniskan H. U. ¨., Huang X.-P., DiBerto J. F., Lamim Ribeiro J. M., Jin J., Roth B. L., Filizola M.. De Novo Design of κ-Opioid Receptor Antagonists Using a Generative Deep-Learning Framework. J. Chem. Inf. Model. 2023;63(16):5056–5065. doi: 10.1021/acs.jcim.3c00651. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu K., Xia Y., Deng P., Liu R., Zhang Y., Guo H., Cui Y., Pei Q., Wu L., Xie S., Chen S., Lu X., Hu S., Wu J., Chan C.-K., Chen S., Zhou L., Yu N., Chen E., Liu H., Guo J., Qin T., Liu T.-Y.. TamGen: Drug Design with Target-Aware Molecule Generation through a Chemical Language Model. Nat. Commun. 2024;15(1):9360. doi: 10.1038/s41467-024-53632-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jiang Y., Zhang G., You J., Zhang H., Yao R., Xie H., Zhang L., Xia Z., Dai M., Wu Y., Li L., Yang S.. PocketFlow Is a Data-and-Knowledge-Driven Structure-Based Molecular Generative Model. Nat. Mach. Intell. 2024;6(3):326–337. doi: 10.1038/s42256-024-00808-8. [DOI] [Google Scholar]
- Swanson K., Liu G., Catacutan D. B., Arnold A., Zou J., Stokes J. M.. Generative AI for Designing and Validating Easily Synthesizable and Structurally Novel Antibiotics. Nat. Mach. Intell. 2024;6(3):338–353. doi: 10.1038/s42256-024-00809-7. [DOI] [Google Scholar]
- Xu J., Ding X., Fu Y., Meng Q., Wang L., Zhang M., Xu C., Chen S., Aliper A., Ren F., Zhavoronkov A., Ding X.. Discovery of Novel and Potent Prolyl Hydroxylase Domain-Containing Protein (PHD) Inhibitors for The Treatment of Anemia. J. Med. Chem. 2024;67(2):1393–1405. doi: 10.1021/acs.jmedchem.3c01932. [DOI] [PubMed] [Google Scholar]
- Ghazi Vakili M., Gorgulla C., Snider J., Nigam A., Bezrukov D., Varoli D., Aliper A., Polykovsky D., Padmanabha Das K. M., Cox H. III, Lyakisheva A., Hosseini Mansob A., Yao Z., Bitar L., Tahoulas D., Čerina D., Radchenko E., Ding X., Liu J., Meng F., Ren F., Cao Y., Stagljar I., Aspuru-Guzik A., Zhavoronkov A.. Quantum-Computing-Enhanced Algorithm Unveils Potential KRAS Inhibitors. Nat. Biotechnol. 2025:1–6. doi: 10.1038/s41587-024-02526-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hassen A. K., Šícho M., van Aalst Y. J., Huizenga M. C. W., Reynolds D. N. R., Luukkonen S., Bernatavicius A., Clevert D.-A., Janssen A. P. A., van Westen G. J. P., Preuss M.. Generate What You Can Make: Achieving in-House Synthesizability with Readily Available Resources in de Novo Drug Design. J. Cheminf. 2025;17(1):41. doi: 10.1186/s13321-024-00910-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bickerton G. R., Paolini G. V., Besnard J., Muresan S., Hopkins A. L.. Quantifying the Chemical Beauty of Drugs. Nat. Chem. 2012;4(2):90–98. doi: 10.1038/nchem.1243. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ritchie T. J., Macdonald S. J. F.. How Drug-like Are ‘Ugly’ Drugs: Do Drug-Likeness Metrics Predict ADME Behaviour in Humans? Drug Discovery Today. 2014;19(4):489–495. doi: 10.1016/j.drudis.2014.01.007. [DOI] [PubMed] [Google Scholar]
- Yusof I., Segall M. D.. Considering the Impact Drug-like Properties Have on the Chance of Success. Drug Discovery Today. 2013;18(13):659–666. doi: 10.1016/j.drudis.2013.02.008. [DOI] [PubMed] [Google Scholar]
- Churcher I., Newbold S., Murray C. W.. Return to Flatland. Nat. Rev. Chem. 2025;9(3):140–141. doi: 10.1038/s41570-025-00688-5. [DOI] [PubMed] [Google Scholar]
- Masters M. R., Mahmoud A. H., Wei Y., Lill M. A.. Deep Learning Model for Efficient Protein-Ligand Docking with Implicit Side-Chain Flexibility. J. Chem. Inf. Model. 2023;63:1695. doi: 10.1021/acs.jcim.2c01436. [DOI] [PubMed] [Google Scholar]
- Gentile F., Agrawal V., Hsing M., Ton A.-T., Ban F., Norinder U., Gleave M. E., Cherkasov A.. Deep Docking: A Deep Learning Platform for Augmentation of Structure Based Drug Discovery. ACS Cent. Sci. 2020;6:939. doi: 10.1021/acscentsci.0c00229. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McNutt A. T., Li Y., Meli R., Aggarwal R., Koes D. R.. GNINA 1.3: The next Increment in Molecular Docking with Deep Learning. J. Cheminf. 2025;17(1):28. doi: 10.1186/s13321-025-00973-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bhat S., Palepu K., Hong L., Mao J., Ye T., Iyer R., Zhao L., Chen T., Vincoff S., Watson R., Wang T. Z., Srijay D., Kavirayuni V. S., Kholina K., Goel S., Vure P., Deshpande A. J., Soderling S. H., DeLisa M. P., Chatterjee P.. De Novo Design of Peptide Binders to Conformationally Diverse Targets with Contrastive Language Modeling. Sci. Adv. 2025;11(4):eadr8638. doi: 10.1126/sciadv.adr8638. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morehead A., Giri N., Liu J., Cheng J.. Deep Learning for Protein-Ligand Docking: Are We There Yet? arXiv. 2024:arXiv; 2405.14108. doi: 10.48550/arXiv.2405.14108. [DOI] [Google Scholar]
- Vella D., Ebejer J.-P.. Few-Shot Learning for Low-Data Drug Discovery. J. Chem. Inf. Model. 2023;63(1):27–42. doi: 10.1021/acs.jcim.2c00779. [DOI] [PubMed] [Google Scholar]
- van Tilborg D., Brinkmann H., Criscuolo E., Rossen L., Özçelik R., Grisoni F.. Deep Learning for Low-Data Drug Discovery: Hurdles and Opportunities. Curr. Opin. Struct. Biol. 2024;86:102818. doi: 10.1016/j.sbi.2024.102818. [DOI] [PubMed] [Google Scholar]
- Tran-Nguyen V.-K., Jacquemard C., Rognan D.. LIT-PCBA: An Unbiased Data Set for Machine Learning and Virtual Screening. J. Chem. Inf. Model. 2020;60(9):4263–4273. doi: 10.1021/acs.jcim.0c00155. [DOI] [PubMed] [Google Scholar]
- Sun J., Jeliazkova N., Chupakhin V., Golib-Dzib J.-F., Engkvist O., Carlsson L., Wegner J., Ceulemans H., Georgiev I., Jeliazkov V., Kochev N., Ashby T. J., Chen H.. ExCAPE-DB: An Integrated Large Scale Dataset Facilitating Big Data Analysis in Chemogenomics. J. Cheminf. 2017;9(1):17. doi: 10.1186/s13321-017-0203-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Aleixo E. L., Colonna J. G., Cristo M., Fernandes E.. Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy. J. Braz. Comput. Soc. 2024;30(1):175–211. doi: 10.5753/jbcs.2024.3966. [DOI] [Google Scholar]
- Schmidinger N., Schneckenreiter L., Seidl P., Schimunek J., Hoedt P.-J., Brandstetter J., Mayr A., Luukkonen S., Hochreiter S., Klambauer G.. Bio-xLSTM: Generative Modeling, Representation and in-Context Learning of Biological and Chemical Sequences. arXiv. 2024:arXiv: 2411.04165. doi: 10.48550/arXiv.2411.04165. [DOI] [Google Scholar]
- Fifty C., Leskovec J., Thrun S.. In-Context Learning for Few-Shot Molecular Property Prediction. arXiv. 2023:arXiv: 2310.08863. doi: 10.48550/arXiv.2310.08863. [DOI] [Google Scholar]
- Sgarbossa D., Malbranke C., Bitbol A.-F.. ProtMamba: A Homology-Aware but Alignment-Free Protein State Space Model. bioRxiv. 2024:595730. doi: 10.1101/2024.05.24.595730. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thomas M., Bou A., Fabritiis G. D.. Test-Time Training Scaling for Chemical Exploration in Drug Design. arXiv. 2025:arXiv: 2501.19153. doi: 10.48550/arXiv.2501.19153. [DOI] [Google Scholar]
- Yan B., Chen A., Cho K.. Inconsistency of LLMs in Molecular Representations. ChemRxiv. 2024:ChemRxiv: lnvbz. doi: 10.26434/chemrxiv-2024-lnvbz. [DOI] [Google Scholar]
- Silly Things Large Language Models Do with Molecules. Silly Things Large Language Models Do With Molecules. https://practicalcheminformatics.blogspot.com/2024/10/silly-things-large-language-models-do.html (accessed 2025–03–20). [Google Scholar]
- Steyvers M., Tejeda H., Kumar A., Belem C., Karny S., Hu X., Mayer L. W., Smyth P.. What Large Language Models Know and What People Think They Know. Nat. Mach. Intell. 2025;7(2):221–231. doi: 10.1038/s42256-024-00976-7. [DOI] [Google Scholar]
- Castro Nascimento C. M., Pimentel A. S.. Do Large Language Models Understand Chemistry? A Conversation with ChatGPT. J. Chem. Inf. Model. 2023;63(6):1649–1655. doi: 10.1021/acs.jcim.3c00285. [DOI] [PubMed] [Google Scholar]
- Guo, T. ; Guo, K. ; Nan, B. ; Liang, Z. ; Guo, Z. ; Chawla, N. ; Wiest, O. ; Zhang, X. . What Can Large Language Models Do in Chemistry? A Comprehensive Benchmark on Eight Tasks. In Advances in Neural Information Processing Systems, 2023, pp 59662–59688. [Google Scholar]
- Bran A. M., Cox S., Schilter O., Baldassari C., White A. D., Schwaller P.. ChemCrow: Augmenting Large-Language Models with Chemistry Tools. arXiv. 2023:arXiv: 2304.05376. doi: 10.48550/arXiv.2304.05376. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jacobs P. F., Pollice R.. Developing Large Language Models for Quantum Chemistry Simulation Input Generation. Digital Discovery. 2025;4(3):762–775. doi: 10.1039/D4DD00366G. [DOI] [Google Scholar]
- Teukam Y. G. N., Grisoni F., Manica M.. A Language Model Assistant for Biocatalysis. bioRxiv. 2024:623739. doi: 10.1101/2024.11.15.623739. [DOI] [Google Scholar]
- Campbell Q., Cox S., Medina J., Watterson B., White A. D.. MDCrow: Automating Molecular Dynamics Workflows with Large Language Models. arXiv. 2025:arXiv: 2502.09565. doi: 10.48550/arXiv.2502.09565. [DOI] [Google Scholar]
- Coscia D., Welling M., Demo N., Rozza G.. BARNN: A Bayesian Autoregressive and Recurrent Neural Network. arXiv. 2025:arXiv: 2501.18665. doi: 10.48550/arXiv.2501.18665. [DOI] [Google Scholar]
- Walters W. P., Barzilay R.. Critical Assessment of AI in Drug Discovery. Expert Opin. Drug Discovery. 2021;16(9):937–947. doi: 10.1080/17460441.2021.1915982. [DOI] [PubMed] [Google Scholar]
- Schneider P., Walters W. P., Plowright A. T., Sieroka N., Listgarten J., Goodnow R. A., Fisher J., Jansen J. M., Duca J. S., Rush T. S., Zentgraf M., Hill J. E., Krutoholow E., Kohler M., Blaney J., Funatsu K., Luebkemann C., Schneider G.. Rethinking Drug Design in the Artificial Intelligence Era. Nat. Rev. Drug Discovery. 2020;19(5):353–364. doi: 10.1038/s41573-019-0050-3. [DOI] [PubMed] [Google Scholar]



