Δ-Quantum machine-learning for medicinal chemistry

Kenneth Atz; Clemens Isert; Markus N A Böcker; José Jiménez-Luna; Gisbert Schneider

doi:10.1039/d2cp00834c

2022 Apr 26;24(18):10775–10783. doi: 10.1039/d2cp00834c

PMCID: PMC9093086 PMID: 35470831

Abstract

Many molecular design tasks benefit from fast and accurate calculations of quantum-mechanical (QM) properties. However, the computational cost of QM methods applied to drug-like molecules currently renders large-scale applications of quantum chemistry challenging. Aiming to mitigate this problem, we developed DelFTa, an open-source toolbox for the prediction of electronic properties of drug-like molecules at the density functional (DFT) level of theory, using Δ-machine-learning. Δ-Learning corrects the prediction error (Δ) of a fast but inaccurate property calculation. DelFTa employs state-of-the-art three-dimensional message-passing neural networks trained on a large dataset of QM properties. It provides access to a wide array of quantum observables on the molecular, atomic and bond levels by predicting approximations to DFT values from a low-cost semiempirical baseline. Δ-Learning outperformed its direct-learning counterpart for most of the considered QM endpoints. The results suggest that predictions for non-covalent intra- and intermolecular interactions can be extrapolated to larger biomolecular systems. The software is fully open-sourced and features documented command-line and Python APIs.

3D message-passing neural networks for Δ-quantum machine-learning enable fast access to DFT-level QM properties for drug-like molecules.

Introduction

The electronic structure of drug-like molecules is responsible for various drug-relevant properties, such as molecular recognition in protein–ligand complexes,^1–3 drug-induced photo-toxicity,^4,5 reactivity for covalent ligand–protein interaction,^6–8 cell membrane permeability,⁹ or three-dimensional (3D) conformation energies.¹⁰ However, despite advances in density functional theory (DFT) approaches,^11,12 which are widely regarded as a compromise between accuracy and computational cost,^13,14 calculating quantum-mechanical (QM) properties at this level of theory for many or for sizable molecules remains a computationally expensive task. Cheaper approaches such as force fields¹⁵ and semiempirical methods^16,17 have become popular alternatives, albeit with reduced accuracy. To overcome some of these issues, there has been a recent surge of interest in quantum machine-learning (QML), a set of techniques which aim to approximate quantum observables through statistical modeling approaches.^18–22 Geometric deep learning in particular, a discipline focused on the investigation of neural network architectures that incorporate symmetry information into their design,^23–25 has become an active topic of research. Recent advances in geometric deep learning, such as the development of E(3)-equivariant neural networks, have led to improved prediction accuracy of energies,^26–28 forces for molecular dynamics simulations,^29–31 and wave functions in the form of local bases of atomic orbitals.^32,33

In parallel to these developments, Δ-QML (delta-QML) approaches, which aim to learn corrections between computationally inexpensive QM methods and more accurate, albeit more expensive ones, have been shown to deliver promising results.³⁵ Machine-learned corrections of this kind have been reported for both coupled cluster theory^36,37 via DFT, and for DFT via the semiempirical family of methods GFN-xTB,^38,39 as well as for other combinations.⁴⁰ However, despite their encouraging performance, to the best of our knowledge there are currently no open-source implementations of Δ-QML or readily available trained models, which limits their widespread adoption. Addressing this need, we present DelFTa, an open-source deep-learning toolbox that enables both fast and accurate approximations of molecular electronic properties on the DFT^41,42 level of theory. Models were trained on the QMugs⁴³ dataset, which consists of ∼2 M molecular conformers with a comprehensive array of QM observables both at semiempirical and DFT levels of theory for each structure. Specifically, QM observables were learned at the ωB97X-D/def2-SVP level of theory, either directly or via Δ-learning through corrections to the semiempirical GFN2-xTB method^17,44–46 (Fig. 1). To this end, E(3)-invariant three-dimensional (3D) message-passing neural networks (MPNNs) were employed, which are able to learn properties at the global molecular, atom, or bond level²⁵ (Fig. 2) and whose predictions are invariant to translations or rotations of the input molecule. The potential utility of the presented DelFTa approach is threefold:

(1) The models provided expand the Δ-QML prediction landscape for drug-like molecules by providing access to commonly-used properties such as formation energies, as well as previously-unreported ones, such as energies of the highest occupied and lowest unoccupied molecular orbitals (HOMO and LUMO, respectively), HOMO/LUMO gaps, dipole moments, Mulliken partial charges, and Wiberg bond orders for covalent and non-covalent bonds.

(2) We investigate several key concepts of QML for applications in drug-relevant chemical space; namely, (i) the advantages and limitations of Δ- compared to direct-learning, (ii) the utility of multi- over single-task learning paradigms for molecular properties, (iii) the dependence of prediction errors on training set size, and (iv) the extrapolation capabilities of the trained machine-learning models to non-covalent intra- and intermolecular interactions in biomolecules.

(3) We provide a fully open-sourced and user-friendly software package via command-line and Python APIs that includes all trained models, as well as extensive documentation and tutorials. Interested researchers will be able to use the provided models, train new ones, or build upon them.

The DelFTa approach enables access to a variety of QM properties at DFT accuracy in a fast and user-friendly manner and can be routinely used in numerous relevant applications in molecular modeling and design.

Methods

Reference dataset and dataset splits

DelFTa was built upon the QMugs⁴³ data collection, which comprises ∼2 M conformers of over 665 k molecules extracted from the ChEMBL database (release 27),⁴⁷ to obtain training, validation, and test sets. It includes QM properties at two levels of theory, namely the semiempirical method GFN2-xTB^17,44–46 and DFT (ωB97X-D/def2-SVP^41,42).

Each molecule in the QMugs dataset, associated with a unique ChEMBL identifier, was assigned to either the training, validation or test sets, with all conformers of one molecule becoming part of the same set. A validation set composed of ∼29 k molecules was used for hyperparameter optimization and early stopping. All optimized models were tested on three test sets of ∼29 k molecules each (∼88 k individual conformers). Because some molecules with distinct ChEMBL identifiers in the QMugs dataset are represented by the same SMILES⁴⁸ notation (i.e., the same 2D molecular graph), all molecules with the same SMILES were assigned to the same test set to avoid information leakage (see ref. 43 for details).
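The grouping rule described above can be illustrated with a minimal sketch. It is not the QMugs or DelFTa splitting code: the record layout, the helper name `split_by_smiles`, the hash-based assignment, and the split fractions are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def split_by_smiles(records, val_frac=0.05, test_frac=0.15):
    """Assign conformers to train/val/test so that all conformers sharing
    a SMILES string end up in the same set.

    records: iterable of (chembl_id, conformer_id, smiles) tuples
             (hypothetical layout, fractions are illustrative).
    """
    splits = defaultdict(list)
    for chembl_id, conf_id, smiles in records:
        # Deterministic number in [0, 1) derived from the SMILES only, so
        # identical SMILES under different ChEMBL IDs map to the same set.
        digest = hashlib.sha256(smiles.encode("utf-8")).hexdigest()
        u = int(digest[:8], 16) / 16**8
        if u < test_frac:
            name = "test"
        elif u < test_frac + val_frac:
            name = "val"
        else:
            name = "train"
        splits[name].append((chembl_id, conf_id, smiles))
    return dict(splits)


records = [
    ("CHEMBL1", "conf_00", "CCO"),
    ("CHEMBL1", "conf_01", "CCO"),
    ("CHEMBL2", "conf_00", "CCO"),  # different identifier, same SMILES
    ("CHEMBL3", "conf_00", "c1ccccc1"),
]
splits = split_by_smiles(records)
```

Keying the assignment on the SMILES string rather than on the ChEMBL identifier is what prevents the same 2D graph from appearing on both sides of a split.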

While the production models available in the DelFTa application were trained on the entire QMugs dataset, model performance was also benchmarked for different training set sizes featuring both single molecular conformations per molecule (100, 1 k, 10 k, 100 k and ∼547 k training samples) as well as multiple ones (∼1.6 M individual conformers of ∼547 k distinct molecules). For the formation energy models, all conformers of the same molecule were grouped within the same train-validation-test splits, correspondingly yielding training set sizes of approximately 300, 3 k, 30 k, 300 k and 1.6 M samples. Fig. S1 (ESI†) shows a schematic of the data splitting approach.

Neural network architecture and training details

The 3D-MPNNs used in this work are based on the E(3)-Equivariant Graph Neural Network (EGNN) architecture.^27,49 Δ-learning models for all endpoints y_i, at either global, node, or edge levels, associated with the i-th molecular conformation, were trained to predict the difference y^Δ_i between DFT-computed properties (y^DFT_i ∈ ℝ^k) and GFN2-xTB equivalents (y^GFN2-xTB_i ∈ ℝ^k), specifically:

y^Δ_i = y^DFT_i − y^GFN2-xTB_i.
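As a minimal sketch of how this definition is used, the snippet below builds Δ targets from paired values and adds a predicted correction back onto the baseline at inference time. The function names and the numbers are illustrative placeholders, not values from the QMugs dataset.

```python
def delta_targets(y_dft, y_xtb):
    """Training targets for Δ-learning: DFT value minus GFN2-xTB value."""
    return [dft - xtb for dft, xtb in zip(y_dft, y_xtb)]


def apply_delta(y_xtb, delta_pred):
    """Inference: add the predicted correction to the semiempirical baseline."""
    return [xtb + d for xtb, d in zip(y_xtb, delta_pred)]


# Illustrative numbers only (arbitrary units).
y_dft = [-7.50, -6.25, -8.00]
y_xtb = [-9.75, -8.50, -10.50]

targets = delta_targets(y_dft, y_xtb)
recovered = apply_delta(y_xtb, targets)  # a perfect Δ-model recovers y_dft
```

A direct-learning model would instead be fitted to `y_dft` itself, without access to the baseline value.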

Direct-learning models were trained on y^DFT_i values only. Molecular conformations were represented as fully-connected 3D graphs G = (V, E, R), where V corresponds to the set of nodes (v_i ∈ V), E to the set of edges (e_ij ∈ E), and R to the associated Cartesian coordinates in 3D space (r_i ∈ ℝ³). Initial node features v⁰_i were obtained via a linear embedding of the respective atom types. Edge features r_ij were obtained via a sinusoidal and cosinusoidal encoding of the pairwise interatomic distances ‖r_i − r_j‖²₂ (i.e., a Fourier-like encoding scheme).
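A Fourier-like distance encoding of this kind might look as follows. The frequency schedule (`max_period` and the geometric spacing) is an assumption borrowed from common positional-encoding practice; the text only specifies that sine and cosine features of the pairwise distance term are used.

```python
import math


def fourier_distance_features(r_i, r_j, n_features=32, max_period=100.0):
    """Encode the squared distance between two atoms as sin/cos features.

    The frequencies are an assumed geometric schedule; only the use of
    sinusoidal and cosinusoidal features comes from the text.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(r_i, r_j))  # ||r_i - r_j||^2
    half = n_features // 2
    feats = []
    for k in range(half):
        freq = 1.0 / (max_period ** (k / half))
        feats.append(math.sin(freq * d2))
        feats.append(math.cos(freq * d2))
    return feats


feats = fourier_distance_features((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
# Translating both atoms by the same vector leaves the features unchanged.
feats_shifted = fourier_distance_features((1.0, 1.0, 1.0), (2.0, 3.0, 3.0))
```

Because the features depend on the coordinates only through the distance, they are invariant to rigid translations and rotations of the molecule.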

An Equivariant Graph-Convolutional Layer (EGCL) was applied over all edges e_ij of the graph. It uses the node embeddings of v^l_i at layer l as well as their respective atomic positions r_i to produce updated node representations v^l+1_i:

[Eqn (2): equation image not available in this version of the text]

using the following message-passing mechanism:

[Eqn (3): equation image not available in this version of the text]

where ϕ_h and ϕ_e are node and edge non-linear transformations, respectively, modeled with multi-layer Perceptrons using the SiLU activation function,⁵⁰ m_ij are the computed edge message features, and m_i are the aggregated message features per node.
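For orientation, the invariant message-passing update of the EGNN architecture (ref. 27), written with the symbols defined here, plausibly takes the form below. This is a reconstruction from the surrounding symbol definitions, not a copy of the original equations; in particular, the exact arguments passed to ϕ_e are an assumption.

```latex
\begin{align}
  m_{ij}    &= \phi_e\!\left(v_i^{l},\, v_j^{l},\, r_{ij}\right), \\
  m_i       &= \sum_{j \neq i} m_{ij}, \\
  v_i^{l+1} &= \phi_h\!\left(v_i^{l},\, m_i\right).
\end{align}
```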

After five message-passing steps, node features v_i were sum-pooled for extensive properties (i.e., formation energy), and mean-pooled for intensive ones⁵¹ (i.e., dipole, orbital energies, HOMO–LUMO gap) and then mapped to their corresponding target shapes via an additional multi-layer Perceptron. In the specific cases of the node- or edge-based endpoints (i.e., Mulliken partial charges and Wiberg bond orders), the learned node-level v_i and message-level features m_ij were used directly for prediction.
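The difference between the two pooling choices can be sketched in a few lines. The helper name `pool_nodes` and the toy feature vectors are illustrative only.

```python
def pool_nodes(node_features, extensive):
    """Sum-pool (extensive) or mean-pool (intensive) per-atom feature vectors."""
    n_atoms = len(node_features)
    dim = len(node_features[0])
    summed = [sum(v[d] for v in node_features) for d in range(dim)]
    if extensive:
        return summed
    return [s / n_atoms for s in summed]


nodes = [[1.0, 2.0], [3.0, 4.0]]  # two atoms, two features each (toy values)
doubled = nodes + nodes           # the same atoms repeated: a system twice as large

sum_small, sum_large = pool_nodes(nodes, True), pool_nodes(doubled, True)
mean_small, mean_large = pool_nodes(nodes, False), pool_nodes(doubled, False)
```

Sum-pooling scales with the number of atoms, as expected for an extensive quantity such as formation energy, whereas mean-pooling does not.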

While we employed MSE losses for the intensive properties and Mulliken partial charge models, the ones used for the formation energy and Wiberg bond order endpoints were composed of two terms. Similar to previous work,^38,39,52 in the case of formation energy (eqn (4)) a first loss term minimized the error on the absolute formation energy y_abs, while a second term minimized the relative energy differences between the different geometries of the same molecule y_rel:

[Eqn (4): equation image not available in this version of the text]

Specifically training the model to differentiate between conformer energies was motivated by the relevance of this task in identifying the most stable conformers within an ensemble or assessing reaction barriers.⁵³ In the case of Wiberg bond orders (eqn (5)), the first term minimized the error on covalent bonds y_cov, and the second term that on non-covalent interactions y_non:

[Eqn (5): equation image not available in this version of the text]
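A two-term loss of the kind described for formation energies can be sketched as follows. The exact functional form and the weighting of the published loss are not reproduced here: the squared-error terms, the pairwise construction, and the single `rel_weight` parameter are assumptions, and no mapping to β or λ is implied.

```python
from itertools import combinations


def mse(pred, target):
    """Mean squared error; returns 0.0 for empty inputs."""
    if not pred:
        return 0.0
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def formation_energy_loss(pred, target, mol_ids, rel_weight=1.0):
    """Absolute-energy term plus a term on energy differences between
    conformers of the same molecule (illustrative form only)."""
    abs_term = mse(pred, target)
    pred_diffs, true_diffs = [], []
    for a, b in combinations(range(len(pred)), 2):
        if mol_ids[a] == mol_ids[b]:  # only compare conformers of one molecule
            pred_diffs.append(pred[a] - pred[b])
            true_diffs.append(target[a] - target[b])
    rel_term = mse(pred_diffs, true_diffs)
    return abs_term + rel_weight * rel_term


target = [0.0, 1.0, 2.0]
mol_ids = ["A", "A", "B"]

# A constant offset leaves conformer differences intact: only the absolute term is hit.
loss_offset = formation_energy_loss([1.0, 2.0, 3.0], target, mol_ids)
# A wrong conformer ordering is penalized by both terms.
loss_ranking = formation_energy_loss([1.0, 1.0, 2.0], target, mol_ids, rel_weight=0.5)
```

The Wiberg bond order loss described above has the same two-term structure, with the split made by bond type (covalent versus non-covalent) instead of by absolute versus relative energy.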

Both β and λ values were optimized on the validation set and set to β = 1, and λ = 5 × 10⁻² (see ESI† Section 1 for further details). The following network hyperparameters were used in all models considered in this study: (i) node dimension v_i = 128, (ii) message passing dimension m_ij = 32 for molecular and atomic models and 64 for the Wiberg bond order models, (iii) number of sinusoidal and cosinusoidal distance encoding features: 32, (iv) number of EGCLs: 5, and (v) number of global multi-layer Perceptrons: 3, each containing 256 hidden units. Because most of the considered endpoints feature different numerical ranges, which could cause optimization instability issues during the training of the multi-task models, a min–max standardization strategy was applied using the 1st and 99th percentiles of each endpoint, thereby also avoiding outlier scaling problems.
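The percentile-based min–max scaling can be sketched as below. The linear-interpolation percentile and the absence of clipping are assumptions; the text only states that the 1st and 99th percentiles of each endpoint define the scaling range.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] * (1.0 - frac) + xs[hi] * frac


def make_scaler(values, q_low=1.0, q_high=99.0):
    """Return (scale, unscale) functions based on the chosen percentiles."""
    low, high = percentile(values, q_low), percentile(values, q_high)
    span = high - low

    def scale(y):
        return (y - low) / span

    def unscale(z):
        return z * span + low

    return scale, unscale


# Toy endpoint values with one extreme outlier.
values = [float(v) for v in range(100)] + [10_000.0]
scale, unscale = make_scaler(values)
```

With these toy values the outlier does not stretch the scaling range: the bulk of the data still maps to roughly [0, 1], while the outlier maps far outside it.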

Networks for all endpoints, and for both Δ- and direct-learning models, were trained using the Adam stochastic gradient descent optimizer⁵⁴ with a starting learning rate of 10⁻³ and 10⁻⁴ for all single-task and multi-task models, respectively. An early-stopping strategy that monitored the monotonic decrease of the chosen loss function on the chosen validation set was adopted (see ESI† Section 1 for further details).
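A generic patience-based early-stopping rule is sketched below for illustration. The patience criterion and its value are assumptions; the exact stopping rule used for the DelFTa models is described in the ESI and may differ.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index with the best validation loss seen before
    `patience` consecutive epochs pass without improvement."""
    best = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # stop training; keep the weights from best_epoch
    return best_epoch


# Illustrative validation-loss trajectory.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]
stop_short = early_stopping_epoch(losses, patience=3)
stop_long = early_stopping_epoch(losses, patience=5)
```

The example shows the usual trade-off: a short patience stops at the first plateau, while a longer one continues far enough to find the later improvement.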

Processing of biomolecules

The structures were retrieved from the Protein Data Bank (PDB)⁵⁵ and preprocessed with the MOE software⁵⁶ (version 2019.0102). Since bond orders are intrinsically local properties, and in order to make DFT calculations feasible, atoms which were farther away than one additional residue from the non-covalent interactions of interest were removed and the resulting radicals were padded with hydrogens (see ESI† Section 4 for further details). QM reference values were obtained via Psi4⁵⁷ (version 1.3.2) using the ωB97X-D functional and the def2-SVP basis set.

Results

Predictive performance

Test-set learning curves of models trained with varying training-set sizes are shown in Fig. 3. Mean absolute errors (MAEs) w.r.t. DFT reference values decreased with increasing training set size, generally following a linear trend on log–log plots, as expected from their inverse power law relationship.^58,59 For most of the considered endpoints, the Δ-learning models consistently achieved a better predictive performance (lower MAEs) than their direct-learning counterparts. This performance gap was observed for most training set sizes, highlighting the usefulness of Δ-learning in low-data regimes. For the prediction of formation energy and relative conformer energy differences, the Δ-learning models achieved performance surpassing chemical accuracy (1 kcal mol⁻¹ ≈ 43.4 meV) for all training set sizes larger than 300 k conformers, while direct-learning models required 1.6 M training points. For all intensive molecular endpoints (i.e. those not depending on the system size, such as orbital energies or dipoles), the performance of multi-task models was superior to that of their single-task counterparts, with the exception of direct-learning on LUMO energies for which both single- and multi-task models achieved similar performance.
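The power-law reading of a learning curve can be made concrete with a least-squares fit in log–log space. The data below are synthetic (generated from an assumed law MAE = 500 · N^−0.3) and are not the values plotted in Fig. 3; only the chemical-accuracy threshold of 43.4 meV is taken from the text.

```python
import math


def fit_power_law(sizes, maes):
    """Fit log10(MAE) = slope * log10(N) + intercept by least squares."""
    xs = [math.log10(n) for n in sizes]
    ys = [math.log10(m) for m in maes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def size_for_target(slope, intercept, target_mae):
    """Training set size at which the fitted curve reaches target_mae."""
    return 10 ** ((math.log10(target_mae) - intercept) / slope)


# Synthetic learning curve, for illustration only.
sizes = [100, 1_000, 10_000, 100_000]
maes = [500.0 * n ** -0.3 for n in sizes]

slope, intercept = fit_power_law(sizes, maes)
CHEMICAL_ACCURACY_MEV = 43.4  # 1 kcal/mol, as quoted in the text
n_needed = size_for_target(slope, intercept, CHEMICAL_ACCURACY_MEV)
```

On real learning curves the fitted slope summarizes how quickly additional training data pays off, and the extrapolated size is only as reliable as the power-law assumption itself.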

The predictive performance of the models trained on 1.6 M training conformers was analyzed in more detail. Table 1 shows MAEs w.r.t. the ωB97X-D/def2-SVP reference values obtained for the test sets, and compares them to those of the semiempirical baseline method GFN2-xTB. For all considered endpoints and models, the DFT reference values were more closely approximated with the proposed machine-learning models than with GFN2-xTB. For most of the considered endpoints, the Δ-learning approach yielded lower MAEs than its direct-learning counterpart. However, the direct-learning approach achieved a slightly lower MAE (35.0 meV vs. 36.7 meV for Δ-learning) for the prediction of HOMO energies, which can be attributed to the fact that HOMO energies calculated with the DFT and GFN2-xTB methods, respectively, correlated to a lesser degree than other considered endpoints. Scatter plots showing Δ-predicted properties versus their DFT reference values are provided in Fig. 4. Direct-learning approaches also yielded higher accuracies than the semiempirical baseline GFN2-xTB for all considered endpoints (see Fig. S2, ESI†).

Table 1 MAEs (±1 standard deviation) for the baseline (GFN2-xTB) as well as the DelFTa Δ- and direct-learning models w.r.t. the DFT reference values (ωB97X-D/def2-SVP). Results computed for ∼88 k molecules (∼263 k conformers) from the three test sets. Wiberg bond order results only for bonds where GFN2-xTB values were available. The lowest MAEs w.r.t. reference values are highlighted.

Property	Unit	GFN2-xTB	DelFTa Δ-learning	DelFTa direct-learning
Formation energy	meV	86343 (± 39)	21.78 (± 0.10)	33.50 (± 0.05)
HOMO energy	meV	2115.4 (± 0.5)	36.7 (± 0.1)	35.0 (± 0.1)
LUMO energy	meV	7773.0 (± 0.7)	27.8 (± 0.2)	36.8 (± 0.2)
HOMO–LUMO gap	meV	5658 (± 1)	47.3 (± 0.2)	52.9 (± 0.1)
Total molecular dipole	D	0.622 (± 0.002)	0.0946 (± 0.0006)	0.1588 (± 0.0006)
Mulliken partial charges	e	0.0610 (± 0.0000)	0.0027 (± 0.0000)	0.0029 (± 0.0000)
Wiberg bond orders	—	0.0592 (± 0.0001)	0.0011 (± 0.0000)	0.0017 (± 0.0000)
Conformer pairwise energy difference	meV	73.6 (± 0.4)	22.29 (± 0.06)	34.27 (± 0.08)

Utility of Δ-learning

While Δ-learning models generally outperformed their direct-learning analogues in our experiments, the observed performance difference was not uniformly distributed across the endpoints. To investigate under which conditions the Δ-learning paradigm is advantageous over direct-learning, we analyzed the relative performance difference between the two approaches as a function of the Pearson correlation coefficient r between baseline (GFN2-xTB) and reference (ωB97X-D/def2-SVP) values. Fig. 5 indicates that the relative performance advantage of Δ-learning is positively correlated to the correlation r between baseline and reference values. This result confirms the intuitive understanding that Δ-learning provides a larger performance advantage over direct-learning the more information the baseline method provides.
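The two quantities compared in this analysis can be computed as sketched below. The MAE pairs are taken from Table 1; the definition of the relative performance difference as (MAE_direct − MAE_Δ)/MAE_direct is an assumption, and the per-endpoint correlation coefficients of Fig. 5 are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def relative_improvement(mae_direct, mae_delta):
    """Fractional MAE reduction of Δ-learning relative to direct-learning."""
    return (mae_direct - mae_delta) / mae_direct


# (Δ-learning MAE, direct-learning MAE) from Table 1.
table1 = {
    "Formation energy": (21.78, 33.50),
    "HOMO energy": (36.7, 35.0),
    "LUMO energy": (27.8, 36.8),
    "HOMO-LUMO gap": (47.3, 52.9),
    "Total molecular dipole": (0.0946, 0.1588),
}
improvement = {
    name: relative_improvement(direct, delta)
    for name, (delta, direct) in table1.items()
}
```

In this definition a positive value means that Δ-learning has the lower MAE; the HOMO energy is the only endpoint in this selection with a negative value, in line with the discussion above.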

Non-covalent interactions in biomolecules

Many tasks in medicinal and bioorganic chemistry encompass the study of non-covalent intra- (e.g. secondary-structure-stabilizing) and intermolecular (e.g. protein–ligand) interactions.^60,61 Modelling of non-covalent interactions with semiempirical methods has previously been shown to be challenging.⁶² Given the biological relevance of these molecular interactions, it is desirable to develop QML models which can accurately extrapolate from small molecules (e.g., bioactive ligands) to larger biomolecules (e.g., peptides), foregoing the need for the expensive DFT calculations associated with structures of such size.

Towards that end, we preliminarily investigated the generalization capabilities of direct-learning QML models for selected biomolecular systems with crucial non-covalent interactions by comparing predicted Wiberg bond orders to DFT reference values. Δ-learning models were not considered in these analyses, as bond order values for many non-covalent interactions of interest are not provided by the GFN2-xTB semiempirical method. The investigated structures (Fig. 6) include a hydrolase β-turn,⁶³ a β-sheet and an α-helix of ubiquitin,⁶⁴ glutamate in a glutamate dehydrogenase binding pocket,⁶⁵ a uracil–adenine base pair in an RNA structure,⁶⁶ and a transcription factor binding to a cytosine–guanine base pair of a DNA structure.⁶⁷

By training only on monomers of drug-like molecules and their intramolecular interactions, as included in the QMugs dataset, the models successfully extrapolated to non-covalent intra- and intermolecular interactions in biomacromolecules including monomers and dimers. For instance, weak hydrogen bonds (Wiberg bond order <0.1), such as the ones found in α-helices (Fig. 6(C)), as well as strong hydrogen bonds (Wiberg bond order >0.1), such as the ones found in the RNA base pairs (Fig. 6(E)), β-turns/sheets (Fig. 6(A) and (B)), and in the glutamate binding pocket (Fig. 6(D)), were accurately predicted. However, we observed reduced predictive capabilities for some interactions such as hydrogen bonds with phosphate groups (Fig. S4, ESI†).

Benchmarking

We compared the implementation of the DelFTa deep-learning architecture used in this work to the one originally reported in ref. 27, for the QM9 dataset,⁶⁸ a benchmark used in previous QML studies, which features quantum observables for ∼134 k small molecules. The same DelFTa model architecture used for the direct-learning of formation energies was retrained on the QM9 training set (∼100 k molecules), validated and tested on its respective validation and test sets (∼15 k molecules each). The trained models achieved an MAE of 11.9 ± 0.7 meV in three independent model runs, which is comparable to the originally-reported performance (12 meV).²⁷ The lower overall error of the models trained on QM9 compared to those trained on QMugs (11.9 meV and 33.5 meV, respectively) can be attributed to two key differences between the datasets, namely atom type diversity (10 different atom types in QMugs, 5 in QM9), and molecular size (up to 100 heavy atoms in QMugs, and up to 9 heavy atoms in QM9, respectively).

Since DelFTa models were trained on uncharged molecules but had shown predictive capabilities for charged biomolecules, we quantitatively investigated the models’ performance on 176 randomly-sampled charged conformers corresponding to 59 molecules extracted from the test sets (see Table S2 for details, ESI†). Compared to uncharged molecules, the predictive performance decreased moderately for Mulliken partial charges and Wiberg bond orders, and substantially decreased for other endpoints (see Table S3, ESI†). Based on these results, we discourage the use of the DelFTa models provided in the accompanying software for out-of-distribution molecules.

Listing 1 A small snippet highlighting the main predictive capabilities of the DelFTa Python package and its integration with Pybel. Molecules, with or without associated 3D geometry, can be supplied via a wide array of file types.

Finally, we explored the model performance (trained on ωB97X-D/def2-SVP DFT data) on a set of molecules whose DFT reference values were computed with a more comprehensive basis set (ωB97X-D/def2-QZVP). Calculations for 2,874 conformations corresponding to 958 distinct molecules with this larger basis set did not indicate superior performance of Δ- over direct-learning (see Fig. S3 for details, ESI†). Furthermore, Mulliken partial charges on the ωB97X-D/def2-QZVP level of theory were better approximated using GFN2-xTB than with either of the provided machine-learning models. This was expected as GFN2-xTB better approximates charges computed with the larger basis set than the chosen DFT reference used throughout this study (ωB97X-D/def2-SVP).

Software

DelFTa is fully implemented in the Python programming language⁶⁹ and uses the PyTorch⁷⁰ (version 1.8.0) and PyTorch Geometric packages⁷¹ (version 1.7.2) to enable model training and inference. A minimalist code example for the usage of the package is provided in Listing 1. Semiempirical calculations at the GFN2-xTB^17,44–46 level of theory are performed via open-source xtb binaries. All molecular manipulation routines (including optional generation of initial 3D coordinates and GFN2-xTB geometry optimization) are integrated into DelFTa and handled via the Pybel package⁷² and OpenBabel⁷³ Python bindings. The software is fully open-sourced, available on GitHub (https://github.com/josejimenezluna/delfta) under the AGPLv3 license, and distributed through the conda package manager.⁷⁴ A Docker⁷⁵ container is also provided for easier accessibility and to ensure long-term functionality. Furthermore, DelFTa provides extensive documentation for its code and APIs. Tutorials in the form of several didactic Jupyter notebooks⁷⁶ are also available.

On a computer with a consumer-grade graphics processing unit, DelFTa predicts all considered endpoints at a speed of approximately 50 and 5 molecules per second for the direct and Δ-learning models, respectively, with the latter approach mostly bottlenecked by the additionally-required baseline GFN2-xTB calculations.
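To put these throughput figures into perspective, the following back-of-the-envelope estimate converts them into wall-clock time. The library size of one million molecules is a hypothetical example, and constant throughput on a single machine is assumed.

```python
def wall_time_hours(n_molecules, molecules_per_second):
    """Wall-clock time in hours at a constant prediction throughput."""
    return n_molecules / molecules_per_second / 3600.0


N_LIBRARY = 1_000_000  # hypothetical screening library size

hours_direct = wall_time_hours(N_LIBRARY, 50)  # direct-learning, ~50 molecules/s
hours_delta = wall_time_hours(N_LIBRARY, 5)    # Δ-learning, ~5 molecules/s
```

Under these assumptions the direct-learning models need about 5.6 hours and the Δ-learning models about 56 hours, i.e. a tenfold difference that stems from the GFN2-xTB baseline calculations.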

Discussion

QML models were trained for a wide variety of endpoints on a large dataset of quantum observables. Models were validated for both Δ- and direct-learning, as well as single- and multi-task paradigms. The results suggest that both Δ- and direct-learning models have improved accuracy over the GFN2-xTB baseline in approximating ωB97X-D/def2-SVP reference values. For the majority of the considered endpoints, Δ-learning models displayed lower MAEs than their direct-learning analogues at roughly the same computational cost as GFN2-xTB. Additionally, the Wiberg bond order models were able to approximate non-covalent interactions in larger biomolecular systems.

We foresee many applications for the hereby provided models in both supervised and generative molecular pipelines. For instance, featurization with quantum-derived properties, such as partial charges and nuclear magnetic resonance shifts, was shown to increase the performance of reactivity prediction with graph neural networks in low-data regimes.⁷⁷ Similar effects may also be anticipated in medicinal chemistry as the electronic structure of drug-like molecules governs many related properties. Potential examples include the influence of (i) HOMO/LUMO energies on phototoxicity,^4,5 (ii) dipole moments on aqueous solubility^78,79 and membrane permeability,^9,80 and (iii) formation energies on 3D-conformer ensembles⁹ or site-of-metabolism prediction.⁸¹

Future prospective applications will reveal the practical applicability and usefulness of these models in drug discovery-related tasks. The current limitations are twofold, concerning the modelling performance and the models’ applicability domain. With regard to modelling performance, we noted that while the Δ-learning approach affords substantial improvements over its direct-learning analogue w.r.t. target DFT reference values, this does not necessarily hold in the case of HOMO energies, probably owing to the limited correlation of the baseline and reference methods. Furthermore, some of the observations regarding the performance advantages do not necessarily hold for reference values computed with a more comprehensive basis set. Limitations with regard to the applicability domain mostly stem from the underlying QMugs dataset, which was conceived with medicinal chemistry applications in mind. For example, it does not feature organometallic complexes, polymers, crystalline structures, or molecular systems including dimers, radicals, excited electronic states, higher-order spin states, off-equilibrium structures or charged molecules. Specifically for charged molecules we observed substantially decreased predictive performance in this study. Adequate models for these types of molecular structures will require training data that specifically covers the respective chemical space, and therefore remain a subject of future work.

Conflicts of interest

G. S. is a cofounder of inSili.com LLC, Zurich, and a consultant to the pharmaceutical industry.

Acknowledgments

We thank N. Weskamp for helpful discussions on this work. This work was financially supported by the ETH RETHINK initiative, the Swiss National Science Foundation (Grant No. 205321_182176), and Boehringer Ingelheim Pharma GmbH & Co. KG. C.I. acknowledges support from the Scholarship Fund of the Swiss Chemical Industry.

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2cp00834c

Notes and references

Khrenova M. Nemukhin A. V. Grigorenko B. L. Krylov A. Domratcheva T. J. Chem. Theory Comput. 2010;6:2293–2302. doi: 10.1021/ct100179p. [DOI] [PubMed] [Google Scholar]
Xie N.-Z. Du Q.-S. Li J.-X. Huang R.-B. PLoS One. 2015;10:e0137113. doi: 10.1371/journal.pone.0137113. [DOI] [PMC free article] [PubMed] [Google Scholar]
Tavares A. B. M. Neto J. X. L. Fulco U. L. Albuquerque E. L. Sci. Rep. 2018;8:1–13. doi: 10.1038/s41598-018-20325-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
Freccero M. Fasani E. Mella M. Manet I. Monti S. Albini A. Chem. – Eur. J. 2008;14:653–663. doi: 10.1002/chem.200701099. [DOI] [PubMed] [Google Scholar]
Llano J. Raber J. Eriksson L. A. J. Photochem. Photobiol., A. 2003;154:235–243. doi: 10.1016/S1010-6030(02)00351-9. [DOI] [Google Scholar]
Yu H. S. Gao C. Lupyan D. Wu Y. Kimura T. Wu C. Jacobson L. Harder E. Abel R. Wang L. J. Chem. Inf. Model. 2019;59:3955–3967. doi: 10.1021/acs.jcim.9b00268. [DOI] [PubMed] [Google Scholar]
Zhao Z. Liu Q. Bliven S. Xie L. Bourne P. E. J. Med. Chem. 2017;60:2879–2889. doi: 10.1021/acs.jmedchem.6b01815. [DOI] [PMC free article] [PubMed] [Google Scholar]
Fanfrlik J. Brahmkshatriya P. S. Rezac J. Jilkova A. Horn M. Mares M. Hobza P. Lepsik M. J. Phys. Chem. B. 2013;117:14973–14982. doi: 10.1021/jp409604n. [DOI] [PubMed] [Google Scholar]
Pultar F. Hansen M. E. Wolfrum S. Böselt L. Fróis-Martins R. Bloch S. Kravina A. G. Pehlivanoglu D. Schäffer C. LeibundGut-Landmann S. Riniker S. Carreira E. M. J. Am. Chem. Soc. 2021;143:10389–10402. doi: 10.1021/jacs.1c04825. [DOI] [PubMed] [Google Scholar]
Rupp M. Bauer M. R. Wilcken R. Lange A. Reutlinger M. Boeckler F. M. Schneider G. PLoS Comput. Biol. 2014;10:e1003400. doi: 10.1371/journal.pcbi.1003400. [DOI] [PMC free article] [PubMed] [Google Scholar]
Burke K. J. Chem. Phys. 2012;136:150901. doi: 10.1063/1.4704546. [DOI] [PubMed] [Google Scholar]
Sherrill C. D. J. Chem. Phys. 2010;132:110902. doi: 10.1063/1.3369628. [DOI] [PubMed] [Google Scholar]
Von Lilienfeld O. A. Tavernelli I. Rothlisberger U. Sebastiani D. Phys. Rev. Lett. 2004;93:153004. doi: 10.1103/PhysRevLett.93.153004. [DOI] [PubMed] [Google Scholar]
Schwabe T. Grimme S. Acc. Chem. Res. 2008;41:569–579. doi: 10.1021/ar700208h. [DOI] [PubMed] [Google Scholar]
Wang S. Witek J. Landrum G. A. Riniker S. J. Chem. Inf. Model. 2020;60:2044–2058. doi: 10.1021/acs.jcim.0c00025. [DOI] [PubMed] [Google Scholar]
Stewart J. J. P. J. Mol. Model. 2007;13:1173–1213. doi: 10.1007/s00894-007-0233-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
Bannwarth C. Ehlert S. Grimme S. J. Chem. Theory Comput. 2019;15:1652–1671. doi: 10.1021/acs.jctc.8b01176. [DOI] [PubMed] [Google Scholar]
von Lilienfeld O. A. Müller K.-R. Tkatchenko A. Nat. Rev. Chem. 2020;4:347–358. doi: 10.1038/s41570-020-0189-9. [DOI] [PubMed] [Google Scholar]
Huang B. von Lilienfeld O. A. Nat. Chem. 2020;12:945–951. doi: 10.1038/s41557-020-0527-z. [DOI] [PubMed] [Google Scholar]
Unke O. T. Chmiela S. Sauceda H. E. Gastegger M. Poltavsky I. Schütt K. T. Tkatchenko A. Müller K.-R. Chem. Rev. 2021;121(16):10142–10186. doi: 10.1021/acs.chemrev.0c01111. [DOI] [PMC free article] [PubMed] [Google Scholar]
Unke O. T. Koner D. Patra S. Käser S. Meuwly M. Mach. Learn.: Sci. Technol. 2020;1:013001. [Google Scholar]
Lemm D. von Rudorff G. F. von Lilienfeld O. A. Nat. Commun. 2021;12:4468. doi: 10.1038/s41467-021-24525-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
Bronstein M. M. Bruna J. LeCun Y. Szlam A. Vandergheynst P. IEEE Signal Process. Mag. 2017;34:18–42. [Google Scholar]
Bronstein M. M., Bruna J., Cohen T. and Velicković P., 2021, arXiv:2104.13478
Atz K. Grisoni F. Schneider G. Nat. Mach. Intell. 2021;3:1023–1032. doi: 10.1038/s42256-021-00418-8. [DOI] [Google Scholar]
Schütt K. T. Sauceda H. E. Kindermans P.-J. Tkatchenko A. Müller K.-R. J. Chem. Phys. 2018;148:241722. doi: 10.1063/1.5019779. [DOI] [PubMed] [Google Scholar]
Satorras V. G., Hoogeboom E. and Welling M., 2021, arXiv:2102.09844
Schütt K. T., Unke O. T. and Gastegger M., 2021, arXiv:2102.03150
Unke O. T. Meuwly M. J. Chem. Theory Comput. 2019;15:3678–3693. doi: 10.1021/acs.jctc.9b00181.
Batzner S., Smidt T. E., Sun L., Mailoa J. P., Kornbluth M., Molinari N. and Kozinsky B., 2021, arXiv:2101.03164
Unke O. T., Chmiela S., Gastegger M., Schütt K. T., Sauceda H. E. and Müller K.-R., 2021, arXiv:2105.00304
Schütt K. Gastegger M. Tkatchenko A. Müller K.-R. Maurer R. J. Nat. Commun. 2019;10:5024. doi: 10.1038/s41467-019-12875-2.
Unke O. T., Bogojeski M., Gastegger M., Geiger M., Smidt T. and Müller K.-R., 2021, arXiv:2106.02347
von Lilienfeld O. A., 31st Conference on Neural Information Processing Systems, 2017
Ramakrishnan R. Dral P. O. Rupp M. von Lilienfeld O. A. J. Chem. Theory Comput. 2015;11:2087–2096. doi: 10.1021/acs.jctc.5b00099.
Smith J. S. Zubatyuk R. Nebgen B. Lubbers N. Barros K. Roitberg A. E. Isayev O. Tretiak S. Sci. Data. 2020;7:134. doi: 10.1038/s41597-020-0473-z.
Nandi A. Qu C. Houston P. L. Conte R. Bowman J. M. J. Chem. Phys. 2021;154:051102. doi: 10.1063/5.0038301.
Qiao Z. Welborn M. Anandkumar A. Manby F. R. Miller III T. F. J. Chem. Phys. 2020;153:124111. doi: 10.1063/5.0021955.
Christensen A. S. Sirumalla S. K. Qiao Z. O'Connor M. B. Smith D. G. Ding F. Bygrave P. J. Anandkumar A. Welborn M. Manby F. R. et al. J. Chem. Phys. 2021;155:204103. doi: 10.1063/5.0061990.
Zheng P. Zubatyuk R. Wu W. Isayev O. Dral P. O. Nat. Commun. 2021;12:7022. doi: 10.1038/s41467-021-27340-2.
Chai J.-D. Head-Gordon M. Phys. Chem. Chem. Phys. 2008;10:6615–6620. doi: 10.1039/B810189B.
Weigend F. Ahlrichs R. Phys. Chem. Chem. Phys. 2005;7:3297–3305. doi: 10.1039/B508541A.
Isert C., Atz K., Jiménez-Luna J. and Schneider G., 2021, arXiv:2107.00367
Grimme S. Bannwarth C. Shushkov P. J. Chem. Theory Comput. 2017;13:1989–2009. doi: 10.1021/acs.jctc.7b00118.
Grimme S. J. Chem. Theory Comput. 2019;15:2847–2862. doi: 10.1021/acs.jctc.9b00143.
Bannwarth C. Caldeweyher E. Ehlert S. Hansen A. Pracht P. Seibert J. Spicher S. Grimme S. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2020:e01493.
Gaulton A. Hersey A. Nowotka M. Bento A. P. Chambers J. Mendez D. Mutowo P. Atkinson F. Bellis L. J. Cibrián-Uhalte E. et al. Nucleic Acids Res. 2017;45:D945–D954. doi: 10.1093/nar/gkw1074.
Weininger D. J. Chem. Inf. Comput. Sci. 1988;28:31–36. doi: 10.1021/ci00057a005.
EGNN-PyTorch, https://github.com/lucidrains/egnn-pytorch, 2021
Elfwing S. Uchibe E. Doya K. Neural Networks. 2018;107:3–11. doi: 10.1016/j.neunet.2017.12.012.
Pronobis W. Schütt K. T. Tkatchenko A. Müller K.-R. Eur. Phys. J. B. 2018;91:1–6. doi: 10.1140/epjb/e2018-90148-y.
Qiao Z., Christensen A. S., Manby F. R., Welborn M., Anandkumar A. and Miller III T. F., 2021, arXiv:2105.14655
Pung A. Leito I. J. Phys. Chem. A. 2017;121:6823–6829. doi: 10.1021/acs.jpca.7b05197.
Kingma D. P. and Ba J., 2014, arXiv:1412.6980
Berman H. M. Westbrook J. Feng Z. Gilliland G. Bhat T. N. Weissig H. Shindyalov I. N. Bourne P. E. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
Chemical Computing Group ULC, Molecular Operating Environment (MOE), 2019.01, 2020
Smith D. G. Burns L. A. Simmonett A. C. Parrish R. M. Schieber M. C. Galvelis R. Kraus P. Kruse H. Di Remigio R. Alenaizan A. et al. J. Chem. Phys. 2020;152:184108. doi: 10.1063/5.0006002.
Faber F. A. Christensen A. S. Huang B. von Lilienfeld O. A. J. Chem. Phys. 2018;148:241717. doi: 10.1063/1.5020710.
Müller K.-R. Finke M. Murata N. Schulten K. Amari S. Neural Comput. 1996;8:1085–1106. doi: 10.1162/neco.1996.8.5.1085.
Kuhn B. Mohr P. Stahl M. J. Med. Chem. 2010;53:2601–2611. doi: 10.1021/jm100087s.
Bissantz C. Kuhn B. Stahl M. J. Med. Chem. 2010;53:5061–5084. doi: 10.1021/jm100112j.
Christensen A. S. Kubar T. Cui Q. Elstner M. Chem. Rev. 2016;116:5301–5337. doi: 10.1021/acs.chemrev.5b00584.
Hynes T. R. Kautz R. A. Goodman M. A. Gill J. F. Fox R. O. Nature. 1989;339:73–76. doi: 10.1038/339073a0.
Vijay-Kumar S. Bugg C. E. Cook W. J. J. Mol. Biol. 1987;194:531–544. doi: 10.1016/0022-2836(87)90679-6.
Stillman T. Baker P. Britton K. Rice D. J. Mol. Biol. 1993;234:1131–1139. doi: 10.1006/jmbi.1993.1665.
Rypniewski W. Vallazza M. Perbandt M. Klussmann S. DeLucas L. J. Betzel C. Erdmann V. A. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2006;62:659–664. doi: 10.1107/S090744490601359X.
Lian T.-f. Xu Y.-p. Li L.-f. Su X.-D. Cell Rep. 2017;19:1334–1342. doi: 10.1016/j.celrep.2017.04.057.
Ramakrishnan R. Dral P. O. Rupp M. von Lilienfeld O. A. Sci. Data. 2014;1:140022. doi: 10.1038/sdata.2014.22.
The Python Language Reference, https://docs.python.org/3/reference/
Paszke A. Gross S. Massa F. Lerer A. Bradbury J. Chanan G. Killeen T. Lin Z. Gimelshein N. Antiga L. et al. Adv. Neural Inf. Process. Syst. 2019;32:8026–8037.
Fey M. and Lenssen J. E., ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019
O'Boyle N. M. Morley C. Hutchison G. R. Chem. Cent. J. 2008;2:1–7. doi: 10.1186/1752-153X-2-1.
O'Boyle N. M. Banck M. James C. A. Morley C. Vandermeersch T. Hutchison G. R. J. Cheminformatics. 2011;3:1–14. doi: 10.1186/1758-2946-3-1.
Conda package manager, https://conda.io
Merkel D. Linux J. 2014:2.
Kluyver T., Ragan-Kelley B., Pérez F., Granger B., Bussonnier M., Frederic J., Kelley K., Hamrick J., Grout J., Corlay S., Ivanov P., Avila D., Abdalla S. and Willing C., Jupyter Notebooks – A publishing format for reproducible computational workflows, IOS Press, 2016, pp. 87–90
Stuyver T. and Coley C. W., 2021, arXiv:2107.10402
Cardoso R. M. Martins P. A. Ramos C. V. Cordeiro M. M. Leote R. J. Naqvi K. R. Vaz W. L. Moreno M. J. Biochim. Biophys. Acta, Biomembr. 2020;1862:183157. doi: 10.1016/j.bbamem.2019.183157.
Darvishmanesh S. Vanneste J. Tocci E. Jansen J. C. Tasselli F. Degrève J. Drioli E. Van der Bruggen B. J. Phys. Chem. B. 2011;115:14507–14517. doi: 10.1021/jp207569m.
Matuszek A. M. Reynisson J. Mol. Inf. 2016;35:46–53. doi: 10.1002/minf.201500105.
Sun H. Scott D. O. Chem. Biol. Drug Des. 2010;75:3–17. doi: 10.1111/j.1747-0285.2009.00899.x.

Supplementary Materials