Unveiling the molecular mechanism of SARS-CoV-2 main protease inhibition from 137 crystal structures using algebraic topology and deep learning

Duc Duy Nguyen; Kaifu Gao; Jiahui Chen; Rui Wang; Guo-Wei Wei

doi:10.1039/d0sc04641h

2020 Sep 30;11(44):12036–12046. doi: 10.1039/d0sc04641h

PMCID: PMC8162568 PMID: 34123218

Abstract

Currently, there are neither effective antiviral drugs nor vaccines for coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Owing to its high conservation and low similarity to human genes, the SARS-CoV-2 main protease (M^pro) is one of the most favorable drug targets. However, the current understanding of the molecular mechanism of M^pro inhibition is limited by the lack of reliable binding affinity ranking and prediction for existing structures of M^pro–inhibitor complexes. This work integrates mathematics (i.e., algebraic topology) and deep learning (MathDL) to provide a reliable ranking of the binding affinities of 137 SARS-CoV-2 M^pro inhibitor structures. We reveal that the Gly143 residue in M^pro is the most attractive site for hydrogen bond formation, followed by Glu166, Cys145, and His163. We also identify 71 targeted covalent bonding inhibitors. MathDL was validated on the PDBbind v2016 core set benchmark and a carefully curated SARS-CoV-2 inhibitor dataset to ensure the reliability of the present binding affinity prediction. The present binding affinity ranking, interaction analysis, and fragment decomposition offer a foundation for future drug discovery efforts.

By integrating algebraic topology and deep learning, we provide a reliable ranking of binding affinities, binding site analysis, and fragment decomposition for 137 SARS-CoV-2 main protease inhibitors.

1. Introduction

Since late December 2019, the COVID-19 pandemic caused by the new severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected more than 22 million individuals and caused more than 777 000 fatalities in all of the continents and over 213 countries and territories as of August 19th, 2020. Under the current global health emergency, researchers around the world have been investigating different drug targets of SARS-CoV-2, such as the main protease (M^pro, also called 3CL^pro), papain-like protease (PL^pro), RNA-dependent RNA polymerase (RdRp), and 5′-to-3′ helicase protein (Nsp13), to seek potential cures for this serious pandemic. To date, although some vaccines are undergoing Phase III trials,¹ their safety and efficacy are still unclear.²

The main protease, one of the best-characterized targets for coronaviruses, attracts much research attention because it is highly conserved and distinct from any human gene. A recent study shows that although the overall sequence identity between SARS-CoV and SARS-CoV-2 is just 80%, the M^pro of SARS-CoV-2 shares 96.08% sequence identity with that of SARS-CoV.³ Therefore, we hypothesize that a potent SARS-CoV M^pro inhibitor is also a potent SARS-CoV-2 M^pro inhibitor.

At present, more than 300 potential SARS-CoV M^pro inhibitors with binding affinities are available in the ChEMBL database,⁴ and these can be considered potential SARS-CoV-2 M^pro inhibitors. Recently, a total of 146 crystal structures of SARS-CoV-2 M^pro in complex with ligands have been released on the Protein Data Bank (PDB).⁵ Among them, 137 crystal structures have no reported binding affinities for various reasons. However, the central dogma of drug design and discovery concerns the molecular mechanism and binding affinity of drug–target interactions. Knowing the binding affinities of these 137 SARS-CoV-2 M^pro inhibitors, and their ranking, is of great significance to the future design of anti-SARS-CoV-2 drugs.

In this work, for the first time, we predict the binding affinities of these 137 M^pro–inhibitor complexes by reformulating algebraic topology-based mathematics-deep learning (MathDL) models, which have been top competitors in the D3R Grand Challenges, a worldwide competition series in computer-aided drug design, over the past three years.⁶ We generate reliable poses for 141 M^pro inhibitors that have binding affinities but no complex structures. Together with 44 other complexes, these compose a set of 185 M^pro–inhibitor complexes, which is paired with 17 382 protein–ligand complexes in the PDBbind 2019 general set. These datasets are used to construct 11 MathDL models in single-task and multitask settings.⁶ One of these 11 MathDL models has been validated on the PDBbind v2016 core set benchmark, ranking among the top existing scoring functions (see Section 3.3.1). The other ten MathDL models have been cross-validated on the set of 185 M^pro–inhibitor complexes, showing an averaged Pearson's correlation coefficient of 0.73.

Notably, covalent irreversible inhibition of SARS-CoV/SARS-CoV-2 M^pro proceeds in two steps. The inhibitor first binds to the protease noncovalently; a nucleophilic attack by Cys145 then leads to the formation of a stable covalent bond between the protease and the inhibitor.^7,8 The interaction depends on both the equilibrium-binding constant K_i (designated as k₁/k₂) and the inactivation rate constant for covalent bond formation, k₃. In this work, the binding affinity/IC₅₀ assesses the first step, i.e., the formation of the noncovalent complex.

In a nutshell, the present work provides reliable binding affinity predictions and a ranking of 137 SARS-CoV-2 inhibitors that have crystal structures. It also offers data curation and validated models for exploring potential SARS-CoV-2 M^pro inhibitors. Furthermore, this work explores different possible binding regions on the SARS-CoV-2 main protease and decodes the most favorable molecular fragments for inhibitor design.

2. Results and discussions

2.1. Results

This section applies the MathDL models developed in Section 3.3 to predict the binding affinities, and their ranking, of SARS-CoV-2 inhibitors that have no reported experimental affinities. To reduce the role of 3D pose prediction errors in our model, we use only SARS-CoV-2 inhibitors with X-ray structures available in the PDB. We manually searched for these ligands on the PDB and arrived at a set of 137 SARS-CoV-2 M^pro inhibitors that have X-ray crystal structures but lack experimental binding affinities. We name this set SARS-CoV PDB-noBA (see Table 3). In this experiment, we use one MathDL model optimized on the PDBbind v2016 core set (see Section 3.3.1), together with five MathDL-All and five MathDL-MT models obtained from the 5-fold study on the SARS-CoV BA set (see Section 3.3.2). The final predicted binding affinity is the consensus of these 11 models. The top ten inhibitors indicated by our models are shown in Table 1.
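
The consensus step can be illustrated with a minimal sketch. It assumes that "consensus" means a plain arithmetic mean over the 11 per-model predictions, which the text does not spell out; the function name and the numbers below are hypothetical and only serve as an example.

```python
from statistics import mean

def consensus_affinity(predictions):
    """Average the per-model predictions (kcal/mol) for one complex.

    Assumption: the consensus is an unweighted mean of all models.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    return mean(predictions)

# Hypothetical predictions from 11 models for a single complex (not real data).
model_preds = [-8.7, -9.1, -8.9, -8.8, -9.0, -8.9, -9.2, -8.6, -8.9, -8.8, -9.0]
print(round(consensus_affinity(model_preds), 2))
```

An unweighted mean treats the core-set model and the ten cross-validation models equally; a weighted scheme would be an equally plausible reading.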

Table 3. A summary of our selected data sets.

Data name	Data size	Descriptions	References
PDBbind v2019	17 382	Partial PDBbind general set v2019	18
PDBbind v2016 core set	290	PDBbind v2016 core set	18
SARS-CoV PDB	192	Inhibitors of SARS-CoV/SARS-CoV-2 M^pro having X-ray crystal structures	5, 19 and 20
SARS-CoV PDB-BA	44	Inhibitors of SARS-CoV/SARS-CoV-2 M^pro having X-ray crystal structures and experimental binding affinities	5, 18–23
SARS-CoV PDB-noBA	137	Inhibitors of SARS-CoV-2 M^pro having X-ray crystal structures but lacking experimental binding affinities	5, 18–20, 24
SARS-CoV 2D	141	Inhibitors of SARS-CoV/SARS-CoV-2 M^pro having only 2D structures	4, 19, 20, 25–31
SARS-CoV BA	185	Inhibitors of SARS-CoV/SARS-CoV-2 M^pro having experimental binding affinities	5, 18–20, 26–31

Table 1. Binding affinities of the top 10 complexes in the SARS-CoV PDB-noBA dataset predicted by our MathDL. “Pred. BA” indicates the predicted binding free energy in kcal mol⁻¹ and “Pred. IC₅₀” is the corresponding IC₅₀ in μM, obtained via the following conversion: Pred. IC₅₀ = 10^{Pred. BA/1.3633} × 10⁶.

PDBID	Pred. BA	Pred. IC₅₀	PDBID	Pred. BA	Pred. IC₅₀
7c8t	−8.90	0.30	6z2e	−8.43	0.66
5rg1	−8.50	0.58	6xbi	−8.34	0.76
6xhm	−8.50	0.58	6xmk	−8.33	0.78
7bqy	−8.49	0.59	5rh7	−8.32	0.79
5rfr	−8.45	0.63	6xbh	−8.27	0.86
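
The conversion stated in the caption of Table 1 can be checked with a short sketch. The function name is hypothetical; the constant 1.3633 and the factor 10⁶ (molar to micromolar) are taken from the caption, and the three inputs are predicted affinities listed in Table 1.

```python
def ba_to_ic50_uM(pred_ba_kcal_per_mol):
    """Convert a predicted binding free energy (kcal/mol) to IC50 in micromolar.

    Implements Pred. IC50 = 10^(Pred. BA / 1.3633) * 10^6 from the Table 1 caption.
    """
    return 10 ** (pred_ba_kcal_per_mol / 1.3633) * 1e6

# Predicted affinities copied from Table 1.
for pdb_id, ba in [("7c8t", -8.90), ("7bqy", -8.49), ("6xmk", -8.33)]:
    print(pdb_id, round(ba_to_ic50_uM(ba), 2))
```

Rounded to two decimals, the three values agree with the Pred. IC₅₀ column of Table 1 (0.30, 0.59, and 0.78 μM).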

The most potent SARS-CoV-2 inhibitor identified by our MathDL models is the inhibitor Nol in complex 7c8t. Nol was synthesized by Yang and his colleagues⁹ and was found to have remarkable activity against SARS-CoV and HCoV.⁹ Specifically, the dissociation constant K_i of Nol was found to be 0.053 μM against SARS-CoV.⁹ Our MathDL models predict that Nol also inhibits the SARS-CoV-2 main protease, with a predicted IC₅₀ of 0.30 μM.

Another important top-ranked SARS-CoV-2 inhibitor found by our models is the Michael acceptor inhibitor N3 in complex 7bqy. Designed by Yang and his colleagues,⁸ N3 was found to have antiviral activity against the M^pro of different coronaviruses such as SARS-CoV and MERS-CoV.^8,10 Specifically, the dissociation constant K_i of N3 was found to be 9.0 μM against SARS-CoV.⁸ Our MathDL models predict that N3 also inhibits the SARS-CoV-2 main protease, with an even better predicted affinity of 0.59 μM. This finding is consistent with the literature¹¹ showing that N3 is a potent inhibitor of COVID-19 virus M^pro.

The inhibitor Qys in complex 6xmk is also noteworthy. Our predicted IC₅₀ is 0.78 μM. Soon after we made the prediction, on August 12th, 2020, Rathnayake et al.¹² released another Qys–main protease complex with PDB ID 6w2a and reported an IC₅₀ of 0.45 μM for Qys against SARS-CoV-2, which is close to our prediction.

It is worth pointing out that, except for the inhibitor T9J in the complex 5rg1, the inhibitors reported in Table 1 are covalent inhibitors, which irreversibly form covalent bonds with Cys145 of the main protease (see discussion in Section 2.2.2). However, our models only predict the non-covalent binding affinity, which is measured before enzyme deactivation. The predicted binding affinities of all 137 complexes in the SARS-CoV PDB-noBA dataset from the various MathDL models are presented in Table S8 in ESI.† In this table, we also supply the synthetic accessibility score (SAS), partition coefficient log P, and solubility log S for each small molecule. SAS is obtained via RDKit,¹³ whereas log P and log S are evaluated by our TopP-S model.¹⁴

2.2. Discussion

2.2.1. Binding site analysis

Based on the crystal structure information of the 137 complexes in the SARS-CoV PDB-noBA set, we have identified 13 distinct binding site regions of the SARS-CoV-2 main protease, as illustrated in Fig. 1. These binding pockets are denoted by P_i, i = 1, 2, …, 13. Fig. 2a reveals that binding pocket P₁ is the most common binding region of the SARS-CoV-2 main protease, attracting around 80.2% of the ligands in the SARS-CoV PDB-noBA data set of 137 complexes. This finding is no surprise since the binding pocket P₁ shares similar active sites with its predecessor, i.e., SARS-CoV M^pro. Specifically, P₁ encompasses the His41 and Cys145 catalytic dyad, which is essential to the substrate-binding mechanism.⁸ In addition, the substrate-binding residues Tyr161 and His163 (ref. 15) are covered by P₁. Binding pockets P₂, P₃, P₅, P₇, P₈, and P₁₀ are the least favored sites, each hosting only one ligand. The remaining binding pockets each involve no more than 7 ligands. To study the correlation between the binding regions and the binding free energy, we present the box plot in Fig. 2b, which summarizes the energy values through their quartiles.

The prevailing binding pocket P₁ is the best region on the SARS-CoV-2 M^pro for inhibitor design, with a median binding energy of −7.22 kcal mol⁻¹. Nol is the best inhibitor candidate for the binding site P₁, with a predicted affinity of −8.90 kcal mol⁻¹. Other binding regions, such as P₄ and P₁₁, are less common but still contribute to the binding mechanism, with best predicted binding affinities of −7.28 kcal mol⁻¹ and −6.80 kcal mol⁻¹, respectively. These potential binding sites can guide drug combinations to inhibit coronavirus M^pro effectively.

2.2.2. Interaction analysis

Looking further into the interactions between the top inhibitors and the main protease, we find that Nol, V2m, and N3 are peptidomimetic inhibitors. They form as many as 8, 8, and 9 hydrogen bonds, respectively, with nearby residues, and each forms one covalent bond with Cys145, as listed in Table 2 and depicted in Fig. 3. These hydrogen bonds account for their potency in the first step of noncovalent binding to the main protease and support the robustness of our MathDL models; the covalent bonds make the binding irreversible. We also notice that these three inhibitors share two common hydrogen bonds, to His163 and His164 (see Table 2, Fig. 3a, c and d). Consequently, they have similar predicted binding energies, especially 6xhm and 7bqy at −8.50 kcal mol⁻¹ and −8.49 kcal mol⁻¹, respectively.

Table 2. Interaction analysis in the binding pockets of the top 4 complexes in terms of binding affinity predicted by our MathDL models.

PDB ID	Ligand ID	Hydrogen bond	Covalent bond
7c8t	Nol	His163, His164, Cys145, Gln189, Gly143, Glu166	Cys145
5rg1	T9J	His163, Glu166
6xhm	V2m	His163, His164, Cys145, Gln189, Phe140	Cys145
7bqy	N3	His163, His164, Cys145, Gln189, Thr190, Glu166, Phe140, Gly143	Cys145

This examination shows how well our models preserve and capture the physical and chemical properties described by intermolecular bonding interactions. Furthermore, the ligand T9J, which binds to M^pro in complex 5rg1 with a similar binding energy of −8.50 kcal mol⁻¹, forms different hydrogen bonds from the three previously mentioned inhibitors (see Table 2). Since our models concern only the non-covalent binding affinity, the lack of a covalent bond in 5rg1's interactions does not downgrade its predicted binding strength. Even with two relatively long hydrogen bonds (O2–His163: 3.05 Å, O3–Glu166: 3.38 Å (see Fig. 3d)), the binding affinity of 5rg1 is still comparable to that of the top inhibitors, indicating the importance of hydrogen bonds to these residues in binding to the main protease.

Among the top 10 inhibitors listed in Table 1, T9J in the complex 5rg1 is the only non-covalent inhibitor. The rest belong to the class of targeted covalent inhibitors (TCIs), which interact with protein residues, here cysteine, to form a covalent complex that strongly neutralizes the target's function. However, the major disadvantage of TCIs is their association with high toxicity risks.¹⁶ The strong covalent bond of a TCI can irreversibly modify unintended protein targets in the human body. As a result, the top covalent inhibitors in the SARS-CoV PDB-noBA dataset may have less chance of becoming approved market drugs than their non-covalent counterparts such as T9J in 5rg1.

Because binding site P₁ is by far the most populated among the 137 inhibitors of interest, we mainly analyze the interaction network around the residues in that region. Of the 110 molecules binding to P₁, 103 form at least one hydrogen bond with a nearby amino acid of the SARS-CoV-2 main protease. We have identified 20 different residues in the binding pocket P₁ that form hydrogen bonds with these small molecules. Fig. 4 illustrates the frequency of these 20 residues across the 110 inhibitors. Based on Fig. 4, the Gly143 residue is the most attractive site for hydrogen bond formation. It appears in 53.6% of the 110 intermolecular bonding interactions, followed by the Glu166 residue with a frequency of 39.1%; residues Cys145 and His163 account for 38.2% and 30.9%, respectively. It is worth noting that when these molecules form a hydrogen bond with Cys145, they also form another hydrogen bond with Gly143. In all cases, both residues share the same hydrogen-bond acceptor. Besides the hydrogen bond network, 71 ligands in the SARS-CoV PDB-noBA dataset form a covalent bond to the γ-sulfur of Cys145. Except for the second one, all of the top 10 inhibitors have that covalent bond (see Table S8 in ESI†).

Furthermore, we are interested in the binding energy distribution associated with the interaction network. Fig. 5 depicts the violin plot of that distribution across four categories, namely no H-bond (no hydrogen bond), H-bond (at least one hydrogen bond), no cov. bond (no covalent bond), and cov. bond (at least one covalent bond). Hydrogen bond interactions, which are expected to play an important role in the binding mechanism, are well captured by our MathDL models. Specifically, while the average energy of inhibitors having no hydrogen bond is −6.62 kcal mol⁻¹, the average energy of those with at least one hydrogen bond is as low as −7.23 kcal mol⁻¹.

Note that our MathDL models only measure the non-covalent binding affinity. The covalent bond appearing in the final covalent complex is not properly accounted for in our framework. Therefore, it is expected that our models sometimes rank covalent-bond inhibitors above non-covalent-bond candidates. Fig. 5 reveals that molecules in the covalent-bond group are generally predicted to have lower binding energies, with an average of −7.42 kcal mol⁻¹ compared with an average of −6.89 kcal mol⁻¹ for those without covalent bonds.

2.2.3. Fragment analysis

To design lead molecules, it is important to have promising fragments from existing inhibitors of the drug target. Therefore, in the present work, we study all the fragments decomposed from the 110 inhibitors bound to the binding site P₁. To carry out this task, we use the BRICS algorithm¹⁷ via RDKit.¹³ In the BRICS model, there are 16 chemical environments indicated by linkers denoted L₁, L₂, …, L₁₆. The BRICS decomposition gives rise to a total of 185 unique fragments, which are all presented in Table S9 in ESI.† Fig. 6 illustrates the top 12 most common fragments in terms of their frequencies. Note that the second most frequent fragment, L₁–C(C)=O, often forms a hydrogen bond with Gly143 and in many cases forms a covalent bond with Cys145.
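
A minimal sketch of this fragment analysis is given below. It uses RDKit's `BRICS.BRICSDecompose`, which returns fragment SMILES with numbered dummy atoms marking the BRICS environments. The helper names, the lazy import, and the demo fragment sets are illustrative assumptions rather than the workflow actually used for Table S9.

```python
from collections import Counter

def brics_fragments(smiles):
    """Return the set of BRICS fragment SMILES for one molecule (requires RDKit)."""
    # Imported lazily so that the counting helper below also runs without RDKit.
    from rdkit import Chem
    from rdkit.Chem import BRICS
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return set(BRICS.BRICSDecompose(mol))

def fragment_frequencies(fragment_sets):
    """Count in how many molecules each unique fragment appears."""
    counts = Counter()
    for fragments in fragment_sets:
        counts.update(set(fragments))
    return counts

# Hypothetical pre-computed fragment sets for three ligands (illustration only).
demo = [{"[1*]C(C)=O", "[5*]N[5*]"}, {"[1*]C(C)=O"}, {"[16*]c1ccccc1"}]
print(fragment_frequencies(demo).most_common(1))
```

In practice one would call `brics_fragments` on each of the 110 P₁ ligands and pass the resulting sets to `fragment_frequencies`; counting each fragment once per molecule is a design choice made here, not something the text specifies.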

3. Materials and methods

3.1. Datasets

Our deep learning-based scoring function, MathDL, was trained on public databases including PDBbind¹⁸ and ChEMBL.⁴ The PDBbind sets contain all complexes with crystal structures deposited in the PDB whose binding affinities, including but not limited to K_d, K_i, and IC₅₀, are reported in the literature. In this work, we employ PDBbind v2019, the latest version at the time of writing. PDBbind v2019 consists of 17 679 protein–ligand complexes. However, the data preprocessing of MathDL³² retains only 17 382 complexes. Among them, there are 10 485 ligands measured in K_d/K_i and 6537 ligands measured in IC₅₀.

ChEMBL is another manually curated database of bioactive molecules. Currently, ChEMBL contains more than 2 million compounds in the SMILES string format. Excluding 30 main protease inhibitors in the PDBbind data, we have found another 277 small molecules on ChEMBL with reported K_d/IC₅₀. Additionally, we have found more than 300 other SARS-CoV main protease inhibitors in the literature.^{18–20,25–31} In total, there are more than 600 ligands bound to the SARS-CoV/SARS-CoV-2 main protease that have experimental binding affinities; among them, 44 have crystal structures. For compounds without crystal structures, MathPose⁶ is used to generate their 3D conformations. The predicted 3D coordinates of these structures are provided in the SDF format and are available in ESI.† Currently, there are roughly 137 ligands forming crystal complexes with the SARS-CoV-2 main protease on the PDB without reported experimental inhibitory activities. Most of them were deposited by the PanDDA analysis group (https://pandda.bitbucket.io/#).

For model validation purposes, we classify the selected data into the groups listed in Table 3. Specifically, PDBbind v2019 is the biggest set in this compilation, with its PDB IDs and experimental binding affinities listed in Table S1 in ESI.† The PDBbind v2016 core set is a subset of PDBbind v2019 and is formed by 290 complexes representing all protein classes in the refined set of PDBbind v2016.^18,33 The PDB IDs of all complexes in the PDBbind v2016 core set are provided in Table S2.† We also collect all M^pro complexes of SARS-CoV/SARS-CoV-2 on the PDB, denoted SARS-CoV PDB, which results in a total of 192 structures (see Table S3†). Among them, there are 44 ligands with reported experimental binding affinities, denoted SARS-CoV PDB-BA (see Table S4†). Furthermore, we are interested in the SARS-CoV-2 M^pro complexes in the aforementioned SARS-CoV PDB set whose affinities are not reported or are undisclosed. We call this set SARS-CoV PDB-noBA, with PDB IDs listed in Table S5.† To enrich our training data targeting SARS-CoV/SARS-CoV-2 main protease inhibitors, we gather inhibitors reported in the literature.^4,25 For compounds with only 2D information, we limit ourselves to those whose similarity score, based on the path-based fingerprint FP2, is no lower than 0.6 to at least one inhibitor in the SARS-CoV PDB set. As a result, we arrive at a set of 141 structures named SARS-CoV 2D (see Table S6†). Combining the SARS-CoV PDB-BA and SARS-CoV 2D data sets, we finalize a reliable database of 185 SARS-CoV/SARS-CoV-2 main protease inhibitors, named SARS-CoV BA. Note that the binding affinities in this set are all reported as IC₅₀. Table S7 in ESI† presents the PDB IDs as well as the experimental binding energies of these ligands.
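
The similarity filter described above can be sketched as follows. The sketch assumes the similarity score is the Tanimoto coefficient and represents each fingerprint as a set of on-bit indices; computing real FP2 fingerprints (for example with Open Babel) is outside its scope, and the function names and toy bit sets are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def passes_filter(candidate, references, threshold=0.6):
    """Keep a candidate if it is similar enough to at least one reference ligand."""
    return any(tanimoto(candidate, ref) >= threshold for ref in references)

# Toy fingerprints standing in for FP2 bit vectors of reference ligands.
refs = [{1, 2, 3, 4, 5}, {10, 11, 12}]
print(passes_filter({1, 2, 3, 4, 9}, refs))  # 4 shared bits out of 6 in the union
print(passes_filter({20, 21}, refs))         # no shared bits with any reference
```

The threshold of 0.6 and the "at least one reference" rule follow the text; everything else is a simplification.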

3.2. Methods

3.2.1. MathDL

The MathDL models developed in this work are reformulated from our earlier model bearing the same name. MathDL was designed for the prediction of various druggable properties of 3D molecules.⁶ Over the past three years, MathDL has proved to be a top competitor in the D3R Grand Challenges (https://drugdesigndata.org/about/grand-challenge), a worldwide competition in computer-aided drug design. In the present work, we have, for the first time, developed a multitask MathDL (MathDL-MT) to handle the M^pro inhibitor dataset. We have also extended our earlier MathDL by including all the different datasets (MathDL-All). Fig. 7 depicts the framework of MathDL, in which element-specific algebraic topological representations are integrated with a convolutional neural network (CNN) to predict various druggable properties such as toxicity and binding affinity.

3.2.1.1. Algebraic topology-based representations

Algebraic topology studies topological spaces with the use of abstract algebra, which can dramatically simplify geometric complexity. Persistent homology (PH) is an algebraic topology approach that tracks topological information over multiple scales along a filtration by characterizing independent components, rings, and higher-dimensional voids in space.³⁴ In this section, we briefly review the algebraic topology-based representations. Additionally, since we are dealing with protein–ligand systems, biological considerations are taken into account as well.

Simplex

The q-simplex, denoted σ_q, is the convex hull of q + 1 affinely independent points in ℝ^n. For example, the 0-, 1-, 2-, and 3-simplices are a vertex, an edge, a triangle, and a tetrahedron, respectively. We call the convex hull of each non-empty subset of the q + 1 points a face of σ_q, and the points themselves are called the vertices.

Simplicial complex

A simplicial complex, denoted K, is a set of simplices such that every face of a simplex σ_q ∈ K is also in K and the non-empty intersection of any two simplices in K is a common face of both.

Chain complex

A formal sum of q-simplices in a simplicial complex K with coefficients in an algebraic field (typically ℤ₂) is a q-chain. The set of all q-chains of the simplicial complex K, equipped with the algebraic field, is called a chain group and denoted C_q(K). The boundary operator ∂_q: C_q(K) → C_{q−1}(K) relates the chain groups. More specifically, we denote by σ_q = [v₀, v₁, …, v_q] the q-simplex spanned by its vertices, and then the boundary operator can be represented as:

∂_q σ_q = Σ_{i=0}^{q} (−1)^i [v₀, …, v̂_i, …, v_q].

Here, [v₀, …, v̂_i, …, v_q] is the (q − 1)-simplex with v_i being omitted. The sequence of chain groups connected by boundary operators is called the chain complex and expressed as:

⋯ → C_{q+1}(K) → C_q(K) → C_{q−1}(K) → ⋯,

where the arrow from C_q(K) to C_{q−1}(K) is the boundary operator ∂_q.

The q-cycle group Z_q(K) and the q-boundary group B_q(K) are defined as Z_q(K) = ker(∂_q) = {c ∈ C_q(K)|∂_q c = ∅} and B_q(K) = im(∂_{q+1}) = {∂_{q+1} c|c ∈ C_{q+1}(K)}. The q-th homology group is the quotient group H_q(K) = Z_q(K)/B_q(K). Moreover, the rank of the q-th homology group can be computed as rank H_q(K) = rank Z_q(K) − rank B_q(K), which is called the q-th Betti number β_q. Note that the q-th Betti number counts the number of q-dimensional holes that cannot be continuously deformed into each other.
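
The definitions above can be made concrete with a small sketch that computes Betti numbers over ℤ₂ from boundary-matrix ranks, using β_q = dim C_q(K) − rank ∂_q − rank ∂_{q+1}. The function names and the bitmask encoding of matrix rows are implementation choices made here for illustration; this is not the software used in MathDL.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank over Z_2 of a matrix whose rows are given as integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit serves as the pivot column
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(simplices, max_dim):
    """Betti numbers beta_0..beta_max_dim; simplices are sorted vertex tuples."""
    by_dim = {q: sorted(s for s in simplices if len(s) == q + 1)
              for q in range(max_dim + 2)}

    def boundary_rank(q):
        if q == 0 or not by_dim[q]:
            return 0
        index = {s: i for i, s in enumerate(by_dim[q - 1])}
        rows = []
        for s in by_dim[q]:
            mask = 0
            for face in combinations(s, q):  # each (q-1)-face has q vertices
                mask |= 1 << index[face]
            rows.append(mask)
        return rank_gf2(rows)

    return [len(by_dim[q]) - boundary_rank(q) - boundary_rank(q + 1)
            for q in range(max_dim + 1)]

# Hollow triangle: three vertices and three edges, no filled face.
hollow = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
print(betti_numbers(hollow, 1))
```

For the hollow triangle this gives one connected component and one ring; adding the 2-simplex (0, 1, 2) fills the ring and reduces β₁ to zero. Over ℤ₂ the signs (−1)^i in the boundary formula play no role, which is why XOR suffices.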

Persistent homology

A filtration of a simplicial complex K is a nested sequence of subcomplexes of K such that ∅ = K₀ ⊆ K₁ ⊆ K₂ ⊆ ⋯ ⊆ K_m = K. Then the p-persistent q-th homology group of K_t is defined as:

H_q^p(K_t) = Z_q(K_t)/(B_q(K_{t+p}) ∩ Z_q(K_t)).

Here the rank of H_q^p(K_t) counts the number of q-dimensional holes in K_t that are still alive in K_{t+p}, which is called the p-persistent q-th Betti number. Persistent homology not only records the topological information at a specific configuration, but also tracks the changes along the filtration parameter. More specifically, the topological changes are recorded in the persistent barcodes. In MathDL, we make use of the persistent homology barcodes by dividing them into bins and counting the birth, death, and persistence events in each bin to enrich our algebraic topological representations.
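
The binning of barcodes can be sketched as below. This is a simplified illustration that assumes equal-width bins and counts, per bin, the bars born in it, the bars dying in it, and the bars that overlap it; the actual bin ranges and feature definitions used in MathDL are described in ref. 32 and may differ. The function name and the example barcode are hypothetical.

```python
def bin_barcode(bars, start, stop, n_bins):
    """Count births, deaths, and persisting bars in equal-width bins over [start, stop)."""
    width = (stop - start) / n_bins
    births = [0] * n_bins
    deaths = [0] * n_bins
    persist = [0] * n_bins
    for birth, death in bars:
        for k in range(n_bins):
            lo, hi = start + k * width, start + (k + 1) * width
            if lo <= birth < hi:
                births[k] += 1
            if lo <= death < hi:
                deaths[k] += 1
            if birth < hi and death > lo:  # the bar overlaps this bin
                persist[k] += 1
    return births, deaths, persist

# Hypothetical dimension-0 barcode as (birth, death) pairs in angstroms.
bars = [(0.0, 1.2), (0.0, 2.7), (0.0, 3.4)]
print(bin_barcode(bars, 0.0, 4.0, 4))
```

Concatenating the three count vectors yields a fixed-length feature vector regardless of how many bars the barcode contains, which is what makes such a representation usable as neural network input.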

3.2.1.2. Element specific considerations

The protein–ligand complex is both structural and biological. Persistent homology provides a theoretical approach for encoding the high-dimensional spatial data of protein–ligand complexes into algebraic topological representations. In this section, we address the biological considerations needed to handle biomolecular complexity. Many kinds of interactions exist in a protein–ligand complex, such as electrostatics, hydrogen bonds, and hydrophobic effects. Although persistent homology can capture the interactions among the nearest neighbors, long-range interactions are obscured. This difficulty can be avoided via the deployment of element-specific attention.³² There are 4 commonly occurring atom types in proteins, namely C, N, O, and S, and 11 commonly occurring atom types in ligands, namely C, N, O, S, P, F, Cl, Br, I, H, and B. We include boron among the ligand atom types since it appears in more than 200 small compounds in our training data. The general framework of MathDL is depicted in Fig. 7 with exemplified steps. In addition, the deep learning architecture of the current MathDL is shown in Fig. S1.† For the details of the feature descriptions as well as the deep learning architecture, interested readers are referred to our previous work.³²
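
As a rough illustration of element-specific grouping, the sketch below enumerates protein–ligand element pairs and selects the atoms belonging to one pair. It assumes that element-specific representations are built per (protein element, ligand element) pair, which is how such schemes are commonly set up but is not stated explicitly here; the atom records and function name are hypothetical.

```python
from itertools import product

PROTEIN_ELEMENTS = ["C", "N", "O", "S"]
LIGAND_ELEMENTS = ["C", "N", "O", "S", "P", "F", "Cl", "Br", "I", "H", "B"]

# Assumption: one topological representation per protein-ligand element pair.
ELEMENT_PAIRS = list(product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS))

def select_pair(protein_atoms, ligand_atoms, pair):
    """Return coordinates of protein and ligand atoms matching one element pair."""
    p_elem, l_elem = pair
    p_xyz = [xyz for elem, xyz in protein_atoms if elem == p_elem]
    l_xyz = [xyz for elem, xyz in ligand_atoms if elem == l_elem]
    return p_xyz, l_xyz

# Hypothetical atoms as (element, (x, y, z)) records.
protein = [("N", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0)), ("N", (2.0, 0.0, 0.0))]
ligand = [("O", (0.5, 1.0, 0.0)), ("C", (1.5, 1.0, 0.0))]
print(len(ELEMENT_PAIRS))
print(select_pair(protein, ligand, ("N", "O")))
```

With 4 protein and 11 ligand atom types, this pairing yields 44 element-specific point clouds per complex, each of which could be fed to a separate persistent homology computation.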

3.2.2. MathPose

MathPose, a 3D pose predictor that converts SMILES strings into 3D poses with reference to target molecules, was the top performer in D3R Grand Challenge 4 (GC4) in predicting the poses of 24 beta-secretase 1 (BACE) binders.⁶ For one SMILES string, around 1000 3D conformations can be generated by various docking software tools such as GOLD,³⁵ Autodock Vina,³⁶ and GLIDE.³⁷ Moreover, a selected set of known complexes is re-docked by the three aforementioned docking software packages to generate at least 100 decoy complexes per input ligand for the machine learning training set. The machine learning labels for the pose selection task are the root mean squared deviations (RMSDs) between the decoy and native structures. MathDL models are then set up and applied to select the top-ranked pose for the given ligand. Besides the GC4 challenge, our models have outperformed state-of-the-art scoring functions in the docking power challenge on the CASF-2007 and CASF-2013 benchmarks.³³ These established results attest to the credibility of MathPose for the 3D structure prediction of small molecules.
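
The RMSD label used for pose selection can be sketched as follows. The sketch assumes both structures list the same atoms in the same order and already share the protein's coordinate frame, so no superposition or symmetry correction is applied; the function name and coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords_a, coords_b)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom ligand: native pose and one docked decoy (angstroms).
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
decoy = [(0.0, 0.0, 1.0), (1.5, 1.0, 0.0)]
print(rmsd(native, decoy))
```

Skipping the superposition step is deliberate for docking poses, because the quantity of interest is how far the ligand sits from its native position inside the fixed binding pocket.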

3.3. Validations

3.3.1. PDBbind v2016 core set benchmark

In this validation task, we test our model against the 290 complexes in the PDBbind v2016 core set. This is a prevalent test set for assessing the scoring ability of a binding affinity prediction model, and many research groups have devoted effort to improving the Pearson's correlation coefficient (R_p) and Kendall's tau (τ) on it.^18,42,43 In the current work, we merge the PDBbind v2019, SARS-CoV PDB-BA, and SARS-CoV 2D sets, remove duplicates, and exclude the PDBbind v2016 core set complexes to obtain a training set of 17 211 complexes. MathDL with the architecture described in Section 3.2.1 is trained on those complexes. The resulting model is used to predict the binding affinities of the 290 structures in the PDBbind v2016 core set.

To find the best model for this benchmark, MathDL is trained for 1000 epochs. We then pick the epoch based on the root-mean-squared error (RMSE) of the PDBbind v2016 core set prediction. We have found that MathDL achieves the smallest RMSE in this experiment at 140 epochs. Specifically, the RMSE, R_p, and τ metrics on the v2016 core set are 1.56 kcal mol⁻¹, 0.858, and 0.671, respectively. Meanwhile, the training accuracy is 0.387 kcal mol⁻¹ in terms of RMSE, and the training Pearson's correlation coefficient is R_p = 0.994. These results show that our MathDL converges quickly, in only 140 epochs, and maintains a good balance between training and testing accuracies. This is a state-of-the-art performance, since our MathDL is ranked second in comparison with 33 other scoring functions (see Fig. 8). The top model is TopBP_con., published in our previous work,³² with R_p = 0.861. TopBP_con. is the consensus of gradient boosted tree and deep learning-based models. If only the deep learning framework is considered, the performance of TopBP (denoted TopBP-DL) on the core set of PDBbind v2016 is R_p = 0.848.
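
The three metrics reported above can be computed as in the following sketch. It is a plain-Python illustration with hypothetical affinities; Kendall's tau is implemented in its simplest form (tau-a, without tie correction), which may differ from the variant used for the reported numbers when ties are present.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-squared error between experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(x, y):
    """Pearson's correlation coefficient R_p."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n, score = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical experimental and predicted binding energies (kcal/mol).
y_true = [-6.0, -7.0, -8.0, -9.0]
y_pred = [-6.5, -6.8, -8.4, -8.9]
print(rmse(y_true, y_pred), pearson_r(y_true, y_pred), kendall_tau(y_true, y_pred))
```

R_p measures linear agreement, τ measures agreement in ranking only, and RMSE measures absolute error, so a model can rank ligands perfectly (τ = 1, as in this toy example) while still having a non-zero RMSE.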

It is worth mentioning that, except for our MathDL, all machine learning-based scoring functions listed in Fig. 8 were trained on the PDBbind v2016 refined set of 3767 complexes. As mentioned above, the current MathDL is built on a much larger training set comprising 17 211 complexes selected from PDBbind v2019 and the SARS-CoV BA data. Even though the present MathDL has not outperformed its predecessor, i.e., TopBP_con., MathDL is still the preferred model since it is trained on a diverse data set covering various protein families and different binding energy ranges. As a result, it is expected to deliver more reliable predictions for SARS-CoV-2 inhibitors, especially since this main protease family is not included in the training data of previous TopDL models. The resulting MathDL model is labeled MathDL-Core2016 and is used to predict the affinities of the complexes in SARS-CoV PDB-noBA in Section 2.1.

3.3.2. 5-fold cross-validation on SARS-CoV BA set

In this section, we test the performance of our MathDL on the 185 inhibitors in the SARS-CoV BA set listed in Table 3. Among those ligands, 44 have X-ray crystal structures and the rest are available only as 2D SMILES strings. We employ MathPose to predict the 3D structures of those 2D ligands. To carry out the validation, we randomly split the SARS-CoV BA set into 5 non-overlapping folds. In each fold prediction task, MathDL is trained on the remaining part of the SARS-CoV BA data in conjunction with the PDBbind v2019 set. This setting leads to two different ways of training our MathDL model. The first approach is a traditional MathDL architecture whose training set combines both SARS-CoV BA and PDBbind v2019 complexes. The second makes use of multi-task learning.⁴⁴ In each epoch, the weights of the MathDL architecture are learned from the PDBbind v2019 set, and then only the fully connected layers are trainable when learning from the SARS-CoV BA structures. In total, we obtain 10 different MathDL models, in which the traditional MathDL frameworks are labeled MathDL-All-i and the multi-task MathDL models are named MathDL-MT-i, with i running from 1 to 5. For each model, after 100 epochs, we start monitoring which epoch gives the smallest RMSE on the test set.
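
The random split into non-overlapping folds can be sketched as follows. The function name and the fixed seed are assumptions made for reproducibility of the example; in the protocol described above, the PDBbind v2019 complexes would additionally be appended to every training part, which is not shown.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k non-overlapping folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# 185 inhibitors in the SARS-CoV BA set, split into 5 folds.
folds = k_fold_indices(185)
for i, test_idx in enumerate(folds, start=1):
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(f"fold {i}: {len(train_idx)} train, {len(test_idx)} test")
```

With 185 ligands each fold holds 37 test compounds and 148 training compounds, and every ligand appears in exactly one test fold.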

Table 4 reveals that the MathDL-All models are well trained, with averaged training RMSE = 0.286 kcal mol⁻¹, Pearson's correlation coefficient R_p = 0.994, and Kendall's tau τ = 0.934. Their averaged performances on the test data across the 5 folds of the SARS-CoV BA set are R_p = 0.729, τ = 0.540, and RMSE = 0.789 kcal mol⁻¹. These results support the reliability of these models in the binding affinity prediction of SARS-CoV/SARS-CoV-2 inhibitors. Table 4 also lists the training and testing performances of the five multi-task learning models. The averaged training performance of the MathDL-MT models is R_p = 0.995, τ = 0.941, and RMSE = 0.275 kcal mol⁻¹. The accuracy of the multi-task architecture on the test sets is similar to that of MathDL-All, with R_p = 0.727, τ = 0.532, and RMSE = 0.822 kcal mol⁻¹. With these promising results, it is encouraging to apply the MathDL models to predict the unknown binding affinities of SARS-CoV/SARS-CoV-2 inhibitors. It is worth noting that if the 5-fold cross-validation is conducted purely on the SARS-CoV BA set, the average R_p and τ are as low as 0.561 and 0.388, respectively. These results strongly support the inclusion of diverse information such as PDBbind v2019, in conjunction with sophisticated deep learning architectures, to achieve accurate binding energy predictions for M^pro inhibitors.
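For readers who wish to reproduce the three reported metrics, the following is a minimal pure-Python sketch. The tie handling in Kendall's tau (the tau-a variant, in which tied pairs count as neither concordant nor discordant) is an assumption, since the text does not state which variant was used, and the affinity values in the example are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(x, y):
    """Pearson's correlation coefficient R_p."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1   # pair ranked in the same order by x and y
            elif product < 0:
                score -= 1   # pair ranked in opposite order
    return score / (n * (n - 1) / 2)

# Hypothetical binding affinities in kcal/mol, for illustration only.
experimental = [-9.0, -8.0, -7.0, -6.0]
predicted = [-8.5, -7.0, -7.5, -6.5]
metrics = (pearson_r(experimental, predicted),
           kendall_tau(experimental, predicted),
           rmse(experimental, predicted))
```

R_p measures linear agreement and τ measures agreement in ranking, which is why the two can differ noticeably on small test folds.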

Table 4. 5-fold performances of MathDL-All and MathDL-MT on the SARS-CoV BA set. RMSE is given in kcal mol⁻¹.

	MathDL-All			MathDL-MT
	R_p	τ	RMSE	R_p	τ	RMSE
Fold 1 (train)	0.992	0.923	0.327	0.996	0.949	0.253
Fold 1 (test)	0.792	0.534	0.682	0.818	0.534	0.680
Fold 2 (train)	0.995	0.943	0.266	0.996	0.948	0.236
Fold 2 (test)	0.625	0.498	0.866	0.689	0.538	0.826
Fold 3 (train)	0.991	0.917	0.367	0.994	0.934	0.327
Fold 3 (test)	0.771	0.572	0.758	0.767	0.593	0.802
Fold 4 (train)	0.996	0.948	0.240	0.997	0.951	0.177
Fold 4 (test)	0.618	0.397	0.874	0.642	0.472	0.901
Fold 5 (train)	0.995	0.941	0.231	0.991	0.921	0.380
Fold 5 (test)	0.838	0.699	0.767	0.719	0.524	0.900
Average (train)	0.994	0.934	0.286	0.995	0.941	0.275
Average (test)	0.729	0.540	0.789	0.727	0.532	0.822
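
As a quick consistency check, the "Average (test)" row can be recomputed from the five per-fold test rows of the table. The sketch below simply averages each column and rounds to three decimals; the variable and function names are illustrative.

```python
# Test-set rows of the table, folds 1 to 5, as (R_p, tau, RMSE in kcal/mol).
mathdl_all_test = [
    (0.792, 0.534, 0.682),
    (0.625, 0.498, 0.866),
    (0.771, 0.572, 0.758),
    (0.618, 0.397, 0.874),
    (0.838, 0.699, 0.767),
]
mathdl_mt_test = [
    (0.818, 0.534, 0.680),
    (0.689, 0.538, 0.826),
    (0.767, 0.593, 0.802),
    (0.642, 0.472, 0.901),
    (0.719, 0.524, 0.900),
]

def column_means(rows):
    """Average each metric over the folds, rounded to three decimals."""
    n = len(rows)
    return tuple(round(sum(column) / n, 3) for column in zip(*rows))

avg_all = column_means(mathdl_all_test)  # (0.729, 0.54, 0.789)
avg_mt = column_means(mathdl_mt_test)    # (0.727, 0.532, 0.822)
```

Both results agree with the averaged test values quoted in the text.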


4. Conclusion

SARS-CoV-2 main protease (M^pro) is one of the most favorable targets for COVID-19 drug discovery owing to its high conservation and low similarity with human genes. The structures and binding affinities of protein–drug complexes are of paramount importance for understanding the molecular mechanism in drug discovery. However, only two SARS-CoV-2 M^pro inhibitor structures are available with binding affinities, highlighting the current challenges in COVID-19 drug discovery.

This work presents a reliable binding affinity prediction and ranking of 137 M^pro–inhibitor crystal structures that have no reported experimental binding affinity. We first curate a set of more than 600 M^pro inhibitors with binding affinities from public resources, such as PDBbind, ChEMBL, and the scattered literature. Among these inhibitors, 141 are retained based on their high similarity with available M^pro–inhibitor complex structures, and their three-dimensional (3D) poses are built using our MathPose.⁶ Together with another 44 SARS-CoV or SARS-CoV-2 M^pro–inhibitor complexes, we compose a training set of 185 reliable SARS-CoV-2 M^pro–inhibitor complexes. Our earlier MathDL models are reformulated with algebraic topology to accommodate 119 new complexes and 17 382 complexes from the PDBbind v2019 general set in both single-task and multitask settings, which have never been available before. The resulting MathDL models are rigorously validated via the PDBbind v2016 core set benchmark, on which they outperform state-of-the-art models in the literature. Most importantly, our MathDL achieves promising cross-validation accuracies on the SARS-CoV family inhibitors, with an averaged Pearson's correlation coefficient as high as 0.73.

Additionally, the present work unveils that Gly143 of M^pro is the most attractive site for forming hydrogen bonds, followed by Glu166, Cys145, and His163. There are 71 inhibitors that interact with SARS-CoV-2 M^pro to form covalent complexes. Those covalent bonds are mostly formed between dicarbon monoxide groups in the inhibitors and the γ-sulfur on Cys145. There is only one non-covalent complex among our top 10 ranked structures, namely 5rg1. To provide a potential resource for lead molecule design, we employ the BRICS algorithm to decompose all the inhibitors of the prominent binding site on M^pro and obtain 185 unique fragments.

The predicted binding affinities and ranking of 137 M^pro–inhibitor crystal structures, the bonding analysis, and the fragment decomposition significantly extend the current knowledge and understanding of SARS-CoV-2 M^pro–inhibitor interactions and thus offer valuable information for COVID-19 drug discovery.

Conflicts of interest

There are no conflicts to declare.

Supplementary Material

SC-011-D0SC04641H-s001.pdf (151.2KB, pdf)

SC-011-D0SC04641H-s002 to SC-011-D0SC04641H-s283: 141 .sdf files (even-numbered) alternating with 141 .pdb files (odd-numbered)

SC-011-D0SC04641H-s284.xlsx (399.2KB, xlsx)

Acknowledgments

This work was supported in part by NIH grant GM126189, NSF Grants DMS-1721024, DMS-1761320, and IIS1900473, Michigan Economic Development Corporation, George Mason University award PD45722, Bristol-Myers Squibb, and Pfizer. The authors thank The IBM TJ Watson Research Center, The COVID-19 High Performance Computing Consortium, and NVIDIA for computational assistance.

† Electronic supplementary information (ESI) available: SupportingTables.xls: spreadsheets contain information for all supporting tables from S1 to S8; FileS1.zip: 3D structures generated by our MathPose for 141 ligands in SARS-CoV 2D set; FigS1.pdf: deep learning architecture of MathDL model. See DOI: 10.1039/d0sc04641h

References

Phase III Double-blind, Placebo-controlled Study of AZD1222 for the Prevention of COVID-19 in Adults, 2020, accessed September 15, 2020, https://clinicaltrials.gov/ct2/show/NCT04516746?term=NCT04516746&draw=2&rank=1
Statement on AstraZeneca Oxford SARS-CoV-2 vaccine, AZD1222, COVID-19 vaccine trials temporary pause, 2020, accessed September 15, 2020, https://www.astrazeneca.com/media-centre/press-releases/2020/statement-on-astrazeneca-oxford-sars-cov-2-vaccine-azd1222-covid-19-vaccine-trials-temporary-pause.html
Xu X. Chen P. Wang J. Feng J. Zhou H. Li X. Zhong Wu. Hao P. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Sci. China: Life Sci. 2020;63(3):457–460. doi: 10.1007/s11427-020-1637-5.
Davies M. Nowotka M. Papadatos G. Dedman N. Gaulton A. Atkinson F. Bellis L. Overington J. P. Chembl web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015;43(W1):W612–W620. doi: 10.1093/nar/gkv352.
Berman H. M. Westbrook J. Feng Z. Gilliland G. Bhat T. N. Weissig H. Shindyalov I. N. Bourne P. E. The protein data bank. Nucleic Acids Res. 2000;28(1):235–242. doi: 10.1093/nar/28.1.235.
Nguyen D. D. Gao K. Wang M. Wei G.-W. MathDL: Mathematical deep learning for d3r grand challenge 4. J. Comput.-Aided Mol. Des. 2020;34:131–147. doi: 10.1007/s10822-019-00237-5.
Matthews D. A. Dragovich P. S. Webber S. E. Fuhrman S. A. Patick A. K. Zalman L. S. Hendrickson T. F. Love R. A. Prins T. J. Marakovits J. T. et al., Structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3c protease with potent antiviral activity against multiple rhinovirus serotypes. Proc. Natl. Acad. Sci. U. S. A. 1999;96(20):11000–11007. doi: 10.1073/pnas.96.20.11000.
Yang H. Xie W. Xue X. Yang K. Ma J. Liang W. Zhao Q. Zhou Z. Pei D. Ziebuhr J. et al., Design of wide-spectrum inhibitors targeting coronavirus main proteases. PLoS Biol. 2005;3(10):e324. doi: 10.1371/journal.pbio.0030324.
Yang S. Chen S.-J. Hsu M.-F. Wu J.-D. Tseng C.-T. K. Liu Y.-F. Chen H.-C. Kuo C.-W. Wu C.-S. Chang L.-W. et al., Synthesis, crystal structure, structure-activity relationships, and antiviral activity of a potent sars coronavirus 3cl protease inhibitor. J. Med. Chem. 2006;49(16):4971–4980. doi: 10.1021/jm0603926.
Wang F. Chen C. Tan W. Yang K. Yang H. Structure of main protease from human coronavirus nl63: insights for wide spectrum anti-coronavirus drug design. Sci. Rep. 2016;6:22677. doi: 10.1038/srep22677.
Jin Z. Du X. Xu Y. Deng Y. Liu M. Zhao Y. Zhang B. Li X. Zhang L. Peng C. et al., Structure of mpro from covid-19 virus and discovery of its inhibitors. bioRxiv. 2020 doi: 10.1038/s41586-020-2223-y.
Rathnayake A. D. Zheng J. Kim Y. Perera K. D. Mackin S. Meyerholz D. K. Kashipathy M. M. Battaile K. P. Lovell S. Perlman S. et al., 3c-like protease inhibitors block coronavirus replication in vitro and improve survival in mers-cov-infected mice. Sci. Transl. Med. 2020;12(557):eabc5332. doi: 10.1126/scitranslmed.abc5332.
Landrum G. et al. , Rdkit: Open-source cheminformatics, 2006
Wu K. Zhao Z. Wang R. Wei G. W. TopP-S: Persistent Homology-Based Multi-Task Deep Neural Networks for Simultaneous Predictions of Partition Coefficient and Aqueous Solubility. J. Comput. Chem. 2018;39:1444–1454. doi: 10.1002/jcc.25213. [DOI] [PubMed] [Google Scholar]
Chang H.-P. Chou C.-Y. Chang G.-G. Reversible unfolding of the severe acute respiratory syndrome coronavirus main protease in guanidinium chloride. Biophys. J. 2007;92(4):1374–1383. doi: 10.1529/biophysj.106.091736. [DOI] [PMC free article] [PubMed] [Google Scholar]
Park B. K. Boobis A. Clarke S. Goldring C. E. P. Jones D. Kenna J. G. Lambert C. Laverty H. G. Naisbitt D. J. Nelson S. et al., Managing the challenge of chemically reactive metabolites in drug development. Nat. Rev. Drug Discovery. 2011;10(4):292–306. doi: 10.1038/nrd3408. [DOI] [PubMed] [Google Scholar]
Degen J. Wegscheid-Gerlach C. Zaliani A. Rarey M. On the art of compiling and using’drug-like’chemical fragment spaces. ChemMedChem. 2008;3(10):1503–1507. doi: 10.1002/cmdc.200800178. [DOI] [PubMed] [Google Scholar]
Su M. Yang Q. Du Y. Feng G. Liu Z. Li Y. Wang R. Comparative assessment of scoring functions: The casf-2016 update. J. Chem. Inf. Model. 2018 doi: 10.1021/acs.jcim.8b00545. [DOI] [PubMed] [Google Scholar]
Zhang L. Lin D. Sun X. Curth U. Drosten C. Sauerhering L. Becker S. Rox K. Hilgenfeld R. Crystal structure of sars-cov-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science. 2020;368(6489):409–412. doi: 10.1126/science.abb3405. [DOI] [PMC free article] [PubMed] [Google Scholar]
Su H. Yao S. Zhao W. Li M. Jia L. Shang W. Xie H. Ke C. Gao M. Yu K. et al., Discovery of baicalin and baicalein as novel, natural product inhibitors of SARS-CoV-2 3CL protease in vitro. bioRxiv. 2020.
Wang H. He S. Deng W. Zhang Y. Li G. Sun J. Zhao W. Guo Y. Zheng Y. Li D. et al., Comprehensive insights into the catalytic mechanism of Middle East respiratory syndrome 3C-like protease and severe acute respiratory syndrome 3C-like protease. ACS Catal. 2020;10(10):5871–5890. doi: 10.1021/acscatal.0c00110.
Dai W. Zhang B. Jiang X.-M. Su H. Li J. Zhao Y. Xie X. Jin Z. Peng J. Liu F. et al., Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science. 2020;368(6497):1331–1335. doi: 10.1126/science.abb4489.
Ma C. Sacco M. D. Hurst B. Townsend J. A. Hu Y. Szeto T. Zhang X. Tarbet B. Marty M. T. Chen Y. et al., Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. bioRxiv. 2020. doi: 10.1038/s41422-020-0356-z.
Douangamath A. Fearon D. Gehrtz P. Krojer T. Lukacik P. Owen C. D. Resnick E. Strain-Damerell C. Ábrányi-Balogh P. Brandão-Neto J. et al., Crystallographic and electrophilic fragment screening of the SARS-CoV-2 main protease. Nat. Commun. 2020;11:5047. doi: 10.1038/s41467-020-18709-w.
Bacha U. Barrila J. Gabelli S. B. Kiso Y. Amzel L. M. Freire E. Development of broad-spectrum halomethyl ketone inhibitors against coronavirus main protease 3CLpro. Chem. Biol. Drug Des. 2008;72(1):34–49. doi: 10.1111/j.1747-0285.2008.00679.x.
Ghosh A. K. Brindisi M. Shahabi D. Chapman M. E. Mesecar A. D. Drug development and medicinal chemistry efforts toward SARS-coronavirus and COVID-19 therapeutics. ChemMedChem. 2020;15(11):907–932. doi: 10.1002/cmdc.202000223.
Liang P.-H. Characterization and inhibition of SARS-coronavirus main protease. Curr. Top. Med. Chem. 2006;6(4):361–376. doi: 10.2174/156802606776287090.
Wang H.-M. Liang P.-H. Pharmacophores and biological activities of severe acute respiratory syndrome viral protease inhibitors. Expert Opin. Ther. Pat. 2007;17(5):533–546. doi: 10.1517/13543776.17.5.533.
Kumar V. Jung Y.-S. Liang P.-H. Anti-SARS coronavirus agents: a patent review (2008–present). Expert Opin. Ther. Pat. 2013;23(10):1337–1348. doi: 10.1517/13543776.2013.823159.
Pillaiyar T. Manickam M. Namasivayam V. Hayashi Y. Jung S.-H. An overview of severe acute respiratory syndrome–coronavirus (SARS-CoV) 3CL protease inhibitors: peptidomimetics and small molecule chemotherapy. J. Med. Chem. 2016;59(14):6595–6628. doi: 10.1021/acs.jmedchem.5b01461.
Ullrich S. Nitsche C. The SARS-CoV-2 main protease as drug target. Bioorg. Med. Chem. Lett. 2020:127377. doi: 10.1016/j.bmcl.2020.127377.
Cang Z. X. Mu L. Wei G. W. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput. Biol. 2018;14(1):e1005929. doi: 10.1371/journal.pcbi.1005929.
Nguyen D. Wei G.-W. AGL-Score: Algebraic graph learning score for protein–ligand binding scoring, ranking, docking, and screening. J. Chem. Inf. Model. 2019;59(7):3291–3304. doi: 10.1021/acs.jcim.9b00334.
Carlsson G. Topology and data. Bull. Am. Math. Soc. 2009;46(2):255–308. doi: 10.1090/S0273-0979-09-01249-X.
Jones G. Willett P. Glen R. C. Leach A. R. Taylor R. Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 1997;267(3):727–748. doi: 10.1006/jmbi.1996.0897.
Trott O. Olson A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010;31(2):455–461. doi: 10.1002/jcc.21334.
Friesner R. A. Banks J. L. Murphy R. B. Halgren T. A. Klicic J. J. Mainz D. T. Repasky M. P. Knoll E. H. Shelley M. Perry J. K. et al., Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 2004;47(7):1739–1749. doi: 10.1021/jm0306430.
Wójcikowski M. Kukiełka M. Stepniewska-Dziubinska M. Siedlecki P. Development of a protein-ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions. 2018.
Nguyen D. D. Wei G.-W. DG-GL: Differential geometry-based geometric learning of molecular datasets. Int. J. Numer. Method Biomed. Eng. 2019;35(3):e3179. doi: 10.1002/cnm.3179.
Zheng L. Fan J. Mu Y. OnionNet: a multiple-layer intermolecular-contact-based convolutional neural network for protein–ligand binding affinity prediction. ACS Omega. 2019;4(14):15956–15965. doi: 10.1021/acsomega.9b01997.
Li H. Sze K.-H. Lu G. Ballester P. J. Machine-learning scoring functions for structure-based drug lead optimization. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2020:e1465. doi: 10.1002/wcms.1225.
Nguyen D. D. Cang Z. Wei G.-W. A review of mathematical representations of biomolecular data. Phys. Chem. Chem. Phys. 2020;22(8):4343–4367. doi: 10.1039/C9CP06554G.
Jiménez J. Skalic M. Martínez-Rosell G. De Fabritiis G. KDEEP: Protein–ligand absolute binding affinity prediction via 3D-convolutional neural networks. J. Chem. Inf. Model. 2018;58(2):287–296. doi: 10.1021/acs.jcim.7b00650.
Wu K. Wei G. W. Quantitative toxicity prediction using topology based multitask deep neural networks. J. Chem. Inf. Model. 2018;58:520–531. doi: 10.1021/acs.jcim.7b00558.

Associated Data

Supplementary Materials

SC-011-D0SC04641H-s001

SC-011-D0SC04641H-s001.pdf^{(151.2KB, pdf)}

SC-011-D0SC04641H-s002

SC-011-D0SC04641H-s002.sdf^{(5.2KB, sdf)}

SC-011-D0SC04641H-s003

SC-011-D0SC04641H-s003.pdb^{(393.1KB, pdb)}

SC-011-D0SC04641H-s004

SC-011-D0SC04641H-s004.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s005

SC-011-D0SC04641H-s005.pdb^{(392.8KB, pdb)}

SC-011-D0SC04641H-s006

SC-011-D0SC04641H-s006.sdf^{(5.8KB, sdf)}

SC-011-D0SC04641H-s007

SC-011-D0SC04641H-s007.pdb^{(393.4KB, pdb)}

SC-011-D0SC04641H-s008

SC-011-D0SC04641H-s008.sdf^{(6.5KB, sdf)}

SC-011-D0SC04641H-s009

SC-011-D0SC04641H-s009.pdb^{(393.4KB, pdb)}

SC-011-D0SC04641H-s010

SC-011-D0SC04641H-s010.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s011

SC-011-D0SC04641H-s011.pdb^{(401.8KB, pdb)}

SC-011-D0SC04641H-s012

SC-011-D0SC04641H-s012.sdf^{(4KB, sdf)}

SC-011-D0SC04641H-s013

SC-011-D0SC04641H-s013.pdb^{(397KB, pdb)}

SC-011-D0SC04641H-s014

SC-011-D0SC04641H-s014.sdf^{(6.9KB, sdf)}

SC-011-D0SC04641H-s015

SC-011-D0SC04641H-s015.pdb^{(407.1KB, pdb)}

SC-011-D0SC04641H-s016

SC-011-D0SC04641H-s016.sdf^{(3.8KB, sdf)}

SC-011-D0SC04641H-s017

SC-011-D0SC04641H-s017.pdb^{(416.4KB, pdb)}

SC-011-D0SC04641H-s018

SC-011-D0SC04641H-s018.sdf^{(4.6KB, sdf)}

SC-011-D0SC04641H-s019

SC-011-D0SC04641H-s019.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s020

SC-011-D0SC04641H-s020.sdf^{(2.3KB, sdf)}

SC-011-D0SC04641H-s021

SC-011-D0SC04641H-s021.pdb^{(401.5KB, pdb)}

SC-011-D0SC04641H-s022

SC-011-D0SC04641H-s022.sdf^{(6.3KB, sdf)}

SC-011-D0SC04641H-s023

SC-011-D0SC04641H-s023.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s024

SC-011-D0SC04641H-s024.sdf^{(5.6KB, sdf)}

SC-011-D0SC04641H-s025

SC-011-D0SC04641H-s025.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s026

SC-011-D0SC04641H-s026.sdf^{(5.7KB, sdf)}

SC-011-D0SC04641H-s027

SC-011-D0SC04641H-s027.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s028

SC-011-D0SC04641H-s028.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s029

SC-011-D0SC04641H-s029.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s030

SC-011-D0SC04641H-s030.sdf^{(5.7KB, sdf)}

SC-011-D0SC04641H-s031

SC-011-D0SC04641H-s031.pdb^{(407.1KB, pdb)}

SC-011-D0SC04641H-s032

SC-011-D0SC04641H-s032.sdf^{(5.5KB, sdf)}

SC-011-D0SC04641H-s033

SC-011-D0SC04641H-s033.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s034

SC-011-D0SC04641H-s034.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s035

SC-011-D0SC04641H-s035.pdb^{(410.7KB, pdb)}

SC-011-D0SC04641H-s036

SC-011-D0SC04641H-s036.sdf^{(6.7KB, sdf)}

SC-011-D0SC04641H-s037

SC-011-D0SC04641H-s037.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s038

SC-011-D0SC04641H-s038.sdf^{(6KB, sdf)}

SC-011-D0SC04641H-s039

SC-011-D0SC04641H-s039.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s040

SC-011-D0SC04641H-s040.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s041

SC-011-D0SC04641H-s041.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s042

SC-011-D0SC04641H-s042.sdf^{(4.5KB, sdf)}

SC-011-D0SC04641H-s043

SC-011-D0SC04641H-s043.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s044

SC-011-D0SC04641H-s044.sdf^{(4.5KB, sdf)}

SC-011-D0SC04641H-s045

SC-011-D0SC04641H-s045.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s046

SC-011-D0SC04641H-s046.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s047

SC-011-D0SC04641H-s047.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s048

SC-011-D0SC04641H-s048.sdf^{(4.5KB, sdf)}

SC-011-D0SC04641H-s049

SC-011-D0SC04641H-s049.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s050

SC-011-D0SC04641H-s050.sdf^{(4.7KB, sdf)}

SC-011-D0SC04641H-s051

SC-011-D0SC04641H-s051.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s052

SC-011-D0SC04641H-s052.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s053

SC-011-D0SC04641H-s053.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s054

SC-011-D0SC04641H-s054.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s055

SC-011-D0SC04641H-s055.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s056

SC-011-D0SC04641H-s056.sdf^{(4.5KB, sdf)}

SC-011-D0SC04641H-s057

SC-011-D0SC04641H-s057.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s058

SC-011-D0SC04641H-s058.sdf^{(7.3KB, sdf)}

SC-011-D0SC04641H-s059

SC-011-D0SC04641H-s059.pdb^{(548.4KB, pdb)}

SC-011-D0SC04641H-s060

SC-011-D0SC04641H-s060.sdf^{(5.6KB, sdf)}

SC-011-D0SC04641H-s061

SC-011-D0SC04641H-s061.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s062

SC-011-D0SC04641H-s062.sdf^{(6.8KB, sdf)}

SC-011-D0SC04641H-s063

SC-011-D0SC04641H-s063.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s064

SC-011-D0SC04641H-s064.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s065

SC-011-D0SC04641H-s065.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s066

SC-011-D0SC04641H-s066.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s067

SC-011-D0SC04641H-s067.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s068

SC-011-D0SC04641H-s068.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s069

SC-011-D0SC04641H-s069.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s070

SC-011-D0SC04641H-s070.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s071

SC-011-D0SC04641H-s071.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s072

SC-011-D0SC04641H-s072.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s073

SC-011-D0SC04641H-s073.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s074

SC-011-D0SC04641H-s074.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s075

SC-011-D0SC04641H-s075.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s076

SC-011-D0SC04641H-s076.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s077

SC-011-D0SC04641H-s077.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s078

SC-011-D0SC04641H-s078.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s079

SC-011-D0SC04641H-s079.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s080

SC-011-D0SC04641H-s080.sdf^{(5.2KB, sdf)}

SC-011-D0SC04641H-s081

SC-011-D0SC04641H-s081.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s082

SC-011-D0SC04641H-s082.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s083

SC-011-D0SC04641H-s083.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s084

SC-011-D0SC04641H-s084.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s085

SC-011-D0SC04641H-s085.pdb^{(407.1KB, pdb)}

SC-011-D0SC04641H-s086

SC-011-D0SC04641H-s086.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s087

SC-011-D0SC04641H-s087.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s088

SC-011-D0SC04641H-s088.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s089

SC-011-D0SC04641H-s089.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s090

SC-011-D0SC04641H-s090.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s091

SC-011-D0SC04641H-s091.pdb^{(407.4KB, pdb)}

SC-011-D0SC04641H-s092

SC-011-D0SC04641H-s092.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s093

SC-011-D0SC04641H-s093.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s094

SC-011-D0SC04641H-s094.sdf^{(5.6KB, sdf)}

SC-011-D0SC04641H-s095

SC-011-D0SC04641H-s095.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s096

SC-011-D0SC04641H-s096.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s097

SC-011-D0SC04641H-s097.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s098

SC-011-D0SC04641H-s098.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s099

SC-011-D0SC04641H-s099.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s100

SC-011-D0SC04641H-s100.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s101

SC-011-D0SC04641H-s101.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s102

SC-011-D0SC04641H-s102.sdf^{(5.8KB, sdf)}

SC-011-D0SC04641H-s103

SC-011-D0SC04641H-s103.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s104

SC-011-D0SC04641H-s104.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s105

SC-011-D0SC04641H-s105.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s106

SC-011-D0SC04641H-s106.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s107

SC-011-D0SC04641H-s107.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s108

SC-011-D0SC04641H-s108.sdf^{(5.2KB, sdf)}

SC-011-D0SC04641H-s109

SC-011-D0SC04641H-s109.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s110

SC-011-D0SC04641H-s110.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s111

SC-011-D0SC04641H-s111.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s112

SC-011-D0SC04641H-s112.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s113

SC-011-D0SC04641H-s113.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s114

SC-011-D0SC04641H-s114.sdf^{(4.6KB, sdf)}

SC-011-D0SC04641H-s115

SC-011-D0SC04641H-s115.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s116

SC-011-D0SC04641H-s116.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s117

SC-011-D0SC04641H-s117.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s118

SC-011-D0SC04641H-s118.sdf^{(6.1KB, sdf)}

SC-011-D0SC04641H-s119

SC-011-D0SC04641H-s119.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s120

SC-011-D0SC04641H-s120.sdf^{(2.2KB, sdf)}

SC-011-D0SC04641H-s121

SC-011-D0SC04641H-s121.pdb^{(397KB, pdb)}

SC-011-D0SC04641H-s122

SC-011-D0SC04641H-s122.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s123

SC-011-D0SC04641H-s123.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s124

SC-011-D0SC04641H-s124.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s125

SC-011-D0SC04641H-s125.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s126

SC-011-D0SC04641H-s126.sdf^{(4.3KB, sdf)}

SC-011-D0SC04641H-s127

SC-011-D0SC04641H-s127.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s128

SC-011-D0SC04641H-s128.sdf^{(4.3KB, sdf)}

SC-011-D0SC04641H-s129

SC-011-D0SC04641H-s129.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s130

SC-011-D0SC04641H-s130.sdf^{(4.6KB, sdf)}

SC-011-D0SC04641H-s131

SC-011-D0SC04641H-s131.pdb^{(410.7KB, pdb)}

SC-011-D0SC04641H-s132

SC-011-D0SC04641H-s132.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s133

SC-011-D0SC04641H-s133.pdb^{(420.4KB, pdb)}

SC-011-D0SC04641H-s134

SC-011-D0SC04641H-s134.sdf^{(2.2KB, sdf)}

SC-011-D0SC04641H-s135

SC-011-D0SC04641H-s135.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s136

SC-011-D0SC04641H-s136.sdf^{(6.7KB, sdf)}

SC-011-D0SC04641H-s137

SC-011-D0SC04641H-s137.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s138

SC-011-D0SC04641H-s138.sdf^{(6.1KB, sdf)}

SC-011-D0SC04641H-s139

SC-011-D0SC04641H-s139.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s140

SC-011-D0SC04641H-s140.sdf^{(6.8KB, sdf)}

SC-011-D0SC04641H-s141

SC-011-D0SC04641H-s141.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s142

SC-011-D0SC04641H-s142.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s143

SC-011-D0SC04641H-s143.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s144

SC-011-D0SC04641H-s144.sdf^{(4.2KB, sdf)}

SC-011-D0SC04641H-s145

SC-011-D0SC04641H-s145.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s146

SC-011-D0SC04641H-s146.sdf^{(2.4KB, sdf)}

SC-011-D0SC04641H-s147

SC-011-D0SC04641H-s147.pdb^{(397KB, pdb)}

SC-011-D0SC04641H-s148

SC-011-D0SC04641H-s148.sdf^{(7.1KB, sdf)}

SC-011-D0SC04641H-s149

SC-011-D0SC04641H-s149.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s150

SC-011-D0SC04641H-s150.sdf^{(6.7KB, sdf)}

SC-011-D0SC04641H-s151

SC-011-D0SC04641H-s151.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s152

SC-011-D0SC04641H-s152.sdf^{(6.4KB, sdf)}

SC-011-D0SC04641H-s153

SC-011-D0SC04641H-s153.pdb^{(411KB, pdb)}

SC-011-D0SC04641H-s154

SC-011-D0SC04641H-s154.sdf^{(3.1KB, sdf)}

SC-011-D0SC04641H-s155

SC-011-D0SC04641H-s155.pdb^{(415.8KB, pdb)}

SC-011-D0SC04641H-s156

SC-011-D0SC04641H-s156.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s157

SC-011-D0SC04641H-s157.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s158

SC-011-D0SC04641H-s158.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s159

SC-011-D0SC04641H-s159.pdb^{(411KB, pdb)}

SC-011-D0SC04641H-s160

SC-011-D0SC04641H-s160.sdf^{(5.9KB, sdf)}

SC-011-D0SC04641H-s161

SC-011-D0SC04641H-s161.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s162

SC-011-D0SC04641H-s162.sdf^{(5.9KB, sdf)}

SC-011-D0SC04641H-s163

SC-011-D0SC04641H-s163.pdb^{(411.2KB, pdb)}

SC-011-D0SC04641H-s164

SC-011-D0SC04641H-s164.sdf^{(6.3KB, sdf)}

SC-011-D0SC04641H-s165

SC-011-D0SC04641H-s165.pdb^{(411KB, pdb)}

SC-011-D0SC04641H-s166

SC-011-D0SC04641H-s166.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s167

SC-011-D0SC04641H-s167.pdb^{(548.4KB, pdb)}

SC-011-D0SC04641H-s168

SC-011-D0SC04641H-s168.sdf^{(4.2KB, sdf)}

SC-011-D0SC04641H-s169

SC-011-D0SC04641H-s169.pdb^{(420.1KB, pdb)}

SC-011-D0SC04641H-s170

SC-011-D0SC04641H-s170.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s171

SC-011-D0SC04641H-s171.pdb^{(388.2KB, pdb)}

SC-011-D0SC04641H-s172

SC-011-D0SC04641H-s172.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s173

SC-011-D0SC04641H-s173.pdb^{(387.9KB, pdb)}

SC-011-D0SC04641H-s174

SC-011-D0SC04641H-s174.sdf^{(7.8KB, sdf)}

SC-011-D0SC04641H-s175

SC-011-D0SC04641H-s175.pdb^{(413.6KB, pdb)}

SC-011-D0SC04641H-s176

SC-011-D0SC04641H-s176.sdf^{(3KB, sdf)}

SC-011-D0SC04641H-s177

SC-011-D0SC04641H-s177.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s178

SC-011-D0SC04641H-s178.sdf^{(3.8KB, sdf)}

SC-011-D0SC04641H-s179

SC-011-D0SC04641H-s179.pdb^{(548.4KB, pdb)}

SC-011-D0SC04641H-s180

SC-011-D0SC04641H-s180.sdf^{(6.6KB, sdf)}

SC-011-D0SC04641H-s181

SC-011-D0SC04641H-s181.pdb^{(375.1KB, pdb)}

SC-011-D0SC04641H-s182

SC-011-D0SC04641H-s182.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s183

SC-011-D0SC04641H-s183.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s184

SC-011-D0SC04641H-s184.sdf^{(4.6KB, sdf)}

SC-011-D0SC04641H-s185

SC-011-D0SC04641H-s185.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s186

SC-011-D0SC04641H-s186.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s187

SC-011-D0SC04641H-s187.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s188

SC-011-D0SC04641H-s188.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s189

SC-011-D0SC04641H-s189.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s190

SC-011-D0SC04641H-s190.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s191

SC-011-D0SC04641H-s191.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s192

SC-011-D0SC04641H-s192.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s193

SC-011-D0SC04641H-s193.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s194

SC-011-D0SC04641H-s194.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s195

SC-011-D0SC04641H-s195.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s196

SC-011-D0SC04641H-s196.sdf^{(5.7KB, sdf)}

SC-011-D0SC04641H-s197

SC-011-D0SC04641H-s197.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s198

SC-011-D0SC04641H-s198.sdf^{(5.9KB, sdf)}

SC-011-D0SC04641H-s199

SC-011-D0SC04641H-s199.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s200

SC-011-D0SC04641H-s200.sdf^{(5.1KB, sdf)}

SC-011-D0SC04641H-s201

SC-011-D0SC04641H-s201.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s202

SC-011-D0SC04641H-s202.sdf^{(5.5KB, sdf)}

SC-011-D0SC04641H-s203

SC-011-D0SC04641H-s203.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s204

SC-011-D0SC04641H-s204.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s205

SC-011-D0SC04641H-s205.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s206

SC-011-D0SC04641H-s206.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s207

SC-011-D0SC04641H-s207.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s208

SC-011-D0SC04641H-s208.sdf^{(5.2KB, sdf)}

SC-011-D0SC04641H-s209

SC-011-D0SC04641H-s209.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s210

SC-011-D0SC04641H-s210.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s211

SC-011-D0SC04641H-s211.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s212

SC-011-D0SC04641H-s212.sdf^{(4.2KB, sdf)}

SC-011-D0SC04641H-s213

SC-011-D0SC04641H-s213.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s214

SC-011-D0SC04641H-s214.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s215

SC-011-D0SC04641H-s215.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s216

SC-011-D0SC04641H-s216.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s217

SC-011-D0SC04641H-s217.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s218

SC-011-D0SC04641H-s218.sdf^{(5.3KB, sdf)}

SC-011-D0SC04641H-s219

SC-011-D0SC04641H-s219.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s220

SC-011-D0SC04641H-s220.sdf^{(4.9KB, sdf)}

SC-011-D0SC04641H-s221

SC-011-D0SC04641H-s221.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s222

SC-011-D0SC04641H-s222.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s223

SC-011-D0SC04641H-s223.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s224

SC-011-D0SC04641H-s224.sdf^{(5.2KB, sdf)}

SC-011-D0SC04641H-s225

SC-011-D0SC04641H-s225.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s226

SC-011-D0SC04641H-s226.sdf^{(5.4KB, sdf)}

SC-011-D0SC04641H-s227

SC-011-D0SC04641H-s227.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s228

SC-011-D0SC04641H-s228.sdf^{(5.6KB, sdf)}

SC-011-D0SC04641H-s229

SC-011-D0SC04641H-s229.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s230

SC-011-D0SC04641H-s230.sdf^{(4.3KB, sdf)}

SC-011-D0SC04641H-s231

SC-011-D0SC04641H-s231.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s232

SC-011-D0SC04641H-s232.sdf^{(9.2KB, sdf)}

SC-011-D0SC04641H-s233

SC-011-D0SC04641H-s233.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s234

SC-011-D0SC04641H-s234.sdf^{(7.2KB, sdf)}

SC-011-D0SC04641H-s235

SC-011-D0SC04641H-s235.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s236

SC-011-D0SC04641H-s236.sdf^{(7.5KB, sdf)}

SC-011-D0SC04641H-s237

SC-011-D0SC04641H-s237.pdb^{(407.1KB, pdb)}

SC-011-D0SC04641H-s238

SC-011-D0SC04641H-s238.sdf^{(6.1KB, sdf)}

SC-011-D0SC04641H-s239

SC-011-D0SC04641H-s239.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s240

SC-011-D0SC04641H-s240.sdf^{(6.1KB, sdf)}

SC-011-D0SC04641H-s241

SC-011-D0SC04641H-s241.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s242

SC-011-D0SC04641H-s242.sdf^{(6.4KB, sdf)}

SC-011-D0SC04641H-s243

SC-011-D0SC04641H-s243.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s244

SC-011-D0SC04641H-s244.sdf^{(6.4KB, sdf)}

SC-011-D0SC04641H-s245

SC-011-D0SC04641H-s245.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s246

SC-011-D0SC04641H-s246.sdf^{(6.4KB, sdf)}

SC-011-D0SC04641H-s247

SC-011-D0SC04641H-s247.pdb^{(543.9KB, pdb)}

SC-011-D0SC04641H-s248

SC-011-D0SC04641H-s248.sdf^{(6.2KB, sdf)}

SC-011-D0SC04641H-s249

SC-011-D0SC04641H-s249.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s250

SC-011-D0SC04641H-s250.sdf^{(6.7KB, sdf)}

SC-011-D0SC04641H-s251

SC-011-D0SC04641H-s251.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s252

SC-011-D0SC04641H-s252.sdf^{(6.7KB, sdf)}

SC-011-D0SC04641H-s253

SC-011-D0SC04641H-s253.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s254

SC-011-D0SC04641H-s254.sdf^{(6.3KB, sdf)}

SC-011-D0SC04641H-s255

SC-011-D0SC04641H-s255.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s256

SC-011-D0SC04641H-s256.sdf^{(6KB, sdf)}

SC-011-D0SC04641H-s257

SC-011-D0SC04641H-s257.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s258

SC-011-D0SC04641H-s258.sdf^{(6.2KB, sdf)}

SC-011-D0SC04641H-s259

SC-011-D0SC04641H-s259.pdb^{(406.5KB, pdb)}

SC-011-D0SC04641H-s260

SC-011-D0SC04641H-s260.sdf^{(6.4KB, sdf)}

SC-011-D0SC04641H-s261

SC-011-D0SC04641H-s261.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s262

SC-011-D0SC04641H-s262.sdf^{(5KB, sdf)}

SC-011-D0SC04641H-s263

SC-011-D0SC04641H-s263.pdb^{(548.4KB, pdb)}

SC-011-D0SC04641H-s264

SC-011-D0SC04641H-s264.sdf^{(5.6KB, sdf)}

SC-011-D0SC04641H-s265

SC-011-D0SC04641H-s265.pdb^{(407.7KB, pdb)}

SC-011-D0SC04641H-s266

SC-011-D0SC04641H-s266.sdf^{(6.8KB, sdf)}

SC-011-D0SC04641H-s267

SC-011-D0SC04641H-s267.pdb^{(407.1KB, pdb)}

SC-011-D0SC04641H-s268

SC-011-D0SC04641H-s268.sdf^{(5.7KB, sdf)}

SC-011-D0SC04641H-s269

SC-011-D0SC04641H-s269.pdb^{(406.8KB, pdb)}

SC-011-D0SC04641H-s270

SC-011-D0SC04641H-s270.sdf^{(4.5KB, sdf)}

SC-011-D0SC04641H-s271

SC-011-D0SC04641H-s271.pdb^{(546.9KB, pdb)}

SC-011-D0SC04641H-s272

SC-011-D0SC04641H-s272.sdf^{(4.3KB, sdf)}

SC-011-D0SC04641H-s273

SC-011-D0SC04641H-s273.pdb^{(546.9KB, pdb)}

SC-011-D0SC04641H-s274

SC-011-D0SC04641H-s274.sdf^{(5.7KB, sdf)}

SC-011-D0SC04641H-s275

SC-011-D0SC04641H-s275.pdb^{(545.1KB, pdb)}

SC-011-D0SC04641H-s276

SC-011-D0SC04641H-s276.sdf^{(4.8KB, sdf)}

SC-011-D0SC04641H-s277

SC-011-D0SC04641H-s277.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s278

SC-011-D0SC04641H-s278.sdf^{(5.8KB, sdf)}

SC-011-D0SC04641H-s279

SC-011-D0SC04641H-s279.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s280

SC-011-D0SC04641H-s280.sdf^{(4.3KB, sdf)}

SC-011-D0SC04641H-s281

SC-011-D0SC04641H-s281.pdb^{(406KB, pdb)}

SC-011-D0SC04641H-s282

SC-011-D0SC04641H-s282.sdf^{(4.4KB, sdf)}

SC-011-D0SC04641H-s283

SC-011-D0SC04641H-s283.pdb^{(406.3KB, pdb)}

SC-011-D0SC04641H-s284

SC-011-D0SC04641H-s284.xlsx^{(399.2KB, xlsx)}

[cit20] Su H. Yao S. Zhao W. Li M. Jia L. Shang W. Xie H. Ke C. Gao M. Yu K. et al., Discovery of baicalin and baicalein as novel, natural product inhibitors of sars-cov-2 3cl protease in vitro. bioRxiv. 2020 [Google Scholar]

[cit21] Wang H. He S. Deng W. Zhang Y. Li G. Sun J. Zhao W. Guo Y. Zheng Y. Li D. et al., Comprehensive insights into the catalytic mechanism of middle east respiratory syndrome 3c-like protease and severe acute respiratory syndrome 3c-like protease. ACS Catal. 2020;10(10):5871–5890. doi: 10.1021/acscatal.0c00110. [DOI] [PubMed] [Google Scholar]

[cit22] Dai W. Zhang B. Jiang X.-M. Su H. Li J. Zhao Y. Xie X. Jin Z. Peng J. Liu F. et al., Structure-based design of antiviral drug candidates targeting the sars-cov-2 main protease. Science. 2020;368(6497):1331–1335. doi: 10.1126/science.abb4489. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit23] Ma C. Sacco M. D. Hurst B. Townsend J. A. Hu Y. Szeto T. Zhang X. Tarbet B. Marty M. T. Chen Y. et al., Boceprevir, gc-376, and calpain inhibitors ii, xii inhibit sars-cov-2 viral replication by targeting the viral main protease. bioRxiv. 2020 doi: 10.1038/s41422-020-0356-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit24] Douangamath A. Fearon D. Gehrtz P. Krojer T. Lukacik P. Owen C. D. Resnick E. Strain-Damerell C. Ábrányi-Balogh P. Brandaõ-Neto J. et al., Crystallographic and electrophilic fragment screening of the sars-cov-2 main protease. Nat. Commun. 2020;11:5047. doi: 10.1038/s41467-020-18709-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit25] Bacha U. Barrila J. Gabelli S. B. Kiso Y. Amzel L. M. Freire E. Development of broad-spectrum halomethyl ketone inhibitors against coronavirus main protease 3clpro. Chem. Biol. Drug Des. 2008;72(1):34–49. doi: 10.1111/j.1747-0285.2008.00679.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit26] Ghosh A. K. Brindisi M. Shahabi D. Chapman M. E. Mesecar A. D. Drug development and medicinal chemistry efforts toward sars-coronavirus and covid-19 therapeutics. ChemMedChem. 2020;15(11):907–932. doi: 10.1002/cmdc.202000223. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit27] Liang P.-H. Characterization and inhibition of sars-coronavirus main protease. Curr. Top. Med. Chem. 2006;6(4):361–376. doi: 10.2174/156802606776287090. [DOI] [PubMed] [Google Scholar]

[cit28] Wang H.-M. Liang P.-H. Pharmacophores and biological activities of severe acute respiratory syndrome viral protease inhibitors. Expert Opin. Ther. Pat. 2007;17(5):533–546. doi: 10.1517/13543776.17.5.533. [DOI] [Google Scholar]

[cit29] Kumar V. Jung Y.-S. Liang P.-H. Anti-sars coronavirus agents: a patent review (2008–present) Expert Opin. Ther. Pat. 2013;23(10):1337–1348. doi: 10.1517/13543776.2013.823159. [DOI] [PubMed] [Google Scholar]

[cit30] Pillaiyar T. Manickam M. Namasivayam V. Hayashi Y. Jung S.-H. An overview of severe acute respiratory syndrome–coronavirus (sars-cov) 3cl protease inhibitors: peptidomimetics and small molecule chemotherapy. J. Med. Chem. 2016;59(14):6595–6628. doi: 10.1021/acs.jmedchem.5b01461. [DOI] [PMC free article] [PubMed] [Google Scholar]

[cit31] Ullrich S. Nitsche C. The SARS-CoV-2 main protease as drug target. Bioorg. Med. Chem. Lett. 2020:127377. doi: 10.1016/j.bmcl.2020.127377.

[cit32] Cang Z. X. Mu L. Wei G. W. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput. Biol. 2018;14(1):e1005929. doi: 10.1371/journal.pcbi.1005929.

[cit33] Nguyen D. Wei G.-W. AGL-Score: Algebraic graph learning score for protein–ligand binding scoring, ranking, docking, and screening. J. Chem. Inf. Model. 2019;59(7):3291–3304. doi: 10.1021/acs.jcim.9b00334.

[cit34] Carlsson G. Topology and data. Bull. Am. Math. Soc. 2009;46(2):255–308. doi: 10.1090/S0273-0979-09-01249-X.

[cit35] Jones G. Willett P. Glen R. C. Leach A. R. Taylor R. Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 1997;267(3):727–748. doi: 10.1006/jmbi.1996.0897.

[cit36] Trott O. Olson A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010;31(2):455–461. doi: 10.1002/jcc.21334.

[cit37] Friesner R. A. Banks J. L. Murphy R. B. Halgren T. A. Klicic J. J. Mainz D. T. Repasky M. P. Knoll E. H. Shelley M. Perry J. K. et al., Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 2004;47(7):1739–1749. doi: 10.1021/jm0306430.

[cit38] Wójcikowski M. Kukiełka M. Stepniewska-Dziubinska M. Siedlecki P. Development of a protein-ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions, 2018.

[cit39] Nguyen D. D. Wei G.-W. DG-GL: Differential geometry-based geometric learning of molecular datasets. Int. J. Numer. Method Biomed. Eng. 2019;35(3):e3179. doi: 10.1002/cnm.3179.

[cit40] Zheng L. Fan J. Mu Y. OnionNet: a multiple-layer intermolecular-contact-based convolutional neural network for protein–ligand binding affinity prediction. ACS Omega. 2019;4(14):15956–15965. doi: 10.1021/acsomega.9b01997.

[cit41] Li H. Sze K.-H. Lu G. Ballester P. J. Machine-learning scoring functions for structure-based drug lead optimization. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2020:e1465. doi: 10.1002/wcms.1225.

[cit42] Nguyen D. D. Cang Z. Wei G.-W. A review of mathematical representations of biomolecular data. Phys. Chem. Chem. Phys. 2020;22(8):4343–4367. doi: 10.1039/C9CP06554G.

[cit43] Jiménez J. Skalic M. Martínez-Rosell G. De Fabritiis G. K_DEEP: Protein–ligand absolute binding affinity prediction via 3D-convolutional neural networks. J. Chem. Inf. Model. 2018;58(2):287–296. doi: 10.1021/acs.jcim.7b00650.

[cit44] Wu K. Wei G. W. Quantitative toxicity prediction using topology based multitask deep neural networks. J. Chem. Inf. Model. 2018;58:520–531. doi: 10.1021/acs.jcim.7b00558.
