Abstract
SARS-CoV-2 3CL protease is one of the key targets for drug development against COVID-19. Most known SARS-CoV-2 3CL protease inhibitors act by covalently binding to the active site cysteine. Yet, computational screens against this enzyme were mainly focused on non-covalent inhibitor discovery. Here, we developed a deep learning-based stepwise strategy for selective covalent inhibitor screen. We used a deep learning framework that integrated a directed message passing neural network with a feed-forward neural network to construct two different classifiers for either covalent or non-covalent inhibition activity prediction. These two classifiers were trained on the covalent and non-covalent 3CL protease inhibitors dataset, respectively, which achieved high prediction accuracy. We then successively applied the covalent inhibitor model and the non-covalent inhibitor model to screen a chemical library containing compounds with covalent warheads of cysteine. We experimentally tested the inhibition activity of 32 top-ranking compounds and 12 of them were active, among which 6 showed IC50 values less than 12 μM and the strongest one inhibited SARS-CoV-2 3CL protease with an IC50 of 1.4 μM. Further investigation demonstrated that 5 of the 6 active compounds showed typical covalent inhibition behavior with time-dependent activity. These new covalent inhibitors provide novel scaffolds for developing highly active SARS-CoV-2 3CL covalent inhibitors.
Keywords: SARS-CoV-2, 3C-like protease inhibitors, Deep learning, Covalent warheads
1. Introduction
Caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], the ongoing global coronavirus disease 2019 (COVID-19) pandemic has led to more than 565 million confirmed cases and over 6 million deaths as of July 2022 according to the World Health Organization. Despite the relentless effort of researchers, COVID-19 may still pose a great threat to human life in the future. On the one hand, the continuous emergence of virus variants compromises the efficacy of available vaccines [2], while the durability and long-term side effects of vaccines are currently unknown [3]. On the other hand, although several drugs such as remdesivir [4], molnupiravir [5], and Paxlovid [6] have been approved for emergency use by the FDA for the treatment of patients with mild-to-moderate COVID-19, only Paxlovid performs relatively well. Therefore, it is still imperative to develop new effective therapies against this pandemic.
SARS-CoV-2 and other coronaviruses encode a chymotrypsin-like protease (3C-like protease (3CLpro) or main protease (Mpro)), which cleaves the viral polyproteins at 11 sites and plays a pivotal role in viral replication and transcription [7,8]. Its substrate cleavage sites involve a conserved glutamine at the P1 position that is essential for hydrolysis [9–11]. Such strict substrate specificity has not been found in human host proteases [12,13], making 3CLpro an ideal target for developing drugs against COVID-19. A number of 3CLpro inhibitors for other coronaviruses had been studied before [14–19], which have been repurposed or further developed to treat COVID-19. Despite the therapeutic benefits, fear of toxicity caused by inherent chemical reactivity impeded the development of covalent drugs for a long time, and few studies were dedicated to covalent inhibition 30 years ago [20]. Nevertheless, owing to the development of characterization techniques and the discovery of some covalent inhibitors with moderate chemical reactivity, the design of covalent drugs has become more prevalent [21,22]. At present, the known SARS-CoV-2 3CLpro inhibitors are also mainly covalent, such as PF-00835231 [22], PF-07321332 [6], myricetin [21], GC376 [23], MI-23 [24], 11a [25], calpain inhibitors I, II, and XII [26], carmofur [10], boceprevir [26], and N3 [10], most of which contain warheads such as aldehyde, alpha-ketoamide, nitrile, acrylamide, Michael acceptors, and chloroacetamide that react with the active site Cys145. There are also a few non-covalent inhibitors, such as S-217622 [27], baicalin [28], and masitinib [29]. In view of the enhanced therapeutic potency and long-lasting effects of these covalent warheads [30,31], developing covalent inhibitors of SARS-CoV-2 3CLpro provides an effective way to combat the pandemic.
In addition to rational design [32], molecular docking-based virtual screening [33], high-throughput experimental screening [34], and artificial intelligence (AI) have been used to screen for anti-SARS-CoV-2 compounds. For example, Wang et al. [35] developed a transferable deep learning method to screen a large compound library and suggested a list of potential anti-SARS-CoV-2 compounds. Duc et al. [36] used algebraic topology and deep learning to understand the molecular mechanism of SARS-CoV-2 3CLpro inhibition from 137 complex crystal structures and predicted 71 possible covalent binding inhibitors, for which no experimental validation has been carried out. Hu et al. [37] developed a novel framework, AIMEE, that integrated an AI model with enzymological experiments to screen a bioactive chemical library and found four SARS-CoV-2 3CLpro inhibitors with half-maximal inhibitory concentration (IC50) at the micromolar level. Although the most successful SARS-CoV-2 3CLpro inhibitors to date all covalently bind to the active site Cys145, the AI models currently reported for SARS-CoV-2 3CLpro inhibitor screening all used datasets of experimentally identified inhibitors that contain both covalent and non-covalent inhibitors. Most covalent inhibitors screened by these models can covalently bind to various targets without selectivity, which may easily result in off-target effects. Consequently, AI models that can be used to screen selective covalent inhibitors need to be developed.
Given the contribution of non-covalent interactions to the selectivity of ligands and the increased efficacy when two binding mechanisms exist simultaneously [20], the existence of covalent warheads and the binding of non-covalent sub-structures should be taken into account together when screening new covalent inhibitors. In the present study, we developed a strategy that incorporates deep learning models and in vitro experiments to quickly identify potential selective covalent SARS-CoV-2 3CLpro inhibitors. Firstly, we applied a deep learning framework that integrated a directed message passing neural network with a feed-forward neural network to construct two different classifiers for the activity prediction of molecules. These two classifiers were trained on the covalent and non-covalent 3CLpro inhibitors dataset separately. Both models achieved satisfactory performance on the test sets. We then successively applied the covalent inhibitor model and the non-covalent inhibitor model to screen a library with more than 39,000 compounds that contain covalent warheads of cysteine. Among the 32 top-ranking compounds that were experimentally tested, 6 showed IC50 values less than 12 μM and the strongest one inhibited SARS-CoV-2 3CLpro with an IC50 of 1.4 μM. Further investigation demonstrated that 5 of the 6 active compounds showed typical covalent inhibition behavior with time-dependent activity. These new inhibitors may provide useful guidance for the design of new drugs towards COVID-19.
2. Material and methods
2.1. Dataset
Since SARS-CoV-2 3CLpro has up to 96% sequence identity with SARS-CoV 3CLpro, inhibitors targeting SARS-CoV 3CLpro are likely to have the same effect on SARS-CoV-2 3CLpro [38]. Therefore, we collected not only SARS-CoV-2 3CLpro inhibitors from recent literature (from 2020 to 2021) but also SARS-CoV 3CLpro inhibitors from previous studies (from 2004 to 2021) [11,14–19,23,25,39–42]. Experimental data of inhibition assays are available for all collected inhibitors in the literature, and an IC50 of 50 μM was chosen as the cutoff to distinguish positive and negative samples. In addition, we divided this curated dataset into covalent and non-covalent datasets based on the binding mechanism of inhibitors. After dividing the dataset into training and test sets in a ratio of approximately 10:1, we finally constructed a covalent training dataset (Training Set 1) with 463 molecules (209 positives and 254 negatives), a covalent test dataset (Test Set 1) with 53 molecules (25 positives and 28 negatives), a non-covalent training dataset (Training Set 2) with 1086 molecules (224 positives and 862 negatives) and a non-covalent test dataset (Test Set 2) with 108 molecules (18 positives and 90 negatives) (Data S1).
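The labeling and splitting step can be illustrated with a minimal sketch. It assumes that compounds with an IC50 strictly below 50 μM are positives (the handling of a value exactly at the cutoff is not stated in the text) and that the roughly 10:1 split was stratified by label; the record fields and function names are hypothetical.

```python
import random

IC50_CUTOFF_UM = 50.0  # cutoff stated in the text


def label_by_ic50(ic50_um, cutoff=IC50_CUTOFF_UM):
    """Return 1 (positive) if IC50 < cutoff, else 0 (assumed strict inequality)."""
    return 1 if ic50_um < cutoff else 0


def stratified_split(records, test_fraction=1 / 11, seed=0):
    """Split records into train/test (about 10:1), keeping the label ratio.

    Stratification is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [r for r in records if r["label"] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy records with hypothetical IC50 values (μM)
records = [
    {"id": i, "label": label_by_ic50(10.0 if i % 2 else 100.0)}
    for i in range(110)
]
train_set, test_set = stratified_split(records)
```

The same routine would be applied separately to the covalent and the non-covalent collections to obtain the two training/test pairs.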
The Cysteine Targeted Covalent Library from ChemDiv (https://www.chemdiv.com) contains 39,301 compounds that have specific warheads, which were designed to react with cysteine. After removing overlaps with the training set and test set, the remaining 39,014 compounds of this library were used to screen potential covalent SARS-CoV-2 3CLpro inhibitors.
2.2. Deep learning models
We developed one classification model for covalent inhibitor classification (COVCL) and one classification model for non-covalent inhibitor classification (NOVCL) for SARS-CoV-2 3CLpro. The COVCL model, trained on Training Set 1, can be used to predict whether a compound has covalent inhibitory activity against SARS-CoV-2 3CLpro. The NOVCL model, trained on Training Set 2, can be used to predict whether a compound has non-covalent inhibitory activity against SARS-CoV-2 3CLpro. Successive application of these two models can screen out compounds that may specifically bind SARS-CoV-2 3CLpro and form covalent bond with Cys145.
We used the Chemprop architecture to train the models and encoded the molecules using SMILES representation. The Chemprop model mainly includes a directed message passing neural network (D-MPNN) module for molecular feature extraction and a feed-forward neural network (FNN) for property prediction. Initially, Chemprop takes molecular SMILES as input and converts it into a molecular graph with atoms regarded as nodes and bonds as edges. After the message passing stage and readout stage, the D-MPNN module finally extracts all the atomic features and bond features in the molecular graph to generate a single feature vector representing the whole molecule [43]. In this work, we concatenated an additional vector containing 200 descriptors computed by RDKit [44] to introduce more molecular-level features. These descriptors contain partial charge, number of heavy atoms, number of hydrogen-bond donors, etc. (Table S1). The FNN structure means that information is transferred unidirectionally from the input layer to the output layer step by step, with no feedback between layers [45]. The final feature vector of the D-MPNN module was then fed to the FNN module to predict the activity of candidate molecules (Fig. 1).
Fig. 1.
Schematic of the framework used for two deep classifiers. VM represents a vector of the whole molecular features. VRDKit represents a vector containing 200 descriptors calculated by RDKit. VM and VRDKit are concatenated to generate a final vector Vfinal, which is then used as the input of FNN module.
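To make the data flow in Fig. 1 concrete, the following toy sketch concatenates a learned molecule vector (VM) with a descriptor vector (VRDKit) and passes the result through a small feed-forward network. It is not Chemprop code: the dimensions, random weights, ReLU activation, and all function names are illustrative assumptions, and the real VM comes from the trained D-MPNN.

```python
import math
import random


def concat_features(v_m, v_rdkit):
    """Build V_final by concatenating the molecule and descriptor vectors."""
    return list(v_m) + list(v_rdkit)


def dense(x, weights, bias):
    """One fully connected layer: weights has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fnn_predict(v_final, layers):
    """Forward pass: ReLU on hidden layers, sigmoid on the single output."""
    h = v_final
    for weights, bias in layers[:-1]:
        h = [max(0.0, v) for v in dense(h, weights, bias)]
    weights, bias = layers[-1]
    logit = dense(h, weights, bias)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights, for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
v_m = [0.2, -0.1, 0.4, 0.0]   # toy stand-in for the D-MPNN output
v_rdkit = [1.5, 3.0, 0.7]     # toy stand-in for the 200 RDKit descriptors
v_final = concat_features(v_m, v_rdkit)
layers = [random_layer(7, 5, rng), random_layer(5, 1, rng)]
score = fnn_predict(v_final, layers)  # activity probability between 0 and 1
```

The information flows strictly from input to output with no feedback, which is the property the text attributes to the FNN module.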
2.3. Inhibition assay of SARS-CoV-2 3CLpro
The expression and purification of SARS-CoV-2 3CLpro were carried out using the reported protocol [46]. A fluorescent substrate Dabcyl-KTSAVLQSGFRKM-E(Edans)-NH2 (GL Biochemistry Ltd) and assay buffer (40 mM PBS, 100 mM NaCl, 1 mM EDTA, 0.1% Triton X-100, pH 7.3) were used for the inhibition assay. Stock solutions of the inhibitors were prepared in DMSO. SARS-CoV-2 3CLpro (0.5 μM) was pre-incubated for 180 min or other lengths of time with 5 μL DMSO or inhibitor at various concentrations. Then, 20 μM fluorescent substrate was added to initiate the reaction. The reaction system was excited at 360 nm and the increase in fluorescence emission at 460 nm was recorded for 20 min at intervals of 37 s in kinetics mode using a 96-well plate reader (Synergy, Biotek). The inhibition rate was calculated from Vi/V0, where V0 and Vi represent the mean reaction rates of the protease incubated with DMSO and with compound, respectively. IC50 values were fitted with GraphPad Prism 8.0.
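As an illustration of how an IC50 can be obtained from such rate measurements, the sketch below converts rates to fractional inhibition and fits a Hill-type dose-response curve. Several points are assumptions: inhibition is taken as 1 − Vi/V0, the curve has fixed plateaus of 0 and 1 with a Hill slope of 1, and a simple grid search replaces the nonlinear regression performed by GraphPad Prism. The concentrations and data are synthetic.

```python
def fractional_inhibition(v_i, v_0):
    """Fraction of activity lost relative to the DMSO control (assumed 1 - Vi/V0)."""
    return 1.0 - v_i / v_0


def hill(conc, ic50, slope=1.0):
    """Fractional inhibition predicted by a Hill curve with 0 and 1 plateaus."""
    return conc ** slope / (ic50 ** slope + conc ** slope)


def fit_ic50(concs, inhibitions, slope=1.0):
    """Grid search over log10(IC50) that minimises the squared error."""
    best_ic50, best_err = None, float("inf")
    for step in range(-300, 301):      # 10^-3 to 10^3 μM in 0.01 log units
        ic50 = 10 ** (step / 100)
        err = sum((hill(c, ic50, slope) - y) ** 2
                  for c, y in zip(concs, inhibitions))
        if err < best_err:
            best_ic50, best_err = ic50, err
    return best_ic50


# Synthetic, noise-free data generated with an IC50 of 1.4 μM
concs_um = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
observed = [hill(c, 1.4) for c in concs_um]
estimate = fit_ic50(concs_um, observed)
```

The grid resolution limits the precision of the estimate to about 0.01 log units, which is adequate for illustration but not a substitute for a proper regression with confidence intervals.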
For the inhibition test in presence of DTT, the final concentration of DTT in assay buffer was 5 mM.
2.4. Inhibition kinetics analysis and reversibility assay
We studied the enzyme kinetics properties by adding different concentrations of the fluorescent substrate to initiate the reaction. Mean velocities were collected and plotted then fitted with the Michaelis-Menten equation to obtain values of Km and Vmax.
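For reference, Km and Vmax can be estimated from velocity data as sketched below. This example uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) with ordinary least squares as a stand-in for the nonlinear Michaelis-Menten fit described above; the substrate concentrations and parameter values are synthetic assumptions.

```python
def michaelis_menten(s, km, vmax):
    """Initial velocity at substrate concentration s."""
    return vmax * s / (km + s)


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_km_vmax(substrate, velocity):
    """Hanes-Woolf estimate: slope = 1/Vmax, intercept = Km/Vmax."""
    ratios = [s / v for s, v in zip(substrate, velocity)]
    slope, intercept = linear_fit(substrate, ratios)
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free velocities generated with Km = 20 and Vmax = 5
substrate_um = [2.5, 5.0, 10.0, 20.0, 40.0, 80.0]
velocities = [michaelis_menten(s, 20.0, 5.0) for s in substrate_um]
km_est, vmax_est = fit_km_vmax(substrate_um, velocities)
```

Repeating the fit at several inhibitor concentrations and comparing the resulting Km and Vmax values is what allows the inhibition type to be assigned.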
To investigate the reversibility of inhibition, 1 μM SARS-CoV-2 3CLpro was incubated with 10 μM compound stock solution on ice for 180 min and then divided into different Millipore tubes for various times of ultrafiltration. For each time of ultrafiltration, equal volume of assay buffer used in the inhibition assay was added to elute the protease then ultrafiltered at 4 °C, 12,000 rpm for 5 min and collected for the inhibition assay.
2.5. Half-life of compounds reacting with GSH
Evaluation of the intrinsic reactivity of warheads was conducted by measuring the half-life of the compounds reacting with GSH. 500 μM compound was incubated with 10 mM GSH for 0–60 min and the remaining compounds were quantified by HPLC (Agilent 1200). Ln (the percentage of the remaining compound) was plotted against incubation time to generate the half-life time of the compound reacting with GSH.
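The half-life calculation can be sketched as follows, assuming pseudo-first-order decay (GSH is in 20-fold excess), so that ln(% remaining) falls linearly with time with slope −k and t1/2 = ln 2 / k. The time points and data are synthetic, and the function name is hypothetical.

```python
import math


def half_life_min(times_min, percent_remaining):
    """Fit ln(% remaining) against time and return ln(2) / k in minutes."""
    ys = [math.log(p) for p in percent_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
             / sum((t - mean_t) ** 2 for t in times_min))
    rate_constant = -slope          # pseudo-first-order k (per minute)
    return math.log(2) / rate_constant


# Synthetic, noise-free decay generated with a 49.6 min half-life
times = [0, 10, 20, 30, 45, 60]
remaining = [100.0 * 0.5 ** (t / 49.6) for t in times]
t_half = half_life_min(times, remaining)
```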
2.6. Cathepsin L inhibition assay
The inhibition assay of Cathepsin L was performed as previously reported [47]. Briefly, compound 9 and compound 13 were tested using the commercial Cathepsin L Inhibitor Assay Kit (Abcam, Cat# ab197012). FF-FMK, a known inhibitor for Cathepsin L was used as a positive control.
2.7. Mass spectrometry analysis
For mass spectrometry analysis of the protease, 1 μM SARS-CoV-2 3CLpro was incubated with DMSO or 10 μM compound on ice for 180 min. The solution was ultrafiltered three times at 4 °C, 12,000 rpm for 5 min and analyzed by Quadrupole-TOF LC-MS/MS System using the ESI (+) mode. Signals of observed mass were collected and deconvoluted.
2.8. Covalent docking
The crystal structure of SARS-CoV-2 3CLpro in complex with compound 4 (PDB ID: 7JT7) was used as the target structure [48]. Protein and ligands were first prepared by the Protein Prep Wizard module [49] and LigPrep module in Schrodinger [50], respectively. Covalent Docking module [51] in Schrodinger was then used to carry out covalent docking of compounds with reaction type set to Michael addition and docking mode set to pose prediction.
3. Results and discussions
3.1. Models for prediction of covalent and non-covalent SARS-CoV-2 3CLpro inhibitors
We first trained and tested the COVCL and NOVCL models. Chemprop is a deep learning model for molecular property prediction that outperforms existing strong baselines on 16 proprietary datasets and 19 publicly available datasets [52]. Given the outstanding performance of Chemprop, we considered using it as the framework of our models. Since the performance of deep learning models is usually limited by the size of the dataset, we additionally tested the performance of several classical machine learning models including Random Forest (RF) [53], Support Vector Classification (SVC) [54], eXtreme Gradient Boosting (XGB) [55], and Artificial Neural Network (ANN) [56]. Firstly, 10-fold cross-validation was carried out on the training sets to evaluate the performance. Since ensemble methods can improve the stability of a model by aggregating the predictions of multiple classifiers [57], we further constructed an ensemble of 10 models. The ensemble members were trained on different splits of training data and validation data (9:1) from the training sets. Then, the performance of this ensemble was tested on the independent test sets, and the final prediction was the average of the predictions of these 10 models. The area under the receiver operating characteristic curve (ROC-AUC) and accuracy (ACC) were used as metrics to evaluate the performance of different models. As indicated by Table S2 and Table S3, the performance of the classical models was close to that of Chemprop on both covalent and non-covalent datasets, but overall Chemprop performed slightly better. Consequently, considering that our small dataset does not diminish the effect of deep learning and that Chemprop incorporates more molecular structure information as described in the methods, we finally chose Chemprop as the basic architecture of the COVCL and NOVCL models.
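The ensemble averaging and the two metrics can be written out in a few lines. This is a generic sketch rather than the evaluation code used in the study; the 0.5 decision threshold for ACC and the toy scores are assumptions.

```python
def ensemble_average(per_model_scores):
    """Average the predicted scores of several models, molecule by molecule."""
    n_models = len(per_model_scores)
    return [sum(column) / n_models for column in zip(*per_model_scores)]


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct calls at an assumed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


# Two toy "models" scoring four molecules (the study averaged 10 models)
labels = [1, 1, 0, 0]
model_scores = [
    [0.9, 0.5, 0.3, 0.2],
    [0.7, 0.7, 0.5, 0.0],
]
averaged = ensemble_average(model_scores)
auc = roc_auc(labels, averaged)
acc = accuracy(labels, averaged)
```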
The default parameters of the Chemprop model were used except for four hyperparameters: the number of message-passing steps (depth), the dropout values of all layers, the size of bond message vectors (hidden size), and the number of FNN layers. We used the automated hyperparameter optimization tool in Chemprop, which is based on Bayesian optimization, to identify the optimal hyperparameters (Table 1).
Table 1.
Optimal hyperparameters for the two models.
| model | depth | dropout | FNN layers | hidden size |
|---|---|---|---|---|
| COVCL | 5 | 0.1 | 2 | 1100 |
| NOVCL | 3 | 0.1 | 2 | 1300 |
As shown in Fig. 2A and Fig. 2C, the AUC and ACC of ten-fold cross-validation on Training Set 1 were all above 0.850, while the AUC on Test Set 1 reached 0.987, indicating the good performance of COVCL. As for NOVCL, the AUC and ACC of ten-fold cross-validation were all above 0.880, while the AUC on Test Set 2 reached 0.853 (Fig. 2B and D). In summary, both COVCL and NOVCL showed satisfactory performance, and thus can be used for the following screening of SARS-CoV-2 3CLpro inhibitors.
Fig. 2.
Performance of two classification models (COVCL and NOVCL). (A) Result of 10-fold cross-validation on Training Set 1 for COVCL. (B) Result of 10-fold cross-validation on Training Set 2 for NOVCL. (C) Receiver operating characteristic curve of COVCL on Test Set 1. AUC is the area under the curve, which is usually used to evaluate the performance of classification model. (D) Receiver operating characteristic curve of NOVCL on Test Set 2.
3.2. Virtual screening of Cysteine Targeted Covalent Library to identify potential 3CLpro inhibitors
We constructed a generally applicable 5-step screening architecture that integrates deep learning and further property analysis to identify potential selective covalent inhibitors (Fig. 3A).
Fig. 3.
Library screening for potential inhibitors. (A) Overall schematic of the screening architecture. (B) Distribution of predicted scores for Cysteine Targeted Covalent Library using the COVCL model. Count indicates the number of molecules. (C) Distribution of the predicted scores for the 727 selected molecules using the NOVCL model.
We first screened the Cysteine Targeted Covalent Library with the COVCL model for compounds that might covalently bind SARS-CoV-2 3CLpro. The top 727 compounds with scores higher than 0.8 were selected (Fig. 3B) and subjected to the NOVCL model filtering for compounds that specifically bind to the enzyme and form covalent bond with Cys145. Using a cutoff score of 0.7, 298 molecules were selected (Fig. 3C) for further analysis.
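The successive filtering described above amounts to two threshold tests applied in order. In the sketch below the scoring callables are hypothetical stand-ins for the trained COVCL and NOVCL models, and treating the 0.7 cutoff as a strict inequality is an assumption.

```python
COVCL_CUTOFF = 0.8  # "scores higher than 0.8" in the text
NOVCL_CUTOFF = 0.7  # cutoff stated in the text; strictness assumed


def two_stage_screen(compounds, covcl_score, novcl_score):
    """Apply the covalent model first, then the non-covalent model."""
    stage1 = [c for c in compounds if covcl_score(c) > COVCL_CUTOFF]
    stage2 = [c for c in stage1 if novcl_score(c) > NOVCL_CUTOFF]
    return stage1, stage2


# Hypothetical scores standing in for model predictions
covcl = {"cpd_A": 0.95, "cpd_B": 0.85, "cpd_C": 0.40}
novcl = {"cpd_A": 0.90, "cpd_B": 0.30, "cpd_C": 0.99}
stage1, stage2 = two_stage_screen(list(covcl), covcl.get, novcl.get)
```

Here cpd_C is never scored by the second model because it fails the first filter, which mirrors the order of operations in the screen (39,014 to 727 to 298 compounds).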
Since some of these 298 compounds are structurally similar, a cluster analysis of the compounds was performed to select representative compounds from each cluster. We used the Butina [58] module of RDKit to cluster the above 298 molecules with a distance cutoff of 0.4. This allowed the compounds to be divided into 102 clusters, among which two clusters contain more than 20 compounds. We further clustered these two groups of compounds with a cutoff of 0.2 and the compound with the highest NOVCL score was taken as a representative for each class. These analyses gave 113 compounds for further study.
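The clustering and representative selection can be illustrated with a self-contained sketch of the Taylor-Butina procedure. It is a simplified pure-Python version, not the RDKit Butina module used in the study: fingerprints are represented as sets of "on" bits, distance is taken as 1 − Tanimoto similarity, and the toy fingerprints and scores are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def butina_cluster(fingerprints, distance_cutoff=0.4):
    """Greedy sphere-exclusion clustering; returns lists of member indices."""
    n = len(fingerprints)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i):
            dist = 1.0 - tanimoto(fingerprints[i], fingerprints[j])
            if dist <= distance_cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)
    # Molecules with the most neighbors become cluster centroids first
    order = sorted(range(n), key=lambda i: -len(neighbors[i]))
    assigned, clusters = set(), []
    for centroid in order:
        if centroid in assigned:
            continue
        members = [centroid] + sorted(neighbors[centroid] - assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters


def pick_representatives(clusters, novcl_scores):
    """Keep the member with the highest NOVCL score from each cluster."""
    return [max(cluster, key=lambda i: novcl_scores[i]) for cluster in clusters]


fps = [{1, 2, 3, 4}, {1, 2, 3, 4, 5}, {1, 2, 3, 4, 6},
       {10, 11, 12}, {10, 11, 12, 13}, {20, 21}]
scores = [0.70, 0.90, 0.80, 0.75, 0.95, 0.71]
clusters = butina_cluster(fps, distance_cutoff=0.4)
representatives = pick_representatives(clusters, scores)
```

Re-clustering an oversized cluster with a tighter cutoff, as done for the two largest groups, corresponds to calling `butina_cluster` again on that cluster's members with `distance_cutoff=0.2`.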
As the physicochemical properties of a molecule can have a significant impact on its absorption, distribution, metabolism, excretion (ADME), toxicity, and pharmacological activity, they were considered in the fourth step of the screening process. In the present study, we simply used the octanol-water partition coefficient (logP) to ensure the absorption and bioavailability property of a compound [59]. There are 79 compounds among the 117 with logP values not higher than 5. We also limited the molecular weight to the range of 300–600 and 32 candidate compounds were selected for experimental study.
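The property filter in this step reduces to two range checks. The sketch below assumes inclusive molecular weight bounds and takes precomputed logP and molecular weight values as input (in practice these would come from a cheminformatics toolkit); the candidate records are hypothetical.

```python
def passes_property_filter(logp, mol_weight,
                           max_logp=5.0, mw_range=(300.0, 600.0)):
    """True if logP <= 5 and 300 <= MW <= 600 (inclusive bounds assumed)."""
    return logp <= max_logp and mw_range[0] <= mol_weight <= mw_range[1]


# Hypothetical candidates with precomputed properties
candidates = [
    {"id": "cpd_A", "logp": 3.2, "mw": 405.4},
    {"id": "cpd_B", "logp": 5.6, "mw": 420.0},   # fails logP
    {"id": "cpd_C", "logp": 2.1, "mw": 250.3},   # fails MW
]
selected = [c["id"] for c in candidates
            if passes_property_filter(c["logp"], c["mw"])]
```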
In order to visualize the model prediction results and further analyze the structural information of the screened molecules, we applied t-distributed stochastic neighbor embedding (t-SNE) on the dataset to visualize the chemical space distribution of compounds. t-SNE is a dimensionality reduction method that can utilize Tanimoto similarity to quantify the chemical distance. The spatial proximity of two points on the t-SNE plot implies structural similarity. As shown in Fig. 4A, the molecules in the training set were widely distributed in chemical space and contained various structures. The distribution of the 32 screened molecules was dispersed and 29 of these molecules had a maximum similarity of less than 0.5 to the positive compounds in the training set, indicating that COVCL and NOVCL can identify novel structures to some extent (Fig. 4B). This indicates that the models learned certain structure-activity relationships from the training set, and the learned features can be recombined into some new compounds. Certainly, the performance of deep learning models is closely related to the training data, so increasing the structural diversity of training data should help the model learn more information and hence enhance its prediction ability.
Fig. 4.
Chemical space distribution of molecules. (A) Visualization of the molecular structures in two-dimensional space for Cysteine Targeted Covalent Library (grey dots) and training set (green, blue, red and orange dots) by t-SNE plot. The spatial proximity of two points on the graph implies structural similarity. (B) Visualization of the molecular structures in two-dimensional space for training set (green and grey dots) and 32 screened molecules (red crosses) by t-SNE plot.
3.3. Enzyme inhibition activity of the virtually screened compounds
We then tested the inhibitory activity of the 32 compounds against SARS-CoV-2 3CLpro according to our previously published methods [46], using tideglusib as a positive control (Fig. S1) [10]. Among the 32 compounds, 12 showed more than 50% inhibition at the concentration of 100 μM (Table S4) and 6 exhibited considerable inhibitory activity with IC50 < 12 μM with 180 min incubation time (Fig. 5). These 6 active compounds contain covalent warheads of α,β-unsaturated ketone (ester), disulfide bond, and heteroaromatic ester. To the best of our knowledge, neither anti-SARS-CoV-2 activities nor other biological activities of these molecules have been reported before. We also used the D3Similarity web server to calculate the molecular similarity between our compounds and the reported bioactive compounds against coronaviruses [60]. The maximum two-dimensional similarity values of the 6 compounds were all below 0.45, indicating their structural novelty compared to known inhibitors and providing good starting points for designing novel SARS-CoV-2 3CLpro covalent inhibitors.
Fig. 5.
Dose-response curves of the inhibition of SARS-CoV-2 3CLpro by the 6 hits under different incubation times. IC50 values under 180 min incubation time are shown.
3.4. Mode of action analysis of the active compounds
To elucidate the mode of action of hit compounds against SARS-CoV-2 3CLpro, we carried out a series of enzymatic studies and mass spectrometry analysis. As time-dependent inhibition is a good indicator of covalent binding, we incubated the 6 compounds with SARS-CoV-2 3CLpro for different lengths of time and measured their inhibition activity. Five of them exhibited obvious time-dependent increase of inhibition activity (Fig. 5 and Table S5). Among the 5 compounds, compound 9 and compound 13 with the most potent inhibitory activities were selected for further analysis.
As shown in Fig. 6A, the addition of dithiothreitol (DTT) could reverse the inhibitory effect of compound 9 and compound 13, which is typical of covalent inhibition through Cys binding. Furthermore, the reversibility assay of both compounds with SARS-CoV-2 3CLpro revealed that removing the inhibitors by ultrafiltration could recover enzymatic activity to a certain extent (Fig. 6B), which indicated that they were reversible inhibitors. As exhibited in Fig. 6C and Table S6, the Michaelis-Menten kinetics analysis of compound 9 and compound 13 showed that both Km and Vmax values were altered, indicating that they are mixed-type inhibitors of SARS-CoV-2 3CLpro. Overall, the enzymatic studies demonstrated that both compound 9 and compound 13 were reversible covalent inhibitors of SARS-CoV-2 3CLpro.
Fig. 6.
Investigation of inhibitory properties of compound 9 and compound 13 against SARS-CoV-2 3CLpro. (A) Inhibition test of compound 9 and compound 13 in the presence or absence of DTT. (B) Enzyme activity of SARS-CoV-2 3CLpro co-incubated with compound 9 or compound 13 after different times of ultrafiltration. (C) Michaelis-Menten kinetics study of compound 9 and compound 13.
We have also explored the selectivity of compound 9 and compound 13 by investigating their inhibitory activity against Cathepsin L, a key host cysteine protease utilized by coronaviruses for cell entry [3,61,62], using the reported method (the known Cathepsin L inhibitor FF-FMK was used as the positive compound, Fig. S2) [63]. At the concentration of 50 μM, the inhibition rates of Cathepsin L by compound 9 and compound 13 are 20.3 ± 5.3% and 26.6 ± 5.8%, respectively. The weak inhibitory activity of both compounds against Cathepsin L suggested their high selectivity for coronavirus protease.
To reduce possible off-target covalent binding that may lead to toxicity risks, the intrinsic reactivity of the warheads should be low. The intrinsic reactivity of compound 9 and compound 13 was measured using a GSH assay, which is routinely used to assess the reactivity of cysteine-targeted warheads [64]. The reaction half-lives (t1/2) of compound 9 and compound 13 were 1.4 min and 49.6 min, respectively (Fig. S3). Given that the t1/2 values of clinical covalent kinase inhibitors determined in the GSH assay were within the range of 30–512 min [65], compound 13 demonstrated moderate reactivity, implying its potential for further development of more potent SARS-CoV-2 3CLpro covalent inhibitors.
We further carried out liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis of covalent binding between SARS-CoV-2 3CLpro and compound 13. We found that compound 13 could only partially modify SARS-CoV-2 3CLpro (Fig. S4), perhaps owing to the instability of the reversible covalent binding. The intact protein mass spectrometry study confirmed that a covalent bond is formed between the protease and the α,β-unsaturated ketone unit of compound 13, detected as a peak with an MW shift of +405 Da (C22H15NO7). In addition, a peak with an MW shift of +319 Da (C19H13NO4) was detected, which is equal to the mass of the SARS-CoV-2 3CLpro/arylfurylpropenone complex after the removal of methyl glyoxylate.
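The two reported mass shifts can be checked against the stated molecular formulas with a short calculation. The sketch below uses standard average atomic masses and a deliberately simple formula parser that handles only unbracketed formulas containing C, H, N, and O; the function name is hypothetical.

```python
import re

# Standard average atomic masses (Da)
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def average_mass(formula):
    """Average molecular mass of a simple formula such as 'C22H15NO7'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += AVERAGE_MASS[element] * (int(count) if count else 1)
    return total


adduct_full = average_mass("C22H15NO7")      # about 405.4 Da
adduct_fragment = average_mass("C19H13NO4")  # about 319.3 Da
difference = adduct_full - adduct_fragment   # C3H2O3, about 86.0 Da
```

Both values round to the reported shifts of +405 Da and +319 Da.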
Finally, we performed covalent docking to understand the interactions of compound 13 with SARS-CoV-2 3CLpro. As shown in Fig. 7A, compound 13 mainly occupied the S1′, S1, and S2 pockets. A covalent bond is formed between Cys145 and the α,β-unsaturated ketone unit. In the docking model, two π-π stacking interactions are observed between the furan unit and the benzene ring of compound 13 and His41 of 3CLpro. His41 also interacts with the nitro group of compound 13 by cation-π interaction. In addition, compound 13 forms three hydrogen bonds with Ser144, Cys145, and His163, respectively (Fig. 7B). Apart from the covalent bond formed between the Michael acceptor and Cys145, these non-covalent interactions of compound 13 also contribute to its inhibitory activity and selectivity against SARS-CoV-2 3CLpro. In the docking model, compound 13 occupies a similar position to myricetin, a natural product that was reported to target Cys145 with its pyrogallol warhead [21]. Myricetin bound to 3CLpro distinctively (PDB ID: 7DPP), mainly occupying the S1′ and S2 pockets. The position of the benzene ring of compound 13 bound to 3CLpro is similar to the binding position of the pyrogallol moiety. On the other hand, different from the cyclic moieties adopted in the reported peptidomimetic covalent inhibitors of 3CLpro [6,23,25,66,67], the methyl glyoxylate moiety in compound 13 occupies the S1 pocket (Fig. S8). In addition, the occupied S1′ pocket and the currently unoccupied S4 pocket are relatively novel features among the binding modes of covalent inhibitors of 3CLpro, which would hopefully provide inspiration for developing novel 3CLpro covalent inhibitors.
Fig. 7.
Interaction diagram of compound 13 with SARS-CoV-2 3CLpro. (A) The overall structure of compound 13 binding to the active pocket of SARS-CoV-2 3CLpro. (B) Detailed interactions between compound 13 and SARS-CoV-2 3CLpro. Ligands are shown as green sticks and SARS-CoV-2 3CLpro is displayed as grey cartoon in 3D structure. Amino acid residues interacting with the compounds are shown as yellow sticks.
4. Conclusion
We have developed a generally applicable deep learning screening strategy that uses both covalent and non-covalent classifiers to identify selective covalent enzyme inhibitors. In the present study, we trained the two classifiers on covalent and non-covalent SARS-CoV-2 3CLpro inhibitor datasets, respectively, and successfully applied them to identify 5 novel covalent SARS-CoV-2 3CLpro inhibitors. Such a deep learning approach only requires molecular SMILES as input files, making it convenient to be applied. Compared to experimental and docking-based screening, it is highly efficient. The inhibitors that we found provide new scaffolds for developing the next generation of specific covalent SARS-CoV-2 3CLpro inhibitors. Especially, our framework can be easily extended to other targets or diseases in principle although we only presented the study of SARS-CoV-2 3CLpro here. For instance, papain-like protease (PLpro) of SARS-CoV-2, which is another promising target for the treatment of coronavirus, can also form a covalent bond with ligands through Cys111 in the catalytic site [68,69]. The covalent inhibitors of PLpro can be screened out by our framework as long as researchers replace the original datasets with PLpro-related data. Overall, the deep learning-based framework for rapid identification of covalent inhibitors established in this work is feasible and efficient, providing a significant complementary approach to guide drug development.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study was supported in part by the Chinese Academy of Medical Sciences (2021-I2M-5–014) and the Ministry of Science and Technology of China (2016YFA0502303).
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.ejmech.2022.114803.
Data availability
Data will be made available on request.
References
- 1. Wu F., Zhao S., Yu B., Chen Y.M., Wang W., Song Z.G., Hu Y., Tao Z.W., Tian J.H., Pei Y.Y., Yuan M.L., Zhang Y.L., Dai F.H., Liu Y., Wang Q.M., Zheng J.J., Xu L., Holmes E.C., Zhang Y.Z. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579(7798):265. doi: 10.1038/s41586-020-2008-3.
- 2. Harvey W.T., Carabelli A.M., Jackson B., Gupta R.K., Thomson E.C., Harrison E.M., Ludden C., Reeve R., Rambaut A., Peacock S.J., Robertson D.L., COVID-19 Genomics UK (COG-UK) Consortium. SARS-CoV-2 variants, spike mutations and immune escape. Nat. Rev. Microbiol. 2021;19(7):409–424. doi: 10.1038/s41579-021-00573-0.
- 3. Ma C., Xia Z., Sacco M.D., Hu Y., Townsend J.A., Meng X., Choza J., Tan H., Jang J., Gongora M.V., Zhang X., Zhang F., Xiang Y., Marty M.T., Chen Y., Wang J. Discovery of di- and trihaloacetamides as covalent SARS-CoV-2 main protease inhibitors with high target specificity. J. Am. Chem. Soc. 2021;143(49):20697–20709. doi: 10.1021/jacs.1c08060.
- 4. Beigel J.H., Tomashek K.M., Dodd L.E., Mehta A.K., Zingman B.S., Kalil A.C., Hohmann E., Chu H.Y., Luetkemeyer A., Kline S., de Castilla D.L., Finberg R.W., Dierberg K., Tapson V., Hsieh L., Patterson T.F., Paredes R., Sweeney D.A., Short W.R., Touloumi G., Lye D.C., Ohmagari N., Oh M.D., Ruiz-Palacios G.M., Benfield T., Fatkenheuer G., Kortepeter M.G., Atmar R.L., Creech C.B., Lundgren J., Babiker A.G., Pett S., Neaton J.D., Burgess T.H., Bonnett T., Green M., Makowski M., Osinusi A., Nayak S., Lane H.C., ACTT-1 Study Group. Remdesivir for the treatment of covid-19-final report. N. Engl. J. Med. 2020;383(19):1813–1826. doi: 10.1056/NEJMoa2007764.
- 5. Bernal A.J., da Silva M.M.G., Musungaie D.B., Kovalchuk E., Gonzalez A., Delos Reyes V., Martin-Quiros A., Caraco Y., Williams-Diaz A., Brown M.L., Du J., Pedley A., Assaid C., Strizki J., Grobler J.A., Shamsuddin H.H., Tipping R., Wan H., Paschke A., Butterton J.R., Johnson M.G., De Anda C., MOVe-OUT Study Group. Molnupiravir for oral treatment of covid-19 in nonhospitalized patients. N. Engl. J. Med. 2021. doi: 10.1056/NEJMoa2116044.
- 6. Owen D.R., Allerton C.M.N., Anderson A.S., Aschenbrenner L., Avery M., Berritt S., Boras B., Cardin R.D., Carlo A., Coffman K.J., Dantonio A., Di L., Eng H., Ferre R., Gajiwala K.S., Gibson S.A., Greasley S.E., Hurst B.L., Kadar E.P., Kalgutkar A.S., Lee J.C., Lee J., Liu W., Mason S.W., Noell S., Novak J.J., Obach R.S., Ogilvie K., Patel N.C., Pettersson M., Rai D.K., Reese M.R., Sammons M.F., Sathish J.G., Singh R.S.P., Steppan C.M., Stewart A.E., Tuttle J.B., Updyke L., Verhoest P.R., Wei L., Yang Q., Zhu Y. An oral SARS-CoV-2 M(pro) inhibitor clinical candidate for the treatment of COVID-19. Science. 2021;374(6575):1586–1593. doi: 10.1126/science.abl4784.
- 7. Brian D.A., Baric R.S. Coronavirus genome structure and replication. Curr. Top. Microbiol. Immunol. 2005;287:1–30. doi: 10.1007/3-540-26765-4_1.
- 8. Li C., Qi Y., Teng X., Yang Z., Wei P., Zhang C., Tan L., Zhou L., Liu Y., Lai L. Maturation mechanism of severe acute respiratory syndrome (SARS) coronavirus 3C-like proteinase. J. Biol. Chem. 2010;285(36):28134–28140. doi: 10.1074/jbc.M109.095851.
- 9. Fan K., Ma L., Han X., Liang H., Wei P., Liu Y., Lai L. The substrate specificity of SARS coronavirus 3C-like proteinase. Biochem. Biophys. Res. Commun. 2005;329(3):934–940. doi: 10.1016/j.bbrc.2005.02.061.
- 10. Jin Z., Du X., Xu Y., Deng Y., Liu M., Zhao Y., Zhang B., Li X., Zhang L., Peng C., Duan Y., Yu J., Wang L., Yang K., Liu F., Jiang R., Yang X., You T., Liu X., Yang X., Bai F., Liu H., Liu X., Guddat L.W., Xu W., Xiao G., Qin C., Shi Z., Jiang H., Rao Z., Yang H. Structure of M(pro) from SARS-CoV-2 and discovery of its inhibitors. Nature. 2020;582(7811):289–293. doi: 10.1038/s41586-020-2223-y.
- 11. Akaji K., Konno H., Mitsui H., Teruya K., Shimamoto Y., Hattori Y., Ozaki T., Kusunoki M., Sanjoh A. Structure-based design, synthesis, and evaluation of peptide-mimetic SARS 3CL protease inhibitors. J. Med. Chem. 2011;54(23):7962–7973. doi: 10.1021/jm200870n.
- 12. Wu C., Liu Y., Yang Y., Zhang P., Zhong W., Wang Y., Wang Q., Xu Y., Li M., Li X., Zheng M., Chen L., Li H. Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm. Sin. B. 2020;10(5):766–788. doi: 10.1016/j.apsb.2020.02.008.
- 13. Hilgenfeld R. From SARS to MERS: crystallographic studies on coronaviral proteases enable antiviral drug design. FEBS J. 2014;281(18):4085–4096. doi: 10.1111/febs.12936.
- 14. Kumar V., Shin J.S., Shie J.J., Ku K.B., Kim C., Go Y.Y., Huang K.F., Kim M., Liang P.H. Identification and evaluation of potent Middle East respiratory syndrome coronavirus (MERS-CoV) 3CL(Pro) inhibitors. Antivir. Res. 2017;141:101–106. doi: 10.1016/j.antiviral.2017.02.007.
- 15. Kuo C.J., Liu H.G., Lo Y.K., Seong C.M., Lee K.I., Jung Y.S., Liang P.H. Individual and common inhibitors of coronavirus and picornavirus main proteases. FEBS Lett. 2009;583(3):549–555. doi: 10.1016/j.febslet.2008.12.059.
- 16. Lu Z., Feng J.I.N., Ying L.I.U., Er-Chang S., Ping W.E.I., Chun-Mei L.I., Lu-Hua L.A.I. Isatin dual functional inhibitors: modulating the aggregation state and enzyme activity of SARS-3CL proteinase. Acta Phys. Chim. Sin. 2012;28(10):2418–2422. doi: 10.3866/pku.Whxb201209143.
- 17. Park J.Y., Ko J.A., Kim D.W., Kim Y.M., Kwon H.J., Jeong H.J., Kim C.Y., Park K.H., Lee W.S., Ryu Y.B. Chalcones isolated from Angelica keiskei inhibit cysteine proteases of SARS-CoV. J. Enzym. Inhib. Med. Chem. 2016;31(1):23–30. doi: 10.3109/14756366.2014.1003215.
- 18. Wu C.Y., King K.Y., Kuo C.J., Fang J.M., Wu Y.T., Ho M.Y., Liao C.L., Shie J.J., Liang P.H., Wong C.H. Stable benzotriazole esters as mechanism-based inactivators of the severe acute respiratory syndrome 3CL protease. Chem Biol. 2006;13(3):261–268. doi: 10.1016/j.chembiol.2005.12.008.
- 19. Zhang J., Huitema C., Niu C., Yin J., James M.N., Eltis L.D., Vederas J.C. Aryl methylene ketones and fluorinated methylene ketones as reversible inhibitors for severe acute respiratory syndrome (SARS) 3C-like proteinase. Bioorg. Chem. 2008;36(5):229–240. doi: 10.1016/j.bioorg.2008.01.001.
- 20. De Cesco S., Kurian J., Dufresne C., Mittermaier A.K., Moitessier N. Covalent inhibitors design and discovery. Eur. J. Med. Chem. 2017;138:96–114. doi: 10.1016/j.ejmech.2017.06.019.
- 21. Su H., Yao S., Zhao W., Zhang Y., Liu J., Shao Q., Wang Q., Li M., Xie H., Shang W., Ke C., Feng L., Jiang X., Shen J., Xiao G., Jiang H., Zhang L., Ye Y., Xu Y. Identification of pyrogallol as a warhead in design of covalent inhibitors for the SARS-CoV-2 3CL protease. Nat. Commun. 2021;12(1):3623. doi: 10.1038/s41467-021-23751-3.
- 22. Hoffman R.L., Kania R.S., Brothers M.A., Davies J.F., Ferre R.A., Gajiwala K.S., He M., Hogan R.J., Kozminski K., Li L.Y., Lockner J.W., Lou J., Marra M.T., Mitchell L.J., Jr., Murray B.W., Nieman J.A., Noell S., Planken S.P., Rowe T., Ryan K., Smith G.J., 3rd, Solowiej J.E., Steppan C.M., Taggart B. Discovery of ketone-based covalent inhibitors of coronavirus 3CL proteases for the potential therapeutic treatment of COVID-19. J. Med. Chem. 2020;63(21):12725–12747. doi: 10.1021/acs.jmedchem.0c01063.
- 23. Vuong W., Khan M.B., Fischer C., Arutyunova E., Lamer T., Shields J., Saffran H.A., McKay R.T., van Belkum M.J., Joyce M.A., Young H.S., Tyrrell D.L., Vederas J.C., Lemieux M.J. Feline coronavirus drug inhibits the main protease of SARS-CoV-2 and blocks virus replication. Nat. Commun. 2020;11(1):4282. doi: 10.1038/s41467-020-18096-2.
- 24. Qiao J., Li Y.S., Zeng R., Liu F.L., Luo R.H., Huang C., Wang Y.F., Zhang J., Quan B., Shen C., Mao X., Liu X., Sun W., Yang W., Ni X., Wang K., Xu L., Duan Z.L., Zou Q.C., Zhang H.L., Qu W., Long Y.H., Li M.H., Yang R.C., Liu X., You J., Zhou Y., Yao R., Li W.P., Liu J.M., Chen P., Liu Y., Lin G.F., Yang X., Zou J., Li L., Hu Y., Lu G.W., Li W.M., Wei Y.Q., Zheng Y.T., Lei J., Yang S. SARS-CoV-2 M(pro) inhibitors with antiviral activity in a transgenic mouse model. Science. 2021;371(6536):1374–1378. doi: 10.1126/science.abf1611.
- 25. Dai W., Zhang B., Jiang X.M., Su H., Li J., Zhao Y., Xie X., Jin Z., Peng J., Liu F., Li C., Li Y., Bai F., Wang H., Cheng X., Cen X., Hu S., Yang X., Wang J., Liu X., Xiao G., Jiang H., Rao Z., Zhang L.K., Xu Y., Yang H., Liu H. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science. 2020;368(6497):1331–1335. doi: 10.1126/science.abb4489.
- 26. Ma C., Sacco M.D., Hurst B., Townsend J.A., Hu Y., Szeto T., Zhang X., Tarbet B., Marty M.T., Chen Y., Wang J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020;30(8):678–692. doi: 10.1038/s41422-020-0356-z.
- 27. Unoh Y., Uehara S., Nakahara K., Nobori H., Yamatsu Y., Yamamoto S., Maruyama Y., Taoda Y., Kasamatsu K., Suto T., Kouki K., Nakahashi A., Kawashima S., Sanaki T., Toba S., Uemura K., Mizutare T., Ando S., Sasaki M., Orba Y., Sawa H., Sato A., Sato T., Kato T., Tachibana Y. Discovery of S-217622, a noncovalent oral SARS-CoV-2 3CL protease inhibitor clinical candidate for treating COVID-19. J. Med. Chem. 2022. doi: 10.1021/acs.jmedchem.2c00117.
- 28. Su H.X., Yao S., Zhao W.F., Li M.J., Liu J., Shang W.J., Xie H., Ke C.Q., Hu H.C., Gao M.N., Yu K.Q., Liu H., Shen J.S., Tang W., Zhang L.K., Xiao G.F., Ni L., Wang D.W., Zuo J.P., Jiang H.L., Bai F., Wu Y., Ye Y., Xu Y.C. Anti-SARS-CoV-2 activities in vitro of Shuanghuanglian preparations and bioactive ingredients. Acta Pharmacol. Sin. 2020;41(9):1167–1177. doi: 10.1038/s41401-020-0483-6.
- 29. Drayman N., DeMarco J.K., Jones K.A., Azizi S.A., Froggatt H.M., Tan K., Maltseva N.I., Chen S., Nicolaescu V., Dvorkin S., Furlong K., Kathayat R.S., Firpo M.R., Mastrodomenico V., Bruce E.A., Schmidt M.M., Jedrzejczak R., Munoz-Alia M.A., Schuster B., Nair V., Han K.Y., O'Brien A., Tomatsidou A., Meyer B., Vignuzzi M., Missiakas D., Botten J.W., Brooke C.B., Lee H., Baker S.C., Mounce B.C., Heaton N.S., Severson W.E., Palmer K.E., Dickinson B.C., Joachimiak A., Randall G., Tay S. Masitinib is a broad coronavirus 3CL inhibitor that blocks replication of SARS-CoV-2. Science. 2021;373(6557):931–936. doi: 10.1126/science.abg5827.
- 30. Chaikuad A., Koch P., Laufer S.A., Knapp S. The cysteinome of protein kinases as a target in drug development. Angew Chem. Int. Ed. Engl. 2018;57(16):4372–4385. doi: 10.1002/anie.201707875.
- 31. Abranyi-Balogh P., Petri L., Imre T., Szijj P., Scarpino A., Hrast M., Mitrovic A., Fonovic U.P., Nemeth K., Barreteau H., Roper D.I., Horvati K., Ferenczy G.G., Kos J., Ilas J., Gobec S., Keseru G.M. A road map for prioritizing warheads for cysteine targeting covalent inhibitors. Eur. J. Med. Chem. 2018;160:94–107. doi: 10.1016/j.ejmech.2018.10.010.
- 32. Chen J., Ali F., Khan I., Zhu Y.Z. Recent progress in the development of potential drugs against SARS-CoV-2. Curr Res Pharmacol Drug Discov. 2021;2. doi: 10.1016/j.crphar.2021.100057.
- 33. Macip G., Garcia-Segura P., Mestres-Truyol J., Saldivar-Espinoza B., Ojeda-Montes M.J., Gimeno A., Cereto-Massague A., Garcia-Vallve S., Pujadas G. Haste makes waste: a critical review of docking-based virtual screening in drug repurposing for SARS-CoV-2 main protease (M-pro) inhibition. Med. Res. Rev. 2022;42(2):744–769. doi: 10.1002/med.21862.
- 34. Vittoria B.L., Imbesi C., Irene G., Cali G., Bitto A. New approaches and repurposed antiviral drugs for the treatment of the SARS-CoV-2 infection. Pharmaceuticals. 2021;14(6). doi: 10.3390/ph14060503.
- 35. Wang S., Sun Q., Xu Y., Pei J., Lai L. A transferable deep learning approach to fast screen potential antiviral drugs against SARS-CoV-2. Briefings Bioinf. 2021;22(6). doi: 10.1093/bib/bbab211.
- 36. Nguyen D.D., Gao K., Chen J., Wang R., Wei G.W. Unveiling the molecular mechanism of SARS-CoV-2 main protease inhibition from 137 crystal structures using algebraic topology and deep learning. Chem. Sci. 2020;11(44):12036–12046. doi: 10.1039/d0sc04641h.
- 37. Hu F., Wang L., Hu Y., Wang D., Wang W., Jiang J., Li N., Yin P. A novel framework integrating AI model and enzymological experiments promotes identification of SARS-CoV-2 3CL protease inhibitors and activity-based probe. Briefings Bioinf. 2021;22(6). doi: 10.1093/bib/bbab301.
- 38. Ullrich S., Nitsche C. The SARS-CoV-2 main protease as drug target. Bioorg. Med. Chem. Lett. 2020;30(17). doi: 10.1016/j.bmcl.2020.127377.
- 39. Liu Y., Liang C., Xin L., Ren X., Tian L., Ju X., Li H., Wang Y., Zhao Q., Liu H., Cao W., Xie X., Zhang D., Wang Y., Jian Y. The development of Coronavirus 3C-Like protease (3CL(pro)) inhibitors from 2010 to 2020. Eur. J. Med. Chem. 2020;206. doi: 10.1016/j.ejmech.2020.112711.
- 40. Sun Q., Ye F., Liang H., Liu H., Li C., Lu R., Huang B., Zhao L., Tan W., Lai L. Bardoxolone and bardoxolone methyl, two Nrf2 activators in clinical trials, inhibit SARS-CoV-2 replication and its 3C-like protease. Signal Transduct. Targeted Ther. 2021;6(1):212. doi: 10.1038/s41392-021-00628-x.
- 41. Brown A.S., Ackerley D.F., Calcott M.J. High-throughput screening for inhibitors of the SARS-CoV-2 protease using a FRET-biosensor. Molecules. 2020;25(20). doi: 10.3390/molecules25204666.
- 42. PubChem. SARS-CoV-2 3CL-Pro protease inhibition IC50 determined by FRET kind of response from peptide substrate. https://pubchem.ncbi.nlm.nih.gov/bioassay/1640021
- 43. Gilmer J., Schoenholz S.S., Riley P.F., Vinyals O., Dahl G.E. Neural message passing for quantum chemistry. In: Precup D., Teh Y.W., editors. Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research (PMLR), vol. 70. 2017. pp. 1263–1272.
- 44. RDKit: open-source cheminformatics. https://www.rdkit.org
- 45. Svozil D., Kvasnicka V., Pospíchal J. Introduction to multi-layer feed-forward neural networks. Chemometr. Intell. Lab. Syst. 1997;39(1):43–62. doi: 10.1016/s0169-7439(97)00061-0.
- 46. Liu H., Ye F., Sun Q., Liang H., Li C., Li S., Lu R., Huang B., Tan W., Lai L. Scutellaria baicalensis extract and baicalein inhibit replication of SARS-CoV-2 and its 3C-like protease in vitro. J. Enzym. Inhib. Med. Chem. 2021;36(1):497–503. doi: 10.1080/14756366.2021.1873977.
- 47. Baek K.H., Karki R., Lee E.S., Na Y., Kwon Y. Synthesis and investigation of dihydroxychalcones as calpain and cathepsin inhibitors. Bioorg. Chem. 2013;51:24–30. doi: 10.1016/j.bioorg.2013.09.002.
- 48. Iketani S., Forouhar F., Liu H., Hong S.J., Lin F.Y., Nair M.S., Zask A., Huang Y., Xing L., Stockwell B.R., Chavez A., Ho D.D. Lead compounds for the development of SARS-CoV-2 3CL protease inhibitors. Nat. Commun. 2021;12(1):2016. doi: 10.1038/s41467-021-22362-2.
- 49. Madhavi Sastry G., Adzhigirey M., Day T., Annabhimoju R., Sherman W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 2013;27(3):221–234. doi: 10.1007/s10822-013-9644-8.
- 50. Chen I.J., Foloppe N. Drug-like bioactive structures and conformational coverage with the LigPrep/ConfGen suite: comparison to programs MOE and catalyst. J. Chem. Inf. Model. 2010;50(5):822–839. doi: 10.1021/ci100026x.
- 51. Zhu K., Borrelli K.W., Greenwood J.R., Day T., Abel R., Farid R.S., Harder E. Docking covalent inhibitors: a parameter free approach to pose prediction and scoring. J. Chem. Inf. Model. 2014;54(7):1932–1940. doi: 10.1021/ci500118s.
- 52. Yang K., Swanson K., Jin W., Coley C., Eiden P., Gao H., Guzman-Perez A., Hopper T., Kelley B., Mathea M., Palmer A., Settels V., Jaakkola T., Jensen K., Barzilay R. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 2019;59(8):3370–3388. doi: 10.1021/acs.jcim.9b00237.
- 53. Breiman L. Random forests. Mach. Learn. 2001;45(1):5–32. doi: 10.1023/a:1010933404324.
- 54. Cortes C., Vapnik V. Support-vector networks. Mach. Learn. 1995;20(3):273–297. doi: 10.1023/a:1022627411411.
- 55. Friedman J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002;38(4):367–378. doi: 10.1016/s0167-9473(01)00065-2.
- 56. Wang S.-C. Artificial neural network. In: Wang S.-C., editor. Interdisciplinary Computing in Java Programming. Springer US; Boston, MA: 2003. pp. 81–100.
- 57. Dietterich T.G. Ensemble methods in machine learning. Springer Berlin Heidelberg; Berlin, Heidelberg: 2000. pp. 1–15.
- 58. Butina D. Unsupervised data base clustering based on Daylight's fingerprint and Tanimoto similarity: a fast and automated way to cluster small and large data sets. J. Chem. Inf. Comput. Sci. 1999;39(4):747–750. doi: 10.1021/ci9803381.
- 59. Young R.C., Mitchell R.C., Brown T.H., Ganellin C.R., Griffiths R., Jones M., Rana K.K., Saunders D., Smith I.R., Sore N.E., et al. Development of a new physicochemical model for brain penetration and its application to the design of centrally acting H2 receptor histamine antagonists. J. Med. Chem. 1988;31(3):656–671. doi: 10.1021/jm00398a028.
- 60. Yang Y., Zhu Z., Wang X., Zhang X., Mu K., Shi Y., Peng C., Xu Z., Zhu W. Ligand-based approach for predicting drug targets and for virtual screening against COVID-19. Briefings Bioinf. 2021;22(2):1053–1064. doi: 10.1093/bib/bbaa422.
- 61. Sacco M.D., Ma C., Lagarias P., Gao A., Townsend J.A., Meng X., Dube P., Zhang X., Hu Y., Kitamura N., Hurst B., Tarbet B., Marty M.T., Kolocouris A., Xiang Y., Chen Y., Wang J. Structure and inhibition of the SARS-CoV-2 main protease reveal strategy for developing dual inhibitors against M(pro) and cathepsin L. Sci. Adv. 2020;6(50). doi: 10.1126/sciadv.abe0751.
- 62. Ashhurst A.S., Tang A.H., Fajtová P., Yoon M.C., Aggarwal A., Bedding M.J., Stoye A., Beretta L., Pwee D., Drelich A., Skinner D., Li L., Meek T.D., McKerrow J.H., Hook V., Tseng C.T., Larance M., Turville S., Gerwick W.H., O'Donoghue A.J., Payne R.J. Potent anti-SARS-CoV-2 activity by the natural product gallinamide A and analogues via inhibition of cathepsin L. J. Med. Chem. 2022;65(4):2956–2970. doi: 10.1021/acs.jmedchem.1c01494.
- 63. Vicik R., Busemann M., Gelhaus C., Stiefl N., Scheiber J., Schmitz W., Schulz F., Mladenovic M., Engels B., Leippe M., Baumann K., Schirmeister T. Aziridide-based inhibitors of cathepsin L: synthesis, inhibition activity, and docking studies. ChemMedChem. 2006;1(10):1126–1141. doi: 10.1002/cmdc.200600106.
- 64. Xiong M., Nie T., Shao Q., Li M., Su H., Xu Y. In silico screening-based discovery of novel covalent inhibitors of the SARS-CoV-2 3CL protease. Eur. J. Med. Chem. 2022;231. doi: 10.1016/j.ejmech.2022.114130.
- 65. Shin Y., Jeong J.W., Wurz R.P., Achanta P., Arvedson T., Bartberger M.D., Campuzano I.D.G., Fucini R., Hansen S.K., Ingersoll J., Iwig J.S., Lipford J.R., Ma V., Kopecky D.J., McCarter J., San Miguel T., Mohr C., Sabet S., Saiki A.Y., Sawayama A., Sethofer S., Tegley C.M., Volak L.P., Yang K., Lanman B.A., Erlanson D.A., Cee V.J. Discovery of N-(1-Acryloylazetidin-3-yl)-2-(1H-indol-1-yl)acetamides as covalent inhibitors of KRAS(G12C). ACS Med. Chem. Lett. 2019;10(9):1302–1308. doi: 10.1021/acsmedchemlett.9b00258.
- 66. Konno S., Kobayashi K., Senda M., Funai Y., Seki Y., Tamai I., Schäkel L., Sakata K., Pillaiyar T., Taguchi A., Taniguchi A., Gütschow M., Müller C.E., Takeuchi K., Hirohama M., Kawaguchi A., Kojima M., Senda T., Shirasaka Y., Kamitani W., Hayashi Y. 3CL protease inhibitors with an electrophilic arylketone moiety as anti-SARS-CoV-2 agents. J. Med. Chem. 2022;65(4):2926–2939. doi: 10.1021/acs.jmedchem.1c00665.
- 67. Quan B.X., Shuai H., Xia A.J., Hou Y., Zeng R., Liu X.L., Lin G.F., Qiao J.X., Li W.P., Wang F.L., Wang K., Zhou R.J., Yuen T.T., Chen M.X., Yoon C., Wu M., Zhang S.Y., Huang C., Wang Y.F., Yang W., Tian C., Li W.M., Wei Y.Q., Yuen K.Y., Chan J.F., Lei J., Chu H., Yang S. An orally available M(pro) inhibitor is effective against wild-type SARS-CoV-2 and variants including Omicron. Nat Microbiol. 2022;7(5):716–725. doi: 10.1038/s41564-022-01119-7.
- 68. Rut W., Lv Z., Zmudzinski M., Patchett S., Nayak D., Snipas S.J., El Oualid F., Huang T.T., Bekes M., Drag M., Olsen S.K. Activity profiling and crystal structures of inhibitor-bound SARS-CoV-2 papain-like protease: a framework for anti-COVID-19 drug design. Sci. Adv. 2020;6(42). doi: 10.1126/sciadv.abd4596.
- 69. Hognon C., Marazzi M., Garcia-Iriepa C. Atomistic-level description of the covalent inhibition of SARS-CoV-2 papain-like protease. Int. J. Mol. Sci. 2022;23(10). doi: 10.3390/ijms23105855.