Abstract
We report the identification of three structurally diverse compounds (compound 4, GC376, and MAC-5576) as inhibitors of the SARS-CoV-2 3CL protease. Structures of each of these compounds in complex with the protease revealed strategies for further development, as well as general principles for designing SARS-CoV-2 3CL protease inhibitors. These compounds may therefore serve as leads for building effective SARS-CoV-2 3CL protease inhibitors.
As the etiologic agent of COVID-19, SARS-CoV-2 has resulted in hundreds of thousands of deaths and caused rampant economic damage worldwide1,2. While some treatments have been identified, their clinical efficacy is low, making continued research essential3,4. Similar to other coronaviruses, SARS-CoV-2 encodes an essential 3CL protease (Mpro) that processes its polyproteins, which has garnered interest as a target for potential viral inhibitors5,6. Here, we describe a series of compounds with inhibitory activity against the SARS-CoV-2 3CL and determine their structures in complex with the protease. These data provide general insights into the design of 3CL protease inhibitors, along with potential avenues by which these classes of compounds can be further developed.
We hypothesized that previously identified SARS-CoV-1 3CL protease inhibitors may also be effective against the SARS-CoV-2 3CL, given the conservation between the two proteases (96% amino acid identity)1,2. Using a biochemical assay to report SARS-CoV-2 3CL protease activity (Extended Data Fig. 1a, b), we identified three diverse compounds of interest: compound 4 (ref. 7), GC376 (ref. 8), and MAC-5576 (ref. 9), which had IC50 values (mean ± s.d.) of 0.149 ± 0.002 μM, 0.139 ± 0.016 μM, and 0.088 ± 0.003 μM, respectively (Fig. 1a, b). Encouraged by these results, we then tested these compounds for inhibition of SARS-CoV-2 viral replication. We found that compound 4 and GC376 could block viral infection (EC50 values (mean ± s.d.): 3.023 ± 0.923 μM and 4.481 ± 0.529 μM, respectively), whereas MAC-5576 did not (Fig. 1c). Finally, we confirmed that these compounds were not cytotoxic to the cells at the tested concentrations (Extended Data Fig. 2).
As the three compounds exhibited inhibitory activity against the SARS-CoV-2 3CL, we proceeded to solve the crystal structure of the apo 3CL protease alone and of each of these compounds in complex with the protease to understand their mechanism of binding as well as to guide future structure-based optimization efforts. We note that while MAC-5576 did not exhibit activity in the cellular assay, its low molecular weight and reasonable biochemical activity prompted us to pursue its crystallization as well, as our goal was to broadly investigate inhibitory scaffolds for the SARS-CoV-2 3CL protease. Crystals were obtained (see Methods for detailed information) and structures at 1.85 Å, 1.80 Å, 1.83 Å, and 1.73 Å resolution limits for apo 3CL and 3CL bound to compound 4, GC376, and MAC-5576, respectively, were solved (Fig. 2a, b, c, Extended Data Fig. 3, and Extended Data Fig. 4; see Supplementary Table 1 for statistics).
The X-ray crystal structures revealed that all three of the compounds bind covalently to the catalytically active Cys145 residue within the substrate-binding pocket of the protease. We observed distinct mechanisms by which these compounds acted on this residue. Compound 4 adopted a binding mode similar to that of other reported compounds, covalently modifying Cys145 through Michael addition (Fig. 2a)5. For GC376, the bisulfite adduct was converted to an aldehyde as previously reported, allowing it to then react with Cys145 through nucleophilic addition and hemithioacetal formation (Fig. 2b)8. MAC-5576 also covalently modified Cys145 by nucleophilic linkage, as expected (Fig. 2c).
As we solved the structures for multiple compounds, we hypothesized that general principles for the design of SARS-CoV-2 3CL protease inhibitors could be identified. We first overlaid all four crystal structures of the 3CL with or without inhibitors (Extended Data Fig. 5). We observed local conformational changes, with Thr45 to Pro52 distinct from the apo 3CL in all three inhibitor-bound structures, whereas Arg188 to Gln192 differed only in the compound 4- and GC376-bound structures, but not in the MAC-5576-bound structure. We then overlaid each of the inhibitors in the substrate binding pocket of the 3CL protease to find commonalities in their interactions (Fig. 2d). Most notably, we found that all of these compounds occupied the S2 site, with compound 4 and GC376 further anchored in the S1 and S3 sites. The backbone NH of Gly143 points toward the ligand binding pocket, forming hydrogen bonds with the carbonyl oxygen of the ethyl ester of compound 4 and with the hemithioacetal of GC376 formed after the Cys145 addition to the original aldehyde, although the former hydrogen bond is stronger than the latter. In both structures, the γ-lactam groups occupy the S1 site, and are strongly anchored by two hydrogen bonds with the side chains of His163 and Glu166. The isobutyl groups are favorably embedded in the hydrophobic S2 site, surrounded by the alkyl portion of the side chains of His41, Met49, His164, Met165, Asp187, and Gln189. Extending into the S3 pocket, the amide bonds of compound 4 and GC376 are stabilized by hydrogen bond interactions with the side chain of Gln189. Similar interactions are also observed in reports of related compounds, suggesting that overall, the binding modes of this class of substrate mimetic inhibitors share remarkable similarities5,6,10. Specifically, they all have a γ-lactam occupying the S1 pocket, preserving the dual hydrogen bonds with His163 and Glu166. Furthermore, they commonly contain a hydrophobic moiety occupying the S2 site.
As shown in a structural overlay of compound 4 and GC376 with these related compounds (Extended Data Fig. 6), the segments of the inhibitors from S1 to S2 align closely on top of each other. Variations in binding start to emerge in the S3 and S4 regions, which exhibit a high degree of freedom in terms of structural diversity as well as conformational flexibility. In our experiments, the S3 and S4 sites displayed weaker electron density, indicating flexibility in the inhibitor and/or the protease in these regions (Extended Data Fig. 4). These observations suggest that development of 3CL protease inhibitors may benefit from first establishing robust interactions within the S1, S2, and/or S1’ sites, before extending into the S3 and S4 sites. Possibly, compounds such as compound 4 and GC376 are not optimized for binding into the S3 and S4 sites, and there are ample opportunities to improve the inhibitory potencies against the 3CL by designing compounds that exploit the accessible contact points to strengthen the ligand-protein interactions.
On the other hand, the binding of MAC-5576, as a non-peptidic small molecule, displays unique features that differ from those of compound 4 or GC376. We observed that the thiophene group forms π-π stacking with the His41 side chain imidazole, which undergoes a conformational rotation around its beta-carbon to align parallel to the thiophene, compared with the peptide-bound structures. Additionally, the side chain of Gln189 also shows notable conformational variation compared to those in the compound 4 and GC376 crystal structures, possibly in response to the specific hydrogen bond interactions induced by the respective ligands. Notably, the rotation of His41 has been reported previously in the crystal structure of a benzotriazole ester inhibitor (XP-59) in complex with the SARS-CoV-1 3CL protease (PDB:2V6N)11. An overlaid model of the crystal structures of MAC-5576 bound to SARS-CoV-2 3CL and XP-59 bound to SARS-CoV-1 3CL shows that both compounds have similar binding modes when covalently bound to Cys145, in which the thiophene of MAC-5576 and the phenyl ring of XP-59 almost overlap with each other, both engaging the His41 side chain via π-π stacking interactions (Extended Data Fig. 7).
In summary, we have identified compound 4, GC376, and MAC-5576 as inhibitors of the SARS-CoV-2 3CL protease. Crystal structures of the compounds complexed to the protease suggested their mechanisms of action and pointed to guidelines for inhibitor design, which may aid the future development of novel inhibitors to combat this virus.
Methods
Compounds
Compound 4 was synthesized using the synthesis route previously described, with the exception of using a sodium borohydride-cobaltous chloride reduction of the nitrile in the construction of the lactam, thus avoiding the high pressure hydrogenation in the original route7, 12. GC376 was purchased from Aobious and MAC-5576 was purchased from Maybridge.
Expression and purification of SARS-CoV-2 3CL protease.
The SARS-CoV-2 3CL protease gene was codon optimized for bacterial expression and synthesized (Twist Bioscience), then cloned into a bacterial expression vector (pGEX-5X-3, GE) that expresses the protease as a fusion construct with an N-terminal GST tag, followed by a Factor Xa cleavage site. After confirmation by Sanger sequencing, the construct was transformed into BL21 (DE3) cells. These E. coli were inoculated and grown overnight as starter cultures, then used to inoculate larger cultures at a 1:100 dilution, which were then grown at 37 °C, 220 RPM until the OD reached 0.6–0.7. Expression of the protease was induced with the addition of 0.5 mM IPTG, and then the cultures were incubated at 16 °C, 180 RPM for 10 h. Cells were pelleted at 4500 RPM for 15 min at 4 °C, resuspended in lysis buffer (20 mM Tris-HCl, pH 8.0, 300 mM NaCl), homogenized by sonication, then clarified by centrifuging at 25000 × g for 1 h at 4 °C. The supernatant was mixed with Glutathione Sepharose resin (Sigma) and placed on a rotator for 2 h at 4 °C. The resin was then repeatedly washed by centrifuging at 3500 RPM for 15 min at 4 °C, discarding the supernatant, and resuspending the resin in fresh lysis buffer. After ten washes, the resin was resuspended in lysis buffer, and Factor Xa was added and incubated for 18 h at 4 °C on a rotator. The resin was centrifuged at 3500 RPM for 15 min at 4 °C, and then the supernatant was collected and concentrated using a 10 kDa concentrator (Amicon) before being loaded onto a Superdex 10/300 GL column for further purification by size exclusion chromatography. The appropriate fractions were collected, pooled, and concentrated with a 10 kDa concentrator, and then the final product was assessed for quality by SDS-PAGE and measurement of biochemical activity.
Measurement of SARS-CoV-2 3CL protease biochemical activity.
The in vitro biochemical activity of the SARS-CoV-2 3CL protease was measured as previously described5. The fluorogenic peptide MCA-AVLQSGFR-Lys(DNP)-Lys-NH2, corresponding to the nsp4/nsp5 cleavage site in the virus, was synthesized (GL Biochem), then resuspended in DMSO to use as the substrate. Different concentrations of this substrate, ranging from 5 μM to 100 μM, were prepared in the assay buffer (50 mM Tris-HCl, pH 7.5, 1 mM EDTA) in a 96-well plate. The protease was then added to each well at a concentration of 0.2 μM, and fluorescence was continuously measured on a plate reader for 3 min. The catalytic efficiency of the protease was then calculated by generating a double-reciprocal plot.
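The double-reciprocal (Lineweaver-Burk) analysis mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used in this work: the initial rates are synthetic, noise-free values generated from assumed kinetic constants, and the function name `lineweaver_burk` is hypothetical. The sketch fits 1/v against 1/[S] by ordinary least squares, takes Vmax from the intercept and Km from the slope, and derives kcat/Km using the 0.2 μM enzyme concentration of the assay.

```python
def lineweaver_burk(substrate_um, rates, enzyme_um):
    """Estimate Km, Vmax, kcat, and kcat/Km from a double-reciprocal fit.

    substrate_um: substrate concentrations in uM
    rates: initial rates in uM/s, in the same order as substrate_um
    enzyme_um: total enzyme concentration in uM
    """
    x = [1.0 / s for s in substrate_um]   # 1/[S]
    y = [1.0 / v for v in rates]          # 1/v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # equals Km / Vmax
    intercept = mean_y - slope * mean_x   # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    kcat = vmax / enzyme_um               # turnover number in 1/s
    return {
        "km_um": km,
        "vmax_um_per_s": vmax,
        "kcat_per_s": kcat,
        "kcat_over_km_per_um_per_s": kcat / km,
    }


# Synthetic rates from assumed constants (illustration only, not measured data).
ASSUMED_KM_UM = 30.0
ASSUMED_VMAX_UM_PER_S = 0.05
substrate = [5.0, 10.0, 20.0, 40.0, 60.0, 80.0, 100.0]  # within the 5-100 uM range
rates = [ASSUMED_VMAX_UM_PER_S * s / (ASSUMED_KM_UM + s) for s in substrate]

fit = lineweaver_burk(substrate, rates, enzyme_um=0.2)
print(fit)
```

In practice the fluorescence slope would first have to be converted into a product concentration per unit time with a calibration curve, which is not shown here. Because the reciprocal transform gives the lowest substrate concentrations the largest influence on the fit, a direct nonlinear fit of the Michaelis-Menten equation is a common cross-check.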
Measurement of SARS-CoV-2 3CL protease inhibition.
Inhibition of the biochemical activity of the SARS-CoV-2 3CL protease was quantified as previously described with modifications5. Serial dilutions of the test compound were prepared in the assay buffer, and then incubated with 0.2 μM of the protease for 10 min at 37 °C. The substrate was then added at 20 μM per well, and then fluorescence was continuously measured on a plate reader for 3 min. Inhibition was then calculated by comparison to control wells with no inhibitor added. IC50 values were determined by fitting an asymmetric sigmoidal curve to the data (GraphPad Prism).
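The normalization to no-inhibitor control wells described above can be illustrated with a short sketch. It is only an approximation of the workflow: the IC50 values in this work were obtained by fitting an asymmetric sigmoidal curve in GraphPad Prism, whereas the sketch below substitutes a simpler log-linear interpolation between the two concentrations that bracket 50% inhibition. The dilution series, the rates, and the function names are hypothetical.

```python
import math


def percent_inhibition(rate, control_rate):
    """Inhibition relative to the no-inhibitor control, in percent."""
    return 100.0 * (1.0 - rate / control_rate)


def interpolate_ic50(concs_um, inhibitions):
    """Estimate the concentration that gives 50% inhibition.

    Interpolates linearly in log10(concentration) between the two adjacent
    points that bracket 50%. Returns None if no adjacent pair brackets it.
    """
    pairs = sorted(zip(concs_um, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pairs, pairs[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10.0 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Synthetic example: rates simulated from a simple one-site model with an
# assumed IC50 of 0.1 uM (not data from this work).
ASSUMED_IC50_UM = 0.1
control_rate = 1.0  # arbitrary fluorescence units per second
concs = [0.01 * 2 ** k for k in range(8)]  # hypothetical two-fold dilution series
rates = [control_rate / (1.0 + c / ASSUMED_IC50_UM) for c in concs]

inhib = [percent_inhibition(r, control_rate) for r in rates]
ic50 = interpolate_ic50(concs, inhib)
print(f"estimated IC50 = {ic50:.3f} uM")
```

Interpolation uses only two data points and gives no error estimate, so it is useful as a sanity check on a fitted value rather than as a replacement for the curve fit.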
Measurement of SARS-CoV-2 viral inhibition.
Stocks of SARS-CoV-2 strain 2019-nCoV/USA_WA1/2020 were propagated and titered in Vero-E6 cells. One day prior to the experiment, Vero-E6 cells were seeded at 30,000 cells/well in 96-well plates. Serial dilutions of the test compound were prepared in cell media (EMEM + 10% FCS + penicillin/streptomycin), overlaid onto cells, and then virus was added to each well at an MOI of 0.2. Cells were incubated at 37 °C under 5% CO2 for 72 h and then viral cytopathic effect was scored in a blinded manner. Inhibition was calculated by comparison to control wells with no inhibitor added. EC50 values were determined by fitting an asymmetric sigmoidal curve to the data (GraphPad Prism). Cells were confirmed as mycoplasma negative prior to use. All experiments were conducted in a biosafety level 3 (BSL-3) lab.
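The arithmetic behind the stated MOI can be sketched as follows. This is a rough illustration under assumptions that are not part of the protocol above: the cell number at the time of infection is taken to equal the seeding density, infection events are treated as Poisson distributed, and the stock titer is a placeholder value. All function names are hypothetical.

```python
import math


def infectious_units_per_well(moi, cells_per_well):
    """Infectious units needed per well to reach a given MOI."""
    return moi * cells_per_well


def fraction_infected(moi):
    """Expected fraction of cells receiving at least one infectious unit,
    assuming Poisson-distributed infection events."""
    return 1.0 - math.exp(-moi)


def inoculum_volume_ul(units_needed, stock_titer_per_ml):
    """Volume of virus stock in uL that delivers the required units."""
    return units_needed / stock_titer_per_ml * 1000.0


# Assumes the cell number at infection equals the seeding density, which
# ignores any growth during the day between seeding and infection.
units = infectious_units_per_well(moi=0.2, cells_per_well=30_000)
HYPOTHETICAL_TITER_PER_ML = 1e6  # placeholder stock titer, not from this work
print(units)
print(round(fraction_infected(0.2), 3))
print(inoculum_volume_ul(units, HYPOTHETICAL_TITER_PER_ML))
```

At a low MOI such as this one, most cells are uninfected in the first round, so the cytopathic effect scored at 72 h reflects multiple rounds of replication.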
Measurement of cellular cytotoxicity.
Vero-E6 cells were incubated with the compound of interest for 48 h at 37 °C under 5% CO2 and then cellular cytotoxicity was determined with the XTT Cell Proliferation Assay Kit (ATCC) according to the manufacturer’s instructions.
Crystallization, data collection, and structure determination.
To generate the complex of SARS-CoV-2 3CL protease bound to compound 4, 50 μM of the 3CL protease was incubated with 500 μM of compound 4 in a buffer comprised of 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 5% (v/v) glycerol for 1 h at 4 °C. This complex was then concentrated to 8.5 mg/mL using a 10 kDa concentrator, and initially subjected to extensive robotic screening at the High-Throughput Crystallization Screening Center13 of the Hauptman-Woodward Medical Research Institute (HWI) (https://hwi.buffalo.edu/high-throughput-crystallization-center/). The most promising crystal hits were then reproduced using the microbatch-under-oil method at 4 °C. Block-like crystals of 3CL in complex with compound 4 appeared after a few days in the crystallization condition comprised of 0.1 M potassium nitrate, 0.1 M sodium acetate (pH 5), and 20% (w/v) PEG 1000 with protein to crystallization reagent at a 2:1 ratio. The crystals were subsequently transferred into the same crystallization reagent supplemented with 15% (v/v) glycerol and flash-frozen in liquid nitrogen.
To obtain crystals of 3CL in complex with GC376, crystals of apo 3CL were initially grown using a seeding method in a crystallization reagent comprised of 0.1 M sodium phosphate monobasic, 0.1 M MES (pH 6), and 20% (w/v) PEG 4000. The apo crystals were subsequently soaked with 15 mM GC376, followed by flash-freezing of the crystals in the same reagent supplemented with 15% ethylene glycol.
To generate the complex of SARS-CoV-2 3CL protease bound to MAC-5576, 50 μM of the 3CL protease was incubated with 500 μM of MAC-5576 in a buffer comprised of 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 5% (v/v) glycerol for 1 h at 4 °C. The complex was concentrated to 10 mg/mL using a 10 kDa concentrator, and then crystallized in the same conditions as those used for crystallization of apo 3CL.
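The concentrations quoted in the complex-formation steps above can be related to each other with a short calculation. This is a sketch under one stated assumption: a monomer molecular weight of about 33.8 kDa for the 3CL protease, a value that is not given in the text. The function names are hypothetical.

```python
# Assumed monomer molecular weight of the SARS-CoV-2 3CL protease (~33.8 kDa);
# this value is an assumption and is not stated in the text.
ASSUMED_MW_DA = 33_800.0


def mg_per_ml_to_um(mg_per_ml, mw_da=ASSUMED_MW_DA):
    """Convert a protein concentration from mg/mL to micromolar."""
    # mg/mL equals g/L; dividing by g/mol gives mol/L, then scale to uM.
    return mg_per_ml / mw_da * 1e6


def molar_excess(ligand_um, protein_um):
    """Fold molar excess of ligand over protein."""
    return ligand_um / protein_um


print(molar_excess(500.0, 50.0))        # ligand:protein ratio during incubation
print(round(mg_per_ml_to_um(8.5), 1))   # concentration used for the compound 4 complex
print(round(mg_per_ml_to_um(10.0), 1))  # concentration used for the MAC-5576 complex
```

Under this assumption, the incubation uses a ten-fold molar excess of inhibitor, and the concentration step raises the protein from 50 μM to the few-hundred-micromolar range used for crystallization.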
A native dataset was collected on each crystal of 3CL, alone (apo) and in complex with compound 4 and GC376, at the NE-CAT 24-ID-C beam line of the Advanced Photon Source (APS) at Argonne National Laboratory, and the NE-CAT 24-ID-E beam line of APS was used for data collection on crystals of 3CL-MAC-5576. Crystals of apo 3CL and of 3CL in complex with compound 4, GC376, and MAC-5576 diffracted the X-ray beam to resolutions of 1.85 Å, 1.80 Å, 1.83 Å, and 1.73 Å, respectively. The images were processed and scaled in space group C2 using XDS14. The structure of 3CL-compound 4 was determined by molecular replacement using the program MOLREP15, with the crystal structure of 3CL in complex with inhibitor N3 (PDB ID: 6LU7)5 as the search model. The geometry of each crystal structure was subsequently fixed and the corresponding inhibitor was modeled in using XtalView16 and Coot17, and the structures were refined using PHENIX18. The mapping of electrostatic potential surfaces was generated in PyMOL with the APBS plug-in19. There is one protomer of the 3CL complex in the asymmetric unit of each crystal. The crystallographic statistics are shown in Supplementary Table 1.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
Structural data for the apo SARS-CoV-2 3CL protease and 3CL in complex with compound 4, GC376, and MAC-5576 will be deposited in the Protein Data Bank (PDB) and made publicly available upon publication. Source data for Fig. 1, Extended Data Fig. 1b, Extended Data Fig. 2, and the unprocessed gel for Extended Data Fig. 1a are available with the paper online.
Acknowledgements
This work was supported by a grant from the Jack Ma Foundation to D.D.H. and A.C. and by grants from Columbia Technology Ventures and the Columbia Translational Therapeutics (TRx) program to B.R.S. A.C. is also supported by a Career Award for Medical Scientists from the Burroughs Wellcome Fund. S.I. is supported by NIH grant T32AI106711. We thank the staff of the High-Throughput Crystallization Screening Center of the Hauptman-Woodward Medical Research Institute for screening of crystallization conditions and the staff of the Advanced Photon Source at Argonne National Laboratory for assistance with data collection.
Footnotes
Competing interests
S.I., H.L., A.Z., B.R.S., A.C., and D.D.H. are inventors on a patent application submitted based on this work. B.R.S. is an inventor on additional patents and patent applications related to small molecule therapeutics, and co-founded and serves as a consultant to Inzen Therapeutics and Nevrox Limited.
References
- 1. Wu F. et al. A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269, doi: 10.1038/s41586-020-2008-3 (2020).
- 2. Zhou P. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270–273, doi: 10.1038/s41586-020-2012-7 (2020).
- 3. Beigel J. H. et al. Remdesivir for the Treatment of Covid-19 - Preliminary Report. N Engl J Med, doi: 10.1056/NEJMoa2007764 (2020).
- 4. Wang Y. et al. Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial. Lancet 395, 1569–1578, doi: 10.1016/S0140-6736(20)31022-9 (2020).
- 5. Jin Z. et al. Structure of M(pro) from SARS-CoV-2 and discovery of its inhibitors. Nature, doi: 10.1038/s41586-020-2223-y (2020).
- 6. Zhang L. et al. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors. Science 368, 409–412, doi: 10.1126/science.abb3405 (2020).
- 7. Yang S. et al. Synthesis, crystal structure, structure-activity relationships, and antiviral activity of a potent SARS coronavirus 3CL protease inhibitor. J Med Chem 49, 4971–4980, doi: 10.1021/jm0603926 (2006).
- 8. Kim Y. et al. Broad-spectrum antivirals against 3C or 3C-like proteases of picornaviruses, noroviruses, and coronaviruses. J Virol 86, 11754–11762, doi: 10.1128/JVI.01348-12 (2012).
- 9. Blanchard J. E. et al. High-throughput screening identifies inhibitors of the SARS coronavirus main proteinase. Chem Biol 11, 1445–1453, doi: 10.1016/j.chembiol.2004.08.011 (2004).
- 10. Dai W. et al. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science, doi: 10.1126/science.abb4489 (2020).
- 11. Verschueren K. H. et al. A structural view of the inactivation of the SARS coronavirus main proteinase by benzotriazole esters. Chem Biol 15, 597–606, doi: 10.1016/j.chembiol.2008.04.011 (2008).
- 12. Zhai Y. et al. Cyanohydrin as an Anchoring Group for Potent and Selective Inhibitors of Enterovirus 71 3C Protease. J Med Chem 58, 9414–9420, doi: 10.1021/acs.jmedchem.5b01013 (2015).
- 13. Luft J. R. et al. A deliberate approach to screening for initial crystallization conditions of biological macromolecules. J Struct Biol 142, 170–179, doi: 10.1016/s1047-8477(03)00048-0 (2003).
- 14. Kabsch W. Integration, scaling, space-group assignment and post-refinement. Acta Crystallogr D Biol Crystallogr 66, 133–144, doi: 10.1107/S0907444909047374 (2010).
- 15. Vagin A. & Teplyakov A. Molecular replacement with MOLREP. Acta Crystallogr D Biol Crystallogr 66, 22–25, doi: 10.1107/S0907444909042589 (2010).
- 16. McRee D. E. XtalView/Xfit--A versatile program for manipulating atomic coordinates and electron density. J Struct Biol 125, 156–165, doi: 10.1006/jsbi.1999.4094 (1999).
- 17. Emsley P., Lohkamp B., Scott W. G. & Cowtan K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486–501, doi: 10.1107/S0907444910007493 (2010).
- 18. Adams P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66, 213–221, doi: 10.1107/S0907444909052925 (2010).
- 19. Baker N. A., Sept D., Joseph S., Holst M. J. & McCammon J. A. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci U S A 98, 10037–10041, doi: 10.1073/pnas.181342398 (2001).