Abstract
We report the identification of three structurally diverse compounds – compound 4, GC376, and MAC-5576 – as inhibitors of the SARS-CoV-2 3CL protease. Structures of each of these compounds in complex with the protease revealed strategies for further development, as well as general principles for designing SARS-CoV-2 3CL protease inhibitors. These compounds may therefore serve as leads for developing effective SARS-CoV-2 3CL protease inhibitors.
Subject terms: X-ray crystallography, Virology, SARS-CoV-2, Molecular medicine
The essential SARS-CoV-2 3CL protease is of interest as a drug target. Here, the authors identify three 3CL inhibitors and characterize them both in vitro and with a cell-based assay, and they also present the inhibitor-bound 3CL crystal structures, which may allow for the design of improved compounds.
Introduction
As the etiologic agent of COVID-19, SARS-CoV-2 has resulted in millions of deaths and caused rampant economic damage worldwide1,2. While some treatments have been identified, their clinical efficacy is low or they must be delivered within a narrow treatment window, making continued research into additional therapeutics essential3,4. Similar to other coronaviruses, SARS-CoV-2 encodes an essential 3CL protease (3CLpro or Mpro) that processes its polyproteins and that has garnered interest as a target for potential viral inhibitors5,6. Here, we describe a series of compounds with inhibitory activity against SARS-CoV-2 3CLpro and determine their structures in complex with the protease. These data provide general insights into the design of 3CL protease inhibitors, along with potential avenues by which these classes of compounds can be further developed.
Results
Identification of SARS-CoV-2 3CL protease inhibitors
We hypothesized that previously identified SARS-CoV 3CL protease inhibitors may also be effective against the SARS-CoV-2 3CL, given the conservation between the two proteases (96% amino acid identity)1,2. To study such compounds, we first purified the native SARS-CoV-2 3CL protease from Escherichia coli and confirmed that it had functional enzymatic activity in an in vitro biochemical assay (Fig. 1a, b). Using this assay to report SARS-CoV-2 3CL protease activity, we identified three diverse compounds of interest: compound 4 (ref. 7), GC376 (ref. 8), and MAC-5576 (ref. 9) (Fig. 2a). These compounds demonstrated inhibition of the protease with IC50 values (mean ± s.e.m.) of 151 ± 15, 160 ± 34, and 81 ± 12 nM, respectively (Fig. 2b). We further characterized these compounds by conducting enzyme kinetic studies to determine the inactivation rate (kinact/Ki) for each compound (Fig. 2c). Compound 4 had a kinact/Ki of 4.13 × 10^5 M^−1 s^−1 and GC376 had a kinact/Ki of 6.18 × 10^6 M^−1 s^−1, but we did not observe time-dependent inhibition by MAC-5576.
We then tested these compounds for inhibition of SARS-CoV-2 viral replication. We found that compound 4 and GC376 could block viral infection in Vero-E6 cells in a cytopathic effect reduction assay (EC50 values (mean ± s.d.): 2.88 ± 0.23 and 2.19 ± 0.01 µM, respectively), whereas MAC-5576 did not (Fig. 2d). We confirmed that these compounds did not result in cytotoxicity to the cells at the tested concentrations (Supplementary Fig. 1).
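To make explicit how the summary statistics quoted above (mean ± s.e.m. for IC50, mean ± s.d. for EC50) relate to replicate measurements, the following minimal Python sketch computes all three quantities. The replicate values are hypothetical placeholders, not data from this study, and the function name `mean_sd_sem` is illustrative only.

```python
import math
import statistics

def mean_sd_sem(values):
    """Return (mean, sample s.d., s.e.m.) for a list of replicate values."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    sem = sd / math.sqrt(len(values))      # standard error of the mean
    return mean, sd, sem

# Hypothetical replicate IC50 estimates in nM (NOT values from the paper).
replicates_nM = [140.0, 150.0, 160.0]
mean, sd, sem = mean_sd_sem(replicates_nM)
print(f"mean = {mean:.1f} nM, s.d. = {sd:.1f} nM, s.e.m. = {sem:.1f} nM")
```

Because the s.e.m. shrinks with the square root of the number of replicates while the s.d. does not, the two error conventions are not directly comparable between the biochemical and cell-based results.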
Crystal structures of 3CLpro with protease inhibitors
As the three compounds exhibited inhibitory activity against the SARS-CoV-2 3CLpro, we proceeded to solve the crystal structure of the ligand-free 3CL protease alone and of each of these compounds in complex with the protease to understand their mechanism of binding as well as to guide future structure-based optimization efforts. We note that while MAC-5576 did not exhibit activity in the cellular assay, its low molecular weight and reasonable biochemical activity prompted us to pursue its crystallization as well, as our goal was to broadly investigate inhibitory scaffolds for the SARS-CoV-2 3CL protease. Crystals were obtained (see Methods for detailed information) and structures at 1.85, 1.94, 1.83, 1.73 Å resolution limits for ligand-free 3CLpro and 3CLpro bound to compound 4, GC376, and MAC-5576, respectively, were solved (Fig. 2, see Table 1 for statistics).
Table 1.
Parameter | Ligand-free 3CL (PDB: 7JST) | 3CL with compound 4 (PDB: 7JT7) | 3CL with compound 4 (PDB: 7JW8) | 3CL with GC376 (PDB: 7JSU) | 3CL with MAC-5576 (PDB: 7JT0) |
---|---|---|---|---|---|
Data collection | | | | | |
Space group | C2 | C2 | P1 | C2 | C2 |
Cell dimensions | | | | | |
a, b, c (Å) | 98.7, 82.0, 51.8 | 97.2, 81.9, 54.2 | 63.5, 67.8, 93.6 | 98.8, 80.2, 52.0 | 98.3, 82.5, 51.8 |
α, β, γ (°) | 90, 114.9, 90 | 90, 117.1, 90 | 75.2, 79.3, 67.9 | 90, 114.3, 90 | 90, 114.83, 90 |
Resolution (Å) | 60.4–1.85 (1.88–1.85)a | 59.5–1.94 (1.98–1.94)a | 90.03–1.84 (1.86–1.84)a | 59.9–1.83 (1.86–1.83)a | 60.58–1.73 (1.76–1.73)a |
Rmerge (%) | 7.2 (65.7) | 18.2 (58.4) | 13.2 (76.5) | 14.7 (60.5) | 4.1 (68.8) |
I/σI | 13.4 (2.1) | 10.2 (3.0) | 5.7 (1.8) | 12.7 (2.2) | 23.5 (2.3) |
Completeness (%) | 98.9 (97.4) | 98.8 (99.0) | 95.3 (94.0) | 98.5 (96.0) | 99.2 (88.3) |
Redundancy | 6.8 (6.5) | 6.7 (6.5) | 3.6 (3.1) | 6.9 (6.1) | 6.8 (6.6) |
CC1/2 | 0.99 (0.95) | 0.99 (0.92) | 0.99 (0.89) | 0.99 (0.93) | 1.00 (0.90) |
Refinement | | | | | |
Resolution (Å) | 47.0–1.85 (1.88–1.85)a | 48.20–1.94 (1.98–1.94)a | 90.03–1.84 (1.86–1.84)a | 47.38–1.83 (1.86–1.83) | 47.03–1.73 (1.75–1.73) |
No. reflections | 31,709 (3219) | 27,502 (2673) | 113,161 (11,324) | 31,971 (3134) | 38,879 (3899) |
Rwork/Rfree (%) | 16.8 (25.2)/19.6 (30.2) | 17.4 (21.7)/22.5 (26.5) | 18.3 (27.6)/22.7 (33.3) | 17.4 (25.0)/20.6 (29.6) | 16.6 (26.6)/19.2 (27.5) |
Ramachandran plot (%) | | | | | |
Outliers | 0.00 | 0.00 | 0.16 | 0.00 | 0.00 |
Allowed | 1.00 | 1.64 | 1.81 | 1.30 | 2.00 |
Favored | 99.00 | 98.36 | 98.02 | 98.70 | 98.00 |
No. atoms | | | | | |
Protein | 2329 | 2347 | 9452 | 2340 | 2317 |
Ligand/ion | 5 | 45 | 208 | 39 | 12 |
Water | 128 | 274 | 1048 | 183 | 241 |
B-factors | | | | | |
Protein | 50.0 | 32.5 | 29.6 | 40.5 | 41.6 |
Ligand/ion | 52.6 | 32.1 | 33.8 | 40.0 | 41.1 |
Water | 53.5 | 41.5 | 40.0 | 47.9 | 49.9 |
R.m.s. deviations | | | | | |
Bond lengths (Å) | 0.006 | 0.006 | 0.006 | 0.006 | 0.006 |
Bond angles (°) | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
a Values for the highest-resolution shell are shown in parentheses.
The X-ray crystal structures revealed that all three compounds bind covalently to the catalytically active Cys145 residue within the substrate-binding pocket of the protease. We observed distinct mechanisms by which these compounds acted on this residue. Compound 4 adopted a binding mode similar to that of other reported compounds, covalently modifying Cys145 through Michael addition (Fig. 3a)5. For GC376, the bisulfite adduct was converted to an aldehyde as previously reported, allowing it to then react with Cys145 through nucleophilic addition and hemithioacetal formation (Fig. 3b)8. MAC-5576 also covalently modified Cys145 by nucleophilic linkage, which was somewhat unexpected, given that we did not observe time-dependent inhibition by this compound (Fig. 3c). We observed weaker density in the S4 site for compound 4 (Fig. 3d) and in the S3 site for GC376 (Fig. 3e) as compared to other regions of each inhibitor. For MAC-5576, we found that the overall electron density was weak and that optimal modeling was achieved when the occupancy was set to 0.5, consistent with the possibility that it binds reversibly (Fig. 3f).
Structural insights into the design of 3CL protease inhibitors
As we solved the structures for multiple compounds, we hypothesized that general principles for the design of SARS-CoV-2 3CL protease inhibitors could be identified. We first overlaid all four crystal structures of the 3CLpro with or without inhibitors (Fig. 4a). We observed local conformational changes: residues Thr45 to Pro52 differed from the ligand-free 3CLpro in all three inhibitor-bound structures, whereas residues Arg188 to Gln192 differed only in the compound 4-bound and GC376-bound structures, not in the MAC-5576-bound structure. We then overlaid each of the inhibitors in the substrate-binding pocket of the 3CL protease to find commonalities in their interactions (Fig. 4b). Most notably, we found that all of these compounds occupied the S2 site, with compound 4 and GC376 further anchored in the S1 site. The backbone NH of Gly143 points toward the ligand-binding pocket, forming hydrogen bonds with the carbonyl oxygen of the ethyl ester of compound 4 and with the hemithioacetal of GC376 formed by addition of Cys145 to the original aldehyde, although the former hydrogen bond is stronger than the latter. In both structures, the γ-lactam groups occupy the S1 site and are strongly anchored by two hydrogen bonds with the side chains of His163 and Glu166. The isobutyl groups are favorably embedded in the hydrophobic S2 site, surrounded by the alkyl portion of the side chains of His41, Met49, His164, Met165, Asp187, and Gln189. Extending into the S3 pocket, the amide bonds of compound 4 and GC376 are stabilized by hydrogen bond interactions with the side chain of Gln189.
To further study the interaction of compound 4 with SARS-CoV-2 3CLpro, we determined and refined an additional crystal structure of the 3CL protease in complex with compound 4 in space group P1 at 1.84 Å resolution. In this crystal form there are four protomers in the asymmetric unit, which in space group P1 is equivalent to the unit cell. Overlaying these protomers revealed that compound 4 exhibited significant flexibility in the S1′ region in particular (Fig. 4c).
As several SARS-CoV-2 3CL protease inhibitors have been reported, we overlaid compound 4 and GC376 with these related substrate mimetic inhibitors (Fig. 4d)5,6,10. We found similar interactions between these compounds, suggesting that overall, the binding modes of this class share remarkable similarities. Specifically, they all have a γ-lactam occupying the S1 pocket, preserving the dual hydrogen bonds with His163 and Glu166. Furthermore, they commonly contain a hydrophobic moiety occupying the S2 site. The segments of the inhibitors spanning S1 to S2 align closely on top of each other. Variations in binding start to emerge in the S3 and S4 regions, which exhibit high degrees of freedom in terms of structural diversity as well as conformational flexibility.
On the other hand, the binding of MAC-5576, as a non-peptidic small molecule, displays unique features that differ from those of compound 4 or GC376. We observed that the thiophene group forms π–π stacking with the His41 side chain imidazole, which undergoes a conformational rotation around its beta-carbon to align parallel to the thiophene, as compared to the other peptide-bound structures. Additionally, the side chain of Gln189 also shows notable conformational variation compared to those in the compound 4 and GC376 crystal structures, possibly in response to the specific hydrogen bond interactions induced by the respective ligands. Notably, the rotation of His41 has been reported previously in the crystal structure of a benzotriazole ester inhibitor (XP-59) in complex with the SARS-CoV 3CL protease (PDB: 2V6N)11. An overlaid model of the crystal structures of MAC-5576 bound to SARS-CoV-2 3CLpro and XP-59 bound to SARS-CoV 3CLpro shows that both compounds have similar binding modes when covalently bound to Cys145, in which the thiophene of MAC-5576 and the phenyl ring of XP-59 almost overlap with each other, both engaging the His41 side chain via π–π stacking interactions (Fig. 4e).
Discussion
In this study, we have identified compound 4, GC376, and MAC-5576 as inhibitors of the SARS-CoV-2 3CL protease. Each of these compounds displayed biochemical inhibition of the protease, and compound 4 and GC376 also inhibited the virus in a cell-based assay, whereas MAC-5576 did not (Fig. 2). We solved the crystal structures of these compounds complexed to the protease, confirming that each is a covalent inhibitor (Fig. 3). Compound 4 and GC376 demonstrated interactions similar to those of other substrate mimetic inhibitors5,6,10, and MAC-5576 was similar to a previously identified small molecule inhibitor of SARS-CoV 3CLpro11 (Fig. 4).
GC376 has been recently reported to be an inhibitor of the SARS-CoV-2 3CLpro, and the complex was solved by Ma et al. (PDB accession code 6WTT)12. Our results corroborate their findings, and we observe similar interactions in our solved structure. However, one notable difference lies in the S3 site, in which the benzyl group in our crystal structure points upward towards the solvent, while making a hydrophobic interaction with the lactam group. In contrast, the benzyl group of GC376, bound to each of the three 3CL protomers in the asymmetric unit (ASU) of their structure, is anchored in the hydrophobic pocket predominantly formed by Met165, Leu167, and Gln192. This observation, along with the observed weaker electron density in this region (Fig. 3e), suggests that this subsite could be modified for an improved inhibitor.
In solving the complex of compound 4 with 3CL protease in both space groups C2 and P1, we observed that the S1′ site demonstrated conformational flexibility (Fig. 4c). In addition, the S4 site demonstrated weaker electron density (Fig. 3d), suggesting that modifying the interaction of compound 4 with these two subsites could improve the compound’s inhibitory activity.
The finding that MAC-5576 was covalently linked to Cys145 in the crystal structure (Fig. 3c) but did not display time-dependent inhibition (Fig. 2c) suggests that it may be a reversible covalent inhibitor. The overall weaker electron density and the optimal modeling with occupancy set to 0.5 for this structure support the possibility that its binding is reversible. However, it is also possible that the lack of time-dependent inhibition, despite the clear covalent linkage in the crystal structure, is due to differences in the conditions used for the two experiments. Further investigations into the mechanism of action of MAC-5576 may reveal a method for overcoming its lack of activity in inhibiting the virus (Fig. 2d).
The collective observations from the three inhibitors suggest that development of 3CL protease inhibitors may benefit from first establishing robust interactions within the S1, S2, and/or S1′ sites, before extending into the S3 and S4 sites. For these, and other compounds targeting the 3CL protease, there are ample opportunities to improve the inhibitory potencies against the 3CLpro by designing compounds that exploit the accessible contact points to strengthen the ligand-protein interactions (Fig. 4d).
In summary, we have identified compound 4, GC376, and MAC-5576 as inhibitors of the SARS-CoV-2 3CL protease. Crystal structures of the compounds complexed to the protease suggested their mechanisms of action and pointed to guidelines for the design of SARS-CoV-2 3CL protease inhibitors, which may aid in the future development of novel inhibitors to combat this virus.
Methods
Compounds
Compound 4 was synthesized using the synthesis route previously described, with the exception of using a sodium borohydride-cobaltous chloride reduction of the nitrile in the construction of the lactam, thus avoiding the high pressure hydrogenation in the original route7,13. GC376 was purchased from Aobious (Gloucester, MA, USA) and MAC-5576 was purchased from Maybridge (Cheshire, United Kingdom).
Expression and purification of SARS-CoV-2 3CL protease
The SARS-CoV-2 3CL protease gene was codon optimized for bacterial expression and synthesized (Supplementary Table 1) (Twist Bioscience, San Francisco, CA, USA), then cloned into a bacterial expression vector (pGEX-5X-3, GE, Boston, MA, USA, gift from Yosef Sabo, Columbia University Irving Medical Center) which expresses the protease as a fusion construct with an N-terminal GST tag, followed by a Factor Xa cleavage site (pGEX-5X-3-SARS-CoV-2-3CL, deposited to Addgene as plasmid #168457). After confirmation by Sanger sequencing using the primers listed in Supplementary Table 2, the construct was transformed into BL21 (DE3) cells. These E. coli cells were inoculated and grown overnight as starter cultures, which were then used to inoculate larger cultures at a 1:100 dilution. The larger cultures were grown at 37 °C, 220 RPM until the OD reached 0.6–0.7. Expression of the protease was induced with the addition of 0.5 mM IPTG, and then the cultures were incubated at 16 °C, 180 RPM for 10 h. Cells were pelleted at 3580 × g for 15 min at 4 °C, resuspended in lysis buffer (20 mM Tris-HCl, pH 8.0, 300 mM NaCl), homogenized by sonication, then clarified by centrifuging at 25,000 × g for 1 h at 4 °C. The supernatant was mixed with Glutathione Sepharose resin (Sigma, St. Louis, MO, USA) and placed on a rotator for 2 h at 4 °C. The resin was then washed repeatedly by centrifuging at 3210 × g for 15 min at 4 °C, discarding the supernatant, and resuspending the resin in fresh lysis buffer. After ten washes, the resin was resuspended in lysis buffer, and Factor Xa was added and incubated for 36 h at 4 °C on a rotator. The resin was centrifuged at 3210 × g for 15 min at 4 °C, and then the supernatant was collected and concentrated using a 10 kDa concentrator before being loaded onto a Superdex 10/300 GL column in 50 mM Tris-HCl, pH 7.5, 1 mM EDTA for further purification by size exclusion chromatography. The appropriate fractions were pooled and concentrated with a 10 kDa concentrator, and then the final product was assessed for quality by SDS-PAGE and measurement of biochemical activity.
Measurement of SARS-CoV-2 3CL protease biochemical activity
The in vitro biochemical activity of the SARS-CoV-2 3CL protease was measured as previously described5. The fluorogenic peptide MCA-AVLQSGFR-Lys(DNP)-Lys-NH2, corresponding to the nsp4/nsp5 cleavage site in the virus, was synthesized (GL Biochem, Shanghai, China), then resuspended in DMSO for use as the substrate. Different concentrations of this substrate, ranging from 5 to 100 µM, were prepared in the assay buffer (50 mM Tris-HCl, pH 7.5, 1 mM EDTA) in a 96-well plate. The protease was then added to each well at a concentration of 0.2 µM, and fluorescence was measured continuously on a plate reader for 3 min. The catalytic efficiency of the protease was then calculated by nonlinear regression (GraphPad Prism, GraphPad Software, San Diego, CA, USA). For calculations, a 100% active enzyme was assumed.
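The nonlinear regression step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GraphPad Prism procedure used in this study: it fits the Michaelis-Menten equation by solving for Vmax in closed form at each trial Km and refining Km on a logarithmic grid. The substrate concentrations are hypothetical points within the stated 5 to 100 µM range, the rates are synthetic, fluorescence is assumed to have already been converted to µM of product per second, and all function names are illustrative.

```python
import math

ENZYME_UM = 0.2  # protease concentration from the text, assumed 100% active

def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate at substrate concentration s."""
    return vmax * s / (km + s)

def best_vmax(km, substrate, rates):
    """For a fixed Km the model is linear in Vmax, so least squares is closed-form."""
    x = [s / (km + s) for s in substrate]
    return sum(xi * v for xi, v in zip(x, rates)) / sum(xi * xi for xi in x)

def sse(km, substrate, rates):
    """Sum of squared residuals at a trial Km, using the best Vmax for that Km."""
    vmax = best_vmax(km, substrate, rates)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))

def fit_michaelis_menten(substrate, rates, km_lo=0.1, km_hi=1000.0):
    """Coarse-to-fine grid search over log10(Km); returns (Vmax, Km)."""
    lo, hi = math.log10(km_lo), math.log10(km_hi)
    best = lo
    for _ in range(6):
        step = (hi - lo) / 200
        grid = [lo + i * step for i in range(201)]
        best = min(grid, key=lambda g: sse(10 ** g, substrate, rates))
        lo, hi = best - step, best + step  # zoom in around the best grid point
    km = 10 ** best
    return best_vmax(km, substrate, rates), km

def catalytic_efficiency(vmax_uM_per_s, km_uM, enzyme_uM=ENZYME_UM):
    """kcat/Km in M^-1 s^-1, with kcat = Vmax / [E]."""
    kcat = vmax_uM_per_s / enzyme_uM
    return kcat / km_uM * 1e6  # convert from uM^-1 s^-1 to M^-1 s^-1

# Synthetic, noise-free rates generated from hypothetical parameters.
substrate_uM = [5, 10, 20, 40, 60, 80, 100]
synthetic_rates = [mm_rate(s, 0.05, 30.0) for s in substrate_uM]
vmax_fit, km_fit = fit_michaelis_menten(substrate_uM, synthetic_rates)
print(f"Vmax = {vmax_fit:.4f} uM/s, Km = {km_fit:.2f} uM")
print(f"kcat/Km = {catalytic_efficiency(vmax_fit, km_fit):.0f} M^-1 s^-1")
```

Treating Vmax as a linear parameter reduces the fit to a one-dimensional search, which keeps the sketch dependency-free; a general-purpose least-squares routine would serve equally well.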
Measurement of SARS-CoV-2 3CL protease inhibition
Inhibition of the biochemical activity of the SARS-CoV-2 3CL protease was quantified as previously described with modifications5. Serial dilutions of the test compound were prepared in the assay buffer, and then incubated with 0.2 µM of the protease for 10 min at 37 °C. The substrate was then added at 20 µM per well, and then fluorescence was continuously measured on a plate reader for 3 min. Inhibition was then calculated by comparison to control wells with no inhibitor added. IC50 values were determined by nonlinear regression (GraphPad Prism). For calculations, a 100% active enzyme was assumed.
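As a rough illustration of the IC50 calculation, the sketch below converts reaction rates to percent inhibition relative to a no-inhibitor control and fits a two-parameter Hill curve by a coarse-to-fine grid search. It is a simplified stand-in for the GraphPad Prism regression used here, not a reproduction of it; the dilution series, the synthetic data, and the function names are all hypothetical.

```python
import math

def percent_inhibition(rate, control_rate):
    """Inhibition relative to a control well with no inhibitor added."""
    return 100.0 * (1.0 - rate / control_rate)

def hill(conc, ic50, slope):
    """Percent inhibition predicted by a Hill curve running from 0 to 100%."""
    return 100.0 / (1.0 + (ic50 / conc) ** slope)

def sse(params, concs, inhib):
    """Sum of squared residuals for params = (log10 IC50, Hill slope)."""
    log_ic50, slope = params
    return sum((y - hill(c, 10 ** log_ic50, slope)) ** 2
               for c, y in zip(concs, inhib))

def fit_ic50(concs, inhib, rounds=10, n=40):
    """Coarse-to-fine 2D grid search; returns (IC50, Hill slope)."""
    lo_c, hi_c = math.log10(min(concs)) - 1, math.log10(max(concs)) + 1
    lo_h, hi_h = 0.2, 5.0
    best = (lo_c, lo_h)
    for _ in range(rounds):
        dc, dh = (hi_c - lo_c) / n, (hi_h - lo_h) / n
        grid = [(lo_c + i * dc, lo_h + j * dh)
                for i in range(n + 1) for j in range(n + 1)]
        best = min(grid, key=lambda p: sse(p, concs, inhib))
        # Zoom to a window of two grid steps on each side of the best point.
        lo_c, hi_c = best[0] - 2 * dc, best[0] + 2 * dc
        lo_h, hi_h = best[1] - 2 * dh, best[1] + 2 * dh
    return 10 ** best[0], best[1]

# Hypothetical dilution series (nM) with synthetic, noise-free inhibition data.
concs_nM = [10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0]
synthetic_inhibition = [hill(c, 150.0, 1.0) for c in concs_nM]
ic50_fit, slope_fit = fit_ic50(concs_nM, synthetic_inhibition)
print(f"IC50 = {ic50_fit:.1f} nM, Hill slope = {slope_fit:.2f}")
```

Searching in log10(IC50) rather than IC50 itself mirrors how dose-response data are usually spaced and keeps the grid resolution even across several orders of magnitude.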
Kinetic parameters were determined as previously described14. Compounds were pre-incubated with the protease for varying lengths of time at various concentrations to derive kobs values, which were then used to calculate kinact and Ki by nonlinear regression (GraphPad Prism).
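The relationship between pre-incubation data, kobs, kinact, and Ki can be illustrated with the short sketch below. It assumes the standard model for time-dependent inactivation, in which remaining activity decays as exp(−kobs·t) and kobs = kinact[I]/(Ki + [I]). For simplicity it uses a linearized double-reciprocal (Kitz-Wilson) fit in place of the nonlinear regression used in this study; the time points, inhibitor concentrations, and parameter values are hypothetical.

```python
import math

def slope_intercept(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def k_obs(times_s, fraction_active):
    """ln(remaining activity) falls linearly with time; the slope is -kobs."""
    slope, _ = slope_intercept(times_s, [math.log(f) for f in fraction_active])
    return -slope

def kitz_wilson(inhibitor_M, kobs_values):
    """Fit 1/kobs = (Ki/kinact)(1/[I]) + 1/kinact; returns (kinact, Ki)."""
    slope, intercept = slope_intercept([1.0 / i for i in inhibitor_M],
                                       [1.0 / k for k in kobs_values])
    kinact = 1.0 / intercept
    return kinact, slope * kinact

# Hypothetical parameters used only to generate synthetic data.
TRUE_KINACT, TRUE_KI = 0.01, 1e-7          # s^-1 and M
times_s = [0, 60, 120, 300, 600]
inhibitor_M = [25e-9, 50e-9, 100e-9, 200e-9, 400e-9]

kobs_values = []
for conc in inhibitor_M:
    k = TRUE_KINACT * conc / (TRUE_KI + conc)
    remaining = [math.exp(-k * t) for t in times_s]  # synthetic activity decay
    kobs_values.append(k_obs(times_s, remaining))

kinact, ki = kitz_wilson(inhibitor_M, kobs_values)
print(f"kinact = {kinact:.4g} s^-1, Ki = {ki:.3g} M, "
      f"kinact/Ki = {kinact / ki:.3g} M^-1 s^-1")
```

The double-reciprocal form weights low-concentration points heavily when the data are noisy, which is one reason direct nonlinear regression of kobs against inhibitor concentration is generally preferred in practice.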
Measurement of SARS-CoV-2 viral inhibition
Stocks of SARS-CoV-2 strain 2019-nCoV/USA_WA1/2020 were propagated and titered in Vero-E6 cells. One day prior to the experiment, Vero-E6 cells were seeded at 30,000 cells/well in 96-well plates. Serial dilutions of the test compound were prepared in cell media (EMEM + 10% FCS + penicillin/streptomycin) and overlaid onto the cells, and then virus was added to each well at an MOI of 0.2. Cells were incubated at 37 °C under 5% CO2 for 72 h and then viral cytopathic effect was scored in a blinded manner. Inhibition was calculated by comparison to control wells with no inhibitor added. EC50 values were determined by nonlinear regression (GraphPad Prism). Cells were confirmed as mycoplasma negative prior to use. All experiments were conducted in a biosafety level 3 (BSL-3) lab.
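The inoculum implied by the stated seeding density and MOI can be worked out with the short sketch below. It assumes that the number of cells per well at the time of infection equals the seeding density (cells were seeded a day earlier, so the true number may be higher) and that infection events follow a Poisson distribution; both are simplifying assumptions, and the function names are illustrative.

```python
import math

def infectious_units_per_well(cells_per_well, moi):
    """Infectious units needed per well for a given multiplicity of infection."""
    return cells_per_well * moi

def fraction_infected(moi):
    """Poisson estimate of the fraction of cells receiving at least one virion."""
    return 1.0 - math.exp(-moi)

CELLS_PER_WELL = 30_000  # seeding density from the text
MOI = 0.2                # multiplicity of infection from the text

print(f"infectious units per well: {infectious_units_per_well(CELLS_PER_WELL, MOI):.0f}")
print(f"fraction initially infected: {fraction_infected(MOI):.3f}")
```

Under these assumptions, fewer than one in five cells is infected in the first round, so the cytopathic effect scored at 72 h reflects multiple rounds of viral replication.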
Measurement of cellular cytotoxicity
Vero-E6 cells were incubated with the compound of interest for 48 h at 37 °C under 5% CO2 and then cellular cytotoxicity was determined with the XTT Cell Proliferation Assay Kit (ATCC) according to the manufacturer’s instructions.
Crystallization, data collection, and structure determination
To generate the complex of SARS-CoV-2 3CL protease bound to compound 4, 50 µM of the 3CL protease was incubated with 500 µM of compound 4 in a buffer comprised of 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 5% (v/v) glycerol for 1 h at 4 °C. This complex was then concentrated to 8.5 mg/mL using a 10 kDa concentrator, and initially subjected to extensive robotic screening at the High-Throughput Crystallization Screening Center of the Hauptman-Woodward Medical Research Institute (HWI) (https://hwi.buffalo.edu/high-throughput-crystallization-center/)15. The most promising crystal hits were then reproduced using the microbatch-under-oil method at 4 °C. Block-like crystals of 3CLpro in complex with compound 4 appeared after a few days in the crystallization condition comprised of 0.1 M potassium nitrate, 0.1 M sodium acetate (pH 5), and 20% (w/v) PEG 1000 with protein to crystallization reagent at a 2:1 ratio. The crystals were subsequently transferred into the same crystallization reagent supplemented with 15% (v/v) glycerol and flash-frozen in liquid nitrogen. Plate-like crystals of 3CLpro in complex with compound 4 were also produced using crystallization reagent comprising 0.1 M Bis-Tris (pH 6.5) and 20% (w/v) PEG MME 5000.
To obtain crystals of 3CLpro in complex with GC376, crystals of ligand-free 3CLpro were initially grown using a seeding method in a crystallization reagent comprised of 0.1 M monobasic sodium phosphate, 0.1 M MES (pH 6), and 20% (w/v) PEG 4000. These crystals were subsequently soaked with 15 mM GC376 and then flash-frozen in the same reagent supplemented with 15% ethylene glycol.
To generate the complex of SARS-CoV-2 3CL protease bound to MAC-5576, 50 µM of the 3CL protease was incubated with 500 µM of MAC-5576 in a buffer comprised of 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 5% (v/v) glycerol for 1 h at 4 °C. The complex was concentrated to 10 mg/mL using a 10 kDa concentrator, and then crystallized in the same conditions as those used for crystallization of ligand-free 3CLpro.
A native dataset was collected on each crystal of 3CLpro, alone (ligand-free) and in complex with compound 4 and GC376, at the NE-CAT 24-ID-C beamline of the Advanced Photon Source (APS) at Argonne National Laboratory, and the NE-CAT 24-ID-E beamline of APS was used for data collection on crystals of 3CL-MAC-5576. Crystals of ligand-free 3CLpro and of 3CLpro in complex with compound 4 in space groups C2 and P1, GC376, and MAC-5576 diffracted to resolutions of 1.85, 1.94, 1.84, 1.83, and 1.73 Å, respectively. The images were processed and scaled in space group C2 using XDS16. The structure of 3CLpro with compound 4 in space group C2 was determined by the molecular replacement (MR) method using the program MOLREP17, with the crystal structure of 3CLpro in complex with inhibitor N3 (PDB ID: 6LU7)5 as the search model. The structure of 3CLpro with compound 4 in space group P1 was also determined by the MR method, using the refined model of 3CLpro with compound 4 in space group C2 as the search model. The geometry of each crystal structure was subsequently fixed and the corresponding inhibitor was modeled in using XtalView18 and Coot19, and the structures were refined using PHENIX20. The mapping of electrostatic potential surfaces was generated in PyMOL with the APBS plug-in21. There is one protomer of the 3CLpro complex in the asymmetric unit of each crystal of space group C2, and there are four protomers of 3CLpro bound to compound 4 in each unit cell of space group P1. The crystallographic statistics are shown in Table 1.
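To relate the cell dimensions in Table 1 to the packing described above, the following sketch computes unit cell volumes with the general triclinic formula and divides by the number of protomers per cell. It is an illustrative calculation only: the function names are our own, and the count of four protomers per C2 cell relies on the standard multiplicity of four for the general position in space group C2 combined with the one protomer per asymmetric unit stated in the text.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def volume_per_protomer(cell, protomers_per_cell):
    """Average cell volume available to each protomer."""
    return unit_cell_volume(*cell) / protomers_per_cell

# Cell dimensions for the two compound 4 complexes, taken from Table 1.
cell_c2 = (97.2, 81.9, 54.2, 90.0, 117.1, 90.0)   # PDB 7JT7, space group C2
cell_p1 = (63.5, 67.8, 93.6, 75.2, 79.3, 67.9)    # PDB 7JW8, space group P1

# Assumption: 4 protomers per C2 cell (multiplicity 4 x 1 protomer per
# asymmetric unit); the text states 4 protomers per P1 cell.
for label, cell, z in (("C2", cell_c2, 4), ("P1", cell_p1, 4)):
    print(f"{label}: V = {unit_cell_volume(*cell):,.0f} A^3, "
          f"per protomer = {volume_per_protomer(cell, z):,.0f} A^3")
```

Comparing the per-protomer volumes of the two crystal forms gives a quick sense of how differently the same complex is packed, which is relevant when interpreting the conformational differences among the four P1 protomers.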
Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.
Acknowledgements
This work was supported by a grant from the Jack Ma Foundation to D.D.H. and A.C. and by grants from Columbia Technology Ventures and the Columbia Translational Therapeutics (TRx) program to B.R.S. A.C. is also supported by a Career Awards for Medical Scientists from the Burroughs Wellcome Fund. S.I. was supported by NIH grant T32AI106711. We are thankful to Yosef Sabo for the pGEX-5X-3 plasmid, to WuXi AppTec for assistance with the enzyme kinetics assay, to the staff of the High-Throughput Crystallization Screening Center of the Hauptman-Woodward Medical Research Institute for screening of crystallization conditions, and to the staff of the Advanced Photon Source at Argonne National Laboratory for assistance with data collection. Crystallization screening was supported through NSF grant 2029943. This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility, operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. Extraordinary facility operations were supported in part by the DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on the response to COVID-19, with funding provided by the Coronavirus CARES Act. Part of the research was conducted at the Northeastern Collaborative Access Team beamlines, which are funded by the National Institute of General Medical Sciences from the National Institutes of Health (P30 GM124165).
Author contributions
D.D.H. conceived the project. S.I., B.R.S., A.C., and D.D.H. planned and designed the experiments. S.I. and S.J.H. cloned, expressed, and purified proteins. A.Z. synthesized compound 4. S.I. conducted the protease inhibition assay. M.N. and Y.H. conducted the antiviral assay and cytotoxicity measurements. F.F. and H.L. crystallized the proteins. F.F. collected diffraction data and solved the crystal structures. F.F., F.-Y.L., and L.X. analyzed the structural data. S.I., F.F., F.-Y.L., L.X., A.C., and D.D.H. wrote the manuscript with input from all authors.
Data availability
The SARS-CoV-2 3CLpro bacterial expression vector utilized in this study has been deposited to Addgene as plasmid #168457. Structural data for the ligand-free SARS-CoV-2 3CL protease and 3CLpro in complex with compound 4 in space group C2 and P1, GC376, and MAC-5576 have been deposited in the Protein Data Bank (PDB) under accession codes 7JST, 7JT7, 7JW8, 7JSU, and 7JT0, respectively. Overlays in Fig. 4d, e were made using previously deposited structures in PDB, available under accession codes 6Y2F (SARS-CoV-2 3CLpro with compound 13b), 6LZE (SARS-CoV-2 3CLpro with compound 11a), 6M0K (SARS-CoV-2 3CLpro with compound 11b), 7BQY (SARS-CoV-2 3CLpro with N3), and 2V6N (SARS-CoV 3CLpro with XP-59). Source data are provided with this paper.
Competing interests
S.I., F.F., H.L., A.Z., B.R.S., A.C., and D.D.H. are inventors on a patent application submitted based on this work. B.R.S. is an inventor on additional patents and patent applications related to small molecule therapeutics, and co-founded and serves as a consultant to Inzen Therapeutics and Nevrox Limited. The other authors have no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work.
Change history
5/5/2021
A Correction to this paper has been published: 10.1038/s41467-021-23082-3
Contributor Information
Brent R. Stockwell, Email: bstockwell@columbia.edu
Alejandro Chavez, Email: ac4304@cumc.columbia.edu
David D. Ho, Email: dh2994@cumc.columbia.edu
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-22362-2.
References
- 1. Wu F, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–269. doi: 10.1038/s41586-020-2008-3.
- 2. Zhou P, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7.
- 3. Beigel JH, et al. Remdesivir for the Treatment of Covid-19—Preliminary Report. N. Engl. J. Med. 2020. doi: 10.1056/NEJMoa2007764.
- 4. Wang Y, et al. Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial. Lancet. 2020;395:1569–1578. doi: 10.1016/S0140-6736(20)31022-9.
- 5. Jin Z, et al. Structure of M(pro) from SARS-CoV-2 and discovery of its inhibitors. Nature. 2020. doi: 10.1038/s41586-020-2223-y.
- 6. Zhang L, et al. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors. Science. 2020;368:409–412. doi: 10.1126/science.abb3405.
- 7. Yang S, et al. Synthesis, crystal structure, structure-activity relationships, and antiviral activity of a potent SARS coronavirus 3CL protease inhibitor. J. Med. Chem. 2006;49:4971–4980. doi: 10.1021/jm0603926.
- 8. Kim Y, et al. Broad-spectrum antivirals against 3C or 3C-like proteases of picornaviruses, noroviruses, and coronaviruses. J. Virol. 2012;86:11754–11762. doi: 10.1128/JVI.01348-12.
- 9. Blanchard JE, et al. High-throughput screening identifies inhibitors of the SARS coronavirus main proteinase. Chem. Biol. 2004;11:1445–1453. doi: 10.1016/j.chembiol.2004.08.011.
- 10. Dai W, et al. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science. 2020. doi: 10.1126/science.abb4489.
- 11. Verschueren KH, et al. A structural view of the inactivation of the SARS coronavirus main proteinase by benzotriazole esters. Chem. Biol. 2008;15:597–606. doi: 10.1016/j.chembiol.2008.04.011.
- 12. Ma C, et al. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020;30:678–692. doi: 10.1038/s41422-020-0356-z.
- 13. Zhai Y, et al. Cyanohydrin as an anchoring group for potent and selective inhibitors of enterovirus 71 3C protease. J. Med. Chem. 2015;58:9414–9420. doi: 10.1021/acs.jmedchem.5b01013.
- 14. Obach RS, Walsky RL, Venkatakrishnan K. Mechanism-based inactivation of human cytochrome p450 enzymes and the prediction of drug-drug interactions. Drug Metab. Dispos. 2007;35:246–255. doi: 10.1124/dmd.106.012633.
- 15. Luft JR, et al. A deliberate approach to screening for initial crystallization conditions of biological macromolecules. J. Struct. Biol. 2003;142:170–179. doi: 10.1016/S1047-8477(03)00048-0.
- 16. Kabsch W. Integration, scaling, space-group assignment and post-refinement. Acta Crystallogr. D Biol. Crystallogr. 2010;66:133–144. doi: 10.1107/S0907444909047374.
- 17. Vagin A, Teplyakov A. Molecular replacement with MOLREP. Acta Crystallogr. D Biol. Crystallogr. 2010;66:22–25. doi: 10.1107/S0907444909042589.
- 18. McRee DE. XtalView/Xfit–A versatile program for manipulating atomic coordinates and electron density. J. Struct. Biol. 1999;125:156–165. doi: 10.1006/jsbi.1999.4094.
- 19. Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 2010;66:486–501. doi: 10.1107/S0907444910007493.
- 20. Adams PD, et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 2010;66:213–221. doi: 10.1107/S0907444909052925.
- 21. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl Acad. Sci. USA. 2001;98:10037–10041. doi: 10.1073/pnas.181342398.