Abstract
Targeted covalent drugs have demonstrated remarkable potential in disease treatment over the past decades. However, existing methods for covalent drug design are often limited to serine and cysteine, ignoring other potentially ligandable binding sites. Statistical analyses indicate that over 95% of binding pockets contain covalent-binding residues, suggesting that ligands targeting these pockets have the potential to be modified into covalent ligands. To achieve this goal, we introduce CovalentLab, an interactive computational platform that integrates ligand-based and warhead-based strategies into a unified workflow for the rational design of covalent ligands. Leveraging a covalent binding site prediction model constructed on ESM-2 with LoRA fine-tuning, CovalentLab enables the prediction and ranking of nine classes of covalent-binding residues in proteins according to their reactivity and facilitates systematic warhead attachment to ligands using 210 electrophilic groups or user-defined warheads. Using this platform, a comprehensive library of more than 100,000 covalent molecules across 95 targets was generated. Notably, CovalentLab has been successfully applied to essential real-world targets, identifying wet-laboratory-validated bioactive compounds ranging from TRK orthosteric inhibitors to GAC allosteric inhibitors. By bridging gaps in covalent drug discovery, CovalentLab offers a versatile, publicly accessible resource to expand the range of druggable targets and accelerate the development of targeted covalent therapies.
Keywords: covalent design platform, site prediction, web server, database, deep learning, TRK, GAC, post-translational modification


Introduction
The covalent strategy has emerged as a transformative paradigm in drug design, offering unique advantages such as prolonged target engagement, high potency, and the potential to overcome drug resistance, particularly in kinase and protease inhibitors. These benefits have led to the approval of covalent drugs such as ibrutinib and osimertinib. Recent advances have further extended the application of covalent drugs into the realm of post-translational modifications, including acetylation and phosphorylation. According to current statistics, covalent drugs constitute only about 7% of U.S. Food and Drug Administration (FDA)-approved therapies, and their applications are mainly confined to targeting serine and cysteine residues. This underscores the urgent need for broader exploration and optimization in covalent drug design.
Current covalent drug discovery is primarily guided by two intuitive strategies: ligand-based design and warhead-based approaches. The ligand-based strategy involves the rational modification of known reversible binders by incorporating electrophilic warheads, leveraging established noncovalent binding poses to accurately position the warhead for covalent engagement. This approach enhances target selectivity while minimizing off-target reactivity. Nevertheless, current methodologies predominantly depend on cysteine-directed electrophilic warheads, such as acrylamides, which has led to relative neglect of other nucleophilic residues. This overreliance on cysteine targeting constrains both the scope for innovation and the general applicability of covalent modulation strategies. In fact, other nucleophilic residues, such as lysine, tyrosine, and glutamic acid, have also been targeted covalently, and the targeting of residues other than cysteine has become increasingly prevalent in recent years. Conversely, the warhead-based approach adopts an electrophile-first strategy, initiating discovery by identifying warheads upfront rather than introducing covalency into a preexisting reversible ligand. This strategy enables the identification of covalent binders de novo, particularly for targets traditionally considered “undruggable” due to shallow binding pockets or the absence of high-affinity reversible ligands. However, identifying suitable target proteins and corresponding ligand scaffolds for a newly designed covalent warhead remains challenging. To address these problems, integrated approaches that combine the precision of ligand-based design with the exploratory breadth of warhead-based strategies are increasingly necessary.
Advances in database construction, computational modeling, and machine learning-guided reactivity prediction offer promising avenues for designing covalent compounds. On one hand, significant efforts have been made to establish comprehensive databases of covalent molecules, including cBinderD, CovPDB, CovBinderInPDB, and CovalentInDB. On the other hand, machine learning (ML)-based identification of covalently ligandable sites facilitates the design of targeted covalent inhibitors and contributes to expanding the druggable proteome. For example, Du et al. proposed a graph-based deep learning model, DeepCoSI, for predicting covalent drug binding sites, specifically targeting cysteine residues. Recently, Liu et al. developed predictive models for the covalent ligandability of cysteines in proteins, including a decision tree model based on physicochemical features and a three-dimensional convolutional neural network (3D-CNN) model. Although these efforts provided valuable computational tools, they primarily focused on cataloging existing covalent agents or predicting covalent binding sites for specific amino acids. They fell short of enabling flexible design of novel covalent molecules and did not fully address the limitations posed by restricted warhead diversity or narrow target selection. Consequently, the intelligent and rational design of covalent molecules remains a critical need in the field.
Herein, our analysis of PDBbind reveals that over 95% of binding pockets (residues within 6 Å of the ligand) contain nucleophilic residues capable of covalent bonding, highlighting broad potential to convert reversible ligands into irreversible inhibitors and expand covalent drug discovery. Building on this insight, we present CovalentLab, an interactive platform for the rational design of covalent molecules (Figure 1). The platform integrates traditional covalent strategies and post-translational modifications, combining ligand-based and warhead-based approaches into a single workflow. For a given protein–ligand complex, CovalentLab identifies nearby amino acid residues amenable to covalent modification, ranks their reactivity using a covalent binding site prediction model built on ESM-2 with low-rank adaptation (LoRA) fine-tuning, determines optimal warhead attachment sites on the ligand based on spatial proximity, and connects the ligand with selected warheads. Specifically, nine amino acids (Asp, Lys, His, Ser, Thr, Cys, Arg, Glu, and Tyr), which are collected in the CovalentInDB database as covalently targetable residues, were selected, and 210 distinct warheads from the database were employed for ligand construction. Notably, CovalentLab supports the submission of user-defined warheads, enabling the design of custom covalent ligands tailored to specific amino acids. Targets are ranked according to the minimum distance between the target amino acid and the corresponding ligand, allowing users to prioritize candidates based on spatial proximity for optimal covalent interaction. Furthermore, our method was used to generate a publicly available data set from 302 reversible protein–ligand complexes, from which over 100,000 covalent molecules were designed across 95 targets. More importantly, CovalentLab was further validated on two distinct proteins, the kinase TRK and the hydrolase GAC. In both cases, we successfully obtained wet-lab-validated covalent hit compounds.
In summary, CovalentLab offers a user-friendly, interactive platform for the rational design of covalent compounds, expanding the molecular design space through AI-driven algorithms and fully automated workflows, thereby addressing the needs of researchers across multiple disciplines. The platform and database are fully accessible to the public at https://www.medchemwise.com/CovalentLab.
Figure 1. Comparison of previous work and this work. In previous work, covalent drug design mainly targeted cysteine by attaching covalent warheads to existing reversible ligands. In this work, we present a one-shot platform that expands covalent targeting to nine amino acids, predicts reactivity rankings with an AI model, identifies warhead attachment points, incorporates warheads into ligands, and also enables custom warhead design by aligning appropriate ligands and targets to match user-defined warhead specifications.
Results
Statistics of Covalent Drugs and Covalent Targets
A comprehensive statistical and mechanistic analysis of 128 FDA-approved covalent drugs was performed to better characterize their current landscape and constraints, with a focus on the amino acid residues engaged in covalent bonding and the core chemical mechanisms underlying bond formation. As illustrated in Figure 2A, among these 128 covalent drugs, serine and cysteine were the most frequently targeted nucleophilic amino acids. Beyond amino acid selectivity, the analysis also indicated limited diversity in covalent reaction pathways. For serine-targeting compounds, only two primary mechanisms were predominantly used, addition–elimination and nucleophilic addition, despite the theoretical viability of several other chemical routes. Likewise, cysteine-targeting drugs predominantly rely on nucleophilic addition as the central mechanism. Notably, the data set reveals clear biases in warhead usage: β-lactams are most frequently employed for serine-targeting inhibitors, while acrylamides remain the preferred electrophiles for cysteine modification. Overall, the data reveal that current covalent design is constrained by a narrow range of targeted residues and reaction mechanisms.
Figure 2. Statistics of covalent drugs and covalent targets. (A) Statistical and mechanistic breakdown of FDA-approved covalent drugs. (B) Distribution of covalently reactive amino acids in active sites of PDBbind proteins.
To systematically assess the potential for covalent ligand engagement across protein–ligand complexes, we analyzed the binding pockets in the PDBbind v2020 data set, concentrating on amino acid residues located within 6 Å of the ligand centroid. Figure 2B indicates that over 95% of ligand-binding pockets possess at least one covalently modifiable amino acid residue, implying that nearly every binding site contains residues amenable to covalent targeting. This observation offers strong evidence that covalent binding opportunities are widespread among protein–ligand complexes. Nonetheless, this prevalence should be interpreted only as an indication of potential covalent tractability, not as evidence that every such residue is reactive. Cysteines, particularly in their deprotonated form, are generally the most reactive nucleophiles and have been extensively exploited in covalent inhibitor design. In contrast, lysines are typically protonated under physiological conditions, substantially diminishing their nucleophilicity, although specific microenvironments or proximity to catalytic residues can enhance their reactivity. Similarly, carboxylates (Asp/Glu) tend to exhibit lower intrinsic reactivity. Despite the widespread presence of nucleophilic residues, their reactivity is highly heterogeneous, and covalent warhead incorporation may fail to yield efficient covalent binding if the targeted residue lacks sufficient chemical tractability. The current scope of covalent drug discovery is limited, with the majority of research concentrated on a narrow set of reactive residues, particularly cysteine. To bridge this gap, we propose an integrated, streamlined platform designed to systematically support the identification, design, and optimization of covalent inhibitors across a broader array of amino acid targets and warhead chemistries.
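The pocket statistic described above can be illustrated with a minimal sketch. It assumes each complex has already been parsed into plain coordinate tuples; the function names, the input layout, and the toy coordinates are illustrative assumptions and are not taken from the CovalentLab code base.

```python
import math

# The nine residue types treated as covalently targetable in this work.
NUCLEOPHILIC = {"ASP", "GLU", "LYS", "ARG", "HIS", "CYS", "SER", "THR", "TYR"}

def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def pocket_residues(protein_atoms, ligand_coords, cutoff=6.0):
    """Residues with at least one atom within `cutoff` angstroms of the ligand centroid.

    protein_atoms: iterable of (chain, resname, resseq, (x, y, z)).
    """
    center = centroid(ligand_coords)
    hits = set()
    for chain, resname, resseq, xyz in protein_atoms:
        if math.dist(xyz, center) <= cutoff:
            hits.add((chain, resname, resseq))
    return hits

def has_covalent_candidate(residues):
    """True if any pocket residue belongs to the nine nucleophilic types."""
    return any(resname in NUCLEOPHILIC for _, resname, _ in residues)

def fraction_with_candidate(complexes, cutoff=6.0):
    """Fraction of (protein_atoms, ligand_coords) pairs whose pocket has a candidate."""
    flagged = sum(
        has_covalent_candidate(pocket_residues(protein, ligand, cutoff))
        for protein, ligand in complexes
    )
    return flagged / len(complexes)

# Toy input only; these coordinates are invented for illustration.
toy_a = ([("A", "CYS", 50, (1.0, 0.0, 0.0)), ("A", "LYS", 90, (20.0, 0.0, 0.0))],
         [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
toy_b = ([("A", "GLY", 1, (0.0, 0.0, 0.0)), ("A", "ALA", 2, (1.0, 1.0, 1.0))],
         [(0.0, 0.0, 0.0)])
print(fraction_with_candidate([toy_a, toy_b]))
```

A presence check of this kind only counts residue types; it says nothing about reactivity, which is why the prediction model described in the next section is needed.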
Covalent Site Prediction
Accurate identification of covalent binding sites is a fundamental prerequisite for the rational design of covalent molecules. To enable predictive modeling of covalent binding site reactivity, we applied LoRA to the pretrained ESM-2 protein language model and developed a specialized model for covalent binding site identification (Figure 3A). Leveraging experimentally validated annotations from the CovalentInDB database, we fine-tuned the model to capture sequence- and context-specific features linked to covalent reactivity within protein environments. To evaluate training efficiency, we compared cross-entropy loss (CE loss) and focal loss, which differ in how they handle class imbalance during optimization. Based on overall performance across multiple evaluation metrics, CE loss was selected as the most effective training objective. As shown in Figure 3B, the model achieved a recall exceeding 0.80 and a precision of 0.715, indicating strong capability in identifying true covalent binding sites while maintaining a moderate false-positive rate. The model also exhibited robust predictive performance, achieving an area under the ROC curve (AUC) of 0.981 on the test set (Figure 3C), reflecting high discriminative power between covalent and noncovalent residues. Building on this foundation, we applied the model to rank potentially reactive residues located near the active site of proteins. Specifically, protein structures were used to define active pockets, and neighboring amino acids were prioritized based on their predicted covalent reactivity. This application integrates sequence-based predictions with structural and functional context, enabling the rational identification of ligandable residues that are both chemically reactive and spatially accessible for covalent modification.
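To make the difference between the two training objectives concrete, the following sketch evaluates both losses for a single residue and computes precision and recall from binary predictions. The focal-loss parameters gamma = 2.0 and alpha = 0.25 are commonly used defaults and are an assumption here; the text does not report the values used for training.

```python
import math

def cross_entropy(p, y):
    """Binary cross-entropy for one residue; p = predicted P(covalent), y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy down-weighted for well-classified residues.

    gamma and alpha are assumed defaults, not values reported in the text.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = covalent site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# An easy, correctly classified residue contributes far less under focal loss.
print(cross_entropy(0.9, 1), focal_loss(0.9, 1))
# A hard, misclassified residue keeps most of its weight.
print(cross_entropy(0.1, 1), focal_loss(0.1, 1))
```

Because covalent sites are rare relative to all residues in a sequence, the two objectives weight the abundant, easily classified negatives very differently, which is the imbalance behavior the comparison was probing.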
Figure 3. Covalent binding site prediction model. (A) The ESM-2 protein language model was fine-tuned using LoRA to improve the accuracy of covalent binding site prediction. Figure adapted from Hu et al. under Creative Commons license. (B) Comparison results for two training loss functions used in covalent binding site prediction. (C) AUROC of the predictive model evaluated on the test data set.
Design and Implementation of Covalent Molecules
To address current limitations in covalent drug discovery, we developed CovalentLab, a covalent molecule design platform featuring a novel bidirectional design strategy: a target-to-ligand pipeline for precise covalent compound development and a warhead-to-target reverse design workflow for identifying suitable ligands based on electrophilic warheads. The complete workflow comprises three main stages: binding pocket characterization and covalent site prediction, identification of ligand modification sites, and covalent fragment generation and optimization (Figure 4A).
Figure 4. Workflow of CovalentLab. (A) Path 1 begins with a protein–ligand complex; covalent sites are identified, and appropriate warheads are selected from a curated library to generate covalent candidates via fragment linking. Path 2 starts with a user-defined warhead; recommended ligands and compatible targets are retrieved, and covalent molecules are directly assembled using the specified warhead. The resulting compounds are screened through docking to construct a covalent compound library. (B) Model logits are used to estimate the likelihood of covalent reactivity for amino acid residues within the protein pocket. (C) Two strategies for fragment-based molecular assembly.
In the initial stage, the protein–ligand complex is analyzed to define the ligand-binding pocket and identify potentially reactive amino acid residues. The pocket is characterized by selecting all protein residues within a user-defined distance threshold (typically about 6 Å) from any atom of the bound ligand. Within the defined region, amino acids possessing nucleophilic side chains, including Asp, Glu, Lys, Arg, His, Cys, Ser, Thr, and Tyr, are marked as potential covalent targets. These residues are automatically detected, and relevant information such as residue type, chain ID, and sequence position is recorded. The identified amino acids are then ranked by their predicted probability of covalent reactivity using our trained covalent site prediction model (Figure 4B). The logits produced by the model are passed through a softmax to obtain the probability at each position, and the positions are then ranked according to these probabilities.
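The ranking step can be sketched as follows. The sketch assumes the model emits two logits per sequence position (noncovalent, covalent) and that the softmax is taken over those two classes; this layout, the function names, and the toy logits are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rank_pocket_residues(per_residue_logits, pocket_positions):
    """Rank pocket positions by predicted probability of covalent reactivity.

    per_residue_logits: dict mapping sequence position -> (logit_noncovalent, logit_covalent).
    pocket_positions: positions of nucleophilic residues found in the pocket.
    Returns a list of (position, probability), highest probability first.
    """
    scored = []
    for pos in pocket_positions:
        prob_covalent = softmax(per_residue_logits[pos])[1]
        scored.append((pos, prob_covalent))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy logits, invented for illustration only.
toy_logits = {50: (0.0, 2.0), 90: (1.0, 1.0), 120: (3.0, -1.0)}
for position, prob in rank_pocket_residues(toy_logits, [90, 120, 50]):
    print(position, round(prob, 3))
```

Restricting the ranking to pocket positions is what combines the sequence-based prediction with the structural filter: the model scores every position, but only residues near the ligand are candidates.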
Once a targetable residue is identified, the ligand is systematically analyzed to determine suitable sites for covalent modification. Candidate anchoring atoms are selected based on their spatial proximity to the reactive residue, serving as the basis for subsequent structural fragmentation and warhead incorporation. Fragment generation follows two guiding principles to ensure both chemical tractability and structural fidelity. (1) If a selected anchoring atom contains hydrogen substituents, fragments are generated by substituting one hydrogen with a covalent warhead-bearing moiety, thereby retaining the original scaffold with minimal structural disruption. (2) Fragmentation is further informed by the hybridization state of the anchoring atom: for sp3-hybridized atoms, only hydrogen removal is permitted to preserve local 3D geometry and minimize steric strain; for sp2- or sp-hybridized atoms, selective bond cleavage is applied to yield a structurally diverse fragment library that accommodates planar or linear conjugation geometries (Figure 4C). This approach supports the generation of reactive fragments compatible with a wide range of covalent warhead chemistries. The generated fragments are conjugated with electrophilic warheads tailored to react with the designated amino acid residue. This method was applied to 302 protein–ligand complexes, and a potential covalent compound library comprising more than 100,000 molecules was constructed, with docking used to eliminate unreasonable conformations.
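The two fragmentation principles can be expressed as a small decision rule. In this sketch the ligand atoms are assumed to carry precomputed hybridization labels and hydrogen counts; the `LigandAtom` class, the operation names, and the `top_n` parameter are hypothetical and serve only to illustrate the logic described above.

```python
import math
from dataclasses import dataclass

@dataclass
class LigandAtom:
    index: int
    xyz: tuple          # (x, y, z) in angstroms
    hybridization: str  # "sp3", "sp2", or "sp"
    num_hydrogens: int  # hydrogens attached to this atom

def nearest_anchor_atoms(ligand_atoms, residue_xyz, top_n=3):
    """Candidate anchoring atoms: the ligand atoms closest to the reactive residue atom."""
    return sorted(ligand_atoms, key=lambda a: math.dist(a.xyz, residue_xyz))[:top_n]

def allowed_operations(atom):
    """Apply the two principles to one anchoring atom.

    Rule 1: an atom bearing hydrogens may have one hydrogen replaced by a warhead.
    Rule 2: sp3 atoms allow hydrogen replacement only; sp2/sp atoms also allow bond cleavage.
    """
    operations = []
    if atom.num_hydrogens > 0:
        operations.append("replace_hydrogen")
    if atom.hybridization in ("sp2", "sp"):
        operations.append("cleave_bond")
    return operations

# Toy ligand, invented for illustration only.
toy_ligand = [
    LigandAtom(0, (0.0, 0.0, 0.0), "sp3", 2),
    LigandAtom(1, (1.5, 0.0, 0.0), "sp2", 1),
    LigandAtom(2, (9.0, 0.0, 0.0), "sp2", 0),
]
for atom in nearest_anchor_atoms(toy_ligand, (2.0, 0.0, 0.0), top_n=2):
    print(atom.index, allowed_operations(atom))
```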
Platform
We have integrated the automated covalent molecular design workflow into our web platform, CovalentLab, offering an interactive design workspace tailored to accommodate a wide range of user requirements. The platform supports two distinct methods for covalent molecule design.
Module 1: Covalent Molecule Design
As illustrated in Figure 5A, users can either input a PDB ID, prompting automatic retrieval of the complex from the RCSB PDB database, or upload their own structural files for custom processing. During automated retrieval, the system retains the ligand with the highest number of heavy atoms while excluding solvent molecules and metal ions. The binding pocket is then defined as the region within a specified distance from the retained ligand. Within this pocket, the platform identifies amino acid residues suitable for covalent modification, providing chain identifiers and sequence positions. Users may select target residues either interactively via the visual interface or manually by entering residue information (Figure 5B). After selecting a target residue, users can choose from a diverse library of covalent warheads to explore modification strategies and generate potential drug candidates systematically. Upon clicking the “Run” button, the system produces three key outputs: molecular fragments suitable for covalent conjugation, designed covalent molecules with warheads linked to selected fragments, and the proposed reaction mechanism involving the target residue (Figure 5C). A download option in the upper right corner of the results section allows users to export generated molecular structures in CSV format for further analysis or integration with external modeling tools. By combining automation, interactive visualization, and flexible user input, CovalentLab enables researchers to design, optimize, and evaluate covalent drug candidates efficiently within a single workflow.
Figure 5. Web platform of CovalentLab. (A) The interactive interface of the covalent molecule design platform follows a three-step workflow: pocket selection, residue selection, and warhead selection. (B) The visualization interface for proteins and small molecules. (C) Presented results, including source fragments, designed covalent molecules, and predicted mechanisms. (D) The submission interface, which allows users to draw chemical structures.
Module 2: Covalent Warhead Application
The warhead-based covalent molecule design strategy has also been incorporated into a fully automated, user-interactive workflow accessible via our web platform. As illustrated in Figure 5D, users are provided with a flexible molecular editor that allows for the drawing and submission of custom reactive warheads directly within the canvas interface. Notably, this module supports the integration of novel or previously unreported electrophilic moieties, enabling users to explore an expanded chemical space beyond traditional covalent warhead libraries.
After designing a warhead, users can access our PDB recommendation system to identify protein–ligand complex structures suitable for warhead conjugation. The interface includes advanced filtering features: users may specify amino acid types to include or exclude within the target binding pocket, facilitating precise selection of reactive residues. Additionally, a customizable distance filter allows users to define the minimum spatial proximity between the electrophilic warhead and the side chains of the target residues, ensuring that only relevant complexes are returned. The system organizes output recommendations into major protein classes, including enzymes, ion channels, membrane proteins, chaperones, receptors, and others. This classification aids in rapid navigation across diverse protein types and supports users in selecting complexes aligned with their therapeutic or mechanistic goals. Once a suitable complex is chosen, the pipeline advances through site annotation, molecular fragmentation, and warhead incorporation, completing a fully integrated covalent ligand discovery workflow.
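The filtering logic of the recommendation step can be sketched as below. The sketch interprets the distance filter as an upper bound on the ligand-to-side-chain distance and assumes that each complex record carries precomputed per-residue-type minimum distances; the `ComplexRecord` fields, the `recommend` function, and the record identifiers are hypothetical and do not describe the platform's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class ComplexRecord:
    pdb_id: str
    protein_class: str
    # Residue type -> minimum ligand-to-side-chain distance in angstroms (assumed precomputed).
    pocket_min_distance: dict = field(default_factory=dict)

def recommend(records, include=(), exclude=(), max_distance=4.0):
    """Filter complexes by residue types and distance, then group hits by protein class.

    include: residue types that must lie within max_distance of the ligand.
    exclude: residue types that must not appear in the pocket at all.
    """
    grouped = {}
    for rec in records:
        near = {res for res, d in rec.pocket_min_distance.items() if d <= max_distance}
        if any(res not in near for res in include):
            continue
        if any(res in rec.pocket_min_distance for res in exclude):
            continue
        grouped.setdefault(rec.protein_class, []).append(rec.pdb_id)
    return grouped

# Placeholder records with invented distances, for illustration only.
toy_records = [
    ComplexRecord("REC1", "enzymes", {"LYS": 3.2, "ASP": 5.5}),
    ComplexRecord("REC2", "enzymes", {"LYS": 6.1}),
    ComplexRecord("REC3", "receptors", {"LYS": 3.9, "CYS": 4.5}),
]
print(recommend(toy_records, include=["LYS"], exclude=["CYS"], max_distance=4.0))
```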
By combining user-defined warheads, binding site analysis, and PDB-wide filtering, our platform enables the design of structurally diverse covalent modulators, thereby supporting the discovery of potential first-in-class compounds.
Library
We have developed and integrated a comprehensive online resource centered around a curated data set of more than 100,000 covalent small molecules, designed across 95 protein targets, to facilitate the discovery and rational design of covalent inhibitors. Figure 6 illustrates the distribution of targets and the chemical space of molecules, showing that the library substantially expands the chemical space of existing covalent compounds. This web-based platform features an intuitive, multilayered search interface that enables users to efficiently access target-specific molecular data via a four-tiered query system (primary target classification, secondary classification, specific target name, and corresponding PDB ID), leading to associated protein structures from the PDB. For each PDB entry, all potential covalent ligands targeting available reactive residues are systematically cataloged and presented. Each candidate molecule is annotated with detailed information, including the original ligand scaffold, the identity and position of the targeted amino acid residue, and the mechanistic class of the covalent reaction involved. These annotations provide users with a comprehensive view of each compound’s covalent binding mode. Additionally, each ligand entry is paired with a precomputed docking score, offering an estimate of binding affinity. Considering the substantial computational cost and time associated with covalent docking, a noncovalent docking approach was employed for evaluation. The results are visualized using an interactive 3D module, which allows viewing of noncovalent interactions within the ligand–protein complex. Beyond structural and interaction data, the platform offers drug-likeness metrics, including Lipinski’s “Rule of Five”, the quantitative estimate of drug-likeness (QED), and synthetic accessibility (SA) scores. Taken together, these indicators provide a holistic evaluation of each compound’s potential as a lead molecule.
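As a brief illustration of one of the drug-likeness filters, the sketch below counts Lipinski "Rule of Five" violations from precomputed descriptors. The thresholds are the standard ones (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); the `Descriptors` class, the tolerance of one violation, and the example values are assumptions, and the sketch does not compute QED or SA scores.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    mol_weight: float       # g/mol
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def lipinski_violations(d):
    """Number of Rule-of-Five criteria that the molecule breaks (0 to 4)."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])

def passes_rule_of_five(d, max_violations=1):
    """Common convention: tolerate at most one violation (an assumption here)."""
    return lipinski_violations(d) <= max_violations

# Descriptor values invented for illustration only.
print(passes_rule_of_five(Descriptors(450.0, 3.2, 2, 6)))
print(passes_rule_of_five(Descriptors(620.0, 5.6, 2, 6)))
```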
By uniting molecular design, structural analysis, and drug-likeness evaluation in a single, user-friendly interface, this library serves as a valuable tool for medicinal chemists and structural biologists working to develop covalent inhibitors with improved efficacy, selectivity, and synthetic feasibility. Additionally, the curated molecular data set can be leveraged to train machine learning models, further accelerating covalent drug discovery.
Figure 6. Targets and chemical space distribution of the library. (A) Distribution of library targets across major classes, including enzymes, receptors, chaperones, membrane proteins, and ion channels. (B) UMAP visualization of the potential covalent molecules in the library compared with the chemical space of existing covalent inhibitors.
Case Study
To demonstrate the universality and robustness of our proposed method, we selected two representative targets, PRMT6 (protein arginine methyltransferase 6) and ABL1 (Abelson murine leukemia viral oncogene homologue 1), to validate both the ligand-based covalent inhibitor design workflow and the warhead-based virtual screening strategy. For the ligand-based design workflow, PRMT6 (PDB ID: 5E8R) was chosen as a case study. The protein sequence derived from this structure was submitted to our covalent binding site prediction model (Figure 7A), which identified Cys50 as the most probable site for covalent modification. Notably, this prediction aligns with existing literature, where Cys50 is confirmed as a reactive site for covalent ligand engagement in PRMT6. Guided by this prediction, we performed covalent molecule design focused on residue Cys50. Compound 1 matched a previously reported covalent PRMT6 inhibitor, validating the accuracy and practical utility of our platform. We then applied the warhead-based virtual screening workflow to ABL1. Three electrophilic warheads were submitted via the interface, with lysine specified as the target nucleophilic residue. Within the kinase-specific classification module of our system, we identified 2QOH as a PDB entry meeting the spatial and chemical criteria for covalent conjugation, based on distance constraints. Our covalent site prediction model subsequently ranked Lys271 as the top candidate for covalent modification, consistent with prior experimental studies confirming covalent engagement at this site (Figure 7B). Following this prediction, we conducted covalent docking and de novo molecule generation. Compounds 2, 3, and 4 were successfully designed, each consistent with previously reported covalent ABL1 inhibitors. The alignment of our predictions with experimental literature highlights the potential of our platform as a robust and generalizable tool for the rational design of covalent inhibitors across diverse protein targets (Figure 7).
Figure 7. Two examples used to validate the covalent platform: PRMT6 and ABL1. (A) Probability ranking of covalent binding sites in PRMT6 (PDB ID: 5E8R). (B) Probability ranking of covalent binding sites in ABL1 (PDB ID: 2QOH).
Wet-Lab Validation of CovalentLab
To further demonstrate the practical applicability of CovalentLab, we conducted wet-lab experimental validation targeting two therapeutically relevant proteins: tropomyosin receptor kinase A (TRKA) and glutaminase C (GAC). These proteins have recently garnered significant attention as promising therapeutic targets, but few covalent modulators have been explored for them, presenting valuable opportunities for novel inhibitor development.
Case 1: Design of Covalent Inhibitors Targeting TRKs
NTRK gene (encoding TRK) fusions are well-established oncogenic drivers and were the first approved targets for tumor-agnostic therapies. The TRK family encompasses three subtypes, namely TRKA, TRKB, and TRKC. While pan-TRK inhibitors such as larotrectinib and entrectinib have demonstrated strong clinical efficacy, their effectiveness is often compromised by acquired resistance mutations (e.g., TRKA G667C). Covalent inhibitors offer a promising chemical strategy to address such resistance. Using the covalent molecule design module within CovalentLab, we initiated the development of the first covalent inhibitors targeting TRK. Beginning with the cocrystal structure of TRKA bound to the reversible orthosteric inhibitor AZ-23 (PDB ID: 4AOJ), our model predicted several potential covalent binding sites. Cys656, Lys544, and Asp596 were the top three candidates, but Cys656 was dismissed due to its suboptimal orientation (Figure 8A). Lys544 was ultimately chosen for covalent targeting. To ensure favorable fragment geometry and spatial fit, we applied an input fragment retention score of 0.4 during the design process. Based on synthetic feasibility and docking results, eight compounds (5–12) were selected for further study (Figure 8B).
Figure 8. Experimental results related to TRK targets. (A) Probability ranking of covalent binding sites in TRKA (PDB ID: 4AOJ). (B) Structures of compounds 5–12. (C) Proposed binding mode of compound 7 (white) in TRKA, modeled using the known inhibitor AZ-23 (PDB: 4AOJ). (D) Mass spectrum of TRKA kinase domain following 2 h incubation with compound 7. (E) LC–MS/MS spectra of compound 7-labeled peptides (residues 539–547) derived from TRKA. (F) Covalent binding rates of compound 7 with the TRK family and the TRKA G667C mutant. (G) In vitro IC50 values for 7, AZ-23, and LOXO-101 against TRKA and TRKA G667C. (H) Inhibition of TRKA activation and downstream signaling in KM12-Luc cells.
Using compound 7 as a representative, molecular modeling studies were conducted to assess covalent binding potential. The results indicated that introducing the 2-ethynylbenzaldehyde (EBA) functionality did not interfere with protein binding, and the ∼2.8 Å distance between the EBA moiety and Lys544 suggests that covalent bond formation between the inhibitor and the kinase is possible (Figure 8C). Intact protein mass spectrometry confirmed covalent attachment for compounds 7–9, with compound 7 exhibiting the highest binding rate (99.2%) (Figures 8D and S1). Further analysis by liquid chromatography–tandem mass spectrometry (LC–MS/MS) identified Lys544 as the covalent binding site, with a high modification rate of 95%, supporting strong site selectivity under physiological conditions (Figure 8E). MALDI–TOF analysis indicated that compound 7 also exhibited high reactivity toward TRKB and TRKC, with single-site binding rates exceeding 89% within 1 h (Figures 8F and S2). Moreover, compound 7 formed a single lysine covalent adduct with the TRKA G667C mutant at a rate of 53.9% (Figures 8F and S2). In vitro kinase assays demonstrated potent inhibition of TRKA, TRKB, and TRKC by compound 7, yielding IC50 values of 11.50, 1.04, and 1.90 nM, respectively (Figures 8G, S3C, and S3D). Notably, compound 7 displayed enhanced activity against the TRKA G667C mutant (IC50 = 0.16 nM), outperforming both LOXO-101 and AZ-23, suggesting its potential to overcome resistance mutations. As shown in Figure S3A, compound 7 achieved an IC50 of 166 nM in cellular assays. We then carried out Western blot analysis to examine the effects of compound 7 on the activation of TRKs and their downstream signaling in KM12-Luc cells. As illustrated in Figure 8H, compound 7 efficiently suppressed the phosphorylation of TRKA as well as its downstream effector AKT.
Furthermore, time-dependent intact protein MS revealed that compound 7 exhibited strong affinity toward TRKA (K_I = 0.800 μM), a rapid inactivation rate (k_inact = 0.0578 min⁻¹), and a k_inact/K_I of 1.20 × 10³ M⁻¹·s⁻¹, indicating strong binding and a fast covalent interaction with TRKA (Figure S4B). Finally, we evaluated the selectivity profile of compound 7 against a panel of over 400 human kinases at 200 nM. Compound 7 demonstrated high selectivity, maintaining complete inhibition of TRKs and their mutant forms, with a selectivity score, S-score(20), of 0.142, further confirming its target specificity (Figure S4C). These results indicate the potential of compound 7 as a novel covalent inhibitor of TRK.
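The reported second-order efficiency follows directly from the two fitted constants once the units are made consistent. The short check below converts the inactivation rate from per-minute to per-second and the inhibition constant from micromolar to molar; the function name is illustrative.

```python
def second_order_rate(k_inact_per_min, K_I_micromolar):
    """Covalent efficiency k_inact/K_I in M^-1 s^-1.

    k_inact is converted from min^-1 to s^-1 and K_I from micromolar to molar.
    """
    k_inact_per_s = k_inact_per_min / 60.0
    K_I_molar = K_I_micromolar * 1e-6
    return k_inact_per_s / K_I_molar

# Values reported for compound 7 against TRKA.
efficiency = second_order_rate(0.0578, 0.800)
print(f"{efficiency:.0f} M^-1 s^-1")  # about 1204, i.e. 1.20 x 10^3
```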
Case 2: Design of Covalent Inhibitors Targeting GAC
GAC is a mitochondrial enzyme in the glutaminase (GLS) family that catalyzes the hydrolysis of glutamine to glutamate and plays a critical role in metabolic reprogramming, proliferation, and cancer cell growth. Using the Covalent Molecule Design module of CovalentLab, we developed a series of covalent inhibitors targeting the allosteric pocket of GAC. As a starting ligand, we selected SY-6b, a reversible GAC allosteric inhibitor previously reported by Sun et al. Covalent binding site prediction identified Lys320, Tyr394, and Lys398 as the top three candidate residues (Figure A). After further validation, Lys320 was selected as the covalent modification site. Prepared protein structures and ligand candidates were uploaded to the platform for warhead selection and covalent docking. Based on docking conformations and synthetic feasibility, compounds 14 and 15 were identified as promising covalent inhibitors (Figures B and S6B). Intact protein mass spectrometry (MS) analysis confirmed covalent modification of GAC by both compounds, with compound 14 showing the higher labeling efficiency (∼45%) (Figures S5A and S5C). Because the ligands bind to an allosteric site at the GAC dimer interface, where the theoretical maximum labeling efficiency is 50%, this value corresponds to roughly 90% of the attainable maximum and suggests near-complete engagement by 14. Further LC–MS/MS analysis of 14-labeled peptides confirmed Lys320 as the specific covalent binding site, in agreement with our structure-based design (Figure S5B). In addition, compound 15 displayed significantly enhanced GAC enzymatic inhibition compared to the reversible inhibitor 13, demonstrating key features of potent covalent inhibition (Figure E). However, despite their potent GAC inhibitory activity, cellular activity assays revealed limited efficacy, indicating possible limitations in cell permeability or intracellular target engagement that require further investigation.
Figure 9. Experimental results for GAC targets. (A) Probability ranking of predicted covalent binding sites in GAC. (B) Structures of compounds 14–16. (C) Mass spectrometry (MS) analysis of GAC-16 after 8 h incubation at 25 °C. (D) Site-mapping of compound 16 covalently bound to GAC at Lys320, identified by LC–MS/MS. (E) Inhibitory activity of compounds 13–16 against GAC. (F) Acetylation levels of GAC WT and GAC K320A detected by Western blotting after treatment with varying concentrations of compound 16 for 45 min. (G) Assessment of intracellular reactive oxygen species (ROS) generation in A549 cells following exposure to compound 16 by flow cytometry. n = 3. (H) Intracellular glutamate (GLU)/glutamine (GLN) ratio after 16 h treatment with compound 16. Statistical comparisons between treated and control groups were analyzed using an unpaired t-test. n = 3. ***p < 0.01.
To further validate the effectiveness of the platform, we employed CovalentLab’s post-translational modification module to design a Lys320-specific acetyltransferase mimetic. Structure-based molecular docking and computational feasibility analysis guided the selection and synthesis of compound 16 (Figures B and S6B). Western blot analysis comparing wild-type and Lys320 mutant GAC revealed a concentration-dependent acetylation pattern induced by compound 16 (Figure F). At low concentrations (≤2 μM), 16 selectively acetylated wild-type GAC, with no detectable modification of the Lys320 mutant. However, this selectivity diminished at higher concentrations (≥10 μM), as evidenced by emerging acetylation signals in the Lys320 mutant, indicating a dose-dependent decline in target specificity. Mass spectrometric analysis demonstrated that compound 16 exhibited strong selectivity, with 71.05% of total protein products displaying acetylation (Figure C). Subsequent peptide mapping identified Lys320 as the primary modification site, with an acetylation occupancy of 67.48% (Figure D), confirming the compound’s site-specific targeting capability. Further biological evaluation showed that 16 had enhanced antiproliferative activity against tumor cells compared to compounds 14 and 15 (Figure S6B). Additionally, treatment with 16 significantly increased intracellular reactive oxygen species (ROS) levels and impaired glutaminase activity (Figures G and S6D), as reflected by a marked reduction in glutamine-to-glutamate conversion (Figure H). These results indicate that compound 16 is capable of mediating post-translational modification in the allosteric pocket of GAC, thereby providing valuable insights for further studies of the GAC protein.
Discussion
In this study, we developed CovalentLab, a platform that integrates ligand-based and warhead-based approaches into a unified workflow and enables the design of molecules capable of covalently binding to proteins or inducing post-translational modifications. The workflow incorporates an ESM-2-based model to prioritize covalently targetable residues near the binding pocket; guided by spatial constraints, the platform then automatically pinpoints suitable modification sites and appends the corresponding warheads. In addition, it supports user-defined covalent warheads, thereby greatly expanding the design space for novel covalent molecules. Our method was applied to 95 carefully curated targets and 302 protein–ligand complexes, leading to the construction of a database of over 100,000 covalent compounds, which provides a valuable resource for training machine learning models. Unlike previous computational strategies that focused on specific targets or amino acid types, CovalentLab offers a generalized and automated solution applicable to a broad range of protein–ligand complexes, significantly expanding the landscape of covalent drug discovery. The platform has been used to design covalent inhibitors targeting GAC and TRK that were subsequently validated experimentally. Its applicability extends to diverse binding pocket types, including both orthosteric and allosteric sites, highlighting the ability of CovalentLab to generate bioactive molecules across functionally diverse protein targets.
Our platform addresses critical gaps in traditional covalent drug design by offering an AI-driven, fully automated methodology that reduces the reliance on trial and error that characterizes traditional covalent compound development. CovalentLab will continue to be expanded and refined, incorporating a broader range of target proteins, novel warhead chemistries, and enhanced predictive algorithms. We anticipate that this platform will become a valuable resource for the drug discovery community, streamlining the design of highly selective and potent covalent inhibitors across a wide range of therapeutic targets.
Materials and Methods
Data Collection and Processing
To train our covalent binding site prediction model, we utilized all available covalent molecule entries from CovalentInDB 2.0, comprising a total of 3,598 records. After removing duplicate protein targets, 1,933 unique protein–ligand interaction data points were retained. These data were randomly split into training, validation, and test sets at a ratio of 8:1:1. The distribution of target classes and binding mechanisms was further analyzed using a curated data set of FDA-approved covalent drugs, which included 128 compounds compiled by Dalton et al. Additionally, the index file from the PDBbind v2020 data set was downloaded, providing 19,443 PDB IDs for assessing the prevalence and diversity of binding modes.
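The random 8:1:1 partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record identifiers, the fixed seed, and the choice to assign the rounding remainder to the test set are all assumptions, since the text does not state the exact subset sizes.

```python
import random


def split_8_1_1(records, seed=0):
    """Shuffle records and split them into train/validation/test at 8:1:1."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 8 // 10                  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# Placeholder identifiers standing in for the 1,933 retained data points.
records = [f"record_{i}" for i in range(1933)]
train, val, test = split_8_1_1(records)
print(len(train), len(val), len(test))
```

With 1,933 records and this rounding convention, the three subsets contain 1,546, 193, and 194 entries. A different rounding rule would shift one or two records between subsets.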
Database
The initial protein–ligand complexes used for database construction were derived from the PDBbind v2020 data set. We analyzed the target distribution within PDBbind and identified 95 common targets, comprising a total of 302 PDB structures, to guide the design of irreversible molecules. For each target, 1–5 ligands with diverse scaffolds were manually selected as design precursors. To improve the rationality of molecular design, endogenous substances such as polysaccharides, peptides, and ATP were excluded during ligand selection. Furthermore, we ensured that the selected ligands had a molecular weight greater than 200 Da and contained at least two ring systems. The covalent compounds generated using the platform were subsequently validated through molecular docking studies. Within the AI-driven platform DrugFlow, we employed the Inno-Docking module to perform docking simulations, evaluate protein–ligand binding affinity, and assess the conformational plausibility of the docked complexes. This step provides a predicted ligand orientation within the protein binding pocket. Additionally, a variety of fundamental chemical and biological properties were computed using RDKit (version 2021), providing comprehensive profiling of the designed compounds.
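The two numeric ligand criteria above (molecular weight above 200 Da and at least two ring systems) can be expressed as a simple filter. The sketch below is pure Python and makes two assumptions that are not in the text: a "ring system" is taken to be a set of rings connected through shared atoms (so a fused bicycle counts once), and the molecular weight and per-ring atom indices are supplied as inputs. With RDKit, such inputs could be obtained from, for example, `Descriptors.MolWt(mol)` and `mol.GetRingInfo().AtomRings()`. The `Ligand` class and the example entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


def count_ring_systems(atom_rings: Sequence[Tuple[int, ...]]) -> int:
    """Count ring systems, merging rings that share at least one atom."""
    systems: List[Set[int]] = []
    for ring in atom_rings:
        merged = set(ring)
        remaining = []
        for system in systems:
            if system & merged:      # overlaps the growing system: absorb it
                merged |= system
            else:
                remaining.append(system)
        remaining.append(merged)
        systems = remaining
    return len(systems)


@dataclass
class Ligand:
    name: str
    mol_weight: float                  # in Da
    atom_rings: List[Tuple[int, ...]]  # atom indices of each ring


def passes_precursor_filter(lig: Ligand) -> bool:
    """Apply the two numeric criteria: MW > 200 Da and >= 2 ring systems."""
    return lig.mol_weight > 200.0 and count_ring_systems(lig.atom_rings) >= 2


# Hypothetical candidates with made-up weights and ring atom indices.
candidates = [
    Ligand("fused_bicycle", 250.0, [(0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8, 9)]),
    Ligand("two_separate_rings", 310.0, [(0, 1, 2, 3, 4, 5), (7, 8, 9, 10, 11, 12)]),
    Ligand("small_biaryl", 154.2, [(0, 1, 2, 3, 4, 5), (6, 7, 8, 9, 10, 11)]),
]
selected = [lig.name for lig in candidates if passes_precursor_filter(lig)]
print(selected)
```

Only the second candidate passes: the first has a single fused ring system, and the third falls below the weight threshold. The manual steps described in the text (scaffold diversity, exclusion of endogenous substances) are not captured by this filter.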
Server and Browser Compatibility
CovalentLab is built on the Django 5.1.2 framework and Python 3.11, employing PostgreSQL as the database system to ensure robust and scalable data management. For molecular visualization and analysis, it integrates Mol*, an open-source, web-based toolkit that offers advanced functionality for exploring molecular data.
Results Availability
The CovalentLab web server is freely available at https://www.medchemwise.com/CovalentLab. The code and data for covalent site prediction have been uploaded to https://github.com/WJmodels/Covalent-site-prediction.
Supplementary Material
Acknowledgments
This work was financially supported by the National Natural Science Foundation of China (NSFC no. 82373718), CAMS Innovation Fund for Medical Sciences (CIFMs 2021-1-2M-028), 2024 China Industrial Technology Infrastructure Public Service Platform Project (GN2024-31-4700), Beijing Natural Science Foundation (L248084), the Fundamental Research Funds for the Central Universities, Peking Union Medical College (3332025191), and the Young Elite Scientist Sponsorship Program by Beijing Association for Science and Technology (BAST) (No. BYESS2024087). The computing resources were supported by the Biomedical High-Performance Computing Platform, Chinese Academy of Medical Sciences. We would like to thank Professor Shaoqing Yao and his student Peng Chen for their valuable suggestions on this work.
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jacsau.5c01161.
X.X., X.L., and X.L. contributed equally.
The authors declare no competing financial interest.
References
- Lanman B. A., Allen J. R., Allen J. G., Amegadzie A. K., Ashton K. S., Booker S. K., Chen J. J., Chen N., Frohn M. J., Goodman G., Kopecky D. J., Liu L., Lopez P., Low J. D., Ma V., Minatti A. E., Nguyen T. T., Nishimura N., Pickrell A. J., Reed A. B., Shin Y., Siegmund A. C., Tamayo N. A., Tegley C. M., Walton M. C., Wang H. L., Wurz R. P., Xue M., Yang K. C., Achanta P., Bartberger M. D., Canon J., Hollis L. S., McCarter J. D., Mohr C., Rex K., Saiki A. Y., San Miguel T., Volak L. P., Wang K. H., Whittington D. A., Zech S. G., Lipford J. R., Cee V. J. Discovery of a Covalent Inhibitor of KRAS(G12C) (AMG 510) for the Treatment of Solid Tumors. J. Med. Chem. 2020;63:52–65. doi: 10.1021/acs.jmedchem.9b01180.
- McAulay K., Bilsland A., Bon M. Reactivity of Covalent Fragments and Their Role in Fragment Based Drug Discovery. Pharmaceuticals. 2022;15:1366. doi: 10.3390/ph15111366.
- Roskoski R. Orally effective FDA-approved protein kinase targeted covalent inhibitors (TCIs). Pharmacol. Res. 2021;165:105422. doi: 10.1016/j.phrs.2021.105422.
- Boike L., Henning N. J., Nomura D. K. Advances in covalent drug discovery. Nat. Rev. Drug Discovery. 2022;21:881–898. doi: 10.1038/s41573-022-00542-z.
- Honigberg L. A., Smith A. M., Sirisawad M., Verner E., Loury D., Chang B., Li S., Pan Z., Thamm D. H., Miller R. A., Buggy J. J. The Bruton tyrosine kinase inhibitor PCI-32765 blocks B-cell activation and is efficacious in models of autoimmune disease and B-cell malignancy. Proc. Natl. Acad. Sci. U. S. A. 2010;107:13075–13080. doi: 10.1073/pnas.1004594107.
- Finlay M. R., Anderton M., Ashton S., Ballard P., Bethel P. A., Box M. R., Bradbury R. H., Brown S. J., Butterworth S., Campbell A., Chorley C., Colclough N., Cross D. A., Currie G. S., Grist M., Hassall L., Hill G. B., James D., James M., Kemmitt P., Klinowska T., Lamont G., Lamont S. G., Martin N., McFarland H. L., Mellor M. J., Orme J. P., Perkins D., Perkins P., Richmond G., Smith P., Ward R. A., Waring M. J., Whittaker D., Wells S., Wrigley G. L. Discovery of a potent and selective EGFR inhibitor (AZD9291) of both sensitizing and T790M resistance mutations that spares the wild type form of the receptor. J. Med. Chem. 2014;57:8249–8267. doi: 10.1021/jm500973a.
- Drazic A., Myklebust L. M., Ree R., Arnesen T. The world of protein acetylation. Biochim. Biophys. Acta, Protein Struct. Mol. Enzymol. 2016;1864:1372–1401. doi: 10.1016/j.bbapap.2016.06.007.
- Tang G., Wang X., Huang H., Xu M., Ma X., Miao F., Lu X., Zhang C. J., Gao L., Zhang Z. M., Yao S. Q. Small Molecule-Induced Post-Translational Acetylation of Catalytic Lysine of Kinases in Mammalian Cells. J. Am. Chem. Soc. 2024;146:23978–23988. doi: 10.1021/jacs.4c07181.
- Ramazi S., Zahiri J. Posttranslational modifications in proteins: resources, tools and prediction methods. Database. 2021;2021:baab012. doi: 10.1093/database/baab012.
- Zhang W., Pei J., Lai L. Statistical Analysis and Prediction of Covalent Ligand Targeted Cysteine Residues. J. Chem. Inf. Model. 2017;57:1453–1460. doi: 10.1021/acs.jcim.7b00163.
- White M. E. H., Gil J., Tate E. W. Proteome-wide structural analysis identifies warhead- and coverage-specific biases in cysteine-focused chemoproteomics. Cell Chem. Biol. 2023;30:828–838 e824. doi: 10.1016/j.chembiol.2023.06.021.
- Dalton S. E., Di Pietro O., Hennessy E. A Medicinal Chemistry Perspective on FDA-Approved Small Molecule Drugs with a Covalent Mechanism of Action. J. Med. Chem. 2025;68:2307–2313. doi: 10.1021/acs.jmedchem.4c02661.
- Singh J. The Ascension of Targeted Covalent Inhibitors. J. Med. Chem. 2022;65:5886–5901. doi: 10.1021/acs.jmedchem.1c02134.
- Yuan B., Liu J., Wu Y., Chen M., Lai Y., Zhao H.-Y., Yang Z., Zhang S.-Q., Xin M. Lysine-Targeted Covalent Strategy Leading to the Discovery of Novel Potent PROTAC-Based PI3Kδ Degraders. J. Med. Chem. 2025;68:11437–11467. doi: 10.1021/acs.jmedchem.5c00408.
- Yuan B., Ma M., Wu Y., Liu J., Chen M., Lai Y., Zhang S. Q., Xin M. Discovery of novel covalent PI3Kdelta inhibitors bearing alaninamide moiety by lysine-targeted covalent strategy. Eur. J. Med. Chem. 2025;297:117948. doi: 10.1016/j.ejmech.2025.117948.
- Yuan B., Feng Y., Ma M., Duan W., Wu Y., Liu J., Zhao H. Y., Yang Z., Zhang S. Q., Xin M. Lysine-Targeted Covalent Inhibitors of PI3Kdelta Synthesis and Screening by In Situ Interaction Upgradation. J. Med. Chem. 2024;67:20076–20099. doi: 10.1021/acs.jmedchem.4c01284.
- Mehta N. V., Degani M. S. The expanding repertoire of covalent warheads for drug discovery. Drug Discovery Today. 2023;28:103799. doi: 10.1016/j.drudis.2023.103799.
- Gehringer M., Laufer S. A. Emerging and Re-Emerging Warheads for Targeted Covalent Inhibitors: Applications in Medicinal Chemistry and Chemical Biology. J. Med. Chem. 2019;62:5673–5724. doi: 10.1021/acs.jmedchem.8b01153.
- Petri L., Egyed A., Bajusz D., Imre T., Hetenyi A., Martinek T., Abranyi-Balogh P., Keseru G. M. An electrophilic warhead library for mapping the reactivity and accessibility of tractable cysteines in protein kinases. Eur. J. Med. Chem. 2020;207:112836. doi: 10.1016/j.ejmech.2020.112836.
- Pettinger J., Jones K., Cheeseman M. D. Lysine-Targeting Covalent Inhibitors. Angew. Chem., Int. Ed. 2017;56:15200–15209. doi: 10.1002/anie.201707630.
- Zhao Z., Bourne P. E. Exploring Extended Warheads toward Developing Cysteine-Targeted Covalent Kinase Inhibitors. J. Chem. Inf. Model. 2024;64:9517–9527. doi: 10.1021/acs.jcim.4c00890.
- Jones L. H. Advances in sulfonyl exchange chemical biology: expanding druggable target space. Chem. Sci. 2025;16:10119–10140. doi: 10.1039/D5SC02647D.
- Gai C., Harnor S. J., Zhang S., Cano C., Zhuang C., Zhao Q. Advanced approaches of developing targeted covalent drugs. RSC Med. Chem. 2022;13:1460–1475. doi: 10.1039/D2MD00216G.
- Zhao B., Xu S., Zhou S., Jiang X., Jiang A., Lei H., Zhai X. Research progress on covalent inhibitors targeting alkaline amino acids. Bioorg. Chem. 2025;163:108800. doi: 10.1016/j.bioorg.2025.108800.
- Chen P., Tang G., Zhu C., Sun J., Wang X., Xiang M., Huang H., Wang W., Li L., Zhang Z. M., Gao L., Yao S. Q. 2-Ethynylbenzaldehyde-Based, Lysine-Targeting Irreversible Covalent Inhibitors for Protein Kinases and Nonkinases. J. Am. Chem. Soc. 2023;145:3844–3849. doi: 10.1021/jacs.2c11595.
- Tamura T., Kawano M., Hamachi I. Targeted Covalent Modification Strategies for Drugging the Undruggable Targets. Chem. Rev. 2025;125:1191–1253. doi: 10.1021/acs.chemrev.4c00745.
- Bianco G., Goodsell D. S., Forli S. Selective and Effective: Current Progress in Computational Structure-Based Drug Discovery of Targeted Covalent Inhibitors. Trends Pharmacol. Sci. 2020;41:1038–1049. doi: 10.1016/j.tips.2020.10.005.
- Zhao Z., Liu Q., Bliven S., Xie L., Bourne P. E. Determining Cysteines Available for Covalent Inhibition Across the Human Kinome. J. Med. Chem. 2017;60:2879–2889. doi: 10.1021/acs.jmedchem.6b01815.
- Xie J., Dong R., Zhu J., Lin H., Wang S., Lai L. MMFuncPhos: A Multi-Modal Learning Framework for Identifying Functional Phosphorylation Sites and Their Regulatory Types. Adv. Sci. 2025;12:e2410981. doi: 10.1002/advs.202410981.
- Hasan M. N., Ray M., Saha A. Landscape of In Silico Tools for Modeling Covalent Modification of Proteins: A Review on Computational Covalent Drug Discovery. J. Phys. Chem. B. 2023;127:9663–9684. doi: 10.1021/acs.jpcb.3c04710.
- Zhou Y., Yu H., Vind A. C., Kong L., Liu Y., Song X., Tu Z., Yun C., Smaill J. B., Zhang Q. W., Ding K., Bekker-Jensen S., Lu X. Rational Design of Covalent Kinase Inhibitors by an Integrated Computational Workflow (Kin-Cov). J. Med. Chem. 2023;66:7405–7420. doi: 10.1021/acs.jmedchem.3c00088.
- Scarpino A., Ferenczy G. G., Keseru G. M. Comparative Evaluation of Covalent Docking Tools. J. Chem. Inf. Model. 2018;58:1441–1458. doi: 10.1021/acs.jcim.8b00228.
- Wu S., Luo H., Wang H., Zhao W., Hu Q., Yang Y. Cysteinome: The first comprehensive database for proteins with targetable cysteine and their covalent inhibitors. Biochem. Biophys. Res. Commun. 2016;478:1268–1273. doi: 10.1016/j.bbrc.2016.08.109.
- Du J., Yan X., Liu Z., Cui L., Ding P., Tan X., Li X., Zhou H., Gu Q., Xu J. cBinderDB: a covalent binding agent database. Bioinformatics. 2017;33:1258–1260. doi: 10.1093/bioinformatics/btw801.
- Gao M., Moumbock A. F. A., Qaseem A., Xu Q., Gunther S. CovPDB: a high-resolution coverage of the covalent protein-ligand interactome. Nucleic Acids Res. 2022;50:D445–D450. doi: 10.1093/nar/gkab868.
- Guo X. K., Zhang Y. CovBinderInPDB: A Structure-Based Covalent Binder Database. J. Chem. Inf. Model. 2022;62:6057–6068. doi: 10.1021/acs.jcim.2c01216.
- Du H., Gao J., Weng G., Ding J., Chai X., Pang J., Kang Y., Li D., Cao D., Hou T. CovalentInDB: a comprehensive database facilitating the discovery of covalent inhibitors. Nucleic Acids Res. 2021;49:D1122–D1129. doi: 10.1093/nar/gkaa876.
- Du H., Zhang X., Wu Z., Zhang O., Gu S., Wang M., Zhu F., Li D., Hou T., Pan P. CovalentInDB 2.0: an updated comprehensive database for structure-based and ligand-based covalent inhibitor design and screening. Nucleic Acids Res. 2025;53:D1322–D1327. doi: 10.1093/nar/gkae946.
- Du H., Jiang D., Gao J., Zhang X., Jiang L., Zeng Y., Wu Z., Shen C., Xu L., Cao D., et al. Proteome-Wide Profiling of the Covalent-Druggable Cysteines with a Structure-Based Deep Graph Learning Network. Research. 2022;2022:9873564. doi: 10.34133/2022/9873564.
- Liu R., Clayton J., Shen M., Bhatnagar S., Shen J. Machine Learning Models to Interrogate Proteome-Wide Covalent Ligandabilities Directed at Cysteines. JACS Au. 2024;4:1374–1384. doi: 10.1021/jacsau.3c00749.
- Liu Z., Su M., Han L., Liu J., Yang Q., Li Y., Wang R. Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions. Acc. Chem. Res. 2017;50:302–309. doi: 10.1021/acs.accounts.6b00491.
- Lin Z., Akin H., Rao R., Hie B., Zhu Z., Lu W., Smetanin N., Verkuil R., Kabeli O., Shmueli Y., dos Santos Costa A., Fazel-Zarandi M., Sercu T., Candido S., Rives A. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. 2023;379:1123–1130. doi: 10.1126/science.ade2574.
- Hu E. J., Shen Y., Wallis P., Allen-Zhu Z., Li Y., Wang S., Wang L., Chen W. LoRA: Low-rank adaptation of large language models. arXiv. 2022.
- Amatu A., Sartore-Bianchi A., Bencardino K., Pizzutilo E. G., Tosi F., Siena S. Tropomyosin receptor kinase (TRK) biology and the role of NTRK gene fusions in cancer. Ann. Oncol. 2019;30:viii5–viii15. doi: 10.1093/annonc/mdz383.
- Hensley C. T., Wasti A. T., DeBerardinis R. J. Glutamine and cancer: cell biology, physiology, and clinical opportunities. J. Clin. Invest. 2013;123:3678–3684. doi: 10.1172/JCI69600.
- Rappoport Z. The rapid steps in nucleophilic vinylic addition-elimination substitution. Recent developments. Acc. Chem. Res. 1992;25:474–479. doi: 10.1021/ar00022a007.
- Gassman P. G., O’Reilly N. J. Nucleophilic addition of the pentafluoroethyl group to aldehydes, ketones, and esters. J. Org. Chem. 1987;52:2481–2490. doi: 10.1021/jo00388a025.
- Liu R., Verma N., Henderson J. A., Zhan S., Shen J. Profiling MAP kinase cysteines for targeted covalent inhibitor design. RSC Med. Chem. 2022;13:54–63. doi: 10.1039/D1MD00277E.
- Mukherjee H., Grimster N. P. Beyond cysteine: recent developments in the area of targeted covalent inhibition. Curr. Opin. Chem. Biol. 2018;44:30–38. doi: 10.1016/j.cbpa.2018.05.011.
- Xu Y., Wang S., Hu Q., Gao S., Ma X., Zhang W., Shen Y., Chen F., Lai L., Pei J. CavityPlus: a web server for protein cavity detection with pharmacophore modelling, allosteric site identification and covalent ligand binding ability prediction. Nucleic Acids Res. 2018;46:W374–W379. doi: 10.1093/nar/gky380.
- Mao A., Mohri M., Zhong Y. Cross-Entropy Loss Functions: Theoretical Analysis and Applications. In International Conference on Machine Learning; PMLR, 2023, pp. 23803–23828.
- Lin T.-Y., Goyal P., Girshick R., He K., Dollár P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision; IEEE, 2017, pp. 2980–2988.
- Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H., Shindyalov I. N., Bourne P. E. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
- https://www.guidetopharmacology.org/ (accessed 20 July 2025).
- McInnes L., Healy J., Melville J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv. 2018;1802:03426. doi: 10.48550/arXiv.1802.03426.
- Cai H., Shen C., Jian T., Zhang X., Chen T., Han X., Yang Z., Dang W., Hsieh C. Y., Kang Y., Pan P., Ji X., Song J., Hou T., Deng Y. CarsiDock: a deep learning paradigm for accurate protein-ligand docking and screening based on large-scale pre-training. Chem. Sci. 2024;15:1449–1471. doi: 10.1039/D3SC05552C.
- Lipinski C. A., Lombardo F., Dominy B. W., Feeney P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 2001;46:3–26. doi: 10.1016/S0169-409X(00)00129-0.
- Bickerton G. R., Paolini G. V., Besnard J., Muresan S., Hopkins A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012;4:90–98. doi: 10.1038/nchem.1243.
- Ertl P., Schuffenhauer A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminf. 2009;1:8. doi: 10.1186/1758-2946-1-8.
- Cao M. T., Feng Y., Zheng Y. G. Protein arginine methyltransferase 6 is a novel substrate of protein arginine methyltransferase 1. World J. Biol. Chem. 2023;14:84–98. doi: 10.4331/wjbc.v14.i5.84.
- Chen P., Sun J., Zhu C., Tang G., Wang W., Xu M., Xiang M., Zhang C.-J., Zhang Z. M., Gao L., et al. Cell-Active, Reversible, and Irreversible Covalent Inhibitors That Selectively Target the Catalytic Lysine of BCR-ABL Kinase. Angew. Chem., Int. Ed. 2022;61:e202203878. doi: 10.1002/anie.202203878.
- Eram M. S., Shen Y., Szewczyk M., Wu H., Senisterra G., Li F., Butler K. V., Kaniskan H. U., Speed B. A., Dela Sena C., Dong A., Zeng H., Schapira M., Brown P. J., Arrowsmith C. H., Barsyte-Lovejoy D., Liu J., Vedadi M., Jin J. A Potent, Selective, and Cell-Active Inhibitor of Human Type I Protein Arginine Methyltransferases. ACS Chem. Biol. 2016;11:772–781. doi: 10.1021/acschembio.5b00839.
- Zhou T., Parillon L., Li F., Wang Y., Keats J., Lamore S., Xu Q., Shakespeare W., Dalgarno D., Zhu X. Crystal structure of the T315I mutant of Abl kinase. Chem. Biol. Drug Des. 2007;70:171–181. doi: 10.1111/j.1747-0285.2007.00556.x.
- Quach D., Tang G., Anantharajan J., Baburajendran N., Poulsen A., Wee J. L. K., Retna P., Li R., Liu B., Tee D. H. Y., Kwek P. Z., Joy J. K., Yang W. Q., Zhang C. J., Foo K., Keller T. H., Yao S. Q. Strategic Design of Catalytic Lysine-Targeting Reversible Covalent BCR-ABL Inhibitors. Angew. Chem., Int. Ed. 2021;60:17131–17137. doi: 10.1002/anie.202105383.
- Wang J. B., Erickson J. W., Fuji R., Ramachandran S., Gao P., Dinavahi R., Wilson K. F., Ambrosio A. L., Dias S. M., Dang C. V., Cerione R. A. Targeting mitochondrial glutaminase activity inhibits oncogenic transformation. Cancer Cell. 2010;18:207–219. doi: 10.1016/j.ccr.2010.08.009.
- Robinson M. M., McBryant S. J., Tsukamoto T., Rojas C., Ferraris D. V., Hamilton S. K., Hansen J. C., Curthoys N. P. Novel mechanism of inhibition of rat kidney-type glutaminase by bis-2-(5-phenylacetamido-1,2,4-thiadiazol-2-yl)ethyl sulfide (BPTES). Biochem. J. 2007;406:407–414. doi: 10.1042/BJ20070039.
- Zhang H., Zhong J., Gucwa M., Zhang Y., Ma H., Deng L., Mao L., Minor W., Wang N., Zheng H. PinMyMetal: a hybrid learning system to accurately model transition metal binding sites in macromolecules. Nat. Commun. 2025;16:3043. doi: 10.1038/s41467-025-57637-5.
- Jiang T., Wang G., Liu Y., Feng L., Wang M., Liu J., Chen Y., Ouyang L. Development of small-molecule tropomyosin receptor kinase (TRK) inhibitors for NTRK fusion cancers. Acta Pharm. Sin. B. 2021;11:355–372. doi: 10.1016/j.apsb.2020.05.004.
- Yan W., Lakkaniga N. R., Carlomagno F., Santoro M., McDonald N. Q., Lv F., Gunaganti N., Frett B., Li H. Y. Insights into Current Tropomyosin Receptor Kinase (TRK) Inhibitors: Development and Clinical Application. J. Med. Chem. 2019;62:1731–1760. doi: 10.1021/acs.jmedchem.8b01092.
- Scott L. J. Larotrectinib: First Global Approval. Drugs. 2019;79:201–206. doi: 10.1007/s40265-018-1044-x.
- Al-Salama Z. T., Keam S. J. Entrectinib: First Global Approval. Drugs. 2019;79:1477–1483. doi: 10.1007/s40265-019-01177-y.
- Russo M., Misale S., Wei G., Siravegna G., Crisafulli G., Lazzari L., Corti G., Rospo G., Novara L., Mussolin B., Bartolini A., Cam N., Patel R., Yan S., Shoemaker R., Wild R., Di Nicolantonio F., Bianchi A. S., Li G., Siena S., Bardelli A. Acquired Resistance to the TRK Inhibitor Entrectinib in Colorectal Cancer. Cancer Discovery. 2016;6:36–44. doi: 10.1158/2159-8290.CD-15-0940.
- Drilon A., Laetsch T. W., Kummar S., DuBois S. G., Lassen U. N., Demetri G. D., Nathenson M., Doebele R. C., Farago A. F., Pappo A. S., Turpin B., Dowlati A., Brose M. S., Mascarenhas L., Federman N., Berlin J., El-Deiry W. S., Baik C., Deeken J., Boni V., Nagasubramanian R., Taylor M., Rudzinski E. R., Meric-Bernstam F., Sohal D. P. S., Ma P. C., Raez L. E., Hechtman J. F., Benayed R., Ladanyi M., Tuch B. B., Ebata K., Cruickshank S., Ku N. C., Cox M. C., Hawkins D. S., Hong D. S., Hyman D. M. Efficacy of Larotrectinib in TRK Fusion–Positive Cancers in Adults and Children. N. Engl. J. Med. 2018;378:731–739. doi: 10.1056/NEJMoa1714448.
- Pisa R., Kapoor T. M. Chemical strategies to overcome resistance against targeted anticancer therapeutics. Nat. Chem. Biol. 2020;16:817–825. doi: 10.1038/s41589-020-0596-8.
- Wang T., Lamb M. L., Block M. H., Davies A. M., Han Y., Hoffmann E., Ioannidis S., Josey J. A., Liu Z. Y., Lyne P. D., MacIntyre T., Mohr P. J., Omer C. A., Sjogren T., Thress K., Wang B., Wang H., Yu D., Zhang H. J. Discovery of Disubstituted Imidazo[4,5-b]pyridines and Purines as Potent TrkA Inhibitors. ACS Med. Chem. Lett. 2012;3:705–709. doi: 10.1021/ml300074j.
- Katt W. P., Lukey M. J., Cerione R. A. A tale of two glutaminases: homologous enzymes with distinct roles in tumorigenesis. Future Med. Chem. 2017;9:223–243. doi: 10.4155/fmc-2016-0190.
- Yeh T. K., Kuo C. C., Lee Y. Z., Ke Y. Y., Chu K. F., Hsu H. Y., Chang H. Y., Liu Y. W., Song J. S., Yang C. W., Lin L. M., Sun M., Wu S. H., Kuo P. C., Shih C., Chen C. T., Tsou L. K., Lee S. J. Design, Synthesis, and Evaluation of Thiazolidine-2,4-dione Derivatives as a Novel Class of Glutaminase Inhibitors. J. Med. Chem. 2017;60:5599–5612. doi: 10.1021/acs.jmedchem.7b00282.
- Szeliga M., Obara-Michlewska M. Glutamine in neoplastic cells: focus on the expression and roles of glutaminases. Neurochem. Int. 2009;55:71–75. doi: 10.1016/j.neuint.2009.01.008.
- Sun H., Du T., Yang M., Liu X., Xue X., Chen K., Lang X., Chen X., Wang B., Wang X. Targeting the Subpocket Enables the Discovery of Thiadiazole-Pyridazine Derivatives as Glutaminase C Inhibitors. ACS Med. Chem. Lett. 2023;14:1455–1466. doi: 10.1021/acsmedchemlett.3c00375.
- Ray P. D., Huang B.-W., Tsuji Y. Reactive oxygen species (ROS) homeostasis and redox regulation in cellular signaling. Cell. Signalling. 2012;24:981–990. doi: 10.1016/j.cellsig.2012.01.008.
- Shen C., Song J., Hsieh C. Y., Cao D., Kang Y., Ye W., Wu Z., Wang J., Zhang O., Zhang X., Zeng H., Cai H., Chen Y., Chen L., Luo H., Zhao X., Jian T., Chen T., Jiang D., Wang M., Ye Q., Wu J., Du H., Shi H., Deng Y., Hou T. DrugFlow: An AI-Driven One-Stop Platform for Innovative Drug Discovery. J. Chem. Inf. Model. 2024;64:5381–5391. doi: 10.1021/acs.jcim.4c00621.
- Landrum G. RDKit: open-source cheminformatics; https://www.rdkit.org.
- Sehnal D., Bittrich S., Deshpande M., Svobodova R., Berka K., Bazgier V., Velankar S., Burley S. K., Koca J., Rose A. S. Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Res. 2021;49:W431–W437. doi: 10.1093/nar/gkab314.