Abstract

RET receptor tyrosine kinase is crucial for nerve and tissue development but can be an important oncogenic driver. This study focuses on exploring the design principles of potent RET inhibitors through molecular docking and 3D-QSAR modeling of 5,6-fused bicyclic heteroaromatic derivatives. First of all, RET inhibitors of 49 different bicyclic substructures were collected from five different data sources and selected through molecular docking simulations. QSAR models were built from the 3399 conformers of 952 RET inhibitors using the partial least-squares method and statistically evaluated. The optimal QSAR model exhibited high predictive performance, with R2 (of training data) and Q2 (of test data) values of 0.801 and 0.794, respectively, effectively predicting known inhibitors. The optimal model was doubly verified by patent-filed RET inhibitors as the out-of-set data to demonstrate acceptable residual analysis results. Moreover, feature importance analysis of the QSAR model outlined the impact of substituent characteristics on the inhibitory activity within the 5,6-fused bicyclic heteroaromatic core structures. Furthermore, the relationship between structure and inhibitory activity was successfully applied to the RET screening of known clinical and nonclinical kinase inhibitors to afford accurate off-target prediction.
1. Introduction
The rearranged during transfection (RET) receptor tyrosine kinase plays a crucial role in the development of the enteric nervous system and genitourinary tissues.1 Cancer research has identified RET as an oncogenic driver in nonsmall-cell lung cancer (NSCLC),2 and RET alterations are significantly associated with various malignancies.3 In RET fusion proteins, ligand-independent homodimerization and autophosphorylation resulting from the intrachromosomal rearrangement of RET activate downstream pathways such as PI3K/AKT and MAPK.3 Cancers related to RET alterations encompass lung adenocarcinoma, colon adenocarcinoma, medullary carcinoma of the thyroid gland, cutaneous melanoma, and melanoma; of the various alterations, RET point mutations and fusions are prevalent,4 and both are closely linked to unfavorable prognoses. Notably, the incidence of brain metastasis among RET-altered, RET(+), patients stands at 46%, markedly higher than that observed in their RET(−) counterparts.5 Although rare, RET rearrangements delineate a distinct subtype of metastatic colorectal cancer characterized by poor prognosis under conventional therapies, warranting specialized management approaches.6 Following pivotal clinical trials, multikinase inhibitors (MKIs) and selective RET inhibitors have received FDA approval for the management of radioiodine-refractory differentiated thyroid cancer or metastatic medullary thyroid cancer (MTC).7 Notably, Selpercatinib (also known as Retevmo or LOXO-292) has been granted Accelerated Approval by the FDA.8 This approval applies to the use of the RET kinase inhibitor in patients with metastatic NSCLC harboring RET fusions, as well as those with MTC harboring RET mutations. Recent data demonstrate the efficacy of Selpercatinib over standard care therapy in cancer types characterized by recurrent RET alterations.9
Although RET inhibitors (Figure 1) have been continuously reported, quantitative structure–activity relationship (QSAR) models and other predictive models have not been sufficiently studied. To the best of our knowledge, the reported 3D-QSAR models for RET kinase inhibitors used focused data sets sharing one common core structure (scaffold), such as indolin-2-one10 or pyrrolo[2,3-d]pyrimidine.11 However, the ideal design of kinase inhibitors necessitates the replacement and variation of both core structures and side chains to optimize the synthetic method, drug efficacy, and ADMET.12 Heterocycle substitution, among several scaffold hopping strategies, offers a direct means of modifying the core structure of a known bioactive compound, which is especially beneficial for molecules posing synthetic hurdles. This method entails the replacement of carbon atoms with heteroatoms within the core ring of drug molecules, with the goal of improving their physicochemical properties and pharmacokinetic profile.13 Moreover, many recently reported RET inhibitors possess a 5,6-fused bicyclic ring that interacts with the hinge region of RET kinase and is therefore termed a hinge binder.14,15 In particular, 5,6-fused bicyclic rings have been used more frequently in RET-selective inhibitors than in multikinase inhibitors.
Figure 1.
Summary of FDA-approved RET inhibitors. Abbreviation: TKI, tyrosine kinase inhibitor; DTC, radioiodine-refractory differentiated thyroid cancer; MTC, metastatic medullary thyroid cancer; NSCLC, nonsmall-cell lung cancer.
Therefore, we believe that researchers involved in the discovery of kinase inhibitors have a strong need to investigate the subtle SAR among these 5,6-fused bicyclic derivatives. In other words, we were motivated to develop 3D-QSAR models that explain 5,6-fused bicyclic hinge binders, which have increasingly entered nonclinical and clinical studies for RET(+) patients. For this purpose, we collected RET inhibitors from public databases, classified them based on the ring system of their hinge binders, and compiled a data set of 952 inhibitors covering 49 different 5,6-fused bicyclic heteroaromatic rings. In this study, we elucidate the subtle SAR differences among 5,6-fused bicyclic derivatives through molecular docking simulations, used for the 3D alignment of the data set, and 3D-QSAR modeling. Furthermore, the evaluation of the best QSAR model is presented to provide guidance on the rational design of selective and potent RET inhibitors. In particular, RET inhibitors filed in a patent were used as a third, out-of-set data set that was included in neither the training nor the test data. In parallel, the RET model was applied to RET screening as an off-target prediction of known kinase inhibitors. The workflow of this study is illustrated in Figure 2.
Figure 2.
Overall workflow of this study.
2. Materials and Methods
2.1. Data Collection and Preparation
Information on RET kinase inhibitor activity was retrieved from the ChEMBL (version 33),16 BindingDB,17 and Excape18 databases. Information on 4763 RET inhibitors from patents and other sources was compiled and curated using a KNIME workflow. All of the preprocessing steps, such as target selection and bioactivity transformation, were performed using the KNIME analytical platform.19,20 The selection criteria included the target protein, assay details, and activity value type. Activity data were restricted to wild-type, noncell-based assays against target entries such as Kinesin-1 heavy chain/tyrosine-protein kinase receptor RET, proto-oncogene tyrosine-protein kinase receptor RET, and tyrosine-protein kinase receptor RET, for which inhibitory concentration (IC50) values were measured in nanomolar (nM) units. Compounds with activity annotations featuring blank fields or values containing “<” or “>” were excluded. To standardize the data and facilitate analysis, IC50 values were converted into their negative logarithmic form using the formula pIC50 = 9 – log10(IC50), with IC50 expressed in nM.
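The curation and transformation step above can be sketched in a few lines of Python. This is a minimal illustration, not the KNIME workflow used in the study: the function names (`to_pic50`, `curate`) and the record layout are hypothetical, and the compound entries are made-up placeholders rather than study data.

```python
import math

def to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = 9 - log10(IC50)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive number")
    return 9.0 - math.log10(ic50_nm)

def curate(records):
    """Drop blank or censored ('<' / '>') activity values, convert the rest.

    `records` is a list of (compound_name, raw_ic50_text) pairs;
    this layout is an assumption made for the example.
    """
    kept = []
    for name, raw in records:
        text = (raw or "").strip()
        if not text or "<" in text or ">" in text:
            continue  # excluded, as described in Section 2.1
        kept.append((name, to_pic50(float(text))))
    return kept

# Placeholder entries (not taken from the data set).
example = [("cpd_a", "1"), ("cpd_b", ">10000"), ("cpd_c", ""), ("cpd_d", "1000")]
for name, pic50 in curate(example):
    print(name, round(pic50, 2))
```

With this convention, 1 nM corresponds to pIC50 = 9 and 1 µM (1000 nM) to pIC50 = 6.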
2.2. Molecular Docking Simulations
Every ligand for molecular docking simulations was prepared using the LigPrep package from Schrödinger.21 The 2D-to-3D conversion included options for neutralizing compounds within the pH range of 7 ± 2 (ref 22) and for not generating tautomers. The resulting structures of the 4763 compounds were docked to the RET protein. The crystal structure of the RET complex with an inhibitor is available in the Protein Data Bank (PDB ID: 7DUA),23 and it was prepared via protein preparation (at pH 7 with the OPLS3e force field) in the Schrödinger Suite 2019-1.24 The protein preparation involved adding hydrogen atoms (PROPKA) and removing unnecessary chains and water molecules, and the OPLS3e force field was applied to minimize the protein structure.25−27 The protein underwent a restrained minimization to refine its geometry, converging at an RMSD of 0.3 Å. The grid was centered on the workspace ligand.28 For the receptor grid, the partial charge cutoff was set to 0.25, the van der Waals radius scaling factor was set to 1.0, and a docking box of 20 Å × 20 Å × 20 Å was defined around the centroid of the ligand. Two spherical areas for the predicted binding site were determined through hydrogen-bonding interactions: (1) the oxygen of the carbonyl group of GLU805 as the H-bond donor and (2) the hydrogen of the amine group of ALA807 as the H-bond acceptor.29,30 Glide was used for ligand docking with Maestro’s default settings, namely a ligand partial charge cutoff of 0.15 and a van der Waals radius scaling factor of 0.8.31 Ligand docking was run in the standard precision mode, matching all of the grid constraints set previously, and the maximum number of poses per ligand was set to five. The docking model was validated by calculating the RMSD (0.0162 Å) between the X-ray ligand and the redocked ligand and by comparing their poses.
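The redocking check at the end of this section rests on an RMSD between two sets of atomic coordinates. The sketch below shows the plain, in-place RMSD (no superposition) under the assumption that both poses list the same atoms in the same order; the function name and the coordinates are illustrative and are not taken from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """In-place RMSD (Å) between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Illustrative coordinates only: a "crystal" pose and a copy shifted by 1 Å in x.
xray = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(x + 1.0, y, z) for x, y, z in xray]
print(rmsd(xray, xray), rmsd(xray, redocked))
```

No superposition is applied because a redocked pose and the crystal ligand already share the receptor frame; symmetry-equivalent atoms are not handled in this sketch.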
2.3. Substructure-Based Data-Set Selection
The method for selecting the data set involved the similarity analysis of docking poses based on the literature.32 The docking pose that formed hydrogen bonds with the hinge residue ALA807 or GLU805 was selected. A list of 49 substructure filters involving fused bicyclic heteroaromatic rings, including pyrrole, pyrazole, pyrimidine, pyridine, and indazole,33−35 was applied to remove molecules likely to be nonreactive.
2.4. Developing 3D-QSAR Models
For QSAR modeling, PHASE36 implemented in Maestro 11.937 was used. While receptor-guided alignment of data sets is commonly employed in 3D-QSAR modeling, PHASE is specifically designed for pharmacophore-based 3D-QSAR modeling. Many QSAR studies using PHASE have reported a general method consisting of (1) conformer generation with a reasonable number of conformers within the available computing resources, (2) building pharmacophore hypotheses with different feature sets and numbers of points, (3) selecting the best pharmacophore model using scoring functions, and then (4) building the final 3D-QSAR model using the best pharmacophore model.38 However, the large and heterogeneous data set in this study prevented implementation of this typical process, mainly owing to (1) constraints on using a sufficiently large conformer ensemble (e.g., more than 952 inhibitors × 100 conformers) and (2) inconsistencies in molecular superimposition, which arise either between substituents when the bicyclic cores are fixed or between cores when the substitution positions are fixed. Therefore, receptor-guided alignment using docking poses was performed for building the QSAR models. Using the alignment from the best docking poses, various models were generated by varying the grid spacing (1.0, 1.5, and 2.0 Å), the number of partial least-squares (PLS) factors (up to 10), and the data splitting method, such as LMO (leave many out) or LOO (leave one out). The prepared data set was randomly divided into a training set and a test set at a 9:1 ratio. The PLS method was used for creating the QSAR model. For model generation, variables with a |t-value| below 2.0 were eliminated, and the grid spacing was 1.0 Å. Model validation was conducted using 10-fold cross-validation (10-CV), in which 306 compounds were left out of the training set in each fold. The models were selected based on the criteria of low root-mean-square error (RMSE) and standard deviation (SD) values: RMSE below 0.8 and SD below 0.7.
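The data partitioning described above can be illustrated with a short sketch. It assumes that the 9:1 split and the 10 folds are applied to the 3399 docked conformers and uses integer placeholders instead of real structures; the function names and the fixed seed are hypothetical. It is not the PHASE implementation, only a way to check the fold size of about 306 quoted in the text.

```python
import random

def split_train_test(items, test_fraction=0.1, seed=0):
    """Randomly split items into (train, test) at the given test fraction."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_folds(items, k=10):
    """Partition items into k folds of near-equal size."""
    return [items[i::k] for i in range(k)]

conformers = list(range(3399))          # placeholders for 3399 docked conformers
train, test = split_train_test(conformers)
folds = k_folds(train, k=10)
print(len(train), len(test), sorted({len(f) for f in folds}))
```

Under these assumptions the training set holds 3059 entries, so each cross-validation fold leaves out 305 or 306 of them.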
2.5. Out-of-Set Validation of the Optimal QSAR Model
The out-of-set molecules consisted of 190 inhibitors with inhibitory activity (IC50) against wild-type RET kinase, which were collected from a patent (US 10,807,986 B2). In Maestro, the 2D ligands were neutralized and converted into 3D structures through ligand preparation without considering tautomerism. Ligand docking was performed using the extra-precision mode without constraints, generating multiple docking poses for each molecule. Activity was predicted using the QSAR model, and the results were selected based on the docking poses and the residual between the experimental and predicted activities.
2.6. Application of the Predictive QSAR Model into Virtual Screening
The criteria used for filtering the kinase inhibitors, excluding RET inhibitors, were based on characteristics such as SlogP, TPSA, and molecular weight (Table 1). Public databases including PKIDB39 and MRC40 were used as screening libraries. Physicochemical data for the molecules used in 3D-QSAR modeling were prevalidated using the KNIME workflow and subsequently employed for filtering purposes. A total of 459 compounds could be docked against the RET protein following the same configuration as described in Section 2.2. The 3D-QSAR model was then used as a three-dimensional query to distinguish the hits with the highest predicted RET inhibitory activity. Two methods were employed to select the top 30 data sets including the structure information, docking score, actual activity (pIC50), and predicted activity (pIC50) most closely resembling a reference molecule based on their interactions. First, a specific cutoff for the predicted activity was applied, followed by an evaluation of the interaction between docking poses and the target protein. Second, prioritization was based on rankings derived from the docking scores and predicted activity values.
Table 1. Filtering Criteria for Virtual Screeninga.
The criteria used for filtering were based on characteristics such as SlogP, TPSA, molecular weight, number of Lipinski hydrogen-bond acceptors and donors, number of rotatable bonds, number of amide bonds, number of heteroatoms, number of rings, number of aromatic rings, number of aliphatic rings, number of aromatic heterocycles, and number of aliphatic heterocycles.
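One way to apply such descriptor-based criteria is to derive an allowed range for each descriptor from a reference set and keep only library molecules that fall inside every range. The sketch below assumes minimum-to-maximum ranges, which is an assumption of this example and not a statement of the thresholds in Table 1; the descriptor values and function names are invented for illustration.

```python
def descriptor_ranges(reference):
    """Return {descriptor: (min, max)} over a list of descriptor dicts."""
    keys = reference[0].keys()
    return {k: (min(m[k] for m in reference), max(m[k] for m in reference))
            for k in keys}

def within_ranges(molecule, ranges):
    """True if every descriptor of the molecule lies inside its range."""
    return all(lo <= molecule[k] <= hi for k, (lo, hi) in ranges.items())

# Invented descriptor values for a tiny reference set and screening library.
reference = [
    {"SlogP": 2.1, "TPSA": 85.0, "MW": 410.0},
    {"SlogP": 4.8, "TPSA": 120.0, "MW": 560.0},
]
library = {
    "lib_1": {"SlogP": 3.0, "TPSA": 100.0, "MW": 450.0},
    "lib_2": {"SlogP": 6.2, "TPSA": 90.0, "MW": 480.0},
}
ranges = descriptor_ranges(reference)
passed = [name for name, mol in library.items() if within_ranges(mol, ranges)]
print(passed)
```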
3. Results
3.1. Activity Prediction of a Broad Range of 5,6-Fused Heteroaromatic Compounds
As outlined in the Introduction, the selectivity between RET and other receptor tyrosine kinases is important for clinical development, especially in terms of drug resistance. However, to the best of our knowledge, there are no reports providing a design logic for discriminating between MKIs and RET-selective inhibitors along with predicted potency. In this situation, the features of RET-selective inhibitors can serve as a useful clue. Notably, 5,6-fused heteroaromatic rings are frequently observed in both selective RET inhibitors and recently reported patents. Therefore, we expected that QSAR modeling with compounds containing 5,6-fused heteroaromatic cores could come closer to the drug design logic (for selectivity and potency). When we collected RET inhibitors with enzymatic IC50 values from five different data sources (Figure 3), a considerable number of compounds out of the 4763 inhibitors featured 5,6-fused heteroaromatic rings in the substructure analysis using the KNIME workflow. Furthermore, while typical QSAR models use around 100 data points with pIC50 values spanning a range of about 3 (a 1000-fold gap between the best and worst data), we could acquire a data set that is ten times larger and has an activity range a thousand times wider (size: 4763; IC50: <10 mM to >0.1 nM; pIC50 range: >6), as shown in the distribution of RET inhibitory activities as pIC50 values (Figure 4).
Figure 3.
Composition of the initial collected data.
Figure 4.

Distribution of the pIC50 value of collected RET inhibitors.
Next, we generated 3D conformations21 of the 5,6-fused heteroaromatic compounds and conducted their molecular docking simulations.31 The docking poses resulting from the simulations approximate the bioactive conformation and also provide the superimposition (3D alignment) of the data set.38 After substructure filtering of the conformers aligned through the docking simulations, 49 different 5,6-fused bicyclic heteroaromatic rings were observed as substructures in the RET inhibitors (Supporting Information). Hierarchical clustering revealed the structural relationships between these 49 substructures,41 and 10 representative structures were chosen as centroids from the 49 substructures (Figure 5). The substructure filtering process left 952 ligands, corresponding to 3399 poses from the docking simulations. A total of 3399 pairs of structures and pIC50 values were selected and used to construct the 3D-QSAR model. The finalized data set of 3399 conformers was randomly divided into a training set and a test set at a 9:1 ratio. Then, the training set was fitted to a partial least-squares (PLS) regression equation, which was generated based on the interaction features determined by the distance from each atom of the data set. The test set was used to validate each PLS model.
Figure 5.
Hierarchical clustering of RET inhibitors and 10 representative ring structures out of 49 substructures.
The top five models were selected based on the statistical metrics, particularly the RMSE (Table 2). The RMSE denotes the root-mean-square error between the actual and predicted pIC50 values within a given data set and has the same unit as the SD of the actual pIC50 values. The optimal model presented an RMSE of 0.670 (log nM), an R2 of 0.801, and a correlation coefficient of 0.891 between the experimental and predicted pIC50. Hydrophobicity (H) is the largest contributor in every model, followed by the electron-withdrawing (E) feature and the hydrogen-bond donor (D) feature, in that order.
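For reference, the three headline statistics can be computed from paired experimental and predicted pIC50 values as in the sketch below. The textbook definitions are assumed here, and the exact formulas used by PHASE for its reported columns may differ in detail; the numbers are illustrative and are not taken from Table 2.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between paired values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between paired values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Illustrative pIC50 values only.
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 7.0, 7.5]
print(rmse(actual, predicted), r_squared(actual, predicted), pearson_r(actual, predicted))
```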
Table 2. Statistical Evaluation of QSAR Modelsa.
| model no. | PLS factorb | Q2c | R2d | R2 CVe | SDf | Fg | RMSEh | Pearson-r | fractioni Dj | fractioni Hk | fractioni El |
|---|---|---|---|---|---|---|---|---|---|---|---|
| thresholdm | | | | | <0.7 | | <0.8 | | | | |
| **1** | **7** | **0.794** | **0.801** | **0.742** | **0.656** | **1760** | **0.670** | **0.891** | **0.117** | **0.463** | **0.284** |
| 2 | 7 | 0.780 | 0.799 | 0.774 | 0.660 | 3470 | 0.690 | 0.883 | 0.118 | 0.470 | 0.297 |
| 3 | 7 | 0.773 | 0.801 | 0.744 | 0.658 | 1750 | 0.700 | 0.880 | 0.121 | 0.463 | 0.287 |
| 4 | 7 | 0.764 | 0.805 | 0.746 | 0.651 | 1790 | 0.710 | 0.876 | 0.113 | 0.463 | 0.291 |
| 5 | 7 | 0.763 | 0.799 | 0.744 | 0.661 | 1740 | 0.720 | 0.874 | 0.117 | 0.480 | 0.281 |
a The finally selected optimal QSAR model is indicated in bold.
b Number of potential variables used in the PLS regression analysis.
c Cross-validated coefficient.
d Square correlation coefficient.
e Square correlation coefficient cross-validation.
f Standard deviation.
g Value of the F-test.
h Root-mean-square error.
i Contribution fractions of the various fields.
j H-bond donor.
k Hydrophobic/nonpolar.
l Electron-withdrawing.
m SD and RMSE criteria used for filtering the candidates.
3.2. Evaluation of the Optimal QSAR Model
This accurate result also encouraged us to characterize individual molecules in the out-of-set data. We selected the three most active and three most inactive molecules (Table 3). The activity of the subnanomolar (pIC50 > 9) compounds was predicted with a gap as small as 0.03 pIC50 units, demonstrating the model's capability for prediction in the picomolar range. Similarly, millimolar activity (pIC50 close to 3–4) was also precisely predicted, illustrating the potency range over which the model is predictive. The reliable RMSE and R2 encouraged us to evaluate the optimal model further. Thus, we needed a third, out-of-set data set that belonged to neither the training nor the test data. For this purpose, we used the primary patent of TAS/HM06, a clinical phase II drug (US 10,807,986 B2). In the docking simulations, 30 inhibitors out of 170 patent compounds gave reliable docking poses with hydrogen bonding to ALA807 of the RET hinge region. The pIC50 values of these inhibitors were predicted using the optimal QSAR model to evaluate the practical utility of the model (see also the values in the Supporting Information). Their inhibitory activities showed a reasonable difference between the actual and predicted values in the scatter plot of Figure 6.
Table 3. Representative Compounds among the Data Seta.
The experimental activity and predicted activity are both expressed in the form of pIC50 values.
Figure 6.
Scatter plot of the experimental activity and predicted activity of the training set, test set, and out-of-set molecules.
Moreover, we conducted residual analysis using the experimental and predicted values as shown in the following equation:
In general, even if a high correlation along with a high accuracy (high R2) is observed between the predicted and actual values, it is difficult to trust the prediction results if the residuals follow a pattern. The residuals analyzed in Figure 7 did not show any pattern of consistent increase or decrease across the data points, which supports the reliability of the predictions. Furthermore, the residual plot showed few outliers beyond the cutoff range (−2.0 ∼ +2.0) of acceptable prediction.42
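The outlier criterion can be sketched as follows. This example assumes that the residual is the experimental minus the predicted pIC50 and that standardization divides each residual by the sample standard deviation of all residuals; both conventions are assumptions of this sketch and may differ from those used in the study. The values are illustrative.

```python
import statistics

def standardized_residuals(experimental, predicted):
    """Residuals (experimental - predicted) divided by their sample SD."""
    residuals = [e - p for e, p in zip(experimental, predicted)]
    sd = statistics.stdev(residuals)
    return [r / sd for r in residuals]

def flag_outliers(std_residuals, cutoff=2.0):
    """True where the standardized residual lies outside [-cutoff, +cutoff]."""
    return [abs(z) > cutoff for z in std_residuals]

# Illustrative pIC50 values only; the last compound is badly underpredicted.
experimental = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
predicted = [6.1, 6.4, 7.1, 7.4, 8.1, 6.0]
z = standardized_residuals(experimental, predicted)
print(flag_outliers(z))
```

Plotting `z` against the predicted values is then a simple check for the systematic trends discussed above.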
Figure 7.

Residual analysis of patent-filed RET inhibitors as out-of-set data. 28 out-of-set molecules showed standardized residuals within the range of −2.0 ∼+2.0 (“normal”). “Outlier” molecules had standardized residuals lower than −2.0.
3.3. Feature Analysis and 3D Visualization of RET–Inhibitor Interaction
In machine learning-based prediction models, model interpretation, which explains how the predictions are made, is just as important as prediction accuracy. For this purpose, it is crucial to explain the important features involved in making predictions in a way that is understandable to researchers. Therefore, we analyzed the 3D molecular interaction features through the visualization of three representative inhibitors chosen from each of the active (pIC50 > 9) and inactive (pIC50 close to 3–4) groups in Table 3. The feature distinctions between active and inactive compounds were analyzed using contour maps (a form of 3D feature map) of the optimal QSAR model. The unit of the contour maps is a cube (1 Å³), which indicates whether a region of space is favorable or unfavorable for the respective features, such as H-bond acceptors and donors, hydrophobic groups, and electron-withdrawing (EW) elements.
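To make the cube representation concrete, the sketch below bins atom centres into 1 Å cubes and records which feature classes occupy each cube. It is a simplified illustration: the grid origin, the feature labels, and the coordinates are assumptions of this example, and the actual software may assign an atom to every cube that its sphere overlaps, which is not modelled here.

```python
import math
from collections import defaultdict

def cube_index(coord, origin=(0.0, 0.0, 0.0), spacing=1.0):
    """Integer (i, j, k) index of the cube that contains a coordinate."""
    return tuple(math.floor((c - o) / spacing) for c, o in zip(coord, origin))

def occupied_cubes(atoms, spacing=1.0):
    """Map each occupied cube to the set of feature classes found in it.

    `atoms` is a list of (feature_class, (x, y, z)) pairs.
    """
    cubes = defaultdict(set)
    for feature, coord in atoms:
        cubes[cube_index(coord, spacing=spacing)].add(feature)
    return dict(cubes)

# Illustrative atoms: D = H-bond donor, H = hydrophobic.
atoms = [("D", (1.2, -0.3, 2.9)), ("H", (1.7, -0.9, 2.1)), ("H", (4.0, 0.0, 0.0))]
print(occupied_cubes(atoms))
```

In a PLS model built on such a grid, each cube and feature class pair acts as one variable, and the sign of its regression coefficient decides whether the cube is drawn as favorable or unfavorable.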
Table 4. Analysis of Best-Performing and Outlier Compounds in Out-of-Set Patent-Filed Dataa.
Docking score, experimental activity, and predicted activity from the atom-based QSAR model and the standardized residuals of four ligands from the model validation. The ligand numbering corresponds to that in the data source. Ligands 133 and 104, exhibiting standardized residuals below −2.000, were classified as “outliers.” Ligands 14 and 106, categorized as “normal,” were further analyzed for comparative purposes.
First, the critical hydrogen bond of RET inhibitors with the hinge region (Glu805 and Ala807) coincided with the 3D location of the H-donor feature with a positive coefficient (blue box) in the docking pose of the most potent compound (Figure 8a). Although both the most potent and the least potent compounds form hydrogen bonds (H-bonds), the H-donor feature discriminated between their subtly different locations, leading to a mismatch between the H-donor feature and the H-bond in the least potent compound (Figure 8b). This difference in the H-donor feature can be translated into design logic. The presence of a H-donor at the C4 position, such as an amino group, which forms a H-bond with Glu805, is critical for the design. The presence of the H-donor feature, highlighted by an amino group at the C4 position, characterized the active molecules (molecules 591, 377, and 371). Despite the presence of aromatic amino groups in the inactive compounds (44, 46, and 83), they did not align with the H-donor feature at the same location (Figure 8b). Moreover, the H-donor feature with a negative coefficient (red box) was near the aromatic amino group. This subtle difference in the location of the H-donor suggests a useful principle for the design of highly active RET inhibitors. In the feature analysis, the functional group at the C3 position was another important feature for discriminating between active and inactive compounds. There are several H-donor cubes near the C3 position. Notably, the N-substituted amide group at the C3 position in the active molecules was surrounded by several blue boxes with positive coefficients (Figure 8a).
Figure 8.
3D feature map of hydrogen-bond interactions in the optimal QSAR model. (A) Representative active compound 377 (cyan) and (B) inactive compound 44 (red). Legend: positive coefficient (favored for H-donor), light-blue cube; negative coefficient (disfavored for H-donor), light-orange cube; hydrogen bond, yellow dashed line; aromatic hydrogen bond, blue dashed line.
Second, we analyzed the 3D location of the hydrophobic feature, which is the largest contributor in the QSAR model. Contour map visualization relies on coefficients, which depict how features affect activity in the QSAR model. In other words, the chosen coefficient cutoff determines which cubes are visualized on a contour map. The hydrophobic feature cubes were more dispersed than those of the H-donor feature. As shown in Figure 9, numerous cubes are present at the same coefficient level in the contour map for the hydrophobic feature. The clustered hydrophobic-favored cubes (green boxes) led us to assign them to the N1, C2, C3, and C4 substituents. The alkynyl group at the C2 position and the N-alkyl group at the N1 position matched well with the location of the hydrophobic boxes (Figure 9a). This implies that the incorporation of a hydrophobic group at the N1 and C2 positions is recommended in RET inhibitor design. While the most inactive molecules lacked substituents at the N1 or C2 positions (Figure 9b), the most potent RET inhibitors have isopropyl (molecules 591 and 371) or 1-methylcyclopropyl (molecule 377) along with the alkynyl group at the C2 position (Figure 9a). Additionally, incorporating an amino group at the C4 position is recommended for highly potent RET inhibitors in terms of the hydrophobic feature (negative coefficient) as well as the above H-donor feature (positive coefficient). Meanwhile, the C3 position was occupied by a mix of green and purple boxes to a greater extent than any other position in Figure 9. In other words, the C3 position tended to favor hydrophilic substituents in some specific locations. However, although both the active and inactive molecules possessed a hydrophilic group at the β-position of the C3 carbon, there was a clear difference in how the 3D location was occupied by the rigid amides of the active molecules and the rotatable aromatic amines of the inactive molecules.
Figure 9.
3D feature map of hydrophobic/nonpolar interaction in the optimal QSAR model. (A) Representative active compound 377 (cyan) and (B) inactive compound 44 (red). Legend: positive coefficient (favored for hydrophobic), green cube; negative coefficient (disfavored for hydrophobic), purple cube; hydrogen bond, yellow dashed line; aromatic hydrogen bond, blue dashed line.
Finally, the 3D location of the EW feature was analyzed. Compared with the other features, the EW locations were shared to a greater extent between active and inactive molecules (Figure 10). In particular, many green cubes were occupied by both the active and inactive molecules. This indicates that the EW feature in the hinge binder region has a weak ability to discriminate the RET inhibitory activity. Meanwhile, the C2 and C3 positions presented different aspects. When every feature is integrated, any small substituent or functional group can possess both EW and hydrophilic features. Therefore, we derived another design logic from the feature analysis: when these two distinct features are considered together, the introduction of a hydrophilic, electron-withdrawing substituent at the C3 position becomes necessary.
Figure 10.
3D feature map of EW interaction in the optimal QSAR model. (A) Representative active compound 377 (cyan) and (B) inactive compound 44 (red). Legend: positive coefficient (favored for EW), pale-red cube; negative coefficient (disfavored for EW), green cube; hydrogen bond, yellow dashed line; aromatic hydrogen bond, blue dashed line.
4. Discussion
4.1. 3D-QSAR Model for RET Inhibitors Elucidates the Interaction Mechanisms and Logic of Drug Design in the Out-of-Set Data
In general, an ideal QSAR model corresponds to the noncovalent bonding interactions within the respective ligand-target docking complex. This means that the features in QSAR and the interactions observed in the docking complex are closely aligned with each other. Therefore, we also tried to match the RET docking poses with our optimal QSAR model. For this purpose, we examined the docking poses of patent-filed RET inhibitors as out-of-set data. The docking poses of both the outlier and normal groups exhibited proximity to the hinge region (ALA807) and the nearby ligand–protein interactions. The reverse orientation of the core structure (5,6-fused bicyclic heteroaromatic ring) in compounds 104 and 133 (outlier group) compared to compounds 14 and 106 (normal group) significantly influenced the positioning of substituents at C2, C3, C4, and N1 (Figure 11).
Figure 11.
Pose comparison of out-of-set molecules from patent-filed RET inhibitors. Outlier group molecules include (A) compound 104p and (B) compound 133p and normal group molecules include (C) compound 14p and (D) compound 106p.
Therefore, 3D-QSAR models, which rely on the alignment of the inhibitors from docking results, can explain the conflicting behaviors of QSAR feature cubes surrounding active and inactive compounds despite identical substituents. While key interactions with ALA807 at the hinge are common in both groups, our optimal QSAR model accounts for the variation in residual size between the experimental and predicted activities. This variance is primarily due to structural differences, particularly the introduction of substituents at the C2 position in the outlier group (Figure 12). For instance, in the contour map of compound 104, the alkynyl group was matched with the negative EWG feature (electron donating) and the hydroxyl group was well matched with the positive EWG features. In addition to the structural disparities at the C2 position, substituents can be compared. Notably, the presence of an isopropyl group at N1 in compound 104 was regarded as unfavorable (depicted by the negative hydrophobic), in contrast to its favorable depiction in compound 14 (Figure 12).
Figure 12.
Whole feature analysis of representative out-of-set from patent-filed RET inhibitors. (A) EW interaction feature of compound 104p, (B) hydrophobic/nonpolar interaction feature of compound 104p, (C) EW interaction feature of compound 14p, and (D) hydrophobic/nonpolar interaction feature of compound 14p. Legend: positive coefficient, pale-red cube (for EW) and green cube (for hydrophobic); negative: green cube (for EW) and purple cube (for hydrophobic).
The overall view in Figure 13 simplifies the SAR for designing RET inhibitors based on the comparative analysis of every contour map. The N-substituted amide at the C3 position was commonly observed in active compounds, and hence we named the C3 substituent a linker, as reported in another study.43
Figure 13.

Summarized and simplified feature analysis for designing RET inhibitors.
4.2. The QSAR Model Applied to the Off-Target Screening of Clinical and Nonclinical Kinase Inhibitors
The reliable evaluation and interpretation of the optimal QSAR model encouraged us to apply it to off-target screening against RET of known clinical and nonclinical kinase inhibitors. Among 633 drug candidates collected from PKIDB39 and MRC,40 459 kinase inhibitors were docked to RET kinase to produce 3D conformations for QSAR prediction. After obtaining both docking scores and predicted pIC50 values, the top 30 inhibitors were selected under two different ranking methods. The first method involved assigning ranks to the docking scores and the predicted pIC50 values separately, with the top 30 selected based on the lowest sum of these ranks. The second method used a cutoff (>7.5) on the predicted pIC50 along with an evaluation of the interaction between the docking poses and the target protein.
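The two selection schemes can be sketched as follows. The example assumes that a more negative docking score is better and that ties in the rank sum are broken alphabetically; the compound names, scores, and function names are invented, and the interaction check of the second method is not modelled.

```python
def rank_sum_top(candidates, n=30):
    """Select the n candidates with the lowest sum of the two ranks.

    `candidates` is a list of (name, docking_score, predicted_pic50).
    Lower docking score and higher predicted pIC50 both rank better.
    """
    by_dock = sorted(candidates, key=lambda c: c[1])
    by_activity = sorted(candidates, key=lambda c: -c[2])
    dock_rank = {c[0]: i + 1 for i, c in enumerate(by_dock)}
    act_rank = {c[0]: i + 1 for i, c in enumerate(by_activity)}
    ordered = sorted(candidates,
                     key=lambda c: (dock_rank[c[0]] + act_rank[c[0]], c[0]))
    return [c[0] for c in ordered[:n]]

def activity_cutoff(candidates, cutoff=7.5):
    """Names of candidates whose predicted pIC50 exceeds the cutoff."""
    return [c[0] for c in candidates if c[2] > cutoff]

# Invented values for four hypothetical screening compounds.
candidates = [("a", -9.5, 8.4), ("b", -7.0, 8.1), ("c", -10.2, 6.9), ("d", -6.1, 6.0)]
print(rank_sum_top(candidates, n=2), activity_cutoff(candidates))
```

In this toy case the rank-sum method retains compound "c" for its docking score, whereas the cutoff method retains "b" for its predicted activity, which shows why the two lists need not coincide.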
Surprisingly, seven compounds in the selected top 30 list have been acknowledged in the literature for their inhibitory efficacy against RET (Table 5). According to their first disclosed patents, Edralbrutinib, Evobrutinib, and Pirtobrutinib are BTK inhibitors;44 Sapanisertib functions as an mTOR inhibitor;45 Abivertinib serves as an EGFR inhibitor;46 Atinvicitinib acts as a JAK inhibitor;47 and Laduviglusib (also known as CHIR-99021) is an inhibitor of GSK3β signaling.48 Although Laduviglusib lacks a core structure commonly found in active molecules according to the QSAR model, it contains an imidazole moiety. In the docking pose, the imidazole group was positioned close to GLU805 and ALA807 and formed a hydrogen bond, consistent with the light-blue cubes near the nitrogen atom of the imidazole ring, which indicate a favorable hydrogen-bond donor feature. The predicted value can also be rationalized by the QSAR cubes near other substituents, such as the amino and cyano groups. The presence of an amino group, including N24 in Laduviglusib, is favorable, as indicated by the light-blue cube representing hydrogen-bond donor features. Furthermore, the cyano group on the pyridine ring improved the predicted activity because of its EW features, as shown by the pale-red cube in the QSAR model.
Table 5. RET Off-Target Screening of Known Clinical and Nonclinical Kinase Inhibitors(a)

(a) Key ligand–protein interactions and docking score of ligands. Aromatic-HB(A), aromatic HB acceptor; HB(A), HB acceptor; HB(D), HB donor.
5. Conclusions
In conclusion, the primary objective of this study was to examine the relationship between ligand structures and inhibitory activity using 3D-QSAR modeling. The optimal 3D-QSAR model was developed using RET kinase inhibitors with 49 different substructures and demonstrated high predictive performance (R2 value of 0.801 and Q2 value of 0.794). Furthermore, the out-of-set validation using patent molecules indicated a practical ability to detect active molecules. Moreover, the 3D spatial feature analysis highlighted the characteristic substituent patterns of the 5,6-fused bicyclic heteroaromatic core structures. Finally, the model was applied to screen known kinase inhibitors for potential off-target RET inhibition. Future work will extend this study by validating the outcomes experimentally and assessing the drug developability of the identified compounds.
Acknowledgments
This study was supported by the Basic Science Research Program of the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science, and Technology (No. 2022R1A2C2091810). This research was also supported by a grant of the Korea Machine Learning Ledger Orchestration for Drug Discovery Project (K-MELLODDY), funded by the Ministry of Health & Welfare and Ministry of Science and ICT, Republic of Korea (No.: RS-2024-12345678). S.J. appreciates Gachon University’s Scholarship Program.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.4c07843.
Clarified data sets for collection, training, validation, and screening (PDF)
Author Contributions
M.K. conceived and designed the study. Under M.K.’s plan, S.J. and S.K. investigated diverse molecular featurization, and S.J. conducted all data collection and manipulation and built and validated the QSAR model. M.K. and S.K. analyzed all the data and verified the results. M.K. and S.J. wrote and revised the manuscript. M.K. provided the molecular modeling lab facility. All authors read and approved the final manuscript.
The authors declare no competing financial interest.
Special Issue
Published as part of ACS Omega special issue “3D Structures in Medicinal Chemistry and Chemical Biology”.
References
- Vargas-Leal V.; Bruno R.; Derfuss T.; Krumbholz M.; Hohlfeld R.; Meinl E. Expression and function of glial cell line-derived neurotrophic factor family ligands and their receptors on human immune cells. J. Immunol. 2005, 175 (4), 2301–2308. 10.4049/jimmunol.175.4.2301.
- Lipson D.; Capelletti M.; Yelensky R.; Otto G.; Parker A.; Jarosz M.; Curran J. A.; Balasubramanian S.; Bloom T.; Brennan K. W. Identification of new ALK and RET gene fusions from colorectal and lung cancer biopsies. Nature medicine 2012, 18 (3), 382–384. 10.1038/nm.2673.
- Novello S.; Califano R.; Reinmuth N.; Tamma A.; Puri T. RET Fusion-Positive Non-small Cell Lung Cancer: The Evolving Treatment Landscape. Oncologist 2023, 28 (5), 402–413. 10.1093/oncolo/oyac264.
- Consortium A. P. G.; André F.; Arnedos M.; Baras A. S.; Baselga J.; Bedard P. L.; Berger M. F.; Bierkens M.; Calvo F. AACR Project GENIE: powering precision medicine through an international consortium. Cancer Discovery 2017, 7 (8), 818–831. 10.1158/2159-8290.CD-17-0151.
- Drilon A.; Lin J. J.; Filleron T.; Ni A.; Milia J.; Bergagnini I.; Hatzoglou V.; Velcheti V.; Offin M.; Li B. Frequency of brain metastases and multikinase inhibitor outcomes in patients with RET–rearranged lung cancers. J. Thorac. Oncol. 2018, 13 (10), 1595–1601. 10.1016/j.jtho.2018.07.004.
- Pietrantonio F.; Di Nicolantonio F.; Schrock A.; Lee J.; Morano F.; Fuca G.; Nikolinakos P.; Drilon A.; Hechtman J.; Christiansen J. RET fusions in a small subset of advanced colorectal cancers at risk of being neglected. Ann. Oncol. 2018, 29 (6), 1394–1401. 10.1093/annonc/mdy090.
- Huang L.; Jiang S.; Shi Y. Tyrosine kinase inhibitors for solid tumors in the past 20 years (2001–2020). J. Hematol. Oncol. 2020, 13, 143. 10.1186/s13045-020-00977-0.
- Duke E. S.; Bradford D.; Marcovitz M.; Amatya A. K.; Mishra-Kalyani P. S.; Nguyen E.; Price L. S.; Zirkelbach J. F.; Li Y.; Bi Y. FDA Approval Summary: Selpercatinib for the treatment of advanced RET fusion-positive solid tumors. Clin. Cancer Res. 2023, 29, 3573. 10.1158/1078-0432.CCR-23-0459.
- Romero D. Activity of selpercatinib confirmed in phase III trials. Nature Reviews Clinical Oncology 2024, 21 (1), 5–5. 10.1038/s41571-023-00837-z.
- Mologni L.; Rostagno R.; Brussolo S.; Knowles P. P.; Kjaer S.; Murray-Rust J.; Rosso E.; Zambon A.; Scapozza L.; McDonald N. Q. Synthesis, structure–activity relationship and crystallographic studies of 3-substituted indolin-2-one RET inhibitors. Bioorg. Med. Chem. 2010, 18 (4), 1482–1496. 10.1016/j.bmc.2010.01.011.
- Bhattacharya S.; Asati V.; Ali A.; Ali A.; Gupta G. In-silico studies for the development of novel RET inhibitors for cancer treatment. J. Mol. Struct. 2022, 1251, 132040. 10.1016/j.molstruc.2021.132040.
- Hu Y.; Stumpfe D.; Bajorath J. Recent advances in scaffold hopping: miniperspective. Journal of medicinal chemistry 2017, 60 (4), 1238–1246. 10.1021/acs.jmedchem.6b01437.
- Acharya A.; Yadav M.; Nagpure M.; Kumarsan S.; Guchhait S. K. Molecular medicinal insights into scaffold hopping-based drug discovery success. Drug Discov. Today 2023, 29, 103845. 10.1016/j.drudis.2023.103845.
- Mathison C. J.; Chianelli D.; Rucker P. V.; Nelson J.; Roland J.; Huang Z.; Yang Y.; Jiang J.; Xie Y. F.; Epple R. Efficacy and tolerability of pyrazolo [1,5-a] pyrimidine RET kinase inhibitors for the treatment of lung adenocarcinoma. ACS Med. Chem. Lett. 2020, 11 (4), 558–565. 10.1021/acsmedchemlett.0c00015.
- Lee S.-H.; Lee J.-K.; Ahn M.-J.; Kim D.-W.; Sun J.-M.; Keam B.; Kim T.; Heo D.; Ahn J.; Choi Y.-L. Vandetanib in pretreated patients with advanced non-small cell lung cancer-harboring RET rearrangement: a phase II clinical trial. Ann. Oncol. 2017, 28 (2), 292–297. 10.1093/annonc/mdw559.
- ChEMBL, 2023.
- BindingDB, 2023.
- Sun J.; Jeliazkova N.; Chupakhin V.; Golib-Dzib J.-F.; Engkvist O.; Carlsson L.; Wegner J.; Ceulemans H.; Georgiev I.; Jeliazkov V. ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J. Cheminf. 2017, 9, 17. 10.1186/s13321-017-0203-5.
- Lee J.; Kumar S.; Lee S. Y.; Park S. J.; Kim M. H. Development of Predictive Models for Identifying Potential S100A9 Inhibitors Based on Machine Learning Methods. Front Chem. 2019, 7, 779. 10.3389/fchem.2019.00779.
- Arabie P.; Baier N. D.; Critchley C. F.; Keynes M. Studies in Classification, Data Analysis, and Knowledge Organization; Springer, 2006.
- Schrödinger Release 2019-1: LigPrep; Schrödinger, LLC: New York, NY, 2019.
- Sastry G. M.; Adzhigirey M.; Day T.; Annabhimoju R.; Sherman W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 2013, 27 (3), 221–234. 10.1007/s10822-013-9644-8.
- Miyazaki I.; Odintsov I.; Ishida K.; Lui A. J.; Kato M.; Suzuki T.; Zhang T.; Wakayama K.; Kurth R. I.; Cheng R. Vepafestinib is a pharmacologically advanced RET-selective inhibitor with high CNS penetration and inhibitory activity against RET solvent front mutations. Nature Cancer 2023, 4 (9), 1345–1361. 10.1038/s43018-023-00630-y.
- Schrödinger Release 2019-1: Protein Preparation Wizard; Epik, Schrödinger, LLC, New York, NY, 2019; Impact, Schrödinger, LLC, New York, NY; Prime, Schrödinger, LLC, New York, NY, 2019.
- Roos K.; Wu C.; Damm W.; Reboul M.; Stevenson J. M.; Lu C.; Dahlgren M. K.; Mondal S.; Chen W.; Wang L. OPLS3e: Extending force field coverage for drug-like small molecules. J. Chem. Theory Comput. 2019, 15 (3), 1863–1874. 10.1021/acs.jctc.8b01026.
- Teli D. M.; Shah M. B.; Chhabria M. T. In silico screening of natural compounds as potential inhibitors of SARS-CoV-2 main protease and spike RBD: targets for COVID-19. Frontiers in molecular biosciences 2021, 7, 599079. 10.3389/fmolb.2020.599079.
- Schrödinger Release 2019-1: Force Fields; Schrödinger, LLC: New York, NY, 2019.
- Jordan A. M.; Begum H.; Fairweather E.; Fritzl S.; Goldberg K.; Hopkins G. V.; Hamilton N. M.; Lyons A. J.; March H. N.; Newton R. Anilinoquinazoline inhibitors of the RET kinase domain—Elaboration of the 7-position. Bioorg. Med. Chem. Lett. 2016, 26 (11), 2724–2729. 10.1016/j.bmcl.2016.03.100.
- Cao S.; Tan C.; Fei A.; Hu G.; Fu M.; Lv J. Insights into pralsetinib resistance to the non-gatekeeper RET kinase G810C mutation through molecular dynamics simulations. J. Mol. Model. 2023, 29 (1), 24. 10.1007/s00894-022-05429-9.
- Traxler P.; Furet P. Strategies toward the design of novel and selective protein tyrosine kinase inhibitors. Pharmacology & therapeutics 1999, 82 (2–3), 195–206. 10.1016/S0163-7258(98)00044-8.
- Schrödinger Release 2019-1: Glide; Schrödinger, LLC: New York, NY, 2019.
- Luo Z.; Wang L.; Fu Z.; Shuai B.; Luo M.; Hu G.; Chen J.; Sun J.; Wang J.; Li J. Discovery and optimization of selective RET inhibitors via scaffold hopping. Bioorg. Med. Chem. Lett. 2021, 47, 128149. 10.1016/j.bmcl.2021.128149.
- Peytam F.; Emamgholipour Z.; Mousavi A.; Moradi M.; Foroumadi R.; Firoozpour L.; Divsalar F.; Safavi M.; Foroumadi A. Imidazopyridine-based kinase inhibitors as potential anticancer agents: A review. Bioorganic Chemistry 2023, 140, 106831. 10.1016/j.bioorg.2023.106831.
- El-Gamal M. I.; Zaraei S.-O.; Madkour M. M.; Anbar H. S. Evaluation of substituted pyrazole-based kinase inhibitors in one decade (2011–2020): Current status and future prospects. Molecules 2022, 27 (1), 330. 10.3390/molecules27010330.
- Nitulescu G. M.; Stancov G.; Seremet O. C.; Nitulescu G.; Mihai D. P.; Duta-Bratu C. G.; Barbuceanu S. F.; Olaru O. T. The importance of the pyrazole scaffold in the design of protein kinases inhibitors as targeted anticancer therapies. Molecules 2023, 28 (14), 5359. 10.3390/molecules28145359.
- Schrödinger Release 2019-1: Phase; Schrödinger, LLC: New York, NY, 2019.
- Schrödinger Release 2019-1: Maestro; Schrödinger, LLC: New York, NY, 2019.
- Jang C.; Yadav D. K.; Subedi L.; Venkatesan R.; Venkanna A.; Afzal S.; Lee E.; Yoo J.; Ji E.; Kim S. Y.; Kim M. Identification of Novel Acetylcholinesterase Inhibitors Designed by Pharmacophore-Based Virtual Screening, Molecular Docking and Bioassay. Sci. Rep. 2018, 8 (1), 14921. 10.1038/s41598-018-33354-6.
- Carles F.; Bourg S.; Meyer C.; Bonnet P. PKIDB: A curated, annotated and updated database of protein kinase inhibitors in clinical trials. Molecules 2018, 23 (4), 908. 10.3390/molecules23040908.
- MRC. https://www.kinase-screen.mrc.ac.uk/kinase-inhibitors
- Schrödinger Release 2019-1: Canvas; Schrödinger, LLC: New York, NY, 2019.
- Leach A. R. Molecular Modelling: Principles and Applications; Pearson Education, 2001.
- Li X.; Su J.; Yang Y.; Lian W.; Deng Z.; Yang Z.; Chen G.; Zhang B.; Dong C.; Liu X. Discovery of 4-methyl-N-(4-((4-methylpiperazin-1-yl) methyl)-3-(trifluoromethyl) phenyl)-3-((6-(pyridin-3-yl)-1H-pyrazolo [3, 4-d] pyrimidin-4-yl)-oxy) benzamide as a potent inhibitor of RET and its gatekeeper mutant. Eur. J. Med. Chem. 2020, 207, 112755. 10.1016/j.ejmech.2020.112755.
- Mato A. R.; Woyach J. A.; Brown J. R.; Ghia P.; Patel K.; Eyre T. A.; Munir T.; Lech-Maranda E.; Lamanna N.; Tam C. S. Pirtobrutinib after a covalent BTK inhibitor in chronic lymphocytic leukemia. N. Engl. J. Med. 2023, 389 (1), 33–44. 10.1056/NEJMoa2300696.
- Voss M. H.; Gordon M. S.; Mita M.; Rini B.; Makker V.; Macarulla T.; Smith D. C.; Cervantes A.; Puzanov I.; Pili R. Phase 1 study of mTORC1/2 inhibitor sapanisertib (TAK-228) in advanced solid tumours, with an expansion phase in renal, endometrial or bladder cancer. Br. J. Cancer 2020, 123 (11), 1590–1598. 10.1038/s41416-020-01041-x.
- Zhang Y.-C.; Chen Z.-H.; Zhang X.-C.; Xu C.-R.; Yan H.-H.; Xie Z.; Chuai S.-K.; Ye J.-Y.; Han-Zhang H.; Zhang Z. Analysis of resistance mechanisms to abivertinib, a third-generation EGFR tyrosine kinase inhibitor, in patients with EGFR T790M-positive non-small cell lung cancer from a phase I trial. EBioMedicine 2019, 43, 180–187. 10.1016/j.ebiom.2019.04.030.
- Amstutz A.; Schandelmaier S.; Speich B.; Ewald H.; Agoritsas T.; Trøseid M.; Briel M. Effectiveness and safety of janus kinase inhibitors in hospitalized patients with COVID-19: Systematic review and individual patient data meta-analysis of randomized trials.
- Wang S.; Ye L.; Li M.; Liu J.; Jiang C.; Hong H.; Zhu H.; Sun Y. GSK-3β inhibitor CHIR-99021 promotes proliferation through upregulating β-catenin in neonatal atrial human cardiomyocytes. Journal of cardiovascular pharmacology 2016, 68 (6), 425–432. 10.1097/FJC.0000000000000429.