Abstract
Aim: Build a virtual screening model for ULK1 inhibitors based on artificial intelligence.
Materials & methods: Machine learning and deep learning classification models were built and combined with molecular docking and biological evaluation to screen ULK1 inhibitors from 13 million compounds. Molecular dynamics was used to explore the binding mechanism of the active compounds.
Results & conclusion: Possibly due to less available training data, machine learning models significantly outperform deep learning models. Among them, the Naive Bayes model has the best performance. Through virtual screening, we obtained three inhibitors with IC50 of μM level and they all bind well to ULK1. This study provides an efficient virtual screening model and three promising compounds for the study of ULK1 inhibitors.
Keywords: drug discovery, machine learning, molecular dynamics simulation, ULK1 inhibitors, virtual screening
Plain language summary
Article highlights.
Background
The abnormally high expression of ULK1 kinase is closely related to various diseases such as tumors, and ULK1 inhibitors have the potential to treat these diseases. However, research on ULK1 inhibitors is currently progressing slowly, and only slightly more than 200 active inhibitors have been reported. Therefore, the discovery of novel ULK1 inhibitors is of great significance.
The artificial intelligence-based virtual screening model can be used to quickly screen lead compounds from large databases. Currently, there is no artificial intelligence-based high-throughput screening model in the field of ULK1 inhibitors.
Aim
Build machine learning and deep learning models for high-throughput virtual screening of ULK1 inhibitors and discover novel lead compounds.
Method & results
Established deep learning and machine learning prediction models for ligand-based virtual screening.
Compared with the deep learning model, machine learning models showed superior performance, which may be due to the limited training data. Among the 16 machine learning prediction models, NB-ECFP6 performed the best.
By combining virtual screening models, 25 candidate compounds were selected from 13 million compounds, and three new ULK1 inhibitors with IC50 values at the μM level were obtained after a kinase activity assay.
Conclusion
This study provides an efficient virtual screening model based on machine learning for the study of ULK1 inhibitors.
Three promising lead compounds were discovered and their modification is expected to produce new and efficient ULK1 inhibitors, which is of great significance for targeted ULK1 therapy.
1. Introduction
Autophagy is a cellular process in which cells engulf their own proteins, organelles and other components [1]. It is one of the important mechanisms for recycling essential substances within the cell [2]. Autophagy is involved in a variety of diseases, including neurodegenerative diseases, infectious diseases and tumors [3–5], and inhibition of autophagy has been suggested as an effective strategy for the prevention or treatment of diseases such as tumors. UNC-51-like kinase 1 (ULK1), a serine/threonine protein kinase, is an upstream kinase that regulates the initiation of autophagy [6] and plays a key role in its initial stages. Abnormally high expression of ULK1 has been shown to be closely associated with a variety of diseases, such as neurological injury [7] and tumors; among tumors, ULK1 has been reported as a potential key factor in hepatocellular carcinoma [8], non-small cell carcinoma [9], colorectal cancer [10], pancreatic cancer [11], prostate cancer [12] and neuroblastoma [13]. ULK1 inhibitors therefore hold promise for treating diseases associated with high ULK1 expression, such as inhibiting tumor cell growth, invasion and migration by inducing tumor cell autophagy and also increasing the sensitivity of tumor cells to chemotherapeutic agents [14]. In recent years, ULK1 inhibitors have gradually become one of the hotspots of kinase inhibitor research, and several inhibitors have been reported. However, compared with research on inhibitors of kinases such as EGFR, HER2 and FGFR, research on ULK1 inhibitors lags behind, and no inhibitor has yet entered clinical trials. Therefore, there is an urgent need to discover new ULK1 inhibitors with more promising drug potential.
With the application of computer-aided drug design (CADD), virtual screening, as a complementary method to experimental high-throughput screening (HTS), can effectively reduce the limitations in drug design, improve the hit rate, shorten the development cycle and reduce the cost of drug development. When screening small-molecule compound libraries of tens of millions of compounds or more, artificial intelligence drug discovery (AIDD) is faster and more efficient than traditional CADD [15]. AIDD is therefore widely used in de novo drug molecule design. Common AIDD modeling methods fall into two categories: machine learning and deep learning. Machine learning methods, including Support Vector Machine (SVM) [16–18], XGBoost (XGB) [19], Random Forest (RF) [20], Naive Bayes (NB) [21–23], K-Nearest Neighbor (KNN) [24] and Artificial Neural Networks (ANN) [25,26], have been used to screen inhibitors or agonists for various targets. Convolutional Neural Network (CNN)-based deep learning models [27] have also been widely used in virtual screening. CNNs were first applied to image processing, have a large advantage in image recognition and excel at extracting features from images. However, compared with traditional machine learning, deep learning places greater demands on data size [28]. For example, the deep learning approach is poorly suited to modeling a single target for which few experimental data are available in public activity databases. Among the reported ULK1 inhibitors, most were discovered through traditional experimental methods, and only a few were discovered through CADD [8,29–31]. To date, there are no reports of an AIDD-based virtual screening model for ULK1 inhibitors.
In order to rapidly search for novel ULK1 inhibitors with structural diversity from a large database of compounds, we constructed machine learning and deep learning classification models based on molecular fingerprints and molecular images for virtual screening of ULK1 kinase inhibitors. The optimal model was selected to screen a library of 13 million compounds, followed by molecular docking and ADMET (Absorption, Distribution, Metabolism, Excretion and Toxicity) prediction and 25 candidate compounds were purchased for bioactivity testing, resulting in three new compounds with micromolar inhibitory activity against ULK1. In addition, the stability and affinity of the hit compounds for binding to ULK1 were verified by interaction analysis, molecular dynamics simulations and binding free energy calculations. The multi-combinatorial virtual screening workflow of this study is shown in Figure 1.
Figure 1.

Strategy adopted to identify potential ULK1 kinase inhibitors.
2. Materials & methods
2.1. Data preparation
The BindingDB database (https://www.bindingdb.org/) is currently one of the more complete databases of experimental drug activity data obtained from the literature or patents, covering experimental activity data for the majority of kinases. A total of 485 ULK1 experimental data points were collected from BindingDB and cleaned using the open-source cheminformatics package RDKit, including the removal of duplicates, entries with missing values, molecules with molecular weights greater than 700 and unrecognizable structures, resulting in 468 valid data points. For ULK1, entries with IC50, EC50, Ki or Kd ≤1100 nM were labeled as active (265 entries), and entries with IC50, EC50, Ki or Kd >1100 nM were labeled as inactive (203 entries). In addition, 1121 decoy compounds (treated as inactive) were generated for 23 of the active compounds using the DUD-E [32] (https://dude.docking.org/) database. In total, 265 active and 1324 inactive data points were obtained. We randomly split the dataset in a ratio of 4:1, so that the training set contains 212 active and 1059 inactive data points, and the test set contains 53 active and 265 inactive data points.
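The labeling rule and the 4:1 split can be sketched in a few lines of standard-library Python. This is only an illustration: the function names are hypothetical, and splitting each class separately (a stratified split) is an assumption, chosen because it reproduces the per-class counts reported above.

```python
import random

ACTIVITY_CUTOFF_NM = 1100.0  # activity threshold stated in the text


def label_activity(value_nm):
    """Return 1 (active) if the measured value is <= 1100 nM, else 0 (inactive)."""
    return 1 if value_nm <= ACTIVITY_CUTOFF_NM else 0


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices class by class so both sets keep the active/inactive ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


# Class sizes from the text: 265 actives, 1324 inactives (203 measured + 1121 decoys).
labels = [1] * 265 + [0] * 1324
train_idx, test_idx = stratified_split(labels)

# (actives, inactives) in each subset
train_counts = (sum(labels[i] for i in train_idx), sum(1 - labels[i] for i in train_idx))
test_counts = (sum(labels[i] for i in test_idx), sum(1 - labels[i] for i in test_idx))
print(train_counts, test_counts)
```

With a 20% test fraction applied per class, the counts match the 212/1059 training and 53/265 test figures given in the text.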
The chemical libraries used for virtual screening, which together contain about 13 million compounds, included Targetmol, Chemdiv, Enamine, Chembridge, Specs, Vitas-M, InterBioScreen, Life Chemical, Maybridge, Otava, UkrOrgSynthesis, Analyticon, Bionet-Key Organics, Princeton BioMolecular, Alinda Chemical, Eximed, HTS Biochemie Innovationen, Asinex, Aronis and AsisChem.
2.2. Molecular characterization
Molecular descriptors are the end result of converting the chemical information encoded in a symbolic representation of a molecule into numerical values that can be processed by a computer. The cleaned data were converted into different types of molecular representations, which were used as inputs to the machine learning models. In this study, two types of molecular descriptors, RDKit and Mordred, and two types of molecular fingerprints, MACCS (Molecular ACCess System) and ECFPs (Extended Connectivity Fingerprints), were computed using the open-source cheminformatics package RDKit to represent molecular structures and properties.
RDKit descriptors are the molecular descriptors built into the RDKit package and are often used for molecular screening, druggability assessment, etc. The Descriptors module of RDKit characterizes a compound by quantifying some of its structural features and physicochemical properties. In this study, we calculated a total of 208 RDKit descriptors, including molecular mass, lipid-water partition coefficient and topological polar surface area. Mordred is a descriptor library built on RDKit that provides 1,826 descriptor functions for calculating the 2D and 3D descriptors of molecules. Mordred mainly takes “SMI” files as input; such a file usually contains two columns, the first holding the SMILES (Simplified Molecular Input Line Entry System) string of the small molecule and the second its name. The calculation of 3D molecular properties additionally requires “SDF” or “MOL2” files as input. In this study, we calculated 1422 Mordred descriptors as inputs to the machine learning models. MACCS is a structural-key fingerprint whose keys are defined as SMARTS (SMiles ARbitrary Target Specification) patterns, and it is one of the most commonly used sets of chemical structure keys. The MACCS keys were developed by MDL (Molecular Design Limited) and are thus sometimes referred to as MDL keys. There are two sets of MACCS keys, one containing 166 keys and the other containing 960 keys, but only the fragment definitions of the 166-key subset are currently available for public use. They are implemented in several open-source cheminformatics packages, including RDKit and CDK. In this study, we computed the 166-key MACCS fingerprint, which RDKit stores as a 167-bit vector because bit 0 is unused.
ECFPs are circular topological fingerprints that represent a structure by means of circular atom neighborhoods. They are widely used in molecular representation, similarity searching and structure-activity relationship analysis. The radius of ECFPs is adjustable; in this study, we set the radius to 3 (corresponding to ECFP6) and the fingerprint length to 2048 bits. We evaluated these molecular descriptors and molecular fingerprints, and finally chose the 2048-bit ECFP6 fingerprint as the model input for virtual screening in the machine learning stage.
2.3. Model building
We aimed to discover ULK1 inhibitors using multiple machine learning methods (SVM, XGB, NB and KNN) as well as deep learning. Each optimized model was evaluated on the test set, and the best-performing model was selected as the screening model. The models were built using the Scikit-Learn (sklearn) machine learning library and the PyTorch deep learning library. Sklearn is a Python-based machine learning toolkit that supports common tasks such as classification, regression, clustering, dimensionality reduction, model selection and preprocessing, which shortens development time.
2.3.1. SVM model building
SVM is a machine learning algorithm for binary classification. It separates samples of different categories by finding a separating hyperplane in the sample space; simply put, the multidimensional space is divided into two regions by the hyperplane with the maximum margin. The SVM algorithm is suitable for problems with small samples and large feature spaces, but it is less efficient when predicting a large number of samples. The classification model was built using the SVC estimator, and the GridSearchCV method was used to perform a grid search over hyperparameters including kernel, gamma, C, etc.
2.3.2. XGB model building
XGB is an ensemble classifier based on boosting, which combines many weak classifiers into a single strong classifier. XGB can automatically learn the split direction for samples with missing values, and regularization is introduced during training to control model complexity and prevent overfitting. However, the XGB algorithm needs to pre-sort the input features before iteration, which is time-consuming for large-scale data. The classification model was built using XGBClassifier, and the GridSearchCV method was used to perform a grid search over hyperparameters including learning_rate, n_estimators, gamma, etc.
2.3.3. NB model building
NB is a classification method based on Bayes' theorem and the assumption that features are conditionally independent given the class. It assigns categories by combining the class-conditional probabilities of the individual features. The NB algorithm performs well on small datasets and can handle multiclass problems. NB classifiers are also faster than more complex algorithms such as SVM. In addition, because of the conditional independence assumption, each feature distribution can be estimated independently as a one-dimensional distribution, which helps alleviate problems caused by high dimensionality. The classifier was built using the GaussianNB estimator and trained on the training set.
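To make the independence assumption concrete, the following is a minimal from-scratch sketch of a Gaussian Naive Bayes classifier in standard-library Python. It is not the scikit-learn GaussianNB implementation used in the study: the class name, the toy data and the constant variance floor are all assumptions made for illustration.

```python
import math


class TinyGaussianNB:
    """Gaussian Naive Bayes for small dense feature matrices (lists of lists)."""

    def fit(self, X, y, var_floor=1e-9):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior, self.mean, self.var = {}, {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            self.mean[c] = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
            # var_floor avoids division by zero for constant features (a simplification)
            self.var[c] = [
                sum((r[j] - self.mean[c][j]) ** 2 for r in rows) / len(rows) + var_floor
                for j in range(n_features)
            ]
        return self

    def _log_joint(self, x, c):
        # log P(c) + sum_j log N(x_j | mean_cj, var_cj): each feature is a 1D Gaussian
        total = self.log_prior[c]
        for xj, m, v in zip(x, self.mean[c], self.var[c]):
            total += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        return total

    def predict_proba(self, x, target=1):
        """Posterior probability of class `target` for one sample."""
        logs = {c: self._log_joint(x, c) for c in self.classes}
        top = max(logs.values())
        weights = {c: math.exp(v - top) for c, v in logs.items()}
        return weights[target] / sum(weights.values())


# Toy two-feature data (hypothetical): label 1 = active, 0 = inactive.
X = [[1.0, 0.9], [0.8, 1.1], [1.2, 1.0], [0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
y = [1, 1, 1, 0, 0, 0]
model = TinyGaussianNB().fit(X, y)
p_active_like = model.predict_proba([1.0, 1.0])
p_inactive_like = model.predict_proba([0.0, 0.0])
print(round(p_active_like, 3), round(p_inactive_like, 3))
```

The posterior returned by `predict_proba` plays the role of a prediction score: a sample near the active cluster scores close to 1, and one near the inactive cluster scores close to 0.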
2.3.4. KNN model building
The KNN algorithm is a basic classification and regression method commonly used in supervised learning. It calculates the distance between a sample and the training data in feature space, and then assigns the sample to the category held by the majority of its K “nearest neighbor” points. The KNN algorithm is simple in theory, easy to implement and makes no assumptions about the data [33], and it can achieve high accuracy; however, it performs poorly when the classes are imbalanced, and it requires a large amount of memory.
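The majority-vote rule described above can be sketched as follows. This is a minimal illustration with Euclidean distance and hypothetical toy points, not the scikit-learn classifier or the hyperparameters used in the study.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2D points (hypothetical): class 0 near the origin, class 1 near (5, 5).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_X, train_y, (0.5, 0.5))
near_far_cluster = knn_predict(train_X, train_y, (5.5, 5.5))
print(near_origin, near_far_cluster)
```

Because every prediction scans all stored training points, the whole training set has to stay in memory, which is the cost noted in the text.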
2.3.5. Deep learning model building
CNN-based deep learning models were initially applied in the field of image recognition and are good at extracting features from images. The chemical structures of small molecules are first converted into two-dimensional images as input to the CNN model. Convolutional layers are the core part of CNN models that can capture the structure-activity relationships between small molecules and encode these relationships into features that can be used for classification. The attention mechanism can enhance the semantic correlation between small molecules in both spatial and channel dimensions [34], highlight important features by giving different weights and reduce the impact of insignificant features. In our model, the attention mechanism is used to emphasize those chemical properties that are most important for the prediction of small molecule activity. The fully connected layer is the output part of the model, which integrates the features extracted from the convolutional and attention mechanism layers and outputs the final classification result.
2.4. Model evaluation indicators
The following metrics were used to evaluate the performance of our classification models. Accuracy is the ratio of the number of correctly classified samples to the total number of samples:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP denotes the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives.
Precision is the proportion of correctly classified positive samples among all samples predicted to be positive; Recall is the proportion of correctly classified positive samples among all actual positive samples. F1-score is the harmonic mean of Precision and Recall. AUC is the area under the receiver operating characteristic (ROC) curve, which quantitatively reflects the performance of the model as measured by the ROC curve. The value of AUC is generally between 0.5 and 1. The larger the AUC, the more likely the classifier is to rank actual positive samples ahead of actual negative samples, i.e., to make correct predictions. The horizontal axis of the ROC curve is the false positive rate (FPR) and the vertical axis is the true positive rate (TPR), which are calculated as follows:

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
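As a worked example of these definitions, the sketch below computes the metrics from scratch for a small set of hypothetical labels and scores. AUC is computed through its ranking interpretation, the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one; the function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # identical to TPR
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 actives and 5 inactives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.30, 0.20, 0.10, 0.05]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Here TP = 2, FN = 1, FP = 1 and TN = 4, so accuracy is 6/8, precision and recall are both 2/3, FPR is 1/5, and 14 of the 15 positive/negative pairs are ranked correctly.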
2.5. Preparation of proteins & ligands
The crystal structure of ULK1 was obtained from the RCSB Protein Data Bank (http://www.rcsb.org/); a structure with high X-ray resolution (below 3 Å) and as complete a protein structure as possible was selected (PDB ID: 4WNO). The protein was processed using PyMOL 2.0 to remove all crystallographic water molecules, ions, etc., and the Wizard module was used to revert mutated amino acids to the correct sequence and to complete the protein structure. All small molecules were processed using the RDKit package, including removal of molecules whose molecular weight was too large (MW > 700) or too small (MW < 100), energy minimization, etc. Finally, SDF and MOL2 files were generated for subsequent use.
2.6. Molecular docking
Molecular docking has been widely used for structure-based virtual screening, as well as for studying the binding modes of protein-ligand complexes. In this study, docking-based screening was performed using both the AutoDock Vina [35,36] and Glide XP docking programs. The common molecular docking process includes protein preparation, ligand preparation, grid generation and docking. For AutoDock Vina, the pre-processed protein file was loaded in PyMOL 2.0 and hydrogens were added. Using the Grid Settings module, a docking box with a size of (22.5, 22.5, 22.5) was generated, centered on the small molecule 3RF in the ULK1 crystal structure. The processed MOL2 files of the small-molecule compounds were used for docking. For Glide, the ULK1 protein structure was processed in Maestro 11.5 with the Protein Preparation Wizard module: the default parameters were selected to optimize the protein hydrogen bond network, the OPLS-2005 [37] force field was used to assign partial charges, and the protein structure was energy minimized. The small-molecule SDF files were then imported. Using the Receptor Grid Generation module, a 10 × 10 × 10 Å grid centered on 3RF was generated automatically by selecting 3RF in the crystal structure. Finally, in the Ligand Docking module, the grid file of the docking pocket and all small molecules to be docked were selected, and docking screening was performed at XP precision.
2.7. ADMET properties
Physicochemical properties and ADMET characterization of compounds are crucial in drug discovery. The physicochemical properties of a drug, such as molecular mass, lipophilicity and solubility, determine its transmembrane ability. ADMET properties are essential for evaluating the safety and efficacy of a drug candidate. We evaluated the screened compounds using ADMETlab 2.0 [38] (https://admetmesh.scbdd.com/) and selected compounds that had acceptable physicochemical and ADMET properties and passed the PAINS (Pan Assay Interference Compounds) filter for further analysis.
2.8. Bioactivity assay
The inhibitory activity of 25 candidate compounds against ULK1 kinase was assessed using the LANCE Ultra assay. The IC50 values of these 25 candidate compounds were determined, and MRT68921 was used as the positive control compound. The compounds were diluted with 100% DMSO to 100-fold the final reaction concentration. Each compound was then serially diluted by transferring 15 μl into 30 μl of 100% DMSO in the next well, and so on, for a total of five concentrations. For the no-compound and no-enzyme controls, 100 μl of 100% DMSO was added to two empty wells in the same 96-well plate. Next, 10 μl of kinase solution was added to each well of the assay plate, and 10 μl of 1x kinase buffer was added to the no-enzyme control wells. The 2x substrate solution was prepared by adding substrate and ATP to the 1x kinase base buffer, and 10 μl of it was added to each well of the assay plate to start the reaction. Finally, the reaction was terminated by adding 20 μl of assay solution to each well of the assay plate. The IC50 curves were fitted using the XLfit Excel add-in, version 5.4.0.8.
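The dilution arithmetic can be illustrated with a short sketch. It assumes that the serial dilution step means 15 μl of compound transferred into 30 μl of DMSO, which gives a threefold dilution per step. The 6000 μM top stock is a hypothetical value, chosen only so that the 100-fold-diluted top concentration equals 60 μM.

```python
def serial_dilution(top_concentration, transfer_ul=15.0, diluent_ul=30.0, n_points=5):
    """Concentrations obtained by repeatedly transferring into fresh diluent."""
    factor = transfer_ul / (transfer_ul + diluent_ul)  # 15 / 45 = 1/3 per step
    return [top_concentration * factor ** i for i in range(n_points)]


STOCK_FOLD = 100        # stocks are prepared at 100x the final assay concentration
TOP_STOCK_UM = 6000.0   # hypothetical top stock concentration in uM

stock_series = serial_dilution(TOP_STOCK_UM)
final_series = [c / STOCK_FOLD for c in stock_series]
print([round(c, 2) for c in final_series])
```

Under these assumptions the five final concentrations span roughly 60 μM down to 0.74 μM.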
2.9. Molecular dynamics simulations
Molecular dynamics (MD) simulations were performed using GROMACS 2020.4. Prior to the MD simulations, topology and parameter files were generated for each complex system using the CHARMM36 force field, with CGenFF force field [39–41] parameters for the small-molecule ligands. Each system was solvated in a dodecahedral TIP3P water box extending 10 Å from the system surface and neutralized by adding appropriate counterions. The systems were then subjected to a maximum of 50,000 steps of energy minimization using the steepest descent method, followed by 100 ps of NVT and 100 ps of NPT equilibration with the temperature maintained at 300 K. Finally, a 200 ns production MD simulation was run for each system under NPT conditions; all simulations used a time step of 2 fs, and trajectories were saved every 2 ps.
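The simulation lengths stated above translate into step and frame counts by simple arithmetic, sketched below. The variable names are illustrative and are not GROMACS parameter names.

```python
# Values taken from the protocol described in the text.
TIME_STEP_FS = 2        # integration time step
PRODUCTION_NS = 200     # production run length
EQUILIBRATION_PS = 100  # length of each of the NVT and NPT equilibration runs
SAVE_INTERVAL_PS = 2    # trajectory output interval

FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

production_steps = PRODUCTION_NS * FS_PER_NS // TIME_STEP_FS
equilibration_steps = EQUILIBRATION_PS * FS_PER_PS // TIME_STEP_FS
steps_between_frames = SAVE_INTERVAL_PS * FS_PER_PS // TIME_STEP_FS
saved_frames = production_steps // steps_between_frames

print(production_steps, equilibration_steps, steps_between_frames, saved_frames)
```

Each 200 ns production run therefore corresponds to 100 million integration steps and 100,000 saved frames per system.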
2.10. MM/GBSA calculations
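Molecular mechanics/generalized Born surface area (MM/GBSA) has been applied to calculate the binding affinity of ligands to proteins because of its favorable balance of computational speed and accuracy [42]. The binding free energy is calculated as follows:

$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{complex}} - (\Delta G_{\mathrm{receptor}} + \Delta G_{\mathrm{ligand}})$$

$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{GAS}} + \Delta G_{\mathrm{sol}}$$

$$\Delta G_{\mathrm{GAS}} = \Delta E_{\mathrm{internal}} + \Delta E_{\mathrm{vdw}} + \Delta E_{\mathrm{ele}}$$

$$\Delta G_{\mathrm{sol}} = \Delta G_{\mathrm{GB}} + \Delta G_{\mathrm{SURF}}$$

ΔGcomplex, ΔGreceptor and ΔGligand are the energies of the complex, receptor and ligand, respectively. The gas-phase free energy ΔGGAS, which corresponds to the molecular mechanics potential energy ΔEMM, consists of the intramolecular energy (ΔEinternal), van der Waals interactions (ΔEvdw) and electrostatic interactions (ΔEele). Since the single MD trajectory protocol is used in the MM/GBSA calculations, ΔEinternal cancels exactly. The solvation free energy (ΔGsol) is divided into two parts, the polar solvation free energy (ΔGGB) and the non-polar solvation free energy (ΔGSURF). All binding free energy calculations in this study were run with the gmx_MMGBSA.py module.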
3. Results
3.1. Modeling & assessment
We constructed a deep learning classification model based on CNN. The model consists of an input layer, four 2D convolutional layers, an attention layer, a fully connected layer and an output layer. Each convolutional layer is followed by a max pooling layer, and the attention layer includes channel attention and spatial attention. The model flowchart is shown in Supplementary Figure S1A. The RDKit package was used to convert the SMILES strings of the prepared dataset into 300×300 molecular images as input to the deep learning model. The results show that the AUC value of the deep learning model on the test set was 0.583 (Supplementary Figure S1B), which is lower than that of traditional molecular docking screening methods.
In addition, we used the machine learning approach to build a screening model for ULK1 kinase inhibitors. Four different types of molecular representations, RDKit, Mordred, MACCS and ECFP6, were calculated as molecular feature inputs for the machine learning models. After preprocessing, 208 RDKit descriptors and 1422 Mordred descriptors were retained; the MACCS and ECFP6 fingerprints contained 167 and 2048 bits, respectively.
Each molecular representation was combined with four different machine learning algorithms, SVM, XGB, NB and KNN, to build a total of 16 different classification models. Each model combination was parameter-optimized (Supplementary Table S1) to maximize model performance. In addition, the data were normalized and each model was analyzed using fivefold cross-validation; the results showed that all classification models performed well. Precision, Recall, F1-score, Accuracy and AUC were used to evaluate the performance of the different machine learning classifiers (Supplementary Table S2). The evaluation results of the 16 machine learning models are shown in Table 1. Among them, the NB model with ECFP6 fingerprints performed best, with a Recall of 0.9208 and an AUC of 0.9049 on the test set.
Table 1.
Evaluation results of the machine learning models.
| Molecular representation | SVM Recall | SVM AUC | XGB Recall | XGB AUC | NB Recall | NB AUC | KNN Recall | KNN AUC |
|---|---|---|---|---|---|---|---|---|
| RDKit | 0.8113 | 0.8668 | 0.5585 | 0.7430 | 0.8377 | 0.8585 | 0.8226 | 0.8672 |
| MACCS | 0.8076 | 0.8683 | 0.7736 | 0.8615 | 0.8491 | 0.8634 | 0.7925 | 0.8543 |
| ECFP6 | 0.8642 | 0.8996 | 0.7887 | 0.8728 | **0.9208** | **0.9049** | 0.8264 | 0.8777 |
| Mordred | 0.7472 | 0.8419 | 0.5359 | 0.7369 | 0.8151 | 0.8430 | 0.7811 | 0.8502 |
Boldface indicates the model with the best performance.
3.2. Construction & evaluation of docking screening models
In this study, we evaluated two different docking programs and four ULK1 crystal structures (Supplementary Figure S2). The enrichment ability of each model was evaluated by ROC curves, and its ability to separate the inhibitor and decoy sets was evaluated by Student's t-test. The results (Supplementary Figure S3) showed that AutoDock Vina and the Glide HTVS, SP and XP modes were all able to discriminate between inhibitors and decoys for the four ULK1 structures (p < 0.05). The enrichment abilities of the docking programs differed significantly, with the AUC value of AutoDock Vina lower than that of the Glide module. Notably, 4WNO performed best across the four docking modes of the two docking programs, with an AUC value of 0.9149 for the XP mode. Although the AUC value of AutoDock Vina was slightly lower than that of the HTVS mode, its p-value of 2.73E-17 was much smaller than the p-value of 2.66E-12 obtained for HTVS (Figure 2A).
Figure 2.

Result distribution histograms. (A) Distributions of the docking scores of the inhibitors (red) and decoys (blue) for the ULK1 protein (PDB ID: 4WNO). (B) Distributions of the MW, TPSA, LogS and QED of the active inhibitors and the virtual screening compounds. Histograms for the active inhibitors and the virtual screening compounds are represented by red bars and blue bars, respectively.
For the ULK1 complex, the co-crystallized ligand 3RF of 4WNO was re-docked using the two docking programs, and the RMSD between each docking pose and the original conformation was calculated. As shown in Supplementary Figure S4, the RMSD value was 0.2 Å for AutoDock Vina docking and 0.22 Å for Glide XP docking; both values are below 2 Å, indicating that both programs reproduce the crystallographic binding pose well. In summary, we chose the 4WNO crystal structure for the virtual screening targeting ULK1 and used the two programs, AutoDock Vina and Glide XP, in succession for stepwise screening.
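The re-docking check reduces to an RMSD over matched atoms. The sketch below assumes both poses list the same atoms in the same order and share the receptor frame, so no superposition is applied. The coordinates are hypothetical and are not taken from 4WNO.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must contain the same non-zero number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom fragment: the docked pose is shifted by 0.2 Å along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(x + 0.2, y, z) for x, y, z in crystal_pose]

value = rmsd(crystal_pose, docked_pose)
redocking_ok = value < 2.0  # the 2 Å criterion used in the text
print(round(value, 3), redocking_ok)
```

A uniform 0.2 Å shift gives an RMSD of 0.2 Å, well inside the 2 Å threshold.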
3.3. Virtual screening with multiple combination methods
Based on the constructed NB model and the selected ULK1 protein structure and docking programs, we performed a multicombinatorial virtual screening of databases totaling about 13 million compounds, including Vitas-M and others (Supplementary Table S3). In the first stage, the databases were screened with the ligand-based NB model, and 17,197 compounds with prediction scores greater than or equal to 0.9 were selected for molecular docking. These 17,197 compounds were then subjected to structure-based molecular docking screening. First, AutoDock Vina was used, and a total of 4,601 compounds with AutoDock Vina scores below -9.0 kcal/mol were selected for the next docking step. Further screening was then performed using the Glide XP docking protocol, retaining 370 compounds with docking scores below -8.0 kcal/mol. Finally, we removed duplicates from the 370 compounds that passed the three screening stages, as well as compounds with known experimental data on ULK1 activity, yielding a total of 218 compounds.
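The three-stage cascade and the final de-duplication can be sketched as a chain of filters using the thresholds given above. The `Candidate` record, the compound identifiers and all scores are hypothetical. In a real workflow, docking would be run only on the compounds that survive the previous stage, whereas this sketch assumes all scores are already available.

```python
from dataclasses import dataclass

NB_MIN = 0.9          # NB prediction score must be >= 0.9
VINA_MAX = -9.0       # AutoDock Vina score must be < -9.0 kcal/mol
GLIDE_XP_MAX = -8.0   # Glide XP docking score must be < -8.0 kcal/mol


@dataclass
class Candidate:
    compound_id: str
    nb_score: float
    vina_score: float
    glide_xp_score: float


def screening_funnel(candidates, known_ids=frozenset()):
    """Apply the three filters in order, then drop duplicates and known compounds."""
    stage1 = [c for c in candidates if c.nb_score >= NB_MIN]
    stage2 = [c for c in stage1 if c.vina_score < VINA_MAX]
    stage3 = [c for c in stage2 if c.glide_xp_score < GLIDE_XP_MAX]
    seen, unique = set(), []
    for c in stage3:
        if c.compound_id not in seen and c.compound_id not in known_ids:
            seen.add(c.compound_id)
            unique.append(c)
    return stage1, stage2, stage3, unique


library = [
    Candidate("cpd-1", 0.95, -9.6, -8.4),   # passes every stage
    Candidate("cpd-2", 0.95, -9.6, -7.0),   # fails the Glide XP cutoff
    Candidate("cpd-3", 0.95, -8.0, -9.0),   # fails the Vina cutoff
    Candidate("cpd-4", 0.50, -10.0, -9.0),  # fails the NB cutoff
    Candidate("cpd-1", 0.95, -9.6, -8.4),   # same compound from a second vendor
    Candidate("cpd-5", 0.92, -9.1, -8.1),   # already has ULK1 activity data
]
stage1, stage2, stage3, hits = screening_funnel(library, known_ids={"cpd-5"})
print(len(stage1), len(stage2), len(stage3), [c.compound_id for c in hits])
```

Ordering the filters from cheapest to most expensive keeps the costly Glide XP step for the smallest set, which is the rationale for running the ligand-based model first.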
We predicted the physicochemical properties of the 218 screened compounds and of the known ULK1 active compounds using the ADMETlab 2.0 website. The results (Figure 2B) showed that the 218 compounds obtained by the multicombinatorial screening were broadly similar to the ULK1 active compounds in terms of physicochemical properties. The molecular weights (MW) of the 218 compounds were mainly distributed between 320 and 480, lower than those of the active compounds, which is favorable for druggability. The topological polar surface area (TPSA) was mainly distributed between 40 and 140, within the range considered optimal under Veber's rule. The water solubility (LogS) of the 218 compounds was mainly distributed between -6 and -2, similar to the distribution of the active data, indicating that they have low water solubility. The quantitative estimate of drug-likeness (QED) of our screened compounds was slightly higher than that of the active data, suggesting that our compounds may have more novel molecular backbones.
We then performed ADMET filtering on the 218 compounds, removing compounds that failed the PAINS filter, violated the Lipinski rule (MW ≤500, LogP ≤5, nHA ≤10, nHD ≤5 and nRot ≤10), or had QED <0.4 or TPSA ≤40. After this filtering, 200 compounds remained. A final manual screening was performed by comparing the ligand-binding poses in multiple ULK1 crystal structures, removing compounds that could not form stable hydrogen bonds with the protein hinge region as well as those with extremely similar structures. In summary, a total of 25 compounds were purchased for kinase activity testing. Supplementary Table S4 lists some of the physicochemical properties of the 25 candidate compounds.
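The property-based filter can be expressed as a single predicate, sketched below. The `Properties` record and the example values are hypothetical. Treating every listed condition as an independent reason for removal is an assumption about how the criteria were combined.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    mw: float          # molecular weight
    logp: float        # lipophilicity
    n_ha: int          # hydrogen bond acceptors
    n_hd: int          # hydrogen bond donors
    n_rot: int         # rotatable bonds
    qed: float         # quantitative estimate of drug-likeness
    tpsa: float        # topological polar surface area
    pains_alerts: int  # number of PAINS substructure alerts


def passes_admet_filter(p):
    """Keep a compound only if it triggers none of the removal criteria."""
    rule_ok = (
        p.mw <= 500 and p.logp <= 5 and p.n_ha <= 10 and p.n_hd <= 5 and p.n_rot <= 10
    )
    return p.pains_alerts == 0 and rule_ok and p.qed >= 0.4 and p.tpsa > 40


# Hypothetical property sets for illustration.
drug_like = Properties(410.5, 3.2, 6, 2, 5, 0.62, 95.0, 0)
too_heavy = Properties(560.0, 3.2, 6, 2, 5, 0.62, 95.0, 0)
low_tpsa = Properties(410.5, 3.2, 6, 2, 5, 0.62, 35.0, 0)
pains_hit = Properties(410.5, 3.2, 6, 2, 5, 0.62, 95.0, 1)

results = [passes_admet_filter(p) for p in (drug_like, too_heavy, low_tpsa, pains_hit)]
print(results)
```

Only the first example passes; the other three are each removed by exactly one criterion.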
3.4. Bioactivity assay
The in vitro kinase inhibitory activities of the 25 candidate compounds were evaluated using the LANCE Ultra assay, and the IC50 values of these 25 compounds were also determined (Supplementary Figure S5). Encouragingly, three compounds (829W, 831W and 836W) showed more than 60% inhibition at 60 μM (Figure 3A), with IC50 values at the μM level. We therefore measured the IC50 values of these three compounds again (Supplementary Table S5); the IC50 values of the three hit compounds 829W, 831W and 836W are 37.11 ± 4.41 μM, 7.67 ± 2.82 μM and 3.15 ± 0.77 μM, respectively (Table 2). The chemical structures of 829W, 831W and 836W are shown in Figure 3B. It is worth noting that 829W and 831W have similar scaffolds but differ in activity because of differences in their substituents or substituent positions. Compared with reported ULK1 inhibitors, 836W has a more novel scaffold.
Figure 3.

(A) Inhibition rates of the 25 compounds at 60 μM. (B) Chemical structures of the three hit compounds and three reported ULK1 inhibitors. (C) Molecular docking interaction diagrams of the three hit compounds and the three reported inhibitors; pink represents the structure of the ULK1 protein, orange the key residues in protein-ligand interactions, yellow hydrogen bonds, white hydrophobic interactions and red salt bridges.
Table 2.
ADMET properties and IC50 of the 3 hit compounds and MRT68921.
| Compound | IC50 (μM) | Absorption: GI absorption | Distribution: BBB permeation | Metabolism: CYP2D6 inhibitor | Excretion: OCT2 substrate | Toxicity: AMES toxicity |
|---|---|---|---|---|---|---|
| MRT68921 | 0.108 | High | No | Yes | No | No |
| 829W | 37.11 ± 4.41 | High | No | No | No | No |
| 831W | 7.67 ± 2.82 | High | No | No | No | No |
| 836W | 3.15 ± 0.77 | High | No | No | No | Yes |
3.5. Interaction analysis
The molecular docking results (Figure 3C) show that the 1H-pyrazol-3-amine of 3RF forms two hydrogen bonds with residue CYS95 in the hinge region of ULK1, and the aniline forms three hydrogen bonds with LYS46 and ASP165. Importantly, 3RF is well surrounded by residues ILE22, VAL30, VAL76, MET92, TYR94 and LEU145, with which it forms hydrophobic interactions. 3RJ not only forms three hydrogen bonds with the key residues CYS95 and GLU93 in the hinge region, but its benzimidazole also forms three hydrogen bonds with GLN142, ASN143 and ASP165. The cyclobutyl group is surrounded by VAL30, ALA44, LYS46 and ALA164, forming hydrophobic interactions. The interaction pattern of compounds 3RF and 3RJ with ULK1 is in general agreement with that reported by Lazarus et al. [43]. The 2,4-diaminopyrimidine of MRT68921 forms two hydrogen bonds with CYS95, and the amino side chain at the 4-position of the pyrimidine forms one hydrogen bond with LYS46 and another with ASN143. In addition, MRT68921 is surrounded by ILE22, ALA28, VAL76, MET92, LEU145 and ALA164, forming hydrophobic interactions.
The 5-fluoropyrimidine-2,4-diamine of compound 829W forms one hydrogen bond with the hinge residue CYS95 and has a hydrophobic interaction with TYR94. The 3-aminobenzoic acid in the solvent region near the hinge forms a hydrogen bond with ASP99. The 4-aminobenzoic acid pointing toward the inner side of the ATP pocket forms a hydrogen bond with ASP165. In addition, 829W forms hydrophobic interactions with residues VAL30 and LEU145. Compound 831W forms two hydrogen bonds with CYS95, and its 3-aminobenzoic acid in the solvent region near the hinge forms one hydrogen bond with ASP99. The 2-aminobenzoic acid portion pointing toward the inside of the hydrophobic pocket forms a hydrogen bond with ASP165. In addition, 831W forms hydrophobic interactions with ILE22, VAL30, TYR94 and LEU145. Interestingly, compound 836W not only forms two hydrogen bonds with CYS95 in the hinge region, but also forms a hydrogen bond with GLU93 in the hinge region, which may mean that 836W binds more stably to ULK1 at the ATP site. However, its 2,3-dihydrobenzo[b][1,4]dioxine, which is located within the hydrophobic pocket, has only one hydrophobic interaction, with VAL30.
3.6. Molecular dynamics simulations
We performed 200 ns MD simulations of four protein-ligand complex systems: the three hit compounds and the positive drug 3RF from the ULK1 crystal structure. The root mean square deviation (RMSD) measures the deviation of the protein Cα atoms over time and is often used to assess the stability of complex systems. The average RMSD values of the 829W, 831W, 836W and 3RF systems were 0.23 nm, 0.21 nm, 0.18 nm and 0.17 nm, respectively. As shown in Figure 4A, the 831W, 836W and 3RF systems converged after 40 ns, whereas the 829W system converged after 65 ns and remained stable thereafter. All four systems stabilized between 0.15 and 0.25 nm. The RMSD values of 836W and 3RF were close to each other, and their lower RMSD values indicate more stable binding to the protein.
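For readers unfamiliar with the metric, the RMSD of one frame relative to a reference structure can be sketched as follows. This is a minimal illustration that assumes the frame has already been superimposed on the reference (trajectory analysis tools normally perform that fitting step first); the function name and the two-atom coordinates are hypothetical.

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two equally sized coordinate sets.

    Both arguments are lists of (x, y, z) tuples in the same units (e.g. nm).
    The frame is assumed to be already aligned to the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = 0.0
    for (x, y, z), (rx, ry, rz) in zip(frame, reference):
        squared += (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom example: displacements of 0.3 and 0.4 give
# sqrt((0.09 + 0.16) / 2) = sqrt(0.125).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
value = rmsd(frame, reference)
```

Evaluating this quantity for every frame against the starting structure yields the kind of RMSD time series plotted in Figure 4A.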
Figure 4.

Results of molecular dynamics calculations performed on the 829W, 831W, 836W and 3RF compound–protein receptor complexes. (A) RMSD, (B) RMSF, (C) Rg and (D) SASA.
The root mean square fluctuation (RMSF) reflects the flexibility of the structure and describes how each amino acid residue of the protein fluctuates throughout the simulation. A lower RMSF indicates a more stable protein conformation. Cα RMSF values were calculated for the four complex systems and for ULK1 alone (apo). As shown in Figure 4B, except at the catalytic loop (residues 145∼160), where the apo system has a higher RMSF, the residue RMSFs of the five systems follow a similar trend. At the ATP binding site (residues 80∼90), the RMSF values of the apo system were slightly higher than those of the other four systems, and the RMSF values of the three hit compounds were higher than those of 3RF. This suggests that ligand binding may have altered the conformational dynamics at the ATP site, and that the more active inhibitor stabilized the hinge residues to a greater extent.
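The per-residue RMSF described above can be written out explicitly. The sketch below is illustrative only: it assumes an already aligned trajectory stored as a list of frames, each a list of (x, y, z) tuples for the Cα atoms, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation around the time-averaged position.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples.
    Returns one RMSF value per atom, in the units of the coordinates.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Hypothetical toy trajectory: atom 0 is fixed, atom 1 moves by 0.2 along z.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.2)],
]
fluctuations = rmsf(toy_trajectory)
```

A rigid atom gives an RMSF of zero, while the mobile atom in the toy trajectory fluctuates by 0.1 around its mean position, mirroring how flexible loops stand out in Figure 4B.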
The radius of gyration (Rg) is related to the tertiary structure of a protein and refers to the root-mean-square distance between the atoms of a molecule and their common center of mass. It is commonly used as a measure of the mass-weighted average radius of the system during a simulation and characterizes the compactness of the protein structure; a larger radius of gyration indicates that the system has expanded. The average Rg values of the 829W, 831W, 836W and 3RF systems were 1.97 nm, 1.97 nm, 1.96 nm and 1.95 nm, respectively. As shown in Figure 4C, the Rg of the four complex systems remained between 1.94 and 2.00 nm; among them, 3RF, which had the best inhibitory activity, had the lowest Rg value. These results show that the four systems were stable throughout the simulation.
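The mass-weighted definition given above corresponds to the following calculation for a single frame. This is a sketch with a hypothetical function name and toy input, not the analysis code used in the study.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted root-mean-square distance of atoms from the center of mass.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    """
    total_mass = sum(masses)
    # Center of mass of the structure.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
           for k in range(3)]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Hypothetical example: two equal masses placed 1 unit either side of the origin.
rg_value = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0])
```

Applied to each frame of a trajectory, this gives the Rg time series; a flat curve such as those in Figure 4C indicates that the protein neither expands nor collapses.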
The solvent-accessible surface area (SASA) of a protein reflects the hydrophobicity and stability of its amino acid residues and helps to characterize the degree of protein folding or unfolding. The average SASA values of the 829W, 831W, 836W and 3RF systems were 144.52 nm², 144.16 nm², 143.68 nm² and 143.49 nm², respectively. As shown in Figure 4D, the SASA of the four systems was relatively steady and changed little throughout the simulation, suggesting stable hydrophobic interactions in the complex systems.
3.7. Analysis of hydrogen bonding
The overall hydrogen bond occupancies of the four inhibitors 829W, 831W, 836W and 3RF were 99.85%, 99.95%, 99.90% and 100%, respectively (Supplementary Figure S6), which suggests prolonged and stable hydrogen bonding in the complex systems. Based on the interaction analysis above, the hydrogen bond occupancy of the key residues for inhibitor-protein binding was further calculated. As shown in Figure 5A, compounds 829W, 831W, 836W and 3RF all formed stable hydrogen bonds with CYS95 in the hinge region of ULK1, with occupancies of 98.90%, 99.50%, 98.60% and 99.65%, respectively. Notably, the positive drug 3RF and the hit compound 836W showed 97.75% and 91.30% hydrogen bond occupancy with GLU93 in the hinge region, respectively, whereas the slightly less active 829W and 831W showed 0% occupancy with GLU93.
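Hydrogen bond occupancy is the percentage of trajectory frames in which a given hydrogen bond is present. A minimal sketch, assuming that the presence or absence of the bond in each frame has already been determined by a geometric criterion (not shown here); the function name and the per-frame flags are hypothetical.

```python
def occupancy_percent(present_per_frame):
    """Percentage of frames in which the hydrogen bond is detected.

    present_per_frame: sequence of booleans, one entry per trajectory frame.
    """
    if not present_per_frame:
        raise ValueError("at least one frame is required")
    return 100.0 * sum(1 for present in present_per_frame if present) / len(present_per_frame)


def any_bond_occupancy(per_bond_flags):
    """Percentage of frames in which at least one of several bonds is present.

    per_bond_flags: list of per-frame boolean sequences, one per hydrogen bond.
    """
    combined = [any(flags) for flags in zip(*per_bond_flags)]
    return occupancy_percent(combined)


# Hypothetical four-frame example for two ligand-residue hydrogen bonds.
bond_a = [True, True, False, True]
bond_b = [False, True, True, False]
occ_a = occupancy_percent(bond_a)
occ_any = any_bond_occupancy([bond_a, bond_b])
```

The residue-level values in Figure 5A correspond to the first function applied per residue, while an overall ligand occupancy would correspond to the second.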
Figure 5.

Analysis of hydrogen bond interactions and binding free energy. (A) Hydrogen bond occupancy of the three hit compounds and the positive drug 3RF. (B) Comparison of the total energy contributions of key residues to the binding of ULK1 inhibitors. (C) Energy contributions of residues TYR94 and CYS95 to the binding of ULK1 inhibitors.
3.8. Binding free energy calculations
Table 3 lists the binding energy components of the four complex systems, including ΔEvdw, ΔEele, ΔEGB and ΔESURF. The total binding free energies (ΔTOTAL) of 829W, 831W, 836W and 3RF were -21.09 ± 2.82 kcal/mol, -25.22 ± 0.41 kcal/mol, -26.19 ± 2.70 kcal/mol and -29.57 ± 1.31 kcal/mol, respectively, in agreement with the rank order of their inhibitory activities. ΔGGAS plays the more important role in the binding of the inhibitors to ULK1, and within it ΔEvdw contributes the most to the binding energy. In contrast, ΔEGB is unfavorable for drug-protein binding.
Table 3.
ΔGbind for the ULK1 inhibitors by the MM/GBSA method (kcal/mol).
| Compound | ΔEVDW | ΔEELE | ΔEGB | ΔESURF | ΔGGAS | ΔGSOLV | ΔTOTAL |
|---|---|---|---|---|---|---|---|
| 829W | -33.85 ± 2.95 | -16.43 ± 5.15 | 34.53 ± 4.04 | -5.34 ± 0.40 | -50.27 ± 5.42 | 29.18 ± 3.88 | -21.09 ± 2.82 |
| 831W | -33.80 ± 1.79 | -16.54 ± 7.38 | 30.13 ± 30.13 | -5.01 ± 0.30 | -50.35 ± 6.36 | 25.12 ± 6.58 | -25.22 ± 0.41 |
| 836W | -36.73 ± 3.10 | -19.72 ± 4.19 | 35.28 ± 3.08 | -5.01 ± 0.34 | -56.45 ± 3.53 | 30.26 ± 3.11 | -26.19 ± 2.70 |
| 3RF | -44.97 ± 1.66 | -28.33 ± 3.17 | 49.67 ± 2.20 | -5.94 ± 0.11 | -73.30 ± 2.52 | 43.73 ± 2.18 | -29.57 ± 1.31 |
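The columns of Table 3 can be cross-checked against the usual MM/GBSA bookkeeping, in which the gas-phase term is the sum of the van der Waals and electrostatic energies, the solvation term is the sum of the polar (GB) and nonpolar (surface) terms, and the total is the sum of the two. The sketch below assumes these relations (they are standard for MM/GBSA but are not spelled out in the text) and uses only the mean values from Table 3; the dictionary keys are chosen for this illustration.

```python
# Mean values from Table 3 (kcal/mol); standard deviations are omitted.
TABLE3 = {
    "829W": {"vdw": -33.85, "ele": -16.43, "gb": 34.53, "surf": -5.34,
             "gas": -50.27, "solv": 29.18, "total": -21.09},
    "831W": {"vdw": -33.80, "ele": -16.54, "gb": 30.13, "surf": -5.01,
             "gas": -50.35, "solv": 25.12, "total": -25.22},
    "836W": {"vdw": -36.73, "ele": -19.72, "gb": 35.28, "surf": -5.01,
             "gas": -56.45, "solv": 30.26, "total": -26.19},
    "3RF": {"vdw": -44.97, "ele": -28.33, "gb": 49.67, "surf": -5.94,
            "gas": -73.30, "solv": 43.73, "total": -29.57},
}


def recompute(row):
    """Recompute the gas-phase, solvation and total terms from the components."""
    gas = row["vdw"] + row["ele"]    # assumed: dG_gas = dE_vdw + dE_ele
    solv = row["gb"] + row["surf"]   # assumed: dG_solv = dE_GB + dE_SURF
    return gas, solv, gas + solv     # assumed: dG_total = dG_gas + dG_solv


# Largest absolute difference between recomputed and tabulated values.
max_diff = 0.0
for row in TABLE3.values():
    gas, solv, total = recompute(row)
    max_diff = max(max_diff, abs(gas - row["gas"]),
                   abs(solv - row["solv"]), abs(total - row["total"]))

# Compounds ordered from most to least favorable total binding free energy.
ranking = sorted(TABLE3, key=lambda name: TABLE3[name]["total"])
```

With these relations the tabulated means are reproduced to within rounding, and the favorable gas-phase term is partly offset by the unfavorable polar solvation term in every system.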
As shown in Figure 5B, ILE22, VAL30, TYR94, CYS95, GLY98 and LEU145 were considered key residues in the ULK1 protein-ligand complexes. For the positive drug 3RF, residue CYS95 had the largest energy contribution, -3.84 kcal/mol, followed by TYR94 at -3.46 kcal/mol. Notably, in the hit compounds the energy contributions of VAL30 and GLY98 were also significantly lower than in 3RF, whereas those of ILE22 and LEU145 were higher than in 3RF.
To further investigate the reasons for the differences in the energy contributions of the key residues, we compared the energy decomposition of residues ILE22, VAL30, TYR94, CYS95, GLY98 and LEU145. As shown in Figure 5C, the ΔEele values of VAL30, TYR94, CYS95 and GLY98 for 3RF are -0.26 kcal/mol, -2.74 kcal/mol, -5.20 kcal/mol and -1.45 kcal/mol, respectively, significantly higher in magnitude than those for 829W, 831W and 836W. The ΔEvdw values for the interactions of 3RF with VAL30, TYR94 and GLY98 are -1.88 kcal/mol, -1.89 kcal/mol and -1.21 kcal/mol, respectively, also slightly higher in magnitude than those for 829W, 831W and 836W. In addition, the ΔEvdw of 829W, 831W and 836W with ILE22 and LEU145 was significantly higher than that of 3RF, and the ΔEele of all four compounds with ILE22 and LEU145 was low. This may be one of the reasons why the energy contributions of ILE22 and LEU145 are lower for 3RF than for 829W, 831W and 836W.
4. Discussion
This article establishes an artificial intelligence-based virtual screening model to search for potential ULK1 inhibitors through multi-stage screening. The results show that the AUC of the deep learning screening model on the test set is relatively low, whereas the AUC values of traditional molecular docking screening methods are generally higher. CNNs have significant advantages in image recognition [44] and can perform feature extraction automatically from input molecular images; however, when molecular structures are converted into images, the image quality may affect the final prediction. Image-based molecular featurization may therefore be unsuitable for small datasets: with little data, a CNN cannot learn the input features well, which leads to misclassification. This makes the deep learning model we built less effective. The most important difference between deep learning and traditional machine learning emerges as the amount of data increases, and deep learning algorithms perform poorly when data are scarce [45]. The performance of machine learning algorithms, in contrast, depends on the accuracy of feature recognition and feature extraction [46].
The selection of molecular representations in machine learning models is based on their ability to capture different aspects of molecular structures and properties. Among the four molecular representations, RDKit provides 208 molecular descriptors, including molecular mass and electronic properties. Mordred is an expanded descriptor library built on RDKit that includes 1826 2D and 3D molecular descriptors [47]. MACCS keys are substructure fingerprints based on SMARTS patterns, comprising 166 substructures in a bit vector of length 167. ECFPs are circular topological fingerprints that represent structures by circular atom neighborhoods, allowing flexibility in the search radius and fingerprint length. By preprocessing these descriptors and removing those that cannot be processed by the RDKit package or are obviously flawed, we ensure the quality of the input features for the machine learning models. Across multiple evaluation metrics, all machine learning screening models performed consistently well, which highlights the effectiveness of the selected methods. Notably, the NB model emerged as the best-performing one, demonstrating its ability to distinguish accurately between ULK1 inhibitors and non-inhibitors. This finding highlights the practical potential of the NB model in drug discovery and development, particularly in the context of ULK1 kinase inhibition.
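To illustrate how a Naive Bayes classifier operates on binary fingerprints such as ECFP or MACCS bit vectors, the sketch below implements a Bernoulli Naive Bayes model with Laplace smoothing in plain Python. It is an assumption-laden illustration rather than the authors' implementation: the text does not state which NB variant or library was used, and the 4-bit fingerprints and labels are invented toy data.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary fingerprints.

    Returns {class: (log_prior, [P(bit_j = 1 | class), ...])}.
    alpha is the Laplace smoothing constant, which keeps probabilities
    strictly between 0 and 1.
    """
    n_bits = len(fingerprints[0])
    model = {}
    for cls in set(labels):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        log_prior = math.log(len(rows) / len(labels))
        p_on = [
            (sum(fp[j] for fp in rows) + alpha) / (len(rows) + 2.0 * alpha)
            for j in range(n_bits)
        ]
        model[cls] = (log_prior, p_on)
    return model


def predict(model, fingerprint):
    """Return the class with the highest posterior log-probability."""
    best_cls, best_score = None, -math.inf
    for cls, (log_prior, p_on) in model.items():
        score = log_prior
        for bit, p in zip(fingerprint, p_on):
            # Bits are treated as conditionally independent given the class.
            score += math.log(p if bit else 1.0 - p)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


# Hypothetical toy data: 1 = inhibitor, 0 = non-inhibitor.
train_fps = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1],
             [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 1]]
train_labels = [1, 1, 1, 0, 0, 0]
nb_model = train_bernoulli_nb(train_fps, train_labels)
```

Because the model only needs per-bit frequencies for each class, training and prediction are fast, which is consistent with the use of NB as the first, high-throughput stage of the screening cascade.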
In addition, factors such as the protein crystal structure, the docking software and the scoring function often affect the success rate of virtual screening. To improve the accuracy of virtual screening, this study used two different docking programs: structure-based molecular docking screening was performed with AutoDock Vina and with the Glide module of the Schrödinger software, which offers three precision levels (HTVS, SP and XP). The RCSB Protein Data Bank contains five ULK1 crystal structures, with PDB IDs 4WNO, 4WNP, 5CI7, 6QAS and 6MNH. Because 6MNH has severe structural gaps, we chose the other four crystal structures for the evaluation.
The NB model allows rapid searching of large databases, narrowing them down to candidate compounds that have a certain degree of novelty and molecular fingerprints similar to those of known inhibitors. Molecular docking can predict the interaction modes and binding affinity between molecules, allowing a better evaluation of the potential activity of candidate molecules. In this study, we therefore used the NB model, AutoDock Vina and Glide XP in sequence to screen a large compound library for ULK1 kinase inhibitors. Three hit compounds were subsequently obtained through ADMET filtering, manual selection and biological activity testing.
The study of intermolecular interactions in protein-ligand complexes is essential for drug design. To study the binding mode of inhibitors to the ULK1 protein, we performed molecular docking of the hit compounds 829W, 831W and 836W and the three positive drugs 3RF, 3RJ and MRT68921, and selected the docking poses that bound stably to the kinase hinge region for analysis. The results indicate that, compared with the positive drugs, the three hit compounds have weaker hydrophobic interactions with the inside of the pocket and are less well surrounded by hydrophobic amino acids. In addition, all three positive drugs formed stable hydrogen bonding or hydrophobic interactions with ILE22, whereas 836W did not interact with ILE22. This may be one of the reasons why the inhibitory activity of 836W was lower than that of the positive drugs. Furthermore, 836W formed three hydrogen bonds with CYS95 and GLU93 in the hinge region, whereas 829W and 831W formed hydrogen bonds only with CYS95, which may be one of the reasons why 829W and 831W were less active than 836W. Analysis of the interactions of the three positive drugs shows that hydrophobic interactions are important for maintaining the stability of the drug-protein complexes. If the three hit compounds are subsequently modified synthetically, strengthening the hydrophobic interactions between the compound and the inner side of the pocket may be crucial and could improve activity to some extent. In summary, the hinge region residue CYS95 may be the key residue for inhibitor binding to ULK1, and strengthening the interaction with GLU93 could improve inhibitory activity. ILE22 may also be one of the key residues for stabilizing drug-protein binding. In addition, we found that hydrophobic interactions on the inner side of the pocket are crucial for increasing inhibitor activity.
Molecular docking treats only the ligand as flexible while the protein remains rigid. To further study the dynamic binding of ULK1 kinase and the inhibitors, we therefore performed 200 ns MD simulations of four protein-ligand complex systems. Hydrogen bonding is the main molecular force that maintains the stability of the complex system, and it plays a crucial role in the stability, compactness and conformational changes of the protein structure [48]. Calculating the occupancy of the hydrogen bonds between the inhibitors and ULK1 throughout the 200 ns MD simulation allows us to better identify the key residues that maintain stable binding in the protein-drug complexes. The findings further confirmed that CYS95 may be the most critical residue for inhibitor binding to ULK1, and that GLU93 may be another key residue for enhancing the activity of ULK1 inhibitors. This is consistent with the speculation above that strengthening the interaction between the ligand and residue GLU93 could improve inhibitor activity.
In addition, the MM/GBSA method was used to calculate the binding free energy over the last 10 ns of the trajectories of the four complex systems, and the activity difference between the positive control drug 3RF and the three hit compounds 829W, 831W and 836W was examined from an energetic perspective. From the results, we infer that the difference in binding free energy between the positive control drug and the three hit compounds is mainly caused by ΔGGAS, and that decreasing the ΔEGB of protein-drug binding may improve the inhibitory activity of the drugs. We then decomposed the binding free energies of the four systems by residue to assess the detailed contributions of interacting residues to the binding of the inhibitors to ULK1 kinase; residues contributing more than 1 kcal/mol were considered critical residues. The results showed that, in the three hit compounds, the energy contributions of CYS95 and TYR94 were lower than in the positive drug, which may be one of the reasons why the hit compounds are less active than 3RF. In summary, the MM/GBSA analysis shows that the solvation free energy plays an important role in maintaining the stability of the complex system. ILE22, VAL30, TYR94, CYS95, GLY98 and LEU145 are considered key residues for inhibitor binding to ULK1. We also found that increasing the electrostatic interaction between the inhibitor and the key residues VAL30, TYR94, CYS95 and GLY98 may increase inhibitory activity.
5. Conclusion
In this study, we employed a machine learning-based virtual screening approach to rapidly identify potential ULK1 inhibitors with novel structures from a large database. We established 16 different machine learning models, among which NB performed best. First, the established NB model was used to screen 13 million compounds. Second, two-stage molecular docking screening and ADMET filtering were performed in succession. Finally, 25 candidate compounds were purchased for experimental determination of kinase activity. The experimental results show that three compounds have inhibitory activity against ULK1, with IC50 values ranging from 3.15 μM to 37.11 μM. These results validate the reliability of the computational virtual screening and indicate that the three hit compounds are promising starting points for the design and synthesis of novel and efficient ULK1 inhibitors. In addition, the ULK1 protein-ligand interaction patterns were analyzed by interaction analysis, MD simulations and binding free energy calculations. The results showed that residues CYS95, TYR94, GLU93, ILE22 and LEU145 are key residues for the binding of inhibitors to ULK1, and that increasing hydrophobic interactions with the inner side of the pocket could improve inhibitor activity. We also found that the solvation free energy makes an important contribution to maintaining the stability of the complex system, and that increasing the electrostatic interaction energy of inhibitor-protein binding may enhance inhibitor activity.
In conclusion, this study provides an efficient virtual screening model for ULK1 inhibitor research, which can be used to rapidly screen potentially active lead compounds from large-scale databases. It also provides three hit compounds whose modification is expected to yield novel and efficient ULK1 inhibitors. Finally, the MD simulations and MM/GBSA calculations can guide the modification of ULK1 lead compounds.
Supplementary Material
Supplemental data for this article can be accessed at https://doi.org/10.1080/17568919.2024.2385288
Author contributions
M-M Kong was responsible for performing the research, writing code, analyzing data and writing the paper. B Liu and J-T Ding were responsible for code design. Z-X Xi and K Li were responsible for the MD simulations. X Liu was responsible for the biological experiments. T-L Qin and Z-Y Qian were responsible for data processing. T Wei, W-C Wu, W-L Li and J-Z Wu were responsible for research design and for revising the paper.
Financial disclosure
The authors have no financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
This work was supported by the Zhejiang Province Natural Science Fund of China (Grant Nos. LGF20B020001, LGF21H160034) and the National Key Research and Development Program of China (2021YFA1101200). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
Competing interests disclosure
The authors have no competing interests or relevant affiliations with any organization or entity with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
Writing disclosure
No writing assistance was utilized in the production of this manuscript.
References
Papers of special note have been highlighted as: • of interest; •• of considerable interest
- 1. Hama Y, Ogasawara Y, Noda NN. Autophagy and cancer: basic mechanisms and inhibitor development. Cancer Sci. 2023;114(7):2699–2708. doi: 10.1111/cas.15803. •• Important reference opinions for the topic selection of this study.
- 2. Pasquier B. Autophagy inhibitors. Cell Mol Life Sci. 2016;73(5):985–1001. doi: 10.1007/s00018-015-2104-y
- 3. Kang MR, Kim MS, Oh JE, et al. Frameshift mutations of autophagy-related genes ATG2B, ATG5, ATG9B and ATG12 in gastric and colorectal cancers with microsatellite instability: ATG gene mutations. J Pathol. 2009;217(5):702–706. doi: 10.1002/path.2509
- 4. Choi AMK, Ryter SW, Levine B. Autophagy in human health and disease. N Engl J Med. 2013;368(7):651–662. doi: 10.1056/NEJMra1205406
- 5. Deretic V, Saitoh T, Akira S. Autophagy in infection, inflammation and immunity. Nat Rev Immunol. 2013;13(10):722–737. doi: 10.1038/nri3532
- 6. Chan EYW, Kir S, Tooze SA. siRNA screening of the kinome identifies ULK1 as a multidomain modulator of autophagy. J Biol Chem. 2007;282(35):25464–25474. doi: 10.1074/jbc.M703663200
- 7. Vahsen BF, Ribas VT, Sundermeyer J, et al. Inhibition of the autophagic protein ULK1 attenuates axonal degeneration in vitro and in vivo, enhances translation, and modulates splicing. Cell Death Differ. 2020;27(10):2810–2827. doi: 10.1038/s41418-020-0543-y. • Has a profound understanding of the research on ULK1 kinase.
- 8. Xue ST, Li K, Gao Y, et al. The role of the key autophagy kinase ULK1 in hepatocellular carcinoma and its validation as a treatment target. Autophagy. 2020;16(10):1823–1837. doi: 10.1080/15548627.2019.1709762
- 9. Tang F, Hu P, Yang Z, et al. SBI0206965, a novel inhibitor of Ulk1, suppresses non-small cell lung cancer cell growth by modulating both autophagy and apoptosis pathways. Oncology Reports. 2017;37(6):3449–3458. doi: 10.3892/or.2017.5635
- 10. Xie CM, Liu XY, Sham KW, et al. Silencing of EEF2K (eukaryotic elongation factor-2 kinase) reveals AMPK-ULK1-dependent autophagy in colon cancer cells. Autophagy. 2014;10(9):1495–1508. doi: 10.4161/auto.29164
- 11. Lee DE, Yoo JE, Kim J, et al. NEDD4L downregulates autophagy and cell growth by modulating ULK1 and a glutamine transporter. Cell Death Dis. 2020;11(1):38. doi: 10.1038/s41419-020-2242-5
- 12. Blessing AM, Rajapakshe K, Reddy Bollu L, et al. Transcriptional regulation of core autophagy and lysosomal genes by the androgen receptor promotes prostate cancer progression. Autophagy. 2017;13(3):506–521. doi: 10.1080/15548627.2016.1268300
- 13. Dower CM, Bhat N, Gebru MT, et al. Targeted inhibition of ULK1 promotes apoptosis and suppresses tumor growth and metastasis in neuroblastoma. Molecular Cancer Therapeutics. 2018;17(11):2365–2376. doi: 10.1158/1535-7163.MCT-18-0176
- 14. Li Y, Wang C, Xu T, et al. Discovery of a small molecule inhibitor of cullin neddylation that triggers ER stress to induce autophagy. Acta Pharmaceutica Sinica B. 2021;11(11):3567–3584. doi: 10.1016/j.apsb.2021.07.012
- 15. Zhang Y, Luo M, Wu P, et al. Application of computational biology and artificial intelligence in drug design. Int J Mol Sci. Published online 2022.
- 16. Chen X, Xie W, Yang Y, et al. Discovery of dual FGFR4 and EGFR inhibitors by machine learning and biological evaluation. J Chem Inf Model. 2020;60(10):4640–4652. doi: 10.1021/acs.jcim.0c00652
- 17. Leong MK, Syu RG, Ding YL, et al. Prediction of N-Methyl-D-aspartate receptor GluN1-ligand binding affinity by a novel SVM-pose/SVM-score combinatorial ensemble docking scheme. Sci Rep. 2017;7(1):40053. doi: 10.1038/srep40053
- 18. Liew CY, Ma XH, Liu X, et al. SVM model for virtual screening of Lck inhibitors. J Chem Inf Model. 2009;49(4):877–885. doi: 10.1021/ci800387z
- 19. Che J, Feng R, Gao J, et al. Evaluation of artificial intelligence in participating structure-based virtual screening for identifying novel interleukin-1 receptor associated kinase-1 inhibitors. Front Oncol. 2020;10:1769. doi: 10.3389/fonc.2020.01769
- 20. Shi C, Dong F, Zhao G, et al. Applications of machine-learning methods for the discovery of NDM-1 inhibitors. Chem Biol Drug Des. 2020;96(5):1232–1243. doi: 10.1111/cbdd.13708. • This has certain reference significance for the study.
- 21. Zhu J, Li K, Xu L, et al. Discovery of novel selective PI3Kγ inhibitors through combining machine learning-based virtual screening with multiple protein structures and bio-evaluation. J Advan Res. 2022;36:1–13. doi: 10.1016/j.jare.2021.04.007. •• This has important reference significance for the design ideas of this study.
- 22. Vignaux PA, Minerali E, Foil DH, et al. Machine learning for discovery of GSK3β inhibitors. ACS Omega. 2020;5(41):26551–26561. doi: 10.1021/acsomega.0c03302
- 23. Tian S, Sun H, Li Y, et al. Development and evaluation of an integrated virtual screening strategy by combining molecular docking and pharmacophore searching based on multiple protein structures. J Chem Inf Model. 2013;53(10):2743–2756. doi: 10.1021/ci400382r. •• It has important reference value for the writing ideas of this study.
- 24. Miller DW. Results of a new classification algorithm combining k nearest neighbors and recursive partitioning. J Chem Inf Comput Sci. 2001;41(1):168–175. doi: 10.1021/ci0003348
- 25. Jamali AA, Ferdousi R, Razzaghi S, et al. DrugMiner: comparative analysis of machine learning algorithms for prediction of potential druggable proteins. Drug Discov Today. 2016;21(5):718–724. doi: 10.1016/j.drudis.2016.01.007
- 26. Molnár L, Keserű GM. A neural network based virtual screening of cytochrome P450 3A4 inhibitors. Bioorg Medic Chem Lett. 2002;12(3):419–421. doi: 10.1016/S0960-894X(01)00771-5
- 27. Bruns D, Gawehn E, Kumar KS, et al. Identification of synthetic activators of cancer cell migration by hybrid deep learning. ChemBioChem. 2020;21(4):500–507. doi: 10.1002/cbic.201900346
- 28. Wallach I, Dzamba M, Heifets A. AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv. 2015. doi: 10.48550/arXiv.1510.02855
- 29. He S, Liu Y, Li Q, et al. In silico approaches using pharmacophore model combined with molecular docking for discovery of novel ULK1 inhibitors. Future Medic Chem. 2021;13(4):341–361. doi: 10.4155/fmc-2020-0253
- 30. Wood SD, Grant W, Adrados I, et al. In silico HTS and structure based optimization of indazole-derived ULK1 inhibitors. ACS Med Chem Lett. 2017;8(12):1258–1263. doi: 10.1021/acsmedchemlett.7b00344
- 31. Gao Y, Zhou Z, Zhang T, et al. Structure-based virtual screening towards the discovery of novel ULK1 inhibitors with anti-HCC activities. Molecules. 2022;27(9):2627. doi: 10.3390/molecules27092627
- 32. Mysinger MM, Carchia M, Irwin JJ, et al. Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J Med Chem. 2012;55(14):6582–6594. doi: 10.1021/jm300687e
- 33. Altman NS. An introduction to kernel and nearest-neighbor nonparametric regression. Am Statist. 1992;46(3):175–185. doi: 10.1080/00031305.1992.10475879
- 34. Agac S, Durmaz IO. On the use of a convolutional block attention module in deep learning-based human activity recognition with motion sensors. Diagnostics (Basel). 2023;13(11):1861. doi: 10.3390/diagnostics13111861
- 35. Eberhardt J, Santos-Martins D, Tillack AF, et al. AutoDock Vina 1.2.0: new docking methods, expanded force field, and python bindings. J Chem Inf Model. 2021;61(8):3891–3898. doi: 10.1021/acs.jcim.1c00203
- 36. Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31:455–461. doi: 10.1002/jcc.21334
- 37. Kaminski GA, Friesner RA, Tirado-Rives J, et al. Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides. J Phys Chem B. 2001;105(28):6474–6487. doi: 10.1021/jp003919d
- 38. Xiong G, Wu Z, Yi J, et al. ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties. Nucleic Acids Res. 2021;49(W1):W5–W14. doi: 10.1093/nar/gkab255
- 39. Vanommeslaeghe K, MacKerell AD. Automation of the CHARMM General Force Field (CGenFF) I: bond perception and atom typing. J Chem Inf Model. 2012;52(12):3144–3154. doi: 10.1021/ci300363c
- 40. Vanommeslaeghe K, Hatcher E, Acharya C, et al. CHARMM general force field: a force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J Comput Chem. 2009. doi: 10.1002/jcc.21367
- 41. Yu W, He X, Vanommeslaeghe K, et al. Extension of the CHARMM general force field to sulfonyl-containing compounds and its utility in biomolecular simulations. J Comput Chem. 2012;33(31):2451–2468. doi: 10.1002/jcc.23067
- 42. Radziuk D, Möhwald H. Ultrasonically treated liquid interfaces for progress in cleaning and separation processes. Phys Chem Chem Phys. 2016;18(1):21–46. doi: 10.1039/C5CP05142H
- 43. Lazarus MB, Novotny CJ, Shokat KM. Structure of the human autophagy initiating kinase ULK1 in complex with potent inhibitors. ACS Chem Biol. 2015;10(1):257–261. doi: 10.1021/cb500835z
- 44. Francoeur PG, Masuda T, Sunseri J, et al. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J Chem Inf Model. 2020;60(9):4200–4215. doi: 10.1021/acs.jcim.0c00411
- 45. Nayarisseri A, Khandelwal R, Tanwar P, et al. Artificial intelligence, big data and machine learning approaches in precision medicine & drug discovery. Curr Drug Targets. 2021;22(6):631–655. doi: 10.2174/1389450122999210104205732
- 46.Carpenter KA, Huang X. Machine learning-based virtual screening and its applications to alzheimer's drug discovery: a review. Curr Pharm Des. 2018;24(28):3347–3358. doi: 10.2174/1381612824666180607124038 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Moriwaki H, Tian Y-S, Kawashita N, et al. Mordred: a molecular descriptor calculator. J Cheminformatics. 2018;10(1):4. doi: 10.1186/s13321-018-0258-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Kuhn B, Mohr P, Stahl M. Intramolecular hydrogen bonding in medicinal chemistry. J Med Chem. 2010;53(6):2601–2611. doi: 10.1021/jm100087s [DOI] [PubMed] [Google Scholar]