ACS Omega. 2025 Nov 13;10(46):55279–55294. doi: 10.1021/acsomega.5c01756

Identification of Potential Nontoxic Human BTK Inhibitors through an Integrated Deep Learning and Structure-Based Drug Repositioning Strategy

Muhammad Waleed Iqbal †, Xinxiao Sun †,‡,*, Qipeng Yuan †,‡,*
PMCID: PMC12658696  PMID: 41322579

Abstract

Bruton’s tyrosine kinase (BTK) has been a key player in the pathogenesis of multiple autoimmune diseases as its overexpression drives the hyperactivation of the B-cell signaling pathway. While the BTK inhibitors, including ibrutinib, have shown significant inhibitory potential, their low potency and higher toxicity emphasize the need for safer and more effective alternatives. This study develops an in silico pipeline involving deep learning, structure-based drug repositioning, and toxicity analysis to identify potential BTK inhibitors. A curated dataset of BTK-targeting bioactive compounds was rigorously filtered, and the resulting high-quality compounds were used to train and test an artificial neural network (ANN) model. The trained model was then applied to assess the bioactivity of an FDA-approved drug library. The putative compounds were further screened using molecular docking, providing three compounds, including gozetotide, micafungin, and candicidin, as the top hits. Molecular simulations further validated the atomic level stability of these compounds through various post-trajectory analyses, including RMSD, RMSF, RoG, hydrogen bonding, PCA, FEL, and DCCM, suggesting their stable binding profiles within the BTK active site. Finally, the GNN-based toxicity analysis revealed that the suggested compounds did not exhibit any significant toxicity concern, supporting their safety as potential therapeutic agents. These findings contribute to the advancement of safer and more effective treatments for autoimmune diseases and require further clinical trials of gozetotide, micafungin, and candicidin as BTK-targeted therapies.



1. Introduction

Autoimmune diseases are a heterogeneous group of disorders characterized by an overactive immune response in which the immune system mistakenly targets the body’s own healthy tissues and cells. This dysregulated response arises from a failure to distinguish healthy tissues from harmful ones. These conditions encompass various diseases, including multiple sclerosis (MS), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), type 1 diabetes, Sjögren’s syndrome (SS), inflammatory bowel disease (IBD), and chronic lymphocytic leukemia (CLL). Different studies have reported a 12.5–19.1% increase in the prevalence of autoimmune diseases every year. Notably, a disproportionate share of cases (about 66–80%) has been identified in women worldwide. Among the reported cases, rheumatoid arthritis is the most prevalent in American and European regions, while multiple sclerosis and inflammatory bowel disease have become increasingly frequent in Asia and Africa over the past few decades. Furthermore, rheumatoid arthritis has been reported to account for the majority of autoimmune disease cases worldwide, especially in China, where 3% of the adult population is estimated to have one or more autoimmune diseases. Molecular mimicry is considered a key mechanism in the pathogenesis of autoimmune diseases: foreign antigens share structural similarities with host self-antigens, leading to the activation of immune cells that erroneously target and damage the body’s own tissues.

Numerous genome-wide studies have advanced the understanding of autoimmune diseases by identifying shared molecular pathways, genetic determinants, shared loci, and environmental risk factors that contribute to their pathogenesis. Genetic mutations have been identified as the major contributors to the expression dysregulation of multiple proteins, ultimately leading to various autoimmune diseases. The overexpression of various protein kinases has been implicated in the pathogenesis of multiple disorders, with Bruton’s tyrosine kinase (BTK) emerging as a critical target. BTK, a member of the tyrosine kinase family, is predominantly expressed in B-cells as well as other hematopoietic cells, including myeloid cells, mast cells, osteoclasts, and neutrophils. It has three primary domains, including the C-terminal domain, the N-terminal Pleckstrin Homology (PH) domain, and the SRC Homology 1 (SH1) domain, which play an important role in its functional activity, whereas the intervening domains, including SH2, SH3, and TEC homology, are crucial for protein–protein interactions. BTK plays a pivotal role in the activation of Fc receptor and B-cell receptor (BCR) signaling pathways, which facilitate chemotaxis, trafficking, and differentiation. B-cells have gathered considerable attention because of their wide range of roles and activities in immunological defense in response to infection and their malfunction leading to autoimmune diseases. Various studies have shown that the malfunction or dysregulation of B-cells has been associated with the production of proinflammatory cytokines in the case of many autoimmune diseases. BTK has a multifaceted role in B-cells, where it is involved in key processes like cell proliferation, maturation, and differentiation into plasma cells and plasmablasts. In myeloid cells, it is responsible for cytokine production and phagocytosis, whereas it mediates Fc epsilon receptor signaling in mast cells, which results in chemotactic response. 
It has been reported to increase the B-cell sensitivity to toll-like receptor signaling, including unrestricted formation of germinal centers and increased production of IL-6, IL-1, and IFNγ. Furthermore, it also increases the CD8- CD8-expression on activated B-cells and induces antinuclear autoantibody production. The recruitment of BTK to the cell membrane is a critical step in its activation, where its N-terminal PH domain interacts with phosphatidylinositol 3,4,5-trisphosphate (PIP3), unfolding the kinase into its active conformation. This conformation shift exposes its tyrosine residue at the 551 position in the SH1 domain, which is subsequently phosphorylated by various kinases depending on the cell type. This phosphorylation catalytically activates the BTK, which then autophosphorylates the Y223 amino acid. This process also initiates the phosphorylation and activation of other downstream signaling pathways. Almost all of the developing B cells are autoreactive and are typically killed through apoptosis. However, in autoimmune genetic conditions, this apoptotic procedure does not kill the autoreactive B cells, allowing them to present self-antigens that initiate T cell-mediated autoimmunity.

Due to its involvement in autoimmune diseases, BTK has attracted significant attention as a potential therapeutic target. Numerous studies have demonstrated the high efficacy of its first- and second-generation inhibitors, but resistance and unwanted immune deficiency in patients have emerged as considerable concerns. Targeting BTK activity with small covalent inhibitors, such as ibrutinib, has shown therapeutic promise in treating B-cell-related malignancies, including mantle cell lymphoma (MCL) and chronic lymphocytic leukemia (CLL). However, patients receiving ibrutinib for CLL have been shown to be at a higher risk of infection and of neutropenia, which occurs due to improper maturation of neutrophils. Furthermore, a range of adverse events, including atrial fibrillation, fatigue, and upper respiratory infection, has been associated with ibrutinib use. Elderly patients with underlying health issues have been reported to be more sensitive to these adverse effects. Drug resistance arising from the C481S mutation in BTK during ibrutinib treatment is another problem. Moreover, currently available BTK inhibitors have shown off-target binding and toxicity, suggesting the need for more specific and nontoxic drugs with optimized safety profiles. The aim of our study was to computationally identify more reliable nontoxic inhibitors of BTK activity by employing a hybrid methodology. We started with the training and testing of an artificial neural network (ANN) based deep learning model on a comprehensively filtered data set of bioactive compounds targeting BTK. The best model was then used to predict the potency of an existing FDA-approved drug library. Drug repurposing using molecular docking and molecular dynamics simulation further resulted in the identification of potent and potentially nontoxic BTK inhibitors. Our integrated approach underscores the advanced use of computer-aided techniques in the field of drug discovery and suggests further preclinical and clinical testing of the proposed drugs.

2. Methodology

2.1. Data Retrieval and Preprocessing

A total of 4432 BTK-targeting bioactive compounds were retrieved from ChEMBL (https://www.ebi.ac.uk/chembl/) using the ChEMBL web resource client, under UniProt ID “Q06187”. Initial data filtration was performed using the chemoinformatics toolkit RDKit to ensure the integrity and accuracy of the data set. Specifically, duplicate compounds, which can arise from redundant entries or identical compounds listed under different identifiers, were eliminated. Additionally, compounds with errors in their canonical SMILES strings, which might lead to incorrect bond orders, misrepresented or missing atoms, incomplete ring closures, or ambiguous stereochemistry, were filtered out. Subsequently, the IC50 (half maximal inhibitory concentration) values of the remaining compounds were screened and converted to pIC50 values (the negative base-10 logarithm of the IC50 expressed in mol/L) using Python’s math package to assess their relative potency. The data set was categorized into three groups based on pIC50 values: compounds with pIC50 values <5 were classified as inactive, those with pIC50 values between 5 and 6 as intermediate, and those with pIC50 values >6 as active.
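The conversion and three-way classification described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes IC50 values are supplied in nanomolar (the text does not state the unit), the function names are hypothetical, and values of exactly 5 or 6 are assigned to the intermediate class because the text leaves the boundaries unspecified.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the input unit is nM, so 1 nM = 1e-9 mol/L.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def classify_pic50(pic50: float) -> str:
    """Apply the thresholds from the text: <5 inactive, 5-6 intermediate, >6 active.

    Assumption: boundary values (exactly 5 or 6) count as intermediate.
    """
    if pic50 < 5.0:
        return "inactive"
    if pic50 > 6.0:
        return "active"
    return "intermediate"


# Illustrative IC50 values (nM), not taken from the data set.
for ic50 in (50.0, 3000.0, 25000.0):
    p = ic50_nm_to_pic50(ic50)
    print(f"IC50 = {ic50:8.1f} nM -> pIC50 = {p:.2f} ({classify_pic50(p)})")
```

A lower IC50 gives a higher pIC50, so the logarithmic scale turns a 1000-fold potency difference into a difference of 3 units.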

2.2. ADMET and PAINS Analyses

Subsequent to initial categorization, additional filtering was applied to the data set through ADMET (absorption, distribution, metabolism, excretion, and toxicity) and PAINS (pan assay interference compounds) analyses. Initially, Lipinski’s descriptors were computed using the RDKit package to identify compounds that adhere to Lipinski’s rule of five (Ro5). According to the Ro5 criterion, a compound possesses drug-like properties if its molecular weight is less than 500 daltons (Da), it has no more than 5 hydrogen bond donors and no more than 10 hydrogen bond acceptors, and its LogP is less than 5. The intermediate class was then filtered out, as it contains less reliable compounds. The Ro5-compliant compounds and their comparison with the inactive class were plotted using the matplotlib and seaborn packages. Finally, the PAINS analysis was performed using the RDKit package to screen and exclude false positives.
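As a rough sketch of the Ro5 criterion stated above, the check below operates on precomputed descriptors. The `Descriptors` class, the function name, and the example values are hypothetical; in the study the descriptors come from RDKit, which is not used here so that the example stays self-contained. The sketch requires all four conditions to hold, which is how the text reads; some implementations tolerate one violation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed Lipinski descriptors for one compound (hypothetical container)."""
    mol_weight: float  # molecular weight in Da
    h_donors: int      # number of hydrogen bond donors
    h_acceptors: int   # number of hydrogen bond acceptors
    logp: float        # octanol-water partition coefficient


def passes_ro5(d: Descriptors) -> bool:
    """Return True only if all four Ro5 conditions from the text are met."""
    return (
        d.mol_weight < 500.0
        and d.h_donors <= 5
        and d.h_acceptors <= 10
        and d.logp < 5.0
    )


# Made-up descriptor values for illustration only.
small_molecule = Descriptors(mol_weight=440.5, h_donors=1, h_acceptors=6, logp=3.9)
large_macrocycle = Descriptors(mol_weight=1270.3, h_donors=16, h_acceptors=23, logp=-1.5)

print(passes_ro5(small_molecule))    # compliant on all four criteria
print(passes_ro5(large_macrocycle))  # fails on weight, donors, and acceptors
```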

2.3. Final Data Set Preparation

Compounds with unfavorable pharmacokinetic properties might still be present in the categorized data set, raising considerable toxicity and mutagenicity concerns. These compounds must be filtered out to generate an accurate data set. For this purpose, a highly reliable prespecified list of substructures, assembled by Brenk et al., was retrieved and used to remove unwanted compounds. This list contains a variety of substructures, including highly reactive groups (thiols), mutagenic groups (nitro groups), and groups with unfavorable pharmacokinetic properties (phosphates and sulfates). Following the substructure elimination, a smiles_to_fp function built on the RDKit package was employed to compute MACCS (Molecular ACCess System) fingerprints for the comprehensively filtered bioactive data. These molecular fingerprints are binary representations of a compound’s features, typically ranging from 166 to 960 bits. Each bit encodes a distinct structural feature, which helps a deep learning model to be trained more accurately. The resulting features were further screened using the Pandas package to reduce the dimensionality of the data set for more accurate model training.

2.4. Model Building and Evaluation

The MACCS fingerprint data were used to train an artificial neural network (ANN) model. An ANN is a deep learning-based computational model inspired by the human brain and is primarily used for pattern recognition and decision-making based on weights learned from training data. It consists of interconnected layers, each performing a specific function. The working environment was prepared with conda, and the ANN was defined with Keras running on TensorFlow. TensorFlow is an open-source deep learning framework, while Keras serves as its high-level API for defining key parameters such as the number of neurons in the hidden layers and the activation functions used in the model. The model used a rectified linear unit (ReLU) activation in its two hidden layers to introduce nonlinearity and a linear activation function in the output layer. It was trained on the RDKit-derived fingerprints with three different batch sizes (16, 32, and 64) and a 70/30 training-to-testing split. The evaluate() function was used to test the performance for each batch size, returning the mean square error (MSE) and mean absolute error (MAE). MSE is the mean of the squared differences between the true and predicted values, and MAE is the mean of the absolute differences between them; values below 1 were taken to indicate optimized model performance. The best-performing model was saved using the ModelCheckpoint callback from the Keras library. The trained ANN model was then applied to an existing FDA-approved drug library to predict the relative potency of each compound. The final predictions were expressed as pIC50 values, which indicate the possible potency of the compounds.
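The two error metrics defined above can be written out directly. The sketch below is a plain-Python illustration of the definitions, not the Keras implementation used in the study; the function names and the example pIC50 values are made up.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the absolute differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative true and predicted pIC50 values (not from the study).
true_pic50 = [6.0, 7.0, 8.0]
pred_pic50 = [6.5, 7.0, 7.0]

print(f"MSE = {mean_squared_error(true_pic50, pred_pic50):.3f}")
print(f"MAE = {mean_absolute_error(true_pic50, pred_pic50):.3f}")
```

Because MSE squares each error, a single prediction that is off by one pIC50 unit contributes more to MSE than two predictions that are each off by half a unit, whereas MAE weighs them equally.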

2.5. Molecular Docking

The potentially potent leads with reliable bioactivity profiles were shortlisted and further screened using molecular docking, a well-established strategy for assessing receptor–ligand affinity. The crystal structure of BTK was obtained from the Protein Data Bank (PDB) (https://www.rcsb.org/) under the PDB ID “1BTK”. This structure was selected for its high resolution (1.6 Å), reliability, accuracy, and favorable active site geometry, which are key factors for molecular docking and interaction analysis. Additionally, the chosen structure (1BTK) represents the human BTK receptor in its functional state, ensuring its biological relevance. The Molecular Operating Environment (MOE) (https://www.chemcomp.com/en/Products.htm) software was employed to remove the bound ligand and heteroatoms prior to docking. Additionally, energy minimization was performed to resolve steric clashes, and 3D protonation was carried out to assign appropriate protonation states at the physiological pH of 7.4, which is commonly used in computational studies of biological systems. The built-in Site Finder tool of MOE was utilized to identify and specify the potential active site in the BTK structure. Finally, docking was performed using the Triangle Matcher algorithm of MOE, and the interaction profile between the docked compounds and the core residues of BTK was evaluated. The potential leads were shortlisted based on two crucial metrics: binding affinity and root-mean-square deviation (RMSD). The interaction profiles of potent leads meeting the thresholds (i.e., binding affinity below −9 and RMSD < 2 Å) were further visualized using PyMOL, Discovery Studio, and LigPlot+.

2.6. All-Atom Molecular Dynamic Simulation

The top hits were evaluated at the atomic level in a well-controlled environment using molecular dynamics simulation. Initially, all of the shortlisted ligands were parametrized using the antechamber and parmchk2 modules to make them compatible with the gaff2 force field. Amber’s tleap module was used to prepare the system. The ff19SB and gaff2 force fields were applied to the receptor and ligands, respectively. Appropriate Na+ and Cl− ions were added to neutralize each system. A solvation box of 9.0 Å using the OPC water model was added around the complex to facilitate hydrogen bonding interactions between atoms. Following its preparation, the system was minimized in two steps, each consisting of steepest descent and conjugate gradient stages. In the first step, 10,000 steps of conjugate gradient and 4,000 steps of steepest descent were performed with the complex restrained. Subsequently, the restraints were removed and the whole system was minimized in the second step with 10,000 steps of conjugate gradient and 4,000 steps of steepest descent to resolve steric clashes and unfavorable orientations in the system. The system was gradually heated to a temperature of 300 K to reach thermal equilibrium. A Langevin thermostat was used to control the temperature by applying friction and random forces that mimic atomic collisions during the heating phase. Subsequently, the system’s pressure was maintained using the Berendsen barostat, which adjusts the volume of the simulation box based on the pressure deviation from the standard value of 1 atm. Following this, the system was equilibrated for 1000 ps with bond lengths constrained by the SHAKE algorithm. Finally, a 200 ns MD simulation was performed using the pmemd.cuda module of the Amber22 software. This simulation time was chosen based on previous research studies indicating that 200 ns is generally sufficient to capture significant conformational changes, ligand binding stability, and protein flexibility in drug-target interactions. A study by Delemotte et al. even reported that simulations lasting 100 ns can provide reliable insights into the stability and behavior of protein–ligand interactions. The PTRAJ and CPPTRAJ modules were employed to analyze the trajectory files using different metrics, including root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (RoG), hydrogen bond analysis (HBA), principal component analysis (PCA), free energy landscapes (FEL), and dynamic cross-correlation matrix (DCCM) analysis.

2.7. Toxicity Analysis

Following a thorough computational evaluation of the candidate compounds as potential BTK inhibitors, a final toxicity analysis was carried out to predict their safety profiles within the body. A graph neural network (GNN) based deep learning framework called Chemprop 1.5.2 (https://github.com/chemprop/chemprop), specifically designed for chemical property prediction, was employed for this analysis. To ensure high-quality data for molecular property prediction, the data set was obtained from a reputable benchmark platform, MoleculeNet (https://moleculenet.org/). The retrieved data set was used to train the Chemprop model, which was then applied to predict the toxicity of the proposed compounds as well as the control compound. The Matplotlib and Seaborn packages were finally used to plot the predicted toxicity of all the compounds.

3. Results

3.1. Data Retrieval and Comprehensive Filtration

Bioactivity data for 4,432 BTK-targeting compounds were obtained from the ChEMBL database. Following preprocessing, 1,298 compounds were eliminated due to redundancy and SMILES-related errors, such as incorrect bond orders, missing atoms, or ambiguous ring closures, which hinder the proper identification and representation of chemical structures and may lead to errors during downstream analyses. The IC50 values of the remaining compounds were converted to a more convenient potency unit, pIC50. The data set of 3,134 filtered bioactive compounds was then categorized into three groups: active (2,834), intermediate (237), and inactive (63). This distribution, plotted with the Matplotlib library, is shown in Figure 1.

Figure 1. Classification of BTK-targeting bioactive compounds based on their pIC50 values, highlighting their distribution into distinct activity groups.

Following the data set distribution, the RDKit package was utilized to assign Lipinski’s descriptors to the remaining data set for rigorous refinement. Based on these descriptors, only 1,218 of the 3,134 bioactive compounds met Lipinski’s rule of five. Subsequently, the intermediate class of 136 compounds was filtered out from the data set. To minimize false positives, the remaining 1,082 bioactive compounds underwent PAINS filtration, which identified and removed a further 32 compounds. The filtered data set of 1,050 compounds comprised high-quality bioactive compounds that fulfilled both ADMET and PAINS requirements. A detailed graphical representation of the bioactive data set is provided in Figure 2.

Figure 2. (A) Radar plot illustrating the deviations of Lipinski-compliant compounds from their mean values across key parameters; (B) scatter plot depicting the distribution of Lipinski-compliant compounds categorized as active (orange) and inactive (green); (C) bar plot showing the frequency of Lipinski-compliant compounds within active and inactive categories; (D) box plot presenting the number of hydrogen bond acceptors among active and inactive compounds; (E) box plot depicting the number of hydrogen bond donors across active and inactive categories.

To further refine the bioactive data set, a predefined list of substructures was applied. Only 354 of the 1,050 bioactive compounds satisfied the filtration criteria, while 696 compounds were excluded due to the presence of 790 undesirable substructures. The most frequent of these substructures were Michael acceptors, triple bonds, aliphatic long chains, and anilines (Table 1). The resulting 354 compounds, having passed rigorous filtration, were deemed to represent the most accurate and reliable candidates for deep learning model training. Finally, we encoded this highly curated data set using MACCS fingerprints, which represent the structural features of each molecule in binary form. These encoded local features capture the foundational pharmacological characteristics of each molecule, providing critical input for downstream predictive modeling.

Table 1. Types and Numbers of Undesirable Substructures Identified in the Bioactive Data Set

sr. #   substructure name                     count
1       Michael acceptor                      515
2       triple bond                           140
3       aliphatic long chain                  27
4       aniline                               24
5       heavy metal                           16
6       three-membered heterocycle            11
7       aromatic hydrocarbon polycyclic       10
8       cyanate/aminonitrile, thiocyanate     9
9       alkyl halide                          8
10      others                                30
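The compound counts reported in Section 3.1 can be cross-checked with a short piece of arithmetic. The sketch below only replays the numbers given in the text and in Table 1; the step labels and variable names are illustrative, and the Ro5 removal count is derived here as 3134 − 1218 rather than stated in the text.

```python
# Filtration funnel: each step removes some compounds from the running total.
start = 4432
steps = [
    ("duplicates and SMILES errors", 1298),
    ("Lipinski Ro5 failures", 3134 - 1218),  # derived from the reported totals
    ("intermediate class", 136),
    ("PAINS hits", 32),
    ("unwanted substructures", 696),
]

remaining = start
for label, removed in steps:
    remaining -= removed
    print(f"after removing {label:<30} {remaining:>5} compounds remain")

# Substructure occurrences from Table 1 (one compound can match several).
substructure_counts = {
    "Michael acceptor": 515,
    "triple bond": 140,
    "aliphatic long chain": 27,
    "aniline": 24,
    "heavy metal": 16,
    "three-membered heterocycle": 11,
    "aromatic hydrocarbon polycyclic": 10,
    "cyanate/aminonitrile, thiocyanate": 9,
    "alkyl halide": 8,
    "others": 30,
}
total_substructures = sum(substructure_counts.values())
print(f"total undesirable substructure matches: {total_substructures}")
```

The running total ends at the 354 compounds used for model training, and the table entries sum to the 790 substructures cited in the text, which exceeds the 696 excluded compounds because a single compound can contain more than one flagged substructure.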

3.2. ANN Model Building and Evaluation

Fingerprint data for the extensively filtered set of 354 BTK-targeting bioactive compounds were used to train the deep learning-based artificial neural network (ANN) model. A 70:30 ratio was used to split the data set, with 247 bioactive compounds reserved for model training and the remaining 107 compounds for testing. Subsequently, the ANN model was trained using three different batch sizes (16, 32, and 64) with a fixed number of epochs (i.e., 50), where an epoch is one complete pass through the training data. Batch sizes 16 and 32 gave test losses of 1.62 and 1.65, respectively, while batch size 64 gave the lowest test loss of 1.55, indicating the best performance among the three. Having been identified as the best configuration, the ANN model with a batch size of 64 was subjected to additional testing, which yielded a mean square error of 1.51 and a mean absolute error of 0.87. Figure 3 illustrates the comparative performance of the ANN model across different batch sizes, while a scatter plot depicting the distribution of pIC50 values for the BTK-targeting bioactive compounds used in model training and testing is shown in Figure 4.
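The split sizes and the batch-size selection reported above can be reproduced with a few lines. This is a sketch under assumptions: the helper name is hypothetical, and the training share is rounded down, which matches the reported 247/107 split but is not stated in the text. The test losses are the values quoted in this section.

```python
import math


def split_sizes(n_samples: int, train_fraction: float) -> tuple[int, int]:
    """Return (n_train, n_test), rounding the training share down (assumption)."""
    n_train = math.floor(n_samples * train_fraction)
    return n_train, n_samples - n_train


n_train, n_test = split_sizes(354, 0.7)
print(f"training compounds: {n_train}, testing compounds: {n_test}")

# Test losses reported in the text for each batch size.
test_loss = {16: 1.62, 32: 1.65, 64: 1.55}

# Select the batch size with the lowest test loss.
best_batch = min(test_loss, key=test_loss.get)
print(f"best batch size: {best_batch} (test loss {test_loss[best_batch]})")
```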

Figure 3. Performance evaluation of the ANN model trained with batch sizes of 16, 32, and 64. The blue curve illustrates the performance on the training data set, while the orange curve depicts the performance on the testing data set, demonstrating the model’s learning and generalization across the provided data.

Figure 4. pIC50 values of the extensively filtered bioactive compounds targeting BTK that were used for training and validation of the ANN model.

Following its selection, the optimal ANN model was used to predict the bioactivity of an existing library of 2,613 FDA-approved drugs, with an emphasis on their potential to inhibit BTK in the context of autoimmune disorders. Various research studies have reported that pIC50 values between 6 and 8 correspond to moderately active drugs, while values exceeding 8 signify high potency and robust inhibitory effects. Additionally, larger pIC50 values have been associated with increased pharmacological efficacy. The top 495 compounds, with predicted pIC50 values over 7, were shortlisted for further processing using structure-based approaches.

3.3. Molecular Docking

To further evaluate the top-ranked potentially potent compounds, molecular docking was performed as a rigorous screening method. To ensure optimal docking conditions, the BTK crystal structure was subjected to energy minimization and 3D protonation using the MOE software. Additionally, polar hydrogens were added and prebound ligands were removed from the BTK structure. The BTK 3D structure was then docked against the 495 selected compounds, resulting in a diverse range of binding affinities from −3.5 to −10 kJ/mol. A selection criterion requiring an RMSD value below 2 Å and an S-score below −9 was established. This threshold was met by only three compounds: gozetotide (S-score: −9.32, RMSD: 1.43 Å), micafungin (S-score: −9.29, RMSD: 1.90 Å), and candicidin (S-score: −9.28, RMSD: 1.75 Å). Furthermore, ibrutinib, one of the first effective BTK inhibitors, was taken as a control to compare the performance of these top hits. In the docking results, ibrutinib demonstrated a lower binding potential than the proposed leads, indicated by its less favorable S-score of −6.10 and RMSD of 2.28 Å. These results suggest that the top hits had better structural alignment and binding affinity with BTK than the control. Docking results for the control and top hits are summarized in Table 2.

Table 2. Summary of the Top Hits and the Control from the Molecular Docking Analysis of Shortlisted High-Potency Candidates

sr. #   compound name         PubChem ID   S-score   RMSD (Å)
1       Gozetotide            60143283     −9.32     1.43
2       Micafungin            477468       −9.29     1.90
3       Candicidin            10079874     −9.28     1.75
4       Ibrutinib (control)   24821094     −6.10     2.28
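The selection rule stated in this section (S-score below −9 and RMSD below 2 Å) can be applied to the tabulated values as a simple filter. The sketch below is illustrative only; the `DockingResult` type and variable names are hypothetical, and the numbers are those listed in the docking summary table.

```python
from typing import NamedTuple


class DockingResult(NamedTuple):
    """One docked compound with its MOE S-score and pose RMSD in angstroms."""
    name: str
    s_score: float
    rmsd: float


results = [
    DockingResult("Gozetotide", -9.32, 1.43),
    DockingResult("Micafungin", -9.29, 1.90),
    DockingResult("Candicidin", -9.28, 1.75),
    DockingResult("Ibrutinib (control)", -6.10, 2.28),
]

S_SCORE_CUTOFF = -9.0  # more negative means more favorable binding
RMSD_CUTOFF = 2.0      # angstroms

# Keep only compounds that satisfy both thresholds.
hits = [
    r.name
    for r in results
    if r.s_score < S_SCORE_CUTOFF and r.rmsd < RMSD_CUTOFF
]
print(hits)
```

Because more negative S-scores are more favorable, the cutoff is applied as "less than −9", and the control fails on both criteria.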

Previous research studies have explored the therapeutic use of the proposed compounds: gozetotide has been used against prostate cancer, micafungin against candidemia, and candicidin against vulvovaginal candidiasis. The interaction profiles of these compounds with the core residues of BTK are illustrated in Figures 5, 6, and 7. Hydrogen bonding interactions were observed between gozetotide and key residues of BTK, including Tyr-42, Glu-45, Met-89, Ser-93, Tyr-111, Glu-118, and Arg-123, along with hydrophobic interactions involving residues Glu-7, Ser-8, Ile-9, Phe-44, Gly-47, Gln-91, Ile-94, Ile-95, Arg-97, Pro-99, Pro-116, Thr-117, and Glu-119, as shown in Figure 5. Similarly, micafungin exhibited strong hydrogen bonding with multiple BTK residues, including Glu-7, Glu-90, Gln-91, Thr-117, Glu-119, and Arg-123. It also interacted hydrophobically with BTK residues such as Ala-2, Ser-8, Ile-9, Tyr-42, Gly-47, Arg-48, Arg-49, Met-89, Ile-94, and Ile-95 (Figure 6). Lastly, hydrogen bond contacts were noted between candicidin and six BTK residues, namely Phe-25, Lys-27, Glu-41, Phe-44, Glu-45, and Arg-77, along with hydrophobic interactions with the residues Leu-11, Leu-23, Lys-26, Tyr-42, Glu-70, Asn-72, Pro-74, and Glu-76 (Figure 7).

Figure 5. (A) Schematic representation illustrating the binding interactions of gozetotide within the active site of the BTK receptor. (B) Detailed, zoomed-in depiction of the BTK active site, highlighting the molecular interactions between gozetotide and the critical core binding residues of BTK. (C) 2D graph representing hydrogen bonding (green) and hydrophobic (red) contacts between gozetotide and the key BTK residues.

Figure 6. (A) Schematic representation illustrating the binding interactions of micafungin within the active site of the BTK receptor. (B) Detailed, zoomed-in depiction of the BTK active site, highlighting the molecular interactions between micafungin and the critical core binding residues of BTK. (C) 2D graph representing hydrogen bonding (green) and hydrophobic (red) contacts between micafungin and the key BTK residues.

Figure 7. (A) Schematic representation illustrating the binding interactions of candicidin within the active site of the BTK receptor. (B) Detailed, zoomed-in depiction of the BTK active site, highlighting the molecular interactions between candicidin and the critical core binding residues of BTK. (C) 2D graph representing hydrogen bonding (green) and hydrophobic (red) contacts between candicidin and the key BTK residues.

3.4. Molecular Dynamic Simulation

The top hits were further evaluated for their atomic-level stability using a molecular dynamic simulation strategy. A 200 ns simulation was carried out on each of the three complexes, along with ibrutinib taken as a control. The binding stability of each system was evaluated by carrying out various analyses, including root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (RoG), hydrogen bond analysis, principal component analysis (PCA), free energy landscapes (FEL) analysis, and dynamic cross correlation matrix (DCCM) analysis. Each analysis provided a comparative overview between the top hits and the control, validating the higher potential of the proposed compounds against the BTK receptor.

3.4.1. Root Mean Square Deviation (RMSD)

Root mean square deviation (RMSD) is a fundamental metric in postsimulation trajectory analysis, providing insights into the binding stability and conformation of the simulated receptor–ligand complexes. The total duration of an MD simulation is also a crucial factor in precisely assessing receptor–drug interactions. In this study, an extensive 200 ns MD simulation was performed to ensure a comprehensive assessment of drug stability within the active site of the BTK receptor. The ibrutinib-BTK complex (control) demonstrated less stability and greater deviations than all of the proposed compounds. While the RMSD values of the control remained stable at around 3 Å for the first 70 ns, they gradually rose to as much as 12 Å over the rest of the simulation (Figure 8A). Conversely, the gozetotide-BTK complex exhibited the highest stability among all the complexes, with RMSD values consistently stable around 1.5 Å throughout the entire 200 ns simulation (Figure 8B). Similarly, the micafungin-BTK (Figure 8C) and candicidin-BTK (Figure 8D) complexes displayed stable behavior at around 1.5 ± 0.5 Å for the complete simulation time, indicating their strong binding potential with the BTK receptor. Overall, the RMSD analysis demonstrated that the complexes of the top hits were more stable than the control, suggesting their potential to inhibit BTK activity by forming stronger complexes.
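For readers unfamiliar with the metric, the sketch below computes the RMSD between a reference structure and one trajectory frame in plain Python. It is a minimal illustration with made-up coordinates, not the CPPTRAJ procedure used in the study, and it assumes the two structures are already superimposed (trajectory tools normally perform a least-squares fit first).

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumption: both sets are already aligned; no fitting is done here.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    squared = sum(
        (rx - fx) ** 2 + (ry - fy) ** 2 + (rz - fz) ** 2
        for (rx, ry, rz), (fx, fy, fz) in zip(reference, frame)
    )
    return math.sqrt(squared / len(reference))


# Two-atom toy system (coordinates in angstroms, illustrative only).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]   # every atom moved by 2 A
partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]   # only one atom moved by 2 A

print(f"RMSD (all atoms shifted): {rmsd(reference, shifted):.3f} A")
print(f"RMSD (one atom shifted):  {rmsd(reference, partial):.3f} A")
```

Evaluating this quantity for every frame against the starting structure gives the kind of RMSD-versus-time curve discussed in this section.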

Figure 8. (A) RMSD profile of the control system. (B) RMSD trajectory of the gozetotide-BTK kinase complex. (C) RMSD analysis of the micafungin-BTK kinase complex. (D) RMSD pattern observed in the candicidin-BTK kinase complex.

3.4.2. Root Mean Square Fluctuation (RMSF)

While RMSD provides an overall measure of the stability of the entire complex, root mean square fluctuation (RMSF) evaluates the stability and flexibility of a simulated system at the level of individual residues. The RMSF measures how far each residue fluctuates from its mean position over the course of the simulation. Lower RMSF values indicate minimal flexibility and higher stability, while higher RMSF values suggest greater flexibility and lower stability of the respective residue. Surprisingly, all of the complexes, including the control, exhibited highly stable per-residue behavior with minimal flexibility throughout the entire simulation (Figure 9). The RMSF values for nearly all the residues remained around 1 Å, except for residues 16–22, 80–86, and 160–161, which showed slight fluctuations of up to 3 Å, likely because they lie in loop regions. Additionally, minor fluctuations were noted at residues 125–130 in the candicidin-BTK complex. Overall, the RMSF analysis revealed that nearly all the residues were highly stable in all of the simulated complexes, which motivated further validation through additional analyses.
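The per-residue fluctuation described above can be sketched in a few lines of NumPy. This is an illustrative example only: `rmsf_per_atom` and the toy trajectory are hypothetical, one representative atom per residue (for instance Cα) is assumed, and the frames are assumed to be aligned beforehand.

```python
import numpy as np

def rmsf_per_atom(traj):
    """RMSF of each atom about its time-averaged position.

    traj: array of shape (n_frames, n_atoms, 3), assumed pre-aligned.
    """
    mean_pos = traj.mean(axis=0)                   # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)  # shape (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))            # time average, then root

# Toy data (hypothetical): atom 0 is rigid, atom 1 oscillates by +/-1 along x.
traj = np.zeros((4, 2, 3))
traj[:, 1, 0] = [1.0, -1.0, 1.0, -1.0]
rmsf = rmsf_per_atom(traj)
```

The rigid atom has an RMSF of 0 and the oscillating atom an RMSF of 1, mirroring the interpretation that low values mark stable residues and high values mark flexible ones.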

Figure 9. (A) RMSF profile of the control system. (B) RMSF trajectory of the gozetotide-BTK kinase complex. (C) RMSF analysis of the micafungin-BTK kinase complex. (D) RMSF pattern observed in the candicidin-BTK kinase complex.

3.4.3. Radius of Gyration (RoG)

The radius of gyration (RoG) analysis was carried out to further assess the simulated complexes by examining their compactness and conformational stability, which reflect the structural integrity and folding behavior of each complex. All of the proposed complexes (gozetotide-BTK, micafungin-BTK, and candicidin-BTK) demonstrated exceptional compactness, with consistently stable RoG values between 28 and 29 Å (Figure 10). While the control also exhibited reliable compactness during the simulation, a notable decrease in compactness was observed between 10 and 25 ns, where its RoG values fluctuated between 29 and 29.7 Å. Overall, the RoG analysis indicated that the proposed receptor–ligand complexes had better conformational stability than the control. These results provide further support for the promise of the proposed compounds as BTK inhibitors for the treatment of autoimmune disorders.
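For reference, the mass-weighted radius of gyration of a single frame can be sketched as follows. The function name, the toy coordinates, and the equal masses are assumptions made for illustration and do not come from the study.

```python
import numpy as np

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords: array of shape (n_atoms, 3); masses: array of shape (n_atoms,).
    """
    com = np.average(coords, axis=0, weights=masses)  # center of mass
    sq_dist = ((coords - com) ** 2).sum(axis=1)       # squared distance to COM
    return np.sqrt(np.average(sq_dist, weights=masses))

# Toy data (hypothetical): two equal masses placed 2 units apart on the x axis.
coords = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
masses = np.array([12.0, 12.0])
rog = radius_of_gyration(coords, masses)
```

Evaluating this function on every frame gives the kind of RoG time series discussed above; a flatter series indicates a more consistently compact complex.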

Figure 10. (A) RoG profile of the control system. (B) RoG trajectory of the gozetotide-BTK kinase complex. (C) RoG analysis of the micafungin-BTK kinase complex. (D) RoG pattern observed in the candicidin-BTK kinase complex.

3.4.4. Hydrogen Bond Analysis

In protein–ligand complexes, hydrogen bonding and hydrophobic contacts are essential for facilitating interactions. To assess these interactions at the atomic level, we analyzed every individual hydrogen bond formed in each system using fixed geometric criteria. Generally, a hydrogen bond is considered to form if the distance between the donor and acceptor is less than 0.35 nm and the donor–acceptor angle is within 30°. A time-dependent analysis of hydrogen bonding revealed that all three proposed compounds formed a comparatively stronger hydrogen bonding network with BTK than the control (Figure 11). The proposed compounds maintained consistent interactions with BTK, with the hydrogen bonding index reaching up to 6 during the simulation. Each complex retained a significant number of hydrogen bonds, which reinforced its binding affinity. Such hydrogen bonding networks help to maintain the secondary structure in receptor–ligand complexes, ultimately enhancing their stability and therapeutic potential.
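The geometric criterion stated above (donor–acceptor distance below 0.35 nm and an angle within 30°) can be sketched as a simple test on three atom positions. The text does not say how the angle is defined; this sketch assumes the hydrogen–donor–acceptor angle, which is a common convention in MD analysis tools. The function name and the toy coordinates are hypothetical.

```python
import numpy as np

def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_dist_nm=0.35, max_angle_deg=30.0):
    """Geometric hydrogen bond test on coordinates given in nm.

    Assumption: the angle is measured at the donor, between the
    donor->hydrogen and donor->acceptor vectors.
    """
    donor, hydrogen, acceptor = (
        np.asarray(p, dtype=float) for p in (donor, hydrogen, acceptor)
    )
    d_a = acceptor - donor
    d_h = hydrogen - donor
    dist = np.linalg.norm(d_a)
    cos_angle = np.dot(d_h, d_a) / (np.linalg.norm(d_h) * dist)
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return bool(dist < max_dist_nm and angle <= max_angle_deg)

# Toy geometries (hypothetical), coordinates in nm.
linear = is_hydrogen_bond([0, 0, 0], [0.1, 0, 0], [0.3, 0, 0])   # short, linear
bent = is_hydrogen_bond([0, 0, 0], [0.1, 0, 0], [0, 0.3, 0])     # 90 degree angle
far = is_hydrogen_bond([0, 0, 0], [0.1, 0, 0], [0.5, 0, 0])      # beyond cutoff
```

Counting, for each frame, how many donor–hydrogen–acceptor triplets between ligand and receptor pass this test yields a per-frame hydrogen bond count of the kind plotted over time.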

Figure 11. (A) Hydrogen bonding index of the control system. (B) H-bond trajectory of the gozetotide-BTK kinase complex. (C) H-bond analysis of the micafungin-BTK kinase complex. (D) H-bonding pattern observed in the candicidin-BTK kinase complex.

3.4.5. Principal Component Analysis (PCA)

The conformational changes induced by receptor–ligand interactions in the control and proposed complexes were examined and compared using principal component analysis (PCA) of the 200 ns MD simulation trajectories. All of the complexes showed cluster-like motion patterns with significantly varied structural behavior (Figure 12). The control complex displayed more restricted motions, with PC1 and PC2 values each ranging from −200 to +200. Conversely, the complexes of the potential inhibitors with BTK showed more compact and distinct conformational behavior, with PC1 values ranging from −800 to 800 and PC2 values from −400 to 400. These findings suggest that ligand binding in the proposed complexes results in more stable and constrained conformational changes than in the control, which likely indicates that specific conformational states are stabilized upon drug binding. Furthermore, the differences in the distribution and range of PC values offer crucial information about the type and degree of conformational changes in each system, underscoring the potential of the proposed compounds to stabilize the BTK active site.
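A minimal sketch of how PC1 and PC2 coordinates can be obtained from a trajectory is given below. It is a generic Cartesian PCA using NumPy, not the authors' implementation; `pca_projection` and the random toy trajectory are hypothetical, and the frames are assumed to be aligned before the analysis.

```python
import numpy as np

def pca_projection(traj, n_components=2):
    """Project frames onto the leading principal components.

    traj: array of shape (n_frames, n_atoms, 3), assumed pre-aligned.
    Returns (projections, variances) for the top n_components.
    """
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)          # flatten each frame to 3N values
    x = x - x.mean(axis=0)                  # remove the average structure
    cov = x.T @ x / (n_frames - 1)          # 3N x 3N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return x @ eigvecs[:, order], eigvals[order]

# Toy data (hypothetical): 50 random frames of a 4-atom system.
rng = np.random.default_rng(0)
traj = rng.normal(size=(50, 4, 3))
proj, variances = pca_projection(traj)
```

Plotting the first column of `proj` against the second gives a PC1 versus PC2 scatter; the spread of the points along each axis reflects the amplitude of motion along that component.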

Figure 12. (A) PCA profile of the control system. (B) PCA profile of the gozetotide-BTK kinase complex. (C) PCA profile of the micafungin-BTK kinase complex. (D) PCA profile of the candicidin-BTK kinase complex.

3.4.6. Free Energy Landscape (FEL)

Analysis of the free energy landscape (FEL) is an essential method for understanding the stability and conformational dynamics of protein–ligand complexes. By mapping the free energy distribution as a function of collective variables, FEL reveals the stable conformational states and the transition paths sampled during MD simulations. The energy minima and conformational transitions for all of the complexes are shown in Figure 13. The color gradient, ranging from deep blue (low energy, high stability) to red (high energy, low stability), reflects the energy levels of the system. The control system showed a relatively rugged energy landscape with multiple shallow wells, indicating higher energy fluctuations and less stable conformations than the proposed complexes. Conversely, the three proposed complexes exhibited well-defined, deep energy minima, suggesting more stable conformational states and interactions than in the control. These results support the stability and binding capacity of the suggested compounds in the receptor active site.
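One common way to build such a landscape is to histogram two collective variables and convert the resulting probabilities into free energies with G = −k_B T ln P. The sketch below follows that recipe under stated assumptions: the collective variables are taken to be PC1 and PC2, the temperature of 300 K and the bin count are placeholders, and the function name and toy data are hypothetical.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol*K)

def free_energy_landscape(pc1, pc2, bins=30, temperature=300.0):
    """Free energy surface G = -kT ln P over two collective variables.

    Returns (g, xedges, yedges); g is in kJ/mol, shifted so that the global
    minimum is 0. Bins that were never visited are assigned +inf.
    """
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):  # log(0) for empty bins gives -inf
        g = -K_B * temperature * np.log(prob)
    g -= g[np.isfinite(g)].min()
    return g, xedges, yedges

# Toy data (hypothetical): samples scattered around a single basin.
rng = np.random.default_rng(1)
pc1 = rng.normal(size=5000)
pc2 = rng.normal(size=5000)
g, xedges, yedges = free_energy_landscape(pc1, pc2)
```

The most populated bin becomes the zero of the surface, so a single deep well corresponds to one dominant conformational state, whereas several shallow wells indicate a more rugged landscape.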

Figure 13. (A) FEL profile of the control system. (B) FEL profile of the gozetotide-BTK kinase complex. (C) FEL profile of the micafungin-BTK kinase complex. (D) FEL profile of the candicidin-BTK kinase complex.

3.4.7. Dynamic Cross-Correlation Matrix (DCCM)

To analyze the correlated motions of amino acid residues over the course of the simulation, a dynamic cross-correlation matrix (DCCM) analysis was carried out. Positive correlations indicate synchronized movements of the ligand and protein in the same direction, signifying a stable interaction within the system. In contrast, negative correlations suggest that the ligand shifts out of the binding pocket, which causes the complex to move in an antiparallel manner or become unstable. In Figure 14, blue shades (light blue to deep blue) indicate negative correlation, while red shades represent positive correlation. A deeper intensity of red or blue corresponds to a stronger positive or negative correlation, respectively, whereas lighter shades indicate weaker correlations. Deeper blue intensities were seen in the control, indicating stronger negative correlation between the residues, while the prominent red shades in all of the proposed complexes suggested stronger positive correlation. These findings indicate that the suggested ligands promote stability and coordinated residue movement within the receptor, supporting their potential efficacy as therapeutic agents.
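The correlation values behind such a map can be sketched as the normalized covariance of atomic displacements, C_ij = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩). The example below is a generic illustration rather than the authors' code; it assumes one atom per residue (for instance Cα), pre-aligned frames, and a hypothetical function name and toy trajectory.

```python
import numpy as np

def dccm(traj):
    """Dynamic cross-correlation matrix with entries in [-1, 1].

    traj: array of shape (n_frames, n_atoms, 3), assumed pre-aligned.
    """
    disp = traj - traj.mean(axis=0)  # displacement from the mean position
    # Time-averaged dot product of displacements for every atom pair.
    cov = np.einsum("fix,fjx->ij", disp, disp) / traj.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

# Toy data (hypothetical): atoms 0 and 1 move together along x,
# atom 2 moves in the opposite direction.
t = np.array([-1.0, 0.0, 1.0, 0.0])
traj = np.zeros((4, 3, 3))
traj[:, 0, 0] = t
traj[:, 1, 0] = t
traj[:, 2, 0] = -t
corr = dccm(traj)
```

Here the pair (0, 1) has a correlation of +1 and the pairs involving atom 2 have −1, which corresponds to the red and blue extremes of the color scale described above.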

Figure 14. (A) DCCM profile of the control system. (B) DCCM profile of the gozetotide-BTK kinase complex. (C) DCCM profile of the micafungin-BTK kinase complex. (D) DCCM profile of the candicidin-BTK kinase complex.

3.5. Toxicity Analysis

Following their comprehensive stability evaluation, the toxicity profiles of all three proposed compounds were compared to the control across various nuclear receptor (NR) signaling and stress response (SR) pathways (Figure 15). In the majority of pathways, including NR-AhR, SR-ARE, and SR-p53, the control compound (ibrutinib) showed substantially higher predicted toxicity, with scores up to 0.5 and consistently higher values than the suggested compounds. The SR-p53 pathway was found to be the most toxic for the control, indicating its strong potential to induce unfavorable cellular reactions such as DNA damage, mitochondrial toxicity, and other genotoxic effects. Conversely, the putative compounds exhibited significantly lower toxicity in each pathway, with gozetotide being the least toxic compound. While micafungin and candicidin exhibited slightly higher toxicity than gozetotide, their values remained considerably lower than those of the control. Additionally, all of the proposed compounds fell within an acceptable range in all the nuclear receptor pathways, indicating negligible adverse interactions. These findings point to the proposed compounds as potentially safer therapeutic options, given their lower predicted toxicological impact compared to the control.

Figure 15. Bar plot illustrating the toxicity profiles of the three proposed compounds (gozetotide, micafungin, and candicidin) alongside the control (ibrutinib).

4. Discussion

This study aimed to develop a comprehensive in silico pipeline combining structure-based drug design and deep learning to identify potent and potentially nontoxic inhibitors of BTK, which plays a crucial role in B-cell activation and survival through BCR signaling pathways. Overexpression of BTK has been associated with enhanced BCR signaling, resulting in various disease conditions including multiple sclerosis (MS), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), type 1 diabetes, Sjögren’s syndrome (SS), and inflammatory bowel disease (IBD). Despite the development of numerous BTK inhibitors, including first- and second-generation drugs, they frequently exhibit limitations such as poor selectivity and specificity and off-target effects. Notably, the first FDA-approved BTK inhibitor, ibrutinib, has been quite successful but raises toxicity and selectivity concerns, emphasizing the need for safer and more effective BTK inhibitors. For this purpose, we began by retrieving 4432 BTK-targeting bioactive compounds from the ChEMBL database, which were subsequently refined through a series of screening steps: removal of duplicates, ADMET and PAINS analyses, and removal of undesirable substructures. Following this filtration, 354 high-quality compounds were shortlisted to train the deep-learning-based ANN model. These compounds were encoded as MACCS fingerprints, which transform each structure into the machine-readable binary descriptors that the ANN model requires. The data set was then split into a training set (70%) and a testing set (30%), so that the model was trained on the majority of the data while a distinct, unbiased testing set was retained to assess model performance. This data splitting technique is frequently used to balance generalization and model accuracy.
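The 70/30 partition described above can be sketched as a shuffled index split. This is only an illustration of the idea: the function name, the random seed, and the rounding rule are assumptions, so the exact train and test counts obtained here need not match those used in the study.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into train and test sets.

    The rounding rule (nearest integer) is an assumption; libraries differ.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return perm[n_test:], perm[:n_test]

# 354 is the number of curated compounds reported in the text.
train_idx, test_idx = train_test_split_indices(354)
```

Keeping the two index sets disjoint is what makes the test loss an estimate of performance on compounds the model has not seen during training.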
The optimum model configuration for this data set was determined by evaluating each model’s performance using MSE, MAE, and test loss; a batch size of 64 achieved the lowest test loss of 1.55. While these metrics are widely used to assess model accuracy, we acknowledge that they may not fully capture all aspects of predictive performance. Despite the reported errors, the compounds prioritized by the model performed well in the subsequent molecular docking, molecular dynamics simulations, and toxicity analysis, which supports its predictive capability. The model was subsequently applied to screen a library of 2613 FDA-approved drugs, predicting their bioactivity as pIC50 values. Owing to their strong predicted potency, 495 compounds with pIC50 values greater than 7 were chosen for virtual screening. Molecular docking of these compounds identified three potential candidates, gozetotide, micafungin, and candicidin, with S-scores of −9.32, −9.29, and −9.28, respectively, and RMSD values of 1.43, 1.90, and 1.75 Å, suggesting stronger binding affinities with the core residues of the BTK receptor than the control (S-score: −6.10, RMSD: 2.28 Å). The stability profiles of these compounds were further evaluated and compared with the control (ibrutinib) using 200 ns of MD simulation, where stable binding contacts with comparatively smaller deviations in RMSD, RMSF, and RoG values suggested stronger binding of the proposed compounds to BTK. Likewise, the higher hydrogen bonding index (up to 6) of the putative compounds indicated that they can bind strongly within the BTK active site and may inhibit its activity. The PCA and FEL analyses further supported the potential of the proposed compounds, which showed deeper energy minima and better-defined conformational movements than the control throughout the simulation.
Moreover, the DCCM analysis suggested that most of the residues in the proposed complexes showed correlated motion in the same direction, while the control showed a poor DCCM profile, indicating that the ligand shifts out of the receptor’s active site during the simulation. Finally, the toxicity analysis revealed that none of the proposed compounds showed notable toxicity, whereas the control exhibited potential safety issues related to its interactions with NR-AhR (0.49), SR-ARE (0.47), SR-MMP (0.34), and SR-p53 (0.40), indicating multiple toxicity effects including environmental and mitochondrial toxicity. Consequently, the three potential leads, gozetotide, micafungin, and candicidin, emerged as putative nontoxic drugs against BTK activity. The key strength of this study is its integrated methodology combining deep learning, molecular docking, MD simulations, and toxicity analysis, which identified the proposed compounds as potentially effective BTK inhibitors with possible relevance to multiple autoimmune diseases. Furthermore, the employed deep learning algorithms have been widely acknowledged for their efficiency and accuracy. This pipeline can be applied more broadly to find inhibitors for different therapeutic targets, expanding the spectrum of computational drug development. Moreover, the already established safety profiles of FDA-approved drugs enhance the translational potential of these findings, encouraging the use of the drug repositioning approach. These findings hold significant implications for autoimmune diseases, as the developed in silico pipeline shows promise in overcoming limitations of traditional drug discovery, such as toxicity and off-target effects.
By identifying FDA-approved drugs potentially capable of inhibiting BTK, we can expedite the development of novel treatments that specifically target the overexpression of BTK associated with autoimmune diseases, paving the way for more precise and effective therapies. Our findings align with previous studies that have emphasized the importance of identifying potent BTK inhibitors with minimal toxicity. For instance, a study by Alrouji et al. highlighted the potential of computationally repurposing existing drugs as BTK inhibitors, which is consistent with our approach of screening FDA-approved drugs. Furthermore, the use of deep learning models in drug discovery has been validated by other researchers, such as Patel et al., who demonstrated the effectiveness of this strategy in predicting drug efficacy. Research by Gupta et al. also highlights the importance of integrating deep learning and molecular simulations for identifying novel drug candidates. Despite the availability of pharmacological treatments for autoimmune diseases, existing drugs primarily offer symptomatic relief rather than targeting disease progression. Among the proposed compounds, gozetotide has been used in prostate cancer, micafungin has been found to be active against candidemia, and candicidin against vulvovaginal candidiasis. Together with our results, these established clinical uses support exploring the proposed compounds as possible BTK inhibitors. Furthermore, these compounds have not previously been investigated for BTK inhibitory activity, suggesting that our finding could represent a novel application of these agents. Despite these advantages, this study still faces several limitations, such as potential inconsistencies in the retrieved bioactivity data and the fundamental constraints of computational simulations, which might not accurately capture the complexity of in vivo systems.
Additionally, the use of graph neural network (GNN) models could have enhanced the accuracy of predictions, offering further improvements. These limitations call for additional in vitro and in vivo validation of the potential compounds to ensure safety and effectiveness before their formal use. Future studies incorporating functional assays, such as enzymatic inhibition studies and biophysical binding assessments, will be instrumental in further validating these computational predictions and translating them into clinically relevant therapeutic candidates. Despite these inherent limitations, the computational pipeline developed in this study can be adapted to other drug targets, offering a versatile platform for future drug discovery efforts.

5. Conclusions

Given the role of BTK overexpression in the development of various autoimmune diseases, an in silico deep learning and structure-based drug repositioning approach was employed to identify inhibitors of its activity. Through this approach, critical toxicity and specificity concerns of previously identified BTK inhibitors (such as ibrutinib) were addressed by proposing gozetotide, micafungin, and candicidin as potential nontoxic BTK inhibitors. The computational pipeline, involving deep learning, molecular docking, molecular dynamics simulations, and toxicity assessments, indicated strong binding affinities and enhanced stability of the proposed compounds, supporting their potential as effective BTK inhibitors. This study demonstrates the significance of computational methods in drug discovery and identifies gozetotide, micafungin, and candicidin as potential nontoxic BTK inhibitors that require further experimental and clinical validation.

Acknowledgments

This work was supported by the State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China.

All the data generated/analyzed during this study are provided in the Supporting Information. The source code is openly accessible on GitHub (https://github.com/iwaleediqbal/DDWL).

M.W.I.: conceptualization, data curation, methodology, software, formal analysis, writing-original draft. X.S.: writing-review and editing, validation, supervision. Q.Y.: supervision, funding acquisition, resources, project administration.

This work was funded by National Key Research and Development Program of China (2024YFA0918000) and National Natural Science Foundation of China (22238001, 22378016, 22478023).

The authors declare no competing financial interest.

References

  1. Miller F. W.. The increasing prevalence of autoimmunity and autoimmune diseases: an urgent call to action for improved understanding, diagnosis, treatment, and prevention. Curr. Opin. Immunol. 2023;80:102266. doi: 10.1016/j.coi.2022.102266. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Gabhann J. N.. et al. Btk Regulates Macrophage Polarization in Response to Lipopolysaccharide. PLoS One. 2014;9(1):e85834. doi: 10.1371/journal.pone.0085834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Ershadinia N.. et al. The prevalence of autoimmune diseases in patients with multiple sclerosis: A cross-sectional study in Qom, Iran, in 2018. Curr. J. Neurol. 2021;19:98–102. doi: 10.18502/cjn.v19i3.5421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Okoroiwu I. L.. et al. The prevalence of selected autoimmune diseases. Int. J. Adv. Multidiscip. Res. 2016;3:9–14. [Google Scholar]
  5. Corneth O. B. J.. et al. Enhanced Bruton’s Tyrosine Kinase Activity in Peripheral Blood B Lymphocytes From Patients With Autoimmune Disease. Arthritis & Rheumatology. 2017;69(6):1313–1324. doi: 10.1002/art.40059. [DOI] [PubMed] [Google Scholar]
  6. Cao F.. et al. Temporal trends in the prevalence of autoimmune diseases from 1990 to 2019. Autoimmunity Reviews. 2023;22(8):103359. doi: 10.1016/j.autrev.2023.103359. [DOI] [PubMed] [Google Scholar]
  7. Keaney J.. et al. Inhibition of Bruton’s Tyrosine Kinase Modulates Microglial Phagocytosis: Therapeutic Implications for Alzheimer’s Disease. Journal of Neuroimmune Pharmacology. 2019;14(3):448–461. doi: 10.1007/s11481-019-09839-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Marrie R. A.. et al. A systematic review of the incidence and prevalence of autoimmune disease in multiple sclerosis. Multiple Sclerosis Journal. 2015;21(3):282–293. doi: 10.1177/1352458514564490. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Masciopinto P.. et al. The Role of Autoimmune Diseases in the Prognosis of Lymphoma. Journal of Clinical Medicine. 2020;9(11):3403. doi: 10.3390/jcm9113403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Angum F.. et al. The Prevalence of Autoimmune Disorders in Women: A Narrative Review. Cureus. 2020;12(5):e8094. doi: 10.7759/cureus.8094. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Mohamed-Ahmed O.. et al. Incidence and prevalence of autoimmune diseases in China: A systematic review and meta-analysis of epidemiological studies. Global Epidemiology. 2024;8:100158. doi: 10.1016/j.gloepi.2024.100158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Cho J. H., Feldman M.. Heterogeneity of autoimmune diseases: pathophysiologic insights from genetics and implications for new therapies. Nature Medicine. 2015;21(7):730–738. doi: 10.1038/nm.3897. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Iqbal M. W.. et al. Integrating machine learning and structure-based approaches for repurposing potent tyrosine protein kinase Src inhibitors to treat inflammatory disorders. Sci. Rep. 2025;15(1):1836. doi: 10.1038/s41598-024-83767-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Burger J. A., Wiestner A.. Targeting B cell receptor signalling in cancer: preclinical and clinical advances. Nature Reviews Cancer. 2018;18(3):148–167. doi: 10.1038/nrc.2017.121. [DOI] [PubMed] [Google Scholar]
  15. Ringheim G. E., Wampole M., Oberoi K.. Bruton’s Tyrosine Kinase (BTK) Inhibitors and Autoimmune Diseases: Making Sense of BTK Inhibitor Specificity Profiles and Recent Clinical Trial Successes and Failures. Front. Immunol. 2021;12:662223. doi: 10.3389/fimmu.2021.662223. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. de Gruijter N. M., Jebson B., Rosser E. C.. Cytokine production by human B cells: role in health and autoimmune disease. Clin. Exp. Immunol. 2022;210(3):253–262. doi: 10.1093/cei/uxac090. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. López-Herrera G.. et al. Bruton’s tyrosine kinase-an integral protein of B cell development that also has an essential role in the innate immune system. Journal of Leukocyte Biology. 2013;95(2):243–250. doi: 10.1189/jlb.0513307. [DOI] [PubMed] [Google Scholar]
  18. Rip J.. et al. Toll-Like Receptor Signaling Drives Btk-Mediated Autoimmune Disease. Front. Immunol. 2019;10:95. doi: 10.3389/fimmu.2019.00095. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. De Bondt M.. et al. Inhibitors of Bruton’s tyrosine kinase as emerging therapeutic strategy in autoimmune diseases. Autoimmunity Reviews. 2024;23(5):103532. doi: 10.1016/j.autrev.2024.103532. [DOI] [PubMed] [Google Scholar]
  20. Saadoun D.. et al. Expansion of Autoreactive Unresponsive CD21–/lowB Cells in Sjögren’s Syndrome-Associated Lymphoproliferation. Arthritis & Rheumatism. 2013;65(4):1085–1096. doi: 10.1002/art.37828. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Singh S. P., Dammeijer F., Hendriks R.W.. Role of Bruton’s tyrosine kinase in B cells and malignancies. Mol. Cancer. 2018;17(1):57. doi: 10.1186/s12943-018-0779-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Farrar J. E., Rohrer J., Conley M. E.. Neutropenia in X-Linked Agammaglobulinemia. Clinical Immunology and Immunopathology. 1996;81(3):271–276. doi: 10.1006/clin.1996.0188. [DOI] [PubMed] [Google Scholar]
  23. Tang C. P. S., McMullen J., Tam C.. Cardiac side effects of bruton tyrosine kinase (BTK) inhibitors. Leukemia & Lymphoma. 2018;59(7):1554–1564. doi: 10.1080/10428194.2017.1375110. [DOI] [PubMed] [Google Scholar]
  24. McDonald C., Xanthopoulos C., Kostareli E.. The role of Bruton’s tyrosine kinase in the immune system and disease. Immunology. 2021;164(4):722–736. doi: 10.1111/imm.13416. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Gaulton A.. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic acids research. 2012;40(D1):D1100–D1107. doi: 10.1093/nar/gkr777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Bento A. P.. et al. An open source chemical structure curation pipeline using RDKit. J. Cheminform. 2020;12:51. doi: 10.1186/s13321-020-00456-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Selvaraj C.. et al. Tool development for Prediction of pIC50 values from the IC50 values-A pIC50 value calculator. Curr. Trends Biotechnol. Pharm. 2011;5(2):1104–1109. [Google Scholar]
  28. Wang J., Skolnik S.. Recent advances in physicochemical and ADMET profiling in drug discovery. Chemistry & biodiversity. 2009;6(11):1887–1899. doi: 10.1002/cbdv.200900117. [DOI] [PubMed] [Google Scholar]
  29. Bisong E., Bisong E.. Matplotlib and seaborn. Building machine learning and deep learning models on google cloud platform: A comprehensive guide for beginners. 2019:151–165. doi: 10.1007/978-1-4842-4470-8_12. [DOI] [Google Scholar]
  30. Baell J. B., Holloway G. A.. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. Journal of medicinal chemistry. 2010;53(7):2719–2740. doi: 10.1021/jm901137j. [DOI] [PubMed] [Google Scholar]
  31. Brenk R.. et al. Lessons learnt from assembling screening libraries for drug discovery for neglected diseases. ChemMedChem: Chemistry Enabling. Drug Discovery. 2008;3(3):435–444. doi: 10.1002/cmdc.200700139. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Cereto-Massagué A.. et al. Molecular fingerprint similarity search in virtual screening. Methods. 2015;71:58–63. doi: 10.1016/j.ymeth.2014.08.005. [DOI] [PubMed] [Google Scholar]
  33. Snider L., Swedo S.. PANDAS: current status and directions for research. Molecular psychiatry. 2004;9(10):900–907. doi: 10.1038/sj.mp.4001542. [DOI] [PubMed] [Google Scholar]
  34. Maind S. B., Wankar P.. Research paper on basic of artificial neural network. Int. J. Recent Innov. Trends Comput. Commun. 2014;2(1):96–100. doi: 10.17762/ijritcc.v2i1.2920. [DOI] [Google Scholar]
  35. Liu, Y. H. ; Mehta, S. . Hands-On Deep Learning Architectures with Python: Create deep neural networks to solve computational problems using TensorFlow and Keras; Packt Publishing Ltd., 2019. [Google Scholar]
  36. Dragan P.. et al. Keras/TensorFlow in drug design for immunity disorders. International Journal of Molecular Sciences. 2023;24(19):15009. doi: 10.3390/ijms241915009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Eckle K., Schmidt-Hieber J.. A comparison of deep networks with ReLU activation function and linear spline-type methods. Neural Networks. 2019;110:232–242. doi: 10.1016/j.neunet.2018.11.005. [DOI] [PubMed] [Google Scholar]
  38. Andrian R., Hermanto B., Kamil R.. The implementation of backpropagation artificial neural network for recognition of batik lampung motive. J. Phys.: Conf. Ser. 2019;1338:012062. doi: 10.1088/1742-6596/1338/1/012062. [DOI] [Google Scholar]
  39. Chai T., Draxler R. R.. Root mean square error (RMSE) or mean absolute error (MAE)? Geosci. Model Dev. Disc. 2014;7(1):1525–1534. [Google Scholar]
  40. Amaratunga, T. Deep Learning on Windows: Building Deep Learning Computer Vision Systems on Microsoft Windows; Springer, 2021. [Google Scholar]
  41. Abdolmaleki A., Ghasemi J. B.. Inhibition activity prediction for a dataset of candidates’ drug by combining fuzzy logic with MLR/ANN QSAR models. Chemical Biology & Drug Design. 2019;93(6):1139–1157. doi: 10.1111/cbdd.13511. [DOI] [PubMed] [Google Scholar]
  42. Morris G. M., Lim-Wilby M.. Molecular docking. Molecular modeling of proteins. 2008;443:365–382. doi: 10.1007/978-1-59745-177-2_19. [DOI] [PubMed] [Google Scholar]

Data Availability Statement

All the data generated/analyzed during this study are provided in the Supporting Information. The source code is openly accessible on GitHub (https://github.com/iwaleediqbal/DDWL).


Articles from ACS Omega are provided here courtesy of American Chemical Society