Scientific Reports. 2025 Dec 29;16:3958. doi: 10.1038/s41598-025-34068-2

An integrative computational approach for identification of NLRP3 inhibitors through machine learning, docking, dynamics and DFT analysis

Sami I Alzarea 1,
PMCID: PMC12856013  PMID: 41461937

Abstract

Neuroinflammation, mediated by the NLR family pyrin domain containing 3 (NLRP3) inflammasome, plays a crucial role in the development of many central nervous system (CNS) diseases such as Alzheimer’s disease, Parkinson’s disease, multiple sclerosis, and stroke. Despite extensive efforts, there are no clinically approved NLRP3 inhibitors due to issues like poor selectivity, undesirable drug-like properties, and safety concerns. In this study, a machine learning-based virtual screening strategy was used to identify phytochemicals that inhibit the NLRP3 NACHT domain, a key region involved in ATP-driven oligomerization and inflammasome activation. A carefully curated set of 1,956 active compounds and 5,476 inactive ones was employed to train various classifiers, with the Random Forest model demonstrating the best predictive performance (AUC = 0.83). The optimized model was applied to screen the MPD3 phytochemical library, resulting in 183 drug-like candidates. Molecular docking revealed that PubChem 348482, ZINC14583344, and PubChem 11027076 showed excellent binding affinities (−10.6 to −11.3 kcal/mol), forming strong interactions with key residues (Ala228, Arg578, Glu629) known to influence NLRP3 conformational dynamics. ADMET analysis confirmed favorable pharmacokinetic and safety profiles, while 100 ns molecular dynamics simulations supported the stability of the protein-ligand complexes, with ZINC14583344 showing consistent RMSD, RMSF, and hydrogen-bonding patterns. MM-GBSA free energy calculations further identified ZINC14583344 (−23.99 kcal/mol) as the most promising candidate. Additionally, Density Functional Theory (DFT) analysis indicated that ZINC14583344 has a smaller HOMO–LUMO gap, higher softness, and greater electrophilicity, suggesting superior reactivity and receptor binding flexibility. Conversely, PubChem 348482 displayed a higher dipole moment and nucleophilicity, indicating stronger hydrogen bonding and electrostatic interactions with polar residues. Collectively, these findings highlight ZINC14583344 and PubChem 348482 as promising scaffolds for developing selective NLRP3 inhibitors, providing a basis for therapeutic strategies against neuroinflammation-related CNS disorders.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-34068-2.

Keywords: NLRP3 inflammasome, Neuroinflammation, Phytochemicals, Virtual screening, Molecular dynamics

Subject terms: Computational biology and bioinformatics, Drug discovery, Neuroscience

Introduction

Disorders of the central nervous system (CNS) present a significant global health challenge. The innate immune response within the CNS plays a vital role in mediating disease progression after pathogen entry. This process, known as neuroinflammation, is characterized by the stimulation of microglial and astrocytic cells [1]. Several innate immune receptors have been identified, with the nucleotide-binding domain and leucine-rich repeat-containing (NLR) family gaining considerable attention because of its genetic association with immune modulation. The innate immune defense also encompasses pattern recognition receptors (PRRs), comprising the cytosolic NOD-like receptors and the membrane-bound Toll-like receptors (TLRs). As cytosolic sensors, the NLR family recognizes microbial pathogen-associated molecular patterns (PAMPs) and injury-induced damage-associated molecular patterns (DAMPs) in the cytoplasm [2].

The best-characterized inflammasomes are NLR family pyrin domain containing 3 (NLRP3), NLR family pyrin domain containing 1 (NLRP1), and NAIP-NLRC4, along with the non-NLR receptor AIM2, which also mediates inflammasome stimulation [3]. NLRP3 is a crucial inflammasome involved in the development and progression of various CNS disorders [4]. In its resting state, the NLRP3 scaffold is associated with SGT1 (suppressor of G2 allele of skp1) and HSP90 (heat shock protein 90 kDa), which work together to maintain the pre-stimulation state. When stimulated by an irritant, this complex dissociates from NLRP3, initiating activation. Tripartite-motif protein 30, a RING protein, negatively modulates NLRP3 stimulation, which otherwise proceeds through reactive oxygen species generation and caspase-1 cleavage, leading to the production of inflammation-inducing cytokines such as IL-1β and IL-18 and triggering pyroptosis [5]. Therefore, targeting the NLRP3 inflammasome is a promising therapeutic strategy for treating neuroinflammation-related CNS disorders (Fig. 1).

Fig. 1.

Schematic representation of NLRP3 inflammasome stimulation in various CNS disorders. The central complex represents the structure of the NLRP3 inflammasome, which acts as a key factor in the stimulation of pro-inflammatory cytokines. In stroke, NLRP3 stimulation exacerbates brain damage through the secretion of IL-1β and IL-18. Additionally, in Parkinson’s disease, α-synuclein activates TLR2, promoting NF-κB signaling and subsequent NLRP3 expression. In multiple sclerosis, pattern- and damage-associated molecular signals (PAMPs/DAMPs) stimulate cytokine release through the interaction of T and B cells. In Alzheimer’s disease, NLRP3 stimulation induces microglial release and ASC breakdown, contributing to Aβ plaque accumulation.

Induction of the NLRP3 inflammasome depends on a two-step process: a priming signal, often triggered by PAMPs acting through TLRs, which activates the NF-κB pathway, followed by a second stimulation signal involving cellular stress factors such as potassium (K⁺) efflux and mitochondrial dysfunction, leading to the assembly of the inflammasome complex. NLRP3 stimulation has been widely reported in both acute and chronic neurological conditions, such as Parkinson’s disease (PD), Alzheimer’s disease (AD), multiple sclerosis (MS), and stroke [6]. In AD, β-amyloid aggregates trigger NLRP3 stimulation through mitochondrial dysfunction, ROS, and the release of cathepsin B, which ultimately leads to IL-1β production and synaptic dysfunction [7]. Similarly, conditions such as depression, stress, or systemic inflammation induce NLRP3 stimulation in microglia, contributing to depressive behavior. In PD, the accumulation of misfolded α-synuclein (α-syn) aggregates induces microglial stimulation and NLRP3-controlled release of IL-1β and IL-18, thus aggravating dopaminergic neuronal damage. α-Syn also forms Lewy bodies, which are involved in the pathology of PD and in the stimulation of the NLRP3 inflammasome [8]. Additionally, Nasoohi et al. reported that, under impaired glucose metabolism in the CNS, NLRP3 stimulation was induced by thioredoxin-interacting protein (TXNIP), which exacerbated brain tissue damage [9].

Boršić et al. identified a decameric cage-like structure of the inactive human NLRP3 assembly by cryo-EM [10]. NLRP3 comprises three domains: a C-terminal leucine-rich repeat (LRR) domain, an N-terminal pyrin domain (PYD), and a central nucleotide-binding domain (NBD)-containing ATPase domain known as NACHT. PYDs interact with each other to initiate cascade signaling, while LRR domains are primarily regulated by protein–protein interactions and help maintain NLRP3 stability. The NACHT domain binds nucleotides and hydrolyzes ATP, which allows ATP-dependent oligomerization and subsequent PYD clustering [4]. The NACHT domain is the essential part of the inflammasome sensor because it contains the ATP-binding and hydrolysis pocket that drives conformational changes, oligomerization, and the recruitment of adaptor proteins needed for inflammasome assembly. The NACHT domain is further divided into subdomains: FISNA (fish-specific NACHT-associated domain), an NBD, a helical domain 1 (HD1), a winged helix domain (WHD), and a helical domain 2 (HD2). As in other members of the NLR family such as NOD2 and NLRC4, the NACHT and LRR domains of NLRP3 share a conserved ring-like configuration [11]. Therefore, targeting the NACHT domain of NLRP3 is a promising way to inhibit ATPase activity and provides a direct strategy to block NLRP3 activation for treating neuroinflammation-related CNS disorders.

Several studies have reported the development of inhibitors designed to block NLRP3 stimulation. Notably, MCC950, Oridonin, and OLT1177 selectively target the NLRP3 inflammasome and have produced significant therapeutic responses in animal models [12]. However, their precise mechanisms are not yet fully understood, and they have been associated with off-target effects, poor solubility, and safety concerns. For example, a phase II clinical trial indicated potential hepatotoxicity of MCC950, which may also have off-target effects [13]. Because these inhibitors generally exhibit poor drug-like properties and unfavorable pharmacokinetic profiles, no clinically approved NLRP3 inhibitors are currently available for the treatment of CNS disorders. NLRP3 stimulation depends on its diverse structural domains, so its structural dynamics are vital for developing effective inhibitors. Most reported NLRP3 inhibitors were obtained by lead compound modification or high-throughput screening, which significantly raises the cost. Machine learning is becoming a valuable addition to traditional drug discovery approaches due to its efficiency and cost-effectiveness [14]. Recent advances in in-silico approaches, such as enhanced molecular dynamics simulations and machine learning-based predictive models, have significantly accelerated drug discovery by improving target validation and lead optimization. These computational strategies have effectively identified bioactive natural compounds, assessed mutation-induced drug resistance, and predicted small-molecule activity across various therapeutic targets [15–18].

In the current study, an ML-based virtual screening method was used to identify new candidate NLRP3 inhibitors. ML algorithms learn patterns and features from known inhibitors to find novel compounds likely to bind the target protein. Additionally, we performed molecular docking, molecular dynamics simulations, binding energy calculations, PCA, and density functional theory (DFT) analysis to predict and validate potential drug-like compounds for inhibiting NLRP3 in neuroinflammation-related CNS disorders.

Methodology

Dataset Preparation

The NLRP3 dataset used in this study was obtained from BindingDB (https://www.bindingdb.org/rwd/bind/index.jsp) [19]. After removing duplicates and irrelevant entries, a total of 1,956 unique active compounds were retained. Additionally, 5,476 decoys (inactives) were generated using DUD-E (https://dude.docking.org/). Decoys were included as negative controls to assess the specificity and robustness of the machine learning model. All compounds were converted into SMILES format for descriptor generation, and the data were compiled into an Excel file. The active compounds were labeled as 1, while the inactive compounds were labeled as 0. Data preprocessing and cleaning were carried out using Python’s pandas and RDKit libraries.

Feature calculation

A wide range of molecular descriptors was calculated using the RDKit library in Python. The compounds, provided in SMILES format, were processed accordingly for feature generation [20,21]. A total of 37 features were computed; features with zero or missing values were removed to enhance computational efficiency [22]. This preprocessing step helped eliminate incomplete data and reduce complexity. The final set of descriptors was compiled into a CSV file for use in subsequent preprocessing and machine learning analysis.

Dataset splitting

The dataset, which includes both active and inactive compounds along with their calculated features, was divided into training and testing subsets using the train_test_split function from the scikit-learn library in Python [23]. A stratified split was performed using a 70:30 ratio so that the class distribution was the same in both subsets [24]. The training set (70%) was used to develop machine learning models, while the testing set (30%) was kept for model evaluation.
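As a minimal sketch of what stratification does, the following dependency-free function splits each class separately so that both subsets keep the same active-to-inactive ratio. It is a hand-written illustration, not scikit-learn's implementation; in scikit-learn the corresponding call would be `train_test_split(X, y, test_size=0.3, stratify=y)`. The toy labels, the seed, and the function name `stratified_split` are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.30, seed=42):
    """Return (train_idx, test_idx) with the class ratio preserved in each subset."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                           # shuffle within one class
        n_test = round(len(indices) * test_fraction)   # 30% of this class
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (hypothetical): 30 actives (1) and 70 inactives (0).
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(labels)

train_active_fraction = sum(labels[i] for i in train_idx) / len(train_idx)
test_active_fraction = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_active_fraction, test_active_fraction)
```

Because each class is split on its own, the fraction of actives is the same in the training and testing subsets, which is the property the stratified 70:30 split is meant to guarantee.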

Principal component analysis (PCA)

Principal Component Analysis (PCA) was used to reduce the dimensionality of the dataset by converting correlated descriptors into a smaller set of uncorrelated variables while retaining most of the variance. This step reduced redundancy, noise, and computational cost, allowing for improved model performance and visualization of the chemical space [25–27]. Eigenvalues and explained variance ratios were calculated to determine each principal component’s contribution to the total variance. The first few components, especially PC1 and PC2, captured a significant portion of the dataset’s variability [28,29]. Loading factor analysis was also performed to identify the original features that contributed most to each principal component [30,31]. All PCA and clustering operations were carried out using the scikit-learn and matplotlib libraries in Python [32].
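The eigenvalue and explained-variance calculation can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the study's pipeline (which used scikit-learn): the descriptor matrix is synthetic, the column meanings are hypothetical, and no scaling is applied.

```python
import numpy as np


def explained_variance_ratio(X):
    """Return (ratios, loadings): variance share and eigenvector of each component."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)                     # PCA works on centered data
    cov = np.cov(centered, rowvar=False)              # descriptor covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]             # largest variance first
    eigenvalues = eigenvalues[order]
    loadings = eigenvectors[:, order]                 # column k = loadings of PC(k+1)
    return eigenvalues / eigenvalues.sum(), loadings


rng = np.random.default_rng(0)
# Hypothetical descriptors: one wide-range column (MolWt-like) and two
# narrow-range columns (LogP-like, ring-count-like).
X = np.column_stack([
    rng.normal(400.0, 120.0, size=500),
    rng.normal(3.0, 1.5, size=500),
    rng.normal(2.0, 1.0, size=500),
])
ratios, loadings = explained_variance_ratio(X)
print(ratios)
print(loadings[:, 0])
```

On unscaled data such as this, the descriptor with the largest numeric range dominates the first component, so the explained variance ratios depend strongly on whether descriptors are standardized beforehand.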

Model training and development

For model development, four supervised machine learning algorithms were employed to categorize compounds as active or inactive based on their molecular descriptors [33,34]. The selected algorithms were Naïve Bayes (NB), Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM) [35,36]. NB was used as a probabilistic classifier based on Bayes’ Theorem, with the assumption of feature independence. RF, an ensemble-based approach, constructs multiple decision trees during training and predicts outcomes through majority voting, offering high accuracy and robustness against overfitting. KNN assigns class labels to compounds by considering the majority class among the k nearest neighbors in the feature space, making it particularly effective when clusters are well defined [37]. SVM identifies the optimal hyperplane that separates classes in a high-dimensional feature space, making it effective for both linear and non-linear classification tasks. Model training was carried out on the training dataset using the scikit-learn library in Python [38]. To enhance predictive performance, hyperparameters were optimized through GridSearchCV combined with 10-fold cross-validation. The trained models were subsequently evaluated using multiple performance metrics, including accuracy, F1-score, Matthews correlation coefficient (MCC), and ROC-AUC, to determine the most suitable classifier [39,40].
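A minimal sketch of hyperparameter tuning with GridSearchCV and 10-fold cross-validation is shown below for the Random Forest case. It assumes scikit-learn is installed and uses a synthetic dataset from `make_classification` as a stand-in for the descriptor matrix; the search grid, the scoring choice, and the random seeds are assumptions, since the tuned hyperparameters are not reported in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 37 features, roughly 25% "actives" (hypothetical).
X, y = make_classification(
    n_samples=400, n_features=37, n_informative=8,
    weights=[0.75, 0.25], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Hypothetical grid; replace with the hyperparameters of interest.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=10,               # 10-fold cross-validation on the training set only
    scoring="roc_auc",   # assumed selection metric
)
search.fit(X_train, y_train)

# The refitted best model is then scored once on the held-out test set.
best_model = search.best_estimator_
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])
test_mcc = matthews_corrcoef(y_test, best_model.predict(X_test))
print(search.best_params_, round(test_auc, 3), round(test_mcc, 3))
```

Keeping the test set out of the grid search means the reported test metrics are not influenced by hyperparameter selection.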

Virtual screening of the MPD3 library using the pre-trained model

The MPD3 (Medicinal Plant Database for Drug Designing) (http://bioinfo-pharma.upsc.se/MPD3/) library is a carefully curated collection of phytochemicals from medicinal plants, recognized for their pharmacological significance [41]. This library was selected for virtual screening because of its diverse chemical structures and potential bioactivity. A total of 2,347 compounds from the MPD3 library were obtained in SMILES format and processed to generate molecular descriptors consistent with those used during model training. The best-performing machine learning model, previously saved with the pickle library, was then applied to this new dataset [42]. The model predicted each compound’s activity, classifying it as either active or inactive against the NLRP3 target. This virtual screening process helped identify the top candidate molecules for further structure-based evaluation.

Protein structure preparation and ligand preparation

The three-dimensional structure of the NLRP3 protein (PDB ID: 9GU4) was retrieved from the Protein Data Bank (https://www.rcsb.org/). The protein structure was prepared using AutoDockTools 1.5.6 [43] by removing all crystallographic water molecules and unnecessary heteroatoms. Polar hydrogen atoms were added to the receptor and Gasteiger charges were assigned. The prepared structure was then saved in PDBQT format for later use. The compounds predicted as active by the ML-based screening were prepared for molecular docking studies. Initially, the SMILES structures were converted into 3D PDB format using the RDKit library. These structures were then processed in AutoDockTools 1.5.6, where polar hydrogens were added and Gasteiger charges were assigned. Finally, the prepared ligand structures were saved in PDBQT format.

Molecular docking analysis

The prepared protein and ligand structures, as described in the previous section, were used for molecular docking studies. Docking was performed to predict the binding orientation of ligands within the active site of the target protein and to estimate their binding affinities, allowing for insights into potential molecular interactions. The docking simulations were performed using AutoDock Vina [44], which employs a hybrid scoring function combining empirical and knowledge-based terms. The grid box was positioned to fully encompass the predicted active site of the target protein, with center coordinates set to x = 14.465, y = 33.051, and z = −9.565, and grid dimensions of 30 Å along each of the x, y, and z axes (Supplementary File 6). This configuration ensured complete coverage of the catalytic region and neighboring residues, allowing comprehensive exploration of potential binding orientations. The exhaustiveness parameter was set to 16 to balance computational efficiency against conformational search depth, and twenty docking poses were generated for each ligand. After molecular docking, the compounds were initially ranked by their predicted binding scores (kcal/mol). The top 15 compounds were then visually inspected to assess their binding orientations within the active site. Among these, compounds that established interactions with critical catalytic residues were prioritized for further analysis.
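The docking settings above map onto an AutoDock Vina configuration file. The sketch below writes such a configuration as text; the grid center, box size, exhaustiveness, and number of poses come from the text, while the file names and the helper name `vina_config` are hypothetical.

```python
def vina_config(receptor, ligand, out):
    """Build the text of an AutoDock Vina config file with the reported grid settings."""
    settings = {
        "receptor": receptor,       # prepared receptor in PDBQT format
        "ligand": ligand,           # prepared ligand in PDBQT format
        "out": out,                 # docked poses written here
        "center_x": 14.465,         # grid box center (Å)
        "center_y": 33.051,
        "center_z": -9.565,
        "size_x": 30,               # grid box edge lengths (Å)
        "size_y": 30,
        "size_z": 30,
        "exhaustiveness": 16,       # search effort
        "num_modes": 20,            # maximum number of poses to report
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"


# Hypothetical file names, for illustration only.
config_text = vina_config(
    "nlrp3_receptor.pdbqt", "ligand_001.pdbqt", "ligand_001_out.pdbqt"
)
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with `--config`. Note that Vina may return fewer poses than `num_modes` because poses outside its energy range cutoff are discarded.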

ADMET profiling and drug-likeness evaluation

The pharmacokinetic properties, physicochemical parameters, and drug-likeness of the ligands were evaluated using the pkCSM web server (https://biosig.lab.uq.edu.au/pkcsm/). pkCSM is a freely accessible server that predicts ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) characteristics of drug candidates based on graph-based signatures [45]. It provides detailed information about water solubility, intestinal absorption, blood-brain barrier permeability, cytochrome P450 interactions, and total clearance. The SMILES (Simplified Molecular Input Line Entry System) representations of the ligands were submitted to the pkCSM server to evaluate their pharmacokinetic properties and drug-likeness for further screening.

Molecular dynamics (MD) simulations

In our study, GROMACS 2024.1 [46] was used to conduct simulations on both apo NLRP3 and the NLRP3-ligand complexes obtained after docking. The CHARMM36-jul2022 force field [47] was employed for protein topology, while ligand topologies were obtained via the CGENFF server (https://cgenff.com/). Each system was solvated in a cubic box using the TIP3P water model, maintaining a distance of 15 Å between the protein and the box boundaries [48]. Sodium ions (Na+) were added to neutralize the system. The systems were then energy-minimized with the steepest descent method for 2,000 steps [49]. The systems were equilibrated under the NVT and NPT ensembles for 350 ps each to stabilize temperature and pressure, respectively. Subsequently, a 100 ns production simulation was performed for each system, with trajectories saved every 10 ps for analysis. The Linear Constraint Solver (LINCS) algorithm was used to constrain bond lengths, particularly those involving hydrogen atoms [50].
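The simulation lengths above translate into integer step counts for the run input. The following sketch works this out under one loudly stated assumption: a 2 fs integration time step, which is common when bonds to hydrogen are constrained but is not stated in the text. The dictionary keys mirror the GROMACS mdp options `nsteps` and `nstxout-compressed`; the function name is hypothetical.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def md_schedule(production_ns=100, save_every_ps=10, timestep_fs=2,
                snapshots_wanted=2_500):
    """Convert run length and save interval into step counts (integer arithmetic)."""
    nsteps = production_ns * FS_PER_NS // timestep_fs         # mdp: nsteps
    save_interval = save_every_ps * FS_PER_PS // timestep_fs  # mdp: nstxout-compressed
    saved_frames = nsteps // save_interval                    # frames after t = 0
    stride = saved_frames // snapshots_wanted                 # keep every n-th frame
    return {
        "nsteps": nsteps,
        "nstxout_compressed": save_interval,
        "saved_frames": saved_frames,
        "snapshot_stride": stride,
    }


# timestep_fs=2 is an assumption, not a value reported in the study.
schedule = md_schedule()
print(schedule)
```

Under this assumption, a 100 ns run saved every 10 ps yields 10,000 frames, so a set of 2,500 evenly spaced snapshots would correspond to taking every fourth frame.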

After the simulations, trajectories were analyzed for root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (RoG), solvent-accessible surface area (SASA), number of hydrogen bonds, and principal component analysis (PCA).

MM-GBSA calculation

Assessing the strength of small-molecule binding is essential for evaluating potential drug candidates. One of the most extensively used techniques for calculating binding free energy (BFE) in computational studies is MM-GBSA (Molecular Mechanics/Generalized Born Surface Area) [51]. This technique estimates the binding affinity between a protein and its ligands by accounting for the contributions of various forces, including van der Waals interactions, electrostatics, and solvation energies [52,53].

In this study, we utilized the gmx_MMPBSA tool [54], which is commonly employed for calculating binding free energies of protein-ligand complexes. We considered 2,500 snapshots taken from the molecular dynamics trajectory, ensuring a comprehensive sampling of protein-ligand interactions throughout the simulation period. The binding free energy is estimated as:

$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{complex}} - \left(\Delta G_{\mathrm{receptor}} + \Delta G_{\mathrm{ligand}}\right) \tag{1}$$

Here, ΔG_complex represents the free energy of the entire receptor-ligand complex, ΔG_receptor refers to the free energy of the receptor alone, and ΔG_ligand corresponds to the free energy of the ligand in isolation. The difference between the complex and the sum of the individual receptor and ligand energies yields the binding free energy ΔG_bind, which is a measure of the affinity of the ligand for the protein.

Each term in Eq. 1 is calculated as:

$$\Delta G = \Delta G_{\mathrm{elect}} + \Delta G_{\mathrm{vdw}} + \Delta G_{\mathrm{solvation}} \tag{2}$$

In Eq. 2, ΔG_elect is the electrostatic energy, ΔG_vdw is the van der Waals energy, and ΔG_solvation is the solvation energy (polar and non-polar).

By calculating ΔG_bind for each receptor-ligand complex, we can assess the comparative binding affinity of different ligands to the protein, helping identify the most promising candidates for further investigation [52].
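The subtraction described for the binding free energy (complex minus the sum of receptor and ligand) is applied per snapshot and then averaged. The sketch below shows that arithmetic only; the three frames and all energy values are hypothetical, and a real calculation would take these terms from the gmx_MMPBSA output rather than from hand-entered numbers.

```python
from statistics import mean, stdev


def binding_free_energy(frames):
    """Per-frame dG_bind = G_complex - (G_receptor + G_ligand), then mean and SD."""
    per_frame = [
        frame["complex"] - (frame["receptor"] + frame["ligand"])
        for frame in frames
    ]
    return mean(per_frame), stdev(per_frame), per_frame


# Hypothetical per-snapshot energies in kcal/mol (not values from the study).
frames = [
    {"complex": -5230.0, "receptor": -5100.0, "ligand": -106.0},
    {"complex": -5241.5, "receptor": -5112.0, "ligand": -106.5},
    {"complex": -5236.0, "receptor": -5108.0, "ligand": -103.0},
]
dg_mean, dg_sd, dg_frames = binding_free_energy(frames)
print(dg_frames, dg_mean, dg_sd)
```

A more negative mean indicates more favorable predicted binding, and the standard deviation across snapshots gives a rough sense of how much the estimate fluctuates along the trajectory.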

Density functional theory (DFT)

All DFT calculations were performed with the Gaussian 09 software. Geometry optimization of the reference compound and the two lead compounds, ZINC14583344 and PubChem 348482, was performed using the B3LYP functional with the 6-311G basis set [55]. The absence of imaginary vibrational frequencies confirms that the optimized structures correspond to true minima on the potential energy surface. The subsequent electronic analysis included dipole moments, total electronic energies, and frontier molecular orbital (FMO) energies (the highest occupied molecular orbital, HOMO, and the lowest unoccupied molecular orbital, LUMO). From these, global reactivity descriptors were derived, providing quantitative information on molecular stability and reactivity. A molecular electrostatic potential (MEP) map was also generated to show the distribution of electronic charge over the molecular surface. The MEP maps indicate plausible sites for electrophilic and nucleophilic attack by delineating electron-rich and electron-deficient regions. Together, this computational protocol, including optimized geometries, FMO evaluation, global reactivity descriptors, and MEP visualization, provides a comprehensive description of the electronic behavior of the molecules under study [56].
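Global reactivity descriptors are commonly derived from the HOMO and LUMO energies using Koopmans-type approximations. The sketch below uses one widely used set of definitions, which is an assumption here because the text does not state the formulas: hardness η = (E_LUMO − E_HOMO)/2, chemical potential μ = (E_HOMO + E_LUMO)/2, softness S = 1/(2η), and electrophilicity index ω = μ²/(2η). Some authors define softness as 1/η instead. The orbital energies in the example are hypothetical.

```python
def global_reactivity(e_homo, e_lumo):
    """Conceptual-DFT descriptors from frontier orbital energies (all in eV)."""
    gap = e_lumo - e_homo
    hardness = gap / 2                    # eta
    potential = (e_homo + e_lumo) / 2     # mu (chemical potential)
    return {
        "gap": gap,
        "hardness": hardness,
        "softness": 1 / (2 * hardness),   # S = 1/(2*eta); convention varies
        "chemical_potential": potential,
        "electronegativity": -potential,  # chi = -mu
        "electrophilicity": potential ** 2 / (2 * hardness),  # omega
    }


# Hypothetical orbital energies in eV, chosen for easy arithmetic.
descriptors = global_reactivity(e_homo=-6.0, e_lumo=-2.0)
print(descriptors)
```

With these definitions, a smaller HOMO–LUMO gap gives lower hardness and higher softness, which is why a narrow gap is usually read as a sign of higher reactivity.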

Results

Dataset Preparation

For the NLRP3 target, 1,956 bioactive compounds were obtained from publicly available databases, including BindingDB, PubChem, and ChEMBL, based on their reported ability to inhibit NLRP3. To create a comprehensive and representative dataset, 5,869 presumed inactive decoy compounds were added, resulting in a final collection of 7,825 molecules (Supplementary File 1). This dataset was split into a training set (5,476 compounds) and a test set (2,348 compounds) using a 70:30 ratio (Supplementary Files 2 and 3). The same ratio of active to inactive compounds was maintained in both subsets to minimize model bias and enhance prediction accuracy. An overview of the dataset structure is provided in Table 1.

Table 1.

The train-test split of the NLRP3 dataset used in this study.

| Dataset | Inhibitors | Non-inhibitors | Total |
| --- | --- | --- | --- |
| Train | 1,369 | 4,107 | 5,476 |
| Test | 489 | 1,467 | 1,956 |

Descriptors generation

In this study, two-dimensional (2D) molecular descriptors were generated to quantitatively capture the structural and physicochemical characteristics of the compounds. Details of the 37 calculated features are presented in Supplementary File 4, and the features are listed in Table 2.

Table 2.

List of features.

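| Category | Number | Feature | Description |
| --- | --- | --- | --- |
| Physicochemical properties | 1 | MolWt | Total molecular mass of the compound |
| Physicochemical properties | 2 | MolLogP | Predicted lipophilicity of the molecule |
| Physicochemical properties | 3 | Qed | Quantitative estimate of drug-likeness |
| Physicochemical properties | 4 | HeavyAtomMolWt | Mass contributed by non-hydrogen atoms only |
| Physicochemical properties | 5 | TPSA | Surface area contributed by polar atoms |
| Physicochemical properties | 6 | FractionCSP3 | Fraction of sp³ carbons |
| Physicochemical properties | 7 | MolMR | Measure of polarizability and steric volume |
| Electronic properties | 8 | MaxPartialCharge | Highest partial atomic charge among all atoms |
| Electronic properties | 9 | MinPartialCharge | Lowest partial atomic charge among all atoms |
| Electronic properties | 10 | MinEStateIndex | Minimum electrotopological value |
| Electronic properties | 11 | MaxEStateIndex | Maximum electrotopological value |
| Electronic properties | 12 | NumValenceElectrons | Number of all valence electrons in the molecule |
| Electronic properties | 13 | NumRadicalElectrons | Number of radical electrons |
| Topological indices | 14 | Chi0 | Molecular connectivity Chi index (order 0) |
| Topological indices | 15 | Chi3n | Molecular connectivity Chi index (non-aromatic) |
| Topological indices | 16 | BalabanJ | A graph-theoretical measure of molecular shape/complexity |
| Topological indices | 17 | FpDensityMorgan1 | Morgan fingerprint density |
| Structural/Geometrical features | 18 | NumHeteroatoms | Count of atoms other than carbon and hydrogen |
| Structural/Geometrical features | 19 | NumRotatableBonds | Number of rotatable bonds |
| Structural/Geometrical features | 20 | NumAtoms | Total number of atoms including hydrogens |
| Structural/Geometrical features | 21 | NumHeavyAtoms | Atoms other than hydrogen |
| Structural/Geometrical features | 22 | RingCount | Number of rings (all types) |
| Ring-specific features | 23 | NumAromaticRings | Benzene-like rings (aromatic) |
| Ring-specific features | 24 | NumAliphaticRings | Non-aromatic carbon rings |
| Ring-specific features | 25 | NumSaturatedRings | Rings without double bonds |
| Ring-specific features | 26 | NumAliphaticCarbocycles | Aliphatic rings containing only carbon atoms |
| Ring-specific features | 27 | NumAliphaticHeterocycles | Non-aromatic rings containing heteroatoms |
| Ring-specific features | 28 | NumAliphaticCycles | All non-aromatic rings |
| Ring-specific features | 29 | NumAromaticCarbocycles | Aromatic rings that contain only carbon atoms |
| Ring-specific features | 30 | NumAromaticHeterocycles | Aromatic rings containing heteroatoms |
| Ring-specific features | 31 | NumAromaticCycles | All aromatic rings (carbocycles + heterocycles) |
| Ring-specific features | 32 | NumHeterocycles | Number of heterocycles |
| Ring-specific features | 33 | NumAmideBonds | Number of amide bonds |
| Hydrogen bonding features | 34 | NumHAcceptors | Number of hydrogen bond acceptors |
| Hydrogen bonding features | 35 | NumHDonors | Number of hydrogen bond donors |
| Hydrogen bonding features | 36 | NHOHCount | Number of N–H and O–H bonds |
| Hydrogen bonding features | 37 | NOCount | Number of nitrogen and oxygen atoms |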

Principal component analysis (PCA)

Principal Component Analysis (PCA) was conducted to reduce the dimensionality of the molecular descriptor dataset and to identify key patterns within the chemical space. The distribution of compounds across the first two principal components (PC1 and PC2) is displayed in the two-dimensional PCA scatter plot (Fig. 2), where class 0 and class 1 compounds are shown in blue and green, respectively. The PCA scatter plot showed partial clustering of active and inactive compounds, although significant overlap between the two classes was observed. PC1 captured the majority of variance in the dataset (95.67%), while PC2 contributed an additional 3.75%, demonstrating that most structural variability is driven by a single dominant component. Loading analysis showed that PC1 was primarily influenced by molecular weight, number of valence electrons, and topological polar surface area, suggesting that molecular size and polarity are the major contributors to chemical diversity. PC2 was mainly affected by TPSA and heteroatom content, reflecting variations in polarity-related features.

Fig. 2.

PCA scatter plot showing compound distribution along PC1 and PC2. Classes 0 and 1 are displayed in blue and green, respectively, with notable overlap indicating limited class separation.

Overall, PCA reveals significant overlap between active and inactive compounds in low-dimensional space, emphasizing the importance of machine learning models capable of capturing complex, non-linear relationships for effective activity classification. Together, the PCA scatter plot (Fig. 2) and the loading factor chart (Fig. 3) provide insight into the structure and variability of the dataset, highlighting the key molecular descriptors responsible for most of the variance.

Fig. 3.

Loading factor plot displaying the contribution of molecular descriptors to PC1 and PC2. Key contributors include MolWt, TPSA, and NumValenceElectrons, indicating their influence on dataset variance.

Chemical space and diversity analysis

To assess the structural diversity of the dataset, chemical space plots were created using molecular weight (MolWt) and LogP as key descriptors. In both the training (Fig. 4A) and test sets (Fig. 4B), active (green) and inactive (blue) compounds were broadly spread throughout the chemical space, covering a wide range of molecular weights (~ 100–800) and LogP values (0–12). A positive correlation between molecular weight and LogP was observed in both sets, indicating consistent physicochemical profiles. The significant overlap between active and inactive compounds in both plots suggests that simple linear separation based on MolWt or LogP is inadequate, highlighting the need for advanced machine learning models to accurately distinguish bioactive compounds. The overall distribution shows a well-balanced chemical space, supporting the robustness and generalizability of the classification models. The chemical space distribution of the screened compounds was analyzed and found to largely fall within the known property ranges of reported NLRP3 inhibitors, supporting the relevance of the selected library for virtual screening. While the MolWt–LogP scatter plots provide an overview of physicochemical diversity, they do not directly represent scaffold-level structural diversity. Nonetheless, the significant overlap observed among active and inactive compounds implies that multiple scaffolds share similar physicochemical profiles, indicating that property-based distributions are not driven by a single dominant scaffold.

Fig. 4.

Chemical space plots based on MolWt and LogP for (A) training and (B) test sets. Active compounds (green) and inactive compounds (blue) show broad and overlapping distributions.

Model generation and validation

To help identify active inhibitors targeting the NLRP3 inflammasome, a machine learning-based virtual screening method was used. Four classification models, Naïve Bayes (NB), Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), were built using the scikit-learn (sklearn) package in Python. These models were trained on a curated dataset of active and inactive compounds relevant to NLRP3 modulation. Performance was evaluated using standard classification metrics, including accuracy, sensitivity, specificity, Matthews Correlation Coefficient (MCC), F1 score, and Area Under the Receiver Operating Characteristic Curve (AUC). A summary comparing the training and testing results for each model is provided in Table 3.

Table 3.

Performance evaluation and validation metrics of ML classifiers trained for NLRP3 inhibitor prediction.

| Model | Accuracy (Train/Test) | Sensitivity (Train/Test) | Specificity (Train/Test) | MCC (Train/Test) | F1 Score (Train/Test) | AUC (Train/Test) |
| --- | --- | --- | --- | --- | --- | --- |
| KNN | 0.85/0.72 | 0.70/0.45 | 0.90/0.80 | 0.61/0.25 | 0.85/0.72 | 0.91/0.75 |
| SVM | 0.68/0.68 | 0.88/0.89 | 0.61/0.61 | 0.42/0.44 | 0.70/0.70 | 0.81/0.81 |
| RF | 0.81/0.75 | 0.99/0.81 | 0.75/0.73 | 0.65/0.48 | 0.72/0.77 | 0.91/0.83 |
| NB | 0.68/0.68 | 0.82/0.81 | 0.63/0.64 | 0.39/0.39 | 0.70/0.70 | 0.78/0.78 |

Among all evaluated classifiers, the RF model exhibited the best overall performance, achieving the highest AUC on the independent test set (0.83) and a training AUC of 0.91, matched only by KNN. It also showed the highest test accuracy (0.81/0.75, train/test) and the highest MCC (0.65/0.48). The SVM model's AUC values were identical for the training and test sets (0.81), indicating good generalization, most likely due to balanced class handling and regularized kernel parameters that limit overfitting.

The corresponding ROC curves are displayed in Fig. 5A and B, illustrating the model's effectiveness in distinguishing between active and inactive compounds in both datasets. Based on these outcomes, the RF model was deemed the most reliable and was chosen for subsequent virtual screening of the phytochemical compound library against NLRP3. The predictive performance of all models was also assessed using bootstrapped 95% confidence intervals (CIs) for the AUC on both the training and test datasets (Table S1). The Random Forest model demonstrated the highest stability, with a test AUC CI of 0.81–0.84, whereas the SVM, KNN, and NB classifiers showed moderate to lower performance. The Random Forest model was further evaluated using confusion matrices for both the training and test sets (Figure S1). In the training set, the model correctly classified 3174 inactive compounds (true negatives) and 1343 active compounds (true positives), with 35 false negatives and 925 false positives, indicating high sensitivity. In the test set, it correctly predicted 1312 inactive and 429 active compounds, with 149 false negatives and 458 false positives, reflecting a moderate increase in misclassifications while maintaining balanced detection ability. Overall, the confusion matrix analysis indicates that the Random Forest model discriminates active from inactive compounds with acceptable consistency between the training and test datasets.
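As an illustration of how the metrics named above follow from confusion-matrix counts, the sketch below recomputes them from the four counts quoted in the text. The function name and dictionary keys are hypothetical, and the F1 score here is computed for the active class only; the averaging scheme, threshold, and rounding behind Table 3 are not stated, so these values need not match the table exactly.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; 'positive' means active compound.
    """
    sensitivity = tp / (tp + fn)            # recall on active compounds
    specificity = tn / (tn + fp)            # recall on inactive compounds
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1_active": f1,
        "mcc": mcc,
    }

# Counts quoted in the text for the Random Forest model.
train = classification_metrics(tp=1343, tn=3174, fp=925, fn=35)
test = classification_metrics(tp=429, tn=1312, fp=458, fn=149)

for name, metrics in (("train", train), ("test", test)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Working from raw counts keeps every reported metric traceable to the same four numbers, which makes it easier to check a results table for internal consistency.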

Fig. 5.

Discriminative performance of the RF model shown via ROC-AUC curves. (A) Test set results, and (B) Training set results.

ML model-based virtual screening on the target library

To identify potential inhibitors targeting the NLRP3 protein, the MPD3 phytochemical library was virtually screened using a previously trained Random Forest (RF) classification model. The model predicted 288 phytochemicals as potentially active based on their molecular features. These active hits were then evaluated for drug-likeness using Lipinski’s Rule of Five, which sets acceptable pharmacokinetic properties for orally active drugs. According to this rule, compounds should have a molecular weight of ≤ 500 Da, a LogP value ≤ 5, no more than five hydrogen bond donors, and no more than ten hydrogen bond acceptors. After applying these criteria, only 183 compounds met all drug-likeness parameters. These shortlisted compounds were considered suitable for further investigation and were subjected to molecular docking studies to assess their binding affinity and interactions with the NLRP3 target protein.
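The drug-likeness filter described above can be sketched as follows. This is a minimal illustration that assumes the four descriptors have already been computed by a cheminformatics toolkit; the `Descriptors` class, the compound names, and the descriptor values are hypothetical. It follows the strict reading in the text, where a compound must satisfy all four criteria.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Precomputed descriptors for one predicted-active compound (hypothetical container)."""
    name: str
    mol_wt: float       # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors

def passes_lipinski(d: Descriptors) -> bool:
    """True only if all four Rule-of-Five criteria from the text are met."""
    return (
        d.mol_wt <= 500
        and d.logp <= 5
        and d.h_donors <= 5
        and d.h_acceptors <= 10
    )

# Illustrative values only; these are not compounds from the MPD3 library.
hits = [
    Descriptors("hypothetical_hit_1", 342.4, 2.1, 2, 5),
    Descriptors("hypothetical_hit_2", 612.7, 6.3, 4, 11),
]
drug_like = [d.name for d in hits if passes_lipinski(d)]
print(drug_like)
```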

Molecular docking

To validate the docking protocol, the co-crystallized inhibitor (NP3-253) was subjected to re-docking into the NLRP3 NACHT domain. The re-docked pose successfully reproduced the crystallographic orientation with minimal deviation, confirming the accuracy of the docking parameters. Key interactions with critical residues Ala228, Arg578, and Glu629 were retained, thereby ensuring the reliability of the protocol for subsequent ligand docking studies (Fig. 6).

Fig. 6.

(A) Re-docked conformation of inhibitor NP3-253 complexed with NLRP3 (pink) superimposed onto the native conformation (blue). (B) Binding conformation and interactions of NP3-253 within the NLRP3 active site.

In addition, the discriminatory power of the docking protocol was assessed using receiver operating characteristic (ROC) curve analysis (Figure S2). The ROC analysis gave an AUC value of 0.86, demonstrating a good ability of the docking scores to distinguish active inhibitors from inactive compounds. Collectively, these results support the predictive reliability of the docking protocol employed in this study.
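One way to read a docking ROC AUC is as the probability that a randomly chosen active receives a more favorable (more negative) score than a randomly chosen inactive. The sketch below computes the AUC in that pairwise form. The function name and all scores are hypothetical and are not taken from the study.

```python
def docking_roc_auc(active_scores, inactive_scores):
    """Pairwise (Mann-Whitney) estimate of ROC AUC for docking scores.

    More negative scores are treated as better; ties count as half a win.
    """
    wins = 0.0
    for a in active_scores:
        for i in inactive_scores:
            if a < i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))

# Hypothetical docking scores in kcal/mol, for illustration only.
actives = [-10.5, -9.8, -9.1, -8.0]
inactives = [-9.0, -8.2, -7.5, -7.5, -6.9]
auc = docking_roc_auc(actives, inactives)
print(f"AUC = {auc:.2f}")
```

The pairwise form is quadratic in the number of compounds, which is fine for a small validation set; a rank-based implementation would be preferable for large decoy libraries.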

Several small-molecule inhibitors have been reported to target the NACHT domain of NLRP3 by stabilizing its inactive conformation and preventing ATP-driven oligomerization. For example, the brain-penetrant inhibitor NP3-253 interacts with key residues such as Ala228, Arg578, and Glu629, while the tricyclic inhibitor NP3-562 engages Ala228, Arg351, and Arg578 within the same pocket57. These residues are considered critical for maintaining the structural integrity of the NACHT domain and regulating the conformational changes required for inflammasome assembly. Specifically, Ala228 functions as a structural anchor, Arg578 provides stabilizing hydrogen bonds and electrostatic interactions, and Glu629 contributes to the architecture of the binding pocket. In line with these findings, PubChem 348,482 showed the most favorable docking score (−11.3 kcal/mol) and the broadest interaction profile, forming hydrogen bonds with Ala227 and Asp512 along with additional contacts involving Arg578, Glu629, and Gln624 (Fig. 7A, Table 4). ZINC14583344, with a docking score of −10.7 kcal/mol, was stabilized through hydrogen bonding with Arg578, Ala228, and Ser626, while further interactions with Tyr443 and Val353 contributed to its binding stability (Fig. 7B). PubChem 11,027,076 occupied the active site with a docking score of −10.6 kcal/mol, establishing two hydrogen bonds with Arg578, two with Tyr632, and one with Thr659 (Fig. 7C). The docked conformations of the selected compounds are provided in the supplementary information (PubChem 348482.pdb, ZINC14583344.pdb, and PubChem 11027076.pdb). Importantly, the engagement of Arg578 and Glu629 by PubChem 348,482 closely resembles the binding pattern of reported NLRP3 inhibitors, suggesting strong anchoring within the pocket.

Taken together, the recurrence of Ala228, Arg578, and Glu629 in the docking profiles of known inhibitors and the tested ligands highlights their significance as conserved hot spot residues within the NACHT domain. Notably, all three ligands interacted with Arg578, a residue repeatedly highlighted as crucial for NLRP3 inhibition, which further supports the docking protocol and underscores the potential of these ligands to mimic the inhibitory mechanisms of established NLRP3 inhibitors.

Fig. 7.

Docked conformation and hydrogen bond interactions of the three selected ligands. (A) PubChem 348,482, (B) ZINC14583344, and (C) PubChem 11,027,076.

Table 4.

Binding scores (kcal/mol), residues making hydrogen bonds and total number of hydrogen bonds.

Compounds Binding score (kcal/mol) Interacting residues
NP3-253 (Reference) −10.5 Ala228, Arg578, Glu629
PubChem 348,482 −11.3 Ala227, Asp512, Arg578, Glu629, Gln624
ZINC14583344 −10.7 Arg578, Ala228, Ser626, Tyr443, Val353
PubChem 11,027,076 −10.6 Arg578, Tyr632, Thr659

ADMET analysis

The pkCSM-based ADMET evaluation of the three selected compounds, PubChem 11,027,076, PubChem 348,482, and ZINC 14,583,344, revealed generally favorable pharmacokinetic and safety profiles (Table 5). Predicted water solubility varied: ZINC 14,583,344 (−2.983 log mol/L) was the most soluble and fell in the moderate range, while PubChem 11,027,076 (−4.785 log mol/L) and PubChem 348,482 (−4.747 log mol/L) were less soluble (values > −2 are considered highly soluble, −2 to −4 moderately soluble, and < −4 poorly soluble). In terms of absorption, PubChem 11,027,076 and PubChem 348,482 showed excellent intestinal absorption (100% and 98.58%, respectively), whereas ZINC 14,583,344 exhibited lower but still favorable absorption (73.33%; values > 70% are considered high). Distribution analysis indicated limited penetration across the blood–brain barrier (BBB) for all three compounds (log BB of −0.475 for PubChem 11,027,076, −0.362 for PubChem 348,482, and −1.433 for ZINC 14,583,344), suggesting partial central nervous system (CNS) accessibility that is lowest for ZINC 14,583,344. In terms of metabolism, none of the compounds was predicted to be a substrate for CYP2D6, thereby reducing the likelihood of variability-related adverse effects. Additionally, none inhibited CYP1A2 or CYP2C19, indicating a low risk of drug–drug interactions through these pathways. Substrate activity for CYP3A4 was predicted for two compounds (PubChem 11,027,076 and ZINC 14,583,344), suggesting a feasible and manageable metabolic route. For excretion, PubChem 348,482 showed the highest predicted clearance (0.587 log ml/min/kg), followed by PubChem 11,027,076 (0.21 log ml/min/kg), while ZINC 14,583,344 exhibited the lowest clearance (−0.06 log ml/min/kg) (0–0.5 log ml/min/kg indicates moderate clearance; values > 0.5 indicate high clearance). Toxicity predictions were also favorable, as none of the compounds exhibited AMES mutagenicity or hepatotoxicity, supporting their safety profile.
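The interpretation thresholds quoted in the text can be applied mechanically, as in the sketch below. It uses only the solubility and absorption cut-offs stated above together with the corresponding values from Table 5; the function names and the `admet` dictionary layout are hypothetical.

```python
def solubility_class(log_s: float) -> str:
    """Classify predicted water solubility (log mol/L) with the cut-offs quoted in the text."""
    if log_s > -2:
        return "high"
    if log_s >= -4:
        return "moderate"
    return "poor"

def high_intestinal_absorption(percent_absorbed: float) -> bool:
    """Values above 70% are treated as high absorption, as stated in the text."""
    return percent_absorbed > 70

# (water solubility in log mol/L, intestinal absorption in %) from Table 5.
admet = {
    "PubChem 11,027,076": (-4.785, 100.0),
    "PubChem 348,482": (-4.747, 98.582),
    "ZINC 14,583,344": (-2.983, 73.326),
}
summary = {
    name: (solubility_class(log_s), high_intestinal_absorption(absorbed))
    for name, (log_s, absorbed) in admet.items()
}
for name, (solubility, absorbed_ok) in summary.items():
    print(f"{name}: solubility={solubility}, high absorption={absorbed_ok}")
```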

Table 5.

ADMET properties of the selected Compounds.

Category Properties Unit PubChem 11,027,076 PubChem 348,482 ZINC 14,583,344
Absorption Water solubility Log mol/L −4.785 −4.747 −2.983
Intestinal absorption (human) % Absorbed 100 98.582 73.326
Distribution BBB permeability Log BB −0.475 −0.362 −1.433
Metabolism CYP2D6 substrate Yes/No No No No
CYP3A4 substrate Yes/No Yes No Yes
CYP1A2 inhibitor Yes/No No No No
CYP2C19 inhibitor Yes/No No No No
Excretion Total clearance log ml/min/kg 0.21 0.587 −0.06
Toxicity AMES Yes/No No No No
Hepatotoxicity Yes/No No No No

Root mean square deviation (RMSD) analysis

To assess the structural stability of the protein in the apo, reference-compound, and candidate ligand-bound states, backbone RMSD values were calculated over the course of the 100 ns molecular dynamics simulations (Fig. 8). The RMSD analysis of PubChem 11,027,076 indicated that the system experienced noticeable fluctuations between 25 ns and 60 ns, reaching up to 0.40 nm. After this period, the backbone maintained a steady RMSD profile until the end of the simulation, suggesting that the protein reached a stable conformation with minimal fluctuations thereafter. For PubChem 348,482, the protein backbone stabilized within the first few nanoseconds and the RMSD remained in the range of 0.20–0.30 nm, reflecting a highly stable complex. The comparatively lower fluctuations indicate that binding of PubChem 348,482 does not induce significant structural perturbations and that the protein retains its conformational integrity. The RMSD profile of ZINC 14,583,344 demonstrated excellent stability throughout the entire 100 ns simulation: from the beginning of the trajectory, the RMSD quickly stabilized and remained within 0.22–0.30 nm, with no major fluctuations. This indicates that the ZINC 14,583,344 complex maintained a highly stable conformation during the whole simulation.
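For readers unfamiliar with the quantity, backbone RMSD for a single frame is the root of the mean squared displacement of the selected atoms from a reference structure. The sketch below assumes the frame has already been superimposed on the reference (no fitting is performed here), and the coordinates are hypothetical values in nm rather than data from the simulations.

```python
import math

def rmsd(reference, frame):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes the frame is already aligned to the reference structure.
    """
    if len(reference) != len(frame):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for ref_atom, frame_atom in zip(reference, frame)
        for a, b in zip(ref_atom, frame_atom)
    )
    return math.sqrt(squared / len(reference))

# Three hypothetical backbone atoms, coordinates in nm.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0), (0.0, 1.0, 0.1)]
value = rmsd(reference, frame)
print(f"RMSD = {value:.3f} nm")
```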

Fig. 8.

Root mean square deviation (RMSD) profile of the apo-NLRP3 and three selected ligands.

Root mean square fluctuation (RMSF)

To investigate residue-level flexibility, RMSF values of NLRP3 were calculated and compared between the apo form and ligand-bound systems (Fig. 9A). The apo protein displayed relatively lower fluctuations overall, reflecting a more compact and rigid conformational state in the absence of ligand binding. In contrast, the ligand-bound complexes exhibited elevated RMSF values across multiple regions, particularly in loop segments and inter-domain linkers, indicating enhanced local flexibility. This increase in fluctuation upon ligand binding can be attributed to induced-fit effects and allosteric propagation within the NACHT domain, where ligand accommodation triggers subtle side-chain rearrangements and local loosening of domain contacts. Although global stability was retained, as confirmed by RMSD profile, the higher RMSF in ligand-bound systems highlights the dynamic nature of NLRP3 upon ligand interaction, consistent with its known nucleotide-driven conformational plasticity.

Fig. 9.

(A) Plot showing the RMSF of all the residues of Apo NLRP3 and complexed with the three selected ligands. (B) Plot showing RMSF values of the active site residues.

To gain further insight into the flexibility of the functional binding pocket, RMSF values of active site residues were analyzed separately (Fig. 9B). A clear trend emerged in which residues in the binding region exhibited reduced fluctuations in the presence of ligands relative to the apo form. This stabilization of the active site is important, as it implies stronger ligand–protein interactions and restricted mobility. PubChem 348,482 and ZINC14583344 produced the most pronounced reduction in RMSF values across most active-site residues, showing even lower fluctuations than those observed for the reference compound. In contrast, PubChem 11,027,076 showed comparatively higher RMSF values for several key active-site residues. Overall, the RMSF analysis indicates that ligand binding stabilizes the active site locally, even though flexibility increases in loop and linker regions elsewhere in the protein. The results suggest that PubChem 348,482 and ZINC14583344 may have stronger binding potential by imparting rigidity to the binding pocket, thereby enhancing the likelihood of inhibitory activity against NLRP3. This ligand-induced stabilization is consistent with the intrinsic nucleotide-driven conformational plasticity of NLRP3, wherein binding events modulate domain flexibility and regulate its transition between inactive and active states.

Compactness and solvent accessible surface area (SASA)

The radius of gyration (Rg) was calculated to assess the overall compactness and structural stability of the complexes during the simulation (Fig. 10A). The Rg profiles showed that all complexes maintained relatively stable compactness over the course of the simulation. ZINC14583344 exhibited the most consistent Rg values, indicating a highly stable and compact protein fold in the complex. PubChem 348,482 also showed overall stability, with only minor fluctuations suggesting local conformational adjustments within the binding pocket. In contrast, PubChem 11,027,076 displayed noticeable deviations in Rg, which correlates with its tendency to leave the binding pocket and its reduced overall stability.
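The radius of gyration used above as a compactness measure is the mass-weighted root-mean-square distance of the atoms from their center of mass. A minimal sketch follows; the function name, coordinates, and masses are hypothetical and chosen so that the result is easy to check by hand.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates."""
    total_mass = sum(masses)
    center = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Four hypothetical atoms of equal mass on a unit circle (coordinates in nm).
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
masses = [12.0, 12.0, 12.0, 12.0]
rg = radius_of_gyration(coords, masses)
print(f"Rg = {rg:.3f} nm")
```

Evaluating this function frame by frame gives the Rg time series; a flat series corresponds to the stable compactness described above.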

Fig. 10.

(A) Radius of gyration (Rg) and (B) solvent accessible surface area (SASA) during MD simulations.

The SASA was evaluated to investigate the extent of protein surface exposure to the solvent and to understand conformational changes associated with ligand binding (Fig. 10B). The overall SASA trend remained stable for all systems except PubChem 11,027,076, which showed noticeable fluctuations. Both ZINC14583344 and PubChem 348,482 maintained steady SASA values, indicating a compact structure with reduced solvent exposure throughout the trajectory. In contrast, the fluctuations for PubChem 11,027,076 correspond to its loss of stability and reduced burial within the binding pocket.

Post-simulation hydrogen bond profile

The hydrogen bond analysis revealed clear differences in the stability of ligand–protein interactions across the three compounds (Fig. 11). ZINC14583344 consistently formed the highest number of hydrogen bonds throughout the simulation, with up to seven hydrogen bonds maintained during the trajectory. PubChem 348,482 formed a maximum of five hydrogen bonds, reflecting stable interactions within the binding site. On the other hand, PubChem 11,027,076 displayed very few and inconsistent hydrogen bonds, which rapidly diminished as the ligand drifted away from the active site.

Fig. 11.

Number of hydrogen bond interactions formed during MD simulation. (A) PubChem 348,482, (B) ZINC14583344, and (C) PubChem 11,027,076.

These results collectively confirm that ZINC14583344 exhibits the most stable and favorable binding interactions, supported by consistent compactness (Rg), minimal solvent exposure (SASA), and strong hydrogen bonding. PubChem 348,482 also demonstrates considerable stability, while PubChem 11,027,076 shows poor retention and unstable binding, making it the least promising candidate.

Principal component analysis (PCA)

Post-simulation PCA was carried out to examine the conformational motions of the apo protein and its ligand-bound complexes. In the apo system (Fig. 12A), the trajectory was largely confined within a compact cluster, indicating that in the absence of ligand, the protein maintained a relatively stable structural state with limited conformational fluctuations. Upon binding with PubChem 348,482, the conformational space explored by the protein increased substantially, as reflected by a broader distribution along both PC1 and PC2 (Fig. 12B). This suggests that ligand binding introduced greater flexibility and allowed the protein to sample a wider range of conformations. In the case of ZINC14583344, the distribution of conformations was more moderate, being wider than the apo form but narrower than PubChem 348,482 (Fig. 12C). This pattern reflects a balance between stability and flexibility, indicating that the ligand imposed certain conformational restrictions while still permitting structural adaptation. For the PubChem 11,027,076 complex, the trajectory occupied a relatively well-defined and tighter cluster compared to the other ligand-bound systems (Fig. 12D). This indicates restricted conformational drift; however, because this ligand drifted away from the active site during the simulation (see the hydrogen bond and MM-GBSA analyses), the compact cluster may reflect apo-like dynamics rather than ligand-induced stabilization. Collectively, these results demonstrate that different ligands exert distinct influences on the conformational dynamics of the protein.

Fig. 12.

PCA plot illustrating the distribution of conformational space along the first two principal components (PC1 and PC2). (A) Apo-NLRP3, (B) PubChem 348,482, (C) ZINC14583344, and (D) PubChem 11,027,076.

MM-GBSA analysis

MM-GBSA calculations were performed to estimate the binding free energy of the selected compounds. The calculated energy components included electrostatic energy, van der Waals energy, solvation free energy, gas-phase energy, and the overall free energy of binding (Table 6). Among the three ligands, ZINC14583344 displayed the most favorable binding free energy (−23.99 kcal/mol), which is comparable to that of the reference (−22.53 kcal/mol). This affinity was primarily driven by van der Waals (−35.05 kcal/mol) and electrostatic (−23.08 kcal/mol) contributions. PubChem 348,482 exhibited a moderately favorable binding free energy (−9.37 kcal/mol), weakened by a substantial solvation penalty (43.77 kcal/mol). Consistent with these results, visual inspection of MD snapshots at 0 ns and 100 ns shows that both ligand–protein complexes maintained stable binding poses in the active site during the simulation (Fig. 13A and B).

Table 6.

Binding free energy analysis of selected compounds. This table provides the contribution of each energy component in kcal/mol.

Compounds ΔEelec Δ Evdw ΔGSOLV ΔGGAS ΔTOTAL
Reference −24.04 −37.59 39.10 −61.63 −22.53
PubChem 348,482 −28.36 −24.78 43.77 −53.15 −9.37
ZINC 14,583,344 −23.08 −35.05 34.14 −58.13 −23.99
PubChem 11,027,076 −1.13 −0.11 1.34 −1.25 0.09
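The columns of Table 6 are related by simple sums: the gas-phase term is the electrostatic plus van der Waals energy, and the total adds the solvation term. The sketch below recomputes both from the tabulated components and ranks the compounds; it reproduces the tabulated ΔGGAS and ΔTOTAL values to within about 0.01 kcal/mol, a difference attributable to rounding of the components. The function and variable names are hypothetical.

```python
def mmgbsa_totals(e_elec, e_vdw, g_solv):
    """Return (gas-phase energy, total binding free energy) in kcal/mol."""
    g_gas = e_elec + e_vdw
    return g_gas, g_gas + g_solv

# (ΔEelec, ΔEvdw, ΔGsolv) in kcal/mol, copied from Table 6.
components = {
    "Reference": (-24.04, -37.59, 39.10),
    "PubChem 348,482": (-28.36, -24.78, 43.77),
    "ZINC 14,583,344": (-23.08, -35.05, 34.14),
    "PubChem 11,027,076": (-1.13, -0.11, 1.34),
}
results = {name: mmgbsa_totals(*terms) for name, terms in components.items()}

# Most favorable (most negative) total first.
ranked = sorted(results, key=lambda name: results[name][1])
for name in ranked:
    g_gas, total = results[name]
    print(f"{name}: gas = {g_gas:.2f}, total = {total:.2f}")
```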

Fig. 13.

Post-simulation ligand–NLRP3 complexes extracted at 0 ns (green) and 100 ns (pink). (A) PubChem 348,482, (B) ZINC14583344, and (C) PubChem 11,027,076.

In contrast, PubChem 11,027,076 exhibited the weakest affinity, with a binding free energy of 0.09 kcal/mol. This poor affinity can be attributed to its unstable behavior during the MD simulation, in which the ligand gradually moved away from the binding cavity (Fig. 13C). As a result, the compound was no longer engaged in strong electrostatic or van der Waals interactions with the active-site residues, explaining its near-zero binding energy. The most favorable binding free energy of ZINC14583344 correlates with its ability to maintain the highest number of hydrogen bonds during the MD simulation, with a maximum of seven hydrogen bonds formed with active-site residues. Similarly, PubChem 348,482, which showed a moderately favorable MM-GBSA score, remained within the binding cavity and maintained up to five hydrogen bonds during the trajectory. Overall, ZINC14583344 was the strongest binder by MM-GBSA, docked more favorably than the reference compound (−10.7 vs. −10.5 kcal/mol), and remained stable during MD simulations. PubChem 348,482 had the best docking score (−11.3 kcal/mol) and, though weaker in binding free energy, showed consistent stability in the active site. In contrast, PubChem 11,027,076 failed to sustain interactions within the binding pocket, resulting in weak binding affinity.

Hydrogen bonds analysis of the last frame

Hydrogen bond analysis was performed on the last frame extracted at 100 ns of the MD simulation to evaluate the stability and retention of crucial interactions over time (Fig. 14). For PubChem 348,482, hydrogen bonds with Ala228 and Glu629 were clearly preserved at 100 ns, consistent with the docking interaction profile. Similarly, ZINC14583344 formed hydrogen bonds with Ala228 and Glu629 at the end of the simulation. These residues were also involved in the initial docking interactions, confirming that the ligand remained anchored in the active site throughout the MD run. The preservation of these hydrogen bonds highlights the conformational stability of the complexes and supports the favorable binding behavior observed in the docking and MM-GBSA analyses.

Fig. 14.

Hydrogen bond analysis performed on the last frame extracted at 100 ns of the MD simulation for (A) PubChem 348,482 and (B) ZINC14583344.

Density functional theory (DFT)

A set of electronic descriptors for the molecules under study was obtained from DFT calculations. The quantitative results are tabulated in Table 7, and the optimized geometries and frontier molecular orbital distributions are shown in Figs. 15 and 16. The two candidate compounds have negative electronic energies, consistent with thermodynamic stability. Moreover, the differences in dipole moment, HOMO–LUMO gap, and reactivity indices highlight the distinct electronic properties of the compounds, which may in turn influence their biological activity and binding affinity towards NLRP3 in the context of central nervous system pathologies. A reference compound was also included to provide a standard against which the electronic behavior of the two candidate compounds can be interpreted. This comparative assessment gives a clearer picture of their stability, reactivity, and possible biological performance relative to an established inhibitor58.

Table 7.

HOMO, LUMO, energy gap, and the global quantum reactivity parameters calculated at DFT/B3LYP/6-311G level.

Parameter (formula) Reference ZINC 14,583,344 PubChem 348,482
Dipole moment (Debye) 10.346784 4.693240 9.746038
Electronic energy (Hartree) −1331.585991 −1757.996105 −1272.962239
LUMO (eV) −0.06893 −0.09320 −0.04318
HOMO (eV) −0.22623 −0.22403 −0.26032
Energy gap (eV) ΔE = ELUMO − EHOMO 0.157300 0.13083 0.21714
Electron affinity (A, eV) A = −ELUMO 0.06893 0.09320 0.04318
Ionization potential (I, eV) I = −EHOMO 0.22623 0.22403 0.26032
Chemical potential (µ, eV) µ = 1/2 (I + A) 0.147580 0.158615 0.151750
Electronegativity (χ, eV) χ = −1/2 (I + A) −0.147580 −0.158615 −0.151750
Chemical hardness (η, eV) η = 1/2 (I − A) 0.078650 0.065415 0.108570
Chemical softness (S, eV⁻¹) S = 1/η 12.714558 15.287014 9.210648
Electrophilicity index (ω, eV) ω = µ²/(2η) 0.138461 0.192301 0.106052
Nucleophilicity index (N, eV⁻¹) N = 1/ω 7.222270 5.200185 9.429365
Additional electronic charge (ΔN) ΔN = −µ/η −1.876414 −2.424750 −1.397716
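The global reactivity parameters in Table 7 all derive from the HOMO and LUMO energies. The sketch below recomputes them from the tabulated orbital energies and reproduces the tabulated values when the electrophilicity index is taken as ω = µ²/(2η). It follows the sign convention of the table, in which µ is positive and χ negative; the more common convention is µ = −(I + A)/2 and χ = (I + A)/2. The function name and dictionary keys are hypothetical.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors from frontier orbital energies.

    Uses the sign convention of Table 7 (mu = (I + A) / 2, chi = -mu).
    """
    ionization = -e_homo                    # I = -E_HOMO
    affinity = -e_lumo                      # A = -E_LUMO
    mu = (ionization + affinity) / 2        # chemical potential (table convention)
    hardness = (ionization - affinity) / 2  # eta
    omega = mu ** 2 / (2 * hardness)        # electrophilicity index
    return {
        "gap": e_lumo - e_homo,
        "mu": mu,
        "chi": -mu,
        "hardness": hardness,
        "softness": 1 / hardness,
        "electrophilicity": omega,
        "nucleophilicity": 1 / omega,
        "delta_n": -mu / hardness,
    }

# (E_HOMO, E_LUMO) as listed in Table 7.
orbitals = {
    "Reference": (-0.22623, -0.06893),
    "ZINC 14,583,344": (-0.22403, -0.09320),
    "PubChem 348,482": (-0.26032, -0.04318),
}
table = {name: reactivity_descriptors(*e) for name, e in orbitals.items()}
for name, row in table.items():
    print(name, {k: round(v, 6) for k, v in row.items()})
```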

Fig. 15.

Optimized structures of reference compound and lead compounds ZINC 14,583,344 and PubChem 348,482.

Fig. 16.

HOMO, LUMO, and energy gap of reference compound and lead compounds ZINC 14,583,344 and PubChem 348,482.

Optimized structures

The optimized geometries of the individual compounds are shown in Fig. 15. Both ZINC14583344 and PubChem348482 converged to consistent geometries with no imaginary vibrational frequencies, verifying that the optimized structures are true minima on the potential energy surface. The observed dihedral changes and electronic delocalization across the pharmacophore also suggest that these molecules adopt orientations that favor interaction with the binding pocket of the target protein. The reference compound was likewise optimized to a minimum, confirming its structural stability. Nonetheless, both ZINC14583344 and PubChem348482 exhibited greater electronic delocalization than the reference compound, along with increased conformational flexibility, which could enhance receptor binding59.

Frontier molecular orbital (FMO) study

The HOMO and LUMO orbital energies (Fig. 16) provide useful information on charge-transfer potential. ZINC 14,583,344 had HOMO and LUMO energies of −0.22403 eV and −0.09320 eV, respectively, resulting in an energy gap of 0.13083 eV. PubChem 348,482 had a lower HOMO energy (−0.26032 eV) and a higher LUMO energy (−0.04318 eV), giving an energy gap of 0.21714 eV. The energy gap of the reference compound was 0.15730 eV, which lies between those of ZINC14583344 and PubChem348482. This shows that ZINC14583344 is more electronically reactive than the reference compound, whereas PubChem348482 is more stable but less reactive. Therefore, ZINC14583344 may take part in charge-transfer-based binding interactions more effectively than the reference compound and PubChem 348,482 (ref. 60).

Global reactivity parameters

The calculated global quantum reactivity parameters of the reference compound and the lead compounds ZINC 14,583,344 and PubChem 348,482 are listed in Table 7. These descriptors, derived from the energies of the frontier molecular orbitals, provide basic information about the stability, reactivity, and electron-transfer propensities of the respective molecular systems.

Dipole moment

The dipole moment is a measure of the overall polarity of a molecule and a determinant of solubility, molecular recognition, and affinity for polar amino acid residues in the biological target. Comparative analysis revealed that the dipole moment of PubChem 348,482 (9.75 D) is much larger than that of ZINC 14,583,344 (4.69 D). The reference compound exhibited an even higher dipole moment (10.35 D), making it the most polar molecule among the three. This suggests that PubChem 348,482 can interact with polar amino acid residues through stronger electrostatic interactions and be more soluble in aqueous physiological environments. Conversely, compounds with lower dipole moments, such as ZINC 14,583,344, may permeate hydrophobic membranes more readily.

Chemical potential (µ) and electronegativity (χ)

Chemical potential represents the escaping tendency of electrons from equilibrium, while electronegativity is a measure of the tendency to attract electrons. For ZINC 14,583,344, µ = 0.1586 eV and χ = −0.1586 eV, while for PubChem 348,482, µ = 0.1518 eV and χ = −0.1518 eV. The reference compound displayed µ = 0.1476 eV, which is slightly lower than that of both candidate compounds. The slightly higher µ value of ZINC 14,583,344 indicates a greater drive for electron redistribution, which may favor dynamic charge transfer during protein–ligand interactions. The negative values of χ (due to the definition used) reflect the molecules’ capability to balance electron donation and acceptance61.

Chemical hardness (η) and softness (S)

Chemical hardness measures resistance to electron cloud deformation, whereas softness quantifies the ease of polarization. ZINC 14,583,344 had η = 0.065 eV and S = 15.29 eV⁻¹, while PubChem 348,482 showed η = 0.109 eV and S = 9.21 eV⁻¹. The reference compound showed η = 0.079 eV and S = 12.71 eV⁻¹, making it moderately soft but less reactive than ZINC14583344. The lower hardness and higher softness of ZINC 14,583,344 indicate that it is more chemically reactive, more polarizable, and more capable of undergoing charge transfer. This makes it a “softer” molecule, favoring interactions with biological macromolecules, especially in flexible binding pockets. In contrast, PubChem 348,482 is “harder” and less polarizable, suggesting greater electronic stability but reduced reactivity62.

Electrophilicity (ω) and nucleophilicity (N)

The electrophilicity index describes the stabilization energy when a molecule acquires electrons, while nucleophilicity reflects electron-donating strength. ZINC 14,583,344 showed a higher electrophilicity index (0.192 eV) than PubChem 348,482 (0.106 eV), meaning it has a stronger tendency to attract electron density and act as an electrophile. Conversely, PubChem 348,482 had a significantly higher nucleophilicity index (9.43 vs. 5.20), indicating a superior ability to donate electrons. The reference compound exhibited ω = 0.138 eV and N = 7.22, showing balanced electrophilic and nucleophilic properties. ZINC14583344 is thus more electrophilic than the reference compound, while PubChem348482 is more nucleophilic, indicating different but complementary binding modes relative to the reference compound63.

Additional electronic charge transfer (ΔN)

The additional electronic charge transfer parameter quantifies the number of electrons that a system can accept from the environment. ZINC 14,583,344 had ΔN = −2.42 and PubChem 348,482 had ΔN = −1.40, while the reference compound (ΔN = −1.88) was intermediate between the two. The more negative value for ZINC 14,583,344 suggests the greatest capacity for charge acceptance and redistribution, which may enhance binding adaptability and dynamic interactions with the protein64.

Altogether, the global reactivity descriptors indicate that ZINC 14,583,344 is softer, more reactive, and more electrophilic, making it a promising candidate for strong and adaptable binding. Meanwhile, PubChem 348,482 is more nucleophilic and polar, which may stabilize its binding through hydrogen bonding and dipole interactions. These complementary properties highlight the potential of both molecules as lead scaffolds for NLRP3 inhibition in CNS diseases.

Molecular electrostatic potential (MEP)

Figure 17 shows the spatial charge distributions on the molecular surfaces of ZINC 14,583,344 and PubChem 348,482 as MEP maps. Red zones denote electron-rich sites that are favorable for electrophilic attack, and blue zones denote electron-deficient areas that are more likely to be attacked by nucleophiles. In ZINC 14,583,344, the red zones were predominantly located around the oxygen atoms, implying a high potential for hydrogen bonding and charge transfer. PubChem 348,482 exhibited larger red zones and greater polarization, as expected from its larger dipole moment, as well as increased potential for electrostatic interactions with polar residues. The separation of charges was even greater in the reference compound, consistent with its high dipole moment and significant potential for electrostatic interaction. Both candidate compounds exhibited more homogeneous MEP surfaces than the reference compound, and thus may be more stable in a dynamic binding environment65. In general, the MEP patterns are consistent with the FMO and global reactivity values, indicating that ZINC 14,583,344 has a more balanced charge distribution, whereas PubChem 348,482 has more pronounced, localized charge separation, which may influence receptor binding. In conclusion, the candidate compounds show distinct but favorable electronic behavior compared to the reference compound.

Fig. 17.

MEP structure and scale of the reference compound and the lead compounds ZINC14583344 and PubChem 348,482.

Biological interpretation of DFT descriptors

The electronic descriptors obtained by DFT provide mechanistic insight into the biological behavior of the compounds studied. The reduced HOMO-LUMO gap and increased chemical softness of ZINC14583344 indicate greater electronic flexibility, which may contribute to its binding specificity and to its having the most favorable MM/GBSA free energy. The electrostatic complementarity and hydrogen-bonding patterns observed during docking and MD simulations can be attributed to the dipole moment and MEP distributions, and they occur more frequently with the more polar PubChem 348,482 and the reference compound. Electrophilicity and charge-transfer capacity offer a further rationale for ligand-protein stabilization through dynamic electron redistribution. On the whole, these quantum descriptors are consistent with the observed binding strength, interaction stability, and solvation behavior, which increases the biological interpretability of the DFT results.

Discussion

Neuroinflammation is a characteristic of many central nervous system (CNS) diseases, such as Alzheimer’s disease (AD), Parkinson’s disease (PD), multiple sclerosis (MS), and ischemic stroke, in which the NLRP3 inflammasome is a key pathogenic mechanism4,6. Excessive activation of NLRP3 drives caspase-1–mediated IL-1β and IL-18 maturation, resulting in pyroptosis and enhanced neuronal damage5. Although growing evidence links NLRP3 activation to neurodegeneration, no selective NLRP3 inhibitor has reached clinical approval, largely because it has been difficult to achieve potency, selectivity, and safety together13. Investigational compounds such as MCC950, oridonin, and OLT1177 have shown potency in preclinical models but have been limited by issues such as hepatotoxicity and off-target effects12. This highlights the urgent need for alternative scaffolds with improved drug-like properties and favorable pharmacokinetics.

In the present study, an integrated computational pipeline was employed to identify phytochemical scaffolds with inhibitory potential against the NLRP3 NACHT domain, a critical structural element required for ATP-driven oligomerization and inflammasome activation11. Machine learning classifiers were trained on a large curated dataset of bioactive and decoy molecules, with the Random Forest model showing the best predictive performance (AUC = 0.83)24. Virtual screening of the MPD3 phytochemical library yielded 183 compounds that fulfilled drug-likeness criteria. Among these, three candidates emerged as the most promising due to their strong binding affinities (−10.6 to −11.3 kcal/mol) and stable interactions with crucial catalytic residues. These interactions were consistently observed with Ala228, Arg578, and Glu629, residues known to play essential roles in maintaining the active site architecture and catalytic function of the target protein19. These residues have already been reported as structural anchors for established NLRP3 inhibitors, which supports the consistency of our predictions with earlier work10. Molecular docking followed by molecular dynamics (MD) simulations supported the stability of the chosen ligands in the NACHT binding pocket44,46. ZINC14583344, for instance, exhibited persistent hydrogen bonding (up to seven bonds) as well as low RMSD and RMSF fluctuations, reflecting strong structural stabilization48. This is consistent with earlier findings in which inhibitors such as NP3-253 and NP3-562 stabilized the NACHT domain through inhibition of ATP hydrolysis and oligomerization10. In addition, MM-GBSA binding free energy calculations identified ZINC14583344 (−23.99 kcal/mol) as the best binder, with a more favorable value than those reported for reference inhibitors, making it a viable candidate as a new scaffold51,54.

Additionally, the density functional theory (DFT) based electronic structure analysis provided critical insights into the binding adaptability of the two lead compounds. ZINC14583344, with its smaller HOMO–LUMO gap, higher softness, and strong electrophilic character, demonstrates greater potential for dynamic charge transfer, making it suitable for stabilizing interactions within flexible catalytic pockets of NLRP360,62,63. On the other hand, PubChem 348,482, with its higher dipole moment and nucleophilicity, favors electrostatic stabilization and hydrogen-bonding interactions with polar residues, thereby complementing the reactivity profile of ZINC14583344. These distinct yet complementary electronic behaviors suggest that both molecules may engage in different but synergistic modes of inhibition, strengthening the rationale for their selection as promising scaffolds against NLRP3-mediated CNS pathologies58.
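The AUC quoted above for the Random Forest classifier can be read as a ranking probability: the chance that a randomly chosen active compound receives a higher score than a randomly chosen inactive one. The sketch below computes it by pairwise comparison. It is an illustration only, not the evaluation code used in this study, and the labels and scores are hypothetical toy values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a correct pair.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    correct = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                correct += 1.0
            elif a == i:
                correct += 0.5
    return correct / (len(actives) * len(inactives))


# Hypothetical toy data: 1 = active, 0 = inactive; scores are predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 2))  # 0.83
```

In this toy set, 10 of the 12 active/inactive pairs are ordered correctly, giving an AUC of about 0.83. A perfect ranking would give 1.0, and random scoring would give about 0.5.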

Our results are consistent with and build on prior research showing that MCC950 selectively inhibits NLRP3 by blocking the NACHT ATP-binding pocket13. In contrast to MCC950, which was tested in clinical trials but proved hepatotoxic, the phytochemical scaffolds identified here exhibited good in silico ADMET profiles, including non-mutagenicity, non-hepatotoxicity, and suitable clearance rates. Natural product scaffolds have previously been identified as potential anti-inflammatory agents, with oridonin and related diterpenoids inhibiting NLRP3 through covalent modification of cysteine residues12. By comparison, ZINC14583344 and PubChem 348,482 present structurally distinct, non-covalent stabilizing mechanisms, potentially lowering safety liabilities without sacrificing inhibitory potency. From a translational perspective, the lipophilic backbones of these phytochemicals may allow partial blood–brain barrier penetration, a requisite property for CNS therapeutics66. However, bioavailability and metabolic stability will remain challenges, since natural scaffolds tend to have druggability limitations14. Future structural optimization through medicinal chemistry modifications will be needed to improve CNS penetration while retaining selectivity towards NLRP359,67.

In conclusion, the present study identifies phytochemical-derived scaffolds, notably ZINC14583344 and PubChem 348,482, with robust predicted inhibitory activity against the NLRP3 inflammasome. The docking, MD, and MM-GBSA studies together highlight their stability, desirable pharmacokinetic properties, and ability to interact with conserved residues important for NLRP3 activation. These compounds are consistent with known mechanisms of NLRP3 inhibition while introducing new natural scaffolds with different structural backbones, opening new avenues for the development of effective and safe NLRP3-targeted therapeutics for neuroinflammation-related CNS disorders.

Conclusion

This study targeted the NLRP3 NACHT domain using an integrated computational pipeline that included machine learning–based screening, molecular docking, molecular dynamics simulations, MM-GBSA binding free energy calculations, and Density Functional Theory (DFT) analysis. Machine learning models distinguished active from inactive phytochemicals, leading to the identification of promising candidates from the MPD3 library. Docking studies indicated strong binding affinities and consistent interactions of the three ligands PubChem 348,482, ZINC14583344, and PubChem 11,027,076 with crucial catalytic residues, while molecular dynamics supported the conformational stability of the selected compounds. Furthermore, MM-GBSA analysis prioritized ZINC14583344 and PubChem 348,482 because of their favorable binding free energies, which support their stability in the NLRP3 binding pocket. DFT analysis revealed that ZINC14583344 had a smaller HOMO-LUMO gap, higher softness, and higher electrophilicity, indicating improved adaptability and reactivity for receptor binding, whereas PubChem 348,482 had a higher dipole moment and nucleophilicity, favoring electrostatic and hydrogen-bonding interactions with polar residues. Overall, these complementary approaches identify ZINC14583344 and PubChem 348,482 as the most promising scaffolds and support their prospective development as selective NLRP3 inhibitors for therapeutic intervention in neuroinflammation-related CNS disorders.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (13.8KB, png)
Supplementary Material 2 (9.7KB, xlsx)
Supplementary Material 4 (92.5KB, png)
Supplementary Material 5 (119.5KB, docx)
Supplementary Material 6 (397.6KB, csv)
Supplementary Material 7 (595.7KB, csv)
Supplementary Material 8 (616.1KB, pdb)
Supplementary Material 9 (616.6KB, pdb)
Supplementary Material 10 (616.8KB, pdb)

Acknowledgements

This work was funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2025-FC-01044).

Author contributions

S.I.A. designed, performed, and wrote this study.

Funding

This work was funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2025-FC-01044).

Data availability

All data generated and analyzed during this study are included within the article and its Supplementary Information.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Xu, Z. et al. NLRP inflammasomes in health and disease. Mol. Biomed. 5 (1), 14 (2024).
2. Carroll, K., Sawden, M. & Sharma, S. DAMPs, PAMPs, NLRs, RIGs, CLRs and TLRs–Understanding the alphabet soup in the context of bone biology. Curr. Osteoporos. Rep. 23 (1), 6 (2025).
3. Zheng, D., Liwinski, T. & Elinav, E. Inflammasome activation and regulation: toward a better understanding of complex mechanisms. Cell. Discovery. 6 (1), 36 (2020).
4. Fusco, R. et al. Focus on the role of NLRP3 inflammasome in diseases. Int. J. Mol. Sci. 21 (12), 4223 (2020).
5. Van de Veerdonk, F. L. et al. Inflammasome activation and IL-1β and IL-18 processing during infection. Trends Immunol. 32 (3), 110–116 (2011).
6. Dadkhah, M. & Sharifi, M. The NLRP3 inflammasome: mechanisms of activation, regulation, and role in diseases. Int. Rev. Immunol. 44 (2), 98–111 (2025).
7. Jurcău, M. C. et al. The link between oxidative stress, mitochondrial dysfunction and neuroinflammation in the pathophysiology of Alzheimer’s disease: therapeutic implications and future perspectives. Antioxidants 11 (11), 2167 (2022).
8. Li, Y. et al. Targeting microglial α-synuclein/TLRs/NF-kappaB/NLRP3 inflammasome axis in Parkinson’s disease. Front. Immunol. 12, 719807 (2021).
9. Nasoohi, S., Parveen, K. & Ishrat, T. Metabolic syndrome, brain insulin resistance, and Alzheimer’s disease: thioredoxin interacting protein (TXNIP) and inflammasome as core amplifiers. J. Alzheimer’s Disease. 66 (3), 857–885 (2018).
10. Boršić, E. et al. Clustering of NLRP3 induced by membrane or protein scaffolds promotes inflammasome assembly. Nat. Commun. 16 (1), 4887 (2025).
11. Fu, J. & Wu, H. Structural mechanisms of NLRP3 inflammasome assembly and activation. Annu. Rev. Immunol. 41 (1), 301–316 (2023).
12. Zhang, X. et al. Inhibitors of the NLRP3 inflammasome pathway as promising therapeutic candidates for inflammatory diseases. Int. J. Mol. Med. 51 (4), 35 (2023).
13. Kennedy, C. R. et al. A probe for NLRP3 inflammasome inhibitor MCC950 identifies carbonic anhydrase 2 as a novel target. ACS Chem. Biol. 16 (6), 982–990 (2021).
14. Patel, V. & Shah, M. Artificial intelligence and machine learning in drug discovery and development. Intell. Med. 2 (3), 134–140 (2022).
15. Daroch, A. & Purohit, R. MDbDMRP: A novel molecular descriptor-based computational model to identify drug-miRNA relationships. Int. J. Biol. Macromol. 287, 138580 (2025).
16. Sharma, B. & Purohit, R. Enhanced sampling simulations to explore Himalayan phytochemicals as potential phosphodiesterase-1 inhibitor for neurological disorders. Biochem. Biophys. Res. Commun. 758, 151614 (2025).
17. Singh, R. & Purohit, R. Determining the effect of natural compounds on mutations of Pyrazinamidase in multidrug-resistant tuberculosis: illuminating the dark tunnel. Biochem. Biophys. Res. Commun. 756, 151575 (2025).
18. Gupta, A., Thind, A. S. & Purohit, R. EGFR AP: a predictive machine learning model for assessing small molecule activity against the epidermal growth factor receptor. RSC Med. Chem. 16 (9), 4415–4426 (2025).
19. Hayat, C. et al. Identification of new potent NLRP3 inhibitors by multi-level in-silico approaches. BMC Chem. 18 (1), 76 (2024).
20. Pinheiro, G. A. et al. Machine learning prediction of nine molecular properties based on the SMILES representation of the QM9 quantum-chemistry dataset. J. Phys. Chem. A. 124 (47), 9854–9866 (2020).
21. Kaneko, H. Molecular descriptors, structure generation, and inverse QSAR/QSPR based on SELFIES. ACS Omega. 8 (24), 21781–21786 (2023).
22. Samkhaniani, M. et al. A machine learning approach to feature selection and uncertainty analysis for biogas production in wastewater treatment plants. Waste Manage. 197, 14–24 (2025).
23. Pantic, I. & Paunovic Pantic, J. Artificial intelligence in chromatin analysis: A random forest model enhanced by fractal and wavelet features. Fractal Fract. 8 (8), 490 (2024).
24. Ishfaq, M. et al. Multinomial classification of NLRP3 inhibitory compounds based on large scale machine learning approaches. Mol. Diversity. 28 (4), 1849–1868 (2024).
25. Mehrabinezhad, A., Teshnehlab, M. & Sharifi, A. A comparative study to examine principal component analysis and kernel principal component analysis-based weighting layer for convolutional neural networks. Comput. Methods Biomech. Biomedical Engineering: Imaging Visualization. 12 (1), 2379526 (2024).
26. Abdul-Al, M. et al. A novel approach to enhancing multi-modal facial recognition: integrating convolutional neural networks, principal component analysis, and sequential neural networks. IEEE Access 12 (2024).
27. Haji, A. Comparative analysis of autoencoder and PCA for dimensionality reduction in gene expression data. (2024).
28. Kaib, M. T. H. et al. Data size reduction approach for nonlinear process monitoring refinement using kernel PCA technique. Expert Syst. Appl. 274, 126975 (2025).
29. Makkulau, M. et al. Variance The Estimation Eigen Value of Principal Component Analysis and Nonlinear Principal Component Analysis. in ITM Web of Conferences. EDP Sciences. (2024).
30. Frost, H. R. Eigenvectors from eigenvalues sparse principal component analysis (EESPCA). J. Comput. Graphical Statistics: Joint Publication Am. Stat. Association Inst. Math. Stat. Interface Foundation North. Am. 31 (2), 486 (2021).
31. Eze, N. M., Asogwa, O. C. & Eze, C. M. Principal component factor analysis of some development factors in Southern Nigeria and its extension to regression analysis. J. Adv. Math. Comput. Sci. 36 (3), 132–160 (2021).
32. Abdulhafedh, A. Incorporating k-means, hierarchical clustering and PCA in customer segmentation. J. City Dev. 3 (1), 12–30 (2021).
33. Niazi, S. K. & Mariam, Z. Recent advances in machine-learning-based chemoinformatics: a comprehensive review. Int. J. Mol. Sci. 24 (14), 11488 (2023).
34. Wani, M. A. & Roy, K. K. Development and validation of consensus machine learning-based models for the prediction of novel small molecules as potential anti-tubercular agents. Mol. Diversity. 26 (3), 1345–1356 (2022).
35. Shrivastava, T., Singh, V. & Agrawal, A. Autism spectrum disorder detection with kNN imputer and machine learning classifiers via questionnaire mode of screening. Health Inform. Sci. Syst. 12 (1), 18 (2024).
36. Almatroudi, A. Integrative machine learning, virtual screening, and molecular modeling for BacA-Targeted Anti-Biofilm drug discovery against Staphylococcal infections. Crystals 14 (12), 1057 (2024).
37. Zhang, H. et al. Machine learning methods for weather forecasting: A survey. Atmosphere 16 (1), 82 (2025).
38. Salama, M. Optimization of regression models using machine learning: A comprehensive study with scikit-learn. IUSRJ 5 (2024).
39. Alemerien, K., Alsarayreh, S. & Altarawneh, E. Diagnosing cardiovascular diseases using optimized machine learning algorithms with GridSearchCV. J. Appl. Data Sci. 5 (4), 1539–1552 (2024).
40. Padhy, S. SMOTE-based deep LSTM system with GridSearchCV optimization for intelligent diabetes diagnosis. J. Electr. Syst. 20 (7s), 804–815 (2024).
41. Mumtaz, A. et al. MPD3: a useful medicinal plants database for drug designing. Nat. Prod. Res. 31 (11), 1228–1236 (2017).
42. Aloufi, B. H., Snoussi, M. & Sulieman, A. M. E. Antiviral efficacy of selected natural phytochemicals against SARS-CoV-2 Spike glycoprotein using structure-based drug designing. Molecules 27 (8), 2401 (2022).
43. El-Hachem, N. et al. AutoDock and AutoDockTools for protein-ligand docking: beta-site amyloid precursor protein cleaving enzyme 1 (BACE1) as a case study, in Neuroproteomics: Methods and Protocols. Springer, 391–403 (2017).
44. Zayed, A. O. H. Optimizing protein-ligand docking through machine learning: algorithm selection with AutoDock Vina. Discover Chem. 2 (1), 164 (2025).
45. Kaur, J., Kaur, S. & Singh. Rational modification of the lead molecule: enhancement in the anticancer and dihydrofolate reductase inhibitory activity. Bioorg. Med. Chem. Lett. 26 (8), 1936–1940 (2016).
46. Berendsen, H. J., van der Spoel, D. & van Drunen, R. GROMACS: A message-passing parallel molecular dynamics implementation. Comput. Phys. Commun. 91 (1–3), 43–56 (1995).
47. Huang, J. & MacKerell, A. D. Jr. CHARMM36 all-atom additive protein force field: validation based on comparison to NMR data. J. Comput. Chem. 34 (25), 2135–2145 (2013).
48. Mishra, S. et al. Classical molecular dynamics simulation identifies catechingallate as a promising antiviral polyphenol against MPOX palmitoylated surface protein. Comput. Biol. Chem. 110, 108070 (2024).
49. Ramsey, I. S. et al. An aqueous H+ permeation pathway in the voltage-gated proton channel Hv1. Nat. Struct. Mol. Biol. 17 (7), 869–875 (2010).
50. Kognole, A. A. et al. CHARMM-GUI Drude Prepper for molecular dynamics simulation using the classical Drude polarizable force field. J. Comput. Chem. 43 (5), 359–375 (2022).
51. Jawad, B. et al. Key interacting residues between RBD of SARS-CoV-2 and ACE2 receptor: combination of molecular dynamics simulation and density functional calculation. J. Chem. Inf. Model. 61 (9), 4425–4441 (2021).
52. Gilson, M. K. & Zhou, H. X. Calculation of protein-ligand binding affinities. Annu. Rev. Biophys. Biomol. Struct. 36 (1), 21–42 (2007).
53. Du, X. et al. Insights into protein–ligand interactions: mechanisms, models, and methods. Int. J. Mol. Sci. 17 (2), 144 (2016).
54. Yasir, M. et al. Investigating the inhibitory potential of flavonoids against aldose reductase: insights from molecular docking, dynamics simulations, and gmx_MMPBSA analysis. Curr. Issues. Mol. Biol. 46 (10), 11503–11518 (2024).
55. Kadhum, L. H. Geometry optimization of coupling allin-metformin using DFT/B3LYP molecular modelling technique. Iraqi J. Market Res. Consumer Prot. 13 (2), 89–100 (2021).
56. El Addali, A. et al. Theoretical study of the phosphate units stability by the DFT B3LYP/6-311G quantum method. J. Chem. Technol. 31 (3), 477–485 (2023).
57. Mackay, A. et al. Discovery of NP3-253, a potent brain penetrant inhibitor of the NLRP3 inflammasome. J. Med. Chem. 67 (23), 20780–20798 (2024).
58. Bağlan, M., Gören, K. & Yıldıko, Ü. MEP analysis and molecular docking using DFT calculations in DFPA molecule. Int. J. Chem. Technol. 7 (1), 38–47 (2023).
59. Taher, S. R. & Hamad, W. M. Synthesis, characterization, density functional theory (DFT) analysis, and mesomorphic study of new thiazole derivatives. Bull. Chem. Soc. Ethiop. 38 (6), 1827–1842 (2024).
60. Stuart, J. G. & Jebaraj, J. W. Synthesis, characterisation, in silico molecular docking and DFT studies of 2,6-bis(4-hydroxy-3-methoxyphenyl)-3,5-dimethylpiperidin-4-one. Indian J. Chem. (IJC). 62 (10), 1061–1080 (2023).
61. Andonova, V. et al. Spectral characteristics, in silico perspectives, density functional theory (DFT), and therapeutic potential of green-extracted phycocyanin from spirulina. Int. J. Mol. Sci. 25 (17), 9170 (2024).
62. Wu, S. et al. Theoretical study on the adsorption of sulforaphane on B12N12-related nanocages based on density functional theory. New J. Chem. 47 (47), 21743–21752 (2023).
63. Khalid, M. et al. Exploration of noncovalent interactions, chemical reactivity, and nonlinear optical properties of piperidone derivatives: a concise theoretical approach. ACS Omega. 5 (22), 13236–13249 (2020).
64. Solgun, D. G. et al. Synthesis of axially silicon phthalocyanine substituted with bis-(3,4-dimethoxyphenethoxy) groups, DFT and molecular docking studies. J. Incl. Phenom. Macrocyclic Chem. 102 (11), 851–860 (2022).
65. Ganiev, B., Mardonov, U. & Kholikova, G. Molecular structure, HOMO-LUMO, MEP analysis of triazine compounds using DFT (B3LYP) calculations. Materials Today: Proceedings (2023).
66. Pardridge, W. M. Drug transport across the blood–brain barrier. J. Cereb. Blood flow. Metabolism. 32 (11), 1959–1972 (2012).
67. Leeson, P. D. & Springthorpe, B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat. Rev. Drug Discovery. 6 (11), 881–890 (2007).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
