Abstract
Hepatitis C virus (HCV) is a significant global health issue because of its widespread prevalence and the absence of a reliable preventive vaccine. Although therapeutic interventions have progressed considerably since the disease was first identified, its resurgence underscores the need for innovative strategies to combat it. The nonstructural protein NS5A is crucial in the HCV life cycle, playing a major role in both viral replication and assembly. Its importance is reflected in the inclusion of NS5A inhibitors in all currently approved HCV combination therapies. In this study, a quantitative structure–activity relationship (QSAR) analysis was conducted to design new compounds with enhanced inhibitory activity against HCV. A set of 82 phenylthiazole derivatives was used to construct a QSAR model with the Monte Carlo optimization technique. The model identifies the specific structural features that enhance or reduce inhibitory activity, and these findings were used to design novel NS5A inhibitors. Molecular docking was then used to predict the binding affinity of the newly designed inhibitors to the NS5A protein, followed by molecular dynamics simulations to investigate their dynamic interactions over time. Molecular mechanics generalized Born surface area (MMGBSA) calculations were carried out to estimate the binding free energies of the inhibitor candidates, providing additional insight into their binding affinities and stabilities. Finally, absorption, distribution, metabolism, excretion, and toxicity (ADMET) analysis was performed to assess the pharmacokinetic and toxicity profiles of the inhibitor candidates. This comprehensive approach provides a detailed understanding of the potential efficacy, stability, and safety of the screened drug candidates, offering valuable insights for their further development as potent therapeutic agents against HCV.
Keywords: HCV, NS5A, Phenylthiazole, QSAR, MMGBSA, ADMET
Subject terms: Drug discovery, Drug safety, Medicinal chemistry, Pharmacology, Toxicology
Introduction
Hepatitis C virus (HCV), an RNA virus of the Flaviviridae family, is one of the major causes of liver disease worldwide. HCV infection affects over 80 million people, and most cases exhibit either no symptoms or only mild symptoms1,2. Chronic HCV infection can lead to serious liver complications, including cirrhosis and liver cancer, which can be life-threatening if untreated. The current standard of care for HCV involves direct-acting antivirals (DAAs), which have demonstrated high cure rates in a short period of time2. However, DAAs can cause side effects such as fatigue, headache, and nausea. Furthermore, HCV treatment can be expensive, with limited access for patients in low- and middle-income countries. These challenges present significant barriers to HCV treatment and can have a major impact on patient outcomes. Therefore, ongoing efforts are needed to improve access to HCV treatment, increase affordability, and minimize the adverse effects of the drugs used.
The HCV genome encodes multiple structural and nonstructural (NS) proteins, including NS2, NS3, NS4A, NS4B, NS5A, NS5B, and p7, all of which contribute to the virus’s ability to replicate and evade immune responses3,4. Early antiviral therapies focused on the NS3/4A protease and the NS5B RNA-dependent RNA polymerase, leading to the development of NS3/4A protease inhibitors (e.g., Simeprevir, Paritaprevir) and NS5B polymerase inhibitors (e.g., Sofosbuvir, Dasabuvir), which demonstrated effectiveness in treating HCV5,6. Despite their success, current NS3/4A and NS5B inhibitors have limitations, including genotype-specific efficacy and vulnerability to resistance-associated variants (RAVs)7,8. For example, while Ledipasvir and Velpatasvir are effective against multiple genotypes, they demonstrate reduced potency against certain RAVs, leading to treatment failures in some patients9.
DAAs that target NS3/4A and NS5B, such as the aforementioned inhibitors, have also shown high sustained virologic response (SVR) rates across multiple HCV genotypes10. More recently, NS5A has emerged as a compelling target due to its multifunctional role in the viral life cycle11,12. NS5A not only supports viral RNA replication but also contributes to the assembly of viral particles and interacts with host cellular pathways, which allows HCV to modulate immune responses and enhance its persistence in the host13. As a result, targeting NS5A with early-established DAAs improved SVR rates across multiple HCV genotypes, making it a cornerstone of HCV therapy14. Nevertheless, the limitations in genotype coverage and resistance described above highlight the need for new NS5A inhibitors with enhanced resistance barriers and broader efficacy across HCV genotypes, which this study aims to address. To date, Daclatasvir (DCV) is the most effective approved drug against NS5A. At picomolar concentrations, it binds selectively to the NS5A protein, significantly altering the subcellular localization of NS5A, blocking NS5A hyperphosphorylation, and inhibiting viral RNA synthesis15–17. Despite its remarkable potency, DCV is only effective against a limited number of HCV genotypes and has a low resistance barrier to multiple mutations.
To identify more effective NS5A inhibitors, dimeric phenylthiazole derivatives were considered as potential candidates18. This choice is based on several structural and functional considerations. The scaffold of Daclatasvir, which consists of two symmetrical imidazole rings, plays a crucial role in its strong potency against HCV19. Similarly, dimeric phenylthiazole derivatives possess a dimeric structure that is hypothesized to enhance binding affinity and specificity to the NS5A protein. The symmetrical nature of these derivatives allows for multiple interactions with the active sites of NS5A, potentially leading to improved inhibitory activity. Additionally, phenylthiazole derivatives are known for their versatile biological activities and favorable pharmacokinetic properties, making them suitable candidates for further development20.
In recent years, integrated computational approaches have gained traction in drug discovery due to their ability to prioritize lead compounds and predict binding affinities with high precision. Such methods, using explainable machine learning-driven QSAR, pharmacophore modeling, and molecular simulations, have demonstrated significant success in optimizing drug candidates21,22. These studies underscore the value of combining computational techniques, a methodology also employed in the design and evaluation of novel NS5A inhibitors in this work.
In continuation of our ongoing efforts to develop potent NS5A inhibitors23, this study focuses on establishing a quantitative structure–activity relationship (QSAR) model based on the structural characteristics of dimeric phenylthiazole derivatives and their inhibitory activities against the NS5A protein. The Monte Carlo method is a useful tool for developing SMILES-based QSAR models, which provide important insights for designing new, potent dimeric NS5A inhibitors. Furthermore, molecular docking analyses were conducted to investigate the interaction between the newly designed molecules and the active site of NS5A. The most promising compounds were subjected to molecular dynamics (MD) simulations to evaluate their stability and behavior within the active site of NS5A. To provide a more comprehensive analysis, MMGBSA (Molecular Mechanics Generalized Born Surface Area) calculations were performed to estimate the binding free energies of the inhibitor candidates, enhancing our understanding of their binding affinities and stabilities. Finally, the pharmacokinetic and toxicity properties of the selected designed molecules were predicted. This multi-faceted approach provides a thorough understanding of the potential NS5A inhibitors, laying the foundation for the development of effective anti-HCV drugs.
Materials and methods
Data set
A dataset of 82 dimeric phenylthiazole derivatives exhibiting inhibitory activity against NS5A was collected from the work of Yeh et al.18. The molecular structures of these compounds were sketched and converted into isomeric Simplified Molecular Input Line Entry System (SMILES) notation using the ACD/ChemSketch program. The EC50 (half maximal effective concentration) value of each molecule was converted to its pEC50 (−log10(EC50)) value and used as the dependent variable. The molecular structures and their corresponding EC50 data are listed in Table S1, and their SMILES notations and pEC50 values are available in Table S2.
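The pEC50 conversion described above can be illustrated with a minimal sketch. It assumes that EC50 is expressed on a molar scale before taking the logarithm; the function name `pec50_from_ec50` and the unit table are hypothetical helpers, and the unit actually used in Table S1 is not restated here.

```python
import math

# Conversion factors to mol/L (assumption: EC50 is reported in one of these units).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def pec50_from_ec50(ec50, unit="nM"):
    """Return pEC50 = -log10(EC50 expressed in mol/L)."""
    if ec50 <= 0:
        raise ValueError("EC50 must be positive")
    return -math.log10(ec50 * UNIT_TO_MOLAR[unit])


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    for value, unit in [(1.0, "nM"), (10.0, "pM")]:
        print(value, unit, "->", round(pec50_from_ec50(value, unit), 2))
```

A larger pEC50 corresponds to a lower EC50, that is, to a more potent compound.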
QSAR model construction and validation
The CORAL software was used to build the SMILES-based QSAR models24. To develop the optimal QSAR model, three random splits were generated, and each split was further divided into four sets: 35% for training (Tr), 35% for invisible training (Inv-Tr), 15% for calibration (Cal), and 15% for validation (Val). The correlation weights (CWs) were calculated using the Tr set. The Inv-Tr set was used to check how well the model performs for compounds not included in the Tr set. The Cal set was used to detect the onset of overtraining, while the Val set was used to evaluate the model on structures absent from the other sets.
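A random split of this kind can be sketched as follows. This is an illustration under stated assumptions, not the CORAL procedure itself: the function `split_dataset`, the seeds, and the rounding rule are hypothetical, so the resulting set sizes may differ by one or two compounds from those reported for each split.

```python
import random


def split_dataset(ids, fractions=(0.35, 0.35, 0.15, 0.15), seed=0):
    """Randomly partition compound identifiers into Tr, Inv-Tr, Cal, and Val sets."""
    names = ("Tr", "Inv-Tr", "Cal", "Val")
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    # Round the first three sizes; the last set takes the remainder.
    sizes = [round(f * n) for f in fractions[:-1]]
    sizes.append(n - sum(sizes))
    subsets, start = {}, 0
    for name, size in zip(names, sizes):
        subsets[name] = shuffled[start:start + size]
        start += size
    return subsets


if __name__ == "__main__":
    compounds = list(range(1, 83))  # 82 compounds, numbered 1..82
    for seed in (1, 2, 3):          # three independent random splits
        split = split_dataset(compounds, seed=seed)
        print(seed, {name: len(members) for name, members in split.items()})
```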
The development of QSAR models based on SMILES notation uses the optimal descriptors Eq. (1):
![]() |
1 |
The optimal descriptor SMILESDCW(T, Nepoch) is the sum of the correlation weights (CWs) of the SMILES-based attributes, where T is the threshold and Nepoch is the number of epochs of the Monte Carlo optimization. The optimal SMILES attributes are explained in Table S3.
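As an illustration only, a descriptor restricted to the attribute types that appear in Table 2 (one-, two-, and three-symbol SMILES attributes, written here as S_k, SS_k, and SSS_k, plus the atom-count attributes Nmax, Omax, and Smax) could take the form below. The attribute set actually used in Eq. (1) is the one listed in Table S3, so this expression is an assumption and not the authors' equation.

```latex
\mathrm{DCW}(T, N_{\mathrm{epoch}}) =
  \sum_{k} CW(S_k) + \sum_{k} CW(SS_k) + \sum_{k} CW(SSS_k)
  + CW(N_{\max}) + CW(O_{\max}) + CW(S_{\max})
```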
The QSAR models were developed using linear regression, as described in Eq. (2).
$$\mathrm{pEC_{50}} = C_0 + C_1 \times {}^{\mathrm{SMILES}}\mathrm{DCW}(T, N_{\mathrm{epoch}}) \tag{2}$$
C0 is the intercept and C1 is the slope of the regression equation.
The Monte Carlo optimization for QSAR modeling is driven by the target functions TF1 and TF2. Equation (3) computes TF1 using the balance-of-correlations approach, and Eq. (4) gives the modified target function TF2, which adds to TF1 a term based on the index of ideality of correlation (IIC)25.
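$$TF_1 = R_{Tr} + R_{Inv\text{-}Tr} - \left| R_{Tr} - R_{Inv\text{-}Tr} \right| \times 0.1 \tag{3}$$

$$TF_2 = TF_1 + IIC_{Cal} \times W_{IIC} \tag{4}$$

$$IIC_{set} = R_{set} \times \frac{\min\left({}^{-}MAE_{set},\ {}^{+}MAE_{set}\right)}{\max\left({}^{-}MAE_{set},\ {}^{+}MAE_{set}\right)} \tag{5}$$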
RTr, RInv-Tr, and Rset are the correlation coefficients between the experimental pEC50 and the predicted pEC50 for the training set, invisible training set, and a specific set, respectively. The mean absolute error (MAE) is calculated as follows:
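$${}^{-}MAE_{set} = \frac{1}{{}^{-}N} \sum_{k=1}^{{}^{-}N} \left| \Delta_k \right|, \quad \Delta_k < 0, \quad {}^{-}N \text{ is the number of } \Delta_k < 0 \tag{6}$$

$${}^{+}MAE_{set} = \frac{1}{{}^{+}N} \sum_{k=1}^{{}^{+}N} \left| \Delta_k \right|, \quad \Delta_k \ge 0, \quad {}^{+}N \text{ is the number of } \Delta_k \ge 0 \tag{7}$$

where

$$\Delta_k = \mathrm{observed}_k - \mathrm{calculated}_k \tag{8}$$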
∆k measures the quality of prediction for the kth substance from a set.
To construct robust QSAR models, the optimal threshold (T*) and number of epochs (N*) were determined from the best statistical metrics for the Cal set. The search covered thresholds (T) from 1 to 10 and numbers of epochs (Nepoch) from 1 to 30, and three optimization probes were employed. The weight of the IIC (WIIC), an empirical coefficient of the Monte Carlo optimization, was set to zero for TF1 and to 0.2 for TF2.
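The target functions and the IIC can be sketched in a few lines of Python. This is a minimal illustration and not the CORAL implementation: it assumes the commonly published forms TF1 = RTr + RInv-Tr − 0.1·|RTr − RInv-Tr| and IIC = R · min(−MAE, +MAE)/max(−MAE, +MAE), with ∆k taken as observed minus calculated. Returning zero when one of the two error groups is empty is a choice made here for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def iic(observed, calculated):
    """Index of ideality of correlation for one set."""
    deltas = [o - c for o, c in zip(observed, calculated)]
    negative = [abs(d) for d in deltas if d < 0]
    positive = [abs(d) for d in deltas if d >= 0]
    if not negative or not positive:
        return 0.0  # illustrative convention for a one-sided error distribution
    mae_neg = sum(negative) / len(negative)
    mae_pos = sum(positive) / len(positive)
    largest = max(mae_neg, mae_pos)
    if largest == 0.0:
        return 0.0
    return pearson_r(observed, calculated) * min(mae_neg, mae_pos) / largest


def tf1(r_tr, r_inv_tr):
    """Balance-of-correlations target function."""
    return r_tr + r_inv_tr - abs(r_tr - r_inv_tr) * 0.1


def tf2(r_tr, r_inv_tr, iic_cal, w_iic=0.2):
    """Target function extended with the IIC of the calibration set."""
    return tf1(r_tr, r_inv_tr) + iic_cal * w_iic
```

With WIIC = 0, `tf2` reduces to `tf1`, which matches the two settings used in this study.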
The primary goal of building a QSAR model is to predict the properties of new molecules accurately, in an unbiased and trustworthy manner26. The robustness and reliability of a QSAR model can be validated according to three main criteria: data randomization (Y-scrambling), external validation using the calibration set, and internal validation (cross-validation) using the Tr and Inv-Tr sets. To evaluate the quality of the developed SMILES-based QSAR models, several statistical benchmarks were computed. The determination coefficient (R2), which quantifies the proportion of variance in the dependent variable explained by the independent variables, indicates how well the model fits the observed data. The concordance correlation coefficient (CCC), which assesses the agreement between predicted and observed values, provides a combined measure of precision and accuracy. The cross-validated correlation coefficient (Q2), which evaluates the predictive power of the model on unseen data by partitioning the dataset and training the model on subsets, helps prevent overfitting and indicates model robustness. The standard error of estimation (s), which measures the average distance of the observed values from the regression line, reflects model precision. The mean absolute error (MAE), the average absolute difference between predicted and observed values, represents prediction accuracy without disproportionately penalizing larger errors. The Fisher ratio (F), which compares explained to unexplained variance, serves as a measure of model significance. The root-mean-square error (RMSE), the square root of the average squared difference between predicted and observed values, is sensitive to larger errors.
Additionally, advanced metrics further strengthen the assessment of model quality and reliability26–28: Rm2av, the average of the rm2 metrics obtained with the observed and predicted axes interchanged; ΔRm2, the absolute difference between these two rm2 values; and CRp2, which quantifies the robustness of the model against chance correlation in the Y-randomization test. Finally, the IIC is used to validate all models and determine the best one29,30. Our previous articles provide a detailed explanation of the validation procedures and requirements for SMILES-based QSAR models25,31.
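Several of these benchmarks are straightforward to compute from paired observed and predicted values. The sketch below uses the standard textbook definitions (squared Pearson correlation for R2, Lin's concordance correlation coefficient for CCC); the function names are hypothetical, and CORAL may differ in details such as degrees-of-freedom conventions.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = _mean(observed), _mean(predicted)
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)


def ccc(observed, predicted):
    """Concordance correlation coefficient (population variances)."""
    n = len(observed)
    mo, mp = _mean(observed), _mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    return 2.0 * cov / (var_o + var_p + (mo - mp) ** 2)


def mae(observed, predicted):
    """Mean absolute error."""
    return _mean([abs(o - p) for o, p in zip(observed, predicted)])


def rmse(observed, predicted):
    """Root-mean-square error."""
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))
```

The example of predictions that are perfectly correlated with the observations but shifted by a constant shows why CCC complements R2: R2 stays at 1 while CCC drops below 1.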
Another crucial parameter for validating the reliability of a QSAR model is the applicability domain (AD). The AD is used to assess how well a QSAR model can predict compounds that were not considered during model development, and to distinguish reliable from less reliable predictions. For SMILES-based QSAR models, the AD is defined using the statistical defects of SMILES attributes. The statistical defect of a SMILES attribute, denoted d(A), is calculated by comparing the probability of that attribute in the training and calibration sets, as given by the following equation32:
$$d(A) = \frac{\left| P(A_{Tr}) - P(A_{Cal}) \right|}{N(A_{Tr}) + N(A_{Cal})} \tag{9}$$
Here, the probabilities of an attribute (A) in the training and calibration sets are represented by P(ATr) and P(ACal), respectively, while the frequency with which attribute (A) occurs in the training and calibration sets is indicated by N(ATr) and N(ACal).
The statistical SMILES-defect (D) is the sum of the defects d(A) of all attributes present in the SMILES notation:
$$D = \sum_{k} d(A_k) \tag{10}$$
A molecule is classified as an outlier if D > 2 * D̅, where D̅ is the average of D determined for the Tr, Inv-Tr, and Cal sets.
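The defect-based applicability domain can be sketched as follows. The sketch assumes that the probability of an attribute is the fraction of SMILES in a set that contain it, and that an attribute absent from both sets receives a defect of 1; both are assumptions made for illustration, and the function names are hypothetical. The worked example uses the counts reported for the attribute Nmax.6 in Table 2 (21 of 30 training SMILES, 8 of 11 calibration SMILES).

```python
def attribute_defect(n_tr, n_cal, size_tr, size_cal):
    """Statistical defect d(A) of one SMILES attribute."""
    total = n_tr + n_cal
    if total == 0:
        return 1.0  # assumed convention for attributes never seen in either set
    p_tr = n_tr / size_tr      # fraction of training SMILES containing A
    p_cal = n_cal / size_cal   # fraction of calibration SMILES containing A
    return abs(p_tr - p_cal) / total


def smiles_defect(attributes, counts_tr, counts_cal, size_tr, size_cal):
    """Statistical SMILES-defect D: sum of d(A) over the attributes of one SMILES."""
    return sum(
        attribute_defect(counts_tr.get(a, 0), counts_cal.get(a, 0), size_tr, size_cal)
        for a in attributes
    )


def find_outliers(defects):
    """Return identifiers whose D exceeds twice the mean D of the supplied sets."""
    mean_d = sum(defects.values()) / len(defects)
    return [name for name, d in defects.items() if d > 2.0 * mean_d]


if __name__ == "__main__":
    # Nmax.6 in Table 2: 21 of 30 (Tr) and 8 of 11 (Cal).
    print(round(attribute_defect(21, 8, 30, 11), 3))
```

For Nmax.6 this gives approximately 0.001, in line with the defect listed for that attribute in Table 2.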
Molecular docking of selected hits
Molecular docking is a reliable computational method for analyzing the interactions between potential drugs and the active site of a target protein. It provides insight into the structural elements that govern binding affinity33–35. AutoDock Vina was employed for the molecular docking analyses of the newly designed compounds and of the reference compound36,37. The crystallographic structure of NS5A (PDB code 3FQM) was obtained from the Protein Data Bank38. This structure was chosen because it provides a high-resolution representation of NS5A in a biologically relevant dimeric form. The dimeric conformation in 3FQM captures a ‘back-to-back’ arrangement of chains A and B, referred to as the ‘closed’ dimer structure, a critical orientation that allows interaction with NS5A inhibitors, as supported by various studies39,40. Key residues forming the binding surface on both subunits include GLY33, GLN40, GLN54, THR56, PRO58, ASN91, TYR93, THR95, PRO97, ARG157, and TYR161, which are vital to understanding binding dynamics within the active site. For protein preparation, water molecules were removed, polar hydrogens and Gasteiger charges were added, and the structure was saved in pdbqt format using AutoDockTools41. The selected compounds were prepared with the same software and saved in pdbqt format. Docking was performed in a cubic grid box covering the key residues of NS5A, centered at xyz coordinates 23.6268, −9.8409, 9.0372, with dimensions of 20 × 20 × 20 Å and an exhaustiveness of 20. The docking results were analyzed using Discovery Studio Visualizer42 and PyMOL43.
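The grid-box settings above map directly onto the key = value configuration file that AutoDock Vina accepts. The sketch below only assembles such a file as text; the file names `ns5a_3fqm.pdbqt`, `ligand.pdbqt`, and `docked.pdbqt` are hypothetical placeholders, and the exact command line used in this study is not stated in the text.

```python
def vina_config(receptor, ligand, out, center, size=(20.0, 20.0, 20.0), exhaustiveness=20):
    """Build the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Grid center and box size as reported in the text; file names are placeholders.
    text = vina_config(
        receptor="ns5a_3fqm.pdbqt",
        ligand="ligand.pdbqt",
        out="docked.pdbqt",
        center=(23.6268, -9.8409, 9.0372),
    )
    print(text)
```

The resulting text can be written to a file and passed to Vina through its `--config` option.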
Molecular dynamics (MD) simulations
The best conformations of the selected designed molecules in complex with NS5A, as identified from the docking results, were subjected to MD simulations together with the apo form of NS5A. Before the MD simulations, the ligands were parameterized with AM1-BCC charges and parameters from the general AMBER force field (GAFF), while the receptors were parameterized with the AMBER ff14SB force field44. The two coordinated zinc ions in NS5A and their neighboring cysteine residues were parameterized with the Zinc AMBER force field45. In both subunit A and subunit B, the Zn2+ ion is coordinated by four cysteines at positions 39, 57, 59, and 80, and both sites were treated as Zn-CCCC centers. Subsequently, the systems were solvated in an orthorhombic TIP3P water box with a minimum distance of 20 Å from the system surface. Counterions (Na+ and Cl−) were added and randomly distributed to neutralize the overall charge of the systems and achieve a salt concentration of 0.15 M. The preparations were performed with the programs tLEaP and antechamber from AmberTools2346. The coordinate and topology files were converted to GROMACS input files using the ParmEd program47. Energy minimization of each system was performed using the steepest descent method, with up to 50,000 iterations and a maximum force threshold of 10.0 kJ mol−1 nm−1. The systems were kept at a constant temperature of 300 K and a pressure of 1.01325 bar. The systems were equilibrated in both the canonical (NVT) and isothermal-isobaric (NPT) ensembles. MD simulations were then performed for 200 ns using GROMACS 2021.348. The analysis of the trajectories included the calculation of the root mean square deviation (RMSD), radius of gyration (Rg), solvent accessible surface area (SASA), and root mean square fluctuation (RMSF) to assess the structural integrity of the designed molecules and their interaction with NS5A. The results of these analyses were plotted using Xmgrace software49.
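Two of the trajectory metrics listed above can be written out explicitly. The sketch below computes the RMSD between two coordinate sets and the mass-weighted radius of gyration in plain Python. It assumes that the frame has already been superimposed on the reference structure, so no least-squares fitting is performed here, and that coordinates are in nm; the function names are hypothetical and this is not the GROMACS implementation.

```python
import math


def rmsd(coords, reference):
    """RMSD between two pre-aligned coordinate sets (lists of (x, y, z) tuples)."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
        for (x, y, z), (rx, ry, rz) in zip(coords, reference)
    )
    return math.sqrt(squared / len(coords))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the center of mass."""
    total_mass = sum(masses)
    com = [
        sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
        for i in range(3)
    ]
    weighted = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)
```

Applied frame by frame to the Cα atoms, the first function yields a time series of the kind discussed in the Results, and the second yields the Rg trace.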
Binding free energy calculation
The binding free energy of a protein–ligand complex reflects the forces that stabilize an inhibitor within the active site and therefore the stability of the system throughout the simulation. The binding free energies of the protein–ligand complexes were estimated using the Molecular Mechanics Generalized Born Surface Area (MMGBSA) method, implemented through the gmx-mmgbsa package50. The binding free energy (ΔGbind) is calculated as:
$$\Delta G_{bind} = \Delta G_{complex} - \left( \Delta G_{receptor} + \Delta G_{ligand} \right) \tag{11}$$
where ΔGcomplex represents the total free energy of the complex, and ΔGreceptor and ΔGligand are the free energies of the isolated protein and ligand, respectively.
The binding free energy can also be expressed as:
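$$\Delta G_{bind} = \Delta E_{gas} + \Delta G_{solv} = \left( \Delta E_{vdw} + \Delta E_{ele} \right) + \left( \Delta G_{polar} + \Delta G_{nonpolar} \right) \tag{12}$$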
Here, ΔEgas is the average potential energy in a vacuum, comprising van der Waals (ΔEvdw) and electrostatic interactions (ΔEele). ΔGsolv is the solvation free energy, which includes contributions from both polar (ΔGpolar) and non-polar (ΔGnonpolar) solvation energies.
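The decomposition described above amounts to simple bookkeeping of four energy terms. The sketch below assumes that no entropy term is included, since none is defined in the text, and the numerical values are placeholders chosen for illustration rather than results of this study.

```python
def binding_free_energy(e_vdw, e_ele, g_polar, g_nonpolar):
    """Return (dE_gas, dG_solv, dG_bind) in kcal/mol from the four MMGBSA terms."""
    e_gas = e_vdw + e_ele          # gas-phase interaction energy
    g_solv = g_polar + g_nonpolar  # solvation free energy
    return e_gas, g_solv, e_gas + g_solv


if __name__ == "__main__":
    # Placeholder values (kcal/mol), not taken from the paper.
    e_gas, g_solv, g_bind = binding_free_energy(-50.0, -20.0, 30.0, -5.0)
    print(f"dE_gas = {e_gas:.1f}, dG_solv = {g_solv:.1f}, dG_bind = {g_bind:.1f}")
```

A favorable gas-phase term is typically offset in part by the polar solvation penalty, which is why the terms are reported separately.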
Pharmacokinetic and toxicity predictions
Pharmacokinetic and toxicity predictions play a crucial role in the drug design process. Pharmacokinetics is the study of how drugs are absorbed, distributed, metabolized, and excreted by the body, while toxicity refers to the harmful effects that a drug may cause to living organisms. Early prediction of these properties can identify potential safety and efficacy issues before a drug is tested in animals or humans. This can reduce the risk of expensive and time-consuming clinical trials that ultimately fail because of toxicity or poor pharmacokinetic properties. In recent years, advances in computer-based modeling and simulation have made it possible to predict these properties more accurately, enabling researchers to make more informed decisions about which drug candidates to pursue. As a result, pharmacokinetic and toxicity predictions are now an integral part of the drug design process and are essential for the development of safe and effective drugs51,52. The pkCSM and OSIRIS Property Explorer servers were used to estimate the pharmacokinetic and toxicity profiles of the compounds under investigation53,54.
Results and discussion
QSAR modeling
Six QSAR models were developed through three random dataset splits, employing two target functions: TF1 with WIIC = 0 and TF2 with WIIC = 0.2. Statistical assessments of the SMILES-based QSAR models revealed that a WIIC of 0.2 enhances the impact of the IIC during the Monte Carlo optimization. Table 1 presents the computed statistical parameters for all splits, and Fig. 1 displays the correlation between the experimental and predicted pEC50 values across the three splits. The data in Table 1 demonstrate that all models meet the criteria set by Tropsha et al.55 and Ojha et al.56, confirming their statistical reliability. AD analysis in Table S4 affirms that all data points from the three splits fall within the applicability domain.
Table 1.
Statistical parameters of the built QSAR models for the three splits.
| Split | TF | Set | n | R2 | CCC | IIC | Q2 | Q2F1 | Q2F2 | Q2F3 | Rm2av | ΔRm2 | CRp2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | TF1 | Tr | 30 | 0.931 | 0.964 | 0.643 | 0.920 | | | | | | 0.901 |
| 1 | TF1 | Inv-Tr | 29 | 0.864 | 0.865 | 0.566 | 0.842 | | | | | | 0.858 |
| 1 | TF1 | Cal | 11 | 0.800 | 0.809 | 0.610 | 0.729 | 0.470 | 0.461 | 0.570 | 0.659 | 0.044 | 0.766 |
| 1 | TF1 | Val | 12 | 0.886 | 0.901 | 0.654 | 0.850 | | | | 0.868 | 0.079 | |
| 1 | TF2 | Tr | 30 | 0.864 | 0.927 | 0.711 | 0.845 | | | | | | 0.844 |
| 1 | TF2 | Inv-Tr | 29 | 0.860 | 0.909 | 0.891 | 0.837 | | | | | | 0.828 |
| 1 | TF2 | Cal | 11 | 0.850 | 0.906 | 0.922 | 0.807 | 0.833 | 0.829 | 0.864 | 0.729 | 0.148 | 0.793 |
| 1 | TF2 | Val | 12 | 0.908 | 0.949 | 0.911 | 0.858 | | | | 0.602 | 0.184 | |
| 2 | TF1 | Tr | 30 | 0.874 | 0.932 | 0.541 | 0.856 | | | | | | 0.8483 |
| 2 | TF1 | Inv-Tr | 28 | 0.873 | 0.924 | 0.665 | 0.844 | | | | | | 0.8637 |
| 2 | TF1 | Cal | 11 | 0.827 | 0.887 | 0.676 | 0.698 | 0.811 | 0.811 | 0.667 | 0.645 | 0.185 | 0.7866 |
| 2 | TF1 | Val | 13 | 0.709 | 0.823 | 0.697 | 0.591 | | | | 0.782 | 0.118 | |
| 2 | TF2 | Tr | 30 | 0.737 | 0.848 | 0.751 | 0.696 | | | | | | 0.722 |
| 2 | TF2 | Inv-Tr | 28 | 0.725 | 0.829 | 0.572 | 0.689 | | | | | | 0.703 |
| 2 | TF2 | Cal | 11 | 0.778 | 0.824 | 0.882 | 0.644 | 0.733 | 0.733 | 0.529 | 0.793 | 0.128 | 0.717 |
| 2 | TF2 | Val | 13 | 0.902 | 0.906 | 0.813 | 0.885 | | | | 0.531 | 0.254 | |
| 3 | TF1 | Tr | 28 | 0.811 | 0.896 | 0.583 | 0.775 | | | | | | 0.808 |
| 3 | TF1 | Inv-Tr | 29 | 0.867 | 0.931 | 0.835 | 0.847 | | | | | | 0.838 |
| 3 | TF1 | Cal | 12 | 0.665 | 0.809 | 0.815 | 0.523 | 0.658 | 0.649 | 0.632 | 0.541 | 0.142 | 0.596 |
| 3 | TF1 | Val | 13 | 0.726 | 0.814 | 0.674 | 0.660 | | | | 0.617 | 0.157 | |
| 3 | TF2 | Tr | 28 | 0.958 | 0.978 | 0.979 | 0.948 | | | | | | 0.936 |
| 3 | TF2 | Inv-Tr | 29 | 0.963 | 0.978 | 0.867 | 0.958 | | | | | | 0.956 |
| 3 | TF2 | Cal | 12 | 0.799 | 0.849 | 0.669 | 0.725 | 0.719 | 0.712 | 0.698 | 0.678 | 0.184 | 0.722 |
| 3 | TF2 | Val | 13 | 0.862 | 0.911 | 0.614 | 0.799 | | | | 0.694 | 0.152 | |
Fig. 1.
Experimental versus calculated pEC50 values for the three splits. (Figure created using Origin 2022; OriginLab; https://www.originlab.com/).
The selection of the optimal QSAR model was based on a higher R2 value for the Val set and a superior IIC value for the Cal set. Consequently, the QSAR model derived from split 1 using TF2 shows the highest R2 value of 0.908 and the best IIC value of 0.922 for the Val and Cal sets, respectively. Numerical values for various parameters for the validation set of split 1 included Q2 = 0.858, CCC = 0.949, Rm2av = 0.602, and ∆Rm2 = 0.184.
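The selection rule can be reproduced from the values in Table 1. In the sketch below, the R2 values of the Val sets and the IIC values of the Cal sets are copied from Table 1. Ranking first by R2 and then by IIC is an assumption made for illustration; for these data both criteria point to the same model, so the order does not affect the outcome.

```python
# (split, target function) -> validation-set R2 and calibration-set IIC from Table 1
models = {
    (1, "TF1"): {"r2_val": 0.886, "iic_cal": 0.610},
    (1, "TF2"): {"r2_val": 0.908, "iic_cal": 0.922},
    (2, "TF1"): {"r2_val": 0.709, "iic_cal": 0.676},
    (2, "TF2"): {"r2_val": 0.902, "iic_cal": 0.882},
    (3, "TF1"): {"r2_val": 0.726, "iic_cal": 0.815},
    (3, "TF2"): {"r2_val": 0.862, "iic_cal": 0.669},
}


def select_best(candidates):
    """Pick the model with the highest Val R2, using Cal IIC as the second criterion."""
    return max(
        candidates,
        key=lambda key: (candidates[key]["r2_val"], candidates[key]["iic_cal"]),
    )


if __name__ == "__main__":
    split, target_function = select_best(models)
    print(f"Best model: split {split}, {target_function}")
```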
The model for split 1 obtained with TF2 is given by the following equation:
![]() |
Mechanistic interpretation
The Organization for Economic Cooperation and Development (OECD) emphasizes a mechanistic interpretation, which is crucial for understanding how certain molecular features can affect an endpoint. The CORAL model can be used to extract and interpret such features through Monte Carlo optimization. In the Monte Carlo optimization method, optimal descriptors known as correlation weight values (CW) are correlated with specific structural attributes (SAk). These attributes help in identifying promoters of increase or decrease in the endpoint. By running this optimization multiple times, it was found that SMILES attributes with positive CWs were responsible for increasing pEC50 activity, while those with negative CWs were responsible for decreasing it. However, attributes with both positive and negative CWs were not clearly defined. The identification and interpretation of these promoters follow three general rules: (i) a structural attribute is considered a promoter of increase if CW(SAk) is positive in all runs, (ii) a structural attribute is considered a promoter of decrease if CW(SAk) is negative in all runs, and (iii) if CW(SAk) is both positive and negative across different runs, the structural attribute is undefined. Table 2 shows the main promoters of increase and decrease of the pEC50 values for split 1 based on the three independent runs of the QSAR model.
Table 2.
Promoters of increase and decrease of pEC50 endpoint value from split 1 and their description.
| SAk | CWs Probe 1 | CWs Probe 2 | CWs Probe 3 | NTra | NInv-Trb | NCalc | Defect [SAk]d | Comment |
|---|---|---|---|---|---|---|---|---|
| Promoters of increase | ||||||||
| 1…s…(… | 1.332 | 1.225 | 3.348 | 30 | 29 | 11 | 0.000 | Combination of 1st ring with aromatic sulfur and branching |
| C…C…C… | 0.743 | 0.305 | 0.379 | 30 | 29 | 11 | 0.000 | Propyl group |
| c…(…1… | 1.449 | 1.244 | 0.252 | 30 | 29 | 11 | 0.000 | Combination of sp2 hybridized carbon, branching and aromatic ring |
| c…(…c… | 1.778 | 0.363 | 2.197 | 30 | 29 | 11 | 0.000 | Branching between two aromatic carbon atoms |
| c…c…(… | 0.956 | 0.151 | 2.387 | 30 | 29 | 11 | 0.000 | Two sp2 hybridized carbon or ethene with branching |
| c…c…… | 0.239 | 0.114 | 0.995 | 30 | 29 | 11 | 0.000 | Two aromatic carbons or ethene |
| c…n…c… | 1.407 | 2.486 | 3.461 | 30 | 29 | 11 | 0.000 | Aromatic nitrogen surrounded by two aromatic carbons |
| n…c…(… | 0.257 | 0.049 | 0.779 | 30 | 29 | 11 | 0.000 | Combination of sp2 nitrogen, sp2 carbon and branching |
| C…C…1… | 2.456 | 1.101 | 0.198 | 30 | 29 | 11 | 0.000 | Two non-aromatic carbon in the first ring |
| s…(…c… | 2.414 | 0.682 | 3.594 | 30 | 29 | 11 | 0.000 | Combination of aromatic sulfur, branching and aromatic carbon |
| Smax.2…… | 2.038 | 1.489 | 1.318 | 29 | 29 | 10 | 0.006 | Maximum number of Sulfur atoms is 2 |
| Nmax.6…… | 0.007 | 1.302 | 2.317 | 21 | 23 | 8 | 0.001 | Maximum number of Nitrogen atoms is 6 |
| Omax.4…… | 1.404 | 1.765 | 1.749 | 20 | 14 | 8 | 0.002 | Maximum number of Oxygen atoms is 4 |
| Promoters of decrease | ||||||||
| 1…(…… | − 0.409 | − 0.643 | − 1.288 | 30 | 29 | 11 | 0.000 | One aromatic ring with branching |
| c…(…… | − 0.251 | − 0.449 | − 0.342 | 30 | 29 | 11 | 0.000 | Aromatic carbon with branching |
| c………. | − 0.305 | − 0.381 | − 0.048 | 30 | 29 | 11 | 0.000 | Aromatic carbon |
| n………. | − 0.300 | − 0.761 | − 1.429 | 30 | 29 | 11 | 0.000 | Aromatic nitrogen |
| s………. | − 0.796 | − 0.556 | − 0.264 | 30 | 29 | 11 | 0.000 | Aromatic sulfur |
| s…1…… | − 0.455 | − 0.798 | − 0.490 | 30 | 29 | 11 | 0.000 | Aromatic sulfur linked with a ring |
| NTra, NInv-Trb, and NCalc represent the number of SMILES (samples) in Tr, Inv-Tr, and Cal sets, respectively, that comprise a particular attribute (SAk). Defect [SAk]d is calculated as the difference between the probability of SAk in the training and calibration sets divided by the total number of SAk in the training and calibration sets | ||||||||
Based on the results listed in Table 2, the molecular features contributing to increased activity were incorporated into the lead compound 29, designated as (A), which exhibits the highest pEC50 value. Conversely, features associated with a decrease in activity were omitted. Through this approach, seventeen (A1-A17) novel inhibitors were designed, and their activities were predicted from the best SMILES-based QSAR model (split 1). All newly designed molecules were found to have a higher pEC50 value than the lead compound (A). These compounds, along with their chemical structures and their calculated pEC50 values are listed in Table S5.
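The three rules used to label attributes as promoters can be expressed compactly. The sketch below applies them to the correlation weights of two attributes taken from Table 2; the third entry is a hypothetical attribute with mixed signs, added only to show the undefined case.

```python
def classify_promoter(correlation_weights):
    """Classify an attribute from its CWs across independent optimization probes."""
    if all(cw > 0 for cw in correlation_weights):
        return "increase"
    if all(cw < 0 for cw in correlation_weights):
        return "decrease"
    return "undefined"


if __name__ == "__main__":
    attributes = {
        "c...n...c...": (1.407, 2.486, 3.461),     # from Table 2
        "n...........": (-0.300, -0.761, -1.429),  # from Table 2
        "hypothetical": (0.500, -0.200, 0.100),    # illustrative mixed-sign case
    }
    for name, cws in attributes.items():
        print(name, "->", classify_promoter(cws))
```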
Molecular docking analysis
To confirm the effectiveness of the design strategy, a molecular docking analysis was conducted to study how the newly designed compounds interact with the NS5A protein in comparison with the reference lead compound (A). Figure 2 displays the binding site of the NS5A protein used in the docking calculations. All compounds were docked within the active site of the NS5A protein, and their estimated binding affinities are shown in Table S5. The three compounds with the highest binding affinity, namely A6, A16, and A17, were selected for further analysis. Their estimated binding affinities and the 2D and 3D representations of the ligand–protein interactions, alongside those of the reference inhibitor (A), are presented in Table 3, Fig. 3, and Fig. S1, respectively. Details of the ligand–protein interactions of A, A6, A16, and A17 are provided in Table S6. All compounds were found to interact efficiently with different key residues of the binding surface of NS5A. Analysis of the docking results shows that the lead compound A was stabilized by 5 hydrogen bonds (PRO58, GLY60, and THR95 of chain A and TYR93 of chain B) and 6 hydrophobic interactions (TYR93, CYS57, ILE52, and PRO58 of chain A and TYR93 and THR95 of chain B). Each of the selected compounds (A6, A16, and A17) formed three hydrogen bonds with the same amino acids (GLY33, THR56, and THR95 of chain A). However, the number of hydrophobic interactions varied, with A6 engaged in 7, A16 in 12, and A17 in 10. A16 exhibited the highest binding affinity of −10.2 kcal mol−1, while A6 and A17 showed slightly lower affinities of −9.9 and −9.8 kcal mol−1, respectively. Despite these differences in binding affinity and hydrophobic interactions, all compounds interact with common key residues such as ILE52, CYS57, PRO58, and TYR93 of subunit A and TYR93 of subunit B, indicating a consistent mode of action of the selected compounds.
Fig. 2.
Surface representation of the HCV NS5A dimer: subunit A is colored cadet blue and subunit B tan, and a rectangle highlights the region where the docking calculations were performed (left). The inset shows the reference ligand docked in the binding pocket of NS5A (right). (Figure created using ChimeraX version 1.5, 2022-11-24; https://www.rbvi.ucsf.edu/chimerax).
Table 3.
The predicted pEC50 and binding affinity scores of the selected newly designed compounds.
| Compound | pEC50 | Binding affinity (kcal mol−1) |
|---|---|---|
| A (lead compound) | 10.53 | −9.5 |
| A6 | 12.22 | −9.9 |
| A16 | 12.57 | −10.2 |
| A17 | 12.68 | −9.8 |
Fig. 3.
2D representations of ligand–protein interactions of compounds (A6, A16 and A17) and the reference ligand A within the active site of NS5A. (Figure created using BIOVIA Discovery Studio Visualizer version 21.1.0.20298; https://www.3ds.com/products-services/biovia/).
Molecular dynamics (MD) simulations
MD simulations were used to examine the structural stability and dynamic behavior of NS5A and of the compound–NS5A systems selected through the docking calculations, and to assess the binding efficacy of the compounds. To this end, several analyses were considered. RMSD analysis was carried out to gain insight into the stability of the interactions between the protein and the designed compounds. In this study, the RMSD of the Cα atoms of the protein backbone was analyzed to determine the stability of the apo form of NS5A and of its complexes with the reference compound A and the selected designed compounds (A6, A16, and A17) (Fig. 4). The RMSD of the apo form of NS5A remains within 0.3–0.5 nm up to 95 ns, after which it stabilizes at about 0.4 nm until the end of the simulation. For the NS5A-A complex, the RMSD is maintained between 0.3 and 0.4 nm from 25 ns to the end of the simulation. The NS5A-A6 and NS5A-A16 complexes have higher average RMSD values over the entire simulation (Table 4) than the apo form of NS5A and the NS5A-A complex. The RMSD of the NS5A-A6 complex increases to 0.45–0.6 nm after 25 ns and remains in that range until 140 ns before settling at 0.4–0.45 nm for the final 60 ns of the simulation. Meanwhile, the NS5A-A16 complex exhibits a gradual increase in RMSD during the initial 60 ns, reaching up to 0.5 nm, and thereafter remains at approximately 0.45–0.55 nm until the end of the simulation, indicating a slightly higher but stable fluctuation range. In contrast, the NS5A-A17 complex shows the smallest deviation, with an average RMSD (0.331 ± 0.043 nm) below those of the apo form of NS5A (0.387 ± 0.037 nm) and the NS5A complex with reference compound A (0.405 ± 0.058 nm).
Fig. 4.
Time-dependent RMSD of the Cα backbone of NS5A-Apo, NS5A-A, NS5A-A6, NS5A-A16, and NS5A-A17. (Figure created using QtGrace version 0.2.6; https://sourceforge.net/projects/qtgrace).
Table 4.
The calculated average parameters for all the systems throughout 200 ns MD simulation run.
| Systems | RMSD (nm) | RMSF (nm) | Rg (nm) | SASA (nm2) |
|---|---|---|---|---|
| Apo | 0.387 ± 0.037 | 0.111 ± 0.060 | 2.011 ± 0.021 | 167.103 ± 7.613 |
| A | 0.405 ± 0.058 | 0.159 ± 0.075 | 2.025 ± 0.079 | 168.254 ± 9.475 |
| A6 | 0.468 ± 0.091 | 0.157 ± 0.068 | 2.074 ± 0.025 | 176.743 ± 7.532 |
| A16 | 0.454 ± 0.089 | 0.153 ± 0.068 | 2.069 ± 0.079 | 172.914 ± 7.443 |
| A17 | 0.331 ± 0.043 | 0.113 ± 0.049 | 2.047 ± 0.077 | 168.696 ± 6.581 |
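As an illustration of how a single RMSD value of the kind averaged in Table 4 can be obtained, the following minimal Python sketch computes the RMSD between two sets of Cα coordinates after optimal rigid-body superposition (Kabsch algorithm). It is not the tool used in this study: the function name `kabsch_rmsd` and the toy coordinates are hypothetical, NumPy is assumed to be available, and all atoms are weighted equally, which may differ from the mass-weighted defaults of MD analysis programs.

```python
import numpy as np


def kabsch_rmsd(coords, ref):
    """RMSD (input units) of coords after optimal superposition onto ref.

    Both arguments are arrays of shape (n_atoms, 3) with matching atom order.
    """
    coords = np.asarray(coords, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Remove translation by centering both coordinate sets.
    p = coords - coords.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rot @ p.T).T - q
    return float(np.sqrt((diff ** 2).sum() / len(q)))


# Toy example (hypothetical coordinates in nm): a rotated and shifted copy
# of the reference should give an RMSD close to zero after fitting.
ref = np.array([[0.00, 0.00, 0.0], [0.38, 0.00, 0.0], [0.38, 0.38, 0.0],
                [0.00, 0.38, 0.2], [0.20, 0.10, 0.4]])
theta = np.deg2rad(30.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([1.0, -2.0, 0.5])
print(f"RMSD after fitting: {kabsch_rmsd(moved, ref):.6f} nm")
```

Applying such a function to every frame of a trajectory against the starting structure gives a time series like those in Fig. 4, and its mean and standard deviation give entries of the form reported in Table 4.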
To further investigate the ligand stability within the binding site of NS5A, the RMSD values of the ligands were monitored (Fig. 5). These trajectories provide information about the conformation and orientation of the simulated ligands within the binding site of NS5A throughout the simulation period. The RMSD plot shows that compound A17 exhibits the most consistent and stable behavior, with the least variation throughout the 200 ns MD simulation. In contrast, the reference compound A shows higher RMSD values initially, indicating greater mobility within the binding site, before stabilizing toward the end of the simulation. Compounds A6 and A16 showed a progressive increase in RMSD within the first 70 ns and 40 ns, respectively, and then remained stable between 0.5 and 0.7 nm throughout the remainder of the simulation.
Fig. 5.

RMSD of the ligands (reference compound A and designed compounds A6, A16 and A17). (Figure created using QtGrace version 0.2.6; https://sourceforge.net/projects/qtgrace).
The trajectory images of the NS5A complexes with the designed compounds at 0, 100, and 200 ns (Fig. 6) visually confirm the RMSD results. These snapshots show that the main conformational changes within the binding site occurred during the first 100 ns, especially for compounds A, A6 and A16. By 200 ns, visual analysis confirms the data suggested by the RMSD diagrams: Compounds A, A6 and A16 appear to have reached a stable conformation, while compound A17 shows a consistent positioning indicating a high stability throughout the simulation period. This visual evidence is consistent with the numerical RMSD data and supports our conclusions regarding the stability and conformational dynamics of the ligands within the NS5A binding site.
Fig. 6.
Trajectory images of the complexes NS5A and the designed compounds at different timelines (0, 100, and 200 ns). (Figure created using PyMOL Molecular Graphics System version 2.5.7; Schrödinger, LLC; https://pymol.org).
To understand how ligand binding affects the structural dynamics of the protein, the RMSF was calculated for the NS5A apo form and for the complexes of NS5A with the designed compounds. RMSF quantifies the flexibility of individual residues: higher values indicate more flexible regions such as coils and loops, whereas lower values indicate more rigid secondary structures such as helices or sheets. In general, lower RMSF values upon ligand binding are considered favorable because they indicate less flexibility and suggest that the secondary structure is preserved. Figure 7 presents the RMSF values of the Cα atoms as a function of residue number, which allows the local dynamics of the NS5A residues to be examined after introduction of the designed compounds. The RMSF plots show that the residues involved in the interaction with the designed compounds fluctuate considerably less than the other residues, indicating that the binding sites in all complexes remain structurally stable throughout the simulation. The NS5A-A6 and NS5A-A16 complexes have similar average RMSF values (0.157 ± 0.068 nm and 0.153 ± 0.068 nm, respectively), which are marginally lower than that of the complex with the reference compound A (NS5A-A, 0.159 ± 0.075 nm). The NS5A-A17 complex has the lowest average value among the complexes (0.113 ± 0.049 nm), close to that of the apo form (0.111 ± 0.060 nm; Table 4), indicating the highest stability among the designed compounds. This analysis illustrates the different effects of compound binding on the structural flexibility of NS5A residues and highlights the stabilizing effect of the designed compounds on the binding site of NS5A.
Fig. 7.
The RMSF for Cα atoms of NS5A-Apo, NS5A-A, NS5A-A6, NS5A-A16, and NS5A-A17. (Figure created using QtGrace version 0.2.6; https://sourceforge.net/projects/qtgrace).
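The per-residue RMSF described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this study: the function name `rmsf` and the toy trajectory are hypothetical, NumPy is assumed to be available, and the frames are assumed to have already been superimposed on a common reference so that overall rotation and translation do not contribute.

```python
import numpy as np


def rmsf(traj):
    """Per-atom RMSF from a trajectory array of shape (n_frames, n_atoms, 3)."""
    traj = np.asarray(traj, dtype=float)
    # Time-averaged position of every atom.
    mean_pos = traj.mean(axis=0)
    # Squared distance from the average position, per frame and per atom.
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)
    # Root of the time average gives one value per atom.
    return np.sqrt(sq_dev.mean(axis=0))


# Toy trajectory (hypothetical, in nm): atom 0 is fixed, atom 1 oscillates
# by 0.1 nm around its mean position along x.
toy_traj = np.array([
    [[0.0, 0.0, 0.0], [0.9, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0]],
])
print(rmsf(toy_traj))
```

With one Cα atom per residue, the returned array corresponds to one curve of the kind plotted in Fig. 7, and its mean and standard deviation over residues correspond to the form of the RMSF column in Table 4.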
Following the estimation of the RMSF, the assessments of the Rg and the SASA were carried out. The Rg value serves as an indicator of the stability of NS5A and its complexes with the designed compounds across the MD trajectory by measuring the degree of compactness of the structure. The SASA value, on the other hand, is a critical metric for determining the surface area of the protein that is accessible to solvent molecules, thus helping to predict the extent of conformational shifts that NS5A undergoes when it binds to the designed compounds. Figure 8 shows the Rg plots representing the observed changes in the conformational behavior of the apo-NS5A and NS5A-designed compound complexes for 200 ns MD simulations. The average Rg values estimated for the apo form of NS5A and the NS5A-A complex were 2.011 ± 0.021 and 2.025 ± 0.079 nm, respectively. The compound interactions lead to a slight increase in Rg for complexes A6 and A16, with the highest value for A6 at 2.074 ± 0.025 nm, indicating a modest extension of the structure. However, complex NS5A-A17 closely mirrors the apo form of NS5A with an Rg of 2.047 ± 0.077 nm, indicating a minimal effect on compactness and a stable compound-protein interaction. Overall, the Rg plots suggest that the binding of the designed compounds slightly alters the compactness of NS5A but maintains its overall conformational stability.
Fig. 8.
Plot of Rg vs. time for the Cα backbone of NS5A-Apo, NS5A-A, NS5A-A6, NS5A-A16, and NS5A-A17. (Figure created using QtGrace version 0.2.6; https://sourceforge.net/projects/qtgrace).
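For readers less familiar with the compactness measure used above, the following Python sketch computes a mass-weighted radius of gyration for a single frame. It is a generic illustration, not the analysis tool used in this study: the function name `radius_of_gyration` and the toy coordinates and masses are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords has shape (n_atoms, 3); masses has shape (n_atoms,).
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Center of mass of the selected atoms.
    com = (masses[:, None] * coords).sum(axis=0) / masses.sum()
    # Mass-weighted mean squared distance from the center of mass.
    sq_dist = ((coords - com) ** 2).sum(axis=1)
    return float(np.sqrt((masses * sq_dist).sum() / masses.sum()))


# Toy example (hypothetical): two equal masses placed 2 nm apart.
toy_coords = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
toy_masses = np.array([12.011, 12.011])
print(f"Rg = {radius_of_gyration(toy_coords, toy_masses):.3f} nm")
```

Evaluating this quantity frame by frame gives a curve of the kind shown in Fig. 8; a larger value corresponds to a more extended, less compact structure.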
The SASA plots are shown in Fig. 9, and together with the corresponding average SASA values (Table 4) they provide an overview of the solvent exposure of the apo form of NS5A and its complexes with the designed compounds during the simulation. For the apo form of NS5A, the SASA values remain uniform (167.103 ± 7.613 nm2), indicating a stable interaction with the solvent throughout the simulation trajectory. The presence of compound A slightly increases the average SASA to 168.254 ± 9.475 nm2, indicating slight conformational changes upon ligand binding. The NS5A-A6 and NS5A-A16 complexes show larger increases in average SASA, to 176.743 ± 7.532 and 172.914 ± 7.443 nm2, respectively, indicating a more extended protein structure with increased solvent interaction and possibly a more flexible, solvent-exposed state. Conversely, the NS5A-A17 complex shows SASA values (168.696 ± 6.581 nm2) very similar to those of the apo form of NS5A. This suggests a conformation that largely retains that of the apo form, indicating a more compact and stable compound–NS5A complex.
Fig. 9.
The SASA profile of NS5A-Apo, NS5A-A, NS5A-A6, NS5A-A16, and NS5A-A17. (Figure created using QtGrace version 0.2.6; https://sourceforge.net/projects/qtgrace).
Binding free energy profiles of the selected hits
MMGBSA calculations were performed to estimate the binding free energies over the entire molecular dynamics (MD) trajectory. The detailed results are summarized in Table 5, which reports the binding free energies of the designed ligands A6, A16, and A17 alongside those of the reference compound A. Compound A17 shows the most favorable van der Waals (− 62.18 ± 0.77 kcal/mol) and electrostatic (− 26.07 ± 0.89 kcal/mol) contributions among all compounds. Consistent with these strong interactions, A17 also shows the most stable behavior, with the least variation throughout the 200 ns MD simulation, as supported by the RMSD plot. In contrast, compounds A6 and A16 exhibit less favorable total binding free energies (− 27.63 ± 0.47 and − 30.3 ± 0.4 kcal/mol, respectively) than the reference compound A (− 35.15 ± 0.45 kcal/mol), together with greater variation in stability. The lowest total binding free energy, obtained for compound A17 (− 45.88 ± 0.85 kcal/mol), highlights its potential as a potent NS5A inhibitor, surpassing the reference compound A in both predicted binding affinity and stability. Overall, A17 emerges as the most promising ligand, making it a strong candidate for further development in NS5A inhibition.
Table 5.
Detailed binding free energy calculated by MM/GBSA for all complexes. All the values are given in kcal/mol.
| Systems | NS5A-A | NS5A-A6 | NS5A-A16 | NS5A-A17 |
|---|---|---|---|---|
| ΔEvdw | − 51.53 ± 0.56 | − 46.66 ± 0.53 | − 46.26 ± 0.44 | − 62.18 ± 0.77 |
| ΔEele | − 19.87 ± 0.59 | − 14.78 ± 0.65 | − 11.47 ± 0.72 | − 26.07 ± 0.89 |
| ΔEGB | 43.05 ± 0.16 | 39.88 ± 0.77 | 33.46 ± 0.72 | 50.39 ± 0.85 |
| ΔEsurf | − 6.8 ± 0.07 | − 6.07 ± 0.07 | − 6.03 ± 0.06 | − 8.01 ± 0.09 |
| ΔGgas | − 71.4 ± 0.9 | − 61.44 ± 1.05 | − 57.73 ± 0.94 | − 88.25 ± 1.55 |
| ΔGsolv | 36.25 ± 0.56 | 33.82 ± 0.72 | 27.43 ± 0.64 | 42.37 ± 0.77 |
| Δtotal | − 35.15 ± 0.45 | − 27.63 ± 0.47 | − 30.3 ± 0.4 | − 45.88 ± 0.85 |
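The way the terms in Table 5 combine can be checked with a short Python sketch. It assumes the usual MM/GBSA bookkeeping, in which the gas-phase term is the sum of the van der Waals and electrostatic energies, the solvation term is the sum of the GB and surface contributions, and the total is the sum of the two; the dictionary keys and the function name `recombine` are hypothetical, and only the mean values from Table 5 are used (uncertainties are ignored).

```python
# Mean energy terms (kcal/mol) copied from Table 5.
terms = {
    "NS5A-A":   {"vdw": -51.53, "ele": -19.87, "gb": 43.05, "surf": -6.80, "total": -35.15},
    "NS5A-A6":  {"vdw": -46.66, "ele": -14.78, "gb": 39.88, "surf": -6.07, "total": -27.63},
    "NS5A-A16": {"vdw": -46.26, "ele": -11.47, "gb": 33.46, "surf": -6.03, "total": -30.30},
    "NS5A-A17": {"vdw": -62.18, "ele": -26.07, "gb": 50.39, "surf": -8.01, "total": -45.88},
}


def recombine(t):
    """Return (gas, solvation, total) rebuilt from the individual terms."""
    gas = t["vdw"] + t["ele"]    # assumed: dG_gas = dE_vdw + dE_ele
    solv = t["gb"] + t["surf"]   # assumed: dG_solv = dE_GB + dE_surf
    return gas, solv, gas + solv


for name, t in terms.items():
    gas, solv, total = recombine(t)
    print(f"{name}: gas={gas:.2f} solv={solv:.2f} "
          f"total={total:.2f} (reported {t['total']:.2f})")

# Complex with the most negative reported total binding free energy.
best = min(terms, key=lambda name: terms[name]["total"])
print("Most favorable:", best)
```

Under these assumptions the recombined totals agree with the reported values to within rounding, and the comparison makes explicit that the favorable gas-phase interactions of A17 are only partly offset by its larger solvation penalty.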
Pharmacokinetic and toxicity predictions
The evaluation of the pharmacokinetic and toxicity properties of potential drugs is crucial in the early steps of the drug discovery process. A high-quality therapeutic agent must not only be effective against the desired target but also have a suitable profile at therapeutic dosages to ensure both efficacy and safety. Early assessment of these properties is critical to minimize the likelihood of failure in the later stages of development. The results summarized in Table 6 compare the pharmacokinetic and toxicological properties of the designed compounds (A6, A16, and A17) and the reference compound (A). All compounds were predicted to be absorbed in the human intestine (HIA), indicating their potential for adequate oral bioavailability. However, differences in the predicted ability to cross the blood–brain barrier (BBB) were observed: values for the designed compounds ranged from − 1.873 to − 1.822, compared with − 1.775 for the reference compound A, indicating varying degrees of central nervous system (CNS) penetration. In addition, all compounds were predicted to be substrates of P-glycoprotein. Regarding drug metabolism, none of the designed compounds was predicted to inhibit CYP1A2, CYP2D6, CYP2C9, or CYP3A4, minimizing the risk of metabolic interactions with these enzymes, whereas the reference compound A was predicted to inhibit CYP2C9. Only compound A17 was identified as a CYP2C19 inhibitor, an interaction that needs to be carefully considered when it is co-administered with drugs metabolized by this enzyme. In addition, total clearance values varied: among the designed compounds, A17 showed the highest clearance (− 0.364), followed by A6 and A16 (both − 0.757), suggesting possible differences in the metabolic pathways of the compounds. In terms of toxicological properties, all compounds were classified as presenting no risk of mutagenic, tumorigenic, irritant, or reproductive effects, indicating a favorable preliminary safety profile.
These results provide valuable insights into the comparative pharmacokinetic and toxicological properties of the designed compounds and facilitate informed decisions for further optimization and development towards therapeutic applications.
Table 6.
Pharmacokinetic and toxicity properties of the designed molecules and the lead compound.
| Compound | A | A6 | A16 | A17 |
|---|---|---|---|---|
| Pharmacokinetic properties | ||||
| HIA | Yes | Yes | Yes | Yes |
| BBB | − 1.775 | − 1.873 | − 1.873 | − 1.822 |
| P-glycoprotein substrate | Yes | Yes | Yes | Yes |
| CYP1A2 inhibitor | No | No | No | No |
| CYP2D6 inhibitor | No | No | No | No |
| CYP2C9 inhibitor | Yes | No | No | No |
| CYP3A4 inhibitor | No | No | No | No |
| CYP2C19 inhibitor | No | No | No | Yes |
| Total Clearance | − 0.187 | − 0.757 | − 0.757 | − 0.364 |
| Toxicological properties | ||||
| Mutagenic | NR | NR | NR | NR |
| Tumorigenic | NR | NR | NR | NR |
| Irritant | NR | NR | NR | NR |
| Reproductive effect | NR | NR | NR | NR |
NR no risk.
Conclusion
This study employed a comprehensive and integrated approach to address the significant global health challenge posed by HCV. The main goal was to identify and develop novel inhibitors targeting NS5A, a critical protein involved in RNA replication, viral assembly/release, and modulation of the host immune response. Robust QSAR models were constructed to reveal key structural features influencing inhibitory activity against NS5A, providing valuable insights for the development of novel inhibitors. Overall, three of the designed compounds (A6, A16, and A17) demonstrated good conformational and structural stability as well as favorable binding profiles, with predicted EC50 values of 0.0006, 0.0003, and 0.0002 nM, respectively. Additionally, these compounds exhibited non-toxic profiles and met pharmacological safety standards, underscoring their potential as therapeutic agents for HCV treatment. The amalgamation of computational modeling, molecular docking, molecular dynamics simulations, and ADMET analysis in this study offers a comprehensive and detailed assessment of the inhibitor candidates. These findings not only contribute valuable insights into the potential efficacy, stability, and safety of the identified inhibitors but also serve as a roadmap for their subsequent optimization and development as promising therapeutic agents in the ongoing battle against HCV. However, it is necessary to validate these compounds experimentally (in vitro and in vivo assays) to confirm their efficacy and safety.
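The predicted EC50 values quoted above follow from the pEC50 values in Table 3 if pEC50 is taken as the negative base-10 logarithm of EC50 expressed in mol/L. That definition is the usual convention and is assumed here rather than restated from the modeling section; the function name `pec50_to_ec50_nm` is hypothetical. A minimal Python sketch of the conversion:

```python
def pec50_to_ec50_nm(pec50):
    """Convert pEC50 to EC50 in nM, assuming pEC50 = -log10(EC50 in mol/L)."""
    ec50_molar = 10.0 ** (-pec50)
    return ec50_molar * 1e9  # 1 mol/L = 1e9 nM


# Predicted pEC50 values from Table 3.
predicted_pec50 = {"A": 10.53, "A6": 12.22, "A16": 12.57, "A17": 12.68}

for name, pec50 in predicted_pec50.items():
    print(f"{name}: pEC50 = {pec50:.2f} -> EC50 = {pec50_to_ec50_nm(pec50):.4f} nM")
```

Rounded to four decimal places, the designed compounds give 0.0006, 0.0003, and 0.0002 nM, in agreement with the values stated in the Conclusion.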
Author contributions
W. Liman: conceptualization, data curation, formal analysis, investigation, methodology, resources, validation, visualization, writing—original draft. M. Oubahmane: formal analysis, investigation, methodology, validation. N. Ait Lahcen: writing—review and editing, formal analysis, investigation, methodology. I. Hdoufane: methodology, writing—review and editing. D. Cherqaoui: supervision, validation, writing—review and editing. R. Daoud: supervision, writing—reviewing and editing. A. El Allali: supervision, validation, writing—review and editing.
Data availability
All data generated or analysed during this study are included in this published article and its supplementary information files.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-024-80082-1.