Abstract

The Automated Ligand Searcher (ALISE) is designed as an automated computational drug discovery tool. To approximate the binding free energy of ligands to a receptor, ALISE includes a three-stage workflow, with each stage involving an increasingly sophisticated computational method: molecular docking, molecular dynamics, and free energy perturbation, respectively. To narrow the number of potential ligands, poorly performing ligands are gradually segregated out. The performance and usability of ALISE are benchmarked for a case study containing known active ligands and decoys for the HIV protease. The example illustrates that ALISE filters the decoys successfully and demonstrates that the automation, comprehensiveness, and user-friendliness of the software make it a valuable tool for improved and faster drug development workflows.
Introduction
The emergence of deadly diseases and related epidemics has been a recurring phenomenon throughout recorded time. However, with the rapid increase in human population during the last couple of centuries and the resulting increased human–human and human–animal proximity, epidemics have become increasingly frequent.1 Furthermore, due to international travel, diseases easily spread across countries, leading to global pandemics such as the HIV pandemic and the recent coronavirus disease 2019 (COVID-19) pandemic. Disease outbreaks have huge costs in human lives and for the global economy. For example, COVID-19 is to date responsible for a death toll of more than 6.887 million people2 and is expected to cut the world’s GDP tremendously.3,4 Therefore, it is of critical importance that the current protocols used to combat diseases, for example, the development of testing methods, vaccines, and drugs, are improved and made faster.
The present paper focuses on suggesting improvements in drug discovery methods. The goal in drug discovery is to find a molecule (ligand) that binds to a target (receptor), typically a protein, and through the ligand–receptor interactions, blocks or modulates a particular biomolecular mechanism or pathway.5,6 In the case of infectious diseases, typically, mechanisms that are vital for the pathogen’s survival or reproduction are targeted.
The average cost of bringing one drug to market is estimated to be ∼1 billion USD.7 Clinical trials are the most cost-intensive parts of drug development (∼20%)8,9 due to toxicity tests. Furthermore, it has been estimated that only 0.01% of molecules synthesized or isolated during drug development are approved as pharmaceutical drugs. Hence, the large cost of developing a new drug is mainly due to the money and time spent on investigating unsuccessful candidates.10
To improve the identification of poor drug candidates early in the drug development process and, hence, optimize the efficiency of discovering a successful drug, it has become common practice to apply computational methods as part of the initial drug discovery.11 Molecular docking (docking) is a well-established method within computer-aided drug design6,12,13 that identifies the optimal binding pose of a ligand in a receptor and assigns to it a score based on its computed binding affinity. Existing computational docking programs screen thousands of ligands rapidly, but at the cost of a rather crude level of modeling. The most severe limitations are that only a limited number of chemical bonds in the ligands are flexible, and the receptor is often modeled as static or with only a few flexible residues.6,14 This highly constrained conformational search space may lead to a poor estimation of binding affinities. To improve computer-aided drug design, more refined methods need to be applied.15
In this paper, we present the three-stage Automated Ligand Searcher (ALISE) program, which, by applying increasingly refined computational methods in each stage, narrows down the number of potential drug candidates. The first stage applies the arguably primitive but fast docking method to discard all but the most promising ligands. It is important to note that since all ligands are initially screened in the docking stage, the overall tool is highly sensitive to the result of the docking algorithm. For the case study at hand, only 100 ligands were calculated in the molecular dynamics (MD) stage, which represents less than 0.3% of the complete ligand set. The remaining ligands proceed to the second stage, in which MD simulations are performed to obtain detailed insight into the interactions between the ligands and the receptor. Using binding free energy estimates obtained from MD simulations, the ligands with the strongest apparent binding free energies enter the third stage, where even more refined binding free energy estimates are obtained through advanced free energy perturbation (FEP) simulations.
To make the advantages of ALISE as accessible as possible, ALISE is implemented as a computational task in the versatile Scandinavian Online Kit for Nanoscale Modeling (https://viking-suite.com/VIKING) web platform16 which provides standardized step-by-step workflows for setting up and linking various computational modeling techniques, for example, MD simulations, FEP simulations, various quantum mechanical (QM) calculations, and so forth.16−20 Furthermore, VIKING connects to supercomputing clusters and allows one to run any task seamlessly on high-performance computing resources.
An earlier prototype version of ALISE has been successfully employed to study ligand–receptor binding in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).21 Additionally, the ALISE framework can be used to structurally compare different receptor models and their ligand affinity. In the following, the theory and methods that form the basis of each stage in ALISE are outlined, and the ALISE workflow itself is presented. Furthermore, the capabilities of ALISE are demonstrated using the HIV protease, for which a benchmarking set of active and decoy ligands is provided by the DUD-E database.22 The ability of ALISE to find a small number of active ligands among the vast number of decoys is quantified.
Methodology
From a collection of drug candidates, ALISE identifies the best candidates targeting a specific receptor through three consecutive stages: molecular docking, MD, and FEP. Figure 1 provides a schematic representation of how increasingly sophisticated modeling methods are employed in each stage to compute the binding free energies of putative ligand–receptor complexes. The estimated binding free energy of a ligand is referred to as its score, and separate molecular docking, MD, and FEP scores are obtained in their respective stages. At the end of each stage, only the ligands with the best scores are selected to proceed to the next stage. This gradual segregation of ligands in each stage ensures that computational resources are not wasted by applying more computationally expensive methods to study poor drug candidates. The basic theory and methods of each of the three stages in ALISE are presented in the following sections.
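The staged selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ALISE implementation: the function `run_funnel`, the stage tuples, and all scores are hypothetical, and each scorer stands in for a docking, MD, or FEP calculation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A scorer maps a ligand ID to an estimated binding free energy (hypothetical interface).
Scorer = Callable[[str], float]


def run_funnel(ligands: Sequence[str],
               stages: Sequence[Tuple[str, Scorer, int]]
               ) -> Dict[str, List[Tuple[str, float]]]:
    """Apply each stage in order and pass only the best-scoring ligands on."""
    results: Dict[str, List[Tuple[str, float]]] = {}
    survivors = list(ligands)
    for name, scorer, keep in stages:
        # Lower (more negative) free energy estimates indicate stronger binding.
        ranked = sorted(((lig, scorer(lig)) for lig in survivors),
                        key=lambda pair: pair[1])
        results[name] = ranked
        survivors = [lig for lig, _ in ranked[:keep]]
    return results


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    toy = {"lig_a": -9.1, "lig_b": -7.4, "lig_c": -8.8, "lig_d": -5.0}
    stages = [
        ("docking", lambda lig: toy[lig], 3),
        ("md", lambda lig: 4.0 * toy[lig], 2),
        ("fep", lambda lig: 3.5 * toy[lig], 1),
    ]
    for stage, ranked in run_funnel(list(toy), stages).items():
        print(stage, [lig for lig, _ in ranked])
```

Because every stage only sees the survivors of the previous one, the expensive scorers are called on a shrinking list, which is the resource argument made in the text.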
Figure 1.

ALISE applies three consecutive stages, the molecular docking, MD, and FEP stages, to narrow and rank a list of potential drug candidates targeting a specified receptor. Each stage applies the increasingly sophisticated computational methods of molecular docking, MD simulations, and FEP simulations to estimate the binding free energy of the ligands to the receptor. To optimize the use of computational resources, only the best-ranked ligands from each stage proceed to the next, more sophisticated computational stage, and the list of ligands is gradually narrowed down to only the very best drug candidates.
Molecular Docking
Molecular docking is a modeling method utilized to determine the optimal binding pose, that is, the position, conformation, and orientation of a ligand at the binding site of a receptor, along with a docking score that measures the binding affinity of the ligand to the receptor in the determined binding pose.14,23 A typical docking software comprises two primary components: a search algorithm and a scoring function. The search algorithm moves the ligand through the conformational space defined by the degrees of freedom of the modeled system, while the scoring function ranks the docked configurations, typically based on an estimate of the free energy of the ligand binding to the receptor.24
The docking procedure in ALISE is carried out using the VinaMPI software,25 which is an extension of the AutoDock Vina (Vina) software with the addition of parallel MPI support for execution on supercomputers. Vina applies a hybrid scoring function C, derived from the empirical interatomic potentials,14,23 defined as
$$C = \sum_{i<j} f_{t_i t_j}(r_{ij}) \tag{1}$$
where the summation is performed over all pairs of atoms whose interatomic distance, $r_{ij}$, can vary.23 The function $f_{t_1 t_2}(r_{12})$ describes the interaction between a pair of atoms of type $t_1$ and $t_2$, with the separation distance $r_{12}$, as a weighted sum of steric, hydrophobic, and hydrogen bonding interaction terms, as described by Trott and Olson.23 The molecular docking procedure aims to find the global minimum of $C$, corresponding to the optimal binding pose of the ligand, which in Vina is accomplished by an iterated local search global optimizer26 which employs the Broyden–Fletcher–Goldfarb–Shanno method27 for local minimization.
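The structure of such a pairwise scoring function can be illustrated with a short sketch. The pair term below is a hypothetical stand-in chosen only to show the summation. It is not the weighted steric, hydrophobic, and hydrogen bonding terms of Vina, and for brevity only ligand–receptor pairs are summed, assuming a rigid receptor and ignoring intraligand pairs.

```python
import math
from typing import Callable, Sequence, Tuple

# (atom type, Cartesian position in angstrom); a toy representation.
Atom = Tuple[str, Tuple[float, float, float]]


def toy_pair_term(t1: str, t2: str, r: float) -> float:
    """Hypothetical stand-in for f_{t1 t2}(r); NOT the Vina interaction terms."""
    attraction = -math.exp(-((r - 4.0) / 1.5) ** 2)  # shallow well near 4 angstrom
    repulsion = (3.0 - r) ** 2 if r < 3.0 else 0.0   # penalty for close contacts
    weight = 1.5 if t1 == t2 else 1.0                # arbitrary type dependence
    return weight * (attraction + repulsion)


def score(ligand: Sequence[Atom], receptor: Sequence[Atom],
          pair_term: Callable[[str, str, float], float] = toy_pair_term) -> float:
    """Sum the pair term over all ligand-receptor atom pairs."""
    total = 0.0
    for t_lig, p_lig in ligand:
        for t_rec, p_rec in receptor:
            total += pair_term(t_lig, t_rec, math.dist(p_lig, p_rec))
    return total


if __name__ == "__main__":
    ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
    receptor = [("C", (4.0, 0.0, 0.0)), ("N", (0.0, 4.5, 0.0))]
    print(f"toy score: {score(ligand, receptor):.3f}")
```

A docking search would evaluate a function of this shape for many trial poses and keep the pose with the lowest value.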
To reduce the computational resource demand of searching the conformational space, only a restricted volume of the receptor, called the search space, is explored. The search space must be sufficiently large to cover the entire active site of the receptor without including unnecessary parts of the protein. Therefore, defining an appropriate search space becomes a mandatory part of the molecular docking procedure; a too-large search space wastes computational resources, while a too-small search space might exclude important parts of the conformational space.
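As a rough illustration of the trade-off, a search box can be derived from the coordinates of the atoms lining the active site plus a margin. The sketch below is an assumption-laden example: the function `search_box` and the default padding of 4 Å are hypothetical and do not describe how ALISE or AutoLigand choose the box.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def search_box(site_atoms: Iterable[Point],
               padding: float = 4.0) -> Tuple[Point, Point]:
    """Return (center, size) of an axis-aligned box around the given atoms.

    The padding (in angstrom) is added on every side; its value is an
    arbitrary illustrative choice.
    """
    points = list(site_atoms)
    if not points:
        raise ValueError("at least one atom position is required")
    lows = [min(p[k] for p in points) for k in range(3)]
    highs = [max(p[k] for p in points) for k in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(lows, highs))
    size = tuple((hi - lo) + 2.0 * padding for lo, hi in zip(lows, highs))
    return center, size


if __name__ == "__main__":
    site = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0), (5.0, 1.0, 1.0)]
    center, size = search_box(site, padding=1.0)
    print("center:", center, "size:", size)
```

The resulting center and size correspond to the kind of values that AutoDock Vina expects in its `center_x`, `center_y`, `center_z` and `size_x`, `size_y`, `size_z` settings.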
Molecular Dynamics
The second stage of ALISE determines the binding free energy of the investigated ligands based on equilibrium all-atom MD simulations performed using NAMD.28,29 In contrast to molecular docking, all parts of the considered systems are allowed to move during the MD stage. The atomic motions are obtained by integrating Newton’s second law for each atom with interatomic forces derived from molecular mechanics (MM) force fields obtained from experimental studies and QM modeling.30 Therefore, the MD stage provides a significantly more detailed and realistic description of the ligand–receptor binding process.
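The ligand-binding free energy $G_0$ is given by
$$G_0 = G_{\mathrm{C}} - G_{\mathrm{L}} - G_{\mathrm{R}} \tag{2}$$
where $G_{\mathrm{C}}$, $G_{\mathrm{L}}$, and $G_{\mathrm{R}}$ are the free energies of the ligand–receptor complex (C), ligand (L), and receptor (R), respectively. These energies are computed from the MD simulations as
$$G_X = \langle E_{\mathrm{MM}} \rangle + \langle G_{\mathrm{pol}} \rangle + \langle G_{\mathrm{np}} \rangle - TS \tag{3}$$
where, for each structure $X \in \{\mathrm{C}, \mathrm{L}, \mathrm{R}\}$, $E_{\mathrm{MM}}$ are the respective MM bonding and nonbonding energies, $G_{\mathrm{pol}}$ and $G_{\mathrm{np}}$ are the polar and nonpolar contributions to the solvation free energy of the investigated structure, respectively, and $T$ and $S$ are the temperature and the entropy of the system. $\langle \cdot \rangle$ indicates an average over the respective MD simulation trajectory.31
The first three terms in eq 3 are computed using the molecular mechanics-generalized Born and surface area continuum solvation (MM/GBSA) method.31,32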
The nonpolar contribution, $G_{\mathrm{np}}$, is proportional to the solvent-accessible surface area, $A_{\mathrm{SA}}$, of the investigated structure
$$G_{\mathrm{np}} = \gamma A_{\mathrm{SA}} \tag{4}$$
where $\gamma$ is a surface tension parameter.33,34 The polar contribution, $G_{\mathrm{pol}}$, is computed using the generalized Born (GB) model, originally devised by Still et al.,35 but modified to take into account the ionization of the solvent36−38
$$G_{\mathrm{pol}} = -\frac{1}{8\pi\varepsilon_0} \sum_{i=1}^{N} \sum_{j=1}^{N} D_{ij} \frac{q_i q_j}{g_{ij}} \tag{5}$$
where $\varepsilon_0$ is the vacuum permittivity, $q_i$ and $q_j$ are the atomic partial charges, summations are performed over all $N$ atoms of the different subsystems (L, C, and R),39 and the function $g_{ij}$ is defined as originally suggested by Still et al.35
$$g_{ij} = \sqrt{r_{ij}^2 + \alpha_i \alpha_j \exp\left(-\frac{r_{ij}^2}{4 \alpha_i \alpha_j}\right)} \tag{6}$$
where the effective Born radius of an atom, $\alpha_i$, indicates how deep an atom is buried inside a molecule or a protein;40 the deeper an atom is buried, the less accessible it is to the solvent, resulting in a larger value of $\alpha_i$.39 In ALISE, $\alpha_i$ is computed as proposed by Onufriev et al.38−40 and the calculation of $E_{\mathrm{MM}}$, $G_{\mathrm{pol}}$, and $G_{\mathrm{np}}$ is controlled by the generalized Born implicit solvent (GBIS) functionality provided by NAMD,38 which combines the solvation free energy with the electrostatic energy output.
The term $D_{ij}$ in eq 5 is defined as
$$D_{ij} = 1 - \frac{\exp(-\kappa g_{ij})}{\varepsilon_s} \tag{7}$$
where $\varepsilon_s$ is the dielectric constant of the solvent, and $\kappa^{-1} = \sqrt{\varepsilon_0 \varepsilon_s k_{\mathrm{B}} T / (2 N_{\mathrm{A}} e^2 I)}$ is the Debye screening length, with $k_{\mathrm{B}}$ being the Boltzmann constant, $N_{\mathrm{A}}$ the Avogadro number, $e$ the elementary charge, and $I$ the ion concentration within the GBIS.38 Note that the choice of an implicit solvent model limits accuracy in the MD stage. A hybrid model of explicit and implicit solvents, for example, as proposed by Geist et al.,41 would be beneficial.
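To make the generalized Born expressions more concrete, the sketch below evaluates a salt-screened GB polar energy for a handful of point charges. It is an illustration under explicit assumptions rather than the NAMD GBIS code: the screening factor is taken as 1 − exp(−κg)/ε_s (solute dielectric of 1), the Still pair function is used for g, the effective Born radii are supplied by hand instead of being computed, and the Coulomb constant 332.0636 kcal·Å/(mol·e²) is the conventional value in these units.

```python
import math
from typing import Sequence, Tuple

K_B = 1.380649e-23          # Boltzmann constant, J/K
N_A = 6.02214076e23         # Avogadro number, 1/mol
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS_0 = 8.8541878128e-12    # vacuum permittivity, F/m
COULOMB = 332.0636          # 1/(4*pi*eps_0) in kcal*angstrom/(mol*e^2)


def debye_length(ion_conc_molar: float, eps_s: float = 78.5,
                 temperature: float = 300.0) -> float:
    """Debye screening length in angstrom for a 1:1 salt concentration in mol/L."""
    conc = ion_conc_molar * 1000.0  # mol/L -> mol/m^3
    length_m = math.sqrt(EPS_0 * eps_s * K_B * temperature
                         / (2.0 * N_A * E_CHARGE ** 2 * conc))
    return length_m * 1e10


def g_still(r: float, alpha_i: float, alpha_j: float) -> float:
    """Pair function of Still et al.; reduces to the Born radius for r = 0."""
    return math.sqrt(r * r + alpha_i * alpha_j
                     * math.exp(-r * r / (4.0 * alpha_i * alpha_j)))


def gb_polar_energy(charges: Sequence[float], radii: Sequence[float],
                    coords: Sequence[Tuple[float, float, float]],
                    kappa: float = 0.0, eps_s: float = 78.5) -> float:
    """Polar solvation energy in kcal/mol; kappa is the inverse Debye length (1/angstrom)."""
    total = 0.0
    for q_i, a_i, x_i in zip(charges, radii, coords):
        for q_j, a_j, x_j in zip(charges, radii, coords):
            g = g_still(math.dist(x_i, x_j), a_i, a_j)
            screening = 1.0 - math.exp(-kappa * g) / eps_s
            total += screening * q_i * q_j / g
    return -0.5 * COULOMB * total  # -1/(8*pi*eps_0) = -0.5 * Coulomb constant


if __name__ == "__main__":
    length = debye_length(0.15)
    print(f"Debye length at 0.15 M: {length:.2f} angstrom")
    energy = gb_polar_energy([1.0, -1.0], [2.0, 1.8],
                             [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
                             kappa=1.0 / length)
    print(f"toy GB polar energy: {energy:.2f} kcal/mol")
```

For a single charge without salt, the double sum collapses to the Born expression, which provides a quick sanity check of the prefactor.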
In ALISE, the entropy is calculated using quasiharmonic analysis, which derives from a Gaussian approximation of the coordinates’ probability distributions and their interpretation as quantum harmonic oscillators, allowing the computation of the entropy from a principal component analysis.42−46 Within the outlined approximation, one estimates entropy as42
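$$S = k_{\mathrm{B}} \sum_{i=1}^{3N-6} \left[ \frac{\hbar\omega_i / k_{\mathrm{B}}T}{e^{\hbar\omega_i / k_{\mathrm{B}}T} - 1} - \ln\left(1 - e^{-\hbar\omega_i / k_{\mathrm{B}}T}\right) \right] \tag{8}$$
$$\phantom{S} = k_{\mathrm{B}} \sum_{i=1}^{3N-6} \left[ \frac{\alpha_i}{e^{\alpha_i} - 1} - \ln\left(1 - e^{-\alpha_i}\right) \right] \tag{9}$$
where $\alpha_i = \hbar\omega_i / k_{\mathrm{B}}T$, with $\hbar$ being the reduced Planck constant, and the frequencies $\omega_i$ are obtained from the secular equation
$$\det\left[ M^{1/2} \sigma M^{1/2} - \frac{k_{\mathrm{B}}T}{\omega_i^2} I \right] = 0 \tag{10}$$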
where M1/2 is a diagonal matrix with the square roots of all atomic masses in the system, σ is the variance-covariance matrix of the Cartesian coordinates, calculated from the MD trajectory, and I is the identity matrix. The computed change in entropy upon ligand binding accounts for the total entropy of the ligand and the vibrational entropy of the receptor. The translational and rotational entropies of the receptor are not included since they can be expected to be constant during the binding event as long as the mass of the receptor is large compared to the mass of the ligand. Even if the latter is not the case, the often-desired inhibition of the active site by the ligand is independent of the rotational and translational mobility of the complex as long as the ligand is kept in its pocket. The rotational and translational contributions of the receptor are omitted by performing a coordinate alignment of the α-carbons of the receptor prior to the computation of the variance-covariance matrix.
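The last step of a quasiharmonic analysis, turning frequencies into an entropy, can be sketched as follows. This is a minimal example that assumes the standard quantum harmonic oscillator entropy per mode and that the eigenvalues of the mass-weighted covariance matrix are already available; the diagonalization itself is omitted, and the function names are hypothetical.

```python
import math
from typing import Iterable

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro number, 1/mol


def frequency_from_eigenvalue(eigenvalue: float, temperature: float = 300.0) -> float:
    """Angular frequency (rad/s) from an eigenvalue (kg*m^2) of the mass-weighted
    covariance matrix, using eigenvalue = k_B*T / omega^2."""
    return math.sqrt(K_B * temperature / eigenvalue)


def quasiharmonic_entropy(omegas: Iterable[float], temperature: float = 300.0) -> float:
    """Entropy in J/(mol*K) of independent quantum harmonic oscillators."""
    total = 0.0
    for omega in omegas:
        alpha = HBAR * omega / (K_B * temperature)
        # alpha/(e^alpha - 1) - ln(1 - e^-alpha), written with expm1 for accuracy
        total += alpha / math.expm1(alpha) - math.log(-math.expm1(-alpha))
    return K_B * N_A * total


if __name__ == "__main__":
    # Toy frequencies, invented for illustration only.
    modes = [5.0e12, 1.0e13, 2.0e13]
    print(f"S = {quasiharmonic_entropy(modes):.2f} J/(mol*K)")
```

Soft, low-frequency modes dominate the result, which is why the large-amplitude motions captured by the covariance matrix matter most for the entropy estimate.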
Free Energy Perturbation
In the final stage of ALISE, binding free energy estimates of the most promising ligands are computed by using FEP. FEP is also based on MD simulations, but in addition to simulating a system in only the bound and unbound states, as it is done in the MD stage, several intermediate states are simulated as well.47,48 Each simulated state is identified by a parameter λi ranging from 0 to 1, where i = 1, ..., K such that K is the total number of states. In a so-called forward transformation, the interactions between the ligand and its surroundings are gradually decoupled, described by increasing λi values, until the ligand is effectively annihilated at λK = 1. The free energy difference between each pair of simulated states, ΔGi, is computed by approximating the ensemble average following Zwanzig’s FEP identity49
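$$\Delta G_i = -k_{\mathrm{B}}T \ln \left\langle \exp\left(-\frac{\Delta U_i}{k_{\mathrm{B}}T}\right) \right\rangle_{\lambda_i} \tag{11}$$
by a direct average over simulation frames
$$\Delta G_i \approx -k_{\mathrm{B}}T \ln \left[ \frac{1}{M} \sum_{m=1}^{M} \exp\left(-\frac{\Delta U_i(\mathbf{x}_m)}{k_{\mathrm{B}}T}\right) \right] \tag{12}$$
where $\Delta U_i$ is the potential energy change associated with the change of the λ-parameter from $\lambda_i$ to $\lambda_{i+1}$ for a given set of atomic coordinates in the system, and the sum runs over the $M$ frames $\mathbf{x}_m$ sampled in state $\lambda_i$. A total of $K$ λ steps are performed. The total free energy change, $\Delta G$, corresponding to the ligand annihilation is then obtained by summing over all $\Delta G_i$: $\Delta G = \sum_{i=1}^{K-1} \Delta G_i$. By simulating the transformation of interest in small increments, FEP can, in principle, directly sample all the phase space changes related to the transformation between a bound and an unbound system, including all entropic contributions, and thereby measure the related free energy change. An analogous approach is performed to recouple interactions, defined as a backward transformation.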
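The exponential averaging over frames and the summation over λ windows can be written compactly. The sketch below is a generic implementation of the Zwanzig estimator under the assumption that the per-frame energy differences are already available in kcal/mol; it is not taken from ALISE or NAMD, and the function names are hypothetical. A log-sum-exp shift is used to avoid overflow.

```python
import math
from typing import Sequence

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def window_free_energy(delta_u: Sequence[float], temperature: float = 300.0) -> float:
    """Free energy change of one lambda window from per-frame energy differences."""
    if not delta_u:
        raise ValueError("at least one frame is required")
    kt = K_B_KCAL * temperature
    exponents = [-du / kt for du in delta_u]
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


def total_free_energy(windows: Sequence[Sequence[float]],
                      temperature: float = 300.0) -> float:
    """Sum the window contributions over all neighboring lambda pairs."""
    return sum(window_free_energy(w, temperature) for w in windows)


if __name__ == "__main__":
    # Toy energy differences (kcal/mol), invented for illustration only.
    windows = [[0.8, 1.1, 0.9], [0.4, 0.6, 0.5], [0.1, 0.2, 0.15]]
    print(f"total dG = {total_free_energy(windows):.3f} kcal/mol")
```

Because the exponential average is dominated by frames with low energy differences, each window estimate is never larger than the plain mean of its samples, which is one reason small λ increments are preferred.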
In ALISE, four calculations, indicated by the solid arrows in Figure 2, are performed to obtain the transformation from a bound to an unbound state, from which the binding free energy can be estimated. For the unbound and bound states, two FEP simulations are performed: a forward and backward simulation. In order to keep the ligand in place and to avoid its collision with the receptor while interactions are gradually decoupled during the FEP calculation, several harmonic restraints are introduced. These include distance restraints, which anchor the ligand to selected non-hydrogen atoms of the receptor. Additionally, a restraint on the root-mean-square deviation of the ligand is imposed to keep the ligand from deforming. The restraints have a nonzero contribution to free energy and therefore must be considered in the calculation. Similarly to the forward and backward FEP simulations, the restraints are gradually coupled and decoupled in the bound and unbound configurations.
Figure 2.

Illustration of the FEP cycle that is employed to calculate the binding free energy ΔG0 between the receptor (large shape) and the ligand (small shape). The existence of restraints on the ligand is represented by a dotted box around the ligand. When the ligand is colored solid black, it interacts with its environment, while the striped pattern indicates the absence of interactions with the environment. The FEP calculations are performed such that the forward perturbation is executed along the direction of the arrows.
Since the ligand is not interacting with its surroundings in the lower part of the thermodynamic cycle in Figure 2, the cycle can be closed by setting ΔG3 = 0, as there is no difference in the free energy of the ligand embedded in water or the receptor. The binding free energy thus becomes
| 13 |
Computational Realization of ALISE
The web platform VIKING16 offers access to the ALISE software, which provides a user-friendly workflow interface similar to other computational tasks in VIKING.17−20 In order to set up a virtual screening experiment, the user is prompted to provide specific information. Steps A-C, as shown in Figure 3, require the following information:
Figure 3.
User interface of ALISE. (A,B) Molecular structures for the receptor and ligands can be uploaded or fetched from online databases.52−54 (C) The search space used in the docking stage can be defined manually or automatically based on the AutoLigand software.55 (D) On a summary page, all settings related to the docking and simulations performed during the virtual screening can be reviewed and modified. (E) After pressing “Run task”, the current status of the task can be monitored on the overview page. (F) As each stage completes, a result page will become available with a ranked list of the ligands based on their binding free energy estimates from that stage, along with graphical presentations of the modeled binding modes.
(A) Target receptor: The user can upload a molecular structure file of the receptor or provide a Protein Data Bank (PDB) ID to automatically search for the receptor in the database.
(B) Ligands: Potential ligands can be uploaded as molecular structure files, or the system can retrieve them automatically from the PubChem database52,53 through a chemical similarity search. The similarity is assessed through the Tanimoto score (see Supporting Information: Chemical Similarity Search on PubChem). Additional screening for possible toxicity can be carried out by utilizing toolkits21,50 that calculate the quantitative estimate of drug likeness51 and uploading their output as a list of mol2 files.
(C) Search space: The user can either manually set the dimensions and location of the search space or use a graphical user interface in VIKING to drag a search space box. Alternatively, ALISE can automatically determine the search space using the AutoLigand software55 (see Supporting Information: AutoLigand).
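The Tanimoto score mentioned in step B compares two molecular fingerprints by the ratio of shared to total set bits. A minimal sketch follows, assuming fingerprints are represented as sets of bit indices; the bit values are toy data and do not correspond to actual PubChem fingerprints.

```python
from typing import AbstractSet


def tanimoto(fp_a: AbstractSet[int], fp_b: AbstractSet[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(union)


if __name__ == "__main__":
    # Toy fingerprints, invented for illustration only.
    query = {1, 4, 7, 9, 15}
    candidate = {1, 4, 8, 9, 21}
    print(f"Tanimoto similarity: {tanimoto(query, candidate):.2f}")
```

A similarity search would keep only those database entries whose score relative to the query molecule exceeds a chosen threshold.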
Once these initial steps are completed, the user reaches a summary page (step D in Figure 3), where all of the settings can be reviewed and adjusted, including those specific to each of the three stages used in a virtual screening experiment. By default, all settings are set to sensible values. However, expert users have the flexibility to adjust virtually every aspect of ALISE. Notably, the docking exhaustiveness and simulation parameters, such as simulation length, time step, and output frequencies of the MD and FEP stages, can be adjusted. Furthermore, the number of intermediate alchemical states to simulate during the FEP stage can be changed to control the statistical reliability of the obtained free energy estimates. A detailed description of all available settings in the user interface is available in Supporting Information (virtual screening setup options).
The search for suitable drug molecules is initiated from the task summary page on the VIKING platform. VIKING then launches the virtual screening, and the transitions between the steps are automatically taken care of. The progress of a running experiment can be monitored on the overview page (see step E in Figure 3) from which intermediate results and files are also available. A screen capture demonstrating the process of setting up a task in ALISE can be found via the Data and Software Availability statement. Figure 4 outlines the automated workflow handled by ALISE when the task is executed. First, if requested, ligands are fetched from the PubChem database52,53 after which the receptor and ligand files are prepared for docking using AutoDockTools61 and OpenBabel.62 Next, if needed, a suitable search space is determined using AutoLigand.55 Docking is performed by VinaMPI25 and a user-defined number of best-scoring ligands and poses are presented on the result page, see step F in Figure 3.
Figure 4.

Automated workflow executed by ALISE. Optional and mandatory steps are represented by yellow and blue boxes, respectively, while green boxes indicate results obtained after each stage. Results are depicted on the result pages; see step F in Figure 3. Names56−60 indicate the software or web resource utilized by ALISE in the particular step of the workflow.
For each of the best-performing ligands from the docking stage, a set of MM force field parameters is automatically generated by the Charmm General Force Field (CGenFF) program,63−65 and MD simulations employing the implicit solvent approximation are prepared and initiated based on the determined binding poses. Both in the MD and FEP stages, the simulation files are prepared using the psfgen plugin66 in VMD,67 and the simulations are performed using NAMD.28,29 For N ligands, a total of 2N + 1 simulations are prepared and executed in the MD stage: one simulation with the receptor, one with each ligand, and finally one simulation with each receptor–ligand complex. When the MD simulations are done, the free energy contributions are computed for each simulated system following eq 3, and the differences in the free energy contributions are provided on the result page (see step F in Figure 3) together with the total binding free energy estimates computed following eq 2.
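The bookkeeping of the MD stage can be illustrated with a small example: counting the simulations and combining trajectory averages into a binding free energy estimate. This is a sketch with invented numbers; the `Trajectory` container and its field names are hypothetical, and the entropy is assumed to be supplied per system in kcal/(mol·K).

```python
from dataclasses import dataclass
from statistics import fmean
from typing import List


@dataclass
class Trajectory:
    """Per-frame energies (kcal/mol) and one entropy value for a simulated system."""
    e_mm: List[float]
    g_pol: List[float]
    g_np: List[float]
    entropy: float  # kcal/(mol*K)


def n_md_simulations(n_ligands: int) -> int:
    """One receptor run, one run per ligand, and one run per complex."""
    return 2 * n_ligands + 1


def free_energy(traj: Trajectory, temperature: float = 300.0) -> float:
    """Trajectory-averaged energy terms minus the entropic term T*S."""
    return (fmean(traj.e_mm) + fmean(traj.g_pol) + fmean(traj.g_np)
            - temperature * traj.entropy)


def binding_free_energy(complex_: Trajectory, ligand: Trajectory,
                        receptor: Trajectory, temperature: float = 300.0) -> float:
    """Free energy of the complex minus those of the separate ligand and receptor."""
    return (free_energy(complex_, temperature)
            - free_energy(ligand, temperature)
            - free_energy(receptor, temperature))


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    cplx = Trajectory([-100.0, -102.0], [-50.0, -50.0], [5.0, 5.0], 0.10)
    lig = Trajectory([-10.0, -10.0], [-8.0, -8.0], [1.0, 1.0], 0.02)
    rec = Trajectory([-80.0, -80.0], [-45.0, -45.0], [4.0, 4.0], 0.09)
    print("simulations for 100 ligands:", n_md_simulations(100))
    print(f"binding free energy: {binding_free_energy(cplx, lig, rec):.2f} kcal/mol")
```

The receptor trajectory is shared by all ligands, which is why the total grows as 2N + 1 rather than 3N.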
In the last stage, FEP simulations are prepared and executed from the last configurations of the best-performing ligands obtained from the MD stage, employing the explicit solvent model. Here, the solvate68 and autoionize69 plugins of VMD67 are used, respectively, to position structures in water boxes with a user-specified distance to the water box boundaries and to add Na+ and Cl– ions to the system, matching the concentrations used during the MD stage. The free energy difference obtained from decoupling the ligand from the complex, see Figure 2 and eq 13, is finally determined and summarized on the result page (see step F in Figure 3). In all simulations, proteins, solvents, and ions are modeled by the Chemistry at Harvard Macromolecular Mechanics (CHARMM) additive all-atom force fields with CMAP corrections.70−72 The inclusion of other force fields, for example, AMBER,73 is planned for the future development of ALISE.
Proof of Concept and Discussion
One protein that has received significant attention in recent years and remains a major challenge in human medicine is the human immunodeficiency virus type 1 (HIV-1) protease.74−76 Millions of people are infected with HIV77 and the occurrence of drug-resistant virus variants78,79 requires researchers to respond. Identifying biologically active ligands that can bind to the protease and hinder the replication cycle of HIV is an essential and significant task. Docking algorithms can assist in this process.
Quantifying the precision and efficacy of a docking algorithm is, however, a complex procedure that has been extensively discussed80−82 and the demand to validate different kinds of approaches is growing.83 In recent years, the DUD-E (Directory of Useful Decoys-Enhanced) database22 has become a reliable benchmark library for assessing the performance of docking algorithms. The DUD-E database contains a combination of active ligands, which are biologically active molecules or compounds that interact with a target protein and exhibit desired pharmacological effects, and decoy ligands, which are nonbiologically active molecules included as negative controls during molecular docking experiments. Using data sets from the DUD-E database, it is possible to evaluate whether a docking algorithm effectively identifies active ligands without falsely including decoy ligands.
Performance
We have used the DUD-E data set for HIV-1 protease to benchmark the performance of ALISE to determine if the program is capable of identifying more active ligands for the protease compared to a random selection. Additionally, we compare and discuss the three different steps involved: docking, MD, and FEP.
The data set used in this study comprises 1,395 active ligands and 35,750 decoys. ALISE aims to sort these ligands, favoring the active ones. In each step, ALISE generates an ordered list of ligands based on their scores, and the top-scoring ligands progress to the next stage. Ligands that fail to dock or be simulated are ranked as the worst. Possible reasons for failure include unsuccessful docking when VinaMPI cannot fit the ligand in the defined binding site, CGenFF failing to produce parameters, or an unstable MD simulation. The latter occurred only once and was caused by unphysical parameters, leading to sudden and large accelerations. The specific parameters involved a sulfur atom, to which CGenFF assigned large penalty scores of about 100. A complete report is included in the replication data; see the Data and Software Availability statement. At each stage, the performance of ALISE is evaluated and compared to a random selection of the ligands from the data set.
Figure 5A–C illustrates the comparison of the virtual screening results obtained using ALISE versus employing a random selection of ligands at each stage of the ligand docking process. Each value x on the abscissa represents the top x percent of the ranked database, and the ordinate quantifies the percentage of known active ligands that are found within the top x percent of the ranked database, resulting in so-called enrichment curves. The difference between the curves obtained through ALISE (red) and random selection (blue) describes the effectiveness of each computational step and is visualized in Figure 5D–F.
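An enrichment curve and its advantage over random selection can be computed from a ranked list of active/decoy labels. The sketch below rests on two assumptions that may differ from the procedure used for Figure 5: both axes are expressed as fractions between 0 and 1, and random selection is represented by its expected curve, the diagonal. The area is obtained with the trapezoidal rule, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def enrichment_curve(is_active_ranked: Sequence[bool]) -> List[Tuple[float, float]]:
    """Points (fraction of database screened, fraction of actives found)."""
    total = len(is_active_ranked)
    n_active = sum(1 for flag in is_active_ranked if flag)
    if total == 0 or n_active == 0:
        raise ValueError("need a non-empty ranking with at least one active ligand")
    points = [(0.0, 0.0)]
    found = 0
    for rank, flag in enumerate(is_active_ranked, start=1):
        found += 1 if flag else 0
        points.append((rank / total, found / n_active))
    return points


def advantage_over_random(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal integral of (curve - diagonal); positive means better than random."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * ((y0 - x0) + (y1 - x1)) * (x1 - x0)
    return area


if __name__ == "__main__":
    # Toy ranking (best score first), invented for illustration only.
    ranking = [True, False, True, False, False, True, False, False]
    curve = enrichment_curve(ranking)
    print(f"advantage over random: {advantage_over_random(curve):.4f}")
```

With this convention the value lies between −0.5 and 0.5, and a ranking that places all actives first approaches the upper bound as the fraction of actives becomes small.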
Figure 5.
Comparison of ALISE’s performance in finding active ligands compared to a random selection at the three computational stages. The abscissa represents the progress in percent through all ligands (active and decoys), while the ordinate shows the percentage of active ligands found within this fraction (first row). In the second row, the ordinate shows the difference between the curves. The individual stage has a better performance if the red curve (ranked by ALISE) is above the blue curve (ranked by random selection). (A) Results of the docking stage show ALISE’s ordering (red line) surpassing the random selection (blue line) consistently for all ligands, including the successful docking of 1395 active ligands (last point). The gray area represents 100 ligands proceeding to the MD stage. (B) Results of the MD stage show that some active ligands are found more efficiently with ALISE compared with random selection. The gray area represents the 15 ligands selected for the FEP stage. (C) Results of the FEP stage further improve the results of the MD stage. The FEP stage ranks the active ligands more favorably than the random selection procedure. (D–F) For each respective stage, the difference between ALISE’s performance and the random selection approach is shown. The integrals I (grayed areas) are a measure of ALISE’s supremacy over random selection. A positive value indicates ALISE’s advantage in performance. For the docking stage, the value is 0.00235; for the MD stage, a value of 0.0931 was found; and for the FEP stage, 0.15835 was calculated.
ALISE outperforms the random selection if the integrals indicated in Figure 5D–F are positive. The docking stage shows an obvious supremacy in sorting. On the other hand, the benefits of the MD and FEP stages seem to be subtler. Integration of the curves in Figure 5D–F, as indicated by the gray area, yields values of 0.00235 for the docking stage, 0.09301 for the MD stage, and finally 0.15835 for the FEP stage. As all values are positive, the ALISE workflow provides a quantifiable benefit in each stage compared to a random choice of ligands. In the following, each stage is discussed in more detail:
In the docking stage, a total of 16,329 ligands were successfully docked, including all 1395 active ligands (see Figure 5A). The chance to randomly select an active ligand in this stage is 3.90%. From the docking results, the top 100 ligands based on their docking scores proceeded to the MD stage, represented by the gray area in Figure 5A.
During the MD stage, 39 simulations failed, including three active ligands. In most cases, CGenFF could not produce parameters for the ligand. A more detailed status report is given in a supporting .csv file; see the Data and Software Availability statement. Figure 5B shows that at the MD stage, some active ligands were favored, but the overall performance of ALISE was similar to the random selection of ligands with lower scores. Nevertheless, the MD stage has a positive effect, as seen in Figure 5B,E. Furthermore, the chance to randomly choose an active ligand is now 27.86%, as opposed to 3.90% in the docking stage, requiring a more elaborate method to keep outperforming random choice.
Considering the resource demands, only the top 15 ligands from the MD simulations were selected for further analysis in the FEP stage, as indicated by the gray area in Figure 5B. The results in Figure 5C demonstrate that at the FEP stage, the findings of the MD step are improved, and active ligands are discovered more swiftly compared to a random selection. At the FEP stage, the chance to randomly choose an active ligand has already increased to 40.00% in the random selection.
The final stage of the overall ALISE workflow yields an ordered list and provides a visualization of the respective receptor–ligand complexes within the VIKING visualization framework. In summary, while only 3.90% of the ligands tested were active, 40.00% of the ligands tested in the final stage are active ligands. Additionally, in the final stage, the majority of decoy ligands were ranked lower than the active ones. All calculations were performed on a single node with 48 CPUs and a 2.9 GHz frequency. Choosing 100 ligands to be considered in the MD stage and 15 ligands in the FEP stage ensured that all jobs completed within 2 weeks on the available resources. The docking stage took, on average, 0.5 s per docked position. For the MD stage and FEP stage, performances of 2.08 and 4.01 h/ns were observed, respectively.
Binding Results
The workflow implemented in ALISE reduced the number of putative ligands for the HIV protease from 35,750 to just 15. The following section provides a more detailed inspection of the three highest-ranked ligands, as delivered by the FEP stage of the program.
The overall highest-ranking ligand is a decoy, according to the DUD-E database. It was ranked with an estimated binding free energy of −42.60 kcal/mol and is shown in its bound configuration in Figure 6. To the best of our knowledge, this decoy ligand has not been experimentally studied or mentioned to have certain effects. It might therefore be that the ligand is not stable enough in nature or was not studied sufficiently to make any definite conclusions. It is, of course, also possible that a new putative drug to inhibit HIV protease has been identified. The molecular structure of the ligand is shown in Figure 7A.
Figure 6. Panels A and B show the ligand with the best binding free energy estimate, as obtained through the ALISE framework. A shows the bound ligand on the surface of the protein structure, while B shows the secondary structure of the protein.
Figure 7. Panels A–C show the structures of the three highest-ranked ligands found by ALISE to inhibit the HIV protease. Structures are obtained from the ChemSpider Web site.87−89
The second highest-ranked ligand (PubChem CID 471313) has a binding free energy estimate similar to that of the first ligand, with a value of −40.02 kcal/mol. The ligand can be categorized as a tetrahydropyrimidinone. De Lucca et al.84 demonstrated that molecules of that kind bind well in the HIV protease binding pocket by displacing structural water molecules. The structure of the ligand is shown in Figure 7B.
For the third highest-ranked ligand, ALISE obtained a binding free energy estimate of −31.89 kcal/mol. It is an active compound, as determined by the DUD-E database. This particular ligand (PubChem CID 469254) is a derivative of the cyclic urea inhibitor DMP450. Previous research has demonstrated that this class of compounds can effectively inhibit the activity of the HIV protease enzyme through specific bonding interactions.85 Notably, the ligand incorporates hydrogen-bonding equivalents of a water molecule that is typically bound to the enzyme, resulting in the formation of a conformationally rigid, seven-membered ring.84 These molecular characteristics contribute to the optimal interaction between DMP450 and all of the binding pockets of the HIV protease enzyme. The ligand is an experimentally synthesized asymmetric derivative of DMP450. It was named the 12F structure by Han et al.,86 and its structure is illustrated in Figure 7C.
The binding scores corresponding to the estimated free energies of the first 200 ligands from the docking stage and of all ligands in the subsequent stages are listed in the Supporting Information (ALISE’s ranked results for the HIV protease case study).
Conclusions
ALISE is an advanced virtual drug screening tool integrated into the versatile VIKING web platform (https://viking-suite.com/).16 The primary objective of ALISE is to refine and condense a preliminary list of potential drug candidates targeting a specific receptor. This is achieved through a sequential process consisting of three distinct stages, wherein the ligands are ranked based on their binding free energy estimates. The binding free energy estimates are computed by using three different computational techniques: docking, MD simulations, and FEP simulations.
At each stage, only the ligands exhibiting the most favorable binding free energy values proceed to the subsequent stage. This stepwise segregation of ligands ensures optimal utilization of computational resources by avoiding the application of computationally demanding methods to unpromising drug candidates. The results obtained after each stage, including the binding free energy estimates and a graphical representation of the outcomes, are made accessible on a dedicated result page within the VIKING online platform, as indicated in Figure 3.
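The stepwise selection described above amounts to sorting the ligands by their binding free energy estimate and passing only the most favorable (most negative) ones to the next stage. The following is a minimal sketch of that selection step, not the ALISE implementation; the function name, the ligand labels, and the fourth entry are hypothetical, while the first three values are the FEP estimates reported in the Binding Results section.

```python
from typing import Dict, List, Tuple


def select_top(scores: Dict[str, float], n_keep: int) -> List[Tuple[str, float]]:
    """Return the n_keep ligands with the most negative (most favorable) estimates."""
    if n_keep < 0:
        raise ValueError("n_keep must be non-negative")
    # Sort ascending: a lower binding free energy means stronger predicted binding.
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return ranked[:n_keep]


# Binding free energy estimates in kcal/mol. Labels are hypothetical.
fep_estimates = {
    "top_decoy": -42.60,            # highest-ranked ligand (a DUD-E decoy)
    "CID_471313": -40.02,           # second highest-ranked ligand
    "CID_469254": -31.89,           # third highest-ranked ligand
    "hypothetical_ligand": -12.00,  # made-up value for illustration only
}

survivors = select_top(fep_estimates, n_keep=3)
for rank, (name, energy) in enumerate(survivors, start=1):
    print(f"{rank}. {name}: {energy:.2f} kcal/mol")
```

In the case study, the same kind of cut was applied with 100 ligands retained for the MD stage and 15 for the FEP stage.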
To demonstrate the performance of ALISE, a benchmark example was conducted that focused on the HIV protease. The benchmark involved a data set with less than 4% active ligands, and ALISE narrowed the search to a final stage of 15 ligands, 40% of which are active. Furthermore, two of the three highest-ranked ligands were shown to inhibit the HIV protease in earlier investigations.84,86
Overall, ALISE’s automated and systematic approach to virtual drug screening combines 10 individual programs and automates their application in three stages. The automated approach, the web resource implementation, and the performance show that ALISE has potential as an efficient and effective tool in the early stages of drug discovery and optimization. ALISE is a user-friendly tool which allows computational scientists to use a virtual screening workflow. At the same time, ALISE speeds up the workflow for experts in the field.
Acknowledgments
The authors would like to thank the Danish Councils for Independent Research, the Volkswagen Foundation (Lichtenberg Professorship to I.A.S.), the DFG, the German Research Foundation (GRK1885-Molecular Basis of Sensory Biology, SFB 1372—Magnetoreception and Navigation in Vertebrates, TRR 386/1-2023, HYP*MOL-Hyperpolarization in molecular systems (Projektnr. 514664767)), and the Ministry for Science and Culture of Lower Saxony (simulations meet experiments on the nanoscale: Opening up the quantum world to artificial intelligence (SMART), Dynamics of Systems on the nanoscale (DYNANO)). Computational resources were provided by the CARL Cluster at the Carl-von-Ossietzky University Oldenburg. The authors gratefully acknowledge the computing time granted by the Resource Allocation Board and provided on the supercomputers Lise and Emmy at NHR@ZIB and NHR@Göttingen as part of the NHR infrastructure. The calculations for this research were conducted with computing resources under the project nip00058.
Glossary
Abbreviations Used
- ALISE: Automated Ligand Searcher
- CGenFF: CHARMM General Force Field
- CHARMM: Chemistry at Harvard Macromolecular Mechanics
- COVID-19: coronavirus disease 2019
- FEP: free energy perturbation
- GB: generalized Born
- GBIS: generalized Born implicit solvent
- MD: molecular dynamics
- MM: molecular mechanics
- MM/GBSA: molecular mechanics generalized Born and surface area continuum solvation
- QM: quantum mechanical
- Vina: AutoDock Vina
Data Availability Statement
The data needed to reproduce this study, a screen capture showing the setup of a task in ALISE, and a status report from the study are publicly available at: https://doi.org/10.57782/LGTW2K. The data structure is described in a README file. Software, web servers, and computational tools applied in ALISE and presented in this paper are owned by their respective developers and copyright holders but can be licensed freely for academic use. Except for the CGenFF program, VIKING redistributes the necessary software and tools to users’ computational resources. To ensure that users of ALISE are noncommercial and possess a CGenFF license, each user must apply for access to ALISE.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.3c01317.
Chemical similarity search on PubChem; AutoLigand; virtual screening setup options; and ALISE’s ranked results for the HIV protease case study (PDF)
Author Contributions
L.J. and J.H. contributed equally to this work. L.J.: conceptualization, methodology, software, investigation, and writing—original draft. J.H.: conceptualization, methodology, software, validation, formal analysis, writing—review and editing, and visualization. V.B.: methodology, software, and writing—review and editing. L.G.: software, investigation, and writing—review and editing. F.S.: software, validation, formal analysis, investigation, writing—review and editing, visualization, and project administration. I.A.S.: conceptualization, resources, data curation, writing—review and editing, supervision, project administration, and funding acquisition.
The authors declare no competing financial interest.
References
- Morens D. M.; Daszak P.; Markel H.; Taubenberger J. K. Pandemic COVID-19 Joins History’s Pandemic Legion. mBio 2020, 11, e00812 10.1128/mbio.00812-20.
- WHO Coronavirus (COVID-19) Dashboard. https://covid19.who.int/ (accessed April 4, 2023).
- The Brussels Times Global GDP could be Slashed by $22 Trillion between 2020 and 2025, IMF Warns; The Brussels Times, 2021.
- Cutler D. M.; Summers L. H. The COVID-19 Pandemic and the $16 Trillion Virus. J. Am. Med. Assoc. 2020, 324, 1495–1496. 10.1001/jama.2020.19759.
- Davis R. L. Mechanism of Action and Target Identification: A Matter of Timing in Drug Discovery. iScience 2020, 23, 101487. 10.1016/j.isci.2020.101487.
- Sulimov V. B.; Kutov D. C.; Sulimov A. V. Advances in Docking. Curr. Med. Chem. 2020, 26, 7555–7580. 10.2174/0929867325666180904115000.
- Wouters O. J.; McKee M.; Luyten J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009–2018. J. Am. Med. Assoc. 2020, 323, 844–853. 10.1001/jama.2020.1166.
- Simoens S.; Huys I. R&D Costs of New Medicines: A Landscape Analysis. Front. Med. 2021, 8, 760762. 10.3389/fmed.2021.760762.
- Sertkaya A.; Wong H. H.; Jessup A.; Beleche T. Key cost drivers of pharmaceutical clinical trials in the United States. Clin. Trials 2016, 13, 117–126. 10.1177/1740774515625964.
- Singh I. P.; Ahmad F.; Chatterjee D.; Bajpai R.; Sengar N. Drug Discovery and Development: From Targets and Molecules to Medicines; Poduri R., Ed.; Springer Singapore: Singapore, 2021; pp 11–65.
- Glaab E. Building a virtual ligand screening pipeline using free software: A survey. Briefings Bioinf. 2016, 17, 352–366. 10.1093/bib/bbv037.
- Bharatam P. V. Drug Discovery and Development: From Targets and Molecules to Medicines; Poduri R., Ed.; Springer Singapore, 2021; pp 137–210.
- Meng X.-Y.; Zhang H.-X.; Mezei M.; Cui M. Molecular Docking: A Powerful Approach for Structure-Based Drug Discovery. Curr. Comput.-Aided Drug Des. 2011, 7, 146–157. 10.2174/157340911795677602.
- Kitchen D. B.; Decornez H.; Furr J. R.; Bajorath J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discovery 2004, 3, 935–949. 10.1038/nrd1549.
- Jones L. H.; Gray N. S. Chemical biology for target identification and validation. Med. Chem. Commun. 2014, 5, 244–246. 10.1039/C4MD90004A.
- Korol V.; Husen P.; Sjulstok E.; Nielsen C.; Friis I.; Frederiksen A.; Salo A. B.; Solov’yov I. A. Introducing VIKING: A Novel Online Platform for Multiscale Modeling. ACS Omega 2020, 5, 1254–1260. 10.1021/acsomega.9b03802.
- Nielsen C.; Solov’yov I. A. MolSpin - Flexible and extensible general spin dynamics software. J. Chem. Phys. 2019, 151, 194105. 10.1063/1.5125043.
- Schuhmann F.; Korol V.; Solov’yov I. A. Introducing Pep McConst – A user-friendly peptide modeler for biophysical applications. J. Comput. Chem. 2021, 42, 572–580. 10.1002/jcc.26479.
- Gerhards L.; Nielsen C.; Kattnig D. R.; Hore P. J.; Solov’yov I. A. Modeling spin relaxation in complex radical systems using MolSpin. J. Comput. Chem. 2023, 44, 1704–1714. 10.1002/jcc.27120.
- Mroginski M.-A.; Adam S.; Amoyal G. S.; Barnoy A.; Bondar A.-N. N.; Borin V. A.; Church J. R.; Domratcheva T.; Ensing B.; Fanelli F.; Ferré N.; Filiba O.; Pedraza-González L.; González R.; González-Espinoza C. E.; Kar R. K.; Kemmler L.; Kim S. S.; Kongsted J.; Krylov A. I.; Lahav Y.; Lazaratos M.; NasserEddin Q.; Navizet I.; Nemukhin A.; Olivucci M.; Olsen J. M. H.; Pérez de Alba Ortíz A.; Pieri E.; Rao A. G.; Rhee Y. M.; Ricardi N.; Sen S.; Solov’yov I. A.; De Vico L.; Wesolowski T. A.; Wiebeler C.; Yang X.; Schapiro I. Frontiers in Multiscale Modeling of Photoreceptor Proteins. Photochem. Photobiol. 2021, 97, 243–269. 10.1111/php.13372.
- Elend L.; Jacobsen L.; Cofala T.; Prellberg J.; Teusch T.; Kramer O.; Solov’Yov I. Design of SARS-CoV-2 main protease inhibitors using artificial intelligence and molecular dynamic simulations. Molecules 2022, 27, 4020. 10.3390/molecules27134020.
- Mysinger M. M.; Carchia M.; Irwin J. J.; Shoichet B. K. Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. J. Med. Chem. 2012, 55, 6582–6594. 10.1021/jm300687e.
- Trott O.; Olson A. J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010, 31, 455–461. 10.1002/jcc.21334.
- Leach A. R. Molecular Modelling: Principles and Applications, 2nd ed.; Pearson, 2001.
- Ellingson S. R.; Smith J. C.; Baudry J. VinaMPI: Facilitating multiple receptor high-throughput virtual docking on high-performance computers. J. Comput. Chem. 2013, 34, 2212–2221. 10.1002/jcc.23367.
- Blum C.; Cotta C.; Fernández A. J.; Gallardo J. E.; Mastrolilli M. Hybridizations of Metaheuristics with Branch & Bound Derivates; Springer Berlin Heidelberg: Berlin, Heidelberg, 2008; Vol. 114, pp 85–116.
- Nocedal J.; Wright S. J. Numerical Optimization, 2nd ed.; Springer New York: New York, NY, 2006; pp 135–163.
- Phillips J. C.; Braun R.; Wang W.; Gumbart J.; Tajkhorshid E.; Villa E.; Chipot C.; Skeel R. D.; Kalé L.; Schulten K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781–1802. 10.1002/jcc.20289.
- Phillips J. C.; Hardy D. J.; Maia J. D. C.; Stone J. E.; Ribeiro J. V.; Bernardi R. C.; Buch R.; Fiorin G.; Hénin J.; Jiang W.; McGreevy R.; Melo M. C. R.; Radak B. K.; Skeel R. D.; Singharoy A.; Wang Y.; Roux B.; Aksimentiev A.; Luthey-Schulten Z.; Kalé L. V.; Schulten K.; Chipot C.; Tajkhorshid E. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J. Chem. Phys. 2020, 153, 044130. 10.1063/5.0014475.
- Best R. B.; Zhu X.; Shim J.; Lopes P. E. M.; Mittal J.; Feig M.; MacKerell A. D. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 2012, 8, 3257–3273. 10.1021/ct300400x.
- Kollman P. A.; Massova I.; Reyes C.; Kuhn B.; Huo S.; Chong L.; Lee M.; Lee T.; Duan Y.; Wang W.; Donini O.; Cieplak P.; Srinivasan J.; Case D. A.; Cheatham T. E. Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models. Acc. Chem. Res. 2000, 33, 889–897. 10.1021/ar000033j.
- Genheden S.; Ryde U. The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities. Expert Opin. Drug Discovery 2015, 10, 449–461. 10.1517/17460441.2015.1032936.
- Brieg M.; Setzler J.; Albert S.; Wenzel W. Generalized Born implicit solvent models for small molecule hydration free energies. Phys. Chem. Chem. Phys. 2017, 19, 1677–1685. 10.1039/C6CP07347F.
- Gohlke H.; Case D. A. Converging free energy estimates: MM-PB(GB)SA studies on the protein-protein complex Ras-Raf. J. Comput. Chem. 2004, 25, 238–250. 10.1002/jcc.10379.
- Still W. C.; Tempczyk A.; Hawley R. C.; Hendrickson T. Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 1990, 112, 6127–6129. 10.1021/ja00172a038.
- Srinivasan J.; Trevathan M. W.; Beroza P.; Case D. A. Application of a pairwise generalized Born model to proteins and nucleic acids: inclusion of salt effects. Theor. Chem. Acc. 1999, 101, 426–434. 10.1007/s002140050460.
- Onufriev A. V.; Case D. A. Generalized Born Implicit Solvent Models for Biomolecules. Annu. Rev. Biophys. 2019, 48, 275–296. 10.1146/annurev-biophys-052118-115325.
- Bernardi R.; Bhandarkar M.; Bhatele A.; Bohm E.; Brunner R.; Buch R.; Buelens F.; Chen H.; Chipot C.; Dalke A.; Dixit S.; Fiorin G.; Freddolino P.; Fu H.; Grayson P.; Gullingsrud J.; Gursoy A.; Hardy D.; Harrison C.; Hénin J.; Humphrey W.; Hurwitz D.; Hynninen A.; Jain N.; Jiang W.; Krawetz N.; Kumar S.; Kunzman D.; Lai J.; Lee C.; Maia J.; McGreevy R.; Mei C.; Melo M.; Nelson M.; Phillips J.; Radak B.; Ribeiro J.; Rudack T.; Sarood O.; Shinozaki A.; Tanner D.; Wang P.; Wells D.; Zheng G.; Zhu F. NAMD User’s Guide 2.14, 2020.
- Onufriev A.; Bashford D.; Case D. A. Modification of the Generalized Born Model Suitable for Macromolecules. J. Phys. Chem. B 2000, 104, 3712–3720. 10.1021/jp994072s.
- Onufriev A.; Bashford D.; Case D. A. Exploring protein native states and large-scale conformational changes with a modified generalized born model. Proteins: Struct., Funct., Bioinf. 2004, 55, 383–394. 10.1002/prot.20033.
- Geist N.; Kulke M.; Schulig L.; Link A.; Langel W. Replica-Based Protein Structure Sampling Methods II: Advanced Hybrid Solvent TIGER2hs. J. Phys. Chem. B 2019, 123, 5995–6006. 10.1021/acs.jpcb.9b03134.
- Carlsson J.; Åqvist J. Absolute and Relative Entropies from Computer Simulation with Applications to Ligand Binding. J. Phys. Chem. B 2005, 109, 6448–6456. 10.1021/jp046022f.
- Andricioaei I.; Karplus M. On the calculation of entropy from covariance matrices of the atomic fluctuations. J. Chem. Phys. 2001, 115, 6289–6292. 10.1063/1.1401821.
- Levy R. M.; Karplus M.; Kushick J.; Perahia D. Evaluation of the configurational entropy for proteins: application to molecular dynamics simulations of an α-helix. Macromolecules 1984, 17, 1370–1374. 10.1021/ma00137a013.
- Karplus M.; Kushick J. N. Method for estimating the configurational entropy of macromolecules. Macromolecules 1981, 14, 325–332. 10.1021/ma50003a019.
- Schlitter J. Estimation of absolute and relative entropies of macromolecules using the covariance matrix. Chem. Phys. Lett. 1993, 215, 617–621. 10.1016/0009-2614(93)89366-P.
- Pohorille A.; Jarzynski C.; Chipot C. Good Practices in Free-Energy Calculations. J. Phys. Chem. B 2010, 114, 10235–10253. 10.1021/jp102971x.
- Jespers W.; Åqvist J.; Gutiérrez-de Terán H. Protein-Ligand Interactions and Drug Design; Ballante F., Ed.; Springer US: New York, NY, 2021; pp 203–226.
- Zwanzig R. W. High Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22, 1420–1426. 10.1063/1.1740409.
- Jia C.-Y.; Li J. Y.; Hao G. F.; Yang G. F. A drug-likeness toolbox facilitates ADMET study in drug discovery. Drug Discov. Today 2020, 25, 248–258. 10.1016/j.drudis.2019.10.014.
- Cofala T.; Elend L.; Mirbach P.; Prellberg J.; Teush T.; Kramer O. Evolutionary multi-objective design of SARS-CoV-2 protease inhibitor candidates. Parallel Problem Solving from Nature–PPSN XVI: 16th International Conference, PPSN 2020, Leiden, The Netherlands, September 5–9 2020; Springer International Publishing, 2020; pp 357–371.
- Kim S.; Thiessen P. A.; Bolton E. E.; Chen J.; Fu G.; Gindulyte A.; Han L.; He J.; He S.; Shoemaker B. A.; Wang J.; Yu B.; Zhang J.; Bryant S. H. PubChem substance and compound databases. Nucleic Acids Res. 2016, 44, D1202–D1213. 10.1093/nar/gkv951.
- Kim S.; Chen J.; Cheng T.; Gindulyte A.; He J.; He S.; Li Q.; Shoemaker B. A.; Thiessen P. A.; Yu B.; Zaslavsky L.; Zhang J.; Bolton E. E. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 2021, 49, D1388–D1395. 10.1093/nar/gkaa971.
- Berman H. M.; Westbrook J.; Feng Z.; Gilliland G.; Bhat T. N.; Weissig H.; Shindyalov I. N.; Bourne P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. 10.1093/nar/28.1.235.
- Harris R.; Olson A. J.; Goodsell D. S. Automated prediction of ligand-binding sites in proteins. Proteins: Struct., Funct., Bioinf. 2008, 70, 1506–1517. 10.1002/prot.21645.
- Theoretical and Computational Biophysics Group Home Page. https://www.ks.uiuc.edu/ (accessed Oct 6, 2023).
- Mackerell Home Page. https://mackerell.umaryland.edu/~kenno/cgenff/ (accessed Oct 6, 2023).
- PubChem Home Page. https://pubchem.ncbi.nlm.nih.gov/ (accessed Oct 6, 2023).
- Open Babel Home Page. http://openbabel.org/wiki/Main_Page (accessed Oct 6, 2023).
- Center For Computational Structural Biology Home Page. https://ccsb.scripps.edu/ (accessed Oct 6, 2023).
- Morris G. M.; Huey R.; Lindstrom W.; Sanner M. F.; Belew R. K.; Goodsell D. S.; Olson A. J. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 2009, 30, 2785–2791. 10.1002/jcc.21256.
- O’Boyle N. M.; Banck M.; James C. A.; Morley C.; Vandermeersch T.; Hutchison G. R. Open Babel: An open chemical toolbox. J. Cheminf. 2011, 3, 33. 10.1186/1758-2946-3-33.
- Vanommeslaeghe K.; Hatcher E.; Acharya C.; Kundu S.; Zhong S.; Shim J.; Darian E.; Guvench O.; Lopes P.; Vorobyov I.; Mackerell A. D. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 2010, 31, 671–690. 10.1002/jcc.21367.
- Vanommeslaeghe K.; MacKerell A. D. Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom Typing. J. Chem. Inf. Model. 2012, 52, 3144–3154. 10.1021/ci300363c.
- Vanommeslaeghe K.; Raman E. P.; MacKerell A. D. Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges. J. Chem. Inf. Model. 2012, 52, 3155–3168. 10.1021/ci3003649.
- VMD psfgen Plugin, Version 2.0. https://www.ks.uiuc.edu/Research/vmd/plugins/psfgen/ (accessed May 19, 2021).
- Humphrey W.; Dalke A.; Schulten K. VMD – Visual Molecular Dynamics. J. Mol. Graphics 1996, 14, 33–38. 10.1016/0263-7855(96)00018-5.
- Solvate Plugin, Version 1.5. https://www.ks.uiuc.edu/Research/vmd/plugins/solvate/ (accessed Oct 6, 2023).
- Autoionize Plugin, Version 1.5. https://www.ks.uiuc.edu/Research/vmd/plugins/autoionize/ (accessed Oct 6, 2023).
- MacKerell A. D.; Bashford D.; Bellott M.; Dunbrack R. L.; Evanseck J. D.; Field M. J.; Fischer S.; Gao J.; Guo H.; Ha S.; Joseph-McCarthy D.; Kuchnir L.; Kuczera K.; Lau F. T. K.; Mattos C.; Michnick S.; Ngo T.; Nguyen D. T.; Prodhom B.; Reiher W. E.; Roux B.; Schlenkrich M.; Smith J. C.; Stote R.; Straub J.; Watanabe M.; Wiórkiewicz-Kuczera J.; Yin D.; Karplus M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616. 10.1021/jp973084f.
- Brooks B. R.; Brooks C. L.; Mackerell A. D.; Nilsson L.; Petrella R. J.; Roux B.; Won Y.; Archontis G.; Bartels C.; Boresch S.; Caflisch A.; Caves L.; Cui Q.; Dinner A. R.; Feig M.; Fischer S.; Gao J.; Hodoscek M.; Im W.; Kuczera K.; Lazaridis T.; Ma J.; Ovchinnikov V.; Paci E.; Pastor R. W.; Post C. B.; Pu J. Z.; Schaefer M.; Tidor B.; Venable R. M.; Woodcock H. L.; Wu X.; Yang W.; York D. M.; Karplus M. CHARMM: The biomolecular simulation program. J. Comput. Chem. 2009, 30, 1545–1614. 10.1002/jcc.21287.
- MacKerell A. D.; Feig M.; Brooks C. L. Improved Treatment of the Protein Backbone in Empirical Force Fields. J. Am. Chem. Soc. 2004, 126, 698–699. 10.1021/ja036959e.
- Ponder J. W.; Case D. A. Force fields for protein simulations. Adv. Protein Chem. 2003, 66, 27–85. 10.1016/S0065-3233(03)66002-X.
- Deeks S. G.; Overbaugh J.; Phillips A.; Buchbinder S. HIV infection. Nat. Rev. Dis. Prim. 2015, 1, 15035. 10.1038/nrdp.2015.35.
- Boettcher J.; Specker E.; Heine A.; Klebe G. HIV-1 Protease in Complex with Pyrrolidinmethanamine; RCSB Protein Data Bank, 2005. 10.2210/pdb1xl2/pdb.
- Nijhuis M.; van Maarseveen N. M.; Lastere S.; Schipper P.; Coakley E.; Glass B.; Rovenska M.; de Jong D.; Chappey C.; Goedegebuure I. W.; Heilek-Snyder G.; Dulude D.; Cammack N.; Brakier-Gingras L.; Konvalinka J.; Parkin N.; Kräusslich H. G.; Brun-Vezinet F.; Boucher C. A. B. A Novel Substrate-Based HIV-1 Protease Inhibitor Drug Resistance Mechanism. PLoS Med. 2007, 4, e36 10.1371/journal.pmed.0040036.
- Slogrove A. L.; Sohn A. H. The global epidemiology of adolescents living with HIV. Curr. Opin. HIV AIDS 2018, 13, 170–178. 10.1097/COH.0000000000000449.
- Ji H. Current Research on HIV Drug Resistance—A Topical Collection with “Pathogens”. Pathogens 2022, 11, 966. 10.3390/pathogens11090966.
- Lv Z.; Chu Y.; Wang Y. HIV protease inhibitors: a review of molecular selectivity and toxicity. HIV/AIDS 2015, 7, 95. 10.2147/hiv.s79956.
- Agrawal P.; Singh H.; Srivastava H. K.; Singh S.; Kishore G.; Raghava G. P. Benchmarking of different molecular docking methods for protein-peptide docking. BMC Bioinf. 2019, 19, 426. 10.1186/s12859-018-2449-y.
- Vieira T. F.; Sousa S. F. Advanced Methods in Structural Biology; Sousa A., Passarinha L., Eds.; Springer US: New York, NY, 2023; pp 261–267.
- Arrigoni R.; Santacroce L.; Ballini A.; Palese L. L. AI-Aided Search for New HIV-1 Protease Ligands. Biomolecules 2023, 13, 858. 10.3390/biom13050858.
- Zhou Y.; Jiang Y.; Chen S.-j. RNA-ligand molecular docking: advances and challenges. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2022, 12, e1571 10.1002/wcms.1571.
- De Lucca G. V.; Liang J.; De Lucca I. Stereospecific synthesis, structure-activity relationship, and oral bioavailability of tetrahydropyrimidin-2-one HIV protease inhibitors. J. Med. Chem. 1999, 42, 135–152. 10.1021/jm9803626.
- Hodge C. N.; Aldrich P. E.; Bacheler L. T.; Chang C. H.; Eyermann C. J.; Garber S.; Grubb M.; Jackson D. A.; Jadhav P. K.; Korant B.; Lam P. Y.; Maurin M. B.; Meek J. L.; Otto M. J.; Rayner M. M.; Reid C.; Sharpe T. R.; Shum L.; Winslow D. L.; Erickson-Viitanen S. Improved cyclic urea inhibitors of the HIV-1 protease: Synthesis, potency, resistance profile, human pharmacokinetics and X-ray crystal structure of DMP 450. Chem. Biol. 1996, 3, 301–314. 10.1016/S1074-5521(96)90110-6.
- Han Q.; Chang C. H.; Li R.; Ru Y.; Jadhav P. K.; Lam P. Y. Cyclic HIV protease inhibitors: Design and synthesis of orally bioavailable, pyrazole P2/P2’ cyclic ureas with improved potency. J. Med. Chem. 1998, 41, 2019–2028. 10.1021/jm9704199.
- CSID:413947. http://www.chemspider.com/Chemical-Structure.413947.html (accessed 19:39, Oct 9, 2023).
- CSID:24521321. http://www.chemspider.com/Chemical-Structure.24521321.html (accessed 19:37, Oct 9, 2023).
- CSID:412242. http://www.chemspider.com/Chemical-Structure.412242.html (accessed 19:40, Oct 9, 2023).