Author manuscript; available in PMC: 2023 Dec 8.
Published in final edited form as: Methods Mol Biol. 2021;2266:39–72. doi: 10.1007/978-1-0716-1209-5_3

Biased docking for protein-ligand pose prediction

Juan Pablo Arcon 1,$,*, Adrián G Turjanski 1, Marcelo A Martí 1, Stefano Forli 2,*
PMCID: PMC10708986  NIHMSID: NIHMS1947069  PMID: 33759120

Abstract

The interaction between a protein and its ligands is one of the basic and most important processes in biological chemistry. Docking methods aim to predict the molecular 3D structure of protein-ligand complexes starting from coordinates of the protein and the ligand separately. They are widely used in both industry and academia, especially in the context of drug development projects. AutoDock4 is one of the most popular docking tools and, as for any docking method, its performance is highly system dependent. Knowledge about specific protein-ligand interactions on a particular target can be used to successfully overcome this limitation. Here we describe how to apply the AutoDock Bias protocol, a simple and elegant strategy that allows users to incorporate target-specific information through a modified scoring function that biases the ligand structure towards those poses (or conformations) that establish selected interactions. We discuss two examples using different bias sources. In the first, we show how to steer dockings toward interactions derived from crystal structures of the receptor with different ligands; in the second example we define and apply hydrophobic biases derived from Molecular Dynamics simulations in mixed solvents. Finally, we discuss general concepts of biased docking, its performance in pose prediction and virtual screening campaigns as well as other potential applications.

Keywords: docking, biased docking, guided docking, knowledge-based docking, AutoDock, AutoDock Bias, mixed-solvents, cosolvent

1. Introduction

Molecular docking

The interaction between a protein and small molecule ligands is one of the basic and most important processes in biological chemistry, and is therefore the subject of intense research. Understanding how a specific protein interacts with a ligand has applications in fields such as enzyme catalysis, cell-cell communication, signaling and, most importantly, drug development. The cornerstone of protein-ligand binding studies is the determination of the corresponding complex structure at the atomic resolution level, which is usually achieved by means of X-ray crystallography. The cheaper, and universally accessible, in-silico based alternative that allows determination of a given protein-ligand complex structure is usually referred to as molecular docking.

Docking methods aim to predict the 3D structure of the protein-ligand complex starting from coordinates of the protein and the ligand separately. They are currently widely used both in academia and industry, and are an essential part of any drug development project. The docking problem mainly consists of two different components. The first one is a geometrical search algorithm that efficiently explores the protein-ligand conformational space and generates potential solutions. Over the years, a number of search algorithms have been implemented for performing this task, ranging from stochastic methods such as Monte Carlo simulated annealing(1, 2) and genetic algorithms(3, 4) that randomly alter conformational degrees of freedom, to systematic searches(5) or shape-based algorithms(6). Additionally, due to the enormous search space, some type of rigid body approach is often used, for example fixing the protein structure as well as some of the ligand internal degrees of freedom (typically bond distances and angles). The second component is the complex affinity estimation by a scoring function that may be empirical,(7) based on classical physics potentials,(8) a statistical potential,(9) or a machine learning approach,(10) among others(11). Mixed approaches are also available.(12, 13) During each docking run both components interact trying to find the ligand binding pose that maximizes the affinity.

Given their wide applicability, several docking methods have been developed in recent decades and are available to the community, such as AutoDock,(12) Glide,(5) GOLD,(4) DOCK,(14, 15) AutoDock Vina,(16) rDock,(13) FlexX,(17) Surflex,(18) and MOE(19). They differ in their search strategies, scoring functions, computational efficiency, user interface, etc. Concerning their reliability in correctly predicting the complex structure, and therefore the interactions between protein and ligand, results are still highly system dependent, with cases where the predicted complex is indistinguishable from the crystal structure, and others where the ligand is placed either in another site or in a completely wrong pose. The common validation of pose prediction accuracy for docking programs is re-docking (self-docking) of ligands into their cognate protein binding site and calculating the proportion of predicted conformations with RMSD below 2 Å with respect to the co-crystallized complex (considering ligand symmetry if necessary). Accuracy values lie around 60-80% for most common docking software.(19–23) Clearly, there is still significant room for improvement, especially considering that the cross-docking problem, in which molecules are docked to non-cognate protein structures, is a far more realistic and challenging application than re-docking cognate ligands.(24, 25) This is why we choose to show a cross-docking application in the present protocol.
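The validation metric described above can be illustrated with a minimal Python sketch. It assumes that the atoms of the predicted and reference poses are already listed in the same order and does not handle ligand symmetry; the function names and the toy coordinates are hypothetical and are not part of any docking package.

```python
import math


def rmsd(pose, reference):
    """RMSD (in Å) between two equally ordered lists of (x, y, z) tuples."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(pose, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(pose))


def success_rate(rmsd_values, cutoff=2.0):
    """Fraction of docking results with RMSD to the crystal pose below the cutoff."""
    return sum(1 for value in rmsd_values if value < cutoff) / len(rmsd_values)


# Toy data (hypothetical coordinates, not taken from any structure)
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
print(rmsd(pose, reference))
print(success_rate([0.8, 1.9, 3.5, 6.2]))
```

In the toy example every atom is displaced by 1 Å, so the RMSD is 1 Å, and two of the four listed RMSD values fall below the 2 Å cutoff.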

Biased docking

Considering the previously stated limitations of docking programs, and to maximize their potential, users add restraints based on relevant previous knowledge about the target of interest to adjust the results for that particular system (knowledge-based docking).(26, 27) Some examples include the use of 3D structure similarity with known ligands to improve pose prediction of drug-like molecules(28) and fragment molecules(29), or the discovery of potent agonists by docking different molecules with lipophilic moiety biases derived from related targets(30). One of the most widely used strategies to improve docking performance on a specific target is to incorporate into the docking algorithm previous knowledge of key protein-ligand interactions for that target. Extreme cases of this concept are, for example, forcing the formation of a covalent bond as in the case of covalent inhibitors, such as penicillin analogs or other antibiotics, or defining dedicated potentials for the coordination to metal ions in metalloproteins,(31) as observed for hydroxamate ligands of Zn hydrolases. The principle, however, works for any type of interaction, including hydrogen bonds, salt bridges, aromatic and even loose hydrophobic interactions. The fundamental requirement is to have relevant interaction knowledge prior to the docking experiment. One approach consists of detecting common features among different ligands of the same protein. The presence of a specific functional group on several ligands establishing the same particular interaction with the receptor is what we refer to as a ligand-derived pharmacophore. Another approach, riskier but with enormous application potential, is to use mixed solvent crystallography(32) and/or molecular dynamics(33–35) to predict these interactions, as will be detailed later in this chapter.
Once the key protein-ligand interactions are identified or predicted, the next question is how to include this information to improve the docking performance. A common and extensively used strategy is to bias (guide or constrain) the docking towards those poses (or conformations) where the interaction is fulfilled. In the last decade we have extensively analyzed how to determine these interactions using both X-ray and MD derived information, and how to use them to improve docking performance both in terms of accuracy and sensitivity.(36–39) To this end we implemented a bias guiding approach in AutoDock4, which we called AutoDock Bias.(40)

AutoDock Bias

Many docking programs include functionalities to encourage the formation of specific molecular interactions, metal coordination bonds or pose based restraints (i.e., tethered docking).(4, 5, 13, 19) AutoDock Bias complements AutoDock4 as a tool to improve docking performance by guiding docking results towards pharmacophoric interactions (hydrogen bonds, hydrophobic/aromatic) or precise localization of either atoms (e.g. metal atoms or anchors for covalent docking) or groups of atoms (e.g. substructure core for congeneric ligands or fragment growth) in a defined 3D region relative to the target structure. The aim of AutoDock Bias is to provide an easy way to add to the AutoDock4 scoring function a bias towards specific protein-ligand interactions. As stated above, these known interactions may be derived from ligand-based pharmacophores, interaction sites detected by hotspot mapping (e.g., through cosolvent or fragment molecules), or any other interaction known or suspected to be relevant for ligand binding. The idea behind AutoDock Bias is that an increase in the magnitude of the binding energy when these particular interactions are fulfilled drives the ligand toward more accurate binding poses, and therefore increases overall docking performance in predicting plausible results.

How does AutoDock Bias work?

The biases in AutoDock Bias are implemented as potential energy wells that promote the placement of specific atoms or functional groups of the ligand in the desired locations. Once the standard AutoDock4 energy grid maps (without any bias) have been computed for each ligand atom type, biases are introduced as modifications of the grid maps corresponding to the desired type of interaction. Usually, hydrogen bond acceptor biases modify the energy maps corresponding to those oxygen (OA) and nitrogen (NA) atoms capable of accepting hydrogen bonds (e.g. carbonyl oxygen or pyridine nitrogen, respectively), hydrogen bond donor biases modify the energy map corresponding to explicit hydrogen atoms (HD) i.e., those that are hydrogen bond donors, and aromatic biases are handled by creating a new AC atom type in the center of aromatic rings (AC = Aromatic Center), and the corresponding AC energy map with only the energy reward in the desired location and zero energy (i.e., no interactions) everywhere else. It is also possible to modify any specific energy map that the user may consider relevant for the particular system of interest. For example, fixing several ligand atoms in space could be useful for ligands that are grown from smaller fragments with known pose or ligands that are covalently bound to the protein.

Figure 1 shows how the atom type specific grid map modification is accomplished. The new grid map takes the original unbiased energy map and adds an energy reward (Vset) at the center of the site where the bias is applied (bias site). Thus, the calculated energy in that region is lowered, by up to the full reward at the center of the site. At the same time, a radius for the bias site is set to control how far from the center of that site the energy modification will still be operating, defining a gradual decay of the effect. The precise form of the energy reward is shown in equation 1,

$$V_{bias} = V_0 + V_{set}\, e^{-\frac{(x-x_i)^2+(y-y_i)^2+(z-z_i)^2}{r_i^2}} \qquad \text{(Eq. 1)}$$

where Vbias is the resulting modified potential at a certain grid point, V0 corresponds to the original AutoDock4 energy at the same grid point, Vset is the maximum energy reward (negative value) for the bias, (x,y,z) are the grid point coordinates, (xi,yi,zi) are the coordinates of the bias site center, and ri is the bias site radius. Vset, ri and (xi,yi,zi) are user specified (see protocol below). Notice that the extension through space is modulated by a distance-dependent Gaussian decay. Strictly speaking, the energy modifications are applied only at the predefined grid point locations.
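To make the decay in equation 1 concrete, the following Python sketch evaluates the biased energy at a few distances from a bias site. It is only an illustration of the formula, not the AutoDock Bias implementation; the function name `biased_energy` is hypothetical and the unbiased energy V0 is set to zero as a placeholder.

```python
import math


def biased_energy(v0, point, center, v_set, radius):
    """Eq. 1: original energy plus a Gaussian-shaped reward centered on the bias site."""
    dist2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return v0 + v_set * math.exp(-dist2 / radius ** 2)


# Bias site center, Vset and radius as used later in this protocol;
# v0 = 0.0 is a placeholder for the unbiased AutoDock4 grid energy.
center = (1.782, 66.634, 6.337)
for offset in (0.0, 0.5, 1.0, 2.0):
    point = (center[0] + offset, center[1], center[2])
    energy = biased_energy(0.0, point, center, v_set=-1.5, radius=1.0)
    print("%.1f Å from center: %.3f kcal/mol" % (offset, energy))
```

At the center the full reward Vset is applied, at a distance of one radius it has dropped to 1/e (about 37%) of Vset, and at two radii it is almost negligible.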

Figure 1.

Energy map modification to perform biased docking with AutoDock Bias.

2. Methods

2.1. Prerequisites

To run AutoDock Bias it is required to have AutoDock4 and AutoDockTools (1.5.7 or later) installed on your computer.¹

AutoDockTools is capable of preparing protein structures automatically, but the use of more accurate tools such as Maestro(41) or Reduce(42) is encouraged.²

In addition, a file type converter between molecular formats such as Open Babel(43) will be useful to process ligand molecules.³

In the following we describe a step by step procedure to achieve successful biased docking calculations. Basic knowledge and practice regarding docking procedure in general, and AutoDock4 software in particular, is assumed. If the user is not familiar with AutoDock4 we suggest first to review its basic protocols(44) and tutorials and only afterwards start using AutoDock Bias.

Before executing the code examples shown in the following sections, it is useful to define the location of the MGLTools Utilities24 path containing the scripts from AutoDockTools and AutoDock Bias. For example, in a Bash shell and for a typical MGLTools installation it can be done executing the following command:

$ export MGLUTIL=$INSTALL_DIR/MGLToolsPckgs/AutoDockTools/Utilities24

Make sure to adapt it to the particular installation of MGLTools on your system. Alternatively, it is possible to execute the commands by specifying the full path for each script.

2.2. Knowledge-based biased docking

First, we will show how to apply a docking bias based on previous information on the target protein (knowledge-based). The presently chosen target is the cyclin-dependent kinase 2 (CDK2), a serine/threonine kinase that controls the cell cycle progress and whose dysregulation is frequently associated with different types of cancer, thus making it an attractive protein target.(45, 46)

The selected protein structure corresponds to PDB ID 1y91 (deposited in the PDB in 2004)(47) and we will dock a ligand coming from a later resolved structure (PDB ID 3sw4, deposited in 2011)(48) in the active site of the protein, assuming that we are still back in 2004. Our aim is to mimic the typical case where we have a co-crystal structure, and want to test the binding of another structurally different ligand whose binding mode is not necessarily similar to the one in the available crystal, though we assume some relevant interactions to be shared. This type of experiment in which the protein structure is not conformationally adapted for the binding of the particular ligand is known as cross-docking, and represents the most common case when studying protein-ligand binding.

2.2.1. Get starting structures

  1. Download the complex structure file 1y91.pdb from the PDB: https://www.rcsb.org/structure/1y91.

  2. Generate the initial receptor structure file receptor.pdb containing the crystallized structure of the protein using a text editor: just keep ATOM records from 1y91.pdb and be sure to remove the ligand and all water molecules.⁴

  3. Generate the co-crystallized structure of the ligand ligand_original.pdb: just keep HETATM records from 1y91.pdb and be sure to remove protein atoms and all water molecules. Remember that this is not the ligand we will be docking, since we want to test the more challenging cross-docking procedure.

  4. Download from the PDB the coordinates of the ligand of interest:

    Search for chemical ID 18K
    Download the optimized structure: Structure Data File (Ideal SDF).
    Save it as ligand_ideal.sdf

We do not want to use the co-crystallized structure of this ligand under PDB ID 3sw4, because we will proceed as if it were not available.⁵ In the end, we will use it to check our results.

The initial structure files (1y91.pdb, receptor.pdb, ligand_original.pdb, ligand_ideal.sdf) can be found in https://github.com/jparcon/adbias/tree/master/mimb_chapter/ini_structures.
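As an alternative to editing 1y91.pdb by hand, steps 2 and 3 above can be sketched in Python. This is a minimal illustration that relies only on the fixed-column PDB format; the function name `split_complex` and the list of water residue names are assumptions. It treats every non-water HETATM record as ligand, so the output should still be inspected for ions, buffer molecules or alternate conformations.

```python
import os

# Residue names commonly used for water (assumed list)
WATER_NAMES = {"HOH", "WAT", "H2O"}


def split_complex(pdb_lines):
    """Return (receptor_lines, ligand_lines) from the lines of a PDB complex."""
    receptor, ligand = [], []
    for line in pdb_lines:
        record = line[:6].strip()      # columns 1-6: record name
        resname = line[17:20].strip()  # columns 18-20: residue name
        if record == "ATOM":
            receptor.append(line)
        elif record == "HETATM" and resname not in WATER_NAMES:
            ligand.append(line)
    return receptor, ligand


# Only runs when the downloaded complex is present in the working directory
if os.path.exists("1y91.pdb"):
    with open("1y91.pdb") as handle:
        receptor, ligand = split_complex(handle.readlines())
    with open("receptor.pdb", "w") as out:
        out.writelines(receptor + ["END\n"])
    with open("ligand_original.pdb", "w") as out:
        out.writelines(ligand + ["END\n"])
```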

2.2.2. Prepare protein and ligand structures

Since AutoDock4 uses PDBQT input files describing atom types, atomic charges and flexibility (allowed torsions), besides the atom coordinates, we need to preprocess the PDB/SDF structure files.

Prepare the ligand PDBQT file
  1. Transform the SDF file with the ligand coordinates to PDB, because AutoDockTools cannot handle SDF files. You may use Open Babel for this task.

    $ obabel -isdf ligand_ideal.sdf -opdb -O ligand_ideal.pdb
  2. Prepare an adequate torsion tree and assign atom types and charges for the ligand with the prepare_ligand4.py script provided in AutoDockTools.

    $ pythonsh $MGLUTIL/prepare_ligand4.py -l ligand_ideal.pdb -o ligand_dock.pdbqt

    -l input ligand (PDB)

    -o output ligand (PDBQT)

    This will generate a PDBQT file for the corresponding ligand (ligand_dock.pdbqt).

  3. Visually check the ligand structure, in particular its protonation state (see below). You may open the PDBQT file with the Python Molecule Viewer (PMV) from AutoDockTools (ADT), or with PyMOL.⁶

    $ adt ligand_dock.pdbqt

For the current ligand (Figure 2) we will leave both terminal N groups, NH2 and N(CH3)2, deprotonated since they are directly bound to electron withdrawing aromatic rings and we do not expect them to be protonated (charged) at physiological pH. Remember AutoDock uses a united atom approach for hydrogen atoms that cannot be implicated in standard hydrogen bonds, i.e., those H atoms are merged, along with their charge, to their bonded heavy atom.

Figure 2.

Ligand PDBQT structure file.

Prepare the protein PDBQT file

The receptor PDBQT file will be prepared from the initial receptor structure (receptor.pdb). We have to manage the protonation of protein residues, as well as the special orientation treatment for amides, histidines, alcohols and cysteines. This procedure can be done in an automated fashion with AutoDockTools, but it is recommended to use more accurate tools such as Reduce or one that allows user intervention such as Maestro (see Note 1). For this purpose we used the Maestro software with a free academic license.

  1. Protonate protein residues and, if necessary, adjust the orientation of amide, histidine, alcohol and cysteine residues using Maestro.

    $ maestro receptor.pdb
    Click on “Protein Preparation”.
    Mark “Add hydrogens” and “Create disulfide bonds” boxes.
    Click “Preprocess”.
    Check in the pop-up window that no significant atoms are missing, or that any missing atoms are far away from the binding site. You may zoom in on each residue by clicking on its corresponding row in the pop-up window.
    Click “OK”.
    Click on the “Refine” tab.
    Click “Interactive optimizer” and “Analyze network”.
    Go through the different residues by clicking each row and choose adequate orientation if necessary. Interactions of the selected residue are plotted in the main window to aid in the decision.
    Close the “Interactive H-bond Optimizer” and “Protein Preparation Wizard” windows.
    Right click on “receptor - preprocessed” in the ENTRY LIST.
    Click “Export” > “Structures…”.
    Save the structure as a PDB file: receptor_H.pdb

    The protonated receptor was saved as receptor_H.pdb in https://github.com/jparcon/adbias/tree/master/mimb_chapter/ini_structures.

  2. Use the prepare_receptor4.py script provided in AutoDockTools to obtain the PDBQT file:

    $ pythonsh $MGLUTIL/prepare_receptor4.py -r receptor_H.pdb -o receptor.pdbqt

    -r receptor input (PDB)

    -o receptor output (PDBQT)

  3. Verify that the receptor is treated as a rigid molecule, i.e., no torsions are allowed. To do so, open the receptor.pdbqt file with a text editor and check that it has no torsion tree, meaning there are no ROOT, BRANCH, TORSDOF records (see Note 2).
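The rigidity check in the last step can also be scripted. The sketch below simply looks for torsion tree keywords at the start of each line; the function name `torsion_records` is hypothetical, and the keyword list is extended with ENDROOT and ENDBRANCH, which close the corresponding blocks in ligand PDBQT files.

```python
import os

TORSION_RECORDS = ("ROOT", "ENDROOT", "BRANCH", "ENDBRANCH", "TORSDOF")


def torsion_records(pdbqt_lines):
    """Return the lines of a PDBQT file that belong to a torsion tree."""
    found = []
    for line in pdbqt_lines:
        fields = line.split()
        if fields and fields[0] in TORSION_RECORDS:
            found.append(line.rstrip("\n"))
    return found


# Only runs when the prepared receptor is present in the working directory
if os.path.exists("receptor.pdbqt"):
    with open("receptor.pdbqt") as handle:
        records = torsion_records(handle.readlines())
    if records:
        print("Torsion tree records found, receptor is not rigid:")
        print("\n".join(records))
    else:
        print("No torsion tree records found: receptor is rigid.")
```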

The PDBQT files and all the following generated files from the knowledge-based docking protocol can be found in https://github.com/jparcon/adbias/tree/master/mimb_chapter/knowledge_bias.

2.2.3. Prepare the grid parameter file (GPF) and the energy maps

Based on the PDBQT files of both the receptor and the ligand, we will now create the grid where the conformational search of the ligand will be performed, along with the precalculated energy maps for each interaction (see AutoDock4 user guide for more details).

The program will compute the following grid maps:

  • 1 van der Waals (+ hydrogen bond if appropriate) map for each ligand atom type, also including the charge-independent desolvation contribution (receptor.C.map, receptor.N.map, …);

  • 1 electrostatic map (receptor.e.map);

  • 1 charge-dependent desolvation map (receptor.d.map).

  1. Calculate the center of mass of the co-crystallized ligand (ligand_original.pdb), since it will be the center of the docking grid.⁷ Use the center_of_mass.py script that was downloaded from https://pymolwiki.org and is located at https://github.com/jparcon/adbias/tree/master/mimb_chapter/add_scripts.

    $ pymol ligand_original.pdb
    > import center_of_mass
    > com ligand_original

    The obtained coordinates are (1.801, 63.148, 8.293).

  2. Generate the GPF (grid parameter file) with the information required for creating the maps. This is done using the prepare_gpf4.py script from AutoDockTools:

    $ pythonsh $MGLUTIL/prepare_gpf4.py -l ligand_dock.pdbqt -p npts="80,80,80" -p gridcenter="1.801,63.148,8.293" -r receptor.pdbqt -o receptor.gpf

    -l input ligand (PDBQT)

    -p npts="80,80,80" gives a grid covering the whole ATP binding site of CDK2

    -p gridcenter="1.801,63.148,8.293" centers the grid on the center of mass of the co-crystallized ligand

    -r input receptor (PDBQT)

    -o output GPF

  3. Build the energy maps using the AutoGrid4 program (from AutoDock4 package) and the corresponding parameters from the GPF:

    $ autogrid4 -p receptor.gpf -l receptor.glg

    -p input GPF

    -l output log file

    This creates the different energy maps (receptor.X.map).

  4. Visualize the grid size (Figure 3) using ADT (or VMD⁸) and selecting one of the maps:

    $ adt receptor.pdbqt ligand_original.pdb

    > Click “Grid3D”, “Read…” and open one of the maps (e.g. receptor.A.map).

Figure 3.

Protein structure (yellow lines), co-crystallized ligand (orange sticks) and squared grid centered on the co-crystallized ligand (white box).
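For readers who prefer to obtain the grid center without PyMOL, the sketch below computes a mass-weighted center from the fixed-column coordinates of a PDB file and estimates the edge length of the grid box. It is an illustration, not the center_of_mass.py script referenced above: the function names and the small mass table are assumptions, and the result may differ slightly from the reported center depending on how that script weights the atoms. The box size assumes the default AutoGrid spacing of 0.375 Å, which is not set explicitly in the command above.

```python
import os

# Approximate atomic masses for elements common in drug-like ligands (assumed table)
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
               "P": 30.974, "S": 32.06, "CL": 35.45, "BR": 79.904}


def center_of_mass(pdb_lines):
    """Mass-weighted center of all ATOM/HETATM records (element from columns 77-78)."""
    total = 0.0
    weighted = [0.0, 0.0, 0.0]
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        mass = ATOMIC_MASS[line[76:78].strip().upper()]  # KeyError for unlisted elements
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for axis in range(3):
            weighted[axis] += mass * xyz[axis]
        total += mass
    return tuple(value / total for value in weighted)


def box_edge(npts, spacing=0.375):
    """Edge length (Å) of a grid box with npts intervals of the given spacing."""
    return npts * spacing


print("Box edge for npts=80: %.1f Å" % box_edge(80))
# Only runs when the co-crystallized ligand file is present
if os.path.exists("ligand_original.pdb"):
    with open("ligand_original.pdb") as handle:
        print("Center: %.3f, %.3f, %.3f" % center_of_mass(handle.readlines()))
```

Under the spacing assumption, npts=80 corresponds to a cubic box of 30 Å per side around the grid center.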

2.2.4. Prepare the docking parameter file (DPF)

  1. Prepare the parameter file for the docking runs (DPF = docking parameter file) using the prepare_dpf42.py script from AutoDockTools:

    $ pythonsh $MGLUTIL/prepare_dpf42.py -p ga_run=100 -l ligand_dock.pdbqt -r receptor.pdbqt -o ligand_dock.dpf

    -p ga_run=100 indicates 100 independent docking runs

    -l input with the ligand to be docked (PDBQT)

    -r input receptor (PDBQT)

    -o output with the parameters for the docking runs (DPF)

2.2.5. Organize the files required for the biases

By now, you should have created the following files:

  • LIGAND PDBQT: ligand_dock.pdbqt

  • ENERGY MAPS: receptor.A.map, receptor.C.map, receptor.HD.map, receptor.NA.map, receptor.N.map, receptor.SA.map, receptor.e.map, receptor.d.map, receptor.maps.xyz, receptor.maps.fld

  • DPF: ligand_dock.dpf

With this collection of files it is possible to run a conventional (i.e., unbiased) docking of the selected ligand in the receptor structure that was used to build the grid maps, simply by executing:⁹

$ autodock4 -p ligand_dock.dpf -l ligand_dock.dlg

-p input DPF

-l output log file with the docking results (DLG = docking log)

Now we will add the biases which are the main focus of this protocol.

AutoDock Bias is based on a Python command line script that modifies previously created conventional AutoDock4 files. It allows two different types of bias treatment:

  • Modification of energy maps, DPF and ligand PDBQT files as required for general built-in biases, i.e. hydrogen bond donors, hydrogen bond acceptors, and/or aromatic interactions. This function allows the user to define the bias for the most common molecular interactions and once the particular interaction type and its localization is indicated, the bias docking run is automatically prepared.

  • Modification of specific energy maps to generate user defined biases. This versatile function allows the user to define a precise localization for any desired atom (e.g. metal atom) or group (e.g. substructure core of a congeneric ligand series or for fragment growth). It may also be used to set an anchor for covalent docking studies.

In any case, to apply a particular bias we need to keep in mind two key characteristics:

  • to which atom types of the LIGAND the bias will be applied;

  • where in the RECEPTOR surface the bias will be localized.

This information, as will be shown along the rest of the chapter, can be derived from different sources.

2.2.6. Prepare input file for bias: the bias parameter file (BPF)

We will start with a direct approach based on complex structures of the receptor with known ligands. Several protein-ligand complexes are available for CDK2 in the PDB, which we will use to determine possible key protein-ligand interactions that will later be used to bias (or guide) the docking. The main ligand-derived pharmacophore for this target is located in the so-called hinge region, showing a hydrogen bond acceptor moiety interacting with the backbone NH of Leu83 and a hydrogen bond donor moiety interacting with the backbone C=O of the same residue. Even by 2004 (the date when our protein structure was deposited) there were several co-crystallized ligands sharing these interactions through different functional groups such as adenine (1ckp), guanine (1e1v), pyrazolopyrimidine (1y91) and bi-indole derivatives (2bhe), among others. Figure 4 shows the hinge region of PDB 1y91 together with its co-crystallized ligand and the superposed reference ligand from 3sw4 (the one we are going to dock), highlighting both mentioned interactions.

Figure 4.

Hinge region of CDK2 from PDB ID 1y91 (cylinders). The reference ligand from PDB ID 3sw4 is superimposed (balls and sticks). Arrows indicate protein interactors obtained from ligand-derived pharmacophore.

  1. Determine the bias position and parameters that will promote the formation of the hydrogen bonds with Leu83 from CDK2 as marked in Figure 4. To get the bias position we will use the protonated receptor structure (receptor_H.pdb) and an additional script from AutoDock Bias (ideal_interaction_sites.py)¹⁰ that calculates the location of ideal interactions for selected residues of interest (Leu83 in this case).

    $ pythonsh $MGLUTIL/contrib/adbias/ideal_interaction_sites.py -i receptor_H.pdb -c A -r 83

    -i input PDB file name containing the protein structure (with hydrogens)

    -c protein chain ID (e.g. A)

    -r residue ID(s) whose ideal interaction positions are to be calculated

  2. Open the generated ideal interaction sites (interaction_sites.pdb)¹¹:

    $ adt receptor.pdbqt interaction_sites.pdb

    This displays the structure shown in Figure 5 (left panel), with the calculated ideal interaction sites for the indicated residue (Leu83).

    According to the co-crystallized ligand conformation (Figure 5, right panel), choose the most suitable interaction site among the possible donor orientations for the carbonyl and finally get a hydrogen bond acceptor from Leu83 NH backbone and a hydrogen bond donor to Leu83 C=O backbone (Note 3). Their coordinates relative to the receptor.pdb structure are:

    Acceptor site: (1.782, 66.634, 6.337)
    Donor site: (3.400, 64.956, 5.106)
  3. Prepare the bias parameter file (BPF)¹² containing all the information for the different biases to be applied. The BPF contains one line for each bias, with the following parameters:

    • (x, y, z) coordinates in Å,

    • energy reward (Vset) in kcal/mol (negative value),

    • decay radius (r) in Å,

    • type of bias (don, acc, aro or map).

    Currently, the following types of bias are available:

    • hydrogen bond donor = don,

    • hydrogen bond acceptor = acc,

    • aromatic = aro,

    • specific bias according to the desired map = map.

    With the coordinates and type of the ligand-derived interaction sites, generate the input bias parameter file ligand_dock.bpf as follows:

    x y z Vset r type
    1.782 66.634 6.337 -1.50 1.00 acc
    3.400 64.956 5.106 -1.50 1.00 don

    Vset and the radius (r) are user defined. We will use −1.5 kcal/mol, a moderately strong bias. If softer biases are desired, use smaller energy magnitudes (e.g., weak biases typically range from −1.0 to −0.5 kcal/mol). A radius of 1.0 Å will be used to control the decay of the energy reward (see equation 1). This value is adequate for hydrogen bond biases; other biases may require larger values (see below).

  4. Use the bias2pdb.py script to convert the input bias file to a PDB file (bias_sites.pdb) and check that the site coordinates are properly placed, for example by opening the file with ADT (Figure 5, right panel).

    $ pythonsh $MGLUTIL/contrib/adbias/bias2pdb.py ligand_dock.bpf
    $ adt receptor.pdbqt ligand_original.pdb bias_sites.pdb
Figure 5.

(Left) Ideal interaction sites for the specified residue (Leu83) shown as cylinders. The receptor is depicted as gray ribbons. The calculated interaction sites are shown as spheres: blue = hydrogen bond acceptor, gray = hydrogen bond donor. (Right) Cocrystallized ligand from PDB ID 1y91 superposed to bias sites shown as spheres (blue = hydrogen bond acceptor, gray = hydrogen bond donor).

The bias_sites.pdb file has the hydrogen bond acceptor bias site named as residue ACC and the hydrogen bond donor bias site named as residue DON. Their coordinates set the locations where the corresponding energy maps will be modified.
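Before running AutoDock Bias it can be helpful to sanity-check the BPF contents. The following Python sketch parses the six-column layout described above and checks that each energy reward is negative, each radius is positive and each bias type is one of the four listed types. It is a hypothetical helper, not part of AutoDock Bias, and it assumes that the header line is present and that plain ASCII minus signs are used.

```python
VALID_TYPES = {"don", "acc", "aro", "map"}


def parse_bpf(text):
    """Parse BPF text into a list of site dictionaries, skipping the header line."""
    sites = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0] == "x":  # blank line or "x y z Vset r type" header
            continue
        if len(fields) != 6:
            raise ValueError("line %d: expected 6 columns, found %d" % (number, len(fields)))
        x, y, z, v_set, radius = (float(value) for value in fields[:5])
        bias_type = fields[5]
        if v_set >= 0:
            raise ValueError("line %d: Vset must be negative" % number)
        if radius <= 0:
            raise ValueError("line %d: radius must be positive" % number)
        if bias_type not in VALID_TYPES:
            raise ValueError("line %d: unknown bias type %r" % (number, bias_type))
        sites.append({"center": (x, y, z), "vset": v_set,
                      "radius": radius, "type": bias_type})
    return sites


bpf_text = """x y z Vset r type
1.782 66.634 6.337 -1.50 1.00 acc
3.400 64.956 5.106 -1.50 1.00 don
"""
for site in parse_bpf(bpf_text):
    print(site["type"], site["center"], site["vset"], site["radius"])
```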

2.2.7. Introduce the bias in the maps

  1. Execute the AutoDock Bias¹³ main script (prepare_bias.py) to modify the energy maps and input files according to the bias parameter file prepared in the previous section:

    $ pythonsh $MGLUTIL/contrib/adbias/prepare_bias.py -b ligand_dock.bpf -g receptor.gpf -d ligand_dock.dpf

    -b: bias parameter file (input file for bias)

    -g: AutoDock4 grid parameter file

    -d: AutoDock4 docking parameter file

    This will modify the acceptor and donor energy maps (NA and HD in the present case), and also the input DPF (the maps to read will be the biased ones), so that everything is set to run the biased docking.

    The new map names will add the .biased.map extension to the original basename.

    The new DPF name will add the .biased.dpf extension to the original basename.

  2. Use ADT to check the adequate energy modifications in the grid maps (Figure 6):

    $ adt bias_sites.pdb

    > Click the “C” circle on the bias_sites line for activating “sphere” representation.

    > Click the “L” circle on the bias_sites line for deactivating “lines” representation.

    > Right click the “C” circle > “Rendering” tab (lower panel) > “Wires” representation.

    > If necessary, zoom out with the middle scroll to see both sites.

    > Open 3D Grid/Volume Rendering (Grid3D > Show Control Panel).

    > Add the biased map to check (e.g. receptor.NA.biased.map) by clicking the “Add” (+) button and selecting the corresponding map.

    > Click “Isocontour”.

    > Set “max” to 0.00

    > SHIFT + click in the panel to set the isocontour value.

    > Move the bar and check that the lower energy values are on the acceptor bias site.

    > Repeat for the remaining biased map (receptor.HD.biased.map).

  3. Check the changes in the map lines between the original DPF and the one newly created (ligand_dock.biased.dpf). Use a text editor to open the two files, or text comparison programs such as diff, vimdiff or meld, for example:

    $ vimdiff ligand_dock.dpf ligand_dock.biased.dpf
Figure 6.

Grid Control Panel of ADT, showing receptor.NA.biased.map, at an isoenergetic value of −1.5 kcal/mol. The wired representation is for the bias sites.

Verify that the new DPF reads the biased maps instead of original ones for HD and NA.
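The DPF comparison in step 3 can also be scripted. The following is a minimal sketch, not part of AutoDock Bias: the helper names `map_lines` and `changed_maps` are hypothetical, and the DPF excerpt is an illustrative stand-in for a real file. It assumes both DPFs list the same number of `map` lines in the same order, which holds when only the HD and NA maps are swapped.

```python
def map_lines(dpf_text):
    """Return the file names found on every 'map' line of a DPF, in order."""
    names = []
    for line in dpf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2 and fields[0] == "map":
            names.append(fields[1])
    return names


def changed_maps(original_dpf_text, biased_dpf_text):
    """Pair up the map entries that differ between the two DPFs."""
    old = map_lines(original_dpf_text)
    new = map_lines(biased_dpf_text)
    return [(o, n) for o, n in zip(old, new) if o != n]


# Illustrative DPF excerpt (not the full file produced in the chapter).
original = """\
ligand_types A C HD N NA
map receptor.A.map        # atom-specific affinity map
map receptor.HD.map
map receptor.NA.map
elecmap receptor.e.map
"""
biased = original.replace("receptor.HD.map", "receptor.HD.biased.map")
biased = biased.replace("receptor.NA.map", "receptor.NA.biased.map")

for old_name, new_name in changed_maps(original, biased):
    print(old_name, "->", new_name)
```

With real files, read ligand_dock.dpf and ligand_dock.biased.dpf into strings and pass them to `changed_maps`; only the HD and NA entries are expected to differ.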

2.2.8. Perform the biased docking

  1. Perform the docking run using the built bias. The docking run is executed in the same way as conventional AutoDock runs, but indicating the biased docking parameter file (which points to the modified grids):

    $ autodock4 -p ligand_dock.biased.dpf -l ligand_dock.biased.dlg

2.2.9. Analyze results

We will illustrate the analysis with previously calculated results (see the file ligand_dock.biased.dlg in https://github.com/jparcon/adbias/tree/master/mimb_chapter/knowledge_bias/outputs). Follow the same steps with your own results.

  1. Open the docking log file (ligand_dock.biased.dlg) with a text editor and go to the “CLUSTERING HISTOGRAM” section. In our case, 12 different ligand poses were found, with the first ranked pose bearing both the best free energy of binding (−10.05 kcal/mol) and the largest cluster population (39/100).

  2. Download PDB ID 3sw4 (the cocrystal structure of the docked ligand) and align the structure to receptor.pdb to get the reference pose of the docked ligand. Save the ligand structure of the aligned complex as ligand_ref.pdb (Note 4).

  3. Visualize the predicted poses from the docking log (DLG) with ADT and compare them with the reference one:

    $ adt receptor.pdbqt ligand_ref.pdb

    > On the ADT bar, click on “Analyze”, “Dockings”, “Open…” and choose the ligand_dock.biased.dlg file.

    > Click on “Analyze”, “Clusterings”, “Show…”.

    A histogram with the energy distribution for the resulting poses, clustered according to their RMSD, will pop up (see Figure 7 as an example; yours is probably slightly different). Click on each bar to see the 3D structure of the poses belonging to each cluster in the visualization panel, superposed on the reference ligand. Figure 8 (left panel) shows a representative structure for the first cluster. With the biased docking method, the first ranked pose closely resembles the crystal reference pose. For the sake of comparison, Figure 8 also shows in the right panel the best ranked pose obtained with conventional docking, where the conformation does not resemble the crystal pose. While it also forms two hydrogen bonds with Leu83, it uses alternative functional groups for these interactions and establishes a totally different interaction pattern with the rest of the binding pocket. Check these results yourself by loading the conventional docking poses from ligand_dock.dlg (https://github.com/jparcon/adbias/tree/master/mimb_chapter/conventional).

  4. Generate PDB files for each cluster representative pose with the pdb_poses.py script:

    $ pythonsh $MGLUTIL/contrib/adbias/pdb_poses.py ligand_dock.biased.dlg

    This will create rank1.pdb, rank2.pdb, …, rank12.pdb structures.

  5. Calculate the RMSD against the reference cocrystal structure. First, remove the hydrogen atoms from rank1.pdb and save it as rank1_noH.pdb. Then reorder the atoms line by line in ligand_ref.pdb to make them coincide with their positions in rank1_noH.pdb and save it as ligand_ref_renum.pdb (see footnote 14). Finally, execute the compute_rms_between_conformations.py script:

    $ pythonsh $MGLUTIL/compute_rms_between_conformations.py -f ligand_ref_renum.pdb -s rank1_noH.pdb

    The script will generate a file called summary_rms_results.txt which contains the RMSD value calculated for the pose with respect to the reference structure. The value obtained in our experiment was:

    • RMSD rank 1 vs. reference = 1.42 Å

  6. With all this information and the “CLUSTERING HISTOGRAM” section of the docking log, build the typical cluster population vs. docking score plot adding the RMSD value for the best ranked pose. An example with our obtained results is shown in Figure 9.
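To make the RMSD calculation of step 5 explicit, here is a minimal sketch in plain Python. It is not the implementation of compute_rms_between_conformations.py. The helper names (`pdb_line`, `heavy_atom_coords`, `rmsd`) and the two-atom demo coordinates are hypothetical. It assumes standard fixed-column PDB records, atoms listed in the same order in both files, and hydrogens recognizable from the element column or, failing that, from the atom name. No fitting is performed, because docked poses already share the receptor frame.

```python
import math


def pdb_line(serial, name, x, y, z):
    """Build a fixed-column HETATM record for the demo (no element column)."""
    return "%-6s%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        "HETATM", serial, name, "LIG", "A", 1, x, y, z)


def heavy_atom_coords(pdb_text):
    """Collect (x, y, z) of ATOM/HETATM records, skipping hydrogens."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        element = line[76:78].strip()
        name = line[12:16].strip()
        # Assumption: without an element column, the first letter of the
        # atom name identifies the element (adequate for organic ligands).
        symbol = element or name.lstrip("0123456789")[:1]
        if symbol.upper() == "H":
            continue
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rmsd(reference, pose):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(reference) != len(pose):
        raise ValueError("atom counts differ: %d vs %d" % (len(reference), len(pose)))
    squared = sum((a - b) ** 2 for p, q in zip(reference, pose) for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom ligand; the pose is shifted by 1 Å along z.
reference_pdb = "\n".join([pdb_line(1, "C1", 0.0, 0.0, 0.0),
                           pdb_line(2, "N1", 1.0, 0.0, 0.0)])
pose_pdb = "\n".join([pdb_line(1, "C1", 0.0, 0.0, 1.0),
                      pdb_line(2, "N1", 1.0, 0.0, 1.0),
                      pdb_line(3, "H1", 5.0, 5.0, 5.0)])

value = rmsd(heavy_atom_coords(reference_pdb), heavy_atom_coords(pose_pdb))
print("RMSD = %.2f" % value)
```

Because hydrogens are filtered while reading, rank1.pdb could be passed directly in this sketch; the atom reordering of ligand_ref.pdb is still required.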

Figure 7.

Binding energy distribution for 100 docking runs.

Figure 8.

Best ranked predicted poses (in cylinders) compared to reference co-crystal structure (in CPK) for docking with ligand-derived pharmacophore bias (left) and conventional docking with AutoDock4 (right).

Figure 9.

Cluster population vs. docking score for the ligand-derived biased docking (left) and AutoDock4 conventional docking (right) methods. RMSD (in Å) of the first ranked pose against the reference structure is indicated in blue inside the plot.

Conclusion of knowledge-based biased docking.

After applying the bias towards two known protein-ligand interactions, the docking method was capable of predicting the reference cocrystal pose with best energy and highest population, thus effectively discriminating the correct pose among the false positives.

2.3. Solvent site biased docking

Cosolvent Molecular Dynamics for prediction of protein-ligand interaction sites

Here we will provide a brief description of cosolvent molecular dynamics (MD) simulations in order to give proper context to its application in AutoDock Bias. For more detailed information see references (35, 49).

As a consequence of the shape and charge distribution of protein structures, solvent molecules are not placed randomly on their surface but instead tend to cluster in specific regions of favorable interaction. These clusters, which can be evidenced for example as crystallographic water or co-crystallized cosolvent molecules, mimic potential protein-ligand interactions,(38, 50) and can therefore be used in knowledge-based docking approaches. MD simulations of proteins in explicit water and/or water-solvent mixtures can also be used to reveal key protein–ligand interaction hot spots.(35, 51) These hot spots are defined as confined regions of space adjacent to the binding/active site of the protein surface where the probability of finding water (or cosolvent) molecules is significantly higher than in the bulk. In our previous work,(35) we showed that, using ethanol as a probe, hydrophilic, hydrophobic and even aromatic interaction sites could be readily identified. After performing the MD simulation of the protein target in an aqueous solution of ethanol, we should be able to identify, along the binding site, regions where ethanol preferentially interacts with the protein surface, with either its hydroxyl (OH) end or its hydrophobic methyl (CH3) end. These regions are the so-called solvent sites (or interaction hot spots), and below we will show how to apply biases in the docking protocol based on them. The general idea is to guide ligand groups capable of forming hydrogen bonds towards locations where the OH of ethanol preferentially interacts with the protein and, at the same time, to guide hydrophobic groups of the ligand to hydrophobic hot spots obtained from the methyl end of ethanol. We will focus on the latter type of interaction because it is treated differently from the hydrogen bond biases shown before.
The solvent sites that will define the bias positions and parameters (Vset and r) can be obtained using any of the available strategies, such as WATCLUST,(37) Watermap,(52) MDMix,(53) or SILCS(33). In the present example we will use sites determined by the WATCLUST algorithm applied to cosolvents.(35)

2.3.1. Generate the bias parameter file (BPF)

Aromatic bias from cosolvent MD

We are now going to add a cosolvent-derived hydrophobic bias in addition to the knowledge-based bias from the previous docking section. As previously described,(35, 53) despite being a non-aromatic probe, the hydrophobic sites from ethanol successfully capture the location of aromatic rings from common ligands and will therefore be used as aromatic biases.

To determine the bias position and parameters we previously performed three short 20 ns MD simulations of CDK2 in an explicit water/ethanol mixture and used a modified version of the WATCLUST algorithm to determine hydrophobic ethanol solvent sites. Readers who want to determine the sites themselves are referred to the WATCLUST tutorials (https://watclust.wordpress.com/examples/) and its cosolvent derivation.(35) Here we provide the obtained results for a brief analysis.

WATCLUST output file = solvent_sites.pdb (available in https://github.com/jparcon/adbias/tree/master/mimb_chapter/cosolvent_bias/inputs)

The solvent_sites.pdb file has the hydrophobic interaction sites determined with the ethanol probe for this target (atom name = C1).

  1. Copy to a new working directory the solvent_sites.pdb file and the following files from the previous runs: receptor.pdbqt and ligand_ref.pdb.

  2. Visualize the solvent sites with ADT and compare them with the reference ligand functional groups (Figure 10):

    $ adt receptor.pdbqt ligand_ref.pdb solvent_sites.pdb

    As can be seen from Figure 10, two of the hydrophobic solvent sites coincide with the position of aromatic rings from the reference ligand structure. It should be noted, however, that there is an extra site (marked with a red arrow in Figure 10) that may be considered a false positive for this particular ligand, i.e., it does not overlay any aromatic center (Note 5).

  3. Generate the input bias parameter file (ligand_dock_sv.bpf) adding the coordinates and type of sites extracted from the solvent_sites.pdb to our previous BPF:

    x y z Vset r type
    1.782 66.634 6.337 -1.50 1.00 acc
    3.400 64.956 5.106 -1.50 1.00 don
    4.373 62.899 6.880 -2.60 1.20 aro
    1.165 65.278 6.896 -2.40 1.60 aro
    -2.230 66.659 8.705 -1.80 2.20 aro

    The Vset and radius (r) come from the MD derived solvent sites. Vset is the free energy of binding of the ethanol site and the radius is related to the dispersion and translational entropy of the site (see ref. (35) for details on their calculation).
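Before running prepare_bias.py, it can help to check the BPF against the format requirements listed in footnote 12. The sketch below is a hypothetical stand-alone checker, not the parser used by AutoDock Bias: `parse_bpf` and its error messages are invented for illustration. Two rules are assumptions rather than documented behavior: blank lines are skipped, and the radius must be positive.

```python
ALLOWED_TYPES = {"acc", "don", "aro", "map"}


def parse_bpf(text):
    """Parse bias parameter file text into a list of site dictionaries."""
    sites = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # columns may be space or tab separated
        if not fields:
            continue  # assumption: blank lines are skipped
        try:
            x = float(fields[0])
        except ValueError:
            continue  # first column not numeric: header line, ignored
        if len(fields) != 6:
            raise ValueError("line %d: expected 6 columns, found %d"
                             % (number, len(fields)))
        y, z, vset, radius = (float(value) for value in fields[1:5])
        kind = fields[5]
        if kind not in ALLOWED_TYPES:
            raise ValueError("line %d: unknown bias type %r" % (number, kind))
        if radius <= 0:
            raise ValueError("line %d: radius must be positive" % number)
        sites.append({"center": (x, y, z), "vset": vset,
                      "radius": radius, "type": kind})
    kinds = {site["type"] for site in sites}
    if "map" in kinds and len(kinds) > 1:
        raise ValueError("map biases cannot be combined with don, acc or aro")
    return sites


# The BPF of this section, typed with ASCII minus signs.
bpf_text = """\
x y z Vset r type
1.782 66.634 6.337 -1.50 1.00 acc
3.400 64.956 5.106 -1.50 1.00 don
4.373 62.899 6.880 -2.60 1.20 aro
1.165 65.278 6.896 -2.40 1.60 aro
-2.230 66.659 8.705 -1.80 2.20 aro
"""

for site in parse_bpf(bpf_text):
    print(site["type"], site["center"], site["vset"], site["radius"])
```

Note that the numbers must be typed with the ASCII hyphen-minus; a typographic minus sign copied from a formatted document makes the first column non-numeric in this sketch, so the line would be skipped as if it were a header.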

Figure 10.

Hydrophobic solvent sites derived from ethanol/water MD depicted as cyan spheres overlapping the reference co-crystallized ligand.

2.3.2. Introduce the bias in the maps

For introducing the bias we will need to copy to the working directory the following files from the previous runs:

  • conventional energy maps: receptor.A.map, receptor.C.map, …, receptor.maps.*

  • conventional grid parameter file: receptor.gpf

  • conventional docking parameter file: ligand_dock.dpf

  • ligand PDBQT file: ligand_dock.pdbqt

Additionally, we will need the BPF that we just generated:

  • input bias parameter file: ligand_dock_sv.bpf

  1. Execute AutoDock Bias (prepare_bias.py) to modify the energy maps and input files according to the bias parameter file prepared in the previous section:

    $ pythonsh $MGLUTIL/contrib/adbias/prepare_bias.py -b ligand_dock_sv.bpf -g receptor.gpf -d ligand_dock.dpf

    This will modify the acceptor and donor maps (NA and HD in our case) and create a new map for the aromatic bias (AC). It will also modify the DPF and the ligand PDBQT to insert a dummy atom in the center of aromatic rings.

  2. Check the bias in the maps using ADT (Figure 11):

    $ adt solvent_sites.pdb

    > Click the “C” circle on the solvent_sites line for activating “sphere” representation.

    > Click the “L” circle on the solvent_sites line for deactivating “lines” representation.

    > Right click the “C” circle > “Rendering” tab (lower panel) > “Points” representation.

    > If necessary, zoom out with the middle scroll to see all the sites.

    > Open 3D Grid/Volume Rendering (Grid3D > Show Control Panel).

    > Add the biased map you want to check (e.g. receptor.AC.biased.map) by clicking the “Add” (+) button and selecting the corresponding map.

    > Click “Isocontour”.

    > Set “max” to 0.00

    > SHIFT + click in the panel to set the isocontour value.

    > Move the bar and check that the lower energy values are inside the solvent sites (the AC map should show a spherical pattern).

    > Repeat for all the biased maps.

  3. Check the bias in the DPF by noting the changes between the original DPF and the one newly created (ligand_dock.biased.dpf). Use a text editor to open the two files, or text comparison programs such as diff, vimdiff or meld, for example:

    $ vimdiff ligand_dock.dpf ligand_dock.biased.dpf
    • The first line (parameter_file ad4_arom_params.dat) reads the newly created file with the parameters for the AC dummy atom type.

    • AC is added as a ligand atom type (ligand_types) with its corresponding map.

    • Biased maps are read instead of original maps for HD and NA.

    • The docking ligand is updated (move ligand_dock.dum.pdbqt)

  4. Check the modification in the ligand PDBQT with ADT (or PyMOL; see footnote 15):

    $ adt ligand_dock.pdbqt ligand_dock.dum.pdbqt

    The new PDBQT (ligand_dock.dum.pdbqt) is the same ligand as the original one (ligand_dock.pdbqt), but with dummy atoms added in the center of each aromatic ring (see footnote 16).
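The position of such a dummy atom can be illustrated with a short sketch. This is not the code used by prepare_bias.py, which relies on Open Babel for ring detection; here the ring atoms are supplied by hand and the helper `ring_center` is hypothetical. It computes the unweighted geometric center, which coincides with the center of mass only when all ring atoms have the same mass.

```python
import math


def ring_center(ring_coords):
    """Unweighted geometric center of the ring atom coordinates."""
    count = len(ring_coords)
    if count == 0:
        raise ValueError("no ring atoms given")
    return tuple(sum(atom[axis] for atom in ring_coords) / count
                 for axis in range(3))


# Hypothetical benzene-like ring: six atoms on a circle of radius 1.39 Å
# around the point (1.0, 2.0, 3.0), lying in the xy plane.
expected_center = (1.0, 2.0, 3.0)
ring = [(expected_center[0] + 1.39 * math.cos(math.radians(60 * k)),
         expected_center[1] + 1.39 * math.sin(math.radians(60 * k)),
         expected_center[2])
        for k in range(6)]

center = ring_center(ring)
print("dummy atom position: %.3f %.3f %.3f" % center)
```

Comparing such a center with the dummy atom coordinates in ligand_dock.dum.pdbqt is a quick sanity check on the modified ligand file.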

Figure 11.

Grid Control Panel of ADT, showing receptor.AC.biased.map, at an isoenergetic value of −1.0 kcal/mol depicted in red. The gray sphere representation is for the aromatic bias sites.

To sum up the effect of the aromatic bias, we just saw that the AC energy map has lower energies in the locations corresponding to hydrophobic ethanol sites. Therefore, aromatic rings from the ligand (with AC dummy atoms in their center of mass) will be guided during the docking towards those same locations.
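The visual check in ADT can be complemented by reading the map values at the bias site centers. The sketch below assumes the usual AutoDock4 grid map layout: six header lines (GRID_PARAMETER_FILE, GRID_DATA_FILE, MACROMOLECULE, SPACING, NELEMENTS, CENTER) followed by one energy per line, with x varying fastest and NELEMENTS + 1 points per axis. The functions `read_map` and `energy_at` and the tiny 3 × 3 × 3 demo grid are hypothetical; verify the layout against your own files before relying on it.

```python
def read_map(text):
    """Parse AutoDock-style grid map text into (spacing, npts, origin, values)."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = {line.split()[0]: line.split()[1:] for line in lines[:6]}
    spacing = float(header["SPACING"][0])
    npts = tuple(int(n) + 1 for n in header["NELEMENTS"])
    center = tuple(float(c) for c in header["CENTER"])
    values = [float(v) for v in lines[6:]]
    if len(values) != npts[0] * npts[1] * npts[2]:
        raise ValueError("unexpected number of grid values")
    origin = tuple(c - spacing * (n - 1) / 2.0 for c, n in zip(center, npts))
    return spacing, npts, origin, values


def energy_at(grid, point):
    """Energy stored at the grid point nearest to 'point' (x, y, z in Å)."""
    spacing, npts, origin, values = grid
    index = []
    for coordinate, start, count in zip(point, origin, npts):
        i = int(round((coordinate - start) / spacing))
        if not 0 <= i < count:
            raise ValueError("point lies outside the grid box")
        index.append(i)
    ix, iy, iz = index
    return values[ix + npts[0] * (iy + npts[1] * iz)]  # x varies fastest


# Hypothetical demo grid: all zeros except -2.40 at the central point.
demo_values = [0.0] * 27
demo_values[13] = -2.40
demo_map = "\n".join(
    ["GRID_PARAMETER_FILE demo.gpf", "GRID_DATA_FILE demo.maps.fld",
     "MACROMOLECULE demo.pdbqt", "SPACING 1.0", "NELEMENTS 2 2 2",
     "CENTER 0.0 0.0 0.0"] + ["%.3f" % v for v in demo_values])

grid = read_map(demo_map)
print(energy_at(grid, (0.1, -0.2, 0.0)))
```

Applied to receptor.AC.biased.map and the aro centers from the BPF, each sampled energy should be markedly lower than at points far from the sites.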

2.3.3. Perform the biased docking

The docking is executed as usual, indicating the cosolvent biased docking parameter file:

$ autodock4 -p ligand_dock.biased.dpf -l ligand_dock.biased.dlg

2.3.4. Analyze results

As we did for ligand-derived biased docking, we will illustrate the analysis with previously calculated results (see ligand_dock.biased.dlg in https://github.com/jparcon/adbias/tree/master/mimb_chapter/cosolvent_bias/outputs). Follow the steps with your own results.

  1. Open the docking log file (ligand_dock.biased.dlg) with a text editor and go to the “CLUSTERING HISTOGRAM” section. In our case, 6 different ligand poses were found. The first ranked pose has both the best docking score (−14.26 kcal/mol) and population (64/100).

  2. Visualize the poses with ADT and compare them with the reference one (ligand_ref.pdb):

    $ adt receptor.pdbqt ligand_ref.pdb

    > On the ADT bar, click on “Analyze”, “Dockings”, “Open…” and choose the ligand_dock.biased.dlg file.

    > Click on “Analyze”, “Clusterings”, “Show…”.

    Again, a histogram with the energy distribution for the resulting poses, clustered according to their RMSD, will pop up (see Figure 12, left panel, as an example, yours is probably slightly different). Click on each bar to see the 3D structure of the poses belonging to each cluster in the visualization panel. Figure 12 (right panel) shows the representative structure for the first ranked cluster, which nicely resembles the reference cocrystal pose, except for a partial rotation in the outermost thiazole ring that lacks defined interactions with the binding pocket.

  3. Generate PDB files for each cluster representative pose:

    $ pythonsh $MGLUTIL/contrib/adbias/pdb_poses.py ligand_dock.biased.dlg

    This will generate rank1.pdb, rank2.pdb, …, rank6.pdb.

  4. Calculate the RMSD of the first ranked pose against the crystal structure in the same manner that we did before. First, manually delete the dummy atoms and the hydrogens from rank1.pdb with a text editor and save it as rank1_noH.pdb. Copy from the previous runs ligand_ref_renum.pdb and then get the RMSD:

    $ pythonsh $MGLUTIL/compute_rms_between_conformations.py -f ligand_ref_renum.pdb -s rank1_noH.pdb

    The value obtained is saved to summary_rms_results.txt:

    • RMSD rank 1 vs. reference = 1.82 Å

  5. With all this information and the “CLUSTERING HISTOGRAM” section of the DLG, build the cluster population vs. docking score plot specifying the RMSD of the best ranked pose. An example with our obtained results is shown in Figure 13.
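The data for the cluster population vs. docking score plot can be pulled from the DLG with a few lines of Python. This is a minimal sketch that assumes the pipe-separated row layout of the “CLUSTERING HISTOGRAM” table (rank, lowest binding energy, run, mean binding energy, number in cluster). The function `clustering_histogram` is hypothetical, and the excerpt is illustrative: only the first-rank energy and population come from the text, while the run numbers, mean energies and the second row are invented for the demo.

```python
import re

# One table row: rank | lowest energy | run | mean energy | population | bars
ROW = re.compile(r"^\s*(\d+)\s*\|\s*([-+]?\d+\.\d+)\s*\|\s*(\d+)\s*\|"
                 r"\s*([-+]?\d+\.\d+)\s*\|\s*(\d+)\s*\|")


def clustering_histogram(dlg_text):
    """Return (rank, lowest_energy, population) for each cluster row."""
    rows = []
    in_section = False
    for line in dlg_text.splitlines():
        if "CLUSTERING HISTOGRAM" in line:
            in_section = True
            continue
        if not in_section:
            continue
        match = ROW.match(line)
        if match:
            rank, lowest, _run, _mean, population = match.groups()
            rows.append((int(rank), float(lowest), int(population)))
        elif rows:
            break  # first non-row line after the table ends the section
    return rows


# Illustrative excerpt; see the caveats in the paragraph above.
dlg_excerpt = """\
    CLUSTERING HISTOGRAM
    ____________________

Clus | Lowest    | Run | Mean      | Num | Histogram
Rank | Energy    |     | Energy    | Clus|
_____|___________|_____|___________|_____|________
   1 |    -14.26 |  17 |    -13.90 |  64 |########
   2 |    -12.00 |   3 |    -11.50 |  20 |####
_____|___________|_____|___________|_____|________
"""

for rank, energy, population in clustering_histogram(dlg_excerpt):
    print(rank, energy, population)
```

The resulting tuples can be fed to any plotting tool, with the docking score on one axis and the population on the other.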

Figure 12.

(Left panel) Binding energy distribution for poses obtained from 100 docking runs. (Right panel) Best ranked predicted pose (in cylinders) compared to reference cocrystal structure (in balls and sticks) for the biased docking method including solvent sites.

Figure 13.

Cluster population vs. docking score for the biased docking. RMSD (in Å) of the first ranked pose against the reference is indicated in blue.

Conclusion of solvent site biased docking.

After applying the bias towards the solvent sites from MD, the docking method was capable of predicting the crystal pose with the best energy and highest population, thus effectively discriminating the correct pose from the false positives (compare Figure 13 with the conventional method, Figure 9, right panel). In addition, the aromatic bias increased both the energy and the population separation between the crystal pose and the other ones (compare Figure 13 with the ligand-derived docking experiment with hydrogen bond bias only, Figure 9, left panel). Comparing both bias approaches, the mixed polar and aromatic bias also produced half as many clusters (6 vs. 12), that is, fewer false positive poses.

Discussion

Docking methods are currently widely used both in industry and academia to study protein-ligand interactions, especially in the context of drug development projects. AutoDock4 is one of the most popular docking tools, being open source and freely available. As for any docking program, its performance is highly system dependent and can be significantly improved using knowledge of particular protein-ligand interactions that are relevant for the system of interest. AutoDock Bias is a simple and elegant strategy which allows users to incorporate this type of information through the use of a modified scoring function that biases the ligand structure towards those poses (or conformations) that establish the selected interactions. The method has been shown to significantly improve docking performance both in terms of accuracy (lower RMSD of the predicted pose against the reference complex structure) and precision (its capacity to distinguish the correct pose among wrong predictions).(35–37) Moreover, incorporating bias significantly improves docking-based virtual screening predictive capacity.(39)

An initial key question when introducing a bias is how to define it. In the present work we have shown two powerful strategies. The first one relies on the availability of structures of the same (or a homologous) protein in the presence of at least one ligand, in order to identify relevant interactions and their position in the protein’s active site. The second approach determines protein-solvent interactions that mimic potential protein-ligand interactions and has a higher potential since it allows introducing bias even if there is no complex structure available. The presence of water sites (WS), i.e. regions of space adjacent to the protein surface where the density of water is significantly higher than that of the bulk, tends to reveal the presence of strong hydrogen bond interactions. WS can be identified if the receptor structure is available, since they correspond to crystallographic water molecules. In the absence of crystallographic waters, they can also be determined from explicit water MD simulations and an analysis of the resulting solvent structure. Hydrophobic or aromatic interactions can be revealed by the presence of amphiphilic molecules (such as acetonitrile or glycerol) in available crystal structures, but this is not the usual case. Therefore, the best strategy is to perform mixed solvent MD using, for example, water-ethanol or water-isopropanol mixtures and determine the presence of hydrophobic solvent sites, which can be used to define the bias position and strength as described in this chapter.

Interestingly, the flexibility of AutoDock Bias also allows using the bias for other types of interactions not covered in the present chapter. For example, the presence of metal ions in the receptor active site usually allows some types of ligands to establish metal-ligand coordination bonds. Typical cases are cytochromes P450 (CYPs) harboring a heme group to which several inhibitors bind tightly through their aromatic nitrogens, as shown for human CYP3A4 bound to the antifungal fluconazole.(54) In these cases, users can introduce a strong bias that will tend to position the proper ligand atoms (i.e., those belonging to functional groups known to coordinate the particular metal ion) at a specified distance from the metal center, therefore anchoring the ligand in relevant poses. A second example is suicide inhibitors, which establish covalent bonds with the receptor. To be able to form the bond, the ligand must first be properly bound to the active site, with the reactive ligand functional group at a proper distance from the receptor bond acceptor. The user could define a bias that drives the ligand reactive atom to this position, thus enforcing a pose that allows establishment of the corresponding covalent bond (see also ref. (55)).

In summary, AutoDock Bias is a powerful tool that we expect will significantly aid researchers in the application of molecular docking strategies for the study of protein-ligand interactions and the identification of more effective binders, which ultimately can lead to the development of new and better drugs.

Notes

1.

The protein structure for a docking procedure needs to be carefully prepared, even after selecting the proper conformation. Protonation of protein residues, especially those at the binding site, needs to be tackled, as well as the special orientation treatment for amides and histidines that may appear flipped due to their ambiguity for fitting the electron density of X-ray crystallography. The direction of the hydroxyl (-OH) group from serine and threonine residues and from the sulfhydryl (-SH) group from cysteine residues should also be supervised. AutoDockTools is able to protonate the protein structure in an automated fashion ($ pythonsh $MGLUTIL/prepare_receptor4.py -r receptor.pdb -A hydrogens -o receptor.pdbqt), but it is recommended to use tools that allow user intervention to manually adjust all the stated issues. Maestro represents a good alternative that is free for academic use.

2.

Although treating the receptor structure as a rigid body is clearly a limitation of the procedure, it is by far the most common approach in docking protocols. In particular, since AutoDock Bias relies on precalculated energy grid maps, incorporating flexibility into the target is not an easy task. One way could be to select an ensemble of protein structures and repeat the biased docking of the ligand on each of them independently, finally gathering all the results (ensemble biased docking). Another approach is to take advantage of current alternatives that allow flexible receptor sidechains, such as AutoDock or AutoDockFR(56), and modify the maps that are built without considering the flexible residues, taking into account that the bias could also guide the flexible part of the receptor.

3.

An alternative and more direct approach to determine the position of the ligand-derived pharmacophoric sites would have been to use directly the coordinates of the co-crystallized ligand interactors. However, the suggested approach based on the receptor structure has broader applications and does not depend on a particular ligand structure, which may be more suitable for the cross-docking of different ligands.

4.

It is best practice to consider only the residues of the binding site when aligning protein structures to validate the results obtained from cross-docking. In our case, to define the binding site we considered every protein residue with at least one atom at a distance lower than 4 Å from the cocrystallized ligand. The residue numbers are 10, 18, 31, 64, 80, 81, 82, 83, 84, 86, 89, 134, 144 and 145. The alignment was restricted to backbone atoms. VMD (RMSD Trajectory Tool extension) and PyMol (align command) are good alternatives to perform structural alignments.

5.

It is expected that using small cosolvent molecules to infer interaction sites from bigger ligands, such as drug-like compounds, will produce additional predicted hot spots that will be false positives, in part due to the size difference. A thorough description of this problem may be found elsewhere.(35) However, the robustness of the biased docking method will be its ability to improve the docking of the ligand despite the presence of one (or few) misleading bias sites.
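The binding-site definition of Note 4 (every residue with at least one atom closer than 4 Å to the cocrystallized ligand) can be sketched in plain Python as follows. The helpers `pdb_line`, `atoms` and `binding_site_residues` are hypothetical, the demo coordinates are invented, and the sketch assumes fixed-column PDB records and a single protein chain, since residues are identified by number only.

```python
def pdb_line(record, serial, name, resname, resseq, x, y, z):
    """Build a fixed-column ATOM/HETATM record for the demo."""
    return "%-6s%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        record, serial, name, resname, "A", resseq, x, y, z)


def atoms(pdb_text):
    """Yield (residue number, (x, y, z)) for every ATOM/HETATM record."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            yield int(line[22:26]), (float(line[30:38]),
                                     float(line[38:46]),
                                     float(line[46:54]))


def binding_site_residues(protein_pdb, ligand_pdb, cutoff=4.0):
    """Residue numbers with at least one atom closer than 'cutoff' Å."""
    ligand = [xyz for _, xyz in atoms(ligand_pdb)]
    cutoff_squared = cutoff ** 2
    selected = set()
    for resseq, p in atoms(protein_pdb):
        if any(sum((a - b) ** 2 for a, b in zip(p, q)) < cutoff_squared
               for q in ligand):
            selected.add(resseq)
    return sorted(selected)


# Invented coordinates: residues 10 and 83 are near the ligand, 200 is far.
protein = "\n".join([
    pdb_line("ATOM", 1, "CA", "ILE", 10, 0.0, 0.0, 0.0),
    pdb_line("ATOM", 2, "N", "LEU", 83, 3.0, 0.0, 0.0),
    pdb_line("ATOM", 3, "CA", "ALA", 200, 20.0, 0.0, 0.0),
])
ligand = pdb_line("HETATM", 1, "C1", "LIG", 1, 1.0, 0.0, 0.0)

print(binding_site_residues(protein, ligand))
```

The returned residue numbers can then be used to restrict the backbone alignment in VMD or PyMol.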

1

Download site for AutoDock4: http://autodock.scripps.edu/downloads. Download site for AutoDockTools: https://ccsb.scripps.edu/mgltools.

3

Download site for Open Babel: http://openbabel.org.

4

Alternatively, graphical user interfaces such as ADT or PyMol can be used.

5

Actually, the ideal conformation would not have been available from the PDB back in 2004, but one could have easily obtained it from the SMILES of the ligand using Open Babel or MarvinSketch, for example.

6

$ pymol ligand_dock.pdbqt

7

Alternatively, it would be possible to use just the coordinates of a single atom from the ligand that could be extracted from the PDB directly in a straightforward manner.

8

$ vmd -m receptor.pdbqt ligand_original.pdb receptor.A.map

9

This step (conventional docking) may be omitted, but we will use it to compare our final biased results. It is available from https://github.com/jparcon/adbias/tree/master/mimb_chapter/conventional.

10

The ideal_interactions_sites.py script calculates ideal locations for hydrogen bond and aromatic interactions with any residue in a protein structure. It takes a PDB file with the structure of the protein (protonated) and generates a PDB file (interaction_sites.pdb) with a list of possible ligand interactors with the protein residues indicated by the user. The script requires Maestro or Amber atom names for Hs (e.g. HNE or HE, HH11, HH12, HH21 and HH22 for Arg hydrogens in the guanidinium group) and Biopython for PDB parsing (included in AutoDockTools).

11

Acceptor sites: residue name ACC, atom name O. Donor sites: residue name DON, atom name H. Aromatic sites: residue name ARO, atom name C.

12
Bias parameter file format requirements
  • All lines must have 6 columns. The columns must be space or tab separated.
  • Lines are ignored if the first column is not numeric (e.g., header with titles -x, y, z, Vset, r, type-).
  • The first three columns define the x,y,z coordinates of the bias site center, in Å.
  • The fourth column corresponds to the energy reward (Vset), in kcal/mol, to be applied at the bias site center. It has to be a negative number. If Vset is a positive number N, it will be considered as a relative density of states and will be converted to energy using -kT ln(N) -in kcal/mol-. This means that an energy penalty cannot be set.
  • The fifth column is the radius (r) of the bias site, in Å. It controls the extent of energy reward through space according to a Gaussian function -see equation 1-.
  • The last column indicates the type of bias and, in consequence, which energy maps will be modified:
    acc modifies NA and OA maps;
    don modifies HD maps;
    aro creates an ad hoc new map (AC, aromatic center map);
    map modifies the energy map specified in the -m argument. map biases cannot be combined with other types of biases (don, acc, aro) in the same execution of the program.
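The conversion of a relative density of states N into an energy, -kT ln(N), can be written out as a short worked example. This is a sketch of the formula only, not the code in prepare_bias.py. The function name `density_to_energy` and the temperature of 298.15 K are assumptions, since the temperature used by the program is not stated here.

```python
import math

# Gas constant in kcal/(mol K), i.e. the Boltzmann constant per mole.
GAS_CONSTANT_KCAL = 0.0019872041


def density_to_energy(relative_density, temperature=298.15):
    """Energy in kcal/mol for a relative density of states N: -kT ln(N)."""
    if relative_density <= 0:
        raise ValueError("the relative density must be positive")
    return -GAS_CONSTANT_KCAL * temperature * math.log(relative_density)


# A site ten times denser than the bulk maps to a negative (rewarding) energy.
print("%.2f kcal/mol" % density_to_energy(10.0))
```

Under this formula, a density equal to that of the bulk (N = 1) corresponds to zero energy, and higher densities correspond to increasingly negative values.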
13

The program is prepared to work with the pythonsh interpreter for python2 that comes with AutoDockTools 1.5.7 (https://ccsb.scripps.edu/mgltools). Independent versions for regular python2 and python3 are also available upon request.

14

To avoid this step, Maestro has a python script to calculate RMSD between .mae files. Convert the pdb files to mae and execute: $ run rmsd.py ligand_ref.mae rank1.mae

15

$ pymol ligand_dock.pdbqt ligand_dock.dum.pdbqt

16

The aromatic rings are detected by Open Babel.

References

  • 1. Goodsell DS and Olson AJ (1990) Automated docking of substrates to proteins by simulated annealing. Proteins 8:195–202
  • 2. Hart TN and Read RJ (1992) A multiple-start Monte Carlo docking method. Proteins 13:206–222
  • 3. Morris GM, Goodsell DS, Halliday RS, et al. (1998) Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem 19:1639–1662
  • 4. Jones G, Willett P, Glen RC, et al. (1997) Development and validation of a genetic algorithm for flexible docking. J Mol Biol 267:727–748
  • 5. Friesner RA, Banks JL, Murphy RB, et al. (2004) Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem 47:1739–1749
  • 6. Kuntz ID, Blaney JM, Oatley SJ, et al. (1982) A geometric approach to macromolecule-ligand interactions. J Mol Biol 161:269–288
  • 7. Eldridge MD, Murray CW, Auton TR, et al. (1997) Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J Comput Aided Mol Des 11:425–445
  • 8. Meng EC, Shoichet BK, and Kuntz ID (1992) Automated docking with grid-based energy evaluation. doi: 10.1002/jcc.540130412
  • 9. Gohlke H, Hendlich M, and Klebe G (2000) Knowledge-based scoring function to predict protein-ligand interactions. J Mol Biol 295:337–356
  • 10. Ballester PJ and Mitchell JBO (2010) A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics 26:1169–1175
  • 11. Shoichet BK, Kuntz ID, and Bodian DL (1992) Molecular docking using shape descriptors. doi: 10.1002/jcc.540130311
  • 12. Morris GM, Huey R, Lindstrom W, et al. (2009) AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem 30:2785–2791
  • 13. Ruiz-Carmona S, Alvarez-Garcia D, Foloppe N, et al. (2014) rDock: a fast, versatile and open source program for docking ligands to proteins and nucleic acids. PLoS Comput Biol 10:e1003571
  • 14. Coleman RG, Carchia M, Sterling T, et al. (2013) Ligand pose and orientational sampling in molecular docking. PLoS One 8:e75992
  • 15. Allen WJ, Balius TE, Mukherjee S, et al. (2015) DOCK 6: Impact of new features and current docking performance. J Comput Chem 36:1132–1156
  • 16. Trott O and Olson AJ (2010) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 31:455–461
  • 17. Rarey M, Kramer B, Lengauer T, et al. (1996) A fast flexible docking method using an incremental construction algorithm. J Mol Biol 261:470–489
  • 18. Jain AN (2003) Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J Med Chem 46:499–511
  • 19. Corbeil CR, Williams CI, and Labute P (2012) Variability in docking success rates due to dataset preparation. J Comput Aided Mol Des 26:775–786
  • 20. Repasky MP, Murphy RB, Banks JL, et al. (2012) Docking performance of the glide program as evaluated on the Astex and DUD datasets: a complete set of glide SP results and selected results for a new scoring function integrating WaterMap and glide. J Comput Aided Mol Des 26:787–799
  • 21. Liebeschuetz JW, Cole JC, and Korb O (2012) Pose prediction and virtual screening performance of GOLD scoring functions in a standardized test. J Comput Aided Mol Des 26:737–748
  • 22. Brozell SR, Mukherjee S, Balius TE, et al. (2012) Evaluation of DOCK 6 as a pose generation and database enrichment tool. J Comput Aided Mol Des 26:749–773
  • 23. Spitzer R and Jain AN (2012) Surflex-Dock: Docking benchmarks and real-world application. J Comput Aided Mol Des 26:687–699
  • 24. Damm-Ganamet KL, Smith RD, Dunbar JB Jr, et al. (2013) CSAR benchmark exercise 2011-2012: evaluation of results from docking and relative ranking of blinded congeneric series. J Chem Inf Model 53:1853–1870
  • 25. Carlson HA, Smith RD, Damm-Ganamet KL, et al. (2016) CSAR 2014: A Benchmark Exercise Using Unpublished Data from Pharma. J Chem Inf Model 56:1063–1077
  • 26. Cleves AE and Jain AN (2015) Knowledge-guided docking: accurate prospective prediction of bound configurations of novel ligands using Surflex-Dock. J Comput Aided Mol Des 29:485–509
  • 27. Hu B and Lill MA (2014) PharmDock: a pharmacophore-based docking program. J Cheminform 6:14
  • 28. Kumar A and Zhang KYJ (2016) A pose prediction approach based on ligand 3D shape similarity. J Comput Aided Mol Des 30:457–469
  • 29. Jacquemard C, Drwal MN, Desaphy J, et al. (2019) Binding mode information improves fragment docking. J Cheminform 11:24
  • 30. Kuhn B, Guba W, Hert J, et al. (2016) A Real-World Perspective on Molecular Design. J Med Chem 59:4087–4102
  • 31. Santos-Martins D, Forli S, Ramos MJ, et al. (2014) AutoDock4(Zn): an improved AutoDock force field for small-molecule docking to zinc metalloproteins. J Chem Inf Model 54:2371–2379
  • 32. Mattos C, Bellamacina CR, Peisach E, et al. (2006) Multiple solvent crystal structures: probing binding sites, plasticity and hydration. J Mol Biol 357:1471–1482
  • 33.Guvench O and MacKerell AD (2009) Computational Fragment-Based Binding Site Identification by Ligand Competitive Saturation. PLoS Comput Biol 5:e1000435
  • 34.Seco J, Luque FJ, and Barril X (2009) Binding site detection and druggability index from first principles. J Med Chem 52:2363–2371
  • 35.Arcon JP, Defelipe LA, Modenutti CP, et al. (2017) Molecular Dynamics in Mixed Solvents Reveals Protein-Ligand Interactions, Improves Docking, and Allows Accurate Binding Free Energy Predictions. J Chem Inf Model 57:846–863
  • 36.Gauto DF, Petruk AA, Modenutti CP, et al. (2013) Solvent structure improves docking prediction in lectin–carbohydrate complexes. Glycobiology 23:241–258
  • 37.López ED, Arcon JP, Gauto DF, et al. (2015) WATCLUST: a tool for improving the design of drugs based on protein-water interactions. Bioinformatics 31:3697–3699
  • 38.Modenutti C, Gauto D, Radusky L, et al. (2015) Using crystallographic water properties for the analysis and prediction of lectin-carbohydrate complex structures. Glycobiology 25:181–196
  • 39.Arcon JP, Defelipe LA, Lopez ED, et al. (2019) Cosolvent-Based Protein Pharmacophore for Ligand Enrichment in Virtual Screening. J Chem Inf Model 59:3572–3583
  • 40.Arcon JP, Modenutti CP, Avendaño D, et al. (2019) AutoDock Bias: improving binding mode prediction and virtual screening using known protein-ligand interactions. Bioinformatics 35:3836–3838
  • 41.Schrödinger Release 2019-4: MS Jaguar, Schrödinger, LLC, New York, NY, 2019
  • 42.Word JM, Lovell SC, Richardson JS, et al. (1999) Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. J Mol Biol 285:1735–1747
  • 43.O’Boyle NM, Banck M, James CA, et al. (2011) Open Babel: An open chemical toolbox. J Cheminform 3:33
  • 44.Forli S, Huey R, Pique ME, et al. (2016) Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat Protoc 11:905–919
  • 45.Lim S and Kaldis P (2013) Cdks, cyclins and CKIs: roles beyond cell cycle regulation. Development 140:3079–3093
  • 46.Cicenas J and Valius M (2011) The CDK inhibitors in cancer research and therapy. J Cancer Res Clin Oncol 137:1409–1418
  • 47.Williamson DS, Parratt MJ, Torrance CJ, et al. (2005) Crystal structure of human CDK2 complexed with a pyrazolo[1,5-a]pyrimidine inhibitor. doi:10.2210/pdb1y91/pdb
  • 48.Kang YN and Stuckey JA (2012) Crystal Structure of the CDK2 in complex with thiazolylpyrimidine inhibitor. https://www.wwpdb.org/pdb?id=pdb_00003sw4
  • 49.Defelipe LA, Arcon JP, Modenutti CP, et al. (2018) Solvents to Fragments to Drugs: MD Applications in Drug Design. Molecules 23:3269
  • 50.Dechene M, Wink G, Smith M, et al. (2009) Multiple solvent crystal structures of ribonuclease A: an assessment of the method. Proteins 76:861–881
  • 51.Gauto DF, Di Lella S, Guardia CMA, et al. (2009) Carbohydrate-binding proteins: Dissecting ligand structures through solvent environment occupancy. J Phys Chem B 113:8717–8724
  • 52.Abel R, Young T, Farid R, et al. (2008) Role of the Active-Site Solvent in the Thermodynamics of Factor Xa Ligand Binding. doi:10.1021/ja0771033
  • 53.Alvarez-Garcia D and Barril X (2014) Molecular simulations with solvent competition quantify water displaceability and provide accurate interaction maps of protein binding sites. J Med Chem 57:8530–8539
  • 54.Sevrioukova I (2019) Interaction of Human Drug-Metabolizing CYP3A4 with Small Inhibitory Molecules. Biochemistry 58:930–939
  • 55.Bianco G, Forli S, Goodsell DS, et al. (2016) Covalent docking using autodock: Two-point attractor and flexible side chain methods. Protein Sci 25:295–301
  • 56.Ravindranath PA, Forli S, Goodsell DS, et al. (2015) AutoDockFR: Advances in Protein-Ligand Docking with Explicitly Specified Binding Site Flexibility. PLoS Comput Biol 11:e1004586
