Author manuscript; available in PMC: 2013 May 6.
Published in final edited form as: Proteins. 2008 Nov 15;73(3):566–580. doi: 10.1002/prot.22081

In pursuit of virtual lead optimization: The role of the receptor structure and ensembles in accurate docking

Erin S D Bolstad 1, Amy C Anderson 1,*
PMCID: PMC3644990  NIHMSID: NIHMS462771  PMID: 18473360

Abstract

Accurate ranking during in silico lead optimization is critical to drive the generation of new ligands with higher affinity, yet it is especially difficult because of the subtle changes between analogs. In order to assess the role of the structure of the receptor in delivering accurate lead ranking results, we docked a set of forty related inhibitors to structures of one species of dihydrofolate reductase (DHFR) derived from crystallographic data, NMR solution data, and homology models. In this study, the crystal structures yielded superior results: the compounds were placed in the active site in the conserved orientation and the docking scores for 80% of the compounds clustered into the same bins as the measured affinity. Single receptor structures derived from NMR data or homology models did not serve as accurate docking receptors. To our knowledge, these are the first experiments that assess ranking of homologous lead compounds using a variety of receptor structures. We then extended the study to investigate whether ensembles of the single starting structures, either computationally or experimentally derived, aid, hinder or have no effect on the performance of the starting template. Impressively, when ensembles of receptor structures derived from NMR data or homology models were employed, docking accuracy improved to a level equal to that of the high resolution crystal structures. The same experiments using a second species of DHFR and a second set of ligands confirm the results. A comparison of the structures of the individual ensemble members to the starting structures shows that the effect of the ensembles can be ascribed to protein flexibility in addition to absorption of computational error.

Keywords: docking, DHFR, crystal structure, solution structure, homology model, ensembles, protein flexibility

INTRODUCTION

Virtual screening has been proven to be a valuable technique to accelerate the pace of drug discovery. In virtual screening, often several thousands of compounds from an in silico database are docked into a model of a receptor binding site and ranked according to their fit. Accurate ligand ranking during lead optimization, when the compounds are focused around a conserved hit scaffold, is critical to successfully drive the generation of new compounds with high affinity. Yet, computationally reproducing the correlations between subtle changes in structure and potentially large changes in activity can be extremely difficult. Despite advances in scoring algorithms, incorporation of protein and ligand flexibility and the treatment of solvent, all of which increase docking accuracy,1,2 achieving proper ranking remains a significant and difficult problem that is largely unsolved.3,4 A crucial component of a successful virtual screen with accurate ligand ranking is the experimental or modeling source of the receptor structure used for docking. High resolution crystal structures are typically chosen if available. Otherwise, structures solved from medium resolution crystallographic data, solution structures derived from NMR data and homology models are often substituted. Previous studies have compared the ability of different receptor structures derived from crystal structures, NMR ensembles and homology models to actively identify ligands in a high affinity cluster from a large database of compounds, also known as enrichment,5-9 but have not fully explored the role of the receptor in accurately ranking ligands.

Although the receptor for docking can be a single structure, we have previously investigated ligand ranking in lead optimization10 and found that considering ensembles of protein:ligand conformations yielded a reasonable (72.9%) correlation between docking scores and biological activity. Ensembles of receptor structures have been used previously in other studies as a simulation of multiple conformational states to represent protein flexibility11-13 and have been shown to increase docking accuracy.10,14,15 The contribution of the ensemble to the improvement in docking accuracy and ligand ranking may be related to the use of a flexible receptor that is capable of adopting many of the necessary alternate side chain orientations for binding as well as to overcoming the fundamental energy landscape and thermodynamic issues that plague accurate ligand-protein modeling.7,16 Additionally, ensembles of receptor structures may overcome potential bias in the screen toward analogs of the ligand with which it was co-crystallized.17-20

Molecular ensembles can be experimentally observed, such as solution structure NMR ensembles, or computationally determined via a molecular dynamics (MD) simulation. Solution structure NMR ensembles generally explore larger conformational space than traditional MD simulations.21 A comparison of accurate ligand ranking using MD-generated ensembles from crystal structures and homology models and ensembles from NMR data would be useful in choosing a receptor for a virtual screen.

Dihydrofolate reductase (DHFR) is an excellent system for the comparison of ligand docking and ranking. Several structures derived from crystallographic and solution data are available for Lactobacillus casei and human DHFR (LcDHFR and hDHFR). Homology models are easily created based on a wealth of sequence information and readily available online modeling engines. Additionally, there are many known ligands with associated inhibition values in the literature. DHFR is an essential enzyme in folate metabolism and has been a validated target for antibiotic, anticancer, and antiparasitic drug therapy. Crystallographic studies of several species of DHFR bound to analogs containing 2,4-diaminopyrimidine rings have shown that there is an exclusive orientation for the pyrimidine component within the active site. The protonated N1 atom and N2-amino group of the 2,4-diaminopyrimidine ring always form two hydrogen bonds with a conserved acidic residue in the active site.22-26 Knowledge of this conservation creates an additional opportunity to assess the instances that the ligand is correctly oriented within the active site in addition to ranking its comparative potency. The results of these studies in which orientation is known may allow better selection of a receptor structure in the more typical situation when the correct orientation is not known, such as for screening large databases. Additionally, DHFR is representative of many enzymes that bind small molecules in a pocket, in that the active site is accessed through an opening to solution, and undergoes some ligand-induced conformational changes.

In this manuscript, we present an in-depth comparison of ligand docking and ranking using crystal structures, single members of NMR ensembles, and homology models created from two scaffolds with different sequence identities. Unlike the traditional metric for assessing docking accuracy that focuses on identifying highly active molecules from a database of decoys, we have adopted a strategy for assessing individual ligand ranking using a “grouped ranking score.” The grouped ranking score is designed to be useful to a medicinal chemist, who is more likely to synthesize a group of top-scoring compounds rather than a single top compound. We then extend these docking results to investigate whether ensembles of starting structures aid, hinder or have no effect on the performance of the starting template. We have found that within the context of a single docking protocol and scoring algorithm, the receptor structure provides the key to successful ligand ranking. While crystal structures showed the best performance, ensembles of receptor structures derived from NMR data or homology models regained docking accuracy relative to the single receptor structures and performed equally as well as high resolution crystal structures. The ensembles appear to represent protein flexibility as well as to absorb some of the inherent error of a coordinate-specific computational scoring algorithm.

METHODS

Ligand preparation

All ligands were drawn in Sybyl27 in an analogous fashion so that the starting conformations were as similar as possible and then minimized using a Tripos force field. Substituted bicyclic rings containing a 2,4-diaminoquinazoline ring core in the ligand set were submitted to Gaussian28 runs utilizing Hartree-Fock theory and a 6-311+G(d,p) basis set to determine conformation. Results from this calculation showed that the substituted bicyclic component remained planar; therefore ligands with a bicyclic core were minimized with the planar component held rigid. Amines attached to aromatic systems were assigned planar geometry to account for the interaction of the lone pair with the p orbitals of the aromatic system and to represent an average structure between two possible trigonal pyramidal structures. The resulting structures were checked for proper geometries and selectively protonated at N1 of the 2,4-diaminopyrimidine ring.

Protein preparation

Experimentally determined protein structures were downloaded from the PDB29 and prepared by adding hydrogens and calculating charges. Homology models were created using the various programs discussed in the text, with the L. casei DHFR (LcDHFR) sequence obtained from the PDB structure 3DFR.25 All default model creation options were taken. Returned structures were checked for sidechain bumps and Ramachandran violations. Models were aligned with 3DFR; the cofactor, NADPH, and the pteridine ring of methotrexate were merged into the homology models. The resulting ternary structure was minimized using the Amber force field either in entirety or using a restricted 3.5 Å radius around the ligand. Results from both experiments are reported. In some cases, the homology modeling program rotated Trp 5 into the pocket occupied by NADPH. In these cases it was rotated back into the protein using a Lovell dictionary30 of sidechain angles prior to minimization.

Ensemble preparation

An MD simulation was carried out on a starting template (as defined in the text) at 300 K over 10,000 fs with conformational snapshots taken every 500 fs. The active sites were defined using a 3.5 Å radius from the ligand in which all residues with at least one atom falling into the sphere were included. All other residues, as well as the docked ligand and cofactor NADPH, were held rigid. In order to prevent an unfairly large active site selection with large ligands such as methotrexate in 3DFR, only the substituted pteridine ring was included in the active site definition. Resulting MD sets were minimized with an Amber force field for 1000 iterations and a terminal energy change of 0 kcal/mol. Geometries and Ramachandran plots were checked and formal charges were added. The starting conformation is also included in the calculations, labeled as RS.

Docking

All ligands were docked using Surflex-Dock as implemented by Sybyl 7.2,27 run at the command line using an in-house script designed for docking against a library of receptors. The active sites were flooded with small lipophilic, hydrogen bond donor, and hydrogen bond acceptor probes to define a protomol.31 Surflex-Dock automatically determines and then places a base fragment into the protomol to maximize contacts and then sequentially builds up the ligand, exploring conformational space on the fly. An independent experiment (data not shown) with three different ligands showed that the starting conformation could alter the final docking score by up to 8.5%. To try to circumvent this problem, all ligands were drawn with an analogous conformation. For each ligand, 200 poses were generated and ranked based on docking score. These poses were visually assessed to find the top-scoring pose with the conserved geometry of the 2,4-diaminopyrimidine ring as defined by Figure 1.

Figure 1. The conserved “correct” orientation of the 2,4-diaminopyrimidine ring within the active site, with hydrogen bonding defined by crystallographic data.

Results were visualized within Sybyl and exported to Excel spreadsheets. The reported docking score is the score of the first pose with the conserved orientation, considered in this study to be the “correct” orientation.

The docking score is a weighted sum of nonlinear functions using van der Waals distances between the relevant protein/ligand atoms including hydrophobic, polar, electrostatic repulsive, entropic, and solvation terms.31,32 All scores are expressed in −log10(Kd) units, with higher scores indicating a stronger affinity. Each ensemble member is considered to contribute equally to the overall docking and therefore each score is averaged, providing a single docking score for each ligand per ensemble. Previous work showed that averaging the values across an ensemble was superior to using a Boltzmann distribution.33
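The averaging step and the score units can be illustrated with a minimal Python sketch. The function names and the example scores are hypothetical, and the conversion assumes that Kd is expressed in mol/L; only the equal-weight mean and the −log10(Kd) convention are taken from the text above.

```python
from statistics import mean
from typing import Sequence


def ensemble_score(member_scores: Sequence[float]) -> float:
    """Equal-weight average of one ligand's docking scores across ensemble members."""
    if not member_scores:
        raise ValueError("at least one ensemble member score is required")
    return mean(member_scores)


def score_to_kd_nM(score: float) -> float:
    """Convert a score in -log10(Kd) units to a predicted Kd in nM.

    Assumption: Kd is expressed in mol/L before taking the logarithm.
    """
    return 10.0 ** (-score) * 1e9


# Hypothetical scores for one ligand docked to four ensemble members.
scores = [8.2, 7.9, 8.6, 8.1]
avg = ensemble_score(scores)
print(f"ensemble score = {avg:.2f}, predicted Kd = {score_to_kd_nM(avg):.2f} nM")
```

Because the mean is taken over scores rather than over Kd values, a single ensemble member with a very poor score lowers the ensemble score linearly instead of dominating it.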

The accuracy with which a docking experiment predicted the tightest binding ligands was assessed using a grouped ranking score. The grouped ranking score was calculated using a neighbor accounting technique. Ligands were sorted according to binding affinity and divided into bins of approximately equal size. After docking, the ligands are sorted according to docking score and the ranking based on docking score was compared with the ranking based on affinity. If the two rankings placed the ligand in the same bin, or into an adjacent bin, a score of 1 was assigned to the ligand, otherwise a score of 0 was assigned. The use of neighbor binning provided a soft edge and alleviated the problem of assigning numerical ranking to ligands with the same docking score. The ratio of the grouped ranking score sum to the maximum possible score (expressed as a %) is an important measure of the ligand-ranking accuracy of a docking run. For a grouped ranking score to be meaningful it must be statistically greater than random. The random cutoff value is determined by averaging the chances of each ligand being placed in the correct bin.
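One way to read the neighbor accounting technique is sketched below in Python. This is an illustration under stated assumptions, not the spreadsheet procedure used in this work: bins are assigned purely by rank position using a fixed list of bin sizes, ties are broken by input order, and the names assign_bins, grouped_ranking_score, and tolerance are hypothetical.

```python
from typing import Dict, List


def assign_bins(ranked_ids: List[str], bin_sizes: List[int]) -> Dict[str, int]:
    """Map each ligand to a bin index according to its position in a ranked list."""
    if sum(bin_sizes) != len(ranked_ids):
        raise ValueError("bin sizes must add up to the number of ligands")
    bins: Dict[str, int] = {}
    start = 0
    for index, size in enumerate(bin_sizes):
        for ligand in ranked_ids[start:start + size]:
            bins[ligand] = index
        start += size
    return bins


def grouped_ranking_score(ki_nM: Dict[str, float],
                          docking_score: Dict[str, float],
                          bin_sizes: List[int],
                          tolerance: int = 1) -> float:
    """Percentage of ligands whose docking-score bin is within `tolerance`
    bins of their affinity bin (tolerance=1 counts adjacent bins as correct)."""
    by_affinity = sorted(ki_nM, key=lambda lig: ki_nM[lig])                # tightest binder first
    by_score = sorted(docking_score, key=lambda lig: -docking_score[lig])  # best score first
    affinity_bin = assign_bins(by_affinity, bin_sizes)
    score_bin = assign_bins(by_score, bin_sizes)
    hits = sum(1 for lig in ki_nM
               if abs(affinity_bin[lig] - score_bin[lig]) <= tolerance)
    return 100.0 * hits / len(ki_nM)


# Hypothetical six-ligand example with three bins of two ligands each.
ki = {"a": 0.1, "b": 0.5, "c": 2.0, "d": 5.0, "e": 50.0, "f": 400.0}
score = {"a": 9.0, "b": 5.0, "c": 8.5, "d": 8.0, "e": 4.0, "f": 7.0}
print(grouped_ranking_score(ki, score, [2, 2, 2]))
```

In this toy set only ligand b is ranked two bins away from its affinity bin, so five of the six ligands score 1. Setting tolerance to 0 would require an exact bin match and give a stricter score.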

The ability of the receptor to correctly orient the ligand is also an important consideration to its accuracy as a predictive model. To represent this, the improper orientation rate is also presented. If the docking protocol returns a ligand with an incorrect orientation for the 2,4-diaminopyrimidine, that ligand is considered improperly oriented. In such cases, the top scoring correctly oriented ligand is found from the set of 200 poses and reported, but that ligand is flagged as having been improperly oriented. A low improper orientation rate indicates that only a few ligands were poorly oriented, while a high improper orientation rate indicates that the receptor failed to orient a large percentage of ligands correctly. The metric ‘place’ is used in the calculation of the improper orientation rate and is the number in the list of 200 poses of the top-scoring correct orientation. A place score of 0 represents the first place, and that ligand is considered to be properly oriented automatically.
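The place metric and the improper orientation rate can be sketched as follows. The Pose representation and the function names are hypothetical, and the orientation flag stands in for the visual assessment described above; the fallback score of 0 for a ligand with no correctly oriented pose follows the convention mentioned in the Results.

```python
from typing import Optional, Sequence, Tuple

Pose = Tuple[float, bool]  # (docking score, has the conserved orientation?)


def first_correct_pose(poses: Sequence[Pose]) -> Tuple[Optional[int], float]:
    """Return (place, score) of the top-scoring correctly oriented pose.

    Poses are ranked by descending score; place 0 is the top-ranked pose.
    If no pose is correctly oriented, place is None and the score is 0.0.
    """
    ranked = sorted(poses, key=lambda pose: pose[0], reverse=True)
    for place, (score, is_correct) in enumerate(ranked):
        if is_correct:
            return place, score
    return None, 0.0


def improper_orientation_rate(places: Sequence[Optional[int]]) -> float:
    """Percentage of ligands whose top-ranked pose (place 0) was not correct."""
    if not places:
        raise ValueError("no ligands supplied")
    improper = sum(1 for place in places if place != 0)
    return 100.0 * improper / len(places)


# Hypothetical ligand: the two best-scoring poses are flipped, the third is correct.
place, score = first_correct_pose([(7.0, False), (6.5, True), (8.0, False)])
print(place, score)
print(improper_orientation_rate([0, 0, place, 0, 0]))
```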

The most important scoring metrics discussed herein are the grouped ranking score and the improper orientation rate (%IO). The docking score range is also worth considering as it represents the overall affinity of the ligand for the receptor site, including polar contacts, conformational energy penalties, and steric clash penalties.

RESULTS

Lead optimization by virtual screening requires both proper orientation of the ligand within the active site and accurate ranking to drive analog synthesis. We compared the performance of several structures of Lactobacillus casei (LcDHFR) derived from X-ray, NMR, or homology modeling data as docking receptors to accurately orient and rank a group of ligands with known affinity. With the docking results from several static structures of DHFR, we then investigated whether the use of ensembles created from those structures improves ligand orientation and ranking. The resulting trends were further investigated by submitting human DHFR (hDHFR) structures and a second set of ligands to the same docking experiments.

Receptors

Three experimental receptor structures were prepared: a crystal structure of L. casei DHFR (LcDHFR), the representative member of the ensemble of the NMR solution structure of LcDHFR and an averaged structure from the NMR ensemble. The crystal structure of LcDHFR bound to NADPH and methotrexate, PDB ID: 3DFR,25 was determined with data to 1.7 Å resolution. Of the several LcDHFR NMR ensembles available in the PDB, 1LUD34 (Supplemental Fig. 1) was selected as it is the only ensemble with both the cofactor NADPH and a ligand (trimethoprim, TMP). The average structure was calculated using the average3d.py35 script within PyMOL.36 The NMR representative member is the first member of the set (i.e., 1LUD_1), and is defined by having the lowest root mean squared deviation (RMSD) to all other members in the ensemble.
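The selection criterion for a representative member can be illustrated with a short Python sketch. It is not the procedure used by the depositors of 1LUD: it assumes that the models are already superimposed, that "lowest RMSD to all other members" means the lowest mean pairwise RMSD, and that coordinates are supplied as plain lists of (x, y, z) tuples. All names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Root mean squared deviation between two pre-superimposed coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(a, b))
    return math.sqrt(total / len(a))


def representative_member(ensemble: Sequence[Coords]) -> int:
    """Index of the model with the lowest mean RMSD to all other models."""
    n = len(ensemble)
    if n < 2:
        raise ValueError("an ensemble needs at least two members")

    def mean_rmsd(i: int) -> float:
        return sum(rmsd(ensemble[i], ensemble[j])
                   for j in range(n) if j != i) / (n - 1)

    return min(range(n), key=mean_rmsd)


# Toy ensemble of three one-atom "models" placed along the x axis.
toy = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)]]
print(representative_member(toy))
```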

Homology model creation

Several homology models of LcDHFR were created in order to compare results with the experimentally derived structures. To explore the impact of sequence identity, models were created using sequences with the highest or lowest identity to LcDHFR. In addition, a model was created using multiple templates in order to assess the function of ‘averaging’ several scaffolding structures.

The LcDHFR sequence from 3DFR was submitted to a BlastP37 search of the PDB, returning at least 250 results. The highest scoring non-L. casei DHFR structure was 1ZDR38 (Bacillus stearothermophilus), with 36% identity, and the lowest scoring DHFR structure was 1CZ339 (Thermotoga maritima), with 31% identity. Several structures were chosen for the multiple-template model: 1ZDR, 1DYH40 (E. coli X-ray crystal structure with 5-deazafolic acid), 1RF741 (E. coli X-ray crystal structure with dihydrofolic acid), and 2INQ42 (E. coli neutron diffraction structure bound to methotrexate). The 3DFR sequence was submitted to three different automated homology modeling servers using both 1CZ3 and 1ZDR as scaffolds: Esypred3D,43 Geno3D,44 and Swiss-Model45 using a sequence alignment generated by ClustalW.46 Geno3D was used for creating the multiple-template model. As accurate docking requires the presence of NADPH, and both the MD protocol and protomol generation protocols require a ligand in the active site, the two molecules were merged into the resulting homology models based on the orientation within 3DFR. The models were minimized using two methods: either a minimization restricted to a 3.5 Å radius around the merged ligand and NADPH, in which all residues with at least one atom falling within the sphere were included, or a minimization in which the entire molecule was permitted to move, allowing the relaxation of residues farther from the active site. The minimization around the ligands allowed the sidechains to find energetically feasible, and therefore realistic, conformations that accommodate the presence of a ligand. This resulted in 14 homology model structures: 1CZ3 and 1ZDR scaffolds submitted to three different engines with two different minimization techniques and the multiple-template model with two minimization techniques.
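The count of 14 models follows from the combinations described above, as the short enumeration below shows. The label format is illustrative only and is not the naming scheme of any of the modeling servers.

```python
from itertools import product

scaffolds = ["1ZDR", "1CZ3"]
engines = ["Esypred3D", "Geno3D", "Swiss-Model"]
minimizations = ["3.5 Å min", "whole min"]

# Two scaffolds x three engines x two minimization protocols = 12 models.
single_template = [f"{engine} {scaffold} {minimization}"
                   for scaffold, engine, minimization
                   in product(scaffolds, engines, minimizations)]

# The multiple-template model (built with Geno3D) adds one model per protocol.
multiple_template = [f"Multiple-template {minimization}"
                     for minimization in minimizations]

models = single_template + multiple_template
print(len(models))  # 14
```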

The resulting homology models show substantial differences based upon the scaffold. The structures derived from the 1ZDR scaffold structures have lower RMSD and relative difference (the calculated similarity based on the comparative measures between the hydrogen bonding atoms of the active site) to 3DFR than the structures derived from the 1CZ3 scaffold. An overlay of the structures with 3DFR (see Fig. 2) shows the substantial difference in variation from 3DFR between the two scaffolds.

Figure 2. Overlay of 3DFR (green) with the pre-minimized homology models EsyPred3D (yellow), SwissModel (red), and Geno3D (purple) based on the 1ZDR template (left) and 1CZ3 template (right).

Ligands

Forty monocyclic 2,4-diaminopyrimidine analogs from two experiments47,48 were used for LcDHFR, shown in Table I. These ligands were chosen based on structural variation and a range of inhibition constants (0.05–1949.8 nM). The 40 ligands were sorted into six bins based on logical breaks in Ki values, with consideration for an even distribution of bin size (Table II). The small bin size allows for better evaluation of ranking accuracy. Grouped ranking was scored using the previously discussed neighbor-accounting technique. The random cutoff value for this set of ligands is 31%.

Table I.

Selected LcDHFR Ligands, their Measured 1/Ki Values and the Calculated Ki Values

Selected JMC1982 Ligands47

Ligand   X                      Ki (nM)
1        3-5-(OH)2              416.87
4        H                      6.31
8        3-CH2OH                2.14
9        4-NH2                  3.39
14       4-Cl                   0.65
15       3-4-(OH)2              1.45
16       3-OH                   1.51
23       3-Cl                   1.26
24       3-CH3                  1.66
26       4-Br                   0.62
27       4-OCH3                 0.56
34       3-Br                   0.59
40       3-CF3, 4-OCH3          0.05
41       3,4-(OCH3)2            0.12
43       3,5-(OCH3)2            0.38
44       3,4,5-(OCH3)3          0.13

Selected JMC1984 Ligands48

Ligand   X                      Ki (nM)
1        H                      19.95
2        3-SO2NH2               1174.9
4        3-OH                   141.25
7        3-I                    6.61
10       3-CH3                  10.96
15       3-OCH3                 30.2
16       3-OCH2CH3              6.46
17       3-O(CH2)2CH3           2.63
19       3-O(CH2)8CH3           2.29
21       3-OCH2-1-adamantyl     5.13
27       3-CH2O-c-C6H11         2.04
30       3-CH2OC6H5             0.27
37       3-CH2SC6H5             0.28
38       3-SCH2C6H5             1
41       4-SO2CH3               1949.84
45       4-COOCH2CH3            389.05
46       4-OH                   12.3
47       4-NH2                  114.82
52       4-Br                   26.92
54       4-CN                   501.19
57       4-CH3                  67.61
58       4-(CH2)3CH3            8.91
62       4-OCH3                 79.43
63       4-O(CH2)5CH3           5.62

Table II.

Data Analysis for 3DFR


The properties of these ligands fall in the range of typical examples of drug-like molecules used for screening. The molecular weights range from 201.2 to 382.5, there are three to six rotatable bonds per ligand (and three single outliers with seven, nine, and twelve rotatable bonds), the logP values range from 0.12 to 4.55 and starting conformational energies range from 2.54 to 23.17 kcal/mol.

Docking

Evaluation of a receptor’s ligand ranking and docking is multi-faceted. The primary metric is the grouped ranking score, which represents the number of compounds that are placed in the correct affinity bin. The grouped ranking score is totaled and represented as a percentage of the maximum possible grouped ranking score. The improper orientation rate relays information regarding the ability of the receptor to properly orient the ligand. Receptors with high improper orientation rates reflect a general inability to properly orient the ligands. The docking score range represents the number of nonbonded contacts countered by energy and steric penalties. In general, ranges with higher values at both ends of the scale reflect better docking contacts, though these metrics must be considered in conjunction with the grouped ranking score and improper orientation rates, as a high docking score does not necessarily mean accurate ranking. An example of scoring using 3DFR is reported in Table II and full comparative results for all experimentally derived receptors are reported in Table III.

Table III.

Docking Data for Crystallographic and NMR Solution Structures of LcDHFR

Receptor   Grouped ranking, % (% A.R.)   % IO   Score range (high)   Score range (low)   Source
3DFR       80% (49%)                     0%     8.79                 4.07                Lc X-Ray
1LUD_1     30% (−1%)                     0%     10.5                 3.31                Lc NMR: representative member
1LUD_av    40% (9%)                      20%    10.41                1.82                Lc NMR: average structure

The % I.O. column (improper orientation) represents the percentage of ligands that were not correctly oriented at place 0. The grouped ranking score is shown as % of maximum score and (the % above the random cutoff). The random cutoff is 31%.

The ternary complex crystal structure (3DFR) performed very well, orienting all of the ligands correctly and ranking them with a high grouped ranking score, well above random. A clear trend can be seen in Table II showing decreasing docking scores as the colors move from warm (yellow) to cool (purple and grey) down the table in conjunction with an increasing Ki. This is reflected in a grouped ranking score of 80%, in which 80% of the ligands were correctly placed in their corresponding affinity bin. Additionally, each ligand has a place score of 0, indicating that the top scoring pose was also the correct orientation. The total place score is 0, shown by the improper orientation rate of 0% and indicating all ligands were correctly oriented. The maximum and minimum docking scores are reported. A low score of 4.07 indicates that even the ligand with the least computed affinity still presented a number of good binding contacts. This is to be expected as the ligand database is composed of ligands with affinity for the receptor, as opposed to a database with decoys.

The individual NMR ensemble members performed much worse than the crystal structure (Table III). The representative NMR member (1LUD_1) correctly oriented all of the ligands, but the average structure (1LUD_av) yielded an improper orientation rate of 20%. Both of the NMR structures have grouped ranking scores at or near random, demonstrating that the structures did not differentiate ligand affinity. The low grouped ranking scores for the NMR structures also demonstrate that proper orientation and high docking scores do not necessarily indicate accurate ranking.

The homology models are assessed individually with two minimization methods and as two scaffold groups, 1ZDR versus 1CZ3 (Table IV). While the average grouped ranking scores of the 1ZDR and 1CZ3 template models are almost equivalent (59 and 62%, respectively), there is significant variability within the two groups, ranging from a grouped ranking score as high as that of the crystal structures (80%) down to nearly random (40%). When all metrics are considered, the 1CZ3 group performs worse than the 1ZDR group as receptors for accurately docking ligands. The 1CZ3 group has an improper orientation rate that is twice that of the 1ZDR group. The docking scores are also lower for 1CZ3, indicating fewer docking contacts and/or greater energy and steric penalties. It is also interesting to note that the average RMSD from 3DFR for the 1ZDR group is substantially lower than that of the 1CZ3 group (2.02 as opposed to 3.83). The overall comparison of the 1CZ3 and 1ZDR receptors reported in Table IV shows that higher sequence identity of the model scaffold leads to a more accurate representation of the receptor both structurally and in docking performance.

Table IV.

Scoring Metrics of the Individual Homology Models, as Sorted by Template and Grouped Ranking Score

Model                          Grouped ranking, % (% A.R.)   % IO   Score range (high)   Score range (low)   RMSD to 3DFR
Geno3D 1ZDR 3.5 Å min          80% (49%)                     70%    6.97                 0.38                2.09
Swiss 1ZDR whole min           63% (32%)                     10%    8.01                 5.18                2.03
Geno3D 1ZDR whole min          60% (29%)                     65%    6.25                 1.88                2.07
Swiss 1ZDR 3.5 Å min           58% (27%)                     0%     9.12                 4.75                1.97
Esypred3D 1ZDR 3.5 Å min       48% (17%)                     28%    10.04                2.99                1.9
Esypred3D 1ZDR whole min       43% (12%)                     8%     8.92                 5.22                2.05
Averages (1ZDR)                59% (28%)                     30%    8.22                 3.4                 2.02
Esypred3D 1CZ3 whole min       75% (44%)                     58%    6.94                 0.2                 4.2
Geno3D 1CZ3 3.5 Å min          73% (42%)                     63%    6.25                 2.84                2.78
Geno3D 1CZ3 whole min          68% (37%)                     25%    7.04                 3.34                2.76
Swiss 1CZ3 whole min           63% (32%)                     80%    6.95                 0.79                4.56
Esypred3D 1CZ3 3.5 Å min       55% (24%)                     70%    7.67                 0.52                4.17
Swiss 1CZ3 3.5 Å min           40% (9%)                      73%    8.41                 2.88                4.52
Averages (1CZ3)                62% (31%)                     61%    7.21                 1.76                3.83
Multiple-template whole min    55% (24%)                     3%     8.75                 5.13                2.16
Multiple-template 3.5 Å min    53% (22%)                     5%     8.73                 5.11                2.15
Averages (multiple-template)   54% (23%)                     4%     8.74                 5.12                2.16

The models are named based on their creation method, the template used, and the minimization protocol used to merge the ligand and NADPH (either a 3.5 Å minimization or a whole receptor minimization). The random cutoff value for the grouped ranking score is 31%.
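The group averages and the percentage above random (% A.R.) in Table IV can be rechecked with a few lines of Python. The values below are copied from the table; the data layout and the summarize helper are hypothetical, and % A.R. is taken to be the grouped ranking score minus the 31% random cutoff.

```python
from statistics import mean

RANDOM_CUTOFF = 31  # % random cutoff for the 40-ligand LcDHFR set

# (grouped ranking %, % IO, RMSD to 3DFR) for each single-template model in Table IV
TABLE_IV = {
    "1ZDR": [(80, 70, 2.09), (63, 10, 2.03), (60, 65, 2.07),
             (58, 0, 1.97), (48, 28, 1.90), (43, 8, 2.05)],
    "1CZ3": [(75, 58, 4.20), (73, 63, 2.78), (68, 25, 2.76),
             (63, 80, 4.56), (55, 70, 4.17), (40, 73, 4.52)],
}


def summarize(rows):
    """Average the per-model metrics for one scaffold group."""
    grouped = mean(row[0] for row in rows)
    return {
        "grouped": grouped,
        "above_random": grouped - RANDOM_CUTOFF,
        "io": mean(row[1] for row in rows),
        "rmsd": mean(row[2] for row in rows),
    }


for scaffold, rows in TABLE_IV.items():
    stats = summarize(rows)
    print(f"{scaffold}: grouped {stats['grouped']:.0f}% "
          f"({stats['above_random']:.0f}% A.R.), "
          f"IO {stats['io']:.1f}%, RMSD {stats['rmsd']:.2f}")
```

The similar grouped ranking averages but roughly twofold difference in improper orientation rate between the two scaffold groups are the basis for the comparison drawn in the text.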

The multiple-template models orient the ligands within the active site better than the other homology models, as indicated by the low improper orientation rate. However, they have grouped ranking scores lower than most of the other homology models (and lower than the averages for either group). As discussed in the case of 1LUD_1, proper orientation does not necessarily mean accurately ranked ligands. The multiple-template model effectively acts as a structural average between the 1ZDR and 1CZ3 groups, and the RMSD to 3DFR of the multiple-template structures falls between the RMSD ranges of the two template groups. In these experiments, the multiple-template approach presents no advantage over a single template structure. However, only one engine could be located to create a multiple-template model. As such, the single sampling is not enough to indicate whether the ‘average’ homology model performs consistently as compared with the variable performances of the single-template models from the assorted creation mechanisms.

On the basis of the docking results of the individual templates, the crystal structure shows substantially better docking and ligand ranking than the other templates. It orients all ligands correctly and has the best scoring metrics. The NMR structures 1LUD_1 and 1LUD_av performed substantially worse than the crystal structure when all metrics are considered. The larger number of ligands that were not properly oriented using the structures determined by NMR would suggest that caution is necessary with this type of receptor in a system without a conserved ligand orientation. The homology models vary widely in their docking and ranking scores, but generally perform worse than the crystal structure, a trend found in other studies.6 The homology models based on the scaffold of higher sequence identity perform better than those based on the scaffold of lower sequence identity, but vary too much to be of consistent use when one does not have a way to gauge the predictive worth a priori.

ENSEMBLES

Using ensembles of structures during docking has been shown to improve the accuracy of compound ranking, since ensembles potentially represent receptor flexibility and other docking conformations.7,10 In order to determine if the conformational space explored in the ensemble would aid or hinder the ligand ranking of the starting receptor, ensembles of the single template structures were created and the NMR ensemble was used directly in docking experiments. All ensembles were created in a similar fashion: snapshots were taken at regular intervals across an MD simulation and minimized. The post-MD minimization corrected any geometry problems and the resulting Ramachandran plots were all acceptable. All MD simulations were performed with a rigid ligand and the NADPH cofactor. Performing such MD simulations with no ligand would allow the residues to move through a space that would otherwise be occupied by the ligand, preventing accurate docking. It has also been found that ligand-bound trajectories in MD simulations are more energetically stable because of induced fit.49 Docking and analysis were then performed for each ensemble member as above, with values averaged across the ensemble to provide a single metric per ensemble.

A 300 K molecular dynamics simulation with a 3.5 Å active-site radius was carried out on the crystal structure 3DFR, the NMR representative member 1LUD_1, and the average NMR structure 1LUD_av. All structures were then minimized with the same mobile side chain definitions. Solution structures from NMR data are provided as an ensemble of structures. The full NMR ensemble (1LUD NMR), containing 24 structures, was used in docking experiments similar to the MD ensembles. While the NMR average is not an experimentally observed structure, it is considered here to gauge its worth as a starting template for an MD ensemble, as it represents the average of all potential structures. The starting template is included as part of the calculation and indicated with RS. Results are reported in Table V.

Table V.

Scoring Metrics for the Crystal Structures and their MD Ensembles

| Structure | Grouped ranking, % (% A.R.) | % IO | Score range, high | Score range, low |
|---|---|---|---|---|
| 3DFR RS | 80% (49%) | 0% | 8.79 | 4.07 |
| 3DFR 3.5 Å 300 K | 75% (44%) | 3.6% | 8.06 | 3.91 |
| 1LUD NMR Ens. | 70% (39%) | 4.6% | 9.72 | 5.08 |
| 1LUD_av 3.5 Å Ens | 80% (49%) | 7.3% | 8.66 | 3.31 |
| 1LUD_av | 40% (9%) | 20% | 10.41 | 1.82 |
| 1LUD_1 3.5 Å Ens | 80% (49%) | 10.3% | 7.82 | 3.39 |
| 1LUD_1 RS | 30% (−1%) | 0% | 10.5 | 3.31 |

RS indicates the starting template (single structure).

The individual crystal structure performed better than the MD ensemble based on that structure. Whereas the crystal structure orients all ligands correctly, the ensemble has a 3.6% improper orientation rate, with a corresponding decrease in docking score. The 1LUD NMR ensemble has a high grouped ranking score of 70%, and a relatively low improper orientation rate of 4.6%. The average 70% grouped ranking score across the ensemble represents a wide range of individual grouped ranking scores from members of the ensemble (30–80%). There is also substantial variation in the docking score ranges across the ensemble members. The high score varies from 13.5 to 9.0, and the low score varies from 6.87 down to 0 (when a ligand cannot be placed correctly in any of the 200 explored poses). Use of the ensemble in this case compensates for the members with worse ligand docking and ranking.

Use of an MD ensemble based on an individual member of the NMR ensemble offers dramatic improvement over the single structures. Both 1LUD_1 and 1LUD_av have random grouped ranking scores as single structures and have grouped ranking scores of 80% as MD ensembles. The 80% grouped ranking scores are equivalent to that of the crystal structure. The 1LUD_av ensemble also shows a decrease in the improper orientation rate over the individual structure. Despite the increase in grouped ranking score for 1LUD_1, the ensemble also increases the improper orientation rate, primarily resulting from only two ensemble members that possess the majority of improperly placed ligands.
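To see how a few ensemble members can dominate the improper orientation rate, the short Python sketch below pools misplaced-ligand counts across members and reports each member's share. The function name, the snapshot labels, and the counts are hypothetical illustrations, and the sketch assumes every member docks the same number of ligands.

```python
def improper_orientation_summary(misplaced_by_member, n_ligands):
    """Return the pooled improper orientation rate (percent) and each member's
    share of all misplaced ligands (fraction between 0 and 1)."""
    if n_ligands <= 0 or not misplaced_by_member:
        raise ValueError("need at least one member and one ligand")
    total = sum(misplaced_by_member.values())
    pooled_rate = 100.0 * total / (n_ligands * len(misplaced_by_member))
    shares = {
        member: (count / total if total else 0.0)
        for member, count in misplaced_by_member.items()
    }
    return pooled_rate, shares


# Hypothetical counts for a five-member ensemble docking 40 ligands.
counts = {"snap1": 0, "snap2": 1, "snap3": 9, "snap4": 0, "snap5": 10}
rate, shares = improper_orientation_summary(counts, n_ligands=40)

# The two members contributing the most misplaced ligands.
dominant = sorted(shares, key=shares.get, reverse=True)[:2]
```

When every member docks the same ligand set, the pooled rate equals the mean of the per-member rates, so this view is consistent with averaging the metric across the ensemble.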

Homology models

Three of the 12 1ZDR/1CZ3 scaffold models were chosen for ensemble analysis, based on various starting metrics. Additionally, one ensemble was created from the 3.5 Å minimization of the multiple-template model. The docking results are reported in Table VI. In all cases, the use of the ensemble improved or maintained ligand ranking relative to the single-structure homology models. Also, in all cases except the multiple-template model, an ensemble of the single scaffold homology models lowers the improper orientation rate.

Table VI.

Scoring Metrics of the Selected Homology Models and their Ensembles, Sorted by Template and Grouped Ranking Score

| Model | Grouped ranking, % (% A.R.) | % IO | Score range, high | Score range, low |
|---|---|---|---|---|
| Swiss 1ZDR Wholemin Ens | 83% (52%) | 5.10% | 8.14 | 4.55 |
| Swiss 1ZDR Wholemin RS | 63% (32%) | 10% | 8.01 | 0 |
| Esypred 1CZ3 Wholemin Ens | 75% (44%) | 48% | 7.2 | 0.15 |
| Esypred 1CZ3 Wholemin RS | 75% (44%) | 58% | 6.94 | 0 |
| Swiss 1CZ3 3.5 Å min Ens | 48% (17%) | 67.30% | 6.7 | 0.24 |
| Swiss 1CZ3 3.5 Å min RS | 40% (9%) | 73% | 8.41 | 0 |
| Multitemp 3.5 Å min Ens | 53% (21%) | 10.50% | 8.38 | 5.04 |
| Multitemp 3.5 Å min RS | 53% (21%) | 5% | 8.73 | 5.11 |

The 1CZ3 ensembles have substantially higher improper orientation rates than the 1ZDR ensemble. The low end of the docking score range for the 1CZ3 ensembles approaches zero, indicating few or no favorable docking interactions for the correct pose of those particular ligands. The SwissModel 1CZ3 3.5 Å min ensemble has a 67.3% improper orientation rate, making it effectively invalid. However, using the same program and a different template, the ensemble based on the SwissModel 1ZDR wholemin model approaches the crystallographic starting structure using all metrics and has the best grouped ranking score in this study. While the ensembles improve the metrics, the homology models still vary significantly in their efficacy.

As observed in the NMR study, docking and ranking were improved using MD ensembles from homology modeled structures relative to the single homology model structures. The ensemble is especially important when the starting structure cannot dock some of the ligands, as was the case with each starting structure of the homology models, as shown by the low score of 0 in the docking score range in each RS structure. While the two 1CZ3 ensembles still find near 0 values for the low end, the ligand is correctly oriented in some of the cases, and the 1ZDR ensemble demonstrates significant docking interactions on the low end of the docking score range with the value of 4.55. Furthermore, there was a substantial difference in scaffold model sequence identity, indicating the importance of high sequence identity during the homology model creation phase.

HUMAN DHFR

At least two significant results emerged from the L. casei study: first, docking and ranking are generally improved over the single structure when ensembles are used for homology models and NMR data; second, the crystal structure outperforms both homology models and NMR data sets with such high metrics that ensembles offer no improvement. In order to further explore these trends, human DHFR crystal structures, an NMR solution structure, and homology models were subjected to similar experiments. The plethora of available hDHFR structures offers the opportunity not only to investigate ligand docking and ranking but also to explore the impact of the resolution of the crystallographic data and of whether the crystal structure was determined in the apo or holo state.

Receptors

Several hDHFR structures were prepared. Structures derived from high and medium resolution diffraction data (1KMV,50 1.05 Å; 1OHJ,51 2.5 Å) were chosen for the comparative docking experiment. There are very minor differences between the two structures and the RMSD along the backbone is only 0.8 Å (shown in Fig. 3). Within the active site, there are negligible variations among the sidechains, with the exceptions of Phe 31 and Leu 22 that adopt different rotamers.

Figure 3.

Overlay of 1KMV (blue) and 1OHJ (yellow) shown with the 1KMV co-crystallized ligand (pink) and NADPH (green). Phe 31 and Leu 22 are labeled. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

It has been suggested that the apo protein structure may better serve as a docking template as it has not been specifically tuned to a single ligand.7 As a comparison, the apo hDHFR structure 1PDB52 (2.2 Å resolution) was also examined. The apo structure has very few differences from the structures with co-crystallized ligands. The backbone lies along or between 1OHJ and 1KMV, and has an RMSD of 1.03 and 0.77 Å, respectively. There are very few sidechain alterations at the active site in the apo structure.
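The backbone RMSD values quoted above follow the usual root-mean-square definition over matched atom pairs. A minimal Python sketch is given here; it assumes the two structures have already been superposed and that the coordinate lists are in the same atom order. The function name and the toy coordinates are illustrative assumptions, not part of the original workflow.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superposed coordinate sets.

    Each argument is a sequence of (x, y, z) tuples in Å, listed in the
    same atom order. No superposition is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets are empty")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates for illustration only (not taken from 1KMV, 1OHJ, or 1PDB).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
value = rmsd(reference, shifted)
```

In practice the superposition step matters as much as the formula; the sketch leaves it out deliberately so that it depends only on the standard library.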

The hDHFR sequence from 1KMV was submitted to a BlastP37 search of the PDB, returning 238 results. The highest scoring non-human DHFR structure was 4CD253 (Pneumocystis carinii), with 37% identity. The 1KMV sequence was submitted to two different automated homology modeling servers using 4CD2 as a scaffold: Geno3D and Swiss-Model using a sequence alignment generated by ClustalW. Resulting models were aligned with 1KMV to minimize RMSD, and the co-crystallized ligands NADPH and SRI-9662 ((Z)-6-(2-[2,5-dimethoxyphenyl]ethen-1-yl)-2,4-diamino-5-methylpyrido[2,3-d]pyrimidine) were merged into the homology models. The homology models were minimized with a 3.5 Å radius sphere around the ligands.

The NMR ensemble of hDHFR (1YHO54) presents a ligand docking problem. The side chain of the conserved acidic residue (Glu 30 in hDHFR) in 1YHO_1 (the representative structure) is rotated away from an orientation that would allow the essential hydrogen bonding at the ligand N1. To explore the importance of this side chain orientation, we also selected the NMR ensemble member of 1YHO with the most appropriate Glu 30 orientation, 1YHO_12. The variability of the side chain orientations within the active site is illustrated in the Supplementary data. The average NMR structure (1YHO_av) was also investigated as a docking receptor.

Ligands

A series of compounds containing 2,4-diaminopyrimidine rings from the Rosowsky laboratory55 was chosen for analysis. The data set had to be reduced to 19 compounds (Table VII) to avoid ligand geometry problems (such as the geometry of tertiary amines, tricyclic ligands with multiple conformations, or ligands with stereochemistry). The ligands were sorted into four bins based upon their IC50 clustering, with approximately the same number of ligands per bin. The clusters are reported in the Supplementary data.
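For readers who want to experiment with the binning idea, the Python sketch below splits the Table VII ligands into four groups of nearly equal size by rank-ordering their IC50 values. This rank-based split is an assumption standing in for the authors' IC50 clustering, which is reported only in the Supplementary data, so the resulting groups need not match the published bins. The function name is hypothetical.

```python
def equal_count_bins(ic50_by_ligand, n_bins=4):
    """Sort ligands by IC50 (most potent first) and split them into n_bins
    groups whose sizes differ by at most one. Returns a list of name lists."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    ordered = sorted(ic50_by_ligand, key=ic50_by_ligand.get)
    base, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        size = base + (1 if i < extra else 0)  # first `extra` bins get one more
        bins.append(ordered[start:start + size])
        start += size
    return bins


# IC50 values (μM) as listed in Table VII.
ic50 = {
    "II.1": 7.2, "II.3": 0.23, "II.6": 1.4,
    "III.1": 0.83, "III.2": 0.49, "III.4": 0.31,
    "III.5": 0.027, "III.6": 0.00037,
    "VI.1": 0.98, "VI.2": 0.64, "VI.4": 3, "VI.5": 1.6, "VI.12": 7.3,
    "VIII.1": 7.3, "VIII.2": 9.9,
    "IX.1": 0.6, "IX.4": 1.9, "IX.6": 0.81,
    "TMP": 890,
}
bins = equal_count_bins(ic50)
```

A purely rank-based split can separate ligands with nearly identical potency (for example 7.2 and 7.3 μM), which is one reason a clustering of the IC50 values, as used in the study, may be preferable.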

Table VII.

hDHFR Ligands and their Associated IC50 (μM) Values

| Structure | Ligand | Substituent | IC50 (μM) |
|---|---|---|---|
| [structure image] | II.1 | X = N, Y = CH=CH | 7.2 |
| | II.3 | X = N, Y = O | 0.23 |
| | II.6 | X = CH, Y = CH=CH | 1.4 |
| [structure image] | III.1 | R = NHC6H3(2,5-OMe)2 | 0.83 |
| | III.2 | R = NHC6H2(3,4,5-OMe)2 | 0.49 |
| | III.4 | R = N(Me)C6H4(4-Cl) | 0.31 |
| | III.5 | R = N(Me)C6H4(3-Cl) | 0.027 |
| | III.6 | R = N(Me)C6H3(3,4-Cl2) | 0.00037 |
| [structure image] | VI.1 | R1 = Me, R2 = C6H3(2,5-OMe)2 | 0.98 |
| | VI.2 | R1 = Me, R2 = CH2C6H3(2,5-OMe)2 | 0.64 |
| | VI.4 | R1 = Me, R2 = CH2C6H2(3,4,5-OMe)3 | 3 |
| | VI.5 | R1 = Me, R2 = CH2C6H(2-Br)(3,4,5-OMe)3 | 1.6 |
| | VI.12 | R1 = H, R2 = CH2CH2C6H(2-Br)(3,4,5-OMe)3 | 7.3 |
| [structure image] | VIII.1 | R1 = H, R2 = R3 = OMe | 7.3 |
| | VIII.2 | R1 = Cl, R2 = R3 = H | 9.9 |
| [structure image] | IX.1 | R = NHCH(CH3)CH2CH2CH3 | 0.6 |
| | IX.4 | R = morpholino | 1.9 |
| | IX.6 | R = 4-carbethoxypiperazino | 0.81 |
| [structure image] | TMP | | 890 |

It should be noted that these structures vary significantly as compared with the ligands used in the L. casei study, but still have typical drug-like properties. The molecular weights range from 233.6 to 465.5 with two to eight rotatable bonds per ligand. The logP values range from 0.074 to 3.39, with starting conformational energies of 9.03–27.38 kcal/mol and a single outlier of 43.8 kcal/mol.

Docking

Because of the reduced number of ligands, four scoring bins were selected. With neighbor ranking included, there is a 61% random cutoff point, as compared to 31% in the LcDHFR experiment. The ligand ranking of the hDHFR experiment should be considered more in terms of good/fair/random than in direct numeric comparison. This is especially critical when one remembers that the starting conformation of a ligand can affect the docking score by up to 8%. For this reason, scores of 84 and 90% are considered equivalent.
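The relation between a grouped ranking score, the random cutoff, and the "% above random" (% A.R.) value reported in the tables can be written out explicitly. The Python sketch below is illustrative: the function names are assumptions, and treating the 8% starting-conformation effect as a simple tolerance on grouped ranking scores is one reading of the text rather than a stated procedure.

```python
HDHFR_RANDOM_CUTOFF = 61.0    # percent, four bins (from the text)
LCDHFR_RANDOM_CUTOFF = 31.0   # percent (from the text)
CONFORMATION_TOLERANCE = 8.0  # percent, effect of ligand starting conformation


def above_random(grouped_ranking, cutoff):
    """Percentage points by which a grouped ranking score exceeds random."""
    return grouped_ranking - cutoff


def equivalent(score_a, score_b, tolerance=CONFORMATION_TOLERANCE):
    """Treat two scores as equivalent if they differ by no more than tolerance."""
    return abs(score_a - score_b) <= tolerance


# Grouped ranking scores quoted in the text and Table VIII.
kmv_above = above_random(84.0, HDHFR_RANDOM_CUTOFF)  # 1KMV
ohj_above = above_random(90.0, HDHFR_RANDOM_CUTOFF)  # 1OHJ
same_quality = equivalent(84.0, 90.0)
```

Under this reading, the 84% and 90% scores fall within the tolerance and are treated as equivalent, in line with the statement above.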

The docking results for hDHFR are reported in Table VIII. Despite the high random cutoff, the bin method still effectively ranks the ligands into the proper clusters. The two crystal structures that were co-crystallized with a ligand (1KMV and 1OHJ) perform equivalently; both have high grouped ranking scores and correctly orient all ligands. It is interesting to note that the 1KMV structure docks the ligands with higher docking scores, implying that more binding contacts and/or lower energy and steric penalties are calculated using the higher resolution structure. The apo structure (1PDB) ranks slightly below the holo structures for grouped ranking scores, docking and orientation. The slightly worse performance of the apo structure has been previously noted in other studies.5

Table VIII.

Docking Data for Crystallographic, NMR Solution Structure and Homology models for hDHFR

| Structure | Grouped ranking, % (% A.R.) | % IO | Score range, high | Score range, low | RMSD to 1KMV | Source |
|---|---|---|---|---|---|---|
| 1KMV | 84% (23%) | 0% | 11.34 | 6.16 | | H X-Ray res. 1.05 Å |
| 1OHJ | 90% (29%) | 0% | 10.49 | 3.48 | 0.79 | H X-Ray res. 2.5 Å |
| 1PDB | 79% (18%) | 5% | 10.24 | 5.42 | 0.76 | H X-Ray apo |
| SwissModel_4CD2 | 63% (2%) | 5% | 9.76 | 4.2 | 0.97 | H homology model |
| Geno3D_4CD2 | 79% (18%) | 11% | 7.76 | 4.62 | 3.00 | H homology model |
| 1YHO_1 | 74% (13%) | 37% | 8.36 | 1.73 | | H NMR: Representative member |
| 1YHO_12 | 68% (7%) | 21% | 9.14 | 4.95 | | H NMR: Selected member |
| 1YHO_av | 79% (18%) | 53% | 7.99 | 3.78 | | H NMR: average structure |

The grouped ranking score is shown as % of maximum score and (the % above the random cutoff). The random cutoff is 61%.

The SwissModel_4CD2 homology model performs worse than the crystallographic structures, with a random grouped ranking score and a 5% improper orientation rating from a single misplaced ligand. Given the behavior of the grouped ranking for this low number of ligands, the 63% grouped ranking score is a notch down from the apo crystal structure (79%), which is also a notch under the two holo structures. The Geno3D_4CD2 homology model has a grouped ranking score equivalent with that of the apo structure, but an 11% improper orientation rating. This behavioral trend and variability in performance across homology models is consistent with the single structures seen in the L. casei study.

Although the 1YHO representative member (1YHO_1) has grouped ranking scores similar to the apo crystal structure, the 37% improper orientation rate demonstrates that many ligands were not properly oriented within the active site. This very high improper orientation rate is expected, as the conserved acidic residue that makes critical hydrogen bonds with the 2,4-diaminopyrimidine ring, Glu 30, is rotated away from the proper binding conformation. 1YHO_12 was chosen from the ensemble as it has a Glu 30 orientation closest to that seen in crystal structures, while still not in the ideal location. This selected member shows improved docking scores, with a 16% improvement in improper orientation; however, it still has a grouped ranking score that is barely above random. The average 1YHO structure performs equivalently with 1YHO_1, but better than 1YHO_12 in terms of grouped ranking scores. However, 1YHO_av has an improper orientation rate over 50%, making it the least usable of the three 1YHO structures explored. Although there is improvement with the correct side chain orientation, none of the 1YHO structures dock the ligands with an acceptable improper orientation rate. It should also be noted that the docking score range for the 1YHO NMR members is substantially lower than that observed with the crystal structures of hDHFR, indicating fewer positive interactions.

Ensembles

Docking experiments were performed using MD ensembles generated from the hDHFR crystal structures as discussed previously for LcDHFR. The docking results are reported in Table IX. Ensembles were explored for select members of the 1YHO ensemble set. However, the misorientation of Glu 30 was never corrected by the relatively short MD runs explored here. As a result, small motions of Glu 30 further exacerbated the orientation issue faced by these members, and docking results were incomparable (data not shown). These experiments demonstrate the critical importance of the conserved acidic residue to DHFR for proper ligand orientation and the resulting docking and grouped ranking scores. This effectively negates 1YHO as a docking template for this experiment, while 1LUD (which maintains the conserved orientation) was usable.

Table IX.

Scoring Metrics for the hDHFR Structures and their Ensembles

| Structure | Grouped ranking, % (% A.R.) | % IO | Score range, high | Score range, low |
|---|---|---|---|---|
| 1OHJ RS | 90% (29%) | 0% | 10.49 | 3.48 |
| 1KMV RS | 84% (23%) | 0% | 11.34 | 6.16 |
| 1KMV 3.5 Å Ens | 68% (7%) | 4.3% | 10.47 | 3.74 |
| 1OHJ 3.5 Å Ens | 63% (2%) | 35.6% | 7.81 | 1.62 |
| SwissModel_4CD2 | 63% (2%) | 5% | 9.76 | 4.2 |
| SwissModel_4CD2 3.5 Å Ens | 74% (13%) | 19% | 8.74 | 4.59 |
| Geno3D_4CD2 | 79% (18%) | 11% | 7.76 | 4.62 |
| Geno3D_4CD2 3.5 Å Ens | 68% (7%) | 7% | 7.57 | 5.33 |

RS indicates the starting template (single structure).

The results from the ensembles generated from the high resolution crystal structure, 1KMV, are similar to those reported for 3DFR. All metrics decrease, though only slightly. The ligand ranking decreases from good to random and the improper orientation rate increases from 0 to 4.3%. However, the ensemble generated from the medium resolution crystal structure, 1OHJ, performs substantially worse. The docking values and orientation of ligands in 1OHJ quickly degenerate as it moves through an MD simulation, with a 35.6% improper orientation rate. The docking scores also decrease significantly, indicating loss of binding points across the ensemble.

Docking metrics are roughly equivalent when an ensemble based on the homology models was created and employed (Table IX). Grouped ranking scores improved for the SwissModel_4CD2 homology model ensemble, although this is accompanied by an increase in the improper orientation rate. The Geno3D ensemble, for which three members were removed because the conserved acidic residue is rotated 90° away from the correct orientation, offers improvement in improper orientation rate, but a loss of grouped ranking metrics. However, the small number of bins and ligands make such comparisons more difficult than in the L. casei case.

In summary, the ternary complex crystal structures serve as very accurate receptors for docking and the apo crystal structure performs only marginally worse than the ternary crystal structures, but still substantially better than the other hDHFR receptors explored. Once again, there is variability in the homology models in terms of grouped ranking scores and ligand placement, but an ensemble still presents a more predictable route than using a static structure alone. These trends are consistent with those seen in LcDHFR. The critical importance of the conserved acidic residue in ligand docking and orientation is also reiterated in several circumstances.

DISCUSSION

While we have applied this methodology only to DHFR in this work, there are no features of the protein or ligand set that make the results case-limiting. DHFR is representative of many enzymes in its active site size and conformation as well as the degree of ligand-induced conformational change. Furthermore, since the properties of the ligands that were docked are considered drug-like, they are similar to many ligands that would be of interest for docking to other receptors. It is also clear that the improvement effected by the ensembles did not depend on the nature of the ligand in the starting structure: the crystal structure was determined with a pteridine ligand and the homology models were originally minimized around a pteridine core, yet the docked ligands were based on pyrimidine structures.

As receptors for docking, crystal structures perform better than solution structure NMR or homology model structures, both in ligand orientation and ranking. Structures based on high and medium resolution diffraction data both place and rank the ligands equally effectively. The apo structure performs almost equivalently to the ternary structures. Single members of NMR ensembles and homology models vary widely in their ligand docking and ranking, but even those that exhibit the best metrics do not perform as well as crystal structures.

The template structures used in creating homology models have a large impact on the metrics for ligand docking and ranking to the model. Differences in the sequence identity of the scaffold produce a marked change in the docking metrics of the resulting homology model. Conversely, a multiple template approach to creating a homology model does not appear to offer any substantial improvement over the use of a single template. A related conclusion from this work is that critical structural features, such as the side chain of a conserved residue, must be properly oriented for accurate docking results.

The effect of ensembles to improve docking and ranking is summarized in Table X. The high resolution crystal structures showed a slight decrease in metrics when used as the starting templates for ensembles, as their starting conformation is most likely superior to all others explored. However, the generation of an ensemble from poor starting structures, such as the 1LUD NMR structures and most homology models, generally improves ligand ranking and docking.

Table X.

Summary of the Impact of an MD Ensemble upon the Metrics of the Starting Structure, Ranging from Strongly Negative (−−) to Neutral (=) to Strongly Positive (++)

| Starting structure | Structure type | Improvement with an ensemble? |
|---|---|---|
| 3DFR | X-ray | |
| 1KMV | X-ray | |
| 1OHJ | X-ray | −− |
| 1LUD_1 | NMR | + |
| 1LUD_av | NMR av. | ++ |
| Swiss 1ZDR Wholemin | Homology | ++ |
| Esypred 1ZDR 3.5 Å min | Homology | |
| Esypred 1CZ3 Wholemin | Homology | + |
| Swiss 1CZ3 3.5 Å min | Homology | + |
| Multitemp 3.5 Å Min | Homology | |
| SwissModel 4CD2 | Homology | = |
| Geno3D 4CD2 | Homology | = |

Emphasis is given to ligand ranking.

We investigated several possible reasons for the improved results: the ensembles may represent lower energy conformations than the solution structures or homology models and thereby correct “problems” in the original structures, the ensembles may provide a “softer” protocol for scoring, or the ensembles may truly represent protein flexibility. Each of these will be considered in the following paragraphs.

We examined whether the rigorous minimization of the ensemble members brought the structures to lower energy conformations than the coordinates obtained from the database for the solution structure or calculated for the homology models. To test this, the example 1LUD_1 was subjected to a similar minimization as the ensemble members and the docking experiment was repeated. The grouped ranking score increased somewhat but there was a dramatic increase in the improper orientation rate, leading us to the conclusion that the additional minimization did not account for the positive effect of the ensemble. Additionally, an examination of the docked, ligand-bound conformations of the solution structures and homology models showed that they did not contain residues that were grossly out of position or in steric conflict with the ligand, implying that the positive effect of the ensembles was not simply to correct improperly placed side chains.

The ensembles, most likely, absorb some of the error of the highly coordinate-specific nature of computational docking. In solution, contacts and interactions between protein and ligand are flexible, with constantly moving sidechains and variable hydrogen bonding lengths all contributing to an overall preserved ligand binding value. In silico, a minute change in atom location or hydrogen bond distance can dramatically affect the docking score, as shown by the variable ensemble results across the MD when atoms move only fractions of an Angstrom. Comparatively large changes in computational score are softened by averaging across these small motions.

Lastly and most importantly, the ensembles do appear to represent a degree of protein flexibility. Overall, the positions of the residues in the crystal structure, solution structure, homology models, and individual ensemble members differ, yet all are realistic ligand-binding conformations. The positions of the residues in the crystal structure were determined to high resolution; those in the NMR solution structure were determined with a high ratio of experimental data to model parameters; the homology models fall within the boundaries of acceptable regions of the Ramachandran plot; and the active site side chains of individual ensemble members are brought to a local energy minimum. Since all of these positions are considered chemically possible, side chain flexibility is experimentally and computationally observed. In this study, the crystal structure represents one of the most optimal ligand binding conformations for ligand ranking and correct ligand placement. Interestingly, the majority of the active site side chains of the individual ensemble members have a lower RMSD to the crystal structure than to their starting structure, suggesting that the ensemble represents motion toward a superior ligand-binding orientation. Overall, it appears that the ensemble may explore conformational space through an MD simulation and sample a geometry landscape of improved docking conformations.

ACKNOWLEDGMENTS

This work was supported by NIGMS (GM 067542 to ACA) and NSF (0133468 to ACA).

Footnotes

Additional Supporting Information may be found in the online version of this article.

REFERENCES

  • 1.Shoichet BK. Virtual screening of chemical libraries. Nature. 2004;432:862–865. doi: 10.1038/nature03197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Schneidman-Duhovny D, Nussinov R, Wolfson HJ. Predicting molecular interactions in silico. II. Protein-protein and protein-drug docking. Curr Med Chem. 2004;11:91–107. doi: 10.2174/0929867043456223. [DOI] [PubMed] [Google Scholar]
  • 3.Amini A, Shrimpton PJ, Muggleton SH, Sternberg MJ. A general approach for developing system-specific functions to score protein-ligand docked complexes using support vector inductive logic programming. Proteins. 2007;69:823–831. doi: 10.1002/prot.21782. [DOI] [PubMed] [Google Scholar]
  • 4.Leach AR, Shoichet BK, Peishoff CE. Prediction of protein-ligand interactions. Docking and scoring: successes and gaps. J Med Chem. 2006;49:5851–5855. doi: 10.1021/jm060999m. [DOI] [PubMed] [Google Scholar]
  • 5.McGovern SL, Shoichet BK. Information decay in molecular docking screens against holo, apo, and modeled conformations of enzymes. J Med Chem. 2003;46:2895–2907. doi: 10.1021/jm0300330. [DOI] [PubMed] [Google Scholar]
  • 6.Kairys V, Fernandes MX, Gilson MK. Screening drug-like compounds by docking to homology models: a systematic study. J Chem Inf Model. 2006;46:365–379. doi: 10.1021/ci050238c. [DOI] [PubMed] [Google Scholar]
  • 7.Alonso H, Bliznyuk AA, Gready JE. Combining docking and molecular dynamic simulations in drug design. Med Res Rev. 2006;26:531–568. doi: 10.1002/med.20067. [DOI] [PubMed] [Google Scholar]
  • 8.Huang SY, Zou X. Efficient molecular docking of NMR structures: application to HIV-1 protease. Protein Sci. 2007;16:43–51. doi: 10.1110/ps.062501507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Damm KL, Carlson HA. Exploring experimental sources of multiple protein conformations in structure-based drug design. J Am Chem Soc. 2007;129:8225–8235. doi: 10.1021/ja0709728. [DOI] [PubMed] [Google Scholar]
  • 10.Popov VM, Yee WA, Anderson AC. Towards in silico lead optimization: scores from ensembles of protein/ligand conformations reliably correlate with biological activity. Proteins. 2007;66:375–387. doi: 10.1002/prot.21201. [DOI] [PubMed] [Google Scholar]
  • 11.Kavraki L. Protein-Ligand Docking, Including Flexible Receptor-Flexible Ligand Docking. 2007. Connexions Web site. http://cnx.org/content/m11456/1.10/ [Google Scholar]
  • 12.Carlson HA, Masukawa KM, McCammon JA. Method for including the dynamic fluctuations of a protein in computer-aided drug design. J Phys Chem A. 1999;103:10213–10219. [Google Scholar]
  • 13.Lerner MG, Bowman AL, Carlson HA. Incorporating dynamics in E. coli dihydrofolate reductase enhances structure-based drug discovery. J Chem Inf Model. 2007;47:2358–2365. doi: 10.1021/ci700167n. [DOI] [PubMed] [Google Scholar]
  • 14.Carlson HA, Masukawa KM, Rubins K, Bushman FD, Jorgensen WL, Lins RD, Briggs JM, McCammon JA. Developing a dynamic pharmacophore model for HIV-1 integrase. J Med Chem. 2000;43:2100–2114. doi: 10.1021/jm990322h. [DOI] [PubMed] [Google Scholar]
  • 15.Carlson HA. Protein flexibility and drug design: how to hit a moving target. Curr Opin Chem Biol. 2002;6:447–452. doi: 10.1016/s1367-5931(02)00341-1. [DOI] [PubMed] [Google Scholar]
  • 16.Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Freer ST, Rose PW. Complexity and simplicity of ligand-macromolecule interactions: the energy landscape perspective. Curr Opin Struct Biol. 2002;12:197–203. doi: 10.1016/s0959-440x(02)00310-x. [DOI] [PubMed] [Google Scholar]
  • 17.Sugita Y, Okamtot Y. Replica-exchange molecular dynamics method for protein folding. Chem Phys Lett. 1999;314:141–151. [Google Scholar]
  • 18.Murray CW, Baxter CA, Frenkel AD. The sensitivity of the results of molecular docking to induced fit effects: application to thrombin, thermolysin and neuraminidase. J Comput Aided Mol Des. 1999;13:547–562. doi: 10.1023/a:1008015827877. [DOI] [PubMed] [Google Scholar]
  • 19.Czerminski R, Elber R. Computational studies of ligand diffusion in globins: I. Leghemoglobin. Proteins. 1991;10:70–80. doi: 10.1002/prot.340100107. [DOI] [PubMed] [Google Scholar]
  • 20.Berndt KD, Guntert P, Wuthrich K. Conformational sampling by NMR solution structures calculated with the program DIANA evaluated by comparison with long-time molecular dynamics calculations in explicit water. Proteins. 1996;24:304–313. doi: 10.1002/(SICI)1097-0134(199603)24:3<304::AID-PROT3>3.0.CO;2-G. [DOI] [PubMed] [Google Scholar]
  • 21.Philippopoulos M, Lim C. Exploring the dynamic information content of a protein NMR structure: comparison of a molecular dynamics simulation with the NMR and X-ray structures of Escherichia coli ribonuclease HI. Proteins. 1999;36:87–110. doi: 10.1002/(sici)1097-0134(19990701)36:1<87::aid-prot8>3.0.co;2-r. [DOI] [PubMed] [Google Scholar]
  • 22.Yuvaniyama J, Chitnumsub P, Kamchonwongpaisan S, Vanichtanankul J, Sirawaraporn W, Taylor P, Walkinshaw MD, Yuthavong Y. Insights into antifolate resistance from malarial DHFR-TS structures. Nat Struct Biol. 2003;10:357–365. doi: 10.1038/nsb921. [DOI] [PubMed] [Google Scholar]
  • 23.Knighton DR, Kan CC, Howland E, Janson CA, Hostomska Z, Welsh KM, Matthews DA. Structure of and kinetic channelling in bifunctional dihydrofolate reductase-thymidylate synthase. Nat Struct Biol. 1994;1:186–194. doi: 10.1038/nsb0394-186. [DOI] [PubMed] [Google Scholar]
  • 24.Champness JN, Achari A, Ballantine SP, Bryant PK, Delves CJ, Stammers DK. The structure of Pneumocystis carinii dihydrofolate reductase to 1.9 A resolution. Structure. 1994;2:915–924. doi: 10.1016/s0969-2126(94)00093-x. [DOI] [PubMed] [Google Scholar]
  • 25.Bolin JT, Filman DJ, Matthews DA, Hamlin RC, Kraut J. Crystal structures of Escherichia coli and Lactobacillus casei dihydrofolate reductase refined at 1.7 A resolution. I. General features and binding of methotrexate. J Biol Chem. 1982;257:13650–13662. [PubMed] [Google Scholar]
  • 26.Anderson AC. Two crystal structures of dihydrofolate reductase-thymidylate synthase from Cryptosporidium hominis reveal protein-ligand interactions including a structural basis for observed antifolate resistance. Acta Crystallogr Sect F Struct Biol Cryst Commun. 2005;61:258–262. doi: 10.1107/S1744309105002435. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.SYBYL 7.3. Tripos Inc.; St. Louis; Missouri: 1699. [Google Scholar]
  • 28.Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JA, Jr, Montgomery J, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03, Revision C. 02. Gaussian, Inc.; Wallingford CT: 2004. [Google Scholar]
  • 29.Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Lovell SC, Word JM, Richardson JS, Richardson DC. The penultimate rotamer library. Proteins. 2000;40:389–408. [PubMed] [Google Scholar]
  • 31.Jain AN. Scoring noncovalent protein-ligand interactions: a continuous differentiable function tuned to compute binding affinities. J Comput Aided Mol Des. 1996;10:427–440. doi: 10.1007/BF00124474. [DOI] [PubMed] [Google Scholar]
  • 32. Tripos Bookshelf 7.3. Tripos Inc.; 1699 South Hanley Rd., St. Louis, Missouri.
  • 33. Pelphrey PM, Popov VM, Joska TM, Beierlein JM, Bolstad ES, Fillingham YA, Wright DL, Anderson AC. Highly efficient ligands for dihydrofolate reductase from Cryptosporidium hominis and Toxoplasma gondii inspired by structural analysis. J Med Chem. 2007;50:940–950. doi: 10.1021/jm061027h.
  • 34. Polshakov VI, Smirnov EG, Birdsall B, Kelly G, Feeney J. NMR-based solution structure of the complex of Lactobacillus casei dihydrofolate reductase with trimethoprim and NADPH. J Biomol NMR. 2002;24:67–70. doi: 10.1023/a:1020659713373.
  • 35. Mura C. The average3d PyMOL module. 2005. http://mccammon.ucsd.edu/~cmura/PyMOL/
  • 36. DeLano WL. The PyMOL molecular graphics system. DeLano Scientific; Palo Alto CA, USA: 2002.
  • 37. BLASTP. National Center for Biotechnology Information; Bldg 38A, NIH, 8600 Rockville Pike, Bethesda, MD 20894.
  • 38. Kim HS, Damo SM, Lee SY, Wemmer D, Klinman JP. Structure and hydride transfer mechanism of a moderate thermophilic dihydrofolate reductase from Bacillus stearothermophilus and comparison to its mesophilic and hyperthermophilic homologues. Biochemistry. 2005;44:11428–11439. doi: 10.1021/bi050630j.
  • 39. Dams T, Auerbach G, Bader G, Jacob U, Ploom T, Huber R, Jaenicke R. The crystal structure of dihydrofolate reductase from Thermotoga maritima: molecular features of thermostability. J Mol Biol. 2000;297:659–672. doi: 10.1006/jmbi.2000.3570.
  • 40. Reyes VM, Sawaya MR, Brown KA, Kraut J. Isomorphous crystal structures of Escherichia coli dihydrofolate reductase complexed with folate, 5-deazafolate, and 5,10-dideazatetrahydrofolate: mechanistic implications. Biochemistry. 1995;34:2710–2723. doi: 10.1021/bi00008a039.
  • 41. Sawaya MR, Kraut J. Loop and subdomain movements in the mechanism of Escherichia coli dihydrofolate reductase: crystallographic evidence. Biochemistry. 1997;36:586–603. doi: 10.1021/bi962337c.
  • 42. Bennett B, Langan P, Coates L, Mustyakimov M, Schoenborn B, Howell EE, Dealwis C. Neutron diffraction studies of Escherichia coli dihydrofolate reductase complexed with methotrexate. Proc Natl Acad Sci USA. 2006;103:18493–18498. doi: 10.1073/pnas.0604977103.
  • 43. Lambert C, Leonard N, De Bolle X, Depiereux E. ESyPred3D: prediction of proteins 3D structures. Bioinformatics. 2002;18:1250–1256. doi: 10.1093/bioinformatics/18.9.1250.
  • 44. Combet C, Jambon M, Deleage G, Geourjon C. Geno3D: automatic comparative molecular modelling of protein. Bioinformatics. 2002;18:213–214. doi: 10.1093/bioinformatics/18.1.213.
  • 45. Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res. 2003;31:3381–3385. doi: 10.1093/nar/gkg520.
  • 46. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
  • 47. Hansch C, Li R, Blaney JM, Langridge R. Comparison of the inhibition of Escherichia coli and Lactobacillus casei dihydrofolate reductase by 2,4-diamino-5-(substituted-benzyl)pyrimidines: quantitative structure-activity relationships, X-ray crystallography, and computer graphics in structure-activity analysis. J Med Chem. 1982;25:777–784. doi: 10.1021/jm00349a003.
  • 48. Hansch C, Hathaway BA, Guo ZR, Selassie CD, Dietrich SW, Blaney JM, Langridge R, Volz KW, Kaufman BT. Crystallography, quantitative structure-activity relationships, and molecular graphics in a comparative analysis of the inhibition of dihydrofolate reductase from chicken liver and Lactobacillus casei by 4,6-diamino-1,2-dihydro-2,2-dimethyl-1-(substituted-phenyl)-s-triazines. J Med Chem. 1984;27:129–143. doi: 10.1021/jm00368a006.
  • 49. Kua J, Zhang Y, McCammon JA. Studying enzyme binding specificity in acetylcholinesterase using a combined molecular dynamics and multiple docking approach. J Am Chem Soc. 2002;124:8260–8267. doi: 10.1021/ja020429l.
  • 50. Klon AE, Heroux A, Ross LJ, Pathak V, Johnson CA, Piper JR, Borhani DW. Atomic structures of human dihydrofolate reductase complexed with NADPH and two lipophilic antifolates at 1.09 Å and 1.05 Å resolution. J Mol Biol. 2002;320:677–693. doi: 10.1016/s0022-2836(02)00469-2.
  • 51. Cody V, Galitsky N, Luft JR, Pangborn W, Rosowsky A, Blakley RL. Comparison of two independent crystal structures of human dihydrofolate reductase ternary complexes reduced with nicotinamide adenine dinucleotide phosphate and the very tight-binding inhibitor PT523. Biochemistry. 1997;36:13897–13903. doi: 10.1021/bi971711l.
  • 52. Cody V, Luft JR, Pangborn W, Gangjee A. Analysis of three crystal structure determinations of a 5-methyl-6-N-methylanilino pyridopyrimidine antifolate complex with human dihydrofolate reductase. Acta Crystallogr D Biol Crystallogr. 2003;59:1603–1609. doi: 10.1107/s0907444903014963.
  • 53. Cody V, Galitsky N, Rak D, Luft JR, Pangborn W, Queener SF. Ligand-induced conformational changes in the crystal structures of Pneumocystis carinii dihydrofolate reductase complexes with folate and NADP+. Biochemistry. 1999;38:4303–4312. doi: 10.1021/bi982728m.
  • 54. Kovalevskaya NV, Smurnyy YD, Polshakov VI, Birdsall B, Bradbury AF, Frenkiel T, Feeney J. Solution structure of human dihydrofolate reductase in its complex with trimethoprim and NADPH. J Biomol NMR. 2005;33:69–72. doi: 10.1007/s10858-005-1475-z.
  • 55. Nelson RG, Rosowsky A. Dicyclic and tricyclic diaminopyrimidine derivatives as potent inhibitors of Cryptosporidium parvum dihydrofolate reductase: structure-activity and structure-selectivity correlations. Antimicrob Agents Chemother. 2001;45:3293–3303. doi: 10.1128/AAC.45.12.3293-3303.2001.

Supplementary Materials

Supplementary Figure
Supplementary Table