Nucleic Acids Research. 2012 May 2;40(Web Server issue):W409–W414. doi: 10.1093/nar/gks378

ZINCPharmer: pharmacophore search of the ZINC database

David Ryan Koes 1,*, Carlos J Camacho 1
PMCID: PMC3394271  PMID: 22553363

Abstract

ZINCPharmer (http://zincpharmer.csb.pitt.edu) is an online interface for searching the purchasable compounds of the ZINC database using the Pharmer pharmacophore search technology. A pharmacophore describes the spatial arrangement of the essential features of an interaction. Compounds that match a well-defined pharmacophore serve as potential lead compounds for drug discovery. ZINCPharmer provides tools for constructing and refining pharmacophore hypotheses directly from molecular structure. A search of 176 million conformers of 18.3 million compounds typically takes less than a minute. The results can be immediately viewed, or the aligned structures may be downloaded for off-line analysis. ZINCPharmer enables the rapid and interactive search of purchasable chemical space.

INTRODUCTION

A pharmacophore describes the structural arrangement of the essential molecular features of an interaction between a ligand and its receptor. Searching chemical libraries for compounds that match a specific pharmacophore is an established method of virtual screening (1–3). The two main challenges of pharmacophore-based virtual screening are identifying a representative pharmacophore for an interaction and then identifying the compounds within a relevant chemical library that match the pharmacophore. ZINCPharmer is a pharmacophore search engine for purchasable chemical space that addresses both these challenges.

An interaction pharmacophore may be elucidated from a set of known active ligands by identifying a consensus pharmacophore that is conformationally accessible to all these ligands (1,4). These techniques do not require a ligand-bound structure, but may be computationally demanding if the input set contains many flexible ligands. PharmaGist (5) is a free web server that can identify a consensus pharmacophore of a set of up to 32 ligands in a few minutes. Alternatively, structure-based approaches require a ligand-bound structure and identify a potential pharmacophore by analyzing the interaction site (6). ZINCPharmer provides a mechanism for deriving an initial pharmacophore hypothesis directly from structures within the PDB (Protein Data Bank), and also supports importing pharmacophore definitions developed using more computationally demanding approaches implemented in third-party tools.

Given a library of explicit compound conformations, conformers that match a 3D pharmacophore can be found using either fingerprint-based (7–9) or alignment-based (4,10) approaches. Fingerprints are well suited for similarity metrics (11), but, since they discretize the pharmacophore representation, provide inexact results. The EDULISS (12) online database provides fingerprint-based screening of a single-conformer library of a few million compounds, but the query fingerprint must be manually constructed from pairwise distance constraints. Alignment-based approaches produce more accurate and interpretable results, at the expense of more computation. For example, a library of fewer than a million conformers may take minutes or hours to screen (13). However, since there are substantially fewer protein targets than there are possible ligands, alignment-based pharmacophore screening can be used effectively when performing a reverse screen that identifies matching protein targets instead of ligands. PharmMapper (14) takes as input a single ligand and screens a database of over 7000 receptors for potential targets.

Both fingerprint-based and alignment-based approaches typically evaluate every conformer in the library, resulting in search times that scale with the size of the database. Newer methods, such as Pharmer (15) and Recore (16), use indexing approaches so that search times scale with the complexity and breadth of the query, not the size of the library. ZINCPharmer uses the open-source Pharmer software to enable the interactive search of more than 176 million conformations in just a few minutes, if not seconds.

METHODS

ZINCPharmer searches a database of conformations calculated from the purchasable compounds of the ZINC database (17). ZINC is a comprehensive collection of commercially available, biologically relevant compounds suitable for screening. Purchasable compounds have an expected availability of <10 weeks and are either available from vendor stock or are make-on-demand. The ZINCPharmer library is synchronized with the ZINC library on a monthly basis. Compounds are both added and removed to maintain consistency and ensure that only currently purchasable compounds are retained. ZINC compounds are converted into 3D conformations using omega2 from OpenEye Scientific Software (http://eyesopen.com). Conformers are generated using the default settings except for -rms 0.7, which improves the sampling of conformational space compared to the default setting of 0.5 (18). The 10 best conformers are saved.

The generated conformers are converted into an efficient search format using the Pharmer (15) open-source software. Pharmer identifies hydrophobic, hydrogen bond donor/acceptor, positive/negative ions and aromatic pharmacophore features using the SMARTS matching functionality of the OpenBabel toolkit (19). Currently, the default set of SMARTS definitions is used, but these are subject to refinement based on user input. These features are stored in an efficient spatial index to support the rapid search of large chemical libraries. For example, the search shown in Figure 1 took less than 3 seconds.

Figure 1.

The ZINCPharmer interface. The Jmol-based molecular viewer is in the upper left and displays the pharmacophore features as spheres within the context of the interaction structure. A negative ion feature is shown in red mesh and the selected hydrogen acceptor in solid orange. Both a receptor structure, shown as a translucent partial-charge mapped surface, and a ligand structure, from which an interaction pharmacophore is automatically derived, may be uploaded. The pharmacophore query editor is shown in the bottom left and supports the interactive modification of the properties of the pharmacophore, including directions of hydrogen bonds and the size of hydrophobic regions. The full query session state can be saved and restored. Additional property filters, such as molecular weight, may be specified under the Filters tab while the visual styles of the molecular viewer may be set under the Viewer tab. The results browser is on the right and displays the ZINC id, which links directly to the ZINC database and purchasing information, the minimal RMSD of the compound pose to the query, the molecular weight and the number of rotatable bonds. The results may be sorted by any of the numerical features and the full set of result structures may be downloaded.

The graphical user interface (Figure 1) for defining, refining and visualizing pharmacophore queries and their results is implemented using JavaScript and the Java-based Jmol (http://www.jmol.org/) molecular viewer. A modern, standards-compliant browser with a recent Java plugin is required. Session state, which includes the pharmacophore definitions, can be saved in a human-readable JSON (JavaScript Object Notation) format, and the aligned search results can be saved in the sdf molecular format. An internet forum hosts a user guide and provides technical support.

DEFINING A PHARMACOPHORE QUERY

Using the Pharmer software, ZINCPharmer can automatically extract a set of pharmacophore features from molecular structure. Each feature consists of the feature type (hydrophobic, hydrogen bond donor/acceptor, positive/negative ion or aromatic), a position, and a search radius. Figure 2 illustrates the various methods for creating an initial query. Features may be derived from a single ligand structure, a protein–ligand structure, a protein–protein structure or from the output of third-party software.

Figure 2.

Defining a pharmacophore query in ZINCPharmer. The Load Features button can be used to calculate the pharmacophore features of a ligand structure or to upload 3rd party pharmacophore definitions. Alternatively, an interaction pharmacophore can be derived directly from a ligand-bound structure in the PDB, or the essential pharmacophore of a protein–protein interaction can be exported from PocketQuery.

From ligand structure

Any single-conformer molecular structure file that is compatible with OpenBabel (19) may be uploaded to define a set of pharmacophore features. All identified features of the molecule are enabled as pharmacophore query features. However, since by itself a ligand provides no information about the nature of an interaction, the result is not a true pharmacophore. For instance, even though low-energy conformers are often close in configuration to the bound structures (18), without additional information it is impossible to separate interacting features from non-interacting features. Instead, the pharmacophore derived from a single ligand structure should be thought of as a 3D similarity search.

If the receptor structure is known, a flexible docking of the ligand can generate a custom protein–ligand structure from which ZINCPharmer can automatically derive an interaction pharmacophore. Alternatively, if there are many known binders then a consensus pharmacophore can be elucidated (1,4) using software such as Chemical Computing Group's MOE (http://www.chemcomp.com/), Inte:Ligand's LigandScout (http://www.inteligand.com/), or PharmaGist (5) and the result can be imported into ZINCPharmer.

From protein–ligand structure

When provided with both a receptor and bound-ligand structure, ZINCPharmer will automatically identify an interaction pharmacophore. All possible pharmacophore features on the ligand are computed, but only those that are within a distance cutoff of complementary features on the receptor are enabled. Hydrogen bond acceptors/donors must be within 4 Å of a hydrogen bond donor/acceptor on the receptor. Charged features must be within 5 Å of an oppositely charged feature on the receptor. Aromatic features must be within 5 Å of a receptor aromatic feature. A ligand hydrophobic feature must be within 6 Å of at least three hydrophobic features on the receptor in order to require some degree of buriedness. The distance cutoffs are intended to be permissive and no angular cutoffs are applied, since it is conceptually easier for a user to reduce the number of features in a pharmacophore query than to increase them (which requires investigating a much larger number of potential features).
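As a rough illustration of the cutoff rules above, the following Python sketch decides whether a single ligand feature would be enabled. The `Feature` class, the feature type names and the `is_enabled` helper are hypothetical and are not taken from the Pharmer source code; only the distance cutoffs and the three-neighbor rule for hydrophobic features come from the text.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Feature:
    kind: str        # hypothetical names: donor, acceptor, positive, negative, aromatic, hydrophobic
    position: tuple  # (x, y, z) in angstroms


# Receptor feature type that complements each ligand feature type.
COMPLEMENT = {
    "donor": "acceptor", "acceptor": "donor",
    "positive": "negative", "negative": "positive",
    "aromatic": "aromatic", "hydrophobic": "hydrophobic",
}

# Distance cutoffs in angstroms, as stated in the text.
CUTOFF = {
    "donor": 4.0, "acceptor": 4.0,
    "positive": 5.0, "negative": 5.0,
    "aromatic": 5.0, "hydrophobic": 6.0,
}

# Hydrophobic features need at least three receptor neighbors (buriedness).
MIN_PARTNERS = {"hydrophobic": 3}


def is_enabled(ligand_feature, receptor_features):
    """Return True if enough complementary receptor features lie within the cutoff."""
    partner_kind = COMPLEMENT[ligand_feature.kind]
    cutoff = CUTOFF[ligand_feature.kind]
    neighbors = sum(
        1 for r in receptor_features
        if r.kind == partner_kind and dist(ligand_feature.position, r.position) <= cutoff
    )
    return neighbors >= MIN_PARTNERS.get(ligand_feature.kind, 1)
```

No angular test appears in the sketch, which mirrors the permissive, distance-only policy described above.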

If the protein–ligand structure exists in the PDB, then a shortcut is available on the ZINCPharmer home page where the user need only enter the PDB accession code, select the desired ligand and click the Start button (Figure 2). The corresponding ligand and receptor structures as well as their interaction pharmacophore will automatically be loaded into a new ZINCPharmer session.

For custom protein–ligand structures, for example, the result of a docking study, the receptor and ligand must be uploaded separately. In order to identify the interaction pharmacophore, the receptor must be uploaded first.

From protein–protein interaction structure

ZINCPharmer is integrated with PocketQuery (http://pocketquery.csb.pitt.edu), a website that identifies protein–protein interaction (PPI) inhibitor starting points from PPI structure. Using a consensus scoring scheme (20), PocketQuery identifies a small set of interacting residues in a PPI structure whose mimicry by a small molecule is likely to inhibit the interaction. Within the PocketQuery interface, as shown in Figure 2, the selected set of residues can be exported directly to ZINCPharmer. The interaction pharmacophore between these ligand residues and the receptor will then be automatically generated as with a protein–ligand structure.

From 3rd party software

ZINCPharmer includes support for uploading pharmacophore definitions represented in either PH4 format, used by MOE, or PML format, used by LigandScout. Additionally, the specialized mol2 format exported by PharmaGist (5) is recognized as a hybrid pharmacophore definition and ligand structure file. These programs can be used to elucidate a consensus pharmacophore from a set of active compounds. ZINCPharmer can then import the result and quickly identify all matching hits. However, there are several differences between the pharmacophore recognition routines and alignment policies of different software packages (21). In particular, the identification and positioning of hydrophobic features has the most variation between software packages. Consequently, ZINCPharmer searches using an externally defined pharmacophore will result in an overlapping, but not identical, set of hits compared with a search performed using the software that generated the pharmacophore.

REFINING A QUERY

Although ZINCPharmer is capable of automatically extracting a pharmacophore from an interaction, it is expected that the user will further refine the query to enhance its specificity and applicability. This can be done by editing the properties of the query or by applying filters to the results.

Query editor

Every pharmacophore feature is a row in the query editor and has a pharmacophore class (hydrophobic, hydrogen bond donor/acceptor, positive/negative ion or aromatic), a position specified in Cartesian coordinates, a radius representing the tolerance sphere to search around this position and an enabled/disabled setting. The pharmacophore query editor, shown in the bottom left of Figure 1, supports the interactive editing of these features, which are shown as spheres in the molecular viewer as seen in the top left of Figure 1. Features may be selected either in the query editor table or directly in the molecular viewer by clicking on the relevant sphere. Selected features may be batch processed (enabled, disabled, deleted or duplicated) through a contextual menu accessible by right-clicking the selected rows.
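One way to picture a row of the query editor is as a small record that can be toggled, copied and serialized. The sketch below is only an illustration: the `QueryFeature` fields and the JSON layout are assumptions made for this example and do not reproduce the actual ZINCPharmer session format.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass
class QueryFeature:
    kind: str            # pharmacophore class, e.g. "hydrophobic" (hypothetical naming)
    x: float             # Cartesian position in angstroms
    y: float
    z: float
    radius: float        # tolerance sphere radius in angstroms
    enabled: bool = True


def set_enabled(features, selected_rows, enabled):
    """Return a new list with the selected rows enabled or disabled."""
    return [
        replace(f, enabled=enabled) if i in selected_rows else f
        for i, f in enumerate(features)
    ]


def duplicate(features, selected_rows):
    """Return a new list with copies of the selected rows appended."""
    return features + [replace(features[i]) for i in sorted(selected_rows)]


def to_json(features):
    """Serialize the query to a human-readable JSON string (illustrative layout)."""
    return json.dumps([asdict(f) for f in features], indent=2)


def from_json(text):
    """Rebuild the feature list from the JSON produced by to_json."""
    return [QueryFeature(**item) for item in json.loads(text)]
```

Returning new lists rather than mutating in place keeps each batch operation easy to undo, which suits an interactive editor.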

Some features have additional options unique to their pharmacophore class that are accessible through a drop down menu. Hydrogen bond donors/acceptors have an optional directionality, as shown in the drop down menu of Figure 1. The query vector is matched against a precomputed vector on the ligand. Since the actual direction of the hydrogen bond is specific to the geometry of the interface, this match is necessarily approximate, and therefore a large tolerance in angular deviation is implemented by default.

Aromatic features also have an optional directionality constraint that matches against the normal vector of the aromatic ring. Hydrophobic features have an optional constraint for specifying the number of atoms participating in the hydrophobic area. For example, if a small hydrophobe, such as a methyl group, is desired, then the maximum number of atoms can be constrained to one. Alternatively, if a large, space-filling group is desired, such as an aliphatic ring, the minimum number of atoms can be constrained to five or higher.
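The optional constraints described above can be sketched as two small predicates. This is a minimal illustration under stated assumptions: the function names and the tolerance parameter are hypothetical, the actual default angular tolerance is not given in the text, and treating a ring normal as sign-free (the `undirected` flag) is an assumption of this example rather than documented Pharmer behavior.

```python
from math import acos, degrees, sqrt


def angle_between(u, v):
    """Angle in degrees between two 3D vectors."""
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("direction vectors must be non-zero")
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return degrees(acos(max(-1.0, min(1.0, cosine))))


def direction_matches(query_vec, ligand_vec, tolerance_deg, undirected=False):
    """True if the precomputed ligand vector lies within the angular tolerance."""
    angle = angle_between(query_vec, ligand_vec)
    if undirected:
        # Assumption: a ring normal and its negation describe the same plane.
        angle = min(angle, 180.0 - angle)
    return angle <= tolerance_deg


def hydrophobe_size_ok(n_atoms, min_atoms=None, max_atoms=None):
    """True if the hydrophobic group size satisfies the optional bounds."""
    if min_atoms is not None and n_atoms < min_atoms:
        return False
    if max_atoms is not None and n_atoms > max_atoms:
        return False
    return True
```

For example, `hydrophobe_size_ok(1, max_atoms=1)` accepts a methyl-sized hydrophobe, while `hydrophobe_size_ok(3, min_atoms=5)` rejects a group too small to fill an aliphatic-ring-sized pocket.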

Filters

The results can be filtered both in terms of the number of returned results and the properties of the returned results. The number of hits can be reduced by specifying a limit on the number of different orientations returned for each conformation (‘Max Hits per Conf’), the number of different orientations of different conformations returned for each molecule (‘Max Hits per Mol’), or the total number of hits returned (‘Max Total Hits’). In all cases, the search is terminated as soon as the limit is reached with no guarantee that the returned hits have the best possible root mean squared deviation (RMSD) to the query.

Each orientation of a conformer results from a different mapping and alignment of pharmacophore features on the ligand to the query features. If the query has many degrees of symmetry or tightly spaced features, reducing the number of orientations returned may substantially reduce the number of hits that need to be analyzed without omitting significant positional differences. Reducing the number of hits per molecule is particularly useful when only the 2D properties of the results will be analyzed and only a single representative of each molecule is needed. Reducing the total number of hits is beneficial when the post-screening analysis is computationally intensive and only a sampling of the results is needed.
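The effect of the three limits can be sketched as a filter over hits in the order they are found. The function and its tuple layout are hypothetical and say nothing about how Pharmer implements the limits internally; the point is only that a first-come cutoff does not select the lowest-RMSD hits.

```python
from collections import Counter


def limit_hits(hits, max_per_conf=None, max_per_mol=None, max_total=None):
    """Keep hits in discovery order, subject to the three optional limits.

    hits: iterable of (mol_id, conf_id, rmsd) tuples, one per orientation.
    """
    per_conf = Counter()
    per_mol = Counter()
    kept = []
    for mol_id, conf_id, rmsd in hits:
        if max_total is not None and len(kept) >= max_total:
            break  # stop as soon as the total limit is reached
        if max_per_mol is not None and per_mol[mol_id] >= max_per_mol:
            continue
        if max_per_conf is not None and per_conf[(mol_id, conf_id)] >= max_per_conf:
            continue
        per_conf[(mol_id, conf_id)] += 1
        per_mol[mol_id] += 1
        kept.append((mol_id, conf_id, rmsd))
    return kept
```

With hits `[("A", 0, 0.9), ("A", 0, 0.2), ("B", 0, 0.4)]` and `max_per_conf=1`, the orientation of molecule A with RMSD 0.9 is kept and the better 0.2 orientation is dropped, because it was found second.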

The results list can also be filtered by maximum RMSD. The orientation of the hits is computed using a weighted RMSD calculation (15), but the reported value is the unweighted RMSD between the calculated orientation and the query. Filtering by RMSD restricts the hits to those that have the best overall geometric match to the query. Additionally, hits can be filtered by the molecular properties of molecular weight (in Daltons) and number of rotatable bonds, both of which have been implicated as useful properties for identifying ‘drug-like’ molecules (22).
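To make the distinction between weighted and unweighted RMSD concrete, the sketch below computes both for feature positions that are assumed to be already aligned, then applies the property filters. The weights, the dictionary keys (`rmsd`, `mass`, `rbnds`) and the range-style filter arguments are assumptions for illustration; the weighting scheme actually used by Pharmer is described in (15) and is not reproduced here.

```python
from math import dist, sqrt


def rmsd(query_points, hit_points, weights=None):
    """RMSD between matched feature positions; unweighted when weights is None."""
    if len(query_points) != len(hit_points) or not query_points:
        raise ValueError("need the same non-zero number of points in both sets")
    if weights is None:
        weights = [1.0] * len(query_points)
    total = sum(w * dist(q, h) ** 2 for w, q, h in zip(weights, query_points, hit_points))
    return sqrt(total / sum(weights))


def passes_filters(hit, max_rmsd=None, mass_range=(None, None), rbnds_range=(None, None)):
    """Apply the optional RMSD, molecular weight and rotatable bond filters."""
    def in_range(value, bounds):
        low, high = bounds
        return (low is None or value >= low) and (high is None or value <= high)

    if max_rmsd is not None and hit["rmsd"] > max_rmsd:
        return False
    return in_range(hit["mass"], mass_range) and in_range(hit["rbnds"], rbnds_range)
```

Because the two values can differ for the same pose, sorting or filtering on the reported (unweighted) RMSD will not necessarily follow the ordering implied by the alignment objective.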

PHARMACOPHORE SEARCH

Having defined a pharmacophore, searching for matching purchasable compounds is as simple as clicking the ‘Submit Query’ button. Searches take anywhere from a few seconds to a few minutes. Queries with more features, queries with many hydrophobic features (which are the most common features), queries with large search tolerances and symmetric queries (which require the processing of many orientations per matching conformer) will have longer search times. Results are returned and displayed in the results browser as they are found. An orientation of a conformer is only returned as a hit if all the matching features are within the specified search tolerances of the query when the conformer is aligned to minimize the weighted RMSD.
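The acceptance rule in the last sentence can be written as a short check, assuming the alignment has already been computed and the ligand features are listed in the same order as the query features they map to. The `is_hit` name and the `(position, radius)` layout are hypothetical.

```python
from math import dist


def is_hit(query_features, aligned_positions):
    """True if every matched ligand feature lies inside its query tolerance sphere.

    query_features: list of ((x, y, z), radius) pairs for the enabled query features.
    aligned_positions: matched ligand feature positions after alignment, same order.
    """
    if len(query_features) != len(aligned_positions):
        raise ValueError("each query feature needs exactly one matched position")
    return all(
        dist(center, position) <= radius
        for (center, radius), position in zip(query_features, aligned_positions)
    )
```

A single feature falling outside its sphere is enough to reject the orientation, even when the overall RMSD is small.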

RESULTS VISUALIZATION

The results of a search are displayed in the results browser shown in Figure 1. Each hit represents a unique orientation of a conformation to the query. For each hit, the ZINC identifier, RMSD to the query, molecular weight (‘Mass’), and number of rotatable bonds (‘RBnds’) are shown. The ZINC identifier is a hyperlink that points to the corresponding compound web page in the ZINC database where purchasing information may be found. The results may be sorted by any of the numerical properties by clicking on the property heading in the results table. The complete set of oriented hits may be saved to an sdf file through the ‘Save Results’ button. The hits in this file are unordered and include the RMSD to the query as extra data attached to each molecule. This file is immediately useful as input to a secondary screening protocol such as ranking by energy minimization.
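Since the saved hits are unordered, a first off-line step is often to sort them by the attached RMSD. The sketch below reads the data items of a standard sdf file with the Python standard library only. The tag name `rmsd` and the record titles in `SAMPLE` are placeholders, so the tag actually written by ZINCPharmer should be checked in a downloaded file before relying on it.

```python
def _parse_record(lines):
    """Extract the title line and the '> <TAG>' data items of one sdf record."""
    data = {}
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            start = line.index("<") + 1
            tag = line[start:line.index(">", start)]
            values = []
            i += 1
            # A data item runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i].strip())
                i += 1
            data[tag] = "\n".join(values)
        i += 1
    return {"title": lines[0].strip(), "data": data}


def read_sdf_records(text):
    """Split sdf text on '$$$$' delimiters and parse each record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    return records


# Two empty placeholder molecules; real files contain atom and bond blocks.
SAMPLE = """hit_b
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <rmsd>
0.81

$$$$
hit_a
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <rmsd>
0.42

$$$$
"""

records = read_sdf_records(SAMPLE)
by_rmsd = sorted(records, key=lambda r: float(r["data"]["rmsd"]))
```

For anything beyond reading data items, a cheminformatics toolkit such as OpenBabel (19) is a better fit than hand-written parsing.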

Individual hits are visualized with the query and a receptor (if present) by clicking on the corresponding row in the results browser. The viewer tab contains a wide assortment of colors and styles (wireframe, stick, spheres, etc.) for visualizing the results, the query ligand, the receptor residues and the receptor surface.

DISCUSSION

The goal of ZINCPharmer is to remove barriers to computational drug discovery. There is no need for users to purchase, install or build software: all that is needed is a modern web browser. Additionally, ZINCPharmer takes care of the generation and storage of a large multi-conformer database of the biologically relevant and commercially available compounds of the ZINC database. Perhaps more importantly, the search performance of ZINCPharmer (most searches take a few minutes, if not seconds) enables the iterative refinement of a pharmacophore hypothesis in the context of the entire chemical library. For example, users can quickly enable or disable features, adjust search tolerances and apply filters based on the results of previous searches to achieve a set of result compounds that has the desired size, specificity and chemical diversity. This sort of iterative refinement simply is not practical when searches take several hours. ZINCPharmer enables a hands-on, experimental approach to developing a high-quality pharmacophore hypothesis that fully leverages the expertise and insight of the user. The matching compounds of the user-specified pharmacophore can then be purchased and experimentally validated as part of a broader drug discovery effort. ZINCPharmer is a fully open access resource and is available at http://zincpharmer.csb.pitt.edu.

FUNDING

National Institutes of Health [GM097082] and National Institutes of Health [GM71896 to Brian Shoichet and John Irwin, supporting the ZINC database]. Funding for open access charge: National Institutes of Health [R01GM097082].

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

The ZINC database is incorporated into ZINCPharmer by permission. We thank NIH for supporting this resource.

REFERENCES

  • 1. Leach AR, Gillet VJ, Lewis RA, Taylor R. Three-dimensional pharmacophore methods in drug discovery. J. Med. Chem. 2009;53:539–558. doi: 10.1021/jm900817u.
  • 2. Mason JS, Good AC, Martin EJ. 3-D pharmacophores in drug discovery. Curr. Pharm. Des. 2001;7:567–597. doi: 10.2174/1381612013397843.
  • 3. Langer T, Krovat EM. Chemical feature-based pharmacophores and virtual library screening for discovery of new leads. Curr. Opin. Drug Discov. Dev. 2003;6:370–376.
  • 4. Wolber G, Seidel T, Bendix F, Langer T. Molecule-pharmacophore superpositioning and pattern matching in computational drug design. Drug Discov. Today. 2008;13:23–29. doi: 10.1016/j.drudis.2007.09.007.
  • 5. Schneidman-Duhovny D, Dror O, Inbar Y, Nussinov R, Wolfson HJ. PharmaGist: a webserver for ligand-based pharmacophore detection. Nucleic Acids Res. 2008;36:W223–W228. doi: 10.1093/nar/gkn187.
  • 6. Wolber G, Langer T. LigandScout: 3-D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J. Chem. Inf. Model. 2004;45:160–169. doi: 10.1021/ci049885e.
  • 7. Mason JS, Cheney DL. Library design and virtual screening using multiple 4 point pharmacophore fingerprints. Pacific Symposium on Biocomputing. 2000;5:576–587. doi: 10.1142/9789814447331_0055.
  • 8. Baroni M, Cruciani G, Sciabola S, Perruccio F, Mason JS. A common reference framework for analyzing/comparing proteins and ligands. Fingerprints for Ligands and Proteins (FLAP): theory and application. J. Chem. Inf. Model. 2007;47:279–294. doi: 10.1021/ci600253e.
  • 9. Floris M, Masciocchi J, Fanton M, Moro S. Swimming into peptidomimetic chemical space using pepMMsMIMIC. Nucleic Acids Res. 2011. doi: 10.1093/nar/gkr287.
  • 10. Raymond JW, Willett P. Maximum common subgraph isomorphism algorithms for the matching of chemical structures. J. Comput. Aided Mol. Des. 2002;16:521–533. doi: 10.1023/a:1021271615909.
  • 11. Sheridan RP, Kearsley SK. Why do we need so many chemical similarity search methods? Drug Discov. Today. 2002;7:903–911. doi: 10.1016/s1359-6446(02)02411-x.
  • 12. Hsin KY, Morgan HP, Shave SR, Hinton AC, Taylor P, Walkinshaw MD. EDULISS: a small-molecule database with data-mining and pharmacophore searching capabilities. Nucleic Acids Res. 2011;39:D1042–D1048. doi: 10.1093/nar/gkq878.
  • 13. Fang X, Wang S. A web-based 3D-database pharmacophore searching tool for drug discovery. J. Chem. Inf. Comput. Sci. 2002;42:192–198. doi: 10.1021/ci010083i.
  • 14. Liu X, Ouyang S, Yu B, Liu Y, Huang K, Gong J, Zheng S, Li Z, Li H, Jiang H. PharmMapper server: a web server for potential drug target identification using pharmacophore mapping approach. Nucleic Acids Res. 2010;38:W609–W614. doi: 10.1093/nar/gkq300.
  • 15. Koes DR, Camacho CJ. Pharmer: efficient and exact pharmacophore search. J. Chem. Inf. Model. 2011;51:1307–1314. doi: 10.1021/ci200097m.
  • 16. Maass P, Schulz-Gasch T, Stahl M, Rarey M. Recore: a fast and versatile method for scaffold hopping based on small molecule crystal structure conformations. J. Chem. Inf. Model. 2007;47:390–399. doi: 10.1021/ci060094h.
  • 17. Irwin JJ, Shoichet BK. ZINC- a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005;45:177–182. doi: 10.1021/ci049714.
  • 18. Kirchmair J, Wolber G, Laggner C, Langer T. Comparative performance assessment of the conformational model generators omega and catalyst: a large-scale survey on the retrieval of protein-bound ligand conformations. J. Chem. Inf. Model. 2006;46:1848–1861. doi: 10.1021/ci060084g.
  • 19. O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: an open chemical toolbox. J. Cheminforma. 2011;3:33. doi: 10.1186/1758-2946-3-33.
  • 20. Koes DR, Camacho CJ. Small-molecule inhibitor starting points learned from protein–protein interaction inhibitor structure. Bioinformatics. 2012;28:784–791. doi: 10.1093/bioinformatics/btr717.
  • 21. Spitzer GM, Heiss M, Mangold M, Markt P, Kirchmair J, Wolber G, Liedl KR. One concept, three implementations of 3D pharmacophore-based virtual screening: distinct coverage of chemical search space. J. Chem. Inf. Model. 2010;50:1241–1247. doi: 10.1021/ci100136b.
  • 22. Veber D, Johnson S, Cheng H, Smith B, Ward K, Kopple K. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 2002;45:2615–2623. doi: 10.1021/jm020017n.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
