To the Editor
Identifying proteins that are good drug targets and finding drug leads that bind to them is generally a challenging problem. It is particularly difficult for neglected tropical diseases, such as malaria and tuberculosis, where research resources are relatively scarce [1]. Fortunately, several developments improve our ability to pursue drug discovery for neglected diseases: (i) the sequencing of many complete genomes of organisms that cause tropical diseases; (ii) the determination of a large number of protein structures; (iii) the creation of compound libraries, including already approved drugs; and (iv) the availability of improved bioinformatics analysis, including methods for comparative protein structure modeling, binding site identification, virtual ligand screening, and drug design. Therefore, we are now in a position to increase the odds of identifying high-quality drug targets and drug leads for neglected tropical diseases. Here, we encourage scientists to collaborate on drug discovery for tropical diseases by providing a “kernel” for the Tropical Disease Initiative (TDI, http://www.tropicaldisease.org) [2]. Based on the impact of the Linux kernel on open source software development, we suggest that the TDI kernel may help overcome a major stumbling block for open source drug discovery: the absence of a critical mass of preexisting work that volunteers can build on incrementally. This kernel complements a number of other initiatives on neglected tropical diseases [3–5], including collaborative Web portals (e.g., The Synaptic Leap; http://www.thesynapticleap.org), public-private partnerships (e.g., Medicines for Malaria Venture; http://www.mmv.org), and private foundations (e.g., Bill and Melinda Gates Foundation; http://www.gatesfoundation.org); for an updated list of initiatives, see http://www.tropicaldisease.org.
The TDI kernel was derived with our software pipeline [6, 7] for predicting structures of protein sequences by comparative modeling, localizing small molecule binding sites on the surfaces of the models, and predicting ligands that bind to them. Specifically, the pipeline linked 297 proteins from ten pathogen genomes with already approved drugs that were developed for treating other diseases (Table 1). Such links, if proven experimentally, may significantly increase the efficiency of target identification, target validation, lead discovery, lead optimization, and clinical trials. Two of the kernel targets were tested for their binding to a known drug by NMR spectroscopy, validating one of our predictions (Figure 1). It is difficult to assess the accuracy of our computational predictions based on this limited experimental testing. Thus, we encourage other investigators to donate their expertise and facilities to test additional predictions. We hope the testing will occur within the open source context.
Table 1.
Organism (a) | Transcripts (b) | Modeled targets (c) | Similar (d) | Exact (e) |
---|---|---|---|---|
Cryptosporidium hominis | 3,886 | 666 | 20 | 13 |
Cryptosporidium parvum | 3,806 | 742 | 24 | 13 |
Leishmania major | 8,274 | 1,409 | 43 | 20 |
Mycobacterium leprae | 1,605 | 893 | 25 | 6 |
Mycobacterium tuberculosis | 3,991 | 1,608 | 30 | 10 |
Plasmodium falciparum | 5,363 | 818 | 28 | 13 |
Plasmodium vivax | 5,342 | 822 | 24 | 13 |
Toxoplasma gondii | 7,793 | 300 | 13 | 6 |
Trypanosoma cruzi | 19,607 | 3,070 | 51 | 28 |
Trypanosoma brucei | 9,210 | 1,386 | 39 | 21 |
Total | 68,877 | 11,714 | 297 | 143 |
(a) Organisms in bold are included in the WHO Tropical Disease portfolio.
(b) Number of transcripts in each genome.
(c) Number of targets with at least one domain accurately modeled (i.e., MODPIPE quality score of at least 1.0).
(d) Number of modeled targets with at least one predicted binding site for a molecule with a Tanimoto score [9] of at least 0.9 to a drug in DrugBank [10].
(e) Number of modeled targets with at least one predicted binding site for a molecule in DrugBank.
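As a rough illustration of the similarity cutoff used for the “Similar” column, the sketch below computes a Tanimoto score between two molecules and keeps the drugs that reach 0.9. It assumes each molecule is encoded as a set of “on” fingerprint bits; the letter does not state which encoding the pipeline uses, and the function names and toy fingerprints are hypothetical.

```python
from typing import Dict, List, Set

# Cutoff quoted in the Table 1 footnote for the "Similar" column.
SIMILARITY_CUTOFF = 0.9


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits present in either set."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_drugs(ligand: Set[int],
                  drugs: Dict[str, Set[int]],
                  cutoff: float = SIMILARITY_CUTOFF) -> List[str]:
    """Names of drugs whose fingerprint scores at least `cutoff` against the ligand."""
    return [name for name, bits in drugs.items() if tanimoto(ligand, bits) >= cutoff]


# Toy fingerprints (hypothetical; not taken from DrugBank).
ligand = set(range(1, 11))                 # bits 1..10
drugs = {
    "drug_a": set(range(1, 11)),           # identical to the ligand
    "drug_b": set(range(1, 10)) | {11},    # 9 shared bits out of 11 in the union
    "drug_c": set(range(1, 12)),           # 10 shared bits out of 11 in the union
}

if __name__ == "__main__":
    print(similar_drugs(ligand, drugs))
```

Under this encoding an identical molecule scores 1.0, which is one way to read the “Exact” column as the special case of a ligand that is itself in DrugBank.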
The TDI kernel is freely downloadable in accordance with the Science Commons protocol for implementing open access data (http://sciencecommons.org/projects/publishing/open-access-data-protocol/), which prescribes standard academic attribution and facilitates tracking of work but imposes no other restrictions. We do not seek intellectual property rights in the actual discoveries based on the TDI kernel, in the hope of reinvigorating drug discovery for neglected tropical diseases [8]. By minimizing restrictions on the data, including viral terms that would be inherited by all derivative works, we hope to attract as “many eyeballs” as we possibly can to use and improve the kernel. While many of the drugs in the kernel are proprietary under diverse types of rights, we believe that the existence of public domain pairs of targets and compounds will reduce the royalties that patent owners can charge and sponsors must pay. This should decrease the large sums of money governments and foundations need to invest to turn validated targets and candidate drugs into actual treatments.
Our list of likely drug leads and their targets must be validated and extended using additional lines of evidence by computation and, most importantly, wet lab experiments. We are committed to helping other researchers add their protocols and analyses to the current kernel. For example, computational docking, biophysical analysis, activity assays, site-directed mutagenesis, and synthetic chemistry could be performed for all predicted targets. Unfortunately, such techniques are usually very expensive and thus not feasible on a genomic scale by a single research group. The main goal of our exercise was to narrow down the number of targets and identify their putative ligands for experimental follow-up, so that the overall process is faster, more thorough, and less expensive. The TDI kernel's list of “hits” does not exhaust the ten target genomes. Researchers who want TDI to investigate additional candidates should contact us or engage in online discussions at our collaborative portal (The Synaptic Leap, http://www.thesynapticleap.org).
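To put the narrowing in numbers, the counts in Table 1 can be summed and turned into fractions. The sketch below is only arithmetic on the published table; the variable and function names are hypothetical and nothing here reproduces the pipeline itself.

```python
# Rows of Table 1: (transcripts, modeled targets, similar, exact).
TABLE_1 = {
    "Cryptosporidium hominis": (3886, 666, 20, 13),
    "Cryptosporidium parvum": (3806, 742, 24, 13),
    "Leishmania major": (8274, 1409, 43, 20),
    "Mycobacterium leprae": (1605, 893, 25, 6),
    "Mycobacterium tuberculosis": (3991, 1608, 30, 10),
    "Plasmodium falciparum": (5363, 818, 28, 13),
    "Plasmodium vivax": (5342, 822, 24, 13),
    "Toxoplasma gondii": (7793, 300, 13, 6),
    "Trypanosoma cruzi": (19607, 3070, 51, 28),
    "Trypanosoma brucei": (9210, 1386, 39, 21),
}


def column_totals(table):
    """Sum each column over all organisms."""
    return tuple(sum(column) for column in zip(*table.values()))


def funnel(table):
    """Fraction of items surviving each stage, computed from the column totals."""
    transcripts, modeled, similar, exact = column_totals(table)
    return {
        "modeled / transcripts": modeled / transcripts,
        "similar / modeled": similar / modeled,
        "exact / modeled": exact / modeled,
    }


if __name__ == "__main__":
    print(column_totals(TABLE_1))  # (68877, 11714, 297, 143), matching the Total row
    for label, fraction in funnel(TABLE_1).items():
        print(f"{label}: {fraction:.2%}")
```

By these totals, about 17% of the transcripts yield a modeled target, and about 2.5% of the modeled targets are linked to a drug at the similarity cutoff.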
Acknowledgments
We acknowledge the support from the Spanish Ministerio de Educación y Ciencia (BIO2007/66670 and SAF2008-01845), the NIH (R01 GM54762, U54 GM074945, P01 AI035707, and P01 GM71790), and the Sandler Family Supporting Foundation.
References
- 1. Nwaka S, Ridley RG. Virtual drug discovery and development for neglected diseases through public-private partnerships. Nat Rev Drug Discov. 2003;2:919–928. doi: 10.1038/nrd1230.
- 2. Maurer SM, Rai A, Sali A. Finding cures for tropical diseases: is open source an answer? PLoS Med. 2004;1:e56. doi: 10.1371/journal.pmed.0010056.
- 3. Kepler T, et al. Open Source Research - The power of Us. Aust J Chem. 2006;59:291–294.
- 4. Singh S. India takes an open source approach to drug discovery. Cell. 2008;133:201–203. doi: 10.1016/j.cell.2008.04.003.
- 5. Aguero F, et al. Genomic-scale prioritization of drug targets: the TDR Targets database. Nat Rev Drug Discov. 2008;7:900–907. doi: 10.1038/nrd2684.
- 6. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
- 7. Marti-Renom MA, et al. The AnnoLite and AnnoLyze programs for comparative annotation of protein structures. BMC Bioinformatics. 2007;8 (Suppl 4):S4. doi: 10.1186/1471-2105-8-S4-S4.
- 8. Munos B. Can open-source R&D reinvigorate drug research? Nat Rev Drug Discov. 2006;5:723–729. doi: 10.1038/nrd2131.
- 9. Gower JC. A general coefficient of similarity and some of its properties. Biometrics. 1971;27:857–871.
- 10. Wishart DS, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36:D901–906. doi: 10.1093/nar/gkm958.
- 11. Dalvit C, et al. Identification of compounds with binding affinity to proteins via magnetization transfer from bulk water. J Biomol NMR. 2000;18:65–68. doi: 10.1023/a:1008354229396.
- 12. Meyer B, Peters T. NMR spectroscopy techniques for screening and identifying ligand binding to protein receptors. Angew Chem Int Ed Engl. 2003;42:864–890. doi: 10.1002/anie.200390233.