Author manuscript; available in PMC: 2011 May 5.
Published in final edited form as: Nat Biotechnol. 2009 Apr;27(4):320–321. doi: 10.1038/nbt0409-320

A kernel for the Tropical Disease Initiative

Leticia Ortí 1,2, Rodrigo J Carbajo 2, Ursula Pieper 3, Narayanan Eswar 3, Stephen M Maurer 4, Arti K Rai 5, Ginger Taylor 6, Matthew H Todd 7, Antonio Pineda-Lucena 2, Andrej Sali 3,*, Marc A Marti-Renom 1,*
PMCID: PMC3088649  NIHMSID: NIHMS287604  PMID: 19352362

To the Editor

Identifying proteins that are good drug targets and finding drug leads that bind to them is a challenging problem. It is particularly difficult for neglected tropical diseases, such as malaria and tuberculosis, where research resources are relatively scarce1. Fortunately, several developments improve our ability to pursue drug discovery for neglected diseases: (i) the sequencing of many complete genomes of organisms that cause tropical diseases; (ii) the determination of a large number of protein structures; (iii) the creation of compound libraries that include already approved drugs; and (iv) the availability of improved bioinformatics analysis, including methods for comparative protein structure modeling, binding site identification, virtual ligand screening, and drug design. We are therefore now in a position to increase the odds of identifying high-quality drug targets and drug leads for neglected tropical diseases. Here, we encourage scientists to collaborate on drug discovery for tropical diseases by providing a “kernel” for the Tropical Disease Initiative (TDI, http://www.tropicaldisease.org)2. By analogy with the impact of the Linux kernel on open source code development, we suggest that the TDI kernel may help overcome a major stumbling block for open source drug discovery: the absence of a critical mass of preexisting work that volunteers can build on incrementally. This kernel complements a number of other initiatives on neglected tropical diseases3–5, including collaborative Web portals (e.g., The Synaptic Leap; http://www.thesynapticleap.org), public-private partnerships (e.g., Medicines for Malaria Venture; http://www.mmv.org), and private foundations (e.g., Bill and Melinda Gates Foundation; http://www.gatesfoundation.org); for an updated list of initiatives, see http://www.tropicaldisease.org.

The TDI kernel was derived with our software pipeline6, 7 for predicting structures of protein sequences by comparative modeling, localizing small molecule binding sites on the surfaces of the models, and predicting ligands that bind to them. Specifically, the pipeline linked 297 proteins from ten pathogen genomes with already approved drugs that were developed for treating other diseases (Table 1). Such links, if proven experimentally, may significantly increase the efficiency of target identification, target validation, lead discovery, lead optimization, and clinical trials. Two of the kernel targets were tested for their binding to a known drug by NMR spectroscopy, validating one of our predictions (Figure 1). It is difficult to assess the accuracy of our computational predictions based on this limited experimental testing. Thus, we encourage other investigators to donate their expertise and facilities to test additional predictions. We hope the testing will occur within the open source context.
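The yield of each pipeline stage can be tallied from the counts reported in Table 1. The following is a minimal sketch of that arithmetic; the dictionary layout and the function names `column_totals` and `stage_yields` are illustrative assumptions and are not part of the TDI software.

```python
# Counts transcribed from Table 1, per genome:
# (transcripts, modeled targets, "similar" targets, "exact" targets).
# The layout and helper names below are illustrative assumptions.
TABLE_1 = {
    "Cryptosporidium hominis": (3886, 666, 20, 13),
    "Cryptosporidium parvum": (3806, 742, 24, 13),
    "Leishmania major": (8274, 1409, 43, 20),
    "Mycobacterium leprae": (1605, 893, 25, 6),
    "Mycobacterium tuberculosis": (3991, 1608, 30, 10),
    "Plasmodium falciparum": (5363, 818, 28, 13),
    "Plasmodium vivax": (5342, 822, 24, 13),
    "Toxoplasma gondii": (7793, 300, 13, 6),
    "Trypanosoma cruzi": (19607, 3070, 51, 28),
    "Trypanosoma brucei": (9210, 1386, 39, 21),
}


def column_totals(table):
    """Sum each of the four count columns across all genomes."""
    return tuple(sum(column) for column in zip(*table.values()))


def stage_yields(table):
    """Return (modeled / transcripts, similar / modeled) over all genomes."""
    transcripts, modeled, similar, _exact = column_totals(table)
    return modeled / transcripts, similar / modeled


if __name__ == "__main__":
    modeled_fraction, linked_fraction = stage_yields(TABLE_1)
    print(f"transcripts with a modeled domain: {modeled_fraction:.1%}")
    print(f"modeled targets linked to a drug:  {linked_fraction:.1%}")
```

With the Table 1 totals, roughly 17% of transcripts have at least one accurately modeled domain, and roughly 2.5% of those modeled targets are linked to an approved drug or a close analog.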

Table 1. TDI kernel genomes.

Organism (a)                 Transcripts (b)   Modeled targets (c)   Similar (d)   Exact (e)
Cryptosporidium hominis                3,886                   666            20          13
Cryptosporidium parvum                 3,806                   742            24          13
Leishmania major                       8,274                 1,409            43          20
Mycobacterium leprae                   1,605                   893            25           6
Mycobacterium tuberculosis             3,991                 1,608            30          10
Plasmodium falciparum                  5,363                   818            28          13
Plasmodium vivax                       5,342                   822            24          13
Toxoplasma gondii                      7,793                   300            13           6
Trypanosoma cruzi                     19,607                 3,070            51          28
Trypanosoma brucei                     9,210                 1,386            39          21
Total                                 68,877                11,714           297         143
(a) Organisms in bold are included in the WHO Tropical Disease portfolio.

(b) Number of transcripts in each genome.

(c) Number of targets with at least one domain accurately modeled (i.e., MODPIPE quality score of at least 1.0).

(d) Number of modeled targets with at least one predicted binding site for a molecule with a Tanimoto score9 of at least 0.9 to a drug in DrugBank10.

(e) Number of modeled targets with at least one predicted binding site for a molecule in DrugBank.
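The Tanimoto score used as the similarity threshold in footnote d can be illustrated on binary fingerprints, where it equals the number of features shared by two molecules divided by the number of features present in either. The sketch below rests on assumptions: the fingerprints are made-up sets of feature indices, the function names `tanimoto` and `similar_drugs` are hypothetical, and the fingerprint type actually used by the pipeline is not stated in this letter.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints given as sets of 'on' features."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / len(union)


def similar_drugs(ligand_fp, drug_fps, threshold=0.9):
    """Names of drugs whose fingerprint scores at least `threshold` against the ligand."""
    return [
        name
        for name, drug_fp in drug_fps.items()
        if tanimoto(ligand_fp, drug_fp) >= threshold
    ]


if __name__ == "__main__":
    # Toy fingerprints (hypothetical feature indices, not real chemistry).
    ligand = set(range(20))
    drugs = {
        "drug_a": set(range(19)),  # shares 19 of 20 features with the ligand
        "drug_b": set(range(10)),  # shares 10 of 20 features with the ligand
    }
    print(similar_drugs(ligand, drugs))
```

Here only the hypothetical `drug_a` clears the 0.9 cutoff, since 19 shared features out of 20 total gives a score of 0.95, whereas `drug_b` scores 0.5.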

Figure 1.


TDI kernel snapshot of the web page for the Plasmodium falciparum thymidylate kinase target (http://tropicaldisease.org/kernel/q8i4s1/). Our computational pipeline predicted that thymidylate kinase from P. falciparum binds ATM (3'-azido-3'-deoxythymidine-5'-monophosphate), a supra-structure of the Zidovudine drug approved for the treatment of HIV infection. The binding of this ligand to a site on the kinase was experimentally validated by 1D Water-LOGSY11 and Saturation Transfer Difference12 NMR experiments.

The TDI kernel is freely downloadable in accordance with the Science Commons protocol for implementing open access data (http://sciencecommons.org/projects/publishing/open-access-data-protocol/), which prescribes standard academic attribution and facilitates tracking of work but imposes no other restrictions. We do not seek intellectual property rights in the actual discoveries based on the TDI kernel, in the hope of reinvigorating drug discovery for neglected tropical diseases8. By minimizing restrictions on the data, including viral terms that would be inherited by all derivative works, we hope to attract as “many eyeballs” as we possibly can to use and improve the kernel. While many of the drugs in the kernel are proprietary under diverse types of rights, we believe that the existence of public domain pairs of targets and compounds will reduce the royalties that patent owners can charge and sponsors must pay. This should decrease the large sums of money that governments and foundations need to invest to turn validated targets and candidate drugs into actual treatments.

Our list of likely drug leads and their targets must be validated and extended using additional lines of evidence from computation and, most importantly, wet lab experiments. We are committed to helping other researchers add their protocols and analyses to the current kernel. For example, computational docking, biophysical analysis, activity assays, site-directed mutagenesis, and synthetic chemistry could be performed for all predicted targets. Unfortunately, such techniques are usually very expensive and thus not feasible on a genomic scale for a single research group. The main goal of our exercise was to narrow down the number of targets and identify their putative ligands for experimental follow-up, so that the overall process is faster, more thorough, and less expensive. The TDI kernel's list of “hits” does not exhaust the ten target genomes. Researchers who want TDI to investigate additional candidates should contact us or engage in online discussions at our collaborative portal (The Synaptic Leap, http://www.thesynapticleap.org).

Acknowledgments

We acknowledge support from the Spanish Ministerio de Educación y Ciencia (BIO2007/66670 and SAF2008-01845), the NIH (R01 GM54762, U54 GM074945, P01 AI035707, and P01 GM71790), and the Sandler Family Supporting Foundation.

References

  • 1. Nwaka S, Ridley RG. Virtual drug discovery and development for neglected diseases through public-private partnerships. Nat Rev Drug Discov. 2003;2:919–928. doi: 10.1038/nrd1230.
  • 2. Maurer SM, Rai A, Sali A. Finding cures for tropical diseases: is open source an answer? PLoS Med. 2004;1:e56. doi: 10.1371/journal.pmed.0010056.
  • 3. Kepler T, et al. Open Source Research - The power of Us. Aust J Chem. 2006;59:291–294.
  • 4. Singh S. India takes an open source approach to drug discovery. Cell. 2008;133:201–203. doi: 10.1016/j.cell.2008.04.003.
  • 5. Aguero F, et al. Genomic-scale prioritization of drug targets: the TDR Targets database. Nat Rev Drug Discov. 2008;7:900–907. doi: 10.1038/nrd2684.
  • 6. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
  • 7. Marti-Renom MA, et al. The AnnoLite and AnnoLyze programs for comparative annotation of protein structures. BMC Bioinformatics. 2007;8(Suppl 4):S4. doi: 10.1186/1471-2105-8-S4-S4.
  • 8. Munos B. Can open-source R&D reinvigorate drug research? Nat Rev Drug Discov. 2006;5:723–729. doi: 10.1038/nrd2131.
  • 9. Gower JC. A general coefficient of similarity and some of its properties. Biometrics. 1971;27:857–871.
  • 10. Wishart DS, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36:D901–906. doi: 10.1093/nar/gkm958.
  • 11. Dalvit C, et al. Identification of compounds with binding affinity to proteins via magnetization transfer from bulk water. J Biomol NMR. 2000;18:65–68. doi: 10.1023/a:1008354229396.
  • 12. Meyer B, Peters T. NMR spectroscopy techniques for screening and identifying ligand binding to protein receptors. Angew Chem Int Ed Engl. 2003;42:864–890. doi: 10.1002/anie.200390233.
