Abstract
Previous whole genome comparisons of Plasmodium falciparum populations have not included collections from the Indian subcontinent, even though two million Indians contract malaria and about 50,000 die from the disease every year. Stratification of global parasites has revealed spatial relatedness of parasite genotypes on different continents. Here, genomic analysis was further improved to obtain country-level resolution by removing var genes and intergenic regions from distance calculations. P. falciparum genomes from India were found to be most closely related to each other. Their nearest neighbors were from Bangladesh and Myanmar, followed by Thailand. Samples from the rest of Southeast Asia, Africa and South America were increasingly more distant, demonstrating a high-resolution genomic-geographic continuum. Such genome stratification approaches will help monitor variations of malaria parasites within South Asia and future changes in parasite populations that may arise from in-country and cross-border migrations.
Keywords: Malaria, Populations, Epidemiology, Molecular, Neighbors
The first genome-wide sequencing of pathogenic organisms, including the human malaria parasite P. falciparum, proved to be an unprecedented asset for malaria scientists [1]. Investigators who previously studied individual phenotypes or individual genes and their products could now study parasites on a genome scale [2–7]. Today, as the world considers global policies to control and eliminate malaria parasites, representative populations of parasites from distinct regions of the world are essential [8–10]. Collective genomes have the potential to capture variations in the interactions of malaria parasites with diverse human hosts and mosquito vectors, as well as variations in susceptibility and resistance to therapeutics.
P. falciparum jumped from gorillas to humans in Central Africa in the distant past, but some of the most interesting differences among human P. falciparum parasites probably reflect evolutionary selection pressures resulting from more recent host history [11]. Human settlement into specific geographic regions and ecosystems, with unique climates, available plants and animals for food, and insect vectors, not only affected the evolution of protective traits in humans, but also triggered new specialized traits within pathogens. The highly intertwined genetic relationship among present-day humans, mosquitoes, and malaria parasites is captured in the varying architecture of parasite genomes from different parts of the world. Raw mapping of sequence reads between parasite populations has previously revealed distinct segregation of P. falciparum parasites from Africa, Southeast Asia and South America [12, 13]. While some continental separation could have been anticipated based on select markers from parasite genomes [14–16], the extent to which these principles held across the whole genome is remarkable [12, 13]. In the present study we demonstrate that it is possible to achieve country-level resolution of parasite relationships from whole genome sequences with judicious use of bioinformatics tools.
The global collection of more than 500 different assembled whole genome sequences of P. falciparum with known geographic history has gaps, particularly related to South Asia. This is a significant deficiency. There are more than 500 million people at risk for malaria in India and the country reports up to two million cases per year [17, 18]. Current estimates of deaths due to malaria in India reach approximately 50,000 per year [17, 18]. It is of great biological interest and public health importance to know if present-day P. falciparum parasites from India are closely related to parasites from Southeast Asia, especially since India shares borders and has growing ties with many countries in Southeast Asia. Thailand, Vietnam, Laos and Cambodia have been under continual antimalarial drug pressure for decades and most drug resistance was first detected in these countries, including the current threat of artemisinin-resistance [19]. To understand future intermingling of parasite genomes, current similarities and differences in the genetic makeup of Indian and Southeast Asian parasites are of particular significance. It is also conceivable that Indian parasites could be closely related to parasites from Africa, given the history of travels over the last eight centuries as well as more recent trade and population exchanges [20].
The Malaria Evolution in South Asia (MESA) program seeks to understand how parasites in the field evolve not just against drugs, but also to gain other evolutionary advantages related to overcoming host immunity and improving transmission. The program runs under the broader US NIH International Centers of Excellence for Malaria Research (ICEMR) initiative [9] and so is a part of a formal government-sanctioned collaboration between India and the US [8, 21].
The clinical protocol guiding sample collections was approved by the institutional review boards of Goa Medical College and Hospital, University of Washington, US NIAID Division of Microbiology and Infectious Diseases, and Government of India Health Ministry Screening Committee. From April 2012 to December 2015, patients at the Goa Medical College (GMC) who were diagnosed with Plasmodium falciparum infection (by either rapid diagnostic test or thin-smear microscopy) were referred to the MESA-ICEMR study team. Non-pregnant individuals between 12 months and 65 years old were given a written and oral description of the study and asked to provide written informed consent. Children between 8 and 18 years old were asked to provide assent in addition to the written consent of a parent or guardian required for all under 18. Study participants provided 4–6 mL of venous blood, and parasite species was confirmed by RDT (FalciVax, Zephyr Biomedicals, India) and Giemsa-stained slide microscopy. Through December 2015, a total of 1088 Plasmodium-positive individuals, 228 of whom had Plasmodium falciparum mono-infection, were enrolled by the MESA-ICEMR at GMC. Of the malaria-positive patients enrolled at Goa Medical College and Hospital between 2012 and 2015, 88% were born in 31 Indian states other than Goa. Most of the enrolled patients were male (91%) and many were construction workers (51%) [29].
The present study provides a first glimpse of P. falciparum genomes from India through analysis of five Indian P. falciparum whole genome sequences collected between 2012 and 2015 by the MESA-ICEMR. Though the parasite samples included in the present study were collected in Goa, the parasites likely had diverse origins, including the states of Goa, Uttar Pradesh, Bihar, and Assam, based on study participants’ reported place of birth and travel history one month prior to sample collection (Table 1). For comparison of relatedness to parasites around the world, whole genome sequencing (WGS) of malarial parasites from global field isolates delivers a comprehensive set of variations present in parasite populations. This, in turn, can provide an accurate approximation of associations between various genome samples [12, 13]. In such population studies, with more than 5,000 genes in each malaria parasite, the relationships between samples are complex. This complexity makes it useful to compare global variation and relatedness between parasites mathematically by imagining the distinct samples as occupying locations in an abstract space (usually in n-1 dimensions, where n is the number of samples under consideration). Principal Component Analysis (PCA) identifies components, which are linear combinations of these dimensions, that describe the variation between the samples. In this case, as is typical, a few principal components capture most of the variation, thus describing the separation of samples using just a few important dimensions. This approach provides a parsimonious view of the stratification present in a set of samples. Technical details of the sequences of the parasites from India, the gathering of non-Indian genome sequences, and the final genomic comparisons are presented in the legend to Figure 1.
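The idea of placing samples in an abstract space and summarizing them with a few components can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the toy 0/1 genotype matrix, the `keep_site` mask (standing in for sites excluded from the distance calculation, such as var genes and intergenic regions), and the function names `pairwise_distance` and `principal_coordinates` are all assumptions made for the example. It uses classical multidimensional scaling (principal coordinates analysis) on a pairwise distance matrix, which is one common way to obtain components from distances; the text above does not state which exact procedure was used.

```python
import numpy as np

def pairwise_distance(genotypes, keep_site):
    """Fraction of retained sites at which each pair of samples differs.

    genotypes: (n_samples, n_sites) array of 0/1 allele calls (toy encoding).
    keep_site: boolean mask of sites retained for the distance calculation.
    """
    g = genotypes[:, keep_site]
    # Compare every sample with every other sample at each retained site.
    diff = g[:, None, :] != g[None, :, :]
    return diff.mean(axis=2)

def principal_coordinates(dist, n_components=2):
    """Classical MDS: coordinates whose distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get a Gram (inner-product) matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)         # drop small negative values
    coords = eigvecs[:, :n_components] * np.sqrt(positive[:n_components])
    explained = positive / positive.sum()
    return coords, explained[:n_components]

# Hypothetical toy data: 6 samples x 40 sites, two groups of three samples.
genotypes = np.zeros((6, 40), dtype=int)
genotypes[3:, :20] = 1                 # group B differs from group A at 20 sites
for i in range(6):
    genotypes[i, 20 + i] = 1           # one private variant per sample
genotypes[::2, 30:] = 1                # noisy sites that will be masked out

keep_site = np.ones(40, dtype=bool)
keep_site[30:] = False                 # assumed mask for excluded regions

dist = pairwise_distance(genotypes, keep_site)
coords, explained = principal_coordinates(dist)
print(np.round(coords, 3))
print(np.round(explained, 3))
```

In this toy setting the first component separates the two groups and accounts for nearly all of the variation, which mirrors the point that a few components can describe the separation of many samples.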
Table 1.
Patient ID | XGK2 | 5TBX | durf | hxjy | gabv |
---|---|---|---|---|---|
Enrollment Date | Aug-12 | Aug-12 | Aug-12 | Apr-13 | Jun-15 |
State of Birth | Uttar Pradesh (UP) | Outside India | UP | Bihar | Assam |
Travels | Maharashtra, UP | No/Goa | No/Goa | Tamil Nadu, Bihar | No/Goa |
Temperature (°F) | 97.8 | 99.7 | 104.8 | 103.8 | 100.6 |
% Parasitemia | 3.6 | 0.2 | 1.9 | 2.64 | 2.04 |
Inpatient | Y | N | N | Y | N |
Severe | Y | NA | NA | Y | NA |
Together with genomes from around the world, the relative position of the Indian parasites in the world-map of genetic interconnectedness was identified (Figure 1). The layers of stratification of Indian samples in relation to parasites from other parts of the world were also estimated. Figure 1 shows that, as expected, global isolates fall into different groups based on their geographical origins [12, 13]. In this stratification, Indian isolates (red) clearly segregate into a distinct cluster, different from isolates of Southeast Asia (blue and purple) and even Bangladesh and Myanmar (dark and light pink). A neighbor-joining tree, obtained using 'nj' from the 'ape' package in R and made from the same distance matrix as that used for PCA, shows that isolates from Africa and Asia are well separated (Figure 2). Indian samples (red) lie between those from Bangladesh and Myanmar (pink), slightly removed from samples from Thailand (dark blue) and more removed from those of the more distant Southeast Asian countries of Cambodia, Laos, and Vietnam (lighter blue).
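For readers unfamiliar with neighbor joining, the following Python sketch shows the general algorithm on a small distance matrix. It is only an illustration: the tree in this study was built with 'nj' from the R package 'ape', not with this code. The function name `neighbor_joining`, the sample labels, and the toy distances are hypothetical, and the sketch returns topology only, without branch lengths.

```python
import numpy as np

def neighbor_joining(dist, labels):
    """Return a Newick string (topology only) built by neighbor joining."""
    d = np.array(dist, dtype=float)
    nodes = list(labels)
    while len(nodes) > 2:
        n = len(nodes)
        row_sums = d.sum(axis=1)
        # Q criterion: favour pairs that are close to each other
        # and far from everything else.
        q = (n - 2) * d - row_sums[:, None] - row_sums[None, :]
        np.fill_diagonal(q, np.inf)
        i, j = np.unravel_index(np.argmin(q), q.shape)
        # Distance from the new internal node to every existing node.
        new_row = 0.5 * (d[i] + d[j] - d[i, j])
        merged = f"({nodes[i]},{nodes[j]})"
        keep = [k for k in range(n) if k not in (i, j)]
        # Rebuild the matrix: remaining nodes first, merged node last.
        d = np.vstack([
            np.column_stack([d[np.ix_(keep, keep)], new_row[keep]]),
            np.append(new_row[keep], 0.0),
        ])
        nodes = [nodes[k] for k in keep] + [merged]
    return f"({nodes[0]},{nodes[1]});"

# Hypothetical toy distances: two tight pairs (A1, A2) and (B1, B2)
# plus a more distant sample C. These numbers are not from the study.
labels = ["A1", "A2", "B1", "B2", "C"]
toy_dist = [
    [0, 2, 6, 6, 7],
    [2, 0, 6, 6, 7],
    [6, 6, 0, 2, 7],
    [6, 6, 2, 0, 7],
    [7, 7, 7, 7, 0],
]
tree = neighbor_joining(toy_dist, labels)
print(tree)
```

Because the same distance matrix can feed both an ordination and a tree, the two views can be compared directly; the tree makes the nearest-neighbor relationships between samples explicit.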
Investigators in South Asia and beyond should benefit from access to the present whole genome sequences of P. falciparum from India. The Indian subcontinent has very diverse human genetics, ecosystems, and mosquito populations [21], so future parasite genome collections by the MESA-ICEMR will help capture an even more complete geographic representation of Indian P. falciparum parasites. Parasites captured directly from the states of origin of the construction workers in Goa will be of particular interest and will help determine whether there is even greater diversity of parasites in India than what we captured in Southwest India. The present collection will also serve as a reference point in time. As the social and economic interactions between the countries of South Asia and Southeast Asia increase, it will be important to track the extent to which parasite genetic structures change, or even merge, over time. Genome-wide analysis of P. falciparum SNPs may also help track newly arrived ‘interlopers’ that cross borders with migrant workers, and possible stable recombinants emerging after mating with local parasite populations in India. Beyond Asia, the parasite genome sequences from India should provide valuable context to interpret population differences amongst global collections of fully-assembled whole genome sequences of P. falciparum, especially those from Africa. Such large datasets should help predict the cross-continental efficacy of newly emerging drugs, vaccines, and vector-control measures, and help track the spread of potential resistance traits across continents.
Highlights.
Plasmodium falciparum genomes from India are underrepresented in global collections
Five distinct Plasmodium falciparum genomes from India have been sequenced
Compared to global parasites, Plasmodium falciparum from India are distinct
Indian parasites are most closely related to those from Bangladesh and Myanmar, followed by Thailand
Acknowledgments
The authors thank all of the study participants and clinical research staff at the Goa Medical College and Hospital who assisted with this work. The authors also gratefully acknowledge Goa Medical College Dean, Dr. Pradeep Naik, and Medical Superintendent, Dr. Sunanda Amonkar, for facilitation and support of the research study. Dr. Anju Verma (The Rotary Blood Bank, New Delhi, India) provided human RBCs and human plasma for parasite culture. This work was part of the ‘Malaria Evolution in South Asia’ Program Project, an International Center of Excellence for Malaria Research (ICEMR) supported by US NIH/NIAID agreement U19 AI089688 (Program Director, PKR). The authors are most grateful for the administrative and scientific guidance provided by the MESA-ICEMR Scientific Advisory Group, particularly Dr. David Sibley on genomics, and the Government of India representatives Dr. Neena Valecha, Dr. Rashmi Arora, Dr. P.L. Joshi and Dr. Shiv Lal, and US NIH Program Officer Dr. Malla Rao.
Footnotes
AUTHOR CONTRIBUTIONS
SK, DGM, JWIII, JNM, MD, LC and PKR conceived and designed the experiments. AS, AM, RD, LP and RBS performed the experiments. SK, DM, JNM, JWIII, WZ, ST, and PKR analyzed the data. SK, LC, and PKR wrote the manuscript. All authors reviewed the final manuscript.
References
- 1. Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, et al. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 2002;419:498–511. doi: 10.1038/nature01097.
- 2. Doolan DL, Apte SH, Proietti C. Genome-based vaccine design: the promise for malaria and other infectious diseases. Int J Parasitol. 2014;44:901–913. doi: 10.1016/j.ijpara.2014.07.010.
- 3. Volkman SK, Neafsey DE, Schaffner SF, Park DJ, Wirth DF. Harnessing genomics and genome biology to understand malaria biology. Nat Rev Genet. 2012;13:315–328. doi: 10.1038/nrg3187.
- 4. Fan E, Baker D, Fields S, Gelb MH, Buckner FS, Van Voorhis WC, et al. Structural genomics of pathogenic protozoa: an overview. Methods Mol Biol. 2008;426:497–513. doi: 10.1007/978-1-60327-058-8_33.
- 5. Kooij TW, Janse CJ, Waters AP. Plasmodium post-genomics: better the bug you know? Nat Rev Microbiol. 2006;4:344–357. doi: 10.1038/nrmicro1392.
- 6. Llinas M, DeRisi JL. Pernicious plans revealed: Plasmodium falciparum genome wide expression analysis. Curr Opin Microbiol. 2004;7:382–387. doi: 10.1016/j.mib.2004.06.014.
- 7. Rathod PK, Ganesan K, Hayward RE, Bozdech Z, DeRisi JL. DNA microarrays for malaria. Trends Parasitol. 2002;18:39–45. doi: 10.1016/s1471-4922(01)02153-5.
- 8. Narayanasamy K, Chery L, Basu A, Duraisingh MT, Escalante A, Fowble J, et al. Malaria evolution in South Asia: knowledge for control and elimination. Acta Trop. 2012;121:256–266. doi: 10.1016/j.actatropica.2012.01.008.
- 9. Rao MR. International Centers of Excellence for Malaria Research. Am J Trop Med Hyg. 2015;93:1–4. doi: 10.4269/ajtmh.15-0407.
- 10. Carlton JM, Volkman SK, Uplekar S, Hupalo DN, Pereira Alves JM, Cui L, et al. Population Genetics, Evolutionary Genomics, and Genome-Wide Studies of Malaria: A View Across the International Centers of Excellence for Malaria Research. Am J Trop Med Hyg. 2015;93:87–98. doi: 10.4269/ajtmh.15-0049.
- 11. Karlsson EK, Kwiatkowski DP, Sabeti PC. Natural selection and infectious disease in human populations. Nat Rev Genet. 2014;15:379–393. doi: 10.1038/nrg3734.
- 12. Miotto O, Almagro-Garcia J, Manske M, Macinnis B, Campino S, Rockett KA, et al. Multiple populations of artemisinin-resistant Plasmodium falciparum in Cambodia. Nat Genet. 2013;45:648–655. doi: 10.1038/ng.2624.
- 13. Miotto O, Amato R, Ashley EA, MacInnis B, Almagro-Garcia J, Amaratunga C, et al. Genetic architecture of artemisinin-resistant Plasmodium falciparum. Nat Genet. 2015;47:226–234. doi: 10.1038/ng.3189.
- 14. Ferdig MT, Su XZ. Microsatellite markers and genetic mapping in Plasmodium falciparum. Parasitol Today. 2000;16:307–312. doi: 10.1016/s0169-4758(00)01676-8.
- 15. Anderson TJ. Mapping drug resistance genes in Plasmodium falciparum by genome-wide association. Curr Drug Targets Infect Disord. 2004;4:65–78. doi: 10.2174/1568005043480943.
- 16. Su XZ, Wootton JC. Genetic mapping in the human malaria parasite Plasmodium falciparum. Mol Microbiol. 2004;53:1573–1582. doi: 10.1111/j.1365-2958.2004.04270.x.
- 17. Kumar A, Valecha N, Jain T, Dash AP. Burden of malaria in India: retrospective and prospective view. Am J Trop Med Hyg. 2007;77:69–78.
- 18. Murray CJ, Rosenfeld LC, Lim SS, Andrews KG, Foreman KJ, Haring D, et al. Global malaria mortality between 1980 and 2010: a systematic analysis. Lancet. 2012;379:413–431. doi: 10.1016/S0140-6736(12)60034-8.
- 19. Dondorp AM, Nosten F, Yi P, Das D, Phyo AP, Tarning J, et al. Artemisinin resistance in Plasmodium falciparum malaria. N Engl J Med. 2009;361:455–467. doi: 10.1056/NEJMoa0808859.
- 20. Faulde MK, Rueda LM, Khaireh BA. First record of the Asian malaria vector Anopheles stephensi and its possible role in the resurgence of malaria in Djibouti, Horn of Africa. Acta Trop. 2014;139:39–43. doi: 10.1016/j.actatropica.2014.06.016.
- 21. Kumar A, Chery L, Biswas C, Dubhashi N, Dutta P, Dua VK, et al. Malaria in South Asia: prevalence and control. Acta Trop. 2012;121:246–255. doi: 10.1016/j.actatropica.2012.01.004.
- 22. Llinas M, Deitsch KW, Voss TS. Plasmodium gene regulation: far more to factor in. Trends Parasitol. 2008;24:551–556. doi: 10.1016/j.pt.2008.08.010.
- 23. Guizetti J, Scherf A. Silence, activate, poise and switch! Mechanisms of antigenic variation in Plasmodium falciparum. Cell Microbiol. 2013;15:718–726. doi: 10.1111/cmi.12115.
- 24. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. 2011;2011:17.
- 25. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 26. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
- 27. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27:2987–2993. doi: 10.1093/bioinformatics/btr509.
- 28. Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag; 2009.
- 29. Chery, et al. Demographic and clinical profiles of Plasmodium falciparum and Plasmodium vivax patients at a government tertiary care centre in southwestern India. doi: 10.1186/s12936-016-1619-5. Submitted.