Skip to main content
PLOS Biology logoLink to PLOS Biology
. 2020 May 4;18(5):e3000711. doi: 10.1371/journal.pbio.3000711

Single-cell transcription analysis of Plasmodium vivax blood-stage parasites identifies stage- and species-specific profiles of expression

Juliana M Sà 1, Matthew V Cannon 2, Ramoncito L Caleon 1, Thomas E Wellems 1, David Serre 2,*
Editor: Michael Duffy3
PMCID: PMC7224573  PMID: 32365102

Abstract

Plasmodium vivax and P. falciparum, the parasites responsible for most human malaria worldwide, exhibit striking biological differences, which have important clinical consequences. Unfortunately, P. vivax, unlike P. falciparum, cannot be cultivated continuously in vitro, which limits our understanding of its biology and, consequently, our ability to effectively control vivax malaria. Here, we describe single-cell gene expression profiles of 9,215 P. vivax parasites from bloodstream infections of Aotus and Saimiri monkeys. Our results show that transcription of most P. vivax genes occurs during short periods of the intraerythrocytic cycle and that this pattern of gene expression is conserved in other Plasmodium species. However, we also identify a strikingly high proportion of species-specific transcripts in late schizonts, possibly associated with the specificity of erythrocyte invasion. Our findings provide new and robust markers of blood-stage parasites, including some that are specific to the elusive P. vivax male gametocytes, and will be useful for analyzing gene expression data from laboratory and field samples.


Analysis of individual Plasmodium vivax parasites reveals the tight control of the expression of most genes during the intra-erythrocytic cycle and the differentiation of male and female gametocytes, and highlights differences between the development of P. vivax and P. falciparum.

Introduction

Following the bite of an infected mosquito, Plasmodium parasites in the form of sporozoites move from the skin to the circulation, before invading the host hepatocytes. There, they multiply in their individual host cells before exiting into the bloodstream, where they proliferate in continuous rounds of asexual multiplications within erythrocytes (or red blood cells [RBCs]). This intraerythrocytic life cycle that is completed every 2 days by the most prevalent human parasites, Plasmodium falciparum and P. vivax, leads to an exponential increase in the number of parasites and is responsible for the disease symptoms in malaria patients. During the blood-stage infection, individual parasites progress through the successive morphological stages of rings, trophozoites, and schizonts [1,2]. Alternatively, some erythrocytic parasites differentiate into sexual stages, the male and female gametocytes, that mature to infect mosquitoes, undergo fertilization, and produce sporozoites that can then be propagated to new vertebrate hosts [3,4].

Several studies have shown that the morphological changes occurring during the intraerythrocytic cycle of the Plasmodium parasites are accompanied by important transcriptional differences [510]. For human parasites, these studies have primarily relied on analysis of synchronized parasites from in vitro cultures, which might not always recapitulate accurately the gene expression profiles observed in vivo [11,12]. Furthermore, analyses of samples collected at different time points along the development of tightly synchronized cultures cannot inform on whether gradual or abrupt transcriptional changes accompany the morphological changes observed microscopically. The reliance on cultivated parasites has also greatly limited analyses of nonfalciparum Plasmodium species that cause malaria in humans. In particular, despite the great burden of vivax malaria worldwide [13,14] and pioneering work conducted using ex vivo cultures [15,16], few studies have investigated the gene expression profiles of P. vivax parasites [1720]. This paucity of information is problematic given the deep evolutionary divergence of P. vivax and P. falciparum species and the profound differences in their biology, including distinct microscopic presentations of asexual and sexual stage parasites, different rates of gametocytogenesis [2123], specific RBC preferences and underlying invasion mechanisms [2427], and differences in host tissue affinities and sequestration mechanisms during blood-stage infections [28,29].

Advances in genomic technologies, and notably the development of tools enabling the characterization of the gene expression profiles from individual cells [30], are providing new approaches to disentangle the inherent heterogeneity of most blood-stage infections and to improve our understanding of the regulation of transcription in malaria parasites. Applications of single-cell RNA sequencing (scRNA-seq) technology to P. falciparum and P. knowlesi in vitro cultures and to the rodent parasite P. berghei have provided unique insights into the molecular mechanisms underlying gametocytogenesis [31,32] as well as the regulation of the asexual intraerythrocytic life cycle [33,34]. Unfortunately, the requirement for fresh and viable parasites has so far limited extension of such studies to other Plasmodium species. Here, we describe scRNA-seq data generated from 7 New World monkeys infected with 4 different strains of human P. vivax parasites. Our results highlight genes that can serve as markers for different developmental stages, including male or female gametocytes, provide insights on the possible role of many incompletely characterized protein-coding genes, and reveal unique features of the profiles of gene expression during the P. vivax intraerythrocytic cycle.

Results and discussion

scRNA-seq provides detailed gene expression information on individual blood-stage P. vivax parasites

We generated scRNA-seq data from 10 P. vivax–infected blood samples collected from 7 New World monkeys: 4 samples were collected from 4 splenectomized Saimiri monkeys, each infected with 1 of 3 P. vivax strains; 3 samples were collected from 3 splenectomized Aotus monkeys infected with 1 of 3 strains of P. vivax strains; and 3 additional blood samples were collected from the same Aotus monkeys 16 hours after administration of 1 dose of chloroquine (Table 1 and S1 Fig). Upon collection, we processed each blood sample using MACS columns [35] to enrich for infected RBCs and immediately prepared 3′-end scRNA-seq libraries using the 10X Genomics technology [30]. We then sequenced each library to generate, on average, 125.5 million paired-end reads of 75 base pairs (bp) per sample (Table 1). Between 41.2% and 84.2% of the reads from each sample mapped to the P. vivax P01 genome sequence [36], with most of the remaining reads mapping to the hosts’ genomes (S1 Table). Interestingly, after these reads were assigned into individual cells, the vast majority of the single-cell transcriptomes consisted of reads that mapped exclusively to either the host or P. vivax genome (S2 Fig). In fact, less than 5% of all 9,215 parasite transcriptomes that passed stringent quality cutoffs (see below) contained more than 300 unique reads mapping to the corresponding host genome. This observation was surprising given that P. vivax preferentially invades reticulocytes [37,38], which are defined by the presence of RNA molecules (that are later lost during RBC maturation [39,40]). However, this relative paucity of host sequences can be explained by the selection of relatively mature P. vivax stages during MACS column purification (see below) and by the findings of an ex vivo study, which demonstrated that P. vivax parasites rapidly remodel the reticulocytes and actively expulse the intracellular content as they develop [27]. Most of the noninfected monkey cells harbored high levels of hemoglobin mRNAs, indicating that these were likely erythrocytes or of platelet-specific transcripts (e.g., platelet-factor 4 or pro-platelet basic protein genes).

Table 1. Description of the samples and summary of the transcriptome data.

The table shows information about the sample (P. vivax strain, host species and animal identifier, and collection time in days post infection), microscopy data (parasitemia in percentage of infected RBCs, number of rings, trophozoites, schizonts and (female) gametocytes per 10,000 RBCs (and in %)) and sequencing data (number of reads generated, percentage of reads mapped to the P. vivax genome, number of individual parasite transcriptomes obtained and number (and percentage) of asexual, female, and male parasites).

Sample Microscopy Sequencing
Strain Host ID Collection Parasitemia Rings Trophozoites Schizonts Gametocytes No. Reads % P. vivax Parasites Asexuals Females Males
Chesson Aotus 86436 12 DPI 0.16% 10
(62.5%)
1
(6.3%)
3
(18.8%)
2
(12.5%)
151,711,653 84.2% 1,037 917
(88.4%)
108
(10.4%)
12
(1.2%)
Indonesia-I/CDC Aotus 86574 12 DPI 0.62% 24
(38.7%)
38
(61.3%)
0
(0.0%)
0
(0.0%)
147,583,347 77.4% 2,098 1,816
(86.6%)
236
(11.2%)
46
(2.2%)
AMRU-I Aotus 86416 12 DPI 0.34% 3
(8.8%)
23
(67.6%)
4
(11.8%)
4
(11.8%)
139,578,955 78.9% 1,709 1,403
(82.1%)
295
(17.3%)
11
(0.6%)
Chesson Aotus 86436 16 DPIa 0.46% 23
(50.0%)
12
(26.1%)
9
(19.6%)
2
(4.3%)
87,132,416 73.1% 521 273
(52.4%)
238
(45.7%)
10
(1.9%)
Indonesia-I/CDC Aotus 86574 16 DPIb 0.45% 10
(22.2%)
27
(60.0%)
8
(17.8%)
0
(0.0%)
92,108,991 67.9% 589 500
(84.9%)
89
(15.1%)
0
(0.0%)
AMRU-I Aotus 86416 16 DPIa 0.47% 12
(25.5%)
29
(61.7%)
0
(0.0%)
6
(12.8%)
110,785,183 80.7% 1,795 1,738
(96.8%)
52
(2.9%)
5
(0.3%)
Chesson Saimiri 4215 24 DPI 0.03% 1
(33.3%)
0
(0.0%)
1
(33.3%)
1
(33.3%)
165,620,223 51.0% 249 233
(93.6%)
16
(6.4%)
0
(0.0%)
AMRU-I Saimiri 5107 24 DPI 0.01% 1
(100.0%)
0
(0.0%)
0
(0.0%)
0
(0.0%)
127,962,320 41.2% 22 21
(95.5%)
1
(4.5%)
0
(0.0%)
NIH-1993 Saimiri 3879 15 DPI 0.24% 4
(16.7%)
18
(75.0%)
0
(0.0%)
2
(8.3%)
125,218,128 70.5% 928 518
(55.8%)
401
(43.2%)
9
(1.0%)
NIH-1993 Saimiri 5541 21 DPI 0.04% 2
(50.0%)
2
(50.0%)
0
(0.0%)
0
(0.0%)
107,172,479 72.0% 267 225
(84.3%)
41
(15.4%)
1
(0.4%)

a16 hours after animal received a single 10 mg/kg chloroquine dose orally.

b16 hours after animal received a single 5 mg/kg chloroquine dose orally.

Abbreviations: DPI, days post infection; RBC, red blood cell

After removing PCR duplicates, very few transcriptomes (between 0 to 86 per sample, with a median of 8.5) contained more than 75,000 unique reads, and these were excluded because they might correspond to droplets containing multiple parasites or other technical artefacts. We next compared the results using either all P. vivax transcriptomes with more than 1,000 unique reads (n = 13,503) or those with more than 5,000 reads (n = 9,215). Overall, the results were qualitatively equivalent (S3 Fig versus Fig 1), with only a slight increase in the proportion of late schizonts excluded by the more stringent cutoff (S4 Fig). We chose to concentrate for the remainder of the analyses on the 9,215 parasites with at least 5,000 reads because these data provided a deeper characterization of the P. vivax transcriptomes (Table 1). On average, each of the individual parasite transcriptomes consisted of 19,282 unique sequences.

Fig 1. PCAs of the individual P. vivax transcriptomes.

Fig 1

(A) PCA of 9,215 P. vivax parasites using 13,054 windows of 500 bp with at least 0.25% of the reads of one parasite. Each dot represents a single parasite transcriptome and is colored based on the expression of 60S ribosomal protein L34 (red), AMA1 (blue), MSP3 (purple), Pvs25 (green), and MGET (orange). The arrows represent the trajectories used to calculate developmental pseudotimes for asexual (in black) and female gametocytes (in green). (B) PCA calculated jointly from the data of 9,215 P. vivax parasites and 4,884 P. berghei parasites using the expression levels of orthologous genes. The gray dots represent P. vivax parasites and the colored dots show P. berghei parasites according to their developmental stages (yellow, rings; red, trophozoites; blue, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data plotted in panels A and B are provided in S1 Data. bp, base pairs; PCA, principal component analysis.

Because the 10X 3’-end assay relies on oligo-dT capture and priming and fragmentation of the cDNA into short molecules before sequencing, all scRNA-seq reads should derive from the last approximately 300 bp of each polyadenylated mRNA molecule [30]. Indeed, 68% to 72% of the Aotus reads mapped within 500 bp of the 3’-end of annotated genes (on either side). By contrast, only 42% to 48% of the reads mapped to the P. vivax P01 genome were located within 500 bp of one annotated gene 3’-end (see also S5 Fig), reflecting (i) the incomplete annotation of untranslated regions (UTRs) in this species [18] and/or (ii) the presence of unannotated or misannotated P. vivax transcripts. To further examine whether the scRNA-seq reads correspond to genuine signals from misannotated and unannotated transcripts, we compared the locations of the mapped reads with the 3’-end of transcripts predicted from bulk RNA-seq data from one P. vivax–infected patient (see Methods). Despite the crudeness of these de novo gene predictions, 22% of the scRNA-seq reads mapped to the last 500 bp of the predicted transcripts. This figure corresponds a 3.4-fold enrichment over what would be expected solely by chance (by comparison, the enrichment in the 3’-end of current gene annotations was only 2.5-fold) and highlights that the scRNA-seq reads likely represent many transcripts incompletely described in the current annotations. Overall, 45% of the scRNA-seq reads mapped to the last 500 bp of the 3’-end of either an annotated or predicted transcript. To circumvent the incomplete annotations of P. vivax genes, we summarized the scRNA-seq data, not by annotated genes but by nonoverlapping windows of 500 bp (which corresponds roughly to twice the width of the scRNA-seq peaks; see, e.g., S6 Fig), and we annotated these windows only afterward, with regard to their distance to the 3’-end of the closest gene on the same strand (see S2 Fig and Methods), with the caveat that the signal might actually derive from another unannotated transcript. Overall, we detected expression of 1,079 genes on average in each individual P. vivax parasite; although, the majority of these genes was represented by very few reads, as typically observed with this technology (S2 Table).

Multiple developmental stages are simultaneously present in each infection

Although certain P. vivax stages have been reported to accumulate in host tissues such, as bone marrow [19,41], the patterns, extent, and mechanisms of sequestration differ substantially from those of P. falciparum. As a consequence, and in contrast with P. falciparum infections, in most P. vivax infections, many asexual and sexual stages can be found simultaneously in the circulation [29,37]. Consistent with this reduced sequestration, principal component analysis (PCA) of the individual P. vivax transcriptomes revealed distinct populations of blood-stage parasites. Two of the first 3 principal components defined the main population of parasites, organized according to a shape reminiscent of a cycle, though with a conspicuous gap (Fig 1A and S7 Fig). Parasites at one end of this distribution expressed primarily genes associated with transcription and translation, whereas the parasites at the other end expressed many invasion protein genes (Figs 1A and 2A), suggesting that this population of parasites represented asexual blood-stage parasites progressing in development from trophozoites to schizonts. Note that, because they contain little hemozoin, ring-stage parasites are typically lost during MACS column purification, which likely explains their absence from our data and the gap in the cycle. The second principal component separated 2 additional populations of parasites from the main group: one population expressed known female gametocyte genes (e.g., Pv25 [42]), whereas a smaller population expressed genes known to be specific to male gametocytes (e.g., MGET [42]; Fig 1A).

Fig 2. Heatmaps showing variations in gene expression among blood-stage P. vivax parasites.

Fig 2

(A) Variations in the expression levels of the genes most expressed at different times of the P. vivax asexual development. (B) Variations in the expression levels of genes previously associated with gametocytes or gametes. Each row represents a different gene, and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each bar represents the expression level of a specific transcript from nondetected (white) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmaps shows the assignment of the parasites to different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data plotted in panels A and B are provided in S2 Data.

To further examine these cell populations, we compared the scRNA-seq profiles of P. vivax parasites with those generated using the same technology from P. berghei blood-stage parasites [34]. Despite the deep divergence between these 2 species and the substantial differences between their life cycles (including a 24-hour intraerythrocytic cycle in P. berghei compared to a 48-hour in P. vivax), the PCA profiles were remarkably similar (Fig 1B and S8 Fig). This observation was consistent with previous findings from microarray [7,15] and scRNA-seq [34] studies that demonstrated global conservation of the gene expression changes during the intraerythrocytic cycle of Plasmodium species. These results also corroborated our interpretation of the different cell populations and confirmed that rings were missing from our data set.

P. vivax parasites from different stages were observed in most monkey infections, but not all stages were necessarily observed in each sample, and their relative proportion varied extensively among samples (Table 1 and S9 Fig). These results were consistent with the observation that most P. vivax infections are asynchronous, although this high diversity in stages might have possibly been exacerbated in our data by the splenectomy of the monkeys that was used to avoid parasite clearance and the dangers of splenomegaly [43]. Note that analysis of allelic variants across nucleotides sequenced at high coverage in all samples provided no evidence of genetic heterogeneity in the individual infections (S1 Text), indicating that the different parasite stages observed in each infection were unlikely to represent mixed populations of P. vivax clones out-of-phase in their cycles.

scRNA-seq provides robust markers of P. vivax male and female gametocytes

P. vivax and P. falciparum are distinguished by important features of their gametocytogenesis: P. vivax gametocytes appear earlier than P. falciparum gametocytes in blood-stage infections and display very different morphology than P. falciparum gametocytes [21,22]. Unfortunately, the lack of in vitro culture methods for P. vivax has drastically limited studies of sexual commitment and gametocyte differentiation, and has also hindered the characterization of robust biomarkers for male and female P. vivax gametocytes. In this regard, the scRNA-seq data provided a unique opportunity to assess the expression of putative gametocyte markers and to identify novel candidates. Many of the genes typically expressed in P. falciparum or rodent gametocytes (Pvs25 [42], MGET [42], ULG8 [10], GEST [44]) were detected, at various levels, in male or female P. vivax gametocytes (Fig 2B). However, other genes labelled as “gamete” or “gametocyte” genes displayed patterns of expression in P. vivax that were not consistent with such annotations: for example, the “gametocyte-associated protein” (PVP01_1403000) or the “sexual antigen” s16 (PVP01_0305600) were mostly detectable in asexual parasites (Fig 2B). Importantly, transcripts of both of these genes were not detected together in individual asexual parasites and were therefore unlikely to indicate sexually committed parasites in the asexual population. In contrast, the patterns of expression of some genes were consistent with their transcription in differentiating gametocytes: the gamete egress and sporozoite traversal protein (GEST, PVP01_1258000) was detected only in the population of cells intermediate between the asexual parasites and the fully differentiated females, whereas the male development gene 1 (MDG1, PVP01_1435300) was expressed in both male and female putatively differentiating gametocytes (Fig 2B). To test more systematically whether our data set included both immature and mature gametocytes, we examined the expression of genes previously associated with various stages of gametocyte differentiation [19]. Although the expression levels of genes associated with immature gametocytes were relatively constant along the female gametocyte pseudotime, we observed that the expression of mature gametocyte genes increased with developmental pseudotime (S10 Fig). In contrast, we did not observe any population of parasites expressing genes specifically associated with gametocyte rings, which could indicate (i) their loss during MACS column purification, (ii) insufficient sensitivity of the scRNA-seq, or (iii) their absence from circulating blood (e.g., due to sequestration in the bone marrow [19]). Interestingly, male gametocytes did not show elevated expression of genes previously associated with mature relative to immature gametocytes [19] (S10 Fig), suggesting that these markers may only be representative of female gametocyte differentiation, or that our data set only included immature but not mature male gametocytes.

We next searched the scRNA-seq data to identify genes highly expressed in male and/or female gametocytes and with little expression in other stages to generate a comprehensive list of putative biomarkers for male and female gametocytes (S3 Table). Interestingly, we noted that most of the genes highly expressed in female gametocytes (19 out of 25) were specifically expressed at this stage, whereas the genes highly expressed in male gametocytes were often also expressed at a different stage of the parasite life cycle, and these patterns were conserved in P. berghei (S3 Table). These patterns of sex-specific expression differ from the patterns of expression, at those same genes, measured previously by microarrays in P. berghei [45] and could reflect differences in regulation between Plasmodium species (though scRNA-seq data from P. berghei [34] are consistent with our findings from P. vivax) and/or differences in sensitivity and specificity of the methods used. On the other hand, male gametocytes appeared to express more species-specific genes (i.e., genes without orthologs in P. falciparum or P. berghei), possibly indicating differential evolutionary constrains on these different developmental stages (see also below).

Next, we reanalyzed the gene expression profiles previously obtained by bulk RNA-seq from whole blood of vivax malaria patients in Cambodia [20] using the transcripts deemed from the scRNA-seq data to be present exclusively in male or female P. vivax gametocytes. We determined that the expression of male and female gametocyte-specific transcripts did not significantly covary across patient infections (S11 Fig), confirming that the ratio of male/female gametocytes can differ greatly among patient infections [20]. These findings were also consistent with the variable proportions of male and female gametocytes detected by scRNA-seq in each New World monkey infection (Table 1).

Gene expression is highly stage-specific and changes gradually during the P. vivax intraerythrocytic life cycle

Gene expression data from individual parasites revealed the exquisite regulation of P. vivax transcription along the intraerythrocytic life cycle: for the large majority of genes, expression was tightly restricted to a specific developmental stage, and the transcripts were not detectable elsewhere in the cycle (see, e.g., Fig 2A). This pattern was observed for members of the 6-cysteine gene family (Fig 3) as previously described [46] and held true for numerous other genes involved in metabolism, cellular processes, or cellular organization, reflecting the complex changes in parasite biology occurring as it grows and divides in the host RBC. The levels of histone expression also varied along the P. vivax intraerythrocytic cycle, recapitulating the patterns described previously [47]: for the genes encoding core histones (i.e., H2A, H2B, H3, and H4), expression increased gradually during asexual development and peaked in early schizont stages, consistent with their role in nucleosome assembly during replication, before decreasing below detectable levels later on (Fig 4A). By contrast, variant histones displayed relatively constant gene expression throughout the intraerythrocytic life cycle (e.g., H3.3) or had a delayed expression profile (H2Bv). Interestingly, these changes of histone expression were accompanied by changes in the expression of histone-regulating factors: the expression of the histone chaperone ASF1 that plays a key role in disassembly and reassembly of the chromatin during replication [48] peaked slightly earlier than the core histone expression, whereas the expression of the histone hairpin-binding protein that regulates histone mRNA processing in higher eukaryotes [49,50] preceded the increase in core histone mRNA levels (Fig 4B).

Fig 3. Expression of genes from multigene families.

Fig 3

(A) Heatmap showing variations in the expression levels of 6-cysteine genes. (B) Heatmap showing variations in the expression levels of MSP genes. See legend of Fig 2 for details. Underlying data plotted in panels A and B are provided in S3 Data. MSP, merozoite surface protein.

Fig 4. Expression of histone and histone-related genes.

Fig 4

(A) Expression of histone genes according to the P. vivax development. The core histones are indicated by solid lines, and the histone variants are shown with dashed lines. Note that the expression level of core histones peaks in early schizonts, consistent with the timing of DNA replication and the need for new nucleosomes. (B) Expression of the 4 detectable histone-related genes according to the parasite development. Note how the histone RNA hairpin-binding protein (PVP01_1014900, in black) expression increases before replication, consistent with its role in nucleosome disassembly and reassembly, whereas ASF1 (PVP01_1442700, in red) peaks slightly before the increase in core histone, consistent with its role in histone mRNA regulation. The color bars under the graphs show the assignment of the parasites to different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data plotted in panels A and B are provided in S4 Data.

We also observed this stage-specific expression at genes commonly used for rapid diagnostic tests of P. vivax (S12 Fig). Although we cannot predict from the scRNA-seq data how the corresponding protein levels may vary across developmental stages, these observations suggested that detection of P. vivax parasites using aldolase [51,52] or lactate dehydrogenase [53] might be influenced by the predominant stages present in one infection. We therefore searched for genes that were highly expressed in most life cycle stages and might serve for diagnostics as well as for the normalization of transcript levels in quantitative reverse transcription polymerase chain reaction (qRT-PCR) experiments. We only identified a few genes highly expressed across the entire intraerythrocytic life cycle, including profilin (PVP01_0730900, S12 Fig), highlighting that the normalization of qRT-PCR assays and the correction for differences in stage composition will remain difficult challenges and continue to warrant careful evaluation of gene expression studies.

This tightly regulated, stage-specific gene expression was also observed for AP2 domain transcription factor genes. Despite their low expression levels, several AP2 genes could be reliably detected in our data set, and their expression was largely confined to specific developmental stages, with little overlap between them (S13 Fig). These observations corroborated the findings of studies conducted with other Plasmodium species and were consistent with the regulation of the entire intraerythrocytic cycle by successive switches in the expression of these master transcriptional regulators [15,5456].

Much of our knowledge about the changes in gene expression occurring during the Plasmodium intraerythrocytic cycle derives from microarray and bulk RNA-seq studies conducted on parasites cultured in vitro and sampled at discrete time points of their development. Such studies highlight differences in gene expression between parasites at different stages of their development, but it remained unclear whether these dynamic changes happened abruptly, mirroring the changes observed microscopically, or whether they occurred more continuously with development. Recent scRNA-seq data [33] indicated that most genes could be grouped into a few clusters with simultaneous changes in expression, supporting the hypothesis that gene expression changed abruptly in the intraerythrocytic life cycle of Plasmodium parasites. Our findings suggested gradual changes of gene expression resulting from continuous, rather than abrupt, switches in transcription (e.g., Figs 2A and 4). The hypothesis that gene expression changes occurred gradually was also supported by the continuous distribution of the parasites in the PCA plot (Fig 1), whereas distinct clusters of parasites would have been expected from discrete and coordinated regulation of gene expressions (note that this continuous distribution of asexual parasites is also present in other Plasmodium scRNA-seq data generated so far [3134]). One possible explanation for the discrete patterns described previously is that the parasites analyzed were not evenly distributed along the intraerythrocytic cycle, introducing apparent discontinuities, but additional studies will be needed to rigorously analyze and resolve the dynamics of these gene expression changes.

In addition to the transcription of specific genes at different stages of the intraerythrocytic development, the precise information generated by scRNA-seq from the 3’-end of genes also hinted at more subtle forms of gene expression regulation. For example, our data suggested that PVP01_0941100 may be transcribed with different 3’-UTRs in asexual and sexual parasites (S14 Fig). Combined with the observation that many reads mapped outside current gene annotations, the findings emphasized how incomplete our current understanding of the regulation of P. vivax gene expression is.

Sequence homology in gene families does not necessarily predict patterns of expression or function

The P. vivax genome contains a number of multigene families defined by sequence homology and conservation of specific amino acid domains. Although the various members of these multigene families often have no known function, the role of a few families have been annotated based on the validated functions of particular proteins (e.g., the merozoite surface proteins [MSPs] [57]) or on their homologies to P. falciparum genes (e.g., the vir/PIR gene family [58,59]). Single-cell profiling provided a unique opportunity to test whether members of individual families were expressed at the same developmental stage, or in contrary, whether members of a family might be transcribed at different times and with different roles across the parasite life cycle. Although most MSPs were expressed late in the development of asexual parasites (Fig 3B), consistent with their role in erythrocyte invasion, 2 MSP genes displayed distinct expression patterns. An MSP3 gene (PVP01_1031100) was exclusively expressed in very late schizonts, in which it accounted for approximately 2.5% of all mRNA molecules. More surprisingly, an MSP7-like gene (PVP01_1219900) was expressed almost exclusively in gametocytes and at particularly high levels in male gametocytes (Fig 3B). The expression of MSP7-like genes was especially intriguing as 12 of these genes, clustered in 30 kb on chromosome 12, showed highly variable but tightly regulated expression in different stages (S15 Fig). Such differential transcription is consistent with various roles for different MSP genes at specific points in the P. vivax life cycle. Similarly, we observed vastly different expression profiles among members of the PHIST (S16 Fig), PIR (S17 Fig), and tryptophan-rich protein gene families (S18 Fig), suggesting that, despite their sequence homology, members of these individual families might not all have the same function.

Schizonts display species-specific profiles of gene expression

Most P. vivax genes remain functionally unannotated (i.e., they are described as “conserved protein of unknown function”), and those that are annotated often carry a function determined from studies of their P. falciparum homologs. However, P. falciparum and P. vivax diverged from each other millions of years ago [60] and have very different biological features. Mirroring this divergence, 1,473 of the 6,276 annotated P. vivax genes (based on PlasmoDB-37 annotations) do not have annotated orthologs in the P. falciparum 3D7 genome. On average, 9.3% of the mRNA molecules in each P. vivax parasite were transcribed from annotated P. vivax genes without a P. falciparum ortholog. However, this pattern was not constant across all parasite stages and, in particular, late schizonts showed a pronounced difference, with up to 63.4% of the mRNA molecules derived from genes absent in P. falciparum (Fig 5). This remarkable result mirrored the important differences in the molecular mechanisms of human RBC invasion by P. vivax and P. falciparum [26] and highlight the limitations of P. falciparum as a model to understand P. vivax biology. Interestingly, this pattern was also observed when comparing P. vivax to P. berghei (S19 Fig), suggesting that the pathways underlying RBC invasion diverged faster than those regulating other features of the intraerythrocytic cycle. Interestingly, this analysis also confirmed that, overall, male gametocytes expressed a higher proportion of species-specific transcripts than female gametocytes.

Fig 5. Proportion of mRNA transcribed from P. vivax genes with a P. falciparum orthologs.

Fig 5

The figure shows the proportion of the mRNA molecules derived from P. vivax genes with at least one P. falciparum ortholog (y-axis) according to the parasite developmental stage (x-axis). Blue dots indicate asexual parasites organized according to their developmental pseudotime rank (from trophozoites on the left to schizonts on the right); green dots represent female gametocytes and orange dots male gametocytes. Underlying data are provided in S5 Data.

Influence of chloroquine treatment and host species on P. vivax gene expression

To statistically determine whether specific transcripts were differentially regulated upon chloroquine exposure, we compared the gene expression profiles of parasites from the same Aotus infection before and 16 hours after treatment, adjusting for the different distributions of developmental stages (see Methods). A total of 58 windows of 500 bp showed evidence of differential expression (false discovery rate [FDR] < 0.1) in the same direction and in all 3 infections (S4 Table). Interestingly, the genes encoding the core histones H2A, H3, and H4 were all significantly up-regulated after chloroquine exposure (see, e.g., S20A Fig). This finding might reflect the ability of chloroquine to intercalate into DNA and displace Plasmodium histones in vitro [61] and suggested that increasing histone abundance might be a mechanism of parasite response to drug exposure. In these short-term experiments, neither the chloroquine resistance transporter (CRT, PVP01_0109300) nor multidrug resistance gene 1 (MDR1, PVP01_1010900) showed significant changes in expression upon chloroquine exposure (uncorrected p > 0.03 and p > 0.003, respectively). Note, however, that the dose of chloroquine administered was subtherapeutic, and the gene expression changes reported here thus might not be particularly relevant to the biological processes underlying chloroquine-induced clearance or drug resistance.

We also tested for differences in gene expression that might be associated with the host species. Aotus and Saimiri erythrocytes differ in their ability to bind the invasion-related ligand Duffy binding protein 1 and are therefore likely to be invaded by P. vivax parasites using different molecular pathways [62,63]. Our data in Aotus were obtained from infections with 3 P. vivax strains—Chesson, Indonesia-1, and AMRU-1—whereas the data from Saimiri monkeys were mostly derived from infections of NIH-1993 parasites, with some contributions from Chesson and AMRU-1 infections (Table 1 and S9 Fig). This confounding effect precluded drawing firm conclusions about whether the differences in parasite gene expression were due to a response to the host rather than genetic polymorphisms between the parasites. Nevertheless, our analysis revealed potentially interesting genes differentially expressed between the Aotus and Saimiri infections: a number of these candidate genes expressed in late schizonts encoded exported proteins or MSPs (S5 Table), including a MSP3 gene that displayed a 2.1-fold overexpression in Saimiri (PVP01_1031700, S20B Fig). Our findings only marginally overlap with those of a recent whole-blood RNA-seq comparison of Aotus and Saimiri monkeys infected with the Salvador-I strain of P. vivax [62]: the MSP3 gene identified in our study was not orthologous to any of those identified previously, and the most differentially expressed tryptophan-rich antigen gene in schizonts only ranked 250th in our analyses (PVP01_0000140). These discrepancies could be due to the strain heterogeneity in our analyses or reflect the difficulty to adequately correct bulk RNA-seq analyses for even modest differences in stage composition.

Intriguingly, no expression of the DBP2 gene (PVP01_0102300) was detected in any of the samples analyzed here. Because we previously showed that this gene was deleted in the Salvador-I strain [64], we looked for evidence of gene expression of the neighboring genes. In all samples from the 4 P. vivax strains independently adapted to the monkeys, the region containing DBP2 and the upstream telomeric sequence (including 22 annotated genes) was devoid of any reads (aside from possibly mismapped sequences derived from multigene families). This observation could be consistent with the deletion of this region in all monkey-adapted P. vivax strains as we described in the Salvador-I strain. Interestingly, the boundaries of this putative deletion appeared different in the various strains, suggesting independent events, but always occurred immediately upstream of DBP2 or its immediate neighbor (S21 Fig). This fascinating finding could suggest that DBP2 is not necessary for invasion of Aotus or Saimiri monkey RBCs and may even be deleterious in these monkey models, leading to its systematic deletion. This contrasts with the frequent presence of the DBP2 gene in clinical P. vivax isolate genomes from around the world [65,66]. Although more studies will be required to rigorously understand the effect of the host species on parasite gene expression, our analyses suggest that scRNA-seq, using additional strains from monkey infections and/or parasites from patients with different RBC genotypes, will provide a useful framework to identify proteins involved in different pathways of RBC invasion.

Conclusion

The results of our study highlighted the complex and dynamic changes in gene expression occurring during the intraerythrocytic cycle of P. vivax parasites. The vast majority of P. vivax genes appeared to be expressed only during a short developmental period and very few genes (if any) were expressed at the same level throughout the intraerythrocytic cycle. This paucity of constantly expressed “housekeeping” genes will complicate the interpretation of RT-PCR or bulk RNA-seq experiments, which will likely require several markers to adequately correct for the stage composition of the samples. The results of this study can provide a framework to implement such analyses by identifying genes specifically expressed at different stages of the parasite development, including in male and female P. vivax gametocytes. Our study also highlighted that many P. vivax transcripts are incompletely annotated which complicates assigning the scRNA-seq signals to specific genes. Combining the single-cell gene expression profiling with long read sequencing may provide an opportunity to complete the annotation of P. vivax gene structures as well as to characterize different isoforms that may be expressed by the same genes. Similarly, the characterization of when specific genes are expressed in the parasite life cycle provides an opportunity to functionally annotate these genes that often have no predicted function. Finally, the scRNA-seq data revealed differences in the regulation of P. vivax compared to P. falciparum parasites, particularly at the schizont stage, underscoring the importance of specific studies directed to each of these major human pathogens.

Methods

Ethics statement

All animal procedures were conducted in accordance with the National Institutes of Health (NIH) guidelines and regulations [67], under approved protocols by the National Institute of Allergy and Infectious Diseases (NIAID) Animal Care and Use Committee (ACUC) (protocol LMVR15). Animals were purchased from NIH-approved sources and transported and housed according to Guide for the Care and Use of Laboratory Animals [67].

Animal studies

Four Saimiri boliviensis and 3 Aotus nancymaae New World monkeys were individually infected with 1 of 4 strains of P. vivax (Table 1 and S1 Fig): Chesson, a chloroquine sensitive parasite isolate from New Guinea [68]; Indonesia-I/CDC, a chloroquine resistant isolate from Indonesia [69,70]; AMRU-I, a chloroquine resistant isolate from Papua New Guinea [71]; or NIH-1993, a moderately chloroquine resistant strain closely related to the Salvadorian reference strain Salvador-I [72]. All monkeys were splenectomized before being infected intravenously with parasitized erythrocytes from cryopreserved stocks. Baseline hematocrit, weight, and blood smears were recorded from each animal before inoculation. After infection, hematocrit was measured once a week and parasitemia 3 times a week using approximately 0.05 mL of blood collected from the saphenous vein. Once parasites were observed on thin blood films, hematocrit was checked 3 times a week and parasitemia daily. Blood films were fixed with 100% methanol and stained for 15 minutes with a 20% Giemsa solution. For sample collection, blood draws from the femoral vein were performed on animals anesthetized with 10 mg/kg ketamine, using a laced heparin sodium syringe and a 25-gauge needle. The 3 Aotus monkeys received one dose of chloroquine 3 days after the initial blood sample collection, and we collected an additional blood sample 16 hours after this administration to assess the effect of chloroquine on P. vivax gene expression profiles (S1 Fig).

Sample processing, library preparation, and sequencing

A total of 1 mL of fresh venous blood was obtained from each monkey and centrifuged at 2,500 rpm for 3 minutes to separate and remove the serum. Blood cells were transferred to 50 mL tubes containing RPMI 1640 to a final 0.5% to 1.0% hematocrit solution. Parasitized cells were enriched using MACS LS separation columns [35]. We determined the number and viability of the purified cells from each sample using a BioRad TC20 Automated Cell Counter and loaded, from each sample, an estimated approximately 3,000 cells on a 10X Genomics Chromium controller to prepare a barcoded 3’-end scRNA-seq library according to the manufacturer’s instructions. For each of the 10 libraries, we minimized the amount of processing time so that less than 3 hours elapsed between the blood collection and the cell isolation on the Chromium controller.

The 10 scRNA-seq libraries were sequenced with an Illumina HiSeq 4000 to generate a total of 1,254,873,695 paired-end reads of 75 bp (S1 Table). (One library was sequenced using 101 bp paired-end reads but was trimmed, before alignment, to 75 bp for consistency.) All sequence reads generated are available on NCBI SRA under the Bioproject PRJNA603327.

scRNA-seq analysis

To process the scRNA-seq reads, we developed custom analyses mirroring the 10X CellRanger pipeline, allowing us to more easily control the different analysis parameters and to address the specificities of the P. vivax genome (see below).

We kept only reads containing the accepted 10X Genomics barcodes, and we trimmed scRNA-seq reads extending beyond polyadenylation tails by removing 3’ sequences downstream of 6 consecutive As. All sequences shortened to less than 40 nucleotide long were discarded from further analyses (S1 Table). We then mapped all 1,091,808,431 remaining reads to the P. vivax P01 genome sequence [36] using HISAT2 (version 2.0.4 [73]) with the default parameters, except for a maximum intron length of 5,000 bp. We also separately mapped all reads to the Saimiri boliviensis (SaiBol1.0) or Aotus nancymaae (Anan2.0) genome sequences using HISAT2 and a maximum intron length of 20,000 bp.

To distinguish molecules amplified during the library preparation from genuine multiple copies of mRNAs present in one parasite, we determined for each sample which reads carried the same 16-mer 10X barcode, 10-mer randomizer sequence (or unique molecular identifier), and mapped within 10 bp of the same genomic location and on the same DNA strand. We randomly kept one occurrence of each unique molecule and discarded the others as likely PCR duplicates (S1 Table).

For each sample, we separated the reads derived from each individual parasite using the 10X barcodes. We then calculated the number of reads mapped to each nonoverlapping 500 bp window of each strand of the P. vivax P01 genome and annotated each window with regard to its position relative to the closest gene 3’-end on the same strand (based on PlasmoDB-37 annotations), using the following priorities to account for the gene-rich nature of the P. vivax genome and possible overlaps (S5 Fig): (1) containing an annotated gene 3’-end (“0”); (2) immediately following a window containing a gene 3’-end (“1”); (3) immediately preceding a gene 3’-end (“-1”); (4) 2 windows downstream of a gene 3’-end (“2”); (5) 2 windows upstream of a gene 3’-end (“-2”); (6) 3, 4, or 5 windows downstream of a gene 3’-end (“3,”“4,” and “5,” respectively); and (7) further away from any annotated gene 3’-end (“X”). Finally, we discarded all individual parasite transcriptomes containing less than 5,000 unique reads mapped to the P. vivax genome as well as those containing more than 75,000 unique reads (S1 Table) and normalized the read counts across all parasite transcriptomes by dividing the read counts obtained for each window by the total number of mapped reads obtained for the given parasite. For the gene-based analyses, we combined reads from all windows associated with a given annotated protein-coding gene (with the caveat that this approach might collapse signals from different transcripts). To validate that the scRNA-seq reads mapping far from the 3’-end of annotated genes were genuine, we compared their mapping locations with the positions of the 3’-ends of transcripts predicted from the bulk RNA-seq data of a P. vivax–infected Cambodian patient (C0) using StringTie version 2.1.1 [74].

PCA

To examine the diversity of, and relationships among, individual P. vivax parasite transcriptomes from the different infections, we conducted PCA using data from 13,054 windows containing at least 0.25% of the reads of one parasite, scaling the variance across windows. We then used the first 3 principal components to determine the position of each asexual parasite and female gametocyte along their putative developmental trajectories (later referred to as “pseudotime”) by calculating their Euclidean distance from an arbitrarily selected origin (Fig 1A). Because the male gametocytes formed an apparently homogenous population without much variation along the PCs, we treated all male gametocytes identically in our analyses without calculating developmental pseudotime.

Analysis of gametocyte genes

We calculated, for each individual parasite, the proportion of scRNA-seq reads originating from genes previously defined as characteristic of gametocyte rings (GRs), immature gametocytes (IGs), and mature gametocytes (MGs) [19].

We also screened the scRNA-seq data for genes expressed specifically in gametocytes by selecting all genes for which the mean expression in either male or female gametocytes was at least 10-fold greater than the mean expression in asexual parasites and in the other sex. We then verified that the selected genes were expressed in most gametocytes of the selected sex and absent from other stages. We used these gametocyte-specific genes to rerun the gene co-expression analysis on data previously generated by whole-blood RNA-seq from 26 Cambodian patients [20]: we calculated the Pearson’s R correlation coefficient between the expression level of each pair of selected gametocyte genes and grouped these genes according to their co-expression among patients using unsupervised hierarchical clustering.

Comparison with P. berghei scRNA-seq data

We compared the scRNA-seq data generated from P. vivax parasites to the data generated using 10X scRNA-seq for 4,884 P. berghei from in vivo bloodstream infections [34]. We used PlasmoDB (www.plasmodb.org) to identify 4,265 P. berghei genes with one-to-one ortholog to P. vivax and calculated for each parasite of each species the proportion of the mRNA molecules derived from this subset of genes. We then performed PCA jointly for the parasites of both species using all genes accounting for more than 1% of the mRNAs in at least one parasite and scaling the variance across genes. In addition, we recalculated the pseudotime for the P. berghei parasites as described above and using the principal component coordinates originally calculated by Howick and colleagues using all P. berghei genes and retrieved the stage annotations (i.e., rings, trophozoites, schizonts, and gametocytes) previously determined for each parasite [34].

P. vivax genes with orthologs in the P. falciparum genome

We used PlasmoDB to identify the subset of P. vivax genes with at least one annotated orthologous gene in the P. falciparum 3D7 genome. We then calculated, for each individual P. vivax parasite, the proportion of scRNA-seq reads originating from genes with P. falciparum orthologs.

Statistical analysis

To test for differential gene expression, we first normalized our data with scran [75] and calculated a size factor for each gene after pooling counts across cells to account for the low and zero counts typical of scRNA-seq data. We then imported the count tables and the size factors to perform the statistical analyses with zinbwave [76] and corrected all statistical tests for multiple testing correction using FDRs [77,78].

To evaluate the effect of chloroquine on P. vivax gene expression, we compared the profiles of individual parasites of the same strain in the same Aotus monkey before and 16 hours after chloroquine administration. We conducted these analyses independently for each strain and separately for female gametocytes and asexual parasites, which were further subdivided into 4 crude developmental groups along the intraerythrocytic cycle. In addition, we used the pseudotime determined for each parasite as a covariate in our analyses to further correct for subtle developmental differences. Too few male gametocytes were detected in each sample to support rigorous statistical testing, and these were therefore not included in this analysis. We only considered as differentially expressed windows that were statistically significant (FDR < 0.1) for the same developmental group in all 3 strains and showed expression change in the same direction (e.g., they were all up-regulated upon chloroquine exposure).

We performed a similar analysis to evaluate the effect of the host species on the parasite gene expression and determine which P. vivax transcripts were differentially regulated in Aotus and Saimiri in each developmental stage. For this analysis, the different strains used in both monkey species and the low number of parasites obtained from several Saimiri infections prevented rigorous paired analyses (i.e., comparisons of expression differences between parasites of the same strain in the 2 hosts). We therefore pooled data from parasites obtained from all infections in a given host species together for this analysis.

Genetic analysis

We identified single-nucleotide variants and sequence rearrangements from each P. vivax strain using the sequence information contained in the scRNA-seq reads. First, we examined nucleotide variation at each position of the P. vivax genome sequenced by at least 20 reads in a given sample. For each sample, we calculated the proportion of reads carrying the reference allele at each genomic position and examined the distribution of reference allele frequencies to assess the clonality of each sample [79]. We also investigated 2,279,246 nucleotides sequenced at >20X in all 10 samples to identify variable positions between parasites and calculated the number of nucleotide differences between the P. vivax parasites present in each pair of infections. Second, we calculated the number of reads mapped to nonoverlapping 10-kb windows for each sample and further examined windows without any read in one or more sample to detect possible chromosomal deletions (see S1 Text for details).

Supporting information

S1 Fig. Development of P. vivax parasitemia in nonhuman primates.

The graphs show the changes in parasitemia over time in Aotus (A) and Saimiri (B) monkeys infected with different strains of P. vivax. The large black circles represent, for each animal, the collection points of blood samples used for scRNA-seq library preparation. “CQ” indicates point of single oral administration of chloroquine to the animals (10 mg/kg for the Chesson and AMRU-I infections, 5 mg/kg for the Indonesia-I/CDC infection). Underlying data plotted in panels A and B are provided in S6 Data. scRNA-seq, single-cell RNA sequencing.

(TIFF)

S2 Fig. Host and parasite contribution to the single-cell transcriptomes.

The scatter plot shows the number of monkey (x-axis, in 100,000s) and P. vivax (y-axis, in 100,000s) unique sequences (or UMIs) obtained from each droplet (gray dots) across all samples. Note that most single-cell transcriptomes contain either monkey or P. vivax reads. The few droplets containing high numbers of monkey and P. vivax reads (dots off the x- and y-axes) are likely droplets containing multiple cells. UMI, unique molecular identifiers.

(TIF)

S3 Fig. PCA of the 13,503 individual parasite transcriptomes containing more than 1,000 unique reads.

Each gray dot represents a single P. vivax transcriptome and is displayed according to its gene expression profile along the first 3 principal components. Note that the figure is virtually identical to the PCA generated using the 9,215 transcriptomes with more than 5,000 unique reads (Fig 1A). Underlying data are provided in S7 Data. PCA, principal component analysis.

(TIF)

S4 Fig. Influence of the minimum read cutoff on the parasite stages retrieved.

The graph shows the number of parasites obtained (y-axis) using a minimum read cutoff of 1,000 (red) or 5,000 (green) unique reads per parasite. The parasites are grouped according to their pseudotimes (x-axis) calculated using the full data set and binned into 10 categories for the asexual parasites and female gametocytes from 0 (low pseudotime value) to 9 (high pseudotime value). Underlying data are provided in S8 Data.

(TIF)

S5 Fig. Annotation of the 500-bp windows with regard to annotated P. vivax genes.

Each 500 bp window was annotated based on its distance from the 3’-end of the closest annotated gene. “0” corresponds to a window containing an annotated 3’-end, “1” through “5” indicate windows located 1 to 5 windows of 500 bp downstream of the annotated 3’-end, and “-2”/”-1” upstream of the 3’-end. “X” represents bins located outside of these regions. The annotation was performed independently for each DNA strand. The histogram on top shows the distribution of the windows containing scRNA-seq reads with regard to annotated genes. Underlying data are provided in S9 Data. bp, base pairs; scRNA-seq, single-cell RNA sequencing.

(TIF)

S6 Fig. Example of scRNA-seq data generated using the 10X 3’-end protocol.

The figure shows approximately 20 kb on chromosome 11 between position 736,000 and 756,000. The blue bars at the bottom represent 7 annotated P. vivax genes in this region. The gray histograms represent the number of reads covering a given position for 24 single-cell transcriptomes generated from an infection of an Aotus monkey with AMRU-I. The first 16 rows correspond to asexual parasites along their developmental trajectories (by groups of 4), the next 4 rows represent female gametocytes, and the last 4 rows, male gametocytes. scRNA-seq, single-cell RNA sequencing.

(TIF)

S7 Fig. Animated.mp4 showing the PCA of the 9,215 individual P. vivax parasite transcriptomes.

Each dot represents a single parasite transcriptome and is displayed according to its gene expression profile along the first 3 principal components. PCA, principal component analysis.

(MP4)

S8 Fig. Animated.mp4 showing the PCA of the 9,215 P. vivax (in blue) and 4,884 P. berghei parasites (in pink).

Each dot represents a single parasite transcriptome and is displayed according to its gene expression profile along the first 3 principal components. PCA, principal component analysis.

(MP4)

S9 Fig. PCA of the individual parasite transcriptomes separated by samples.

The figure shows PCA from all individual transcriptomes separated by blood samples. The rows correspond to the different strains of P. vivax used in this study, the columns blood samples collected from Aotus monkeys before and after chloroquine treatment (columns 1 and 2, respectively) or from Saimiri monkeys (column 3). Note that 2 different Saimiri monkeys were infected with the same strain of NIH-1993 (last 2 rows). Underlying data are provided in S10 Data. PCA, principal component analysis.

(TIF)

S10 Fig. Maturity of P. vivax gametocytes circulating in blood.

The panels show the average proportion of transcripts (y-axis, in mRNA molecules per 1,000) derived from P. vivax genes associated with GR (red), IG (green), and MG (blue) based on Obaldia and colleagues, 2018. The left and middle panels show the expression in asexual parasites and male gametocytes, respectively. The right panel displays the average expression of 50 female gametocytes organized according to their pseudotime (left, differentiating; right, fully differentiated). Underlying data are provided in S11 Data. GR, gametocyte ring; IG, immature gametocyte; MG, mature gametocyte.

(TIF)

S11 Fig. Correlations between the expression of male and female gametocyte genes.

The heatmap shows the extent of gene expression correlation (Pearson’s R, in blue-red scale) between 2 gametocyte genes selected from scRNA-seq data across all Cambodian patients characterized in Kim and colleagues, 2019. The bordering trees show the result of unsupervised clustering of these genes according to their gene expression patterns. Underlying data are provided in S12 Data. scRNA-seq, single-cell RNA sequencing.

(TIF)

S12 Fig. Expression of genes used for RDT.

The figures show PCA from all P. vivax transcriptomes colored according to the expression level of, from top to bottom, 2 genes used in RDTs, lactate dehydrogenase (PVP01_1229700) and fructose 1,6-bisphosphate aldolase (PVP01_1262200), and one gene almost ubiquitously expressed (profilin or PVP01_0730900). Each dot represents one single-cell parasite and is colored according to the expression level of that transcript (in red scale). Underlying data are provided in S13 Data. PCA, principal component analysis; RDT, rapid diagnostic test.

(TIF)

S13 Fig. Expression of 28 AP2 domain transcription factors.

Variations in the expression levels of AP2 genes according to the parasite development. Each row represents a different annotated AP2 gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S14 Data.

(TIF)

S14 Fig. Example showing possible alternative 3’-UTR usage.

The figure shows scRNA-seq data from 24 individual parasites for a region of approximately 10 kb on chromosome 9 containing 2 annotated P. vivax genes, PVP01_0941100 (left blue bar) and PVP01_0941200 (right blue bar), both encoded on the + strand. The scRNA-seq sequences generated from for asexual parasites (rows 1–16) map close to the annotated 3’-end of PVP01_0941100, whereas those generated from male gametocytes (last 4 rows) map more than 1 kb downstream. Female gametocytes (rows 17–20) displayed an intermediate profile. This pattern is consistent with the presence of 2 different 3’-UTRs for PVP01_0941100 (although it is also possible that the second peak corresponds to the 3’-UTR of an unannotated P. vivax gene). scRNA-seq, single-cell RNA sequencing; UTR, untranslated region.

(TIF)

S15 Fig. Expression of MSP7-like genes.

The figure shows examples of scRNA-seq data across approximately 30 kb of chromosome 12 containing a cluster of MSP7-like genes (blue bars at the bottom). Note that most MSP genes are specifically expressed in late asexual parasites (rows 5–12) but turned off in very late schizonts (rows 13–16), whereas other MSP7-like genes are uniquely expressed in female and male gametocytes (row 17–24). MSP, merozoite surface protein.

(TIF)

S16 Fig. Expression of PHIST genes.

Variations in the expression levels of members of the PHIST multigene family according to the parasite development. Each row represents a different annotated PHIST gene, and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S15 Data. PHIST, plasmodium helical interspersed subtelomeric.

(TIF)

S17 Fig. Heatmap showing the variation in the time of PIR gene expression.

Variations in the expression levels of PIR genes according to the parasite development. Each row represents a different annotated PIR gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S16 Data.

(TIF)

S18 Fig. Heatmap showing the variation in the time of tryptophan-rich antigen gene expression.

Variations in the expression levels of tryptophan-rich antigen genes according to the parasite development. Each row represents a different annotated tryptophan-rich antigen gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S17 Data.

(TIF)

S19 Fig. Proportion of mRNA transcribed from P. vivax genes with a P. berghei ortholog.

The figure shows the proportion of the mRNA molecules derived from P. vivax genes with a one-to-one P. berghei ortholog (y-axis) according to the parasite developmental stage (x-axis). Purple dots indicate asexual parasites organized according to their developmental pseudotime rank (from trophozoites on the left to schizonts on the right); green dots represent female gametocytes and orange dots male gametocytes. Underlying data are provided in S18 Data.

(TIF)

S20 Fig. Effect of chloroquine and host species on P. vivax gene expression.

(A) Expression of histone H4 (PVP01_0905800) in individual parasites organized along their developmental pseudotime rank (on the x-axis) and colored according to the chloroquine treatment (red, before treatment; blue, after treatment). Only parasite transcriptomes generated from Aotus infections are represented. (B) Expression of the exported protein PVP01_1402200 in individual parasites organized along their developmental pseudotime rank and colored according to the host species (red, Aotus; blue, Saimiri). The y-axis indicates the number of unique reads mapped to these windows in each parasite (in reads per 1,000). Underlying data are provided in S19 Data.

(TIF)

S21 Fig. Possible deletion of DBP2 in monkey-adapted P. vivax strains.

The figure shows all the reads from each P. vivax strain mapped to a 50-kb region surrounding DBP2 (PVP01_0102300, highlighted in red). Virtually no reads mapped in the first 110 kb of this chromosomal arm (including DBP2) possibly indicating a deletion of this region in these monkey-adapted strains. Note that the boundary of this deletion might differ between strains (as shown by the location where read mapping resumed).

(TIF)

S22 Fig. Reference allele frequency plots.

The figure shows reference allele frequency plots generated from each blood sample. Each plot shows the proportion of reads carrying the reference allele (x-axis, for 0 on the left to 1 on the right) for each position sequenced at >20X (y-axis). The rows correspond to the different strains of P. vivax used in this study, the columns blood samples collected from Aotus monkeys before and after chloroquine treatment (columns 1 and 2, respectively) or from Saimiri monkeys (column 3). Note that 2 different Saimiri monkeys were infected with the same strain of NIH-1993 (last 2 rows). Underlying data are provided in S20 Data.

(TIF)

S1 Table. Summary statistics for the alignment and bioinformatics analyses.

The table shows, for each sample, the viability determined after MACS column purification, the number of reads generated, the percentage of reads carrying a 10X barcode (% Accept Barc), the percentage of reads too short after trimming polyA sequences (% Discarded), the percentage of reads mapped to the monkey and P. vivax genomes, the average number of reads per unique sequences (reads/UMI) and the number of single-cell transcriptomes obtained. UMI, unique molecular identifiers.

(XLSX)

S2 Table. Comparison of the results from different Plasmodium scRNA-seq experiments.

For each study, the table indicates the number of single-cell transcriptomes obtained, the average number of unique reads mapped to annotated genes per parasite, the average number of expressed genes detected and the average number of genes contributing to more than 1% of the mRNA molecules. Data from Howick and colleagues, 2019, only include blood parasites characterized on the 10X Genomics platform. scRNA-seq, single-cell RNA sequencing.

(XLSX)

S3 Table. Genes highly expressed in male and female P. vivax gametocytes.

The table shows the 25 genes most highly expressed in male and female gametocytes and provides for each genes the median and range (10%–90%) expression in female, male, and asexual parasites (in number of mRNA molecules per 1,000). The figures for P. berghei are also provided, as well as whether the expression of the gene was detected in Yeoh and colleagues, 2017 (M in male, F in female gametocytes) and in Obaldia and colleagues, 2018 (GR, gametocyte ring; MG, mature gametocytes). Genes highlighted in green are specifically expressed in gametocytes.

(XLSX)

S4 Table. Genes differentially expressed after chloroquine treatment.

The table lists the 500-bp windows deemed differentially expressed after chloroquine exposure (see main text for details). Cells highlighted in green represent genes up-regulated after chloroquine exposure, cells highlighted in red represent genes down-regulated after chloroquine exposure. The capital letters after the strain name indicate the stage of the parasite tested (A to D: asexual parasites from early trophozoites to late schizonts, F: female gametocytes). bp, base pairs.

(XLSX)

S5 Table. Genes differentially expressed genes according to the host species.

The table lists the ten 500-bp windows most differentially expressed according to the host species at each stage (see main text for details). Cells highlighted in green represent genes up-regulated after chloroquine exposure, cells highlighted in red represent genes down-regulated after chloroquine exposure. The capital letters indicate the stage of the parasite tested (A to D: asexual parasites from early trophozoites to late schizonts, F: female gametocytes). bp, base pairs.

(XLSX)

S6 Table. Genetic differences between strains.

The table shows the number of nucleotide differences between the different P. vivax strains used in this study. The cells highlighted in green indicate comparisons between the genotype from the same strain, either in the same monkey (before and after treatment) or in 2 different monkey infections.

(XLSX)

S1 Text. Genetic Analyses.

Description of the assessment of clonality and the number of genetic differences between strains.

(DOCX)

S1 Data. Experimental data used to generate Fig 1.

(XLSX)

S2 Data. Experimental data used to generate Fig 2.

(XLSX)

S3 Data. Experimental data used to generate Fig 3.

(XLSX)

S4 Data. Experimental data used to generate Fig 4.

(XLSX)

S5 Data. Experimental data used to generate Fig 5.

(XLSX)

S6 Data. Experimental data used to generate S1 Fig.

(XLSX)

S7 Data. Experimental data used to generate S3 Fig.

(XLSX)

S8 Data. Experimental data used to generate S4 Fig.

(XLSX)

S9 Data. Experimental data used to generate S5 Fig.

(XLSX)

S10 Data. Experimental data used to generate S9 Fig.

(XLSX)

S11 Data. Experimental data used to generate S10 Fig.

(XLSX)

S12 Data. Experimental data used to generate S11 Fig.

(XLSX)

S13 Data. Experimental data used to generate S12 Fig.

(XLSX)

S14 Data. Experimental data used to generate S13 Fig.

(XLSX)

S15 Data. Experimental data used to generate S16 Fig.

(XLSX)

S16 Data. Experimental data used to generate S17 Fig.

(XLSX)

S17 Data. Experimental data used to generate S18 Fig.

(XLSX)

S18 Data. Experimental data used to generate S19 Fig.

(XLSX)

S19 Data. Experimental data used to generate S20 Fig.

(XLSX)

S20 Data. Experimental data used to generate S22 Fig.

(XLSX)

Acknowledgments

We thank the technicians and veterinaries of the Division of Veterinary Resources, National Institutes of Health, for providing support to this study; S. Ott, Y. Santana-Cruz, L. Sadzewicz, and L. Tallon in the Genomic Resource Center at the University of Maryland School of Medicine for their support with 10X library preparation and Illumina sequencing; and A. Kim, R. Moraes Barros, and T. Gibson for helpful discussions and for critical comments on this manuscript.

Abbreviations

bp

base pairs

CRT

chloroquine resistance transporter

FDR

false discovery rate

GR

gametocyte ring

IG

immature gametocyte

MDR1

multidrug resistance gene 1

MG

mature gametocyte

MSP

merozoite surface protein

qRT-PCR

quantitative reverse transcription polymerase chain reaction

RBC

red blood cell

PCA

principal component analysis

scRNA-seq

single-cell RNA sequencing

UTR

untranslated region

Data Availability

All sequence data generated in this study are deposited at the Sequence Read Archive under the BioProject PRJNA603327 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA603327).

Funding Statement

This work was supported by an award from the National Institutes of Health to the University of Maryland School of Medicine (U19 AI110820) and by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases, National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Phillips MA, Burrows JN, Manyando C, van Huijsduijnen RH, Van Voorhis WC, Wells TNC. Malaria. Nat Rev Dis Primers. 2017;3:17050 10.1038/nrdp.2017.50 . [DOI] [PubMed] [Google Scholar]
  • 2.Coatney R, Collins WE, Warren M, Contacos PG. The Primate Malarias. Bethesda, MD: National Institute of Health; 1971. 366 p. [Google Scholar]
  • 3.Josling GA, Llinas M. Sexual development in Plasmodium parasites: knowing when it's time to commit. Nat Rev Microbiol. 2015;13(9):573–87. 10.1038/nrmicro3519 . [DOI] [PubMed] [Google Scholar]
  • 4.Josling GA, Williamson KC, Llinas M. Regulation of Sexual Commitment and Gametocytogenesis in Malaria Parasites. Annu Rev Microbiol. 2018;72:501–19. 10.1146/annurev-micro-090817-062712 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Otto TD, Wilinski D, Assefa S, Keane TM, Sarry LR, Bohme U, et al. New insights into the blood-stage transcriptome of Plasmodium falciparum using RNA-Seq. Mol Microbiol. 2010;76(1):12–24. 10.1111/j.1365-2958.2009.07026.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Otto TD, Bohme U, Jackson AP, Hunt M, Franke-Fayard B, Hoeijmakers WA, et al. A comprehensive evaluation of rodent malaria parasite genomes and gene expression. BMC Biol. 2014;12:86 10.1186/s12915-014-0086-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Hoo R, Zhu L, Amaladoss A, Mok S, Natalang O, Lapp SA, et al. Integrated analysis of the Plasmodium species transcriptome. EBioMedicine. 2016;7:255–66. 10.1016/j.ebiom.2016.04.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Lemieux JE, Gomez-Escobar N, Feller A, Carret C, Amambua-Ngwa A, Pinches R, et al. Statistical estimation of cell-cycle progression and lineage commitment in Plasmodium falciparum reveals a homogeneous pattern of transcription in ex vivo culture. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(18):7559–64. 10.1073/pnas.0811829106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Adjalley SH, Scanfeld D, Kozlowski E, Llinas M, Fidock DA. Genome-wide transcriptome profiling reveals functional networks involving the Plasmodium falciparum drug resistance transporters PfCRT and PfMDR1. BMC Genomics. 2015;16:1090 10.1186/s12864-015-2320-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Le Roch KG, Zhou Y, Blair PL, Grainger M, Moch JK, Haynes JD, et al. Discovery of gene function by expression profiling of the malaria parasite life cycle. Science. 2003;301(5639):1503–8. 10.1126/science.1087025 . [DOI] [PubMed] [Google Scholar]
  • 11.Lapp SA, Mok S, Zhu L, Wu H, Preiser PR, Bozdech Z, et al. Plasmodium knowlesi gene expression differs in ex vivo compared to in vitro blood-stage cultures. Malar J. 2015;14:110 10.1186/s12936-015-0612-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Blythe JE, Yam XY, Kuss C, Bozdech Z, Holder AA, Marsh K, et al. Plasmodium falciparum STEVOR proteins are highly expressed in patient isolates and located in the surface membranes of infected red blood cells and the apical tips of merozoites. Infection and immunity. 2008;76(7):3329–36. 10.1128/IAI.01460-07 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Howes RE, Battle KE, Mendis KN, Smith DL, Cibulskis RE, Baird JK, et al. Global Epidemiology of Plasmodium vivax. The American journal of tropical medicine and hygiene. 2016. 10.4269/ajtmh.16-0141 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.WHO. World malaria report 2018. 2018.
  • 15.Bozdech Z, Mok S, Hu G, Imwong M, Jaidee A, Russell B, et al. The transcriptome of Plasmodium vivax reveals divergence and diversity of transcriptional regulation in malaria parasites. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(42):16290–5. 10.1073/pnas.0807404105 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Zhu L, Mok S, Imwong M, Jaidee A, Russell B, Nosten F, et al. New insights into the Plasmodium vivax transcriptome using RNA-Seq. Sci Rep. 2016;6:20498 10.1038/srep20498 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Westenberger SJ, McClean CM, Chattopadhyay R, Dharia NV, Carlton JM, Barnwell JW, et al. A systems-based analysis of Plasmodium vivax lifecycle transcription from human to mosquito. PLoS Negl Trop Dis. 2010;4(4):e653 10.1371/journal.pntd.0000653 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Kim A, Popovici J, Vantaux A, Samreth R, Bin S, Kim S, et al. Characterization of P. vivax blood stage transcriptomes from field isolates reveals similarities among infections and complex gene isoforms. Sci Rep. 2017;7(1):7761 10.1038/s41598-017-07275-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Obaldia N 3rd, Meibalan E, Sa JM, Ma S, Clark MA, Mejia P, et al. Bone Marrow Is a Major Parasite Reservoir in Plasmodium vivax Infection. MBio. 2018;9(3). 10.1128/mBio.00625-18 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Kim A, Popovici J, Menard D, Serre D. Plasmodium vivax transcriptomes reveal stage-specific chloroquine response and differential regulation of male and female gametocytes. Nature communications. 2019;10(1):371 10.1038/s41467-019-08312-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Baker DA. Malaria gametocytogenesis. Molecular and biochemical parasitology. 2010;172(2):57–65. 10.1016/j.molbiopara.2010.03.019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Gautret P, Motard A. Periodic infectivity of Plasmodium gametocytes to the vector. A review. Parasite. 1999;6(2):103–11. 10.1051/parasite/1999062103 . [DOI] [PubMed] [Google Scholar]
  • 23.Yang B. Experimental and field studies on some biological characteristics of Plasmodium vivax isolated from tropical and temperate zones of China. Chin Med J (Engl). 1996;109(4):266–71. . [PubMed] [Google Scholar]
  • 24.Adams JH, Sim BK, Dolan SA, Fang X, Kaslow DC, Miller LH. A family of erythrocyte binding proteins of malaria parasites. Proceedings of the National Academy of Sciences of the United States of America. 1992;89(15):7085–9. 10.1073/pnas.89.15.7085 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Hayton K, Gaur D, Liu A, Takahashi J, Henschen B, Singh S, et al. Erythrocyte binding protein PfRH5 polymorphisms determine species-specific pathways of Plasmodium falciparum invasion. Cell Host Microbe. 2008;4(1):40–51. 10.1016/j.chom.2008.06.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Cowman AF, Tonkin CJ, Tham WH, Duraisingh MT. The Molecular Basis of Erythrocyte Invasion by Malaria Parasites. Cell Host Microbe. 2017;22(2):232–45. 10.1016/j.chom.2017.07.003 . [DOI] [PubMed] [Google Scholar]
  • 27.Malleret B, Li A, Zhang R, Tan KS, Suwanarusk R, Claser C, et al. Plasmodium vivax: restricted tropism and rapid remodeling of CD71-positive reticulocytes. Blood. 2015;125(8):1314–24. 10.1182/blood-2014-08-596015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Barber BE, William T, Grigg MJ, Parameswaran U, Piera KA, Price RN, et al. Parasite biomass-related inflammation, endothelial activation, microvascular dysfunction and disease severity in vivax malaria. PLoS Pathog. 2015;11(1):e1004558 10.1371/journal.ppat.1004558 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Anstey NM, Russell B, Yeo TW, Price RN. The pathophysiology of vivax malaria. Trends in parasitology. 2009;25(5):220–7. 10.1016/j.pt.2009.02.003 . [DOI] [PubMed] [Google Scholar]
  • 30.Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, et al. Massively parallel digital transcriptional profiling of single cells. Nature communications. 2017;8:14049 10.1038/ncomms14049 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Poran A, Notzel C, Aly O, Mencia-Trinchant N, Harris CT, Guzman ML, et al. Single-cell RNA sequencing reveals a signature of sexual commitment in malaria parasites. Nature. 2017;551(7678):95–9. 10.1038/nature24280 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Brancucci NMB, De Niz M, Straub TJ, Ravel D, Sollelis L, Birren BW, et al. Probing Plasmodium falciparum sexual commitment at the single-cell level. Wellcome Open Res. 2018;3:70 10.12688/wellcomeopenres.14645.4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Reid AJ, Talman AM, Bennett HM, Gomes AR, Sanders MJ, Illingworth CJR, et al. Single-cell RNA-seq reveals hidden transcriptional variation in malaria parasites. Elife. 2018;7 10.7554/eLife.33105 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Howick VM, Russell AJC, Andrews T, Heaton H, Reid AJ, Natarajan K, et al. The Malaria Cell Atlas: Single parasite transcriptomes across the complete Plasmodium life cycle. Science. 2019;365(6455). 10.1126/science.aaw2619 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Ribaut C, Berry A, Chevalley S, Reybier K, Morlais I, Parzy D, et al. Concentration and purification by magnetic separation of the erythrocytic stages of all human Plasmodium species. Malaria journal. 2008;7:45 10.1186/1475-2875-7-45 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Auburn S, Bohme U, Steinbiss S, Trimarsanto H, Hostetler J, Sanders M, et al. A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes. Wellcome Open Res. 2016;1:4 10.12688/wellcomeopenres.9876.1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Mueller I, Galinski MR, Baird JK, Carlton JM, Kochar DK, Alonso PL, et al. Key gaps in the knowledge of Plasmodium vivax, a neglected human malaria parasite. The Lancet infectious diseases. 2009;9(9):555–66. 10.1016/S1473-3099(09)70177-X . [DOI] [PubMed] [Google Scholar]
  • 38.Kitchen SF. The Infection of Reticulocytes by Plasmodium vivax. The American journal of tropical medicine and hygiene. 1938;18(4):347–59. [Google Scholar]
  • 39.Geminard C, de Gassart A, Vidal M. Reticulocyte maturation: mitoptosis and exosome release. Biocell. 2002;26(2):205–15. . [PubMed] [Google Scholar]
  • 40.Lee E, Choi HS, Hwang JH, Hoh JK, Cho YH, Baek EJ. The RNA in reticulocytes is not just debris: it is necessary for the final stages of erythrocyte formation. Blood Cells Mol Dis. 2014;53(1–2):1–10. 10.1016/j.bcmd.2014.02.009 . [DOI] [PubMed] [Google Scholar]
  • 41.Baro B, Deroost K, Raiol T, Brito M, Almeida AC, de Menezes-Neto A, et al. Plasmodium vivax gametocytes in the bone marrow of an acute malaria patient and changes in the erythroid miRNA profile. PLoS Negl Trop Dis. 2017;11(4):e0005365 Epub 2017/04/07. 10.1371/journal.pntd.0005365 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Meerstein-Kessel L, Andolina C, Carrio E, Mahamar A, Sawa P, Diawara H, et al. A multiplex assay for the sensitive detection and quantification of male and female Plasmodium falciparum gametocytes. Malaria journal. 2018;17(1):441 10.1186/s12936-018-2584-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Collins WE. Nonhuman primate models. II. Infection of Saimiri and Aotus monkeys with Plasmodium vivax. Methods in molecular medicine. 2002;72:85–92. Epub 2002/07/20. 10.1385/1-59259-271-6:85 . [DOI] [PubMed] [Google Scholar]
  • 44.Talman AM, Lacroix C, Marques SR, Blagborough AM, Carzaniga R, Menard R, et al. PbGEST mediates malaria transmission to both mosquito and vertebrate host. Mol Microbiol. 2011;82(2):462–74. 10.1111/j.1365-2958.2011.07823.x . [DOI] [PubMed] [Google Scholar]
  • 45.Yeoh LM, Goodman CD, Mollard V, McFadden GI, Ralph SA. Comparative transcriptomics of female and male gametocytes in Plasmodium berghei and the evolution of sex in alveolates. BMC Genomics. 2017;18(1):734 10.1186/s12864-017-4100-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Arredondo SA, Kappe SHI. The s48/45 six-cysteine proteins: mediators of interaction throughout the Plasmodium life cycle. Int J Parasitol. 2017;47(7):409–23. 10.1016/j.ijpara.2016.10.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Miao J, Fan Q, Cui L, Li J, Li J, Cui L. The malaria parasite Plasmodium falciparum histones: organization, expression, and acetylation. Gene. 2006;369:53–65. 10.1016/j.gene.2005.10.022 . [DOI] [PubMed] [Google Scholar]
  • 48.Ransom M, Dennehey BK, Tyler JK. Chaperoning histones during DNA replication and repair. Cell. 2010;140(2):183–95. 10.1016/j.cell.2010.01.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Davila Lopez M, Samuelsson T. Early evolution of histone mRNA 3' end processing. RNA. 2008;14(1):1–10. 10.1261/rna.782308 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Pettitt J, Crombie C, Schumperli D, Muller B. The Caenorhabditis elegans histone hairpin-binding protein is required for core histone gene expression and is essential for embryonic and postembryonic cell division. J Cell Sci. 2002;115(Pt 4):857–66. . [DOI] [PubMed] [Google Scholar]
  • 51.Dzakah EE, Kang K, Ni C, Wang H, Wu P, Tang S, et al. Plasmodium vivax aldolase-specific monoclonal antibodies and its application in clinical diagnosis of malaria infections in China. Malaria journal. 2013;12:199 10.1186/1475-2875-12-199 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Lee N, Baker J, Bell D, McCarthy J, Cheng Q. Assessing the genetic diversity of the aldolase genes of Plasmodium falciparum and Plasmodium vivax and its potential effect on performance of aldolase-detecting rapid diagnostic tests. J Clin Microbiol. 2006;44(12):4547–9. 10.1128/JCM.01611-06 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Makler MT, Piper RC, Milhous WK. Lactate dehydrogenase and the diagnosis of malaria. Parasitol Today. 1998;14(9):376–7. 10.1016/s0169-4758(98)01284-8 . [DOI] [PubMed] [Google Scholar]
  • 54.Painter HJ, Campbell TL, Llinas M. The Apicomplexan AP2 family: integral factors regulating Plasmodium development. Molecular and biochemical parasitology. 2011;176(1):1–7. 10.1016/j.molbiopara.2010.11.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Campbell TL, De Silva EK, Olszewski KL, Elemento O, Llinas M. Identification and genome-wide prediction of DNA binding specificities for the ApiAP2 family of regulators from the malaria parasite. PLoS Pathog. 2010;6(10):e1001165 10.1371/journal.ppat.1001165 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Modrzynska K, Pfander C, Chappell L, Yu L, Suarez C, Dundas K, et al. A Knockout Screen of ApiAP2 Genes Reveals Networks of Interacting Transcriptional Regulators Controlling the Plasmodium Life Cycle. Cell Host Microbe. 2017;21(1):11–22. 10.1016/j.chom.2016.12.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Beeson JG, Drew DR, Boyle MJ, Feng G, Fowkes FJ, Richards JS. Merozoite surface proteins in red blood cell invasion, immunity and vaccines against malaria. FEMS Microbiol Rev. 2016;40(3):343–72. 10.1093/femsre/fuw001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Baker J, Ho MF, Pelecanos A, Gatton M, Chen N, Abdullah S, et al. Global sequence variation in the histidine-rich proteins 2 and 3 of Plasmodium falciparum: implications for the performance of malaria rapid diagnostic tests. Malar J. 2010;9:129 Epub 2010/05/18. 10.1186/1475-2875-9-129 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.del Portillo HA, Fernandez-Becerra C, Bowman S, Oliver K, Preuss M, Sanchez CP, et al. A superfamily of variant genes encoded in the subtelomeric region of Plasmodium vivax. Nature. 2001;410(6830):839–42. 10.1038/35071118 . [DOI] [PubMed] [Google Scholar]
  • 60.Loy DE, Liu W, Li Y, Learn GH, Plenderleith LJ, Sundararaman SA, et al. Out of Africa: origins and evolution of the human malaria parasites Plasmodium falciparum and Plasmodium vivax. Int J Parasitol. 2017;47(2–3):87–97. 10.1016/j.ijpara.2016.05.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Silberhorn E, Schwartz U, Loffler P, Schmitz S, Symelka A, de Koning-Ward T, et al. Plasmodium falciparum Nucleosomes Exhibit Reduced Stability and Lost Sequence Dependent Nucleosome Positioning. PLoS Pathog. 2016;12(12):e1006080 10.1371/journal.ppat.1006080 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Gunalan K, Sa JM, Moraes Barros RR, Anzick SL, Caleon RL, Mershon JP, et al. Transcriptome profiling of Plasmodium vivax in Saimiri monkeys identifies potential ligands for invasion. Proceedings of the National Academy of Sciences of the United States of America. 2019;116(14):7053–61. 10.1073/pnas.1818485116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Wertheimer SP, Barnwell JW. Plasmodium vivax interaction with the human Duffy blood group glycoprotein: identification of a parasite receptor-like protein. Exp Parasitol. 1989;69(4):340–50. 10.1016/0014-4894(89)90083-0 . [DOI] [PubMed] [Google Scholar]
  • 64.Hester J, Chan ER, Menard D, Mercereau-Puijalon O, Barnwell J, Zimmerman PA, et al. De novo assembly of a field isolate genome reveals novel Plasmodium vivax erythrocyte invasion genes. PLoS Negl Trop Dis. 2013;7(12):e2569 10.1371/journal.pntd.0002569 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Gunalan K, Lo E, Hostetler JB, Yewhalaw D, Mu J, Neafsey DE, et al. Role of Plasmodium vivax Duffy-binding protein 1 in invasion of Duffy-null Africans. Proceedings of the National Academy of Sciences of the United States of America. 2016;113(22):6271–6. 10.1073/pnas.1606113113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Roesch C, Popovici J, Bin S, Run V, Kim S, Ramboarina S, et al. Genetic diversity in two Plasmodium vivax protein ligands for reticulocyte invasion. PLoS Negl Trop Dis. 2018;12(10):e0006555 10.1371/journal.pntd.0006555 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.National Research Council. Guide for the Care and Use of Laboratory Animals: Eighth Edition Washington, DC: The National Academies Press; 2011. 246 p. 10.1258/la.2010.010031 [DOI] [Google Scholar]
  • 68.Ehrman FC, Ellis JM, Young MD. Plasmodium Vivax Chesson Strain. Science. 1945;101(2624):377 10.1126/science.101.2624.377 . [DOI] [PubMed] [Google Scholar]
  • 69.Schwartz IK, Lackritz EM, Patchen LC. Chloroquine-resistant Plasmodium vivax from Indonesia. N Engl J Med. 1991;324(13):927 10.1056/NEJM199103283241317 . [DOI] [PubMed] [Google Scholar]
  • 70.Collins WE, Schwartz IK, Skinner JC, Morris C, Filipski VK. The susceptibility of the Indonesian I/CDC strain of Plasmodium vivax to chloroquine. J Parasitol. 1992;78(2):344–9. . [PubMed] [Google Scholar]
  • 71.Cooper RD, Rieckmann KH. Efficacy of amodiaquine against a chloroquine-resistant strain of Plasmodium vivax. Transactions of the Royal Society of Tropical Medicine and Hygiene. 1990;84(4):473 10.1016/0035-9203(90)90003-w . [DOI] [PubMed] [Google Scholar]
  • 72.Sa JM, Kaslow SR, Moraes Barros RR, Brazeau NF, Parobek CM, Tao D, et al. Plasmodium vivax chloroquine resistance links to pvcrt transcription in a genetic cross. Nature communications. 2019;10(1):4300 10.1038/s41467-019-12256-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60. 10.1038/nmeth.3317 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33(3):290–5. 10.1038/nbt.3122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Lun AT, McCarthy DJ, Marioni JC. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 2016;5:2122 10.12688/f1000research.9501.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Risso D, Perraudeau F, Gribkova S, Dudoit S, Vert JP. A general and flexible method for signal extraction from single-cell RNA-seq data. Nature communications. 2018;9(1):284 10.1038/s41467-017-02554-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(16):9440–5. 10.1073/pnas.1530509100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Serie B. 1995;57(1):289–300. [Google Scholar]
  • 79.Chan ER, Menard D, David PH, Ratsimbasoa A, Kim S, Chim P, et al. Whole genome sequencing of field isolates provides robust characterization of genetic diversity in Plasmodium vivax. PLoS Negl Trop Dis. 2012;6(9):e1811 10.1371/journal.pntd.0001811 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Di Jiang

5 Jul 2019

Dear Dr Serre,

Thank you for submitting your manuscript entitled "Single-cell transcriptomes of Plasmodium vivax parasites reveal extensive but continuous changes in gene expression during their intraerythrocytic cycle" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff as well as by an academic editor with relevant expertise and I am writing to let you know that we would like to send your submission out for external peer review as a Resources paper if you agree.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

**Important**: Please also see below for further information regarding completing the MDAR reporting checklist. The checklist can be accessed here: https://plos.io/MDARChecklist

Please re-submit your manuscript and the checklist, within two working days, i.e. by Jul 07 2019 11:59PM.

Login to Editorial Manager here: https://www.editorialmanager.com/pbiology

During resubmission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF when you re-submit.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Di Jiang, PhD

Associate Editor

PLOS Biology

==================

INFORMATION REGARDING THE REPORTING CHECKLIST:

PLOS Biology is pleased to support the "minimum reporting standards in the life sciences" initiative (https://osf.io/preprints/metaarxiv/9sm4x/). This effort brings together a number of leading journals and reproducibility experts to develop minimum expectations for reporting information about Materials (including data and code), Design, Analysis and Reporting (MDAR) in published papers. We believe broad alignment on these standards will be to the benefit of authors, reviewers, journals and the wider research community and will help drive better practise in publishing reproducible research.

We are therefore participating in a community pilot involving a small number of life science journals to test the MDAR checklist. The checklist is intended to help authors, reviewers and editors adopt and implement the minimum reporting framework.

IMPORTANT: We have chosen your manuscript to participate in this trial. The relevant documents can be located here:

MDAR reporting checklist (to be filled in by you): https://plos.io/MDARChecklist

**We strongly encourage you to complete the MDAR reporting checklist and return it to us with your full submission, as described above. We would also be very grateful if you could complete this author survey:

https://forms.gle/seEgCrDtM6GLKFGQA

Additional background information:

Interpreting the MDAR Framework: https://plos.io/MDARFramework

Please note that your completed checklist and survey will be shared with the minimum reporting standards working group. However, the working group will not be provided with access to the manuscript or any other confidential information including author identities, manuscript titles or abstracts. Feedback from this process will be used to consider next steps, which might include revisions to the content of the checklist. Data and materials from this initial trial will be publicly shared in September 2019. Data will only be provided in aggregate form and will not be parsed by individual article or by journal, so as to respect the confidentiality of responses.

Please treat the checklist and elaboration as confidential as public release is planned for September 2019.

We would be grateful for any feedback you may have.

Decision Letter 1

Di Jiang

20 Aug 2019

Dear Dr Serre,

Thank you very much for submitting your manuscript "Single-cell transcriptomes of Plasmodium vivax parasites reveal extensive but continuous changes in gene expression during their intraerythrocytic cycle" for consideration as a Research Article at PLOS Biology. Your manuscript has been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by three independent reviewers.

In light of the reviews (including the review from the Academic Editor), we would welcome resubmission of a much-revised version that takes into account the reviewers' and Academic Editor's comments as a Methods and Resources article. Our Academic Editor provides the following guidance to you on how to address the points raised by the reviewers.

Regarding reviewer 1's comments: You will need to address points 1 and 2 re normalisation, which are important. You will also need to provide a more in-depth comparison with the P. falciparum scRNAseq in response to point 4.

Regarding reviewer 2's comments: Paragraphs 2 and 3 re analysis method with a focus on genes/regions/windows analysed are important and must be addressed. Paragraph 4 suggests to extend and deepen the analysis to entire genome and comparison with other published scRNAseq of other plasmodium species is important and must be addressed. Paragraphs 5 and 6 are valid technical criticisms. Paragraph 8 suggests to exploit the scRNAseq thoroughly by looking for patterns of co-expression of TFs etc. This is important as reviewer 1 suggests the same in point 4. Both reviewers imply this identification of co-regulated genes is an additional analysis that would add considerable value. As it is the principal value that would raise the paper above the level of published bulk RNAseq, we ask you to meet this request from the reviewers. All minor comments are valid and should be addressed.

Regarding reviewer 3's comments: This reviewer has a large number of comments, most of which are reasonable and you will need to address them. The request for a de novo transcriptome would provide a useful resource. scRNAseq could allow the correct assignation of splice sites and utrs to specific isoforms if the coverage was adequate. There are very many putative AP2 binding sites in the P. falciparum genome and inferring a functional role for an over-represented, predicted DNA cis regulatory motif really requires functional analysis. However it is a simple analysis to do and detection of a shared AP2 binding motif would add value to inferred groups of co-regulated transcripts.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent for further evaluation by the reviewers.

Your revisions should address the specific points made by each reviewer. Please submit a file detailing your responses to the editorial requests and a point-by-point response to all of the reviewers' comments that indicates the changes you have made to the manuscript. In addition to a clean copy of the manuscript, please upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type. You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

Before you revise your manuscript, please review the following PLOS policy and formatting requirements checklist PDF: http://journals.plos.org/plosbiology/s/file?id=9411/plos-biology-formatting-checklist.pdf. It is helpful if you format your revision according to our requirements - should your paper subsequently be accepted, this will save time at the acceptance stage.

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

For manuscripts submitted on or after 1st July 2019, we require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements.

Upon resubmission, the editors will assess your revision and if the editors and Academic Editor feel that the revised manuscript remains appropriate for the journal, we will send the manuscript for re-review. We aim to consult the same Academic Editor and reviewers for revised manuscripts but may consult others if needed.

We expect to receive your revised manuscript within two months. Please email us (plosbiology@plos.org) to discuss this if you have any questions or concerns, or would like to request an extension. At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not wish to submit a revision and instead wish to pursue publication elsewhere, so that we may end consideration of the manuscript at PLOS Biology.

When you are ready to submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Di Jiang, PhD

Associate Editor

PLOS Biology

*****************************************************

Reviewer remarks:

Reviewer #1: The current manuscript describes application of a single-cell sequencing approach to characterising stage specific transcriptomes for blood-stages of Plasmodium vivax taken from experimental infected primates. The work is very interesting and the potential identification of stage-specific markers, particularly for gametocyte stages, is of high value.

Major comments:

1. I have some significant concerns about the normalization strategy used in the study. The authors have mapped the data using standard and appropriate tools and removed PCR duplicate reads, which is appropriate and considered the current gold-standand approach for scRNA-seq data. However, to normalize the data, they have discarded low and high coverage outliers and then normalized their read count data (binned by mapping window) based on the total reads generated for the cell. This approach is essentially the "transcripts per million" transformation used in bulk RNA-seq analyses. This is fine, but even for bulk data, it is necessary to use more sophisticated normalization methods (e.g., TMM or UQ normalization) prior to TPM calculation, which are more robust to large differences in sequencing depth and other potential biases in RNA-seq data.

For scRNA-seq, the preferred approach is to include quantitative spike-in controls to support data normalization. In the absence of this, it is necessary to use normalization strategies specifically suited to scRNA-seq data. Removal of high and low coverage outlier cells would help with this, but it is not considered enough on its own. A recent study hosted on BioRxiv (https://www.biorxiv.org/content/early/2019/03/19/583013.full.pdf) systematically evaluates scRNA-seq analytical approaches and highlights the appropriate data normalization as the most critical step in this process. In my opinion, the authors need to use a more robust normalization strategy, such as SCnorm (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5473255/) before reappraising the other steps in their study. I realise this is inconvenient and time-consuming, but I don't think it is possible to confidently make major conclusions based on the current analytical approach.

2. Following from this, the authors describing using edgeR for paired analyses of their data. This is not described in detail, but I take this to mean for differential gene expression analyses. If so, the details of these analyses need to be provided (What significance threshold was used? Did the authors used a Fischer's test or a linear model (i.e., Limma) for differentational calling?). Further and more importantly, EdgeR is a robust package for RNAseq and works very well for bulk RNAseq data, but it was not designed for scRNA-seq data specifically. In the absence of control spike-ins, it is critical that the data be appropriately and robustly normalized before using EdgeR (as noted above). A detailed comparative analysis of methods for calling differentially transcribed genes in scRNA-seq data can be found here, for example, doi:10.1038/nmeth.4612 and highlights other important steps including determining dropout rates and imputing missing gene values. For a example of specific applications in Plasmodium, the authors could also look at https://www.sciencedirect.com/science/article/pii/S0014482718306438, which used an Upper-Quartile normalization approach coupled with an scRNA-seq aware differential expression caller (SCDE). Reid et al, eLife, 2018 provides another good example and includes a approach for imputing missing gene data, which is an important step (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5871331/).

3. Results - page 8 - "Although accumulation of specific P. vivax stages has been reported in host tissues such as bone marrow the patterns, extent, and mechanisms of sequestration differ substantially from those of P. falciparum, and in many P. vivax infections, young as well as mature asexual and sexual stages can be found simultaneously in the circulation 27,40. Consistent with these observations, principal component analysis of the individual transcriptomes revealed distinct populations of blood stage parasites." - I'm not sure about the connection between the first and second sentence here. I understand the idea that P. vivax differs from P. falciparum in tissue sequestration patterns, but I don't see the direct connection to the distinct transcriptomic populations in the blood-stage data. Specifically, how do the authors differentiate that from simply different stage-specific transcriptomes? Do these population clusters differ markedly from the clusters seen in currently published scRNA-seq datasets for P. falciparum.

4. I feel the authors could make much more thorough use of available scRNA-seq data for blood-stages of Plasmodium. I understand the authors point in the introduction that prior studies have looked at cultured RBCs rather than cells harvested from an infected host. This is an important point, but in my view, that is all the more reason to compare the data more thoroughly. At present, the only comparison between P. vivax and P. falciparum in the current analysis is restricted to quantifying the proportion of the P. vivax transcriptome associated with genes that have or don't have a P. falciparum orthology. Why not also look at the relative stage specificity of these orthologs in existing scRNA-seq data? Ultimately, it is of course the authors' choice as to what they choose to include in their study. This is not a trivial undertaking and it may be beyond the scope of what the authors' wish to do here. I also am aware that nothing frustrates an author more than a reviewer commenting on a study they think the authors could have done, rather than the one they did do. That said, I think such an analysis would add to the impact of the study and is worth considering.

Overall comments: This study is interesting and provides very valuable data that will be of broad interest to the field. I think identification of signals and markers of sexual-stage differentiation in P. vivax are important and again, will be of broad significance. However, the paper as written has some limitations in my view that must be addressed. Most significantly in the analytical stages in relation to the data normalization and differential gene expression analyses. These approaches are largely acceptable for bulk RNAseq data (with some caveats as noted above) but are not directly transferrable to scRNA-seq data. In my view, this has to be addressed before any further consideration of the manuscript can be made. I think the authors can make more of the comparative analyses with existing scRNAseq data for blood-stages of other Plasmodium species, but this is my subjective opinion and if the authors' choose not to expand the scope of their study, I won't stand in the way.

Reviewer #2: Major comments:

In general, the data for this paper are highly valuable and impressive but the paper somewhat suffers from an underwhelming use of the data that should be rectified before publication.

This paper presents a bespoke analysis of P. vivax single cell data and would benefit from demonstrating that their approach is robust by comparing it to some of the more commonly used tools, for example at the very least presenting the CellRanger results. If the annotation of the genome is an issue and leads to poor results from CellRanger, then they could consider extending the annotated genes by a set amount from each start and stop (e.g., include 500bp up and downstream for each gene). It is unclear why they do not use popular and useful tools, and instead go about creating their own methods of analysis.

The authors’ use of windows rather than genes should be tempered by providing the gene-based analysis first. It is somewhat concerning that the screenshots of IGV seem to show transcription primarily outside of gene boundaries. This may very well be UTRs and indeed, the patterns of transcription that are presented support the notion that windows approach works, but it would still be good to provide insight into differences in patterns or indeed loss of information when using only annotated gene boundaries. Some stats on genes with and without annotated UTRs would also be important, as if UTRs are annotated, then the windows approach becomes less robust.

The authors state that they have observed “exquisite regulation of P. vivax transcription along the intraerythrocytic life cycle” but support this claim with a heatmap of a small subset of genes. A more global analysis of the transcriptional changes over developmental time in needed. Do you observe abrupt patterns of expression as seen in other single cell Plasmodium data sets (Reid et al 2018), and are the timing of expression patterns similar to other species (e.g. Poran et al 2017, Howick et al 2019)?

The authors state that they have controlled for development by including pseudotime as a covariate. It’s important that they illustrate how this correction affects the results, as it seems to not capture how genes would be differentially expressed in a pseudotime dependent manner.

The authors conclude that because the putative male population of cells doesn’t express genes associated with mature gametocytes that the known marker genes for males are not accurate. That these are definitely mature male gametocytes is unconvincing, and a more in-depth comparison to other published expression data sets in P. falciparum and P. berghei in needed including this recent biorxiv paper: van Biljon R, et al bioRxiv. 2019. p. 633222. doi:10.1101/633222.

Throughout the paper the authors focus on individual genes or subsets of genes but don’t include any functional follow up or global analysis to support their conclusions. For example the authors look at expression of the MSPs and PIRs but discuss them in the context of all gene families. Is this pattern generalizable? Can the authors comment on function about the other gene families given their patterns of expression?

DE assessments are not really using advantages of having single cell data at all, and presumably could have been done using bulk approaches with various purification methods to gather different stages. Consider fixing this. Also consider incorporating more of the power that single cell approaches give to examine things like variable gene expression, transcription factor coexpression, etc.

Minor comments

1. Many minor grammatical issues throughout.

2. The statement in the introduction “Unfortunately, the requirement for fresh and viable parasites has so far precluded extension of such studies to other Plasmodium species” should say instead something about “unculturable human species”. Technically, this is also inaccurate as P.malariae is present in the Howick et al bioRxiv paper.

3. The authors state “We determined the number and viability of the purified cells” is stated in the methods, but no information is given on the results of the viability assays. Given the cells were enriched with MACS, the viability should be reported as well as when the viability assessments were made, given that three hours from sample to loading is a very long time. Were samples on ice this entire time?

4. Not enough detail on loading for 10x and how this was accounted for given different parasitemias; not enough detail on why the runs varied so much in cells recovered.

5. Was there any attempt to get rid of WBCs? Provide a basic analysis of the pure monkey cells to understand if these are RBCs or WBCs.

6. The number of lanes of HiSeq 4000 used should be reported.

7. hisat2 is more typically referred to as HISAT2

8. Fig S7 should have the treatments/host labeled on it so that one can quickly identify which experiments are chloroquine treated, etc.

9. Fig S10 purports to show that sexual and asexual parasites might use different UTRs for the same gene. From looking at the figure, it is also equally possible that those sexual parasites are just expressing the next gene along.

10. “Most P. vivax genes remain unannotated..”, would be perhaps better stated as “..remain incompletely annotated”. With >6000 annotated genes in the P. vivax genome, it seems unlikely that there remain >6000 yet to be annotated.

11. Fig 4 also needs some kind of normalizer that shows the total number of genes identified along pseudotime rather than just the proportion that have a P. falciparum ortholog.

12. The authors state “To statistically determine whether specific transcripts were differentially regulated upon chloroquine treatment, we compared the gene expression profiles of the parasites from the same Aotus infection before and 20-hour post- treatment, adjusting for differences in developmental stages” needs further elaboration on how you adjust for differences in developmental stages.

13. No comment is made on the drug resistance status of the strains used; do any of them carry mutations associated with drug resistance?

14. When discussing the different host species, it would be beneficial to remind the reader an estimate of how many rounds of re-invasion (length of time post inoculation) the parasites have been through.

15. When referring to reference 57, which seems to investigate only Saimiri monkeys, it would be useful to discuss the locations from where the various P. vivax strains under discussion originate from. Indeed, this information appears to be completely missing from the methods/materials and I can only find strain names in table 1, with no explanation on what these are or where they originate from.

16. Figure S4 is meant to show that the data do not change based on the QC cut off on reads per cell. However, it only demonstrated that the general shape of the PCA is the same (and is really hard to see the points in grey). It would be helpful to quantify how each developmental stage or cluster is represented depending on the cut off used.

17. Figure 3 illustrates non-overlapping expression of 4 AP2 transcription factors. How many of the ~27 AP2s were detected and how common was this stage-specific expression?

18. Figure 5: label axes

19. The authors state “Finally, we discarded all individual parasite transcriptomes containing less than 5,000 unique reads mapped to the P. vivax genome as well as those containing more than 75,000 unique reads (Table S1)” & “Figure S4. Principal component analysis of the 13,503 individual parasite transcriptomes each containing more than 1,000 unique reads. Each dot represents a single parasite transcriptome and is displayed according to its gene expression profile along the first three principal components. Note that the figure is virtually identical to the PCA generated using the 9,215 transcriptomes with more than 5,000 unique reads (Figure 1).”....These cut offs seem high - which cells do you lose with this filtering? Is it cell type biased?

20. The authors state “The analyses described above did not reveal any major effect of chloroquine (or of the host species) on the overall patterns of gene expression: we observed all parasite developmental stages in most samples regardless of the treatment or the host, and the transcriptomes of treated parasites or parasites from different hosts overlapped apparently perfectly (Table 1 and Figure S13).” and from methods: “We conducted these paired analyses using EdgeR 36, independently for female gametocytes and asexual parasites, and further subdivided the latter into four crude developmental categories along the intraerythrocytic cycle”. It is important to go into more detail of how you divided asexuals, and a single-cell specific differential expression tool should be used to account for dropout.

Reviewer #3: Overview:

The authors should be commended for their heroic undertaking to determine several Plasmodium vivax transcriptomes from in vivo infections using single cell RNA-seq. These are the first example of such data which builds on data the group has previously generated by replacing bulk transcriptomics from patient isolates with single cell RNA-sequencing data generated from the infected reticulocytes of splenectomized monkeys. Using dimensional reduction based analysis of the data, the authors are able to characterize the probable sequence of gene expression events throughout P. vivax asexual development and to some extent during male and female gametocytogenesis. Despite the potential richness of the datasets acquired in this study, the manuscript fails to deeply analyze the data, providing a very broad over-view of the P. vivax in vivo transcriptome that ultimately fails to deliver on new biological insights, rather confirming previous studies. The manuscript also appears to have been put together hastily and therefore is difficult to read at times and repetitive in other sections, lacking significant editing prior to submission.

General Comments:

It is difficult to understand from the manuscript what new information was learned relative to The transcriptome of Plasmodium vivax reveals divergence and diversity of transcriptional regulation in malaria parasites (2008 PNAS). Is the novelty in the unique markers that cannot be picked up by bulk RNA-seq? Furthermore, since in the previous study the authors could culture the parasites ex-vivo and therefore determine the morphology of the parasite unambiguously, the authors should compare their data to this study to see if the highest correlations exist at the expected stage (i.e. the trophozoite and schizont datasets correlate highly, whereas the ring data is missing as they speculate is the case in their sc-RNA-seq asexual transcriptome?). Furthermore, how does the proportion of genes detected in this sc-RNA-seq study compare to bulk studies? At what threshold do you no longer detect genes that previous datasets have determined to be expressed but at a low abundance?

Can the authors graph their phaseogram against that of the orthologous genes in a better characterized Plasmodium spp. (I.e. falciparum or berghei) to highlight the differences and similarities across malaria parasites?

Can the authors use the high quality paired end reads they have generated to assemble a putative P. vivax transcriptome de novo? It would be both interesting and of great use to the field to have a more reliable P. vivax transcriptome to map NGS sequencing data to, or conversely to provide higher confidence in the currently predicted gene models.

There is a missed opportunity in the authors’ exploration of the expression of AP2 domain proteins throughout asexual and sexual development. 12/27 of the AP2 domain proteins predicted in P. vivax have been genetically characterized to some extent in other Plasmodium species. How do their temporal expression patterns match their genetic roles? We feel that the PfAP2-I orthologue may be of particular interest due to its role in activating transcription of genes needed for red blood cell invasion in Pf. Although the essential AP2 domain (D3) of PFAP2-I is 100% conserved between Pv and Pf (implying that it will bind a similar DNA motif) the Pv invasion pathway is presented as being different and presumably involves the transcriptional activation of different genes, as the authors have pointed out.

In keeping with the above content, can you use the existing Pv gene models to predict DNA cis-regulatory motifs based on the temporal profile of mRNA abundance? This would provide a very convincing argument that the precisely regulated timing of transcription factor expression is relevant for controlling the rest of the transcriptome, as the authors have suggested.

Specific Comments:

Page 2 line 18- In the following citation, ex vivo cultured P. falciparum isolates were used to address some of the shortcomings mentioned, specifically the fact that cultured parasites lose the ability to express genes that are important for host interaction.

Blythe, J. E., Xue, Y. Y., Kuss, C., Bozdech, Z., Holder, A. A., Marsh, K., … Preiser, P. R. (2008). Plasmodium falciparum STEVOR proteins are highly expressed in patient isolates and located in the surface membranes of infected red blood cells and the apical tips of merozoites. Infection and Immunity

Page 2 line 17- Was there any way for the authors to look at their parasites microscopically before generating the material for an RNA-sequencing library (as in reference 29)? How robust is the pseudotime data reduction i.e. can the age and life stage of parasites analyzed in this manuscript be said to be known with absolute certainty?

Page 4 line 21- how does mapping the artificially trimmed 75bp vs 101 bp reads compare? If you use the longer reads do you increase your information about gene structures?

Page 5 sequence analysis- Is it normal with single cell transcriptome data to normalize read depth across bins as you have described? Is there a significant difference in the sequencing depth originating from different locations on the transcriptome? Can you provide a sc-RNA seq citation that analyzes the reads in a similar way?

Can the authors provide an updated putative P. vivax transcriptome based on the gene structures (splice junctions, UTR coverage, etc.) detected in their data? Although that the goal of this work was to quantify transcripts using the 3’ end reads, the dataset should also impart some new insights into the transcriptome?

Page 7- The Results section describing “detailed transcription information” contains numerous arbitrary facts concerning the number of total reads and fraction pertaining to host versus P. vivax. In the end, this serves to confuse the reader more than providing clarity and diminishing the significance of the number of reads attained in this study. This section also describes a rational for inclusion of certain transcriptomes and not others which comes across as arbitrary without any statistical metric for inclusion/exclusion.

Page 7 line 25-29- Isn’t the information provided here contradictory? How can the spread of reads mapped to Pv be 40 to 80% but the majority of datasets are either all host or all Plasmodium?

Page 8 line 16- Please state how many genes 19,282 unique sequences corresponds to in your dataset.

Page 9 line 10- Is there a citation for the group of male gametocyte genes in P. vivax? Otherwise how were these predicted?

Page 9 line 20- How deep of sequencing do you need to detect SNPs with high confidence? Does this data consistently reach that threshold?

Page 9 – How were the genes selected for inclusion in Figure 2?

Page 10 line 20- Is it possible that the sc-RNA seq method does not adequately detect low abundance transcripts, and this is why previously theorized markers were not detected? What fraction of predicted P. vivax transcripts were detected relative to the previous bulk RNA-seq study in reference 18?

Page 10 line 31- P. vivax has direct orthologs to ap2-g2 and ap2-g3, which are thought to be important for gametocyte development in all Plasmodium species where they have been genetically interrogated. Are their transcripts detected in either gametocyte population? It would be useful to provide a comprehensive list of all predicted gametocyte-related genes (many are now available in the literature) and list their P. vivax orthologues to better define new versus previously suspected sexual stage genes emerging from this study.

Page 11- I believe that AP2 domain proteins should be written as AP2 (or ApiAP2), not AP-2

Many of the genetically characterized ApiAP2 proteins in Pb and Pf have orthologues in Pv and the important invasion transcription factor AP2-I has 1:1 conservation in its 3rd AP2 domain, implying that it probably binds a similar DNA sequence. Can you find any temporal conservation of potential 5' cis regulatory DNA elements in your data that implicate any of the AP2 domain proteins in the observed expression cascade? This may be especially interesting for AP2s involved in invasion (such as AP2-I) because or the differences in invasion mechanism between Pf and Pv.

Bozdech et al. previously proposed a cascade of regulation through AP2 proteins in their in vitro P. vivax transcriptome paper (2008).

Page 12- The authors state that the current gene annotations are rather poor for the P. vivax genome? Is this actually correct? What is the current number of gaps in this genome and by homology and synteny alone, can’t the counter argument be made that the genome is in quite good shape?

Page 12-13- Can you please provide a biological rationale why there would be a dramatic variability in the number (up to 63.4%) of genes expressed that are absent in P. falciparum during invasion? I can appreciate that invasion would be different for Pv, but why should it be so different across the measured cells? Some don’t appear to vary much at all, while others vary a lot.

Page 13 line 10- what does “apparently perfectly” mean in this context? Is the correlation between the datasets really 100%? There is no statistical basis for this statement.

Page14- if the authors optimized their isolation to find reticulocytes with mature parasites could the undetected subpopulations (related to the Duffy Binding Protein) just not be enriched in the purification process, as the authors already suggested for rings?

Figure 1- please add the gene names to the Gene ID (e.g. AMA1, Pvs25, etc…).

Figure 4- please clarify why there are different scales for each of the figures?

Miscellaneous-Pv is known to react differently to chloroquine than falciparum. Is there any reason to expect a transcriptional response that can be explained by falciparum biology?

Academic Editor's remarks:

The work uses excellent cutting-edge technology to report several novel findings and several that confirm what was already known from bulk RNAseq or scRNAseq in other species

The expression of Pv specific genes primarily in late stage schizonts for invasion is an important and novel finding.

Interestingly the authors show that there appears to be a deletion in monkey passaged paras and that an msp3 is differentially expressed between monkey spp, this evidence of species differences could be informative for factors specific for human disease.

The lack of co-variation between female and male gam genes is interesting, ie the ratios are diff between patients. I find the observation sound although it is confirmatory of what the authors previously published in their bulk rnaseq data. However, I found the terminology used here confusing

“the terminal phases of male and female gametocytogeneses are independently regulated in P. vivax”

Actually they seem to be dependently regulated, so that the probability of one cell being female affects another cell in the same host hence the disparity in sex ratios between hosts, if they were truly independent wouldn’t they trend to 50/50 or have the same unequal ratio across all hosts?

H4 expression increased in cq treated paras, this is interesting in light of the known DNA intercalating properties of Cq but a lot of work over a long time has convincingly shown that the Cq kills via interfering somehow with haemoglobin digestion/hemozoin formation in the digestive vacuole so the authors need fairly strong evidence to propose resistance via increased histone expression, I think this section should be toned down.

Interestingly the alternative histone H3.3 is upregulated (Table S3), H4 might be upregulated because they are deposited as a dimer together post replicatively. H3.3 is involved in gene regulation in many species and in telomeric silencing but in P falciparum it has some dynamic association with promoters but is primarily constitutively associated with subtelomeric repeats and coding sequence (Fraschka 2016 SciReports). It is implicated in male gamete development in mammals so could be somehow associated with gam development, this hasn’t been investigated. Increased H3.3-H4 incorporation could be a stress response to Cq.

Two substantial aspects of the work were the ap2 lifecycle associations and the gender specific genesets. However I did not get a feel from the MS how different the vivax data were to already published datasets from other species. I think this needs to be more clearly discussed in light of the author’s own discussion of the need for P vivax studies due to its unique biology.

The expression of specific ap2 at diff stages suggesting they specifically regulate progression through the lifecycle. However the bulk culture RNAseq of P. berghei, (which is more closely related to P vivax than is P falciparum, although still quite distant) has already revealed the phased expression of sets of Api AP2 transcription factors (Modrzynska et al 2016 Cell Host Microbe) and importantly 12 of 26 ApiAP2 have been KOd/KDd and their role in stage specific development confirmed by functional evidence of lethality in various stages in this and other studies. The principle was thus established without the scRNAseq although not at the same resolution. The authors should discuss their findings in light of these other studies, are there results congruent with the other studies in other species?

“We next searched the scRNA-seq data to identify transcripts highly expressed in male and/or female gametocytes and with little expression in other stages to generate a comprehensive list of putative biomarkers for male and female gametocytes (Table S2).”

Similar analyses have been conducted using scRNAseq in P. falciparum and P berghei (Reid et al 2018 elife). Also bulk RNAseq of male and female gams in P berghei (Yeoh BMC genomics 2017) and P. falciparum (lasonder 2016 NAR). It would be useful to know how much of these genesets intersected with the current study and how much is unique to P vivax, also the extent to which scRNAseq identified novel sex specific genes compared to the earlier bulk RNAseq experiments. The findings of these previous study should be discussed by the authors.

Minor comments

the PCA plot in fig 1 is color coded by genes, the function of which is used as a surrogate of lifecycle stage but the function of the genes and their lifecycle association is not explicitly stated in the fig, its legend or the text.

Page 10

“Interestingly, male gametocytes did not display elevated expression of genes previously associated with mature gametocytes (Figure S8), suggesting that these genes might be expressed only in female mature gametocytes”

This is not apparent to me from fig s8, it looks like the proportion of mature gam genes is as high as immature gam genes in males and the ratio of immature to mature is higher in males than females. The actual levels of expression of the mature gam genes in males does not seem to be presented in fig s8 at all.

Page 14 dbp2 nomenclature should be clarified in the text, its called ebp2 in the fig legend and ebp in the fig

Table S3 colour coding re significance seems odd, the numbers are ranks for the genes significance of fold change but this is not informative to me, they may be highly ranked and still non-significant, the colour coding isn’t explained, I assume they meet some significance threshold?

Decision Letter 2

Di Jiang

31 Jan 2020

Dear Dr Serre,

Thank you very much for submitting a revised version of your manuscript "Single-cell transcription analysis of Plasmodium vivax blood-stage parasites identifies stage- and species-specific profiles of expression" for consideration as a Methods and Resources at PLOS Biology. This revised version of your manuscript has been evaluated by the PLOS Biology editors, the Academic Editor and two of the three original reviewers who also assessed your response to the comments of original reviewer 3.

In light of the reviews (below), we are pleased to offer you the opportunity to address the remaining points from reviewer 2, and our Academic Editor's comments which are included as Academic Editor's remarks at the bottom of this letter, in a revised version that we anticipate should not take you very long. Our Academic Editor provides the following guidance to you on how to address the points raised by the reviewer 2, which is included here.

“…The authors should provide at least one quantitative comparison of results using their bespoke method and CellRanger with UTRs extended in the GTF by a reasonable amount. This can be supplementary, and their main text can stay as it is, but it is important to do this”

Our Academic Editor suggests that you have mounted a convincing case as to why you need to use your bespoke method.

“The new Figure S2 appears to indicate that >40% of observed transcription is more than 2.5kb away from an annotated gene. Is this true? It would be important to further discuss this in the manuscript if so, and draw comparisons with results from bulk RNAseq. Additionally, for the remaining windows that are within 1kb of a gene, it is important to do a more clear-cut gene-for-gene comparison to the other species scRNAseq data out there, as noted in the next section’s comment .”

Our Academic Editor indicates that this concern needs to be addressed and agrees with the reviewer that the bulk RNAseq should reveal roughly the length of 3’ UTRs which would validate your windows approach.

“…It would be reassuring to see that the reads that map to these new 3’ UTR regions in Pv map to 3’ UTRs in orthologous genes”

Academic Editor suggests that if orthologous means the 3’ UTRs in scRNAseq in other plasmodium spp, this wouldn't be very informative as the 3’ UTRs may be of different lengths. Another interpretation is the reviewer meant paralogs within Pv; finally they might mean the same genes in other Pv transcriptomes; the last interpretation seems reasonable but seems to reiterate the comment in the previous paragraph.

“The comparison of the shape of the PCA is a very superficial level look at similarity in expression, it is helpful to have as a figure, but more needs to be done, especially given the new title stating the paper is about species-specific expression profiles.”

Academic Editor agrees that the PCA plot is suggestive of similarity of profile but is not informative of the number or identity of similarly regulated genes across the life cycle. The new discussion and your claims of a gradual progression through transcription are useful, but our Academic Editor indicates that some more detailed comparison with other species is warranted.

“Concerns raised by reviewer 3 could be alleviated by correlating expression of the single-cell transcriptomes to those of ex vivo Pv bulk data. This would provide more insight into what the pseudotime values actually correspond to in real time. If the data does not correlate well with bulk, the authors have outlined a clear reason (lack of synchronicity or washed out by trophs) above that would be an interesting discussion point.”

Academic Editor isn't convinced that this comparison would be useful to address this issue. Pseudotime has been accepted widely as a measure and, as the reviewers acknowledge, a disparity between pseudotime here and bulk RNAseq elsewhere would not reveal whether the pseudotime or the bulk RNAseq were incorrect and thus would not be very useful.

“There are many other methods to cluster genes besides WGCNA that might provide more clarity into patterns of coexpression. Saelens et al 2018 is a good review of the many different options (https://www.ncbi.nlm.nih.gov/pubmed/29545622). This could provide clusters that would allow for the identification of enriched motifs. Additionally, if greater than 40% of transcription is not assigned to a gene, then assigning regulatory regions will be a difficult task, so further clarification in the manuscript about genes in particular is required. If the annotations are so poor still that genes cannot be assigned, then this needs to be far clearer throughout the manuscript and explored. It does not mean that the results are not sound, but the “windows” based approach is confusing and needs to be more clearly explained why it is necessary and where possible, demonstrate that it works well (e.g., when windows can be clearly assigned to genes, the genes make sense ).”

Academic Editor believes that this reiterates the reviewer's concerns re the validity of the windows approach and suggests that you address this by comparing your RNAseq to Pv bulk RNAseq to confirm the validity of the predicted 3’ UTR length.

We will assess your revised manuscript and your response to the reviewers' comments and we may consult the reviewers again. We expect to receive your revised manuscript within 1 month.

Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension. At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may end consideration of the manuscript at PLOS Biology.

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point by point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type.

When you are ready to resubmit your revised manuscript, please refer to this resubmission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Di Jiang

PLOS Biology

*****************************************************

REVIEWS:

__________

Reviewer #1: I thank the authors for their considered and thorough responses to my comments. In my view, they have addressed these in detail and I have no further concerns with the manuscript. Per the editors request, I have viewed the comments also from reviewer 3 (who I understand was not available to review the revised manuscript). In my opinion, the authors have also addressed these comments in detail and appropriately, such that I have no further concerns about these comments either. Based on this, I am happy to recommend the study be accepted for publication. I don't have any further comments that require attention and I wish to congratulate the authors on an excellent contribution to the field.

__________

Reviewer #2: Please see attachment for easier reading as it has colors...pasted here again in case attachment does not work.

The authors have addressed the majority of comments well, but some important areas still require further clarification and possibly work. We have pasted the original reviewer comments and author responses that we think require further consideration or work below. Our additional comments are in red.

Reviewer 2

This paper presents a bespoke analysis of P. vivax single cell data and would benefit from demonstrating that their approach is robust by comparing it to some of the more commonly used tools, for example at the very least presenting the CellRanger results. If the annotation of the genome is an issue and leads to poor results from CellRanger, then they could consider extending the annotated genes by a set amount from each start and stop (e.g., include 500bp up and downstream for each gene). It is unclear why they do not use popular and useful tools, and instead go about creating their own methods of analysis.

We might have incorrectly conveyed the idea that our analysis pipeline was very different from CellRanger while it is heavily based on it and conceptually identical (we have now clarified this point on page 4). We initially spent a fair amount of time trying to get CellRanger to work on our data and would have preferred to use this approach directly. We eventually chose to write our own pipeline, recoding the different steps implemented in CellRanger to allow us to more easily modify and control every parameter. As the reviewer accurately notes, the gene annotation was one key issue driving this choice: as indicated in the manuscript, only 42-48% of the reads mapped within 500 bp of the annotated 3'-end of a gene. In addition, we wanted to be able to analyze the precise genomic location of the scRNA-seq signals and avoid collapsing signals that might represent different transcripts (as shown for example on Supplemental Figure S14). Arbitrarily extending the 3'-end of each gene, as suggested by the reviewer, does improve the number of reads assigned to genes but does not alleviate our other concerns, nor allows conducting all the analyses described in the manuscript. We were also concerned that the criteria used by CellRanger for discarding cells with too few reads (all cells with 1/10th of the reads of the 99tile cells with the most expression) might fail to capture the range of transcriptional activity observed along the intraerythrocytic cycle (this concern turned out to be valid as Howick et al. 2019 had to manually "rescue" some cell populations that were systematically discarded by the CellRanger's thresholding).

The improved method sections provides clarity on why this bespoke analysis was performed; however, it would still increase the impact of the paper if there was a more in depth comparison between the CellRanger output and the window based count table. How much of the biology that is discussed in the manuscript is actually lost by using CellRanger? Should all Plasmodium RNA-seq studies on P. vivax use this new method from now on? The authors should provide at least one quantitative comparison of results using their bespoke method and CellRanger with UTRs extended in the GTF by a reasonable amount. This can be supplementary, and their main text can stay as it is, but it is important to do this.

The authors' use of windows rather than genes should be tempered by providing the gene-based analysis first. It is somewhat concerning that the screenshots of IGV seem to show transcription primarily outside of gene boundaries. This may very well be UTRs and indeed, the patterns of transcription that are presented support the notion that windows approach works, but it would still be good to provide insight into differences in patterns or indeed loss of information when using only annotated gene boundaries. Some stats on genes with and without annotated UTRs would also be important, as if UTRs are annotated, then the windows approach becomes less robust.

Because of the design of the 10X assay, which captures mRNAs with oligo-dTs and ligates the sequencing adapters after cDNA fragmentation, almost all scRNA-seq reads derive from the ~300 bp immediately preceding the polyA tail and therefore map to the 3'-UTRs (and not to the core of the gene). Given the incomplete annotation of the 3'-UTRs in P. vivax (none of the P. vivax P01 genes have 3'-UTR longer than 10 bp), it is not surprising that most reads map outside genes. Limiting our analyses to the current gene annotations would not only dramatically reduce the number of useful reads but could also collapse signals from different (unannotated) transcripts together and muddle the analyses. We did however summarize the bin data by genes and present these findings to facilitate the interpretation of the data (with a cautionary note on page 5).

The new Figure S2 appears to indicate that >40% of observed transcription is more than 2.5kb away from an annotated gene. Is this true? It would be important to further discuss this in the manuscript if so, and draw comparisons with results from bulk RNAseq. Additionally, for the remaining windows that are within 1kb of a gene, it is important to do a more clear-cut gene-for-gene comparison to the other species scRNAseq data out there, as noted in the next section's comment.

A better way to display information currently contained within S2 would be a conventional coverage plot across all genes including multiple windows beyond the stop codon. If the authors think the unannotated transcripts that are more than 500 bp from an annotated gene are driving the biology in their data, more work needs to be done on when and where they are expressed. It would be reassuring to see that the reads that map to these new 3' UTR regions in Pv map to 3' UTRs in orthologous genes.

The authors state that they have observed "exquisite regulation of P. vivax transcription along the intraerythrocytic life cycle" but support this claim with a heatmap of a small subset of genes. A more global analysis of the transcriptional changes over developmental time in needed. Do you observe abrupt patterns of expression as seen in other single cell Plasmodium data sets (Reid et al 2018), and are the timing of expression patterns similar to other species (e.g. Poran et al 2017, Howick et al 2019)?

We thank the reviewer for bringing this point to our attention. Reid et al., 2018 describes three distinct clusters of genes based on their expression in asexual parasites and interpret this pattern as indicative of "abrupt changes" in transcription, contrasting with the continuous cascade of transcription and smooth transition observed by bulk-RNA seq (see e.g., Ho et al, 2016 or Bozdech et al, 2003). However, while their figure 4C seems to show distinctive clusters of genes, the PCA in Reid et al shows a continuous distribution of the asexual parasites and not discrete clusters as we would expect if there were "abrupt" changes in expression. The heatmap shown in their Figure 4 also seems to indicate that each gene is not either on or off but gradually increasing or decreasing in expression (though we would agree that this is quite subjective). The three clusters of genes described in Reid et al., 2018 are different from what we observed in our data (e.g., our new Figures 2A, 3 and 4) and we wonder if this could be caused by an uneven distribution of the parasites along the intraerythrocytic cycle in the Reid study which could introduce some apparent clustering. We have included a comparison between our data and the recent P. berghei data generated by Howick et al. with the same technology as used in our study. This analysis highlights the similarities in transcriptional changes between these species along the intraerythrocytic cycle (see e.g., Figure 1B) and, we think, the conserved continuous changes in gene expression throughout the parasites' development. We have also included a short discussion on this aspect on pages 13-14.

Although the authors have included a comparison with the P. berghei data from Howick et al, as well as some technical comparisons with other single-cell plasmodium data sets, they do not address what is differentially expressed over development in a species-dependent manner, i.e. do the same genes show the same pattern of expression across both Pb Pf, and Pv? The comparison of the shape of the PCA is a very superficial level look at similarity in expression, it is helpful to have as a figure, but more needs to be done, especially given the new title stating the paper is about species-specific expression profiles.

Reviewer 3

Page 2 line 17- Was there any way for the authors to look at their parasites microscopically before generating the material for an RNA-sequencing library (as in reference 29)? How robust is the pseudotime data reduction i.e. can the age and life stage of parasites analyzed in this manuscript be said to be known with absolute certainty?

We prepared blood smears from each sample and counted an estimated 10,000 red blood cells prior to processing through MACS columns. We agree that it would have been better to also have a second smear after MACS enrichment, but we did not have enough sample from the column. We have now included in Table 1 the stage composition determined by blood smear from each sample (prior to MACS enrichment).

Regarding the calculation of the pseudotime, we believe that this metric captures most of the variation occurring along the developmental cycle. The absolute value of this estimate is probably not very informative (as it depends on changes in gene expression and on the distribution of the parasites along the intraerythrocytic cycle) but the ranking reflects morphological changes in the parasites, appears robust with regards to the data cutoffs and normalization (see above), and is correlated with the gene expression variation observed in P. berghei.

Concerns raised by reviewer 3 could be alleviated by correlating expression of the single-cell transcriptomes to those of ex vivo Pv bulk data. This would provide more insight into what the pseudotime values actually correspond to in real time. If the data does not correlate well with bulk, the authors have outlined a clear reason (lack of synchronicity or washed out by trophs) above that would be an interesting discussion point.

In keeping with the above content, can you use the existing Pv gene models to predict DNA cis-regulatory motifs based on the temporal profile of mRNA abundance? This would provide a very convincing argument that the precisely regulated timing of transcription factor expression is relevant for controlling the rest of the transcriptome, as the authors have suggested.

We tried to identify clusters of co-expressed genes (using a modified version of WGCNA) to then search for enriched motifs in the 5' region of each gene but failed to obtain convincing modules (probably due to the high number of drop-offs). At this point, we feel that our current data are not sufficient to address this question.

There are many other methods to cluster genes besides WGCNA that might provide more clarity into patterns of coexpression. Saelens et al 2018 is a good review of the many different options (https://www.ncbi.nlm.nih.gov/pubmed/29545622). This could provide clusters that would allow for the identification of enriched motifs. Additionally, if greater than 40% of transcription is not assigned to a gene, then assigning regulatory regions will be a difficult task, so further clarification in the manuscript about genes in particular is required. If the annotations are so poor still that genes cannot be assigned, then this needs to be far clearer throughout the manuscript and explored. It does not mean that the results are not sound, but the "windows" based approach is confusing and needs to be more clearly explained why it is necessary and where possible, demonstrate that it works well (e.g., when windows can be clearly assigned to genes, the genes make sense).

__________

Academic Editor's remark on your response to their first remark:

No fig legends for figs s7 and s8

The authors meaning behind the lack of co-variation between male and female gam gene expression is now clearer that they are not using words with mathematical meaning in a non-mathematical fashion but I’m not convinced that the data supports the conclusion. I agree that, for example, high levels of female gam gene expression can occur with low, high or medium levels of male gam gene expression but I don’t see why this would necessitate separate processes of gametocytogenesis. An alternative explanation consistent with the data is a common pathway of sexual differentiation that defaults to one sex, then those that fail to switch to the other sex will end up the default sex, the proportion that fail to switch can vary but the processes are not separate. I don’t suggest that this actually happens but nor do I think the presented evidence confirms separate processes of gametocytogenesis. The author’s intended meaning is still somewhat opaque in the use of the word “terminal”. Gametocytogenesis is a terminal process, it is unclear whether terminal encompasses the entire process (as I assumed above) or merely the later stages when gametocyte sex can be established. If the latter, then the observation is facile that further male and female differentiation are separate processes after male and female sex has been established. If gametocytogenesis is proposed to be sexually divergent from initiation then there is some implicit difference that precedes or is concurrent with AP2G initiating commitment, is there any evidence for such a difference? I don’t think the authors provide any evidence of such a mechanism and I think this section should be redrafted much more cautiously and the actually intended meaning conveyed much more explicitly. If this section is just speculation then comments re “confirming” separate processes of gametocytogenesis should be removed.

My comments re the observed increase in H4 expression following Cq treatment and the comments on H3.3 were not meant to imply that the paper lacked a discussion of histone gene expression but rather to indicate that increased nucleosome density as a resistance mechanism was not the only possible explanation for altered H4 expression levels, increased H3.3-H4 deposition would alter the composition but not the density of nucleosomes. I don’t suggest that the authors need to address H3.3 but they have now modified the MS to include a fairly long section on general histone gene expression and the expression of several chromatin regulatory proteins during the Pv lifecycle as revealed by the scRNAseq. Is this novel and noteworthy? The transcriptional profile of Pf histone genes has been long known from bulk transcriptomes.

The authors table s3 shows that nearly all of the sex-specific genes identified in Pv scRNAseq in this study were also found in Pb by Yeoh et al using bulk rnaseq of sex specific separated gams also many were shown by obaldia and marti. All of these papers should be cited in the main body of the MS and included in the text not just the supplement. The authors point out the majority of the female gam genes they describe are restricted to this stage and sex but many of these genes are present in male and female Pb gams according to table s3, this should be commented on, is this a genuine species difference or possibly a technical artefact?

Attachment

Submitted filename: SaetalRev2.docx

Decision Letter 3

Di Jiang

18 Mar 2020

Dear Dr Serre,

Thank you for submitting your further revised Methods and Resources entitled "Single-cell transcription analysis of Plasmodium vivax blood-stage parasites identifies stage- and species-specific profiles of expression" for publication in PLOS Biology. I have now obtained advice from the Academic Editor who has assessed your revisions.

Based on our Academic Editor's evaluation, we will probably accept this manuscript for publication only if you will include a previously requested, straightforward validation of your assignment of reads to the 3' ends of genes (see below). We don't think this will take long. Below are our Academic Editor's full comments which contain an additional minor request.

Comments from the Academic Editor:

The authors have largely addressed the queries raised by the reviewer and myself (editor).

Fig s2 is difficult to find in the attached revised MS, it is in suppl materials but its presence there is not indicated in the main MS, the suppl materials file could be renamed to indicate that it contains many of the suppl figs. I assume the legend in the response and fig s2 uses 5’-end in error when the authors mean 3’-end? If so, then I think this data is convincing that approx. half the reads are mapping within 500 bp of the end of the gene.

The authors also show convincing examples of incorrect gene annotations which could explain the distant locations of their scRNAseq reads from the annotated gene 3’ ends.

However, the authors don’t perform the requested broad comparison with bulk rnaseq because; “implementing a systematic reannotation and validation of gene isoforms in the P. vivax genome is going to be a major effort beyond the scope of the present report.” Maybe the authors have misunderstood the requested, trivial comparison to bulk RNAseq. The intention of the requested analysis was to distil multiple comments of R2 to a single validation of the assignment of reads to transcripts and genes. This could be addressed by comparing a table of bulk RNAseq assembled transcripts, eg from cufflinks, to the scRNAseq. This would not require any comparison with existing annotations. A simple bedtools intersect of the concatenated scRNAseq consensus peak coordinates and the bulk RNAseq transcripts should reveal the proportion of scRNAseq “peaks” that are corroborated by bulk RNAseq transcripts. The authors already have the bam files for the bulk rnaseq so this comparison would be trivial and would allay concerns that the scRNAseq was incorrectly assigning reads to distant genes.

I understood the reviewer to have agreed that the PCA plot shows that the variation in transcriptome of the approx. 5000 genes through development is largely concordant hence the similar eigenvectors resulting in a similar PCA. However, there will be some genes that are differently expressed between species across development. These are the genes that I understood the reviewer wanted discussed. However, upon re-reading the MS I think this additional discussion is probably unnecessary.

The authors have satisfactorily addressed all of the other comments and queries from the editor and reviewer.

__________

We expect to receive your revised manuscript within two weeks. Your revisions should address the specific points made by each reviewer. In addition to the remaining revisions and before we will be able to formally accept your manuscript and consider it "in press", we also need to ensure that your article conforms to our guidelines. A member of our team will be in touch shortly with a set of requests. As we can't proceed until these requirements are met, your swift response will help prevent delays to publication.

*Copyediting*

Upon acceptance of your article, your final files will be copyedited and typeset into the final PDF. While you will have an opportunity to review these files as proofs, PLOS will only permit corrections to spelling or significant scientific errors. Therefore, please take this final revision time to assess and make any remaining major changes to your manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*Early Version*

Please note that an uncorrected proof of your manuscript will be published online ahead of the final version, unless you opted out when submitting your manuscript. If, for any reason, you do not want an earlier version of your manuscript published online, uncheck the box. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

*Submitting Your Revision*

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include a cover letter, a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable), and a track-changes file indicating any changes that you have made to the manuscript.

Please do not hesitate to contact me should you have any questions.

Sincerely,

Di Jiang

PLOS Biology

------------------------------------------------------------------------

ETHICS STATEMENT:

-- Please create a separate subsection entitled "Ethics Statement" and place it in the beginning of the Methods section.

-- Please include the full name of the IACUC/ethics committee that reviewed and approved the animal care and use protocol/permit/project license. ***IMPORTANT: Please also include an approval number.***

-- Please include the specific national or international regulations/guidelines to which your animal care and use protocol adhered. Please note that institutional or accreditation organization guidelines (such as AAALAC) do not meet this requirement.

-- Please include information about the form of consent (written/oral) given for research involving human participants. All research involving human participants must have been approved by the authors' Institutional Review Board (IRB) or an equivalent committee, and all clinical investigation must have been conducted according to the principles expressed in the Declaration of Helsinki.

------------------------------------------------------------------------

DATA POLICY:

You may be aware of the PLOS Data Policy, which requires that all data be made available without restriction: http://journals.plos.org/plosbiology/s/data-availability. For more information, please also see this editorial: http://dx.doi.org/10.1371/journal.pbio.1001797

Note that we do not require all raw data. Rather, we ask that all individual quantitative observations that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

-- Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it: Figures 1AB, 2AB, 3AB, 4AB, 5, S1AB, S2, S3, S4, S5. NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

-- Please provide an editor/reviewer key/token to your data (sequence reads) deposited in NCBI SRA under the Bioproject PRJNA603327 so we can check it before accepting the manuscript for publication.

Please also ensure that figure legends in your manuscript include information on where the underlying data can be found, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

------------------------------------------------------------------------

BLOT AND GEL REPORTING REQUIREMENTS:

For manuscripts submitted on or after 1st July 2019, we require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare and upload them now. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

Decision Letter 4

Di Jiang

15 Apr 2020

Dear Dr Serre,

On behalf of my colleagues and the Academic Editor, Michael Duffy, I am pleased to inform you that we will be delighted to publish your Methods and Resources in PLOS Biology.

The files will now enter our production system. You will receive a copyedited version of the manuscript, along with your figures for a final review. You will be given two business days to review and approve the copyedit. Then, within a week, you will receive a PDF proof of your typeset article. You will have two days to review the PDF and make any final corrections. If there is a chance that you'll be unavailable during the copy editing/proof review period, please provide us with contact details of one of the other authors whom you nominate to handle these stages on your behalf. This will ensure that any requested corrections reach the production department in time for publication.

Early Version

The version of your manuscript submitted at the copyedit stage will be posted online ahead of the final proof version, unless you have already opted out of the process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for submitting your manuscript to PLOS Biology and for your support of Open Access publishing. Please do not hesitate to contact me if I can provide any assistance during the production process.

Kind regards,

Vita Usova

Publishing Editor,

PLOS Biology

on behalf of

Di Jiang,

Associate Editor

PLOS Biology

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Development of P. vivax parasitemia in nonhuman primates.

    The graphs show the changes in parasitemia over time in Aotus (A) and Saimiri (B) monkeys infected with different strains of P. vivax. The large black circles represent, for each animal, the collection points of blood samples used for scRNA-seq library preparation. “CQ” indicates point of single oral administration of chloroquine to the animals (10 mg/kg for the Chesson and AMRU-I infections, 5 mg/kg for the Indonesia-I/CDC infection). Underlying data plotted in panels A and B are provided in S6 Data. scRNA-seq, single-cell RNA sequencing.

    (TIFF)

    S2 Fig. Host and parasite contribution to the single-cell transcriptomes.

    The scatter plot shows the number of monkey (x-axis, in 100,000s) and P. vivax (y-axis, in 100,000s) unique sequences (or UMIs) obtained from each droplet (gray dots) across all samples. Note that most single-cell transcriptomes contain either monkey or P. vivax reads. The few droplets containing high numbers of monkey and P. vivax reads (dots off the x- and y-axes) are likely droplets containing multiple cells. UMI, unique molecular identifiers.

    (TIF)

    S3 Fig. PCA of the 13,503 individual parasite transcriptomes containing more than 1,000 unique reads.

    Each gray dot represents a single P. vivax transcriptome and is displayed according to its gene expression profile along the first 3 principal components. Note that the figure is virtually identical to the PCA generated using the 9,215 transcriptomes with more than 5,000 unique reads (Fig 1A). Underlying data are provided in S7 Data. PCA, principal component analysis.

    (TIF)

    S4 Fig. Influence of the minimum read cutoff on the parasite stages retrieved.

    The graph shows the number of parasites obtained (y-axis) using a minimum read cutoff of 1,000 (red) or 5,000 (green) unique reads per parasite. The parasites are grouped according to their pseudotimes (x-axis) calculated using the full data set and binned into 10 categories for the asexual parasites and female gametocytes from 0 (low pseudotime value) to 9 (high pseudotime value). Underlying data are provided in S8 Data.

    (TIF)

    S5 Fig. Annotation of the 500-bp windows with regard to annotated P. vivax genes.

    Each 500 bp window was annotated based on its distance from the 3’-end of the closest annotated gene. “0” corresponds to a window containing an annotated 3’-end, “1” through “5” indicate windows located 1 to 5 windows of 500 bp downstream of the annotated 3’-end, and “-2”/”-1” upstream of the 3’-end. “X” represents bins located outside of these regions. The annotation was performed independently for each DNA strand. The histogram on top shows the distribution of the windows containing scRNA-seq reads with regard to annotated genes. Underlying data are provided in S9 Data. bp, base pairs; scRNA-seq, single-cell RNA sequencing.

    (TIF)

    S6 Fig. Example of scRNA-seq data generated using the 10X 3’-end protocol.

    The figure shows approximately 20 kb on chromosome 11 between position 736,000 and 756,000. The blue bars at the bottom represent 7 annotated P. vivax genes in this region. The gray histograms represent the number of reads covering a given position for 24 single-cell transcriptomes generated from an infection of an Aotus monkey with AMRU-I. The first 16 rows correspond to asexual parasites along their developmental trajectories (by groups of 4), the next 4 rows represent female gametocytes, and the last 4 rows, male gametocytes. scRNA-seq, single-cell RNA sequencing.

    (TIF)

    S7 Fig. Animated.mp4 showing the PCA of the 9,215 individual P. vivax parasite transcriptomes.

    Each dot represents a single parasite transcriptome and is displayed according to its gene expression profile along the first 3 principal components. PCA, principal component analysis.

    (MP4)

    S8 Fig. Animated.mp4 showing the PCA of the 9,215 P. vivax (in blue) and 4,884 P. berghei parasites (in pink).

    Each dot represents a single parasite transcriptome and is displayed according to its gene expression profile along the first 3 principal components. PCA, principal component analysis.

    (MP4)

    S9 Fig. PCA of the individual parasite transcriptomes separated by samples.

    The figure shows PCA from all individual transcriptomes separated by blood samples. The rows correspond to the different strains of P. vivax used in this study, the columns blood samples collected from Aotus monkeys before and after chloroquine treatment (columns 1 and 2, respectively) or from Saimiri monkeys (column 3). Note that 2 different Saimiri monkeys were infected with the same strain of NIH-1993 (last 2 rows). Underlying data are provided in S10 Data. PCA, principal component analysis.

    (TIF)

    S10 Fig. Maturity of P. vivax gametocytes circulating in blood.

    The panels show the average proportion of transcripts (y-axis, in mRNA molecules per 1,000) derived from P. vivax genes associated with GR (red), IG (green), and MG (blue) based on Obaldia and colleagues, 2018. The left and middle panels show the expression in asexual parasites and male gametocytes, respectively. The right panel displays the average expression of 50 female gametocytes organized according to their pseudotime (left, differentiating; right, fully differentiated). Underlying data are provided in S11 Data. GR, gametocyte ring; IG, immature gametocyte; MG, mature gametocyte.

    (TIF)

    S11 Fig. Correlations between the expression of male and female gametocyte genes.

    The heatmap shows the extent of gene expression correlation (Pearson’s R, in blue-red scale) between 2 gametocyte genes selected from scRNA-seq data across all Cambodian patients characterized in Kim and colleagues, 2019. The bordering trees show the result of unsupervised clustering of these genes according to their gene expression patterns. Underlying data are provided in S12 Data. scRNA-seq, single-cell RNA sequencing.

    (TIF)

    S12 Fig. Expression of genes used for RDT.

    The figures show PCA from all P. vivax transcriptomes colored according to the expression level of, from top to bottom, 2 genes used in RDTs, lactate dehydrogenase (PVP01_1229700) and fructose 1,6-bisphosphate aldolase (PVP01_1262200), and one gene almost ubiquitously expressed (profilin or PVP01_0730900). Each dot represents one single-cell parasite and is colored according to the expression level of that transcript (in red scale). Underlying data are provided in S13 Data. PCA, principal component analysis; RDT, rapid diagnostic test.

    (TIF)

    S13 Fig. Expression of 28 AP2 domain transcription factors.

    Variations in the expression levels of AP2 genes according to the parasite development. Each row represents a different annotated AP2 gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S14 Data.

    (TIF)

    S14 Fig. Example showing possible alternative 3’-UTR usage.

    The figure shows scRNA-seq data from 24 individual parasites for a region of approximately 10 kb on chromosome 9 containing 2 annotated P. vivax genes, PVP01_0941100 (left blue bar) and PVP01_0941200 (right blue bar), both encoded on the + strand. The scRNA-seq sequences generated from for asexual parasites (rows 1–16) map close to the annotated 3’-end of PVP01_0941100, whereas those generated from male gametocytes (last 4 rows) map more than 1 kb downstream. Female gametocytes (rows 17–20) displayed an intermediate profile. This pattern is consistent with the presence of 2 different 3’-UTRs for PVP01_0941100 (although it is also possible that the second peak corresponds to the 3’-UTR of an unannotated P. vivax gene). scRNA-seq, single-cell RNA sequencing; UTR, untranslated region.

    (TIF)

    S15 Fig. Expression of MSP7-like genes.

    The figure shows examples of scRNA-seq data across approximately 30 kb of chromosome 12 containing a cluster of MSP7-like genes (blue bars at the bottom). Note that most MSP genes are specifically expressed in late asexual parasites (rows 5–12) but turned off in very late schizonts (rows 13–16), whereas other MSP7-like genes are uniquely expressed in female and male gametocytes (row 17–24). MSP, merozoite surface protein.

    (TIF)

    S16 Fig. Expression of PHIST genes.

    Variations in the expression levels of members of the PHIST multigene family according to the parasite development. Each row represents a different annotated PHIST gene, and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S15 Data. PHIST, plasmodium helical interspersed subtelomeric.

    (TIF)

    S17 Fig. Heatmap showing the variation in the time of PIR gene expression.

    Variations in the expression levels of PIR genes according to the parasite development. Each row represents a different annotated PIR gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S16 Data.

    (TIF)

    S18 Fig. Heatmap showing the variation in the time of tryptophan-rich antigen gene expression.

    Variations in the expression levels of tryptophan-rich antigen genes according to the parasite development. Each row represents a different annotated tryptophan-rich antigen gene and each vertical bar an individual P. vivax parasite ranked along the x-axis based on its pseudotime. The color of each vertical bar represents the expression level of a specific transcript from nondetected (yellow) to highest expression (in red) in yellow-to-red scale. The color bar under the heatmap shows the assignment of the parasites into different morphological stages (red, trophozoites; blue and purple, schizonts; green, female gametocytes; orange, male gametocytes). Underlying data are provided in S17 Data.

    (TIF)

    S19 Fig. Proportion of mRNA transcribed from P. vivax genes with a P. berghei ortholog.

    The figure shows the proportion of the mRNA molecules derived from P. vivax genes with a one-to-one P. berghei ortholog (y-axis) according to the parasite developmental stage (x-axis). Purple dots indicate asexual parasites organized according to their developmental pseudotime rank (from trophozoites on the left to schizonts on the right); green dots represent female gametocytes and orange dots male gametocytes. Underlying data are provided in S18 Data.

    (TIF)

    S20 Fig. Effect of chloroquine and host species on P. vivax gene expression.

    (A) Expression of histone H4 (PVP01_0905800) in individual parasites organized along their developmental pseudotime rank (on the x-axis) and colored according to the chloroquine treatment (red, before treatment; blue, after treatment). Only parasite transcriptomes generated from Aotus infections are represented. (B) Expression of the exported protein PVP01_1402200 in individual parasites organized along their developmental pseudotime rank and colored according to the host species (red, Aotus; blue, Saimiri). The y-axis indicates the number of unique reads mapped to these windows in each parasite (in reads per 1,000). Underlying data are provided in S19 Data.

    (TIF)

    S21 Fig. Possible deletion of DBP2 in monkey-adapted P. vivax strains.

    The figure shows all the reads from each P. vivax strain mapped to a 50-kb region surrounding DBP2 (PVP01_0102300, highlighted in red). Virtually no reads mapped in the first 110 kb of this chromosomal arm (including DBP2) possibly indicating a deletion of this region in these monkey-adapted strains. Note that the boundary of this deletion might differ between strains (as shown by the location where read mapping resumed).

    (TIF)

    S22 Fig. Reference allele frequency plots.

    The figure shows reference allele frequency plots generated from each blood sample. Each plot shows the proportion of reads carrying the reference allele (x-axis, for 0 on the left to 1 on the right) for each position sequenced at >20X (y-axis). The rows correspond to the different strains of P. vivax used in this study, the columns blood samples collected from Aotus monkeys before and after chloroquine treatment (columns 1 and 2, respectively) or from Saimiri monkeys (column 3). Note that 2 different Saimiri monkeys were infected with the same strain of NIH-1993 (last 2 rows). Underlying data are provided in S20 Data.

    (TIF)

    S1 Table. Summary statistics for the alignment and bioinformatics analyses.

    The table shows, for each sample, the viability determined after MACS column purification, the number of reads generated, the percentage of reads carrying a 10X barcode (% Accept Barc), the percentage of reads too short after trimming polyA sequences (% Discarded), the percentage of reads mapped to the monkey and P. vivax genomes, the average number of reads per unique sequences (reads/UMI) and the number of single-cell transcriptomes obtained. UMI, unique molecular identifiers.

    (XLSX)

    S2 Table. Comparison of the results from different Plasmodium scRNA-seq experiments.

    For each study, the table indicates the number of single-cell transcriptomes obtained, the average number of unique reads mapped to annotated genes per parasite, the average number of expressed genes detected and the average number of genes contributing to more than 1% of the mRNA molecules. Data from Howick and colleagues, 2019, only include blood parasites characterized on the 10X Genomics platform. scRNA-seq, single-cell RNA sequencing.

    (XLSX)

    S3 Table. Genes highly expressed in male and female P. vivax gametocytes.

    The table shows the 25 genes most highly expressed in male and female gametocytes and provides for each genes the median and range (10%–90%) expression in female, male, and asexual parasites (in number of mRNA molecules per 1,000). The figures for P. berghei are also provided, as well as whether the expression of the gene was detected in Yeoh and colleagues, 2017 (M in male, F in female gametocytes) and in Obaldia and colleagues, 2018 (GR, gametocyte ring; MG, mature gametocytes). Genes highlighted in green are specifically expressed in gametocytes.

    (XLSX)

    S4 Table. Genes differentially expressed after chloroquine treatment.

    The table lists the 500-bp windows deemed differentially expressed after chloroquine exposure (see main text for details). Cells highlighted in green represent genes up-regulated after chloroquine exposure, cells highlighted in red represent genes down-regulated after chloroquine exposure. The capital letters after the strain name indicate the stage of the parasite tested (A to D: asexual parasites from early trophozoites to late schizonts, F: female gametocytes). bp, base pairs.

    (XLSX)

    S5 Table. Genes differentially expressed genes according to the host species.

    The table lists the ten 500-bp windows most differentially expressed according to the host species at each stage (see main text for details). Cells highlighted in green represent genes up-regulated after chloroquine exposure, cells highlighted in red represent genes down-regulated after chloroquine exposure. The capital letters indicate the stage of the parasite tested (A to D: asexual parasites from early trophozoites to late schizonts, F: female gametocytes). bp, base pairs.

    (XLSX)

    S6 Table. Genetic differences between strains.

    The table shows the number of nucleotide differences between the different P. vivax strains used in this study. The cells highlighted in green indicate comparisons between the genotype from the same strain, either in the same monkey (before and after treatment) or in 2 different monkey infections.

    (XLSX)

    S1 Text. Genetic Analyses.

    Description of the assessment of clonality and the number of genetic differences between strains.

    (DOCX)

    S1 Data. Experimental data used to generate Fig 1.

    (XLSX)

    S2 Data. Experimental data used to generate Fig 2.

    (XLSX)

    S3 Data. Experimental data used to generate Fig 3.

    (XLSX)

    S4 Data. Experimental data used to generate Fig 4.

    (XLSX)

    S5 Data. Experimental data used to generate Fig 5.

    (XLSX)

    S6 Data. Experimental data used to generate S1 Fig.

    (XLSX)

    S7 Data. Experimental data used to generate S3 Fig.

    (XLSX)

    S8 Data. Experimental data used to generate S4 Fig.

    (XLSX)

    S9 Data. Experimental data used to generate S5 Fig.

    (XLSX)

    S10 Data. Experimental data used to generate S9 Fig.

    (XLSX)

    S11 Data. Experimental data used to generate S10 Fig.

    (XLSX)

    S12 Data. Experimental data used to generate S11 Fig.

    (XLSX)

    S13 Data. Experimental data used to generate S12 Fig.

    (XLSX)

    S14 Data. Experimental data used to generate S13 Fig.

    (XLSX)

    S15 Data. Experimental data used to generate S16 Fig.

    (XLSX)

    S16 Data. Experimental data used to generate S17 Fig.

    (XLSX)

    S17 Data. Experimental data used to generate S18 Fig.

    (XLSX)

    S18 Data. Experimental data used to generate S19 Fig.

    (XLSX)

    S19 Data. Experimental data used to generate S20 Fig.

    (XLSX)

    S20 Data. Experimental data used to generate S22 Fig.

    (XLSX)

    Attachment

    Submitted filename: ResponseReviewers_DS010320.docx

    Attachment

    Submitted filename: SaetalRev2.docx

    Attachment

    Submitted filename: Responses_030720.docx

    Attachment

    Submitted filename: Response.docx

    Data Availability Statement

    All sequence data generated in this study are deposited at the Sequence Read Archive under the BioProject PRJNA603327 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA603327).


    Articles from PLoS Biology are provided here courtesy of PLOS

    RESOURCES