Summary
A comprehensive map of transcription start sites (TSS) across the highly AT-rich genome of P. falciparum would aid progress towards deciphering the molecular mechanisms that underlie the timely regulation of gene expression in this malaria parasite. Using high-throughput sequencing technologies, we generated a comprehensive atlas of transcription initiation events at single nucleotide-resolution during the parasite intra-erythrocytic developmental cycle. This detailed analysis of TSS usage enabled us to define architectural features of plasmodial promoters. We demonstrate that TSS selection and strength are constrained by local nucleotide composition. Furthermore, we provide evidence for coordinate and stage-specific TSS usage from distinct sites within the same transcription unit, thereby producing transcript isoforms, a subset of which are developmentally regulated. This work offers a framework for further investigations into the interactions between genomic sequences and regulatory factors governing the complex transcriptional program of this major human pathogen.
Introduction
Elucidation of the 23-Mb genome sequence of P. falciparum, the protozoan parasite responsible for the most lethal form of human malaria, revealed that its global AT-content is greater than 80% and rises to 90% in introns and intergenic regions (Gardner et al., 2002). This extreme base composition raised questions about the genome organization and the mechanisms enabling gene regulation in such an AT-rich environment. Transcriptome analyses using microarrays, later confirmed by RNA-Seq methods, showed that gene expression is tightly controlled during the parasite intra-erythrocytic developmental cycle (IDC) (Bozdech et al., 2003a; Le Roch, 2004; Otto et al., 2010; Sorber et al., 2011). Indeed, the majority of the ∼5500 genes vary significantly in steady-state mRNA levels between the different intra-erythrocytic stages, resulting in a developmentally linked cascade of gene expression. Genome surveys showed conservation of the basal eukaryotic transcriptional machinery, but also suggested a paucity of transcription-associated factors (Bischoff and Vaquero, 2010; Callebaut et al., 2005; Coulson et al., 2004). Despite major advances comprising the identification of transcription factors that govern P. falciparum stage-specific gene expression (Balaji, 2005; De Silva et al., 2008), the lack of a comprehensive map of transcription initiation events has hampered advances in decrypting the molecular basis of transcriptional control. For instance, the mechanisms enabling the transcription machinery to cope with the exceptionally skewed nucleotide composition of the parasite genome or the basic organization of transcription units have yet to be determined. More particularly, the processes underlying the regulation of transcription initiation, such as the recruitment of RNA polymerase II (RNA PolII) or transcription start site (TSS) selection, still remain to be deciphered. Generally, characterizing the transcriptome architecture of P. falciparum has been technically challenging, given the low complexity of its genome and the potential for reverse-transcription-derived artifacts (Siegel et al., 2014). Nevertheless, several single-gene studies have attempted to identify transcription initiation sites (Horrocks et al., 2009), while the FULL-Malaria project mapped TSS positions for ∼25% of P. falciparum genes by cloning and subsequently sequencing full-length cDNA molecules (Watanabe et al., 2002). However, the intricacy and dynamics of transcription initiation in this parasite have not yet been addressed. The establishment of next-generation sequencing technologies now provides powerful means for the comprehensive and systematic analysis of gene expression and regulation. Genome-wide approaches such as CAGE (cap analysis of gene expression) have contributed to a better understanding of eukaryotic promoter structures and the impact of local nucleotide content on transcriptional processes (Shiraki et al., 2003). Recent characterization of the eukaryotic landscape of transcription initiation by RNA PolII has, for instance, revealed the existence of “broad” and “sharp” promoter classes (Lenhard et al., 2012) and shed some light on the mechanisms underlying TSS selection (Haberle et al., 2014).
Here, we report an in-depth analysis of the dynamics of transcription initiation during the P. falciparum IDC, giving insight into the parasite's transcriptome architecture and gene expression regulation. Using 5′ cap sequencing, a modified version of the CAGE approach that alleviates some of the artifacts previously mentioned, we systematically characterized P. falciparum transcript 5′ ends at different stages of the 48-hour intra-erythrocytic cycle, thereby generating a genome-wide TSS map at single nucleotide-resolution. We demonstrated that transcription initiation occurs across broad genomic regions and that multiple clusters of transcription start sites may co-exist within a gene locus to generate mRNA isoforms. Strikingly, examination of the temporal data revealed that transcription initiation from separate sites within the same transcription unit may be coordinated or stage-specific, leading to the production of developmentally regulated and possibly differentially translated transcript isoforms for a subset of P. falciparum genes. Analysis of the nucleotide content and chromatin organization around the identified TSS allowed us to outline features of P. falciparum promoter architecture. In particular, we found that even in the parasite's extremely AT-biased genome, promoter selection and activity are governed by a locally skewed base composition. Given the great complexity of the P. falciparum transcriptional program, this report constitutes a highly valuable resource for further investigations into the mechanisms directing TSS selection, including the links between P. falciparum promoter architecture and regulatory elements. Our study also provides a framework for the development of therapeutics interfering with gene expression regulation in this deadly pathogen, to which half of the world population remains exposed (World Health Organization, 2015).
Results
Characterization of P. falciparum transcription initiation patterns reveals TSS distributions over broad promoter regions
An in-depth survey of the transcription initiation events occurring across the P. falciparum developmental stages was conducted from two tightly synchronized biological replicate cultures harvested at 6 equally spaced time points during the 48-hour IDC (Supplemental Experimental Procedures).
Strand-specific libraries for deep-sequencing were constructed from the 5′ ends of capped mRNAs captured using a biotinylated 5′ adapter containing unique molecular identifiers, as reported in (Pelechano et al., 2015). In addition to CIP and TAP treatments, enrichment for capped mRNA molecules was further ensured by enzymatic treatment of the total RNA samples with a 5′ P-dependent exonuclease. Omitting the TAP treatment resulted in a very small number of sequencing reads mapping to the P. falciparum genome, indicative of a negligible level of experimental noise (Supplemental Experimental Procedures). Some modifications were made to take into consideration the exceptional AT-richness of the P. falciparum genome (Figure 1A, Figures S1A and S1B, and Supplemental Experimental Procedures). For instance, the mRNA fragmentation and the reverse transcription reaction were performed using conditions previously established for RNA-seq and DNA microarrays in P. falciparum (Bozdech et al., 2003b; Hoeijmakers et al., 2013a). Additionally, the second strand synthesis as well as the amplification of the cDNA libraries were performed using the KAPA DNA polymerase enzyme that has a very low AT-bias and is now commonly used when preparing P. falciparum sequencing libraries (Oyola et al., 2012; Siegel et al., 2014). About 600,000 unique 5′ long tags (> 90 nucleotides) were recovered on average per independent library (Figure S1C, Table S1) and collapsed on their first base to define the position of the transcription start sites (TSSs) at single-nucleotide resolution. Compilation of the transcription initiation events detected on both positive and negative strands showed that more than 3 million nucleotide bases constituted a TSS, indicative of an extensive transcription initiation activity across the P. falciparum genome (Table S1). 
The use of in vitro transcripts to control for the robustness of the protocol confirmed that the 5′ ends of these transcripts were properly identified using the 5′ cap capture (Figure S1D, Supplemental Experimental Procedures).
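The collapsing of sequencing tags on their first base can be illustrated with a short sketch. It assumes a hypothetical input layout (chromosome, 0-based half-open start and end, strand, unique molecular identifier) and treats tags sharing a 5′ position and identifier as one molecule; the processing actually used is described in the Supplemental Experimental Procedures and may differ.

```python
from collections import Counter

def collapse_to_tss(reads):
    """Count unique molecules per (chrom, strand, 5' position).

    reads: iterable of (chrom, start, end, strand, umi) tuples with 0-based,
    half-open coordinates. This layout is an assumption for illustration.
    """
    seen = set()
    counts = Counter()
    for chrom, start, end, strand, umi in reads:
        # The 5' end is the leftmost base on '+' and the rightmost base on '-'.
        five_prime = start if strand == "+" else end - 1
        key = (chrom, strand, five_prime, umi)
        if key in seen:
            # Same molecular identifier at the same 5' end: counted once.
            continue
        seen.add(key)
        counts[(chrom, strand, five_prime)] += 1
    return counts

reads = [
    ("chr14", 100, 190, "+", "AAGT"),
    ("chr14", 100, 195, "+", "AAGT"),  # duplicate of the first molecule
    ("chr14", 100, 192, "+", "CCTA"),
    ("chr14", 300, 400, "-", "GGAC"),
]
tss_counts = collapse_to_tss(reads)
```

In this toy input, position 100 on the (+) strand is supported by two distinct molecules, and position 399 on the (-) strand by one.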
Figure 1. Genome-wide identification of TSS during P. falciparum asexual cycle. See also Figures S1 and S2.

(A) Capped mRNA molecules are captured using direct ligation of barcoded RNA adapters to transcript 5′ ends. Total RNA samples are sequentially treated with a 5′ P-dependent exonuclease (TEX) and an alkaline phosphatase (CIP) to enrich for capped mRNA molecules. Illumina libraries are then amplified after reverse transcription of the tagged mRNA fragments.
(B) Coverage track showing the TSS mapped to a locus on chromosome 14. Filtered reads from all time points and biological replicates were pooled and collapsed on their first base pair to visualize sites of initiation. The scale indicates the read counts at each genomic position.
(C) Coverage track illustrating the annotation obtained using alternate sequencing filters. Read counts for the raw signal are indicated. Green boxes show the TSS block annotation at each step of the incremental filtering.
(D) Distribution of the number of TSS blocks associated with previously annotated transcription units.
(E) Length distribution of the 5′ UTRs derived from our annotation. While most TSS are located upstream of the start codon, a large number of transcripts have very short (<100 bp) or no 5′ UTRs due to initiation within gene bodies (black). Length distribution of the 5′ UTRs for genes in a tandem (red) and bi-directional configuration (orange) is also shown.
(F) Examples of TSS distributions in multi-exon genes. Transcription widely initiates within the body of each exon of PF3D7_0512600, whereas TSS clusters are only detected upstream of the start codon for PF3D7_0320700. TSS block annotation is shown in blue.
Transcription initiation events in P. falciparum appeared to occur at multiple closely spaced sites, across relatively wide regions (Figure 1B), indicative of a broad promoter architecture similar to that of other eukaryotic organisms (Lenhard et al., 2012). We therefore implemented an analytical method based on mathematical morphology operations and filtering tools (Heijmans et al., 1989) that adapts to the various types of captured signals (dispersed or dense distribution, for instance). Instead of clustering the TSSs within an empirically chosen window size across the genome, it allowed us to define separate regions of transcription initiation or “TSS blocks” within each gene locus in a systematic way (Figures 1C and S2, Supplemental Experimental Procedures). The use of alternating sequential filters (Heijmans et al., 1989) on the pooled data (i.e. combining replicates and developmental stages) led to the definition of more than 44,000 potential TSS blocks genome-wide. These distinct regions of transcription initiation were associated where possible with one of the annotated transcription units (Figure S2 and Supplemental Experimental Procedures) to constitute a high-resolution atlas of TSSs for the P. falciparum transcriptome. Biological replicates exhibited an excellent correlation (Spearman correlation > 0.9), thereby confirming the robustness of our protocol (Figure S1E). In total, between the two replicates, transcription initiation events were detected for 90% of the P. falciparum annotated genes (4,955 out of 5,510). The vast majority (92%) of these contained more than one cluster of transcription initiation sites (median of ∼6 TSS blocks), thereby expanding on previous observations from the partial analysis of full-length cDNAs (Watanabe et al., 2002).
This striking finding suggests that the majority of the genes use not only multiple sites as evidenced by the dispersed patterns of transcription initiation, but may also utilize multiple promoter regions to initiate transcription (Figure 1D). Importantly, this confirms the additional level of resolution provided by our approach in comparison to that of current RNA-Seq technologies.
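To make the principle of alternating sequential filtering concrete, the following is a minimal one-dimensional sketch, not the implementation used in this study. It assumes a binary signal (1 where at least one tag starts, 0 elsewhere) and flat structuring elements of increasing radius; the function names and the choice of applying a closing followed by an opening at each step are illustrative assumptions.

```python
def dilate(x, r):
    """Flat dilation: maximum over a window of radius r (clipped at the ends)."""
    n = len(x)
    return [max(x[max(0, i - r): min(n, i + r + 1)]) for i in range(n)]

def erode(x, r):
    """Flat erosion: minimum over a window of radius r (clipped at the ends)."""
    n = len(x)
    return [min(x[max(0, i - r): min(n, i + r + 1)]) for i in range(n)]

def closing(x, r):
    """Dilation then erosion: fills short gaps between neighboring sites."""
    return erode(dilate(x, r), r)

def opening(x, r):
    """Erosion then dilation: removes runs narrower than the window."""
    return dilate(erode(x, r), r)

def asf(x, max_radius):
    """Alternating sequential filter: closing then opening, radius 1..max_radius."""
    for r in range(1, max_radius + 1):
        x = opening(closing(x, r), r)
    return x

def blocks(x):
    """Return half-open (start, end) intervals of consecutive non-zero positions."""
    out, start = [], None
    for i, v in enumerate(x):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(x)))
    return out

signal = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
tss_blocks = blocks(asf(signal, 1))
```

With a radius of 1, the closely spaced sites at positions 2 to 7 merge into a single block, whereas the isolated site at position 14 is filtered out. Larger radii merge sites across wider gaps, which is what lets such a filter adapt to both dense and dispersed signals.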
Transcription in P. falciparum may initiate close to the coding sequence of genes
Given that the majority of genes appeared to use multiple sites to initiate transcription, we examined their global distribution relative to the coding region (CDS) of the associated genes. This analysis showed that 81% of the total TSS blocks were positioned less than 1000-bp upstream of the start codon, and more than 65% of these were located within a distance of 500-bp or less upstream of the CDS (Figure 1E). Interestingly, the mode of the TSS block distribution was within 50-bp upstream of the start codon (Figure 1E), suggesting transcription initiation in regions very close to the start of the CDS that may produce leader-less transcripts, i.e. with little or no 5′ UTR.
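The distances discussed above depend on a strand-aware convention, which the following sketch spells out. The coordinate layout (a single TSS position per block and the first base of the start codon, both on the same coordinate system) and the function names are assumptions made for illustration.

```python
def utr5_length(tss, start_codon, strand):
    """Signed distance from a TSS to the first base of the start codon.

    Positive values: TSS upstream of the CDS (length of the 5' UTR).
    Zero or negative values: TSS at, or downstream of, the start codon.
    """
    return start_codon - tss if strand == "+" else tss - start_codon

def fraction_upstream_within(lengths, max_len):
    """Fraction of TSS located between 0 and max_len bp upstream of the CDS."""
    return sum(1 for d in lengths if 0 <= d <= max_len) / len(lengths)

# (tss, start_codon, strand) for four hypothetical TSS blocks
tss_blocks = [
    (950, 1000, "+"),   # 50 bp upstream
    (400, 1000, "+"),   # 600 bp upstream
    (5200, 5000, "-"),  # 200 bp upstream on the reverse strand
    (1030, 1000, "+"),  # 30 bp inside the CDS
]
lengths = [utr5_length(*b) for b in tss_blocks]
close_fraction = fraction_upstream_within(lengths, 500)
```

Here two of the four hypothetical blocks lie within 500 bp upstream of the start codon, and the last one, with a negative distance, would be counted as an internal TSS.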
Our analysis further identified transcription initiation in non-conventional sites, such as within the coding region of single exon genes, but also in exons of multi-exon genes (Figure 1F). Individual instances of such configurations were also observed in the FULL-parasites dataset (Tuda et al., 2010) (Figures S3A and S3B). 49% of all identified TSS blocks were actually located downstream of the start codon. This indicates a significant number of transcription initiation events within the CDS that potentially lead to truncated protein products as observed in yeast (Fournier et al., 2012). The proportion of internal TSS blocks that are directly followed by an ATG codon in frame with the annotated coding sequence (∼90%) suggests that it might also be the case in P. falciparum. However, some of the sequencing tags recovered in these regions may also result from recapping of mRNA degradation products or non-coding RNAs (Schoenberg and Maquat, 2009).
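One simple way to test whether an internal TSS is followed by an in-frame ATG is sketched below. It assumes the CDS is given as a spliced sequence starting at the annotated start codon and that the TSS is expressed as a 0-based offset into it; the exact criterion applied in this study is not specified here, so this is an illustration only.

```python
def first_inframe_atg(cds, tss_offset):
    """Offset of the first ATG at or after tss_offset that is in frame with
    the annotated start codon (offset 0), or None if there is none."""
    # Move forward to the next codon boundary of the annotated reading frame.
    first = tss_offset + (-tss_offset) % 3
    for i in range(first, len(cds) - 2, 3):
        if cds[i:i + 3] == "ATG":
            return i
    return None

cds = "ATGAAATTTATGGGATAA"  # codons: ATG AAA TTT ATG GGA TAA
after_internal_tss = first_inframe_atg(cds, 4)
no_atg_left = first_inframe_atg(cds, 10)
```

A TSS at offset 4 is followed by an in-frame ATG at offset 9, which could yield an N-terminally truncated product, whereas a TSS at offset 10 has no in-frame ATG downstream.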
Further analysis showed that the distribution of the TSS blocks was similar whether genes displayed a tandem or bidirectional configuration (Figure 1E) and was not markedly affected by the presence of introns (Figure S3C). Increased length of the intergenic region was associated with a wider spread of the TSSs upstream of the start codon, with the TSSs closer to the CDS for genes separated by a short distance (Figure S3D). The vast majority of annotated transcription units for which no TSS was identified corresponded to antigenic variant families and to genes that are expressed in the other stages of the parasite infectious cycle (Table S2A). Nonetheless, there were also genes reported to be expressed during the IDC for which we did not detect any transcription initiation events. Among these, some may be genes that produce uncapped transcripts, which our approach cannot capture, or that are expressed at a level below our detection threshold.
P. falciparum transcriptome exhibits widespread intergenic and bidirectional promoter activities
An appreciable number of TSS blocks (4,600, i.e. 10%, across all 14 chromosomes) could not be assigned to any of the gene loci found in the current annotation of the P. falciparum genome. These were therefore categorized as “new blocks of transcription initiation” and manually clustered (Figure 2A, Supplemental Experimental Procedures) to isolate more than 1,500 “potential transcription units” (Figure 2B, Table S2B). Most of these clusters are located in non-coding regions, or within very large intergenic regions between annotated genes, suggesting that a large fraction of the P. falciparum non-coding genome is actively transcribed (Raabe et al., 2010). Such non-coding transcription could play a role in regulating gene expression, for instance via recruitment of chromatin modifiers or by strengthening transcriptional activities through interactions with the transcriptional machinery (Barrandon et al., 2012). This postulate extends to the telomeric and subtelomeric regions of the parasite genome, in which a few non-coding transcripts had been detected and linked to transcriptional control and silencing (Broadbent et al., 2011; Epp et al., 2008; Kyes et al., 2007). We generalized these findings with the identification of transcripts emerging from the telomeric regions of 11 out of all 14 chromosomes and within introns of at least 80% of all var genes (Figures S3E and S3F, Tables S3A and S3B).
Figure 2. Global organization of the TSS along the P. falciparum genome. See also Figure S3.

(A) Clustering of new transcription units on the (-) strand of chromosome 4. TSS blocks were clustered using the distance to their neighbors and the final dendrogram was segmented after manually evaluating the number of relevant clusters. Snapshots of coverage tracks on the right show examples of large, medium and small transcription units.
(B) Distribution of the number of TSS blocks associated with the clustered transcription units. The enlargement illustrates the presence of transcription units with a high number of TSS blocks (>7 per unit).
(C) Coverage track of a genomic locus that contains several newly annotated antisense transcripts potentially overlapping with the 3′ end of genes located on the opposite strand. TSS block annotations are indicated in blue (+ strand) and green (- strand).
(D) Distribution of the various classes of antisense transcripts. The distance from the end of the antisense transcriptional unit (i.e. group of antisense TSS blocks assumed to derive from the same transcriptional unit) to the 3′ end of the coding region of annotated sense transcripts is plotted against the distance from the start of the antisense transcriptional unit to the 3′ end of the coding region of annotated sense transcripts. This distribution reveals that most antisense transcripts initiate close to the 3′ end of annotated genes.
(E) Coverage track depicting a bidirectional promoter in which transcription initiates in opposite directions from sites located a few hundred base pairs apart.
(F) Distribution of the distance between the 5′ ends of bidirectional promoter elements. Negative distances correspond to TSS located upstream of the 5′ reference point while positive values correspond to TSS located downstream (and therefore potentially resulting in overlapping transcripts).
Further pervasive transcription was observed with 31% of the 4,955 examined genes displaying antisense transcription initiation (Figure 2C). A large fraction (47%) of the ∼2,000 antisense TSS blocks identified genome-wide resulted from non-coding transcription. Interestingly, a third of these TSS blocks directly overlapped with the 3′ end of the sense annotated genes (Figure 2D), thereby plausibly interfering with the regulation of the derived sense transcripts. Further analysis showed that 33% of the antisense TSS blocks were located in promoter regions where transcription may initiate bidirectionally (Figure 2C). Such regions were prevalent, the presence of TSS blocks in a divergent arrangement being observed for about half of all annotated transcription units (Figure 2E). In more than 70% of such regions, the pairs of divergent TSS blocks were separated by a distance of 400-bp or less from each other, arguing for the likely presence of bidirectional promoters with shared regulatory elements (Figure 2F). This type of configuration may allow for the co-expression of adjacent genes, or alternatively the directional control of gene expression through binding of regulators to separate promoter elements.
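The notion of divergent TSS pairs separated by 400 bp or less can be expressed as a nearest-neighbor search between strands. The sketch below assumes one representative 5′ position per TSS block and considers only non-overlapping divergent pairs (the (-) strand site lying upstream of the (+) strand site); both simplifications, and the function name, are ours.

```python
from bisect import bisect_right

def divergent_pairs(plus_tss, minus_tss, max_dist=400):
    """Pair each '+' strand TSS with the nearest upstream '-' strand TSS,
    keeping pairs separated by at most max_dist bp."""
    minus_sorted = sorted(minus_tss)
    pairs = []
    for p in sorted(plus_tss):
        # Index of the last '-' strand TSS located strictly before p.
        i = bisect_right(minus_sorted, p - 1) - 1
        if i >= 0 and p - minus_sorted[i] <= max_dist:
            pairs.append((minus_sorted[i], p))
    return pairs

plus_tss = [1000, 5000, 9000]
minus_tss = [800, 4000, 8990, 9500]
pairs = divergent_pairs(plus_tss, minus_tss)
```

In this toy example, two candidate bidirectional promoters are retained (separations of 200 bp and 10 bp), while the pair separated by 1,000 bp is discarded.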
Core promoter regions are defined by a specific chromatin organization that includes a precise positioning of the +1 nucleosome
Generating a genome-wide map of transcription initiation events enabled us to assess the structural properties that may define TSS choice in the malaria parasite. Among these, local chromatin organization plays an important role in the regulation of eukaryotic transcription initiation, with the association of specific histone marks and variants with active promoters. We therefore examined the dynamics of H3K4me3-, H3K9ac-, H2A.Z- and H2B.Z occupancies around the most active transcription start sites, using publicly available datasets of chromatin immuno-precipitation followed by sequencing (ChIP-Seq) (Bártfai et al., 2010; Hoeijmakers et al., 2013b) (Figure 3A, Supplemental Experimental Procedures).
Figure 3. Architecture of active promoters during P. falciparum asexual cycle. See also Figure S3.

(A) Occupancy of histone post-translational modifications and variants around stage-specific promoters. The occupancy profiles are organized by mark/variant (rows in the table) and stage-specific genes (columns in the table; ring-specific genes are in purple, early trophozoite-specific genes are in blue, trophozoite-specific genes are in orange and schizont-specific genes are in red). In each case, the occupancy profiles were plotted for each time point at which the mark or variant was profiled (Bártfai et al., 2010; Hoeijmakers et al., 2013b). Color shades from light (early time point) to dark (late time point) represent progression through the IDC. Only the most active TSS block was considered for each gene and all ChIP signals were normalized using the chromatin input signal. Stage-specific promoters were called using the counts obtained from our study (see Supplemental Experimental Procedures). This global view illustrates developmentally regulated changes in the distribution of histone marks and variants around P. falciparum promoters.
(B) Local nucleotide content and nucleosome occupancy around the most active TSS block. The heatmap illustrates the local GC content in 10-bp bins around the TSS block for each gene with an annotated TSS block. Analysis of the nucleotide composition in a symmetrical 10-bp window around the dominant TSS peak within the block shows the preferential pyrimidine (T)-purine (A) di-nucleotide used to start transcription. The average profile below the heatmap reveals the drop in GC content around the site of initiation as well as the +1 nucleosome barrier. The additional downstream peaks enriched in GC nucleotides are also apparent and delimit a decrease in nucleosome occupancy. Nucleosome occupancy was plotted using the MNase-Seq data from (Bunnik et al., 2014).
Alignment of all transcripts at the start position of their most active TSS block showed predominant enrichment in these histone post-translational modifications and variants at the 5′ end of genes (Figure 3A), as in other eukaryotic species. H3K4me3- and H3K9ac-marked nucleosomes were precisely positioned at the +1 position, i.e. the first nucleosome downstream of the TSS block, suggesting that the strong relationship between TSS and position of the +1 nucleosome is maintained in the malaria parasite (Struhl and Segal, 2013). Nevertheless, the two marks displayed clearly distinct occupancy profiles. Globally, H3K4me3 enrichment increased steadily as the parasite progressed through the IDC, with a stronger marking of the +1 and +2 nucleosomes and a more pronounced nucleosome-free region (NFR) by the end of the cycle (Figure 3A). In contrast, H3K9ac enrichment at the +1 nucleosome position increased progressively before dropping at the schizont stage to a level equivalent to or lower than that observed at early developmental stages (Figure 3A). Enrichment for the acetylation mark around the TSSs was more pronounced with a prominent NFR at the trophozoite stage, during which the bulk of transcriptional activity occurs (Sims et al., 2009).
Interestingly, the overall profiles of H3K4me3 enrichment were highly similar between groups of genes expressed at different developmental stages, confirming the disconnect between H3K4me3 occupancy and genes' transcriptional activity (Figure 3A) (Bártfai et al., 2010). Genes specifically expressed late in the parasite IDC, on the other hand, exhibited very little variation in occupancy for H3K9ac-marked nucleosomes around the TSSs between time points (Figure 3A). This trend was also observed for nucleosomes containing the histone variants H2A.Z and H2B.Z (Figure 3A). Their occupancy profiles appeared constant and similar between genes expressed at distinct stages of the parasite life cycle, with a characteristic enrichment at the +1 nucleosome (Mavrich et al., 2008) and a stronger depletion towards the 3′ end of transcripts. In contrast, genes preferentially active during schizogony exhibited an increased enrichment for H2A.Z in the 5′ region of the transcripts near the TSS in comparison to that further downstream in the gene body especially at the beginning and end of the IDC. This differed from the more even distribution of H2B.Z on either side of the TSS, which produced a generally flatter profile.
Altogether, these observations indicate that although chromatin organization around the TSS displays expected features such as a marked positioning of the +1 nucleosome downstream of the TSS, the presence of the examined histone marks and variants does not tightly associate with transcriptional activation as observed in other eukaryotes.
Regions of transcription initiation in P. falciparum are characterized by a markedly skewed nucleotide composition at and downstream of the TSS
In addition to examining the chromatin landscape around the sites of transcription initiation, we also assessed whether P. falciparum core promoters are characterized by a particular nucleotide signature. Analysis of the base composition in the core promoter region after aligning all genomic sequences at the most active TSS block revealed a strong decline in GC content around the position of transcription initiation (Figure 3B). Such a decline was also visible albeit within a narrower region around the TSS detected in the body of mono- or multi-exonic genes, suggesting that many of the internal TSS are genuine sites of transcription initiation rather than biological or experimental artifacts (Figure S3G). When looking at the dominant TSS peak within the block, we observed that transcription in the parasite preferentially initiates with the pyrimidine-purine di-nucleotide T-A at position -1, 0 (Figure 3B). This trend was independent of the activity of the considered TSS block (Figure S3H). In contrast, no specific di-nucleotide composition was observed when randomly selecting a genomic position within the blocks (Figure S3H). Surprisingly, we also detected the presence of two well-defined peaks corresponding to a local increase in G/C about 150-bp and 210-bp downstream of the TSSs, respectively, equally visible downstream of the internal exonic TSS (Figures 3B and S3G). Alignment of the profile of GC-content and that of nucleosome positioning derived from (Bunnik et al., 2014) revealed that these two peaks coincided with a dip in nucleosome occupancy. Nucleotide composition around the TSSs did not vary strikingly with the strength of the promoter, except for a greater GC content right at the borders of the TSS blocks, and in particular downstream of those with a lower activity (i.e. less frequent usage) (Figures S3I and S3J). 
This change was accompanied by a shift of the +1 nucleosome and a disappearance of the NFR around the TSSs, indicative of a direct link between nucleotide content, nucleosome positioning and transcriptional activity.
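The two sequence analyses described above (binned GC content around aligned TSSs, and the di-nucleotide at positions -1, 0) reduce to simple counting once sequences are aligned. The sketch below assumes equal-length sequences in which the TSS sits at a fixed index; the toy sequences are invented for illustration and are not data from this study.

```python
from collections import Counter

def gc_profile(seqs, bin_size=10):
    """Mean GC fraction per bin across equal-length, TSS-aligned sequences."""
    length = len(seqs[0])
    profile = []
    for start in range(0, length - bin_size + 1, bin_size):
        gc = sum(
            s[start:start + bin_size].count("G") + s[start:start + bin_size].count("C")
            for s in seqs
        )
        profile.append(gc / (bin_size * len(seqs)))
    return profile

def initiator_dinucleotides(seqs, tss_index):
    """Count the di-nucleotide spanning positions -1 and 0 relative to the TSS."""
    return Counter(s[tss_index - 1:tss_index + 1] for s in seqs)

# Invented 20-nt sequences with the TSS (position 0) at index 10.
seqs = [
    "GCATATATATATATATATGC",
    "CCATTTTTTTAAAAAAAAGG",
    "GGAAAAAAACATTTTTTTTT",
]
profile = gc_profile(seqs)
dinucs = initiator_dinucleotides(seqs, 10)
```

For these toy sequences the profile contains two bins (upstream and downstream of the TSS), and T-A is the most frequent di-nucleotide at positions -1, 0.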
Dynamic analysis of TSS usage during P. falciparum IDC reveals alternative transcription initiation and possibly shared regulatory elements
The generation of our genome-wide TSS annotation enabled us to examine the dynamics of transcription initiation during the P. falciparum IDC and identify significant differences in TSS usage between developmental stages. Stringent filters using the replicate information were applied (Figure S4A, Supplemental Experimental Procedures) and confirmed extensive transcriptional activity during the P. falciparum IDC. In total, transcription initiation events conserved across replicates were detected for 74% of all 5,510 protein-coding genes (local FDR < 0.1), and 68% of all annotated transcription units (4,838 out of 7,090, FDR < 0.1) that include the new transcription units as defined above.
We found that the majority of the actively transcribed genome (91% of the 4,738 transcription units to which active TSS blocks were associated) exhibited a cycling behavior throughout the parasite developmental cycle (log2 fold change > 0.5, Figure 4A, Figure S4B). These numbers are in agreement with previous microarray and RNA-Seq studies and confirm that the parasite transcriptional cascade is recapitulated at the TSS level. Furthermore, comparison of our data with a recent RNA-Seq analysis (Siegel et al., 2014) showed that the detected transcriptional initiation events correlated well with gene expression levels (Figure S4C). This suggests that the majority of these events are associated with the production of full-length transcripts.
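A fold-change criterion of this kind can be sketched as follows. The definition used here, the log2 ratio between the highest and lowest normalized counts across time points with a pseudocount of 1, is an assumption made for illustration; the statistical procedure actually applied is described in the Supplemental Experimental Procedures.

```python
import math

def is_cycling(counts, threshold=0.5, pseudocount=1.0):
    """True if log2(max/min) of the time-course counts exceeds the threshold.

    The pseudocount avoids division by zero for time points without tags.
    """
    hi = max(counts) + pseudocount
    lo = min(counts) + pseudocount
    return math.log2(hi / lo) > threshold

# Hypothetical normalized counts at the six time points of the IDC.
periodic = [10, 40, 160, 80, 20, 10]
constant = [100, 105, 98, 110, 102, 99]
```

Under this definition the first profile is called cycling and the second is not.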
Figure 4. Dynamic usage of transcription initiation sites during P. falciparum intra-erythrocytic development. See also Figure S4.

(A) Coverage tracks illustrating the periodic transcription initiation events associated with the gene PF3D7_0520400. This cycling transcriptional activity is characteristic of the reported cascade of gene expression during the parasite IDC.
(B) Normalized counts of a set of asynchronous TSS blocks associated with a pair of bidirectional genes (PF3D7_0505500/PF3D7_0505600). The timing of maximal expression for the 4 blocks defines two classes of TSS clusters with different expression patterns and yet sometimes associated with the same gene.
(C) Normalized counts of a pair of genes (PF3D7_0925000/PF3D7_0925100) in antisense orientation and displaying temporal usage of TSS blocks (marked as dots) in opposite phase.
(D) Example of a gene, PF3D7_1332500, harboring two types of TSS blocks with different temporal usage during the IDC. The coverage tracks and normalized counts indicate a strong increase in activity at 18-hpi for the main TSS block, while the secondary block near the start of the gene coding sequence is mostly used at the early and late stages. Northern blot analysis clearly illustrates the changes in relative abundance of the two populations of RNA molecules between stages. These include a long transcript emerging from the secondary TSS block, whose length suggests that it might fully overlap with the gene coding sequence and thus constitute a leader-less transcript. The genomic location covered by the northern blot probe is highlighted in orange.
In view of the non-random organization of eukaryotic genomes, indicative of the functional importance of gene distribution (Hurst et al., 2004), we first investigated possible interdependency in TSS usage between genes depending on their arrangement in the genome. For the vast majority (91%) of gene pairs in divergent configuration, bidirectional transcription initiated coordinately, suggesting potential co-expression (Figure S4D). However, a number of those (183 out of 2,084, Table S4A) exhibited distinct cycling behavior, as reflected by the contrasting profiles of the associated TSS blocks (Figure S4E). Intriguingly, we also identified divergent gene pairs with distinct expression patterns that comprised TSS blocks with highly similar profiles of temporal usage (Figure 4B, Figure S4F), suggesting the possible presence of both common and separate promoter elements. Altogether, these results suggest that bidirectional promoters may contain multiple regulatory elements that would enable both coordinated and dissociated TSS usage within divergent transcription units. Given that many of the regions of bidirectional transcription were also sites of antisense transcription, we analyzed the temporal usage pattern of TSS blocks for cycling sense/antisense pairs (11% of all sense/antisense pairs, stringent local FDR threshold of 0.01). We included in our survey sense/antisense pairs on opposite strands in a convergent configuration. We observed dissociated transcription for most of these pairs. Indeed, only a minority of the antisense transcription initiation events (<1%) occurred in synchrony with their sense counterparts, whereas a few pairs of sense/antisense transcription units (52 out of the ∼1100 pairs surveyed) exhibited expression profiles in opposite phase (Figure 4C, Table S4B).
Further investigation will be needed to assess whether these profiles are the result of an interdependency between sense and antisense transcription initiation, whereby antisense transcription acts as a regulator of gene expression by promoting or preventing expression of the sense transcript (Wei et al., 2011).
Given that most transcription units in P. falciparum contain multiple TSS clusters, we additionally examined the dynamics of transcription initiation within each of those. For most of the annotated transcription units, transcription appeared to initiate coordinately at each of the associated TSS blocks, suggesting that even when transcription starts at multiple sites within a given gene locus, all sites are concomitantly used (Figure 4A). Nonetheless, a subset of P. falciparum transcription units (124, 3.4%) exhibited TSS blocks with distinct cycling behavior (Figure 4D, Table S4C). Among these, 57 harbored TSS blocks with temporal usage patterns in opposite phase, indicative of a switch between TSS in a stage-specific manner and suggesting that initiation events at these sites are mutually exclusive (Table S4D). Northern blot analysis of selected candidates (Figures 4D and S4G) confirmed that transcription initiation at alternative sites does lead to production of full-length transcript isoforms, some of which are possibly non-coding (Figure S4G) and whose expression is developmentally regulated. Interestingly, some of the TSS blocks near the start of the gene coding sequence may generate leader-less transcripts, as observed for PF3D7_1332500 (Figure 4D). Alternative transcription initiation may lead to the occurrence of alternative translation initiation sites. These in fact appear to be widespread and may influence translation efficiency or enrich the pools of proteins (de Klerk and Hoen, 2015). Interestingly, initial comparison of our TSS map with the recently published profile of ribosome occupancy during the IDC of P. falciparum (Caro et al., 2014) suggests that a dynamic change in TSS usage may be accompanied by a switch between translation initiation sites (Figure S4H). Additionally, it further argues for the likely co-existence of transcript isoforms that are differentially translated (Figure S4I).
Discussion
This report constitutes a comprehensive survey of the transcription initiation events occurring across the P. falciparum asexual blood cycle and provides insights into promoter architecture for a complex and thus far poorly characterized genome. Our data therefore represent a powerful resource that will enable further investigation into the molecular mechanisms governing the selection of TSS and more broadly transcriptional regulation in such an extremely base-biased environment.
An earlier in silico survey of DNA physicochemical properties predicted core promoter regions using the partial TSS mapping from Watanabe et al. (2002) and estimated thymine-adenine to be the preferred sequence at the TSS (Brick et al., 2008). We demonstrated that TSS selection at the genome-wide level is organized around such a specific di-nucleotide composition, reflecting the general preference for a pyrimidine-purine initiation site observed in other eukaryotes (Carninci et al., 2006). Given the low-complexity sequence context associated with the P. falciparum genome, the question therefore arises of what drives the selection of a genomic site for transcription initiation in such an AT-rich environment. We showed that the frequency of usage of a TSS is actually guided by the local G/C content at precise positions downstream of the TSS. This indicates that at least part of the structure of core promoters is genomically-encoded and that there might be spatial constraints for the positioning of the basal transcription machinery. Indeed, the local increase in GC content around weaker TSS suggests the establishment of tighter boundaries for transcription initiation, possibly by way of a more defined nucleosome positioning or particular epigenetic mechanisms such as DNA methylation (Ponts et al., 2013). In fact, our observation that the diminution in TSS strength is accompanied by an increase in the distance between the TSS and the +1 nucleosome, together with a reduced NFR, argues for an association between TSS selection, nucleosome positioning and regional nucleotide content. Altogether, these observations indicate that despite the parasite's extraordinary AT-biased genome, promoter selection and activity are primarily defined by local base composition and chromatin structure.
Intriguingly, timing of TSS usage does not correlate with enrichment in the histone marks H3K4me3 and H3K9ac or the variants H2A.Z and H2B.Z at the 5′ end of genes, which are typically associated with promoter activity (Li et al., 2007; Zlatanova and Thakar, 2008). Instead these seem to be associated with P. falciparum maturation, independently of the activation of gene expression, in agreement with previous studies reporting a rather moderate association between the dynamic changes in the histone marks/variants and transcript levels (Bártfai et al., 2010; Hoeijmakers et al., 2013b). Interestingly, the greatest changes in chromatin organization around the TSS occur at the schizont stage and for schizont-specific genes. This could reflect the replication activity that takes place during this developmental stage or a mechanism of synchronization of gene expression to prepare all future daughter cells for the next cycle.
Further examination of the dynamics of TSS usage captured the cascade of gene expression characteristic of the malaria parasite, whereby all TSS associated with the same transcription unit were coordinately used at a specific time during the parasite's life cycle. For a vast majority of such transcription units, we observed a high recurrence of synchronous transcription initiation events from multiple sites. This may lead to the simultaneous production of transcript isoforms with alternative 5′ UTRs, as indicated by our northern blot analysis, and thus increased transcriptome diversity (Pal et al., 2011). The concomitant expression of several isoforms with varying 5′ UTRs may have important regulatory consequences (Davuluri et al., 2008), notably by influencing transcript stability or translation efficiency (de Klerk and Hoen, 2015). Interestingly, recent ribosome and polysome profiling studies in P. falciparum have reported enrichment for ribosomes along the 5′ UTR of numerous transcripts (Bunnik et al., 2013; Caro et al., 2014), including those lacking AUGs (Caro et al., 2014). This suggests the existence of upstream open reading frames (uORFs) that are actively translated from non-cognate initiation codons (Ingolia, 2014) and may affect the efficiency of translation of the full-length transcript. In addition to the numerous transcriptional events detected upstream of the translation start codon, we noticed that most of the transcriptionally active loci contained TSS that overlap with the start of the coding sequence. Our observations suggest that these events produce at least in some cases leader-less transcripts that may be translated (Cortes et al., 2013). The high number of mono- and multi-exonic genes containing exonic promoters also demonstrates the general prevalence of transcription initiation events within gene bodies as observed elsewhere (Carninci et al., 2006). 
Conservation of the TA di-nucleotide pattern and GC patches at exonic promoters argues for these internal TSSs being true sites of transcription initiation. Nevertheless, some of the internal TSS may reflect recapping of degradation products (Lenhard et al., 2012) or result from a moderate level of experimental noise, from which other studies reporting similar observations, such as that of Tuda et al. (2010), may not be immune either.
Analysis of the relative organization of transcription units, to isolate potential regulatory elements and co-regulated loci, led to the identification of numerous TSS positions in intergenic and other non-coding regions. Most of these TSS could be clustered in hypothetical, previously unreported non-coding transcription units that may carry a regulatory role, as recently suggested for the telomeric lncRNAs during parasite invasion (Broadbent et al., 2015). Many of these non-coding transcription units correspond to transcription initiation in an antisense orientation to coding genes. More generally, this configuration was widespread across the genome, regardless of the coding potential of the sense/antisense pairs, as previously observed (López-Barragán et al., 2011; Militello, 2005; Siegel et al., 2014). Such arrangements that presumably yield overlapping transcripts may carry a regulatory role, as shown by the expression of certain sense-antisense transcript pairs in a mutually exclusive manner. In multiple loci, antisense events originated from bidirectional transcription activity, which we detected for most of the gene pairs in a head-to-head arrangement. For many, we observed coordinate usage of the TSS in both orientations, indicative of the transcription units' co-expression. However, we also identified genomic loci for which divergent transcriptional events were asynchronous or sometimes mutually exclusive, suggesting the possibility of stage-specific regulation of the directionality of bidirectional promoters. Altogether these observations point towards the probable presence of bidirectional promoters with shared and separate regulatory elements (Trinklein et al., 2004).
In contrast to the majority of genes for which concomitant usage of the associated TSS was detected, a minority appeared to switch from one TSS to another in a stage-specific manner. This indicates that TSS usage may be temporally controlled, resulting in the generation of developmentally regulated variants. This observation suggests the existence of a context-dependent selection of the TSS, whereby regulatory processes guide the choice for alternative promoters or sequence elements within the same promoter. Interestingly, dynamic TSS usage has been reported in the context of tissue-specificity (FANTOM Consortium et al., 2014) and embryonic development (Haberle et al., 2014) and attributed to the differential activity of chromatin-defined enhancers (Andersson et al., 2014), or a switch between TSS selection mechanisms (Haberle et al., 2014). While several studies argue for the existence of cis-regulatory sequences in the genome of P. falciparum (Horrocks et al., 2009), further investigations will be needed to assess whether these or other promoter/regulatory element switching mechanisms influence this choice. Local changes in nucleosome configuration mediated by distinct nucleotide compositions (Haberle et al., 2014), altered chromatin states (Davuluri et al., 2008) or the binding of regulatory factors such as the ApiAP2 transcription factors to specific regions around the TSS (Campbell et al., 2010; De Silva et al., 2008) may for instance be at play. Given the restricted number of P. falciparum genes that display such a behavior of dynamic TSS usage, it will also be interesting to investigate whether this particular mode of transcription regulation is linked to the biological function of these genes. 
Indeed, the developmentally regulated switch between TSS blocks for this subset of genes may be a mechanism to transcriptionally control the stability or translation efficiency of the associated transcripts, or even their biological function, when it is needed the most. The fact that some of the transcripts emerging from alternate TSS blocks display a distinct temporal profile from that reported for the expression of the corresponding gene would suggest that these transcripts are non-coding RNAs that are possibly non-polyadenylated, therefore not detected with classical transcriptomic approaches.
With more than 60% of P. falciparum genes for which no biological function has been assigned (Brehelin et al., 2008), assessing the functional consequence of a developmentally regulated switch between TSS will require further mechanistic studies.
Our precise mapping of TSS genome-wide revealed an unexpectedly complex and dynamic transcriptomic landscape and constitutes a major advance towards deciphering the molecular basis of P. falciparum transcriptional control. We observed the existence of a TSS signature and postulate that spatial constraints mediated by local base composition and nucleosome occupancy may control the accessibility of the basal transcriptional machinery to promoter regions, and thus frequency of TSS usage. This highly valuable resource therefore opens new avenues into the characterization of regulatory elements and the mechanisms directing TSS selection. The recent implementation of CRISPR/Cas9 approaches to edit the parasite genome (Ghorbal et al., 2014; Wagner et al., 2014) will permit further examination of the interactions between promoters and transcription factors, as well as whether widespread non-coding transcriptional activity plays a role in regulating gene expression.
Experimental Procedures
Detailed protocols can be found in the Supplemental Experimental Procedures.
Parasite culture
3D7 parasites were synchronized by two consecutive treatments with 5% sorbitol for three or more successive generations before initiating time point samplings every 8 hours throughout the IDC.
5′ ends capture and library preparation
Single-strand ligation to transcripts' 5′ end was performed using an RNA adapter containing an 8-mer molecular barcode. cDNA second-strand was synthesized using a biotinylated primer complementary to the RNA 5′ adapter. All samples were treated in the presence of in vitro transcripts.
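The 8-mer molecular barcode makes it possible to distinguish independent ligation events from amplification duplicates that share the same 5′ end. The Python sketch below illustrates one way such a collapsing step could work. The tuple layout, the chromosome label, the function name `collapse_tags` and the barcode filter are illustrative assumptions, not the pipeline used in this study.

```python
from collections import defaultdict


def collapse_tags(reads):
    """Count unique barcodes per 5' end position.

    `reads` is an iterable of (chrom, strand, pos, barcode) tuples, where
    `pos` is the 5' end of the alignment and `barcode` is the 8-mer read
    from the adapter (hypothetical input layout).
    Returns {(chrom, strand, pos): number_of_distinct_barcodes}.
    """
    barcodes = defaultdict(set)
    for chrom, strand, pos, barcode in reads:
        # Assumed filter: ignore barcodes of the wrong length or with Ns.
        if len(barcode) != 8 or "N" in barcode:
            continue
        barcodes[(chrom, strand, pos)].add(barcode)
    return {site: len(seen) for site, seen in barcodes.items()}


# Toy input: two reads at the same site share a barcode (duplicates).
reads = [
    ("chr1", "+", 100, "ACGTACGT"),
    ("chr1", "+", 100, "ACGTACGT"),
    ("chr1", "+", 100, "TTGCAAGC"),
    ("chr1", "-", 250, "GGATCCAA"),
]
counts = collapse_tags(reads)
```

In this toy input the site at position 100 keeps two tags after collapsing, because only two distinct barcodes were observed there.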
Northern Blot analysis
Probe templates were labeled using DIG Northern Starter Kit (Roche) as per manufacturer's instructions. Hybridizations were performed at 55°C.
TSS annotation, antisense and bidirectional TSS detection
To define TSS blocks, morphological operations were performed on the tag counts that were summed across time points and replicates for each position of the genome. The final annotation corresponds to the genomic coordinates of every block associated with the optimal filtering step of each genomic. TSS blocks identified on the opposite strand of annotated features were computed within a window of 500-bp around their 3′ end to detect antisense transcription initiation events. Pairs of annotated features in a divergent arrangement that were separated by a distance of 1-kb or less were analyzed for possible bidirectional transcriptional activity.
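As an illustration of how morphological operations can turn per-position tag counts into blocks, the Python sketch below applies a one-dimensional closing (dilation followed by erosion) to a thresholded count vector and reports the resulting runs. The structuring element half-width, the count threshold and the function names are hypothetical choices made for the example; the actual operations and filtering steps are described in the Supplemental Experimental Procedures.

```python
def dilate(mask, half_width):
    """A position is set if any position within half_width is set."""
    return [any(mask[max(0, i - half_width):i + half_width + 1])
            for i in range(len(mask))]


def erode(mask, half_width):
    """A position stays set only if all positions within half_width are set."""
    return [all(mask[max(0, i - half_width):i + half_width + 1])
            for i in range(len(mask))]


def call_blocks(counts, min_count=1, half_width=1):
    """Return (start, end_exclusive, total_tags) for each block.

    Closing bridges gaps of up to 2 * half_width empty positions between
    covered positions; both parameters are assumed values.
    """
    mask = [c >= min_count for c in counts]
    closed = erode(dilate(mask, half_width), half_width)
    blocks, start = [], None
    # A trailing False sentinel closes a block that reaches the last position.
    for i, inside in enumerate(closed + [False]):
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            blocks.append((start, i, sum(counts[start:i])))
            start = None
    return blocks


# Toy strand-specific tag counts summed across samples.
counts = [0, 0, 3, 0, 5, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0]
blocks = call_blocks(counts)
```

Here the single empty position between the first two peaks is bridged, so positions 2 to 5 form one block, while the isolated peak at position 12 remains a separate block.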
Analysis of the core promoter architecture
Chromatin organization around TSSs was analyzed using ChIP-seq datasets from (Bártfai et al., 2010) and (Hoeijmakers et al., 2013b). Bins of 10-bp within a 2500-bp window (from -1000 to +1500-bp) around the start of the most active TSS block were used to estimate the coverage in the corresponding genomic locations.
The local GC content was computed in 10-bp bins around the start of the block with the highest counts. We assessed potential nucleotide biases at the site of transcription initiation by extracting the nucleotide content in a 5-bp window around the genomic coordinate corresponding to the highest peak of collapsed 5′ tags.
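The two sequence measurements described above can be sketched in Python as follows. This is a minimal example for a plus-strand TSS given as a 0-based coordinate. The window extents, the toy sequence and the function names are assumptions for illustration only; a minus-strand TSS would additionally require reverse-complementing the sequence.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def binned_gc(genome_seq, tss, upstream=50, downstream=50, bin_size=10):
    """GC fraction in consecutive bins around a plus-strand TSS.

    Returns (offset_relative_to_tss, gc_fraction) pairs. The window size
    is an assumed parameter; only the 10-bp bin size comes from the text.
    """
    bins = []
    for start in range(tss - upstream, tss + downstream, bin_size):
        if start < 0 or start + bin_size > len(genome_seq):
            continue  # skip bins that run off the sequence
        window = genome_seq[start:start + bin_size]
        bins.append((start - tss, gc_fraction(window)))
    return bins


def tss_window(genome_seq, tss, flank=2):
    """Bases in a (2 * flank + 1)-bp window centered on the TSS."""
    return genome_seq[max(0, tss - flank):tss + flank + 1]


# Toy sequence: AT-rich, with a GC patch just downstream of position 20.
seq = "AAAAAAAAAA" + "TTTTTTTTTT" + "ATGCGCATAT" + "ATATATATAT"
gc_bins = binned_gc(seq, tss=20, upstream=20, downstream=20)
context = tss_window(seq, tss=20)
```

With `flank=2` the extracted window is 5 bp long and places the TSS base at its center, so the −1/+1 di-nucleotide occupies the second and third characters of `context`.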
Analysis of the dynamics of TSS usage
A likelihood ratio test was performed to identify TSS blocks for which the usage was different for at least one of the six time points in both biological replicates. TSS blocks with a local FDR < 0.01 were considered active. Transcript level at every single time point was tested against the mean value calculated across all time points. An absolute log2 fold change ≥ 0.5 was required to identify peaking time points.
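The fold-change criterion for peaking time points can be illustrated with the short Python sketch below. It covers only the threshold on the absolute log2 fold change relative to the mean across time points; the likelihood ratio test and the local FDR estimation are not reproduced. The pseudocount, the time point labels and the function name are assumptions added for the example.

```python
import math


def peaking_time_points(levels, labels, min_abs_lfc=0.5, pseudocount=1.0):
    """Flag time points that deviate from the across-time mean.

    `levels` holds one normalized value per time point. The pseudocount
    (an assumed safeguard) avoids taking the logarithm of zero.
    Returns (label, log2_fold_change) pairs passing the threshold.
    """
    mean_level = sum(levels) / len(levels)
    flagged = []
    for label, level in zip(labels, levels):
        lfc = math.log2((level + pseudocount) / (mean_level + pseudocount))
        if abs(lfc) >= min_abs_lfc:
            flagged.append((label, lfc))
    return flagged


# Toy profile over six time points with placeholder labels.
labels = ["T1", "T2", "T3", "T4", "T5", "T6"]
levels = [20, 22, 60, 25, 21, 20]
flagged = peaking_time_points(levels, labels)
flagged_labels = [label for label, _ in flagged]
```

Because the criterion uses the absolute value, a time point well below the mean would be flagged as well; the sign of the returned fold change distinguishes the two cases.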
Supplementary Material
Highlights.
A genome-wide map of transcription start sites (TSS) for P. falciparum
Sequence and chromatin features for core promoters of malaria parasites are defined
Dynamic study of TSS usage identifies developmentally-regulated transcript isoforms
Acknowledgments
We thank Emilie Fritsch and Marcus Lee for helpful comments on the manuscript, and Aleksandra Pekowska for fruitful discussions over the course of the project.
We thank Richard Eastman for kindly providing the P. falciparum strain 3D7.
This study was technologically supported by the EMBL Genomics Core Facilities.
S.H.A. was supported by an EIPOD / Marie Curie COFUND postdoctoral fellowship. C.D.C. was supported by a PhD fellowship from the Boehringer Ingelheim Fonds. This study was supported by the National Institutes of Health Grant NIH P01 HG000205 (to L.M.S.).
Footnotes
Data access: All sequencing raw data are accessible through the GEO accession number GSE68982. A temporary genome-browser is accessible at http://steinmetzlab.embl.de/shiny/TSS_malaria_adjalley_chabbert/ to query for specific genes and different time points until the current integration of the data into PlasmoDB is completed.
Authors' contributions: S.H.A., C.D.C., and V.P. conceived the study. S.H.A. performed the experiments with contributions from C.D.C. and V.P. S.H.A., C.D.C., and B.K. performed the analysis. C.D.C. developed the morphological mathematics approach for the P. falciparum TSS data. C.D.C. and B.K. established the statistical pipeline for the dynamic analysis. S.H.A. and C.D.C. interpreted the results, together with V.P. and L.M.S. S.H.A. and C.D.C. wrote the manuscript, with inputs from the other contributors. All authors read and approved the final manuscript.
All authors declare that they have no conflict of interest at the time of manuscript submission.
References
- Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, Chen Y, Zhao X, Schmidl C, Suzuki T, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–461. doi: 10.1038/nature12787.
- Balaji S. Discovery of the principal specific transcription factors of Apicomplexa and their implication for the evolution of the AP2-integrase DNA binding domains. Nucleic Acids Research. 2005;33:3994–4006. doi: 10.1093/nar/gki709.
- Barrandon C, Spiluttini B, Bensaude O. Non-coding RNAs regulating the transcriptional machinery. Biology of the Cell. 2012;100:83–95. doi: 10.1042/BC20070090.
- Bártfai R, Hoeijmakers WAM, Salcedo-Amaya AM, Smits AH, Janssen-Megens E, Kaan A, Treeck M, Gilberger TW, Françoijs KJ, Stunnenberg HG. H2A.Z Demarcates Intergenic Regions of the Plasmodium falciparum Epigenome That Are Dynamically Marked by H3K9ac and H3K4me3. PLoS Pathog. 2010;6:e1001223. doi: 10.1371/journal.ppat.1001223.
- Bischoff E, Vaquero C. In silico and biological survey of transcription-associated proteins implicated in the transcriptional machinery during the erythrocytic development of Plasmodium falciparum. BMC Genomics. 2010;11:34. doi: 10.1186/1471-2164-11-34.
- Bozdech Z, Zhu J, Joachimiak M, Cohen F, Pulliam B, DeRisi J. Expression profiling of the schizont and trophozoite stages of Plasmodium falciparum with a long-oligonucleotide microarray. Genome Biol. 2003a;4 doi: 10.1186/gb-2003-4-2-r9.
- Bozdech Z, Llinás M, Pulliam BL, Wong ED, Zhu J, DeRisi JL. The Transcriptome of the Intraerythrocytic Developmental Cycle of Plasmodium falciparum. PLoS Biol. 2003b;1:e5. doi: 10.1371/journal.pbio.0000005.
- Brehelin L, Dufayard JF, Gascuel O. PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data. BMC Bioinformatics. 2008;9:440. doi: 10.1186/1471-2105-9-440.
- Brick K, Watanabe J, Pizzi E. Core promoters are predicted by their distinct physicochemical properties in the genome of Plasmodium falciparum. Genome Biol. 2008;9:R178. doi: 10.1186/gb-2008-9-12-r178.
- Broadbent KM, Broadbent JC, Ribacke U, Wirth D, Rinn JL, Sabeti PC. Strand-specific RNA sequencing in Plasmodium falciparum malaria identifies developmentally regulated long non-coding RNA and circular RNA. BMC Genomics. 2015;16:454–454. doi: 10.1186/s12864-015-1603-4.
- Broadbent KM, Park D, Wolf AR, Van Tyne D, Sims JS, Ribacke U, Volkman S, Duraisingh M, Wirth D, Sabeti PC, et al. A global transcriptional analysis of Plasmodium falciparum malaria reveals a novel family of telomere-associated lncRNAs. Genome Biol. 2011;12:R56. doi: 10.1186/gb-2011-12-6-r56.
- Bunnik EM, Chung DWD, Hamilton M, Ponts N, Saraf A, Prudhomme J, Florens L, Le Roch KG. Polysome profiling reveals translational control of gene expression in the human malaria parasite Plasmodium falciparum. Genome Biol. 2013;14:R128. doi: 10.1186/gb-2013-14-11-r128.
- Bunnik EM, Polishko A, Prudhomme J, Ponts N, Gill SS, Lonardi S, Le Roch KG. DNA-encoded nucleosome occupancy is associated with transcription levels in the human malaria parasite Plasmodium falciparum. BMC Genomics. 2014;15:347. doi: 10.1186/1471-2164-15-347.
- Callebaut I, Prat K, Meurice E, Mornon JP, Tomavo S. Prediction of the general transcription factors associated with RNA polymerase II in Plasmodium falciparum: conserved features and differences relative to other eukaryotes. BMC Genomics. 2005;6:100. doi: 10.1186/1471-2164-6-100.
- Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, Semple CAM, Taylor MS, Engström PG, Frith MC, et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genetics. 2006;38:626–635. doi: 10.1038/ng1789.
- Caro F, Ahyong V, Betegon M, DeRisi JL. Genome-wide regulatory dynamics of translation in the Plasmodium falciparum asexual blood stages. eLife. 2014;3 doi: 10.7554/eLife.04106.
- Cortes T, Schubert OT, Rose G, Arnvig KB, Comas I, Aebersold R, Young DB. Genome-wide Mapping of Transcriptional Start Sites Defines an Extensive Leaderless Transcriptome in Mycobacterium tuberculosis. Cell Reports. 2013;5:1121–1131. doi: 10.1016/j.celrep.2013.10.031.
- Coulson RMR, Hall N, Ouzounis CA. Comparative genomics of transcriptional control in the human malaria parasite Plasmodium falciparum. Genome Research. 2004;14:1548–1554. doi: 10.1101/gr.2218604.
- Davuluri RV, Suzuki Y, Sugano S, Plass C, Huang THM. The functional consequences of alternative promoter use in mammalian genomes. Trends in Genetics. 2008;24:167–177. doi: 10.1016/j.tig.2008.01.008.
- de Klerk E, Hoen PACT. Alternative mRNA transcription, processing, and translation: insights from RNA sequencing. Trends in Genetics. 2015;31:128–139. doi: 10.1016/j.tig.2015.01.001.
- De Silva EK, Gehrke AR, Olszewski K, León I, Chahal JS, Bulyk ML, Llinás M. Specific DNA-binding by apicomplexan AP2 transcription factors. Proceedings of the National Academy of Sciences. 2008;105:8393–8398. doi: 10.1073/pnas.0801993105.
- Epp C, Li F, Howitt CA, Chookajorn T, Deitsch KW. Chromatin associated sense and antisense noncoding RNAs are transcribed from the var gene family of virulence genes of the malaria parasite Plasmodium falciparum. RNA. 2008;15:116–127. doi: 10.1261/rna.1080109.
- FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest ARR, Kawaji H, Rehli M, Baillie JK, de Hoon MJL, Haberle V, Lassmann T, Kulakovskiy IV, Lizio M, et al. A promoter-level mammalian expression atlas. Nature. 2014;507:462–470. doi: 10.1038/nature13182.
- Fournier CT, Cherny JJ, Truncali K, Robbins-Pianka A, Lin MS, Krizanc D, Weir MP. Amino termini of many yeast proteins map to downstream start codons. J Proteome Res. 2012;11:5712–5719. doi: 10.1021/pr300538f.
- Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S, et al. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 2002;419:498–511. doi: 10.1038/nature01097.
- Ghorbal M, Gorman M, Macpherson CR, Martins RM, Scherf A, Lopez-Rubio JJ. Genome editing in the human malaria parasite Plasmodium falciparum using the CRISPR-Cas9 system. Nature Biotechnology. 2014;32:819–821. doi: 10.1038/nbt.2925.
- Haberle V, Li N, Hadzhiev Y, Plessy C, Previti C, Nepal C, Gehrig J, Dong X, Akalin A, Suzuki AM, et al. Two independent transcription initiation codes overlap on vertebrate core promoters. Nature. 2014;507:381–385. doi: 10.1038/nature12974.
- Heijmans HJAM, Ronse C. Centrum voor Wiskunde en Informatica Amsterdam, N.D.O.A.A.A.G. The Algebraic Basis of Mathematical Morphology 1989
- Hoeijmakers WAM, Bártfai R, Stunnenberg HG. Transcriptome analysis using RNA-Seq. Methods Mol Biol. 2013a;923:221–239. doi: 10.1007/978-1-62703-026-7_15.
- Hoeijmakers WAM, Salcedo-Amaya AM, Smits AH, Françoijs KJ, Treeck M, Gilberger TW, Stunnenberg HG, Bártfai R. H2A.Z/H2B.Z double-variant nucleosomes inhabit the AT-rich promoter regions of the Plasmodium falciparum genome. Mol Microbiol. 2013b;87:1061–1073. doi: 10.1111/mmi.12151.
- Horrocks P, Wong E, Russell K, Emes RD. Control of gene expression in Plasmodium falciparum – Ten years on. Molecular & Biochemical Parasitology. 2009;164:9–25. doi: 10.1016/j.molbiopara.2008.11.010.
- Hurst LD, Pál C, Lercher MJ. The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet. 2004;5:299–310. doi: 10.1038/nrg1319.
- Ingolia NT. Ribosome profiling: new views of translation, from single codons to genome scale. Nat Rev Genet. 2014;15:205–213. doi: 10.1038/nrg3645.
- Kyes S, Christodoulou Z, Pinches R, Kriek N, Horrocks P, Newbold C. Plasmodium falciparum var gene expression is developmentally controlled at the level of RNA polymerase II-mediated transcription initiation. Mol Microbiol. 2007;63:1237–1247. doi: 10.1111/j.1365-2958.2007.05587.x.
- Le Roch KG. Global analysis of transcript and protein levels across the Plasmodium falciparum life cycle. Genome Research. 2004;14:2308–2318. doi: 10.1101/gr.2523904.
- Lenhard B, Sandelin A, Carninci P. Metazoan promoters: emerging characteristics and insights into transcriptional regulation. Nat Rev Genet. 2012;13:233–245. doi: 10.1038/nrg3163.
- Li B, Carey M, Workman JL. The Role of Chromatin during Transcription. Cell. 2007;128:707–719. doi: 10.1016/j.cell.2007.01.015.
- López-Barragán MJ, Lemieux J, Quiñones M, Williamson KC, Molina-Cruz A, Cui K, Barillas-Mury C, Zhao K, Su XZ. Directional gene expression and antisense transcripts in sexual and asexual stages of Plasmodium falciparum. BMC Genomics. 2011;12:587. doi: 10.1186/1471-2164-12-587.
- Mavrich TN, Jiang C, Ioshikhes IP, Li X, Venters BJ, Zanton SJ, Tomsho LP, Qi J, Glaser RL, Schuster SC, et al. Nucleosome organization in the Drosophila genome. Nature. 2008;453:358–362. doi: 10.1038/nature06929.
- Militello KT. RNA polymerase II synthesizes antisense RNA in Plasmodium falciparum. RNA. 2005;11:365–370. doi: 10.1261/rna.7940705.
- Otto TD, Wilinski D, Assefa S, Keane TM, Sarry LR, Böhme U, Lemieux J, Barrell B, Pain A, Berriman M, et al. New insights into the blood-stage transcriptome of Plasmodium falciparum using RNA-Seq. Mol Microbiol. 2010;76:12–24. doi: 10.1111/j.1365-2958.2009.07026.x.
- Oyola SO, Otto TD, Gu Y, Maslen G, Manske M, Campino S, Turner DJ, MacInnis B, Kwiatkowski DP, Swerdlow HP, et al. Optimizing illumina next-generation sequencing library preparation for extremely AT-biased genomes. BMC Genomics. 2012;13:1. doi: 10.1186/1471-2164-13-1.
- Pal S, Gupta R, Kim H, Wickramasinghe P, Baubet V, Showe LC, Dahmane N, Davuluri RV. Alternative transcription exceeds alternative splicing in generating the transcriptome diversity of cerebellar development. Genome Research. 2011;21:1260–1272. doi: 10.1101/gr.120535.111.
- Pelechano V, Wei W, Steinmetz LM. Widespread Co-translational RNA Decay Reveals Ribosome Dynamics. Cell. 2015;161:1400–1412. doi: 10.1016/j.cell.2015.05.008.
- Ponts N, Fu L, Harris EY, Zhang J, Chung DWD, Cervantes MC, Prudhomme J, Atanasova-Penichon V, Zehraoui E, Bunnik EM, et al. Genome-wide Mapping of DNA Methylation in the Human Malaria Parasite Plasmodium falciparum. Cell Host and Microbe. 2013;14:696–706. doi: 10.1016/j.chom.2013.11.007.
- Raabe CA, Sanchez CP, Randau G, Robeck T, Skryabin BV, Chinni SV, Kube M, Reinhardt R, Ng GH, Manickam R, et al. A global view of the nonprotein-coding transcriptome in Plasmodium falciparum. Nucleic Acids Research. 2010;38:608–617. doi: 10.1093/nar/gkp895.
- Schoenberg DR, Maquat LE. Re-capping the message. Trends in Biochemical Sciences. 2009;34:435–442. doi: 10.1016/j.tibs.2009.05.003.
- Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T, et al. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci USA. 2003;100:15776–15781. doi: 10.1073/pnas.2136655100.
- Siegel TN, Hon CC, Zhang Q, Lopez-Rubio JJ, Scheidig-Benatar C, Martins RM, Sismeiro O, Coppée JY, Scherf A. Strand-specific RNA-Seq reveals widespread and developmentally regulated transcription of natural antisense transcripts in Plasmodium falciparum. BMC Genomics. 2014;15:150. doi: 10.1186/1471-2164-15-150.
- Sorber K, Dimon MT, DeRisi JL. RNA-Seq analysis of splicing in Plasmodium falciparum uncovers new splice junctions, alternative splicing and splicing of antisense transcripts. Nucleic Acids Research. 2011;39:3820–3835. doi: 10.1093/nar/gkq1223.
- Struhl K, Segal E. Determinants of nucleosome positioning. Nat Struct Mol Biol. 2013;20:267–273. doi: 10.1038/nsmb.2506.
- Trinklein ND, Aldred SF, Hartman SJ, Schroeder DI, Otillar RP, Myers RM. An abundance of bidirectional promoters in the human genome. Genome Research. 2004;14:62–66. doi: 10.1101/gr.1982804.
- Tuda J, Mongan AE, Tolba MEM, Imada M, Yamagishi J, Xuan X, Wakaguri H, Sugano S, Sugimoto C, Suzuki Y. Full-parasites: database of full-length cDNAs of apicomplexa parasites, 2010 update. Nucleic Acids Research. 2010;39:D625–D631. doi: 10.1093/nar/gkq1111.
- Wagner JC, Platt RJ, Goldfless SJ, Zhang F, Niles JC. Efficient CRISPR-Cas9–mediated genome editing in Plasmodium falciparum. Nature Methods. 2014;11:915–918. doi: 10.1038/nmeth.3063.
- Watanabe J, Sasaki M, Suzuki Y, Sugano S. Analysis of transcriptomes of human malaria parasite Plasmodium falciparum using full-length enriched library: identification of novel genes and diverse transcription start sites of messenger RNAs. Gene. 2002;291:105–113. doi: 10.1016/s0378-1119(02)00552-8.
- Wei W, Pelechano V, Järvelin AI, Steinmetz LM. Functional consequences of bidirectional promoters. Trends in Genetics. 2011;27:267–276. doi: 10.1016/j.tig.2011.04.002.
- World Health Organization. World Malaria Report 2015 - full report. 2015;2015:1–280. http://www.who.int/malaria/publications/world-malaria-report-2015/en/
- Zlatanova J, Thakar A. H2A.Z: View from the Top. Structure. 2008;16:166–179. doi: 10.1016/j.str.2007.12.008.
