Abstract
Dictyostelium discoideum live in the soil as single cells, engulfing bacteria and growing vegetatively. Upon starvation, tens of thousands of amoebae enter a developmental program that includes aggregation, multicellular differentiation, and sporulation. Major shifts across the protein-coding transcriptome accompany these developmental changes. However, no study has presented a global survey of long noncoding RNAs (ncRNAs) in D. discoideum. To characterize the antisense and long intergenic noncoding RNA (lncRNA) transcriptome, we analyzed previously published developmental time course samples using an RNA-sequencing (RNA-seq) library preparation method that selectively depletes ribosomal RNAs (rRNAs). We detected the accumulation of transcripts for 9833 protein-coding messenger RNAs (mRNAs), 621 lncRNAs, and 162 putative antisense RNAs (asRNAs). The noncoding RNAs were interspersed throughout the genome, and were distinct in expression level, length, and nucleotide composition. The noncoding transcriptome displayed a temporal profile similar to the coding transcriptome, with stages of gradual change interspersed with larger leaps. The transcription profiles of some noncoding RNAs were strongly correlated with known differentially expressed coding RNAs, hinting at a functional role for these molecules during development. Examining the mitochondrial transcriptome, we modeled two novel antisense transcripts. We applied yet another ribosomal depletion method to a subset of the samples to better retain transfer RNA (tRNA) transcripts. We observed polymorphisms in tRNA anticodons that suggested a post-transcriptional means by which D. discoideum compensates for codons missing in the genomic complement of tRNAs. We concluded that the prevalence and characteristics of long ncRNAs indicate that these molecules are relevant to the progression of molecular and cellular phenotypes during development.
Keywords: transcriptome time course, development, noncoding RNA, ribosomal RNA depletion, Dictyostelium discoideum, slime mold
The social amoeba Dictyostelium discoideum has captured the imagination of biologists for over 75 years (Raper 1940; Williams 2010). Much of the fascination is due to its unusual life cycle, which is divided between a single-celled, vegetative growth stage and a tightly regulated multicellular developmental program. Development is triggered by starvation, and proceeds through various stages: chemotaxis and aggregation, multicellular differentiation, morphogenesis, and reproductive maturation (reviewed in Kessin 2001). Many cellular and molecular events critical to development have been elucidated through genetic and chemical experiments. The growing adoption of high throughput sequencing adds a powerful complementary approach to analyzing the regulation of this process (Loomis and Shaulsky 2011).
The developmentally regulated, protein-coding transcriptome of D. discoideum has been characterized extensively (Van Driessche et al. 2002; Iranfar et al. 2003; Parikh et al. 2010; Rosengarten et al. 2015a). Roughly two-thirds of the gene models present in the genome are expressed to some extent during development. Critical transcription regulatory networks are coming into relief with the combination of genetic screens and deep-sequencing data (Cai et al. 2014; Santhanam et al. 2015). In many model organisms, ncRNAs such as lncRNAs and asRNAs also play important roles in the regulation of gene expression (Qu and Adelson 2012; Guil and Esteller 2012; Pelechano and Steinmetz 2013).
The presence of various types of ncRNAs in D. discoideum, beginning with small nuclear RNAs, has been appreciated for decades (Takeishi and Kaneda 1979, 1981). One of the first eukaryotic examples of endogenous antisense regulation of an mRNA cognate (the prespore gene psvA) was demonstrated in D. discoideum (Hildebrandt and Nellen 1992). Over the years, construction of small insert cDNA libraries, de novo computational searches, and various deep-sequencing approaches have identified novel classes of developmentally important small ncRNAs and microRNAs (Aspegren et al. 2004; Hinas et al. 2006; Larsson et al. 2008; Avesson et al. 2011, 2012). Nevertheless, we have yet to thoroughly catalog the identities of lncRNAs and asRNAs, and to examine how their abundances relate to developmental changes.
Herein, we describe the first comprehensive annotation of lncRNA transcript models, and identify a set of putative asRNAs, in the D. discoideum genome. These transcripts were identified by bioinformatics analysis of RNA-seq data from rRNA-depleted libraries. We used the same biological samples as in an earlier published developmental time course (Rosengarten et al. 2015a). However, the current rRNA depletion strategy deviates from the previous poly-A selection, in that it should retain nonpoly-A transcripts in addition to mRNAs. Our analyses identified hundreds of intergenic lncRNA loci. While these were typically expressed at much lower levels than mRNAs, their abundance followed similar temporal patterns over the course of development. Strong correlation was observed between the temporal pattern of a few dozen lncRNAs and mRNAs abundant in both early and late development. We further examined other nonpoly-A RNAs, including the first deep-sequencing analysis of the mitochondrial transcriptome. Analysis of tRNA expression provided evidence for post-transcriptional modifications that compensate for various anticodons missing from the genomic tRNA complement. The widespread expression of lncRNAs during D. discoideum development suggests that future genetic studies should consider the effects of intergenic elements more closely.
Materials and Methods
Growth, development, and sample collection
In this study, we processed aliquots of total RNA that had been collected in Rosengarten et al. (2015a). Briefly, D. discoideum cells (strain AX4) were grown in HL-5 nutrient medium, with shaking at 22° to midlog phase. Cells were developed on nitrocellulose filters (5 × 10 cells per 5 cm filter) saturated in PDF buffer for 24 hr, as described (Miranda et al. 2013b). Every 1–2 hr, developing cells were scraped into 1 ml Trizol reagent (Life Sciences). Total RNA was isolated by phenol–chloroform extraction and ethanol precipitation. Two biological replicates were collected for each time course. The previous analysis found little difference in population-average gene expression at 1 vs. 2 hr time resolution (Rosengarten et al. 2015a). Therefore, we selected samples 2 hr apart, from 0 to 24 hr.
cDNA library preparation and RNA-seq
We constructed multiplexed RNA-seq libraries using the Ovation Universal RNA-Seq System (NuGen, Carlsbad, CA), to exclude ribosomal RNA and enrich other RNA species, according to the manufacturer’s recommended protocol. For each sample, 200 ng total RNA was annealed to random and oligo-dT primers and treated with heat-labile arctic double-stranded DNase, prior to first- then second-strand cDNA synthesis. The resulting cDNA was fragmented by sonication using the Covaris S-series System with the recommended settings to achieve 150–200 bp median fragments. cDNA was recovered using magnetic beads with two ethanol wash steps, followed by enzymatic end repair of the fragments. Next, barcoded adapters were ligated to each sample, followed by an enzymatic strand selection step and magnetic bead recovery, as above. rRNAs were targeted for depletion by the addition of custom designed oligonucleotides specific for the 28, 17, 5.8, and 5S rRNA genes, as well as the mitochondrial large and small RNA subunits (rnlA and rnsA, respectively). A list of these depletion probes is available in Supplemental Material, File S1. The next step was Insert Dependent Adaptor Cleavage (InDA-C) to remove the adapters containing priming sites from all targeted rRNA molecules. RT-PCR was used to determine that 15 cycles were required for the subsequent library amplification.
To examine tRNA abundance, we used the Ribo-Zero Plant Seed/Root magnetic kit (Epicentre), designed to retain molecules smaller than 100 bp. For this, we chose six samples from biological replicate 1 (0, 4, 8, 12, 16, and 22 hr). Library preparation was performed according to the manufacturer’s recommended protocol.
The full time course InDA-C and the subsample Ribo-Zero cDNA libraries were sequenced by Illumina Hiseq2500 with paired ends and read length of 100 bp.
Primary sequence analysis
We checked sequencing data quality using FastQC (v0.10.1) (Andrews 2010). We eliminated three samples (biological replicate 2: 12, 16, and 22 hr) that did not meet our quality criteria. For all other samples, sequences were aligned to the D. discoideum reference genome (Miranda et al. 2013a) using TopHat (v2.0.13) (Trapnell et al. 2009), providing strand information when available. We only permitted uniquely mapped reads and supplied reference transcript annotations (http://dictybase.org/ version 2013) (--mate-inner-dist 100 --mate-std-dev 20 --num-threads 4 --GTF -g 1 --report-secondary-alignments --microexon-search --no-mixed --no-discordant --min-intron-length 70 --max-intron-length 500). We assembled transcripts for each sample using Cufflinks (v2.2.1) (Trapnell et al. 2012) with the option --GTF-guide to guide the reference annotation-based transcript (RABT) assembly. The transcript annotation files from Cufflinks were then merged using Cuffmerge (v2.2.1) to obtain a final transcriptome assembly (File S2). By comparing these transcripts to the existing transcript annotations, we identified putative noncoding transcripts. The above primary analyses were all performed using pipelines written for Genialis GenBoard software. We used Transdecoder (v2.0.1) (Haas et al. 2013) with domain homology search options and ORF length thresholds of 25, 50, and 100 amino acids to estimate the coding potential of the final RABT assembled transcriptome.
Transcriptome analysis
We quantified the abundance of all transcripts contained in the final transcript set by counting uniquely mapped reads and accounting for the strand information where available. We standardized transcript abundance by accounting for the mappable lengths of transcripts and the total number of mapped reads (excluding those mapped to the ribosomal palindrome chrR) in each experiment. Long noncoding and antisense transcript models were filtered according to the following criteria: minimum length of 200 bp; at least two raw reads per transcript model at any time point; mappable expression greater than zero at any time point; and lncRNAs must reside entirely within intergenic regions with no tiling path to a neighboring gene.
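The standardization and filtering criteria described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study (which ran in Genialis GenBoard): the function names, the dictionary layout, and the `has_tiling_path_to_gene` flag are hypothetical, and `total_mapped_reads` is assumed to already exclude reads mapped to chrR.

```python
def rpkm(read_count, mappable_length_bp, total_mapped_reads):
    """Reads per kilobase of mappable transcript per million mapped reads.

    total_mapped_reads is assumed to exclude reads mapped to the
    ribosomal palindrome (chrR), as described in the text.
    """
    if mappable_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (mappable_length_bp * total_mapped_reads)


def passes_lncrna_filter(model):
    """Apply the listed criteria to one transcript model.

    `model` is a hypothetical dict with keys:
      length_bp               transcript length in bp
      raw_counts              raw read counts, one per time point
      rpkm                    standardized abundance, one per time point
      intergenic              True if the model lies entirely between genes
      has_tiling_path_to_gene True if reads tile to a neighboring gene
    """
    return (
        model["length_bp"] >= 200
        and max(model["raw_counts"]) >= 2
        and max(model["rpkm"]) > 0
        and model["intergenic"]
        and not model["has_tiling_path_to_gene"]
    )
```

For example, 10 reads on a 1 kb transcript in a library of one million mapped reads gives `rpkm(10, 1000, 1_000_000)`, that is, 10 RPKM.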
Antisense models were further filtered to remove possible artifacts from template strand switching by removing all those transcripts whose tiling path gaps exceeded 5% of the read coverage, typically corresponding to sense-strand introns. Tiling paths were determined for models that were supported by properly paired reads of CIGAR (Compact Idiosyncratic Gapped Alignment Report) 100M. Strand specificity was calculated at 9833 protein-coding loci that remained after filtering for minimum expression values. For each mRNA transcript with a putative antisense transcript, we defined a composite contiguous genomic region that fully included both gene models. On this region, we calculated strand specificity as the fraction of total reads that align to the sense strand. Strand specificity was measured on the aggregate of the entire stranded InDA-C data and only included properly paired reads. Spearman’s correlation (SC) of the temporal expression profiles between sense and antisense transcripts was determined by comparing their standardized average transcript abundances across all time points. Additionally, we manually inspected read coverage patterns of the aggregated InDA-C data using the Integrative Genomics Viewer (IGV) (Robinson et al. 2011; Thorvaldsdóttir et al. 2013). These metrics are reported (File S7) to enable researchers to prioritize asRNA models for future validation, though the asRNA transcripts are excluded from the statistical characterizations herein.
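The two antisense metrics described above can be illustrated with a short Python sketch. The strand specificity follows the definition in the text (sense reads over total reads in the composite region). The gap filter is only one possible reading of "tiling path gaps exceeded 5% of the read coverage": here it is assumed to mean the fraction of positions in the model with no supporting reads. All function names are hypothetical.

```python
def strand_specificity(sense_reads, antisense_reads):
    """Fraction of reads in a composite region that align to the sense strand."""
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no reads in region")
    return sense_reads / total


def gap_fraction(coverage):
    """Fraction of positions with zero coverage.

    `coverage` is a per-base list of read depths along the antisense model.
    Treating uncovered positions as 'tiling path gaps' is an assumption.
    """
    if not coverage:
        raise ValueError("empty coverage vector")
    return sum(1 for depth in coverage if depth == 0) / len(coverage)


def keep_antisense_model(coverage, max_gap_fraction=0.05):
    """Keep a model only if its gaps do not exceed the 5% threshold."""
    return gap_fraction(coverage) <= max_gap_fraction
```

A model whose coverage drops to zero across a stretch matching a sense-strand intron would have a large gap fraction and be removed, which is the pattern expected of a strand-switching artifact.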
A table of curated transcripts and their abundances is provided in File S3. Transcript density was calculated as a proportion of the base pairs per 10 kb stepping window and plotted using Circos (Krzywinski et al. 2009). Relative distances between the transcriptomes were visualized using classical multidimensional scaling (R function cmdscale) and examined using hierarchical clustering with bootstrapping [R package “pvclust” version 1.2-2 (Suzuki and Shimodaira 2006) with optimized leaf ordering; R package “cba” version 0.2-14]. We used SC to calculate the distance (D = 1 − SC) and complete linkage as the clustering criterion. Heatmaps were generated with the visual programming software suite Orange (Demšar et al. 2013).
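The distance D = 1 − SC that feeds both the multidimensional scaling and the hierarchical clustering can be sketched as follows. The study used R (`cmdscale`, `pvclust`); this standard-library Python version only illustrates the distance step, and the function names are hypothetical. Tied values receive average ranks, which is the usual convention for Spearman's correlation.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_distance_matrix(profiles):
    """Pairwise D = 1 - SC between samples.

    `profiles` holds one abundance vector per sample (time point),
    with transcripts in the same order in every vector.
    """
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - spearman(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

The resulting symmetric matrix ranges from 0 (identical rank order) to 2 (perfectly reversed rank order) and could be passed to any MDS or complete-linkage clustering routine.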
Rosengarten et al. (2015a) identified 3197 protein-coding genes that were differentially upregulated (FDR < 0.01; ≥ twofold) during development compared to the 0 hr time point. We used transcript abundance from the InDA-C library (Ovation, NuGen) data of these 3197 coding transcripts and 621 long noncoding transcripts to compute pairwise SCs with adjustment for multiple tests (R package psych, function “corr.test”). Correlation coefficients that met the statistical threshold of FDR < 0.01 were visualized as a heatmap (R package gplots, function heatmap.2).
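The multiple-testing step can be illustrated with a small Python sketch. The text specifies an FDR threshold but does not name the adjustment procedure, so the Benjamini–Hochberg method is assumed here; the function name is hypothetical and this is not the `corr.test` implementation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value
    # is no larger than the adjusted values ranked above it.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Pairs of lncRNA and mRNA profiles would then be retained when their adjusted p-value falls below 0.01, for example `[q < 0.01 for q in benjamini_hochberg(p_values)]`.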
tRNA analysis
Most genes encoding tRNAs are present in the D. discoideum genome in multiple copies (Eichinger et al. 2005). We extracted all the unique sequences of tRNA genes and created a new reference genome with each sequence representing a separate contig. Since tRNAs are typically 70–95 bp long, we trimmed our reads by 65 bp from the 3′-end before mapping. The resulting 35 bp paired-end reads were mapped as single-end reads using bowtie2 (v2.2.3) (Langmead and Salzberg 2012), permitting only unique matches. Using samtools (v1.3) (Li et al. 2009), we first aggregated the resulting bam files from all six time points and then, using the “mpileup” function, created a pileup. To identify variants in the tRNA transcriptome, we used Varscan (v2.3.9) (Koboldt et al. 2012) and identified 64 putative variants, allowing for p-values < 0.25.
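The read trimming and the construction of a reference with one contig per unique tRNA sequence can be sketched as below. This is an illustration only: the function names and the input layout (pairs of gene name and sequence) are hypothetical, and in practice the base-quality string of each read would be trimmed in the same way.

```python
def trim_three_prime(read, n_trim=65):
    """Remove n_trim bases from the 3' end of a read (100 bp -> 35 bp)."""
    if n_trim < 0:
        raise ValueError("n_trim must be non-negative")
    return read[: max(len(read) - n_trim, 0)]


def unique_sequences_as_contigs(trna_genes):
    """Collapse identical gene copies into one contig per unique sequence.

    `trna_genes` is an iterable of (gene_name, sequence) pairs.
    Returns a dict mapping each unique sequence to the gene names sharing it.
    """
    contigs = {}
    for name, sequence in trna_genes:
        contigs.setdefault(sequence.upper(), []).append(name)
    return contigs
```

Keeping the list of gene names for each contig makes it possible to map read counts back to every genomic copy that shares that sequence.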
REMI mutant phenotyping
Mutant D. discoideum strains with lesions in, or immediately adjacent to, putative lncRNA loci were identified from a local database of libraries of barcoded random insert mutants. Fifteen of these strains were recovered from frozen stocks and grown on SM-agar with lawns of Klebsiella pneumoniae at 22°. These strains were allowed to clear the bacteria and develop. Six of these strains were transferred to nutritive media, grown to midlog, and developed on nitrocellulose filters, as described above.
Data availability
Gene expression data are available at GEO with the accession number GSE90829. File S2 contains the raw transcriptome assembly output by cufflinks/cuffmerge. File S3 contains the transcript abundances of these genes and DDB_G gene models from the latest genome assembly. File S4 contains a table of all curated ncRNA transcripts and their corresponding DDB_G gene model. File S5 contains the mRNA expression values used to determine correlations between library preparation methods. File S6 contains SCs and strand orientation between lncRNAs and their nearest neighbors. File S7 contains asRNA confidence statistics. File S8 contains a table of REMI mutant strains, which are available upon request. File S9 contains the Varscan output from the tRNA analysis. Further, upon publication, all transcriptome data from this study may be explored and compared to previous works at www.dictyexpress.org.
Results and Discussion
Strand-specific ribosomal RNA-depleted libraries are consistent with poly-A enriched benchmarks
The protein-coding transcriptome of D. discoideum has been characterized extensively, most recently by exploring changes in mRNA abundance every 1–2 hr over the 24 hr course of development (Rosengarten et al. 2015a). To characterize the noncoding portion of the transcriptome, we analyzed aliquots of the same 2 hr samples, comprising two biological replicates. RNA was prepared for RNA-seq by enzymatic depletion of rRNA using the InDA-C method (NuGen) with custom designed rRNA oligonucleotides. We constructed strand-specific libraries using the Ovation kit (NuGen) and sequenced them by 100 bp paired-end Illumina chemistry. Ribosomal RNA constitutes around 96–98% of cellular RNA in D. discoideum (Sucgang et al. 2003), but after processing only 40–60% of the sequencing reads mapped to rRNA genes, confirming an enrichment of non-rRNA in these libraries and facilitating deep-coverage RNA-seq analysis (Figure S1 and Figure S2).
We next validated the quality of the rRNA-depleted libraries by comparing the mRNA profiles to those previously characterized by poly-A selection (Rosengarten et al. 2015a). These experiments utilized identical biological samples, but were processed and sequenced using different technologies. In the poly-A libraries, we detected transcript abundance for 10,010 protein-coding genes at some point in development. The same minimal criterion included 9833 genes from the new libraries, 99% of which overlapped with the previous set (File S5). We calculated SCs of the mRNA abundances at each sample from the two experiments (Figure S3), and observed a mean correlation of 0.96 for corresponding time points. We conclude that the stranded rRNA depletion libraries are representative of the mRNA transcriptome, in addition to enabling the quantification of various long ncRNAs, namely asRNAs and lncRNAs.
Long ncRNAs are dispersed throughout the genome
Analyses of the RNA-seq data identified lncRNAs that reside entirely within intergenic regions with no contiguous tiling path to neighboring genes. The transcript models were filtered for minimum length and coverage, as described above. In total, we identified 621 lncRNAs with measurable expression at some point during growth or development (Figure 1A, File S3, and File S4). We detected ORFs 150 bp or longer in only ∼10% of the lncRNA models, indicating that most of these transcripts do not have substantial coding potential. Genomic segments encoding lncRNAs were found interspersed among protein-coding loci with no obvious pattern of clustering (Figure 1A). This distribution contrasts with small noncoding RNAs and tRNAs, which are often found in clusters in the D. discoideum genome (Aspegren et al. 2004; Eichinger et al. 2005; Hinas and Söderbom 2007).
Putative asRNAs were defined as those transcripts whose mapping overlapped some portion of a known gene model encoded on the opposite strand, and were filtered for length and coverage as well. We observed considerable correlation in transcript abundance with sense-strand cognates (Figure S5), as well as troubling similarities in splice patterns that suggested that many of the asRNA models were artifacts. We further characterized the strand specificity of the asRNA models, filtering those with tiling path gaps characteristic of sense-strand artifacts. However, due to our inability to consistently distinguish between true antisense transcription and strand-switching products, asRNA models are excluded from the statistical characterizations below. Instead they are included in the supplemental files, available for future validation (File S3, File S4, and File S7). One notable exception is discussed below.
Because we imposed a filter on the lncRNA models removing those with tiling paths to the nearest neighbor, we do not believe our transcript models to be artifacts resulting from transcriptional read-through. Nevertheless, the lncRNAs might represent transcripts that are expressed from a shared upstream promoter and subsequently processed from the protein-coding transcript. To determine the degree to which lncRNAs show similar expression profiles with their 5′ neighbor, we calculated the SC for each lncRNA and its nearest neighbors on both sides, and grouped these based on the neighbors’ orientation (Figure S4 and File S6).
Three-quarters of all converging lncRNA-neighbors were found to have an SC < 0.30, and the median lncRNA correlation (0.02) was lower than that of 1000 randomly sampled genes (median = 0.08, Mann–Whitney U-test, p = 0.05). Thus, while a subset of putative lncRNAs are coexpressed with their 5′ converging neighbor, most lncRNAs are expressed independently. To test for potential bidirectional promoter activity, we similarly measured the correlations between lncRNA models and their diverging 5′ neighbor on the opposite strand. Here, we observed a median correlation of 0.03 for lncRNAs, slightly lower than expected at random (Mann–Whitney U-test, p = 0.02). Though some lncRNAs are coexpressed with their divergent neighbor, we reject the hypothesis that bidirectional promoter activity is a general driver of lncRNA expression.
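The comparison of lncRNA-neighbor correlations against randomly sampled gene pairs relies on the Mann–Whitney U-test. A minimal Python sketch of that test is shown here, assuming a two-sided normal approximation without tie or continuity correction; the source does not state which variant was used, and the function name is hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value).

    U counts the pairs in which a value from x exceeds a value from y,
    with ties counted as one half. The p-value uses the normal
    approximation, which is reasonable for samples of this size
    (hundreds of correlations per group).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u, p_two_sided
```

Here `x` would hold the SC values of lncRNA-neighbor pairs and `y` the SC values of the randomly sampled gene pairs.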
Transcriptional read-through is not the only mechanism that could account for the minority of examples of lncRNA-neighbor pairs with strong correlations in abundance. One hypothesis might be a cis-regulatory effect of the lncRNA on its gene neighbor. Alternatively, coexpression might represent a more transcriptionally active chromatin state (Rinn and Chang 2012; Guil and Esteller 2012; Kornienko et al. 2013; Quinn and Chang 2016).
Noncoding RNAs were considerably less abundant than mRNAs. The median maximum expression at any time point was 41 reads per kb per million (RPKM) for mRNAs and 1.0 RPKM for lncRNAs (Figure 1B). The lower abundance of noncoding transcripts might be due to a lack of strong promoters, or to effects of polyadenylation on mRNA stability (Bernstein et al. 1989; Sachs 1990; Wang et al. 1999). The relative maximum abundance of different classes of RNAs is consistent with that observed in other organisms, such as the malaria parasite Plasmodium falciparum, and even in humans (Derrien et al. 2012; Broadbent et al. 2015). Derrien et al. (2012) cataloged a comprehensive annotation of human lncRNAs for the GENCODE consortium, with median expression differences around two orders of magnitude between mRNA and lncRNAs across many different tissue types. Recent experimental evidence suggests that lower lncRNA abundances in bulk-cell sequencing samples are due not to lower levels of expression, but rather to greater cell-to-cell variation in expression. Single-cell RNA-seq analysis in the neocortex revealed that lncRNAs are expressed at levels comparable to mRNAs, but are expressed in a much lower fraction of cells (Liu et al. 2016). Thus, bulk analysis results in a lower average abundance. In our study, as in other bulk analyses, each class of lncRNA did include numerous transcripts with maximum abundance levels more akin to mRNAs, although the median for lncRNA was lower (Figure 1B). It will be interesting to see what the first single-cell RNA-seq studies in Dictyostelium reveal regarding cellular lncRNA heterogeneity.
lncRNA transcript models were considerably shorter than typical mRNAs (median = 628 vs. 1400 bp) (Figure 1C). The relative shortness of the lncRNAs is not surprising, because these are constrained by the overall available intergenic space, which in D. discoideum is only roughly 700 bp on average (Eichinger et al. 2005). Twenty lncRNAs were modeled to have as many as three exons, but splice products remain to be verified.
The D. discoideum genome is very AT-rich, but protein-coding regions (i.e., ORFs) are more GC-rich than intergenic regions (Eichinger et al. 2005). The median GC content of ORFs was 25%, whereas the GC content of lncRNAs was 17% (Figure 1D). The nucleotide composition of noncoding transcripts is similar to that described for P. falciparum, which also has a highly AT-skewed genome (Broadbent et al. 2015). Intergenic regions of the D. discoideum genome exhibit a strong AT-bias, low complexity, and long homopolymer tracts. We asked whether the lncRNAs were distinct in nucleotide composition from nonexpressed intergenic segments. Indeed, randomly selected intergenic sequences were 13% GC on average, significantly lower than lncRNAs (Figure 1D). This difference provides additional confidence in the lncRNA transcript models. One might speculate that some lower limit in GC-content prevents RNA polymerase and the associated machinery from transcribing regions below some minimum complexity or GC composition. Anecdotal and published accounts consistently report struggles in amplifying intergenic DNA from Dictyostelium (Rosengarten et al. 2015b; Eichinger et al. 2005). Perhaps these in vitro difficulties reflect challenges experienced by the amoebae themselves.
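The GC comparison between ORFs, lncRNAs, and random intergenic segments reduces to a per-sequence GC fraction followed by a summary statistic. A minimal Python sketch is given here; the function names are hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
import statistics


def gc_fraction(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T/U)."""
    seq = sequence.upper().replace("U", "T")
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def median_gc(sequences):
    """Median GC fraction across a collection of sequences."""
    return statistics.median(gc_fraction(s) for s in sequences)
```

Applying `median_gc` separately to each class of sequence would yield values comparable to the 25%, 17%, and 13% figures reported above, given the same sequence sets.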
Temporal changes in long noncoding RNA abundances follow a similar trajectory as that of the mRNA transcriptome
The protein-coding transcriptome of D. discoideum changes dramatically over the course of development, with major shifts in the population-average transcript abundances during starvation, multicellular integration, and differentiation, and again from the culmination of slugs to fruiting bodies (Rosengarten et al. 2015a; Van Driessche et al. 2002; Parikh et al. 2010). The abundances of mRNA and lncRNA transcripts were examined in heatmaps (Figure 2, A and B). In this view, each row represents a transcript, color-coded to show relative changes in abundance for that molecule. Consistent with previous studies (Rosengarten et al. 2015a; Van Driessche et al. 2002; Parikh et al. 2010), the mRNA underwent dramatic changes in expression over the time course (Figure 2A). Likewise, we found that the noncoding transcriptome also changes over developmental time (Figure 2B). Numerous genetic studies have shown that transcriptome dynamics are regulated and have important phenotypic consequences (Williams 2006; Cai et al. 2014; Santhanam et al. 2015). For example, deletion of the transcription factor encoding gtaC manifests itself in an arrest of both the developmental transcriptome state and morphological progression (Cai et al. 2014). Thus, we hypothesize that the dynamic transcription of long noncoding RNAs may also contribute to development at the (multi-)cellular level.
Multidimensional scaling (MDS) is a powerful approach to visualize high dimensional data in lower dimensional space. MDS can be used to visualize the relative differences between transcriptomes (time points), wherein the Euclidean distances between points in two-dimensional space correspond to overall dissimilarity between entire transcriptomes of those samples. This analysis revealed that the lncRNA temporal changes followed a similar pattern to that of the mRNA, previously described as clusters of slowly changing stages punctuated by gaps representing larger changes in the molecular phenotype (Rosengarten et al. 2015a) (Figure 2, C and D). Transcriptomes in MDS dimension 1 were nearly collinear with time, although the relationship was imperfect for the lncRNAs (Figure 2, E and F).
From the MDS analysis, we observed a large temporal shift in this library between 10 and 12 hr of development, and bigger still between 16 and 18 hr (Figure 2, D–F). The previous poly-A-based analysis of these samples reported the greatest single transcriptome change between 10 and 12 hr, and also observed considerable separation between the 16 and 18 hr transcriptomes (Rosengarten et al. 2015a). Both of these time frames coincide with major morphological changes. The observed signal is robust to library preparation method and is recapitulated by both mRNA and lncRNA datasets. The consistency in transcriptome pattern among both classes of RNA lends further support to the characterization of the developmental transcriptome as a global quantitative phenotype (Van Driessche et al. 2005). Further investigations into the gene regulatory pathways in D. discoideum might consider noncoding RNA as well as mRNA responses to genetic perturbations and transcription factor binding (Cai et al. 2014; Santhanam et al. 2015).
ncRNA abundances correlate with mRNAs involved in early and late development
Since lncRNAs are detected throughout development, we propose that they might contribute to cellular and morphological phenotypes. We searched for lncRNAs and mRNAs with strongly correlated transcription profiles (Childs et al. 2011; Okamura et al. 2015) (Figure 3). We found two groups of correlated transcripts: those abundant or “on” early and “off” late and those off early and on late. Considering decades of characterization of the cell biology of development over time (Kessin 2001), we propose that the early-on lncRNAs are involved in growth (measured at 0 hr) and in the starvation response at the onset of development, whereas the late-on lncRNAs may contribute to culmination, sporulation, and fruiting body maturation. Noncoding transcripts are also present at middevelopment time points (e.g., between hr 8 and 14), leaving open the possibility of roles in multicellular integration and differentiation (Williams 2006; Rosengarten et al. 2013).
We wished to test directly whether lncRNAs played a functional role in development. From a collection of barcoded insertion mutants (Robery et al. 2013), we identified 15 strains with lesions putatively mapped in or adjacent to asRNAs and lncRNAs (File S8). We grew the strains in association with bacteria on nutrient agar plates and observed growth and development of individual plaques. None of the strains showed overt growth impairment or defects in development on cleared agar. A subset of six strains also developed normally on nitrocellulose filters (data not shown). Our failure to recover lncRNA mutants with obvious phenotypes is likely a reflection of the small sample size of known mutants available for testing. Future genetics studies should be mindful that mutations between coding regions may in fact hit functional genetic elements, and should not be discarded as off-target until the expression of that noncoding region is assessed in a wild-type background.
Strong temporal and strand signal of an antisense transcript from the mitochondrial genome
D. discoideum transcribes its mitochondrial genome (mtDNA) from a single initiation site, with all genes on the same strand (Le et al. 2009) (Figure 4A). The resulting polycistronic RNA is processed into eight smaller multigenic units, which are further processed into individual gene transcripts (mtRNAs) (Barth et al. 2001; Le et al. 2009). The transcripts are not polyadenylated, and therefore have not been included in previously published RNA-seq studies that relied on poly-A library enrichment methods. Overall, mtRNA was highly abundant at the onset of starvation and early development, and declined over the developmental time course (Figure 4B). Even the small and large ribosomal subunit genes, which were targeted for depletion during library preparation, retained high abundance values. This result suggests a limitation of the success of the enzymatic depletion method (Adiconis et al. 2013), and speaks to the sheer abundance of mtRNAs overall.
We identified two putative asRNAs mapped to the mtDNA (Figure 4A, green boxes). These overlap with the gene models for rnsA/DDB_G0305150, trnM, trnL, and trnR, and trnP, atp9, trnM, and nad9. The median abundance of the asRNA models was 16-fold lower than that of the top-strand genes. The antisense transcript opposite the rnsA locus sharply peaked in abundance at 14 hr (Figure 4B, red box). This peak did not appear to be correlated with the expression of any other mtRNA. The locus including this asRNA model was also notable for a strand specificity of 0.79, well below the genome median, providing additional confidence that this asRNA is independently transcribed. This time point coincides with a major mtDNA replication event during multicellular differentiation (Shaulsky and Loomis 1995). asRNA has been shown to regulate the replication of plasmids in various prokaryotic systems (Brantl 2002, 2015). Whether or not the uptick in asRNA abundance is related to mtDNA replication in Dictyostelium, rather than a simple coincidence of small sample size, remains to be tested.
An additional consequence of profiling Dictyostelium’s mtRNA is the identification of an annotation issue regarding the small ribosomal subunit gene (rnsA). The annotation of the AX4 mtDNA on dictyBase (version 2013) (Basu et al. 2013) does not include the rnsA gene, but rather the model DDB_G0305150 for a gene of unknown function similar to a bacterial protein. Meanwhile, the NCBI mitochondrial genome record (GenBank: AB000109.1), derived from strain AX3, does include the rnsA gene annotation overlapping this position. The mtDNA sequences from these two databases display 100% identity (BLAST results not shown), but differ slightly in annotation. Our data from AX4 support continuous transcription across the rnsA region. Considering the sequence similarity of this locus to small ribosomal subunit genes of other taxa, we propose that the mtDNA annotations should be reconciled to include the rnsA gene in all cases.
Abundance and modifications of tRNA transcripts
Although earlier studies cataloged several types of small ncRNAs (Aspegren et al. 2004; Avesson et al. 2011), the developmental expression of tRNAs in D. discoideum has not been described in detail. The present study, thus far, has focused on long intergenic and antisense transcripts, and the sample preparation method (Ovation, NuGen) was well suited to isolating these molecules. Although we detected many reads in this library that likely derived from tRNAs, we were not confident in their quantification because the Ovation method was not optimized to retain molecules smaller than 100 bp. To examine tRNAs, which are typically 70–80 bp, we processed a subset of samples (six time points from biological replicate 1) using the Ribo-Zero (Epicentre, Madison) rRNA depletion method.
Most tRNA gene families are represented by more than a single copy in the D. discoideum genome. A total of 418 tRNA genes have been modeled, 403 of which reside in the nuclear genome (http://dictybase.org/, version 2013) (Eichinger et al. 2005; Fey et al. 2009; Basu et al. 2013). These correspond to tRNA families with specificity for 41 codons. The prevalence of duplicated loci suggests that gene copy number may influence tRNA abundance and availability. With 22 loci each, “tRNA-Lys-UUU” and “tRNA-Asp-GUC” are the most repeated of the tRNA genes. However, the codons they decode, AAA and GAC, respectively, are not the most abundant. Because most tRNA genes are multicopy, we created a new reference genome with each tRNA allele represented as a contig. We quantified transcript abundance for each tRNA family as the cumulative abundance of all tRNA genes belonging to that family. Unlike the abundance of mRNAs and other noncoding RNAs, tRNA abundance did not change dramatically during development. Therefore, for all further analyses, we aggregated data from all the developmental time points.
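The family-level quantification described above amounts to summing per-locus abundances over all gene copies that share an amino acid and anticodon. The following is a minimal Python sketch of that bookkeeping, not the pipeline used in this study; the locus identifiers and abundance values are hypothetical placeholders.

```python
from collections import defaultdict


def family_abundance(locus_abundance, locus_to_family):
    """Sum per-locus abundance over all gene copies in each tRNA family."""
    totals = defaultdict(float)
    for locus, abundance in locus_abundance.items():
        totals[locus_to_family[locus]] += abundance
    return dict(totals)


# Hypothetical loci and abundance values, for illustration only.
locus_to_family = {
    "locus_1": "tRNA-Lys-UUU",
    "locus_2": "tRNA-Lys-UUU",
    "locus_3": "tRNA-Asp-GUC",
}
locus_abundance = {"locus_1": 10.0, "locus_2": 5.0, "locus_3": 7.5}

print(family_abundance(locus_abundance, locus_to_family))
```

Summing over copies treats each family as a single pool of decoding capacity, which is the quantity compared against codon frequency.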
We asked whether the abundance of tRNAs correlated with the frequency of the matching codon in the ORFs throughout the genome (Figure 5A). We found that tRNA abundance and codon frequency were weakly positively correlated (SC = 0.35), similar to observations in other organisms (reviewed in Novoa and de Pouplana 2012). Surprisingly, we found that 20 codons had no cognate tRNA partner encoded in the genome.
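A rank correlation of this kind can be sketched in a few lines of Python, assuming that SC denotes Spearman's rank correlation. The sketch computes Pearson's correlation on average ranks, which handles ties; the input values are hypothetical, and the function names are illustrative rather than taken from the analysis software used here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical codon frequencies and matching tRNA family abundances.
codon_frequency = [30.1, 12.4, 5.2, 18.7, 9.9]
trna_abundance = [210.0, 40.0, 95.0, 60.0, 80.0]
print(round(spearman(codon_frequency, trna_abundance), 2))
```

A rank-based statistic is a natural choice here because abundance values span a wide range and need not relate linearly to codon frequency.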
We wondered how the 20 codons with no tRNA interacting partners might be translated. Based on anticodon similarities, we suspect that 12 of these 20 could be translated through canonical wobble interactions with tRNA isoacceptors (Crick 1966) (data not shown). However, the remaining eight codons with missing tRNA partners are unlikely to be translated merely through classical wobble interactions (Figure 5A). These eight unrepresented tRNA–codon pairs included the least frequent codon, CGG (arginine), but also more highly ranked codons such as ACC (threonine, 28th most frequent) and AUC (isoleucine, 34th most frequent). Transcripts of genes that contain these unmatched codons could suffer from stalled translation, possibly leading to endonucleolytic cleavage and “no-go” mRNA decay (Doma and Parker 2006), or to expression of truncated proteins, which could be detrimental to the cells. One simple explanation might be that genes containing these codons are not transcribed, or that their transcripts are not translated. We examined 11,830 coding sequences obtained from dictyBase (http://dictybase.org, version 2013), recently confirmed to be polyadenylated (Cai et al. 2014). More than 98% contained at least one of these eight codons, arguing against the hypothesis that these codons are unexpressed.
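The coding-sequence screen described above can be illustrated with a short Python sketch that counts the fraction of sequences containing at least one in-frame codon from a target set. This is an assumption-laden illustration rather than the analysis code used in this study: the two sequences are toy examples, and only the three codons named in the text (CGG, ACC, AUC) are included in the target set.

```python
def in_frame_codons(cds):
    """Split a coding sequence into in-frame codons, using the RNA alphabet."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fraction_with_any(cds_list, codons):
    """Fraction of sequences containing at least one of the target codons."""
    targets = set(codons)
    if not cds_list:
        return 0.0
    hits = sum(1 for cds in cds_list if targets.intersection(in_frame_codons(cds)))
    return hits / len(cds_list)


# Toy coding sequences (hypothetical), written as DNA.
toy_cds = ["ATGACCTAA", "ATGAAATAA"]
unmatched = {"CGG", "ACC", "AUC"}  # three of the eight codons, as named in the text
print(fraction_with_any(toy_cds, unmatched))
```

Restricting the search to in-frame codons matters, because a target triplet that spans two adjacent codons is never presented to the ribosome as a codon.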
As an alternative, we hypothesized that cells resolve the issue of missing codon–tRNA partners through specific post-transcriptional editing of the tRNA to modify the anticodon specificity (Gerber and Keller 1999; Rubio et al. 2007; Jackman and Alfonzo 2013). tRNAs are subject to myriad biochemical modifications (Jackman and Alfonzo 2013). One of these is deamination of the 5′-adenosine in the anticodon, converting adenosine to inosine, a guanosine analog capable of pairing with A, C, or U (Gerber and Keller 1999). Such modifications expand the decoding capacity of tRNAs to recognize rare codons or compensate for absent isoacceptors. We searched the tRNA transcriptome for evidence of single nucleotide polymorphisms (SNPs) that might modify anticodon specificity. We identified variations within the anticodon that could compensate for four of the eight missing tRNAs (Figure 5B). For example, in the case of the missing tRNA specific for leucine (CUC), we observed variations in the anticodon in transcripts of two tRNA-Leu-AAG alleles (Figure 5, C and D). In both, the anticodon (AAG) was modified to (GAG). In one case, we detected evidence for polymorphism only at the first base of the anticodon (Figure 5C), whereas in the second we additionally identified a SNP at position 28 (Figure 5D). This position, outside of the anticodon, was predicted to remain unpaired in putative secondary structures of the tRNA (Figure S6) (Lowe and Eddy 1997; Schattner et al. 2005). All of the anticodon SNPs involved an A → G transition, and since inosine is read as guanosine by most sequencing technologies, we interpret this to be evidence of A → I deamination. Specialized protocols that test whether these editing events in fact produce inosine, coupled with higher depth of coverage, may be necessary to further validate our findings (Cattenoz et al. 2013).
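The decoding logic behind this interpretation can be made concrete with a small Python sketch of Crick's (1966) wobble rules, extended with inosine at the first anticodon position. It is a simplified model under stated assumptions: only the pairings listed in the tables are considered, other modified bases are ignored, and the function names are illustrative.

```python
# Watson-Crick partners for the second and third anticodon bases.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Crick (1966) wobble partners for the first (5') anticodon base; I = inosine.
WOBBLE = {"A": "U", "C": "G", "G": "CU", "U": "AG", "I": "ACU"}


def codons_read(anticodon):
    """Return the set of codons (5'->3') that an anticodon (5'->3') can pair with."""
    first, second, third = anticodon.upper()
    # Codon positions 1 and 2 pair with anticodon positions 3 and 2.
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    # Codon position 3 pairs with the wobble base at anticodon position 1.
    return {prefix + base for base in WOBBLE[first]}


def as_sequenced(anticodon):
    """Inosine is reported as G by most sequencing technologies."""
    return anticodon.upper().replace("I", "G")


unedited = codons_read("AAG")  # genomic tRNA-Leu-AAG anticodon
edited = codons_read("IAG")    # same anticodon after A -> I at the first base
print(sorted(unedited))
print(sorted(edited))
print(as_sequenced("IAG"))
```

Under these rules the unedited AAG anticodon reads only CUU, whereas the inosine-containing form also reads CUC and CUA, and appears as GAG in sequencing reads, consistent with the A → G variants reported above.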
Overall, 27 other tRNAs were found with SNPs somewhere in the transcript (File S9), and many tRNA genes contained SNPs at more than one position. In contrast to the anticodon variants, most other SNPs resulted in changes of either an A or a G to a T, suggesting the activity of other post-transcriptional modification mechanisms, or tolerance of some amount of uncorrected transcriptional error at specific positions in tRNAs. We can only speculate about the effects of these polymorphisms, which may influence folding efficiency, ribosome interactions, or amino acid loading. Deeper sampling would be necessary to assess whether post-transcriptional tRNA variation is developmentally regulated. Further functional studies might examine how these changes influence protein expression.
Conclusions
Dynamic developmental expression of lncRNAs:
We conclude that D. discoideum expresses lncRNAs throughout development, and that the abundance of these molecules changes over developmental time with a trajectory similar to that of mRNAs. The prevalence of these transcripts and the similarity in expression profiles to that of the protein-coding transcriptome suggest that lncRNAs are relevant to the progression of molecular and cellular phenotypes from single-celled amoebae to multicellular reproductive fruiting bodies. Hypothesized cis- or trans-interactions between lncRNAs and mRNAs add a layer of complexity to the transcriptional regulatory landscape. Further, post-transcriptional tRNA modifications may play an important part in ensuring timely translation of expressed genes. The catalog of transcripts described in this study sets the stage for future functional studies to decode the functions of ncRNAs in Dictyostelium.
Data resources for future studies:
In order to facilitate future analysis of transcriptional links between mRNAs and lncRNAs, we developed a new visualization module on the Dictyostelium gene expression atlas dictyExpress (www.dictyExpress.org) (R. D. Rosengarten, J. Kokošar, L. Jeran, G. Shaulsky, B. Zupan et al., unpublished results; Rot et al. 2009; Stajdohar et al. 2015). When a user selects an mRNA or ncRNA and then searches for similar temporal expression profiles, ncRNAs are included in the results. When one or more ncRNAs are then selected, a parallel time course is plotted so that the transcription profiles may be compared, with the abundance (y-axis) appropriately scaled (Figure S7). This tool will allow the Dictyostelium research community to consider ncRNAs when generating hypotheses to test regarding their genes of interest.
Supplementary Material
Supplemental material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.116.037150/-/DC1.
Acknowledgments
We thank Adam Kuspa and Mariko Katoh-Kurasawa for helpful discussions and commentary; Frank Tansley and Luke Sherlin from NuGen for advice, guidance, and material support; Christopher Dinh for help with identifying REMI mutant strains; Pamela Beltran for technical assistance; Shan Song and Rui Chen for access to materials and sequencing advice; and Lisa White and the Baylor College of Medicine RNA-sequencing core for generously including us in the GARP Ribo-Zero pilot study. We are especially grateful to the team at Genialis Inc. for providing excellent visual analytics software to accommodate the requirements of this study. R.D.R. was supported in part by the Keck Center of the Gulf Coast Consortia, Training Program in Biomedical Informatics, National Library of Medicine (T15LM007093-21, Principal Investigator Tony Gorry, Rice University). Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health (NIH) under award number P01HD039691. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The authors declare no competing interests. No humans or animals, nor data derived from those sources, were used in this study.
Footnotes
Communicating editor: T. R. Hughes
Literature Cited
- Adiconis X., Borges-Rivera D., Satija R., DeLuca D. S., Busby M. A., et al., 2013. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat. Methods 10: 623–629.
- Andrews, S., 2010 FastQC: a Quality Control Tool for High Throughput Sequence Data. Available at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc. Accessed: June 7, 2016.
- Aspegren A., Hinas A., Larsson P., Larsson A., Söderbom F., 2004. Novel non-coding RNAs in Dictyostelium discoideum and their expression during development. Nucleic Acids Res. 32: 4646–4656.
- Avesson L., Schumacher H. T., Fechter P., Romby P., Hellman U., et al., 2011. Abundant class of non-coding RNA regulates development in the social amoeba Dictyostelium discoideum. RNA Biol. 8: 1094–1104.
- Avesson L., Reimegård J., Wagner E. G. H., Söderbom F., 2012. MicroRNAs in Amoebozoa: deep sequencing of the small RNA population in the social amoeba Dictyostelium discoideum reveals developmentally regulated microRNAs. RNA 18: 1771–1782.
- Barth C., Greferath U., Kotsifas M., Tanaka Y., Alexander S., et al., 2001. Transcript mapping and processing of mitochondrial RNA in Dictyostelium discoideum. Curr. Genet. 39: 355–364.
- Basu S., Fey P., Pandit Y., Dodson R., Kibbe W. A., et al., 2013. dictyBase 2013: integrating multiple Dictyostelid species. Nucleic Acids Res. 41: D676–D683.
- Bernstein P., Peltz S. W., Ross J., 1989. The poly(A)-poly(A)-binding protein complex is a major determinant of mRNA stability in vitro. Mol. Cell. Biol. 9: 659–670.
- Brantl S., 2002. Antisense RNAs in plasmids: control of replication and maintenance. Plasmid 48: 165–173.
- Brantl S., 2015. Antisense-RNA mediated control of plasmid replication - pIP501 revisited. Plasmid 78: 4–16.
- Broadbent K. M., Broadbent J. C., Ribacke U., Wirth D., Rinn J. L., et al., 2015. Strand-specific RNA sequencing in Plasmodium falciparum malaria identifies developmentally regulated long non-coding RNA and circular RNA. BMC Genomics 16: 454.
- Cai H., Katoh-Kurasawa M., Muramoto T., Santhanam B., Long Y., et al., 2014. Nucleocytoplasmic shuttling of a GATA transcription factor functions as a development timer. Science 343: 1249531.
- Cattenoz P. B., Taft R. J., Westhof E., Mattick J. S., 2013. Transcriptome-wide identification of A > I RNA editing sites by inosine specific cleavage. RNA 19: 257–270.
- Childs K. L., Davidson R. M., Buell C. R., 2011. Gene coexpression network analysis as a source of functional annotation for rice genes. PLoS One 6: e22196.
- Crick F. H., 1966. Codon–anticodon pairing: the wobble hypothesis. J. Mol. Biol. 19: 548–555.
- Demšar J., Curk T., Erjavec A., Gorup Č., Hočevar T., et al., 2013. Orange: data mining toolbox in Python. J. Mach. Learn. Res. 14: 2349–2353.
- Derrien T., Johnson R., Bussotti G., Tanzer A., Djebali S., et al., 2012. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22: 1775–1789.
- Doma M. K., Parker R., 2006. Endonucleolytic cleavage of eukaryotic mRNAs with stalls in translation elongation. Nature 440: 561–564.
- Eichinger L., Pachebat J. A., Glöckner G., Rajandream M.-A., Sucgang R., et al., 2005. The genome of the social amoeba Dictyostelium discoideum. Nature 435: 43–57.
- Fey P., Gaudet P., Curk T., Zupan B., Just E. M., et al., 2009. dictyBase–a Dictyostelium bioinformatics resource update. Nucleic Acids Res. 37: D515–D519.
- Gerber A. P., Keller W., 1999. An adenosine deaminase that generates inosine at the wobble position of tRNAs. Science 286: 1146–1149.
- Guil S., Esteller M., 2012. Cis-acting noncoding RNAs: friends and foes. Nat. Struct. Mol. Biol. 19: 1068–1075.
- Haas B. J., Papanicolaou A., Yassour M., Grabherr M., Blood P. D., et al., 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8: 1494–1512.
- Hildebrandt M., Nellen W., 1992. Differential antisense transcription from the Dictyostelium EB4 gene locus: implications on antisense-mediated regulation of mRNA stability. Cell 69: 197–204.
- Hinas A., Söderbom F., 2007. Treasure hunt in an amoeba: non-coding RNAs in Dictyostelium discoideum. Curr. Genet. 51: 141–159.
- Hinas A., Larsson P., Avesson L., Kirsebom L. A., Virtanen A., et al., 2006. Identification of the major spliceosomal RNAs in Dictyostelium discoideum reveals developmentally regulated U2 variants and polyadenylated snRNAs. Eukaryot. Cell 5: 924–934.
- Iranfar N., Fuller D., Loomis W. F., 2003. Genome-wide expression analyses of gene regulation during early development of Dictyostelium discoideum. Eukaryot. Cell 2: 664–670.
- Jackman J. E., Alfonzo J. D., 2013. Transfer RNA modifications: nature’s combinatorial chemistry playground. Wiley Interdiscip. Rev. RNA 4: 35–48.
- Kessin R. H., 2001. Dictyostelium: Evolution, Cell Biology, and Development of Multicellularity. Cambridge University Press, Cambridge, UK.
- Koboldt D. C., Zhang Q., Larson D. E., Shen D., McLellan M. D., et al., 2012. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22: 568–576.
- Kornienko A. E., Guenzl P. M., Barlow D. P., Pauler F. M., 2013. Gene regulation by the act of long non-coding RNA transcription. BMC Biol. 11: 59.
- Krzywinski M., Schein J., Birol I., Connors J., Gascoyne R., et al., 2009. Circos: an information aesthetic for comparative genomics. Genome Res. 19: 1639–1645.
- Langmead B., Salzberg S. L., 2012. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9: 357–359.
- Larsson P., Hinas A., Ardell D. H., Kirsebom L. A., Virtanen A., et al., 2008. De novo search for non-coding RNA genes in the AT-rich genome of Dictyostelium discoideum: performance of Markov-dependent genome feature scoring. Genome Res. 18: 888–899.
- Le P., Fisher P. R., Barth C., 2009. Transcription of the Dictyostelium discoideum mitochondrial genome occurs from a single initiation site. RNA 15: 2321–2330.
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., et al., 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079.
- Liu S. J., Nowakowski T. J., Pollen A. A., Lui J. H., Horlbeck M. A., et al., 2016. Single-cell analysis of long non-coding RNAs in the developing human neocortex. Genome Biol. 17: 67.
- Loomis W. F., Shaulsky G., 2011. Developmental changes in transcriptional profiles. Dev. Growth Differ. 53: 567–575.
- Lowe T. M., Eddy S. R., 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955–964.
- Miranda E. R., Rot G., Toplak M., Santhanam B., Curk T., et al., 2013a Transcriptional profiling of Dictyostelium with RNA sequencing, in Dictyostelium discoideum Protocols, Methods in Molecular Biology. Springer, New York.
- Miranda E. R., Zhuchenko O., Toplak M., Santhanam B., Zupan B., et al., 2013b ABC transporters in Dictyostelium discoideum development. PLoS One 8: e70040.
- Novoa E. M., de Pouplana L. R., 2012. Speeding with control: codon usage, tRNAs, and ribosomes. Trends Genet. 28: 574–581.
- Okamura Y., Obayashi T., Kinoshita K., 2015. Comparison of gene coexpression profiles and construction of conserved gene networks to find functional modules. PLoS One 10: e0132039.
- Parikh A., Miranda E. R., Katoh-Kurasawa M., Fuller D., Rot G., et al., 2010. Conserved developmental transcriptomes in evolutionarily divergent species. Genome Biol. 11: R35.
- Pelechano V., Steinmetz L. M., 2013. Gene regulation by antisense transcription. Nat. Rev. Genet. 14: 880–893.
- Qu Z., Adelson D. L., 2012. Identification and comparative analysis of ncRNAs in human, mouse and zebrafish indicate a conserved role in regulation of genes expressed in brain. PLoS One 7: e52275.
- Quinn J. J., Chang H. Y., 2016. Unique features of long non-coding RNA biogenesis and function. Nat. Rev. Genet. 17: 47–62.
- Raper K., 1940. Pseudoplasmodium formation and organization in Dictyostelium discoideum. J. Elisha Mitchell Sci. Soc. 56: 241–282.
- Rinn J. L., Chang H. Y., 2012. Genome regulation by long noncoding RNAs. Annu. Rev. Biochem. 81: 145–166.
- Robery S., Tyson R., Dinh C., Kuspa A., Noegel A. A., et al., 2013. A novel human receptor involved in bitter tastant detection identified using Dictyostelium discoideum. J. Cell Sci. 126: 5465–5476.
- Robinson J. T., Thorvaldsdóttir H., Winckler W., Guttman M., Lander E. S., et al., 2011. Integrative genomics viewer. Nat. Biotechnol. 29: 24–26.
- Rosengarten R. D., Santhanam B., Katoh-Kurasawa M., 2013. Transcriptional regulators: dynamic drivers of multicellular formation, cell differentiation and development, pp. 89–108 in Dictyostelids. Springer, New York.
- Rosengarten R. D., Santhanam B., Fuller D., Katoh-Kurasawa M., Loomis W. F., et al., 2015a Leaps and lulls in the developmental transcriptome of Dictyostelium discoideum. BMC Genomics 16: 294.
- Rosengarten R. D., Beltran P. R., Shaulsky G., 2015b A deep coverage Dictyostelium discoideum genomic DNA library replicates stably in Escherichia coli. Genomics 106: 249–255.
- Rot G., Parikh A., Curk T., Kuspa A., Shaulsky G., et al., 2009. dictyExpress: a Dictyostelium discoideum gene expression database with an explorative data analysis web-based interface. BMC Bioinformatics 10: 265.
- Rubio M. A. T., Pastar I., Gaston K. W., Ragone F. L., Janzen C. J., et al., 2007. An adenosine-to-inosine tRNA-editing enzyme that can perform C-to-U deamination of DNA. Proc. Natl. Acad. Sci. USA 104: 7821–7826.
- Sachs A., 1990. The role of poly(A) in the translation and stability of mRNA. Curr. Opin. Cell Biol. 2: 1092–1098.
- Santhanam B., Cai H., Devreotes P. N., Shaulsky G., Katoh-Kurasawa M., 2015. The GATA transcription factor GtaC regulates early developmental gene expression dynamics in Dictyostelium. Nat. Commun. 6: 7551.
- Schattner P., Brooks A. N., Lowe T. M., 2005. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 33: W686–W689.
- Shaulsky G., Loomis W. F., 1995. Mitochondrial DNA replication but no nuclear DNA replication during development of Dictyostelium. Proc. Natl. Acad. Sci. USA 92: 5660–5663.
- Stajdohar, M., L. Jeran, J. Kokošar, D. Blenkus, T. Janez et al., 2015 dictyExpress: visual analytics of NGS gene expression in Dictyostelium. Available at: https://www.dictyexpress.org. Accessed: June 7, 2016.
- Sucgang R., Chen G., Liu W., Lindsay R., Lu J., et al., 2003. Sequence and structure of the extrachromosomal palindrome encoding the ribosomal RNA genes in Dictyostelium. Nucleic Acids Res. 31: 2361–2368.
- Suzuki R., Shimodaira H., 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics 22: 1540–1542.
- Takeishi K., Kaneda S., 1979. Low molecular weight nuclear RNA species in Dictyostelium discoideum. Nucleic Acids Symp. Ser. 6: s125–s127.
- Takeishi K., Kaneda S., 1981. Isolation and characterization of small nuclear RNAs from Dictyostelium discoideum. J. Biochem. 90: 299–308.
- Thorvaldsdóttir H., Robinson J. T., Mesirov J. P., 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14: 178–192.
- Trapnell C., Pachter L., Salzberg S. L., 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111.
- Trapnell C., Roberts A., Goff L., Pertea G., Kim D., et al., 2012. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7: 562–578.
- Van Driessche N., Shaw C., Katoh M., Morio T., Sucgang R., et al., 2002. A transcriptional profile of multicellular development in Dictyostelium discoideum. Development 129: 1543–1552.
- Van Driessche N., Demsar J., Booth E. O., Hill P., Juvan P., et al., 2005. Epistasis analysis with global transcriptional phenotypes. Nat. Genet. 37: 471–477.
- Wang Z., Day N., Trifillis P., Kiledjian M., 1999. An mRNA stability complex functions with poly(A)-binding protein to stabilize mRNA in vitro. Mol. Cell. Biol. 19: 4552–4560.
- Williams J. G., 2006. Transcriptional regulation of Dictyostelium pattern formation. EMBO Rep. 7: 694–698.
- Williams J. G., 2010. Dictyostelium finds new roles to model. Genetics 185: 717–726.
Data Availability Statement
Gene expression data are available at GEO with the accession number GSE90829. File S2 contains the raw transcriptome assembly output by Cufflinks/Cuffmerge. File S3 contains the transcript abundances of these genes and DDB_G gene models from the latest genome assembly. File S4 contains a table of all curated ncRNA transcripts and their corresponding DDB_G gene model. File S5 contains the mRNA expression values used to determine correlations between library preparation methods. File S6 contains SCs and strand orientation between lncRNAs and their nearest neighbors. File S7 contains asRNA confidence statistics. File S8 contains a table of REMI mutant strains, which are available upon request. File S9 contains the VarScan output from the tRNA analysis. Further, upon publication, all transcriptome data from this study may be explored and compared to previous works at www.dictyexpress.org.