Genome Biology and Evolution. 2014 Jun 20;6(7):1707–1723. doi: 10.1093/gbe/evu139

The Draft Assembly of the Radically Organized Stylonychia lemnae Macronuclear Genome

Samuel H Aeschlimann 1, Franziska Jönsson 2, Jan Postberg 2,3, Nicholas A Stover 4, Robert L Petera 4, Hans-Joachim Lipps 2,*, Mariusz Nowacki 1,*, Estienne C Swart 1,*
PMCID: PMC4122937  PMID: 24951568

Abstract

Stylonychia lemnae is a classical model single-celled eukaryote, and a quintessential ciliate typified by dimorphic nuclei: A small, germline micronucleus and a massive, vegetative macronucleus. The genome within Stylonychia’s macronucleus has a very unusual architecture, comprising variably and highly amplified “nanochromosomes,” each usually encoding a single gene with a minimal amount of surrounding noncoding DNA. As only a tiny fraction of the Stylonychia genes has been sequenced, and to promote research using this organism, we sequenced its macronuclear genome. We report the analysis of the 50.2-Mb draft S. lemnae macronuclear genome assembly, containing in excess of 16,000 complete nanochromosomes, assembled as fewer than 20,000 contigs. We found considerable conservation of fundamental genomic properties between S. lemnae and its close relative, Oxytricha trifallax, including nanochromosomal gene synteny, alternative fragmentation, and copy number. Protein domain searches in Stylonychia revealed two new telomere-binding protein homologs and the presence of linker histones. Among the diverse histone variants of S. lemnae and O. trifallax, we found divergent, coexpressed variants corresponding to four of the five core nucleosomal proteins (H1.2, H2A.6, H2B.4, and H3.7) suggesting that these ciliates may possess specialized nucleosomes involved in genome processing during nuclear differentiation. The assembly of the S. lemnae macronuclear genome demonstrates that largely complete, well-assembled highly fragmented genomes of similar size and complexity may be produced from one library and lane of Illumina HiSeq 2000 shotgun sequencing. The provision of the S. lemnae macronuclear genome sets the stage for future detailed experimental studies of chromatin-mediated, RNA-guided developmental genome rearrangements.

Keywords: macronuclear genome, nanochromosome, genome rearrangement, histone variant, chromosome copy number, alternative fragmentation

Introduction

As is characteristic of ciliates, Stylonychia lemnae possesses both a macronucleus (MAC), specialized for gene expression, and a micronucleus (MIC), containing the germline genome that permits recombination and transmission of genetic information across sexual generations (Prescott 1994) (fig. 1). As a genus, Stylonychia has a long and rich history as a subject for studies of nuclear organization and development, chromosomes and chromatin, and telomere biology and genome rearrangement (reviewed in Prescott 1994, 2000; Fuhrmann et al. 2013). Among the first records of chromosomes and the mitotic spindle were detailed drawings of micronuclei from Stylonychia species published by Bütschli (1876) (fig. 1B). The discovery of a large DNA loss (over 90%) in S. lemnae’s developing MAC following polyploidization (Ammermann 1968) spurred the studies of genome reduction and reorganization in ciliates (Prescott 1994). Subsequently S. lemnae (which, for simplicity’s sake, we refer to as Stylonychia henceforth) has been extensively used as a model unicellular organism to study the regulation of telomere structure (reviewed in Lipps and Rhodes 2009) and chromatin dynamics (Postberg et al. 2008, 2010; Bulic et al. 2013; Forcob et al. 2014) during genome reorganization and macronuclear differentiation. In spite of the success in studying these processes, their analysis has been hampered by the difficulty of manipulating Stylonychia by classical genetic means and by the limited availability of sequence information. The provision of an annotated Stylonychia draft MAC genome is a significant contribution to addressing the latter problem.

Fig. 1.—

Stylonychia macronuclei. (A–C) Illustrations of successive stages of Stylonychia nuclear division during cellular replication, modified from (Bütschli 1876). Replication bands are responsible for DNA synthesis in spirotrichous ciliates (Gall 1959), including Stylonychia, and sweep through the MAC during asexual division (Ammermann 1971). The granular structure of macronuclei, due to nucleolar bodies (Postberg et al. 2006), is also shown. There is currently no indication of a classical spindle in macronuclei (Ammermann 1971). (D) A pair of conjugating S. lemnae cells containing a mixture of old fragmenting macronuclei, new macronuclei, and micronuclei (DAPI staining in red; overlaid on a micrograph of the cells) illustrating the complexity of nuclear organization and development.

The genomes contained within the Stylonychia micro- and macronuclei both have extraordinary architectures, with the former containing elaborately “scrambled” DNA segments which need to be reorganized and joined to form a highly fragmented genome composed of “nanochromosomes.” During sexual development (triggered by conjugation of compatible pairs of cells), a copy of the MIC genome develops into a fresh MAC genome by sophisticated reorganization processes including: 1) The excision of intervening sequences (“internally eliminated sequences,” or IESs) in MAC-destined sequences, 2) unscrambling of MAC-destined DNA, 3) elimination of bulk DNA containing both repetitive and unique DNA, 4) fragmentation of the genome, 5) de novo addition of telomeres, and 6) amplification of MAC sequences to specific copy numbers (reviewed in Prescott 1994, 2000). The MAC genomes of stichotrichous ciliates (including S. lemnae and Oxytricha trifallax) are organized as tiny, mostly gene-sized molecules with a minimal amount of subtelomeric noncoding sequence, allowing them to be exploited as natural gene finders for both protein-coding and noncoding RNA genes (Jung et al. 2011; Swart et al. 2013). Mature nanochromosomes are capped on either end by simple telomeric repeats (Oka et al. 1980; Klobutcher et al. 1981; Lipps and Erhardt 1981; Pluta et al. 1982). In stichotrichs, alternative processing of certain developing macronuclear DNA regions generates nanochromosome isoforms (Herrick et al. 1987). In O. trifallax (henceforth, Oxytricha), approximately 10% of nanochromosomes have more than one site of telomere addition and give rise to one or a few isoforms, typically with intact genes (Swart et al. 2013).

Macronuclear DNA giving rise to nanochromosomes is variably amplified in two successive rounds (Ammermann 1971; Ammermann et al. 1974) during development, resulting in thousands of copies of each nanochromosome (∼15,000 copies in S. lemnae [Steinbrück 1983] and ∼1,900 copies in O. trifallax [Prescott 1994] on average). In Stylonychia and Oxytricha, the most highly amplified nanochromosome encodes the 18S, 5.8S, and 28S rRNA subunits (Lipps and Steinbruck 1978; Spear 1980; Swanton et al. 1980, 1982). In Oxytricha, the copy number of this nanochromosome is approximately 56 times the mean nanochromosome copy number (Swart et al. 2013). The copy number of most Oxytricha nanochromosomes lies within approximately a single order of magnitude interval (Swart et al. 2013). Macronuclear division during cellular replication occurs by a process known as amitosis, which results in stochastic segregation between the two resulting nuclei without the aid of a mitotic spindle (reviewed in Prescott 1994). In ciliates the amount of DNA amplification in mature macronuclei is proportional to their cell volume (Taylor and Shuter 1981), which has been hypothesized to reflect the need for increased gene expression as cell size increases (Bell 1988) (the largest ciliates may be as long as a few millimeters [Lynn 2008]). Thus, S. lemnae cells (largest dimension 140 μm [Prescott 1994]) were estimated to have approximately 6.8× the macronuclear DNA content of cells from Oxytricha species (Prescott 1994) (largest dimension ∼95 μm).

Although the O. trifallax MAC genome has recently been published (Swart et al. 2013), there is also considerable interest in obtaining a draft S. lemnae MAC genome, both for comparative purposes and to facilitate future detailed experimental studies using Stylonychia as a model organism. We show that continued improvements in genome assembly and conventional Illumina sequencing now permit the production of well-assembled, and largely complete draft nanochromosomal MAC genomes with just a single Illumina paired-end (PE) DNA-seq library (∼68× coverage with 90-bp reads). We report the first comparative genomic analyses between two highly fragmented macronuclear genomes, focusing on gene synteny and nanochromosome copy number conservation. We conclude by describing the diversity of histone variants in Stylonychia, including a number of unique, highly divergent histone H1, H2A, H2B, and H3 variants. Our discoveries suggest that a wealth of treasures is still waiting to be revealed in the macronuclear genomes of both Oxytricha and Stylonychia.

Materials and Methods

DNA Isolation, Illumina Library Creation, and Sequencing

Illumina shotgun libraries were created for S. lemnae cells of lab strain 130c. This strain was created by conjugating cells of two different mating types from two inbred lab strains, which were derived from cells originally collected in Southern Germany (Stylonychia strains senesce after some time in laboratory culture, so the cells must be conjugated on a regular basis [Duerr et al. 2004]). Stylonychia lemnae cells were grown and lysed as previously described (Ammermann et al. 1974). After lysis of cells, macronuclei were separated from micronuclei by sieving the cell lysates twice through a 15-µm gauze, collecting the macronuclei on the gauze. Macronuclei were collected by centrifugation (2,400 rpm, 6 min, 4 °C) and then lysed in Karvenoff–Zimm buffer (10 mM Tris; 0.5 M EDTA; 1% SDS; pH 9.5) for 30 min at 65 °C. Proteinase K (0.2 mg/ml) was then added (50 °C, overnight) followed by RNase digestion (1 h, 37 °C). After phenol/chloroform extraction, DNA was dialyzed against 0.1× Tris–ethylenediaminetetraacetic acid (24 h, 4 °C) followed by ethanol precipitation.

The Beijing Genome Institute produced standard Illumina genomic libraries from macronuclear DNA fragmented to specific size ranges by a Covaris E220 sonicator, followed by size selection of the library DNA fragments during agarose gel electrophoresis. Sequencing was performed on a HiSeq 2000 sequencer.

Choice of Illumina Library for Assembly

Four libraries with different PE length distributions were created from Stylonychia macronuclear DNA (distributions of PE read lengths are shown in supplementary fig. S3, Supplementary Material online). We define the best assembly as the assembly which maximizes the ratio of total nanochromosomes to total contigs. Irrespective of the assembler used, the best assemblies were typically produced from the shortest PE library (e.g., table 1). Combining this library with other libraries usually produced worse assemblies. We therefore focused our efforts on optimizing the assembly with this library alone.

Table 1.

SPAdes Assemblies of Different Stylonychia MAC DNA Libraries

| Property | Library 24 | Library 25 | Library 26 | Library 27 |
| --- | --- | --- | --- | --- |
| Assembly size (Mb) | 52.8 | 47.9 | 50.1 | 49.9 |
| Contigs (n) | 29,175 | 23,673 | 32,495 | 41,263 |
| Telomeres (n) | 34,815 | 13,806 | 27,912 | 9,878 |
| Mean contig length (bp) | 1,811 | 2,022 | 1,542 | 1,210 |
| Max contig length (bp) | 65,401 | 65,232 | 65,436 | 65,079 |
| 2-Telomere contigs (n) | 16,082 | 3,822 | 10,912 | 2,069 |
| 1-Telomere contigs (n) | 2,550 | 6,106 | 6,028 | 5,731 |
| 0-Telomere contigs (n) | 10,543 | 13,745 | 15,555 | 33,463 |
| Total PE read coverage (%) | 98.1 | 91.1 | 93.0 | 96.3 |
| Telomeric PE read coverage (%) | 92.2 | 59.5 | 79.7 | 62.5 |
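
The selection criterion described above (the ratio of total nanochromosomes to total contigs) can be illustrated with the values in table 1. The following is a minimal sketch, assuming that "total nanochromosomes" corresponds to the 2-telomere contig count; the function and variable names are illustrative and are not taken from the original analysis.

```python
# Values copied from table 1 (SPAdes assemblies of libraries 24-27).
# Assumption: "nanochromosomes" are counted as 2-telomere contigs.
ASSEMBLIES = {
    "Library 24": {"contigs": 29175, "two_telomere": 16082},
    "Library 25": {"contigs": 23673, "two_telomere": 3822},
    "Library 26": {"contigs": 32495, "two_telomere": 10912},
    "Library 27": {"contigs": 41263, "two_telomere": 2069},
}


def nanochromosome_ratio(stats):
    """Fraction of contigs that carry telomeres at both ends."""
    return stats["two_telomere"] / stats["contigs"]


def best_assembly(assemblies):
    """Name of the assembly with the highest nanochromosome-to-contig ratio."""
    return max(assemblies, key=lambda name: nanochromosome_ratio(assemblies[name]))


if __name__ == "__main__":
    for name, stats in ASSEMBLIES.items():
        print(f"{name}: {nanochromosome_ratio(stats):.3f}")
    print("Best:", best_assembly(ASSEMBLIES))
```

Under this assumption, library 24 has the highest ratio (about 0.55), in line with the choice of the shortest PE library.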

IDBA-UD/Terminator 2.0 Assembly

Our initial best macronuclear genome assembly for Stylonychia with IDBA-UD contained approximately 9,400 nanochromosomes, but most of the contigs in the assembly lacked one or both telomeres (∼29,100).

We were able to significantly improve the O. trifallax macronuclear genome assembly with a custom meta-assembly pipeline (which we now call “Terminator 1.0”), but this pipeline was complex and used multiple sequence types. We therefore developed a simplified version of this approach for S. lemnae (Terminator 2.0) using Illumina sequence data alone (supplementary fig. S1, Supplementary Material online). We used CAP3 (Huang and Madan 1999) to merge assemblies, as was done for the Oxytricha MAC genome assembly, but instead of attempting to prevent collapse of all alleles, we chose a less restrictive assembly criterion: matches of ≥40 bp with ≥97% identity. To extend the contigs we used two different read mappers, SHRiMP 2 (Rumble et al. 2009; David et al. 2011) and smalt (version 0.71; http://www.sanger.ac.uk/resources/software/smalt/, last accessed June 30, 2014). An improvement over the Oxytricha pipeline is that we used read pairing information to avoid incorrect contig extension. This extension process is relatively strict, as we only permit an extension if one of the reads in the read pair matches a contig end with zero or one substitution. We did not include the chimera detection and removal step used in the Oxytricha macronuclear genome assembly, since, as judged by visual inspection of Stylonychia assemblies, chimeras did not appear to be a significant issue.
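
The strictness of the extension rule (at most one substitution between a read and a contig end) can be made concrete with a small sketch. This is not the Terminator 2.0 code: the function names, the ungapped comparison, and the omission of reverse-complement handling are all simplifying assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of substituted positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def matches_contig_end(read: str, contig_end: str, max_subs: int = 1) -> bool:
    """True if `read` aligns without gaps somewhere inside `contig_end`
    with at most `max_subs` substitutions (same strand only)."""
    n = len(read)
    if n == 0 or n > len(contig_end):
        return False
    return any(
        hamming(read, contig_end[i:i + n]) <= max_subs
        for i in range(len(contig_end) - n + 1)
    )


def pair_permits_extension(read1: str, read2: str, contig_end: str,
                           max_subs: int = 1) -> bool:
    """A read pair supports an extension if either read matches the contig end."""
    return (matches_contig_end(read1, contig_end, max_subs)
            or matches_contig_end(read2, contig_end, max_subs))
```

A real implementation would take the match positions from the read mapper's output rather than rescanning the contig end.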

In the Terminator 2.0 pipeline, we first extended the contigs with SHRiMP until there were no longer any significant improvements after assembly with CAP3 (at ∼16 iterations). After visual inspection of contigs that failed to be extended by SHRiMP, but were extended by the Geneious read mapper (Kearse et al. 2012), we found that many contigs could still be extended. We found that smalt was capable of mapping additional reads to many of the ends, and so we continued to extend the contigs with smalt. The proportion of nanochromosomes increased only slightly with increasing extensions, and so we chose our final CAP3 meta-assembly as the assembly after ten additional extensions using smalt.

The successive rounds of extension of the contigs using SHRiMP mapping results, followed by CAP3 meta-assembly, increased the total nanochromosome tally by approximately 4,800. This tally was further increased by approximately 1,850 nanochromosomes after a further round of extension using smalt, followed by CAP3 meta-assembly, yielding a total of approximately 16,100 nanochromosomes (table 2).

Table 2.

Best Stylonychia MAC Genome Assemblies

| Property | SPAdes (SE) | SPAdes (PE) | Terminator 2.0 | Final (SPAdes Polished) |
| --- | --- | --- | --- | --- |
| Assembly size (Mb) | 52.2 | 52.8 | 54.7 | 50.2 |
| Contigs (n) | 31,850 | 29,175 | 22,758 | 19,851 |
| Telomeres (n) | 34,422 | 34,815 | 35,961 | 34,327 |
| Mean contig length (bp) | 1,640 | 1,811 | 2,404 | 2,531 |
| Max contig length (bp) | 65,404 | 65,401 | 65,407 | 65,401 |
| 2-Telomere contigs (n) | 14,619 | 16,082 | 16,082 | 16,059 |
| 1-Telomere contigs (n) | 5,132 | 2,550 | 3,683 | 2,104 |
| 0-Telomere contigs (n) | 12,099 | 10,543 | 2,993 | 1,688 |
| Total PE read coverage (%) | 97.9 | 98.1 | 98.2 | 98.0 |
| Telomeric PE read coverage (%) | 90.0 | 92.2 | 90.4 | 91.0 |

SPAdes Assembly

SPAdes (2.5.0) (Bankevich et al. 2012) was run with the “careful” option and the BayesHammer error correction algorithm (Nikolenko et al. 2013) on Illumina library 24. CAP3 (Huang and Madan 1999) was used to assemble potentially overlapping contigs from the SPAdes assembly, using a coverage length cutoff of 40 bp and an overlap percent identity cutoff of 97. To remove redundant “chaff” contigs from the assembly, BLAT (Kent 2002) was used to map contigs shorter than 500 bp to contigs longer than 500 bp. Contigs shorter than 500 bp were discarded (8,920 in total) if they matched the longer contigs with greater than 80% coverage and greater than 90% sequence identity (this had a minimal effect on the assembly completeness: See table 2).
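
The chaff filter can be expressed as a simple predicate over alignment hits. The sketch below assumes that coverage is the aligned fraction of the short contig and that identity is the fraction of identical bases within the aligned region; the `Hit` record and its field names are hypothetical and do not correspond to BLAT's output columns.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str        # name of the short contig
    query_len: int    # length of the short contig (bp)
    aligned_len: int  # query bases covered by the alignment
    matches: int      # identical bases within the alignment


def is_chaff(hit: Hit, max_len: int = 500,
             min_cov: float = 0.80, min_ident: float = 0.90) -> bool:
    """True if a short contig is redundant with respect to a longer contig."""
    if hit.query_len >= max_len or hit.aligned_len == 0:
        return False
    coverage = hit.aligned_len / hit.query_len
    identity = hit.matches / hit.aligned_len
    return coverage > min_cov and identity > min_ident


def chaff_contigs(hits):
    """Names of all short contigs with at least one qualifying hit."""
    return {h.query for h in hits if is_chaff(h)}
```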

Assembly Cleanup

Prior to submitting the macronuclear genome to the European Nucleotide Archive, we clipped Illumina adapters at the ends of contigs matching adapter sequences in the UniVec (ftp://ftp.ncbi.nih.gov/pub/UniVec/UniVec, last accessed June 30, 2014) and EMVec (ftp://ftp.ebi.ac.uk/pub/databases/emvec/emvec.dat.gz, last accessed June 30, 2014) databases (using BLASTN with default parameters; BLAST+ 2.2.26 [Camacho et al. 2009]). We also removed four contigs (Contig8462, Contig1165, Contig12163, and Contig7566) with longer matches to these vector databases.

The macronuclear isolation protocol used kept DNA contamination from Stylonychia mitochondria to a minimum: Only two short (<1,080 bp), telomereless contigs (Contig10464 and Contig843) had substantial TBLASTX matches (e value < 1e-3) to the Oxytricha mitochondrial genome (Swart et al. 2012). These contigs were removed.

Genome Assemblies from Other Assemblers

The following additional genome assemblers were tested: ABySS (Simpson et al. 2009), IDBA-UD (Peng et al. 2012), Velvet (Zerbino and Birney 2008), MetaVelvet (Namiki et al. 2012), Minia (Chikhi and Rizk 2013), Mira (Chevreux et al. 1999), and SOAPdenovo (Li et al. 2010). See supplementary table S1, Supplementary Material online, for statistics of the best genome assemblies produced with these assemblers, all using library 24. ABySS (version 1.3.4) was run in default mode with a k-mer size of 31. IDBA-UD (version 1.0.9) was run with the switches “--mink 25 --maxk 89 --step 2”. Velvet was run with k-mer size 31, automatic coverage and cutoff of 10. The Velvet assembly was used as input to MetaVelvet (version 1.2.02) run in default mode. Minia was compiled using the make k=100 option to include higher k-mer sizes and run with a k-mer size of 31 and minimum abundance of 4. Mira (version 3.4.1.1) was run in accurate mode (job=denovo,genome,accurate,solexa SOLEXA_SETTINGS -GE:tismin=100:tismax=200) on a subset of 20 million reads randomly chosen from library 24 by a custom Python script. SOAPdenovo was run in default mode with the multi-kmer switch “-m 63”.
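
The custom Python script used to subsample reads for Mira is not described further. One common way to draw a fixed-size uniform random subset from a read file too large to hold in memory is reservoir sampling, sketched below. This is a generic illustration rather than the script used in this study, and the function name and seed parameter are assumptions.

```python
import random


def reservoir_sample(records, k, seed=0):
    """Return k records chosen uniformly at random from an iterable
    of unknown length (Algorithm R), reading the input only once."""
    rng = random.Random(seed)
    sample = []
    for n, record in enumerate(records):
        if n < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace a reservoir slot with probability k / (n + 1).
            j = rng.randint(0, n)  # inclusive at both ends
            if j < k:
                sample[j] = record
    return sample
```

For PE data, each record would be a read pair rather than a single read, so that mates are kept together.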

Validation of Final Genome Assembly

As the mean insert size of library 25 is relatively large (463 bp) compared with the average nanochromosome size, mapped reads from this library provide a means of validating our final assembly. Tolerating no substitutions during read mapping, 98.9% of the mapped reads were properly paired, as defined by the read mapper bwa (Li and Durbin 2009) (i.e., correctly oriented and within a reasonable insert size range; see http://bio-bwa.sourceforge.net/bwa.shtml, last accessed June 30, 2014).

Gene Prediction

Stylonychia gene predictions were produced de novo using AUGUSTUS (version 2.5.5) (Stanke et al. 2006) previously trained on O. trifallax (Swart et al. 2013) with the following parameters: “--species=oxytricha --UTR=on --extrinsicCfgFile=install/augustus.2.5.5/config/extrinsic/extrinsic.M.RM.E.W.cfg --alternatives-from-evidence=true --genemodel=complete --codingseq=on”. No RNA-seq data were used to provide additional constraints (hints) for the gene prediction.

Overall, the Stylonychia gene prediction statistics (supplementary table S2, Supplementary Material online) are similar to those from Oxytricha (supplementary table S3, Supplementary Material online), which is presumably both a consequence of the training on Oxytricha data, and the similarity between these organisms. In Stylonychia and Oxytricha (with RNA-seq “hints” for AUGUSTUS), 73% and 76% of nanochromosomes are predicted to contain a single gene, respectively.

Protein Annotation

HMMER version 3.1b1 (Eddy 2013) was used to annotate protein domains with the Pfam database (Mistry et al. 2013) (version 26.0). Supplementary data file S1 (stylo_asm1.all.domtable.txt.zip), Supplementary Material online, contains these annotations.

Blast2GO (Conesa et al. 2005; Gotz et al. 2008) (version 2.5.0; default parameters) was used to annotate and name predicted proteins. Results from BLASTP (version 2.2.28) of predicted Stylonychia proteins versus the National Center for Biotechnology Information nonredundant (nr) database (retrieved on July 29, 2013), with an e value threshold of 1e-3 and max_target_seqs=20, and InterProScan version 4.8 (Quevillon et al. 2005) (run in default mode) were the input for Blast2GO. Supplementary data file S2 (stylo_asm1.fixed.go_annotations.txt), Supplementary Material online, contains the final annotations from this pipeline.

Assessment of Genome Completeness

We used three methods to assess the completeness of the draft Stylonychia macronuclear genome assembly: 1) The percentage of reads mapping to the assembly, 2) the percentage of conserved core eukaryotic genes (CEGs) (Parra et al. 2007) with Stylonychia homologs, and 3) whether a complete set of tRNA genes was predicted. All these measures indicate that the draft Stylonychia macronuclear genome assembly is essentially complete.

Individual read libraries were mapped to the MAC assemblies with LAST (Frith et al. 2010; Kielbasa et al. 2011) (lastal -r6 -q18 -a21 -b9 -e180) to estimate raw read coverage. Reads containing telomeric sequences were separately mapped to the assemblies using LAST, to estimate telomeric read coverage. Output MAF alignment files were converted into SAM files with maf-convert.py from the LAST package. Mapped reads with ≥90% identical matches, covering ≥70% of the read length, were counted. Almost 98% of the nontelomeric Illumina reads in our small fragment library and 92% of telomeric reads match our draft assembly (table 2). The reduced fraction of matching telomeric reads may indicate that we have missed a small fraction of alternative nanochromosome ends, but as nanochromosome ends were typically found in nongenic regions in Oxytricha (Swart et al. 2013) and only approximately 2% of all the reads do not map to the Stylonychia MAC genome assembly (table 2), we expect only a minor loss of sequence information.

For the CEG analysis, protein sequences from Stylonychia were BLASTed against the 248 CEGs (Parra et al. 2007). Matches from BLASTP with e values lower than 1e-10 and a sequence coverage ≥70% of the CEG sequence were counted as a match. Of the 248 accepted CEGs, 234 had likely homologs among the proteins predicted for Stylonychia (based on the BLASTP criteria above). Ten of the fourteen remaining CEGs also appear to be absent in the Oxytricha MAC genome based on the CEGMA criteria (Parra et al. 2007) (see supplementary table S4, Supplementary Material online, for the missing BLASTP matches), but can be found in both Stylonychia and Oxytricha either by less restrictive BLASTP matches, TBLASTN matches or by using HMMER3 domain searches. After accounting for these issues, in Stylonychia only the MAD2 spindle assembly checkpoint protein (KOG3285) is missing from the superset of 245 CEGs from Oxytricha, Paramecium, and Tetrahymena. This protein is also missing from Oxytricha (Swart et al. 2013). Thus, the macronuclear genomes of both Stylonychia and Oxytricha encode 99.6% of the ciliate-specific CEGs.
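
The 99.6% figure follows from the counts given above: one protein (MAD2) is missing from the superset of 245 ciliate CEGs. A short check, with an illustrative function name:

```python
def ceg_completeness(total_cegs: int, missing: int) -> float:
    """Percentage of core eukaryotic genes with a detected homolog."""
    return 100.0 * (total_cegs - missing) / total_cegs


if __name__ == "__main__":
    # 245 CEGs in the ciliate superset, 1 missing (MAD2, KOG3285).
    print(f"{ceg_completeness(245, 1):.1f}%")
```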

To assess the completeness of the Stylonychia tRNA complement, tRNAscan-SE (version 1.3.1, run in default mode) (Lowe and Eddy 1997) was initially used to predict the tRNAs for the standard 20 amino acids. To determine which tRNAs were unique, tRNA sequences were extracted and aligned with MAFFT (Katoh et al. 2002; Katoh and Standley 2013) (default parameters), followed by inspection of the resulting alignments by eye. Selenocysteine tRNAs in Stylonychia and Oxytricha were predicted using Infernal 1.1rc4 (Nawrocki et al. 2009; Nawrocki and Eddy 2013) using the Rfam 11.0 (Griffiths-Jones et al. 2003; Burge et al. 2013) model for this tRNA. Stylonychia’s MAC genome encodes a comprehensive set of tRNAs for all the 20 standard amino acids and for selenocysteine.

Ortholog Prediction

Protein sequences from both Oxytricha and Stylonychia were first clustered independently using cd-hit (v4.5.4) (Li et al. 2001; Li and Godzik 2006) with a protein clustering identity threshold of greater than 95% to merge alleles (21,490 clusters in Oxytricha and 20,968 clusters in Stylonychia). We then performed BLASTP searches of the representative, clustered protein sequences from cd-hit and selected the reciprocal best hits (7,374) using a custom Python script.
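
Reciprocal best hit selection can be sketched as follows. This is not the custom script used in this study: it assumes that hits are available as (query, subject, bit score) tuples and that the best hit is the one with the highest bit score, with ties resolved in favor of the first hit encountered.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```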

Estimation of Nanochromosome Copy Number

To analyze relative nanochromosome copy numbers, sequencing reads from the four libraries were mapped to the Terminator 2.0 assembly using SHRiMP (version 2.2.3, run in default mode; Rumble et al. 2009; David et al. 2011). For most single-gene nanochromosomes, library fragment size does not seem to have a major effect on estimation of relative copy number (supplementary fig. S2, Supplementary Material online), and so we based our copy number estimates for the final SPAdes assembly on library 24. For each contig, mapped reads were counted and normalized by contig length (mapped reads per base).
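
The length normalization described above amounts to dividing each contig's mapped read count by its length. The sketch below also expresses each value relative to the mean across contigs, which is an added convenience for comparing contigs and is an assumption rather than a step stated in the text; the function names are illustrative.

```python
def reads_per_base(read_counts, contig_lengths):
    """Mapped reads per base for each contig (contigs without reads get 0)."""
    return {
        contig: read_counts.get(contig, 0) / length
        for contig, length in contig_lengths.items()
    }


def relative_copy_number(read_counts, contig_lengths):
    """Reads per base divided by the mean reads per base over all contigs."""
    rpb = reads_per_base(read_counts, contig_lengths)
    mean = sum(rpb.values()) / len(rpb)
    return {contig: value / mean for contig, value in rpb.items()}
```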

Determination of Alternative Fragmentation of Nanochromosomes

To determine alternative fragmentation, we first selected all read pairs possessing a read starting with the most common telomeric repeat “CCCCAAAACCCCAAAACCCC.” The telomeric repeat was then stripped before mapping the read pairs with bwa (parameters: -n 0). For the individual nanochromosomes in figure 2 we inspected the locations of the telomere-stripped reads and determined the most frequent location of the mapped stripped ends, and the number of reads in close proximity (∼100-bp window) for this location by eye.
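
The read selection and telomere stripping step can be sketched as below. The text does not specify how far the repeat was trimmed, so this sketch assumes that the 20-base prefix and any bases that continue the C4A4 pattern in phase are removed; the function names are illustrative.

```python
TELOMERE_PREFIX = "CCCCAAAACCCCAAAACCCC"
REPEAT_UNIT = "CCCCAAAA"


def strip_telomere(read: str):
    """Return the read without its leading telomeric repeat,
    or None if the read does not start with the telomeric prefix."""
    if not read.startswith(TELOMERE_PREFIX):
        return None
    i = len(TELOMERE_PREFIX)
    # Keep trimming while bases continue the repeat in the same phase.
    while i < len(read) and read[i] == REPEAT_UNIT[i % len(REPEAT_UNIT)]:
        i += 1
    return read[i:]


def telomeric_pairs(pairs):
    """Yield (stripped_read, mate) for pairs in which either read
    starts with the telomeric prefix."""
    for read1, read2 in pairs:
        stripped = strip_telomere(read1)
        if stripped is not None:
            yield stripped, read2
            continue
        stripped = strip_telomere(read2)
        if stripped is not None:
            yield stripped, read1
```

One caveat of this trimming rule is that a subtelomeric base that happens to continue the repeat pattern is removed as well, which shifts the inferred end position by a base or two.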

Fig. 2.—

Synteny of Stylonychia and Oxytricha multigene nanochromosomes. Fold coverage of reads is indicated for total reads mapped with bwa to nanochromosomes from Stylonychia (library 24) and for the Oxytricha MAC genome Illumina library (Swart et al. 2013). Oxytricha has prominent doublet peaks corresponding to telomeric end coverage biases of the PE reads (this phenomenon is due to the larger fragment sizes of the Oxytricha Illumina library). Alternative fragmentation sites are indicated by upward pointing arrows with the number of reads corresponding to the approximate fragmentation site below. Coordinates (italicized numbers) of the contigs (in bp) are given relative to the current assembly.

Histone Phylogenies

Saccharomyces cerevisiae histone variants were obtained from UniProt (the histone variant accession numbers are: H3—UniProt:P61830; CenH3—UniProt:P36012; H2A.1—UniProt:P04911; H2A.2—UniProt:P04912; H2A.Z—UniProt:Q12692; H2B.1—UniProt:P02293; and H2B.2—UniProt:P02294). Tetrahymena thermophila histone variants were obtained from the Tetrahymena Genome Database (TGD) (Stover et al. 2006) (accession numbers: H3.1—TGD:TTHERM_00189180; H3.3—TGD:TTHERM_00016170; H3.4—TGD:TTHERM_00016200; CenH3—TGD:TTHERM_00146340; H2A.1—TGD:TTHERM_00316500; H2A.V—TGD:TTHERM_00143660; H2A.X—TGD:TTHERM_00790790; H2A.Y—TGD:TTHERM_01079200; H2B.1—TGD:TTHERM_00633360; and H2B.2—TGD:TTHERM_00283180). Alignments for each type of histone were generated using Geneious’s (Kearse et al. 2012) MAFFT (Katoh et al. 2002; Katoh and Standley 2013) (v7.017 default parameters) plugin, after which conserved blocks of amino acids (minus N- and C-terminal extensions for some of the histone variants) were manually selected. PhyML (Guindon et al. 2009) (LG substitution model; invariable site proportion = 0; four substitution categories; estimated gamma distribution parameter; optimization of topology and branch length; topology search by nearest neighbor interchanges) was used to generate 100 bootstrap replicates for each phylogeny.

Stylonychia Macronuclear Genome Database

A genome-centric model organism database containing a GBrowse 2 genome browser (Stein et al. 2002), BLAST (Altschul et al. 1997) server, and database of gene function annotations has been established to aid research on S. lemnae. StyloDB (stylo.ciliate.org) was modeled after other ciliate.org websites, including those for Tetrahymena (Stover et al. 2006) and Oxytricha (Swart et al. 2013), and utilizes the same underlying architecture and programming as these projects. StyloDB features a public curation interface that allows members of the research community to edit annotations for each gene, including gene names, Gene Ontology annotations, and published references. Genome, predicted gene, and protein sequence files can also be accessed from the StyloDB website.

Results and Discussion

Choice of an Optimal DNA Fragment Size for Assembling Highly Fragmented Macronuclear Genomes

Currently no genome assemblers are specifically designed to cater to the unique properties of the highly fragmented stichotrich macronuclear genomes, that is, high levels of heterozygosity, nanochromosome copy number variation, alternative nanochromosome fragmentation, and variability of the telomere addition site. Nevertheless, we have produced respectable assemblies after extensive exploration of different assemblers and assembly parameters. As with the Oxytricha MAC genome assembly, we have not sought to resolve nanochromosome haplotypes for Stylonychia during genome assembly, which is a complicated problem in genomes with relatively high levels of heterozygosity (Small et al. 2007; Swart et al. 2013) (though, as we show in the next section, this is mitigated by inbreeding of Stylonychia).

As multiple Illumina libraries with different DNA fragment sizes were generated for this genome project (supplementary fig. S3, Supplementary Material online; libraries 24–27), we were able to evaluate how fragment size affects the quality of the assembly (table 1). We sought to maximize both the read coverage of the assembly and the number of assembled nanochromosomes, while minimizing the number of contigs. Based on these criteria, we found that the best assemblies were produced from the library with the tightest fragment size distribution and small fragments (library 24; mean outer distance of 163 bp; table 1). By virtue of the size selection procedure employed after DNA fragmentation, Illumina PE libraries with longer fragment lengths tend to have a region with low or no sequence coverage at the ends of nanochromosomes (as noted in Oxytricha [Swart et al. 2013], and can also be seen in fig. 2). Together with variation in the precise telomere addition site (Swart et al. 2013), this low coverage region may be responsible for the failure to link nanochromosome ends to many contigs. This low coverage region becomes more problematic as the library fragment size increases, for example, the fraction of telomere-bearing reads (from library 24) matching the assembly of our two larger fragment libraries, 25 and 27, is 59.5% and 62.5%, in contrast to 92.2% for the small fragment library 24 (table 1).

Combining different libraries typically did not improve the assemblies compared with the assembly of just library 24 (genome assembly quality has previously been shown to saturate and even worsen as sequencing depth increases [Magoc et al. 2013]). Using all the libraries produced a bloated assembly with only approximately 9% of the contigs possessing two telomeres (supplementary table S5, Supplementary Material online). Even though the best SPAdes assembly combination (libraries 24 and 25) produced a comparable number of complete nanochromosomes to our library 24 assembly (supplementary table S6, Supplementary Material online), it produced approximately 11,000 extra contigs. Consequently, we based our final assembly exclusively on library 24.

Based on our analyses of Stylonychia MAC genome assemblies, we propose using PE Illumina libraries created from short DNA fragments with the SPAdes genome assembler as a cost-effective strategy to produce well-assembled, high complexity fragmented genomes, including the macronuclear genomes of spirotrichs such as Euplotes (Vinogradov et al. 2012) and phyllopharyngean ciliates, such as Chilodonella. We also suggest the use of small fragment libraries for genome assemblies in general if the goal is to obtain the ends of chromosomes, as larger fragment sizes prevent assembly of these ends. As SPAdes was not designed for diploid genome assembly, it will still be necessary to use additional sequencing strategies for haplotype resolution in future.

Selection of a Reference Genome Assembly

After testing multiple genome assemblers (supplementary table S1, Supplementary Material online) we found two strategies generated our “best” Stylonychia macronuclear genome assemblies: 1) A combination of the IDBA-UD assembler (Peng et al. 2012) and a custom extension/assembly approach (Terminator 2.0), and 2) the SPAdes genome assembler (Bankevich et al. 2012) with additional postprocessing to remove tiny, redundant contigs, followed by merging with CAP3 (Huang and Madan 1999). In our first strategy, we used an iterative procedure (Terminator 2.0) to merge and extend incomplete nanochromosomes after assembling Illumina reads with IDBA-UD (see Materials and Methods). The completeness of the two assemblies, as assessed by the total number of nanochromosomes and percentage of mapping reads, is quite similar (table 2). As it was the simpler and less computationally intensive of our two best assembly strategies, we chose the postprocessed SPAdes assembly for our reference draft assembly.

While we were testing different genome assemblers and assembly parameters, we examined the assemblies of the most highly amplified nanochromosome and the longest nanochromosome, because they present challenging cases for the assemblers and were often not completely assembled. With both IDBA-UD/Terminator 2.0 and SPAdes we completely assembled the highest copy number nanochromosome in Stylonychia, encoding the large rRNA subunit (7,455 bp, including telomeres). Both IDBA-UD/Terminator 2.0 and SPAdes also completely assembled the longest Stylonychia nanochromosome (65,401 bp). This nanochromosome is a single-gene nanochromosome which is orthologous (best reciprocal match) to the longest Oxytricha nanochromosome (66,022 bp; encoding the Jotin protein [Swart et al. 2013]). It should be noted that even with just single-end reads we obtained a relatively complete assembly using SPAdes (14,619 two-telomere contigs, and the complete 65.4 kb Jotin contig; table 2).

The total number of Stylonychia nanochromosomes (16,059) in our reference assembly is similar to that of Oxytricha, but the total number of contigs and assembly size is slightly smaller (∼20,000 vs. ∼22,500 contigs, and ∼50 vs. ∼67 Mb; table 2). The smaller Stylonychia MAC genome size is roughly consistent with sequence complexity estimates (47 Mb for Stylonychia vs. 55 Mb for Oxytricha) (Prescott 1994). Two factors are likely the main reasons for the difference in size of these assemblies: 1) The Oxytricha MAC genome contains some redundancy due to the use of two strains and a complex assembly strategy (Swart et al. 2013), and 2) there are lower levels of heterozygosity in the Stylonychia MAC genome than the Oxytricha MAC genome (supplementary fig. S4, Supplementary Material online). Assembled Stylonychia nanochromosomes are somewhat shorter on average than Oxytricha (mean length 2,760 bp compared with mean length 2,982 bp), which may reflect the use of Sanger reads in the Oxytricha assembly, and also the successive greedy CAP3 meta-assemblies, which will tend to merge nanochromosome isoforms arising from alternative fragmentation. Overall, we observe a significant improvement in the proportion of complete nanochromosomes (81% of contigs with two telomeres) in the draft Stylonychia macronuclear assembly compared with the draft Oxytricha assembly (71% of contigs with two telomeres).

Synteny and Alternative Fragmentation

Although the taxonomic classification of stichotrichous ciliates including S. lemnae and O. trifallax has been in a state of flux (Schmidt et al. 2007; Zoller et al. 2012), at the sequence level these species are quite similar. For example, for a small set of S. lemnae and O. trifallax protein-coding genes, the 4-fold synonymous substitutions were approximately 0.4 substitutions/site (Jung et al. 2011). We decided to examine orthologous predicted genes of Stylonychia and Oxytricha to assess how much conservation of synteny exists between nanochromosomes in the two species. We began with nanochromosomes encoding the most genes in Oxytricha and Stylonychia (eight genes encoded by two different nanochromosomes in both cases; fig. 2), as this gives us the longest regions to observe potential stretches of synteny. In Oxytricha, one 8-gene nanochromosome (eight; OxyDB:Contig14329.0; GenBank: AMCR01001519) is also the most extremely alternatively fragmented (producing at least 14 distinct nanochromosome isoforms) (Swart et al. 2013). The entire length of this Oxytricha nanochromosome aligns to two Stylonychia nanochromosomes (StyloDB:Contig14379 and StyloDB:Contig1032; fig. 2A; BLAST best-reciprocal hits to the Oxytricha nanochromosome). No read pairs link these contigs, even in our larger insert library (library 25). These two Stylonychia contigs encode five genes and two genes, respectively (as judged by BLASTX; in the latter, AUGUSTUS failed to predict a gene in the region corresponding to Oxytricha’s gene OxyDB:Contig14329.0.g33). Colinearity between the entire Stylonychia and Oxytricha nanochromosome genes is also evident between the other Oxytricha eight-gene nanochromosome, OxyDB:Contig13261.0, and Stylonychia StyloDB:Contig909 (these two contigs align end-to-end and are 61% identical using Geneious’s [Kearse et al. 2012] Needleman–Wunsch alignment plugin with default parameters). The Stylonychia nanochromosome with eight predicted genes (StyloDB:Contig18561; fig. 2B) is syntenic with two Oxytricha nanochromosomes (OxyDB:Contig747.1 and OxyDB:Contig737.1, which also lack any reads supporting their linkage). Therefore, at smaller genomic scales (<20 kb), there appears to be a considerable amount of synteny between multigene nanochromosomes of Stylonychia and Oxytricha.

To explore synteny more generally, we searched for synteny among two-gene contigs using BLASTP of predicted proteins. Stylonychia two-gene contigs were counted as potentially syntenic with Oxytricha two-gene contigs if both the Stylonychia proteins separately matched (e value < 1e-10) two proteins on an Oxytricha contig. Using these criteria, 43% of two-gene contigs (from 2,364 two-gene contigs) in Stylonychia appear to be syntenic with those in Oxytricha. Given the conservation of synteny between the nanochromosomes of Stylonychia and Oxytricha, we wanted to know how well alternative fragmentation sites are conserved between these two species. The alternative fragmentation sites for the two multigene nanochromosomes in figure 2 are usually conserved, but in both cases a site that is an alternative fragmentation site in one of the species appears to be a normal chromosome breakage site in the other species, or too weakly fragmented to be detected. More generally, we found that approximately 66% of Stylonychia’s syntenic two-gene contigs showed alternative fragmentation (supported by at least one internally mapping telomeric read) in both species if alternative fragmentation was found in either species.
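The two-gene synteny criterion described above can be illustrated with a short Python sketch. This is one possible reading of the criterion under stated assumptions, not the code used in this study: BLASTP hits are assumed to be available as (Stylonychia protein, Oxytricha protein, e value) tuples, the protein-to-contig dictionaries and all identifiers are hypothetical, and the two matched Oxytricha proteins are required to be distinct. Restricting the Oxytricha side to two-gene contigs is left to the caller.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def syntenic_two_gene_contigs(hits, stylo_contig_of, oxy_contig_of):
    """Return the Stylonychia two-gene contigs whose two proteins each match
    a different protein on the same Oxytricha contig.

    hits: iterable of (stylo_protein, oxy_protein, evalue) tuples.
    stylo_contig_of / oxy_contig_of: dicts mapping protein id -> contig id.
    """
    # Collect significant Oxytricha matches for each Stylonychia protein.
    matches = defaultdict(set)
    for stylo_protein, oxy_protein, evalue in hits:
        if evalue < E_VALUE_CUTOFF:
            matches[stylo_protein].add(oxy_protein)

    # Group Stylonychia proteins by contig.
    proteins_on = defaultdict(list)
    for protein, contig in stylo_contig_of.items():
        proteins_on[contig].append(protein)

    syntenic = set()
    for contig, proteins in proteins_on.items():
        if len(proteins) != 2:
            continue  # only two-gene contigs are considered
        first, second = proteins
        for q1 in matches.get(first, ()):
            c1 = oxy_contig_of.get(q1)
            if c1 is None:
                continue
            for q2 in matches.get(second, ()):
                if q1 != q2 and oxy_contig_of.get(q2) == c1:
                    syntenic.add(contig)
    return syntenic


# Toy data with hypothetical identifiers.
toy_stylo = {"s1": "SC1", "s2": "SC1", "s3": "SC2", "s4": "SC2"}
toy_oxy = {"o1": "OC1", "o2": "OC1", "o3": "OC2", "o4": "OC3"}
toy_hits = [
    ("s1", "o1", 1e-30), ("s2", "o2", 1e-20),
    ("s3", "o3", 1e-50), ("s4", "o4", 1e-40), ("s4", "o3", 1e-5),
]
```

In the toy data only contig SC1 qualifies: the proteins of SC2 hit different Oxytricha contigs once the weak (1e-5) hit is discarded.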

Conservation of Relative Nanochromosome Copy Number between Stylonychia and Oxytricha

Previously, a survey of 11 orthologous nanochromosomes in S. lemnae and O. trifallax showed that the relative copy number of these nanochromosomes is similar (Xu et al. 2012). While we were examining alternative nanochromosome fragmentation, we noticed that the patterns of sequence coverage of the different nanochromosome isoforms from Stylonychia and Oxytricha are similar (fig. 2). To examine the general relationship between the copy number of Stylonychia and Oxytricha nanochromosomes in a straightforward manner, we compared the base coverage of putative orthologous single-gene nanochromosomes (see Materials and Methods). A strong correlation between the nanochromosome copy numbers (Pearson’s r = 0.77) of these two species can be seen in fig. 3.

Fig. 3.—

Conservation of nanochromosome copy number and length. Nanochromosome copy number was determined for nanochromosomes encoding orthologous proteins in the Terminator 2.0 assembly (see Materials and Methods). As the Oxytricha Illumina library is slightly smaller than that of Stylonychia library 24, we multiplied the Oxytricha reads/bp value by 1.242 (total mapped Stylonychia reads/total Oxytricha mapped reads) to normalize the library sizes.
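The library-size normalization described in this legend and the Pearson correlation reported in the text can be sketched in Python as follows. This is an illustrative sketch only: the function names and toy coverage values are hypothetical, and whether coverage values were transformed (e.g., log-scaled) before computing the correlation is not specified here, so the sketch works on whatever values it is given.

```python
import math


def normalize_coverage(reads_per_bp, reference_total_reads, library_total_reads):
    """Scale reads/bp values from one library to the depth of a reference library.

    With the totals from the legend, the scale factor would be
    total mapped Stylonychia reads / total mapped Oxytricha reads (1.242).
    """
    scale = reference_total_reads / library_total_reads
    return [value * scale for value in reads_per_bp]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy coverage values for orthologous nanochromosomes (hypothetical numbers).
stylo_cov = [10.0, 25.0, 40.0, 80.0]
oxy_cov = [8.0, 22.0, 30.0, 70.0]
oxy_norm = normalize_coverage(oxy_cov, 1242, 1000)
```

Because Pearson's r is unchanged by multiplying one variable by a positive constant, the normalization affects the plotted coverage values but not the correlation coefficient itself.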

Studies in both Oxytricha and Stylonychia suggest that copy number may be epigenetically inherited across generations (Heyse et al. 2010; Nowacki et al. 2010). In one study, injection of sRNAs reduced nanochromosome copy number, and was proposed to be a consequence of degradation of putative copy number determining RNA templates by RNA interference (Nowacki et al. 2010). In the other study injection of ss- or dsRNA templates led to an increase in nanochromosome copy number, and copy number of individual nanochromosomes was also stably inherited for 100 asexual generations (Heyse et al. 2010), as predicted by stochastic models of nanochromosome segregation (Duerr et al. 2004).

Consistent with the stochastic segregation model, the copy number of specific nanochromosomes occasionally becomes highly overamplified (up to ∼100× the normal copy number) (Steinbrück 1983; Harper et al. 1991). As an argument for genetic control of copy number determination, in crosses of Stylonychia strains with such overamplified nanochromosomes their progeny initially showed no overamplification of these nanochromosomes, but after approximately 12 months some of the newly created strains showed overamplification of some of the nanochromosomes, whereas other strains showed no overamplification (Steinbrück 1983). If nanochromosome copy number were solely epigenetically controlled, there would be no way for such large copy number fluctuations to be brought back to normal levels during sexual reproduction. Moreover, epimutation rates appear to be orders of magnitude higher than typical eukaryotic genetic mutation rates (Jablonka and Raz 2009; Schmitz et al. 2011), and so, over time, we expect that epigenetic copy number control would lead to wide variation in nanochromosome copy number. Given these problems with epigenetic copy number control, we concur with Steinbrück (1983) that the establishment of nanochromosome copy number during new macronuclear development has an important genetic component.

Stylonychia’s MAC Genome Encodes DDE_3 Transposase Genes

We briefly examined the protein domain complements of Stylonychia and Oxytricha to see whether we could identify any interesting species-specific proteins, but in general, as these two ciliates are relatively closely related, their protein domain complements are also quite similar (See supplementary data file S1, Supplementary Material online: “Known protein domains are conserved between Stylonychia and Oxytricha”). We therefore only consider a few classes of proteins of special interest in the remainder of this article.

In ciliates, domesticated transposases involved in DNA elimination and genome reorganization may either be expressed from the micronuclear genome, as for the Oxytricha micronuclear genome-limited TBE (telomere-bearing element [Herrick et al. 1985]) transposases (Nowacki et al. 2009), or be expressed from the macronuclear genome like the PiggyBac-related transposases of Paramecium (Baudry et al. 2009) and Tetrahymena (Cheng et al. 2010). In addition to the TBE transposases, Oxytricha also has two families of transposases: MULE (Pfam:PF10551) transposases and ISXO2-like (Pfam:PF12762) transposases (Swart et al. 2013). Both of these nanochromosome-encoded transposase families were also found among the present predicted Stylonychia proteins (see supplementary data file S1, Supplementary Material online). Interestingly, we found three proteins matching the Pfam DDE_3 transposase domain, encoded on telomere-bearing contigs (StyloDB: Contig3970.g4257, Contig6146.g6579 and Contig13700.g14613). The DDE_3 domain is characteristic of Oxytricha’s micronuclear-encoded TBE transposases (two predicted proteins in the current Oxytricha MAC genome assembly [OxyDB:Contig5254.0.g70 and Contig2394.0.g82] appear to be TBE transposases encoded on telomereless contigs [potentially MIC genome contaminants] [Swart et al. 2013]). When we used the Stylonychia nanochromosome-encoded DDE_3 proteins to query GenBank’s nr database with BLASTP we found no matches to Oxytricha TBE transposases, suggesting that the transposases in Oxytricha may only be distantly related to the nanochromosome-encoded DDE_3 Stylonychia proteins. It will be of interest to assess whether Stylonychia’s nanochromosome-encoded DDE_3 transposases are developmentally expressed like most known Oxytricha transposase genes, and whether they are involved in genome rearrangements like the TBE transposases.

Two New Telomere End-Binding Beta Proteins

In the Oxytricha/Stylonychia MAC, the major telomere-binding protein complex comprises a dimer of telomere-binding protein alpha (TeBPα) and telomere-binding protein beta (TeBPβ) (Lipps et al. 1982; Gottschling and Zakian 1986). While examining the predicted Oxytricha proteome we found five additional TeBPα proteins and two additional TeBPβ proteins (and hence we now refer to the original telomere-binding proteins as TeBPα1 and TeBPβ1). Since we performed this search, a new domain, corresponding to the human telomere protein, TPP1, has been added to the Pfam database (version 27). We found matches to the TPP1 domain (PF10341) in both Stylonychia and Oxytricha (two pairs of orthologous [best-reciprocal hits] proteins each). The TPP1 domain of one of these proteins (StyloDB:Contig8366.g8920, OxyDB:Contig1486.1) overlaps its Pfam TeBPβ domain (Pfam:PF07404). This is consistent with the structural homology found between human TPP1 and O. nova TeBPβ1 (Xin et al. 2007), and suggests that these Pfam domains could be unified.

Counting proteins with either the TPP1 domain or TeBPβ domain, there are five distinct TeBPβ proteins in Stylonychia. The best reciprocal hit to one of these Stylonychia proteins (StyloDB:Contig2512.g2701) in Oxytricha (OxyDB:Contig19388.0.g78) does not have a detectable TPP1 domain or TeBPβ domain (at a threshold e value < 1.0). These proteins are both relatively long (836 and 1,092 aa), and, excluding an approximately 172-aa N-terminal extension in Oxytricha, their pairwise alignment is approximately 34% identical (alignments by MAFFT [Katoh et al. 2002; Katoh and Standley 2013] version 7.017; default parameters).
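Percent identities such as those quoted above depend on which alignment columns are counted. The Python sketch below shows one plausible convention, in which terminal columns where either sequence has a gap (such as an unmatched N-terminal extension) are trimmed before identical columns are counted. It is a hedged illustration operating on two already-aligned strings (for example, rows taken from a MAFFT alignment); the function name, the trimming rule, and the toy sequences are assumptions rather than the procedure used in this study.

```python
def percent_identity(aligned_a, aligned_b, trim_terminal_gaps=True):
    """Percent of alignment columns with identical residues.

    aligned_a, aligned_b: equal-length aligned sequences using '-' for gaps.
    trim_terminal_gaps: if True, drop leading and trailing columns in which
    either sequence has a gap before computing identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    if trim_terminal_gaps:
        start = 0
        while start < len(columns) and "-" in columns[start]:
            start += 1
        end = len(columns)
        while end > start and "-" in columns[end - 1]:
            end -= 1
        columns = columns[start:end]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns)


# Toy alignment (hypothetical sequences): the second protein has a 3-residue
# N-terminal extension that is excluded when terminal gaps are trimmed.
toy_a = "---MKTAY"
toy_b = "MLSMKSAY"
```

For the toy pair, trimming gives 4 identical columns out of 5, whereas counting every column gives 4 out of 8, which shows how much the chosen convention can change the reported value.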

Our inspection of the domain architectures of 102 TPP1-domain containing proteins from the Pfam website (Punta et al. 2012) indicates that this domain is usually located in the N-terminal portion of the protein and often has a long C-terminal region (>200 aa) with no predicted domains, as is the case for all the Stylonychia/Oxytricha TeBPβ proteins. Similarly, aside from TeBPα1, which has multiple POT1 domains (Pfam:PF02765; known as “Telo_bind” in Pfam version 26), other Stylonychia/Oxytricha TeBPα proteins only have an N-terminal POT1 domain. This suggests that, relative to the remainder of these proteins, their N-terminal regions are subject to stronger purifying selection.

The restricted distribution of the TPP1 domain among eukaryotes is striking: Of the 151 protein sequences with detectable TPP1 Pfam domains in UniProt (release 2013_12), only three, including two Oxytricha matches, were not found in opisthokonts (including animals and fungi). The only other nonopisthokont match we found was to an Acanthamoeba castellanii protein (UniProt:L8H2E8_ACACA). The TeBPβ Pfam domain only has matches to proteins from Oxytricha and Stylonychia species. Unless other eukaryotes possess very divergent, and consequently as yet undetected homologs of TPP1, this suggests the absence of TPP1 in ancestral eukaryotes, and the possibility that this protein may have been acquired horizontally by the common ancestor of some ciliates. Although there is a proposed “functional homolog” of TPP1 in Tetrahymena (corresponding to the N-terminal of the protein prediction TGD:TTHERM_00523050) (Linger et al. 2011), we cannot detect either the TPP1 domain or TeBPβ domain in this protein. No TPP1 homolog has been proposed or detected in Paramecium. We previously found a single protein in Paramecium with a POT1 domain (Swart et al. 2013) and Tetrahymena is known to have two POT1 proteins (Jacob et al. 2007), so these ciliates do appear to have homologs of TeBPα.

Stylonychia Has Two Linker Histone Proteins

As chromatin biology is an active area of research in Stylonychia and other ciliates (e.g., Bulic et al. 2013; Gao et al. 2013; Shieh and Chalker 2013; Vogt and Mochizuki 2013; Forcob et al. 2014) we were interested in characterizing the complete diversity of core histones, which is described in the next results section. First we searched for linker histones, as, despite extensive characterization of these histone proteins in T. thermophila (Gorovsky et al. 1974; Gorovsky and Keevert 1975; Allis et al. 1984; Wu et al. 1986, 1994; Hayashi et al. 1987; Shen and Gorovsky 1996; Dou et al. 1999; Dou and Gorovsky 2000, 2002), and identification and sequencing of a histone H1 gene in Euplotes eurystomus macronuclei (Herrmann et al. 1987; Hauser et al. 1993), no linker histone sequences have been reported in other ciliates. Tetrahymena has a single gene encoding its macronuclear histone H1 (Wu et al. 1986), and another gene (MLH) encoding a set of four linker histone proteins as a polyprotein (Allis et al. 1984; Wu et al. 1994). Two of the protein forms generated from the MLH gene have an HMG box domain (Wu et al. 1994) (matching the Pfam domain HMG_box). No sequence similarity was observed between the histone H1 gene of Euplotes eurystomus and the globular histone H1 domain from other eukaryotes (Hauser et al. 1993).

From the two-dimensional SDS-PAGE analyses of O. nova and Stylonychia, it was inferred that histone H1 was missing at the expected location (compared with Tetrahymena and chicken) for acid-extracted proteins from macronuclei (Butler et al. 1984). However, low mobility, 20–30 kDa proteins were noted in the Oxytricha protein extracts, and it was suggested that if histone H1 proteins are present in Oxytricha or Stylonychia, they might have major biochemical differences from those in animals (Butler et al. 1984). Two putative histone H1 protein bands were identified in an earlier study of Oxytricha histone extracts and were lysine rich compared with the other histones (Caplan 1977). Putative H1 histones were absent from micronuclear extracts in this study, but were identified in the micronuclear extracts of Stylonychia (Schlegel et al. 1990). We therefore wanted to know whether any candidate histone H1 genes could be found in Stylonychia.

Among our HMMER3 domain annotations we noticed a single convincing match to the Pfam histone H1 domain (linker_histone - Pfam:PF00538) in Stylonychia (StyloDB:Contig14654.g15612, e value = 7e-8), but no such match in Oxytricha. By BLASTP searches, we found protein homologs of this histone (H1.1) in Oxytricha: Two shorter, identical proteins (OxyDB:Contig14754.0.0.g58 and OxyDB:Contig10099.0.1.g76; 220 aa; e value = 7e-28) and an additional longer protein (OxyDB:Contig20723.0.g17; 501 aa; e value = 2e-10). The best reciprocal hit to the longer Oxytricha protein in Stylonychia is a 356 aa protein (StyloDB:Contig2637.g2828; histone H1.2). The orthologs of histone H1.1 are approximately 51.8% identical, and the orthologs of histone H1.2 are approximately 29.% identical, excluding unmatched C-terminal regions. Multiple sequence alignment with MAFFT (Katoh et al. 2002; Katoh and Standley 2013) revealed that the region corresponding to the first approximately 49 amino acids of the histone domain match in Stylonychia H1.1 aligns without gaps and is conserved among all the Stylonychia and Oxytricha histone H1 variants (41% of the sites are identical and the mean pairwise identity between the pairs is 66%). As is the norm for eukaryotic histone H1, and consistent with the lysine richness of putative histone H1 proteins from Oxytricha (Caplan 1977), these new histone H1 variants are lysine rich (∼20% lysine: Approximately double the lysine content of other Stylonychia/Oxytricha histones).
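The lysine content comparison made above amounts to a simple residue-frequency calculation, sketched here in Python. The function name and the toy peptide are hypothetical and serve only to illustrate the arithmetic; no sequence from this study is reproduced.

```python
def residue_fraction(protein, residue="K"):
    """Fraction of residues in a protein sequence equal to `residue`.

    A trailing stop symbol ('*'), if present, is ignored.
    """
    sequence = protein.upper().replace("*", "")
    if not sequence:
        return 0.0
    return sequence.count(residue.upper()) / len(sequence)


# Toy peptide (hypothetical): 4 lysines in 10 residues.
toy_peptide = "MKKAKGGKAA"
toy_lysine_fraction = residue_fraction(toy_peptide)
```

Applied to full-length predicted proteins, the same calculation would allow the lysine fraction of candidate H1 variants to be compared with that of the core histones.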

Stylonychia’s Cornucopia of Core Histone Variants

Following the discovery of a large number of histone H3 variants in Stylonychia (Bernhard 1999) their localization patterns and functions have begun to be teased apart (Postberg et al. 2008; Forcob et al. 2014). An unusually large histone H3 protein (“protein X”; molecular weight 21,000) previously observed in Stylonychia micronuclear histone extracts (Schlegel et al. 1990) has recently been identified as a divergent histone H3 variant, H3.8, and appears to be replaced in the developing new MAC during macronuclear development, including by the H3.7 variant (Forcob et al. 2014). No histone H3 variant of a typical, smaller eukaryotic histone H3 size was found in micronuclear histone extracts (Schlegel et al. 1990). Differences in migration of histone H2A and H2B proteins were noted between the micronuclear and macronuclear Stylonychia histone extracts and were suggested to be due to modifications of these proteins (Schlegel et al. 1990) as no variants of these histones were known.

As relatively little is known about histone H2A and H2B variants in Stylonychia we decided to examine the diversity of these variants among our gene predictions, and at the same time to check whether any other histone H3 or H4 variants were previously missed. We searched for proteins in Stylonychia possessing the Pfam core histone domain using HMMER3 (Eddy 2013) (Pfam:PF00125; e value < 1e-6). In total we found 21 Stylonychia histones with this domain, corresponding to 19 distinct histone variants: One histone H4 protein, nine histone H3 proteins, six histone H2A proteins, and four histone H2B proteins. The diversity of Stylonychia histone variants is almost double that of T. thermophila (12; proteins from ciliate.org, November 26, 2013, possessing the Pfam core histone domain). Likely as a consequence of multiple whole-genome duplications (Aury et al. 2006), the record holder for histone variants among ciliates is Paramecium tetraurelia, with 30 distinct histone variants (histone H4: 5, histone H3: 10; histone H2B: 6, and histone H2A: 9). For the purposes of comparison, we found 48 distinct human histone variants in UniProt (December 1, 2013).

There appear to be two paralogous genes encoding histone H4, as previously reported (Wefes and Lipps 1990), with identical amino acid sequences. The coding sequences of these histones are 92.7% identical in Stylonychia and 95.6% identical in Oxytricha, but both have surrounding noncoding regions that are much more divergent than typically seen between alleles. In contrast to the Stylonychia/Oxytricha histone H4 paralogs, which are invariant at the amino acid level within and between these species, and which do not appear to be particularly divergent compared with variants within other major eukaryotic groups (e.g., fungi), a number of histone H4 variants appear to have diverged substantially and evolved independently in other ciliate classes (Katz et al. 2004). We found one new histone H3, H3.9, in addition to those previously reported (Postberg et al. 2010; Forcob et al. 2014). Other than the possible independent duplications of H3.1/H3.2 variants, the same histone variants in Stylonychia are encoded by the Oxytricha MAC genome (see supplementary table S4, Supplementary Material online).

In Stylonychia, we noticed five very divergent histone variants: One histone H2A, one histone H2B and three H3 histones, including the previously reported histones H3.7 and H3.8 (Forcob et al. 2014), and the new histone H3.9 (fig. 4). The divergences between the three orthologous pairs of divergent histone H3 variants (H3.7, H3.8, and H3.9) in Stylonychia and Oxytricha are much greater than between the other H3 variants (fig. 4). Pairwise MAFFT (Katoh et al. 2002; Katoh and Standley 2013) alignments of the orthologous pairs of proteins were approximately 58%, 42%, 49% and approximately 35% identical (excluding the first few amino acids which create a large gap, and starting from the first block of amino acids to the end of the protein), respectively, for divergent histone variants H2A.6, H2B.4, H3.7, and H3.9. The divergence levels among these histone variants are comparable to the divergence (48% identity) between the most highly specialized human histone variant, H2A.B (Barr body-deficient; absent from inactive X chromosomes in females), and canonical human H2A histones (Gonzalez-Romero et al. 2008), and between human centromeric histone and histone H3, for example, 47.3% identity between CENPA (UniProt:P49450) and H3F3A (UniProt:P84243). Within the H2A and H3 histone families in Tetrahymena there are also highly divergent histone variants, for example, HTA.V and H2A.Y are 52.7% and 47.0% identical compared with H2A.1 (fig. 4A and C). The extreme divergences of the Stylonychia/Oxytricha histone variants suggest that their functional roles may be somewhat unconventional. It has previously been suggested that the unique genome architecture may have led to elevated divergence between paralogs in ciliates with highly fragmented macronuclear genomes compared with those that are not highly fragmented (Zufall et al. 2006). 
However, based on the development-specific gene expression of these paralogs, together with previous demonstration of development-specific localization of the H3.7 variant (see next paragraphs), we suggest that at least for the most divergent Stylonychia/Oxytricha histones, functional specialization in macronuclear genome development, rather than genome architecture per se, may be the main evolutionary driving force.

Fig. 4.—

Histone variants in Stylonychia. Scale bars in expected substitutions per site are provided below each phylogeny and bootstrap percentages for branch points are shown when greater than 80%. Phylogenies were rooted using Saccharomyces cerevisiae histone variants (H2A.1/H2A.2, H2B.1/H2B.2, H3, for 4A–C, respectively) as outgroups. Note that in (C) it is possible that long-branch attraction is causing the most divergent histone H3 variants to cluster. See Materials and Methods for a list of accessions for the Saccharomyces cerevisiae and Tetrahymena thermophila histone variants.

We were able to find mRNA sequences corresponding to each of the divergent histone variants in a cDNA library created by subtraction of vegetative cDNA from cDNA from Stylonychia cells 10 h postconjugation (Paschka et al. 2005). From quantitative polymerase chain reaction data, Stylonychia’s histone H3.7 was also shown to be the most highly expressed H3 variant during the development of the new MAC, and was upregulated approximately 7 or 8 orders of magnitude compared with its vegetative expression (Forcob et al. 2014). In Oxytricha, the orthologs of three of these divergent histones (H2B.4, H2A.6, and H3.7) appear to be highly upregulated and coexpressed, peaking early during development before tapering off over time, and are all negligibly expressed during vegetative growth (fig. 5). The divergent histone H3.8 is highly expressed at both 10 and 20 h, and is moderately expressed during vegetative growth in Oxytricha. H3.9 is negligibly expressed during vegetative growth, and highly upregulated at the 10-h time point, but is expressed at lower levels than H3.7 or H3.8. The Oxytricha linker histone H1.2 exhibits a strikingly similar gene expression pattern to the divergent, highly expressed, development-specific core histone variants, suggesting coregulation of all these genes.

Fig. 5.—

Expression of histone variants in Oxytricha trifallax. Gene expression values are the normalized RNA-seq counts obtained from Swart et al. (2013) and are given in arbitrary units. “Vegetative” represents a normally fed cell culture. The developmental time course on the x axis starts at 0 h when cells from the complementary mating types of O. trifallax were mixed together (see Swart et al. 2013 for additional details about this experiment).

Recently, histone H3.7 was identified as a key histone in Stylonychia’s macronuclear genome development, specifically localizing to the developing MAC during polytene DNA formation, and disappearing after the completion of DNA elimination (Forcob et al. 2014). Silencing the histone H3.7 gene prevented further development and was usually lethal (Forcob et al. 2014). As this histone variant was shown to be enriched in macronuclear-destined DNA, it was proposed that it might be required for permissive chromatin formation (Forcob et al. 2014). The massive upregulation and coexpression of the Oxytricha H1.2, H2A.6, H2B.4, and H3.7 histone variants during sexual development raises the intriguing possibility of their coassembly in both Stylonychia and Oxytricha. Consequently, it will be of great interest to determine whether these histones colocalize in the developing new MAC, and particularly whether they have evolved to form a novel type of nucleosome specific for genome rearrangements.

Supplementary Material

Supplementary data files S1 and S2 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

This research was supported by grants to H.-J.L. from the Deutsche Forschungsgemeinschaft (Li231/29-1, AOBJ 571454), and by the SNSF grant 31003A_129957 and the ERC grant “EPIGENOME” to M.N. StyloDB was supported by Bradley University. S.H.A. and E.C.S. thank the members of the Nowacki lab for various discussions about this project.

Literature Cited

  1. Allis CD, Allen RL, Wiggins JC, Chicoine LG, Richman R. Proteolytic processing of H1-like histones in chromatin: a physiologically and developmentally regulated event in Tetrahymena micronuclei. J Cell Biol. 1984;99:1669–1677. doi: 10.1083/jcb.99.5.1669.
  2. Altschul SF, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
  3. Ammermann D. Synthese und Abbau der Nucleinsäuren während der Entwicklung des Makronukleus von Stylonychia mytilus (Protozoa, Ciliata). Chromosoma. 1968;25:107–120. doi: 10.1007/BF00327172.
  4. Ammermann D. Morphology and development of the macronuclei of the ciliates Stylonychia mytilus and Euplotes aediculatus. Chromosoma. 1971;33:209–238. doi: 10.1007/BF00285634.
  5. Ammermann D, Steinbruck G, von Berger L, Hennig W. The development of the macronucleus in the ciliated protozoan Stylonychia mytilus. Chromosoma. 1974;45:401–429. doi: 10.1007/BF00283386.
  6. Aury JM, et al. Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature. 2006;444:171–178. doi: 10.1038/nature05230.
  7. Bankevich A, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
  8. Baudry C, et al. PiggyMac, a domesticated piggyBac transposase involved in programmed genome rearrangements in the ciliate Paramecium tetraurelia. Genes Dev. 2009;23:2478–2483. doi: 10.1101/gad.547309.
  9. Bell G. Sex and death in Protozoa: the history of an obsession. Cambridge: Cambridge University Press; 1988.
  10. Bernhard D. Several highly divergent histone H3 genes are present in the hypotrichous ciliate Stylonychia lemnae. FEMS Microbiol Lett. 1999;175:45–50. doi: 10.1111/j.1574-6968.1999.tb13600.x.
  11. Bulic A, et al. A permissive chromatin structure is adopted prior to site-specific DNA demethylation of developmentally expressed genes involved in macronuclear differentiation. Epigenetics Chromatin. 2013;6:5. doi: 10.1186/1756-8935-6-5.
  12. Burge SW, et al. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013;41:D226–D232. doi: 10.1093/nar/gks1005.
  13. Butler AP, Laughlin TJ, Cadilla CL, Henry JM, Olins DE. Physical structure of gene-sized chromatin from the protozoan Oxytricha. Nucleic Acids Res. 1984;12:3201–3217. doi: 10.1093/nar/12.7.3201.
  14. Bütschli O. Studien über die ersten Entwicklungsvorgänge der Eizelle, die Zelltheilung und die Conjugation der Infusorien. Frankfurt am Main: Christian Winter; 1876.
  15. Camacho C, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421.
  16. Caplan EB. Histones and other basic nuclear proteins in genetically active and genetically inactive nuclei of the ciliate, Oxytricha sp. Biochim Biophys Acta. 1977;479:214–219. doi: 10.1016/0005-2787(77)90142-3.
  17. Cheng CY, Vogt A, Mochizuki K, Yao MC. A domesticated piggyBac transposase plays key roles in heterochromatin dynamics and DNA cleavage during programmed DNA deletion in Tetrahymena thermophila. Mol Biol Cell. 2010;21:1753–1762. doi: 10.1091/mbc.E09-12-1079.
  18. Chevreux B, Wetter T, Suhai S. 1999. Genome sequence assembly using trace signals and additional sequence information. In: Proceedings of the German Conference on Bioinformatics, GCB’99; Oct 4-6, Hanover, Germany. Hannover (Germany): GCB. p. 45–56.
  19. Chikhi R, Rizk G. Space-efficient and exact de Bruijn graph representation based on a Bloom filter. Algorithms Mol Biol. 2013;8:22. doi: 10.1186/1748-7188-8-22.
  20. Conesa A, et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610. [DOI] [PubMed] [Google Scholar]
  21. David M, Dzamba M, Lister D, Ilie L, Brudno M. SHRiMP2: sensitive yet practical SHort Read Mapping. Bioinformatics. 2011;27:1011–1012. doi: 10.1093/bioinformatics/btr046. [DOI] [PubMed] [Google Scholar]
  22. Dou Y, Gorovsky MA. Phosphorylation of linker histone H1 regulates gene expression in vivo by creating a charge patch. Mol Cell. 2000;6:225–231. doi: 10.1016/s1097-2765(00)00024-1. [DOI] [PubMed] [Google Scholar]
  23. Dou Y, Gorovsky MA. Regulation of transcription by H1 phosphorylation in Tetrahymena is position independent and requires clustered sites. Proc Natl Acad Sci U S A. 2002;99:6142–6146. doi: 10.1073/pnas.092029599. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Dou Y, Mizzen CA, Abrams M, Allis CD, Gorovsky MA. Phosphorylation of linker histone H1 regulates gene expression in vivo by mimicking H1 removal. Mol Cell. 1999;4:641–647. doi: 10.1016/s1097-2765(00)80215-4. [DOI] [PubMed] [Google Scholar]
  25. Duerr HP, Eichner M, Ammermann D. Modeling senescence in hypotrichous ciliates. Protist. 2004;155:45–52. doi: 10.1078/1434461000163. [DOI] [PubMed] [Google Scholar]
  26. Eddy SR. HMMER3 [Internet]. 2013. Available from: http://hmmer.org/
  27. Forcob S, Bulic A, Jonsson F, Lipps HJ, Postberg J. Differential expression of histone H3 genes and selective association of the variant H3.7 with a specific sequence class in Stylonychia macronuclear development. Epigenetics Chromatin. 2014;7:4. doi: 10.1186/1756-8935-7-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Frith MC, Hamada M, Horton P. Parameters for accurate genome alignment. BMC Bioinformatics. 2010;11:80. doi: 10.1186/1471-2105-11-80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Fuhrmann G, Swart E, Nowacki M, Lipps HJ. RNA-dependent genome processing during nuclear differentiation: the model systems of stichotrichous ciliates. Epigenomics. 2013;5:229–236. doi: 10.2217/epi.13.15. [DOI] [PubMed] [Google Scholar]
  30. Gall JG. Macronuclear duplication in the ciliated protozoan Euplotes. J Biophys Biochem Cytol. 1959;5:295–308. doi: 10.1083/jcb.5.2.295. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Gao S, et al. Impaired replication elongation in Tetrahymena mutants deficient in histone H3 Lys 27 monomethylation. Genes Dev. 2013;27:1662–1679. doi: 10.1101/gad.218966.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Gonzalez-Romero R, Mendez J, Ausio J, Eirin-Lopez JM. Quickly evolving histones, nucleosome stability and chromatin folding: all about histone H2A.Bbd. Gene. 2008;413:1–7. doi: 10.1016/j.gene.2008.02.003. [DOI] [PubMed] [Google Scholar]
  33. Gorovsky MA, Keevert JB. Absence of histone F1 in a mitotically dividing, genetically inactive nucleus. Proc Natl Acad Sci U S A. 1975;72:2672–2676. doi: 10.1073/pnas.72.7.2672. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Gorovsky MA, Keevert JB, Pleger GL. Histone F1 of Tetrahymena macronuclei: unique electrophoretic properties and phosphorylation of F1 in an amitotic nucleus. J Cell Biol. 1974;61:134–145. doi: 10.1083/jcb.61.1.134. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Gottschling DE, Zakian VA. Telomere proteins: specific recognition and protection of the natural termini of Oxytricha macronuclear DNA. Cell. 1986;47:195–205. doi: 10.1016/0092-8674(86)90442-3. [DOI] [PubMed] [Google Scholar]
  36. Gotz S, et al. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 2008;36:3420–3435. doi: 10.1093/nar/gkn176. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003;31:439–441. doi: 10.1093/nar/gkg006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Guindon S, Delsuc F, Dufayard JF, Gascuel O. Estimating maximum likelihood phylogenies with PhyML. Methods Mol Biol. 2009;537:113–137. doi: 10.1007/978-1-59745-251-9_6. [DOI] [PubMed] [Google Scholar]
  39. Harper DS, Song K, Jahn CL. Overamplification of macronuclear linear DNA molecules during prolonged vegetative growth of Oxytricha nova. Gene. 1991;99:55–61. doi: 10.1016/0378-1119(91)90033-8. [DOI] [PubMed] [Google Scholar]
  40. Hauser LJ, Treat ML, Olins DE. Cloning and analysis of the macronuclear gene for histone H1 from Euplotes eurystomus. Nucleic Acids Res. 1993;21:3586. doi: 10.1093/nar/21.15.3586. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Hayashi T, Hayashi H, Iwai K. Tetrahymena histone H1. Isolation and amino acid sequence lacking the central hydrophobic domain conserved in other H1 histones. J Biochem. 1987;102:369–376. doi: 10.1093/oxfordjournals.jbchem.a122063. [DOI] [PubMed] [Google Scholar]
  42. Herrick G, et al. Mobile elements bounded by C4A4 telomeric repeats in Oxytricha fallax. Cell. 1985;43:759–768. doi: 10.1016/0092-8674(85)90249-1. [DOI] [PubMed] [Google Scholar]
  43. Herrick G, Cartinhour SW, Williams KR, Kotter KP. Multiple sequence versions of the Oxytricha fallax 81-MAC alternate processing family. J Protozool. 1987;34:429–434. doi: 10.1111/j.1550-7408.1987.tb03207.x. [DOI] [PubMed] [Google Scholar]
  44. Herrmann AL, Cadilla CL, Cacheiro LH, Carne AF, Olins DE. An H1-like protein from the macronucleus of Euplotes eurystomus. Eur J Cell Biol. 1987;43:155–162. [PubMed] [Google Scholar]
  45. Heyse G, Jonsson F, Chang WJ, Lipps HJ. RNA-dependent control of gene amplification. Proc Natl Acad Sci U S A. 2010;107:22134–22139. doi: 10.1073/pnas.1009284107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Huang X, Madan A. CAP3: a DNA sequence assembly program. Genome Res. 1999;9:868–877. doi: 10.1101/gr.9.9.868. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Jablonka E, Raz G. Transgenerational epigenetic inheritance: prevalence, mechanisms, and implications for the study of heredity and evolution. Q Rev Biol. 2009;84:131–176. doi: 10.1086/598822. [DOI] [PubMed] [Google Scholar]
  48. Jacob NK, Lescasse R, Linger BR, Price CM. Tetrahymena POT1a regulates telomere length and prevents activation of a cell cycle checkpoint. Mol Cell Biol. 2007;27:1592–1601. doi: 10.1128/MCB.01975-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Jung S, et al. Exploiting Oxytricha trifallax nanochromosomes to screen for non-coding RNA genes. Nucleic Acids Res. 2011;39:7529–7547. doi: 10.1093/nar/gkr501. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–3066. doi: 10.1093/nar/gkf436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Katz LA, Bornstein JG, Lasek-Nesselquist E, Muse SV. Dramatic diversity of ciliate histone H4 genes revealed by comparisons of patterns of substitutions and paralog divergences among eukaryotes. Mol Biol Evol. 2004;21:555–562. doi: 10.1093/molbev/msh048. [DOI] [PubMed] [Google Scholar]
  53. Kearse M, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–1649. doi: 10.1093/bioinformatics/bts199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res. 2002;12:656–664. doi: 10.1101/gr.229202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Kielbasa SM, Wan R, Sato K, Horton P, Frith MC. Adaptive seedstame genomic sequence comparison. Genome Res. 2011;21:487–493. doi: 10.1101/gr.113985.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Klobutcher LA, Swanton MT, Donini P, Prescott DM. All gene-sized DNA molecules in four species of hypotrichs have the same terminal sequence and an unusual 3′ terminus. Proc Natl Acad Sci U S A. 1981;78:3015–3019. doi: 10.1073/pnas.78.5.3015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Li R, et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010;20:265–272. doi: 10.1101/gr.097261.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–1659. doi: 10.1093/bioinformatics/btl158. [DOI] [PubMed] [Google Scholar]
  60. Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics. 2001;17:282–283. doi: 10.1093/bioinformatics/17.3.282. [DOI] [PubMed] [Google Scholar]
  61. Linger BR, Morin GB, Price CM. The Pot1a-associated proteins Tpt1 and Pat1 coordinate telomere protection and length regulation in Tetrahymena. Mol Biol Cell. 2011;22:4161–4170. doi: 10.1091/mbc.E11-06-0551. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Lipps HJ, Erhardt P. DNase I hypersensitivity of the terminal inverted repeat DNA sequences in the macronucleus of the ciliate Stylonychia mytilus. FEBS Lett. 1981;126:219–222. doi: 10.1016/0014-5793(81)80246-3. [DOI] [PubMed] [Google Scholar]
  63. Lipps HJ, Gruissem W, Prescott DM. Higher order DNA structure in macronuclear chromatin of the hypotrichous ciliate Oxytricha nova. Proc Natl Acad Sci U S A. 1982;79:2495–2499. doi: 10.1073/pnas.79.8.2495. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Lipps HJ, Rhodes D. G-quadruplex structures: in vivo evidence and function. Trends Cell Biol. 2009;19:414–422. doi: 10.1016/j.tcb.2009.05.002. [DOI] [PubMed] [Google Scholar]
  65. Lipps HJ, Steinbruck G. Free genes for rRNAs in the macronuclear genome of the ciliate Stylonychia mytilus. Chromosoma. 1978;69:21–26. doi: 10.1007/BF00327378. [DOI] [PubMed] [Google Scholar]
  66. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–964. doi: 10.1093/nar/25.5.955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Lynn DH. Dordrecht (The Netherlands): Springer; 2008. The ciliated protozoa: characterization, classification, and guide to the literature. [Google Scholar]
  68. Magoc T, et al. GAGE-B: an evaluation of genome assemblers for bacterial organisms. Bioinformatics. 2013;29:1718–1725. doi: 10.1093/bioinformatics/btt273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Mistry J, Finn RD, Eddy SR, Bateman A, Punta M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 2013;41:e121. doi: 10.1093/nar/gkt263. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res. 2012;40:e155. doi: 10.1093/nar/gks678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29:2933–2935. doi: 10.1093/bioinformatics/btt509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: inference of RNA alignments. Bioinformatics. 2009;25:1335–1337. doi: 10.1093/bioinformatics/btp157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Nikolenko SI, Korobeynikov AI, Alekseyev MA. BayesHammer: Bayesian clustering for error correction in single-cell sequencing. BMC Genomics. 2013;14(Suppl. 1):S7. doi: 10.1186/1471-2164-14-S1-S7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Nowacki M, Haye JE, Fang W, Vijayan V, Landweber LF. RNA-mediated epigenetic regulation of DNA copy number. Proc Natl Acad Sci U S A. 2010;107:22140–22144. doi: 10.1073/pnas.1012236107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Nowacki M, et al. A functional role for transposases in a large eukaryotic genome. Science. 2009;324:935–938. doi: 10.1126/science.1170023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Oka Y, Shiota S, Nakai S, Nishida Y, Okubo S. Inverted terminal repeat sequence in the macronuclear DNA of Stylonychia pustulata. Gene. 1980;10:301–306. doi: 10.1016/0378-1119(80)90150-x. [DOI] [PubMed] [Google Scholar]
  77. Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007;23:1061–1067. doi: 10.1093/bioinformatics/btm071. [DOI] [PubMed] [Google Scholar]
  78. Paschka AG, et al. A microarray analysis of developmentally regulated genes during macronuclear differentiation in the stichotrichous ciliate Stylonychia lemnae. Gene. 2005;359:81–90. doi: 10.1016/j.gene.2005.06.024. [DOI] [PubMed] [Google Scholar]
  79. Peng Y, Leung HC, Yiu SM, Chin FY. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–1428. doi: 10.1093/bioinformatics/bts174. [DOI] [PubMed] [Google Scholar]
  80. Pluta AF, Kaine BP, Spear BB. The terminal organization of macronuclear DNA in Oxytricha fallax. Nucleic Acids Res. 1982;10:8145–8154. doi: 10.1093/nar/10.24.8145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Postberg J, Alexandrova O, Lipps HJ. Synthesis of pre-rRNA and mRNA is directed to a chromatin-poor compartment in the macronucleus of the spirotrichous ciliate Stylonychia lemnae. Chromosome Res. 2006;14:161–175. doi: 10.1007/s10577-006-1033-x. [DOI] [PubMed] [Google Scholar]
  82. Postberg J, Forcob S, Chang WJ, Lipps HJ. The evolutionary history of histone H3 suggests a deep eukaryotic root of chromatin modifying mechanisms. BMC Evol Biol. 2010;10:259. doi: 10.1186/1471-2148-10-259. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Postberg J, Heyse K, Cremer M, Cremer T, Lipps HJ. Spatial and temporal plasticity of chromatin during programmed DNA-reorganization in Stylonychia macronuclear development. Epigenetics Chromatin. 2008;1:3. doi: 10.1186/1756-8935-1-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Prescott DM. The DNA of ciliated protozoa. Microbiol Rev. 1994;58:233–267. doi: 10.1128/mr.58.2.233-267.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Prescott DM. Genome gymnastics: unique modes of DNA evolution and processing in ciliates. Nat Rev Genet. 2000;1:191–198. doi: 10.1038/35042057. [DOI] [PubMed] [Google Scholar]
  86. Punta M, et al. The Pfam protein families database. Nucleic Acids Res. 2012;40:D290–D301. doi: 10.1093/nar/gkr1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Quevillon E, et al. InterProScan: protein domains identifier. Nucleic Acids Res. 2005;33:W116–W120. doi: 10.1093/nar/gki442. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Rumble SM, et al. SHRiMP: accurate mapping of short color-space reads. PLoS Comput Biol. 2009;5:e1000386. doi: 10.1371/journal.pcbi.1000386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Schlegel M, Muller S, Ruder F, Büsen W. Transcriptionally inactive micronuclei, macronuclear anlagen and transcriptionally active macronuclei differ in histone composition in the hypotrichous ciliate Stylonychia lemnae. Chromosoma. 1990;99:401–406. [Google Scholar]
  90. Schmidt SL, Bernhard D, Schlegel M, Foissner W. Phylogeny of the Stichotrichia (Ciliophora; Spirotrichea) reconstructed with nuclear small subunit rRNA gene sequences: discrepancies and accordances with morphological data. J Eukaryot Microbiol. 2007;54:201–209. doi: 10.1111/j.1550-7408.2007.00250.x. [DOI] [PubMed] [Google Scholar]
  91. Schmitz RJ, et al. Transgenerational epigenetic instability is a source of novel methylation variants. Science. 2011;334:369–373. doi: 10.1126/science.1212959. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Shen X, Gorovsky MA. Linker histone H1 regulates specific gene expression but not global transcription in vivo. Cell. 1996;86:475–483. doi: 10.1016/s0092-8674(00)80120-8. [DOI] [PubMed] [Google Scholar]
  93. Shieh AW, Chalker DL. LIA5 is required for nuclear reorganization and programmed DNA rearrangements occurring during tetrahymena macronuclear differentiation. PLoS One. 2013;8:e75337. doi: 10.1371/journal.pone.0075337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Simpson JT, et al. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–1123. doi: 10.1101/gr.089532.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Small KS, Brudno M, Hill MM, Sidow A. Extreme genomic variation in a natural population. Proc Natl Acad Sci U S A. 2007;104:5698–5703. doi: 10.1073/pnas.0700890104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Spear BB. Isolation and mapping of the rRNA genes in the macronucleus of Oxytricha fallax. Chromosoma. 1980;77:193–202. doi: 10.1007/BF00329544. [DOI] [PubMed] [Google Scholar]
  97. Stanke M, et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34:W435–W439. doi: 10.1093/nar/gkl200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Stein LD, et al. The generic genome browser: a building block for a model organism system database. Genome Res. 2002;12:1599–1610. doi: 10.1101/gr.403602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Steinbrück G. Overamplification of genes in macronuclei of hypotrichous ciliates. Chromosoma. 1983;88:156–163. [Google Scholar]
  100. Stover NA, et al. Tetrahymena Genome Database (TGD): a new genomic resource for Tetrahymena thermophila research. Nucleic Acids Res. 2006;34:D500–D503. doi: 10.1093/nar/gkj054. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Swanton MT, Greslin AF, Prescott DM. Arrangement of coding and non-coding sequences in the DNA molecules coding for rRNAs in Oxytricha sp. DNA of ciliated protozoa. VII. Chromosoma. 1980;77:203–215. doi: 10.1007/BF00329545. [DOI] [PubMed] [Google Scholar]
  102. Swanton MT, McCarroll RM, Spear BB. The organization of macronuclear rDNA molecules of four hypotrichous ciliated protozoans. Chromosoma. 1982;85:1–9. doi: 10.1007/BF00344590. [DOI] [PubMed] [Google Scholar]
  103. Swart EC, et al. The Oxytricha trifallax mitochondrial genome. Genome Biol Evol. 2012;4:136–154. doi: 10.1093/gbe/evr136. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Swart EC, et al. The Oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes. PLoS Biol. 2013;11:e1001473. doi: 10.1371/journal.pbio.1001473. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Taylor WD, Shuter BJ. Body size, genome size, and intrinsic rate of increase in ciliated protozoa. Am Nat. 1981;118:160–172. [Google Scholar]
  106. Vinogradov DV, et al. Draft macronucleus genome of Euplotes crassus ciliate. Mol Biol. 2012;46:328–333. [PubMed] [Google Scholar]
  107. Vogt A, Mochizuki K. A domesticated PiggyBac transposase interacts with heterochromatin and catalyzes reproducible DNA elimination in Tetrahymena. PLoS Genet. 2013;9:e1004032. doi: 10.1371/journal.pgen.1004032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Wefes I, Lipps HJ. The two macronuclear histone H4 genes of the hypotrichous ciliate Stylonychia lemnae. DNA Seq. 1990;1:25–32. doi: 10.3109/10425179009041344. [DOI] [PubMed] [Google Scholar]
  109. Wu M, Allis CD, Richman R, Cook RG, Gorovsky MA. An intervening sequence in an unusual histone H1 gene of Tetrahymena thermophila. Proc Natl Acad Sci U S A. 1986;83:8674–8678. doi: 10.1073/pnas.83.22.8674. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Wu M, et al. Four distinct and unusual linker proteins in a mitotically dividing nucleus are derived from a 71-kilodalton polyprotein, lack p34cdc2 sites, and contain protein kinase A sites. Mol Cell Biol. 1994;14:10–20. doi: 10.1128/mcb.14.1.10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Xin H, et al. TPP1 is a homologue of ciliate TEBP-beta and interacts with POT1 to recruit telomerase. Nature. 2007;445:559–562. doi: 10.1038/nature05469. [DOI] [PubMed] [Google Scholar]
  112. Xu K, et al. Copy number variations of 11 macronuclear chromosomes and their gene expression in Oxytricha trifallax. Gene. 2012;505:75–80. doi: 10.1016/j.gene.2012.05.045. [DOI] [PubMed] [Google Scholar]
  113. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–829. doi: 10.1101/gr.074492.107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Zoller SD, et al. Characterization and taxonomic validity of the ciliate Oxytricha trifallax (Class Spirotrichea) based on multiple gene sequences: limitations in identifying genera solely by morphology. Protist. 2012;163:643–657. doi: 10.1016/j.protis.2011.12.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Zufall RA, McGrath CL, Muse SV, Katz LA. Genome architecture drives protein evolution in ciliates. Mol Biol Evol. 2006;23:1681–1687. doi: 10.1093/molbev/msl032. [DOI] [PubMed] [Google Scholar]


Articles from Genome Biology and Evolution are provided here courtesy of Oxford University Press
