BMC Genomics. 2008 Jun 20;9:296. doi: 10.1186/1471-2164-9-296

ESTs from the microsporidian Edhazardia aedis

Erin E Gill 1, James J Becnel 2, Naomi M Fast 1
PMCID: PMC2474625  PMID: 18570666

Abstract

Background

Microsporidia are a group of parasites related to fungi that infect a wide variety of animals and have gained recognition from the medical community in the past 20 years due to their ability to infect immuno-compromised humans. Microsporidian genomes range in size from 2.3 to 19.5 Mbp, but almost all of our knowledge comes from species that have small genomes (primarily from the human parasite Encephalitozoon cuniculi and the locust parasite Antonospora locustae). We have conducted an EST survey of the mosquito parasite Edhazardia aedis, which has an estimated genome size several times that of more well-studied species. The only other microsporidian EST project is from A. locustae, and serves as a basis for comparison with E. aedis.

Results

The spore transcriptomes of A. locustae and E. aedis were compared and the numbers of unique transcripts that belong to each COG (Clusters of Orthologous Groups of proteins) category differ by at most 5%. The transcripts themselves have widely varying start sites and encode a number of proteins that have not been found in other microsporidia examined to date. However, E. aedis seems to lack the multi-gene transcripts present in A. locustae and E. cuniculi. We also present the first documented case of transcription of a transposable element in microsporidia.

Conclusion

Although E. aedis and A. locustae are distantly related, have very disparate life cycles and contain genomes estimated to be vastly different sizes, their patterns of transcription are similar. The architecture of the ancestral microsporidian genome is unknown, but the presence of genes in E. aedis that have not been found in other microsporidia suggests that extreme genome reduction and compaction is lineage specific and not typical of all microsporidia.

Background

Microsporidia are single-celled eukaryotic intracellular parasites that are related to fungi. Currently, over 1200 species have been identified, infecting animals from nearly every phylum, including commercially important species such as honeybees and fish, as well as humans [1]. Inside host cells, microsporidia proliferate as vegetative stages (meronts, schizonts) which eventually produce spores that are released when the host cell lyses. Spores possess a unique host cell invasion apparatus called the polar filament, which is forcefully everted upon germination to form a tube and can pierce a nearby host cell [1]. The tube then acts as a conduit allowing the contents of the spore to be injected into the host cell's cytoplasm, where the parasite undergoes vegetative replication.

Microsporidia are a diverse group of organisms, and vary greatly in the complexity of their life cycles. For instance, Encephalitozoon cuniculi and Antonospora locustae produce only one type of spore (uninucleate in the former and binucleate in the latter), and complete their entire life cycles inside one host individual, while Amblyospora californica requires two host groups (mosquitoes and microcrustacea) and produces three morphologically and functionally discrete spore types [1].

Microsporidia possess some of the smallest primary nuclear genomes known (as tiny as 2.3 Mbp). The only microsporidian whose genome has been completely sequenced is the human parasite, E. cuniculi. At a meager 2.9 Mbp, E. cuniculi's genome is extremely compact, with only 2000 genes [2]. A small genome sequence survey (GSS) project has been conducted on A. locustae, a locust parasite that has been approved as a biological control agent in the United States [3]. A. locustae's genome is roughly 5.4 Mbp in size [4], or about twice the size of E. cuniculi's genome. Despite the genome size difference, both genomes appear to be structured in much the same way. Genes are closely packed (nearly one gene per kilobase), are small in size compared to homologues in animals and fungi, and are intron-poor. There is also a much greater degree of synteny between these two organisms than would be expected given their phylogenetic relationship, which implies that although microsporidian genes are fast-evolving, genomic rearrangements occur only rarely [3] (See Fig. 1).

Figure 1.

The phylogenetic relationships between several microsporidia. Species that house transposable elements belonging to the Ty3/gypsy family are highlighted in blue, while species containing LTR transposons are highlighted in yellow. Genome sizes are indicated to the right of each species. (Adapted from Slamovits et al., 2004.)

However, we have very little information on microsporidian genomes of larger sizes. Edhazardia aedis is a microsporidian that infects Aedes aegypti, the mosquito vector of the Dengue hemorrhagic and yellow fever viruses. E. aedis has been intensively studied as a viable biological control agent for A. aegypti [5] and has a genome estimated to be many times larger than that of E. cuniculi. There are several possible explanations for this difference: E. aedis may have more genes that control its complex life cycle. Its genes may also be longer, more widely spaced, and contain more introns than those of E. cuniculi [2].

Morphological studies conducted on E. aedis have revealed at least four different types of spores – two uninucleate and two binucleate [6,7]. The two uninucleate spore types differ morphologically but possess similar pyriform shapes. However, the cell division events from which they arise differ. Spores produced via mitosis are roughly 8.5 μm in length, whereas spores produced via meiosis (meiospores) are about 7.5 μm. Small binucleate spores (~6.5 μm in length) that have short polar filaments are formed first, followed by the production of larger binucleate spores (~9 μm in length) that are ovoid in shape. Meiospore formation is usually abortive and rarely produces normal spores [6].

E. aedis' life cycle is moderately complex and involves two generations of the mosquito host. It begins when a uninucleate spore is ingested by a mosquito larva from the environment. Once in the gut, the spore germinates and begins to multiply in the host tissue. Within 48 hours, small binucleate spores are formed that are responsible for spread to other tissues. Orally infected larvae generally exhibit reduced growth, and may die before reaching maturity if the parasite load is high, thus releasing more spores into the environment. However, if the infection load is sufficiently small, the larva will mature into an adult mosquito and survive to reproduce [6]. If the adult mosquito is female, large binucleate spores will develop in her ovaries and will infect oocytes, thus passing the infection on to the next generation where the majority of mortality occurs in larvae. Little is known about the factors that modulate the transition from one phase in the life cycle to the next, or about the changes in gene expression that occur during these transitions.

It is also possible that the difference in genome size between E. aedis and E. cuniculi or A. locustae may have less to do with the number of genes, and more to do with genome architecture. E. aedis genes could be longer, more widely spaced, and contain more introns than those of E. cuniculi [2]. In an effort to learn more about E. aedis' genome, a GSS of >200 kbp was conducted [8]. This study concluded that E. aedis' genome structure is very different from those of E. cuniculi and A. locustae. A large portion of the genome is occupied by non-coding DNA and genes are not closely packed together, although the existence of local areas of compaction could not be ruled out.

Previous examinations of ESTs from microsporidia have only been conducted on microsporidia with small genomes. These transcripts possessed unusual features that are atypical in eukaryotes. Examinations of ESTs from A. locustae [9] and transcripts from E. cuniculi revealed numerous multi-gene transcripts. These transcripts are different from prokaryotic operons, as the proteins encoded by the transcript do not have related functions and are often not encoded on the same DNA strand. Many transcripts encode only a portion of one gene, while the other is present in its entirety [9,10]. The reason for this phenomenon is not known, but it has been suggested that transcriptional control elements have been lost (or moved into adjacent genes) during the process of genome compaction [10].

As E. aedis' genome and life cycle are very different from those of E. cuniculi and A. locustae, it is reasonable to assume that the transcript structure and number of genes present may differ as well. In this study, we describe the first survey of ESTs from a microsporidian with a much larger genome size and complex life cycle. In sequencing over 1300 transcripts, we have elucidated more of E. aedis' genome content, and have gained a profile of its transcript structure and composition. Surprisingly, the E. aedis uninucleate spore transcriptome is remarkably similar to that of A. locustae.

Results

Overview

Sequences were deposited into the Genbank EST database and have the accession numbers FG063843 to FG065106. From the 1307 clones sequenced, 133 unique genes were found; 55 were represented by a single transcript, while the remaining 78 were represented by two or more. 97 of the 133 unique genes are present in other microsporidia (See Table 1), while 10 are present in other (non-microsporidian) organisms (See Table 2), 18 are putatively E. aedis-specific and 8 have no apparent open reading frames. Coding sequences contained 43% G+C while 5' and 3' untranslated regions possessed 27% and 26%, respectively.
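As a quick arithmetic check on the breakdown above, the four gene classes can be tallied and converted to percentages of the 133 unique genes. The following is a minimal Python sketch; the dictionary labels and the function name `class_percentages` are illustrative and not part of the original analysis, while the counts are those stated in the paragraph above.

```python
# Counts of unique genes per class, as stated in the Overview paragraph.
# The class labels are paraphrased for illustration.
gene_classes = {
    "present in other microsporidia": 97,
    "present only in non-microsporidian organisms": 10,
    "putatively E. aedis-specific": 18,
    "no apparent open reading frame": 8,
}


def class_percentages(counts):
    """Return each class as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_unique = sum(gene_classes.values())
percentages = class_percentages(gene_classes)

if __name__ == "__main__":
    print("total unique genes:", total_unique)
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
```

The class that is absent from other microsporidia comes out at about 7.5% of unique genes, which is the figure described as "nearly 8%" in the Discussion.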

Table 1.

Gene Name | Species | GenBank Accession Number
16S rRNA GENE | Brachiola algerae | AM422905
1-ACYL-SN-GLYCEROL-3-PHOSPHATE ACYLTRANSFERASE | Encephalitozoon cuniculi | NP_586146
26S PROTEASOME REGULATORY SUBUNIT 4 | Encephalitozoon cuniculi | NP_586091
26S PROTEASOME REGULATORY SUBUNIT 6 | Encephalitozoon cuniculi | NP_586128
26S PROTEASOME REGULATORY SUBUNIT 8 | Encephalitozoon cuniculi | XP_955738
40S RIBOSOMAL PROTEIN S2 | Leishmania infantum | XP_001466537
40S RIBOSOMAL PROTEIN S3 | Encephalitozoon cuniculi | XP_955676
40S RIBOSOMAL PROTEIN S4 | Mycetophagus quadripustulatus | CAJ17168
40S RIBOSOMAL PROTEIN SA or P40 | Encephalitozoon cuniculi | NP_584728
60S RIBOSOMAL PROTEIN L3 | Encephalitozoon cuniculi | NP_597630
60S RIBOSOMAL PROTEIN L4 | Encephalitozoon cuniculi | NP_597213
60S RIBOSOMAL PROTEIN L5 | Encephalitozoon cuniculi | NP_585846
6-PHOSPHOFRUCTOKINASE | Encephalitozoon cuniculi | NP_597579
ABC TRANSPORTER (MITOCHONDRIAL TYPE) #1 | Encephalitozoon cuniculi | NP_586426
ABC TRANSPORTER (MITOCHONDRIAL TYPE) #2 | Encephalitozoon cuniculi | NP_586426
ACTIN | Blakeslea trispora | AAW32475
ARGININE/SERINE RICH PRE-mRNA SPLICING FACTOR | Encephalitozoon cuniculi | NP_597487
ASSOCIATED WITH RAN (NUCLEAR IMPORT/EXPORT) FUNCTION FAMILY MEMBER | Caenorhabditis elegans | NP_499369
ATP SYNTHASE | Encephalitozoon cuniculi | XP_955732
BELONGS TO THE ABC TRANSPORTER SUPERFAMILY | Encephalitozoon cuniculi | NP_597462
cAMP-DEPENDENT PROTEIN KINASE TYPE 1 REGULATORY CHAIN | Encephalitozoon cuniculi | NP_597223
CASEIN KINASE 1 HOMOLOG (INVOLVED IN DNA REPAIR) | Encephalitozoon cuniculi | NP_597600
CATION-TRANSPORTING ATPase | Encephalitozoon cuniculi | NP_586078
CHOLINE PHOSPHATE CYTIDYLYLTRANSFERASE | Encephalitozoon cuniculi | NP_586276
DNA REPLICATION LICENSING FACTOR MCM2 | Encephalitozoon cuniculi | NP_584768
DNA REPLICATION LICENSING FACTOR OF THE MCM FAMILY MCM6 | Encephalitozoon cuniculi | NP_597420
DNA REPLICATION LICENSING FACTOR OF THE MCM FAMILY MCM7 | Encephalitozoon cuniculi | NP_585977
DNAJ PROTEIN HOMOLOG 2 | Encephalitozoon cuniculi | NP_586004
DNAK-LIKE PROTEIN | Encephalitozoon cuniculi | NP_586489
EUKARYOTIC TRANSLATION INITIATION FACTOR 4A | Encephalitozoon cuniculi | XP_955671
FIBRILLARIN (34kDa NUCLEOLAR PROTEIN) | Encephalitozoon cuniculi | NP_586197
GENERAL TRANSCRIPTION FACTOR | Encephalitozoon cuniculi | NP_597292
GLUCOSAMINE FRUCTOSE-6-PHOSPHATE AMINOTRANSFERASE | Encephalitozoon cuniculi | NP_586057
GLYCERALDEHYDE-3-PHOSPHATE DEHYDROGENASE | Encephalitozoon cuniculi | NP_586008
GUANINE NUCLEOTIDE BINDING PROTEIN BETA SUBUNIT | Encephalitozoon cuniculi | NP_597241
HEAT SHOCK RELATED 70 kDa PROTEIN | Encephalitozoon cuniculi | NP_597563
HEAT-SHOCK PROTEIN HSP90 HOMOLOG | Encephalitozoon cuniculi | NP_584635
HISTIDYL tRNA SYNTHETASE | Antonospora locustae | AAT12372
HISTONE ACETYLTRANSFERASE TYPE B SUBUNIT 2 | Encephalitozoon cuniculi | NP_586003
HISTONE DEACETYLASE 1 | Encephalitozoon cuniculi | NP_597645
HISTONE DEACETYLASE | Encephalitozoon cuniculi | XP_955621
HISTONE H3 | Mus musculus | JQ1983
HSP 101 RELATED PROTEIN | Encephalitozoon cuniculi | NP_586448
HYPOTHETICAL PROTEIN ECU02_0840 | Encephalitozoon cuniculi | NP_584609
HYPOTHETICAL PROTEIN ECU02_0950 | Encephalitozoon cuniculi | NP_584620
HYPOTHETICAL PROTEIN ECU06_0450 | Encephalitozoon cuniculi | NP_585801
HYPOTHETICAL PROTEIN ECU06_1280 | Encephalitozoon cuniculi | NP_585884
HYPOTHETICAL PROTEIN ECU07_0530 | Encephalitozoon cuniculi | NP_585981
HYPOTHETICAL PROTEIN ECU08_1500 | Encephalitozoon cuniculi | NP_597278
HYPOTHETICAL PROTEIN ECU09_0740 | Encephalitozoon cuniculi | XP_955628
HYPOTHETICAL PROTEIN ECU09_1700 | Encephalitozoon cuniculi | XP_955723
HYPOTHETICAL PROTEIN ECU11_1720 | Encephalitozoon cuniculi | NP_586478
LIM DOMAIN-CONTAINING PROTEIN | Encephalitozoon cuniculi | NP_586340
LONG CHAIN FATTY ACID CoA LIGASE | Encephalitozoon cuniculi | NP_586206
METHIONINE AMINOPEPTIDASE TYPE 2 | Encephalitozoon cuniculi | NP_586190
METHIONINE PERMEASE | Encephalitozoon cuniculi | NP_585905
NIFS-LIKE PROTEIN (CYSTEINE DESULFURASE) INVOLVED IN IRON-SULFUR CLUSTER SYNTHESIS | Encephalitozoon cuniculi | NP_586483
NUCLEAR SER/THR PROTEIN PHOSPHATASE PP1-1 GAMMA CATALYTIC SUBUNIT | Encephalitozoon cuniculi | NP_597385
P68-LIKE PROTEIN (DEAD BOX FAMILY OF RNA HELICASES) | Encephalitozoon cuniculi | NP_597238
PEPTIDE CHAIN RELEASE FACTOR SUBUNIT 1 | Encephalitozoon cuniculi | NP_597376
PEPTIDE ELONGATION FACTOR 2 | Glugea plecoglossi | BAA11470
PHOSPHATIDYLINOSITOL TRANSFER PROTEIN, ALPHA | Danio rerio | NP_957229
PHOSPHOMANNOMUTASE | Encephalitozoon cuniculi | NP_597365
POLYADENYLATE-BINDING PROTEIN 2 | Encephalitozoon cuniculi | NP_586226
POLYPROTEIN | Sorghum bicolor | AAD27571
PRE-mRNA SPLICING FACTOR | Encephalitozoon cuniculi | NP_586183
PROTEIN KINASE B-LIKE PROTEIN | Plasmodium falciparum | AAT06260
PROTEIN TRANSPORT PROTEIN SEC23 HOMOLOG (COPII COAT) | Encephalitozoon cuniculi | NP_586385
PUTATIVE HYDROLASE-LIKE PROTEIN | Antonospora locustae | AAU11090
PUTATIVE ZINC FINGER PROTEIN | Encephalitozoon cuniculi | NP_597297
SER/THR PROTEIN PHOSPHATASE 2-A | Encephalitozoon cuniculi | NP_584753
SER/THR PROTEIN PHOSPHATASE PP2-A REGULATORY SUBUNIT B | Encephalitozoon cuniculi | NP_597423
SERINE/THREONINE PROTEIN KINASE (REQUIRED FOR ACTIN RING AND SEPTATION) | Encephalitozoon cuniculi | XP_965898
SIMILAR TO DNAJ-LIKE PROTEIN | Nasonia vitripennis | XP_001602403
SIMILARITY TO 14-3-3 PROTEIN 1 | Encephalitozoon cuniculi | NP_597610
SIMILARITY TO ADP/ATP CARRIER PROTEIN | Paranosema grylli | CAI30461
SIMILARITY TO CDC20 (WD-REPEAT PROTEIN) | Encephalitozoon cuniculi | NP_597660
SIMILARITY TO Hsp70-RELATED PROTEIN | Encephalitozoon cuniculi | NP_584537
SIMILARITY TO HYPOTHETICAL INTEGRAL MEMBRANE PROTEIN YQ55_CAEEL | Encephalitozoon cuniculi | NP_597662
SIMILARITY TO HYPOTHETICAL PROTEIN YAAT_BACSU | Encephalitozoon cuniculi | NP_597532
SIMILARITY TO HYPOTHETICAL PROTEIN YB36_METJA | Encephalitozoon cuniculi | NP_597239
SIMILARITY TO PUTATIVE AMINOACID TRANSPORTER YEU9_yeast | Encephalitozoon cuniculi | NP_584803
SIMILARITY TO SKT5 PROTEIN | Encephalitozoon cuniculi | NP_586349
SIMILARITY TO TRANSCRIPTION INITIATION FACTOR TFIIA | Encephalitozoon cuniculi | NP_597616
STE12 TRANSCRIPTION FACTOR | Encephalitozoon cuniculi | NP_586509
STRUCTURE-SPECIFIC RECOGNITION PROTEIN | Encephalitozoon cuniculi | NP_586030
T COMPLEX PROTEIN 1 SUBUNIT BETA | Encephalitozoon cuniculi | XP_955601
THREONYL tRNA SYNTHETASE #1 | Encephalitozoon cuniculi | NP_586084
THREONYL tRNA SYNTHETASE #2 | Encephalitozoon cuniculi | NP_586084
TRANSLATION ELONGATION FACTOR 1 ALPHA | Glugea plecoglossi | BAA12288
TRIOSE PHOSPHATE ISOMERASE | Encephalitozoon cuniculi | NP_586329
TUBULIN BETA CHAIN | Encephalitozoon cuniculi | NP_597591
U5 ASSOCIATED snRNP | Encephalitozoon cuniculi | NP_586393
UNNAMED PROTEIN PRODUCT (Hsp70) | Candida glabrata | XP_445544
VACUOLAR ATP SYNTHASE CATALYTIC SUBUNIT A | Encephalitozoon cuniculi | NP_586434
VACUOLAR ATP SYNTHASE SUBUNIT B | Encephalitozoon cuniculi | NP_586219
ZINC FINGER PROTEIN | Encephalitozoon cuniculi | NP_584833

The 97 unique E. aedis transcripts which are homologous to genes present in other microsporidia are listed above. Species names and Genbank accession numbers of top BLASTX hits are indicated. Bold text in the "Gene Name" column indicates instances where two different transcripts both had the same top BLASTX hit. Underlining indicates a copy of Hsp70 that is most similar to a protein that remains unnamed in Genbank.

Table 2.

Gene Name | Species Name | GenBank Accession Number
60S RIBOSOMAL PROTEIN L2 | Babesia bovis | XP_001612300
ADENOSINE KINASE | Homo sapiens | AAA97893
HYPOTHETICAL PROTEIN | Candida albicans | XP_717148
HYPOTHETICAL PROTEIN PY5484 | Plasmodium yoelii yoelii | XP_725949
L-ASPARAGINASE | Dirofilaria immitis | Q9U518
LYSINE tRNA LIGASE | Saccharomyces cerevisiae | CAA39699
PUTATIVE VESICULAR TRANSPORT FACTOR USO1P | Candida albicans | XP_710120
SEC63 DOMAIN CONTAINING PROTEIN | Trichomonas vaginalis | XP_001580151
PROTEIN PHOSPHATASE 2B | Cryptosporidium hominis | XP_666159
WD-40 REPEAT FAMILY PROTEIN | Arabidopsis thaliana | NP_201533

The 10 genes which are present in E. aedis but absent from other microsporidia are listed above. The species names and Genbank accession numbers of the top BLASTX hit for each gene are also listed.

Approximately a quarter of the transcripts analyzed coded for Hsp70. Almost all of the Hsp70 sequences were most similar to the "heat shock related 70 kDa protein" found in E. cuniculi (NP_597563). Single nucleotide variation exists between sequences, usually as 3rd position synonymous substitutions. Where non-synonymous substitutions exist, they are always a single nucleotide and there are no indels between sequences. Mitochondrial-type and DNAK-like Hsp70s were also represented.

Genes were assigned to COG categories to allow for comparison with A. locustae. Figure 2 illustrates the percentages of total E. aedis transcripts that are dedicated to each COG category. Total A. locustae transcripts are provided for comparative purposes. As the randomness of the library is uncertain, it is possible that some transcripts are artificially overrepresented. It is therefore more informative to examine unique transcripts (i.e., counting multiple transcripts for the same gene only once) rather than total transcripts. Figure 3 displays the percentages of unique E. aedis and A. locustae transcripts dedicated to each category. Surprisingly, the values are similar and sometimes identical (maximum difference between E. aedis and A. locustae categories is 5%).
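The comparison described above (percentage of unique genes per COG category in each species, then the largest difference between species) can be sketched in Python as follows. This is only an illustration of the arithmetic: the function names are hypothetical, and the example counts are placeholders invented for the sketch, not the values plotted in Figure 3.

```python
def category_percentages(unique_gene_counts):
    """Convert per-category counts of unique genes into percentages."""
    total = sum(unique_gene_counts.values())
    return {cat: 100 * n / total for cat, n in unique_gene_counts.items()}


def max_category_difference(counts_a, counts_b):
    """Return (category, difference) for the largest absolute difference,
    in percentage points, between two species' category profiles."""
    pct_a = category_percentages(counts_a)
    pct_b = category_percentages(counts_b)
    # A category missing from one species counts as 0% there.
    categories = sorted(set(pct_a) | set(pct_b))
    diffs = {c: abs(pct_a.get(c, 0.0) - pct_b.get(c, 0.0)) for c in categories}
    worst = max(categories, key=lambda c: diffs[c])
    return worst, diffs[worst]


# Placeholder counts for illustration only; NOT the data behind Figure 3.
species_a = {"protein destination": 20, "metabolism": 30, "transcription": 50}
species_b = {"protein destination": 15, "metabolism": 32, "transcription": 53}

if __name__ == "__main__":
    category, difference = max_category_difference(species_a, species_b)
    print(f"largest difference: {category} ({difference:.1f} percentage points)")
```

Counting each gene once per category, as here, is what makes the comparison robust to a few transcripts being overrepresented in the library.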

Figure 2.

Total E. aedis transcripts represented by COG (Clusters of Orthologous Groups of proteins) category with and without Hsp70. Total A. locustae transcripts are provided for comparison. (A. locustae data adapted from Williams et al., 2005.)

Figure 3.

Unique E. aedis transcripts represented by COG category. Unique A. locustae transcripts are provided for comparison. (A. locustae data adapted from Williams et al., 2005.)

Notable transcripts include a retrotransposon that is similar to LTR retrotransposons present in Sorghum bicolor (AAD27571) and Nosema bombycis (ABE26655). All belong to the Ty3/gypsy family of retrotransposons. E. aedis also possesses a methionine aminopeptidase 2 gene (MetAP-2), which is also present in E. cuniculi. There were several transcripts present that appear homologous to proteins found in various eukaryotes, but are absent in other microsporidia examined to date. These include hypothetical or unknown proteins found in Oryza, Danio and Plasmodium, as well as genes encoding proteins with identified functions, such as an adenosine kinase, a lysine-tRNA ligase and an L-asparaginase (See Table 2). In addition, E. aedis encodes a putative hydrolase-like protein that is present in A. locustae, but absent in E. cuniculi.

E. cuniculi and A. locustae both contain a small number of introns in their genomes and consequently, they have retained a minimal set of splicing machinery. These two organisms are not closely related (See Fig. 1), but they do share a few conserved introns [11]. Therefore, there is reason to suspect that some of these introns may also be present in E. aedis. Fortunately, seven transcripts of the gene encoding ribosomal protein L5 (which contains an intron in E. cuniculi) were recovered from the E. aedis library. These sequences were used to design primers to amplify the L5 gene from genomic DNA. It was found that the E. aedis L5 gene does not contain an intron.

Transcript structure

As E. aedis is an intracellular parasite and therefore cannot be easily cultured, RNA was limited and the library could not be constructed in a 5' cap-dependent manner. Therefore, nearly all of the inserts encoding the same gene were of different lengths, and most were 5' truncated. However, some of E. aedis' transcripts appear to have very long 5' untranslated regions (UTRs) of several hundred base pairs. To further assess transcript structure, cap-dependent 5' RACE (rapid amplification of cDNA ends) was conducted on transcripts from a moderately represented gene, glucosamine fructose-6-phosphate aminotransferase. 5' RACE confirmed that transcript lengths for this gene do vary, with 5' UTRs ranging from 255 to 348 bp (See Fig. 4).

Figure 4.

5' RACE conducted on a moderately represented transcript in E. aedis reveals multiple transcription start sites. ESTs are depicted in green and RACE products in purple. The predicted translational start codon is indicated by the orange arrow. As indicated, the E. cuniculi homologue of this gene contains 128 amino acids at the N-terminus that appear to be absent in E. aedis.

Contrary to the variable start sites of the transcripts, nearly all appear to have identical end sites. The notable exceptions are the heat shock related 70 kDa protein transcripts, which have somewhat variable 3' polyadenylation sites.

There were frequently single nucleotide differences between sequences in contigs, but these differences were usually restricted to silent third position substitutions. In instances where the substitutions are not silent, they are conservative amino acid substitutions. These differences could represent different copies of the same gene or different alleles within the population (UTRs were not available in most cases to determine which).
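The distinction drawn above between silent third-position substitutions and non-silent ones can be made concrete by translating the two versions of a codon and comparing the amino acids. The Python sketch below assumes the standard genetic code applies to these transcripts, which is an assumption of the example rather than something established here; the function name `classify_substitution` and the example codons are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Assumption: the standard code is appropriate for these sequences.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_a, codon_b):
    """Compare two aligned codons.

    Returns (positions, kind): positions is a list of the 1-based codon
    positions that differ, and kind is "identical", "synonymous" or
    "non-synonymous".
    """
    codon_a, codon_b = codon_a.upper(), codon_b.upper()
    if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
        raise ValueError("codons must be three bases drawn from A, C, G, T")
    positions = [i + 1 for i in range(3) if codon_a[i] != codon_b[i]]
    if not positions:
        return positions, "identical"
    same_residue = CODON_TABLE[codon_a] == CODON_TABLE[codon_b]
    return positions, "synonymous" if same_residue else "non-synonymous"


if __name__ == "__main__":
    # Hypothetical codon pairs, chosen only to exercise each outcome.
    for a, b in [("GGT", "GGC"), ("GAT", "GAA"), ("ATG", "ATG")]:
        print(a, b, classify_substitution(a, b))
```

Applied column by column to the aligned sequences in a contig, a check of this kind separates silent differences from those that change the encoded residue; deciding whether a change is conservative would additionally require an amino acid similarity scheme, which is not sketched here.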

Discussion

Comparing microsporidian transcriptomes

This is the second microsporidian EST project to be conducted and the first from a microsporidian possessing a large genome, allowing for a meaningful comparison of microsporidian spore transcriptomes. Despite the vast differences in genome size and life cycle complexity between E. aedis and A. locustae, their transcriptomes are highly similar in their compositions. The proportions of unique transcripts encoding proteins devoted to the "protein destination" COG category in both E. aedis and A. locustae are relatively large (19% and 16%, respectively) (See Fig. 3). It is interesting to note that proteomic work correlates with these results, as the proteins in E. cuniculi devoted to the "protein destination" COG category form a large percentage (~28%) of the total proteins with known functions [12].

When the unique genes found in E. aedis and A. locustae are compared by COG category, the percentages in each category are nearly identical (See Fig. 3). The largest differences lie in the categories of cellular organization and biogenesis, cellular communication and signal transduction and cell rescue, defense, cell death and aging. One notable difference between the two spore transcriptomes is that no transposable elements were recovered in the A. locustae ESTs, whereas E. aedis transcribes a retrotransposon of the Ty3/gypsy family. Transposable elements have been previously reported to exist in the genomes of Nosema bombycis [13], Spraguea lophii [14], Brachiola algerae and E. aedis [8] (See below). To the best of our knowledge, this is the first instance of documented transposable element transcription in microsporidia, and could indicate active transposition.

Nearly 8% of the unique transcripts from E. aedis encode genes that are present in various eukaryotes, but are absent from other microsporidia. The existence of these genes has several possible explanations. Sequence data from microsporidia are scarce, and the only completely sequenced genome is that of E. cuniculi. Therefore, it is currently impossible to assert that these genes are absent in any microsporidia other than E. cuniculi. The possibility exists that they were present in the genome of the microsporidian ancestor, and were lost during genome reduction/compaction events in E. cuniculi. These genes could also have arisen from lateral transfer events or they could have come to resemble genes in other organisms by chance or by convergence. Ancestral presence followed by loss is the most parsimonious of these explanations. These data therefore suggest that the genome of the microsporidian ancestor was not compacted to the extent of E. cuniculi's.

The MetAP-2 protein is a target for drug therapy in E. cuniculi [15]. The E. aedis copy of the MetAP-2 gene is very similar to that present in E. cuniculi, and contains the amino acid residues that bind the drug fumagillin as well as those believed to coordinate metals. Like E. cuniculi, E. aedis lacks a polylysine tract at the N-terminus of the MetAP-2 protein that is present in animals, other fungi and plants. This tract plays a role in hindering the phosphorylation of eukaryotic initiation factor 2α (eIF2α), and its absence indicates that the microsporidian proteins likely lack this function [15].

Although our work indicates that the E. aedis L5 gene does not contain an intron like its E. cuniculi homologue (see Results, above), there is reason to believe that there are introns elsewhere in the genome. There are several transcripts encoding proteins that act in pre-mRNA splicing: an arginine/serine rich pre-mRNA splicing factor (NP_597487 in E. cuniculi), a pre-mRNA splicing factor (NP_586183 in E. cuniculi) and a U5 associated snRNP (NP_586393 in E. cuniculi). These genes comprise 2.2% of the total unique genes found.

Hsp70

Roughly 28% of total E. aedis transcripts encoded some form of Hsp70, a heat shock protein that assists in the folding of other proteins. Hsp70 helps prevent proteins from becoming insoluble and also plays a role in various other intracellular processes, such as apoptosis [16]. By refolding mutant proteins, Hsp70 allows them to continue functioning rather than being degraded; degradation would necessitate the costly synthesis of more protein. The number of Hsp70 transcripts in the E. aedis ESTs is an order of magnitude higher than was found in A. locustae (2%) [17]. We are cautious in this interpretation as we have not quantitatively assessed the transcription level of Hsp70 in E. aedis, and it is likely that transcripts of this protein are somewhat overrepresented in the library.

Although no E. cuniculi ESTs have been published, Brosson et al. [12] investigated the proteins present in spores. Hsp70 constitutes a moderate amount of all protein present. Brosson and his colleagues classified all proteins based on their COG categories, and found that all "protein destination" proteins together comprise 21% of E. cuniculi's proteome. Intriguingly, Brosson et al.'s [12] experiments indicate that, of the four copies of Hsp70 in E. cuniculi, the predominantly expressed copy is homologous to the highly represented transcript in E. aedis. In A. locustae, the most highly transcribed copy was most similar to the abundantly transcribed copy in E. aedis as well [17]. Therefore, it is likely that microsporidia employ similar primary mechanisms to ensure proper folding of proteins.

In other parasites and endosymbionts, such as Buchnera aphidicola, Hsp70 is also highly expressed [18] and may constitute up to 10% of the protein contained in the cell at any one time. In species that lead parasitic or endosymbiotic lifestyles, genetic drift and relaxed selection pressure frequently lead to an increased mutation rate. The need for Hsp70 in order for proteins to fold correctly seems to increase with both the size and number of mutations in the protein [16]. Although microsporidian genomes appear to have had little rearrangement, the nucleotide mutation rate seems to be high in this group of organisms [19,20]. Microsporidia could, therefore, contain elevated levels of Hsp70 in order to allow folding of mutant proteins.

Transposable elements

One of the E. aedis ESTs closely matches the integrase domain of the Ty3/gypsy family of retrotransposons. Several of these elements were identified in a GSS of E. aedis [8] and a few other microsporidian species, but to the best of our knowledge, this is the first instance in which transcripts of any microsporidian retrotransposon have been found. Transcripts could be indicative of active transposition occurring in E. aedis' genome.

Ty3/gypsy retrotransposons exist in many organisms ranging from the microsporidia Spraguea lophii [14], Brachiola algerae [8], and Nosema bombycis [13] to Saccharomyces, Drosophila and Sorghum. Ty3 elements have been well characterized in budding yeast, and exist in 1–4 copies per genome, where they are located upstream of genes transcribed by RNA polymerase III. Transcription typically occurs only in haploid cells in the presence of mating pheromones [21]. The N. bombycis genome contains at least 8 different retrotransposons in the Ty3/gypsy family, but unlike yeast, they are not exclusively located upstream of tRNAs [13]. Nearly all N. bombycis retrotransposons encode a polyprotein containing 5 domains, which exist in a defined order: Gag, protease, reverse transcriptase, RNase H and integrase. As many of the sequences in the E. aedis library appear to be 5' truncated, it is possible that the other domains upstream of the integrase in the polyprotein are also present in genomic DNA. Indeed, the GSS project revealed sequences matching the reverse transcriptase domain [8]. Although the microsporidian Vittaforma corneae is also known to possess at least one transposable element [22], it belongs to a different family from those present in E. aedis: the L1 family, which is also present in humans.

The only completely sequenced microsporidian genome, that of E. cuniculi [2], is completely devoid of transposable elements. The existence of similar transposable elements (of the Ty3/gypsy family) in the distantly related S. lophii, N. bombycis, B. algerae and E. aedis (See Fig. 1) implies that this element may have been present in the genome of the ancestor of microsporidia. Therefore, the process of genome compaction that gave rise to the E. cuniculi genome likely involved purging transposable elements.

It has been suggested that transposable elements may act to reorganize genes within the genome. Xu et al. [13] compared regions of synteny between N. bombycis and E. cuniculi chromosomes, as selection appears to be acting to retain gene synteny among microsporidia, even if they are only distantly related [3]. In N. bombycis, transposable elements flank these syntenic regions [13]. If E. aedis' large genome is partially a product of transposable element proliferation, one would expect much less synteny between this species and other microsporidia. Perhaps future research will elucidate other roles that transposable elements have played in shaping microsporidian genomes, especially since the minute genome of E. cuniculi seems to lack them, while they are present in larger genomes.

The functions that these transposable elements perform in a given genome are cryptic at best, but evidence is emerging that they may be more than simply parasitic DNA. Peaston et al. [23] recently discovered that a class of mouse retrotransposons appears to regulate gene expression in embryos.

Transcript structure

Transcripts in A. locustae typically contain more than one gene. These transcripts do not necessarily contain complete open reading frames for all genes and the genes are frequently in opposite orientations [9]. It is not known how many proteins are made from each transcript or whether this situation is typical for microsporidia, but recent work by Corradi et al. [10] suggests that E. cuniculi also possesses multi-gene transcripts.

Unlike A. locustae and E. cuniculi, E. aedis appears to transcribe very few multi-gene transcripts, if any at all. This is not unexpected, given that E. aedis genes appear to be separated by large intergenic spaces [8]. The E. aedis GSS could not rule out the possibility that local areas of compacted genes might exist [8]. Given the lack of multi-gene transcripts identified, this seems increasingly unlikely.

Also contrary to what is found in A. locustae, nearly all of E. aedis' transcripts encode proteins in a positive frame (<1% are in a negative frame, compared to 17% in A. locustae) [9]. Although antisense transcripts are used in many organisms (possibly also A. locustae) to suppress translation, it appears unlikely that this type of regulation occurs in E. aedis. Alternatively, the large number of antisense transcripts in A. locustae may be due to a lack of transcriptional regulation resulting from genome compaction.
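The proportion of negative-frame transcripts quoted above is a simple tally over the reading frame of each transcript's best BLASTX hit. The following minimal Python sketch shows one way such a tally could be computed. The function name and the input list are hypothetical, and the sketch assumes one frame value per transcript in the usual BLASTX convention (+1 to +3 for the sequenced strand, -1 to -3 for its reverse complement); it is not the procedure used in this study.

```python
from typing import Iterable


def negative_frame_percent(frames: Iterable[int]) -> float:
    """Percentage of transcripts whose best BLASTX hit lies in a negative frame.

    frames: one BLASTX frame per transcript (+1..+3 or -1..-3).
    """
    frames = list(frames)
    if not frames:
        raise ValueError("no frames supplied")
    negative = sum(1 for frame in frames if frame < 0)
    return 100.0 * negative / len(frames)


if __name__ == "__main__":
    # Illustrative values only; these are not frames from the E. aedis library.
    example_frames = [1, 2, -1, 3]
    print(negative_frame_percent(example_frames))
```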

E. aedis' transcripts seem to start at multiple locations upstream of the start codon (5' UTR length is 180 bp on average) but terminate at the same position with a relatively short 3' UTR (51 bp on average) (see, for example, Figure 4). This is more in line with transcription in E. cuniculi and contrasts with the situation in A. locustae, where transcripts start directly upstream of the translation initiation site, but often terminate much farther downstream in the adjacent gene [10]. For comparison, the yeast S. cerevisiae contains much shorter 5' UTRs than 3' UTRs (15–75 and ~144 bp, respectively [24,25]), a common trend seen in other fungi, plants and animals. The reason for the reversal seen in E. aedis is unknown, and it is unexpected because 3' UTRs are widely used as translational regulators. It is possible that E. aedis lacks some of the translational control mechanisms present in other fungi, plants and animals [26].
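The average UTR lengths discussed above follow directly from the position of the coding sequence within each transcript. As a minimal sketch, assuming each transcript is described by its total length and the 0-based, end-exclusive coordinates of its coding sequence, the calculation could look like the Python below. The `Transcript` type, the function names and the example coordinates are all hypothetical and are not taken from the E. aedis data.

```python
from statistics import mean
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    """Coordinates on the transcript itself, 0-based and end-exclusive."""
    length: int      # total transcript length in bp
    cds_start: int   # index of the first base of the start codon
    cds_end: int     # index just past the last base of the stop codon


def utr_lengths(t: Transcript) -> Tuple[int, int]:
    """Return (5' UTR length, 3' UTR length) in bp for one transcript."""
    if not 0 <= t.cds_start < t.cds_end <= t.length:
        raise ValueError("coding sequence must lie within the transcript")
    return t.cds_start, t.length - t.cds_end


def mean_utr_lengths(transcripts: List[Transcript]) -> Tuple[float, float]:
    """Return the mean 5' and 3' UTR lengths over a set of transcripts."""
    five_prime, three_prime = zip(*(utr_lengths(t) for t in transcripts))
    return mean(five_prime), mean(three_prime)


if __name__ == "__main__":
    # Invented coordinates for illustration only.
    toy = [Transcript(500, 120, 460), Transcript(600, 200, 540)]
    print(mean_utr_lengths(toy))
```

Because many clones in an EST library may be 5' truncated, a 5' UTR length measured this way is best read as a lower bound for the affected transcripts.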

Conclusion

This is the first examination of ESTs from a microsporidian containing a large genome. The extent of genome compaction in the microsporidian ancestor is not known, but the presence of genes in E. aedis that have not been found in other microsporidia suggests that extreme reduction and compaction occurred only in specific lineages. Surprisingly, E. aedis has a predicted uninucleate spore transcriptome that is highly similar to that of the distantly related A. locustae, although the two species have very different life cycles and genome sizes.

Methods

Uninucleate E. aedis spores were grown and harvested from A. aegypti larvae as described previously [27].

E. aedis spores were lysed in Ambion's plant RNA isolation aid and lysis/binding solution from an Ambion RNAqueous kit using a bead beater operating at 2500 rpm for 6 minutes with glass beads. RNA was extracted from the resulting supernatant using the RNAqueous kit. A microquantity cDNA library was constructed by Marligen, using the pExpress-1 vector. A total of 1307 clones with an average insert size of 1.5 kb were uni-directionally sequenced using an automated capillary sequencer. Sequences were manually edited and analyzed using Sequencher 4.2 software. Proteins encoded by the transcripts were identified via BLASTX [28] searches performed on the NCBI website (Genbank). Transcripts were identified as encoding a particular protein when BLASTX hits to Genbank proteins had e-values of 10^-4 or lower. Transcripts were scored as "present in other microsporidia" when the best BLASTX hit was a gene present in other microsporidia, or when the best hit was a gene that has a microsporidian homologue and that homologue was identified in other microsporidia by BLASTing the E. aedis transcript against available microsporidian data. Putative E. aedis-specific genes are transcripts that contain open reading frames at least 100 base pairs in size and do not have any BLASTX hits with e-values lower than 10^-3. In order to facilitate access to the EST sequences, they were uploaded to and annotated by the TBestDB website [17].
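The annotation criteria above amount to a small decision rule on the best BLASTX e-value and the longest open reading frame of each transcript. The Python sketch below is one possible reading of that rule, not the authors' actual procedure (the text describes the searches as run on the NCBI website). The function name, the category labels and the "unclassified" fallback for transcripts meeting neither criterion are assumptions introduced for illustration.

```python
from typing import Optional

ANNOTATION_EVALUE = 1e-4  # hits at or below this e-value assign a protein identity
NO_HIT_EVALUE = 1e-3      # putative specific genes have no hit below this e-value
MIN_ORF_BP = 100          # minimum open reading frame length in base pairs


def classify_transcript(best_evalue: Optional[float], longest_orf_bp: int) -> str:
    """Classify one transcript as 'annotated', 'putative_specific' or 'unclassified'.

    best_evalue: lowest BLASTX e-value for the transcript, or None if the
        search returned no hits at all.
    longest_orf_bp: length of the longest open reading frame in base pairs.
    """
    if best_evalue is not None and best_evalue <= ANNOTATION_EVALUE:
        return "annotated"
    no_significant_hit = best_evalue is None or best_evalue >= NO_HIT_EVALUE
    if no_significant_hit and longest_orf_bp >= MIN_ORF_BP:
        return "putative_specific"
    # Hypothetical fallback: e.g. weak hits between the two thresholds,
    # or transcripts without a sufficiently long open reading frame.
    return "unclassified"


if __name__ == "__main__":
    # Invented inputs for illustration only.
    print(classify_transcript(1e-20, 300))
    print(classify_transcript(None, 150))
```

Under this reading, a transcript whose best hit falls between the two thresholds is assigned to neither category, which reflects the gap between the 10^-4 and 10^-3 cut-offs stated in the text.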

Authors' contributions

EEG extracted RNA from the E. aedis spores, performed and interpreted the sequence analyses and drafted the manuscript. JJB cultivated insect larvae and harvested E. aedis spores. NMF conceived of this study, contributed to the interpretation of the results and helped draft the manuscript.

Acknowledgments

EEG's work is supported by an NSERC Postgraduate Scholarship and research in the Fast lab is funded by an NSERC Discovery Grant to NMF.

Contributor Information

Erin E Gill, Email: egill@interchange.ubc.ca.

James J Becnel, Email: james.becnel@ars.usda.gov.

Naomi M Fast, Email: nfast@interchange.ubc.ca.

References

  1. Wittner M, Weiss LM. The Microsporidia and Microsporidiosis. Washington, D.C.: ASM Press; 1999.
  2. Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, Delbac F, El Alaoui H, Peyret P, Saurin W, Gouy M, Weissenbach J, Vivares C. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature. 2001;414:450–453. doi: 10.1038/35106579.
  3. Slamovits CH, Fast NM, Law JS, Keeling PJ. Genome compaction and stability in microsporidian intracellular parasites. Curr Biol. 2004;14:891–896. doi: 10.1016/j.cub.2004.04.041.
  4. Streett DA. Analysis of Nosema locustae (microsporidia: Nosematidae) chromosomal DNA with pulsed-field gel electrophoresis. J Invertebr Pathol. 1994;63:301–303. doi: 10.1006/jipa.1994.1055.
  5. Becnel JJ. Edhazardia aedis (microsporidia: Amblyosporidae) as a biological control agent of Aedes aegypti (diptera: Culicidae). Proc Vth Int Colloq Invertebr Pathol Microb Control: Adelaide, Australia. 1990. pp. 56–60.
  6. Becnel JJ, Sprague V, Fukuda T, Hazard EI. Development of Edhazardia aedis (Kudo, 1930) N. G., N. comb. (microsporidia: Amblyosporidae) in the mosquito Aedes aegypti (L.) (Diptera: Culicidae). J Protozool. 1989;36:119–130. doi: 10.1111/j.1550-7408.1989.tb01057.x.
  7. Johnson MA, Becnel JJ, Undeen AH. A new sporulation sequence in Edhazardia aedis (microsporidia: Culicosporidae), a parasite of the mosquito Aedes aegypti (diptera: Culicidae). J Invertebr Pathol. 1997;70:69–75. doi: 10.1006/jipa.1997.4678.
  8. Williams BAP, Lee RCH, Becnel JJ, Weiss LM, Fast NM, Keeling PJ. Genome sequence surveys of Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities. BMC Genomics.
  9. Williams BA, Slamovits CH, Patron NJ, Fast NM, Keeling PJ. A high frequency of overlapping gene expression in compacted eukaryotic genomes. Proc Natl Acad Sci USA. 2005;102:10936–10941. doi: 10.1073/pnas.0501321102.
  10. Corradi N, Gangaeva A, Keeling PJ. Comparative profiling of overlapping transcription in the compacted genomes of microsporidia Antonospora locustae and Encephalitozoon cuniculi. Genomics.
  11. Limpright VO, Fast NM. Personal communication.
  12. Brosson D, Kuhn L, Delbac F, Garin J, Vivares CP, Texier C. Proteomic analysis of the eukaryotic parasite Encephalitozoon cuniculi (microsporidia): A reference map for proteins expressed in late sporogonial stages. Proteomics. 2006;6:3625–3635. doi: 10.1002/pmic.200500796.
  13. Xu J, Pan G, Fang L, Li J, Tian X, Li T, Zhou Z, Xiang Z. The varying microsporidian genome: Existence of long-terminal repeat retrotransposon in domesticated silkworm parasite Nosema bombycis. Int J Parasitol. 2006;36. doi: 10.1016/j.ijpara.2006.04.010.
  14. Hinkle G, Morrison HG, Sogin ML. Genes coding for reverse transcriptase, DNA-directed RNA polymerase, and chitin synthetase from the microsporidian Spraguea lophii. Biol Bull. 1997;193:250–251. doi: 10.1086/BBLv193n2p250.
  15. Pandrea I, Mittleider D, Brindley PJ, Didier ES, Robertson DL. Phylogenetic relationships of methionine aminopeptidase 2 among Encephalitozoon species and genotypes of microsporidia. Mol Biochem Parasit. 2005;140:141–152. doi: 10.1016/j.molbiopara.2004.12.006.
  16. Mayer MP, Bukau B. Hsp70 chaperones: Cellular functions and molecular mechanism. Cell Mol Life Sci. 2005;62:670–684. doi: 10.1007/s00018-004-4464-6.
  17. O'Brien E, Koski L, Zhang Y, Yang L, Wang E, Gray MW, Burger G, Lang BF. TBestDB: A taxonomically broad database of expressed sequence tags (ESTs). Nucleic Acids Res. 2007;35:D445–451. doi: 10.1093/nar/gkl770.
  18. Wilcox J, Dunbar HE, Wolfinger RD, Moran NA. Consequences of reductive evolution for gene expression in an obligate endosymbiont. Mol Microbiol. 2003;48:1491–1500. doi: 10.1046/j.1365-2958.2003.03522.x.
  19. Thomarat F, Vivares CP, Gouy M. Phylogenetic analysis of the complete genome sequence of Encephalitozoon cuniculi supports the fungal origin of microsporidia and reveals a high frequency of fast-evolving genes. J Mol Evol. 2004;59:780–791. doi: 10.1007/s00239-004-2673-0.
  20. Van de Peer Y, Ben Ali A, Meyer A. Microsporidia: Accumulating molecular evidence that a group of amitochondriate and suspectedly primitive eukaryotes are just curious fungi. Gene. 2000;246:1–8. doi: 10.1016/S0378-1119(00)00063-9.
  21. Kinsey PT, Sandmeyer SB. Ty3 transposes in mating populations of yeast: A novel transposition assay for Ty3. Genetics. 1995;139:81–94. doi: 10.1093/genetics/139.1.81.
  22. Mittleider D, Green LC, Mann VH, Michael SF, Didier ES, Brindley PJ. Sequence survey of the genome of the opportunistic microsporidian pathogen, Vittaforma corneae. J Eukaryot Microbiol. 2002;49:393–401. doi: 10.1111/j.1550-7408.2002.tb00218.x.
  23. Peaston AE, Evsikov AV, Graber JH, de Vries WN, Holbrook AE, Solter D, Knowles BB. Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Dev Cell. 2004;7:597–606. doi: 10.1016/j.devcel.2004.09.004.
  24. Zhang Z, Dietrich FS. Mapping of transcription start sites in Saccharomyces cerevisiae using 5' SAGE. Nucleic Acids Res. 2005;33:2838–2851. doi: 10.1093/nar/gki583.
  25. Graber JH, Cantor CR, Mohr SC, Smith TF. Genomic detection of new yeast pre-mRNA 3'-end-processing signals. Nucleic Acids Res. 1999;27:888–894. doi: 10.1093/nar/27.3.888.
  26. Mazumder B, Seshadri V, Fox PL. Translational control by the 3'-UTR: The ends specify the means. Trends Biochem Sci. 2003;28:91–98. doi: 10.1016/S0968-0004(03)00002-1.
  27. Becnel JJ, Garcia JJ, Johnson MA. Edhazardia aedis (microspora: Culicosporidae) effects on the reproductive capacity of Aedes aegypti (diptera: Culicidae). J Med Entomol. 1995;32:549–553. doi: 10.1093/jmedent/32.4.549.
  28. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.

Articles from BMC Genomics are provided here courtesy of BMC
