Abstract
In many animal species, germ-line progenitors associate with gonadal somatic cells to form the embryonic gonads (EGs) that later develop into functional organs producing gametes. To explore the genetic regulation of germ-line development, we initiated a comprehensive identification and functional analysis of the genes expressed within the EGs. First, we generated a cDNA library from gonads purified from Drosophila embryos by FACS. Using this library, we catalogued the genes expressed in the gonad by EST analysis. A total of 17,218 high-quality ESTs representing 3,051 genes were obtained, corresponding to 20% of the predicted genes in the genome. The EG transcriptome is unexpectedly distinct from that of adult gonads and includes an extremely high proportion of retrotransposon-derived transcripts. We verified 101 genes preferentially expressed in the EGs by whole-mount in situ hybridization. Within this subset, 39 and 58 genes were expressed predominantly in germ-line and somatic cells, respectively, whereas four genes were expressed in both cell lineages. The gonad-enriched genes encompassed a variety of predicted functions. However, genes implicated in SUMOylation and protein translation, including germ-line-specific ribosomal proteins, are preferentially expressed in the germ line, whereas the expression of various retrotransposons and RNAi-related genes is more prominent in the gonadal soma. These transcriptome data are a resource for understanding the mechanism of various cellular events during germ-line development.
Keywords: expressed sequence tag, germ cell, retrotransposon, pole cell
The germ line is the only cell type that transmits genetic material from one generation to the next during sexual reproduction. In many animal species, germ-line progenitors migrate within embryos to associate with gonadal somatic cells to form the embryonic gonads (EGs) that will later develop into a fully functional organ capable of producing gametes. In Drosophila, the germ-line progenitors, or pole cells, form at the posterior pole region of the early embryo (1, 2). Pole cells then migrate toward the mesodermal layer, where they associate with the specialized mesodermal cells known as somatic gonadal precursors. Eventually, the somatic cells encapsulate the pole cells to form EGs. Within the gonads, the pole cells undergo oogenesis or spermatogenesis and differentiate into germ cells during postembryonic development. Pole cells that fail to be encapsulated within the gonads eventually degenerate without producing germ cells (3).
Within the EGs, distinct cellular events associated with germ-line development occur, such as resumption of germ-line proliferation (4, 5), selection of the germ-line stem cell (6), gonad morphogenesis (7), and cellular communication between germ-line and somatic cells (8–10). Recent studies have also revealed that the male germ-line stem cell niche is already specified in the EG (ref. 11; Y.K., S.S., K. Arita, and S.K., unpublished data). Despite the importance of the EG in germ-line development, only limited information is available regarding which genes are expressed in the EG, although transcriptome data for adult testes and ovaries have accumulated (12, 13). Thus, we attempted to identify the genes expressed within the EGs by a direct and comprehensive approach. In Drosophila, transcriptome analysis of individual organs and cell types has been hampered by their small size. To overcome this problem, we have developed an efficient method to isolate EGs by flow cytometry (14). We generated a cDNA library from purified gonads and obtained 17,218 valid ESTs representing 3,051 genes, all of which were examined by whole-mount in situ hybridization (WISH). The transcripts from 101 genes were enriched in the EG. These genes encompass a wide array of molecular and biological processes, as deduced from the Gene Ontology (GO) categories in the fly database. Here, we highlight five functional categories of genes enriched in the EG and discuss their roles.
Results and Discussion
Purification of EG by FACS and Generation of ESTs.
We used FACS to isolate EG from transgenic Drosophila embryos harboring the germ-line marker EGFP-vasa (15). Embryos at 10–18 h after egg laying were homogenized without protease treatment to keep the gonad intact. From these homogenates, gonads containing both GFP-positive pole cells and GFP-negative gonadal somatic cells were separated from the remaining tissue by FACS. With this procedure, we were able to obtain a highly enriched fraction of EG, as confirmed by microscopy and quantitative PCR (14). We constructed an EG cDNA library (EG library) from poly(A)+ RNA from a pool of ≈25,000 FACS-sorted EGs. We sequenced 12,977 cDNA clones from the 5′ end and 6,755 from the 3′ end. After removing low-quality and contaminating sequences, 17,218 high-quality reads were obtained (DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank accession nos. BP540206–BP560422).
When aligned to Drosophila melanogaster genomic sequences, 15,384 (90.1%) ESTs mapped to euchromatic genomic regions and 434 (2.5%) to heterochromatic genomic regions. The remainder, 1,254 ESTs (7.4%), mapped to multiple loci within the genome; these included 974 highly repetitive sequences (≥10 hits in the genome). Compared with the public EST collections, this EG library includes a significantly higher proportion of repetitive sequences (Fig. 1A). Almost all of the repetitive sequences were derived from retrotransposons (Fig. 1B).
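The mapping categories described above can be illustrated with a minimal sketch. It assumes that each EST has already been aligned to the genome and is represented by a list of hit labels; the function names, the labels, and the toy records are hypothetical, and only the threshold of ≥10 hits for highly repetitive sequences is taken from the text. In the text the highly repetitive ESTs are counted as a subset of the multiple-loci class, whereas this sketch reports them as a separate category for clarity.

```python
from collections import Counter

REPETITIVE_THRESHOLD = 10  # ">=10 hits in the genome", as stated in the text


def classify_est(hits):
    """Classify one EST from its genomic hits.

    `hits` is a list of region labels ("euchromatin" or "heterochromatin"),
    one per alignment. This input format is an assumption for illustration.
    """
    if not hits:
        return "unmapped"
    if len(hits) == 1:
        return hits[0]  # uniquely mapped: report the region type
    if len(hits) >= REPETITIVE_THRESHOLD:
        return "highly_repetitive"
    return "multiple_loci"


def summarize(ests):
    """Count ESTs per category and return (counts, fractions of the total)."""
    counts = Counter(classify_est(hits) for hits in ests.values())
    total = sum(counts.values())
    fractions = {category: n / total for category, n in counts.items()}
    return counts, fractions


# Toy data with hypothetical EST names (not taken from the EG library)
toy_ests = {
    "est1": ["euchromatin"],
    "est2": ["euchromatin"],
    "est3": ["heterochromatin"],
    "est4": ["euchromatin", "heterochromatin"],
    "est5": ["euchromatin"] * 12,
}
counts, fractions = summarize(toy_ests)
```

Applied to real alignment output, the fractions returned by `summarize` would correspond to the percentages quoted for each mapping class.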
We aligned each EST with a reference transcript set in the Drosophila database (FlyBase, http://flybase.net) and assigned it to a gene. In total, we consolidated 17,072 ESTs derived from the EG library into a nonredundant set of 3,051 genes; these correspond to ≈20% of the predicted genes in D. melanogaster. Because our EST analysis was nearly saturating (see Supporting Text, which is published as supporting information on the PNAS web site), it covers most of the gene repertoire of the EG transcriptome. All identified genes expressed in the EG are listed in Table 3, which is published as supporting information on the PNAS web site.
Using the National Center for Biotechnology Information (NCBI) UniGene-based classification, we compared the gene sets expressed by EGs, adult gonads (AGs; ovary + testis), and other tissues (OTs). Genes represented in all three collections are regarded as those with “housekeeping” functions (1,809 UniGenes; 60.8% of the EG UniGene collection). Apart from the housekeeping genes, the proportion of genes present in both the EG and AG collections was low (Fig. 2). Only 145 genes (5.0% of the EG UniGene collection) were common to EGs and AGs, whereas 719 genes (24.8% of the EG UniGene collection) were expressed in both EGs and OTs but not AGs. Thus, the EG transcriptome is unexpectedly different from that of AGs. We conclude that the genetic regulation of germ-line development within the EGs is distinct from that underlying gametogenesis in ovaries and testes.
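The three-way comparison above amounts to simple set operations, sketched here under stated assumptions. The function names and toy gene identifiers are hypothetical, and reading "common to EG and AG" as "shared with AGs but absent from OTs" is an interpretation made for this illustration rather than a definition given in the text.

```python
def partition_eg_genes(eg, ag, ot):
    """Split the EG gene set by presence in adult gonads (AG) and other tissues (OT)."""
    eg, ag, ot = set(eg), set(ag), set(ot)
    return {
        "housekeeping": eg & ag & ot,      # present in all three collections
        "eg_and_ag_only": (eg & ag) - ot,  # shared with adult gonads only
        "eg_and_ot_only": (eg & ot) - ag,  # shared with other tissues only
        "eg_only": eg - ag - ot,           # found in the EG collection alone
    }


def as_percent_of_eg(parts, eg_size):
    """Express each class as a percentage of the EG collection."""
    return {name: round(100 * len(genes) / eg_size, 1) for name, genes in parts.items()}


# Toy collections with hypothetical identifiers
eg_genes = {"a", "b", "c", "d", "e"}
ag_genes = {"a", "b", "x"}
ot_genes = {"a", "c", "d", "y"}

parts = partition_eg_genes(eg_genes, ag_genes, ot_genes)
percents = as_percent_of_eg(parts, len(eg_genes))
```

The four classes are disjoint and together cover the whole EG set, so their percentages sum to 100 up to rounding.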
Overview of Comprehensive Whole-Mount in Situ Hybridization (WISH).
All 3,051 genes represented in our EG library were subjected to WISH to examine their distribution within the embryo. Overall, we obtained useful expression data for 2,388 genes. Although most of them showed ubiquitous distribution, we found that transcripts from 101 genes were enriched in the EG, as summarized in Table 1. We further examined their distribution within the gonads by double-staining embryos with an RNA probe for each transcript and an anti-VASA antibody to distinguish the germ-line and somatic expression of the transcripts within the gonads. We identified 39 RNAs that are expressed predominantly in pole cells, 58 that are expressed in gonadal somatic cells, and 4 that are expressed in both cell types (Table 1, Fig. 3).
Table 1.
Gene | Tissues* | Functions (excerpt) | FlyBase ID |
---|---|---|---|
Expressed in germline | |||
RpS13 | GO[g], CNS, LG | Ribosomal protein | FBgn0010265 |
RpS19b | GO[g], PV | Ribosomal protein | FBgn0039129 |
CG9871 | GO[g], SNS | Ribosomal protein | FBgn0034837 |
RpS5b | GO[g], U | Ribosomal protein | FBgn0038277 |
smt3 | GO[g], CNS | Sumoylation | FBgn0026170 |
Uba2 | GO[g], CNS | Sumoylation | FBgn0029113 |
lwr | GO[g], CNS, U | Sumoylation | FBgn0010602 |
Aos1 | GO[g], CNS | Sumoylation | FBgn0029512 |
Hsp27 | GO[g], CNS, DV | Protein folding | FBgn0001226 |
Hsp26 | GO[g], MGL | Protein folding | FBgn0001225 |
Hsp83 | GO[g], OE, PNS, CNS, MG | Protein folding | FBgn0001233 |
CG4415 | GO[g], PV | Unfolded protein binding | FBgn0031296 |
Cam | GO[g], CNS, OE | Calcium ion binding | FBgn0000253 |
l(1)G0269 | GO[g], CNS, PNS | CTD-like phosphatase | FBgn0029067 |
grp | GO[g], CNS | Protein serine/threonine kinase; cell cycle | FBgn0011598 |
CG2919 | GO[g], BR | Cytoskeleton organization and biogenesis | FBgn0037348 |
scra | GO[g], CNS | Cytoskeleton organization and biogenesis | FBgn0004243 |
Mapmodulin | GO[g], CNS | Microtubule binding | FBgn0034282 |
stai | GO[g], CNS, PNS | Microtubule binding | FBgn0051641 |
Mcm6 | GO[g], CNS, MG, FB | DNA helicase | FBgn0025815 |
Top2 | GO[g], CNS | DNA topoisomerase | FBgn0003732 |
Thd1 | GO[g], CNS, U | Pyrimidine-specific mismatch base pair DNA N-glycosylase | FBgn0026869 |
dUTPase | GO[g], CNS, U, MG | dUTP diphosphatase | FBgn0013349 |
TfIIA-S | GO[g], CNS, U | General transcription factor | FBgn0013347 |
Ssb-c31a | GO[g], MG, BR | Transcription coactivator | FBgn0015299 |
ovo | GO[g], EP, BR | Transcription factor | FBgn0003028 |
Fs(2)Ket | GO[g], CNS, U | Importin β | FBgn0000986 |
zpg | GO[g] | Innexin channel | FBgn0024177 |
janA | GO[g], MG, U | Sex differentiation | FBgn0001280 |
CSN3 | GO[g], CNS, U | Signalosome complex | FBgn0027055 |
Uba1 | GO[g], CNS, U | Ubiquitin-activating enzyme | FBgn0023143 |
CG10990 | GO[g] | Translation elongation factor | FBgn0030520 |
vas | GO[g] | RNA helicase | FBgn0003970 |
CG11329 | GO[g] | | FBgn0031848 |
CG15930 | GO[g] | | FBgn0029754 |
CG18213 | GO[g] | | FBgn0038470 |
CG12576 | GO[g], CNS, MG | | FBgn0031190 |
CG14346 | GO[g], U | | FBgn0031337 |
Unnamed gene | GO[g], U | | FBgn0058460 |
Expressed in somatic line | |||
412 | GO[s] | Retrotransposon | FBgn0000006 |
297 | GO[s], CNS, U | Retrotransposon | FBgn0000005 |
17.6 | GO[s], DV, PH | Retrotransposon | FBgn0000004 |
mdg1 | GO[s] | Retrotransposon | FBgn0002697 |
Quasimodo | GO[s] | Retrotransposon | FBgn0062261 |
Stalker | GO[s], U | Retrotransposon | FBgn0064138 |
Stalker2 | GO[s], PV | Retrotransposon | FBgn0063399 |
Tabor | GO[s], SNS | Retrotransposon | FBgn0045970 |
Tirant | GO[s] | Retrotransposon | FBgn0004082 |
ZAM | GO[s], U | Retrotransposon | FBgn0023131 |
gtwin | GO[s] | Retrotransposon | FBgn0063436 |
armi | GO[s], FB | RNA interference | FBgn0041164 |
Dcr-2 | GO[s], U | RNA interference | FBgn0034246 |
piwi | GO[s] | RNA interference | FBgn0004872 |
CG8908 | GO[s], U | ABC transporter | FBgn0034493 |
CG30359 | GO[s], FB | Carbohydrate metabolism; transporter | FBgn0050359 |
CG3036 | GO[s], PNS, GC | Membrane protein; transporter | FBgn0031645 |
CG9935 | GO[s], SNS, LG, PE | Membrane protein; transporter | FBgn0039916 |
CG11537 | GO[s], CNS, SG | Membrane protein; transporter | FBgn0035400 |
CG1599 | GO[s], U | Plasma membrane protein; v-SNARE | FBgn0033452 |
CG3074 | GO[s], SNS | Cathepsin B; proteolysis | FBgn0034709 |
CG9634 | GO[s], U | Proteolysis | FBgn0027528 |
m1 | GO[s], U | Serine-type endopeptidase inhibitor | FBgn0002578 |
Fas1 | GO[s], CNS, PNS | Cell adhesion | FBgn0000634 |
l(2)03709 | GO[s], MG, FB, MU | Cell cycle, DNA metabolism | FBgn0010551 |
Wnt6 | GO[s], MG, MP, GC, MG | frizzled-2 signaling | FBgn0031902 |
skf | GO[s], U | Plasma membrane protein; signal transduction | FBgn0050021 |
lbm | GO[s], CNS, PNS | Tetraspanin; receptor signaling protein | FBgn0016032 |
CG7194 | GO[s] | Gonad development | FBgn0035868 |
M(2)21AB | GO[s], RG, MG | Methionine adenosyltransferase | FBgn0005278 |
mRpS24 | GO[s], FB, MG | Mitochondrial ribosomal protein | FBgn0039159 |
Mocs1 | GO[s], MG | Mo-molybdopterin cofactor biosynthesis | FBgn0036122 |
mud | GO[s] | Mushroom body development | FBgn0002873 |
Pros45 | GO[s], CNS | Proteasome complex | FBgn0020369 |
CG10565 | GO[s], MG, PV, FB | Protein folding; nucleic acid binding | FBgn0037051 |
stg | GO[s] | Protein tyrosine/serine/threonine phosphatase; cell cycle | FBgn0003525 |
CG5800 | GO[s], SNS, MG | RNA helicase | FBgn0030855 |
B52 | GO[s], U | RNA splicing factor activity | FBgn0004587 |
CG11447 | GO[s], MG, ES, HG | rRNA (uridine-2′-O-)-methyltransferase | FBgn0038737 |
zfh1 | GO[s], CNS, PNS | Transcription factor | FBgn0004606 |
stc | GO[s], FB | Transcription factor | FBgn0001978 |
esg | GO[s], HIB | Transcription factor | FBgn0001981 |
ftz-f1 | GO[s], PV | Transcription factor | FBgn0001078 |
neur | GO[s], CNS, FB | Ubiquitin-protein ligase | FBgn0002932 |
novel gene | GO[s], CNS, PNS | | FGM222E05† |
novel gene | GO[s], CNS, U | | FGC026A04† |
CG15784 | GO[s], CNS | | FBgn0029766 |
CG7267 | GO[s], FB, U | | FBgn0030079 |
CG33047 | GO[s], GRL | | FBgn0053047 |
CG6014 | GO[s], HG | | FBgn0027542 |
CG7498 | GO[s], LG, SNS, FB | | FBgn0040833 |
CG7224 | GO[s], MG, GC, PV | | FBgn0031971 |
CG5541 | GO[s], MG, HG, ES, PV | | FBgn0030603 |
CG14998 | GO[s], PNS, FB | | FBgn0035500 |
CG11050 | GO[s], U | | FBgn0031836 |
dpr17 | GO[s], U | | FBgn0051361 |
CG14072 | GO[s], U | | FBgn0032318 |
Expressed in germline and somatic line | |||
Su(var)205 | GO[g/s], CNS | Chromatin binding | FBgn0003607 |
Df31 | GO[g/s], CNS, PNS, misc, HIB | Histone binding | FBgn0022893 |
ran | GO[g/s], CNS, BR, FB | Ras GTPase | FBgn0020255 |
14-3-3ε | GO[g/s], CNS, U | Ras protein signal transduction | FBgn0020238 |
*BR, brain; DV, dorsal vessel; EP, epidermis; ES, esophagus; FB, fat body; GC, gastric caeca; GO, gonad; HG, hindgut; HIB, histoblast; LG, lymph gland; MG, midgut; MGL, midline glial cell; MP, Malpighian tubule; MU, muscle; OE, oenocyte; PE, pericardial cells; PH, pharynx; PNS, peripheral nervous system; PV, proventriculus; RG, ring gland; SG, salivary gland; SNS, stomatogastric nervous system; U, weak signal is ubiquitously detected; g and s in brackets indicate germ-line and somatic-line expression in the gonad, respectively.
†Instead of FlyBase ID, the clone name is shown for the novel gene.
We investigated the temporal expression patterns of transcripts enriched in pole cells by WISH. Embryos at various developmental stages were examined (Fig. 4, which is published as supporting information on the PNAS web site), and three major expression patterns were extracted (Types I, II, and III). Transcripts with the Type I expression pattern are first observed in the pole cells during their migration through the posterior midgut epithelium and remain detectable after the coalescence of the gonads. Transcripts from vasa, RpL22-like, RpS19b, CG10990, CG4415, TfIIA-S, and Ssb-c31a exhibit this type of expression. Because the pole cells are transcriptionally inactive until they migrate (16), these transcripts are some of the earliest zygotic transcripts in the pole cells. Given that their transcription is initiated in the pole cells before coalescing with the gonadal somatic cells, we speculate that their expression is autonomously initiated by maternal factors partitioned into the pole cells, rather than by an inductive signal from the gonadal soma. Indeed, the expression of some Type I genes was also detectable in “lost” pole cells that failed to be incorporated within the gonads. Transcripts with a Type II expression pattern are observed in various tissues before gonad formation but are enriched in pole cells after they associate with the gonadal somatic cells. Transcripts for smt3, Uba2, lwr, Top2, and grp display this type of expression pattern. Type III expression includes transcripts that accumulate in the pole cells throughout embryogenesis. These transcripts present in the early pole cells are presumably maternal in origin, whereas zygotic transcription may occur at later stages. This type includes transcripts from ovo, stai, Hsp26, Hsp27, Hsp83, and zpg.
Functional Classification of EG-Enriched Genes.
To characterize the EG transcriptome, we assigned GO terms to each EG-enriched gene according to FlyBase annotations. As shown in Table 4, which is published as supporting information on the PNAS web site, EG-enriched genes represent a broad range of biological and molecular functions. Our statistical analysis showed that some of the categories were significantly overrepresented in the list of EG-enriched genes (Table 2). Among them, five categories are highlighted and discussed in detail.
Table 2.
GO name (GO ID) | No. genes enriched in EG | No. genes in the genome | P value | Genes |
---|---|---|---|---|
Biological process | ||||
Germ cell development (0007281) | 5 | 157 | 0.012 | zpg, vas, armi, 14-3-3ε, piwi |
DNA replication (0006260) | 5 | 128 | 5.0E-03 | armi, l(2)03709, Mcm6, Top2, Thd1 |
Protein folding (0006457) | 4 | 134 | 0.029 | CG10565, Hsp26, Hsp27, Hsp83 |
Protein import into nucleus (0006606) | 3 | 49 | 0.0087 | Fs(2)Ket, lwr, smt3 |
Response to heat (0009408) | 3 | 51 | 0.0097 | Hsp26, Hsp27, Hsp83 |
RNA interference (0016246) | 3 | 11 | 9.9E-05 | armi, Dcr-2, piwi |
Molecular function | ||||
Microtubule binding (0008017) | 3 | 76 | 0.028 | Mapmodulin, scra, stai |
SUMO-activating enzyme activity (0019948) | 2 | 2 | 7.4E-05 | Aos1, Uba2 |
Cellular component | ||||
Cytosolic ribosome (0005830) | 4 | 99 | 0.011 | CG9871, RpS13, RpS19b, RpS5b |
Only overrepresented categories are shown.
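For readers who want a feel for how P values of the kind listed in Table 2 can arise, the following is a minimal sketch of a one-sided hypergeometric test. This is an assumption: the actual statistical procedure is described in the Supporting Text, which is not reproduced here, and the background size `N_GENOME` below is a hypothetical placeholder rather than a figure from this study. The function name is likewise illustrative.

```python
from math import comb


def hypergeom_upper_tail(k, n_drawn, n_category, n_total):
    """Return P(X >= k) for a hypergeometric draw.

    n_total    : genes in the background (e.g., annotated genes in the genome)
    n_category : background genes carrying the GO term
    n_drawn    : genes in the list being tested (e.g., EG-enriched genes)
    k          : genes in the list that carry the GO term
    """
    upper = min(n_drawn, n_category)
    numerator = sum(
        comb(n_category, i) * comb(n_total - n_category, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_drawn)


# Illustration with the "RNA interference" row of Table 2:
# 3 of the 101 EG-enriched genes among 11 annotated genes in the genome.
N_GENOME = 12000  # hypothetical background size, NOT a value from the study
p_rnai = hypergeom_upper_tail(k=3, n_drawn=101, n_category=11, n_total=N_GENOME)
```

The resulting value depends strongly on the chosen background size and on whether any multiple-testing correction is applied, so it should not be expected to reproduce the tabulated P values exactly.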
Germ-Line Development.
We found that genes in the GO category “germ cell development” were overrepresented in the list of EG-enriched genes (Table 2). It is generally expected that genes responsible for germ-line development are predominantly expressed in the gonads. However, the known functions of these genes are required within the AGs. For example, piwi is expressed in the somatic cells adjacent to germ-line stem cells and is essential for stem cell self-renewal (17). zpg is required for survival of differentiating early germ cells in AGs (18), and armi represses oskar translation in ovaries and Ste expression in testes (19, 20). Although their functions during embryogenesis are unclear, these genes were expressed in the EGs. A similar precocious expression has been reported for meiotic genes; a subset of the genes responsible for meiotic division is expressed in pole cells during embryogenesis, whereas meiosis is initiated later at the postembryonic stages (21). It is possible that transcription of these gametogenesis-related genes initiates in the EGs, but posttranscriptional repression restricts the function of these genes until the onset of gametogenesis. Although we cannot exclude the possibility that these genes may have additional functions, our observations are consistent with the notion that the EG acquires at least a part of the potential to carry out gametogenesis.
LTR Retrotransposons.
We observed that a surprisingly large number of EG ESTs (>1,000 ESTs) were derived from retrotransposons; this population corresponds to 7% of the EG EST collection. This proportion was significantly larger than in other public EST collections (Fig. 1B). Thus, retrotransposons are predominantly expressed in the EGs. Approximately 100 families of retrotransposons have been identified in the Drosophila genome (22). Our EST analysis detected transcripts from various types of retrotransposons (30 families) but was dominated by those with LTRs (23 families; Table 5, which is published as supporting information on the PNAS web site). The WISH experiments reveal that at least 11 LTR retrotransposons, 17.6, 297, 412, gtwin, mdg1, Quasimodo, Stalker, Stalker2, Tabor, ZAM, and Tirant, are expressed predominantly in the EGs (Fig. 3). These observations are in accordance with previous reports showing that transcripts for 17.6, 412, mdg1, 297, and gypsy accumulate in the EGs (23, 24). It is interesting to note that these transcripts were all detected in gonadal somatic cells rather than in pole cells (Fig. 3). Thus, we conclude that hyperexpression in the gonadal somatic cells is a common feature of various types of LTR retrotransposons.
The significance of the retrotransposon expression in gonadal somatic cells is unclear. Expression and retrotransposition in the germ line would be a more effective strategy for retrotransposons to propagate themselves in a heritable manner from one generation to the next. An interesting case has been reported in a specific strain called RevI, in which the retrotransposon ZAM is expressed in the follicle cells of the adult ovaries and forms virus-like particles that transfer to neighboring oocytes (25). A similar transfer has been reported for the virus-like particles originating from the gypsy retrotransposon in the ovaries of flamenco mutant females (26). This translocation of virus-like particles may be coupled with yolk transfer from follicle cells to the oocytes by exo- and endocytosis and/or through gap junctions (25, 27). Retrotransposons may exploit the intimate link between the follicle cells and the oocytes to obtain additional access to gametes. This somatic expression may circumvent a host defense against retrotransposons in the germ line (28, 29). However, it is worthwhile to note that the expression of the ZAM and gypsy retrotransposons is detectable only in certain genetic backgrounds, such as RevI and the flamenco mutant, respectively (25, 27). Thus, their expression is normally repressed in the follicle cells. In contrast, in our experiments, transcripts from various retrotransposons are preferentially expressed in gonadal somatic cells during normal embryogenesis. One possibility is that their early transcription is regulated differently, and the transcripts are inactivated by a posttranscriptional regulatory mechanism (see below).
RNAi.
Among the Drosophila genes, 11 are annotated as being associated with “RNAi” (FlyBase), a mechanism by which dsRNA induces gene silencing. Transcripts from nine RNAi-related genes are constituents of the EG EST collection. Among them, three genes, piwi, armi, and Dcr-2, were found by WISH to be expressed predominantly in the EGs (Fig. 3). These transcripts were all detectable in the gonadal somatic cells rather than pole cells in the EGs (Fig. 3). The functions of piwi and armi have been investigated in the AG. piwi is an Argonaute-family gene necessary for germ-line stem cell renewal in ovaries (17), and armi is required for polarization of the oocyte (20) and for silencing of the Stellate gene in male germ cells (19). However, the functions of these genes in somatic gonadal cells during embryogenesis are not yet clear. We propose that an RNAi-mediated gene silencing mechanism is active in the somatic cells of the EGs.
RNAi-mediated mechanisms contribute to host defenses against transposons and viruses (30–32). A subset of mutations that disable the RNAi mechanism mobilizes families of transposable elements. For example, the LTR retrotransposons gypsy and ZAM are regulated by a mechanism that depends on piwi (30–32). In Drosophila, transposons and repeated sequences, including P-element, Stellate, I-element, and gypsy, are repressed by a trans-silencing mechanism termed “cosuppression” that targets any transposons containing sequences homologous to the “trigger” transcripts by small interfering RNA (siRNA; refs. 33–35). Based on the aforementioned observations that transcripts from various LTR retrotransposons and the RNAi-related genes are both enriched in the gonadal somatic cells of the embryos, we hypothesize that LTR-retrotransposon transcripts are the “trigger”: they are processed by the RNAi pathway to produce siRNA, which in turn silences the retrotransposons in the following developmental stages.
This hypothesis is supported by our observation that, of the two Drosophila Dicer homologs, Dcr-2 but not Dcr-1 is predominantly expressed in the EGs: Dcr-2 is responsible for the production of siRNA from dsRNA, whereas Dcr-1 acts in microRNA-triggered gene silencing (36–38). A recent analysis of the small RNAs expressed during Drosophila embryogenesis has identified a large number of repeat-associated siRNAs, which are complementary to repetitive elements, including retrotransposons (39). Although the distribution of these small RNAs remains unclear, it is likely that Dcr-2 and the other RNAi-related genes process transcripts from the LTR retrotransposons in the gonadal soma. Further studies examining the role of these RNAi-related genes in the EGs are required to test this hypothesis.
SUMOylation.
We found that almost all of the components required for SUMOylation are expressed predominantly in pole cells. SUMO is a member of the ubiquitin-like protein family that regulates cellular function by binding covalently to a variety of target proteins (40). SUMO (smt3) is one of the most highly represented transcripts in the EG EST collection (80 ESTs; Table 3). Our WISH analysis revealed that smt3 RNA is enriched in pole cells as well as in the CNS (Fig. 3). Similarly, transcripts for the E1 and E2 components, which are encoded by the Uba2, Aos1, and lwr genes, are all concentrated in the pole cells (Fig. 3). In addition, these transcripts exhibit quite similar temporal expression (Type II; see above). The common spatiotemporal expression pattern suggests that SUMOylation occurs in pole cells within the gonad of developing embryos.
A large fraction of the SUMO substrates identified by global proteomics and studies in silico contribute to transcription (41, 42). Thus, SUMOylation may regulate germ-line gene expression by posttranslational modification of transcription factors. Indeed, our computational analysis reveals that the EG-EST collection contains a number of potential substrates for SUMOylation, including proteins involved in transcription (data not shown). To understand the role of SUMOylation in germ-line development, we are attempting to identify SUMO substrates with genetic and biochemical approaches.
Germ-Line-Specific Ribosomal Proteins.
Four genes (RpL22-like, RpS19b, RpS5b, and RpS13) encoding cytosolic ribosomal proteins are preferentially expressed in pole cells within the EGs (Fig. 3); RpL22-like is listed as CG9871 in Table 1. We found that three (RpL22-like, RpS19b, and RpS5b) of the four genes have paralogs (RpL22, RpS19a, and RpS5a, respectively) in the genome, and these paralogs are expressed ubiquitously throughout late embryos (14). Thus, RpL22, RpS5a, and RpS19a are used universally, whereas RpL22-like, RpS5b, and RpS19b have a specialized role in the germ line.
In addition to the ribosomal proteins, transcripts encoding translational regulators are also expressed preferentially in pole cells. For example, CG10990, which encodes a translational repressor distantly related to eIF4G and PDCD4 (43), was detected in pole cells in late embryos (Fig. 3). A Drosophila homolog of mammalian RpS13, which interacts with PDCD4 in HeLa cells (44), is expressed in the pole cells (Fig. 3). In addition, vasa, which has sequence similarity to eIF4A, is zygotically activated in pole cells during their migration to the gonads, and the transcription of RpL22-like, RpS19b, and CG10990 is activated in pole cells at nearly the same time as that of vasa (data not shown). It is interesting to note that these translation-related genes (RpS5b, RpS19b, RpL22-like, CG10990, and vasa) are also up-regulated in the germ-line stem cells of adult ovaries (45). These germ-line-specific components may be essential for the translational regulatory mechanisms required for germ-line development.
Alternatively, it is possible that the germ-line-specific ribosomal proteins carry out extraribosomal functions. It is plausible that the duplicated genes for ribosomal proteins acquire novel functions unrelated to their paralogs. This view is supported by our data that the germ-line-specific paralogs of the RpL22 and RpS19 families are more divergent than the universal ones; for example, the D. melanogaster RpL22 protein sequence is 57% identical to human RPL22, whereas germ-line-specific RpL22-like displays only 44% identity (14). Novel functions of ribosomal proteins have been reported. In human cells infected with Epstein–Barr virus, an appreciable portion of the RpL22 is not associated with ribosomes but is located in the nucleoplasm, where RpL22 binds to a small viral RNA (46). In addition, RpL22 has been identified as a protein associated with telomerase RNA (47). Thus, we speculate that the Drosophila paralogs of ribosomal proteins have acquired novel functions that contribute to germ-line development.
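The divergence figures quoted above are percent-identity values. As a minimal sketch of how such a value can be computed from a pairwise alignment, the function below counts matching residues over ungapped columns. This is an illustration only: the function name and the toy sequences are hypothetical, the alignment is assumed to have been produced beforehand by a separate tool, and the choice of denominator (ungapped columns) is one of several conventions and may differ from the one used in ref. 14.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical peptides, not real RpL22 sequences):
# five ungapped columns, four of which match
identity = percent_identity("MKV-LA", "MRVQLA")
```

Because different denominators (alignment length, shorter sequence length, ungapped columns) give different percentages, comparisons between paralogs are only meaningful when the same convention is applied to each pair.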
The mechanism for germ-line-specific expression of the paralogs of ribosomal proteins is not yet clear. The similarity of their spatiotemporal expression is consistent with these genes being regulated in a coordinated fashion by germ-line-specific transcriptional machinery. An interesting case has been reported in Ascaris lumbricoides (48). Its genome encodes both germ-line- and soma-specific ribosomal proteins homologous to RpS19. A paralog, RpS19G, is expressed predominantly in the germ line but is eliminated from the genome of all somatic cells by chromatin diminution during early development. Instead, the other paralog, RpS19S, is expressed in the soma. Thus, we speculate that the differential expression of the ribosomal protein paralogs (and probably their function) is intimately related to the regulatory mechanism underlying germ-line development.
Perspectives.
Here we describe the gene expression data obtained from our EST analysis of purified EGs. Our transcriptome data provide unique genetic information to help in the understanding of gonad development. Furthermore, the spatiotemporal expression data of the gonad-enriched genes are useful for studying the regulatory mechanism of germ-line- and gonadal soma-specific gene expression and function. The general transcription factor TfIIA-S and transcription coactivator Ssb-c31a (Fig. 3) may be involved in germ-line-specific gene regulation at the transcriptional level. Recent studies indicate that transmembrane proteins are involved in gonad morphogenesis and the establishment of the germ-line–stem-cell niche within the EG (refs. 11 and 49; Y.K., S.S., K. Arita, and S.K., unpublished data). Our list of gonad-enriched genes includes many genes encoding membrane proteins that are predicted to function in cell–cell interactions, in signaling, and as transporters (Table 1). Functional analysis of these genes will help us to understand the mechanism of gonad formation, the germ–soma interaction, and the establishment of the germ-line–stem-cell niche within this specialized organ.
Materials and Methods
Fly Stocks.
EGs were collected from EGFP-vasa embryos (15) by FACS. y w flies were used for WISH analysis. Detailed procedures for WISH are described in Supporting Text.
Construction of a cDNA Library from the FACS-Sorted EGs.
The EG was isolated from EGFP-vasa transgenic embryos at 10–18 h after egg laying, as described (14). Microscopically, >99% of the total particles obtained by FACS were gonads (the number of particles we counted was >400). The remaining particles (<1%) were small noncellular clumps. A cDNA library was generated from ≈7 μg of total RNA, which was purified from 25,000 gonads by using the SMART system (Clontech, Mountain View, CA), as described (14). Two plasmids were used for cDNA construction, the pDNR-LIB vector (Clontech) and the pGEM-T Easy vector (Promega, Madison, WI). Information about primers used for EST sequencing is available in Table 6, which is published as supporting information on the PNAS web site. The clone name and corresponding EST accession no. used for each synthesis of the RNA probe are listed in Table 7, which is published as supporting information on the PNAS web site.
EST Sequencing and Informatics.
Each EST was sequenced, processed, and annotated as described (14). Detailed information is provided in Supporting Text. Procedures for (i) the analysis of repetitive ESTs, (ii) GO-based functional annotation and its statistical analysis, and (iii) the comparison of our EG library with public EST collections are also described in Supporting Text.
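The statistical method used for the GO analysis is specified in Supporting Text and is not restated here. Purely as an illustration, one common way to ask whether a GO term is overrepresented in a gene subset is a one-sided hypergeometric test, and the sketch below assumes that test. The background of 3,051 genes and the subset of 101 genes are taken from the Abstract; the term counts (60 and 8) are hypothetical placeholders, not results from this study.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the GO term of interest."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


BACKGROUND_GENES = 3051   # genes represented in the EG library (Abstract)
SUBSET_GENES = 101        # gonad-enriched genes verified by WISH (Abstract)
TERM_IN_BACKGROUND = 60   # hypothetical: background genes annotated with the term
TERM_IN_SUBSET = 8        # hypothetical: subset genes annotated with the term

p_value = hypergeom_sf(TERM_IN_SUBSET, BACKGROUND_GENES,
                       TERM_IN_BACKGROUND, SUBSET_GENES)
expected = SUBSET_GENES * TERM_IN_BACKGROUND / BACKGROUND_GENES
print(f"expected by chance: {expected:.2f}, observed: {TERM_IN_SUBSET}, p = {p_value:.3g}")
```

When many GO terms are tested in this way, a multiple-testing correction would normally be applied to the resulting p values; which correction, if any, was used here is not stated in this section.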
Acknowledgments
We thank Dr. T. Akiyama (Azabu University, Fuchinobe, Japan) and Beckman Coulter (Fullerton, CA) for cell sorting, Dr. A. Nakamura (RIKEN, Kobe, Japan) for an anti-VASA antibody, and Mr. K. Hashiyama for WISH experiments. This work was supported in part by grants from the Ministry of Education, Culture, Sports, Science, and Technology and the National Institute of Agrobiological Sciences and by the Core Research for Evolutional Science and Technology project of the Japan Science and Technology Agency.
Abbreviations
- EG: embryonic gonad
- AG: adult gonad
- OT: other tissue
- WISH: whole-mount in situ hybridization
- GO: Gene Ontology
- NCBI: National Center for Biotechnology Information
References
- 1. Santos AC, Lehmann R. Curr Biol. 2004;14:R578–R589. doi: 10.1016/j.cub.2004.07.018.
- 2. Williamson A, Lehmann R. Annu Rev Cell Dev Biol. 1996;12:365–391. doi: 10.1146/annurev.cellbio.12.1.365.
- 3. Hay B, Jan LY, Jan YN. Cell. 1988;55:577–587. doi: 10.1016/0092-8674(88)90216-4.
- 4. Asaoka-Taguchi M, Yamada M, Nakamura A, Hanyu K, Kobayashi S. Nat Cell Biol. 1999;1:431–437. doi: 10.1038/15666.
- 5. Sonnenblick BP. In: Biology of Drosophila. Demerec M, editor. Cold Spring Harbor, NY: Cold Spring Harbor Lab Press; 1994. pp. 62–167.
- 6. Asaoka M, Lin H. Development (Cambridge, UK) 2004;131:5079–5089. doi: 10.1242/dev.01391.
- 7. DeFalco TJ, Verney G, Jenkins AB, McCaffery JM, Russell S, Van Doren M. Dev Cell. 2003;5:205–216. doi: 10.1016/s1534-5807(03)00204-1.
- 8. Mukai M, Kashikawa M, Kobayashi S. Development (Cambridge, UK) 1999;126:1023–1029. doi: 10.1242/dev.126.5.1023.
- 9. Jenkins AB, McCaffery JM, Van Doren M. Development (Cambridge, UK) 2003;130:4417–4426. doi: 10.1242/dev.00639.
- 10. Wawersik M, Milutinovich A, Casper AL, Matunis E, Williams B, Van Doren M. Nature. 2005;436:563–567. doi: 10.1038/nature03849.
- 11. Le Bras S, Van Doren M. Dev Biol. 2006;294:92–103. doi: 10.1016/j.ydbio.2006.02.030.
- 12. Andrews J, Bouffard GG, Cheadle C, Lu JN, Becker KG, Oliver B. Genome Res. 2000;10:2030–2043. doi: 10.1101/gr.10.12.2030.
- 13. Parisi M, Nuttall R, Edwards P, Minor J, Naiman D, Lu J, Doctolero M, Vainer M, Chan C, Malley J, et al. Genome Biol. 2004;5:R40. doi: 10.1186/gb-2004-5-6-r40.
- 14. Shigenobu S, Arita K, Kitadate Y, Noda C, Kobayashi S. Dev Growth Differ. 2006;48:49–57. doi: 10.1111/j.1440-169X.2006.00845.x.
- 15. Sano H, Nakamura A, Kobayashi S. Mech Dev. 2002;112:129–139. doi: 10.1016/s0925-4773(01)00654-2.
- 16. Van Doren M, Williamson AL, Lehmann R. Curr Biol. 1998;8:243–246. doi: 10.1016/s0960-9822(98)70091-0.
- 17. Cox DN, Chao A, Baker J, Chang L, Qiao D, Lin H. Genes Dev. 1998;12:3715–3727. doi: 10.1101/gad.12.23.3715.
- 18. Tazuke SI, Schulz C, Gilboa L, Fogarty M, Mahowald AP, Guichet A, Ephrussi A, Wood CG, Lehmann R, Fuller MT. Development (Cambridge, UK) 2002;129:2529–2539. doi: 10.1242/dev.129.10.2529.
- 19. Tomari Y, Du T, Haley B, Schwarz DS, Bennett R, Cook HA, Koppetsch BS, Theurkauf WE, Zamore PD. Cell. 2004;116:831–841. doi: 10.1016/s0092-8674(04)00218-1.
- 20. Cook HA, Koppetsch BS, Wu J, Theurkauf WE. Cell. 2004;116:817–829. doi: 10.1016/s0092-8674(04)00250-8.
- 21. Mukai M, Kitadate Y, Arita K, Shigenobu S, Kobayashi S. Gene Expr Patterns. 2006;6:256–266. doi: 10.1016/j.modgep.2005.08.002.
- 22. Ashburner M, Golic K, Hawley R. Drosophila, a Laboratory Handbook. Cold Spring Harbor, NY: Cold Spring Harbor Lab Press; 2005.
- 23. Brookman JJ, Toosy AT, Shashidhara LS, White RA. Development (Cambridge, UK) 1992;116:1185–1192. doi: 10.1242/dev.116.4.1185.
- 24. Ding D, Lipshitz HD. Genet Res. 1994;64:167–181. doi: 10.1017/s0016672300032833.
- 25. Leblanc P, Desset S, Giorgi F, Taddei AR, Fausto AM, Mazzini M, Dastugue B, Vaury C. J Virol. 2000;74:10658–10669. doi: 10.1128/jvi.74.22.10658-10669.2000.
- 26. Song SU, Kurkulos M, Boeke JD, Corces VG. Development (Cambridge, UK) 1997;124:2789–2798. doi: 10.1242/dev.124.14.2789.
- 27. Waksmonski SL, Woodruff RI. J Insect Physiol. 2002;48:667–675. doi: 10.1016/s0022-1910(02)00095-1.
- 28. Aravin AA, Klenov MS, Vagin VV, Bantignies F, Cavalli G, Gvozdev VA. Mol Cell Biol. 2004;24:6742–6750. doi: 10.1128/MCB.24.15.6742-6750.2004.
- 29. Sijen T, Plasterk RHA. Nature. 2003;426:310–314. doi: 10.1038/nature02107.
- 30. Buchon N, Vaury C. Heredity. 2006;96:195–202. doi: 10.1038/sj.hdy.6800789.
- 31. Waterhouse PM, Wang M-B, Lough T. Nature. 2001;411:834–842. doi: 10.1038/35081168.
- 32. Kavi HH, Fernandez HR, Xie W, Birchler JA. FEBS Lett. 2005;579:5940–5949. doi: 10.1016/j.febslet.2005.08.069.
- 33. Ronsseray S, Josse T, Boivin A, Anxolabehere D. Genetica. 2003;117:327–335. doi: 10.1023/a:1022929121828.
- 34. Aravin AA, Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, Gvozdev VA. Curr Biol. 2001;11:1017–1027. doi: 10.1016/s0960-9822(01)00299-8.
- 35. Sarot E, Payen-Groschêne G, Bucheton A, Pelisson A. Genetics. 2004;166:1313–1321. doi: 10.1534/genetics.166.3.1313.
- 36. Okamura K, Ishizuka A, Siomi H, Siomi MC. Genes Dev. 2004;18:1655–1666. doi: 10.1101/gad.1210204.
- 37. Liu Q, Rand TA, Kalidas S, Du F, Kim H-E, Smith DP, Wang X. Science. 2003;301:1921–1925. doi: 10.1126/science.1088710.
- 38. Lee YS, Nakahara K, Pham JW, Kim K, He Z, Sontheimer EJ, Carthew RW. Cell. 2004;117:69–81. doi: 10.1016/s0092-8674(04)00261-2.
- 39. Aravin AA, Lagos-Quintana M, Yalcin A, Zavolan M, Marks D, Snyder B, Gaasterland T, Meyer J, Tuschl T. Dev Cell. 2003;5:337–350. doi: 10.1016/s1534-5807(03)00228-4.
- 40. Johnson ES. Annu Rev Biochem. 2004;73:355–382. doi: 10.1146/annurev.biochem.73.011303.074118.
- 41. Zhou F, Xue Y, Lu H, Chen G, Yao X. FEBS Lett. 2005;579:3369–3375. doi: 10.1016/j.febslet.2005.04.076.
- 42. Wohlschlegel JA, Johnson ES, Reed SI, Yates JR III. J Biol Chem. 2004;279:45662–45668. doi: 10.1074/jbc.M409203200.
- 43. Yang H-S, Jansen AP, Komar AA, Zheng X, Merrick WC, Costes S, Lockett SJ, Sonenberg N, Colburn NH. Mol Cell Biol. 2003;23:26–37. doi: 10.1128/MCB.23.1.26-37.2003.
- 44. Kang M-J, Ahn H-S, Lee J-Y, Matsuhashi S, Park W-Y. Biochem Biophys Res Commun. 2002;293:617–621. doi: 10.1016/S0006-291X(02)00264-4.
- 45. Kai T, Williams D, Spradling AC. Dev Biol. 2005;283:486–502. doi: 10.1016/j.ydbio.2005.04.018.
- 46. Toczyski DP, Matera AG, Ward DC, Steitz JA. Proc Natl Acad Sci USA. 1994;91:3463–3467. doi: 10.1073/pnas.91.8.3463.
- 47. Le S, Greider CW, Sternglanz R. Mol Biol Cell. 2000;11:999–1010. doi: 10.1091/mbc.11.3.999.
- 48. Etter A, Bernard V, Kenzelmann M, Tobler H, Müller F. Science. 1994;265:954–956. doi: 10.1126/science.8052853.
- 49. Mathews WR, Ong D, Milutinovich AB, Van Doren M. Development (Cambridge, UK) 2006;133:1143–1153. doi: 10.1242/dev.02256.
- 50. Potter SS, Brorein WJ Jr, Dunsmuir P, Rubin GM. Cell. 1979;17:415–427. doi: 10.1016/0092-8674(79)90168-5.