Proceedings of the National Academy of Sciences of the United States of America. 2006 Sep 1;103(37):13728–13733. doi: 10.1073/pnas.0603767103

Molecular characterization of embryonic gonads by gene expression profiling in Drosophila melanogaster

Shuji Shigenobu *,†, Yu Kitadate *,†, Chiyo Noda *, Satoru Kobayashi *,†,§
PMCID: PMC1559405  PMID: 16950879

Abstract

In many animal species, germ-line progenitors associate with gonadal somatic cells to form the embryonic gonads (EGs) that later develop into a functional organ producing gametes. To explore the genetic regulation of germ-line development, we initiated a comprehensive identification and functional analysis of the genes expressed within the EGs. First, we generated a cDNA library from gonads purified from Drosophila embryos by FACS. Using this library, we catalogued the genes expressed in the gonad by EST analysis. A total of 17,218 high-quality ESTs representing 3,051 genes were obtained, corresponding to 20% of the predicted genes in the genome. The EG transcriptome is unexpectedly distinct from that of adult gonads and includes an extremely high proportion of retrotransposon-derived transcripts. We verified 101 genes preferentially expressed in the EGs by whole-mount in situ hybridization. Within this subset, 39 and 58 genes were expressed predominantly in germ-line and somatic cells, respectively, whereas four genes were expressed in both cell lineages. The gonad-enriched genes encompassed a variety of predicted functions. However, genes implicated in SUMOylation and protein translation, including germ-line-specific ribosomal proteins, are preferentially expressed in the germ line, whereas the expression of various retrotransposons and RNAi-related genes is more prominent in the gonadal soma. These transcriptome data are a resource for understanding the mechanism of various cellular events during germ-line development.

Keywords: expressed sequence tag, germ cell, retrotransposon, pole cell


The germ line is the only cell type that transmits genetic material from one generation to the next during sexual reproduction. In many animal species, germ-line progenitors migrate within embryos to associate with gonadal somatic cells to form the embryonic gonads (EGs) that will later develop into a fully functional organ capable of producing gametes. In Drosophila, the germ-line progenitors, or pole cells, form at the posterior pole region of early embryos (1, 2). Pole cells then migrate toward the mesodermal layer, where they associate with the specialized mesodermal cells known as somatic gonadal precursors. Eventually, the somatic cells encapsulate the pole cells to form EGs. Within the gonads, the pole cells undergo oogenesis or spermatogenesis and differentiate into germ cells during postembryonic development. Pole cells that fail to be encapsulated within the gonads eventually degenerate without producing germ cells (3).

Within the EGs, distinct cellular events associated with germ-line development occur, such as resumption of germ-line proliferation (4, 5), selection of the germ-line stem cell (6), gonad morphogenesis (7), and cellular communication between germ-line and somatic cells (8–10). Recent studies have also revealed that the male germ-line stem cell niche is already specified in the EG (ref. 11; Y.K., S.S., K. Arita, and S.K., unpublished data). Despite the importance of the EG in germ-line development, only limited information is available regarding which genes are expressed in the EG, although transcriptome data of adult testes and ovaries have accumulated (12, 13). Thus, we attempted to identify the genes expressed within the EGs by a direct and comprehensive approach. In Drosophila, transcriptome analysis of individual organs and cell types has been hampered by their small size. To overcome this problem, we have developed an efficient method to isolate EGs by flow cytometry (14). We generated a cDNA library from purified gonads and obtained 17,218 valid ESTs representing 3,051 genes, all of which were examined by whole-mount in situ hybridization (WISH). The transcripts from 101 genes were enriched in the EG. These genes encompass a wide array of molecular and biological processes, as deduced from the Gene Ontology (GO) categories in the fly database. Here, we highlight five functional categories of genes enriched in the EG and discuss their roles.

Results and Discussion

Purification of EG by FACS and Generation of ESTs.

We used FACS to isolate EG from transgenic Drosophila embryos harboring the germ-line marker EGFP-vasa (15). Embryos at 10–18 h after egg laying were homogenized without protease treatment to keep the gonad intact. From these homogenates, gonads containing both GFP-positive pole cells and GFP-negative gonadal somatic cells were separated from the remaining tissue by FACS. With this procedure, we were able to obtain a highly enriched fraction of EG, as confirmed by microscopy and quantitative PCR (14). We constructed an EG cDNA library (EG library) from poly(A)+ RNA from a pool of ≈25,000 FACS-sorted EGs. We sequenced 12,977 cDNA clones from the 5′ end and 6,755 from the 3′ end. After removing low-quality and contaminating sequences, 17,218 high-quality reads were obtained (DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank accession nos. BP540206–BP560422).

When aligned to Drosophila melanogaster genomic sequences, 15,384 (90.1%) ESTs mapped to euchromatic genomic regions and 434 (2.5%) to heterochromatic genomic regions. The remainder, 1,254 ESTs (7.4%), mapped to multiple loci within the genome; these included 974 highly repetitive sequences (≥10 hits in the genome). Compared with the public EST collections, this EG library includes a significantly higher proportion of repetitive sequences (Fig. 1A). Almost all of the repetitive sequences were derived from retrotransposons (Fig. 1B).
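The binning of ESTs by number of genome hits can be sketched as follows. This is only an illustration of the thresholds stated in the text and in Fig. 1A (one hit, two to nine hits, ten or more hits); the function names and the toy hit counts are hypothetical, and the actual pipeline is the one described in Supporting Text. The three reported mapping classes sum to 17,072 ESTs, which appears to be the denominator behind the quoted percentages.

```python
from collections import Counter


def classify_est(hit_count):
    """Assign an EST to a repetitiveness class from its number of genome hits."""
    if hit_count < 1:
        raise ValueError("an aligned EST must have at least one genome hit")
    if hit_count == 1:
        return "unique"
    if hit_count <= 9:
        return "moderately repetitive"  # two to nine hits
    return "highly repetitive"  # ten or more hits


def summarize(hit_counts):
    """Return {class: (count, percent of all ESTs)} for a list of hit counts."""
    counts = Counter(classify_est(h) for h in hit_counts)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


# Hypothetical toy input: number of genome hits for ten ESTs.
toy_hits = [1, 1, 1, 1, 1, 1, 3, 9, 10, 42]
summary = summarize(toy_hits)

# Counts reported in the text for the EG library.
reported = {"euchromatic": 15384, "heterochromatic": 434, "multiple loci": 1254}
reported_total = sum(reported.values())
```

Keeping the thresholds in a single function makes it straightforward to apply the same binning to each public EST library when comparing proportions.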

Fig. 1.


The transcriptome of the EG includes highly repetitive elements, almost all of which are derived from LTR retrotransposons. (A) ESTs with multiple hits to the genome. EST sequences that match multiple loci on the genome were classified into two groups: highly repetitive (≥10 hits to the genome) and moderately repetitive (two to nine hits). The proportions of the repetitive ESTs are shown. LD, 0- to 22-h whole embryo; LP, larva and early pupa; GH, adult head; GM, adult ovary; AT1, testis (0- to 3-day adult); AT2, testis (1- to 5-day adult); FB, fat body of larva; SG, salivary gland of pupa; MB, mbn2 cell line; and SD, Schneider L2 culture cells. The EG library includes a significantly higher proportion of repetitive sequences than other EST libraries. (B) Proportion of transposon-derived transcripts in EST libraries. A significant accumulation of LTR retrotransposons is observed in the EG library. EST libraries generated from cell lines (MB and SD) also include many highly repetitive and retrotransposon-derived sequences. In general, the number of transposable elements is increased in cell lines (50).

We aligned each EST with a reference transcript set in the Drosophila database (FlyBase, http://flybase.net) and assigned it to a gene. In total, we consolidated 17,072 ESTs derived from the EG library into a nonredundant set of 3,051 genes; these correspond to ≈20% of the predicted genes in D. melanogaster. Because our EST analysis was nearly saturating (see Supporting Text, which is published as supporting information on the PNAS web site), it covers most of the gene repertoire of the EG transcriptome. All identified genes expressed in the EG are listed in Table 3, which is published as supporting information on the PNAS web site.

Using the National Center for Biotechnology Information (NCBI) UniGene-based classification, we compared the gene sets expressed by EGs, adult gonads (AGs; ovary + testis), and other tissues (OTs). Genes represented in all three collections are regarded as those with “housekeeping” functions (1,809 UniGenes; 60.8% of EG UniGene collection). Apart from the housekeeping genes, we observed that the proportion of genes present in both the EG and AG collections was low (Fig. 2). Only 145 genes (5.0% of the EG UniGene collection) were common to EG and AG, whereas 719 genes (24.8% of EG UniGene collection) were expressed in both EGs and OTs but not AGs. Thus, the EG transcriptome is unexpectedly different from that of the AG. We conclude that the genetic regulation of germ-line development within the EGs is distinct from that underlying gametogenesis in ovaries and testes.
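The three-way comparison described above amounts to simple set operations on UniGene identifiers. The sketch below is a minimal illustration, assuming each collection has already been collapsed into a set of UniGene IDs; the function name and the toy identifiers are hypothetical and do not correspond to real UniGene entries.

```python
def three_way(eg, ag, ot):
    """Partition the EG gene set by its overlap with the AG and OT sets."""
    return {
        "housekeeping (EG, AG and OT)": eg & ag & ot,
        "EG and AG, not OT": (eg & ag) - ot,
        "EG and OT, not AG": (eg & ot) - ag,
        "EG only": eg - ag - ot,
    }


# Hypothetical toy identifiers standing in for UniGene clusters.
eg_set = {"u1", "u2", "u3", "u4", "u5", "u6"}
ag_set = {"u1", "u2", "u3", "u7"}
ot_set = {"u1", "u2", "u4", "u5", "u8"}

partition = three_way(eg_set, ag_set, ot_set)
sizes = {name: len(genes) for name, genes in partition.items()}
eg_percent = {name: round(100 * n / len(eg_set), 1) for name, n in sizes.items()}
```

The four categories are disjoint and together cover the whole EG set, so the percentages computed against the EG collection sum to 100 (up to rounding).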

Fig. 2.


Comparison of genes expressed by EGs, AGs, and OTs. The AG EST collection is a pool of adult testis libraries and adult ovary libraries. The OT EST collection is a pool of seven public EST libraries, including head, salivary gland, fat body, and culture cell EST libraries. The ESTs were collapsed into UniGene sets that were precompiled by NCBI. The three-way comparison shows that the expressed gene repertoire of the EG is quite different from that of the AG.

Overview of Comprehensive Whole-Mount in Situ Hybridization (WISH).

All 3,051 genes represented in our EG library were subjected to WISH to examine their distribution within the embryo. Overall, we obtained useful expression data for 2,388 genes. Although most of them showed ubiquitous distribution, we found that transcripts from 101 genes were enriched in the EG, as summarized in Table 1. We further examined their distribution within the gonads by double-staining embryos with an RNA probe for each transcript and an anti-VASA antibody to distinguish the germ-line and somatic expression of the transcripts within the gonads. We identified 39 RNAs that are expressed predominantly in pole cells, 58 that are expressed in gonadal somatic cells, and 4 that are expressed in both cell types (Table 1, Fig. 3).

Table 1.

EG-enriched genes identified by WISH

Gene Tissues* Functions (excerpt) FlyBase ID
Expressed in germline
    RpS13 GO[g], CNS, LG Ribosomal protein FBgn0010265
    RpS19b GO[g], PV Ribosomal protein FBgn0039129
    CG9871 GO[g], SNS Ribosomal protein FBgn0034837
    RpS5b GO[g], U Ribosomal protein FBgn0038277
    smt3 GO[g], CNS Sumoylation FBgn0026170
    Uba2 GO[g], CNS Sumoylation FBgn0029113
    lwr GO[g], CNS, U Sumoylation FBgn0010602
    Aos1 GO[g], CNS Sumoylation FBgn0029512
    Hsp27 GO[g], CNS, DV Protein folding FBgn0001226
    Hsp26 GO[g], MGL Protein folding FBgn0001225
    Hsp83 GO[g], OE, PNS, CNS, MG Protein folding FBgn0001233
    CG4415 GO[g], PV Unfolded protein binding FBgn0031296
    Cam GO[g], CNS, OE Calcium ion binding FBgn0000253
    l(1)G0269 GO[g], CNS, PNS CTD-like phosphatase FBgn0029067
    grp GO[g], CNS Protein serine/threonine kinase; cell cycle FBgn0011598
    CG2919 GO[g], BR Cytoskeleton organization and biogenesis FBgn0037348
    scra GO[g], CNS Cytoskeleton organization and biogenesis FBgn0004243
    Mapmodulin GO[g], CNS Microtubule binding FBgn0034282
    stai GO[g], CNS, PNS Microtubule binding FBgn0051641
    Mcm6 GO[g], CNS, MG, FB DNA helicase FBgn0025815
    Top2 GO[g], CNS DNA topoisomerase FBgn0003732
    Thd1 GO[g], CNS, U Pyrimidine-specific mismatch base pair DNA N-glycosylase FBgn0026869
    dUTPase GO[g], CNS, U, MG dUTP diphosphatase FBgn0013349
    TfIIA-S GO[g], CNS, U General transcription factor FBgn0013347
    Ssb-c31a GO[g], MG, BR Transcription coactivator FBgn0015299
    ovo GO[g], EP, BR Transcription factor FBgn0003028
    Fs(2)Ket GO[g], CNS, U Importin β FBgn0000986
    zpg GO[g] Innexin channel FBgn0024177
    janA GO[g], MG, U Sex differentiation FBgn0001280
    CSN3 GO[g], CNS, U Signalosome complex FBgn0027055
    Uba1 GO[g], CNS, U Ubiquitin-activating enzyme FBgn0023143
    CG10990 GO[g] Translation elongation factor FBgn0030520
    vas GO[g] RNA helicase FBgn0003970
    CG11329 GO[g] FBgn0031848
    CG15930 GO[g] FBgn0029754
    CG18213 GO[g] FBgn0038470
    CG12576 GO[g], CNS, MG FBgn0031190
    CG14346 GO[g], U FBgn0031337
    Unnamed gene GO[g], U FBgn0058460
Expressed in somatic line
    412 GO[s] Retrotransposon FBgn0000006
    297 GO[s], CNS, U Retrotransposon FBgn0000005
    17.6 GO[s], DV, PH Retrotransposon FBgn0000004
    mdg1 GO[s] Retrotransposon FBgn0002697
    Quasimodo GO[s] Retrotransposon FBgn0062261
    Stalker GO[s], U Retrotransposon FBgn0064138
    Stalker2 GO[s], PV Retrotransposon FBgn0063399
    Tabor GO[s], SNS Retrotransposon FBgn0045970
    Tirant GO[s] Retrotransposon FBgn0004082
    ZAM GO[s], U Retrotransposon FBgn0023131
    gtwin GO[s] Retrotransposon FBgn0063436
    armi GO[s], FB RNA interference FBgn0041164
    Dcr-2 GO[s], U RNA interference FBgn0034246
    piwi GO[s] RNA interference FBgn0004872
    CG8908 GO[s], U ABC transporter FBgn0034493
    CG30359 GO[s], FB Carbohydrate metabolism; transporter FBgn0050359
    CG3036 GO[s], PNS, GC Membrane protein; transporter FBgn0031645
    CG9935 GO[s], SNS, LG, PE Membrane protein; transporter FBgn0039916
    CG11537 GO[s], CNS, SG Membrane protein; transporter FBgn0035400
    CG1599 GO[s], U Plasma membrane protein; v-SNARE FBgn0033452
    CG3074 GO[s], SNS Cathepsin B; proteolysis FBgn0034709
    CG9634 GO[s], U Proteolysis FBgn0027528
    m1 GO[s], U Serine-type endopeptidase inhibitor FBgn0002578
    Fas1 GO[s], CNS, PNS Cell adhesion FBgn0000634
    l(2)03709 GO[s], MG, FB, MU Cell cycle, DNA metabolism FBgn0010551
    Wnt6 GO[s], MG, MP, GC, MG frizzled-2 signaling FBgn0031902
    skf GO[s], U Plasma membrane protein; signal transduction FBgn0050021
    lbm GO[s], CNS, PNS Tetraspanin; receptor signaling protein FBgn0016032
    CG7194 GO[s] Gonad development FBgn0035868
    M(2)21AB GO[s], RG, MG Methionine adenosyltransferase FBgn0005278
    mRpS24 GO[s], FB, MG Mitochondrial ribosomal protein FBgn0039159
    Mocs1 GO[s], MG Mo-molybdopterin cofactor biosynthesis FBgn0036122
    mud GO[s] Mushroom body development FBgn0002873
    Pros45 GO[s], CNS Proteasome complex FBgn0020369
    CG10565 GO[s], MG, PV, FB Protein folding; nucleic acid binding FBgn0037051
    stg GO[s] Protein tyrosine/serine/threonine phosphatase; cell cycle FBgn0003525
    CG5800 GO[s], SNS, MG RNA helicase FBgn0030855
    B52 GO[s], U RNA splicing factor activity FBgn0004587
    CG11447 GO[s], MG, ES, HG rRNA (uridine-2′-O-)-methyltransferase FBgn0038737
    zfh1 GO[s], CNS, PNS Transcription factor FBgn0004606
    stc GO[s], FB Transcription factor FBgn0001978
    esg GO[s], HIB Transcription factor FBgn0001981
    ftz-f1 GO[s], PV Transcription factor FBgn0001078
    neur GO[s], CNS, FB Ubiquitin-protein ligase FBgn0002932
    novel gene GO[s], CNS, PNS FGM222E05
    novel gene GO[s], CNS, U FGC026A04
    CG15784 GO[s], CNS FBgn0029766
    CG7267 GO[s], FB, U FBgn0030079
    CG33047 GO[s], GRL FBgn0053047
    CG6014 GO[s], HG FBgn0027542
    CG7498 GO[s], LG, SNS, FB FBgn0040833
    CG7224 GO[s], MG, GC, PV FBgn0031971
    CG5541 GO[s], MG, HG, ES, PV FBgn0030603
    CG14998 GO[s], PNS, FB FBgn0035500
    CG11050 GO[s], U FBgn0031836
    dpr17 GO[s], U FBgn0051361
    CG14072 GO[s], U FBgn0032318
Expressed in germline and somatic line
    Su(var)205 GO[g/s], CNS Chromatin binding FBgn0003607
    Df31 GO[g/s], CNS, PNS, misc, HIB Histone binding FBgn0022893
    ran GO[g/s], CNS, BR, FB Ras GTPase FBgn0020255
    14-3-3ε GO[g/s], CNS, U Ras protein signal transduction FBgn0020238

*BR, brain; DV, dorsal vessel; EP, epidermis; ES, esophagus; FB, fat body; GC, gastric caeca; GO, gonad; HG, hindgut; HIB, histoblast; LG, lymph gland; MG, midgut; MGL, midline glial cell; MP, Malpighian tubule; MU, muscle; OE, oenocyte; PE, pericardial cells; PH, pharynx; PNS, peripheral nervous system; PV, proventriculus; RG, ring gland; SG, salivary gland; SNS, stomatogastric nervous system; U, weak signal is ubiquitously detected; g and s in brackets indicate germ-line and somatic line expression in the gonad.

Instead of FlyBase ID, the clone name is shown for the novel gene.

Fig. 3.


Spatial expression patterns of gonad-enriched transcripts. Embryos at stage 14–16 are shown with anterior to the left. Gonads are indicated by arrowheads, and gene names are shown in the bottom right of each image. Insets provide the confocal microscopic image of the gonad double-stained with an antisense RNA probe for the indicated gene (red) and anti-Vasa (green), a germ-line marker.

We investigated the temporal expression patterns of transcripts enriched in pole cells by WISH. Embryos at various developmental stages were examined (Fig. 4, which is published as supporting information on the PNAS web site), and three major expression patterns were extracted (Types I, II, and III). Transcripts with the Type I expression pattern are first observed in the pole cells during their migration through the posterior midgut epithelium and remain detectable after the coalescence of the gonads. Transcripts from vasa, RpL22-like, RpS19b, CG10990, CG4415, TfIIA-S, and Ssb-c31a exhibit this type of expression. Because the pole cells are transcriptionally inactive until they migrate (16), these transcripts are some of the earliest zygotic transcripts in the pole cells. Given that their transcription is initiated in the pole cells before coalescing with the gonadal somatic cells, we speculate that their expression is autonomously initiated by maternal factors partitioned into the pole cells, rather than by an inductive signal from the gonadal soma. Indeed, the expression of some Type I genes was also detectable in “lost” pole cells that failed to be incorporated within the gonads. Transcripts with a Type II expression pattern are observed in various tissues before gonad formation but are enriched in pole cells after they associate with the gonadal somatic cells. Transcripts for smt3, Uba2, lwr, Top2, and grp display this type of expression pattern. Type III expression includes transcripts that accumulate in the pole cells throughout embryogenesis. These transcripts present in the early pole cells are presumably maternal in origin, whereas zygotic transcription may occur at later stages. This type includes transcripts from ovo, stai, Hsp26, Hsp27, Hsp83, and zpg.

Functional Classification of EG-Enriched Genes.

To characterize the EG transcriptome, we assigned GO terms to each EG-enriched gene according to FlyBase annotations. As shown in Table 4, which is published as supporting information on the PNAS web site, EG-enriched genes represent a broad range of biological and molecular functions. Our statistical analysis showed that some of the categories were significantly overrepresented in the list of EG-enriched genes (Table 2). Among them, five categories are highlighted and discussed in detail.

Table 2.

Functional classification of gonad-enriched genes based on GO terms

GO name (GO ID) No. genes enriched in EG No. genes in the genome P value Genes
Biological process
    Germ cell development (0007281) 5 157 0.012 zpg, vas, armi, 14-3-3ε, piwi
    DNA replication (0006260) 5 128 5.0E-03 armi, l(2)03709, Mcm6, Top2, Thd1
    Protein folding (0006457) 4 134 0.029 CG10565, Hsp26, Hsp27, Hsp83
    Protein import into nucleus (0006606) 3 49 0.0087 Fs(2)Ket, lwr, smt3
    Response to heat (0009408) 3 51 0.0097 Hsp26, Hsp27, Hsp83
    RNA interference (0016246) 3 11 9.9E-05 armi, Dcr-2, piwi
Molecular function
    Microtubule binding (0008017) 3 76 0.028 Mapmodulin, scra, stai
    SUMO-activating enzyme activity (0019948) 2 2 7.4E-05 Aos1, Uba2
Cellular component
    Cytosolic ribosome (0005830) 4 99 0.011 CG9871, RpS13, RpS19b, RpS5b

Overrepresented categories are chosen.
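One common way to obtain over-representation P values of the kind listed in Table 2 is a one-sided hypergeometric test. The sketch below is an assumption for illustration only: the statistical procedure actually used is described in Supporting Text and is not reproduced here, and the background gene count (`BACKGROUND_GENES`) is a hypothetical placeholder, so the values computed need not match those in Table 2.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided hypergeometric P value, P(X >= k).

    k: genes of the category found in the enriched list
    n: size of the enriched list
    K: genes of the category in the background
    N: size of the background
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical background size; the true value depends on how many genes
# carry GO annotations and is not stated in the text.
BACKGROUND_GENES = 15000
EG_ENRICHED = 101  # number of EG-enriched genes identified by WISH

# RNA interference: 3 EG-enriched genes out of 11 annotated in the genome.
p_rnai = enrichment_p(3, EG_ENRICHED, 11, BACKGROUND_GENES)

# SUMO-activating enzyme activity: 2 of 2.
p_sumo = enrichment_p(2, EG_ENRICHED, 2, BACKGROUND_GENES)
```

Exact integer binomial coefficients are used so that no statistics library is required; for large screens a library routine would be a reasonable substitute.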

Germ-Line Development.

We found that genes in the GO category “germ cell development” were overrepresented in the list of EG-enriched genes (Table 2). It is generally expected that genes responsible for germ-line development are predominantly expressed in the gonads. However, their functions are known to be required within the AGs. For example, piwi is expressed in the somatic cells adjacent to germ-line stem cells and is essential for stem cell self renewal (17). zpg is required for survival of differentiating early germ cells in AGs (18), and armi represses oskar translation in ovaries and Ste expression in testes (19, 20). Although their functions during embryogenesis are unclear, these genes were expressed in the EGs. A similar precocious expression has been reported for meiotic genes; a subset of the genes responsible for meiotic division is expressed in pole cells during embryogenesis, whereas meiosis is initiated later at the postembryonic stages (21). It is possible that transcription of these gametogenesis-related genes initiates in the EGs, but posttranscriptional repression restricts the function of these genes until the onset of gametogenesis. Although we cannot exclude the possibility that these genes may have additional functions, our observations are consistent with the notion that the EG acquires at least a part of the potential to carry out gametogenesis.

LTR Retrotransposons.

We observed that a surprisingly large number of EG ESTs (>1,000 ESTs) were derived from retrotransposons; this population corresponds to 7% of the EG EST collection. This proportion was significantly larger than in other public EST collections (Fig. 1B). Thus, retrotransposons are predominantly expressed in the EGs. Approximately 100 families of retrotransposons have been identified in the Drosophila genome (22). Our EST analysis detected transcripts from various types of retrotransposons (30 families) but was dominated by those with LTRs (23 families; Table 5, which is published as supporting information on the PNAS web site). The WISH experiments reveal that at least 11 LTR retrotransposons, 17.6, 297, 412, gtwin, mdg1, Quasimodo, Stalker, Stalker2, Tabor, ZAM, and Tirant, are expressed predominantly in the EGs (Fig. 3). These observations are in accordance with previous reports showing that transcripts for 17.6, 412, mdg1, 297, and gypsy accumulate in the EGs (23, 24). It is interesting to note that these transcripts were all detected in gonadal somatic cells rather than in pole cells (Fig. 3). Thus, we conclude that hyperexpression in the gonadal somatic cells is a common feature of various types of LTR retrotransposons.

The significance of the retrotransposon expression in gonadal somatic cells is unclear. Expression and retrotransposition in the germ line would be a more effective strategy for retrotransposons to propagate themselves in a heritable manner from one generation to the next. An interesting case has been reported in a specific strain called RevI, in which the retrotransposon ZAM is expressed in the follicle cells of the adult ovaries and forms virus-like particles that transfer to neighboring oocytes (25). A similar transfer has been reported for the virus-like particles originating from the gypsy retrotransposon in the ovaries of flamenco mutant females (26). This translocation of virus-like particles may couple with yolk transfer from follicle cells to the oocytes by exo- and endocytosis and/or through gap junctions (25, 27). Retrotransposons may exploit the intimate link between the follicle cells and the oocytes to obtain additional access to gametes. This somatic expression may circumvent a host defense against retrotransposons in the germ line (28, 29). However, it is worthwhile to note that the expression of ZAM and gypsy retrotransposons is detectable only in certain genetic backgrounds, such as RevI and the flamenco mutant, respectively (25, 27). Thus, their expression is normally repressed in the follicle cells. In contrast, in our experiments, transcripts from various retrotransposons are preferentially expressed in gonadal somatic cells during normal embryogenesis. One possibility is that their early transcription is regulated differently, and the transcripts are inactivated by a posttranscriptional regulatory mechanism (see below).

RNAi.

Among the Drosophila genes, 11 are annotated to be associated with “RNAi” (FlyBase), a mechanism by which dsRNA induces gene silencing. Transcripts from nine RNAi-related genes are constituents of the EG EST collection. Among them, three genes, piwi, armi, and Dcr-2, were found by WISH to be expressed predominantly in the EGs (Fig. 3). These transcripts were all detectable in the gonadal somatic cells rather than pole cells in the EGs (Fig. 3). The functions of piwi and armi have been investigated in the AG. piwi is an Argonaute-family gene necessary for germ-line stem cell renewal in ovaries (17), and armi is required for polarization of the oocyte (20) and for silencing of the Stellate gene in male germ cells (19). However, the functions of these genes in somatic gonadal cells during embryogenesis are not yet clear. We propose that an RNAi-mediated gene silencing mechanism is active in the somatic cells of the EGs.

RNAi-mediated mechanisms contribute to host defenses against transposons and viruses (30–32). A subset of mutations that disable the RNAi mechanism mobilizes families of transposable elements. For example, the LTR retrotransposons gypsy and ZAM are regulated by a mechanism that depends on piwi (30–32). In Drosophila, transposons and repeated sequences, including P-element, Stellate, I-element, and gypsy, are repressed by a trans-silencing mechanism termed “cosuppression” that targets any transposons containing homologous sequences to the “trigger” transcripts by small interfering RNA (siRNA; refs. 33–35). Based on the aforementioned observations that transcripts from various LTR retrotransposons and the RNAi-related genes are both enriched in the gonadal somatic cells of the embryos, we hypothesize that LTR-retrotransposon transcripts would be the “trigger”: they are processed by the RNAi pathway to produce siRNA, which in turn silences the retrotransposons in the following developmental stages.

This hypothesis is supported by our observation that, of the two Drosophila dicer homologs, Dcr-2 but not Dcr-1 is predominantly expressed in the EGs: Dcr-2 is responsible for the production of small interfering RNA (siRNA) from dsRNA, whereas Dcr-1 functions in microRNA-triggered gene silencing (36–38). A recent analysis of the small RNAs expressed during Drosophila embryogenesis has identified a large number of repeat-associated siRNAs, which are complementary to repetitive elements, including retrotransposons (39). Although the distribution of these small RNAs remains unclear, it is likely that Dcr-2 and the other RNAi-related genes process transcripts from the LTR retrotransposons in the gonadal soma. Further studies examining the role of these RNAi-related genes in the EGs are required to investigate this hypothesis.

SUMOylation.

We found that almost all of the components required for SUMOylation are expressed predominantly in pole cells. SUMO is a member of the ubiquitin-like protein family that regulates cellular function by binding covalently to a variety of target proteins (40). SUMO (smt3) is one of the most highly represented transcripts in the EG EST collection (80 ESTs; Table 3). Our WISH analysis revealed that smt3 RNA is enriched in pole cells as well as in the CNS (Fig. 3). Similarly, transcripts for the E1 and E2 components, which are encoded by the Uba2, Aos1, and lwr genes, are all concentrated in the pole cells (Fig. 3). In addition, these transcripts exhibit quite similar temporal expression (Type II; see above). The common spatiotemporal expression pattern suggests that SUMOylation occurs in pole cells within the gonad of developing embryos.

A large fraction of the SUMO substrates identified by global proteomics and studies in silico contribute to transcription (41, 42). Thus, SUMOylation may regulate germ-line gene expression by posttranslational modification of transcription factors. Indeed, our computational analysis reveals that the EG-EST collection contains a number of potential substrates for SUMOylation, including proteins involved in transcription (data not shown). To understand the role of SUMOylation in germ-line development, we are attempting to identify SUMO substrates with genetic and biochemical approaches.

Germ-Line-Specific Ribosomal Proteins.

Four genes (RpL22-like, RpS19b, RpS5b, and RpS13) encoding cytosolic ribosomal proteins are preferentially expressed in pole cells within the EGs (Fig. 3). We found that three (RpL22-like, RpS19b, and RpS5b) of the four genes have paralogs (RpL22, RpS19a, and RpS5a, respectively) in the genome, and these paralogs are expressed ubiquitously throughout late embryos (14). Thus, RpL22, RpS5a, and RpS19a are used universally, and RpL22-like, RpS5b, and RpS19b have a specialized role in the germ line.

In addition to the ribosomal proteins, transcripts encoding translational regulators also are expressed preferentially in pole cells. For example, CG10990, which encodes a translational repressor distantly related to eIF4G and PDCD4 (43), was detected in pole cells in late embryos (Fig. 3). A Drosophila homolog of mammalian RpS13, which interacts with PDCD4 in HeLa cells (44), is expressed in the pole cells (Fig. 3). In addition, vasa, which has sequence similarity to eIF4A, is zygotically activated in pole cells during their migration to the gonads, and transcription of RpL22-like, RpS19b, and CG10990 is activated in pole cells at nearly the same time as that of vasa (data not shown). It is interesting to note that these translation-related genes (RpS5b, RpS19b, RpL22-like, CG10990, and vasa) are also up-regulated in the germ-line stem cells of adult ovaries (45). These germ-line-specific components may be essential for the translational regulatory mechanisms required for germ-line development.

Alternatively, it is possible that the germ-line-specific ribosomal proteins carry out extraribosomal functions. It is plausible that the duplicated genes for ribosomal proteins acquire novel functions unrelated to their paralogs. This view is supported by our data that the germ-line-specific paralogs of the RpL22 and RpS19 families are more divergent than the universal ones; for example, the D. melanogaster RpL22 protein sequence is 57% identical to human RPL22, whereas germ-line-specific RpL22-like displays only 44% identity (14). Novel functions of ribosomal proteins have been reported. In human cells infected with Epstein–Barr virus, an appreciable portion of the RpL22 is not associated with ribosomes but is located in the nucleoplasm, where RpL22 binds to a small viral RNA (46). In addition, RpL22 has been identified as a protein associated with telomerase RNA (47). Thus, we speculate that the Drosophila paralogs of ribosomal proteins have acquired novel functions that contribute to germ-line development.
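The percent-identity figures quoted above depend on how identity is counted. The sketch below shows one common convention, assumed here for illustration: identical residues divided by the number of alignment columns in which neither sequence has a gap. The convention used for the published values (14) is not stated in this text, and the sequences below are hypothetical toy strings, not RpL22 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 6 gap-free columns, 5 identical residues.
toy_a = "MKV-LTA"
toy_b = "MKVALSA"
toy_identity = percent_identity(toy_a, toy_b)
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different numbers, so comparisons such as 57% versus 44% are meaningful only when both values use the same denominator.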

The mechanism for germ-line-specific expression of the paralogs of ribosomal proteins is not yet clear. The similarity of their spatiotemporal expression is consistent with these genes being regulated in a coordinated fashion by germ-line-specific transcriptional machinery. An interesting case has been reported in Ascaris lumbricoides (48). Its genome encodes both germ-line- and soma-specific ribosomal proteins homologous to RpS19. A paralog, RpS19G, is expressed predominantly in the germ line but is eliminated from the genome of all somatic cells by chromatin diminution during early development. Instead, the other paralog, RpS19S, is expressed in the soma. Thus, we speculate that the differential expression of the ribosomal protein paralogs (and probably their function) is intimately related to the regulatory mechanism underlying germ-line development.

Perspectives.

Here we describe the gene expression data obtained from our EST analysis of purified EGs. Our transcriptome data provide unique genetic information to help in the understanding of gonad development. Furthermore, the spatiotemporal expression data of the gonad-enriched genes are useful for studying the regulatory mechanism of germ-line- and gonadal soma-specific gene expression and function. The general transcription factor TfIIA-S and transcription coactivator Ssb-c31a (Fig. 3) may be involved in germ-line-specific gene regulation at the transcriptional level. Recent studies indicate that transmembrane proteins are involved in gonad morphogenesis and the establishment of the germ-line–stem-cell niche within the EG (refs. 11 and 49; Y.K., S.S., K. Arita, and S.K., unpublished data). Our list of gonad-enriched genes includes many genes encoding membrane proteins that are predicted to function in cell–cell interactions, in signaling, and as transporters (Table 1). Functional analysis of these genes will help us to understand the mechanism of gonad formation, the germ–soma interaction, and the establishment of the germ-line–stem-cell niche within this specialized organ.

Materials and Methods

Fly Stocks.

EGs were collected from EGFP-vasa embryos (15) by FACS. y w flies were used for WISH analysis. Detailed procedures for WISH are described in Supporting Text.

Construction of a cDNA Library from the FACS-Sorted EGs.

EGs were isolated from EGFP-vasa transgenic embryos at 10–18 h after egg laying, as described (14). By microscopic inspection, >99% of the particles obtained by FACS were gonads (>400 particles counted). The remaining particles (<1%) were small noncellular clumps. A cDNA library was generated from ≈7 μg of total RNA, which was purified from 25,000 gonads, by using the SMART system (Clontech, Mountain View, CA), as described (14). Two plasmids were used for cDNA construction, the pDNR-LIB vector (Clontech) and the pGEM-T Easy vector (Promega, Madison, WI). Information about the primers used for EST sequencing is available in Table 6, which is published as supporting information on the PNAS web site. The clone names and corresponding EST accession numbers used for synthesis of each RNA probe are listed in Table 7, which is published as supporting information on the PNAS web site.
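As a rough consistency check on the yield reported above, the mean RNA content per gonad can be estimated from the two figures given (≈7 μg of total RNA, 25,000 gonads). The Python sketch below is illustrative only: the function name is hypothetical, and the assumption that RNA recovery was uniform across gonads is a simplification that is not part of the described protocol.

```python
def rna_per_gonad_pg(total_rna_ug: float, n_gonads: int) -> float:
    """Return the mean total RNA per gonad in picograms.

    Assumes uniform recovery across gonads (an illustrative simplification,
    not a statement about the actual purification).
    """
    if n_gonads <= 0:
        raise ValueError("n_gonads must be positive")
    # 1 microgram = 1e6 picograms
    return total_rna_ug * 1e6 / n_gonads


# Values reported in the text: about 7 ug of total RNA from 25,000 gonads.
estimate = rna_per_gonad_pg(7.0, 25_000)
print(f"~{estimate:.0f} pg total RNA per gonad")
```

Under these assumptions the estimate is on the order of a few hundred picograms of total RNA per gonad, which is why tens of thousands of sorted gonads were pooled for library construction.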

EST Sequencing and Informatics.

Each EST was sequenced, processed, and annotated as described (14). Detailed information is provided in Supporting Text. Details of the bioinformatics analyses, namely (i) the analysis of repetitive ESTs, (ii) functional annotation based on GO and its statistical analysis, and (iii) the comparison between public EST collections and our EG library, are also available in Supporting Text.
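The statistical procedure used for the GO analysis is specified only in Supporting Text. For orientation, the following Python sketch shows one commonly used approach to GO term over-representation, an upper-tail hypergeometric test. It is an assumption that such a test resembles the analysis performed here. The function name, the choice of the 3,051 EST-represented genes as the background, and the GO category counts in the usage example are hypothetical and serve only to illustrate the calculation.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, hits: int) -> float:
    """Upper-tail hypergeometric P value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     number of genes in the test set
    hits:       test-set genes carrying the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    lower = max(hits, 0)
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


# Hypothetical usage: background of 3,051 genes, test set of 101
# gonad-enriched genes; the GO term counts (60 and 8) are invented.
p_value = hypergeom_enrichment_p(population=3051, annotated=60,
                                 sample=101, hits=8)
print(f"P(X >= 8) = {p_value:.3g}")
```

When many GO terms are tested in this way, the resulting P values would normally need a multiple-testing correction, which this sketch omits.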

Supplementary Material

Supporting Information

Acknowledgments

We thank Dr. T. Akiyama (Azabu University, Fuchinobe, Japan) and Beckman Coulter (Fullerton, CA) for cell sorting, Dr. A. Nakamura (RIKEN, Kobe, Japan) for an anti-VASA antibody, and Mr. K. Hashiyama for WISH experiments. This work was supported in part by grants from the Ministry of Education, Culture, Sports, Science, and Technology and the National Institute of Agrobiological Sciences and by the Core Research for Evolutional Science and Technology project of the Japan Science and Technology Agency.

Abbreviations

EG, embryonic gonad
AG, adult gonad
OT, other tissue
WISH, whole-mount in situ hybridization
GO, Gene Ontology
NCBI, National Center for Biotechnology Information

Footnotes

Conflict of interest statement: No conflicts declared.

This paper was submitted directly (Track II) to the PNAS office.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. BP540206–BP560422).

References

  • 1. Santos AC, Lehmann R. Curr Biol. 2004;14:R578–R589. doi: 10.1016/j.cub.2004.07.018.
  • 2. Williamson A, Lehmann R. Annu Rev Cell Dev Biol. 1996;12:365–391. doi: 10.1146/annurev.cellbio.12.1.365.
  • 3. Hay B, Jan LY, Jan YN. Cell. 1988;55:577–587. doi: 10.1016/0092-8674(88)90216-4.
  • 4. Asaoka-Taguchi M, Yamada M, Nakamura A, Hanyu K, Kobayashi S. Nat Cell Biol. 1999;1:431–437. doi: 10.1038/15666.
  • 5. Sonnenblick BP. In: Biology of Drosophila. Demerec M, editor. Cold Spring Harbor, NY: Cold Spring Harbor Lab Press; 1994. pp. 62–167.
  • 6. Asaoka M, Lin H. Development (Cambridge, UK) 2004;131:5079–5089. doi: 10.1242/dev.01391.
  • 7. DeFalco TJ, Verney G, Jenkins AB, McCaffery JM, Russell S, Van Doren M. Dev Cell. 2003;5:205–216. doi: 10.1016/s1534-5807(03)00204-1.
  • 8. Mukai M, Kashikawa M, Kobayashi S. Development (Cambridge, UK) 1999;126:1023–1029. doi: 10.1242/dev.126.5.1023.
  • 9. Jenkins AB, McCaffery JM, Van Doren M. Development (Cambridge, UK) 2003;130:4417–4426. doi: 10.1242/dev.00639.
  • 10. Wawersik M, Milutinovich A, Casper AL, Matunis E, Williams B, Van Doren M. Nature. 2005;436:563–567. doi: 10.1038/nature03849.
  • 11. Le Bras S, Van Doren M. Dev Biol. 2006;294:92–103. doi: 10.1016/j.ydbio.2006.02.030.
  • 12. Andrews J, Bouffard GG, Cheadle C, Lu JN, Becker KG, Oliver B. Genome Res. 2000;10:2030–2043. doi: 10.1101/gr.10.12.2030.
  • 13. Parisi M, Nuttall R, Edwards P, Minor J, Naiman D, Lu J, Doctolero M, Vainer M, Chan C, Malley J, et al. Genome Biol. 2004;5:R40. doi: 10.1186/gb-2004-5-6-r40.
  • 14. Shigenobu S, Arita K, Kitadate Y, Noda C, Kobayashi S. Dev Growth Differ. 2006;48:49–57. doi: 10.1111/j.1440-169X.2006.00845.x.
  • 15. Sano H, Nakamura A, Kobayashi S. Mech Dev. 2002;112:129–139. doi: 10.1016/s0925-4773(01)00654-2.
  • 16. Van Doren M, Williamson AL, Lehmann R. Curr Biol. 1998;8:243–246. doi: 10.1016/s0960-9822(98)70091-0.
  • 17. Cox DN, Chao A, Baker J, Chang L, Qiao D, Lin H. Genes Dev. 1998;12:3715–3727. doi: 10.1101/gad.12.23.3715.
  • 18. Tazuke SI, Schulz C, Gilboa L, Fogarty M, Mahowald AP, Guichet A, Ephrussi A, Wood CG, Lehmann R, Fuller MT. Development (Cambridge, UK) 2002;129:2529–2539. doi: 10.1242/dev.129.10.2529.
  • 19. Tomari Y, Du T, Haley B, Schwarz DS, Bennett R, Cook HA, Koppetsch BS, Theurkauf WE, Zamore PD. Cell. 2004;116:831–841. doi: 10.1016/s0092-8674(04)00218-1.
  • 20. Cook HA, Koppetsch BS, Wu J, Theurkauf WE. Cell. 2004;116:817–829. doi: 10.1016/s0092-8674(04)00250-8.
  • 21. Mukai M, Kitadate Y, Arita K, Shigenobu S, Kobayashi S. Gene Expr Patterns. 2006;6:256–266. doi: 10.1016/j.modgep.2005.08.002.
  • 22. Ashburner M, Golic K, Hawley R. Drosophila, a Laboratory Handbook. Cold Spring Harbor, NY: Cold Spring Harbor Lab Press; 2005.
  • 23. Brookman JJ, Toosy AT, Shashidhara LS, White RA. Development (Cambridge, UK) 1992;116:1185–1192. doi: 10.1242/dev.116.4.1185.
  • 24. Ding D, Lipshitz HD. Genet Res. 1994;64:167–181. doi: 10.1017/s0016672300032833.
  • 25. Leblanc P, Desset S, Giorgi F, Taddei AR, Fausto AM, Mazzini M, Dastugue B, Vaury C. J Virol. 2000;74:10658–10669. doi: 10.1128/jvi.74.22.10658-10669.2000.
  • 26. Song SU, Kurkulos M, Boeke JD, Corces VG. Development (Cambridge, UK) 1997;124:2789–2798. doi: 10.1242/dev.124.14.2789.
  • 27. Waksmonski SL, Woodruff RI. J Insect Physiol. 2002;48:667–675. doi: 10.1016/s0022-1910(02)00095-1.
  • 28. Aravin AA, Klenov MS, Vagin VV, Bantignies F, Cavalli G, Gvozdev VA. Mol Cell Biol. 2004;24:6742–6750. doi: 10.1128/MCB.24.15.6742-6750.2004.
  • 29. Sijen T, Plasterk RHA. Nature. 2003;426:310–314. doi: 10.1038/nature02107.
  • 30. Buchon N, Vaury C. Heredity. 2006;96:195–202. doi: 10.1038/sj.hdy.6800789.
  • 31. Waterhouse PM, Wang M-B, Lough T. Nature. 2001;411:834–842. doi: 10.1038/35081168.
  • 32. Kavi HH, Fernandez HR, Xie W, Birchler JA. FEBS Lett. 2005;579:5940–5949. doi: 10.1016/j.febslet.2005.08.069.
  • 33. Ronsseray S, Josse T, Boivin A, Anxolabehere D. Genetica. 2003;117:327–335. doi: 10.1023/a:1022929121828.
  • 34. Aravin AA, Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, Gvozdev VA. Curr Biol. 2001;11:1017–1027. doi: 10.1016/s0960-9822(01)00299-8.
  • 35. Sarot E, Payen-Groschêne G, Bucheton A, Pelisson A. Genetics. 2004;166:1313–1321. doi: 10.1534/genetics.166.3.1313.
  • 36. Okamura K, Ishizuka A, Siomi H, Siomi MC. Genes Dev. 2004;18:1655–1666. doi: 10.1101/gad.1210204.
  • 37. Liu Q, Rand TA, Kalidas S, Du F, Kim H-E, Smith DP, Wang X. Science. 2003;301:1921–1925. doi: 10.1126/science.1088710.
  • 38. Lee YS, Nakahara K, Pham JW, Kim K, He Z, Sontheimer EJ, Carthew RW. Cell. 2004;117:69–81. doi: 10.1016/s0092-8674(04)00261-2.
  • 39. Aravin AA, Lagos-Quintana M, Yalcin A, Zavolan M, Marks D, Snyder B, Gaasterland T, Meyer J, Tuschl T. Dev Cell. 2003;5:337–350. doi: 10.1016/s1534-5807(03)00228-4.
  • 40. Johnson ES. Annu Rev Biochem. 2004;73:355–382. doi: 10.1146/annurev.biochem.73.011303.074118.
  • 41. Zhou F, Xue Y, Lu H, Chen G, Yao X. FEBS Lett. 2005;579:3369–3375. doi: 10.1016/j.febslet.2005.04.076.
  • 42. Wohlschlegel JA, Johnson ES, Reed SI, Yates JR III. J Biol Chem. 2004;279:45662–45668. doi: 10.1074/jbc.M409203200.
  • 43. Yang H-S, Jansen AP, Komar AA, Zheng X, Merrick WC, Costes S, Lockett SJ, Sonenberg N, Colburn NH. Mol Cell Biol. 2003;23:26–37. doi: 10.1128/MCB.23.1.26-37.2003.
  • 44. Kang M-J, Ahn H-S, Lee J-Y, Matsuhashi S, Park W-Y. Biochem Biophys Res Commun. 2002;293:617–621. doi: 10.1016/S0006-291X(02)00264-4.
  • 45. Kai T, Williams D, Spradling AC. Dev Biol. 2005;283:486–502. doi: 10.1016/j.ydbio.2005.04.018.
  • 46. Toczyski DP, Matera AG, Ward DC, Steitz JA. Proc Natl Acad Sci USA. 1994;91:3463–3467. doi: 10.1073/pnas.91.8.3463.
  • 47. Le S, Greider CW, Sternglanz R. Mol Biol Cell. 2000;11:999–1010. doi: 10.1091/mbc.11.3.999.
  • 48. Etter A, Bernard V, Kenzelmann M, Tobler H, Müller F. Science. 1994;265:954–956. doi: 10.1126/science.8052853.
  • 49. Mathews WR, Ong D, Milutinovich AB, Van Doren M. Development (Cambridge, UK) 2006;133:1143–1153. doi: 10.1242/dev.02256.
  • 50. Potter SS, Brorein WJ Jr, Dunsmuir P, Rubin GM. Cell. 1979;17:415–427. doi: 10.1016/0092-8674(79)90168-5.
