Abstract
The gatae gene of Strongylocentrotus purpuratus is orthologous to vertebrate gata-4,5,6 genes. This gene is expressed in the endomesoderm in the blastula and later the gut of the embryo, and is required for normal development. A gatae BAC containing a GFP reporter knocked into exon one of the gene was able to reproduce all aspects of endogenous gatae expression in the embryo. To identify putative gatae cis-regulatory modules we carried out an interspecific sequence conservation analysis with respect to a Lytechinus variegatus gatae BAC, which revealed 25 conserved non-coding sequence patches. These were individually tested in gene transfer experiments, and two modules capable of driving localized reporter expression in the embryo were identified. Module 10 produces early expression in mesoderm and endoderm cells up to the early gastrula stage, while module 24 generates late endodermal expression at gastrula and pluteus stages. Module 10 was then deleted from the gatae BAC by reciprocal recombination, resulting in total loss of reporter expression in the time frame in which it is normally active. Similar deletion of module 24 led to ubiquitous GFP expression in the gastrula and pluteus. These results show that module 10 is uniquely necessary and sufficient to account for the early phase of gatae expression during endomesoderm specification. In addition, they imply a functional cis-regulatory module exclusion, whereby only a single module can associate with the basal promoter and drive gene expression at any given time.
Keywords: sea urchin, gene regulation, GATA factors, cis-regulatory analysis, gatae
INTRODUCTION
GATA4,5,6 transcription factors and their orthologs are implicated in numerous aspects of endoderm and mesoderm development across the Bilateria (Maduro and Rothman, 2002; Murakami et al., 2005; Patient and McGhee, 2002). The sea urchin Strongylocentrotus purpuratus has two gata genes, of which gatae is orthologous to the vertebrate gata4/5/6 genes (Pancer et al., 1999). The dynamic spatial expression of gatae in the sea urchin embryo was described by Lee and Davidson (2004). Expression is first detected in the 15 h blastula in cells of the presumptive mesoderm, and in the 24 h mesenchyme blastula the gene is expressed in endoderm and mesoderm cells of the veg2 lineage. At the onset of gastrulation the gatae gene is expressed in the invaginating vegetal plate, and during gastrulation in the cells surrounding the blastopore as well as in mesoderm cells at the tip of the archenteron. In the later gastrula stages gatae is expressed in the midgut, hindgut and coelomic pouch regions of the archenteron. At the pluteus stage hindgut expression is extinguished, leaving the definitive pattern of expression in the midgut and the coelomic pouches, which form the rudiment where the body plan of the adult sea urchin later develops.
The gatae gene occupies an important node in the sea urchin endomesoderm network. Perturbation analysis using morpholino antisense oligonucleotides (MASO), and many other observations, reveal that prior to gastrulation gatae is a direct activator of a number of genes encoding transcription factors, including the endodermal transcription factors foxA, brachyury, and β1/2-otx (Davidson et al., 2002a; Davidson et al., 2002b; see http://sugp.caltech.edu/endomes/ for the current version of the endomesodermal gene regulatory network). Of particular interest and importance is the interaction of the Gatae factor with the β1/2-otx gene. These two genes cross-regulate, generating a positive feedback loop which serves to lock down the state of endoderm specification (Davidson et al., 2002a; Yuh et al., 2004).
Since the gatae gene is expressed in a complex spatial pattern which changes with developmental time, it seemed likely that more than one cis-regulatory module would be required to control its expression in the embryo. Here we show that a physically distinct “early module” is necessary and sufficient to account for expression up to the early gastrula stage, and that a separate “late module” takes over control of expression in the gut thereafter. Comparison of the expression patterns generated by deletion of either module from the genomic regulatory DNA with those generated by the individual modules in reporter constructs leads to the additional conclusion that in situ the function of one module excludes the function of the other.
MATERIALS AND METHODS
Identification of gatae BACs and interspecific sequence comparison
S. purpuratus and L. variegatus BAC libraries were screened with a mixture of two probes, one corresponding to exon 1 (5′ probe), and the other to exons 5 and 6 (3′ probe). Filters were hybridized in 5X SSPE, 5% SDS and 0.1% NaPPi at 65 °C and washed to a final concentration of 1X SSPE, 0.1% SDS. Positive clones were identified using the BioArray Software (Brown et al., 2002) and further confirmed by PCR and genomic DNA blots. Each clone was also mapped to determine the distance of the gatae gene from the vector. Mapping was done by digesting each BAC with Not I, which releases the insert, and either Bgl II, Xho I or Pst I. Genomic DNA blots were hybridized with combinations of probes corresponding to the vector (T7 and SP6) and the 5′ and 3′ gatae exon probes. Sp and Lv BACs in which the gatae gene was furthest from the vector were sequenced at either the Joint Genome Institute or the Institute for Systems Biology (Seattle, Washington).
Interspecific sequence analysis was carried out using FamilyRelations (Brown et al., 2002). FamilyRelations software is available at http://family.caltech.edu.
Generation of cis-regulatory reporter constructs
Fusion PCR (Yon and Fried, 1989) was used in the generation of all reporter constructs. Each reporter construct consists of three separate PCR products stitched together in a subsequent PCR reaction: the conserved sequence patch, the gatae basal promoter and the GFP coding region. PCR primers were designed for each conserved sequence patch identified by FamilyRelations analysis. The reverse primer also included the tail sequence 5′-GTGTTGAAGTAGCTGGCAGTGACGT-3′, which overlaps with the sequence of the gatae basal promoter. The gatae basal promoter was amplified using the forward primer 5′-ACGTCACTGCCAGCTACTTC-3′, and the reverse primer 5′-GTGAACAGTTCCTCGCCCTTGCTCATCTGATGTGGCATACCACGC-3′. The 5′ portion of this primer corresponds to the GFP coding region. The GFP reporter included the SV40 polyadenylation signal, and was amplified using as forward primer 5′-ATGAGCAAGGGCGAGGAACTG-3′, and as reverse primer 5′-TGACTGGGTTGAAGGCTCTC-3′. Each resulting PCR product was cloned into the pGEM-T Easy vector (Promega Corporation, Madison, WI) and verified by sequencing. PCR reporter constructs were purified using a PCR Purification Kit (Qiagen Inc., Valencia, CA), and injected into fertilized eggs.
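The overlaps that allow the three PCR products to be stitched together can be checked directly from the primer sequences quoted above. The following is a minimal sketch, not part of the original protocol; the function and variable names are illustrative. It looks for the reverse complement of each downstream forward primer inside the upstream reverse primer.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Primer sequences copied from the text (written 5' to 3').
PATCH_REVERSE_TAIL = "GTGTTGAAGTAGCTGGCAGTGACGT"
PROMOTER_FORWARD = "ACGTCACTGCCAGCTACTTC"
PROMOTER_REVERSE = "GTGAACAGTTCCTCGCCCTTGCTCATCTGATGTGGCATACCACGC"
GFP_FORWARD = "ATGAGCAAGGGCGAGGAACTG"


def overlap_start(upstream_reverse: str, downstream_forward: str) -> int:
    """Index in the upstream reverse primer at which the reverse complement
    of the downstream forward primer begins, or -1 if it is absent."""
    return upstream_reverse.find(reverse_complement(downstream_forward))


if __name__ == "__main__":
    # Tail on each conserved-patch reverse primer vs. basal promoter forward primer.
    print(overlap_start(PATCH_REVERSE_TAIL, PROMOTER_FORWARD))
    # Basal promoter reverse primer vs. GFP forward primer.
    print(overlap_start(PROMOTER_REVERSE, GFP_FORWARD))
```

A non-negative result in both cases indicates that each junction shares a stretch of complementary sequence, which is what the fusion step relies on.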
BAC homologous recombinations
BAC modifications involving homologous recombination utilized the method described by Lee et al. (2001). The targeting cassette consists of the GFP coding region and a kanamycin resistance gene flanked by frt sites. In this way the kanamycin resistance gene, used to select for recombinants, can be removed by arabinose induction after successful recombination. To generate the targeting cassette for creation of the gatae GFP BAC knockin, in which the GFP coding region was inserted into gatae’s first exon, primers corresponding to exon 1 were designed as follows: forward primer, 5′-CAGCAGTATCTTTATCCCCAGTATCATTTGACAAGCGAATCCCAAATGAGCAAGGGCGAGGAACT-3′; reverse primer, 5′-ACTCCACACGGCTGCAGCAGCGTGAGCATTGGCCTGGATCACGCTTCGAAGAGCTATTCCAG-3′.
For deletion of cis-regulatory modules 10 and 24, primers were designed to flank the region marked for removal. The targeting cassettes used for module deletions did not include the GFP coding DNA, consisting only of the kanamycin resistance gene flanked by frt sites. Module 10del forward primer: 5′-AAGTATTAATATATTGGAATTGTTACAATGTTAGATTTGTATTCATCATGTCTGGATCGAACACC-3′; module 10del reverse primer: 5′-GCAAGATTATTAGTCACCGCTTGAAGAACATCGGGAAGAGAATGGGCTACCATGGAGAAGTTCC-3′; module 24del forward primer: 5′-AAAACTTGAATGATAACGACGCCTTGACTTACTGCCGTTTAAAGATCATGTCTGGATCGAACACC-3′; module 24del reverse primer: 5′-TAAAGTTAGTCAAATAAGCTAATGTTTGGTGAGAAGGGTATGAGAGGCTACCATGGAGAAGTTCC-3′.
Sequences corresponding to the targeting cassette are underlined. The targeting cassettes were electroporated into EL250 cells containing the Gatae BAC (GFP insertion) or Gatae GFP BAC (module deletion), and the λ recombination system activated by heat shock at 42 °C. After selection for recombinants and removal of the selectable marker, clones containing the targeted insertions or deletions were linearized with Not I and column purified before microinjection.
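Because both deletion cassettes carry the same kanamycin resistance gene, the two forward primers, and likewise the two reverse primers, would be expected to share a 3′ end that primes on the cassette, with the remaining 5′ sequence providing homology to the BAC. The sketch below recovers that shared 3′ end as the longest common suffix. This is an assumption-based illustration: the function name is hypothetical, and a base or two of the shared suffix could match by chance rather than belong to the cassette.

```python
def common_suffix(a: str, b: str) -> str:
    """Return the longest sequence shared by the 3' ends of two primers."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


# Module deletion primers copied from the text (written 5' to 3').
M10_FWD = "AAGTATTAATATATTGGAATTGTTACAATGTTAGATTTGTATTCATCATGTCTGGATCGAACACC"
M10_REV = "GCAAGATTATTAGTCACCGCTTGAAGAACATCGGGAAGAGAATGGGCTACCATGGAGAAGTTCC"
M24_FWD = "AAAACTTGAATGATAACGACGCCTTGACTTACTGCCGTTTAAAGATCATGTCTGGATCGAACACC"
M24_REV = "TAAAGTTAGTCAAATAAGCTAATGTTTGGTGAGAAGGGTATGAGAGGCTACCATGGAGAAGTTCC"

if __name__ == "__main__":
    fwd_shared = common_suffix(M10_FWD, M24_FWD)
    rev_shared = common_suffix(M10_REV, M24_REV)
    # Shared 3' ends (presumed cassette-priming portions).
    print(fwd_shared, len(fwd_shared))
    print(rev_shared, len(rev_shared))
    # What remains at the 5' end is the module-specific homology arm.
    print(M10_FWD[: len(M10_FWD) - len(fwd_shared)])
```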
Quantitative PCR reporter analysis
Embryos injected with reporter constructs were collected at various stages of development and their RNA was extracted using Qiagen’s RNeasy Micro Kit (Qiagen Inc., Valencia, CA). RT-PCR was carried out using ABI’s (Applied Biosystems, Foster City, CA) TaqMan Reverse Transcription Reagents with random hexamer priming, while real-time QPCR reactions were performed in triplicate with ABI’s SYBR Green PCR Master Mix. Ct is defined as the cycle number at which DNA in a PCR reaction reaches a particular threshold, set to a level where PCR products are increasing exponentially. Cts for GFP were normalized to Cts for SpZ12 as a control to account for differences in the number of embryos in each preparation and converted to relative RNA levels using the formula 2^ΔCt, where ΔCt = Ct(SpZ12) − Ct(GFP).
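The normalization described above can be written out as a short calculation. This is a minimal sketch: averaging the triplicate Ct values before taking the difference is an assumption (the text does not say how replicates were combined), and the Ct values in the example are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_rna_level(ct_gfp: Sequence[float], ct_spz12: Sequence[float]) -> float:
    """Relative GFP RNA level as 2**dCt, with dCt = Ct(SpZ12) - Ct(GFP).

    Replicate Ct values are averaged first (an assumption, see lead-in).
    """
    delta_ct = mean(ct_spz12) - mean(ct_gfp)
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Hypothetical triplicate Ct values, for illustration only.
    gfp_cts = [26.0, 26.2, 25.8]
    spz12_cts = [24.0, 24.1, 23.9]
    print(relative_rna_level(gfp_cts, spz12_cts))
```

A GFP Ct two cycles later than the SpZ12 Ct gives ΔCt = −2 and therefore a relative level of about 0.25, since each cycle corresponds to a twofold difference in template.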
Embryo culture, microinjection and whole mount in situ hybridizations
Fertilized eggs were injected with 10 pl of a solution containing 250 molecules of reporter construct/pl, following the microinjection and embryo culture procedures described by McMahon et al. (1985). Whole mount in situ hybridizations on injected embryos were performed as described (Minokawa et al., 2005).
RESULTS
Structure of the gatae genomic locus
Comparison of the gatae BAC sequence to that of gatae cDNA (Genbank Accession No. AY623814) revealed that the gatae gene contains 6 exons extending over 29 kb of genomic DNA (Fig. 1A). The two class IV zinc fingers are encoded in exons 3 and 4. Sea Urchin Genome Annotation Resource software (Brown et al., 2002) was used to predict the locations of the two genes flanking gatae. The nearest predicted coding region was 19 kb upstream of exon 1, matching a predicted sea urchin beta-2 lactamase gene (Genbank Accession No. XM_001177319). The nearest downstream gene is a predicted folate transporter gene (Genbank Accession No. XM_001177178), located 8 kb 3′ of the gatae stop codon. These genes are both transcribed in the same direction as gatae (left to right in Fig. 1A). The assembled sequence of the S. purpuratus genome (Sodergren et al., 2006) confirmed that gatae is a single copy gene.
Conserved non-coding sequence patches in the vicinity of the gatae gene
Using FamilyRelations software (Brown et al., 2002), we compared the genomic sequence surrounding the gatae gene in Strongylocentrotus purpuratus and Lytechinus variegatus gatae BACs. The region scanned extended from the lactamase to the folate transporter gene (cf. Fig. 1A). Parameters were set to require 85% threshold identity within a 50 bp sliding window. This analysis (Fig. 2A) revealed the presence of 31 conserved sequence patches, five of which corresponded to gatae exons 2-6, and one patch which corresponded partly to exon 1. The conserved sequences range from 196 bp to 1.7 kb, with an average size of 440 bp.
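The windowed identity criterion can be illustrated with a simplified sketch. This is not the FamilyRelations implementation: it assumes the two sequences have already been aligned to equal length without gaps, and the function name is hypothetical. It reports merged intervals covered by any 50 bp window in which at least 85% of positions are identical.

```python
from typing import List, Tuple


def conserved_patches(seq_a: str, seq_b: str, window: int = 50,
                      threshold: float = 0.85) -> List[Tuple[int, int]]:
    """Return merged half-open (start, end) intervals covered by windows whose
    ungapped identity is at least `threshold`."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    n = len(seq_a)
    if n < window:
        return []
    matches = [a == b for a, b in zip(seq_a, seq_b)]
    count = sum(matches[:window])  # identical positions in the first window
    patches: List[Tuple[int, int]] = []
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one position: add the new base, drop the old one.
            count += matches[start + window - 1] - matches[start - 1]
        if count >= threshold * window:
            end = start + window
            if patches and start <= patches[-1][1]:
                patches[-1] = (patches[-1][0], end)  # extend the current patch
            else:
                patches.append((start, end))
    return patches


if __name__ == "__main__":
    # Toy example: the first 60 positions are identical, the rest differ.
    a = "A" * 200
    b = "A" * 60 + "C" * 140
    print(conserved_patches(a, b))
```

With a 50 bp window and an 85% threshold, a window must contain at least 43 identical positions to qualify, so a conserved patch reported this way can extend a few bases into flanking non-conserved sequence.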
In order to identify active cis-regulatory modules that drive gatae expression in the embryo, a series of reporter constructs was made (Fig. 2B). The individual conserved sequence patches were amplified by PCR and inserted into the expression vector, as described in Materials and Methods. Additional longer constructs were also prepared, as indicated in the lower part of Fig. 2B (see Methods), to control for the possibility that functional sequence elements might be excluded from those inserts defined by conservation pattern, though this turned out not to be a concern. The conserved sequence immediately upstream of exon 1 (7a) appeared likely to include the gatae basal promoter, given its location, and indeed it includes the TATA box and initiator element sequences of the gatae gene. When cloned into a GFP reporter and introduced into eggs, fragment 7a generated no expression on its own, as is characteristic of basal promoters in our expression vectors (Arnone et al., 1998; Sucov et al., 1988; Yuh and Davidson, 1996). Module 7a was included as the basal promoter in all of the gatae cis-regulatory constructs; experiments in which the endo16 basal promoter was instead combined with active gatae cis-regulatory modules showed that the two basal promoters function in the same way (data not shown). Each reporter construct was injected into fertilized sea urchin eggs and observed at the mesenchyme blastula, gastrula and pluteus stages.
Two DNA fragments that generated specific endoderm and mesoderm expression in the embryo were identified in these preliminary experiments, viz. those included in conserved patches 10 and 24. The large distal fragment upstream of patch 1 was expressed ubiquitously, but was not studied further.
A GFP BAC that reproduces gatae expression
The sequence of the BAC containing the gatae gene begins 109 kb upstream of the first exon of gatae, and terminates 2.5 kb downstream of the last exon. Using in vitro recombination, we inserted the coding region of a GFP reporter into the first exon of gatae within the BAC (Fig. 1B, referred to as Gatae BAC). When injected into fertilized eggs, Gatae BAC was able to reproduce every aspect of endogenous gatae expression (Fig. 3 and Table 1). GFP fluorescence was detected in vegetal cells of the 18 h blastula (Fig. 3A). In the mesenchyme blastula, GFP was observed in both endoderm and mesoderm cells of the veg2 lineage (Fig. 3B and Table 1). GFP reporter expression persisted in those cells until the onset of gastrulation (Fig. 3C). In the gastrula, GFP expression was restricted to endoderm cells of midgut and hindgut and mesoderm at the tip of the archenteron (Fig. 3D and Table 1). At 72 h, expression was limited to the midgut and coelomic pouches (Fig. 3E,F and Table 1). Thus, the Gatae BAC must contain all the cis-regulatory information required to account for gatae expression in the embryo.
Table 1.

24 h

| Construct | Number of embryos observed | Number of GFP+ embryos (%) | Endomesoderm (%ᵃ) | Ectoderm (%ᵃ) |
|---|---|---|---|---|
| 10ᵇ | 128 | 70 (55) | 69 (99) | 0 |
| 24 | 163 | 80 (56) | 30 (37) | 65 (81.5) |
| Gatae BAC | 303 | 198 (65) | 198 (100) | 0 |
| Gatae BAC del10 | 320 | 5 (2) | 5 (100) | 0 |
| Gatae BAC del24 | 98 | 56 (57) | 54 (97) | 5 (10) |
| GataeBp | 119 | 0 (0) | 0 | 0 |

48 h

| Construct | Number of embryos observed | Number of GFP+ embryos (%) | Endoderm (%ᵃ) | Mesoderm (%ᵃ) | Ectoderm (%ᵃ) |
|---|---|---|---|---|---|
| 10 | 272 | 76 (28) | 42 (55) | 14 (18) | 34 (45) |
| 24 | 166 | 98 (59) | 93 (95) | 9 (9) | 2 (2) |
| Gatae BAC | 313 | 161 (51) | 146 (90) | 53 (33) | 1 (1) |
| Gatae BAC del10 | 269 | 125 (46) | 115 (92) | 47 (38) | 0 |
| Gatae BAC del24 | 179 | 111 (62) | 72 (65) | 32 (28) | 53 (48) |
| GataeBp | 100 | 2 (2) | 0 | 2 (100) | 0 |

72 h

| Construct | Number of embryos observed | Number of GFP+ embryos (%) | Endoderm (%ᵃ) | Mesoderm (%ᵃ) | Ectoderm (%ᵃ) |
|---|---|---|---|---|---|
| 10 | 198 | 12 (6) | 1 (8) | 6 (50) | 5 (42) |
| 24 | 109 | 72 (66) | 71 (99) | 2 (3) | 4 (7) |
| Gatae BAC | 279 | 148 (53) | 143 (97) | 39 (26) | 0 |
| Gatae BAC del10 | 175 | 71 (41) | 64 (90) | 14 (20) | 0 |
| Gatae BAC del24 | 203 | 93 (46) | 54 (58) | 12 (13) | 60 (65) |
| GataeBp | 52 | 0 (0) | 0 | 0 | 0 |

ᵃ Percentages reflect embryos which expressed GFP in said cell type, including those that displayed GFP expression in two or more cell types.
ᵇ One PMC-expressing embryo omitted for simplicity.
A cis-regulatory module that reproduces early vegetal expression of gatae
Region 10, a 585 bp conserved sequence located in the first intron (Fig. 2B), was capable of producing GFP reporter expression in the vegetal plate. In embryos injected with region 10 reporters, expression could be detected in a single localized region at 15 h (Fig. 4A). At 15 h it is not possible to determine the location of gene expression based on morphology alone, but by the time of vegetal plate thickening soon thereafter, it became obvious that expression driven by this DNA fragment is localized in the vegetal plate. In the mesenchyme blastula, this module generated GFP reporter expression in the endomesoderm specifically (Fig. 4B): 99% of GFP-expressing embryos showed endomesoderm expression (Table 1). Expression persisted in the invaginating archenteron at the onset of gastrulation (Fig. 4C). However, in the 48 h gastrula, the module 10 construct produced ubiquitous expression (Fig. 4E and Table 1). This construct was nearly inactive in the pluteus, where only 6% of embryos expressed GFP (Fig. 4F and Table 1). Consistent with these observations, constructs 10-12 and 9-11 produced the same patterns of expression as did the isolated module 10 (Fig. 2B and data not shown).
Expression of module 10 was studied in greater detail by quantifying the amount of GFP RNA generated by the construct over developmental time, using QPCR. As with the endogenous gatae gene (Lee and Davidson, 2004), reporter expression was first detected in the 15 h embryo. Expression then increased, peaking at 24 h and 30 h, before decreasing dramatically in the gastrula and pluteus (Fig. 4G). These data show that module 10 is a driver for gatae expression in the blastula. Since the turnover rate of GFP mRNA is not known in these cells, we cannot be sure when the transcriptional activity of module 10 constructs terminates, except that it is at or before the onset of gastrulation at 30 h.
The late gatae cis-regulatory module
The second conserved patch in the first intron, the 334 bp region 24 (Fig. 2B), proved capable of driving endoderm-specific expression at gastrula and pluteus stages. However, both GFP fluorescence observation and in situ hybridizations revealed that region 24 constructs are expressed ubiquitously up to 30 h (Figs. 5A-C and Table 1). By gastrula stage, expression had become highly specific and was confined to the midgut and hindgut (Fig. 5D and Table 1), while in the pluteus the GFP reporter was observed only in the midgut (Figs. 5E,F and Table 1). It should be noted that module 24 was not expressed in the mesoderm cells at the tip of the archenteron in the gastrulating embryo or in the coelomic pouches at pluteus stage, as are the endogenous gene and the Gatae BAC (Fig. 3). Regulatory functions required for coelomic pouch expression thus are missing from region 24, and from the extended constructs that include region 24, i.e. regions 15-20 or 20-24 (Fig. 2B). These extended fragments displayed the same endodermal activity in gastrula and pluteus stages as did the region 24 construct (data not shown).
QPCR time courses performed on embryos injected with the module 24 reporter construct revealed that reporter levels were relatively low up to 30 h, and the main activity was at the 48 h gastrula and the 72 h pluteus stages (Fig. 5G). Therefore the main function of module 24 is to drive gatae expression in the gastrula and pluteus. Considering the expression data for modules 10 and 24 together, it is clear that their expression patterns are complementary, both spatially and temporally. Together they account for the totality of embryonic gatae expression, except for the late expression in the mesodermal coelomic pouches. The control locus for this aspect of gatae expression remains undiscovered.
Necessity of module 10 for gatae expression in the blastula
To determine if module 10 is required for the early expression of gatae, it was deleted from Gatae BAC (Gatae BAC del10) by homologous recombination (see Materials and Methods). This allowed us to study the function of the module in the context of the complete gatae genomic locus and to identify any intermodular interactions. The result was clear: when Gatae BAC del10 was injected into embryos, virtually no expression was seen in 15 h, 24 h, or 30 h embryos (Figs. 6A-C; at 24 h only 5 of 320 embryos expressed GFP, Table 1), but strong GFP expression was observed in the gastrula stage, in the midgut, hindgut and mesoderm (Fig. 6D and Table 1). In pluteus stage embryos bearing Gatae BAC del10, GFP was expressed in the midgut and the coelomic pouches (Figs. 6E, F and Table 1).
QPCR time courses were generated from embryos injected with Gatae BAC and Gatae BAC del10 (Fig. 6G), and the data were consistent with the spatial expression. GFP RNA levels in Gatae BAC del10 embryos remained low compared to the control until the gastrula stage, and by 48 h they reverted to the levels produced by the wild type Gatae BAC. The results demonstrate that module 10 is the only module utilized during blastula stages, and is necessary as well as sufficient for gatae expression in the vegetal pole.
Deletion of the late module
A construct lacking module 24 was similarly generated (Gatae BAC del24). Embryos injected with Gatae BAC del24 express GFP vegetally at 15 h and 24 h in the same spatial domain as the control Gatae BAC (Figs. 7A, B and Table 1). Furthermore, the fraction of embryos showing early expression is essentially the same as that recorded for the isolated module 10 construct (57% for Gatae BAC del24 vs. 55% for module 10; Table 1). Surprisingly, however, we observed ubiquitous GFP expression in Gatae BAC del24 embryos after this (Figs. 7C-F). In the gastrula 52% of GFP-expressing embryos showed expression only in endoderm or mesoderm cells, but 48% displayed some level of expression in the ectoderm. In sharp contrast, with the parental Gatae BAC, 99% of GFP-positive embryos expressed GFP only in endoderm and mesoderm (Table 1). A similar observation was made in the pluteus, in which GFP was observed only in endoderm and mesoderm in 35% of embryos and 65% displayed some level of ectodermal expression, while 100% of embryos bearing the Gatae BAC control expressed GFP only in endoderm and mesoderm (Table 1).
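The percentages quoted here follow directly from the counts in Table 1, taking GFP-positive embryos as the denominator. The short sketch below recomputes the ectodermal fractions for the two BAC constructs; the dictionary layout and function name are illustrative only, and the counts are those listed in Table 1.

```python
# Counts from Table 1: number of GFP+ embryos and number of those showing
# GFP in the ectoderm, keyed by (construct, hours post fertilization).
TABLE_1 = {
    ("Gatae BAC", 48): {"gfp_positive": 161, "ectoderm": 1},
    ("Gatae BAC del24", 48): {"gfp_positive": 111, "ectoderm": 53},
    ("Gatae BAC", 72): {"gfp_positive": 148, "ectoderm": 0},
    ("Gatae BAC del24", 72): {"gfp_positive": 93, "ectoderm": 60},
}


def ectoderm_percent(construct: str, hours: int) -> int:
    """Percentage of GFP+ embryos with ectodermal GFP, rounded to an integer."""
    row = TABLE_1[(construct, hours)]
    return round(100 * row["ectoderm"] / row["gfp_positive"])


if __name__ == "__main__":
    for key in TABLE_1:
        ectopic = ectoderm_percent(*key)
        # Embryos without ectodermal GFP expressed only in endoderm/mesoderm.
        print(key, ectopic, 100 - ectopic)
```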
QPCR analysis of the levels of GFP reporter RNA produced by Gatae BAC del24 supports the spatial expression data. At no time was GFP RNA eliminated or drastically reduced. Instead, we observed reduced levels of GFP RNA in embryos injected with Gatae BAC del24 compared to Gatae BAC. Even though we did not observe a loss of expression in the gastrula and pluteus stages, spatial expression at those time points had been completely disrupted by the removal of module 24. Therefore, as is module 10 at early stages, module 24 is necessary for the correct spatial regulation of gatae at late stages.
DISCUSSION
Cis-regulation of gatae
Here we show that two physically distinct cis-regulatory modules control different aspects of gatae expression in the sea urchin embryo. Module 10 is active early, from the onset of expression in the presumptive secondary mesenchyme cells to the early gastrula phase of expression in the vegetal plate endoderm and mesoderm. Sometime during early gastrulation module 24 takes over control from module 10, directing gatae expression in the gut endoderm of the gastrula and pluteus. This modular organization reflects the requirement for regulation by diverse sets of transcription factors at the respective stages, i.e., during specification of the endomesoderm, and during definitive regionalization and differentiation of the gut. The gatae gene itself plays different roles in these phases of its activity. The endomesodermal gene regulatory network shows explicitly how gatae functions to activate a number of other regulatory genes during the specification phase (Davidson, 2006; Davidson et al., 2002a; Davidson et al., 2002b). Given its regionalized pattern of expression in the gut of the late embryo, gatae may be involved in specification of first the hindgut and then the midgut, and in activation of gut differentiation gene batteries.
While the endogenous gatae gene and Gatae BAC express strongly in the mesoderm cells of the gastrula and the coelomic pouches of the pluteus embryo, neither module 10 nor module 24 directs expression to these cells. An additional control module is thus implied. This is likely to reside >10 kb upstream of conserved patch 1, the limit of overlap of the L. variegatus BAC with the S. purpuratus sequence. This leaves roughly 9 kb to the beta2 lactamase-like gene which will be possible to explore by FamilyRelations only when the respective L. variegatus sequence becomes available. It is unlikely that the missing module is downstream of the region we have examined, since Gatae BAC expresses in coelomic pouches though it terminates only 2.5 kb beyond exon 6.
In the context of the endomesoderm gene regulatory network, an important result is that module 10 alone is necessary and sufficient to drive gatae expression throughout the phase of development to which the network analysis pertains. Therefore all interactions from upstream regulators into gatae will have to be mediated by and processed through this module. We have identified binding sites for such inputs as predicted by the endomesoderm gene network in module 10, and are in the process of mutating and analyzing these sites in detail, to be reported in a subsequent publication (Lee and Davidson, 2007).
Homologous BAC recombination as a tool for cis-regulatory analysis
Conventional cis-regulatory analysis on isolated modules, including site specific mutagenesis, provides our most powerful and direct tool for demonstrating functionally the roles of given cis-regulatory inputs. By this means proposed upstream linkages of a regulatory module into the gene regulatory network can be certified or rejected. The use of homologous BAC recombination further enhances the arsenal of functional cis-regulatory approaches, opening up several additional possibilities: (1) As have others (Hadchouel et al., 2003; Teboul et al., 2002) we show here how deletion of a specific regulatory module can be used to establish its necessity as well as its sufficiency. This excludes the possibility of regulatory redundancy. (2) BAC reporter knockins which provide the complete and accurate spectrum of expression of a given gene are a useful starting place to narrow the genomic domain over which specific cis-regulatory modules are to be sought. (3) BAC reporter knockins provide built in components for single module expression constructs that include the endogenous basal promoter. (4) BAC reporter knockins enable the study of intermodular interactions in the natural context of the gene, and this has proved one of the most interesting aspects of the present work.
Exclusionary function of cis-regulatory modules
The expression of the module 10 and 24 reporter constructs differs in a revealing way from the expression of Gatae BAC. When individually cloned in front of the reporter, each module was capable of driving spatially specific expression for part of embryogenesis, but each produced ubiquitous, albeit weak, expression at other stages. Yet in their natural context they work sequentially to produce highly specific patterns of expression with no ectopic expression of any kind, as seen from endogenous gatae expression and that of Gatae BAC. This difference devolves from the global structure of the locus: the whole locus has functions beyond the sum of those of the individual constructs. Individual constructs display outputs from cis-regulatory processing of their individual inputs, while the overall regulatory function of the gatae locus includes mechanisms that determine which cis-regulatory modules are allowed to function; thus far there has been little experimental verification of such alternate use of cis-regulatory modules.
In Fig. 8 we present a model for how this might occur. The premise is that module function requires physical association with the basal transcription apparatus (BTA), and that a given association precludes all other modules from such association. This would be the consequence of association by looping, undoubtedly the general mechanism by which distant cis-regulatory modules are brought to the immediate vicinity of the BTA (reviewed by Davidson, 2006). With respect to choice of active cis-regulatory module, a looping mechanism confers a Boolean quality to the regulatory system (Istrail and Davidson, 2005). In our present case, the gatae gene contains two cis-regulatory modules active in the embryonic endomesoderm, viz. module 10 for early expression and module 24 for late expression. In the normal context, module 10 associates with the BTA up to the early gastrula, driving endoderm and mesoderm expression (Fig. 8B). This excludes module 24 from association with the BTA within the endomesoderm during this period, when module 10 is loaded with its transcription factors (Lee and Davidson, 2007). Outside the endomesoderm, module 24 is at this same early period capable of generating weak (ectopic) expression (Fig. 5) if it is cloned in juxtaposition to the BTA, but it does not do so in context. Therefore when this module is not loaded with its cognate transcription factors it cannot loop to the BTA. Sometime in early to mid-gastrula, however, these factors become available in endoderm cells, and there module 24 is activated, loops to the BTA, and generates specific expression in the midgut and hindgut (Fig. 8C). At this time module 10 is essentially relieved of its duty and is excluded from association with the basal promoter. As for module 24 at early times, in cells outside of the endoderm at late stages module 10 cannot now cause expression unless it is artificially brought into the immediate context of the BTA.
Thus, though each of these cis-regulatory modules in isolation displays weak ubiquitous expression at certain times, in context they function alternately to produce highly specific expression.
However there is an asymmetry in this system, as shown by the results of the BAC deletions. Deletion of module 10 results in complete loss of early expression, followed by normal late expression; as above the potential of module 24 for early expression outside the endomesoderm cannot be realized unless it is artificially positioned next to the BTA. However, deletion of module 24 results in ubiquitous late Gatae BAC expression. This could be driven by the action of the distal “B” element in the undefined region upstream of conserved patch 1. A prediction within the framework of the model in Fig. 8 is that the asymmetry in the consequences of these two deletions is due to a second kind of looping: in all embryo cells at all times module 24 is looped to the B region, preventing it from functioning with the BTA, except in late endoderm cells when it becomes loaded with endoderm transcription factors and occupies the BTA itself. Deletion of module 24 would release this constraint, resulting in B-driven ectopic expression.
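The premises of this model can be made explicit as a small Boolean sketch. This is one possible reading of the text, not an implementation from the study: the stage and territory labels, the function names, and the simplification to two territories (with "endoderm" standing in for the endomesoderm at early stages) are assumptions, and the undiscovered coelomic pouch control element is left out.

```python
from typing import FrozenSet, Optional

# Regulatory elements considered: modules 10 and 24, and the distal "B" element.
ALL_ELEMENTS: FrozenSet[str] = frozenset({"10", "24", "B"})


def is_loaded(module: str, stage: str, territory: str) -> bool:
    """True if the module is occupied by its cognate transcription factors.

    Assumption: module 10 is loaded early and module 24 late, both only in
    the endoderm territory.
    """
    if module == "10":
        return stage == "early" and territory == "endoderm"
    if module == "24":
        return stage == "late" and territory == "endoderm"
    return False


def bta_occupant(stage: str, territory: str,
                 present: FrozenSet[str] = ALL_ELEMENTS) -> Optional[str]:
    """Return the single element associated with the BTA, or None."""
    # A loaded module loops to the BTA and excludes every other element.
    for module in ("10", "24"):
        if module in present and is_loaded(module, stage, territory):
            return module
    # Otherwise B can reach the BTA only if module 24 is not there to
    # sequester it (the "second kind of looping" proposed in the text).
    if "B" in present and "24" not in present:
        return "B"
    return None


if __name__ == "__main__":
    del10 = ALL_ELEMENTS - {"10"}
    del24 = ALL_ELEMENTS - {"24"}
    for name, elements in (("wild type", ALL_ELEMENTS), ("del10", del10), ("del24", del24)):
        for stage in ("early", "late"):
            for territory in ("endoderm", "ectoderm"):
                print(name, stage, territory, bta_occupant(stage, territory, elements))
```

Under these assumptions the sketch returns no occupant for early endoderm when module 10 is deleted, and B-driven expression in late ectoderm when module 24 is deleted, matching the asymmetry described above. Applied literally, the same rule also lets B drive expression in early ectoderm of Gatae BAC del24 embryos, which the text does not claim; Table 1 records ectodermal GFP in only 5 of 56 GFP+ embryos at 24 h. The sketch is therefore a way of stating the premises, not a fit to the data.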
An alternative, that ectopic expression is precluded by specific repressors, target sites for which are located within modules 10, 24, and B, seems too baroque to consider seriously. This would require that the repressor that acts on module 10 is present in all cells except endomesoderm at early times and in all cells at late times, while that which acts on module 24 is present everywhere early and then in all cells except gut at late times, etc. Furthermore, the ectopic expression seen in module 24 deletions from the Gatae BAC cannot easily be explained in this way.
In summary, we describe two levels of cis-regulatory control in the gatae gene. The first is the classic, module-specific cis-regulatory design that determines time and place of regulatory function for each module. This is clearly revealed in experiments with single module expression constructs. The second is the level of exclusionary cis-regulatory module interactions on the scale of the gene as a whole. This can only be perceived in experiments carried out on that scale, for which recombinant BAC constructs provide a ready approach.
ACKNOWLEDGEMENTS
We would like to express our gratitude to Pat Leahy at Kerckhoff Marine Lab for his excellent care of the sea urchins. We would also like to thank Julie Hahn for generating the Gatae GFP BAC and for technical advice with BAC deletions. This work was supported by NIH grant HD-37105.
Footnotes
Corresponding author: E. H. Davidson. Tel: +1 626 395 4937; fax: +1 626 793 3047; e-mail: davidson@caltech.edu. Division of Biology, California Institute of Technology, 1200 E. California Blvd, Mail Code 156-29, Pasadena, CA 9112
REFERENCES
- Arnone MI, Martin EL, Davidson EH. Cis-regulation downstream of cell type specification: a single compact element controls the complex expression of the CyIIa gene in sea urchin embryos. Development. 1998;125:1381–95. doi: 10.1242/dev.125.8.1381.
- Brown CT, Rust AG, Clarke PJ, Pan Z, Schilstra MJ, De Buysscher T, Griffin G, Wold BJ, Cameron RA, Davidson EH, Bolouri H. New computational approaches for analysis of cis-regulatory networks. Dev Biol. 2002;246:86–102. doi: 10.1006/dbio.2002.0619.
- Davidson EH. The Regulatory Genome. Academic Press; San Diego: 2006.
- Davidson EH, Rast JP, Oliveri P, Ransick A, Calestani C, Yuh CH, Minokawa T, Amore G, Hinman V, Arenas-Mena C, Otim O, Brown CT, Livi CB, Lee PY, Revilla R, Rust AG, Pan Z, Schilstra MJ, Clarke PJ, Arnone MI, Rowen L, Cameron RA, McClay DR, Hood L, Bolouri H. A genomic regulatory network for development. Science. 2002a;295:1669–78. doi: 10.1126/science.1069883.
- Davidson EH, Rast JP, Oliveri P, Ransick A, Calestani C, Yuh CH, Minokawa T, Amore G, Hinman V, Arenas-Mena C, Otim O, Brown CT, Livi CB, Lee PY, Revilla R, Schilstra MJ, Clarke PJ, Rust AG, Pan Z, Arnone MI, Rowen L, Cameron RA, McClay DR, Hood L, Bolouri H. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo. Dev Biol. 2002b;246:162–90. doi: 10.1006/dbio.2002.0635.
- Hadchouel J, Carvajal JJ, Daubas P, Bajard L, Chang T, Rocancourt D, Cox D, Summerbell D, Tajbakhsh S, Rigby PW, Buckingham M. Analysis of a key regulatory region upstream of the Myf5 gene reveals multiple phases of myogenesis, orchestrated at each site by a combination of elements dispersed throughout the locus. Development. 2003;130:3415–26. doi: 10.1242/dev.00552.
- Istrail S, Davidson EH. Logic functions of the genomic cis-regulatory code. Proc Natl Acad Sci U S A. 2005;102:4954–9. doi: 10.1073/pnas.0409624102.
- Lee EC, Yu D, Martinez de Velasco J, Tessarollo L, Swing DA, Court DL, Jenkins NA, Copeland NG. A highly efficient Escherichia coli-based chromosome engineering system adapted for recombinogenic targeting and subcloning of BAC DNA. Genomics. 2001;73:56–65. doi: 10.1006/geno.2000.6451.
- Lee PY, Davidson EH. Expression of Spgatae, the Strongylocentrotus purpuratus ortholog of vertebrate GATA4/5/6 factors. Gene Expr Patterns. 2004;5:161–5. doi: 10.1016/j.modgep.2004.08.010.
- Lee PY, Davidson EH. 2007. Use of OR logic in the regulation of a sea urchin GATA factor (in preparation).
- Maduro MF, Rothman JH. Making worm guts: the gene regulatory network of the Caenorhabditis elegans endoderm. Dev Biol. 2002;246:68–85. doi: 10.1006/dbio.2002.0655.
- McMahon AP, Flytzanis CN, Hough-Evans BR, Katula KS, Britten RJ, Davidson EH. Introduction of cloned DNA into sea urchin egg cytoplasm: replication and persistence during embryogenesis. Dev Biol. 1985;108:420–30. doi: 10.1016/0012-1606(85)90045-4.
- Minokawa T, Wikramanayake AH, Davidson EH. cis-Regulatory inputs of the wnt8 gene in the sea urchin endomesoderm network. Dev Biol. 2005;288:545–58. doi: 10.1016/j.ydbio.2005.09.047.
- Murakami R, Okumura T, Uchiyama H. GATA factors as key regulatory molecules in the development of Drosophila endoderm. Dev Growth Differ. 2005;47:581–9. doi: 10.1111/j.1440-169X.2005.00836.x.
- Pancer Z, Rast JP, Davidson EH. Origins of immunity: transcription factors and homologues of effector genes of the vertebrate immune system expressed in sea urchin coelomocytes. Immunogenetics. 1999;49:773–86. doi: 10.1007/s002510050551.
- Patient RK, McGhee JD. The GATA family (vertebrates and invertebrates). Curr Opin Genet Dev. 2002;12:416–22. doi: 10.1016/s0959-437x(02)00319-2.
- Sodergren E, Weinstock GM, Davidson EH, Cameron RA, Gibbs RA, Angerer RC, Angerer LM, Arnone MI, Burgess DR, Burke RD, Coffman JA, Dean M, Elphick MR, Ettensohn CA, Foltz KR, Hamdoun A, Hynes RO, Klein WH, Marzluff W, McClay DR, Morris RL, Mushegian A, Rast JP, Smith LC, Thorndyke MC, Vacquier VD, Wessel GM, Wray G, Zhang L, Elsik CG, Ermolaeva O, Hlavina W, Hofmann G, Kitts P, Landrum MJ, Mackey AJ, Maglott D, Panopoulou G, Poustka AJ, Pruitt K, Sapojnikov V, Song X, Souvorov A, Solovyev V, Wei Z, Whittaker CA, Worley K, Durbin KJ, Shen Y, Fedrigo O, Garfield D, Haygood R, Primus A, Satija R, Severson T, Gonzalez-Garay ML, Jackson AR, Milosavljevic A, Tong M, Killian CE, Livingston BT, Wilt FH, Adams N, Belle R, Carbonneau S, Cheung R, Cormier P, Cosson B, Croce J, Fernandez-Guerra A, Geneviere AM, Goel M, Kelkar H, Morales J, Mulner-Lorillon O, Robertson AJ, Goldstone JV, Cole B, Epel D, Gold B, Hahn ME, Howard-Ashby M, Scally M, Stegeman JJ, Allgood EL, Cool J, Judkins KM, McCafferty SS, Musante AM, Obar RA, Rawson AP, Rossetti BJ, Gibbons IR, Hoffman MP, Leone A, Istrail S, Materna SC, Samanta MP, Stolc V, Tongprasit W, Tu Q, Bergeron KF, Brandhorst BP, Whittle J, Berney K, Bottjer DJ, Calestani C, Peterson K, Chow E, Yuan QA, Elhaik E, Graur D, Reese JT, Bosdet I, Heesun S, Marra MA, Schein J, Anderson MK, Brockton V, Buckley KM, Cohen AH, Fugmann SD, Hibino T, Loza-Coll M, Majeske AJ, Messier C, Nair SV, Pancer Z, Terwilliger DP, Agca C, Arboleda E, Chen N, Churcher AM, Hallbook F, Humphrey GW, Idris MM, Kiyama T, Liang S, Mellott D, Mu X, Murray G, Olinski RP, Raible F, Rowe M, Taylor JS, Tessmar-Raible K, Wang D, Wilson KH, Yaguchi S, Gaasterland T, Galindo BE, Gunaratne HJ, Juliano C, Kinukawa M, Moy GW, Neill AT, Nomura M, Raisch M, Reade A, Roux MM, Song JL, Su YH, Townley IK, Voronina E, Wong JL, Amore G, Branno M, Brown ER, Cavalieri V, Duboc V, Duloquin L, Flytzanis C, Gache C, Lapraz F, Lepage T, Locascio A, Martinez P, Matassi G, Matranga V, Range R, Rizzo F, Rottinger E, Beane W, Bradham C, Byrum C, Glenn T, Hussain S, Manning FG, Miranda E, Thomason R, Walton K, Wikramanayke A, Wu SY, Xu R, Brown CT, Chen L, Gray RF, Lee PY, Nam J, Oliveri P, Smith J, Muzny D, Bell S, Chacko J, Cree A, Curry S, Davis C, Dinh H, Dugan-Rocha S, Fowler J, Gill R, Hamilton C, Hernandez J, Hines S, Hume J, Jackson L, Jolivet A, Kovar C, Lee S, Lewis L, Miner G, Morgan M, Nazareth LV, Okwuonu G, Parker D, Pu LL, Thorn R, Wright R. The genome of the sea urchin Strongylocentrotus purpuratus. Science. 2006;314:941–52. doi: 10.1126/science.1133609.
- Sucov HM, Hough-Evans BR, Franks RR, Britten RJ, Davidson EH. A regulatory domain that directs lineage-specific expression of a skeletal matrix protein gene in the sea urchin embryo. Genes Dev. 1988;2:1238–50. doi: 10.1101/gad.2.10.1238.
- Teboul L, Hadchouel J, Daubas P, Summerbell D, Buckingham M, Rigby PW. The early epaxial enhancer is essential for the initial expression of the skeletal muscle determination gene Myf5 but not for subsequent, multiple phases of somitic myogenesis. Development. 2002;129:4571–80. doi: 10.1242/dev.129.19.4571.
- Yon J, Fried M. Precise gene fusion by PCR. Nucleic Acids Res. 1989;17:4895. doi: 10.1093/nar/17.12.4895.
- Yuh CH, Davidson EH. Modular cis-regulatory organization of Endo16, a gut-specific gene of the sea urchin embryo. Development. 1996;122:1069–82. doi: 10.1242/dev.122.4.1069.