Abstract
Background
The Runt DNA binding domain (Runx) defines a metazoan family of sequence-specific transcription factors with essential roles in animal ontogeny and stem cell based development. Depending on cis-regulatory context, Runx proteins mediate either transcriptional activation or repression. In many contexts Runx-mediated repression is carried out by Groucho/TLE, recruited to the transcriptional complex via a C-terminal WRPY sequence motif that is found encoded in all heretofore known Runx genes.
Findings
Full-length Runx genes were identified in the recently sequenced genomes of phylogenetically diverse metazoans, including placozoans and sponges, the most basally branching members of that clade. No sequences with significant similarity to the Runt domain were found in the genome of the choanoflagellate Monosiga brevicollis, confirming that Runx is a metazoan apomorphy. A contig assembled from genomic sequences of the haplosclerid demosponge Amphimedon queenslandica was used to construct a model of the single Runx gene from that species, AmqRunx, the veracity of which was confirmed by expressed sequences. The encoded sequence of the Runx protein OscRunx from the homoscleromorph sponge Oscarella carmela was also obtained from assembled ESTs. Remarkably, a syntenic linkage between Runx and Supt3h, previously reported in vertebrates, is conserved in A. queenslandica. Whereas OscRunx encodes a C-terminal Groucho-recruitment motif, AmqRunx does not, although a Groucho homologue is found in the A. queenslandica genome.
Conclusion
Our results are consistent with the hypothesis that sponges are paraphyletic, and suggest that Runx-WRPY mediated recruitment of Groucho to cis-regulatory sequences originated in the ancestors of eumetazoans following their divergence from demosponges.
Findings
The Runt domain (Runx) is a highly conserved 128 amino acid sequence motif that defines a metazoan family of sequence-specific DNA binding proteins required for the ontogeny of each of the animal species in which it has been functionally studied, as well as for the regulation of somatic stem cells and development of the lineages to which they give rise [1-4]. Runx genes facilitate developmental coordination of cell proliferation and differentiation [1], integrating the transduction of multiple signalling pathways [2] by nucleating the assembly of signal-responsive cis-regulatory modules [5]. Runx genes have only been found in animals [6,7], suggesting that they may have evolved in concert with metazoan systems for developmental signalling.
All heretofore known Runx genes encode proteins that bear at their C-terminus a WRPY sequence motif (or a close variant thereof), which functions to recruit the Groucho/TLE corepressor to the cis-regulatory system [8-12]. Runx-WRPY mediated recruitment of Groucho is relatively weak and controlled by cis-regulatory sequence context [12,13]. Depending on such context, Runx proteins can also function as Groucho-independent repressors, as well as activators [8,14].
The purpose of this study was to extend our previous investigation of the evolution of Runx genes [6] by analyzing and comparing several new Runx gene sequences collected from recently sequenced genomes of lophotrochozoans and basally branching metazoans (see Additional File 1 for detailed methods). Although cnidarian and sponge Runx genes were described in a recent report [7], that study left open the question of whether the sponge Runx proteins bear a C-terminal Groucho recruitment motif. To address that question we examined Runx-encoding genomic and cDNA sequences from two sponges (Amphimedon queenslandica and Oscarella carmela), and compared these to Runx sequences collected from a phylogenetically broad sampling of other metazoan genomes, including that of the placozoan Trichoplax adhaerens [15].
Runx is a metazoan synapomorphy that has undergone independent duplications in a subset of triploblast lineages
Figure 1 depicts several representative examples of previously known [6,7] or newly revealed (Table 1) Runx genes from across metazoan phylogeny, clustered according to the phylogenetic topology obtained by Sperling et al. [16]. As recently shown by Sullivan et al. [7], Runx-encoding sequences extend to the base of the metazoan family tree, with single orthologues encoded in the genome of the haplosclerid demosponge A. queenslandica and in expressed sequence tags from the homoscleromorph sponge O. carmela. Similarly, the anthozoan cnidarian Nematostella vectensis and the placozoan Trichoplax adhaerens each have a single Runx gene, as do several triploblast species, including the lancelet Branchiostoma floridae and the sea squirt Ciona intestinalis among deuterostomes; the nematode Caenorhabditis elegans among ecdysozoans; and the polychaete Capitella sp. I and the mollusk Lottia gigantea among lophotrochozoans. In contrast, vertebrates, sea urchins (Strongylocentrotus purpuratus), dipteran insects (Drosophila melanogaster), clitellate annelids (Helobdella robusta), and planarians (Schmidtea mediterranea) each have two or more Runx genes.
Table 1.
Species | Genome Database (URL) | Version | NCBI Acc. No. |
S. purpuratus | http://sugp.caltech.edu/SpBase | 2.1 | NW_001330224 |
B. floridae | http://genome.jgi-psf.org | 1.0 | N.A. |
H. robusta | http://genome.jgi-psf.org | 1.0 | N.A. |
Capitella sp. I* | http://genome.jgi-psf.org | 1.0 | N.A. |
L. gigantea | http://genome.jgi-psf.org | 1.0 | N.A. |
T. adhaerens | http://genome.jgi-psf.org | 1.0 | N.A. |
S. mediterranea | http://smedgd.neuro.utah.edu/index.html | 1.3.14 | N.A. |
A. queenslandica | http://compagen.zoologie.uni-kiel.de/index.html | N.A. | N.A. |
O. carmela** | http://compagen.zoologie.uni-kiel.de/index.html | N.A. | N.A. |
The table identifies the genome project and assembly version from which each of the new sequences described here was obtained, as well as available NCBI genomic contig reference assemblies. The sequences and links to each locus on the respective genome browsers are provided in Additional File 2. N.A., not available. *Complete gene model obtained from raw contig sequence using GeneScan. **ESTs only.
Comparison of the gene architectures suggests that the primordial Runx gene contained three introns, the first of which interrupts the coding sequence of the Runt domain (found in every representative except for the insect runt orthologues), the second of which lies at the C-terminal end of the Runt domain (found in all of the representatives except two, HrRunx2 and LgRunx, both from lophotrochozoans), and the third of which lies between the two exons that encode the poorly conserved C-terminal sequence of the protein (missing in three of the insect genes and one of the leech genes; Fig. 1). This basic four-exon architecture is displayed by the demosponge, placozoan and anthozoan Runx genes, and among the known triploblast Runx genes, by the two sea urchin paralogues, the single lancelet orthologue, and the two planarian paralogues. Except for the additional intron within the sequence that encodes the N-terminal half of the Runt domain in all the vertebrate paralogues (Fig. 1), the basal architecture is conserved in vertebrate Runx3, which supports previous propositions for that gene being the most ancient of the vertebrate paralogues [17]. This additional N-terminal intron, shared by all of the vertebrate Runx paralogues, is also found in the C. intestinalis orthologue (but not in that of the cephalochordate B. floridae), consistent with recent phylogenies that place cephalochordates basal to {urochordates+vertebrates} in the chordate lineage [18].
To confirm and extend previous analyses of Runx family relations [6,7], we used our expanded Runx sequence dataset to calculate trees by Bayesian, distance neighbor-joining (NJ), and maximum likelihood (ML) methods. The three trees have slightly different topologies; the Bayesian tree is shown in Figure 2A. All three analyses confidently support the branch separating the two sponge Runx genes from eumetazoan genes. Additionally, the protostome and chordate clades are recovered in all three trees but the positions of cnidarian, placozoan, and echinoderm genes differ between analyses. While only the NJ tree places echinoderms correctly inside a deuterostome clade, this clade also erroneously includes cnidarian and placozoan genes. Bayesian and ML analyses correctly place the latter two genes at the base of the bilaterian clade but wrongly group echinoderm genes with protostome genes. Relationships within the protostomes are unclear and none of the three analyses separates these genes into lophotrochozoan and ecdysozoan clades. This may be due to long-branch attraction between the Runx genes from S. mediterranea, H. robusta, and C. elegans. Thus, these genes were removed in a second set of analyses (Fig. 2B), where a lophotrochozoan clade and a clade comprising the four D. melanogaster genes are recovered in all three trees. These analyses suggest that there was only one Runx gene in the lineage between the metazoan and the lophotrochozoan-ecdysozoan last common ancestors. Hence, the multiple Runx genes present in some of the animals in this study are most likely the products of independent duplications within each of the lineages [6] (Fig. 1, colored boxes; note that a second sea urchin Runx gene, SpRunt-2, was recently found to be encoded in the sea urchin genome [19,20], in contradiction to several previous reports [1,6,7,21]).
Previous reports have noted the absence of any Runx homologues in sequenced genomes of unicellular organisms [6,7], including the choanoflagellate M. brevicollis [22], a member of the Holozoa taxon that is most closely related to Metazoa. We confirmed the absence of a Runx sequence motif in the M. brevicollis genome using tBLASTn searches. Thus, the Runt domain appears to have evolved in concert with complex multicellularity in the animal clade. Furthermore, unlike many other metazoan-specific transcription factor classes [23], the Runx gene did not duplicate in early animals, or even within some of the bilaterian lineages.
AmqRunx lacks a Groucho recruitment motif
As reported previously [7], Runx genes are found in both the haplosclerid demosponge A. queenslandica and the homoscleromorph sponge O. carmela. Although genome sequence is not yet available for the latter, a sequence encoding a Runx protein was recovered from an assembly of available ESTs. The predicted OscRunx protein terminates with the amino acid sequence WRPY (Fig. 3) [see Additional File 2], the C-terminal Groucho-recruitment motif found encoded in all heretofore known Runx genes (Fig. 1). Note that there are vertebrate splice variants that lack a C-terminal WRPY [24-26], and that one each of the two leech and two planarian paralogues do not appear to terminate in WRPY (Fig. 1) [see Additional File 2]. Thus, some contexts have functional requirements for Runx protein isoforms lacking a C-terminal WRPY. Nevertheless, all of the eumetazoan species depicted in Fig. 1 (as well as the homoscleromorph sponge) encode at least one Runx protein that terminates in WRPY or a close variant thereof.
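The check described above, whether a predicted protein terminates in WRPY or a close variant, can be illustrated with a minimal sketch. This is not the procedure used in the study; the function names, the toy sequences, and in particular the definition of "close variant" as at most one mismatch to WRPY are assumptions made only for illustration.

```python
# Illustrative sketch only: "close variant" is ASSUMED here to mean at most
# one mismatch to WRPY; the study does not define the term this way.

def cterm_mismatches(protein, motif="WRPY"):
    """Count mismatches between the last len(motif) residues and the motif.

    Returns None if the protein is shorter than the motif.
    A trailing '*' (stop symbol) is ignored.
    """
    seq = protein.upper().rstrip("*")
    if len(seq) < len(motif):
        return None
    tail = seq[-len(motif):]
    return sum(a != b for a, b in zip(tail, motif))


def has_cterm_motif(protein, motif="WRPY", max_mismatch=1):
    """True if the C-terminus matches the motif within max_mismatch residues."""
    d = cterm_mismatches(protein, motif)
    return d is not None and d <= max_mismatch


# Hypothetical toy sequences, not real Runx proteins.
toy = {
    "ends_WRPY": "MSTPSAVWRPY",
    "ends_WRPW": "MSTPSAVWRPW*",
    "no_motif": "MSTPSAVQLSA",
}
for name, seq in toy.items():
    print(name, has_cterm_motif(seq))
```

Setting `max_mismatch=0` restricts the test to an exact WRPY terminus.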
A genomic sequence contig from A. queenslandica was predicted to encode a Runx gene with four exons, displaying an architecture very similar to that of the placozoan and cnidarian genes (Fig. 1) [7]. The predicted coding sequence of AmqRunx is 1,566 bp, with the Runt domain contained within the first 474 bp. As is typical for Runx proteins, the predicted C-terminal domain of AmqRunx (amino acid residues 159–479) is enriched for proline (12%), serine (16%), and threonine (7%) residues, a PST enrichment similar to that previously reported for the C-terminal domain of NvRunx [7] and that displayed by the C-terminal domain of OscRunx (Fig. 3). Surprisingly, however, the C-terminus of AmqRunx does not bear the WRPY motif or any variant thereof (Fig. 3). Furthermore, no open reading frames encoding WRPY were found along the genomic contig that contains AmqRunx. The A. queenslandica genome does, however, encode a bona fide Groucho homologue (Additional File 3 and unpublished data), as well as several transcription factors that are predicted to interact with Groucho [12], including a hairy/Hey homologue with a FRPW motif and a number of NK class genes with an engrailed homology 1 (EH-1) motif ([27,28]; BMD, unpublished data).
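Two of the computations mentioned in this paragraph, the proline/serine/threonine composition of a protein region and a scan of a genomic contig for WRPY-encoding sequence, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code, searches all six reading frames for the motif without building gene models, and runs on a short made-up DNA string rather than the A. queenslandica contig. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def composition(protein, residues="PST"):
    """Fraction of the protein made up by each listed residue."""
    seq = protein.upper().rstrip("*")
    return {r: seq.count(r) / len(seq) for r in residues}


def reverse_complement(dna):
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_hits(dna, motif="WRPY"):
    """Return (strand, frame, amino-acid index) for every motif occurrence."""
    hits = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            start = protein.find(motif)
            while start != -1:
                hits.append((strand, frame, start))
                start = protein.find(motif, start + 1)
    return hits


# Made-up example: ATG, then codons for W-R-P-Y, then a stop codon.
toy_dna = "ATGTGGCGTCCTTATTAA"
print(translate(toy_dna))
print(six_frame_hits(toy_dna))
print(composition("PPSTAAAAAA"))
```

An empty list from `six_frame_hits` on a contig would correspond to the negative result reported above, subject to the caveat that a motif split across an intron would be missed by this simple scan.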
The lack of a C-terminal WRPY motif in AmqRunx was verified by expressed sequence data. Based on alignment with genomic DNA, EST sequence 2941805_1 was found to encode the last 115 bp of the AmqRunx coding sequence, the stop codon, and an additional 626 bp of 3' UTR spanning two exons. In order to confirm that this EST was transcribed from AmqRunx, oligonucleotide primers – forward primer in the Runt domain and reverse primer in the EST-encoded 3' UTR region – were used to amplify the sequence both from A. queenslandica adult and embryonic RNA. An amplicon of the correct size and sequence was obtained (Additional File 4), thus confirming the veracity of the AmqRunx gene prediction.
The contig bearing AmqRunx contains sequences predictive of additional genes flanking the Runx gene (Fig. 4), which argues against the possibility that the AmqRunx gene model is missing a C-terminal exon that might produce alternative splice variants. Moreover, the veracity of the contig assembly is further supported by the remarkable fact that a syntenic relationship between Runx and Supt3h, previously reported to exist in vertebrates [29] and which we found also to exist in cnidarians (N. vectensis), lancelets (B. floridae), and polychaetes (Capitella sp. I), is conserved in the demosponge (Fig. 4).
Although homoscleromorph sponges are still commonly grouped with demosponges in the phylum Porifera (Fig. 5A), this classification has been called into question, as has the monophyly of sponges (and hence 'Porifera' as a true phylum) [16]. The fact that AmqRunx lacks a C-terminal WRPY motif is consistent with the more recent proposition that sponges are paraphyletic [16,30], with calcisponges and homoscleromorphs branching after demosponges along the lineage leading to eumetazoans (Fig. 5B). The conventional scenario, which holds that sponges are monophyletic (Fig. 5A), would require that several characters held in common between eumetazoans and homoscleromorph sponges (i.e., acrosomes, true epithelia, and a C-terminal WRPY motif linked to Runx) be either convergent homoplasies, or metazoan plesiomorphies that were all lost in the demosponge lineage leading to A. queenslandica. Although it is possible that the loss of multiple characters occurred within the demosponge lineage, it is unlikely that body plan simplification is in itself sufficient to relax the selection pressure for maintaining the Runx-WRPY linkage, as evidenced by its maintenance in placozoans. The more parsimonious scenario is that the C-terminal WRPY motif of Runx proteins, and presumably the consequent recruitment of Groucho to a subset of Runx target cis-regulatory modules, originated in eumetazoan ancestors following their divergence from the sponge lineage leading to A. queenslandica (Fig. 5B). An interesting possibility is that the Runx associated WRPY motif originated in Epitheliozoa {eumetazoans and homoscleromorphs} [16], which would suggest that Runx-WRPY mediated cis-regulatory recruitment of Groucho is functionally linked to the evolution and development of an epithelium. Testing this possibility awaits the sequencing of a calcisponge Runx gene.
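The parsimony argument above can be made concrete with a small worked example that counts the minimum number of state changes for a single binary character (Runx-linked WRPY present or absent) on the two competing topologies, using Fitch's algorithm. This is only an illustrative sketch and was not part of the analysis reported here. It rests on several assumptions: the trees are reduced to four taxa, calcisponges are omitted because their state is unknown, and the choanoflagellate outgroup is coded as lacking the character because it has no Runx gene at all.

```python
# Illustrative sketch: Fitch parsimony for one binary character on
# nested-tuple trees. Taxon sampling and outgroup coding are ASSUMPTIONS.

def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes)."""
    if isinstance(tree, str):            # leaf: look up its observed state
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                           # children agree: no change needed
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# 1 = Runx protein with C-terminal WRPY (or close variant); 0 = absent.
wrpy = {
    "choanoflagellate": 0,   # assumed absent: no Runx gene found
    "demosponge": 0,         # AmqRunx lacks WRPY
    "homoscleromorph": 1,    # OscRunx ends in WRPY
    "eumetazoan": 1,
}

# Sponges monophyletic (as in Fig. 5A) versus paraphyletic (as in Fig. 5B).
monophyly = ("choanoflagellate", (("demosponge", "homoscleromorph"), "eumetazoan"))
paraphyly = ("choanoflagellate", ("demosponge", ("homoscleromorph", "eumetazoan")))

print("monophyly changes:", fitch(monophyly, wrpy)[1])
print("paraphyly changes:", fitch(paraphyly, wrpy)[1])
```

Under these assumptions the monophyletic topology requires two changes (a gain plus a loss, or two gains) and the paraphyletic topology requires one, which is the sense in which the latter is the more parsimonious scenario for this character.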
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
AJR performed BLAST searches, sequence assemblies, alignments, and computational construction of gene models. CL independently verified the A. queenslandica contig assembly and Runx gene model, performed the phylogenetic analyses, and obtained the PCR amplicon of AmqRunx cDNA. BMD performed some sequence assemblies, provided intellectual guidance and assisted in the writing of the manuscript. JAC performed some of the BLAST searches and sequence alignments, and drafted the manuscript and figures. All authors read and approved the final manuscript.
Acknowledgements
This work was supported by funding from the NIH (GM070840 to JAC) and ARC (to BMD). We thank Kevin Peterson for providing helpful suggestions that improved the manuscript prior to submission.
Contributor Information
Anthony J Robertson, Email: tony@mdibl.org.
Claire Larroux, Email: c.larroux1@uq.edu.au.
Bernard M Degnan, Email: b.degnan@uq.edu.au.
James A Coffman, Email: jcoffman@mdibl.org.
References
- Coffman JA. Runx transcription factors and the developmental balance between cell proliferation and differentiation. Cell Biol Int. 2003;27:315–324. doi: 10.1016/S1065-6995(03)00018-0.
- Coffman JA. Is Runx a linchpin for developmental signaling in metazoans? J Cell Biochem. 2009;107:194–202. doi: 10.1002/jcb.22143.
- Kagoshima H, Shigesada K, Kohara Y. RUNX regulates stem cell proliferation and differentiation: insights from studies of C. elegans. J Cell Biochem. 2007;100:1119–1130. doi: 10.1002/jcb.21174.
- Nimmo R, Woollard A. Worming out the biology of Runx. Dev Biol. 2008;313:492–500. doi: 10.1016/j.ydbio.2007.11.002.
- Westendorf JJ, Hiebert SW. Mammalian runt-domain proteins and their roles in hematopoiesis, osteogenesis, and leukemia. J Cell Biochem. 1999:51–58. doi: 10.1002/(SICI)1097-4644(1999)75:32+<51::AID-JCB7>3.0.CO;2-S.
- Rennert J, Coffman JA, Mushegian AR, Robertson AJ. The evolution of Runx genes I. A comparative study of sequences from phylogenetically diverse model organisms. BMC Evol Biol. 2003;3:4. doi: 10.1186/1471-2148-3-4.
- Sullivan JC, Sher D, Eisenstein M, Shigesada K, Reitzel AM, Marlow H, Levanon D, Groner Y, Finnerty JR, Gat U. The evolutionary origin of the Runx/CBFbeta transcription factors–studies of the most basal metazoans. BMC Evol Biol. 2008;8:228. doi: 10.1186/1471-2148-8-228.
- Wheeler JC, Shigesada K, Gergen JP, Ito Y. Mechanisms of transcriptional regulation by Runt domain proteins. Semin Cell Dev Biol. 2000;11:369–375. doi: 10.1006/scdb.2000.0184.
- Javed A, Guo B, Hiebert S, Choi JY, Green J, Zhao SC, Osborne MA, Stifani S, Stein JL, Lian JB, et al. Groucho/TLE/R-esp proteins associate with the nuclear matrix and repress RUNX (CBF(alpha)/AML/PEBP2(alpha)) dependent activation of tissue-specific gene transcription. J Cell Sci. 2000;113:2221–2231. doi: 10.1242/jcs.113.12.2221.
- Lutterbach B, Westendorf JJ, Linggi B, Isaac S, Seto E, Hiebert SW. A mechanism of repression by acute myeloid leukemia-1, the target of multiple chromosomal translocations in acute leukemia. J Biol Chem. 2000;275:651–656. doi: 10.1074/jbc.275.1.651.
- McLarren KW, Theriault FM, Stifani S. Association with the nuclear matrix and interaction with Groucho and RUNX proteins regulate the transcription repression activity of the basic helix loop helix factor Hes1. J Biol Chem. 2001;276:1578–1584. doi: 10.1074/jbc.M007629200.
- Jennings BH, Pickles LM, Wainwright SM, Roe SM, Pearl LH, Ish-Horowicz D. Molecular recognition of transcriptional repressor motifs by the WD domain of the Groucho/TLE corepressor. Mol Cell. 2006;22:645–655. doi: 10.1016/j.molcel.2006.04.024.
- Canon J, Banerjee U. In vivo analysis of a developmental circuit for direct transcriptional activation and repression in the same cell by a Runx protein. Genes Dev. 2003;17:838–843. doi: 10.1101/gad.1064803.
- Telfer JC, Hedblom EE, Anderson MK, Laurent MN, Rothenberg EV. Localization of the domains in runx transcription factors required for the repression of CD4 in thymocytes. J Immunol. 2004;172:4359–4370. doi: 10.4049/jimmunol.172.7.4359.
- Srivastava M, Begovic E, Chapman J, Putnam NH, Hellsten U, Kawashima T, Kuo A, Mitros T, Salamov A, Carpenter ML, et al. The Trichoplax genome and the nature of placozoans. Nature. 2008;454:955–960. doi: 10.1038/nature07191.
- Sperling EA, Pisani D, Peterson KJ. Poriferan paraphyly and its implications for Precambrian paleobiology. In: Vickers-Rich P, Komarower P, editors. The Rise and Fall of the Ediacaran Biota. Vol. 286. London: The Geological Society of London; 2007. pp. 355–368.
- Levanon D, Glusman G, Bettoun D, Ben-Asher E, Negreanu V, Bernstein Y, Harris-Cerruti C, Brenner O, Eilam R, Lotem J, et al. Phylogenesis and regulated expression of the RUNT domain transcription factors RUNX1 and RUNX3. Blood Cells Mol Dis. 2003;30:161–163. doi: 10.1016/S1079-9796(03)00023-8.
- Putnam NH, Butts T, Ferrier DE, Furlong RF, Hellsten U, Kawashima T, Robinson-Rechavi M, Shoguchi E, Terry A, Yu JK, et al. The amphioxus genome and the evolution of the chordate karyotype. Nature. 2008;453:1064–1071. doi: 10.1038/nature06967.
- Fernandez-Guerra A, Aze A, Morales J, Mulner-Lorillon O, Cosson B, Cormier P, Bradham C, Adams N, Robertson AJ, Marzluff WF, et al. The genomic repertoire for cell cycle control and DNA metabolism in S. purpuratus. Dev Biol. 2006;300:238–251. doi: 10.1016/j.ydbio.2006.09.012.
- Dickey-Sims C, Robertson AJ, Rupp DE, McCarthy JJ, Coffman JA. Runx-dependent expression of PKC is critical for cell survival in the sea urchin embryo. BMC Biol. 2005;3:18. doi: 10.1186/1741-7007-3-18.
- Robertson AJ, Dickey CE, McCarthy JJ, Coffman JA. The expression of SpRunt during sea urchin embryogenesis. Mech Dev. 2002;117:327–330. doi: 10.1016/S0925-4773(02)00201-0.
- King N, Westbrook MJ, Young SL, Kuo A, Abedin M, Chapman J, Fairclough S, Hellsten U, Isogai Y, Letunic I, et al. The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature. 2008;451:783–788. doi: 10.1038/nature06617.
- Larroux C, Luke GN, Koopman P, Rokhsar DS, Shimeld SM, Degnan BM. Genesis and expansion of metazoan transcription factor gene classes. Mol Biol Evol. 2008;25:980–996. doi: 10.1093/molbev/msn047.
- Levanon D, Glusman G, Bangsow T, Ben-Asher E, Male DA, Avidan N, Bangsow C, Hattori M, Taylor TD, Taudien S, et al. Architecture and anatomy of the genomic locus encoding the human leukemia-associated transcription factor RUNX1/AML1. Gene. 2001;262:23–33. doi: 10.1016/S0378-1119(00)00532-1.
- Sun L, Vitolo MI, Qiao M, Anglin IE, Passaniti A. Regulation of TGFbeta1-mediated growth inhibition and apoptosis by RUNX2 isoforms in endothelial cells. Oncogene. 2004;23:4722–4734. doi: 10.1038/sj.onc.1207589.
- Tsuji K, Noda M. Identification and expression of a novel 3'-exon of mouse Runx1/Pebp2alphaB/Cbfa2/AML1 gene. Biochem Biophys Res Commun. 2000;274:171–176. doi: 10.1006/bbrc.2000.3112.
- Larroux C. PhD Thesis. Brisbane, Australia: The University of Queensland; 2007. Genome content and developmental expression of transcription factor genes in the demosponge Amphimedon queenslandica: insights into the first multicellular animal.
- Simionato E, Ledent V, Richards G, Thomas-Chollier M, Kerner P, Coornaert D, Degnan BM, Vervoort M. Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol Biol. 2007;7:33. doi: 10.1186/1471-2148-7-33.
- Glusman G, Kaur A, Hood L, Rowen L. An enigmatic fourth runt domain gene in the fugu genome: ancestral gene loss versus accelerated evolution. BMC Evol Biol. 2004;4:43. doi: 10.1186/1471-2148-4-43.
- Borchiellini C, Chombard C, Manuel M, Alivon E, Vacelet J, Boury-Esnault N. Molecular phylogeny of Demospongiae: implications for classification and scenarios of character evolution. Mol Phylogenet Evol. 2004;32:823–837. doi: 10.1016/j.ympev.2004.02.021.