Author manuscript; available in PMC: 2009 Sep 8.
Published in final edited form as: Circ Res. 2008 Jul 3;103(1):13–15. doi: 10.1161/CIRCRESAHA.108.179978

Deck of CArGs

Joseph M Miano 1
PMCID: PMC2739618  NIHMSID: NIHMS126731  PMID: 18596263

The instructions for generating and sustaining all life forms are encoded within each life form's genome. A major challenge in the post-genomic sequencing era has been assigning functions to the hundreds of thousands of non-coding sequence elements within our nuclear genome that are homologous to sequences in other species. These snippets of DNA have been hypothesized to impart information for initiating DNA replication, for inter- and intra-chromosomal recombination, for structural integration of the genome into the surrounding nucleoskeleton, and for transcriptional regulation of gene expression. The regulatory elements in the last category have been of particular interest inasmuch as variations within our own species and, to some extent, between species are thought largely to be a function of differences in the timing, duration, and intensity of gene expression. Moreover, the explosive rise in single nucleotide polymorphism (SNP) association studies has generated a mounting number of non-coding SNPs whose functions, while poorly defined at this time, will undoubtedly include the regulation of gene expression 1. Finally, the continued discovery of regulatory elements controlling gene expression will augment our “genomic tool box” of reagents for expressing and inactivating genes in a context-dependent manner. Elucidating the function of all regulatory elements in our genome is therefore a critically important endeavor from both a clinical and a basic science perspective.

The three muscle types (cardiac, skeletal, and smooth) display unique patterns of gene expression; however, during development and in some pathological states there is overlap in gene expression profiles, suggesting a common mode of regulation (Fig. 1). For example, the majority of cyto-contractile genes expressed in each muscle cell type are under direct control of the widely expressed transcription factor, serum response factor (SRF). SRF self-dimerizes and binds to a 10 base pair sequence known as a CArG element or CArG box (Fig. 2). CArG boxes are found in the 5' promoter and intronic regions of a rising number of cyto-contractile genes. Based on 20 years of DNA-protein and promoter analyses, as well as comparative genomics, we now recognize that SRF may bind to 1,216 permutations of a CArG box, with CCTTATATGG emerging as a consensus sequence (Fig. 2). Recent genome-wide studies have further advanced our understanding of the base sequence character of CArG elements and have greatly expanded the so-called CArGome 2-5. As of this writing, over 200 CArG boxes controlling expression of some 170 mammalian SRF target genes have been identified, with more than 300 hypothetical CArG boxes awaiting wet-lab validation.
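The figure of 1,216 permutations can be reproduced by a simple count. The sketch below assumes, as one reading that the text itself does not spell out, that the set consists of the 10 base pair pattern CC(A/T)6GG plus every sequence differing from that pattern at exactly one position. The function name `carg_permutations` is illustrative only.

```python
from itertools import product


def carg_permutations():
    """Enumerate CC(A/T)6GG and all single-base deviations from it.

    Assumption (not stated in the text): the 1,216 permutations are the
    64 sequences matching CC(A/T)6GG plus every 10-mer that departs from
    that pattern at exactly one position.
    """
    # 2^6 = 64 sequences with an all-A/T core flanked by CC and GG.
    core = {"CC" + "".join(w) + "GG" for w in product("AT", repeat=6)}
    variants = set(core)
    for seq in core:
        for i, base in enumerate(seq):
            for alt in "ACGT":
                if alt != base:
                    # A/T swaps inside the core land back in `core`;
                    # the set removes such duplicates automatically.
                    variants.add(seq[:i] + alt + seq[i + 1:])
    return core, variants


core, variants = carg_permutations()
print(len(core), len(variants))  # 64 1216
```

Under this assumption the tally is 64 pattern matches, 768 sequences with one flanking C or G replaced (4 positions x 3 bases x 64), and 384 with one core position replaced by C or G (6 positions x 2 bases x 32), for 1,216 in total.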

Figure 1. Overlapping patterns of gene expression in the three muscle types.

Venn diagram illustrates a sampling of muscle cell-specific genes and those (in italics) that are SRF-dependent.

Figure 2. Sequence character of the CArG box.

A sequence logo 20 based on 223 aligned CArG boxes. The height of each nucleotide reflects its frequency across the 223 CArGs. Note that most substitutions occur at positions 2, 5 and 6 with position 4 showing the least amount of variation.
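The per-position frequencies behind such a logo can be tabulated from any set of aligned 10-mers. The sketch below is a minimal illustration and not the WebLogo implementation; the four input sequences are hypothetical stand-ins, not the 223 CArG boxes used for the figure, and `position_frequencies` is an illustrative name.

```python
from collections import Counter


def position_frequencies(aligned):
    """Return one dict per alignment column mapping base -> relative frequency."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs.append({base: counts[base] / len(column) for base in "ACGT"})
    return freqs


# Hypothetical toy alignment, for illustration only.
toy = ["CCTTATATGG", "CCATATATGG", "CCTTATTTGG", "CCAAATATGG"]
freqs = position_frequencies(toy)
for position, column in enumerate(freqs, start=1):
    print(position, column)
```

Columns in which a single base has frequency 1.0 correspond to invariant positions in the logo; columns with split frequencies correspond to positions where substitutions occur.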

SRF possesses relatively weak transcriptional activity, but binds to any one of 56 cofactors that potently activate target gene expression, mainly through alterations in chromatin that are permissive for transcription. Many SRF cofactors exhibit cell-restricted patterns of gene expression during development and postnatal life. One of the more powerful cell-restricted SRF cofactors is myocardin (Myocd), which was first cloned in a bioinformatic screen for cardiac-specific genes 6. Expression of Myocd is highly specific for cardiac muscle and smooth muscle cells (SMC), with transient expression in developing skeletal muscle precursors 6,7. Myocd forms a ternary complex with SRF-bound CArG boxes and, through its association with a variety of other coregulators of gene expression, directs expression of cardiac and SMC cyto-contractile genes 8-10. Though cardiac genes are induced when Myocd is ectopically expressed in non-muscle cells, little evidence of a structural or functional cardiac muscle phenotype is manifest. In contrast, Myocd orchestrates structural, biochemical, and physiological characteristics of SMC 11,12. Thus, Myocd appears to be the SMC equivalent of MyoD, the original master regulator of the skeletal muscle phenotype.

SMC are defined by a molecular signature of gene expression that includes genes encoding contractile, cytoskeletal, ion channel, transcription factor, and signaling proteins, all of which are essential to carry out the unique function of this cell type 13. The regulatory regions of many of these genes have been characterized both in vitro and in vivo, and more than half contain functional CArG elements 14. Now, in this issue of Circulation Research, Petit et al report the discovery of an alternative form of the SMC-specific gene, LIM domain containing preferred translocation partner in lipoma (also known as Lpp), that appears to be under direct control of CArG-SRF-Myocd ternary complexes 15. Lpp protein expression was shown previously to be highly specific for SMC where, in association with vinculin at peripheral dense bodies, it mediates cell migration 16. Petit et al sought to define whether any functional CArG boxes reside in or around the 588-kb mouse Lpp locus using a bioinformatics approach based on CArG nucleotide frequencies (Fig. 2). A total of 35 CArG elements were found over the interrogated sequence, a number surprisingly lower than expected from the theoretical frequency of one CArG every 910 base pairs of DNA sequence. Three of the 35 CArG elements were found to be homologous with corresponding CArGs in several other species, though their position is more than 50 kb away from the annotated start site of Lpp transcription. Since virtually all functional CArG elements reside within a 4-kb window of transcription start sites 5, Petit et al searched for possible internal promoters in the Lpp gene. Their presupposition of an internal promoter is supported by recent knowledge demonstrating a much more complex transcriptome than we ever imagined 17. Using various genomic algorithms and RT-PCR, the authors discovered an additional promoter inside intron 2 of the mouse Lpp gene that directs expression of an SMC-specific transcript. Because the first 4 exons of mouse Lpp are non-coding, the alternate SMC-specific transcript encodes the same LPP protein as that derived from the more proximal promoter.
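To make the idea of interrogating a sequence for CArG boxes concrete, the following is a minimal sketch that scans a DNA string for the pattern CC(A/T)6GG. It is a deliberately simpler stand-in, not the frequency-based method used by Petit et al, and it ignores single-base deviations. The names `CARG_PATTERN` and `find_consensus_cargs` are illustrative.

```python
import re

# Lookahead wrapper so that overlapping matches are also reported.
# CC(A/T)6GG is closed under reverse complementation (the reverse
# complement of any match is itself a match), so scanning one strand
# finds sites on both strands.
CARG_PATTERN = re.compile(r"(?=(CC[AT]{6}GG))")


def find_consensus_cargs(sequence):
    """Return (0-based start, matched 10-mer) for each CC(A/T)6GG site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CARG_PATTERN.finditer(seq)]


# Usage on a short made-up sequence containing the consensus CCTTATATGG.
print(find_consensus_cargs("aaaCCTTATATGGaaa"))  # [(3, 'CCTTATATGG')]
```

A scan of this kind would miss the many functional CArG boxes that depart from the strict pattern at one position, which is one reason a frequency-weighted approach is preferable in practice.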

The intronic Lpp promoter resides ~50 kb downstream from the proximal promoter region, where at least 5 non-homologous CArG elements are found. None of the latter CArGs appears to bind SRF in ChIP assays, though it would be important to extend these studies to additional promoter assays. Interestingly, the human LPP proximal promoter contains 4 CArG boxes over a 350 base pair genomic interval, a density 10-fold greater than that predicted by chance. Whether these CArGs bind SRF and respond to SRF-Myocd is unknown but worth investigating, given that non-homologous CArG boxes have previously been found to be functional 3-5. Two of the three homologous downstream CArGs, one of which (CArG8) is found only 150 base pairs upstream from the intronic promoter's putative transcription start site, bind SRF in both ChIP and gel shift assays. It is curious to note that the one downstream CArG (CArG11) that did not bind SRF in a ChIP assay matches the consensus CArG sequence exactly (Fig. 2). Whether absence of SRF binding stems from flanking sequences or from histone modifications of chromatin that are non-permissive for protein-DNA binding (e.g., hypoacetylation) is unknown. In vitro reporter assays revealed intrinsic promoter activity for the intronic Lpp promoter that was partially dependent upon CArG8. Both SRF and Myocd activated the intronic Lpp promoter in a CArG8-dependent manner, suggesting, at least in vitro, that Lpp is a direct target of SRF-Myocd. These results are congruent with the original finding that Lpp mRNA is induced with Myocd 18. Petit et al further corroborated these findings with elegant expression studies using embryonic stem cells null for SRF as well as various tissues where SRF was deleted specifically in SMC. In both cases, the intronic Lpp promoter-driven transcript was sharply attenuated, but could be rescued upon ectopic SRF expression.
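The density comparisons quoted in this commentary follow from simple arithmetic against the theoretical frequency of one CArG every 910 base pairs. The sketch below works through both cases; the function name `fold_over_chance` is illustrative, and treating the expected count as length divided by 910 is an assumption about how "predicted by chance" is defined.

```python
def fold_over_chance(observed, length_bp, bp_per_site=910):
    """Ratio of observed CArG count to the count expected by chance.

    Assumes the chance expectation is simply length_bp / bp_per_site.
    """
    expected = length_bp / bp_per_site
    return observed / expected


# Human LPP proximal promoter: 4 CArG boxes in 350 bp.
print(round(fold_over_chance(4, 350), 1))       # 10.4

# Mouse Lpp locus: 35 CArG elements in 588 kb (about 646 expected).
print(round(588_000 / 910))                     # 646
print(round(fold_over_chance(35, 588_000), 3))  # 0.054
```

On these assumptions the human proximal promoter carries roughly 10 times the chance density, whereas the mouse locus as a whole carries roughly one-eighteenth of it.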

The study by Petit et al further expands the mammalian CArGome and the molecular signature of SMC lineages. Future work will require a thorough analysis of both the proximal and intronic Lpp promoters in transgenic mice. In addition, the presence of a microRNA (miR-28) within intron 8 of the Lpp gene should be of some interest given the number of SRF-responsive microRNAs 19. Finally, efforts continue to ascertain whether any sequence variants within or adjacent to the more than 500 CArGs identified in our genome are linked to altered target gene expression and human disease.

Acknowledgments

Sources of Funding: The author's work on this subject is funded through National Institutes of Health grant HL-62572.

References

1. Stranger BE, Nica AC, Forrest MS, Dimas A, Bird CP, Beazley C, Ingle CE, Dunning M, Flicek P, Koller D, Montgomery S, Tavare S, Deloukas P, Dermitzakis ET. Population genomics of human gene expression. Nat Genet. 2007;39:1217–1224. doi: 10.1038/ng2142.
2. Philippar U, Schratt G, Dieterich C, Müller JM, Galgóczy P, Engel FB, Keating MT, Gertler F, Schüle R, Vingron M, Nordheim A. The SRF target gene Fhl2 antagonizes RhoA/MAL-dependent activation of SRF. Mol Cell. 2004;16:867–880. doi: 10.1016/j.molcel.2004.11.039.
3. Zhang SX, Gras EG, Wycuff DR, Marriot SJ, Kadeer N, Yu W, Olson EN, Garry DJ, Parmacek MS, Schwartz RJ. Identification of direct serum response factor gene targets during DMSO induced P19 cardiac cell differentiation. J Biol Chem. 2005;280:19115–19126. doi: 10.1074/jbc.M413793200.
4. Balza RO Jr, Misra RP. Role of the serum response factor in regulating contractile apparatus gene expression and sarcomeric integrity in cardiomyocytes. J Biol Chem. 2006;281:6498–6510. doi: 10.1074/jbc.M509487200.
5. Sun Q, Chen G, Streb JW, Long X, Yang Y, Stoeckert CJ Jr, Miano JM. Defining the mammalian CArGome. Genome Res. 2006;16:197–207. doi: 10.1101/gr.4108706.
6. Wang D-Z, Chang PS, Wang Z, Sutherland L, Richardson JA, Small E, Krieg PA, Olson EN. Activation of cardiac gene expression by myocardin, a transcriptional cofactor for serum response factor. Cell. 2001;105:851–862. doi: 10.1016/s0092-8674(01)00404-4.
7. Long X, Creemers EE, Wang D-Z, Olson EN, Miano JM. Myocardin is a bifunctional switch for smooth versus skeletal muscle differentiation. Proc Natl Acad Sci USA. 2007;104:16570–16575. doi: 10.1073/pnas.0708253104.
8. Mack CP, Hinson JS. Regulation of smooth muscle differentiation by the myocardin family of serum response factor co-factors. J Thromb Haemost. 2005;3:1976–1984. doi: 10.1111/j.1538-7836.2005.01316.x.
9. Pipes GCT, Creemers EE, Olson EN. The myocardin family of transcriptional coactivators: versatile regulators of cell growth, migration, and myogenesis. Genes Dev. 2006;20:1545–1556. doi: 10.1101/gad.1428006.
10. Parmacek MS. Myocardin-related transcription factors: critical coactivators regulating cardiovascular development and adaptation. Circ Res. 2007;100:633–644. doi: 10.1161/01.RES.0000259563.61091.e8.
11. Wang Z, Wang D-Z, Pipes GCT, Olson EN. Myocardin is a master regulator of smooth muscle gene expression. Proc Natl Acad Sci USA. 2003;100:7129–7134. doi: 10.1073/pnas.1232341100.
12. Long X, Bell RD, Gerthoffer WT, Zlokovic BV, Miano JM. Myocardin is sufficient for a SMC-like contractile phenotype. Arterioscler Thromb Vasc Biol. 2008. doi: 10.1161/ATVBAHA.108.166066.
13. Owens GK, Kumar MS, Wamhoff BR. Molecular regulation of vascular smooth muscle cell differentiation in development and disease. Physiol Rev. 2004;84:767–801. doi: 10.1152/physrev.00041.2003.
14. Miano JM. Serum response factor: toggling between disparate programs of gene expression. J Mol Cell Cardiol. 2003;35:577–593. doi: 10.1016/s0022-2828(03)00110-x.
15. Petit MMR, Lindskog H, Larsson E, Wasteson P, Athley E, Breuer S, Angstenberger M, Hertfelder D, Mattsson E, Nordheim A, Nelander S, Lindahl P. Smooth muscle expression of lipoma preferred partner is mediated by an alternative intronic promoter that is regulated by serum response factor/myocardin. Circ Res. 2008;103:xxx–XXX. doi: 10.1161/CIRCRESAHA.108.177436.
16. Gorenne I, Nakamoto RK, Phelps CP, Beckerle MC, Somlyo AV, Somlyo AP. LPP, a LIM protein highly expressed in smooth muscle. Am J Physiol Cell Physiol. 2003;285:C674–C685. doi: 10.1152/ajpcell.00608.2002.
17. Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigo R, Gingeras TR, Antonarakis SE, Reymond A. Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res. 2007;17:746–759. doi: 10.1101/gr.5660607.
18. Gorenne I, Jin L, Yoshida T, Sanders JM, Sarembock IJ, Owens GK, Somlyo AP, Somlyo AV. LPP expression during in vitro smooth muscle cell differentiation and stent-induced vascular injury. Circ Res. 2006;98:378–385. doi: 10.1161/01.RES.0000202802.34727.fd.
19. Niu Z, Li A, Zhang SX, Schwartz RJ. Serum response factor micromanaging cardiogenesis. Curr Opin Cell Biol. 2007;19:618–627. doi: 10.1016/j.ceb.2007.09.013.
20. Crooks GE, Hon G, Chandonia J-M, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190. doi: 10.1101/gr.849004.
