Abstract
The HMG box is a novel type of DNA-binding domain found in a diverse group of proteins. The HMG box superfamily comprises, among others, the High Mobility Group proteins HMG1 and HMG2, the nucleolar transcription factor UBF, the lymphoid transcription factors TCF-1 and LEF-1, the fungal mating-type genes mat-Mc and MATA1, and the mammalian sex-determining gene SRY. The superfamily dates back at least 1,000 million years, as its members appear in animals, plants and yeast. Alignment of all known HMG boxes defined an unusually loose consensus sequence. We constructed phylogenetic trees connecting the members of the HMG box superfamily in order to understand their evolution. This analysis led us to distinguish two subfamilies: one comprising proteins with a single sequence-specific HMG box, the other encompassing relatively non-sequence-specific DNA-binding proteins with multiple HMG boxes. By studying the extent of diversification of the superfamily, we found that the speed of evolution differed markedly among the various groups of HMG-box-containing factors. Comparison of the evolution of the two boxes of ABF2 and of mtTF1 implied different diversification models for these two proteins. Finally, we provide a tree for the highly complex group of SRY-like ('Sox') genes, clustering at least 40 different loci that rapidly diverged in various animal lineages.