Protein Science : A Publication of the Protein Society. 1999 Jun;8(6):1358–1361. doi: 10.1110/ps.8.6.1358

Simple sequence is abundant in eukaryotic proteins.

G B Golding
PMCID: PMC2144344  PMID: 10386886

Abstract

All proteins of Saccharomyces cerevisiae were compared to determine how frequently segments of one protein occur in other proteins. Proteins that are recently related by evolution were excluded. The most frequently shared segments are long tandem repetitions of a single amino acid. For some of these segments, up to 14% of all proteins in the genome contain similar peptides. These peptide segments may not be functional protein domains. Although they are the most common shared feature of yeast proteins, their ubiquity and simplicity suggest that they may simply serve as spacers between other protein motifs.
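
To make the idea of "tandem repetitions of a single amino acid" concrete, the following is a minimal sketch of how such runs could be located in a set of protein sequences and how the share of proteins carrying them could be tallied. It is not the comparison procedure used in the article. The function names, the minimum run length of 5, and the toy sequences are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def homopolymer_runs(seq, min_len=5):
    """Return (residue, start, length) for every run of one repeated
    residue that is at least min_len long. min_len is an assumed cutoff."""
    # (.) captures one residue; \1{n,} requires at least n further copies.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(seq)]


def fraction_with_runs(proteins, min_len=5):
    """Map each residue to the fraction of proteins that contain at
    least one run of that residue."""
    counts = Counter()
    for seq in proteins.values():
        # A set ensures each protein is counted once per residue.
        residues = {res for res, _, _ in homopolymer_runs(seq, min_len)}
        counts.update(residues)
    total = len(proteins)
    return {res: n / total for res, n in counts.items()}


# Hypothetical toy sequences, not real yeast proteins.
toy = {
    "protA": "MKTQQQQQQQLV",
    "protB": "MSSSSSSAGQQQQQK",
    "protC": "MKTAYIAKQR",
}

if __name__ == "__main__":
    for name, seq in toy.items():
        print(name, homopolymer_runs(seq))
    print(fraction_with_runs(toy))
```

Exact matching of this kind only finds perfect runs. Detecting "similar peptides", as the abstract describes, would need a tolerance for interruptions or a sequence-similarity search, which this sketch does not attempt.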



