Protein Science: A Publication of the Protein Society. 1994 Dec;3(12):2340–2350. doi: 10.1002/pro.5560031218

Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins.

S Pietrokovski
PMCID: PMC2142770  PMID: 7756989

Abstract

Inteins (protein introns) are internal portions of protein sequences that are posttranslationally excised while the flanking regions are spliced together, making an additional protein product. Inteins have been found in a number of homologous genes in yeast, mycobacteria, and extreme thermophile archaebacteria. The inteins are probably multifunctional: they autocatalyze their own splicing, and some have also been shown to be DNA endonucleases. The splice junction regions and two regions similar to homing endonucleases were thought to be the only common sequence features of inteins. This work analyzed all published intein sequences with recently developed methods for detecting weak, conserved sequence features. The methods complemented each other in the identification and assessment of several patterns characterizing the intein sequences. New conserved intein features were discovered, and the known ones were quantitatively described and localized. A general sequence description of all the known inteins was derived from the motifs and their relative positions, and this description was used to search the sequence databases for intein-like proteins. A sequence region in a mycobacterial open reading frame that possesses all of the intein motifs, and that is absent from sequences homologous to both of its flanking sequences, was identified as an intein. A newly discovered putative intein in red algae chloroplasts was found not to contain the endonuclease motifs present in all other inteins. The yeast HO endonuclease was found to have an overall intein-like structure, and a few viral polyprotein cleavage sites were found to be significantly similar to the intein amino-end splice junction motif. The intein features described may serve for the detection of intein sequences.
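The abstract does not spell out the algorithms used, so the following is only a toy sketch of the general idea of describing a protein family by conserved motifs and their relative order, and then scanning a sequence for that description. It is not the method of the paper. The scoring scheme (log-odds against a uniform background), the greedy left-to-right search, all function names, and the example segments and sequence are assumptions invented for illustration; none of them are real intein motifs or data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(instances, pseudocount=1.0):
    """Build a log-odds scoring matrix from equal-length aligned segments.

    Assumption: a uniform background frequency (1/20 per residue).
    Returns one dict (residue -> log2 odds score) per motif column.
    """
    width = len(instances[0])
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(width):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for segment in instances:
            counts[segment[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2(counts[aa] / total / background) for aa in AMINO_ACIDS}
        )
    return pssm


def best_hit(pssm, sequence, start=0):
    """Return (score, position) of the best window at or after `start`, or None."""
    width = len(pssm)
    best = None
    for pos in range(start, len(sequence) - width + 1):
        window = sequence[pos:pos + width]
        # Residues outside the 20-letter alphabet (e.g. X) score 0.
        score = sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, pos)
    return best


def find_ordered_motifs(pssms, sequence, threshold=0.0):
    """Greedily locate each motif, in order, after the end of the previous one.

    Returns the list of start positions, or None if any motif is missing
    or scores at or below `threshold`.
    """
    positions = []
    start = 0
    for pssm in pssms:
        hit = best_hit(pssm, sequence, start)
        if hit is None or hit[0] <= threshold:
            return None
        positions.append(hit[1])
        start = hit[1] + len(pssm)
    return positions


# Hypothetical aligned segments for two made-up motifs (not real intein data).
motif_a = build_pssm(["MKWDE", "MKWNE", "MRWDE"])
motif_b = build_pssm(["PYTGF", "PYSGF", "PFTGF"])

# Hypothetical query sequence containing both motifs in the expected order.
query = "AAAMKWDEAAAAPYTGFAAA"
print(find_ordered_motifs([motif_a, motif_b], query))
```

Requiring the motifs to appear in a fixed order is what turns a set of individually weak patterns into a more specific family description. The greedy search here is a simplification: a real search would also weigh the spacing between motifs and assess the statistical significance of each score.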


