Journal of Bacteriology. 1995 Mar;177(6):1585–1588. doi: 10.1128/jb.177.6.1585-1588.1995

Widespread protein sequence similarities: origins of Escherichia coli genes.

B Labedan, M Riley
PMCID: PMC176776  PMID: 7883716

Abstract

To learn more about the evolutionary origins of Escherichia coli genes, we surveyed systematically for extended sequence similarities among the 1,264 amino acid sequences encoded by chromosomal genes of E. coli K-12 in SwissProt release 26 by using the FASTA program and imposing the following criteria: (i) alignment of segments at least 100 amino acids long and (ii) at least 20% amino acid identity. Altogether, 624 extended alignments meeting the two criteria were identified, corresponding to 577 protein sequences (45.6% of the 1,264 E. coli protein sequences) that had an extended alignment with at least one other E. coli protein sequence. To exclude alignments of questionable biological significance, we imposed a high threshold on the number of gaps allowed in each of the 624 extended alignments, giving us a subset of 464 proteins. The population of 464 alignments has the following characteristics expressed as median values of the group: 254 amino acids in the alignment, representing 86% of the length of the protein, 33% of the amino acids in the alignment being identical, and 1.1 gaps introduced per 100 amino acids of alignment. Where functions are known, nearly all pairs consist of functionally related proteins. This implies that the sequence similarity we detected has biological meaning and did not arise by chance. That a major fraction of E. coli proteins form extended alignments strongly suggests the predominance of duplication and divergence of ancestral genes in the evolution of E. coli genes. The range of degrees of similarity shows that some genes originated more recently than others. There is no evidence of genome doubling in the past, since map distances between genes of sequence-related proteins show no coherent pattern of favored separations.
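The two selection criteria and the later gap filter can be illustrated with a short sketch. This is not the authors' pipeline: the `Alignment` record, the function names, the example pairs, and in particular the `max_gaps_per_100` value are assumptions made for illustration, since the abstract does not state the gap threshold that was actually applied. Only the 100-residue and 20% identity cutoffs come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical summary of one pairwise alignment (e.g. parsed from FASTA output)."""
    query: str
    subject: str
    length: int      # aligned segment length, in amino acids
    identities: int  # number of identical positions
    gaps: int        # number of gaps introduced

    @property
    def percent_identity(self) -> float:
        return 100.0 * self.identities / self.length

    @property
    def gaps_per_100(self) -> float:
        return 100.0 * self.gaps / self.length


# Criteria stated in the abstract.
MIN_LENGTH = 100       # (i) at least 100 aligned amino acids
MIN_IDENTITY = 20.0    # (ii) at least 20% amino acid identity


def is_extended(aln: Alignment) -> bool:
    """True if the alignment meets both criteria from the abstract."""
    return aln.length >= MIN_LENGTH and aln.percent_identity >= MIN_IDENTITY


def passes_gap_filter(aln: Alignment, max_gaps_per_100: float) -> bool:
    """Gap filter; the cutoff value is an assumption, not taken from the paper."""
    return aln.gaps_per_100 <= max_gaps_per_100


def proteins_in(alignments) -> set:
    """Distinct protein names appearing in a collection of alignments."""
    return {name for a in alignments for name in (a.query, a.subject)}


# Made-up example pairs, not data from the study.
examples = [
    Alignment("protA", "protB", length=254, identities=84, gaps=3),   # passes both steps
    Alignment("protC", "protD", length=80, identities=40, gaps=0),    # too short
    Alignment("protE", "protF", length=150, identities=24, gaps=1),   # 16% identity
    Alignment("protG", "protH", length=120, identities=30, gaps=10),  # too many gaps
]

extended = [a for a in examples if is_extended(a)]
stringent = [a for a in extended if passes_gap_filter(a, max_gaps_per_100=5.0)]

# Fraction of sequences with an extended alignment, from the counts in the abstract.
fraction_with_alignment = 100.0 * 577 / 1264
```

Keeping the gap cutoff as a function parameter rather than a constant makes it explicit that this value is a tunable assumption, unlike the two criteria quoted in the abstract.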



Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)
