Nucleic Acids Research. 1991 Feb 11;19(3):637–647. doi: 10.1093/nar/19.3.637

Mapping sequenced E.coli genes by computer: software, strategies and examples.

K E Rudd, W Miller, C Werner, J Ostell, C Tolstoshev, S G Satterfield
PMCID: PMC333660  PMID: 2011534

Abstract

Methods are presented for organizing and integrating DNA sequence data, restriction maps, and genetic maps for the same organism but from a variety of sources (databases, publications, personal communications). Proper software tools are essential for successful organization of such diverse data into an ordered, cohesive body of information, and a suite of novel software to support this endeavor is described. Though these tools automate much of the task, a variety of strategies is needed to cope with recalcitrant cases. We describe such strategies and illustrate their application with numerous examples. These strategies have allowed us to order, analyze, and display over one megabase of E. coli DNA sequence information. The integration task often exposes inconsistencies in the available data, perhaps caused by strain polymorphisms or human oversight, necessitating the application of sound biological judgment. The examples illustrate both the level of expertise required of the database curator and the knowledge gained as apparent inconsistencies are resolved. The software and mapping methods are applicable to the study of any genome for which a high resolution restriction map is available. They were developed to support a weakly coordinated sequencing effort involving many laboratories, but would also be useful for highly orchestrated sequencing projects.
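The core idea of placing a sequenced fragment on a genomic restriction map can be illustrated with a minimal sketch. This is not the software described in the article; it is a toy illustration under stated assumptions. The function names `find_sites` and `locate_on_map`, the three-enzyme table, and the `tolerance` parameter are hypothetical choices made for the example. The sketch predicts restriction sites from a sequence, then reports every offset on the map at which each predicted site has a map site for the same enzyme within the given tolerance. It considers one orientation only, treats the map as linear, and does not penalize map sites that are absent from the sequence.

```python
# Recognition sequences of three common restriction enzymes.
# The choice of enzymes here is an assumption made for illustration.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, enzymes=RECOGNITION_SITES):
    """Return sorted (position, enzyme) pairs for every recognition site.

    Positions are 0-based offsets of the first base of the site.
    """
    sequence = sequence.upper()
    sites = []
    for enzyme, pattern in enzymes.items():
        start = sequence.find(pattern)
        while start != -1:
            sites.append((start, enzyme))
            # Step one base forward so overlapping sites are also found.
            start = sequence.find(pattern, start + 1)
    return sorted(sites)


def locate_on_map(seq_sites, genome_map, tolerance):
    """Return candidate offsets of a sequence on a restriction map.

    seq_sites and genome_map are lists of (position, enzyme) pairs.
    The first predicted site is anchored on each map site of the same
    enzyme in turn; an offset is kept when every predicted site has a
    same-enzyme map site within `tolerance` bases of its shifted position.
    """
    if not seq_sites:
        return []
    first_pos, first_enzyme = seq_sites[0]
    offsets = []
    for map_pos, map_enzyme in genome_map:
        if map_enzyme != first_enzyme:
            continue
        offset = map_pos - first_pos
        if all(
            any(e == enzyme and abs(p - (pos + offset)) <= tolerance
                for p, e in genome_map)
            for pos, enzyme in seq_sites
        ):
            offsets.append(offset)
    return offsets
```

A tolerance greater than zero stands in for the measurement error of an experimentally derived map. When no offset is returned, or several are, the case is one that would need the manual strategies and biological judgment the abstract refers to.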



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
