Nucleic Acids Research. 2004 Aug 10;32(14):e114. doi: 10.1093/nar/gnh115

An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes

M W J van Passel, A Bart, R J A Waaijer 1, A C M Luyf 1, A H C van Kampen 1, A van der Ende *
PMCID: PMC514399  PMID: 15304543

Abstract

In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized, among other features, by atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that of the 27 small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N.meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme.

INTRODUCTION

Horizontal gene transfer (HGT) was already identified in 1944 by the same experiment that demonstrated the transformation of non-virulent to virulent Streptococcus pneumoniae (1). The extent of HGT as an evolutionary phenomenon had not been addressed quantitatively on a genomic scale until Lawrence and Ochman (2) calculated that ∼18% of the genome of Escherichia coli MG1655 had been horizontally transferred since its divergence from the Salmonella lineage 100 million years ago. This identified HGT as a major factor in prokaryotic genome evolution. Recently, an extensive database of horizontally transferred genes based on complete bacterial and archaeal genomes has been made available (3).

The rationale behind the computational identification of horizontally transferred DNA is the genome hypothesis, which proposes that for a given prokaryotic genus genomic DNA is relatively constant in codon usage and GC content (4,5). In contrast, horizontally acquired anomalous DNA differs in codon usage and/or GC composition from the recipient genome and can therefore be identified when substantial sequence information is available.

An additional parameter in lateral genomics is based on oligonucleotide compositional extremes: the dinucleotide relative abundance values or genome signature ρ* (6,7). The genome signature is constant among members of a genus, but deviates substantially between members of different genera (8). When used for intragenomic comparisons, ρ* makes an excellent parameter for the identification of anomalous DNA regions. Aberrant dinucleotide frequencies in aDNA are then expressed as the genome dissimilarity δ*, being the average dinucleotide relative abundance difference between the aDNA region and the whole genome (6–8). Although the genome signature is capable of identifying clusters of alien genes and acquired pathogenicity-associated islands (PAI) with an atypical nucleotide composition, highly expressed regions such as ribosomal clusters can also display aberrant dinucleotide frequencies (8,9).

To date, to our knowledge, no method exists that uses one or more of these parameters and enables the selective isolation of anomalous DNA sequences from a microbial genome in vitro. In order to develop such a technique, we investigated a special group of oligonucleotide composition extremes: the local overrepresentation in a genome of palindromic hexanucleotide sequences, specifically restriction endonuclease recognition sites, in aDNA regions. Like the genomic dinucleotide and tetranucleotide frequencies (10,11), frequencies of restriction sites vary between the genomes of different microbial species (12). Avoidance of cognate recognition sequences is probably the operating mechanism (13,14). An HGT event between different organisms may introduce clusters of certain restriction sites in the recipient's genome. Therefore, digestion of the chromosomal DNA with such a restriction endonuclease can produce a limited number of small restriction fragments, comprising potential anomalous DNA, which can be selectively amplified by adaptor-linked PCR [ALP (15)]. The resulting amplicons can be subsequently subcloned and identified by sequence analysis.

Clustering of restriction endonuclease recognition sites in diverse aDNA regions in prokaryotic genomes was illustrated by the in silico assessment of eight genome sequences of five different species. The restriction enzymes for which the hexameric recognition sites are underrepresented were identified for each genome, and restriction fragments between clustered sites, being <5 kb, were analysed for nucleotide composition concerning GC percentage and genomic dissimilarity.

Next, the restriction fragments of Neisseria meningitidis MC58 between 300 bp and 5 kb were analysed in silico for both GC content and genome signature compared to the genomic values. Also, the restriction fragments obtained with the selected restriction endonucleases from N.meningitidis MC58 and Z2491 strains were compared.

Finally, in order to demonstrate the applicability of this technique in vitro, ALP was performed on chromosomal DNA from strain MC58 digested by each of the selected restriction endonucleases. The resulting amplicons were sequenced to verify the predicted sequence composition.

MATERIALS AND METHODS

Bacterial strain and growth conditions

N.meningitidis MC58 is a serogroup B:15:P1.7,16 strain isolated from a case of invasive infection in the UK (16). This wild-type MC58 strain lacks the erythromycin resistance cassette insertion in the capsule gene locus in contrast to the sequenced strain MC58 (17). Neisseriae were grown on heated blood (chocolate) agar plates or in liquid Tryptic Soy Broth (DIFCO) medium at 37°C in a humidified atmosphere of 5% CO2.

Chromosomal DNA preparation and digestion

Chromosomal DNA was isolated with the Puregene DNA isolation kit (Biozym). Restriction digests and subsequent heat inactivation were carried out according to the manufacturer's instructions (Roche).

Adaptor-linked PCR and DNA sequencing

Adaptor-linked PCR was performed as described previously (18). The adaptor and linker sets were MP19 (5′-ACG TCG ACT ATC CAT GAA CAG ATC-3′) and MP23 (5′-GAT CTG TTC ATG-3′) for the ScaI-digested genomic template, MP24 (5′-ACC GAC GTC GAC TAT CCA TGA ACA-3′) and MP20 (5′-CTA GTG TTC ATG-3′) for both the NheI- and SpeI-digested chromosomal DNA, and MP24 and MP23 for the BglII-digested genomic template. PCR amplicons were purified by agarose gel extraction (Qiagen) and subcloned into a pCR2.1 vector (Invitrogen) according to the manufacturer's instructions. E.coli DH5α was transformed by a standard heat shock procedure. The constructed plasmids were isolated with the Wizard Kit (Promega). Inserts were sequenced using standard M13 primers or primer walking on vector or genomic DNA according to the manufacturer's instructions (ABI). Sequences were analysed using the Staden Package (http://www.mrc-lmb.cam.ac.uk/pubseq/).
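The pairing of adaptors and linkers with each enzyme can be checked from the oligonucleotide sequences alone. The sketch below is a hedged illustration rather than part of the published protocol: it assumes that the 3′ part of the short linker anneals to the 3′ end of the long adaptor, and it reports whatever 5′ tail of the linker remains single-stranded. The function names and the `min_anneal` threshold are hypothetical.

```python
# Oligonucleotides as listed in the text (5' to 3', spaces removed).
MP19 = "ACGTCGACTATCCATGAACAGATC"
MP24 = "ACCGACGTCGACTATCCATGAACA"
MP23 = "GATCTGTTCATG"
MP20 = "CTAGTGTTCATG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def linker_overhang(adaptor, linker, min_anneal=6):
    """Return the 5' single-stranded tail of the linker after annealing.

    Assumption (not stated in the text): the longest 3' stretch of the
    linker that is complementary to the 3' end of the adaptor forms the
    duplex. Returns None if fewer than min_anneal bases can pair.
    """
    for k in range(len(linker), min_anneal - 1, -1):
        if adaptor.endswith(reverse_complement(linker[-k:])):
            return linker[:len(linker) - k]
    return None


if __name__ == "__main__":
    pairs = {
        "ScaI": (MP19, MP23),
        "BglII": (MP24, MP23),
        "NheI/SpeI": (MP24, MP20),
    }
    for enzyme, (adaptor, linker) in pairs.items():
        tail = linker_overhang(adaptor, linker)
        print(enzyme, "overhang:", tail if tail else "(blunt)")
```

Under these assumptions, MP19/MP23 forms a blunt duplex, which fits the blunt ends left by ScaI, whereas MP24/MP23 and MP24/MP20 leave 5′-GATC and 5′-CTAG tails, which match the cohesive ends produced by BglII and by NheI or SpeI, respectively.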

Software

The restriction site frequency tables from the various genomes were obtained from http://tools.neb.com/~posfai/FINISHED. The in silico digestions of the various sequenced genomes (for accession numbers see Table 1) were performed using the Restriction Digest tool from The Institute for Genomic Research (TIGR) (http://www.tigr.org). In silico retrieval and identification of the restriction fragments was performed with the Position Search/Segment Retrieval tool from TIGR (http://www.tigr.org). The different genomes of N.meningitidis were compared using the Artemis Comparison Tool (ACT) (http://www.sanger.ac.uk).
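The in silico digestion itself was done with the TIGR web tools, whose internals are not described here. As a rough illustration of the same idea, the following minimal sketch locates every occurrence of a palindromic hexamer site in a sequence, cuts at a fixed offset within the site, and keeps the fragments below a size limit. The function names are hypothetical; the recognition sequences and cut offsets are the standard ones for these four enzymes. A site that spans the origin of a circular sequence is not detected by this simple version.

```python
import re

# Recognition sequence and cut offset on the top strand (cut after this
# many bases of the site), e.g. BglII cuts A^GATCT, ScaI cuts AGT^ACT.
ENZYMES = {
    "BglII": ("AGATCT", 1),
    "NheI": ("GCTAGC", 1),
    "ScaI": ("AGTACT", 3),
    "SpeI": ("ACTAGT", 1),
}


def digest(sequence, enzyme, circular=True):
    """Return fragment lengths (bp) after cutting at every site.

    The sites are palindromic, so scanning one strand is sufficient.
    A lookahead pattern is used so that overlapping sites are all found.
    """
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = [m.start() + offset for m in re.finditer("(?=" + site + ")", seq)]
    if not cuts:
        return [len(seq)]
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # The last cut joins the first cut across the origin.
        fragments.append(len(seq) - cuts[-1] + cuts[0])
    else:
        fragments = [cuts[0]] + fragments + [len(seq) - cuts[-1]]
    return fragments


def small_fragments(fragments, max_len=5000):
    """Keep fragments below the 5 kb limit used in this study."""
    return sorted((f for f in fragments if f < max_len), reverse=True)


if __name__ == "__main__":
    toy = "AAAA" + "AGATCT" + "CCCCC" + "AGATCT" + "GG"
    print(digest(toy, "BglII", circular=False))
    print(small_fragments(digest(toy, "BglII")))
```

Applied to a complete genome sequence, `small_fragments(digest(genome, "ScaI"))` would give a list comparable in kind to the fragment lengths reported in the tables, although coordinates and edge cases may differ from the tool actually used.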

Table 1. Clustering of restriction enzyme recognition sites in anomalous DNA regions in sequenced genomes of various prokaryotes.

| Organism | Accession number (and reference) | Enzyme | Total number of fragments (a) | aDNA (b) |
|---|---|---|---|---|
| Haemophilus influenzae Rd20 | NC000907 (29) | XmaIII | 7 | 7/7 |
| | | ApaI | 7 | 7/7 |
| E.coli O157:H7 VT2 | NC002695 (30) | AvrII | 5 | 5/5 |
| | | XbaI | 3 | 2/3 |
| E.coli K-12 | NC000913 (31) | XbaI | 2 | 1/1 |
| E.coli CFT073 | NC004431 (32) | XbaI | 7 | 4/5 |
| Salmonella enterica serovar Typhi CT18 | NC003198 (33) | XbaI | 7 | 6/6 |
| Methanobacterium thermoautotrophicum delta H | NC000916 (34) | SpeI | 11 | 7/8 |
| N.meningitidis MC58 | NC003112 (17) | BglII | 9 | 6/7 |
| | | ScaI | 11 | 9/10 |
| | | SpeI | 3 | 3/3 |
| | | NheI | 4 | 4/4 |
| N.meningitidis Z2491 | NC003116 (35) | BglII | 4 | 1/2 |
| | | ScaI | 4 | 3/4 |
| | | NheI | 2 | 2/2 |
| Total | | | | 67/74 (91%) |

(a) All restriction fragments up to 5 kb are considered in this column.

(b) For aDNA composition calculations concerning GC percentage and genome dissimilarity, only the fragments between 300 bp and 5 kb were considered.

Data analysis

Fragments were designated anomalous in GC composition if the GC content of the fragment was below the fifth or above the 95th percentile of the genomic GC content distribution, calculated with a window and step size identical to the fragment length (http://www.tigr.org).
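The GC criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the TIGR implementation: the genome is cut into consecutive, non-overlapping windows of the fragment's length, a trailing partial window is ignored, and percentiles are taken by the nearest-rank method. All function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def window_gc(genome, size):
    """GC fraction of consecutive windows (window = step = size)."""
    return [gc_fraction(genome[i:i + size])
            for i in range(0, len(genome) - size + 1, size)]


def nearest_rank_percentile(values, q):
    """Nearest-rank percentile for an integer q between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # ceil(q * n / 100)
    return ordered[rank - 1]


def gc_is_anomalous(fragment, genome, low=5, high=95):
    """True if the fragment's GC content lies outside the low/high
    percentiles of the genomic window distribution."""
    distribution = window_gc(genome, len(fragment))
    gc = gc_fraction(fragment)
    return (gc < nearest_rank_percentile(distribution, low)
            or gc > nearest_rank_percentile(distribution, high))


if __name__ == "__main__":
    toy_genome = "GGGGGAAAAA" * 19 + "AAAAAAAAAA"
    print(gc_is_anomalous("GGGGGGGGGG", toy_genome))
    print(gc_is_anomalous("GGGGGAAAAA", toy_genome))
```

Because the window size equals the fragment length, the reference distribution has to be recomputed for every fragment, which is why the same GC percentage can be flagged for one fragment and not for another of a different size.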

The δ* value for each restriction fragment was calculated as described earlier by Karlin and colleagues (7). In brief, the dinucleotide relative abundance value $\rho^*_{XY}$ is defined as the frequency of the dinucleotide XY divided by the product of the frequencies of the individual nucleotides X and Y, with all frequencies computed over the sequence together with its reverse complement: $\rho^*_{XY} = f_{XY}/(f_X f_Y)$. δ* is the average absolute dinucleotide relative abundance difference, given by $\delta^*(f,g) = \frac{1}{16}\sum_{XY}\left|\rho^*_{XY}(f) - \rho^*_{XY}(g)\right|$, where $\rho^*_{XY}(f)$ denotes the abundance values calculated for fragment f and $\rho^*_{XY}(g)$ the abundance values calculated for the genome g. The δ* of each fragment was compared to a distribution of δ* values constructed for consecutive fragments of identical size obtained from the respective genome sequence. A fragment was scored positive for anomalous DNA composition if its δ* value was above the 90th percentile of the δ* distribution. For determination of genome signature dissimilarities, only sequences between 300 bp and 5 kb were considered. Five kilobase pairs is the median size of imported DNA in Neisseria (19), and it also represents a conservative size limit for technical convenience in amplification procedures. For restriction fragments <300 bp, composition was not computed; this limit represents a conservative lower size limit previously used in studies identifying aDNA by codon usage (3,5).
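The δ* calculation described above can be written down compactly. The following is a minimal sketch rather than the code used in the study: frequencies are counted over the sequence and its reverse complement, characters other than A, C, G and T are skipped, a ρ* value is set to 0 when a nucleotide is absent (an assumption made here only to avoid division by zero), and the 90th percentile is taken by the nearest-rank method. Function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def rho_star(seq):
    """Dinucleotide relative abundances rho*_XY = f_XY / (f_X * f_Y),
    with counts pooled over both strands."""
    seq = seq.upper()
    mono = {b: 0 for b in BASES}
    di = {d: 0 for d in DINUCLEOTIDES}
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    rho = {}
    for d in DINUCLEOTIDES:
        denom = mono[d[0]] * mono[d[1]]
        if denom == 0 or n_di == 0:
            rho[d] = 0.0  # assumption: undefined values are treated as 0
        else:
            rho[d] = (di[d] / n_di) / (denom / n_mono ** 2)
    return rho


def delta_star(f, g):
    """Average absolute difference of the 16 rho* values of f and g."""
    rf, rg = rho_star(f), rho_star(g)
    return sum(abs(rf[d] - rg[d]) for d in DINUCLEOTIDES) / 16


def delta_is_anomalous(fragment, genome, q=90):
    """True if delta*(fragment, genome) exceeds the q-th percentile of
    delta* for consecutive genome windows of the same length."""
    size = len(fragment)
    dist = sorted(delta_star(genome[i:i + size], genome)
                  for i in range(0, len(genome) - size + 1, size))
    rank = max(1, (q * len(dist) + 99) // 100)  # nearest-rank percentile
    return delta_star(fragment, genome) > dist[rank - 1]
```

Pooling both strands makes the signature strand-symmetric, so that, for example, the values for AA and TT are always equal. The values listed in Table 2 are δ* multiplied by 10³.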

RESULTS

Local overrepresentation of hexameric restriction enzyme recognition sites in anomalous DNA regions of sequenced prokaryotic genomes

In order to identify clustered restriction endonuclease recognition sites in the sequenced prokaryotic genomes used in this study, we tested restriction enzymes of which the hexapalindromic recognition sites are underrepresented in the genome sequences (http://tools.neb.com/~posfai/FINISHED). The tendency of the recognition sites to cluster in aDNA regions was assessed by analysing the sequence composition of the restriction fragments obtained in silico (Table 1 and supplementary Tables 1–8). Fragments <300 bp were not considered for their genome dissimilarity and GC composition values, because δ* values of these small fragments are unreliable; this conservative minimal length is also used by Lawrence and Ochman and Garcia-Vallvé and co-workers (3,5). Nevertheless, many of these restriction fragments <300 bp are adjacent to the other fragments <5 kb in their respective genomes (as an example see N.meningitidis MC58 in Table 2 and Figure 1). The results showed that the eight analysed genome sequences did contain clusters of endonuclease restriction sites in aDNA regions (Table 1). The aggregated data showed 74 fragments (lengths between 300 bp and 5 kb) of which 67 (91%) were of anomalous composition.
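Whether a recognition site is underrepresented in a genome can be judged by comparing the observed number of sites with the number expected from base composition. The sketch below is only an illustration of that idea: it uses a zero-order expectation (independent bases), which is not necessarily the model behind the frequency tables consulted in this study, and the function name is hypothetical.

```python
import re


def observed_expected(sequence, site):
    """Return (observed, expected) counts of a site in a linear sequence.

    The expected count assumes independent bases drawn with the
    sequence's own mononucleotide frequencies (zero-order model).
    """
    seq = sequence.upper()
    n = len(seq)
    freq = {b: seq.count(b) / n for b in "ACGT"}
    # Lookahead so that overlapping occurrences are counted.
    observed = len(re.findall("(?=" + site + ")", seq))
    probability = 1.0
    for base in site:
        probability *= freq[base]
    expected = (n - len(site) + 1) * probability
    return observed, expected


if __name__ == "__main__":
    obs, exp = observed_expected("ACGT" * 1024, "AGATCT")
    print(obs, round(exp, 3))
```

For a sequence with equal base frequencies, a hexamer is expected about once every 4^6 = 4096 bp, so fragments well below 5 kb that occur side by side already indicate local enrichment when the site is rare genome-wide.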

Table 2. Restriction fragment numbers, lengths, δ* and GC composition of fragments obtained after in silico digestion of the genome of N.meningitidis MC58 by four selected restriction enzymes.

| Enzyme | Fragment no. (a) | Length (bp) | GC% | GC content <10th percentile | δ* (× 10³) | δ* >90th percentile |
|---|---|---|---|---|---|---|
| BglII | 1 | 2996 (b) | 47 | | 136 | + |
| | 2 | 2889 (b) | 49 | | 117 | + |
| | 3 | 2889 (c) | 49 | | 117 | + |
| | 4 | 2654 (c) | 46 | | 123 | + |
| | 5 | 2461 | 51 | | 136 | + |
| | 6 | 1194 (c) | 55 | | 125 | |
| | 7 | 477 | 34 | + | 218 | + |
| | 8 | 75 (b) | ND | ND | ND | ND |
| | 9 | 21 (d) | ND | ND | ND | ND |
| NheI | 1 | 4723 (e) | 43 | + | 99 | + |
| | 2 | 4392 (f) | 40 | + | 111 | + |
| | 3 | 787 (e) | 35 | + | 123 | |
| | 4 | 670 | 29 | + | 282 | + |
| ScaI | 1 | 4824 (g) | 43 | + | 121 | + |
| | 2 | 4452 (g) | 48 | | 142 | + |
| | 3 | 2496 | 48 | | 132 | + |
| | 4 | 2179 | 57 | | 85 | |
| | 5 | 865 (h) | 38 | + | 171 | + |
| | 6 | 699 (h) | 38 | + | 218 | + |
| | 7 | 600 | 50 | | 184 | + |
| | 8 | 600 (h) | 50 | | 184 | + |
| | 9 | 600 (h) | 50 | | 182 | + |
| | 10 | 533 (h) | 51 | | 192 | + |
| | 11 | 67 (h) | ND | ND | ND | ND |
| SpeI | 1 | 1672 | 36 | + | 132 | + |
| | 2 | 579 (i) | 35 | + | 212 | + |
| | 3 | 470 | 24 | + | 274 | + |

For fragments <300 bp the GC percentage and δ* were not determined (ND).

(a) Out of 25 fragments, 17 were located within one of the anomalous gene clusters A, B or C described by Karlin (8).

(b) Adjacent in anomalous gene cluster A.

(c) Adjacent in anomalous gene cluster C.

(d) Present in anomalous gene cluster B.

(e) Adjacent in anomalous gene cluster A.

(f) Present in anomalous gene cluster C.

(g) Present in anomalous gene cluster A.

(h) Adjacent in anomalous gene cluster C.

(i) Adjacent in anomalous gene cluster B.

Figure 1. Clustering of restriction fragments <5 kb in the aDNA regions of the genome of N.meningitidis MC58 compared with the genome signature distribution. Blocks A, B and C represent the large genomic islands as described by Karlin et al. (8), whereas block D is a large ribosomal protein gene cluster. Block X is a large putative region of horizontal gene transfer identified by Garcia-Vallvé and co-workers (3).

Clustering of hexameric restriction enzyme recognition sites in the genome of N.meningitidis MC58 in different aDNA regions

Assessment of the occurrence of low-frequency restriction sites in the genome sequence of N.meningitidis MC58 revealed that many of the recognition sites of BglII, NheI, ScaI and SpeI clustered in the four regions known to contain large stretches of aDNA. These are also annotated as islands of horizontal transfer (IHT), thereby supporting the notion that in MC58 these recognition sites are relatively overrepresented in regions originating from horizontal transfer (3,8,17) (Figure 1, Table 2).

Of the 27 restriction fragments <5 kb, 21 were located within one of the clusters of anomalous genes described in previous studies (3,8,17) (Figure 1, Table 2). Various ScaI fragments as well as BglII fragments were adjacent to each other in these aDNA regions, confirming the local overrepresentation of recognition sites in these regions.

Calculation of GC composition and δ* values of the 24 restriction fragments between 300 bp and 5 kb obtained by in silico digestion with BglII, NheI, ScaI or SpeI confirmed their anomalous nature; 22 out of 24 of the restriction fragments met our criteria for anomalous DNA (Table 2). Of the 24 fragments, 21 had δ* values above that of the 90th percentile of the genomic δ* value distribution and 11 had a GC percentage lower than that of the fifth percentile of the genomic GC content distribution.
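These counts can be re-derived from the flags listed in Table 2. The short tally below transcribes the 24 fragments between 300 bp and 5 kb from that table (enzyme, length, GC%, low-GC flag, δ* × 10³, high-δ* flag) and counts how many meet either criterion; the variable and function names are hypothetical.

```python
from collections import Counter

# (enzyme, length_bp, gc_percent, gc_flag, delta_x1000, delta_flag),
# transcribed from Table 2; fragments <300 bp (ND) are omitted.
TABLE2 = [
    ("BglII", 2996, 47, False, 136, True),
    ("BglII", 2889, 49, False, 117, True),
    ("BglII", 2889, 49, False, 117, True),
    ("BglII", 2654, 46, False, 123, True),
    ("BglII", 2461, 51, False, 136, True),
    ("BglII", 1194, 55, False, 125, False),
    ("BglII", 477, 34, True, 218, True),
    ("NheI", 4723, 43, True, 99, True),
    ("NheI", 4392, 40, True, 111, True),
    ("NheI", 787, 35, True, 123, False),
    ("NheI", 670, 29, True, 282, True),
    ("ScaI", 4824, 43, True, 121, True),
    ("ScaI", 4452, 48, False, 142, True),
    ("ScaI", 2496, 48, False, 132, True),
    ("ScaI", 2179, 57, False, 85, False),
    ("ScaI", 865, 38, True, 171, True),
    ("ScaI", 699, 38, True, 218, True),
    ("ScaI", 600, 50, False, 184, True),
    ("ScaI", 600, 50, False, 184, True),
    ("ScaI", 600, 50, False, 182, True),
    ("ScaI", 533, 51, False, 192, True),
    ("SpeI", 1672, 36, True, 132, True),
    ("SpeI", 579, 35, True, 212, True),
    ("SpeI", 470, 24, True, 274, True),
]


def summarize(rows):
    """Count fragments meeting each criterion and either criterion."""
    return {
        "total": len(rows),
        "low_gc": sum(1 for r in rows if r[3]),
        "high_delta": sum(1 for r in rows if r[5]),
        "anomalous": sum(1 for r in rows if r[3] or r[5]),
    }


def anomalous_per_enzyme(rows):
    """Number of fragments per enzyme meeting either criterion."""
    return Counter(r[0] for r in rows if r[3] or r[5])


if __name__ == "__main__":
    print(summarize(TABLE2))
    print(dict(anomalous_per_enzyme(TABLE2)))
```

The per-enzyme counts obtained this way (6 for BglII, 4 for NheI, 9 for ScaI and 3 for SpeI) agree with the MC58 rows of Table 1.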

Comparing the different restriction fragment patterns of the sequenced N.meningitidis strains Z2491 and MC58 in silico

The restriction fragment patterns obtained in silico from the two different N.meningitidis strains showed remarkable differences (Table 3). Various restriction fragments located in the annotated anomalous gene clusters or IHTs in N.meningitidis MC58 were not identified in N.meningitidis Z2491, consistent with the notion that these IHTs are absent in strain Z2491 (17). In addition, two anomalous restriction fragments from MC58 (MC58-ScaI-2179 and MC58-SpeI-470), which were not part of one of the previously mentioned IHTs, were located in aDNA regions only present in MC58. The MC58-ScaI-2179 fragment harboured ORF NMB1829, encoding a TonB-dependent receptor, and MC58-SpeI-470 contained a cluster of six open reading frames (ORFs). The latter showed a number of features typical for a PAI (20), such as an atypical GC content compared to the genome sequence and association with a transfer RNA (tRNA) gene (NMB1595) and an insertion sequence IS1106 (ORF NMB1601) at its boundaries. As the functions of these ORFs and their distribution in other pathogenic and non-pathogenic strains are unknown, this region does not formally qualify as PAI, although a heterologous origin is suspected.

Table 3. Sizes and coordinates of the restriction fragments of the two N.meningitidis strains tested, indicating the absence or presence of each fragment in the other strain (GI refers to the genomic islands depicted in Figure 1).

| Enzyme | Size in MC58 (bp) | Size in Z2491 (bp) | Coordinates in MC58 | Coordinates in Z2491 | GI in strain MC58 | Presence in the other strain |
|---|---|---|---|---|---|---|
| BglII | | 4213 | | 1179114–1183327 | | Dispersed in MC58, with inversions and loss of restriction fragment |
| | 2996 | | 525422–528418 | | A | Absent in Z2491 |
| | 2889 | | 1863020–1865909 | | C | Largely present in Z2491 |
| | 2889 | | 522533–525422 | | A | Largely present in Z2491 |
| | 2654 | | 1860366–1863020 | | C | Largely absent in Z2491 |
| | 2461 | 2449 | 726967–729428 | 875412–877861 | | Similar sequences |
| | 1194 | | 1859172–1860366 | | C | Largely present in Z2491 |
| | 477 | | 614379–614856 | | | Absent in Z2491 |
| | | 97 | | 578316–578413 | | Absent in MC58 |
| | 75 | 75 | 542107–542182 | 688499–688574 | A | Similar sequences |
| | 21 | | 1444731–1444752 | | B | Absent in Z2491 |
| NheI | 4723 | | 505027–509750 | | A | Largely absent in Z2491 |
| | 4392 | | 1834407–1838799 | | C | Absent in Z2491 |
| | 787 | 776 | 2231113–2231900 | 299657–300433 | X | Similar sequences |
| | 670 | 670 | 543262–543932 | 689470–690140 | A | Similar sequences |
| ScaI | 4824 | | 526521–531345 | | A | Largely absent in Z2491 |
| | 4452 | | 511598–516050 | | A | Absent in Z2491 |
| | | 4101 | | 769714–773815 | | Partial similarity to MC58-ScaI-533, MC58-ScaI-600abc, partially absent in MC58 |
| | | 3181 | | 1928315–1931496 | | Similar sequences, but polymorphism at the recognition site |
| | 2496 | | 1007047–1009543 | | | Largely present in Z2491 |
| | 2179 | | 1925725–1927904 | | | Largely absent in Z2491 |
| | 865 | 865 | 1815665–1816530 | 1927450–1928315 | C | Similar sequences |
| | 699 | 699 | 1814966–1815665 | 1926751–1927450 | C | Similar sequences |
| | 600 | | 1447767–1448367 | | B | Similar to Z2491-ScaI-4101 |
| | 600 | | 1447167–1447767 | | B | Similar to Z2491-ScaI-4101 |
| | 600 | | 616671–617271 | | | Similar to Z2491-ScaI-4101 |
| | 533 | | 1446567–1447100 | | B | Similar to Z2491-ScaI-4101 |
| | 67 | | 1447100–1447167 | | B | Similar to Z2491-ScaI-4101 |
| SpeI | 1672 | | 2223993–2225665 | | X | Dispersed in MC58, with inversions and loss of restriction fragment |
| | 579 | | 1443931–1444510 | | B | Absent in Z2491 |
| | 470 | | 1659795–1660265 | | | Absent in Z2491 |

The two strains were compared in silico using the Artemis Comparison Tool, with megablast hit scores of 500 and above.

Two fragments identified in silico in Z2491 were absent in the genome of MC58. Z2491-BglII-97 harboured a part of NMA0604, which encodes a hypothetical protein. The Z2491-ScaI-4101 fragment contained ORFs NMA0785 and NMA0786, encoding hypothetical proteins. Both NMA0785 and NMA0786 display an atypical GC composition and dinucleotide composition, and are described as putatively horizontally transferred by Garcia-Vallvé and colleagues (3).

Thus, the same set of four enzymes that was used to isolate anomalous sequences from the MC58 strain in silico also identified aDNA fragments from the related strain Z2491. Similarly, the sequenced genomes of three strains of E.coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments with anomalous composition (supplementary Tables 2–4). Unfortunately, due to sequence ambiguities in the E.coli EDL933 genome sequence, the δ* values of the XbaI restriction fragments from this strain could not be readily calculated, although the low GC percentage of these fragments compared to the E.coli genomic GC composition values suggests an anomalous nucleotide composition (supplementary Table 5).

Selective isolation of aDNA in vitro from N.meningitidis MC58 by adaptor-linked PCR

In order to validate that this strategy could be converted into an in vitro strategy with possible applications to unsequenced genomes, chromosomal DNA of strain MC58 was digested in vitro with BglII, NheI, ScaI or SpeI. The fragments obtained from each of the four digests were amplified by ALP. The amplicon pattern was very similar to the expected in silico restriction fragment pattern (Figure 2), although minor differences were observed. These can be explained by the possibly inefficient amplification of large fragments (∼4 kb) in the presence of smaller fragments. The resulting amplicons were subcloned and sequenced, verifying the sequences predicted by the in silico analysis (data not shown). This demonstrated the applicability of this method in vitro.

Figure 2. Comparison of the restriction fragment length polymorphism (RFLP) pattern and the ALP pattern of N.meningitidis MC58 digested with each of the selected endonucleases in silico (via www.tigr.org) and the resulting amplification pattern in vitro (endonucleases are depicted above each lane). Lane X depicts the marker X (Roche) with the sizes in base pairs on the left. The fragments NheI-4723 and NheI-4392 could not be readily amplified, probably due to the preferential amplification of smaller fragments.

DISCUSSION

A new parameter based on dinucleotide composition extremes has been introduced to identify genomic islands in complete genomes (6,7). The potential of this and other in silico methods to identify genomic islands is obviously limited to sequenced genomes. To our knowledge, no in vitro method exists which allows the selective isolation of aDNA from unsequenced genomes, except for subtractive hybridization strategies in which usually two related but different strains are compared (21–23). In order to develop an in vitro tool for the selective isolation of anomalous sequences from unsequenced genomes, we investigated whether clustering of restriction enzyme recognition sites could lead to the preferential isolation of aDNA from various sequenced genomes.

We demonstrated clustering of genomically underrepresented restriction enzyme recognition sites in eight sequenced genomes of five prokaryotic species in silico. We found that clustering of these recognition sites occurred predominantly in aDNA regions, including ribosomal loci, but also and more interestingly, in putative horizontally transferred loci which were described by Garcia-Vallvé and co-workers (3). However, some discrepancies between our data and their database exist, as the HGT database ignores non-coding sequences.

In the genome of N.meningitidis MC58, the clustering of the four selected endonuclease recognition sites occurred in the three known IHTs (17), a recently described aDNA region (3), and also in smaller anomalous loci. Comparative analysis of the calculated restriction fragments from N.meningitidis MC58 and Z2491 showed that similar putative horizontally acquired anomalous sequences could be isolated from both strains. Furthermore, aDNA confined to either one of these strains could be identified, suggesting that differences between strains can be identified and isolated. The strategy was validated by the in vitro amplification of the predicted restriction fragments from the genome of N.meningitidis MC58.

In this study, only a limited number of sequenced genomes was analysed to illustrate atypical clustering of endonuclease recognition sites in their respective aDNA regions. Theoretically, any prokaryotic genome may contain atypical clustering of endonuclease recognition sites. On the other hand, aDNA which is acquired via horizontal gene transfer is thought to adjust to the host's nucleotide composition over time in a process called amelioration, the same mutational process that affects the entire genome (5). This implies that only aDNA resulting from evolutionarily recent transfer events, in which the nucleotide content of the acquired DNA still differs substantially from the sequence composition of the host genome, can be adequately identified and isolated. Furthermore, in genomes of bacteria, such as Helicobacter pylori, with an extreme plasticity due to high recombination rates, regions of aDNA may be rapidly obscured over time (24). Another potential limitation of our technique is that the restriction enzyme recognition sites may be methylated by restriction-modification (RM) systems. For example, H.pylori contains many RM enzymes, rendering the genome resistant to their activity (25).

Only hexapalindromic recognition sites of endonucleases have been tested; we did not examine other recognition sites (such as non-palindromic recognition sites). The cores of the restriction sites of the selected restriction enzymes persistently consist of the genomically underrepresented tetranucleotides previously described by Karlin and co-workers (10). Genomic underrepresentation of tetrapalindromes may be due to structural defects caused by these sequences or special functional roles associated with these sequences (10). Whether genomic aDNA regions, with these tetranucleotides overrepresented, predominantly originate from donor organisms in which these tetranucleotides are less associated with structural defects or special functional roles, remains unclear.

The restriction enzymes for which the recognition sites were often found to cluster atypically in the genomes assessed in this study, such as SpeI, XbaI, AvrII and NheI, are also commonly used for genotyping by pulsed-field gel electrophoresis (PFGE) (26). For example, to identify an E.coli O157 outbreak cluster, Tsuji and co-workers (27) performed PFGE with the XbaI enzyme. A higher prevalence of the recognition sites of these enzymes in aDNA regions, such as horizontally transferred genes, may partly explain the high differentiating capacity of PFGE when performed with these enzymes. Insertion of horizontally acquired DNA harbouring these sites at a higher frequency than in the recipient genome will result in the introduction of novel small fragments, which are usually not visualized by PFGE. However, the large fragment in which the region of aDNA is inserted will disappear from the PFGE pattern.

As the genome signature is conserved between closely related species (28), this technique may enable the selective isolation of aDNA from novel outbreak strains in the population of a pathogenic species of which a representative complete genome sequence is available, illustrated by the identification of the different anomalous fragments in the two Neisseria strains. It would be of interest to test different neisserial genoclusters for anomalous sequences with this novel strategy. In conclusion, the strategy presented in this study allows the selective isolation of anomalous sequences from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme. This simple technique can have major practical applications in studying horizontal gene transfer.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at NAR Online.

ACKNOWLEDGEMENTS

We would like to thank Drs Mark Achtman and Christina Vandenbroucke-Grauls for critically reading the manuscript.

REFERENCES

1. Avery,O.T., MacLeod,C.M. and McCarty,M. (1944) Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Inductions of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III. J. Exp. Med., 79, 137–158.
2. Lawrence,J.G. and Ochman,H. (1998) Molecular archaeology of the Escherichia coli genome. Proc. Natl Acad. Sci. USA, 95, 9413–9417.
3. Garcia-Vallvé,S., Guzman,E., Montero,M.A. and Romeu,A. (2003) HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes. Nucleic Acids Res., 31, 187–189.
4. Grantham,R., Gautier,C., Gouy,M., Mercier,R. and Pave,A. (1980) Codon catalog usage and the genome hypothesis. Nucleic Acids Res., 8, r49–r62.
5. Lawrence,J.G. and Ochman,H. (1997) Amelioration of bacterial genomes: rates of change and exchange. J. Mol. Evol., 44, 383–397.
6. Burge,C., Campbell,A.M. and Karlin,S. (1992) Over- and under-representation of short oligonucleotides in DNA sequences. Proc. Natl Acad. Sci. USA, 89, 1358–1362.
7. Karlin,S., Ladunga,I. and Blaisdell,B.E. (1994) Heterogeneity of genomes: measures and values. Proc. Natl Acad. Sci. USA, 91, 12837–12841.
8. Karlin,S. (2001) Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol., 9, 335–343.
9. Karlin,S., Campbell,A.M. and Mrazek,J. (1998) Comparative DNA analysis across diverse genomes. Annu. Rev. Genet., 32, 185–225.
10. Karlin,S., Mrazek,J. and Campbell,A.M. (1997) Compositional biases of bacterial genomes and evolutionary implications. J. Bacteriol., 179, 3899–3913.
11. Pride,D.T., Meinersmann,R.J., Wassenaar,T.M. and Blaser,M.J. (2003) Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res., 13, 145–158.
12. Roberts,R.J., Vincze,T., Posfai,J. and Macelis,D. (2003) REBASE: restriction enzymes and methyltransferases. Nucleic Acids Res., 31, 418–420.
13. Gelfand,M.S. and Koonin,E.V. (1997) Avoidance of palindromic words in bacterial and archaeal genomes: a close connection with restriction enzymes. Nucleic Acids Res., 25, 2430–2439.
14. Karlin,S., Burge,C. and Campbell,A.M. (1992) Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucleic Acids Res., 20, 1363–1370.
15. Saunders,R.D., Glover,D.M., Ashburner,M., Siden-Kiamos,I., Louis,C., Monastirioti,M., Savakis,C. and Kafatos,F. (1989) PCR amplification of DNA microdissected from a single polytene chromosome band: a comparison with conventional microcloning. Nucleic Acids Res., 17, 9027–9037.
16. McGuinness,B.T., Clarke,I.N., Lambden,P.R., Barlow,A.K., Poolman,J.T., Jones,D.M. and Heckels,J.E. (1991) Point mutation in meningococcal por A gene associated with increased endemic disease. Lancet, 337, 514–517.
17. Tettelin,H., Saunders,N.J., Heidelberg,J., Jeffries,A.C., Nelson,K.E., Eisen,J.A., Ketchum,K.A., Hood,D.W., Peden,J.F., Dodson,R.J. et al. (2000) Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science, 287, 1809–1815.
18. Bowler,L., Bart,A. and Van der Ende,A. (2001) Meningococcal Disease: Methods and Protocols. Humana Press, Totowa, NJ.
19. Linz,B., Schenker,M., Zhu,P. and Achtman,M. (2000) Frequent interspecific genetic exchange between commensal Neisseriae and Neisseria meningitidis. Mol. Microbiol., 36, 1049–1058.
20. Hacker,J., Blum-Oehler,G., Muhldorfer,I. and Tschape,H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol. Microbiol., 23, 1089–1097.
21. Lisitsyn,N., Lisitsyn,N. and Wigler,M. (1993) Cloning the differences between two complex genomes. Science, 259, 946–951.
22. Bart,A., Dankert,J. and van der Ende,A. (2000) Representational difference analysis of Neisseria meningitidis identifies sequences that are specific for the hyper-virulent lineage III clone. FEMS Microbiol. Lett., 188, 111–114.
23. Malloff,C.A., Fernandez,R.C. and Lam,W.L. (2001) Bacterial comparative genomic hybridization: a method for directly identifying lateral gene transfer. J. Mol. Biol., 312, 1–5.
24. Suerbaum,S., Smith,J.M., Bapumia,K., Morelli,G., Smith,N.H., Kunstmann,E., Dyrek,I. and Achtman,M. (1998) Free recombination within Helicobacter pylori. Proc. Natl Acad. Sci. USA, 95, 12619–12624.
25. Kong,H., Lin,L.F., Porter,N., Stickel,S., Byrd,D., Posfai,J. and Roberts,R.J. (2000) Functional analysis of putative restriction-modification system genes in the Helicobacter pylori J99 genome. Nucleic Acids Res., 28, 3216–3223.
26. McClelland,M., Jones,R., Patel,Y. and Nelson,M. (1987) Restriction endonucleases for pulsed field mapping of bacterial genomes. Nucleic Acids Res., 15, 5985–6005.
27. Tsuji,H., Hamada,K., Kawanishi,S., Nakayama,A. and Nakajima,H. (2002) An outbreak of enterohemorrhagic Escherichia coli O157 caused by ingestion of contaminated beef at grilled meat-restaurant chain stores in the Kinki District in Japan: epidemiological analysis by pulsed-field gel electrophoresis. Jpn. J. Infect. Dis., 55, 91–92.
28. Karlin,S. and Burge,C. (1995) Dinucleotide relative abundance extremes: a genomic signature. Trends Genet., 11, 283–290.
29. Fleischmann,R.D., Adams,M.D., White,O., Clayton,R.A., Kirkness,E.F., Kerlavage,A.R., Bult,C.J., Tomb,J.F., Dougherty,B.A., Merrick,J.M. et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269, 496–512.
  • 30.Hayashi T., Makino,K., Ohnishi,M., Kurokawa,K., Ishii,K., Yokoyama,K., Han,C.G., Ohtsubo,E., Nakayama,K., Murata,T. et al. (2001) Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res., 8, 11–22. [DOI] [PubMed] [Google Scholar]
  • 31.Blattner F.R., Plunkett,G.,III, Bloch,C.A., Perna,N.T., Burland,V., Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F. et al. (1997) The complete genome sequence of Escherichia coli K-12. Science, 277, 1453–1474. [DOI] [PubMed] [Google Scholar]
  • 32.Welch R.A., Burland,V., Plunkett,G.,III, Redford,P., Roesch,P., Rasko,D., Buckles,E.L., Liou,S.R., Boutin,A., Hackett,J. et al. (2002) Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc. Natl Acad. Sci. USA, 99, 17020–17024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Parkhill J., Dougan,G., James,K.D., Thomson,N.R., Pickard,D., Wain,J., Churcher,C., Mungall,K.L., Bentley,S.D., Holden,M.T. et al. (2001) Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature, 413, 848–852. [DOI] [PubMed] [Google Scholar]
  • 34.Smith D.R., Doucette-Stamm,L.A., Deloughery,C., Lee,H., Dubois,J., Aldredge,T., Bashirzadeh,R., Blakely,D., Cook,R., Gilbert,K. et al. (1997) Complete genome sequence of Methanobacterium thermoautotrophicum deltaH: functional analysis and comparative genomics. J. Bacteriol., 179, 7135–7155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Parkhill J., Achtman,M., James,K.D., Bentley,S.D., Churcher,C., Klee,S.R., Morelli,G., Basham,D., Brown,D., Chillingworth,T. et al. (2000) Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature, 404, 502–506. [DOI] [PubMed] [Google Scholar]

Supplementary Materials

nar_32_14_e114__1.pdf (66.2 KB, PDF)

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
