Proceedings of the National Academy of Sciences of the United States of America. 1998 Oct 27;95(22):13108–13113. doi: 10.1073/pnas.95.22.13108

PCR-based subtractive hybridization and differences in gene content among strains of Helicobacter pylori

Natalia S Akopyants, Arkady Fradkov, Luda Diatchenko, Jason E Hill, Paul D Siebert, Sergey A Lukyanov, Eugene D Sverdlov, Douglas E Berg
PMCID: PMC23726  PMID: 9789049

Abstract

Genes that are characteristic of only certain strains of a bacterial species can be of great biologic interest. Here we describe a PCR-based subtractive hybridization method for efficiently detecting such DNAs and apply it to the gastric pathogen Helicobacter pylori. Eighteen DNAs specific to a monkey-colonizing strain (J166) were obtained by subtractive hybridization against an unrelated strain whose genome has been fully sequenced (26695). Seven J166-specific clones had no DNA sequence match to the 26695 genome, and 11 other clones were mixed, with adjacent patches that did and did not match any sequences in 26695. At the protein level, seven clones had homology to putative DNA restriction-modification enzymes, and two had homology to putative metabolic enzymes. Nine others had no database match with proteins of assigned function. PCR tests of 13 unrelated H. pylori strains by using primers specific for 12 subtracted clones and complementary Southern blot hybridizations indicated that these DNAs are highly polymorphic in the H. pylori population, with each strain yielding a different pattern of gene-specific PCR amplification. The search for polymorphic DNAs, as described here, should help identify previously unknown virulence genes in pathogens and provide new insights into microbial genetic diversity and evolution.


Genes that are present in certain isolates of a given bacterial species and absent or substantially different in others can be of great interest biologically. Some may determine strain-specific traits such as drug resistance (1), bacterial surface structure (2), or restriction-modification (3). Of special importance in infectious disease are the “pathogenicity islands” (PAIs), multigene segments of virulent strains that tend to be absent from avirulent members of the same species and help determine the nature and severity of disease (4, 5). Other highly polymorphic genes or DNA segments, including insertion sequences, plasmids, and prophages (6), may have little or no effect on bacterial fitness and consequently may be useful as neutral markers for epidemiology and evolutionary studies. The discoveries of prophages, some of which are also plasmids, that carry drug resistance and virulence genes (7–9) illustrate the overlap among various types of strain-specific DNAs.

Many of the genes or DNA segments specific to individual strains were found by the special phenotypes they confer (1–9), detailed physical mapping (10), or comparisons of sequence data from different isolates or taxa (4, 5, 11, 12). Subtractive hybridization (13) allows strain-specific DNAs to be selected directly and is attractive because it eliminates the need to score any particular phenotype or to do extensive mapping or sequencing at the outset. However, the subtractive methods developed to date have been unwieldy technically and tend to be quite narrowly selective: only a subset of strain-specific DNAs are generally obtained, and DNA segments with potentially interesting mixes of strain-specific and common sequences are generally excluded (13–17).

Here we describe a more efficient and sensitive method of finding bacterial strain-specific DNAs (Fig. 1), which is based on suppression subtractive hybridization (18, 19), a method invented to study gene expression in eukaryotes. As outlined in Fig. 1 and detailed below, pools of genomic DNA fragments from a bacterial strain of interest (tester) are, in effect, depleted, by hybridization and PCR, of sequences that are also in a reference strain (driver). The remaining DNA fragments, highly enriched for tester-specific sequences, are then cloned for further analysis.

Figure 1.

Bacterial genome subtractive DNA method. Solid lines represent AluI-digested tester and driver DNA fragments. Boxes represent outer part of PCR adaptors, which lack phosphate groups at their 5′ ends. Adaptors 1 and 2 are identical near their 5′ ends but differ near their 3′ ends. Note that after recessed 3′ ends are filled, types a, b, and c molecules having adaptor 2 are also present, but are not shown. This method is adapted from the suppression subtractive hybridization method for studies of mRNAs in eukaryotes (18, 19).

We implemented this method using Helicobacter pylori, a Gram-negative bacterium that is a major cause of peptic ulcer disease and an early risk factor for gastric cancer (20, 21). DNA fingerprinting studies had revealed great diversity among independent H. pylori isolates (22), some of which may be important phenotypically. In particular, two strain-specific DNA segments are known to be virulence-associated epidemiologically: (i) the 37-kb cag PAI, many of whose encoded proteins help elicit a severe host inflammatory response that probably contributes to overt disease (23, 24); and (ii) iceA1, a restriction endonuclease (NlaIII) homolog whose transcription is induced by host–cell contact (25). Also polymorphic among H. pylori strains are various plasmids and insertion sequences (23, 24, 26, 27). Much remains to be learned, however, about the nature and extent of genetic diversity among strains of H. pylori (and, indeed, among many bacterial species); the mechanisms that give rise to and maintain this diversity; and its importance, for example in determining the specificity of colonization, persistence, or disease in cases of pathogens.

Here we developed a bacterial genome subtractive hybridization method and used it to find DNA segments from an H. pylori strain (J166) that is well suited for colonization of rhesus monkeys; these segments are absent from or substantially different from any DNAs in the strain whose genome has been fully sequenced. PCR with primers based on these J166-specific DNAs indicated that they are highly polymorphic in the H. pylori gene pool.

MATERIALS AND METHODS

General Methods.

The H. pylori strains used in this study are listed in Table 1. Standard methods were used for the growth of and DNA extraction from H. pylori and Escherichia coli (24).

Table 1.

H. pylori strains used in these studies

Strain Source/salient features*
26695 England, fully sequenced genome (29)
26695Δcag Engineered deletion of entire cag PAI (24)
1480-3 Rhesus monkey (41)
2002 Louisiana (44)
60190 England (45)
ATCC49503 England (45)
Hp1 Peru, colonizes mice (46)
J166 Tennessee (25), colonizes rhesus monkeys (48)
J170 Tennessee, colonizes rhesus monkeys transiently (48)
J238 Tennessee, colonizes rhesus monkeys transiently (48)
MO19 Missouri (22)
N6 France (31)
NCTC11637 Australia (47)
NCTC11638 Australia (47)
Tx30a Texas (45)
WV99 West Virginia (22)
* Each strain, except MO19, Tx30a, WV99, 2002, and 26695Δcag, contains the cag PAI, based on PCR and DNA hybridization tests, as in ref. 24. The cag PAI in strain Hp1 contains an internal deletion (D. Kersulyte and D.E.B., unpublished work).

ATCC, American Type Culture Collection; NCTC, National Collection of Type Cultures. ATCC49503 is also known as 60190; NCTC11637 is also known as ATCC43504. 

Oligonucleotides.

H. pylori strain-specific oligonucleotide primers used for PCR are listed in Table 2. The following gel-purified oligonucleotides were used for subtractive hybridizations: Adaptor 1, 5′-CTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGT-3′ and 3′-GGCCCGTCCA-5′; adaptor 2, 5′-CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT-3′ and 3′-GCCGGCTCCA-5′; P1: 5′-CTAATACGACTCACTATAGGGC-3′. (This P1 PCR primer matches the long strands of adaptors 1 and 2 at their 5′ ends.) NP1 5′-TCGAGCGGCCGCCCGGGCAGGT-3′; NP2 5′-AGCGTGGTCGCGGCCGAGGT-3′. (These NP1 and NP2 nested PCR primers match the internal portions of the long strand of adaptors 1 and 2, respectively.)
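The relationships among these oligonucleotides can be checked directly from the sequences listed above. The following is a minimal sketch in Python; the function and variable names are illustrative and are not part of the original protocol. It tests whether P1 matches the 5′ end of each long adaptor strand, whether each nested primer corresponds to the remainder of its long strand, and whether each short strand pairs with the 3′ end of its long strand.

```python
# Sequences copied from the Oligonucleotides paragraph. Long strands and
# primers are written 5'->3'; short strands are written 3'->5', as in the text.
ADAPTOR1_LONG = "CTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGT"
ADAPTOR2_LONG = "CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT"
ADAPTOR1_SHORT_3TO5 = "GGCCCGTCCA"
ADAPTOR2_SHORT_3TO5 = "GCCGGCTCCA"
P1 = "CTAATACGACTCACTATAGGGC"
NP1 = "TCGAGCGGCCGCCCGGGCAGGT"
NP2 = "AGCGTGGTCGCGGCCGAGGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement without reversing the sequence."""
    return seq.translate(_COMPLEMENT)


def check_adaptor(long_strand: str, short_3to5: str, outer: str, nested: str) -> dict:
    """Check how the outer primer, nested primer, and short strand relate to a long strand."""
    tail = long_strand[-len(short_3to5):]  # 3' end of the long strand, read 5'->3'
    return {
        "outer_primer_at_5prime_end": long_strand.startswith(outer),
        "nested_primer_is_remainder": long_strand[len(outer):] == nested,
        # The short strand is written 3'->5', so it aligns position by position
        # with the complement of the long strand's 3' end.
        "short_strand_pairs_with_3prime_end": complement(tail) == short_3to5,
    }


adaptor1_checks = check_adaptor(ADAPTOR1_LONG, ADAPTOR1_SHORT_3TO5, P1, NP1)
adaptor2_checks = check_adaptor(ADAPTOR2_LONG, ADAPTOR2_SHORT_3TO5, P1, NP2)
print(adaptor1_checks)
print(adaptor2_checks)
```

With the sequences as printed, all three checks hold for both adaptors, so a single outer primer (P1) can prime from either adaptor while NP1 and NP2 remain adaptor-specific.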

Table 2.

H. pylori J166-specific PCR primers

Line* Clone and orientation Sequence (5′-3′)
1 C2F GCTTCCGGAACTATTGGTAGGGCAG
C2R GAATGAGAGCATCTAAATTATAAAG
2 F3F ATAGTATTGATATTGATGGCAATT
F3R GGATAAATTCAATTGCGAACTGCC
3 A3F GTCTTCAATCATTAGCACCATAGA
A3R CTCAACCGCCTTATGCATAACCAC
4 H5F GAAGCAATCTATATTGAAGACTT
H5R GCGTTCCATAGGTAAATTGACCA
7 F1F GATGTTTGTGAGCATTGATGACAACG
F1R GGTCTATTTCTGTATCAATGATAGTG
9 G7F GTGCGGCTTCCTCTCAATCTCAAAC
G7R GCTTGCGTGATTGGCGAAGTGCTTG
10 D3F GATCATTATAAGCGCACGTAAATGC
D3R GCATTTAAGCAGCGTGGATCACAGG
12 A7F CTCTCAGTTATAATTGGGAGTAA
A7R CGATCGAAGTGTAACGCTTAACA
B2F CTCTACCATCTTGGACAAGGTATTC
B2R GCTTCTATCAAGTTAGTATCGAGTG
C8F CTCCGATCAGCACTGATGAGAGCG
C8R CAATATCAGGTTTGTTACCGCTTG
G4F CAGTAAGAACCACGAACTCGTGGA
G4R CGTGCTACCATAACTGAGCGCTCC
* Refers to line in Table 3. Clone refers to clone in Table 3. Orientation: F, forward; R, reverse. Primers without a line number (B2, C8, and G4) are for clones with no protein homology to current database entries; these clones are listed in the footnote to Table 3.

Overview of Subtractive Hybridization.

Our method (Fig. 1) entails digestion of DNAs from the strain of interest and a reference strain (tester and driver, respectively) with a restriction endonuclease such as AluI to generate DNA fragment populations with median sizes of about 0.5 kb. Two different PCR adaptors that can join only to 5′ ends of target DNAs (because their own 5′ ends lack phosphate groups) are ligated to different aliquots of tester DNA. These ligated DNAs are denatured, mixed with an excess of driver DNA (that has no adaptors), and allowed to anneal. The two DNA pools are then mixed together, and more denatured driver DNA is added to further bind tester sequences that are also present in the driver genome. Remaining complementary single strands of tester DNA are allowed to anneal, and the adaptor sequences are copied onto their 3′ ends. PCR is then used to obtain exponential amplification of tester DNAs with different adaptors at each end (“e” in Fig. 1). In contrast, amplification of DNAs with the same adaptor at each end is suppressed because self-annealing of inverted repeat adaptors (“b” in Fig. 1) inhibits binding of PCR primers. Tester DNAs with an adaptor at only one end undergo linear, but not exponential, amplification. This method offers several substantial advantages over earlier bacterial genome subtraction methods: (i) less DNA is needed; (ii) multiple rounds of hybridization and physical removal of tester-driver DNA complexes as in refs. 13 and 14 are not needed; (iii) there is no need for complicated adaptor removal and readdition, with additional rounds of PCR amplification, as in the representational difference analysis method (15–17); and (iv) DNAs that are of interest because they are partially matched to driver DNA can be recovered in our method, whereas they tend to be excluded in the other methods (13–17).
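The selection logic of the PCR step can be summarized by the adaptors present at the two ends of a duplex molecule. The sketch below, in Python, is a simplified illustration only: the function name and the labels are hypothetical, and the "none" case (no adaptor at either end, as for driver DNA) is an inference from the fact that driver DNA carries no adaptors and so offers no primer binding site.

```python
from typing import Optional


def amplification_mode(left: Optional[int], right: Optional[int]) -> str:
    """Predict PCR behaviour from the adaptor (1, 2, or None) on each end of a molecule."""
    if left is None and right is None:
        # No adaptor at either end (e.g., driver DNA): nothing for the primer to bind.
        return "none"
    if left is None or right is None:
        # Adaptor at one end only: each cycle copies from one end, so growth is linear.
        return "linear"
    if left == right:
        # Same adaptor at both ends: the inverted repeats self-anneal and
        # inhibit primer binding, so amplification is suppressed.
        return "suppressed"
    # Different adaptors at the two ends: primers bind both ends, growth is exponential.
    return "exponential"


for ends in [(1, 2), (1, 1), (2, 2), (1, None), (None, None)]:
    print(ends, amplification_mode(*ends))
```

Only molecules formed by annealing of a tester strand from the adaptor 1 pool with a complementary tester strand from the adaptor 2 pool fall into the exponential class, which is why the enriched product is tester-specific.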

Driver and Tester DNA Preparations.

Two micrograms of tester and of driver DNAs were each digested to completion with 20 units of AluI (New England Biolabs) for 16 h in 200-μl reaction volumes, extracted with phenol and precipitated with ethanol, and resuspended in 10 mM Tris⋅HCl, pH 7.5, at a final concentration of 200 ng/μl. Two aliquots of tester DNA (120 ng each) were ligated separately to 2 μl of the two adaptors, each in a total volume of 10 μl (2 μM final concentration) at 16°C overnight, using 1 μl (New England Biolabs; 400 units) of T4 DNA ligase in the buffer supplied by the manufacturer. After ligation, 1 μl of 0.2 M EDTA was added, and the samples were heated at 70°C for 5 min to inactivate the ligase and then stored at −20°C.

Subtractive Hybridization.

Three microliters of driver DNA (600 ng) was added to 1 μl (12 ng) of each of the adaptor-ligated tester DNAs (50:1 ratio). One microliter of 5× hybridization buffer (2.5 M NaCl/250 mM Hepes, pH 8.3/1 mM EDTA) was added to each tube, the solutions were overlaid with mineral oil, and the DNAs were denatured (1.5 min, 98°C) and allowed to anneal for 1.5 h at 65°C. After this first hybridization, the two samples (the first with adaptor 1, the second with adaptor 2) were combined, 300 ng more of heat-denatured driver DNA was added in 3 μl of 1× hybridization buffer, and the sample was allowed to hybridize for an additional 14 h at 65°C (without intermediate denaturation). This final 13-μl reaction was diluted in 200 μl with dilution buffer (50 mM NaCl/5 mM Hepes, pH 8.3/0.2 mM EDTA), heated at 65°C for 10 min, and stored at −20°C until use in PCR.
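The amounts and volumes in the two hybridization steps can be cross-checked with simple arithmetic. The following Python sketch uses only the figures given in the two preceding paragraphs; the variable names are illustrative, the 1 μl of EDTA added after ligation is ignored (an assumption made so that the tester concentration comes to the 12 ng/μl implied by the text), and the overall driver:tester ratio in the combined reaction is a derived value that the text does not state.

```python
# Stock concentrations, from "Driver and Tester DNA Preparations"
driver_ng_per_ul = 200.0           # AluI-digested DNA resuspended at 200 ng/ul
tester_ng_per_ul = 120.0 / 10.0    # 120 ng tester ligated in 10 ul (EDTA volume ignored)

# First hybridization, per tube: 3 ul driver + 1 ul tester + 1 ul of 5x buffer
driver_ng = 3 * driver_ng_per_ul                 # ng of driver per tube
tester_ng = 1 * tester_ng_per_ul                 # ng of adaptor-ligated tester per tube
first_ratio = driver_ng / tester_ng              # driver:tester ratio
first_volume_ul = 3 + 1 + 1
buffer_strength = 5 * 1 / first_volume_ul        # final buffer strength, in "x"

# Second hybridization: both tubes combined, plus 300 ng driver in 3 ul of 1x buffer
second_volume_ul = 2 * first_volume_ul + 3
total_driver_ng = 2 * driver_ng + 300
total_tester_ng = 2 * tester_ng
overall_ratio = total_driver_ng / total_tester_ng  # derived, not stated in the text

print(f"first hybridization: {driver_ng:.0f} ng driver : {tester_ng:.0f} ng tester "
      f"= {first_ratio:.0f}:1 in {first_volume_ul} ul of {buffer_strength:.0f}x buffer")
print(f"second hybridization: {second_volume_ul} ul, "
      f"overall driver:tester = {overall_ratio:.1f}:1")
```

The computed values agree with the 600 ng, 12 ng, 50:1 ratio, and 13-μl final volume given in the protocol.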

Two sequential PCRs were carried out. The first PCR contained 1 μl of genomic DNA prepared as described, 2 μl of PCR primer P1 (10 μM), and 47 μl of a PCR master mix prepared using the Advantage cDNA PCR Core Kit (CLONTECH) (total volume, 50 μl). This first PCR was incubated at 72°C, 2 min and then subjected to 25 cycles of 95°C, 30 sec; 66°C, 30 sec; 72°C, 1.5 min. The amplified products were then diluted 20-fold in 10 mM Tris⋅HCl, pH 7.5, and 1 μl of each diluted sample was used in the second PCR with 2 μl of nested PCR primers NP1 and NP2 (10 μM each) and Advantage cDNA PCR Core Kit (total volume, 50 μl) for 10 cycles of 95°C, 30 sec; 68°C, 30 sec; and 72°C, 1.5 min.

The products from the second PCR were inserted into the pT-Adv vector plasmid using the Advantage PCR Cloning Kit (CLONTECH), and ligated DNAs were transformed into E. coli DH5α with selection for ampicillin resistance. Random transformant clones were picked to 100 μl of Luria–Bertani medium with ampicillin in wells of a microtiter plate and grown at 37°C for 2 h. Inserts were amplified using 1 μl of cell suspension and primers NP1 and NP2 in 50-μl volumes for 25 cycles, as in the second PCR (above). Amplified fragments were purified using the QIAquick Spin PCR Purification Kit (Qiagen, Chatsworth, CA), according to the manufacturer’s protocol, and sequenced by BioCore (Palo Alto, CA) by using an Applied Biosystems 373A sequenator. Homology searches were carried out using BLAST programs through e-mail servers at the National Center for Biotechnology Information. PCR primers generated from 12 of these sequences are listed in Table 2.

Modifications.

The present bacterial genome subtractive hybridization protocol, developed from the suppression subtractive hybridization protocol for studies of eukaryotic mRNAs (18, 19), incorporates several changes to increase yield or efficiency. First, the two adaptors used here are identical at their 5′ ends for the first 22 nt, not different as in refs. 18 and 19. These new adaptors allow use of only one primer in the first amplification and also significantly decrease nonspecific amplification and the yield of very short (less than 200 bp) DNA fragments (28). This change will improve many applications, including eukaryotic mRNA studies. Second, the low complexity of bacterial genomes allowed the DNA concentration to be reduced. Polyethylene glycol (PEG), a hybridization enhancer used in cDNA subtraction, was also omitted. Although PEG has the advantage of increasing the effective concentration of DNA in solution, it also increases the risk of fortuitous misannealing between different DNA molecules and of DNA precipitation during subtractive hybridization. PEG should be retained in subtractive cDNA protocols, however, especially when starting material is limiting. Third, the time of first hybridization was shortened from 10 h to 1.5 h, thereby decreasing the extent of normalization of multi- vs. single-copy genes. This change was possible because the range of gene copy numbers in bacteria is small relative to that of mRNAs in eukaryotes. Shortening the hybridization time increases the amount of single-stranded DNA that remains after the first subtractive hybridization and thereby makes the second hybridization more efficient. Fourth, a higher ratio of driver to tester DNA is used here than in the eukaryotic mRNA studies (18, 19). This higher ratio increases the efficiency of enrichment for tester-specific sequences and reduces recovery of DNAs that are present but at different abundance in both tester and driver. These last three modifications improve performance and reliability in bacterial genome subtraction but would have been undesirable in mRNA studies because of huge differences in abundance, as noted above. Fifth, to decrease the recovery of homologous, but only slightly divergent, sequences, and because of the A + T-richness of H. pylori DNA, we decreased the stringency of subtraction by hybridization at 65°C instead of 68°C.

DNA Hybridization.

Dot blotting and Southern blot hybridizations were carried out using standard protocols (24). In Southern blot hybridization tests, approximately 300 ng of AluI-digested H. pylori genomic DNA of each strain of interest was electrophoresed in agarose gels and transferred to Hybond-N+ membranes (Amersham). The blots were prehybridized 1 h in 5 ml of CLONTECH ExpressHyb containing 100 μg/ml sheared salmon sperm DNA at 68°C and then hybridized overnight at 68°C with 20 ng of probe DNA that was made by PCR with clone-specific primers (Table 2) and ³²P-labeled using a DECAprime II labeling kit (≈4 × 10⁶ cpm/ml) (Ambion, Austin, TX). The blots were first washed at moderately low stringency [four times for 20 min in 2× SSC/0.5% SDS (1× SSC is 0.15 M sodium chloride/0.015 M sodium citrate, pH 7), then two times for 20 min in 0.2× SSC/0.5% SDS, all at 62°C], used to expose x-ray film, washed again at high stringency (two times for 30 min in 0.2× SSC at 68°C), and again used to expose x-ray film.

RESULTS

Subtractive Hybridization.

We first tested our bacterial genome subtractive hybridization method (Fig. 1) using isogenic H. pylori strains that differed only by the presence or absence of the 37-kb cag PAI, which is equivalent to about 2% of the H. pylori genome: strain 26695 wild-type (cag+) was used as tester, and its engineered cag-deletion derivative (Δcag) (24) was used as driver. Subtracted DNAs were cloned, the inserts were PCR amplified, and PCR products were spotted on hybridization filters. Clones specific to the cag PAI were then identified by probing these filters with labeled genomic DNAs from the cag+ tester strain and the Δcag driver and also with a set of DNA fragments that span the entire cag PAI (24) (data not shown). The hybridization results indicated that more than 90% of clones had derived from the cag PAI, and 14 of them were one-pass sequenced. Comparison of their sequences with those in the 26695 genome mapped the 14 clones to 12 sites in the cag PAI (Fig. 2). This result indicates that many different DNAs specific to a given strain can be obtained by our subtractive hybridization method.

Figure 2.

Map locations of 14 subtracted clones in the cag PAI. Coordinates refer to distance in kb from left end of the 37-kb cag PAI, as in ref. 24, which corresponds to position 537 kb in the entire H. pylori 26695 genome sequence (29).

Two unrelated H. pylori strains were tested next for differences in gene content, by using a strain well suited for rhesus monkey colonization (J166) as tester and the strain whose genome was sequenced (26695) as driver. Each strain is plasmid free and contains the cag PAI and the iceA1 gene. DNAs that were recovered after subtraction were cloned, PCR amplified, and arrayed for hybridization with labeled tester and driver DNAs, as above. About 30 of 64 clones tested by hybridization were judged to contain strain J166-specific sequences (see Fig. 3), although the differences among clones were less clear-cut than in the first trial with isogenic cag+ and Δcag strains, as will be explained. In contrast, none of 15 random clones (generated without subtractive hybridization) appeared to be J166 specific. Twenty of the candidate J166-specific clones were one-pass sequenced from each end; 16 were unique and two others were each represented twice, which indicated that 18 different clones had been obtained. Ten of the clones whose DNAs were found by sequencing to be absent or very divergent from those in the 26695 genome are circled in Fig. 3 (Right).

Figure 3.

Representative dot-blot hybridization of products of subtractive hybridization by using genomic DNAs of H. pylori strains J166 as tester and 26695 as driver. Cloned subtractive hybridization products were PCR-amplified using primers NP1 and NP2 and spotted on Hybond N+ filters (Materials and Methods) and probed with labeled genomic DNAs, as indicated. The clones shown by DNA sequencing to have J166 DNAs either not present in the 26695 genome or substantially different from sequences in 26695 (as summarized in Table 3) are circled.

Protein Homologies.

Thirteen of the 18 clones exhibited significant protein homologies with other database entries, as summarized in Table 3 and in the text below. Seven clones had homologies to parts of putative restriction–modification (R-M) proteins. Two matched overlapping portions of type I specificity subunits (HsdS) (Table 3, lines 1 and 2), but differed from one another in size and sequence and thus must derive from different hsdS genes (three putative hsdS genes were found in the 26695 genome sequence; ref. 29). Two clones had homology to BcgI (Table 3, lines 3 and 4), a two-subunit enzyme that is unusual in cleaving DNA at a fixed distance on each side of its recognition site (30). Although quite different BcgI homologs are encoded in strain 26695 (HP1471, HP1472; ref. 29), PCR tests mapped these two types of BcgI homologs to the same locus (between ORFs HP1470 and HP1473) (ref. 29 and J. L. Lovett and D.E.B., unpublished work). Two other clones had homology to type II DNA methylases (Table 3, lines 5 and 6). For one clone, F7 (Table 3, line 5), the strongest match, although weak, was with M.HpyI (HP1208). However, the DNA sequence of the HP1208 gene of J166 is known (25) and does not contain the sequence found in clone F7. Thus, this clone represents a different locus. The seventh clone exhibited protein homology to a type III DNA methyltransferase. Two other clones (Table 3, lines 8 and 9) exhibited homology to proteins that might be metabolic. Four clones (Table 3, lines 10–13) exhibited protein-level homology to ORFs in 26695 for hypothetical proteins of unknown function, two of which had equivalent function-unknown homologs in other microbes. The remaining five clones did not exhibit significant protein homology to entries in current databases and are listed in the footnote to Table 3.

Table 3.

Features of J166-specific clones with significant database matches

Clone* GenBank accession no. Homolog (protein); accession no.; blastx P(N) value DNA match, strain 26695
R-M
1 C2 AF025968 HsdS, HI0216, 4e−36; HP1383, 1e−20 110/116, HP1383
2 F3 AF025974 HsdS, MJ0130, 2e−11 52/55, HP1383
3 A3 AF025965 BcgIα, HP1472, 2e−17; BC, gbL17341, 2e−16 None
4 H5 AF025982 BcgIα, HP1472, 3e−10; BcgIβ BC spQ07606, 2e−07 150/160, HP1472, and upstream
5 F7 AF025975 HpyI, DNA methyltransferase HP1208, 6e−05 None
6 H4/F AF025980 DNA methyltransferase, M.AvaI, AV gbX98339, 1e−08 None
H4/R AF025981 None None
7 F1 AF025973 StyLT1 DNA methyltransferase, spP40814, 9e−16 106/113, HP1369
Possible metabolic
8 E3 AF025971 l-Serine deaminase, HP0132; E. coli yhaE-like, gbU82664, 7e−05 93/96, HP0132, HP0133
9 G7 AF025978 Hydrogenase expression, HP0047, 1e−21 116/122, HP0047
Hypothetical protein homolog, function unknown
10 D3 AF025970 Conserved HP0347, 4e−52; HP0338, 1e−54 231/249, HP0347; 142/155, HP0338
11 H2 AF025979 Conserved HP0745, 6e−42 214/217, HP0745
12 A7 AF025966 H. pylori HP0673, 2e−29 27/28, 68/80, HP0673
13 G3 AF025976 H. pylori HP0594, 4e−06 110/114, HP0594

gb, GenBank; sp, Swiss-Prot; AV, Anabaena variabilis; BC, Bacillus coagulans; EC, Escherichia coli; HI, Haemophilus influenzae; HP, H. pylori; Sty, Salmonella typhimurium; MJ, Methanococcus jannaschii. Sequences of open reading frames designated HI, HP, and MJ (e.g., HI0216 and HP1383 for clone C2, line 1) are found using the genome-specific sections of the TIGR website. 

* Cloned DNAs ranged from 309 bp to 804 bp. They are arranged here according to putative functions, inferred from database matches. Clones with homology to R-M systems were subdivided into type I (clones C2, F3); type I-like [clones A3, H5; BcgI appears to be a type I-derived R-M enzyme (ref. 29; H. Kong, personal communication)]; type II (clones F7, H4); and type III (clone F1). J166-specific clones with no significant protein homology to current database entries were as follows: E7 (AF025972), which had a 59/62-bp match to the H. pylori genome sequence between HP0134 and HP0135; A1/F (AF025963) and A1/R (AF025964); B2 (AF025967); C8 (AF025969); and G4 (AF025977).

Extent of match (in bp) in longest stretch of sequence shared between J166-derived clone and the 26695 genome, based on sequence searches through the TIGR web site (http://www.tigr.org). 

DNA Sequence Homology.

Ten of the 18 clones contained patches of 55 bp to 217 bp that closely matched sequences in strain 26695 (Table 3), lengths that should be sufficient for recombination (see Discussion). Another seven clones had little or no significant DNA sequence match to the 26695 genome (less than 30 bp identical; listed as “none” in Table 3 and its footnote). The remaining clone (D3; Table 3, line 10) represented a deletion/substitution polymorphism — a replacement of 3.4 kb of function-unknown sequence of 26695 (ORFs HP0339–HP0346) with 110 bp of J166-specific DNA. PCR tests with D3-specific primers showed that this clone accurately represented the J166 genome sequence organization in this region.
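The matched patches referred to here can be tabulated from the "DNA match" column of Table 3 and its footnote. The Python sketch below assumes that each entry m/n means m identical positions within an n-bp stretch (the table footnote describes it only as the extent of match in the longest shared stretch), and for clone A7, which has two listed patches, it uses the longer one (68/80). The names are illustrative.

```python
# Longest matched stretch per clone, transcribed from Table 3 and its footnote
# as (matching positions, stretch length in bp).
LONGEST_MATCH = {
    "C2": (110, 116),
    "F3": (52, 55),
    "H5": (150, 160),
    "F1": (106, 113),
    "E3": (93, 96),
    "G7": (116, 122),
    "H2": (214, 217),
    "A7": (68, 80),   # a second, shorter patch (27/28) is also listed
    "G3": (110, 114),
    "E7": (59, 62),   # listed in the footnote to Table 3
}


def percent_identity(matched: int, length: int) -> float:
    """Percentage of matching positions in the stretch, rounded to one decimal."""
    return round(100 * matched / length, 1)


identity = {clone: percent_identity(m, n) for clone, (m, n) in LONGEST_MATCH.items()}
shortest = min(n for _, n in LONGEST_MATCH.values())
longest = max(n for _, n in LONGEST_MATCH.values())

for clone, pct in sorted(identity.items(), key=lambda item: item[1]):
    print(f"{clone}: {pct}% over {LONGEST_MATCH[clone][1]} bp")
print(f"{len(LONGEST_MATCH)} clones, patches of {shortest} to {longest} bp")
```

Under these assumptions the ten clones give patch lengths spanning 55 to 217 bp, in agreement with the range stated in the text.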

Diversity at Many Loci.

The extent of polymorphism among H. pylori strains for 11 of the J166 sequences that were either absent from or were substantially different in strain 26695 was assessed by PCR using 13 H. pylori strains from various parts of the world. Each strain exhibited a distinct spectrum of amplification products (Table 4; first + or − for each entry). Complementary Southern blot hybridizations were carried out by using clone-specific probes, with quite similar results (second + or − for each entry): of the 143 tests (11 clones, 13 previously untested strains), 101 (71%) yielded the same result in PCR and hybridization (67 both negative; 34 both positive). Of the remaining 42 tests, 37 were positive only by hybridization, and five were positive only by PCR.

Table 4.

PCR and hybridization test of DNA diversity among H. pylori strains

Clone Size, bp J166 26695 11637 11638 N6 60190 J170 J238 1480 84-183 HP1 WV99 MO19 2002 TX30A
C2 260 ++ −− ++ −− −− −− −− −+ −+* −− −+ −− ++ −+ −+
F3 340 ++ −− −− −− −− −− −− −− ++ ++ −− −− −− ++ −−
A3 590 ++ −− ++ +− ++ +− ++ ++ +− +− −− −− −− ++ −−
H5 565 ++ −− −+ −+ −+ −− −+ −− −+ ++ −+ −+ −− −− −−
F1 220 ++ −− −+ ++ −− −− −+ −− −+ ++ −+* −− −+ −− −−
G7 300 ++ −− −− −+ −− −− −− −− −− −− −− −− −− −− +−
D3 600 ++ −+* ++ −+ ++ −+* ++ −+ −+ −+ −+ −+ −+ −+ −+
A7 410 ++ −− ++ −− −+* −− −+ −+ −− −− −+ −− ++ −− ++
B2 300 ++ −− −− −− −− −− ++ −− ++ −− −+ −− ++ ++ ++
C8 300 ++ −− ++ ++ ++ −− −− −+ −− ++ −+ ++ −+ ++ ++
G4 410 ++ −− −− −− −− −− −+ −− ++ ++ −− ++ −− −− −−

11637 and 11638 designate NCTC11637 and NCTC11638 (NCTC, National Collection of Type Cultures). In each column, the first entry is the result of PCR of the indicated H. pylori strain with the clone-specific primers listed in Table 2. The second entry is the result of Southern blot hybridization with the clone. The designation +* indicates cases of positive hybridization at low stringency only, whereas + indicates positive hybridization at both low and high stringency. In addition, PCR tests with primers specific for the F7 clone yielded products with J166 (positive control), NCTC11637, N6, WV99, 2002, and TX30A, but not 26695 (negative control) nor any of the other eight strains tested. 
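The agreement between PCR and hybridization reported in the text can be recomputed from Table 4. The sketch below, in Python, transcribes the table with ASCII "+" and "-" and counts each two-character call (PCR first, hybridization second) for the 13 strains other than J166 and 26695. Entries marked "+*" (hybridization at low stringency only) are counted as hybridization-positive, which is an assumption about how the published totals were obtained; the names used are illustrative.

```python
from collections import Counter

STRAINS = ["J166", "26695", "11637", "11638", "N6", "60190", "J170", "J238",
           "1480", "84-183", "HP1", "WV99", "MO19", "2002", "TX30A"]

# One row per clone; each call is PCR result followed by hybridization result.
TABLE4 = {
    "C2": "++ -- ++ -- -- -- -- -+ -+* -- -+ -- ++ -+ -+",
    "F3": "++ -- -- -- -- -- -- -- ++ ++ -- -- -- ++ --",
    "A3": "++ -- ++ +- ++ +- ++ ++ +- +- -- -- -- ++ --",
    "H5": "++ -- -+ -+ -+ -- -+ -- -+ ++ -+ -+ -- -- --",
    "F1": "++ -- -+ ++ -- -- -+ -- -+ ++ -+* -- -+ -- --",
    "G7": "++ -- -- -+ -- -- -- -- -- -- -- -- -- -- +-",
    "D3": "++ -+* ++ -+ ++ -+* ++ -+ -+ -+ -+ -+ -+ -+ -+",
    "A7": "++ -- ++ -- -+* -- -+ -+ -- -- -+ -- ++ -- ++",
    "B2": "++ -- -- -- -- -- ++ -- ++ -- -+ -- ++ ++ ++",
    "C8": "++ -- ++ ++ ++ -- -- -+ -- ++ -+ ++ -+ ++ ++",
    "G4": "++ -- -- -- -- -- -+ -- ++ ++ -- ++ -- -- --",
}


def tally(table: dict, skip=("J166", "26695")) -> Counter:
    """Count PCR/hybridization call pairs, excluding the tester and driver strains."""
    counts = Counter()
    for clone, row in table.items():
        calls = row.split()
        if len(calls) != len(STRAINS):
            raise ValueError(f"row {clone} has {len(calls)} entries")
        for strain, call in zip(STRAINS, calls):
            if strain not in skip:
                counts[call.rstrip("*")] += 1  # "+*" counted as hybridization-positive
    return counts


counts = tally(TABLE4)
total = sum(counts.values())
concordant = counts["++"] + counts["--"]
print(dict(counts))
print(f"{concordant} of {total} tests concordant ({100 * concordant / total:.0f}%)")
```

Counted this way, the table yields 34 tests positive by both methods, 67 negative by both, 37 positive by hybridization only, and 5 positive by PCR only, matching the totals given in the text (101 of 143 concordant).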

The 18 clones studied here can be mapped tentatively to at least 12 loci. First, PCR with genomic DNA from strain NCTC11638 yielded specific products with three clones (F1, D3, and C8; Table 3, lines 7, 10, and footnote, respectively). Hybridization to filters containing an ordered cosmid library from this strain (32) mapped each clone to a different location: cosmids 51 and 52 (clone F1), 19 and 20 (D3) and 6 (C8) [approximate map positions of 1270, 510, and 120 kb, respectively, in the 1730-kb NCTC11638 genome (32)]. Second, the two clones representing BcgI homologs mapped to a fourth locus (J. L. Lovett and D.E.B., unpublished work). Third, eight other clones exhibited partial DNA sequence matches to eight other loci in the 26695 genome.

DISCUSSION

We developed an efficient PCR-based subtractive hybridization method for detecting differences in gene content among members of a bacterial species and implemented it using the gastric pathogen H. pylori. In a pilot experiment with isogenic strains that differed by the presence or absence of the 37-kb cag PAI (≈2% of H. pylori genome), more than 90% of recovered clones contained DNAs from various sites in this PAI.

Two unrelated strains were then compared: J166 (tester), which can persistently infect rhesus monkeys; and 26695 (driver), whose genome has been sequenced. Only two of 20 clones tested were represented twice, indicating that our potential library of J166-specific DNAs is not exhausted. About half of the clones recovered here were judged to be J166-specific by hybridization, and the sequences of 17 of the 18 tested were either absent (seven clones) or substantially different (ten clones) from sequences in strain 26695; the eighteenth contained 110 bp from J166 in place of 3.4 kb from 26695. In follow-up tests, each of 13 unrelated H. pylori strains that were tested for J166-specific sequences yielded a distinct array of PCR amplification products or hybridization pattern (Table 4). Collectively, these results reveal much more diversity among H. pylori strains than had been evident from studies to date (23–27), which have focused primarily on the cag PAI, plasmids and insertion sequences, and iceA1.

Several technical aspects of our bacterial genome subtraction protocol merit attention. First, the recovery of strain-specific DNAs with unrelated strains (J166 vs. 26695) was lower than that with isogenic control strains (cag+ and Δcag). This may reflect base substitution and restriction fragment length differences between tester and driver DNAs, which lower the efficiency with which driver DNAs titrate homologous tester sequences. Second, half of our J166-specific clones contained patches of sequence that were matched to those in strain 26695. Such DNAs tend to be lost in other subtractive hybridization protocols (13–17), although many of them could be important phenotypically (33, 34). Their recovery here may reflect incomplete pairing with driver DNA, which allows annealing with complementary tester DNA strands and thereby amplification (see Fig. 1). Third, although here we identified clones for further study by dot-blot hybridization, the potential importance of DNA segments containing mixtures of sequence, some matched and some not matched to reference DNAs (33, 34), will often make it worthwhile to move directly to sequencing of subtracted clones, without such prescreening.

DNA segments that are strain-specific in H. pylori populations probably occur at many different chromosomal loci, as illustrated by the 12 J166 DNAs that were tentatively mapped here (see Results). In most cases, the nature and sizes of these DNA segments remain to be defined. Some will certainly represent members of divergent gene families that are carried in most or all H. pylori strains (as illustrated with the BcgI homologs). Others may represent only single genes that are strain-specific in their distributions (e.g., iceA genes; ref. 25). A third important class may derive from much larger strain-specific DNA segments such as PAIs (other than the cag PAI) and may contain new genes affecting colonization or disease.

Evolutionary forces that may underlie certain polymorphisms are suggested by consideration of homologies to other database entries. In particular, 7 of our 18 differential clones exhibit homology to R-M systems. This finding is in accord with the diversity among alleles at a single chromosomal R-M locus in enteric bacteria (3), and the many putative R-M systems found in the H. pylori genome sequence (29). H. pylori seems to be far richer in putative R-M genes than other bacterial species whose genomes have been sequenced to date (e.g., entries in http://www.tigr.org). Why there are so many R-M genes in H. pylori is not known, however. At least five models are appealing: (i) R-M genes might contribute to bacterial survival after phage infection (3), although to our knowledge there has been only one report of an H. pylori phage (35) among more than 8,000 Medline citations. (ii) Some R-M genes may be “selfish”, tending to cause lethality if lost from a genome (36), although this selfish DNA model may apply to only certain types of R-M genes (37). (iii) DNA ends generated by restriction may stimulate recombination (38), which, in turn, should be beneficial to H. pylori (see below). (iv) R-M systems might affect bacterial–host interactions, an idea suggested by the virulence-associated H. pylori iceA1 gene, an R.NlaIII homolog, whose transcription is induced by host–cell contact (25). (v) DNA methylation related to that of R-M systems might be used to regulate gene expression or DNA replication (39).

Nine of the 18 DNAs had no database match to genes of known function. Some function-unknown genes might encode completely new classes of proteins (40), contribute to the remarkable specificity of individual strains for particular host individuals (41), or help determine the nature and severity of disease that infection can cause. Others might be vestigial and might make little if any contribution to fitness, although they may still be of great usefulness for evolutionary studies.

Ten of the 18 clones contained short patches of sequence that were closely matched to sequences in the 26695 driver and that were next to other nonmatched sequences. We suggest that such mixtures of matching and divergent sequences are also potentially important evolutionarily in facilitating the formation of new genotypes by recombination during mixed infection and in thereby sometimes facilitating quite dramatic changes in bacterial phenotype. For organisms that colonize inconstant and potentially hostile niches, such gene exchange can often be more potent than de novo mutation as an adaptive mechanism (42, 43). With H. pylori, such recombination may speed adaptation to different human hosts and may help the bacterium cope with changes in the gastric mucosa that infection elicits.
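
The distinction between clones that are absent from the driver, fully matched, or "mixed" can be illustrated with a toy classifier. This is a minimal sketch under stated assumptions, not the procedure used in this work: it uses exact k-mer matching rather than a database homology search, the sequences are invented for illustration, and the names `matched_mask` and `classify` and the choice k = 8 are hypothetical.

```python
def matched_mask(clone: str, reference: str, k: int = 8) -> list[bool]:
    """Mark each clone position covered by a k-mer that also occurs in the reference."""
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    mask = [False] * len(clone)
    for i in range(len(clone) - k + 1):
        if clone[i:i + k] in ref_kmers:
            for j in range(i, i + k):
                mask[j] = True
    return mask


def classify(clone: str, reference: str, k: int = 8) -> str:
    """Label a clone as 'absent', 'matched', or 'mixed' relative to the reference."""
    if not clone:
        raise ValueError("clone sequence must not be empty")
    covered = sum(matched_mask(clone, reference, k))
    if covered == 0:
        return "absent"
    if covered == len(clone):
        return "matched"
    return "mixed"


# Invented toy sequences (not real H. pylori DNA).
reference = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAGGCTA"
novel = "TTTTTGGGGGCCCCCAAAAA"          # shares no 8-mer with the reference
mixed_clone = reference[5:25] + novel   # matched patch next to a nonmatched patch

for name, seq in [("novel", novel), ("mixed", mixed_clone), ("shared", reference[10:30])]:
    print(name, classify(seq, reference))
```

In this toy case the first half of `mixed_clone` is flagged as matched and the second half is not, which is the pattern described above for clones with adjacent matching and divergent patches. Real sequences diverge by base substitution as well as by insertion, so an exact-match rule of this kind would understate similarity.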

Acknowledgments

D.E.B. is grateful to the late C. M. Berg for years of discussion on genetics and evolution. We also thank numerous colleagues, including A. Dubois, A. Raudonikiene, J. L. Lovett, and W. W. Su for strains, DNA samples, and communication of unpublished results. This work was supported by research Grants DK48029, AI138166, HG00820, and 1 R03 TW00835-01 from the National Institutes of Health, VM-121 from the American Cancer Society, 98-04-48508 from the Russian Foundation for Fundamental Research, 930491 (linkage) from the North Atlantic Treaty Organization, 75195-502007 (International Research Scholars Program) from the Howard Hughes Medical Institute, and by Clontech Research Funds.

ABBREVIATIONS

PAI: pathogenicity island

R-M: restriction–modification

Footnotes

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AF025963–AF025982).

References

  • 1. Davies J. Science. 1994;264:375–382. doi: 10.1126/science.8153624.
  • 2. Stroeher U H, Manning P A. Trends Microbiol. 1997;5:178–180. doi: 10.1016/s0966-842x(97)85010-x.
  • 3. King G, Murray N E. Trends Microbiol. 1994;2:465–469. doi: 10.1016/0966-842x(94)90649-1.
  • 4. Groisman E A, Ochman H. Trends Microbiol. 1997;5:343–349. doi: 10.1016/S0966-842X(97)01099-8.
  • 5. Hacker J, Blum-Oehler G, Muhldorfer I, Tschape H. Mol Microbiol. 1997;23:1089–1097. doi: 10.1046/j.1365-2958.1997.3101672.x.
  • 6. Campbell A. Annu Rev Microbiol. 1981;35:55–83. doi: 10.1146/annurev.mi.35.100181.000415.
  • 7. Hedges R W, Jacob A E, Barth P T, Grinter N J. Mol Gen Genet. 1975;141:263–267. doi: 10.1007/BF00341804.
  • 8. O’Brien A D, Newland J W, Miller S F, Holmes R K, Smith H W, Formal S B. Science. 1984;226:694–696. doi: 10.1126/science.6387911.
  • 9. Waldor M K, Mekalanos J J. Science. 1996;272:1910–1914. doi: 10.1126/science.272.5270.1910.
  • 10. Krawiec S, Riley M. Microbiol Rev. 1990;54:502–539. doi: 10.1128/mr.54.4.502-539.1990.
  • 11. Smith D R, Doucette-Stamm L A, Deloughery C, Lee H, Dubois J, Aldredge T, Bashirzadeh R, Blakely D, Cook R, Gilbert K, et al. J Bacteriol. 1997;179:7135–7155. doi: 10.1128/jb.179.22.7135-7155.1997.
  • 12. Himmelreich R, Plagens H, Hilbert H, Reiner B, Herrmann R. Nucleic Acids Res. 1997;25:701–712. doi: 10.1093/nar/25.4.701.
  • 13. Straus D, Ausubel F M. Proc Natl Acad Sci USA. 1990;87:1889–1893. doi: 10.1073/pnas.87.5.1889.
  • 14. Brown P K, Curtiss R. Proc Natl Acad Sci USA. 1996;93:11149–11154. doi: 10.1073/pnas.93.20.11149.
  • 15. Lisitsyn N, Lisitsyn N, Wigler M. Science. 1993;259:946–951. doi: 10.1126/science.8438152.
  • 16. Tinsley C R, Nassif X. Proc Natl Acad Sci USA. 1996;93:11109–11114. doi: 10.1073/pnas.93.20.11109.
  • 17. Calia K E, Waldor M K, Calderwood S B. Infect Immun. 1998;66:849–852. doi: 10.1128/iai.66.2.849-852.1998.
  • 18. Diatchenko L, Lau Y F C, Campbell A P, Chenchik A, Moqadam F, Huang B, Lukyanov S, Lukyanov K, Gurskaya N, Sverdlov E D, et al. Proc Natl Acad Sci USA. 1996;93:6025–6030. doi: 10.1073/pnas.93.12.6025.
  • 19. Gurskaya N G, Diatchenko L, Chenchik A, Siebert P D, Khaspekov G L, Lukyanov K A, Vagner L L, Ermolaeva O D, Lukyanov S A, Sverdlov E D. Anal Biochem. 1996;240:90–97. doi: 10.1006/abio.1996.0334.
  • 20. Parsonnet J. Infect Dis Clin North Am. 1998;12:185–197. doi: 10.1016/s0891-5520(05)70417-7.
  • 21. Moss S F, Fendrick M A, Cave D R, Modlin I M. Am J Gastroenterol. 1998;93:306–310. doi: 10.1111/j.1572-0241.1998.00306.x.
  • 22. Akopyanz N, Bukanov N O, Westblom T U, Kresovich S, Berg D E. Nucleic Acids Res. 1992;20:5137–5142. doi: 10.1093/nar/20.19.5137.
  • 23. Censini S, Lange C, Xiang Z, Crabtree J E, Ghiara P, Borodovsky M, Rappuoli R, Covacci A. Proc Natl Acad Sci USA. 1996;93:14648–14653. doi: 10.1073/pnas.93.25.14648.
  • 24. Akopyants N S, Clifton S W, Kersulyte D, Crabtree J E, Youree B E, Reece C A, Bukanov N O, Drazek E S, Roe B A, Berg D E. Mol Microbiol. 1998;28:37–53. doi: 10.1046/j.1365-2958.1998.00770.x.
  • 25. Xu Q, Peek R M, Miller G G, Blaser M J. J Bacteriol. 1997;179:6807–6815. doi: 10.1128/jb.179.21.6807-6815.1997.
  • 26. Minnis J A, Taylor T E, Knesek J E, Peterson W L, McIntire S A. Plasmid. 1995;34:22–36. doi: 10.1006/plas.1995.1030.
  • 27. Kersulyte D, Akopyants N S, Clifton S W, Roe B A, Berg D E. Gene. 1998; in press.
  • 28. Lukyanov K A, Launer G A, Tarabykin V S, Zaraisky A G, Lukyanov S A. Anal Biochem. 1995;229:198–202. doi: 10.1006/abio.1995.1402.
  • 29. Tomb J F, White O, Kerlavage A R, Clayton R A, Sutton G G, Fleischmann R D, Ketchum K A, Klenk H P, Gill S, Dougherty B A, et al. Nature (London) 1997;388:539–547. doi: 10.1038/41483.
  • 30. Kong H, Roemer S E, Waite-Rees P A, Benner J S, Wilson G G, Nwankwo D O. J Biol Chem. 1994;269:683–690.
  • 31. Suerbaum S, Josenhans C, Labigne A. J Bacteriol. 1993;175:3278–3288. doi: 10.1128/jb.175.11.3278-3288.1993.
  • 32. Bukanov N O, Berg D E. Mol Microbiol. 1994;11:509–523. doi: 10.1111/j.1365-2958.1994.tb00332.x.
  • 33. Coffey T J, Dowson C G, Daniels M, Spratt B G. Microb Drug Resist. 1995;1:29–34. doi: 10.1089/mdr.1995.1.29.
  • 34. Seifert H S. Mol Microbiol. 1996;21:433–440. doi: 10.1111/j.1365-2958.1996.tb02552.x.
  • 35. Schmid E N, von Recklinghausen G, Ansorg R. J Med Microbiol. 1990;32:101–104. doi: 10.1099/00222615-32-2-101.
  • 36. Kusano K, Naito T, Handa N, Kobayashi I. Proc Natl Acad Sci USA. 1995;92:11095–11099. doi: 10.1073/pnas.92.24.11095.
  • 37. O’Neill M, Chen A, Murray N E. Proc Natl Acad Sci USA. 1997;94:14596–14601. doi: 10.1073/pnas.94.26.14596.
  • 38. McKane M, Milkman R. Genetics. 1995;139:35–43. doi: 10.1093/genetics/139.1.35.
  • 39. Henaut A, Rouxel T, Gleizes A, Moszer I, Danchin A. J Mol Biol. 1996;257:574–585. doi: 10.1006/jmbi.1996.0186.
  • 40. Hinton J C. Mol Microbiol. 1997;26:417–422. doi: 10.1046/j.1365-2958.1997.6371988.x.
  • 41. Dubois A, Berg D E, Incecik E T, Fiala N, Heman-Ackah L M, Perez-Perez G I, Blaser M J. Infect Immun. 1996;64:2885–2891. doi: 10.1128/iai.64.8.2885-2891.1996.
  • 42. Chao L. Gene. 1997;205:301–308. doi: 10.1016/s0378-1119(97)00405-8.
  • 43. Crow J F. Dev Genet (Amsterdam) 1994;15:205–213. doi: 10.1002/dvg.1020150303.
  • 44. Taylor N S, Fox J G, Akopyants N S, Berg D E, Thompson N, Shames B, Yan L, Fontham E, Janney F, Hunter F M, et al. J Clin Microbiol. 1995;33:918–923. doi: 10.1128/jcm.33.4.918-923.1995.
  • 45. Tummuru M K, Cover T L, Blaser M J. Infect Immun. 1993;61:1799–1809. doi: 10.1128/iai.61.5.1799-1809.1993.
  • 46. Guruge J L, Falk P G, Lorenz R G, Dans M, Wirth H-P, Blaser M J, Berg D E, Gordon J I. Proc Natl Acad Sci USA. 1998;95:3925–3930. doi: 10.1073/pnas.95.7.3925.
  • 47. Akopyants N S, Jiang Q, Taylor D E, Berg D E. Helicobacter. 1997;2:48–52. doi: 10.1111/j.1523-5378.1997.tb00058.x.
  • 48. Dubois A, Berg D E, Incecik E T, Fiala N, Heman-Ackah L M, Del Valle J, Yang M, Wirth H-P, Perez-Perez G I, Blaser M J. Gastroenterology. In press.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences