Abstract
The pathogenesis of nontypeable Haemophilus influenzae (NTHi) begins with adhesion to the rhinopharyngeal mucosa. In almost 80% of NTHi clinical isolates, the HMW proteins are the major adhesins. The prototype HMW1 and HMW2 proteins, identified in NTHi strain 12, exhibit different binding specificities. The two binding domains have been localized in regions of maximal sequence dissimilarity (40% identity, 58% similarity). Two areas within these binding domains have been found essential for full-level adhesive activity (designated the core-binding domains). To investigate the conservation and diversity of the HMW1 and HMW2 core-binding domains among isolates, PCR and DNA sequencing were used. First, we separately amplified the hmw1A-like and hmw2A-like structural genes in nine invasive NTHi isolates, discovering two new hmwA alleles, whose sequences are herein reported. Then, the hmw1A-like and hmw2A-like PCR products were used as the template in nested PCR to produce amplicons encompassing the encoding sequences of the two core-binding domains. In-depth sequence analysis was then performed among sequences of each group, with the support of specific computer programs. Overall, extensive sequence diversity among isolates was highlighted. However, similarity plots showed patterns consisting of peaks of relatively high similarity alternating with strongly divergent regions. The phylogenetic tree clearly indicated the HMW1-like and HMW2-like core-binding domain sequences as two clusters. Distinct sets of conserved amino acid motifs were identified within each group of sequences using the MEME/MOTIFSEARCH tool. Since HMW adhesins could represent candidates for future vaccines, identification of specific patterns of conserved motifs in otherwise highly variable regions is of great interest.
Nontypeable Haemophilus influenzae (NTHi) is regarded as an important human pathogen, able to cause human respiratory tract disease and, especially in adults, invasive diseases such as meningitis and septicemia (11, 12). The initial step in the pathogenesis of NTHi disease involves establishment of bacteria on the rhinopharyngeal respiratory mucosa followed by contiguous spread within the respiratory tract and, occasionally, to sterile sites. A number of adherence factors, including both pilus and nonpilus adhesins, have been identified (24). In almost 80% of NTHi clinical isolates, the HMW proteins are the major adhesins responsible for attachment to human epithelial cells (29). They were originally identified as a group of high-molecular-weight, surface-exposed proteins, predominant targets of human serum antibody response during acute otitis media (1).
Given their functional role and their immunogenic character, HMW proteins have gained special consideration as NTHi vaccine candidates. The prototype proteins were detected in NTHi strain 12 and were designated HMW1 and HMW2 (2). They are encoded by separate chromosomal loci, hmw1 and hmw2, respectively, each comprising the hmwA gene, which encodes the structural protein, and two accessory genes, hmwB and hmwC, which encode proteins involved in the processing and secretion of the structural protein (3, 14, 30, 32). The hmw1A gene consists of 4.6 kb and encodes a 160-kDa preprotein, while the hmw2A gene is 4.4 kb and encodes a 155-kDa preprotein (2). Since, during maturation, the HMW1 and HMW2 proteins undergo cleavage of a 441-amino-acid N-terminal fragment, the mature proteins are 125 kDa and 120 kDa in size, respectively (15).
The sequences of the hmw1A and hmw2A genes are identical for the first 1,259 bp and thereafter partially diverge, even if a high level of identity (98%) is maintained in the first 1,685 bp and, at the 3′ end, in the last 740 bp (83%) (2). Comparison of the derived amino acid sequences reveals 99% identity over the first 441 residues (cleaved during maturation process) and nearly 80% identity over the 306 C-terminal residues, with the middle portion of the proteins sharing much less homology (2).
Interestingly, the HMW1 and HMW2 proteins exhibit different cellular binding specificities (16, 29). HMW1 binds to sialylated glycoproteins, whereas the receptor for HMW2 remains unknown (28). The two different HMW1 and HMW2 binding domains of strain 12 have recently been localized near the N terminus of the mature proteins in the regions of maximal sequence dissimilarity, with 40% identity and 58% similarity (8). In particular, the 124 amino acids between residues 114 and 237 in mature HMW1 and the 125 amino acids between residues 112 and 236 in mature HMW2 have been found to be essential for full-level adhesive activity (8). Although arbitrary, for the sake of clarity in this paper these essential regions will hereafter be referred to as the HMW1 and HMW2 core-binding domains.
Antigenically related HMW proteins have been detected in NTHi strains other than the prototype strain 12 (2, 19, 31). Likewise, sequences homologous to the hmwA genes have been demonstrated among NTHi strains, including isolates from invasive disease, by Southern blotting or by PCR (10, 19, 22, 31, 34). Recently, the presence of two physically distinct hmw loci has been demonstrated among genetically diverse NTHi isolates possessing hmwA sequences by Southern hybridization (4). Analysis of Escherichia coli transformants expressing these loci from two isolates indicated that both the isolates possess a protein with HMW1-like adherence properties and another with HMW2-like properties (4).
If various isolates produce two HMW proteins with different adherence properties as in the prototype strain 12, strain-to-strain conservation of the sequences critical for the different binding specificities would be expected. On the other hand, sequence analysis of hmwA alleles from different isolates revealed a high level of polymorphism in the receptor binding regions (9). In this study, a PCR approach in conjunction with DNA sequencing was used to investigate the conservation and diversity of the HMW1 and HMW2 binding domain sequences among isolates. First, we separately amplified the hmw1A and hmw2A genes in nine invasive NTHi isolates by employing primers complementary to flanking open reading frames (ORFs). Then, the hmw1A and hmw2A PCR products were used as the template in nested PCRs to produce amplicons including the coding sequences of the two core-binding domains. In-depth sequence analysis of these regions with the support of specific computer program tools revealed a complex pattern consisting of highly conserved motifs interrupted by strongly divergent regions. In the course of this study, the complete nucleotide sequence of the hmw1A and hmw2A genes from strain 72 was determined, providing further insight into the basis of the variability of HMW proteins.
MATERIALS AND METHODS
Selection of strains and growth conditions.
Nine epidemiologically unlinked NTHi strains were chosen among a collection of 38 invasive NTHi isolates previously found hmwA positive by PCR using a set of primers annealing to the conserved 5′ region of the hmw1A and hmw2A genes (M. Cerquetti, P. Spigaglia, G. Renna, R. Brunetti, R. Cardines, and P. Mastrantonio, Abstr. 12th Eur. Cong. Clin. Microbiol. Infect. Dis., abstr. P596, 2002). An effort was made to choose isolates representing a range of different invasive sites: blood (strains 72, 143, 157, 161, and 188), cerebrospinal fluid (strains 142 and 152), pleural fluid (strain 56), and peritoneal fluid (strain 91). NTHi strain 12 (kindly provided by S. J. Barenkamp, St. Louis University, MO), from which the hmw1A and hmw2A genes were originally cloned (2), was also included in the study.
Bacteria were grown overnight on chocolate agar plates supplemented with Vitox (Oxoid Ltd., Basingstoke, Hampshire, United Kingdom) at 37°C in 5% CO2. For DNA extraction, bacterial strains were grown in Haemophilus test medium (HTM) broth, consisting of Mueller-Hinton broth (Oxoid Ltd.) supplemented with 0.5% yeast extract and HTM supplement (Oxoid Ltd.), and incubated under the same conditions.
Southern hybridization.
Chromosomal DNA was digested with the restriction endonuclease BglII (BioLabs Inc.), electrophoresed on a 0.7% agarose gel, transferred to a nylon membrane (HybondN; Amersham Biosciences, Buckinghamshire, United Kingdom) by alkaline blotting, and then fixed by exposure to UV light. The probe was a 1,245-bp nucleotide fragment containing the 5′ conserved region of the hmwA genes. It was generated by PCR amplification of prototype strain 12 genomic DNA by using the primers and PCR conditions previously described (34). Labeling of the probe and hybridization were performed by using the ECL labeling kit (Amersham Biosciences). Hybridizing bands were visualized by autoradiography.
PCR analysis of hmw loci.
In each isolate, the hmw1A-like and the hmw2A-like genes were separately amplified by employing primers complementary to flanking genes. In prototype strain 12 the hmw1 gene cluster is located downstream of ORF HI1679, while the hmw2 cluster is downstream of ORF HI1598 (both ORFs were originally identified in H. influenzae strain Rd, GenBank accession no. NC_000907) (25). Inside the hmw clusters, the hmw1A and hmw2A structural genes are upstream of the hmw1B and the hmw2B accessory genes, respectively (2). On the basis of this information, two primer sets (ORF5-1/HMWB3R and ORF5-2/HMWB3R) were designed, in which the forward primers ORF5-1 and ORF5-2 recognized sequences at the 3′ end of ORFs HI1679 and HI1598, respectively, while the common reverse primer HMWB3R annealed to a conserved region in the hmw1B and hmw2B genes (Table 1).
TABLE 1.
PCR primers
Type | Primer | Nucleotide sequence (5′ to 3′) | Positions | Reference sequence |
---|---|---|---|---|
hmw1A | ORF5-1 | TGG AAC TTC TTT TGC TGT GGC TGA TGC | 1900-1926 | U32841^a |
hmw1A | HMWB3R | GAT GAA GAA GCC AGG CCA AGC AAT AC | 5181-5156 | M84616 |
hmw1A | HMWA6 | AAT GTA TCA GGC AAA GAA AAA GGC | 1488-1511 | M84616 |
hmw1A | HMWA11 | CAA AAG TGT TAT GTT GCC TCC GG | 2780-2757 | M84616 |
hmw1A | HMWA7 | CAG CAG AGA TTT TTG AAT CTT TAA C | 3618-3594 | M84616 |
hmw2A | ORF5-2 | CCT CTT AAT TGG GCA TTA GTT GG | 9904-9926 | U32833^a |
hmw2A | HMWB3R | GAT GAA GAA GCC AGG CCA AGC AAT AC | 5002-4977 | M84615 |
hmw2A | HMWA6 | AAT GTA TCA GGC AAA GAA AAA GGC | 1489-1512 | M84615 |
hmw2A | HMWA13 | AAT CAT CTT TCG TCT GTC TGA GGC | 2791-2768 | M84615 |
hmw2A | HMWA7 | CAG CAG AGA TTT TTG AAT CTT TAA C | 3625-3601 | M84615 |
^a Section of strain Rd KW20 complete genome (L42023).
Based on strain 12 sequences, the expected sizes for the hmw1A-like and the hmw2A-like gene PCR products were approximately 5.5 kb and 5.3 kb, respectively. PCR analysis was carried out on genomic DNAs isolated with the QIAamp DNA kit (QIAGEN S.p.A., Milan, Italy). The amplification reaction was performed using 1.5 U of Takara Ex-Taq (Takara Bio Inc., Shiga, Japan) with a reaction mixture containing Ex-Taq buffer (2 mM MgCl2), 2.5 mM of each deoxynucleoside triphosphate, 50 pmol of each primer, and 2 μl of DNA in a total volume of 50 μl. After an initial denaturation step at 95°C for 5 min, the sample underwent 30 cycles with the following parameters: denaturation at 95°C (1 min), annealing at 63°C (1 min), elongation at 72°C (6 min), and finally 10 min of incubation at 72°C. PCR products were electrophoresed on a 1.0% agarose gel. In all the PCRs, NTHi strain 12 was used as the positive control.
Detection of HMW proteins by Western blotting.
Whole-cell proteins were prepared by heating bacteria solubilized in electrophoretic sample buffer at 100°C for 5 min. Proteins were resolved by sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE) on a 7.5% acrylamide gel and transferred to nitrocellulose sheets in a transblotting cell (Bio-Rad Laboratories Inc., Richmond, California). HMW protein expression was detected by using the 28D rabbit polyclonal antiserum (kindly provided by S. J. Barenkamp) raised against the HMW1 and HMW2 proteins of NTHi strain 12 (2) and cross-reactive with the HMW proteins of other NTHi strains. The 28D rabbit polyclonal antiserum was used at a 1:250 dilution. Anti-rabbit immunoglobulin G alkaline phosphatase conjugate was used as the secondary antibody (Sigma-Aldrich Inc., St. Louis, Mo.). NTHi strain 12 was used as a control HMW-expressing isolate.
Amplification of the hmw1A-like and hmw2A-like core-binding domain sequences.
In order to obtain amplicons including the hmw1A-like and the hmw2A-like core-binding domain sequences of each isolate, two separate nested PCRs were performed by using the hmw1A-like or the hmw2A-like gene PCR products as templates and by employing different internal primer sets (Table 1). Based on the published sequences of strain 12, which together with the one from strain A950006 were the only hmwA sequences accessible from GenBank at the time this study began, two primer sets were initially designed: HMWA6/HMWA11, to be used with the hmw1A-like gene PCR products, and HMWA6/HMWA13, to be used with the hmw2A-like gene PCR products. Both primer sets gave rise to expected PCR products of approximately 1,300 bp. Both amplifications were performed with 1.5 U of Takara Taq (Takara Bio Inc.) with a reaction mixture containing buffer (10 mM Tris-HCl, 50 mM KCl, and 1.5 mM MgCl2), 2.5 mM of each deoxynucleoside triphosphate, 50 pmol of each primer, and 2 μl of DNA in a total volume of 50 μl. The thermocycling conditions were 95°C for 5 min, 25 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 30 s, followed by a final elongation step of 72°C for 5 min.
For isolates found negative by the HMWA6/HMWA11 or HMWA6/HMWA13 primer set, an additional reverse primer, HMWA7, was designed, which annealed to a small conserved region that is present within the variable regions of both published hmw1A and hmw2A sequences from strain 12. This primer was used with the HMWA6 primer in both nested PCRs, giving rise to an expected PCR product of approximately 2,130 bp. The reaction mixture and PCR conditions were the same as reported above for the analysis of hmw loci, except the elongation time was 2 min. In all the PCRs, NTHi strain 12 was used as the positive control.
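The approximate product sizes quoted for the nested PCRs can be checked against the primer coordinates in Table 1. The snippet below is only an arithmetic sketch: it assumes that the positions listed for each primer run 5′ to 3′ on the reference sequence, so that the amplicon spans the first position of the forward primer to the first position of the reverse primer, inclusive. The names `PRIMER_PAIRS` and `amplicon_length` are illustrative and are not part of the original methods.

```python
# Outer primer coordinates copied from Table 1:
# (5' end of forward primer, 5' end of reverse primer) on M84616 / M84615.
PRIMER_PAIRS = {
    ("hmw1A", "HMWA6/HMWA11"): (1488, 2780),
    ("hmw1A", "HMWA6/HMWA7"): (1488, 3618),
    ("hmw2A", "HMWA6/HMWA13"): (1489, 2791),
    ("hmw2A", "HMWA6/HMWA7"): (1489, 3625),
}


def amplicon_length(forward_start: int, reverse_start: int) -> int:
    """Inclusive distance between the outer ends of the two primers."""
    return reverse_start - forward_start + 1


# Expected product size for every nested primer pair.
expected_sizes = {
    key: amplicon_length(*coords) for key, coords in PRIMER_PAIRS.items()
}

for (gene, pair), size in expected_sizes.items():
    print(f"{gene} {pair}: {size} bp")
```

Under these assumptions the HMWA6/HMWA11 and HMWA6/HMWA13 products come out close to 1,300 bp and the HMWA6/HMWA7 products close to 2,130 bp on the strain 12 reference sequences, in line with the sizes stated in the text.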
Nucleotide sequencing.
PCR fragments were purified using the Nucleospin extract kit (Macherey-Nagel, Duren, Germany). Sequencing was performed by using the fluorescent dideoxy-chain terminator method on an ABI 3730 DNA sequencer (Applied Biosystems, Foster City, Calif.). Both strands were sequenced. Several oligonucleotide primers were generated as necessary to complete the sequences.
Sequence analysis.
Sequence analysis was performed with the programs of the University of Wisconsin Genetics Computer Group (GCG) package, version 10.3. Unless otherwise stated, analyses were performed using default parameters. The pairwise sequence identities between the nucleotide sequences of the hmwA-like genes from strain 72 and those from strain 12 (GenBank accession numbers U08875 and U08876) were calculated by use of the GAP and DOTPLOT programs. The nucleotide sequences of the hmw1A-like and hmw2A-like core-binding domains were aligned using the PILEUP program and adjusted manually to conform to the optimized alignment of deduced amino acid sequences. Matrixes of the pairwise similarities were determined with the OLDDISTANCES program. Similarity plots were generated using the PLOTSIMILARITY program. The deduced amino acid sequences of the HMW1 and HMW2 core-binding domains were obtained with the TRANSLATE program and then sequentially analyzed by the MEME and MOTIFSEARCH programs to find conserved motifs. Briefly, the MEME program identifies likely conserved motifs in a family of sequences and saves these motifs as a set of profiles; MOTIFSEARCH then uses this set of profiles as a query to search the members of the original family, detailing the matches between the profiles and each of the members.
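The idea behind the pairwise similarity matrixes and the similarity plots can be illustrated with a few lines of Python. This is a simplified sketch and not a reimplementation of the GCG programs: it scores positions as identical or not (no scoring matrix), skips gapped positions in pairwise comparisons, and smooths column scores with a plain running mean. The toy alignment and all function names are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def column_similarity(alignment: list[str]) -> list[float]:
    """Per-column share of sequence pairs that carry the same residue."""
    n_pairs = len(alignment) * (len(alignment) - 1) // 2
    scores = []
    for column in zip(*alignment):
        matches = sum(
            x == y and x != "-" for x, y in combinations(column, 2)
        )
        scores.append(matches / n_pairs)
    return scores


def windowed(scores: list[float], window: int) -> list[float]:
    """Running mean of the column scores over a sliding window."""
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical three-sequence alignment, for illustration only.
toy_alignment = ["ATGCCGTA", "ATGCAGTA", "ATGTTGTA"]
profile = windowed(column_similarity(toy_alignment), window=2)
print(pairwise_identity(toy_alignment[0], toy_alignment[1]))
print(profile)
```

Plotting `profile` against alignment position gives the kind of curve in which conserved stretches appear as peaks and divergent stretches as troughs.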
To generate the phylogenetic tree, the amino acid sequences were aligned using the multiple sequence alignment Clustal W (version 1.8) program (33). The phylogenetic tree was constructed using the neighbor-joining method (26). Regions of the alignments containing gaps were excluded. Support for specific tree topologies was estimated by bootstrap analysis with 1,000 pseudoreplicates to evaluate internal branches.
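Two of the steps described above, exclusion of gapped alignment columns and bootstrap resampling, can be sketched as follows. This is an illustrative outline only: it does not build the neighbor-joining tree, it uses a simple proportion of differing sites (p-distance) rather than whatever distance the tree software applied, and the function names and toy sequences are assumptions introduced here.

```python
import random


def drop_gap_columns(alignment: list[str]) -> list[str]:
    """Remove every alignment column that contains at least one gap."""
    kept = [col for col in zip(*alignment) if "-" not in col]
    if not kept:
        return ["" for _ in alignment]
    return ["".join(chars) for chars in zip(*kept)]


def p_distance(a: str, b: str) -> float:
    """Proportion of sites at which two gap-free aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_replicate(alignment: list[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement (one pseudoreplicate)."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


# Hypothetical toy alignment, for illustration only.
toy = ["AC-TGA", "A-GTGA", "ACGTCA"]
clean = drop_gap_columns(toy)
rng = random.Random(0)
replicates = [bootstrap_replicate(clean, rng) for _ in range(1000)]
print(clean, p_distance(clean[0], clean[2]))
```

In a full analysis, a distance matrix and tree would be computed for each pseudoreplicate, and the bootstrap value of a branch would be the number of replicate trees in which that grouping recurs.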
Nucleotide sequence accession numbers.
All the hmwA sequences reported in this paper have been submitted to the EMBL Nucleotide Sequence Database. The accession numbers for the complete hmw1A-like and hmw2A-like gene sequences from strain 72 are AJ937359 and AJ937360, respectively. The accession numbers for the partial hmw1A-like gene sequences encompassing the core-binding domain regions are AJ920371 through AJ920378. The accession numbers for the partial hmw2A-like gene sequences encompassing the core-binding domain regions are AJ937351 through AJ937358.
RESULTS
As a first step, we confirmed the presence of sequences homologous to the hmwA genes in the nine NTHi isolates by Southern hybridization assays, using an hmwA-specific probe obtained from prototype strain 12. When the chromosomal DNAs were digested with BglII restriction enzyme, two hybridization bands were visible in all isolates but one (strain 143, showing only one band) (data not shown). Since, based on the strain 12 sequences, neither the hmw1 nor the hmw2 cluster contained BglII recognition sites, the presence of two hybridization bands suggested that the isolates harbored two distinct hmwA genes. Variability in the sizes of the hybridization fragments was observed among isolates, indicating considerable polymorphism in the location of the BglII restriction sites.
Chromosomal location of the hmwA genes.
The chromosomal location of the hmwA genes was determined by PCR on the basis of the known positions of the hmw1A and hmw2A genes in strain 12 (25). The nine isolates were subjected to two separate PCRs by employing forward primers complementary to the 3′ end of the unique upstream ORFs and a common reverse primer annealing to a conserved region of the hmw1B and hmw2B genes. In each isolate, the expected amplicon was generated in both PCRs (Table 2), indicating the presence of two physically distinct hmwA genes, one (designated hmw1A-like) downstream of ORF HI1679 and the other (designated hmw2A-like) downstream of ORF HI1598, as in the prototype strain 12. Since even strain 143 appeared to contain two distinct hmwA genes, it was likely that two comigrating hybridization bands of similar size were present in the Southern blot. Variation among different isolates was noticed in the fragment lengths produced by amplifications with the primer set specific for the hmw1A-like gene. The sizes of these amplicons ranged from approximately 4,900 bp to 6,100 bp, suggesting that deletions and insertions were present inside the hmw1A-like gene or in the intergenic regions. In contrast, the product sizes corresponding to the hmw2A-like genes appeared quite conserved in all strains analyzed.
TABLE 2.
PCR amplification results
Strain no. | hmw1A primer set: ORF5-1/HMWB3R | hmw1A primer set: HMWA6/HMWA11 | hmw1A primer set: HMWA6/HMWA7 | hmw2A primer set: ORF5-2/HMWB3R | hmw2A primer set: HMWA6/HMWA13 | hmw2A primer set: HMWA6/HMWA7 |
---|---|---|---|---|---|---|
56 | + | + | ND^a | + | + | ND |
72 | + | − | + | + | − | + |
91 | + | + | ND | + | − | + |
142 | + | + | ND | + | − | + |
143 | + | + | ND | + | − | + |
152 | + | + | ND | + | − | + |
157 | + | + | ND | + | − | + |
161 | + | − | + | + | + | ND |
188 | + | + | ND | + | + | ND |
^a ND, not done.
HMW protein expression.
Western blotting with the 28D polyclonal rabbit antiserum showed the presence of reactive HMW proteins in seven of the nine NTHi isolates. Isolates 72 and 143 did not show any reactive protein (data not shown), suggesting that they either expressed antigenically variant HMW proteins or did not express any HMW protein at all.
Sequence analysis of the hmw1A and hmw2A genes from strain 72.
Nucleotide sequencing was performed on the hmw1A-like and hmw2A-like gene PCR products from strain 72. The total lengths of the complete nucleotide sequences for the hmw1A-like and hmw2A-like genes were 4.4 kb and 4.3 kb, respectively. The strain 72 sequences were compared with the published hmw1A and hmw2A sequences from strain 12. By Gap analysis, the strain 72 hmw1A sequence showed an overall identity of 80.47% with the hmw1A sequence from strain 12. The alignment required the introduction of multiple gaps. Interestingly, insertions or deletions within the ORFs were always constituted of three or a multiple of three nucleotides, so the reading frame was maintained.
Comparison of the 72 hmw2A sequence with that from strain 12 indicated an overall identity of 76.81%. Insertions or deletions were generally composed of three or a multiple of three nucleotides, with the exception of a single-base deletion at position 2725, which introduced a downstream stop codon at position 2739 and would prevent synthesis of the full-size HMW2 protein. Furthermore, inspection of nucleotide sequences upstream of the hmw1A-like and hmw2A-like genes revealed the presence of a large number of 7-bp direct repeats (28 repeats) arranged in tandem array in both promoters. According to the findings previously reported by Dawid et al., variation in the number of 7-bp tandem repeats in the hmw1A and hmw2A promoters regulates bacterial expression of HMW adhesins, with an inverse stepwise relationship between the number of repeats and the level of protein expression (7). In strain 72, which showed no reactive protein by Western blotting, the presence of a high number of 7-bp repeats in both the hmw1A and hmw2A promoters suggested that down-modulation of expression of both HMW1 and HMW2 adhesins may have occurred.
The nucleotide sequences of the 72 hmwA-like genes were further analyzed in comparison with those from strain 12 by dot plot analysis. To obtain an overall view of matching regions between the two hmw1A genes (strain 72 versus strain 12), a comparison window of 16 with a high stringency of 14 was used (Fig. 1a). The consensus diagonal line revealed conserved regions between the two sequences up to the first 1,800 nucleotides and in the 3′ end of the alignment. In the middle portion of the diagonal, the presence of multiple breaks indicated unmatching regions due to insertions, deletions, or sequence diversity. In particular, three long gaps within the 72 hmw1A gene, corresponding to deletions of three regions that were present in the 12 hmw1A sequence (positions 3897 to 3932, 4067 to 4084, and 4284 to 4406, respectively) caused shifts of the consensus diagonal line (Fig. 1a, arrow). Of note, at the 5′ end of the sequences, the different number of 7-bp tandem repeats in the promoter regions was visualized as a black rectangular box. When the hmw2A sequences were analyzed (strain 72 versus strain 12), dot plots showing conserved regions up to the first 1,750 nucleotides and at the 3′ end were obtained, similar to those found in the hmw1A genes (Fig. 1b). However, the middle portion of the consensus diagonal line showed the presence of larger breaks, indicating a lower level of homology.
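As a quick arithmetic check, the three regions of the strain 12 hmw1A sequence reported as deleted in strain 72 can be tested for reading-frame preservation. The sketch below assumes that the quoted coordinates are inclusive; the variable and function names are illustrative only.

```python
# Regions of the strain 12 hmw1A sequence reported as absent from strain 72
# (start, end), taken from the text and assumed to be inclusive coordinates.
DELETED_REGIONS = [(3897, 3932), (4067, 4084), (4284, 4406)]


def region_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


def is_in_frame(indel_length: int) -> bool:
    """An indel keeps the reading frame only if its length is a multiple of 3."""
    return indel_length % 3 == 0


lengths = [region_length(start, end) for start, end in DELETED_REGIONS]
for (start, end), length in zip(DELETED_REGIONS, lengths):
    print(f"{start}-{end}: {length} bp, in frame: {is_in_frame(length)}")
```

All three lengths are multiples of three, consistent with the observation that indels in the hmw1A-like gene maintained the reading frame, whereas a single-base deletion such as the one described in the strain 72 hmw2A-like gene does not.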
FIG. 1.
Dot plot analysis of the hmw1A-like gene (a) and the hmw2A-like gene (b) from strain 72 compared with the corresponding genes from prototype strain 12. In both graphs, the x axis is strain 72 and the y axis is strain 12. Nucleotide positions are indicated on the top and on the right. Comparisons begin 351 nucleotides upstream of the hmwA start codon and continue to the end of the genes. Breaks in the diagonal line indicate insertion, deletion, or extensive sequence diversity. Black rectangular boxes represent the 7-bp tandem repeats in the gene promoter regions. The arrows indicate shifts of the consensus diagonal lines. Comparison windows and stringency values are indicated on the left.
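The window and stringency parameters of a dot plot can be illustrated with a minimal sketch. It assumes a simple identity count, in which a dot is recorded when at least `stringency` of the `window` compared positions are identical; the GCG implementation may score matches differently. The sequences below are hypothetical, and the small window is chosen only so that the toy example stays short.

```python
def dot_plot(seq_x: str, seq_y: str, window: int = 16, stringency: int = 14):
    """Return (i, j) window starts where >= `stringency` positions match."""
    dots = []
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            matches = sum(
                seq_x[i + k] == seq_y[j + k] for k in range(window)
            )
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Hypothetical 30-nt sequence and a copy lacking three bases (a toy deletion).
seq_a = "ACGTTGCAAGCTTAGGCATCGATCCGTAAT"
seq_b = seq_a[:10] + seq_a[13:]

dots = dot_plot(seq_a, seq_b, window=6, stringency=6)
print(dots)
```

Upstream of the deletion the dots lie on the main diagonal (i equal to j); downstream they lie on a diagonal displaced by three positions, which is the kind of shift marked by the arrows in Fig. 1.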
Conservation and diversity of the hmw1A-like and hmw2A-like gene fragments encoding the core-binding domain of the HMW1 and HMW2 proteins among NTHi isolates.
To investigate the genetic diversity of the hmw1A-like and hmw2A-like gene regions encoding the core-binding domain of the HMW1 and the HMW2 proteins, we amplified and sequenced the complete core-binding domain portion of each of the 18 hmwA-like genes from the nine NTHi strains. Amplicons were generated by nested PCR by using either the hmw1A-like or the hmw2A-like gene PCR products as templates and by employing different internal primer sets. According to the template used, the core-binding domain sequences were designated as either hmw1A-like or hmw2A-like. Besides prototype strain 12, strain 72 was included in PCR assays as a further control.
Table 2 shows the results of the PCR analysis. Initially, the hmw1A-like gene PCR products were subjected to amplification using primer set HMWA6-HMWA11, while the hmw2A-like PCR products were analyzed with primer set HMWA6-HMWA13. For the hmw1A-like genes, PCR was successful in seven of the nine isolates, while for the hmw2A-like genes only three amplicons were produced. Considering the positions of the three primers HMWA6, HMWA11, and HMWA13 on strain 12 sequences, the forward HMWA6 primer annealed to a conserved region of the hmw1A and hmw2A genes, whereas both the reverse HMWA11 and HMWA13 primers recognized sequences in their divergent and hypervariable parts.
Based on this information, failure to detect PCR products in some isolates might be explained by sequence heterogeneity at the binding sites of primers HMWA11 and HMWA13. Therefore, an additional reverse HMWA7 primer was designed and PCRs were repeated on the samples previously found negative by primers HMWA6 and HMWA11 or primers HMWA6 and HMWA13. The expected 2,130-bp sequence was amplified in all the isolates tested. As expected, PCR was successful with all primer sets in prototype strain 12, whereas strain 72 gave positive results only with primers HMWA6 and HMWA7. Nucleotide sequencing was carried out on amplicons obtained by nested PCR. For each isolate, the complete nucleotide sequences of both the hmw1A-like and hmw2A-like core-binding domains were determined.
The hmw1A-like and hmw2A-like core-binding domain sequences from the nine different isolates were separately aligned in comparison with the corresponding published sequences of strain 12 using the Pileup program and adjusted manually to conform to the optimized alignment of deduced amino acid sequences. Matrixes of the pairwise similarities within the hmw1A and hmw2A aligned sequences were determined (Fig. 2a and c). Among the hmw1A-like core-binding domain sequences, similarity values for nucleotide-nucleotide comparisons ranged from 0.3360 (strain 143 in comparison with strain 152) to 1.00 (strains 56, 142, 157, and 188 in comparison with strain 12), indicating the latter were 100% identical to strain 12 (Fig. 2a). As far as the hmw2A-like core-binding domain sequences are concerned, similarity values ranged from 0.4570 to 1.00, with two isolates (strains 56 and 188) being 100% identical to strain 12 (Fig. 2c). In both matrixes, strains isolated from the same sources did not exhibit significantly higher similarity values than strains from different sources.
FIG. 2.
Conservation and diversity of the hmw1A-like and hmw2A-like core-binding domain sequences among invasive NTHi isolates. (a and c) Matrixes of pairwise similarities among the hmw1A-like core-binding domain sequences (a) and among the hmw2A-like core-binding domain sequences (c). Isolate code numbers are indicated in the column on the far left and on the row above. The hmwA sequences from prototype strain 12 were included in pairwise comparisons. Above the diagonals, values represent pairwise similarities among the aligned sequences of each group, where a value of 1.0000 indicates 100% identity. (b and d) Similarity plots of the hmw1A-like core-binding domain sequences (b) and the hmw2A-like core-binding domain sequences (d). Similarity score values are indicated on the left (vertical axis). Nucleotide positions, based on strain 12 sequences, are indicated below the graph (horizontal axis). The dotted lines represent average similarities across the entire alignments.
When PLOTSIMILARITY was used to display graphically the average similarity at each position in the alignment among all the sequences in each group, a nonhomogeneous trend was observed. Indeed, peaks of relatively high similarity (up to 90%) were found close to strongly divergent regions (<65% similarity down to 20%) within both the hmw1A-like and hmw2A-like core-binding domain sequences (Fig. 2b and d).
The hmw1A-like and hmw2A-like core-binding domain nucleotide sequences were translated and the deduced amino acid sequences were designated either HMW1-like or HMW2-like, according to the encoding genes. A phylogenetic tree was constructed using Clustal W alignment of all the HMW core-binding domain sequences, irrespective of their belonging to the HMW1 or HMW2 group. As shown in Fig. 3, the HMW1-like and HMW2-like core-binding domain sequences clustered into two separate groups in agreement with the encoding hmw1A-like or hmw2A-like genes. The only exceptions were the 142 and 143 HMW2-like core binding domain sequences that clustered into the opposite group and the 143 HMW1-like core-binding domain sequence that appeared clearly distinct from the other members of the HMW1 group.
FIG. 3.
Phylogenetic tree based on the deduced amino acid sequences of the HMW1 and HMW2 core-binding domains. The phylogenetic tree was constructed with Clustal W alignment of all the HMW core-binding domain sequences using the neighbor-joining method. Regions of the alignments containing gaps were excluded from the phylogenetic analysis. Numbers on the branches indicate bootstrap values out of 1,000 replicates.
Examination of the amino acid distance matrix indicated a considerable degree of variation in distance values among sequences within the same phylogenetic cluster, with the HMW1 sequences ranging from 0.000 to 0.529 and the HMW2 sequences from 0.000 to 0.435 (data not shown). The three sequences that did not cluster in the homologous group appeared strongly divergent from all other sequences, whether HMW1 or HMW2 (143 HMW1 versus HMW1 group: range, 0.489 to 0.609; 143 HMW1 versus HMW2 group: range, 0.398 to 0.471; 143 HMW2 versus HMW1 group: range, 0.435 to 0.571; 143 HMW2 versus HMW2 group: range, 0.441 to 0.544; 142 HMW2 versus HMW1 group: range, 0.433 to 0.523; 142 HMW2 versus HMW2 group: range, 0.564 to 0.602). This result was in agreement with that previously shown in the matrixes of the pairwise similarities for nucleotide-nucleotide comparison, where the three sequences had the lowest similarity values.
Overall, phylogenetic analysis suggested divergence of the HMW binding domain during evolution and indicated a greater degree of conservation within each of the HMW1 and HMW2 groups. To investigate further strain-to-strain conservation of the two core-binding domains, the HMW1-like and HMW2-like amino acid sequences were analyzed by MEME and MOTIFSEARCH computer programs to identify conserved motifs. When the HMW1-like or the HMW2-like core-binding domain sequences were analyzed separately, distinct sets of motifs were discovered within each group (Tables 3 and 4). For each motif, a multilevel consensus sequence showing all likely (probability ≥ 0.2) amino acids for each position was calculated (Tables 3 and 4). The motifs ranged from 8 to 17 amino acids and from 8 to 20 amino acids for the HMW1-like and the HMW2-like core-binding domain sequences, respectively.
TABLE 3.
Motif patterns in the HMW1-like core-binding domain sequences^a
Strain no. | Positions and <motif> | | | | | | Combined P value | E value^b |
---|---|---|---|---|---|---|---|---|
12 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (74 <6> 84) | (86 <4> 93) | (104 <3> 113) | 8.5E-67 | 8.5E-66 |
142 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (74 <6> 84) | (86 <4> 93) | (104 <3> 113) | 8.5E-67 | 8.5E-66 |
157 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (74 <6> 84) | (86 <4> 93) | (104 <3> 113) | 8.5E-67 | 8.5E-66 |
188 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (74 <6> 84) | (86 <4> 93) | (104 <3> 113) | 8.5E-67 | 8.5E-66 |
56 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (74 <6> 84) | (86 <4> 93) | (104 <3> 113) | 8.5E-67 | 8.5E-66 |
72 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (76 <6> 86) | (88 <4> 95) | (105 <3> 114) | 2.2E-55 | 2.2E-54 |
161 | (7 <2> 18) | (30 <5> 41) | (46 <1> 62) | (79 <6> 89) | (91 <4> 98) | (108 <3> 117) | 8.6E-54 | 8.6E-53 |
152 | (14 <2> 25) | (30 <5> 41) | (46 <1> 62) | (76 <6> 86) | (88 <4> 95) | (106 <3> 115) | 3.6E-51 | 3.6E-50 |
143 | (7 <2> 18) | (30 <5> 41) | (47 <1> 63) | (72 <6> 82) | — | (101 <3> 110) | 3.9E-42 | 3.9E-41 |
91 | (7 <2> 18) | (31 <5> 42) | (47 <1> 63) | (80 <6> 90) | — | — | 1.0E-41 | 1.0E-40 |
(a) The motifs (multilevel consensus sequences) represented in the sequences were as follows: 1, FRFNNVSLNGTGSGLQF; 2, (I/L)NITA(K/G)Q(D/S)(I/V)AFE; 3, GRTYWNLTSL; 4, VNISMVLP; 5, (Q/A)VITGQG(T/V)ITSG; and 6, NKF(E/D)GTLNISG. The number order of the motifs is that automatically generated by the MEME program and does not correspond to the position in the sequences. Motif alignment displays occurrences of the motifs (multilevel consensus sequences) within each sequence in the HMW1-like group. Numbers on the left and on the right identify amino acid positions where the motif begins and ends.
(b) The E value is the combined P value multiplied by the number of sequences in the HMW1-like dataset.
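To make the "(start <motif> end)" notation of the motif alignment concrete, the sketch below turns a multilevel consensus into a regular expression and reports 1-based start and end positions of each match. This is an assumption-laden simplification: MOTIFSEARCH scores sequences against a probabilistic motif model rather than requiring an exact pattern match. The helper names and the toy sequence are hypothetical; only the two consensus strings come from Table 3.

```python
import re


def consensus_to_regex(consensus):
    """Convert '(I/L)NITA' style alternatives into a regex character class: '[IL]NITA'."""
    return re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def find_motif(sequence, consensus):
    """Return (start, end) pairs, 1-based and inclusive, for exact consensus matches."""
    pattern = re.compile(consensus_to_regex(consensus))
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


MOTIF_2 = "(I/L)NITA(K/G)Q(D/S)(I/V)AFE"  # HMW1-like motif 2 (Table 3)
MOTIF_4 = "VNISMVLP"                      # HMW1-like motif 4 (Table 3)

# Hypothetical toy sequence built for illustration, not a real HMW1 sequence.
toy_sequence = "MK" + "LNITAKQDIAFE" + "GG" + "VNISMVLP"

for number, motif in ((2, MOTIF_2), (4, MOTIF_4)):
    for start, end in find_motif(toy_sequence, motif):
        print(f"({start} <{number}> {end})")
```

A motif that finds no match in a sequence simply yields an empty list, which corresponds to the empty cells in the table.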
TABLE 4.
Motif patterns in the HMW2-like core-binding domain sequences (a)
Strain no. | Positions and <motif> | | | | | | Combined P value | E value (b) |
---|---|---|---|---|---|---|---|---|
56 | (1 <1> 16) | (17 <6> 24) | (30 <4> 43) | (52 <3> 66) | (74 <5> 93) | (101 <2> 112) | 1.1E-80 | 1.1E-79 |
12 | (1 <1> 16) | (17 <6> 24) | (30 <4> 43) | (52 <3> 66) | (74 <5> 93) | (101 <2> 112) | 1.1E-80 | 1.1E-79 |
188 | (1 <1> 16) | (17 <6> 24) | (30 <4> 43) | (52 <3> 66) | (74 <5> 93) | (101 <2> 112) | 1.1E-80 | 1.1E-79 |
91 | (1 <1> 16) | (17 <6> 24) | (31 <4> 44) | (53 <3> 67) | (75 <5> 94) | (102 <2> 113) | 1.8E-72 | 1.8E-71 |
157 | (1 <1> 16) | (18 <6> 25) | (32 <4> 45) | (54 <3> 68) | (76 <5> 95) | (103 <2> 114) | 6.1E-72 | 6.1E-71 |
152 | (1 <1> 16) | (18 <6> 25) | (32 <4> 45) | (54 <3> 68) | (76 <5> 95) | (103 <2> 114) | 7.2E-72 | 7.2E-71 |
161 | (1 <1> 16) | (17 <6> 24) | (30 <4> 43) | (52 <3> 66) | (74 <5> 93) | (101 <2> 112) | 1.9E-70 | 1.9E-69 |
72 | (1 <1> 16) | (18 <6> 25) | (32 <4> 45) | (54 <3> 68) | (76 <5> 95) | (103 <2> 114) | 1.8E-67 | 1.8E-66 |
143 | (1 <1> 16) | (18 <6> 25) | (32 <4> 45) | (53 <3> 67) | (78 <5> 97) | (105 <2> 116) | 1.6E-61 | 1.6E-60 |
142 | (1 <1> 16) | (19 <6> 26) | (28 <4> 41) | (49 <3> 63) | (76 <5> 95) | (103 <2> 114) | 7.4E-47 | 7.4E-46 |
(a) The motifs (multilevel consensus sequences) represented in the sequences were as follows: 1, D(V/I)HKNITLGTG(F/Y)LNIT; 2, YW(Q/K)TS(Y/H)DS(Y/H)WNV; 3, FR(A/L)NNVSLNGTGKGL; 4, RDA(A/S)(N/D)A(K/Q)IVAQGT(I/V); 5, NLSH(K/N)(L/F)(D/S)G(E/T)IN(I/V)SGN(V/I)TINQ; and 6, A(A/G)SVAFEG. The number order of the motifs is that automatically generated by the MEME program and does not correspond to the position in the sequences. Motif alignment displays occurrences of the motifs (multilevel consensus sequences) within each sequence in the HMW2-like group. Numbers on the left and on the right identify amino acid positions where the motif begins and ends.
(b) The E value is the combined P value multiplied by the number of sequences in the HMW2-like dataset.
In the motif alignment, which displays occurrences of the motifs within each sequence of the original data set, every sequence matched all the motifs discovered within its group, except for the 143 and 91 HMW1-like sequences, which lacked one and two motifs, respectively (Table 3). The 91 HMW1-like binding domain sequence, obtained by PCR with primer set HMWA6 and HMWA11, was smaller than expected because the amplicon lacked about 400 nucleotides at the 3′ end of the sequence.
Regarding the HMW1-like binding domain sequence from strain 143, the MEME result agreed with the previously observed divergence of this sequence from the other HMW1-like sequences (see phylogenetic tree and amino acid distance matrix). Pairwise comparison of the motifs generated by the two rounds of MEME showed that distinct motifs located in similar positions in the sequences overlapped to different degrees (for example, motif 1 of the HMW1 group was closely related to motif 3 of the HMW2 group, whereas motif 3 of the HMW1 group was more divergent from the corresponding motif 2 of the HMW2 group) or did not overlap at all (for example, motif 1 of the HMW2 sequences, which was unique to the HMW2 group). Overall, this comparison confirmed the specificity of the two sets of motifs to either the HMW1 or the HMW2 group.
To obtain consensus motifs that were shared among all the HMW core-binding domain sequences, a third round of MEME/MOTIFSEARCH analysis was performed, using all the sequences as input. Once again, a set of conserved motifs (each motif ranging from 8 to 14 amino acids) was identified (Table 5). As shown in the motif alignment, all the sequences but three (HMW1 from strains 161, 143, and 91) matched all six motifs found (Table 5). Comparison of the motifs generated by this third round of MEME with those identified within the HMW1 or the HMW2 sequences highlighted that motifs specific for the HMW1-like or the HMW2-like group matched consensus motifs common to both but contained or lacked other contiguous amino acid residues.
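The relationship between a common motif and a longer group-specific motif can be checked with a simple position-by-position comparison. The sketch below is only an illustration under a stated assumption: two consensus positions are treated as compatible when they share at least one residue, which is cruder than the probabilistic comparison underlying MEME output. The function names are hypothetical; the consensus strings are those listed in the footnotes of Tables 3 to 5.

```python
def parse_consensus(consensus):
    """Split a multilevel consensus into a list of residue sets, one per position."""
    positions, i = [], 0
    while i < len(consensus):
        if consensus[i] == "(":
            close = consensus.index(")", i)
            positions.append(frozenset(consensus[i + 1:close].split("/")))
            i = close + 1
        else:
            positions.append(frozenset(consensus[i]))
            i += 1
    return positions


def embedded_at(common, specific):
    """Return 1-based offsets where every position of `common` is compatible with `specific`."""
    short, long_ = parse_consensus(common), parse_consensus(specific)
    hits = []
    for offset in range(len(long_) - len(short) + 1):
        if all(s & l for s, l in zip(short, long_[offset:])):
            hits.append(offset + 1)
    return hits


# Common motif 1 within HMW1-like motif 1: embedded at the first position.
print(embedded_at("FRFNNVSLNGTG", "FRFNNVSLNGTGSGLQF"))
# Common motif 5 within HMW1-like motif 5: flanked by extra residues on both sides.
print(embedded_at("IT(A/G)QGTIT", "(Q/A)VITGQG(T/V)ITSG"))
# Common motif 3 within HMW2-like motif 1: followed by extra residues.
print(embedded_at("D(V/I)HKNI(T/S)L", "D(V/I)HKNITLGTG(F/Y)LNIT"))
```

Under this strict criterion, common motif 1 is not embedded in HMW2-like motif 3, FR(A/L)NNVSLNGTGKGL, because the third position differs (F versus A/L), even though the two motifs are otherwise closely related.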
TABLE 5.
Motif patterns in the HMW1-like and HMW2-like core-binding domain sequences (a)
Strain no. (group) | Positions and <motif> | | | | | | Combined P value | E value (b) |
---|---|---|---|---|---|---|---|---|
188 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (75 <4> 88) | (107 <2> 115) | 7.4E-51 | 1.5E-49 |
56 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (75 <4> 88) | (107 <2> 115) | 7.4E-51 | 1.5E-49 |
12 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (75 <4> 88) | (107 <2> 115) | 7.4E-51 | 1.5E-49 |
142 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (75 <4> 88) | (107 <2> 115) | 7.4E-51 | 1.5E-49 |
157 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (75 <4> 88) | (107 <2> 115) | 7.4E-51 | 1.5E-49 |
91 (HMW2) | (1 <3> 8) | (16 <6> 23) | (38 <5> 45) | (53 <1> 64) | (79 <4> 92) | (110 <2> 118) | 1.6E-47 | 3.2E-46 |
143 (HMW2) | (1 <3> 8) | (17 <6> 24) | (39 <5> 46) | (53 <1> 64) | (82 <4> 95) | (113 <2> 121) | 2.5E-44 | 5.0E-43 |
56 (HMW2) | (1 <3> 8) | (16 <6> 23) | (37 <5> 44) | (52 <1> 63) | (78 <4> 91) | (109 <2> 117) | 3.0E-44 | 6.1E-43 |
12 (HMW2) | (1 <3> 8) | (16 <6> 23) | (37 <5> 44) | (52 <1> 63) | (78 <4> 91) | (109 <2> 117) | 3.0E-44 | 6.1E-43 |
188 (HMW2) | (1 <3> 8) | (16 <6> 23) | (37 <5> 44) | (52 <1> 63) | (72 <4> 85) | (109 <2> 117) | 3.0E-44 | 6.1E-43 |
157 (HMW2) | (1 <3> 8) | (17 <6> 24) | (39 <5> 46) | (54 <1> 65) | (80 <4> 93) | (111 <2> 119) | 4.8E-43 | 9.7E-42 |
161 (HMW2) | (1 <3> 8) | (16 <6> 23) | (37 <5> 44) | (52 <1> 63) | (78 <4> 91) | (109 <2> 117) | 5.0E-42 | 1.0E-40 |
72 (HMW2) | (1 <3> 8) | (17 <6> 24) | (39 <5> 46) | (54 <1> 65) | (80 <4> 93) | (111 <2> 119) | 6.7E-42 | 1.3E-40 |
72 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (77 <4> 90) | (108 <2> 116) | 2.6E-41 | 5.2E-40 |
142 (HMW2) | (1 <3> 8) | (18 <6> 25) | (35 <5> 42) | (49 <1> 60) | (80 <4> 93) | (111 <2> 119) | 3.0E-41 | 6.0E-40 |
152 (HMW2) | (1 <3> 8) | (17 <6> 24) | (39 <5> 46) | (54 <1> 65) | (80 <4> 93) | (111 <2> 119) | 5.4E-41 | 1.1E-39 |
161 (HMW1) | — | (11 <6> 18) | (32 <5> 39) | (46 <1> 57) | (80 <4> 93) | (111 <2> 119) | 2.8E-40 | 5.6E-39 |
143 (HMW1) | — | (11 <6> 18) | (32 <5> 39) | (47 <1> 58) | (73 <4> 86) | (104 <2> 112) | 1.5E-38 | 2.9E-37 |
152 (HMW1) | (1 <3> 8) | (18 <6> 25) | (32 <5> 39) | (46 <1> 57) | (77 <4> 90) | (109 <2> 117) | 3.0E-37 | 6.1E-36 |
91 (HMW1) | — | (11 <6> 18) | (33 <5> 40) | (47 <1> 58) | (81 <4> 94) | — | 6.6E-33 | 1.3E-31 |
(a) The motifs (multilevel consensus sequences) represented in the sequences were as follows: 1, FRFNNVSLNGTG; 2, (Y/H)WN(V/L)(T/S)SLN(V/L); 3, D(V/I)HKNI(T/S)L; 4, K(F/L)(D/E)G(T/E)(L/I)NISG(N/K)(V/I)(T/N); 5, IT(A/G)QGTIT; and 6, (A/T)(K/A/G)Q(S/D)(V/I)AFE. The number order of the motifs is that automatically generated by the MEME program and does not correspond to the position in the sequences. Motif alignment displays occurrences of the motifs (multilevel consensus sequences) within each sequence in both the HMW1-like and HMW2-like groups. Numbers on the left and on the right identify the amino acid positions where the motif begins and ends.
(b) The E value is the combined P value multiplied by the number of sequences in the HMW-like dataset.
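The E value definition given in the table footnotes can be verified with simple arithmetic. The short sketch below applies it to one row from each of Tables 3, 4, and 5, taking the number of sequences as the number of rows in each table (10, 10, and 20); the function name `e_value` is introduced here only for illustration.

```python
def e_value(combined_p, n_sequences):
    """E value as defined in the table footnotes: combined P value times dataset size."""
    return combined_p * n_sequences


# (label, combined P value, number of sequences in the dataset, reported E value)
rows = [
    ("Table 3, strain 12", 8.5e-67, 10, 8.5e-66),
    ("Table 4, strain 56", 1.1e-80, 10, 1.1e-79),
    ("Table 5, strain 188 (HMW1)", 7.4e-51, 20, 1.5e-49),
]

for label, combined_p, n_sequences, reported in rows:
    computed = e_value(combined_p, n_sequences)
    print(f"{label}: computed {computed:.1e}, reported {reported:.1e}")
```

For a few rows of Table 5 the product of the printed values differs in the last digit from the reported E value (for example, 3.0E-44 × 20 = 6.0E-43 versus a reported 6.1E-43), presumably because the combined P values shown are rounded.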
DISCUSSION
In this study, we focused our efforts on investigating the conservation and diversity of the hmw1A and hmw2A core-binding domain sequences among invasive NTHi isolates. The strains analyzed had been chosen among isolates previously found hmwA positive by PCR. A Southern hybridization assay with an hmwA-specific probe confirmed this result. Since the hmwA genes seemed to be present as duplicates in NTHi isolates, the first step was to separate the hmw1A gene from the hmw2A gene by PCR employing primers complementary to flanking genes. The results of these PCRs demonstrated that (i) both the hmw1A-like and the hmw2A-like genes were present in each isolate and (ii) the location on the chromosome of the two hmw loci was conserved among our isolates, in agreement with the finding previously reported by Buscher et al. (4). Moreover, on the basis of the lengths of the amplified fragments, complete hmw1A-like and hmw2A-like genes were very likely present.
In two isolates, the presence of hmwA genes did not correspond to the presence of detectable HMW proteins by Western blotting. The negative results might be attributed to the lack of cross-reactivity of the 28D antiserum with antigenically variant proteins. However, analysis of the promoter regions of the hmwA genes of one of these isolates (strain 72) seems to suggest that the expression of both HMW proteins was down-modulated. On the basis of this result, we looked at the promoter regions of the hmw1A-like and hmw2A-like genes from strain 143, the other nonreactive isolate, by PCR analysis followed by direct sequencing of amplicons. Since a high number of 7-bp repeats was found in both promoters (23 and 31 repeats in hmw1A and hmw2A, respectively), a low level of HMW expression may be supposed in this isolate (data not shown).
Sequence analysis of the hmw1A-like and hmw2A-like genes from strain 72 provided further insight into the diversity and conservation within the hmwA genes at the sequence level. In comparison with strain 12 sequences, both the hmw1A and hmw2A genes from strain 72 displayed high conservation at the 5′ end of the gene and at the 3′ end, while the middle portion exhibited considerable genetic diversity as a result of multiple point mutations, insertions, and deletions. These data confirmed that the middle portion of the hmwA genes diverges not only between the hmw1A and hmw2A genes from strain 12 but also among each hmw1A gene and among each hmw2A gene from different isolates (9).
We speculate that the 5′ regions of the hmwA genes are not subject to evolutionary selection for antigenic variation, since they include the coding sequences for the first 441 amino acids, which are cleaved prior to translocation across the outer membrane and are not surface exposed (15). Similarly, since the C terminus of HMW is required for anchoring mature proteins to the surface of the microorganism (15), the maintenance of conserved sequences at the 3′ end of the hmwA genes likely offers a selective advantage. The middle portion of the hmwA genes encodes most of the mature protein, including its N-terminal domain, which is surface exposed and subject to selective pressure to diversify serologically. It is therefore not surprising that this portion of the genes shows greater variability than the others. Several Haemophilus influenzae surface proteins exhibit sequence heterogeneity, since the presence of variable sequences allows bacteria to present a changing surface to an antibody-rich environment (6, 20, 35).
As found in NTHi strain 12, the sequences responsible for the HMW1 and HMW2 binding properties are localized near the N terminus of the mature proteins, within their variable regions (8). It may be supposed that, in spite of the variability of these regions, conservation of sequences critical for a particular binding specificity is necessary to ensure preservation of function. In this study, to investigate strain-to-strain conservation of the HMW1 and HMW2 binding domain sequences, we characterized the encoding hmw1A-like and hmw2A-like genes from several clinical isolates, assessing the sequence diversity of the different alleles. Amplicons encompassing the hmw1A-like and the hmw2A-like gene portions encoding the two core-binding domains were obtained and sequenced. To overcome the problem of sequence variability, for some isolates, consensus primers recognizing sequences located quite far from the core-binding domains were used in PCR amplification. Sequencing of the two core-binding domains required a battery of primers, even though their total length was as short as 400 bp.
Computer-supported sequence analysis was performed on both nucleotide and predicted amino acid sequences. Multiple sequence alignments and examination of matrices of pairwise similarities for nucleotide-nucleotide comparisons within the hmw1A-like and hmw2A-like core-binding domain sequences showed remarkable strain-to-strain heterogeneity: some isolates had hmw1A and hmw2A nucleotide sequences both identical to those of strain 12, others showed high polymorphism in both genes, and others shared the strain 12 sequence for either the hmw1A or the hmw2A gene only. This strain-to-strain heterogeneity is not surprising considering the genetic diversity of NTHi strains (23, 27), even when isolated from systemic sites (5).
When we looked at the average similarity among each group of sequences across the entire alignment, a complex pattern emerged, with regions of relatively high similarity alternating with strongly divergent regions. Overall, these observations suggested that intraspecies recombination events have played a role in the evolution of the hmwA genes. It is well known that H. influenzae, like other human mucosal pathogens, is a naturally transformable species (17). The ability to incorporate exogenous DNA facilitates horizontal gene exchange, particularly since multiple H. influenzae strains may colonize the nasopharynx. Several H. influenzae genes encoding virulence factors, such as hifA, hifE, and hap, exhibit intragenic mosaicism, an apparent mixing and matching of gene regions acquired from other H. influenzae strains, probably as a result of horizontal gene transfer through transformation and homologous recombination (6, 13, 18, 21).
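A pattern of similarity peaks alternating with divergent regions is typically visualized by averaging similarity over a sliding window along the alignment. The following minimal sketch illustrates that idea only; it assumes pairwise identity per column as the similarity score and a fixed window, which may differ from the scoring and window size of the similarity plots in this study. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def column_identity(column):
    """Fraction of sequence pairs with the same (non-gap) residue in one alignment column."""
    pairs = list(combinations(column, 2))
    return sum(a == b and a != "-" for a, b in pairs) / len(pairs)


def similarity_profile(alignment, window=3):
    """Average column identity over a sliding window along the alignment."""
    per_column = [column_identity(col) for col in zip(*alignment)]
    return [
        sum(per_column[i:i + window]) / window
        for i in range(len(per_column) - window + 1)
    ]


# Hypothetical toy alignment: a conserved block followed by a divergent block.
toy_alignment = [
    "AAAAAACDEFGH",
    "AAAAAAKLMNPQ",
    "AAAAAARSTVWY",
]

for position, score in enumerate(similarity_profile(toy_alignment), start=1):
    print(f"window starting at column {position}: {score:.2f}")
```

In the toy alignment, the profile stays at 1.00 over the conserved block, falls across the boundary, and reaches 0.00 in the divergent block, which is the kind of alternation described above.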
When the deduced amino acid sequences of the predicted HMW1-like and HMW2-like core-binding domains were analyzed, we first looked at the phylogenetic relationships among all the HMW sequences, irrespective of whether they belonged to the HMW1-like or the HMW2-like group. With a few exceptions, our results clearly showed clustering into HMW1-like and HMW2-like groups, consistent with the hmw1A-like or hmw2A-like gene encoding each sequence. Recently, Buscher et al. demonstrated that the chromosomal location of the hmw1A and hmw2A genes does not necessarily correlate with the binding phenotype of the associated adhesins, since, in some isolates, the hmw1A gene encodes a protein with HMW2 adherence properties and vice versa (4). In the present study, the actual adherence properties of the hmw1A and hmw2A gene products to human epithelial cells were not investigated, our main objective being to characterize hmwA alleles from several clinical NTHi isolates by assessing conservation and diversity at the sequence level. However, phylogenetic analysis confirmed the correlation between the hmw1A-like and hmw2A-like encoding genes and the HMW1 or HMW2 group specificity of the predicted core-binding domain amino acid sequences.
The only isolate in which neither HMW core-binding domain sequence clustered in the homologous group was strain 143. For this strain, it might be supposed that the hmw1A-like gene encodes a protein resembling an HMW2 binding phenotype and vice versa. However, based on Western blotting results, the expression of both HMW adhesins appeared down-modulated in strain 143, consistent with the presence of a high number of tandem repeats in the promoter region of both hmwA genes. Moreover, based on amino acid sequences, both the 143 core-binding domains appeared strongly divergent from all the others, suggesting that the two hmw loci had undergone extensive recombination events resulting in strongly variant hmwA genes, perhaps because they had not been subjected to selective pressure to preserve function.
Overall, phylogenetic analysis suggested that an hmw gene duplication event had occurred early in the evolution of NTHi, followed by sequence divergence, and that some conservation of sequences critical for a particular binding specificity existed. Indeed, the MEME search identified distinct sets of conserved motifs, each consisting of significant stretches of amino acids, within the HMW1-like and within the HMW2-like core-binding domain sequences. We speculate that the presence of a set of motifs specific for each core-binding domain may ensure preservation of the different HMW1 and HMW2 binding specificities among isolates. A previous study suggested that the HMW1 and the HMW2 binding domains may be conformational structures (8). Our data are in agreement with this suggestion, since stretches of amino acids that are discontinuous in sequence may be brought together by the tertiary structure to make up a binding domain. Interestingly, several motifs specific for the HMW1 or the HMW2 group matched a consensus motif common to both but contained or lacked short stretches of other contiguous amino acid residues. This configuration of motifs reflects the evolutionary history of the HMW1 and HMW2 proteins, which evolved from a common ancestor, and suggests that the two binding domains are the result of a balance between selection favoring sequence divergence and selection favoring preservation of function.
In conclusion, this study documents extensive sequence diversity of the HMW1-like and HMW2-like core-binding domains among the NTHi invasive isolates analyzed. In spite of diversity, identification of specific patterns of conserved amino acid motifs within each of the two domains improves our understanding of the relationship between protein sequence and binding specificity. Moreover, since the HMW adhesins have been considered as potential vaccine antigens in a multicomponent vaccine against NTHi, data on consistency and extent of conservation within the HMW binding domains at the sequence level are of great interest. Further studies are necessary to assess whether these regions are capable of eliciting cross-reactive antibodies.
Acknowledgments
This study was partially funded by the Ministero della Salute (Italy), Programma per la ricerca finalizzata 2003.
We are very grateful to S. J. Barenkamp for supplying strain 12 and the 28D rabbit polyclonal antiserum. We thank Tonino Sofia for editorial assistance.
Editor: V. J. DiRita
REFERENCES
- 1. Barenkamp, S. J., and F. F. Bodor. 1990. Development of serum bactericidal activity following nontypable Haemophilus influenzae acute otitis media. Pediatr. Infect. Dis. J. 9:333-339.
- 2. Barenkamp, S. J., and E. Leininger. 1992. Cloning, expression, and DNA sequence analysis of genes encoding nontypeable Haemophilus influenzae high-molecular-weight surface-exposed proteins related to filamentous hemagglutinin of Bordetella pertussis. Infect. Immun. 60:1302-1313.
- 3. Barenkamp, S. J., and J. W. St. Geme III. 1994. Genes encoding high-molecular-weight adhesion proteins of nontypeable Haemophilus influenzae are part of gene clusters. Infect. Immun. 62:3320-3328.
- 4. Buscher, A. Z., K. Burmeister, S. J. Barenkamp, and J. W. St. Geme III. 2004. Evolutionary and functional relationships among the nontypeable Haemophilus influenzae HMW family of adhesins. J. Bacteriol. 186:4209-4217.
- 5. Cerquetti, M., M. L. Ciofi degli Atti, G. Renna, A. E. Tozzi, M. L. Garlaschi, P. Mastrantonio, and the Hi Study Group. 2000. Characterization of non-type b Haemophilus influenzae strains isolated from patients with invasive disease. J. Clin. Microbiol. 38:4649-4652.
- 6. Clemans, D. L., C. F. Marrs, M. Patel, M. Duncan, and J. R. Gilsdorf. 1998. Comparative analysis of Haemophilus influenzae hifA (pilin) genes. Infect. Immun. 66:656-663.
- 7. Dawid, S., S. J. Barenkamp, and J. W. St. Geme III. 1999. Variation in expression of the Haemophilus influenzae HMW adhesins: a prokaryotic system reminiscent of eukaryotes. Proc. Natl. Acad. Sci. USA 96:1077-1082.
- 8. Dawid, S., S. Grass, and J. W. St. Geme III. 2001. Mapping of binding domains of nontypeable Haemophilus influenzae HMW1 and HMW2 adhesins. Infect. Immun. 69:307-314.
- 9. Ecevit, Z., K. W. McCrea, C. F. Marrs, and J. R. Gilsdorf. 2005. Identification of new hmwA alleles from nontypeable Haemophilus influenzae. Infect. Immun. 73:1221-1225.
- 10. Ecevit, I. Z., K. W. McCrea, M. M. Pettigrew, A. Sen, C. F. Marrs, and J. R. Gilsdorf. 2004. Prevalence of the hifBC, hmw1A, hmw2A, hmwC, and hia genes in Haemophilus influenzae isolates. J. Clin. Microbiol. 42:3065-3072.
- 11. Falla, T. J., S. R. Dobson, D. W. Crook, W. A. Kraak, W. W. Nichols, E. C. Anderson, J. Z. Jordens, M. P. Slack, D. Mayon-White, and E. R. Moxon. 1993. Population-based study of non-typable Haemophilus influenzae invasive disease in children and neonates. Lancet 341:851-854.
- 12. Furrer, M., P. Cottagnoud, and K. Mühlemann. 2000. Haemophilus influenzae infections among hospitalized adult patients. Infection 28:351-354.
- 13. Gilsdorf, J. R., C. F. Marrs, and B. Foxman. 2004. Haemophilus influenzae: genetic variability and natural selection to identify virulence factors. Infect. Immun. 72:2457-2461.
- 14. Grass, S., A. Z. Buscher, W. E. Swords, M. A. Apicella, S. J. Barenkamp, N. Ozchlewski, and J. W. St Geme III. 2003. The Haemophilus influenzae HMW1 adhesin is glycosylated in a process that requires HMW1C and phosphoglucomutase, an enzyme involved in lipooligosaccharide biosynthesis. Mol. Microbiol. 48:737-751.
- 15. Grass, S., and J. W. St. Geme III. 2000. Maturation and secretion of the non-typable Haemophilus influenzae HMW1 adhesin: roles of the N-terminal and C-terminal domains. Mol. Microbiol. 36:55-67.
- 16. Hultgren, S. J., S. Abraham, M. Caparon, P. Falk, J. W. St Geme III, and S. Normark. 1993. Pilus and nonpilus bacterial adhesins: assembly and function in cell recognition. Cell 73:887-901.
- 17. Kahn, M. E., and H. O. Smith. 1984. Transformation in Haemophilus: a problem in membrane biology. J. Membr. Biol. 81:89-103.
- 18. Kilian, M., K. Poulsen, and H. Lomholt. 2002. Evolution of the paralogous hap and iga genes in Haemophilus influenzae: evidence for a conserved hap pseudogene associated with microcolony formation in the recently diverged Haemophilus aegyptius and H. influenzae biogroup aegyptius. Mol. Microbiol. 46:1367-1380.
- 19. Krasan, G. P., D. Cutter, S. L. Block, and J. W. St. Geme III. 1999. Adhesin expression in matched nasopharyngeal and middle ear isolates of nontypeable Haemophilus influenzae from children with acute otitis media. Infect. Immun. 67:449-454.
- 20. Lomholt, H., L. van Alphen, and M. Kilian. 1993. Antigenic variation of immunoglobulin A1 proteases among sequential isolates of Haemophilus influenzae from healthy children and patients with chronic obstructive pulmonary disease. Infect. Immun. 61:4575-4581.
- 21. McCrea, K. W., J. M. St. Sauver, C. F. Marrs, D. Clemans, and J. R. Gilsdorf. 1998. Immunologic and structural relationships of the minor pilus subunits among Haemophilus influenzae isolates. Infect. Immun. 66:4788-4796.
- 22. O'Neill, J. M., J. W. St. Geme III, D. Cutter, E. E. Adderson, J. Anyanwu, R. F. Jacobs, and G. E. Schutze. 2003. Invasive disease due to nontypeable Haemophilus influenzae among children in Arkansas. J. Clin. Microbiol. 41:3064-3069.
- 23. Pettigrew, M. M., B. Foxman, Z. Ecevit, C. F. Marrs, and J. R. Gilsdorf. 2002. Use of pulsed-field gel electrophoresis, enterobacterial repetitive intergenic consensus typing, and automated ribotyping to assess genomic variability among strains of nontypeable Haemophilus influenzae. J. Clin. Microbiol. 40:660-662.
- 24. Rao, V. K., G. P. Krasan, D. R. Hendrixson, S. Dawid, and J. W. St. Geme III. 1999. Molecular determinants of the pathogenesis of disease due to non-typable Haemophilus influenzae. FEMS Microbiol. Rev. 23:99-129.
- 25. Rodriguez, C. A., V. Avadhanula, A. Buscher, A. L. Smith, J. W. St. Geme III, and E. E. Adderson. 2003. Prevalence and distribution of adhesins in invasive non-type b encapsulated Haemophilus influenzae. Infect. Immun. 71:1635-1642.
- 26. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
- 27. Smith-Vaughan, H. C., K. S. Sriprakash, A. J. Leach, J. D. Mathews, and D. J. Kemp. 1998. Low genetic diversity of Haemophilus influenzae type b compared to nonencapsulated H. influenzae in a population in which H. influenzae is highly endemic. Infect. Immun. 66:3403-3409.
- 28. St. Geme, III, J. W. 1994. The HMW1 adhesin of nontypeable Haemophilus influenzae recognizes sialylated glycoprotein receptors on cultured human epithelial cells. Infect. Immun. 62:3881-3889.
- 29. St. Geme, III, J. W., S. Falkow, and S. J. Barenkamp. 1993. High-molecular-weight proteins of nontypable Haemophilus influenzae mediate attachment to human epithelial cells. Proc. Natl. Acad. Sci. USA 90:2875-2879.
- 30. St. Geme, III, J. W., and S. Grass. 1998. Secretion of the Haemophilus influenzae HMW1 and HMW2 adhesins involves a periplasmic intermediate and requires the HMWB and HMWC proteins. Mol. Microbiol. 27:617-630.
- 31. St. Geme, III, J. W., V. V. Kumar, D. Cutter, and S. J. Barenkamp. 1998. Prevalence and distribution of the hmw and hia genes and the HMW and Hia adhesins among genetically diverse strains of nontypeable Haemophilus influenzae. Infect. Immun. 66:364-368.
- 32. Surana, N. K., S. Grass, G. G. Hardy, H. Li, D. G. Thanassi, and J. W. Geme III. 2004. Evidence for conservation of architecture and physical properties of Omp85-like proteins throughout evolution. Proc. Natl. Acad. Sci. USA 101:14497-14502.
- 33. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
- 34. Van Schilfgaarde, M., P. van Ulsen, P. Eijk, M. Brand, M. Stam, J. Kouame, L. van Alphen, and J. Dankert. 2000. Characterization of adherence of nontypeable Haemophilus influenzae to human epithelial cells. Infect. Immun. 68:4658-4665.
- 35. Yi, K., and T. F. Murphy. 1997. Importance of an immunodominant surface-exposed loop on outer membrane protein P2 of nontypeable Haemophilus influenzae. Infect. Immun. 65:150-155.