KEYWORDS: DNA sequencing, Enterobacteriaceae, clinical methods, dnaJ, genotypic identification
ABSTRACT
Enterobacteriaceae represent a diverse and medically important family of bacteria that are difficult to identify to the species level using the standard molecular method of 16S rRNA gene sequencing. Prior work has demonstrated the value of dnaJ gene sequence analysis in resolving different members of the family. However, existing protocols are not optimized for clinical use and exhibit several limitations in practice. Here, we describe an improved assay for dnaJ-based identification of Enterobacteriaceae which boasts increased broad-range specificity across genera, shorter amplicon sizes that are suitable for use with formalin-fixed or direct patient specimens, and enhanced amplification efficiency and assay sensitivity through the incorporation of locked nucleic acid chemistries. Sequence analysis of public databases indicates that the partial dnaJ sequence interrogated by this design retains high discriminatory power among Enterobacteriaceae genera and species, with only particular lineages of Shigella sp. and Escherichia coli proving unresolvable. Limits of detection studies using 8 disparate species indicated that amplification was consistently achievable across organisms and allowed robust dideoxynucleotide chain terminator sequencing from as little as 10 genome equivalents of template, depending on the species interrogated. Retrospective application of the dnaJ assay to patient specimens enabled unambiguous classification of Enterobacteriaceae to the species level in 22 of 27 (81.5%) positive specimens examined, with most remaining cases representing unresolvable calls between closely related Escherichia coli and Shigella species. We expect that this assay will facilitate the accurate molecular identification of species from the Enterobacteriaceae family in a variety of clinical specimens and diagnostic contexts.
INTRODUCTION
The family Enterobacteriaceae contains over 60 genera and more than 250 species, making it one of the most taxonomically diverse groups of bacteria currently described (1, 2). Organisms within this classification include lineages of high medical importance, including pathogenic bacteria capable of causing diverse infectious processes, such as sepsis, pneumonia, urinary tract infections, and intra-abdominal infections (3), as well as a host of foodborne illnesses (4). The treatment of diseases caused by these bacteria is complicated by the increasing prevalence of extended-spectrum beta-lactamases within the group and the emergence of carbapenemase-producing strains (3, 5, 6).
For the purposes of molecular diagnosis, it is difficult to distinguish among members of Enterobacteriaceae using conventional 16S rRNA gene sequencing (7) because the limited amount of sequence divergence at this locus does not allow for resolution of many species from one another (1, 8). Alternative molecular targets that provide greater discriminatory power have consequently been explored, most notably housekeeping genes tuf (9), atpD (9), and dnaJ (8). Encoding heat shock protein 40, dnaJ has proven particularly well-suited for the unambiguous identification of organisms within Enterobacteriaceae, showing demonstrably higher performance than 16S rRNA, tuf, or atpD (8). The dnaJ gene contains regions that are highly conserved across the Enterobacteriaceae family and are suitable targets for broad-specificity primer design. Analogously to 16S rRNA (7), such primers can be used to amplify and sequence taxonomically informative polymorphisms present in less evolutionarily constrained domains of the gene.
Current dnaJ assays for Enterobacteriaceae have proven useful for exploring the phylogenetic relationships of defined species and strains within the family (1, 8) and for the identification of cultured clinical isolates. However, the approach is suboptimal for the molecular diagnosis of Enterobacteriaceae from direct patient specimens because the PCR product generated by existing broad-range dnaJ primer designs (8) is relatively large (758 bp), making it challenging to robustly amplify fragmented DNA extracted from formalin-fixed paraffin-embedded materials (10, 11) or other clinical specimens containing degraded nucleic acids. Moreover, extant primer designs were based on the alignment of only three different Enterobacteriaceae species (8) and failed to amplify a subset of organisms that were empirically tested (8), leaving it unclear how generally well the pair is able to amplify other members of the family, especially in light of recent taxonomic additions (1, 2).
Here, we describe a novel broad-range primer set for the amplification and sequencing of a partial dnaJ fragment that is optimized for the species-level identification of Enterobacteriaceae from direct clinical specimens. The primers are designed for maximum identity across members of the current Enterobacteriaceae species tree (1, 2) to ensure broad-range amplification within the family. The targeted amplicon is short (∼385 bp) in order to facilitate amplification from fragmented DNA, yet it provides sufficient sequence information to enable reliable discrimination of separate species. Last, the incorporation of locked nucleic acid (LNA) chemistries significantly promotes primer annealing (12), facilitating robust amplification from limited numbers of template molecules.
MATERIALS AND METHODS
dnaJ sequence analysis.
The full computer code used to identify and curate dnaJ sequences for downstream analysis is available as a GitHub repository (https://github.com/crosenth/dnaj). Briefly, sequences of dnaJ genes from representative Enterobacteriaceae genera and species were obtained from GenBank (see Table S1 in the supplemental material) based on feature annotation labels. To confirm sequence orientation, evaluate sequence quality, and identify sequence misannotation events, an alignment profile and multiple alignment were built using MUSCLE (13) and HMMER (14) from dnaJ sequences that were deposited in the NCBI type material database (15). HMMER alignment bit scores were calculated for sequences in the alignment and were then sorted based on this metric. The alignment was visually inspected, and misannotated sequences, identified by relatively low HMMER bit scores, were discarded. The remaining sequences were then grouped by genus or species taxon, and a single representative record corresponding to a sequence spanning the greatest length of the dnaJ alignment and having the highest HMMER alignment bit score was retained. Of the currently described genera classified as Enterobacteriaceae (1, 2), Levinea, Saccharobacter, and Samsonia were necessarily excluded due to a lack of sequence availability. For representative records having incomplete dnaJ sequence data, we required that at least 85% of the primer-trimmed sequence alignment was present for inclusion in downstream analyses.
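The representative-selection step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in the authors' repository: the `Record` fields, the numeric bit score cutoff (the study discarded low-scoring records after visual inspection, not by a fixed threshold), and the tie-breaking order (coverage first, then bit score) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    accession: str           # hypothetical identifier
    taxon: str               # genus or species label used for grouping
    aligned_fraction: float  # share of the primer-trimmed alignment covered (0-1)
    bit_score: float         # HMMER alignment bit score


def pick_representatives(records, min_bit_score, min_fraction=0.85):
    """Keep one record per taxon: widest alignment coverage, then highest bit score.

    min_bit_score is an assumed stand-in for the manual removal of
    misannotated, low-scoring sequences described in the text.
    """
    best = {}
    for rec in records:
        if rec.bit_score < min_bit_score:
            continue  # treated as a likely misannotation
        if rec.aligned_fraction < min_fraction:
            continue  # less than 85% of the trimmed alignment present
        current = best.get(rec.taxon)
        key = (rec.aligned_fraction, rec.bit_score)
        if current is None or key > (current.aligned_fraction, current.bit_score):
            best[rec.taxon] = rec
    return best


if __name__ == "__main__":
    demo = [
        Record("A1", "Escherichia", 1.00, 500.0),
        Record("A2", "Escherichia", 0.90, 600.0),
        Record("B1", "Klebsiella", 0.80, 550.0),  # fails the coverage filter
        Record("C1", "Serratia", 0.95, 100.0),    # fails the bit score filter
    ]
    for taxon, rec in pick_representatives(demo, min_bit_score=200.0).items():
        print(taxon, rec.accession)
```

Ranking on coverage before bit score is only one reading of "spanning the greatest length ... and having the highest HMMER alignment bit score"; a combined score would be an equally defensible choice.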
Final multiple sequence alignments were performed using Clustal Omega (16). Neighbor-joining phylogenetic trees were constructed from alignments using the ClustalW2 package (17) and were used to confirm sequence taxonomic labels prior to downstream analysis. Pairwise distances between nucleotide sequences were calculated using MEGA4 (18). Logo plots of primers and dnaJ sequences were generated using WebLogo (19).
Oligonucleotides.
Primers described by Pham et al. (8) (DN1-1F, 5′-GATYTRCGHTAYAACATGGA-3′ and DN1-2R, 5′-TTCACRCCRTYDAAGAARC-3′) and those developed for this study (DJF, 5′-CNGGYGATYTGTAYGTWCAGGT-3′ and DJR, 5′-TCRTCRAARAAYTTYTTNACNC-3′) were ordered as salt-purified oligonucleotides from IDT (Coralville, IA). LNA-modified derivatives of DJF (5′-+CNG+GYG+ATYTGTAYGTWCAGGT-3′, plus sign precedes LNA bases) and DJR (5′-T+CRT+CRA+ARAAYTTYTTNACNC-3′) were obtained through Qiagen (Germantown, MD).
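To make the degree of degeneracy in these oligonucleotides concrete, the following sketch counts how many distinct sequences each primer encodes under the standard IUPAC nucleotide codes. The function name and the treatment of the `+` LNA marker (simply stripped before counting) are choices made for this illustration and are not part of the published protocol.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "DN1-1F": "GATYTRCGHTAYAACATGGA",
    "DN1-2R": "TTCACRCCRTYDAAGAARC",
    "DJF": "CNGGYGATYTGTAYGTWCAGGT",
    "DJR": "TCRTCRAARAAYTTYTTNACNC",
    "DJF-LNA": "+CNG+GYG+ATYTGTAYGTWCAGGT",
    "DJR-LNA": "T+CRT+CRA+ARAAYTTYTTNACNC",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    fold = 1
    for base in primer.replace("+", "").upper():  # '+' only marks an LNA base
        fold *= len(IUPAC[base])
    return fold


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq.replace("+", "")), degeneracy(seq))
```

Under this counting, DJF and DJR encode 64 and 512 sequence variants, respectively, compared with 24 and 48 for DN1-1F and DN1-2R.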
PCR amplification and sequence classification.
DNA was extracted from bacterial isolates and deidentified clinical specimens using the Qiacube ultraclean production (UCP) pathogen mini kit (Qiagen). The 16S rRNA gene testing of patient specimens was performed by V1 to V2 variable region sequencing as previously described (20). Next-generation 16S rRNA sequencing, also utilizing the V1 to V2 variable region, was performed as detailed elsewhere (21, 22). All dnaJ PCR amplifications were carried out using FailSafe PCR enzyme premix D (Lucigen, Madison, WI) for 34 cycles with an annealing temperature of 50°C, unless otherwise specified. Negative and extraction controls were included for each batch of specimens. PCR products were processed with ExoSAP-IT (Thermo Fisher, Waltham, MA) prior to dideoxynucleotide chain terminator sequencing using a BigDye Terminator v3.1 cycle sequencing kit (Thermo Fisher) and an ABI Prism 3730xl genetic analyzer (Thermo Fisher), according to manufacturer specifications.
Sequences of amplified dnaJ fragments were classified using BLAST (23) searches of the nonredundant NCBI sequence database (last accessed 5/14/2019). To assign a unique classification, we empirically required >97% coverage over the length of the sequence fragment with either (1) a primary match having >99% identity and derived from type material or having an associated peer-reviewed publication, with <98.05% identity to sequence records originating from alternative species and having the same credentials and at least 1.4% less sequence identity in the secondary match relative to the primary match; or (2) a primary sequence match having >97.9% identity and derived from type material or having an associated peer-reviewed publication, with <95.5% identity to alternative species having such credentials and at least 2.4% less sequence identity in the secondary match relative to the primary match. If multiple sequence matches from disparate species were identified above the primary identity thresholds, the classification included all possible species-level assignments.
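The two-tier decision rule above can be expressed compactly in code. The sketch below is one possible reading of the stated thresholds, not the authors' implementation: the function and parameter names are hypothetical, `secondary_identity` is taken to mean the best identity to a credentialed record from an alternative species, and the credential check (type material or peer-reviewed publication) is reduced to a boolean flag.

```python
def unique_call_allowed(coverage, primary_identity, secondary_identity,
                        primary_credentialed=True):
    """Return True when the thresholds in the text permit a single-species call.

    All arguments except the flag are percentages. secondary_identity is assumed
    to be the best match to an alternative species with equivalent credentials.
    """
    if coverage <= 97.0 or not primary_credentialed:
        return False
    gap = primary_identity - secondary_identity
    # Tier 1: very close primary match, modest separation required.
    tier1 = primary_identity > 99.0 and secondary_identity < 98.05 and gap >= 1.4
    # Tier 2: looser primary match, wider separation required.
    tier2 = primary_identity > 97.9 and secondary_identity < 95.5 and gap >= 2.4
    return tier1 or tier2


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements from the study.
    print(unique_call_allowed(100.0, 100.0, 98.0))   # tier 1 satisfied
    print(unique_call_allowed(100.0, 100.0, 99.7))   # alternative species too close
    print(unique_call_allowed(100.0, 97.94, 95.0))   # tier 2 satisfied
```

When the function returns False because several species exceed the primary identity thresholds, the text indicates that all candidate species are reported together instead of a single call.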
For limit of detection studies, the number of genome equivalents per reaction was determined from the measured quantity of genomic DNA using the formula:
genome equivalents = (mass of DNA [ng] × 6.022 × 10^23) / (genome length [bp] × 1 × 10^9 × 650),
which assumes 650 Da as the average weight of a base pair in double-stranded DNA.
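As a worked illustration of this conversion, the sketch below implements the standard mass-to-copy-number calculation using Avogadro's number and the 650 Da per base pair assumption stated above. The function names are hypothetical, and the 5,000,000-bp genome length is a round figure chosen for illustration, not a value taken from this study.

```python
AVOGADRO = 6.022e23   # molecules per mole
BP_MASS_DA = 650.0    # assumed average mass of one base pair (g/mol)


def genome_equivalents(mass_ng, genome_length_bp):
    """Convert a mass of double-stranded genomic DNA (ng) to genome copies."""
    grams = mass_ng * 1e-9
    grams_per_genome = genome_length_bp * BP_MASS_DA / AVOGADRO
    return grams / grams_per_genome


def mass_ng_for(copies, genome_length_bp):
    """Inverse conversion: mass in ng corresponding to a number of genome copies."""
    return copies * genome_length_bp * BP_MASS_DA / AVOGADRO * 1e9


if __name__ == "__main__":
    length = 5_000_000  # hypothetical genome size in bp
    for copies in (10, 100, 1000):
        print(copies, "genomes =", f"{mass_ng_for(copies, length):.2e}", "ng")
```

For a genome of this illustrative size, 10 genome equivalents correspond to roughly 54 fg of DNA, which shows why 10-fold serial dilutions are needed to reach the template quantities tested.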
All amplifications were performed at least in triplicate to confirm results.
Data availability.
Sequence data from this project are publicly available through GenBank (accession numbers MN066711 to MN066737 and MN307447 to MN307450).
RESULTS AND DISCUSSION
Targeted region selection and primer design.
We generated a multiple alignment of dnaJ sequences from the 59 Enterobacteriaceae genera available from public databases in order to identify regions of sequence conservation across the family that would be amenable to the design of broad-range primers and to explore the taxonomic informativity of various subsets of the gene to individually identify organisms. Within the full-sequence alignment of the gene (∼1.1 kb), we empirically identified conserved regions ∼400 bp apart, between which there was sufficient taxonomic diversity to discriminate between individual species. Primers targeting these conserved sites were manually designed to maximally overlap conserved nucleotides, to minimize mismatches across different genera through the inclusion of degenerate sites, to maximize primer melting temperature, and to avoid secondary structures and predicted primer-dimer formation.
Comparative analysis of our final primer designs and the original primers described by Pham et al. (8) against corresponding dnaJ sequences across Enterobacteriaceae genera was performed next (Fig. 1). This analysis revealed mismatches between the critical 3′ position of the original forward primer and matching genomic DNA of genera Leminorella, Budvicia, Pragia, and Arsenophonus and also showed 3′ mismatches between the reverse primer and cognate binding sites in genera Leminorella, Budvicia, Pragia, Phaseolibacter, Pectobacterium, and Buttiauxella (Fig. 1A). At least three of these genera have been implicated in human infection (24–27) and are consequently considered to be of medical importance. In contrast, the newly developed primer pair terminates over positions that are fully conserved across all Enterobacteriaceae genera examined (Fig. 1B). Consistent with prior reports (8), we conclude that the extant dnaJ primer set is unable to amplify a subset of Enterobacteriaceae genera, whereas the primers described here are expected to be more generally applicable across family members.
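The kind of primer-to-template comparison summarized in Fig. 1 can be sketched as follows. This is an illustrative check only: it assumes the binding site has already been extracted in the same 5′-to-3′ orientation and length as the primer, and both example binding sites are hypothetical strings constructed for the demonstration, not sequences from any genus named above.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer, site):
    """0-based positions where the site base is not permitted by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return [i for i, (p, s) in enumerate(zip(primer.upper(), site.upper()))
            if s not in IUPAC[p]]


def has_three_prime_mismatch(primer, site, window=1):
    """True if any mismatch falls within the last `window` bases of the primer."""
    cutoff = len(primer) - window
    return any(i >= cutoff for i in mismatch_positions(primer, site))


if __name__ == "__main__":
    dn1_1f = "GATYTRCGHTAYAACATGGA"          # forward primer from Pham et al.
    matching_site = "GATCTGCGTTACAACATGGA"   # hypothetical, fully compatible
    variant_site = "GATCTGCGTTACAACATGGC"    # hypothetical, 3'-terminal change
    print(mismatch_positions(dn1_1f, matching_site))
    print(has_three_prime_mismatch(dn1_1f, variant_site))
```

A mismatch at the 3′ terminus is singled out because polymerase extension begins there, so a single terminal mismatch can matter more than several internal ones.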
FIG 1.
Logo plot of dnaJ primer designs. Two dnaJ primer sets are depicted, namely, those previously described by Pham et al. (8) (A) and those described in the current study (B). Both primer sets are shown relative to a dnaJ consensus sequence of 59 different Enterobacteriaceae genera, with positions numbered according to the E. coli dnaJ sequence (GenBank accession number AE014075). The intervening sequences between primer binding sites are of variable length and are masked as “N.” The height of each character is proportional to the frequency with which it occurs in the sequence (19). All DNA sequences are listed 5′ to 3′.
Discriminatory power of the partial dnaJ target.
A phylogeny (Fig. 2A) and pairwise distance analysis (Fig. 2B) of representative members from each genus in Enterobacteriaceae confirmed that the primer-trimmed dnaJ sequence fragment was able to robustly discriminate among representatives from the 59 genera selected. There was an average (± standard deviations) of 87 ± 20.5 nucleotide substitutions that differentiated individual taxa. Of the 1,711 pairwise sequence comparisons performed, only 2 carried fewer than 35 nucleotide differences, namely, the closely related organisms Escherichia coli and Shigella sonnei (11 differences) and Obesumbacterium proteus and Hafnia alvei (14 differences).
FIG 2.
Phylogenies and pairwise differences among Enterobacteriaceae taxa based on selected dnaJ sequence fragment analysis. Neighbor-joining phylogenetic trees and violin plots of pairwise distance matrices for 59 different Enterobacteriaceae genera (A and B) and 107 different Enterobacteriaceae species (C and D) based on the 391 bp dnaJ fragment identified in this study. Scale bars in panels A and C indicate changes per site; pairwise distances are expressed in absolute number of nucleotide differences. Dashed lines in violin plots (B and D) indicate the median and dotted lines demarcate quartiles.
To explore taxonomic informativity of the dnaJ sequence fragment at the species level, we next repeated the analysis using 1 representative from each of the 107 species with publicly available dnaJ sequence data that spanned the full length or nearly the full length of our amplicon (Fig. 2C and D). On average (± standard deviations), 46.5 ± 10.3 variants were distinguished among species across the 5,671 pairwise comparisons that were performed. However, this more comprehensive analysis revealed difficulty in resolving some representatives of the closely related Escherichia and Shigella species (28). Pairwise comparisons restricted to a phylogenetic subgroup composed of E. coli, Escherichia fergusonii, Shigella boydii, Shigella sonnei, and Shigella flexneri evidenced an average (± standard deviations) of 2.7 ± 1.2 pairwise differences, with S. boydii and E. coli being indistinguishable over the interrogated region. Outside of that subgroup, only two pairwise comparisons (Lelliottia jeotgali versus Lelliottia aquatilis and Lelliottia nimipressuralis versus Enterobacter roggenkampii) were distinguished by two sequence polymorphisms each, and all other taxa were differentiated from each other by at least 5 variant sites.
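The pairwise comparison counts quoted in this section follow from the number of unordered pairs, n(n − 1)/2, and the per-pair differences are simple counts of differing alignment columns. The sketch below illustrates both; the three toy sequences are hypothetical, and ignoring gap-containing columns within each pair is an assumption for the example that may not match the MEGA4 settings used in the study.

```python
from itertools import combinations
from math import comb


def pairwise_differences(aligned):
    """Count differing columns for every pair of equal-length aligned sequences.

    Columns where either sequence has a gap ('-') are skipped for that pair,
    which is an assumed convention for this illustration.
    """
    result = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        if len(seq_a) != len(seq_b):
            raise ValueError("sequences must come from the same alignment")
        result[(name_a, name_b)] = sum(
            1 for x, y in zip(seq_a, seq_b)
            if x != y and x != "-" and y != "-"
        )
    return result


if __name__ == "__main__":
    toy = {  # hypothetical aligned fragments, not study data
        "taxon1": "ACGTACGT",
        "taxon2": "ACGTACGA",
        "taxon3": "AC-TTCGA",
    }
    for pair, n_diff in pairwise_differences(toy).items():
        print(pair, n_diff)
    # Number of unordered pairs for the genus-level and species-level analyses.
    print(comb(59, 2), comb(107, 2))
```

With 59 genera and 107 species, the pair counts are 1,711 and 5,671, matching the numbers of comparisons reported in the text.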
LNA modifications improve amplification efficiency and increase primer annealing temperatures.
In practice, PCR amplification using our newly designed primer set showed reduced amplification efficiency relative to the dnaJ primers previously described by Pham et al. (8), despite having a markedly reduced product size (∼385 bp versus 758 bp, respectively) (Fig. 3). This result suggests inefficient primer annealing, likely due to the high number of degenerate sites required by our primer designs. Therefore, in order to better promote primer binding and improve amplification efficiency, we introduced locked nucleic acid modifications near the 5′ end of each oligonucleotide (12). The inclusion of these substitutions markedly increased the semiquantitative amplification efficiency and resulted in greater yields of the product relative to both the unmodified primer pair and the original dnaJ primer design (Fig. 3). An annealing temperature of 50°C was selected to maximize amplification specificity and sensitivity.
FIG 3.

Amplification of dnaJ using standard and locked nucleic acid modified primers. Gel electrophoresis of PCR products generated using dnaJ primer sets described by Pham et al. (8) (“original”), and those described in the current study, with and without LNA-modified bases (“new+LNA” and “new,” respectively). PCR was performed using different annealing temperatures (specified at top) using E. coli DNA as a template, and equivalent volumes of products were resolved on the gel. Key sizing bands on the 100-bp ladder (outermost lanes) are labeled at right.
Limit of detection analysis.
We next ascertained the limits of detection achievable with our primers by performing dnaJ amplification and sequencing using 10-fold serial dilutions of the bacterial genomic DNA template. To assess performance across medically important Enterobacteriaceae, these studies were performed using a combination of 10 different type strains and clinical isolates that spanned 8 species from within that taxon (Table 1). Amplification robust enough to allow identification of the organism by dideoxynucleotide sequencing was achieved from 10 to 100 genome equivalents of DNA, depending on the organism used as the template, although visible amplification products were sometimes observed at lower template concentrations. We conclude that the dnaJ primer set is capable of identifying organisms across Enterobacteriaceae using 100 or fewer genomes of template, and as few as 10 genomes, with reproducible differences in limits of detection reflecting the specific strain interrogated.
TABLE 1.
Limits of detection for different Enterobacteriaceae species
| Organism | Limit of detection (genome equivalents) |
|---|---|
| Escherichia coli (ATCC 25922) | 10 |
| Escherichia coli (ATCC 35218) | 10 |
| Klebsiella pneumoniae (ATCC 700603) | 10 |
| Enterobacter cloacae (ATCC BAA1143) | 10 |
| Klebsiella aerogenes (clinical isolate 1) | 10 |
| Klebsiella aerogenes (clinical isolate 2) | 10 |
| Enterobacter cloacae complex (clinical isolate) | 100 |
| Shigella flexneri (clinical isolate) | 100ᵃ |
| Citrobacter freundii complex (clinical isolate) | 10 |
| Klebsiella variicola (clinical isolate) | 100 |
ᵃ Sequencing results were equally consistent with S. flexneri or E. coli.
Impact of interfering DNA templates.
The presence of human DNA or that from other bacteria in a testing matrix may adversely affect the performance of molecular assays. We first determined the impact of host material on assay performance by repeating limit of detection studies with one representative organism (Enterobacter cloacae ATCC BAA1143) in the presence of abundant tissue-extracted human DNA (5 ng per reaction). Equivalent limits of detection for the dnaJ assay were achieved in the presence or absence of the human template (Table 1), indicating that human DNA did not adversely affect amplification even when its mass exceeded that of bacterial genomes by more than 5 orders of magnitude. Similarly, dnaJ amplification using 100 ng of human genomic DNA yielded no visible product, which is consistent with the primers having negligible affinity for human sequences.
In order to determine the effects that more complex specimen matrices had on Enterobacteriaceae detection, we next tested four patient samples known to contain Enterobacteriaceae at various relative abundances in concert with other bacterial species, as ascertained by clinical next-generation 16S rRNA sequencing (21, 22) (Table 2). Despite the presence of potentially confounding host and microbial DNA, we found that dnaJ amplified and sequenced cleanly in these specimens, enabling the classification of Enterobacteriaceae to the species level even when they comprised as little as 1% of the total bacterial burden.
TABLE 2.
Results of dnaJ sequencing for polymicrobial clinical specimens
| Specimen description | Enterobacteriaceae taxon reported by clinical 16S rRNA deep sequencing | Estimated relative abundance (%) of Enterobacteriaceae | No. of non-Enterobacteriaceae species reported | GenBank accession no. for targeted dnaJ sequence | Classification by targeted dnaJ sequencing | GenBank accession no. of a representative of the closest dnaJ sequence match (coordinates [bp]) | % Identity to closest GenBank dnaJ sequence match |
|---|---|---|---|---|---|---|---|
| Bronchoalveolar lavage | Enterobacteriaceae | 93 | 2 | MN307447 | Escherichia coli | AP009378 (14788–15146) | 100 |
| Skin tissue | Citrobacter freundii complex | 33.2 | 1 | MN307448 | Citrobacter braakii | JQ762617 (339–688) | 99.71 |
| Leg tissue | Enterobacteriaceae | 7.6 | 8 | MN307449 | Escherichia coli | CP000247 (14970–15304) | 100 |
| Bronchoalveolar lavage | Klebsiella pneumoniae complex | 1.1 | 6 | MN307450 | Klebsiella variicola | CP010523 (4685857–4686116) | 99.6 |
We conclude that the dnaJ primer set is generally robust, even in the presence of templates derived from other bacteria or the human host.
Application of targeted dnaJ sequencing to clinical specimens.
Last, in order to ascertain the utility of partial dnaJ sequencing as a diagnostic approach for identifying Enterobacteriaceae in clinical practice, we retrospectively examined a cohort of 27 patient tissue specimens (Table 3) that were previously submitted for bacterial identification by 16S rRNA gene sequencing (variable regions V1 to V2). These specimens represented a convenience sample of material where 16S rRNA gene testing identified bacteria of the family Enterobacteriaceae, without applying any other selection criteria. For 20 of the 27 specimens, 16S rRNA testing was sufficient to yield classifications only to the family level. Three other classifications were successfully made to the genus level, while one was identifiable as a species complex and the remaining three were identifiable as specific species. We included additional specimens containing DNA from non-Enterobacteriaceae bacterial species to serve as amplification controls (8 specimens).
TABLE 3.
Results of dnaJ sequencing of direct clinical specimens
| Specimen description | Reported clinical result by 16S rRNA sequencing | GenBank accession no. for targeted dnaJ sequence | Classification by targeted dnaJ sequencing | GenBank accession no. of a representative of the closest dnaJ sequence match (coordinates [bp]) | % Identity to closest GenBank dnaJ sequence match |
|---|---|---|---|---|---|
| Cerebrospinal fluid | Klebsiella oxytoca | MN066711 | Klebsiella oxytoca | CP017928 (5386543–5386904) | 99.72 |
| Leg tissue | Enterobacteriaceae | MN066712 | Escherichia coli or Shigella species | AP019189 (4018530–4018888) and CP000036 (16044–16402) | 99.73 |
| Liver abscess | Klebsiella species | MN066713 | Klebsiella pneumoniae | CP033777 (5251674–5252033) | 100 |
| Pancreatic pseudocyst fluid | Enterobacteriaceae | MN066714 | Escherichia coli | CP033092 (3995387–3995748) | 100 |
| Pancreatic pseudocyst fluid | Enterobacteriaceae | MN066715 | Escherichia coli | AE014075 (15689–16050) | 100 |
| Pancreatic pseudocyst fluid | Enterobacteriaceae | MN066716 | Escherichia coli | AE014075 (15689–16050) | 100 |
| Ventriculoperitoneal shunt | Klebsiella species | MN066717 | Klebsiella pneumoniae | CP002910 (532186–532545) | 100 |
| Tissue (not specified) | Enterobacteriaceae | MN066718 | Escherichia coli | AP009378 (14788–15146) | 100 |
| Gallbladder fluid | Enterobacteriaceae | MN066719 | Escherichia coli | AE014075 (15689–16050) | 99.72 |
| Abdominal drain fluid | Klebsiella species | MN066720 | Klebsiella variicola | CP010523 (4685750–4686108) | 99.72 |
| Knee tissue | Enterobacteriaceae | MN066721 | Escherichia coli | AE014075 (15689–16050) | 100 |
| Liver biopsy | Enterobacteriaceae | MN066722 | Enterobacter ludwigii | CP006580 (4532648–4533009) | 99.45 |
| Submuscular wound swab | Enterobacteriaceae | MN066723 | Escherichia coli | AE014075 (15689–16050) | 97.94 |
| Supramuscular wound swab | Enterobacteriaceae | MN066724 | Escherichia coli | AE014075 (15689–16050) | 99.72 |
| Leg tissue | Enterobacteriaceae | MN066725 | Escherichia coli | CP021202 (1431563–1431921) | 100 |
| Chest sternal wound | Klebsiella pneumoniae complex | MN066726 | Klebsiella variicola | CP000964 (4792363–4792721) | 99.45 |
| Pleural fluid | Enterobacteriaceae | MN066727 | Escherichia coli | CP033092 (3995387–3995748) | 100 |
| Femur tissue | Enterobacteriaceae | MN066728 | Enterobacter hormaechei or Enterobacter cloacae | CP010384 (710440–710807) and CP033466 (2556488–2556855) | 99.46 |
| Perihepatic aspirate, abdominal fluid | Enterobacteriaceae | MN066729 | Escherichia coli, Shigella boydii, or Shigella flexneri | CP024223 (3091690–3092048) | 100 |
| Eye | Serratia marcescens | MN066730 | Serratia marcescens | CP018917 (14640–15007) | 98.37 |
| Calf tissue | Enterobacteriaceae | MN066731 | Enterobacter hormaechei | CP010384 (710440–710801) | 100 |
| Hip joint | Enterobacteriaceae | MN066732 | Escherichia coli or Shigella sonnei | AP009240 (14905–15263) | 100 |
| Abscess fluid | Serratia marcescens | MN066733 | Serratia marcescens | CP018917 (14640–15007) | 99.19 |
| Leg tissue | Enterobacteriaceae | MN066734 | Enterobacter cancerogenus | MG706101 (332–683) | 98.3 |
| Lumbar disc tissue | Enterobacteriaceae | MN066735 | Escherichia coli | AE014075 (15689–16050) | 99.72 |
| Testicle tissue (paraffin embedded) | Enterobacteriaceae | MN066736 | Escherichia coli | CP033092 (3995387–3995748) | 100 |
| Not specified | Enterobacteriaceae | MN066737 | Escherichia coli or Shigella species | CP007265 (1319740–1319382) | 100 |
| Pleural tissue | Flavobacteriaceae | Amplification negative | NAᵃ | NA | NA |
| Abscess drainage fluid | Staphylococcus aureus | Amplification negative | NA | NA | NA |
| Epidural tissue | Staphylococcus aureus | Amplification negative | NA | NA | NA |
| Vitreous fluid | Staphylococcus epidermidis | Amplification negative | NA | NA | NA |
| Tissue swab (not specified) | Streptococcus mitis group | Amplification negative | NA | NA | NA |
| Knee tissue | Clostridium perfringens | Amplification negative | NA | NA | NA |
| Chest wall abscess | Haemophilus influenzae | Amplification negative | NA | NA | NA |
| Urine | Lactobacillus crispatus | Amplification negative | NA | NA | NA |
ᵃ NA, not applicable.
Patient specimens lacking a provisional diagnosis of Enterobacteriaceae failed to amplify with the dnaJ primer set, demonstrating the specificity of the primer pair for templates within the intended family. Sequencing of the dnaJ target generally resulted in more specific taxonomic classifications than could be achieved by 16S rRNA analysis. All identifications originally assigned at the species level by 16S rRNA sequencing were recapitulated unambiguously by focused dnaJ sequencing. Moreover, the dnaJ approach achieved unambiguous species-level identification of 15 of the 20 specimens that could be assigned only to the Enterobacteriaceae family by 16S rRNA analysis. The identity of the organism in four of the other cases could not be robustly resolved between E. coli and closely related Shigella species, and in the remaining case, there was ambiguity in resolving between two Enterobacter species. In 12 instances, E. coli was diagnosed at the species level based on high-identity sequence matches that had significantly less identity (at least 1.4%) to alternative species. Species-level classifications were also achieved for each of the four specimens originally classified at the level of genus or complex. Overall, the assay achieved species-level classifications in 22 of the 27 specimens considered in this analysis (81.5%).
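The overall figure can be reproduced from the category counts given in this paragraph. The short tally below is only a bookkeeping aid; the category labels are hypothetical names for the groupings described in the text, and the counts are those reported above.

```python
# (specimens in group, specimens resolved to a single species by dnaJ)
groups = {
    "family_level_by_16S": (20, 15),
    "genus_or_complex_by_16S": (4, 4),
    "species_level_by_16S": (3, 3),
}

total = sum(n for n, _ in groups.values())
resolved = sum(r for _, r in groups.values())
percent = round(100 * resolved / total, 1)

if __name__ == "__main__":
    print(f"{resolved} of {total} specimens ({percent}%) resolved to species")
```

The five unresolved specimens all fall in the family-level group: four ambiguous between E. coli and Shigella species and one between two Enterobacter species.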
Summary and conclusions.
The sensitive and accurate molecular identification of Enterobacteriaceae from clinical specimens can be impactful in informing clinical care, especially given the different intrinsic antimicrobial susceptibilities known for species within that group (29). Here, we have described the development and validation of a broad-range dnaJ primer set suitable for the robust amplification of Enterobacteriaceae DNA directly from patient specimens. The inclusion of LNA modifications into the primer pair increases amplification efficiency to the point that detectable product can be recovered from trace amounts of starting material, namely, as little as 10 genome equivalents.
The dnaJ sequence fragment targeted is small enough to be amplifiable from highly fragmented template DNA but retains enough information on sequence polymorphisms to discriminate among most species within the family. In practice, we found that unambiguous, species-level classifications could be achieved in the vast majority of cases examined (Tables 2 and 3). Of the 107 Enterobacteriaceae species we surveyed by in silico analysis (Fig. 2), the only two that could not be readily differentiated were representatives from S. boydii and E. coli. However, given the phylogenomic derivation of Shigella from E. coli (28), a relative paucity of dnaJ polymorphisms that distinguish among some lineages within those species is not unexpected. Positively, we note that only a fraction of Shigella and E. coli strains are expected to be problematic by this approach (28) and also that other Shigella and Escherichia species included in our analysis could still be readily discriminated from one another by targeted dnaJ sequencing.
The dnaJ assay described in this work provides a powerful approach for the molecular diagnosis of Enterobacteriaceae infections in clinical practice and should prove to be of benefit in a variety of contexts. However, there are caveats with the current study. It should be noted that Enterobacteriaceae species are sometimes encountered as environmental contaminants and contaminants of biotechnology reagents (30), and the appropriate use of negative and extraction controls is warranted to assess for possible false-positive results. As the Enterobacteriaceae family is large, it was not tractable in this study to evaluate the performance of the assay for each species, nor was it possible to comprehensively evaluate the specificity of the primer set across the bacterial tree of life. However, in silico analysis and empirical performance in the limited use cases tested in this study provide encouraging support for the assay being generally applicable in clinical use.
Footnotes
Supplemental material for this article may be found at https://doi.org/10.1128/JCM.00986-19.
REFERENCES
- 1. Adeolu M, Alnajar S, Naushad S, Gupta RS. 2016. Genome-based phylogeny and taxonomy of the “Enterobacteriales”: proposal for Enterobacterales ord. nov. divided into the families Enterobacteriaceae, Erwiniaceae fam. nov., Pectobacteriaceae fam. nov., Yersiniaceae fam. nov., Hafniaceae fam. nov., Morganellaceae fam. nov., and Budviciaceae fam. nov. Int J Syst Evol Microbiol 66:5575–5599. doi: 10.1099/ijsem.0.001485.
- 2. Munson E, Carroll KC. 2019. An update on the novel genera and species and revised taxonomic status of bacterial organisms described in 2016 and 2017. J Clin Microbiol 57:e01181-18. doi: 10.1128/JCM.01181-18.
- 3. Sheu C-C, Chang Y-T, Lin S-Y, Chen Y-H, Hsueh P-R. 2019. Infections caused by carbapenem-resistant Enterobacteriaceae: an update on therapeutic options. Front Microbiol 10:80. doi: 10.3389/fmicb.2019.00080.
- 4. Baylis C, Uyttendaele M, Joosten H, Davies A. 2011. The Enterobacteriaceae and their significance to the food industry. ILSI Europe Emerging Microbiological Issues Task Force, Brussels, Belgium.
- 5. Hoban DJ, Lascols C, Nicolle LE, Badal R, Bouchillon S, Hackel M, Hawser S. 2012. Antimicrobial susceptibility of Enterobacteriaceae, including molecular characterization of extended-spectrum beta-lactamase-producing species, in urinary tract isolates from hospitalized patients in North America and Europe: results from the SMART study 2009–2010. Diagn Microbiol Infect Dis 74:62–67. doi: 10.1016/j.diagmicrobio.2012.05.024.
- 6. Stadler T, Meinel D, Aguilar-Bultet L, Huisman JS, Schindler R, Egli A, Seth-Smith HMB, Eichenberger L, Brodmann P, Hübner P, Bagutti C, Tschudin-Sutter S. 2018. Transmission of ESBL-producing Enterobacteriaceae and their mobile genetic elements-identification of sources by whole genome sequencing: study protocol for an observational study in Switzerland. BMJ Open 8:e021823. doi: 10.1136/bmjopen-2018-021823.
- 7. Janda JM, Abbott SL. 2007. 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: pluses, perils, and pitfalls. J Clin Microbiol 45:2761–2764. doi: 10.1128/JCM.01228-07.
- 8. Pham HN, Ohkusu K, Mishima N, Noda M, Monir Shah M, Sun X, Hayashi M, Ezaki T. 2007. Phylogeny and species identification of the family Enterobacteriaceae based on dnaJ sequences. Diagn Microbiol Infect Dis 58:153–161. doi: 10.1016/j.diagmicrobio.2006.12.019.
- 9. Paradis S, Boissinot M, Paquette N, Bélanger SD, Martel EA, Boudreau DK, Picard FJ, Ouellette M, Roy PH, Bergeron MG. 2005. Phylogeny of the Enterobacteriaceae based on genes encoding elongation factor Tu and F-ATPase beta-subunit. Int J Syst Evol Microbiol 55:2013–2025. doi: 10.1099/ijs.0.63539-0.
- 10. Lu XJD, Liu KYP, Zhu YS, Cui C, Poh CF. 2018. Using ddPCR to assess the DNA yield of FFPE samples. Biomol Detect Quantif 16:5–11. doi: 10.1016/j.bdq.2018.10.001.
- 11. Amemiya K, Hirotsu Y, Oyama T, Omata M. 2019. Relationship between formalin reagent and success rate of targeted sequencing analysis using formalin fixed paraffin embedded tissues. Clin Chim Acta 488:129–134. doi: 10.1016/j.cca.2018.11.002.
- 12. Levin JD, Fiala D, Samala MF, Kahn JD, Peterson RJ. 2006. Position-dependent effects of locked nucleic acid (LNA) on DNA sequencing and PCR primers. Nucleic Acids Res 34:e142. doi: 10.1093/nar/gkl756.
- 13. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. doi: 10.1093/nar/gkh340.
- 14. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol 7:e1002195. doi: 10.1371/journal.pcbi.1002195.
- 15. Federhen S. 2015. Type material in the NCBI taxonomy database. Nucleic Acids Res 43:D1086–1098. doi: 10.1093/nar/gku1127.
- 16. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG. 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539. doi: 10.1038/msb.2011.75.
- 17. Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680. doi: 10.1093/nar/22.22.4673.
- 18. Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 24:1596–1599. doi: 10.1093/molbev/msm092.
- 19. Crooks GE, Hon G, Chandonia J-M, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188–1190. doi: 10.1101/gr.849004.
- 20. Lee SA, Plett SK, Luetkemeyer AF, Borgo GM, Ohliger MA, Conrad MB, Cookson BT, Sengupta DJ, Koehler JE. 2015. Bartonella quintana aortitis in a man with AIDS, diagnosed by needle biopsy and 16S rRNA gene amplification. J Clin Microbiol 53:2773–2776. doi: 10.1128/JCM.02888-14.
- 21. Salipante SJ, Kawashima T, Rosenthal C, Hoogestraat DR, Cummings LA, Sengupta DJ, Harkins TT, Cookson BT, Hoffman NG. 2014. Performance comparison of Illumina and ion torrent next-generation sequencing platforms for 16S rRNA-based bacterial community profiling. Appl Environ Microbiol 80:7583–7591. doi: 10.1128/AEM.02206-14.
- 22. Cummings LA, Kurosawa K, Hoogestraat DR, SenGupta DJ, Candra F, Doyle M, Thielges S, Land TA, Rosenthal CA, Hoffman NG, Salipante SJ, Cookson BT. 2016. Clinical next generation sequencing outperforms standard microbiological culture for characterizing polymicrobial samples. Clin Chem 62:1465–1473. doi: 10.1373/clinchem.2016.258806.
- 23. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 24. Blekher L, Siegman-Igra Y, Schwartz D, Berger SA, Carmeli Y. 2000. Clinical significance and antibiotic resistance patterns of Leminorella spp., an emerging nosocomial pathogen. J Clin Microbiol 38:3036–3038.
- 25. Smuszkiewicz P, Tomczak H. 2014. A rarely isolated micro-organism, Budvicia aquatica, cultured from urine of a patient with Guillain–Barré syndrome. JMM Case Rep doi: 10.1099/jmmcr.0.001503.
- 26. Corbin A, Delatte C, Besson S, Guidry A, Hoffmann AH, Monier P, Nathaniel R. 2007. Budvicia aquatica sepsis in an immunocompromised patient following exposure to the aftermath of Hurricane Katrina. J Med Microbiol 56:1124–1125. doi: 10.1099/jmm.0.47139-0.
- 27. De Baere T, Wauters G, Kämpfer P, Labit C, Claeys G, Verschraegen G, Vaneechoutte M. 2002. Isolation of Buttiauxella gaviniae from a spinal cord patient with urinary bladder pathology. J Clin Microbiol 40:3867–3870. doi: 10.1128/jcm.40.10.3867-3870.2002.
- 28. Sahl JW, Morris CR, Emberger J, Fraser CM, Ochieng JB, Juma J, Fields B, Breiman RF, Gilmour M, Nataro JP, Rasko DA. 2015. Defining the phylogenomics of Shigella species: a pathway to diagnostics. J Clin Microbiol 53:951–960. doi: 10.1128/JCM.03527-14.
- 29. Osterblad M, Pensala O, Peterzéns M, Helenius H, Huovinen P. 1999. Antimicrobial susceptibility of Enterobacteriaceae isolated from vegetables. J Antimicrob Chemother 43:503–509. doi: 10.1093/jac/43.4.503.
- 30. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, Parkhill J, Loman NJ, Walker AW. 2014. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol 12:87. doi: 10.1186/s12915-014-0087-z.
Data Availability Statement
Sequence data from this project are publicly available through GenBank (accession numbers MN066711 to MN066737 and MN307447 to MN307450).


