Abstract
We compared and analyzed 16S rRNA and tuf gene sequences for 97 clinical isolates of coagulase-negative staphylococci (CNS) by use of the GenBank, MicroSeq, EzTaxon, and BIBI databases. Discordant results for definitive identification were observed and differed according to the different databases and target genes. Although higher percentages of sequence identity were obtained with GenBank and MicroSeq for 16S rRNA analysis, the BIBI and EzTaxon databases produced less ambiguous results. Greater discriminatory power and fewer multiple probable identifications were observed with tuf gene analysis than with 16S rRNA analysis. The most pertinent results for tuf gene analysis were obtained with the GenBank database when the cutoff values for the percentage of identity were adjusted to be greater than or equal to 98.0%, with >0.8% separation between species. Analysis of the tuf gene proved to be more discriminative for certain CNS species; further, this method exhibited better distinction in the identification of CNS clinical isolates.
INTRODUCTION
Coagulase-negative staphylococci (CNS) are normal inhabitants of human skin and mucous membranes and have been regarded previously as culture contaminants (31). However, CNS have emerged as significant pathogens (26), especially in immunocompromised patients (3), in premature neonates in intensive-care units (8, 16), and in patients who have undergone complex medical procedures involving the implantation of prosthetic or cardiac devices or indwelling catheters (25, 26). One of the frequently isolated CNS species, Staphylococcus epidermidis, is the most commonly isolated etiological agent of nosocomial infections (21).
Because of the increase in the clinical significance of CNS, there is a need for a more accurate and sensitive method to identify CNS species in clinical samples. Many automated identification systems are commercially available, such as the Vitek 2 (bioMérieux, Marcy l'Etoile, France), BD Phoenix (Becton Dickinson Diagnostic Systems, Sparks, MD), and MicroScan (Dade Behring, West Sacramento, CA) systems. These systems allow for more rapid and more accurate identification than does manual morphological identification or biochemical tests. However, the accuracy of these tests can be compromised because of the variable expression of phenotypic characteristics and the limited nature of the databases; these limitations can result in ambiguous findings and the inability to identify uncommon isolates (4, 15, 17). In addition, identification of CNS to the species level may change the diagnosis and therapeutic plans, since uncommon isolates are being considered as causative agents, and unusual antibacterial resistance patterns appear (1, 13). Therefore, genotypic methods of identifying CNS species are emerging as diagnostic tools for CNS infections.
Sequence analysis of the 16S rRNA, a highly conserved region present in all bacteria, has been implemented in clinical laboratories to identify CNS species (10, 22). Although this method is widely used and accurate, the high degree of similarity between closely related species limits its usefulness for identifying several CNS species (27). Alternative target genes, such as rpoB, tuf, dnaJ, and sodA, are being assessed for their abilities to distinguish between highly similar species (12, 14, 15, 24, 27, 28). The tuf gene, which encodes the elongation factor Tu, is an essential constituent of the bacterial genome and is involved in peptide chain formation. Due to its essential nature, it is preferred for diagnostic purposes (19). tuf gene analysis has been shown to be a reliable and reproducible method of identifying CNS; further, it has exhibited better resolution for distinguishing between certain CNS species than 16S rRNA analysis (15, 22).
In addition to the selection of a target gene for bacterial species identification, interpretation of the sequence analysis results utilizing different databases is also important. However, the postsequencing process of interpreting genotypic results has not been emphasized in many studies (4, 22). Bioinformatic tools for bacterial identification are continually being developed and renovated to meet the needs for processing ever-increasing amounts of data. Currently, multiple databases or tools exist for bacterial identification. The most commonly used open database worldwide is GenBank (5), which incorporates DNA sequences from all available public sources, making it comprehensive and easily accessible, and its sequences are considered the primary data for other databases. Other databases incorporate other information along with that from GenBank and can be regarded as secondary databases. One of the secondary databases, BIBI, combines the two well-known tools of BLAST (30) and CLUSTALW (18) and utilizes phylogenetic data that are important for bacterial identification (11). EzTaxon, another Web-based tool, contains 16S rRNA sequences for prokaryotic type strains; it is constructed to enable the identification of isolates on the basis of pairwise nucleotide similarity values and phylogenetic inference methods (9). Additionally, MicroSeq 500 (Applied Biosystems Inc., Foster City, CA) is a commercially available software program for 16S rRNA sequence analysis (32).
We compared the genotypic results from 16S rRNA sequencing analyzed with GenBank, MicroSeq, EzTaxon, and BIBI for clinical isolates identified as CNS by phenotypic systems (Vitek 2, MicroScan); further, the genotypic results from tuf gene analyses using GenBank and BIBI were also compared. Few articles exist regarding guidelines for the interpretation of DNA target sequences for bacterial identification. This is problematic, because the results given by databases are often inconclusive. The finding of multiple probable results or a rare species may not have been reviewed by others, because many databases are open to the public and are not validated thoroughly. The Clinical and Laboratory Standards Institute (CLSI) molecular method 18-A (MM18-A) is so far the most commonly referenced material for bacterial DNA identification (22). CLSI MM18-A focuses on the interpretation of bacterial 16S rRNA sequence data, including data for staphylococci, related Gram-positive cocci, and fungi. However, few guidelines exist for other DNA targets, such as the tuf gene, for bacterial identification. Therefore, we evaluated the appropriateness of the CLSI guidelines for tuf gene analysis and aimed to determine the optimal criteria for CNS species identification by tuf gene analysis. Moreover, we compared the results of 16S rRNA and tuf gene analyses using different databases and assessed whether tuf gene analysis can be used reliably in clinical laboratories to identify CNS species.
MATERIALS AND METHODS
Bacterial isolates.
A total of 97 clinical isolates that were identified as CNS by either one of the two phenotypic systems used were included in this study. The phenotypic systems used were the MicroScan Pos Combo panel type 1A (Dade Behring, West Sacramento, CA) and Vitek 2 Gram-positive (GP) identification (ID) cards (bioMérieux, Marcy l'Etoile, France). The identification procedures were performed according to the manufacturers' recommendations. Identifications with less than a 90% identity score by phenotypic systems were considered insufficient for a result, but those isolates were included for the analysis. To exclude blood culture contaminants, only blood culture isolates that grew in at least two of three blood cultures were included. For additional or repeated tests, the isolates were suspended in skim milk and were stored at −70°C.
Extraction of genomic DNA.
After overnight growth on blood agar plates at 37°C, genomic DNA was extracted from pure cultures by a Chelex matrix (Bio-Rad Laboratories, Hercules, CA) according to the manufacturer's instructions.
16S rRNA and tuf gene sequencing.
PCR amplifications were conducted in a total volume of 25 μl containing 2.5 mM deoxynucleoside triphosphates (dNTPs), 10 pmol of each PCR primer, 0.6 U Taq polymerase, 2.5 μl of 10× PCR buffer with 15 mM MgCl2 (Takara Bio, Inc., Shiga, Japan), and 2.5 μl of the template. In-house primers were designed using LightCycler Probe Design software (version 2.0) (Roche, Penzberg, Germany); published studies were used as a reference (7, 10, 15). 16S rRNA was amplified with primers MSQ-F (5′-TGAAGAGTTTGATCATGGCTCAG-3′) and MSQ-R (5′-ACCGCGGCTGCTGGCAC-3′), and the tuf gene was amplified with primers TUF-F (5′-GCCAGTTGAGGACGTATTCT-3′) and TUF-R (5′-CCATTTCAGTACCTTCTGGTAA-3′). The PCR conditions for 16S rRNA were as follows: an initial denaturation period of 10 min at 95°C, followed by 35 cycles of 30 s of annealing at 60°C and 45 s of elongation at 72°C, with a final 10-min extension at 72°C. The PCR conditions for tuf were as follows: 15 min of initial denaturation at 95°C, followed by 35 cycles of 30 s at 95°C, 30 s at 56°C, and 45 s at 72°C, with a final 10-min extension at 72°C. Gel electrophoresis was used to detect positive PCR signals and to confirm amplicon lengths of 527 bp for 16S rRNA and 412 bp for the tuf gene. Prior to sequencing, the PCR products were purified using the ExoSAP-IT reagent (USB Corporation, Cleveland, OH) according to the manufacturer's instructions. Forward and reverse sequencing reactions were conducted for each of the amplified products. The sequencing reaction mixture for 16S rRNA consisted of 10 μl of MicroSeq 500 sequencing mix (containing 1.6 pmol of primer MSQ-F or MSQ-R), 2.9 μl of molecular-grade water, and 1 μl of the purified PCR product. For the tuf gene, sequencing reactions were performed using BigDye Terminator, version 3.1, reagents (Applied Biosystems Inc., Foster City, CA). Briefly, the sequencing reaction mixture consisted of 1 μl of BigDye Ready Reaction mix, 3.5 μl of BigDye sequencing buffer (5×) (Applied Biosystems Inc.), 1.6 μl of a 1-pmol primer, 2.9 μl of molecular-grade water, and 1 μl of the purified PCR product; the final reaction volume was 10 μl. The thermal cycling conditions were as follows: 25 cycles of 10 s at 96°C, 5 s at 50°C, and 4 min at 60°C. The sequencing products were purified using ethanol–sodium acetate. Sequencing reactions were performed on an ABI Prism 3130xl genetic analyzer (Applied Biosystems Inc.) according to the standard automated sequencer protocols.
Sequence analysis.
GenBank, MicroSeq 500, version 2.0, EzTaxon (http://www.eztaxon.org) (9), and BIBI (http://umr5558-sud-str1.univ-lyon1.fr/lebibi/lebibiexecX.cgi) (11) were accessed most recently in January 2011 for analysis. In compliance with the CLSI guidelines for the interpretation of 16S rRNA sequence analysis with GenBank and MicroSeq, a query sequence with ≥99.0% identity for species identification and >0.8% separation between different species was considered acceptable. In cases where multiple species with <0.8% separation and with ≥99.0% identity were identified, all of the IDs were considered probable IDs. Because BIBI does not report results with identity scores, the ID with the greatest number of sequences labeled “ID in cluster” in response to the query sequence was considered the most probable ID; the ID with the greatest number of sequences labeled “type strain in clusters” was considered the next most probable ID, followed by the ID with the greatest number of sequences labeled “type strain outside clusters.”
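The BIBI ranking rule described above can be sketched as a sort on three counts. This is only an illustration: it assumes the three counts have already been read off the BIBI output, it reads the rule as a lexicographic ordering (an interpretation, not something the text states explicitly), and `BibiCandidate`, `rank_bibi_candidates`, and `probable_ids` are hypothetical names that are not part of BIBI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BibiCandidate:
    """Hypothetical record of the counts read from a BIBI result for one species."""
    species: str
    id_in_cluster: int
    type_strain_in_clusters: int
    type_strain_outside_clusters: int


def _key(candidate):
    # Assumed priority order: "ID in cluster" first, then "type strain in
    # clusters", then "type strain outside clusters".
    return (
        candidate.id_in_cluster,
        candidate.type_strain_in_clusters,
        candidate.type_strain_outside_clusters,
    )


def rank_bibi_candidates(candidates):
    """Return candidates ordered from most to least probable ID."""
    return sorted(candidates, key=_key, reverse=True)


def probable_ids(candidates):
    """Return every species tied for the top rank (more than one means an ambiguous result)."""
    ranked = rank_bibi_candidates(candidates)
    if not ranked:
        return []
    top = _key(ranked[0])
    return [c.species for c in ranked if _key(c) == top]
```

Under this reading, two species with identical counts are both returned, which corresponds to a result with multiple probable IDs.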
For tuf gene analysis, GenBank and BIBI were used. Because no interpretative criteria exist for species identification using tuf gene analysis with GenBank, we evaluated the results with the current CLSI guidelines for 16S rRNA interpretation and arbitrarily set ≥98.0% identity with >0.8% separation as the rule for acceptable ID results for reporting at the species level. The same rule that was applied for 16S rRNA analysis with BIBI was used for tuf gene analysis with BIBI.
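The identity and separation criteria can be written as one small rule with an adjustable cutoff. The sketch below is an illustration under stated assumptions, not the procedure the authors ran: `interpret_hits` is a hypothetical helper, hits below the identity cutoff are dropped before separation is checked, and a separation of exactly 0.8% is treated as not separated because the text requires more than 0.8%.

```python
def interpret_hits(hits, min_identity=99.0, min_separation=0.8):
    """Apply an identity cutoff and a separation rule to database hits.

    hits: list of (species, percent_identity), one best hit per species.
    Returns (status, species_list).
    """
    # Keep only hits that reach the species-level identity cutoff.
    qualifying = sorted(
        ((species, identity) for species, identity in hits if identity >= min_identity),
        key=lambda hit: hit[1],
        reverse=True,
    )
    if not qualifying:
        return "not identified to species", []

    best_identity = qualifying[0][1]
    # Species not separated from the best hit by more than min_separation
    # are all reported as probable IDs. Rounding avoids float noise.
    probable = [
        species
        for species, identity in qualifying
        if round(best_identity - identity, 3) <= min_separation
    ]
    status = "single ID" if len(probable) == 1 else "multiple probable IDs"
    return status, probable


# Hypothetical hit lists: 16S rRNA with the 99.0% cutoff, tuf with 98.0%.
print(interpret_hits([("S. capitis", 99.8), ("S. caprae", 99.6)]))
print(interpret_hits([("S. cohnii", 98.3), ("S. xylosus", 95.0)], min_identity=98.0))
```

Passing `min_identity=98.0` reproduces the adjusted tuf rule, while the defaults correspond to the 16S rRNA criteria.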
The “definitive ID” was defined as the ID most frequently obtained by the eight different methods (two phenotypic and six genotypic methods); a minimum of four results in concordance were required for a definitive ID.
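The definitive-ID rule amounts to a vote across the eight methods. The following sketch makes two assumptions that the text does not spell out: a method that reports several probable IDs counts once toward each of them, and a tie for first place yields no definitive ID. The function name `definitive_id` is hypothetical; the example row is the S. pettenkoferi isolate as listed in Table 4.

```python
from collections import Counter


def definitive_id(method_results, min_concordant=4):
    """Return the most frequently reported species, or None if no consensus.

    method_results: dict mapping method name -> list of species reported
    (an empty list when the method gave no result).
    """
    votes = Counter()
    for reported in method_results.values():
        votes.update(set(reported))  # each method votes once per species

    ranked = votes.most_common(2)
    if not ranked:
        return None
    top_species, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if top_votes < min_concordant or tied:
        return None
    return top_species


# Results for the single S. pettenkoferi isolate, as listed in Table 4.
PETTENKOFERI_ROW = {
    "16S GenBank": ["S. pettenkoferi", "S. pseudolugdunensis"],
    "16S MicroSeq": ["S. hyicus", "S. cohnii", "S. caprae", "S. capitis"],
    "16S EzTaxon": ["S. pettenkoferi"],
    "16S BIBI": ["S. pseudolugdunensis", "S. pettenkoferi"],
    "tuf GenBank": ["S. pseudolugdunensis"],
    "tuf BIBI": ["S. pettenkoferi"],
    "Vitek 2": [],
    "MicroScan": ["S. capitis"],
}

print(definitive_id(PETTENKOFERI_ROW))  # S. pettenkoferi (4 concordant methods)
```

With this counting scheme the isolate reaches exactly the minimum of four concordant results, so raising `min_concordant` to 5 would leave it without a definitive ID.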
Statistical analysis.
One-way analysis of variance (ANOVA) was used for comparisons of differences in percentages of identity between the first and second probable IDs by different methods. Kappa correlation statistics for the concordance of the results from different methods were analyzed. IBM SPSS Statistics, version 19 (SPSS Inc., Chicago, IL), was used for statistical evaluation. All P values are two sided, and a P value of <0.05 was considered statistically significant.
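For readers unfamiliar with kappa statistics, the idea can be illustrated with the two-rater form (Cohen's kappa), which compares observed agreement with the agreement expected by chance. This is a simplified stand-in written in plain Python: the analysis in this study was run in SPSS and is described in the Results as a multirater kappa, so the function below (`cohen_kappa`, a hypothetical name) is not the authors' procedure and will not necessarily reproduce their coefficients.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of isolates given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from the marginal label frequencies of each rater.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: one method's IDs versus the definitive IDs.
method_ids = ["S. epidermidis", "S. epidermidis", "S. hominis", "S. epidermidis"]
definitive = ["S. epidermidis", "S. epidermidis", "S. hominis", "S. hominis"]
print(cohen_kappa(method_ids, definitive))
```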
RESULTS
Identification of coagulase-negative staphylococci.
The identification results for the 97 clinical isolates obtained using four databases for 16S rRNA sequencing, two databases for tuf gene sequencing, and two automated phenotypic identification systems are shown in Tables 1 to 4. In general, at least five different methods were in agreement for species identification. In the sole exceptional case, only four of the identification results were in concordance; this isolate was verified as Staphylococcus pettenkoferi (29), which was regarded as the definitive ID. In summary, the clinical specimens were identified as S. epidermidis (n = 37), Staphylococcus hominis (n = 22), Staphylococcus capitis (n = 10), Staphylococcus haemolyticus (n = 7), Staphylococcus lugdunensis (n = 6), Staphylococcus caprae (n = 5), Staphylococcus cohnii (n = 3), Staphylococcus warneri (n = 3), Staphylococcus simulans (n = 2), Staphylococcus saprophyticus (n = 1), and S. pettenkoferi (n = 1). For 16S rRNA analysis, identifications that were discordant with the definitive ID occurred at a frequency of 2.1% for GenBank, 1.0% for MicroSeq, 10.3% for EzTaxon, and 15.5% for BIBI. On the other hand, tuf gene analysis with GenBank showed 2.1% discrepancy; tuf gene analysis with BIBI did not yield any discrepant results. The phenotypic ID systems Vitek 2 and MicroScan showed 8.8% and 19.5% discrepancy with the definitive ID, respectively. The discordant results by different methods are listed in Table 5.
Table 1.
16S rRNA analysis (GenBank) | 16S rRNA analysis (MicroSeq) | 16S rRNA analysis (EzTaxon) | 16S rRNA analysis (BIBI) | tuf gene analysis (GenBank) | tuf gene analysis (BIBI) | Vitek 2 | MicroScan | Definitive ID | No. of isolates |
---|---|---|---|---|---|---|---|---|---|
S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis (100.0 [12], 99.7 [8], 99.4 [6], 99.1 [3], 98.8 [1]) | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | 30 |
S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis (100.0 [1], 99.7 [1], 99.4 [1]) | S. epidermidis | S. hominis | S. epidermidis | S. epidermidis | 3 |
S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis (99.1) | S. epidermidis | S. epidermidis | S. aureus | S. epidermidis | 1 |
S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. hominis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | 1 |
S. epidermidis/S. capitis/ S. caprae (99.9) | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | 1 |
S. epidermidis | S. epidermidis | S. epidermidis | S. caprae/ S. capitis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | S. epidermidis | 1 |
Where the percentage of identity for 16S rRNA analysis with GenBank, MicroSeq, or EzTaxon or for tuf analysis with GenBank is not 100%, the percentage of identity is given in parentheses, and the number of isolates with that percentage of identity (if more than 1) is given in brackets.
Table 4.
16S rRNA analysis (GenBank) | 16S rRNA analysis (MicroSeq) | 16S rRNA analysis (EzTaxon) | 16S rRNA analysis (BIBI) | tuf gene analysis (GenBank) | tuf gene analysis (BIBI) | Vitek 2 | MicroScan | Definitive ID | No. of isolates |
---|---|---|---|---|---|---|---|---|---|
S. cohnii | S. cohnii | S. cohnii | S. cohnii | S. cohnii (98.3 [1], 97.3 [1], 97.1 [1]) | S. cohnii | S. cohnii | S. cohnii | S. cohnii | 3 |
S. lugdunensis (100.0 [3], 99.8 [1], 99.3 [1]) | S. lugdunensis (99.9 [2], 99.8 [3]) | S. lugdunensis | S. lugdunensis | S. lugdunensis (100.0 [1], 99.7 [3], 99.6 [1]) | S. lugdunensis | S. lugdunensis | S. haemolyticus | S. lugdunensis | 5 |
S. lugdunensis | S. lugdunensis (99.8) | S. lugdunensis | S. lugdunensis | S. lugdunensis (99.7) | S. lugdunensis | NA | S. lugdunensis | S. lugdunensis | 1 |
S. pettenkoferi/ S. pseudolugdunensis (99.6) | S. hyicus/S. cohnii/ S. caprae/S. capitis (96.6) | S. pettenkoferi | S. pseudolugdunensis/S. pettenkoferi | S. pseudolugdunensis | S. pettenkoferi | NA | S. capitis | S. pettenkoferi | 1 |
S. saprophyticus | S. saprophyticus/S. xylosus (99.9) | S. saprophyticus/S. xylosus (99.6) | S. saprophyticus | S. saprophyticus (98.9) | S. saprophyticus | S. saprophyticus | S. saprophyticus | S. saprophyticus | 1 |
S. simulans (99.6) | S. simulans (99.9) | S. simulans | S. simulans | S. simulans (99.1) | S. simulans | S. simulans | S. simulans | S. simulans | 1 |
S. simulans (99.6) | S. simulans (99.9) | S. simulans | S. simulans | S. simulans (99.1) | S. simulans | S. simulans | S. cohnii | S. simulans | 1 |
S. warneri/S. pasteuri | S. warneri | S. warneri | S. warneri | S. warneri/S. pasteuri | S. warneri | S. saprophyticus | S. warneri | S. warneri | 1 |
S. warneri/S. pasteuri (99.8) | S. warneri/S. pasteuri (99.8) | S. warneri | S. warneri | S. warneri/S. pasteuri (99.6) | S. warneri | S. warneri | S. auricularis | S. warneri | 1 |
S. warneri/S. pasteuri | S. warneri/S. pasteuri | S. warneri | S. warneri | S. warneri/S. pasteuri | S. warneri | S. warneri | S. warneri | S. warneri | 1 |
Where the percentage of identity for 16S rRNA analysis with GenBank, MicroSeq, or EzTaxon or for tuf analysis with GenBank is not 100%, the percentage of identity is given in parentheses, and the number of isolates with that percentage of identity (if more than 1) is given in brackets.
NA, not available.
Table 5.
Method | No. (%) of isolates with a single ID in concordance with the definitive ID | No. (%) of isolates with multiple IDs, one of which is in agreement with the definitive ID | No. (%) of isolates with a discordant ID | Total no. of isolates |
---|---|---|---|---|
16S rRNA analysis (GenBank) | 56 (57.7) | 39 (40.2) | 2 (2.1) | 97 |
16S rRNA analysis (MicroSeq) | 81 (83.5) | 15 (15.5) | 1 (1.0) | 97 |
16S rRNA analysis (EzTaxon) | 80 (82.5) | 7 (7.2) | 10 (10.3) | 97 |
16S rRNA analysis (BIBI) | 67 (69.1) | 15 (15.5) | 15 (15.5) | 97 |
tuf gene analysis (GenBank) | 92 (94.8) | 3 (3.1) | 2 (2.1) | 97 |
tuf gene analysis (BIBI) | 97 (100.0) | 0 (0.0) | 0 (0.0) | 97 |
Vitek 2 | 83 (91.2) | 0 (0.0) | 8 (8.8) | 91 |
MicroScan | 70 (80.5) | 0 (0.0) | 17 (19.5) | 87 |
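The percentages in Table 5 follow directly from the counts, with each method's own total as the denominator (97 for the sequence-based methods, 91 for Vitek 2, and 87 for MicroScan). A short sketch of that arithmetic, where `summarize` is a hypothetical helper and the counts are copied from the table:

```python
# Counts from Table 5: (single concordant ID, multiple IDs, discordant ID).
TABLE5_COUNTS = {
    "16S rRNA, GenBank": (56, 39, 2),
    "16S rRNA, MicroSeq": (81, 15, 1),
    "16S rRNA, EzTaxon": (80, 7, 10),
    "16S rRNA, BIBI": (67, 15, 15),
    "tuf, GenBank": (92, 3, 2),
    "tuf, BIBI": (97, 0, 0),
    "Vitek 2": (83, 0, 8),
    "MicroScan": (70, 0, 17),
}


def summarize(counts):
    """Return (total, percentages rounded to one decimal) for one method."""
    total = sum(counts)
    return total, tuple(round(100.0 * c / total, 1) for c in counts)


for method, counts in TABLE5_COUNTS.items():
    total, percentages = summarize(counts)
    print(f"{method}: n = {total}, percentages = {percentages}")
```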
Table 2.
16S rRNA analysis (GenBank) | 16S rRNA analysis (MicroSeq) | 16S rRNA analysis (EzTaxon) | 16S rRNA analysis (BIBI) | tuf gene analysis (GenBank) | tuf gene analysis (BIBI) | Vitek 2 | MicroScan | Definitive ID | No. of isolates |
---|---|---|---|---|---|---|---|---|---|
S. capitis/S. caprae/ S. epidermidis (99.8 [3], 99.6 [1]) | S. capitis/S. caprae/ S. epidermidis (99.9 [4]) | S. caprae | S. capitis/S. caprae | S. capitis (100.0 [3], 99.4 [1]) | S. capitis | S. capitis | S. capitis | S. capitis | 4 |
S. capitis | S. capitis/S. caprae/ S. epidermidis (99.9 [2]) | S. caprae | S. capitis/S. caprae | S. capitis (99.1 [2]) | S. capitis | S. capitis | S. capitis | S. capitis | 2 |
S. caprae/S. epidermidis/S. capitis | S. capitis | S. caprae | S. capitis/S. caprae | S. capitis | S. capitis | S. capitis | S. capitis | S. capitis | 1 |
S. capitis/S. caprae/ S. epidermidis (99.8) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae | S. epidermidis | S. capitis (99.1) | S. capitis | S. capitis | S. capitis | S. capitis | 1 |
S. capitis/S. caprae/ S. epidermidis (99.8) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae | S. capitis/S. caprae | S. capitis (99.7) | S. capitis | S. capitis | S. epidermidis | S. capitis | 1 |
S. caprae/S. epidermidis/S. capitis (99.8) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae | S. haemolyticus | S. capitis (99.4) | S. capitis | S. capitis | S. capitis | S. capitis | 1 |
S. arlettae/S. capitis/ S. caprae (99.8) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae/S. saccharolyticus/S. epidermidis (99.8) | S. capitis/S. caprae | S. caprae (99.3) | S. caprae | S. caprae | S. capitis | S. caprae | 1 |
S. arlettae/S. capitis/ S. epidermidis (99.6) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae/S. capitis/ S. saccharolyticus/S. epidermidis (99.8) | S. capitis/S. caprae | S. caprae (99.4) | S. caprae | S. simulans | S. haemolyticus | S. caprae | 1 |
S. arlettae/S. capitis/ S. caprae (99.8) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae/S. capitis/S. saccharolyticus/ S. epidermidis (99.8) | S. capitis/S. caprae | S. caprae (99.6) | S. caprae | S. caprae | S. warneri | S. caprae | 1 |
S. arlettae/S. capitis/ S. epidermidis (99.5) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. caprae/S. capitis/S. saccharolyticus/ S. epidermidis (99.8) | S. capitis/S. caprae | S. caprae | S. caprae | S. caprae | S. epidermidis | S. caprae | 1 |
S. arlettae/S. capitis/ S. caprae (99.6) | S. capitis/S. caprae/ S. epidermidis (99.9) | S. capitis/S. caprae/ S. epidermidis (99.8) | S. capitis/S. caprae | S. caprae | S. caprae | S. caprae | S. aureus | S. caprae | 1 |
Where the percentage of identity for 16S rRNA analysis with GenBank, MicroSeq, or EzTaxon or for tuf analysis with GenBank is not 100%, the percentage of identity is given in parentheses, and the number of isolates with that percentage of identity (if more than 1) is given in brackets.
Table 3.
16S rRNA analysis (GenBank) | 16S rRNA analysis (MicroSeq) | 16S rRNA analysis (EzTaxon) | 16S rRNA analysis (BIBI) | tuf gene analysis (GenBank) | tuf gene analysis (BIBI) | Vitek 2 | MicroScan | Definitive ID | No. of isolates |
---|---|---|---|---|---|---|---|---|---|
S. auricularis/ S. haemolyticus/ S. warneri | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. haemolyticus (99.7 [1], 99.6 [1], 99.1 [1]) | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. haemolyticus | 4 |
S. auricularis/ S. haemolyticus/ S. warneri | S. haemolyticus | S. haemolyticus (99.8) | S. capitis/S. caprae | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. haemolyticus | 1 |
S. auricularis/ S. haemolyticus/ S. warneri | S. haemolyticus | S. haemolyticus (99.7) | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. epidermidis | S. epidermidis | S. haemolyticus | 1 |
S. auricularis/ S. haemolyticus/ S. warneri | S. haemolyticus | S. haemolyticus (99.7) | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. haemolyticus | S. simulans | S. haemolyticus | 1 |
S. hominis/S. warneri/ S. hyicus (100.0 [7], 99.8 [2]) | S. hominis (100.0 [3], 99.9 [1], 99.8 [5]) | S. hominis (100.0 [3], 99.8 [6]) | S. hominis | S. hominis (100.0 [2], 99.7 [2], 99.6 [1], 99.4 [1], 99.3 [1], 98.6 [1], 98.3 [1]) | S. hominis | S. hominis | S. hominis | S. hominis | 9 |
S. hominis | S. hominis (99.9 [1], 99.8 [1], 99.7 [4]) | S. hominis (99.8 [6]) | S. xylosus | S. hominis (99.4 [1], 99.1 [2], 98.9 [1], 98.8 [1], 98.0 [1]) | S. hominis | S. hominis | S. aureus | S. hominis | 6 |
S. hominis/S. warneri/S. hyicus (100.0 [2], 99.8 [1]) | S. hominis (99.8 [3]) | S. hominis (100.0 [1], 99.8 [2]) | S. xylosus | S. hominis (100.0 [2], 99.1 [1]) | S. hominis | S. hominis | S. hominis | S. hominis | 3 |
S. hominis/S. warneri/S. hyicus (99.8) | S. hominis | S. hominis | S. hominis | S. hominis | S. hominis | NA | S. hominis | S. hominis | 1 |
S. hominis/S. warneri/S. hyicus | S. hominis (99.8) | S. hominis (99.8) | S. xylosus | S. hominis (99.1) | S. hominis | Kocuria varians | S. hominis | S. hominis | 1 |
S. hominis/S. warneri/S. hyicus | S. hominis (99.7) | S. hominis/S. haemolyticus (99.8) | S. xylosus | S. hominis | S. hominis | S. hominis | S. hominis | S. hominis | 1 |
S. hominis/S. warneri/S. hyicus | S. hominis (99.7) | S. hominis (99.8) | S. hominis | S. hominis (99.1) | S. hominis | S. hominis | S. epidermidis | S. hominis | 1 |
Where the percentage of identity for 16S rRNA analysis with GenBank, MicroSeq, or EzTaxon or for tuf analysis with GenBank is not 100%, the percentage of identity is given in parentheses, and the number of isolates with that percentage of identity (if more than 1) is given in brackets.
NA, not available.
Briefly, by 16S rRNA analysis with GenBank, two S. caprae isolates were misidentified; with MicroSeq, one S. pettenkoferi isolate was misidentified. By use of EzTaxon for 16S rRNA analysis, 10 S. capitis isolates were identified as S. caprae; by use of BIBI, 1 S. epidermidis, 1 S. haemolyticus, 2 S. capitis, and 11 S. hominis isolates were misidentified. When GenBank was used for tuf gene analysis, one S. epidermidis and one S. pettenkoferi isolate showed mismatches with the definitive ID; with BIBI, no discrepant results were observed.
Because multiple species were considered probable IDs when there was <0.8% identity difference between the IDs, we evaluated the species that were considered probable IDs (Table 5). 16S rRNA analysis with GenBank yielded multiple answers for 40.2% of the specimens, showing low discriminatory power between S. capitis, S. caprae, and S. epidermidis; between S. hominis, S. warneri, S. hyicus, and S. haemolyticus; and between S. warneri and S. hyicus. 16S rRNA analysis with MicroSeq provided multiple IDs for 15.5% of the specimens and was unable to differentiate between S. capitis, S. caprae, and S. epidermidis and between S. saprophyticus and S. xylosus. 16S rRNA analysis with EzTaxon showed ambiguous results for 7.2% of the specimens; further, all five S. caprae isolates were given multiple IDs, including S. caprae, S. capitis, S. saccharolyticus, and S. epidermidis. S. hominis was identified as either S. hominis or S. haemolyticus, and S. saprophyticus was identified as either S. saprophyticus or S. xylosus, by 16S rRNA analysis with EzTaxon. 16S rRNA analysis with BIBI revealed multiple probable IDs for 15.5% of the clinical isolates. The results were inconclusive, with S. capitis identified as S. capitis or S. caprae and S. pettenkoferi identified as S. pettenkoferi or Staphylococcus pseudolugdunensis. The numbers of sequences labeled “type strain within clusters” and “type strain outside of clusters” were the same for the two species by BIBI. In contrast, tuf gene analysis with GenBank was not able to differentiate between S. warneri and S. pasteuri. No ambiguous results were obtained by use of tuf gene analysis with BIBI.
The correlation of each method with the definitive ID was evaluated by multirater kappa statistics, and the kappa coefficient was 0.9735 for 16S rRNA analysis by GenBank, 0.9868 for 16S rRNA analysis by MicroSeq, 0.8684 for 16S rRNA analysis by EzTaxon, 0.8081 for 16S rRNA analysis by BIBI, 0.9736 for tuf analysis by GenBank, 1.00 for tuf analysis by BIBI, 0.8560 for the Vitek 2 system, and 0.7613 for the MicroScan system. The kappa coefficient with tuf analysis was higher than those for 16S rRNA analysis and automated identification systems.
Discriminatory power.
The genotypic results generated by 16S rRNA analysis with MicroSeq and EzTaxon and by tuf gene analysis with GenBank expressed identification scores as the percentage of sequence identity to the type culture collection or verified strains. Therefore, the results from these three methods were compared for their discriminatory power and for determination of the optimal sequence identity cutoff value for the identification of CNS species by tuf gene analysis. Excluding the results with 100% identity, MicroSeq and EzTaxon showed average identities of 99.85% and 99.80%, respectively, with the reference sequence; for tuf gene analysis with GenBank, 99.29% identity with the reference sequence was observed. The percentage of sequence identity was significantly higher for 16S rRNA analysis using MicroSeq or EzTaxon than for tuf gene analysis with GenBank (P < 0.001). When identity differences between the first and the second most probable ID results were compared, MicroSeq had a 1.57% difference; EzTaxon, a 1.31% difference; and tuf gene analysis, a 2.82% difference. The average difference in the percentage of identity between the first and the second most probable IDs was significantly higher for tuf gene analysis with GenBank than for 16S rRNA analysis with MicroSeq or EzTaxon (P < 0.001).
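The discriminatory-power measure used here, the gap in percent identity between the first and second most probable IDs, averaged over isolates, can be sketched as follows. The function names and the identity values are hypothetical and serve only to show the calculation; the significance testing (one-way ANOVA in SPSS) is not reproduced.

```python
from statistics import mean


def top_two_separation(identities):
    """Difference in percent identity between the best and second-best species hit."""
    ranked = sorted(identities, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate species")
    return ranked[0] - ranked[1]


def mean_separation(per_isolate_identities):
    """Average first-versus-second separation across isolates."""
    return mean(top_two_separation(ids) for ids in per_isolate_identities)


# Hypothetical percent identities per isolate (one value per candidate species).
example_isolates = [
    [99.5, 97.5, 96.0],
    [100.0, 97.0],
]
print(mean_separation(example_isolates))  # 2.5
```

A larger mean separation indicates that the top hit stands further apart from its nearest competitor, which is the sense in which tuf gene analysis is described as more discriminative.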
Interpretative criteria for tuf gene analysis of coagulase-negative staphylococci.
Currently, there are no guidelines for the interpretation of data generated by tuf gene analysis for bacterial identification. Therefore, we applied the CLSI MM18-A guidelines to the results for tuf gene analysis. Overall, 88.7% of the specimens could be identified with ≥99.0% species identity and with >0.8% separation between species. Additionally, 9.3% of the specimens were identified with ≥97.0% sequence identity for genus identification. Results that could be reported only to the genus level according to the CLSI guidelines (≥97.0% identity) were identified correctly with respect to the definitive ID once we adjusted the criteria for species identification to ≥98.0% identity with >0.8% separation between species. A total of 98.0% of the specimens could be identified correctly to the species level.
DISCUSSION
Many laboratories use the 16S rRNA region as a target for identification because it is widely used and is supported with a large amount of data within public databases; further, guidelines for data interpretation exist. However, because 16S rRNA analysis has low discriminatory power for the identification of certain CNS species (4, 12, 20), other target genes, such as the tuf gene, have been studied for CNS identification and exhibit good resolution (2, 6, 14). In the genomic era, postsequencing analyses have become an important part of molecular diagnostics. However, information regarding the interpretation of genotypic results with different databases is limited. Here, for the first time, we compared 16S rRNA sequencing results using four different databases, tuf gene analysis using two different databases, and two phenotypic identification systems.
The results of our study showed that tuf gene analysis generally exhibited better discriminatory power for CNS species identification than 16S rRNA analysis. Further, the use of multiple databases is important, because the results differ depending on the database used. Although disparities existed for the various databases used, there were fewer discordant results with tuf gene analysis than with 16S rRNA analysis, and the kappa coefficients for tuf gene analysis were higher than those for 16S rRNA analysis; these results support the appropriateness of tuf gene analysis as another method for species identification (2, 6). In addition, there were markedly fewer cases of multiple probable IDs with tuf gene analysis; the differences in the percentage of sequence identity between the first and the second probable IDs were greater. This finding was also made in other studies, and it was suggested through phylogenetic studies and assessment with different databases that the tuf gene had more intraspecies variability than 16S rRNA (15, 23). Therefore, further confirmation with other methods or targets would be required less often with tuf gene analysis than with 16S rRNA gene analysis. Because it is particularly important to provide an accurate and clearly defined result for a requested test in clinical laboratories, the discriminatory power of identification tests is very important.
16S rRNA sequence analysis with all four different databases showed low discriminatory power for S. capitis and S. caprae, suggesting the use of an alternative target, as recommended in the CLSI MM18-A document. tuf gene analysis was capable of distinguishing between these two species. S. warneri is reported to share approximately 98.7% sequence identity with S. pasteuri (22), and tuf and rpoB gene analyses are generally known to provide better resolution between these two species (6). However, sequence analysis with GenBank reported both S. pasteuri and S. warneri as probable IDs regardless of the target (16S rRNA or the tuf gene). Using 16S rRNA analysis, EzTaxon reported S. warneri correctly, which could be due to the fact that the database is based on different algorithms and data sources, curated from the primary database GenBank. Although S. saprophyticus and S. xylosus are known to exhibit approximately 100% identity in their 16S rRNA sequences (22), the sequence analysis results differed depending on the database; the secondary databases, MicroSeq and EzTaxon, could not distinguish between these species, but GenBank did. These results emphasize the need to utilize multiple databases, both primary and secondary databases, for the interpretation of sequencing analysis data for any DNA target.
Although our study of 97 clinical isolates included many CNS species, other species that may be encountered in clinical laboratories, such as Staphylococcus intermedius, Staphylococcus schleiferi, and Staphylococcus pseudintermedius, were not included and should be examined in a future study. In addition, we chose the ID that appeared most frequently as the definitive ID; however, this may not be the correct species ID. Further analysis with type strains is required to support and confirm the definitive IDs of this study. This study used partial sequencing for both 16S rRNA and tuf gene analysis and provided reliable results, with the exception of one case (S. pettenkoferi), suggesting that partial sequencing of the two genes is sufficient for clinical use. In addition, a minimal modification of the CLSI MM18-A species identification criteria, to ≥98.0% identity with >0.8% separation between species, yielded reliable results for tuf gene analysis.
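The modified criteria can be expressed as a small decision rule. The following is a minimal sketch, assuming the input is a list of (species, percent identity) hits from a single database search; the function name, the returned category labels, and the example values are hypothetical and serve only to illustrate the ≥98.0% identity and >0.8% separation thresholds.

```python
def classify_hits(hits, min_identity=98.0, min_separation=0.8):
    """Apply identity and separation thresholds to database hits.

    hits: list of (species, percent_identity) tuples.
    Returns (category, species_list); the category labels are illustrative.
    """
    # Keep only the best identity per species.
    best_per_species = {}
    for species, identity in hits:
        if identity > best_per_species.get(species, -1.0):
            best_per_species[species] = identity
    ranked = sorted(best_per_species.items(), key=lambda kv: kv[1], reverse=True)
    # The top hit must reach the identity cutoff.
    if not ranked or ranked[0][1] < min_identity:
        return ("unidentified", [])
    top_species, top_identity = ranked[0]
    # Any other species within the separation window makes the result ambiguous.
    close = [s for s, ident in ranked[1:] if top_identity - ident <= min_separation]
    if close:
        return ("multiple probable IDs", [top_species] + close)
    return ("species", [top_species])


# Made-up example hits.
clear_case = classify_hits([("S. capitis", 99.6), ("S. caprae", 97.1)])
ambiguous_case = classify_hits([("S. capitis", 99.8), ("S. caprae", 99.6)])
low_identity_case = classify_hits([("S. warneri", 97.0)])
```

Keeping the thresholds as parameters makes it easy to compare the modified cutoffs with other values for a given target gene.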
From this study, it appears that the tuf gene is a useful alternative target for CNS species identification that exhibits higher discriminatory power than 16S rRNA. Although no guidelines exist for tuf gene analysis, minimally modified CLSI MM18-A criteria can be used to obtain reliable results with low ambiguity and high sensitivity. Integrating results from different databases for postsequencing analysis is important for accurate diagnosis.
Footnotes
Published ahead of print on 12 October 2011.
REFERENCES
- 1. Ahlstrand E., Svensson K., Persson L., Tidefelt U., Söderquist B. Glycopeptide resistance in coagulase-negative staphylococci isolated in blood cultures from patients with hematological malignancies during three decades. Eur. J. Clin. Microbiol. Infect. Dis. [5 April 2011.]. doi: 10.1007/s10096-011-1228-8.
- 2. Alexopoulou K., et al. 2006. Comparison of two commercial methods with PCR restriction fragment length polymorphism of the tuf gene in the identification of coagulase-negative staphylococci. Lett. Appl. Microbiol. 43:450–454
- 3. Bearman G., Wenzel R. 2005. Bacteremias: a leading cause of death. Arch. Med. Res. 36:646–659
- 4. Becker K., et al. 2004. Development and evaluation of a quality-controlled ribosomal sequence database for 16S ribosomal DNA-based identification of Staphylococcus species. J. Clin. Microbiol. 42:4988–4995
- 5. Benson D. A., Karsch-Mizrachi I., Lipman D. J., Ostell J., Sayers E. W. 2011. GenBank. Nucleic Acids Res. 39:D32–D37
- 6. Bergeron M., et al. 2011. Species identification of staphylococci by amplification and sequencing of the tuf gene compared to the gap gene and by matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Eur. J. Clin. Microbiol. Infect. Dis. 30:343–354
- 7. Capurro A., et al. 2009. Comparison of a commercialized phenotyping system, antimicrobial susceptibility testing, and tuf gene sequence-based genotyping for species-level identification of coagulase-negative staphylococci isolated from cases of bovine mastitis. Vet. Microbiol. 134:327–333
- 8. Cheung G., Otto M. 2010. Understanding the significance of Staphylococcus epidermidis bacteremia in babies and children. Curr. Opin. Infect. Dis. 23:208–216
- 9. Chun J., et al. 2007. EzTaxon: a web-based tool for the identification of prokaryotes based on 16S ribosomal RNA gene sequences. Int. J. Syst. Evol. Microbiol. 57:2259–2261
- 10. Clarridge J., III 2004. Impact of 16S rRNA gene sequence analysis for identification of bacteria on clinical microbiology and infectious diseases. Clin. Microbiol. Rev. 17:840–862
- 11. Devulder G., Perriere G., Baty F., Flandrois J. 2003. BIBI, a bioinformatics bacterial identification tool. J. Clin. Microbiol. 41:1785–1787
- 12. Drancourt M., Raoult D. 2002. rpoB gene sequence-based identification of Staphylococcus species. J. Clin. Microbiol. 40:1333–1338
- 13. Garza-González E., et al. Microbiological and molecular characterization of human clinical isolates of Staphylococcus cohnii, Staphylococcus hominis, and Staphylococcus sciuri. Scand. J. Infect. Dis. [19 August 2011.]. doi: 10.3109/00365548.2011.598873.
- 14. Ghebremedhin B., Layer F., Konig W., Konig B. 2008. Genetic classification and distinguishing of Staphylococcus species based on different partial gap, 16S rRNA, hsp60, rpoB, sodA, and tuf gene sequences. J. Clin. Microbiol. 46:1019–1025
- 15. Heikens E., Fleer A., Paauw A., Florijn A., Fluit A. 2005. Comparison of genotypic and phenotypic methods for species-level identification of clinical isolates of coagulase-negative staphylococci. J. Clin. Microbiol. 43:2286–2290
- 16. Isaacs D. 2003. A ten year, multicentre study of coagulase negative staphylococcal infections in Australasian neonatal units. Arch. Dis. Child. Fetal Neonatal Ed. 88:F89–F93
- 17. Kim M., et al. 2008. Comparison of the MicroScan, VITEK 2, and Crystal GP with 16S rRNA sequencing and MicroSeq 500 v2.0 analysis for coagulase-negative staphylococci. BMC Microbiol. 8:233.
- 18. Larkin M., et al. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948
- 19. Martineau F., et al. 2001. Development of a PCR assay for identification of staphylococci at genus and species levels. J. Clin. Microbiol. 39:2541–2547
- 20. Mellmann A., et al. 2006. Sequencing and staphylococci identification. Emerg. Infect. Dis. 12:333–336
- 21. Otto M. 2009. Staphylococcus epidermidis—the ‘accidental’ pathogen. Nat. Rev. Microbiol. 7:555–567
- 22. Petti C., et al. 2008. Interpretative criteria for identification of bacteria and fungi by DNA target sequencing; approved guideline. CLSI document MM18-A. Clinical and Laboratory Standards Institute, Wayne, PA.
- 23. Petti C., et al. 2008. Genotypic diversity of coagulase-negative staphylococci causing endocarditis: a global perspective. J. Clin. Microbiol. 46:1780–1784
- 24. Poyart C., Quesne G., Boumaila C., Trieu-Cuot P. 2001. Rapid and accurate species-level identification of coagulase-negative staphylococci by using the sodA gene as a target. J. Clin. Microbiol. 39:4296–4301
- 25. Rogers K., Fey P., Rupp M. 2009. Coagulase-negative staphylococcal infections. Infect. Dis. Clin. North Am. 23:73–98
- 26. Rupp M., Archer G. 1994. Coagulase-negative staphylococci: pathogens associated with medical progress. Clin. Infect. Dis. 19:231–243
- 27. Shah M., et al. 2007. dnaJ gene sequence-based assay for species identification and phylogenetic grouping in the genus Staphylococcus. Int. J. Syst. Evol. Microbiol. 57:25–30
- 28. Sivadon V., et al. 2005. Use of genotypic identification by sodA sequencing in a prospective study to examine the distribution of coagulase-negative Staphylococcus species among strains recovered during septic orthopedic surgery and evaluate their significance. J. Clin. Microbiol. 43:2952–2954
- 29. Song S., et al. 2009. Human bloodstream infection caused by Staphylococcus pettenkoferi. J. Med. Microbiol. 58:270–272
- 30. Tatusova T., Madden T. 1999. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol. Lett. 174:247–250
- 31. Weinstein M., et al. 1997. The clinical significance of positive blood cultures in the 1990s: a prospective comprehensive evaluation of the microbiology, epidemiology, and outcome of bacteremia and fungemia in adults. Clin. Infect. Dis. 24:584–602
- 32. Woo P., et al. 2003. Usefulness of the MicroSeq 500 16S ribosomal DNA-based bacterial identification system for identification of clinically significant bacterial isolates with ambiguous biochemical profiles. J. Clin. Microbiol. 41:1996–2001