ABSTRACT
Bacterial pathogens that cannot be identified using matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) are occasionally encountered in clinical laboratories. The 16S rRNA gene is often used for sequence-based analysis to identify these bacterial species. Nevertheless, traditional Sanger sequencing is laborious, time-consuming, and low throughput. Here, we compared two commercially available 16S rRNA gene sequencing tests that are based on Illumina and Nanopore sequencing technologies, respectively, in their ability to identify the species of 172 clinical isolates that failed to be identified by MALDI-TOF MS. Sequencing data were analyzed by the respective built-in programs (MiSeq Reporter software of Illumina and Epi2me of Nanopore) and BLAST+ (v2.11.0). Their agreement with Sanger sequencing on species-level identification was determined. Discrepancies were resolved by whole-genome sequencing. The diagnostic accuracy of each workflow was determined using the composite sequencing result as the reference standard. Despite the high base-calling accuracy of Illumina sequencing, we demonstrated that the Nanopore workflow had a higher taxonomic resolution at the species level. Using built-in analysis algorithms, the concordance of Sanger 16S with the Illumina and Nanopore workflows was 33.14% and 87.79%, respectively. The agreement was 65.70% and 83.14%, respectively, when BLAST+ was used for analysis. Compared with the reference standard, the diagnostic accuracy of Nanopore 16S was 96.36%, which was identical to that of Sanger 16S and better than that of Illumina 16S (69.07%). The turnaround time of the Illumina workflow and the Nanopore workflow was 78 h and 8.25 h, respectively. The per-sample cost of the Illumina and Nanopore workflows was US$28.5 and US$17.7, respectively.
KEYWORDS: 16S rRNA gene, bacterial species, Illumina sequencing, Nanopore sequencing, Sanger sequencing
INTRODUCTION
Traditionally, clinical microbiology laboratories have relied on phenotypic methods to identify bacterial pathogens. However, conventional biochemical tests are labor-intensive and time-consuming, and the results can be ambiguous when two species share similar biochemical profiles (1, 2). Nowadays, matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) is widely used for bacterial identification in clinical laboratories (3). MALDI-TOF MS allows rapid identification of microorganisms by comparing the mass spectrum of a sample with the reference spectra in the database (4). Although MALDI-TOF MS is a rapid, simple, and high-throughput technology for bacterial identification, some species cannot be well differentiated because the mass spectra of closely related species are highly similar or because reference spectra are lacking (5).
A study by Lau et al. reported that MALDI-TOF MS failed to determine the species of 37 out of 67 (55%) phenotypically “difficult-to-identify” bacteria in clinical laboratories (6). In general, anaerobes, particularly Actinomyces spp., Peptostreptococcus spp., Prevotella spp., and Fusobacterium spp. (7–9), have a higher failure rate than aerobes in bacterial identification using MALDI-TOF MS (7, 10). Additionally, some weakly acid-fast bacilli and Gram-positive aerobes, such as Nocardia spp. and Streptomyces spp., respectively, are poorly identified by MALDI-TOF MS (7, 11). Regarding Gram-negative aerobes, studies have reported that MALDI-TOF MS cannot effectively identify certain Achromobacter spp., Acinetobacter spp., Chryseobacterium spp., and Moraxella spp. (11, 12). In such cases, 16S sequencing of cultured isolates is commonly used for species-level identification.
Sanger sequencing offers a high base-calling accuracy, but it is laborious and time-consuming with limited throughput (13). High-throughput sequencing (HTS) technologies, such as Illumina sequencing and Nanopore sequencing, have been proposed as alternatives to generate 16S sequences for rapid identification of bacteria that are of clinical interest. The Illumina platform can generate vast quantities of highly accurate sequencing reads. However, the read length is limited and insufficient to cover the entire 16S rRNA gene. According to the 16S metagenomic sequencing library preparation workflow from Illumina, bacteria are identified based on variable regions (V3 and V4) of 16S. Nevertheless, the variable regions are not equally discriminative between and across different species, genera, and families (14).
In contrast, the MinION device by Oxford Nanopore Technologies (ONT) enables generation of ultralong reads exceeding 4 Mb. The 16S rRNA sequencing assay (SQK-16S024) from ONT allows the entire 16S rRNA gene to be sequenced with real-time data analysis. Recent studies have demonstrated its potential for rapid bacterial identification; however, the high read error rate (8% to 15%) of this platform might hinder the accuracy of species-level identification for diagnostic purposes (15).
Considering the respective limitations of Illumina and Nanopore technologies, a comprehensive investigation of the clinical utility of these 16S rRNA sequencing approaches for bacterial identification is required. This study aimed to evaluate the performance of two commercial HTS workflows for 16S rRNA sequencing, which were the 16S metagenomic sequencing library preparation workflow (Nextera XT index kit v2) coupled with MiSeq Reporter software (MSR) from Illumina and the 16S barcoding kit 1-24 (SQK-16S024) coupled with Epi2me from ONT. These workflows were used to identify bacterial isolates that could not be definitively identified by MALDI-TOF MS. The respective performances of the two built-in analysis pipelines (MSR and Epi2me) were compared with that of the in-house BLAST+ (v2.11.0) analysis.
In light of the complexities of evaluating diagnostic accuracy in the absence of a perfect gold standard, we considered a composite 16S rRNA sequencing result inferred by Sanger and the two HTS platforms as a reference standard. In cases of disagreement in taxa inferred by the three sequencing platforms, whole-genome sequencing (WGS) was conducted to confirm the bacterial identities. The costs and times to result of the sequencing workflows were also compared.
MATERIALS AND METHODS
Sample collection.
A total of 172 clinical isolates from 117 species were collected from the clinical microbiology laboratory of Pamela Youde Nethersole Eastern Hospital in Hong Kong. Clinical isolates were included if they failed to be classified at the species level (score < 2.00) by the IVD MALDI Biotyper (Bruker Daltonics, Bremen, Germany). The MALDI-TOF MS procedures were repeated twice to eliminate the effect of random errors. Failure to identify bacterial species occurred due to (i) lack of a reference spectrum in the database (81 samples), (ii) inclusion of certain species in the “dangerous database,” named Security Library 1.0, rather than the regular database (two samples), or (iii) poor quality of protein spectra (89 samples) (see Table S1 in the supplemental material). The IVD MALDI Biotyper used in this study was microflex (Bruker Daltonics), and the database version was BD-6763. The original specimens from which the organisms were isolated are listed in Table S1 in the supplemental material.
DNA extraction.
Total nucleic acid was extracted from clinical isolates using the Amplicor respiratory specimen preparation kit (Roche, Basel, Switzerland) and purified with 1.8× AMPure XP beads (Beckman Coulter, CA, USA). Purified DNA was diluted to targeted concentrations in subsequent sequencing workflows. The required DNA inputs for the Illumina and Nanopore workflows were 12.5 ng and 10 ng, respectively.
Sanger 16S.
For Sanger 16S rRNA sequencing (Sanger 16S), the full-length 16S rRNA gene was amplified using primers 16s_27F (5′-AGAGTTTGATCMTGGC-3′) and 16s_1492R (5′-TACCTTGTTACGACTT-3′) (Fig. S1) (16). The reaction mixture was prepared by mixing 36.7 μL of nuclease-free water, 5 μL of 10× PCR buffer, 1 μL of 10 mM deoxynucleoside triphosphate mix (NEB, Ipswich, MA, USA), 1 μL of each 25 μM primer, 0.3 μL of HotStarTaq Plus DNA polymerase (Qiagen, Hilden, Germany), and 5 μL of DNA template. The PCR conditions were 96°C for 8 min, 37 cycles at 94°C for 1 min, 37°C for 2 min, and 72°C for 2 min 30 s, followed by 72°C for 10 min and a hold step at 4°C. PCR products were purified using ExoSAP-IT reagent (Thermo Fisher Scientific, Waltham, MA, USA) and then subjected to cycle sequencing with eight sequencing primers (17–19) (Table S2). The reaction mixture consisted of 13 μL of nuclease-free water, 1 μL of BigDye Terminator v3.1 ready reaction mix (Thermo Fisher Scientific), 3.5 μL of 5× sequencing buffer, 1 μL of 3.2 μM primer, and 1.5 μL of purified PCR product. The cycle sequencing conditions were 96°C for 1 min and 25 cycles at 96°C for 10 s, 37°C for 30 s, and 60°C for 4 min, followed by a hold step at 4°C. The sequencing products were purified using 75% isopropanol and resuspended in 12 μL of Hi-Di formamide (Thermo Fisher Scientific). After loading on the Applied Biosystems 3130 genetic analyzer (Thermo Fisher Scientific), the resulting raw trace files were analyzed using the Staden Package (v2.0.0b11). The consensus sequence of each sample was classified by submitting a Basic Local Alignment Search Tool (BLAST) query against the 16S rRNA sequence database (https://blast.ncbi.nlm.nih.gov/Blast.cgi), using the default parameters. The classified species with the lowest E value and highest percentage identity was regarded as the identity of the sample.
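The selection rule in the last sentence (lowest E value, then highest percentage identity) can be sketched in Python. This is only an illustration and not the authors' procedure, which used the NCBI web interface; the `Hit` record, its field names, and the tie-breaking order are assumptions.

```python
from typing import NamedTuple, Sequence


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record; field names are illustrative)."""
    species: str
    evalue: float
    pident: float  # percentage identity


def best_hit(hits: Sequence[Hit]) -> Hit:
    """Return the hit with the lowest E value, breaking ties by highest identity."""
    if not hits:
        raise ValueError("no BLAST hits to choose from")
    # Sort key: E value ascending, then percentage identity descending.
    return min(hits, key=lambda h: (h.evalue, -h.pident))
```

With placeholder hits `Hit("species A", 0.0, 99.1)` and `Hit("species B", 0.0, 99.8)`, the function returns the second hit, because both share the same E value and the second has the higher identity.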
Illumina 16S. (i) Library preparation.
For Illumina sequencing (Illumina 16S), libraries were constructed according to the 16S metagenomic sequencing library preparation workflow from Illumina. Briefly, the 16S V3 and V4 regions of samples were amplified in the first stage of PCR using the primers suggested in the workflow, which were 16S amplicon PCR forward primer (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′) and 16S amplicon PCR reverse primer (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′) (Fig. S1). The leading bases of each primer (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG in the forward primer and GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG in the reverse primer) are the overhang adapter sequences for attachment of the indexed adapters in the second stage of PCR. The size of the amplicon was approximately 460 bp. After a post-PCR cleanup, a unique indexed sequencing adapter was added to each sample using the Nextera XT index kit v2 (Illumina, San Diego, CA, USA). Then, a second post-PCR cleanup was performed, followed by a qualification check of the purified libraries.
(ii) Quantification and sequencing.
The size of each library was measured using the 2100 Bioanalyzer system (Agilent, Santa Clara, CA, USA) and the high-sensitivity DNA kit (Agilent). The quantity of the libraries was measured by real-time PCR using the LightCycler 480 instrument II (Roche) and the QIAseq Library Quant assay kit (Qiagen). Then, the libraries were diluted to 4 nM and pooled into one tube. After denaturation with 0.2 N NaOH, the pooled library was diluted to 9 pM and spiked with 15% of 9 pM PhiX prepared from the PhiX control kit v3 (Illumina). The pooled library was then loaded on the MiSeq sequencer (Illumina) for sequencing using MiSeq reagent kit v3 (Illumina). The sequencing time was 56 h.
(iii) On-instrument data analysis.
Sequencing data were analyzed using MiSeq Reporter software (v2.6.2.3) (MSR) in the MiSeq system. After selection of the metagenomics workflow, sequencing reads were mapped against reference sequences in the Greengenes database (v13.5, May 2013) (http://greengenes.lbl.gov/) for classification. The classification of reads at seven taxonomic levels from kingdom to species was analyzed in this workflow.
(iv) Data analysis using BLAST+ (Illumina_BLAST+).
The paired-end reads of each sample were merged using the “make.contigs” command in Mothur (v1.44.3) (20). The reads were filtered using the “screen.seqs” command. Sequences shorter than 400 bp, longer than 500 bp, or with any ambiguous bases were removed. The resulting fasta files were analyzed by BLAST+ (v2.11.0) using an in-house Python script (https://github.com/siupenyau/Pocket_16S/tree/7d3fa9d73a6a35afb47e40e7850cef72b4b91a22). In brief, the reads were aligned to the reference sequences in the 16S rRNA database (https://ftp.ncbi.nlm.nih.gov/blast/db/) downloaded from the National Center for Biotechnology Information (NCBI). The minimum percentage identity and minimum percentage query coverage were both set at 90%.
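The two filtering steps described above (read screening, then alignment thresholds) can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Mothur or of the in-house script: the `Alignment` record and function names are hypothetical, and treating the 90% thresholds as inclusive is an assumption.

```python
from typing import Iterable, Iterator, NamedTuple

MIN_LEN, MAX_LEN = 400, 500   # length window stated in the text (bp)
MIN_IDENTITY = 90.0           # percentage identity threshold from the text
MIN_QUERY_COVERAGE = 90.0     # percentage query coverage threshold from the text


def passes_screen(seq: str) -> bool:
    """Keep merged reads of 400 to 500 bp that contain only A, C, G, or T."""
    return MIN_LEN <= len(seq) <= MAX_LEN and set(seq.upper()) <= set("ACGT")


class Alignment(NamedTuple):
    """Hypothetical record for one read-to-reference alignment."""
    read_id: str
    species: str
    pident: float  # percentage identity
    qcov: float    # percentage query coverage


def accepted(alignments: Iterable[Alignment]) -> Iterator[Alignment]:
    """Yield alignments that meet both thresholds (inclusive, by assumption)."""
    for aln in alignments:
        if aln.pident >= MIN_IDENTITY and aln.qcov >= MIN_QUERY_COVERAGE:
            yield aln
```

A read that fails `passes_screen` would never reach the alignment step, so the two checks are applied in sequence rather than combined.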
(v) Data analysis using nf-core/ampliseq.
Samples with disagreements between the MSR and Illumina_BLAST+ were further analyzed using another pipeline, nf-core/ampliseq (https://github.com/nf-core/ampliseq), which was developed by Straub et al. (21) to obtain the resolved Illumina 16S identity. The pipeline performed taxonomic assignments based on an error-correcting amplicon sequence variant (ASV) approach instead of read-by-read classification. The reference 16S rRNA database was the SILVA v132 database (22).
Nanopore 16S. (i) Library preparation and sequencing.
For Nanopore sequencing (Nanopore 16S), library preparation was performed using the 16S barcoding kit 1-24 (SQK-16S024) from ONT according to the manufacturer’s protocol. Libraries were quantified using the Qubit 2.0 fluorometer (Thermo Fisher Scientific) with the Qubit 1× double-stranded DNA (dsDNA) HS assay kit (Thermo Fisher Scientific). Then, 24 barcoded libraries were pooled into one tube in equal concentrations. After ligation with the rapid adapter, sequencing was performed using the FLO-MIN106 R9.4.1 flow cell with the MinION sequencer on the MinKNOW platform for approximately 4 h.
(ii) On-instrument real-time data analysis.
During sequencing, the passed fastq files generated by Guppy of MinKNOW, which had a quality score of ≥7, were uploaded to the cloud-based data analysis platform Epi2me for analysis. Sequencing reads were aligned to reference sequences in the NCBI 16S bacterial database using the FASTQ 16S workflow (v2020.04.06). Regarding the workflow parameters, the minimum QSCORE was set at 7, while the minimum percentage coverage and minimum percentage identity were both set at 90%.
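How a per-read quality score relates to the QSCORE threshold of 7 can be illustrated with a short Python sketch. It assumes Phred+33 FASTQ encoding and assumes that the read-level score is derived from the mean per-base error probability; the exact computation used by Guppy or Epi2me is not described in this article, and the function names are hypothetical.

```python
import math


def mean_qscore(quality: str) -> float:
    """Read-level Q score from a Phred+33 quality string.

    Per-base Phred values are converted to error probabilities, averaged,
    and converted back to the Phred scale (an assumed convention).
    """
    if not quality:
        raise ValueError("empty quality string")
    error_probs = [10 ** (-(ord(ch) - 33) / 10) for ch in quality]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def passes_qscore(quality: str, min_qscore: float = 7.0) -> bool:
    """True when the read-level score meets the minimum QSCORE."""
    return mean_qscore(quality) >= min_qscore
```

Averaging error probabilities rather than Phred values means that a few low-quality bases pull the read score down more strongly than a plain arithmetic mean would.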
(iii) Data analysis using BLAST+ (NanoBLAST+).
In addition to Epi2me, sequencing data were analyzed using BLAST+ (v2.11.0), similar to the analysis of Illumina data. As each sample generated multiple fastq files in a sequencing run, the fastq files of each sample were first merged into a single fastq file and then converted to a fasta file before being aligned to reference sequences in the database.
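The merge-and-convert step can be sketched in Python as below. This is an illustrative stand-in, not the authors' script; it assumes uncompressed FASTQ files with exactly four lines per record, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return                      # end of file
        if not header.strip():
            continue                    # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()               # '+' separator line (ignored)
        handle.readline()               # quality line (not needed for FASTA)
        yield header[1:].strip().split(" ")[0], sequence


def fastq_files_to_fasta(fastq_paths: Iterable[Path], fasta_path: Path) -> int:
    """Concatenate several FASTQ files of one sample into a single FASTA file.

    Returns the number of reads written.
    """
    written = 0
    with open(fasta_path, "w") as out:
        for path in fastq_paths:
            with open(path) as handle:
                for read_id, sequence in read_fastq(handle):
                    out.write(f">{read_id}\n{sequence}\n")
                    written += 1
    return written
```

Merging and converting in a single pass avoids writing an intermediate merged FASTQ file; the end result for the aligner is the same single FASTA file per sample.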
(iv) Data analysis using NanoCLUST.
Samples with disagreement between Epi2me and NanoBLAST+ were further analyzed using another pipeline, NanoCLUST (https://github.com/genomicsITER/NanoCLUST) (23), to obtain the resolved Nanopore 16S identity. Unlike Epi2me and NanoBLAST+, NanoCLUST does not classify individual reads in a sample. Instead, NanoCLUST forms clusters of similar reads and classifies the consensus sequence of each cluster.
WGS.
Samples with completely discordant taxa, as inferred by Sanger 16S, Illumina 16S, and Nanopore 16S tests, were subjected to whole-genome sequencing (WGS) to confirm the definite identities using the ONT platform. Library preparation was performed using the transposase-based rapid barcoding kit (SQK-RBK110.96) from ONT in accordance with the manufacturer’s protocol. After pooling and adapter ligation, the library was loaded on the FLO-MIN106 R9.4.1 flow cell and sequenced using the GridION device for 48 h in high-accuracy base-calling mode. The passed fastq files were uploaded to Epi2me and analyzed using the WIMP workflow (v2021.03.05).
De novo assembly for WGS data sets.
Sequencing reads of each sample were assembled using Shasta (v0.7.0) (https://github.com/chanzuckerberg/shasta). Sequencing reads were realigned to the assembled consensus sequences using minimap2 (v2.17-r941) and samtools (v1.10). Consensus sequences were first polished using MarginPolish (v1.3.dev-5492204) (https://github.com/UCSC-nanopore-cgl/MarginPolish) and then further polished using homopolish (v0.2.1) (https://github.com/ythuang0522/homopolish) (24). To avoid bioinformatics bias in de novo assembly, each sample was also subjected to a second analysis pipeline. In brief, the sequencing reads were assembled using miniasm (v0.3-r179) (https://github.com/lh3/miniasm/releases/tag/v0.3). All-versus-all read self-mapping was performed using minimap2. Raw consensus sequences were then generated using miniasm. After realignment of the raw reads to consensus sequences using minimap2, the consensus sequences were polished twice using racon (v1.4.3) (https://github.com/isovic/racon).
The longest polished consensus sequences of each sample were classified using BLAST+ (v2.11.0) with the Prokaryotic RefSeq Genomes database downloaded from the NCBI. For each sample, the top classified species was reported together with its query coverage and percentage identity. The average nucleotide identity (ANI) between the query and best-matched reference genomes was calculated using an ANI calculator (https://www.ezbiocloud.net/tools/ani) (25). An ANI of >94% indicated that the sample belonged to the same species as the best-matched genome.
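The ANI decision rule can be written as a one-line check. This is a minimal sketch; the function name is hypothetical, and only the >94% threshold comes from the text.

```python
ANI_SPECIES_THRESHOLD = 94.0  # percent; threshold stated in the text


def same_species(ani_percent: float, threshold: float = ANI_SPECIES_THRESHOLD) -> bool:
    """True when the ANI to the best-matched genome exceeds the species threshold."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent > threshold
```

For example, `same_species(98.71)` is true and `same_species(78.29)` is false, using two of the Shasta-assembly ANI values reported in Table 1.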
Data and statistical analysis.
The top classified taxa obtained from Illumina and Nanopore data sets were compared with those inferred by Sanger 16S using built-in programs and BLAST+ for analysis. Species-level concordance between the HTS and Sanger workflows was calculated. For samples that did not match at the species level, concordance at the genus or family level was determined.
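The comparison described above can be sketched as a small Python helper that reports the lowest taxonomic rank at which two identifications agree, plus the percentage calculation. It is an illustration only: names are assumed to be written as "Genus species", and the genus-to-family lookup `family_of` is a hypothetical input that the caller must supply.

```python
from typing import Mapping


def match_level(test_name: str, sanger_name: str, family_of: Mapping[str, str]) -> str:
    """Return 'species', 'genus', 'family', or 'unmatched'.

    Both names are expected in the form 'Genus species'.
    """
    test_genus = test_name.split()[0]
    sanger_genus = sanger_name.split()[0]
    if test_name == sanger_name:
        return "species"
    if test_genus == sanger_genus:
        return "genus"
    test_family = family_of.get(test_genus)
    if test_family is not None and test_family == family_of.get(sanger_genus):
        return "family"
    return "unmatched"


def concordance_percent(matched: int, total: int) -> float:
    """Percentage of samples that matched, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * matched / total, 2)
```

As a check against the reported figures, `concordance_percent(57, 172)` gives 33.14 and `concordance_percent(151, 172)` gives 87.79.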
To assess diagnostic accuracy, a composite 16S rRNA sequencing result obtained from the three sequencing platforms was considered the reference standard. Identical species obtained by at least two sequencing platforms were considered reference taxa. For samples with completely discordant species inferred by the three sequencing platforms, WGS was conducted to confirm the reference taxa.
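The composite reference rule (agreement of at least two platforms, otherwise WGS) can be sketched in Python. This is a minimal illustration; the function name and the use of `None` for "no confirmed reference taxon" are assumptions.

```python
from collections import Counter
from typing import Optional


def reference_taxon(
    sanger: str,
    illumina: str,
    nanopore: str,
    wgs: Optional[str] = None,
) -> Optional[str]:
    """Return the reference species for one sample.

    A species reported by at least two of the three 16S platforms is taken
    as the reference; otherwise the WGS-confirmed species is used, or None
    when WGS could not confirm a species.
    """
    species, votes = Counter([sanger, illumina, nanopore]).most_common(1)[0]
    if votes >= 2:
        return species
    return wgs
```

For sample R003 in Table 2, for example, Illumina and Nanopore both report Pseudoglutamicibacter albus, so that species becomes the reference even though Sanger 16S reports Pseudoglutamicibacter cumminsii. A real implementation would also need to exclude non-answers such as "No reliable ID" from the vote.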
RESULTS
Statistics of sequencing reads generated from the Illumina and Nanopore workflows.
Based on the default analysis of MSR, the Illumina platform generated an average of 113,381 reads per sample. After merging the paired-end reads and filtering out unwanted reads with undesired read lengths and ambiguous bases, an average of 68,652 filtered reads per sample was retained for Illumina_BLAST+ analysis.
The Nanopore MinKNOW platform generated an average of 51,769 reads (QSCORE ≥ 7) per sample, whereas an average of 51,419 reads (QSCORE ≥ 7) per sample was analyzed in the FASTQ 16S workflow in Epi2me. This slight difference in the average number of reads per sample arose because Epi2me and Guppy of MinKNOW use different algorithms in the demultiplexing step. An average of 51,769 reads per sample was analyzed using NanoBLAST+. The total number of reads and the number of classified reads of each sample on both sequencing platforms are shown in Table S3 in the supplemental material.
Taxonomic resolution of sequencing reads.
The percentage distribution of classified reads via both sequencing platforms is shown in Fig. 1. On average, only 45.74% of the total reads of a sample were successfully classified at the species level by MSR with reference to the Greengenes database. After merging paired-end reads and quality filtering, 94.02% of filtered reads were classified at the species level by Illumina_BLAST+ with reference to the NCBI 16S rRNA database.
In the Nanopore workflow, both Epi2me and NanoBLAST+ use the NCBI 16S rRNA database for classification of long-read sequencing data. An average of 76.03% of total reads were classified at the species level in Epi2me, compared with 53.56% in NanoBLAST+.
Concordance in bacterial speciation: Illumina 16S and Nanopore 16S versus Sanger 16S.
The top-ranked species obtained from the Illumina 16S and Nanopore 16S workflows, coupled with the respective analysis pipelines, are listed in Table S3. The percentage of samples that matched Sanger 16S results at each of the species, genus, and family levels is illustrated in Fig. 2. The concordance in species-level identification among the sequencing platforms is shown in Fig. 3. Overall, in terms of concordance with the Sanger 16S result, Nanopore 16S was better than Illumina 16S, regardless of analysis pipeline.
For the Illumina 16S workflow, MSR and Illumina_BLAST+ demonstrated concordances of 33.14% (57/172) and 65.70% (113/172), respectively, with Sanger 16S in species-level identification. A total of 9.30% of samples (16/172) were unmatched, even at the family level, in MSR, whereas all samples matched at the family level or below in Illumina_BLAST+. Of note, concordance between the results of MSR and Illumina_BLAST+ was low; only 32.56% of samples (56/172) showed a matched result among the classified species from these two analysis pipelines. Moreover, only 28.49% of samples (49/172) showed complete agreement in the classified species among the MSR, Illumina_BLAST+, and Sanger data sets.
For the 116 samples with discrepant taxa inferred by MSR and Illumina_BLAST+, nf-core/ampliseq was used to resolve their identities. However, only 41 samples were classified at the species level by nf-core/ampliseq. Of the 116 samples, 28 (24.14%) matched the results of Illumina_BLAST+ and 4 (3.45%) matched the results of MSR. The remaining nine species-level classifications failed to reach agreement with either pipeline at the species level, but all of them matched the results of Illumina_BLAST+ at the genus level. A total of 75 samples were classified only at the genus level or above by nf-core/ampliseq, and all of them matched the genus or family inferred by Illumina_BLAST+. Concordance between the resolved Illumina 16S and Sanger 16S results was 63.95% (110/172).
For Nanopore 16S, concordances of 87.79% (151/172) and 83.14% (143/172) at the species level were achieved with Epi2me and NanoBLAST+, respectively. With each of the two pipelines, 1.16% of samples (2/172) were unmatched even at the family level. Concordance between the results of Epi2me and NanoBLAST+ was 80.23% (138/172). Additionally, 76.74% of samples (132/172) showed agreement in the classified species among the Epi2me, NanoBLAST+, and Sanger data sets.
A total of 34 samples showed disagreement in the classified species inferred by Epi2me and NanoBLAST+. The respective Nanopore data were further analyzed using NanoCLUST to resolve the discrepancies. NanoCLUST agreed with Epi2me and NanoBLAST+ in 13 (38.24%) and 17 (50.00%) samples, respectively. Four samples failed to reach agreement in terms of species-level identification, of which three were matched in terms of genus-level identification and one was considered as having no reliable bacterial identification. Concordance between the resolved Nanopore 16S and Sanger 16S results was 89.53% (154/172).
WGS for bacterial isolates with discrepant species-level identification.
Eight samples (4.65% [8/172]) showed complete discordance in bacterial species, as inferred by the three 16S rRNA sequencing workflows. WGS was conducted to identify definite taxa. To validate the transposase-based rapid sequencing protocol for bacterial genome construction, two ATCC reference strains, namely, Klebsiella pneumoniae BAA3079 and Staphylococcus aureus BAA3114, were sequenced and analyzed in parallel with the eight discordant samples. Both reference strains successfully yielded consensus sequences of >3 Mb, which covered 94% of the genomes of the respective target organisms with 99% identity. This indicated that the WGS protocol was able to construct reliable consensus prokaryotic genomes (Table 1).
TABLE 1.
Sample ID | Species inferred by Sanger 16S | Species inferred by resolved Illumina 16S (c) | Species inferred by resolved Nanopore 16S (d) | Best-matched species by WGS (reference genome accession no.) | Shasta assembly, query coverage (%) | Shasta assembly, identity (%) | Shasta assembly, ANI (%) (e) | Miniasm assembly, query coverage (%) | Miniasm assembly, identity (%) | Miniasm assembly, ANI (%) (e) |
---|---|---|---|---|---|---|---|---|---|---|
Klebsiella pneumoniae BAA3079 (b) | NA | NA | NA | Klebsiella pneumoniae (NC_016845.1) | 99.00 | 97.00 | 98.92 | 92.13 | 99.40 | 99.14 |
Staphylococcus aureus BAA3114 (b) | NA | NA | NA | Staphylococcus aureus (NC_007795.1) | 94.06 | 99.95 | 99.30 | 88.39 | 99.92 | 99.23 |
R001 | Kocuria koreensis | Kocuria massiliensis | Kocuria spp. | Kocuria massiliensis (NZ_LT835161.1) | 42.21 | 87.44 | 78.29 | 42.42 | 87.41 | 78.55 |
R006 | Kocuria koreensis | Kocuria massiliensis | Kocuria spp. | Kocuria massiliensis (NZ_LT835161.1) | 43.04 | 79.12 | 78.49 | 42.04 | 87.49 | 78.44 |
R062 | Klebsiella grimontii | Enterobacter cloacae | Yokenella regensburgei | Klebsiella michiganensis (NZ_CP060111.1) | 92.17 | 99.17 | 98.71 | 86.30 | 98.99 | 98.69 |
R120 | Brachybacterium conglomeratum | Brachybacterium faecium | Brachybacterium paraconglomeratum | Brachybacterium saurashtrense (NZ_CP031356.1) | 62.15 | 85.18 | 82.30 | 62.30 | 85.12 | 82.39 |
R121 | Schaalia odontolytica | Schaalia vaccimaxillae | Sphingomonas paucimobilis | Schaalia odontolytica (NZ_CP046315.1) | 6.07 | 78.55 | 70.34 | 6.04 | 78.24 | 70.86 |
R131 | Schaalia odontolytica | Schaalia vaccimaxillae | No reliable ID | Schaalia odontolytica (NZ_CP046315.1) | 6.19 | 82.12 | 71.21 | 6.29 | 78.25 | 71.26 |
R158 | Microbacterium ginsengiterrae | Microbacterium assamensis | Microbacterium foliorum | Microbacterium foliorum (NZ_CP041040.1) | 65.41 | 84.52 | 82.24 | 65.21 | 84.51 | 82.15 |
R181 | Sphingomonas yabuuchiae | Sphingomonas paucimobilis | Sphingomonas sanguinis | Sphingomonas hominis (NZ_JABULH010000007.1) | 31.48 | 89.67 | 82.09 | 30.68 | 89.59 | 81.95 |
(a) ID, identification; NA, not applicable.
(b) Klebsiella pneumoniae BAA3079 and Staphylococcus aureus BAA3114 served as quality control samples, which were sequenced and analyzed in parallel with the discordant samples for WGS and bioinformatics analysis.
(c) Discordant samples between MSR and Illumina_BLAST+ were resolved by nf-core/ampliseq.
(d) Discordant samples between Epi2me and NanoBLAST+ were resolved by NanoCLUST.
(e) An average nucleotide identity (ANI) of >94% indicated that the samples belonged to the same species as the best-matched genomes.
Interestingly, seven of these samples failed to match the published bacterial genomes, with query coverage of <70% for the longest consensus sequences (Table 1). The average nucleotide identities (ANIs) to the best-matched genomes were <85%, well below the >94% threshold for assignment to the same species, suggesting that these seven “difficult-to-identify” isolates were likely novel bacterial species. As the definite bacterial species could not be confirmed, these samples were excluded from the subsequent diagnostic evaluation.
The consensus sequence of one sample (R062) showed an overall query coverage of >92%, with 99.17% identity to Klebsiella michiganensis (NZ_CP060111.1). As the ANI reached 98.71%, K. michiganensis was considered the reference taxon for this sample.
Diagnostic accuracy of the three 16S rRNA sequencing workflows.
The composite of 16S rRNA sequencing and WGS results was regarded as the reference standard for calculating the diagnostic accuracy. The discordant samples between each sequencing platform and the reference standards are listed in Table 2.
TABLE 2.
Sample ID | Species-level ID (reference standard) | Sanger sequencing: classified species from Sanger 16S (a) | Sanger sequencing: 16S identity against the reference (%) | Illumina sequencing: classified species from resolved Illumina 16S (a) | Illumina sequencing: 16S identity against the reference (%) | Nanopore sequencing: classified species from resolved Nanopore 16S (a) | Nanopore sequencing: 16S identity against the reference (%) |
---|---|---|---|---|---|---|---|
R003 | Pseudoglutamicibacter albus | Pseudoglutamicibacter cumminsii | 99.26 | Pseudoglutamicibacter albus | Matched | Pseudoglutamicibacter albus | Matched |
R013 | Microbacterium hominis | Microbacterium hominis | Matched | Microbacterium aerolatum | 97.47 | Microbacterium hominis | Matched |
R017 | Microbacterium hominis | Microbacterium hominis | Matched | Microbacterium aerolatum | 97.47 | Microbacterium hominis | Matched |
R021 | Microbacterium hominis | Microbacterium hominis | Matched | Microbacterium aerolatum | 97.47 | Microbacterium hominis | Matched |
R024 | Bacillus idriensis | Bacillus idriensis | Matched | Bacillus idriensis | Matched | Bacillus indicus | 97.62 |
R025 | Varibaculum cambriense | Varibaculum cambriense | Matched | Varibaculum anthropi | 98.50 | Varibaculum cambriense | Matched |
R026 | Varibaculum cambriense | Varibaculum cambriense | Matched | Varibaculum anthropi | 98.50 | Varibaculum cambriense | Matched |
R036 | Corynebacterium lowii | Corynebacterium lowii | Matched | Corynebacterium bovis | 93.29 | Corynebacterium lowii | Matched |
R040 | Weissella cibaria | Weissella cibaria | Matched | Weissella confusa | 99.26 | Weissella cibaria | Matched |
R043 | Proteus vulgaris | Proteus vulgaris | Matched | Proteus alimentorum | 99.64 | Proteus vulgaris | Matched |
R045 | Brucella microti | Brucella microti | Matched | Brucella papionis | 99.86 | Brucella microti | Matched |
R047 | Proteus cibarius | Proteus cibarius | Matched | Proteus terrae | 99.65 | Proteus cibarius | Matched |
R049 | Dermacoccus barathri | Dermacoccus barathri | Matched | Dermacoccus profundi | 99.86 | Dermacoccus barathri | Matched |
R052 | Arcanobacterium wilhelmae | Arcanobacterium wilhelmae | Matched | Arcanobacterium pinnipediorum | 96.60 | Arcanobacterium wilhelmae | Matched |
R053 | Dermacoccus barathri | Dermacoccus barathri | Matched | Dermacoccus profundi | 99.86 | Dermacoccus barathri | Matched |
R056 | Corynebacterium simulans | Corynebacterium simulans | Matched | Corynebacterium glutamicum | 93.74 | Corynebacterium simulans | Matched |
R058 | Corynebacterium mastitidis | Corynebacterium mastitidis | Matched | Corynebacterium tuberculostearicum | 94.67 | Corynebacterium mastitidis | Matched |
R062 | Klebsiella michiganensis | Klebsiella grimontii | 99.20 | Enterobacter cloacae | 97.07 | Yokenella regensburgei | 98.56 |
R063 | Corynebacterium pilbarense | Corynebacterium pilbarense | Matched | Corynebacterium coyleae | 98.04 | Corynebacterium pilbarense | Matched |
R069 | Eikenella corrodens | Eikenella corrodens | Matched | Eikenella halliae | 98.69 | Eikenella corrodens | Matched |
R071 | Corynebacterium xerosis | Corynebacterium hansenii | 99.07 | Corynebacterium xerosis | Matched | Corynebacterium xerosis | Matched |
R072 | Mycolicibacterium fortuitum | Mycolicibacterium fortuitum | Matched | Mycolicibacterium arcueilense | 98.96 | Mycolicibacterium fortuitum | Matched |
R073 | Tessaracoccus oleiagri | Tessaracoccus oleiagri | Matched | Tessaracoccus flavescens | 95.95 | Tessaracoccus oleiagri | Matched |
R078 | Vagococcus teuberi | Vagococcus teuberi | Matched | Vagococcus martis | 99.22 | Vagococcus teuberi | Matched |
R079 | Corynebacterium xerosis | Corynebacterium hansenii | 99.07 | Corynebacterium xerosis | Matched | Corynebacterium xerosis | Matched |
R083 | Tessaracoccus oleiagri | Tessaracoccus oleiagri | Matched | Tessaracoccus flavescens | 95.95 | Tessaracoccus oleiagri | Matched |
R086 | Raoultella planticola | Raoultella planticola | Matched | Raoultella planticola | Matched | Klebsiella aerogenes | 99.06 |
R094 | Corynebacterium xerosis | Corynebacterium hansenii | 99.07 | Corynebacterium xerosis | Matched | Corynebacterium xerosis | Matched |
R096 | Streptomyces thermodiastaticus | Streptomyces thermodiastaticus | Matched | Streptomyces thermoviolaceus | 98.86 | Streptomyces thermodiastaticus | Matched |
R097 | Pseudoxanthomonas helianthi | Pseudoxanthomonas helianthi | Matched | Pseudoxanthomonas spadix | 97.04 | Pseudoxanthomonas helianthi | Matched |
R098 | Brachybacterium huguangmaarense | Brachybacterium huguangmaarense | Matched | Brachybacterium huguangmaarense | Matched | Brachybacterium nesterenkovii | 97.84 |
R104 | Gordonia sputi | Gordonia sputi | Matched | Gordonia otitidis | 99.07 | Gordonia sputi | Matched |
R105 | Gordonia sputi | Gordonia sputi | Matched | Gordonia otitidis | 99.07 | Gordonia sputi | Matched |
R107 | Moraxella osloensis | Moraxella osloensis | Matched | Enhydrobacter aerosaccus | 99.19 | Moraxella osloensis | Matched |
R108 | Staphylococcus saccharolyticus | Staphylococcus saccharolyticus | Matched | Staphylococcus epidermidis | 99.19 | Staphylococcus saccharolyticus | Matched |
R112 | Citrobacter sedlakii | Citrobacter sedlakii | Matched | Citrobacter youngae | 98.32 | Citrobacter sedlakii | Matched |
R116 | Tsukamurella tyrosinosolvens | Tsukamurella tyrosinosolvens | Matched | Tsukamurella ocularis | 99.86 | Tsukamurella tyrosinosolvens | Matched |
R123 | Pseudoglutamicibacter albus | Pseudoglutamicibacter cumminsii | 99.26 | Pseudoglutamicibacter albus | Matched | Pseudoglutamicibacter albus | Matched |
R133 | Nocardia brasiliensis | Nocardia brasiliensis | Matched | Nocardia vulneris | 99.31 | Nocardia brasiliensis | Matched |
R140 | Moraxella lacunata | Moraxella lacunata | Matched | Moraxella equi | 99.38 | Moraxella lacunata | Matched |
R141 | Ottowia beijingensis | Ottowia beijingensis | Matched | Brachymonas denitrificans | 93.33 | Ottowia beijingensis | Matched |
R148 | Moraxella osloensis | Moraxella osloensis | Matched | Enhydrobacter aerosaccus | 99.19 | Moraxella osloensis | Matched |
R149 | Ornithinibacillus californiensis | Ornithinibacillus californiensis | Matched | Ornithinibacillus scapharcae | 98.48 | Ornithinibacillus californiensis | Matched |
R151 | Dermacoccus barathri | Dermacoccus barathri | Matched | Dermacoccus profundi | 99.86 | Dermacoccus barathri | Matched |
R153 | Corynebacterium mastitidis | Corynebacterium mastitidis | Matched | Corynebacterium tuberculostearicum | 94.67 | Corynebacterium mastitidis | Matched |
R167 | Moraxella osloensis | Moraxella osloensis | Matched | Enhydrobacter aerosaccus | 99.19 | Moraxella osloensis | Matched |
R175 | Corynebacterium pollutisoli | Corynebacterium pollutisoli | Matched | Corynebacterium humireducens | 98.07 | Corynebacterium pollutisoli | Matched |
R176 | Tsukamurella ocularis | Tsukamurella ocularis | Matched | Tsukamurella ocularis | Matched | Tsukamurella hominis | 100.00 |
R178 | Acinetobacter soli | Acinetobacter soli | Matched | Acinetobacter soli | Matched | Acinetobacter lactucae | 97.82 |
R179 | Corynebacterium lipophiloflavum | Corynebacterium lipophiloflavum | Matched | Corynebacterium mycetoides | 97.16 | Corynebacterium lipophiloflavum | Matched |
R180 | Corynebacterium mastitidis | Corynebacterium mastitidis | Matched | Corynebacterium tuberculostearicum | 94.67 | Corynebacterium mastitidis | Matched |
R182 | Fusobacterium nucleatum | Fusobacterium nucleatum | Matched | Fusobacterium canifelinum | 98.34 | Fusobacterium nucleatum | Matched |
R183 | Parabacteroides faecis | Parabacteroides faecis | Matched | Parabacteroides chongii | 97.15 | Parabacteroides faecis | Matched |
R190 | Bacillus xiamenensis | Bacillus xiamenensis | Matched | Bacillus aerius | 97.16 | Bacillus xiamenensis | Matched |
R192 | Corynebacterium pilbarense | Corynebacterium pilbarense | Matched | Corynebacterium ureicelerivorans | 98.85 | Corynebacterium pilbarense | Matched |
R204 | Prevotella scopos | Prevotella scopos | Matched | Prevotella melaninogenica | 98.10 | Prevotella scopos | Matched |
R205 | Pasteurella multocida | Pasteurella multocida | Matched | Pasteurella stomatis | 93.74 | Pasteurella multocida | Matched |
R206 | Staphylococcus cohnii | Staphylococcus cohnii | Matched | Staphylococcus auricularis | 98.16 | Staphylococcus cohnii | Matched |
R208 | Achromobacter denitrificans | Achromobacter denitrificans | Matched | Achromobacter xylosoxidans | 99.15 | Achromobacter denitrificans | Matched |
R210 | Bacillus licheniformis | Bacillus licheniformis | Matched | Bacillus piscis | 97.37 | Bacillus licheniformis | Matched |
Mismatched taxa are underlined. Sanger, Illumina, and Nanopore 16S rRNA sequencing results are shown.
The diagnostic performance of each sequencing workflow is summarized in Table 3. For the Illumina platform, the diagnostic accuracies of MSR and Illumina_BLAST+ were 35.76% and 71.52%, respectively. Notably, the diagnostic accuracy of resolved Illumina 16S was even lower than that of Illumina_BLAST+ alone (69.70% versus 71.52%), suggesting that Illumina_BLAST+ was the best-performing analysis pipeline for Illumina 16S.
TABLE 3.
Sequencing method | No. of samples analyzed | No. of samples with matched taxa | Diagnostic accuracy (%) | 95% CI^c | P value (chi-square test)^d |
---|---|---|---|---|---|
Sanger 16S | 165 | 159 | 96.36 | 92.25–98.65 | |
Resolved Illumina 16S^a | 165 | 115 | 69.70 | 62.07–76.60 | <0.0001* |
Analyzed by MSR | 165 | 59 | 35.76 | 28.46–43.58 | |
Analyzed by Illumina_BLAST+ | 165 | 118 | 71.52 | 63.98–78.26 | |
Resolved Nanopore 16S^b | 165 | 159 | 96.36 | 92.25–98.65 | 0.0291* |
Analyzed by Epi2me | 165 | 147 | 89.09 | 83.31–93.41 | |
Analyzed by NanoBLAST+ | 165 | 148 | 89.70 | 84.02–93.88 | |
^a Discordant samples between MSR and Illumina_BLAST+ were analyzed by nf-core/ampliseq; classified species in nf-core/ampliseq were considered resolved identities in Illumina workflow.
^b Discordant samples between Epi2me and NanoBLAST+ were analyzed by NanoCLUST; classified species in NanoCLUST were considered resolved identities in Nanopore workflow.
^c CI, confidence interval.
^d *, P < 0.05, statistically significantly different from Sanger 16S results.
For the Nanopore platform, the diagnostic accuracies of Epi2me and nanoBLAST+ were 89.09% and 89.70%, respectively. The diagnostic accuracy of resolved Nanopore 16S was 96.36%, which was the same as that of Sanger sequencing.
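The accuracies in Table 3 are simple proportions of matched samples. As a minimal sketch, the Python snippet below recomputes them from the reported counts and derives an exact binomial interval for one row. The choice of the Clopper-Pearson method and the helper names (`accuracy_pct`, `binom_cdf`, `exact_ci`) are assumptions made for illustration, since the method used for the 95% CIs is not stated in this section.

```python
from math import comb

# Counts of (matched, analyzed) samples as reported in Table 3.
TABLE3_COUNTS = {
    "Sanger 16S": (159, 165),
    "Resolved Illumina 16S": (115, 165),
    "MSR": (59, 165),
    "Illumina_BLAST+": (118, 165),
    "Resolved Nanopore 16S": (159, 165),
    "Epi2me": (147, 165),
    "NanoBLAST+": (148, 165),
}


def accuracy_pct(matched: int, analyzed: int) -> float:
    """Diagnostic accuracy as a percentage, rounded to two decimals."""
    return round(100 * matched / analyzed, 2)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval (assumed method), solved by bisection."""

    def solve(cdf_at, target):
        # The binomial CDF decreases monotonically as p increases.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


for name, (matched, analyzed) in TABLE3_COUNTS.items():
    print(f"{name}: {accuracy_pct(matched, analyzed):.2f}%")
```

Calling `exact_ci(159, 165)` returns the lower and upper bounds as proportions, which can be compared with the interval reported for Sanger 16S and resolved Nanopore 16S.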
Comparison of sample-to-report time and running cost of the two HTS technologies.
The Illumina platform enables sequencing of up to 384 samples per run, whereas, owing to the limited choice of sequencing barcodes, the Nanopore platform can support only a batch of 24 samples per run. Without considering the time for DNA extraction, it took 78 h for the Illumina workflow to generate sequencing data for each run (Fig. 4). With the Nanopore platform, the sequencing workflow required 8.25 h. Of note, although base-calling and Epi2me analyses are real-time processes, their speed depends heavily on the computing power of the computer used. Nanopore sequencing can, however, be stopped once sufficient reads have been generated.
The running cost of the Nanopore workflow is lower than that of the Illumina workflow. The cost of the Illumina workflow per sequencing run is US$4,931 (172 samples), so the cost per sample is approximately US$28.7. If the sample size is increased to 384, the cost of the Illumina workflow per sequencing run is US$8,279, and the cost per sample is reduced to US$21.6. For the Nanopore workflow, the cost per sequencing run (24 samples) is US$424, which means that the cost per sample is approximately US$17.7.
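The per-sample figures above follow directly from dividing the run cost by the batch size. A short Python sketch of that arithmetic is given below; the dictionary layout and the function name `cost_per_sample` are illustrative choices, and only the run costs and batch sizes are taken from the text.

```python
# Run cost in US$ and number of samples per run, as stated in the text.
RUNS = {
    "Illumina, 172 samples": (4931, 172),
    "Illumina, 384 samples": (8279, 384),
    "Nanopore, 24 samples": (424, 24),
}


def cost_per_sample(run_cost_usd: float, n_samples: int) -> float:
    """Cost per sample for one sequencing run."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return run_cost_usd / n_samples


for label, (run_cost, n_samples) in RUNS.items():
    print(f"{label}: US${cost_per_sample(run_cost, n_samples):.1f} per sample")
```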
DISCUSSION
Although the majority of bacterial pathogens can be identified by MALDI-TOF MS, 16S rRNA gene sequencing is needed in clinical microbiology laboratories to confirm the identities of “difficult-to-identify” clinical isolates. With reduced costs, simplified protocols, and automated bioinformatics pipelines, HTS has been proposed as a better alternative to traditional Sanger sequencing for sequence-based bacterial identification in clinical laboratories. This is the first study to compare the performances and evaluate the clinical utilities of two commercially available high-throughput 16S rRNA gene sequencing assays with built-in analysis software for taxonomic assignment of bacterial pathogens that are unidentifiable using MALDI-TOF MS.
In order to evaluate the performance of the built-in analysis pipelines from Illumina (MSR) and Nanopore (Epi2me) platforms, the sequencing data from both platforms were also analyzed using BLAST+. With the same analysis approach as that of MSR and Epi2me (read-by-read classification) and the applicability to both Illumina and Nanopore data, BLAST+ is a good analysis tool for intra- and interplatform comparisons. The full analysis workflow is illustrated in Fig. 5.
The results from Illumina and Nanopore platforms were compared with Sanger 16S results (Fig. 5a). With the Illumina platform, the concordance of the classified species between MSR and Sanger 16S was exceptionally low; only 33.14% of samples matched the Sanger result for the top classified species, compared with 65.70% when using Illumina_BLAST+. As described in previous studies, the use of different bioinformatics tools and 16S rRNA sequence databases could result in different taxonomic assignments, especially at lower taxonomic levels (26, 27). The latest version of the Greengenes database for MSR was updated in 2013 and does not contain certain new bacterial taxa, which accounts for the poor agreement of this workflow compared with others (27). Nevertheless, mismatches between Illumina and Sanger sequencing were observed in 34.30% of samples, even when the same aligner (i.e., BLAST) and database (i.e., NCBI 16S bacterial database) were used.
The Nanopore 16S workflow demonstrated considerably higher concordance with the Sanger 16S workflow than the Illumina 16S workflow did, regardless of the analysis pipeline used. In contrast to the built-in analysis on the Illumina platform (i.e., MSR), the performance of Epi2me with Nanopore 16S was comparable to that of nanoBLAST+ (83.14%), with 87.79% of samples matching the Sanger results for the top classified species. Notably, species-level disagreement between Epi2me and nanoBLAST+ was observed in 34 samples (19.77%).
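Concordance here is the share of samples for which the top classified species equals the Sanger 16S call. The Python sketch below shows that calculation on four rows taken from Table 2 (R058, R062, R071, and R086). The function name `concordance` and the dictionary layout are hypothetical, and because Table 2 lists only discordant samples, the resulting percentages illustrate the calculation and are not the study-wide figures.

```python
def concordance(reference: dict, test: dict) -> float:
    """Percentage of shared samples whose species call matches the reference."""
    shared = reference.keys() & test.keys()
    if not shared:
        raise ValueError("no samples in common")
    matches = sum(reference[sample] == test[sample] for sample in shared)
    return 100 * matches / len(shared)


# Species calls for four isolates, copied from Table 2.
sanger = {
    "R058": "Corynebacterium mastitidis",
    "R062": "Klebsiella grimontii",
    "R071": "Corynebacterium hansenii",
    "R086": "Raoultella planticola",
}
illumina = {
    "R058": "Corynebacterium tuberculostearicum",
    "R062": "Enterobacter cloacae",
    "R071": "Corynebacterium xerosis",
    "R086": "Raoultella planticola",
}
nanopore = {
    "R058": "Corynebacterium mastitidis",
    "R062": "Yokenella regensburgei",
    "R071": "Corynebacterium xerosis",
    "R086": "Klebsiella aerogenes",
}

print(f"Illumina vs Sanger: {concordance(sanger, illumina):.2f}%")
print(f"Nanopore vs Sanger: {concordance(sanger, nanopore):.2f}%")
```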
One may argue that, given its low sequencing depth, the Sanger 16S result alone should not be considered the final reference. We therefore used as the reference standard a composite of the 16S sequencing results generated by the three platforms, with any discrepancies resolved by WGS, to determine the diagnostic accuracy of the HTS workflows (Fig. 5b and c).
The discrepant samples between MSR and Illumina_BLAST+ were further analyzed by nf-core/ampliseq. This pipeline classifies reads using an error-correcting amplicon sequence variant (ASV) approach, which showed better performance in taxonomic classification than the operational taxonomic unit (OTU) clustering approach in the study by Straub et al. (21). However, there was no improvement in the diagnostic accuracy when the resolved Illumina 16S was compared with the reference standard. Regardless of the classification approach, the diagnostic accuracy of the Illumina workflow was still restricted by the length and position of the variable regions of the 16S gene fragment being sequenced.
As indicated by Johnson et al., although some subregions (e.g., V1 to V3) of the 16S rRNA gene provide a reasonable approximation of 16S diversity, most do not capture sufficient sequence variation to discriminate between closely related taxa. Also, different subregions show bias in the bacterial taxa that can be identified (28). In this study, V3 and V4 regions might perform poorly in classifying the genera of discordant samples (Table 2) down to the species level. However, Illumina_BLAST+ showed a high concordance to the reference at the genus level (98.79%), meaning that the genus-level identification of the Illumina platform is credible.
Epi2me and BLAST+ rely on read-by-read alignment to reference sequences in the database. As the base-calling accuracy of Nanopore sequencing is relatively low, the prevalence of sequencing errors in Nanopore reads could limit its ability to resolve highly similar sequences. In contrast, NanoCLUST generates clusters based on uniform manifold approximation and projection (UMAP) and classifies the representative consensus read in each cluster using BLAST. The effect of sequencing errors in individual sequences can be minimized by forming clusters, which reduces the chance of misclassification. When the species resolved using NanoCLUST were compared with the reference standard, the diagnostic accuracy improved from 89.09% (Epi2me) and 89.70% (nanoBLAST+) to 96.36%.
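The intuition that a consensus suppresses random per-read errors can be shown with a toy example. The Python sketch below takes a per-position majority vote over a handful of equal-length reads that each carry a different substitution. It is not NanoCLUST's actual procedure: the sequences are invented, the function name `majority_consensus` is hypothetical, and real Nanopore reads contain insertions and deletions that require alignment before any consensus can be called.

```python
from collections import Counter


def majority_consensus(reads: list) -> str:
    """Per-position majority vote over pre-aligned reads of equal length."""
    if not reads:
        raise ValueError("at least one read is required")
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Invented 12-base "true" sequence and five noisy copies of it.
TRUE_SEQ = "ACGTTGCAAGCT"
NOISY_READS = [
    "ACGTTGCAAGCT",  # error-free
    "ACCTTGCAAGCT",  # substitution at index 2
    "ACGTTGCTAGCT",  # substitution at index 7
    "ACGTAGCAAGCT",  # substitution at index 4
    "ACGTTGCAAGGT",  # substitution at index 10
]

consensus = majority_consensus(NOISY_READS)
reads_with_errors = sum(read != TRUE_SEQ for read in NOISY_READS)
print(f"{reads_with_errors} of {len(NOISY_READS)} reads contain an error")
print(f"consensus matches the true sequence: {consensus == TRUE_SEQ}")
```

Classifying each read on its own would expose the classifier to every individual error, whereas the consensus is only wrong at a position where most reads in the cluster share the same error.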
There were six samples (3.64%) that still failed to match the reference at the species level for the resolved Nanopore 16S. One possible reason for this discordance is the high similarity in 16S rRNA gene sequences between the inferred species and the reference taxa. Based on the now historic assumption of 16S rRNA sequencing, sequences with >95% identity represent the same genus, whereas sequences with >97% identity represent closely related species (29). Many researchers have reported that the taxonomic resolution of the 16S rRNA gene is limited and that it cannot discriminate between closely related species in certain taxa, including but not limited to Bacillus, Burkholderia, the Acinetobacter calcoaceticus-A. baumannii complex, Achromobacter, Actinomyces, Staphylococcus, and Enterobacterales (30, 31). In this study, all six taxa inferred by Nanopore 16S had >97% sequence identity with the reference standard (Table 2).
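To make the threshold argument concrete, the Python sketch below applies the historic >95% and >97% identity cutoffs to a few identity values from Table 2. The function name `interpret_identity` and the category labels are illustrative assumptions; the point is only that these cutoffs can place taxa from different genera in the "closely related species" band and taxa from the same genus below the genus cutoff.

```python
GENUS_CUTOFF = 95.0    # >95% identity: historically taken as same genus
SPECIES_CUTOFF = 97.0  # >97% identity: historically taken as closely related species


def interpret_identity(percent_identity: float) -> str:
    """Label a 16S identity value using the historic cutoffs (29)."""
    if percent_identity > SPECIES_CUTOFF:
        return "closely related species"
    if percent_identity > GENUS_CUTOFF:
        return "same genus"
    return "below genus cutoff"


# (inferred taxon, reference taxon, % identity) taken from Table 2.
EXAMPLES = [
    ("Yokenella regensburgei", "Klebsiella michiganensis", 98.56),
    ("Tessaracoccus flavescens", "Tessaracoccus oleiagri", 95.95),
    ("Corynebacterium tuberculostearicum", "Corynebacterium mastitidis", 94.67),
]

for inferred, reference, identity in EXAMPLES:
    print(f"{inferred} vs {reference}: {identity:.2f}% -> {interpret_identity(identity)}")
```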
In addition, WGS was performed to identify the definitive bacterial taxa for the eight samples with completely discordant 16S results from the three sequencing platforms. Seven of these samples were considered novel bacterial species because of the low query coverage (<50%) and low ANIs (<94%) between the respective consensus sequence and the best-matched genome (32). For the remaining sample, WGS confirmed that R062 belonged to K. michiganensis (ANI = 98.71%), which shared a high degree of 16S rRNA identity with the taxa assigned by Sanger 16S (Klebsiella grimontii; 99.20%), resolved Illumina 16S (Enterobacter cloacae; 97.07%), and resolved Nanopore 16S (Yokenella regensburgei; 98.56%) (Table 1). This demonstrated that 16S rRNA sequencing was not able to accurately differentiate these closely related species.
Considering the time to result (not including DNA extraction) of the two sequencing platforms, the Nanopore workflow (8.25 h) has a much shorter turnaround time than the Illumina workflow (78 h). A long quantification process (quantitative PCR [qPCR] and bioanalyzer) is required in the Illumina workflow (12 h) since the cluster generation process in Illumina sequencing is highly sensitive to library concentration. While overclustering leads to lower base accuracy, underclustering leads to lower data output in Illumina sequencing. In contrast, Nanopore sequencing is less sensitive to the fluctuation of library concentration, and the DNA quantification process is simpler.
The maximum batch size of the Nanopore 16S workflow is 24 samples, compared to 384 samples in the Illumina 16S workflow. Comparing the cost per sample in a sequencing run at the respective maximum batch size, Nanopore sequencing is cheaper than Illumina sequencing (US$17.7 versus US$21.6, respectively). Additionally, the startup cost of Nanopore sequencing is remarkably lower than that of Illumina sequencing. The starter package of Nanopore sequencing costs only US$1,000, whereas the Illumina MiSeq costs approximately US$125,000. Also, expensive instruments like a qPCR machine and a bioanalyzer are required for the quantification step in Illumina sequencing.
In this study, the FLO-MIN106 R9.4.1 reusable flow cell, which enables sequencing for up to 72 h, was used for Nanopore 16S sequencing. However, library carryover from the previous run was observed in a pilot study. This is problematic when the same barcode set is used in consecutive sequencing runs. To avoid contamination by library carryover, a new flow cell was used in each sequencing run, and used flow cells were reserved for other sequencing runs using different barcodes. In this context, the disposable Flongle flow cell with fewer active pores is preferred in a clinical setting, especially when the sample size is small.
Bacterial identification at the genus level might be enough for prescribing treatment in some cases, since most antimicrobial drugs act against groups of bacteria instead of single species. However, identification to the species level is crucial in differentiating environmental nonpathogenic species and pathogenic species, especially when the bacteria have contrasting drug susceptibility patterns, for example, the A. calcoaceticus-A. baumannii complex (33). Nevertheless, the taxonomic resolution of 16S sequencing is dependent on the read length of the 16S rRNA gene, the capacity of the 16S reference database, and the choice of analysis pipeline.
There are some limitations to this study. First, the aim of this study was to compare commercially available kits for 16S rRNA gene sequencing from Illumina and Nanopore. Therefore, by using the 16S metagenomic sequencing library preparation kit, only the V3 and V4 subregions of the 16S rRNA gene were sequenced in the Illumina workflow. It is possible to sequence the full-length 16S rRNA gene using Illumina MiSeq with a laboratory-developed protocol (31), which may increase the diagnostic accuracy of the Illumina workflow. However, the analysis is more complicated, since an additional step of making contigs is required, which cannot be done by MSR. Second, except for the eight discordant samples, the reference taxa of isolates were defined solely by 16S rRNA sequencing, and they may not represent the definitive taxa. Third, the taxonomic assignment in WGS was based on the contigs of consensus sequences after de novo assembly. Circular, gap-free bacterial genomes were not constructed.
Conclusions.
Because of its rapidity, simplicity, and high accuracy, MALDI-TOF MS is the mainstay of bacterial identification in clinical microbiology laboratories. 16S sequencing of cultured isolates should be used only for taxonomic assignment of bacterial pathogens that cannot be identified by MALDI-TOF MS.
The performance of MSR in taxonomic classification was unsatisfactory, and analysis using external pipelines such as BLAST+ is recommended in the Illumina 16S workflow (Nextera XT index kit v2). With massive throughput and high base accuracy, the Illumina platform is suitable for clinical laboratories with a high burden of clinical samples, where a longer turnaround time is acceptable. The Nanopore 16S workflow (SQK-16S024 with Epi2me) is recommended when rapid species-level identification is required, especially in emergency cases. On both sequencing platforms, it is recommended to further confirm the classified species using other analysis pipelines to increase the diagnostic accuracy.
ACKNOWLEDGMENTS
This work was supported by the Innovation and Technology Fund–Partnership Research Program (PRP) (grant no. PRP/010/20FX).
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We declare no competing interests.
Footnotes
Supplemental material is available online only.
Contributor Information
Gilman Kit-Hang Siu, Email: gilman.siu@polyu.edu.hk.
Sandra S. Richter, Mayo Clinic
REFERENCES
- 1. Jesumirhewe C, Ogunlowo PO, Olley M, Springer B, Allerberger F, Ruppitsch W. 2016. Accuracy of conventional identification methods used for Enterobacteriaceae isolates in three Nigerian hospitals. PeerJ 4:e2511. doi:10.7717/peerj.2511.
- 2. Harmsen D, Rothganger J, Frosch M, Albert J. 2002. RIDOM: ribosomal differentiation of medical micro-organisms database. Nucleic Acids Res 30:416–417. doi:10.1093/nar/30.1.416.
- 3. Karger A. 2016. Current developments to use linear MALDI-TOF spectra for the identification and typing of bacteria and the characterization of other cells/organisms related to infectious diseases. Proteomics Clin Appl 10:982–993. doi:10.1002/prca.201600038.
- 4. Patel R. 2015. MALDI-TOF MS for the diagnosis of infectious diseases. Clin Chem 61:100–111. doi:10.1373/clinchem.2014.221770.
- 5. Hou TY, Chiang-Ni C, Teng SH. 2019. Current status of MALDI-TOF mass spectrometry in clinical microbiology. J Food Drug Anal 27:404–414. doi:10.1016/j.jfda.2019.01.001.
- 6. Lau SK, Tang BS, Teng JL, Chan TM, Curreem SO, Fan RY, Ng RH, Chan JF, Yuen KY, Woo PC. 2014. Matrix-assisted laser desorption ionisation time-of-flight mass spectrometry for identification of clinically significant bacteria that are difficult to identify in clinical laboratories. J Clin Pathol 67:361–366. doi:10.1136/jclinpath-2013-201818.
- 7. Ge MC, Kuo AJ, Liu KL, Wen YH, Chia JH, Chang PY, Lee MH, Wu TL, Chang SC, Lu JJ. 2017. Routine identification of microorganisms by matrix-assisted laser desorption ionization time-of-flight mass spectrometry: success rate, economic analysis, and clinical outcome. J Microbiol Immunol Infect 50:662–668. doi:10.1016/j.jmii.2016.06.002.
- 8. Garner O, Mochon A, Branda J, Burnham CA, Bythrow M, Ferraro M, Ginocchio C, Jennemann R, Manji R, Procop GW, Richter S, Rychert J, Sercia L, Westblade L, Lewinski M. 2014. Multi-centre evaluation of mass spectrometric identification of anaerobic bacteria using the VITEK(R) MS system. Clin Microbiol Infect 20:335–339. doi:10.1111/1469-0691.12317.
- 9. Knoester M, van Veen SQ, Claas EC, Kuijper EJ. 2012. Routine identification of clinical isolates of anaerobic bacteria: matrix-assisted laser desorption ionization-time of flight mass spectrometry performs better than conventional identification methods. J Clin Microbiol 50:1504. doi:10.1128/JCM.06607-11.
- 10. Luo Y, Siu GKH, Yeung ASF, Chen JHK, Ho PL, Leung KW, Tsang JLY, Cheng VCC, Guo L, Yang J, Ye L, Yam WC. 2015. Performance of the VITEK MS matrix-assisted laser desorption ionization-time of flight mass spectrometry system for rapid bacterial identification in two diagnostic centres in China. J Med Microbiol 64:18–24. doi:10.1099/jmm.0.080317-0.
- 11. Bizzini A, Jaton K, Romo D, Bille J, Prod'hom G, Greub G. 2011. Matrix-assisted laser desorption ionization–time of flight mass spectrometry as an alternative to 16S rRNA gene sequencing for identification of difficult-to-identify bacterial strains. J Clin Microbiol 49:693–696. doi:10.1128/JCM.01463-10.
- 12. Homem de Mello de Souza HAP, Dalla-Costa LM, Vicenzi FJ, Camargo de Souza D, Riedi CA, Filho NAR, Pilonetto M. 2014. MALDI-TOF: a useful tool for laboratory identification of uncommon glucose non-fermenting Gram-negative bacteria associated with cystic fibrosis. J Med Microbiol 63:1148–1153. doi:10.1099/jmm.0.076869-0.
- 13. Winand R, Bogaerts B, Hoffman S, Lefevre L, Delvoye M, Braekel JV, Fu Q, Roosens NH, Keersmaecker SC, Vanneste K. 2019. Targeting the 16s rRNA gene for bacterial identification in complex mixed samples: comparative evaluation of second (Illumina) and third (Oxford Nanopore Technologies) generation sequencing technologies. Int J Mol Sci 21:298. doi:10.3390/ijms21010298.
- 14. Chakravorty S, Helb D, Burday M, Connell N, Alland D. 2007. A detailed analysis of 16S ribosomal RNA gene segments for the diagnosis of pathogenic bacteria. J Microbiol Methods 69:330–339. doi:10.1016/j.mimet.2007.02.005.
- 15. Ip CLC, Loose M, Tyson JR, de Cesare M, Brown BL, Jain M, Leggett RM, Eccles DA, Zalunin V, Urban JM, Piazza P, Bowden RJ, Paten B, Mwaigwisya S, Batty EM, Simpson JT, Snutch TP, Birney E, Buck D, Goodwin S, Jansen HJ, O'Grady J, Olsen HE, MinION Analysis and Reference Consortium. 2015. MinION analysis and reference consortium: phase 1 data release and analysis. F1000Res 4:1075. doi:10.12688/f1000research.7201.1.
- 16. Muyzer G, Teske A, Wirsen CO, Jannasch HW. 1995. Phylogenetic relationships of Thiomicrospira species and their identification in deep-sea hydrothermal vent samples by denaturing gradient gel electrophoresis of 16S rDNA fragments. Arch Microbiol 164:165–172. doi:10.1007/BF02529967.
- 17. Liu Z, Lozupone C, Hamady M, Bushman FD, Knight R. 2007. Short pyrosequencing reads suffice for accurate microbial community analysis. Nucleic Acids Res 35:e120. doi:10.1093/nar/gkm541.
- 18. Andersson AF, Lindberg M, Jakobsson H, Backhed F, Nyren P, Engstrand L. 2008. Comparative analysis of human gut microbiota by barcoded pyrosequencing. PLoS One 3:e2836. doi:10.1371/journal.pone.0002836.
- 19. Nossa CW, Oberdorf WE, Yang L, Aas JA, Paster BJ, Desantis TZ, Brodie EL, Malamud D, Poles MA, Pei Z. 2010. Design of 16S rRNA gene primers for 454 pyrosequencing of the human foregut microbiome. World J Gastroenterol 16:4135–4144. doi:10.3748/wjg.v16.i33.4135.
- 20. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, Thallinger GG, Van Horn DJ, Weber CF. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 75:7537–7541. doi:10.1128/AEM.01541-09.
- 21. Straub D, Blackwell N, Langarica-Fuentes A, Peltzer A, Nahnsen S, Kleindienst S. 2020. Interpretations of environmental microbial community studies are biased by the selected 16S rRNA (gene) amplicon sequencing pipeline. Front Microbiol 11:550420. doi:10.3389/fmicb.2020.550420.
- 22. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glockner FO. 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41:D590–D596. doi:10.1093/nar/gks1219.
- 23. Rodriguez-Perez H, Ciuffreda L, Flores C. 2021. NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data. Bioinformatics 37:1600–1601. doi:10.1093/bioinformatics/btaa900.
- 24. Huang YT, Liu PY, Shih PW. 2021. Homopolish: a method for the removal of systematic errors in nanopore sequencing by homologous polishing. Genome Biol 22:95. doi:10.1186/s13059-021-02282-6.
- 25. Yoon SH, Ha SM, Lim J, Kwon S, Chun J. 2017. A large-scale evaluation of algorithms to calculate average nucleotide identity. Antonie Van Leeuwenhoek 110:1281–1286. doi:10.1007/s10482-017-0844-4.
- 26. Sierra MA, Li Q, Pushalkar S, Paul B, Sandoval TA, Kamer AR, Corby P, Guo Y, Ruff RR, Alekseyenko AV, Li X, Saxena D. 2020. The influences of bioinformatics tools and reference databases in analyzing the human oral microbial community. Genes (Basel) 11:878. doi:10.3390/genes11080878.
- 27. Park SC, Won S. 2018. Evaluation of 16S rRNA databases for taxonomic assignments using mock community. Genomics Inform 16:e24. doi:10.5808/GI.2018.16.4.e24.
- 28. Johnson JS, Spakowicz DJ, Hong BY, Petersen LM, Demkowicz P, Chen L, Leopold SR, Hanson BM, Agresta HO, Gerstein M, Sodergren E, Weinstock GM. 2019. Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis. Nat Commun 10:5029. doi:10.1038/s41467-019-13036-1.
- 29. Schloss PD, Handelsman J. 2005. Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol 71:1501–1506. doi:10.1128/AEM.71.3.1501-1506.2005.
- 30. Janda JM, Abbott SL. 2007. 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: pluses, perils, and pitfalls. J Clin Microbiol 45:2761–2764. doi:10.1128/JCM.01228-07.
- 31. Church DL, Cerutti L, Gurtler A, Griener T, Zelazny A, Emler S. 2020. Performance and application of 16S rRNA gene cycle sequencing for routine identification of bacteria in the clinical microbiology laboratory. Clin Microbiol Rev 33:e00053-19. doi:10.1128/CMR.00053-19.
- 32. Konstantinidis KT, Tiedje JM. 2005. Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci USA 102:2567–2572. doi:10.1073/pnas.0409727102.
- 33. Lim YM, Shin KS, Kim J. 2007. Distinct antimicrobial resistance patterns and antimicrobial resistance-harboring genes according to genomic species of Acinetobacter isolates. J Clin Microbiol 45:902–905. doi:10.1128/JCM.01573-06.