Abstract
Next-generation sequencing is rapidly finding footholds in numerous microbiological fields, including infectious disease diagnostics. Here, we describe a molecular inversion probe panel for the identification of bacterial, viral, and parasitic pathogens. We describe the ability of Illumina and Oxford Nanopore Technologies (ONT) to sequence small amplicons originating from this panel for the identification of pathogens in complex matrices. The panel correctly classified 31 bacterial pathogens directly from positive blood culture bottles with a genus-level concordance of 96.7% and 90.3% on the Illumina and ONT platforms, respectively. Both sequencing platforms detected 18 viral and parasitic organisms directly from mock clinical samples of plasma and whole blood at concentrations of 10^4 PFU/mL with few exceptions. In general, Illumina sequencing exhibited greater read counts with lower percent mapped reads; however, this resulted in no effect on limits of detection compared with ONT sequencing. Mock clinical evaluation of the probe panel on the Illumina and ONT platforms resulted in positive predictive values of 0.91 and 0.88 and negative predictive values of 1 and 1 from de-identified human chikungunya virus samples compared with gold standard quantitative RT-PCR. Overall, these data show that molecular inversion probes are an adaptable technology capable of pathogen detection from complex sample matrices on current next-generation sequencing platforms.
Next-generation sequencing (NGS), as a technique for clinical diagnostics and biosurveillance efforts, continues to progress toward functional realization.1,2 Currently, second-generation sequencing technologies, such as the Illumina platform, benefit from locked down reagents, protocols, and systems coupled with decades of research evaluating clinical metrics. US Food and Drug Administration approval of the MiSeqDx (Illumina, San Diego, CA) for clinical in vitro testing provided a regulatory precursor for diagnostic panels targeting genetic aberrations.3 Part of the rationale for US Food and Drug Administration approval of the MiSeqDx was low error rates,4 thus ensuring confidence for making diagnostic decisions. Unlike approved genetic screening that requires high-confidence calls for genetic variants, infectious disease diagnosis is often a binary determination of the presence or absence of foreign genetic material in a clinical sample. Consequently, these factors are the reason molecular techniques such as real-time PCR remain firmly ahead of sequencing in the infectious disease diagnostic space. Third-generation sequencing platforms, such as those from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom), offer a potential bridge between current real-time PCR applications and the depth of information sequencing can provide. In this context, ONT represents a paradigm shift measuring changes in ionic current across a membrane as single-stranded DNA passes through a nanopore to determine nucleotide composition.5 The technology is small and portable, especially compared with the complex optics required for second-generation sequencers, and shows functionality in austere or resource-limited environments.6, 7, 8 ONT sequencing can also detect base pair modifications during regular sequencing runs as well as sequence RNA directly, thus offering additional information beyond the capabilities of real-time PCR, which may provide further diagnostic insights.
Benefits from using the ONT platform are balanced with several caveats, such as higher raw read error rates,9 making the platform less suitable for the confident single-nucleotide variant analysis required in some diagnostic applications.10,11 In addition, rapid technological turnaround of the ONT platform results in a lack of benchmarking and platform lockdown compared with the Illumina platform. Adaptations to flow cells, motor proteins, and library preparation kits have resulted in a diverse library of company and user-made protocols.10 This, combined with constant evolution of base calling and analytical software, makes developing consistent start-to-finish protocols challenging. These caveats aside, near real-time sequencing combined with small size and portability is well suited to infectious disease diagnostics, thus warranting the development effort.10,11
Application of molecular inversion probes (MIPs) before sequencing mitigates several of the classic issues surrounding NGS. Issues such as cost per sample, host background, and necessity for a priori knowledge for real-time PCR are mitigated through MIP multiplex amplification of several signatures at once to include virulence elements, resistance genes, and other identifying elements.12,13 MIPs have exhibited multiplex capabilities of >10,000 probes significantly outpacing current multiplex PCR panels.14,15 Mechanistically, MIPs are single-stranded DNA probes that hybridize to target sequences, “gap-fill” via DNA polymerase, and ligate to form a circular element that is amplified with a single universal primer pair. Unlike multiplex PCR, these universal primers can be easily modified to create amplicons suitable for amplification and sample barcoding on both the Illumina and ONT sequencing platforms without having to alter the probe panel. Here, we present data characterizing an MIP panel identifying pathogens from spiked complex matrices using the Illumina and ONT platforms. We analyzed and compared diagnostic metrics for each platform and highlight the adaptability of MIP technology for infectious disease diagnostics.
Materials and Methods
Strains Used, DNA and RNA Preparation
Bacterial and viral strains used in this study are included in Supplemental Table S1. Viral strains used in this study were obtained from the Unified Culture Collection of the United States Army Medical Research Institute of Infectious Diseases (Fort Detrick, MD) and BEI Resources (Manassas, VA). Bacterial samples were created as described previously.12 Briefly, bacterial glycerol stock was grown overnight on sheep’s blood agar plates (Thermo Fisher Scientific, Waltham, MA) at 37°C. A colony was suspended in 40 mL of BACTEC Standard/10 aerobic/F culture spiked with whole blood (BioreclamationIVT, Baltimore, MD) at a 1:4 ratio. Bottles were then cultured in BACTEC FX40 (Thermo Fisher Scientific) overnight. Colony-forming units per milliliter counts on flagged positive bottles were within a 10^7 to 10^9 range. Then, 1 mL of blood culture fluid was removed and 50 μL of lysozyme (100 mg/mL) and 10 μL of mutanolysin (10,000 U/mL) were added, incubated at 37°C for 30 minutes, and subjected to bead-beating for 5 minutes with 100 μL of 0.5 μmol/L beads. DNA was extracted and purified by using the Qiagen EZ1 DNA Tissue kit (Qiagen, Valencia, CA) according to the manufacturer’s instructions.
Stock virus was provided by BEI Resources and the Unified Culture Collection. Cell culture supernatant of cells infected with stock virus was inactivated by using TRIzol LS at a 1:3 ratio (Thermo Fisher Scientific). Human plasma and whole blood (BioreclamationIVT) were also added to TRIzol LS at a 1:3 ratio. Mock clinical samples were prepared at indicated concentrations combining TRIzol-treated virus and matrix. Then, 400 μL of each sample, including human chikungunya virus (CHIKV) samples, was extracted on the EZ1 Advanced XL (Qiagen) using the Virus 2.0 kit (Qiagen) according to the manufacturer’s instructions. For all viral samples, a DNase step was completed by using the RNase-Free DNase Set kit (Qiagen) followed by sample cleanup with the RNeasy MinElute kit (Qiagen). For all viral samples, cDNA was generated from 50 ng total RNA with the SeqPlex RNA amplification kit (MilliporeSigma, Burlington, MA).
MIP Design and Protocol
The Pathogen Identification Panel (PIP) was designed by using the CLC Genomics Workbench (CLC Bio, Cambridge, MA) and AlleleID 7.73 (PREMIER Biosoft, Palo Alto, CA). Homologous probe arms are described in Supplemental Table S2. The MIP backbone and universal amplification primers are listed in Table 1. Probes were synthesized by Integrated DNA Technologies (Coralville, IA) suspended in Tris-EDTA and combined at equimolar amounts. The MIP protocol was performed as previously published with few changes.12 Briefly, 10 μL reaction mixtures contained 10 U of Ampligase (Epicentre, Madison, WI), 500 μmol/L nicotinamide adenine dinucleotide (Sigma-Aldrich, St. Louis, MO), 1× Phusion High-Fidelity PCR Master Mix with HF Buffer (New England Biolabs, Ipswich, MA), 100 pmol/L PIP probe pool, and 2 μL of target organism DNA. Reactions were heated to 98°C for 3 minutes and then ramped to 55°C (0.1°C per second) with a 60-minute hold at 55°C followed by 72°C for 15 minutes. Noncircular DNA was removed with the addition of 20 U of exonuclease I (Epicentre) and 25 U of exonuclease III (Epicentre). Reactions were heated to 37°C for 30 minutes followed by inactivation at 80°C for 20 minutes. Capture regions were amplified by using 0.5 μmol/L forward and reverse sequencer-specific universal primers (Table 1) and 1× Phusion HF MM (Qiagen). Cycling conditions were as follows: 98°C for 3 minutes, 40 cycles of 98°C for 10 seconds, 60°C for 30 seconds, 72°C for 15 seconds, and a hold of 72°C for 5 minutes. Amplicons were purified by using Agencourt AMPure XP beads (Beckman Coulter, Pasadena, CA) per the manufacturer’s protocol with a bead ratio of 0.7×.
Table 1.
MIP Backbone and Amplification Primer Sequences
| Primer name | Sequence |
|---|---|
| ONT Universal amplification primer Forward | 5′-TTTCTGTTGGTGCTGATATTGCCGTTGTTACCGACTGGATTATTACC-3′ |
| ONT Universal amplification primer Reverse | 5′-ACTTGCCTGTCGCTCTATCTTCTCCGCATACCAGTTGTTGTCG-3′ |
| Illumina Universal amplification primer Forward | 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNCGTTGTTACCGACTGGATTATTACC-3′ |
| Illumina Universal amplification primer Reverse | 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGNNNNTCCGCATACCAGTTGTTGTCG-3′ |
| MIP backbone | 5′-[PHOS]-GGTAATAATCCAGTCGGTAACAACGAACGGTACGCTGAGGGCGGAAAAAATCGTCGGGGACATTGTAAAGGCGGCGAGCGCGGCTTTTCCGCGCCAGCGTGAAAGCAGTGTGGACTGGCCGTCAGGTACTCCGCATACCAGTTGTTGTCG-3′ |
Underlined sequences indicate amplification primer and probe binding locations, and [PHOS] indicates that a 5′ phosphorylation is required. MIP, molecular inversion probe; ONT, Oxford Nanopore Technologies.
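The relationship between the universal primers and the MIP backbone in Table 1 can be checked directly from the listed sequences. The following is a minimal sketch, not part of the original protocol; the helper names (`reverse_complement`, `shared_suffix`) are illustrative. It recovers the 3′ tail shared by the ONT and Illumina versions of each primer and confirms that the forward tail is the reverse complement of the 5′ end of the backbone, whereas the reverse tail is identical to the 3′ end of the backbone.

```python
# Sequences copied from Table 1; helper names are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def shared_suffix(a: str, b: str) -> str:
    """Return the longest common 3' (right-hand) end of two sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]

ONT_F = "TTTCTGTTGGTGCTGATATTGCCGTTGTTACCGACTGGATTATTACC"
ONT_R = "ACTTGCCTGTCGCTCTATCTTCTCCGCATACCAGTTGTTGTCG"
ILM_F = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNCGTTGTTACCGACTGGATTATTACC"
ILM_R = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGNNNNTCCGCATACCAGTTGTTGTCG"
BACKBONE = (
    "GGTAATAATCCAGTCGGTAACAACGAACGGTACGCTGAGGGCGGAAAAAATCGTCGGGGACATTG"
    "TAAAGGCGGCGAGCGCGGCTTTTCCGCGCCAGCGTGAAAGCAGTGTGGACTGGCCGTCAGGTAC"
    "TCCGCATACCAGTTGTTGTCG"
)

# The platform-specific 5' adaptor differs; the 3' binding tail is shared.
forward_tail = shared_suffix(ONT_F, ILM_F)
reverse_tail = shared_suffix(ONT_R, ILM_R)

print(forward_tail, len(forward_tail))
print(reverse_tail, len(reverse_tail))
print(BACKBONE.startswith(reverse_complement(forward_tail)))
print(BACKBONE.endswith(reverse_tail))
```

Because only the 5′ adaptor portion differs between platforms, switching sequencers changes the primer pair and leaves the probe pool untouched.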
Sequencing and Data Analysis
Samples for Illumina sequencing were enriched and barcoded by using the Kapa Biosystems Library Amplification Kit (Roche Holding AG, Basel, Switzerland) and 0.5 μmol/L Nextera dual indices (Illumina). Cycling conditions were as follows: 72°C for 3 minutes, 98°C for 30 seconds, 24 cycles at 98°C for 10 seconds, 63°C for 30 seconds, and 72°C for 30 seconds, followed by a 72°C hold for 1 minute. Amplicons were purified by using Agencourt AMPure XP beads per the manufacturer’s protocol with a bead ratio of 0.5×, quantified with the LabChip GX Touch (PerkinElmer, Waltham, MA) and DNA high-sensitivity reagent kit (PerkinElmer). Samples were pooled, quantified with the KAPA library quantification kit (Roche Holding AG), and sequenced on the MiSeq sequencer (Illumina) using a 300 v2 cycling kit (Illumina) at a final concentration of 8 pmol/L.
For ONT sequencing, 0.5 nmol/L of each sample was enriched and barcoded by using the 1D PCR barcoding (96) amplicon (SQK-LSK108) kit (ONT) according to the manufacturer’s instructions. Cycling conditions were as follows: 95°C for 3 minutes, 15 cycles of 95°C for 12 seconds, 62°C for 30 seconds, and 65°C for 1 minute with final extension at 65°C for 5 minutes. Amplicons were purified by using Agencourt AMPure XP beads per the manufacturer’s protocol with a bead ratio of 0.7× and quantified with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific). Samples were pooled and enriched by using an adapted protocol for the Ligation Sequencing Kit 1D (ONT). The Ligation Sequencing Kit 1D protocol was initiated at the sample enrichment PCR step, ignoring adaptor ligation. Samples were sequenced by using the Flo-MIN106 R9.4.1 flow cell for 1D experiments. Flow cells were run for 48 hours, and samples were demultiplexed and base called by using Albacore and Porechop on a Linux operating system.
Analysis was performed by using the CLC Genomics Workbench. Adaptors were trimmed before read mapping, and Illumina and ONT reads were quality trimmed by using a quality score of 0.01 and 0.05, respectively. Trimmed reads were mapped separately to a 16S rRNA database comprising variable regions V1V2, V3, and V6V7 encompassing 88 genera and 3069 species as previously described12 and a viral database composed of capture regions of targeted viral organisms. These regions were curated by first aligning target organism genomes downloaded from GenBank using the CLC Genomics Workbench. The consensus sequences from each MIP-targeted region were then removed from the full genome and used for reference-based mapping. Mapping settings were as follows: for ONT, a mismatch cost of 2, insertion cost of 3, deletion cost of 3, insertion open cost of 6, insertion extend cost of 1, deletion open cost of 6, deletion extend cost of 1, length fraction of 0.5, and similarity fraction of 0.8 were used; and for Illumina reads, a mismatch cost of 2, insertion cost of 3, deletion cost of 3, insertion open cost of 6, insertion extend cost of 1, deletion open cost of 6, deletion extend cost of 1, length fraction of 0.9, and similarity fraction of 0.9 were used. GraphPad Prism version 7.01 (GraphPad Software, La Jolla, CA) and JMP Genomics version 8.1 (SAS Institute, Inc., Cary, NC) were used for statistical analysis and graphing. References were called positive if mapped reads were above a minimum threshold: 100 reads for bacterial samples or 100 and 10 reads for Illumina and ONT, respectively, for viral samples. A second cutoff was applied by using an average plus three times the SD of nontemplate controls. Receiver-operating curves (ROC) and precision-recall curves (PRC) and statistics were generated by using MedCalc Software version 20.027 (MedCalc Software Ltd., Ostend, Belgium).
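The two-step positive call described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: it assumes that both cutoffs must be exceeded, that "above" means strictly greater than, and that the SD of the nontemplate controls is the sample SD. The function name and the nontemplate control read counts are hypothetical.

```python
from statistics import mean, stdev

# Minimum mapped-read thresholds stated in the text.
MIN_READS = {
    ("bacterial", "Illumina"): 100,
    ("bacterial", "ONT"): 100,
    ("viral", "Illumina"): 100,
    ("viral", "ONT"): 10,
}

def is_positive(mapped_reads, ntc_reads, sample_type, platform):
    """Call a reference positive if it clears both cutoffs.

    Assumptions (not confirmed by the source): both cutoffs must pass,
    comparisons are strict, and the sample SD (n - 1) is used.
    """
    min_cutoff = MIN_READS[(sample_type, platform)]
    ntc_cutoff = mean(ntc_reads) + 3 * stdev(ntc_reads)
    return mapped_reads > min_cutoff and mapped_reads > ntc_cutoff

# Hypothetical nontemplate control counts for one reference.
ntc = [0, 2, 1]  # mean 1, sample SD 1, so the second cutoff is 4 reads

print(is_positive(51, ntc, "viral", "ONT"))       # clears 10 and 4
print(is_positive(51, ntc, "viral", "Illumina"))  # fails the 100-read minimum
```

Under these assumptions, a reference with 51 mapped reads would be called positive on ONT and negative on Illumina, which shows how the platform-specific minimum interacts with the nontemplate control cutoff.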
Clinical Samples
Human clinical serum samples from suspected CHIKV infections were acquired through the Special Pathogens Laboratory at the United States Army Medical Research Institute of Infectious Diseases. All samples were stored at −20°C or −80°C before their use. All samples were de-identified before use. All studies were conducted in compliance with United States Department of Defense, federal, and state statutes and regulations relating to the protection of human subjects, and adhere to principles identified in the Belmont Report. All data and human subjects research were gathered and conducted for this publication under an Institutional Review Board–approved determination FY17-31 as defined by 32 CFR 219.102(f). Mock clinical samples used human plasma and whole blood (BioIVT, Westbury, NY) under the Institutional Review Board–approved determination FY16-21. The CHIKV status of human samples was determined by using real-time RT-PCR as previously described.16
Results
Design and Composition of the Pathogen Identification and Characterization Panel
Covering the panoply of possible human-relevant infectious diseases for any given clinical sample is functionally unrealistic for real-time PCR. To establish proof-of-concept for the application of MIPs on Illumina and ONT NGS, a probe panel consisting of 173 MIPs termed the PIP, for the identification of pathogens using NGS as a downstream readout, was designed. The panel consisted of 94 probes targeting 17 viral pathogens and one parasitic organism. In silico analysis found that these probes cover >90% of GenBank sequences with 100% identity for targeted organisms (Supplemental Table S3). Of these, only Plasmodium falciparum and Lassa virus had <80% of their reference sequences captured at this 100% identity. Eight previously published probes12 targeting variable regions V1, V2, V3, V6, and V7 of the 16S rRNA gene for bacterial taxonomic classification as well as 71 probes targeting antibiotic resistance (AR) elements were also included in the probe panel (Supplemental Table S2).
Taxonomic Classification and AR Gene Detection Using PIP on the Illumina and ONT Sequencing Platforms
Previously published data showed targeted amplification of 16S rRNA variable regions using MIPs, and downstream Illumina sequencing was capable of bacterial taxonomic classification.12 To show the malleability of MIP technology for incorporation of novel probes as well as changes in the downstream sequencing platform, 31 bacterial pathogens were directly challenged from positive blood culture bottles and each sample was classified based on sequencing reads produced by the Illumina and ONT platforms. Each 16S rRNA variable region affords different taxonomic resolution depending on the species. Figure 1 shows representative results from these studies, with Klebsiella as an example of a complex classification and Acinetobacter as an example of a simple classification. For instance, organisms such as Klebsiella pneumoniae had large percentages of nonspecific read mappings in regions V1-V2 and V6-V7 (Figure 1A and Supplemental Figure S1). In these cases, sequencing reads are split between other closely related Enterobacteriaceae family members. In contrast, bacteria such as Acinetobacter baumannii had nearly 100% correctly mapped reads across each variable region (Figure 1B, Supplemental Figure S1, and Supplemental Table S4). Genus level mapping was similar at each region for the Illumina and ONT platforms. Correlation plots between percent mapped reads from all 31 pathogens revealed a Pearson coefficient of 0.947, representing significant correlation between the sequences produced by each NGS platform (Figure 1C).
Figure 1.
16S variable region mapping and correlation of Illumina and Oxford Nanopore Technologies (Oxford) sequencing reads after enrichment with the Pathogen Identification Panel. Stacked box plots showing percent reads mapped of Illumina and Oxford sequences to reference genera for three variable regions (V1V2, V3, and V6V7) for a subset of organisms, Klebsiella pneumoniae (A) and Acinetobacter baumannii (B). C: Comparison of sequencing platform showing the percent reads mapped for positive reference genera for all 31 bacterial organisms. Variable regions V1V2, V3, and V6V7 are shown in separate colors. Pearson’s correlation coefficient was calculated by using all data points.
Genus level concordance was 96.7% on the Illumina platform when evaluating taxonomic classification. This concordance value was identical to data previously described using the 16S rRNA probe set without the 165 additional probes used in this study.12 Sequence reads from both sequencing technologies resulted in the misclassification of Klebsiella oxytoca as Enterobacter. The ONT platform also misclassified Burkholderia cepacia and Enterobacter aerogenes, resulting in a genus level concordance of 90.3%. These errors were the result of low total mapped reads in the case of Burkholderia and misclassification of reads to other closely related genera such as Raoultella and Klebsiella for E. aerogenes (Supplemental Table S4).
Of the 31 bacterial strains tested, 20 originated from the US Food and Drug Administration-CDC AR Isolate Bank and were fully sequenced with well-defined resistance mechanisms. Using PIP, AR gene families, including β-lactamases blaKPC, blaSHV, blaTEM, blaCMY, blaVIM, blaSME, blaCTX, and blaIMP, were identified in 100% of the organisms tested on both the Illumina and ONT sequencing platforms (Supplemental Table S5). AR gene families not included in the PIP were not identified. Genes within the blaOXA family were identified with the exception of OXA-1 and OXA-9, suggesting that PIP does not capture the entire blaOXA gene family and additional probes will be required. These data show the ability of MIP probe sets to incorporate additional probes without performance loss as well as their adaptability to multiple NGS platforms; however, the decrease in genus-level concordance on the ONT platform warranted further evaluation of diagnostic metrics such as limits of detection (LOD), repeatability, and mock clinical and clinical performance.
A Comparison of Illumina and ONT Sequencing Performance Identifying PIP-Amplified Sequences in Complex Clinical Matrices
PIP-amplified sequences from 20 viral and parasitic strains resulted in positive detection in three biological replicates at a concentration of 10^4 PFU/mL (Table 2) on both the ONT and Illumina platforms. Results were reproducible in both plasma and whole blood representing cell-free and high host background matrices. Only Marburg virus strains Ci67 and RAVN, Taï Forest virus, and P. falciparum strains 3D7 and NF54E required increased concentration for detection. Pearson correlations investigating the relationship among sequencing platforms and clinical matrices showed that log10 transformed mapped reads resulted in r values ranging from 0.352 to 0.851 across sequencing reads mapped to each organism (Figure 2A). This relationship depended on the matrix and sequencing platform comparison. These r values significantly increased, ranging from 0.73 to 0.93, when normalized to percent mapped reads. Direct comparison of the Illumina and ONT platforms in either plasma or whole blood produced r values of 0.82 and 0.90, respectively, suggesting a high correlation between NGS platforms. Although highly correlative, the total reads mapped for the Illumina sequencing platform were statistically higher than those for the ONT platform (Figure 2B). ONT, however, had a statistically higher percentage of mapped reads compared with Illumina (38% versus 12%), suggesting better signal to noise (Figure 2C). These data showed that both NGS platforms are capable of identifying pathogens from PIP-amplified sequences; however, they have statistical differences in total and percent mapped reads that could affect diagnostic metrics such as LOD or repeatability.
Table 2.
Identification of Pathogens with PIP from Multiple Matrices and Sequencing Platforms
| Organism | Conc. (PFU/mL) | Illumina, whole blood: Rep. | Illumina, whole blood: Avg. Map. | ONT, whole blood: Rep. | ONT, whole blood: Avg. Map. | Illumina, plasma: Rep. | Illumina, plasma: Avg. Map. | ONT, plasma: Rep. | ONT, plasma: Avg. Map. |
|---|---|---|---|---|---|---|---|---|---|
| BDBV | 10^4 | (3/3) | 102,795 | (3/3) | 2321 | (3/3) | 106,756 | (3/3) | 3466 |
| CCHFV | 10^4 | (5/6) | 820∗ | (5/6) | 143∗ | (3/3) | 9035 | (3/3) | 1648 |
| CHIKV | 10^4 | (3/3) | 19,533 | (3/3) | 4054 | (3/3) | 72,439 | (3/3) | 4823 |
| DENV1 | 10^4 | (3/3) | 63,666 | (3/3) | 15,017 | (3/3) | 76,944 | (3/3) | 5942 |
| DENV2 | 10^4 | (3/3) | 78,752 | (3/3) | 3871 | (3/3) | 56,305 | (3/3) | 7196 |
| DENV3 | 10^4 | (3/3) | 1,133,757 | (3/3) | 7555 | (3/3) | 96,931 | (3/3) | 6243 |
| DENV4 | 10^4 | (3/3) | 456,906 | (3/3) | 11,870 | (3/3) | 82,660 | (3/3) | 6180 |
| EBOV | 10^4 | (3/3) | 931 | (3/3) | 51 | (3/3) | 8818 | (3/3) | 948 |
| LASV | 10^4 | (3/3) | 14,585 | (3/3) | 1374 | (3/3) | 52,620 | (3/3) | 10,988 |
| MARV (Ci67) | 10^5 | (3/3) | 3422 | (3/3) | 398 | (3/3) | 8560 | (3/3) | 2585 |
| MARV (RAVN) | 10^5 | (2/3) | 417∗ | (2/3) | 551∗ | (3/3) | 72,510 | (3/3) | 21,161 |
| Plasmodium falciparum (NF54E) | 480 ng/mL | (2/3) | 553∗ | (2/3) | 135∗ | N/A | N/A | N/A | N/A |
| P. falciparum (3D7) | 480 ng/mL | (2/3) | 1193∗ | (3/3) | 268 | N/A | N/A | N/A | N/A |
| RESTV | 10^4 | (3/3) | 3091 | (3/3) | 284 | (3/3) | 3869 | (3/3) | 865 |
| RVFV | 10^4 | (3/3) | 98,590 | (3/3) | 3465 | (3/3) | 126,615 | (3/3) | 12,806 |
| SUDV | 10^4 | (3/3) | 2437 | (3/3) | 520 | (3/3) | 11,740 | (3/3) | 1814 |
| TAFV | 10^7 | (1/3) | 436∗ | (2/3) | 961∗ | (3/3) | 5703 | (3/3) | 3574 |
| WNV | 10^4 | (3/3) | 3608 | (3/3) | 663 | (3/3) | 12,585 | (3/3) | 2849 |
| YFV | 10^4 | (3/3) | 1389 | (3/3) | 392 | (3/3) | 6573 | (3/3) | 1251 |
| ZIKV | 10^4 | (5/6) | 38,594∗ | (6/6) | 11,031 | (3/3) | 49,850 | (3/3) | 8308 |
| Total (%) | | 89.39 | | 93.94 | | 100 | | 100 | |
Avg. Map., average mapped reads; BDBV, Bundibugyo virus; CCHFV, Crimean-Congo hemorrhagic fever virus; CHIKV, chikungunya virus; Conc., concentration; DENV, dengue virus; EBOV, Ebola virus; LASV, Lassa fever virus; MARV, Marburg virus; N/A, samples not performed in the respective matrix; ONT, Oxford Nanopore Technologies; PIP, Pathogen Identification Panel; RAVN, Ravn virus; RESTV, Reston virus; RVFV, Rift Valley fever virus; SUDV, Sudan virus; TAFV, Taï Forest virus; WNV, West Nile virus; YFV, yellow fever virus; ZIKV, Zika virus.
∗Replicates (Rep.) called negative were excluded from average.
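The Total (%) row of Table 2 can be reproduced from the per-organism replicate counts. The short sketch below is illustrative only (variable and function names are not from the original analysis); it pools the whole blood replicate calls for each platform and divides the number called positive by the number tested.

```python
# (called positive, tested) for each organism in whole blood, in Table 2 order.
illumina_whole_blood = {
    "BDBV": (3, 3), "CCHFV": (5, 6), "CHIKV": (3, 3), "DENV1": (3, 3),
    "DENV2": (3, 3), "DENV3": (3, 3), "DENV4": (3, 3), "EBOV": (3, 3),
    "LASV": (3, 3), "MARV (Ci67)": (3, 3), "MARV (RAVN)": (2, 3),
    "P. falciparum (NF54E)": (2, 3), "P. falciparum (3D7)": (2, 3),
    "RESTV": (3, 3), "RVFV": (3, 3), "SUDV": (3, 3), "TAFV": (1, 3),
    "WNV": (3, 3), "YFV": (3, 3), "ZIKV": (5, 6),
}
# ONT differs from Illumina for three organisms only.
ont_whole_blood = dict(illumina_whole_blood)
ont_whole_blood.update({
    "P. falciparum (3D7)": (3, 3), "TAFV": (2, 3), "ZIKV": (6, 6),
})

def percent_positive(calls):
    """Pooled percentage of replicates called positive."""
    positive = sum(p for p, _ in calls.values())
    tested = sum(t for _, t in calls.values())
    return 100 * positive / tested

print(round(percent_positive(illumina_whole_blood), 2))  # 59 of 66 replicates
print(round(percent_positive(ont_whole_blood), 2))       # 62 of 66 replicates
```

Pooling gives 59 of 66 replicates (89.39%) for Illumina and 62 of 66 (93.94%) for ONT, matching the values in the table.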
Figure 2.
Evaluation of clinical matrix and sequencing platform effects on detection of pathogens using the Pathogen Identification Panel. A: Scatterplot matrix displaying correlations between matrices and sequencing platforms using total mapped sequencing reads (lower left quadrant) and mapped reads normalized to total reads (upper right quadrant). Each sample point represents the mean of all replicates for a given input organism as seen in Table 2. Each square depicts the correlation of these points between next-generation sequencing platforms and human matrices that are listed diagonally down the middle of the square. Fit lines along with 95% CIs (light gray shading) as well as r correlations are displayed for each graph. Box plots comparing the total mapped reads (B) and the percent mapped reads (C) of the Illumina and Oxford Nanopore Technologies (ONT) platforms and whole blood and plasma matrices. Only positive calls were used for analysis, with each point representing an individual sample. An unpaired t-test was performed on each data set. ∗∗∗∗P < 0.0001.
To test the effect on these diagnostic metrics, PIP-amplified sequences from serial dilutions of spiked CHIKV and Ebola virus in plasma and whole blood were sequenced on the Illumina and ONT platforms (Figure 3). CHIKV had a preliminary LOD of 10^3 PFU/mL for plasma and whole blood samples independent of sequencing platform, whereas Ebola virus spiked samples had a preliminary LOD of 10^3 PFU/mL for plasma and 10^4 PFU/mL for whole blood samples independent of sequencing platform (Figure 3). The preliminary LOD formed the basis for LOD confirmation across 20 biological replicates for both CHIKV and Ebola virus in plasma and whole blood to measure sample repeatability on each NGS platform. Key metrics of repeatability, such as SD, CV, and percentage of replicates called positive, were similar under each condition evaluated (Table 3). These results showed limited variation in LOD or repeatability between NGS platforms despite significant differences in total and percent mapped reads.
Figure 3.
Effect of clinical matrix and sequencing platform on detection of pathogens with varying input. Bar graph representing input plaque-forming units per milliliter (PFU/mL) versus log10 (1 + Y) transformed mapped reads from chikungunya virus (CHIKV) (A and C) and Ebola virus (EBOV) (B and D) spiked into plasma and whole blood and sequenced on the Illumina (A and B) and Oxford Nanopore Technologies (ONT) (C and D) platforms. Error bars represent the SD of three replicates, and bar colors indicate the number of replicates called positive as described in the Materials and Methods.
Table 3.
Effect of Sequencing Platform on the Identification and Repeatability of EBOV and CHIKV Spiked Clinical Matrices
| Input organism | Matrix | Preliminary LOD tested (PFU/mL) | Sequencer | Called positive replicates | Average log10 transformed | SD∗ | CV∗ |
|---|---|---|---|---|---|---|---|
| CHIKV | Plasma | 10^3 | Illumina | (19/19)∗ | 4.265 | 0.449 | 10.53% |
| CHIKV | Plasma | 10^3 | ONT | (18/19)∗ | 3.994 | 0.2682 | 6.71% |
| CHIKV | Whole blood | 10^3 | Illumina | (20/20) | 3.772 | 0.4374 | 11.60% |
| CHIKV | Whole blood | 10^3 | ONT | (20/20) | 3.158 | 0.5601 | 17.74% |
| EBOV | Plasma | 10^3 | Illumina | (17/20) | 2.905 | 0.2599 | 8.95% |
| EBOV | Plasma | 10^3 | ONT | (15/20) | 2.439 | 0.3363 | 13.79% |
| EBOV | Whole blood | 10^4 | Illumina | (20/20) | 3.925 | 0.4123 | 10.51% |
| EBOV | Whole blood | 10^4 | ONT | (20/20) | 3.274 | 0.3946 | 12.05% |
CHIKV, chikungunya virus; EBOV, Ebola virus; LOD, limits of detection; ONT, Oxford Nanopore Technologies.
∗Only 19 replicates due to error in library preparation.
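The CV column of Table 3 follows from the SD and the average of the log10 transformed values, with CV = SD / mean × 100. The sketch below is illustrative only (names are not from the original analysis); it recomputes the CV for each row from the rounded values printed in the table, so small differences in the last decimal place relative to the reported CV are expected.

```python
# (organism, matrix, sequencer, mean log10, SD, reported CV %) from Table 3.
rows = [
    ("CHIKV", "Plasma", "Illumina", 4.265, 0.449, 10.53),
    ("CHIKV", "Plasma", "ONT", 3.994, 0.2682, 6.71),
    ("CHIKV", "Whole blood", "Illumina", 3.772, 0.4374, 11.60),
    ("CHIKV", "Whole blood", "ONT", 3.158, 0.5601, 17.74),
    ("EBOV", "Plasma", "Illumina", 2.905, 0.2599, 8.95),
    ("EBOV", "Plasma", "ONT", 2.439, 0.3363, 13.79),
    ("EBOV", "Whole blood", "Illumina", 3.925, 0.4123, 10.51),
    ("EBOV", "Whole blood", "ONT", 3.274, 0.3946, 12.05),
]

def coefficient_of_variation(mean_value, sd):
    """Coefficient of variation expressed as a percentage."""
    return 100 * sd / mean_value

recomputed = [coefficient_of_variation(m, sd) for *_, m, sd, _ in rows]

for (organism, matrix, sequencer, _, _, reported), cv in zip(rows, recomputed):
    print(f"{organism:5s} {matrix:11s} {sequencer:8s} "
          f"recomputed {cv:5.2f}%  reported {reported:5.2f}%")
```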
Evaluation of Sequencing Platforms and their Effect on PIP Diagnostic Performance
Samples multiplexed on next-generation sequencers often suffer from index cross-talk resulting in false-positive calls for diagnostic samples.17,18 Therefore, the diagnostic performance of PIP was evaluated with regard to specificity, sensitivity, and predictive value comparing the Illumina and ONT platforms as well as sample matrix. ROC was generated using total mapped read data from matched samples in Table 2. Matrix had the greatest effect on diagnostic performance, with plasma samples sequenced on Illumina and ONT platforms producing area under the ROC (AUROC) scores of 1 compared with whole blood AUROC scores of 0.976 and 0.988, respectively; however, these values were not statistically different (Supplemental Table S6). PRC were generated because they have been shown to handle analysis of imbalanced data sets more effectively than ROC curves.19 Similar to ROC analysis, plasma samples had an area under the PRC (AUPRC) of 1 when sequenced on both platforms. The AUPRC for whole blood samples sequenced on the Illumina and ONT platforms were 0.954 and 0.982, respectively. AUPRC scores between Illumina plasma/Illumina whole blood, Illumina whole blood/ONT whole blood, and Illumina whole blood/ONT plasma were considered significant using a 95% bootstrap CI with 1000 iterations (Supplemental Table S6). In all cases, the high AUROC and AUPRC scores indicate that both NGS platforms showed high specificity and sensitivity rates.
PIP performance was also described on pathogen identification from de-identified human clinical samples compared with gold standard quantitative RT-PCR methods post-SeqPlex amplification (Table 4). The Illumina platform had a true positive/negative rate of 0.91 and 1.00, whereas the ONT platform had values of 0.88 and 1.0. The positive/negative predictive values for the Illumina and ONT platforms were 1.0/0.94 and 1.0/0.92, respectively. A single negative replicate of sample SPL14-61-1 induced lower diagnostic values on the ONT platform. Both platforms were unable to detect CHIKV-positive sample SPL14-55-1. Both sequencing platforms produced no false-positive calls related to the misidentification of another organism within the reference database.
Table 4.
Detection of Pathogen in Human Clinical Samples Using PIP and Sequenced on Illumina and ONT Platforms
| Sample ID | CHIKV RT-qPCR (after SeqPlex) | Illumina concordant reference mapped reads, A | Illumina, B | Illumina, C | ONT concordant reference mapped reads, A | ONT, B | ONT, C |
|---|---|---|---|---|---|---|---|
| Sample 1 | 8.85 | 91,308 | 70,484 | 72,004 | 14,643 | 21,330 | 11,014 |
| Sample 2 | 10.05 | 48,473 | 81,426 | 74,848 | 14,176 | 12,051 | 12,233 |
| Sample 3 | 21.12 | 2075 | 25,014 | 8623 | 4577 | 8551 | 4089 |
| Sample 4 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 5 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 6 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 7 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 8 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 9 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 10 | 10.37 | 71,710 | 58,360 | 42,418 | 12,029 | 15,454 | 92,484 |
| Sample 11 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 12 | 7.75 | 65,718 | 70,261 | 60,276 | 12,889 | 13,419 | 12,877 |
| Sample 13 | 8.26 | 85,160 | 57,819 | 79,110 | 11,963 | 19,366 | 15,746 |
| Sample 14 | 16.92 | 31,662 | 35,337 | 31,182 | 18,609 | 18,049 | 21,361 |
| Sample 15 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 16 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 17 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 18 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 19 | 8.50 | 62,904 | 49,199 | 74,428 | 10,119 | 9477 | 24,283 |
| Sample 20 | 38.39 | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 21 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 22 | 8.97 | 77,666 | 54,865 | 62,727 | 15,237 | N/A | 16,664 |
| Sample 23 | 7.80 | 73,124 | 47,405 | 52,120 | 13,077 | 11,095 | 13,770 |
| Sample 24 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 25 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 26 | Negative | 0 | 0 | 0 | 0 | 0 | 0 |
CHIKV, chikungunya virus; EBOV, Ebola virus; N/A, library prep failed; ONT, Oxford Nanopore Technologies; PIP, Pathogen Identification Panel; RT-qPCR, quantitative RT-PCR.
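The rates reported for Table 4 can be reproduced with a short calculation. The sketch below is illustrative only and rests on two assumptions that the text does not state explicitly: each of the three replicates (A, B, C) is scored as an independent call, and the failed ONT library preparation (N/A) is counted as a negative call. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Table 4 lists 11 RT-qPCR-positive and 15 RT-qPCR-negative samples,
# each sequenced in triplicate (assumed to be scored per replicate).
POSITIVE_REPLICATES = 11 * 3
NEGATIVE_REPLICATES = 15 * 3

# Illumina: the three replicates of one positive sample returned 0 reads.
illumina = diagnostic_metrics(
    tp=POSITIVE_REPLICATES - 3, fp=0, tn=NEGATIVE_REPLICATES, fn=3
)

# ONT: the same three replicates returned 0 reads, plus one N/A replicate
# (assumption: the failed library preparation counts as a missed call).
ont = diagnostic_metrics(
    tp=POSITIVE_REPLICATES - 4, fp=0, tn=NEGATIVE_REPLICATES, fn=4
)

for name, metrics in (("Illumina", illumina), ("ONT", ont)):
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

Under these assumptions the rounded values agree with those given in the text (0.91/1.00 and 1.00/0.94 for Illumina; 0.88/1.00 and 1.00/0.92 for ONT). Scoring per sample rather than per replicate would give different ONT values.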
Discussion
Developed nearly two decades ago, MIPs are a molecular technique originally designed to perform massively multiplexed genotyping on microarrays. MIPs have evolved continuously in concert with downstream detection systems such as NGS and remain relevant because of their multiplex capabilities. Here, we designed an MIP panel to evaluate probe multiplexability and adaptability to current NGS platforms in various clinical matrices for the identification of a subset of select agents, classified by the CDC,20 as well as other BSL3/BSL4 pathogens and acquired antimicrobial resistance genes of concern.
The new PIP probe set expanded on a previously published MIP panel consisting of eight probes targeting variable regions of the 16S rRNA gene through the inclusion of 165 probes targeting viral, parasitic, and AR genetic elements.12 Adding novel probes into existing probe sets is an intrinsic capability of MIP technology. The 3′ and 5′ recognition sequences of MIPs are physically constrained by a linker backbone sequence and dual enzymatic steps, including a ligase and polymerase, minimizing probe crosstalk and spurious amplification. We classified 31 bacterial strains extracted from positive blood culture with PIP-amplified material and compared these results with previous classification results obtained using the 16S rRNA probe set alone.12 The expanded probe set showed concordant taxonomical classification on the Illumina platform, suggesting limited detrimental effects in this use case. The ability to tailor existing probe sets to detect new pathogens or genetic variants of interest is invaluable to targeted sequencing technologies. Multiplex PCR is limited in the number of primer pairs that can be contained within a single assay; however, MIP probe sets containing >10,000 probes have been used successfully.15 The number of probes afforded per panel makes the theoretical number of organisms or variants that can be detected in a single assay extremely high when combined with NGS as a detection device.
Despite the intrinsic multiplexing capabilities afforded by MIPs, verification of the probe set would be required when protocol changes, such as novel probe additions, pathogen additions, or clinical matrix alterations, are incorporated. Similar to multiplex PCR and other in vitro targeted diagnostics, bridging studies are required to determine how probe crosstalk and matrix inhibition affect diagnostic metrics. Deviations from a previously defined LOD using characterized reference material would require further evaluation and/or redesign of probe sets. Within the current study, we identified matrix-specific effects on LOD for Ebola virus in whole blood compared with plasma. These differences could be the result of multiple factors such as inhibitor carryover, which would affect MIP binding and amplification, or reduced extraction efficiency resulting in loss of target copies per sample. Similarly, targeting novel pathogens could also affect assay performance. For instance, PIP did not target fungal agents; however, pathogens such as Candida species may be desired in future panels. It is unlikely that fungal nucleic acid would behave differently than bacterial or viral targets; however, extraction protocols would need to be optimized for these organisms. Extraction conditions can be optimized to improve performance for specific matrices and organisms, although experimentation on how new extraction protocols affect previous diagnostic metrics would be required. In the end, despite the intrinsic specificity afforded by MIPs, the bridging experiments required to validate or verify novel targets and conditions need to be defined before use.
Along these lines, defining the use case of targeted NGS approaches such as MIPs before target design is essential for appropriate assay utilization. We evaluated PIP and NGS platform performance using multiple clinical matrices such as blood culture fluid, plasma, serum, and whole blood throughout the study. Although this breadth is valuable for initial proof of concept, defining parameters such as the clinical matrix before use is vital for appropriate probe design. For instance, although blood culture fluid was used as a complex matrix to classify bacterial agents, this is unlikely to be a primary clinical matrix for PIP-targeted NGS. Several diagnostic approaches are approved for rapid identification of bacterial agents from blood culture, including PCR panels and matrix-assisted laser desorption ionization–time of flight mass spectrometry.21 Targeted NGS is still significantly slower and more expensive than standard molecular techniques; therefore, more appropriate uses would be the identification of agents from unculturable samples or from limited samples on which large molecular panels cannot be performed. For bacterial blood culture, the inclusion of 16S rRNA probes could act as a secondary classification strategy with the primary intent to identify and track genetic elements of public concern such as AR genes. Factors such as clinical matrix, endemic pathogens, differentiation of pathogens with similar symptomology, and clinical versus biosurveillance use need to be considered before designing probe sets that will be verified or validated.
The selected downstream NGS platform represents another consideration when designing targeted sequencing panels. Two NGS technologies, the Illumina and ONT nanopore platforms, currently dominate the sequencing field. For both of these platforms, metagenomic and targeted sequence library creation requires the addition of sequencing adaptors onto nucleic acids through random ligation, transposases, or primer amplification with adaptor overhangs. Multiplex PCR panels are composed of primer pairs with platform-specific overhangs that allow for target-specific amplification followed by library enrichment and barcoding if desired. MIPs capture the target sequence before amplification, then use a universal primer sequence within the linker backbone for amplification, requiring only one primer pair with platform-specific overhangs (Supplemental Figure S2). The transition of MIP amplicons to the Illumina or ONT platform required no modifications to the probe pool, ensuring that the capture aspect of the panel remained consistent. Although there are many dissimilarities between sequencing platforms, the difference in base calling error rates could affect pathogen identification.22 In the current study, normalized mapped reads exhibited a high correlation between the Illumina and ONT platforms; however, the error rate affected taxonomic classification of bacteria, decreasing genus-level concordance from 96.7% to 90.3%. At the genus level, the percentage of misclassified reads doubled on the ONT platform compared with those on the Illumina platform (Supplemental Table S4 and Supplemental Figure S1). The bacterial strains selected within this article contained several Enterobacteriaceae family members, which are difficult to classify through 16S rRNA alone when using small amplicons.23,24 Misclassification due to error rate can be partially alleviated on the ONT platform when sequencing the entire 16S rRNA gene24; however, MIPs do not allow sequence capture lengths of this size.
For viral targets, sequencing PIP amplicons on either NGS platform resulted in minimal loss of diagnostic metrics such as LOD, repeatability, sensitivity, and specificity. Misclassification was less prevalent for viruses and parasites given the diversity of the target organisms. Designing probe target regions toward maximal heterogeneity as well as capturing multiple target regions for redundancy can effectively offset the high error rate and read misclassification seen on the ONT platform.
The dependence on well-curated genomes for the design of targeted panels remains a significant hurdle for targeted NGS technologies. Identifying conserved regions of interest that can identify the greatest number of pathogens or gene sequences while being heterologous enough for classification down to the species or strain level remains challenging. This can be especially challenging for MIPs because amplicon sizes are smaller due to constriction of the homologous arms by the linker backbone. More probes are therefore required to capture the genetic heterogeneity that longer sequence lengths would otherwise provide. These challenges were highlighted when using MIPs to identify AR elements from the CDC and US Food and Drug Administration AR Isolate Bank.25 These isolates possess fully sequenced genomes in addition to well-curated AR genes. The PIP successfully identified targeted AR gene families with the exception of a subset of OXA genes. The OXA gene family represents a large, diverse set of β-lactamases requiring multiple probes to cover the entire family.26 PIP, as currently designed, was not inclusive of all OXA genes, resulting in inadequate identification of this gene family. These results highlight that, although additional probes can be added to fill the gaps, the initial design of targeted panels is critical. Even with careful design, targeted NGS approaches are unable to detect novel or emerging pathogens that fall outside the panel; other diagnostic methods (eg, metagenomic NGS) are required to identify these organisms.
Metagenomic sequencing of complex matrices such as whole blood requires ultra-deep sequencing for infectious disease identification due to high host-to-pathogen nucleic acid ratios. For metagenomic sequencing, the utilization of cell-free matrices, such as cerebrospinal fluid or plasma, is ideal to reduce host contamination.27, 28, 29 Nonetheless, sequencing read depths of approximately 5 to 10 million reads are still required to identify potential pathogens.27 Using PIP before sequencing, we were able to multiplex an average of 55 samples per sequencing run, including negative controls, independent of sequencing platform and clinical matrix (Supplemental Table S7). The average reads per sample were approximately 500,000 and 50,000 for the Illumina MiSeq and ONT MinION platforms, respectively (Supplemental Table S7). Although no cost analysis was performed during this study, sequencing cartridges remain the most significant expense during NGS. A single 300-cycle v2 MiSeq Reagent Kit averages approximately $1000 USD per cartridge and has a maximum of 15 million reads per run. The cost per cartridge is similar for the ONT MinION R9.4 flow cell; however, significantly fewer total reads were produced. The reads per run afforded by these cartridges allow for a single metagenomic sample per sequencing run, assuming controls are included, resulting in significant per-sample costs. Currently, metagenomic sequencing is performed on instruments with greater read depth, such as the NovaSeq or HiSeq (both, Illumina), allowing for greater sample multiplexing, albeit at increased cartridge costs. As for the costs of MIPs, they are approximately four times the cost of a primer pair and significantly cheaper than real-time PCR fluorescent probes. This scenario results in a significant upfront cost depending on the size of the panel, although, once purchased, the cost per sample is negligible due to the concentrations required per reaction. In the end, the upfront costs associated with building a novel probe set can be expensive; however, the cost savings per sample due to sample multiplexing make targeted NGS feasible for fields such as routine diagnostics and biosurveillance.
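The multiplexing argument above can be made concrete with a small arithmetic sketch. It uses only the approximate figures quoted in this paragraph ($1000 per MiSeq cartridge, 15 million reads per run, 5 to 10 million reads per metagenomic sample, 55 targeted samples per run) and covers cartridge cost alone; probe synthesis, library preparation, and labor are deliberately excluded, and the variable and function names are hypothetical.

```python
# Approximate figures quoted in the text (cartridge cost only).
CARTRIDGE_COST_USD = 1000
MISEQ_MAX_READS_PER_RUN = 15_000_000
TARGETED_SAMPLES_PER_RUN = 55  # average reported with PIP, controls included


def cartridge_cost_per_sample(cartridge_cost, samples_per_run):
    """Split the cost of one sequencing cartridge evenly across a run."""
    return cartridge_cost / samples_per_run


def library_slots(max_reads_per_run, reads_per_sample):
    """Number of libraries (samples plus controls) that fit at a given depth."""
    return max_reads_per_run // reads_per_sample


# Metagenomic depth of 5 to 10 million reads per library.
slots_at_10m = library_slots(MISEQ_MAX_READS_PER_RUN, 10_000_000)
slots_at_5m = library_slots(MISEQ_MAX_READS_PER_RUN, 5_000_000)

targeted_cost = cartridge_cost_per_sample(CARTRIDGE_COST_USD, TARGETED_SAMPLES_PER_RUN)
metagenomic_cost = cartridge_cost_per_sample(CARTRIDGE_COST_USD, 1)

print(f"Metagenomic library slots per run: {slots_at_10m} to {slots_at_5m}")
print(f"Targeted (55 samples/run): ${targeted_cost:.2f} per sample")
print(f"Metagenomic (1 sample/run): ${metagenomic_cost:.2f} per sample")
```

With one to three library slots per run, reserving slots for controls leaves roughly one metagenomic sample per cartridge, which is the scenario the text describes. Under that assumption the cartridge cost per sample differs by a factor of 55 between the two approaches.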
The ability to characterize all DNA or RNA within a sample, clinical or environmental, including microbial pathogens as well as host responses, promises to revolutionize fields such as biosurveillance, bioforensics, and clinical microbiology.2 For routine diagnostics and biosurveillance efforts, cost, time-to-answer, and the technical abilities required to operate most sequencers remain significant hurdles. Technological advances in sequencing, including ONT’s small platform, low cost burden, ease of use, and near real-time sequencing, have pushed NGS forward. Here, we show that MIPs continue to be a valuable upfront molecular amplification technique easily adapted to ever-evolving downstream molecular technologies. The fundamental molecular aspects of MIPs, including the specificity and adaptability afforded by the linker backbone, promise that this molecular technique will continue to be utilized well into the future.
Acknowledgments
We thank Daniel J. Carucci for contributing P. falciparum, Strain 3D7, MRA-102, and Megan G. Dowler for contributing P. falciparum, Strain NF54 (Patient Line E), MRA-1000.
Footnotes
Supported by the Defense Threat Reduction Agency. The opinions, interpretations, conclusions, and recommendations contained herein are those of the authors and are not necessarily endorsed by the US Army.
Disclosures: None declared.
Supplemental material for this article can be found at http://doi.org/10.1016/j.jmoldx.2021.12.005.
Author Contributions
C.P.S. wrote the manuscript, performed the experiments, analyzed the results, and prepared all figures; A.T.H. and A.S.G. performed the experiments; T.D.M. designed and planned experiments, wrote the manuscript, and secured funding to perform research. All authors contributed to experimental design and planning and reviewed the manuscript.
Supplemental Data
Supplemental Figure S1.
16S variable region mapping. Stacked box plots showing percent Illumina and Oxford Nanopore Technologies (Oxford) sequences mapped to reference genera for three variable regions (V1V2, V3, and V6V7) for all bacterial organisms in Supplemental Table S1.
Supplemental Figure S2.
Molecular inversion probe (MIP) workflow. Workflow using MIPs for targeted sequencing on the Illumina and Oxford Nanopore Technologies next-generation sequencing (NGS) platforms.
References
- 1. Chiu C.Y., Miller S.A. Clinical metagenomics. Nat Rev Genet. 2019;20:341–355. doi: 10.1038/s41576-019-0113-7.
- 2. Minogue T.D., Koehler J.W., Stefan C.P., Conrad T.A. Next-generation sequencing for biodefense: biothreat detection, forensics, and the clinic. Clin Chem. 2019;65:383–392. doi: 10.1373/clinchem.2016.266536.
- 3. Collins F.S., Hamburg M.A. First FDA authorization for next-generation sequencer. N Engl J Med. 2013;369:2369–2371. doi: 10.1056/NEJMp1314561.
- 4. Laehnemann D., Borkhardt A., McHardy A.C. Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction. Brief Bioinform. 2016;17:154–179. doi: 10.1093/bib/bbv029.
- 5. Jain M., Olsen H.E., Paten B., Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17:239. doi: 10.1186/s13059-016-1103-0.
- 6. Theuns S., Vanmechelen B., Bernaert Q., Deboutte W., Vandenhole M., Beller L., Matthijnssens J., Maes P., Nauwynck H.J. Nanopore sequencing as a revolutionary diagnostic tool for porcine viral enteric disease complexes identifies porcine kobuvirus as an important enteric virus. Sci Rep. 2018;8:9830. doi: 10.1038/s41598-018-28180-9.
- 7. Mbala-Kingebeni P., Villabona-Arenas C.J., Vidal N., Likofata J., Nsio-Mbeta J., Makiala-Mandanda S., Mukadi D., Mukadi P., Kumakamba C., Djokolo B., Ayouba A., Delaporte E., Peeters M., Muyembe Tamfum J.-J., Ahuka-Mundeke S. Rapid confirmation of the Zaire Ebola virus in the outbreak of the Equateur Province in the Democratic Republic of Congo: implications for public health interventions. Clin Infect Dis. 2019;68:330–333. doi: 10.1093/cid/ciy527.
- 8. Greninger A.L., Naccache S.N., Federman S., Yu G., Mbala P., Bres V., Stryke D., Bouquet J., Somasekar S., Linnen J.M., Dodd R., Mulembakani P., Schneider B.S., Muyembe-Tamfum J.-J., Stramer S.L., Chiu C.Y. Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis. Genome Med. 2015;7:99. doi: 10.1186/s13073-015-0220-9.
- 9. Fu S., Wang A., Au K.F. A comparative evaluation of hybrid error correction methods for error-prone long reads. Genome Biol. 2019;20:26. doi: 10.1186/s13059-018-1605-z.
- 10. Bowden R., Davies R.W., Heger A., Pagnamenta A.T., de Cesare M., Oikkonen L.E., Parkes D., Freeman C., Dhalla F., Patel S.Y., Popitsch N., Ip C.L.C., Roberts H.E., Salatino S., Lockstone H., Lunter G., Taylor J.C., Buck D., Simpson M.A., Donnelly P. Sequencing of human genomes with nanopore technology. Nat Commun. 2019;10:1869. doi: 10.1038/s41467-019-09637-5.
- 11. De Coster W., De Rijk P., De Roeck A., De Pooter T., D'Hert S., Strazisar M., Sleegers K., Van Broeckhoven C. Structural variants identified by Oxford Nanopore PromethION sequencing of the human genome. Genome Res. 2019;29:1178–1187. doi: 10.1101/gr.244939.118.
- 12. Stefan C.P., Hall A.T., Minogue T.D. Detection of 16S rRNA and KPC genes from complex matrix utilizing a molecular inversion probe assay for next-generation sequencing. Sci Rep. 2018;8:2028. doi: 10.1038/s41598-018-19501-z.
- 13. Stefan C.P., Koehler J.W., Minogue T.D. Targeted next-generation sequencing for the detection of ciprofloxacin resistance markers using molecular inversion probes. Sci Rep. 2016;6:25904. doi: 10.1038/srep25904.
- 14. Hardenbol P., Banér J., Jain M., Nilsson M., Namsaraev E.A., Karlin-Neumann G.A., Fakhrai-Rad H., Ronaghi M., Willis T.D., Landegren U., Davis R.W. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat Biotechnol. 2003;21:673–678. doi: 10.1038/nbt821.
- 15. Hardenbol P., Yu F., Belmont J., Mackenzie J., Bruckner C., Brundage T., Boudreau A., Chow S., Eberle J., Erbilgin A., Falkowski M., Fitzgerald R., Ghose S., Iartchouk O., Jain M., Karlin-Neumann G., Lu X., Miao X., Moore B., Moorhead M., Namsaraev E., Pasternak S., Prakash E., Tran K., Wang Z., Jones H.B., Davis R.W., Willis T.D., Gibbs R.A. Highly multiplexed molecular inversion probe genotyping: over 10,000 targeted SNPs genotyped in a single tube assay. Genome Res. 2005;15:269–275. doi: 10.1101/gr.3185605.
- 16. Smith D.R., Lee J.S., Jahrling J., Kulesh D.A., Turell M.J., Groebner J.L., O’Guinn M.L. Development of field-based real-time reverse transcription-polymerase chain reaction assays for detection of Chikungunya and O’nyong-nyong viruses in mosquitoes. Am J Trop Med Hyg. 2009;81:679–684. doi: 10.4269/ajtmh.2009.09-0138.
- 17. Conrad T.A., Lo C.-C., Koehler J.W., Graham A.S., Stefan C.P., Hall A.T., Douglas C.E., Chain P.S., Minogue T.D. Diagnostic targETEd seQuencing adjudicaTion (DETEQT): algorithms for adjudicating targeted infectious disease next-generation sequencing panels. J Mol Diagn. 2019;21:99–110. doi: 10.1016/j.jmoldx.2018.08.008.
- 18. Mitra A., Skrzypczak M., Ginalski K., Rowicka M. Strategies for achieving high sequencing accuracy for low diversity samples and avoiding sample bleeding using Illumina platform. PLoS One. 2015;10:e0120520. doi: 10.1371/journal.pone.0120520.
- 19. Saito T., Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10:e0118432. doi: 10.1371/journal.pone.0118432.
- 20. CDC. CDC; Atlanta, GA: 2021. HHS and USDA Select Agents and Toxins.
- 21. Timsit J.-F., Ruppé E., Barbier F., Tabah A., Bassetti M. Bloodstream infections in critically ill patients: an expert statement. Intensive Care Med. 2020;46:266–284. doi: 10.1007/s00134-020-05950-6.
- 22. Besser J., Carleton H.A., Gerner-Smidt P., Lindsey R.L., Trees E. Next-generation sequencing technologies and their application to the study and control of bacterial infections. Clin Microbiol Infect. 2018;24:335–341. doi: 10.1016/j.cmi.2017.10.013.
- 23. Srinivasan R., Karaoz U., Volegova M., MacKichan J., Kato-Maeda M., Miller S., Nadarajan R., Brodie E.L., Lynch S.V. Use of 16S rRNA gene for identification of a broad range of clinically relevant bacterial pathogens. PLoS One. 2015;10:e0117617. doi: 10.1371/journal.pone.0117617.
- 24. Benítez-Páez A., Portune K.J., Sanz Y. Species-level resolution of 16S rRNA gene amplicons sequenced through the MinION™ portable nanopore sequencer. Gigascience. 2016;5:4. doi: 10.1186/s13742-016-0111-z.
- 25. CDC. CDC; Atlanta, GA: 2021. CDC & FDA Antibiotic Resistance Isolate Bank.
- 26. Evans B.A., Amyes S.G.B. OXA β-lactamases. Clin Microbiol Rev. 2014;27:241–263. doi: 10.1128/CMR.00117-13.
- 27. Schlaberg R., Chiu C.Y., Miller S., Procop G.W., Weinstock G., Professional Practice Committee and Committee on Laboratory Practices of the American Society for Microbiology; Microbiology Resource Committee of the College of American Pathologists. Validation of metagenomic next-generation sequencing tests for universal pathogen detection. Arch Pathol Lab Med. 2017;141:776–786. doi: 10.5858/arpa.2016-0539-RA.
- 28. Miller S., Naccache S.N., Samayoa E., Messacar K., Arevalo S., Federman S., Stryke D., Pham E., Fung B., Bolosky W.J., Ingebrigtsen D., Lorizio W., Paff S.M., Leake J.A., Pesano R., DeBiasi R., Dominguez S., Chiu C.Y. Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. Genome Res. 2019;29:831–842. doi: 10.1101/gr.238170.118.
- 29. Wilson M.R., Sample H.A., Zorn K.C., Arevalo S., Yu G., Neuhaus J., et al. Clinical metagenomic sequencing for diagnosis of meningitis and encephalitis. N Engl J Med. 2019;380:2327–2340. doi: 10.1056/NEJMoa1803396.