Abstract
Efficient detection of human respiratory viral pathogens is crucial in the management of patients with acute respiratory tract infection. Sequence-independent amplification of nucleic acids combined with next-generation sequencing technology and bioinformatics analyses is a promising strategy for identifying pathogens in clinical and public health settings. It allows the characterization of hundreds of different known pathogens simultaneously and of novel pathogens that elude conventional testing. However, major hurdles for its routine use exist, including cost, turnaround time, and especially sensitivity of the assay, as the detection limit is dependent on viral load, host genetic material, and sequencing depth. To obtain insights into these aspects, we analyzed nasopharyngeal aspirates from a cohort of 81 Thai children with respiratory disease for the presence of respiratory viruses using a sequence-independent next-generation sequencing approach and routinely used diagnostic real-time reverse transcriptase PCR (real-time RT-PCR) assays. With respect to the detection of rhinovirus and human metapneumovirus, the next-generation sequencing approach was at least as sensitive as diagnostic real-time RT-PCR in this small cohort, whereas for bocavirus and enterovirus, next-generation sequencing was less sensitive than real-time RT-PCR. The advantage of the sequencing approach over real-time RT-PCR was the immediate availability of virus-typing information. Considering the development of platforms capable of generating more output data at declining costs, next-generation sequencing remains of interest for future virus diagnosis in clinical and public health settings and certainly as an additional tool when screening results from real-time RT-PCR are negative.
INTRODUCTION
Laboratories nowadays largely perform viral species-specific assays for virus diagnosis in clinical samples to increase the sensitivity of detection and reduce the time needed for diagnosis. However, an etiological agent cannot be identified in many cases despite the use of a wide range of sensitive diagnostic assays (1–5). This can depend on the timing of sampling, performance of the individual assays, and also the involvement of divergent viruses that are not detected due to the high specificity of the assays. New perspectives for research and diagnostic applications of virus detection have opened up with recent advances in sequence-independent amplification techniques combined with next-generation sequencing platforms. These technologies are well known for their enormous output of sequence data at a relatively high but decreasing cost.
Sequence-independent next-generation sequencing approaches have been applied successfully to various fields in virology, including virus discovery, whole-virus genome reconstruction, and minority variant analyses (6–10). Sequence-independent amplification of nucleic acids combined with next-generation sequencing technology and bioinformatic analyses is a promising strategy for the rapid identification of pathogens in clinical and public health settings. It allows the characterization of numerous known pathogens simultaneously and of novel pathogens that elude conventional testing. The general view, however, is that genomics-based tools are unlikely to be used in a clinical diagnostic setting soon (11). The major hurdles are cost-effectiveness; the need for high-throughput formats suited to clinical settings; turnaround time; the requirement for investments in bioinformatics tools, databases, and data management; training of personnel; and guidelines for the reporting and interpretation of identified viruses whose clinical relevance is not clear (12). In addition, issues regarding patient privacy need to be resolved before the transition of genomics-based tools from a research setting to the clinic, as these tools yield sequence information from the host genome as well. Nevertheless, the costs for deep sequencing are still declining, and thorough comparisons of the sensitivities of sequence-independent next-generation sequencing approaches to those of current diagnostic assays are scarce.
Here, we describe the comparison of a sequence-independent next-generation sequencing approach to diagnostic real-time reverse transcriptase PCR (real-time RT-PCR) assays in a cohort of Thai children with respiratory disease. The data indicate that a sequence-independent next-generation sequencing approach is a relatively efficient tool for the simultaneous detection of multiple respiratory viruses, albeit slower than routine diagnostic real-time RT-PCR assays.
MATERIALS AND METHODS
Sample collection.
The nasopharyngeal aspirates included in this study were collected for diagnostic testing from children with respiratory illness (n = 261) from 2010 to 2013 and kept at the Center of Excellence in Clinical Virology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand (see Table S1 in the supplemental material). Out of 261 patients screened by standard diagnostic assays in Thailand for influenza A and B viruses (13) and (nested) in-house RT-PCR assays for human respiratory syncytial virus (RSV) (14), human rhinovirus (HRV) (15), enterovirus (EV) (15, 16), human adenovirus (hAdV) (17), human metapneumovirus (hMPV), human parainfluenza virus (hPIV) (18), and human coronavirus (hCoV), 89 had no diagnosis, and samples from 81 of these patients were available for additional studies. As the (nested) in-house RT-PCR assays in Thailand are not based on real-time PCR technology, it was assumed that this initial diagnostic screening may not have been optimal. The ages of the enrolled patients ranged from 8 days to 14 years. Patients were categorized into 1 of 4 groups: infant (<2 years), preschool age (2 to 5 years), primary school age (6 to 11 years), and secondary school age (12 to 15 years). Of the 81 patients, 66.7% were infants (n = 54), 25.9% were in preschool (n = 21), 3.7% were in primary school (n = 3), and 3.7% were in secondary school (n = 3) (see Table S1). The clinical severity of each disease case was defined as mild (a pediatric patient with acute respiratory tract infection [ARTI] complications without abnormal breath sounds and who did not require intubation) (9.9%; n = 8), moderate (patients with ARTI symptoms and abnormal breath sounds who did not require intubation) (60.5%; n = 49), or severe (patients with ARTI complications with abnormal breath sounds and who required intubation) (28.4%; n = 23) (see Table S1).
Ethics statement.
In compliance with relevant laws and institutional guidelines, ethical approval was obtained from the institutional review board of the Faculty of Medicine, Chulalongkorn University (IRB 493/2557). The study was conducted on anonymous stored clinical specimens. Patient identifiers, including personal information (e.g., name and address) and hospitalization numbers, were removed from these samples to protect patient confidentiality and do not appear in any part of the documentation in this study. Permission for specimen utilization was granted by the director of King Chulalongkorn Memorial Hospital, Thailand.
Sequence-independent next-generation sequencing.
Depletion of host nucleic acids, isolation of viral nucleic acids, sequence-independent amplification, and next-generation sequencing with a 454 GS Junior (Roche) were carried out as previously described (19–21) on 81 nasopharyngeal aspirates. Briefly, nasopharyngeal aspirates were centrifuged and filtered through a 0.45-μm-pore filter, after which the samples were treated with OmniCleave endonuclease (Epicentre; Illumina). RNA and DNA were extracted using the NucleoSpin RNA XS kit (Macherey-Nagel) and the High Pure viral nucleic acids kit (Roche). After first- and second-strand syntheses, random PCR amplification was performed, and PCR products were purified using the MinElute PCR purification kit (Qiagen) (20, 21). Subsequently, 12 samples were pooled for each library preparation: unique sequence tags were added to the PCR products of each sample using the GS FLX Titanium rapid library MID adaptor kit, and a library of DNA fragments was prepared using a GS FLX Titanium library preparation kit (454 Life Sciences; Roche). The libraries of DNA fragments were sequenced on a 454 GS Junior instrument (454 Life Sciences).
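Because tagged samples are pooled before sequencing, reads have to be assigned back to their sample of origin afterwards. The following is a minimal sketch of that idea only, assuming each read begins with its tag and that tags are matched exactly; the tag sequences, sample names, and the function `demultiplex` are hypothetical and do not reproduce the vendor software used with the instrument.

```python
def demultiplex(reads, tag_to_sample):
    """Assign each read to a sample by its leading tag and strip the tag.

    Assumes exact matching and that no tag is a prefix of another tag.
    """
    by_sample = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        for tag, sample in tag_to_sample.items():
            if read.startswith(tag):
                by_sample[sample].append(read[len(tag):])
                break
        else:
            # No tag matched; keep the read aside instead of guessing.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"AAAACCCCGG": "sample_1", "TTTTGGGGCC": "sample_2"}
pooled = [
    "AAAACCCCGGACGTACGT",
    "TTTTGGGGCCGGGTTTAA",
    "AAAACCCCGGTTGACA",
    "CCCCCCCCCCACGT",
]
assigned, leftover = demultiplex(pooled, tags)
print(assigned)
print(leftover)
```

Real tag handling usually tolerates sequencing errors in the tag; exact matching is used here only to keep the sketch short.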
Assembly.
Exhaustive iterative assembly of sequences is part of a virus discovery pipeline written in the Python 2.7 programming language, which includes trimming of reads and initial assembly with Newbler (454 GS Assembler version 2.7; Roche), with standard parameters. Trimmed reads and initial contigs were subjected to assembly by CAP3 (version date, 21 December 2007) with standard parameters. The resulting singletons and contigs were iteratively assembled by CAP3 until no new contigs were formed. Subsequently, the trimmed reads were mapped back to the identified taxonomic units with Newbler (454 GS Mapper version 2.7; Roche) using a minimum length of 75 nucleotides and otherwise standard parameters. The resulting contigs and singletons were filtered with DustMasker, which is part of the NCBI BLAST 2.2.25 suite of tools, to remove sequences that contained >60% low-complexity sequence.
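The final filtering step can be illustrated with a minimal sketch. It assumes that low-complexity regions have already been soft-masked as lowercase letters (for example, by a masking tool writing FASTA output) and that the records are held in a plain dictionary. The function names and example sequences are hypothetical and are not taken from the pipeline used in this study.

```python
def masked_fraction(seq: str) -> float:
    """Return the fraction of bases that are soft-masked (lowercase)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


def filter_low_complexity(records: dict, max_masked: float = 0.60) -> dict:
    """Keep sequences whose masked fraction does not exceed max_masked."""
    return {
        name: seq
        for name, seq in records.items()
        if masked_fraction(seq) <= max_masked
    }


# Hypothetical soft-masked contigs (lowercase = low complexity).
contigs = {
    "contig_1": "ACGTTGCAGGCTAACGTTAG",  # nothing masked
    "contig_2": "aaaaaaaaaaaaaaaaACGT",  # 16 of 20 bases masked
    "contig_3": "ACGTACGTACaaaaaaaaaa",  # 10 of 20 bases masked
}
kept = filter_low_complexity(contigs)
print(sorted(kept))
```

The 0.60 default mirrors the >60% threshold stated above; sequences at exactly 60% are kept in this sketch.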
Metagenome analysis.
After filtering the low-complexity sequences, the remaining taxonomic units were subjected to a BLASTN search against a database that contained only nucleotide sequences from birds (Aves, taxonomic identification [taxID] 8782), carnivores (Carnivora, taxID 33554), primates (Primates, taxID 9443), rodents (Rodentia, taxID 9989), and ruminants (Ruminantia, taxID 9845) with an E value cutoff of 0.001 for the subtraction of potential host sequences. Sequences without hits in the host BLAST were then subjected to a BLASTN search against the entire nucleotide database with an E value cutoff of 0.001. Due to limited capacity, all the sequences without hits were then subjected to a BLASTX search against sequences present in the GenBank nr database. BLAST hits were categorized by assigning taxonomic categories. The sensitivity of a deep-sequencing approach for detecting viruses is dependent on sequencing depth. In this study, the average number of reads analyzed per sample was ∼10,000; inherently, the detection limit lies at ∼0.01% of viral reads in the metagenome, which corresponds to 1 viral read in the metagenomic data set per sample. The whole sequence-independent next-generation sequencing approach, including analysis, takes up to 5 days.
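The host subtraction step can be sketched as follows. This is a minimal illustration, not the pipeline code: it assumes that the host BLASTN results are available in the standard 12-column tabular format (query ID in column 1, E value in column 11), and the query names, subject names, and hit lines are hypothetical.

```python
E_VALUE_CUTOFF = 0.001  # cutoff stated in the text


def queries_with_hits(blast_tab_lines, cutoff=E_VALUE_CUTOFF):
    """Return query IDs with at least one hit at or below the cutoff.

    Assumes standard 12-column tabular BLAST output, where column 1 is
    the query ID and column 11 is the E value.
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= cutoff:
            hits.add(fields[0])
    return hits


def subtract_host(all_queries, host_blast_lines):
    """Return the queries without a qualifying host hit, in input order."""
    host = queries_with_hits(host_blast_lines)
    return [query for query in all_queries if query not in host]


# Hypothetical example: contig_1 has a strong host hit, contig_3 only a
# weak one above the cutoff, and contig_2 has no host hit at all.
queries = ["contig_1", "contig_2", "contig_3"]
host_hits = [
    "contig_1\thost_seq_A\t99.0\t200\t2\t0\t1\t200\t50\t249\t1e-100\t360",
    "contig_3\thost_seq_B\t85.0\t40\t6\t0\t1\t40\t10\t49\t0.5\t30.1",
]
print(subtract_host(queries, host_hits))
```

The sequences returned by `subtract_host` would be the ones passed on to the search against the full nucleotide database.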
Diagnostic real-time RT-PCR assays.
Total nucleic acid was extracted from an aliquot (200 μl) of the 81 nasopharyngeal aspirates using the MagNA Pure LC total nucleic acid isolation kit and the MagNA Pure LC isolation station (Roche) and eluted in 50 μl of elution buffer, according to the manufacturer's instructions. Subsequently, the 81 samples were screened for the presence of human rhinovirus, enterovirus, and human metapneumovirus by a real-time RT-PCR with primers and probes used in the routine molecular viral diagnostics setting of Erasmus Medical Center essentially as described previously (22), except hMPV-probe-2 was not used. For bocaviruses, 4 μl extracted nucleic acid was amplified by a real-time PCR, as described previously (23).
Phylogenetic analysis.
Alignments and phylogenetic trees were prepared with MAFFT version 7 (http://mafft.cbrc.jp/alignment/server/) and Molecular Evolutionary Genetics Analysis version 6 (MEGA6) (24) with corresponding sequences of representative members of the respective virus families (see Table S3 in the supplemental material). Neighbor-joining phylogenetic trees were created with 1,000 bootstrap replicates using p-distance (HRV and HBoV) and maximum-likelihood composite models (EV and hMPV) as described previously (25–28).
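For readers unfamiliar with the p-distance, it is the proportion of compared alignment positions at which two sequences differ, and nucleotide identity is its complement. The sketch below illustrates the calculation on two aligned sequences. Skipping positions with a gap or an ambiguous base in either sequence is an assumption made here for simplicity and may differ from the gap treatment selected in MEGA6; the function name and example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-N") -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a character in skip_chars are
    ignored (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


# Hypothetical aligned fragments: one mismatch in ten compared sites.
d = p_distance("ACGTACGTAC", "ACGTTCGTAC")
print(f"p-distance = {d:.3f}, identity = {100 * (1 - d):.1f}%")

# With a gap, the gapped column is skipped: one mismatch in nine sites.
d_gap = p_distance("ACG-ACGTAC", "ACGTACGTAA")
print(f"p-distance with gap = {d_gap:.3f}")
```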
Nucleotide sequence accession numbers.
Nucleotide sequences of obtained partial viral genomes were deposited in GenBank under accession numbers KM361520 to KM361530.
RESULTS
Eighty-one nasopharyngeal aspirates from Thai children with acute respiratory tract infections (ARTI) who visited two hospitals located in Thailand between 2010 and 2013 (see Table S1 in the supplemental material) were analyzed by random amplification combined with next-generation sequencing (19–21). The taxonomic content of the different samples varied substantially (Fig. 1A) and showed no strong correlations with patient age or disease severity (data not shown). Although not significantly different, the moderate and severe disease cases seemed to have higher mean viral contents in the metagenome than did mild disease cases, in contrast to the bacterial contents, which were similar in mild, moderate, and severe disease cases (Fig. 1B and C). The identified mammalian viral sequences belonged to the families Anelloviridae (∼57% of patients), Picornaviridae (∼51%), Herpesviridae (∼12%), Orthomyxoviridae (∼5%), Paramyxoviridae (∼20%), Parvoviridae (∼21%), Adenoviridae (∼9%), Papillomaviridae (∼4%), and Retroviridae (∼1%) (Table 1; see also Table S2 in the supplemental material). Single and multiple infections occurred in 28% and 60% of the children, respectively (48% and 25% when only clinically well-established infectious respiratory pathogens able to induce disease on their own were considered).
FIG 1.
Overview of metagenomic content. (A) Relative abundance of the main broad taxonomic categories in metagenomic sequences obtained from nasopharyngeal aspirates of 81 Thai children. The percentages of viral (B) and bacterial (C) reads of the total number of analyzed reads are displayed against disease category (explained in Materials and Methods).
TABLE 1.
Mammalian virus detection in Thai patients with respiratory disease
| Mammalian virus | No. of patients positive for indicated virus | No. (%) positive, mild disease (n = 8) | No. (%) positive, moderate disease (n = 49) | No. (%) positive, severe disease (n = 23) |
|---|---|---|---|---|
| Anellovirus | 46 | 3 (37.5) | 29 (59.2) | 14 (60.9) |
| Rhinovirus | 28 | 3 (37.5) | 21 (42.9) | 4 (17.4) |
| Bocavirus | 16 | 1 (12.5) | 11 (22.4) | 4 (17.4) |
| Enterovirus | 13 | 0 | 13 (26.5) | 0 |
| Respiratory syncytial virus | 11 | 1 (12.5) | 9 (18.4) | 1 (4.3) |
| Herpesvirus | 10 | 0 | 7 (14.3) | 3 (13) |
| Adenovirus | 7 | 0 | 3 (6.1) | 4 (17.4) |
| Influenza virus | 4 | 2 (25) | 1 (2) | 1 (4.3) |
| Metapneumovirus | 4 | 0 | 2 (4.1) | 2 (8.7) |
| Papillomavirus | 3 | 0 | 2 (4.1) | 1 (4.3) |
| Parainfluenza virus | 1 | 0 | 1 (2) | 0 |
| Human endogenous retrovirus | 1 | 0 | 1 (2) | 0 |
| Parechovirus | 1 | 1 (12.5) | 0 | 0 |
To obtain insight into the sensitivity of the deep-sequencing approach compared with that of the real-time RT-PCR assays routinely used for virus detection in clinical settings, we performed diagnostic real-time RT-PCRs for rhinovirus, enterovirus, hMPV, and bocavirus on the entire sample set. These viruses were chosen based on genome composition exemplifying both RNA (negative and positive stranded) and DNA viruses with genome sizes ranging from ∼5.5 to 13 kb. A total of 23, 15, 3, and 21 patients were identified as positive for rhinovirus, enterovirus, hMPV, and bocavirus, respectively (Table 2; Table S2 in the supplemental material). In general, a strong correlation was observed between the threshold cycle (CT) and the percentage of viral reads identified by next-generation sequencing (Table 2; Fig. 2A to D; see also Table S2). These data suggest that the deep-sequencing approach was at least as sensitive as real-time RT-PCRs for rhinovirus and human metapneumovirus detection. However, the sequencing approach may be less sensitive than real-time RT-PCRs for enterovirus and bocavirus detection.
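The relationship between CT and the percentage of viral reads can be explored with a short calculation. The sketch below is illustrative only: the source does not state which correlation statistic was used, so a Pearson coefficient on log10-transformed percentages is assumed here, and only the Table 2 rows that list exactly one percentage/CT pair are included. It is therefore not a reproduction of the analysis shown in Fig. 2.

```python
import math

# (patient, % viral reads, CT) copied from Table 2 for rows that list
# exactly one %/CT pair; the virus label is not needed for this sketch.
PAIRS = [
    ("CU20", 0.054, 30.67), ("CU40", 1.477, 29.62), ("CU51", 0.460, 36.81),
    ("CU52", 82.341, 11.36), ("CU58", 0.340, 31.12), ("CU62", 0.319, 20.50),
    ("CU66", 28.491, 21.74), ("CU70", 10.990, 22.66), ("CU106", 13.960, 21.31),
    ("CU151", 7.016, 25.18), ("CU157", 0.057, 18.70), ("CU168", 0.010, 30.55),
    ("CU173", 8.850, 32.47), ("CU183", 0.084, 29.25), ("CB1", 8.770, 19.09),
    ("CB6", 0.010, 37.40), ("CB7", 0.013, 34.34), ("CB9", 5.312, 14.99),
    ("CB14", 15.390, 16.48), ("CB17", 0.011, 33.37), ("CB19", 25.944, 12.94),
    ("CB21", 0.332, 29.47), ("CB23", 0.074, 20.24),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Each PCR cycle roughly doubles the template, so CT is compared with
# the logarithm of the read percentage rather than the raw percentage.
log_percent = [math.log10(pct) for _, pct, _ in PAIRS]
ct_values = [ct for _, _, ct in PAIRS]
r = pearson(log_percent, ct_values)
print(f"n = {len(PAIRS)}, Pearson r = {r:.2f}")
```

A negative coefficient is the expected direction, because a lower CT indicates more template and should coincide with a larger share of viral reads.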
TABLE 2.
Summary of deep-sequencing and real-time PCR data
| Patient no. | Total no. of analyzed reads | HRVa, %e | HRV, CTf | EVb, % | EV, CT | hMPVc, % | hMPV, CT | HBoVd, % | HBoV, CT |
|---|---|---|---|---|---|---|---|---|---|
| CU4 | 6,301 | 0.079g | |||||||
| CU12 | 10,220 | 0.039g | |||||||
| CU14 | 7,063 | 21.94h | |||||||
| CU20 | 11,213 | 0.054 | 30.67 | ||||||
| CU36 | 17,533 | 0.165 | 30.65 | 34.71h | |||||
| CU37 | 4,067 | 32.93h | |||||||
| CU38 | 5,477 | 31.45h | 33.73h | ||||||
| CU40 | 7,314 | 1.477 | 29.62 | ||||||
| CU41 | 16,268 | 13.751 | 25.04 | 0.061 | 27.70 | ||||
| CU42 | 13,690 | 10.270 | 31.21 | 0.088 | 27.40 | ||||
| CU43 | 11,208 | 0.018 | 29.64 | 32.64h | |||||
| CU44 | 1,287 | 34.90h | |||||||
| CU46 | 7,384 | 24.67h | |||||||
| CU51 | 9,779 | 0.460 | 36.81 | ||||||
| CU52 | 2,554 | 82.341 | 11.36 | ||||||
| CU56 | 11,035 | 0.009g | 0.027 | 36.99 | |||||
| CU57 | 13,928 | 5.916 | 24.79 | 0.007 | 26.99 | ||||
| CU58 | 10,007 | 0.340 | 31.12 | ||||||
| CU62 | 941 | 0.319 | 20.50 | ||||||
| CU66 | 4,798 | 28.491 | 21.74 | ||||||
| CU70 | 10,482 | 10.990 | 22.66 | ||||||
| CU71 | 8,605 | 31.47h | |||||||
| CU72 | 7,565 | 0.225 | 28.50 | 0.013g | 35.51h | ||||
| CU106 | 10,086 | 13.960 | 21.31 | ||||||
| CU113 | 9,451 | 0.011g | |||||||
| CU115 | 12,716 | 2.139 | 27.35 | 0.016g | |||||
| CU121 | 15,163 | 0.092g | 30.38h | 32.65h | |||||
| CU127 | 25,231 | 0.020g | 0.067 | 29.44 | 3.028 | 26.44 | |||
| CU132 | 34,116 | 0.015g | |||||||
| CU134 | 11,504 | 0.635 | 24.59 | 13.543 | 24.33 | ||||
| CU136 | 12,848 | 30.57h | |||||||
| CU149 | 3,880 | 0.026g | |||||||
| CU151 | 5,060 | 7.016 | 25.18 | ||||||
| CU153 | 5,382 | 4.236g | |||||||
| CU157 | 3,490 | 0.057 | 18.70 | ||||||
| CU168 | 9,536 | 0.010 | 30.55 | ||||||
| CU171 | 4,545 | 0.330 | 22.74 | 29.131 | 20.63 | ||||
| CU173 | 452 | 8.850 | 32.47 | ||||||
| CU177 | 3,466 | 2.424 | 20.25 | 0.058 | 32.30 | ||||
| CU183 | 5,939 | 0.084 | 29.25 | ||||||
| CB1 | 6,944 | 8.770 | 19.09 | ||||||
| CB5 | 8,335 | 0.060 | 30.39 | 0.072 | 31.47 | ||||
| CB6 | 10,311 | 0.010 | 37.40 | ||||||
| CB7 | 7,828 | 0.013 | 34.34 | ||||||
| CB9 | 7,474 | 5.312 | 14.99 | ||||||
| CB10 | 13,903 | 0.007g | 0.036g | ||||||
| CB11 | 14,479 | 0.007g | 32.80h | 0.021 | 35.62 | ||||
| CB14 | 15,705 | 15.390 | 16.48 | ||||||
| CB17 | 26,657 | 0.011 | 33.37 | ||||||
| CB19 | 7,948 | 25.944 | 12.94 | ||||||
| CB21 | 7,227 | 0.332 | 29.47 | ||||||
| CB22 | 12,880 | 31.03h | 16.071 | 19.87 | 31.62h | ||||
| CB23 | 13,596 | 0.074 | 20.24 | ||||||
| CB24 | 10,698 | 0.028g | |||||||
a HRV, human rhinovirus.
b EV, human enterovirus.
c hMPV, human metapneumovirus.
d HBoV, human bocavirus.
e Percentage of virus reads out of the total number of analyzed reads.
f CT, real-time PCR cycle threshold value.
g Samples positive only by deep sequencing.
h Samples positive only by real-time PCR.
FIG 2.
Correlation between deep-sequencing and real-time RT-PCR assays. The percentages of viral reads of the total number of analyzed reads are displayed against the real-time RT-PCR CT values for rhinovirus (A), human metapneumovirus (B), enterovirus (C), and bocavirus (D). The theoretical next-generation sequencing detection limit (based on an average of 10,000 analyzed reads per sample) of 0.01% is indicated by dashed lines. Red, blue, and green dots indicate samples for which <5,000, ≥5,000 and <10,000, or ≥10,000 reads, respectively, were analyzed. Samples depicted on the x axis were negative by next-generation sequencing and positive by real-time RT-PCR. (E) A combination of panels A to D depicting the percentages of viral reads of the total number of analyzed reads displayed against the real-time RT-PCR CT values for all samples and viruses analyzed with >10,000 analyzed deep sequence reads. (F) A combination of panels A to D depicting the percentages of viral reads of the total number of analyzed reads displayed against the real-time RT-PCR CT values for all samples with >10,000 analyzed deep sequence reads and with samples that contained one or only a set of identical HRV, EV, hMPV, or HBoV reads assumed to be negative by the deep-sequencing approach.
The sensitivity of a deep-sequencing approach for detecting viruses is dependent on sequencing depth (11). In this study, the average number of reads per sample that were analyzed was ∼10,000 (range, 452 to 34,116 reads per sample) (Table 2; see also Table S2 in the supplemental material); inherently, the detection limit was ∼0.01% (range, 0.22% to 0.003%) viral reads in the metagenome (Fig. 2A to D). Upon exclusion of all samples with <10,000 analyzed next-generation sequencing reads (Fig. 2E), the overall correlation between the CT values and percentages of viral reads identified by sequencing was very strong. A subset of samples was detected by only one of the two assays (Fig. 2A to E). The next-generation sequencing approach was negative in a small number of samples (n = 8), with CT values of >30 (range, 30.4 to 34.7) (Table 2; Fig. 2E; see also Table S2). Conversely, in practically all cases in which real-time RT-PCR did not detect viral RNA/DNA and next-generation sequencing was positive (n = 10), a low number of virus-positive reads (mean, 4.2; range, 1 to 14) was obtained with the next-generation sequencing approach (Table 2; Fig. 2A to E; see also Table S2). We checked whether any of the detected viral reads by deep sequencing were identical between the 81 analyzed samples. This was the case for one bocavirus read of 159 bp, which was identical in the samples from patients CU58 and CU66. These patients, however, had 34 and 1,367 bocavirus reads, respectively, and the one identical read does not change the interpretation of our data. In addition to the exclusion of all samples with <10,000 analyzed next-generation sequencing reads, samples that contained one or only a set of identical HRV, EV, hMPV, or HBoV reads were assumed to be negative by the deep-sequencing approach (Fig. 2F). The overall strong correlation between the CT values and percentages of viral reads identified by sequencing remained.
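The dependence of the detection limit on sequencing depth follows from simple arithmetic: if a single viral read is the minimum detectable signal, the limit is one read divided by the number of analyzed reads. The sketch below reproduces the figures quoted above under that assumption; the function names are hypothetical.

```python
def detection_limit_percent(total_reads: int, min_viral_reads: int = 1) -> float:
    """Smallest detectable share of viral reads, as a percentage."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * min_viral_reads / total_reads


def percent_viral_reads(viral_reads: int, total_reads: int) -> float:
    """Share of viral reads among all analyzed reads, as a percentage."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * viral_reads / total_reads


# Lowest, average, and highest numbers of analyzed reads per sample.
for depth in (452, 10_000, 34_116):
    print(f"{depth:>6} reads -> limit {detection_limit_percent(depth):.3f}%")

# Bocavirus read counts reported for patients CU58 and CU66.
print(f"CU58: {percent_viral_reads(34, 10_007):.3f}%")
print(f"CU66: {percent_viral_reads(1_367, 4_798):.3f}%")
```

The same arithmetic shows why samples with few analyzed reads are less informative: at 452 reads, one viral read already represents about 0.22% of the data set.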
An advantage of using a next-generation sequencing approach to detect viruses in clinical specimens is that it can also be used to obtain information regarding the virus species and/or type of virus that was identified, in contrast to the real-time RT-PCR assays used in this study. Indeed, we obtained virus type information (see Table S2 in the supplemental material). Near-full-length rhinovirus, enterovirus, human metapneumovirus, and bocavirus genomes were obtained from 11 different patients, in whom >10% of the analyzed reads (mean CT value, 20.7; range, 11.4 to 31.2) were of the described viruses (Table 2; see also Table S2). The genetic relationships of these viruses with representative viral genomes of the respective viral families were assessed (Fig. 3A to D). Rhinoviruses from patients CB14, CB19, and CU106 showed the highest nucleotide identity (91.9%, 90.6%, and 83.6%) with HRV-A types 23, 61, and 88, respectively. The rhinoviruses obtained from patients CU41 and CU42 showed the highest nucleotide identity (∼97.6% and 67%) with HRV-C types 4 and 9, respectively, in the phylogenetic tree (Fig. 3A). Rhinovirus CU42 may constitute a new rhinovirus C type (type 55), as it displayed the highest nucleotide identity (∼74%) to HRV-51, HRV-26, and HRV-36 (29). Three near-complete enterovirus genomes were obtained from patients CU70, CU134, and CU171, and these genomes showed the highest nucleotide identity to enterovirus 68 strains from Japan and New Zealand (Fig. 3B). The human metapneumovirus genome from patient CB22 was most closely related (∼98% nucleotide identity) to a genetic group A hMPV from China (Fig. 3C). The two bocavirus strains from patients CU52 and CU66 were most closely related (∼99% nucleotide identity) to HBoV-1 strains previously identified in Thailand (Fig. 3D). Thus, virus type information was available in the next-generation sequencing data.
FIG 3.
Phylogenetic analysis of obtained genome sequences. (A) A phylogenetic tree of the near-complete rhinovirus genomes from patients CU41, CU42, CU106, CB14, and CB19 and representative human rhinoviruses (corresponding to nucleotides [nt] 132 to 6677 of HRV-23) was generated using MEGA6 with the neighbor-joining method with p-distance parameter and 1,000 bootstrap replicates. Bootstrap values are shown. (B) A phylogenetic tree of the near-complete enterovirus genomes from patients CU70, CU134, and CU171 and representative human enteroviruses (corresponding to nt 126 to 7143 of EV-NZ-2010-541) was generated using MEGA6 with the neighbor-joining method with the maximum-likelihood composite parameter and 1,000 bootstrap replicates. Bootstrap values are shown. (C) A phylogenetic tree of the near-complete hMPV genome from patient CB22 and representative human metapneumoviruses (corresponding to nt 32 to 13080 of hMPV-gz01) was generated using MEGA6 with the neighbor-joining method with p-distance parameter and 1,000 bootstrap replicates. Bootstrap values are shown. (D) A phylogenetic tree of the near-complete bocavirus genomes from patients CU52 and CU66 and representative human bocaviruses (corresponding to nt 61 to 5201 of HBoV-CU74) was generated using MEGA6 with the neighbor-joining method with the maximum-likelihood composite parameter and 1,000 bootstrap replicates. Bootstrap values are shown. GenBank accession numbers we used are shown in Table S3 in the supplemental material.
DISCUSSION
In principle, random amplification combined with next-generation sequencing would be advantageous in clinical viral diagnostics because it eliminates the need to design and validate tens or hundreds of specific primers and probes targeting multiple viral pathogens, and it does not require continuous adaptation of the primer sequences as new variants and species are described. Instead, it allows the characterization of hundreds of different known pathogens simultaneously without a priori knowledge of the virus present and provides in-depth virus species/type information.
A recent review of the application of next-generation sequencing technology in clinical diagnostics described several major drawbacks, among which is the fact that random amplification results in the amplification of all nucleic acids, including host nucleic acids, suggesting that an analytical sensitivity similar to that of diagnostic PCRs cannot be expected, even with the increased depth of sequencing and specific pathogen enrichment steps applied (11). Our data indicate that this may be less of a problem, as a strong correlation existed between the CT values from diagnostic real-time RT-PCRs and the percentages of viral reads identified by next-generation sequencing for rhinovirus, enterovirus, bocavirus, and human metapneumoviruses, which is in line with previous observations (30). Using a next-generation sequencing platform that generated an average of 10,000 reads per sample, our sequencing approach was at least as sensitive as diagnostic real-time RT-PCRs for rhinovirus and human metapneumovirus detection. However, the deep-sequencing approach seemed less sensitive than did real-time RT-PCRs for enterovirus and bocavirus detection. This possibly reflects differences in the optimization of real-time RT-PCR assays and detection with next-generation sequencing. The detection of RNA viruses with the sequencing approach was generally better than for DNA viruses, which may be due to genome size differences and/or the nucleic acid isolation procedure, as the removal of host DNA was more substantial in the RNA isolation than in the DNA isolation procedure. As shown by Wylie and coworkers (31), increasing the sequencing depth using other next-generation sequencing platforms with increased sequence output may increase the sensitivity of the next-generation sequencing approach even further. Thus, achieving an analytical sensitivity comparable to that of diagnostic real-time RT-PCRs using next-generation sequencing platforms may actually be possible.
A common belief is that the necessary genetic information required for accurate virus typing is unpredictable when partial sequences are acquired due to insufficient genome coverage (11). This, however, depends on a number of variables, including the divergence of viruses within the virus families under study and which part of the genome is sequenced. In our study, typing information was readily available from next-generation sequencing data from 11 different patients in whom >10% of the analyzed reads (mean CT value, 20.7; range, 11.4 to 31.2) were of the described viruses. In line with previous observations, virus type information was obtained even when only a few sequence reads were available (31). For example, most enterovirus-positive samples contained viral reads that were typed as enterovirus D type 68. Until recently, reports of respiratory infections due to enterovirus 68 (EV68) were rare, but over the past 3 years, outbreaks in Japan, the Philippines, and the Netherlands, as well as epidemic clusters in the United Kingdom, have implicated EV68 as an emerging respiratory pathogen (32–36). Our data confirm previous observations regarding enterovirus 68 infections in Thailand (16). In addition, we typed most of the rhinovirus-positive samples as being A, B, or C and obtained near-full-length genome sequences that were typed as A23, A61, or A88; C4; or a new type that we designated C55 (29). All the hMPV strains were typed as A, the human bocaviruses were type I, and RSV (64% type A, 27% type B) and human herpesvirus (HHV) infections (22% HHV-6B, 67% HHV-5, 11% HHV-4) were typed as well.
The next-generation sequencing approach identified a number of viruses that are usually not screened for in respiratory infections using routine diagnostic assays, including Herpesviridae. It is of note that herpesviruses were identified in 12% of the samples analyzed here (mean, 0.24% of analyzed reads; range, 0.007% to 2%) and that they always occurred in combination with at least one other viral infection. This may be due to the detection of latent viruses, but it is also possible that respiratory disease caused by other viruses results in the reactivation of latent herpesviruses. However, it cannot be ruled out that human herpesviruses are involved in respiratory disease. Cytomegalovirus (CMV) infections have been described to cause pneumonia sporadically, and in this study, CMV was detected at relatively high levels using next-generation sequencing (∼2% of the analyzed reads) in an 8-month-old patient (CU188) and seemingly dominated the simultaneous RSV infection.
Another prominent observation was the detection of anelloviruses in respiratory samples from more than half of the children with respiratory illness. Anellovirus infections are commonly acquired during early childhood, during which the virus establishes a chronic productive infection with long-lasting detectable viremia (37). They are endemic worldwide and can be detected in blood and various tissues in the body, including cerebrospinal and bronchoalveolar lavage fluid (37). Many investigations have been carried out to unravel their epidemiological, clinical, and pathogenic properties, but at present, their causative role in disease is not considered likely. Previously, we showed that the incidence of human anelloviruses in vitreous fluid of patients with seasonal hyperacute panuveitis (SHAPU), a potentially blinding ocular disease occurring in Nepal that principally affects young children, was significantly higher than that in non-SHAPU patients. The data suggested that anelloviruses observed in vitreous fluid samples of patients with uveitis, but not in patients with retinal detachment, most likely originated from the systemic anellovirus pool upon inflammation-induced disruption of the blood-ocular barrier (38). Likewise, inflammation in the respiratory tract system, caused by an infection, for example, may cause increased permeability of the capillary barrier, resulting in edema, an influx of inflammatory cells, and the concomitant increase in the amount of anellovirus in the respiratory tract.
One of the major questions is how to interpret next-generation sequencing data in terms of what is clinically relevant for the patient. The comparison of the next-generation sequencing data to real-time RT-PCR CT values showed that the interpretation may not need to differ much from the interpretation of CT values, as there is a very strong correlation between the percentage of viral reads detected and the CT value obtained from the same sample. However, because the sensitivity of the next-generation sequencing assay relative to real-time RT-PCR differs from one RT-PCR assay to another, the next-generation sequencing assay may need to be validated against each real-time RT-PCR assay separately. In addition, our data show that multiple viral infections occurred substantially more often than did single-virus infections (60% versus 28%, respectively). Even when only well-established respiratory pathogens were taken into account, 25% of the cases were multiple infections. In most cases, one of the detected viruses seemed to dominate. Although major hurdles for the routine use of next-generation sequencing approaches do exist, including cost, labor intensity, and especially turnaround time, the sensitivity of a next-generation sequencing assay may actually reach that of current routine diagnostic real-time RT-PCR assays, and it potentially provides more information regarding virus species/type; it thus remains of interest for virus diagnosis in clinical and public health settings. Although it might not be used for routine diagnostics at present, a sequence-independent next-generation sequencing assay may be applied to samples that remain negative with routine diagnostics and, as such, may function as an additional diagnostic tool with the added value of surveillance for (re)emerging viruses.
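The relationship between the percentage of viral reads and the CT value discussed above can be illustrated with a minimal Python sketch. It is not the analysis performed in this study: the function names, the log10 transformation of the read percentage (chosen here because CT values scale roughly with the logarithm of template amount), and the sample values are all assumptions made for illustration, and the values are invented placeholders rather than measurements from the cohort.

```python
import math


def percent_viral_reads(viral_reads: int, total_reads: int) -> float:
    """Return the share of analyzed reads assigned to a virus, in percent."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * viral_reads / total_reads


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from this study:
# (CT value, reads assigned to the virus, total analyzed reads)
TOY_SAMPLES = [
    (18.0, 20000, 1000000),
    (22.0, 1500, 1000000),
    (27.0, 60, 1000000),
    (31.0, 5, 1000000),
]

ct_values = [ct for ct, _, _ in TOY_SAMPLES]
# Assumption: compare CT with log10 of the read percentage.
log_percent = [
    math.log10(percent_viral_reads(viral, total))
    for _, viral, total in TOY_SAMPLES
]

r = pearson_r(ct_values, log_percent)
print(f"Pearson r between CT and log10(% viral reads): {r:.3f}")
```

Because a lower CT value indicates more template, a negative coefficient is the expected direction; how close it lies to -1 in real data would have to be established for each real-time RT-PCR assay separately.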
ACKNOWLEDGMENTS
This work was partially funded by the Virgo Consortium, which is funded by the Dutch government (project number FES0908), the Netherlands Genomics Initiative (NGI) (project number 050-060-452), and ZonMW TOP (project 91213058).
Thai sites included the National Research University Project, Office of Higher Education Commission (HR1155A, WCU-001,007-HR-57); the Center of Excellence in Clinical Virology, Chulalongkorn University; the Centenary Academic Development Project, Integrated Innovation Academic Center; the Chulalongkorn University Centenary Academic Development Project (CU56-HR01); the Ratchadaphiseksomphot Endowment Fund of Chulalongkorn University (RES560530093); and the Outstanding Professor and RGJ Ph.D. program of the Thailand Research Fund (DPG5480002, PHD/0114/2551).
A. D. M. E. Osterhaus and S. L. Smits are part-time employees of Viroclinics Biosciences B.V. This does not alter our adherence to all the policies on sharing data and materials.
Footnotes
Published ahead of print 6 August 2014
Supplemental material for this article may be found at http://dx.doi.org/10.1128/JCM.01641-14.
REFERENCES
- 1. Chu CM, Lin DY, Yeh CT, Sheen IS, Liaw YF. 2001. Epidemiological characteristics, risk factors, and clinical manifestations of acute non-A-E hepatitis. J. Med. Virol. 65:296–300. 10.1002/jmv.2033
- 2. Finkbeiner SR, Kirkwood CD, Wang D. 2008. Complete genome sequence of a highly divergent astrovirus isolated from a child with acute diarrhea. Virol. J. 5:117. 10.1186/1743-422X-5-117
- 3. Granerod J, Crowcroft NS. 2007. The epidemiology of acute encephalitis. Neuropsychol. Rehabil. 17:406–428. 10.1080/09602010600989620
- 4. Juvén T, Mertsola J, Waris M, Leinonen M, Meurman O, Roivainen M, Eskola J, Saikku P, Ruuskanen O. 2000. Etiology of community-acquired pneumonia in 254 hospitalized children. Pediatr. Infect. Dis. J. 19:293–298. 10.1097/00006454-200004000-00006
- 5. Saeed M, Zaidi SZ, Naeem A, Masroor M, Sharif S, Shaukat S, Angez M, Khan A. 2007. Epidemiology and clinical findings associated with enteroviral acute flaccid paralysis in Pakistan. BMC Infect. Dis. 7:6. 10.1186/1471-2334-7-6
- 6. Capobianchi MR, Giombini E, Rozera G. 2013. Next-generation sequencing technology in clinical virology. Clin. Microbiol. Infect. 19:15–22. 10.1111/1469-0691.12056
- 7. Lipkin WI. 2010. Microbe hunting. Microbiol. Mol. Biol. Rev. 74:363–377. 10.1128/MMBR.00007-10
- 8. Mokili JL, Rohwer F, Dutilh BE. 2012. Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2:63–77. 10.1016/j.coviro.2011.12.004
- 9. Smits SL, Osterhaus AD. 8 April 2013. Virus discovery: one step beyond. Curr. Opin. Virol. 3:1–6. 10.1016/j.coviro.2013.03.007
- 10. van Boheemen S, de Graaf M, Lauber C, Bestebroer TM, Raj VS, Zaki AM, Osterhaus AD, Haagmans BL, Gorbalenya AE, Snijder EJ, Fouchier RA. 2012. Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans. mBio 3:e00473-12. 10.1128/mBio.00473-12
- 11. Lecuit M, Eloit M. 2014. The diagnosis of infectious diseases by whole genome next generation sequencing: a new era is opening. Front. Cell Infect. Microbiol. 4:25. 10.3389/fcimb.2014.00025
- 12. Smits SL, Osterhaus ADME. 2012. Emerging viral infections, p 1142–1154 In Ginsburg GS, Willard HF. (ed), Genomic and personalized medicine, vol 1 Academic Press, London, United Kingdom
- 13. Suwannakarn K, Payungporn S, Chieochansin T, Samransamruajkit R, Amonsin A, Songserm T, Chaisingh A, Chamnanpood P, Chutinimitkul S, Theamboonlers A, Poovorawan Y. 2008. Typing (A/B) and subtyping (H1/H3/H5) of influenza A viruses by multiplex real-time RT-PCR assays. J. Virol. Methods 152:25–31. 10.1016/j.jviromet.2008.06.002
- 14. Auksornkitti V, Kamprasert N, Thongkomplew S, Suwannakarn K, Theamboonlers A, Samransamruajkij R, Poovorawan Y. 2014. Molecular characterization of human respiratory syncytial virus, 2010–2011: identification of genotype ON1 and a new subgroup B genotype in Thailand. Arch. Virol. 159:499–507. 10.1007/s00705-013-1773-9
- 15. Linsuwanon P, Payungporn S, Samransamruajkit R, Posuwan N, Makkoch J, Theanboonlers A, Poovorawan Y. 2009. High prevalence of human rhinovirus C infection in Thai children with acute lower respiratory tract disease. J. Infect. 59:115–121. 10.1016/j.jinf.2009.05.009
- 16. Linsuwanon P, Puenpa J, Suwannakarn K, Auksornkitti V, Vichiwattana P, Korkong S, Theamboonlers A, Poovorawan Y. 2012. Molecular epidemiology and evolution of human enterovirus serotype 68 in Thailand, 2006–2011. PLoS One 7:e35190. 10.1371/journal.pone.0035190
- 17. Sriwanna P, Chieochansin T, Vuthitanachot C, Vuthitanachot V, Theamboonlers A, Poovorawan Y. 2013. Molecular characterization of human adenovirus infection in Thailand, 2009–2012. Virol. J. 10:193. 10.1186/1743-422X-10-193
- 18. Ruampunpong H, Payungporn S, Samransamruajkit R, Pratheepamornkul T, Theamboonlers A. 2014. Human parainfluenza virus infection in Thai children with lower respiratory tract infection from 2010 to 2013. Southeast Asian Pac. J. Trop. Med. Public Health 45:610–621
- 19. Bodewes R, van der Giessen J, Haagmans BL, Osterhaus AD, Smits SL. 2013. Identification of multiple novel viruses, including a parvovirus and a hepevirus, in feces of red foxes. J. Virol. 87:7758–7764. 10.1128/JVI.00568-13
- 20. van den Brand JM, van Leeuwen M, Schapendonk CM, Simon JH, Haagmans BL, Osterhaus AD, Smits SL. 2012. Metagenomic analysis of the viral flora of pine marten and European badger feces. J. Virol. 86:2360–2365. 10.1128/JVI.06373-11
- 21. van Leeuwen M, Williams MM, Koraka P, Simon JH, Smits SL, Osterhaus AD. 2010. Human picobirnaviruses identified by molecular screening of diarrhea samples. J. Clin. Microbiol. 48:1787–1794. 10.1128/JCM.02452-09
- 22. Hoek RA, Paats MS, Pas SD, Bakker M, Hoogsteden HC, Boucher CA, van der Eerden MM. 2013. Incidence of viral respiratory pathogens causing exacerbations in adult cystic fibrosis patients. Scand. J. Infect. Dis. 45:65–69. 10.3109/00365548.2012.708942
- 23. Kantola K, Sadeghi M, Antikainen J, Kirveskari J, Delwart E, Hedman K, Soderlund-Venermo M. 2010. Real-time quantitative PCR detection of four human bocaviruses. J. Clin. Microbiol. 48:4044–4050. 10.1128/JCM.00686-10
- 24. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 30:2725–2729. 10.1093/molbev/mst197
- 25. Chieochansin T, Simmonds P, Poovorawan Y. 2010. Determination and analysis of complete coding sequence regions of new discovered human bocavirus types 2 and 3. Arch. Virol. 155:2023–2028. 10.1007/s00705-010-0781-2
- 26. Gaunt ER, Jansen RR, Poovorawan Y, Templeton KE, Toms GL, Simmonds P. 2011. Molecular epidemiology and evolution of human respiratory syncytial virus and human metapneumovirus. PLoS One 6:e17427. 10.1371/journal.pone.0017427
- 27. Palmenberg AC, Spiro D, Kuzmickas R, Wang S, Djikeng A, Rathe JA, Fraser-Liggett CM, Liggett SB. 2009. Sequencing and analyses of all known human rhinovirus genomes reveal structure and evolution. Science 324:55–59. 10.1126/science.1165557
- 28. Wisdom A, Kutkowska AE, McWilliam Leitch EC, Gaunt E, Templeton K, Harvala H, Simmonds P. 2009. Genetics, recombination and clinical features of human rhinovirus species C (HRV-C) infections; interactions of HRV-C with other respiratory viruses. PLoS One 4:e8518. 10.1371/journal.pone.0008518
- 29. McIntyre CL, Knowles NJ, Simmonds P. 2013. Proposals for the classification of human rhinovirus species A, B and C into genotypically assigned types. J. Gen. Virol. 94:1791–1806. 10.1099/vir.0.053686-0
- 30. de Vries M, Oude Munnink BB, Deijs M, Canuti M, Koekkoek SM, Molenkamp R, Bakker M, Jurriaans S, van Schaik BD, Luyf AC, Olabarriaga SD, van Kampen AH, van der Hoek L. 2012. Performance of VIDISCA-454 in feces-suspensions and serum. Viruses 4:1328–1334. 10.3390/v4081328
- 31. Wylie KM, Mihindukulasuriya KA, Sodergren E, Weinstock GM, Storch GA. 2012. Sequence analysis of the human virome in febrile and afebrile children. PLoS One 7:e27735. 10.1371/journal.pone.0027735
- 32. Imamura T, Fuji N, Suzuki A, Tamaki R, Saito M, Aniceto R, Galang H, Sombrero L, Lupisan S, Oshitani H. 2011. Enterovirus 68 among children with severe acute respiratory infection, the Philippines. Emerg. Infect. Dis. 17:1430–1435. 10.3201/eid1708.101328
- 33. Jacobson LM, Redd JT, Schneider E, Lu X, Chern SW, Oberste MS, Erdman DD, Fischer GE, Armstrong GL, Kodani M, Montoya J, Magri JM, Cheek JE. 2012. Outbreak of lower respiratory tract illness associated with human enterovirus 68 among American Indian children. Pediatr. Infect. Dis. J. 31:309–312. 10.1097/INF.0b013e3182443eaf
- 34. Kaida A, Kubo H, Sekiguchi J, Kohdera U, Togawa M, Shiomi M, Nishigaki T, Iritani N. 2011. Enterovirus 68 in children with acute respiratory tract infections, Osaka, Japan. Emerg. Infect. Dis. 17:1494–1497. 10.3201/eid1708.110028
- 35. Rahamat-Langendoen J, Riezebos-Brilman A, Borger R, van der Heide R, Brandenburg A, Scholvinck E, Niesters HG. 2011. Upsurge of human enterovirus 68 infections in patients with severe respiratory tract infections. J. Clin. Virol. 52:103–106. 10.1016/j.jcv.2011.06.019
- 36. Xiang Z, Gonzalez R, Wang Z, Ren L, Xiao Y, Li J, Li Y, Vernet G, Paranhos-Baccala G, Jin Q, Wang J. 2012. Coxsackievirus A21, enterovirus 68, and acute respiratory tract infection, China. Emerg. Infect. Dis. 18:821–824. 10.3201/eid1805.111376
- 37. Maggi F, Bendinelli M. 2010. Human anelloviruses and the central nervous system. Rev. Med. Virol. 20:392–407. 10.1002/rmv.668
- 38. Smits SL, Manandhar A, van Loenen FB, van Leeuwen M, Baarsma GS, Dorrestijn N, Osterhaus AD, Margolis TP, Verjans GM. 2012. High prevalence of anelloviruses in vitreous fluid of children with seasonal hyperacute panuveitis. J. Infect. Dis. 205:1877–1884. 10.1093/infdis/jis284