Journal of Clinical Microbiology. 2024 Feb 26;62(3):e01111-23. doi: 10.1128/jcm.01111-23

A robust, scalable, and cost-efficient approach to whole genome sequencing of RSV directly from clinical samples

Sophie Köndgen 1,#, Djin-Ye Oh 1,#, Andrea Thürmer 2, Somayyeh Sedaghatjoo 2, Livia V Patrono 3,2, Sébastien Calvignac-Spencer 3,3, Barbara Biere 1, Thorsten Wolff 1, Ralf Dürrwald 1, Stephan Fuchs 2, Janine Reiche 1,
Editor: John P Dekker 4
PMCID: PMC10935636  PMID: 38407068

ABSTRACT

Respiratory syncytial virus (RSV) is a leading cause of acute lower respiratory tract infections causing significant morbidity and mortality among children and the elderly; two RSV vaccines and a monoclonal antibody have recently been approved. Thus, there is an increasing need for a detailed and continuous genomic surveillance of RSV circulating in resource-rich and resource-limited settings worldwide. However, robust, cost-effective methods for whole genome sequencing of RSV from clinical samples that are amenable to high-throughput are still scarce. We developed Next-RSV-SEQ, an experimental and computational pipeline to generate whole genome sequences of historic and current RSV genotypes by in-solution hybridization capture-based next generation sequencing. We optimized this workflow by automating library preparation and pooling libraries prior to enrichment in order to reduce hands-on time and cost, thereby augmenting scalability. Next-RSV-SEQ yielded near-complete to complete genome sequences for 98% of specimens with Cp values ≤31, at median on-target reads >93%, and mean coverage depths between ~1,000 and >5,000, depending on viral load. Whole genomes were successfully recovered from samples with viral loads as low as 230 copies per microliter RNA. We demonstrate that the method can be expanded to other respiratory viruses like parainfluenza virus and human metapneumovirus. Next-RSV-SEQ produces high-quality RSV genomes directly from culture isolates and, more importantly, clinical specimens of all genotypes in circulation. It is cost-efficient, scalable, and can be extended to other respiratory viruses, thereby opening new perspectives for a future effective and broad genomic surveillance of respiratory viruses.

IMPORTANCE

Respiratory syncytial virus (RSV) is a leading cause of severe acute respiratory tract infections in children and the elderly, and its prevention has become an increasing priority. Recently, vaccines and a long-acting monoclonal antibody to protect effectively against severe disease have been approved for the first time. Hence, there is an urgent need for genomic surveillance of RSV at the global scale to monitor virus evolution, especially with an eye toward immune evasion. However, robust, cost-effective methods for RSV whole genome sequencing that are suitable for high-throughput processing of clinical samples are currently scarce. Therefore, we have developed Next-RSV-SEQ, an experimental and computational pipeline that reliably produces high-quality RSV genomes directly from clinical specimens and isolates.

KEYWORDS: respiratory syncytial virus, surveillance, whole genome sequencing, hybrid capture, NGS

INTRODUCTION

Respiratory syncytial virus (RSV) is a leading cause of acute lower respiratory tract infection in high-, middle-, and low-income settings (1). Young children and the elderly are at high risk of severe disease and hospitalization, resulting in a significant burden on health care systems worldwide. This became particularly clear when the easing of coronavirus disease 2019 restrictions was followed by pronounced off-season RSV surges that put unprecedented strain on pediatric hospitals (2). RSV prevention has become an increasing priority in recent years: a novel monoclonal antibody was recently approved for prevention of RSV lower respiratory tract disease in infants, complementing palivizumab, which for years was the only approved anti-RSV drug (3); moreover, several vaccine candidates are in the late stages of clinical development, of which two have recently been approved (2, 4, 5). With promising immunoprophylactics on the horizon, a comprehensive understanding of RSV molecular diversity and evolution is required. Global genomic surveillance of RSV needs to be established, which will require a ramp-up in RSV sequencing capacity, particularly in developing countries (6–8).

RSV is an enveloped, single-stranded, negative-sense RNA virus. There are two groups, A and B, each comprising multiple genotypes. The RSV genome is 15.2 kb long and has 10 genes encoding nonstructural and structural proteins (9). The latter include the attachment glycoprotein (G) and the fusion (F) glycoprotein, which initiate infection and are the targets of neutralizing antibodies. Traditionally, RSV molecular surveillance efforts have relied on Sanger sequencing of the highly variable G gene (7). Molecular surveillance at the whole genome level is more precise and informative: it provides insight into RSV genetic diversity at higher resolution, is better suited for assessing immune escape, given that monoclonals and most vaccines target the F rather than the G protein, and enables recognition of potential recombinants. RSV is not particularly amenable to culture isolation (7); thus, potential whole genome sequencing (WGS) methods need to generate robust results from primary respiratory specimens.

WGS of viral pathogens relies on next generation sequencing (NGS), for which there are several approaches. Direct (“shotgun”) sequencing of total nucleic acids from clinical specimens yields sequences from both virus and host, with the latter usually being vastly overrepresented. Sequencing must therefore be performed at great depth to ensure high sensitivity, which makes the method expensive. Alternatively, overlapping PCRs can be used to enrich the viral genome (“amplicon sequencing”), improving sequence quality and reducing cost; however, in case of primer mismatches, amplicon sequencing may result in sequencing gaps (“amplicon dropouts”). Closing these sequencing gaps may require rerunning PCRs with updated primers, which complicates workflows. Thus, amplicon sequencing can be a laborious and/or error-prone approach and may require regular updates to cope with ongoing viral evolution.

Our goal was to establish an NGS method that produces high-quality RSV genomes while avoiding the drawbacks of shotgun and amplicon sequencing. To this end, we applied hybridization capture, a method that uses custom-designed biotinylated probes complementary to the target genomes to enrich them selectively. The probes typically tolerate 10%–20% sequence divergence (10, 11) and are thus more resistant to sequence mismatches than PCR primers. Hybridization capture has been successfully used for WGS of DNA and RNA viruses (11–14), including respiratory viruses and, among them, RSV (15–18). Building on these earlier efforts, we developed Next-RSV-SEQ, an experimental and computational pipeline specifically designed to enable high-quality, broad, scalable, and cost-efficient genomic surveillance of RSV.

MATERIALS AND METHODS

Samples

All samples studied here originate from Germany’s national virological sentinel surveillances for monitoring acute respiratory infections from ambulant (ARI) and hospitalized (SARI) patients, and were collected between 1999 and 2022. Briefly, physicians collect upper respiratory specimens (mostly nasal or pharyngeal swabs) from S/ARI patients and send them to the Robert Koch Institute. After their arrival in the laboratory, viruses were washed out after the addition of 1.5–3 mL of cell culture medium (minimum essential medium with N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid buffer with 5,000 U/mL PenStrep) to yield a total volume of 2.5–4 mL sample material. Subsequently, samples undergo molecular diagnostics for a broad panel of respiratory viruses (19). The samples analyzed in this study tested positive for RSV, human metapneumovirus (HMPV), or parainfluenza virus (PIV) by specific real-time PCR assays that have been previously described (19, 20).

Virus culture

For virus cultivation, aliquots of RSV- or PIV-positive samples were sterile-filtered (0.45 µm) and inoculated on Hep-2 cells (BioWhittaker) or CaCo-2 cells (European Collection of Authenticated Cell Cultures), respectively. Cells were observed microscopically daily for cytopathic effect (CPE). If virus-typical CPE was noted, culture medium was harvested and frozen at −70°C.

RNA extraction, cDNA synthesis, and quantification

RNA was extracted from 200 µL clinical isolate or clinical swab sample and eluted in 50 µL or 100 µL elution buffer using the MagNA Pure 24 Total NA Isolation Kit or the MagNA Pure 96 DNA and viral NA small volume kit (Roche). RNA was subsequently treated with TURBO DNase (Ambion, Life Technologies). First strand cDNA synthesis was performed using Superscript IV reverse transcriptase (Thermo Fisher Scientific) according to the manufacturer’s instructions, with 11.2 µL RNA, ribonuclease inhibitor (Thermo Fisher Scientific), and random hexamer primer (0.5 µM). Double-stranded (ds) cDNA was generated using Klenow fragment (Fermentas). Briefly, the total volume (20 µL) of single-stranded cDNA was mixed (on ice) with a 20 µL reaction mix containing random hexamer primer (0.5 µM final concentration), deoxynucleotide triphosphate (dNTP) mix (0.5 mM final concentration each), 1× Klenow reaction buffer, 10 units of Klenow fragment (Thermo Fisher Scientific), and 11.4 µL PCR-grade H2O. The reaction was incubated at 37°C for 60 minutes. Ds cDNA was then purified with MagSi beads (Steinbrenner) and eluted in 25 µL Tris-Cl. Double-stranded DNA concentrations were measured using Qubit dsDNA High Sensitivity Kit (Life Technologies). In addition, viral loads were quantified using qPCR specific for RSV, HMPV, and PIV (19, 20) and applying synthetic DNA molecules (GeneArt Strings, Thermo Fisher Scientific) as quantification standards (for standard curves, see Fig. S1). The PCRs were shown to run with PCR efficiencies of 91%–109% with a linear detection range of 10^6 to 10^1 genome equivalents per reaction. Samples with Cp ≤31 per 5 µL ds cDNA were further processed, resulting in viral loads of ≥500 copies per 5 µL ds cDNA for RSV and ≥1,600 or ≥2,700 copies per 5 µL ds cDNA for HMPV or PIV. This, in turn, corresponds to 230 RSV genome copies per microliter RNA, 740 HMPV genome copies per microliter RNA, and 1,240 PIV genome copies per microliter RNA.
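
The conversions between the units used here can be retraced with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the back-calculation to RNA assumes that the 25 µL ds cDNA eluate derives from the 11.2 µL RNA input without any loss (an assumption, not stated by the authors), and the efficiency formula is the standard qPCR relation E = 10^(-1/slope) - 1 rather than something taken from this study.

```python
def copies_per_library(copies_per_5ul: float, input_volume_ul: float) -> float:
    """Scale a qPCR result (copies per 5 µL ds cDNA) to a library input volume."""
    return copies_per_5ul / 5.0 * input_volume_ul


def copies_per_ul_rna(copies_per_5ul: float,
                      eluate_ul: float = 25.0,
                      rna_input_ul: float = 11.2) -> float:
    """Back-calculate copies per µL RNA.

    Assumes (hypothetically) that all copies in the ds cDNA eluate stem
    from the RNA input volume and that no material is lost on the way.
    """
    total_copies_in_eluate = copies_per_5ul / 5.0 * eluate_ul
    return total_copies_in_eluate / rna_input_ul


def pcr_efficiency(slope: float) -> float:
    """Efficiency from the slope of Cp versus log10(copies); 1.0 means 100%."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    print(copies_per_library(500, 13))        # copies in a 13 µL library input
    print(round(copies_per_ul_rna(500), 1))   # copies per µL RNA (lossless assumption)
    print(round(pcr_efficiency(-3.32), 3))    # slope near -3.32 is close to 100%
```

Under these assumptions, 500 copies per 5 µL correspond to 1,300 copies in a 13 µL library input and to roughly 223 copies per microliter RNA. The latter is close to, but not identical with, the 230 copies reported above, so the published figures presumably rest on a slightly different conversion factor or rounding.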

Library preparation

Manual library preparation

For each sample, 16 µL of ds cDNA was taken as input. Ds cDNA was fragmented using a Covaris S220 Focused-ultrasonicator in a volume of 130 µL TE buffer using settings to generate a 400 bp fragment size (intensity = 4, duty cycle = 10%, cycles per burst = 200, treatment time = 55 s, temperature = 7°C). Fragmented extracts were then concentrated using the MinElute PCR Purification Kit (Qiagen) and eluted into 50 µL TE buffer. Libraries were prepared using the NEBNext Ultra II DNA library kit following the recommended protocol, with NEBNext Multiplex Oligos (Dual Index Set) (New England Biolabs).

Automated library preparation on a Hamilton Microlab STAR (Hamilton)

Libraries were constructed from 13 µL ds cDNA using half volumes of the NEBNext Ultra II FS DNA library preparation kit and NEBNext Multiplex Oligos (Unique Dual Index Primer Pairs; New England Biolabs). Fragmentation time was set to 5 minutes. We followed the manufacturer’s instructions with the exception that the number of index cycles was set to a fixed number of 8, 10, or 12 cycles (according to the DNA concentrations of the respective sample batch).

In-solution hybridization capture and sequencing

RNA baits covering RSV (full and partial) genomes from publicly available sequences were designed and synthesized by a service provider (myBaits, Daicel Arbor Bioscience) with flexible threefold tiling of 100mer baits (probes). To gain insight into the capacity of our method for detecting other respiratory viruses, probes based on PIV and HMPV sequences were added to the probe set. Accession numbers of the genomes used to design the probes are available for download in the supplemental material. We followed the myBaits hybridization capture for targeted NGS protocol (version 4.01) using a 1:4 dilution of the baits. Between 3 and 20 libraries of approximately equal viral load (referring to viral copy numbers of the cDNA input) were pooled. If necessary, libraries were diluted to obtain greater uniformity within pools. Pools were then concentrated with a MinElute PCR Purification Kit (Qiagen). The starting material ranged between 9 and 2,272 ng per pool. Pools were hybridized for 18–22 h at 65°C. Following capture and washing steps, the captured DNA was amplified using the KAPA Hifi Library Amplification Kit (KAPA Biosystems) for 12 cycles and purified using 1.8× MagSi beads (Steinbrenner Laborsysteme), followed by a final quantification using the Qubit dsDNA High Sensitivity Kit (Life Technologies). Some pools with lower concentrations (<2 nM) were amplified for an additional 5–10 cycles. The final products were diluted to 2 nM. Batches of 40–60 samples were sequenced on a MiSeq flow cell using v3 reagent chemistry for 2 × 300 cycles (Illumina), generating up to 50 million paired-end reads per run.
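
Diluting the final pools to 2 nM requires converting a mass concentration (as measured by Qubit) into a molar one. The sketch below shows that arithmetic under the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names and all example values (fragment length, concentrations, volumes) are hypothetical and not taken from the study.

```python
def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration to nM.

    Uses the approximation of 660 g/mol per base pair (an assumption of
    this sketch; the source does not state how molarity was derived).
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def dilution_volumes(stock_nM: float, target_nM: float, final_volume_ul: float):
    """Return (stock volume, diluent volume) in µL using C1 * V1 = C2 * V2."""
    if stock_nM < target_nM:
        raise ValueError("stock is below target; amplify further instead of diluting")
    stock_ul = target_nM * final_volume_ul / stock_nM
    return stock_ul, final_volume_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical pool: 1.65 ng/µL at a mean fragment length of 500 bp
    pool_nM = ng_per_ul_to_nM(1.65, 500)
    print(round(pool_nM, 2))
    print(dilution_volumes(pool_nM, 2.0, 20.0))
```

A pool whose computed molarity falls below the target raises an error in this sketch, mirroring the case described above in which pools below 2 nM received additional amplification cycles.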

Data analysis of RSV sequences

Data preprocessing

The raw sequencing data were processed using FastP v 0.23.2 (21) to remove low-quality bases and adapter sequences. The parameters used were a minimal PHRED score of 20 and a minimal read length of 50. The reads were then taxonomically filtered using kraken2 v 2.1.2 (22) to exclude potential contaminations. The database used covers reference genomes of RSV types A and B (sourced from GenBank in April 2022) as well as Homo sapiens (GRCh38) to allow automated read selection within the workflow and can be retrieved from zenodo (10.5281/zenodo.8133844).

Genome assembly and annotation

Target genomes were assembled using a reference-based approach. Pre-processed reads were aligned to a reference genome using bwa-mem2, with a minimum mapping quality of 15 and marking shorter split hits as secondary. For samples with a partial duplication of 60 or 72 base pairs within the G gene as identified by Sanger sequencing, the reference genomes KY654518 (RSVA) or KM517573 (RSVB) were chosen, whereas samples without duplication were aligned to OK649614 (RSVA) or KU316134 (RSVB). Samtools v.1.4 was then used to convert resulting sam to bam files and to remove PCR duplicates from the alignment. Both variant calling and generation of consensus sequences were performed using iVar v1.3.1 with default parameters, but a minimum read depth of 10 for variant calling and a minimum insertion frequency threshold of 0 for consensus generation, respectively.
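
The consensus rule applied here (a base is called where at least 10 unique reads cover a position and the majority agree, as summarized in the Fig. 1 legend) can be illustrated with a deliberately simplified Python sketch. This is not iVar's implementation: iVar additionally considers base qualities, indels, and its own frequency thresholds. The input format (one string of observed bases per position) and the function name are assumptions made for illustration.

```python
from collections import Counter


def call_consensus(pileup, min_depth=10, min_fraction=0.5):
    """Call a majority consensus from per-position base observations.

    pileup: list of strings, one per reference position, holding the bases
            observed in deduplicated reads (hypothetical input format).
    Positions below min_depth, or without a strict majority, become 'N'.
    """
    consensus = []
    for bases in pileup:
        counts = Counter(bases)
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")  # insufficient coverage
            continue
        base, n = counts.most_common(1)[0]
        # Require a strict majority for the most frequent base
        consensus.append(base if n / depth > min_fraction else "N")
    return "".join(consensus)


if __name__ == "__main__":
    example = ["A" * 12, "A" * 5, "C" * 6 + "T" * 6, "G" * 9 + "T" * 3]
    print(call_consensus(example))
```

In the example, the second position is masked because only five reads cover it, and the third because neither base reaches a strict majority.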

Quality control and automation

The quality of the genome assembly was evaluated using different metrics, including the number of raw reads, trimmed reads, aligned reads, deduplicated aligned reads, and the absolute and relative genome fractions with 10/50/100/500-fold coverage (based on unique reads). We implemented this workflow as an automated pipeline using Snakemake as the workflow manager. The pipeline is publicly available (https://gitlab.com/rki_bioinformatics/next-rsv-seq). Our bioinformatics pipeline undergoes continual development and enhancement. A significant recent update includes the introduction of an automated reference sequence selection feature. Please refer to the README file available at https://gitlab.com/rki_bioinformatics/next-rsv-seq for detailed information.
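
To make the coverage metrics concrete, the following Python sketch computes the absolute and relative genome fractions at the four depth thresholds from a list of per-position depths. It is a minimal illustration, not code from the published pipeline; the function name and the input format (one integer depth per reference position, e.g., as could be parsed from a depth table) are assumptions.

```python
def coverage_fractions(depths, thresholds=(10, 50, 100, 500)):
    """Return {threshold: (absolute positions, relative fraction)}.

    depths: per-position read depths based on unique (deduplicated) reads.
    """
    genome_length = len(depths)
    if genome_length == 0:
        raise ValueError("depth list is empty")
    result = {}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        result[t] = (covered, covered / genome_length)
    return result


def mean_depth(depths):
    """Arithmetic mean of the per-position depths."""
    return sum(depths) / len(depths)


if __name__ == "__main__":
    toy_depths = [0, 10, 60, 600]  # toy genome with four positions
    for threshold, (n, frac) in coverage_fractions(toy_depths).items():
        print(f"{threshold}x: {n} positions ({frac:.0%})")
    print("mean depth:", mean_depth(toy_depths))
```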

Data analysis of PIV and HMPV sequences

PIV and HMPV sequences were analyzed as described for RSV, using the appropriate kraken2 databases and reference sequences for taxonomic filtering and genome assembly, respectively. Databases can be retrieved from zenodo (10.5281/zenodo.8133844).

Classification of Passed and Failed samples

Sequenced samples were classified as Passed when they displayed ≥98% genome coverage at a minimum read depth of 10 unique reads per position. Samples with genome coverage below 98% at 10× read depth were classified as Failed.
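
Expressed as code, this criterion is a single threshold check. The Python sketch below assumes per-position depths of unique reads as input; the function name and input format are hypothetical.

```python
def classify_sample(depths, min_depth=10, min_breadth=0.98):
    """Classify a sample as 'Passed' or 'Failed'.

    Passed: at least min_breadth of reference positions are covered by
    at least min_depth unique reads; otherwise Failed.
    """
    if not depths:
        raise ValueError("depth list is empty")
    covered = sum(1 for d in depths if d >= min_depth)
    breadth = covered / len(depths)
    return "Passed" if breadth >= min_breadth else "Failed"


if __name__ == "__main__":
    # Toy examples with 1,000 positions each
    print(classify_sample([100] * 990 + [0] * 10))  # 99% covered at >=10x
    print(classify_sample([100] * 970 + [5] * 30))  # 97% covered at >=10x
```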

To exclude genetic variation as a reason for sequencing failure, we generated consensus sequences of Failed samples at 2× depth. In one case, where this was not possible, an existing Sanger sequence of the G gene was used.

Genotyping

RSV is divided into various genotypes, subgenotypes, and lineages (23). Hereafter, we use the expression “genotype” for easier reading.

Genotypes were assessed based on phylogenetic analyses of consensus sequences from this study and associated reference sequences as provided by Goya et al. (23). A multiple sequence alignment, including sequences of Passed as well as Failed samples, was compiled from the ectodomain region of the G protein gene of RSV-A and RSV-B using Mafft, as implemented in Geneious, version 2021.2.2. A maximum likelihood tree was inferred using iq-tree software version 1.5.3 with 10,000 ultrafast bootstrap replicates as statistical support (24, 25).

RESULTS

Next-RSV-SEQ workflow

We developed a hybrid capture-based NGS workflow that enables whole genome sequencing of RSV-A and RSV-B genomes, both from culture isolates and directly from clinical swab samples across multiple genotypes.

Briefly, RNA underwent DNA digestion and ds cDNA synthesis, followed by quantitative PCR. Ds cDNA with Cp ≤31 were chosen for library preparation and hybridization capture. Captured libraries were sequenced on an Illumina MiSeq system, followed by genome reconstruction using a customized bioinformatic pipeline (Fig. 1).

Fig 1.

Next-RSV-SEQ, an experimental and computational pipeline enabling a broad, scalable, and cost-efficient genomic surveillance of RSV. (A) Schematic representation describing key steps within the NGS workflow. RNA underwent DNA digestion and ds cDNA synthesis, followed by quantitative PCR and library preparation. With pooled libraries, hybridization capture was performed using a customized RNA bait panel that enriches libraries containing RSV-A and RSV-B sequences. After a final amplification and quantification step, enriched libraries were sequenced on an Illumina MiSeq platform, and the resulting sequencing data sets were provided for bioinformatic analysis. (B) Scheme of the bioinformatic workflow. Raw read data were trimmed and taxonomically filtered. Pre-processed reads were then aligned to a reference genome, followed by removal of PCR duplicates. Consensus sequences were generated by calling bases at positions covered by ≥10 unique reads and for which the majority of the reads agreed. This workflow was implemented as an automated pipeline using Snakemake as the workflow manager.

Next-RSV-SEQ library preparation was initially set up as a manual protocol based on mechanical fragmentation. Subsequently, preparation was transferred to an automated process using enzymatic fragmentation, so that either manual or automated library preparation can be used.

Next-RSV-SEQ enables sequencing of a broad spectrum of RSV genotypes

To assess the robustness of the method across a broad diversity of RSV genotypes, Next-RSV-SEQ was applied to a large RSV-A and RSV-B panel collected between 1999 and 2020 (n = 199). Consensus sequences were assigned to genotypes according to clustering with reference sequences provided by Goya et al. (23) (Fig. S2 and S3).

Within our genomic surveillance, we defined empirical quality control criteria in advance, categorizing samples as follows: successfully sequenced samples were classified as Passed when they displayed ≥98% genome coverage at a minimum read depth of 10 unique reads per position. Samples with genome coverage below 98% at 10× read depth were classified as Failed. Genotypes of samples that did not meet our quality criteria were determined based on consensus sequences generated at 2× depth or Sanger sequencing.

Next-RSV-SEQ yielded whole genome sequences across nine RSV-A and seven RSV-B genotypes (Fig. 2A), representing a wide range of historic and recent RSV genotypes (Fig. 2B), which confirms good probe affinity across genotypes. With the exception of genotype GB5.0.0, Failed samples did not accumulate for a specific genotype (Fig. 2A); importantly, sequences of GB5.0.0 samples categorized as Failed (determined from 2× depth consensus sequences) were >97% identical with Passed sequences of the same genotype, which is within the range of sequence variation tolerated by hybridization probes. Thus, Next-RSV-SEQ is unbiased with respect to RSV genotype, enabling capture-based WGS of widely diverse RSV.

Fig 2.

(A) Doughnut charts showing the percentages of Passed (colored) and Failed (gray) sequences sorted by RSV-A and RSV-B genotypes. Total numbers are given in the centers. Genotypes were assigned upon phylogenetic analysis of the G gene ectodomain. (B) Global circulation of the identified genotypes over time. Periods of detections are adopted from Goya et al. (23) and were updated based on our data.

Next-RSV-SEQ optimization

We optimized our protocol with an eye toward cost-effectiveness and scalability: to this end, libraries were prepared using enzymatic fragmentation. For simple batch-wise processing, the number of index cycles was fixed to either 8, 10, or 12 cycles, depending on the DNA concentrations of the respective sample batch. Then, this protocol was set up on a Hamilton automated liquid handling system using half volumes of reagents. Capture-based NGS costs, incurred by reagents and hands-on and instrument time, can be substantially reduced by sample multiplexing, i.e., by pooling multiple libraries in a single capture reaction (10, 26, 27). To evaluate whether the bait concentration is sufficient for even higher pool sizes, we compared pools including 10 versus 20 libraries. Each library contained approximately 2 × 10^5 viral copies, which had been shown to be a promising input quantity. As observed in a head-to-head comparative analysis, there was no drop in sequence quality when increasing the pool size, making pooling of up to 20 libraries feasible (Fig. S4).

Next-RSV-SEQ performs specifically and efficiently

Using the optimized protocol, we implemented this workflow on 121 clinical samples and 35 culture isolates. We determined NGS performance by assigning samples to four categories according to their viral load (Fig. 3), here defined as the viral copy number of the ds cDNA input volume for library preparation (copies/library).

Fig 3.

(A and B) On-target rates and mean coverage depth for sequenced samples grouped by viral load. Fig. S5 visualizes these data on a continuous x-axis. (C–F) Violin plots depict distributions of the genome coverage at depths of 10× (C), 50× (D), 100× (E), and 500× (F) for samples in the different viral load categories. Viral load categories are based on the total number of copies of the ds cDNA input used for library preparation. Circles indicate median values.

On-target rates were high, with >93% RSV-specific reads for the majority of samples across viral load categories (Fig. 3A). The mean coverage depth was high, with a median close to 1,000 even for samples in the lowest viral load category (<10^4 copies/library). Mean coverage depth improved further with increasing viral load, up to a median >5,000 for samples with viral loads ≥10^5 (Fig. 3B). As depicted in Fig. 3C through F, Next-RSV-SEQ consistently provides genome coverage with median values close to, or reaching, 100% at a depth of 10×, 50×, 100×, and, for samples with viral loads >10^4 copies, even 500×.

According to our quality criteria, 153/156 samples (98%) were sequenced successfully (Fig. 4A). Sequenced samples, of which 77% were clinical specimens, had a broad range of viral loads, from 1.3 × 10^3 to 1.4 × 10^7 copies with a median of 2.3 × 10^5. Only 2% of the samples failed sequencing; their viral loads ranged between 2.1 × 10^3 and 1.8 × 10^5 (Fig. 4A). As shown in Fig. 4B, both RSV-A and RSV-B sequences categorized as Passed displayed high read depth across the entire genome, each on average more than two orders of magnitude above the 10× coverage required for obtaining a consensus sequence.

Fig 4.

(A) Viral input quantities (total copies of the ds cDNA input) used for library preparation and related sequencing outcome. Passed refers to sequences ≥98% genome coverage at 10× read depth, and Failed refers to <98% genome coverage at 10× read depth. Data points are colored in dark gray for clinical swab samples and in light gray for isolates. Black lines denote median values. (B) Average sequencing depth for all Passed samples across the reference genome sequence. Displayed are locally weighted scatterplot smoothing (LOWESS) regression of median (central curve), 20th percentile (lower curve), and 80th percentile (upper curve) values, using a bandwidth of 1/25. The area between the 20th and 80th percentile curves is highlighted by color, emphasizing the distribution of sequencing coverage within that range. Red lines denote the coverage depth threshold used for consensus generation.

Extending Next-RSV-SEQ to other respiratory viruses

With focus on a potential future multi-pathogen genomic surveillance, we extended our method by including PIV and HMPV genomes into the design of our bait set. Both are pathogens that cause relevant respiratory infections in children and adults (20, 28).

In an initial approach, automated Next-RSV-SEQ was applied to 30 PIV- and 32 HMPV-positive samples, of which the majority were clinical specimens obtained through our ARI and SARI sentinel surveillance systems (PIV: 19 clinical samples, 11 isolates; HMPV: 32 clinical samples). This selection included representatives of HMPV groups A and B collected in 2021–2022, and PIV types 2, 3, 4a, and 4b collected in 2012–2021.

Of these, 100% of PIV and 97% of HMPV genomes were successfully reconstructed and classified as Passed (Fig. 5A). Next, we compared the mean coverage depth of sequences obtained from the three virus species. Mean coverage depth was normalized to 100,000 reads to account for differences between sequencing runs. Due to limited sample material in the upper and lower viral load categories, we compared data of Passed samples with input quantities between 10^4 and 10^6 copies (corresponding to viral load categories 2 and 3 as defined in Fig. 3), thereby representing the majority of the samples. Medians of normalized mean coverage depths for PIV and HMPV were comparable to those achieved for RSV (Fig. 5B).
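
The normalization step can be written as a simple proportional scaling. The Python sketch below is one plausible reading of it: the text does not specify which read count (raw, trimmed, or aligned) served as the denominator, so the `total_reads` argument is an assumption, as are the function names and the example values.

```python
from statistics import median


def normalize_mean_depth(mean_depth, total_reads, per_reads=100_000):
    """Scale a sample's mean coverage depth to a fixed number of reads.

    total_reads: read count of the sample (which count to use is an
    assumption of this sketch; the source does not specify it).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mean_depth * per_reads / total_reads


def median_normalized_depth(samples):
    """samples: iterable of (mean_depth, total_reads) tuples for one virus."""
    return median(normalize_mean_depth(d, r) for d, r in samples)


if __name__ == "__main__":
    # Hypothetical samples: (mean coverage depth, total reads)
    toy_samples = [(3000, 600_000), (1200, 200_000), (4000, 1_000_000)]
    print(median_normalized_depth(toy_samples))
```

Comparing the per-virus medians of these normalized values corresponds to the comparison shown in Fig. 5B.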

Fig 5.

(A) Viral input quantities (total copies of the ds cDNA input) used for library preparation and related sequencing outcome are shown for PIV and HMPV in comparison to RSV. Passed refers to sequences with ≥98% genome coverage at 10× read depth and Failed to <98% genome coverage at 10× read depth. Data points are colored by virus. Black lines denote median values. Passed samples of the gray-shaded area (>10^4 and <10^6 viral copies) were selected for subsequent comparison of the coverage depth (Fig. 5B). (B) Box plots of mean coverage depths of Passed PIV and HMPV sequences in comparison to RSV, each calculated on samples with viral input quantities between 10^4 and 10^6 copies.

DISCUSSION

Recent out-of-season RSV surges have underscored the global importance of this respiratory pathogen. Novel passive as well as active immunization strategies have recently been approved and will likely be employed on a worldwide scale. As with other respiratory viruses causing substantial morbidity and mortality, this increases the need to scale up genomic surveillance, using approaches that are not only informative but also cost-efficient and amenable to high throughput (29–31). Here, we present Next-RSV-SEQ, a robust hybrid capture approach to generate high-quality RSV genomic sequences directly from clinical specimens. Next-RSV-SEQ includes a wetlab protocol as well as a publicly available bioinformatics pipeline that facilitates data analysis. While previous studies have shown that hybridization capture techniques are suitable for NGS of RSV (15–18), they were not specifically designed to consistently provide full-length genome sequences for public health genomic surveillance at a manageable cost, whereas Next-RSV-SEQ combines scalability with cost and time efficiency.

We showed that Next-RSV-SEQ can be applied to a broad diversity of historic and recently circulating RSV-A and RSV-B genotypes. High-quality RSV whole genome sequences were generated from more than 97% of both culture isolates and, more importantly, clinical specimens, demonstrating that Next-RSV-SEQ is a suitable tool especially, but not only, for genomic surveillance purposes. Robust whole genome sequencing of RSV from clinical specimens offers practical benefits, because although culture isolation prior to NGS may enhance virus concentration, it presents methodological challenges with RSV (7). Moreover, cell culture passage may select for mutations and thus bias sequencing results. Furthermore, target enrichment via hybridization capture was highly efficient, with a median rate of on-target reads close to 100% across all viral load categories tested.

While the vast majority of samples were successfully sequenced, a small fraction (2%) was not. A clear-cut link between sequencing failure and viral load could not be established: Failed samples had moderate to high viral loads, while whole genomes were successfully recovered from samples with viral loads as low as 1.3 × 10^3 viral copies/library, corresponding to 230 copies per microliter RNA. Given that Failed samples did not display substantial genetic variation compared to Passed samples of the same genotype, sequencing failures might be related to other sources of variation such as RNA conservation or sample preparation techniques. Such low dropout rates are unlikely to matter in the context of large-scale genomic surveillance programs and already compare well with other methods (32, 33).

A limitation is that our approach, geared toward genomic surveillance in the public health context, excluded samples with viral loads below the Cp ≤31 cutoff (RSV: 1.3 × 10^3 genome copies/library). Thus, we cannot provide a formal limit of detection. However, in comparison to published RSV WGS techniques reporting sample viral loads, this method appears highly sensitive: for example, Dong et al. (32), using an amplicon approach on samples with Cp 10–32, obtained whole genome sequences for 83% of samples; Schobel and colleagues (34) found a 67% success rate for samples with Cp <29. Wang et al. (33) noted amplicon dropouts for all samples with Cp >24 but did report a 99.6% success rate for samples with Cp 24–33 when using a 20-amplicon nested assay; however, this 20-amplicon approach must be considered very laborious and would not be easily adaptable to high sample throughput.

A key application for Next-RSV-SEQ is public health genomic surveillance, which may require processing large numbers of samples in both resource-rich and resource-limited settings. Therefore, the method was designed and optimized to be cost-efficient and scalable. We saved on consumables costs by using library preparation kits at half volumes (2× cost reduction) and diluting RNA probes by a factor 4 (4× cost reduction) (10, 18). Depending on the laboratory equipment and sample throughput, Next-RSV-SEQ can be carried out both manually and automatically with maintained sequence quality (Fig. 3; Fig. S6 and S7). In addition, we pooled multiple libraries per capture reaction (further cost reduction depending on pool size). Multiplexing libraries prior to enrichment led to a substantial reduction of consumables cost and hands-on time, thereby augmenting scalability. Finally, we sequenced our enriched libraries in batches of 40–60 on a MiSeq system. Of note, further cost reductions could still be achieved by using larger Illumina instruments since biases are usually indistinguishable.
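
The way these savings combine can be sketched with relative costs. The Python example below uses arbitrary units (1.0 = full-price reagent per sample or per capture reaction) because the study reports reduction factors, not prices; the function name and the simple multiplicative cost model are assumptions made for illustration.

```python
def relative_cost_per_sample(library_fraction=0.5,
                             bait_dilution=4,
                             pool_size=20,
                             library_cost=1.0,
                             capture_cost=1.0):
    """Return (library cost, capture cost) per sample in arbitrary units.

    library_fraction: share of the kit volume used (0.5 = half volumes).
    bait_dilution:    dilution factor of the baits (4 = 1:4 dilution).
    pool_size:        number of libraries sharing one capture reaction.
    The unit costs are placeholders, not prices from the study.
    """
    if bait_dilution <= 0 or pool_size <= 0:
        raise ValueError("bait_dilution and pool_size must be positive")
    library = library_cost * library_fraction
    capture = capture_cost / (bait_dilution * pool_size)
    return library, capture


if __name__ == "__main__":
    for pool in (1, 10, 20):
        lib, cap = relative_cost_per_sample(pool_size=pool)
        print(f"pool of {pool:>2}: library {lib:.3f}, capture {cap:.4f}")
```

In this simplified model, the per-sample share of one capture reaction shrinks in proportion to both the bait dilution and the pool size, whereas the library preparation cost is unaffected by pooling.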

Use of a bait set targeting multiple virus species has been described previously (17, 18, 35–37) and provides an opportunity for broader genomic surveillance. We demonstrated that use of an expanded probe set encompassing PIV and HMPV in addition to RSV leads to high-quality WGS results for each of these viruses. While a broader probe set incurs no or only marginally higher costs, it opens up the possibility to obtain genomic sequences from RSV, HMPV, and PIV using the same wet-lab workflow. Efforts to validate the probe panel for samples with lower viral loads and for samples with mixed virus infections, e.g., RSV/PIV coinfections, are ongoing in our lab. In the future, this approach may enable customized solutions, employing modular probe panels that are adapted to capture pathogens of interest, particularly other important respiratory viruses such as influenza viruses A and B or severe acute respiratory syndrome coronavirus 2.

In summary, Next-RSV-SEQ holds great promise as a tool for genomic surveillance of RSV, which is gaining in relevance given the recent, postpandemic RSV surges and the imminent rollout of vaccines and a novel prophylactic antibody. The method performed extremely well, with near-complete to complete genome sequences obtained for 98% of samples investigated herein. Next-RSV-SEQ works well over a wide range of viral loads, with both cell culture isolates and clinical samples, and across a broad spectrum of viral genotypes. In addition, we demonstrate that our method can be expanded to include other respiratory viruses, offering new prospects for establishing customized multi-pathogen surveillance systems for respiratory viruses in the future using a modular, standardized workflow.

ACKNOWLEDGMENTS

We are grateful to the patients, their families, and the physicians participating in the German ARI sentinel, who have contributed many of the clinical specimens used for validation. We thank the staff of the National Influenza Centre and Susi Hafemann for excellent technical support.

This work was partially supported by intramural grants at RKI (grant numbers 8321487, 8321733, 8321844) to J.R., the IGS-BMG-project sponsored by the Federal Ministry of Health (C.1.1) and by DAKI (Daten- und KI-gestütztes Frühwarnsystem zur Stabilisierung der deutschen Wirtschaft) funded by the Bundesministerium für Wirtschaft und Klimaschutz (01MK21009E) to S.F.

Contributor Information

Janine Reiche, Email: reichej@rki.de.

John P. Dekker, National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, USA

ETHICS APPROVAL

The German national ARI and SARI sentinel surveillances have been approved by the Charité-Universitätsmedizin in Berlin Ethical Board (references EA2/126/11 and EA2/218/19) and written informed consent was obtained from all participants.

DATA AVAILABILITY

Consensus sequences were submitted to GISAID (https://gisaid.org; RSVA: EPI_ISL_17995589–17995778; RSVB: EPI_ISL_18005798–18005952) and GenBank (RSVA: OR795292–OR795481; RSVB: OR795209–OR795291 and MZ568931–MZ569010) (Table S2).

SUPPLEMENTAL MATERIAL

The following material is available online at https://doi.org/10.1128/jcm.01111-23.

Supplemental Figures S1-S7. jcm.01111-23-s0001.pdf (4.4MB, pdf). Additional experimental results. DOI: 10.1128/jcm.01111-23.SuF1

Supplemental Table S1. jcm.01111-23-s0002.pdf (33.1KB, pdf). List of GenBank accession numbers of sequences used for bait design. DOI: 10.1128/jcm.01111-23.SuF2

Supplemental Table S2. jcm.01111-23-s0003.pdf (48.2KB, pdf). Sample list with GenBank and GISAID accession numbers. DOI: 10.1128/jcm.01111-23.SuF3

ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.

REFERENCES

  • 1. Li Y, Wang X, Blau DM, Caballero MT, Feikin DR, Gill CJ, Madhi SA, Omer SB, Simões EAF, Campbell H, et al. 2022. Global, regional, and national disease burden estimates of acute lower respiratory infections due to respiratory syncytial virus in children younger than 5 years in 2019: a systematic analysis. Lancet 399:2047–2064. doi: 10.1016/S0140-6736(22)00478-0
  • 2. European Centre for Disease Prevention and Control (ECDC). 2022. Intensified circulation of respiratory syncytial virus (RSV) and associated hospital burden in the EU/EEA. Available from: https://www.ecdc.europa.eu/sites/default/files/documents/RRA-20221128-473.pdf. Retrieved 1 Jun 2023.
  • 3. Hammitt LL, Dagan R, Yuan Y, Baca Cots M, Bosheva M, Madhi SA, Muller WJ, Zar HJ, Brooks D, Grenham A, Wählby Hamrén U, Mankad VS, Ren P, Takas T, Abram ME, Leach A, Griffin MP, Villafana T, MELODY Study Group. 2022. Nirsevimab for prevention of RSV in healthy late-preterm and term infants. N Engl J Med 386:837–846. doi: 10.1056/NEJMoa2110275
  • 4. Papi A, Ison MG, Langley JM, Lee DG, Leroux-Roels I, Martinon-Torres F, Schwarz TF, van Zyl-Smit RN, Campora L, Dezutter N, de Schrevel N, Fissette L, David MP, Van der Wielen M, Kostanyan L, Hulstrom V, AReSVi-006 Study Group. 2023. Respiratory syncytial virus prefusion F protein vaccine in older adults. N Engl J Med 388:595–608. doi: 10.1056/NEJMoa2209604
  • 5. Schmoele-Thoma B, Zareba AM, Jiang Q, Maddur MS, Danaf R, Mann A, Eze K, Fok-Seang J, Kabir G, Catchpole A, Scott DA, Gurtman AC, Jansen KU, Gruber WC, Dormitzer PR, Swanson KA. 2022. Vaccine efficacy in adults in a respiratory syncytial virus challenge study. N Engl J Med 386:2377–2386. doi: 10.1056/NEJMoa2116154
  • 6. World Health Organization (WHO). Respiratory syncytial virus surveillance. Available from: https://www.who.int/teams/global-influenza-programme/global-respiratory-syncytial-virus-surveillance. Retrieved 1 Jun 2023.
  • 7. Teirlinck AC, Broberg EK, Stuwitz Berg A, Campbell H, Reeves RM, Carnahan A, Lina B, Pakarna G, Boas H, Nohynek H, et al. 2021. Recommendations for respiratory syncytial virus surveillance at the national level. Eur Respir J 58:2003766. doi: 10.1183/13993003.03766-2020
  • 8. Teirlinck AC, Johannesen CK, Broberg EK, Penttinen P, Campbell H, Nair H, Reeves RM, Boas H, Brytting M, Cai W, et al. 2023. New perspectives on respiratory syncytial virus surveillance at the national level: lessons from the COVID-19 pandemic. Eur Respir J 61:2201569. doi: 10.1183/13993003.01569-2022
  • 9. Pandya MC, Callahan SM, Savchenko KG, Stobart CC. 2019. A contemporary view of respiratory syncytial virus (RSV) biology and strain-specific differences. Pathogens 8:67. doi: 10.3390/pathogens8020067
  • 10. Cruz-Davalos DI, Llamas B, Gaunitz C, Fages A, Gamba C, Soubrier J, Librado P, Seguin-Orlando A, Pruvost M, Alfarhan AH, Alquraishi SA, Al-Rasheid KAS, Scheu A, Beneke N, Ludwig A, Cooper A, Willerslev E, Orlando L. 2017. Experimental conditions improving in-solution target enrichment for ancient DNA. Mol Ecol Resour 17:508–522. doi: 10.1111/1755-0998.12595
  • 11. Gaudin M, Desnues C. 2018. Hybrid capture-based next generation sequencing and its application to human infectious diseases. Front Microbiol 9:2924. doi: 10.3389/fmicb.2018.02924
  • 12. Bonsall D, Ansari MA, Ip C, Trebes A, Brown A, Klenerman P, Buck D, Consortium S-H, Piazza P, Barnes E, Bowden R. 2015. Ve-SEQ: robust, unbiased enrichment for streamlined detection and whole-genome sequencing of HCV and other highly diverse pathogens. F1000Res 4:1062. doi: 10.12688/f1000research.7111.1
  • 13. Bonsall D, Golubchik T, de Cesare M, Limbada M, Kosloff B, MacIntyre-Cockett G, Hall M, Wymant C, Ansari MA, Abeler-Dörner L, Schaap A, Brown A, Barnes E, Piwowar-Manning E, Eshleman S, Wilson E, Emel L, Hayes R, Fidler S, Ayles H, Bowden R, Fraser C, HPTN 071 (PopART) Team. 2020. A comprehensive genomics solution for HIV surveillance and clinical monitoring in low-income settings. J Clin Microbiol 58:e00382-20. doi: 10.1128/JCM.00382-20
  • 14. Nagy-Szakal D, Couto-Rodriguez M, Wells HL, Barrows JE, Debieu M, Butcher K, Chen S, Berki A, Hager C, Boorstein RJ, Taylor MK, Jonsson CB, Mason CE, O’Hara NB. 2021. Targeted hybridization capture of SARS-CoV-2 and metagenomics enables genetic variant discovery and nasal microbiome insights. Microbiol Spectr 9:e0019721. doi: 10.1128/Spectrum.00197-21
  • 15. Baier C, Huang J, Reumann K, Indenbirken D, Thol F, Koenecke C, Ebadi E, Heim A, Bange FC, Haid S, Pietschmann T, Fischer N. 2022. Target capture sequencing reveals a monoclonal outbreak of respiratory syncytial virus B infections among adult hematologic patients. Antimicrob Resist Infect Control 11:88. doi: 10.1186/s13756-022-01120-z
  • 16. Goya S, Sereewit J, Pfalmer D, Nguyen TV, Bakhash S, Sobolik EB, Greninger AL. 2023. Genomic characterization of respiratory syncytial virus during 2022-23 outbreak, Washington, USA. Emerg Infect Dis 29:865–868. doi: 10.3201/eid2904.221834
  • 17. O’Flaherty BM, Li Y, Tao Y, Paden CR, Queen K, Zhang J, Dinwiddie DL, Gross SM, Schroth GP, Tong S. 2018. Comprehensive viral enrichment enables sensitive respiratory virus genomic identification and analysis by next generation sequencing. Genome Res 28:869–877. doi: 10.1101/gr.226316.117
  • 18. Patrono LV, Röthemeier C, Kouadio L, Couacy-Hymann E, Wittig RM, Calvignac-Spencer S, Leendertz FH. 2022. Non-invasive genomics of respiratory pathogens infecting wild great apes using hybridisation capture. Influenza Other Respir Viruses 16:858–861. doi: 10.1111/irv.12984
  • 19. Oh DY, Buda S, Biere B, Reiche J, Schlosser F, Duwe S, Wedde M, von Kleist M, Mielke M, Wolff T, Dürrwald R. 2021. Trends in respiratory virus circulation following COVID-19-targeted nonpharmaceutical interventions in Germany, January - September 2020: Analysis of national surveillance data. Lancet Reg Health Eur 6:100112. doi: 10.1016/j.lanepe.2021.100112
  • 20. Oh DY, Biere B, Grenz M, Wolff T, Schweiger B, Dürrwald R, Reiche J. 2021. Virological surveillance and molecular characterization of human parainfluenzavirus infection in children with acute respiratory illness: Germany, 2015-2019. Microorganisms 9:1508. doi: 10.3390/microorganisms9071508
  • 21. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560
  • 22. Wood DE, Lu J, Langmead B. 2019. Improved metagenomic analysis with Kraken 2. Genome Biol 20:257. doi: 10.1186/s13059-019-1891-0
  • 23. Goya S, Galiano M, Nauwelaers I, Trento A, Openshaw PJ, Mistchenko AS, Zambon M, Viegas M. 2020. Toward unified molecular surveillance of RSV: a proposal for genotype definition. Influenza Other Respir Viruses 14:274–285. doi: 10.1111/irv.12715
  • 24. Minh BQ, Nguyen MA, von Haeseler A. 2013. Ultrafast approximation for phylogenetic bootstrap. Mol Biol Evol 30:1188–1195. doi: 10.1093/molbev/mst024
  • 25. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300
  • 26. Greninger AL, Roychoudhury P, Xie H, Casto A, Cent A, Pepper G, Koelle DM, Huang ML, Wald A, Johnston C, Jerome KR. 2018. Ultrasensitive capture of human herpes simplex virus genomes directly from clinical. mSphere 3:e00283-18. doi: 10.1128/mSphereDirect.00283-18
  • 27. Hale H, Gardner EM, Viruel J, Pokorny L, Johnson MG. 2020. Strategies for reducing per-sample costs in target capture sequencing for phylogenomics and population genomics in plants. Appl Plant Sci 8:e11337. doi: 10.1002/aps3.11337
  • 28. Reiche J, Jacobsen S, Neubauer K, Hafemann S, Nitsche A, Milde J, Wolff T, Schweiger B. 2014. Human metapneumovirus: insights from a ten-year molecular and epidemiological analysis in Germany. PLoS One 9:e88342. doi: 10.1371/journal.pone.0088342
  • 29. Oh DY, Holzer M, Paraskevopoulou S, Trofimova M, Hartkopf F, Budt M, Wedde M, Richard H, Haldemann B, Domaszewska T, et al. 2022. Advancing precision vaccinology by molecular and genomic surveillance of severe acute respiratory syndrome coronavirus 2 in Germany, 2021. Clin Infect Dis 75:S110–S120. doi: 10.1093/cid/ciac399
  • 30. Subissi L, von Gottberg A, Thukral L, Worp N, Oude Munnink BB, Rathore S, Abu-Raddad LJ, Aguilera X, Alm E, Archer BN, et al. 2022. An early warning system for emerging SARS-CoV-2 variants. Nat Med 28:1110–1115. doi: 10.1038/s41591-022-01836-w
  • 31. World Health Organization (WHO). 2019. WHO strategy for the global respiratory syncytial virus surveillance based on influenza surveillance. Available from: https://cdn.who.int/media/docs/default-source/influenza/rsv-surveillance/who-rsv-surveillance-strategy-phase-26mar2021.-final.pdf?sfvrsn=d8b1c36a_9. Retrieved 1 Jun 2023.
  • 32. Dong X, Deng YM, Aziz A, Whitney P, Clark J, Harris P, Bautista C, Costa AM, Waller G, Daley AJ, Wieringa M, Korman T, Barr IG. 2023. A simplified, amplicon-based method for whole genome sequencing of human respiratory syncytial viruses. J Clin Virol 161:105423. doi: 10.1016/j.jcv.2023.105423
  • 33. Wang L, Ng TFF, Castro CJ, Marine RL, Magaña LC, Esona M, Peret TCT, Thornburg NJ. 2022. Next-generation sequencing of human respiratory syncytial virus subgroups A and B genomes. J Virol Methods 299:114335. doi: 10.1016/j.jviromet.2021.114335
  • 34. Schobel SA, Stucker KM, Moore ML, Anderson LJ, Larkin EK, Shankar J, Bera J, Puri V, Shilts MH, Rosas-Salazar C, Halpin RA, Fedorova N, Shrivastava S, Stockwell TB, Peebles RS, Hartert TV, Das SR. 2016. Respiratory syncytial virus whole-genome sequencing identifies convergent evolution of sequence duplication in the C-terminus of the G gene. Sci Rep 6:26311. doi: 10.1038/srep26311
  • 35. Li B, Si HR, Zhu Y, Yang XL, Anderson DE, Shi ZL, Wang LF, Zhou P. 2020. Discovery of bat coronaviruses through surveillance and probe capture-based next-generation sequencing. mSphere 5:e00807-19. doi: 10.1128/mSphere.00807-19
  • 36. Wylezich C, Calvelage S, Schlottau K, Ziegler U, Pohlmann A, Hoper D, Beer M. 2021. Next-generation diagnostics: virus capture facilitates a sensitive viral diagnosis for epizootic and zoonotic pathogens including SARS-CoV-2. Microbiome 9:51. doi: 10.1186/s40168-020-00973-z
  • 37. Yang Y, Walls SD, Gross SM, Schroth GP, Jarman RG, Hang J. 2018. Targeted sequencing of respiratory viruses in clinical specimens for pathogen identification and genome-wide analysis. Methods Mol Biol 1838:125–140. doi: 10.1007/978-1-4939-8682-8_10


Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)
