Author manuscript; available in PMC: 2019 Nov 1.
Published in final edited form as: Virology. 2018 Sep 10;524:182–191. doi: 10.1016/j.virol.2018.08.021

Design and validation of a universal influenza virus enrichment probe set and its utility in deep sequence analysis of primary cloacal swab surveillance samples of wild birds

Yongli Xiao a,*, Jacqueline M Nolting b, Zong-Mei Sheng a, Tyler Bristol a, Li Qi a, Andrew S Bowman b, Jeffery K Taubenberger a
PMCID: PMC6617512  NIHMSID: NIHMS1008283  PMID: 30212665

Abstract

Influenza virus infections in humans and animals are major public health concerns. In the current study, a set of universal influenza enrichment probes was developed to increase the sensitivity of sequence-based virus detection and characterization for all influenza viruses. This universal influenza enrichment probe set contains 46,953 biotin-labeled 120-nt RNA probes designed from all available influenza viral sequences, and it can be used to enrich for influenza sequences without prior knowledge of type or subtype. Marked enrichment was demonstrated in influenza A/H1N1, influenza B, and H1-to-H16 hemagglutinin plasmids spiked into human DNA and in cultured influenza A/H2N1 virus. Furthermore, enrichment effects and mixed influenza A virus infections were revealed in wild bird cloacal swab samples. Therefore, this universal influenza virus enrichment probe system can capture and enrich influenza viral sequences selectively and effectively in different samples, especially those with degraded RNA or low amounts of influenza RNA.

Keywords: Influenza, Universal influenza enrichment, Deep sequencing, Mixed avian influenza infection

1. Introduction

Influenza viruses are enveloped viruses of the family Orthomyxoviridae with a negative-sense, single-stranded, segmented RNA genome (Wright et al., 2007). They infect many warm-blooded avian and mammalian species, including humans, and infections caused by these viruses have great public health, animal health, and economic significance. Influenza A viruses (IAV) are extremely diverse, both genetically and antigenically, and are widely distributed in wild avian species globally, in domestic avian and mammalian species, and in humans. There are two major surface glycoproteins, hemagglutinin and neuraminidase, each with multiple subtypes. Currently, 18 hemagglutinin (HA) subtypes and 11 neuraminidase (NA) subtypes have been described, of which 16 HA and 9 NA subtypes are frequently found in avian species. Of the 144 possible HA-NA subtype combinations, at least 131 have been characterized in strains isolated from birds in the NCBI influenza virus database (Bao et al., 2008). The two recently described IAV subtypes (H17N10 and H18N11) have only been isolated in fruit bats (Wu et al., 2014). Aquatic birds likely serve as the predominant natural reservoir for all other known subtypes and probably are the ultimate source of all human pandemic IAV strains (Webster et al., 1992).

Annual epidemics of influenza vary in their impact, but up to 56,000 people die annually from influenza in the United States (https://www.cdc.gov/flu/about/disease/2015–16.htm), and up to 650,000 annual influenza fatalities occur globally (http://www.who.int/mediacentre/factsheets/fs211/en/). Novel pandemic viruses have emerged sporadically throughout history. There have been four pandemics in the last century (Morens et al., 2009) and the 1918 H1N1 pandemic killed approximately 50 million people globally (Iuliano et al., 2018; Johnson and Mueller, 2002). Epizootic outbreaks of avian- and swine-origin IAV strains pose a risk for future pandemics. Type B and C influenza viruses are adapted to and isolated almost exclusively from humans, although influenza B viruses have been isolated from seals and influenza C viruses have been isolated from pigs and dogs (Wright et al., 2007).

Therefore, continuous research on the surveillance, rapid diagnosis, transmission, pathogenesis, and vaccinology of influenza is essential to prevent and mitigate its impact.

One critical aspect of influenza research is to detect and classify influenza viruses as to type, subtype, and genotype, especially for newly emerging viral variants in different samples, such as those from wild birds, domestic animals, and human patients, for the purposes of surveillance, prevention, and treatment. A second important aspect of understanding the ecobiology of IAV, in birds especially, is the detection of mixed IAV infections in wild birds, which appear to play a key role in the natural history of these viruses. However, this task is extremely time and resource demanding. If multiple strains of avian IAV are present in the original sample, one strain of virus may outgrow the others in embryonated chicken eggs, and the other strains go undetected. Subtypes of avian IAV samples are classically determined by HA- and NA-inhibition tests (HIT and NIT) using reference antibodies and antigens (Lee et al., 2006). Therefore, over the years, the frequency of mixed IAV infections in wild birds has likely been greatly underestimated (Dugan et al., 2008; Wang et al., 2008).

Technological advances allow new methods to be developed to identify and characterize influenza virus isolates from a variety of sample types, including methods based on culture, antibody binding, serological assays, nucleic acid amplification strategies, and nucleic acid sequencing methodologies (Vemula et al., 2016). The latest and most comprehensive, but still costly, approach is the incorporation of high throughput sequencing technology. First introduced in 2005, ‘high-throughput’ DNA sequencers, which can determine megabases of DNA sequence per run (Service, 2006), have evolved dramatically, increasing sequencing capacity by a factor of 100–1000 (Goodwin et al., 2016). They have been widely used in influenza research studies, including detecting IAV and norovirus infections in patients (Nakamura et al., 2009), uncovering mixed infection with 2009 pandemic influenza A viruses (Ghedin et al., 2011), high throughput sequencing of influenza B viruses (Zhou et al., 2014), evaluating genetic stability of influenza vaccine viruses (Laassri et al., 2015), revealing antigenic variants at low frequencies in IAV-infected patients (Dinis et al., 2016), and high-throughput identification of influenza A/H3N2 virus antigenic drift variants (Mishin et al., 2017). However, because of the limited amount of viral RNA in typical clinical samples, all of these studies employed virus-specific primers and virus-specific PCR amplification strategies to enrich for target influenza sequences.

PCR-introduced errors have received increasing attention in next generation sequencing. It has been shown that the most commonly used PCR enzymes, including high fidelity enzymes, all have error rates of 10⁻⁵ to 10⁻⁶ point mutations/bp/duplication (McInerney et al., 2014). Besides well-characterized polymerase base substitution errors, other sources of error were found to be equally prevalent, including PCR-mediated recombination, template-switching, and DNA damage introduced during temperature cycling (Potapov and Ong, 2017). In fact, the Primer ID method was developed to distinguish PCR-introduced errors from real single nucleotide polymorphisms (SNPs) that occurred during virus evolution (Zhou et al., 2015). Another challenge in using PCR to enrich cDNA derived from influenza virus RNA is that the isolated influenza RNA may have been degraded (Wang et al., 2008) and will thus be difficult to amplify with influenza universal primers, which require full-length segments as templates (Hoffmann et al., 2001). In many surveillance situations, subtype-specific primers cannot be used because the type(s) or subtype(s) of influenza virus in the sample is unknown.

Therefore, this study designed a set of universal influenza probes for enrichment of any influenza A, B, or C virus sequences by hybridization capture. Hybridization capture methods were first used to enrich sequence targets from the human genome. In the earliest methods, the designed oligonucleotide probes that capture the sequencing targets were attached to a solid phase support (Albert et al., 2007; Hodges et al., 2007; Okou et al., 2007). Subsequently, enrichment of the target sequences was performed in solution (Gnirke et al., 2009; Tewhey et al., 2009). Recently, a virome capture sequencing platform was developed for vertebrate viruses (Briese et al., 2015). Currently, many commercial companies provide different design and capture methods (Bodi et al., 2013; Garcia-Garcia et al., 2016; Zhou et al., 2009). In this study, we designed an enrichment probe set specific for all influenza strains. Utilizing all available influenza A, B, and C virus sequences, a clustered unique data set for designing universal influenza enrichment probes was obtained. Using these universal influenza enrichment probes, we found significant enrichment of IAV and influenza B virus (IBV) RNA in control experiments. Application of the method to cloacal swab samples collected during wild bird field surveillance revealed IAV sequences and subtypes in these samples that were not detected by traditional methods.

2. Results

2.1. Design of influenza universal enrichment probes

All influenza A, B, and C virus sequences were downloaded from the NCBI influenza database (ftp://ftp.ncbi.nih.gov/genomes/INFLUENZA/) on 11/18/2015. A total of 408,140 influenza sequences were obtained. Among them, there were 390,301 (95.6%) ‘clean’ sequences and 17,839 (4.4%) ‘non-clean’ sequences (containing ambiguous bases other than A, T, G, or C). First, the clean sequences were collapsed into a unique set of 277,949 sequences using FASTX-Toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/), which served as the clean sequence set. Sequences with ambiguous bases (17,839) were discarded if the ambiguous bases together made up more than 10% of the total length of the sequence. Consequently, 649 sequences were discarded. The remaining 17,190 sequences were clustered at the 90% identity level by cd-hit-v4.3 (Li and Godzik, 2006), resulting in 492 sequences. These sequences were split wherever they contained 8 consecutive Ns, and the separated sequences were retained if longer than 20 bp, which generated 649 sequences as the non-clean sequence set. Subsequently, the clean sequence set (277,949) and the non-clean sequence set (649) were combined and clustered together at the 90% identity level by cd-hit-v4.3 (Li and Godzik, 2006). This process generated a total of 905 sequences, with 823 sequences derived from the clean sequence set and 82 sequences from the non-clean sequence set. After aligning these 82 sequences from the non-clean sequence set against the 823 sequences from the clean sequence set, we found that 78 of the 82 sequences had high sequence homology to clean-set sequences, with the lowest percent identity at 82.24%. Therefore, these 78 sequences were eliminated from the dataset. Consequently, the final data set contained 823 sequences from the clean sequence set and 4 sequences from the non-clean sequence set.
The combined sequences (827) were again clustered together at the 90% identity level by cd-hit-v4.3 (Li and Godzik, 2006) to generate the final sequences (825). The process of generating the final sequence data set from all downloaded influenza sequences is shown in Fig. 1. The final sequences (825) were used to make enrichment RNA oligonucleotide probes (Agilent Technologies, Santa Clara, CA) at 5X tiling, resulting in 46,953 biotin-labeled 120-nt RNA universal influenza enrichment probes. All the enrichment probe sequences are listed in Supplemental Table 1.
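
The handling of sequences with ambiguous bases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, any character other than A, C, G, or T is treated as ambiguous, and a run of 8 or more Ns is treated as a split point.

```python
import re

# Assumption: anything other than A, C, G, T counts as an ambiguous base.
AMBIGUOUS = re.compile(r"[^ACGT]")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of positions in seq that are not A, C, G, or T."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return len(AMBIGUOUS.findall(seq)) / len(seq)


def keep_sequence(seq: str, max_fraction: float = 0.10) -> bool:
    """Keep a sequence unless ambiguous bases exceed 10% of its length."""
    return ambiguous_fraction(seq) <= max_fraction


def split_on_n_runs(seq: str, min_run: int = 8, min_len: int = 20) -> list:
    """Split at runs of >= min_run Ns; keep fragments longer than min_len."""
    fragments = re.split(r"N{%d,}" % min_run, seq.upper())
    return [frag for frag in fragments if len(frag) > min_len]
```

Shorter runs of N are left inside a fragment, which matches the description only as far as the text specifies it; how the original pipeline treated other ambiguity codes within retained fragments is not stated.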

Fig. 1.

Processing of downloaded influenza sequences for enrichment probe design.

2.2. In silico evaluation of enrichment probes

Based on phylogenetic analysis, IAV HA subtypes can be categorized into two major clades: group 1, which contains H1, H2, H5, H6, H8, H9, H11, H12, H13, H16, H17, and H18, and group 2, which contains H3, H4, H7, H10, H14, and H15 (Air, 1981; Dugan et al., 2008; Joyce et al., 2016; Nobusawa et al., 1991). In an initial screen, the 1918 H1N1 pandemic strain (A/Brevig Mission/1/1918 (H1N1)) (group 1) and a 2013 H7N9 epizootic strain (A/Hangzhou/1/2013 (H7N9)) (group 2) were used to evaluate the probes in silico: the enrichment probes were aligned against both viral genomes using the blastn program (with cutoffs of 90 for percent identity and 0.001 for e-value) (Altschul et al., 1990). Table 1 shows the coverage of each segment from the two viruses provided by the probes and Fig. 2 shows the graphic coverage of the corresponding HA segments. As can be seen, the enrichment probes provided essentially complete coverage of all segments from both IAV strains (100% for every segment except the H7N9 PB2 segment, at 99.6%). In addition, we downloaded all the sequences of complete HA and NA segments of influenza A viruses from the Influenza Research Database (https://www.fludb.org/brc/home.spg?decorator=influenza) (Zhang et al., 2017). One HA sequence from each HA subtype (H1 to H18) was randomly picked from a total of 55,300 unique downloaded HA sequences and one NA sequence from each NA subtype (N1 to N11) was randomly picked from a total of 45,834 unique downloaded NA sequences. Enrichment probes were aligned to the 18 randomly picked HA sequences and the 11 randomly picked NA sequences using the blastn program. Table 2 shows the coverage of the 18 HA segments (H1 to H18) and Table 3 shows that of the 11 NA segments (N1 to N11) by the enrichment probes. Similarly, the enrichment probes covered more than 95% of each segment for all the tested HA and NA subtypes.
Next, we blasted the enrichment probes against all human virus sequences downloaded from NCBI (https://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=10239&host=human), a dataset that contains 3140 representative human virus sequences including 100 influenza sequences. In this dataset, 18,704 probes hit 122 human viral sequences. Among the 122 human viral sequence hits, 100 were influenza virus gene sequences, including influenza A, B, and C sequences (71 unique NC numbers), accounting for all influenza virus sequences in this dataset, and 22 were non-influenza viral sequences (19 unique NC numbers from 19 non-influenza human viruses). However, the average alignment length between query and target for these 22 non-influenza hits was only 26.8 bp (range 21–35 bp), while the average alignment length between query and target for influenza virus hits was 107.0 bp. Therefore, the homology of our probes to human non-influenza viruses is not significant. The blastn outputs of non-influenza hits are shown in Supplemental Table 2 and influenza hits are shown in Supplemental Table 3. Finally, we blasted all the enrichment probes against human genome sequences downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/blast/db/). We found that 25 enrichment probes hit human genome sequences (90 unique gi numbers). The average query/target alignment length of these hits was 34.6 bp (range 30–43 bp) (Supplemental Table 4). Therefore, in silico analysis indicated that the enrichment probes hybridize preferentially to influenza virus sequences.
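
The covered length and covered fraction reported in Tables 1 to 3 amount to merging the probe alignments on each segment and summing the merged spans. The study used BEDTools for this step; the Python below is a stand-in sketch of the same arithmetic, with hypothetical function names and made-up hit coordinates used purely for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_length(hits):
    """Number of target positions covered by at least one hit.

    `hits` holds (start, end) coordinates on the target segment; minus-strand
    alignments, which are reported with start > end, are normalized first.
    """
    normalized = [(min(s, e), max(s, e)) for s, e in hits]
    return sum(end - start + 1 for start, end in merge_intervals(normalized))


def covered_fraction(hits, segment_length):
    """Covered length divided by the segment length."""
    return covered_length(hits) / segment_length


# Illustrative hits only (not taken from the study's blastn output).
example_hits = [(1, 120), (100, 220), (400, 300), (221, 250)]
print(covered_length(example_hits), covered_fraction(example_hits, 1000))
```

With this definition, a segment such as the H7N9 PB2 segment in Table 1, with 2271 of 2280 positions covered, has a covered fraction of about 0.996.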

Table 1.

Probe coverages of each segment of 2 influenza strains by blastn search.

Segment                 Segment length  Blastn hit number  Covered length  Covered fraction

A/Brevig Mission/1/1918(H1N1)
DQ208309_PB2            2280            987                2280            1
DQ208310_PB1            2274            678                2274            1
DQ208311_PA             2151            1074               2151            1
AF117241_HA             1701            481                1701            1
AY744935_NP             1497            459                1497            1
AF250356_NA             1410            567                1410            1
AY130766_M2_M1          982             446                982             1
AF333238_NEP_NS1        838             551                838             1

A/Hangzhou/1/2013(H7N9)
gi_485649809_PB2        2280            976                2271            0.996
gi_485649811_PB1        2274            974                2274            1
gi_485649813_PA         2151            1104               2151            1
gi_475662453_HA         1683            236                1683            1
gi_485649815_NP         1497            487                1497            1
gi_475662451_NA         1398            278                1398            1
gi_482680053_M2_M1      982             305                982             1
gi_485649817_NEP_NS1    838             311                838             1

Fig. 2.

Blastn hit coverages of HA segments from (A) A/Brevig Mission/1/1918(H1N1) and (B) A/Hangzhou/1/2013(H7N9).

Table 2.

Probe coverages of HA segments by blastn search.

Segment           Segment length  Blastn hit number  Covered length  Covered fraction

CY129662.1_H1     1701            684                1701            1
CY196318.1_H2     1689            182                1689            1
CY121800.1_H3     1701            734                1701            1
CY132573.1_H4     1695            237                1695            1
HM172115.1_H5     1704            815                1704            1
CY110057.1_H6     1701            247                1701            1
MG739458.1_H7     1683            226                1683            1
CY096648.1_H8     1701            184                1701            1
KP414235.1_H9     1683            315                1683            1
MH501112.1_H10    1686            199                1686            1
CY097093.1_H11    1698            161                1698            1
MH597584.1_H12    1695            96                 1695            1
MH498978.1_H13    1698            107                1614            0.951
CY146897.1_H14    1707            125                1707            1
CY006033.1_H15    1713            107                1713            1
LC339707.1_H16    1701            151                1701            1
CY103892.1_H17    1695            86                 1695            1
CY125945.1_H18    1686            86                 1686            1

Table 3.

Probe coverages of NA segments by blastn search.

Segment           Segment length  Blastn hit number  Covered length  Covered fraction

CY186693.1_N1     1410            415                1410            1
KY284466.1_N2     1410            934                1410            1
CY157064.1_N3     1410            288                1410            1
AB289344.1_N4     1413            207                1413            1
CY177395.1_N5     1422            190                1420            0.999
KJ200655.1_N6     1413            308                1413            1
CY183639.1_N7     1416            271                1416            1
CY196228.1_N8     1413            532                1413            1
CY179701.1_N9     1413            334                1413            1
CY103878.1_N10    1329            54                 1329            1
CY125947.1_N11    1344            56                 1344            1

2.3. Enrichment effects of different influenza virus plasmid sets spiked into human DNA samples by real-time PCR

To test the effectiveness of the enrichment probes, we spiked two sets of 8 plasmids, each containing the 8 gene segments of one influenza virus (influenza A/California/04/2009 (H1N1) and influenza B/Bethesda/NIH001/2007, respectively), into human DNA and made Illumina sequencing libraries from them. Real-time PCRs were performed to assess the amount of each of the 8 influenza segments in the libraries before and after influenza-specific enrichment, and the results are shown in Table 4. Using the human HER2 gene as reference, the enrichment effect is readily apparent. For the H1N1 spiked-in library, the lowest enrichment effect (the difference in CT values between unenriched and enriched libraries) was more than 10 CT values, for the PB2 segment. For the influenza B spiked-in library, the lowest enrichment effect was more than 8 CT values, for the NS segment. Furthermore, we spiked 16 HA plasmids (H1-to-H16) (Qi et al., 2014) into human DNA and tested the enrichment effect of our probe set in the same way. The results are shown in Table 5 and demonstrate significant enrichment effects: the CT values decreased by an average of 9.8 after enrichment for the H1-to-H16 plasmid spike-ins, while the CT value of the human HER2 gene increased by 9.9 after enrichment.
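
A CT difference can be read as an approximate fold change, since each PCR cycle at most doubles the template. The sketch below converts the Table 5 values under the assumption of 100% amplification efficiency (a factor of 2 per cycle), which the study does not report; the function name and the `efficiency` parameter are illustrative, and the resulting fold changes are rough estimates rather than measured quantities.

```python
# (CT before enrichment, CT after enrichment) for H1_HA ... H16_HA, from Table 5.
HA_CT = [
    (26.876, 16.090), (24.475, 14.117), (26.179, 15.836), (22.594, 13.267),
    (26.087, 15.777), (23.822, 13.869), (23.049, 13.623), (24.890, 14.326),
    (22.964, 13.365), (23.903, 14.734), (22.467, 13.160), (22.579, 14.476),
    (22.915, 13.294), (23.216, 13.203), (23.767, 13.688), (23.621, 13.946),
]
HER2_CT = (24.251, 34.187)  # human HER2 reference, from Table 5


def fold_change_from_delta_ct(delta_ct, efficiency=1.0):
    """Approximate fold change for a CT difference.

    Assumption: every cycle multiplies the template by (1 + efficiency);
    efficiency=1.0 means perfect doubling.
    """
    return (1.0 + efficiency) ** delta_ct


delta_cts = [before - after for before, after in HA_CT]
mean_delta_ct = sum(delta_cts) / len(delta_cts)
her2_delta_ct = HER2_CT[0] - HER2_CT[1]

print(f"mean HA delta CT: {mean_delta_ct:.1f}")
print(f"approximate HA fold enrichment: {fold_change_from_delta_ct(mean_delta_ct):.0f}")
print(f"HER2 delta CT: {her2_delta_ct:.1f}")
```

Under this assumption, a decrease of roughly 10 CT values corresponds to an enrichment on the order of a thousandfold, and the HER2 increase of about 10 CT values to a comparable depletion of human DNA.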

Table 4.

Real-time PCR results of enrichment of spike-in libraries.

Sequencing library from H1N1 spike-in Sequencing library from flu B spike-in


Segment CTs before enrichment CTs after enrichment Segment CTs before enrichment CTs after enrichment

H1N1_HA 29.911 17.059 FluB_HA 23.435 12.933
H1N1_MX 28.95 17.322 FluB_MX 23.88 13.782
H1N1_NA 29.408 17.642 FluB_NA 22.191 11.055
H1N1_NP 28.087 15.834 FluB_NP 24.399 15.548
H1N1_NS 29.249 17.783 FluB_NS 21.642 13.229
H1N1_PA 29.505 17.649 FluB_PA 22.326 12.628
H1N1_PB1 29.791 17.697 FluB_PB1 22.387 13.501
H1N1_PB2 29.178 18.28 FluB_PB2 22.94 12.897
Human (Her2) 27.559 Undetermined 23.385 Undetermined
Her2 +control 25.139 23.143

Table 5.

Real-time PCR results of enrichment of spike-in libraries of H1 to H16 plasmids.

Sequencing library from H1 to H16 plasmids spike-in

HA segment      CT before enrichment  CT after enrichment  CT difference

H1_HA           26.876                16.090               10.786
H2_HA           24.475                14.117               10.358
H3_HA           26.179                15.836               10.343
H4_HA           22.594                13.267               9.327
H5_HA           26.087                15.777               10.31
H6_HA           23.822                13.869               9.953
H7_HA           23.049                13.623               9.426
H8_HA           24.890                14.326               10.564
H9_HA           22.964                13.365               9.599
H10_HA          23.903                14.734               9.169
H11_HA          22.467                13.160               9.307
H12_HA          22.579                14.476               8.103
H13_HA          22.915                13.294               9.621
H14_HA          23.216                13.203               10.013
H15_HA          23.767                13.688               10.079
H16_HA          23.621                13.946               9.675
Human (Her2)    24.251                34.187               −9.936

2.4. Enrichment effects on cultured IAV stocks by Illumina sequencing

Target enrichment was further evaluated by Illumina sequencing of cultured influenza stock virus. Four sequencing libraries, two un-enriched and two enriched, were constructed from stocks of influenza A/H2N1 virus (Qi et al., 2014) and sequenced on an Illumina NextSeq. The results are shown in Table 6, demonstrating that the proportion of reads mapped to the cultured influenza virus increased markedly (approximately 10-fold or more) following universal influenza virus probe enrichment. For library 1, the mapped influenza read number was only 2.9% of the total reads prior to enrichment, while the mapped read number increased to 81.0% of the total reads following enrichment. Similarly, for library 2, the mapped influenza read number was only 8.3% of the total reads prior to enrichment, while the mapped read number was 85.4% of the total reads following enrichment.
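
The percentages quoted above follow directly from the read counts in Table 6. The short Python sketch below reproduces that arithmetic; the dictionary layout and function names are illustrative choices, while the counts themselves are copied from the table.

```python
# Total mapped influenza reads and total reads per library, from Table 6.
LIBRARIES = {
    "L1": {"mapped": 4167204, "total": 145444783},    # un-enriched
    "L2": {"mapped": 6617973, "total": 79853814},     # un-enriched
    "L3": {"mapped": 63030165, "total": 77816824},    # enriched L1
    "L4": {"mapped": 467125335, "total": 547262319},  # enriched L2
}


def mapped_percent(name):
    """Percentage of a library's reads that mapped to influenza."""
    lib = LIBRARIES[name]
    return 100.0 * lib["mapped"] / lib["total"]


def fold_enrichment(before, after):
    """Ratio of mapped-read percentages, enriched over un-enriched."""
    return mapped_percent(after) / mapped_percent(before)


for before, after in (("L1", "L3"), ("L2", "L4")):
    print(
        f"{before} -> {after}: "
        f"{mapped_percent(before):.2f}% -> {mapped_percent(after):.2f}% "
        f"({fold_enrichment(before, after):.1f}-fold)"
    )
```

Expressing enrichment as a ratio of mapped-read percentages is one reasonable convention; a ratio of raw mapped-read counts would differ because the four libraries were sequenced to different depths.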

Table 6.

Illumina sequencing results of enrichment of cultured virus libraries.

                                   Unenriched libraries    Enriched libraries
                                   L1         L2           L3 (enriched L1)  L4 (enriched L2)

CY018877_HA_ORF mapped reads       2059227    2621035      2150500           13877071
CY081243_MX_ORF mapped reads       2140       27660        3080022           41281755
CY018879_NA_ORF mapped reads       154202     595203       13512258          63752388
CY018880_NP_ORF mapped reads       5755       43815        5763976           65970485
CY018881_NS_ORF mapped reads       3565       10690        2804799           26735871
CY018882_PA_ORF mapped reads       172979     793904       16690059          111385360
CY018883_PB1_ORF mapped reads      1658703    2249468      6294228           77672532
CY018884_PB2_ORF mapped reads      110633     276198       12734323          66449873
Total mapped influenza reads       4167204    6617973      63030165          467125335
Total reads                        145444783  79853814     77816824          547262319
Mapped influenza reads percentage  2.87%      8.28%        81.00%            85.36%

2.5. Detection of avian IAV in wild bird primary cloacal swab surveillance samples by viral isolation

The three cloacal swab samples used in this study were collected from mallard ducks (Anas platyrhynchos) during routine IAV surveillance conducted in Ohio during 2013. The presence of IAV was determined by inoculation of embryonated specific pathogen-free (SPF) chicken eggs, and the HA and NA subtype of each viral isolate was determined by hemagglutinin and neuraminidase inhibition techniques at The National Veterinary Service Laboratory (NVSL), as previously described (Fries et al., 2015). The subtype of sample 1 (A/mallard/Ohio/13OS1979/) could not be determined using traditional methods. Sample 2 (A/mallard/Ohio/13OS1980) was subtyped as H2N8, and sample 3 (A/mallard/Ohio/13OS1351) was subtyped as H1N8.

2.6. Enrichment effects on wild bird cloacal swab surveillance samples by Illumina sequencing

Following in silico and in vitro validation, the utility of the enrichment probe set was assessed with RNA isolated from the three primary cloacal swab samples described above. From cloacal swab sample 1, which could not be subtyped by traditional methods, we first made an Illumina sequencing library without enrichment. After sequencing, we obtained 207,271,317 reads. However, only 3 reads mapped to NA sequences (N8) when the sequence reads were mapped to all influenza A virus HA and NA sequences downloaded from the Influenza Resource Database. Subsequently, an influenza probe-enriched sequencing library was made from the same sample. The library was sequenced with an Illumina NextSeq and a total of 329,619,786 reads were obtained. Among them, 575,532 reads were mapped to HA sequences and 409,989 reads to NA sequences. Among the HA hits, there were 118,317 unique reads aligning to H10 and 457,215 unique reads aligning to H1, and no reads aligned to both H10 and H1. Similarly, for NA hits, there were 65,934 unique reads aligning to N5 and 344,055 unique reads aligning to N8 sequences, and no reads aligned to both N5 and N8. Therefore, enrichment with the designed universal influenza enrichment probes not only facilitated detection of low quantities of influenza RNA, but also provided evidence of a mixed infection with 2 HAs and 2 NAs in this sample.

Based on the successful enrichment of sample 1, we made enrichment libraries from samples 2 and 3 and sequenced them on an Illumina NextSeq. From sample 2, 438,249,661 reads were obtained. Among them, 9673 reads were mapped to IAV HA sequences, with 235 reads mapped to H1 and 9438 reads mapped to H2. Similarly, 16,485,335 reads were mapped to IAV NA sequences, with 1342 mapped to N1 and 16,483,993 mapped to N8, consistent with the NVSL subtyping result of H2N8. From sample 3, 633,893,806 reads were obtained. Among them, 34,418,497 reads were mapped to IAV HA sequences, with 24,076,524 reads mapped to H1 and 10,341,973 reads mapped to H8, and 21,574,593 reads were mapped to NA sequences, with 8,491,101 mapped to N4 and 13,083,492 mapped to N8. For sample 3, sequencing thus confirmed the H1N8 result from traditional subtyping methods and also detected other influenza HA and NA subtypes, indicating a possible mixed infection with an H8N4 subtype. The enrichment effect on the primary cloacal swab samples is shown in Table 7.

Table 7.

Illumina sequencing results of enrichment of mallard swab samples.

Sample                Total reads  Reads mapped to HAs                  Reads mapped to NAs                 Identified by culturing

Un-enriched sample 1  207271317    0                                    3 (N8)                              Undetected
Enriched sample 1     329619786    575532 (118317 H10, 457215 H1)       409989 (65934 N5, 344055 N8)        Undetected
Enriched sample 2     438249661    9673 (235 H1, 9438 H2)               16485335 (1342 N1, 16483993 N8)     H2N8
Enriched sample 3     633893806    34418497 (24076524 H1, 10341973 H8)  21574593 (8491101 N4, 13083492 N8)  H1N8

3. Discussion

In the current study, we generated a unique dataset from all available influenza virus sequences and, based on that, designed a set of universal influenza enrichment probes. We analyzed their homology in silico against a data set of influenza virus sequences, a data set of all human viruses, and the human genome, and demonstrated that they are preferentially homologous to influenza viruses. Subsequently, we performed experiments to test their enrichment effects on: 1) libraries made by spiking influenza gene-encoding plasmid DNAs into human DNA, assessed by real-time PCR; 2) cultured influenza virus stocks; 3) wild bird primary cloacal swab surveillance samples. In all of these materials, a significant enrichment of influenza sequences was demonstrated. Mixed infections with different avian IAV subtypes were detected in the mallard samples, which may not be detected using traditional subtyping methods (Dugan et al., 2008; Wang et al., 2008). With the cost of deep sequencing technology decreasing and sequence output increasing, more and more influenza researchers will be able to apply this advanced technology to their research, surveillance, and diagnostic efforts because it not only provides an opportunity to recover the entire influenza genome (Ren et al., 2013), but also allows investigation of the quasispecies variants in the population (Doud et al., 2017). Most traditional molecular-based approaches have utilized virus-specific primers to PCR amplify influenza genomes or genome segments. A recent study used sequence-independent PCR amplification on RNA isolated from purified viral particles, which requires filtration and ultracentrifugation (Ren et al., 2013). Enrichment strategies using universal influenza probes avoid influenza-specific amplification and should also allow enrichment from samples in which the RNA is degraded.
Examples include the RNA (maximum length about 100 nucleotides) isolated in a previous study that sequenced IAV from a formalin-fixed, paraffin-embedded (FFPE) autopsy lung tissue sample from the 1918 influenza pandemic (Xiao et al., 2013), and RNA from fixed clinical nasal swabs (Krafft et al., 2005) or bird cloacal swabs (Wang et al., 2008). The RNA isolated from FFPE tissue samples or fixed swab samples can be highly degraded, making reverse transcription using conserved non-coding region primers and PCR amplification using full gene segment primers difficult or impossible. In addition, prior knowledge of the infecting influenza virus type or IAV subtype(s) is unnecessary when using the influenza universal enrichment probes described here. Even RNA from emerging influenza strains will likely be enriched because the enrichment process is hybridization-based and the probes are 120 nt in length. A study of a related method showed that sequences differing by up to 40% from the known virus genomes used to design a probe library can be captured (Briese et al., 2015), and the probe hybridization temperature and the wash conditions can be adjusted to change the stringency of hybridization. Nevertheless, as influenza viruses evolve and diverge over time, the effectiveness of this probe set will decrease and new probes designed from more recently emerged influenza strains will be needed.

From three primary cloacal swab samples from wild mallards, using our enrichment probes, we not only recovered influenza sequences from our deep sequencing libraries, but also identified evidence of mixed infection, reflected by two HA subtype sequences and two NA subtype sequences. In contrast, sample 1 could not be subtyped using the traditional methods, likely reflecting the mixed infection seen by sequence analysis. For samples 2 and 3, only one subtype was identified from the egg-cultured sample. It has been reported that sequencing viral samples without culturing increases the detection of mixed infections and enhances the identification of viral strains that might be outgrown during adaptation to egg culture (Lindsay et al., 2013). In our sequencing data, the subtypes identified by traditional methods were always the ones with the largest number of reads. Therefore, the reason for detecting only one viral subtype by culture could be that the major subtype outgrows the minor one during the culturing process.
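
The observation that the culture-derived subtype is the one with the most reads can be checked against the counts in Table 7. The Python sketch below does so; the data layout and helper names are illustrative, and picking the subtype with the most mapped reads is only a simple heuristic, not a subtyping method used or validated in this study.

```python
# HA and NA read counts from the enriched libraries in Table 7.
SAMPLES = {
    "sample 1": {
        "HA": {"H10": 118317, "H1": 457215},
        "NA": {"N5": 65934, "N8": 344055},
        "culture": None,  # subtype could not be determined
    },
    "sample 2": {
        "HA": {"H1": 235, "H2": 9438},
        "NA": {"N1": 1342, "N8": 16483993},
        "culture": "H2N8",
    },
    "sample 3": {
        "HA": {"H1": 24076524, "H8": 10341973},
        "NA": {"N4": 8491101, "N8": 13083492},
        "culture": "H1N8",
    },
}


def dominant(counts):
    """Subtype with the largest number of mapped reads."""
    return max(counts, key=counts.get)


def minor_fraction(counts):
    """Share of reads that do not belong to the dominant subtype."""
    total = sum(counts.values())
    return 1.0 - counts[dominant(counts)] / total


for name, sample in SAMPLES.items():
    call = dominant(sample["HA"]) + dominant(sample["NA"])
    print(
        f"{name}: dominant {call}, culture {sample['culture']}, "
        f"minor HA share {minor_fraction(sample['HA']):.1%}, "
        f"minor NA share {minor_fraction(sample['NA']):.1%}"
    )
```

The minor-subtype shares differ widely between samples, which is one reason a very low share, such as H1 in sample 2, deserves the caution about contamination raised in this Discussion.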

Aquatic birds are thought to be the reservoir of influenza viruses (Webster et al., 1992), which occasionally spill over to other species, including intermediate hosts such as dogs and horses (Parrish et al., 2015). Mixed infections with different subtypes of influenza viruses and reassortment have been found in wild birds (Dugan et al., 2008, 2011; Lindsay et al., 2013; Wang et al., 2008). Based on the sequencing results, evidence of mixed infection was noted in the enriched cloacal swab sample libraries. The coverage of the HA or NA gene segments varied from 15.9% of H10 and 12.6% of N5 in sample 1 to 90.5% of H8 and 85.7% of N4 in sample 3 (Supplemental Table 5). It is possible that the variation in coverage is due to RNA degradation in the samples. However, for the lowest representation, H1 (235 reads with 4.3% coverage of the H1 segment) and N1 (1342 reads with 5.4% coverage of the N1 segment) detected in sample 2, minor contamination of the sample cannot be ruled out, given the sensitivity of high throughput sequencing technology.

4. Conclusion

Overall, the universal influenza enrichment probe set described in this study showed significant, influenza-specific enrichment and should be a useful tool for the influenza research community for detecting influenza RNA in low amounts and in a type- and subtype-independent manner.

5. Materials and methods

5.1. Enrichment probe design

The influenza sequences were downloaded from the NCBI influenza database (ftp://ftp.ncbi.nih.gov/genomes/INFLUENZA/) on 11/18/2015. Enrichment probes were picked and synthesized by Agilent Technologies (Santa Clara, CA) using the final influenza sequence data set as reference. A total of 46,953 biotinylated cRNA probes were produced, each 120 nt in length, at a spacing density of 24 nt. The generation of the final sequence data set for enrichment probe design is described in detail in the Results section.
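
A probe length of 120 nt with 24-nt spacing means that consecutive probes start 24 nt apart, so interior positions are covered by 120 / 24 = 5 probes, which matches the 5X tiling stated in the Results. The Python sketch below illustrates this. It is not the vendor's design procedure: the function names are hypothetical, and both the extra end-anchored probe and the skipping of references shorter than one probe are assumptions made for illustration.

```python
def tile_starts(length, probe_len=120, step=24):
    """0-based start positions of probes tiled along a reference.

    Assumptions: references shorter than one probe yield no probes, and a
    final end-anchored probe is added when the regular grid stops short.
    """
    if length < probe_len:
        return []
    starts = list(range(0, length - probe_len + 1, step))
    last = length - probe_len
    if starts[-1] != last:
        starts.append(last)
    return starts


def tile_probes(seq, probe_len=120, step=24):
    """Probe sequences cut from seq at the tiled start positions."""
    return [seq[s:s + probe_len] for s in tile_starts(len(seq), probe_len, step)]


def depth_at(pos, starts, probe_len=120):
    """Number of probes covering the 0-based position pos."""
    return sum(1 for s in starts if s <= pos < s + probe_len)


starts = tile_starts(240)
print(len(starts), depth_at(119, starts))
```

Positions near the ends of a reference are covered by fewer probes than interior positions, which is inherent to any fixed-step tiling.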

5.2. Data analysis

All HA and NA sequences were downloaded from the Influenza Virus Resource (https://www.ncbi.nlm.nih.gov/genomes/FLU/Database/nph-select.cgi?go=database) (Bao et al., 2008) on 02/19/2016; A/Brevig Mission/1/1918(H1N1) and A/Hangzhou/1/2013(H7N9) sequences were downloaded from NCBI; all human virus sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=10239&host=human) on 09/13/2017; and the human genome was downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/blast/db/) on 04/02/2013. The blastn program (Altschul et al., 1990) from the BLAST+ package (version 2.2.31) was used to search the enrichment probes against the downloaded databases, using 90% identity and an e-value of 0.001 as cutoffs. Blastn results from searching the A/Brevig Mission/1/1918(H1N1) and A/Hangzhou/1/2013(H7N9) sequences were converted to SAM files using blast2bam version 0.1 (https://github.com/guyduche/Blast2Bam). SAMtools (version 1.4) was used to convert the SAM files to BAM files, sort the BAM files, and index the sorted BAM files (Li et al., 2009). BEDTools (version 2.25.0) was used to calculate the overall coverage of the reads for each segment (Quinlan and Hall, 2010). Integrative Genomics Viewer (version 2.3.60) was used to generate the HA and NA coverage figure (Robinson et al., 2011). Sequencing was performed on an Illumina NextSeq instrument. Samples were demultiplexed and FASTQ files were generated using Illumina software. Reads were mapped to all downloaded HA and NA sequences, indexed with Bowtie2 (version 2.2.5, http://bowtie-bio.sourceforge.net/bowtie2/index.shtml), using TopHat2 (release 2.0.13, http://ccb.jhu.edu/software/tophat/index.shtml) downloaded from the Center for Computational Biology, Johns Hopkins University (http://ccb.jhu.edu/) (Trapnell et al., 2009).
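The identity and e-value cutoffs can be applied to tabular BLAST output as sketched below. The sketch assumes the standard 12-column tabular format (blastn -outfmt 6), which the text does not specify, and treats both cutoffs as inclusive; the probe and segment names and all hit values are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, min_identity=90.0, max_evalue=0.001):
    """Keep hits with identity >= min_identity and e-value <= max_evalue."""
    reader = csv.DictReader(handle, fieldnames=BLAST6_FIELDS, delimiter="\t")
    return [
        row for row in reader
        if float(row["pident"]) >= min_identity
        and float(row["evalue"]) <= max_evalue
    ]


# Invented hits: only the first passes both cutoffs.
example = io.StringIO(
    "probe_1\tseg_HA\t98.33\t120\t2\t0\t1\t120\t301\t420\t1e-55\t211\n"
    "probe_2\tseg_NA\t85.00\t120\t18\t0\t1\t120\t50\t169\t2e-30\t122\n"
    "probe_3\tseg_HA\t91.67\t24\t2\t0\t1\t24\t10\t33\t0.5\t30.1\n"
)
kept = filter_hits(example)
kept_ids = [row["qseqid"] for row in kept]
print(kept_ids)
```

The same thresholds can also be passed to blastn directly through its -perc_identity and -evalue options; whether the cutoffs in this study were applied at search time or afterwards is not stated.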

5.3. Plasmids spike-in and real-time PCR

A plasmid set of H1N1 (A/California/04/2009(H1N1)) was constructed in a previous study (Memoli et al., 2015), and a plasmid set of influenza B virus (B/Bethesda/NIH001/2007) was made by cloning the 8 viral gene segments of a wild-type influenza B virus, obtained from a patient at the NIH Clinical Center in 2007, into the pHH21 vector. Plasmids were purified using the QIAprep Spin Miniprep kit (Qiagen, Hilden, Germany). Real-time PCR primers for H1N1 are: HA-fwd: TAA ACA CCA GCC TCC CAT TTC, HA-rev: CCT GTG GCC AGT CTC AAT TT; MX-fwd: CAG TGC TGG TCT GAA AGA TGA, MX-rev: GAC GAG AGG ATC ACT TGA ATC G; NA-fwd: GGG CCT TGC TAA ATG ACA AAC, NA-rev: GGA GAG GGA ACT TCA CCA ATA G; NP-fwd: GTA CTC ACT GGT CGG GAT AGA, NP-rev: GAC TCT TGT GAG CTG GGT TT; NS-fwd: GGG AAA CAA ATC GTG GAA TGG, NS-rev: GAG GGT CAT GTC AGA AAG GTA G; PA-fwd: GCA AGC ATG AGG AGG AAC TAT, PA-rev: CAT TGA GCA AGG CCG TAT TTA TG; PB1-fwd: GAC AAT ATA CTG GTG GGA TGG G, PB1-rev: CTG TCC ACT CCT GCT TGT ATT; PB2-fwd: GCA GCA ATA GGG TTG AGG ATT A, PB2-rev: CCC GTT AGC ACT TCT TCT TCT T. Real-time PCR primers for influenza B virus are: HA-fwd: CCT CAT CTG CTA ATG GAG TAA CC, HA-rev: TTT GTG GTA GTC CTT CGT CTT C; MX-fwd: CCT TAT CGG GAA TGG GAA CAA, MX-rev: CTG AGC TTT CAT GGC CTT CT; NA-fwd: GGA AAC TCA GCT CCC TTG ATA A, NA-rev: TGA GCT GCA TAA TGG GTT AGA G; NP-fwd: AGT CTT GGC TTT GAT GTC TCT C, NP-rev: TCA AAG GAG GCG GAA CTT TAG; NS-fwd: CTG CTG GAA TTG AAG GGT TTG, NS-rev: CCC TGG TGT TGA AGG GTA AT; PA-fwd: GCA AGG ATG TCT CCC TTA GTA TC, PA-rev: CTT CTG GTA GCT CAT GGT TGT; PB1-fwd: GCC CGT AGG TGG AAA TGA AA, PB1-rev: CTG TTA CTG TCA TGC TGA TCC C; PB2-fwd: TAT CAC CCG GGA GGG AAT AA, PB2-rev: TGG GTT TGA TGC GAC TAT TGA.
Real-time PCR primers for H1 to H16 segments are: H1-fwd: GTT AGA GGA CAG GCA GGC AG, H1-rev: CGG AGT CAG ACC CCT TGT TC; H2-fwd: TGA CGA TGC GGA ACA AAG GA, H2-rev: CCC CCT TGT CCG TTG ACT TT; H3-fwd: ACG TTC AGG CAT CAG GAA GG, H3-rev: ACT AGT ACA TCC CCC GGC TT; H4-fwd: AGA GTG ACT GTC TCC ACC CA, H4-rev: CCT CTC ACC CAC GGT CTA CT; H5-fwd: TCA TCC TCT TGC CAT TGG GG, H5-rev: CCA TTC CTT GCC ATC CTC CT; H6-fwd: CCA AGT CAG CAG CGT ATC CA, H6-rev: TGC TCA TTG GTG TCA GGA GG; H7-fwd: TGG GGC ATT CAT AGC TCC TG, H7-rev: CCC CAC TGT GGA AGC AAT CT; H8-fwd: GGG GCA TTC TGA AAA GGG GA, H8-rev: GGC ATT TCG TGT GGC AGT TT; H9-fwd: TTG TCA ATG GTC AGC AGG GG, H9-rev: TCA CTC GCA ATG TCT GAC CC; H10-fwd: CAC CGA GAA CTG TGG GTC AA, H10-rev: CCA AAC AGG CCT CTC CCT TG; H11-fwd: GCT GGG TTC ATA GAG GGT GG, H11-rev: CTG CAG CAA TCC CTG TAC CT; H12-fwd: CCA TTC ACC ACC CAC CAA CA, H12-rev: AGT GGT GAC TGA GGA GAG GG; H13-fwd: CGC ACC TAC TTC TTG GGG AG, H13-rev: TGT TGG TCC CCG TAT TGT CC; H14-fwd: AGG TGG CAA CAG GGA GAG TA, H14-rev: AGA TGC TTA TCC TGC CGC TC; H15-fwd: ATG CCG TAG CAA ATG GGA CA, H15-rev: GTC CAC CGC TTT CTT CCC TT; H16-fwd: AGA GGG GTT TGT TTG GTG CT, H16-rev: GGC TTT CTG GGT GGA CAC TT. Primers for the human HER2 gene are: HER2-fwd: ACA ACC AAG TGA GGC AGG TC, HER2-rev: GTA TTG TTC AGC GGG TCT CC. All primers were synthesized by Integrated DNA Technologies (Coralville, IA). The isolated plasmids were diluted at least 1 million-fold before mixing with 500 ng of HeLa human DNA (New England Biolabs, Ipswich, MA). The mixed DNA was sheared to 150 bp on a Covaris S2 instrument (Covaris, Woburn, MA), and sequencing libraries (enriched and unenriched) were made following the standard protocols (see Section 5.5), omitting the RNA-to-cDNA steps.
Real-time PCR reactions using 1 μl of sequencing library were performed on an Applied Biosystems 7500 Real-Time PCR System (Foster City, CA) using Power SYBR® Green PCR Master Mix (Thermo Fisher Scientific, Waltham, MA) with the following program: 50 °C for 2 min; 95 °C for 3 min; followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min.
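One common way to read real-time PCR results such as these is to convert the difference in threshold cycle (Ct) between the unenriched and enriched libraries into an approximate fold difference. The sketch below assumes perfect doubling per cycle (an efficiency of 2), which is an idealization, and is not necessarily the calculation used in this study; the function name and the Ct values are hypothetical.

```python
def fold_difference(ct_unenriched, ct_enriched, efficiency=2.0):
    """Approximate fold difference in target abundance from two Ct values.

    Assumes the target multiplies by `efficiency` each cycle, so a
    target that crosses the threshold earlier (lower Ct) is more abundant.
    """
    return efficiency ** (ct_unenriched - ct_enriched)


# Hypothetical Ct values for one target in the two library types.
ct_unenriched = 30.0
ct_enriched = 20.0
fold = fold_difference(ct_unenriched, ct_enriched)
print(f"Approximate fold difference: {fold:.0f}")
```

A reference amplicon that the probes do not target, such as the human HER2 primers listed above, could be treated the same way to check that a change is specific to influenza sequences.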

5.4. RNA isolation

Virus stock was cultured in Madin-Darby canine kidney (MDCK) cells, and RNA was isolated using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany). Cloacal swabs from mallards were inoculated into embryonated specific pathogen-free (SPF) chicken eggs. Total RNA was extracted from cloacal swabs or first-passage allantoic fluid using the E.Z.N.A. viral RNA kit according to the manufacturer’s protocol (OMEGA bio-tek, Norcross, GA).

5.5. Library construction, enrichment and sequencing

Isolated total RNA was amplified with the Ovation RNA-Seq System V2 (NuGEN, San Carlos, CA) following the kit specifications. For each sample, 5 μl of total RNA was used as input. The amplified total cDNA was analyzed on an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA) and sheared to 150 bp on a Covaris S2 instrument (Covaris, Woburn, MA). About 300 ng of amplified cDNA was then used to make Illumina sequencing libraries. For influenza-enriched libraries, the Agilent SureSelectXT Target Enrichment Kit for Illumina Multiplex Sequencing (Agilent, Santa Clara, CA) was used with the designed influenza enrichment probes, following the protocol for 200 ng DNA samples. Briefly, the ends of the sheared cDNA were repaired, adenylated at the 3′ ends, ligated to adaptors, and amplified according to the protocol. The enrichment steps are as follows. Use a vacuum concentrator at less than 45 °C to concentrate the amplified sample (30 μl) from the previous step to a final volume of 3.4 μl. Prepare the hybridization buffer and block mix according to the protocol. Add 5.6 μl of the block mix to the 3.4 μl of sample (total 9 μl). Run the mixture on a thermal cycler with the following program: 95 °C for 5 min, then hold at 65 °C. Take the synthesized influenza enrichment probe set from −80 °C and thaw it on ice. Add 2 μl of the enrichment probe set (full cocktail) to 5 μl of diluted RNase block solution (total 7 μl) and keep the mixture on ice. Bring the hybridization buffer to 65 °C using a heat block or PCR machine. Place the 7 μl of enrichment probe and RNase block mixture on the PCR machine at 65 °C. Then quickly add 13 μl of the 65 °C hybridization buffer, followed by the 9 μl of sample and block mixture (total 29 μl). Incubate the hybridization mixture on the PCR machine at 65 °C, with the heated lid at 105 °C, for 16–24 h. Vigorously resuspend the Dynabeads MyOne Streptavidin T1 magnetic beads (Thermo Fisher Scientific, Waltham, MA) on a vortex mixer.
For each hybridization sample, 50 μl of the resuspended beads is needed. Wash the resuspended Streptavidin T1 magnetic beads 3 times with 200 μl of SureSelect Binding Buffer on a magnetic separator, then resuspend the beads in 200 μl of SureSelect Binding Buffer. Keep the hybridization tubes at 65 °C on the PCR machine while adding 200 μl of washed streptavidin beads to each tube, and mix well by pipetting slowly. Put the hybridization and bead mixture on a tube/plate mixer and rotate or mix for 30 min at room temperature. Put the tube in a magnetic separator to collect the beads. Wait until the solution is clear, then remove and discard the supernatant. Resuspend the beads in 200 μl of SureSelect Wash Buffer 1. Incubate the samples for 15 min at room temperature. Briefly spin the tube and put it on the magnetic separator. Wait for the solution to clear, then discard the supernatant. Resuspend the beads in 200 μl of Wash Buffer 2 prewarmed to 65 °C and incubate the sample for 10 min at 65 °C on the thermal cycler. Put the tube in the magnetic separator. Wait for the solution to clear, then discard the supernatant. Repeat the wash step for a total of 3 washes. Make sure all of the wash buffer has been removed after the final wash. Add 30 μl of nuclease-free water to each sample well. Pipette up and down to resuspend the beads. Captured DNA is retained on the streptavidin beads during post-capture amplification with indexing primers, which is the final step of enrichment library construction in the Agilent SureSelectXT Target Enrichment Kit procedure (Agilent, Santa Clara, CA). Sixteen PCR amplification cycles were used, as the protocol specifies for capture libraries of 1 kb to 0.5 Mb. For un-enriched libraries, the TruSeq RNA Library Prep Kit v2 (Illumina, San Diego, CA) was used following the manufacturer’s instructions, omitting the cDNA synthesis steps.
All Illumina sequencing libraries were analyzed on an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit. Libraries were then clustered on an Illumina cBot and sequenced on an Illumina GAIIx or NextSeq sequencer according to the manufacturer’s instructions (Illumina, San Diego, CA).
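The hybridization volumes in the enrichment protocol above can be checked with simple bookkeeping: 3.4 μl of sample plus 5.6 μl of block mix, 2 μl of probes plus 5 μl of RNase block, and 13 μl of hybridization buffer. The sketch below only re-adds the volumes quoted in the text; the variable names are illustrative.

```python
# Volumes in microliters, as quoted in the protocol text.
sample_block_mix = {"concentrated sample": 3.4, "block mix": 5.6}
probe_rnase_mix = {"enrichment probe set": 2.0, "diluted RNase block": 5.0}

# Round to 0.1 ul to avoid floating-point noise in the sums.
sample_block_total = round(sum(sample_block_mix.values()), 1)
probe_rnase_total = round(sum(probe_rnase_mix.values()), 1)

hybridization_mix = {
    "hybridization buffer": 13.0,
    "probe + RNase block mix": probe_rnase_total,
    "sample + block mix": sample_block_total,
}
hybridization_total = round(sum(hybridization_mix.values()), 1)

print(sample_block_total, probe_rnase_total, hybridization_total)
```

The three totals agree with the 9 μl, 7 μl, and 29 μl stated in the protocol.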

Supplementary Material

Suppl data 2
Suppl data 1
Suppl data 3
Suppl data 5
Suppl data 4

Acknowledgments

Collection and isolation of wild bird samples was supported with federal funds from the Centers of Excellence for Influenza Research and Surveillance (CEIRS), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272201400006C. This work was supported by the Intramural Research Program of the NIH and the NIAID.

Footnotes

Appendix A. Supplementary material

Supplementary data are available in the online version of this article at doi:10.1016/j.virol.2018.08.021

References

  1. Air GM, 1981. Sequence relationships among the hemagglutinin genes of 12 subtypes of influenza A virus. Proc. Natl. Acad. Sci. USA 78, 7639–7643.
  2. Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ, Weinstock GM, Gibbs RA, 2007. Direct selection of human genomic loci by microarray hybridization. Nat. Methods 4, 903–905.
  3. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ, 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
  4. Bao Y, Bolotov P, Dernovoy D, Kiryutin B, Zaslavsky L, Tatusova T, Ostell J, Lipman D, 2008. The influenza virus resource at the National Center for Biotechnology Information. J. Virol. 82, 596–601.
  5. Bodi K, Perera AG, Adams PS, Bintzler D, Dewar K, Grove DS, Kieleczawa J, Lyons RH, Neubert TA, Noll AC, Singh S, Steen R, Zianni M, 2013. Comparison of commercially available target enrichment methods for next-generation sequencing. J. Biomol. Technol. 24, 73–86.
  6. Briese T, Kapoor A, Mishra N, Jain K, Kumar A, Jabado OJ, Lipkin WI, 2015. Virome capture sequencing enables sensitive viral diagnosis and comprehensive virome analysis. MBio 6, e01491-15.
  7. Dinis JM, Florek NW, Fatola OO, Moncla LH, Mutschler JP, Charlier OK, Meece JK, Belongia EA, Friedrich TC, 2016. Deep sequencing reveals potential antigenic variants at low frequencies in influenza A virus-infected humans. J. Virol. 90, 3355–3365.
  8. Doud MB, Hensley SE, Bloom JD, 2017. Complete mapping of viral escape from neutralizing antibodies. PLoS Pathog. 13, e1006271.
  9. Dugan VG, Chen R, Spiro DJ, Sengamalay N, Zaborsky J, Ghedin E, Nolting J, Swayne DE, Runstadler JA, Happ GM, Senne DA, Wang R, Slemons RD, Holmes EC, Taubenberger JK, 2008. The evolutionary genetics and emergence of avian influenza viruses in wild birds. PLoS Pathog. 4, e1000076.
  10. Dugan VG, Dunham EJ, Jin G, Sheng ZM, Kaser E, Nolting JM, Alexander HL Jr., Slemons RD, Taubenberger JK, 2011. Phylogenetic analysis of low pathogenicity H5N1 and H7N3 influenza A virus isolates recovered from sentinel, free flying, wild mallards at one study site during 2006. Virology 417, 98–105.
  11. Fries AC, Nolting JM, Bowman AS, Lin X, Halpin RA, Wester E, Fedorova N, Stockwell TB, Das SR, Dugan VG, Wentworth DE, Gibbs HL, Slemons RD, 2015. Spread and persistence of influenza A viruses in waterfowl hosts in the North American Mississippi migratory flyway. J. Virol. 89, 5371–5381.
  12. Garcia-Garcia G, Baux D, Faugere V, Moclyn M, Koenig M, Claustres M, Roux AF, 2016. Assessment of the latest NGS enrichment capture methods in clinical context. Sci. Rep. 6, 20948.
  13. Ghedin E, Laplante J, DePasse J, Wentworth DE, Santos RP, Lepow ML, Porter J, Stellrecht K, Lin X, Operario D, Griesemer S, Fitch A, Halpin RA, Stockwell TB, Spiro DJ, Holmes EC, St George K, 2011. Deep sequencing reveals mixed infection with 2009 pandemic influenza A (H1N1) virus strains and the emergence of oseltamivir resistance. J. Infect. Dis. 203, 168–174.
  14. Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, Brockman W, Fennell T, Giannoukos G, Fisher S, Russ C, Gabriel S, Jaffe DB, Lander ES, Nusbaum C, 2009. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27, 182–189.
  15. Goodwin S, McPherson JD, McCombie WR, 2016. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 17, 333–351.
  16. Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ, Hannon GJ, McCombie WR, 2007. Genome-wide in situ exon capture for selective resequencing. Nat. Genet. 39, 1522–1527.
  17. Hoffmann E, Stech J, Guan Y, Webster RG, Perez DR, 2001. Universal primer set for the full-length amplification of all influenza A viruses. Arch. Virol. 146, 2275–2289.
  18. Iuliano AD, Roguski KM, Chang HH, Muscatello DJ, Palekar R, Tempia S, Cohen C, Gran JM, Schanzer D, Cowling BJ, Wu P, Kyncl J, Ang LW, Park M, Redlberger-Fritz M, Yu H, Espenhain L, Krishnan A, Emukule G, van Asten L, Pereira da Silva S, Aungkulanon S, Buchholz U, Widdowson MA, Bresee JS, Global Seasonal Influenza-associated Mortality Collaborator Network, 2018. Estimates of global seasonal influenza-associated respiratory mortality: a modelling study. Lancet 391, 1285–1300.
  19. Johnson NP, Mueller J, 2002. Updating the accounts: global mortality of the 1918–1920 “Spanish” influenza pandemic. Bull. Hist. Med. 76, 105–115.
  20. Joyce MG, Wheatley AK, Thomas PV, Chuang GY, Soto C, Bailer RT, Druz A, Georgiev IS, Gillespie RA, Kanekiyo M, Kong WP, Leung K, Narpala SN, Prabhakaran MS, Yang ES, Zhang B, Zhang Y, Asokan M, Boyington JC, Bylund T, Darko S, Lees CR, Ransier A, Shen CH, Wang L, Whittle JR, Wu X, Yassine HM, Santos C, Matsuoka Y, Tsybovsky Y, Baxa U, NISC Comparative Sequencing Program, Mullikin JC, Subbarao K, Douek DC, Graham BS, Koup RA, Ledgerwood JE, Roederer M, Shapiro L, Kwong PD, Mascola JR, McDermott AB, 2016. Vaccine-induced antibodies that neutralize Group 1 and Group 2 Influenza A viruses. Cell 166, 609–623.
  21. Krafft AE, Russell KL, Hawksworth AW, McCall S, Irvine M, Daum LT, Connoly JL, Reid AH, Gaydos JC, Taubenberger JK, 2005. Evaluation of PCR testing of ethanol-fixed nasal swab specimens as an augmented surveillance strategy for influenza virus and adenovirus identification. J. Clin. Microbiol. 43, 1768–1775.
  22. Laassri M, Zagorodnyaya T, Plant EP, Petrovskaya S, Bidzhieva B, Ye Z, Simonyan V, Chumakov K, 2015. Deep sequencing for evaluation of genetic stability of influenza A/California/07/2009 (H1N1) vaccine viruses. PLoS One 10, e0138650.
  23. Lee CW, Senne DA, Suarez DL, 2006. Development and application of reference antisera against 15 hemagglutinin subtypes of influenza virus by DNA vaccination of chickens. Clin. Vaccin. Immunol. 13, 395–402.
  24. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup, 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079.
  25. Li W, Godzik A, 2006. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659.
  26. Lindsay LL, Kelly TR, Plancarte M, Schobel S, Lin X, Dugan VG, Wentworth DE, Boyce WM, 2013. Avian influenza: mixed infections and missing viruses. Viruses 5, 1964–1977.
  27. McInerney P, Adams P, Hadi MZ, 2014. Error rate comparison during polymerase chain reaction by DNA polymerase. Mol. Biol. Int. 2014, 287430.
  28. Memoli MJ, Czajkowski L, Reed S, Athota R, Bristol T, Proudfoot K, Fargis S, Stein M, Dunfee RL, Shaw PA, Davey RT, Taubenberger JK, 2015. Validation of the wild-type influenza A human challenge model H1N1pdMIST: an A(H1N1) pdm09 dose-finding investigational new drug study. Clin. Infect. Dis. 60, 693–702.
  29. Mishin VP, Baranovich T, Garten R, Chesnokov A, Abd Elal AI, Adamczyk M, LaPlante J, St George K, Fry AM, Barnes J, Chester SC, Xu X, Katz JM, Wentworth DE, Gubareva LV, 2017. A pyrosequencing-based approach to high-throughput identification of Influenza A(H3N2) virus clades harboring antigenic drift variants. J. Clin. Microbiol. 55, 145–154.
  30. Morens DM, Taubenberger JK, Fauci AS, 2009. The persistent legacy of the 1918 influenza virus. N. Engl. J. Med. 361, 225–229.
  31. Nakamura S, Yang CS, Sakon N, Ueda M, Tougan T, Yamashita A, Goto N, Takahashi K, Yasunaga T, Ikuta K, Mizutani T, Okamoto Y, Tagami M, Morita R, Maeda N, Kawai J, Hayashizaki Y, Nagai Y, Horii T, Iida T, Nakaya T, 2009. Direct metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach. PLoS One 4, e4219.
  32. Nobusawa E, Aoyama T, Kato H, Suzuki Y, Tateno Y, Nakajima K, 1991. Comparison of complete amino acid sequences and receptor-binding properties among 13 serotypes of hemagglutinins of influenza A viruses. Virology 182, 475–485.
  33. Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME, 2007. Microarray-based genomic selection for high-throughput resequencing. Nat. Methods 4, 907–909.
  34. Parrish CR, Murcia PR, Holmes EC, 2015. Influenza virus reservoirs and intermediate hosts: dogs, horses, and new possibilities for influenza virus exposure of humans. J. Virol. 89, 2990–2994.
  35. Potapov V, Ong JL, 2017. Examining sources of error in PCR by single-molecule sequencing. PLoS One 12, e0169774.
  36. Qi L, Pujanauski LM, Davis AS, Schwartzman LM, Chertow DS, Baxter D, Scherler K, Hartshorn KL, Slemons RD, Walters KA, Kash JC, Taubenberger JK, 2014. Contemporary avian influenza A virus subtype H1, H6, H7, H10, and H15 hemagglutinin genes encode a mammalian virulence factor similar to the 1918 pandemic virus H1 hemagglutinin. MBio 5, e02116.
  37. Quinlan AR, Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
  38. Ren X, Yang F, Hu Y, Zhang T, Liu L, Dong J, Sun L, Zhu Y, Xiao Y, Li L, Yang J, Wang J, Jin Q, 2013. Full genome of influenza A (H7N9) virus derived by direct sequencing without culture. Emerg. Infect. Dis. 19, 1881–1884.
  39. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP, 2011. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26.
  40. Service RF, 2006. Gene sequencing. The race for the $1000 genome. Science 311, 1544–1546.
  41. Tewhey R, Nakano M, Wang X, Pabon-Pena C, Novak B, Giuffre A, Lin E, Happe S, Roberts DN, LeProust EM, Topol EJ, Harismendy O, Frazer KA, 2009. Enrichment of sequencing targets from the human genome by solution hybridization. Genome Biol. 10, R116.
  42. Trapnell C, Pachter L, Salzberg SL, 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111.
  43. Vemula SV, Zhao J, Liu J, Wang X, Biswas S, Hewlett I, 2016. Current approaches for diagnosis of influenza virus infections in humans. Viruses 8, 96.
  44. Wang R, Soll L, Dugan V, Runstadler J, Happ G, Slemons RD, Taubenberger JK, 2008. Examining the hemagglutinin subtype diversity among wild duck-origin influenza A viruses using ethanol-fixed cloacal swabs and a novel RT-PCR method. Virology 375, 182–189.
  45. Webster RG, Bean WJ, Gorman OT, Chambers TM, Kawaoka Y, 1992. Evolution and ecology of influenza A viruses. Microbiol. Rev. 56, 152–179.
  46. Wright PF, Neumann G, Kawaoka Y, 2007. Orthomyxoviruses. In: Knipe DM, Howley PM (Eds.), Fields Virology, 5th edition. Lippincott Williams and Wilkins, Philadelphia, PA, pp. 1691–1740.
  47. Wu Y, Wu Y, Tefsen B, Shi Y, Gao GF, 2014. Bat-derived influenza-like viruses H17N10 and H18N11. Trends Microbiol. 22, 183–191.
  48. Xiao YL, Kash JC, Beres SB, Sheng ZM, Musser JM, Taubenberger JK, 2013. High-throughput RNA sequencing of a formalin-fixed, paraffin-embedded autopsy lung tissue sample from the 1918 influenza pandemic. J. Pathol. 229, 535–545.
  49. Zhang Y, Aevermann BD, Anderson TK, Burke DF, Dauphin G, Gu Z, He S, Kumar S, Larsen CN, Lee AJ, Li X, Macken C, Mahaffey C, Pickett BE, Reardon B, Smith T, Stewart L, Suloway C, Sun G, Tong L, Vincent AL, Walters B, Zaremba S, Zhao H, Zhou L, Zmasek C, Klem EB, Scheuermann RH, 2017. Influenza Research Database: an integrated bioinformatics resource for influenza virus research. Nucleic Acids Res. 45, D466–D474.
  50. Zhou B, Donnelly ME, Scholes DT, St George K, Hatta M, Kawaoka Y, Wentworth DE, 2009. Single-reaction genomic amplification accelerates sequencing and vaccine production for classical and Swine origin human influenza A viruses. J. Virol. 83, 10309–10313.
  51. Zhou B, Lin X, Wang W, Halpin RA, Bera J, Stockwell TB, Barr IG, Wentworth DE, 2014. Universal influenza B virus genomic amplification facilitates sequencing, diagnostics, and reverse genetics. J. Clin. Microbiol. 52, 1330–1337.
  52. Zhou S, Jones C, Mieczkowski P, Swanstrom R, 2015. Primer ID validates template sampling depth and greatly reduces the error rate of next-generation sequencing of HIV-1 genomic RNA populations. J. Virol. 89, 8540–8555.
