Version Changes
Revised. Amendments from Version 1
Revisions have been made to the manuscript to address reviewers' comments. The changes in the new version are listed below. Abstract: We specify the type of samples used in the Abstract section Methods: We mention the type of research approaches used in the study. We restructure the sentences to better explain the rationale behind choosing two rhinovirus types for this study. Discussion: We address the reviewers' concerns on (i) possible reasons for failed sequencing of 25 samples, and (ii) implications of not sequencing the extreme sections of the 5' and 3' non-coding regions. Funding: We have listed all funding agencies that supported this work, which were missed out in the previous version. Other minor changes include the addition of suitable references to support the changes above.
Abstract
Background: Virus genome sequencing is increasingly utilized in epidemiological surveillance. Genomic data allows comprehensive evaluation of underlying viral diversity and epidemiology to inform control. For human rhinovirus (HRV), genomic amplification and sequencing is challenging due to numerous types, high genetic diversity and inadequate reference sequences.
Methods: We developed a tiled amplicon type-specific protocol for genome amplification and sequencing on the Illumina MiSeq platform of two HRV types, A15 and A101. We then assessed added value in analyzing whole genomes relative to the VP4/2 region only in the investigation of HRV molecular epidemiology within the community in Kilifi, coastal Kenya.
Results: We processed 73 nasopharyngeal swabs collected between 2016-2018, and 48 yielded at least 70% HRV genome coverage. These included all A101 samples (n=10) and 38 (60.3%) A15 samples. Phylogenetic analysis revealed that the Kilifi A101 sequences interspersed with global A101 genomes available in GenBank collected between 1999-2016. On the other hand, our A15 sequences formed a monophyletic group separate from the global genomes collected in 2008 and 2019. An improved phylogenetic resolution was observed with the genome phylogenies compared to the VP4/2 phylogenies.
Conclusions: We present a type-specific full genome sequencing approach for obtaining HRV genomic data and characterizing infections.
Keywords: human rhinovirus, whole-genome, sequencing, phylogenetics
Introduction
Genomic surveillance of respiratory viruses is important for (i) developing molecular diagnostics 1, 2, (ii) investigating transmission and evolution 3, 4, and (iii) development of vaccines and therapeutic drugs 5. Human rhinovirus (HRV) is the most common cause of upper respiratory infections 6, 7 and is also occasionally associated with lower respiratory infections 8. It is a highly diverse virus, with over 160 distinct types identified globally 9. This diversity presents a challenge in developing a sequencing protocol that works well across the different HRV types 10. Most HRV molecular epidemiology studies utilize partial genome sequences, which offer lower resolution in identifying epidemiologically linked infections 11.
Viral genome sequencing can take one of two approaches: a targeted/enrichment approach or an agnostic/metagenomics approach 12. Theoretically, the viral metagenomics approach is an unbiased way to obtain all viral genetic content in a sample, as it does not require prior knowledge of the viral sequences 13. However, this approach requires high viral titers to succeed 14, which are not always available in a clinical sample 15. Furthermore, most clinical samples, especially those relevant to HRV sequencing, are dominated by host and bacterial nucleic acids 13. This challenge can be overcome by target enrichment, for example, using polymerase chain reaction (PCR) to amplify the target virus before sequencing 15–17. We describe a target enrichment sequencing approach for two HRV types, A15 and A101, using type-specific primers and compare the phylogenetic inferences between partial and whole genome sequences. We used a combination of genomic, method-development, descriptive and retrospective approaches to achieve this.
Methods
Study population
The study utilized nasopharyngeal swabs collected during two previous studies in Kilifi County, coastal Kenya: outpatient surveillance of nine health dispensaries within the Kilifi Health and Demographic Surveillance System (KHDSS) between January and December 2016 7 and primary school surveillance between May 2017 and April 2018 6. The school was situated in Junju, a location within the KHDSS. All samples were collected from symptomatic individuals (mild symptoms of acute respiratory tract infection) of varied age (one month - 49 years) and archived at -80°C.
Study design
Samples were screened for HRV and typed as previously described 8. A cycle threshold (Ct) value < 35.0 was used to define positives. HRV positives underwent VP4/2 sequencing to characterize the diversity, spatial and temporal occurrence of HRV in the two settings. The most frequent type observed in the KHDSS health dispensary surveillance was A15 (n=63). Comparison of the HRV diversity within the school (located in Junju) and the Junju outpatient clinic revealed 12 common types, and the most frequent common type was HRV-A101 (n=10, with a frequency of n=5 in each setting) 9, 11.
For this study, we purposively selected two types from the two studies: A15 from the KHDSS surveillance and A101 from both the Junju health dispensary and school for whole genome sequencing (WGS). These two types were selected due to their high frequency of occurrence at the various scales of observation studied.
Ethics statement
Sample collection was undertaken following informed written consent provided by parents or guardians for persons <18 years or by participating individuals if aged >17 years. Children whose parents consented were also asked for individual assent to participate. The study protocols were reviewed and approved by the University of Warwick Biomedical and Scientific Research Ethics Committee (BSREC #REGO-2016-1858 and #REGO-2015-6102) and the KEMRI-Scientific Ethics Review Unit (KEMRI-SERU #3332 and #3103).
Primer design
We retrieved nine type A101 genomes and three type A15 genomes, all >6000 nt long, from GenBank 18 on 30th September 2019. Geneious Prime 2019.2.1 (https://www.geneious.com) was used to design eight overlapping primer pairs across the ~7.2 kb HRV genome. The primers targeted eight amplicons 0.9–1.6 kb in size, with overlaps varying in size from 300 to 800 bases (Table 1).
Table 1. Type-specific primers for the whole-genome amplification of two human rhinovirus types-A15 and A101.
| Name | Target type | Amplicon | Start | Length | %GC | Tm (°C) | Hairpin Tm (°C) | Self-dimer Tm (°C) | Pair dimer Tm (°C) | Sequence |
|---|---|---|---|---|---|---|---|---|---|---|
| 95 F_a101 | A101 | 1 | 95 | 23 | 39.1 | 58 | None | None | 6.9 | ACCCCAAATGTAACTTAGAAGCA |
| 716 R_a101 | A101 | 1 | 1,334 | 22 | 45.5 | 60 | None | None | None | TCATCAGTGGGTTGTTGTGAGT |
| 726 F_a101 | A101 | 2 | 726 | 20 | 50 | 60 | None | None | None | AGCATCAAGTGGAGCGTCAA |
| 1,215R_a101 | A101 | 2 | 1,833 | 22 | 54.5 | 62 | None | None | None | GACACCCACACGAACTGCATAC |
| 827 F_a101 | A101 | 3 | 1,445 | 22 | 36.4 | 56 | None | None | None | ATGCTGTTCCTATGGATTCAAT |
| 2,468R_a101 | A101 | 3 | 2,468 | 20 | 50 | 60 | None | None | None | TCTGGTTGTGTTTGGCTGGT |
| 1,524F_a101 | A101 | 4 | 2,142 | 22 | 40.9 | 56 | None | None | None | TACCACACCTGATACATACTCA |
| 3,516R_a101 | A101 | 4 | 3,516 | 20 | 55 | 60 | None | 8.8 | 3.2 | TCCACAATCTCCAGGTGCAC |
| 2,546F_a101 | A101 | 5 | 3,164 | 23 | 39.1 | 56 | None | None | None | TACCTACAAGAACAGACCTTACT |
| 3,901R_a101 | A101 | 5 | 4,519 | 22 | 40.9 | 55 | None | None | None | GTTTCCCTTTGTCTGGTAAATC |
| 4,102F_a101 | A101 | 6 | 4,102 | 20 | 50 | 60 | None | None | None | ACCCAGAAACAGCAGCAAGA |
| 5,248R_a101 | A101 | 6 | 5,248 | 23 | 39.1 | 58 | None | None | None | ACCCTGTGAACTTTCCATTACAT |
| 4,306F_a101 | A101 | 7 | 4,924 | 24 | 37.5 | 57 | None | None | None | AAATCAGTTAGGAATCCAGATGTC |
| 5,905R_a101 | A101 | 7 | 6,523 | 24 | 33.3 | 56 | None | None | None | TAGAATTACACAACTTCCTAACCA |
| 5,550F_a101 | A101 | 8 | 6,168 | 21 | 38.1 | 55 | None | None | None | ACCAATGATCACTTTCCTCAA |
| 6,383R_a101 | A101 | 8 | 7,001 | 24 | 33.3 | 56 | None | None | None | TGGTCATATTTGTCTTTTCCACTA |
| A15_1F | A15 | 1 | 21 | 20 | 55 | 61 | None | None | None | ATCCCACCTGAACCTCCCAA |
| A15_1R | A15 | 1 | 1,251 | 20 | 55 | 60 | None | None | None | CCAGCCGTGACATTACCTYT |
| 534F_a15_22 | A15 | 2 | 621 | 21 | 52.4 | 60 | None | 2.9 | None | CCATGGGCGCTCAAGTATCTA |
| 1,889R_a15_22 | A15 | 2 | 1,988 | 24 | 33.3 | 57 | None | None | None | CACAAAACATGAAACTGAATCGTA |
| 1,391F_a15_22 | A15 | 3 | 1,478 | 21 | 42.9 | 56 | None | None | None | AGACATAACAACTGGAGCTTG |
| 2,848R_a15_22 | A15 | 3 | 2,947 | 23 | 34.8 | 56 | None | None | None | TCCATCGTATCCATCATAAAACA |
| 2,417F_a15_22 | A15 | 4 | 2,516 | 22 | 40.9 | 55 | None | None | None | TCACAGACTAGAGATGAGATGA |
| 3,464R_a15_22 | A15 | 4 | 3,563 | 20 | 45 | 55 | None | None | None | CTATCACACCATGTTTGCAC |
| 2,900F_a15_22 | A15 | 5 | 2,999 | 24 | 33.3 | 54 | None | 4.3 | None | CTATGTTCAAGAATAGTCACTGAA |
| 4,352R_a15_22 | A15 | 5 | 4,451 | 23 | 39.1 | 58 | None | 3 | None | CACCAGGATTTTGCATAATGTCA |
| 3,576F_a15_22 | A15 | 6 | 3,675 | 21 | 38.1 | 55 | None | None | None | TTGGTGACGGGTTTGTAAATA |
| 4,991R_a15_22 | A15 | 6 | 5,090 | 24 | 33.3 | 55 | None | None | None | CAAATATAATGCCTGCTATACTGA |
| 4,385F_a15_22 | A15 | 7 | 4,484 | 23 | 43.5 | 58 | None | None | None | TCAAGTGTAACCTTTATCCCTCC |
| 5,943R_a15_22 | A15 | 7 | 6,042 | 23 | 43.5 | 59 | None | None | None | GTTCCAAACACACTATCCTCCAA |
| A15_8F | A15 | 8 | 5,972 | 20 | 55 | 60 | None | None | None | ACYCTTGAYATTGRCCCAGC |
| A15_8R | A15 | 8 | 7,029 | 20 | 55 | 60 | None | None | None | CTCACACTGCGAATCCCCTT |
| 5,560F_a15_22 | A15 | 8 | 5,659 | 21 | 42.9 | 55 | None | None | None | CATTCATGTTGGTGGTAATGG |
| 7,076R_a15_22 | A15 | 8 | 7,076 | 20 | 55 | 60 | None | None | None | AAGGCGGGATATACAGTGCG |
Tm - melting temperature
Start - Genome position (of the whole genome template used) where the primer sequence starts
GenBank sequences used in primer design were accession numbers MN306051.1, DQ473493.1 and JN541268.1 for A15, and KY460514.1, GQ415052.1, KY369891.1, KY189315.1, KY369897.1, KY369892.1, KY369889.1, JQ245965.1 and GQ415051.1 for A101.
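The tiling described above can be checked with a short calculation. The following is a minimal sketch in Python, assuming that the Start values of the forward and reverse primers in Table 1 lie on the same genome coordinate axis and can stand in for the left and right edges of each amplicon. Primer lengths are ignored, so the spans are approximate. The function names are illustrative and are not part of the published protocol.

```python
# Start coordinates copied from Table 1 for the A101 primer set:
# (amplicon number, forward-primer Start, reverse-primer Start).
A101_PRIMER_STARTS = [
    (1, 95, 1334),
    (2, 726, 1833),
    (3, 1445, 2468),
    (4, 2142, 3516),
    (5, 3164, 4519),
    (6, 4102, 5248),
    (7, 4924, 6523),
    (8, 6168, 7001),
]


def amplicon_spans(primer_starts):
    """Approximate span (bases) between the two Start values of each amplicon."""
    return {number: rev - fwd for number, fwd, rev in primer_starts}


def amplicon_overlaps(primer_starts):
    """Approximate overlap (bases) between each amplicon and the next one.

    Assumption: the overlap is the distance from the next amplicon's
    forward Start to the current amplicon's reverse Start.
    """
    ordered = sorted(primer_starts)
    return {
        (current[0], following[0]): current[2] - following[1]
        for current, following in zip(ordered, ordered[1:])
    }


if __name__ == "__main__":
    for pair, overlap in amplicon_overlaps(A101_PRIMER_STARTS).items():
        print(f"amplicons {pair[0]} and {pair[1]}: ~{overlap} bases shared")
```

Under these assumptions, every consecutive pair of A101 amplicons shares a few hundred bases, which is what allows adjacent amplicons to be joined during assembly.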
RNA extraction, reverse transcription and PCR
Viral RNA was extracted from 140 μl sample using QIAamp Viral RNA kit (Qiagen, USA) as per the manufacturer's recommendations. Reverse transcription was carried out using random hexamers and the Superscript III First-Strand Synthesis System (Invitrogen, United Kingdom). Genome-wide amplification using HRV-specific primers was done using the Q5 High-Fidelity 2X Master Mix (New England Biolabs, United Kingdom). PCR success was assessed by electrophoresing the products on a 1% agarose gel. Once suitable PCR conditions per amplicon were established, a duplex PCR of non-consecutive amplicons of similar conditions was set up (Protocol doi - dx.doi.org/10.17504/protocols.io.bukxnuxn).
Sequencing
PCR products were purified with 1X AMPure XP beads (Beckman Coulter Inc.), quantified with Qubit dsDNA High Sensitivity Assay (Invitrogen, United Kingdom), pooled per sample and normalized to 0.2 ng/μl. Sequencing libraries were prepared using the Nextera XT Sample Preparation Kit (Illumina, CA) and sequencing performed on Illumina MiSeq platform (200 bp × 2) per sample.
Sequence assembly
The raw reads were quality checked using FastQC v0.11.9 and trimmed (Phred score >30) using Trimmomatic v0.39 19. HRV reads were identified by mapping to the respective reference strains (https://www.picornaviridae.com/sg3/enterovirus/rv-a/rv-a_seqs.htm) and subsequently assembled into contigs using SPAdes v3.12.0. The contigs were checked for completeness and assembled to a consensus sequence using Sequencher v5.4.6 (www.genecodes.com). We defined sequencing success as obtaining HRV reads covering at least 70% of the genome (>5040 bases). Sequencing depth was visualized using the deepTools 20 package.
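The success criterion can be expressed as a simple check on per-base read depth. The sketch below is illustrative only: it assumes a genome length of 7,200 bases (from the ~7.2 kb figure given under Primer design) and counts a position as covered when at least one read maps to it. The `min_depth` threshold and the function names are assumptions, not values taken from the study.

```python
GENOME_LENGTH = 7200  # assumption: ~7.2 kb HRV genome
MIN_PERCENT = 70      # success threshold stated in the text


def covered_positions(depths, min_depth=1):
    """Count genome positions whose read depth is at least min_depth."""
    return sum(1 for depth in depths if depth >= min_depth)


def sequencing_succeeded(depths, genome_length=GENOME_LENGTH,
                         min_percent=MIN_PERCENT, min_depth=1):
    """Return True when covered positions reach min_percent of the genome.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    covered = covered_positions(depths, min_depth)
    return covered * 100 >= min_percent * genome_length


if __name__ == "__main__":
    # Invented depth profile: 5,500 covered positions, 1,700 uncovered.
    example = [250] * 5500 + [0] * 1700
    print(sequencing_succeeded(example))
```

In practice the depth vector would come from the read mapping step, with one value per reference position.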
Sequence analysis
Sequences were aligned using MAFFT v7.271 21. Recombination scans were done using RDP5 22 and visualized on SimPlot 23. Nucleotide substitutions across the genomes were visualized using a python script to examine genetic diversity across the genome. POPART 24 was used to construct haplotype networks using the Minimum Spanning Network model. The best-fitting model and maximum likelihood trees were inferred using IQ-TREE, v1.6.0 25. Branch support for phylogenetic trees was assessed using bootstrapping of 1000 iterations. MegaX 26 was used to calculate mean pairwise distances, and the respective standard errors were assessed using 100 iterations.
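The script used to visualize nucleotide substitutions is not reproduced in the manuscript. As a rough illustration of the underlying step, the sketch below lists the positions at which an aligned sample differs from its reference. It assumes both sequences come from the same alignment (equal length) and skips columns containing a gap or an unknown base; the function name and the toy sequences are hypothetical.

```python
def substitutions(reference, sample):
    """List (position, reference_base, sample_base) for each substitution.

    Positions are 1-based alignment columns. Columns where either sequence
    has a gap ('-') or an unknown base ('N') are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    skipped = set("-N")
    differences = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if ref_base in skipped or sample_base in skipped:
            continue
        if ref_base != sample_base:
            differences.append((position, ref_base, sample_base))
    return differences


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration.
    for position, ref_base, sample_base in substitutions("ACGTAC", "ATGNAA"):
        print(f"{position}: {ref_base} -> {sample_base}")
```

The resulting list can then be plotted along the genome axis, with one color per substituted base.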
Bayesian phylogeny was used to create time-structured phylogenetic trees using BEAST v.1.10.4 27. BEAST was run with 200 million MCMC steps using the best fitting substitution model and a coalescent-based relaxed clock framework 28. The output was assessed for convergence using Tracer v1.7.1. Maximum clade credibility (MCC) trees were identified using TreeAnnotator v1.10.4 after removal of 10% burn-in. The trees were then visualised in FigTree v1.4.4 29 and branching posterior probabilities were noted.
Statistical analysis
Statistical analysis was undertaken using R version 3.6.1 (R Core, 2021). The Shapiro–Wilk test was used to check for the normality of the data. The t-test was then used to compare Ct-values of successfully sequenced versus failed samples.
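The analysis itself was run in R. For readers who want to see the comparison spelled out, the sketch below computes a two-sample t statistic and its degrees of freedom in Python using only the standard library. It assumes the unequal-variance (Welch) form, which the manuscript does not specify, and it stops short of the p-value because that requires a t-distribution function from a statistics library. The Ct-values shown are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, dof


if __name__ == "__main__":
    # Invented Ct-values, not data from the study.
    failed = [27.1, 28.3, 29.0, 30.4, 26.8]
    sequenced = [28.9, 29.2, 30.1, 27.5, 31.0, 29.6]
    t_stat, dof = welch_t(failed, sequenced)
    print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```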
Results
Whole genome sequencing
We successfully sequenced all 10 (100%) A101 and 38 of 63 (60.3%) A15 samples. Cycle threshold (Ct) values ranged from 20.2 to 34.7, with a median of 28.4 for A15 and 30.2 for A101. The 25 samples that failed sequencing did not have a significantly higher median Ct-value than those successfully sequenced, based on the t-test: 28.3 (IQR = 4.0) versus 29.2 (IQR = 3.7), respectively (p = 0.21) (Figure 1A and B). In addition, samples that failed sequencing did not have unique phylogenetic clustering patterns based on their previously generated VP4/2 sequences. Sequencing depth was comparable across Ct-values, with the mean depth of coverage per genome ranging from 351 to 13356 reads per base pair (Figure 1C and D).
Figure 1. Summary statistics of sequenced samples.
(A) Distribution of cycle threshold (Ct) values across all samples selected for sequencing. Bars are colored by HRV type. (B) Dispersion of Ct-values across samples successfully sequenced and those that failed. (C) Read depth (per base pair) distribution per (successfully) sequenced sample. Each line represents a genome/sample. (D) Distribution of mean coverage per base pair per genome across successfully sequenced samples. The bars are colored by Ct-value group.
Phylogenetic analysis identified interspersion of local A101 sequences with global sequences (n=9) collected between the years 1999–2016. However, A15 local genomes clustered separately from global sequences (n=3), Figure 2. The global A15 genomes were collected in the years 2008 (n=2) and 2019 (n=1).
Figure 2.
Maximum-likelihood phylogenetic trees of local (generated) and global (A) A15 and (B) A101 sequences. The tips are coloured by origin, indicating the global sequences used in primer design. The scale bar represents nucleotide substitutions per site.
The ends of the 5' untranslated region (UTR) and 3' UTR were not amplified due to lack of suitable primers. Genetic diversity was observed across the entire genome and not within a particular genomic region for both types, as shown in Figure 3.
Figure 3.
Genetic diversity across the genome for (A) A15 and (B) A101. The known HRV strains (https://www.picornaviridae.com/sg3/enterovirus/rv-a/rv-a_seqs.htm) are used as the reference. A substitution to "A" is indicated by green, "C" by blue, "G" by indigo and "T" by red bars. Gray contiguous bars indicate unknown/unsequenced bases.
Phylogenetic resolution
We compared the phylogenetic bifurcation patterns and statistical uncertainty of VP4/2 and WGS for the two types. The depiction of sister taxa was comparable across the two trees. However, WGS resolved phylogenetic polytomies/unresolved branches observed previously in the VP4/2 phylogenies. For example, using VP4/2 sequences, all viruses collected in the school formed one polytomy, which was now fully bifurcated using WGS. Similarly, for A15, the four polytomies observed on VP4/2 phylogeny were well resolved using WGS, effectively distinguishing one sample from the other. Although the mean pairwise distances across VP4/2 and WGS were close, the standard error of pairwise distance calculations was notably less (about a tenth) in WGS than VP4/2, Figure 4.
Figure 4.
Maximum-likelihood phylogenetic trees of local (A) A15 and (B) A101 VP4/2 and whole genome sequences. The tips are coloured by site of origin. The scale bar represents nucleotide substitutions per site while node labels indicate bootstrap value. (C) Mean pairwise distances and respective standard errors of VP4/2 and whole genome sequences.
Overall higher branching posterior probabilities in Bayesian phylogenetic trees were observed using WGS than VP4/2 sequences. In VP4/2 Bayesian trees, 17.5% of A15 and 64.7% of A101 nodes had a posterior probability greater than 0.7 compared to 64.3% and 88.2% nodes in WGS trees, respectively, as illustrated in Figure 5.
Figure 5.
Bayesian phylogenetic trees of local and global (A) HRV-A15 and (B) HRV-A101 VP4/2 and whole genomes. The branches are coloured by site of origin. Node labels indicate branching posterior probabilities.
The improved resolution was further depicted by haplotype networks that displayed notably more alleles using WGS than VP4/2, e.g., in HRV-A101, school sequences that were considered a single allele using VP4/2 sequences resolved into five alleles when using whole-genome sequences, Figure 6. Identical samples at the VP4/2 region had a median of 3 nt changes for A101 and 5 nt changes for A15 across the whole genome.
Figure 6.
Haplotype networks displaying sequence variation of VP4/2 and whole-genome sequences of (A) A15 and (B) A101. Numbers along the edges indicate the nucleotide substitutions. The alleles are coloured by study site. (C) Recombination scan of recombinant sequence KEN_Rhinovirus_7018 compared to its major and minor parents and the A101 prototype sequence, GQ415051.1. A putative recombinant region was identified within the VP3.
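The gain in resolution from VP4/2 to whole genomes can be illustrated by counting distinct haplotypes over a sub-region and over the full alignment. The sketch below is a toy example rather than the POPART workflow used in the study: sequences are grouped by exact identity, and the region boundaries, sample names and sequences are hypothetical.

```python
from collections import defaultdict


def haplotype_groups(sequences, start=None, end=None):
    """Group sample names by identical sequence within [start, end).

    `sequences` maps sample name to aligned sequence. Coordinates are
    0-based and end-exclusive; None means the alignment edge.
    """
    groups = defaultdict(list)
    for name, sequence in sequences.items():
        groups[sequence.upper()[start:end]].append(name)
    return sorted(groups.values())


if __name__ == "__main__":
    # Toy alignment: identical over the first four columns, variable afterwards.
    toy = {
        "sample_1": "ACGTAAAA",
        "sample_2": "ACGTAAAT",
        "sample_3": "ACGTACAA",
        "sample_4": "ACGTAAAA",
    }
    print(len(haplotype_groups(toy, 0, 4)), "haplotype(s) in the short region")
    print(len(haplotype_groups(toy)), "haplotype(s) across the full alignment")
```

Samples that collapse into a single group over the short region separate into several groups once the remaining columns are included, which mirrors the pattern reported for the school A101 sequences.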
Recombination analysis
Recombination scans identified breakpoints within the VP3 of one A101 sequence (KEN_Rhinovirus_7018), with both parents belonging to A101 type (p-value < 1.922E-2), Figure 6C. Recombination within HRV structural regions has been shown to be rare and sporadic 30.
Discussion
This study presents a type-specific whole genome sequencing protocol for two HRV types. A101 had a higher success rate (100%) than A15 (60.3%). We attribute the higher A101 sequencing success to the higher number of sequences (n=9) that were available for primer design, which captured more intra-type variation, compared to A15 (n=3). Having more genomes contributing to the consensus sequence used in primer design increased genetic variation and, subsequently, the likelihood that the local and contemporaneous diversity was captured. While it is not clear what caused the sequencing failure of the 25 samples, we speculate that either (i) their genomic diversity was not captured in primer design, resulting in primer mismatches, (ii) the sample quality had deteriorated over time or (iii) PCR inhibitors or nuclease enzymes were present in the sample.
Whole-genome sequencing provided greater phylogenetic resolution and less statistical uncertainty than partial sequencing. Polytomies are a product of inadequate data and are, therefore, a potential source of bias. They also result in reduced statistical power due to increased uncertainty. The loss of terminal phylogenetic resolution may result in two opposing predictions: the underestimation (due to unresolved taxa) or overestimation (due to increased total tree length) of diversity 31. Due to the short size of VP4/2 (~420 nt), insufficient data results in reduced phylogenetic resolution and increased uncertainty, evidenced by a higher standard error in phylogenetic distance. Unresolved phylogenies are a challenge in epidemiology because infections in different individuals cannot be distinguished from one another for transmission inference.
Posterior probabilities summarize the uncertainties about a parameter and indicate confidence in the evidence 32. High posterior probabilities indicate high confidence, and the reverse is also true. Whole genomes consistently provided higher confidence across the two genotypes assessed in this study.
With pathogen sequencing now an established tool to track viral infections 2, 12, it is crucial to compare the resolution of different sequence analyses. As the huge antigenic diversity of HRV continues to pose a challenge in vaccine development, efforts should be directed towards understanding and mitigating transmission. Our study shows that HRV WGS is better suited for transmission inference than the commonly used VP4/2 sequences.
The sequencing approach we developed has some limitations. First, it requires prior genotyping of the HRV positive samples, commonly done by VP4/2 or VP1 sequencing. It is therefore unsuitable for sequencing new or highly divergent types due to the requirement for matching primer sequences. Second, having to create primer sets for each type is cumbersome and relies on an adequate number of pre-existing genomes to design conserved primers. In addition, amplicon-based target enrichment does not work well for low complexity regions such as the 5' UTR and 3' UTR. Although the 5' UTR alone does not offer adequate resolution to confidently distinguish HRV types 33, it is speculated to be a hotspot for recombination 30, 33. Not sequencing these extreme regions may therefore result in missing important evolutionary/phylogenetic signal. Notwithstanding, the new method can successfully enrich for human rhinovirus in archived samples of varying virus titers. It can also effectively capture intra-type recombinant regions, enabling detailed study of viral dynamics.
Conclusions
With HRV being the most common respiratory virus, it is surprising that so few whole genomes are publicly available to allow detailed intra-type analysis. We describe a new protocol for the whole genome sequencing of two HRV types and enrich the public database of HRV genomes. The protocol can be adapted for other HRV types. Our study also shows that WGS is more informative than VP4/2 sequencing in studying HRV dynamics as it maximizes resolution and reduces phylogenetic uncertainty.
Data availability
Accession number: GenBank, MW713746-MW713793
Accession number: BioProject, PRJNA701406
Root URL: https://identifiers.org/bioproject
Accession number URL: https://identifiers.org/bioproject:PRJNA701406
Harvard Dataverse. Replication Data for: Whole genome sequencing of two human rhinovirus A types (A101 and A15) detected in Kenya, 2016–2018. DOI: https://doi.org/10.7910/DVN/QGXZLI 34
This project contains the following underlying data:
This is a replication dataset for the manuscript titled: "Whole genome sequencing of two human rhinovirus A types (A101 and A15) detected in Kenya, 2016–2018." The dataset contains cycle threshold (Ct) values and read/sequencing depth.
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
Acknowledgements
We thank all the study participants, the field study team and the laboratory staff of the Virus Epidemiology and Control research group at the KEMRI-Wellcome Trust Research Programme. This paper was published with the permission of the Director of KEMRI.
Funding Statement
This work was supported by the Wellcome Trust through a Wellcome Trust Senior Investigator Award to DJN (#102975). MML was supported by the Fogarty International Center (#U2RTW010677) of the National Institutes of Health (NIH) and DELTAS Africa Initiative (#DEL-15-003) of the African Academy of Sciences (AAS). The content is the authors’ responsibility and does not necessarily represent the official views of the Wellcome Trust, NIH, nor AAS.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
[version 2; peer review: 2 approved]
References
- 1. Agoti CN, Kiyuka PK, Kamau E, et al.: Human Rhinovirus B and C Genomes from Rural Coastal Kenya. Genome Announc. 2016;4(4):e00751–16. 10.1128/genomeA.00751-16
- 2. Meredith LW, Hamilton WL, Warne B, et al.: Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: a prospective genomic surveillance study. Lancet Infect Dis. 2020;20(11):1263–1271. 10.1016/S1473-3099(20)30562-4
- 3. Otieno JR, Kamau EM, Oketch JW, et al.: Whole genome analysis of local Kenyan and global sequences unravels the epidemiological and molecular evolutionary dynamics of RSV genotype ON1 strains. Virus Evol. 2018;4(2):vey027. 10.1093/ve/vey027
- 4. Agoti CN, Phan MVT, Munywoki PK, et al.: Genomic analysis of respiratory syncytial virus infections in households and utility in inferring who infects the infant. Sci Rep. 2019;9(1):10076. 10.1038/s41598-019-46509-w
- 5. Thanh Le T, Andreadakis Z, Kumar A, et al.: The COVID-19 vaccine development landscape. Nat Rev Drug Discov. 2020;19(5):305–306. 10.1038/d41573-020-00073-5
- 6. Adema IW, Kamau E, Nyiro JU, et al.: Surveillance of respiratory viruses among children attending a primary school in rural coastal Kenya [version 2; peer review: 2 approved]. Wellcome Open Res. 2020;5:63. 10.12688/wellcomeopenres.15703.2
- 7. Nyiro JU, Munywoki P, Kamau E, et al.: Surveillance of respiratory viruses in the outpatient setting in rural coastal Kenya: baseline epidemiological observations [version 1; peer review: 2 approved]. Wellcome Open Res. 2018;3:89. 10.12688/wellcomeopenres.14662.1
- 8. Onyango CO, Welch SR, Munywoki PK, et al.: Molecular epidemiology of human rhinovirus infections in Kilifi, coastal Kenya. J Med Virol. 2012;84(5):823–831. 10.1002/jmv.23251
- 9. Morobe JM, Nyiro JU, Brand S, et al.: Human rhinovirus spatial-temporal epidemiology in rural coastal Kenya, 2015-2016, observed through outpatient surveillance [version 2; peer review: 2 approved]. Wellcome Open Res. 2019;3:128. 10.12688/wellcomeopenres.14836.2
- 10. Tapparel C, Junier T, Gerlach D, et al.: New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features. BMC Genomics. 2007;8:224. 10.1186/1471-2164-8-224
- 11. Luka MM, Kamau E, Adema I, et al.: Molecular epidemiology of human rhinovirus from one-year surveillance within a school setting in rural coastal Kenya. medRxiv. 2020; 2020.03.09.20033019. 10.1101/2020.03.09.20033019
- 12. Houldcroft CJ, Beale MA, Breuer J: Clinical and biological insights from viral genome sequencing. Nat Rev Microbiol. 2017;15(3):183–192. 10.1038/nrmicro.2016.182
- 13. Gu W, Miller S, Chiu CY: Clinical Metagenomic Next-Generation Sequencing for Pathogen Detection. Annu Rev Pathol. 2019;14:319–338. 10.1146/annurev-pathmechdis-012418-012751
- 14. Greninger AL, Naccache SN, Federman S, et al.: Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis. Genome Med. 2015;7:99. 10.1186/s13073-015-0220-9
- 15. Mertes F, Elsharawy A, Sauer S, et al.: Targeted enrichment of genomic DNA regions for next-generation sequencing. Brief Funct Genomics. 2011;10(6):374–386. 10.1093/bfgp/elr033
- 16. Kamaraj US, Tan JH, Xin Mei O, et al.: Application of a targeted-enrichment methodology for full-genome sequencing of Dengue 1-4, Chikungunya and Zika viruses directly from patient samples. PLoS Negl Trop Dis. 2019;13(4):e0007184. 10.1371/journal.pntd.0007184
- 17. Hasan MR, Rawat A, Tang P, et al.: Depletion of Human DNA in Spiked Clinical Specimens for Improvement of Sensitivity of Pathogen Detection by Next-Generation Sequencing. J Clin Microbiol. 2016;54(4):919–927. 10.1128/JCM.03050-15
- 18. Sayers EW, Agarwala R, Bolton EE, et al.: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2019;47(D1):D23–D28. 10.1093/nar/gky1069
- 19. Bolger AM, Lohse M, Usadel B: Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–2120. 10.1093/bioinformatics/btu170
- 20. Ramírez F, Ryan DP, Grüning B, et al.: deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160–5. 10.1093/nar/gkw257
- 21. Yamada KD, Tomii K, Katoh K: Application of the MAFFT sequence alignment program to large data-reexamination of the usefulness of chained guide trees. Bioinformatics. 2016;32(21):3246–3251. 10.1093/bioinformatics/btw412
- 22. Martin DP, Varsani A, Roumagnac P, et al.: RDP5: A computer program for analysing recombination in and removing signals of recombination from, nucleotide sequence datasets. Virus Evol. 2020;7(1):veaa087. 10.1093/ve/veaa087
- 23. Lole KS, Bollinger RC, Paranjape RS, et al.: Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol. 1999;73(1):152–160. 10.1128/JVI.73.1.152-160.1999
- 24. Leigh JW, Bryant D: POPART: Full-feature software for haplotype network construction. Methods Ecol Evol. 2015;6(9):1110–1116. 10.1111/2041-210X.12410
- 25. Nguyen LT, Schmidt HA, von Haeseler A, et al.: IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol Biol Evol. 2015;32(1):268–274. 10.1093/molbev/msu300
- 26. Kumar S, Stecher G, Li M, et al.: MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol Biol Evol. 2018;35(6):1547–1549. 10.1093/molbev/msy096
- 27. Drummond AJ, Suchard MA, Xie D, et al.: Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012;29(8):1969–1973. 10.1093/molbev/mss075
- 28. Drummond AJ, Ho SYW, Phillips MJ, et al.: Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006;4(5):e88. 10.1371/journal.pbio.0040088
- 29. Rambaut A, Drummond AJ: FigTree version 1.4.0. 2012.
- 30. McIntyre CL, Savolainen-Kopra C, Hovi T, et al.: Recombination in the evolution of human rhinovirus genomes. Arch Virol. 2013;158(7):1497–1515. 10.1007/s00705-013-1634-6
- 31. Swenson NG: Phylogenetic Resolution and Quantifying the Phylogenetic Diversity and Dispersion of Communities. PLoS One. 2009;4(2):e4390. 10.1371/journal.pone.0004390
- 32. Llewelyn H: Replacing P-values with frequentist posterior probabilities of replication-When possible parameter values must have uniform marginal prior probabilities. PLoS One. 2019;14(2):e0212302. 10.1371/journal.pone.0212302
- 33. Savolainen-Kopra C, Blomqvist S, Smura T, et al.: 5' noncoding region alone does not unequivocally determine genetic type of human rhinovirus strains. J Clin Microbiol. 2009;47(4):1278–80. 10.1128/JCM.02130-08
- 34. Luka MM, Kamau E, de Laurent ZR, et al.: Replication Data for: Whole genome sequencing of two human rhinovirus A types (A101 and A15) detected in Kenya, 2016-2018. 2021. 10.7910/DVN/QGXZLI






