Abstract
Background
The genetic diversity and epidemiological characteristics of SARS-CoV-2 lineages in Tanzania have not yet been fully explored, despite their critical impact on public health.
Methods
We conducted a comprehensive genomic analysis of 80 sequences derived from 90 samples collected between November 2022 and July 2023 from multiple regions in Tanzania.
Results
Our findings reveal a complex landscape of 7 Omicron clades, with clades 22F (XBB*) and 22E (BQ.1) emerging as the predominant variants, comprising 56.3% and 21.35% of all samples, respectively. Notably, there were regional variations in the distribution of lineages. We observed that four lineages were introduced in November 2022, which later led to local transmission through virus imports and exports, with the majority of exports originating from Dar es Salaam to other regions. Phylogenetic analysis revealed distinct clades comprising only Tanzanian sequences, demonstrating localised transmission within the country.
Conclusions
This study provides key insights into the genetic diversity and transmission dynamics of SARS-CoV-2 in Tanzania. It highlights regional discrepancies in lineage distribution, identifying sixteen Omicron lineages and the dominance of clades 22F (XBB*) and 22E (BQ.1) in the country. The presence of Tanzania-specific clades suggests sustained local transmission of the virus. These findings underscore the importance of enhanced integrated genomic surveillance systems and ongoing monitoring of SARS-CoV-2 evolution.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-026-12656-4.
Keywords: SARS-CoV-2, Tanzania, Omicron, Phylogenetic analysis, Genomic surveillance
Introduction
It has been four years since the emergence of the COVID-19 global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1, 2]. As of July 2025, more than 778 million cases and approximately 7 million deaths have been reported worldwide [3]. In Tanzania, 43,078 COVID-19 cases have been reported, and 846 COVID-19-related deaths have occurred [4]. Overall, the number of new infections has declined, but a low number of cases are still reported globally.
SARS-CoV-2 belongs to the Coronaviridae family and has a positive-sense single-stranded RNA genome of approximately 30 kb that encodes 29 proteins [5]. The virus accumulates natural mutations and antigenic variations as it evolves, with five variants of concern (VOCs) and various variants of interest (VOIs) having been reported to date. The VOCs included Alpha (B.1.1.7 and Q lineages), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2) and Omicron (B.1.1.529), and the VOIs included Epsilon (B.1.427 and B.1.429), Eta (B.1.525), Iota (B.1.526), Kappa (B.1.617.1), Zeta (P.2), and Mu (B.1.621 and B.1.621.1). The Omicron variant, first identified in South Africa and categorised as a VOC by the World Health Organisation (WHO) on November 26, 2021, corresponds to Pango lineage B.1.1.529 and falls under the Nextstrain clade 21K [6]. Compared with the original virus, the Omicron variant presented more than 30 amino acid changes in the spike protein, including three minor deletions and one small insertion [7]. Despite its numerous mutations, the Omicron variant, with all its sublineages, including B.1.1.529, BA.2, BA.5, BA.1, XBB, BQ.1 and JN.1, is associated with a reduced risk of severe illness and fatality compared with the preceding Delta variant [8]. However, despite the relatively low mortality rate, close monitoring of new strains is warranted because of their rapid evolution and the emergence of numerous sublineages [8, 9].
Tanzania, like other countries in the world, has experienced SARS-CoV-2 infections since the first case of COVID-19 was detected on March 16, 2020 [10]. At the beginning of the pandemic, the country adopted and emphasised infection prevention and control measures [10]. In 2021, the government approved COVID-19 vaccination. Several SARS-CoV-2 variants have been detected circulating within Tanzania [11, 12]. The Ministry of Health continues to monitor SARS-CoV-2 infections through influenza sentinel surveillance and testing of suspected cases at the National Public Health Laboratory (NPHL). Influenza sentinel surveillance in Tanzania was established in 2008 with the objective of describing the epidemiology, seasonality and burden of influenza in the country [13]. In 2022, Tanzania adopted a comprehensive surveillance system by integrating SARS-CoV-2 testing with existing influenza sentinel surveillance [14], as recommended by the WHO [15]. Despite these commendable efforts, there is a lack of published data on the prevalence and variant evolution of the SARS-CoV-2 Omicron variant in Tanzania. The aim of this study was to fill this gap by investigating SARS-CoV-2 Omicron variant diversity in Tanzania between November 2022 and July 2023 through the influenza-like illness (ILI)/severe acute respiratory infection (SARI) sentinel surveillance system.
Materials and methods
Study design and setting
This was a descriptive cross-sectional study of all samples that tested positive for SARS-CoV-2 within the integrated influenza and SARS-CoV-2 surveillance in Tanzania. In Tanzania, 27 influenza sentinel sites are distributed across 19 regions. By design, each sentinel site is required to collect 2 samples per day from ILI cases presenting at the facility and 2 samples per day from admitted SARI cases. During the study period (epidemiological week 44 of 2022 to week 30 of 2023), a total of 7,920 samples collected from individuals presenting with ILI, SARI or acute respiratory illness (ARI) across all influenza sentinel sites were submitted to the National Influenza Centre (NIC) of the NPHL, where all samples were tested for both influenza virus and SARS-CoV-2 using reverse transcription real-time PCR (RT-qPCR). A total of 239 samples (3%) tested positive for SARS-CoV-2 after screening using the US CDC Influenza SARS-CoV-2 (Flu SC2) RT-qPCR multiplex assay (CDC, Atlanta, USA) [16]. Ninety samples met the inclusion criterion of a cycle threshold (Ct) value of < 30 in the RT-qPCR step; these were included in the study and subjected to SARS-CoV-2 whole-genome sequencing.
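The screening and inclusion steps described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the `Sample` record and its field names are hypothetical and do not reflect the laboratory's actual data system; only the Ct < 30 criterion and the 239/7,920 counts come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

CT_THRESHOLD = 30.0  # inclusion criterion stated in the text: Ct < 30


@dataclass
class Sample:
    # Hypothetical record layout, for illustration only.
    sample_id: str
    sars_cov_2_positive: bool
    ct_value: Optional[float]  # None when no amplification was recorded


def positivity_rate(n_positive: int, n_tested: int) -> float:
    """Return the percentage of tested samples that were positive."""
    return 100.0 * n_positive / n_tested


def eligible_for_sequencing(samples: List[Sample]) -> List[Sample]:
    """Keep positive samples whose Ct value is strictly below the threshold."""
    return [
        s for s in samples
        if s.sars_cov_2_positive
        and s.ct_value is not None
        and s.ct_value < CT_THRESHOLD
    ]
```

With the counts reported in the text, `positivity_rate(239, 7920)` evaluates to roughly 3.02, which rounds to the 3% quoted above.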
Whole-genome sequencing of SARS-CoV-2
RNA extraction was performed on 90 samples using the QIAamp Viral RNA Mini Kit (QIAGEN, Hilden, Germany), according to the manufacturer’s instructions [17]. An Illumina COVIDSeq kit (Illumina, USA) was used to prepare the sequencing libraries [18]. The resulting libraries were subsequently pooled, normalised, and quantified using the Qubit DNA High Sensitivity Kit (Applied Biosystems, Foster City, CA, USA). Paired-end sequencing was performed on a MiSeq system (Illumina, USA) using a MiSeq 600 cycle V3 kit (Illumina, USA) with ARTIC primers version 4. The multiplexed fastq files were processed with the Nextflow-based [19] viralrecon analysis pipeline [20] from nf-core [21]. In brief, the viralrecon pipeline does the following: the adapters are trimmed from the fastq reads via fastp [22] (version 0.20.1), and the reads are then aligned to the SARS-CoV-2 reference genome (NC_045512.2) via Bowtie 2 [23] (version 3.5.1). Next, the aligned reads are sorted and indexed via SAMtools [24] (version 1.9), the amplicon primer sequences are trimmed, variants are called, and per-sample consensus sequences are generated via iVar [25] (version 1.2.2). The resulting consensus files were submitted to the Nextclade web tool [26] for additional quality checks of the sequences as well as determination of the lineages and clades of the corresponding sequences. In the Nextclade quality check algorithm, sequences were checked for clustered mutations (counted in a sliding window of 100 nucleotides; 6 or more private mutations in the window are counted as a cluster), private mutations, ambiguous bases (10 or more ambiguous nucleotides are flagged in QC), sequence length (sequences with more than 3,000 are flagged), stop codons (premature stop codons are flagged) and frameshifts (all frameshifts are flagged except for those that are known to be present). The scores are assigned as good, mediocre and bad (Supplementary Material Table 1).
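Two of the quality-check rules described above (the sliding-window cluster rule and the ambiguous-base rule) can be sketched in Python as follows. This is a simplified illustration of the stated thresholds, not Nextclade's own implementation; the function names are hypothetical, and treating only the IUPAC codes R, Y, K, M, S, W, B, D, H and V as "ambiguous" (excluding N) is an assumption.

```python
from typing import Iterable

WINDOW = 100           # sliding-window width in nucleotides (from the text)
CLUSTER_CUTOFF = 6     # private mutations per window that count as a cluster
AMBIGUOUS_CUTOFF = 10  # ambiguous nucleotides at which a sequence is flagged

# Assumption: "ambiguous" means IUPAC mixed-base codes, not N.
IUPAC_AMBIGUOUS = set("RYKMSWBDHV")


def has_private_mutation_cluster(
    positions: Iterable[int],
    window: int = WINDOW,
    cutoff: int = CLUSTER_CUTOFF,
) -> bool:
    """True if any stretch of `window` nucleotides holds >= `cutoff` mutations."""
    ordered = sorted(positions)
    left = 0
    for right, pos in enumerate(ordered):
        # Shrink from the left until both ends fit within one window.
        while pos - ordered[left] >= window:
            left += 1
        if right - left + 1 >= cutoff:
            return True
    return False


def count_ambiguous(sequence: str) -> int:
    """Count IUPAC ambiguity codes in a consensus sequence."""
    return sum(1 for base in sequence.upper() if base in IUPAC_AMBIGUOUS)


def too_many_ambiguous(sequence: str, cutoff: int = AMBIGUOUS_CUTOFF) -> bool:
    """True if the sequence reaches the ambiguous-base flagging threshold."""
    return count_ambiguous(sequence) >= cutoff
```

For example, six private mutations at positions 100 to 150 fall within one 100-nucleotide window and would be reported as a cluster, whereas six mutations spaced 200 nucleotides apart would not.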
Phylogenetic tree
Nextstrain was used to construct a phylogenetic tree via the ncov workflow [27]. We employed a contextual sampling approach, downloading global, African and Tanzanian sequences from GISAID [28, 29]. We then removed 8 local sequences that failed Nextclade QC (bad score). We used three subsampling schemes. The first was applied only to custom sequences and retained all Tanzanian sequences. The second provided African context: from sequences whose region was Africa, 500 were selected on the basis of their proximity to the Tanzanian sequences. The third provided global context: from non-custom sequences whose region was not Africa, 500 were selected on the basis of proximity. In summary, the workflow uses the Augur toolkit to eliminate short and low-quality sequences, as well as those lacking complete sampling dates. The filtered sequences are aligned via MAFFT v7.471 [30] according to the configuration file. Irrelevant sections and termini are masked from the alignment, and context subsampling is conducted using genomes genetically similar to our main subset, prioritising sequences in closer proximity to Tanzanian sequences. Next, the Nextstrain pipeline constructed a maximum likelihood (ML) phylogenetic tree via IQ-TREE v2.0.3 [31], employing the general time reversible (GTR) model with unequal rates and base frequencies. We then produced a time-scaled tree, resolving polytomies and dating internal nodes with TreeTime v0.7.6 [32] under a strict clock and a skyline coalescent prior with a rate of 8 × 10⁻⁴ substitutions per site per year. Finally, the pipeline annotated clades, identified mutations, inferred geographical movements, and exported the results to JSON format for interactive visualisation via Auspice [27].
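To give a sense of scale for the strict-clock rate quoted above, the following Python sketch converts it into an expected number of substitutions per genome. The genome length of 29,903 nucleotides is that of the reference NC_045512.2; the function name is hypothetical, and the calculation is a back-of-the-envelope illustration rather than part of the Nextstrain workflow.

```python
CLOCK_RATE = 8e-4       # substitutions per site per year (from the text)
GENOME_LENGTH = 29_903  # length of reference genome NC_045512.2 in nucleotides


def expected_substitutions(years: float) -> float:
    """Expected substitutions per genome accumulated over `years` under a strict clock."""
    return CLOCK_RATE * GENOME_LENGTH * years


per_year = expected_substitutions(1.0)       # about 23.9 substitutions per genome per year
per_month = expected_substitutions(1 / 12)   # about 2 substitutions per genome per month
```

Under this rate, two genomes sampled a month apart on the same transmission chain would be expected to differ by only about two substitutions, which is why dense contextual sampling matters when resolving local clusters.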
To evaluate the geographical transmission pattern of the virus in Tanzania, we aligned 76 high-quality sequences from this study with 546 Tanzanian sequences submitted to GISAID using the Augur alignment tool. The alignment was then used to infer maximum likelihood (ML) tree topologies in IQ-TREE v2.0.3 via the general time reversible (GTR) model of nucleotide substitution with a total of 1,000 bootstrap replicates. The resulting ML tree topology was first inspected in TempEst [33] to identify any sequences that deviated more than 0.0005 from the residual mean. To understand viral dispersion, we replicated the approach used by Tegally et al. [34]. Briefly, we performed a basic viral dispersal analysis. A migration model was fitted to the time-calibrated tree topology in TreeTime, mapping the locations of the sampled sequences to the external tips of the trees. The mugration model of TreeTime also infers the most likely location for internal nodes in the trees. Using a custom Python script (downloaded from https://github.com/CERI-KRISP/africa-covid19-genomics/tree/main/python_scripts, accessed on 12 June 2025), we then counted the number of state changes by iterating over each phylogeny from the root to the external tips. A state change was counted when an internal node transitioned from one region to a different region in the resulting child node or tip(s). The timing of each transition event was then recorded and served as the estimated time of the import or export event.
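The state-change counting described above can be illustrated with a minimal Python sketch. It is not the CERI-KRISP script: the `Node` structure, the toy tree, and the choice to record the child node's date as the event time are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    # Hypothetical annotated tree node: inferred region and decimal-year date.
    location: str
    date: float
    children: List["Node"] = field(default_factory=list)


def count_state_changes(root: Node) -> List[Tuple[str, str, float]]:
    """Walk from the root to the tips and record (origin, destination, date)
    for every parent-child edge whose inferred location differs."""
    events = []
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            if child.location != parent.location:
                # Assumption: the child's date is used as the event time.
                events.append((parent.location, child.location, child.date))
            stack.append(child)
    return events


def exports_by_origin(events: List[Tuple[str, str, float]]) -> Counter:
    """Tally export events by their region of origin."""
    return Counter(origin for origin, _, _ in events)


# Toy tree with made-up dates, for illustration only.
toy_root = Node("Dar es Salaam", 2022.85, [
    Node("Dar es Salaam", 2022.90),
    Node("Lindi", 2022.92, [
        Node("Lindi", 2022.96),
        Node("Mtwara", 2022.97),
    ]),
    Node("Arusha", 2022.95),
])
toy_events = count_state_changes(toy_root)
toy_exports = exports_by_origin(toy_events)
```

In this toy tree there are three state changes: two exports from Dar es Salaam (to Lindi and Arusha) and one from Lindi (to Mtwara). Each export from one region is simultaneously an import into the destination region.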
Results
Patient demographic characteristics
Between November 2022 and July 2023, a total of 7,920 samples were tested for both influenza virus and SARS-CoV-2 by RT-qPCR, of which 239 (3%) tested positive for SARS-CoV-2. The distribution of all the samples by region, sex and age group is shown in Table 1 and Supplementary Material Fig. 1. Among the positive samples, 90 met the inclusion criterion of a Ct value of < 30 for nucleotide sequencing. All 90 samples were sequenced, yielding 83 SARS-CoV-2 genome sequences. The subsequent Nextclade quality assessment identified 80 sequences that met the quality standards for further analysis. Three sequences were removed because of unusual private mutations caused by mixed reads. The largest share of samples, 20 (25%), originated from the Lindi region, followed by 8 (10%) samples from Mwanza and Mtwara. The median age of the studied patients was 24.5 years (interquartile range (IQR), 0–83 years). Females constituted 50.6% of the total population; further demographic details are given in Table 2.
Table 1.
Distribution of total samples collected between November 2022 and July 2023 by sex, region and age group with their corresponding number of SARS-CoV-2-positive cases
| Characteristics | Samples collected (SARS-CoV-2 positive) |
|---|---|
| Sex N = 7920 | |
| Male | 4016 (122) |
| Female | 3894 (117) |
| Missing | 10 (0) |
| Age group | |
| 0–5 years | 3965 (99) |
| 6–17 years | 621 (17) |
| 18–45 years | 1761 (61) |
| > 45 years | 1391 (55) |
| Missing | 182 (7) |
| Regions | |
| Arusha | 1081 (36) |
| Dar es Salaam | 1078 (27) |
| Dodoma | 346 (16) |
| Kigoma | 506 (20) |
| Morogoro | 639 (16) |
| Tabora | 396 (4) |
| Iringa | 235 (9) |
| Mara | 165 (7) |
| Lindi | 295 (21) |
| Ruvuma | 556 (11) |
| Pwani | 461 (15) |
| Singida | 234 (9) |
| Mwanza | 451 (16) |
| Mtwara | 279 (15) |
Table 2.
Demographic characteristics of the sequenced samples
| Characteristics, N = 83 | Frequency (%) |
|---|---|
| Age (years), Median (IQR) | 24.5 (0–83) |
| Sex | |
| Male | 37 (51.94) |
| Female | 40 (48.05) |
| Missing | 3 (0.03) |
| Age group | |
| 0–5 years | 33 (42.85) |
| 6–17 years | 5 (6.25) |
| 18–45 years | 21 (27.28) |
| > 45 years | 18 (23.37) |
| Missing | 3 |
| Regions | |
| Arusha | 6 (7.5) |
| Dar es Salaam | 3 (3.75) |
| Dodoma | 6 (7.5) |
| Kigoma | 5 (6.25) |
| Morogoro | 7 (8.75) |
| Tabora | 1 (1.25) |
| Iringa | 7 (8.75) |
| Mara | 2 (2.25) |
| Lindi | 20 (25) |
| Ruvuma | 4 (5) |
| Pwani | 4 (5) |
| Singida | 8 (10) |
| Mwanza | 1 (1.25) |
| Mtwara | 8 (8.75) |
Distribution of lineages and clades
Sixteen (16) Omicron Pango lineages were detected. Of the sequences, 48 (60%) were from clade 22F (XBB*) and one (1.25%) was from clade 23D (XBB.1.9.1); these two clades have been classified as variants under monitoring (VUM) by the WHO. Three (3.75%) sequences from Dodoma and Singida were identified as clade 23B (XBB.1.16), classified as a variant of concern (VOC). The remaining sequences were from clade 22A (BA.4), with 10 (12.5%) sequences, and 22E (BQ.1), with 22 (22.5%) sequences. Most of the sequences assigned to lineage XBB* were XBB.2 (67.2%), and more than half (55.88%) were detected in the Lindi region. Half (50%) of all XBB.2 patients were in the 0–5-year age group. The temporal distribution of the lineages is illustrated in Fig. 1, which shows clade 22F dominating in December 2022 while co-circulating with clades 22A and 22E; clade 23B appeared later, between June and July 2023.
Fig. 1.
The temporal distribution of SARS-CoV-2 clades from November 2022 to July 2023, as illustrated by a stacked area chart, reveals key shifts in circulating variants. A notable peak in sequenced genomes occurred in December 2022, largely attributed to clade 22F (teal). Following this surge, the total sequence counts decreased, with clade 23B (purple) becoming prominent in mid-2023.
Phylogenetic tree
The final phylogenetic tree was constructed from 800 sequences (80 Tanzanian sequences from this study and 720 sequences from other countries) filtered by proximity to the Tanzanian sequences (Fig. 2). Our sequences clustered as descendants of 21L (BA.2) sequences and were found in clades 22A (BA.4), 22E (BQ.1), 22F (XBB), and 23B (XBB.1.16) in the final build. In the tree, the majority of the sequences clustered in clade 22F; these were collected in epidemiological week 51 of 2022, and most came from the southern regions (Lindi, Mtwara and Ruvuma) (Fig. 2). In our import-export analysis through ancestral state reconstruction, we identified 161 state changes between regions (import-export events). The majority originated from Dar es Salaam, which accounted for 101 (62.7%) export events, followed by Arusha and Ruvuma, with 12 (7.5%) and 11 (6.8%) export events, respectively.
Fig. 2.
Time-resolved phylogenetic analysis of SARS-CoV-2 sequences from Tanzania (2022–2023) created in Nextstrain. The tree illustrates the evolutionary relationships between sequences collected in Tanzania (yellow nodes) and a background dataset of global contextual sequences (blue nodes). The horizontal axis represents the timeline of sample collection. The tree highlights the succession of major Omicron sub-lineages, including BA.1, BA.2, BA.5, and XBB variants, demonstrating the genomic diversity of the virus circulating in Tanzania
Discussion
Our study provides a comprehensive analysis of the genetic diversity of SARS-CoV-2 lineages circulating in Tanzania between November 2022 and July 2023. Our findings reveal regional disparities in lineage distribution, with a notable concentration of sequences originating from the Lindi region. The detection of sixteen Omicron lineages underscores the genetic complexity of Omicron variants in Tanzania. Notably, clades 22F (XBB*) and 22E (BQ.1) emerged as the predominant lineages, which is consistent with global trends during the sampling period [4].
Phylogenetic analysis elucidated the evolutionary relationships among Tanzanian sequences, revealing distinct clades containing only Tanzanian sequences (Fig. 2). The identification of clades exclusive to Tanzanian sequences implies potential local transmission chains within the country during that period. It also suggests that there was a localised outbreak between weeks 41 and 52 of 2022, with surges in the number of cases, most of which were clade 22F sequences. We observed that most export events originated from Dar es Salaam, indicating that it was the hub of local SARS-CoV-2 infections in Tanzania. This can be explained by the fact that Dar es Salaam is Tanzania’s largest commercial city; it also has an international airport that accommodates up to 2.8 million local and international passengers annually. Moreover, the rapid transmission across regions could be due to local travel, as the same period corresponds to the time of the year in which most citizens travel to different parts of the country for holidays. Hence, the separate clustering of Tanzanian sequences in the tree may reflect local transmission within the community, a pattern that has also been observed in studies from Italy [35] and the Dominican Republic [36].
We acknowledge several limitations. First, the sample size was limited; however, we included sufficient high-quality background sequences to ensure a robust phylogenetic analysis. Second, the interpretation of lineage dynamics may be influenced by sampling frequency, which is shaped by the influenza sentinel surveillance design; because this was clinic-based surveillance, factors such as health-seeking behaviour for young children affect who is sampled. Third, our study did not assess vaccination status, clinical outcomes or immune responses in patients infected with the Omicron variant.
Conclusion
In conclusion, this study offers crucial insights into the genetic diversity of SARS-CoV-2 in Tanzania between November 2022 and July 2023. The discovery of regional disparities in lineage distribution and the detection of sixteen Omicron lineages underscore the genetic diversity during this period. The identification of clades 22F (XBB*) and 22E (BQ.1) as dominant lineages, along with the observation of clades exclusively composed of Tanzanian sequences, suggests sustained local transmission. These findings emphasise the critical importance of strengthening integrated genomic surveillance systems and continued monitoring of SARS-CoV-2 evolution in Tanzania and across Africa.
Acknowledgements
The authors thank the Ministry of Health, Tanzania, for funding and enabling the National Public Health Laboratory to perform advanced molecular testing. We also appreciate the efforts of the influenza surveillance sentinel sites in ensuring specimen collection and transport to NPHL. Additionally, we acknowledge the staff of the National Influenza Centre for the identification and sorting of eligible samples for this study.
Authors’ contributions
LM and MK: Conceived and designed the analysis; LM and PV performed the analysis. LM, MF, AI: Wrote the first draft of the paper. HS and SI reviewed the first draft. OM, AK, JM, FN, IM, RB, RL, and AM participated in sample collection, performed laboratory testing, and compiled the epidemiological data. GM and NM reviewed and approved the final version of the manuscript. All the authors contributed to the writing, revision, and approval of the final manuscript.
Funding
Not applicable.
Data availability
Raw sequence reads were deposited in the Sequence Read Archive (SRA) under bioproject PRJNA1389878. All background genome sequences and associated metadata in this dataset are published in GISAID’s EpiCoV database. The contributors for each sequence, including accession number, virus name, collection date, originating laboratory, submitting laboratory, and the list of authors, are available at EPI_SET_251104yg; for the Tanzanian background sequences, see EPI_SET_251104rd.
Declarations
Ethics approval and consent to participate
Sample collection and use in this study adhered strictly to the Declaration of Helsinki. Permission to publish these data and to perform the analysis was obtained from the National Health Research Ethics Committee (NatHREC). The study protocol was reviewed and approved by the NatHREC (Certificate No. NIMR/HQ/R.8a/Vol.IX/4916), which also waived the requirement for informed consent.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
References
- 1. Singhal T. A Review of Coronavirus Disease-2019 (COVID-19). Indian J Pediatr. 2020;87:281–6. 10.1007/s12098-020-03263-6.
- 2. Cucinotta D, Vanelli M. WHO declares COVID-19 a pandemic. Acta Biomed. 2020;91:157–60. 10.23750/ABM.V91I1.9397.
- 3. COVID-19 cases | WHO COVID-19 dashboard. https://data.who.int/dashboards/covid19/cases?n=c. Accessed 12 Sep 2025.
- 4. COVID-19 data | WHO COVID-19 dashboard. https://data.who.int/dashboards/covid19/data?n=c. Accessed 7 Mar 2024.
- 5. Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–9. 10.1038/S41586-020-2008-3.
- 6. Ren SY, Gao RD, Wang WB, Zhou AM. Omicron variant (B.1.1.529) of SARS-CoV-2: Mutation, infectivity, transmission, and vaccine resistance. World J Clin Cases. 2022;10. 10.12998/wjcc.v10.i1.1.
- 7. Manjunath R, Gaonkar SL, Saleh EAM, Husain K. A comprehensive review on Covid-19 Omicron (B.1.1.529) variant. Saudi J Biol Sci. 2022;29. 10.1016/J.SJBS.2022.103372.
- 8. Arabi M, Al-Najjar Y, Mhaimeed N, Salameh MA, Paul P, AlAnni J, et al. Severity of the Omicron SARS-CoV-2 variant compared with the previous lineages: A systematic review. J Cell Mol Med. 2023;27:1443–64. 10.1111/JCMM.17747.
- 9. Adjei S, Hong K, Molinari NAM, Bull-Otterson L, Ajani UA, Gundlapalli AV, et al. Mortality risk among patients hospitalized primarily for COVID-19 during the Omicron and Delta variant pandemic periods — United States. 2022;71:1182. 10.15585/MMWR.MM7137A4.
- 10. Mghamba JM, Oriyo NM, Bita AAF, Shayo E, Kagaruki G, Katsande R, et al. Compliance to infection prevention and control interventions for slowing down COVID-19 in early phase of disease transmission in Dar Es Salaam, Tanzania. Pan Afr Med J. 2022;41:174. 10.11604/pamj.2022.41.174.31481.
- 11. Mnyambwa NP, Gao J, Magesa A, Mgimba E, Mahesh S, Angra P, et al. Genomic analysis of SARS-CoV-2 sequences obtained in Tanzania during the pandemic. J Clin Virol Plus. 2025;5:100212. 10.1016/J.JCVP.2025.100212.
- 12. Mziray SR, van Zwetselaar M, Kayuki CC, Mbelele PM, Makubi AN, Magesa AS, et al. Whole-genome sequencing of SARS-CoV-2 isolates from symptomatic and asymptomatic individuals in Tanzania. Front Med (Lausanne). 2022;9:1034682. 10.3389/fmed.2022.1034682.
- 13. Mmbaga VM, Mwasekaga MJ, Mmbuji P, Matonya M, Mwafulango A, Moshi S, et al. Results from the first 30 months of National Sentinel surveillance for influenza in Tanzania, 2008–2010. J Infect Dis. 2012;206 suppl 1:S80–6. 10.1093/INFDIS/JIS540.
- 14. Shedura VJ, Hussein AK, Nyanga SK, Kamori D, Mchau GJ. Evaluation of the influenza-like illness Sentinel surveillance system: A National perspective in Tanzania from January to December 2019. PLoS ONE. 2023;18:e0283043.
- 15. Implementing the integrated sentinel surveillance of influenza and other respiratory viruses of epidemic and pandemic potential by the Global Influenza Surveillance and Response System: standards and operational guidance. Geneva: World Health Organization; 2024.
- 16. CDC’s Influenza SARS-CoV-2 Multiplex Assay | Influenza (Flu) | CDC. https://www.cdc.gov/flu/php/laboratories/influenza-sars-cov-2-multiplex-assay.html. Accessed 2 Aug 2025.
- 17. Shu B, Kirby MK, Davis WG, Warnes C, Liddell J, Liu J, et al. Multiplex Real-Time Reverse Transcription PCR for Influenza A Virus, Influenza B Virus, and Severe Acute Respiratory Syndrome Coronavirus 2. Emerg Infect Dis. 2021;27(7):1821–30. 10.3201/eid2707.210462.
- 18. Lowry K, Bauer MJ, Buckley C, Wang C, Bordin A, Badman S, et al. Evaluation of Illumina® COVIDSeq™ as a tool for Omicron SARS-CoV-2 characterisation. J Virol Methods. 2023;322:114827. 10.1016/J.JVIROMET.2023.114827.
- 19. Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017;35(4). 10.1038/nbt.3820.
- 20. Patel H, Monzón S, Varona S, Espinosa-Carrasco J, Garcia MU, nf-core bot, et al. nf-core/viralrecon: nf-core/viralrecon v2.6.0 - Rhodium Raccoon. 10.5281/ZENODO.7764938.
- 21. Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, et al. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020;38:276–8. 10.1038/s41587-020-0439-x.
- 22. Chen S, Zhou Y, Chen Y, Gu J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90. 10.1093/BIOINFORMATICS/BTY560.
- 23. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9. 10.1038/nmeth.1923.
- 24. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10:1–4. 10.1093/GIGASCIENCE/GIAB008.
- 25. Grubaugh ND, Gangavarapu K, Quick J, Matteson NL, De Jesus JG, Main BJ, et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019;20:1–19. 10.1186/s13059-018-1618-7.
- 26. Aksamentov I, Roemer C, Hodcroft EB, Neher RA. Nextclade: clade assignment, mutation calling and quality control for viral genomes. J Open Source Softw. 2021;6:3773. 10.21105/JOSS.03773.
- 27. Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:4121–3. 10.1093/BIOINFORMATICS/BTY407.
- 28. Khare S, Gurry C, Freitas L, Schultz MB, Bach G, Diallo A, et al. GISAID’s role in pandemic response. China CDC Wkly. 2021;3. 10.46234/ccdcw2021.255.
- 29. Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Global Challenges. 2017;1. 10.1002/gch2.1018.
- 30. Katoh K, Misawa K, Kuma KI, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059. 10.1093/NAR/GKF436.
- 31. Nguyen LT, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–74. 10.1093/MOLBEV/MSU300.
- 32. Sagulenko P, Puller V, Neher RA. TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 2018;4. 10.1093/VE/VEX042.
- 33. Rambaut A, Lam TT, Carvalho LM, Pybus OG. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). Virus Evol. 2016;2. 10.1093/VE/VEW007.
- 34. Tegally H, San JE, Cotten M, Moir M, Tegomoh B, Mboowa G, et al. The evolving SARS-CoV-2 epidemic in Africa: insights from rapidly expanding genomic surveillance. Science. 2022;378:eabq5358. 10.1126/science.abq5358.
- 35. Bergna A, Lai A, Della Ventura C, Bruzzone B, Weisz A, d’Avenia M, et al. Genomic epidemiology of the main SARS-CoV-2 variants in Italy between summer 2020 and winter 2021. J Med Virol. 2023;95. 10.1002/jmv.29193.
- 36. Paulino-Ramírez R, López P, Mueses S, Cuevas P, Jabier M, Rivera-Amill V. Genomic surveillance of SARS-CoV-2 variants in the Dominican Republic and emergence of a local lineage. Int J Environ Res Public Health. 2023;20. 10.3390/ijerph20085503.