Abstract
More than any other infectious disease epidemic, the COVID-19 pandemic has been characterized by the generation of large volumes of viral genomic data at an incredible pace due to recent advances in high-throughput sequencing technologies, the rapid global spread of SARS-CoV-2, and its persistent threat to public health. However, distinguishing the most epidemiologically relevant information encoded in these vast amounts of data requires substantial effort across the research and public health communities. Studies of SARS-CoV-2 genomes have been critical in tracking the spread of variants and understanding the epidemic dynamics of the virus, and may prove crucial for controlling future epidemics and alleviating significant public health burdens. Together, genomic data and bioinformatics methods enable broad-scale investigations of the spread of SARS-CoV-2 at the local, national, and global scales and allow researchers to efficiently track the emergence of novel variants, reconstruct epidemic dynamics, and provide important insights into drug and vaccine development and disease control. Here, we discuss the tremendous opportunities that genomics offers to unlock the effective use of SARS-CoV-2 genomic data for efficient public health surveillance and guiding timely responses to COVID-19.
Introduction
COVID-19, a contagious disease caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has reached an extraordinary scale not seen since the influenza pandemic of 1918–19191. Within a month of its first reported case in China in December 2019, COVID-19 had spread to many regions in China2 and had been detected in several neighboring countries, including Thailand, Korea, and Japan. As international flights continued to operate, SARS-CoV-2 then spread to Europe and North America in a short amount of time, and was soon declared a global pandemic3,4. According to the World Health Organization (WHO), in the first 16 months of the pandemic (through April 8, 2021), more than 132.7 million people became infected worldwide, resulting in more than 2.8 million deaths5.
Over the past two decades, the biomedical community has become equipped with infrastructure for basic genomic techniques to support epidemic responses6. This capability has enabled the rapid collection of SARS-CoV-2 genomic information, which in turn has allowed real-time observation of SARS-CoV-2 genomic evolution and rapid tracking of SARS-CoV-2 genetic groups, lineages, variants, variants of interest (VOI), and variants of concern (VOC)7. The precise and rapid tracking of SARS-CoV-2 genetic changes facilitates the fast development of SARS-CoV-2 clinical tests and the prediction of vaccine efficacy. As sequencing technologies and genomic analysis tools progress, genome sequencing is becoming more widely integrated into clinical and healthcare workflows. However, the use of genomic sequencing to its full potential for public health surveillance and outbreak response has yet to be established and depends on the broad adoption of the best practices for preventing and limiting outbreaks that were determined during the COVID-19 response8. Herein, we discuss the genomic capacities that can be used to address many of the public health issues associated with COVID-19 (Box 1).
Box 1: The value of capacities of genomic analysis for COVID-19 pandemic responses.
1. Initial SARS-CoV-2 virus detection and characterization.
Genomic analysis conducted on respiratory specimens isolated from the first COVID-19 patients hospitalized in December, 2019, in Wuhan, China, allowed for the prompt detection and characterization of a novel coronavirus, later named SARS-CoV-2, by January 20203,4,9,10. Initial sequence analyses revealed that SARS-CoV-2 shared 80% nucleotide identity with SARS-CoV11,12, strongly indicating that SARS-CoV-2 was likely a respiratory pathogen that could spread from human to human and hence with clear epidemic potential. These initial analyses also revealed that SARS-CoV-2 shared high sequence similarity with related viruses found in bats and pangolins, suggesting a zoonotic origin10,11,13–19. Across its complete genome, SARS-CoV-2 is most closely related to the bat coronavirus RaTG13, with which it shares approximately 96% nucleotide sequence identity. However, different SARS-CoV-2 coding regions share greater similarity to those of other animal coronaviruses.
2. The role of genomics in the early COVID-19 outbreak response.
Early access to SARS-CoV-2 genome sequences allowed for the timely development and production of nucleic acid amplification testing (NAAT)-based diagnostics, expedited vaccine development, and accelerated opportunities for SARS-CoV-2 genomics-based, real-time surveillance27–32. By April 7, 2021, public repositories that host SARS-CoV-2 genomes contained over 1,000,000 genomes36–38 (Table S1). Notably, by the end of the sixth month of the pandemic (May 2020), the Global Initiative on Sharing All Influenza Data (GISAID) and the National Center for Biotechnology Information (NCBI) databases included 110,000 SARS-CoV-2 full-length genome sequences, as compared to the more than 8,000 HIV full-length genome sequences collected by the Los Alamos National Laboratory sequence database39 over the past 40 years40 (Figure 1a).
3. SARS-CoV-2 genomic evolution.
The unprecedented scale of SARS-CoV-2 genome sequencing offers unique opportunities for tracking SARS-CoV-2 evolution in real time and detecting the emergence and spread of new variants of interest (VOI) and variants of concern (VOC)42–44 (Figure 2). Coronaviruses accumulate approximately 1.12 × 10^-3 nucleotide substitutions per site per year on average. This is comparable to the SARS-CoV-1 rate of 0.8 × 10^-3 to 2.38 × 10^-3 and the Ebola virus rate of 1.3 × 10^-3, and is lower than the seasonal influenza rate of 6.7 × 10^-3 and the HIV rate of 4.4 × 10^-3 substitutions per site per year45–47,50–52. Nevertheless, 3 VOI and 5 VOC emerged worldwide during the first year and a half of SARS-CoV-2 evolution.
4. The use of genomics to investigate the pandemic spread of SARS-CoV-2.
Real-time tracking of changes in SARS-CoV-2 genomes allows viral transmission routes to be detected. Genomic analysis has identified the introduction of SARS-CoV-2 into Europe from China and into the US from both China and Europe, with subsequent local transmission91,95–97. We have collected around 40 outbreak investigations available in the literature that illustrate the utility of genomic analysis for tracking pathogen transmission routes.
5. Monitoring SARS-CoV-2 transmission through wastewater genomic studies.
SARS-CoV-2 is regularly shed in feces113, which makes it possible to survey COVID-19 epidemiology by testing wastewater samples. Such testing allows estimating the number of COVID-19 cases and the prevalence of lineages in a region, as well as detecting new viral variants. We have collected over 35 studies in 17 countries that used wastewater testing to detect the presence and concentration of viruses in wastewater114 and to estimate the relative number of disease cases in the area covered by the sewage facilities.
6. Genomics in clinical applications.
Viral genomics aids in detecting how viral evolution impacts clinical outcomes and treatment136. For example, SARS-CoV-2 lineages that have been linked to phenotypic changes that could influence outbreak dynamics and clinical outcomes were declared VOIs and VOCs. Genomics also helps to monitor antigenic drift, which can influence the efficacy of vaccines and tests. This provides an opportunity to proactively adapt vaccines and tests to ongoing SARS-CoV-2 genome changes.
7. Integrating clinical and viral genomics data.
Many SARS-CoV-2 genomic studies to date have been conducted in the absence of substantial clinical data collection. Conversely, numerous studies have critically evaluated extensive clinical data alone, without assessing corresponding genomic data156. This substantial limitation of current practices results from distinctions between the fields of bioinformatics (genomic data analyses) and medical informatics (clinical data analyses). The COVID-19 pandemic promises to unite researchers from both of these fields to integrate these seemingly disparate data sources, especially in prospective studies158.
Initial SARS-CoV-2 detection and characterization
Genomic analysis conducted on respiratory specimens isolated from the first COVID-19 patients hospitalized in December, 2019, in Wuhan, China, allowed for the prompt detection and characterization of a novel coronavirus, later named SARS-CoV-2, by January 20203,4,9,10. Initial sequence analyses revealed that SARS-CoV-2 shared 80% nucleotide identity with SARS-CoV11,12, strongly indicating that SARS-CoV-2 was likely a respiratory pathogen that could spread from human to human and hence with clear epidemic potential. These initial analyses also revealed that SARS-CoV-2 shared high sequence similarity with related viruses found in bats and pangolins, suggesting a zoonotic origin10,11,13–19. Across its complete genome, SARS-CoV-2 is most closely related to the bat coronavirus RaTG13, with which it shares approximately 96% nucleotide sequence identity. However, different SARS-CoV-2 coding regions share greater similarity to those of other animal coronaviruses. For example, the spike (S) protein receptor-binding domain (RBD) exhibits higher sequence identity (97.4%) to that of the Guangdong pangolin virus, rather than to RaTG13 (89.3%), while SARS-CoV-2 long 1ab (replicase) open reading frame (ORF) exhibits the highest sequence identity (98.8%) with the RmYN02 bat coronavirus20. In further support of zoonotic origin, another coronavirus detected in five bats (RacCS203) is genetically closely related to SARS-CoV-221, and neutralizing antibodies for SARS-CoV-2 were found in wild pangolins and bats from Thailand21. Moreover, there is similarity between SARS-CoV-2 zoonosis and the zoonoses of the SARS-CoV and MERS-CoV coronaviruses, as data indicate that in all three cases, other intermediate animals were likely present in their transmission chains22,23. Together, these findings suggest a complex history of recombination events prior to the zoonotic transfer of SARS-CoV-2 to humans, although when and in which hosts these events took place remains unclear13,14,19,24. 
Genomic knowledge acquired through viral sequencing and phylogenetic analysis greatly contributed to the rapid determination of the potential epidemiological characteristics and origins of SARS-CoV-225.
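The nucleotide identity percentages quoted above (for example, approximately 96% between SARS-CoV-2 and RaTG13) are computed over aligned sequences. The following is a minimal Python sketch of such a calculation, assuming the two sequences have already been aligned to equal length by an external tool. The function name and the toy fragments are hypothetical illustrations and are not taken from the studies cited here; published figures may also treat gaps and ambiguous bases differently.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so identity is reported over comparable positions only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # not comparable at this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real viral sequence).
query = "ATGGTTCA-GTTAC"
reference = "ATGCTTCAAGTTNC"
print(round(percent_identity(query, reference), 1))
```

Applying the same function to individual coding regions rather than the whole genome is what allows region-level comparisons such as the RBD and ORF1ab figures given above.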
The role of genomics in the early COVID-19 outbreak response
As seen with other recent viral epidemics, viral genome sequencing has become an essential part of the COVID-19 public health response26. Early access to SARS-CoV-2 genome sequences allowed for the timely development and production of nucleic acid amplification testing (NAAT)-based diagnostics, expedited vaccine development, and accelerated opportunities for SARS-CoV-2 genomics-based real-time surveillance27–32. The first SARS-CoV-2 tests and vaccine candidates appeared within one and three months, respectively, after the identification of the first COVID-19 patient32–34. Together with access to modern sequencing technologies, the scale of the pandemic, based on numbers of cases and affected regions, has prompted the collection of SARS-CoV-2 viral genomic data at an unparalleled magnitude (on average 2,500 genomes per day). Consequently, the capacity to track virus spread and evolution in real time has been accelerated relative to that associated with prior outbreaks35. When the WHO initially declared a Public Health Emergency of International Concern (PHEIC) on January 30, 2020, 339 SARS-CoV-2 genomes had already been collected and characterized2–4,28. By April 7, 2021, public repositories that host SARS-CoV-2 genomes contained over 1,000,000 genomes36–38 (Table S1). Notably, by the end of the sixth month of the pandemic (May 2020), the Global Initiative on Sharing All Influenza Data (GISAID) and the National Center for Biotechnology Information (NCBI) databases included 110,000 SARS-CoV-2 full-length genome sequences, as compared to the more than 8,000 HIV full-length genome sequences collected by the Los Alamos National Laboratory sequence database39 over the past 40 years40 (Figure 1a). Of the SARS-CoV-2 raw sequencing data available at NCBI, 86% is Illumina data, 13.7% is Oxford Nanopore, and 0.3% is PacBio, Ion Torrent, and BGISEQ (Figure S1).
There is a correlation between the number of submitted sequences per capita and the GDP per capita for the majority of countries in the world; moreover, high-income countries submitted about 100x more sequences per capita on average than did low-income countries (Figure 1b, S2). However, it is remarkable that African nations with a low GDP per capita sequenced viral genomes on a level comparable to that of middle- and high-income countries41. Indeed, due to several previous programs that were aimed at controlling outbreaks of other viruses in Africa, the sequencing capacity of the African healthcare system improved, helping to increase its efficiency in the sequencing of SARS-CoV-2 genomes41 (Figure 1c). The countries with the highest ratios of SARS-CoV-2 genomes sequenced to COVID-19 cases and a relatively low number of reported cases per capita were Taiwan, New Zealand, Australia, Iceland, and Denmark (Figure 1d).
Figure 1. Available SARS-CoV-2 genomic sequencing data and its usage for outbreak investigation.
(a) The number of global SARS-CoV-2 genomes sequenced according to Global Initiative On Sharing All Influenza Data (GISAID) between January 2020 and March 2020. (b) The number of available SARS-CoV-2 sequences in GISAID per 1 million (1M) individuals for each country vs. the number of cases per capita up to March 2021. (c) The number of available SARS-CoV-2 sequences in GISAID per 1 million (1M) individuals for each country in Africa vs. the number of sequencers per capita up to March 2021. The blue line is a correlation line of all data points on the plot. (d) The number of available SARS-CoV-2 sequences in GISAID per number of reported COVID-19 cases for each country vs. the number of reported COVID-19 cases per capita up to March 2021. (e) Global outbreak investigations by phylogenetic analysis (red) and wastewater studies (yellow); dots were placed in the geographical centers of each county or region.
SARS-CoV-2 genomic evolution
The unprecedented scale of SARS-CoV-2 genome sequencing offers unique opportunities for tracking SARS-CoV-2 evolution in real time and detecting the emergence and spread of new VOI and VOC42–44 (Figure 2). SARS-CoV-2 genome sequencing and subsequent bioinformatics analysis have shown that, because of an intrinsic RNA proofreading mechanism, coronaviruses exhibit lower mutation rates than do many other RNA viruses, such as Ebola virus and HIV45–48. In addition, their evolutionary (i.e., nucleotide substitution) rate partly reflects the action of host-dependent RNA-editing enzymes (e.g., APOBEC)49. Coronaviruses accumulate approximately 1.12 × 10^-3 nucleotide substitutions per site per year on average. This is comparable to the SARS-CoV-1 rate of 0.8 × 10^-3 to 2.38 × 10^-3 and the Ebola virus rate of 1.3 × 10^-3, and is lower than the seasonal influenza rate of 6.7 × 10^-3 and the HIV rate of 4.4 × 10^-3 substitutions per site per year45–47,50–52.
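To give the per-site substitution rate a more intuitive scale, it can be converted into an expected number of substitutions per genome. The short Python sketch below does this under two assumptions that are not stated in the text: a genome length of 29,903 nucleotides (the length of the Wuhan-Hu-1 reference genome) and a constant rate over time. The function name is hypothetical.

```python
# Assumption: genome length of the Wuhan-Hu-1 reference (not given in the text).
GENOME_LENGTH_NT = 29_903


def substitutions_per_genome(rate_per_site_per_year: float,
                             years: float = 1.0,
                             genome_length: int = GENOME_LENGTH_NT) -> float:
    """Expected substitutions accumulated along one lineage.

    Assumes a constant substitution rate and ignores selection,
    recombination, and rate variation across sites.
    """
    if rate_per_site_per_year < 0 or years < 0 or genome_length <= 0:
        raise ValueError("rate and years must be non-negative, length positive")
    return rate_per_site_per_year * genome_length * years


per_year = substitutions_per_genome(1.12e-3)
per_month = substitutions_per_genome(1.12e-3, years=1 / 12)
print(f"about {per_year:.1f} substitutions per genome per year")
print(f"about {per_month:.1f} substitutions per genome per month")
```

Under these assumptions the rate corresponds to roughly two to three substitutions per genome per month, which is the order of magnitude that makes genomic tracing of transmission chains feasible.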
Figure 2. Variant of Concern (VOC) and Variant of Interest (VOI) circulating throughout the globe.
(a) The locations where VOCs and VOIs were initially detected. (b) The timeline showing when VOCs and VOIs initially appeared in the sequencing data (not the time when they were declared as VOCs and VOIs).
Another important aspect of SARS-CoV-2 evolution is that SARS-CoV-2, like many other RNA viruses, can exist within an individual host as a swarm of closely related variants and has a tendency to recombine53. Genomic studies have demonstrated the presence of such intra-host diversity54–60, with one study having identified between 1 and 52 haplotype variants in each of 25 clinical patients54. Identifying the factors that shape these intra-host viral population structures can promote a better understanding of short-term viral evolution, in addition to providing insights into host adaptation and drug and vaccine design. For example, evidence of intra-host recombination61 may enable estimating the role of recombination in the zoonotic origin of SARS-CoV-214 and the emergence of novel viral variants62–64.
Over the first year of the epidemic, SARS-CoV-2 gradually accumulated mutations and developed into several viral lineages as it spread through the human population7,65–68. However, from the advent of the pandemic through approximately September 2020, there was no statistical evidence that any of the numerous characterized SARS-CoV-2 mutations had resulted in a loss or gain of function45–47. For example, one study analyzed all 48,454 SARS-CoV-2 genomes from around the world that were available from GISAID in late July 2020 and identified 12,706 mutations, 398 of which were recurrent, and none of which were associated with a significant change in transmissibility69. During the summer of 2020, the D614G mutation in the viral S protein attracted attention because the variant carrying it superseded the original SARS-CoV-2 strain globally. Phylogenetic analyses and clinical evidence indicated that, although the D614G variant was associated with both increased viral load and infectivity68,70, it was also more susceptible to neutralizing antisera and was not linked to any change in vaccine efficacy or increased pathogenicity71.
The first SARS-CoV-2 viral variant of concern (VOC) for public health, known as variant B.1.1.7, was first detected in the UK in September 2020. Genomic analysis revealed that this B.1.1.7 variant had first arisen in late summer or early fall 2020, and then quickly spread through many countries, including Australia, Denmark, Italy, Iceland, the Netherlands, and now the US72–74. However, the full pathogenic potential of this variant was not recognized until December 202072. The B.1.1.7 variant strain harbors at least 12 mutations, including 2 in the S protein: N501Y, which increases the ability of SARS-CoV-2 to bind to its cellular receptor, ACE2, and P681H, which adjoins the furin cleavage site in the S protein75–77. Both mutations have been associated with a 40–80% increase in the transmissibility of this variant as compared to previous SARS-CoV-2 strains72. More recently, the B.1.1.7 variant was found to be associated with greater disease severity and an increased risk of death as compared to other variants78. In addition, the variant carries a Δ69–70 deletion that results in detection failure by some SARS-CoV-2 molecular tests, which can limit the successful tracing of this VOC79. However, there is no evidence thus far that this variant reduces vaccine efficacy.
The second VOC was discovered in the UK in September 2020, and was characterized by several mutations, including E484K in the RBD of the S protein. This mutation, which was later discovered to have arisen independently in other viral variants around the world80, is associated with reduced neutralizing activity of human convalescent and post-vaccination sera. Additional VOCs related to B.1.1.7 include B.1.351, which was first detected in South Africa in November 2020,81 where it spread rapidly. Although the latest reports indicate that this variant has also spread to Zambia and the US, there is no evidence that this variant impacts disease severity82. This variant also harbors multiple mutations in the S protein, such as K417N, E484K, and N501Y.
The third VOC, P.1, was detected in four travelers who arrived in Japan from Brazil in January 202183–85. P.1 carries mutations in the RBD similar to those of B.1.351 (K417T, E484K, N501Y), the latter of which can increase transmissibility and help the virus evade neutralizing antibodies. The impact of the K417T mutation is not known. More recently, another genetic variant, B.1.427/B.1.429, was declared a VOC because of its prevalence in an outbreak in California. This variant harbors the L452R mutation in the S protein, which is suspected to confer SARS-CoV-2 antibody resistance, although its effect is less severe than that of the E484K mutation, which is associated with greatly reduced viral susceptibility to antibody neutralization86. The full, current list of all VOI and VOC can be found on the official CDC page87.
Some of these variants were first independently identified in immunodeficient individuals in different countries, suggesting that their emergence may be the result of convergent evolution followed by rapid spread. For example, the appearance of the ΔH69/ΔV70 deletion was documented in an immunosuppressed individual through deep viral genome sequencing at 23 time points during the course of infection (101 days)64. A weakened host immune response can permit the virus to replicate with little or no control, increasing the likelihood of mutations occurring. The independent evolution of a given mutation in different geographic locations suggests that this mutation may confer an adaptive advantage to the virus, such as immune evasion or increased transmissibility, which is corroborated by clinical studies. Given the likely public health importance of these VOCs and VOIs, global surveillance for these and other new variants is expanding, as information for all SARS-CoV-2 lineages is now collected and made available online for the rapid evaluation of their epidemiologic and vaccine impact and short-term evolution based on individual data points7,88. In order to gain better control over emerging VOC and VOI, a European Commission Recommendation dated 19 January 2021 stated that “all EU Member States should reach a capacity of sequencing at least 5% - and preferably 10% - of positive test results. In most Member States, the sequencing capacity for identification of SARS-CoV-2 variants is below the recommendation set by the European Commission to sequence 5–10% of SARS-CoV-2 positive specimens” (Figure 1d).
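The 5–10% sequencing target quoted above can be checked for any region with a simple ratio of sequenced genomes to positive tests. The Python sketch below illustrates the arithmetic; the class and function names are hypothetical, and the counts in the usage example are invented placeholders rather than data for any real country.

```python
from dataclasses import dataclass


@dataclass
class CoverageReport:
    fraction: float        # sequenced genomes / positive tests
    meets_minimum: bool    # at least 5% of positives sequenced
    meets_preferred: bool  # at least 10% of positives sequenced


def sequencing_coverage(sequenced: int, positive_tests: int,
                        minimum: float = 0.05,
                        preferred: float = 0.10) -> CoverageReport:
    """Compare a region's sequencing fraction with the 5% and 10% targets."""
    if positive_tests <= 0:
        raise ValueError("positive_tests must be positive")
    if sequenced < 0:
        raise ValueError("sequenced must be non-negative")
    fraction = sequenced / positive_tests
    return CoverageReport(fraction, fraction >= minimum, fraction >= preferred)


# Hypothetical counts for illustration only.
report = sequencing_coverage(sequenced=1_200, positive_tests=40_000)
print(f"{report.fraction:.1%} sequenced; "
      f"minimum met: {report.meets_minimum}; preferred met: {report.meets_preferred}")
```

The same ratio underlies the per-country comparison of sequences to reported cases; in practice the denominator depends on how completely positive tests are reported.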
The use of genomics to investigate the pandemic spread of SARS-CoV-2
Access to rich and diverse publicly available SARS-CoV-2 genomic data across various regions has allowed scientists and public health officials to efficiently track routes along which COVID-19 outbreaks have spread locally and internationally (Figure S3). In this context, phylogenetic and genetic network analyses can provide important public health information regarding viral epidemic spread89,90. Importantly, as viruses accumulate genomic mutations within different populations, knowledge regarding such evolution can reveal transmission chains and distinguish imported cases from instances of local transmission if a sufficient number of samples is analyzed, ultimately identifying high-risk transmission routes that should be subject to enhanced public health control91–94. Genomic analysis has identified the introduction of SARS-CoV-2 into Europe from China and into the US from both China and Europe, with subsequent local transmission91,95–97. One recent study suggests that SARS-CoV-2 was introduced into the US state of Connecticut via a domestic transmission route, while another showed that most successful viral introductions to Arizona were likely from domestic travel91,98. Another study revealed that the New York City area exhibited multiple introductions of SARS-CoV-2, primarily from Europe99. Similarly, SARS-CoV-2 was potentially introduced into France from several countries, including China, Italy, the United Arab Emirates, Egypt, and Madagascar92. We have curated a comprehensive list of genomic outbreak investigations to date for various geographical regions (Table S2). This catalog contains 40 studies and is updated in real time as more studies are published; an online version is available at https://github.com/Mangul-Lab-USC/COVID-19-outbreak-investigations.
Viral genomics can also be used to monitor the effectiveness of global travel restrictions and lockdowns in different countries in limiting viral spread. For example, genomic analysis showed that the risk of domestic transmission of SARS-CoV-2 in Connecticut exceeded that of international introduction at the time federal travel restrictions were imposed, highlighting the critical need for local surveillance91. Similarly, in Brazil, three clades of European origin were established prior to the initiation of travel bans and lockdowns100. Another genomic analysis showed that, due to violations of imposed lockdowns with sea trade, several SARS-CoV-2 international introductions likely occurred in Morocco101. In Australia, lockdown effectiveness was validated using agent-based modeling coupled with SARS-CoV-2 genomic data102. On December 19, 2020, in response to the new, rapidly spreading B.1.1.7 variant found in the UK, the prime minister implemented a tighter lockdown and other restrictions, and as a result, many countries closed their borders to people traveling from the UK103. The spread of this variant was then precisely tracked in the US using available sequencing data104.
Combining genomic methods with clinical and geospatial data can help characterize viral infectivity, virulence, and death rates of circulating viral strains more accurately because epidemics in different areas of the world may have distinct characteristics that depend on viral genotype, as well as the demographics of the host population. Specifically, integration and analysis of phylogenetic and epidemiologic data can provide a more complete understanding of the pandemic transmission dynamics105. Available genomic data can also be utilized to examine and partly explain the relationship between genetic variation in strains of SARS-CoV-2 and disease severity106. Findings from these studies can also help characterize mutation patterns in various hotspots and identify correlates of infection and death rates in these countries107. Novel approaches can be developed to combine population genomics and genetics to leverage the identification of molecular markers with unusual pattern variations or relevant single nucleotide polymorphisms in people from different geographies67,68. If SARS-CoV-2 infections continue at their current rate, population genomic research and pharmacogenomics approaches may be useful in the development of personalized therapeutics against this pathogen. Although disease severity can be partly attributed to host genomics, understanding these factors has been difficult due to contradictory evidence and the limited number of host genomics studies conducted to date108.
Monitoring SARS-CoV-2 transmission through wastewater genomic studies
Another genomics-based method for population-level pathogen surveillance assesses the presence of trace viral genomic material in wastewater, with this approach having been successfully employed to track antibiotic use109 and tobacco consumption110 and for the monitoring of enteric viruses such as poliovirus111. Notably, a 2013 study accomplished the early detection of a viral outbreak in Sweden by quantifying hepatitis A virus and norovirus genetic material levels in wastewater112. Although COVID-19 is primarily associated with respiratory symptoms, SARS-CoV-2 is regularly shed in feces113. As of August 2020, SARS-CoV-2 RNA had been detected in wastewater by over 35 studies in 17 countries using NAAT-based methods (https://www.covid19wbec.org), which can effectively detect the presence and concentration of viruses in wastewater114 and potentially estimate the relative number of disease cases in the area covered by the sewage facilities. However, current NAAT-based methods cannot detect whether or not these samples harbor novel mutations115, and the development of novel mutations in the template primer binding sites has the potential to compromise the ability of NAAT-based methods to detect the presence of the virus116. Additionally, wastewater may contain fragmented or defective genomes, which may not be detected with these methods. Alternatively, a potentially promising approach is the application of metagenomics on a global scale to detect, collect, and store samples in preparation for future pandemics117,118. Metagenomics methods, which can sequence all available genomic material in a sample, allow the characterization of an entire viral population and the detection of prevalent SARS-CoV-2 variants in a given geographical space119,120. Wastewater surveillance studies of SARS-CoV-2 RNA concentration across various regions in the world took place between January 2020 and November 2020 (Figure 1e, Table S3).
Temporal changes in SARS-CoV-2 RNA concentration in wastewater were assessed in Valencia, Spain, from February to April 2020; in Paris, France, from March to April 2020; and in many other regions (see Table S3). These changes have been consistent with the number of clinically diagnosed cases in a given community121,122. This relationship demonstrates the use of wastewater studies as a relatively inexpensive and straightforward method for investigating national outbreak dynamics, especially in areas where case diagnosis is complicated. In contrast, clinical diagnostic testing traditionally used to assess the number of cases in a community typically underestimates actual infection rates123, as this approach primarily focuses on symptomatic individuals and asymptomatic cases are less likely to be captured. However, combining clinical diagnostics with wastewater-based surveillance can potentially provide a more comprehensive community-level profile of both symptomatic and asymptomatic cases, enabling identification of hospital capacity needs114,124–130. Additionally, an important advantage of wastewater monitoring is the ability to detect early-stage outbreaks before they become widespread111,115,131,132. In contrast to NAAT-based methods such as real-time reverse transcription polymerase chain reaction (RT-PCR)-based analysis of SARS-CoV-2, metagenomic sequencing allows for characterization of the prevalent SARS-CoV-2 genomic variants in a defined local region and reveals the geospatial SARS-CoV-2 genotype distribution120,133. Wastewater samples can be used to identify circulating lineages in the community and to complement genomic epidemiology analyses; for example, such analysis has already helped to detect B.1.1.7 strains in the US and Switzerland134.
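One simple way to examine the agreement between wastewater RNA concentration and diagnosed cases, and whether the wastewater signal leads the clinical signal, is a lagged Pearson correlation. The Python sketch below is an illustration only: the function names are hypothetical, the two time series are invented toy values, and the cited studies may have used different statistical methods.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(wastewater: Sequence[float],
                       cases: Sequence[float], lag: int) -> float:
    """Correlate the wastewater signal at time t with cases at time t + lag."""
    if len(wastewater) != len(cases):
        raise ValueError("series must have equal length")
    if lag < 0 or lag > len(cases) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    if lag == 0:
        return pearson(wastewater, cases)
    return pearson(wastewater[:-lag], cases[lag:])


# Invented toy series: reported cases trail the wastewater signal by one step.
rna_copies = [1, 2, 4, 8, 16, 12, 6]
reported_cases = [0, 1, 2, 4, 8, 16, 12]
for lag in range(3):
    r = lagged_correlation(rna_copies, reported_cases, lag)
    print(f"lag {lag}: r = {r:.2f}")
```

In this toy example the correlation peaks at a lag of one time step, which is the pattern one would expect if wastewater measurements anticipate clinical case counts. Real series would also need normalization for flow rate and population served.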
Despite the numerous advantages of wastewater-based virus surveillance, many potential improvements would result in more reliable and extended applications in public health decision-making. Currently, wastewater-based methods require calibration and validation because they only provide a raw measure of the number of cases in a population115. Additionally, wastewater-based monitoring lacks the granularity of clinical diagnostic testing and cannot discern a particular area of an outbreak when the wastewater treatment plant serves a large population. Sampling at a higher spatial resolution within the sewer system or even at a building-level scale could potentially provide early indications of viral outbreaks and help monitor their progression135. This effort could also include areas with large numbers of septic tank systems that are not feeding municipal wastewater systems.
Genomics in clinical applications
Viral genomics can also aid in vaccine development and investigations of how viral evolution impacts clinical outcomes and treatment136. While the majority of known SARS-CoV-2 mutations have no effect on viral replication and transmission137, some substitutions have been linked to phenotypic changes that could influence outbreak dynamics. For example, patients in Singapore infected with Δ382 SARS-CoV-2 variants, which have a 382-nucleotide deletion in ORF8, exhibited milder symptoms compared to patients with viruses that lacked this deletion138. However, the Δ382 SARS-CoV-2 variant is very rare globally and appears to have died out in Singapore. More alarmingly, the highly prevalent D614G mutation may increase transmissibility and infectivity in natural populations, giving variants harboring this mutation a marked selective advantage, although the best evidence to date comes from laboratory and simulation studies only68,95,139. Lastly, the viral variants B.1.1.7, B.1.351, and P.1, which were detected in late 2020, showed significantly increased transmissibility, heightening concern from a public health perspective82,140,141. For vaccine development, understanding the degree to which different regions of the viral genome are prone to mutation is important, as it is necessary to understand whether rising immunity in humans will result in antigenic drift and consequent vaccine escape. These evolutionary effects are commonly seen, for example, in human influenza viruses and endemic coronaviruses. Analyses of the current genomic variability of SARS-CoV-2 suggest that prospective COVID-19 vaccines should be cross-protective for the majority of currently known viral variants73,108,142, although some minor variants (< 1% natural occurrence frequency) have been shown to alter the antigenicity of SARS-CoV-2143. For vaccine development, determining the structures of SARS-CoV-2 antigens and their mutants is also crucial for the maximization of vaccine efficacy144. 
The online COVID-3D resource allows for the exploration of the structural distribution of genetic variation in SARS-CoV-2145.
Accumulating mutations may also reduce the effectiveness of nucleic acid amplification tests (NAATs): when mutations arise in primer-binding regions, the primers lose affinity for their targets and test sensitivity drops. Emerging mutations should therefore be monitored and taken into account when updating NAAT assays as well.
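The primer-region concern can be illustrated with a minimal sketch that counts mismatches between a primer and its binding site in a reference and in a variant genome. The sequences, the primer, and the `max_mismatches` threshold are hypothetical toy values, not real SARS-CoV-2 data or a validated assay criterion, and the sketch assumes the primer is written on the same strand as the genome (no reverse complement).

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions where the primer differs from the aligned binding site."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def flag_primer(primer: str, genome: str, start: int, max_mismatches: int = 1) -> bool:
    """Return True if the site at `start` has more mismatches than allowed."""
    site = genome[start:start + len(primer)]
    return count_mismatches(primer, site) > max_mismatches


# Toy sequences (hypothetical; chosen only to illustrate the check).
reference = "ACGTTGCAAGGCTTACCGATAGGCTA"
variant = "ACGTTGCAAGACTTATCGATAGGCTA"  # two substitutions inside the site
primer = "AAGGCTTACCGA"                 # identical to reference[7:19]

if __name__ == "__main__":
    for name, genome in (("reference", reference), ("variant", variant)):
        site = genome[7:19]
        print(name, count_mismatches(primer, site), flag_primer(primer, genome, 7))
```

In practice the binding-site coordinates would come from an alignment of each primer to the reference genome, and the tolerated number of mismatches depends on the assay.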
The first lab-confirmed case of COVID-19 re-infection was detected in Hong Kong using a genomic analysis approach146, after which additional re-infection cases were detected in Belgium, Ecuador, and the US147–149. For all of these patients, phylogenetic analyses of longitudinal SARS-CoV-2 genomic sequences distinguished re-infection from persistent viral shedding after the initial infection. The findings in all four cases suggest that SARS-CoV-2 may persist in the global human population despite herd immunity due to natural infection, which can complicate vaccine development and efficacy146. However, current data suggest that SARS-CoV-2 re-infection is rare, and it has been proposed that immunity against re-infection can last for at least several months after the primary infection150.
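As a rough illustration of why sequence data can separate re-infection from persistent shedding, the sketch below compares the number of nucleotide differences between two aligned genomes from the same patient with the number expected from within-lineage evolution over the elapsed time. This is a simple pairwise-distance heuristic, not the phylogenetic analysis used in the cited studies. The substitution rate of 1e-3 per site per year is an assumed, illustrative parameter; the default genome length of 29,903 nt is that of the SARS-CoV-2 reference genome.

```python
AMBIGUOUS = {"N", "-"}  # positions with missing data or gaps are skipped


def pairwise_differences(seq_a: str, seq_b: str) -> int:
    """Count differing sites between two aligned sequences, ignoring N and gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in AMBIGUOUS and b not in AMBIGUOUS
    )


def expected_substitutions(days: float,
                           genome_length: int = 29903,
                           rate_per_site_per_year: float = 1e-3) -> float:
    """Expected substitutions along one lineage over `days` (assumed rate)."""
    return rate_per_site_per_year * genome_length * days / 365.0


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical), sampled 120 days apart.
    first = "ACGTN-A"
    second = "ACCTAAT"
    print(pairwise_differences(first, second), expected_substitutions(120))
```

An observed distance far above the expectation is compatible with infection by a distinct lineage, but a tree-based analysis that places both genomes among contemporaneous sequences is needed to support that conclusion.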
Clinical manifestations of SARS-CoV-2 infection vary greatly, ranging from a lack of symptoms to irreversible pulmonary damage151–153. Adaptive immune responses, such as early CD8+ and CD4+ T cell responses, have been associated with positive patient outcomes154. Next-generation sequencing of T and B cell receptor repertoires from COVID-19 patients has also revealed differences in immune response characteristics between patients with a mild or severe disease course155. Schultheiß et al. detected more than 14 million T and B cell receptors in blood samples of infected patients collected at 70 time points, compiling a valuable resource that can inform new therapeutic approaches and vaccine development. For example, their study revealed that knowledge of host immunopathology obtained through sequencing can permit the early detection of clinical biomarkers and aid in the identification of patients at risk for severe disease155.
Integrating clinical and genomics data
Many SARS-CoV-2 genomic studies to date have been conducted in the absence of substantial clinical data collection and/or integration with viral sequence data. Conversely, numerous studies have critically evaluated extensive clinical data alone, without assessing corresponding genomic data156. Even investigations yielding large genomic (e.g., GISAID28,36) and clinical datasets157 have not performed integrated analyses of both data types. This substantial limitation of current practices results from distinctions between the fields of bioinformatics (genomic data analyses) and medical informatics (clinical data analyses). The COVID-19 pandemic promises to unite researchers from both of these fields to integrate these seemingly disparate data sources, especially in prospective studies158. Finding significant associations between genomic and clinical features of the virus will ultimately support more targeted interventions by public health officials.
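A minimal sketch of such an integrated analysis is given below: genomic and clinical records are joined on a shared sample identifier, and the association between the presence of a mutation and a severe outcome is assessed with a two-sided Fisher's exact test. The sample identifiers, the mutation flags, and the outcome labels are hypothetical toy data, and Fisher's exact test is only one of many possible choices for this kind of association.

```python
from math import comb


def contingency(genomic: dict, clinical: dict) -> tuple:
    """Build a 2x2 table (a, b, c, d) from samples present in both datasets.

    a: mutation & severe, b: mutation & not severe,
    c: no mutation & severe, d: no mutation & not severe.
    """
    a = b = c = d = 0
    for sample_id in genomic.keys() & clinical.keys():
        has_mutation = genomic[sample_id]
        severe = clinical[sample_id] == "severe"
        if has_mutation and severe:
            a += 1
        elif has_mutation:
            b += 1
        elif severe:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical toy data keyed by sample ID; "s9" lacks clinical data and is dropped.
genomic = {"s1": True, "s2": True, "s3": True, "s4": True,
           "s5": False, "s6": False, "s7": False, "s8": False, "s9": True}
clinical = {"s1": "severe", "s2": "severe", "s3": "severe", "s4": "mild",
            "s5": "severe", "s6": "mild", "s7": "mild", "s8": "mild"}

if __name__ == "__main__":
    table = contingency(genomic, clinical)
    print(table, fisher_exact_two_sided(*table))
```

Real analyses would additionally need to adjust for confounders such as age, comorbidities, and sampling time, and to correct for testing many mutations at once.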
Discussion
The unprecedented density and volume of available SARS-CoV-2 genomic and clinical data enabled faster and more effective characterization of both SARS-CoV-2 genomes and COVID-19 epidemiology than was possible in previous outbreaks. The numerous successful efforts around the globe to use genomic data in addressing the COVID-19 outbreak created a solid foundation for standardizing the use of SARS-CoV-2 genomic data. High-income countries sequenced more SARS-CoV-2 genomes per capita than low-, lower-middle-, and middle-income countries. However, low- and lower-middle-income countries in Africa demonstrated remarkably better preparedness to collect SARS-CoV-2 genomes than countries in the same income groups on other continents (Figure 1b). This preparedness can be attributed to earlier global initiatives that supported African countries in mitigating outbreaks of other viruses and, in doing so, expanded the region's sequencing capacity. Africa thus provides a remarkable example of the value of international cooperation, which should be implemented in other parts of the globe to improve control of epidemics worldwide.
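The per-capita comparison described above amounts to normalizing genome counts by population and aggregating by income group. The sketch below shows that arithmetic under stated assumptions: the group labels, genome counts, and population sizes are hypothetical placeholders, not the values behind Figure 1b.

```python
from collections import defaultdict


def genomes_per_100k(n_genomes: int, population: int) -> float:
    """Sequenced genomes per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100_000 * n_genomes / population


def rate_by_group(records):
    """Aggregate (group, n_genomes, population) records into per-group rates."""
    totals = defaultdict(lambda: [0, 0])
    for group, n_genomes, population in records:
        totals[group][0] += n_genomes
        totals[group][1] += population
    return {group: genomes_per_100k(n, pop) for group, (n, pop) in totals.items()}


# Hypothetical records: (income group, genomes sequenced, population).
records = [
    ("high", 5000, 10_000_000),
    ("high", 1000, 2_000_000),
    ("low", 200, 20_000_000),
]

if __name__ == "__main__":
    print(rate_by_group(records))
```

Pooling counts and populations within a group before dividing weights each country by its population; averaging per-country rates instead would weight every country equally, and the two choices can rank groups differently.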
At the same time, the unprecedented volume of SARS-CoV-2 genome sequencing, which reached one million viral genome sequences, challenged current practices of viral data storage, processing, and bioinformatics analysis159–161. While the importance of genome-based viral surveillance systems was widely recognized and their principles had been conceptualized, technological barriers to building them remained, and such systems were still in the early stages of development when the pandemic started. However, the unprecedented mobilization of financial, scientific, and development resources during the course of COVID-19 allowed for the fast development, deployment, and scaling of numerous global surveillance systems, which provide resources for outbreak response using SARS-CoV-2 genome analysis (Table 1).
Table 1: Online services with SARS-CoV-2 genome resources and analytics
| Resource | Description | Link |
|---|---|---|
| GISAID | Assembled genome database and analysis | https://www.gisaid.org/ |
| NCBI | Raw sequencing data database | https://www.ncbi.nlm.nih.gov/sars-cov-2/ |
| COG-UK | United Kingdom sequences database | https://www.cogconsortium.uk/ |
| PANGO | Lineage analytics | https://cov-lineages.org/ |
| Nextstrain | Phylogenetic analysis | https://nextstrain.org/ |
| WBEC | Wastewater analytics | https://www.covid19wbec.org/ |
| COVID-3D | Structural changes of lineages | http://biosig.unimelb.edu.au/covid3d/ |
When rigorously studied, benchmarked, and standardized, viral genomic surveillance systems enable reliable and timely detection of the presence of circulating and emerging pathogens similar to SARS-CoV-2, providing us a robust shield from current and newly emerging outbreaks162. With sufficient sampling, genomic analysis will enable sentinel surveillance efforts capable of effectively locating the geographic source of outbreaks, elucidating transmission chains, and ultimately limiting the spread of the pathogens globally99,141,163–167.
Supplementary Material
Acknowledgements
We thank William M. Switzer and Ellsworth M. Campbell from the Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA, for useful discussions and suggestions. We also thank the numerous anonymous reviewers whose valuable comments helped improve our manuscript.
Funding
S.M. was partially supported by National Science Foundation grant 2041984. Tommy Lam is supported by the NSFC Excellent Young Scientists Fund (Hong Kong and Macau) (31922087) and the Health and Medical Research Fund (COVID190223). Pavel Skums was supported by National Institutes of Health grant 1R01EB025022. Malak Abedalthagafi acknowledges King Abdulaziz City for Science and Technology and the Saudi Human Genome Project for technical and financial support (https://shgp.kacst.edu.sa). Nicholas Wu was supported by startup funds from the University of Illinois at Urbana-Champaign. Adam Smith acknowledges funding from NSF grant no. 2029025. Alex Zelikovsky has been partially supported by NSF grant CCF-1619110 and NIH grant 1R01EB025022-01. Sergey Knyazev has been partly supported by Molecular Basis of Disease at Georgia State University. Rob Knight was supported by NSF project 2038509 (RAPID: Improving QIIME 2 and UniFrac for Viruses to Respond to COVID-19) and by CDC project 30055281 with Scripps, led by Kristian Andersen (Genomic sequencing of SARS-CoV-2 to investigate local and cross-border emergence and spread).
Contributor Information
Sergey Knyazev, Department of Computer Science, College of Art and Science, Georgia State University, 1 Park Place, Room 618, Atlanta, GA 30303, USA; Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA; Oak Ridge Institute for Science and Education, Oak Ridge, TN 37830, USA.
Karishma Chhugani, Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Room 713, Los Angeles, CA 90089, USA.
Varuni Sarwal, Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA.
Ram Ayyala, Department of Neuroscience, College of Life Sciences, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA.
Harman Singh, Department of Electrical Engineering, Indian Institute of Technology, Hauz Khas, New Delhi, 110016, India.
Smruthi Karthikeyan, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
Dhrithi Deshpande, Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Room 713, Los Angeles, CA 90089, USA.
Zoia Comarova, Paradigm Environmental, 3911 Old Lee Highway, Fairfax, VA 22030.
Angela Lu, Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Room 713, Los Angeles, CA 90089-9121, USA.
Yuri Porozov, World-Class Research Center “Digital biodesign and personalized healthcare”, I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Department of Computational Biology, Sirius University of Science and Technology, Sochi, Russia.
Aiping Wu, Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China; Suzhou Institute of Systems Medicine, Suzhou, 215123, China.
Malak S. Abedalthagafi, Genomics Research Department, Saudi Human Genome Project, King Fahad Medical City and King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia.
Shivashankar H. Nagaraj, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4059, Australia; Translational Research Institute, Brisbane, Australia.
Adam L. Smith, Astani Department of Civil and Environmental Engineering, University of Southern California, 3620 South Vermont Avenue, Los Angeles, CA 90089, USA.
Pavel Skums, Department of Computer Science, College of Art and Science, Georgia State University, 1 Park Place, Floor 6, Atlanta, GA 30303, USA.
Jason Ladner, The Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, AZ 86011.
Tommy Tsan-Yuk Lam, State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong.
Nicholas C. Wu, Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
Alex Zelikovsky, Department of Computer Science, College of Art and Science, Georgia State University, 1 Park Place, Floor 6, Atlanta, GA 30303, USA; The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia.
Rob Knight, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA; Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA.
Keith A. Crandall, Computational Biology Institute and Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC 20052, USA.
Serghei Mangul, Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1540 Alcazar Street, Los Angeles, CA 90033, USA.
References
- 1. Barro R., Ursúa J. & Weng J. The Coronavirus and the Great Influenza Pandemic: Lessons from the ‘Spanish Flu’ for the Coronavirus’s Potential Effects on Mortality and Economic Activity. (2020) doi: 10.3386/w26866
- 2. Lu J. et al. Genomic Epidemiology of SARS-CoV-2 in Guangdong Province, China. Cell (2020) doi: 10.1016/j.cell.2020.04.023
- 3. Wang C., Horby P. W., Hayden F. G. & Gao G. F. A novel coronavirus outbreak of global health concern. The Lancet vol. 395 470–473 (2020).
- 4. Wu Z. & McGoogan J. M. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention. JAMA (2020) doi: 10.1001/jama.2020.2648
- 5. Coronavirus disease (COVID-19) – World Health Organization. https://www.who.int/emergencies/diseases/novel-coronavirus-2019.
- 6. Grubaugh N. D. et al. Tracking virus outbreaks in the twenty-first century. Nat Microbiol 4, 10–19 (2019).
- 7. Rambaut A. et al. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat Microbiol (2020).
- 8. The Rockefeller Foundation Releases New Action Plan to Accelerate Development of a National System for Gathering and Sharing Information on SARS-CoV-2 Genomic Variants and Other Pathogens. https://www.rockefellerfoundation.org/news/the-rockefeller-foundation-releases-new-action-plan-to-accelerate-development-of-a-national-system-for-gathering-and-sharing-information-on-sars-cov-2-genomic-variants-and-other-pathogens/.
- 9. Ren L.-L. et al. Identification of a novel coronavirus causing severe pneumonia in human: a descriptive study. Chin. Med. J. 133, 1015–1024 (2020).
- 10. Zhou P. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270–273 (2020).
- 11. Tang X. et al. On the origin and continuing evolution of SARS-CoV-2. Natl Sci Rev 7, 1012–1023 (2020).
- 12. Lu R. et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet 395, 565–574 (2020).
- 13. Andersen K. G., Rambaut A., Lipkin W. I., Holmes E. C. & Garry R. F. The proximal origin of SARS-CoV-2. Nat. Med. 26, 450–452 (2020).
- 14. Li X. et al. Emergence of SARS-CoV-2 through recombination and strong purifying selection. Science Advances eabb9153 (2020).
- 15. Morens D. M. & Fauci A. S. Emerging Pandemic Diseases: How We Got to COVID-19. Cell vol. 182 1077–1092 (2020).
- 16. Wu A. et al. Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China. Cell Host Microbe 27, 325–328 (2020).
- 17. Zhu N. et al. A Novel Coronavirus from Patients with Pneumonia in China, 2019. N. Engl. J. Med. 382, 727–733 (2020).
- 18. Boni M. F. et al. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nat Microbiol (2020) doi: 10.1038/s41564-020-0771-4
- 19. Zhang T., Wu Q. & Zhang Z. Probable Pangolin Origin of SARS-CoV-2 Associated with the COVID-19 Outbreak. Curr. Biol. 30, 1578 (2020).
- 20. Zhou H. et al. A Novel Bat Coronavirus Closely Related to SARS-CoV-2 Contains Natural Insertions at the S1/S2 Cleavage Site of the Spike Protein. Curr. Biol. 30, 3896 (2020).
- 21. Wacharapluesadee S. et al. Evidence for SARS-CoV-2 related coronaviruses circulating in bats and pangolins in Southeast Asia. Nat. Commun. 12, 972 (2021).
- 22. Cui J., Li F. & Shi Z.-L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 17, 181–192 (2019).
- 23. Zhu Z. et al. From SARS and MERS to COVID-19: a brief summary and comparison of severe acute respiratory infections caused by three highly pathogenic human coronaviruses. Respir. Res. 21, 224 (2020).
- 24. Lam T. T.-Y. et al. Identifying SARS-CoV-2 related coronaviruses in Malayan pangolins. Nature (2020) doi: 10.1038/s41586-020-2169-0
- 25. Rando H. M. et al. Pathogenesis, Symptomatology, and Transmission of SARS-CoV-2 through analysis of Viral Genomics and Structure. arXiv [q-bio.QM] (2021).
- 26. Gardy J. L. & Loman N. J. Towards a genomics-informed, real-time, global pathogen surveillance system. Nat. Rev. Genet. 19, 9–20 (2018).
- 27. Wang D. et al. Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus–Infected Pneumonia in Wuhan, China. JAMA 323, 1061–1069 (2020).
- 28. Elbe S. & Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob Chall 1, 33–46 (2017).
- 29. Kalinich C. C. et al. Real-time public health communication of local SARS-CoV-2 genomic epidemiology. PLoS Biol. 18, e3000869 (2020).
- 30. Thanh Le T. et al. The COVID-19 vaccine development landscape. Nat. Rev. Drug Discov. 19, 305–306 (2020).
- 31. Amanat F. & Krammer F. SARS-CoV-2 Vaccines: Status Report. Immunity 52, 583–589 (2020).
- 32. Chen W.-H., Strych U., Hotez P. J. & Bottazzi M. E. The SARS-CoV-2 Vaccine Pipeline: an Overview. Curr Trop Med Rep 1–4 (2020).
- 33. Sheridan C. Coronavirus and the race to distribute reliable diagnostics. Nature Biotechnology vol. 38 382–384 (2020).
- 34. Kudo E. et al. Detection of SARS-CoV-2 RNA by multiplex RT-qPCR. PLoS Biol. 18, e3000867 (2020).
- 35. Hadfield J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).
- 36. Shu Y. & McCauley J. GISAID: Global initiative on sharing all influenza data--from vision to reality. Eurosurveillance 22, 30494 (2017).
- 37. An integrated national scale SARS-CoV-2 genomic surveillance network. The Lancet Microbe (2020) doi: 10.1016/S2666-5247(20)30054-9
- 38. Fernandes J. D. et al. The UCSC SARS-CoV-2 Genome Browser. Nat. Genet. 52, 991–998 (2020).
- 39. Kuiken C., Korber B. & Shafer R. W. HIV sequence databases. AIDS Rev. 5, 52–61 (2003).
- 40. Foley B. T. et al. HIV Sequence Compendium 2018. https://www.osti.gov/biblio/1458915 (2018) doi: 10.2172/1458915
- 41. Inzaule S. C., Tessema S. K., Kebede Y., Ogwell Ouma A. E. & Nkengasong J. N. Genomic-informed pathogen surveillance in Africa: opportunities and challenges. Lancet Infect. Dis. (2021) doi: 10.1016/S1473-3099(20)30939-7
- 42. Wu A. et al. Mutations, Recombination and Insertion in the Evolution of 2019-nCoV. bioRxiv (2020) doi: 10.1101/2020.02.29.971101
- 43. Mathew D. et al. Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications. Science 369, (2020).
- 44. Fang S. et al. GESS: a database of global evaluation of SARS-CoV-2/hCoV-19 sequences. Nucleic Acids Research (2020) doi: 10.1093/nar/gkaa808
- 45. Koyama T., Platt D. & Parida L. Variant analysis of SARS-CoV-2 genomes. Bull. World Health Organ. 98, 495–504 (2020).
- 46. Zhao Z. et al. Moderate mutation rate in the SARS coronavirus genome and its implications. BMC Evol. Biol. 4, 21 (2004).
- 47. Holmes E. C., Dudas G., Rambaut A. & Andersen K. G. The evolution of Ebola virus: Insights from the 2013–2016 epidemic. Nature 538, 193–200 (2016).
- 48. Sanjuán R. & Domingo-Calap P. Mechanisms of viral mutation. Cellular and Molecular Life Sciences vol. 73 4433–4448 (2016).
- 49. Di Giorgio S., Martignano F., Torcia M. G., Mattiuz G. & Conticello S. G. Evidence for host-dependent RNA editing in the transcriptome of SARS-CoV-2. Sci Adv 6, eabb5813 (2020).
- 50. Lofgren E., Fefferman N. H., Naumov Y. N., Gorski J. & Naumova E. N. Influenza seasonality: underlying causes and modeling theories. J. Virol. 81, 5429–5436 (2007).
- 51. Cuevas J. M., Geller R., Garijo R., López-Aldeguer J. & Sanjuán R. Extremely High Mutation Rate of HIV-1 In Vivo. PLoS Biol. 13, e1002251 (2015).
- 52. Hoenen T., Groseth A., Safronetz D., Wollenberg K. & Feldmann H. Response to Comment on ‘Mutation rate and genotype variation of Ebola virus from Mali case sequences’. Science vol. 353 658 (2016).
- 53. Wilke C. O., Wang J. L., Ofria C., Lenski R. E. & Adami C. Evolution of digital organisms at high mutation rates leads to survival of the flattest. Nature 412, 331–333 (2001).
- 54. Shen Z. et al. Genomic diversity of SARS-CoV-2 in Coronavirus Disease 2019 patients. Clin. Infect. Dis. (2020) doi: 10.1093/cid/ciaa203
- 55. Moreno G. K. et al. Limited SARS-CoV-2 diversity within hosts and following passage in cell culture. bioRxiv 2020.04.20.051011 (2020) doi: 10.1101/2020.04.20.051011
- 56. Karamitros T. et al. SARS-CoV-2 exhibits intra-host genomic plasticity and low-frequency polymorphic quasispecies. bioRxiv 2020.03.27.009480 (2020) doi: 10.1101/2020.03.27.009480
- 57. Lythgoe K. A. et al. Shared SARS-CoV-2 diversity suggests localised transmission of minority variants. bioRxiv 2020.05.28.118992 (2020) doi: 10.1101/2020.05.28.118992
- 58. Jary A. et al. Evolution of viral quasispecies during SARS-CoV-2 infection. Clin. Microbiol. Infect. (2020) doi: 10.1016/j.cmi.2020.07.032
- 59. Kuipers J. et al. Within-patient genetic diversity of SARS-CoV-2. Cold Spring Harbor Laboratory 2020.10.12.335919 (2020) doi: 10.1101/2020.10.12.335919
- 60. James S. E. et al. High Resolution analysis of Transmission Dynamics of Sars-Cov-2 in Two Major Hospital Outbreaks in South Africa Leveraging Intrahost Diversity. medRxiv (2020) doi: 10.1101/2020.11.15.20231993
- 61. Sashittal P., Luo Y., Peng J. & El-Kebir M. Characterization of SARS-CoV-2 viral diversity within and across hosts. bioRxiv (2020).
- 62. Avanzato V. A. et al. Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with Cancer. Cell vol. 183 1901–1912.e9 (2020).
- 63. Choi B. et al. Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host. N. Engl. J. Med. 383, 2291–2293 (2020).
- 64. Kemp S. A. et al. SARS-CoV-2 evolution during treatment of chronic infection. Nature (2021) doi: 10.1038/s41586-021-03291-y
- 65. Geoghegan J. L. & Holmes E. C. The phylogenomics of evolving virus virulence. Nat. Rev. Genet. 19, 756–769 (2018).
- 66. van Dorp L. et al. Emergence of genomic diversity and recurrent mutations in SARS-CoV-2. Infect. Genet. Evol. 104351 (2020).
- 67. Zhang Y.-Z. & Holmes E. C. A Genomic Perspective on the Origin and Emergence of SARS-CoV-2. Cell 181, 223–227 (2020).
- 68. Korber B. et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell (2020) doi: 10.1016/j.cell.2020.06.043
- 69. van Dorp L. et al. No evidence for increased transmissibility from recurrent mutations in SARS-CoV-2. doi: 10.1101/2020.05.21.108506
- 70. Baric R. S. Emergence of a Highly Fit SARS-CoV-2 Variant. N. Engl. J. Med. (2020) doi: 10.1056/NEJMcibr2032888
- 71. Hou Y. J. et al. SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo. Science 370, 1464–1468 (2020).
- 72. Volz E. et al. Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data. medRxiv 2020.12.30.20249034 (2021).
- 73. Mahase E. Covid-19: What have we learnt about the new variant in the UK? BMJ m4944 (2020) doi: 10.1136/bmj.m4944
- 74. Fiorentini S. et al. First detection of SARS-CoV-2 spike protein N501 mutation in Italy in August, 2020. Lancet Infect. Dis. (2021) doi: 10.1016/S1473-3099(21)00007-4
- 75. Peacock T. P. et al. The furin cleavage site of SARS-CoV-2 spike protein is a key determinant for transmission due to enhanced replication in airway cells. Cold Spring Harbor Laboratory 2020.09.30.318311 (2020) doi: 10.1101/2020.09.30.318311
- 76. Chan K. K., Tan T. J. C., Narayanan K. K. & Procko E. An engineered decoy receptor for SARS-CoV-2 broadly binds protein S sequence variants. bioRxiv (2020) doi: 10.1101/2020.10.18.344622
- 77. Starr T. N. et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell 182, 1295–1310.e20 (2020).
- 78. Horby P. et al. NERVTAG note on B.1.1.7 severity. New & Emerging Threats Advisory Group, January 21, (2021).
- 79. Bal A. et al. Screening of the H69 and V70 deletions in the SARS-CoV-2 spike protein with a RT-PCR diagnosis assay reveals low prevalence in Lyon, France. medRxiv 2020.11.10.20228528 (2020).
- 80. Chand M. & Others. Investigation of novel SARS-COV-2 variant: Variant of Concern 202012/01 (PDF). Public Health England. PHE (2020).
- 81. Tegally H. et al. Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa. medRxiv (2020).
- 82. Mwenda M. et al. Detection of B.1.351 SARS-CoV-2 Variant Strain — Zambia, December 2020. MMWR. Morbidity and Mortality Weekly Report vol. 70 (2021).
- 83. Toovey O. T. R., Harvey K. N., Bird P. W. & Tang J. W.-T. Introduction of Brazilian SARS-CoV-2 484K.V2 related variants into the UK. Journal of Infection (2021) doi: 10.1016/j.jinf.2021.01.025
- 84. Naveca F. et al. Phylogenetic relationship of SARS-CoV-2 sequences from Amazonas with emerging Brazilian variants harboring mutations E484K and N501Y in the Spike protein. Virological.org. Available at: https://virological.org/t/phylogenetic-relationship-of-sars-cov-2-sequences-from-amazonas-with-emerging-brazilian-variants-harboring-mutations-e484k-and-n501y-in-the-spike-protein/585 (2021).
- 85. Faria N. R. et al. Genomic characterisation of an emergent SARS-CoV-2 lineage in Manaus: preliminary findings. (2021).
- 86. Greaney A. J. et al. Comprehensive mapping of mutations to the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human serum antibodies. Cold Spring Harbor Laboratory 2020.12.31.425021 (2021) doi: 10.1101/2020.12.31.425021
- 87. CDC. Cases, Data, and Surveillance. https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/variant-surveillance/variant-info.html (2021).
- 88. Maxmen A. Massive Google-funded COVID database will track variants and immunity. Nature (2021) doi: 10.1038/d41586-021-00490-5
- 89. Blair C. & Ané C. Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data. Syst. Biol. 69, 593–601 (2020).
- 90. Martin M. A., VanInsberghe D. & Koelle K. Insights from SARS-CoV-2 sequences. Science 371, 466–467 (2021).
- 91. Fauver J. R. et al. Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States. Cell (2020) doi: 10.1016/j.cell.2020.04.021
- 92. Gámbaro F. et al. Introductions and early spread of SARS-CoV-2 in France. doi: 10.1101/2020.04.24.059576
- 93. Miller D. et al. Full genome viral sequences inform patterns of SARS-CoV-2 spread into and within Israel. medRxiv (2020).
- 94. Thielen P. M. et al. Genomic Diversity of SARS-CoV-2 During Early Introduction into the United States National Capital Region. medRxiv (2020) doi: 10.1101/2020.08.13.20174136
- 95. McNamara R. P. et al. High-Density Amplicon Sequencing Identifies Community Spread and Ongoing Evolution of SARS-CoV-2 in the Southern United States. Cell Rep. 33, 108352 (2020).
- 96. Nadeau S. A., Vaughan T. G., Scire J., Huisman J. S. & Stadler T. The origin and early spread of SARS-CoV-2 in Europe. Proc. Natl. Acad. Sci. U. S. A. 118, (2021).
- 97. Worobey M. et al. The emergence of SARS-CoV-2 in Europe and North America. Science (2020) doi: 10.1126/science.abc8169
- 98. Ladner J. T. et al. An Early Pandemic Analysis of SARS-CoV-2 Population Structure and Dynamics in Arizona. MBio 11, (2020).
- 99. Gonzalez-Reiche A. S. et al. Introductions and early spread of SARS-CoV-2 in the New York City area. Science (2020) doi: 10.1126/science.abc1917
- 100. Candido D. D. S. et al. Routes for COVID-19 importation in Brazil. J. Travel Med. 27, (2020).
- 101. Badaoui B., Sadki K., Talbi C., Driss S. & Tazi L. Genetic Diversity and Genomic Epidemiology of SARS-COV-2 in Morocco. doi: 10.1101/2020.06.23.165902
- 102. Rockett R. J. et al. Revealing COVID-19 transmission in Australia by SARS-CoV-2 genome sequencing and agent-based modeling. Nat. Med. 26, 1398–1404 (2020).
- 103. Kupferschmidt K. Fast-spreading U.K. virus variant raises alarms. Science 371, 9–10 (2021).
- 104. Washington N. L. et al. Genomic epidemiology identifies emergence and rapid transmission of SARS-CoV-2 B.1.1.7 in the United States. medRxiv (2021) doi: 10.1101/2021.02.06.21251159
- 105. Nepomuceno M. R. et al. Besides population age structure, health and other demographic factors can contribute to understanding the COVID-19 burden. Proceedings of the National Academy of Sciences of the United States of America vol. 117 13881–13883 (2020).
- 106. Biswas S. K. & Mudi S. R. Genetic variation in SARS-CoV-2 may explain variable severity of COVID-19. Med. Hypotheses 143, 109877 (2020).
- 107. Pachetti M. et al. Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant. J. Transl. Med. 18, 179 (2020).
- 108. Rausch J. W., Capoferri A. A., Katusiime M. G., Patro S. C. & Kearney M. F. Low genetic diversity may be an Achilles heel of SARS-CoV-2. Proceedings of the National Academy of Sciences 202017726 (2020) doi: 10.1073/pnas.2017726117
- 109. Fahrenfeld N. & Bisceglia K. J. Emerging investigators series: sewer surveillance for monitoring antibiotic use and prevalence of antibiotic resistance: urban sewer epidemiology. Environmental Science: Water Research & Technology vol. 2 788–799 (2016).
- 110. Castiglioni S., Senta I., Borsotti A., Davoli E. & Zuccato E. A novel approach for monitoring tobacco use in local communities by wastewater analysis. Tobacco Control vol. 24 38–42 (2015).
- 111. Sims N. & Kasprzyk-Hordern B. Future perspectives of wastewater-based epidemiology: Monitoring infectious disease spread and resistance to the community level. Environ. Int. 139, 105689 (2020).
- 112.Hellmér M. et al. Detection of pathogenic viruses in sewage provided early warnings of hepatitis A virus and norovirus outbreaks. Appl. Environ. Microbiol. 80, 6771–6781 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Chen Y. et al. The presence of SARS-CoV-2 RNA in the feces of COVID-19 patients. J. Med. Virol. 92, 833–840 (2020). [DOI] [PubMed] [Google Scholar]
- 114.Peccia J. et al. Measurement of SARS-CoV-2 RNA in wastewater tracks community infection dynamics. Nat. Biotechnol. (2020) doi: 10.1038/s41587-020-0684-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Farkas K., Hillary L. S., Malham S. K., McDonald J. E. & Jones D. L. Wastewater and public health: the potential of wastewater surveillance for monitoring COVID-19. Current Opinion in Environmental Science & Health vol. 17 14–20 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Adriaenssens E. M. et al. Viromic Analysis of Wastewater Input to a River Catchment Reveals a Diverse Assemblage of RNA Viruses. mSystems 3, (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Carbo E. C. et al. Coronavirus discovery by metagenomic sequencing: a tool for pandemic preparedness. J. Clin. Virol. 131, 104594 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Bedford J. et al. A new twenty-first century science for effective epidemic response. Nature 575, 130–136 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Nieuwenhuijse D. F. et al. Setting a baseline for global urban virome surveillance in sewage. Sci. Rep. 10, 13748 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Crits-Christoph A. et al. Genome sequencing of sewage detects regionally prevalent SARS-CoV-2 variants. doi: 10.1101/2020.09.13.20193805 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Randazzo W., Cuevas-Ferrando E., Sanjuán R., Domingo-Calap P. & Sánchez G. Metropolitan Wastewater Analysis for COVID-19 Epidemiological Surveillance. SSRN Electronic Journal doi: 10.2139/ssrn.3586696 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Wurtzer S. et al. Evaluation of lockdown impact on SARS-CoV-2 dynamics through viral genome quantification in Paris wastewaters. doi: 10.1101/2020.04.12.20062679 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Wu F. et al. SARS-CoV-2 Titers in Wastewater Are Higher than Expected from Clinically Confirmed Cases. mSystems 5, (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Weidhaas J. et al. Correlation of SARS-CoV-2 RNA in wastewater with COVID-19 disease burden in sewersheds. (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Medema G., Heijnen L., Elsinga G., Italiaander R. & Brouwer A. Presence of SARS-Coronavirus-2 RNA in Sewage and Correlation with Reported COVID-19 Prevalence in the Early Stage of the Epidemic in The Netherlands. Environ. Sci. Technol. Lett. 7, 511–516 (2020). [DOI] [PubMed] [Google Scholar]
- 126.Ahmed W. et al. First confirmed detection of SARS-CoV-2 in untreated wastewater in Australia: A proof of concept for the wastewater surveillance of COVID-19 in the community. Sci. Total Environ. 728, 138764 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Gonzalez R. et al. COVID-19 surveillance in Southeastern Virginia using wastewater-based epidemiology. Water Res. 186, 116296 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Medema G., Heijnen L., Elsinga G., Italiaander R. & Brouwer A. Presence of SARS-Coronavirus-2 in sewage. doi: 10.1101/2020.03.29.20045880 [DOI] [PubMed] [Google Scholar]
- 129.Wu F. et al. SARS-CoV-2 titers in wastewater foreshadow dynamics and clinical presentation of new COVID-19 cases. doi: 10.1101/2020.06.15.20117747 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Karthikeyan S. et al. High throughput wastewater SARS-CoV-2 detection enables forecasting of community infection dynamics in San Diego county. doi: 10.1101/2020.11.16.20232900 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Larsen D. A. & Wigginton K. R. Tracking COVID-19 with wastewater. Nat. Biotechnol. doi: 10.1038/s41587-020-0690-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Schmidt C. Watcher in the wastewater. Nat. Biotechnol. 38, 917–920 (2020). [DOI] [PubMed] [Google Scholar]
- 133.Izquierdo Lara R. W. et al. Monitoring SARS-CoV-2 circulation and diversity through community wastewater sequencing. Public and Global Health (2020) doi: 10.1101/2020.09.21.20198838 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Jahn K. et al. Detection of SARS-CoV-2 variants in Switzerland by genomic analysis of wastewater samples. medRxiv (2021). [Google Scholar]
- 135.Bogler A. et al. Rethinking wastewater risks and monitoring in light of the COVID-19 pandemic. Nature Sustainability (2020) doi: 10.1038/s41893-020-00605-2 [DOI] [Google Scholar]
- 136.Burioni R. & Topol E. J. Assessing the human immune response to SARS-CoV-2 variants. Nat. Med. (2021) doi: 10.1038/s41591-021-01290-0 [DOI] [PubMed] [Google Scholar]
- 137.Zhang X. et al. Viral and host factors related to the clinical outcome of COVID-19. Nature 583, 437–440 (2020). [DOI] [PubMed] [Google Scholar]
- 138.Young B. E. et al. Effects of a major deletion in the SARS-CoV-2 genome on the severity of infection and the inflammatory response: an observational cohort study. Lancet (2020) doi: 10.1016/S0140-6736(20)31757-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 139.Zhang L. et al. The D614G mutation in the SARS-CoV-2 spike protein reduces S1 shedding and increases infectivity. bioRxiv (2020) doi: 10.1101/2020.06.12.148726 [DOI] [Google Scholar]
- 140.Grubaugh N. D., Hodcroft E. B., Fauver J. R., Phelan A. L. & Cevik M. Public health actions to control new SARS-CoV-2 variants. Cell (2021) doi: 10.1016/j.cell.2021.01.044 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 141.Mashe T. et al. Surveillance of SARS-CoV-2 in Zimbabwe shows dominance of variants of concern. Lancet Microbe (2021) doi: 10.1016/S2666-5247(21)00061-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Dearlove B. et al. A SARS-CoV-2 vaccine candidate would likely match all currently circulating variants. Proc. Natl. Acad. Sci. U. S. A. (2020) doi: 10.1073/pnas.2008281117 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Li Q. et al. The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity. Cell 182, 1284–1294.e9 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.Walls A. C. et al. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell 181, 281–292.e6 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Portelli S. et al. Exploring the structural distribution of genetic variation in SARS-CoV-2 with the COVID-3D online resource. Nat. Genet. (2020) doi: 10.1038/s41588-020-0693-3 [DOI] [PubMed] [Google Scholar]
- 146.To K. K.-W. et al. Coronavirus Disease 2019 (COVID-19) Re-infection by a Phylogenetically Distinct Severe Acute Respiratory Syndrome Coronavirus 2 Strain Confirmed by Whole Genome Sequencing. Clinical Infectious Diseases (2020) doi: 10.1093/cid/ciaa1275 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Van Elslande J. et al. Symptomatic SARS-CoV-2 reinfection by a phylogenetically distinct strain. Clin. Infect. Dis. (2020) doi: 10.1093/cid/ciaa1330 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 148.Prado-Vivar B., Becerra-Wong M., Guadalupe J. J. & Others. COVID-19 re-infection by a phylogenetically distinct SARS-CoV-2 variant, first confirmed event in South America. SSRN 2020; published online Sept 8. [Google Scholar]
- 149.Iwasaki A. What reinfections mean for COVID-19. The Lancet infectious diseases vol. 21 3–5 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 150.Abu-Raddad L. J. et al. Assessment of the risk of SARS-CoV-2 reinfection in an intense re-exposure setting. bioRxiv (2020) doi: 10.1101/2020.08.24.20179457 [DOI] [Google Scholar]
- 151.Grant M. C. et al. The prevalence of symptoms in 24,410 adults infected by the novel coronavirus (SARS-CoV-2; COVID-19): A systematic review and meta-analysis of 148 studies from 9 countries. PLoS One 15, e0234765 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 152.Fu L. et al. Clinical characteristics of coronavirus disease 2019 (COVID-19) in China: A systematic review and meta-analysis. Journal of Infection vol. 80 656–665 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 153.Gavriatopoulou M. et al. Organ-specific manifestations of COVID-19 infection. Clin. Exp. Med. 20, 493–506 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 154.Channappanavar R., Zhao J. & Perlman S. T cell-mediated immune response to respiratory coronaviruses. Immunologic Research vol. 59 118–128 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 155.Schultheiß C. et al. Next-Generation Sequencing of T and B Cell Receptor Repertoires from COVID-19 Patients Showed Signatures Associated with Severity of Disease. Immunity vol. 53 442–455.e4 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 156.Williamson E. J. et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature (2020) doi: 10.1038/s41586-020-2521-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.Haendel M., Chute C. & Gersing K. The National COVID Cohort Collaborative (N3C): Rationale, Design, Infrastructure, and Deployment. J. Am. Med. Inform. Assoc. (2020) doi: 10.1093/jamia/ocaa196 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158.Meredith L. W. et al. Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: a prospective genomic surveillance study. The Lancet Infectious Diseases (2020) doi: 10.1016/s1473-3099(20)30562-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 159.Van Noorden R. Scientists call for fully open sharing of coronavirus genome data. Nature 590, 195–196 (2021). [DOI] [PubMed] [Google Scholar]
- 160.Hodcroft E. B. et al. Want to track pandemic variants faster? Fix the bioinformatics bottleneck. Nature 591, 30–33 (2021). [DOI] [PubMed] [Google Scholar]
- 161.Maxmen A. One million coronavirus sequences: popular genome site hits mega milestone. Nature (2021) doi: 10.1038/d41586-021-01069-w [DOI] [PubMed] [Google Scholar]
- 162.Status of environmental surveillance for SARS-CoV-2 virus. https://www.who.int/news-room/commentaries/detail/status-of-environmental-surveillance-for-sars-cov-2-virus.
- 163.Watson C. How countries are using genomics to help avoid a second coronavirus wave. Nature 582, 19 (2020). [DOI] [PubMed] [Google Scholar]
- 164.Deng X. et al. Genomic surveillance reveals multiple introductions of SARS-CoV-2 into Northern California. Science 369, 582–587 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 165.Chaguza C., Nyaga M. M., Mwenda J. M., Esona M. D. & Jere K. C. Using genomics to improve preparedness and response of future epidemics or pandemics in Africa. The Lancet Microbe vol. 1 e275–e276 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 166.Robert A. Lessons from New Zealand’s COVID-19 outbreak response. The Lancet. Public health vol. 5 e569–e570 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 167.Oude Munnink B. B. et al. Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands. Nat. Med. 26, 1405–1410 (2020). [DOI] [PubMed] [Google Scholar]
- 168.Seemann T. et al. Tracking the COVID-19 pandemic in Australia using genomics. Nat. Commun. 11, 4376 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 169.Gudbjartsson D. F. et al. Spread of SARS-CoV-2 in the Icelandic Population. N. Engl. J. Med. 382, 2302–2315 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 170.Geoghegan J. L. et al. Genomic epidemiology reveals transmission patterns and dynamics of SARS-CoV-2 in Aotearoa New Zealand. doi: 10.1101/2020.08.05.20168930 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 171.Zhou Z.-Y. et al. Worldwide tracing of mutations and the evolutionary dynamics of SARS-CoV-2. doi: 10.1101/2020.08.07.242263 [DOI] [Google Scholar]
- 172.Sekizuka T. et al. SARS-CoV-2 Genome Analysis of Japanese Travelers in Nile River Cruise. Front. Microbiol. 11, 1316 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 173.Forster P., Forster L., Renfrew C. & Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc. Natl. Acad. Sci. U. S. A. 117, 9241–9243 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 174.Candido D. S. et al. Evolution and epidemic spread of SARS-CoV-2 in Brazil. Science 369, 1255–1260 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 175.Giovanetti M., Angeletti S., Benvenuto D. & Ciccozzi M. A doubt of multiple introduction of SARS-CoV-2 in Italy: A preliminary overview. J. Med. Virol. (2020) doi: 10.1002/jmv.25773 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 176.Chong Y. M. et al. SARS-CoV-2 lineage B.6 is the major contributor to transmission in Malaysia. doi: 10.1101/2020.08.27.269738 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 177.Bedford T. et al. Cryptic transmission of SARS-CoV-2 in Washington state. Science (2020) doi: 10.1126/science.abc0523 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 178.Guo L. et al. Genomic epidemiology of the Los Angeles COVID-19 outbreak. medRxiv 2020.09.15.20194712 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 179.Long S. W. et al. Molecular Architecture of Early Dissemination and Massive Second Wave of the SARS-CoV-2 Virus in a Major Metropolitan Area. doi: 10.1101/2020.09.22.20199125 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 180.Lemieux J. et al. Phylogenetic analysis of SARS-CoV-2 in the Boston area highlights the role of recurrent importation and superspreading events. doi: 10.1101/2020.08.23.20178236 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 181.Page A. J. et al. Large scale sequencing of SARS-CoV-2 genomes from one region allows detailed epidemiology and enables local outbreak management. doi: 10.1101/2020.09.28.20201475 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 182.Mueller N. F. et al. Viral genomes reveal patterns of the SARS-CoV-2 outbreak in Washington State. medRxiv (2020) doi: 10.1101/2020.09.30.20204230 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 183.Website. https://doi.org/10.1101/2020.06.26.20135715 doi:10.1101/2020.06.26.20135715. [Google Scholar]
- 184.Pattabiraman C. et al. Genomic epidemiology reveals multiple introductions and spread of SARS-CoV-2 in the Indian state of Karnataka. doi: 10.1101/2020.07.10.20150045 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 185.Komissarov A. B. et al. Genomic epidemiology of the early stages of SARS-CoV-2 outbreak in Russia. doi: 10.1101/2020.07.14.20150979 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 186.da Silva Filipe A. et al. Genomic epidemiology reveals multiple introductions of SARS-CoV-2 from mainland Europe into Scotland. Nat Microbiol 6, 112–122 (2021). [DOI] [PubMed] [Google Scholar]
- 187.Popa A. et al. Genomic epidemiology of superspreading events in Austria reveals mutational dynamics and transmission properties of SARS-CoV-2. Sci. Transl. Med. 12, (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 188.Islam M. R. et al. Genome-wide analysis of SARS-CoV-2 virus strains circulating worldwide implicates heterogeneity. Sci. Rep. 10, 14004 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 189.Ngoi J. M. et al. Genomic analysis of SARS-CoV-2 reveals local viral evolution in Ghana. Exp. Biol. Med. 1535370220975351 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 190.Li T. et al. Phylogenetic supertree reveals detailed evolution of SARS-CoV-2. Sci. Rep. 10, 22366 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 191.Pereson M. J. et al. Phylogenetic analysis of SARS-CoV-2 in the first few months since its emergence. J. Med. Virol. 93, 1722–1731 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 192.Zehender G. et al. Genomic characterization and phylogenetic analysis of SARS-COV-2 in Italy. J. Med. Virol. 92, 1637–1640 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 193.Castillo A. E. et al. Phylogenetic analysis of the first four SARS-CoV-2 cases in Chile. J. Med. Virol. 92, 1562–1566 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 194.Bartolini B. et al. SARS-CoV-2 Phylogenetic Analysis, Lazio Region, Italy, February-March 2020. Emerg. Infect. Dis. 26, 1842–1845 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 195.Adebalİ O. et al. Phylogenetic analysis of SARS-CoV-2 genomes in Turkey. Turk. J. Biol. 44, 146–156 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 196.Nie Q. et al. Phylogenetic and phylodynamic analyses of SARS-CoV-2. Virus Res. 287, 198098 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 197.Westhaus S. et al. Detection of SARS-CoV-2 in raw and treated wastewater in Germany – Suitability for COVID-19 surveillance and potential transmission risks. Science of The Total Environment vol. 751 141750 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 198.La Rosa G. et al. First detection of SARS-CoV-2 in untreated wastewaters in Italy. Sci. Total Environ. 736, 139652 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 199.Kocamemi B. A. et al. First Data-Set on SARS-CoV-2 Detection for Istanbul Wastewaters in Turkey. doi: 10.1101/2020.05.03.20089417 [DOI] [Google Scholar]
- 200.Or I. B. et al. Regressing SARS-CoV-2 sewage measurements onto COVID-19 burden in the population: a proof-of-concept for quantitative environmental surveillance. doi: 10.1101/2020.04.26.20073569 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 201.Zhang D. et al. Potential spreading risks and disinfection challenges of medical wastewater by the presence of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) viral RNA in septic tanks of Fangcang Hospital. Sci. Total Environ. 741, 140445 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 202.Fongaro G. et al. SARS-CoV-2 in human sewage in Santa Catalina, Brazil, November 2019. doi: 10.1101/2020.06.26.20140731 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 203.Haramoto E., Malla B., Thakali O. & Kitajima M. First environmental surveillance for the presence of SARS-CoV-2 RNA in wastewater and river water in Japan. Sci. Total Environ. 737, 140405 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 204.Kumar M. et al. First proof of the capability of wastewater surveillance for COVID-19 in India through detection of genetic material of SARS-CoV-2. Sci. Total Environ. 746, 141326 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 205.Nemudryi A. et al. Temporal Detection and Phylogenetic Assessment of SARS-CoV-2 in Municipal Wastewater. Cell Rep Med 1, 100098 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 206.Mlejnkova H. et al. Preliminary Study of Sars-Cov-2 Occurrence in Wastewater in the Czech Republic. Int. J. Environ. Res. Public Health 17, (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 207.Graham K. et al. SARS-CoV-2 in wastewater settled solids is associated with COVID-19 cases in a large urban sewershed. doi: 10.1101/2020.09.14.20194472 [DOI] [Google Scholar]
- 208.Agrawal S., Orschler L. & Lackner S. Long-term monitoring of SARS-CoV-2 in wastewater of the Frankfurt metropolitan area in Southern Germany. bioRxiv (2020) doi: 10.1101/2020.10.26.20215020 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 209.D’Aoust P. M. et al. Quantitative analysis of SARS-CoV-2 RNA from wastewater solids in communities with low COVID-19 incidence and prevalence. Water Res. 188, 116560 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 210.Kocamemi B. A. et al. SARS-CoV-2 detection in Istanbul wastewater treatment plant sludges. bioRxiv (2020) doi: 10.1101/2020.05.12.20099358 [DOI] [Google Scholar]
- 211.Guerrero-Latorre L. et al. SARS-CoV-2 in river water: Implications in low sanitation countries. Sci. Total Environ. 743, 140832 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 212.Ahmed W. et al. Detection of SARS-CoV-2 RNA in commercial passenger aircraft and cruise ship wastewater: a surveillance tool for assessing the presence of COVID-19 infected travellers. J. Travel Med. 27, (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 213.Randazzo W. et al. SARS-CoV-2 RNA in wastewater anticipated COVID-19 occurrence in a low prevalence area. Water Res. 181, 115942 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 214.Rimoldi S. G. et al. Presence and vitality of SARS-CoV-2 virus in wastewaters and rivers. bioRxiv (2020) doi: 10.1101/2020.05.01.20086009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 215.La Rosa G. et al. SARS-CoV-2 has been circulating in northern Italy since December 2019: evidence from environmental monitoring. bioRxiv (2020) doi: 10.1101/2020.06.25.20140061 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 216.Trottier J. et al. Post-lockdown detection of SARS-CoV-2 RNA in the wastewater of Montpellier, France. bioRxiv (2020) doi: 10.1101/2020.07.08.20148882 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 217.Michael-Kordatou I., Karaolia P. & Fatta-Kassinos D. Sewage analysis as a tool for the COVID-19 pandemic response and management: the urgent need for optimised protocols for SARS-CoV-2 detection and quantification. J Environ Chem Eng 8, 104306 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 218.Green H. et al. Quantification of SARS-CoV-2 and cross-assembly phage (crAssphage) from wastewater to monitor coronavirus transmission within communities. bioRxiv (2020) doi: 10.1101/2020.05.21.20109181 [DOI] [Google Scholar]