Summary
Background
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has rapidly spread worldwide in the population since it was first detected in late 2019. The transcription and replication of coronaviruses, although not fully understood, is characterised by the production of genomic length RNA and shorter subgenomic RNAs to make viral proteins and ultimately progeny virions. Observed levels of subgenomic RNAs differ between sub-lineages and open reading frames but their biological significance is presently unclear.
Methods
Using a large and diverse panel of virus sequencing data produced as part of the Danish COVID-19 routine surveillance together with information in electronic health registries, we assessed the association of subgenomic RNA levels with demographic and clinical variables of the infected individuals.
Findings
Our findings suggest no significant statistical relationship between levels of subgenomic RNAs and host-related factors.
Interpretation
Differences between lineages and subgenomic ORFs may be related to differences in target cell tropism, early virus replication/transcription kinetics or sequence features.
Funding
The author(s) received no specific funding for this work.
Keywords: SARS-CoV-2, Subgenomic RNA, Alpha, Delta, Omicron, Association analysis
Research in context.
Evidence before this study
The biological significance of measured levels of SARS-CoV-2 subgenomic RNA during COVID-19 infection is currently unclear. While it is well understood that their abundance could vary between sub-lineages and different subgenomic ORFs, their role as indicators of disease severity and active replication of the virus is currently debated. To date, few or no association studies have been performed to elucidate the relationship between subgenomic RNA and host-related factors.
Added value of this study
In this study, we combined whole genome sequencing data from the Danish COVID-19 surveillance effort and available electronic health registries to systematically assess the differences in subgenomic RNA abundance between several sub-lineages and identify potential associations with clinical information from the infected individuals. While we reported significantly higher subgenomic RNA levels in Alpha, compared to Omicron and Delta, and different abundance profiles between ORFs, our findings suggest no significant statistical relationship between subgenomic RNAs and host associated factors such as the presence of COVID-19 symptoms, hospitalizations, medical history and demographics.
Implications of all the available evidence
This study provides an unbiased survey of subgenomic RNA profiles in a diverse population both in terms of viral sub-lineages and medical background of the infected hosts and may contribute to a better understanding of the underlying mechanisms driving the abundance of subgenomic RNAs during SARS-CoV-2 infection.
Introduction
As the coronavirus disease 2019 (COVID-19) pandemic unfolds, there is growing interest in finding molecular markers for active replication of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus causing this disease. To determine whether patients are likely to be infectious, the presence of viral genomic RNA would normally be checked. However, genomic RNA is a poor marker of infectiousness, as RNA detected by real-time reverse transcription PCR (rRT-PCR) may be observed for several weeks1 after the initial infection, long after the commonly estimated infectious virus shedding period of 7–8 days.2, 3, 4 Subgenomic RNAs, or negative-sense RNAs, needed for virus transcription and replication have been suggested as possible alternative indicators2,5, 6, 7 of active replication, as these are generated intracellularly during replication in addition to the full-length genomic RNA. SARS-CoV-2 is an enveloped virus with a single-stranded positive-sense RNA genome of approximately 30,000 nucleotides. The genomic RNA consists of a 5′-UTR, two large open reading frames (ORFs) Orf1a and Orf1b in addition to smaller ORFs encoding the structural and accessory proteins. These proteins are needed for the virus to replicate, transcribe, produce progeny virions, and produce the polyadenylated 3′-UTR of its RNAs.8 Following entry into a cell, i.e. initial infection, intracellular replication is believed to begin with translation of the ORF1ab proteins and formation of subcellular cytoplasmic organelles including convoluted membranes and double-membrane vesicles.9, 10, 11, 12 Negative-stranded RNA, double-stranded RNA intermediates, and new positive-stranded genomic and subgenomic coronavirus RNAs are all produced during the coronavirus RNA replication and transcription process.
These membrane structures9 are currently understood to provide the framework on which RNA synthesis occurs, possibly shielding the virus replication complexes and synthesized RNA from cellular defense mechanisms.
Synthesis of positive-strand copies of the genomic RNA is achieved by using negative-strand RNA as a template. These copies are subsequently used for production of virus proteins or for packaging as virion RNA inside progeny virus particles. Of particular interest to this paper, subgenomic RNAs are produced via a complex mechanism of discontinuous negative-strand RNA synthesis, the products of which serve as templates for generation of positive-strand RNA copies. The positive-strand RNA copies are subsequently used as messenger RNAs to express the structural and accessory proteins of SARS-CoV-2.8,13 Every subgenomic RNA molecule contains a common leader sequence of approximately 65–69 nucleotides from the extreme 5′-end of the genome, followed by a transcription regulatory sequence (TRS-L).14 During the discontinuous negative-strand RNA synthesis, the virus RNA polymerase pauses at the body transcription regulatory sequence (TRS-B) located upstream of each ORF. The ORFs encoding the structural and accessory proteins are located in the 3′-end of the virus genome. The nascent RNA is joined to the TRS-L located within the leader sequence, creating a negative-sense subgenomic RNA, which in turn is used as a template to generate positive-sense subgenomic RNAs13 (Fig. 1A).
Fig. 1.
Subgenomic mRNA abundance across Alpha, Delta and Omicron SARS-CoV-2 lineages. A) From top to bottom: schematic of the SARS-CoV-2 genome, discontinuous negative-strand RNA synthesis, and detection of subgenomic RNA, which is performed by identifying read pairs spanning the leader and the 5′-end of a subgenomic open reading frame. B) Subgenomic RNA abundance, with one added pseudo count and normalised per two million total reads, is shown for Alpha, Delta, and Omicron BA.1, BA.2, and BA.5 samples (n = 12,324) collected from the Danish genomic surveillance of COVID-19 in the period March 2021–June 2022. ‘Late Delta’ corresponds to Delta samples contemporary to Omicron (November 2021–January 2022) while ‘Delta’ corresponds to samples contemporary to Alpha (May–September 2021).
The clearance of subgenomic RNAs is currently debated, and several studies have suggested that these are markers of acute infection and active replication,2,6,7,15, 16, 17, 18 while some have indicated higher subgenomic RNA levels in patients with mild disease compared to severe cases.19,20 We have recently reported that subgenomic RNAs are relatively stable and can be detected in samples up to 19 days post symptom onset and are correlated with the amount of genomic length RNA present within a sample.14,20 A recent study confirmed the strong correlation between genomic and subgenomic Nucleocapsid (N) abundance in Alpha, Delta and Omicron sub-lineages and also highlighted that subgenomic RNA offered no advantage over genomic RNA as an indicator of virus activity.21 The persistence of these molecules could be facilitated by the double-membrane structures created in the cytoplasm of infected cells during the replication and transcription of SARS-CoV-2.14,22[preprint]
Recent work assessing subgenomic RNA levels across different SARS-CoV-2 variants, e.g. Alpha, Delta, and Omicron, has been performed in the United Kingdom (UK),23[preprint],24 highlighting differences in subgenomic RNA abundance between sub-lineages and between particular subgenomic RNA molecules. Using 4000 SARS-CoV-2 samples taken from healthcare workers, hospitalized patients, and the community, Parker et al.24 reported statistically significant overrepresentation of subgenomic spike, nucleocapsid, envelope and membrane in sub-lineage B.1.1.7 (VOC Alpha) compared to the UK variant B.1.177. In another study, increased levels of subgenomic N.iORF3, a C-terminal fragment of nucleocapsid, were observed in Alpha and Omicron, but absent in Delta samples.23[preprint]
In our study, we used a relatively large number of oropharyngeal samples (n = 12,324) collected by the Danish COVID-19 surveillance effort from both the community and healthcare tracks, representing different sub-lineages of SARS-CoV-2. We mapped short sequence reads from routine, multiple amplicon-based, whole genome sequencing runs to quantitate reads derived from SARS-CoV-2 subgenomic RNAs, and by using electronic health registries available in Denmark, we assessed the association of subgenomic RNA levels with demographic and clinical variables of the infected individuals. We found that subgenomic RNAs were detectable in individuals irrespective of their medical history or COVID-19 symptoms. Furthermore, the observed levels of subgenomic RNAs were essentially driven by the lineage of the virus, by the differences among the individual subgenomic RNAs and, to a lesser extent, by viral load. Our results suggest that there is no statistically significant relationship between the levels of observed subgenomic RNAs and host-related factors, but that different variants may have slight differences in the abundance of specific subgenomic RNAs. The biological significance of this observed difference is presently unclear but may be related to differences in target cell tropism, early virus replication/transcription kinetics or simply caused by sequence features of particular variants.
Methods
SARS-CoV-2 mass test system in Denmark
During 2020–2022, Denmark established an extensive, openly available, free of charge nationwide SARS-CoV-2 mass test system. Over 63 million PCR-tests and 60 million rapid antigen tests have been performed on the Danish population (5.86 million inhabitants) and 91.1% of the population has been PCR-tested at least once. This represents one of the highest PCR mass test capacities in the world during that period.25[preprint] Tests were organized in two tracks: a community track, which was openly available and free of charge at the time of data inclusion, and a healthcare track, which analyzed all samples in the first phase of the epidemic and subsequently followed a clinical approach, where patients were tested prior to hospitalization or based on medical referrals. The healthcare track also included samples collected from healthcare personnel and from primary care. All samples were collected using oropharyngeal swabs as described.25[preprint] The mass test strategy changed towards the late phase of the epidemic: from March 10, 2022, only individuals in high-risk groups (elderly or persons with chronic diseases) were recommended to be tested upon symptoms.
Whole genome sequencing effort in Denmark
One of the key components of the surveillance of SARS-CoV-2 in Denmark has been the extensive use of whole genome sequencing (WGS) with a community and a healthcare track. The community track had a sequencing capacity of up to 15,000 samples per week between March and November 2021 and 4000 samples per week afterwards. The healthcare track contains samples from clinics and hospitals, part of which are sequenced regionally at the Departments of Clinical Microbiology, and the remainder at Statens Serum Institut (SSI). Sequenced samples were selected from PCR-positive samples. Briefly, at SSI, WGS was performed using the ARTIC v3 amplicon sequencing panel26 (https://artic.network), consisting of 98 amplicons, with slight modifications including primer spike-ins (implemented continuously throughout the pandemic to maintain consistent amplicon coverage). Samples were identified by barcodes and amplicons were divided into two non-overlapping pools to limit possible artefacts. Samples were sequenced on either the NextSeq or NovaSeq platforms (Illumina) with paired read lengths ranging from 51 to 150 nucleotides. An internal quality control (QC) step rejected samples or whole sequencing plates in case of, e.g., poor sequence quality or suspected contamination. Consensus sequences were called using an in-house implementation of iVar27 or a combination of iVar and a custom BCFtools command for consensus calling. Subvariants were called on all the generated consensus sequences containing fewer than 3000 missing and fewer than 5 ambiguous sites using Pangolin with the PangoLEARN assignment algorithm.28
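The consensus filter applied before lineage calling can be illustrated with a minimal Python sketch. It assumes that "missing" sites are represented as `N` and that "ambiguous" sites are the remaining IUPAC ambiguity codes; both the function name `passes_consensus_qc` and this interpretation are assumptions made for illustration, not a description of the in-house pipeline.

```python
# IUPAC ambiguity codes other than N (assumed here to define "ambiguous" sites).
AMBIGUOUS = set("RYSWKMBDHV")


def passes_consensus_qc(seq, max_missing=3000, max_ambiguous=5):
    """Return True if a consensus sequence has fewer than `max_missing`
    N characters and fewer than `max_ambiguous` ambiguity codes."""
    seq = seq.upper()
    missing = seq.count("N")
    ambiguous = sum(1 for base in seq if base in AMBIGUOUS)
    return missing < max_missing and ambiguous < max_ambiguous
```

Only sequences for which this check returns `True` would be passed on to lineage assignment in this sketch.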
Data collection and sample inclusion
This study was performed using data from the Danish COVID-19 Genome Consortium (https://www.covid19genomics.dk) as part of the national surveillance of the pandemic in Denmark. Samples of each lineage of interest (Alpha, Delta, BA.1, BA.2, and BA.5) were selected from samples collected during the period March 1, 2021–June 18, 2022. All samples have a complete consensus sequence and raw reads deposited in the EpiCoV database of the Global Initiative on Sharing All Influenza Data (GISAID, https://www.gisaid.org).29
Between March 1, 2021, and June 18, 2022, a total of 465,442 sequenced SARS-CoV-2 samples passed internal QC criteria for their assembly, and their consensus sequences and raw reads were shared to the EpiCoV database. Samples used in the study were selected from this dataset as follows: for each lineage of interest, subsets of up to 2500 samples were randomly selected from the collection based on read data availability. Each sample was sequenced using the Illumina NextSeq platform at SSI with 74-nucleotide paired-end reads. The original data collection was performed in February 2022. The dataset initially included up to 2500 samples from each of the Alpha, Delta, BA.1 and BA.2 lineages and was subsequently expanded to later Omicron sub-lineages by matching this per-group sample size. Additionally, samples within each lineage subset have a mean sequence coverage within the inter-quartile range of all genomes of that lineage collected within the considered time frame. An exclusion criterion was also applied, based on the amplification profiles of the amplicons of interest across the data collection period (Supplementary Fig. S1): samples collected in March–April 2021 and April–May 2022, which had low aggregated coverage in amplicons 1 and 20, were removed from further analyses to ensure that reads were obtained and investigated only from samples with similar coverage across these amplicons.
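The selection logic described above (coverage within the inter-quartile range of the lineage, then a random draw of up to 2500 samples) can be sketched as follows. This is an illustrative Python sketch only: the record layout (`id`, `lineage`, `mean_coverage`), the function name `select_samples`, the quartile method and the seed are assumptions, and the date-based exclusion step is not shown.

```python
import random
import statistics


def select_samples(samples, max_per_lineage=2500, seed=0):
    """Randomly pick up to `max_per_lineage` samples per lineage among those
    whose mean coverage lies within the lineage's inter-quartile range.

    `samples` is a list of dicts with the (hypothetical) keys
    'id', 'lineage' and 'mean_coverage'.
    """
    rng = random.Random(seed)
    by_lineage = {}
    for sample in samples:
        by_lineage.setdefault(sample["lineage"], []).append(sample)

    selected = {}
    for lineage, group in by_lineage.items():
        coverages = [s["mean_coverage"] for s in group]
        # Quartiles of the lineage-wide coverage distribution.
        q1, _, q3 = statistics.quantiles(coverages, n=4)
        eligible = [s for s in group if q1 <= s["mean_coverage"] <= q3]
        k = min(max_per_lineage, len(eligible))
        selected[lineage] = rng.sample(eligible, k)
    return selected
```

Fixing the seed makes the draw reproducible; the quartile definition used by `statistics.quantiles` may differ slightly from the one used in the original analysis.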
An inclusion flow diagram, detailed information on the number of included samples, mean coverage and time windows are shown in Fig. 2 and Table 1 and the list of included samples with their alias and accession IDs on GISAID's EpiCoV database is available in Supplementary Table S1.
Fig. 2.
Sample inclusion and analysis workflow. SARS-CoV-2 genome alignments were collected from the routine Danish COVID-19 genomic surveillance sample collection. Included samples were randomly selected among those sequenced with Illumina paired-end reads of 74 nucleotides and having a mean coverage within the normal range.
Table 1.
Summary information of SARS-CoV-2 specimens included in the study.
Lineage | N | N (with complete clinical metadata) | Mean coverage | Median coverage | Median mapped reads | Earliest collection date | Latest collection date |
---|---|---|---|---|---|---|---|
BA.5 | 2084 | 42 | 3373 | 2987 | 1,739,672 | 13/06/2022 | 18/06/2022 |
BA.2 | 2306 | 333 | 3401 | 3555 | 2,067,029 | 14/12/2021 | 17/06/2022 |
BA.1 | 2297 | 473 | 3261 | 3245 | 1,895,626 | 28/11/2021 | 29/03/2022 |
Late Delta | 1863 | 724 | 3794 | 3926 | 2,322,849 | 21/11/2021 | 22/01/2022 |
Delta | 1870 | 1281 | 3541 | 3574 | 2,092,134 | 31/05/2021 | 13/09/2021 |
Alpha | 1904 | 1142 | 1902 | 1808 | 1,036,346 | 03/05/2021 | 10/08/2021 |
Full dataset | 12,324 | 3995 | 3376 | 3059 | 1,786,170 | | |
For each lineage group, the number of included samples with available sequence alignment file, mean and median sequence coverage and collection time point is shown.
Identification of subgenomic mRNAs
Sequence reads derived from subgenomic mRNAs were identified using a methodology adapted from the one used in our previous study14 to support the use of Illumina paired-end read sequencing. In the first step, we searched for all forward reads overlapping the expected positions of the leader motif, nucleotides 52–67 in the Wuhan-Hu-1 NC_045512/MN908947.3 reference sequence.14 Subgenomic mRNA reads were then classified based on the start position of the mate read anchored to the 3′-end of the subgenomic mRNA. We used the assumed 3′-end positions of the potential subgenomic mRNAs as described previously14: S: 21,552, Orf3a: 25,385, E: 26,237, M: 26,469, Orf6: 27,041, Orf7a: 27,388, Orf8: 27,884, and Orf9: 28,256.
For S, E, M and Orfs 6, 7a, and 8, we allowed a tolerance window up to the position of the AUG codon of the following open reading frame. For Orf9 (Nucleocapsid/3′ open reading frames), Orf9a and 9b were pooled together and all positions between 28,250 and the 3′-end position were considered since Orf10 subgenomic reads were not found.
A pseudo count of 1 subgenomic RNA was added to each identified position to account for missingness due to the lower limit of detection. The missingness profile per subgenomic site and per lineage can be found in Supplementary Fig. S2. Finally, since the median total read count of our collected samples was 1.78 million, the raw occurrence counts were normalised to 2 million total reads to make the subgenomic mRNA observations comparable across samples.
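The classification and normalisation steps above can be sketched in Python as follows. This is a simplified illustration, not the pipeline used in the study: each read pair is reduced to three 1-based coordinates, the tolerance windows are approximated by assigning a mate to the closest listed anchor at or upstream of its start position (with the Orf9 window opening at 28,250 as stated), the genome length of 29,903 nucleotides is that of the Wuhan-Hu-1 reference, and the pseudo count is assumed to be added before scaling. All function and variable names are hypothetical.

```python
from bisect import bisect_right

LEADER = (52, 67)      # leader motif positions in the reference
GENOME_END = 29903     # length of the Wuhan-Hu-1 reference
# Window start per site; each window is assumed to run up to the next start.
WINDOWS = [("S", 21552), ("Orf3a", 25385), ("E", 26237), ("M", 26469),
           ("Orf6", 27041), ("Orf7a", 27388), ("Orf8", 27884), ("Orf9", 28250)]
NAMES = [name for name, _ in WINDOWS]
STARTS = [start for _, start in WINDOWS]


def classify_pair(fwd_start, fwd_end, mate_start):
    """Return the subgenomic site of a read pair, or None if the forward
    read misses the leader motif or the mate falls outside all windows."""
    if fwd_start > LEADER[1] or fwd_end < LEADER[0]:
        return None
    if mate_start < STARTS[0] or mate_start > GENOME_END:
        return None
    return NAMES[bisect_right(STARTS, mate_start) - 1]


def normalised_counts(pairs, total_reads, scale=2_000_000):
    """Count pairs per site, add one pseudo count, and scale the result
    to `scale` total reads."""
    raw = {name: 0 for name in NAMES}
    for fwd_start, fwd_end, mate_start in pairs:
        site = classify_pair(fwd_start, fwd_end, mate_start)
        if site is not None:
            raw[site] += 1
    return {name: (count + 1) * scale / total_reads for name, count in raw.items()}
```

Under these assumptions, a sample with one million total reads and a single leader-to-S read pair would yield (1 + 1) × 2,000,000 / 1,000,000 = 4 normalised counts for S, and 2 for every site with no observation.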
Clinical information
To investigate potential associations between subgenomic RNA levels and patient phenotypes, demographic and clinical information for each case was obtained from several Danish registries, including the national Microbiology Database (MiBa), the national patient registry (Landspatientregisteret, LPR), and the vaccination registry. Data collection was facilitated by the Civil Registration System in Denmark, which assigns a unique 10-digit identifier (CPR-number) to every resident, allowing information to be linked at the individual level.30 Collected information includes patient demographics, vaccination status, SARS-CoV-2 infection history, COVID-19 symptoms (Yes/No), hospital admissions (Yes/No), and disease history (Yes/No), within the five years prior to infection, for the conditions listed and defined in Supplementary Table S2. Previous history of diabetes, asthma and chronic obstructive pulmonary disease was provided by the registry of Chronic Diseases and Severe Mental Disorders (RUKS). While demographic and vaccination information, medical history, and hospital admissions were available for the whole study population, information on symptoms, provided by the Danish Patient Safety Authority (STPS), was based on patient interviews and only available for a subset of cases.
Statistical analyses
Putative associations of subgenomic mRNA levels with demographic and clinical parameters were assessed using a 10-fold cross-validated elastic net, repeated 50 times, and linear regression models. Lineage, subgenomic mRNA site, previous medical history, vaccination status, presence/absence of symptoms during infection, age group, sex, and geographic location were encoded as dummy variables. Variable importance was reported as the mean of the absolute standardized coefficients obtained from each cross-validated elastic-net iteration and visualized as a bar plot. Comparisons of subgenomic mRNA levels between lineages and other groups (hospital admissions, vaccination status, presence or absence of symptoms) were performed using the non-parametric Kruskal–Wallis and Wilcoxon rank-sum tests without multiple testing correction. All statistical analyses were performed using R 3.6.3 and the packages glmnet (version 4.1-6)31 and caret (version 6.0-93)32; visualizations were produced using ggplot2 (version 3.4.0)33 and ggpubr (version 0.5.0).34
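The variable-importance summary defined above (mean of the absolute standardized coefficients over the repeated elastic-net fits) can be illustrated with a short Python sketch. The model fitting itself is not shown; the input is assumed to be one dictionary of standardized coefficients per repetition, and variables that a given fit shrank to zero or dropped are assumed to contribute zero. The function name and the variable names in the usage note are hypothetical.

```python
from collections import defaultdict


def variable_importance(coef_runs):
    """Average the absolute standardized coefficients over repeated fits.

    `coef_runs` is a list with one dict per cross-validated fit, mapping a
    dummy-variable name to its standardized coefficient. A variable absent
    from a run is treated as having coefficient 0 in that run.
    Returns (name, importance) pairs sorted by decreasing importance.
    """
    totals = defaultdict(float)
    for run in coef_runs:
        for name, coef in run.items():
            totals[name] += abs(coef)
    n_runs = len(coef_runs)
    importance = {name: total / n_runs for name, total in totals.items()}
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)
```

For example, two runs with coefficients `{"lineage_Alpha": 0.8, "ct": -0.4}` and `{"lineage_Alpha": 0.6, "ct": -0.6}` give importances of about 0.7 and 0.5. The sign of the coefficient is discarded here, so directionality has to be read from the fitted coefficients separately.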
Cutoffs between the high and low subgenomic RNA abundance densities were calculated dynamically using peakPick 0.11, a peak calling algorithm available as an R package.35
Principal component analysis
The population structure of the study dataset, based on available clinical and demographic information, and the stratification of subgenomic mRNA levels across lineages were assessed using principal component analysis and visualized as a biplot using the R packages ggfortify (version 0.4.14)36 and cluster (version 2.1.4).37
Ethical statement
Samples included in this study were collected as part of the Danish COVID-19 surveillance and, according to Danish law, ethical approval is not required for this type of study. The research was approved by the legal advisory board at SSI, a public research institute under the auspices of the Danish Ministry of Health. The study contains aggregated results without identifiable personal data and therefore complies with the European General Data Protection Regulation (GDPR).
Role of funders
The project was funded by Statens Serum Institut, but the institution had no role in the design of the study, in the collection, analyses or interpretation of data, in the writing of the manuscript or in the decision to publish the results. The author(s) received no specific funding for this work.
Results
Identification of subgenomic mRNAs using whole genome sequencing data from the Danish COVID-19 surveillance
A collection of 12,324 SARS-CoV-2 genomes from the Danish surveillance of COVID-19, collected during March 1, 2021–June 18, 2022, was assembled, consisting of isolates of the Alpha, Delta, BA.1, BA.2, and BA.5 lineages. Delta samples were divided into two distinct subgroups labelled ‘Delta’ for collection times contemporary with Alpha (May–September 2021) and ‘Late Delta’ for collection times contemporary with BA.1 and BA.2 (November 2021–January 2022), to account for the distant collection periods and for internal validation purposes. The demographic characteristics of the COVID-19 positive individuals included in the study were as follows: median age 32 (s.d. 20), 49.9% female and 50.1% male. A detailed description of the sample selection process and sequencing is given in Table 1 and Fig. 2.
All samples included in the study were sequenced at SSI using the same Illumina NextSeq platform with a paired-end read length of 74 nucleotides. Subgenomic mRNAs were identified based on the positions of the forward and reverse reads, with counts normalised after the addition of one pseudo count (see Methods). Fig. 1B shows the abundance of each subgenomic mRNA per site, for each lineage, in counts per two million total reads. Detailed summary statistics (medians, standard deviations) of the abundance of each subgenomic RNA in each lineage are available in Supplementary Table S3.
Similar to the observations made in our previous study14 and other works,19,38 we were not able to detect any subgenomic Orf10. Among the eight screened subgenomic mRNAs, four were expressed in relatively higher abundance: Spike (S), M, Orf6, and Orf9 (N and any 3′-end open reading frames). The other four, Orf3a, E, Orf7a, and Orf8, were in very low abundance, with fewer than two counts. We were only able to detect these four consistently in Alpha samples, albeit in very low quantity, since the median abundance for the other lineages was below the lower detection limit of 1.95 reads per 2 million reads, estimated as three times the standard deviation of the count per two million reads of samples with only one pseudo count. Subsequent analyses were therefore focused on the four most abundant subgenomic RNAs: S, M, Orf6, and Orf9.
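The detection limit described above can be written out as a small Python sketch. It assumes that a sample with no observed subgenomic read at a site contributes a normalised value of 1 × 2,000,000 / (total reads), and that the limit is three times the sample standard deviation of these values; whether the original analysis used the sample or the population standard deviation is not stated, and the function name is hypothetical.

```python
import statistics


def detection_limit(total_reads_of_empty_samples, scale=2_000_000, k=3):
    """Estimate the lower detection limit as k times the standard deviation
    of the normalised pseudo count among samples with no observed read."""
    baseline = [1 * scale / total for total in total_reads_of_empty_samples]
    # Sample standard deviation (n - 1 denominator); this is an assumption.
    return k * statistics.stdev(baseline)
```

Because the pseudo count is scaled by sequencing depth, samples with fewer total reads receive a larger baseline value, which is what gives the baseline a non-zero spread.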
Subgenomic S was more abundant in Alpha (median 4.44 counts per two million total reads) than in Omicron (median 2.90, 2.46 and 3.21 in BA.1, BA.2 and BA.5, respectively) and was present in Delta samples, but below the lower detection limit (medians 1.90 and 1.87 in the two Delta groups). For the latter, we could only say that the subgenomic RNA was found but not quantifiable. Subgenomic Orf6 was found in all the investigated lineages. Interestingly, observed levels were similar in Delta and Omicron (median from 2.07 to 2.36 reads per two million) while levels in Alpha were higher (median 4.42). Subgenomic M was found in relatively high abundance only in Alpha and Delta (8.92 and 4.38 respectively), while levels in Omicron sub-lineages were at the lower limit of detection (median from 1.41 to 1.96). Orf9 was found at relatively high levels in all lineages, with Alpha and Omicron having higher levels than the Delta groups (median 6.07, 6.34, 5.64, 7.43, 4.71, and 4.96 respectively). Subgenomic RNA levels differed significantly between the Omicron sub-lineages BA.1, BA.2 and BA.5 for Orf9, S, M and Orf6, with levels in BA.5 being higher than in the earlier Omicron sub-lineages (Wilcoxon rank-sum test p < 1e-6 for each BA.5–BA.2 and BA.5–BA.1 comparison of each subgenomic RNA). No statistically significant differences were found in the median subgenomic RNA levels between the Delta and Late Delta sample groups, which supports the consistency of the observations over a long collection period.
We performed a principal component analysis (PCA) on the subgenomic mRNA counts per two million total reads to further describe the main factors that contribute to the subgenomic RNA levels. The biplot in Fig. 3 shows that the most important loadings come from Orf9, membrane (M), Orf6 and spike (S). Furthermore, we observed that Alpha and Delta samples fell into one cluster (lower cloud) while Omicron BA.1, BA.2 and BA.5 samples fell into another (upper cloud). Together, PC1 and PC2 explained 89.52% of the total variance.
Fig. 3.
Principal component analysis of the subgenomic RNA levels across the different sites. The biplot shows the loadings of the two leading components, colored by lineage. Together, PC1 and PC2 explain 89.52% of the variance.
No association of subgenomic mRNA levels with host and demographic factors
The samples collected in our study benefit from the highly scrutinized health care registries available in Denmark. In addition, some of the individuals were interviewed to survey their symptoms during their SARS-CoV-2 infection. In total, out of 12,324 included samples, we were able to retrieve complete clinical and demographic information for 3995 individuals. The main distinct subgroups of individuals in the study could be characterized as having one of the following conditions: diabetes or cardiovascular or respiratory diseases as shown in the biplot of the first two leading components showing the structure of the included population based on the available medical history data (see Supplementary Fig. S3).
A 10-fold cross-validated elastic-net was performed 50 times to determine which host-related variables had a significant effect on subgenomic mRNA levels. Fig. 4 shows the mean standardized importance of the variables we examined. The results indicate that Alpha lineage, site of the subgenomic RNA and Ct value contributed the most to subgenomic mRNA levels. None of the frequent comorbidities in the study population outlined by the PCA were statistically associated with subgenomic RNA levels. Vaccination status showed little effect on subgenomic mRNA levels and no association was found with the presence or absence of COVID-19 symptoms. For validation, we compared the subgenomic RNA levels in Alpha and Delta cases with and without symptoms and found no differences between the distributions of subgenomic RNA counts per 2 million (Wilcoxon rank-sum tests p = 0.18 and p = 0.16 respectively). Looking at the directionality of the associations, Alpha lineage, Orf9 and M were positively associated with subgenomic RNA levels while higher Ct values were associated with lower subgenomic RNA levels. A detailed assessment of the effect sizes was performed using a linear regression model, shown in Supplementary Table S4. To investigate whether subgenomic RNA levels measured in samples collected from the healthcare track were associated with host-related variables, we performed a separate analysis of the 906 samples collected from hospitals. The cross-validated elastic-net was performed on the same variables, except those for which data were incomplete or missing (Ct value, presence or absence of symptoms, medical history from the RUKS database). The results showed no association with host-related variables except a very marginal effect for the variable ‘newly admitted’ to the hospital, and that the variability of subgenomic RNA levels was essentially driven by the different subgenomic RNAs (Supplementary Fig. S4).
Fig. 4.
Variable importance plot of the factors most strongly associated with subgenomic mRNA count per 2 million. Variables selected by the 10-fold cross-validated elastic-net, repeated 50 times, are shown in order of importance. The analysis was performed on a population subset of n = 3995 with all the available host-associated data. Definitions of all clinical variables are available in Supplementary Table S2.
Altogether, these results are in line with the ones shown in Fig. 1, as the variance of the data is essentially explained by the site of the subgenomic mRNA and lineage group.
Samples with low subgenomic RNA levels are overrepresented in high Ct values
In our study, while we addressed missing observations using a pseudo count, we found within certain groups of samples a low-abundance subset that contrasted with the rest of the data, which had higher counts. To further investigate these bimodal distributions, we focused on four sample groups with high missingness: Alpha-subgenomic Orf9, BA.1-subgenomic M, BA.2-M and BA.5-M. These were examined by looking at abundance as a function of Ct value. Fig. 5 shows the density profiles in these groups and the statistical comparison between samples with high and low subgenomic RNA counts, defined by a dynamic cutoff calculated using an algorithm previously described.35 In each panel, we could distinguish two clouds of datapoints, one with a higher abundance and one with a very low abundance. When comparing the Ct values of the low- and high-abundance clouds, we found that Ct values were significantly higher when subgenomic RNA levels were low. A similar result was described in another study24 in which a weak but statistically significant negative correlation was found between raw subgenomic RNA reads and E cycle threshold. This analysis supports the estimated effect size and directionality of Ct on subgenomic RNA levels described above (Supplementary Table S4). This suggests, therefore, that the amount of detectable subgenomic RNA is correlated with viral load (a lower Ct corresponding to a higher viral load). The observed Ct value for measuring at least 2 subgenomic reads (one read and a pseudo count) per 2 million total reads was 26.5 in our study.
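The comparison of Ct values between the low- and high-abundance clouds can be sketched in Python as a two-sided Wilcoxon rank-sum (Mann-Whitney U) test. The sketch below uses the normal approximation with a tie correction and no continuity correction, so its p-values may differ slightly from those of the R implementation used in the study. The cutoff is taken as given (it stands in for the dynamically computed one), both groups are assumed to be non-empty with at least two distinct values overall, and all names are hypothetical.

```python
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided rank-sum test (normal approximation, tie-corrected).
    Returns (U statistic of x, p-value)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0, 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of ranks i+1 .. j
        size = j - i                        # size of this tie group
        tie_term += size ** 3 - size
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


def compare_ct_by_abundance(samples, cutoff):
    """Split (abundance, ct) pairs at `cutoff` and test low vs. high Ct."""
    low = [ct for abundance, ct in samples if abundance < cutoff]
    high = [ct for abundance, ct in samples if abundance >= cutoff]
    return rank_sum_test(low, high)
```

A large U for the low-abundance group together with a small p-value would correspond to the pattern reported here, i.e. higher Ct values among samples with low subgenomic RNA counts.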
Fig. 5.
Density pattern of subgenomic RNA abundance vs. Ct values in groups of samples with low counts. Density of abundance vs. Ct value is shown for four groups with high missingness compared to other lineages: A) Alpha lineage, Orf9 (35% of samples with no observations), B) BA.5, M (60%), C) BA.2, M (69%), and D) BA.1, M (71%). Cutoffs between sample groups with low (yellow) and high (blue) subgenomic RNA levels are defined dynamically based on the density profile of observations. A non-parametric test is performed between the groups with high and low abundance and standard significance cut-offs are applied (∗∗: 0.01, ∗∗∗: 0.001, ∗∗∗∗: 0.0001).
Discussion
In this study, we performed a comparative analysis of the abundance of subgenomic RNA across a diverse collection of SARS-CoV-2 samples consisting of the sub-lineages Alpha, Delta, Omicron BA.1, BA.2, and BA.5. Using various statistical approaches, we also thoroughly investigated possible associations between changes in the measured levels and the available demographic and clinical parameters of the individuals from whom the samples were collected. Similar to our previous findings,14,22[preprint] we showed that subgenomic RNAs were detectable in essentially all our samples, albeit at low abundance for Orf3a, E, Orf7a and Orf8, and that significant differences could be observed between Alpha, Delta and Omicron.
The Danish national surveillance of COVID-19 offered a unique opportunity to study the profiles of subgenomic mRNA abundance across a large and unbiased population. This is due to a comprehensive data collection based on nationwide sampling of SARS-CoV-2 infections combined with a thorough collection of clinical information from highly digitalized national health registries. Samples used in our study were randomly picked from this collection, essentially based on the coverage of all samples collected for the sub-lineages of interest, to minimize fluctuations due to sequencing aspects so that we could better focus on patient-associated variables in our analysis. However, since samples were collected and sequenced at different time points, some fluctuations in sequencing yield over time could be observed across the amplicons covering the Leader region (amplicon 1) and the 5′ ends of the ORFs of interest (amplicons 20, 83, 87, 89, 90, 92, 93). Using aggregated coverage profiles of these amplicons (Supplementary Fig. S1), we identified two time windows, March–April 2021 and April–May 2022, during which amplicons 1 and 20 were of lower abundance, and decided for clarity not to include samples from these time points in the main results. In Supplementary Fig. S5, the abundance profiles of all the collected samples, including the omitted time windows, showed that the overall result and conclusions were unchanged, apart from a slight normalization bias for the non-included sample groups of lineages BA.5, BA.4 and BA.2.12.1, which seemed to have higher median subgenomic RNA levels than samples collected before and after the omitted collection periods. We further addressed potential fluctuations of sequencing depth by normalizing our observations per 2 million total mapped reads.
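The depth normalization mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the function name is hypothetical, and adding the pseudo count of one read before scaling (rather than after) is an assumption, since the text only states that a pseudo count was used and that observations were normalized per 2 million total mapped reads.

```python
PSEUDO_COUNT = 1      # assumption: one pseudo read added to every observation
SCALE = 2_000_000     # normalization per 2 million total mapped reads


def normalised_sgrna(sg_reads: int, total_mapped_reads: int) -> float:
    """Return subgenomic reads (plus pseudo count) per 2 million mapped reads.

    Hypothetical helper; the order of pseudo count and scaling is assumed.
    """
    if sg_reads < 0:
        raise ValueError("sg_reads must be non-negative")
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return (sg_reads + PSEUDO_COUNT) * SCALE / total_mapped_reads
```

Under these assumptions, a sample with a single subgenomic read among 2 million mapped reads gives a value of 2, which matches the "one read and a pseudo count" threshold described in the Results. A sample with no subgenomic reads still receives a small non-zero value, so it can be placed on a logarithmic axis.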
Despite the above-described procedures and continuous updates of amplicon primers to maintain consistent coverage, variations in coverage of individual amplicons between sub-lineages exist (Supplementary Fig. S6). However, these differences do not impact our association analyses with host-related factors since we corrected the models for lineage. It is also important to note that previous studies24 have used a similar study set-up, from sample collection using community and hospital tracks to amplicon-based sequencing using primer sets from the ARTIC Network.
Unlike in our previous study, in which we used a sequencing approach designed to identify subgenomic RNAs based on a longer-read, targeted PCR amplicon approach for each TRS region, in this analysis we used genomic data from the ongoing national surveillance of COVID-19, with reads partially covering the leader and the 5′ ends of the subgenomic ORFs across the template switching loci. Although we focused on known subgenomic RNAs, the analysis could be adapted in future studies to detect novel subgenomic RNAs by searching for abundant read pairs with mate positions outside the targeted positions.
Although the overall observed abundance of subgenomic RNAs was lower than in targeted approaches, we were consistently able to detect subgenomic RNAs in our samples, demonstrating that detection is feasible using routine surveillance data. In this regard, our findings suggest that the ability to detect subgenomic RNA using amplicon-based paired-end Illumina sequencing is sensitive to Ct values (and hence to the amount of viral RNA in the samples) and that the levels of detected subgenomic RNAs are inversely correlated with Ct, as previously observed.24
Overall, we found Alpha to have more subgenomic S, Orf6 and M than Delta and Omicron. Interestingly, in Omicron, subgenomic S and Orf9 (N) appeared to be slightly more abundant than in Delta, whereas the opposite appeared to be the case for subgenomic M, with higher levels in Delta than in Omicron. Differences in subgenomic RNA levels between variants have been previously described, both in swab samples and cell cultures.23[preprint],24 In a previous study, we showed that subgenomic RNA levels in 48-h cell culture were similar to those of swab samples.22[preprint] Interestingly, a comparative study of subgenomic RNAs in SARS-CoV, SARS-CoV-2 and MERS-CoV39 showed that different strains of coronaviruses may express subgenomic RNAs at different levels, in particular the non-core ones. The authors suggested that these variations may be involved in mechanisms of host adaptation. As the virus evolves, many changes may affect viral fitness, which may vary over time and from variant to variant.40 For example, it has been previously described that the cell entry mechanism changed in Omicron compared to Alpha and Delta, with a switch from the ACE2-TMPRSS2 entry pathway to the endocytosis-Cathepsins pathway. This switch apparently resulted in a reduced ability to form syncytia and a change in tropism from lower to upper respiratory tract cells, and is likely associated with the higher transmission efficiency of Omicron and its subvariants.41, 42, 43, 44, 45, 46, 47, 48[preprint] Furthermore, the membrane protein plays an important role in virus assembly, in the recruitment of other structural proteins such as N, in Spike protein localisation and in syncytia formation.49,50 Some studies suggested that Omicron BA.5 may have reverted to the initial entry pathway used by Delta and Alpha, with potentially increased pathogenicity;51,52 however, this scenario may require further investigation.
In addition to the differences between Alpha, Delta and Omicron, we found that subgenomic RNA levels differed between BA.1/BA.2 and BA.5 for Orf9, S, M and Orf6, with levels in BA.5 higher than in the previous Omicron sub-lineages. Based on our results, we are not able to provide any causal explanation for these differences. We can only hypothesize that the differences between Alpha/Delta and Omicron are unlikely to be due to differences in their paths of entry but could be related to the evolution/optimization of virus transcription stoichiometry during adaptation to and global circulation in humans. We speculate that Alpha produces more subgenomic RNAs, possibly because it is less adapted to the new host (humans) and thus less efficient in transcription and replication, while Delta has intermediate levels and the Omicron sub-lineages may represent a more optimized transcription strategy with increased relative levels of S and N open reading frames and reduced relative levels of M subgenomic RNAs. Further investigations involving in vitro experiments may elucidate the underlying mechanisms linked to these variations. With regard to potential associations between observed levels of subgenomic RNAs and severity of COVID-19, we found no statistically significant correlations with the presence of symptoms, hospital admissions, vaccination status of the patients, disease occurrence history in the past five years or demographic characteristics such as age or sex. This differs from other works which linked subgenomic RNA levels with active viral infection.
For example, one study reported decreased subgenomic RNA levels in severe cases19 while others described a correlation between genomic and subgenomic RNA levels, which was interpreted in these studies as a surrogate of viral activation and transcription.15, 16, 17, 18 In particular, the previously described18 apparent correlation between genomic and subgenomic Orf7 could be explained by the presence of subgenomic RNA in roughly the same proportion throughout the course of infection, in synchrony with the presence of genomic RNA. In contrast, our data show that most of the explained variance of the observed subgenomic RNA levels lies in the variables designating lineage (Alpha or not) and, in particular, the type of subgenomic RNA (S, Orf6, M, Orf9). We therefore suggest that the effect of disease severity or other host-associated factors on subgenomic RNA levels, if it exists, is modest compared to genomic factors such as lineage and type of subgenomic RNA. While our cross-validated analysis showed weak or no evidence of association of subgenomic RNA levels with host-associated factors, it should be noted that this analysis was done on a fraction of the collected samples (3995/12,324, 32%), with a majority of datapoints collected among patients infected by Alpha and Delta variants (n = 3147) due to the data availability from the RUKS data source. We repeated the association analysis with the largest possible sample size (n = 11,365) by excluding all the sparsely available variables (Have symptoms STPS, RUKS diabetes type 1 and 2, RUKS Asthma, RUKS KOL). We again found that subgenomic RNA levels were not, or only weakly, associated with host-related factors and that the most significantly associated factors were lineage (Alpha or not) and ORF9 (see Supplementary Fig. S7).
Additionally, patient information on COVID-19 symptoms was based on interviews. The large increase in Omicron infections in December 2021–January 2022 limited the capacity of the Danish Patient Safety Authority to conduct interviews, and patients infected with Omicron sub-lineage BA.1 had less severe outcomes than those infected with Delta.53 We therefore cannot exclude that positive selection towards patients likely to have symptoms occurred and that patients infected by Omicron were under-represented. Among the individuals with available clinical data, nearly all reported symptoms (3792/3995, 94%), and only a small fraction were Omicron infections (848/3995, 20%).
In addition, only 72 hospital admissions were reported in our dataset. Of these, 58 were patients infected with Alpha or Delta. Three studies have been performed in Denmark to compare the risk of hospitalization54, 55, 56 between different epidemic waves. The fraction of hospitalized patients with a critical outcome is relatively low, approx. 5.8% during the second wave (July–December 2020),54 and the risk of hospitalization is lower among infections with Omicron than with Delta.56 In our analysis focusing on the subset of 906 samples collected from the healthcare track, we found only a marginal effect of new admissions compared with the effect of the type of subgenomic RNA. A new analysis to better assess the association between hospitalizations and subgenomic RNA levels would require collecting more samples from the early time points (Alpha and Delta lineages).
Altogether, we have performed a comprehensive survey of subgenomic RNA occurrences across diverse SARS-CoV-2 lineages and across a long collection time window. Despite the challenges associated with using routine sequencing data for epidemic surveillance, we were able to show clear differences in abundance between samples of different sub-lineages and for different types of subgenomic RNAs. We are confident that our data support the conclusion that the contribution of disease severity and other patient-associated factors is non-significant in terms of effect size compared with lineage. Our data also suggest that these subgenomic RNAs may be produced during the replication/transcription cycle of the virus and that their presence and abundance are highly correlated with the abundance of virion RNA. Further investigation including in vitro experiments would be required to better describe the differences between sub-lineages and to define whether the differences observed may be part of an evolving optimization of SARS-CoV-2 replication and transcription in the human host.
Contributors
S.A. conceptualized the study, M-H.E.T. and K.N. performed the analyses, S.M.E. assembled the patient metadata, and K.E. provided the WGS QC data. M-H.E.T., K.N., S.A. and M.S. had full access to all the data in the study and verified the underlying data reported in the manuscript. M-H.E.T. drafted the initial manuscript with input from S.A., M.S. and K.N. All authors contributed to the final submitted version. All authors have read and agreed to the final version of the manuscript.
Data sharing statement
The raw reads of all samples have been deposited into GISAID's EpiCoV database, under the EPI_SET identifiers: EPI_SET_230103mu, EPI_SET_230103zt, EPI_SET_230103ek, EPI_SET_230103ky, EPI_SET_230103ar, EPI_SET_230103te.
All scripts used to perform the subgenomic RNAs detection, statistical analyses and generation of the figures are available in the Git repository https://github.com/ssi-dk/project-covid19_subgenomic.
The supplementary material contains the list of members of the Danish COVID-19 Genomic Consortium, Supplementary Figs. S1–S7, Supplementary Tables S1–S4.
Declaration of interests
The authors declare no competing interests or conflicts of interest.
Acknowledgements
The authors would like to thank the Danish COVID-19 Genome Consortium for providing the whole genome sequencing data of SARS-CoV-2 positive samples, the Division of Infectious Disease Preparedness at Statens Serum Institut for providing the patient data, and Henrik Ullum for his comments on the manuscript.
Footnotes
Supplementary data related to this article can be found at https://doi.org/10.1016/j.ebiom.2023.104669.
References
1. Sethuraman N., Jeremiah S.S., Ryo A. Interpreting diagnostic tests for SARS-CoV-2. JAMA. 2020;323:2249. doi: 10.1001/jama.2020.8259.
2. Wölfel R., Corman V.M., Guggemos W., et al. Virological assessment of hospitalized patients with COVID-2019. Nature. 2020;581:465–469. doi: 10.1038/s41586-020-2196-x.
3. Bullard J., Dust K., Funk D., et al. Predicting infectious severe acute respiratory syndrome coronavirus 2 from diagnostic samples. Clin Infect Dis. 2020;71:2663–2666. doi: 10.1093/cid/ciaa638.
4. van Kampen J.J.A., van de Vijver D.A.M.C., Fraaij P.L.A., et al. Duration and key determinants of infectious virus shedding in hospitalized patients with coronavirus disease-2019 (COVID-19) Nat Commun. 2021;12:267. doi: 10.1038/s41467-020-20568-4.
5. Immergluck K., Gonzalez M.D., Frediani J.K., et al. Correlation of SARS-CoV-2 subgenomic RNA with antigen detection in nasal midturbinate swab specimens. Emerg Infect Dis. 2021;27:2887–2891. doi: 10.3201/eid2711.211135.
6. Hogan C.A., Huang C., Sahoo M.K., et al. Strand-specific reverse transcription PCR for detection of replicating SARS-CoV-2. Emerg Infect Dis. 2021;27:632–635. doi: 10.3201/eid2702.204168.
7. Perera R.A.P.M., Tso E., Tsang O.T.Y., et al. SARS-CoV-2 virus culture and subgenomic RNA for respiratory specimens from patients with mild coronavirus disease. Emerg Infect Dis. 2020;26:2701–2704. doi: 10.3201/eid2611.203219.
8. V’kovski P., Kratzel A., Steiner S., Stalder H., Thiel V. Coronavirus biology and replication: implications for SARS-CoV-2. Nat Rev Microbiol. 2021;19:155–170. doi: 10.1038/s41579-020-00468-6.
9. Snijder E.J., Limpens R.W.A.L., de Wilde A.H., et al. A unifying structural and functional model of the coronavirus replication organelle: tracking down RNA synthesis. PLoS Biol. 2020;18 doi: 10.1371/journal.pbio.3000715.
10. Wolff G., Limpens R.W.A.L., Zevenhoven-Dobbe J.C., et al. A molecular pore spans the double membrane of the coronavirus replication organelle. Science. 2020;369:1395–1398. doi: 10.1126/science.abd3629.
11. Zhou X., Cong Y., Veenendaal T., et al. Ultrastructural characterization of membrane rearrangements induced by porcine epidemic diarrhea virus infection. Viruses. 2017;9:251. doi: 10.3390/v9090251.
12. Knoops K., Kikkert M., Worm S.H., et al. SARS-coronavirus replication is supported by a reticulovesicular Network of modified endoplasmic reticulum. PLoS Biol. 2008;6:e226. doi: 10.1371/journal.pbio.0060226.
13. Sola I., Almazán F., Zúñiga S., Enjuanes L. Continuous and discontinuous RNA synthesis in coronaviruses. Annu Rev Virol. 2015;2:265–288. doi: 10.1146/annurev-virology-100114-055218.
14. Alexandersen S., Chamings A., Bhatta T.R. SARS-CoV-2 genomic and subgenomic RNAs in diagnostic samples are not an indicator of active replication. Nat Commun. 2020;11:6059. doi: 10.1038/s41467-020-19883-7.
15. Deiana M., Mori A., Piubelli C., et al. Impact of full vaccination with mRNA BNT162b2 on SARS-CoV-2 infection: genomic and subgenomic viral RNAs detection in nasopharyngeal swab and saliva of health care workers. Microorganisms. 2021;9:1738. doi: 10.3390/microorganisms9081738.
16. Böszörményi K.P., Stammes M.A., Fagrouch Z.C., et al. The post-acute phase of SARS-CoV-2 infection in two macaque species is associated with signs of ongoing virus replication and pathology in pulmonary and extrapulmonary tissues. Viruses. 2021;13:1673. doi: 10.3390/v13081673.
17. Su M., Ping S., Nguyen P.V., et al. Subgenomic RNA abundance relative to total viral RNA among SARS-CoV-2 variants. Open Forum Infect Dis. 2022;9(11):ofac619. doi: 10.1093/ofid/ofac619.
18. Berry N., Ferguson D., Kempster S., et al. Intrinsic host susceptibility among multiple species to intranasal SARS-CoV-2 identifies diverse virological, biodistribution and pathological outcomes. Sci Rep. 2022;12:18694. doi: 10.1038/s41598-022-23339-x.
19. Agius J.E., Johnson-Mackinnon J.C., Fong W., et al. SARS-CoV-2 within-host and in vitro genomic variability and sub-genomic RNA levels indicate differences in viral expression between clinical cohorts and in vitro culture. Front Microbiol. 2022;13:824217. doi: 10.3389/fmicb.2022.824217.
20. Davies M., Bramwell L.R., Jeffery N., et al. Persistence of clinically relevant levels of SARS-CoV2 envelope gene subgenomic RNAs in non-immunocompromised individuals. Int J Infect Dis. 2022;116:418–425. doi: 10.1016/j.ijid.2021.12.312.
21. Dimcheff D.E., Blair C.N., Zhu Y., et al. Total and subgenomic RNA viral load in patients infected with SARS-CoV-2 Alpha, Delta, and Omicron variants. J Infect Dis. 2023 doi: 10.1093/infdis/jiad061. Epub ahead of print.
22. Chamings A., Bhatta T.R., Alexandersen S. Subgenomic and negative sense RNAs are not markers of active replication of SARS-CoV-2 in nasopharyngeal swabs. medRxiv. 2021 doi: 10.1101/2021.06.29.21259511. [preprint]
23. Mears H.V., Young G.R., Sanderson T., et al. Emergence of new subgenomic mRNAs in SARS-CoV-2. bioRxiv. 2022 doi: 10.1101/2022.04.20.488895. [preprint]
24. Parker M.D., Stewart H., Shehata O.M., et al. Altered subgenomic RNA abundance provides unique insight into SARS-CoV-2 B.1.1.7/Alpha variant infections. Commun Biol. 2022;5:666. doi: 10.1038/s42003-022-03565-9.
25. Gram M.A., Steenhard N., Cohen A.S., et al. Patterns of testing in the extensive Danish national SARS-CoV-2 test set-up. medRxiv. 2023 doi: 10.1101/2023.02.06.23285556. [preprint]
26. Quick J. 2020. nCoV-2019 sequencing protocol V.3. https://www.protocols.io/view/ncov-2019-sequencing-protocol-v3-locost-bp2l6n26rgqe/v3?version_warning=no
27. Grubaugh N.D., Gangavarapu K., Quick J., et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019;20:8. doi: 10.1186/s13059-018-1618-7.
28. O’Toole Á., Scher E., Underwood A., et al. Assignment of epidemiological lineages in an emerging pandemic using the Pangolin tool. Virus Evol. 2021;7(2):veab064. doi: 10.1093/ve/veab064.
29. Khare S., Gurry C., Freitas L., et al. GISAID’s role in pandemic response. China CDC Wkly. 2021;3:1049–1051. doi: 10.46234/ccdcw2021.255.
30. Schmidt M., Pedersen L., Sørensen H.T. The Danish Civil registration System as a tool in epidemiology. Eur J Epidemiol. 2014;29:541–549. doi: 10.1007/s10654-014-9930-3.
31. Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22.
32. Kuhn M. 2022. Caret: classification and regression training. https://CRAN.R-project.org/package=caret
33. Wickham H. Springer-Verlag New York; 2016. ggplot2: elegant graphics for data analysis.
34. Kassambara A. 2022. ggpubr: ‘ggplot2’ based publication ready plots. https://CRAN.R-project.org/package=ggpubr
35. Weber C.M., Ramachandran S., Henikoff S. Nucleosomes are context-specific, H2A.Z-modulated barriers to RNA polymerase. Mol Cell. 2014;53:819–830. doi: 10.1016/j.molcel.2014.02.014.
36. Tang Y., Horikoshi M., Li W. Ggfortify: unified interface to visualize statistical results of popular R packages. R J. 2016;8:474.
37. Maechler M., Rousseeuw P., Struyf A., Hubert M., Hornik K. 2022. Cluster: cluster analysis basics and extensions. https://CRAN.R-project.org/package=cluster
38. Finkel Y., Mizrahi O., Nachshon A., et al. The coding capacity of SARS-CoV-2. Nature. 2021;589:125–130. doi: 10.1038/s41586-020-2739-1.
39. Lyu L., Feng R., Zhang M., et al. Subgenomic RNA profiling suggests novel mechanism in coronavirus gene regulation and host adaption. Life Sci Alliance. 2022;5 doi: 10.26508/lsa.202101347.
40. Li C., Huang J., Yu Y., et al. Human airway and nasal organoids reveal escalating replicative fitness of SARS-CoV-2 emerging variants. Proc Natl Acad Sci USA. 2023;120 doi: 10.1073/pnas.2300376120.
41. Willett B.J., Grove J., MacLean O.A., et al. SARS-CoV-2 Omicron is an immune escape variant with an altered cell entry pathway. Nat Microbiol. 2022;7:1161–1179. doi: 10.1038/s41564-022-01143-7.
42. Pia L., Rowland-Jones S. Omicron entry route. Nat Rev Immunol. 2022;22:144. doi: 10.1038/s41577-022-00681-9.
43. Gupta R. SARS-CoV-2 Omicron spike mediated immune escape and tropism shift. Res Sq. 2022 doi: 10.21203/rs.3.rs-1191837/v1.
44. Meng B., Abdullahi A., Ferreira I.A.T.M., et al. Altered TMPRSS2 usage by SARS-CoV-2 Omicron impacts infectivity and fusogenicity. Nature. 2022;603:706–714. doi: 10.1038/s41586-022-04474-x.
45. Hoffmann M., Krüger N., Schulz S., et al. The Omicron variant is highly resistant against antibody-mediated neutralization: implications for control of the COVID-19 pandemic. Cell. 2022;185:447–456.e11. doi: 10.1016/j.cell.2021.12.032.
46. Planas D., Saunders N., Maes P., et al. Considerable escape of SARS-CoV-2 Omicron to antibody neutralization. Nature. 2022;602:671–675. doi: 10.1038/s41586-021-04389-z.
47. Cameroni E., Bowen J.E., Rosen L.E., et al. Broadly neutralizing antibodies overcome SARS-CoV-2 Omicron antigenic shift. Nature. 2022;602:664–670. doi: 10.1038/s41586-021-04386-2.
48. Peacock T.P., Brown J.C., Zhou J., et al. The altered entry pathway and antigenic distance of the SARS-CoV-2 Omicron variant map to separate domains of spike protein. bioRxiv. 2022 doi: 10.1101/2021.12.31.474653. [preprint]
49. Boson B., Legros V., Zhou B., et al. The SARS-CoV-2 envelope and membrane proteins modulate maturation and retention of the spike protein, allowing assembly of virus-like particles. J Biol Chem. 2021;296:100111. doi: 10.1074/jbc.RA120.016175.
50. Zhang Z., Nomura N., Muramoto Y., et al. Structure of SARS-CoV-2 membrane protein essential for virus assembly. Nat Commun. 2022;13:4399. doi: 10.1038/s41467-022-32019-3.
51. Aggarwal A., Akerman A., Milogiannakis V., et al. SARS-CoV-2 Omicron BA.5: evolving tropism and evasion of potent humoral responses and resistance to clinical immunotherapeutics relative to viral variants of concern. eBioMedicine. 2022;84:104270. doi: 10.1016/j.ebiom.2022.104270.
52. Kimura I., Yamasoba D., Tamura T., et al. Virological characteristics of the SARS-CoV-2 Omicron BA.2 subvariants, including BA.4 and BA.5. Cell. 2022;185:3992–4007.e16. doi: 10.1016/j.cell.2022.09.018.
53. Veneti L., Bøås H., Bråthen Kristoffersen A., et al. Reduced risk of hospitalisation among reported COVID-19 cases infected with the SARS-CoV-2 Omicron BA.1 variant compared with the Delta variant, Norway, December 2021 to January 2022. Euro Surveill. 2022;27:2200077. doi: 10.2807/1560-7917.ES.2022.27.4.2200077.
54. Buttenschøn H.N., Lynggaard V., Sandbøl S.G., Glassou E.N., Haagerup A. Comparison of the clinical presentation across two waves of COVID-19: a retrospective cohort study. BMC Infect Dis. 2022;22:423. doi: 10.1186/s12879-022-07413-3.
55. Holler J.G., Eriksson R., Jensen T.Ø., et al. First wave of COVID-19 hospital admissions in Denmark: a Nationwide population-based cohort study. BMC Infect Dis. 2021;21:39. doi: 10.1186/s12879-020-05717-w.
56. Bager P., Wohlfahrt J., Bhatt S., et al. Risk of hospitalisation associated with infection with SARS-CoV-2 omicron variant versus delta variant in Denmark: an observational cohort study. Lancet Infect Dis. 2022;22:967–976. doi: 10.1016/S1473-3099(22)00154-2.