J Med Virol. 2021 Nov 2;94(3):1227–1231. doi: 10.1002/jmv.27418

SARS‐CoV‐2 variant with mutations in N gene affecting detection by widely used PCR primers

Pia Laine 1, Hanna Nihtilä 2, Ella Mustanoja 1, Annina Lyyski 1, Anne Ylinen 1, Jukka Hurme 2, Lars Paulin 1, Sakari Jokiranta 2,3, Petri Auvinen 1, Taru Meri 2,4
PMCID: PMC8661661  PMID: 34698407

Abstract

While most spontaneous mutations in the viral genome have no functional, diagnostic, or clinical consequences, some do. In February 2021, we noticed coronavirus disease 2019 cases in Southern Finland in which two commercial polymerase chain reaction (PCR) assays failed to detect their N gene target but detected the other target gene of severe acute respiratory syndrome coronavirus 2. Complete viral genome sequence analysis of the strains revealed several mutations that were not found in public databases at that time. A short 3 bp deletion and three successive single nucleotide polymorphisms in the N gene were found exactly at the binding site of an early published and widely used N gene‐based PCR primer, explaining the negative results in the N gene PCR. The variant strain was later identified as a member of the B.1.1.318 Pango lineage, which had first been found in Nigerian samples collected in January 2021. This strain shares with the Beta variant the S gene E484K mutation linked to impaired vaccine protection, but differs from it in several other ways, for example by deletions in the N gene region. Mutations in the N gene causing diagnostic resistance, together with the E484K mutation in the S gene causing altered infectivity, warrant careful monitoring of virus variants that might be underdiagnosed.

Keywords: diagnosis, N gene, PCR, SARS CoV‐2, sequencing

1. INTRODUCTION

In late 2019, a cluster of pneumonia‐type infections occurred in Wuhan, China. The causative agent was soon identified as severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), a member of the genus Betacoronavirus in the family Coronaviridae. 1 The infection caused by SARS‐CoV‐2 is called coronavirus disease 2019 (COVID‐19), and it was declared a pandemic by WHO. There had been approximately 205 000 000 infected patients and over 4 million deaths by the end of July 2021.

SARS‐CoV‐2 is a positive‐stranded RNA virus with a genome of approximately 30 kb. The genome contains two open reading frames, ORF1a and ORF1ab, that encode 16 nonstructural proteins (nsp), and genes for four structural proteins: spike surface glycoprotein (S), small envelope protein (E), membrane protein (M), and nucleocapsid protein (N). In addition, eight accessory proteins (3a, 3b, p6, 7a, 7b, 8b, 9b, and orf14) have been described. Viruses use different receptors to enter cells. SARS‐CoV‐2 binds to the human ACE2 receptor via the receptor‐binding motif (RBM, amino acids 424–494) in the spike protein. 2 The interaction between the receptor and viral structures has an important role in viral infectivity and pathogenesis.

Currently circulating and emerging SARS‐CoV‐2 viruses are naturally expanding in genetic diversity. The nucleotide changes observed in the SARS‐CoV‐2 genome are constrained by their effect on coding: nonsynonymous mutations that are not tolerated at the level of protein structure or function do not persist. A to C or T and T to A are the most frequent nucleotide changes. 3 The SARS‐CoV‐2 genome also has a complicated RNA secondary structure that could restrict the tolerated mutations to those that do not affect protein‐coding capacity. 4, 5

The gold standard test for diagnosis of COVID‐19 is quantitative reverse‐transcription polymerase chain reaction (RT‐qPCR), which is performed on RNA isolated from a nasopharyngeal swab. Antigen detection and antibody tests are also widely used. 6 Antigen assays detect viral proteins or parts of them in patient samples, whereas antibody tests detect antibodies formed against SARS‐CoV‐2 and should not be used to diagnose acute COVID‐19 infection. Nucleic acid tests, on the other hand, detect the presence of SARS‐CoV‐2 RNA. The most commonly used targets in RT‐qPCR methods for SARS‐CoV‐2 are the N, E, and RdRp/ORF1ab genes. Mutations in diagnostic target regions may impair detection if they are located in primer or probe binding regions.

The aim of this study was to analyze in detail the genomic features of SARS‐CoV‐2 variants, which were not identified with two widely used commercially available RT‐PCR diagnostic methods.

2. MATERIAL AND METHODS

2.1. Patient

The first nasopharyngeal swab sample for COVID‐19 analysis was taken on February 6, 2021, from a 44‐year‐old male living in Southern Finland. The patient was symptomatic and was hospitalized the following week. The second patient from the same workplace, a 24‐year‐old male, gave a sample on February 19, 2021.

2.2. Diagnostic RT‐qPCR

Nasopharyngeal swab samples taken into mNAT‐medium were analyzed using two different commercially available multiplex RT‐qPCR tests (Amplidiag® Covid‐19 and PerkinElmer® SARS‐CoV‐2 Real‐time RT‐PCR assay). Both recognize SARS‐CoV‐2 ORF1ab and N genes, and human ribonuclease P (RNase P) as a sampling control. Tests were performed according to assay protocols provided by manufacturers and performed twice for each sample.

2.3. Sequencing and comparative analysis of the N gene sequences

Because the results of the two diagnostic RT‐qPCR tests on the index patient sample were unexpected (only one of the two targets was detected), we used Sanger sequencing to obtain complete S and N gene sequences from the same sample. Later, the same sequence analysis was performed on a sample from another patient from the same workplace.

Whole‐genome sequencing was performed using a commercial panel (CleanPlex® SARS‐CoV‐2 Research and Surveillance Panel, Paragon Genomics) on an Illumina iSeq 100 instrument. For comparative analyses, nucleocapsid (N) gene sequences from both the Alphacoronavirus and Betacoronavirus genera were retrieved from the NCBI Coronavirus genome data set. Details of Sanger and whole‐genome sequencing and of the comparative analysis of the N gene are described in the Online Supporting Information.

3. RESULTS

RT‐qPCR results from the first patient showed a prominent signal from the ORF1ab gene with a quantitation cycle (Cq) of 19.36 (Figure 1A) and no signal for the N gene. When the sample was analyzed using another test targeting the same genes, the result was similar: a clear signal for ORF1ab (Cq 21.84) and no signal for the N gene. The second patient sample gave similar results with both assays.
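The pattern described above (one viral target detected, the other absent) can be flagged automatically when results are reviewed. The following is a minimal sketch, not the interpretation logic of either commercial assay: the function name `interpret`, the Cq cutoff of 40, and the RNase P value in the usage note are illustrative assumptions that do not come from the article or the manufacturers' instructions.

```python
from typing import Dict, Optional

VIRAL_TARGETS = ("ORF1ab", "N")   # viral targets named in the Methods
CONTROL = "RNaseP"                # human sampling control


def detected(cq: Optional[float], cutoff: float) -> bool:
    """A target counts as detected if it has a Cq at or below the cutoff."""
    return cq is not None and cq <= cutoff


def interpret(cq_values: Dict[str, Optional[float]], cutoff: float = 40.0) -> str:
    """Classify one sample; None means no amplification signal.

    The cutoff is a placeholder assumption, not an assay specification.
    """
    hits = [t for t in VIRAL_TARGETS if detected(cq_values.get(t), cutoff)]
    missed = [t for t in VIRAL_TARGETS if t not in hits]
    if hits and not missed:
        return "positive"
    if hits:
        # Discordant targets: report positive but flag for sequencing.
        return "positive, target dropout: " + ", ".join(missed)
    if detected(cq_values.get(CONTROL), cutoff):
        return "negative"
    return "invalid"
```

With the index patient's ORF1ab Cq of 19.36, no N signal, and a placeholder control value, `interpret({"ORF1ab": 19.36, "N": None, "RNaseP": 25.0})` returns a positive call with the N target flagged as dropped out.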

Figure 1. The unexpected RT‐qPCR result of the index patient sample. RT‐qPCR, quantitative reverse‐transcription polymerase chain reaction

The results of the two RT‐qPCR analyses of the sample were unexpected. Until February 15, 2021, we had identified 756 positive SARS‐CoV‐2 samples, and none of them showed a prominent signal for ORF1ab only. Next, we verified the finding of COVID‐19 by acquiring complete sequences of the S and N genes using Sanger sequencing. The sequences acquired from the two patient samples were identical. We compared the S protein sequence profile to the B.1.1.7/Alpha, B.1.351/Beta, and P.1/Gamma variants and identified a distinct combination of variant mutations, which differed from other variants circulating in Europe at the beginning of February 2021 (Table S1).

We also compared the N gene sequence to the NC_045512.2 reference and to primer sequences (Figure 2). The primer sequences were from diagnostic targets designed and suggested by WHO at the beginning of the current pandemic. We noticed that the binding site of the forward primer (http://ivdc.chinacdc.cn/kyjz/202001/t20200121_211337.htm) contains three mutations at the 5ʹ end and, notably, a 3 bp deletion near the 3ʹ end of the primer, which most likely explains our results from the two different PCR assays.

Figure 2. Mutations in the SARS‐CoV‐2 N gene at the ChinaCDC_N_F binding site. SARS‐CoV‐2, severe acute respiratory syndrome coronavirus 2

The complete genome sequence of this SARS‐CoV‐2 variant was determined by mapping paired‐end reads, after primer and adapter removal, against the NC_045512.2 reference genome sequence. Median sequencing coverage was 261× and coverage across the complete genome varied from 1× to 8019×. The hCoV‐19/Finland/FinD796H/2021 sequence had a total of 58 nucleotide differences compared with the reference sequence. Twenty‐one were nonsynonymous mutations changing amino acids and 30 were nucleotide deletions. Two genomic positions (ORF1ab: 17821Y (C/T), S: 23948S (G/C)) had two different bases, and therefore IUPAC codes were used in the final genome sequence. hCoV‐19/Finland/FinD796H/2021 had a very uncommon combination of S protein mutations, as only 0.04% (535/1204022) of the complete genome sequences deposited in GISAID by April 23, 2021, had a similar combination (S:T95I, S:144del, S:E484K, S:D614G, S:P681H, and S:D796D&H).
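The two mixed positions reported above use standard IUPAC nucleotide ambiguity codes (Y for C/T, S for G/C). As an illustration only, the mapping from a set of observed bases to a single code can be sketched as follows. The function name `iupac_code` is hypothetical, and the sketch says nothing about how the authors called mixed bases from their read data.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_code(bases):
    """Return the IUPAC code for an iterable of observed bases."""
    key = frozenset(b.upper() for b in bases)
    if key not in IUPAC:
        raise ValueError(f"no IUPAC code for bases: {sorted(key)}")
    return IUPAC[key]


# The two mixed positions reported for hCoV-19/Finland/FinD796H/2021.
print(17821, iupac_code("CT"))   # C/T at position 17821
print(23948, iupac_code("GC"))   # G/C at position 23948
```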

In addition to the S gene mutations, a 15 bp deletion was found extending from the last base of the ORF7b gene to the 8th base of the ORF8 gene, resulting in a fusion of the accessory ORF7b and ORF8 proteins (Figure S1). One nonsense mutation (GAA>TAA) was found further downstream in ORF8 (G28209T). In the N gene, a cluster of three successive mutations (G28881A, G28882A, and G28883C) and a 3 bp deletion (CTA28896‐28898‐‐‐), causing deletion of one amino acid, were found (Figure 2). A similar N protein mutation pattern was found in 651 (0.05%) of the complete genomes deposited in GISAID (April 23, 2021).
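To make the effect on the primer concrete, the reported N gene changes (G28881A, G28882A, G28883C, and the deletion of positions 28896 to 28898) can be laid over the forward primer site. This is a sketch under stated assumptions: the primer sequence below is the commonly cited China CDC N forward primer, which is not quoted in this article, and its placement at reference positions 28881 to 28902 is likewise an assumption to verify against the original source. The function names are hypothetical.

```python
# ASSUMPTION: commonly cited China CDC N forward primer; not quoted in
# the article. Verify sequence and coordinates before reuse.
PRIMER = "GGGGAACTTCTCCTGCTAGAAT"
PRIMER_START = 28881  # assumed first reference position of the primer

# Changes reported in the article (reference base, variant base).
SUBSTITUTIONS = {28881: ("G", "A"), 28882: ("G", "A"), 28883: ("G", "C")}
DELETED = range(28896, 28899)  # positions 28896, 28897, 28898


def variant_alignment(primer, start, substitutions, deleted):
    """Return the variant template aligned to the primer, '-' for deleted bases."""
    aligned = []
    for offset, ref_base in enumerate(primer):
        pos = start + offset
        if pos in deleted:
            aligned.append("-")
        elif pos in substitutions:
            expected_ref, alt = substitutions[pos]
            if expected_ref != ref_base:
                raise ValueError(f"reference base mismatch at {pos}")
            aligned.append(alt)
        else:
            aligned.append(ref_base)
    return "".join(aligned)


def count_differences(primer, aligned):
    """Count primer positions that are substituted or deleted in the variant."""
    return sum(1 for p, v in zip(primer, aligned) if p != v)


aligned = variant_alignment(PRIMER, PRIMER_START, SUBSTITUTIONS, DELETED)
print(PRIMER)
print(aligned)
print(count_differences(PRIMER, aligned), "of", len(PRIMER), "positions affected")
```

Under these assumptions, 6 of the 22 primer positions are affected: three substitutions at the 5ʹ end and three deleted bases toward the 3ʹ end, in line with the pattern described for Figure 2.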

Phylogenetic analysis of the hCoV‐19/Finland/FinD796H/2021 strain and representatives of the Alphacoronavirus and Betacoronavirus genera of the family Coronaviridae is shown in Figure S2. A maximum likelihood (ML) tree was constructed based on 83 N gene sequences.

Virus genome sequence of hCoV‐19/Finland/FinD796H/2021 was deposited to GISAID (accession ID EPI_ISL_1061414). The variant was identified as a member of B.1.1.318 Pango lineage.

4. DISCUSSION

In the present study, we identified a rare combination of mutations in a SARS‐CoV‐2 strain isolated from a Finnish patient in February 2021. Importantly, our isolate had several mutations and a deletion in the N gene primer binding site, which resulted in failed N gene detection by two commonly used RT‐qPCR assays. A rare combination of amino acid changes within the S protein was also observed. Altogether, we identified seven patients with a similar S gene mutation profile in Finland during spring 2021.

After WHO declared SARS‐CoV‐2 disease a pandemic, clinical laboratories as well as reagent manufacturers rushed to develop methods and reagents for reliable identification of the novel coronavirus. RT‐qPCR is the gold standard method for diagnosing airborne viruses. When the first whole‐genome sequences of SARS‐CoV‐2 were published, WHO recommended primers targeting N, E, and ORF1ab. Because there was little time to optimize primer design, and despite the rise of new SARS‐CoV‐2 variants, many manufacturers have continued to use these first recommendations.

When we aligned the N gene sequence with the recommended PCR primer binding site (Figure 2), we identified several mutations, which most likely caused the failure in detection with the assays used. By comparing the N gene sequence to sequences in GISAID, we identified 651 (0.05%) complete SARS‐CoV‐2 isolates with similar mutations (April 23, 2021). Mutations affecting RT‐PCR detection of the most commonly used N and E gene targets have also been identified previously. 7, 8, 9, 10 In silico analysis of more than 17 000 genomes submitted to GISAID revealed multiple mismatches in primer binding regions in 7 of 27 common diagnostic primers studied. 11

Evolution of diagnostic resistance, that is, accumulation of mutations at sites used for identification in nucleic acid test (NAT) methods, has been described, for example, in H7 avian influenza virus and in the bacterium Chlamydia trachomatis. Use of a single target and/or a known variable target sequence in a NAT are risk factors for the development of diagnostic resistance. The N gene of SARS‐CoV‐2 was shown to be one of the least conserved regions in a study of 31 421 genomes, 12 and as our results show (Figure 2), mutations in the N gene have already led to diagnostic resistance.

The two commercial assays we used each target two different positions in the SARS‐CoV‐2 genome, and both identified these novel variant samples as positive. With a NAT method targeting only the N gene with the original primers recommended by WHO, both samples would have been negative. Our study supports previously published observations, highlighting the importance of simultaneous use of multiple targets in SARS‐CoV‐2 diagnostic assays and of continuous monitoring of the genetic variability of SARS‐CoV‐2 to ensure the specificity and sensitivity of commonly used diagnostic assays. Primers and probes should be aligned against the increasing number of reported SARS‐CoV‐2 variants to reassess their suitability for SARS‐CoV‐2 diagnostics.
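The kind of routine primer reassessment recommended here can be sketched as a simple in silico scan. The example below is a deliberately minimal illustration, not the method of this study or of the cited in silico analyses: it slides each primer along each sequence without gaps and reports primers whose best match exceeds a mismatch threshold. The function names, the threshold of 2, and the toy sequences are all hypothetical. Because the scan is ungapped, a deletion inside a binding site shows up only as a run of mismatches, so a real workflow would use a gapped aligner.

```python
def best_ungapped_match(primer, genome):
    """Return (start index, mismatches) of the best ungapped primer placement."""
    best_pos, best_mm = -1, len(primer) + 1
    for i in range(len(genome) - len(primer) + 1):
        window = genome[i:i + len(primer)]
        mm = sum(1 for a, b in zip(primer, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = i, mm
    return best_pos, best_mm


def flag_primers(primers, genomes, max_mismatches=2):
    """List (genome, primer, mismatches) where the best match is too poor."""
    flagged = []
    for g_name, g_seq in genomes.items():
        for p_name, p_seq in primers.items():
            _, mm = best_ungapped_match(p_seq, g_seq)
            if mm > max_mismatches:
                flagged.append((g_name, p_name, mm))
    return flagged


# Toy data for illustration only; these are not real viral sequences.
toy_primers = {"toy_F": "ACGTACGT"}
toy_genomes = {
    "ref_like": "TTTACGTACGTTTT",
    "variant_like": "TTTAGGTTCGATTT",
}
print(flag_primers(toy_primers, toy_genomes))
```

Forward primers can be compared directly with the genome sense strand as done here; reverse primers would need to be reverse‐complemented first.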

CONFLICT OF INTERESTS

The authors declare that there is no conflict of interest.

AUTHOR CONTRIBUTIONS

Pia Laine designed the sequencing strategy, analyzed sequence data, and wrote the manuscript. Hanna Nihtilä analyzed patient samples and sequence data, prepared figures, and wrote the manuscript. Ella Mustanoja analyzed sequence data and took part in the writing process. Annina Lyyski analyzed sequence data and took part in the writing process. Anne Ylinen analyzed sequence data and took part in the writing process. Jukka Hurme participated in study design. Lars Paulin designed the sequencing strategy and took part in the writing process. Sakari Jokiranta participated in study design and wrote the manuscript. Petri Auvinen wrote the manuscript. Taru Meri prepared figures, participated in study design, and wrote the manuscript.

Supporting information

Supporting information.

ACKNOWLEDGMENTS

We acknowledge Päivi Laamanen, Ursula Lönnqvist, and Hanna Roos at the DNA Sequencing and Genomics Laboratory, Institute of Biotechnology, University of Helsinki, for sequencing of SARS‐CoV‐2 samples. Funding was provided by the Academy of Finland (336472 to PA and 324236 to TM).

Laine P, Nihtilä H, Mustanoja E, et al. SARS‐CoV‐2 variant with mutations in N gene affecting detection by widely used PCR primers. J Med Virol. 2022;94:1227‐1231. 10.1002/jmv.27418

Pia Laine, Hanna Nihtilä, Petri Auvinen, and Taru Meri equally contributed to the study.

DATA AVAILABILITY STATEMENT

Genome sequence of this variant was deposited to GISAID (accession ID EPI_ISL_1061414).

REFERENCES

  • 1. Wu F, Zhao S, Yu B, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579(7798):265‐269.
  • 2. Lan J, Ge J, Yu J, et al. Structure of the SARS‐CoV‐2 spike receptor‐binding domain bound to the ACE2 receptor. Nature. 2020;581(7807):215‐220.
  • 3. Pathan RK, Biswas M, Khandaker MU. Time series prediction of COVID‐19 by mutation rate analysis using recurrent neural network‐based LSTM model. Chaos Solitons Fractals. 2020;138:110018.
  • 4. Rangan R, Zheludev IN, Hagey RJ, et al. RNA genome conservation and secondary structure in SARS‐CoV‐2 and SARS‐related viruses: a first look. RNA. 2020;26(8):937‐959.
  • 5. Sanders DW, Jumper CC, Ackerman PJ, et al. SARS‐CoV‐2 requires cholesterol for viral entry and pathological syncytia formation. eLife. 2021;10:10.
  • 6. Falzone L, Gattuso G, Tsatsakis A, Spandidos DA, Libra M. Current and innovative methods for the diagnosis of COVID19 infection (Review). Int J Mol Med. 2021;47(6):100.
  • 7. Vanaerschot M, Mann SA, Webber JT, et al. Identification of a polymorphism in the N gene of SARS‐CoV‐2 that adversely impacts detection by reverse transcription‐PCR. J Clin Microbiol. 2020;59(1):e02369‐20.
  • 8. Mushtaq MZ, Shakoor S, Kanji A, et al. Discrepancy between PCR based SARS‐CoV‐2 tests suggests the need to re‐evaluate diagnostic assays. BMC Res Notes. 2021;14(1):316.
  • 9. Ziegler K, Steininger P, Ziegler R, Steinmann J, Korn K, Ensser A. SARS‐CoV‐2 samples may escape detection because of a single point mutation in the N gene. Euro Surveill. 2020;25(39):2001650.
  • 10. Artesi M, Bontems S, Göbbels P, et al. A recurrent mutation at position 26340 of SARS‐CoV‐2 is associated with failure of the E gene quantitative reverse transcription‐PCR utilized in a commercial dual‐target diagnostic assay. J Clin Microbiol. 2020;58(10):e01598‐20.
  • 11. Khan KA, Cheung P. Presence of mismatches between diagnostic PCR assays and coronavirus SARS‐CoV‐2 genome. R Soc Open Sci. 2020;7(6):200636.
  • 12. Wang R, Hozumi Y, Yin C, Wei GW. Mutations on COVID‐19 diagnostic targets. Genomics. 2020;112(6):5204‐5213.

Articles from Journal of Medical Virology are provided here courtesy of Wiley
