Abstract
To investigate discrepancies in the results of coronavirus disease 2019 (COVID-19) diagnostic tests in Gyeongbuk, Republic of Korea, this study performed full-length genome analysis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) using the Ion Torrent Genexus Integrated Sequencer and its dedicated analysis program. Full-length genomic sequences of SARS-CoV-2 were obtained from specimens of 13 confirmed cases of COVID-19. The Omicron sub-lineages identified were BA.1.1 (n=5), BA.2 (n=4), and BA.2.3 (n=4). R61C or R61H was confirmed in the E gene, while the L142W mutation or a deletion (166–178) was identified in the NSP3 region of ORF1ab. In the in silico analysis, the E gene mutation (R61H) was not located in a target region of COVID-19 real-time RT-PCR. In contrast, the amino acid mutation (L142W) and the nucleotide deletion (166–178) in the ORF1ab NSP3 region were found to affect the diagnosis of COVID-19 with specific diagnostic kits. These results suggest that determining the detailed lineage and genetic mutations of SARS-CoV-2 by whole genome sequencing could be a critical tool for COVID-19 diagnosis.
Keywords: Coronavirus Disease-19, Severe acute respiratory syndrome coronavirus 2, Whole genome sequencing, SARS-CoV-2 variant, Molecular diagnostic test
Key messages
① What is known previously?
There are only a few studies on the correlation between severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) mutation and gene detection.
② What new information is presented?
SARS-CoV-2 mutations may affect sensitivity of gene detection depending on diagnostic kits.
③ What are implications?
Continuous monitoring is necessary through analysis of whole genome of SARS-CoV-2.
Introduction
Since the outbreak of pneumonia of unknown origin in Wuhan, Hubei Province, China in December 2019, the disease has spread rapidly worldwide through person-to-person transmission. The causative agent was identified as a new coronavirus strain that is genetically different from the already-known severe acute respiratory syndrome (SARS) coronavirus. The genetic lineages of the virus that causes coronavirus disease 2019 (COVID-19) have become increasingly diverse with its global spread. The World Health Organization Technical Advisory Group on SARS-CoV-2 Virus Evolution classified the omicron variant as a variant of concern in November 2021 [1,2], and the variant has since been subdivided into 390 sub-strains. Among these, BA.5, BA.4, and BA.2 were further subdivided into 140, 20, and 172 sub-strains, respectively (as of October 2022) [3,4]. In the Republic of Korea (ROK), BA.4/BA.5 and BA.2.75 were identified in May and July 2022, respectively. As of November 2022, the detection rates of the BA.5 (including BF.7, BQ.1, and BQ.1.1) and BA.2.75 (including BN.1) sub-strains were 85.1% and 11.6%, respectively [5].
The Division of Laboratory Diagnosis Analysis of the Gyeongbuk Regional Center for Disease Control and Prevention conducts diagnostic tests and genetic analysis for patients confirmed with COVID-19, thereby contributing to a science-based quarantine response grounded in the analysis and diagnosis of the COVID-19 epidemic in the region. In our laboratory, reagents from domestic manufacturers that detect the E and ORF1ab genes or the E and RdRp genes are used for the gene detection test, and the COVID-19 mutation screening test, S gene sequencing, and whole genome analysis are used for genetic analysis.
Most of the commercially available COVID-19 gene diagnostic reagents in the ROK detect specific regions (E, ORF1ab, RdRp, etc.) of the COVID-19 viral genome using real-time reverse transcription polymerase chain reaction (real-time RT-PCR). Typically, the cycle threshold (Ct) values of the two or three target genes within the same sample are similar when the same gene detection reagent is used; however, markedly different Ct values between target genes were observed in some recently confirmed cases of COVID-19 in the Gyeongbuk region, despite the use of the same reagent.
Based on these observations, this study used whole genome analysis to identify the sublineages and mutations of the COVID-19 virus from confirmed patients whose target genes showed discordant Ct values in the detection test. In addition, the effect of the identified mutations on the gene detection test was analyzed in silico, to serve as a reference when selecting a genetic diagnostic reagent.
Methods
1. Target for Whole Genome Analysis
Clinical samples (oral and nasopharyngeal swabs) were obtained from 13 patients with suspected COVID-19 who visited screening clinics (public health centers) in Gunwi-gun, Cheongdo-gun, and Uiseong-gun in the Gyeongbuk region, following direct contact with a patient diagnosed with COVID-19 between February and March 2022.
2. COVID-19 Diagnostic Test
Viral RNA was extracted from the clinical samples of the 13 patients confirmed with COVID-19. A diagnostic reagent from a domestic manufacturer (Company B), which simultaneously detects the E and ORF1ab genes of the COVID-19 virus, was used to identify positive cases. Positive results were determined by real-time RT-PCR (Applied Biosystems™ 7500 Fast Real-Time PCR System, ThermoFisher Scientific) according to the reagent manufacturer's criteria, including those for the internal control (IC).
3. Whole Genome Analysis and Registration of GISAID
Viral RNA extracted from patients with COVID-19 was quantified using the TaqMan™ 2019 nCoV Assay kit v1 (Applied Biosystems™), and whole genome analysis was conducted using whole genome analysis reagents (ThermoFisher Scientific), the Genexus™ Integrated Sequencer (Ion Torrent Genexus System, ThermoFisher Scientific), and the Ion AmpliSeq™ SARS-CoV-2 Insight Research Assay GX. The sublineage and genome sequence of each COVID-19 genome obtained were analyzed using SARS-CoV-2-Pangolin and CLC Main Workbench (Version 21.0.3, QIAGEN). The full-length genome sequences obtained were registered in the Global Initiative on Sharing Avian Influenza Data (GISAID) [3,4].
4. In silico Analysis
An in silico analysis was performed by the Division of Emerging Infectious Diseases, Korea Disease Control and Prevention Agency, to determine whether the COVID-19 viral genes obtained from the 13 confirmed patients could be detected using five domestically licensed and commercially available diagnostic reagents (A, B, C, D, and E).
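The program used for this analysis is not described here, so the following Python sketch only illustrates the general idea behind such a check: slide a primer or probe sequence along a target sequence without gaps and count mismatches at the best position. The sequences are invented toy strings (not SARS-CoV-2 sequences or real kit primers), and the function names and the two-mismatch tolerance are assumptions made for illustration.

```python
from typing import Tuple


def best_match(oligo: str, target: str) -> Tuple[int, int]:
    """Return (position, mismatches) of the ungapped placement of `oligo`
    on `target` that has the fewest mismatches."""
    best_pos, best_mm = -1, len(oligo) + 1
    for pos in range(len(target) - len(oligo) + 1):
        window = target[pos:pos + len(oligo)]
        mm = sum(a != b for a, b in zip(oligo, window))
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def predicted_detectable(oligo: str, target: str, max_mismatches: int = 2) -> bool:
    """Assumed rule: binding is predicted if at most `max_mismatches` mismatches."""
    return best_match(oligo, target)[1] <= max_mismatches


# Toy data (hypothetical): a reference target, a point-mutated copy,
# and a copy with an 8-base deletion inside the binding site.
REFERENCE = "ATGGCTAGCTTACGGATCCGATTACGCTAGGCTA"
OLIGO = "TACGGATCCGATTACG"
POINT_MUTANT = REFERENCE[:15] + "G" + REFERENCE[16:]
DELETION_MUTANT = REFERENCE[:14] + REFERENCE[22:]

for name, seq in [("reference", REFERENCE),
                  ("point mutant", POINT_MUTANT),
                  ("deletion mutant", DELETION_MUTANT)]:
    pos, mm = best_match(OLIGO, seq)
    print(name, "mismatches:", mm, "detectable:", predicted_detectable(OLIGO, seq))
```

In this toy case a single substitution stays within the assumed tolerance, whereas a deletion inside the binding site does not. Real assay design tools also weigh the position of each mismatch, melting temperature, and probe chemistry, none of which is modeled here.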
Results
1. COVID-19 Diagnostic Test
Of the 13 patients confirmed with COVID-19 by the gene detection test, 9 were tested at the Gunwi-gun Public Health Center, 1 at the Cheongdo-gun Public Health Center, and 3 at the Uiseong-gun Public Health Center in Gyeongsangbuk-do. The study included 6 males and 7 females aged 9–65 years, with a mean age of 33 years. When gene detection was performed using the reagent from Company B, six samples from Gunwi-gun (numbers 1 to 6) showed no amplification of the E gene or a markedly higher Ct value for the E gene than for the ORF1ab gene. In the remaining seven samples (three from Gunwi-gun, one from Cheongdo-gun, and three from Uiseong-gun; numbers 7 to 13), the ORF1ab gene was not amplified or its Ct value was higher than that of the E gene (Table 1).
Table 1. Characteristics of Ct values of SARS-CoV-2 in Gyeongbuk region.
| No. | Age | Sex | Region | Ct value (ORF1ab) | Ct value (E) | GISAID accession number | Characteristics |
|---|---|---|---|---|---|---|---|
| 1 | 65 | F | GW | 18.08 | - | hCoV-19/South_Korea/KDCA33801/2022 | E gene not detected or E gene Ct value increased |
| 2 | 55 | M | GW | 23.1 | 34.31 | hCoV-19/South_Korea/KDCA33847/2022 | E gene not detected or E gene Ct value increased |
| 3 | 56 | M | GW | 23.68 | 31.37 | hCoV-19/South_Korea/KDCA33848/2022 | E gene not detected or E gene Ct value increased |
| 4 | 43 | F | GW | 20.52 | 26.58 | hCoV-19/South_Korea/KDCA33846/2022 | E gene not detected or E gene Ct value increased |
| 5 | 35 | F | GW | 20.05 | 28.65 | hCoV-19/South_Korea/KDCA33845/2022 | E gene not detected or E gene Ct value increased |
| 6 | 9 | F | GW | 21.24 | 31.23 | hCoV-19/South_Korea/KDCA33844/2022 | E gene not detected or E gene Ct value increased |
| 7 | 38 | F | GW | 19.15 | 16.61 | hCoV-19/South_Korea/KDCA42754/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 8 | 10 | M | GW | 19.07 | 16.71 | hCoV-19/South_Korea/KDCA42756/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 9 | 39 | M | GW | 22.06 | 19.11 | hCoV-19/South_Korea/KDCA42800/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 10 | 39 | F | CD | 19.36 | 16.46 | hCoV-19/South_Korea/KDCA42801/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 11 | 12 | M | US | - | 20.06 | hCoV-19/South_Korea/KDCA43171/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 12 | 14 | M | US | 38.23 | 19.01 | hCoV-19/South_Korea/KDCA43172/2022 | ORF1ab not detected or ORF1ab Ct value increased |
| 13 | 14 | F | US | 38.14 | 18.91 | hCoV-19/South_Korea/KDCA43173/2022 | ORF1ab not detected or ORF1ab Ct value increased |
Ct=cycle threshold; SARS-CoV-2=severe acute respiratory syndrome coronavirus 2; GISAID=Global Initiative on Sharing Avian Influenza Data; -=not detected; GW=Gunwi-gun; CD=Cheongdo-gun; US=Uiseong-gun.
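The grouping in Table 1 can be reproduced by comparing the two Ct values of each sample. The Python sketch below uses the Ct values transcribed from Table 1; the function name `classify`, the use of `None` for a non-detected target, and the 2-cycle difference threshold are illustrative assumptions and not criteria applied in this study.

```python
from typing import Optional

# (ORF1ab Ct, E Ct) per sample number, transcribed from Table 1.
# None means the target was not detected.
CT_VALUES = {
    1: (18.08, None), 2: (23.10, 34.31), 3: (23.68, 31.37),
    4: (20.52, 26.58), 5: (20.05, 28.65), 6: (21.24, 31.23),
    7: (19.15, 16.61), 8: (19.07, 16.71), 9: (22.06, 19.11),
    10: (19.36, 16.46), 11: (None, 20.06), 12: (38.23, 19.01),
    13: (38.14, 18.91),
}


def classify(orf1ab: Optional[float], e: Optional[float],
             threshold: float = 2.0) -> str:
    """Label which target is missing or delayed relative to the other.

    `threshold` (in cycles) is an assumed cut-off for calling a delay.
    """
    if orf1ab is None and e is None:
        return "both not detected"
    if e is None:
        return "E affected"
    if orf1ab is None:
        return "ORF1ab affected"
    diff = e - orf1ab  # positive: E amplifies later than ORF1ab
    if diff >= threshold:
        return "E affected"
    if diff <= -threshold:
        return "ORF1ab affected"
    return "concordant"


for sample, (orf1ab, e) in CT_VALUES.items():
    print(sample, classify(orf1ab, e))
```

With this assumed threshold, samples 1 to 6 fall into the E-affected group and samples 7 to 13 into the ORF1ab-affected group, matching the Characteristics column.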
2. Whole Genome Analysis
Whole genome analysis of the COVID-19 virus confirmed that both BA.1 and BA.2 strains were circulating simultaneously in Gyeongsangbuk-do during February–March 2022. The full-length sequences of the COVID-19 viral genome obtained from the 13 patients in Gyeongbuk Province showed a coverage of at least 97% and a depth of ≥2,600×. All sublineages belonged to the omicron variant, with 5, 4, and 4 cases of BA.1.1, BA.2, and BA.2.3, respectively (Pangolin version 4.1.3). The 13 full-length genome sequences obtained were registered in GISAID (Table 2).
Table 2. Distribution of SARS-CoV-2 variants and main amino acid mutations of COVID-19 patients.
| No. | GISAID clade | Pango lineage | Pangolin version | Target region | Main amino acid mutations of COVID-19 patients |
|---|---|---|---|---|---|
| 1 | GRA | BA.2.3 | 4.1.3 | E gene | T9I, R61C |
| 2 | GRA | BA.1.1 | 4.1.3 | E gene | T9I, R61H |
| 3 | GRA | BA.1.1 | 4.1.3 | E gene | T9I, R61H |
| 4 | GRA | BA.1.1 | 4.1.3 | E gene | T9I, R61H |
| 5 | GRA | BA.1.1 | 4.1.3 | E gene | T9I, R61H |
| 6 | GRA | BA.1.1 | 4.1.3 | E gene | T9I, R61H |
| 7 | GRA | BA.2 | 4.1.3 | NSP3 | T24I, L142W, G171V, G489S |
| 8 | GRA | BA.2 | 4.1.3 | NSP3 | T24I, L142W, G171V, G489S |
| 9 | GRA | BA.2 | 4.1.3 | NSP3 | T24I, L142W, G489S |
| 10 | GRA | BA.2 | 4.1.3 | NSP3 | T24I, F25L, E26N, D28M, E29K, R30G, I31L, D32I, V34Y, N36M, E37R, K38del, C39del, Y42L, T43P, V44I, E45Q, G47N, T48S, L142W, G171V, G489S |
| 11 | GRA | BA.2.3 | 4.1.3 | NSP3 | 166–178 deletion |
| 12 | GRA | BA.2.3 | 4.1.3 | NSP3 | 166–178 deletion |
| 13 | GRA | BA.2.3 | 4.1.3 | NSP3 | 166–178 deletion |
SARS-CoV-2=severe acute respiratory syndrome coronavirus 2; COVID-19=coronavirus disease 2019; GISAID=Global Initiative on Sharing Avian Influenza Data; C=cysteine; D=aspartic acid; del=deletion; E=glutamic acid; F=phenylalanine; G=glycine; I=isoleucine; K=lysine; L=leucine; M=methionine; N=asparagine; NSP3=multi-domain non-structural protein 3; P=proline; Q=glutamine; R=arginine; S=serine; T=threonine; V=valine; W=tryptophan; Y=tyrosine.
A common T9I mutation was identified in the E gene region using whole genome analysis of COVID-19 viral samples obtained from the six confirmed cases (No. 1–6), in which the E gene was not amplified or the Ct value was increased compared to the Ct value of the ORF1ab gene. Among these, R61C mutation was confirmed in case 1 (BA.2.3) and R61H mutation was confirmed in cases 2 to 6 (BA.1.1). In contrast, whole genome analysis of COVID-19 viral samples obtained from cases 7 to 10 (BA.2), whose ORF1ab gene was not amplified or Ct value was increased, revealed that T24I, L142W, and G489S mutations were commonly identified in the NSP3 gene region. Moreover, several additional mutations were identified in case 10 (BA.2). In cases 11 to 13 (BA.2.3), consecutive gene deletions (166 to 178) were detected in the NSP3 gene (Table 2).
In the whole genome analysis, at least 97% of total reads mapped to the reference and the coverage depth (CD) was at least 2,600×, indicating that the sequence data were of good quality. The major mutations (R61C and R61H in the E gene, and L142W and G171V in the NSP3 gene) were identified with a CD of ≥1,121× and a mutation frequency of ≥93% (Table 3).
Table 3. Characteristics of SARS-CoV-2 whole genome analysis and frequency of gene mutation on E and NSP3 region.
| No. | Mapped reads | Consensus | Reads mapped to reference/total reads (%) | Depth of coverage | Genome coverage (%) | E gene R61C coverage | E gene R61C frequency (%) | E gene R61H coverage | E gene R61H frequency (%) | NSP3 L142W coverage | NSP3 L142W frequency (%) | NSP3 G171V coverage | NSP3 G171V frequency (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 606,419 | 29,849 | 99.50 | 3,042 | 99.80 | 1,788 | 99.2 | - | - | - | - | - | - |
| 2 | 1,554,483 | 29,855 | 98.70 | 7,798 | 99.80 | - | - | 2,684 | 93.3 | - | - | - | - |
| 3 | 1,482,940 | 29,855 | 98.80 | 7,439 | 99.80 | - | - | 3,564 | 98.5 | - | - | - | - |
| 4 | 1,398,424 | 29,822 | 98.40 | 7,015 | 99.70 | - | - | 2,038 | 98.6 | - | - | - | - |
| 5 | 1,883,778 | 29,855 | 98.40 | 9,449 | 99.80 | - | - | 3,172 | 98.2 | - | - | - | - |
| 6 | 1,513,983 | 29,855 | 98.40 | 7,595 | 99.80 | - | - | 2,930 | 98.3 | - | - | - | - |
| 7 | 1,316,138 | 29,603 | 98.80 | 6,602 | 99.00 | - | - | - | - | 3,963 | 97.3 | 1,121 | 98.7 |
| 8 | 1,119,320 | 29,795 | 98.90 | 5,615 | 99.60 | - | - | - | - | 4,125 | 97.7 | 1,258 | 98.3 |
| 9 | 1,091,099 | 29,710 | 98.40 | 5,473 | 99.40 | - | - | - | - | 3,993 | 98 | - | - |
| 10 | 3,570,992 | 29,843 | 98.80 | 17,913 | 99.80 | - | - | - | - | 18,048 | 97.8 | 5,822 | 98.5 |
| 11 | 528,461 | 29,212 | 97.90 | 2,651 | 97.70 | - | - | - | - | - | - | - | - |
| 12 | 1,332,872 | 29,549 | 98.40 | 6,686 | 98.80 | - | - | - | - | - | - | - | - |
| 13 | 1,371,159 | 29,515 | 99.00 | 6,878 | 98.70 | - | - | - | - | - | - | - | - |
SARS-CoV-2=severe acute respiratory syndrome coronavirus 2; -=not applicable; C=cysteine; G=glycine; H=histidine; L=leucine; R=arginine; V=valine; W=tryptophan.
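The lower bounds quoted for the major mutations (coverage ≥1,121× and frequency ≥93%) can be checked directly from Table 3. The short Python sketch below transcribes the per-mutation coverage and frequency values from the table; the tuple layout and the function name `summarize` are assumptions made for illustration.

```python
# (sample, mutation, coverage, frequency %) transcribed from Table 3.
CALLS = [
    (1, "E:R61C", 1788, 99.2),
    (2, "E:R61H", 2684, 93.3),
    (3, "E:R61H", 3564, 98.5),
    (4, "E:R61H", 2038, 98.6),
    (5, "E:R61H", 3172, 98.2),
    (6, "E:R61H", 2930, 98.3),
    (7, "NSP3:L142W", 3963, 97.3),
    (7, "NSP3:G171V", 1121, 98.7),
    (8, "NSP3:L142W", 4125, 97.7),
    (8, "NSP3:G171V", 1258, 98.3),
    (9, "NSP3:L142W", 3993, 98.0),
    (10, "NSP3:L142W", 18048, 97.8),
    (10, "NSP3:G171V", 5822, 98.5),
]


def summarize(calls):
    """Return the lowest coverage and lowest frequency among the calls."""
    min_coverage = min(c[2] for c in calls)
    min_frequency = min(c[3] for c in calls)
    return min_coverage, min_frequency


min_cov, min_freq = summarize(CALLS)
print(f"minimum coverage: {min_cov}x, minimum frequency: {min_freq}%")
# prints "minimum coverage: 1121x, minimum frequency: 93.3%"
```

The lowest coverage comes from G171V in sample 7 and the lowest frequency from R61H in sample 2.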
3. In Silico Analysis of the Target Gene
Based on the 13 full-length genome sequences obtained, mutations were examined at the target gene detection sites of five diagnostic reagents commercially available in the ROK, and an in silico analysis was performed to predict whether each target could still be detected by each reagent. With the reagent from Company B, the E gene was predicted to be detectable despite its mutations, whereas the ORF1ab gene was predicted to be undetectable in cases 7–10, which carried NSP3 mutations, and in cases 11–13, which carried a partial deletion in NSP3. In addition, the S gene was predicted to be difficult to detect with the reagent from Company A, and the E gene with the reagent from Company E (Table 4).
Table 4. Comparison of SARS-CoV-2 diagnostic reagents using software-based analysis.
| Case No. | Type of kit | Target gene | Detectable targets | Undetectable target | Kit interpretation (positive) |
|---|---|---|---|---|---|
| 1–6 | A | RdRp, S, E, N | RdRp, E, N | S | Ct≤40 |
| 1–6 | B | ORF1ab, E | ORF1ab, E | None | Ct≤38 |
| 1–6 | C | RdRp, E | RdRp, E | None | Ct≤36 |
| 1–6 | D | RdRp, E | RdRp, E | None | Ct≤38 |
| 1–6 | E | ORF1ab, S, E, N | ORF1ab, S, N | E | Ct≤38 |
| 7–10 | A | RdRp, S, E, N | RdRp, E, N | S | Ct≤40 |
| 7–10 | B | ORF1ab, E | E | ORF1ab | Ct≤38 |
| 7–10 | C | RdRp, E | RdRp, E | None | Ct≤36 |
| 7–10 | D | RdRp, E | RdRp, E | None | Ct≤38 |
| 7–10 | E | ORF1ab, S, E, N | ORF1ab, S, N | E | Ct≤38 |
| 11–13 | A | RdRp, S, E, N | RdRp, E, N | S | Ct≤40 |
| 11–13 | B | ORF1ab, E | E | ORF1ab | Ct≤38 |
| 11–13 | C | RdRp, E | RdRp, E | None | Ct≤36 |
| 11–13 | D | RdRp, E | RdRp, E | None | Ct≤38 |
| 11–13 | E | ORF1ab, S, E, N | ORF1ab, S, N | E | Ct≤38 |
SARS-CoV-2=severe acute respiratory syndrome coronavirus 2; E=envelope; N=nucleoprotein; ORF1ab=open reading frame 1ab; RdRp=RNA-dependent RNA polymerase; S=spike; Ct=cycle threshold.
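The predictions in Table 4 can be encoded as a small lookup to list, for each case group, the reagents for which every target is predicted to be detectable. The following Python sketch is only an illustration of that lookup; the dictionary layout and the function names are assumptions, and the entries are transcribed from the "Undetectable target" column.

```python
# Case group -> kit -> set of targets predicted to be undetectable (Table 4).
UNDETECTABLE = {
    "1-6":   {"A": {"S"}, "B": set(),      "C": set(), "D": set(), "E": {"E"}},
    "7-10":  {"A": {"S"}, "B": {"ORF1ab"}, "C": set(), "D": set(), "E": {"E"}},
    "11-13": {"A": {"S"}, "B": {"ORF1ab"}, "C": set(), "D": set(), "E": {"E"}},
}


def fully_detecting_kits(group: str) -> list:
    """Kits with no predicted undetectable target for one case group."""
    return sorted(kit for kit, missed in UNDETECTABLE[group].items() if not missed)


def kits_for_all_groups() -> list:
    """Kits with no predicted undetectable target in any case group."""
    per_group = [set(fully_detecting_kits(g)) for g in UNDETECTABLE]
    return sorted(set.intersection(*per_group))


for group in UNDETECTABLE:
    print(group, fully_detecting_kits(group))
print("all groups:", kits_for_all_groups())
```

Under these in silico predictions, only reagents C and D retain all of their targets across the 13 cases, while reagent B does so only for cases 1–6.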
Discussion
Using full-length genomic analysis of the COVID-19 virus from clinical samples of patients with COVID-19, the sub-strains of the variants and the predominant strain in the ROK were analyzed, and the sites of genetic mutations were identified across the entire viral genome. The in silico analysis based on the full-length genome information showed that the R61C or R61H mutation in the E gene did not affect genetic diagnosis, because it was not located at the target site of the Company B diagnostic reagent examined in this study. This indicates that the reduced detection sensitivity for the E gene may be due to a region other than the examined E gene target or to other causes. In addition, the NSP3 L142W mutation in the ORF1ab gene identified in some samples was associated with an increased Ct value of the ORF1ab gene when the Company B reagent was used. Furthermore, in samples carrying the deletion in the 166–178 region of the ORF1ab gene, the ORF1ab gene was not detected with the same reagent.
Garg et al. [6] performed real-time RT-PCR on five clinical samples with seven types of COVID-19 diagnostic reagents, using a gene detection test method they developed, and reported that weakly positive samples were sometimes reported as negative depending on the type of reagent used. In another study, 354 patients hospitalized with COVID-19 were randomly selected to compare Ct values obtained by real-time RT-PCR with three diagnostic reagents (Sansure Biotech, GeneFinder™, TaqPath™) using the same PCR equipment and nucleic acid extraction conditions. No statistically significant difference was observed between the detection reagents, although the reagent from Sansure Biotech performed slightly better. For improvement, the authors suggested determining cut-off Ct values and developing primers rapidly as new mutations emerge [7]. A Dutch research team identified the detection values of ORF1ab and E gene as 20.7 and 30.2, respectively, using a diagnostic reagent that detects two target genes; they reported five unique mutations in the ORF1ab region of the virus, emphasizing the importance of analyzing the target gene region for molecular diagnosis [8].
Since the diagnostic reagents compared in this study each have two or more target genes, the influence of specific genetic mutations on diagnostic tests can be minimized by using a diagnostic reagent with different targets when unexpected results are obtained from one reagent (such as when only one of the target genes is determined to be positive). The diagnostic reagents approved by the Ministry of Food and Drug Safety disclose their target genes and criteria [9], so reagents with fewer mutations in their target genes can be selected based on the gene mutation information of the COVID-19 virus obtained using whole genome analysis (Table 4). Furthermore, because various mutations with unpredictable mutation and deletion patterns have occurred in the COVID-19 omicron strain, continuous monitoring of emerging genetic mutations using whole genome analysis of the COVID-19 virus may be used to improve the molecular diagnostics for COVID-19.
Acknowledgments
We thank the staff of the Division of Emerging Infectious Diseases, Korea Disease Control and Prevention Agency, for the additional in silico analysis and the analysis of gene mutation sites based on the full-length genome sequences of SARS-CoV-2.
Declarations
Ethics Statement: Not applicable.
Funding Source: None.
Conflict of Interest: The authors have no conflicts of interest to declare.
Author Contributions: Conceptualization: HSY, WYC. Data curation: HSY, GRM. Formal analysis: HJK, SHO. Investigation: JJP, HGC, YPL. Methodology: HSY, HGC. Resources: JJP, CIL. Software: HGC, SHO. Supervision: WYC, CKS. Validation: HSY. Visualization: HSY. Writing – original draft: HSY. Writing – review & editing: WYC.
REFERENCES
- 1. World Health Organization. Weekly epidemiological update on COVID-19 - 20 April 2021 [Internet]. World Health Organization; 2021. [cited 2022 May 9]. Available from: https://www.who.int/publications/m/item/weekly-epidemiological-update-on-covid-19---20-april-2021
- 2. Pango Network. Tracking SARS-CoV-2 variants [Internet]. Pango Network; 2022. [cited 2022 May 9]. Available from: https://www.pango.network/
- 3. GISAID. Global Initiative on Sharing Avian Influenza Data [Internet]. GISAID; 2022. [cited 2022 Jul 25]. Available from: https://gisaid.org/
- 4. Pango Lineages [Internet]. Cov-lineages.org; 2022. [cited 2022 Nov 15]. Available from: https://cov-lineages.org
- 5. Korea Disease Control and Prevention Agency. Press release (November 30, 2022). 2022. Available from: https://www.kdca.go.kr/board/board.es?mid=a20501020000&bid=0015&list_no=721290&cg_code=C01&act=view&nPage=6
- 6. Garg A, Ghoshal U, Patel SS, et al. Evaluation of seven commercial RT-PCR kits for COVID-19 testing in pooled clinical specimens. J Med Virol. 2021;93:2281–6. doi: 10.1002/jmv.26691.
- 7. Banko A, Petrovic G, Miljanovic D, et al. Comparison and sensitivity evaluation of three different commercial real-time quantitative PCR kits for SARS-CoV-2 detection. Viruses. 2021;13:1321. doi: 10.3390/v13071321.
- 8. Popping S, Molenkamp R, Weigel JD, et al. Diminished amplification of SARS-CoV-2 ORF1ab in a commercial dual-target qRT-PCR diagnostic assay. J Virol Methods. 2022;300:114397. doi: 10.1016/j.jviromet.2021.114397.
- 9. Ministry of Food and Drug Safety. Medical device information portal [Internet]. Ministry of Food and Drug Safety; 2022. [cited 2022 Oct 11]. Available from: https://udiportal.mfds.go.kr/search/data/P02_01#list
