Heliyon. 2021 Aug 21;7(8):e07866. doi: 10.1016/j.heliyon.2021.e07866

Molecular characterization of SARS-CoV-2 from Bangladesh: implications in genetic diversity, possible origin of the virus, and functional significance of the mutations

Md Marufur Rahman a, Shirmin Bintay Kader b, SM Shahriar Rizvi c,
PMCID: PMC8380069  PMID: 34458642

Abstract

In an effort to understand the pathogenesis, evolution and epidemiology of the SARS-CoV-2 virus, scientists from all over the world are tracking its genomic changes in real time. Genomic studies can be helpful in understanding the disease dynamics. We downloaded 324 complete and near-complete SARS-CoV-2 genomes submitted to the GISAID database from Bangladesh, which were isolated between 30 March and 7 September, 2020. We then compared these genomes with the Wuhan reference sequence and found 4160 mutation events including 2253 missense single nucleotide variations, 38 deletions and 10 insertions. The C>T nucleotide change was the most prevalent (41% of all mutations), possibly due to selective mutation pressure to reduce CpG sites and thereby evade the CpG-targeted host immune response. The most frequent mutation, which occurred in 98% of the isolates, was 3037C>T, a synonymous change that usually accompanied 3 other mutations: 241C>T, 14408C>T (P323L in RdRp) and 23403A>G (D614G in spike protein). P323L has been reported to increase the mutation rate, and D614G is associated with increased viral replication and is currently the most prevalent variant circulating all over the world. We identified multiple missense mutations in predicted B-cell and T-cell epitope regions and/or PCR target regions (including R203K and G204R, which occurred in 86% of the isolates) that may impact immunogenicity and/or RT-PCR based diagnosis. Our analysis revealed 5 large deletion events in the ORF7a and ORF8 gene products that may be associated with lower disease severity and increased viral clearance. Our phylogeny analysis showed that most of the isolates belonged to the Nextstrain clade 20B (86%) and GISAID clade GR (88%). Most of our isolates shared common ancestors either directly with European countries or jointly with Middle Eastern countries as well as Australia and India. Interestingly, the 19B clade (GISAID S clade), which was originally prevalent in China, was unique to Chittagong. This suggests multiple possible introductions of the virus into Bangladesh via different routes. Hence, more genome sequencing and analysis with related clinical data is needed to interpret functional significance and better predict the disease dynamics, which may be helpful for policy makers to control the COVID-19 pandemic.

Keywords: SARS-CoV-2, Mutation, Phylogeny, COVID-19, Bangladesh, D614G, Genome sequence



1. Introduction

The world is suffering from COVID-19, a devastating pandemic caused by a novel coronavirus that originated in Wuhan, China (Zhou et al., 2020). Meanwhile, scientists from all over the world are trying to understand the virus better using various approaches, including genome sequencing. The first reported complete genome sequence was identified on January 3, 2020 (Tan et al., 2020). More than 141 thousand genomic sequences of SARS-CoV-2 have been submitted to the Global Initiative on Sharing All Influenza Data (GISAID) database (Elbe and Buckland-Merrett, 2017, GISAID-Initiative, 2020).

The genomic sequences revealed that the length of the SARS-CoV-2 viral genome is ~30 kb. The longest part of the genome, at the 5’ end, encodes the ORF1ab polyprotein, whereas the rest of the genome consists of genes encoding four structural proteins, namely surface (S), envelope (E), membrane (M) and nucleocapsid (N), as well as accessory proteins and other non-structural proteins (NSP) encoded by the ORF3a, ORF6, ORF7a, ORF7b, ORF8 and ORF10 genes (Khailany RA and Ozaslan, 2020). The initial assessment of 3 clades indicates distinct geographic distribution. Based on amino acid changes, Forster et al. reported three central variants (A, B, C) of SARS-CoV-2, with A and C being the most common types in Europe and the USA and B being the most common type in East Asia (Forster et al., 2020). Pachetti et al. identified multiple mutation hotspots with geographic location specificity. They identified mutations in the RNA dependent RNA polymerase (RdRp) gene, which are important because the RdRp protein is the target of some proposed antiviral drugs and mutations in the gene may help the virus escape from those drugs (Pachetti et al., 2020). Yao et al. identified pathogenic variation that depends on specific SNVs in the Spike glycoprotein (S), changing viral load and cytopathic effects by up to 270-fold (Yao et al., 2020). Deletions in viral genomes are also a common phenomenon and are sometimes related to the severity of disease (Armengaud et al., 2020; Holmes, 2009; Pachetti et al., 2020). Still, there is a lack of studies integrating all the deletions across the whole genome of SARS-CoV-2 globally. Such studies may contribute to understanding the pathogenic dynamics of the virus over time. The genetic differences among SARS-CoV-2 strains from different locations can be linked with their geographical distributions (Islam et al., 2020).

The clade and lineage nomenclatures for SARS-CoV-2 are changing rapidly. Specific combinations of 9 genetic markers show that 95% of the hCoV-19 data in GISAID can be further classified into 6 major clades named S, L, V, G, GH and GR (Khailany RA and Ozaslan, 2020). Initially the virus was classified into 2, then further into 3 super clades (Forster et al., 2020; Pachetti et al., 2020). A team of scientists has developed an open source bioinformatics and visualization toolkit named Nextstrain (www.nextstrain.org) for real-time tracking of pathogen evolution, including that of SARS-CoV-2 (nextstrain-github, 2020). Their clade nomenclature is different from but complementary to that of Rambaut et al., 2020 (GISAID-clade, 2020).

The virus was first reported in Bangladesh on March 8, 2020, when the first 3 cases were identified at the Institute of Epidemiology, Disease Control and Research (IEDCR), Dhaka. Currently the country is at the community spread stage, with about 379,738 total infections and 5,555 reported deaths from COVID-19 as of 12 October, 2020 (IEDCR, 2020; worldometer, 2020).

It is important to obtain more sequences from all over the world, which will help us better understand the evolution pattern, disease dynamics and phylogeographic distribution of the clades, and support the design of drugs and vaccines. In this paper we tried to determine the phylogenetic relationship of Bangladeshi isolates with other isolates from around the world. This can help us infer the routes by which the virus travelled into Bangladesh as well as to other parts of the world. We also tried to identify the specific mutational differences in Bangladeshi isolates compared to the reference sequence and whether any clinico-pathological significance is associated with those mutations.

2. Materials and methods

2.1. Local sequence retrieval

To retrieve genome sequences from Bangladesh, we searched the GISAID database using the search term “Bangladesh” as location. We downloaded all the relevant sequences from the search results in FASTA format and also downloaded the patient status metadata, sequencing technology metadata and acknowledgement table separately (accessed on 30 August, 2020).

2.2. Mutation analysis

We used Genome Detective Coronavirus Typing Tool version 1.13 and CoVsurver enabled by GISAID to analyse our query sequences in FASTA format (Cleemput et al., 2020; A∗STAR-Bioinformatics-Institute, 2020). These are freely available online bioinformatic tools that are validated to identify and reassemble novel coronavirus isolates. Using these tools, we identified both nucleotide and amino acid mutations and similarities compared to the SARS-CoV-2 (NCBI Taxonomy ID: 2697049) reference sequences NC_045512 (NCBI) and EPI_ISL_402124 (GISAID).
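The comparison against a reference can be illustrated with a minimal Python sketch. It is not the algorithm used by Genome Detective or CoVsurver; it assumes the query has already been pairwise aligned to the reference (gaps written as `-`), and the function name `call_variants` and the output notation are hypothetical choices made for this illustration.

```python
def call_variants(ref_aln: str, qry_aln: str) -> list[str]:
    """List SNVs, deletions and insertions between two aligned sequences.

    Both strings must have the same length and use '-' for gaps.
    Positions are 1-based and refer to the ungapped reference.
    Illustrative only: ambiguous bases (e.g. N) are skipped.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    ref_pos = 0  # last reference position consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i].upper(), qry_aln[i].upper()
        if r == "-":
            # Insertion: query bases with no reference counterpart.
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            variants.append(f"{ref_pos}ins{qry_aln[i:j].upper()}")
            i = j
            continue
        ref_pos += 1
        if q == "-":
            # Deletion: consecutive reference bases missing from the query.
            j = i
            while j < n and qry_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            length = j - i
            variants.append(f"{ref_pos}del{length}")
            ref_pos += length - 1
            i = j
            continue
        if r in "ACGT" and q in "ACGT" and r != q:
            variants.append(f"{ref_pos}{r}>{q}")  # single nucleotide variation
        i += 1
    return variants


# Toy alignment, not real SARS-CoV-2 sequence
print(call_variants("ACGTAC-GT", "ATGT--AGT"))
```

On real data the alignment step and the handling of low-quality bases matter at least as much as this comparison loop, which is why validated tools were used in the study.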

For functional prediction of mutational changes, we used two web-based tools, namely SIFT (Sorting Intolerant From Tolerant) and MutPred2 (Ng and Henikoff, 2003; Pejaver et al., 2017). We also used the UCSC SARS-CoV-2 Genome Browser (https://genome.ucsc.edu/cgi-bin/hgGateway?db=wuhCor1) to align mutations along base positions and functionally significant areas of the genome (Fernandes et al., 2020). A support vector machine (SVM) based tool, i-Mutant 3.0, was used to predict the change in protein stability (ΔΔG) caused by specific mutations (Capriotti et al., 2005).

2.3. Phylogeny analysis

We used the open source bioinformatics visualization platform Nextstrain (nextstrain.org) for phylogeny analysis of our sequences (Hadfield et al., 2018). Pairwise sequence alignment and clade assignment were done using the web-based Nextstrain tool Nextclade beta version 0.4.9 (Nextclade, 2020). The GISAID clade identification of the sequences was done using the GISAID CoVsurver tool. Further detailed information on different clusters was derived from a preformed interactive web-based tree developed from a subsample of global sequences (~5000) by the neighbour joining method in the Nextstrain web interface (https://nextstrain.org/ncov/global, date accessed on 30 August, 2020).

3. Results

In the GISAID database we found 329 submissions from Bangladesh (accessed on 30 August, 2020). Out of the 329 submissions, 324 were complete or near-complete genomes. Among these genomes, 102 were isolated from female patients, 220 were from male patients and for 2 isolates the gender was not reported. The age of the sampled patients ranged from 8 days to 95 years and the median age was 38 years. The submissions came from a total of 9 laboratories, and different sequencing platforms were used by different laboratories. A quality control (QC) analysis was done in which 7 sequences were flagged as “bad” (private mutation cut-off was set at 20) and 6 sequences were reported to have more than 5 ambiguous mutations.

3.1. Mutation analysis with functional significance

Variation analysis of the 324 genome isolates revealed a total of 4160 mutation events, of which 4112 were Single Nucleotide Variations (SNVs), 38 were deletions (in 30 isolates) and 10 were insertions (in 10 isolates). Among the SNVs, 2253 were missense mutations in coding regions, 1216 were synonymous mutations and the rest were in the non-coding part of the genome (Table 1). Among all the SNVs, the most common change was C>T (~41%) and the second most prevalent change was G>A (~16%). There were 5 large deletions (>50 nucleotides), of which three resulted in deletion of a large portion of the ORF7a gene and another two deleted the ORF8 gene. The highest number of mutation events (including all SNVs and indels) observed in one isolate (EPI_ISL_445217) was 59 and the lowest was zero, as one isolate (EPI_ISL_458133) was identical to the reference genome (NC_045512.3) with 99.4% genome coverage. The average number of mutation events was approximately 13 per isolate.
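As a hedged illustration of how a substitution spectrum such as the C>T share can be tallied, the sketch below counts substitution classes from SNV labels written in the notation used in this paper (for example `3037C>T`). The helper name `substitution_spectrum` and the four-item input are illustrative assumptions, not the study's pipeline or its full data set.

```python
import re
from collections import Counter

# SNV labels such as "3037C>T"; stray spaces (e.g. "28883G >C") are tolerated
SNV_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def substitution_spectrum(snvs: list[str]) -> dict[str, float]:
    """Return the fraction of each substitution class (e.g. 'C>T')."""
    counts = Counter()
    for snv in snvs:
        match = SNV_PATTERN.match(snv.replace(" ", ""))
        if match is None:
            raise ValueError(f"unrecognised SNV label: {snv!r}")
        counts[f"{match.group(2)}>{match.group(3)}"] += 1
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.most_common()}


# Illustrative input: the four co-occurring mutations named in the abstract
spectrum = substitution_spectrum(["241C>T", "3037C>T", "14408C>T", "23403A>G"])
print(spectrum)

# Consistency check on the totals reported in the text
snv_events, deletions, insertions, isolates = 4112, 38, 10, 324
total_events = snv_events + deletions + insertions
print(total_events, round(total_events / isolates, 1))
```

The last two lines only re-derive figures already stated above: the three event types sum to 4160, and dividing by 324 isolates gives about 12.8 events per isolate, reported as approximately 13.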

Table 1.

Distribution of mutation events along different genomic regions of SARS-CoV-2 Bangladeshi isolates.

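| Genome segment | Base position | Total mutations | Missense mutations | Synonymous mutations | Deletions∗∗ | Insertions |
|---|---|---|---|---|---|---|
| Coding region | | | | | | |
| ORF1ab | 266–21555 | 1796 | 1021 | 775 | 2 | 0 |
| S | 21563–25384 | 528 | 416 | 112 | 3 | 0 |
| ORF3a | 25393–26220 | 135 | 101 | 34 | 7 | 0 |
| E | 26245–26472 | 4 | 3 | 1 | 0 | 0 |
| M | 26523–27191 | 65 | 14 | 51 | 0 | 0 |
| ORF6 | 27202–27387 | 8 | 6 | 2 | 1 | 0 |
| ORF7a | 27394–27759 | 50 | 9 | 41 | 9 | 0 |
| ORF7b | 27756–27887 | 8 | 5 | 3 | 0 | 0 |
| ORF8 | 27894–28259 | 70 | 34 | 36 | 2 | 0 |
| N | 28274–29533 | 949 | 635 | 314 | 0 | 3 |
| ORF10 | 29558–29674 | 13 | 0 | 13 | 0 | 2 |
| Non-coding region | | | | | | |
| Intergenic | - | 12 | - | - | 2 | 1 |
| 5′-UTR | 1–265 | 386 | - | - | 5 | 1 |
| 3′-UTR | 29675–29903 | 136 | - | - | 7 | 3 |
| Total | | 4160 | 2253 | 1382 | 38∗∗ | 10 |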

Genes are displayed in italics.

∗∗ Some deletion events continued over multiple regions.

The most common mutation in the non-coding region was 241C>T, observed in 96% (312) of the isolates. The mutations with the highest frequency (~98%) in the coding region were 3037C>T (synonymous) and 14408C>T (missense) in the ORF1ab gene, and 23403A>G (missense) in the S gene. The latter caused the D614G amino acid change in the spike protein of the virus. Among the 19 missense SNVs that occurred more than 5 times, 12 were predicted to decrease the stability of their respective protein structure (DDG value less than -0.5 kcal/mol, where DDG or Delta Delta G is a measure used to predict the effect of SNVs on protein stability) and six of the SNVs were predicted to alter protein function. Among these 19 frequent SNVs, 6 were in predicted CD4+ T-cell epitope regions, 6 were in predicted CD8+ T-cell epitope regions and 5 were in predicted B-cell epitope regions. The T592I mutation in the ORF1ab polyprotein (NSP2) lies in a strongly predicted CD8+ T-cell epitope and was also predicted to alter protein function. P4715L, one of the most frequent SNVs, occurred in the RNA dependent RNA polymerase (RdRp) region of the ORF1ab polyprotein. Besides these, two of the high frequency (86%) SNVs (R203K and G204R) occurred in a COVID-19 diagnostic RT-PCR target and predicted B-cell epitope regions, of which the latter was predicted to cause altered function of the nucleocapsid protein (Figure 1). Further analysis revealed that two SNVs (L3606F, H125Y) were predicted to cause an altered ordered interface and altered transmembrane protein for the ORF1ab polyprotein (NSP6) and Membrane (M) protein respectively, where L3606F was predicted to cause gain of sulfation at Y360 position of NSP3 protein (Table 2).
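The correspondence between a genomic position and the affected amino acid can be checked with simple arithmetic. The sketch below is a minimal illustration that uses the gene start coordinates from Table 1; the function name `codon_position` is hypothetical, and the sketch does not model the ribosomal frameshift within ORF1ab, so for that gene it should only be trusted for positions upstream of the frameshift.

```python
# Gene start coordinates (1-based) taken from Table 1
GENE_START = {"ORF1ab": 266, "S": 21563, "ORF3a": 25393, "M": 26523, "N": 28274}


def codon_position(gene: str, genome_pos: int) -> tuple[int, int]:
    """Map a genomic position to (codon number, base within codon).

    Both returned values are 1-based. Assumes an uninterrupted reading
    frame from the gene start (no frameshift handling).
    """
    start = GENE_START[gene]
    if genome_pos < start:
        raise ValueError("position lies upstream of the gene start")
    offset = genome_pos - start
    return offset // 3 + 1, offset % 3 + 1


print(codon_position("S", 23403))     # 23403A>G, reported as D614G
print(codon_position("N", 28881))     # 28881G>A, reported as R203K
print(codon_position("N", 28883))     # 28883G>C, reported as G204R
print(codon_position("ORF1ab", 1163)) # 1163A>T, reported as I300F
```

The codon numbers obtained this way agree with the amino acid positions listed in Table 2 for these variants, which is a quick sanity check on any variant annotation.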

Figure 1.

(a) Frequency of SNVs along the base positions of the SARS-CoV-2 genome with high frequency (>5) missense SNVs marked in red. (b) The high frequency missense SNVs aligned (red lines) to different regions of the SARS-CoV-2 genome, UniProt regions of interest, RT-PCR diagnostic primer sets, B-cell and T-cell predicted epitope regions, SARS-CoV T-cell epitope regions (M1 and M2 peptides), human protein interaction, protease cleavage and signal peptide regions derived from the UCSC genome browser.

Table 2.

Variants of SARS-CoV-2 genomes observed in more than 5 isolates.

| Genomic change | Type of mutation | Gene | Amino acid change | No. of samples | Structural prediction effect (SVM3) | DDG value (kcal/mol) | Functional prediction effect | Predicted molecular mechanism change | Finding from other studies |
|---|---|---|---|---|---|---|---|---|---|
| 241C>T | Non-coding | 5′-UTR | - | 312 | - | - | - | - | Frequently observed as co-mutation with 3037C>T (98%), 14408C>T (98%) and 23403A>G (97%) linked with European clade and high infection rate (Mercatelli and Giorgi, 2020; Rahimi et al., 2021) |
| 683C>T | Synonymous | ORF1ab | None | 8 | - | - | - | - | - |
| 1163A>T | Missense | ORF1ab | I300F | 255 | Decrease | -1.79 | Tolerated | CD4+ T Cell epitope | Less likely to interact with host factors (Ul Alam et al., 2020) |
| 2040C>T | Missense | ORF1ab | T592I | 12 | Neutral | -0.48 | Affect function | CD8+ T Cell epitope (strong prediction) | - |
| 2388C>T | Missense | ORF1ab | T708I | 11 | Neutral | -0.35 | Tolerated | CD4+ and CD8+ T Cell epitope | - |
| 2836C>T | Synonymous | ORF1ab | None | 6 | - | - | - | - | - |
| 2910C>T | Missense | ORF1ab | T882I | 8 | Neutral | -0.10 | Tolerated | - | - |
| 3037C>T | Synonymous | ORF1ab | None | 318 | - | - | - | - | Frequently observed as co-mutation with 3037C>T (98%), 14408C>T (98%) and 23403A>G (97%) linked with European clade and high infection rate (Rahimi et al., 2021) |
| 3961C>T | Synonymous | ORF1ab | None | 38 | - | - | - | - | Associated with manifestation of diarrhoea and sore throat in patients (Rabbi et al., 2021) |
| 4444G>T | Synonymous | ORF1ab | None | 7 | - | - | - | - | Co-evolving mutation with 8371G>T and 29403A>G (Shishir et al., 2021) |
| 8026A>G | Synonymous | ORF1ab | None | 17 | - | - | - | - | - |
| 8371G>T | Missense | ORF1ab | Q2702H | 8 | Decrease | -0.68 | Affect function | - | Co-evolving mutation with 8371G>T and 29403A>G (Shishir et al., 2021) |
| 9502C>T | Synonymous | ORF1ab | None | 7 | - | - | - | - | - |
| 10323A>G | Missense | ORF1ab | K3353R | 8 | Neutral | -0.13 | Tolerated | CD4+ T Cell epitope | - |
| 11083G>T | Missense | ORF1ab | L3606F | 11 | Decrease | -1.00 | Affect function | Altered Ordered interface, Altered Transmembrane protein, Gain of Sulfation at Y3607, CD4+ and CD8+ T Cell epitope | More prevalent in asymptomatic cases (Aiewsakun et al., 2020; Wang et al., 2020) |
| 14408C>T | Missense | ORF1ab | P4715L | 317 | Decrease | -0.83 | Tolerated | CD8+ T Cell epitope, RdRp zone | Frequently observed as co-mutation with 3037C>T (98%), 14408C>T (98%) and 23403A>G (97%) linked with European clade and high infection rate (Rahimi et al., 2021). Associated with higher mutation rate (Pachetti et al., 2020) |
| 15324C>T | Synonymous | ORF1ab | None | 8 | - | - | - | - | - |
| 15738C>T | Synonymous | ORF1ab | None | 7 | - | - | - | - | - |
| 18877C>T | Synonymous | ORF1ab | None | 24 | - | - | - | - | Associated with mutation density in M and E gene (Eskier et al., 2020a) |
| 19723G>T | Missense | ORF1ab | V6487F | 17 | Decrease | -1.55 | Tolerated | - | - |
| 21575C>T | Missense | S | L5F | 8 | Decrease | -0.98 | Tolerated | - | May increase hydrophobicity of the signal peptide thus facilitate in viral secretion from cell (Zhan et al., 2020) hence increase infectivity (Li et al., 2020). It also increases epitope binding affinity (Guo and Guo, 2020) |
| 21855C>T | Missense | S | S98F | 6 | Neutral | 0.00 | Tolerated | - | - |
| 22444C>T | Synonymous | S | None | 24 | - | - | - | - | Found to co-evolve with 28854C>T and unique to Indian isolates (Banerjee et al., 2020) |
| 23403A>G | Missense | S | D614G | 315 | Decrease | -0.93 | Tolerated | B cell epitope | Discussed in detail in the discussion part |
| 25494G>T | Synonymous | ORF3a | None | 19 | - | - | - | - | - |
| 25504C>G | Missense | ORF3a | Q38E | 8 | Decrease | -1.02 | Affect function | CD4+ and CD8+ T Cell epitope, transmembrane protein | - |
| 25563G>T | Missense | ORF3a | Q57H | 30 | Decrease | -2.03 | Affect function | CD4+ T Cell epitope | Decrease ion permeability of ORF3a channel pore and possibly decrease viral release and immune response (Alam et al., 2020) |
| 26735C>T | Synonymous | M | None | 29 | - | - | - | - | - |
| 26895C>T | Missense | M | H125Y | 6 | Neutral | 0.06 | Tolerated | CD8+ T Cell epitope, Altered Transmembrane protein, Altered Ordered interface | - |
| 28854C>T | Missense | N | S194L | 25 | Neutral | -0.38 | Tolerated | B cell epitope | Enhance N-E and decrease N-M interactions thus may promote viral release and attenuate viral assembly (Wu et al., 2021) |
| 28881G>A | Missense | N | R203K | 279 | Decrease | -0.93 | Tolerated | B cell epitope, PCR target area, transmembrane protein | Enhance N-E interaction thus may promote viral release and infectivity (Wu et al., 2021). |
| 28882G>A | Missense | N | R203K | 279 | Decrease | -0.93 | Tolerated | B cell epitope, PCR target area, transmembrane protein | |
| 28883G>C | Missense | N | G204R | 279 | Decrease | -0.52 | Affect function | B cell epitope, PCR target area, transmembrane protein | Enhance N-E interaction thus may promote viral release and infectivity (Wu et al., 2021). |
| 29403A>G | Missense | N | D377G | 6 | Neutral | -0.44 | Tolerated | - | - |
| 29742G>A | Non-coding | 3′-UTR | - | 7 | - | - | - | - | Create miR-3664-5p binding site but degradation of viral RNA by host protein is unaltered (Mukherjee and Goswami, 2020). |
| 29848T>A | Non-coding | 3′-UTR | - | 8 | - | - | - | - | - |
| 29850A>T | Non-coding | 3′-UTR | - | 8 | - | - | - | - | - |
| 29852A>T | Non-coding | 3′-UTR | - | 7 | - | - | - | - | - |
| 29853G>A | Non-coding | 3′-UTR | - | 8 | - | - | - | - | - |

3.2. Mutation analysis of S gene

A separate and detailed analysis of the SARS-CoV-2 S gene revealed a total of 530 mutation events, among which 414 were missense events and 3 were single amino acid deletions. These mutation events comprised 56 SNVs and 2 deletions, with D614G being the mutation of highest frequency, occurring in 97.2% (315) of the isolates. Among the SNVs, 35 were predicted to decrease protein stability (DDG less than or around -0.5 kcal/mol), 8 were predicted to alter protein function, 15 were in predicted B-cell epitope regions and 28 were in predicted T-cell epitope regions. Individual analysis of these mutations with functional significance is shown in Supplementary Table 1. Three SNVs were found in the receptor binding domain of the spike protein but none of them was in the ACE2 receptor binding part of the protein (Figures 2 and 3).
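The stability labels used in this analysis follow a threshold on the predicted DDG value. A minimal sketch of that rule is given below: the -0.5 kcal/mol cut-off for "Decrease" comes from the text, while the symmetric +0.5 kcal/mol cut-off for "Increase" and the function name `classify_ddg` are assumptions made for illustration.

```python
def classify_ddg(ddg: float, threshold: float = 0.5) -> str:
    """Label a predicted DDG value (kcal/mol) by its effect on stability.

    DDG below -threshold is labelled "Decrease" (as in the text);
    DDG above +threshold is labelled "Increase" (assumed symmetric rule);
    anything in between is "Neutral".
    """
    if ddg < -threshold:
        return "Decrease"
    if ddg > threshold:
        return "Increase"
    return "Neutral"


# DDG values taken from Table 2
for name, ddg in [("D614G", -0.93), ("T592I", -0.48), ("G204R", -0.52), ("H125Y", 0.06)]:
    print(name, ddg, classify_ddg(ddg))
```

Values close to the cut-off, such as -0.48 and -0.52, fall on opposite sides of the threshold, so such labels are best read together with the underlying DDG value.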

Figure 2.

(a) Frequency of SNVs along the base positions of the SARS-CoV-2 S gene; missense mutations are annotated in red text. (b) The missense SNVs aligned (red lines) to different regions of the SARS-CoV-2 S gene, UniProt regions of interest, RT-PCR diagnostic primer sets, B-cell and T-cell predicted epitope regions, SARS-CoV T-cell epitope regions (M1 and M2 peptides), human protein interaction, protease cleavage and signal peptide regions derived from the UCSC genome browser.

Figure 3.

Mutated amino acid positions shown in the human ACE2 receptor bound structure (PDB ID: 6acj) of SARS-CoV-2 spike protein (except L5F, S13I, Q14H, G75V, T76I, Y145del, H146Y, Q675 H/R, Q677H, N679K, I834V, R1185H and K1195N as the regions were not covered in the structure). Mutations are shown in only one chain of the structure.

3.3. Phylogeny analysis

Phylogeny analysis showed that our 324 isolates were distributed among all the Nextstrain clades, with the 20B clade being the most frequent (86%) (Figure 4). The 19B clade was unique to Chittagong (5 isolates) and one root clade 19A isolate was reported from Dhaka. There were 24 isolates for which the location data was unknown and only one isolate was found in the 20C clade. We also analysed the clade distribution of our sequences according to GISAID nomenclature and found that more than 96% of the isolates belonged to the G clade and its two major branches, the GH and GR clades. The common distinctive feature of these three clades is the D614G mutation. About 88% of the sequences clustered in the GR clade, the distinctive feature of which is the G204R mutation in the nucleocapsid protein (Figure 4 and Table 3).

Figure 4.

Radial presentation of the phylogeny tree formed from 324 SARS-CoV-2 genomes from Bangladesh compared to the reference SARS-CoV-2 sequence, showing that the isolates belonged to all of the Nextstrain clades and that the 20B clade is the most prominent.

Table 3.

Location-wise distribution of isolates from different clades.

| Location (Division) | Nextstrain/GISAID clade | No. of isolates | Primary countries |
|---|---|---|---|
| Dhaka | 19A/L | 1/1 | Asia: China/Thailand |
| | 20A/G,GH | 6/5,1 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR,O | 77/75,2 | Europe: UK, Belgium, Sweden |
| Chittagong | 19B/S | 5/5 | Asia: China |
| | 20A/G,GH,O | 14/2,11,1 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR | 55/55 | Europe: UK, Belgium, Sweden |
| Rajshahi | 20A/GH | 2/2 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR | 30/30 | Europe: UK, Belgium, Sweden |
| Rangpur | 20A/GH | 1/1 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR,G | 24/23,1 | Europe: UK, Belgium, Sweden |
| Khulna | 20A/O,GH | 5/3,2 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR | 22/22 | Europe: UK, Belgium, Sweden |
| Sylhet | 20A/GH | 2/2 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR | 19/19 | Europe: UK, Belgium, Sweden |
| Barishal | 20A/GH | 2/2 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/G,GR | 21/1,20 | Europe: UK, Belgium, Sweden |
| Unknown | 20A/G,GH | 6/4,2 | N America/Europe/Asia: USA, Belgium, India |
| | 20B/GR | 16/16 | Europe: UK, Belgium, Sweden |
| | 20C/GH | 1/1 | N America: USA |

The overall analysis revealed that most of the isolates shared common ancestors with European countries. A subsampled global phylogeny analysis revealed that the largest cluster of Bangladeshi isolates shared a common ancestor with some Australian isolates that were reported between mid-May and mid-July. Several other clusters were formed sharing common ancestors with countries including Senegal, Morocco, Egypt, Oman, Saudi Arabia, India, Sri Lanka, Zhenjiang (China), Portugal, Norway, Luxembourg, Bosnia and Herzegovina, England and Italy (Supplementary Figure 1, Supplementary Figure 2, Supplementary Figure 3).

4. Discussion

The current SARS-CoV-2 pandemic has changed the world in many ways, bringing devastating effects on society and the environment, yet we have seen some positive changes, one of which is increased collaboration among scientists and open-source projects from all over the world. These collaborative efforts are making a huge impact on research and evidence generation. In our study, we tried to gather data on the genetic evolution and mutational impacts of SARS-CoV-2 genomes that have been isolated and sequenced in Bangladesh. Our analysis revealed multiple introductions of the virus into our country from different regions, as the phylogeny tree shows isolates closely related to different countries and regions of the world. Most of the isolates were related to isolates from Middle Eastern and European countries, which can be explained by the large number of Bangladeshi migrant workers living in those countries, including Saudi Arabia, Oman, Belgium and Italy. Many of these migrant workers came back to Bangladesh during the first and second quarters of 2020 as the number of COVID-19 cases was very high in those regions (Siddiqui et al., 2020; IOM, 2020). Interestingly, the largest cluster was formed around Australian isolates, but going further back on the tree reveals that the last common ancestor was also related to European isolates (Sweden and Switzerland). Hence the abundance of the 20B clade is observed in Bangladesh, unlike neighbouring India and Pakistan where Asian and North American clades (19A, 19B, 20B) are more dominant (Figure 5).

Figure 5.

Comparison of clade distribution in different regions of Asia from the subsampled global analysis at nextstrain.org.

In our analysis we found that the most common single nucleotide change was C to T (~41%). This phenomenon was reported earlier and can be explained by selective mutation pressure to reduce CpG sites in the presence of abundant human antiviral proteins including APOBEC3 and ZAP (Wei et al., 2020). CpG sites in the viral genome are common targets and can be recognized by Toll-like receptors (TLRs), which results in the release of pro-inflammatory cytokines including type-I interferon, IL-6, IL-12 and TNF-α (Arpaia and Barton, 2011) that play a key role in severe COVID-19, including lung tissue damage (Costela-Ruiz et al., 2020). The reduction of CpG sites in SARS-CoV-2 may indicate that this mutational change facilitates viral replication. Similar findings relating CpG motifs and viral replication have also been reported in other studies for other RNA viruses (Atkinson et al., 2014). A relation between CpG and acute inflammatory response has also been mentioned in the case of non-viral gene therapy vectors (Yew and Cheng, 2004). An inverse relation of disease severity with viral load has been reported in a recent study (Argyropoulos et al., 2020). Hence, we think further research is needed focusing on the CpG suppression rate in SARS-CoV-2 and its relation with viral load and disease severity, which may help in designing more potent vaccines and therapeutics.
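CpG depletion is commonly quantified with an observed/expected ratio. The sketch below illustrates that calculation on toy sequences. It uses the widely used formula (CpG count × sequence length) / (C count × G count), which was not part of this study's analysis, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio of a nucleotide sequence.

    observed = number of 'CG' dinucleotides
    expected = (number of C * number of G) / sequence length
    Returns 0.0 when the sequence contains no C or no G.
    """
    s = seq.upper()
    c_count, g_count = s.count("C"), s.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = s.count("CG")  # 'CG' cannot overlap itself, so count() is exact
    return cpg_count * len(s) / (c_count * g_count)


# Toy example (not real viral sequence): a C>T change inside a CpG site
before = "ACGCCGGT"
after = "ATGCCGGT"  # the first C of the first CpG mutated to T
print(round(cpg_obs_exp(before), 3), round(cpg_obs_exp(after), 3))
```

In this toy case the single C>T change removes one of the two CpG sites and lowers the ratio, which mirrors the direction of the effect discussed above; the magnitude in a ~30 kb genome would of course be far smaller per mutation.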

Large deletion events were observed among some of the isolates that resulted in deletion of most of the ORF7a or ORF8 gene products. Three such deletions were between base positions 27487 to 27552 (65 nucleotides), 27912 to 28256 (344 nucleotides) and 27472 to 27672 (200 nucleotides). Though these proteins are accessory proteins and not necessary for viral replication, ORF7a was found to interact with the human ribosomal transport proteins MDN1 and HEATR3 (Gordon et al., 2020). In SARS-CoV, ORF7a was reported to act as a cellular translation inhibitor and apoptosis inducer (Kopecky-Bromberg et al., 2006). Hence deletion of ORF7a may not change viral replication but can alter disease severity by reducing the chance of ORF7a-mediated apoptosis. Similar deletions were reported earlier from the USA (Addetia et al., 2020). On the other hand, the ORF8 protein is the least similar to its SARS-CoV homolog and is reported to be associated with MHC-I downregulation, which helps the virus evade cytotoxic T-cells. This kind of immune evasion allows the virus to replicate without being detected by immune cells, hence producing fewer symptoms. A similar mechanism of immune evasion is observed in some viruses that cause chronic infection, including HIV-1 and Kaposi Sarcoma associated Herpes Virus (KSHV). This may be one of the reasons for the large number of asymptomatic patients, chronic viral shedding even after clinical cure and redetection of the virus long after recovery (Zhang et al., 2020b). A recent study found that SARS-CoV-2 ORF8 can also potentially mediate unique immune suppression and evasion mechanisms (Flower et al., 2021).
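How much of a gene a given deletion removes can be estimated from coordinates alone. The sketch below is a rough illustration using the ORF7a and ORF8 coordinates from Table 1 and the deletion coordinates quoted above. It follows the convention implied by the reported lengths (end minus start), and the helper names are hypothetical.

```python
# Gene coordinates from Table 1
GENES = {"ORF7a": (27394, 27759), "ORF8": (27894, 28259)}


def deletion_span(start: int, end: int) -> int:
    """Length of a deletion, computed as end minus start (as in the text)."""
    return end - start


def gene_fraction_deleted(gene: str, start: int, end: int) -> float:
    """Approximate fraction of a gene covered by a deletion.

    Uses the same end-minus-start convention for the gene and the overlap,
    so the result is an approximation rather than an exact base count.
    """
    gene_start, gene_end = GENES[gene]
    overlap = max(0, min(end, gene_end) - max(start, gene_start))
    return overlap / (gene_end - gene_start)


print(deletion_span(27487, 27552))   # reported as 65 nucleotides
print(deletion_span(27912, 28256))   # reported as 344 nucleotides
print(deletion_span(27472, 27672))   # reported as 200 nucleotides
print(round(gene_fraction_deleted("ORF8", 27912, 28256), 2))
```

A coordinate-based estimate like this says nothing about the reading frame or the resulting protein, so it is only a first check before looking at the translated sequence.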

The 241C>T mutation was one of the most frequent (96%) mutations observed in our study. This mutation occurred in the 5’ UTR region, so it may not have any functional significance except for reducing CpG sites. Interestingly, this mutation always accompanied 3 other mutations in the same isolates. Those mutations are 3037C>T (98%), 14408C>T (98%) and 23403A>G (97%), of which the last two are non-synonymous mutations (P4715L in ORF1ab or P323L in RdRp, and D614G in the Spike protein). The co-occurrence of these mutations is not by chance but rather reflects a linkage disequilibrium that has been reported earlier by several studies (Bai et al., 2020; Demir et al., 2020; Mercatelli and Giorgi, 2020). These co-mutations are primary features of the GISAID G clade, which started rising in February in Europe and now includes more than 70% of all SARS-CoV-2 sequences from all over the world (Mercatelli and Giorgi, 2020). The high prevalence of these 4 mutations in Bangladesh also establishes a stronger linkage with European isolates. The P323L mutation in RdRp is reported to be associated with an increased mutation rate (Begum et al., 2020; Eskier et al., 2020b). Among all 4 mutations, D614G is the most studied and reported. Though it is not in the receptor binding part of the Spike protein, multiple studies reported that the D614G mutation provides SARS-CoV-2 with an evolutionary advantage for replication. Studies reported that the D614G mutation increases the infectivity of the virus, as it was found to be associated with higher viral load and higher infectious titre (Korber et al., 2020a, 2020b; Zhang et al., 2020a).

The D614 amino acid is located at the S1–S2 junction region of the spike protein. The cleavage of the S1–S2 junction by host protease is crucial for entry into the host cell, and multiple cleavage sites enhance the fusion of SARS-CoV with the host cell membrane (Belouzard et al., 2009). One study predicted that the D614G mutation introduces a novel protease (elastase) cleavage site that may enhance the fusion of the viral envelope to the host cell membrane and hence further facilitate viral RNA entry into the host cell (Bhattacharyya et al., 2020). Studies identified that the 614G variant of the virus can gain more functional advantage in populations with the delC variant (rs35074065 site) of the TMPRSS2 gene (Bhattacharyya et al., 2020; Russo et al., 2020). This variant is common in Europe, America and South Asia but extremely rare in East Asia according to 1000 Genome project data (NCBI, 2020). This may explain the spread of D614G or the G clade mostly in Europe and America and recently in South Asia. While some studies suggested that the D614G mutation is associated with a higher fatality rate (Becerra-Flores and Cardozo, 2020; Toyoshima et al., 2020), several other studies reported no significant association (Korber et al., 2020b; Cassia Wagner et al., 2020). No significant association between viral load and clinical outcome or survival was found in another recent study (Argyropoulos et al., 2020). Considering these analyses, it can be said that the current evidence is not clear about the impact of the D614G mutation alone on disease severity and mortality, as multiple other stronger factors play a role, especially age and comorbidity (Grubaugh et al., 2020).

There have been concerns about the impact of the D614G mutation on vaccine development, but it is clear that the mutation does not take place in the receptor binding region of the spike protein, which is the primary target of the neutralizing antibodies. Also, studies suggested that in natural infection, antibodies generated against the D614 variant can cross-neutralize G614 variant viruses. Hence it is unlikely that the mutation will have a drastic effect on the immunogenicity of the virus, and it is less likely to have any impact on vaccine efficacy (Grubaugh et al., 2020; Hu et al., 2020; Ozono et al., 2020).

The second most frequent mutation in our analysis was a tri-nucleotide change resulting in two amino acid changes, R203K and G204R, in the N protein. Our analysis revealed that these mutations occurred in a PCR target area and B-cell epitope region. Though functionally tolerated, this change was predicted to decrease the protein stability. This finding is similar to other studies which reported that R203K and G204R destabilize the N protein structure but may enhance interaction with the SARS-CoV-2 Envelope protein, which may promote viral release (Rahman et al., 2020; Wu et al., 2021).

The I300F mutation (present in 78% of isolates) was predicted to reduce the stability of the NSP2 protein, the function of which is not yet confirmed. One study suggested that this residue is positioned within the internal groove of the protein and is less likely to interact with host factors (Ul Alam et al., 2020). There are two other functionally significant mutations: Q57H in ORF3a and S194L in the N protein. Our analysis suggested that Q57H decreases protein stability, alters protein function, and may result in loss of a CD4+ T cell epitope, in line with other studies (Kim et al., 2020; Wu et al., 2021). One study mentioned that the Q57H mutation in the ORF3a protein may decrease ion permeability by creating a tighter constriction in the channel pore, possibly decreasing viral release and the immune response (Ul Alam et al., 2020). S194L was predicted to have a neutral effect in our study, with a reduced DDG value, and may attenuate viral assembly as reported in an earlier study (Wu et al., 2021).

We observed several mutations in some of the RT-PCR target regions. Although it is not yet known whether primer–template mismatches affect the accuracy and precision of RT-PCR-based COVID-19 diagnosis, we recommend avoiding primers that cover mutation-prone regions for better diagnosis. Several other low-frequency mutations that we found have been associated with higher infectivity and with specific symptoms; these are listed in Table 2.
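A check of this kind can be sketched in a few lines of Python: given primer or probe binding intervals and a list of observed mutation positions, report which intervals contain a mutation. The region names, coordinates and positions below are hypothetical placeholders, not the assays or mutations analysed in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A primer or probe binding site, 1-based inclusive genome coordinates."""
    name: str
    start: int
    end: int


def mutations_in_regions(positions, regions):
    """Map each region name to the sorted mutation positions inside it.

    Regions that contain no mutation are omitted from the result.
    """
    hits = {}
    for region in regions:
        inside = sorted(p for p in positions if region.start <= p <= region.end)
        if inside:
            hits[region.name] = inside
    return hits


# Hypothetical primer binding sites and mutation positions (illustration only).
regions = [
    Region("primer_A_forward", 100, 121),
    Region("primer_A_reverse", 180, 200),
    Region("primer_B_forward", 500, 520),
]
observed_positions = [50, 101, 103, 300, 510]

flagged = mutations_in_regions(observed_positions, regions)
for name, positions in flagged.items():
    print(f"{name}: mutations at {positions}")
```

In practice the intervals would come from the published primer and probe coordinates of the assay in use, and a flagged region would prompt a closer look at where the mismatch falls within the primer.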

As a limitation of our study, we could not obtain any clinical information on the patients from whom the samples were collected. The functional significance described in this paper is based only on computational prediction and may not always reflect the clinical scenario. Also, the genomic sequences were derived by different laboratories using different sequencing platforms (e.g. Illumina, Ion Torrent) and methods (Sanger and next-generation sequencing), which may have affected the quality of the sequences and hence our analysis. We found one sequence with no mutation compared to the reference sequence, which is very unlikely and may be a submission error, as the sample was collected long after the original Wuhan outbreak. We hope our findings will create scope for further research, especially research that includes clinical data, and will help identify changes in the pathogenicity and infectivity pattern of the virus.

Declarations

Author contribution statement

Dr. Md. Marufur Rahman: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Dr. Shirmin Bintay Kader: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Dr. S M Shahriar Rizvi: Conceived and designed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This work was supported by the Centre for Medical Biotechnology (CMBT), Management Information System, Directorate General of Health Services, Bangladesh.

Data availability statement

Data included in article/supp. material/referenced in article.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

Acknowledgements

We acknowledge everyone who was involved in sample collection, genome sequencing and sequence data submission to the GISAID database. We also acknowledge Dr. Senjuti Saha, Scientist, Child Health Research Foundation Bangladesh, who inspired and guided us throughout the writing of this article.

Appendix A. Supplementary data

The following is the supplementary data related to this article:

Supplementary Figure 1

Supplementary Figure 2

Supplementary Figure 3

Supplementary Table 1
mmc1.docx (20.7KB, docx)

References

  1. A*STAR Bioinformatics Institute. 2020. CoVsurver - CoronaVirus Surveillance Server [Online]. Available: https://corona.bii.a-star.edu.sg/ [Google Scholar]
  2. Addetia A., Xie H., Roychoudhury P., Shrestha L., Loprieno M., Huang M.-L., Jerome K., Greninger A. Identification of multiple large deletions in ORF7a resulting in in-frame gene fusions in clinical SARS-CoV-2 isolates. medRxiv. 2020 doi: 10.1016/j.jcv.2020.104523. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Aiewsakun P., Wongtrakoongate P., Thawornwattana Y., Hongeng S., Thitithanyanont A. SARS-CoV-2 genetic variations associated with COVID-19 severity. MedRxiv. 2020 doi: 10.1099/mgen.0.000734. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Alam A.R.U., Islam O.K., Hasan M.S., Al-Emran H.M., Jahid M.I.K., Hossain M.A. Evolving infection paradox of SARS-CoV-2: fitness costs virulence? MedRxiv. 2020 [Google Scholar]
  5. Argyropoulos K.V., Serrano A., Hu J., Black M., Feng X., Shen G., Call M., Kim M.J., Lytle A., Belovarac B. Association of initial viral load in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) patients with outcome and symptoms. Am. J. Pathol. 2020;190:1881–1887. doi: 10.1016/j.ajpath.2020.07.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Armengaud J., Delaunay-Moisan A., Thuret J.Y., Van Anken E., Acosta-Alvear D., Aragón T., Arias C., Blondel M., Braakman I., Collet J.F. The importance of naturally attenuated Sars-Cov-2 in the fight against Covid-19. Environ. Microbiol. 2020;22:1997–2000. doi: 10.1111/1462-2920.15039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Arpaia N., Barton G.M. Toll-like receptors: key players in antiviral immunity. Curr. Opin. Virol. 2011;1:447–454. doi: 10.1016/j.coviro.2011.10.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Atkinson N.J., Witteveldt J., Evans D.J., Simmonds P. The influence of CpG and UpA dinucleotide frequencies on RNA virus replication and characterization of the innate cellular pathways underlying virus attenuation and enhanced replication. Nucleic Acids Res. 2014;42:4527–4545. doi: 10.1093/nar/gku075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bai Y., Jiang D., Lon J.R., Chen X., Hu M., Lin S., Chen Z., Meng Y., Du H. Evolution and molecular characteristics of SARS-CoV-2 genome. bioRxiv. 2020 doi: 10.1016/j.ijid.2020.08.066. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Banerjee A., Sarkar R., Mitra S., Lo M., Dutta S., Chawla-Sarkar M. The novel coronavirus enigma: phylogeny and analyses of coevolving mutations among the sars-cov-2 viruses circulating in India. JMIR Bioinform. Biotechnol. 2020;1 doi: 10.2196/20735. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Becerra-Flores M., Cardozo T. SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate. Int. J. Clin. Pract. 2020 doi: 10.1111/ijcp.13525. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Begum F., Mukherjee D., Das S., Thagriki D., Tripathi P.P., Banerjee A.K., Ray U. Specific mutations in SARS-CoV2 RNA dependent RNA polymerase and helicase alter protein structure, dynamics and thus function: effect on viral RNA replication. bioRxiv. 2020 [Google Scholar]
  13. Belouzard S., Chu V.C., Whittaker G.R. Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites. Proc. Natl. Acad. Sci. Unit. States Am. 2009;106:5871–5876. doi: 10.1073/pnas.0809524106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Bhattacharyya C., Das C., Ghosh A., Singh A.K., Mukherjee S., Majumder P.P., Basu A., Biswas N.K. Global spread of SARS-CoV-2 subtype with spike protein mutation D614G is shaped by human genomic variations that regulate expression of TMPRSS2 and MX1 genes. bioRxiv. 2020 [Google Scholar]
  15. Capriotti E., Fariselli P., Casadio R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005;33:W306–W310. doi: 10.1093/nar/gki375. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Cassia Wagner P.R., Frazar Chris D., Lee Jover, Müller Nicola F., Moncla Louise H. 2020. Comparing Viral Load and Clinical Outcomes in Washington State across D614G substitution in spike protein of SARS-CoV-2 [Online]. Available: https://github.com/blab/ncov-wa-d614g [Google Scholar]
  17. Cleemput S., Dumon W., Fonseca V., Abdool Karim W., Giovanetti M., Alcantara L.C., Deforche K., De Oliveira T. Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes. Bioinformatics. 2020;36:3552–3555. doi: 10.1093/bioinformatics/btaa145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Costela-Ruiz V.J., Illescas-Montes R., Puerta-Puerta J.M., Ruiz C., Melguizo-Rodríguez L. SARS-CoV-2 infection: the role of cytokines in COVID-19 disease. Cytokine Growth Factor Rev. 2020 doi: 10.1016/j.cytogfr.2020.06.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Demir A.B., Benvenuto D., Abacioğlu Y.H., Angeletti S., Ciccozzi M. Identification of the nucleotide substitutions in 62 SARS-CoV-2 sequences from Turkey. Turk. J. Biol. 2020;44:178–184. doi: 10.3906/biy-2005-69. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Elbe S., Buckland-Merrett G. Data, disease and diplomacy: GISAID's innovative contribution to global health. Glob. Challeng. 2017;1:33–46. doi: 10.1002/gch2.1018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Eskier D., Suner A., Oktay Y., Karakülah G. Mutations of SARS-CoV-2 nsp14 exhibit strong association with increased genome-wide mutation load. PeerJ. 2020;8 doi: 10.7717/peerj.10181. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Eskier D., Karakülah G., Suner A., Oktay Y. RdRp mutations are associated with SARS-CoV-2 genome evolution. bioRxiv. 2020 doi: 10.7717/peerj.9587. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Fernandes J.D., Hinrichs A.S., Clawson H., Gonzales J.N., Lee B.T., Nassar L.R., Raney B.J., Rosenbloom K.R., Nerli S., Rao A.A. The UCSC SARS-CoV-2 genome browser. bioRxiv. 2020 doi: 10.1038/s41588-020-0700-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Flower T.G., Buffalo C.Z., Hooy R.M., Allaire M., Ren X., Hurley J.H. Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein. Proc. Natl. Acad. Sci. Unit. States Am. 2021;118 doi: 10.1073/pnas.2021785118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Forster P., Forster L., Renfrew C., Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc. Natl. Acad. Sci. Unit. States Am. 2020;117:9241–9243. doi: 10.1073/pnas.2004999117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Gisaid-Clade. 2020. GISAID - Clade and Lineage Nomenclature Aids in Genomic Epidemiology of Active hCoV-19 Viruses [Online]. Available: https://www.gisaid.org/references/statements-clarifications/clade-and-lineage-nomenclature-aids-in-genomic-epidemiology-of-active-hcov-19-viruses/ [Google Scholar]
  27. GISAID-INITIATIVE. 2020. Available: https://www.gisaid.org/ [Google Scholar]
  28. Gordon D.E., Jang G.M., Bouhaddou M., Xu J., Obernier K., White K.M., O’meara M.J., Rezelj V.V., Guo J.Z., Swaney D.L. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020:1–13. doi: 10.1038/s41586-020-2286-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Grubaugh N.D., Hanage W.P., Rasmussen A.L. Making sense of mutation: what D614G means for the COVID-19 pandemic remains unclear. Cell. 2020;182:794–795. doi: 10.1016/j.cell.2020.06.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Guo E., Guo H. CD8 T cell epitope generation toward the continually mutating SARS-CoV-2 spike protein in genetically diverse human population: implications for disease control and prevention. PloS One. 2020;15 doi: 10.1371/journal.pone.0239566. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Hadfield J., Megill C., Bell S.M., Huddleston J., Potter B., Callender C., Sagulenko P., Bedford T., Neher R.A. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:4121–4123. doi: 10.1093/bioinformatics/bty407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Holmes E.C. Oxford University Press; 2009. The Evolution and Emergence of RNA Viruses. [Google Scholar]
  33. Hu J., He C.L., Gao Q., Zhang G.J., Cao X.X., Long Q.X., Deng H.J., Huang L.Y., Chen J., Wang K. The D614G mutation of SARS-CoV-2 spike protein enhances viral infectivity. BioRxiv. 2020 [Google Scholar]
  34. IEDCR. 2020. Available: https://www.iedcr.gov.bd/ [Google Scholar]
  35. IOM. International Organization for Migration; 2020. IOM Assists Vulnerable Returning Migrants Impacted by the COVID-19 Pandemic [Internet]. [cited 2020 Aug 8]. Available from: https://bangladesh.iom.int/news/iom-assists-vulnerable-returning-migrants-impacted-covid-19-pandemic [Google Scholar]
  36. Islam M.R., Hoque M.N., Rahman M.S., Alam A.R.U., Akther M., Puspo J.A., Akter S., Sultana M., Crandall K.A., Hossain M.A. Genome-wide analysis of SARS-CoV-2 virus strains circulating worldwide implicates heterogeneity. Sci. Rep. 2020;10:1–9. doi: 10.1038/s41598-020-70812-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Khailany R.A., Safdar M., Ozaslan M. Genomic characterization of a novel SARS-CoV-2. Gene Rep. 2020;19:100682. doi: 10.1016/j.genrep.2020.100682. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Kim J.-S., Jang J.-H., Kim J.-M., Chung Y.-S., Yoo C.-K., Han M.-G. Genome-Wide identification and characterization of point mutations in the SARS-CoV-2 genome. Osong Publ. Health Res. Perspect. 2020;11:101. doi: 10.24171/j.phrp.2020.11.3.05. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Kopecky-Bromberg S.A., Martinez-Sobrido L., Palese P. 7a protein of severe acute respiratory syndrome coronavirus inhibits cellular protein synthesis and activates p38 mitogen-activated protein kinase. J. Virol. 2006;80:785–793. doi: 10.1128/JVI.80.2.785-793.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Korber B., Fischer W., Gnanakaran S.G., Yoon H., Theiler J., Abfalterer W., Foley B., Giorgi E.E., Bhattacharya T., Parker M.D. Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2. bioRxiv. 2020 [Google Scholar]
  41. Korber B., Fischer W.M., Gnanakaran S., Yoon H., Theiler J., Abfalterer W., Hengartner N., Giorgi E.E., Bhattacharya T., Foley B. Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus. Cell. 2020;182:812–827.e19. doi: 10.1016/j.cell.2020.06.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Li Q., Wu J., Nie J., Zhang L., Hao H., Liu S., Zhao C., Zhang Q., Liu H., Nie L. The impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity. Cell. 2020;182:1284–1294.e9. doi: 10.1016/j.cell.2020.07.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Mercatelli D., Giorgi F.M. Geographic and genomic distribution of SARS-CoV-2 mutations. Front. Microbiol. 2020;11:1800. doi: 10.3389/fmicb.2020.01800. [Internet] [cited 2020 Aug 19]. Available from: https://www.frontiersin.org/article/10.3389/fmicb.2020.01800/full [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Mukherjee M., Goswami S. Global cataloguing of variations in untranslated regions of viral genome and prediction of key host RNA binding protein-microRNA interactions modulating genome stability in SARS-CoV-2. PloS One. 2020;15 doi: 10.1371/journal.pone.0237559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. NCBI. 2020. rs35074065 RefSNP Report - dbSNP - NCBI [Internet]. [cited 2020 Aug 20]. Available from: https://www.ncbi.nlm.nih.gov/snp/rs35074065#frequency_tab [Google Scholar]
  46. NEXTCLADE. 2020. Available: https://clades.nextstrain.org/ [Google Scholar]
  47. Nextstrain-Github. 2020. ncov/naming_clades.md at Master · Nextstrain/ncov · GitHub [Online]. Available: https://github.com/nextstrain/ncov/blob/master/docs/naming_clades.md [Google Scholar]
  48. Ng P.C., Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31:3812–3814. doi: 10.1093/nar/gkg509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Ozono S., Zhang Y., Ode H., Seng T.T., Imai K., Miyoshi K., Kishigami S., Ueno T., Iwatani Y., Suzuki T. Naturally mutated spike proteins of SARS-CoV-2 variants show differential levels of cell entry. bioRxiv. 2020 [Google Scholar]
  50. Pachetti M., Marini B., Benedetti F., Giudici F., Mauro E., Storici P., Masciovecchio C., Angeletti S., Ciccozzi M., Gallo R.C. Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant. J. Transl. Med. 2020;18:1–9. doi: 10.1186/s12967-020-02344-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Pejaver V., Urresti J., Lugo-Martinez J., Pagel K.A., Lin G.N., Nam H.-J., Mort M., Cooper D.N., Sebat J., Iakoucheva L.M. MutPred2: inferring the molecular and phenotypic impact of amino acid variants. bioRxiv. 2017 doi: 10.1038/s41467-020-19669-x. Preprint. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Rabbi M.F.A., Khan M.I., Hasan S., Chalita M., Hasan K.N., Sufian A., Hosen M.B., Polol M.N.I., Naima J., Lee K. Large scale genomic and evolutionary study reveals SARS-CoV-2 virus isolates from Bangladesh strongly correlate with European origin and not with China. bioRxiv. 2021 doi: 10.1016/j.ygeno.2022.110497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Rahimi A., Mirzazadeh A., Tavakolpour S. Genetics and genomics of SARS-CoV-2: a review of the literature with the special focus on genetic diversity and SARS-CoV-2 genome detection. Genomics. 2021;113:1221–1232. doi: 10.1016/j.ygeno.2020.09.059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Rahman M.S., Islam M.R., Alam A.R.U., Islam I., Hoque M.N., Akter S., Rahaman M.M., Sultana M., Hossain M.A. Evolutionary dynamics of SARS-CoV-2 nucleocapsid protein (N protein) and its consequences. BioRxiv. 2020 doi: 10.1002/jmv.26626. [DOI] [PubMed] [Google Scholar]
  55. Rambaut A., Holmes E.C., O’Toole Á., Hill V., McCrone J.T., Ruis C., Plessis L. du, Pybus O.G. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat. Microbiol. 2020;5:1403–1407. doi: 10.1038/s41564-020-0770-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Russo R., Andolfo I., Lasorsa V.A., Iolascon A., Capasso M. Genetic analysis of the novel SARS-CoV-2 host receptor TMPRSS2 in different populations. bioRxiv. 2020 doi: 10.3389/fgene.2020.00872. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Shishir T.A., Naser I.B., Faruque S.M. In silico comparative genomics of SARS-CoV-2 to determine the source and diversity of the pathogen in Bangladesh. PloS One. 2021;16 doi: 10.1371/journal.pone.0245584. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Siddiqui T., Sultana M., Sultana R., Akhter S. 2020. Labour Migration from Bangladesh 2018 Achievements and Challenges [Internet]. [cited 2020 Aug 8]. Available from: https://www.forum-asia.org/uploads/wp/2019/05/Migration-Trend-Analysis-2018-RMMRU.pdf [Google Scholar]
  59. Tan W., Zhao X., Ma X., Wang W., Niu P., Xu W., Gao G.F., Wu G. A novel coronavirus genome identified in a cluster of pneumonia cases—Wuhan, China 2019− 2020. China CDC Weekly. 2020;2:61–62. [PMC free article] [PubMed] [Google Scholar]
  60. Toyoshima Y., Nemoto K., Matsumoto S., Nakamura Y., Kiyotani K. SARS-CoV-2 genomic variations associated with mortality rate of COVID-19. J. Hum. Genet. 2020:1–8. doi: 10.1038/s10038-020-0808-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Ul Alam A.R., Rafiul Islam M., Shaminur Rahman M., Islam O.K., Anwar Hossain M. Understanding the possible origin and genotyping of first Bangladeshi SARS-CoV-2 strain. J. Med. Virol. 2020 doi: 10.1002/jmv.26115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Wang R., Chen J., Hozumi Y., Yin C., Wei G.-W. Decoding asymptomatic COVID-19 infection and transmission. J. Phys. Chem. Lett. 2020;11:10007–10015. doi: 10.1021/acs.jpclett.0c02765. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Wei Y., Silke J.R., Aris P., Xia X. Coronavirus genomes carry the signatures of their habitats. PloS One. 2020;15 doi: 10.1371/journal.pone.0244025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. WORLDOMETER. 2020. Available: https://www.worldometers.info/coronavirus/country/bangladesh/ [Google Scholar]
  65. Wu S., Tian C., Liu P. Effects of SARS-CoV-2 mutations on protein structures and intraviral protein–protein interactions. J. Med. Virol. 2021;93:2132–2140. doi: 10.1002/jmv.26597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Yao H.-P., Lu X., Chen Q., Xu K., Chen Y., Cheng L., Liu F., Wu Z., Wu H., Jin C. 2020. Patient-derived Mutations Impact Pathogenicity of SARS-CoV-2. CELL-D-20-01124. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Yew N.S., Cheng S.H. Reducing the immunostimulatory activity of CpG-containing plasmid DNA vectors for non-viral gene therapy. Expet Opin. Drug Deliv. 2004;1:115–125. doi: 10.1517/17425247.1.1.115. [DOI] [PubMed] [Google Scholar]
  68. Zhan X.-Y., Zhang Y., Zhou X., Huang K., Qian Y., Leng Y., Yan L., Huang B., He Y. Molecular evolution of SARS-CoV-2 structural genes: evidence of positive selection in spike glycoprotein. BioRxiv. 2020 [Google Scholar]
  69. Zhang L., Jackson C.B., Mou H., Ojha A., Rangarajan E.S., Izard T., Farzan M., Choe H. The D614G mutation in the SARS-CoV-2 spike protein reduces S1 shedding and increases infectivity. bioRxiv. 2020 [Google Scholar]
  70. Zhang Y., Zhang J., Chen Y., Luo B., Yuan Y., Huang F., Yang T., Yu F., Liu J., Liu B. The ORF8 protein of SARS-CoV-2 mediates immune evasion through potently downregulating MHC-I. bioRxiv. 2020 doi: 10.1073/pnas.2024202118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Zhou P., Yang X.-L., Wang X.-G., Hu B., Zhang L., Zhang W., Si H.-R., Zhu Y., Li B., Huang C.-L. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Heliyon are provided here courtesy of Elsevier
