Comput Biol Chem. 2020 Nov 4;90:107413. doi: 10.1016/j.compbiolchem.2020.107413

Genetic analysis of SARS-CoV-2 isolates collected from Bangladesh: Insights into the origin, mutational spectrum and possible pathomechanism

Md Sorwer Alam Parvez a, Mohammad Mahfujur Rahman a, Md Niaz Morshed a, Dolilur Rahman a, Saeed Anwar b, Mohammad Jakir Hosen a,*
PMCID: PMC7641529  PMID: 33221119


Keywords: COVID-19, SARS-CoV-2, Bangladeshi isolates, Spike protein, Mutation, ACE2 receptor

Abstract

As coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), rages across the world, killing hundreds of thousands and infecting millions, researchers are racing against time to elucidate the viral genome. Several Bangladeshi institutes have joined this race and sequenced isolates of the virus collected in Bangladesh. Here, we present a genomic analysis of these isolates. The analysis revealed that SARS-CoV-2 isolates sequenced from Dhaka and Chittagong belonged to lineages from Europe and India, respectively. Our analysis identified a total of 42 mutations, about half of which were synonymous, together with three large deletions. Most of the missense mutations in the Bangladeshi isolates were predicted to have weak effects on pathogenesis. Some mutations may make the virus less pathogenic than in other countries. Molecular docking analysis was performed to evaluate the effect of the mutations on the interaction between the viral spike protein and the human ACE2 receptor, but no significant difference was observed. This study provides preliminary insights into the origin of Bangladeshi SARS-CoV-2 isolates, their mutation spectrum and possible pathomechanism, which may give essential clues for designing therapeutics and managing COVID-19 in Bangladesh.

1. Introduction

Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Human-to-human transmission occurs via droplets expelled during face-to-face talking, coughing and sneezing. The duration of exposure is a crucial factor in transmission from an infected person to a healthy person: prolonged exposure carries a high risk of transmission, whereas short exposure is less likely to transmit the virus. After successful transmission, symptoms usually develop within 11.5 days (Tang et al., 2020; Wiersinga et al., 2020). Common symptoms include fever, cough, fatigue, shortness of breath, nausea, vomiting, and diarrhoea. The disease has emerged as a critical, rapidly evolving global health crisis (Yin et al., 2020; Zheng et al., 2020). By early June 2020, more than 6.5 million people had contracted the virus and nearly 400 thousand had died (CSSE, 2020). In Bangladesh, COVID-19 was first reported on 7 March 2020 by the Institute of Epidemiology, Disease Control and Research (IEDCR) (Paul, 2020). Until the end of March, the infection rate was fairly low; however, as the non-therapeutic prevention measures enforced by the government faced enormous challenges, the infection rate rose drastically in April and kept rising (Nabi and Shovon, 2020). People did not maintain the social distancing enforced by the government and tended to gather in crowded places (The Business Standard, 2020). Moreover, inadequate testing for COVID-19 diagnosis is a common criticism in Bangladesh (Tithila, 2020). As of 31 August 2020, nearly 313,000 confirmed cases had been reported in Bangladesh, with a total of 4281 deaths (IEDCR, 2020).

SARS-CoV-2 is a positive-stranded RNA virus with a genome of ∼30 kb that encodes structural and non-structural proteins. It is a spherical enveloped virus characterized by spike proteins projecting from the virion surface. The viral structure is formed by structural proteins, namely the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins; the S, M and E proteins are embedded in the viral envelope, and the N protein is located in the core (Ashour et al., 2020). Like other RNA viruses, SARS-CoV-2 is prone to frequent mutations, which makes it challenging to develop therapeutics and vaccines against the virus (Ruan et al., 2003; Wang et al., 2020). Sequence information from both the pathogen and the host would greatly facilitate the development of effective therapeutic strategies or vaccines (Seib et al., 2009). Analysis of genome sequences obtained from a wide array of isolates collected in different regions could indicate the efficacy of the vaccines being developed (Amanat and Krammer, 2020). Hence, researchers across the world are racing against time to unravel genomic insights into the virus.

By the end of August 2020, more than 80,000 genome sequences of SARS-CoV-2 had been submitted from different countries, most of them from European countries (∼46,000). About 20,000 complete genome sequences had been submitted from the USA, while China had submitted about 1000. In Bangladesh, more than 300 isolates of the virus had been sequenced and deposited in the GISAID (Global Initiative on Sharing All Influenza Data) database by the end of August 2020. Unfortunately, there is not yet a study on the genomics of SARS-CoV-2 in Bangladeshi isolates.

This study aimed to provide some preliminary insights into the genetic structure of all isolates reported in Bangladesh along with the mutational spectrum. It presents the first study on SARS-CoV-2 genomes obtained from Bangladesh, which, in broader terms, would help the therapeutic strategy development and vaccination programs against the virus in the country.

2. Materials and methods

2.1. Retrieval of the genome sequences of SARS-CoV-2

By the end of August 2020, more than 300 genome sequences of SARS-CoV-2 isolates from Bangladesh had been deposited in the GISAID database (https://www.gisaid.org/), and we retrieved all of them. Because many Bangladeshis returned during the COVID-19 outbreak, mainly from China, India, Saudi Arabia, Spain, Italy, Japan, Qatar, Canada, Kuwait, USA, France, Sweden, and Switzerland, the first genome sequence deposited from each of those countries was also retrieved. The sequence of the first isolate collected in China was used as the reference for further analysis.

2.2. Multiple sequence alignment and phylogenetic tree reconstruction

Multiple sequence alignment of all genome sequences of the Bangladeshi isolates, together with those from the other countries, was performed using the MUSCLE alignment program (Edgar, 2004). The alignment was then used to reconstruct a phylogenetic tree by the Maximum Likelihood (ML) method in IQ-TREE (Nguyen et al., 2015). ModelFinder was used to select the best-fit substitution model for tree reconstruction (Kalyaanamoorthy et al., 2017). A total of 88 models were tested, and the best-fit model (GTR+F+I+G4) was selected according to the Bayesian Information Criterion (BIC). Branch support was assessed by ultrafast bootstrap with UFBoot2, using 1000 bootstrap replicates (Hoang et al., 2018). Finally, the reconstructed phylogenetic tree was visualized and analyzed with the iTOL online tool (Letunic and Bork, 2019).
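
The BIC-based choice among substitution models can be illustrated with a minimal sketch. It uses the standard definition BIC = k ln(n) − 2 ln L, where k is the number of free parameters, n the number of alignment sites and L the maximized likelihood; the model with the lowest BIC is preferred. The log-likelihoods, parameter counts and the alignment length of 30,000 sites below are invented placeholders for illustration, not values from this study, and ModelFinder performs this comparison internally.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k*ln(n) - 2*lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

def best_model_by_bic(candidates, n_sites):
    """Return (name, score) of the candidate with the smallest BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

# Hypothetical numbers for illustration only (not taken from this study).
example_candidates = {
    "JC": (-45210.0, 1),
    "HKY+F": (-45050.0, 5),
    "GTR+F+I+G4": (-44980.0, 10),
}
best_name, best_score = best_model_by_bic(example_candidates, n_sites=30000)
print(best_name, round(best_score, 2))
```

In this toy comparison the richer model wins because its gain in log-likelihood outweighs the ln(n) penalty paid for each extra parameter.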

2.3. Identification of nucleotide variations in Bangladeshi strains

To identify the nucleotide variations, we performed multiple sequence alignment using Clustal Omega (Sievers and Higgins, 2014; Madeira et al., 2019), with the sequence of the strain from China [EPI_ISL_402124] as the reference genome. The alignment file was analyzed using the MView program (Brown et al., 1998). Because only 14 complete genome sequences of SARS-CoV-2 isolates from Bangladesh had been deposited in the database by 20 May 2020, the subsequent analyses were performed on these 14 sequences.
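
As an illustration of how nucleotide variations can be read off an alignment against a reference, the following is a minimal sketch in plain Python. It is not the Clustal Omega/MView workflow used here; the function name `call_variants` and the toy sequences are assumptions made for this example. Positions are reported in reference coordinates.

```python
def call_variants(ref_aln, sample_aln):
    """List differences between two aligned sequences (gaps written as '-').

    Positions are 1-based and follow the ungapped reference.
    Returns tuples (ref_position, ref_base, sample_base). A deleted base is
    reported with sample_base '-'; an inserted base is reported with
    ref_base '-' at the position of the preceding reference base.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0
    for r, s in zip(ref_aln.upper(), sample_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == s or s == "N":  # identical column or ambiguous base call
            continue
        variants.append((ref_pos, r, s))
    return variants

# Toy alignment (invented sequences, not real SARS-CoV-2 data).
ref = "ATGC-CGTTA"
sample = "ATGTACG--A"
for pos, r, s in call_variants(ref, sample):
    print(f"{pos}:{r} > {s}")
```

Runs of consecutive deleted positions in such output correspond to a single larger deletion when merged.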

2.4. Prediction and identification of the viral genes

FGENESV from SoftBerry (http://linux1.softberry.com/berry.phtml), a trained pattern/Markov chain-based viral gene prediction tool, was used to predict the genes and proteins from the viral genomes. Each predicted protein (for each viral genome) was identified using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI) interface (Madden, 2013). The identity of each protein was evaluated against the corresponding protein of the reference strain.

2.5. Detection of mutation spectrum

Clustal Omega was again used for multiple sequence alignment of each protein, and the alignments were analyzed with MView. Amino acid variations were identified in each protein by comparison with the corresponding protein of the reference strain. The nucleotide and amino acid variations were then compared to determine the type of each mutation.
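
The comparison of nucleotide and amino acid changes amounts to translating the reference and mutated codons and checking whether the encoded residue changes. A minimal sketch under the standard genetic code is given below; the function name `classify_substitution` and the example codons are illustrative assumptions, not output of this study.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper().replace("U", "T")]
    alt_aa = CODON_TABLE[alt_codon.upper().replace("U", "T")]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    elif ref_aa == "*":
        kind = "stop-lost"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind

# Illustrative codons: an A > G change at the second codon position and a
# C > T change at the third codon position.
print(classify_substitution("GAT", "GGT"))
print(classify_substitution("TTC", "TTT"))
```

A change at the third codon position is often synonymous because of the redundancy of the genetic code, whereas changes at the first or second position usually alter the residue.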

2.6. Prediction of mutational effects

The structural and functional effects of the missense variants, along with the change in stability, were analyzed using different prediction tools. I-Mutant was used to analyze stability changes, with all parameters kept at their defaults (Capriotti et al., 2005). Additionally, MutPred2 was used to predict the molecular consequences and functional effects of these mutations (Pejaver et al., 2017).

2.7. Homology modeling of spike proteins and model validation

The BLASTp program at the NCBI interface was used to find the most suitable template for homology modeling. A search against the Protein Data Bank (PDB) identified the spike protein structure with PDB ID 6VSB as a suitable template, as it has 99.59 % sequence similarity and 94 % coverage with the target sequence. Homology models of all mutant spike proteins, along with the spike protein of the reference, were built using SWISS-MODEL (Schwede et al., 2003). The predicted models were validated with Rampage and ERRAT (Colovos and Yeates, 1993; Lovell, 2002).

2.8. Molecular docking of spike protein with ACE2 receptor

Molecular docking was used to investigate the interaction of the mutant spike proteins with the human ACE2 receptor. First, the crystal structure of human ACE2 (PDB ID: 6D0G) was obtained from the Protein Data Bank (PDB), and PyMOL was used to clean the structure by removing all complexed molecules and water (Berman et al., 2000; DeLano, 2002). The HDOCK web server was used to predict the interaction between the spike protein and the human ACE2 receptor by protein-protein docking (Yan et al., 2017). PyMOL was also used to visualize the docking interactions.
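
The cleaning step was done in PyMOL; as a rough stand-in, the sketch below shows the same idea in plain Python by filtering PDB-format records, keeping polymer ATOM records and dropping HETATM records (water, ligands, ions). The function name `strip_hetero_records` and the toy records with invented coordinates are assumptions for illustration only.

```python
def strip_hetero_records(pdb_text):
    """Keep ATOM, TER and END records of PDB-format text; drop everything else.

    The record name occupies the first six columns of each line, so HETATM
    lines (waters, ligands, ions) and header/remark lines are filtered out.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "TER", "END"):
            kept.append(line)
    return "\n".join(kept) + "\n"

# Toy records with invented coordinates (not taken from 6D0G).
toy_pdb = "\n".join([
    "REMARK   1 TOY EXAMPLE",
    "ATOM      1  N   SER A  19      10.000  10.000  10.000  1.00 20.00           N",
    "HETATM    2  O   HOH A 901      12.000  12.000  12.000  1.00 30.00           O",
    "HETATM    3 ZN    ZN A 902      14.000  14.000  14.000  1.00 25.00          ZN",
    "TER",
    "END",
])
cleaned = strip_hetero_records(toy_pdb)
print(cleaned)
```

A filter of this kind also removes modified residues stored as HETATM, so it is only appropriate when those are not needed for docking.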

3. Results

3.1. Retrieved genome sequence of the SARS-CoV-2

A total of 311 complete genome sequences of SARS-CoV-2 isolates from Bangladesh and 12 genome sequences from isolates of other countries (China, India, Saudi Arabia, Spain, Italy, Japan, Qatar, Canada, Kuwait, USA, France, Sweden, and Switzerland) were retrieved from GISAID. The Wuhan strain with accession number EPI_ISL_402124 was used as the reference strain.

3.2. Phylogenetic tree analysis

Phylogenetic tree analysis revealed that the Bangladeshi isolates initially collected from Dhaka were very close to isolates from Spain and Switzerland, whereas the isolates collected from Chittagong shared a common ancestor with isolates from India and the USA (Fig. 1). The isolates collected from Chittagong also clustered with isolates from Middle Eastern countries such as Kuwait and Saudi Arabia. Moreover, all the isolates initially collected before the pandemic started in Bangladesh clustered with the strain from China, indicating the same lineage of the virus. However, the phylogenetic distance of the isolates from this lineage increased over time.

Fig. 1.


Maximum likelihood phylogenetic tree reconstructed from the sequences of all Bangladeshi isolates and those of other countries. Values at the nodes are bootstrap support values for the branches; branch lengths represent evolutionary distance.

3.3. Predictions of the genes and proteins

FGENESV predicted the presence of 12 genes in the reference. All Bangladeshi isolates except five (EPI_ISL_445213, EPI_ISL_445217, EPI_ISL_450342, EPI_ISL_450343, and EPI_ISL_450344) showed the same result. Isolates EPI_ISL_445213 and EPI_ISL_445217 were found to have ten genes (missing ORF7a and ORF10), isolates EPI_ISL_450343 and EPI_ISL_450344 had 11 genes (missing ORF8), and isolate EPI_ISL_450342 also had 11 genes (missing ORF10). Multiple sequence alignment revealed that most of the variation in Bangladeshi isolates occurred in the ORF1a polyprotein, surface glycoprotein, and nucleocapsid phosphoprotein. Remarkably, the envelope protein, ORF6, ORF8, and ORF10 were 100 % identical to the reference sequence in most of the isolates (Table 1).

Table 1.

Predicted number of genes and identity (%) compared to the reference strain. (Legends: S1: EPI_ISL_437912; S2: EPI_ISL_445213; S3: EPI_ISL_445214; S4: EPI_ISL_445215; S5: EPI_ISL_445216; S6: EPI_ISL_445217; S7: EPI_ISL_445244; S8: EPI_ISL_450339; S9: EPI_ISL_450340; S10: EPI_ISL_450341; S11: EPI_ISL_450342; S12: EPI_ISL_450343; S13: EPI_ISL_450344; S14: EPI_ISL_450345; M: Missing).

No Protein S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S13 S14
1 ORF1a Polyprotein 99.98 99.93 99.95 99.95 100 99.95 100 99.95 99.98 99.98 99.95 99.98 99.98 99.98
2 ORF1b Polyprotein 99.96 100 100 100 100 100 100 100 100 100 99.96 100 100 100
3 Surface Glycoprotein 99.92 100 99.84 99.92 99.92 99.92 99.92 100 100 100 100 99.92 99.92 100
4 ORF3a protein 100 99.64 100 99.64 100 99.64 100 100 100 100 100 99.27 99.64 99.64
5 envelope protein 100 100 100 100 100 100 100 100 100 100 100 100 100 100
6 Membrane Glycoprotein 100 100 100 100 100 100 100 100 100 100 100 100 100 100
7 ORF6 protein 100 100 100 100 100 100 100 100 100 100 100 99.36 100 100
8 ORF7a protein 100 M 100 100 100 M 100 100 100 100 100 100 100 100
9 ORF7b 100 100 100 100 100 100 100 100 100 100 100 100 100 100
10 ORF8 100 100 100 100 100 100 100 99.17 100 99.17 99.17 M M 99.35
11 Nucleocapsid phosphoprotein 99.52 99.76 99.28 99.28 100 99.28 100 99.76 99.52 99.52 99.76 99.76 99.76 99.76
12 ORF10 100 M 100 100 100 M 100 100 100 100 M 100 100 100
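
The identity values in Table 1 can be related to the number of substituted residues with a small worked example. The sketch below assumes a protein of 1273 residues (the annotated length of the reference spike protein; the length of the predicted protein in this study may differ) and counts identity over aligned columns, which only approximates how BLAST reports it. The function name `percent_identity` and the artificial sequences are assumptions for illustration.

```python
def percent_identity(ref_aln, query_aln):
    """Percentage of aligned columns carrying identical residues.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for r, q in zip(ref_aln, query_aln):
        if r == "-" and q == "-":
            continue
        compared += 1
        if r == q:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return round(100.0 * matches / compared, 2)

# Artificial sequences: only the length and the number of mismatches matter.
reference = "A" * 1273
one_substitution = "G" + reference[1:]
two_substitutions = "GG" + reference[2:]
print(percent_identity(reference, one_substitution))
print(percent_identity(reference, two_substitutions))
```

Under this assumption, one substitution gives 99.92 % and two substitutions give 99.84 %, which matches the pattern of surface glycoprotein values in Table 1.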

3.4. Mutation spectrum of Bangladeshi SARS-CoV-2 isolates

Analysis of all 14 Bangladeshi isolates revealed a total of 42 single nucleotide variants (Fig. 2); 24 of them were nonsynonymous (missense). In addition, three large deletions were found in these isolates (Table 2). Two of the deletions removed ORF7a in isolates EPI_ISL_445213 and EPI_ISL_445217. The third, spanning nucleotides 27911–28254, occurred in isolates EPI_ISL_450343 and EPI_ISL_450344 and removed ORF8 in both. Notably, three consecutive mutations were found at nucleotide positions 28882–28884, resulting in two amino acid substitutions in the nucleocapsid phosphoprotein.

Fig. 2.


Variations Plot of SARS-CoV-2 in Bangladeshi isolates.

Table 2.

All mutations found in the coding regions of the 14 isolates compared to the reference strain. (Legends: S1: EPI_ISL_437912; S2: EPI_ISL_445213; S3: EPI_ISL_445214; S4: EPI_ISL_445215; S5: EPI_ISL_445216; S6: EPI_ISL_445217; S7: EPI_ISL_445244; S8: EPI_ISL_450339; S9: EPI_ISL_450340; S10: EPI_ISL_450341; S11: EPI_ISL_450342; S12: EPI_ISL_450343; S13: EPI_ISL_450344; S14: EPI_ISL_450345).

Strain Mutation Protein Amino Acid Changes Mutation Types
S11, 14 283:C > T ORF1a Polyprotein No change Synonymous
S9, 10 602:C > T ORF1a Polyprotein No Change Synonymous
S1,2,3, 4,6 1164:A > T ORF1a Polyprotein I300F Missense
S1,2,3, 4, 5, 6, 7, 12, 13 3038:C > T ORF1a Polyprotein No Change Synonymous
S5 3689:C > T ORF1a Polyprotein No Change Synonymous
S2,3, 4, 6 4445:G > T ORF1a Polyprotein No Change Synonymous
S8 6730:A > G ORF1a Polyprotein N2155S Missense
S2, 3, 4, 6 8372:G > T ORF1a Polyprotein Q2702H Missense
S8, 9, 10, 11, 14 8783:C > T ORF1a Polyprotein No change Synonymous
S8, 9, 10, 11 10330:A > G ORF1a Polyprotein D3355G Missense
S14 10871:G > T ORF1a Polyprotein K3353R Missense
S2 10980:G > A ORF1a Polyprotein V3572M Missense
S11 12120:C > T ORF1a Polyprotein P3952S Missense
S8 12485:C > T ORF1a Polyprotein No Change Synonymous
S1, 2, 3, 4, 5, 6, 7, 12, 13 14409:C > T ORF1ab Polyprotein P214L Missense
S5, 8, 9, 10, 11, 14 15325:C > T ORF1ab Polyprotein No Change Synonymous
S8 15739:C > T ORF1ab Polyprotein No change Synonymous
S4 15896:C > T ORF1ab Polyprotein No Change Synonymous
S1 17020:G > T ORF1ab Polyprotein E1084D Missense
S12, 13 18878:C > T ORF1ab Polyprotein No Change Synonymous
S11 19405:G > A ORF1ab Polyprotein V1883T Missense
S12, 13 22445:C > T Surface Glycoprotein No change Synonymous
S14 23321:C > T Surface Glycoprotein No change Synonymous
S8, 9, 10, 11, 14 22469:G > T Surface Glycoprotein No change Synonymous
S1,2, 3, 4, 5, 6, 7, 12, 13 23404:A > G Surface Glycoprotein D623G Missense
S3 24488:T > C Surface Glycoprotein F1118L Missense
S12, 13 25495:G > T ORF3a protein No change Synonymous
S14 25506:A > T ORF3a protein Q38L Missense
S12 25512:C > T ORF3a protein S40L Missense
S12, 13 25564:G > T ORF3a protein Q57H Missense
S2, 4, 6 25907:G > T ORF3a protein G172C Missense
S12, 13 26736:C > T Membrane Glycoprotein No Change Synonymous
S12 27282:G > T ORF6 protein W27L Missense
S2 27432−27651:DEL ORF7a protein Whole protein deletion Deletion
S6 27486−27613:DEL ORF7a protein Whole protein deletion Deletion
S12, 13 27911−28254:DEL ORF8 Whole protein deletion Deletion
S14 28098:C > T ORF8 A65V Missense
S8, 9, 10, 11, 14 28145:T > C ORF8 L84S Missense
S8, 9, 10, 11, 14 28879:G > A Nucleocapsid phosphoprotein S202N Missense
S1,2,3, 4, 6 28882:G > A Nucleocapsid phosphoprotein R203K Missense
S1,2,3, 4, 6 28883:G > A Nucleocapsid phosphoprotein R203K Missense
S1,2,3, 4, 6 28884:G > C Nucleocapsid phosphoprotein G204R Missense
S9, 10 29293:G > T Nucleocapsid phosphoprotein K373N Missense
S2,3, 4, 6 29404:A > G Nucleocapsid phosphoprotein D377G Missense
S8, 9, 10, 11, 14 29643:G > A ORF10 No Change Synonymous

3.5. Mutational effects

Analysis of the 24 missense mutations predicted that 18 decrease structural stability. Nearly all mutations in the ORF1a polyprotein and both mutations in the surface glycoprotein were predicted to decrease the structural stability of these proteins (Table 3). Additionally, three mutations, occurring in the surface glycoprotein, ORF3a and ORF6, were predicted to have molecular consequences, including loss of sulfation in the surface glycoprotein, loss of proteolytic cleavage in ORF3a and loss of an allosteric site in ORF6 (Table 4 and Supplementary Table 1).

Table 3.

Prediction of the mutational effects on the structural stability.

Protein Amino Acid Changes SVM2 Prediction Effect DDG Value (kcal/mol)
ORF1a Polyprotein I300F Decrease −1.79
ORF1a Polyprotein N2155S Decrease −0.60
ORF1a Polyprotein Q2702H Decrease −0.68
ORF1a Polyprotein D3355G Decrease −0.95
ORF1a Polyprotein K3353R Increase −0.13
ORF1a Polyprotein V3572M Decrease −0.88
ORF1a Polyprotein P3952S Decrease −1.21
ORF1b Polyprotein P214L Decrease −0.83
ORF1b Polyprotein E1084D Decrease −0.75
ORF1b Polyprotein V1883T Decrease −1.46
Surface Glycoprotein D623G Decrease −0.93
Surface Glycoprotein F1118L Decrease −0.81
ORF3a protein Q38L Increase 0.12
ORF3a protein S40L Increase 0.40
ORF3a protein Q57H Decrease −0.90
ORF3a protein G172C Decrease −0.83
ORF6 protein W27L Decrease −0.96
ORF8 A65V Increase 0.02
ORF8 L84S Decrease −2.29
Nucleocapsid phosphoprotein S202N Increase −0.78
Nucleocapsid phosphoprotein R203K Decrease −0.93
Nucleocapsid phosphoprotein G204R Decrease −0.52
Nucleocapsid phosphoprotein K373N Increase −0.10
Nucleocapsid phosphoprotein D377G Decrease −0.44

Table 4.

Prediction of the mutational effects on the molecular consequences.

Protein Name | Mutation | Effects
Surface Glycoprotein | F1118L | Altered Ordered interface; Altered Disordered interface; Altered DNA binding; Loss of Sulfation at Y1119; Altered Metal binding
ORF3a | G172C | Loss of O-linked glycosylation at S171; Gain of Disulfide linkage at G172; Loss of Intrinsic disorder; Altered Transmembrane protein; Altered Ordered interface; Gain of Loop; Loss of Proteolytic cleavage at D173
ORF6 | W27L | Altered Ordered interface; Altered Disordered interface; Loss of Strand; Gain of Helix; Loss of Allosteric site at F22; Gain of Sulfation at Y31; Altered DNA binding; Altered Transmembrane protein

3.6. Prediction and validation of the homology models

In total, three models were generated using the template PDB ID: 6VSB: one for the spike protein of the reference strain and two for two different mutant spike proteins from Bangladeshi isolates (Fig. 3). Two types of mutations were found in the spike proteins of the Bangladeshi isolates, and most of the isolates contained the substitution D623G. Only one strain, EPI_ISL_445214, was found to have two substitutions: D623G and F1118L. The validation scores of these three models were broadly similar to those of the template, supporting the reliability of the models (Table 5).

Fig. 3.


Homology models of the spike proteins: (A) wild type; (B) model with one mutation, D623G; (C) model with two mutations, D623G and F1118L; (D) superimposition of all models. In B and C, the red dots mark the mutation sites. In D, purple represents the wild-type model, cyan the model with one mutation, and green the model with two mutations.

Table 5.

Model Validation assessment score.

Structures | Rampage Score: Favoured Region | Rampage Score: Allowed Region | ERRAT Score
Template | 95.8 % | 4.1 % | 76 %
Wild type | 92.9 % | 5.7 % | 83 %
Mutant Model 1 | 92.6 % | 5.3 % | 84.69 %
Mutant Model 2 | 92.8 % | 5.3 % | 83.78 %

3.7. Analysis of the interaction between spike proteins and human ACE2 receptor

The HDOCK server was used to predict the interaction between the human ACE2 receptor and the three 3D models described above (the reference spike protein and the two mutant models). The docking score against the human ACE2 receptor was identical for the three models, −244.42 (Table 6), indicating that the mutations in the spike protein do not hamper binding to the ACE2 receptor. For all three spike protein models, the interaction involved a single domain of the spike protein, amino acids 345 to 527, rather than the whole protein. This domain was conserved in all isolates, resulting in similar interactions with ACE2 (Fig. 4).
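
The reasoning that the substitutions leave the interface untouched can be checked with simple arithmetic on the residue numbers reported in this study. The sketch below is only that check; the names `INTERFACE_RANGE`, `SPIKE_SUBSTITUTIONS` and `inside` are assumptions introduced for the example.

```python
INTERFACE_RANGE = (345, 527)  # interacting residues reported in this study
SPIKE_SUBSTITUTIONS = {"D623G": 623, "F1118L": 1118}  # positions as numbered here

def inside(position, residue_range):
    """Return True if a residue position lies within an inclusive range."""
    start, end = residue_range
    return start <= position <= end

for name, position in SPIKE_SUBSTITUTIONS.items():
    location = "inside" if inside(position, INTERFACE_RANGE) else "outside"
    print(f"{name}: {location} the 345-527 interface")
```

Both substitutions fall outside the interacting segment, which is consistent with the unchanged docking score, although a position outside the interface could in principle still affect binding indirectly.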

Table 6.

Molecular docking results of human ACE2 receptor against wild-type and mutant spike protein of SARS-CoV-2.

Models Variations HDOCK Score
Model 1 Wild type −244.42
Model 2 D623G −244.42
Model 3 D623G, F1118L −244.42

Fig. 4.


Interaction of spike protein with ACE2: (A) cartoon model and (B) surface model. Green represents the receptor binding domain (RBD) of the spike protein, and cyan represents human ACE2.

4. Discussion

COVID-19 has become a global challenge for the scientific community, affecting millions of people and taking thousands of lives every day. Scientists worldwide are working hard to combat SARS-CoV-2, but no significant outcome has been obtained so far (Lake, 2020; Yuen et al., 2020). Along with other studies, genetic studies can give significant clues to understanding the pathogenesis of COVID-19. In addition to identifying critical therapeutic targets, genomic sequence data may provide insights into the pattern of global spread, the diversity during the epidemic, and the dynamics of evolution, which are crucial for unravelling the molecular mechanism of COVID-19 (Khailany et al., 2020). This study gives insights into the transmission of SARS-CoV-2 and the genetic diversity of the isolates in Bangladesh, and predicts the impact of their mutations.

It has been reported that about 600,000 people entered Bangladesh from other countries, including Spain and Switzerland, during the COVID-19 outbreak (wsws, 2020). The phylogenetic study revealed that the Bangladeshi isolates found in Dhaka descended from European lineages, whereas most of the isolates from Chittagong descended from Indian lineages. India neighbours Bangladesh, and many people cross the Bangladesh-India border every day for business, education and medical treatment. The chance that India was the origin of the virus that caused the COVID-19 epidemic in Bangladesh is therefore very high. The Middle East could also be a source of the virus, as Middle Eastern isolates were very close to the isolates collected from Chittagong. This is not surprising, because most Bangladeshi migrants living in the Middle East are from Chittagong, and thousands of them returned to their home city during the COVID-19 outbreak (Dastider, 2018; Ullah, 2020). However, some isolates from Chittagong were close to the isolates from Dhaka. Dhaka is the capital of Bangladesh and the sixth most densely populated city in the world. The virus may have spread from this city to other regions of the country, as it is the country's central hub for finance, politics, entertainment, and education. Moreover, the phylogenetic distance from the initially collected isolates increased over time, which indicates that the virus underwent extensive mutation during human-to-human transmission in Bangladesh.

Mutation in the viral genome is a ubiquitous mechanism by which viruses escape host defences. However, the mutation rate of SARS-CoV-2 is much lower than that of other RNA viruses, including seasonal flu viruses (Oberemok et al., 2020). In this study, some variations were found in the SARS-CoV-2 isolates from Bangladesh, which may affect the epidemiology and pathogenicity of the virus. A total of 42 mutations, about half of them synonymous, were identified in the coding regions along with three large deletions. Some isolates were even found not to encode one or more accessory proteins, such as ORF7a, ORF8, and ORF10, because of a large deletion in the genome. An 80-nucleotide deletion in ORF7a was also reported in a study conducted in Arizona (Mercatelli and Giorgi, 2020). Absence of these accessory proteins may have adverse effects on viral replication or pathogenesis and on the expression of structural protein E (Keng et al., 2006).

Moreover, ORF8 is involved in crucial pathways of coronavirus adaptation to human-to-human transmission. ORF7a, in turn, contributes to viral pathogenesis in the host by inhibiting Bone Marrow Stromal Antigen 2 (BST-2), which restricts the release of coronaviruses from infected cells. Loss of ORF7a therefore results in much stronger restriction of viral spread within the host (Taylor et al., 2015; Decaro and Lorusso, 2020). Loss of these accessory proteins may make the virus less pathogenic, resulting in a lower infection rate and mortality than in other countries (Keng et al., 2006).

Additionally, many variations causing substitutions of one or more amino acids were found in the structural and non-structural proteins of the Bangladeshi isolates compared with the reference. Most of the mutations were predicted to affect the structural stability of the proteins rather than alter their molecular functions. Among the structural proteins, most variations were found in the surface glycoprotein (spike) and nucleocapsid phosphoprotein. Spike proteins play a crucial role in viral entry into the cell by interacting with the human ACE2 receptor, while the nucleocapsid phosphoprotein is essential for packaging the viral genome into a helical ribonucleocapsid (RNP) and fundamental for viral self-assembly (Chang et al., 2014; Hoffmann et al., 2020). These functions may not be much affected by those mutations, as MutPred2 predicted that most of these mutations do not alter the molecular consequences of the proteins, which is consistent with the study by Wrapp and colleagues (Wrapp et al., 2020).

Interestingly, the D623G mutation was found in the spike protein of most isolates (nine of 14); it corresponds to the D614G mutation of the SARS-CoV-2 spike protein reported in many studies. The two differ only in amino acid numbering, which arises from the use of a predicted protein model in this study. This mutation in the spike protein has now become the dominant genotype around the world and could boost the transmission of the virus (Grubaugh et al., 2020). However, several recent studies demonstrated that this mutation made no difference to hospitalization outcomes (Korber et al., 2020; Wagner et al., 2020; Lorenzo-Redondo et al., 2020). Moreover, our molecular docking analysis revealed that these mutations in the spike protein do not affect the interaction with the ACE2 receptor, suggesting that the mutations in the spike protein may instead reflect better adaptation of SARS-CoV-2. This observation is also supported by two independent studies (Grubaugh et al., 2020; Isabel et al., 2020). Additionally, this study identified a domain in the spike protein (amino acids 345 to 527), rather than the whole protein, as being involved in the interaction with the human ACE2 receptor. This domain was conserved in all isolates reported in Bangladesh, so the mutations had no effect on the interaction. A recent study identified the receptor-binding domain of the spike protein, amino acids 319 to 541, as interacting with the ACE2 receptor, which is similar to our findings (Lan et al., 2020).

5. Conclusion

SARS-CoV-2 isolates from Dhaka were close to European lineages, and those from Chittagong were close to Indian and Middle Eastern lineages. Large deletions in isolates EPI_ISL_445213, EPI_ISL_445217, EPI_ISL_450343, and EPI_ISL_450344 may explain the less severe course of COVID-19 in Bangladesh compared with other countries. Mutations in the spike protein of SARS-CoV-2 may promote further adaptation of this fatal virus and could make therapeutics that target this protein less effective. Our study gives novel insights into SARS-CoV-2 epidemiology in Bangladesh.

Ethical approval

Not required.

Data availability

All data supporting the findings of this study are available within the article and its supplementary materials.

Funding

SUST Research Center funds for MJH. SA is supported by the (1) Alberta Innovates Graduate Student Scholarship (AIGSS), and the (2) Maternal and Child Health (MatCH) Scholarship programs.

CRediT authorship contribution statement

Md. Sorwer Alam Parvez: Conceptualization, Methodology, Software, Data curation, Formal analysis, Visualization, Validation, Writing - original draft. Mohammad Mahfujur Rahman: Formal analysis, Validation, Investigation. Md. Niaz Morshed: Formal analysis, Validation, Investigation. Dolilur Rahman: Formal analysis, Writing - original draft. Saeed Anwar: Data curation, Writing - review & editing. Mohammad Jakir Hosen: Supervision, Conceptualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

Appendix A

Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.compbiolchem.2020.107413.


References

  1. Amanat F., Krammer F. SARS-CoV-2 vaccines: status report. Immunity. 2020. doi: 10.1016/j.immuni.2020.03.007.
  2. Ashour H.M., Elkhatib W.F., Rahman M., Elshabrawy H.A. Insights into the recent 2019 novel coronavirus (SARS-CoV-2) in light of past human coronavirus outbreaks. Pathogens. 2020;9(3):186. doi: 10.3390/pathogens9030186.
  3. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., et al. The protein data bank. Nucleic Acids Res. 2000;28(1):235–242. doi: 10.1093/nar/28.1.235.
  4. Brown N.P., Leroy C., Sander C. MView: a web-compatible database search or multiple alignment viewer. Bioinformatics (Oxford, England). 1998;14(4):380–381. doi: 10.1093/bioinformatics/14.4.380.
  5. Capriotti E., Fariselli P., Casadio R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005;33(suppl_2):W306–W310. doi: 10.1093/nar/gki375.
  6. Chang C.K., Hou M.H., Chang C.F., Hsiao C.D., Huang T.H. The SARS coronavirus nucleocapsid protein–forms and functions. Antiviral Res. 2014;103:39–50. doi: 10.1016/j.antiviral.2013.12.009.
  7. Colovos C., Yeates T.O. ERRAT: an empirical atom-based method for validating protein structures. Protein Sci. 1993;2(9):1511–1519. doi: 10.1002/pro.5560020916.
  8. CSSE. ArcGIS. Johns Hopkins University; 2020. COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU). Retrieved 1 June 2020.
  9. Dastider P. The Financial Express; 2018. Manpower Export From Chittagong Regions Rises in 2017. 10 June. Available at: https://thefinancialexpress.com.bd/economy/bangladesh/manpower-export-from-chittagong-region-rises-in-2017-1515563258
  10. Decaro N., Lorusso A. Novel human coronavirus (SARS-CoV-2): a lesson from animal coronaviruses. Vet. Microbiol. 2020. doi: 10.1016/j.vetmic.2020.108693.
  11. DeLano W.L. Pymol: an open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr. 2002;40(1):82–92.
  12. Edgar R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
  13. Grubaugh N.D., Hanage W.P., Rasmussen A.L. Making sense of mutation: what D614G means for the COVID-19 pandemic remains unclear. Cell. 2020. doi: 10.1016/j.cell.2020.06.040.
  14. Hoang D.T., Chernomor O., Von Haeseler A., Minh B.Q., Vinh L.S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 2018;35(2):518–522. doi: 10.1093/molbev/msx281.
  15. Hoffmann M., Kleine-Weber H., Schroeder S., Krüger N., Herrler T., Erichsen S., et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020. doi: 10.1016/j.cell.2020.02.052.
  16. IEDCR. 2020. Institute of Epidemiology, Disease Control and Research (IEDCR). Available at: www.corona.gov.bd
  17. Isabel S., Grana-Miraglia L., Gutierrez J.M., Bundalovic-Torma C., Groves H.E., Isabel M.R., et al. Evolutionary and structural analyses of SARS-CoV-2 D614G spike protein mutation now documented worldwide. Sci. Rep. 2020;10:14031. doi: 10.1038/s41598-020-70827-z.
  18. Kalyaanamoorthy S., Minh B.Q., Wong T.K., von Haeseler A., Jermiin L.S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 2017;14(6):587–589. doi: 10.1038/nmeth.4285.
  19. Keng C.T., Choi Y.W., Welkers M.R., Chan D.Z., Shen S., Lim S.G., et al. The human severe acute respiratory syndrome coronavirus (SARS-CoV) 8b protein is distinct from its counterpart in animal SARS-CoV and down-regulates the expression of the envelope protein in infected cells. Virology. 2006;354(1):132–142. doi: 10.1016/j.virol.2006.06.026.
  20. Khailany R.A., Safdar M., Ozaslan M. Genomic characterization of a novel SARS-CoV-2. Gene Rep. 2020. doi: 10.1016/j.genrep.2020.100682.
  21. Korber B., Fischer W.M., Gnanakaran S., Yoon H., Theiler J., Abfalterer W., et al. Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus. Cell. 2020;182(4):812–827. doi: 10.1016/j.cell.2020.06.043.
  22. Lake M.A. What we know so far: COVID-19 current clinical knowledge and research. Clin. Med. 2020;20(2):124. doi: 10.7861/clinmed.2019-coron.
  23. Lan J., Ge J., Yu J., Shan S., Zhou H., Fan S., et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020:1–6. doi: 10.1038/s41586-020-2180-5.
  24. Letunic I., Bork P. Interactive Tree of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. 2019;47(W1):W256–W259. doi: 10.1093/nar/gkz239.
  25. Lorenzo-Redondo R., Nam H.H., Roberts S.C., Simons L.M., Jennings L.J., Qi C., et al. A unique clade of SARS-CoV-2 viruses is associated with lower viral loads in patient upper airways. medRxiv. 2020. doi: 10.1016/j.ebiom.2020.103112.
  26. Lovell S.C., Davis I.W., Arendall W.B., et al. Structure validation by Cα geometry: phi, psi and Cβ deviation. Proteins. 2002;50:437–450. doi: 10.1002/prot.10286.
  27. Madden T. The BLAST sequence analysis tool. In: The NCBI Handbook. 2nd edition. National Center for Biotechnology Information (US); 2013. [Internet]
  28. Madeira F., Park Y.M., Lee J., Buso N., Gur T., Madhusoodanan N., et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019;47(W1):W636–W641. doi: 10.1093/nar/gkz268.
  29. Mercatelli D., Giorgi F.M. Geographic and genomic distribution of SARS-CoV-2 mutations. Front. Microbiol. 2020;11. doi: 10.3389/fmicb.2020.01800.
  30. Nabi M.S., Shovon F.R. Dhaka Tribune; 2020. 20-fold Rises in COVID-19 Cases in Bangladesh Since April 1. 14 April. Available at: https://www.dhakatribune.com/health/coronavirus/2020/04/14/20-fold-rise-of-covid-19-cases-in-bangladesh-since-april-1
  31. Nguyen L.T., Schmidt H.A., Von Haeseler A., Minh B.Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 2015;32(1):268–274. doi: 10.1093/molbev/msu300.
  32. Oberemok V.V., Laikova K.V., Yurchenko K.A., Fomochkina I.I., Kubyshkin A.V. SARS-CoV-2 will continue to circulate in the human population: an opinion from the point of view of the virus-host relationship. Inflamm. Res. 2020:1–6. doi: 10.1007/s00011-020-01352-y.
  33. Paul R. Reuters; 2020. Bangladesh Confirms Its First Three Cases of Coronavirus. 8 March. Available at: https://www.reuters.com/article/us-health-coronavirus-bangladesh/bangladesh-confirms-its-first-three-cases-of-coronavirus-health-officials-idUSKBN20V0FS
  34. Pejaver V., Urresti J., Lugo-Martinez J., Pagel K.A., Lin G.N., Nam H.J., et al. MutPred2: inferring the molecular and phenotypic impact of amino acid variants. BioRxiv. 2017. doi: 10.1038/s41467-020-19669-x.
  35. Ruan Y., Wei C.L., Ling A.E., Vega V.B., Thoreau H., Thoe S.Y.S., et al. Comparative full-length genome sequence analysis of 14 SARS coronavirus isolates and common mutations associated with putative origins of infection. Lancet. 2003;361(9371):1779–1785. doi: 10.1016/S0140-6736(03)13414-9.
  36. Schwede T., Kopp J., Guex N., Peitsch M.C. SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res. 2003;31(13):3381–3385. doi: 10.1093/nar/gkg520.
  37. Seib K.L., Dougan G., Rappuoli R. The key role of genomics in modern vaccine and drug design for emerging infectious diseases. PLoS Genet. 2009;5(10). doi: 10.1371/journal.pgen.1000612.
  38. Sievers F., Higgins D.G. Clustal omega. Curr. Protoc. Bioinf. 2014;48(1):3–13. doi: 10.1002/0471250953.bi0313s48.
  39. Tang D., Comish P., Kang R. The hallmarks of COVID-19 disease. PLoS Pathog. 2020;16(5). doi: 10.1371/journal.ppat.1008536.
  40. Taylor J.K., Coleman C.M., Postel S., Sisk J.M., Bernbaum J.G., Venkataraman T., et al. Severe acute respiratory syndrome coronavirus ORF7a inhibits bone marrow stromal antigen 2 virion tethering through a novel mechanism of glycosylation interference. J. Virol. 2015;89(23):11820–11833. doi: 10.1128/JVI.02274-15.
  41. The Business Standard. 2020. So Much Social Distancing. 25 March. Available online at: https://tbsnews.net/coronavirus-chronicle/coronavirus-bangladesh/so-much-social-distancing-60973
  42. Tithila K.K. Dhaka Tribune; 2020. Bangladesh Expands Covid-19 Testing. 3 April. Available online at: https://www.dhakatribune.com/bangladesh/2020/04/03/bangladesh-expands-covid-19-testing
  43. Ullah A. Middle East Eye; 2020. Coronavirus: Bangladesh Imposes 14-day Quarantine on Gulf Workers. 17 March. Available at: https://www.middleeasteye.net/news/coronavirus-bangladesh-quarantine-workers-gulf-kuwait-saudi-arabia-qatar
  44. Wagner C., Roychoudhury P., Hadfield J., Hodcroft E.B., Lee J., Moncla L.H., et al. 2020. Comparing Viral Load and Clinical Outcomes in Washington State Across D614G Mutation in Spike Protein of SARS-CoV-2.
  45. Wang H., Li X., Li T., Zhang S., Wang L., Wu X., Liu J. The genetic sequence, origin, and diagnosis of SARS-CoV-2. Eur. J. Clin. Microbiol. Infect. Dis. 2020;1. doi: 10.1007/s10096-020-03899-4.
  46. Wiersinga W.J., Rhodes A., Cheng A.C., Peacock S.J., Prescott H.C. Pathophysiology, transmission, diagnosis, and treatment of coronavirus disease 2019 (COVID-19): a review. JAMA. 2020. doi: 10.1001/jama.2020.12839.
  47. World Socialist Web Site (wsws). 2020. Bangladesh Government Downplays COVID-19 Threat As Jobless Mount. Available at: https://www.wsws.org/en/articles/2020/03/16/bang-m16.html
  48. Wrapp D., Wang N., Corbett K.S., Goldsmith J.A., Hsieh C.L., Abiona O., et al. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science. 2020;367(6483):1260–1263. doi: 10.1126/science.abb2507.
  49. Yan Y., Zhang D., Zhou P., Li B., Huang S.Y. HDOCK: a web server for protein–protein and protein–DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res. 2017;45(W1):W365–W373. doi: 10.1093/nar/gkx407.
  50. Yin W., Mao C., Luan X., Shen D.D., Shen Q., Su H., et al. Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir. Science. 2020. doi: 10.1126/science.abc1560.
  51. Yuen K.S., Ye Z.W., Fung S.Y., Chan C.P., Jin D.Y. SARS-CoV-2 and COVID-19: the most important research questions. Cell Biosci. 2020;10(1):1–5. doi: 10.1186/s13578-020-00404-4.
  52. Zheng Y.Y., Ma Y.T., Zhang J.Y., Xie X. COVID-19 and the cardiovascular system. Nat. Rev. Cardiol. 2020;17(5):259–260. doi: 10.1038/s41569-020-0360-5.


Articles from Computational Biology and Chemistry are provided here courtesy of Elsevier
