BioMed Research International. 2020 Oct 20;2020:7234961. doi: 10.1155/2020/7234961

Sequence Analysis and Structure Prediction of SARS-CoV-2 Accessory Proteins 9b and ORF14: Evolutionary Analysis Indicates Close Relatedness to Bat Coronavirus

Chittaranjan Baruah 1, Papari Devi 2, Dhirendra K Sharma 3
PMCID: PMC7576348  PMID: 33102591

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has a single-stranded RNA genome that encodes 14 open reading frames (ORFs), eight of which encode accessory proteins that allow the virus to infect the host and promote virulence. The genome expresses around 29 structural and nonstructural protein products. The accessory proteins of SARS-CoV-2 are not essential for virus replication, but they affect viral release, stability, and pathogenesis and ultimately contribute to virulence. This paper presents the structure prediction and functional analysis of two such accessory proteins, 9b and ORF14, in the absence of experimental structures. Sequence analysis, structure prediction, functional characterization, and evolutionary analysis were based on the UniProtKB-reviewed amino acid sequences of the SARS-CoV-2 9b (P0DTD2) and ORF14 (P0DTD3) proteins. Modeling combined comparative and ab initio approaches. Global and local (per-residue) quality estimates from QMEANDisCo 4.0.0 and ProQ3 verified the structures as high quality, so they may serve as targets for structure-based drug design. Tunnel analysis revealed 1-2 highly active tunneling sites, which may provide inputs for advanced structure-based drug design or for formulating potential vaccines in the absence of a complete experimental structure. The evolutionary analysis of both proteins of human SARS-CoV-2 indicates close relatedness to the bat coronavirus. The whole-genome phylogeny indicates that only the new bat coronavirus, followed by pangolin coronaviruses, has a close evolutionary relationship with the novel SARS-CoV-2.

1. Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense, single-stranded RNA virus with a genome of 29,903 nucleotides. The 5′ terminus of the SARS-CoV-2 genome encodes a polyprotein (pp1ab), which is further cleaved into 15 nonstructural proteins (nsp-1 to nsp-10 and nsp-12 to nsp-16), whereas the 3′ terminus encodes four structural proteins (spike, envelope, membrane, and nucleocapsid) and eight accessory proteins (3a, 3b, p6, 7a, 7b, 8b, 9b, and ORF14) [1, 2]. The virus is the causative agent of coronavirus disease 2019 (COVID-19) and is contagious through human-to-human transmission. Previously identified human CoVs that cause human disease include the alphaCoVs hCoV-NL63 and hCoV-229E and the betaCoVs HCoV-OC43, HKU1, severe acute respiratory syndrome CoV (SARS-CoV), and Middle East respiratory syndrome CoV (MERS-CoV) [3]. Among the seven coronavirus (CoV) strains discovered so far, three proved to be highly pathogenic (SARS-CoV, MERS-CoV, and 2019-nCoV), causing endemic to severe CoV disease [4, 5]. The viruses can be classified into four genera: alpha, beta, gamma, and deltaCoVs [6]. SARS-CoV and MERS-CoV infections can result in life-threatening diseases and have pandemic potential. SARS-CoV-2 infection involves both the lower and upper respiratory tract [5, 7]. Furthermore, the potential for close contact between bats, civets, and humans in the wildlife trade in southern China, coupled with a possible propensity of these bats to foster CoV host-shifts, could explain SARS-like CoVs as the source of SARS-CoV [8].

To accommodate the wide spectrum of clinical presentations and outcomes of infections caused by SARS-CoV-2 [9], the WHO recently introduced the name COVID-19 (World Health Organization, 2020) to denote this disease. The acronym COVID-19 stands for “CO - corona,” “VI – viruses,” “D - disease,” and “19 - the year 2019” [10]. Although COVID-19 had a death rate of 3.27% as of September 2020, the 27,236,916 confirmed cases and 891,031 confirmed deaths recorded within a few months (December 8, 2019, to September 8, 2020) across 216 countries or territories are alarming. Indeed, this virus is highly contagious, and the number of infected people can double in less than seven days, with a basic reproductive number (R0) of 2.2–2.7 [11]. In humans, SARS-CoV and SARS-CoV-2 spread rapidly by respiratory droplets, airborne routes, or direct contact [12].
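The death rate quoted above follows directly from the confirmed case and death counts given in the same paragraph. A minimal arithmetic check in Python (the variable names are illustrative and not part of the original study):

```python
# Counts as stated in the text (December 8, 2019, to September 8, 2020).
confirmed_cases = 27_236_916
confirmed_deaths = 891_031

# Death rate as a percentage of confirmed cases.
death_rate_pct = 100 * confirmed_deaths / confirmed_cases

print(f"{death_rate_pct:.2f}%")  # prints "3.27%"
```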

The viral genome encodes 29 proteins (Nature doi:10.1038/s41433-020-0790-7). The functions of a large number of SARS-CoV-2 ORFs are poorly understood or unknown. The accessory proteins are unique to SARS-CoV, as they have little amino acid sequence homology with the accessory proteins of other coronaviruses [13]. An accessory ORF14 was first described in SARS-CoV by Marra et al. [14, 15]. Understanding the complete proteome of SARS-CoV-2, including the accessory proteins, is urgently needed to support drug development. Although the complete genome of SARS-CoV-2 is available in public domain databases, our previous study on SARS-CoV-2 proteome analysis [2] observed that the two “accessory” ORFs, ORF13 (9b) and ORF14, are poorly studied in SARS-CoV-2, as neither is annotated in most of the completed genome sequences [2]. Given the similarity of SARS-CoV-2 to bat SARS-CoV-like coronaviruses, it is likely that bats serve as reservoir hosts for its progenitor. The SARS-CoV-2 spike protein, optimized for binding to human-like ACE2, is the result of natural selection [16]. Therefore, the present study reports the in silico sequence analysis, structure prediction, and evolutionary analysis of two such accessory proteins, 9b and ORF14, of the newly emerged SARS-CoV-2.

2. Materials and Methods

2.1. Acquisition and Analysis of Sequences

The UniProtKB-reviewed amino acid sequences of the SARS-CoV-2 9b protein (accession no. P0DTD2) and ORF14 protein (accession no. P0DTD3) were used in the present study. A conceptual framework of the workflow in the current study is represented in Figure 1. The amino acid sequences for different taxa were downloaded from UniProtKB for phylogenetic analysis based on BLASTp [17] and FASTA hits [18]. Data mining and sequence analyses were carried out using ExPASy proteomic tools (https://www.expasy.org/tools). The physicochemical parameters were computed using ProtParam [19] and BioEdit [20].

Figure 1.

A conceptual framework of the present study for analysis of SARS-CoV-2 accessory proteins ORF9b and ORF14.

Sixteen coronavirus genome sequences with NCBI IDs AY572034, AY572035, KF569996, MH734115, MG92481, MG923467, MT040335, MT040333, MT072864, MN996532, MT791905, MT451886, MT973427, DQ412042, DQ648856, and AY321118 were retrieved from the NCBI genome database for construction of the whole-genome phylogeny.

2.2. Comparative and Ab Initio Modeling

BLASTp and FASTA searches were performed independently against the PDB to identify existing structures, to find a suitable template for comparative modeling, and to decide whether ab initio modeling was required (Table 1). The significance of the BLAST results was assessed using the E-value generated by the BLAST family of search algorithms and the query coverage. Comparative modeling was carried out in the Modeller 9.24 program [21], and ab initio modeling was done on the Baker laboratory Rosetta server (Robetta; https://robetta.bakerlab.org/). The loop regions were modeled using the ModLoop server [22]. The final 3D structures with complete coordinates were obtained by optimization of the molecular probability density function in Modeller 9.24 [23]. The computational protein structures were verified using the global and local (per-residue) quality estimates of ProQ3 and QMEANDisCo 4.0.0 [24]. All graphic presentations of the 3D structures were prepared using Chimera version 1.8.1 [25] and PyMOL 0.97rc [26].

Table 1.

BLAST results against available PDB structures, used to select the modeling method and template for the structures of the 9b and ORF14 proteins.

| Sl no. | Protein name and UniProtKB accession number | Length (aa residues) | PDB template(s) | Identity with template (%) | E-value | Query coverage (%) | Modeling method selected for the final structure |
|---|---|---|---|---|---|---|---|
| 1 | ORF9b protein (P0DTD2) | 97 | 2CME_B (79 aa) | 70.93 | 2e-34 | 89 | Comparative modelling |
| 2 | ORF14 protein (P0DTD3) | 73 | 3A32_A | 39.13 | 9.0 | 31 | Ab initio modelling |
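The choice between comparative and ab initio modeling summarized in Table 1 can be sketched as a simple rule on the BLAST statistics. The numeric cutoffs below are illustrative assumptions, not thresholds stated by the authors, and the names `BlastHit` and `choose_method` are hypothetical; only the identity, E-value, and coverage figures come from Table 1.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    protein: str
    template: str
    identity_pct: float
    e_value: float
    coverage_pct: float


def choose_method(hit, max_e_value=1e-5, min_identity_pct=30.0, min_coverage_pct=50.0):
    """Return 'comparative' when the template looks usable, else 'ab initio'.

    The three cutoffs are assumed defaults for illustration only.
    """
    usable = (
        hit.e_value <= max_e_value
        and hit.identity_pct >= min_identity_pct
        and hit.coverage_pct >= min_coverage_pct
    )
    return "comparative" if usable else "ab initio"


# Values taken from Table 1.
hits = [
    BlastHit("ORF9b (P0DTD2)", "2CME_B", 70.93, 2e-34, 89.0),
    BlastHit("ORF14 (P0DTD3)", "3A32_A", 39.13, 9.0, 31.0),
]

for hit in hits:
    print(hit.protein, "->", choose_method(hit))
```

Under these assumed cutoffs the rule reproduces the selections in Table 1: the ORF14 hit fails on both E-value and query coverage, so no template is considered reliable.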

2.3. Proteomics Analysis and Functional Annotation

Sequence-based functional annotation was carried out using Pfam (pfam.sanger.ac.uk/), HMMER version 3.3 [27], PROSITE, and InterProScan. The ProFunc server [28] was used to identify the likely biochemical function of the proteins from the predicted 3D structures. MOLE 2.0 [29] and Caver Web 1.0 [30] were used for the advanced analysis of biomacromolecular channels. The tunnel bottleneck radii and lengths were calculated in Ångström (Å), and throughput (estimated tunnel importance) was calculated as e^(-cost), where e is Euler's number. The Active Site Prediction server [31] was used for the computation of cavities in the target proteins.
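The throughput definition above, e raised to the negative tunnel cost, can be inverted to recover the cost from a reported throughput. The short sketch below does this for the throughput values reported in the Results; the function names are illustrative, and how the cost itself is derived from tunnel geometry is left to the tunnel-analysis software and is not reproduced here.

```python
import math


def throughput_from_cost(cost):
    """Throughput as defined in the text: e ** (-cost)."""
    return math.exp(-cost)


def cost_from_throughput(throughput):
    """Inverse of the definition: cost = -ln(throughput)."""
    return -math.log(throughput)


# Throughput values reported in Section 3.1.
reported_throughput = {
    "9b tunnel-1": 0.92,
    "9b tunnel-2": 0.78,
    "ORF14 tunnel-1": 0.69,
}

for tunnel, value in reported_throughput.items():
    print(f"{tunnel}: throughput={value:.2f}, implied cost={cost_from_throughput(value):.3f}")
```

A throughput close to 1 corresponds to a cost close to 0, so the ranking of tunnels by throughput is the reverse of their ranking by cost.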

2.4. Molecular Phylogenetic Analysis

The amino acid sequences used for phylogenetic analysis were aligned using ClustalW 1.6 [32] integrated in the MEGA X software [33]. The evolutionary history was inferred using maximum likelihood (ML) methods [34]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) was computed [35]. The initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the JTT model and then selecting the topology with a superior log likelihood value. To verify the reliability of the protein phylogeny (9b and ORF14 proteins), a whole-genome ML phylogenetic tree was constructed using the General Time Reversible model with Gamma distribution (G).
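The bootstrap test mentioned above rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate. The sketch below illustrates only the resampling step on a toy alignment; the sequences are hypothetical, the helper name `bootstrap_alignment` is an assumption, and tree inference itself (done in MEGA X in the study) is not reproduced.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    n_columns = len(alignment[0])
    sampled = [rng.randrange(n_columns) for _ in range(n_columns)]
    return ["".join(seq[i] for i in sampled) for seq in alignment]


# Toy aligned sequences of equal length (hypothetical, for illustration only).
toy_alignment = ["MKVLA", "MRVLA", "MKILS"]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicate alignments generated")
```

Each pseudo-replicate would then be passed to the tree-building step, and the bootstrap support of a clade is the percentage of replicate trees that contain it.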

3. Results and Discussion

3.1. Tertiary Structures of 9b and ORF14

The 9b protein (P0DTD2), with a molecular weight of 10796.10 Daltons, is rich in leucine (12.37%) and valine (10.31%) (Figure 2). The ORF14 protein (P0DTD3), with a molecular weight of 8049.27 Daltons, is rich in leucine (20.55%) and alanine (12.33%) (Figure 3). ProMotif analysis of the final predicted structure of 9b, modeled using comparative modeling, identified 2 sheets, 2 beta hairpins, 7 strands, 4 helices, 9 beta turns, and 1 gamma turn (Table 1; Figures 4 and 5; Figure S1). ProMotif analysis of the predicted ab initio structure of ORF14 identified 5 helices, 9 helix-helix interactions, and 2 beta turns (Table 1; Figures 6 and 7; Figure S2). The verified structures of the ORF9b and ORF14 proteins had QMEAN4 Z-scores of -1.64 (global score: 0.67 ± 0.09) and -1.18 (global score: 0.52 ± 0.11), respectively (Annexures 1 & 2). PROCHECK verification showed 94.0% and 95.5% of residues in the most favored regions (A, B, L) of the Ramachandran plot for the 9b and ORF14 proteins, respectively. Structural verification in ERRAT revealed good model quality, with quality factors of 97.56 and 100 for the ORF9b and ORF14 proteins, respectively. The verification reports indicate the high reliability of the theoretical structures (Figures S3-S6).

Figure 2.

Amino acid composition of SARS-CoV-2 9b protein (leucine and valine-rich).

Figure 3.

Amino acid composition of SARS-CoV-2 ORF14 protein (leucine and alanine-rich).

Figure 4.

Structure of SARS-CoV-2 9b protein along with major active sites.

Figure 5.

Structure of SARS-CoV-2 9b protein and its two high-throughput tunnels. Tunnels are colored on the basis of preferences in throughput values, i.e., tunnel-1 (blue) and tunnel-2 (green). The high relevance pocket is shown (yellow).

Figure 6.

Structure of SARS-CoV-2 ORF14 protein along with major active sites.

Figure 7.

Structure of SARS-CoV-2 ORF14 protein along with its throughput tunnels (blue).

In the 9b protein, the estimated high-throughput tunnel-1 (blue) has a bottleneck radius of 1.9 Å, a length of 1.5 Å, a distance to the surface of 1.5 Å, a curvature of 1.0, a throughput of 0.92, and 11 residues; tunnel-2 (green) has a bottleneck radius of 1.5 Å, a length of 5.4 Å, a distance to the surface of 4.9 Å, a curvature of 1.1, a throughput of 0.78, and 14 residues (Figure 5, Figures S7-S8). In the ORF14 protein, the estimated high-throughput tunnel-1 (blue) has a bottleneck radius of 1.2 Å, a length of 5.8 Å, a distance to the surface of 4.9 Å, a curvature of 1.2, a throughput of 0.69, and 14 residues (Figure 7; Figure S9).

Structural comparison of the SARS-CoV-2 9b protein (97 aa residues) with the crystal structure of the SARS-CoV ORF9b protein (79 aa residues; PDB ID 2CME), which shares 70.93% sequence identity (89% query coverage), showed minor differences in the numbers of strands, helices, and beta turns (SARS-CoV: 6 strands, 1 helix, 7 beta turns, 3 gamma turns; SARS-CoV-2: 7 strands, 4 helices, 5 beta turns). The larger number of helices and strands may be due to the greater sequence length of the SARS-CoV-2 9b protein. Because ORF9b is an alternative ORF, changes in the nucleotide sequence affect both the conventional and the alternative ORF, limiting the rate and extent to which the corresponding proteins can evolve [14]. The structure of SARS-CoV ORF9b is a 2-fold symmetric dimer constructed from two adjacent twisted β sheets [36]. Each of these sheets is formed from β strands contributed by both monomers, which form a highly interlocked architecture reminiscent of a handshake. The interdigitated nature of the ORF9b dimer rests on a highly unusual topology of largely antiparallel β sheets in which the monomers wrap around each other [36].

The accessory ORF14, described only by Marra et al. [14], is still an uncharacterized protein, and very little is known about its structure and interactions. ORF14 has no significant sequence homology to proteins in other coronaviruses. It belongs to the group termed predicted unknown proteins (PUPs) and is unique to SARS-CoV [15, 37]. Interactions among SARS-CoV accessory proteins were studied using a bimolecular fluorescence complementation assay [15, 37]. Self-interactions were observed for 9b and ORF14, indicating the formation of dimeric or multimeric complexes in the nucleus, similar to the findings of von Brunn et al. [15]. ORF9b and ORF14 self-interactions were also found in the co-immunoprecipitation (CoIP) assay. α-Galactosidase and β-galactosidase assays of the 9b-9b, 8a-9b, and ORF14-ORF14 protein interactions demonstrated self-interactions [15, 37].

3.2. Proteomics Profiles of 9b and ORF14

The InterProScan search result for the ORF9b protein revealed that it belongs to the protein 9b, SARS-like family (IPR018542) (Figure 8). This is a family of proteins found in SARS and SARS-like coronaviruses. It includes protein 9b from SARS coronavirus 2 (SARS-CoV-2), human SARS coronavirus (SARS-CoV), and bat coronaviruses. Protein 9b is one of 8 accessory proteins in SARS-CoV [38]. The gene that encodes this protein (ORF9b, also known as ORF13) lies within the nucleocapsid (N) gene as an alternative ORF [39]. The ORF9b accessory protein is associated with the spike and nucleocapsid proteins and has unusual membrane-binding properties [14, 36]. SARS-CoV ORF9b has been shown to localize to the outer mitochondrial membrane and target mitochondrial antiviral signaling proteins (MAVS), suppressing innate immunity [40, 41]. Antibodies against SARS-CoV ORF9b have been found in patients, demonstrating that it is produced during infection [36, 42]. Protein 9b from SARS-CoV comprises 98 amino acids; its structure has a novel fold that forms a dimeric tent-like beta structure with an amphipathic surface and a central hydrophobic cavity that binds lipid molecules [36]. This cavity is likely involved in membrane attachment [36]. The sequence of ORF9b is well conserved in different SARS isolates; however, there is little homology between protein 9b from SARS-CoV and the I-protein (protein 9b homologue) present in other coronaviruses [39, 43].

Figure 8.

Protein family membership of SARS-CoV-2 protein 9b resembles the SARS-like protein (IPR018542) and the 9b SARS InterPro homologous superfamily (9-97; IPR037223).

The InterProScan search result for the ORF14 protein revealed its family membership as protein 14, SARS-like (IPR035113) (Figure 9). This is a family of unknown function found in SARS and SARS-like coronaviruses. It includes uncharacterized protein 14 from SARS coronavirus 2 (SARS-CoV-2), human SARS coronavirus (SARS-CoV), and bat coronavirus Rp3/2004 (SARS-like coronavirus Rp3) [14]. In SARS-CoV, ORF14 is completely contained within the ORF encoding the nucleocapsid protein (N) [38]. In SARS-CoV-2, uncharacterized protein 14 was predicted to contain one transmembrane helix. The ORF14 protein has three domains: (i) a noncytoplasmic domain (1-51), (ii) a transmembrane region (52-72), and (iii) a cytoplasmic domain (73-73).

Figure 9.

Protein family membership of SARS-CoV-2 protein 14 resembles SARS-like protein (IPR035113). The protein has three domains: (i) a noncytoplasmic domain (1-51), (ii) a transmembrane region (52-72), and (iii) a cytoplasmic domain (73-73).

Protein 9b is annotated with the subcellular locations host cytoplasmic vesicle membrane (as a peripheral membrane protein) and host cytoplasm, and it binds noncovalently to intracellular lipid bilayers. Gene ontology places it in the host cell cytoplasmic vesicle membrane, and its subunit structure is a homodimer with binary interactions. The ORF14 protein may play a role in host-virus interaction; by sequence analysis, its subcellular location is the membrane, where it is predicted to be a single-pass membrane protein. Its gene ontology and topology annotations comprise the cellular component term integral component of membrane and the features transmembrane and transmembrane helix.

3.3. Functional Annotation of 9b and ORF14

PROSITE analysis of the 9b protein revealed three sites: (i) PS00006 CK2_PHOSPHO_SITE Casein kinase II phosphorylation site (24-27; 63-66; 83-86), (ii) PS00008 MYRISTYL N-myristoylation site (49-54), and (iii) PS00005 PKC_PHOSPHO_SITE Protein kinase C phosphorylation site (95-97).

The domain profile of ORF9b resembles the Sarbecovirus 9b domain profile (PROSITE entry PS51920). Coronaviruses are divided into four genera: α-coronavirus, β-coronavirus, γ-coronavirus, and delta-coronavirus. SARS, SARS-CoV-2, BatCoV RaTG13, and Bat-SARS-like coronavirus (BAT-SL-CoVZXC21 and BAT-SL-CoVZC45) belong to the Sarbecovirus subgenus of β-coronavirus.

Coronaviruses code for the characteristic proteins replicase polyprotein (pp1ab), spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins. In addition, Sarbecoviruses code for subgroup-specific accessory proteins that are thought to be dispensable for viral replication in cell culture but may be important for virus-host interactions and thus contribute to virus fitness.

To achieve the optimum output from their limited genomes, viruses frequently make use of alternative open reading frames, in which translation is initiated from a start codon within an existing gene and, being out of frame, gives rise to a distinct protein product. ORF9b codes for a small accessory protein of 98 amino acid residues, which is found in Sarbecovirus-infected cells. The ORF9b protein (p9b) has been shown to self-interact and to interact with nsp5, nsp14, and the accessory protein p6. The function of p9b is unknown, although it has been suggested that it specifically recognizes and binds to intracellular vesicles. The 9b protein could have a role in membrane interactions during the assembly of the virus membranes [36, 44, 45].

The 9b domain has a fold with seven β-strands (PDB ID: 2CME). The β-strands from two molecules form two adjacent twisted β-sheets, resulting in a highly interlocked handshake structure that contains a hydrophobic central cavity, which binds to lipids and stabilizes the molecule (Meier et al., 2006) [36]. Protein 9b is a homodimer that plays a role in membrane interactions during the assembly of the virus.

PROSITE analysis of the ORF14 protein also identified three sites: (i) PS00005 PKC_PHOSPHO_SITE Protein kinase C phosphorylation site (aa 19-21), (ii) PS00008 MYRISTYL N-myristoylation site (aa 22-27), and (iii) PS00006 CK2_PHOSPHO_SITE Casein kinase II phosphorylation site (aa 39-42).

Casein kinase II (CK-2) is a protein serine/threonine kinase whose activity is independent of cyclic nucleotides and calcium. CK-2 phosphorylates many different proteins [46]. Regarding the N-myristoylation site, an appreciable number of eukaryotic proteins are acylated by the covalent addition of myristate (a C14-saturated fatty acid) to their N-terminal residue via an amide linkage [47, 48]. The sequence specificity of the enzyme responsible for this modification, myristoyl CoA:protein N-myristoyl transferase (NMT), has been derived from the sequences of known N-myristoylated proteins and from studies using synthetic peptides [48]. In vivo, protein kinase C exhibits a preference for the phosphorylation of serine or threonine residues found close to a C-terminal basic residue (Meier et al., 2006; Liu et al., 2014). The presence of additional basic residues N- or C-terminal to the target amino acid enhances the Vmax and Km of the phosphorylation reaction [49].
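The three PROSITE site types discussed above are defined by short sequence patterns, so their detection can be illustrated with regular expressions. The sketch below translates the published PROSITE patterns for PS00005 ([ST]-x-[RK]), PS00006 ([ST]-x(2)-[DE]), and PS00008 (G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}) into Python regex syntax and scans a toy peptide. The peptide is hypothetical and is not a fragment of 9b or ORF14, and the helper name `find_sites` is an assumption; the study itself used the PROSITE service.

```python
import re

# PROSITE patterns rewritten as Python regular expressions.
PROSITE_REGEX = {
    "PS00005 PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "PS00006 CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "PS00008 MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def find_sites(sequence):
    """Return {site name: [(start, end), ...]} with 1-based inclusive positions.

    A lookahead is used so that overlapping matches are all reported.
    """
    sites = {}
    for name, pattern in PROSITE_REGEX.items():
        lookahead = re.compile(f"(?=({pattern}))")
        sites[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in lookahead.finditer(sequence)
        ]
    return sites


toy_peptide = "TLKEGAVLSA"  # hypothetical sequence for illustration
for name, positions in find_sites(toy_peptide).items():
    print(name, positions)
```

Because these motifs are short and occur frequently by chance, a pattern match alone indicates only a potential modification site.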

The instability index values of the ORF9b and ORF14 proteins of SARS-CoV-2 were 33.11 and 32.56, respectively, which classifies both proteins as stable (Table 2). The aliphatic indices of SARS-CoV-2 9b and ORF14 were 105.46 and 125.62, respectively, indicating high thermal stability in both proteins (Table 2). The grand average of hydropathicity (GRAVY) values of the 9b and ORF14 proteins were computed as -0.085 and 0.603, respectively, which indicates that protein 9b is hydrophilic and ORF14 is hydrophobic in nature (Table 2; Figures S10 and S11).
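Two of the ProtParam quantities above have simple closed forms. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index is X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. The sketch below implements both, together with the conventional instability-index cutoff of 40 for classifying a protein as stable. The function names and the toy input are illustrative, and the sketch is not a substitute for ProtParam, which was used in the study.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    def mole_pct(aa):
        return 100.0 * sequence.count(aa) / len(sequence)
    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


def is_predicted_stable(instability_index):
    """Conventional cutoff: an instability index below 40 is classified as stable."""
    return instability_index < 40.0


toy = "AVIL"  # hypothetical peptide, not a viral sequence
print(gravy(toy), aliphatic_index(toy))
print(is_predicted_stable(33.11), is_predicted_stable(32.56))  # values from the text
```

A positive GRAVY value indicates a hydrophobic protein and a negative value a hydrophilic one, which is the basis for the contrast drawn between 9b and ORF14.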

Table 2.

Physicochemical parameters of SARS-CoV-2 9b and ORF14 proteins and comparison with other coronaviruses.

| Protein name | Accession no. | Organism | Length | MW (Da) | pI | Chemical formula | Instability index | Aliphatic index | GRAVY |
|---|---|---|---|---|---|---|---|---|---|
| ORF9b protein | P0DTD2 | Human SARS-CoV-2 | 97 | 10796.66 | 6.56 | C478H796N130O142S5 | 33.11 | 105.46 | -0.085 |
| | Q3LZX3 | Bat CoV | 97 | 10722.54 | 6.05 | C475H786N128O142S5 | 41.80 | 104.43 | -0.012 |
| | Q6RD12 | Human SARS-CoV | 98 | 10802.45 | 4.90 | C472H778N130O148S5 | 38.95 | 98.47 | -0.122 |
| | A0A023PUR2 | R. affinis CoV | 98 | 10781.50 | 5.69 | C475H787N131O145S4 | 39.97 | 105.41 | -0.050 |
| | Tr Q3ZTD0 | SARS-CoV civet010 | 98 | 10790.40 | 4.90 | C470H774N130O149S5 | 38.95 | 94.49 | -0.176 |
| ORF14 protein | P0DTD3 | Human SARS-CoV-2 | 73 | 8049.65 | 5.79 | C359H588N92O100S8 | 32.56 | 125.62 | 0.603 |
| | AVP78040 | Bat CoV | 70 | 7690.13 | 6.38 | C341H555N91O96S7 | 25.59 | 117.14 | 0.466 |
| | ARO76392 | Human SARS-CoV | 70 | 7842.29 | 6.25 | C354H571N93O97S5 | 32.40 | 119.86 | 0.321 |
| | AAU04674 | SARS-CoV civet | 70 | 7868.37 | 6.25 | C357H577N93O96S5 | 23.92 | 125.43 | 0.387 |
| | AHX37568 | R. affinis CoV | 70 | 7810.21 | 6.39 | C352H563N93O97S5 | 32.81 | 111.57 | 0.196 |

GRAVY: Grand average of hydropathicity.

A comparison of the physicochemical parameters of the ORF9b and ORF14 proteins among the different coronaviruses showed that the ORF9b protein of SARS-CoV-2 has 76.53% sequence identity with Rhinolophus affinis coronavirus, 74.23% with human SARS-CoV, and 73.20% with bat SARS-CoV. The ORF14 protein of SARS-CoV-2 has 92.86% identity with the ORF14 protein of bat coronavirus, 78.57% with human SARS-CoV, and 77.14% with civet and Rhinolophus affinis coronavirus. The ORF9b protein showed a wide range of isoelectric points, from 4.9 (human and civet SARS-CoV) to 6.56 (human SARS-CoV-2). The instability index values of ORF9b of coronaviruses ranged from 33.11 (human SARS-CoV-2) to 41.80 (bat CoV) (Table 2). The instability index values of ORF14 of coronaviruses ranged from 23.92 (SARS-CoV civet) to 32.81 (R. affinis CoV) (Table 2). This indicates higher stability of the ORF14 protein than the ORF9b protein. The grand average of hydropathicity (GRAVY) values were in the range of -0.176 to -0.012 for the ORF9b protein and 0.196 to 0.603 for ORF14, indicating that the ORF9b protein is hydrophilic and ORF14 is hydrophobic in nature (Table 2). The physicochemical parameters of the SARS-CoV-2 proteins, including amino acid composition, pI, instability index, and hydropathicity, were most similar to those of bat SARS-CoV (Table 2).
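Percent identity values such as those above are obtained by counting identical residues in a pairwise alignment and dividing by a chosen denominator. The sketch below uses the number of alignment columns as the denominator, which is an assumed convention, since the text does not state which one was used. The aligned strings are toy examples and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap residues.

    Both inputs must be rows of the same pairwise alignment ('-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Toy alignment rows (hypothetical): 3 identical columns out of 4.
print(percent_identity("ACDE", "ACDF"))  # prints 75.0

# For scale: 65 identical positions over 70 columns would give 92.86%.
print(round(100 * 65 / 70, 2))  # prints 92.86
```

Different tools normalize by alignment length, by the shorter sequence, or by the query length, so identity values should only be compared when the same convention is used.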

The functional analysis results from ProFunc are presented in Table 3. Of the nine estimated cavity points in the structure of the 9b protein, cavity-1, formed by the amino acids “NPQVDKGEYTAMIFRLS,” has xyz coordinates of 10.606, -3.416, and -5.832 and a volume of 1261 Å³; cavity-2, formed by the amino acids “DKQPRVELNTFYIAM,” has a cavity point of 10.014, -0.066, and 5.372 and a volume of 971 Å³ (Figure 5; Table S1). Of the 10 potential cavities computed as active sites of the ORF14 protein, cavity-1, formed by the amino acids “HEPIATVLKWCDMY,” has xyz coordinates of -7.341, 17.615, and -5.427 and a volume of 662 Å³; cavity-2, formed by the amino acids “PATIHQVLWKYENMCSF,” has a cavity point of -1.634, 12.400, and 3.280 and a volume of 494 Å³ (Figure 7; Table S2). ORF9b is an unusual membrane-binding protein with a long hydrophobic lipid-binding tunnel.

Table 3.

Predicted functions of SARS-CoV-2 9b and ORF14 proteins with respective ProFunc score (shown within parenthesis).

| Protein name | Protein name terms | GO terms: cellular component | GO terms: biological process | GO terms: biochemical function |
|---|---|---|---|---|
| 9b protein | SARS coronavirus ORF9b (0.90); aquifex aeolicus trbp111 structure-specific (0.50); aeolicus trbp111 structure-specific trna (0.50); trbp111 structure-specific trna binding (0.50); ustilago maydis lipase um03410 (0.50); maydis lipase um03410 short (0.50); lipase um03410 short form (0.50); um03410 short form without (0.50) | Extracellular region (1.33); cytoplasm (0.85) | Cellular process (1.66); cellular metabolic process (1.66) | tRNA binding (0.50); RNA binding (0.50); aminoacyl-tRNA ligase activity (0.50); binding (0.50) |
| ORF14 protein | Human (1.72); domain (1.56); atcc (1.00); nmr (1.00); aminoimidazole riboside kinase (0.70); phycocyanin (0.57); ccm3 (0.50); c-terminal regulatory domain stk25 (0.50) | Cytoplasm (1.86); cell (1.86); cell part (1.86); intracellular (1.86) | Metabolic process (3.06); cellular process (2.27); cellular metabolic process (2.27); primary metabolic process (1.53) | Catalytic activity (2.91); binding (2.48); metal ion binding (1.61); ion binding (1.61) |

3.4. Molecular Phylogeny of the 9b and ORF14 Proteins

Evolutionary analysis of the 9b and ORF14 proteins of SARS-CoV-2 was based on the Maximum Likelihood (ML) method and the JTT matrix-based model. The percentage of trees in which the associated taxa clustered together is shown next to the branches. The ML phylogenetic tree based on the amino acid sequence of the 9b protein revealed close evolutionary relatedness to human SARS coronavirus (UniProtKB accession number APO40587), followed by the bat SARS-CoV (UniProtKB accession numbers AAZ67037, AAZ41338, and Q3LZX3) (Figure 10).

Figure 10.

Evolutionary analysis of 9b protein of SARS-CoV-2 by Maximum Likelihood method and JTT matrix-based model (Jones and Taylor, 1992). The tree with the highest log likelihood (-1113.75) is shown. This analysis involved 9 amino acid sequences. There were a total of 141 positions in the final dataset. Evolutionary analyses were conducted in MEGA X.

However, the ML phylogenetic tree based on the amino acid sequence of the human SARS-CoV-2 ORF14 protein showed the closest evolutionary relationship with bat SARS-like coronaviruses (accession number AVP78040), with 100% bootstrap support (Figure 11).

Figure 11.

Evolutionary analysis of ORF14 protein of SARS-CoV-2 by Maximum Likelihood method and JTT matrix-based model (Jones and Taylor, 1992). The tree with the highest log likelihood (-497.56) is shown. The percentage of trees, in which the associated taxa clustered together, is shown next to the branches. This analysis involved 14 amino acid sequences. There were a total of 73 positions in the final dataset. Evolutionary analyses were conducted in MEGA X.

The whole-genome phylogenetic tree strongly supports the protein phylogeny based on the ORF9b and ORF14 proteins, indicating that SARS-CoV-2 is most closely related to the newly sequenced bat coronavirus RaTG13 genome (MN996532; March 2020, China), followed by the pangolin coronavirus genomes (MT040333, MT040335, MT072864) (Figure 12). Bat SARS-CoV Rf1/2004 (DQ412042) and bat CoV 273/2005 (DQ648856), along with human SARS-CoV and the horseshoe bat (Rhinolophus affinis) coronavirus, formed a different clade in the whole-genome phylogeny, indicating rapid evolution of coronaviruses. Moreover, all the pangolin coronavirus genomes sequenced in April 2020 (MT040333, MT072864, MT040335) were found to be sister taxa in the whole-genome phylogeny. The findings indicate that only the new bat coronavirus, followed by the pangolin coronaviruses, is closely related evolutionarily to the novel SARS-CoV-2. The present study supports the view that, as in the human host, the coronavirus underwent rapid evolution in bats and in pangolins as amplifying hosts (Figure 12).

Figure 12.

Evolutionary relationships of different coronaviruses based on whole-genome bootstrap phylogenetic analysis (ML tree).

Earlier research claimed that snakes or pangolins may be intermediate hosts in which the coronavirus arose through recombination events [50]. Cross-species transmission of zoonotic coronaviruses (CoVs) can result in disease outbreaks [51]. Molecular analysis supported bats as natural hosts for SARS-CoV, but palm civets (Paguma larvata) had a critical role in the transmission to humans [52, 53]. Bats are implicated in the origin of SARS-CoV-2. A strain very similar to SARS-CoV-2 (RaTG13 CoV) was detected in a Rhinolophus affinis bat, with 96% genome similarity to the SARS-CoV-2 genome sequence. Considering that bats were in hibernation when the outbreak occurred, the virus is more likely to have been transmitted via other species [54]. Both the protein (9b and ORF14) and genome phylogeny results of the present study are consistent with the hypothesis of a zoonotic transmission route based on contact with Malayan pangolins (Manis javanica) by visitors to the Huanan seafood market in Wuhan, China [55]. The close phylogenetic relationship to RaTG13 provides evidence that 2019-nCoV may have originated in bats [10]. Unlike bats, which are able to suppress viral replication, the pangolin is an amplifying host that allows the viral load to increase and accelerated the jump of SARS-CoV-2 to the human host and subsequent human-to-human transmission [56].

Another study, which supports the present findings, showed that the bat and pangolin coronaviruses were the most closely related to SARS-CoV-2, with 96% and 86% identity across the genome, respectively [57]. The comparison of bat and pangolin viruses by Li et al. shows that the BetaCoV/bat/Yunnan/RaTG13/2013 virus was more similar to the SARS-CoV-2 virus than the coronavirus obtained from the two pangolin samples (SRR10168377 and SRR10168378). The human SARS-CoV-2 virus, which is responsible for the recent outbreak of COVID-19, did not come directly from pangolins [13].

Tunnels are access paths connecting the interior of molecular systems with the surrounding environment. The presence of tunnels in proteins influences their reactivity, as they determine the nature and intensity of their interactions. Tunnel analysis of the newly predicted structures of the present study has estimated the presence of multiple tunnels in the ORF14 protein. The β sheets of ORF9b form a tent-like structure that contains a 22 Å long central cavity, lined by hydrophobic side chains, which spans the molecule and is open at both ends [36]. The presence of multiple tunnels in this so far uncharacterized protein may play a key role in a large number of transport pathways for small ligands, influencing their reactivity. It has been experimentally demonstrated that tunnels and their properties can define many important protein characteristics, such as substrate specificity, enantioselectivity, stability, and activity [58]. The details of the structure verification report have been deposited in ModelArchive and will be available to download along with the structures (https://www.modelarchive.org/doi/10.xxxx/).

Several years before the outbreak of SARS, two other zoonotic viruses, Nipah virus and Hendra virus, emerged in Asia and Australia; both were known to originate from bats [59, 60]. This led scientists to consider bats in the search for reservoirs of SARS-CoV. The present study on the evolution of 9b and ORF14 also strongly indicates a bat origin for the newly emerged human SARS-CoV-2. Understanding the bat origin of human coronaviruses is helpful for predicting and preventing the emergence of another pandemic in the future [61].

In a recent correspondence published in Nature Medicine, Andersen et al. [16] clearly showed that SARS-CoV-2 is not a laboratory construct or a purposefully manipulated virus. The potential for close contact between bats, civets, and humans in the wildlife trade in southern China, coupled with a possible propensity of these bats to foster CoV host shifts, could explain how SARS-like CoVs became the source of SARS-CoV [8]. This possibility is consistent with molecular results on bat CoVs that suggest a recent host shift from bats to civets or other animals and humans [62]. A recent study also reported that the sequence of bat coronavirus RaTG13, sampled from a Rhinolophus affinis bat, is ~96% identical overall to SARS-CoV-2 [10]. With human activity increasingly overlapping the habitats of bats, disease outbreaks resulting from spillover of bat coronaviruses will continue to occur in the future, even though direct transmission of bat coronaviruses to humans appears to be rare [61].

It is reasonable to propose that ORF9b in SARS-CoV-2 may contribute to viral pathogenesis as the I-protein does in mouse hepatitis virus (MHV). The interactions between S, N, and ORF9b may help to localize ORF9b inside the particles but closer to the envelope. The structure of ORF9b, an intertwined dimer with an amphipathic outer surface and a long hydrophobic lipid-binding tunnel, suggests how this protein may interact, via an unusual anchoring mechanism, with compartments of the ER-Golgi network to act as an accessory protein during the assembly of the SARS virion [36]. However, further analyses of the properties and functions of the ORF9b and ORF14 proteins are still necessary to understand their contributions to virus pathogenesis. All current studies on accessory proteins of coronaviruses, including SARS-CoV-2, suggest that they are not essential for virus replication [63] but do affect viral release, stability, and pathogenesis and finally contribute to virulence [64].

4. Conclusion

The RNA genome of SARS-CoV-2 is about 29.9 kb long and encodes 14 open reading frames (ORFs) for 29 proteins, although one may not be expressed. Studying these different components of the virus, as well as how they interact with human cells, has already yielded some clues, but much remains to be explored. The present study reported theoretical modeling and sequence-based and structure-based functional characterization of two accessory proteins of SARS-CoV-2, 9b and ORF14. Phylogenetic analysis of both proteins revealed a close evolutionary relationship between the newly emerged human SARS-CoV-2 and bat SARS-like coronavirus. The whole-genome phylogeny indicates that SARS-CoV-2 may have originated and undergone rapid evolution in bats and that the pangolin is more likely to be an amplifying host. The presence of multiple tunnels in the 9b protein suggests that it is highly reactive. The theoretical structures and statistical verification reports were deposited in ModelArchive. The theoretical structures may be useful for advanced computational analysis of the interactions of each protein, for detailed functional analysis, for understanding viral pathogenesis and virulence, for structure-based drug design, or for studying potential vaccines to help prevent epidemics and pandemics in the absence of a complete experimental structure.

Acknowledgments

The authors are grateful to DBT-Govt. of India for supporting Bioinformatics Laboratory (under the DBT-Star College scheme) at the Post Graduate Department of Zoology, Darrang College, Tezpur, Assam. The authors are thankful to the Principal, Darrang College (Gauhati University), Tezpur (Assam), India, and Head of the Post Graduate Department of Zoology, Darrang College, and the University of Science and Technology, Meghalaya, India, for supporting the research laboratory facility.

Data Availability

(1) The resultant protein structures are deposited in ModelArchive (https://www.modelarchive.org/). The same data has been provided in a supplementary file (folder name: Data availability). (2) The supplementary file for the data generated in the project has been deposited to ChemRxiv. Preprint. doi:10.26434/chemrxiv.12424958.v1 (supplementary file). (3) All the above data are also included along with this manuscript.

Additional Points

Commercial Declarations. It is made available under a CC-BY-NC-ND 4.0 International license.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Authors' Contributions

PD and DKS were involved in planning, designing, and realizing the present work. Data acquisition and analysis were done by CB. After in-depth discussion of the results with DKS, both CB and PD prepared the manuscript.

Supplementary Materials

Table S1: computed cavities in the 3D structure of ORF9b protein for active sites. Table S2: computed cavities in the 3D structure of ORF14 protein for active sites. Figure S1: secondary structure profile of 9b protein. Figure S2: secondary structure profile of ORF14 protein. Figure S3: QMEANDisCo local quality estimate for 9b protein. Figure S4: QMEANDisCo local quality estimate for ORF14 protein. Figure S5: protein 9b structure verification in ERRAT. Figure S6: protein ORF14 structure verification in ERRAT. Figure S7: profile of tunnel 1 in 9b protein. Figure S8: profile of tunnel 2 in 9b protein. Figure S9: tunnel-profile of ORF14 protein. Figure S10: hydropathicity plot for 9b protein. Figure S11: hydrophobicity plot for ORF14 protein. Annexure 1: protein 9b structure verification. Annexure 2: ORF14 protein structure verification.

References

  • 1. Yang H., Bartlam M., Rao Z. Drug design targeting the main protease, the Achilles’ heel of coronaviruses. Current Pharmaceutical Design. 2006;12(35):4573–4590. doi: 10.2174/138161206779010369.
  • 2. Baruah C., Devi P., Sharma D. K. In silico proteome analysis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). bioRxiv. 2020.
  • 3. Lu G., Wang Q., Gao G. F. Bat-to-human: spike features determining 'host jump' of coronaviruses SARS-CoV, MERS-CoV, and beyond. Trends in Microbiology. 2015;23(8):468–478. doi: 10.1016/j.tim.2015.06.003.
  • 4. Lu G., Liu D. SARS-like virus in the Middle East: a truly bat-related coronavirus causing human diseases. Protein & Cell. 2012;3(11):803–805. doi: 10.1007/s13238-012-2811-1.
  • 5. Paules C. I., Marston H. D., Fauci A. S. Coronavirus infections - more than just the common cold. JAMA. 2020;323(8):707–708. doi: 10.1001/jama.2020.0757.
  • 6. Woo P. C., Lau S. K., Lam C. S., et al. Discovery of seven novel mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus. Journal of Virology. 2012;86(7):3995–4008. doi: 10.1128/JVI.06540-11.
  • 7. Chang C. K., Jeyachandran S., Hu N. J., et al. Structure-based virtual screening and experimental validation of the discovery of inhibitors targeted towards the human coronavirus nucleocapsid protein. Molecular BioSystems. 2016;12(1):59–66. doi: 10.1039/C5MB00582E.
  • 8. Cui J., Han N., Streicker D., et al. Evolutionary relationships between bat coronaviruses and their hosts. Emerging Infectious Diseases. 2007;13(10):1526–1532. doi: 10.3201/eid1310.070448.
  • 9. Huang C., Wang Y., Li X., et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395(10223):497–506. doi: 10.1016/S0140-6736(20)30183-5.
  • 10. Zhou P., Yang X.-L., Wang X.-G., et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579(7798):270–273. doi: 10.1038/s41586-020-2012-7.
  • 11. Sanche S., Lin Y. T., Xu C., Romero-Severson E., Hengartner N., Ke R. High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. Emerging Infectious Diseases. 2020;26(7):1470–1477. doi: 10.3201/eid2607.200282.
  • 12. Rothan H. A., Byrareddy S. N. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. Journal of Autoimmunity. 2020;109:p. 102433. doi: 10.1016/j.jaut.2020.102433.
  • 13. Li X., Zai J., Zhao Q., et al. Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2. Journal of Medical Virology. 2020;92(6):602–611. doi: 10.1002/jmv.25731.
  • 14. Marra M. A., Jones S. J., Astell C. R., et al. The genome sequence of the SARS-associated coronavirus. Science. 2003;300(5624):1399–1404. doi: 10.1126/science.1085953.
  • 15. von Brunn A., Teepe C., Simpson J. C., et al. Analysis of intraviral protein-protein interactions of the SARS coronavirus ORFeome. PLoS One. 2007;2(5):p. e459. doi: 10.1371/journal.pone.0000459.
  • 16. Andersen K. G., Rambaut A., Lipkin W. I., Holmes E. C., Garry R. F. The proximal origin of SARS-CoV-2. Nature Medicine. 2020;26(4):450–452. doi: 10.1038/s41591-020-0820-9.
  • 17. Altschul S. F., Madden T. L., Schaffer A. A., et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389.
  • 18. Pearson W. R. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics. 1991;11(3):635–650. doi: 10.1016/0888-7543(91)90071-L.
  • 19. Gasteiger E., Hoogland C., Gattiker A., et al. The Proteomics Protocols Handbook. Humana Press; 2005. Protein identification and analysis tools on the ExPASy server; pp. 571–607.
  • 20. Hall T. A. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series. 1999;41:95–98.
  • 21. Webb B., Sali A. Comparative protein structure modeling using Modeller. Current Protocols in Bioinformatics. 2016;15:5–6. doi: 10.1002/cpbi.3.
  • 22. Fiser A., Do R. K., Sali A. Modeling of loops in protein structures. Protein Science. 2000;9:1753–1773. doi: 10.1110/ps.9.9.1753.
  • 23. Eswar N., Webb B., Marti-Renom M. A., et al. Coligan J. E., Dunn B. M., Speicher D. W., Wingfield P. T., editors. Comparative protein structure modeling using MODELLER. Current Protocols in Protein Science. 2006.
  • 24. Haas J., Gumienny R., Barbato A., et al. Introducing "best single template" models as reference baseline for the continuous automated model evaluation (CAMEO). Proteins. 2019;87(12):1378–1387. doi: 10.1002/prot.25815.
  • 25. Pettersen E., Lamotte-Brasseur J., Chessa J. P., et al. UCSF Chimera: a visualization system for exploratory research and analysis. Journal of Computational Chemistry. 2004;25(13):1605–1612. doi: 10.1002/jcc.20084.
  • 26. DeLano W. L. PyMOL: an open-source molecular graphics tool. CCP4 Newsletter on Protein Crystallography. 2002;40:82–92.
  • 27. Finn R. D., Clements J., Eddy S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Research. 2011;39(suppl):W29–W37. doi: 10.1093/nar/gkr367.
  • 28. Laskowski R. A., Watson J. D., Thornton J. M. ProFunc: a server for predicting protein function from 3D structure. Nucleic Acids Research. 2005;33(Web Server):W89–W93. doi: 10.1093/nar/gki414.
  • 29. Sehnal D., Svobodová Vařeková R., Berka K., et al. MOLE 2.0: advanced approach for analysis of biomacromolecular channels. Journal of Cheminformatics. 2013;5(1):p. 39. doi: 10.1186/1758-2946-5-39.
  • 30. Stourac J., Vavra O., Kokkonen P., et al. Caver web 1.0: identification of tunnels and channels in proteins and analysis of ligand transport. Nucleic Acids Research. 2019;47(W1):W414–W422. doi: 10.1093/nar/gkz378.
  • 31. Singh T., Biswas D., Jayaram B. AADS - an automated active site identification, docking, and scoring protocol for protein targets based on physicochemical descriptors. Journal of Chemical Information and Modeling. 2011;51(10):2515–2527. doi: 10.1021/ci200193z.
  • 32. Thompson J. D., Higgins D. G., Gibson T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research. 1994;22(22):4673–4680. doi: 10.1093/nar/22.22.4673.
  • 33. Kumar S., Stecher G., Li M., Knyaz C., Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution. 2018;35(6):1547–1549. doi: 10.1093/molbev/msy096.
  • 34. Jones D. T., Taylor W. R., Thornton J. M. The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences. 1992;8(3):275–282. doi: 10.1093/bioinformatics/8.3.275.
  • 35. Felsenstein J. Inferring phylogenies. Sunderland, Massachusetts: Sinauer Associates; 2003.
  • 36. Meier C., Aricescu A. R., Assenberg R., et al. The crystal structure of ORF-9b, a lipid binding protein from the SARS coronavirus. Structure. 2006;14(7):1157–1165. doi: 10.1016/j.str.2006.05.012.
  • 37. Kong J., Shi Y., Wang Z., Pan Y. Interactions among SARS-CoV accessory proteins revealed by bimolecular fluorescence complementation assay. Acta Pharmaceutica Sinica B. 2015;5(5):487–492. doi: 10.1016/j.apsb.2015.05.002.
  • 38. Xu K., Zheng B. J., Zeng R., et al. Severe acute respiratory syndrome coronavirus accessory protein 9b is a virion-associated protein. Virology. 2009;388(2):279–285. doi: 10.1016/j.virol.2009.03.032.
  • 39. Moshynskyy I., Viswanathan S., Vasilenko N., et al. Intracellular localization of the SARS coronavirus protein 9b: evidence of active export from the nucleus. Virus Research. 2007;127(1):116–121. doi: 10.1016/j.virusres.2007.03.011.
  • 40. Shi C. S., Qi H. Y., Boularan C., et al. SARS-coronavirus open reading frame-9b suppresses innate immunity by targeting mitochondria and the MAVS/TRAF3/TRAF6 signalosome. Journal of Immunology. 2014;193(6):3080–3089. doi: 10.4049/jimmunol.1303196.
  • 41. Zhang L., Qin Y., Chen M. Viral strategies for triggering and manipulating mitophagy. Autophagy. 2018;14(10):1665–1673. doi: 10.1080/15548627.2018.1466014.
  • 42. Qiu M., Shi Y., Guo Z., et al. Antibody responses to individual proteins of SARS coronavirus and their neutralization activities. Microbes and Infection. 2005;7(5-6):882–889. doi: 10.1016/j.micinf.2005.02.006.
  • 43. Sharma K., Akerstrom S., Sharma A. K., et al. SARS-CoV 9b protein diffuses into nucleus, undergoes active Crm 1 mediated nucleocytoplasmic export and triggers apoptosis when retained in the nucleus. PLoS One. 2011;6, article e19436. doi: 10.1371/journal.pone.0019436.
  • 44. Liu D. X., Fung T. S., Chong K. K., Shukla A., Hilgenfeld R. Accessory proteins of SARS-CoV and other coronaviruses. Antiviral Research. 2014;109:97–109. doi: 10.1016/j.antiviral.2014.06.013.
  • 45. Shukla A., Hilgenfeld R. Acquisition of new protein domains by coronaviruses: analysis of overlapping genes coding for proteins N and 9b in SARS coronavirus. Virus Genes. 2015;50(1):29–38. doi: 10.1007/s11262-014-1139-8.
  • 46. Pinna L. A. Casein kinase 2: an 'eminence grise' in cellular regulation? Biochimica et Biophysica Acta. 1990;1054(3):267–284. doi: 10.1016/0167-4889(90)90098-x.
  • 47. Towler D. A., Gordon J. I., Adams S. P., Glaser L. The biology and enzymology of eukaryotic protein acylation. Annual Review of Biochemistry. 1988;57(1):69–97. doi: 10.1146/annurev.bi.57.070188.000441.
  • 48. Grand R. J. Acylation of viral and eukaryotic proteins. The Biochemical Journal. 1989;258(3):625–638. doi: 10.1042/bj2580625.
  • 49. Kishimoto A., Nishiyama K., Nakanishi H., et al. Studies on the phosphorylation of myelin basic protein by protein kinase C and adenosine 3’: 5’-monophosphate-dependent protein kinase. The Journal of Biological Chemistry. 1985;260(23):12492–12499.
  • 50. Brüssow H. The Novel Coronavirus - A Snapshot of Current Knowledge. Microbial Biotechnology. 2020;13(3):607–612. doi: 10.1111/1751-7915.13557.
  • 51. Agnihothram S., Yount B. L., Jr., Donaldson E. F., et al. A mouse model for Betacoronavirus subgroup 2c using a bat coronavirus strain HKU5 variant. mBio. 2014;5(2). doi: 10.1128/mBio.00047-14.
  • 52. Li W., Wong S. K., Li F., et al. Animal origins of the severe acute respiratory syndrome coronavirus: insight from ACE2-S-protein interactions. Journal of Virology. 2006;80(9):4211–4219. doi: 10.1128/JVI.80.9.4211-4219.2006.
  • 53. Wang L. F., Eaton B. T. Bats, civets and the emergence of SARS. Current Topics in Microbiology and Immunology. 2007;315:325–344. doi: 10.1007/978-3-540-70962-6_13.
  • 54. Sun J., He W. T., Wang L., et al. COVID-19: epidemiology, evolution, and cross-disciplinary perspectives. Trends in Molecular Medicine. 2020;26(5):483–495. doi: 10.1016/j.molmed.2020.02.008.
  • 55. Lam T. T., Jia N., Zhang Y. W., et al. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins. Nature. 2020;583(7815):282–285. doi: 10.1038/s41586-020-2169-0.
  • 56. Lopes L. R., de Mattos C. G., Paiva P. B. Molecular evolution and phylogenetic analysis of SARS-CoV-2 and hosts ACE2 protein suggest Malayan pangolin as intermediary host. Brazilian Journal of Microbiology. 2020:1–7. doi: 10.1007/s42770-020-00321-1.
  • 57. Touati R., Haddad-Boubaker S., Ferchichi I., et al. Comparative genomic signature representations of the emerging COVID-19 coronavirus and other coronaviruses: high identity and possible recombination between bat and pangolin coronaviruses. Genomics. 2020;112(6):4189–4202. doi: 10.1016/j.ygeno.2020.07.003.
  • 58. Brezovsky J., Babkova P., Degtjarik O., et al. Engineering a de novo transport tunnel. ACS Catalysis. 2016;6(11):7597–7610. doi: 10.1021/acscatal.6b02081.
  • 59. Halpin K., Young P. L., Field H. E., Mackenzie J. S. Isolation of Hendra virus from pteropid bats: a natural reservoir of Hendra virus. The Journal of General Virology. 2000;81(8):1927–1932. doi: 10.1099/0022-1317-81-8-1927.
  • 60. Yob J. M., Field H., Rashdi A. M., et al. Nipah virus infection in bats (order Chiroptera) in peninsular Malaysia. Emerging Infectious Diseases. 2001;7(3):439–441. doi: 10.3201/eid0703.017312.
  • 61. Hu B., Ge X., Wang L. F., Shi Z. Bat origin of human coronaviruses. Virology Journal. 2015;12(1):p. 221. doi: 10.1186/s12985-015-0422-1.
  • 62. Vijaykrishna D., Smith G. J., Zhang J. X., Peiris J. S., Chen H., Guan Y. Evolutionary insights into the ecology of coronaviruses. Journal of Virology. 2007;81(8):4012–4020. doi: 10.1128/JVI.02605-06.
  • 63. de Haan C. A., Masters P. S., Shen X., Weiss S., Rottier P. J. The group-specific murine coronavirus genes are not essential, but their deletion, by reverse genetics, is attenuating in the natural host. Virology. 2002;296(1):177–189. doi: 10.1006/viro.2002.1412.
  • 64. Weiss S. R., Navas-Martin S. Coronavirus pathogenesis and the emerging pathogen severe acute respiratory syndrome coronavirus. Microbiology and Molecular Biology Reviews. 2005;69(4):635–664. doi: 10.1128/MMBR.69.4.635-664.2005.
