Abstract
The current Coronavirus Disease 19 (COVID-19) pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), shows pathology similar to that of MERS-CoV and SARS-CoV, with a current estimated fatality rate of 1.4%. Open reading frame 10 (ORF10) is a unique SARS-CoV-2 accessory protein that contains eleven cytotoxic T lymphocyte (CTL) epitopes, each nine amino acids in length. Twenty-two unique SARS-CoV-2 ORF10 variants have been identified based on missense mutations found in sequence databases. Some of these mutations are predicted to decrease the stability of ORF10. In silico physicochemical and structural comparative analyses were carried out on the SARS-CoV-2 and Pangolin-CoV ORF10 proteins, which share 97.37% amino acid (aa) homology. Despite this high degree of similarity, the two ORF10 proteins differ in their sub-structure (loop/coil region), solubility, and antigenicity, and show a shift from strand to coil at aa position 26 (tyrosine). SARS-CoV-2 ORF10, which is apparently expressed in vivo since reactive T cell clones are found in convalescent patients, should be monitored for changes that could correlate with the pathogenesis of COVID-19.
Keywords: ORF10, SARS-CoV-2, Pangolin-CoV-2020, Mutations, Intrinsic disorder, COVID-19
1. Introduction
The Coronavirus Disease 19 (COVID-19) pandemic has affected the whole world, with more than 131 million people infected and 2.85 million fatalities worldwide as of April 5, 2021 [[1], [2], [3], [4], [5]]. The high fatality rates of SARS-CoV (9.7%) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV; 37%), compared with 1.4% for SARS-CoV-2, make it vital to monitor mutations within proteins such as ORF10 that could influence viral pathogenicity [[6], [7], [8]]. SARS-CoV-2 is a positive-sense, single-stranded RNA virus with four structural, sixteen non-structural, and six accessory proteins [9]. ORF10, the smallest accessory protein of SARS-CoV-2 (38 aa), has been reported to allow the infection to be identified faster than PCR techniques [10]. ORF10, located at the 3′ end of the genome, is hypothesized to be a transposon, although it is distinct from larger transposons [10,11]. ORF10 contains a molecular recognition feature (MoRF) spanning amino acid residues 3 to 7, a protein interaction site that enables the intrinsically disordered protein to adopt different conformations when bound to different proteins [12,13]. High-throughput analysis revealed that ORF10, despite its small size, could interact with many host proteins, including multiple members of the Cullin-ubiquitin-ligase complex, interactions reported to be essential for viral pathogenesis [12,[14], [15], [16], [17]]. Humans may not be able to use memory B and T cells elicited against other microorganisms to target ORF10 and fight SARS-CoV-2 [18]. No sequence homology was found between ORF10 and any other protein in the NCBI protein database. SARS-CoV-2 ORF10 was reported to have a 99.15% nucleotide similarity to Pangolin-CoV-2020 [19,20].
Here we explore the mutations described for SARS-CoV-2 ORF10 variants, which, through their effects on the physicochemical and immunological properties of the protein, may have an impact on pathogenesis. The SARS-CoV-2 and Pangolin-CoV ORF10 proteins are also compared.
2. Data and methods
2.1. Data acquisition
A total of 11,288 complete SARS-CoV-2 genomes were retrieved from the National Center for Biotechnology Information (NCBI) database, yielding 34 unique ORF10 accessory protein sequences. Of these, 22 sequences each carry a single missense mutation, whereas the remaining sequences contain ambiguous mutations. In addition, a nonsense mutation at position 29 resulted in a truncated ORF10 sequence (QJR96431.1) (Table 1).
Table 1.
Accession | Geo_location | Collection_date |
---|---|---|
YP_009725255 | China | 2019-12 |
QLJ57416 | USA: WA | 2020 |
QIS29991 | China: Hubei, Wuhan | 2020-01-10 |
QJR96431 | USA: CA | 2020-03-13 |
QKU54102 | USA: Washington, King County | 2020-03-15 |
QLA48060 | USA: NY | 2020-03-24 |
QNG41574 | USA: Minnesota | 2020-03-25 |
QKV08176 | USA: Washington, King County | 2020-03-26 |
QKV37245 | Australia: Northern Territory | 2020-03-27 |
QNI23218 | USA: Virginia | 2020-04 |
QLG99793 | USA: CA | 2020-04-16 |
QLY88596 | USA: GA | 2020-04-27 |
QNC04532 | USA | 2020-04-29 |
QNI25281 | USA: Virginia | 2020-05 |
QLI33453 | USA | 2020-05-12 |
QNC49349 | Pakistan | 2020-05-15 |
QMT94417 | USA: Washington, Yakima County | 2020-05-27 |
QMT54534 | USA: Washington, Yakima County | 2020-06-17 |
QLG76514 | Australia: Victoria | 2020-06-20 |
QNG42985 | USA: FL | 2020-06-23 |
QMT97141 | USA: FL | 2020-06-30 |
QNB17780 | Bangladesh | 2020-07-07 |
QMU93213 | USA: Wisconsin, Dane county | 2020-07-13 |
QNA70543 | Bangladesh | 2020-07-19 |
The SARS-CoV-2 genome (NC_045512) reference ORF10 (YP_009725255.1) was used to identify mutations [21]. The ORF10 variants of SARS-CoV-2 were compared by sequence-based homology (Fig. 1A) and phylogeny (Fig. 1B).
Each SARS-CoV-2 ORF10 variant carries a single amino acid change. Across the 22 variants, these changes affect 18 distinct positions, ranging from position 2 to position 38.
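The position count can be checked directly from the mutation labels. The following Python sketch assumes the 22 missense labels listed in Table 2; the helper name `parse_mutation` is introduced here for illustration only and is not part of any tool used in this study.

```python
import re
from collections import Counter

# Missense mutations transcribed from Table 2 (wild-type residue, position, variant residue).
MUTATIONS = [
    "G2D", "V6I", "Y14C", "R20I", "R20K", "S23F", "R24C", "R24L",
    "Y26H", "V30A", "V30L", "L37F", "T38I", "L37P", "F35S", "D31Y",
    "A28V", "N22T", "I13M", "P10S", "A8V", "I4L",
]


def parse_mutation(label):
    """Split a label such as 'G2D' into (wild_type, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a missense label: {label!r}")
    wild_type, position, variant = match.groups()
    return wild_type, int(position), variant


# Count how many variants hit each position.
position_counts = Counter(parse_mutation(m)[1] for m in MUTATIONS)
distinct_positions = sorted(position_counts)
shared_positions = sorted(p for p, n in position_counts.items() if n > 1)

print(len(MUTATIONS), "variants at", len(distinct_positions), "distinct positions")
print("range:", distinct_positions[0], "to", distinct_positions[-1])
print("positions with more than one substitution:", shared_positions)
```

Positions reported as shared by this sketch are those at which two different substitutions were observed, which is why 22 variants map to fewer than 22 positions.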
2.2. Methods
2.2.1. Webserver based predictions
The prediction of various properties of ORF10 proteins was determined by several webservers.
- The PROVEAN web server was used to estimate the functional effect of known mutations, and the I-MUTANT webserver was used to predict the effect of these mutations on structural stability [[22], [23], [24]]. The QUARK webserver was used to predict the secondary structure of ORF10 proteins [[25], [26], [27]].
- The ABTMpro webserver predicts from the sequence whether a protein is a transmembrane protein, and further predicts alpha-helices and beta-sheets. In addition, the INNOVAGEN webserver was used for peptide property predictions [28].
- The DIpro webserver predicts whether the protein sequence contains disulfide bonds, based on two-dimensional recursive neural networks, support vector machines, graph matching and regression algorithms [29].
- Protein antigenicity was predicted using the ANTIGENpro webserver, whose two-stage architecture estimates the likelihood of antigenicity from several primary sequence representations and five machine learning algorithms [30]. The DisEMBL server was used to predict intrinsic disorder from a single protein sequence [31].
- Epitopes within a given aa sequence were identified and analyzed for binding affinity across 12 HLA (human leukocyte antigen) subtypes (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*24:02, HLA-A*26:01, HLA-B*07:02, HLA-B*08:01, HLA-B*27:05, HLA-B*39:01, HLA-B*40:01, HLA-B*58:01 and HLA-B*15:01). The immunogenicity score was predicted using the Immune Epitope Database (IEDB) immunogenicity tool [32,33].
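The epitope search starts from overlapping nine-residue windows. As a minimal sketch (it does not reproduce the IEDB binding or immunogenicity predictions), the Python snippet below enumerates the candidate 9-mers of a 38-aa protein and pairs each window covering a mutated position with its mutant form, which is the kind of input a binding-affinity predictor would then score. The reference sequence written in the code is an assumption that should be checked against the NCBI record YP_009725255.1, and all function names are illustrative.

```python
# Assumed reference ORF10 sequence (verify against YP_009725255.1 before use).
REFERENCE_ORF10 = "MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT"
EPITOPE_LENGTH = 9


def nonamers(sequence, k=EPITOPE_LENGTH):
    """Return (start, peptide) for every k-residue window; start is 1-based."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


def apply_mutation(sequence, position, variant):
    """Return the sequence with the residue at a 1-based position replaced."""
    return sequence[:position - 1] + variant + sequence[position:]


def affected_windows(sequence, position, variant, k=EPITOPE_LENGTH):
    """Pair each reference window overlapping the mutated position with its mutant form."""
    mutant = apply_mutation(sequence, position, variant)
    return [
        (start, ref_pep, mutant[start - 1:start - 1 + k])
        for start, ref_pep in nonamers(sequence, k)
        if start <= position < start + k
    ]


windows = nonamers(REFERENCE_ORF10)
print(len(windows), "candidate 9-mers in a", len(REFERENCE_ORF10), "aa protein")

# Example: windows whose sequence changes under the Y26H substitution.
for start, ref_pep, mut_pep in affected_windows(REFERENCE_ORF10, 26, "H"):
    print(start, ref_pep, "->", mut_pep)
```

Because a single substitution can fall inside up to nine overlapping windows, one mutation may raise the predicted affinity of some epitopes while lowering that of others.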
2.2.2. Evaluating the per-residue predisposition of various ORF10 proteins for intrinsic disorder
Per-residue disorder distribution within ORF10 protein sequences was evaluated by PONDR-VSL2 [34], which is an accurate standalone disorder predictor [[35], [36], [37]]. Per-residue disorder scores range from 0 to 1, where 0 indicates a fully ordered residue and 1 indicates a fully disordered residue. Residues with disorder scores between 0.25 and 0.5 were considered highly flexible, and residues with disorder scores between 0.1 and 0.25 were considered moderately flexible. Residues with scores higher than 0.5 were considered disordered.
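The thresholds above translate into a simple per-residue classifier. The Python sketch below is illustrative only: the category labels are this sketch's own shorthand, the handling of scores that fall exactly on a boundary (0.1 and 0.25) and the "ordered" label for scores below 0.1 are assumptions, and the example scores are made up rather than PONDR-VSL2 output.

```python
def classify_disorder(score):
    """Map a per-residue disorder score in [0, 1] to a category.

    Boundary handling is an assumption of this sketch: a score of exactly
    0.25 is placed in the upper flexibility band, exactly 0.1 in the lower one.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("disorder scores are defined on [0, 1]")
    if score > 0.5:
        return "disordered"
    if score >= 0.25:
        return "highly flexible"
    if score >= 0.1:
        return "moderately flexible"
    return "ordered"  # label assumed here for scores below 0.1


# Hypothetical scores for illustration (not taken from the analysis).
example_scores = [0.05, 0.18, 0.25, 0.42, 0.50, 0.73]
for value in example_scores:
    print(f"{value:.2f} -> {classify_disorder(value)}")
```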
3. Results
3.1. SARS-CoV-2 ORF10 mutations
Each ORF10 sequence was aligned using the protein BLAST (blastp) suite of the NCBI and Clustal Omega, and missense mutations were identified (Fig. 2A) [38,39]. Conserved and non-conserved residues in the ORF10 proteins were identified and marked in different colors (Fig. 2B). The MoRF (YINVF) was also predicted for the ORF10 Wuhan sequence using the MoRFchibi server [40].
There are 22 unique mutations in the 22 SARS-CoV-2 ORF10 variants. These missense mutations are distributed over the entire ORF10 sequence, from aa position 2 to 38. Arginine (R), valine (V), and leucine (L) are each substituted by more than one aa at a fixed position (marked magenta in Fig. 2B). The largest region conserved across all 24 ORF10 variants is ‘SLLLC’ at positions 15–19.
Each unique variant (Table 1) of SARS-CoV-2 ORF10 possesses a single missense mutation (Table 2).
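The conserved stretch can be recovered from the list of mutated positions alone. The Python sketch below assumes the reference ORF10 sequence as written in the code (to be verified against YP_009725255.1), takes the 18 missense positions from Table 2, and treats position 29 of the truncated variant as variable; `longest_conserved_run` is an illustrative helper name.

```python
# Assumed reference ORF10 sequence (verify against YP_009725255.1 before use).
REFERENCE_ORF10 = "MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT"

# Positions carrying a missense mutation in Table 2, plus the nonsense
# mutation at position 29 found in the truncated variant.
MISSENSE_POSITIONS = {2, 4, 6, 8, 10, 13, 14, 20, 22, 23, 24, 26, 28, 30, 31, 35, 37, 38}
NONSENSE_POSITION = 29


def longest_conserved_run(sequence, variable_positions):
    """Return (start, end, peptide) of the longest stretch without a variable
    position; start and end are 1-based and inclusive."""
    best_start, best_length = 0, 0
    run_start = None
    # One extra iteration past the end closes a run that reaches the C-terminus.
    for pos in range(1, len(sequence) + 2):
        conserved = pos <= len(sequence) and pos not in variable_positions
        if conserved and run_start is None:
            run_start = pos
        elif not conserved and run_start is not None:
            length = pos - run_start
            if length > best_length:
                best_start, best_length = run_start, length
            run_start = None
    end = best_start + best_length - 1
    return best_start, end, sequence[best_start - 1:best_start - 1 + best_length]


variable = MISSENSE_POSITIONS | {NONSENSE_POSITION}
print(longest_conserved_run(REFERENCE_ORF10, variable))

# The MoRF reported at residues 3-7 can be read off the same sequence.
print(REFERENCE_ORF10[2:7])
```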
Table 2.
Accession ID | Mutation | Type of mutation | PROVEAN score | Effect of mutation on structural stability | RI | Polarity change | Charge change |
---|---|---|---|---|---|---|---|
QNI23218.1 | G2D | Deleterious | −7 | Decrease | 7 | NP to P | Neutral to acidic |
QIS29991.1 | V6I | Neutral | −1 | Decrease | 7 | NP to NP | Neutral to neutral |
QLI33453.1 | Y14C | Deleterious | −9 | Decrease | 2 | P to P | Neutral to neutral |
QNC04532.1 | R20I | Deleterious | −8 | Decrease | 3 | P to NP | Basic (strongly) to neutral |
QLA48060.1 | R20K | Deleterious | −3 | Decrease | 8 | P to P | Basic (strongly) to basic |
QMT97141.1 | S23F | Deleterious | −6 | Increase | 2 | P to NP | Neutral to neutral |
QMU93213.1 | R24C | Deleterious | −8 | Decrease | 7 | P to P | Basic (strongly) to neutral |
QMT54534.1 | R24L | Deleterious | −7 | Decrease | 9 | P to NP | Basic (strongly) to neutral |
QKU54102.1 | Y26H | Deleterious | −5 | Decrease | 8 | P to P | Neutral to basic (weakly) |
QNI25281.1 | V30A | Deleterious | −4 | Decrease | 9 | NP to NP | Neutral to neutral |
QNC49349.1 | V30L | Deleterious | −3 | Decrease | 4 | NP to NP | Neutral to neutral |
QNA70543.1 | L37F | Deleterious | −4 | Decrease | 7 | NP to NP | Neutral to neutral |
QKV37245.1 | T38I | Deleterious | −6 | Decrease | 5 | P to NP | Neutral to neutral |
QKV08176.1 | L37P | Deleterious | −7 | Decrease | 8 | NP to NP | Neutral to neutral |
QNB17780.1 | F35S | Deleterious | −8 | Decrease | 9 | NP to P | Neutral to neutral |
QMT94417.1 | D31Y | Deleterious | −9 | Decrease | 6 | P to P | Acidic to neutral |
QLY88596.1 | A28V | Deleterious | −4 | Decrease | 5 | NP to NP | Neutral to neutral |
QLG76514.1 | N22T | Deleterious | −6 | Decrease | 1 | P to P | Neutral to neutral |
QLG99793.1 | I13M | Deleterious | −3 | Decrease | 8 | NP to NP | Neutral to neutral |
QNG42985.1 | P10S | Deleterious | −8 | Decrease | 8 | NP to P | Neutral to neutral |
QLJ57416.1 | A8V | Deleterious | −4 | Increase | 3 | NP to NP | Neutral to neutral |
QNG41574.1 | I4L | Neutral | −2 | Increase | 1 | NP to NP | Neutral to neutral |
PROVEAN score: If the PROVEAN score is equal to or below a predefined threshold (e.g., −2.5), the protein variant is predicted to have a “deleterious” effect. If the PROVEAN score is above the threshold, the variant is predicted to have a “neutral” effect.
RI: Reliability Index ranges from 0 to 9.
Most of the mutations were predicted to be deleterious and to decrease protein stability, which may contribute to the intricate virulence of SARS-CoV-2 (Table 2).
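The classification rule in the table footnote can be applied to the tabulated scores to tally the predictions. The Python sketch below transcribes the mutation, PROVEAN score, and stability columns of Table 2 and uses the −2.5 threshold quoted in the footnote; it does not call PROVEAN or I-MUTANT, and the function name `provean_call` is illustrative.

```python
from collections import Counter

PROVEAN_THRESHOLD = -2.5  # cutoff quoted in the Table 2 footnote

# (mutation, PROVEAN score, predicted stability change) transcribed from Table 2.
TABLE_2 = [
    ("G2D", -7, "Decrease"), ("V6I", -1, "Decrease"), ("Y14C", -9, "Decrease"),
    ("R20I", -8, "Decrease"), ("R20K", -3, "Decrease"), ("S23F", -6, "Increase"),
    ("R24C", -8, "Decrease"), ("R24L", -7, "Decrease"), ("Y26H", -5, "Decrease"),
    ("V30A", -4, "Decrease"), ("V30L", -3, "Decrease"), ("L37F", -4, "Decrease"),
    ("T38I", -6, "Decrease"), ("L37P", -7, "Decrease"), ("F35S", -8, "Decrease"),
    ("D31Y", -9, "Decrease"), ("A28V", -4, "Decrease"), ("N22T", -6, "Decrease"),
    ("I13M", -3, "Decrease"), ("P10S", -8, "Decrease"), ("A8V", -4, "Increase"),
    ("I4L", -2, "Increase"),
]


def provean_call(score, threshold=PROVEAN_THRESHOLD):
    """Scores at or below the threshold are called deleterious, others neutral."""
    return "Deleterious" if score <= threshold else "Neutral"


effect_counts = Counter(provean_call(score) for _, score, _ in TABLE_2)
stability_counts = Counter(stability for _, _, stability in TABLE_2)
deleterious_and_destabilizing = [
    name for name, score, stability in TABLE_2
    if provean_call(score) == "Deleterious" and stability == "Decrease"
]

print(dict(effect_counts))
print(dict(stability_counts))
print(len(deleterious_and_destabilizing), "of", len(TABLE_2),
      "mutations are both deleterious and destabilizing")
```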
3.2. Sequence homology and mutations of SARS-CoV-2 ORF10
SARS-CoV-2 ORF10 does not show homology with other proteins in the NCBI database, including the Bat-CoV ORF10 [10]. Surprisingly, SARS-CoV-2 ORF10 showed 97.37% homology to Pangolin-CoV ORF10 (QIG55954.1; release date: 2020-05-18; collection date: 2019-03-29; geo-location: China; host: Sunda pangolin (Manis javanica)) (Fig. 3) [20].
The only difference between the two ORF10 sequences is at position 25, which is serine (S) in Pangolin-CoV and asparagine (N) in SARS-CoV-2. According to the PROVEAN score (−3), this substitution is deleterious, and the protein structural stability is predicted to decrease.
Analysis of the per-residue intrinsic disorder predispositions of the ORF10 proteins of SARS-CoV-2, Pangolin-CoV, and SARS-CoV provides further evidence of their differences. While the SARS-CoV-2 and Pangolin-CoV ORF10 proteins have very close disorder profiles, the per-residue disorder propensity of SARS-CoV ORF10 differs significantly, especially within its C-terminal half (Fig. 4A).
Fig. 4B compares the intrinsic disorder predispositions of the 24 unique variants of the ORF10 protein from different SARS-CoV-2 isolates. Intrinsic disorder predispositions can vary significantly, especially within the C-terminal half of the protein. In fact, the majority of substitutions found within the N-terminal region (residues 1–15; i.e., mutations G2D, I4L, V6I, A8V, P10S, I13M, and Y14C) have very little effect on the local intrinsic disorder predisposition of ORF10. On the other hand, ORF10 variants with mutations within the C-terminal region (residues 20–38; i.e., mutations R20I/K, N22T, S23F, R24C/L, Y26H, A28V, V30A/L, D31Y, F35S, L37P/F, and T38I, as well as the shortened QJR96431.1 variant, which is truncated due to a nonsense mutation at position 29) typically show rather substantial variability in their local disorder predispositions. The most significant changes are observed within the “disorder hump” region (residues 20–30), the intensity of which is increased in the QKU54102.1 (Y26H), QNI25281.1 (V30A), and QNB17780.1 (F35S) ORF10 variants, whereas in the variants QMT54534.1 (R24L), QNC04532.1 (R20I), QMU93213.1 (R24C), and QMT97141.1 (S23F) this hump is either eliminated or noticeably flattened. Interestingly, comparison of Fig. 4A and B shows that the variability in disorder predisposition among many variants of the ORF10 protein from various SARS-CoV-2 isolates is noticeably greater than that between the reference ORF10 from SARS-CoV-2 and ORF10 from Pangolin-CoV. On the other hand, none of the SARS-CoV-2 ORF10 variants (with the exception of the truncated QJR96431.1 variant) has a C-terminal half as disordered as that of the ORF10 protein from SARS-CoV.
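The 97.37% figure follows from a single mismatch over 38 aligned residues (37/38). The Python sketch below illustrates the arithmetic under two assumptions: the SARS-CoV-2 reference sequence is as written in the code (to be verified against YP_009725255.1), and the Pangolin-CoV sequence is derived from it by placing serine at position 25, as described in the text, rather than being read from the QIG55954.1 record.

```python
# Assumed reference ORF10 sequence (verify against YP_009725255.1 before use).
SARS_COV_2_ORF10 = "MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT"

# Pangolin-CoV ORF10 reconstructed for illustration: serine (S) at position 25.
PANGOLIN_ORF10 = SARS_COV_2_ORF10[:24] + "S" + SARS_COV_2_ORF10[25:]


def percent_identity(seq_a, seq_b):
    """Percentage of identical residues between two equal-length, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def differences(seq_a, seq_b):
    """List (1-based position, residue in seq_a, residue in seq_b) for each mismatch."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


identity = percent_identity(SARS_COV_2_ORF10, PANGOLIN_ORF10)
print(f"{identity:.2f}% identity")
print(differences(SARS_COV_2_ORF10, PANGOLIN_ORF10))
```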
3.3. Comparison of SARS-CoV-2 ORF10 and Pangolin-CoV ORF10
Given that SARS-CoV-2 and Pangolin-CoV ORF10 have the highest sequence homology, we aimed to identify similarities and differences between them. We therefore performed a multi-dimensional analysis of both ORF10 proteins, covering structural, physicochemical, biophysical, and immunological aspects, to understand the origin of SARS-CoV-2 from the ORF10 perspective.
Neither the SARS-CoV-2 nor the Pangolin-CoV ORF10 sequence was predicted to contain disulfide bonds (Fig. 5A). However, there are several differences. According to the ABTMpro server, and owing to its high content of hydrophobic amino acids, SARS-CoV-2 ORF10 was predicted to be an alpha-helical transmembrane protein (probability 0.489), while the Pangolin-CoV ORF10 sequence was predicted to be a non-transmembrane protein (probability 0.513). The predicted probability of antigenicity of SARS-CoV-2 ORF10 was slightly higher than that of Pangolin-CoV ORF10. Both proteins were predicted to be located in the capsid area of the virus, as both show a positive distance, which is higher for Pangolin-CoV (0.1502) than for SARS-CoV-2 (0.1141).
To gain detailed insight into the ORF10 proteins of SARS-CoV-2 and Pangolin-CoV, we characterized their secondary structures (Fig. 5B) and found that they are almost the same, except for a notable difference at tyrosine (Y) 26 of SARS-CoV-2 ORF10. In SARS-CoV-2 ORF10, 23 residues are buried, and 24 residues show significantly greater solubility than in Pangolin-CoV ORF10.
A subsequent in-depth study of the physicochemical properties of the SARS-CoV-2 and Pangolin-CoV ORF10 proteins revealed highly similar extinction coefficients, isoelectric points, and net charges, based on the structural and basic property analyses (Fig. 6A). However, the molecular weight of SARS-CoV-2 ORF10 was higher than that of Pangolin-CoV ORF10 (4422 g/mol), because serine (S; lower molecular weight) in Pangolin-CoV is replaced by asparagine (N; higher molecular weight) in SARS-CoV-2. The enzyme cleavage sites of the SARS-CoV-2 and Pangolin-CoV ORF10 proteins were also indistinguishable for all proteases (Fig. 6B).
Protein intrinsic disorder analysis revealed the presence of hot loops in both sequences within the same span of amino acids (26–38). However, loops/coils (22–29) were a distinct characteristic of SARS-CoV-2 ORF10, and no such structures were observed for Pangolin-CoV ORF10 (Fig. 7).
We identified 11 cytotoxic T-lymphocyte (CTL) epitopes, each nine amino acids long, in the SARS-CoV-2 ORF10 sequence across all 12 HLA subtypes to demonstrate the immunogenic properties of ORF10 and of the associated epitope mutations (Fig. 8). The scores of the mutated epitopes were compared with those of the original epitopes to predict whether the binding affinity for MHC class I molecules would increase or decrease due to the mutations. All eleven epitopes and their mutated counterparts were analyzed using the IEDB tool to assess their immunogenicity.
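The size of the molecular weight difference can be estimated from the masses of the two exchanged amino acids. The Python sketch below is a back-of-the-envelope check, not the INNOVAGEN calculation: the average molar masses of free asparagine (132.12 g/mol) and serine (105.09 g/mol) are standard values supplied here, and the Pangolin-CoV figure of 4422 g/mol is the rounded value quoted in the text, so the result is approximate.

```python
# Average molar masses of the free amino acids in g/mol (standard values,
# supplied for this sketch). The difference is the same for residues within
# a peptide chain, because the water lost on bond formation cancels out.
ASN_MOLAR_MASS = 132.12
SER_MOLAR_MASS = 105.09

# Rounded molecular weight of Pangolin-CoV ORF10 quoted in the text.
PANGOLIN_ORF10_MW = 4422.0


def substituted_mass(base_mass, removed_residue_mass, added_residue_mass):
    """Mass after exchanging one residue for another in an otherwise identical chain."""
    return base_mass - removed_residue_mass + added_residue_mass


delta = ASN_MOLAR_MASS - SER_MOLAR_MASS
estimate = substituted_mass(PANGOLIN_ORF10_MW, SER_MOLAR_MASS, ASN_MOLAR_MASS)

print(f"S -> N mass difference: {delta:.2f} g/mol")
print(f"estimated SARS-CoV-2 ORF10 molecular weight: about {estimate:.0f} g/mol")
```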
4. Discussions
A detailed study of the ORF10 protein was carried out to evaluate its potential to give rise to variants that could alter viral pathogenicity. Each SARS-CoV-2 ORF10 variant was observed to possess one distinct mutation, and the twenty-two variants together affect 18 different positions. None of these mutations in SARS-CoV-2 ORF10, however, contributes to the determination of SARS-CoV-2 clades. A total of 13 variants possess mutations at amino acid positions 22–38, a region predicted to contain overlapping loops/coils and hot-loop regions of the ORF10 protein. Most mutations were predicted to be deleterious and to decrease protein structural stability; the exceptions with respect to stability were S23F, A8V, and I4L, which were predicted to increase it (Table 2). This suggests that these mutations may play an active role in enhancing the intrinsic propensity for disorder (IPD) and in allowing the protein to undergo more favorable interactions with other proteins. Two other mutations, I4L and V6I, lie in the MoRF region of ORF10 and may also contribute to the IPD.
The mutations at positions 20 and 24 are also significant because of their effect on sensitivity to trypsin. Four ORF10 variants (QNC04532.1, QMT54534.1, QMU93213.1 and QLA48060.1) carry the four mutations found at these two positions. Three of them, harboring the mutations R20I, R24L and R24C, lose a trypsin cleavage site and are therefore predicted to gain trypsin resistance, while the fourth variant (QLA48060.1), with the R20K mutation, remains susceptible to protease degradation.
An amino acid homology of 97.37% was observed between SARS-CoV-2 ORF10 and Pangolin-CoV ORF10. Although most physicochemical and peptide properties are similar, the predicted probability of antigenicity is greater for SARS-CoV-2 ORF10 than for Pangolin-CoV ORF10, and consequently a stronger immune response is predicted for SARS-CoV-2 ORF10. A change from strand (Pangolin-CoV ORF10) to coil (SARS-CoV-2 ORF10) is predicted at position 26 (tyrosine (Y)), indicating a more disordered state of the protein. A sequence with the Y26H mutation was also detected in SARS-CoV-2 ORF10, in which a hydrophobic amino acid is replaced by a hydrophilic one, thus increasing the probability of ionic interactions.
The analysis identified ORF10 mutations predicted to alter binding affinity to the respective HLA alleles and, correspondingly, possibly to change the immunogenicity of SARS-CoV-2 ORF10. Eight ORF10 variants (each containing one of the mutations G2D, I4L, I13M, Y14C, Y26H, F35S, L37F and L37P (Table 2)) accounted for 40% of total mutations and demonstrated decreased affinity for MHC class I; 25% of the variants (carrying mutations R20K, R20I, R24C, R24L and D31Y) were predicted to have increased affinity; and 35% of the variants (carrying mutations V6I, A8V, P10S, S23F, A28V and V30A) contain both high and low binding affinity epitopes. This may indicate that mutations in ORF10 predominantly decrease the affinity of epitopes in order to escape the host immune system, while in the mixed cases the effect of increased affinity is nullified by the presence of mutations contributing to decreased affinity. For mutations showing only epitopes with increased binding affinity, it is hypothesized that the future acquisition of more than one mutation in a single sequence will nullify them as well. In addition, the immunogenicity score prediction revealed that a large number of mutations had a decreased score or no effect, and very few exhibited an increased immunogenicity score, which may be a strategy adopted by SARS-CoV-2 to evade the host immune response. Six mutation-bearing sequences (QLJ57416.1, QMT97141.1, QLY88596.1, QNC49349.1, QMT54534.1, and QLG76514.1) were found to contain epitopes showing both high-affinity binding to MHC class I and high immunogenicity, indicating that these epitopes can mount a significant immune response and might serve as potential targets for vaccine candidates. More critical studies of SARS-CoV-2 ORF10 are necessary to monitor high-frequency mutations that could change viral pathogenesis.
The ORF10 proteins of SARS-CoV-2 and Pangolin-CoV are similar. However, notable differences are predicted between these two ORF10 proteins in terms of loop/coil structure, antigenicity, and solubility, and in the mutational diversification of SARS-CoV-2. These differences in physicochemical, structural, and immunological properties, despite an amino acid homology of 97.37% between the ORF10 proteins of SARS-CoV-2 and Pangolin-CoV, are quite surprising and deserve further study.
A question exists as to the expression of ORF10 both in vivo and in virally infected cell lines. In a small case series of two subjects, a SARS-CoV-2 strain with a truncation mutation of ORF10 was associated with mild disease, and in vitro this strain replicated with the same efficiency as strains with non-truncated ORF10 [41]. It should be noted that both individuals infected with this strain had mild disease, and the VeroE6 cells used for viral culture lack native interferon (IFN) production. Evasion of IFN production as well as IFN signaling appears to be important in the pathogenesis of SARS-CoV-2, and interference with IFN induction or interferon sensitivity by ORF10 cannot be ruled out by these experiments [42]. In this vein, ORF10 expression is found in immune cells of subjects infected with SARS-CoV-2, and expression levels of ORF10 are associated with disease severity [43]. Finally, T cells of acute and convalescent subjects with SARS-CoV-2 infection react to ORF10 in vitro [44]. Taken together, these data suggest that ORF10 is indeed expressed during infection and may be involved in disease severity. Analysis of the mutations described in our study in various in vivo and in vitro models of COVID-19 infection and disease severity should be explored, and appears crucial to our understanding of disease pathogenesis.
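The trypsin argument rests on the enzyme's preference for cleaving after arginine (R) or lysine (K). The Python sketch below applies the commonly used simplified rule (cleavage C-terminal to K or R unless the next residue is proline) to the reference sequence and to the four variants; the reference sequence is an assumption to be verified against YP_009725255.1, the rule ignores secondary specificity effects, and the function names are illustrative.

```python
# Assumed reference ORF10 sequence (verify against YP_009725255.1 before use).
REFERENCE_ORF10 = "MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT"


def apply_mutation(sequence, position, variant):
    """Return the sequence with the residue at a 1-based position replaced."""
    return sequence[:position - 1] + variant + sequence[position:]


def trypsin_sites(sequence):
    """1-based positions of K/R residues after which trypsin is expected to cut.

    Simplified rule: cut after K or R unless the next residue is proline;
    the C-terminal residue is skipped because nothing follows it.
    """
    return [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]


print("reference:", trypsin_sites(REFERENCE_ORF10))
for label, position, variant in [("R20I", 20, "I"), ("R20K", 20, "K"),
                                 ("R24L", 24, "L"), ("R24C", 24, "C")]:
    mutant = apply_mutation(REFERENCE_ORF10, position, variant)
    print(label, trypsin_sites(mutant))
```

Under this rule, replacing arginine with isoleucine, leucine, or cysteine removes the predicted site at that position, whereas replacing it with lysine keeps it.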
CRediT authorship contribution statement
SSH conceived the problem and experiment(s). DA, SG, SSH, and VNU examined the mutations. SSH, PPC, DA, SG and VNU analyzed the results. SSH wrote the primary draft of the article. AAAA and KL performed major editing to bring the manuscript to its final form. All authors critically reviewed, edited, and approved the final manuscript.
Declaration of competing interest
The authors do not have any conflicts of interest to declare.
References
- 1. Kong W.-H., Li Y., Peng M.-W., Kong D.-G., Yang X.-B., Wang L., Liu M.-Q. Sars-cov-2 detection in patients with influenza-like illness. Nat. Microbiol. 2020;5(5):675–678. doi: 10.1038/s41564-020-0713-1.
- 2. Ju B., Zhang Q., Ge J., Wang R., Sun J., Ge X., Yu J., Shan S., Zhou B., Song S. Human neutralizing antibodies elicited by sars-cov-2 infection. Nature. 2020;584(7819):115–119. doi: 10.1038/s41586-020-2380-z.
- 3. Yousefzadegan S., Rezaei N. Case report: death due to covid-19 in three brothers. Am. J. Trop. Med. Hyg. 2020;102(6):1203–1204. doi: 10.4269/ajtmh.20-0240.
- 4. Xu Z., Shi L., Wang Y., Zhang J., Huang L., Zhang C., Liu S., Zhao P., Liu H., Zhu L. Pathological findings of covid-19 associated with acute respiratory distress syndrome. Lancet Respir. Med. 2020;8(4):420–422. doi: 10.1016/S2213-2600(20)30076-X.
- 5. Li G., Fan Y., Lai Y., Han T., Li Z., Zhou P., Pan P., Wang W., Hu D., Liu X. Coronavirus infections and immune responses. J. Med. Virol. 2020;92(4):424–432. doi: 10.1002/jmv.25685.
- 6. Corman V.M., Ithete N.L., Richards L.R., Schoeman M.C., Preiser W., Drosten C., Drexler J.F. Rooting the phylogenetic tree of middle east respiratory syndrome coronavirus by characterization of a conspecific virus from an african bat. J. Virol. 2014;88(19):11297–11303. doi: 10.1128/JVI.01498-14.
- 7. Hui D.S., Azhar E.I., Madani T.A., Ntoumi F., Kock R., Dar O., Ippolito G., Mchugh T.D., Memish Z.A., Drosten C. The continuing 2019-ncov epidemic threat of novel coronaviruses to global health—the latest 2019 novel coronavirus outbreak in Wuhan, China. Int. J. Infect. Dis. 2020;91:264–266. doi: 10.1016/j.ijid.2020.01.009.
- 8. Yang W., Kandula S., Huynh M., Greene S.K., Van Wye G., Li W., Chan H.T., McGibbon E., Yeung A., Olson D. Estimating the infection-fatality risk of sars-cov-2 in New York city during the spring 2020 pandemic wave: a model-based analysis. Lancet Infect. Dis. 2020;21:203–212. doi: 10.1016/S1473-3099(20)30769-6.
- 9. Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. The species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2. Nat. Microbiol. 2020;5(4):536. doi: 10.1038/s41564-020-0695-z.
- 10. Koyama T., Platt D., Parida L. Variant analysis of sars-cov-2 genomes. Bull. World Health Organ. 2020;98(7):495. doi: 10.2471/BLT.20.253591.
- 11. Kiyotani K., Toyoshima Y., Nemoto K., Nakamura Y. Bioinformatic prediction of potential t cell epitopes for sars-cov-2. J. Hum. Genet. 2020;65(7):569–575. doi: 10.1038/s10038-020-0771-5.
- 12. Giri R., Bhardwaj T., Shegane M., Gehi B.R., Kumar P., Gadhave K., Oldfield C.J., Uversky V.N. Understanding covid-19 via comparative analysis of dark proteomes of sars-cov-2, human sars and bat sars-like coronaviruses. Cell. Mol. Life Sci. 2020:1–34. doi: 10.1007/s00018-020-03603-x.
- 13. Uversky V.N. Multitude of binding modes attainable by intrinsically disordered proteins: a portrait gallery of disorder-based complexes. Chem. Soc. Rev. 2011;40(3):1623–1634. doi: 10.1039/c0cs00057d.
- 14. Gordon D.E., Jang G.M., Bouhaddou M., Xu J., Obernier K., White K.M., O’Meara M.J., Rezelj V.V., Guo J.Z., Swaney D.L. A sars-cov-2 protein interaction map reveals targets for drug repurposing. Nature. 2020:1–13. doi: 10.1038/s41586-020-2286-9.
- 15. Díaz J. Sars-cov-2 molecular network structure. Front. Physiol. 2020;11:870. doi: 10.3389/fphys.2020.00870.
- 16. Liang Q., Li J., Guo M., Tian X., Liu C., Wang X., Yang X., Wu P., Xiao Z., Qu Y. Virus-host interactome and proteomic survey of pmbcs from covid-19 patients reveal potential virulence factors influencing sars-cov-2 pathogenesis. Med. 2020;2(1):99–112. doi: 10.1016/j.medj.2020.07.002.
- 17. Cagliani R., Forni D., Clerici M., Sironi M. Coding potential and sequence conservation of sars-cov-2 and related animal viruses. Infect. Genet. Evol. 2020:104353. doi: 10.1016/j.meegid.2020.104353.
- 18. Le Bert N., Tan A.T., Kunasegaran K., Tham C.Y., Hafezi M., Chia A., Chng M.H.Y., Lin M., Tan N., Linster M. Sars-cov-2-specific t cell immunity in cases of covid-19 and sars, and uninfected controls. Nature. 2020;584(7821):457–462. doi: 10.1038/s41586-020-2550-z.
- 19. Kim D., Lee J.-Y., Yang J.-S., Kim J.W., Kim V.N., Chang H. The architecture of sars-cov-2 transcriptome. Cell. 2020;181(4):914–921. doi: 10.1016/j.cell.2020.04.011.
- 20. Liu P., Jiang J.-Z., Wan X.-F., Hua Y., Li L., Zhou J., Wang X., Hou F., Chen J., Zou J. Are pangolins the intermediate host of the 2019 novel coronavirus (sars-cov-2)? PLoS Pathog. 2020;16(5) doi: 10.1371/journal.ppat.1008421.
- 21. Wu X., Cai Y., Huang X., Yu X., Zhao L., Wang F., Li Q., Gu S., Xu T., Li Y. Co-infection with sars-cov-2 and influenza a virus in patient with pneumonia, China. Emerg. Infect. Dis. 2020;26(6):1324. doi: 10.3201/eid2606.200299.
- 22. Choi Y., Sims G.E., Murphy S., Miller J.R., Chan A.P. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7(10):e46688. doi: 10.1371/journal.pone.0046688.
- 23. Choi Y. Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 2012. A fast computation of pairwise sequence alignment scores between a protein and a set of single-locus variants of another protein; pp. 414–417.
- 24. Choi Y., Chan A.P. Provean web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics. 2015;31(16):2745–2747. doi: 10.1093/bioinformatics/btv195.
- 25. Xu D., Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins Struct. Funct. Bioinformatics. 2012;80(7):1715–1735. doi: 10.1002/prot.24065.
- 26. Xu D., Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins Struct. Funct. Bioinformatics. 2013;81(2):229–239. doi: 10.1002/prot.24179.
- 27. Hassan S.S., Choudhury P.P., Basu P., Jana S.S. Molecular conservation and differential mutation on orf3a gene in indian sars-cov2 genomes. Genomics. 2020;112(5):3226–3237. doi: 10.1016/j.ygeno.2020.06.016.
- 28. Cheng J., Randall A.Z., Sweredoski M.J., Baldi P. Scratch: a protein structure and structural feature prediction server. Nucleic Acids Res. 2005;33(suppl_2):W72–W76. doi: 10.1093/nar/gki396.
- 29. Cheng J., Saigo H., Baldi P. Large-scale prediction of disulphide bridges using kernel methods, two-dimensional recursive neural networks, and weighted graph matching. Proteins Struct. Funct. Bioinformatics. 2006;62(3):617–629. doi: 10.1002/prot.20787.
- 30. Magnan C.N., Zeller M., Kayala M.A., Vigil A., Randall A., Felgner P.L., Baldi P. High-throughput prediction of protein antigenicity using protein microarray data. Bioinformatics. 2010;26(23):2936–2943. doi: 10.1093/bioinformatics/btq551.
- 31. Linding R., Jensen L.J., Diella F., Bork P., Gibson T.J., Russell R.B. Protein disorder prediction: implications for structural proteomics. Structure. 2003;11(11):1453–1459. doi: 10.1016/j.str.2003.10.002.
- 32. Zhang H., Lund O., Nielsen M. The pickpocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to mhc-peptide binding. Bioinformatics. 2009;25(10):1293–1299. doi: 10.1093/bioinformatics/btp137.
- 33. Vita R., Mahajan S., Overton J.A., Dhanda S.K., Martini S., Cantrell J.R., Wheeler D.K., Sette A., Peters B. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019;47(D1):D339–D343. doi: 10.1093/nar/gky1006.
- 34. Obradovic Z., Peng K., Vucetic S., Radivojac P., Dunker A.K. Exploiting heterogeneous sequence properties improves prediction of protein disorder. Proteins Struct. Funct. Bioinformatics. 2005;61(S7):176–182. doi: 10.1002/prot.20735.
- 35. Meng F., Uversky V.N., Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell. Mol. Life Sci. 2017;74(17):3069–3090. doi: 10.1007/s00018-017-2555-4.
- 36. Peng Z.-L., Kurgan L. Comprehensive comparative assessment of in-silico predictors of disordered regions. Curr. Protein Pept. Sci. 2012;13(1):6–18. doi: 10.2174/138920312799277938.
- 37. Fan X., Kurgan L. Accurate prediction of disorder in protein chains with a comprehensive and empirically designed consensus. J. Biomol. Struct. Dyn. 2014;32(3):448–464. doi: 10.1080/07391102.2013.775969.
- 38. Johnson M., Zaretskaya I., Raytselis Y., Merezhuk Y., McGinnis S., Madden T.L. Ncbi blast: a better web interface. Nucleic Acids Res. 2008;36(suppl_2):W5–W9. doi: 10.1093/nar/gkn201.
- 39. Madeira F., Park Y.M., Lee J., Buso N., Gur T., Madhusoodanan N., Basutkar P., Tivey A.R., Potter S.C., Finn R.D. The embl-ebi search and sequence analysis tools apis in 2019. Nucleic Acids Res. 2019;47(W1):W636–W641. doi: 10.1093/nar/gkz268.
- 40. Malhis N., Jacobson M., Gsponer J. Morfchibi system: software tools for the identification of morfs in protein sequences. Nucleic Acids Res. 2016;44(W1):W488–W493. doi: 10.1093/nar/gkw409.
- 41. Pancer K., Milewska A., Owczarek K., Dabrowska A., Kowalski M., Łabaj P.P., Branicki W., Sanak M., Pyrc K. The sars-cov-2 orf10 is not essential in vitro or in vivo in humans. PLoS Pathog. 2020;16(12) doi: 10.1371/journal.ppat.1008959.
- 42. Xia H., Cao Z., Xie X., Zhang X., Chen J.Y.-C., Wang H., Menachery V.D., Rajsbaum R., Shi P.-Y. Evasion of type i interferon by sars-cov-2. Cell Rep. 2020;33(1) doi: 10.1016/j.celrep.2020.108234.
- 43. Liu T., Jia P., Fang B., Zhao Z. Differential expression of viral transcripts from single-cell rna sequencing of moderate and severe covid-19 patients and its implications for case severity. Front. Microbiol. 2020;11 doi: 10.3389/fmicb.2020.603509.
- 44. Bacher P., Rosati E., Esser D., Martini G.R., Saggau C., Schiminsky E., Dargvainiene J., Schröder I., Wieters I., Khodamoradi Y. Low-avidity cd4+ t cell responses to sars-cov-2 in unexposed individuals and humans with severe covid-19. Immunity. 2020;53(6):1258–1271. doi: 10.1016/j.immuni.2020.11.016.