2021 Jun 22;15:11779322211025876. doi: 10.1177/11779322211025876

Table 1.

Tabulation of amino acid lengths of SARS-CoV and SARS-CoV-2 proteins and percentage identity for common proteins of the two species.

| Gene | Length in SARS-CoV-2 (amino acids) | Length in SARS-CoV (amino acids) | Percentage identity between homologous proteins | Number of amino acid residues with mutation rate greater than 0.01 during the pandemic (N) | N/amino acid length |
| --- | --- | --- | --- | --- | --- |
| Nsp1 | 180 | 180 | 84.44 | 0 | 0.000 |
| Nsp2 | 628 | 628 | 68.34 | 5 | 0.008 |
| Nsp3 | 1922 | 1922 | 75.82 | 10 | 0.005 |
| Nsp4 | 500 | 500 | 80.00 | 1 | 0.002 |
| Nsp5 | 306 | 306 | 96.08 | 5 | 0.016 |
| Nsp6 | 390 | 390 | 88.15 | 6 | 0.015 |
| Nsp7 | 83 | 83 | 98.80 | 1 | 0.012 |
| Nsp8 | 198 | 198 | 97.47 | 0 | 0.000 |
| Nsp9 | 113 | 113 | 97.35 | 1 | 0.009 |
| Nsp10 | 139 | 139 | 97.12 | 0 | 0.000 |
| Nsp11 | 13 | 13 | 84.60 | 0 | 0.000 |
| Nsp12 | 932 | 932 | 96.14 | 7 | 0.008 |
| Nsp13 | 601 | 601 | 99.83 | 5 | 0.008 |
| Nsp14 | 527 | 527 | 95.07 | 2 | 0.004 |
| Nsp15 | 346 | 346 | 88.73 | 4 | 0.012 |
| Nsp16 | 298 | 298 | 93.29 | 1 | 0.003 |
| S | 1273 | 1255 | 75.96 | 21 | 0.016 |
| ORF3a | 275 | 274 | 72.36 | 9 | 0.033 |
| ORF3b | 151 | 154 | No significant similarity | | |
| E | 75 | 76 | 94.74 | 0 | 0.000 |
| M | 222 | 221 | 90.54 | 0 | 0.000 |
| ORF6 | 61 | 63 | 68.85 | 0 | 0.000 |
| ORF7a | 121 | 122 | 85.35 | 1 | 0.008 |
| ORF7b | 43 | 44 | 81.40 | 1 | 0.023 |
| ORF8a | 121 (only ORF8) | 39 | 31.71 | 2 | 0.017 |
| ORF8b | 121 (only ORF8) | 84 | 40.48 | | |
| N | 419 | 422 | 90.52 | 16 | 0.038 |
| ORF9a | 97 | 98 | 72.45 | | |
| ORF9b | 73 | 70 | 77.14 | 4 | 0.055 |
| ORF10 | 38 | | | | |

Abbreviation: ORF, open reading frame.

Open reading frames of SARS-CoV-2 proteins were identified from GenBank accession number NC_045512.2, and those of SARS-CoV Tor2 from GenBank accession number AY274119. Pairwise alignment of the proteins was conducted using NCBI BLAST, and the percentage identity was tabulated (ref. 7). In the blastp algorithm, the maximum number of target sequences was set to 100, short queries were automatically adjusted to parameters for short input sequences, the expect threshold was set to 0.05, and the word size was set to 6. The BLOSUM62 matrix was used with gap costs of 11 for existence and 1 for extension. The last two columns give the number of amino acid residues with a mutation rate higher than 0.01 (N) and that number divided by the protein length in amino acids (N/amino acid length).
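The two derived quantities in the table can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `percent_identity` and `mutated_fraction` are hypothetical, the aligned toy sequences are invented for illustration, and identity is assumed to be counted over all alignment columns (gaps included), as BLAST-style reports typically do. The N and length values in the usage lines are taken from the table.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two already-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return round(100.0 * matches / len(aligned_a), 2)


def mutated_fraction(n_sites: int, length: int) -> float:
    """N / amino acid length, rounded to three decimals as in the last column."""
    if length <= 0:
        raise ValueError("length must be positive")
    return round(n_sites / length, 3)


# Toy alignment (invented): 4 identical columns out of 6.
toy_identity = percent_identity("MKT-LV", "MRTALV")

# Values from the table: (gene, SARS-CoV-2 length, N).
rows = [("Nsp2", 628, 5), ("S", 1273, 21), ("N", 419, 16), ("ORF9b", 73, 4)]
fractions = {gene: mutated_fraction(n, length) for gene, length, n in rows}
```

With the table values above, `fractions` reproduces the tabulated last column for these four genes (0.008, 0.016, 0.038, and 0.055).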