Table 5.
(A)

Position | Location | Mutation | Count, 02/29–04/26/2020* | Count, 06/12–07/07/2020* | Count, 07/09–07/22/2020 | Count, 08/01–12/01/2020 | Incidence
---|---|---|---|---|---|---|---
241nt | 5´UTR | CG → TG; noneffective | 76/111 | 74/96 | 99/99 | 116/117 | Prevalent
1,059nt | nsp2 | CC → TC; ACC (Threonine) → ATC (Isoleucine) | 42/112 | 45/97 | 30/99 | 56/117 | Prevalent
1,917nt | nsp2 | CT → TT; ACT (Threonine) → ATT (Isoleucine) | 0/112 | 11/97 | 0/99 | 0/117 | CN
2,416nt | nsp2 | CA → TA; noneffective | 9/112 | 4/97 | 1/99 | 3/117 | CN,ES,FR,RU,ZA
3,037nt | nsp3 | CT → TT; noneffective | 75/112 | 72/97 | 99/99 | 117/117 | Prevalent
3,871nt | nsp3 | GA → TA; AAGATC (Lysine Isoleucine) → AATATC (Asparagine Isoleucine) | 0/112 | 0/97 | 29/99 | 4/117 | FR,ZA
3,931nt | nsp3 | TG → CG; noneffective | 0/112 | 0/97 | 29/99 | 4/117 | Unique
4,226nt | nsp3 | CC → TC; CCA (Proline) → TCA (Serine) | 0/112 | 0/97 | 28/99 | 0/117 | Unique
5,672nt | nsp3 | CC → TC; CCT (Proline) → TCT (Serine) | 0/112 | 0/97 | 28/99 | 0/117 | Unique
7,837nt | nsp3 | AG → CG; TTAGAC (Leucine Aspartic Acid) → TTCGAC (Phenylalanine Aspartic Acid) | 0/112 | 0/97 | 28/99 | 0/117 | CN
8,083nt | nsp3 | GG → AG; ATGGAA (Methionine Glutamic Acid) → ATAGAA (Isoleucine Glutamic Acid) | 0/112 | 0/97 | 0/99 | 18/117 | Unique
8,782nt | nsp4 | CC → TC; noneffective | 15/112 | 15/97 | 0/99 | 0/117 | CN,DE,ES,IN
10,139nt | 3C‐like proteinase | CT → TT; CTT (Leucine) → TTT (Phenylalanine) | 0/112 | 0/97 | 0/99 | 29/117 | Unique
12,025nt | nsp7 | CA → TA; noneffective | 0/112 | 0/97 | 11/99 | 2/117 | Unique
14,408nt | RNA‐dependent RNA polymerase | CT → TT; CCT (Proline) → CTT (Leucine) | 78/112 | 71/97 | 99/99 | 117/117 | Prevalent
17,747nt | Helicase | CT → TT; CCT (Proline) → CTT (Leucine) | 8/112 | 12/97 | 0/99 | 0/117 | FR
17,858nt | Helicase | AT → GT; TAT (Tyrosine) → TGT (Cysteine) | 8/112 | 12/97 | 0/99 | 0/117 | ZA
18,060nt | 3´‐to‐5´ exonuclease | CT → TT; noneffective | 9/112 | 11/97 | 0/99 | 0/117 | ZA
18,424nt | 3´‐to‐5´ exonuclease | AA → GA; AAT (Asparagine) → GAT (Aspartic Acid) | 0/112 | 0/97 | 0/99 | 26/117 | Unique
18,486nt | 3´‐to‐5´ exonuclease | CA → TA; noneffective | 0/112 | 0/97 | 13/99 | 2/117 | Unique
18,877nt | 3´‐to‐5´ exonuclease | CT → TT; noneffective | 13/112 | 1/97 | 6/99 | 3/117 | BR,DE,ES,FR,IN
19,677nt | endoRNAse | GG → TG; CAGGGT (Glutamine Glycine) → CATGGT (Histidine Glycine) | 0/112 | 0/97 | 26/99 | 0/117 | Unique
19,839nt | endoRNAse | TA → CA; noneffective | 0/112 | 0/97 | 11/99 | 7/117 | CN,DE,ES,FR,RU
20,268nt | endoRNAse | AG → GG; noneffective | 2/112 | 5/97 | 15/99 | 29/117 | FR,ES,RU,ZA
21,304nt | 2'‐O‐ribose methyltransferase | CG → TG; CGC (Arginine) → TGC (Cysteine) | 0/112 | 0/97 | 0/99 | 25/117 | ES
22,162nt | Spike glycoprotein | TT → CT; noneffective | 0/112 | 0/97 | 13/99 | 2/117 | Unique
23,403nt | Spike glycoprotein | AT → GT; GAT (Aspartic Acid) → GGT (Glycine) | 77/112 | 72/97 | 99/99 | 117/117 | Prevalent
23,707nt | Spike glycoprotein | CA → TA; noneffective | 0/112 | 0/97 | 11/99 | 3/117 | Unique
25,907nt | ORF3a protein | GT → TT; GGT (Glycine) → GTT (Valine) | 0/112 | 0/97 | 0/99 | 26/117 | Unique
25,563nt | ORF3a protein | GA → TA; CAGAGC (Glutamine Serine) → CATAGC (Histidine Serine) | 65/112 | 54/97 | 37/99 | 66/117 | Prevalent
27,964nt | ORF8 protein | CA → TA; TCA (Serine) → TTA (Leucine) | 13/112 | 6/97 | 4/99 | 31/117 | Unique
28,144nt | ORF8 protein | TA → CA; TTA (Leucine) → TCA (Serine) | 15/112 | 15/97 | 0/99 | 0/117 | CN,DE,ES,IN
28,472nt | Nucleocapsid phosphoprotein | CC → TC; CCT (Proline) → TCT (Serine) | 0/112 | 0/97 | 0/99 | 22/117 | Unique
28,821nt | Nucleocapsid phosphoprotein | CT → AT; TCT (Serine) → TAT (Tyrosine) | 0/112 | 0/97 | 9/99 | 5/117 | Unique
28,854nt | Nucleocapsid phosphoprotein | CA → TA; TCA (Serine) → TTA (Leucine) | 3/112 | 0/97 | 13/99 | 28/117 | CN,DE,ES,FR,IN,RU
28,869nt | Nucleocapsid phosphoprotein | CA → TA; CCA (Proline) → CTA (Leucine) | 0/112 | 0/97 | 0/99 | 25/117 | DE
28,881nt | Nucleocapsid phosphoprotein | GGG → AAC; AGGGGA (Arginine Glycine) → AAACGA (Lysine Arginine) | 3/112 | 1/97 | 17/99 | 17/117 | Prevalent
28,887nt | Nucleocapsid phosphoprotein | CT → TT; ACT (Threonine) → ATT (Isoleucine) | 0/112 | 1/97 | 1/99 | 10/117 | BR,CN,FR,IN,RU
28,977nt | Nucleocapsid phosphoprotein | CT → TT; TCT (Serine) → TTT (Phenylalanine) | 0/112 | 0/97 | 29/99 | 4/117 | CN

(B) 01/19/2020–01/20/2021

Position | Location | Mutation | Total Count | Percentage
---|---|---|---|---
36nt | 5´UTR | C → T | 1,188 | 2.24
241nt | 5´UTR | C → T | 48,826 | 92.24
833nt | nsp2 | T → C | 1,171 | 2.21
1,059nt | nsp2 | C → T | 28,844 | 54.49
3,037nt | nsp3 | C → T | 49,077 | 92.71
8,083nt | nsp3 | G → A | 2,779 | 5.25
8,782nt | nsp4 | C → T | 2,798 | 5.29
10,319nt | 3C‐like proteinase | C → T | 8,465 | 15.99
10,323nt | 3C‐like proteinase | A → G | 1,176 | 2.22
10,741nt | 3C‐like proteinase | C → T | 1,120 | 2.12
11,083nt | nsp6 | G → T | 1,612 | 3.05
11,916nt | nsp7 | C → T | 1,670 | 3.15
14,408nt | RNA‐dependent RNA polymerase | C → T | 49,140 | 92.83
14,805nt | RNA‐dependent RNA polymerase | C → T | 3,176 | 6.00
16,260nt | Helicase | C → T | 1,797 | 3.39
17,747nt | Helicase | C → T | 2,049 | 3.87
17,858nt | Helicase | A → G | 2,084 | 3.94
18,060nt | 3'‐to‐5' exonuclease | C → T | 2,135 | 4.03
18,424nt | 3'‐to‐5' exonuclease | A → G | 6,708 | 12.67
18,877nt | 3'‐to‐5' exonuclease | C → T | 1,517 | 2.87
19,839nt | endoRNAse | T → C | 1,955 | 3.69
20,268nt | endoRNAse | A → G | 6,742 | 12.74
21,304nt | 2'‐O‐ribose methyltransferase | C → T | 6,603 | 12.47
23,403nt | Spike glycoprotein | A → G | 49,154 | 92.86
23,604nt | Spike glycoprotein | C → A | 1,238 | 2.34
24,076nt | Spike glycoprotein | T → C | 2,148 | 4.06
25,563nt | ORF3a | G → T | 31,241 | 59.02
25,907nt | ORF3a | G → T | 6,369 | 12.03
27,964nt | ORF8 | C → T | 12,002 | 22.67
28,144nt | ORF8 | T → C | 2,790 | 5.27
28,472nt | Nucleocapsid phosphoprotein | C → T | 6,473 | 12.23
28,821nt | Nucleocapsid phosphoprotein | C → A | 1,821 | 3.44
28,842nt | Nucleocapsid phosphoprotein | G → T | 1,152 | 2.18
28,854nt | Nucleocapsid phosphoprotein | C → T | 6,694 | 12.65
28,869nt | Nucleocapsid phosphoprotein | C → T | 6,640 | 12.54
28,881nt | Nucleocapsid phosphoprotein | G → A | 6,887 | 13.01
28,882nt | Nucleocapsid phosphoprotein | G → A | 6,848 | 12.94
28,883nt | Nucleocapsid phosphoprotein | G → C | 6,847 | 12.93
28,887nt | Nucleocapsid phosphoprotein | C → T | 1,090 | 2.06
29,402nt | Nucleocapsid phosphoprotein | G → T | 1,630 | 3.08
29,784nt | 3´UTR | C → T | 1,062 | 2.01
29,870nt | 3´UTR | C → A | 1,990 | 3.76

The general design of this Table is similar to that of Tables 3, 4 and 7–12, with minor modifications. Part A: The entire SARS‐CoV‐2 RNA sequences of 112 (US‐I), 97 (US‐II), 99 (US‐III), and 117 (US‐IV) randomly chosen isolates were analyzed, and the nucleotides (nt) mutated relative to the original Wuhan sequence were tabulated. The actual time periods of mutant selection for the US‐I to US‐IV samples were indicated. Please note that in some of the Tables, as in Table 5A, mutations were analyzed at different time intervals. From earlier to later, these time intervals were designated in the text as US‐I, US‐II, etc. The same nomenclature was followed in the other Tables wherever more than one time interval was studied. Mutations previously designated as “signal hotspots” (positions 241, 1,059, 1,440, 2,891, 3,037, 8,782, 14,408, 23,403, 25,563, 28,144, and 28,881; Weber et al, 2020) were now designated “prevalent.” The * in the US‐I and US‐II columns designates previous publication (Weber et al, 2020). The actual nucleotide changes were indicated in the third column, the most frequent being C → T (here 61.5%), as reported previously (Simmonds, 2020; Weber et al, 2020). Locations of mutations on the viral genome and the amino acid exchanges resulting from individual mutations were tabulated in columns 2 and 3, respectively. Columns 4 to 7 list the actual frequencies of mutations at the four time intervals (US‐I to US‐IV). The following designations for individual countries were chosen: BR for Brazil, CN for China, DE for Germany, FR for France, IN for India, RU for Russia, ES for Spain, ZA for South Africa, UK for United Kingdom, and US for United States.
The GGG → AAC exchange is a non‐point mutation beginning at nucleotide position 28,881 that generated a highly basic amino acid sequence in the SARS‐CoV‐2 nucleocapsid phosphoprotein. We have speculated that this mutation might have originated from a recombination event between different viral RNA molecules (Weber et al, 2020).
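As a reading aid for Part A, the following minimal Python sketch converts the "mutated/total" count cells into percentages and reduces the dinucleotide notation of column 3 (e.g. CG → TG) to the underlying single‐nucleotide exchange. The helper names `parse_count`, `percent`, and `point_change` are hypothetical and introduced here only for illustration; the three example rows are copied from Part A, and the sketch is not the procedure used to compile the Table.

```python
from fractions import Fraction


def parse_count(cell: str) -> Fraction:
    """Turn a 'mutated/total' cell such as '76/111' into an exact fraction."""
    hits, total = cell.split("/")
    return Fraction(int(hits), int(total))


def percent(cell: str, ndigits: int = 1) -> float:
    """Express a count cell as a percentage, rounded to ndigits decimals."""
    return round(float(parse_count(cell)) * 100, ndigits)


def point_change(mutation: str) -> str:
    """Reduce 'CG → TG' to 'C → T'.

    Entries that differ at more than one position (e.g. 'GGG → AAC')
    are not point mutations and are returned unchanged.
    """
    ref, alt = (part.strip() for part in mutation.split("→"))
    diffs = [(r, a) for r, a in zip(ref, alt) if r != a]
    if len(ref) != len(alt) or len(diffs) != 1:
        return f"{ref} → {alt}"
    r, a = diffs[0]
    return f"{r} → {a}"


# Three rows copied from Part A: position -> (mutation, counts for US-I..US-IV)
rows = {
    "241nt": ("CG → TG", ["76/111", "74/96", "99/99", "116/117"]),
    "8,782nt": ("CC → TC", ["15/112", "15/97", "0/99", "0/117"]),
    "28,881nt": ("GGG → AAC", ["3/112", "1/97", "17/99", "17/117"]),
}

for position, (mutation, counts) in rows.items():
    print(position, point_change(mutation), [percent(c) for c in counts])
```

Expressing the counts as percentages makes the four intervals directly comparable, since the number of isolates analyzed differs between US‐I and US‐IV.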
Part B: A total of 5,710 SARS‐CoV‐2 RNA sequences from the GISAID database were analyzed. Deviations from the Wuhan reference sequence with an incidence of >2% were found at 42 sites in the sequence. Further details are described in the text.
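The >2% criterion of Part B can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that all sequences are already aligned to the reference and have the same length, and it ignores insertions, deletions, and ambiguous bases. The function name `frequent_deviations` and the six‐nucleotide toy sequences are hypothetical and are not taken from the GISAID data or from the analysis pipeline behind the Table.

```python
from collections import Counter


def frequent_deviations(reference, sequences, threshold=0.02):
    """Return {(position, ref_nt, alt_nt): incidence} for deviations
    whose incidence among the sequences exceeds the threshold.

    Positions are 1-based. Assumes pre-aligned, equal-length sequences.
    """
    counts = Counter()
    for seq in sequences:
        for pos, (ref_nt, nt) in enumerate(zip(reference, seq), start=1):
            # Count only unambiguous substitutions.
            if nt != ref_nt and nt in "ACGT":
                counts[(pos, ref_nt, nt)] += 1
    total = len(sequences)
    return {
        site: n / total
        for site, n in sorted(counts.items())
        if n / total > threshold
    }


# Toy data (hypothetical): 100 sequences of length 6.
reference = "ACGTAC"
sequences = ["ACGTAC"] * 96 + ["ATGTAC"] * 3 + ["ACGTAA"] * 1

for (pos, ref_nt, alt_nt), incidence in frequent_deviations(reference, sequences).items():
    print(f"{pos}nt {ref_nt} → {alt_nt}: {incidence:.2%}")
```

In this toy set the C → T exchange at position 2 occurs in 3 of 100 sequences and passes the threshold, whereas the C → A exchange at position 6 occurs only once and is dropped.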