2021 May 31;13(6):e14062. doi: 10.15252/emmm.202114062

Table 5.

United States.

(A) 02/29–04/26/2020* 06/12–07/07/2020* 07/09–07/22/2020 08/01–12/01/2020
Position Location Mutation Count Count Count Count Incidence
241nt 5´UTR CG → TG 76/111 74/96 99/99 116/117 Prevalent
noneffective
1,059nt nsp2 CC → TC 42/112 45/97 30/99 56/117 Prevalent
ACC (Threonine) → ATC (Isoleucine)
1,917nt CT → TT 0/112 11/97 0/99 0/117 CN
ACT (Threonine) → ATT (Isoleucine)
2,416nt CA → TA 9/112 4/97 1/99 3/117 CN,ES,FR,RU,ZA
noneffective
3,037nt nsp3 CT → TT 75/112 72/97 99/99 117/117 Prevalent
noneffective
3,871nt GA → TA 0/112 0/97 29/99 4/117 FR,ZA
AAGATC (Lysine Isoleucine) → AATATC (Asparagine Isoleucine)
3,931nt TG → CG 0/112 0/97 29/99 4/117 Unique
noneffective
4,226nt CC → TC 0/112 0/97 28/99 0/117 Unique
CCA (Proline) → TCA (Serine)
5,672nt CC → TC 0/112 0/97 28/99 0/117 Unique
CCT (Proline) → TCT (Serine)
7,837nt AG → CG 0/112 0/97 28/99 0/117 CN
TTAGAC (Leucine Aspartic Acid) → TTCGAC (Phenylalanine Aspartic Acid)
8,083nt GG → AG 0/112 0/97 0/99 18/117 Unique
ATGGAA (Methionine Glutamic Acid) → ATAGAA (Isoleucine Glutamic Acid)
8,782nt nsp4 CC → TC 15/112 15/97 0/99 0/117 CN,DE,ES,IN
noneffective
10,139nt 3C‐like proteinase CT → TT 0/112 0/97 0/99 29/117 Unique
CTT (Leucine) → TTT (Phenylalanine)
12,025nt nsp7 CA → TA 0/112 0/97 11/99 2/117 Unique
noneffective
14,408nt RNA‐dependent RNA polymerase CT → TT 78/112 71/97 99/99 117/117 Prevalent
CCT (Proline) → CTT (Leucine)
17,747nt Helicase CT → TT 8/112 12/97 0/99 0/117 FR
CCT (Proline) → CTT (Leucine)
17,858nt AT → GT 8/112 12/97 0/99 0/117 ZA
TAT (Tyrosine) → TGT (Cysteine)
18,060nt 3'‐to‐5' exonuclease CT → TT 9/112 11/97 0/99 0/117 ZA
noneffective
18,424nt AA → GA 0/112 0/97 0/99 26/117 Unique
AAT (Asparagine) → GAT (Aspartic Acid)
18,486nt CA → TA 0/112 0/97 13/99 2/117 Unique
noneffective
18,877nt CT → TT 13/112 1/97 6/99 3/117 BR,DE,ES,FR,IN
noneffective
19,677nt endoRNAse GG → TG 0/112 0/97 26/99 0/117 Unique
CAGGGT (Glutamine Glycine) → CATGGT (Histidine Glycine)
19,839nt TA → CA 0/112 0/97 11/99 7/117 CN,DE,ES,FR,RU
noneffective
20,268nt AG → GG 2/112 5/97 15/99 29/117 FR,ES,RU,ZA
noneffective
21,304nt 2'‐O‐ribose methyltransferase CG → TG 0/112 0/97 0/99 25/117 ES
CGC (Arginine) → TGC (Cysteine)
22,162nt Spike glycoprotein TT → CT 0/112 0/97 13/99 2/117 Unique
noneffective
23,403nt AT → GT 77/112 72/97 99/99 117/117 Prevalent
GAT (Aspartic Acid) → GGT (Glycine)
23,707nt CA → TA 0/112 0/97 11/99 3/117 Unique
noneffective
25,907nt ORF3a protein GT → TT 0/112 0/97 0/99 26/117 Unique
GGT (Glycine) → GTT (Valine)
25,563nt GA → TA 65/112 54/97 37/99 66/117 Prevalent
CAGAGC (Glutamine Serine) → CATAGC (Histidine Serine)
27,964nt ORF8 protein CA → TA 13/112 6/97 4/99 31/117 Unique
TCA (Serine) → TTA (Leucine)
28,144nt TA → CA 15/112 15/97 0/99 0/117 CN,DE,ES,IN
TTA (Leucine) → TCA (Serine)
28,472nt Nucleocapsid phosphoprotein CC → TC 0/112 0/97 0/99 22/117 Unique
CCT (Proline) → TCT (Serine)
28,821nt CT → AT 0/112 0/97 9/99 5/117 Unique
TCT (Serine) → TAT (Tyrosine)
28,854nt CA → TA 3/112 0/97 13/99 28/117 CN,DE,ES,FR,IN,RU
TCA (Serine) → TTA (Leucine)
28,869nt CA → TA 0/112 0/97 0/99 25/117 DE
CCA (Proline) → CTA (Leucine)
28,881nt GGG → AAC 3/112 1/97 17/99 17/117 Prevalent
AGGGGA (Arginine Glycine) → AAACGA (Lysine Arginine)
28,887nt CT → TT 0/112 1/97 1/99 10/117 BR,CN,FR,IN,RU
ACT (Threonine) → ATT (Isoleucine)
28,977nt CT → TT 0/112 0/97 29/99 4/117 CN
TCT (Serine) → TTT (Phenylalanine)
(B) 01/19/2020–01/20/2021
Position Location Mutation Total Count Percentage
36nt 5´UTR C → T 1,188 2.24
241nt C → T 48,826 92.24
833nt nsp2 T → C 1,171 2.21
1,059nt C → T 28,844 54.49
3,037nt nsp3 C → T 49,077 92.71
8,083nt G → A 2,779 5.25
8,782nt nsp4 C → T 2,798 5.29
10,319nt 3C‐like proteinase C → T 8,465 15.99
10,323nt A → G 1,176 2.22
10,741nt C → T 1,120 2.12
11,083nt nsp6 G → T 1,612 3.05
11,916nt nsp7 C → T 1,670 3.15
14,408nt RNA‐dependent RNA polymerase C → T 49,140 92.83
14,805nt C → T 3,176 6
16,260nt Helicase C → T 1,797 3.39
17,747nt C → T 2,049 3.87
17,858nt A → G 2,084 3.94
18,060nt 3'‐to‐5' exonuclease C → T 2,135 4.03
18,424nt A → G 6,708 12.67
18,877nt C → T 1,517 2.87
19,839nt endoRNAse T → C 1,955 3.69
20,268nt A → G 6,742 12.74
21,304nt 2'‐O‐ribose methyltransferase C → T 6,603 12.47
23,403nt Spike glycoprotein A → G 49,154 92.86
23,604nt C → A 1,238 2.34
24,076nt T → C 2,148 4.06
25,563nt ORF3a G → T 31,241 59.02
25,907nt G → T 6,369 12.03
27,964nt ORF8 C → T 12,002 22.67
28,144nt T → C 2,790 5.27
28,472nt Nucleocapsid phosphoprotein C → T 6,473 12.23
28,821nt C → A 1,821 3.44
28,842nt G → T 1,152 2.18
28,854nt C → T 6,694 12.65
28,869nt C → T 6,640 12.54
28,881nt G → A 6,887 13.01
28,882nt G → A 6,848 12.94
28,883nt G → C 6,847 12.93
28,887nt C → T 1,090 2.06
29,402nt G → T 1,630 3.08
29,784nt 3´UTR C → T 1,062 2.01
29,870nt C → A 1,990 3.76

The general design of this Table is similar to Tables 3, 4 and 7–12, with minor modifications. Part A: From the overall analyses of the entire SARS‐CoV‐2 RNA sequence from 112 (US‐I), 97 (US‐II), 99 (US‐III), and 117 (US‐IV) randomly chosen isolates, the mutated nucleotides (nt), as compared to the original Wuhan sequence, were tabulated. The actual time periods of mutant selections for the US‐I to US‐IV samples are indicated in the column headings. Please note that in some of the Tables, as is the case in Table 5A, mutations were analyzed at different time intervals. From earlier to later, these time intervals are designated in the text as US‐I, US‐II, etc. The same nomenclature is followed in other Tables wherever more than one time interval was studied. Mutations previously designated as “signal hotspots” (Weber et al, 2020, i.e. 241–1,059–1,440–2,891–3,037–8,782–14,408–23,403–25,563–28,144–28,881) are now designated “prevalent.” The * in the US‐I and US‐II columns designates previous publication in Weber et al (2020). The actual nucleotide changes are indicated in the third column, the most frequent being C → T (here 61.5%), as reported previously (Simmonds, 2020; Weber et al, 2020). Locations of mutations on the viral genome and amino acid exchanges as consequences of individual mutations are tabulated in columns 2 and 3, respectively. In columns 4 to 7, the actual frequencies of mutations at the four time intervals (US‐I to US‐IV) are listed. The following designations for individual countries were chosen: BR for Brazil, CN for China, DE for Germany, FR for France, IN for India, RU for Russia, ES for Spain, ZA for South Africa, UK for United Kingdom, and US for United States.
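
The count cells in Part A are given as "mutated isolates/isolates analyzed" for each time interval. As a minimal sketch of how such cells could be converted into percentages for comparison across intervals, the following Python snippet parses the four cells of one row. The helper names `parse_count` and `percent` are illustrative assumptions and are not part of the original analysis; the example values are taken from the 23,403nt row of Part A.

```python
from fractions import Fraction


def parse_count(cell: str) -> Fraction:
    """Turn a 'mutated/total' cell such as '77/112' into an exact fraction.

    Hypothetical helper for illustration only.
    """
    mutated, total = cell.replace(",", "").split("/")
    return Fraction(int(mutated), int(total))


def percent(cell: str) -> float:
    """Return the share of mutated isolates in a cell as a percentage."""
    return float(parse_count(cell)) * 100.0


# Row 23,403nt (Spike glycoprotein, AT -> GT) from Part A, US-I to US-IV.
row_23403 = ["77/112", "72/97", "99/99", "117/117"]
percentages = [percent(cell) for cell in row_23403]

for label, value in zip(["US-I", "US-II", "US-III", "US-IV"], percentages):
    print(f"{label}: {value:.1f}%")
```

Using `Fraction` keeps the ratio exact until the final conversion to a float, so a cell such as 99/99 evaluates to exactly 100%.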

The GGG → AAC exchange is a non‐point mutation spanning nucleotide positions 28,881 to 28,883 that generated a highly basic amino acid sequence in the SARS‐CoV‐2 nucleocapsid phosphoprotein. We have speculated that this mutation might have originated from a recombination event between different viral RNA molecules (Weber et al, 2020).

Part B: A total of 5,710 SARS‐CoV‐2 RNA sequences from the GISAID source were analyzed. Deviations from the Wuhan reference sequence of >2% incidence were found at 42 sites in the sequence. Further details were described in the text.
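
The >2% cutoff used for Part B can be illustrated with a short Python sketch that keeps only those positions whose mutation count exceeds a given percentage of all sequences analyzed. The function name `sites_above_threshold`, its parameters, and the toy numbers below are assumptions made for illustration; they are not the authors' pipeline and not values from the Table.

```python
def sites_above_threshold(counts, n_sequences, threshold_pct=2.0):
    """Return {position: percentage} for sites whose incidence exceeds the cutoff.

    counts        -- mapping of nucleotide position to number of sequences
                     carrying a deviation from the reference at that position
    n_sequences   -- total number of sequences analyzed (must be > 0)
    threshold_pct -- cutoff in percent; sites must be strictly above it
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    selected = {}
    for position, count in counts.items():
        pct = 100.0 * count / n_sequences
        if pct > threshold_pct:
            selected[position] = pct
    return selected


# Toy data (made up): 1,000 sequences, three candidate positions.
toy_counts = {241: 920, 36: 22, 100: 15}
selected = sites_above_threshold(toy_counts, n_sequences=1000)

for position in sorted(selected):
    print(f"{position}nt: {selected[position]:.2f}%")
```

In this toy example, position 100 (1.5%) falls below the cutoff and is dropped, while positions 36 and 241 are retained.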