Table 2. The mutational path to European pandemic founder haplotypes*.
Variant | EP–3 | EP–2 | EP† | EP+1† | EP+1+LOF |
---|---|---|---|---|---|
C241U (5′-UTR) | - | - | + | + | + |
C3037U (nsp3-F106F) | - | - | + | + | + |
C14408U (RdRp-P323L) | - | - | - | + | + |
A23403G (Spike-D614G) | - | + | + | + | + |
G25563U (ORF3a-Q57H/ORF3c-R36I/ORF3d-E14*) | - | - | - | - | + |
Earliest collection§ | 24-Dec | 7-Feb | 28-Jan | 20-Feb | 21-Feb |
Earliest location§ | Wuhan | Wuhan | Munich (Shanghai)‡ | Lombardy | Hauts-de-France |
Occurrence in China | 233 | 1 | 1 (2)‡ | 0 | 0 |
Occurrence in Europe | 458 | 0 | 21 | 1153 | 310 |
Occurrence in Italy | 1 | 0 | 0 | 27 | 0 |
Occurrence in Germany | 15 | 0 | 1 | 11 | 21 |
Occurrence in Belgium | 27 | 1 | 20 | 187 | 27 |
Occurrence in UK | 210 | 0 | 0 | 338 | 38 |
Occurrence in Iceland | 56 | 0 | 0 | 212 | 54 |
Occurrence in France | 14 | 0 | 0 | 72 | 102 |
Occurrence in US | 467 | 0 | 0 | 88** | 326** |
Total in GISAID†† | 1610 | 2 | 22 | 1455 | 752 |
*Haplotypes are defined here by the presence (+) or absence (-) of five high-frequency variants (rows 1–5); other variants with lower frequencies on these backgrounds are ignored. EP–1 is not observed in our dataset.
†The EP haplotype is first detected in German patient #4 and is a documented founder of coronavirus spread in Germany (Rothe et al., 2020). Neither the EP nor the EP+1 haplotype was detectable between January 28 and February 20, although they immediately became a major haplotype once EP+1 was detectable. Failure to detect these two haplotypes during these 3 weeks could potentially be explained by ascertainment bias, for example, a lack of testing for travel-independent cases.
‡This Shanghai sample (GISAID: EPI_ISL_416327) comprises 1.32% poly-Ns and failed our quality control criteria, but is added here since it is potentially relevant to the origin of the EP haplotype. Including this sample, the EP haplotype is observed in Shanghai twice.
§The earliest collection location and time are highly subject to collection and submission bias and do not necessarily reflect where the mutation/haplotype first occurred.
**There is likely a testing bias in the United States, as the EP+1+LOF haplotype was often detected in Washington but EP+1 was not.
††These numbers are based on 3853 samples from December 24 to April 1 at the time of GISAID accession that passed both our quality control criteria for alignment and for this particular analysis (i.e. no ambiguous genotype calls among the five SNPs in this table), unless otherwise stated.
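
The haplotype definitions in the table can be read as a lookup on five presence/absence calls. The following is a minimal sketch of that lookup, not the authors' pipeline. It assumes genome positions are given in the coordinates used by the variant names (e.g. 241 for C241U), that bases are supplied as DNA or RNA letters, and that any call that is neither the reference nor the derived base counts as ambiguous. The names `SITES`, `HAPLOTYPES`, and `classify_haplotype` are hypothetical, and ASCII hyphens stand in for the en dashes in the column labels.

```python
from typing import Mapping, Optional

# (genome position, reference base, derived base), in the row order of the table.
# U in the variant names is written as T, since sequences are usually stored as DNA.
SITES = (
    (241, "C", "T"),    # C241U (5'-UTR)
    (3037, "C", "T"),   # C3037U (nsp3-F106F)
    (14408, "C", "T"),  # C14408U (RdRp-P323L)
    (23403, "A", "G"),  # A23403G (Spike-D614G)
    (25563, "G", "T"),  # G25563U (ORF3a-Q57H)
)

# Presence (True) or absence (False) of each variant, copied from the table columns.
HAPLOTYPES = {
    (False, False, False, False, False): "EP-3",
    (False, False, False, True, False): "EP-2",
    (True, True, False, True, False): "EP",
    (True, True, True, True, False): "EP+1",
    (True, True, True, True, True): "EP+1+LOF",
}


def classify_haplotype(calls: Mapping[int, str]) -> Optional[str]:
    """Return the haplotype label for one sample.

    `calls` maps genome position to the observed base. Returns None when any
    of the five sites is missing or ambiguous (assumption: such samples are
    excluded), and "other" for a combination not listed in the table.
    """
    pattern = []
    for pos, ref, alt in SITES:
        base = calls.get(pos, "N").upper().replace("U", "T")
        if base == alt:
            pattern.append(True)
        elif base == ref:
            pattern.append(False)
        else:
            return None  # ambiguous or unexpected base at a defining site
    return HAPLOTYPES.get(tuple(pattern), "other")
```

For example, a sample carrying the derived base at the first four sites and the reference base at position 25563 is labeled `EP+1`, while a sample with an `N` at any of the five sites is returned as `None` and would not be counted.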