TABLE 1.
Comparison of the frequencies of in-frame indels (hereafter, indels) in SARS-CoV-2 proteins using the two-sided binomial test. Only indels observed in at least two genomes were included, to eliminate spurious mutations. Bold font indicates proteins with a significantly increased rate of indels (q-value < 0.01 and odds ratio > 1).
| Protein | Protein length | Number of indels | Odds ratio (all proteins as background) | q-value, FDR-adjusted p-value (all proteins as background) | Odds ratio (ORF1ab as background) | q-value, FDR-adjusted p-value (ORF1ab as background) |
|---|---|---|---|---|---|---|
| NSP1 | 540 | 109 | 2.14 | 1.85E-12 | 4.40 | 1.44E-36 |
| NSP2 | 1914 | 81 | 0.45 | 5.89E-17 | 0.92 | 5.02E-01 |
| NSP3 | 5835 | 442 | 0.80 | 1.78E-07 | 1.65 | 3.96E-32 |
| NSP4 | 1500 | 40 | 0.28 | 5.58E-24 | 0.58 | 2.93E-04 |
| NSP5 | 918 | 6 | 0.07 | 3.74E-29 | 0.14 | 9.58E-12 |
| NSP6 | 870 | 58 | 0.71 | 6.47E-03 | 1.45 | 8.99E-03 |
| NSP7 | 249 | 5 | 0.21 | 1.18E-05 | 0.44 | 6.23E-02 |
| NSP8 | 594 | 9 | 0.16 | 1.86E-14 | 0.33 | 1.65E-04 |
| NSP9 | 339 | 7 | 0.22 | 2.31E-07 | 0.45 | 3.54E-02 |
| NSP10 | 417 | 8 | 0.20 | 4.37E-09 | 0.42 | 1.08E-02 |
| NSP12 | 2795 | 46 | 0.17 | 2.91E-64 | 0.36 | 5.63E-18 |
| NSP13 | 1803 | 15 | 0.09 | 3.32E-54 | 0.18 | 2.65E-20 |
| NSP14 | 1581 | 91 | 0.61 | 2.58E-07 | 1.26 | 3.54E-02 |
| NSP15 | 1038 | 36 | 0.37 | 1.05E-12 | 0.76 | 9.92E-02 |
| NSP16 | 894 | 23 | 0.27 | 6.51E-15 | 0.56 | 5.01E-03 |
| Spike | 3822 | 459 | 1.27 | 1.22E-07 | - | - |
| E | 228 | 18 | 0.84 | 5.16E-01 | - | - |
| M | 669 | 26 | 0.41 | 2.06E-07 | - | - |
| N | 1260 | 159 | 1.34 | 3.43E-04 | - | - |
| ORF10 | 117 | 7 | 0.63 | 3.01E-01 | - | - |
| ORF3a | 828 | 254 | 3.25 | 1.91E-57 | - | - |
| ORF6 | 186 | 61 | 3.47 | 9.43E-16 | - | - |
| ORF7a | 366 | 595 | 17.22 | 0.00E+00 | - | - |
| ORF7b | 132 | 58 | 4.65 | 1.56E-20 | - | - |
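
The test described in the caption can be sketched in a few lines of Python. This is a minimal illustration and not the authors' actual pipeline. It assumes that the background rate is the pooled indel count divided by the pooled length of the listed ORF1ab proteins, that the reported ratio is each protein's per-position indel rate divided by that background rate, and that the FDR adjustment is Benjamini-Hochberg. Under these assumptions the ratios in the ORF1ab column are recovered, for example NSP1: (109/540)/(976/21287) ≈ 4.40. The q-values it prints need not match the published ones exactly. All function names are hypothetical.

```python
import math

# (protein, length, number of indels) copied from the ORF1ab rows of Table 1
ORF1AB_ROWS = [
    ("NSP1", 540, 109), ("NSP2", 1914, 81), ("NSP3", 5835, 442),
    ("NSP4", 1500, 40), ("NSP5", 918, 6), ("NSP6", 870, 58),
    ("NSP7", 249, 5), ("NSP8", 594, 9), ("NSP9", 339, 7),
    ("NSP10", 417, 8), ("NSP12", 2795, 46), ("NSP13", 1803, 15),
    ("NSP14", 1581, 91), ("NSP15", 1038, 36), ("NSP16", 894, 23),
]


def binom_log_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_test_two_sided(k, n, p):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    threshold = binom_log_pmf(k, n, p) + 1e-9  # small tolerance for ties
    total = 0.0
    for i in range(n + 1):
        log_prob = binom_log_pmf(i, n, p)
        if log_prob <= threshold:
            total += math.exp(log_prob)
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def enrichment_table(rows):
    """Return (protein, rate ratio, q-value) against the pooled background."""
    total_length = sum(length for _, length, _ in rows)
    total_indels = sum(count for _, _, count in rows)
    background = total_indels / total_length  # assumed background rate
    ratios = [(count / length) / background for _, length, count in rows]
    p_values = [binom_test_two_sided(count, length, background)
                for _, length, count in rows]
    q_values = benjamini_hochberg(p_values)
    return [(name, round(ratio, 2), q)
            for (name, _, _), ratio, q in zip(rows, ratios, q_values)]


RESULTS = enrichment_table(ORF1AB_ROWS)
for name, ratio, q in RESULTS:
    print(f"{name:6s} ratio={ratio:5.2f} q={q:.2e}")
```

A protein would be flagged as indel-enriched under the caption's criterion when its ratio exceeds 1 and its q-value is below 0.01. The same function can be applied to the full list of proteins for the all-proteins background, provided every protein that contributed to that background is included in the input.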