Table 4.
Mean # Mutations Per Tumor¹ | Clonality Signal | Mean # Matching Mutations | Frequency of p<0.05¹: Uncorrelated² | Frequency of p<0.05¹: Negative Correlations³ | Frequency of p<0.05¹: Positive Correlations⁴ (correlation 0.3) | Frequency of p<0.05¹: Positive Correlations⁴ (correlation 0.9) |
---|---|---|---|---|---|---|
5 | 0.0 | 0.10 | 0.01 | 0.04 | 0.02 | 0.02 |
5 | 0.1 | 0.56 | 0.34 | 0.41 | 0.31 | 0.24 |
5 | 0.25 | 1.32 | 0.64 | 0.66 | 0.62 | 0.46 |
10 | 0.0 | 0.21 | 0.02 | 0.02 | 0.02 | 0.03 |
10 | 0.1 | 1.17 | 0.57 | 0.55 | 0.56 | 0.40 |
10 | 0.25 | 2.60 | 0.87 | 0.87 | 0.84 | 0.67 |
20 | 0.0 | 0.45 | 0.04 | 0.02 | 0.04 | 0.05 |
20 | 0.1 | 2.41 | 0.81 | 0.83 | 0.75 | 0.55 |
20 | 0.25 | 5.28 | 0.98 | 0.99 | 0.98 | 0.87 |
¹ In all configurations, the 10,000 markers are generated with the same set-ups for the marginal mutation probabilities and the clonality signal as described in the footnotes to Table 2.
² Here the test is computed using the same marginal probabilities and (uncorrelated) data generation as in Tables 2 and 3.
³ In these configurations, negative correlation between “pathways” is generated as follows, in a way designed so that the overall mean number of matching mutations equals that of the corresponding uncorrelated configuration. Common markers are generated in blocks of 10, using a single draw per block from a multinomial distribution with 10 mutually exclusive outcomes, each with a fixed marginal frequency of 0.1. One, two or four such multinomials are generated for each tumor under the three scenarios (mean # mutations of 5, 10 or 20, respectively). In addition, 5,000 rare markers are generated in 50 blocks of 100 mutually exclusive markers, with a fixed rare marginal frequency for each mutation (4/9990, 8/9980 and 16/9960, respectively, for the three scenarios). That is, we generated one draw from each multinomial with 101 potential outcomes, where none of the 100 markers exhibits a mutation when the 101st outcome is selected (the probability of the 101st outcome is 9590/9990, 9180/9980 and 8360/9960, respectively, for the three scenarios). For markers that belong to these multinomial blocks, the clonality status is drawn once for the whole block; that is, whole blocks rather than individual mutations are considered clonal or independent. The remaining 4990, 4980 or 4960 markers in the three scenarios are independent of each other and of the multinomial blocks and are generated as described above. The test statistic and reference distribution are calculated assuming all markers are independent.
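The mutually exclusive block construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `draw_exclusive_block` and `draw_blocked_markers` are hypothetical, only the blocked markers of a single tumor are drawn, and the per-block clonality draw and the remaining independent markers are omitted.

```python
import random

def draw_exclusive_block(n_markers, p_each, rng):
    """Draw one block in which at most one marker is mutated.

    Each marker has marginal probability p_each; the leftover mass
    1 - n_markers * p_each is the extra "no mutation" outcome.
    (Hypothetical helper, written only for illustration.)
    """
    p_none = max(1.0 - n_markers * p_each, 0.0)
    weights = [p_each] * n_markers + [p_none]
    # Single multinomial draw over n_markers + 1 mutually exclusive outcomes.
    outcome = rng.choices(range(n_markers + 1), weights=weights)[0]
    block = [0] * n_markers
    if outcome < n_markers:
        block[outcome] = 1
    return block

def draw_blocked_markers(n_common_blocks, p_rare, rng):
    """Blocked markers for one tumor: common blocks of 10, 50 rare blocks of 100."""
    common = [draw_exclusive_block(10, 0.1, rng) for _ in range(n_common_blocks)]
    rare = [draw_exclusive_block(100, p_rare, rng) for _ in range(50)]
    return common, rare

# Scenario with a mean of 5 mutations per tumor: one common block,
# rare marginal frequency 4/9990. The seed is arbitrary.
rng = random.Random(1)
common, rare = draw_blocked_markers(1, 4 / 9990, rng)
```

Because the ten common outcomes have frequency 0.1 each and so exhaust the probability mass, every common block carries exactly one mutation, whereas a rare block usually carries none.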
⁴ Similarly to (3) above, one, two or four blocks of 10 common markers and 50 blocks of 100 rare markers are generated, here with positive correlation. To accomplish this, we generated multivariate normal variates Y of size 10 or 100 with mean 0, variance 1 and pairwise correlations of 0.3 or 0.9. The correlated binary mutation outcomes were determined by dichotomizing these normal variables at the appropriate marginal frequencies. Clonality status was drawn on a per-block basis as described in (3) above, and the remaining markers were generated independently. Note that in this set-up more than one mutation can be observed within a block (indeed, this becomes increasingly likely as the correlation increases), whereas in the mutually exclusive construct in (3) above at most one mutation is observed in each block.
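The dichotomized-normal construction can be sketched as follows. This is an illustrative sketch only: it assumes every pair of markers in a block shares the same correlation, and it draws the normals through a shared-factor representation, which is one convenient way to obtain equicorrelated unit-variance normals and is not necessarily how the authors generated them. The function name `draw_correlated_block` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def draw_correlated_block(n_markers, p_each, rho, rng):
    """Draw one block of positively correlated binary mutations.

    Latent normals Y_i = sqrt(rho) * Z0 + sqrt(1 - rho) * Z_i have mean 0,
    variance 1 and pairwise correlation rho (assumed equal for all pairs).
    A marker is mutated when Y_i exceeds the cutoff that gives marginal
    frequency p_each. (Hypothetical helper, written only for illustration.)
    """
    cutoff = NormalDist().inv_cdf(1.0 - p_each)
    shared = rng.gauss(0.0, 1.0)  # factor common to the whole block
    block = []
    for _ in range(n_markers):
        y = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        block.append(1 if y > cutoff else 0)
    return block

# One common block (size 10, marginal frequency 0.1) at correlation 0.9.
# The seed is arbitrary.
rng = random.Random(1)
block = draw_correlated_block(10, 0.1, 0.9, rng)
```

Unlike the mutually exclusive blocks, a block drawn this way can contain several mutations at once, since a large value of the shared factor pushes many latent variables over the cutoff together.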