Extended Data Figure 5: Characterization of the variation in allelic fraction and cellular prevalence of SNVs across 27 MCF7 strains and their single cell-derived clones.
(a) Top: unsupervised hierarchical clustering of 27 MCF7 strains, based on the allelic fractions of all their SNVs. Groups of strains expected to cluster together based on their evolutionary history are highlighted, as in Fig. 1. Bottom: a corresponding heatmap, showing the allelic fractions of mutations across the 27 MCF7 strains. Only mutations identified in a subset of the strains are shown. The presence of a mutation is shown in color according to its allelic fraction. (b) The allelic fraction (AF) of an activating PIK3CA mutation (top) and an inactivating TP53 mutation (bottom) across strains. (c) Top: unsupervised hierarchical clustering of 27 MCF7 strains, based on their SNV cellular prevalence. Groups of strains expected to cluster together based on their evolutionary history are highlighted, as in Fig. 1. Bottom: a corresponding heatmap, showing the cellular prevalence of mutations across the 27 MCF7 strains. Only mutations identified in a subset of the strains are shown. The presence of a mutation is shown in color according to its cellular prevalence. (d) The distribution of the maximal differences in cellular prevalence (CP) of non-silent mutations across 27 MCF7 strains. The peak at maxΔCP=1 represents SNVs that are clonal in at least one strain but are nearly or completely absent in at least one other strain; the peak at maxΔCP=0 represents SNVs that are detected at similar prevalence across all 27 strains; and the peak at maxΔCP≈0.1 represents a group of SNVs present at CP≈0.1 only in strain M. (e) A table of the MCF7 single cell-derived clones included in this study, presenting their parental cell line, genetic manipulations and relationship to one another. (f) A heatmap presenting the allelic fractions of non-silent mutations in three WT single cell-derived MCF7 clones and their parental population. The presence of a mutation is shown in color according to its allelic fraction.
(g) A heatmap presenting the allelic fractions of non-silent mutations in five genetically manipulated single cell-derived MCF7 clones. For two of the clones, samples were passaged for a prolonged time and sequenced at multiple time points. The presence of a mutation is shown in color according to its allelic fraction. (h) Comparison of the karyotypic variation between parental and single cell-derived cell populations. Histograms present the distribution of chromosome numbers from the parental (light gray) and single cell-derived (dark gray) populations. P-values indicate the significance of the difference between the variances (rather than the means) of the populations from a one-tailed Levene’s test (n=50 metaphases per group). (i) Two representative karyotypes from each sample. Note that all single cell-derived clones are karyotypically heterogeneous. Marker chromosomes are not shown. Arrows point to partially aberrant chromosomes. Images are representative of 50 metaphases counted per sample. (j) Two representative karyotypes from two cell populations of the same single cell-derived clone, separated by 6 months of culture propagation. Marker chromosomes are not shown. Arrows point to partially aberrant chromosomes. Images are representative of 50 metaphases counted per sample. (k) Comparison of the karyotypic variation between two cell populations of the same single cell-derived clone, separated by 6 months of culture propagation. Histograms present the distribution of chromosome numbers from the early (light gray) and late (dark gray) populations. 50 metaphases were counted per sample. P-value indicates the significance of the difference between the means of the populations from a two-tailed Wilcoxon rank-sum test.
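
The maxΔCP statistic in panel (d) can be illustrated with a minimal sketch. It assumes that maxΔCP for an SNV is the largest CP observed in any strain minus the smallest CP observed in any strain; the legend does not spell out the formula, so this definition is an assumption. The function name, strain labels and CP values below are hypothetical and are not data from the study.

```python
def max_delta_cp(cp_by_strain):
    """Return the largest pairwise CP difference for one SNV.

    Assumed definition: max CP across strains minus min CP across strains.
    """
    values = list(cp_by_strain.values())
    return max(values) - min(values)


# Hypothetical CP values for three SNVs in three strains (illustration only).
snvs = {
    "clonal_in_some_absent_in_others": {"A": 1.0, "B": 0.0, "M": 1.0},
    "similar_in_all_strains": {"A": 0.5, "B": 0.5, "M": 0.5},
    "low_prevalence_in_M_only": {"A": 0.0, "B": 0.0, "M": 0.1},
}

# One maxΔCP value per SNV; a histogram of these values gives a
# distribution of the kind described for panel (d).
max_delta = {name: max_delta_cp(cp) for name, cp in snvs.items()}
```

Under this assumed definition, the three toy SNVs fall at maxΔCP of 1, 0 and about 0.1, matching the three kinds of peak described for panel (d).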