Costa et al. 10.1073/pnas.0708250105.

Supporting Information

Files in this Data Supplement:

SI Figure 5
SI Figure 6
SI Figure 7
SI Figure 8
SI Figure 9
SI Appendix
SI Text




SI Figure 5

Fig. 5. Enhanced version of Fig. 1 in the main text.





SI Figure 6

Fig. 6. Correlation of MLDA-derived metrics. Discrimination using the HDA metric (Left) and standard deviation among allele frequencies (Right) correlates well with that achieved using Total Mutational Load. S.D., standard deviation. A maximum of information is obtained by using the combination of 22 alleles, and tests indicate that each allele contributes a similar amount of information.





SI Figure 7

Fig. 7. Enhanced version of Fig. 3 in the main text.





SI Figure 8A
SI Figure 8B

Fig. 8. Kaplan-Meier curves for a range of disturbance frequencies (A) and intensities (B). Disturbance frequency appears to produce relatively gradual changes in the risk of developing a tumor, with the maximum risk of tumor formation at intermediate disturbance intervals of 8 and 10. In contrast, disturbance intensity shows a fairly steep bifurcation between lower and higher intensities (Fig. 8B), with a maximum risk at an intermediate intensity of 0.9.
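Curves of this kind can be computed from per-simulation outcomes. The following is a minimal sketch of the Kaplan-Meier product-limit estimator, assuming each run reports either the iteration at which a tumor was diagnosed or a censoring time (for example, the end of the run) if no tumor formed. The function name and input layout are hypothetical and are not taken from the authors' software.

```python
def kaplan_meier(times, tumor_observed):
    """Return [(t, S(t))] at each time with at least one tumor event.

    times          -- iteration of tumor diagnosis, or censoring time
    tumor_observed -- True if a tumor formed, False if the run was censored
    """
    data = sorted(zip(times, tumor_observed))
    at_risk = len(data)      # runs still tumor-free and under observation
    survival = 1.0           # running tumor-free probability
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all runs that share the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            removed += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

One curve would be computed per disturbance frequency or intensity setting, and the curves plotted together for comparison.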





SI Figure 9

Fig. 9. More detailed views of individual simulations, which portray the transition from random mutational spectra to a dominant allele. The clearest way to show this effect is a complete description of specific individual simulated patients, as in Fig. 4, demonstrating that high-risk patients have a diverse set of clones from which a single dominant clone or a small number of dominant clones emerges. We have provided a representative series of figures in which the diversity of the mutated clones, calculated as Simpson's measure (Shannon entropy gave essentially identical results), is plotted over the entire simulation time for simulations in which tumors developed (black points), each paired with a randomly selected simulation with the same disturbance pattern in which no tumor developed (red points). In general, the nontumors show irregular, fluctuating patterns of diversity/entropy over time, and no dominant allele emerges. For the tumor samples, we have additionally shown the timepoint at which a tumor was "diagnosed" (vertical green line), as well as the maximum value attained up to each timepoint by the most dominant allele (shown in blue as the proportion of the highest allele in the entire sample). These plots indicate that the diversity/entropy of the tumor samples before the establishment of a dominant clone is generally indistinguishable from that of the nontumor samples. Once the clone is established, the entropy for the organism remains consistent.
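The diversity measures named in this legend can be computed from a list of clone labels at a given timepoint. This is a minimal sketch, assuming the Gini-Simpson form of Simpson's measure (one minus the sum of squared clone proportions); the legend does not state which variant was used, and the function names are hypothetical.

```python
import math
from collections import Counter

def clone_proportions(clone_ids):
    """Proportion of the sample belonging to each distinct clone."""
    counts = Counter(clone_ids)
    total = sum(counts.values())
    return [n / total for n in counts.values()]

def simpson_diversity(clone_ids):
    """Gini-Simpson index, 1 - sum(p_i^2) (assumed variant)."""
    return 1.0 - sum(p * p for p in clone_proportions(clone_ids))

def shannon_entropy(clone_ids):
    """Shannon entropy in nats, -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in clone_proportions(clone_ids))

def dominant_fraction(clone_ids):
    """Proportion of the sample held by the most common clone."""
    return max(clone_proportions(clone_ids))
```

Both diversity measures fall toward zero as one clone takes over the sample, while `dominant_fraction` rises toward one, which matches the qualitative pattern described for the tumor simulations.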





SI Text

Materials and Methods

Simulation/Model Description

We have simulated the distribution of mutational load and its relation to tumor development in a population of cells located within a spatially structured tissue subject to periodic disturbances, by using a stochastic lattice model. The lattice is a 100 × 100 square grid with periodic boundary conditions. Demes are represented as individual (stem-like) cells that have the capacity to mutate, reproduce by expanding into vacant neighboring tissue niches, or die. Demes were initially randomly distributed throughout the grid at various overall densities, each occupying a single location in the grid. We simulate metapopulation barriers by limiting the capacity of clones to expand without sufficient "empty space" in the neighborhood. Specifically, a rule of the model states that colonization of neighboring sites cannot occur into occupied regions.
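The lattice rules above can be sketched as follows. This is an illustrative sketch only: the four-site (von Neumann) neighborhood, the representation of a deme by its mutation count, and all function names are assumptions, not details taken from the authors' software.

```python
import random

SIZE = 100    # 100 x 100 grid, as stated in the text
EMPTY = None  # marker for a vacant site

def neighbors(row, col, size=SIZE):
    """Four nearest neighbors with periodic (wrap-around) boundaries.
    The 4-site neighborhood is an assumption of this sketch."""
    return [((row - 1) % size, col), ((row + 1) % size, col),
            (row, (col - 1) % size), (row, (col + 1) % size)]

def seed_grid(density, size=SIZE, rng=random):
    """Place wild-type demes (mutation count 0) at random sites."""
    return [[0 if rng.random() < density else EMPTY for _ in range(size)]
            for _ in range(size)]

def try_expand(grid, row, col, p_grow, rng=random):
    """With probability p_grow, copy the deme at (row, col) into one
    vacant neighbor. Returns True if colonization happened."""
    if grid[row][col] is EMPTY or rng.random() >= p_grow:
        return False
    size = len(grid)
    vacant = [(r, c) for r, c in neighbors(row, col, size)
              if grid[r][c] is EMPTY]
    if not vacant:
        return False  # colonization cannot occur into occupied sites
    r, c = rng.choice(vacant)
    grid[r][c] = grid[row][col]
    return True
```

A deme that is fully surrounded cannot expand regardless of its growth probability, which is how the "empty space" constraint acts as a barrier in this sketch.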

A deme is characterized by its "genotype," which is initially identical (wild type) but subsequently altered by accumulated mutations. During each time step, mutation, expansion, and death transitions were randomly assigned to each cell according to transition probabilities based on the genotype of the deme. For the studies presented here, there was a single baseline mutation rate, which was unchanged for all genotypes. In the simulations presented, for the purpose of monitoring mutation load, the genotype of a deme was represented by 10 alleles for each of three target genes, or 10 possible values for each gene (one gene for each type of potential advantage: proliferative rate, death avoidance, and susceptibility to disturbance). The frequencies of each mutated allele in the population were explicitly monitored to track relative changes in the mutational load profile in the tissue over time. The mutational load at a number of other alleles, outside of these specifically targeted genes, was also monitored (data not shown). These additional sites are for the most part deleterious to fitness when present singly. The parameters of a single run included growth, death, and susceptibility probabilities (for both wild-type and mutated genotypes), mutation rate, and disturbance frequency and intensity. Runs consisted of 5,000 Monte Carlo iterations. Global disturbance events were fatal for a cell as a randomized function of its overall death and susceptibility rates and the global disturbance intensity parameter. The model is agent-based yet closely related conceptually to the multispecies metapopulation models (cf. ref. 1).
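The per-step assignment of transitions can be illustrated with a short sketch. Everything numeric here is hypothetical: the probabilities are placeholders rather than values from the study, and encoding wild type as allele 0 with mutated alleles numbered 1 to 10 is an assumption about how the "10 possible values for each gene" might be stored.

```python
import random
from dataclasses import dataclass, field

GENES = ("proliferation", "death_avoidance", "susceptibility")

@dataclass
class Deme:
    # Allele index per target gene; 0 = wild type, 1..10 = mutated (assumed).
    alleles: dict = field(default_factory=lambda: {g: 0 for g in GENES})

    def mutation_count(self):
        """Number of target genes carrying a mutated allele."""
        return sum(1 for a in self.alleles.values() if a != 0)

# Hypothetical placeholder probabilities, not values from the paper.
P_GROW_WT, P_GROW_MUT = 0.20, 0.30
P_DIE_WT, P_DIE_MUT = 0.10, 0.05
P_MUTATE = 1e-3  # single baseline rate, the same for all genotypes

def step_transitions(deme, rng=random):
    """Draw this time step's mutation, expansion, and death events."""
    p_grow = P_GROW_MUT if deme.alleles["proliferation"] else P_GROW_WT
    p_die = P_DIE_MUT if deme.alleles["death_avoidance"] else P_DIE_WT
    events = {
        "mutate": rng.random() < P_MUTATE,
        "expand": rng.random() < p_grow,
        "die": rng.random() < p_die,
    }
    if events["mutate"]:
        gene = rng.choice(GENES)
        deme.alleles[gene] = rng.randint(1, 10)  # pick a mutated allele
    return events
```

Allele frequencies in the population could then be tallied after each iteration by counting the `alleles` values across all living demes.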

However, in the model of Tilman et al. (1), loss of habitat, D (which plays a role similar to that of disturbance in our model), is irreversible and fixed at a certain level for a particular analysis. For our model, D is a function of time (e.g., ). In our agent-based model with discrete time steps, each clone has an additional probability of dying, above the baseline death probability, during cycles in which a disturbance occurs. This probability is determined by the presence or absence of a mutation in a gene for susceptibility to the disturbance.
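One way to picture a time-dependent D and the resulting extra death risk is sketched below. The strictly periodic schedule, the multiplicative use of a susceptibility value, and the combination of the two risks as independent events are all assumptions of this sketch; the default interval and intensity are only example values.

```python
def disturbance_level(t, interval=25, intensity=0.9):
    """D(t): the disturbance intensity on disturbance cycles, else 0.
    A strictly periodic schedule is assumed here."""
    return intensity if t > 0 and t % interval == 0 else 0.0

def death_probability(t, p_base, susceptibility, interval=25, intensity=0.9):
    """Baseline death probability plus the additional risk incurred
    during a disturbance cycle.

    susceptibility -- value in [0, 1] set by the genotype; a clone with a
                      protective mutation would have a lower value (assumed)
    """
    p_extra = disturbance_level(t, interval, intensity) * susceptibility
    # Treat baseline death and disturbance death as independent risks.
    return 1.0 - (1.0 - p_base) * (1.0 - p_extra)
```

Between disturbance cycles the function returns the baseline probability unchanged, so the genotype-dependent difference appears only on the cycles in which a disturbance occurs.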

The growth and survival parameters are in general related to the colonization and extinction parameters in the basic Levins metapopulation model:

Specifically, they are more closely related to these same parameters in the multispecies metapopulation models of Tilman et al. (1). For the agent-based model with discrete time steps, each clone will spread into neighboring unoccupied space with a probability dependent on the wild-type or mutated genotype.
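For orientation, the two metapopulation models mentioned here are usually written as shown below. The notation is the conventional one and is an assumption of this note, so it may differ from the symbols in the authors' own equations: p is the fraction of occupied sites, c the colonization rate, e the extinction rate; in the multispecies form, p_i, c_i, and m_i are the occupancy, colonization rate, and mortality rate of species i (ranked from best to worst competitor), and D is the fraction of habitat destroyed.

```latex
% Basic Levins model (single species)
\frac{dp}{dt} = c\,p\,(1 - p) - e\,p

% Multispecies form with habitat loss D, as commonly stated for Tilman et al. (1)
\frac{dp_i}{dt} = c_i\,p_i \Bigl(1 - D - \sum_{j=1}^{i} p_j\Bigr)
                  - m_i\,p_i - \sum_{j=1}^{i-1} c_j\,p_j\,p_i
```

In the agent-based setting, the per-step growth probability plays the part of c and the per-step death probability the part of e or m_i.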

A range of mutation rates per cell division was studied, roughly from 10⁻⁵ through 10⁻⁹ mutations per nucleotide per division. This is a rough estimate, because our approach was based on mutations anywhere within a particular allele per division.

The average interval between disturbances ranged from 2 to 1,000 iterations. Disturbance intensities ranging from 0.3 to 0.999 were evaluated. For the studies presented here, either no disturbances occurred (undisturbed state) or disturbances occurred every 25 iterations, with a disturbance intensity of 0.9. Individual demes within patches were subject to random culling if they persisted without a sufficient threshold of mutations before the potential transformation to an oncodeme, defined for the current studies as the appearance of a third mutation. Tumor formation was considered to have occurred with the accumulation of three mutations in a continuously expanding clone.

We simulated the importance of the proportion of dead vs. living cells in the samples of biological fluids analyzed. The effect on tumor detection of varying the sampling proportions of living versus dying cells depends on the difference in the risk of dying for the mutated phenotype compared with the wild-type phenotype ("mutated death delta"), with almost no effect when this difference is <0.2, which is the range used in the simulations. A plot of the effects as this difference is varied is attached as SI Fig. 8. Even at higher values of delta, there is very little effect as long as at least 10–20% of the sample is drawn from the entire population, mixed with the dying cells.
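The reasoning about mixed samples can be illustrated with a simple calculation. This sketch assumes a sample is a weighted mixture of cells drawn from the whole population and cells drawn from those dying in the current step, and that mutant and wild-type cells die with fixed per-step probabilities; the function and parameter names are hypothetical.

```python
def mutant_fraction_among_dying(f_mut, d_wt, d_mut):
    """Fraction of dying cells that are mutant, given the mutant fraction
    in the population and per-step death probabilities."""
    dying_mut = f_mut * d_mut
    dying_wt = (1.0 - f_mut) * d_wt
    total = dying_mut + dying_wt
    return dying_mut / total if total > 0 else 0.0

def sampled_mutant_fraction(f_mut, d_wt, d_mut, whole_population_share):
    """Mutant fraction seen in a sample that mixes whole-population cells
    (weight whole_population_share) with dying cells (the remainder)."""
    f_dying = mutant_fraction_among_dying(f_mut, d_wt, d_mut)
    w = whole_population_share
    return w * f_mut + (1.0 - w) * f_dying
```

When the two death probabilities are equal, the sampled fraction equals the true population fraction for any mixture; the bias grows with the difference between them and shrinks as the whole-population share increases.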

The software for this model is available through the model repository of the Harvard Integrative Cancer Biology CViT program (http://genecube.med.yale.edu/080/montebello) (D.T. and J.C.).

1. Tilman D, May RM, Lehman CL, Nowak MA (1994) Nature 371:65-66.