Author manuscript; available in PMC: 2015 Oct 1.
Published in final edited form as: J Theor Biol. 2014 Apr 13;355:170–184. doi: 10.1016/j.jtbi.2014.02.042

Multifocality and recurrence risk: a quantitative model of field cancerization

Jasmine Foo 1,*, Kevin Leder 2,*, Marc D Ryser 3,
PMCID: PMC4589890  NIHMSID: NIHMS585876  PMID: 24735903

Abstract

Primary tumors often emerge within genetically altered fields of premalignant cells that appear histologically normal but have a high chance of progression to malignancy. Clinical observations have suggested that these premalignant fields pose high risks for emergence of recurrent tumors if left behind after surgical removal of the primary tumor. In this work, we develop a spatio-temporal stochastic model of epithelial carcinogenesis, combining cellular dynamics with a general framework for multi-stage genetic progression to cancer. Using the model, we investigate how various properties of the premalignant fields depend on microscopic cellular properties of the tissue. In particular, we provide analytic results for the size-distribution of the histologically undetectable premalignant fields at the time of diagnosis, and investigate how the extent and geometry of these fields depend upon key groups of parameters associated with the tissue and genetic pathways. We also derive analytical results for the relative risks of local vs distant secondary tumors for different parameter regimes, a critical aspect for the optimal choice of post-operative therapy in carcinoma patients. This study contributes to a growing literature seeking to obtain a quantitative understanding of the spatial dynamics in cancer initiation.

1 Introduction

The term ‘field cancerization’ refers to the clinical observation that certain regions of epithelial tissue have an increased risk for the development of multiple synchronous or metachronous primary tumors. This term originated in 1953 from repeated observations by Slaughter and colleagues of multiple primary oral squamous cell cancers and local recurrences within a single region of tissue [1]. The phenomenon, also known as the ‘cancer field effect’, has been documented in many organ systems including head and neck (oral cavity, oropharynx, and larynx), lung, vulva, esophagus, cervix, breast, skin, colon, and bladder [2]. Although the exact underlying mechanisms of the field effect in cancer are not fully understood, recent molecular genetic studies suggest a carcinogenesis model in which clonal expansion of genetically altered cells (possibly with growth advantages) drives the formation of a premalignant field [2, 3]. This premalignant field, which may develop in the form of one or more expanding patches, forms fertile ground for subsequent genetic transformation events, leading to intermediate cancer fields and eventually clonally diverging neoplastic growths. The presence of such premalignant fields poses a significant risk for cancer recurrence and progression even after removal of primary tumors. Importantly, these fields with genetically altered cells often appear histologically normal and are difficult to detect; thus, mathematical models to predict the extent and evolution of these fields may be useful in guiding treatment and prognosis prediction.

In this work we utilize a stochastic evolutionary framework to model the cancer field effect. Our model combines spatial cellular reproduction and death dynamics in an epithelial tissue with a general framework for multi-stage genetic progression to cancer. Using this model, we investigate how microscopic cellular properties of the tissue (e.g. tissue renewal rate, mutation rate, selection advantages conferred by genetic events leading to cancer, etc.) impact the process of field cancerization in a tissue. We develop methods to characterize the waiting time until emergence of second field tumors and the recurrence risk after tumor resection. In addition, we study the clonal relatedness of recurrent tumors to primary tumors by assessing whether local field recurrences (second field tumors) are more likely than distant field recurrences (second primary tumors). The key results of our study are summarized as follows. (i) We provide analytic results for the size-distribution of the histologically undetectable pre-cancerous fields at the time of diagnosis. (ii) We investigate how the extent and geometry of these fields depend upon a key meta-parameter of the system, Γ, which is defined through a specific relationship between kinetic parameters of the tissue and genetic pathways. (iii) We derive analytical results for the relative risks of local vs distant secondary tumors for different parameter regimes. These types of predictions are important in clinical practice. For example, they help determine the optimal size of excision margins at the time of surgery, and the appropriate choice of post-operative therapy (which may depend on the type of recurrence expected).

The methodology developed in this work is generally applicable to early carcinogenesis in epithelial cancers, and contributes to a growing literature on the evolutionary dynamics of cancer initiation, see e.g. [4–13]. Since our work is concerned with analyzing spatial premalignant field geometries during the genetic progression to cancer, here we briefly describe some existing mathematical models of the stochastic evolutionary process of cancer initiation from spatially structured tissue, e.g. [14–19]. In 1977, Williams and Bjerknes proposed a spatial Moran model of clonal expansion in epithelial tissue [16] in which cells divide according to fitness and replace a neighboring cell at random on the rectangular lattice. This model is closely related to the biased voter model from particle systems theory [20], and in [21, 22] the growth properties and asymptotic shape of the process were established. However, this model did not incorporate the possibility of mutations occurring to produce new types in the population. In [14] Komarova proposed a 1D model incorporating mutations with fitness advantages, where cells were allowed to divide in response to the death of a neighboring cell, in contrast to the models mentioned previously. It was shown that the probability of mutant fixation and the time to obtain two-hit mutants differ from the well-mixed setting. Later, in [17, 18] this model was extended to incorporate motility, and the relationships between migration, mutation, selection and invasion in a spatial stochastic evolutionary model were explored. In [19] the voter model considered in [16] was generalized to incorporate neutral mutations, and the waiting time to produce two-hit mutants was studied in a general dimension setting. Martens and colleagues considered a similar model of mutation accumulation on a discrete-time hexagonal lattice, and studied the speed of population adaptation [23, 24]. More recently, Antal and colleagues considered a stochastic spatial model of cancer progression where cells acquire successive fitness advantages along the edge of the tumor. In the context of this model they studied the shape of the evolving tumor front as well as the number of mutations acquired in the tumor [25]. We have also recently studied the accumulation and spread rates of advantageous mutant clones in a spatially structured population of general dimension [26]. Finally, we note that there have also been some studies mathematically modeling the growth of pre-cancerous cells via growth factors during early carcinogenesis utilizing reaction-diffusion systems, e.g. [27].

Most of the evolutionary models proposed in the field utilize similar descriptions of the fundamental processes of birth, selection, mutation and death in a spatially structured population (modulo the occasional minor differences in lattice structure and the structure of reproduction update rules). However, the studies described above have been aimed at studying the rates of invasion, adaptation, and mutation accumulation in these populations. In contrast, in this study we obtain analytical results for the spatial and temporal dynamics of premalignant fields during carcinogenesis. We consider a generalized spatial Moran process in which cells can acquire successive random mutations which confer selective advantages, reproduction occurs at rates proportional to cellular fitness, and reproduction results in neighbor replacement at random. We analyze this fundamental evolutionary model to quantify how field cancerization dynamics and recurrence risks depend on the kinetic parameters of the tissue and genetic progression pathway to cancer. To the best of our knowledge, this is the first evolutionary modeling effort aimed at mathematically predicting the cancer field effect and its consequences.

The article is organized as follows: in section 2 we introduce the stochastic mathematical model and describe basic properties regarding the survival and growth rate of mutant clones. Using previously derived results on the spread of mutant clones, we introduce a mesoscopic approximation to the model. In section 3 we analyze the model to investigate the characteristics and extent of local and distant premalignant fields at the time of initiation. In particular, we determine how the spatial geometry of the field (e.g. number and size of lesions) depends on cellular and tissue properties such as mutation rate, tissue renewal rate and mutational fitness advantages. In section 4 we analyze the model to understand the risk of recurrence due to local or distant field malignancies, as a function of time and cellular parameters.

Throughout the paper we will use the following notation for the asymptotic behavior of positive functions,

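$$
\begin{aligned}
f(t) \sim g(t) \quad &\text{if } f(t)/g(t) \to 1 \text{ as } t \to \infty,\\
f(t) \ll g(t) \quad &\text{if } f(t)/g(t) \to 0 \text{ as } t \to \infty,\\
f(t) \gg g(t) \quad &\text{if } f(t)/g(t) \to \infty \text{ as } t \to \infty.
\end{aligned}
$$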

Finally, we use the notation $X \stackrel{d}{=} F$ to denote that the random variable $X$ has distribution $F$.

2 Mathematical framework and basic properties

Cancer initiation is associated with the accumulation of multiple successive genetic or epigenetic alterations to a cell [28]. A subset of these genetic events may give rise to a fitness advantage (i.e. an increase in reproductive rate of the cell or avoidance of apoptotic signals), and subsequently lead to a clonal expansion within the tissue. These expanding mutant cell populations form the background for further independent genetic events which eventually lead to carcinogenesis. As a result of this spatial evolutionary process, by the time of cancer initiation or diagnosis the tissue field surrounding a tumor can be composed of genetically distinct premalignant lesions of various sizes and stages.

2.1 Cell-based model

To study the dynamics of this process, we consider a stochastic model which describes the accumulation and spread of a clone of cells with genetic alterations throughout a spatially structured tissue (e.g. stratified epithelium). Thus, we consider the model on a regular lattice $\mathbb{Z}^d \cap [-L/2, L/2]^d$, where $L > 0$ and $d$ is the number of spatial dimensions of the tissue. Each location in the lattice is occupied by a single cell, and each cell reproduces at a rate according to its fitness with exponential waiting times. Whenever a cell reproduces, its offspring replaces one of its $2d$ lattice neighbors at random, see Figure 1A. The type of each cell corresponds to its fitness, which is related to the number of genetic hits a cell has accumulated in a multi-step genetic model of cancer initiation. For example, type-0 cells have fitness normalized to 1 and are labeled as wild-type or normal (with no mutations). Initially our entire lattice is occupied by type-0 cells. Type-0 cells acquire the first mutation at rate $u_1$ to become type-1 cells. The type-1 cell will have a relative fitness advantage to type-0 cells, given by $1 + s_1$, for some constant $s_1 \geq 0$. In general, type-$i$ cells have a fitness advantage of $1 + s_i$ relative to type-$(i-1)$ cells, and they acquire the $(i+1)$-th mutation in the sequence at rate $u_{i+1}$ to become type-$(i+1)$ cells. The process is stopped when a cell develops $k$ mutations; we call this the time of cancer initiation. The number of mutations $k$ as well as the parameters $u_i$, $s_i$ for $i = 1, \ldots, k$ depend on the specific cancer type. Although many (epi)genetic events are selectively disadvantageous (i.e. they confer a selective disadvantage $s_i < 0$), the progeny of deleterious mutants die out quickly, so here we restrict our attention to the case $s_i \geq 0$. Note that this process can be thought of as a spatial version of the Moran process, a spatially well-mixed population model that is commonly used to describe carcinogenesis (e.g. see [8–12]). In addition, the spatial reproduction and death dynamics of this model (without mutation) correspond to the biased voter process, which has been well studied in the physics and probability literature. In fact, a similar voter model approach was previously used to model cellular dynamics in epithelial tissue and found to correlate well with experimental predictions of clone size distribution in the mouse epithelium [29].
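The update rule described above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' simulation code: it assumes d = 2, a single type-1 clone with no further mutation, and periodic boundaries, and the function name `simulate_biased_voter` and its arguments are hypothetical. A reproducing cell is chosen with probability proportional to its fitness (via rejection sampling) and its offspring overwrites one of the four lattice neighbors chosen uniformly at random.

```python
import random


def simulate_biased_voter(L=21, s=0.2, n_events=2000, seed=0):
    """Spatial Moran (biased voter) dynamics on an L x L periodic lattice.

    Type-0 cells have fitness 1, type-1 cells have fitness 1 + s.
    Starts from a single type-1 cell at the center; no mutation.
    Returns the final grid, the number of type-1 cells, and the elapsed time.
    """
    rng = random.Random(seed)
    grid = [[0] * L for _ in range(L)]
    grid[L // 2][L // 2] = 1
    n_mutant = 1
    t = 0.0
    neighbors = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    for _ in range(n_events):
        if n_mutant == 0 or n_mutant == L * L:
            break  # clone extinct or fixed: nothing can change any more
        # Total reproduction rate of the population (sum of all fitnesses).
        total_rate = (L * L - n_mutant) + (1.0 + s) * n_mutant
        t += rng.expovariate(total_rate)
        # Pick the reproducing cell with probability proportional to fitness:
        # propose a uniform site, accept with probability fitness / (1 + s).
        while True:
            i, j = rng.randrange(L), rng.randrange(L)
            fitness = 1.0 + s if grid[i][j] == 1 else 1.0
            if rng.random() * (1.0 + s) < fitness:
                break
        # The offspring replaces a uniformly chosen nearest neighbor.
        di, dj = rng.choice(neighbors)
        ni, nj = (i + di) % L, (j + dj) % L
        n_mutant += grid[i][j] - grid[ni][nj]
        grid[ni][nj] = grid[i][j]
    return grid, n_mutant, t


if __name__ == "__main__":
    grid, n_mutant, t = simulate_biased_voter()
    print(f"type-1 cells: {n_mutant}, elapsed time: {t:.2f}")
```

Most reproduction events happen between cells of the same type and leave the configuration unchanged; the clone size only moves when the event straddles a type-0/type-1 boundary.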

Figure 1. Lattice dynamics.

Figure 1

(A) Schematic of spatial Moran model in d = 2: each cell divides at rate according to its fitness and replaces one of its 2d neighbors: if the light blue cell divides, its offspring replaces one of the dark blue neighbors, chosen uniformly at random. Every lattice site is occupied at all times (not shown). (B) Simulation example of the model: growth of an advantageous clone (light blue) starting from one cell with fitness advantage s = 0.2 over the surrounding field (dark blue).

The total number of cells in the fixed-size population is $N \equiv L^d$; in most cancer initiation settings this number is quite large (at least $10^6$), while mutation rates are quite small (orders of magnitude smaller than 1). Therefore we will, unless stated otherwise, restrict our analysis to regimes where $L \gg 1$ and $u_i \ll 1$. In Section 2.3, we will briefly discuss the specific conditions that we impose on the relationship between these parameters. For mathematical simplicity, the lattice is equipped with periodic boundary conditions; however, in most relevant biological situations the domain size (i.e. cell number) is sufficiently large that boundary effects are negligible.

Note on dimension of the model

We analyze the general model in space dimensions d = 1, 2, 3. While all epithelial tissues have an intrinsically three-dimensional architecture, in some situations considering d = 1, 2 may be a good approximation. For example, cancer initiation in mammary ducts of the breast, renal tubules of the kidney, and bronchi tubes of the lung could be viewed as approximately one-dimensional processes, due to the aspect ratio of tube radius versus length. On the other hand, cancer initiation in the squamous epithelium of the cervix, the bladder or the oral cavity can be viewed as a two-dimensional process, since initiation occurs in the basal layer of the epithelium which is only 1–2 cells thick (see e.g. Figure 2). The validity of such approximations poses an interesting problem in itself, but will not be addressed in this work.

Figure 2. Geometry of squamous epithelium.

Figure 2

A Basal layer (vertical perspective) before initiation with local field (left), and after initiation where the tumor is growing within the local field (right). B Sideways view of the fields before and after initiation, along the dashed lines in panel A. The proliferative cells inhabiting the two-dimensional lattice in the model reside in the basal layer of the epithelium.

2.2 Survival and growth of a single mutant clone

We first establish some basic behaviors of mutant cells and their clonal progeny within a tissue. Of particular interest are: (i) the survival probability of a mutant clone, and (ii) the rate of spatial expansion of the mutant clone through the tissue. In particular, how are these characteristics influenced by tissue parameters and the cellular fitness advantage conferred by a mutation? We have addressed some of these questions in a previous work [26] and restate the results here to make the paper self-contained. In addition, we perform new simulations in this work to fill in gaps where theoretical results are currently not available.

Consider the probability that a mutant cell survives to form a viable clone (i.e. does not die out due to demographic stochasticity). Let type-1 cells have fitness 1 + s and type-0 cells have fitness 1, and let ϕt(x) denote the type of cell at site x in the lattice at time t. Define

$$\xi_t \equiv \{x \in \mathbb{Z}^d \cap [-L/2, L/2]^d : \phi_t(x) = 1\}.$$

In other words, ξt is the set of all type-1 cell locations at time t. We initiate the model with a single type-1 cell at the origin surrounded by type-0 cells in all other locations:

$$\phi_0(x) = \begin{cases} 1, & x = 0 \\ 0, & \text{otherwise,} \end{cases}$$

and assume no further mutations are possible ($u_i = 0$). This simplified model is known as the Williams-Bjerknes model [16], and if $L = \infty$ then it corresponds to the biased voter model, see e.g. [30]. Let $|\xi_t|$ denote the number of type-1 cells in the model at time $t$. Then we can define the extinction time of the process $T_0 \equiv \inf\{t > 0 : |\xi_t| = 0\}$. The probability of survival of a single mutant clone with selective advantage $s$ over the surrounding cells is then the probability of the event $\{T_0 = \infty\}$. By looking at the process $|\xi_t|$ only at its jump times, we note that the embedded process is a discrete-time random walk that moves one up with probability $(1+s)/(2+s)$ or one down with probability $1/(2+s)$. This can be seen by observing that the process only changes at boundaries between type-0 and type-1 cells, and the only possible resulting events are that the type-0 cell gets replaced by a type-1 cell, at rate proportional to $1+s$ (resulting in a jump up in $|\xi_t|$), or the type-1 cell gets replaced by a type-0 cell, at rate proportional to 1 (resulting in a jump down in $|\xi_t|$). The overall survival probability of this random walk can then be calculated using elementary results for random walks, see Example 1.43 in [31],

$$P(T_0 = \infty) = \frac{s}{1+s} \approx s,$$

where the approximation is valid for $s \ll 1$. Thus, the probability that a mutant clone with fitness advantage $s$ survives is $s/(1+s)$, and is independent of the dimension of the tissue.
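As a numerical check of the survival probability $s/(1+s)$, the sketch below treats the clone size as a biased nearest-neighbor random walk absorbed at 0. It assumes the walk steps up with probability $(1+s)/(2+s)$ and down with probability $1/(2+s)$, which follows from type-1 cells reproducing at rate $1+s$ and type-0 cells at rate 1 across a boundary. The function names and the finite `target` size used as a stand-in for "never goes extinct" are illustrative choices, not part of the paper.

```python
import random


def escape_probability(s, k=1, target=1000):
    """Exact gambler's-ruin probability that the walk started at size k
    reaches `target` before hitting 0 (ratio of down to up steps is 1/(1+s))."""
    r = 1.0 / (1.0 + s)
    return (1.0 - r ** k) / (1.0 - r ** target)


def simulate_survival(s, target=100, n_runs=5000, seed=0):
    """Monte Carlo estimate of the same probability for a clone of size 1."""
    rng = random.Random(seed)
    p_up = (1.0 + s) / (2.0 + s)
    survived = 0
    for _ in range(n_runs):
        size = 1
        while 0 < size < target:
            size += 1 if rng.random() < p_up else -1
        if size == target:
            survived += 1
    return survived / n_runs


if __name__ == "__main__":
    s = 0.2
    print("s / (1 + s)       :", s / (1.0 + s))
    print("gambler's ruin    :", escape_probability(s))
    print("Monte Carlo       :", simulate_survival(s))
```

For a clone that has already reached size k, the same gambler's-ruin formula gives a survival probability of $1 - (1+s)^{-k}$, which is close to 1 once $k \gg 1/s$.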

To understand how the expansion rate of a mutant clone depends on the selection strength $s$ of the mutant, we first recall a result by Bramson and Griffeath [21, 22], which establishes an asymptotic shape for the type-1 clone. More precisely, the Bramson-Griffeath shape theorem says that conditional on the clone never going extinct, the clone has a convex, symmetric asymptotic shape $D$ whose radius expands linearly. In a previous work, we studied how this linear rate of expansion depends on the selection strength $s$ in the setting of weak selection, see Theorem 1 of [26]. We found that if we denote by $e_1$ the first unit vector in $\mathbb{R}^d$ and define the growth rate $c_d(s)$ such that

$$D \cap \{z e_1 : z \in \mathbb{R}\} = [-c_d(s), c_d(s)],$$

then as s → 0,

$$c_d(s) \sim \begin{cases} s & d = 1 \\ \sqrt{4\pi s/\log(1/s)} & d = 2 \\ \sqrt{4\beta_3 s} & d = 3, \end{cases} \tag{1}$$

where β3 is the probability that two simple random walks started at 0 and e1 = (1, 0, 0) never hit. In other words, the radius of the asymptotic shape D approximating the type-1 clone grows linearly with rate on the order of cd(s).

The previous results hold only in the regime of weak selection, i.e. small $s$. For larger values of the selective advantage $s$, simulations can be used to obtain $c_d(s)$ for d = 2, 3 (in d = 1 the process can be analyzed directly through simple random walk analysis and we obtain $c_1(s) = s$). For example, Figure 3 shows that the $s$-dependence of the growth rate is approximately linear for $s > 0.5$; in this case simple regression yields the estimate $c_2(s) \approx 0.6s + 0.22$ ($s > 0.5$). Thus, a combination of analysis and simulation gives us a complete picture of how the spatial expansion rate of mutant clones in a tissue depends upon the selective advantage $s$ for a wide range of selection strengths.

Figure 3. Simulations of clonal expansion rate for large s.

Figure 3

Dependence of the growth rate c2 on the fitness advantage s. Statistics performed on M = 100 samples for each s-value. The error bars represent 95% confidence intervals.

2.3 Approximating with a hybrid mesoscopic model

Our results regarding the survival and growth of a single mutant clone suggest a hybrid mesoscopic model simplification that enables our analysis of the field cancerization process. In particular, each successful mutant clone can be well-approximated as a growing $d$-dimensional ball with expansion rate $c_d(s)$ as calculated in the previous section. Before proceeding however, let us clarify the notion of clone ‘survival’ a.k.a. ‘success’ in the full model, where multiple mutations can arise and compete in the same finite domain. In particular, we consider a mutant clone with selective advantage $s$ over the background to be successful if it reaches size $\gg 1/s$. This criterion guarantees a negligible chance of extinction in an infinite domain with no interference. In particular, if we start with a single type-1 cell with selective advantage $s$ in a sea of type-0 cells, and if we define $T_0$ to be the extinction time of the type-1 progeny, one can use the embedded discrete-time process and standard results on biased random walks [31] to show that if the progeny reaches size $k \gg 1/s$, then $P(T_0 = \infty \mid |\xi_0| = k) \approx 1 - e^{-ks}$.

Consider the fate of an unsuccessful type-1 clone arising on a background of type-0 cells. The clone evolves as a supercritical ($s > 0$) biased voter model conditioned on extinction. In [26] we showed that unsuccessful type-1 mutations typically die out by a time of order

$$\ell(s) = \begin{cases} s^{-2} & d = 1, \\ s^{-1}\log(1/s) & d = 2, \\ s^{-1} & d = 3. \end{cases} \tag{2}$$

As seen in the previous section, the survival probability in the biased voter model (starting with a single type 1 cell in a sea of type 0 cells) is s/(1 + s), but in the more complex spatial Moran model with the possibility of multiple interacting type 1 clones, it is not immediately clear that this survival probability is still given by s/(1 + s). However, it was shown in [26] that the above survival probability remains a good approximation as long as

$$(A0) \qquad 1/u_1 \gg \ell(s)^{(d+2)/2}. \tag{3}$$

If the total number of type-1 cells is always a negligible fraction of $N$ and (A0) holds, then successful type-1 mutations arrive as a Poisson arrival process with approximate rate $N u_1 s/(s+1)$, where $N$ is the total number of cells in the tissue. In particular, these conditions hold for biologically reasonable parameter sets, such as the ones used for the numerical examples in this article.

We are now ready to introduce a hybrid mesoscopic model approximation as follows: Type-1 mutations arrive in the healthy tissue as a Poisson arrival process with rate Nu1, distributed uniformly at random in the spatial domain. Each mutation event has two potential outcomes:

  • with probability s/(1 + s), the mutation is successful and we approximate the subsequent clonal expansion with a ball whose radius grows deterministically. The macroscopic growth rate is cd(s), which was derived from individual cellular growth kinetics as described in section 2.2. As a representative simulation in figure 1B suggests, the ball in standard L2-norm in ℝd will be utilized.

  • with probability 1/(1+s), the mutation is unsuccessful, and the clone evolves according to the full stochastic (cellular-level) model dynamics conditioned on extinction.

Note that the remainder of the paper discusses properties of this mesoscopic model.

It will be useful to define γd as the volume of a ball of radius 1 in d dimensions,

$$\gamma_1 = 2, \qquad \gamma_2 = \pi, \qquad \gamma_3 = 4\pi/3.$$

Note that although the stochastic fluctuations of the shape of expanding clones are lost in this approximation, one gains generality since the mesoscopic model can approximate a whole class of microscopic models that admit a shape result.

2.4 Cancer initiation behavior

Although the methodology developed in this work can be generalized to the setting of k-mutation carcinogenesis models, we will consider for simplicity the classic two-mutation model of cancer initiation first introduced by Knudson [32]. Here, type-0 cells are wild-type with fitness 1, type-1 cells are premalignant with fitness 1 + s1 relative to type-0 cells, and type-2 cells are initiated cancer cells with fitness 1 + s2 relative to type-1 cells. The time of cancer initiation σ2 is defined as the time at which the first successful type-2 cell arrives. In [26], we studied the situation where s1 = s2 = s > 0 and found that the timing of cancer initiation is strongly governed by the limiting value of the following meta-parameter:

$$\Gamma \equiv (N u_1 s)^{d+1} \left(c_d^d\, u_2 s\right)^{-1}.$$

Roughly speaking, $\Gamma^{1/(d+1)}$ represents the ratio of the rate of producing successful type-1 cells to the subsequent time it takes to acquire the first successful type-2. We found that both the mechanisms and distribution of the cancer initiation time vary significantly depending on the regime of Γ:

  • Regime 1 (R1): When Γ < 1, the first successful type-2 mutation occurs within the expanding clone of the first successful type-1 mutation (left panel of Figure 4). The initiation time σ2 is exponential and does not depend on the spatial dimension.

  • Regime 2 (R2): For Γ ∈ (10, 100), the first successful type-2 mutation occurs within one of several successful type-1 clones (middle panel in Figure 4). The initiation time is no longer exponential and depends explicitly upon the spatial dimension.

  • Regime 3 (R3): When Γ > 1000, the first successful type-2 mutation occurs after many successful type-1 mutations have occurred (right panel of Figure 4). The first successful type-2 can arise from either a successful or an unsuccessful type-1 family; the initiation time represents a mixture distribution of these two events.

  • Note that for Γ ∈ [1, 10] and Γ ∈ [100, 1000] we say that we are in borderline regimes R1/R2 and R2/R3 respectively.
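The regime boundaries listed above can be turned into a small helper. The sketch below evaluates Γ from the definition given in the text and maps it to a regime label; the function names `gamma_meta` and `regime` are hypothetical, and the parameter values are the ones quoted in the Figure 5 caption (d = 2, N = 2 · 10^5, s = 0.1, u2 = 2 · 10^-5, c2 = 0.16).

```python
def gamma_meta(N, u1, u2, s, c_d, d):
    """Meta-parameter Gamma = (N u1 s)^(d+1) / (c_d^d u2 s), as defined in the text."""
    return (N * u1 * s) ** (d + 1) / (c_d ** d * u2 * s)


def regime(gamma):
    """Map Gamma to the regime labels used in the text (borderline ranges included)."""
    if gamma < 1:
        return "R1"
    if gamma <= 10:
        return "R1/R2"
    if gamma < 100:
        return "R2"
    if gamma <= 1000:
        return "R2/R3"
    return "R3"


if __name__ == "__main__":
    # Parameter values taken from the Figure 5 caption; only u1 changes.
    for u1 in (7.5e-8, 7.5e-7, 7.5e-6):
        g = gamma_meta(N=2e5, u1=u1, u2=2e-5, s=0.1, c_d=0.16, d=2)
        print(f"u1 = {u1:.1e}: Gamma = {g:.3g} -> {regime(g)}")
```

With these inputs the three values of u1 fall in R1, R2 and R3 respectively. The numerical values of Γ come out slightly above those quoted in the Figure 4 caption; substituting s/(1 + s) for s brings them to roughly 0.054, 54.5 and 5.45 × 10^4, which suggests, though the text does not state it, that the caption values were computed with the survival probability in place of s.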

Figure 4. The three dynamic regimes.

Figure 4

Regime 1: first successful type-2 cell (arrow) arises in the first premalignant clone, Γ = 0.055. Regime 2: several premalignant clones are present at the time of the first successful type-2 cell, Γ = 54.47. Regime 3: a large number of small premalignant clones are present by the time of the first successful type-2 cell, Γ = $5.45 \times 10^4$. Simulations obtained with parameter values as in Figure 5.

We refer the reader to [26] for mathematical details of these statements. Note that these ‘regimes’ can be thought of as labels highlighting distinct types of initiation behaviors that arise as Γ changes. In fact the system behavior continuously varies through the parameter space, and borderline cases between these regimes do exist. Figure 5 shows how the distribution of the waiting time σ2 varies with changing number of cells N in d = 2. We note that as N increases, the waiting time distribution shifts to the left and initiation occurs earlier. By comparing Figures 4 and 5 we see that early initiation times are associated with a diffuse premalignant field with a large number of independent lesions, whereas late initiation times are associated with a single premalignant field harboring the initiating tumor cell.

Figure 5. Waiting time until first successful type-2.

Figure 5

Cumulative distribution function (cdf) of $\sigma_2$, the waiting time until the first successful type-2 mutation, for increasing N (see (4)). Regime 1: $u_1 = 7.5 \cdot 10^{-8}$, Regime 2: $u_1 = 7.5 \cdot 10^{-7}$, Regime 3: $u_1 = 7.5 \cdot 10^{-6}$. All other parameters are fixed: $d = 2$, $N = 2 \cdot 10^5$, $s_1 = s_2 = 0.1$, $u_2 = 2 \cdot 10^{-5}$, $c_2(s_1) = 0.16$.

To briefly summarize, we have described first a microscopic model of cellular division, mutation and death within a regularly structured epithelial tissue. Analysis of the fine-scale dynamics of this model leads to a more tractable hybrid mesoscopic model which approximates the microscopic model. In the next section, we analyze this mesoscopic model to study the characteristics and extent of premalignant fields at the stochastic time of cancer initiation or diagnosis. In the analyses throughout, we will consider parameter ranges spanning all three regimes of initiation behavior; however, for simplicity in regime 3 we will restrict ourselves to the range of parameter space in which successful type-2 mutations arise from successful type-1 mutations (i.e. that do not later die out). The behavior in the final remaining portion of the parameter space in regime 3 will be the subject of further work.

3 Characterizing the premalignant field

The time between cancer initiation and diagnosis, which we label here as TD, is a subject of great interest, see e.g. [33] for a review. In general, TD is itself a random variable and may depend on the natural history of the disease until initiation. However, if we assume that TD is independent of σ2, then we can characterize the premalignant field at time of diagnosis, σ2 + TD, by means of the field characterization at time σ2, together with the distribution of the delay time TD. For this reason, even though the clinically relevant time is σ2 + TD, we focus here on characterizing the field at σ2. Note that mathematically, this requires us to condition our analyses upon observing σ2 at some time t, i.e. condition upon the event {σ2 = t}.

The starting time of the model ($t = 0$) is assumed to be at the end of tissue development and the start of the tissue renewal phase. However, for some tissues it is difficult to estimate this time, and thus it may be difficult to ascertain the system time $t$ at the time $\sigma_2$. In such cases, it is simple to adapt our analyses and treat $\sigma_2$ as an unobservable quantity, by removing the conditioning on $\{\sigma_2 = t\}$ and integrating our results against the density of $\sigma_2$, which is given by (see (24) in section 7.1 for derivation)

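$$\lambda\, e^{t\lambda(\phi(t)-1)}\left(1 - e^{-\theta t^{d+1}}\right), \tag{4}$$

where

$$\phi(t) \equiv \frac{1}{t}\int_0^t \exp\left(-\theta r^{d+1}\right) dr. \tag{5}$$

The constants in (4) and (5) are the arrival rate of successful type-1 mutations

$$\lambda \equiv N u_1 \bar{s}_1, \tag{6}$$

and

$$\theta \equiv \frac{u_2 \bar{s}_2 \gamma_d c_d^d(s_1)}{d+1}, \tag{7}$$

where we used the notation $\bar{s}_i = s_i/(1 + s_i)$.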
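The density of the initiation time can be checked against a direct simulation of the mesoscopic model. The sketch below is a hedged illustration under the same simplifications that the formula itself relies on (successful type-1 clones arrive as a Poisson process with rate λ, grow as non-overlapping balls, and the domain boundary is ignored). Each clone born at time T produces its first successful type-2 cell after a delay τ with P(τ > r) = exp(−θ r^(d+1)), and the initiation time is the minimum of T + τ over clones. The helper names `sample_sigma2` and `sigma2_cdf` are hypothetical; the parameter values are the Regime 2 values from the Figure 5 caption.

```python
import math
import random

UNIT_BALL = {1: 2.0, 2: math.pi, 3: 4.0 * math.pi / 3.0}  # gamma_d


def rates(N, u1, u2, s1, s2, c, d):
    """Return (lambda, theta): arrival rate of successful type-1 clones and
    the type-2 hazard constant u2 * s2_bar * gamma_d * c^d / (d + 1)."""
    lam = N * u1 * s1 / (1.0 + s1)
    theta = u2 * (s2 / (1.0 + s2)) * UNIT_BALL[d] * c ** d / (d + 1)
    return lam, theta


def sample_sigma2(lam, theta, d, rng):
    """Draw one initiation time: minimum over clones of (birth time + delay)."""
    t_arrival, best = 0.0, math.inf
    while True:
        t_arrival += rng.expovariate(lam)
        if t_arrival >= best:
            return best  # later clones cannot initiate before `best`
        # Inverse-transform sample of the delay: P(tau > r) = exp(-theta r^(d+1)).
        tau = (-math.log(1.0 - rng.random()) / theta) ** (1.0 / (d + 1))
        best = min(best, t_arrival + tau)


def sigma2_cdf(t, lam, theta, d, n_grid=2000):
    """P(sigma_2 <= t) = 1 - exp(-lam * int_0^t (1 - exp(-theta r^(d+1))) dr),
    with the integral evaluated by the midpoint rule."""
    h = t / n_grid
    integral = h * sum(
        1.0 - math.exp(-theta * ((k + 0.5) * h) ** (d + 1)) for k in range(n_grid)
    )
    return 1.0 - math.exp(-lam * integral)


if __name__ == "__main__":
    lam, theta = rates(N=2e5, u1=7.5e-7, u2=2e-5, s1=0.1, s2=0.1, c=0.16, d=2)
    rng = random.Random(1)
    samples = sorted(sample_sigma2(lam, theta, 2, rng) for _ in range(4000))
    median = samples[len(samples) // 2]
    print("empirical median:", round(median, 1))
    print("analytic cdf at that time:", round(sigma2_cdf(median, lam, theta, 2), 3))
```

Differentiating the cdf in `sigma2_cdf` with respect to t recovers the density stated above, so the analytic cdf evaluated at the empirical median should be close to 0.5.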

3.1 Size of the local field at initiation

We are first interested in characterizing the size of the local field, i.e. the region of the premalignant type-1 clone that gives rise to the first successful type-2 clone (see Figure 6). Following the nomenclature of [34], we note the distinction between two different types of recurrent tumors: if the recurrence arises from a transformed cell in the premalignant field that gave rise to the primary tumor, the recurrence is called a second field tumor, see Figure 6A. On the other hand, if the recurrence arises from a premalignant field that is clonally unrelated to the primary malignancy, it is called a second primary tumor, see Figure 6B. These two types of recurrent tumors vary in terms of their degree of clonal relatedness to the primary tumor, and this may have some implications for treatment strategies in primary vs. recurrent tumors.

Figure 6. Local and distant recurrences.

Figure 6

Local (blue) and distant (green) premalignant fields give rise to second field tumors and second primary tumors (both red), respectively. In scenario A, there is only one premalignant field (the local field) present at time of cancer initiation (middle panel), and the recurrence occurs inside the local field. In scenario B, two unrelated precancerous fields are present at time of initiation (middle panel), and the recurrence may occur as a second primary tumor in the distant field.

We now define $R_l(t)$ to be the radius of the local field at time $t$, and $X_l(t)$ its corresponding area ($X_l = \gamma_d R_l^d$). Note that we will use the terminology ‘area’ to describe clone sizes in all dimensions, and reserve the use of the term ‘volume’ for space-time quantities. In the following, we are interested in determining the distributions of these two quantities at time $\sigma_2$, conditioned on the event $\{\sigma_2 = t\}$. In other words, we are looking for the distributions of $(R_l(\sigma_2) \mid \sigma_2 = t)$ and $(X_l(\sigma_2) \mid \sigma_2 = t)$, respectively.

At any given time, each clone produces initiating mutations at a rate proportional to its area. Hence the probability that clone $i$ (born at time $T_i$) gives rise to the initiating mutation at time $t$ is given by the ratio of clone $i$’s own area,

$$X_i(t) \equiv \gamma_d c_d^d(s_1)\,(t - T_i)^d,$$

to the total area of type-1 clones present. In other words, the size distribution of the initiating clone is given by the distribution of a size-biased pick from the different clones present at the time the initiating mutation arises.

Definition 3.1 (Size-biased pick)

Let $L_1, \ldots, L_n$ be a family of $n$ random variables. A size-biased pick from $L_1, \ldots, L_n$ is defined as a random variable $L_{[1]}$ with conditional probability distribution

$$P(L_{[1]} = L_i \mid L_1, \ldots, L_n) = \frac{L_i}{\sum_{j=1}^{n} L_j}.$$
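Definition 3.1 can be illustrated with a minimal sketch. The function name `size_biased_pick` and the list-based interface are assumptions made here for illustration and are not part of the original analysis.

```python
import random


def size_biased_pick(sizes, rng=random):
    """Return index i with probability sizes[i] / sum(sizes) (Definition 3.1).

    `sizes` stands in for the clone areas L_1, ..., L_n; the name and the
    interface are hypothetical.
    """
    total = sum(sizes)
    if total <= 0:
        raise ValueError("sizes must have a positive sum")
    u = rng.random() * total      # uniform point on [0, total)
    running = 0.0
    for i, size in enumerate(sizes):
        running += size
        if u < running:           # u fell into the segment owned by clone i
            return i
    return len(sizes) - 1         # guard against floating-point round-off
```

With areas `[1.0, 3.0]`, the second clone should be selected in roughly three out of four draws, which mirrors the statement that larger clones are more likely to be the local field.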

The following theorem is the main result of this section and characterizes the size-distribution of the local field at the time of initiation. This is recognized as a size-biased pick from the clones present at time t, conditioned on the event {σ2 = t}.

Theorem 3.2

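The distribution of the area of the local field at time $\sigma_2$, conditioned on $\{\sigma_2 \in dt\}$, is given by

$$\hat{P}(X_l(\sigma_2) \in dx) = \hat{P}(X_{[1]} \in dx) = \frac{u_2 \bar{s}_2\, x^{1/d}}{d\,\gamma_d^{1/d}\, c_d(s_1)\left(1 - e^{-\theta t^{d+1}}\right)} \exp\left[-\frac{u_2 \bar{s}_2\, x^{(d+1)/d}}{(d+1)\,\gamma_d^{1/d}\, c_d(s_1)}\right], \tag{8}$$

for $x \in [0, \gamma_d c_d^d(s_1)\, t^d]$.

The proof of this result is found in section 7.1, and the distribution of the local field radius follows easily as

$$\hat{P}(R_l(\sigma_2) \in dr) = \frac{u_2 \bar{s}_2\, \gamma_d\, r^{d}}{c_d(s_1)\left(1 - e^{-\theta t^{d+1}}\right)} \exp\left[-\frac{u_2 \bar{s}_2\, \gamma_d\, r^{d+1}}{c_d(s_1)\,(d+1)}\right], \tag{9}$$

for $r \in [0, c_d(s_1)\, t]$.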
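As a sanity check on (9), the following sketch evaluates the radius density and integrates it numerically over its support. It assumes $\theta = u_2 \bar{s}_2 \gamma_d c_d^d(s_1)/(d+1)$, as can be read off from the exponent in (20); the function names, the midpoint rule, and the choice $\gamma_2 = \pi$ (a Euclidean disc) in the usage example are illustrative assumptions.

```python
import math


def local_radius_density(r, t, d, u2, s2_bar, gamma_d, c1):
    """Density (9) of the local field radius, conditioned on sigma_2 = t.

    c1 stands for c_d(s_1); theta is assumed to equal
    u2 * s2_bar * gamma_d * c1**d / (d + 1).
    """
    if r < 0 or r > c1 * t:
        return 0.0                                   # outside the finite support
    theta = u2 * s2_bar * gamma_d * c1 ** d / (d + 1)
    norm = 1.0 - math.exp(-theta * t ** (d + 1))
    rate = u2 * s2_bar * gamma_d / c1
    return rate * r ** d * math.exp(-rate * r ** (d + 1) / (d + 1)) / norm


def integrate(f, a, b, n=20000):
    """Plain midpoint rule; adequate for this smooth integrand."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))
```

For example, with the illustrative values `t=281`, `d=2`, `u2=2e-5`, `s2_bar=0.1/1.1`, `gamma_d=math.pi` and `c1=0.16`, integrating the density over `[0, c1 * t]` should return a value very close to 1, which is consistent with the normalizing factor $1 - e^{-\theta t^{d+1}}$ in (9).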

Note that the distribution of the local field size (8) depends on the rate of successful mutations $u_2 \bar{s}_2$ and the growth rate $c_d(s_1)$, but is independent of λ, the arrival rate of type-1 mutations. In Figure 7A, we show how the distribution of the local field area (8) changes with arrival time of the first successful type-2 clone. As expected, the support of the distribution increases with increasing initiation time, and hence the likelihood of having a large local field increases substantially. This suggests that tumors appearing later have a higher recurrence probability if only the malignant portion is removed during surgery. The finite support of each probability density function reflects the fact that there is a hard upper bound on the size of a premalignant field at finite time t in the system.

Figure 7. Size-distribution of local field.

Figure 7

The size-distribution (8) of the local field is shown for different scenarios, corresponding to different Γ-values and regimes R1, R2 and R3 as explained in Section 2.4. A For varying arrival times $t$; B for varying type-1 mutation rates $u_1$; C for varying type-2 mutation rates $u_2$; D for varying type-1 fitness advantages $s_1$. The non-varying parameters are held constant at $d = 2$, $N = 2 \cdot 10^{5}$, $u_1 = 7.5 \cdot 10^{-7}$, $u_2 = 2 \cdot 10^{-5}$, $s_1 = s_2 = 0.1$ and $c_2(s_1) = 0.16$.

In Figure 7B,C we illustrate the sensitivity of the size-distribution of the local field to varying mutation rates u1 and u2, conditioned on observing initiation at the expected time t = E(σ2). The mutation rates are tuned to vary across parameter Regimes 1, 2, and 3 as described in the previous section. Observe that for lower mutation rates, the local field size varies widely (and sometimes close to uniformly) over a large range of values, while elevated mutation rates in both cases signify smaller local fields. For the u1 rate (Figure 7B), an intuitive explanation for this behavior is that as the mutation rate increases, the system moves towards regimes 2 and 3, in which the premalignant field is comprised of an increasing number of independent type-1 patches. With more type-1 patches present, the space-time volume of type-1 cells that can give rise to the first successful type-2 cell increases faster, and hence the size of the patch that eventually gives rise to the first type-2 decreases accordingly. For u2 (Figure 7C) on the other hand, an increase in the mutation rate signifies a move towards regime 1: fewer type-1 clones are required to produce the first successful type-2, and the size of the type-1 field that yields the first type-2 decreases with increasing u2. Another observation to note is that the local field size varies across the same range of orders of magnitude as the mutation rates. This suggests for example, that carcinogen exposure or environmental causes changing mutation rates by one order of magnitude could result in predicted field sizes impacted similarly by an order of magnitude.

Finally, we demonstrate the sensitivity of the local field size to the selective advantage s of mutant cells, see Figure 7D. For a small fitness gain of s = 0.025, the distribution is peaked at lower field sizes, but as s increases the field size distribution shifts to the right. High fitness gains are usually associated with an aggressive tumor phenotype, and Figure 7D suggests that such tumors may also be associated with large surrounding premalignant fields and thus higher recurrence risks.

3.2 Size of the distant field at initiation

Next we are interested in analyzing the size distribution of the distant field at initiation, which is comprised of premalignant clones that are clonally unrelated to the tumor. Define the vector of areas of the distant premalignant lesions at time $t$ to be $\bar{X}_d(t)$. This vector holds the areas of all premalignant clones except for the local field clone from which the tumor arises. Mathematically speaking, the goal of this section is to characterize the law of $\bar{X}_d(\sigma_2)$ conditioned on the event $\{\sigma_2 = t\}$. Before stating the main result some additional notation is needed. First, define the mapping $\alpha_j(i)$ as follows:

$$\alpha_j(i) = \begin{cases} i, & \text{if } j > i, \\ i + 1, & \text{if } j \le i. \end{cases}$$

Then, we define the random variable $\tilde{X}_i \equiv X_{\alpha(i)}$, where

$$\alpha(i) \equiv \sum_{j=1}^{M(t)} \alpha_j(i)\, 1_{\{X_{[1]} = X_j\}}.$$

Note that using this definition, $(\tilde{X}_1, \ldots, \tilde{X}_{M(\sigma_2)-1})$ represents the vector of sizes of the clones present at time $\sigma_2$, omitting the entry corresponding to the size-biased pick $X_{[1]}$ which represents the local field. In other words, the distribution of $\bar{X}_d(\sigma_2)$ is the joint distribution of $(\tilde{X}_1, \ldots, \tilde{X}_{M(\sigma_2)-1})$, which characterizes the size distribution of the clones in the distant field at time $\sigma_2$. We obtain the following result (see section 7.2 for the proof).
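The index bookkeeping of the mapping defined above can be made concrete with a short sketch. The function names and the use of 1-based indices on plain Python lists are illustrative assumptions.

```python
def alpha(j, i):
    """alpha_j(i): 1-based position, in the full clone list, of the i-th
    distant clone when clone j is the size-biased pick (the local field)."""
    return i if j > i else i + 1


def distant_field(areas, j):
    """Areas of all clones except the j-th one (1-based), in original order."""
    m = len(areas)
    return [areas[alpha(j, i) - 1] for i in range(1, m)]
```

For instance, `distant_field([10, 20, 30, 40], 2)` skips the second clone and returns the remaining three areas in their original order, which is exactly the role of the mapping in the definition of the distant field vector.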

Theorem 3.3

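The size-distribution of the distant field clones at time $\sigma_2$ of the first successful type-2 mutation, conditioned on $\{\sigma_2 = t\}$, is given by

$$\mathcal{L}(\bar{X}_d \mid \sigma_2 \in dt) \overset{d}{=} \hat{P}(\tilde{X}_1 \in dx_1, \ldots, \tilde{X}_{M(t)-1} \in dx_{M(t)-1}) = \frac{1}{1 - e^{-\lambda t \phi(t)}} \sum_{m=1}^{\infty} \frac{(\lambda \phi(t) t)^m e^{-\lambda \phi(t) t}}{m!} \prod_{i=1}^{m-1} g_t(x_i),$$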

where gt(x) is defined in (26).

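Of note, from Theorem 3.3 and Corollary 3.5 below, we see that

$$\mathcal{L}(\bar{X}_d \mid \sigma_2 = t, M(t) = m) \overset{d}{=} \hat{P}(\tilde{X}_1 \in dx_1, \ldots, \tilde{X}_{m-1} \in dx_{m-1}) = \prod_{i=1}^{m-1} g_t(x_i).$$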

Figure 8 shows how the probability density function of the total distant field size (i.e. the sum of all distant field patches) changes with increasing mutation rate $u_1$. For a comparison to the local field size distribution at the same parameter values, we refer to Figure 7B. We note that in regimes 1 and 2 the total distant field size is on the same order of magnitude as the local field size, but in regime 3 the distant field size is significantly larger than the size of the local field. As will be investigated in more detail below, this suggests that secondary tumor recurrences for cancer types in regime 3 are much more likely to stem from the distant field, and thus are more likely to be clonally unrelated to the primary tumor.

Figure 8.

Figure 8

The distribution of the total size of the distant field is shown for different scenarios, corresponding to the three regimes R1, R2 and R3 illustrated in Figure 4 for varying type-1 mutation rates $u_1$. The non-varying parameters are held constant at $d = 2$, $N = 2 \cdot 10^{5}$, $u_2 = 2 \cdot 10^{-5}$, $s_1 = s_2 = 0.1$ and $c_2(s_1) = 0.16$.

3.3 Number of field patches: evolution until initiation

We next analyze the total number of premalignant lesions over time until tumor initiation. In particular, the following result holds (see section 7.3 for the proof).

Proposition 3.4

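Conditioned on $\{\sigma_2 = t\}$, we have that for all $\zeta \le t$, the number of field patches is distributed as a mixture of a Poisson and a shifted Poisson random variable. In particular,

$$P(M(\zeta) = m \mid \sigma_2 = t) = p_1(t,\zeta)\, \frac{\lambda^m \left[t\phi(t) - (t-\zeta)\phi(t-\zeta)\right]^m}{m!}\, e^{-\lambda[t\phi(t) - (t-\zeta)\phi(t-\zeta)]} + p_2(t,\zeta)\, \frac{\lambda^{m-1} \left[t\phi(t) - (t-\zeta)\phi(t-\zeta)\right]^{m-1}}{(m-1)!}\, e^{-\lambda[t\phi(t) - (t-\zeta)\phi(t-\zeta)]},$$

where $p_1(t,\zeta) + p_2(t,\zeta) = 1$ and $p_1(t,\zeta) = \left(1 - e^{-\theta(t-\zeta)^{d+1}}\right)/\left(1 - e^{-\theta t^{d+1}}\right)$. In particular,

$$E(M(\zeta) \mid \sigma_2 = t) = \lambda\left[t\phi(t) - (t-\zeta)\phi(t-\zeta)\right] + p_2(t,\zeta).$$

It is interesting to observe that as $\zeta \to t$ we see that $p_1(t,\zeta) \to 0$; therefore, as $\zeta$ gets closer to time $t$ the process looks more like a shifted Poisson. This is stated in the corollary below.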

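Corollary 3.5

$$\hat{P}(M(t) = m) = \frac{(\lambda t \phi(t))^{m-1}}{(m-1)!}\, e^{-t\lambda\phi(t)}, \quad m \ge 1, \tag{10}$$

and $\hat{P}(M(t) = 0) = 0$. In particular,

$$\hat{E}(M(t)) = 1 + E(M(t) \mid \sigma_2 > t) = 1 + \lambda t \phi(t), \tag{11}$$

where $E(M(t) \mid \sigma_2 > t)$ is discussed in Lemma 7.2.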
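The mixture in Proposition 3.4 and its limit in Corollary 3.5 can be evaluated numerically with a sketch like the one below. It assumes $t\phi(t) = \int_0^t e^{-\theta r^{d+1}}\,dr$, which is how $\phi$ enters (20); the function names, the midpoint quadrature, and the parameter values in the usage note are illustrative assumptions.

```python
import math


def t_phi(t, theta, d, n=4000):
    """t * phi(t) = integral_0^t exp(-theta * r**(d+1)) dr (midpoint rule)."""
    if t <= 0:
        return 0.0
    h = t / n
    return h * sum(math.exp(-theta * ((k + 0.5) * h) ** (d + 1)) for k in range(n))


def poisson_pmf(k, mu):
    """Poisson probability mass, with the conventions pmf(k < 0) = 0 and mu >= 0."""
    if k < 0:
        return 0.0
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def patch_count_params(zeta, t, lam, theta, d):
    """Poisson mean and mixture weight p_1(t, zeta) from Proposition 3.4."""
    mu = lam * (t_phi(t, theta, d) - t_phi(t - zeta, theta, d))
    p1 = (1 - math.exp(-theta * (t - zeta) ** (d + 1))) / (1 - math.exp(-theta * t ** (d + 1)))
    return mu, p1


def patch_count_pmf(m, zeta, t, lam, theta, d):
    """P(M(zeta) = m | sigma_2 = t): Poisson / shifted-Poisson mixture."""
    mu, p1 = patch_count_params(zeta, t, lam, theta, d)
    return p1 * poisson_pmf(m, mu) + (1 - p1) * poisson_pmf(m - 1, mu)
```

Setting `zeta = t` makes the weight `p1` vanish, so the function reduces to the shifted Poisson law (10); in particular `patch_count_pmf(0, t, t, ...)` is zero, reflecting that at least one premalignant clone must exist at initiation.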

Using Proposition 3.4, we can study the expected number of field patches of a certain size over time. Figure 9 shows the temporal dynamics of clone-size distribution in each regime. In regime 1 the expected number of small clones peaks and then declines as larger clones begin to dominate (consistent with the notion that a single premalignant clone exists prior to initiation), whereas in regimes 2 and 3 we see longer coexistence of large and small clones over time.

Figure 9. Dynamic clone-size distribution.

Figure 9

For each of the three regimes in Figure 5, the expected number of type-1 clones of sizes comprised in the corresponding intervals Ij are shown as functions of time up to E(σ2) (expectations are conditioned on {t = E(σ2)}). The intervals are defined as I1 = [0, 1500), I2 = [1500, 3000), I3 = [3000, 4500) and I4 = [4500, +∞). Parameter values as in Figure 5.

Finally, we would like to point out that the result in Proposition 3.4 can be extended to a result about the entire process $\{M(r) : 0 \le r \le t\}$ conditioned on $\sigma_2 = t$. The details are provided in section 7.4.

4 Recurrence predictions

Tumor recurrence due to field cancerization poses a substantial clinical problem in many epithelial cancers [3]. We next aim to use the results of the previous section to develop a methodology for assessing the risk of tumor recurrence (as well as the likely type of tumor recurrence) after surgical removal of the primary tumor.

4.1 Local vs. distant field recurrence?

As discussed above, a recurring tumor can either arise in the same premalignant field (a second field tumor), or it can arise in a clonally unrelated field (second primary tumor). In this section we characterize the recurrence time distribution for each of these secondary tumor types, and study how the relative likelihood of local vs. distant recurrence depends upon parameters of the tissue and cancer type.

To this end, we first study the recurrence time distribution for second field tumors, which arise from the local premalignant field. Denote the second field recurrence time by $T_R^f$, measured in time units $\tau$ starting from $\tau = 0$ at time $\sigma_2$. The time is reset at the tumor initiation time $\sigma_2$, rather than the tumor resection time $\sigma_2 + T_D$, to accommodate the possibility that a recurrence occurs prior to detection of the primary tumor. Thus if recurrence occurs at some time $\tau < T_D$, then a secondary tumor already exists at the time of diagnosis of the primary tumor (but may be too small to be detectable). We assume that the primary tumor node is completely resected once it becomes detectable at time $T_D$, leaving the surrounding field intact (i.e. there are no excision margins).

At time $\sigma_2$ a successful type-2 cell arises from a premalignant clone of radius $R_l(\sigma_2)$, whose distribution is characterized in (9). If $R_l(\sigma_2) = r$, the incidence rate of successful type-2 mutations within this field is given by

$$\eta(r,\tau) \equiv u_2 \bar{s}_2\, \gamma_d \left[(r + c_d(s_1)\tau)^d - c_d^d(s_2)\,(\tau \wedge T_D)^d\right], \tag{12}$$

where $c_d(s_2)$ is the rate of expansion of the malignant cells into the type-1 field. The proof of the following result can be found in section 7.5.

Corollary 4.1

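The probability of a second field tumor having formed before time $\tau$ (measured from $\sigma_2$), conditioned on $\{\sigma_2 = t\}$, is given by

$$\hat{P}(T_R^f < \tau) = 1 - \frac{\gamma_d u_2 \bar{s}_2}{c_d(s_1)\left(1 - e^{-\theta t^{d+1}}\right)} \int_0^{c_d(s_1) t} r^d \exp\left[-\frac{u_2 \bar{s}_2 \gamma_d}{c_d(s_1)(d+1)}\, r^{d+1} - \int_0^{\tau} \eta(r,s)\, ds\right] dr.$$

In particular, $\hat{P}(T_R^f < T_D)$ is the probability that smaller, possibly undetectable second field tumors exist at the time of diagnosis.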
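The expression in Corollary 4.1 lends itself to direct numerical evaluation, since the inner integral of $\eta$ from (12) has a closed form. The sketch below is one possible implementation under stated assumptions: $\theta = u_2 \bar{s}_2 \gamma_d c_d^d(s_1)/(d+1)$, a midpoint rule for the outer integral, and hypothetical function and argument names (`c1` for $c_d(s_1)$, `c2` for $c_d(s_2)$).

```python
import math


def eta_integral(r, tau, d, u2, s2_bar, gamma_d, c1, c2, T_D):
    """Closed form of integral_0^tau eta(r, s) ds, with eta as in (12)."""
    grown = ((r + c1 * tau) ** (d + 1) - r ** (d + 1)) / ((d + 1) * c1)
    if tau <= T_D:
        tumor = tau ** (d + 1) / (d + 1)
    else:
        # the resected tumor stops growing at T_D, so (s ^ T_D) is constant afterwards
        tumor = T_D ** (d + 1) / (d + 1) + T_D ** d * (tau - T_D)
    return u2 * s2_bar * gamma_d * (grown - c2 ** d * tumor)


def second_field_cdf(tau, t, d, u2, s2_bar, gamma_d, c1, c2, T_D, n=4000):
    """P-hat(T_R^f < tau) from Corollary 4.1 via a midpoint rule in r."""
    theta = u2 * s2_bar * gamma_d * c1 ** d / (d + 1)
    rate = u2 * s2_bar * gamma_d / c1
    norm = 1.0 - math.exp(-theta * t ** (d + 1))
    h = c1 * t / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        exponent = -rate * r ** (d + 1) / (d + 1) - eta_integral(
            r, tau, d, u2, s2_bar, gamma_d, c1, c2, T_D)
        total += r ** d * math.exp(exponent)
    return 1.0 - rate * h * total / norm
```

Evaluating the function at `tau = T_D` gives the probability that a second field tumor already exists at diagnosis. The detection time `T_D` is not specified in this section, so any value passed in is a modeling assumption.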

In Figure 10A the cumulative distribution function of $T_R^f$ as calculated in Corollary 4.1 is shown, for varying values of the type-2 mutation rate $u_2$. As one might expect, higher mutation rates yield a decreased time to recurrence (the curves shift to the left for increasing $u_2$). However, this decrease is a priori not obvious: the premalignant field at initiation of the primary tumor shrinks as $u_2$ increases (see Figure 10B), and a smaller precancer field by itself lowers the chance of fast recurrence. This example illustrates how a quantitative model enables us to assess the relative importance of competing aspects of the system: in this case, the impact of a larger premalignant field versus higher mutation rates on recurrence likelihood.

Figure 10. Time to local recurrence.

Figure 10

A The cumulative distribution function of the time to recurrence of a second field tumor is shown for three different scenarios, corresponding to $u_2 = 2 \cdot 10^{-3}$ (Regime 1), $u_2 = 2 \cdot 10^{-5}$ (Regime 2) and $u_2 = 2 \cdot 10^{-3}$ (Regime 2/3), respectively. The remaining parameters are $d = 2$, $N = 2 \cdot 10^{5}$, $u_1 = 7.5 \cdot 10^{-7}$, $s_1 = s_2 = 0.1$, $t = E(\sigma_2)$. B Schematic of the relative initiation times of the primary tumor (yellow) and sizes of the local fields (blue), for the three scenarios in panel A. The numerical values for expected initiation time and local field size are: (a) $E(\sigma_2) = 123$, $\hat{E}(R_l) = 8$; (b) $E(\sigma_2) = 281$, $\hat{E}(R_l) = 31$; (c) $E(\sigma_2) = 474$, $\hat{E}(R_l) = 55$.

If the recurrence does not take place in the local field giving rise to the first successful type-2 clone, then it either arises from one of the type-1 clones already present at time of initiation (i.e. the distant field), or it arises in a type-1 clone formed after initiation. In the latter case, the waiting time is again distributed as $\sigma_2$, and hence we focus here on the distribution of the waiting time $T_R^p$, defined as the time from $\sigma_2$ until a second primary tumor arises from the distant field already existing at $\sigma_2$. We have the following result, proved in section 7.6.

Corollary 4.2

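The probability that the distant field at the time of initiation has not given rise to a second primary tumor by time $\tau$ (measured from $\sigma_2$), conditioned on $\{\sigma_2 = t\}$, is given by

$$P(T_R^p > \tau \mid \sigma_2 = t) = \exp\left[-\lambda t \phi(t)\left(1 - d\gamma_d \Phi(\tau,t)\right)\right],$$

where

$$\Phi(\tau,t) = \int_0^{\infty} \exp\left(-\int_0^{\tau} \eta(r,s)\, ds\right) r^{d-1} g_t(r^d \gamma_d)\, dr,$$

and $g_t$ is defined in (26).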

Thanks to the results in this section, it is now possible to evaluate the probability of local versus distant tumor recurrences in each parameter regime. Corollary 4.1 explicitly provides the probability density function $\hat{P}(T_R^f \in d\tau)$, which is the probability that a second field tumor arises at time $\tau$ from the same field that gave rise to the primary tumor. To obtain the corresponding probability density function for recurrence as a second primary tumor, we have to consider recurrences due to distant field lesions that have arisen before and after $\sigma_2$. While Corollary 4.2 characterizes the recurrence risk due to distant lesions already present at initiation, the time to a successful second primary tumor from a distant field not yet present at initiation is distributed as $\sigma_2$, see (4). Therefore, the distribution of interest is that of $\tilde{T}_R^p = \min\{T_R^p, \sigma_2\}$, which is the time of the first distant recurrence event.
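As a worked step, the survival function of the first distant recurrence time can be written in closed form under one additional assumption that is not stated explicitly above: that clones born after initiation evolve independently of the clones already present, so that the minimum is taken over two independent waiting times. Writing $\tilde{T}_R^p$ for the minimum of $T_R^p$ and an independent copy of $\sigma_2$, and using Corollary 4.2 together with the expression for $P(\sigma_2 > \tau)$ derived in (20), one would obtain

$$\hat{P}(\tilde{T}_R^p > \tau) = \hat{P}(T_R^p > \tau)\, P(\sigma_2 > \tau) = \exp\left[-\lambda t \phi(t)\left(1 - d\gamma_d \Phi(\tau,t)\right)\right] \exp\left[\tau\lambda\left(\phi(\tau) - 1\right)\right].$$

The corresponding density follows by differentiating $1 - \hat{P}(\tilde{T}_R^p > \tau)$ with respect to $\tau$. This sketch ignores any competition for space between existing and newly arriving clones.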

In Figure 11 we compare the probability density functions of $T_R^f$ (second field tumor, local) and $\tilde{T}_R^p$ (second primary tumor, distant) in regimes 1, 2 and 3. The likelihood of local vs. distant recurrences depends strongly upon both the timing and the parameter regime of the system. In regime 1, local recurrence is significantly more likely overall, but at late times the probability of distant recurrences is slightly higher than for local recurrences. In contrast, in regimes 2 and 3 the overall probabilities of local and distant recurrence are comparable. However, in regime 2, at early times distant field recurrences are more likely, whereas the opposite is true at later times. The same observation, but even more pronounced, holds in regime 3.

Figure 11. Local vs. distant recurrence.

Figure 11

A For each of the three regimes in Figure 5, we show: the distribution of time to local recurrence $\hat{P}(T_R^f \in d\tau)$, and the distribution of time to distant recurrence $\hat{P}(\tilde{T}_R^p \in d\tau)$. The distribution of $T_R^f$ is given in Corollary 4.1 and we set $\tilde{T}_R^p = \min\{T_R^p, \sigma_2\}$ to account both for contributions from type-1 clones already existing at $\sigma_2$ as well as contributions from type-1 clones born after $\sigma_2$ (for which time to recurrence is distributed as $\sigma_2$). Expected times to recurrence: $\hat{E}(T_R^f) = 81$ and $\hat{E}(\tilde{T}_R^p) = 733$ (Regime 1); $\hat{E}(T_R^f) = 98$ and $\hat{E}(\tilde{T}_R^p) = 86$ (Regime 2); $\hat{E}(T_R^f) = 149$ and $\hat{E}(\tilde{T}_R^p) = 34$ (Regime 3). The parameter values are as in Figure 5.

5 Conclusions and outlook

In this study we performed a quantitative analysis of the cancer field effect by means of a spatial stochastic model of cancer initiation, which had previously been introduced in [26]. Using this model, we studied the characteristics of premalignant fields at the time of tumor initiation. In particular, we derived the size-distributions of the local field (the premalignant lesion that gives rise to the tumor) and the distant field (the premalignant lesions that are unrelated to the primary tumor). We also investigated how the extent and geometry of these fields depend upon Γ, a key combination of parameters of the tissue and genetic pathway leading to cancer. We calculated the dynamic clone size distribution at times leading up to initiation, and derived the probability density functions of local and distant recurrence times. Finally, we compared the relative likelihood of second field versus second primary tumors, and demonstrated how the clonal relatedness between primary and recurrent tumors depends explicitly upon tissue and cancer type parameters.

Using an example set of biologically realistic parameters in two space dimensions (which is appropriate for describing the cancer initiation process in the basal layer of a stratified epithelium), we found that lower mutation rates (such as in regime 1) were associated with larger local field sizes, whereas higher mutation rates (regimes 2 and 3) led to smaller local fields. We also found that higher mutation rates resulted in larger distant fields, while more aggressive cancers (high selective advantage) led to larger local fields at diagnosis. Finally, we investigated the risk of recurrence after surgical resection of the malignant portion, and found that for low mutation rates (regime 1), local recurrence is much more likely, whereas for larger mutation rates (regimes 2 and 3), the overall probabilities of local and distant recurrence are comparable. However, in regimes 2 and 3, early recurrences are more likely to be second primary tumors, whereas late recurrences are more likely to be second field tumors.

One important limitation of our approach is that the model captures a specific sequence of genetic alterations with specified $u_i$ and $s_i$, and does not currently allow for permutations of genetic events and divergent pathways. Nevertheless, our model may provide a useful framework for comparing different biological hypotheses and disentangling divergent genetic pathways among cancer subtypes. In particular, it enables us to predict differences in observable dynamics such as initiation times and prognoses between different molecular models. Such an approach could help elucidate the sequence of genetic events during carcinogenesis, and will be the subject of future work. Another limitation of our framework is that we have assumed a static, uniform microenvironment within the tissue. The local microenvironment is in reality determined by a variety of time- and space-dependent factors such as glucose, oxygen, growth factors, drugs and cytokine concentrations. In addition to impacting the growth and mutation rates of cells within the tissue, the local microenvironment is increasingly being recognized as playing an important role in carcinogenesis through stromal signaling.

As mentioned before, field cancerization poses various clinical challenges, especially in the case of head and neck cancers, where multifocal primary cancers as well as recurrences are common [35]. In particular, the optimal size of excision margins and assessment of the recurrence risk after surgery are largely unsolved problems arising in everyday clinical practice. In a forthcoming study, we will discuss how our analysis can be used to address some of the most pertinent clinical questions in head and neck cancer care.

In summary, the analyses performed in this work contribute towards a quantitative understanding of how organ-specific physiological parameters and pathway-specific parameters influence the process of field cancerization and the associated risk of recurrence. We demonstrate that tumor recurrence dynamics and premalignant field characteristics are strongly dependent upon these parameters, which vary across different tissue and cancer types. Once properly calibrated for a specific tissue and cancer type, the proposed methodology can potentially be used to provide insights into key prognostic factors such as risk of multifocal lesions and tumor recurrence, surveillance guidelines, and treatment design. For example, we are able to assess the likelihood and timing of local versus distant recurrences after surgical resection. Since this distinction provides information on the level of clonal relatedness between primary and recurrent tumors, the model predictions may provide insights into whether treatment strategies effective for primary tumors will be useful for recurrent tumors in particular cancer types. In addition, our methodology can be utilized to assess the relative benefits of surgical excision margins, and to help determine the minimal margins necessary to prevent recurrence in each tissue type.

Acknowledgments

We thank Rick Durrett for insightful discussions on this project as well as his useful suggestions on the manuscript.

7 Appendix: Proofs

7.1 Proof of Theorem 3.2

To prove Theorem 3.2, we first need a few new definitions and preliminary results. Define $V(t)$ to be the random total space-time volume covered by successful type-1 families until time $t$,

$$V(t) = \sum_{i=1}^{M(t)} \frac{\gamma_d c_d^d(s_1)\,(t - T_i)^{d+1}}{d+1}, \tag{13}$$

where $T_i$ represents the arrival time of the $i$-th family, and $M(t)$ is the total number of successful arrivals by time $t$, which is a Poisson process with rate $\lambda$. Let $V_{\mathcal{E}_t(t_1,\ldots,t_m)}$ represent the space-time volume conditioned on the event

$$\mathcal{E}_t(t_1,\ldots,t_m) \equiv \{M(t) = m,\ T_1 \in dt_1, \ldots, T_m \in dt_m\},$$

where $0 < t_1 < \cdots < t_m < t$. In other words,

$$V_{\mathcal{E}_t} \equiv \frac{\gamma_d c_d^d(s_1)}{d+1} \sum_{i=1}^{m} (t - t_i)^{d+1}. \tag{14}$$

For ease of notation we have replaced $V_{\mathcal{E}_t(t_1,\ldots,t_m)}$ with the more compact version $V_{\mathcal{E}_t}$. Since $E[V(t)] = E[E[V(t) \mid M(t)]]$ and the conditioned process is a compound Poisson process, we obtain that

$$E[V(t)] = \sum_{m=0}^{\infty} P(M(t) = m)\, m\, \frac{\gamma_d c_d^d(s_1)}{d+1}\, E\left[(t - T_i)^{d+1}\right] = \frac{\lambda \gamma_d c_d^d(s_1)\, t^{d+2}}{(d+2)(d+1)}.$$

Similarly, we define $A(t)$ to be the total area of clones covered by successful type-1 families at time $t$,

$$A(t) \equiv \sum_{i=1}^{M(t)} \gamma_d c_d^d(s_1)\,(t - T_i)^d, \tag{15}$$

and we define $A_{\mathcal{E}_t}$ to be this quantity conditioned on $\mathcal{E}_t(t_1,\ldots,t_m)$,

$$A_{\mathcal{E}_t} \equiv \sum_{i=1}^{m} \gamma_d c_d^d(s_1)\,(t - t_i)^d. \tag{16}$$

Note that

$$E[A(t)] = \sum_{m=0}^{\infty} P(M(t) = m)\, m\, \gamma_d c_d^d(s_1)\, E\left[(t - T_i)^d\right] = \frac{\lambda \gamma_d c_d^d(s_1)\, t^{d+1}}{d+1}. \tag{17}$$
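The two expectations above rest on a standard property of Poisson processes, spelled out here as a worked step: conditioned on $M(t) = m$, the (unordered) arrival times are independent and uniformly distributed on $[0,t]$. Writing $U$ for such a uniform variable, for any exponent $k \ge 0$,

$$E\left[(t - U)^{k}\right] = \frac{1}{t}\int_0^t (t - s)^{k}\, ds = \frac{t^{k}}{k+1}, \qquad \sum_{m=0}^{\infty} P(M(t) = m)\, m = E[M(t)] = \lambda t.$$

Taking $k = d$ gives $E[A(t)] = \lambda t \cdot \gamma_d c_d^d(s_1)\, t^{d}/(d+1)$, which is (17), and taking $k = d+1$ gives the factor $t^{d+1}/(d+2)$ that appears in $E[V(t)]$.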

By considering the space-time volume of type-1 clones we can calculate $P(\sigma_2 > t \mid \mathcal{E}_t(t_1,\ldots,t_m))$ and $P(\sigma_2 > t \mid M(t) = m)$. Combining these two formulas and using Bayes’ rule we get the following result for the joint distribution of the arrival times of successful type-1 mutations, conditioned on the total number of mutations by time $t$.

Lemma 7.1

Conditioned on $\{\sigma_2 > t\}$ and $\{M(t) = m\}$, the arrival times of successful type-1 clones $(T_1, \ldots, T_m)$ are distributed as order statistics of iid random variables as follows:

$$P(T_1 \in dt_1, \ldots, T_m \in dt_m \mid \sigma_2 > t, M(t) = m) = \frac{m!}{t^m \phi(t)^m} \prod_{i=1}^{m} e^{-\theta (t - t_i)^{d+1}},$$

where $0 < t_1 < \cdots < t_m < t$.

Proof

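The arrival process of successful type-1 mutations is represented by $M(\cdot)$, which is a Poisson process with rate $\lambda = N u_1 s_1/(1 + s_1)$ and arrival times $T_1, T_2, \ldots$. Then for any $t > 0$ and sequence $0 < t_1 < \cdots < t_m < t$ we have that

$$P(\mathcal{E}_t(t_1,\ldots,t_m)) = \lambda^m e^{-\lambda t}. \tag{18}$$

Since

$$P(\sigma_2 > t \mid \mathcal{E}_t(t_1,\ldots,t_m)) = \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right), \tag{19}$$

we find using Bayes’ rule

$$P(\sigma_2 > t, \mathcal{E}_t(t_1,\ldots,t_m)) = \lambda^m e^{-\lambda t} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right).$$

It follows then that

$$\begin{aligned}
P(T_1 \in dt_1, \ldots, T_m \in dt_m \mid \sigma_2 > t, M(t) = m)
&= \frac{P(\sigma_2 > t, \mathcal{E}_t(t_1,\ldots,t_m))}{P(\sigma_2 > t \mid M(t) = m)\, P(M(t) = m)} \\
&= \frac{\lambda^m e^{-\lambda t} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right)}{P(\sigma_2 > t \mid M(t) = m)\, e^{-\lambda t} (\lambda t)^m / m!} \\
&= \frac{m!}{t^m}\, \frac{\exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right)}{\left(E \exp\left(-u_2 \bar{s}_2 \gamma_d c_d^d(s_1)(t - T)^{d+1}/(d+1)\right)\right)^m} \\
&= m! \prod_{i=1}^{m} \left(\frac{1}{t}\right) \frac{\exp\left(-u_2 \bar{s}_2 \gamma_d c_d^d(s_1)(t - t_i)^{d+1}/(d+1)\right)}{E \exp\left(-u_2 \bar{s}_2 \gamma_d c_d^d(s_1)(t - T)^{d+1}/(d+1)\right)},
\end{aligned}$$

where $T$ is a uniform random variable on $[0, t]$.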

The distribution in Lemma 7.1 is an exponential twist of the uniform distribution. Note that if the conditioning was placed on the set $\{\sigma_2 = t\}$ instead of $\{\sigma_2 > t\}$, then the conditional distribution would no longer have product form because of the term $\frac{d}{dt} V_{\mathcal{E}_t}$, and the arrival times would not be the order statistics from an iid collection of random variables.

Next, we show that the random variable M(t) is Poisson if conditioned on {σ2 > t}.

Lemma 7.2

Conditioned on $\{\sigma_2 > t\}$, $M(t) \overset{d}{=} \mathrm{Pois}(\lambda t \phi(t))$.

Proof

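First we note that

$$\begin{aligned}
P(\sigma_2 > t)
&= \sum_{m=0}^{\infty} \frac{1}{m!} \int_{[0,t]^m} P(\sigma_2 > t \mid \mathcal{E}_t(t_1,\ldots,t_m))\, P(\mathcal{E}_t(t_1,\ldots,t_m))\, dt_1 \cdots dt_m \\
&= \sum_{m=0}^{\infty} \frac{1}{m!} \int_{[0,t]^m} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right) \lambda^m e^{-\lambda t}\, dt_1 \cdots dt_m \\
&= \sum_{m=0}^{\infty} \frac{1}{m!}\, t^m \lambda^m e^{-\lambda t} \left(\frac{1}{t} \int_0^t \exp\left(-\frac{u_2 \bar{s}_2 \gamma_d c_d^d(s_1)(t - r)^{d+1}}{d+1}\right) dr\right)^m \\
&= \sum_{m=0}^{\infty} \frac{(t \lambda \phi(t))^m}{m!}\, e^{-\lambda t} = e^{t\lambda(\phi(t) - 1)}.
\end{aligned} \tag{20}$$

From this, we find using Bayes’ rule

$$P(\mathcal{E}_t(t_1,\ldots,t_m) \mid \sigma_2 > t) = \frac{P(\sigma_2 > t \mid \mathcal{E}_t(t_1,\ldots,t_m))\, P(\mathcal{E}_t(t_1,\ldots,t_m))}{P(\sigma_2 > t)} = \frac{\lambda^m e^{-\lambda t} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right)}{e^{t\lambda(\phi(t) - 1)}}, \tag{21}$$

and hence

$$P(M(t) = m \mid \sigma_2 > t) = \frac{1}{m!} \int_{[0,t]^m} P(\mathcal{E}_t(t_1,\ldots,t_m) \mid \sigma_2 > t)\, dt_1 \cdots dt_m = e^{-\lambda t \phi(t)}\, \frac{(t \lambda \phi(t))^m}{m!}.$$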

For subsequent considerations, it will be useful to define the two conditional probability measures $\hat{P}(\cdot) = P(\cdot \mid \sigma_2 = t)$ and $\tilde{P}(\cdot) = P(\cdot \mid \sigma_2 > t)$, and their corresponding expected values, $\hat{E}(\cdot) = E(\cdot \mid \sigma_2 = t)$ and $\tilde{E}(\cdot) = E(\cdot \mid \sigma_2 > t)$, respectively. In particular, we can compute the Radon-Nikodym derivative between these two measures.

Lemma 7.3

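The Radon-Nikodym derivative of $\hat{P}$ with respect to $\tilde{P}$ is given by

$$\frac{d\hat{P}}{d\tilde{P}} = \frac{A_{\mathcal{E}_t}\, u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)}. \tag{22}$$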
Proof

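First, note that

$$P(\mathcal{E}_t(t_1,\ldots,t_m) \mid \sigma_2 = t) = \frac{P(\mathcal{E}_t(t_1,\ldots,t_m))\, P(\sigma_2 = t \mid \mathcal{E}_t(t_1,\ldots,t_m))}{P(\sigma_2 = t)}. \tag{23}$$

By differentiating (19) and (20) we obtain

$$P(\sigma_2 = t \mid \mathcal{E}_t(t_1,\ldots,t_m)) = u_2 \bar{s}_2 A_{\mathcal{E}_t} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right)$$

and

$$P(\sigma_2 \in dt) = -\frac{d}{dt}\, e^{t\lambda(\phi(t) - 1)} = \lambda\left(1 - e^{-\theta t^{d+1}}\right) e^{t\lambda(\phi(t) - 1)}. \tag{24}$$

Hence (23) becomes

$$P(\mathcal{E}_t(t_1,\ldots,t_m) \mid \sigma_2 = t) = \frac{\lambda^m e^{-\lambda t}\, u_2 \bar{s}_2 A_{\mathcal{E}_t} \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right)}{\lambda\, e^{t\lambda(\phi(t) - 1)}\left(1 - e^{-\theta t^{d+1}}\right)},$$

and comparing this to (21) yields the desired result.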
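The step from (20) to (24) can be checked numerically. The sketch below evaluates the survival function in (20) by quadrature and compares a finite-difference derivative against the closed-form density (24). It assumes $t\phi(t) = \int_0^t e^{-\theta r^{d+1}}\,dr$, as implied by the third line of (20); the function names and parameter values are illustrative.

```python
import math


def t_phi(t, theta, d, n=20000):
    """t * phi(t) = integral_0^t exp(-theta * r**(d+1)) dr (midpoint rule)."""
    if t <= 0:
        return 0.0
    h = t / n
    return h * sum(math.exp(-theta * ((k + 0.5) * h) ** (d + 1)) for k in range(n))


def survival(t, lam, theta, d):
    """P(sigma_2 > t) = exp(t * lam * (phi(t) - 1)), equation (20)."""
    return math.exp(lam * (t_phi(t, theta, d) - t))


def density(t, lam, theta, d):
    """Density of sigma_2 at t, equation (24)."""
    return lam * (1.0 - math.exp(-theta * t ** (d + 1))) * survival(t, lam, theta, d)


def finite_difference_density(t, lam, theta, d, h=0.5):
    """Central-difference approximation of -d/dt P(sigma_2 > t)."""
    return -(survival(t + h, lam, theta, d) - survival(t - h, lam, theta, d)) / (2 * h)
```

For illustrative values such as `lam=0.0136`, `theta=4.9e-8`, `d=2` and `t=281`, the two density evaluations should agree to several significant digits.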

Recall now that $M(t)$ is the number of successful type-1 mutations that have arrived by time $t$, and we denote their arrival times by $T_1, \ldots, T_{M(t)}$. At time $t$, the area of a clone created at time $r < t$ is $\gamma_d c_d^d(s_1)(t - r)^d$, and hence the area of the $i$-th clone at time $t$ is given by the random variable

$$X_i(t) \equiv \gamma_d c_d^d(s_1)\,(t - T_i)^d.$$

Using the above results together with definition 3.1 of a size-biased pick we can now prove Theorem 3.2.

Proof of Theorem 3.2

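Using basic properties of conditional expectations and Definition 3.1 we find

$$\begin{aligned}
\hat{P}(X_{[1]} \in dx)
&= \hat{E}\left[\hat{P}(X_{[1]} \in dx \mid X_1, \ldots, X_{M(t)}, M(t))\right] = \hat{E}\left[\sum_{i=1}^{M(t)} \frac{X_i 1_{\{X_i \in dx\}}}{S_{M(t)}}\right] \\
&= \sum_{m=1}^{\infty} \hat{E}\left[\sum_{i=1}^{m} \frac{X_i 1_{\{X_i \in dx\}}}{S_m}\, 1_{\{M(t) = m\}}\right],
\end{aligned}$$

where $S_m = X_1 + \cdots + X_m$. Using the Radon-Nikodym derivative (22) we can rewrite this as

$$\begin{aligned}
&= \sum_{m=1}^{\infty} \tilde{E}\left[1_{\{M(t) = m\}}\, \frac{u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)} \left(\sum_{i=1}^{m} \frac{X_i 1_{\{X_i \in dx\}}}{S_m}\right) \sum_{j=1}^{m} X_j\right] \\
&= \frac{u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)} \sum_{m=1}^{\infty} \tilde{E}\left[1_{\{M(t) = m\}} \sum_{i=1}^{m} x\, 1_{\{X_i \in dx\}}\right] \\
&= \frac{x\, u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)} \sum_{m=1}^{\infty} E\left[\sum_{i=1}^{m} 1_{\{X_i \in dx\}} \,\Big|\, M(t) = m, \sigma_2 > t\right] P(M(t) = m \mid \sigma_2 > t) \\
&= \frac{x\, u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)}\, P(X_1(t) \in dx \mid M(t) = m, \sigma_2 > t)\, E[M(t) \mid \sigma_2 > t],
\end{aligned} \tag{25}$$

where we have used the fact that $P(X_1(t) < x \mid M(t) = m, \sigma_2 > t)$ is independent of $m$, which we will show below. Using Lemma 7.1 and differentiating the cumulative distribution function

$$P(X_1(t) < x \mid M(t) = m, \sigma_2 > t) = P\left(T_1 > t - \left(\frac{x}{\gamma_d c_d^d(s_1)}\right)^{1/d} \,\Big|\, M(t) = m, \sigma_2 > t\right),$$

we determine that

$$P(X_1(t) \in dx \mid M(t) = m, \sigma_2 > t) = \frac{x^{1/d - 1}}{d\,\gamma_d^{1/d}\, c_d(s_1)\, t \phi(t)} \exp\left[-\frac{u_2 \bar{s}_2\, x^{(d+1)/d}}{(d+1)\,\gamma_d^{1/d}\, c_d(s_1)}\right] \equiv g_t(x) \tag{26}$$

for $x \in [0, \gamma_d c_d^d(s_1)\, t^d]$. Note that (26) is indeed independent of $m$. From Lemma 7.2 it follows that

$$E[M(t) \mid \sigma_2 > t] = \lambda t \phi(t),$$

and combined with (25) and (26) this yields the desired result.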

7.2 Proof of Theorem 3.3

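Using Definition 3.1 of a size-biased pick we find

$$\begin{aligned}
\hat{P}(\tilde{X}_1 \in dx_1, \ldots, \tilde{X}_{M(t)-1} \in dx_{M(t)-1})
&= \hat{E}\left[\hat{P}(\tilde{X}_1 \in dx_1, \ldots, \tilde{X}_{M(t)-1} \in dx_{M(t)-1} \mid X_1, \ldots, X_{M(t)}, M(t))\right] \\
&= \hat{E}\left[\sum_{j=1}^{M(t)} \frac{X_j}{S_{M(t)}} \prod_{i=1}^{M(t)-1} 1_{\{X_{\alpha_j(i)} \in dx_i\}}\right] \\
&= \frac{u_2 \bar{s}_2}{\lambda\left(1 - e^{-\theta t^{d+1}}\right)} \sum_{m=1}^{\infty} P(M(t) = m \mid \sigma_2 > t)\, E\left[\sum_{j=1}^{m} X_j \prod_{i=1}^{m-1} 1_{\{X_{\alpha_j(i)} \in dx_i\}} \,\Big|\, \sigma_2 > t, M(t) = m\right],
\end{aligned}$$

where the final equality follows from the same sequence of arguments as used in the proof of Theorem 3.2. Next, we note that

$$\begin{aligned}
E[X_j(t) \mid \sigma_2 > t, M(t) = m]
&= \int_0^{\infty} x\, P(X_j(t) \in dx \mid M(t) = m, \sigma_2 > t) = \int_0^{\infty} x\, g_t(x)\, dx \\
&= \int_0^{\gamma_d c_d^d(s_1) t^d} \frac{x^{1/d}}{d\,\gamma_d^{1/d}\, c_d(s_1)\, \phi(t)\, t} \exp\left[-\frac{u_2 \bar{s}_2\, x^{(d+1)/d}}{(d+1)\,\gamma_d^{1/d}\, c_d(s_1)}\right] dx \\
&= \frac{1}{\phi(t)\, t\, u_2 \bar{s}_2}\left[1 - \exp\left(-\frac{u_2 \bar{s}_2 \gamma_d c_d^d(s_1)\, t^{d+1}}{d+1}\right)\right],
\end{aligned}$$

and

$$\sum_{j=1}^{m} E\left[X_j \prod_{i=1}^{m-1} 1_{\{X_{\alpha_j(i)} \in dx_i\}} \,\Big|\, \sigma_2 > t, M(t) = m\right] = \sum_{j=1}^{m} E[X_j \mid \sigma_2 > t, M(t) = m] \prod_{i=1}^{m-1} g_t(x_i).$$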

Together with Lemma 7.2 the result follows.

7.3 Proof of Proposition 3.4

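First, we use Bayes’ rule to find

$$P(\mathcal{E}_\zeta(t_1,\ldots,t_m) \mid \sigma_2 = t) = \frac{P(\sigma_2 \in dt \mid \mathcal{E}_\zeta(t_1,\ldots,t_m))\, P(\mathcal{E}_\zeta(t_1,\ldots,t_m))}{P(\sigma_2 \in dt)}. \tag{27}$$

Since $P(\sigma_2 \in dt)$ is given in (24) and $P(\mathcal{E}_\zeta(t_1,\ldots,t_m)) = \lambda^m e^{-\lambda\zeta}$, it remains to calculate $P(\sigma_2 \in dt \mid \mathcal{E}_\zeta(t_1,\ldots,t_m))$. It is easy to see that

$$P(\sigma_2 > t \mid \mathcal{E}_\zeta(t_1,\ldots,t_m)) = \exp\left(-u_2 \bar{s}_2 V_{\mathcal{E}_t}\right) q(\zeta,t), \tag{28}$$

where $q(\zeta,t)$ is the probability that no successful type-2 mutation has arisen by time $t$ in the clones born in the interval $(\zeta, t)$. We find

$$q(\zeta,t) = E\left[e^{-\theta \sum_{i=1}^{M(t-\zeta)} (t - T_i)^{d+1}}\right] = E\left[E\left[e^{-\theta \sum_{i=1}^{M(t-\zeta)} (t - T_i)^{d+1}} \,\Big|\, M(t-\zeta)\right]\right] = E\left[\phi(t-\zeta)^{M(t-\zeta)}\right] = e^{\lambda(t-\zeta)(\phi(t-\zeta) - 1)},$$

where the last expression is the probability generating function of the Poisson distribution. Together with (28) this now yields

$$P(\sigma_2 \in dt \mid \mathcal{E}_\zeta(t_1,\ldots,t_m)) = -\frac{d}{dt} P(\sigma_2 > t \mid \mathcal{E}_\zeta(t_1,\ldots,t_m)) = e^{\lambda(t-\zeta)(\phi(t-\zeta) - 1)}\, e^{-u_2 \bar{s}_2 V_{\mathcal{E}_t}} \left[u_2 \bar{s}_2 A_{\mathcal{E}_t} + \lambda\left(1 - e^{-\theta(t-\zeta)^{d+1}}\right)\right].$$

Together with (24) and (18), we now find

$$P(\mathcal{E}_\zeta(t_1,\ldots,t_m) \mid \sigma_2 = t) = \frac{\lambda^{m-1} e^{-\lambda[t\phi(t) - (t-\zeta)\phi(t-\zeta)]}}{1 - e^{-\theta t^{d+1}}}\, e^{-u_2 \bar{s}_2 V_{\mathcal{E}_t}} \left[u_2 \bar{s}_2 A_{\mathcal{E}_t} + \lambda\left(1 - e^{-\theta(t-\zeta)^{d+1}}\right)\right],$$

and hence performing the integration in

$$\hat{P}(M(\zeta) = m) = \int_{[0,\zeta]^m} \frac{1}{m!}\, P(\mathcal{E}_\zeta(t_1,\ldots,t_m) \mid \sigma_2 = t)\, dt_1 \cdots dt_m$$

yields the desired result.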

7.4 Joint distribution of the process {M(r) : 0 ≤ r ≤ t}

We present here the joint distribution of the process $\{M(r) : 0 \le r \le t\}$, conditioned on $\sigma_2 = t$, at multiple time points. Since the proof is similar to that of Proposition 3.4 we do not include it. For $0 \le r \le r' \le t$ define

$$\hat{\phi}(t; r, r') = \int_r^{r'} e^{-\theta(t - y)^{d+1}}\, dy.$$

Then for any positive integer $\ell$, sequence of time points $0 < r_1 < \cdots < r_\ell < t$ and non-negative integers $k_1 \le k_2 \le \cdots \le k_\ell$ we have that

$$\hat{P}(M(r_1) = k_1, \ldots, M(r_\ell) = k_\ell) = \left(\sum_{i=1}^{\ell} \frac{k_i - k_{i-1}}{\hat{\phi}(t; r_{i-1}, r_i)}\, p_i + \lambda p_{\ell+1}\right) \frac{1}{\lambda} \prod_{j=1}^{\ell} \frac{\left(\lambda \hat{\phi}(t; r_{j-1}, r_j)\right)^{k_j - k_{j-1}}}{(k_j - k_{j-1})!}\, e^{-\lambda \hat{\phi}(t; r_{j-1}, r_j)},$$

where for $1 \le i \le \ell + 1$,

$$p_i = \frac{e^{-\theta(t - r_i)^{d+1}} - e^{-\theta(t - r_{i-1})^{d+1}}}{1 - e^{-\theta t^{d+1}}},$$

$r_0 = 0$, $k_0 = 0$, and $r_{\ell+1} = t$. Note that for each $i$, $0 < p_i < 1$ and $\sum_{i=1}^{\ell+1} p_i = 1$, i.e. the $p_i$’s form a probability vector. The above joint distribution is rather difficult to parse, so we describe how one would generate samples of the increments of the process. For $1 \le i \le \ell$, set $X_i = M(r_i) - M(r_{i-1})$; then we can generate the values of the vector $X_1, \ldots, X_\ell$ under the measure $\hat{P}$ as follows. For each $1 \le i \le \ell$ sample $X_i$ according to a Poisson distribution with mean $\lambda \hat{\phi}(t; r_{i-1}, r_i)$. Choose an integer $I$ according to the probability vector $(p_1, \ldots, p_{\ell+1})$; if $I = i < \ell + 1$, replace $X_i$ with $X_i + 1$. Note that in contrast to the setting of a Poisson process the random variables $X_1, \ldots, X_\ell$ are not independent under $\hat{P}$.
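The sampling recipe described above can be sketched as follows. The function names, the midpoint quadrature for the integral defining the Poisson means, and the use of Knuth's multiplication method for Poisson sampling are choices made here for illustration; they are not prescribed by the text.

```python
import math
import random


def phi_hat(t, r_lo, r_hi, theta, d, n=2000):
    """integral_{r_lo}^{r_hi} exp(-theta * (t - y)**(d+1)) dy (midpoint rule)."""
    h = (r_hi - r_lo) / n
    return h * sum(
        math.exp(-theta * (t - (r_lo + (k + 0.5) * h)) ** (d + 1)) for k in range(n)
    )


def increment_means(times, t, lam, theta, d):
    """Poisson means lam * phi_hat(t; r_{i-1}, r_i) for i = 1..l, with r_0 = 0."""
    grid = [0.0] + list(times)
    return [lam * phi_hat(t, grid[i - 1], grid[i], theta, d) for i in range(1, len(grid))]


def shift_probabilities(times, t, theta, d):
    """Probability vector (p_1, ..., p_{l+1}) with r_0 = 0 and r_{l+1} = t."""
    grid = [0.0] + list(times) + [t]
    denom = 1.0 - math.exp(-theta * t ** (d + 1))
    w = [math.exp(-theta * (t - r) ** (d + 1)) for r in grid]
    return [(w[i] - w[i - 1]) / denom for i in range(1, len(grid))]


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_increments(means, probs, rng):
    """One draw of (X_1, ..., X_l) under P-hat, following the recipe in the text."""
    incs = [sample_poisson(mu, rng) for mu in means]
    chosen = rng.choices(range(len(probs)), weights=probs)[0]
    if chosen < len(incs):        # I = l + 1 leaves all increments unchanged
        incs[chosen] += 1
    return incs
```

The means and the probability vector depend only on the time grid, so they are computed once and reused across draws. Because a single extra patch is assigned to one randomly chosen interval, the increments are coupled, which is the source of the dependence noted above.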

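7.5 Proof of Corollary 4.1

$$\begin{aligned}
\hat{P}(T_R^f > \tau) = P(T_R^f > \tau \mid \sigma_2 = t)
&= \int_0^{c_d(s_1) t} P(T_R^f > \tau, R_l(\sigma_2) \in dr \mid \sigma_2 = t) \\
&= \int_0^{c_d(s_1) t} P(T_R^f > \tau \mid R_l(\sigma_2) \in dr, \sigma_2 = t)\, P(R_l(\sigma_2) \in dr \mid \sigma_2 = t),
\end{aligned}$$

where $R_l(t)$ is the radius of the local field surrounding the tumor at time $t$. The result follows from

$$P(T_R^f > \tau \mid R_l(\sigma_2) \in dr, \sigma_2 = t) = \exp\left(-\int_0^{\tau} \eta(r,s)\, ds\right) \tag{29}$$

and the conditional density of $R_l(\sigma_2)$ in (9).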

7.6 Proof of Corollary 4.2

First, we note that

P(TRp>τM(t)=m,σ2=t)=+m-1P(TRp>τR1dr1,,Rm-1drm-1,M(t)=m,σ2=t)P(R1dr1,,Rm-1drm-1M(t)=m,σ2=t), (30)

where i are the radii of the distant field clones, corresponding to their respective areas i defined in Section 3.2. Recalling the definition of η in (12), we find

P(TRp>τR1dr1,,Rm-1drm-1,M(t)=m,σ2dt)=exp(-i=1m-10τη(ri,s)ds). (31)

Recalling the Radon-Nikodym derivative dP̂/dP̃ from Lemma 7.3, it is straight-forward to verify that

dP(t1,tmM(t)=m,σ2=t)dP(t1,tmM(t)=m,σ2>t)-dP^dPP(M(t)=mσ2>t)P(M(t)=mσ2=t)-AEtu2s¯2tϕ(t)m(1-e-θtd+1),

which allows us to derive the following expression (proceeding as in the proof of Corollary 3.3),

P(X1dx1,,Xm-1dxm-1M(t)=m,σ2=t)=i=1m-1g(xi)dxi.

Switching from the clone-areas i back to the corresponding radii i, we find

P(R1dr1,,Rm-1drm-1M(t)=m,σ2=t)=(dγd)m-1i=1m-1rid-1g(ridγd)dri

From this, (31) and (30) we find

P(TRp>τM(t)=m,σ2=t)=(dγdΦ(τ,t))m-1, (32)

Finally, using Lemma 7.2,

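$$\hat{P}(T_R^p > \tau) = \sum_{m=1}^{\infty} P(T_R^p > \tau \mid M(t) = m,\ \sigma_2 = t)\, \hat{P}(M(t) = m) = \exp\left(-\lambda t \phi(t)\left(1 - d\gamma_d\, \Phi(\tau, t)\right)\right)$$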
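The final summation is a Poisson generating-function identity, which can be checked numerically. The sketch below assumes that, under P̂, M(t) − 1 is Poisson with mean μ = λtϕ(t); this is the form of the Lemma 7.2 input that is consistent with the closed-form result, but the lemma itself is not restated in this section. Here `x` stands for dγ_dΦ(τ, t), and both function names are hypothetical.

```python
import math


def survival_by_series(mu, x, terms=200):
    """Sum over m >= 1 of x^(m-1) * P(M = m), assuming M - 1 ~ Poisson(mu)."""
    total = 0.0
    weight = math.exp(-mu)  # P(M = 1) = e^{-mu}
    for k in range(terms):  # k = m - 1
        total += weight * x**k
        weight *= mu / (k + 1)  # Poisson recursion: P(k + 1) = P(k) * mu / (k + 1)
    return total


def survival_closed_form(mu, x):
    """Closed form exp(-mu * (1 - x)) appearing in the last display."""
    return math.exp(-mu * (1.0 - x))
```

For any arbitrary test values, for instance `mu = 3.0` and `x = 0.4`, the truncated series and the closed form should agree to floating-point accuracy.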

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Slaughter Danely P, Southwick Harry W, Smejkal Walter. Field cancerization in oral stratified squamous epithelium: clinical implications of multicentric origin. Cancer. 1953;6(5):963–968. doi: 10.1002/1097-0142(195309)6:5<963::aid-cncr2820060515>3.0.co;2-q.
  • 2. Braakhuis Boudewijn JM, Tabor Maarten P, Kummer J Alain, Leemans C René, Brakenhoff Ruud H. A genetic explanation of Slaughter's concept of field cancerization: evidence and clinical implications. Cancer Research. 2003;63(8):1727–1730.
  • 3. Chai Hong, Brown Robert E. Field effect in cancer: an update. Annals of Clinical & Laboratory Science. 2009;39(4):331–337.
  • 4. Armitage P, Doll R. A two-stage theory of carcinogenesis in relation to the age distribution of human cancer. Br J Cancer. 1957;11. doi: 10.1038/bjc.1957.22.
  • 5. Luebeck G, Moolgavkar S. Multistage carcinogenesis and the incidence of colorectal cancer. PNAS. 2002;99:15095–15100. doi: 10.1073/pnas.222118199.
  • 6. Komarova NL, Sengupta A, Nowak MA. Mutation-selection networks of cancer initiation: Tumor suppressor genes and chromosomal instability. Journal of Theoretical Biology. 2003;223:433–450. doi: 10.1016/s0022-5193(03)00120-6.
  • 7. Michor F, Iwasa Y, Nowak MA. The age incidence of chronic myeloid leukemia can be explained by a one-mutation model. Proc Natl Acad Sci USA. 2006;103:14931–14934. doi: 10.1073/pnas.0607006103.
  • 8. Schweinsberg J. Waiting for n mutations. Electronic Journal of Probability. 2008;13:1442–1478.
  • 9. Iwasa Y, Michor F, Komarova N, Nowak M. Population genetics of tumor suppressor genes. Journal of Theoretical Biology. 2005;233:15–23. doi: 10.1016/j.jtbi.2004.09.001.
  • 10. Wodarz D, Komarova NL. Can loss of apoptosis protect against cancer? Trends Genet. 2007;23:232–237. doi: 10.1016/j.tig.2007.03.005.
  • 11. Durrett R, Schmidt D, Schweinsberg J. A waiting time problem arising from the study of multi-stage carcinogenesis. Annals of Applied Probability. 2009;19:676–718.
  • 12. Foo J, Leder K, Michor F. Stochastic dynamics of cancer initiation. Physical Biology. 2011;8:54–69. doi: 10.1088/1478-3975/8/1/015002.
  • 13. Beerenwinkel Niko, Antal Tibor, Dingli David, Traulsen Arne, Kinzler Kenneth W, Velculescu Victor E, Vogelstein Bert, Nowak Martin A. Genetic progression and the waiting time to cancer. PLoS Computational Biology. 2007;3(11):e225. doi: 10.1371/journal.pcbi.0030225.
  • 14. Komarova N. Spatial stochastic models for cancer initiation and progression. Bull Math Biol. 2006;68:1573–1599. doi: 10.1007/s11538-005-9046-8.
  • 15. Nowak M, Michor F, Iwasa Y. The linear process of somatic evolution. PNAS. 2003;100:14966–14969. doi: 10.1073/pnas.2535419100.
  • 16. Williams T, Bjerknes R. Stochastic model for abnormal clone spread through epithelial basal layer. Nature. 1972;236:19–21. doi: 10.1038/236019a0.
  • 17. Thalhauser C, Lowengrub J, Stupack D, Komarova N. Selection in spatial stochastic models of cancer: Migration as a key modulator of fitness. Biology Direct. 2010;5:21. doi: 10.1186/1745-6150-5-21.
  • 18. Komarova N. Spatial stochastic models of cancer: Fitness, migration, invasion. Mathematical Biosciences and Engineering. 2013;10:761–775. doi: 10.3934/mbe.2013.10.761.
  • 19. Durrett R, Moseley S. A spatial model for tumor growth. Annals of Applied Probability. 2013, in press.
  • 20. Liggett T. Stochastic interacting systems: contact, voter and exclusion processes. Springer; 1999.
  • 21. Bramson M, Griffeath D. On the Williams-Bjerknes tumour growth model: I. Annals of Probability. 1981;9:173–185.
  • 22. Bramson M, Griffeath D. On the Williams-Bjerknes tumor growth model: II. Mathematical Proceedings of the Cambridge Philosophical Society. 1980;88:339–357.
  • 23. Martens Erik A, Hallatschek Oskar. Interfering waves of adaptation promote spatial mixing. Genetics. 2011;189(3):1045–1060. doi: 10.1534/genetics.111.130112.
  • 24. Martens Erik A, Kostadinov Rumen, Maley Carlo C, Hallatschek Oskar. Spatial structure increases the waiting time for cancer. New Journal of Physics. 2011;13(11):115014. doi: 10.1088/1367-2630/13/11/115014.
  • 25. Antal T, Krapivsky PL, Nowak MA. Spatial evolution of tumors with successive driver mutations. ArXiv e-prints. 2013. doi: 10.1103/PhysRevE.92.022705.
  • 26. Durrett R, Foo J, Leder K. Spatial Moran models II. Tumor growth and progression. 2013, in revision.
  • 27. Bertolusso R, Kimmel M. Modeling spatial effects in early carcinogenesis: Stochastic versus deterministic reaction-diffusion systems. Math Mod Nat Phenom. 2012;7:245–260.
  • 28. Weinberg RA. The Biology of Cancer [With DVD ROM]. Taylor & Francis Group; 2013.
  • 29. Klein AM, Doupe DP, Jones PH, Simons BD. Mechanism of murine epidermal maintenance: Cell division and the voter model. Physical Review E. 2007;77(3). doi: 10.1103/PhysRevE.77.031907.
  • 30. Liggett TM. Interacting Particle Systems. Classics in Mathematics Series. Springer-Verlag Berlin and Heidelberg GmbH & Company KG; 2005.
  • 31. Durrett R. Essentials of Stochastic Processes. Springer Texts in Statistics. Springer; 2012.
  • 32. Knudson A. Two genetic hits (more or less) to cancer. Nature Reviews Cancer. 2001;1:157–161. doi: 10.1038/35101031.
  • 33. Attolini Camille Stephan-Otto, Michor Franziska. Evolutionary theory of cancer. Annals of the New York Academy of Sciences. 2009;1168(1):23–51. doi: 10.1111/j.1749-6632.2009.04880.x.
  • 34. Braakhuis Boudewijn JM, Tabor Maarten P, Leemans C René, van der Waal Isaac, Snow Gordon B, Brakenhoff Ruud H. Second primary tumors and field cancerization in oral and oropharyngeal cancer: molecular techniques provide new insights and definitions. Head & Neck. 2002;24(2):198–206. doi: 10.1002/hed.10042.
  • 35. Leemans CR, Braakhuis BJM, Brakenhoff RH. The molecular biology of head and neck cancer. Nature Reviews Cancer. 2011;11:9–22. doi: 10.1038/nrc2982.
