PLOS One. 2023 Jun 14;18(6):e0286499. doi: 10.1371/journal.pone.0286499

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

Kouki Uchinomiya 1,*, Masanori Tomita 1
Editor: Gayle E Woloschak 2
PMCID: PMC10266611  PMID: 37315031

Abstract

Replication errors influence mutations, and thus, lifetime cancer risk can be explained by the number of stem-cell divisions. Additionally, mutagens also affect cancer risk, for instance, high-dose radiation exposure increases lifetime cancer risk. However, the influence of low-dose radiation exposure is still unclear because this influence, if any, is very slight. We can assess the minimal influence of the mutagen by virtually comparing the states with and without mutagen using a mathematical model. Here, we constructed a mathematical model to assess the influence of replication errors and mutagens on cancer risk. In our model, replication errors occur with a certain probability during cell division. Mutagens cause mutations at a constant rate. Cell division is arrested when the number of cells reaches the capacity of the cell pool. When the number of cells decreases because of cell death or other reasons, cells resume division. It was assumed that the mutations of cancer driver genes occur stochastically with each mutation and that cancer occurs when the number of cancer driver gene mutations exceeds a certain threshold. We approximated the number of mutations caused by errors and mutagens. Then, we examined whether cancer registry data on cancer risk can be explained only through replication errors. Although the risk of leukemia was not fitted to the model, the risks of esophageal, liver, thyroid, pancreatic, colon, breast, and prostate cancers were explained only by replication errors. Even if the risk was explained by replication errors, the estimated parameters did not always agree with previously reported values. For example, the estimated number of cancer driver genes in lung cancer was larger than the previously reported values. This discrepancy can be partly resolved by assuming the influence of mutagen. First, the influence of mutagens was analyzed using various parameters. 
The model predicted that the influence of mutagens appears earlier when the turnover rate of the tissue is higher and fewer mutations of cancer driver genes are necessary for carcinogenesis. Next, the parameters of lung cancer were re-estimated assuming the influence of mutagens. The estimated parameters were closer to the previously reported values than when considering only replication errors. Although it may be useful to explain cancer risk by replication errors, it would be biologically more plausible to consider mutagens in cancers in which the effects of mutagens are apparent.

Introduction

Nordling showed that the age-dependent cancer death rate becomes a straight line on a doubly logarithmic plot [1]. This observation was formalized mathematically as the multistage theory of carcinogenesis [2]. This theory assumes that several successive mutations cause carcinogenesis and that each mutation arises with a certain probability depending on time. Then, the logarithm of the incidence rate increases linearly with the logarithm of time, such as age. The slope of the line on the doubly logarithmic plot is used to estimate the number of mutations necessary for carcinogenesis. It was estimated that six or seven successive mutations are necessary for cancer mortality if the probability of mutation remains constant throughout life. Nowadays, it is known that the number of driver mutations in a tumor follows a certain distribution and can be smaller than seven [3]. In addition, a different mathematical model showed that the risk of chronic myeloid leukemia could be explained by a single mutation [4]. Although molecular genetic data and mathematical models have been updated, the fundamental concept that one or more mutations cause carcinogenesis remains predominant. One of the important points of the multistage theory is that mutations arise depending on time. In other words, the number of mutations increases in an age-dependent manner with the number of cell divisions. Tomasetti and Vogelstein showed that cancer risk variation is explained by the number of stem-cell divisions [5]. They estimated the total number of stem-cell divisions and investigated its correlation with the lifetime risk of different types of cancers. They found that the risk of cancers and the number of cell divisions were strongly correlated, implying that cancer driver gene mutations are caused by cell divisions. Therefore, based on the multistage theory of carcinogenesis, not only the lifetime risk of cancer but also age-dependent cancer risk might be explained by replication errors. One study focused on cell division to explain the risk of colorectal cancer, assuming that the rate of mutation is constant [6]. If the division rate of cells is constant, this assumption would be reasonable. However, the rate of cell division can change with age [5, 7]. In particular, cell division during the development of tissues might affect cancer risk [5]. Therefore, it is prudent to formulate a model in which the rate of cell proliferation varies with age.

Even if cancer risk is explained by the number of cell divisions, some mutagens such as ionizing radiation and smoking influence it. A study showed that the number of stem-cell divisions is not correlated with radiation- or smoking-associated cancer risk [8]. For example, a life-span study of atomic bomb survivors showed that the risk of cancer mortality increased significantly for some tissues such as the stomach, lung, liver, colon, breast, and gallbladder [9]. Stem and early progenitor cells are recognized as important target cells in radiation carcinogenesis, and ionizing radiation contributes to the carcinogenic process by adding a few mutations [10]. Since the accident at the Fukushima Daiichi Nuclear Power Plant, the carcinogenic effects of low-dose and low-dose-rate radiation have become a major public concern in Japan. Although evaluation of the biological effects of low dose and/or low dose-rate ionizing radiation has important implications in radiation protection, the magnitude of cancer risk associated with low dose and/or low dose-rate radiation exposure is still unclear [11]. One of the difficulties in estimating the influence of a very low dose is that its impact on cancer risk is, if any, very low. To discuss a low influence of mutagens, the baseline of cancer risk must be considered. As mentioned above, the influence of replication errors on cancer risk may be the baseline. Considering the relationship between replication error and cancer risk using a simple model and comparing it with actual data will help identify the baseline. Examining how mutagens affect this baseline will thus help us evaluate it, even if the effect of a mutagen is very small.

Here, we constructed a simple mathematical model to assess the influence of cell division and mutagens on cancer risk. The model assumed cell division phases of growth and stability. In the growth phase, stem cells increase exponentially until the number of cells reaches the capacity of the stem-cell pool. The stable phase starts after that. Stem cells divide only when the cells are decreased in the stable phase. In these processes, mutations can arise due to replication errors and mutagens. Cancer risk was assumed to depend on mutations. Next, we examined whether age-dependent cancer risk can be explained only by replication errors through a comparison of the model with cancer registry data. Finally, the influences of mutagens are discussed qualitatively by varying the parameters, and lung cancer data were reanalyzed.

Methods

Overview of the mathematical model

Cells generally do not proliferate indefinitely. Some studies assume that stem cells divide frequently until the tissue is fully developed and that the frequency of cell division decreases after development [5, 12]. We assumed that tissues exhibit growth and stable phases. During the growth phase, stem cells divide at a constant rate until the total cell number reaches the full size of the stem-cell pool. Then, the growth phase ends, and the stable phase starts. In the stable phase, stem cells do not divide as long as the number of cells is maintained. Stem cells can be removed from the cell pool as a result of cell death and/or differentiation; however, we did not distinguish between these here. When cells are removed, the remaining cells divide at a constant rate until the total cell number again reaches the full size of the cell pool. We assumed that mutations are classified into errors and lesions. Errors correspond to DNA replication errors, which can arise during stem-cell division. We assumed that the number of errors arising when a stem cell divides follows a Poisson distribution with parameter λ1. By contrast, lesions arise independently of cell division, thus reflecting the effects of mutagens. We assumed that the number of lesions per cell per unit time follows a Poisson distribution with parameter λ2. When λ2 is independent of time, mutagens exert a chronic influence from the beginning. To discuss a temporary influence of mutagens, λ2 should be a function of time. In this paper, we assumed that λ2 is constant, and we examined the accumulation of errors and lesions. A schematic diagram is shown in Fig 1.

Fig 1. Schematic diagram of the model.


This process is referred to as growth phase, followed by a stable phase. In the stable phase, cells can divide only when their number decreases, such as when they are eliminated. Errors can occur during division, and lesions occur with a certain probability at any time, regardless of division.

Simulation using the Gillespie algorithm

As the number of stem cells is not constant, we used the Gillespie algorithm [13–15] to simulate changes in the numbers of mutations over time (S1 File). Let n{i,j} be the number of stem cells carrying i errors and j lesions. We express the condition of the stem-cell pool as the set of n{i,j}: (n{0,0}, n{1,0}, …, n{i,j}, …, n{I,J}), where I and J are the maximum numbers of errors and lesions, respectively. Although these limits are necessary for the simulation, their influence can be ignored by setting I and J sufficiently large. The condition of the tissue varies with stochastic events such as stem-cell division, formation of lesions, and cell removal. The probability of each event is its rate normalized by the sum of the rates of all possible events [15]. First, we show the rate of each event in the growth phase. Stem cells divide at a rate r, and cell removal does not occur during the growth phase. As the maximum number of errors is I, the rate at which a stem cell carrying i errors and j lesions divides with k errors is

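$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i+k,j\}}+1, \ldots, n_{\{I,J\}}) = \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} \quad \text{if } k \le I-i, \tag{1}$$

$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{I,j\}}+1, \ldots, n_{\{I,J\}}) = \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} \quad \text{if } k > I-i. \tag{2}$$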

Here, X → Y denotes the rate at which the condition changes from X to Y. Cells can also suffer lesions without cell division. The rate at which a cell carrying i errors and j lesions suffers l additional lesions is written as follows:

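$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,j+l\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,j+l\}}+1, \ldots, n_{\{I,J\}}) = \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l \le J-j, \tag{3}$$

$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,J\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,J\}}+1, \ldots, n_{\{I,J\}}) = \sum_{l>J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l > J-j. \tag{4}$$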

Then, the sum of the rates of all possible events in the growth phase, Γg, is

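$$\Gamma_g = \sum_{i=0}^{I} \sum_{j=0}^{J} \left[ \sum_{k=0}^{I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} + \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} + \sum_{l=0}^{J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} + \sum_{l>J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \right] = \sum_{i=0}^{I} \sum_{j=0}^{J} \left[ r\, n_{\{i,j\}} + n_{\{i,j\}} \right]. \tag{5}$$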

Denoting the probability that condition X changes to Y as Pr[X → Y], we can write the probability of each event as follows:

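$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i+k,j\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_g} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} \quad \text{if } k \le I-i, \tag{6}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{I,j\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_g} \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, r\, n_{\{i,j\}} \quad \text{if } k > I-i, \tag{7}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,j+l\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,j+l\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_g} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l \le J-j, \tag{8}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,J\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,J\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_g} \sum_{l>J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l > J-j. \tag{9}$$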

Let NL be the capacity of the stem-cell pool. In the stable phase, cells do not divide when there are NL stem cells, and cells may be removed. We denote a{i,j} as the rate of cell division of a stem cell, which carries i errors and j lesions in the stable phase. As the division rate is 0 when there are NL cells,

$$a_{\{i,j\}} = \begin{cases} 0 & \text{if } N = N_L \\ a_{d\{i,j\}} & \text{if } N < N_L, \end{cases} \tag{10}$$

where N is the total cell number, which is defined as

$$N = \sum_{i=0}^{I} \sum_{j=0}^{J} n_{\{i,j\}}. \tag{11}$$

When the total cell number is smaller than NL, the stem cells divide at a rate of ad{i,j}. Similar to the growth phase, the rate at which a stem cell, which carries i errors and j lesions, divides with k errors is

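$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i+k,j\}}+1, \ldots, n_{\{I,J\}}) = \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} \quad \text{if } k \le I-i, \tag{12}$$

$$(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{I,j\}}+1, \ldots, n_{\{I,J\}}) = \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} \quad \text{if } k > I-i. \tag{13}$$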

The rate of cell removal is

$$(n_{\{0,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{I,J\}}) = b_{\{i,j\}}\, n_{\{i,j\}}, \tag{14}$$

where b{i,j} is the removal rate of a stem cell carrying i errors and j lesions.

As mutations caused by lesions arise in the same way as during the growth phase, the sum of the rates of all possible events in the stable phase, Γs, is

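$$\Gamma_s = \sum_{i=0}^{I} \sum_{j=0}^{J} \left[ \sum_{k=0}^{I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} + \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} + b_{\{i,j\}}\, n_{\{i,j\}} + \sum_{l=0}^{J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} + \sum_{l>J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \right] = \sum_{i=0}^{I} \sum_{j=0}^{J} \left[ a_{\{i,j\}}\, n_{\{i,j\}} + b_{\{i,j\}}\, n_{\{i,j\}} + n_{\{i,j\}} \right]. \tag{15}$$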

Then, the probability of each event in the stable phase can be denoted as follows:

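$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i+k,j\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_s} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} \quad \text{if } k \le I-i, \tag{16}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{I,j\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_s} \sum_{k>I-i} \frac{\lambda_1^{k} e^{-\lambda_1}}{k!}\, a_{\{i,j\}}\, n_{\{i,j\}} \quad \text{if } k > I-i, \tag{17}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,j+l\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,j+l\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_s} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l \le J-j, \tag{18}$$

$$\Pr\left[(n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{i,J\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, n_{\{1,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{i,J\}}+1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_s} \sum_{l>J-j} \frac{\lambda_2^{l} e^{-\lambda_2}}{l!}\, n_{\{i,j\}} \quad \text{if } l > J-j, \tag{19}$$

and

$$\Pr\left[(n_{\{0,0\}}, \ldots, n_{\{i,j\}}, \ldots, n_{\{I,J\}}) \to (n_{\{0,0\}}, \ldots, n_{\{i,j\}}-1, \ldots, n_{\{I,J\}})\right] = \frac{1}{\Gamma_s}\, b_{\{i,j\}}\, n_{\{i,j\}}. \tag{20}$$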

Although a{i,j} and b{i,j} can depend on the number of mutations if mutations affect the character of cells, we assumed that those parameters are independent of the number of mutations, a = a{i,j}, b = b{i,j}, in the following analysis, for simplicity.
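The event rates above can be turned into a small simulation. The following Python sketch is illustrative only and is not the authors' S1 File: the function names, the per-cell list representation, and the omission of the caps I and J are assumptions made here. Because a and b are taken to be independent of the mutation counts, picking a cell uniformly at random is equivalent to picking a class {i,j} with weight n{i,j}.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(n0=10, n_cap=100, r=0.15, a=0.16, b=0.01,
             lam1=0.01, lam2=0.001, t_end=100.0, seed=1):
    """Gillespie-style simulation of the stem-cell pool up to time t_end.

    Assumption: the caps I and J are dropped (counts are unbounded here).
    """
    rng = random.Random(seed)
    cells = [[0, 0] for _ in range(n0)]      # one [errors, lesions] pair per stem cell
    t, growth, divisions = 0.0, True, 0
    while cells:
        n = len(cells)
        if growth and n >= n_cap:
            growth = False                   # pool is full: stable phase starts
        if growth:
            div_rate, rem_rate = r * n, 0.0  # no removal in the growth phase
        else:
            div_rate = a * n if n < n_cap else 0.0   # Eq (10)
            rem_rate = b * n                          # Eq (14)
        les_rate = float(n)                  # lesion events, total rate as in Eqs (5), (15)
        total = div_rate + rem_rate + les_rate
        t += rng.expovariate(total)          # exponential waiting time to the next event
        if t > t_end:
            break
        u = rng.random() * total             # choose an event in proportion to its rate
        i = rng.randrange(n)                 # uniform cell choice (rates are per cell)
        if u < div_rate:
            # the daughter inherits the lesions and gains Poisson(lam1) errors
            cells.append([cells[i][0] + sample_poisson(lam1, rng), cells[i][1]])
            divisions += 1
        elif u < div_rate + rem_rate:
            cells[i] = cells[-1]             # swap-remove the chosen cell
            cells.pop()
        else:
            cells[i][1] += sample_poisson(lam2, rng)
    return {"N": len(cells), "E": sum(c[0] for c in cells),
            "W": sum(c[1] for c in cells), "divisions": divisions}
```

Calling `simulate()` with the default arguments uses the parameter values quoted in the caption of Fig 2; averaging `E` and `W` over several seeds gives quantities comparable to the totals E(t) and W(t) defined in the next section.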

Multi-stage carcinogenesis model

Based on the multistage carcinogenesis theory, we assumed that cancer arises through successive mutations. We did not distinguish between mutations arising from errors and those arising from lesions, and we assumed that mutations of g cancer driver genes cause carcinogenesis. Strictly, the accumulation of mutations should be considered for each stem cell; however, for simplicity, we considered the average number of mutations per cell. The total numbers of errors and lesions in the tissue at time t were calculated, respectively, as follows:

$$E(t) = \sum_{i=0}^{I} \sum_{j=0}^{J} i\, n_{\{i,j\}}(t), \tag{21}$$

$$W(t) = \sum_{i=0}^{I} \sum_{j=0}^{J} j\, n_{\{i,j\}}(t), \tag{22}$$

where the number of stem cells that have i errors and j lesions at time t is denoted as n{i,j}(t). The average number of mutations in a stem cell is

$$m(t) = \frac{E(t) + W(t)}{N(t)}, \tag{23}$$

where N(t) is the total number of stem cells at time t. The probability of having g or more mutations of cancer driver genes equals one minus the probability of the number of mutations smaller than g. Let p be the probability that one cancer driver gene mutation occurs per mutation. By using the cumulative distribution function of the binomial distribution, when there are m(t) mutations, the cumulative cancer risk is

$$G(t) = 1 - \sum_{k=0}^{g-1} \binom{m(t)}{k} p^{k} (1-p)^{m(t)-k}. \tag{24}$$

The second term is the probability that the number of cancer driver gene mutations is smaller than g. If m(t) is sufficiently high and p is sufficiently low, binomial distribution can be approximated using the Poisson distribution. Then, G can be described as follows:

$$G(t) = 1 - \sum_{k=0}^{g-1} \frac{(p\,m(t))^{k} e^{-p\,m(t)}}{k!}. \tag{25}$$

In the following, we use Eq (25) for simplifying the calculation.
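As a quick numerical illustration of Eqs (24) and (25), the sketch below evaluates both forms of the cumulative risk for the same inputs. The function names and the example values of m, p, and g are hypothetical and chosen only to show that the two expressions agree when m is large and p is small; the binomial form additionally assumes that m is a non-negative integer.

```python
import math


def risk_binomial(m, p, g):
    """Eq (24): probability of g or more driver mutations among m mutations (m an integer)."""
    below_g = sum(math.comb(m, k) * p**k * (1.0 - p)**(m - k) for k in range(g))
    return 1.0 - below_g


def risk_poisson(m, p, g):
    """Eq (25): Poisson approximation with mean p*m."""
    mu = p * m
    below_g = sum(mu**k * math.exp(-mu) / math.factorial(k) for k in range(g))
    return 1.0 - below_g


if __name__ == "__main__":
    # hypothetical values: many mutations, small per-mutation driver probability
    print(risk_binomial(10_000, 0.0005, 3), risk_poisson(10_000, 0.0005, 3))
```

The Poisson form depends on m and p only through the product pm, which is why these two quantities cannot be estimated separately from risk data alone.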

Results

Simulation and approximation of the accumulation of errors and lesions

First, the accumulation of errors and lesions was simulated using the Gillespie algorithm. An example of the simulation is shown in Fig 2. The number of stem cells increases exponentially in the growth phase until it reaches NL (Fig 2A). When the number of stem cells reaches NL, the growth phase ends, and the stable phase starts. When stem cells are removed, cell division resumes (Fig 2A). The time series of the number of errors and lesions are shown in Fig 2B. Then, we approximate the time course of the accumulation of errors and lesions to discuss cancer risk.

Fig 2. Results of a simulation by the Gillespie algorithm.


(a) Time course of the number of cells. The label n{i,j} is the number of cells carrying i errors and j lesions. The total number of cells is denoted as N(t). (b) Time course of the total number of errors E(t) and lesions W(t). Parameter values are N0 = 10, NL = 100, r = 0.15, ad = 0.16, b = 0.01, λ1 = 0.01, and λ2 = 0.001. The results were output when reaching a certain number of divisions and were combined.

We consider the average accumulated mutations in the tissue. In the growth phase, the number of stem cells increases at the growth rate r. Let N0 be the initial stem-cell number and d the number of cell divisions up to time t. As each division adds one cell, the following equation holds:

$$N_0 \exp[rt] = N_0 + d. \tag{26}$$

Rearranging this equation, we can express d as the function of time as follows:

$$d = N_0 (\exp[rt] - 1). \tag{27}$$

As the average number of errors per cell division is λ1, the number of errors, E, in the growth phase is

$$E = \lambda_1 d = \lambda_1 N_0 (\exp[rt] - 1). \tag{28}$$

By contrast, the average number of lesions per stem cell per unit time is λ2. The number of stem cells at time t is N0 exp[rt] in the growth phase. Together, these mean that W increases at the rate λ2N0 exp[rt]. Considering that there is no lesion in the initial state,

$$\frac{dW}{dt} = \lambda_2 N_0 \exp[rt] \;\Rightarrow\; W = \frac{\lambda_2 N_0}{r} (\exp[rt] - 1). \tag{29}$$

From Eq (27), t can be written as the function of d as follows:

$$t = \frac{1}{r} \log\left(\frac{d}{N_0} + 1\right). \tag{30}$$

Substituting Eq (30) into Eq (29) the number of lesions can be written as the function of d as follows:

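$$W = \frac{\lambda_2 N_0}{r} (\exp[rt] - 1) = \frac{\lambda_2 N_0}{r} \left(\left(\frac{d}{N_0} + 1\right) - 1\right) = \frac{\lambda_2}{r} d. \tag{31}$$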

In the stable phase, the total number of cell divisions, d, is the sum of the numbers of cell divisions in the growth and stable phases. The former is simply NL − N0. To estimate the number of cell divisions in the stable phase, the waiting time until cell division needs to be taken into account. In the stable phase, stem cells do not divide when the number of stem cells is NL. The waiting time until cell division is therefore the sum of the waiting time until a cell is removed and the subsequent waiting time until a cell divides. As the rate of cell removal is b per cell, the waiting time until cell removal is 1/(bNL). In this analysis, we assume that the vacancy left by a removed stem cell is filled before another cell is removed. Then, the waiting time until the division that fills the vacancy is approximated as 1/(a(NL − 1)), because NL − 1 cells remain. Therefore, the waiting time until cell division in the stable phase is calculated as follows:

$$\frac{1}{b N_L} + \frac{1}{a (N_L - 1)} = \frac{a (N_L - 1) + b N_L}{a b N_L (N_L - 1)}. \tag{32}$$

Letting dT be the number of cell divisions during T in the stable phase, the following equation holds:

$$d_T = \frac{T}{\dfrac{a (N_L - 1) + b N_L}{a b N_L (N_L - 1)}}. \tag{33}$$

When growth phase ends at t′, the total number of cell divisions at t>t′ is

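$$N_L - N_0 + d_{t-t'} = N_L - N_0 + \frac{t - t'}{\dfrac{a (N_L - 1) + b N_L}{a b N_L (N_L - 1)}} = N_L - N_0 + (t - t') \frac{a b N_L (N_L - 1)}{a (N_L - 1) + b N_L}. \tag{34}$$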

As the growth phase ends when the total stem-cell number reaches NL, t′ can be calculated as follows:

$$N_L = N_0 \exp[rt'] \;\Rightarrow\; t' = \frac{1}{r} \log \frac{N_L}{N_0}. \tag{35}$$

Substituting this equation into Eq (34), the total number of cell divisions d at t>t′ is

$$d = N_L - N_0 + \left(t - \frac{1}{r} \log \frac{N_L}{N_0}\right) \frac{a b N_L (N_L - 1)}{a (N_L - 1) + b N_L}. \tag{36}$$

Then, the number of errors, E, in the stable phase is

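$$E = d \lambda_1 = \lambda_1 \left\{ N_L - N_0 + \left(t - \frac{1}{r} \log \frac{N_L}{N_0}\right) \frac{a b N_L (N_L - 1)}{a (N_L - 1) + b N_L} \right\}. \tag{37}$$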

As for the number of lesions in the growth phase, it is convenient to express t as a function of d. From Eq (36), the relationship between d and t in the stable phase is

$$t = \frac{1}{r} \log \frac{N_L}{N_0} + \frac{(d - N_L + N_0)\,(a (N_L - 1) + b N_L)}{a b N_L (N_L - 1)}. \tag{38}$$

Let W′ be the number of lesions at the end of the growth phase i.e. W at t = t′. From Eqs (31) and (35), W′ can be calculated as

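$$W' = \frac{\lambda_2 N_0}{r} \left( \exp\left[ r \left( \frac{1}{r} \log \frac{N_L}{N_0} \right) \right] - 1 \right) = \frac{\lambda_2}{r} (N_L - N_0). \tag{39}$$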

The number of lesions that emerge during the stable phase is the product of NL, λ2, and the time elapsed since entering the stable phase. Adding the lesions carried over from the growth phase, the number of lesions in the stable phase is

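$$W = N_L \lambda_2 (t - t') + W' = N_L \lambda_2 \left\{ \frac{(d - N_L + N_0)\,(a (N_L - 1) + b N_L)}{a b N_L (N_L - 1)} \right\} + \frac{\lambda_2}{r} (N_L - N_0). \tag{40}$$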

Fig 3 shows a comparison between the results of simulations and approximations. The approximations explain the simulation well. Next, we discuss cancer risk using these approximations.
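The approximations of this section can be collected into a few functions of time. The Python sketch below is a direct transcription of Eqs (27)–(29), (32), (35)–(37), and (40); the function and argument names are assumptions introduced here for illustration, with `n_cap` standing for NL.

```python
import math


def growth_end_time(n0, n_cap, r):
    """Eq (35): time t' at which the pool reaches its capacity."""
    return math.log(n_cap / n0) / r


def stable_division_rate(a, b, n_cap):
    """Divisions per unit time in the stable phase: reciprocal of the waiting time in Eq (32)."""
    return a * b * n_cap * (n_cap - 1) / (a * (n_cap - 1) + b * n_cap)


def divisions(t, n0, n_cap, r, a, b):
    """Total number of cell divisions d up to time t: Eq (27) before t', Eq (36) after."""
    t_prime = growth_end_time(n0, n_cap, r)
    if t <= t_prime:
        return n0 * (math.exp(r * t) - 1.0)
    return n_cap - n0 + (t - t_prime) * stable_division_rate(a, b, n_cap)


def errors(t, lam1, n0, n_cap, r, a, b):
    """Total number of errors E: Eqs (28) and (37)."""
    return lam1 * divisions(t, n0, n_cap, r, a, b)


def lesions(t, lam2, n0, n_cap, r):
    """Total number of lesions W: Eq (29) before t', Eq (40) after."""
    t_prime = growth_end_time(n0, n_cap, r)
    if t <= t_prime:
        return lam2 * n0 / r * (math.exp(r * t) - 1.0)
    return n_cap * lam2 * (t - t_prime) + lam2 / r * (n_cap - n0)
```

With the parameter values of Fig 3 (N0 = 10, NL = 100, r = 0.15, a = 0.16, b = 0.01), `growth_end_time` gives t′ = log(10)/0.15 ≈ 15.4, and both `divisions` and `lesions` are continuous at t′, as expected from Eqs (36) and (39).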

Fig 3. Comparison of simulation and analytical approximations.


The horizontal axis indicates the number of cell divisions, and the vertical axis indicates the number of errors and lesions. Dots are the averages over 50 simulation runs. Lines are the approximations in Eqs 28, 29, 37, and 40. Blue and orange lines show errors and lesions, respectively. Parameter values are N0 = 10, NL = 100, r = 0.15, ad = 0.16, b = 0.01, λ1 = 0.01, and λ2 = 0.001.

Fitting to cancer registry data

We consider the cumulative risk of cancer as mutations accumulate during the lifetime. If cancer risk is explained by stem-cell divisions, the risk should be explained only by replication errors. In other words, the presented model can fit cancer registry data of cancer risk even if λ2 = 0. We applied our model to cancer registry data of cumulative cancer incidence risk in 2015 obtained from the Cancer Information Service, National Cancer Center, Japan (Ministry of Health, Labour and Welfare, National Cancer Registry), accessible under https://ganjoho.jp/reg_stat/index.html. We assumed that a tissue originates from only a single stem cell and that the number of stem cells reaches NL at the age of 18, i.e., the growth phase continues until the age of 18. Therefore, the rate of cell division in the growth phase r is

$$N_L = N_0 \exp[18 r] \;\Rightarrow\; r = \frac{1}{18} \log \frac{N_L}{N_0} = \frac{1}{18} \log N_L. \tag{41}$$

The unit of r in this formula is cell divisions per year. When we assume that cell division occurs immediately after a stem cell is removed, a ≫ b, the waiting time until cell division in the stable phase, given in Eq (32), is 1/(bNL). This means that the proliferation rate in the stable phase equals the cell-removal rate b. The number of stem cells in the tissue and the number of divisions of each stem cell per year were estimated in previous studies [5, 7] (Table 1). As we assume λ2 = 0, we should estimate λ1, p, and g. Explicitly denoting d as a function of t as d(t), from Eqs (23), (28), and (37), pm(t) = λ1p d(t)/N(t) if λ2 = 0. Therefore, λ1p is regarded as a single parameter in Eq (25) when λ2 = 0. This parameter can be understood as the average number of cancer driver gene mutations per cell division. Therefore, we have to estimate the values of λ1p and g. Whereas λ1p is a real number, g is an integer. We therefore varied the value of g and, for each g, calculated the least-squares value with respect to λ1p. Then, we chose the parameter set of λ1p and g that minimized the least-squares value (S1 Fig). The curves of G(t) were fitted to cancer registry data of esophageal cancer (Fig 4A), leukemia (Fig 4B), liver cancer (Fig 4C), lung cancer (Fig 4D), thyroid cancer (Fig 4E), pancreatic cancer (Fig 4F), colon cancer (Fig 4G), breast cancer (Fig 4H), and prostate cancer (Fig 4I). The curve could not be fitted to the risk of leukemia because the risk increases from a young age (Fig 4B). Except for leukemia, the curve fitted the data. This implies that age-dependent cancer risk can be explained well by replication errors. However, it should be noted that the estimated parameter values do not always agree with the previously reported values, as elaborated in the Discussion. For example, g and λ1p of male lung cancer are very large. It is well known that smoking increases lung cancer risk. Therefore, it may be possible to reduce this discrepancy by considering the influence of mutagens.
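The fitting procedure described above (vary the integer g, minimize the squared error over λ1p for each g, keep the best pair) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the coarse grid followed by a golden-section refinement is a search strategy chosen here, the function names are hypothetical, and the "data" in the usage example are synthetic values generated from the model itself rather than registry data. The number of divisions per cell, d(t)/N(t), is taken from Eqs (27) and (36) with N0 = 1, a growth phase of 18 years, and the limit a ≫ b.

```python
import math


def poisson_tail(mu, g):
    """Eq (25): probability of g or more events for a Poisson mean mu."""
    term, cdf = math.exp(-mu), 0.0
    for k in range(g):
        cdf += term
        term *= mu / (k + 1)
    return 1.0 - cdf


def divisions_per_cell(age, b, n_cap, n0=1.0, t_growth=18.0):
    """d(t)/N(t) from Eqs (27) and (36), assuming a >> b (stable-phase rate b per cell)."""
    r = math.log(n_cap / n0) / t_growth
    if age <= t_growth:
        return 1.0 - math.exp(-r * age)
    return (n_cap - n0) / n_cap + b * (age - t_growth)


def sse(lam1p, g, ages, risks, b, n_cap):
    """Sum of squared differences between G(t) and the observed cumulative risks."""
    return sum((poisson_tail(lam1p * divisions_per_cell(t, b, n_cap), g) - y) ** 2
               for t, y in zip(ages, risks))


def fit(ages, risks, b, n_cap, g_values=range(1, 16), lo=-8.0, hi=1.0, grid=181):
    """For each integer g, minimize the squared error over log10(lam1p); return the best pair."""
    step = (hi - lo) / (grid - 1)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    best = None
    for g in g_values:
        def loss(u, g=g):
            return sse(10.0 ** u, g, ages, risks, b, n_cap)
        u0 = min((lo + i * step for i in range(grid)), key=loss)   # coarse scan
        left, right = u0 - step, u0 + step
        for _ in range(60):                                        # golden-section refinement
            c = right - phi * (right - left)
            d = left + phi * (right - left)
            if loss(c) < loss(d):
                right = d
            else:
                left = c
        u = 0.5 * (left + right)
        candidate = (loss(u), g, 10.0 ** u)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return {"g": best[1], "lam1p": best[2], "sse": best[0]}


if __name__ == "__main__":
    # Synthetic check: generate risks with assumed g = 5, lam1p = 4e-4 and the
    # colon values b = 73, NL = 2.00e8 from Table 1, then recover the parameters.
    ages = list(range(5, 90, 5))
    risks = [poisson_tail(4e-4 * divisions_per_cell(t, 73.0, 2.0e8), 5) for t in ages]
    print(fit(ages, risks, 73.0, 2.0e8))
```

Searching on a logarithmic scale is a design choice made here because plausible values of λ1p span several orders of magnitude.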

Table 1. List of parameters for the fitting to cancer registry data shown in Fig 4.

| Cancer type | Number of divisions of each stem cell per year (b) | Number of normal stem cells in tissue of origin (NL) | Cell division rate in growth phase (r = log(NL)/18) |
| --- | --- | --- | --- |
| Esophageal cancer | 33.2* | 6.6528 × 10^6* | 0.87 |
| Leukemia | 12* | 1.35 × 10^8* | 1.04 |
| Liver cancer | 0.91* | 3.01 × 10^9* | 1.21 |
| Lung cancer | 0.07* | 1.22 × 10^9* | 1.16 |
| Thyroid cancer | 0.087* | 6.5 × 10^7* | 1.00 |
| Pancreatic cancer | 1* | 4.18 × 10^9* | 1.23 |
| Colon cancer | 73* | 2.00 × 10^8* | 1.06 |
| Breast cancer | 4.32** | 8.7 × 10^9** | 1.27 |
| Prostate cancer | 2.99** | 2.1 × 10^8** | 1.06 |

*Values from [5]

**Values from [7]
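The last column of Table 1 follows from Eq (41) with N0 = 1 and a growth phase of 18 years. A short Python check, with hypothetical variable names, reproduces the tabulated values of r from the stem-cell numbers NL (natural logarithm):

```python
import math

# Stem-cell numbers NL and tabulated r values, copied from Table 1
STEM_CELLS = {
    "Esophageal cancer": 6.6528e6, "Leukemia": 1.35e8, "Liver cancer": 3.01e9,
    "Lung cancer": 1.22e9, "Thyroid cancer": 6.5e7, "Pancreatic cancer": 4.18e9,
    "Colon cancer": 2.00e8, "Breast cancer": 8.7e9, "Prostate cancer": 2.1e8,
}
TABLE_R = {
    "Esophageal cancer": 0.87, "Leukemia": 1.04, "Liver cancer": 1.21,
    "Lung cancer": 1.16, "Thyroid cancer": 1.00, "Pancreatic cancer": 1.23,
    "Colon cancer": 1.06, "Breast cancer": 1.27, "Prostate cancer": 1.06,
}


def growth_rate(n_cap, n0=1.0, years=18.0):
    """Eq (41): r = log(NL / N0) / 18, in cell divisions per year."""
    return math.log(n_cap / n0) / years


if __name__ == "__main__":
    for name, n_cap in STEM_CELLS.items():
        print(f"{name}: r = {growth_rate(n_cap):.2f} (Table 1: {TABLE_R[name]:.2f})")
```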

Fig 4. Comparison of model and cancer registry data of age-dependent cancer risk.


Cancer risk G(t) was compared with the cumulative risk of (a) esophageal cancer, (b) leukemia, (c) liver cancer, (d) lung cancer, (e) thyroid cancer, (f) pancreatic cancer, (g) colon cancer, (h) breast cancer, and (i) prostate cancer. Blue and red lines are the cases of males and females, respectively. Dots indicate cancer registry data of cumulative risk of cancer for the age range of 5–85. Curves are the fits of G(t) to the data when λ2 = 0. Fitted parameters are shown in each figure. The other parameter values are summarized in Table 1.

Influence of lesions on carcinogenesis

Finally, we provide a qualitative hypothesis for the influence of lesions on cancer risk by analyzing the model using different parameters. We used risk difference (RD) as a quantitative measure for the influence of lesions [16]. Let G1(t) and G0(t) be the cancer risks at time t in the case of λ2 ≠ 0 and λ2 = 0, respectively. Then, RD at time t is defined as

$$RD(t) = G_1(t) - G_0(t). \tag{42}$$

Another measure, the risk ratio (RR), defined as RR = G1/G0, is widely used in the field of public health [16]. However, we consider only RD because G0 can be 0 in this theoretical analysis. Since the absolute value of RD depends on unknown biological parameters, such as sensitivity to mutagens, we assumed some parameter sets and obtained qualitative results. The typical shape of RD is shown in Fig 5. RD is very low at the beginning and after a sufficiently long time. Even if there are mutagens, the number of cancer driver gene mutations is not sufficient for carcinogenesis in the beginning. As a result, RD is low. By contrast, after a sufficiently long time, most cancers are caused by replication errors, and hence, RD is low. The largest values of RD are observed at intermediate t. When λ2 is large, RD starts to increase early, and its peak is high (Fig 5A). This means that the greater the influence of mutagens, the earlier and greater the effect on RD, which is intuitive. The biological features of the tissues might also affect the influence of the mutagens. When the turnover of the stem cells is less frequent in the stable phase, i.e., b is low, the increase of RD is late, and the maximum value is large (Fig 5B). In this case, the accumulation of errors is slow as cell division is less frequent. Then, cancer driver gene mutations are relatively unlikely to occur even if there are mutagens. However, as there are few cancers caused by errors, the maximum value of RD becomes large. The influence of the number of cancer driver genes is opposite to that of the turnover rate. If carcinogenesis needs fewer mutations, the increase of RD is faster, and the maximum value is smaller (Fig 5C). In this case, a relatively low number of mutations is sufficient to mutate all the required cancer driver genes; therefore, carcinogenesis occurs early because of additional mutations caused by mutagens. By contrast, the maximum value of RD is low because cancer arises relatively frequently without mutagens. These results suggest that if the turnover rate is low and/or the number of cancer driver gene mutations is large, the influence of mutagens emerges later and is larger.

We re-analyzed the data on lung cancer to consider the influence of lesions (Fig 6). In this analysis, we assumed that lesions accumulate only after the age of 20 years. That is, λ2 = 0 before age 20, and λ2 has a constant value after that. This is because the legal smoking age in Japan is 20. As the fit was already good when considering replication errors only, it did not improve markedly. However, the estimated values were smaller than when considering replication errors only.
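The risk difference of Eq (42) can be evaluated directly from the approximations for E and W. The Python sketch below is a minimal illustration under assumptions made here: N0 = 1, a growth phase of 18 years, the limit a ≫ b for the stable-phase division rate, and a constant λ2 from birth; the function names are hypothetical. The example call uses parameter values taken from the caption of Fig 5.

```python
import math


def poisson_tail(mu, g):
    """Eq (25): probability of g or more driver mutations for a Poisson mean mu."""
    term, cdf = math.exp(-mu), 0.0
    for k in range(g):
        cdf += term
        term *= mu / (k + 1)
    return 1.0 - cdf


def mutations_per_cell(t, lam1, lam2, b, n_cap, n0=1.0, t_growth=18.0):
    """m(t) = (E + W) / N from Eqs (23), (28), (31), (37), (40), assuming a >> b."""
    r = math.log(n_cap / n0) / t_growth
    if t <= t_growth:
        n = n0 * math.exp(r * t)
        d = n - n0                                   # Eq (27)
        w = lam2 / r * d                             # Eq (31)
    else:
        n = n_cap
        d = n_cap - n0 + (t - t_growth) * b * n_cap  # Eq (36) with a >> b
        w = n_cap * lam2 * (t - t_growth) + lam2 / r * (n_cap - n0)   # Eq (40)
    return (lam1 * d + w) / n


def risk_difference(t, lam1, lam2, p, g, b, n_cap):
    """Eq (42): RD(t) = G1(t) - G0(t), with G0 evaluated at lam2 = 0."""
    g1 = poisson_tail(p * mutations_per_cell(t, lam1, lam2, b, n_cap), g)
    g0 = poisson_tail(p * mutations_per_cell(t, lam1, 0.0, b, n_cap), g)
    return g1 - g0


if __name__ == "__main__":
    for age in (20, 40, 60, 80):
        print(age, risk_difference(age, lam1=0.5, lam2=0.01, p=0.2, g=6, b=0.1, n_cap=1e9))
```

Varying `lam2`, `b`, or `g` in this function while holding the other arguments fixed corresponds to the three comparisons of Fig 5.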

Fig 5. Time course of risk difference with various parameters.


Parameter values are N0 = 1, NL = 10^9, r = log[NL]/18, λ1 = 0.5, p = 0.2, (a) b = 0.1, g = 6, (b) λ2 = 0.01, g = 6, and (c) b = 1, λ2 = 0.1. The remaining parameters are shown in each figure.

Fig 6. Comparison of models with lesion and lung cancer data of age-dependent cancer risk.


Dots indicate cancer registry data of cumulative risk of cancer for the age range of 5–85. Curves are the fits of G(t) with λ2 = 0 for ages below 20 and a constant λ2 for ages above 20. The other parameter values are summarized in Table 1.

Discussion

We constructed a mathematical model to discuss the risk of cancer caused by the accumulation of mutations. In the model, a single stem cell divides up to a certain number, NL, and stops dividing as long as the total cell number can remain at NL. Cell division resumes when cells are removed. Mutations are caused by errors and lesions. Errors arise stochastically with cell division, and lesions arise without cell division. First, we simulated the accumulation of errors and lesions using the Gillespie algorithm (Fig 2). The accumulation of errors and lesions were approximated and compared with the simulation (Fig 3). As so many cell divisions occur in real tissue, calculation with the Gillespie algorithm takes a long time. Using these approximations, it is easier to compare with cancer registry data than using a simulation.

Next, we considered the probability of carcinogenesis based on the multistage model. Cancer risk is correlated with the number of stem-cell divisions [5]. This suggests that cancer risk might be explained by the accumulation of replication errors. In our model, this means that cancer risk is explained without lesions. We fitted our model to cancer registry data from the Cancer Information Service, National Cancer Center, Japan (Fig 4). Except for leukemia, our model fitted the cancer risks. This supports the hypothesis that the risks of some cancers are explained by replication errors. Leukemia was not explained by our model because its risk arises at a very young age. The model assumes that cells divide vigorously at a younger age. If we try to explain the early risk of leukemia through replication errors, the risk in the stable phase becomes too large because hematopoietic stem cells are considered to divide even in adults. It is said that chronic myeloid leukemia can be explained by a single mutation when the mutated cells acquire reproductive advantages [4]. We assumed that mutations do not affect the biological characteristics of the cells in this study. Therefore, the influence of mutations on cells should be considered in future studies. We estimated 3–10 cancer driver gene mutations. Some studies assessed the numbers of cancer-related mutations (Table 2).

Table 2. Estimated parameters from fitting and related parameters from previous studies.

| Cancer type | Estimated number of cancer driver genes (g), male | Estimated number of cancer driver genes (g), female | Average number of cancer driver gene mutations per cell division (λ1p), male | Average number of cancer driver gene mutations per cell division (λ1p), female | Number of driver mutations from previous studies | Number of cancer driver gene mutations per cell division estimated from [18–20] |
| --- | --- | --- | --- | --- | --- | --- |
| Leukemia | 2 | 2 | 0.0016 | 0.00014 | 1.94** | 0.000228, 0.000274 |
| Lung cancer | 10 | 7 | 1.045 | 0.54 | 5.33**, 5.16** | 0.000228, 0.000274 |
| Pancreatic cancer | 5 | 6 | 0.0226 | 0.0290 | 4.25* | 0.000228, 0.000274 |
| Colon cancer | 5 | 5 | 0.00041 | 0.00037 | 3.59*, 3.74** | 0.000228, 0.000274 |
| Breast cancer | - | 2 | - | 0.0018 | 3.32*, 1.76** | 0.000228, 0.000274 |

*Values from [3]

**Values from [17]

Vogelstein et al. [3] reviewed the distribution of the numbers of driver mutations in tumors. Pancreatic cancer, colorectal cancer, and breast cancer were associated with, on average, 4.25, 3.59, and 3.32 mutations, respectively. A different study analyzed 3,281 tumors from 12 cancer types and identified 127 genes that showed significant frequencies of mutations [17]. The average numbers of mutations in colon and rectal carcinoma, acute myeloid leukemia, lung adenocarcinoma, lung squamous cell carcinoma, and breast adenocarcinoma were 3.74, 1.94, 5.33, 5.16, and 1.76, respectively. The estimation of the present study showed that leukemia requires 2 mutations, pancreatic cancer requires 5 or 6 mutations, colon cancer requires 5 mutations, and breast cancer requires 2 mutations. These estimates therefore deviate only slightly from the previously reported values. However, the estimated value of g = 10 (male) or 7 (female) for lung cancer is larger than the previously reported values. In addition, the estimated mutation rate λ1p of lung cancer is very large. The mutation rate is estimated at approximately 1.14 mutations per genome per division in healthy haematopoiesis and 1.37 per genome per division in brain development [18]; these values correspond to λ1. Then, p can be interpreted as the proportion of cancer genes in the genome. Combining the estimate that more than 1% of genes contribute to human cancer [19] with the estimate that protein-coding genes make up 2% of the genome [20], p is considered to be approximately 0.01×0.02 = 0.0002. From these estimates, λ1p will be approximately 1.14×0.0002 = 0.000228 or 1.37×0.0002 = 0.000274. The estimated values of λ1p are too large in liver, lung, thyroid, pancreatic, breast, and colon cancer. This may suggest that biological processes other than replication errors affect the cancer risk. We discuss the biological processes that can affect the results in the last section.
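The arithmetic behind this comparison can be reproduced with a short Python sketch. The numerical inputs are the values quoted in the text and in Table 2; the variable names and the use of the larger literature-based value (0.000274) as the reference for the fold difference are choices made here for illustration only.

```python
# Mutations per genome per division reported in [18] (interpreted as lambda_1).
errors_per_division = {"haematopoiesis": 1.14, "brain development": 1.37}

# p: proportion of cancer genes in the genome, as the product of the
# fraction of genes contributing to cancer [19] and the protein-coding
# fraction of the genome [20].
cancer_gene_fraction = 0.01
coding_fraction = 0.02
p = cancer_gene_fraction * coding_fraction

# Literature-based expectation for lambda_1 * p in each tissue.
expected = {tissue: lam1 * p for tissue, lam1 in errors_per_division.items()}

# Fitted lambda_1 * p from Table 2 as (male, female); None means not estimated.
fitted = {
    "Leukemia": (0.0016, 0.00014),
    "Lung cancer": (1.045, 0.54),
    "Pancreatic cancer": (0.0226, 0.0290),
    "Colon cancer": (0.00041, 0.00037),
    "Breast cancer": (None, 0.0018),
}

# Fold difference of each fitted value relative to the larger expectation.
reference = max(expected.values())
fold = {
    cancer: tuple(None if v is None else v / reference for v in values)
    for cancer, values in fitted.items()
}

for cancer, (male, female) in fold.items():
    print(cancer, male, female)
```

In this sketch a fold value far above 1 marks a fitted λ1p that exceeds the literature-based expectation, which is the discrepancy discussed above for lung cancer in particular.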

The change in cancer risk caused by mutagens is an important problem in the field of oncology. For example, radiation at a high dose and dose rate increases cancer risk [9]. However, the influence of radiation at a very low dose and dose rate, which is important for radiation protection, is unclear [11]. We discussed the influence of lesions caused by mutagens on cancer risk using the risk difference (RD) as an indicator (Fig 5). RD increased initially and then began to decrease. The decrease in the latter half is due to the assumption that carcinogenesis occurs even in the absence of mutagens because of the accumulation of replication errors. In reality, the decrease will be difficult to observe because the lifespan is limited. The time at which RD starts to increase depends on biological parameters. The increase is delayed when the cell-removal rate b is low (Fig 5B) and/or the number of cancer driver genes g is large (Fig 5C). The removal rate b can be regarded as the division rate if the removal of a stem cell is immediately compensated by the division of the remaining stem cells. For example, astrocytes and neural stem cells are the origins of glioblastoma and medulloblastoma, respectively. The number of divisions of those stem cells per year is estimated at 0 [5]. The influence of mutagens on these cancer risks might therefore appear slowly. In contrast, colonic stem cells are estimated to divide 75 times per year [5], so the influence of mutagens might be observed relatively early in the colon. Recently, a hypothesis that radiation, especially high dose-rate radiation, might lead to an earlier onset of cancer has been advocated [21]. The time course of RD in this study focuses on the change in cancer risk with time. When the risk G is fixed and the model is solved with respect to t, the difference in t with and without the mutagen might be an indicator of the early onset of cancer.
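The rise and fall of RD, and its dependence on b and g, can be illustrated with a deliberately simplified Python sketch. It is not the model equations of this study. It assumes, purely for illustration, that the number of driver mutations at age t is Poisson distributed with mean (λ1p·b)·t without a mutagen and (λ1p·b + lam2p)·t with a mutagen, and that cancer occurs once g mutations are reached. The names `risk`, `risk_difference`, `peak_age` and `lam2p` are hypothetical.

```python
import math


def risk(mean, g):
    """P(N >= g) for N ~ Poisson(mean): risk of reaching g driver mutations."""
    term = math.exp(-mean)  # P(N = 0)
    cdf = 0.0
    for k in range(g):
        cdf += term
        term *= mean / (k + 1)  # P(N = k + 1) from P(N = k)
    return 1.0 - cdf


def risk_difference(t, g, b, lam1p, lam2p):
    """RD at age t: risk with the mutagen minus risk without it (toy model)."""
    without = risk(lam1p * b * t, g)
    with_mutagen = risk((lam1p * b + lam2p) * t, g)
    return with_mutagen - without


def peak_age(g, b, lam1p, lam2p, ages):
    """Age in `ages` at which RD is largest."""
    return max(ages, key=lambda t: risk_difference(t, g, b, lam1p, lam2p))


ages = range(0, 501)
# Placeholder parameter values, chosen only to make the shape visible.
print(peak_age(g=3, b=10, lam1p=0.01, lam2p=0.02, ages=ages))  # baseline
print(peak_age(g=3, b=2, lam1p=0.01, lam2p=0.02, ages=ages))   # lower b
print(peak_age(g=5, b=10, lam1p=0.01, lam2p=0.02, ages=ages))  # larger g
```

Under these assumptions RD is zero at t = 0, rises to a peak, and returns toward zero once the risk approaches 1 both with and without the mutagen. The peak shifts to later ages when b is lowered or g is raised, which is the qualitative behavior described for Fig 5B and 5C.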

In order to integrate the effects of mutagens, we focused on lung cancer, which showed a large difference between the parameters estimated with replication errors alone and the previously reported values. In addition, the estimated number of cancer driver genes in lung cancer differed largely between sexes. This discrepancy is probably due in part to the incompleteness of the estimation, as discussed below, but can be partially resolved by assuming a mutagen. It is known that smoking increases lung cancer risk (e.g., [22]). The data included smokers and non-smokers, with more male than female smokers. This may lead to a larger estimated number of cancer driver genes in the male data than in the female data and to the large difference between sexes. When we integrated the influence of a mutagen, the quality of the fit did not change much. However, the estimated value was smaller and closer to the previously reported values than in the case of replication errors alone. The difference between sexes became slightly smaller. By considering the effects of mutagens in more detail, the effects of replication errors might be estimated more accurately.

The present analysis was very simple, and there are limitations that must be resolved in the future to derive more accurate results. We did not consider population fluctuation when fitting the cancer registry data. In addition, fitting was performed by the least-squares method, which assumes an age-independent normal distribution of the residuals. These factors may have caused the differences between sexes in the parameter estimates. For more accurate estimates, it will be necessary to take into account demographics and different health conditions among generations. Moreover, this study ignored many biological processes. The end of the growth phase was assumed to be at age 18, but the actual age may vary depending on the organ or tissue. Assuming a longer growth phase, the estimated g and λ1p would be smaller. We assumed that the properties of the cells were independent of errors and lesions. However, the acquisition of driver mutations may increase the growth rate, and mutations in DNA damage repair genes may increase the mutation rate. In addition, as the accumulation of mutations was treated as the average of the population in the multi-stage carcinogenesis model, the polyclonality of cancer could not be expressed. Many cancers are known to be polyclonal [23]. It may be possible to incorporate changes in cell properties and cancer polyclonality by focusing on each cell, rather than on the population average, in the process of carcinogenesis. For example, if the properties of each cell are changed by mutations, a{i,j}, b{i,j}, and λ1 should depend on the number of errors i and that of lesions j. In the stable phase, the division rate was constant; however, it is known to decrease with age [7]. It is also known that the effect of radiation is more pronounced in younger people than in adults [9]. The relationship between division rates and mutagens will also be an important issue when considering radiation protection. For example, assuming that a{i,j} and/or λ2 are time-dependent, it may be possible to incorporate those phenomena. Moreover, we assumed that the interaction of stem cells does not affect the proliferation and removal of cells. However, cells exposed to mutagens such as ionizing radiation might be removed through stem-cell competition [24, 25]. If so, the removal rate of damaged cells may be larger than that of intact cells. These phenomena should be considered in the future to examine cancer risk in more detail.

Supporting information

S1 Fig. Estimation of g and λ1p through the least squares method.

For each g, the sum of the squared residuals was minimized with respect to λ1p. Then, the combination of g and λ1p giving the smallest sum of squared residuals was selected. The order of the figures corresponds to that in Fig 5: (a) esophageal cancer, (b) leukemia, (c) liver cancer, (d) lung cancer, (e) thyroid cancer, (f) pancreatic cancer, (g) colon cancer, (h) breast cancer, and (i) prostate cancer. The upper and lower panels show the results for males and females, respectively.

(TIF)

S1 File. Code of Gillespie algorithm.

This is C code for the Gillespie algorithm. The default parameters correspond to Fig 2.

(C)

Acknowledgments

We thank Dr. Hiroshi Haeno for technical discussions on the mathematical model.

Data Availability

Statistical data used in the manuscript can be accessed from the "graph database" in "latest cancer statistics" of Cancer Information Service, National Cancer Center, Japan (National Cancer Registry, Ministry of Health, Labour and Welfare (https://ganjoho.jp/reg_stat/index.html). The direct URL to the "graph database" for each cancer and the values used there are shown below. “Graph database” of esophageal cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=4&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of esophageal cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=4&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of leukemia: 
https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=27&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of leukemia: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=27&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of liver cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=8&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of liver cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=8&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of lung cancer: 
https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=12&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of lung cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=12&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of thyroid cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=24&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of thyroid cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=24&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of pancreatic 
cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=10&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of pancreatic cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=10&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of colon cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=6&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of colon cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=6&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of 
breast cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=14&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of breast cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=14&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of prostate cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=20&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of prostate cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=20&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 The authors 
confirm that others would be able to access or request these data in the same manner as the authors. The authors did not have any special access or request privileges that others would not have.

Funding Statement

KU was supported by JSPS KAKENHI Grant Number JP 20K19972. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Nordling CO. A new theory on the cancer-inducing mechanism. British journal of cancer. 1953;7(1):68–72. doi: 10.1038/bjc.1953.8.
  • 2. Armitage P, Doll R. The age distribution of cancer and a multi-stage theory of carcinogenesis. British journal of cancer. 1954;8(1):1–12. doi: 10.1038/bjc.1954.1.
  • 3. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr, Kinzler KW. Cancer genome landscapes. Science. 2013;339(6127):1546–1558. doi: 10.1126/science.1235122.
  • 4. Michor F, Iwasa Y, Nowak MA. The age incidence of chronic myeloid leukemia can be explained by a one-mutation model. Proc Natl Acad Sci U S A. 2006;103(40):14931–14934. doi: 10.1073/pnas.0607006103.
  • 5. Tomasetti C, Vogelstein B. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science. 2015;347(6217):78–81. doi: 10.1126/science.1260825.
  • 6. Calabrese P, Shibata D. A simple algebraic cancer equation: calculating how cancers may arise with normal mutation rates. BMC Cancer. 2010;10(1):1–12. doi: 10.1186/1471-2407-10-3.
  • 7. Tomasetti C, Poling J, Roberts NJ, London NR Jr, Pittman ME, Haffner MC, et al. Cell division rates decrease with age, providing a potential explanation for the age-dependent deceleration in cancer incidence. Proc Natl Acad Sci U S A. 2019;116(41):20482–20488. doi: 10.1073/pnas.1905722116. PMCID: PMC6789572.
  • 8. Little MP, Hendry JH, Puskin JS. Lack of Correlation between Stem-Cell Proliferation and Radiation- or Smoking-Associated Cancer Risk. PLoS One. 2016;11(3):e0150335. doi: 10.1371/journal.pone.0150335.
  • 9. Ozasa K, Shimizu Y, Suyama A, Kasagi F, Soda M, Grant EJ, et al. Studies of the mortality of atomic bomb survivors, Report 14, 1950–2003: an overview of cancer and noncancer diseases. Radiat Res. 2012;177(3):229–243. doi: 10.1667/rr2629.1.
  • 10. Hendry JH, Niwa O, Barcellos-Hoff MH, Globus RK, Harrison JD, Martin MT, et al. ICRP Publication 131: Stem cell biology with respect to carcinogenesis aspects of radiological protection. Ann ICRP. 2016;45(1 Suppl):239–252. doi: 10.1177/0146645315621849.
  • 11. United Nations Scientific Committee on the Effects of Ionizing Radiation, ANNEX B Epidemiological studies of cancer risk due to low-dose-rate radiation from environmental sources. In: UNSCEAR 2017 Report: Sources, effects and risks of ionizing radiation, with. New York: United Nations; 2018.
  • 12. Wu S, Powers S, Zhu W, Hannun YA. Substantial contribution of extrinsic risk factors to cancer development. Nature. 2016;529(7584):43–47. doi: 10.1038/nature16166.
  • 13. Gillespie DT. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of computational physics. 1976;22(4):403–434. doi: 10.1016/0021-9991(76)90041-3.
  • 14. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry. 1977;81(25):2340–2361. doi: 10.1021/j100540a008.
  • 15. Haeno H, Iwasa Y, Michor F. The evolution of two mutations during clonal expansion. Genetics. 2007;177(4):2209–2221. doi: 10.1534/genetics.107.078915.
  • 16. Krickeberg K, Pham VT, Pham TMH. Epidemiology: Key to Prevention. New York: Springer; 2012.
  • 17. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, et al. Mutational landscape and significance across 12 major cancer types. Nature. 2013;502(7471):333–339. doi: 10.1038/nature12634.
  • 18. Werner B, Case J, Williams MJ, Chkhaidze K, Temko D, Fernández-Mateos J, et al. Measuring single cell divisions in human tissues from multi-region sequencing data. Nat Commun. 2020;11(1):1035. doi: 10.1038/s41467-020-14844-6.
  • 19. Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, et al. A census of human cancer genes. Nat Rev Cancer. 2004;4(3):177–83. doi: 10.1038/nrc1299.
  • 20. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431(7011):931–45. doi: 10.1038/nature03001.
  • 21. Nakamura N. A hypothesis: radiation carcinogenesis may result from tissue injuries and subsequent recovery processes which can act as tumor promoters and lead to an earlier onset of cancer. Br J Radiol. 2020;93(1115):20190843. doi: 10.1259/bjr.20190843.
  • 22. Schwartz AG, Cote ML. Epidemiology of Lung Cancer. Adv Exp Med Biol. 2016;893:21–41. doi: 10.1007/978-3-319-24223-1_2.
  • 23. Parsons BL. Many different tumor types have polyclonal tumor origin: evidence and implications. Mutat Res. 2008;659(3):232–47. doi: 10.1016/j.mrrev.2008.05.004.
  • 24. Bondar T, Medzhitov R. p53-mediated hematopoietic stem and progenitor cell competition. Cell Stem Cell. 2010;6(4):309–22. doi: 10.1016/j.stem.2010.03.002.
  • 25. Fujimichi Y, Otsuka K, Tomita M, Iwasaki T. An Efficient Intestinal Organoid System of Direct Sorting to Evaluate Stem Cell Competition in Vitro. Sci Rep. 2019;9(1):20297. doi: 10.1038/s41598-019-55824-1.

Decision Letter 0

Gayle E Woloschak

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

30 May 2022

PONE-D-22-12989

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

PLOS ONE

Dear Dr. Uchinomiya,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. One reviewer suggested major revisions, the other recommended rejection. Please attempt to address as many of the comments as possible.

Please submit your revised manuscript by Jul 14 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Gayle E. Woloschak, PhD

Section Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following financial disclosure: 

 "KU was supported by JSPS KAKENHI Grant Number JP 20K19972."  

Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

Additional Editor Comments :

One reviewer recommended rejection, the other asked for major revisions. Please address as many of these concerns as possible in a revision.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: No

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The described exercise investigates the multistage model, adding complexity in terms of including developing and stable tissue compartments. Overall the modeling lacks biological significance (i.e., no distinction regarding functions caused by mutations, assumes monoclonal tumor origin, assumes mutation rates that were not justified). Consequently, the value of the exercise is questionable. Also, identifying exogenously-induced mutations as wounds is non-standard nomenclature.

Reviewer #2: Uchinomiya and Tomita present a mathematical model for accumulation of mutations during development and in adult tissues. They use it to fit cancer risk data, inferring that accumulation of DNA copying-errors is largely sufficient to explain cancer risk across many cancer types. Overall the study would appear sound, if the issues described below (1-3) are addressed. There is considerable interest in deriving these sorts of models for cancer risk and I think such studies could benefit from incorporating some cancer genomic data that is now available.

Major:

1- One major point of concern is that for the central result -- the fits to epidemiological cancer data (as shown in Figure 3) -- the inferred parameters "lambda_1*p" (number of oncogenic driver mutations acquired per cell division) and "g" (number of driver mutations required) for each cancer type were largely not constrained to match biological knowledge of each cancer type, nor were they checked to see if the inferred values match what we expect.

There are some suspect examples where the "lambda_1*p" parameter can vary by >3 orders of magnitude between e.g. panel (g) [colon] and panel (d) [lung]. I think it is very unlikely that the number of errors during cell division could vary so much between different tissues. The number of mutations per division has been measured recently by WGS for a variety of healthy human tissues; this data should be looked up and compared to the estimates of this parameter, to check veracity.

1b- A related point is the number of driver mutations in the "g" parameter: that this would be 7-10 for lung cancer, 4-5 for esophagus, while 2 for thyroid and breast is a bit unusual. Does this match the actually observed number of drivers in these tissues in e.g. TCGA or similar cancer genomics data sets? (I appreciate it is not trivial to exactly measure the number of driver mutations per cancer type; however, some estimates can be made.) If this doesn't match, the models should be adjusted.

2- There is a worry about whether multiple solutions with different combinations of their 2 parameters "lambda_1*p" (number of oncogenic driver mutations acquired per cell division) and "g" (number of driver mutations required) would fit the cancer incidence data similarly well, and thus might also be plausible solutions. Thus the fit might be visualized as a heatmap of the possible solutions. This fits the biological intuition that e.g. the somatic mutation rate will be somewhat different across individuals (they might have a germline defect in a DNA repair gene or not), and also that the number of driver mutations is variable within a cancer type -- it is a range, and not (as now somewhat artificially constrained) a single value.

3- I think related with the above: there should be a formal test (perhaps based on bootstrap, or similar) to test if the addition of the lambda_2 parameter (mutagen exposure/wounds) significantly increases the fit of the models to the epidemiological data, or not. Now there is a statement about how the mutagen exposure factor is not necessary in the models to achieve a good fit to the cancer risk curves, however this doesn't seem to be supported with actual data. Does the fit improve with adding this (even if the improvement is subtle)?

Needs discussing:

- In at least some adult tissues, stem cells are not renewed during aging at the rate at which they are lost; their number declines with age. E.g. blood (HSC) is one example and there may be others. My intuition is this should not impact their model massively. Can they comment/discuss, or reanalyze if necessary?

- A tricky problem is that mutations resulting from DNA lesions ("wounds") might be occurring in a way coupled with DNA replication. Replication may increase the rate of converting wounds into mutations. If cells replicate slowly, there is plenty of time for repair e.g. via the NER pathway. If they replicate fast, lesions are not repaired but instead copied over using TLS enzymes, thus generating mutations. So an ideal model would also account for this interaction between replication rate/mutations and 'wound' exposures, having this as a parameter. I am not sure if such (more complicated) parametrization is feasible here, given the small number of data points to fit to. Can they discuss this and/or address with analysis if they think important? See the example of the "lambda_1*p" parameter (allegedly replication-dependent but not mutagen-exposure-dependent) being higher in lung; this is odd.

- One idea for future work, which I understand might be out-of-scope here, is to model explicitly the existence of subsets of mutator cancers (e.g. MSI colon cancers; the BRCA breast cancer).

Minor:

- The mathematical formalism may be hard to follow for some readers without an appropriate background. Text describing the method is well described however takes effort to absorb. I suggest having a visual diagram/schematic describing the simulation method.

- In figure 3, write cancer type on the figure panels. In Fig 2, write color legend in the figure.

- Wording "Wound", while not incorrect, suggests a macroscopic event. In DNA, the word "lesion" is more commonly used.

- I don't understand much the emphasis on ionizing radiation in the abstract, while there are no analyses particular to this in the manuscript. It is also probably not a major carcinogen compared to e.g. tobacco smoking.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jun 14;18(6):e0286499. doi: 10.1371/journal.pone.0286499.r002

Author response to Decision Letter 0


21 Jul 2022

>Reviewer #1: The described exercise investigates the multistage model, adding complexity in terms of including developing and stable tissue compartments. Overall the modeling lacks biological significance (i.e., no distinction regarding functions caused by mutations, assumes monoclonal tumor origin, assumes mutation rates that were not justified). Consequently, the value of the exercise is questionable. Also, identifying exogenously-induced mutations as wounds is non-standard nomenclature.

We appreciate the comments of Reviewer #1. We understand the reviewer's concerns and have revised the manuscript.

[Answers]

As Reviewer #1 pointed out, this model ignores many biological processes. The goal of this paper is to examine how cancer risks can be explained by the simplest possible biological processes. We did not express this point well, so we added descriptions (Line 82-85).

Although we ignored many biological processes, it is not difficult to incorporate some of them. The model can include the functional effects of mutations through the parameters a_{i,j} and b_{i,j} (we added descriptions in Line 493-496, 500-502). However, it is very difficult to make realistic assumptions about how mutations affect these parameters. Therefore, for convenience, a_{i,j} and b_{i,j} were assumed to be constant. As the description of this simplification was lacking, we added an explanation (Line 201-203).

As with classic models such as Armitage & Doll (1954), this model also assumes a monoclonal tumor origin. It does so because it describes the population average of the carcinogenesis process. If we focused on individual cells, polyclonality could be expressed. We added a discussion about this (Line 496-500).

We added references and discussions on mutation rate (Line 449-459) and changed the term “wound” to “lesion”.

>Reviewer #2: Uchinomiya and Tomita present a mathematical model for accumulation of mutations during development and in adult tissues. They use it to fit cancer risk data, inferring that accumulation of DNA copying-errors is largely sufficient to explain cancer risk across many cancer types. Overall the study would appear sound, if the issues described below (1-3) are addressed. There is considerable interest in deriving these sorts of models for cancer risk and I think such studies could benefit from incorporating some cancer genomic data that is now available.

We appreciate the positive evaluation of our work by reviewer #2. We addressed all suggestions.

>Major:

>1- One major point of concern is that for the central result -- the fits to epidemiological cancer data (as shown in Figure 3) -- the inferred parameters "lambda_1*p" (number of oncogenic driver mutations acquired per cell division) and "g" (number of driver mutations required) for each cancer type were largely not constrained to match biological knowledge of each cancer type, nor were they checked to see if the inferred values match what we expect.

>There are some suspect examples where the "lambda_1*p" parameter can vary by >3 orders of magnitude between e.g. panel (g) [colon] and panel (d) [lung]. I think it is very unlikely that the number of errors during cell division could vary so much between different tissues. The number of mutations per division has been measured recently by WGS for a variety of healthy human tissues; this data should be looked up and compared to the estimates of this parameter, to check veracity.

[Answer]

The mutation rate is estimated to be approximately 1.14 mutations per genome per division in healthy haematopoiesis and 1.37 per genome per division in brain development (Warner et al. 2020); these values correspond to lambda_1. Then, p can be interpreted as the proportion of cancer genes in the genome. Combining the estimate that more than 1% of genes contribute to human cancer (Futreal et al. 2004) with the estimate that protein-coding genes make up 2% of the genome (IHGSC, 2004), lambda_1 * p is expected to be about 0.000228 or 0.000274. The fitted values are too large compared with these in liver, lung, thyroid, pancreatic, breast, and colon cancer. This may suggest that biological processes other than replication errors affect the cancer risk. We added descriptions and discussions (Line 449-459).
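
The arithmetic behind the two expected values can be reproduced as follows. This is a minimal sketch that assumes p is taken as the product of the two quoted fractions (1% of genes, 2% of the genome), which is the reading consistent with the numbers stated in the answer; the variable names are illustrative only.

```python
# Per-division mutation rates quoted in the answer (Warner et al. 2020).
lambda_1 = {"haematopoiesis": 1.14, "brain development": 1.37}

# Assumption: p = (fraction of genes contributing to cancer)
#               x (protein-coding fraction of the genome).
cancer_gene_fraction = 0.01  # "more than 1% of genes" (Futreal et al. 2004)
coding_fraction = 0.02       # protein-coding share of the genome (IHGSC, 2004)
p = cancer_gene_fraction * coding_fraction

# Expected lambda_1 * p for each reference tissue.
expected = {tissue: rate * p for tissue, rate in lambda_1.items()}

for tissue, value in expected.items():
    print(f"{tissue}: lambda_1 * p = {value:.6f}")
```

Because the 1% figure is quoted as "more than 1%", these products are best read as rough reference values rather than strict bounds.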

>1b- A related point is the number of driver mutations in the "g" parameter: that this would be 7-10 for lung cancer, 4-5 for esophagus, while 2 for thyroid and breast is a bit unusual. Does this match the actually observed number of drivers in these tissues in e.g. TCGA or similar cancer genomics data sets? (I appreciate it is not trivial to exactly measure the number of driver mutations per cancer type; however, some estimates can be made.) If this doesn't match, the models should be adjusted.

[Answer]

We referred to several papers to discuss the number of driver mutations. For some cancers the estimated parameter “g” was close to the observed number, but for others it differed. The biological processes ignored in this paper may cause this mismatch, so we discussed it (Line 436-448).

>2- There is a worry about whether multiple solutions with different combinations of their 2 parameters "lambda_1*p" (number of oncogenic driver mutations acquired per cell division) and "g" (number of driver mutations required) would fit the cancer incidence data similarly well, and thus might also be plausible solutions. Thus the fit might be visualized as a heatmap of the possible solutions. This fits the biological intuition that e.g. the somatic mutation rate will be somewhat different across individuals (they might have a germline defect in a DNA repair gene or not), and also that the number of driver mutations is variable within a cancer type -- it is a range, and not (as now somewhat artificially constrained) a single value.

[Answer]

We agree with the concern that different combinations of the 2 parameters may fit similarly well. We tried to show this with a heatmap, but it was hard to read. Instead, we showed the relationship between the residual sum of squares and the 2 parameters for each fit (S1 Fig).
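
The kind of residual-sum-of-squares scan described here can be sketched as below. This is only an illustration: `toy_risk` is a hypothetical stand-in curve, not the paper's Eq (25), the "observations" are synthetic values generated from that same curve, and the grid ranges are arbitrary assumptions.

```python
import math

def toy_risk(age, rate, g):
    """Hypothetical stand-in for a cumulative risk curve (NOT the paper's Eq 25)."""
    return 1.0 - math.exp(-((rate * age) ** g))

ages = list(range(20, 90, 5))

# Synthetic "observations" from the toy curve itself; no registry data are used.
true_rate, true_g = 0.005, 4
observed = [toy_risk(t, true_rate, true_g) for t in ages]

def rss(rate, g):
    """Residual sum of squares between the toy curve and the synthetic data."""
    return sum((toy_risk(t, rate, g) - y) ** 2 for t, y in zip(ages, observed))

# Scan a grid of (rate, g) pairs, mirroring a two-parameter RSS surface.
rates = [0.001 * k for k in range(1, 11)]  # 0.001 .. 0.010
g_values = list(range(2, 9))               # 2 .. 8
surface = {(r, g): rss(r, g) for r in rates for g in g_values}

best = min(surface, key=surface.get)
print("best (rate, g):", best)

# For each g, report the best rate and its RSS to see how flat the surface is.
for g in g_values:
    r_best = min(rates, key=lambda r: surface[(r, g)])
    print(g, r_best, surface[(r_best, g)])
```

Listing the best RSS for each g shows at a glance whether several (rate, g) combinations fit almost equally well, which is the concern raised in the comment.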

>3- I think related with the above: there should be a formal test (perhaps based on bootstrap, or similar) to test if the addition of the lambda_2 parameter (mutagen exposure/wounds) significantly increases the fit of the models to the epidemiological data, or not. Now there is a statement about how the mutagen exposure factor is not necessary in the models to achieve a good fit to the cancer risk curves, however this doesn't seem to be supported with actual data. Does the fit improve with adding this (even if the improvement is subtle)?

[Answer]

The effect of the mutagen should be most apparent in lung cancer. However, because the fit already succeeds with replication errors alone, including the mutagen may not improve it. On the other hand, when only replication errors are considered, the estimated parameters differ substantially from the values known from experiments, whereas including the mutagen brings the parameters closer to the observed values to some extent. We added a figure and a discussion about this (Fig 6, Line 391-397, Line 482-491).

>Needs discussing:

>- In at least some adult tissues, stem cells are not renewed during aging at the rate at which they are lost; their number declines with age. E.g. blood (HSC) is one example and there may be others. My intuition is this should not impact their model massively. Can they comment/discuss, or reanalyze if necessary?

[Answer]

When the division rate decreases with age, this model overestimates the accumulation of replication errors in old age. A more accurate model would need to consider the change in the division rate. This effect could be incorporated by making the division rate a_{i,j} depend on time. We added a discussion about this (Line 502-507).

>- A tricky problem is that mutations resulting from DNA lesions ("wounds") might be occurring in a way coupled with DNA replication. Replication may increase the rate of converting wounds into mutations. If cells replicate slowly, there is plenty of time for repair e.g. via the NER pathway. If they replicate fast, lesions are not repaired but instead copied over using TLS enzymes, thus generating mutations. So an ideal model would also account for this interaction between replication rate/mutations and 'wound' exposures, having this as a parameter. I am not sure if such (more complicated) parametrization is feasible here, given the small number of data points to fit to. Can they discuss this and/or address with analysis if they think important? See the example of the "lambda_1*p" parameter (allegedly replication-dependent but not mutagen-exposure-dependent) being higher in lung; this is odd.

[Answer]

We are interested in the relationship between replication rate and lesions (we changed the term from “wounds”). Radiation exposure at younger ages is known to increase the cancer risk more than exposure in adulthood. The relationship between replication rate and accumulation of lesions may also be important in discussing radiation effects. In terms of this model, it is conceivable to change lambda_2 with age or replication rate. We added discussions about this while addressing the comment above (Line 503-507).

>- One idea for future work, which I understand might be out-of-scope here, is to model explicitly the existence of subsets of mutator cancers (e.g. MSI colon cancers; the BRCA breast cancer).

[Answer]

Indeed, this model only considers the average properties of cells in each tissue. In mutator cancers, lambda_1, which is related to the mutation rate, will be larger than in other cancers. We added a discussion about it (Line 459-467).

>Minor:

>- The mathematical formalism may be hard to follow for some readers without an appropriate background. Text describing the method is well described however takes effort to absorb. I suggest having a visual diagram/schematic describing the simulation method.

[Answer]

We added a schematic diagram as Fig 1 and renumbered the other figures.

>- In figure 3, write cancer type on the figure panels. In Fig 2, write color legend in the figure.

[Answer]

We added a color legend in Fig 2 (new number is Fig 3) and cancer type on the figure panels in Fig 3 (new number is Fig 4).

>- Wording "Wound", while not incorrect, suggests a macroscopic event. In DNA, the word "lesion" is more commonly used.

[Answer]

We changed the term “wound” to “lesion”.

>- I don't understand much the emphasis on ionizing radiation in the abstract, while there are no analyses particular to this in the manuscript. It is also probably not a major carcinogen compared to e.g. tobacco smoking.

[Answer]

Our aim is to create a model for discussing the effects of low-dose and low-dose-rate radiation, which are difficult to observe. We emphasized that the model is effective for analyzing such phenomena (Line 82-85).

Attachment

Submitted filename: Response_to_Reviewers.docx

Decision Letter 1

Gayle E Woloschak

31 Oct 2022

PONE-D-22-12989R1

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

PLOS ONE

Dear Dr. Uchinomiya:

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Major revisions have been recommended by the reviewers.  Please address concerns in a revision.

Please submit your revised manuscript by Dec 15 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Gayle E. Woloschak, PhD

Section Editor

PLOS ONE

Additional Editor Comments (if provided):

One reviewer suggested major revisions, one suggested minor revisions. Please address these concerns in your revised manuscript.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: The authors have adequately addressed my concerns with additional discussion.

It is especially appreciated that they explicitly comment on the cases where their parameter fits are very different from realistic values, which may reveal interesting biology: "The estimated values of lambda_1 * p are too large in liver, lung, thyroid, pancreatic, breast, and colon cancer. This may suggest that biological processes other than replication errors affect the cancer risk." One final request I would have is that this result, "lambda_1 * p are too large in liver, lung, thyroid, pancreatic, breast, and colon cancer", which is in my opinion important (not to invalidate the model but rather to suggest effects of additional biological factors), be shown in a separate figure or table. In other words, I would suggest including a figure/table that compares the fitted parameters to a (range of) expected parameters from the literature, for each cancer type.

As a minor comment, the sentence in the abstract "The parameters estimated by the analysis of lung cancer data were closer to the observed values than when considering only replication errors." is quite unclear as written; consider expanding it to clearly convey what is the meaning/implication of this lung result.

Finally, it is still unclear to me why the focus on radiation in the introduction if there is not any radiation-specific analysis. It is not wrong, but is a bit confusing perhaps.

Reviewer #3: The present reviewer has been added in the revision process. The study is well designed and addresses an interesting topic. However, there are several unclarities which need to be solved before acceptance for publication.

Major comments

1. Classically, as in ref. 2, the Armitage-Doll model is fitted to the rate (i.e., cases per person-year) unlike the present study, where the cumulative rate data (i.e., cases per total population) is used for fitting (equations 24 and 25). Please comment whether the “cumulative rate” data of the cancer registry herein are adjusted for competing factors (such as the age-related reduction in population). In addition, the cumulative rate at a certain age is dependent on the rates at all earlier time points because of its cumulative nature. This indicates that the uncertainty of the data is also accumulated with age. Fitting using the least square method assumes identical (i.e., age-independent) normal distribution on the uncertainty, and thus, does not take into account this cumulative nature of the uncertainty. Please consider stating these points, if applicable, as limitations of the study.

2. In Fig. 2, the increase of N(t) in the growth phase seems to slow when it gets close to 300. This is not expected from the description of the model. If the authors modeled some slowing in the proliferation rate when the stem cell pool is nearly full, please comment on this in the Method section. If not, please explain the mechanism of the slowing.

3. Equation 29 may need reconsideration. Because lambda_2 denotes the rate at which W changes in a cell, a differential equation of “dW/dt = lambda_2 * N_0 * exp[rt]” should hold. Solving this under an assumption of W(0) = 0, one obtains the equation of “W = (lambda_2 * N_0 / r) * (exp[rt] – 1)” instead of equation 29.

4. p.17 line 275 and equation 32. I do not understand why “a – b” is used here instead of “a”. Please consider using “a” or explaining why “a – b” is appropriate. I understand that the assumption described in lines 273-274 should be dealt with by the assumption of “a >> b” rather than the use of “a – b”.

5. In equation 40, the authors assume that exponential growth of the stem cell pool continues until age 18, which is quite counterintuitive, while this assumption contributes to the simplification of the model. Please discuss the consequence of changing the age at which the growth phase is terminated.

6. p.20 line 326 “lambda_1 * p is regarded as a single parameter in Eq (25)” This statement may not be obvious to readers because “lambda_1 * p” is not explicit in Eq (25). Please consider showing the relevant calculation process.

7. Fig. 4 and S1 Fig. indicate different values of g between the sexes, which may seem odd. Please consider using an identical value for both sexes or showing a rationale for using different values (such as previous observations supporting a sex-dependent number of driver gene mutations).

8. p.29 lines 472-474. Ovarian germ cells may not be the only cell-of-origin of ovarian cancer because somatic cells of the ovary and the oviduct can also be its origin. Please reconsider the context.

9. Please consider code sharing as stated in https://journals.plos.org/plosone/s/materials-software-and-code-sharing.
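
The closed form proposed in major comment 3 can be checked numerically against the differential equation stated there. This is a minimal sketch: the values of lambda_2, N_0, r, and the end time are hypothetical and chosen only for illustration, and the check concerns the equation as written in the comment, not Equation 29 itself.

```python
import math

def closed_form(t, lam2, n0, r):
    """W(t) = (lambda_2 * N_0 / r) * (exp(r t) - 1), as written in comment 3."""
    return lam2 * n0 / r * (math.exp(r * t) - 1.0)

def integrate(t_end, lam2, n0, r, steps=10_000):
    """Integrate dW/dt = lambda_2 * N_0 * exp(r t) from 0 to t_end with W(0) = 0.

    The right-hand side depends only on t, so each step uses Simpson's rule.
    """
    h = t_end / steps
    w = 0.0
    for i in range(steps):
        t = i * h
        f0 = lam2 * n0 * math.exp(r * t)
        f_mid = lam2 * n0 * math.exp(r * (t + h / 2.0))
        f1 = lam2 * n0 * math.exp(r * (t + h))
        w += h / 6.0 * (f0 + 4.0 * f_mid + f1)
    return w

# Hypothetical parameter values, for illustration only.
lam2, n0, r, t_end = 0.01, 1.0, 0.3, 18.0

numeric = integrate(t_end, lam2, n0, r)
exact = closed_form(t_end, lam2, n0, r)
print(numeric, exact)
```

Agreement between the two numbers confirms only that the closed form solves the stated differential equation with W(0) = 0.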

Minor points

p.3 lines 31-32 and other places: Please consider adding the word “previous” when referring to previous observations. Example: … the estimated parameters did not always agree with “previously reported values”. (p.3 lines 31-32)

p.47 line 47: “appearance” should be “mortality” according to ref. 2.

p.6 lines 74-75: Consider the following modification: “… and ionizing radiation contributes to the carcinogenic process by adding a few mutations”

p.6 line 91: … due “to” replication errors …

p.7 line 106, p.24 line 387, p.31 line 503: Replace the first comma (,) with a semicolon (;) (e.g., …; however, …).

p.12 lines 184-185: “…, in the growth phase, …” This phrase should be deleted.

p.12 line 186: The subscript g should be substituted with s (Gamma_g to Gamma_s).

p.13 line 207 and other occasions: Because “oncogenes” do not include “tumor suppressor genes”, the wording here should be “cancer driver genes” instead of “oncogenes”. The same applies to many occasions in the manuscript.

p.14 line 217: “… of having more than g mutations …” should be “… of having g or more mutations …”

p.14 line 221: “cancer risk” should be “cumulative cancer incidence”

p.16 line 245 and p.19 line 305: The value of N_L seems to be 300 instead of 100, as the maximum value of N in Fig 2 is 300.

p.16 line 252: n should be d?

p.17 line 273: the other -> another

p.19 line 308 and other occasions: The wording of “statistical data” is odd and should be replaced with expressions like “epidemiology data”, “real-world data” or “cancer registry data”.

p.20 line 314: The ministry name should be followed by the country name.

p.20 line 328: “value” -> “values”

p.24 line 385: “larger” -> “smaller”?

p.29 line 480: “with on time” -> “with time”?

p.30 lines 501-502: The grammar of “… should be depends on …” needs reconsideration.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

**********


PLoS One. 2023 Jun 14;18(6):e0286499. doi: 10.1371/journal.pone.0286499.r004

Author response to Decision Letter 1


3 Feb 2023

>Reviewer #2: The authors have adequately addressed my concerns with additional discussion.

[Answer] We were relieved that our response seemed to address your concerns. We have responded to all of Reviewer #2’s comments.

>It is especially appreciated that they explicitly comment on the cases where their parameter fits are very different from realistic values, which may reveal interesting biology: "The estimated values of lambda_1 * p are too large in liver, lung, thyroid, pancreatic, breast, and colon cancer. This may suggest that biological processes other than replication errors affect the cancer risk." One final request I would have is that this result, "lambda_1 * p are too large in liver, lung, thyroid, pancreatic, breast, and colon cancer", which is in my opinion important (not to invalidate the model but rather to suggest effects of additional biological factors), be shown in a separate figure or table. In other words, I would suggest including a figure/table that compares the fitted parameters to a (range of) expected parameters from the literature, for each cancer type.

[Answer] We added a table comparing the fitted parameters with the expected parameters (Table 2).

>As a minor comment, the sentence in the abstract "The parameters estimated by the analysis of lung cancer data were closer to the observed values than when considering only replication errors." is quite unclear as written; consider expanding it to clearly convey what is the meaning/implication of this lung result.

[Answer]

We reconsidered the sentence (Line 33-34, 41-43).

>Finally, it is still unclear to me why the focus on radiation in the introduction if there is not any radiation-specific analysis. It is not wrong, but is a bit confusing perhaps.

[Answer]

As we added in line 83-85, the carcinogenic effects of low-dose and low-dose-rate radiation have become a major public concern in Japan since the accident at the Fukushima Daiichi Nuclear Power Plant. The results of this paper are expected to make a significant contribution to the evaluation of cancer risk from low-dose and low-dose-rate radiation. Therefore, we would like to keep descriptions about radiation.

>Reviewer #3: The present reviewer has been added in the revision process. The study is well

>designed and addresses an interesting topic. However, there are several unclarities which need to

>be solved before acceptance for publication.

[Answer]

Thank you for your important comments. We have carefully considered and responded to those comments.

>Major comments

>1. Classically, as in ref. 2, the Armitage-Doll model is fitted to the rate (i.e., cases per person-year)

>unlike the present study, where the cumulative rate data (i.e., cases per total population) is used for

>fitting (equations 24 and 25). Please comment whether the “cumulative rate” data of the cancer

>registry herein are adjusted for competing factors (such as the age-related reduction in population).

>In addition, the cumulative rate at a certain age is dependent on the rates at all earlier time points

>because of its cumulative nature. This indicates that the uncertainty of the data is also accumulated

>with age. Fitting using the least square method assumes identical (i.e., age-independent) normal

>distribution on the uncertainty, and thus, does not take into account this cumulative nature of the

>uncertainty. Please consider stating these points, if applicable, as limitations of the study.

[Answer]

As Reviewer #3 pointed out, there are limitations related to demographics and to the appropriateness of the least-squares method. We added a discussion of these limitations (Lines 519-528).

>2. In Fig. 2, the increase of N(t) in the growth phase seems to slow when it gets close to 300. This is

>not expected from the description of the model. If the authors modeled some slowing in the

>proliferation rate when the stem cell pool is nearly full, please comment on this in the Method

>section. If not, please explain the mechanism of the slowing.

[Answer]

To reduce the drawing cost, results are output only after a certain number of divisions have occurred. Connecting these discrete values smooths the curve, so the rate of increase appears to slow around N=300. We clarified in the caption that the plot connects discrete values. In addition, we ran a new simulation and redrew the figure. In the new simulation, we set N=100, the value used in figure 3.
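The sampling effect described in this answer can be illustrated with a minimal Gillespie-style sketch. This is not the authors' S1 File (which is written in C). The function name, the per-cell division rate `a`, the per-cell death rate `b`, the capacity value, and the recording interval `record_every` are all assumptions chosen for illustration. Division is switched off while the pool is at capacity, and the state is recorded only after every `record_every` divisions, so a plot that joins the recorded points can look smoother than the underlying process.

```python
import random

def gillespie_pool(n0=1, capacity=100, a=1.0, b=0.01,
                   t_end=50.0, record_every=10, seed=0):
    """Birth-death simulation of a cell pool with a hard capacity (illustrative)."""
    rng = random.Random(seed)
    t, n, divisions = 0.0, n0, 0
    samples = [(t, n)]                      # recorded (time, cell count) pairs
    while t < t_end and n > 0:
        birth = a * n if n < capacity else 0.0   # division arrested at capacity
        death = b * n
        total = birth + death
        if total == 0.0:                    # nothing can happen any more
            break
        t += rng.expovariate(total)         # waiting time to the next event
        if rng.random() * total < birth:    # choose event in proportion to its rate
            n += 1
            divisions += 1
            if divisions % record_every == 0:
                samples.append((t, n))      # sparse output, as described above
        else:
            n -= 1
    return samples

if __name__ == "__main__":
    for t, n in gillespie_pool()[:10]:
        print(f"t={t:.3f}  N={n}")
```

Setting `record_every=1` records every division, which makes it easy to compare the sparse and the full trajectories for the same seed.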

>3. Equation 29 may need reconsideration. Because lambda_2 denotes the rate at which W changes

>in a cell, a differential equation of “dW/dt = lambda_2 * N_0 * exp[rt]” should hold. Solving this under

>an assumption of W(0) = 0, one obtains the equation of “W = (lambda_2 * N_0 / r) * (exp[rt] – 1)”

>instead of equation 29.

[Answer]

Thank you for your very helpful comments. We changed the approximation of W in the growth phase following the comments (eqs. 29 and 31). Following this change, the approximation of W in the stable phase also had to be changed, which we have done (eqs. 39 and 40). In addition, figures 3 and 5 are affected by this change and by Major-comment 4. We recalculated and redrew these figures.
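The form suggested by the reviewer can be checked by direct integration. The short derivation below uses only the symbols in the reviewer's comment. It reads N_0 e^{rt} as the number of cells at time t in the growth phase, which is an interpretation, because Eq 29 itself is not reproduced in this correspondence.

```latex
\frac{dW}{dt} = \lambda_2 N_0 e^{rt}, \qquad W(0) = 0
\quad\Longrightarrow\quad
W(t) = \int_0^t \lambda_2 N_0 e^{rs}\, ds
     = \frac{\lambda_2 N_0}{r}\left(e^{rt} - 1\right).
```

As a sanity check, for small rt the result reduces to W(t) ≈ λ_2 N_0 t, the expected count when the pool stays at its initial size N_0.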

>4. p.17 line 275 and equation 32. I do not understand why “a – b” is used here instead of “a”. Please

>consider using “a” or explaining why “a – b” is appropriate. I understand that the assumption

>described in lines 273-274 should be dealt with by the assumption of “a >> b” rather than the use of

>“a – b”.

[Answer]

We agree with this comment. Using “a” is more appropriate. We rewrote the related equations (eqs. 32-34, 36-38, 40). In addition, figures 3 and 5 were redrawn, as mentioned in the response to Major-comment 3.

>5. In equation 40, the authors assume that exponential growth of the stem cell pool continues until

>age 18, which is quite counterintuitive, while this assumption contributes to the simplification of

>the model. Please discuss the consequence of changing the age at which the growth phase is

>terminated.

[Answer]

If the growth phase ends later, the estimated g and lambda_1*p can be smaller. When the growth phase is long, the stable phase is short, so cancer must develop in a shorter time. Therefore, g should be small. Under this condition, cancer cells emerge earlier if lambda_1*p remains unchanged. To prevent this, lambda_1*p should be smaller. We added a brief discussion about this (Lines 526-527).

>6. p.20 line 326 “lambda_1 * p is regarded as a single parameter in Eq (25)” This statement may not

>be obvious to readers because “lambda_1 * p” is not explicit in Eq (25). Please consider showing the

>relevant calculation process.

[Answer]

We can show this by combining eqs (25) and (37), so we added the explanation (Lines 343-345).

>7. Fig. 4 and S1 Fig. indicate different values of g between the sexes, which may seem odd. Please

>consider using an identical value for both sexes or showing a rationale for using different values

>(such as previous observations supporting a sex-dependent number of driver gene mutations).

[Answer]

We believe the value of g should be the same for both sexes. We think two factors cause the differences in the estimates. One is that the fitting is incomplete, as pointed out in Major-comment 1. Esophageal and pancreatic cancers, which show small differences in g, may correspond to this case. The other is the limitation of considering cumulative cancer risk based only on replication errors. The estimated g of liver cancer and lung cancer differs more between the sexes. In lung cancer, smoking is a well-known risk factor, and the difference in g between the sexes became smaller when the influence of a mutagen was assumed. We added this to the Discussion to clarify the point (Lines 507-510, 522-523).

>8. p.29 lines 472-474. Ovarian germ cells may not be the only cell-of-origin of ovarian cancer

>because somatic cells of the ovary and the oviduct can also be its origin. Please reconsider the

>context.

[Answer]

We deleted the description of ovarian cancer.

>9. Please consider code sharing as stated in https://journals.plos.org/plosone/s/materials-software-and-code-sharing.

[Answer]

We attached the code for the Gillespie algorithm simulation.

Minor points

>p.16 line 245 and p.19 line 305: The value of N_L seems to be 300 instead of 100, as the maximum

>value of N in Fig 2 is 300.

[Answer]

We ran a new simulation with N=100 and redrew Fig. 2, as described in the response to Major-comment 2.

>p.20 line 314: The ministry name should be followed by the country name.

[Answer]

The citation format for this reference is prescribed, so we kept the order as it is.

[Answer]

We have corrected all of the following minor points.

>p.3 lines 31-32 and other places: Please consider adding the word “previous” when referring to previous observations.

>Example: … the estimated parameters did not always agree with “previously reported values”. (p.3 lines 31-32)

>p.47 line 47: “appearance” should be “mortality” according to ref. 2.

>p.6 lines 74-75: Consider the following modification: “… and ionizing radiation contributes to the carcinogenic

>process by adding a few mutations”

>p.6 line 91: … due “to” replication errors …

>p.7 line 106, p.24 line 387, p.31 line 503: Replace the first comma (,) with a semicolon (;) (e.g., …; however, …).

>p.12 lines 184-185: “…, in the growth phase, …” This phrase should be deleted.

>p.12 line 186: The subscript g should be substituted with s (Gamma_g to Gamma_s).

>p.13 line 207 and other occasions: Because “oncogenes” do not include “tumor suppressor genes”, the wording here

>should be “cancer driver genes” instead of “oncogenes”. The same applies to many occasions in the manuscript.

>p.14 line 217: “… of having more than g mutations …” should be “… of having g or more mutations …”

>p.14 line 221: “cancer risk” should be “cumulative cancer incidence”

>p.16 line 252: n should be d?

>p.17 line 273: the other -> another

>p.19 line 308 and other occasions: The wording of “statistical data” is odd and should be replaced with expressions

>like “epidemiology data”, “real-world data” or “cancer registry data”.

>p.20 line 328: “value” -> “values”

>p.24 line 385: “larger” -> “smaller”?

>p.29 line 480: “with on time” -> “with time”?

>p.30 lines 501-502: The grammar of “… should be depends on …” needs reconsideration.

Attachment

Submitted filename: Response_to_Reviewers.docx

Decision Letter 2

Gayle E Woloschak

20 Feb 2023

PONE-D-22-12989R2

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

PLOS ONE

Dear Dr. Uchinomiya:

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Minor changes mostly editorial in nature have been proposed for the work.  Please address these comments in a revision.

Please submit your revised manuscript by Apr 06 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Gayle E. Woloschak, PhD

Section Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: The authors have adequately addressed referee remarks. The study is ready for publication in my opinion.

Reviewer #3: The authors have adequately addressed my comments, with the following minor points left. I apologize that some of these points had been missed in the previous round of review.

1. Line 39: re-estimate -> re-estimated

2. Eq 1, 2, 6, 7, 12, 13, 15, 16 and 17: lambda_2 is used in some places instead of lambda_1.

3. Eq 22: i is used instead of j.

4. Line 269: “These means” -> “This means” or “These mean”

5. Line 303: ... the number of lesion“s” ...

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jun 14;18(6):e0286499. doi: 10.1371/journal.pone.0286499.r006

Author response to Decision Letter 2


30 Apr 2023

>Reviewer #2: The authors have adequately addressed referee remarks. The study is ready for publication

>in my opinion.

[Answer] We were relieved that our response seemed to address your concerns.

>Reviewer #3: The authors have adequately addressed my comments, with the following minor

>points left. I apologize that some of these points had been missed in the previous round of

>review.

>1. Line 39: re-estimate -> re-estimated

>2. Eq 1, 2, 6, 7, 12, 13, 15, 16 and 17: lambda_2 is used in some places instead of lambda_1.

>3. Eq 22: i is used instead of j.

>4. Line 269: “These means” -> “This means” or “These mean”

>5. Line 303: ... the number of lesion“s” ...

[Answer] Thank you very much for your careful checking. We have corrected all the points you pointed out.

Attachment

Submitted filename: Response_to_Reviewers.docx

Decision Letter 3

Gayle E Woloschak

18 May 2023

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

PONE-D-22-12989R3

Dear Dr. Uchinomiya:

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Gayle E. Woloschak, PhD

Section Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Gayle E Woloschak

22 May 2023

PONE-D-22-12989R3

A mathematical model for cancer risk and accumulation of mutations caused by replication errors and external factors

Dear Dr. Uchinomiya:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Gayle E. Woloschak

Section Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Estimation of g and λ1p through the least squares method.

    For each g, the sum of the squares of the residuals was minimized with respect to λ1p. Then, the combination of g and λ1p giving the overall minimum of the sum of squares of the residuals was selected. The order of the figures corresponds to that in Fig 5: (a) esophageal cancer, (b) leukemia, (c) liver cancer, (d) lung cancer, (e) thyroid cancer, (f) pancreatic cancer, (g) colon cancer, (h) breast cancer, and (i) prostate cancer. The upper and lower panels show the results of males and females, respectively.

    (TIF)
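The two-step search described in the S1 Fig caption can be sketched as follows. This is a hedged illustration, not the authors' fitting code: `stand_in_risk` is a hypothetical placeholder for the paper's Eq (25), which is not reproduced in this section, and the synthetic data, the grid for λ1p, and all function names are assumptions. Only the search structure mirrors the caption: for each g, minimize the sum of squared residuals over λ1p, then keep the (g, λ1p) pair with the smallest sum overall.

```python
import math

def stand_in_risk(age, g, lam):
    """Hypothetical cumulative-risk curve; NOT the paper's Eq (25)."""
    return 1.0 - math.exp(-((lam * age) ** g))

def fit_g_and_lambda(ages, observed, g_values, lam_grid, model=stand_in_risk):
    """Return (ssr, g, lam) with the smallest sum of squared residuals."""
    best = None
    for g in g_values:
        # Inner step: for this fixed g, pick the lam on the grid that minimizes SSR.
        ssr, lam = min(
            (sum((model(t, g, l) - y) ** 2 for t, y in zip(ages, observed)), l)
            for l in lam_grid
        )
        # Outer step: keep the best (g, lam) combination seen so far.
        if best is None or ssr < best[0]:
            best = (ssr, g, lam)
    return best

# Synthetic check: data generated from the stand-in curve with g=3, lam=0.01.
ages = list(range(20, 85, 5))
observed = [stand_in_risk(t, 3, 0.01) for t in ages]
lam_grid = [10 ** (-4 + 4 * k / 400) for k in range(401)]  # log-spaced grid
ssr, g_hat, lam_hat = fit_g_and_lambda(ages, observed, range(1, 11), lam_grid)
print(g_hat, lam_hat, ssr)
```

A grid search is used here only to keep the sketch dependency-free; a one-dimensional optimizer could replace the inner step without changing the outer loop over g.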

    S1 File. Code of Gillespie algorithm.

    This is C code for the Gillespie algorithm. The default parameters correspond to Fig 2.

    (C)

    Attachment

    Submitted filename: Response_to_Reviewers.docx

    Attachment

    Submitted filename: Response_to_Reviewers.docx

    Attachment

    Submitted filename: Response_to_Reviewers.docx

    Data Availability Statement

    Statistical data used in the manuscript can be accessed from the "graph database" in "latest cancer statistics" of Cancer Information Service, National Cancer Center, Japan (National Cancer Registry, Ministry of Health, Labour and Welfare (https://ganjoho.jp/reg_stat/index.html). The direct URL to the "graph database" for each cancer and the values used there are shown below. “Graph database” of esophageal cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=4&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of esophageal cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=4&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of leukemia: 
https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=27&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of leukemia: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=27&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of liver cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=8&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of liver cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=8&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of lung cancer: 
https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=12&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of lung cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=12&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of thyroid cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=24&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of thyroid cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=24&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of pancreatic 
cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=10&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of pancreatic cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=10&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of colon cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=6&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of colon cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=6&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of 
breast cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=14&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of breast cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=14&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 “Graph database” of prostate cancer: https://gdb.ganjoho.jp/graph_db/gdb1?dataType=30&graphId=106&totalTarget=40&_useLog=on&_stackedRaito=on&_showErrorBar=on&_useUnknownStage=on&smTypes=1&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&_smTypes=on&smType=20&year=2015&years=2015&_years=1&avgStep=&survivalAgeKbn=AAA&sexType=0&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&marumeAgeKbn=A85&stage=0&elapsedYears=5&_showBreastOnlyFemale=on&_showOnlyPrefectures=on&showGraph=Submit The values of the data of prostate cancer: https://gdb.ganjoho.jp/graph_db/gdb1?showData=&dataType=30&graphId=106&totalTarget=40&year=2015&years=2015&avgStep=&survivalAgeKbn=AAA&ageSybt=0&ageSt=009&ageEd=A85¤tAge=0&smTypes=1&smType=20&sexType=0&marumeAgeKbn=A85&stage=0&elapsedYears=5 The authors 
confirm that others would be able to access or request these data in the same manner as the authors. The authors did not have any special access or request privileges that others would not have.


    Articles from PLOS ONE are provided here courtesy of PLOS

    RESOURCES