Abstract
Tumor growth from a single transformed cancer cell up to a clinically apparent mass spans many spatial and temporal orders of magnitude. Implementation of cellular automata simulations of such tumor growth can be straightforward but computing performance often counterbalances simplicity. Computationally convenient simulation times can be achieved by choosing appropriate data structures, memory and cell handling as well as domain setup. We propose a cellular automaton model of tumor growth with a domain that expands dynamically as the tumor population increases. We discuss memory access, data structures and implementation techniques that yield high-performance multi-scale Monte Carlo simulations of tumor growth. We discuss tumor properties that favor the proposed high-performance design and present simulation results of the tumor growth model. We estimate to which parameters the model is the most sensitive, and show that tumor volume depends on a number of parameters in a non-monotonic manner.
Keywords: cellular automaton, dynamic boundaries, tumor model, cancer stem cells, sensitivity analysis
Introduction
Simulating complex multi-scale cellular automata is still a great challenge despite advances in computational power of modern computers in recent years. Cellular automata are increasingly used to simulate tumor growth dynamics [1-15]. Whilst many efficient ways exist to simulate deterministic and synchronous cellular automata, such as Conway's ‘Game of Life’ [16], high-performance simulation of stochastic cancer cell kinetics and emerging multi-scale tumor population dynamics is still in its infancy. In stochastic Monte Carlo cancer models, cells are not governed by simple deterministic rules but by probability distributions of coupled internal states and non-trivial interactions with the continuously changing local environment. Additionally, tumor population dynamics emerge from the interaction of millions of cells, and often the development of such populations from few initial cells needs to be simulated. This poses problems of bridging many temporal and spatial scales. Due to the stochastic nature of single cell kinetics, many simulations of the same scenario need to be performed in order to obtain averaged and statistically significant results. To further complicate matters, in typical tumor growth models many parameters need to be estimated in high-dimensional parameter sweeps, and sensitivity analysis needs to be performed to study parameter influence on overall dynamics.
The main advantage of utilizing cellular automata in cancer modeling is the ability to formalize experimentally observable single-cell kinetics [17, 18] and observe emerging population level dynamics without a priori knowledge of tumor behavior. Because of their apparent resemblance to in vitro cell culture models, cellular automata may be referred to as in silico experiments [19]. Automata simulations enable visualization, measurement and perturbation of cell kinetics as well as their interaction with the environment. Herein we describe a simple cellular automaton tumor growth model, and discuss computer memory access, data structures, domain setup and implementation techniques that enable high-performance multi-scale simulations.
Tumor Growth Model
A cancer cell is an individual entity that occupies a single grid point of (10 μm)² on a two-dimensional square lattice. Each cancer cell is characterized by its specific trait vector [cct, ρ, μ, α] denoting cell cycle time, proliferation potential, migration potential and rate of spontaneous death, respectively [20]. We assume a heterogeneous tumor population consisting of so-called cancer stem cells and non-stem cancer cells. Cancer stem cells are assumed to be immortal and have unlimited proliferation potential (i.e., α=0, ρ=∞), whereas non-stem cancer cells can only divide a limited number of times ρmax before cell death. Each cell type can divide symmetrically to produce two daughter cells with parental phenotype. Both populations are coupled through asymmetric division of cancer stem cells, which with probability 1-ps (where ps is the probability of symmetric cancer stem cell division) produces a cancer stem cell and a non-stem cancer cell, which inherits an initial proliferation potential ρ=ρmax that decreases with each subsequent non-stem cell division (Figure 1A). Cells need adjacent space for migration and proliferation, and cells that are completely surrounded by other cells (eight on a two-dimensional lattice; Moore neighborhood) become quiescent (Figure 1B). In unsaturated environments, cells proliferate and migrate into vacant adjacent space at random. At each simulation step, cells can undergo spontaneous death with rate α and will be instantaneously removed from the system.
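The division rules described above can be sketched in C++ as follows. This is a minimal illustration, not the published source code: the names Cell, DivisionResult and divide are hypothetical, the uniform random draw u is passed in by the caller, and the treatment of a non-stem cell whose proliferation potential is exhausted (it dies when it attempts to divide) is an assumption made for this sketch.

```cpp
#include <cassert>

// Hypothetical cell record; only the traits needed for division are kept.
struct Cell {
    bool isStem;  // cancer stem cell (unlimited proliferation potential)
    int rho;      // remaining divisions; ignored for stem cells
};

struct DivisionResult {
    bool parentSurvives;
    bool hasDaughter;
    Cell daughter;
};

// u is a uniform random number in [0, 1) supplied by the caller,
// ps is the probability of symmetric stem cell division,
// rhoMax is the initial proliferation potential of non-stem cells.
inline DivisionResult divide(Cell& parent, double u, double ps, int rhoMax) {
    assert(u >= 0.0 && u < 1.0);
    if (parent.isStem) {
        // Symmetric division yields a second stem cell; asymmetric division
        // yields a non-stem daughter that starts with rho = rhoMax.
        const bool symmetric = u < ps;
        return {true, true, Cell{symmetric, symmetric ? 0 : rhoMax}};
    }
    if (parent.rho <= 0) {
        // Assumption of this sketch: an exhausted non-stem cell dies.
        return {false, false, Cell{false, 0}};
    }
    --parent.rho;                 // potential decreases with each division
    return {true, true, parent};  // daughter inherits the decremented potential
}
```

Passing the random draw as an argument keeps the rule deterministic for a given u, which makes the sketch easy to check in isolation.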
Time is advanced at discrete time intervals Δt = 1/24 day (i.e., 1 hour), and 24 simulation steps represent one day. At each simulation step, cells are considered in random order to minimize lattice geometry effects and the behavior of each cell is updated. Cell proliferation, migration and death are random events with the respective probabilities scaled to the simulation time step. Cell proliferation and migration are temporally mutually exclusive events. We assume that cells proliferate at each simulation step with probability pd =(24/cct)×Δt, migrate with probability (1-pd)pm and die with probability α. Let pm= μ×Δt where μ denotes cancer cell motility speed.
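The per-step event probabilities defined above can be computed as in the following sketch. The names StepProbabilities and stepProbabilities are hypothetical. The sketch assumes that cct is given in hours and that μ is expressed in lattice sites per day (so that 100 μm/day on a 10 μm lattice corresponds to 10 sites per day); the latter unit conversion is an interpretation, not something stated explicitly in the text.

```cpp
#include <cassert>

// Hypothetical container for the three per-step event probabilities.
struct StepProbabilities {
    double divide;   // p_d
    double migrate;  // (1 - p_d) * p_m
    double die;      // alpha
};

// cctHours:      cell cycle time in hours
// muSitesPerDay: motility in lattice sites per day (assumed unit)
// alpha:         per-step probability of spontaneous death
inline StepProbabilities stepProbabilities(double cctHours, double muSitesPerDay,
                                           double alpha) {
    const double dt = 1.0 / 24.0;              // one simulation step, in days
    const double pd = (24.0 / cctHours) * dt;  // p_d = (24/cct) x dt
    const double pm = muSitesPerDay * dt;      // p_m = mu x dt
    assert(pd >= 0.0 && pd <= 1.0);
    assert(pm >= 0.0 && pm <= 1.0);
    assert(alpha >= 0.0 && alpha <= 1.0);
    return {pd, (1.0 - pd) * pm, alpha};
}
```

For cct = 24 hours and μ = 10 sites per day, this gives a division probability of 1/24 per step and a migration probability of (23/24)×(10/24) per step.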
Implementation
Memory architecture and data access
High-performance simulations require fast access to available memory and cached data. How memory is handled depends heavily on simulation design as well as on the data structures and procedures used. The memory in modern desktop PCs has three layers: the built-in cache memory has the fastest access time (1-20 ns) but a very limited size; random access memory (RAM) is slower (50-100 ns) but much larger; and hard disk drives (HDD), whilst offering the largest capacity, have the slowest access time (5-10 ms) (Figure 2).
State-of-the-art processors may have up to 24 Megabytes (MB) of cache memory and can access up to 4096 Gigabytes (GB) of RAM (cf. Intel Xeon E7-8830). Cache memory stores the most frequently used RAM locations to reduce access time to necessary information [21]. Due to limited memory size, cached content constantly changes throughout simulations. Simulation time increases when the spatial locality property is unsatisfied, i.e. when the CPU frequently requires access to information that is not stored in cache memory locations. If a so-called cache miss occurs, data needs to be retrieved from much slower RAM or even HDD memory. High frequencies of cache misses dramatically reduce computation speed, and optimized algorithms should minimize cache miss events.
Let us consider a two-dimensional rectangular lattice coded by a two-dimensional array as commonly used in cellular automata. As computer memory is arranged linearly, higher-dimensional arrays are stored row after row (or column after column). Therefore, if an array element is accessed only parts of its immediate spatial neighborhood will be stored in the cache. Especially for large lattices, 2 of 4 neighbors (2-D von Neumann neighborhood) or 6 of 8 neighbors (2-D Moore neighborhood) are cache missed. Whilst convenient at implementation, frequent access of cell neighbors in two- and three-dimensional arrays is memory inefficient and slow.
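The linear storage of a two-dimensional lattice can be illustrated with a short sketch. The helper names flatIndex and mooreOffsets are hypothetical, and row-major storage is assumed. The point is only that horizontal neighbors sit one element away in memory, whereas neighbors in the rows above and below sit roughly one full lattice width away.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Position of lattice site (row, col) in a row-major one-dimensional array.
inline std::size_t flatIndex(std::size_t row, std::size_t col, std::size_t width) {
    assert(col < width);
    return row * width + col;
}

// Memory offsets (in elements) of the eight Moore neighbors of an interior site.
// Only the two offsets -1 and +1 are adjacent in memory; the remaining six are
// about one lattice width away and may lie in different cache lines.
inline std::array<std::ptrdiff_t, 8> mooreOffsets(std::size_t width) {
    const std::ptrdiff_t w = static_cast<std::ptrdiff_t>(width);
    return {{-w - 1, -w, -w + 1, -1, +1, w - 1, w, w + 1}};
}
```

For a lattice that is 1000 sites wide, the vertical and diagonal neighbors are about 1000 elements away from the accessed site, which is consistent with the 6 of 8 figure quoted above for large lattices.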
Population geometry and data type optimization
Which data structures are best to use depends on the cellular processes that are considered as well as the geometry of the emerging population. Prostate tumors, for example, have a very dense, compact structure whereas glioblastoma brain tumors are highly diffusive. Such density difference may be represented by the number of cells on the computational lattice per area or volume. Let us define a dense tumor as a population of cells where each lattice point is occupied by a cell with probability p=0.99 (i.e., 99%) and a diffusive tumor as one that occupies lattice points with p=0.5. For cells to migrate or proliferate, adjacent lattice points need to be vacant. Depending on the expected tumor density (either many or few neighbors for most cells), the most efficient data structure for obtaining vacant neighbor lattice sites might be different. To determine cell neighborhood vacancies, a simple array keeping Boolean information about lattice points occupied by cells will be highly inefficient for dense tumors. Let us consider morphological erosion, where each cell that is not completely surrounded by other cells is removed from the lattice. For dense population geometries, a coded array containing information about the number of vacant spots in the cell neighborhood may be more efficient, as unsuccessful scanning of each neighboring lattice point for vacancy is avoided (Figure 3A,B). Appropriate use of the C++ data type char will not introduce a memory tradeoff, as both char and Boolean require one byte of memory. Using the intuitive int instead of char will require four times more memory and increase computation time, as less information can be stored in cache memory. A computationally expensive drawback of a coded lattice is the requirement to update all neighboring lattice codes when occupancy of a single grid point changes, which makes this approach less efficient for diffusive tumors (Figure 3C).
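A coded lattice of the kind discussed above might look like the following sketch. The class name CodedLattice and its interface are hypothetical, and two design choices are assumptions made for illustration: occupancy and neighbor counts are kept in two separate char arrays, and off-lattice sites are counted as blocked so that border cells need no special handling at query time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a coded lattice: besides occupancy, every site stores how many of
// its eight Moore neighbors are blocked (occupied or outside the lattice).
class CodedLattice {
public:
    CodedLattice(int width, int height)
        : w_(width), h_(height),
          occupied_(static_cast<std::size_t>(width) * height, 0),
          blocked_(static_cast<std::size_t>(width) * height, 0) {
        assert(width > 0 && height > 0);
        // Count off-lattice neighbors as blocked once, at construction.
        for (int y = 0; y < h_; ++y)
            for (int x = 0; x < w_; ++x)
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx)
                        if ((dx != 0 || dy != 0) && !inside(x + dx, y + dy))
                            ++blocked_[idx(x, y)];
    }

    bool occupied(int x, int y) const { return occupied_[idx(x, y)] != 0; }

    // Single array lookup instead of scanning eight neighboring sites.
    int vacantNeighbors(int x, int y) const { return 8 - blocked_[idx(x, y)]; }

    // Changing one site requires updating the codes of all its neighbors.
    void setOccupied(int x, int y, bool value) {
        assert(inside(x, y));
        if (occupied(x, y) == value) return;
        occupied_[idx(x, y)] = value ? 1 : 0;
        const int delta = value ? 1 : -1;
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx)
                if ((dx != 0 || dy != 0) && inside(x + dx, y + dy)) {
                    char& code = blocked_[idx(x + dx, y + dy)];
                    code = static_cast<char>(code + delta);
                }
    }

private:
    bool inside(int x, int y) const { return x >= 0 && x < w_ && y >= 0 && y < h_; }
    std::size_t idx(int x, int y) const { return static_cast<std::size_t>(y) * w_ + x; }

    int w_, h_;
    std::vector<char> occupied_;  // 1 byte per site
    std::vector<char> blocked_;   // 1 byte per site, values 0..8
};
```

With such a structure, the erosion test for a cell reduces to checking whether vacantNeighbors(x, y) is zero, at the cost of eight code updates whenever a site changes occupancy.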
Random neighbor selection
Monte Carlo tumor growth simulations frequently require obtaining a free neighboring lattice site at random, for example for migration or proliferation. A naïve approach may consider all neighboring lattice sites, store those that are vacant in a temporary vector and then select a vector element at random. Alternatively, neighboring lattice sites may be addressed in random order and the first encountered vacant position is selected (Figure 4A). This simple alternative random access method significantly decreases simulation time in dense (Figure 4B) and diffusive tumors (Figure 4C) with increasing lattice size (free spot is selected iteratively for each cell on the lattice). While the naïve procedure is much slower for diffusive tumors as more vacant lattice sites have to be stored in temporary vectors, the alternative random access approach performs equally well irrespective of tumor type.
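The alternative random access method can be sketched as follows. The function name randomVacantNeighbor, the row-major occupancy vector and the use of std::mt19937 are assumptions made for this illustration and are not taken from the published code. The eight Moore neighbors are visited in a random order and the first vacant one is returned, so no temporary vector of vacant sites is built.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns the flat (row-major) index of a randomly chosen vacant Moore
// neighbor of site (x, y), or -1 if every neighbor is occupied or off-lattice.
inline long randomVacantNeighbor(const std::vector<char>& occupied,
                                 int width, int height, int x, int y,
                                 std::mt19937& rng) {
    assert(width > 0 && height > 0);
    assert(occupied.size() == static_cast<std::size_t>(width) * height);
    static const int dx[8] = {-1, 0, 1, -1, 1, -1, 0, 1};
    static const int dy[8] = {-1, -1, -1, 0, 0, 1, 1, 1};

    // Visit the eight directions in a random order.
    std::array<int, 8> order = {{0, 1, 2, 3, 4, 5, 6, 7}};
    std::shuffle(order.begin(), order.end(), rng);

    for (int k : order) {
        const int nx = x + dx[k];
        const int ny = y + dy[k];
        if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
        const std::size_t i = static_cast<std::size_t>(ny) * width + nx;
        if (!occupied[i]) return static_cast<long>(i);  // first vacant site wins
    }
    return -1;
}
```

Because the visiting order is a uniformly random permutation, the first vacant site encountered is uniformly distributed over all vacant neighbors, so the selection is statistically equivalent to the naïve approach.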
Dependent on the modeled cellular processes, additional alterations or improvements may be required. For example, one may choose to store hashed information about the cell neighborhood in the lattice. In particular, a limited number of possible cell neighborhood configurations may be encoded in identifying keys.
Random ordering
Many programming languages provide efficient procedures and data structures that can be utilized for cellular automata design in combination with simulation specific code. In asynchronous stochastic cellular automata of tumor growth, random cell ordering and random access of cells is fundamental. A naïve implementation of selecting cells in a random order may consist of the following steps:
1. From a vector containing all cells, pick a cell at random by drawing a random positive integer not larger than the vector length.
2. Erase the selected cell from the vector to avoid reselecting it.
3. Repeat steps 1 and 2 until there are no cells left in the vector.
The C++ Standard Template Library (STL) provides numerous algorithms to perform search, sort and shuffle operations. Random shuffle rearranges all elements in a specified range randomly in a single invocation. The STL random shuffle procedure reduces computation time compared to the naïve approach by multiple orders of magnitude even for small vector sizes (Figure 5), clearly demonstrating the power and importance of using standard language-specific data structures and algorithms for high-performance simulations.
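The two ordering strategies can be compared in the following sketch, in which cells are represented by integer identifiers and the function names naiveRandomOrder and shuffledOrder are hypothetical. The sketch uses std::shuffle with an explicit random engine rather than std::random_shuffle, because the latter was deprecated in C++14 and removed in C++17; this substitution is a choice made here and may differ from the call used in the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Naive approach: repeatedly pick a random element and erase it.
// Each erase shifts the remaining elements, so the total cost is quadratic.
inline std::vector<int> naiveRandomOrder(std::vector<int> cells, std::mt19937& rng) {
    std::vector<int> order;
    order.reserve(cells.size());
    while (!cells.empty()) {
        std::uniform_int_distribution<std::size_t> pick(0, cells.size() - 1);
        const std::size_t i = pick(rng);
        order.push_back(cells[i]);
        cells.erase(cells.begin() + static_cast<std::ptrdiff_t>(i));
    }
    return order;
}

// Library approach: a single shuffle call with linear cost.
inline std::vector<int> shuffledOrder(std::vector<int> cells, std::mt19937& rng) {
    std::shuffle(cells.begin(), cells.end(), rng);
    return cells;
}
```

Both functions return a random permutation of the input; they differ only in cost, which grows quadratically with the number of cells for the naïve version and linearly for the shuffle.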
Dynamically growing domains
To simulate a growing tumor population from a single cancer cell, boundary constraints induced by the computational lattice need to be avoided. An appropriate lattice size must be selected depending on the achievable tumor size, which requires a priori knowledge about emerging population dynamics, tumor density and cell diffusibility. A dense, radially symmetrically growing two-dimensional tumor population of 100,000 cells could well fit into a 400×400 lattice. Cells in a highly diffusive irregular tumor, however, will likely hit the boundary of such a lattice early during tumor growth. Whilst a sufficiently large lattice could ensure avoidance of boundary contact, memory requirements and computing performance limit such an approach. Large amounts of computational resources would be wasted, especially early in population expansion when only a few cells are present.
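As a rough check of the lattice size quoted above, consider an idealized tumor that is perfectly circular and fully occupied, with one cell per lattice site. This idealization is an assumption for the estimate and not a simulation result.

```latex
\pi r^{2} = N
\quad\Rightarrow\quad
r = \sqrt{\frac{N}{\pi}} = \sqrt{\frac{100{,}000}{\pi}} \approx 178 \ \text{lattice sites},
\qquad
2r \approx 357 < 400 .
```

Under this idealization, the tumor diameter leaves a margin of roughly 20 lattice sites on each side of a 400×400 lattice, which a diffusive or irregular tumor of the same cell number would quickly exceed.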
One possibility is to use dynamic data structures such as a C++ standard template library (STL) map, which can be understood as an associative container that stores elements formed by the combination of a key value and a mapped value. Unfortunately, accessing elements in a map is logarithmic in size, which for large tumor sizes dramatically decreases computational performance (Figure 6). We propose a dynamically allocated array with associated procedures that expand the lattice points. While a static large array of 1000×1000 lattice points is the most efficient for large tumors it is inefficient for small tumors up to 10,000 cells due to unoccupied lattice sites occupying large amounts of memory. The dynamically expanding array is most efficient for small tumors and comparable to large static arrays for large tumor populations.
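A dynamically expanding lattice could be organized along the lines of the following sketch. The class name ExpandingLattice, the square geometry, the symmetric padding and the border test are all assumptions made for illustration; the text does not specify how the published implementation triggers or performs the expansion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a square lattice stored as a contiguous array that can be
// enlarged on demand while keeping its content centered.
class ExpandingLattice {
public:
    explicit ExpandingLattice(int size)
        : n_(size), data_(static_cast<std::size_t>(size) * size, 0) {
        assert(size > 0);
    }

    int size() const { return n_; }

    char& at(int x, int y) {
        assert(x >= 0 && x < n_ && y >= 0 && y < n_);
        return data_[idx(x, y, n_)];
    }

    // True when (x, y) lies within `margin` sites of the lattice border,
    // which a simulation could use as the trigger for expansion.
    bool nearBorder(int x, int y, int margin) const {
        return x < margin || y < margin || x >= n_ - margin || y >= n_ - margin;
    }

    // Adds `pad` empty sites on every side. Returns the shift that callers
    // must add to previously stored cell coordinates.
    int expand(int pad) {
        assert(pad > 0);
        const int m = n_ + 2 * pad;
        std::vector<char> grown(static_cast<std::size_t>(m) * m, 0);
        for (int y = 0; y < n_; ++y)  // copy row by row into the center
            std::copy_n(&data_[idx(0, y, n_)], n_, &grown[idx(pad, y + pad, m)]);
        data_.swap(grown);
        n_ = m;
        return pad;
    }

private:
    static std::size_t idx(int x, int y, int n) {
        return static_cast<std::size_t>(y) * n + x;
    }

    int n_;
    std::vector<char> data_;
};
```

Element access stays a constant-time array lookup, in contrast to the logarithmic lookup of a map, and the cost of copying is only paid on the comparatively rare occasions when a cell approaches the border.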
Tumor growth simulations
We will use a combination of the presented high-performance techniques to simulate solid tumor growth and compare it to a naïve implementation. Let us initialize tumor growth simulations with one cancer stem cell located in the center of a square lattice with trait vector [cct=24hours, ρ=∞, μ=100μm/day, α=0%] and ps=0.1. Non-stem daughter cells are initialized with trait vector [cct=24hours, ρ=10, μ=100μm/day, α=1%].
These parameter values have previously been shown to enable fast dense tumor growth [20]. Tumor growth dynamics with other parameters have been discussed elsewhere [22-24]. Cell cycle time and migration rate can be measured experimentally [17], while other parameters are yet to be determined. It is conceivable, however, that model parameters are organ or even patient specific. We therefore use the initial trait vector as control and study the sensitivity of tumor volume to changes in each parameter. We simulate tumor growth for t=180 days using an intuitive implementation (naïve code) and compare to an implementation with a combination of the above-discussed improvements (improved code). The naïve simulation is executed on a fixed 750×750 square lattice, whereas the improved simulation is initiated on a 50×50 square lattice with dynamically expanding domains. Due to the stochastic nature of the model we simulate N=100 (improved) and N=77 (naïve) independent tumors and report average results. Both implementations yield similar population sizes with comparable cancer stem cell and non-stem cancer cell numbers (Figure 7). While the naïve code executes in an average of 4212 seconds (>70 minutes), the improved code executes in 51 seconds (<1 minute), an 82-fold reduction in computing time. The high performance of the improved code is due to the dynamically expanding domain as well as efficient access to information on vacant neighboring lattice sites. More than 70% of all cells at the final time point of the simulation have no adjacent space, and less than 5% of cells have two or more vacant lattice sites to migrate or proliferate into (Figure 7D). Graphical visualization of tumor morphologies at different time points shows that tumors simulated with either implementation are indistinguishable beyond intrinsic stochastic effects (Figure 8).
To study the effects of model parameters on emerging tumor volume we separately change the initial parameter values (control) by 50% in either direction and compare the resulting total cell count to the control tumor. Sensitivity analysis reveals that the rate of spontaneous cell death has little impact on total cell number, compared to cell migration and symmetric stem cell division rate, which strongly correlate with tumor volume. Whilst an increase in symmetric stem cell division probability ps and cell migration rate μ yields larger tumor volumes, cell death rate α and proliferation potential ρ modulate tumor volume non-monotonically (Figure 9). A 50% increase as well as a 50% decrease in these parameter values yields tumor volumes that are smaller than the control tumor, suggesting the existence of optimum parameter values that maximize tumor volume. Their specific values, however, will be dependent on the other model parameters [23]. Interestingly, increasing proliferation potential ρ yields significantly smaller tumors than lower ρ values, indicating a strong competition between non-stem cancer cells and cancer stem cells that drives the total population into prolonged phases of dormancy [24].
Discussion
Cellular automata are frequently used to simulate solid tumor growth and cancer stem cell dynamics [1,25,26]. The intuitive design of stochastic cellular automata for tumor modeling is often offset by poor performance. Although cellular automata are lattice-based, a naïve implementation as two-dimensional Boolean arrays is computationally inefficient. We set out to compare C++ data structures, memory-efficient procedures and dynamic domains to decrease computing time. As extension to three spatial dimensions is trivial, we presented implementation details in two dimensions for clarity. We found that simple substitutions in intuitive cellular automaton implementations significantly decrease computing time. First, appropriate use of data type char over int provides a 4-fold reduction in memory allocation. Second, consideration of a coded lattice that holds information about a cell's neighborhood vacancies, rather than Boolean information on whether or not a cell is occupying that lattice point, significantly decreases computation time if queries about adjacent space are frequently required for cell decisions. Third, utilization of the C++ STL shuffle method to provide a random order of elements proves superior to repeatedly selecting single elements at random positions within a vector. Finally, we presented a dynamically growing domain that evolves according to the population size, which keeps computation time exceptionally low when the population is small. Computing time is comparable to a large array when the population approaches array carrying capacity, but with the unique option to further expand the domain when needed. When all of these adjustments are combined into a simulation of cancer stem cell driven solid tumor growth, the improved implementation substantially outperforms the naïve approach.
In simulations of tumor growth for 180 days from a single cell to a population of about 140,000 cells, the presented high-performance cellular automaton yields an 82-fold reduction in computing time while reproducing the results of the naïve implementation. We believe the developed high-performance cellular automaton will serve as a template for future simulations of solid tumor growth as well as other population dynamics models. We share the source code for the presented naïve and improved code on our personal websites and the sourceforge.net repository.
Acknowledgments
This work was partially supported by the Polish Ministry of Science and Higher Education within the Iuventus Plus Grant (IP2011 041971) (J.P.) and the NIH/NCI Integrative Cancer Biology Program (5U54 CA113007) (H.E.).
Contributor Information
Jan Poleszczuk, Email: j.poleszczuk@mimuw.edu.pl.
Heiko Enderling, Email: heiko.enderling@moffitt.org.
References
- 1. Agur Z, Daniel Y, Ginosar Y. The universal properties of stem cells as pinpointed by a simple discrete model. J Math Biol. 2002;44:79–86. doi: 10.1007/s002850100115.
- 2. Anderson ARA. A hybrid mathematical model of solid tumour invasion: the importance of cell adhesion. Mathematical Medicine and Biology. 2005;22:163–186. doi: 10.1093/imammb/dqi005.
- 3. Gerlee P, Anderson ARA. An evolutionary hybrid cellular automaton model of solid tumour growth. Journal of Theoretical Biology. 2007;246:583–603. doi: 10.1016/j.jtbi.2007.01.027.
- 4. Bankhead A, III, Magnuson NS, Heckendorn RB. Cellular automaton simulation examining progenitor hierarchy structure effects on mammary ductal carcinoma in situ. Journal of Theoretical Biology. 2007;246:491–498. doi: 10.1016/j.jtbi.2007.01.011.
- 5. Hatzikirou H, Brusch L, Schaller C, Simon M, Deutsch A. Prediction of traveling front behavior in a lattice-gas cellular automaton model for tumor invasion. Computers and Mathematics with Applications. 2009:1–14.
- 6. Kansal AR, Torquato S, Harsh GR, IV, Chiocca EA, Deisboeck TS. Cellular automaton of idealized brain tumor growth dynamics. BioSystems. 2000;55:119–127. doi: 10.1016/s0303-2647(99)00089-1.
- 7. Tang J, Enderling H, Becker-Weimann S, Pham C, Polyzos A, Chen CY, et al. Phenotypic transition maps of 3D breast acini obtained by imaging-guided agent-based modeling. Integr Biol. 2011;3:408. doi: 10.1039/c0ib00092b.
- 8. Alarcón T, Byrne HM, Maini PK. A cellular automaton model for tumour growth in inhomogeneous environment. Journal of Theoretical Biology. 2003;225:257–274. doi: 10.1016/s0022-5193(03)00244-3.
- 9. Patel AA, Gawlinski ET, Lemieux SK, Gatenby RA. A Cellular Automaton Model of Early Tumor Growth and Invasion: The Effects of Native Tissue Vascularity and Increased Anaerobic Tumor Metabolism. Journal of Theoretical Biology. 2001;213:315–331. doi: 10.1006/jtbi.2001.2385.
- 10. Ribba B, Alarcón T, Marron K, Maini PK, Agur Z. The use of hybrid cellular automaton models for improving cancer therapy. 2004:444–453.
- 11. Aubert M, Badoual M, Féreol S, Christov C, Grammaticos B. A cellular automaton model for the migration of glioma cells. Phys Biol. 2006;3:93–100. doi: 10.1088/1478-3975/3/2/001.
- 12. Piotrowska MJ, Angus SD. A quantitative cellular automaton model of in vitro multicellular spheroid tumour growth. Journal of Theoretical Biology. 2009;258:165–178. doi: 10.1016/j.jtbi.2009.02.008.
- 13. Jiao Y, Torquato S. Emergent Behaviors from a Cellular Automaton Model for Invasive Tumor Growth in Heterogeneous Microenvironments. PLoS Comput Biol. 2011;7:e1002314. doi: 10.1371/journal.pcbi.1002314.
- 14. Powathil GG, Gordon KE, Hill LA, Chaplain MAJ. Modelling the effects of cell-cycle heterogeneity on the response of a solid tumour to chemotherapy: Biological insights from a hybrid multiscale cellular automaton model. Journal of Theoretical Biology. 2012;308:1–19. doi: 10.1016/j.jtbi.2012.05.015.
- 15. Enderling H, Park D, Hlatky L, Hahnfeldt P. The importance of spatial distribution of stemness and proliferation state in determining tumor radioresponse. Math Model Nat Phenom. 2009;4:117–133.
- 16. Gardner M. Mathematical games: The fantastic combinations of John Conway's new solitaire game “life”. Scientific American. 1970;223:120–123.
- 17. Gao X, McDonald JT, Hlatky L, Enderling H. Acute and fractionated irradiation differentially modulate glioma stem cell division kinetics. Cancer Research. 2013;73:1481–1490. doi: 10.1158/0008-5472.CAN-12-3429.
- 18. Tang J, Fernandez-Garcia I, Vijayakumar S, Martinez-Ruiz H, Illa-Bochaca I, Nguyen DH, et al. Irradiation of juvenile, but not adult, mammary gland increases stem cell self-renewal and estrogen receptor negative tumors. Stem Cells. 2013. doi: 10.1002/stem.1533.
- 19. Trisilowati, Mallet DG. In Silico Experimental Modeling of Cancer Treatment. ISRN Oncology. 2012;2012:1–8. doi: 10.5402/2012/828701.
- 20. Enderling H, Anderson ARA, Chaplain MAJ, Beheshti A, Hlatky L, Hahnfeldt P. Paradoxical dependencies of tumor dormancy and progression on basic cell kinetics. Cancer Research. 2009;69:8814–8821. doi: 10.1158/0008-5472.CAN-09-2115.
- 21. Hennessy JL, Patterson DA. Computer Architecture. Elsevier; 2012.
- 22. Enderling H, Hlatky L, Hahnfeldt P. Migration rules: tumours are conglomerates of self-metastases. 2009;100:1917–1925. doi: 10.1038/sj.bjc.6605071.
- 23. Morton CI, Hlatky L, Hahnfeldt P, Enderling H. Non-stem cancer cell kinetics modulate solid tumor progression. Theoretical Biology and Medical Modelling. 2011;8:48. doi: 10.1186/1742-4682-8-48.
- 24. Enderling H. Advances in Experimental Medicine and Biology. Springer New York; New York, NY: 2012. Cancer Stem Cells and Tumor Dormancy; pp. 55–71.
- 25. Enderling H, Hlatky L, Hahnfeldt P. Cancer stem cells: a minor cancer subpopulation that redefines global cancer features. Front Oncol. 2013;3:76. doi: 10.3389/fonc.2013.00076.
- 26. Sottoriva A, Verhoeff JJC, Borovski T, McWeeney SK, Naumov L, Medema JP, et al. Cancer Stem Cell Tumor Model Reveals Invasive Morphology and Increased Phenotypical Heterogeneity. Cancer Research. 2010;70:46–56. doi: 10.1158/0008-5472.CAN-09-3663.