Skip to main content
PLOS Computational Biology logoLink to PLOS Computational Biology
. 2021 Jun 28;17(6):e1009094. doi: 10.1371/journal.pcbi.1009094

Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis

John T Nardini 1, Bernadette J Stolz 2, Kevin B Flores 1, Heather A Harrington 2,3, Helen M Byrne 2,*
Editor: Mark Chaplain4
PMCID: PMC8270459  PMID: 34181657

Abstract

Angiogenesis is the process by which blood vessels form from pre-existing vessels. It plays a key role in many biological processes, including embryonic development and wound healing, and contributes to many diseases including cancer and rheumatoid arthritis. The structure of the resulting vessel networks determines their ability to deliver nutrients and remove waste products from biological tissues. Here we simulate the Anderson-Chaplain model of angiogenesis at different parameter values and quantify the vessel architectures of the resulting synthetic data. Specifically, we propose a topological data analysis (TDA) pipeline for systematic analysis of the model. TDA is a vibrant and relatively new field of computational mathematics for studying the shape of data. We compute topological and standard descriptors of model simulations generated by different parameter values. We show that TDA of model simulation data stratifies parameter space into regions with similar vessel morphology. The methodologies proposed here are widely applicable to other synthetic and experimental data including wound healing, development, and plant biology.

Author summary

Vascular networks play a key role in many physiological processes, by delivering nutrition to, and removing waste from, biological tissues. In cancer, tumors stimulate the growth of new blood vessels, via a process called angiogenesis. The resulting vascular structure comprises many inter-connected vessels which lead to emergent morphologies that influence the rate of tumor growth and treatment efficacy. In this work, we consider several approaches to summarize the morphology of synthetic vascular networks generated from a mathematical model of tumor-induced angiogenesis. We find that a topological approach can be used quantify vascular morphology of model simulations and group the simulations into biologically interpretable clusters. This methodology may be useful for the diagnosis of abnormal blood vessel networks and quantifying the efficacy of vascular-targeting treatments.

Introduction

Blood vessels deliver nutrients to, and remove waste products from, tissues during many physiological processes, including embryonic development, wound healing, and cancer [13]. The structure and form of connected blood vessels (e.g., vascular morphology), determine how nutrients and waste are supplied to or removed from the environment and, in turn, influence the behavior of the tissue and its constituent cells. The morphology of a vascular network can reveal the presence of an underlying disease, or predict the response of a patient to treatment. High resolution vascular imaging technology creates an exciting opportunity to develop mathematical tools to discover new links between blood vessel structure and function.

Here, we focus on tumor-induced angiogenesis, the process by which tumor cells stimulate the formation of new blood vessels from pre-existing vasculature [1]. When oxygen and nutrient levels within a population of tumor cells become too low to sustain a viable cell population, the tumor cells produce several growth factors, including vascular endothelial growth factor (VEGF), platelet-derived growth factor (PDGF), and basic fibroblast growth factor (bFGF), which diffuse through the surrounding tissue [46]. On contact with neighboring blood vessels, these tumor angiogenesis factors (or TAFs) increase the permeability of the vessel walls and loosen adhesive bonds between the endothelial cells that line the blood vessels [7]. The TAFs activate the endothelial cells to release proteases that degrade the basement membrane [8]. Activated tip endothelial cells then migrate away from the parent vessel and follow external cues, such as spatial gradients of VEGF and/or fibronectin [9, 10]. Stalk endothelial cells located behind the tip cells proliferate into the surrounding tissue matrix, causing the emerging vessel sprouts to elongate [11]. When tip endothelial cells from one sprout come into contact with tip or stalk endothelial cells from another sprout, the two endothelial cells may fuse or anastomose together, forming a new loop through which oxygen and nutrients may be transported [9]. In Fig 1 we present a schematic model of how these processes are coordinated.

Fig 1. Schematic overview of tumor-induced angiogenesis.

Fig 1

Schematic depicting seven aspects angiogenesis that are incorporated in the Anderson-Chaplain model of tumor-induced angiogenesis [17]. (1) Distant tumor cells release a range of chemoattractants, including vascular endothelial growth factors and basic fibroblast growth factors that stimulate the formation of new blood vessels. These growth factors are described collectively as a single, generic tumor angiogenesis factor (TAF). (2) Production and consumption of tissue-matrix bound fibronectin by the endothelial cells creates a spatial gradient of fibronectin across the domain. (3) Endothelial cells sense the TAF and fibronectin gradients and undergo individual cell migration. (4) As endothelial tip cells migrate, via chemotaxis, up spatial gradients of the TAF, stalk cells in the developing vessels are assumed to proliferate, creating what has been termed a “snail trail” of new endothelial cells. (5) Endothelial tip cells also migrate, via haptotaxis, up spatial gradients of fibronectin. (6) Endothelial cells in existing sprouts may initiate the formation of secondary sprouts. (7) If a sprout coincides with an existing vessel, then it is assumed to be annihilated and a new loop is formed; if two sprouts coincide or anastomose, then both are assumed to be annihilated and a new loop forms.

As the number of experimental and theoretical studies of angiogenesis has increased, our knowledge of the mechanisms involved in its regulation has increased beyond the idealized description given above [12, 13]. For example, subcellular signalling involving the VEGF and delta-notch signalling pathways is known to influence whether an endothelial cell adopts a tip or stalk cell phenotype [14]. Further, immune cells, including macrophages that are present in the tumor microenvironment, are known to release TAFs such as VEGF and matrix degrading proteases which facilitate tumor angiogenesis [12]. Mechanical stimuli also influence endothelial cell migration. For example, the fluid shear stress experienced by endothelial cells lining blood vessels drives their migration by causing them to form lamellipodia and focal adhesions at the front of the cell, in the flow direction, and to retract focal adhesions at the rear [15, 16].

In this paper, we explore several approaches for quantifying the morphology of vascular networks that form during angiogenesis, including recently developed techniques from topological data analysis. We develop and test our methodology through application to simulated data from a highly idealized, and well known hybrid (i.e., discrete and continuous) stochastic mathematical model of tumor-induced angiogenesis proposed by Anderson and Chaplain [17] (see Fig 1 for a model schematic).

The Anderson-Chaplain model of tumor angiogenesis is based on continuum models proposed by Balding and McElwain [18] and Byrne and Chaplain [19], as well as the stochastic model proposed by Stokes and Lauffenberger [20]. It is an on-lattice model in which endothelial cells perform a biased random walk and generate blood vessel networks that resemble those observed experimentally [17]. While more detailed theoretical models have been developed (see reviews [2124]), we focus on the Anderson-Chaplain model due to its simplicity and wide adoption in the mathematical biology literature. The methods presented in this work can also be used to study alternative models of angiogenesis, such as the phase-field model presented in [25], the 2D model of early angiogenesis and cell fate specification in [26], the 3D hybrid model presented in [27], and multiscale models of vascular tumor growth, such as those presented in [2831].

The Anderson-Chaplain model describes the spatio-temporal evolution of three physical variables: endothelial tip cells, tumor angiogenesis factor (TAF), and fibronectin [17]. Fig 1 contains a model schematic where tumor cells located along the right boundary produce TAF and the endothelial tip cells from a nearby healthy vessel placed along the left boundary move via chemotaxis up spatial gradients of TAF. As the tip cells move, stalk cells located behind them proliferate, creating what has been termed a “snail trail” of new blood vessel segments [21]. Endothelial tip cells also migrate via haptotaxis, up spatial gradients of fibronectin, which is produced and consumed by the migrating endothelial cells and binds to the tissue matrix in which the cells are embedded. As the tip cells migrate and the stalk cells proliferate, the vessel sprouts elongate, and secondary sprouts may emerge from the vessels, at the end of a sprout or laterally from the nascent vessel. Further, when a tip cell collides with another vessel segment or another tip cell, the two cells may fuse, forming a closed loop or anastomosis.

We would like a quantitative method to understand and compare vessel networks that is applicable to experimental and synthetic data. Standard quantitative descriptors of vascular morphology proposed by [27, 3235] include inter-vessel spacing, the number of branch points, measuring the total area (volume) covered for 2D (3D) simulations, vessel length, density, tortuosity, the number of vessel segments, and the number of endothelial sprouts. These metrics correlate with tumor progression and treatment response in experimental datasets [36]; however, they are computed at a fixed spatial scale, and neglect information about the connectedness (i.e., the topology) of vascular structures. Therefore, standard descriptors do not exploit the full information content of synthetic data generated from the models. At the same time, advances in imaging technology mean that it is now possible to visualize vascular architectures across multiple spatial scales [37]. These imaging developments are creating a need for new methods that can quantify the multi-scale patterns in vascular networks, and how they change over time.

A relatively new and expanding field of computational mathematics for studying the shape of data at multiple scales is TDA. TDA consists of algorithmic methods for identifying global structures in high-dimensional datasets that may be noisy [38, 39]. TDA has been useful for biomedical applications [4043], including brain vessel MRI data [44] and experimental data of tumor blood vessel networks [45, 46]. A commonly-used tool from TDA is persistent homology (PH). Homology refers to the characterization of topological features (e.g., connected components and loops) in a dataset, and persistence refers to the extent to which these features persist within the data as a scale parameter varies. PH is a useful way to summarize the topological characteristics of spatial network data [4547] as well as noisy and high dimensional agent-based models (ABMs) [48, 49]. More recently, PH has been combined with machine learning to extract accurate estimates of parameters from such models [50]. Most of these studies focus on PH calculations for point clouds of data; notably, PH has been adapted to analyze networks and grayscale images [51]. An active area of research is exploring different filtrations for application-driven PH.

In this paper, we develop a computational pipeline for analyzing simulations of the Anderson-Chaplain model over a range of biologically-relevant parameter values. Our main objectives are to identify model descriptor vectors that group together similar model simulations into biologically interpretable clusters. By biologically interpretable, we mean that simulations within a cluster arise from similar parameter values and simulations in different clusters arise from different parameter values.

We use two topological approaches, the sweeping plane and flooding filtrations, to construct simplicial complexes from binary image data generated from simulations of the Anderson-Chaplain model. We show that PH of the sweeping plane filtration and its subsequent vectorization provides an interpretable descriptor vector for the model parameters governing chemotaxis and haptotaxis. We show, by comparison with existing morphologically-motivated standard descriptor vectors, that this topological approach leads to more biologically interpretable clusters that stratify the haptotaxis-chemotaxis parameter space. Furthermore, the clusters generated from the sweeping plane filtration are robust and generalize well to unseen model simulations.

Materials and methods

We simulate the Anderson-Chaplain model of tumor-induced angiogenesis for a range of values of the haptotaxis and chemotaxis coefficients, ρ and χ. Each simulation generates vasculature composed of a collection of interconnected blood vessels in the form of binary image data. We compute three so-called standard descriptor vectors to summarize each simulation: two are computed at 50 time steps and the third is computed at the final time step. We also compute 30 topological descriptor vectors. These topological descriptors encode the proposed multiscale quantification of the vascular morphology. We calculate standard and topological descriptor vectors for the entire set of 1,210 simulations. We then analyze the simulations by clustering the standard or topological descriptor vectors and compare the results. Fig 2 shows the pipeline of data generation and analysis. We give details of the Anderson-Chaplain model, including governing equations, parameter values and implementation, in the S1 Appendix. The Python files and notebooks used for all steps of our analysis are publicly available at https://github.com/johnnardini/Angio_TDA.

Fig 2. Data generation and analysis pipeline.

Fig 2

(1) Spatio-temporal modeling: Anderson-Chaplain model. The Anderson-Chaplain model is simulated for 11 × 11 = 121 different values of the haptotaxis and chemotaxis parameters, (ρ, χ). The model output is saved as a binary image, where pixels labeled 1 or 0 denote the presence or absence, respectively, of blood vessels. We generate 10 realizations for each of the 121 parameter combinations, leading to 1,210 images of simulated vessel networks. (2) Data analysis. We use the binary images from Step 1 to generate standard and topological descriptor vectors A.) Standard descriptor vectors We compute the number of active tip cells and the number of vessel segments at discrete time points. We also compute the length of the vessels at the final time point [32, 34, 35, 52]. B.) Topological descriptor vectors. We construct flooding and sweeping plane filtrations [44] using binary images from the final timepoint of each simulation. We track the birth and death of topological features (connected components and loops) that are summarized as Betti curves and persistence images [52]. (3) Data clustering. We perform k-means clustering using either the standard or topological descriptor vectors computed during step 2 from all 1,210 simulations. In this way, we decompose (ρ, χ) parameter space into regions that group vessel networks with similar morphologies.

Data generation

We simulate the Anderson-Chaplain model as the parameters ρ and χ are each independently varied among the following 11 values: {0, 0.05, 0.1, 0.15, …, 0.5}. The parameter bounds of [0, 0.5] were chosen to yield a range of vascular architectures that are visibly distinct and recapitulate vascular patterns seen in vivo. The resolution of the parameter mesh (11 discrete values between 0 and 0.5) was chosen for computational tractability while still being sufficient to illustrate how TDA methods can detect and quantify differences between simulations from similar parameter values that are difficult to distinguish visually. We generate 10 realizations of the model for each of the 121 (ρ, χ) parameter combinations, and produce 1,210 binary images that summarize the different synthetic blood vessel networks. Each simulation is initialized using the same initial conditions, with all other model parameters fixed at the baseline values used in [17], and no-flux conditions imposed on the domain boundary. For consistency with [17], all simulations are computed on lattices of size 201 × 201. Each simulation continues until either an endothelial tip cell crosses the line x = 0.95 (in dimensionless spatial units) or the maximum simulation time, t = 20 (in dimensionless time units), is exceeded. The final model output is a 201 × 201 binary image in which nonzero (zero) pixels denote the presence (absence) of an endothelial cell at that lattice site.

Morphologically-motivated standard descriptors and descriptor vectors

Morphologically-motivated standard descriptors are used to summarize physical characteristics of each model simulation. Commonly used descriptors include inter-vessel spacing, the number of branch points, the total area (volume) spanned by the network for 2D (3D) simulations, the shape of such loops in 2D (voids in 3D), and tortuosity [27, 3235, 53]. We choose to use the following standard descriptors due to their prevalence in the literature: the number of vessel segments (NS), the number of endothelial tip cells (NT), and the length of the vessel that has penetrated furthest from the parent vessel (L) [18, 32, 34, 35]. In Fig 3 we use idealized networks to indicate how the standard descriptors are defined. As described in S1 Appendix, NS and NT both increase by one with each branching event, and NT decreases by one with each anastomosis event. We record NS and NT at 50 time steps to create the descriptor vectors. To facilitate downstream clustering analysis, we require vectors of the same length for all simulations. To ensure the dynamic descriptor vectors for each model simulation are of length 50, we record NS(t) and NT(t) at the 50 time steps t={ti}i=150, where ti = (i − 1)Δt, Δt = tend/49, and tend denotes the duration of a given simulation. We remark that the requirement for having vectors of the same length for all simulations might not be feasible when dealing with experimental data, which could include heterogeneities in sample size and time point locations. We note further that this requirement only applies to the standard descriptors; for the topological filtrations considered in future sections, only the final segmented vessel image is needed. For each simulation, we also compute L(tend), which we define as the x-coordinate of the vessel segment location which has the greatest horizontal distance from the y-axis. Thus, the standard descriptor vectors we use for subsequent clustering analysis are NS(t), NT(t) and L(tend).

Fig 3. Standard descriptor overview.

Fig 3

Standard descriptors are computed for each model simulation include the number of vessel segments (NS, depicted by each colored solid curve), the number of endothelial tip cells (NT, depicted by black dots), and the x-coordinate of the vessel segment location with the greatest horizontal distance from the y-axis (L, depicted by the vertical grayed dashed line). This schematic example is initialized with 2 vessel segments and 2 tip cells at locations (1) and (2). A branching event at location (3) increases the value of both NS and NT by one. An anastomosis event at location (4) decreases NT by one. The branching event at location (5) increases both NS and NT by one.

Quantifying vessel shape using topological data analysis

The most prominent method from TDA is persistent homology (PH) [38, 51, 54]. PH is rooted in the mathematical concept of homology, which captures the characteristics of shapes. To compute (persistent) homology from data, one needs to first construct simplicial complexes, which can be thought of as collections of generalized triangles. From the constructed simplicial complexes, one can quantify and visualize the connected components (dimension 0) and loops (dimension 1) of a dataset at different spatial scales. While we compute the NT and NS standard descriptor vectors at 50 time points for each model simulation (see Section Morphologically-motivated standard descriptors and descriptor vectors), we compute topological descriptor vectors from the final output binary image.

Simplicial complexes and homology

We construct simplicial complexes from data points derived from binary images of the completed simulations of the Anderson-Chaplain model (see“Synthetic data” in Step 1 of Fig 2, where red pixels denote presence of vasculature). Normally, TDA of image data lends itself to the construction of cubical complexes, which are generalizations of cubes; however, we have binary rather than grayscale data, so we instead use information about the model generation and Moore neighborhoods as we describe here. Each pixel that has a value of one is embedded in R2 at the centroid of the pixel as a vertex (or 0-simplex), as demonstrated in Fig 4. We term the resulting collection of points a point cloud. We connect two points by an edge (or 1-simplex) if they are within each others’ Moore neighborhood (the Moore neighborhood for one pixel is shaded in Fig 4B). If three points are pairwise connected by an edge, then we connect them with a filled-in triangle (or 2-simplex). The union of 0-, 1-, and 2-simplices form a simplicial complex, as shown in Fig 4C.

Fig 4. Converting a binary image into a simplicial complex.

Fig 4

A) Schematic binary image. B) Point cloud for all nonzero pixels in the binary image. The Moore neighborhood of one nonzero pixel is highlighted in salmon. C) A simplicial complex constructed from the point cloud in panel B.

To compute topological invariants, such as connected components (dimension 0) and loops (dimension 1), we use homology. To obtain homology from a simplicial complex, K, we construct vector spaces whose bases are the 0-simplices, 1-simplices, and 2-simplices, respectively, of K. There is a linear map between 2-simplices and 1-simplices, called the boundary map ∂2, which sends triangles to their edges. Similarly, the boundary map ∂1 sends edges to their end points and ∂0 sends points to 0. The action of the boundary map ∂1 on the simplices is stored in a binary matrix where the entry ai,j indicates whether the i-th 0-simplex forms part of the boundary of the j-th 1-simplex. If so, then ai,j = 1; otherwise, ai,j = 0. One can compute the kernel Ker(⋅) and image Im(⋅) of the boundary maps to obtain the vector spaces H0(K) = Ker(∂0)/Im(∂1) and H1(K) = Ker(∂1)/Im(∂2). These vector spaces are also referred to as homology groups and their dimensions define topological invariants called the Betti numbers of K, β0 and β1, which quantify the numbers of connected components and loops, respectively.

Filtrations for simulated vasculature

There are different ways to study vascular data at multiple scales [4446, 55]. We encode the multiple spatial scales of the data using a filtration, which is a sequence of embedded simplicial complexes K0K1 ⊆ … ⊆ Kend built from the data. A challenge for researchers is to determine informative filtrations for specific applications [56], such as blood vessel development in our study. Here, the input data is the binary image from the final timepoint of the Anderson-Chaplain model, which we call N (See Eq (13) in S1 Appendix). We construct sequences of binary images that correspond to different filtered simplicial complexes: a sweeping plane filtration [44] and a flooding filtration [55]. We remark that both filtrations can be considered sublevelset filtrations corresponding to a height function h:XR on some simplicial complex X (or just on the vertices/pixels and considering the simplicial subcomplex spanned by them).

Sweeping plane filtration. In the sweeping plane filtration, we move a line across the binary image N and include pixels in the filtration at discrete steps once a pixel is crossed by the line. On every filtration step, we move the line by a fixed number of pixels in a chosen direction. In Fig 5, we illustrate the method when a vertical line moves from left to right (LTR). From the corresponding sequence of binary images, we construct a filtered simplicial complex KLTR = {K1, K2, …, Kend}. Similarly, we construct the filtered simplicial complexes KRTL when the vertical line moves from right to left (RTL), KTTB when a horizontal line moves from top to bottom (TTB), and KBTT when a horizontal line moves from bottom to top (BTT).

Fig 5. The LTR sweeping plane filtration for a binary image.

Fig 5

A) Sample binary image from a simulation of the Anderson-Chaplain model. B-E) Point clouds used for each iteration of the sweeping plane approach. On each step, only pixels located to the left of the plane (denoted here with a blue line) are included; gray pixels are ignored. F-I) Filtration steps constructed from the point clouds in panels B-E.

Flooding filtration. Our second filtration is the flooding filtration. From the binary image, N, associated with a model simulation, we construct a sequence of binary images in the following way. On the first step, we input the output binary image from a model simulation. In the binary image, pixels with value one denote the presence of an endothelial cell and pixels with value zero denote the absence of endothelial cells. To create the second binary image, we consider all pixels of value one and assign all pixels in their Moore neighborhood (as shown in Fig 4B) to value one. We repeat this process until a binary image with only pixels of value one is created. In Fig 6B, for example, the red pixels denote the nonzero pixels from the initial binary image, the orange pixels denote the nonzero pixels added on the first round of flooding, and the yellow pixels denote the nonzero pixels added on the second round of flooding. Each “round” is called a step. From the corresponding sequence of binary images, we construct a filtered simplicial complex Kflood = {K1, K2, …, Kend}. We choose Kend = 25 so that the Betti curves for all simulations attain β0 = 1 and β1 = 0 at the final filtration step, thus ensuring that the maximum amount of topological information is encoded in the Betti curves.

Fig 6. The flooding filtration for a binary image.

Fig 6

A) Schematic binary image. B) Grayscale image of the original binary image. Red pixels are nonzero in the initial image, orange pixels become nonzero after one round of flooding, and yellow pixels become nonzero after two rounds of flooding. C-E) We performed three filtrations by dilating the image twice. During each filtration, the eight neighboring pixels of all nonzero pixels (ie pixels not marked as white) become nonzero. Red pixels are nonzero in the original image, orange pixels become nonzero on the second filtration (i.e., after one step of flooding), and yellow pixels become nonzero on the third filtration (i.e., after two steps of flooding). F-H) Simplicial complexes associated with the filtrations presented in panels C-E.

Persistent homology (PH)

PH is an algorithm that takes in data via a filtration and outputs a quantification of topological features such as connected components (dimension 0) and loops (dimension 1) across the filtration. The simplicial complexes are indexed by the scale parameter of the filtration. In our case, the scale parameter corresponds to the spatial location in the sweeping plane filtration or steps of flooding in the flooding filtration. Note that we compute these filtrations on simulated blood vessel network images at the final time step of a model simulation. The inclusion of a simplicial complex KiKj for ij gives a relationship between the corresponding homology groups Hp(Ki) and Hp(Kj) for p = 0, 1. This relationship enables us to track topological features such as loops along the simplicial complexes in the filtration. Intuitively, a topological feature is born in filtration step b when it is first computed as part of the homology group Hp(Kb) and dies in filtration step d when that feature no longer exists in Hp(Kd), i.e., when a connected component merges with another component or when a loop is covered by 2-simplices. The output from PH is a multiset of intervals [b, d) which quantifies the persistence of topological features. Each topological feature is said to persist for the scale db in the filtration. Our topological descriptors are the connected components (i.e., 0-dimensional) and loops (i.e., 1-dimensional features) obtained by computing persistent homology of the sweeping plane and flooding filtrations. We use the superscript notation Kv to denote the filtered simplicial complex K in direction v = {LTR, RTL, TTB, BTT, flood}.

Topological descriptor vectors from PH

We use PH to generate different output vectors.

  • Betti curves. Betti curves [57] show the Betti numbers on each filtration step. Let βi(Kv) denote the Betti curves of filtered simplicial complex K in direction v for dimensions i = {0, 1}.

  • Persistence images. A persistence image [52] uses as input the birth-death pairs given by PH and converts the set of (birth b, persistence (db)) coordinates into a vector, a format which is suitable for machine learning and other classification tasks. Following the standard definition of a persistence image, the coordinates (b, db) are blurred by a Gaussian, with standard deviation σ, that is centered about each birth-persistence point, which accounts for uncertainty [52]. We take a weighted sum of the Gaussians and place a grid on this surface to create a vector. We compute two alternative weighting strategies: all ones (in which case all features are equally weighted) and a persistence weight of (db) (in which more persistent features are given larger weights). Let PIOi(Kv) and PIRi(Kv) denote the corresponding persistence images computed from Kv with equal and ramped weighting, respectively, for dimension i = {0, 1}.

The topological descriptor vectors that we compute for each model simulation are βi(Kv), PIOi(Kv), and PIRi(Kv), where i = {0, 1} and v = {LTR, RTL, TTB, BTT, flood}. In total each simulation has 3 × 2 × 5 = 30 topological descriptor vectors.

Simulation clustering

We use an unsupervised algorithm, k-means clustering, to group the quantitative descriptor vectors of the synthetic data generated from simulations of the Anderson-Chaplain model. To cluster n samples x1,x2,,xnRd into k clusters, the k-means algorithm finds k points in Rd, clusters each sample according to which of the k means it is closest to, and then minimizes the within-cluster residual sum of squares distance [58]. We cluster the standard descriptor vectors and the topological descriptor vectors (see steps 2A and 2B of the methods pipeline in Fig 2), respectively, to investigate whether descriptor vectors lead to biologically interpretable clusters.

We test the robustness of each clustering assignment by randomly splitting the descriptor vectors into training and testing sets, and use the testing set to evaluate out-of-sample (OOS) accuracy. Specifically, 7 of the 10 descriptor vectors from each of the 121 (ρ, χ) parameter combinations are randomly chosen with uniform sampling and placed into a training set (847 simulations); the remaining 3 descriptor vectors from each (ρ, χ) combination are placed into a testing set (363 simulations). After performing unsupervised clustering with a k-means model on the training set, each (ρ, χ) combination is labeled by the majority of cluster assignments among its 7 training simulation descriptor vectors. To evaluate each clustering model’s OOS accuracy, the 3 test data samples for each (ρ, χ) parameter combination are given a “ground-truth” label equal to the cluster assignment for the 7 training simulations for that same (ρ, χ) parameter combination. A “predicted” label is calculated for each descriptor vector from the test data using the trained k-means model, and OOS accuracy is defined as the proportion of test simulations for which the “predicted” label matches the “ground-truth” label. For example, if the seven training simulations from the parameter combination (ρ, χ) = (0.2, 0.2) are placed into clusters {1, 1, 2, 2, 1, 1, 1}, then the three testing simulations from (ρ, χ) = (0.2, 0.2) will be labeled as being in cluster 1. If the test simulations are predicted to be from clusters {1, 2, 1}, then we compute an OOS accuracy of 66.67%. We use all 363 testing simulations when computing OOS accuracy and use the KMeans command from scikit-learn (version 0.22.1), which uses Lloyd’s algorithm for Euclidean space, for training and prediction.

Results

We simulated the Anderson-Chaplain model for 121 different values of the chemotaxis and haptotaxis coefficients (ρ, χ) and performed 10 model realizations for each parameter pair. We observed that simulations generated by different parameter pairs may appear visually distinct in their vessel growth patterns. To quantify and summarize the morphologies, we computed dynamic standard descriptor vectors as well as topological descriptor vectors for the set of 1,210 model simulations. We then performed unsupervised clustering using either standard or topological descriptor vectors to partition the (ρ, χ) parameter space.

As we detail here, we found that the topological sweeping plane descriptor vectors offered a multi-scale analysis that led to biologically interpretable and robust clusterings of the (ρ, χ) parameter space, whereas the clusterings of the standard descriptor vectors were not biologically interpretable. The topological analysis succeeded in discriminating fine-grained vascular structures and the parameter pairs that generated them.

Haptotaxis and chemotaxis alter vessel morphology

We investigated the influence of haptotaxis and chemotaxis by varying the parameters ρ and χ We note that similar parameter sensitivity analyses have been performed previously [17, 22]; we reproduce this analysis as the basis for our data analysis. Recall that in model simulations, the vessels emerge from the left-most boundary and are recruited towards a tumor located on the right boundary. When chemotaxis alone drives endothelial tip cell movement (ρ = 0.00, χ = 0.50), the blood vessel segments are long and thin as the tip cells migrate directly towards the tumor and produce individual vessel segments with few anastomosis events (Fig 7A). When haptotaxis drives tip cell movement (ρ = 0.50, χ = 0.00), the vessels fail to reach the tumor (Fig 7B). When both haptotaxis and chemotaxis are active (ρ = χ = 0.15), the network spans a large portion of the spatial domain and is highly connected (Fig 7C).

Fig 7. Typical realizations of the Anderson-Chaplain model.

Fig 7

A) Chemotaxis-driven tip cell movement (ρ = 0.00, χ = 0.50), B) haptotaxis-driven tip cell movemement (ρ = 0.50, χ = 0.00), and C) tip cell movement driven by chemotaxis and haptotaxis (ρ = 0.15, χ = 0.15).

Standard and topological descriptors measure distinct aspects of vascular architectures

Rather than performing visual inspection of all model simulations, we computed and compared quantitative vessel descriptor vectors using data science approaches as described in the Materials and Methods. We demonstrate here how the biological and topological descriptors are distinct from each other. In Fig 8, we observe how the number of tip cells (NT), vessel segments (NS), connected components (β0), and loops (β1) associated with an evolving vasculature change following different events. After each anastomosis event, NT decreases by one (Fig 8B). After a branching event, both NT and NS increase by one (Fig 8C). We conclude that NT is unchanged and NS increases by one after one branching event and one anastomosis event have occurred. By contrast, the values of the topological descriptors before and after these events are not necessarily identical, as depicted in Fig 8D and 8E. When an anastomosis event involves vessel segments that were previously connected, the number of loops, β1, increases by 1 (compare Fig 8A and 8D); when the vessel segments were not previously connected, β0 decreases by 1 (compare Fig 8A–8E).

Fig 8. Standard and topological descriptors describe distinct aspects of vessel morphology.

Fig 8

For clarity, each vessel segment is colored differently, and black dots represent active tip cells. A) The initial vessel configuration. B) A schematic showing how the vascular architecture changes after an anastomosis event. C) A schematic showing how the vascular architecture changes after a branching event. D-E) Schematics showing two ways in which vasculature can change after one anastomosis event and one branching event. F) A schematic showing how the vessels can change after one branching and two anastomosis events. (Key: β0, number of connected components; β1, number of loops; NT, number of tip cells; NS, number of vessel segments).

Let NA(r, s) and NB(r, s) denote the numbers of anastomosis and branching events, respectively, that occur between times t = r and t = s. We computed NT(t), NS(t) vectors of the standard descriptors of evolving blood vessel simulations, as follows:

NT(t1)=NS(t1)=9,NT(ti)=NT(ti-1)+NB(ti-1,ti)-NA(ti-1,ti),i=2,...,50,NS(ti)=NS(ti)+NB(ti-1,ti),i=2,...,50.

The values of NT(t1) and NS(t1) were determined from the models’ initial conditions.

We first constructed the filtrations Kv where v = {LTR, RTL, TTB, BTT, flood} of each simulation’s final binary image. Then we computed connected components and loops, which we call topological descriptors of Kv, using PH. The topological descriptor vectors, Betti curves and persistence images, were then constructed for these five filtrations.

Quantification of blood vessel architecture data

Fig 9 shows the standard descriptor vectors for the three simulations presented in Fig 7. For the haptotaxis-driven simulation growth in the horizontal direction halts (the total length remains at 0.095 units) and the numbers of tip cells and vessel segments stay constant (at or near the initial condition of 9 vessels). By contrast, the number of tip cells and vessel segments increase over time for the chemotaxis-driven simulation, with a more marked increase when both chemotaxis and haptotaxis are active. For both chemotaxis-active simulations, the vessels extend to the maximum horizontal distance.

Fig 9. The standard descriptors used to summarize model simulations.

Fig 9

We show how the following descriptors change over 50 time steps for the three simulations presented in Fig 7: A) the number of tip cells, NT(t), B) the number of vessel segments, NS(t), and C) the final vessel length, L(tend).

We next illustrate how we use the plane sweeping and flooding filtrations to analyze the chemotaxis-driven, haptotaxis-driven, and chemotaxis and haptotaxis-driven simulations (from Fig 7). We computed Betti curves, β0(Kv) and β1(Kv), of the simulated data with sweeping plane filtrations for v = {LTR, RTL, TTB, BTT} (see Fig 10) and a flooding filtration v = {flood} (see Fig 11). These filtrations are constructed to provide complementary but not directly comparable information; roughly, the sweeping plane gives an indication of the network in the (x-y) plane whereas the flooding filtration also focusses on vessel density (see earlier description of filtration construction). We could interpret the PH of the sweeping plane filtration direction as follows: the blood vessels grow primarily from left to right, so the LTR and RTL filtrations primarily identify dynamic changes in branching and anastomosis of vessels. The BTT and TTB filtrations are similar to each other, and reflect only horizontal changes, which could be interpreted as measuring bendiness or tortuosity. Therefore, we only discuss the BTT results in this section.

Fig 10. TDA sweeping plane descriptor vectors.

Fig 10

A) The sweeping plane directions are left-to-right (LTR, gray); right-to-left (RTL, orange); bottom-to-top (BTT, cyan); and top-to-bottom (TTB, green). For these four v = iTj filtrations, we started with points on the ith boundary and included more points as we stepped towards the jth boundary. Repeating this process produced the spatial filtration Kv = K0, K1Kend for v = {LTR, RTL, TTB, BTT}. B-D) We formed the Betti curves β0(Kv) and β1(Kv) by computing the Betti numbers along each step of Kv for the three blood vessel simulations presented in Fig 7; B) the chemotaxis-driven simulation, C) the haptotaxis-driven simulation and D) chemotaxis and haptotaxis simulation.

Fig 11. The flooding filtration.

Fig 11

A,C,E) Illustrate the flooding filtrations for the three blood vessel simulations from Fig 7. Dark red pixels were points included in the first steps of the filtration and yellow pixels were the points included in later filtration steps. White pixels were not included after 24 steps of the flooding filtration. B,D,F) The Betti Curves (β0(Kflood) and β1(Kflood)) for the chemotaxis-driven, haptotaxis-driven, and chemotaxis and haptotaxis simulations respectively.

For the chemotaxis-driven simulation (Figs 10B, 11A and 11B), the β0(KLTR) descriptor vector slightly decreases with distance x from the y-axis as vessel segments anastomose together. The β0(KRTL) and β1(KRTL) descriptor vectors decrease with x at the end of each vessel segment. The magnitude of β0(KBTT) increases almost linearly with the y-spatial coordinate, and β1(KBTT) increases with y until it plateaus for y ≥ 0.6. The β0(Kflood) descriptor vector decreases with each filtration step, and β1(Kflood) begins with β1 = 14 and achieves a local maximum of β1 = 10 after seven filtration steps before reaching β1 = 0 after 11 filtration steps.

For the haptotaxis-driven simulation (Figs 10C, 11C and 11D), all LTR and RTL Betti curves become zero for x > 0.1 because each vessel segment has terminated by x = 0.1. The β0(KBTT) and β1(KBTT) descriptor vectors increase steadily with y as the horizontal plane moves past each individual vessel segment. The flooding descriptor vectors initially decrease with each flooding step and remain fixed at β0 = 1, β1 = 0 after six steps of flooding.

For the chemotaxis and haptotaxis-driven simulation (Figs 10D, 11E and 11F), β0(KLTR) decreases with x as vessel segments anastomose together, while β1(KLTR) increases with x due to the formation of many loops. The β0(KRTL) descriptor vector increases with x until it reaches its maximum value of β0 = 11 at x = 0.85 and then decreases to β0 = 0. The β1(KRTL) descriptor vector decreases with x. The β0(KBTT) descriptor vector periodically increases and decreases with y as the horizontal plane moves past regions of high and low connectivity. The β1(KBTT) descriptor vector steadily increases with the y coordinate. The β0(Kflood) descriptor vector is equal to one for all steps of flooding, and β1(Kflood) decreases with each round of flooding and reaches β1 = 0 after 20 flooding steps.

Sweeping plane descriptor vectors cluster simulations by parameter values

We applied k-means clustering separately to the standard descriptor vectors and the topological descriptor vectors. None of the clusterings of the standard descriptor vectors were biologically interpretable in (ρ, χ) parameter space (see Fig 12 and S1 Fig). While the topological descriptor vectors from the sweeping plane filtration gave more biologically interpretable groupings for combined haptotaxis and chemotaxis parameter values, they were not robust for haptotaxis dominated or chemotaxis dominated (ρ, χ) parameter ranges (see S2 Fig). The flooding filtration did not give biologically interpretable groups for any parameter ranges (see Clustering results in S1 Appendix for all details). We suggest that the LTR and RTL plane sweeping filtrations produce the most interpretable clusterings because sweeping the plane horizontally resembles the way in which endothelial tip cells migrate from left to right in the simulated data.

Fig 12. Partitioning (ρ, χ) parameter space into distinct regions using standard biological descriptor vectors.

Fig 12

We applied k-means clustering, with k = 5, to standard biological descriptor vectors. The resulting partitions are presented for A) NT(t), B) NS(t), C) L(tend), and D) concatenation of all three descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster.

No single descriptor vector was able to robustly cluster simulations into biologically interpretable groups. However, double descriptor vectors, created by concatenating two sweeping plane topological descriptor vectors, produced clusterings in (ρ, χ) parameter space that were biologically interpretable and robust. As an example of a double descriptor vector, “PIO0(KLTR) & PIO1(KLTR)” denotes the topological descriptor vector created by concatenating the PIO0(KLTR) and PIO1(KLTR) descriptor vectors. We note that doubles of standard descriptor vectors or topological flooding descriptor vectors were unable to produce biologically interpretable and robust clusterings in (ρ, χ) parameter space (see Clustering results in S1 Appendix).

We computed a total of 2 × 4 × 3 = 24 individual sweeping plane descriptor vectors for each model simulation: each descriptor vector counted connected components or loops (β0, β1), swept a plane across the domain in one of four directions (LTR, RTL, TTB, and BTT), and used one of the three PH summaries (BC, PIO, PIR). We computed all double sweeping plane descriptor vectors and clustered all 276 combinations in (ρ, χ) parameter space. The 15 double descriptor vectors with the highest OOS accuracies are presented in Table 1; the four with the highest OOS accuracies are listed below:

Table 1. Summary of the 15 sweeping plane double descriptor vectors that achieved the highest OOS accuracy scores.

Feature In Sample Accuracy Out of Sample Accuracy
PIO0(KLTR) & PIO0(KRTL) 94.0% 93.1%
PIO0(KRTL) & PIR1(KLTR) 94.2% 92.6%
PIR0(KTTB) & PIR1(KLTR) 92.3% 91.5%
PIR0(KBTT) & PIR1(KLTR) 92.3% 91.5%
PIR0(KLTR) & PIR1(KLTR) 92.3% 90.9%
PIO0(KTTB) & PIR1(KLTR) 94.2% 90.4%
PIO0(KBTT) & PIR1(KLTR) 94.3% 90.1%
PIO0(KLTR) & PIR1(KLTR) 92.6% 90.1%
PIR0(KRTL) & PIR1(KLTR) 92.6% 89.0%
PIR0(KLTR) & PIR0(KRTL) 90.2% 89.0%
PIO0(KLTR) & PIR0(KRTL) 91.9% 88.4%
PIO0(KRTL) & PIO1(KLTR) 86.1% 86.0%
PIO0(KBTT) & PIO1(KLTR) 84.2% 85.1%
PIO1(KLTR) & PIR1(KBTT) 85.6% 84.8%
PIO1(KLTR) & PIR1(KTTB) 83.1% 84.8%
PIO0(KTTB) & PIO1(KLTR) 84.9% 84.8%
PIO0(KBTT) & PIO1(KRTL) 85.6% 84.8%
  1. PIO0(KLTR) & PIO0(KRTL) (93.1% OOS accuracy),

  2. PIO0(KRTL) & PIR1(KLTR) (92.6% OOS accuracy),

  3. PIR0(KTTB) & PIR1(KLTR) (91.5% OOS accuracy),

  4. PIR0(KBTT) & PIR1(KLTR) (91.5% OOS accuracy).

In Fig 13 and S3 Fig, we show how the data clustered in (ρ, χ) parameter space when the top four double sweeping plane descriptor vectors were used. The “PIO0(KLTR) & PIO0(KRTL)” and “PIO0(KRTL) & PIR1(KLTR)” double descriptor vectors produced five connected clustering regions and, thus, attained high OOS accuracy and biologically interpretable clusterings. In contrast, the clusterings generated by the following double descriptor vectors (“PIR0(KTTB) & PIR1(KLTR)” and “PIR0(KBTT) & PIR1(KLTR)”) were unable to distinguish between simulations with strong chemotaxis and strong haptotaxis. We conclude that double descriptor vectors comprising LTR and RTL descriptor vectors robustly cluster simulations into biologically interpretable clusterings because the associated filtrations capture the ways in which vessel segments anastomose and branch, respectively. We performed principal components analysis to reduce the “PIO0(KLTR) & PIO0(KRTL)” double descriptor vector to two principal components. The results presented in S4 Fig show that the simulations from each group are also clustered in this two-dimensional space.

Fig 13. Partitioning (ρ, χ) parameter space into distinct regions using sweeping plane double descriptor vectors.

Fig 13

We applied k-means clustering, with k = 5, to sweeping plane double descriptor vectors. The resulting partitions are presented for the four descriptor vectors that have the highest OOS accuracy: A) PIO0(KLTR) & PIO0(KRTL), B) PIO0(KRTL) & PIR1(KLTR), C) PIO0(KTTB) & PIR1(KLTR), and D) PIO0(KBTT) & PIR1(KLTR). The five clusters are ordered according to the mean χ value within the cluster.

We applied k−means clustering to all sweeping plane double descriptor vectors and computed the average OOS accuracy for a range of values of k, also known as an elbow plot (S5 Fig). While the average OOS accuracy decreased with increasing values of k, the rate of decrease slowed markedly for k > 5. We fixed k = 5 as this value robustly clustered (ρ, χ) parameter space into biologically interpretable clusters (Fig 13). Larger values of k may overfit the data [59], while smaller values fail to partition (ρ, χ) parameter space into biologically interpretable regions (S6 Fig). For example, when k = 3, simulations with strong haptotaxis (ρ > χ), equal chemotaxis and haptotaxis (ρ = χ), and strong chemotaxis (ρ < χ) are clustered together. When k = 4, there are two distinct clusters near ρ = χ, but simulations with large values of the haptotaxis parameter, ρ, are clustered with those with high values of the chemotaxis parameter, χ. With k = 5 clusters, the region with large values of the haptotaxis parameter, ρ, was further subdivided. We suggest that clustering with k = 5 provides sufficient resolution to distinguish model simulations generated from different regions of parameter space. We note that more sophisticated descriptors could be used to justify this choice of k, including the silhouette score [60].

We identified five model simulations whose double descriptor vectors closely resembled the means from the clusters associated with the “PIO0(KLTR) & PIO0(KRTL)” clustering. Fig 14 presents representative simulations of each cluster. We note that as ρ and χ vary, the spatial extent and connectedness of the networks vary as expected (i.e., greater spatial extent in the x-direction as χ increases, and greater connectedness as ρ increases). We applied a similar analysis to model simulations sampled from (ρ, χ, ψ) parameter space, where the non-negative parameter ψ represents the rate at which tip cells branch. The resulting parameter clustering changed only with ρ and χ, suggesting that the vascular morphology does not depend on ψ (S7 Fig).

Fig 14. Representative simulated blood vessel networks from Groups 1–5.

Fig 14

Series of simulations that resemble the means from each of the five groups (as identified from the double descriptor vectors associated with the sweeping plane filtrations PIO0(KLTR) & PIO0(KRTL)). The representative simulations for groups 1–5 were generated using the following pairs of parameter values: (ρ, χ) = (0.25, 0), (0.5, 0.2), (0.3, 0.2), (0.15, 0.15), and (0.2, 0.4), respectively.

Conclusions and discussion

We have introduced a topological pipeline to analyze simulated data of tumor-induced angiogenesis [17]. We investigated different filtrations to feed into the persistent homology algorithm. We demonstrated that simple data science algorithms can group together topologically similar simulated data. Pairs of topological descriptor vectors generated from sweeping plane filtrations robustly clustered simulations into five biologically interpretable groups. Simulations within each group were generated from similar values of the model parameters ρ and χ, which determine the sensitivity of the tip cells to haptotaxis and chemotaxis.

The Betti numbers quantify connectedness of the vasculature across spatial scales. We computed PH of four sweeping plane filtrations (LTR, RTL, TTB, BTT), which quantify topological features as a plane moves across the simulated data in a particular direction. We found that the filtration direction provides information about different aspects of the vascular network. For the LTR filtration, the number of connected components (i.e., β0) decreases when an anastomosis event occurs, whereas for the RTL filtration, β0 decreases when a branching event occurs. The TTB and BTT filtrations summarize vascular tortuosity: when tip cell movement in the y-spatial direction is negligible and/or tip cells do not merge with adjacent sprouts, β0 increases with the number of tip cells/ number of sprouts. We found that the combined PIO0(KLTR) & PIO0(KRTL) sweeping plane descriptor vectors robustly cluster haptotaxis-chemotaxis parameter space into five distinct regions that appear to be connected.

Our results may depend on the model setup, in which new blood vessels are created as endothelial tip cells move from left to right. In the future, we may explore the use of extended persistence [61] and/or the PH transform [62], which allow consideration of multiple different directions in the topological analysis. If the vessel network grows radially, for example, then radial filtrations pinned at the network center may be more informative than the LTR filtration [45, 46]. Classically in PH, topological features of short persistence are topological noise, yet there is growing evidence that these short features can provide information about geometry and curvature [43, 44, 46, 47, 6366]. Our results using persistence images with unit weightings, rather than ramped weightings, also suggest that short persistence features may be informative for distinguishing vessel morphologies. The spatial resolution of the output binary images that we used for our analyses was the same as that used to generate the simulations, enabling us to distinguish the presence and absence of individual endothelial cells. In future work, we will consider finer and coarser binary images to investigate how the performance of the methods depends on their spatial resolution.

Many models of angiogenesis have been developed and implemented since the Anderson-Chaplain model, as has been extensively reviewed previously [2124, 6771]. We plan to use the methods described in this work to interpret more sophisticated models of angiogenesis and, in the longer term, to determine how vessel morphology affects tumor survival and to predict the efficacy of vascular targeting agents. The tumor is not explicitly modeled in the Anderson-Chaplain model, but many studies couple angiogenesis and tumor growth [25, 27, 29, 72, 73]. More complex models have been developed that account for mechanical factors such as mechanotaxis, pressure-driven convective transport of extracellular factors, mechanical stimulation of endothelial cell proliferation, and subcellular signalling pathways that guide cell fate specification of tip and stalk cells [26, 27, 29, 74]. For example, Vavourakis et al. developed a 3D model of angiogenic tumor growth that incorporates the effects of mechanotaxis on vessel sprouting and mechano-sensitive vascular remodelling [29]. Importantly, these more complex models have been shown to recapitulate experimental observations of time-varying spatial distributions of angiogenic vasculature [29] and features such as asymmetric neovasculature [74] and cell rearrangements and phenotype switching of endothelial stalk and tip cells [75, 76]. In future work, we will apply the TDA framework developed here to these, and other angiogenesis models. Such analyses will enable us to investigate the effect that processes such as mechanotaxis and subcellular signalling have on the topology of blood vessel networks.

Miller at al. used computational models to analyze tumor regression in response to drug treatment as a function of vascular morphology and heterogeneity [77]. Our future analysis of such state-of-the-art models will require us to extend the topological methods presented here for spatio-temporal data that describes how multiple cell types (e.g., tumor cells, immune cells and endothelial cells) interact in 3D. In a preliminary study, we applied these methods to 3D data from in vivo studies of tumor vessel networks, which is near the edge of current computational feasibility [46]. Further work in this direction requires significant computational advancements to scale the TDA pipeline to multiple, more complex models and to analyze the resulting parameter landscapes.

The topological approaches considered in this work represent a multiscale way to summarize the properties of in silico vasculature, and could be extended to experimental imaging data. Previous studies have quantified experimental data of tumor-induced angiogenesis using standard descriptors that include sprout length and diameter, number of tumor cells, concentration of angiogenic chemicals [30, 78, 79]. To compare models and data will require descriptors, such as the topological ones proposed here, that can discriminate between different conditions (e.g., parameter values). The challenges of making direct comparisons between experimental and synthetic data is a significant bottleneck in the validation of mathematical and computational models of angiogenesis. In future work we will apply the topological approaches outlined in this paper to compare different model simulations with experimental data [80, 81]. In this way, we aim to determine biologically realistic parameter values for control conditions [36], and to identify parameter values that are associated with response to treatments, including vessel normalization.

Supporting information

S1 Fig. Standard descriptor vector clustering.

Clustering of the (ρ, χ) parameter space using k-means with k = 5 on the standard descriptor vectors to summarize each simulated vasculature. We considered A) Number of tips over time, C) Number of vessel segments over time, E) Final length of the simulated vasculature, and G) All of the previous descriptor vectors concatenated into one descriptor vector. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

(TIF)

S2 Fig. Individual plane sweeping descriptor vector clustering.

Clustering of the (ρ, χ) parameter space using k-means with k = 5 on individual sweeping plane topological filtrations to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIR1(KLTR), C) PIO0(KLTR), E) PIO1(KRTL), G) and PIO1(KLTR) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

(TIF)

S3 Fig. Double plane sweeping descriptor vector clustering.

Clustering of the (ρ, χ) parameter space using k-means with k = 5 on double vectors of the sweeping plane topological filtration to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIO0(KLTR) & PIO0(KRTL), C) PIO0(KRTL) & PIR1(KLTR), E) PIO0(KTTB) & PIR1(KLTR), and F) PIO0(KBTT) & PIR1(KLTR) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

(TIF)

S4 Fig. Dimensionality reduction.

Dimensionality reduction of the PIO0(KLTR) & PIO0(KRTL) double descriptor vector. We reduced the dimensionality of the PIO0(KLTR) & PIO0(KRTL) double descriptor vector to two dimensions using principal components analysis and plot the reduced-dimension descriptor vector for each simulation. The color of each dot denotes the predicted grouping of each simulation from the k-means algorithm.

(TIF)

S5 Fig. Elbow curve.

Elbow curve for the OOS Accuracy for the k-means clustering and labeling algorithm over different values of k. We considered all 276 doubles of topological feature vectors and plot the mean OOS accuracy percentage plus or minus two standard deviations for multiple values of k. We propose that the elbow occurs at k = 5.

(TIF)

S6 Fig. Multiple k-means clusterings.

Clustering of the (ρ, χ) parameter space using the PIO0(KLTR) & PIO0(KRTL) double of topological descriptor vectors using the k-means algorithm with A) k = 3, C) k = 4, E) k = 5. The clusters are ordered according to the mean χ value within the cluster. Panels B,D,F) depict the out of sample confusion matrices for each descriptor vectory.

(TIF)

S7 Fig. (ρ, χ, ψ) clustering.

Clustering based on the (ρ, χ) grouping and the time to branch, ψ. We created a data set of blood vessel simulations by considering (ρ, χ) values from groups 1–5 from Fig 13A and letting ψ vary over the 10 values {.1, 0.2, 0.3, …, 1.0}. We simulated the Anderson-Chaplain model ten times at each (ρ, χ, ψ) values to create 500 total simulations. We performed our clustering methodology on this collection of images based off the PIO0(KLTR) & PIO0(KRTL) descriptor vector.

(TIF)

S8 Fig. Initial TAF and fibronectin profiles.

TAF concentration (left) and fibronectin concentration (right).

(TIF)

S9 Fig. Individual flooding descriptor vector clustering.

Clustering of the (ρ, χ) parameter space using k-means with k = 5 on individual flooding topological descriptor vectors to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIR1(Kflood), C) β1(Kflood), E) PIO1(Kflood), G) and β0(Kflood) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

(TIF)

S10 Fig. Double flooding descriptor vector clustering.

Clustering of the (ρ, χ) parameter space using k-means with k = 5 on double flooding topological descriptor vectors to summarize each simulation. The four highest OOS accuracies resulted from the A) PIO0(Kflood)&PIR1(Kflood), C) PIR0(Kflood)&PIR1(Kflood), E) β0(Kflood)&β1(Kflood), G) and PIO0(Kflood)&β1(Kflood) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

(TIF)

S1 Table. Anderson-Chaplain model parameters.

List of mechanistic parameters for the Chaplain Anderson ABM from [17].

(PDF)

S2 Table. Anderson-Chaplain model nondimensionalized parameters.

Baseline nondimensionalized mechanistic parameters for the Chaplain-Anderson Model from [17].

(PDF)

S3 Table. Individual plane sweeping clustering.

Out of Sample Accuracy scores for individual feature vectors from the flooding filtration using k-means classification with k = 5.

(PDF)

S4 Table. Double flooding clustering.

Out of Sample Accuracy scores for doubles of feature vectors from the flooding filtration using k-means classification with k = 5.

(PDF)

S5 Table. Individual plane sweeping clustering.

Out of Sample Accuracy scores for individual feature vectors from the sweeping plane filtration using k-means classification with k = 5.

(PDF)

S1 Appendix. Modeling and clustering appendix.

The file contains a more detailed description of implementation of the Anderson-Chaplain Model and clustering results.

(PDF)

Data Availability

All data is available at https://github.com/johnnardini/Angio_TDA.

Funding Statement

HAH, BJS and HMB are grateful for the support provided by the UK Centre for Topological Data Analysis EPSRC grant EP/R018472/1. HAH gratefully acknowledges funding from EPSRC EP/R005125/1 and EP/T001968/1. This work was partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute, and in part by National Institute of Aging grant R21AG059099 to KBF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Gupta MK, Qin RY. Mechanism and its regulation of tumor-induced angiogenesis. World Journal of Gastroenterology. 2003;9(6):1144–1155. doi: 10.3748/wjg.v9.i6.1144 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Kushner EJ, Bautch VL. Building blood vessels in development and disease. Current Opinion in Hematology. 2013;20(3):231–236. [DOI] [PubMed] [Google Scholar]
  • 3. Tonnesen MG, Feng X, Clark RAF. Angiogenesis in wound healing. Journal of Investigative Dermatology Symposium Proceedings. 2000;5(1):40–46. doi: 10.1046/j.1087-0024.2000.00014.x [DOI] [PubMed] [Google Scholar]
  • 4. Ferrara N. VEGF and the quest for tumour angiogenesis factors. Nature Reviews Cancer. 2002;2(10):795–803. doi: 10.1038/nrc909 [DOI] [PubMed] [Google Scholar]
  • 5. Folkman J. Tumor angiogenesis. In: Klein G, Weinhouse S, Haddow A, editors. Advances in Cancer Research. vol. 19. Academic Press; 1974. p. 331–358. [DOI] [PubMed] [Google Scholar]
  • 6. Sherwood LM, Parris EE, Folkman J. Tumor angiogenesis: therapeutic implications. New England Journal of Medicine. 1971;285(21):1182–1186. doi: 10.1056/NEJM197111182852108 [DOI] [PubMed] [Google Scholar]
  • 7. Folkman J, Merler E, Abernathy C, Williams G. Isolation of a tumor factor responsible for angiogenesis. The Journal of Experimental Medicine. 1971;133(2):275–288. doi: 10.1084/jem.133.2.275 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Menashi S, Lu H, Soria C, Legrand Y. 2 Endothelial cell proteases: physiological role and regulation. Baillière’s Clinical Haematology. 1993;6(3):559–576. doi: 10.1016/S0950-3536(05)80188-X [DOI] [PubMed] [Google Scholar]
  • 9. Paweletz N, Knierim M. Tumor-related angiogenesis. Critical Reviews in Oncology/Hematology. 1989;9(3):197–242. doi: 10.1016/S1040-8428(89)80002-2 [DOI] [PubMed] [Google Scholar]
  • 10. Terranova VP, DiFlorio R, Lyall RM, Hic S, Friesel R, Maciag T. Human endothelial cells are chemotactic to endothelial cell growth factor and heparin. Journal of Cell Biology. 1985;101(6):2330–2334. doi: 10.1083/jcb.101.6.2330 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Sholley M, Ferguson G, Seibel H, Montour J, Wilson J. Mechanisms of neovascularization. Vascular sprouting can occur without proliferation of endothelial cells. Laboratory Investigation; a Journal of Technical Methods and Pathology. 1984;51(6):624–634. [PubMed] [Google Scholar]
  • 12. Lugano R, Ramachandran M, Dimberg A. Tumor angiogenesis: causes, consequences, challenges and opportunities. Cellular and Molecular Life Sciences. 2020;77(9):1745–1770. doi: 10.1007/s00018-019-03351-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Saman H, Raza SS, Uddin S, Rasul K. Inducing angiogenesis, a key step in cancer vascularization, and treatment approaches. Cancers. 2020;12(5):1172. doi: 10.3390/cancers12051172 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Bentley K, Franco CA, Philippides A, Blanco R, Dierkes M, Gebala V, et al. The role of differential VE-cadherin dynamics in cell rearrangement during angiogenesis. Nature Cell Biology. 2014;16(4):309–321. doi: 10.1038/ncb2926 [DOI] [PubMed] [Google Scholar]
  • 15. Li JL, Harris AL. Notch signaling from tumor cells: a new mechanism of angiogenesis. Cancer Cell. 2005;8(1):1–3. doi: 10.1016/j.ccr.2005.06.013 [DOI] [PubMed] [Google Scholar]
  • 16. Li S, Butler P, Wang Y, Hu Y, Han DC, Usami S, et al. The role of the dynamics of focal adhesion kinase in the mechanotaxis of endothelial cells. Proceedings of the National Academy of Sciences. 2002;99(6):3546–3551. doi: 10.1073/pnas.052018099 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Anderson ARA, Chaplain MAJ. Continuous and discrete mathematical models of tumor-induced angiogenesis. Bulletin of Mathematical Biology. 1998;60(5):857–899. doi: 10.1006/bulm.1998.0042 [DOI] [PubMed] [Google Scholar]
  • 18. Balding D, McElwain DLS. A mathematical model of tumour-induced capillary growth. Journal of Theoretical Biology. 1985;114(1):53–73. doi: 10.1016/S0022-5193(85)80255-1 [DOI] [PubMed] [Google Scholar]
  • 19. Byrne HM, Chaplain MAJ. Growth of nonnecrotic tumors in the presence and absence of inhibitors. Mathematical Biosciences. 1995;130(2):151–181. doi: 10.1016/0025-5564(94)00117-3 [DOI] [PubMed] [Google Scholar]
  • 20. Stokes CL, Lauffenburger DA. Analysis of the roles of microvessel endothelial cell random motility and chemotaxis in angiogenesis. Journal of Theoretical Biology. 1991;152(3):377–403. doi: 10.1016/S0022-5193(05)80201-2 [DOI] [PubMed] [Google Scholar]
  • 21. Byrne HM. Dissecting cancer through mathematics: from the cell to the animal model. Nature Reviews Cancer. 2010;10(3):221–230. doi: 10.1038/nrc2808 [DOI] [PubMed] [Google Scholar]
  • 22. Hadjicharalambous M, Wijeratne PA, Vavourakis V. From tumour perfusion to drug delivery and clinical translation of in silico cancer models. Methods. 2021;185:82–93. doi: 10.1016/j.ymeth.2020.02.010 [DOI] [PubMed] [Google Scholar]
  • 23. Metzcar J, Wang Y, Heiland R, Macklin P. A review of cell-based computational modeling in cancer biology. JCO Clinical Cancer Informatics. 2019. doi: 10.1200/CCI.18.00069 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Scianna M, Bell CG, Preziosi L. A review of mathematical models for the formation of vascular networks. Journal of Theoretical Biology. 2013;333:174–209. doi: 10.1016/j.jtbi.2013.04.037 [DOI] [PubMed] [Google Scholar]
  • 25. Vilanova G, Colominas I, Gomez H. A mathematical model of tumour angiogenesis: growth, regression and regrowth. Journal of The Royal Society Interface. 2017;14(126):20160918. doi: 10.1098/rsif.2016.0918 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Stepanova D, Byrne HM, Maini PK, Alarcón T. A multiscale model of complex endothelial cell dynamics in early angiogenesis. PLoS Computational Biology. 2021;17(1):e1008055. doi: 10.1371/journal.pcbi.1008055 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Perfahl H, Hughes BD, Alarcón T, Maini PK, Lloyd MC, Reuss M, et al. 3D hybrid modelling of vascular network formation. Journal of Theoretical Biology. 2017;414:254–268. doi: 10.1016/j.jtbi.2016.11.013 [DOI] [PubMed] [Google Scholar]
  • 28. Grogan JA, Connor AJ, Markelc B, Muschel RJ, Maini PK, Byrne HM, et al. Microvessel chaste: an open library for spatial modeling of vascularized tissues. Biophysical Journal. 2017;112(9):1767–1772. doi: 10.1016/j.bpj.2017.03.036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Vavourakis V, Wijeratne PA, Shipley R, Loizidou M, Stylianopoulos T, Hawkes DJ. A validated multiscale in-silico model for mechano-sensitive tumour angiogenesis and growth. PLoS Computational Biology. 2017;13(1):e1005259. doi: 10.1371/journal.pcbi.1005259 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Cai H, Liu X, Zheng J, Xue Y, Ma J, Li Z, et al. Long non-coding RNA taurine upregulated 1 enhances tumor-induced angiogenesis through inhibiting microRNA-299 in human glioblastoma. Oncogene. 2017;36(3):318–331. doi: 10.1038/onc.2016.212 [DOI] [PubMed] [Google Scholar]
  • 31. Sefidgar M, Soltani M, Raahemifar K, Sadeghi M, Bazmara H, Bazargan M, et al. Numerical modeling of drug delivery in a dynamic solid tumor microvasculature. Microvascular Research. 2015;99:43–56. doi: 10.1016/j.mvr.2015.02.007 [DOI] [PubMed] [Google Scholar]
  • 32. Folarin AA, Konerding MA, Timonen J, Nagl S, Pedley RB. Three-dimensional analysis of tumour vascular corrosion casts using stereoimaging and micro-computed tomography. Microvascular Research. 2010;80(1):89–98. doi: 10.1016/j.mvr.2010.03.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Kannan P, Kretzschmar WW, Winter H, Warren D, Bates R, Allen PD, et al. Functional parameters derived from magnetic resonance imaging reflect vascular morphology in preclinical tumors and in human liver metastases. Clinical Cancer Research. 2018;24(19):4694–4704. doi: 10.1158/1078-0432.CCR-18-0033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Konerding MA, Malkusch W, Klapthor B, Ackern Cv, Fait E, Hill SA, et al. Evidence for characteristic vascular patterns in solid tumours: quantitative studies using corrosion casts. British Journal of Cancer. 1999;80(5):724–732. doi: 10.1038/sj.bjc.6690416 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Konerding MA, Fait E, Gaumann A. 3D microvascular architecture of pre-cancerous lesions and invasive carcinomas of the colon. British Journal of Cancer. 2001;84(10):1354–1362. doi: 10.1054/bjoc.2001.1809 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Bauer AL, Jackson TL, Jiang Y. A cell-based model exhibiting branching and anastomosis during tumor-induced angiogenesis. Biophysical Journal. 2007;92(9):3105–3121. doi: 10.1529/biophysj.106.101501 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Zhan Y, Li MZ, Yang L, Feng XF, Zhang QX, Zhang N, et al. An MRI study of neurovascular restorative after combination treatment with Xiaoshuan enteric-coated capsule and enriched environment in rats after stroke. Frontiers in Neuroscience. 2019;13. doi: 10.3389/fnins.2019.00701 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Carlsson G. Topology and data. Bulletin of the American Mathematical Society. 2009;46(2):255–308. doi: 10.1090/S0273-0979-09-01249-X [DOI] [Google Scholar]
  • 39. Zomorodian AJ. Topology for computing. Cambridge University Press; 2005. [Google Scholar]
  • 40. Nicolau M, Levine AJ, Carlsson G. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. PNAS. 2011;108(17):201102826. doi: 10.1073/pnas.1102826108 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Nielson JL, Paquette J, Liu AW, Guandique CF, Tovar CA, Inoue T, et al. Topological data analysis for discovery in preclinical spinal cord injury and traumatic brain injury. Nature Communications. 2015;6:8581. doi: 10.1038/ncomms9581 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Qaiser T, Sirinukunwattana K, Nakane K, Tsang YW, Epstein D, Rajpoot N. Persistent homology for fast tumor segmentation in whole slide histology images. Procedia Computer Science. 2016;90:119–124. doi: 10.1016/j.procs.2016.07.033 [DOI] [Google Scholar]
  • 43. Xia K, Wei GW. Persistent homology analysis of protein structure, flexibility, and folding. International Journal for Numerical Methods in Biomedical Engineering. 2014;30(8):814–844. doi: 10.1002/cnm.2655 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Bendich P, Marron JS, Miller E, Pieloch A, Skwerer S. Persistent homology analysis of brain artery trees. Annals of Applied Statistics. 2016;10(1):198–218. doi: 10.1214/15-AOAS886 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Byrne HM, Harrington HA, Muschel R, Reinert G, Stolz BJ, Tillmann U. Topology characterises tumour vasculature. Mathematics Today. 2019;55(5):206–210. [Google Scholar]
  • 46.Stolz BJ, Kaeppler J, Markelc B, Mech F, Lipsmeier F, Muschel RJ, et al. Multiscale topology characterises dynamic tumour vascular networks. arXiv:200808667 [q-bio]. 2020;. [DOI] [PMC free article] [PubMed]
  • 47.Feng M, Porter MA. Persistent homology of geospatial data: a case study with voting. arXiv:190205911 [physics]. 2019;.
  • 48. Topaz CM, Ziegelmeier L, Halverson T. Topological data analysis of biological aggregation models. PLoS ONE. 2015;10(5):e0126383. doi: 10.1371/journal.pone.0126383 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. McGuirl MR, Volkening A, Sandstede B. Topological data analysis of zebrafish patterns. Proceedings of the National Academy of Sciences. 2020;117(10):5113–5124. doi: 10.1073/pnas.1917763117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Bhaskar D, Manhart A, Milzman J, Nardini JT, Storey KM, Topaz CM, et al. Analyzing collective motion with machine learning and topology. Chaos. 2019;29(12):123125. doi: 10.1063/1.5125493 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Otter N, Porter MA, Tillmann U, Grindrod P, Harrington HA. A roadmap for the computation of persistent homology. European Physical Journal—Data Science. 2017;6(17):1–38. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Adams H, Emerson T, Kirby M, Neville R, Peterson C, Shipman P, et al. Persistence images: a stable vector representation of persistent homology. Journal of Machine Learning Research. 2017;18(1):218–252. [Google Scholar]
  • 53. Baish JW, Stylianopoulos T, Lanning RM, Kamoun WS, Fukumura D, Munn LL, et al. Scaling rules for diffusive drug delivery in tumor and normal tissues. Proceedings of the National Academy of Sciences. 2011;108(5):1799–1803. doi: 10.1073/pnas.1018154108 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Edelsbrunner H, Harer JL. Computational topology: an introduction. Providence R. I.: American Mathematical Society; 2010. [Google Scholar]
  • 55. Garside K, Henderson R, Makarenko I, Masoller C. Topological data analysis of high resolution diabetic retinopathy images. PLoS ONE. 2019;14(5):e0217413. doi: 10.1371/journal.pone.0217413 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Stolz-Pretzer B. Global and local persistent homology for the shape and classification of biological data. University of Oxford; 2019. Available from: https://ora.ox.ac.uk/objects/uuid:3352ad74-87b4-415a-87d3-0592315763ac. [Google Scholar]
  • 57. Giusti C, Pastalkova E, Curto C, Itskov V. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences of the United States of America. 2015;112(44):13455–13460. doi: 10.1073/pnas.1506407112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Lloyd S. Least squares quantization in PCM. IEEE Transactions on Information Theory. 1982;28(2):129–137. doi: 10.1109/TIT.1982.1056489 [DOI] [Google Scholar]
  • 59. Bholowalia P, Kumar A. EBK-Means: a clustering technique based on elbow method and K-means in WSN. International Journal of Computer Applications. 2014;105(9):17–24. [Google Scholar]
  • 60. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics. 1987;20:53–65. doi: 10.1016/0377-0427(87)90125-7 [DOI] [Google Scholar]
  • 61. Edelsbrunner H, Harer JL. Persistent homology—a survey. Contemporary Mathematics. 2008;453:257–282. [Google Scholar]
  • 62. Turner K, Mukherjee S, Boyer DM. Persistent homology transform for modeling shapes and surfaces. Information and Inference: A Journal of the IMA. 2014;3(4):310–344. doi: 10.1093/imaiai/iau011 [DOI] [Google Scholar]
  • 63. Marchese A, Maroulas V. Signal classification with a point process distance on the space of persistence diagrams. Advances in Data Analysis and Classification. 2018;12(3):657–682. doi: 10.1007/s11634-017-0294-x [DOI] [Google Scholar]
  • 64. Robins V, Turner K. Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Physica D: Nonlinear Phenomena. 2016;334:99–117. doi: 10.1016/j.physd.2016.03.007 [DOI] [Google Scholar]
  • 65. Stolz BJ, Harrington HA, Porter MA. Persistent homology of time-dependent functional networks constructed from coupled time series. Chaos. 2017;27(4):047410. doi: 10.1063/1.4978997 [DOI] [PubMed] [Google Scholar]
  • 66. Bubenik P, Hull M, Patel D, Whittle B. Persistent homology detects curvature. Inverse Problems. 2020;36(2):025008. doi: 10.1088/1361-6420/ab4ac0 [DOI] [Google Scholar]
  • 67. Hamis S, Powathil GG, Chaplain MAJ. Blackboard to bedside: a mathematical modeling bottom-up approach toward personalized cancer treatments. JCO Clinical Cancer Informatics. 2019;(3):1–11. doi: 10.1200/CCI.18.00068 [DOI] [PubMed] [Google Scholar]
  • 68. Karolak A, Markov DA, McCawley LJ, Rejniak KA. Towards personalized computational oncology: from spatial models of tumour spheroids, to organoids, to tissues. Journal of The Royal Society Interface. 2018;15(138):20170703. doi: 10.1098/rsif.2017.0703 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Mantzaris NV, Webb S, Othmer HG. Mathematical modeling of tumor-induced angiogenesis. Journal of Mathematical Biology. 2004;49(2):111–187. doi: 10.1007/s00285-003-0262-2 [DOI] [PubMed] [Google Scholar]
  • 70. Peirce SM. Computational and mathematical modeling of angiogenesis. Microcirculation. 2008;15(8):739–751. doi: 10.1080/10739680802220331 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Rockne RC, Hawkins-Daarud A, Swanson KR, Sluka JP, Glazier JA, Macklin P, et al. The 2019 mathematical oncology roadmap. Physical Biology. 2019;16(4):041005. doi: 10.1088/1478-3975/ab1a09 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Owen MR, Alarcón T, Maini PK, Byrne HM. Angiogenesis and vascular remodelling in normal and cancerous tissues. Journal of Mathematical Biology. 2008;58(4):689. [DOI] [PubMed] [Google Scholar]
  • 73. Norton KA, Jin K, Popel AS. Modeling triple-negative breast cancer heterogeneity: effects of stromal macrophages, fibroblasts and tumor vasculature. Journal of Theoretical Biology. 2018;452:56–68. doi: 10.1016/j.jtbi.2018.05.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Vilanova G, Burés M, Colominas I, Gomez H. Computational modelling suggests complex interactions between interstitial flow and tumour angiogenesis. Journal of The Royal Society Interface. 2018;15(146):20180415. doi: 10.1098/rsif.2018.0415 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Chen W, Xia P, Wang H, Tu J, Liang X, Zhang X, et al. The endothelial tip-stalk cell selection and shuffling during angiogenesis. Journal of Cell Communication and Signaling. 2019;13(3):291–301. doi: 10.1007/s12079-019-00511-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Jakobsson L, Franco CA, Bentley K, Collins RT, Ponsioen B, Aspalter IM, et al. Endothelial cells dynamically compete for the tip cell position during angiogenic sprouting. Nature Cell Biology. 2010;12(10):943–953. doi: 10.1038/ncb2103 [DOI] [PubMed] [Google Scholar]
  • 77. Ha M, Hb F. Evaluation of drug-loaded gold nanoparticle cytotoxicity as a function of tumor vasculature-induced tissue heterogeneity. Annals of Biomedical Engineering. 2018;47(1):257–271. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Feleszko W, Bałkowiec EZ, Sieberth E, Marczak M, Dabrowska A, Giermasz A, et al. Lovastatin and tumor necrosis factor- exhibit potentiated antitumor effects against Ha-ras-transformed murine tumor via inhibition of tumor-induced angiogenesis. International Journal of Cancer. 1999;81(4):560–567. doi: [DOI] [PubMed] [Google Scholar]
  • 79. Sailem HZ, Zen AAH. Morphological landscape of endothelial cell networks reveals a functional role of glutamate receptors in angiogenesis. Scientific reports. 2020;10(1):1–14. doi: 10.1038/s41598-020-70440-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Gimbrone MA, Cotran RS, Leapman SB, Folkman J. Tumor growth and neovascularization: an experimental model using the rabbit cornea. Journal of the National Cancer Institute. 1974;52(2):413–427. doi: 10.1093/jnci/52.2.413 [DOI] [PubMed] [Google Scholar]
  • 81. Hlushchuk R, Barré S, Djonov V. Morphological aspects of tumor angiogenesis. In: Ribatti D, editor. Tumor angiogenesis assays: methods and protocols. Methods in Molecular Biology. New York, NY: Springer; 2016. p. 13–24. [DOI] [PubMed] [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009094.r001

Decision Letter 0

Mark Chaplain, Daniel A Beard

16 Feb 2021

Dear Professor Byrne,

Thank you very much for submitting your manuscript "Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Dear Authors,

Your manuscript has been reviewed by three experienced reviewers and each has recommended that revisions be undertaken before the paper can be considered acceptable for publication.

Please revise your manuscript in line with the comments of each of the reviewers and provide details of the changes made when re-submitting the revised version.

We look forward to receiving your revised manuscript in due course which will be sent back to the reviewers.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Mark Chaplain

Guest Editor

PLOS Computational Biology

Daniel Beard

Deputy Editor

PLOS Computational Biology

***********************

Dear Authors,

Your manuscript has been reviewed by three experienced reviewers and each has recommended that revisions be undertaken before the paper can be considered acceptable for publication.

Please revise your manuscript in line with the comments of each of the reviewers and provide details of the changes made when re-submitting the revised version.

We look forward to receiving your revised manuscript in due course which will be sent back to the reviewers.

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: This paper proposes a topological data analysis (TDA) pipeline to interrogate simulation predictions of the Anderson-Chaplain model of angiogenesis associated with tumour development. The authors assess various topological and standard descriptors of model simulations generated by different parameter values, and they demonstrate TDA’s performance to stratify the A-C model parameter space into regions with regards to vessel morphology similarities. The paper is in general very comprehensive and structured. A well-executed study is presented here that provides important results pertinent for systematic analysis of the A-C model, and with regards to a specific TAD algorhtm: persistent homology. The reviewer recommends publication once the following revisions have been addressed by the authors. Please note that the comments for revision are not presented below in any special order.

Introduction (2nd paragraph): Here the authors outline the A-C angiogenesis model, which appears not to account for any vascular remodelling (vessel compression, collapse, lumen remodelling, etc.) - not as far as more contemporary models do, e.g.: DOI:10.1016/j.mvr.2015.02.007, DOI:10.1371/journal.pone.0150296, DOI:10.1371/journal.pcbi.1005259, DOI:10.1098/rsif.2016.0918. Therefore, the authors should justify this simplification in their study. Also, they need to elaborate in the paper how and if this would impact the findings of this work. Along the same lines, the authors should justify why they decided to select investigating the A-C model using a two-dimensional space. They have to elaborate and convince on their decision to do so – mainstream state-of-art mathematical models are 3D; thus, a 2D cancer model study may seem too simplistic nowadays. The authors need to elaborate on this matter and explain again if the findings of their TDA study would be different in the case of a A-C model in 3D.

Introduction (2nd paragraph) + Discussion: The review articles of theoretical cancer models should include more recent ones, opposed to the ones currently cited [41-44], more specifically the reviewer proposes: DOI:10.1200/CCI.18.00068, DOI:10.1088/1478-3975/ab1a09, DOI:10.1016/j.ymeth.2020.02.010.

Figure 1: This figure is nice despite no clear explanation why/how fibronectin gradients are produced, while also the vascular elements recruited by TAFs that are in turn secreted by tumour cells seem too out of scale compared to the “mother” blood vessel. Minor rework is needed here to further improve the cartoon.

Figure 2: Here the authors outline the hierarchy of the proposed pipeline, where the 3 major components (mathematical model -> data analysis -> data clustering) are shown very nicely. However, one may say that box 2 and box 3 concern about data processing and analysis; therefore, there is no clear distinction between the two. Also, there is no connection amongst the 3 boxes that is to say which component from box 1 feeds another component from box 2, and which component from box 2 supplies box 3 or/and box 1, and so on. The illustration needs some rework to render it a true figure showing the “flow” of the processes involved in the proposed pipeline!

Materials & Methods: The major direction of tumour angiogenesis data via a topological data analysis of the Anderson-Chaplain model is to interrogate haptotaxis and chemotaxis. However, the papers cited already in this manuscript [47,48] highlight the importance of mechanotaxis in tumour angiogenesis. This is a major simplification which may also have significant impact to the TDA analysis of the tumour networks. The authors need to elaborate on this in the paper, and they are strongly advised to encompass in their study the effect of mechanotaxis.

Materials & Methods: “We generate 10 realizations of the model for each of the 121 parameter combinations, and produce… “ It should be justified how this parameter space discretization and why this parameter range was selected. Should the authors have selected a much broader or narrower parameter range for model parameters ‘rho’ and ‘chi’, or even a parameter range that may produce “unrealistic simulations,” would this have affected the findings significantly, or not? Please elaborate on this – the authors should also connect their response with the above reviewer comment!

Materials & Methods: “To facilitate downstream clustering analysis, we require vectors of the same length for all simulations.” The authors need to justify this decision. What if the ‘experimental data’ were not homogeneous and of a different sample size. Would this impact the reliability and accuracy of the proposed pipeline; would the pipeline be able to work in principle? Please explain that in the manuscript.

Figure 3: Selection of vessel segments (that would in principle determine the number of vessel segments) appears somehow arbitrary. Please explain how it works in the algorithm. Also, the concept determining the final length of the vessels may be appropriate for a 2D representation of a vasculature. Despite of this, what if one had considered to adopt a different wave-front determination algorithm for the tip vessel propagation “monitoring.” What if one had employed in the algorithm multiple wave-fronts / directions of the dashed-line(s). Would the proposed approach work in 3D representation of the tumour vessels? Please elaborate further details of the algorithm in the manuscript.

Filtrations for simulated vasculature / Figure 4: It is not entirely clear in the methods how the binary image is generated – here from the model produced data (synthetic data) – and how it is associated with the resolution of the simulation space. More precisely, the authors need to explain how the “resolution” of the model generated tumour vasculature relates to the binary image “resolution”. Is there any relationship between the two resolutions? For example, let’s take two extreme scenarios: (a) in the first one, assume the resolution of the model vasculature (produced using the snail-trail algorithm) is of the order of 1 [unit length] (minimum segment length) whereas the pixel size is 10*10 [unit lengtht]^2; (b) in the second scenario, the resolution of the model vasculature is of the order of 1 [unit length] (minimum segment length) whereas the pixel size is 1*1 [unit lengtht]^2. Can the authors explain how the filtration algorithm will perform – please quantify your results.

The authors need to comment also about the efficiency of the “sweeping plane filtration” and “flooding filtration” algorithms when applied to 3D data (i.e., image stacks of MR scans, see review DOI:10.1018/nrclinonc.2014.126).

The following statement is not entirely clear to the reviewer, please consider rephrasing it.

“On each subsequent step, we iterate through each nonzero pixel from the previous binary image and manually set every pixel in its Moore neighborhood …”

“From the corresponding sequence of binary images, we construct a filtered simplicial complex K^flood = {K_1, K_2, …, K_end}.” To the reviewer’s understanding, end filtration will result into a simplicial complex that covers the entire “image”. Can the authors explain why this is needed? Is this right?

Pages 8-9: “Intuitively, a topological feature is born in filtration step … with another component or when a loop is covered by 2-simplices.”

It is self-evident that the homology group for when a feature is generated / born (tip vessel, branch, anastomosis, etc.) and for when a feature ceases to exist (e.g., tip transforms into an anastomosis). Nonetheless, and in view of the modelling simplification that no vascular remodelling / vessel compression or collapse / vessel intussusception is encompassed in the A-C model, then how a branch (or even a tip vessel) is regressed, or how an anastomosis is pruned in the vascular network.

Topological descriptor vectors from PH: The following statement may need some further justification why this step was considered in the persistent homology algorithm.

“In a first step the coordinates (b; d - b) are blurred by a Gaussian, with standard deviation sigma, that is centered about each birth-persistence point.”

Simulation Clustering: “Specifically, 7 of the 10 descriptor vectors from each of the 121 (rho, chi) parameter combinations …” were these 7 descriptor vectors begin selected randomly following a specific pattern (uniform, normal distribution)?

Results / Haptotaxis and chemotaxis alter vessel morphology: As covered in the recent reviews (DOI:10.1007/s11831-015-9156-x, DOI:10.1016/j.ymeth.2020.02.010 - they should be cited here), angiogenesis models have already highlighted the importance of chemotaxis and haptotaxis in tumour-induced angiogenesis, whereas in these two modelling papers (DOI: 10.1371/journal.pcbi.1005259) and (DOI:10.1098/rsif.2018.0415) it has been demonstrated the importance of mechanotaxis and interstitial fluid flow in vascular development in the tumour microenvironment. Therefore, the contribution of this paragraph is not entirely clear to the reviewer. It should be enhanced with the analysis suggested in the comment #5 above.

Reviewer #2: Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis

The authors present a novel analysis pipeline that applies multi-scale encoding and machine learning methods to simulated data produced by a well known model of cancer-induced angiogenesis, in order to identify biologically similar groupings of vessel morphology. The idea is well founded and the motivation clear, and the methods are sound. The paper is also generally well written and easy to digest.

However, I think some major clarification is necessary to the results, as it is very unclear as to why one particular encoding (the sweeping plane filtration) produces better results than another (the flooding filtration). Furthermore, this comparison is confused with the overall message, which (I think) is that a standard clustering method (k-means) can be used with multi-scale encodings to group similar vascular morphologies, while standard post-processed metrics (e.g. number of nodes, vessel length) cannot. I think the authors need to clearly separate the two questions.

General comment

The authors frequently make reference to “biologically interpretable / informative / meaningful” clusters. However the definition of “biologically meaningful” is not clear to me; I would guess it refers to clusters that have similar biological properties, but this is purely my inferred meaning. I think the authors should define this up front with a clear statement on what constitutes a “biologically meaningful cluster”. Furthermore, there’s a difference between “meaningful” and “interpretable”; the latter term has a quite specific meaning in machine learning that is not necessarily equivalent to meaning (an algorithm can be non-interpretable but still produce meaningful predictions).

Materials and methods

Have the authors considered alternative standard descriptors from the literature? For example, δmax, the maximum distance in the tissue from the nearest blood vessel, and λ, a measure of the shape of the spaces between vessels, used by https://www.pnas.org/content/108/5/1799. Or even simply tortuosity (which the authors themselves reference on page 13).

Section ‘Filtrations for simulated vasculature’. The results section raises questions regarding the two representations presented here (sweeping plane and flooding). Given that they produce quite different results, I think some reference should be made to previous applications of each encoding; in particular what problems they were developed to solve, as this may shed light on why sweeping plane performs better than flooding.

Section ‘Simulation clustering’. The distance metric used by the k-means algorithm is not explained; I assume it’s Euclidean, but the authors should state this clearly.

Results

Figs 10, 11. I find it difficult to compare these figures because the Betti curves are presented on different axes (space in Fig 10, step in Fig 11). It would help if the figures were more directly comparable, or even on the same plot.

Section ‘Haptotaxis and chemotaxis alter vessel morphology’. The respective roles of hapto- and chemotaxis have been investigated before, it’s not entirely clear whether the authors are reporting them as new or just including them for context. I would guess the latter, but I think clarification is needed.

Section ‘Sweeping plane descriptor vectors cluster simulations by parameter values’. There are two results here that are not explained: i) sweeping plane can produce biologically meaningful clusters, while flooding cannot; ii) double descriptor vectors created by sweeping plane produce biologically meaningful and robust clusters. I think the authors need to provide explanations, and in particular comment on why the two encodings produce such different results.

Choosing the value of k: the elbow plot (SI Fig 15) is not hugely convincing to justify k=5, and the decision appears to have been made somewhat post hoc based on the desired interpretation of the clusters. The silhouette score might provide more information, and should be relatively easy to calculate given that a standard distance metric is used by the k-means algorithm.

Reviewer #3: Nardini and colleagues present an application of topological data analysis (TDA) to analyse synthetic vascular network generated with the Anderson-Chaplain (A-C) model of tumour angiogenesis. The authors compare various choices of description vector arising from the TDA and demonstrate that a judicious choice enables robust clustering of the synthetic networks based on the relative weight given to chemo-/haptotaxis in a number of simulated networks even in the presence of stochasticity.

There is currently a lack of methods for defining embeddings of vascular networks that enable downstream classification and regression. This is particularly challenging in this scenario given the intrinsic structural stochasticity. This is a very substantial contribution of this study and I congratulate the authors on the innovation provided. The writing in the manuscript is impeccable and the figures provided are of high quality, which facilitates substantially the understanding of the concepts introduced and discussed.

Before the manuscript can be accepted, I would like to request the following corrections:

1. At the risk of sounding pedantic, the description of tumour-induced angiogenesis in Fig. 1 needs its language updated or simply make clear that some of the concepts are useful modelling abstractions but not necessarily biological mechanisms. For example, endothelial cells (ECs) don’t “deposit” other ECs as they invade the avascular space. Since the formulation of the original A-C model, there have been substantial advances in understanding the dynamics of tip and stalk cells in sprouting angiogenesis. Furthermore, ECs don’t branch into two cells, it is ECs along an existing sprout that initiate secondary sprouts and so on. It may be worth bringing the description of TAF (VEGF+bFGF) from the methods section to the main text.

2. The argument of the paper goes that the standard quantitative descriptors in page 4 (inter-vessel spacing, number of branching points, tortuosity, etc) fail to capture higher dimensional information and facilitate network clustering. It would be very interesting to see how (little) effective these are at clustering the simulations similarly to Fig. 12. This comparison would provide quantitative evidence of the superior performance of the method proposed.

3. Fig. 12 and accompanying text: it would be interesting to also see the PIO0(KLTR)& PIO0(KRTL) descriptors plotted on a 2D plane (via Principal Component Analysis for example) to get a sense of the distance between the clusters. Fig. 13 appears to indicate that classes 1 and 2 may be phenotypically much closer than any other pair.

4. Page 18: my main criticism of the study is that the conclusion that the proposed descriptors cluster the networks in biologically meaningful way is highly dependent on the choice of k=5 in the clustering analysis. This choice is justified in a very empirical way and I would suggest the authors to use the elbow method or a similar heuristic to argue about the optimal number of clusters in k-means based on the amount variance explained.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009094.r003

Decision Letter 1

Mark Chaplain, Daniel A Beard

18 May 2021

Dear Professor Byrne,

We are pleased to inform you that your manuscript 'Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Mark Chaplain

Guest Editor

PLOS Computational Biology

Daniel Beard

Deputy Editor

PLOS Computational Biology

***********************************************************

Thank you for revising the paper and taking into account all comments of the reviewers.

#Accept

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have provided a detailed and thorough response/revision to my comments - I endorse the paper for publication.

Reviewer #3: The authors have addressed my comments satisfactorily.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No: The code for the Chaplain Anderson ABM model used in this study should be provided. Input files to run the code should be also provided to replicate the results of this study.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Vasileios Vavourakis

Reviewer #3: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009094.r004

Acceptance letter

Mark Chaplain, Daniel A Beard

25 Jun 2021

PCOMPBIOL-D-21-00052R1

Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis

Dear Dr Byrne,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Katalin Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Standard descriptor vector clustering.

    Clustering of the (ρ, χ) parameter space using k-means with k = 5 on the standard descriptor vectors to summarize each simulated vasculature. We considered A) Number of tips over time, C) Number of vessel segments over time, E) Final length of the simulated vasculature, and G) All of the previous descriptor vectors concatenated into one descriptor vector. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

    (TIF)

    S2 Fig. Individual plane sweeping descriptor vector clustering.

    Clustering of the (ρ, χ) parameter space using k-means with k = 5 on individual sweeping plane topological filtrations to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIR1(KLTR), C) PIO0(KLTR), E) PIO1(KRTL), G) and PIO1(KLTR) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

    (TIF)

    S3 Fig. Double plane sweeping descriptor vector clustering.

    Clustering of the (ρ, χ) parameter space using k-means with k = 5 on double vectors of the sweeping plane topological filtration to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIO0(KLTR) & PIO0(KRTL), C) PIO0(KRTL) & PIR1(KLTR), E) PIO0(KTTB) & PIR1(KLTR), and F) PIO0(KBTT) & PIR1(KLTR) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

    (TIF)

    S4 Fig. Dimensionality reduction.

    Dimensionality reduction of the PIO0(KLTR) & PIO0(KRTL) double descriptor vector. We reduced the dimensionality of the PIO0(KLTR) & PIO0(KRTL) double descriptor vector to two dimensions using principal components analysis and plot the reduced-dimension descriptor vector for each simulation. The color of each dot denotes the predicted grouping of each simulation from the k-means algorithm.

    (TIF)

    S5 Fig. Elbow curve.

    Elbow curve for the OOS Accuracy for the k-means clustering and labeling algorithm over different values of k. We considered all 276 doubles of topological feature vectors and plot the mean OOS accuracy percentage plus or minus two standard deviations for multiple values of k. We propose that the elbow occurs at k = 5.

    (TIF)

    S6 Fig. Multiple k-means clusterings.

    Clustering of the (ρ, χ) parameter space using the PIO0(KLTR) & PIO0(KRTL) double of topological descriptor vectors using the k-means algorithm with A) k = 3, C) k = 4, E) k = 5. The clusters are ordered according to the mean χ value within the cluster. Panels B,D,F) depict the out of sample confusion matrices for each descriptor vectory.

    (TIF)

    S7 Fig. (ρ, χ, ψ) clustering.

    Clustering based on the (ρ, χ) grouping and the time to branch, ψ. We created a data set of blood vessel simulations by considering (ρ, χ) values from groups 1–5 from Fig 13A and letting ψ vary over the 10 values {.1, 0.2, 0.3, …, 1.0}. We simulated the Anderson-Chaplain model ten times at each (ρ, χ, ψ) values to create 500 total simulations. We performed our clustering methodology on this collection of images based off the PIO0(KLTR) & PIO0(KRTL) descriptor vector.

    (TIF)

    S8 Fig. Initial TAF and fibronectin profiles.

    TAF concentration (left) and fibronectin concentration (right).

    (TIF)

    S9 Fig. Individual flooding descriptor vector clustering.

    Clustering of the (ρ, χ) parameter space using k-means with k = 5 on individual flooding topological descriptor vectors to summarize each simulated vasculature. The four highest OOS accuracies resulted from the A) PIR1(Kflood), C) β1(Kflood), E) PIO1(Kflood), G) and β0(Kflood) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

    (TIF)

    S10 Fig. Double flooding descriptor vector clustering.

    Clustering of the (ρ, χ) parameter space using k-means with k = 5 on double flooding topological descriptor vectors to summarize each simulation. The four highest OOS accuracies resulted from the A) PIO0(Kflood)&PIR1(Kflood), C) PIR0(Kflood)&PIR1(Kflood), E) β0(Kflood)&β1(Kflood), G) and PIO0(Kflood)&β1(Kflood) descriptor vectors. The five clusters are ordered according to the mean χ value within the cluster. Panels B,D,F,H depict the out of sample confusion matrices for each descriptor vector.

    (TIF)

    S1 Table. Anderson-Chaplain model parameters.

    List of mechanistic parameters for the Chaplain Anderson ABM from [17].

    (PDF)

    S2 Table. Anderson-Chaplain model nondimensionalized parameters.

    Baseline nondimensionalized mechanistic parameters for the Chaplain-Anderson Model from [17].

    (PDF)

    S3 Table. Individual plane sweeping clustering.

    Out of Sample Accuracy scores for individual feature vectors from the flooding filtration using k-means classification with k = 5.

    (PDF)

    S4 Table. Double flooding clustering.

    Out of Sample Accuracy scores for doubles of feature vectors from the flooding filtration using k-means classification with k = 5.

    (PDF)

    S5 Table. Individual plane sweeping clustering.

    Out of Sample Accuracy scores for individual feature vectors from the sweeping plane filtration using k-means classification with k = 5.

    (PDF)

    S1 Appendix. Modeling and clustering appendix.

    The file contains a more detailed description of implementation of the Anderson-Chaplain Model and clustering results.

    (PDF)

    Attachment

    Submitted filename: response_to_reviewers.pdf

    Data Availability Statement

    All data is available at https://github.com/johnnardini/Angio_TDA.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS

    RESOURCES