Abstract
Nano-engineered particles are a promising tool for medical diagnostics, biomedical imaging and targeted drug delivery. Fundamental to the assessment of particle performance are in vitro particle–cell interaction experiments. These experiments can be summarized with key parameters that facilitate objective comparisons across various cell and particle pairs, such as the particle–cell association rate. Previous studies often focus on point estimates of such parameters and neglect heterogeneity in routine measurements. In this study, we develop an ordinary differential equation-based mechanistic mathematical model that incorporates and exploits the heterogeneity in routine measurements. Connecting this model to data using approximate Bayesian computation parameter inference and prediction tools, we reveal the significant role of heterogeneity in parameters that characterize particle–cell interactions. We then generate predictions for key quantities, such as the time evolution of the number of particles per cell. Finally, by systematically exploring how the choice of experimental time points influences estimates of key quantities, we identify optimal experimental time points that maximize the information that is gained from particle–cell interaction experiments.
Keywords: heterogeneity, mathematical modelling, particle–cell interaction, nano-engineered particle, parameter estimation, prediction, approximate Bayesian computation
1. Introduction
Nano-engineered particles have the potential to transform precision medicine [1–4], medical diagnostics and biomedical imaging [5–8]. They facilitate the targeted delivery of therapeutics and imaging agents to specific cell types in challenging biological environments [3,9,10]. Particles can be designed and produced with varying physicochemical characteristics tailored to specific applications, such as cancer medicines, gene therapies, immunotherapies [3] and vaccines [11,12]. However, determining the performance of a particular design for a particular application is challenging. This is because many biological, physical and chemical processes occur simultaneously across a range of spatial and temporal scales [10,13–18]. Here, we develop tools to assess the performance of nano-engineered particle designs using a combined mathematical–statistical–experimental framework that quantifies crucial nano-engineered particle–cell interactions [14–17].
In vitro particle–cell interaction experiments provide a relatively fast and inexpensive option to build an understanding of nano-engineered particle–cell interactions (figure 1). In these experiments, cells are incubated with nano-engineered particles over a period of hours to days [19]. Measurements estimating the number of associated particles per cell can be obtained in different ways, including microscopy, flow cytometry and spectrometer-based methods. A review of these techniques can be found in [20]. Microscopy techniques provide data at a range of spatial resolutions, including approximate counts of individual nanoparticles at the single cell level, but generating representative samples is time and labour intensive [21–23]. In this study, we focus on routinely generated, rapid and high-throughput flow cytometry data [19,24,25]. Throughout we use the term nano-engineered particles, or particles for brevity, to describe nanoparticles (less than in diameter) and particles that are hundreds of nanometres in diameter generated using nano-engineering techniques [10].
Figure 1.
Mathematical–statistical–experimental workflow to analyse particle–cell interaction experiments. Experimental data comprise flow cytometry measurements from particle-only, cell-only and particle–cell experiments. We use mathematical and statistical modelling to generate predictions of key quantities and identify optimal experimental time points.
Mathematical modelling provides a powerful tool to characterize particle–cell interactions, reviewed in [10,26], and to interpret data from particle–cell interaction experiments. In particular, mechanistic mathematical modelling has been used to reveal key mechanisms driving particle–cell interaction experiments, including particle transport [27–31] and internalization processes [26,32–34]. In such studies there is a growing recognition of the importance of heterogeneity in biological and physical processes and properties [35–38]. Previous studies have also provided a mechanistic and quantitative basis for understanding particle–cell interactions using experimentally validated differential equation-based models [19,35,36]. These studies have revealed that particle–cell interactions are well characterized by two parameters: the particle–cell association rate, analogous to a rate constant in chemical reactions, and a carrying capacity-type parameter, which characterizes the maximum number of particles associated with a cell. This approach groups together particles in contact with the surface of a cell and particles internalized within a cell [19]. Using such metrics has been recommended as a route to accelerate progress by enabling objective comparisons across data generated using different experimental protocols or pairs of cells and particles [39–41].
In this study, we generalize the mechanistic ordinary differential equation-based mathematical model in [19] to biologically heterogeneous cell populations. Specifically, we allow the particle–cell association rate, carrying capacity and autofluorescence to be distinct for each cell and allow the fluorescence of each particle to be distinct. This new mathematical model captures particle–cell interactions in experiments with cell–cell heterogeneity and captures cell–cell competition for particles. To perform parameter estimation, practical identifiability analysis and prediction for the mathematical model, we use approximate Bayesian computation methods that have been developed to interpret similar flow cytometry data [32,42–45]. These approximate Bayesian computation methods seek parameter values that minimize the difference between simulated data from the model and observed experimental data. This combined mathematical–statistical–experimental approach allows us to estimate key parameters, variation in key parameters and uncertainty in key parameters. We directly compare this new model to previous approaches that do not allow for heterogeneous populations, neglect heterogeneity in routine measurements and apply data transformations that account for heterogeneity in routine measurements but can generate non-physical quantities. This study builds on the growing literature that has focused on quantifying cell–cell heterogeneity from flow cytometry data in the absence of particles [46–52]. We conclude by exploring how the choice of experimental time points influences information gained about the mathematical model parameters from our approximate Bayesian computation approach. 
In this process, using established methods based on principles of Bayesian optimal design [53–55], we systematically examine 3003 experimental designs, identify optimal designs for different particle–cell association rates, and identify an overall optimal design for when the particle–cell association rate is unknown.
2. Data
We examine previously published data for THP-1 cells (a human leukaemia monocytic suspension cell line) separately incubated with three types of nano-engineered particles [19]. Results in the main paper focus on 150 nm polymethacrylic acid (PMA) core–shell particles. Electronic supplementary material results focus on 214 nm PMA capsules and 633 nm PMA core–shell particles.
Data for each particle–cell combination are obtained by flow cytometry and comprise (i) 20 000 measurements per time point of the fluorescent signal for individual cells in time course particle–cell association data, for , , , , and h (figure 1C,F); (ii) 11 605 measurements of the fluorescent signal for individual cells for the cell-only control data, , corresponding to measurements at h (figure 1B,E); and (iii) at least 249 344 measurements of fluorescent signal for individual particles for particle-only control data, (500 000 measurements for 150 nm PMA core–shell particles and 214 nm PMA capsules, 249 344 measurements for 633 nm PMA core–shell particles) (figure 1A,D). Following [19], these measurements are calibrated to a reference voltage and denoted , and (electronic supplementary material, S1.1). Data for each time point group together endpoint measurements from two replicates. We do not obtain multiple measurements of the same cell. These types of data are sometimes referred to as snapshot time-series data [48].
3. Methods
We seek to (i) quantify biological heterogeneity in nano-engineered particle–cell experiments, (ii) generate predictions of key quantities that are challenging to observe by experimentation alone, and (iii) identify experimental time points that are optimal in the sense of maximizing the precision of parameter estimates. To perform this analysis, we present a suite of quantitative techniques: two mechanistic mathematical models that describe particle–cell interactions; simulation-based approximate Bayesian computation (ABC) methods that we employ for parameter inference, practical identifiability analysis, and prediction; and experimental design tools based on principles of Bayesian optimal design.
3.1. Mathematical models
We consider two ordinary differential equation-based mathematical models. The first is a previously published model that assumes a homogeneous cell population [19]. The second is a new mathematical model that generalizes the homogeneous model to heterogeneous cell populations.
3.1.1. Homogeneous cell population
Assuming all cells in the experimental well are identical, the concentration of particles per cell in the well-mixed media of volume at time , denoted [], is initially given by [] and its time evolution is governed by the ordinary differential equation [19]
| (3.1) |
where [-] is the fractional surface coverage of cells, [] is the surface area of the cell boundary, [] is the particle–cell association rate, and we refer to [] as the cell carrying capacity for particles. In practice, equation (3.1) is a phenomenological model and particles may rapidly associate and disassociate, so is not necessarily a carrying capacity in the traditional sense of population dynamics models; however, it captures the observed behaviour. Considering conservation of the total number of particles in the system and assuming that there is no particle degradation during the experiment, the number of associated particles per cell at time , denoted [], is equal to the difference between the initial number of particles per cell in the media and the number of particles per cell in the media at time :
| (3.2) |
The analytical solution of the coupled system of equations (3.1) and (3.2) for is
| (3.3) |
Whether the solution approaches the long-time solution, , within the experimental duration depends on . In typical experiments, the number of particles per cell is small relative to the initial number of particles in the media, for all [19].
This mechanistic mathematical model is characterized by four parameters , that are determined by the experimental setup and assumed to be known fixed constants (electronic supplementary material, S1.2), and two unknown parameters, , that cannot be directly measured in the experiments.
3.1.2. Heterogeneous cell population
Previous studies have shown that the particle–cell association rate, , and the carrying capacity parameter, , exhibit significant variability between cells subjected to the same experimental conditions [35]. We now generalize the homogeneous cell population model to allow for this variability. We make a common assumption for non-negative biological parameters and assume that and are lognormally distributed [28,56,57],
| (3.4) |
In equation (3.4), and denote the means and and denote the standard deviations of the lognormal distributions for and , respectively. We use this parameterization of the lognormal distributions to report the results for and in terms of their mean and standard deviation. With this approach, we allow for differing amounts of heterogeneity in the particle–cell association rate, , and the carrying capacity, , and assume that there is no correlation between these mechanisms for each individual cell [35].
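Reporting a lognormal distribution by its mean and standard deviation requires a conversion to the parameters of the underlying normal distribution before sampling with most numerical libraries. The following is a minimal sketch of that conversion, not the implementation used in this study; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def lognormal_from_mean_sd(mean, sd):
    """Convert the mean and standard deviation of a lognormal variable
    into the (mu_log, sigma_log) parameters of the underlying normal."""
    sigma_log_sq = np.log1p((sd / mean) ** 2)
    mu_log = np.log(mean) - 0.5 * sigma_log_sq
    return mu_log, np.sqrt(sigma_log_sq)


def sample_lognormal(mean, sd, size, rng):
    """Draw lognormal samples specified by their mean and standard deviation."""
    mu_log, sigma_log = lognormal_from_mean_sd(mean, sd)
    return rng.lognormal(mean=mu_log, sigma=sigma_log, size=size)


# Illustrative values only (not taken from the study).
rng = np.random.default_rng(0)
samples = sample_lognormal(mean=2.0, sd=1.0, size=200_000, rng=rng)
```

With this parameterization the sample mean and sample standard deviation approach the requested values as the number of samples grows, and every sample is positive.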
Assuming that each cell has distinct properties, the time evolution of the concentration of particles per cell in the well-mixed media can be described by a system of ordinary differential equations, where is the number of cells simulated in the experimental well. We set , which is equal to the number of experimental measurements per time point and assume that this is sufficiently large to accurately approximate the – distribution. This model captures cell–cell competition for particles. For example, if a population of cells all have similar values of , those cells with high reduce the total number of particles available to associate with cells with low at later times. In electronic supplementary material, S1.3, we derive this heterogeneous model and show that the heterogeneous model simplifies to the homogeneous model when all cells are identical.
Solving the large system of coupled differential equations that forms the heterogeneous model is computationally expensive, typically taking minutes. This makes statistical inference, where we must solve the model many times for different parameter values, computationally challenging. To address this challenge, we determine an approximate solution to the heterogeneous model that is accurate under the experimentally relevant assumption that the number of particles that associate with cells is small relative to the initial number of particles in the media. Furthermore, this approximate solution can be evaluated in less than a second, which supports efficient statistical inference. The approximate solution to the heterogeneous model, derived in electronic supplementary material, S1.3, is given by a set of independent equations for the number of particles associated with each cell :
| (3.5) |
In equation (3.5), and are samples from the probability distributions defined in equation (3.4) and [particles cell m] represents the approximate concentration of particles per cell in the media at time . We estimate using the mean of evaluations of the analytic solution to the homogeneous cell population model (equation (3.3)) at distinct samples of and from equation (3.4). We set based on pilot simulations exploring a tradeoff between computational efficiency and accuracy. We evaluate the integral in equation (3.5) using the trapezoid rule. In electronic supplementary material, S1.3, we verify that this approximate solution to the heterogeneous model (equation (3.5)) agrees with the corresponding solution to the heterogeneous model for experimentally relevant parameter regimes.
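The trapezoid rule used for the integral in equation (3.5) can be sketched as a running sum over a time grid. The example below is illustrative only: the integrand is a hypothetical decaying function standing in for the estimated media concentration, and the grid spacing is an assumption rather than the value used in this study.

```python
import numpy as np


def cumulative_trapezoid(y, t):
    """Running integral of y(t) from t[0] to each t[k] by the trapezoid rule."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    # Area of each trapezoid between consecutive grid points.
    increments = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(increments)))


# Hypothetical integrand on a uniform grid over 24 h.
t = np.linspace(0.0, 24.0, 2401)
u = np.exp(-0.1 * t)
integral = cumulative_trapezoid(u, t)
```

Because the running integral is computed once on a shared grid, it can be reused for every simulated cell, which is one reason an approximation of this kind is cheap to evaluate.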
This mechanistic mathematical model is characterized by four parameters that are determined by the experimental setup and assumed to be known fixed constants (electronic supplementary material, S1.2) and four unknown hyperparameters that we will estimate as they cannot be directly measured in the experiments.
3.2. Parameter estimation, practical identifiability analysis and prediction
We use established likelihood-free simulation-based ABC methods for parameter estimation, practical identifiability analysis and prediction. In brief, we use ABC methods that seek parameter values that minimize the difference between the experimentally measured flow cytometry fluorescence data and synthetic flow cytometry data that we generate by simulation.
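The core idea of likelihood-free inference can be illustrated with ABC rejection, the simplest ABC variant. The sketch below is a toy example under stated assumptions: the simulator is a hypothetical one-parameter model, and the distance is a simple quantile-matching discrepancy swapped in for brevity rather than the distance used in this study.

```python
import numpy as np


def simulate(theta, n, rng):
    """Toy simulator (hypothetical): n noisy measurements centred on theta."""
    return rng.normal(loc=theta, scale=1.0, size=n)


def distance(x, y):
    """Mean absolute difference between matched quantiles of two samples."""
    q = np.linspace(0.05, 0.95, 19)
    return np.mean(np.abs(np.quantile(x, q) - np.quantile(y, q)))


def abc_rejection(observed, prior_low, prior_high, n_draws, threshold, rng):
    """Keep prior draws whose simulated data lie within `threshold` of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # draw from a uniform prior
        synthetic = simulate(theta, observed.size, rng)
        if distance(observed, synthetic) <= threshold:
            accepted.append(theta)
    return np.array(accepted)


rng = np.random.default_rng(1)
observed = simulate(3.0, 500, rng)  # "data" generated with a known parameter
posterior = abc_rejection(observed, 0.0, 10.0, 5000, 0.2, rng)
```

The accepted values approximate the posterior distribution; tightening the threshold sharpens the approximation at the cost of fewer accepted samples, which is the inefficiency that sequential schemes are designed to reduce.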
For illustrative purposes, we first explain how to generate synthetic flow cytometry data under simplifying assumptions. In particular, we assume that the homogeneous mathematical model is valid and that it is reasonable to summarize the experimentally measured cell-only and particle-only control datasets, and defined in §2, with their median fluorescent intensities. A single synthetic flow cytometry datapoint at time is then [19,41]
| (3.6) |
In equation (3.6), the contribution of cell autofluorescence to the total fluorescence is additive, and the contribution of particle fluorescence to the total fluorescence is multiplicative.
In practice, we generalize equation (3.6) to allow for the heterogeneous outputs from the heterogeneous mathematical model and we capture cell–cell variability, particle–particle variability and measurement noise by exploiting heterogeneity in and . A single flow cytometry datapoint for cell at time is then given by
| (3.7) |
Analogous to equation (3.6), in equation (3.7) noise due to cell autofluorescence, , is additive but rather than using the median fluorescent intensity of , we sample the autofluorescence of each cell , , from . Similarly, we refer to noise due to the particle fluorescence as multiplicative, but rather than using the median fluorescent intensity, we now sample the fluorescence of each individual particle for cell , , from . In equation (3.7), since may not be an integer, we assume that the total fluorescence due to particles is given by samples from , where denotes the floor function, and one additional sample from that is scaled by the fraction of the particle that is associated with the cell, , where denotes the ceiling function.
A complete synthetic flow cytometry time course dataset is then , where . As we simulate the synthetic flow cytometry datapoints at each time point and for we extensively sample and . Throughout, we use capitalized and non-capitalized to distinguish between datasets and the data points that form datasets, respectively.
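The sampling scheme described for equation (3.7) can be sketched for a single cell as follows. This is a minimal illustration rather than the implementation used in this study: the function and variable names are hypothetical, and sampling from the control datasets with replacement is an assumption.

```python
import numpy as np


def synthetic_fluorescence(p_cell, cell_only, particle_only, rng):
    """One synthetic fluorescence value for a cell with p_cell associated particles.

    cell_only and particle_only are arrays of control measurements;
    both are resampled with replacement (an assumption of this sketch).
    """
    n_whole = int(np.floor(p_cell))  # number of complete particles
    fraction = p_cell - n_whole      # remaining fraction of a particle
    autofluorescence = rng.choice(cell_only)
    whole = rng.choice(particle_only, size=n_whole).sum()
    partial = fraction * rng.choice(particle_only)
    return autofluorescence + whole + partial


# Hypothetical control data, for illustration only.
rng = np.random.default_rng(2)
cell_only = rng.lognormal(mean=1.0, sigma=0.3, size=1000)
particle_only = rng.lognormal(mean=0.0, sigma=0.5, size=1000)
signal = synthetic_fluorescence(12.4, cell_only, particle_only, rng)
```

Repeating this draw for every simulated cell at every time point yields a synthetic dataset with the same structure as the experimental time course data.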
We perform inference using an established ABC-sequential Monte Carlo (SMC) algorithm [32,45] that we modify to terminate at a target ABC error threshold. The ABC-SMC algorithm facilitates efficient inference by sequentially determining sets of parameter values that result in closer agreement between the synthetic and experimental time course datasets, (defined above) and (defined in §2), respectively. We use uniform priors for all parameters and set the ABC distance function to be the sum of the Anderson–Darling distances between the experimental and simulated datasets at each time point. In electronic supplementary material, S1.4, we present further details, including the ABC error thresholds, the number of ABC particles, the transition kernel, definitions of the ABC distance functions that we explore, uniform prior bounds, and how we generate posterior distributions, predictions and inferred distributions.
3.3. Optimal experimental time points
To identify optimal experimental time points, we adopt established Bayesian optimal design techniques for mathematical models with intractable likelihoods. Specifically, we consider each set of time points as a distinct experimental design and take an approach similar to the ABCdE algorithm [53] that involves ABC rejection and pre-simulating synthetic data.
For consistency with previous data [19], we assume that each experimental design comprises flow cytometry measurements at six distinct time points in addition to cell-only control data (corresponding to h) and particle-only control data. For each particle–cell scenario that we consider we assume that the cell-only and particle-only control data are fixed. Each experimental design is then characterized by six distinct time points. We assume that these time points are chosen from 14 possible choices: h, h and every 2 h from h to h. These time points are chosen based on previous data and practical considerations regarding the frequency of measurements with manual experimental procedures. As we consider all possible ways to choose the six time points from a set of 14 possible time points, we examine 3003 possible experimental designs.
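The number of candidate designs follows from counting the ways to choose six time points from 14 candidates. A short sketch of the enumeration is given below; the candidate time points are represented by placeholder indices rather than the actual times.

```python
from itertools import combinations
from math import comb

# Placeholder labels 0..13 stand in for the 14 candidate time points.
candidate_times = list(range(14))

# Every design is an unordered choice of six distinct candidates.
designs = list(combinations(candidate_times, 6))

print(len(designs))  # 3003
```

Each tuple in `designs` indexes one experimental design, so the full set can be evaluated by looping over this list.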
Given that particles are frequently designed to target specific cells, the particle–cell association rate is a key quantity of interest. As the particle–cell association rate has been shown to vary over multiple orders of magnitude for different combinations of particles and cell types [19], we examine each of the experimental designs for three particle–cell scenarios that vary with respect to the particle–cell association rate . We refer to these as low , intermediate and high and note that these are defined relative to .
For each scenario, we capture potential variation in experimental data by pre-simulating synthetic flow cytometry datasets using statistical hyperparameters , , , sampled from non-negative truncated Gaussian distributions with pre-specified mean values and coefficients of variation set equal to . We next pre-simulate synthetic flow cytometry datasets using statistical hyperparameters sampled from uniform priors whose bounds are defined in electronic supplementary material, S1.4. Using ABC rejection across all designs, we use these datasets to efficiently form ABC posteriors. We choose the ABC error threshold so that all ABC posteriors across all designs contain at least 200 samples and use the Anderson–Darling ABC distance throughout.
To assess each design for a particular scenario, we compute an average utility that rewards precise estimates of the model parameters across the synthetic flow cytometry datasets via the empirical covariance matrix of the ABC posterior distributions [54]:
| (3.8) |
In equation (3.8), represents a single synthetic flow cytometry dataset, and ABC posteriors are computed for each of the synthetic datasets independently.
The optimal design for a particular scenario corresponds to the design that maximizes the average utility over all of the 3003 possible designs, , with respect to potential future data and model parameters,
| (3.9) |
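The design selection step can be sketched as computing a precision-based utility for each ABC posterior, averaging over synthetic datasets, and taking the design with the largest average. The sketch below assumes that the utility is the inverse determinant of the empirical posterior covariance matrix, which is one common choice that rewards precise estimates; the design labels and posterior samples are hypothetical.

```python
import numpy as np


def utility(posterior_samples):
    """Inverse determinant of the empirical posterior covariance (assumed form).

    posterior_samples has shape (n_samples, n_parameters), n_parameters >= 2.
    """
    cov = np.cov(posterior_samples, rowvar=False)
    return 1.0 / np.linalg.det(cov)


def best_design(posteriors_by_design):
    """Return the design with the largest average utility, plus all averages.

    posteriors_by_design maps a design label to a list of posterior sample
    arrays, one array per synthetic dataset.
    """
    averages = {
        design: float(np.mean([utility(s) for s in posteriors]))
        for design, posteriors in posteriors_by_design.items()
    }
    return max(averages, key=averages.get), averages


# Hypothetical posteriors: one design gives narrow posteriors, the other wide.
rng = np.random.default_rng(3)
posteriors = {
    "narrow": [rng.normal(0.0, 0.1, size=(200, 2)) for _ in range(5)],
    "wide": [rng.normal(0.0, 1.0, size=(200, 2)) for _ in range(5)],
}
optimal, average_utilities = best_design(posteriors)
```

In this toy example the design that yields narrower posteriors has the larger average utility and is therefore selected.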
4. Results and discussion
Particle–cell interaction experiments generate thousands of cell-level measurements per time point. For example, we analyse experiments with 20 000 measurements at each time point [19]. However, heterogeneity that is present in such data is often overlooked when using standard metrics such as the median fluorescence intensity. Here, we use mechanistic mathematical modelling and statistical parameter estimation, practical identifiability analysis and prediction tools to exploit this heterogeneity in routine measurements for greater understanding.
We first perform synthetic data studies to verify that our methods recover known parameters and quantities, to explore the role of heterogeneity on key quantities, and to demonstrate parameter identifiability challenges that may arise. We then apply these methods to experimental data to quantify biological heterogeneity, quantify uncertainty in estimates of biological heterogeneity, and generate predictions of key quantities that are challenging to determine by experimentation alone. Next, we compare our new approach to a previous method. We conclude by identifying optimal experimental time points across various particle–cell scenarios.
4.1. Mechanistic modelling incorporating biological and control data heterogeneity captures data variability and reveals sources of uncertainty
Before analysing experimental data we verify that the parameter inference methods that we employ recover known parameters and quantities. We generate synthetic flow cytometry data consistent with the experimental data (§3). Heterogeneity in these synthetic data is driven by prescribed biological heterogeneity and heterogeneity in routine control experimental data measurements. The prescribed biological heterogeneity describes cell–cell variability in particle–cell association rates, , and carrying capacities, . Heterogeneity from the cell-only and particle-only control experimental data arises in the synthetic data as additive noise due to cell autofluorescence and multiplicative noise due to the presence of particles, respectively.
Analysing these synthetic flow cytometry data using the model that generated the data is useful to explore potential sources of uncertainty in the absence of model misspecification. However, the synthetic data are finite, non-ideal, incomplete and noisy. Furthermore, while it is straightforward to simulate the model, which incorporates biological heterogeneity and heterogeneity from the control experimental data, the complexity of the model renders traditional likelihood-based inference techniques computationally challenging.
Using likelihood-free simulation-based ABC inference methods (§3.2), we form posterior distributions for the means and standard deviations of the lognormal distributions that we assume characterize and . We use these posterior distributions to capture uncertainty and estimate credible intervals, which we report in terms of highest posterior density. For these synthetic data, we find that posterior distributions for all statistical hyperparameters are relatively narrow in comparison to pre-specified bounds and are well-formed about a single peak. Therefore, we conclude that the statistical hyperparameters are practically identifiable (figure 2). This means that a relatively narrow range of parameters gives similar agreement to the data. Furthermore, the known values of the statistical hyperparameters used to generate the data are each contained within their respective highest posterior density interval.
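A highest posterior density interval can be estimated from posterior samples as the shortest interval that contains the required fraction of samples. The following is a minimal sketch for a unimodal posterior, not the implementation used in this study; the function name and the test samples are hypothetical.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing a fraction `mass` of the samples.

    Appropriate for unimodal posteriors; a multimodal posterior may
    require a union of intervals instead.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))           # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]  # width of each window of k consecutive samples
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]


# Hypothetical posterior samples, for illustration only.
rng = np.random.default_rng(4)
draws = rng.normal(0.0, 1.0, size=100_000)
lower, upper = hpd_interval(draws, mass=0.95)
```

For a symmetric posterior this interval coincides with the equal-tailed interval, whereas for a skewed posterior it shifts towards the region of highest density.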
Figure 2.
Parameter inference techniques recover known values of the statistical hyperparameters from finite, non-ideal, incomplete and noisy synthetic data. Univariate and bivariate posterior distributions for the statistical hyperparameters [], [particles cell], [] and [particles cell]. The 95% univariate highest posterior density intervals are for , for , for and for . Known parameter values used to generate the data are and are shown as vertical orange dashed lines and circles.
Propagating forward the uncertainty captured by the posterior distributions is a powerful technique for additional verification and allows for comparisons to experimentally measured quantities. We use this technique to show that posterior predictions for the distribution of fluorescence measurements demonstrate excellent agreement with the corresponding synthetic data (figure 3).
Figure 3.
Posterior prediction of particle–cell association time course data captures variability in the synthetic data. Histograms of the synthetic fluorescence signal, [AU] (grey), and prediction interval of heights of histogram bars from the mathematical model (cyan) for (A) , (B) , (C) , (D) , (E) , (F) h. Inset in (A) illustrates the width of prediction intervals.
We next propagate forward uncertainty captured by the posterior distributions for the statistical hyperparameters to generate inferred distributions for and (figure 4). These inferred distributions allow us to quantify biological heterogeneity and allow us to quantify the uncertainty in these estimates of the biological heterogeneity. These inferred distributions demonstrate that there is significant biological heterogeneity. Previous methods that focus on point estimates of parameters that characterize particle–cell interactions in the homogeneous model cannot capture this heterogeneity [19]. Furthermore, while more recent methods can capture biological heterogeneity, they overlook heterogeneity that is present in routine control experimental data measurements that we later show is critical for generating accurate predictions [35].
Figure 4.
Parameter inference techniques recover known inferred distributions from finite, non-ideal, incomplete and noisy synthetic data. (A) Inferred distribution for [] with inset focusing on low . (B) Inferred distribution for [particles cell]. Colours represent median of estimated inferred distribution (black), 95% prediction interval (blue) and distributions at known values (orange-dashed).
4.2. Mathematical modelling enables predictions of key quantities that are challenging to observe experimentally
Particle–cell interaction experiments are routinely performed to address a fundamental question: how does the number of particles per cell change with time? This is a challenging question to answer accurately and robustly by experimentation alone. Counting the number of particles per cell at scale in experiments is simply not yet feasible. Furthermore, straightforward translations of fluorescence measurements to particles per cell, such as in [19], assume homogeneous cell populations and do not generate robust accurate estimates for the number of particles per cell, as we later show. Our mathematical and statistical modelling framework provides a powerful tool to address this question. Propagating forward the uncertainty in posterior distributions, we generate predictions for the time evolution of the number of particles per cell, . These predictions demonstrate excellent agreement with the corresponding synthetic data. In particular, the lower and upper quartiles of the known synthetic data demonstrate excellent agreement with the corresponding boundaries of the prediction interval (figure 5A). This is a critical verification step for our approach. This prediction for is obtained by analysing the synthetic flow cytometry data that comprise noisy observations of as described in equation (3.7), whereas the synthetic data to which we make a comparison are noise-free observations of obtained from equation (3.5).
Figure 5.
Mathematical modelling generates predictions of quantities that are challenging to observe experimentally. (A) Time evolution of the number of particles per cell, [particles cell], with (dark blue) and (light blue) prediction intervals. Box plots represent synthetic data (outliers not shown). (B) Time evolution of the percentage of cells that are close to carrying capacity, where close is defined via , for (solid), (dashed) and (dotted). The arrow indicates the direction of increasing .
We can also use our modelling framework to generate predictions for other quantities that are challenging to observe by experimentation and previous modelling approaches that assume a homogeneous cell population. For example, we can predict the percentage of cells close to their respective carrying capacities (figure 5B). This could be used to optimize the particle dosage. If most cells are far from their respective carrying capacities, increasing the particle dosage could help to elucidate particle–cell interactions. By contrast, if most cells are close to their respective carrying capacities, increasing the dosage provides similar results with increased wastage of particles. Predicting the percentage of cells close to carrying capacity can also help interpret uncertainty in estimates of , since parameter estimates for improve when cells are close to their respective carrying capacities.
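Given simulated numbers of associated particles and carrying capacities for each cell, the percentage of cells close to carrying capacity is a simple summary. The sketch below assumes that a cell is close to its carrying capacity when the ratio of associated particles to carrying capacity meets or exceeds a chosen threshold; this criterion, the function name and the values are assumptions made for illustration.

```python
import numpy as np


def percent_close_to_capacity(particles_per_cell, capacity_per_cell, threshold):
    """Percentage of cells whose particle count is at least `threshold`
    times their own carrying capacity (assumed definition of 'close')."""
    p = np.asarray(particles_per_cell, dtype=float)
    k = np.asarray(capacity_per_cell, dtype=float)
    return 100.0 * np.mean(p / k >= threshold)


# Hypothetical values for four cells at one time point.
particles = [10.0, 50.0, 95.0, 99.0]
capacities = [100.0, 100.0, 100.0, 100.0]
percent = percent_close_to_capacity(particles, capacities, threshold=0.9)
```

Evaluating this summary at each time point, and for several thresholds, gives curves of the kind that can inform the choice of particle dosage.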
Our predictions also extend beyond previous flow cytometry-based analyses that classify a cell as being either positive or negative for the presence of particles based on a cutoff fluorescence intensity [24]. We can estimate the percentage of cells close to their respective carrying capacities and the percentage of cells where is above a certain threshold. Such cell-level information could be used to identify if particles associate with cells too slowly or too quickly, which may result in subtherapeutic dosages for a particular application or toxic side effects.
4.3. Regimes with parameter identifiability challenges
Point estimates for the particle–cell association rate, , have been shown to vary over multiple orders of magnitude for different combinations of particles and cell types [19]. In figures 2–5, we consider a scenario where we obtain multiple measurements of the initial increase in far from and multiple measurements when is close to (figure 5A). We refer to this as a scenario with intermediate . We now compare a scenario with intermediate to a scenario with low , which only comprises measurements of the initial increase in , far from , and to a scenario with high , which only comprises measurements of close to . Note that these classifications that we base on are relative to . Furthermore, the classifications are based on a prediction of that we obtain using our modelling approach.
Performing additional synthetic data studies we analyse how results differ for low, intermediate and high . We explore these regimes by varying while holding the coefficient of variation , the lognormal distribution for , and time points fixed. Exemplar synthetic data for simulated for low, intermediate and high values of are shown in figure 6M–O.
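As a minimal sketch of how cell-level parameter values could be drawn in such a synthetic data study, the following Python code samples from a lognormal distribution specified by its mean and coefficient of variation. It relies on the standard identities for a lognormal distribution, namely that the variance of the underlying normal distribution is ln(1 + CV²) and its mean is ln(mean) − ½ ln(1 + CV²). The function names, the three mean values and the coefficient of variation of 0.5 are hypothetical and chosen for illustration only; they are not the values used to generate figure 6.

```python
import math
import random


def lognormal_params(mean, cv):
    """Return (mu, sigma) of the underlying normal distribution for a
    lognormal distribution with the given mean and coefficient of variation."""
    sigma_squared = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - 0.5 * sigma_squared
    return mu, math.sqrt(sigma_squared)


def sample_cell_parameters(mean, cv, n_cells, seed=0):
    """Draw one positive parameter value per cell from the lognormal distribution."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, cv)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_cells)]


# Hypothetical low, intermediate and high means with a shared coefficient
# of variation (illustrative values, not those used in the study).
for label, mean_value in [("low", 0.01), ("intermediate", 0.1), ("high", 1.0)]:
    draws = sample_cell_parameters(mean_value, cv=0.5, n_cells=5)
    print(label, [round(d, 4) for d in draws])
```

Holding the coefficient of variation fixed while varying the mean, as in this sketch, changes the scale of the distribution without changing its relative spread.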
Figure 6.
Parameter identifiability challenges for low and high . Univariate posterior distributions for the statistical hyperparameters [], [particles cell], [] and [particles cell] for (A,D,G,J) low , (B,E,H,K) intermediate and (C,F,I,L) high . Time evolution of the number of particles per cell, [particles cell], with (dark blue) and (blue) prediction intervals with box plots representing synthetic data with outliers not shown for (M) low , (N) intermediate and (O) high . Synthetic data are generated using the following parameter values: for low, intermediate and high with low , intermediate , high .
For low , posterior distributions for and are relatively narrow in comparison with the pre-specified bounds and are well formed about a single peak, so we conclude that these parameters are practically identifiable (figure 6A,D). These parameters are identifiable because we obtain multiple measurements when the influence of is prominent. By contrast, posterior distributions for and are wide and flat relative to pre-specified bounds (figure 6G,J), so we conclude that these parameters are practically non-identifiable. This lack of identifiability arises because we do not obtain measurements when is close to .
For high we obtain results that are opposite to the results for low . We find that and are practically non-identifiable (figure 6C,F) while and are practically identifiable (figure 6I,L). This is because we do not obtain measurements when has a significant influence, but we do obtain measurements when is close to . For intermediate values of , in agreement with results in figures 2–5, all parameters are practically identifiable as we obtain measurements when has a significant influence and when is close to (figure 6B,E,H,K).
These results, which demonstrate that model parameters are partially identifiable for low and high , are consistent with observations for homogeneous population models experiencing sigmoidal growth dynamics [58,59]. As we later discuss, uncertainty in the posterior distributions results in similar uncertainty for inferred distributions for and and previous methods that focus only on point estimates cannot provide such insights regarding the range of parameters consistent with the data.
Posterior predictions for the flow cytometry data for all values of demonstrate close agreement with the corresponding synthetic data despite the lack of parameter identifiability for low and high values of (electronic supplementary material, figure S8). This predictive capability in the presence of poor identifiability is expected because our ABC method involves identifying parameter values that result in close agreement with the synthetic flow cytometry data. This result is consistent with observations for homogeneous population models [60]. As our ABC method is designed only to minimize the difference to flow cytometry data, predictions for other quantities should be explored on a case-by-case basis. In this regard, predictions of are found to demonstrate excellent agreement with the corresponding synthetic data for all values of (figure 6M–O).
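The role of the ABC error threshold can be illustrated with a minimal sketch. The code below is not the ABC-SMC algorithm used in this study: it is the simpler ABC rejection variant, the mechanistic model is replaced by a toy Gaussian simulator with a single unknown mean, and all names, the prior bounds and the threshold values are hypothetical. It shows only the general principle that parameter values are retained when simulated data lie within a chosen distance of the observed data, here measured by a two-sample Kolmogorov–Smirnov distance.

```python
import random
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance: the largest absolute difference
    between the empirical distribution functions of a and b."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def simulate(theta, n, rng):
    """Toy stand-in for the mechanistic model: Gaussian data with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, threshold, rng):
    """Keep prior draws whose simulated data lie within threshold of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 5.0)  # hypothetical uniform prior bounds
        simulated = simulate(theta, len(observed), rng)
        if ks_distance(simulated, observed) < threshold:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulate(2.0, 200, rng)  # synthetic data with known mean 2.0
strict = abc_rejection(observed, 500, 0.15, rng)
loose = abc_rejection(observed, 500, 0.40, rng)
print(len(strict), len(loose))
```

In this toy setting a larger threshold accepts more prior draws and gives a wider approximate posterior, which is the trade-off involved when the threshold is raised to accommodate model misspecification or a limited computational budget.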
4.4. Insights from experimental data
We now apply our verified mathematical and statistical modelling tools to experimental data. Working directly with the experimental data poses new challenges as we do not know the true data-generating process. We allow for model misspecification through an increased ABC error threshold compared with synthetic data.
We focus on experiments where 150 nm PMA core–shell particles are incubated with THP-1 cells. Posterior distributions for each statistical hyperparameter suggest that they are practically identifiable (figure 7A–D). The posterior distributions for and and the inferred distributions for and (figure 7C,D) both indicate significant cell–cell variability in the population. We also observe significant cell–cell variability in predictions of the number of particles per cell, (figure 7E).
Figure 7.
Methods apply to experimental data and reveal uncertainty in predictions of key quantities. (A–D) Posterior distributions for the statistical hyperparameters [], [particles cell], [] and [particles cell]. The 95% highest posterior density intervals are for , for , for and for . (E–J) Histograms of synthetic fluorescence data (grey) and prediction interval of heights of histogram bars from the mathematical model (cyan) for , , , , , h. (K,L) Inferred distribution for (K) [] and (L) [particles cell]. (M) Prediction for [particles cell]. In (K–M) 95% prediction intervals are shaded in blue.
Propagating forward the uncertainty captured by the posterior distributions for the statistical hyperparameters, we observe that posterior predictions for the distribution of fluorescence measurements demonstrate reasonable agreement with the corresponding experimental data (figure 7E–J). However, the prediction intervals often do not capture the experimental data. We attribute these differences to model misspecification, as we have earlier verified that our mathematical and statistical modelling tools can demonstrate excellent agreement with these types of data and recover known parameters. This model misspecification may arise from the form of the homogeneous mathematical model, the use of lognormal distributions to characterize heterogeneity in and , or the assumptions of additive noise due to cell autofluorescence and multiplicative noise due to particle fluorescence.
We next repeat this analysis for 214 nm PMA capsules incubated with THP-1 cells and 633 nm PMA core–shell particles incubated with THP-1 cells (electronic supplementary material, S2.1). We estimate posterior distributions for the statistical hyperparameters, predictions for the distribution of fluorescence measurements, inferred distributions for and , and predictions for . We choose the ABC error threshold for the 214 nm and 633 nm data with the Anderson–Darling ABC distance to be and , which are both higher than the corresponding ABC error threshold for the 150 nm data, which is set to be . We make these choices so that the ABC-SMC algorithms terminate in a reasonable timeframe (24 h). There are multiple possible reasons why the experimental data require different ABC thresholds to terminate within the set timeframe. For example, higher ABC error thresholds may suggest greater model misspecification, a more complex posterior distribution, or that the established ABC-SMC algorithm that we employ may benefit from additional fine-tuning.
Analysing these three experimental datasets with two other ABC distance functions, based on the Cramér–von Mises statistic and Kolmogorov–Smirnov distance, we find similar agreement between posterior predictions for the distribution of the fluorescence measurements and the corresponding experimental data (electronic supplementary material, S2.1). Posterior distributions for the statistical hyperparameters and inferred distributions for and obtained using the different ABC distance functions are also similar. This corresponds to predictions for the median of demonstrating excellent agreement across the ABC distance functions. However, differences in the posterior distributions result in varied predictions for the upper tails of . These results suggest that care should be taken when choosing the ABC distance function to interpret these data. For example, if the upper tails of the distribution of are important we advise that one should compare results using multiple ABC distance functions.
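To indicate why the choice of distance function can matter for the tails, the following sketch computes two of these distances between a pair of samples that differ only in their upper tail. The implementations are plain Python versions of the two-sample Kolmogorov–Smirnov distance and a two-sample Cramér–von Mises-type statistic, written for illustration; they are not the implementations used in this study, and the sample values are hypothetical. The Anderson–Darling distance, which is generally understood to place more weight on the tails, is omitted for brevity.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical distribution function of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_distance(a, b):
    """Largest absolute difference between the two empirical distribution functions."""
    a, b = sorted(a), sorted(b)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


def cvm_distance(a, b):
    """Two-sample Cramer-von Mises-type statistic: scaled sum of squared
    differences between the empirical distribution functions over the pooled sample."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    total = sum((ecdf(a, x) - ecdf(b, x)) ** 2 for x in a + b)
    return n * m / (n + m) ** 2 * total


# Hypothetical samples: identical except that the top 5% of values are shifted.
base = [float(i) for i in range(1, 101)]
tail_shifted = base[:95] + [x + 50.0 for x in base[95:]]

print(ks_distance(base, tail_shifted))
print(cvm_distance(base, tail_shifted))
```

Both distances are small for this pair of samples because 95% of the values coincide, even though the largest values differ substantially. Parameter sets that produce different upper tails can therefore be accepted under the same threshold, which is consistent with the varied tail predictions noted above.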
4.5. Comparison to previous methods that focus on point estimates and overlook heterogeneity
Various methods have been employed to infer the dynamics of particle–cell interactions from flow cytometry data [10,26]. We now compare our new method to recent approaches that seek to quantify the particle–cell association rate, , and carrying capacity, , using mathematical modelling [19,35,36,61]. We focus on these approaches because (i) they assume that the experimental data comprise the same measurements that we have considered thus far: cells incubated with particles at multiple time points, , cell-only data, , and particle-only data, ; and (ii) they report results using a useful metric, namely particles per cell [39–41].
Rather than working directly with the entire flow cytometry data, Faria et al. [19] first employ a data transformation to obtain a point estimate of the number of particles per cell at each time point:
| (4.1) |
where the subscript ‘o’ denotes observed and . These are based on the median fluorescence intensity of each flow cytometry dataset. While using the median fluorescence intensity to summarize flow cytometry data is a common approach [19,41], vast amounts of information regarding the heterogeneity of the cell and particle populations are neglected when working solely with median values. For example, for each time point 20 000 cell-level fluorescent measurements in are reduced to a single median fluorescence intensity value.
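The information lost by summarizing with the median can be seen in a minimal example. The two sets of fluorescence intensities below are hypothetical and deliberately small; they share the same median but have very different spreads, so any analysis based only on the median cannot distinguish them.

```python
import statistics

# Hypothetical fluorescence intensities for two cell populations.
narrow = [98.0, 99.0, 100.0, 101.0, 102.0]
wide = [10.0, 50.0, 100.0, 400.0, 2000.0]

# The medians are identical, so a median-based summary treats these
# two populations as the same.
print(statistics.median(narrow), statistics.median(wide))

# The spreads differ by orders of magnitude.
print(statistics.pstdev(narrow), statistics.pstdev(wide))
```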
Faria et al. [19] connect these to the homogeneous mathematical model characterized by two parameters and (§3.1.1) using the method of least squares (electronic supplementary material, S1.5). This approach implicitly assumes that measurement/observation errors are additive and follow a Gaussian distribution. It is not clear that these assumptions hold for these data. Furthermore, using the method of least squares, one can only obtain a point estimate for and to characterize the particle–cell interactions. This contrasts with our new approach, which uses the heterogeneous mathematical model and ABC methods to estimate statistical hyperparameters that characterize distributions for and and then generate inferred distributions for and .
For both synthetic data (introduced in figure 6) and experimental data (introduced in figure 7), point estimates for and that characterize the homogeneous mathematical model obtained by the method of least squares are of the same order of magnitude as the modes of the inferred distributions for and generated using the heterogeneous mathematical model (figure 8A,B,D,E,G,H,J,K). However, while these point estimates for and are of the same order of magnitude, they do not provide robust, accurate estimates of the modes of the inferred distributions. Further, these point estimates do not provide robust, accurate estimates of the mode of the known distributions for synthetic data (figure 8A,B,D,E,G,H). We will show that this contributes to poor predictions of the number of particles per cell.
Figure 8.
Comparison of point estimate method with inferred distributions and predictions obtained using our new method. (A,D,G,J) Best-fit parameter values for from the homogeneous model with the method of least squares (magenta vertical dashed) compared with inferred distribution for from our new method. (B,E,H,K) Best-fit parameter values for from the homogeneous model with the method of least squares (magenta vertical dashed) compared with inferred distribution for from our new method. (C,F,I,L) from equation (4.1) (magenta circles) and solution of the homogeneous cell population mathematical model evaluated at the best-fit parameter values for and from the method of least squares (magenta dashed) compared with prediction of from our new method.
Simulating the homogeneous mathematical model with the best-fit values of and obtained using the method of least squares, we observe excellent agreement with the transformed data (from equation (4.1)) for synthetic and experimental data (figure 8C,F,I,L). However, we find that these predictions from the least squares method systematically overestimate the median of known values of from synthetic data. This overestimation appears to arise because control data, and , are right-skewed (electronic supplementary material, S2.3). Further, for intermediate , these predictions exceed the third quartile of known values of from synthetic data at later times (figure 8F). For high , these predictions exceed the third quartile of the known data at all times (figure 8I). By contrast, predictions of from our new method accurately capture the distribution of from synthetic data. These results suggest that our new method is better suited to estimating and predicting heterogeneity and uncertainty in , and .
Previous studies have also recognized that flow cytometry data contain more information than that given by the median fluorescence intensity [32,35,46–50]. We now compare our new method to techniques employed in the particle–cell interaction study by Johnston et al. [35]. By generalizing equation (4.1) to allow for heterogeneity in , Johnston et al. estimate the number of particles associated to each cell through time:
| (4.2) |
where represents the cell in and we let . Johnston et al. characterize at each time point using the mean and standard deviation. While this approach provides two pieces of information to summarize each time point, rather than the one data point provided by equation (4.1), the approach still does not fully exploit the information inherent in these data. Furthermore, the approach does not exploit the heterogeneity in the cell-only and particle-only control data. Johnston et al. use iterative techniques to fit to the means and standard deviations of at each time point. This approach provides estimates of lognormal distributions for and that are assumed to characterize these data within a voxel-based mathematical model. Equation (4.2) has also been used to pre-process experimental data for analysis in [36,61].
We also note that equation (4.2) can generate non-physical estimates for the number of particles per cell. In particular, when is similar to , equation (4.2) can result in many non-physical negative values of . While these non-physical negative values can be discarded, neglecting the negative values can lead to overestimating the number of particles per cell and subsequently lead to inaccurate estimates and predictions. We do not repeat this approach here. Our new method works directly with all experimental fluorescence measurements and does not require data cleaning or pre-processing procedures.
4.6. Identifying optimal experimental time points
Many experimental design choices can influence results and their interpretation. Poor experimental designs may result in poor and inconsistent estimates of particle performance, with the efficacy of some particles being overestimated and the efficacy of others being underestimated. Overestimating particle performance can lead to disappointing results in more complicated experiments and wasted effort and expense. Underestimating particle performance may result in potent particles being overlooked in favour of inferior particles. By contrast, well-designed particle–cell interaction experiments facilitate an improved understanding of the dynamics of particle–cell interactions and support the identification of particles with preferred properties.
We now examine one of the simplest yet most important design choices that can be controlled in these experiments, namely the time points when measurements are obtained. Before an experiment, it is not clear when measurements should be taken to maximize information about particle–cell interactions. To explore how the choice of time points influences results we systematically examine 3003 designs using our heterogeneous mathematical model and established optimal Bayesian design methods for models with intractable likelihoods. For further details see §3.3.
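For illustration, a design space of this size can be enumerated directly. The sketch below assumes that each design selects 6 measurement times from 14 candidate times, which gives 3003 combinations; this is one combination consistent with the reported number of designs and with the six time points shown in figure 7, but whether it matches the construction described in §3.3 is an assumption. The candidate times themselves are hypothetical placeholders.

```python
from itertools import combinations
from math import comb

# Hypothetical grid of 14 candidate measurement times (in hours).
candidate_times = list(range(1, 15))
points_per_design = 6  # assumed number of time points per design

# Each design is an increasing tuple of measurement times.
designs = list(combinations(candidate_times, points_per_design))

print(len(designs), comb(len(candidate_times), points_per_design))
print(designs[0], designs[-1])
```

A design space of a few thousand candidates is small enough to evaluate exhaustively, so each design can be scored individually rather than searched for with an optimization routine.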
We examine each of these 3003 experimental designs for three particle–cell scenarios, namely low, intermediate and high , which are introduced in figure 6. We later discuss an overall optimal design for an unknown using data from all three particle–cell scenarios. For each particle–cell scenario, we report the performance of a particular design using a normalized mean utility based on an analysis of 20 synthetic datasets. This mean utility rewards precise estimates of the statistical hyperparameters and is normalized such that the maximum mean utility corresponds to one for each scenario. Throughout this discussion, we highlight nine designs (figure 9A; electronic supplementary material, table S4): four simple and naive designs for illustrative purposes (early, middle, late, uniform); the time points that we consider in the previous analysis in this study which are those which have been used to collect experimental data previously (exp ); an optimal design for each of the three particle–cell scenarios (optimal low , optimal intermediate , optimal high ); and an overall optimal design (optimal unknown ).
Figure 9.
Comparison of 3003 experimental designs across three particle–cell scenarios. (A) Schematic for selected designs. (B–D) Normalized mean utility for selected designs (bars) with a summary for all designs (box plot). Results correspond to data generated for particle–cell scenarios with (B) low , (C) intermediate and (D) high . (E) Distribution of time points for the top 100 designs for each of the three particle–cell scenarios and for the top 100 designs based on average rank across all three experimental designs. The time point is denoted .
In figure 9B, we present the normalized mean utility function for a scenario with low . We observe that the optimal low design is h. In figure 9E, we present the distribution of time points for the top 100 designs for low , which demonstrates that later time points correspond to higher utilities for these data. These results agree with expectations. For low , we expect to obtain similar information for across any choice of time points because increases approximately linearly throughout the experiment, but we expect to obtain more information about at later times as is increasing throughout the experiment. This is consistent with the observation that the early design is particularly poor (ranked 3003 out of the 3003 designs) and with the observation that the exp design, which incorporates multiple early time points, is also sub-optimal (ranked 1808 out of the 3003 designs). Further details on the normalized utilities and rankings for the nine selected designs for all scenarios are presented in electronic supplementary material, table S4.
The optimal intermediate r design comprises h (figure 9A). The top 100 designs for intermediate follow a similar structure (figure 9E). The structure of these top designs is expected since we seek to capture the prominent roles of at early time and at late time.
The optimal high r design corresponds to measurements throughout the experiment ( h) (figure 9A). The top 100 designs for high follow a similar structure (figure 9E). We expect this structure for these top designs since data for high are consistent across all time points. Many designs result in a similar normalized mean utility with the lowest normalized mean utility for high equal to whereas the lowest normalized mean utility for low and intermediate are and , respectively (figure 9B–D).
All optimal designs maximize the normalized mean utility for their respective scenarios. Furthermore, the optimal low r and optimal int r designs correspond to outliers in the 3003 designs with respect to the normalized mean utility (figure 9B–D). However, a design that is optimal in one scenario is not necessarily optimal for a different scenario. For example, the optimal low r design is a poor design for intermediate (ranked out of designs) and high (ranked out of designs).
Typically particle–cell interaction experiments are performed to infer the particle–cell association rate, so we often do not know whether the experiment corresponds to a scenario with low, intermediate or high . Therefore, we compute an overall optimal design for an unknown based on an average of the ranking of designs across the three particle–cell scenarios. This optimal unknown design has the highest average ranking, which is , and comprises measurements throughout the experiment ( h) (figure 9A). This design corresponds to rankings of , and for data with low, intermediate and high , respectively. These rankings exceed 90%, 69% and 99% of the 3003 designs for low, intermediate and high , respectively (figure 9B–D).
The overall top 100 designs for unknown across all data follow a similar structure to the optimal unknown r design (figure 9E). The optimal low r, optimal intermediate r and optimal high r designs are not in the overall top 100 designs; they are at positions 2430, 855 and 1000, respectively. The exp r design is positioned 1430 in the overall ranking, with an average ranking of 1438 due to rankings of 1808, 565 and 1942 for data with low, intermediate and high , respectively. These results suggest that while the design that has been previously used to collect data is in the top of designs for scenarios with intermediate , there are 1429 designs better suited to explore scenarios when is unknown.
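The average-rank criterion can be sketched in a few lines. In the code below, the utilities for four designs across three scenarios are hypothetical toy values and the function names are our own; only the final line uses numbers reported above, to confirm that rankings of 1808, 565 and 1942 correspond to an average ranking of 1438.

```python
def rank_designs(utilities):
    """Return the rank of each design within one scenario (1 = highest utility)."""
    order = sorted(range(len(utilities)), key=lambda i: -utilities[i])
    ranks = [0] * len(utilities)
    for position, design_index in enumerate(order, start=1):
        ranks[design_index] = position
    return ranks


def average_ranks(utilities_by_scenario):
    """Average the rank of each design across all scenarios."""
    all_ranks = [rank_designs(u) for u in utilities_by_scenario]
    n_designs = len(all_ranks[0])
    return [
        sum(ranks[i] for ranks in all_ranks) / len(all_ranks)
        for i in range(n_designs)
    ]


# Hypothetical normalized mean utilities for four designs (columns)
# in three particle-cell scenarios (rows).
utilities_by_scenario = [
    [1.0, 0.2, 0.6, 0.8],  # low scenario
    [0.1, 1.0, 0.7, 0.8],  # intermediate scenario
    [0.3, 0.5, 1.0, 0.9],  # high scenario
]

averages = average_ranks(utilities_by_scenario)
best_overall = min(range(len(averages)), key=averages.__getitem__)
print(averages, best_overall)

# Check against the rankings reported for the previously used design.
print(round((1808 + 565 + 1942) / 3))
```

In this toy example the design with the best average rank is not the best design in any single scenario, which mirrors the observation that scenario-specific optimal designs can perform poorly when the scenario is unknown.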
5. Conclusion
In this study, we use mechanistic mathematical modelling, statistical modelling and techniques from optimal Bayesian design to examine routinely collected flow cytometry measurements from nano-engineered particle–cell interaction experiments. We exploit previously overlooked heterogeneity in routine measurements that form the particle-only control data, cell-only control data and time course flow cytometry data using a novel heterogeneous ordinary differential equation-based mathematical model. This approach allows us to reveal and quantify heterogeneity and associated uncertainty in key biological parameters that characterize particle–cell interaction experiments. The approach also allows us to generate predictions of key quantities that are challenging to observe by experimentation alone, including the time evolution of the number of particles per cell. While many studies quantify the particle–cell interactions by assessing the percentage of cells positive for the presence of particles, using our tools we can generalize this metric to estimate how many particles are present in each cell. Obtaining such insights at scale with the latest experimental technology is challenging. Previous methods that assume homogeneous cell populations are also unable to provide such insights.
We apply and verify that our new tools perform as desired in various particle–cell scenarios. Classifying these scenarios using flow cytometry data without modelling is challenging. Instead, we identify different experimental regimes using predictions of the time evolution of the number of particles per cell from our mathematical modelling approach. To identify model parameters we seek a regime where we obtain multiple measurements of the initial increase in the number of particles per cell and multiple measurements of the number of particles per cell close to carrying capacity. Scenarios that do not include either of these types of measurements result in increased uncertainty and lack of identifiability for the particle–cell association rate and carrying capacity-type parameter, respectively.
Applying our tools to experimental data with particles that range from 150 nm to 633 nm incubated with a human leukaemia monocytic cell line, we estimate key biological parameters and make predictions that capture the heterogeneity in the data. We demonstrate that these methods improve on previous methods that assume homogeneous cell populations and those that rely on pre-processing calculations that can produce non-physical quantities. These previous methods also use the method of least squares and iterative techniques to obtain point estimates of key parameters that do not capture the heterogeneity in the data. While the uncertainty in these point estimates could be estimated and propagated forward to generate predictions [62], it is not clear what the most appropriate measurement/observation error model should be. The approach that we employ, with a heterogeneous mathematical model and approximate Bayesian computation, circumvents this challenge and, among many advantages, explicitly captures heterogeneity.
Model predictions for the flow cytometry data demonstrate reasonable agreement with the corresponding experimentally measured data for multiple ABC distance functions. In particular, predictions for the number of particles per cell from different ABC distance functions closely agree about the median. As we verify that our method recovers known quantities for synthetic data, we attribute differences to model misspecification [63,64], which is nearly impossible to avoid in such a complex system as particle–cell interactions. To explore model misspecification, one could revisit the assumptions of the mathematical models and develop more sophisticated models, likely with more parameters, but this may lead to further parameter identifiability challenges. One could also revisit the assumption that we capture all measurement noise in the flow cytometry time course data by sampling from the experimental control data, and allow for additional measurement noise by extending our ABC method to an exact ABC inference technique [65]. In the first instance, one can start to observe the impact of model misspecification by generating results using multiple ABC distance functions. This may be particularly insightful when seeking estimates of the statistical hyperparameters and predictions for the upper tails of the distribution for the number of particles per cell.
Using optimal experimental design techniques with our novel model, we identify optimal time points for multiple particle–cell scenarios. This analysis systematically examines 3003 designs and rewards designs that obtain precise estimates of the parameters that characterize particle–cell interactions. We focus on time points as they are simple to vary experimentally and vital for accurate parameter estimates. We choose the set of possible time points based on previous data and practical considerations regarding the frequency of measurements with manual experimental procedures. Automated experimental procedures would facilitate more frequent measurements. When early time measurements do not capture the initial increase in particles per cell, increasing the temporal resolution at early times would facilitate a transition from a regime of partial identifiability to a regime where parameters are practically identifiable. When late time measurements of the number of particles per cell are not close to carrying capacity, extending the experimental duration would also facilitate a transition to a regime where model parameters are practically identifiable. This would, however, introduce additional complications; for example, cell proliferation and media changes are expected to be more influential. The methodology that we present here is well suited to incorporate such additional mechanisms.
Overall, our results suggest that it is essential to consider heterogeneity even when interpreting the simplest of controlled in vitro nano-engineered particle–cell interaction experiments. We quantify this heterogeneity in the particle–cell association rate and carrying capacity, but we do not claim to know the mechanisms that drive this heterogeneity. Exploring the mechanisms that give rise to such heterogeneity would be interesting. The methodology we present in this study is well suited to analyse other combinations of particles and cell types in well-mixed media, such as with suspension and adherent cell lines subject to stirring. The methodology is also suitable for extensions where spatial gradients of particles form in the media, such as for adherent cells in unstirred media; this would involve extending the mathematical model to a partial differential equation-based model that incorporates the role of particle transport [19,35]. We also contribute to the growing literature demonstrating the advantages of working with all flow cytometry data rather than summary statistics, such as the median fluorescence intensity [46–50]. We show that experimental data can exhibit dramatic cell–cell variability in particle–cell association rates, carrying capacity and associated particles. The influence of these findings on target applications, such as medical diagnostics, biomedical imaging and targeted drug delivery, is worthy of further investigation.
Acknowledgements
This research was supported by The University of Melbourne’s Research Computing Services and the Petascale Campus Initiative. We thank Dr David J. Warne and Dr Alexander P. Browning for helpful discussions. We thank the two anonymous reviewers for their helpful comments.
Contributor Information
Ryan J. Murphy, Email: ryan.murphy@unisa.edu.au.
Matthew Faria, Email: matthew.faria@unimelb.edu.au.
James M. Osborne, Email: jmosborne@unimelb.edu.au.
Stuart T. Johnston, Email: stuart.johnston@unimelb.edu.au.
Ethics
This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility
We analyse previously published data from [19] available on FigShare. This includes fluorescence data from Flow Cytometry Standard (FCS) files and experimental details from INI files. Raw FCS data files were converted to CSV using https://floreada.io/. Key computer code implemented in Julia is freely and publicly available on a Zenodo repository [66].
Electronic supplementary material is available online [67].
Declaration of AI use
We have not used AI-assisted technologies in creating this article.
Authors’ contributions
R.J.M.: conceptualization, formal analysis, investigation, methodology, software, validation, visualization, writing—original draft, writing—review and editing; M.F.: conceptualization, funding acquisition, methodology, supervision, writing—review and editing; J.M.O.: conceptualization, funding acquisition, methodology, supervision, writing—review and editing; S.T.J.: conceptualization, funding acquisition, methodology, supervision, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
S.T.J., M.F. and J.M.O. are supported by an Australian Research Council Discovery Project (DP230100380). J.M.O. is supported by an Australian Research Council Future Fellowship (FT230100352). We acknowledge the assistance of the CASS Foundation through a Medicine/Science grant.
References
- 1. Bobo D, Robinson KJ, Islam J, Thurecht KJ, Corrie SR. 2016. Nanoparticle-based medicines: a review of FDA-approved materials and clinical trials to date. Pharm. Res. 33, 2373–2387. (doi:10.1007/s11095-016-1958-5)
- 2. Gao Y, Chen Y, Ji X, He X, Yin Q, Zhang Z, Shi J, Li Y. 2011. Controlled intracellular release of doxorubicin in multidrug-resistant cancer cells by tuning the shell-pore sizes of mesoporous silica nanoparticles. ACS Nano 5, 9788–9798. (doi:10.1021/nn2033105)
- 3. Mitchell MJ, Billingsley MM, Haley RM, Wechsler ME, Peppas NA, Langer R. 2021. Engineering precision nanoparticles for drug delivery. Nat. Rev. Drug Discov. 20, 101–124. (doi:10.1038/s41573-020-0090-8)
- 4. Yan Y, Johnston APR, Dodds SJ, Kamphuis MMJ, Ferguson C, Parton RG, Nice EC, Heath JK, Caruso F. 2010. Uptake and intracellular fate of disulfide-bonded polymer hydrogel capsules for doxorubicin delivery to colorectal cancer cells. ACS Nano 4, 2928–2936. (doi:10.1021/nn100173h)
- 5. Baetke SC, Lammers T, Kiessling F. 2015. Applications of nanoparticles for diagnosis and therapy of cancer. Br. J. Radiol. 88, 20150207. (doi:10.1259/bjr.20150207)
- 6. Han X, Xu K, Taratula O, Farsad K. 2019. Applications of nanoparticles in biomedical imaging. Nanoscale 11, 799–819. (doi:10.1039/c8nr07769j)
- 7. Lee JH, et al. 2007. Artificially engineered magnetic nanoparticles for ultra-sensitive molecular imaging. Nat. Med. 13, 95–99. (doi:10.1038/nm1467)
- 8. Shi J, Kantoff PW, Wooster R, Farokhzad OC. 2017. Cancer nanomedicine: progress, challenges and opportunities. Nat. Rev. Cancer 17, 20–37. (doi:10.1038/nrc.2016.108)
- 9. Desai N. 2012. Challenges in development of nanoparticle-based therapeutics. AAPS J. 14, 282–295. (doi:10.1208/s12248-012-9339-4)
- 10. Johnston ST, Faria M, Crampin EJ. 2021. Understanding nano-engineered particle–cell interactions: biological insights from mathematical models. Nanoscale Adv. 3, 2139–2156. (doi:10.1039/d0na00774a)
- 11. Verbeke R, Lentacker I, De Smedt SC, Dewitte H. 2021. The dawn of mRNA vaccines: the COVID-19 case. J. Control. Release 333, 511–520. (doi:10.1016/j.jconrel.2021.03.043)
- 12. Hou X, Zaks T, Langer R, Dong Y. 2021. Lipid nanoparticles for mRNA delivery. Nat. Rev. Mater. 6, 1078–1094. (doi:10.1038/s41578-021-00358-0)
- 13. Canton I, Battaglia G. 2012. Endocytosis at the nanoscale. Chem. Soc. Rev. 41, 2718–2739. (doi:10.1039/c2cs15309b)
- 14. Donahue ND, Acar H, Wilhelm S. 2019. Concepts of nanoparticle cellular uptake, intracellular trafficking, and kinetics in nanomedicine. Adv. Drug Deliv. Rev. 143, 68–96. (doi:10.1016/j.addr.2019.04.008)
- 15. Nel AE, Mädler L, Velegol D, Xia T, Hoek EMV, Somasundaran P, Klaessig F, Castranova V, Thompson M. 2009. Understanding biophysicochemical interactions at the nano–bio interface. Nat. Mater. 8, 543–557. (doi:10.1038/nmat2442)
- 16. Ding HM, Ma YQ. 2018. Computational approaches to cell–nanomaterial interactions: keeping balance between therapeutic efficiency and cytotoxicity. Nanoscale Horiz. 3, 6–27. (doi:10.1039/C7NH00138J)
- 17. Augustine R, Hasan A, Primavera R, Wilson RJ, Thakor AS, Kevadiya BD. 2020. Cellular uptake and retention of nanoparticles: insights on particle properties and interaction with cellular components. Mater. Today Commun. 25, 101692. (doi:10.1016/j.mtcomm.2020.101692)
- 18. Treuel L, Jiang X, Nienhaus GU. 2013. New views on cellular uptake and trafficking of manufactured nanoparticles. J. R. Soc. Interface 10, 20120939. (doi:10.1098/rsif.2012.0939)
- 19. Faria M, Noi KF, Dai Q, Björnmalm M, Johnston ST, Kempe K, Caruso F, Crampin EJ. 2019. Revisiting cell–particle association in vitro: a quantitative method to compare particle performance. J. Control. Release 307, 355–367. (doi:10.1016/j.jconrel.2019.06.027)
- 20. Ivask A, Mitchell A, Malysheva A, Voelcker N, Lombi E. 2018. Methodologies and approaches for the analysis of cell–nanoparticle interactions. Wiley Interdiscip. Rev. Nanomed. Nanobiotechnol. 10, e1486. (doi:10.1002/wnan.1486)
- 21. Varela JA, Bexiga MG, Åberg C, Simpson JC, Dawson KA. 2012. Quantifying size-dependent interactions between fluorescently labeled polystyrene nanoparticles and mammalian cells. J. Nanobiotechnol. 10, 39. (doi:10.1186/1477-3155-10-39)
- 22. Vtyurina N, Åberg C, Salvati A. 2021. Imaging of nanoparticle uptake and kinetics of intracellular trafficking in individual cells. Nanoscale 13, 10436–10446. (doi:10.1039/d1nr00901j)
- 23. Yang B, Richards CJ, Gandek TB, de Boer I, Aguirre-Zuazo I, Niemeijer E, Åberg C. 2023. Following nanoparticle uptake by cells using high-throughput microscopy and the deep-learning based cell identification algorithm Cellpose. Front. Nanotechnol. 5, 1181362. (doi:10.3389/fnano.2023.1181362)
- 24. Shin H, Kwak M, Lee TG, Lee JY. 2020. Quantifying the level of nanoparticle uptake in mammalian cells using flow cytometry. Nanoscale 12, 15743–15751. (doi:10.1039/d0nr01627f)
- 25. Salvati A, et al. 2018. Quantitative measurement of nanoparticle uptake by flow cytometry illustrated by an interlaboratory comparison of the uptake of labelled polystyrene nanoparticles. NanoImpact 9, 42–50. (doi:10.1016/j.impact.2017.10.004)
- 26. Åberg C. 2021. Kinetics of nanoparticle uptake into and distribution in human cells. Nanoscale Adv. 3, 2196–2212. (doi:10.1039/d0na00716a)
- 27. Hinderliter PM, Minard KR, Orr G, Chrisler WB, Thrall BD, Pounds JG, Teeguarden JG. 2010. ISDD: a computational model of particle sedimentation, diffusion and target cell dosimetry for in vitro toxicity studies. Part. Fibre Toxicol. 7, 1–20. (doi:10.1186/1743-8977-7-36)
- 28. Johnston ST, Faria M, Crampin EJ. 2018. An analytical approach for quantifying the influence of nanoparticle polydispersity on cellular delivered dose. J. R. Soc. Interface 15, 20180364. (doi:10.1098/rsif.2018.0364)
- 29. Cui J, et al. 2016. A framework to account for sedimentation and diffusion in particle–cell interactions. Langmuir 32, 12394–12402. (doi:10.1021/acs.langmuir.6b01634)
- 30. Thomas DG, et al. 2018. ISD3: a particokinetic model for predicting the combined effects of particle sedimentation, diffusion and dissolution on cellular dosimetry for in vitro systems. Part. Fibre Toxicol. 15, 1–22. (doi:10.1186/s12989-018-0243-7)
- 31. DeLoid GM, Cohen JM, Pyrgiotakis G, Pirela SV, Pal A, Liu J, Srebric J, Demokritou P. 2015. Advanced computational modeling for in vitro nanomaterial dosimetry. Part. Fibre Toxicol. 12, 32. (doi:10.1186/s12989-015-0109-1)
- 32. Browning AP, Ansari N, Drovandi C, Johnston APR, Simpson MJ, Jenner AL. 2022. Identifying cell-to-cell variability in internalization using flow cytometry. J. R. Soc. Interface 19, 20220019. (doi:10.1098/rsif.2022.0019)
- 33. Richards DM, Endres RG. 2017. How cells engulf: a review of theoretical approaches to phagocytosis. Rep. Prog. Phys. 80, 126601. (doi:10.1088/1361-6633/aa8730)
- 34. Richards CJ, Melero Martinez P, Roos WH, Åberg C. 2024. High-throughput approach to measure number of nanoparticles associated with cells: size dependence and kinetic parameters. Nanoscale Adv. 7, 185–195. (doi:10.1039/d4na00589a)
- 35. Johnston ST, Faria M, Crampin EJ. 2020. Isolating the sources of heterogeneity in nano-engineered particle–cell interactions. J. R. Soc. Interface 17, 20200221. (doi:10.1098/rsif.2020.0221)
- 36. Dowling CV, Cevaal PM, Faria M, Johnston ST. 2022. On predicting heterogeneity in nanoparticle dosage. Math. Biosci. 354, 108928. (doi:10.1016/j.mbs.2022.108928)
- 37. Ware MJ, Godin B, Singh N, Majithia R, Shamsudeen S, Serda RE, Meissner KE, Rees P, Summers HD. 2014. Analysis of the influence of cell heterogeneity on nanoparticle dose response. ACS Nano 8, 6693–6700. (doi:10.1021/nn502356f)
- 38. Rees P, Wills JW, Brown MR, Barnes CM, Summers HD. 2019. The origin of heterogeneous nanoparticle uptake by cells. Nat. Commun. 10, 2341. (doi:10.1038/s41467-019-10112-4)
- 39. Faria M, Johnston ST, Mitchell AJ, Crampin E, Caruso F. 2021. Bio-nano science: better metrics would accelerate progress. Chem. Mater. 33, 7613–7619. (doi:10.1021/acs.chemmater.1c02369)
- 40. Cevaal PM, Roche M, Lewin SR, Caruso F, Faria M. 2022. Experimental quantification of interactions between drug delivery systems and cells in vitro: a guide for preclinical nanomedicine evaluation. J. Vis. Exp. 187, e64259. (doi:10.3791/64259)
- 41. Gottstein C, Wu G, Wong BJ, Zasadzinski JA. 2013. Precise quantification of nanoparticle internalization. ACS Nano 7, 4933–4945. (doi:10.1021/nn400243d)
- 42. Beaumont MA, Zhang W, Balding DJ. 2002. Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035. (doi:10.1093/genetics/162.4.2025)
- 43. Sisson SA, Fan Y, Tanaka MM. 2007. Sequential Monte Carlo without likelihoods. Proc. Natl Acad. Sci. USA 104, 1760–1765. (doi:10.1073/pnas.0607208104)
- 44. Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. 2009. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202. (doi:10.1098/rsif.2008.0172)
- 45. Vo BN, Drovandi CC, Pettitt AN, Pettet GJ. 2015. Melanoma cell colony expansion parameters revealed by approximate Bayesian computation. PLoS Comput. Biol. 11, e1004635. (doi:10.1371/journal.pcbi.1004635)
- 46. Hasenauer J, Waldherr S, Doszczak M, Radde N, Scheurich P, Allgöwer F. 2011. Identification of models of heterogeneous cell populations from population snapshot data. BMC Bioinform. 12, 125. (doi:10.1186/1471-2105-12-125)
- 47. Lambert B, Gavaghan DJ, Tavener SJ. 2021. A Monte Carlo method to estimate cell population heterogeneity from cell snapshot data. J. Theor. Biol. 511, 110541. (doi:10.1016/j.jtbi.2020.110541)
- 48. Augustin D, Lambert B, Wang K, Walz AC, Robinson M, Gavaghan D. 2023. Filter inference: a scalable nonlinear mixed effects inference approach for snapshot time series data. PLoS Comput. Biol. 19, e1011135. (doi:10.1371/journal.pcbi.1011135)
- 49. Drovandi C, Lawson B, Jenner AL, Browning AP. 2022. Population calibration using likelihood-free Bayesian inference. (https://arxiv.org/abs/2202.01962)
- 50. Loos C, Moeller K, Fröhlich F, Hucho T, Hasenauer J. 2018. A hierarchical, data-driven approach to modeling single-cell populations predicts latent causes of cell-to-cell variability. Cell Syst. 6, 593–603. (doi:10.1016/j.cels.2018.04.008)
- 51. Altschuler SJ, Wu LF. 2010. Cellular heterogeneity: do differences make a difference? Cell 141, 559–563. (doi:10.1016/j.cell.2010.04.033)
- 52. Waldherr S. 2018. Estimation methods for heterogeneous cell population models in systems biology. J. R. Soc. Interface 15, 20180530. (doi:10.1098/rsif.2018.0530)
- 53. Price DJ, Bean NG, Ross JV, Tuke J. 2016. On the efficient determination of optimal Bayesian experimental designs using ABC: a case study in optimal observation of epidemics. J. Stat. Plan. Infer. 172, 1–15. (doi:10.1016/j.jspi.2015.12.008)
- 54. Drovandi CC, Pettitt AN. 2013. Bayesian experimental design for models with intractable likelihoods. Biometrics 69, 937–948. (doi:10.1111/biom.12081)
- 55. Ryan EG, Drovandi CC, McGree JM, Pettitt AN. 2016. A review of modern computational algorithms for Bayesian optimal design. Int. Stat. Rev. 84, 128–154. (doi:10.1111/insr.12107)
- 56. Turnbull T, et al. 2019. Cross-correlative single-cell analysis reveals biological mechanisms of nanoparticle radiosensitization. ACS Nano 13, 5077–5090. (doi:10.1021/acsnano.8b07982)
- 57. Limpert E, Stahel WA, Abbt M. 2001. Log-normal distributions across the sciences: keys and clues: on the charms of statistics, and how mechanical models resembling gambling machines offer a link to a handy way to characterize log-normal distributions, which can provide deeper insight into variability and probability—normal or log-normal: that is the question. Bioscience 51, 341–352. (doi:10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2)
- 58. Warne DJ, Baker RE, Simpson MJ. 2017. Optimal quantification of contact inhibition in cell populations. Biophys. J. 113, 1920–1924. (doi:10.1016/j.bpj.2017.09.016)
- 59. Simpson MJ, Browning AP, Warne DJ, Maclaren OJ, Baker RE. 2022. Parameter identifiability and model selection for sigmoid population growth models. J. Theor. Biol. 535, 110998. (doi:10.1016/j.jtbi.2021.110998)
- 60. Simpson MJ, Maclaren OJ. 2024. Predictions using poorly identified mathematical models. Bull. Math. Biol. 86, 80. (doi:10.1007/s11538-024-01294-0)
- 61. Johnston ST, Faria M. 2022. Equation learning to identify nano-engineered particle–cell interactions: an interpretable machine learning approach. Nanoscale 14, 16502–16515. (doi:10.1039/d2nr04668g)
- 62. Murphy RJ, Maclaren OJ, Simpson MJ. 2024. Implementing measurement error models in a likelihood-based framework for estimation, identifiability analysis, and prediction in the life sciences. J. R. Soc. Interface 21, 20230402. (doi:10.1098/rsif.2023.0402)
- 63. Frazier DT, Robert CP, Rousseau J. 2020. Model misspecification in approximate Bayesian computation: consequences and diagnostics. J. R. Stat. Soc. Ser. B Stat. Methodol. 82, 421–444. (doi:10.1111/rssb.12356)
- 64. Frazier DT, Drovandi C, Loaiza-Maya R. 2020. Robust approximate Bayesian computation: an adjustment approach. (https://arxiv.org/abs/2008.04099)
- 65. Schälte Y, Hasenauer J. 2020. Efficient exact inference for dynamical systems with noisy measurements using sequential approximate Bayesian computation. Bioinformatics 36, i551–i559. (doi:10.1093/bioinformatics/btaa397)
- 66. Murphy RJ. 2025. Code for ‘Quantifying biological heterogeneity in nano-engineered particle-cell interaction experiments’. Zenodo 20230402. (doi:10.5281/zenodo.16416414)
- 67. Murphy RJ, Faria M, Osborne J, Johnston ST. 2025. Supplementary material from: Quantifying biological heterogeneity in nano-engineered particle-cell interaction experiments. Figshare. (doi:10.6084/m9.figshare.c.7963817)
Data Availability Statement
We analyse previously published data from [19], available on Figshare. These data comprise fluorescence measurements from Flow Cytometry Standard (FCS) files and experimental details from INI files. Raw FCS data files were converted to CSV using https://floreada.io/. Key computer code, implemented in Julia, is freely and publicly available in a Zenodo repository [66].
Electronic supplementary material is available online [67].