PLoS One. 2024 Feb 22;19(2):e0294015. doi: 10.1371/journal.pone.0294015

A wall-time minimizing parallelization strategy for approximate Bayesian computation

Emad Alamoudi 1,#, Felipe Reck 1,#, Nils Bundgaard 2, Frederik Graw 2,3,4, Lutz Brusch 5, Jan Hasenauer 1,6,7,*, Yannik Schälte 1,6,7
Editor: Abel C. H. Chen
PMCID: PMC10883530  PMID: 38386671

Abstract

Approximate Bayesian computation (ABC) is a widely applicable and popular approach to estimating unknown parameters of mechanistic models. As ABC analyses are computationally expensive, parallelization on high-performance infrastructure is often necessary. However, existing parallelization strategies leave computing resources idle at times and thus do not use them optimally. We present look-ahead scheduling, a wall-time minimizing parallelization strategy for ABC sequential Monte Carlo algorithms that avoids idle times of computing units by preemptively sampling subsequent generations, thereby utilizing all available resources. The strategy can be integrated with, e.g., adaptive distance function and summary statistic selection schemes, which is essential in practice. Our key contribution is the theoretical assessment of the preemptive sampling strategy and the proof of its unbiasedness. Complementarily, we provide an implementation and evaluate the strategy on different problems and numbers of parallel cores, showing speed-ups of typically 10-20% and up to 50% compared to the best established approach, with some variability. The proposed strategy thus improves the cost and run-time efficiency of ABC methods on high-performance infrastructure.

Introduction

Mechanistic models are important tools in systems biology and many other research fields to describe and understand the mechanisms underlying systemic behavior [1, 2]. Usually, such models have unknown parameters that need to be estimated by comparing model outputs to observed data [3]. For complex stochastic models, in particular multi-scale models used to describe the complex dynamics of multi-cellular systems, evaluating the likelihood of data given parameters, however, quickly becomes computationally infeasible [4, 5]. For this reason, simulation-based methods that circumvent likelihood evaluation have been developed, such as approximate Bayesian computation (ABC), popular for its simplicity and wide applicability [6, 7].

ABC generates samples from an approximation to the true Bayesian posterior distribution. While asymptotically exact, a known disadvantage of ABC is its computational cost: it often requires the simulation of hundreds of thousands to millions of artificial datasets. Therefore, methods to efficiently explore the search space have been developed [8]. In particular, ABC is frequently combined with a sequential Monte Carlo scheme (ABC-SMC), which over several generations successively refines the posterior approximation via importance sampling while maintaining high acceptance rates [9, 10]. Furthermore, in ABC-SMC the sampling for each generation can be parallelized, enabling the use of high-performance computing (HPC) infrastructure. This has in recent years enabled tackling increasingly complex problems via ABC [11–14].

It would be desirable if available computational resources were perfectly exploited at all times, to minimize both the wall-time until results become available to the researcher and the cost associated with allocated resources. However, established parallelization strategies for distributing ABC-SMC work over a set of workers leave resources idle at times and thus fall short of this aim. The parallelization strategy used in most established HPC-ready ABC implementations is static scheduling (STAT), which defines exactly as many parallel tasks as accepted particles are required [15, 16]. While it minimizes the active compute time and consumed energy, a substantial number of workers typically becomes idle towards the end of each generation. Dynamic scheduling (DYN) mitigates this problem and reduces the overall wall-time by continuing sampling on all workers until sufficiently many particles have been accepted, which was shown to reduce the wall-time substantially [17]. However, also with this strategy, workers become idle at the end of each generation, waiting until all simulations have finished.

A natural strategy to circumvent idle time is to already start sampling for the next generation, given partial information about the current one. Yet, it is not obvious how particles should then be accepted or weighted, and whether this would indeed improve efficiency. In this manuscript, we describe an ABC-SMC parallelization strategy for multi-core and distributed systems, called look-ahead scheduling (LA), which avoids idle time. We show that appropriate sample reweighting yields an unbiased Monte Carlo sample. We provide an HPC-ready implementation and test the method on various problems. Moreover, we show that the strategy can be integrated with adaptive algorithms for, e.g., summary statistics, distance functions, or acceptance thresholds.

Methods

ABC

We consider a mechanistic model described via a generative process of simulating data $y \sim \pi(y|\theta) \in \mathbb{R}^{n_y}$ for parameters $\theta \in \mathbb{R}^{n_\theta}$. Given observed data $y_{\text{obs}}$, in Bayesian inference the likelihood $\pi(y_{\text{obs}}|\theta)$ is combined with prior information $\pi(\theta)$ to the posterior distribution $\pi(\theta|y_{\text{obs}}) \propto \pi(y_{\text{obs}}|\theta) \cdot \pi(\theta)$. We assume that evaluating the likelihood is computationally infeasible, but that it is possible to simulate data $y \sim \pi(y|\theta)$ from the model. Classical ABC then consists of three steps: first, sample parameters $\theta \sim \pi(\theta)$; second, simulate data $y \sim \pi(y|\theta)$; third, accept $\theta$ if $d(y, y_{\text{obs}}) \le \varepsilon$, for a distance metric $d: \mathbb{R}^{n_y} \times \mathbb{R}^{n_y} \to \mathbb{R}_{\ge 0}$ and acceptance threshold $\varepsilon > 0$. This is repeated until sufficiently many particles, say $N$, are accepted. The population of accepted particles constitutes a sample from an approximation of the posterior distribution,

$$\pi_{\text{ABC},\varepsilon}(\theta|y_{\text{obs}}) \propto \int I[d(y, y_{\text{obs}}) \le \varepsilon]\,\pi(y|\theta)\,dy \cdot \pi(\theta). \tag{1}$$

Under mild assumptions, $\pi_{\text{ABC},\varepsilon}(\theta|y_{\text{obs}})$ converges to the actual posterior $\pi(\theta|y_{\text{obs}})$ as $\varepsilon \to 0$ [18, 19]. Commonly, ABC operates not directly on the measured data, but on summary statistics thereof, capturing relevant information in a low-dimensional representation [20]. Here, for notational simplicity we assume that $y$ already incorporates summary statistics, if applicable.
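To make the three steps concrete, the following is a minimal sketch of such a rejection sampler (in Python, matching the tooling used later; the toy Gaussian problem and all function names are illustrative, not taken from the paper):

```python
import numpy as np

def rejection_abc(n_accept, sample_prior, simulate, distance, y_obs, eps):
    """Minimal ABC rejection sampler: repeat (1) prior sampling,
    (2) simulation, (3) distance-based acceptance, until n_accept
    particles are accepted."""
    accepted = []
    while len(accepted) < n_accept:
        theta = sample_prior()          # (1) draw parameters from the prior
        y = simulate(theta)             # (2) simulate data from the model
        if distance(y, y_obs) <= eps:   # (3) accept if close to the data
            accepted.append(theta)
    return accepted

# Toy usage: infer the mean of a Gaussian from a single observation.
rng = np.random.default_rng(0)
sample = rejection_abc(
    n_accept=100,
    sample_prior=lambda: rng.uniform(-5, 5),
    simulate=lambda th: rng.normal(th, 1.0),
    distance=lambda y, y0: abs(y - y0),
    y_obs=2.0,
    eps=0.2,
)
```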

ABC-SMC

The vanilla ABC formulation exhibits a trade-off between the reduction of the approximation error induced by $\varepsilon$ and high acceptance rates. Thus, ABC is frequently combined with a sequential Monte Carlo scheme (ABC-SMC) [21, 22]. In ABC-SMC, a series of particle populations $\mathcal{P}_t = \{(\theta_t^i, w_t^i)\}_{i \le N}$ is generated, constituting samples of successively better approximations $\pi_{\text{ABC},\varepsilon_t}(\theta|y_{\text{obs}})$ of the posterior, for generations $t = 1, \ldots, n_t$, with acceptance thresholds $\varepsilon_t > \varepsilon_{t+1}$. In the first generation ($t = 1$), particles are sampled directly from the prior, $g_1(\theta) = \pi(\theta)$. In later generations ($t > 1$), particles are sampled from proposal distributions $g_t(\theta) \gg \pi(\theta)$ based on the last generation's accepted weighted population $\mathcal{P}_{t-1}$, e.g. via a kernel density estimate. The importance weights $w_t^i$ are the Radon-Nikodym derivatives $w_t(\theta) = \pi(\theta)/g_t(\theta)$. This is precisely such that the weighted parameters are samples from the distribution

$$w_t(\theta) \cdot \int I[d(y, y_{\text{obs}}) \le \varepsilon_t]\,\pi(y|\theta)\,dy \cdot g_t(\theta) = \int I[d(y, y_{\text{obs}}) \le \varepsilon_t]\,\pi(y|\theta)\,dy \cdot \pi(\theta), \tag{2}$$

i.e. the target distribution (1) for ε = εt.

Common proposal distributions first select an accepted parameter from the last generation and then perturb it, in which case $g_t$ takes the form $g_t(\theta) = \sum_{i=1}^N w_{t-1}^i K(\theta|\theta_{t-1}^i) / \sum_{i=1}^N w_{t-1}^i$, with e.g. $K(\theta|\theta_{t-1}^i) = \mathcal{N}(\theta|\theta_{t-1}^i, \Sigma_{t-1})$ a normal distribution with mean $\theta_{t-1}^i$ and covariance matrix $\Sigma_{t-1}$. The performance of ABC-SMC algorithms relies heavily on the quality of the proposal distribution, i.e. its ability to efficiently explore the parameter space. Methods that adapt to the problem structure, e.g. basing $\Sigma_{t-1}$ on the previous generation's weighted sample covariance matrix and potentially localizing around $\theta^i$, have shown superior performance [23–25].
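As an illustration, a single draw from such a kernel-based proposal, together with its importance weight, could look as follows (a sketch; `prior_pdf` and the particle arrays stand in for the quantities defined above):

```python
import numpy as np
from scipy.stats import multivariate_normal

def propose_and_weight(prior_pdf, thetas_prev, weights_prev, cov, rng):
    """One draw from g_t: pick a previous particle by weight, perturb it
    with a Gaussian kernel K, and return the non-normalized importance
    weight w_t(theta) = pi(theta) / g_t(theta)."""
    w = np.asarray(weights_prev, dtype=float)
    w = w / w.sum()
    thetas_prev = np.asarray(thetas_prev, dtype=float).reshape(len(w), -1)
    # select a parent particle and perturb it
    i = rng.choice(len(thetas_prev), p=w)
    theta = rng.multivariate_normal(thetas_prev[i], cov)
    # mixture density g_t(theta) = sum_j w_j * K(theta | theta_j, cov)
    g = sum(wj * multivariate_normal.pdf(theta, mean=tj, cov=cov)
            for wj, tj in zip(w, thetas_prev))
    return theta, prior_pdf(theta) / g
```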

The output of ABC-SMC is a population of weighted parameters

$$\mathcal{P}_{n_t} = \{(\theta_{n_t}^i, w_{n_t}^i)\}_{i \le N} \sim \pi_{\text{ABC},\varepsilon_{n_t}}(\theta|y_{\text{obs}}).$$

For a statistic $f: \mathbb{R}^{n_\theta} \to \mathbb{R}$, the expected value under the posterior is then approximated via the self-normalized importance estimator

$$\mathbb{E}_{\pi_{\text{ABC},\varepsilon_{n_t}}(\theta|y_{\text{obs}})}[f] \approx \hat f = \sum_{i=1}^N W_{n_t}^i f(\theta_{n_t}^i),$$

which is asymptotically unbiased. Here, $W_t^i := w_t^i / \sum_{j=1}^N w_t^j$ are self-normalized weights. This is necessary because the weights $w_t(\theta) = \pi(\theta)/g_t(\theta)$ are not normalized in the joint sample space $(\theta, y)$; therefore, effectively another Monte Carlo estimator is employed for the normalization constant (for details see Section 1.1 in S1 File).

In importance sampling, samples are assigned different weights, such that some influence estimates more than others. This can be quantified, e.g., via the effective sample size (ESS) [8, 26]:

$$\text{ESS}(\{w^i\}_{i \le N}) = \frac{\left(\sum_{i \le N} w^i\right)^2}{\sum_{i \le N} (w^i)^2} \tag{3}$$
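In code, the ESS and the self-normalized estimator from above amount to a few lines (a direct transcription, assuming a one-dimensional statistic):

```python
import numpy as np

def ess(weights):
    """Effective sample size, Eq. (3): (sum w)^2 / sum w^2."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

def self_normalized_estimate(f_values, weights):
    """Self-normalized importance estimate of E[f]: normalize the weights
    to sum to one, then average f over the particles with those weights."""
    w = np.asarray(weights, dtype=float)
    return np.dot(w / w.sum(), f_values)
```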

Established parallelization strategies

In ABC, often hundreds of thousands to millions of model simulations need to be performed, which is typically the computationally critical part. To speed up inference, parallelization strategies have been developed that exploit the independence of the $N$ particles constituting the $t$-th population. Suppose we have $W$ parallel workers, each worker being a computational processing unit, e.g. in an HPC environment. There are two established techniques to parallelize execution over the workers:

In static scheduling (STAT), given a population size $N$, $N$ tasks are defined and distributed over the workers. Each task consists of sampling until one particle gets accepted (Fig 1A). The tasks are queued if $N > W$. STAT minimizes the active computation time and the number of simulations, and it is easy to implement, requiring only basic pooling routines available in most distributed computation frameworks. However, even for $W > N$, only $N$ workers are employed, although the number of required simulations is usually substantially larger than $N$. In addition, at the end of every generation the number of active workers decreases successively, with most workers idly waiting for a few to finish their tasks. STAT is available in most established ABC-SMC implementations [15, 16].

Fig 1. Illustration of core usage over run-time for static (STAT), dynamic (DYN) and look-ahead (LA) scheduling for a population size N = 5 on W = 8 workers, over 3 generations (colors).


The shading associated with each color indicates whether a sample satisfies the acceptance criterion and is included in the final population (dark shading), satisfies the acceptance criterion but is discarded because enough earlier-started accepted samples exist (intermediate shading, for DYN+LA), or does not satisfy the acceptance criterion and is rejected (light shading). Solid lines indicate the end of a generation, dashed lines indicate the (preliminary) beginning of a generation (different from solid lines only for LA). Delimiting white spaces between boxes indicate negligible post-processing times between simulations.

In dynamic scheduling (DYN), sampling is performed continuously on all available workers until $N$ particles have been accepted (Fig 1B). However, simply taking those first $N$ particles as the final population would bias the population towards parameters causing short-running simulations. Therefore, DYN waits for all workers to finish, and out of the then $\tilde N \ge N$ accepted particles, only the $N$ that started earliest are finally accepted, not the ones that finished earliest. This has the effect that simulation time no longer plays a role in the acceptance decision [17]: the acceptance probability of a particle is in accordance with the target distribution, independent of later events and thus of its run-time.
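This bias correction is simple to state in code (a sketch; representing particles as records with a `start_time` field is an assumption of this illustration):

```python
def dyn_select(accepted, n_pop):
    """DYN bias correction: among all accepted particles, keep the n_pop
    whose simulations were *started* earliest, not those that *finished*
    earliest, so that run-time does not influence acceptance."""
    return sorted(accepted, key=lambda p: p["start_time"])[:n_pop]
```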

Parallelization using look-ahead dynamic scheduling

DYN exploits the available parallel infrastructure to a higher degree than STAT and therefore already substantially decreases the wall-time (by a factor of between 1.4 and 5.3 in test scenarios, see [17]). Nonetheless, some workers remain idle at the end of each generation while waiting for others to complete. This fraction of idle workers increases as the number of workers increases relative to the population size. Additionally, the idle time increases if simulation times are heterogeneous, which is often the case, e.g. with estimated reaction rates determining the number of simulated events (Section 3.4.1 in S1 File). In the case of fast model simulations, also the time between generations, e.g. for post-processing and storing results, may be relatively long.

Proposed algorithm

We propose to extend dynamic scheduling by using the free workers at the end of each generation to proactively sample for the next generation (Fig 2): As soon as $N$ acceptances have been reached in generation $t-1$ and workers thus start to become idle, we construct a preliminary proposal $\tilde g_t$, based on which particles for generation $t$ are generated to start simulations on the free workers. $\tilde g_t$ can be based on a preliminary population of accepted particles $\hat{\mathcal{P}}_{t-1} = \{(\hat\theta_{t-1}^i, \hat w_{t-1}^i)\}_{i \le N}$ relying on these first $N$ acceptances. However, $\hat{\mathcal{P}}_{t-1}$ may introduce a practical bias (in a finite-sample sense) towards particles with faster simulation times. This can in particular occur when computation time is highly parameter-dependent. Say, for example, that parameters from multiple regions in parameter space can explain the data similarly well, but that one region leads to substantially higher simulation times. Then, a sampling routine that does not wait for all started simulations to finish may under-represent or even miss out on regions in parameter space with high simulation times. The ABC-SMC routine may consequently have a low probability of generating importance samples from that region in subsequent generations, leading to a biased final posterior sample. To address this issue, the preliminary proposal can alternatively be based on $\mathcal{P}_{t-2}$ (such that $\tilde g_t = g_{t-1}$), giving inductively practically unbiased proposals. If a particle $\tilde\theta_t \sim \tilde g_t$ gets accepted according to the acceptance criteria of generation $t$, its non-normalized weight is calculated as $\tilde w_t(\tilde\theta_t) = \pi(\tilde\theta_t)/\tilde g_t(\tilde\theta_t)$. As soon as all simulations for generation $t-1$ have finished and thus the actual $\mathcal{P}_{t-1}$ is available, all workers are updated to continue working on the actual sampling task based on proposal $g_t$. As the time-critical part of typical ABC applications is the model simulation, the cost of generating the preliminary sampling task is usually negligible.
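The worker-side logic can be sketched as follows (illustrative pseudocode in Python, not pyABC's actual internals; `get_newest_task`, `simulate_and_evaluate`, and `submit_result` are assumed stubs for the server communication):

```python
def worker_loop(get_newest_task, simulate_and_evaluate, submit_result):
    """Worker under look-ahead scheduling: the server always exposes the
    newest sampling task. While generation t-1 is still running, idle
    workers receive a *preliminary* task for generation t (proposal g~_t);
    once P_{t-1} is complete, it is replaced by the final task (g_t)."""
    while True:
        task = get_newest_task()            # preliminary or final, or None
        if task is None:                    # inference has terminated
            break
        theta = task["proposal"].rvs()      # draw from g~_t or g_t
        result = simulate_and_evaluate(theta, task)  # the expensive part
        result["proposal_id"] = task["id"]  # record which proposal was used,
        submit_result(result)               # needed for correct reweighting
```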

Fig 2. Concept visualization of look-ahead scheduling (LA).


As soon as no more simulations are required for generation $t-1$ (green), a preliminary simulation task for generation $t$ is formulated, based either on population $\mathcal{P}_{t-2}$ (grey, LA Past) or on the preliminary population $\hat{\mathcal{P}}_{t-1}$ (purple, LA Prel). Resulting simulations are considered when evaluating the next generation, and suitable weight normalization is applied to all samples (top right). Over time, the number of workers dedicated to generation $t-1$ decreases, while that for generation $t$ increases (bottom).

The assessment of acceptance of preliminary samples depends on whether all components are pre-defined: If the acceptance components, including the distance function $d$ and acceptance threshold $\varepsilon_t$ for generation $t$, are defined a priori, then acceptance can be checked directly on the workers, without knowledge of the complete previous population $\mathcal{P}_{t-1}$. If, however, any component of the algorithm is adaptive and hence based on $\mathcal{P}_{t-1}$ (e.g. the acceptance threshold is commonly chosen as a quantile of $\{d(y_{t-1}^i, y_{\text{obs}})\}_{i \le N}$), the acceptance check must be delayed until the actual $\mathcal{P}_{t-1}$ is available. This allows using one common acceptance criterion across all particles within a generation, so that all particles target the same distribution.
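For the adaptive case, the delayed check itself is cheap; a sketch (the record layout and the quantile-based threshold follow the text, everything else is illustrative):

```python
import numpy as np

def delayed_acceptance(lookahead_results, distances_prev, quantile=0.5):
    """Delayed acceptance for adaptive thresholds: eps_t is only known once
    the full population P_{t-1} is available, so stored look-ahead results
    (each holding a precomputed distance d(y, y_obs)) are checked afterwards
    on the main process, against one common criterion."""
    eps_t = np.quantile(distances_prev, quantile)
    accepted = [r for r in lookahead_results if r["distance"] <= eps_t]
    return accepted, eps_t
```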

The population of generation $t$, corrected for run-time bias as in DYN by only considering the $N$ accepted particles that were started first, is then given as

$$\mathcal{P}_t = \left\{\{(\tilde\theta_t^i, \tilde w_t^i)\}_{i \le \tilde N},\; \{(\theta_t^i, w_t^i)\}_{\tilde N < i \le N}\right\}, \tag{4}$$

with $0 \le \tilde N \le N$ particles based on the preliminary proposal $\tilde g_t$, and $N - \tilde N$ on the final $g_t$. The weights need to be normalized appropriately, as explained in the following section. We call this parallelization strategy, which during generation $t-1$ already looks ahead to generation $t$, look-ahead (dynamic) scheduling (LA).

Weights and unbiasedness

A key property of ABC methods is that they provide an asymptotically unbiased Monte Carlo sample from $\pi_{\text{ABC},\varepsilon_{n_t}}(\theta|y_{\text{obs}})$, with $\pi_{\text{ABC},\varepsilon_{n_t}}(\theta|y_{\text{obs}}) \to \pi(\theta|y_{\text{obs}})$ for $\varepsilon \to 0$. The sample (4) obtained via LA conserves this property: each subpopulation on its own gives an asymptotically unbiased estimator, since the weights $\tilde w_t(\tilde\theta) = \pi(\tilde\theta)/\tilde g_t(\tilde\theta)$ and $w_t(\theta) = \pi(\theta)/g_t(\theta)$ are exactly the Radon-Nikodym derivatives w.r.t. the respective proposal distributions. Note that this theoretical unbiasedness holds regardless of whether $\tilde g_t$ is based on $\hat{\mathcal{P}}_{t-1}$ or $\mathcal{P}_{t-2}$, as long as $\tilde g_t(\theta) \gg \pi(\theta)$. As noted in the previous section, however, a practical bias may occur due to finite sample size.
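For either subpopulation, unbiasedness at the level of the proposal is the standard one-line importance sampling identity (the acceptance step enters both sides identically and is treated in Section 1 in S1 File); e.g. for the preliminary proposal:

$$\mathbb{E}_{\tilde g_t}\big[\tilde w_t(\theta)\, f(\theta)\big] = \int f(\theta)\,\frac{\pi(\theta)}{\tilde g_t(\theta)}\,\tilde g_t(\theta)\,\mathrm{d}\theta = \int f(\theta)\,\pi(\theta)\,\mathrm{d}\theta = \mathbb{E}_{\pi}\big[f(\theta)\big].$$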

The subpopulation estimates are then combined, which decreases the Monte Carlo error due to the larger sample size. Instead of simply pooling all samples, it is preferable to first normalize the weights relative to their subpopulation, $\tilde W_t^i := \tilde w_t^i / \sum_{j=1}^{\tilde N} \tilde w_t^j$ and $W_t^i := w_t^i / \sum_{j=\tilde N + 1}^{N} w_t^j$ (Section 1.3 in S1 File). This is because both weight functions are non-normalized, with generally different normalization constants, which renders them not directly comparable. A joint estimate based on the full population can then be given as

$$\mathbb{E}_{\pi_{\text{ABC},\varepsilon_t}(\theta|y_{\text{obs}})}[f] \approx \beta \sum_{i=1}^{\tilde N} \tilde W_t^i f(\tilde\theta_t^i) + (1-\beta) \sum_{i=\tilde N+1}^{N} W_t^i f(\theta_t^i), \tag{5}$$

with $\beta \in [0, 1]$ a free parameter. A straightforward choice is $\beta = \tilde N / N$, rendering the contribution of each subpopulation proportional to its number of samples. Instead, we propose to choose $\beta$ to maximize the overall effective sample size (3), rendering the Monte Carlo estimate more robust. This is a simple constrained optimization problem with solution

$$\beta = \frac{\text{ESS}(\{\tilde W_t^i\}_{i \le \tilde N})}{\text{ESS}(\{\tilde W_t^i\}_{i \le \tilde N}) + \text{ESS}(\{W_t^i\}_{\tilde N < i \le N})},$$

i.e. the contribution of each subpopulation is proportional to its effective sample size (Section 1.4 in S1 File). Supposing that $\tilde N / N \to \alpha \in [0, 1]$ for $N \to \infty$, (5) converges to the left-hand side, as required. A more detailed derivation and an extension to more than two proposal distributions are given in Section 1 in S1 File.
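Combining the two subpopulations thus reduces to a few lines (a direct transcription of Eqs. (3) and (5) with the ESS-optimal β; it assumes both subpopulations are non-empty):

```python
import numpy as np

def combined_estimate(f_prel, w_prel, f_final, w_final):
    """ESS-weighted combination of the preliminary and final subpopulation
    estimates, Eq. (5): self-normalize weights within each subpopulation,
    then mix with beta proportional to the effective sample sizes."""
    def ess(w):
        return w.sum() ** 2 / (w ** 2).sum()

    w_prel = np.asarray(w_prel, dtype=float)
    w_final = np.asarray(w_final, dtype=float)
    beta = ess(w_prel) / (ess(w_prel) + ess(w_final))
    return (beta * np.dot(w_prel / w_prel.sum(), f_prel)
            + (1 - beta) * np.dot(w_final / w_final.sum(), f_final))
```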

Implementation and availability

We implemented LA in the open-source Python tool pyABC [27], which already provided STAT and DYN. We employ a low-latency Redis server to handle the task distribution. If all components are pre-defined, we perform the evaluation of the "look-ahead" samples $(\tilde\theta, \tilde y)$ directly on the workers. If there are adaptive components, the delayed evaluation is performed on the main process. To avoid generating unnecessarily many preliminary samples in the presence of some very long-running simulations, we limited the number of preliminary samples to a default value of 10 times the number of samples in the current iteration. To not start preliminary sampling unnecessarily, we employed schemes predicting whether any termination criterion will be hit after the current generation. The code underlying this study can be found at https://github.com/EmadAlamoudi/Lookahead_study. A snapshot of code and data can be found at https://doi.org/10.5281/zenodo.7875905.
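In pyABC, LA is enabled via the Redis-based sampler. A minimal usage sketch (the toy model, prior, and distance are placeholders; parameter names reflect recent pyABC releases and should be checked against the installed version; a Redis server and workers, e.g. started via `abc-redis-worker`, must be running):

```python
import numpy as np
import pyabc
from pyabc.sampler import RedisEvalParallelSampler

# Placeholder problem definition; replace with the actual model.
def model(pars):
    return {"y": np.random.normal(pars["theta"], 1.0)}

prior = pyabc.Distribution(theta=pyabc.RV("uniform", -5, 10))
distance = pyabc.PNormDistance(p=2)

sampler = RedisEvalParallelSampler(
    host="localhost", port=6379,
    look_ahead=True,                   # enable preemptive (LA) sampling
    look_ahead_delay_evaluation=True,  # delayed acceptance for adaptive parts
    max_n_eval_look_ahead_factor=10,   # cap on preliminary samples (see text)
)

abc = pyabc.ABCSMC(model, prior, distance,
                   population_size=1000, sampler=sampler)
abc.new("sqlite:///la_demo.db", {"y": 2.0})
history = abc.run(max_nr_populations=8)
```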

Results

Wall-time superiority of DYN over STAT has already been established in prior work [17]. To study the performance of LA and compare it to DYN, we applied both to several parameter estimation problems and in various scenarios of population size $N$ and workers $W$. We distinguish between "LA Prel", using the preliminary $\hat{\mathcal{P}}_{t-1}$ to generate $\tilde g_t$, and "LA Past", using $\mathcal{P}_{t-2}$ instead.

Test problems

We considered four problems (Table 1): T1 and T2 are simple test problems, while M1 and M2 are realistic application examples.

Table 1. Overview of application examples.

ID  Description                      Implementation  $n_\theta$
T1  Bimodal run-time-skewed model    Python          1
T2  Conversion reaction ODE model    Python          2
M1  Tumor spheroid growth [11]       C++             7
M2  Liver tissue regeneration [28]   Morpheus        14

Problem T1 is a model in which the observable depends on $\theta^2$, so that the posterior is bimodal (both signs of $\theta$ explain the data equally well), and simulations from one mode have an artificially longer run-time. Specifically, if $\theta > 0$, a log-normally distributed simulation time of τ ∼ logN(1,2,4) seconds was simulated. The goal of setting an artificially longer run-time was to specifically test the preliminary bias caused by only accepting simulations from one mode when constructing the preliminary proposal.

Problem T2 is an ordinary differential equation (ODE) model with two parameters describing a conversion reaction $x_1 \leftrightarrow x_2$, with observables obscured by random multiplicative noise. To analyze sampler behavior under simulation run-time heterogeneity, we added random log-normally distributed delay times $t_{\text{sleep}}$ of various variances on top of the ODE simulations. For this model, run-times are fast, permitting repeated analyses to check the correctness of the method, quantify stochastic effects, and assess average behavior.
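A sketch of such a setup (the concrete rates, initial condition, observable, and noise level here are illustrative assumptions, not the exact T2 specification):

```python
import time
import numpy as np
from scipy.integrate import odeint

def conversion_model(theta1, theta2, sleep_var=1.0, rng=None):
    """T2-like toy: two-parameter conversion reaction x1 <-> x2, solved as
    an ODE, with multiplicative observation noise and an artificial
    log-normal delay mimicking heterogeneous simulation run-times."""
    rng = rng or np.random.default_rng()

    def rhs(x, t):
        return [-theta1 * x[0] + theta2 * x[1],
                theta1 * x[0] - theta2 * x[1]]

    ts = np.linspace(0, 10, 11)
    x = odeint(rhs, [1.0, 0.0], ts)
    # artificial heterogeneous run-time (the variance of the underlying
    # normal serves as a proxy for the sleep time variance sigma^2)
    time.sleep(rng.lognormal(mean=0.0, sigma=np.sqrt(sleep_var)))
    return x[:, 1] * rng.lognormal(mean=0.0, sigma=0.1, size=ts.size)
```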

Problem M1 describes the growth of a tumor spheroid using a hybrid discrete-continuous approach, modeling single cells as stochastically interacting agents alongside extracellular substances [11]. The model combines a system of partial differential equations (PDEs) describing the extracellular matrix with a cellular Potts model (CPM) describing cell configurations and mechanisms like cell division and cell death. The model has seven estimated parameters and outputs three observables: the growth curve and the extracellular matrix and proliferation profiles.

Problem M2 describes the metabolic status of mechano-sensing during liver regeneration, with the reaction network dynamics described by a set of ODEs [28]. The model has 14 parameters and two observables, the nuclear YAP and total YAP intensities. These observables were quantified from image tiles covering an entire liver lobule with portal and central veins.

Further details about the test problems can be found in Section 3 in S1 File.

Biased proposal can induce practical bias in accepted population

The analysis of test model T1 revealed that for small population sizes $N$ relative to the number of workers $W$, in combination with high acceptance rates, LA Prel can indeed lead to a bias towards short-running simulations (Fig 3 right). This can happen when $\hat{\mathcal{P}}_{t-1}$ is based only on short-running simulations and solely proposes particles from that regime, enough of which are then accepted to form $\mathcal{P}_t$. For larger $N$ relative to $W$, this effect occurred less, likely because given large population sizes, sampling from other modes with associated high importance weights eventually happened.

Fig 3. Run-time and posterior approximation for 5 different runs of model T1 with STAT, DYN, LA Prel and LA Past, with population size N = 100 on W = 144, 240, 432 workers (top to bottom).


Sampling from unbiased proposal solves bias

When replacing LA Prel by LA Past, i.e. sampling based on $\mathcal{P}_{t-2}$ instead of $\hat{\mathcal{P}}_{t-1}$, the bias no longer occurred (Fig 3 right). This is expected, because the proposal based on $\mathcal{P}_{t-2}$ has no run-time bias. In practice, we did not encounter any problems of practical bias on the considered application examples, where results from DYN, LA Prel and LA Past were highly consistent. Yet, LA Prel may fail in some situations, which also demonstrates that ABC-SMC algorithms are sensitive to potential bias in the proposal distribution. Thus, in the following, we focus on the stable LA Past algorithm, showing pendants for LA Prel in the S1 File.

Look-ahead sampling gives accurate results

We used problem T2 to analyze different scenarios, with population sizes $N$ from 20 to 1280 particles, worker numbers $W$ from 32 to 256, and log-normally distributed simulation times with variances $\sigma^2$ from 1.0 to 4.0. We ran each scenario 13 times to obtain stable average statistics. We considered means and standard deviations as point and uncertainty measures.

Point estimates for DYN and LA converged to the same values across population sizes (Fig 4A and 4B). The proportion of accepted LA samples in the final population originating from the preliminary distribution ranged from nearly 100% to 50% (LA Prel) and 20% (LA Past), decreasing, as expected, for larger population sizes (Fig 4E and 4F). The more pronounced decrease for LA Past than LA Prel is reasonable because, void of bias, $\hat{\mathcal{P}}_{t-1}$ provides a better sampling distribution than $\mathcal{P}_{t-2}$. Effective sample sizes were stable across DYN and LA (Fig 4D). A higher run-time variance led to an increase in accepted samples originating from the preliminary proposal distribution (S1 Fig in S1 File). This is expected, because greater heterogeneity in run-times increases the chance of encountering exceptionally long-running simulations, which DYN has to wait for, while LA already proceeds.

Fig 4. Results for problem T2 for different population sizes N, worker numbers W, and sleep time variances σ2.


Unless otherwise specified, we used N = 1280, W = 256, and a log-normally distributed sleep time $t_{\text{sleep}}$ of variance $\sigma^2 = 1$. To increase comparability, the $\varepsilon_t$ values over $n_t = 8$ generations were pre-defined. (A) and (B): Mean and standard deviation of the posterior approximation $\pi_{\text{ABC},\varepsilon_{n_t}}(\theta|y_{\text{obs}})$. Box-plot over 13 repetitions. (C): Posterior mean for different sleep time variances, for W = 20. (D): Effective sample size across different sleep time variances, for N = 256 and W = 20, in which case it is likely that several generations are sampled completely from the preliminary proposal. (E): Fraction $\tilde N/N$ of accepted samples in the final population $t = n_t$ that originate from the preliminary proposal $\tilde g_{n_t}(\theta)$, for LA Prel and LA Past. (F): Exemplary visualization of 1D posterior approximation marginals for single runs.

Considerable speed-up towards high worker numbers

To analyze the effect of the scheduling strategy on the overall wall-time, we ran model T2 systematically for different population sizes and numbers of workers. We considered population sizes $32 \le N \le 1280$ and numbers of parallel workers $32 \le W \le 256$, which covers typical ranges used in practice. Each scenario was repeated 13 times to assess average behavior; here we report mean values.

As a general tendency, the wall-time speed-up of LA over DYN became larger with an increasing ratio of the number of workers to the population size. For a model sleep time variance of $\sigma^2 = 1$ (Fig 5), e.g. for $N = 20$ and $W = 256$, the average wall-time was reduced by a factor of almost 1.8. In most scenarios, a wall-time reduction by a factor of between 1.11 and 1.8 was observed. Only when the population size was large compared to the number of workers was the speed-up comparably small. Generally, the increase in speed-up with increasing ratio $W/N$ is as expected, as for large $W$ the idle time between generations occurring for DYN and LA constitutes a more pronounced factor in the overall run-time.

Fig 5. Speed-up ($1 - \text{wall-time}_{\text{LA}}/\text{wall-time}_{\text{DYN}}$) of LA Prel (left) and LA Past (right) over DYN for various population sizes and numbers of workers, for a model sleep time variance of $\sigma^2 = 1$.


For a sleep time variance of $\sigma^2 = 2$ (S2 Fig in S1 File), we observed similar behavior. There, the acceleration was generally more pronounced, with factors of up to roughly 1.9 and many in the range 1.2 to 1.9. This indicates that the advantage of LA over DYN is indeed more pronounced in the presence of highly heterogeneous model simulation times.

Also on T1, the comparison of run-times (Fig 3 left) revealed a speed-up of LA over DYN. Further, we could confirm on both T1 and T2 (Fig 3 and S3 Fig in S1 File) the substantial speed-up that DYN already provides over STAT, as reported in [17], and on which we here improved further.

Scales to realistic application problems

Given the high simulation cost of the application problems M1 and M2, we only performed selected analyses to compare LA and DYN. A reliable comparison of run-times in real-life application examples is challenging, because the total wall-time varies strongly due to stochastic effects, and computations are too expensive to perform inference many times.

For the two models, the parameter estimates obtained using LA (both LA Prel and LA Past) and DYN are consistent, up to expectable stochastic effects (S4, S5 and S9–S11 Figs in S1 File). Together with the previous analyses, this indicates that for practical applications, the multi-proposal approach of LA allows for stable and accurate inference, similar to the single proposal used by DYN. In early generations, a considerable part of the accepted particles was based on the preliminary proposal distribution (near 100%), which then decreased in later generations (S6 and S12 Figs in S1 File). This is consistent with the decrease in acceptance rate and thus the relative time during which the preliminary, and not the final, proposal distribution is used.

For the tumor model M1, we used an adaptive quantile-based epsilon threshold schedule [29], with DYN, LA Prel and LA Past, population sizes $N \in \{250, 500, 1000\}$, and $W \in \{128, 256\}$ workers. For each considered configuration we performed 2 replicates (in total 8) to assess average behavior. Reported run-times are until a common final threshold was hit by all runs. The speed-up of LA over DYN varied depending on the ratio of the population size to the number of workers, similar to what we observed for T1 and T2. For high ratios, LA was consistently faster, by up to 35%. However, for low ratios, less improvement was observed, and in some runs LA was slightly slower than DYN (Fig 6). Over the 8 runs, we observed a mean speed-up of 21% (13%) and a median of 23% (16%) for LA Past (LA Prel). This indicates expected speed-ups of 13–23%. However, it should be remarked that large run-time differences and volatility could be traced back to single generations taking vast amounts of time (S7 Fig in S1 File). These long generations occurred in all scheduling variants and most likely exist because the epsilon for that generation was chosen too optimistically, indicating a weakness of the epsilon scheme used rather than of the parallelization strategy.

Fig 6. Run-time and posterior distributions for 2 different runs of model M1 with population size 1000, 500, 250 on 128 and 256 workers.


For the liver regeneration model M2, we performed similar analyses, with adaptive quantile-based epsilon threshold schedules, population sizes $N \in \{250, 500, 1000\}$ and $W \in \{128, 256\}$ workers, with 2 replicates per configuration. Similar to model M1, we observed a speed-up of up to 35%. However, with a smaller ratio between the population size and the number of workers, a slightly lower performance improvement was achieved (Fig 7). Similarly to M1, the acceleration varied quite strongly. For LA Prel, we observed a mean speed-up over all 8 runs of 36% (median 31%) over DYN. For LA Past, in contrast, we observed a mean slow-down of 39% (median 43%) over DYN. It is not clear what caused this stark difference, which is again subject to high fluctuations; further tests would be needed to assess the reasons for this specific model.

Fig 7. Run-time and posterior distribution for 2 different runs of model M2 with population size 1000, 500, 250 on 128 and 256 workers.


Discussion

Simulation-based ABC methods have made parameter inference increasingly accessible even for complex stochastic models, but remain limited by computational costs. Here, we presented "look-ahead" sampling, a parallelization strategy to minimize wall-time and improve run-time efficiency by using all available high-performance computing resources at nearly all times. On various test and application examples, we verified the accuracy and robustness of the novel approach in typical settings. Depending on the heterogeneity of model simulation run-times and the relation of population size to the number of available cores, we observed a speed-up of up to 45% compared to dynamic scheduling, the previously most efficient strategy. Compared to the widely used static scheduling, dynamic scheduling is already highly efficient, with limited room for improvement. Nevertheless, using the look-ahead sampling proposed here, we observed on typical application examples a speed-up of often roughly 20-30%, however with some variability and sometimes efficiency on par with or even below dynamic scheduling. Assessing these variations in efficiency in more detail on expensive application examples would require further tests with considerable computational resources. Importantly, our analysis also demonstrates how ABC-SMC is sensitive to the choice of proposal distribution. Finite samples can induce a practical bias, as we observed here for parameter-dependent run-times of models, a problem that occurred in extreme cases and could only be solved by using look-ahead sampling with the previous, rather than the preliminary, proposal distribution.

Conceptually, and aside from implementation details, the presented strategy provides the minimal wall-time among all parallelization strategies, as all cores are used at practically all times. We observed that look-ahead sampling using preliminary results (LA Prel) provided a performance speed-up over re-using the previous generation (LA Past), however at the cost of potential practical bias. Thus, LA Past constitutes the safe choice. Only if the possibility of critical parameter-dependent simulation times can be excluded would we presently recommend LA Prel.

Were it possible to construct an unbiased proposal using these preliminary results, e.g. via reweighting or imbalance detection, the speed-up could be increased while retaining robust performance. Alternatively, LA Past and LA Prel could be combined, e.g. by switching to LA Prel after a "burn-in", when the probability of a bias towards short-running simulations is lessened.

When using delayed evaluation, it would be possible to parallelize the evaluation as well, which we have not done here. If evaluation times are long relative to simulation times, e.g. if (adaptive) summary statistics involve complex operations, this would be beneficial. In order to reduce a potential bias in the preliminary proposal distribution towards fast-running simulations, it may be beneficial to update it as soon as more particles finish. This would imply the use of more than two importance distributions, the theory of which we have already provided in the S1 File.

In conclusion, we showed how to minimize the wall-time and associated computing cost of ABC samplers, with substantial performance gains over established methods. As the concept applies generally to sequential importance sampling methods, it is of potential widespread use across applications.

Supporting information

S1 File. Further details on the methods and results.

(PDF)


Data Availability

The code underlying this study can be found at https://github.com/EmadAlamoudi/Lookahead_study. A snapshot of code and data can be found at https://doi.org/10.5281/zenodo.7875905.

Funding Statement

The authors acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre (JSC). This work was supported by the German Federal Ministry of Education and Research (BMBF) (FitMultiCell/031L0159C and EMUNE/031L0293C) and the German Research Foundation (DFG) under Germany’s Excellence Strategy (EXC 2047 390873048 and EXC 2151 390685813 and the Schlegel Professorship for JH). YS acknowledges support by the Joachim Herz Foundation. FG was supported by the Chica and Heinz Schaller Foundation. There was no additional external funding received for this study.

References

1. Gershenfeld NA, Gershenfeld N. The nature of mathematical modeling. Cambridge University Press; 1999.
2. Kitano H. Systems Biology: A Brief Overview. Science. 2002;295(5560):1662–1664. doi: 10.1126/science.1069492
3. Tarantola A. Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM; 2005.
4. Tavaré S, Balding DJ, Griffiths RC, Donnelly P. Inferring coalescence times from DNA sequence data. Genetics. 1997;145(2):505–518. doi: 10.1093/genetics/145.2.505
5. Hasenauer J, Jagiella N, Hross S, Theis FJ. Data-Driven Modelling of Biological Multi-Scale Processes. J Coupled Syst Multiscale Dyn. 2015;3(2):101–121. doi: 10.1166/jcsmd.2015.1069
6. Pritchard JK, Seielstad MT, Perez-Lezaun A, Feldman MW. Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol Biol Evol. 1999;16(12):1791–1798. doi: 10.1093/oxfordjournals.molbev.a026091
7. Beaumont MA, Zhang W, Balding DJ. Approximate Bayesian Computation in Population Genetics. Genetics. 2002;162(4):2025–2035. doi: 10.1093/genetics/162.4.2025
8. Sisson SA, Fan Y, Beaumont M. Handbook of approximate Bayesian computation. Chapman and Hall/CRC; 2018.
9. Del Moral P, Doucet A, Jasra A. Sequential Monte Carlo samplers. J R Stat Soc B. 2006;68(3):411–436. doi: 10.1111/j.1467-9868.2006.00553.x
10. Sisson SA, Fan Y, Tanaka MM. Sequential Monte Carlo without likelihoods. Proc Natl Acad Sci. 2007;104(6):1760–1765. doi: 10.1073/pnas.0607208104
11. Jagiella N, Rickert D, Theis FJ, Hasenauer J. Parallelization and High-Performance Computing Enables Automated Statistical Inference of Multi-scale Models. Cell Syst. 2017;4(2):194–206. doi: 10.1016/j.cels.2016.12.002
12. Imle A, Kumberger P, Schnellbächer ND, Fehr J, Carrillo-Bustamante P, Ales J, et al. Experimental and computational analyses reveal that environmental restrictions shape HIV-1 spread in 3D cultures. Nat Commun. 2019;10(1):2144. doi: 10.1038/s41467-019-09879-3
13. Durso-Cain K, Kumberger P, Schälte Y, Fink T, Dahari H, Hasenauer J, et al. HCV spread kinetics reveal varying contributions of transmission modes to infection dynamics. Viruses. 2021;13(7). doi: 10.3390/v13071308
14. Alamoudi E, Schälte Y, Müller R, Starruß J, Bundgaard N, Graw F, et al. FitMultiCell: Simulating and parameterizing computational models of multi-scale and multi-cellular processes. Bioinformatics. 2023. doi: 10.1093/bioinformatics/btad674
15. Dutta R, Schoengens M, Onnela JP, Mira A. ABCpy: A User-Friendly, Extensible, and Parallel Library for Approximate Bayesian Computation. In: Proceedings of the Platform for Advanced Scientific Computing Conference. PASC'17. New York, NY, USA: ACM; 2017. p. 8:1–8:9.
16. Kangasrääsiö A, Lintusaari J, Skytén K, Järvenpää M, Vuollekoski H, Gutmann M, et al. ELFI: Engine for Likelihood-Free Inference. In: NIPS 2016 Workshop on Advances in Approximate Bayesian Inference; 2016.
17. Klinger E, Rickert D, Hasenauer J. pyABC: distributed, likelihood-free inference. Bioinformatics. 2018;34(20):3591–3593.
18. Wilkinson RD. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Stat Appl Genet Mol Biol. 2013;12(2):129–141. doi: 10.1515/sagmb-2013-0010
19. Schälte Y, Alamoudi E, Hasenauer J. Robust adaptive distance functions for approximate Bayesian inference on outlier-corrupted data. bioRxiv. 2021.
20. Fearnhead P, Prangle D. Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. J R Stat Soc B. 2012;74(3):419–474. doi: 10.1111/j.1467-9868.2011.01010.x
21. Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface. 2009;6:187–202. doi: 10.1098/rsif.2008.0172
22. Beaumont MA. Approximate Bayesian computation in evolution and ecology. Annu Rev Ecol Evol Syst. 2010;41:379–406. doi: 10.1146/annurev-ecolsys-102209-144621
23. Beaumont MA, Cornuet JM, Marin JM, Robert CP. Adaptive approximate Bayesian computation. Biometrika. 2009;96(4):983–990. doi: 10.1093/biomet/asp052
24. Filippi S, Barnes CP, Cornebise J, Stumpf MP. On optimality of kernels for approximate Bayesian computation using sequential Monte Carlo. Stat Appl Genet Mol Biol. 2013;12(1):87–107.
25. Klinger E, Hasenauer J. A scheme for adaptive selection of population sizes in Approximate Bayesian Computation – Sequential Monte Carlo. In: Feret J, Koeppl H, editors. Computational Methods in Systems Biology. CMSB 2017. Vol. 10545 of Lecture Notes in Computer Science. Springer, Cham; 2017.
26. Liu JS, Chen R, Wong WH. Rejection control and sequential importance sampling. J Am Stat Assoc. 1998;93(443):1022–1031. doi: 10.1080/01621459.1998.10473764
27. Schälte Y, Klinger E, Alamoudi E, Hasenauer J. pyABC: Efficient and robust easy-to-use approximate Bayesian computation. J Open Source Softw. 2022;7(74):4304. doi: 10.21105/joss.04304
28. Meyer K, Morales-Navarrete H, Seifert S, Wilsch-Braeuninger M, Dahmen U, Tanaka EM, et al. Bile canaliculi remodeling activates YAP via the actin cytoskeleton during liver regeneration. Mol Syst Biol. 2020;16(2):e8985. doi: 10.15252/msb.20198985
29. Drovandi CC, Pettitt AN. Estimation of parameters for macroparasite population evolution using approximate Bayesian computation. Biometrics. 2011;67(1):225–233. doi: 10.1111/j.1541-0420.2010.01410.x

Decision Letter 0

Abel CH Chen

3 Sep 2023

PONE-D-23-20726
A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation
PLOS ONE

Dear Dr. Schälte,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 18 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Abel C.H. Chen

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement:

"The authors acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre (JSC). This work was supported by the German Federal Ministry of Education and Research (BMBF) (FitMultiCell/031L0159C and EMUNE/031L0293C) and the German Research Foundation (DFG) under Germany's Excellence Strategy (EXC 2047 390873048 and EXC 2151 390685813 and the Schlegel Professorship for JH). YS acknowledges support by the Joachim Herz Foundation. FG was supported by the Chica and Heinz Schaller Foundation."

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement. 

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. Please update your submission to use the PLOS LaTeX template. The template and more information on our requirements for LaTeX submissions can be found at http://journals.plos.org/plosone/s/latex.

4. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

5. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The article "A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation" proposes a look-ahead scheduling technique to fully utilize idle time, addressing issues that the current static and dynamic scheduling strategies have. Two kinds of LA scheme, LA pre and LA cur, are proposed and compared through numerous numerical examples. It is clearly written and will be a useful contribution to the community. A link to the software is provided; I was able to access the code through the link provided in the paper. A few minor comments for the manuscript that will be useful for the authors to discuss, but do not require any further calculations for this paper, are as below.

1. In section 2.4.1, could the authors add more discussion about the bias introduced by LA Pre, such as where the bias comes from, and why particles with faster simulation times tend to introduce a practical bias?

2. In Figure 3, the authors illustrate the bias caused by the LA pre scheduling method. However, it is hard for me to read the information from the right-hand-side plot. I was wondering if a more detailed explanation of the problematic realization (the purple "vertical" lines), or updated plots with an extended y-axis, could be provided?

3. In section 3.5, what is the reason behind the slower performance as the ratio between the population size and the number of workers increases?

4. How do you determine which method (LA pre and LA cur) is the better choice given a problem?

Reviewer #2: In this study, E. Alamoudi and his colleagues develop a novel parallelization strategy implemented in a high-performance computing (HPC) infrastructure. This strategy aims at improving the speed, mainly by reducing wall-time, of the Approximate Bayesian Computation (ABC) sequential Monte Carlo algorithm. The authors show that their new strategy is unbiased and converges to the expected value of Monte Carlo sampling. They also compare their strategy (look-ahead scheduling) with existing parallelization strategies, assessing relative performance on four different test problems. Given its flexibility, ABC is a popular approach that can be applied to very complex models. As such, any potential speed-up strategy is useful and could prove to be valuable for many researchers. Thus, it is my recommendation that this manuscript should be accepted. Nevertheless, I have a few comments and suggestions. Please note that, although I have worked with ABC before, I am not a computational scientist and thus all my comments will be aimed at improving the clarity of the manuscript for a more general audience. I hope my comments are useful and contribute to making this submission more valuable for PLOS ONE's wide readership.

- I think the authors limit the scope of the manuscript by starting the introduction with systems biology. Although “systems biology” is an umbrella term that encompasses several fields and research questions, this parallelization strategy could be useful in other fields such as population genomics or human health, for instance. I think a more general start is warranted, possibly by just mentioning the variety of applications of ABC, including systems biology.

- “While asymptotically exact, a known disadvantage of ABC is its reliance on repeated simulation, often hundred thousands to millions of times” - while this is technically correct, further ahead the authors clearly state the 3 steps of classical ABC. Thus, I think this could be rephrased to highlight that a disadvantage of ABC is the computationally heavy and time consuming simulation of data points (or datasets), which is usually the most time consuming step of ABC.

- I believe there is an unfinished sentence in the “ABC-SMC” section of the Methods. The sentence that starts with “Methods that adapt to the problem structure” ends with “have shown superior”. I think a word is missing after “superior”.

- Although Figure 1 is useful to understand the comparison between the different parallelization strategies, I found it somewhat confusing. Particularly, it was confusing to see shades of gray in the Figure key but not in the Figure itself. I think the authors could remove the shading from the Figure key and include that information only in the Figure legend itself. Possibly as: “The shading associated with each color indicates whether a sample satisfies the acceptance criterion and is included in the final population (dark shading), satisfies the acceptance criterion but is discarded because enough earlier-started accepted samples exist (intermediate shading, for DYN+LA), or does not satisfy the acceptance criterion and is rejected (light shading).” Additionally, it is not clear what the blank areas in-between shaded areas represent for each worker? Idle times or post-process?

- Throughout the text the authors refer to Figure 1A, 1B and 1C but I did not see any A, B or C indicated in Figure 1.

- When explaining static scheduling (STAT), the authors mention that the “tasks are queued if N ≥ W”. Do tasks really get queued if N = W? It was my understanding that if N workers are available then all tasks will start and no task will be queued.

- "Therefore, DYN waits for all workers to finish, and out of then Ñ ≥ N accepted particles, only the N that started earliest are finally accepted". Is this correct? This sentence is confusing, because if there is a bias towards accepting particles that end earlier, how would waiting for all workers to finish and then selecting the N that started earliest solve that bias?

- This might just be a personal preference but I don’t understand the rationale behind the naming convention chosen for “LA Cur” and “LA Pre”. Although I understand that “LA Pre” refers to the use of a Preliminary sample, “LA Cur” makes me think of “Current” which is less than intuitive given that it is using an earlier generation Pt-2. Maybe this will be clearer for other readers but, for me, it would make more sense to call these two approaches “LA Pre” for preliminary and “LA Pge” for the past generation.

- In the subsection “Test problems”, I think a few more general details about the problems considered would make the results more understandable to a wider audience. I suggest adding a table with the parameters for each problem to the supplementary information. The authors should also mention that a detailed explanation of the test problems can be found in the supplementary information. Additionally, I think the authors should expand their explanation of T1, since this problem is important for understanding the preliminary bias associated with “LA Pre”. In particular, I think they should state more clearly that the aim of the problem is to infer a posterior distribution with two modes (i.e., a bimodal distribution), which is a tricky problem; a toy sketch of such a problem is given below. Furthermore, it could be made clearer that the goal of setting an artificially longer run-time was to specifically test the preliminary bias caused by only accepting simulations from one mode when constructing g_t based on P_{t-1}.
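[Editorial note: for readers unfamiliar with T1, a toy problem in its spirit might look as follows; this is a hypothetical sketch, with parameter values and the slow-down mechanism invented for illustration. The posterior over theta is bimodal, and simulations from one mode are made artificially slow, so accepting whichever simulations happen to finish first would over-represent the fast mode.]

```python
import time
import numpy as np

rng = np.random.default_rng(0)

def model(theta):
    # One of the two posterior modes is made artificially slow, mimicking
    # T1's artificially prolonged run-times for part of parameter space.
    if theta > 0:
        time.sleep(0.01)
    # y = theta^2 + noise: observing y_obs = 1 yields posterior modes
    # near theta = -1 and theta = +1.
    return theta**2 + rng.normal(scale=0.1)

y_obs = 1.0  # a sampler biased toward early finishers would mostly
             # recover the fast theta < 0 mode
```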

- In the section titled “Sampling from unbiased proposal solves bias”, the authors correctly state that “LA Pre may fail in some situations, which also demonstrates that ABC-SMC algorithms are sensitive to potential bias in the proposal distribution”. Given a sufficient number of generations, would it be possible to use “LA Cur” in the first generations, as a sort of “burn-in”, and then switch to “LA Pre” once the probability of a bias towards short-running simulations is lessened? A sketch of this idea follows below.
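[Editorial note: the reviewer's suggestion could be sketched as follows; this is hypothetical, the function name, t_switch, and the switching rule are invented, and the paper does not evaluate such a hybrid. The idea is to build the proposal for generation t from the last completed population during burn-in, and from the preliminary population afterwards.]

```python
def choose_proposal_population(t, t_switch, pop_prev2, pop_prelim):
    """Select the population from which to build the proposal g_t.

    t:          current generation index
    t_switch:   generation at which to switch from LA Cur to LA Pre
    pop_prev2:  last fully completed population P_{t-2} (LA Cur source)
    pop_prelim: preliminary population for P_{t-1} (LA Pre source)
    """
    if t < t_switch:
        # Burn-in: LA Cur is unbiased but lags one generation behind.
        return pop_prev2
    # Later: LA Pre is more current, but may carry a preliminary bias
    # toward short-running simulations.
    return pop_prelim
```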

- In section “3.6 Scales to realistic application problems”, is it possible that the “mean slow-down of 39% (median 43%) over DYN” observed for “LA Cur” is the result of a slower refinement of the preliminary proposal distribution? In other words, for a sufficiently complex model, basing the proposal distribution on P_{t-2} implies that the search space for each generation is larger than it would be if the proposal distribution were based on P_{t-1}. Could this explain the observed slow-down?

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Feb 22;19(2):e0294015. doi: 10.1371/journal.pone.0294015.r002

Author response to Decision Letter 0


13 Oct 2023

Please find our response to all reviewer comments in the Response PDF file.

Attachment

Submitted filename: LookAheadABC_Response.pdf

pone.0294015.s002.pdf (493KB, pdf)

Decision Letter 1

Abel CH Chen

25 Oct 2023

A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation

PONE-D-23-20726R1

Dear Dr. Schälte,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Abel C.H. Chen

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: After carefully re-reading the updated manuscript and the authors’ replies, I feel that the authors have addressed all comments of the two reviewers, both in their response letter and in the updated manuscript. I appreciate the change in nomenclature and the detailed description of the test problems. I believe that this has improved the manuscript, which is now clearer and more comprehensive.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Acceptance letter

Abel CH Chen

3 Nov 2023

PONE-D-23-20726R1

A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation

Dear Dr. Schälte:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Abel C.H. Chen

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Further details on the methods and results.

    (PDF)

    pone.0294015.s001.pdf (4.1MB, pdf)
    Attachment

    Submitted filename: LookAheadABC_Response.pdf

    pone.0294015.s002.pdf (493KB, pdf)

    Data Availability Statement

    The code underlying this study can be found at https://github.com/EmadAlamoudi/Lookahead_study. A snapshot of code and data can be found at https://doi.org/10.5281/zenodo.7875905.

