Abstract
Parameter inference and uncertainty quantification are important steps when relating mathematical models to real-world observations and when estimating uncertainty in model predictions. However, methods for doing this can be computationally expensive, particularly when the number of unknown model parameters is large. The aim of this study is to develop and test an efficient profile likelihood-based method, which takes advantage of the structure of the mathematical model being used. We do this by identifying specific parameters that affect model output in a known way, such as a linear scaling. We illustrate the method by applying it to three toy models from different areas of the life sciences: (i) a predator–prey model from ecology; (ii) a compartment-based epidemic model from health sciences; and (iii) an advection–diffusion reaction model describing the transport of dissolved solutes from environmental science. We show that the new method produces results of comparable accuracy to existing profile likelihood methods but with substantially fewer evaluations of the forward model. We conclude that our method could provide a much more efficient approach to parameter inference for models where a structured approach is feasible. Computer code to apply the new method to user-supplied models and data is provided via a publicly accessible repository.
Keywords: environmental modelling, epidemic model, maximum-likelihood estimation, optimization, predator–prey model, profile likelihood
1. Introduction
Parameter inference and uncertainty quantification are important whenever we wish to interpret real-world data or make predictions of those data using mathematical models. This is especially true for modelling applications in the life sciences where data are often scarce and uncertain. Commonly used methods include tools from both frequentist (e.g. maximum-likelihood estimation and profile likelihood) [1,2] and Bayesian statistics (e.g. Markov chain Monte Carlo methods and approximate Bayesian computation) [3–5]. These methods can be computationally expensive, particularly for mathematical models with many unknown parameters, resulting in a high-dimensional search space. This has led to a significant body of literature concerned with improving the efficiency of parameter inference methods.
In 2014, Hines et al. [6] reviewed the application of Bayesian Markov chain Monte Carlo (MCMC) sampling methods for parameter estimation and parameter identifiability for a range of ordinary differential equation (ODE)-based models of chemical and biochemical networks. Their results indicated that parameter non-identifiability can be detected through MCMC chains failing to converge. Similar observations were made by Siekmann et al. [7] for a range of continuous-time Markov chain models used in the study of cardiac electrophysiology. In 2020, Simpson et al. [8] compared MCMC sampling with a profile likelihood-based method for parameter inference and parameter identifiability for a range of nonlinear partial differential equation (PDE)-based models used to interrogate cell biology experiments. The authors studied identifiable and non-identifiable problems and found that, while both the MCMC and profile likelihood-based approaches gave similar results for identifiable models, only the profile likelihood approach provided mechanistic insight for non-identifiable problems. They also found that profile likelihood was approximately an order of magnitude faster to run than MCMC sampling, regardless of whether the problem was identifiable or not. The speed-up in computation is a consequence of the fact that it is often faster to use numerical optimization methods compared to sampling methods. Raue et al. [9] suggested the sequential use of profile likelihood to constrain prior distributions before applying MCMC sampling in the face of non-identifiability.
While profile likelihood has long been used to assess parameter identifiability and parameter estimation [10–14], Simpson & Maclaren [15] recently presented a profile likelihood-based workflow covering identifiability, estimation and prediction. The workflow uses computationally efficient optimization-based methods to estimate the maximum likelihood and then explores the curvature of the likelihood through a series of univariate profile likelihood functions that target one parameter at a time. This workflow then propagates uncertainty in parameter estimates into model predictions to provide insight into how variability in data leads to uncertainty in model predictions. While the initial presentation of the profile likelihood-based workflow focused on deterministic models, the same ideas can be applied to stochastic models whenever a surrogate likelihood function is available [1].
For many mathematical modelling applications, there is little alternative but to either assume a fixed value for a particular parameter or include it as a target for inference. However, in some cases, there may be parameters whose values are unknown, but whose effect on the model solution is known to be a simple linear scaling or some other known transformation, as we will demonstrate later. In such cases, finding the optimal (likelihood-maximizing) values of these parameters is trivial if the other model parameters are known. Typically, however, the other model parameters are also unknown. Performing standard inference procedures on the full set of unknown parameters is then inefficient, as it fails to make use of the simple scaling relationship associated with some of the parameters.
As a motivating example, Lustig et al. described an epidemiological model that was fitted to data in real time and used to provide policy advice during the COVID-19 pandemic [16]. The model consisted of a system of several thousand ODEs, and model fitting and uncertainty quantification were done using an approximate Bayesian computation method targeting 11 unknown parameters. This resulted in a computationally intensive problem with a high-dimensional parameter space. Two of the fitted parameters were multiplicative factors on the proportion of model infections that led to hospitalization and death, respectively, in each age group and susceptibility class. Adjusting either or both of these parameters while holding the other parameters fixed would linearly scale the time series for expected daily hospital admissions and deaths output by the model. Thus, for a given combination of the other nine fitted parameters, it should be possible to find the optimal values for these two parameters without a costly re-evaluation of the full model.
Here, we propose a new approach, which we term structured inference, that exploits the known scaling relationship between certain parameters and the model solution. Our approach is similar to that of Loos et al. [17], who developed a hierarchical parameter inference method for ODE models with additive Gaussian or Laplace noise where some of the parameters, which they termed ‘scaling parameters’, were simply multiplicative factors on the expected value of the observed variables. This method was generalized by Schmiester et al. [18] to include offset parameters. Like Loos et al. and Schmiester et al. [17,18], our approach recasts an $n$-dimensional optimization problem as a nested pair of lower dimensional problems, effectively reducing the dimensionality of the search space. However, our methods go beyond those of [17,18] in two important aspects: (1) those studies focus on parameter estimation, whereas we consider parameter estimation, practical identifiability and uncertainty quantification via the profile likelihood; (2) we design the method to be applicable in more general settings, including classes of mathematical models other than ODE models, more general parameter relationships than just multiplicative scaling and more general noise models.
We illustrate our method using canonical toy models representing three case studies drawn from different areas of the life and physical sciences: ecological species interactions, epidemiological dynamics and pollutant transport and deposition. These examples demonstrate some of the features described under (2) above, e.g. Poisson and negative binomial noise models in the first two examples and a PDE model with a non-trivial parameter relationship in the third example. In each case study, we use the model structure to identify parameters that have a known scaling effect on model output. We then show that implementing a structured inference approach results in solutions with very similar accuracy, but substantially fewer calls to the forward model solver. This shows that our method could potentially offer significant reduction in computation time when applied to more complex models, which are computationally expensive to solve. Alongside this article, we provide a fully documented code for implementing our method, with instructions for how it can be applied to user-supplied models and data.
2. Methods
2.1. Unstructured and structured inference
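Suppose a model has $n$ unknown parameters denoted $\theta$. We assume that a likelihood function $L(\theta \mid y_{\text{obs}})$ is available for the model for given observed data $y_{\text{obs}}$. A standard approach to parameter inference is to perform maximum-likelihood estimation in the $n$-dimensional parameter space. The maximum-likelihood estimate (MLE) for the parameters is the solution of the optimization problem

$$\hat{\theta} = \underset{\theta}{\operatorname{argmax}}\; L(\theta \mid y_{\text{obs}}). \tag{2.1}$$

When solving this problem numerically, each call to the objective function with a given combination of parameters $\theta$ typically requires calculation of the model solution, denoted $y(\theta)$, in order to calculate the likelihood. This solution could involve solving systems of ODEs or PDEs, depending on the modelling context, using either approximate or exact methods. We refer to this as the ‘basic method’.
In this article, we propose and test a modification to this method. Our modified method can be applied to models where the vector of parameters $\theta$ can be partitioned into $\theta = (\theta_{\text{outer}}, \theta_{\text{inner}})$ such that there is a known relationship between the model solutions for parameters with the same value of $\theta_{\text{outer}}$ but different values of $\theta_{\text{inner}}$. This relationship can be expressed in the form of a transformation,

$$y(\theta_{\text{outer}}, \theta_{\text{inner}}) = f\big(y(\theta_{\text{outer}}, \theta_{\text{inner}}^{0}),\, \theta_{\text{inner}}\big) \tag{2.2}$$

for some known function $f$, which we assume is relatively cheap to compute compared to solving the full forward model for $y(\theta_{\text{outer}}, \theta_{\text{inner}})$, and where $\theta_{\text{inner}}^{0}$ is a fixed reference value of the inner parameters. The simplest example is where the model solution is directly proportional to $\theta_{\text{inner}}$, in which case this transformation is a simple linear scaling:

$$y(\theta_{\text{outer}}, \theta_{\text{inner}}) = \frac{\theta_{\text{inner}}}{\theta_{\text{inner}}^{0}}\, y(\theta_{\text{outer}}, \theta_{\text{inner}}^{0}). \tag{2.3}$$

This type of scaling relationship enables a more efficient approach to maximum-likelihood estimation because, for fixed $\theta_{\text{outer}}$, the optimal value of $\theta_{\text{inner}}$ may be calculated with only a single run of the forward model. Thus the $n$-dimensional optimization problem in equation (2.1) may be replaced by a nested pair of lower dimensional problems:

$$\hat{\theta}_{\text{outer}} = \underset{\theta_{\text{outer}}}{\operatorname{argmax}}\; L\big(\theta_{\text{outer}},\, \theta_{\text{inner}}^{*}(\theta_{\text{outer}}) \mid y_{\text{obs}}\big), \tag{2.4}$$

$$\theta_{\text{inner}}^{*}(\theta_{\text{outer}}) = \underset{\theta_{\text{inner}}}{\operatorname{argmax}}\; L(\theta_{\text{outer}}, \theta_{\text{inner}} \mid y_{\text{obs}}). \tag{2.5}$$

To solve the inner optimization problem in equation (2.5), the full forward model only needs to be run once, for the reference value $\theta_{\text{inner}}^{0}$. The model solution for this reference value can then be transformed via equation (2.2) to find the model solution for any value of $\theta_{\text{inner}}$. We refer to this as the ‘structured method’ (see figure 1 for a schematic illustration of the basic and structured methods).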
Figure 1.
Schematic illustration of: (a) the basic maximum-likelihood method and (b) the structured maximum-likelihood method. The basic method is a single optimization problem over the full parameter vector. The structured method nests a lower dimensional inner optimization problem for the inner parameters (blue) within a lower dimensional outer optimization problem for the outer parameters (black). The inner optimization problem can be solved numerically without rerunning the forward model. Candidate parameter values will typically be chosen using a standard optimization algorithm.
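The nesting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the exponential-decay forward model, the names `forward`, `transform`, `inner_solve` and `structured_mle`, the parameter bounds and the golden-section optimizer are all hypothetical choices made for this example. The point it shows is that the forward model is called once per candidate value of the outer parameter, while the inner search reuses the reference solution.

```python
import math

calls = {"forward": 0, "loglik": 0}       # counters to compare costs
TIMES = [0.5 * i for i in range(1, 21)]   # hypothetical observation times

def forward(rate, scale):
    """Toy forward model (hypothetical): expected counts scale*100*exp(-rate*t)."""
    calls["forward"] += 1
    return [scale * 100.0 * math.exp(-rate * t) for t in TIMES]

def transform(y_ref, scale, scale_ref=1.0):
    """Linear scaling of a reference solution; needs no forward-model call."""
    return [scale / scale_ref * y for y in y_ref]

def poisson_loglik(mean, data):
    """Poisson log-likelihood, dropping the term that does not depend on the mean."""
    calls["loglik"] += 1
    return sum(x * math.log(m) - m for x, m in zip(data, mean))

def golden_max(fn, lo, hi, tol):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = fn(c), fn(d)
    while b - a > tol:
        if fc > fd:                       # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = fn(c)
        else:                             # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = fn(d)
    return (a + b) / 2.0

def inner_solve(rate, data):
    """Inner problem: one forward run at the reference scale, then a cheap 1-D search."""
    y_ref = forward(rate, 1.0)
    objective = lambda s: poisson_loglik(transform(y_ref, s), data)
    s_star = golden_max(objective, 1e-6, 1.0, 1e-8)
    return s_star, objective(s_star)

def structured_mle(data):
    """Outer problem: search over the outer parameter only."""
    r_hat = golden_max(lambda r: inner_solve(r, data)[1], 0.01, 2.0, 1e-6)
    s_hat, _ = inner_solve(r_hat, data)
    return r_hat, s_hat

# Noise-free synthetic "data" (non-integer values are fine for this illustration).
data = forward(0.3, 0.4)
calls["forward"] = 0
r_hat, s_hat = structured_mle(data)
print(r_hat, s_hat, calls)
```

In this toy problem each forward-model call supports dozens of likelihood evaluations in the inner search, which is where a saving would arise when the forward model is expensive.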
In this work, given a log-likelihood function, $\ell(\theta \mid y_{\text{obs}})$, we rescale to form a normalized log-likelihood $\hat{\ell}(\theta \mid y_{\text{obs}}) = \ell(\theta \mid y_{\text{obs}}) - \ell(\hat{\theta} \mid y_{\text{obs}})$, where $\hat{\theta}$ is the MLE so that $\hat{\ell}(\hat{\theta} \mid y_{\text{obs}}) = 0$. Profile likelihoods are constructed by partitioning the full parameter $\theta$ into interest parameters $\psi$ and nuisance parameters $\lambda$, such that $\theta = (\psi, \lambda)$ [19]. In this work, we construct a series of univariate profiles by specifying the interest parameter to be a single parameter of interest in $\theta$, and the nuisance parameters are the remaining parameters in $\theta$. For a fixed set of data $y_{\text{obs}}$, the profile log-likelihood for the interest parameter $\psi$ given the partition $(\psi, \lambda)$ is

$$\hat{\ell}_p(\psi \mid y_{\text{obs}}) = \sup_{\lambda \mid \psi}\, \hat{\ell}(\psi, \lambda \mid y_{\text{obs}}), \tag{2.6}$$

which implicitly defines a function $\lambda^{*}(\psi)$ for optimal values of the nuisance parameters as a function of the target parameters for the given dataset. For univariate parameters, this is simply a function of one variable. For identifiable parameters, these profile likelihoods will contain a single peak at the MLE.
The degree of curvature of the log-likelihood function is related to inferential precision [19]. Since, in general, univariate profiles for identifiable parameters involve a single peak at the MLE, one way to quantify the degree of curvature of the log-likelihood function is to form asymptotic confidence intervals (CIs) by finding the interval where $\hat{\ell}_p \ge -\Delta_{q,\nu}/2$, where $\Delta_{q,\nu}$ denotes the $q$th quantile of the $\chi^2$ distribution with $\nu$ degrees of freedom [20]. For univariate profiles, $\nu = 1$. In this study, we identify the interval in the interest parameter $\psi$ where $\hat{\ell}_p \ge -\Delta_{0.95,1}/2 \approx -1.92$, which identifies the 95% CI.
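As a quick check of the 95% threshold, the 0.95 quantile of the chi-squared distribution with one degree of freedom can be computed from the standard normal quantile, because a chi-squared variable with one degree of freedom is the square of a standard normal variable. The helper name `chi2_quantile_1dof` is made up for this sketch, which uses only the Python standard library.

```python
from statistics import NormalDist

def chi2_quantile_1dof(q):
    """q-th quantile of the chi-squared distribution with 1 degree of freedom."""
    z = NormalDist().inv_cdf((1.0 + q) / 2.0)  # two-sided normal quantile
    return z * z

# Threshold on the normalized profile log-likelihood for a 95% CI.
threshold = -chi2_quantile_1dof(0.95) / 2.0
print(round(threshold, 2))  # -1.92
```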
2.2. Case studies
2.2.1. Predator–prey model: ecology
Our first case study is a simple toy model for a prey species and predator species , assuming a logistic growth for the prey and a type II functional response (i.e. saturating at high prey density) for the predation term:
| (2.7) |
| (2.8) |
This is a type of Rosenzweig–MacArthur model [21], which is a generalization of the classical Lotka–Volterra predator–prey model [22,23].
We suppose that some fixed fraction of the prey and the predator populations is observed on average at each time point , and observed data are Poisson distributed:
| (2.9) |
| (2.10) |
where and are the expected number of prey and predators observed, respectively. For illustrative purposes, we assume the initial conditions and the parameters and are known (e.g. from information on the environmental conditions and average prey handling time), and perform inference on the parameter set .
For the structured inference method, we use as the inner parameter . We set the reference value of the inner parameter as , noting that if is the model solution with , then the solution for any given value of is simply . Thus, under the structured method, the ODE model (2.7) and (2.8) only ever needs to be solved with . When evaluation of the model likelihood requires the expected value of observed data for any other value of , this is obtained via this simple linear scaling. Note that the choice to set is arbitrary and any fixed value of could be chosen.
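For the special case of Poisson observations whose mean scales linearly with the inner parameter, the inner optimization problem has a closed-form solution: setting the derivative of the Poisson log-likelihood to zero gives the ratio of total observed counts to total expected counts at the reference value. The sketch below illustrates this shortcut. It is an observation about the Poisson case rather than a step described in the article (which solves all optimization problems numerically), and the function names and numbers are hypothetical. With two observed series sharing one scaling parameter, as in the predator–prey model, the sums would run over both series.

```python
import math

def poisson_scale_mle(y_ref, data, scale_ref=1.0, upper=1.0):
    """Closed-form maximizer of the Poisson log-likelihood over a linear scale.

    y_ref is the model solution at scale_ref; the result is clipped to `upper`
    because the log-likelihood is concave in the scale.
    """
    return min(upper, scale_ref * sum(data) / sum(y_ref))

def poisson_loglik(scale, y_ref, data):
    """Poisson log-likelihood (up to a constant) at a given scale, with scale_ref = 1."""
    return sum(x * math.log(scale * y) - scale * y for x, y in zip(data, y_ref))

# Hypothetical reference solution and observed counts.
y_ref = [10.0, 20.0, 30.0, 40.0]
data = [4, 9, 11, 16]
p_hat = poisson_scale_mle(y_ref, data)
print(p_hat)  # 0.4
```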
2.2.2. SEIRS model: epidemiology
Our second case study is a SEIRS compartment model for an epidemic in a closed population of size [24]. We assume that the population at time is divided into compartments representing the number of people who are susceptible (), exposed (), infectious () and recovered (). To model waning immunity, we subdivide the recovered compartment into two compartments and and assume that individuals transition from to and subsequently back to . This is described by the following system of ordinary differential equations:
| (2.11) |
| (2.12) |
| (2.13) |
| (2.14) |
| (2.15) |
where , and are transition rates representing the reciprocal of the mean latent period, infectious period and immune period, respectively, and is the force of infection defined as
| (2.16) |
where is the basic reproduction number. The model is initialized with a specified number of exposed individuals by setting , and all other variables equal to zero.
We assume that, on average, a fixed proportion of infections are observed. This could represent epidemiological surveillance data, for example the number of notified cases under the assumption that surveillance effort is steady over time. Alternatively, it could represent a particular clinical outcome, such as hospital admission. Unknown case ascertainment or uncertain clinical severity is common, particularly for new or emerging infectious diseases. Hence, joint inference of this parameter with parameters of the SEIRS model, in situations where these are identifiable, is an important task in outbreak modelling [16,25,26].
Observations are not typically recorded instantaneously at the time of infection, so we assume there is an average time lag of from the time an individual becomes infectious to the time they are observed. These assumptions are modelled via the following differential equations:
| (2.17) |
| (2.18) |
Here, represents the number of infected individuals who will be observed but have not yet been observed and is the cumulative number of observed infections. The expected number of observations per day at time is denoted and is equal to .
We assume that the number of observed cases on day is drawn from a negative binomial distribution (which is a commonly used distribution in epidemic modelling when observed cases may have more stochasticity than under a binomial or Poisson distribution [27,28]) with mean and dispersion factor :
We assume that the initial conditions and the transition rates , and are known (e.g. from independent epidemiological data on the average latent period and infectious period), and perform inference on the parameter set .
Similar to the predator–prey model, we use as the inner parameter and set , noting that if is the model solution with , then the solution for any given value of is simply . Again, the choice to set is arbitrary and any fixed value of could be chosen.
2.2.3. Advection–diffusion reaction model: environmental science
In the context of environmental modelling and pollution management, advection–diffusion reaction models are often used to study how dissolved solutes are spatially distributed over time. Very often the equation for the solute concentration $C(x,t)$ at position $x$ and time $t$ includes a source term to describe chemical reactions [29]. A common approach is to assume that chemical reactions are fast relative to transport and to adopt a linear isotherm model. This gives one of the most common models of reaction transport:

$$\frac{\partial C}{\partial t} = D\frac{\partial^2 C}{\partial x^2} - v\frac{\partial C}{\partial x} - \frac{\partial S}{\partial t}, \tag{2.19}$$

where $D$ is the diffusivity, often called the dispersion coefficient in solute transport modelling literature, $v$ is the advection velocity and $\partial S/\partial t$ is a sink term that represents a chemical reaction that converts $C$ into some immobile product $S$. A common form for the sink term is $\partial S/\partial t = k_f C - k_b S$, where $k_f$ is the forward reaction rate and $k_b$ is the backward reaction rate. In the limiting case where the chemical reactions occur instantaneously, we have $S = (k_f/k_b)\,C$, which we rewrite as $S = (R-1)\,C$, where $R = 1 + k_f/k_b$ is a dimensionless parameter known as the retardation factor. Substituting this relationship into equation (2.19) gives

$$R\,\frac{\partial C}{\partial t} = D\frac{\partial^2 C}{\partial x^2} - v\frac{\partial C}{\partial x}. \tag{2.20}$$

This shows that the presence of chemical reactions effectively retards the dispersion and advection processes since the chemical reaction leads to effective diffusion and advection rates of $D/R$ and $v/R$, respectively.
While, in general, this transport equation can be solved numerically for a range of initial and boundary conditions, exact solutions can also be used where appropriate. In one spatial dimension with initial condition $C(x,0) = 0$ (for $x > 0$) and boundary condition $C(0,t) = C_0$, the exact solution is [30,31]

$$C(x,t) = \frac{C_0}{2}\left[\operatorname{erfc}\!\left(\frac{x - vt/R}{2\sqrt{Dt/R}}\right) + \exp\!\left(\frac{vx}{D}\right)\operatorname{erfc}\!\left(\frac{x + vt/R}{2\sqrt{Dt/R}}\right)\right], \tag{2.21}$$

$$S(x,t) = (R-1)\,C(x,t). \tag{2.22}$$
We assume that the solute species $C$ is not observed directly, and observations of the product species $S$ are taken at a fixed time $T$ and at a series of points in space $x_i$, and are subject to Gaussian noise with standard deviation $\sigma$. Thus, observed data take the form

$$S_{\text{obs},i} \sim \mathcal{N}\big(S(x_i, T),\, \sigma^2\big). \tag{2.23}$$

We assume that the solute boundary concentration $C_0$ is known and perform inference on the parameter set $\theta = (D, v, R, \sigma)$. In this instance, we take the most fundamental approach in forming a likelihood function by assuming that observations are normally distributed about the solution of the PDE model with a constant standard deviation. This is a commonly employed approach; however, our likelihood-based approach can be employed using a range of measurement models [32].
For the structured method, we exploit a symmetry in the analytical solution in equation (2.21), namely that

$$C(x, t;\, D, v, R) = C(x, t/R;\, D, v, 1). \tag{2.24}$$

We, therefore, use the retardation factor $R$ as the inner parameter. We set the reference value $R_0 = 1$ and note that if $C^{0}(x,t)$ is the model solution for the solute concentration $C$ when $R = R_0$ then equations (2.22) and (2.24) imply that the solution for the product $S$ for an arbitrary value of $R$ is

$$S(x,t) = (R-1)\,C^{0}(x, t/R). \tag{2.25}$$

This relationship between the inner parameter and the model solution is more complicated than the simple multiplicative scaling seen in the first two case studies. As equation (2.24) shows, changing the value of $R$ is equivalent to rescaling one of the independent variables ($t$) in the PDE for $C$. Therefore, evaluating the solution for a different value of $R$ is equivalent to evaluating the reference solution at a different value of $t$, as well as changing the multiplicative factor that relates $C$ and $S$. Since a numerical scheme for an advection–diffusion PDE will typically generate solution values at a series of closely spaced time points, it will generally be straightforward to query the reference solution at an earlier time point. This allows the solution for arbitrary values of $R$ to be obtained without costly re-evaluation of the numerical solution scheme.
To emulate this situation, for given values of the outer parameters $D$ and $v$, we generate the reference solution by evaluating equation (2.21) with $R = 1$ at a series of equally spaced time points from $t = 0$ to $t = T$, where $T$ is the observation time. When the solution for $S$ in equation (2.25) subsequently requires $C^{0}$ to be queried at $t = T/R$, we approximate this using linear interpolation between the nearest two values of $t$. Here, choosing $R_0 = 1$ is beneficial because, since $R \ge 1$, it means that the reference solution only ever needs to be queried at values of $t$ within the range of the numerical solution, $0 \le t \le T$.
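The interpolation step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. It assumes the closed-form solute solution takes the standard form for a constant-concentration boundary with retardation factor `R`, and that the product concentration equals `(R - 1)` times the solute concentration. The function names, grid sizes and parameter values are hypothetical. A reference solution is tabulated once with `R = 1`, and the product profile for another value of `R` is then obtained by querying that table at time `T / R`.

```python
import math

def conc(x, t, D, v, R=1.0, c0=1.0):
    """Assumed closed-form solute concentration with retardation factor R."""
    if t <= 0.0:
        return 0.0
    tau = t / R                              # retardation rescales time
    s = 2.0 * math.sqrt(D * tau)
    return 0.5 * c0 * (math.erfc((x - v * tau) / s)
                       + math.exp(v * x / D) * math.erfc((x + v * tau) / s))

# Hypothetical outer parameters, observation time and spatial points.
D, v, T = 1.0, 0.5, 10.0
xs = [0.5 * j for j in range(1, 21)]
n_steps = 200
times = [T * k / n_steps for k in range(n_steps + 1)]

# Reference solution with R = 1, stored at every time point (computed once).
reference = [[conc(x, t, D, v) for x in xs] for t in times]

def product_from_reference(reference, times, R, T):
    """Product profile at time T for retardation R, by linear interpolation in time."""
    t_query = T / R                          # lies in [0, T] whenever R >= 1
    dt = times[1] - times[0]
    k = min(int(t_query / dt), len(times) - 2)
    w = (t_query - times[k]) / dt
    return [(R - 1.0) * ((1.0 - w) * a + w * b)
            for a, b in zip(reference[k], reference[k + 1])]

R = 3.0
approx = product_from_reference(reference, times, R, T)
exact = [(R - 1.0) * conc(x, T, D, v, R) for x in xs]
max_err = max(abs(a - e) for a, e in zip(approx, exact))
print(max_err)
```

The printed value is the largest discrepancy between the interpolated and directly evaluated profiles; refining the time grid of the reference solution would be expected to reduce it.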
We note that, although we have studied the one-dimensional version of this model for illustrative purposes, our approach could also be used in higher dimensions. Under these circumstances, numerically solving the PDE is more computationally expensive, so any efficiency improvement from taking advantage of the model structure would be amplified.
2.3. Computational methods
For each model case study, we first solved the underlying deterministic model equations with fixed parameter values (see electronic supplementary material, table S1). This was done via a numerical solution of the ODEs for the first two case studies, whereas the analytical solution was used for the third case study. Once the deterministic solution was calculated, we then generated a synthetic dataset by simulating the specified noise model.
To perform parameter inference using the basic method, we first found the MLE for the target parameter set by solving the optimization problem in equation (2.1). We then computed univariate likelihood profiles for each target parameter by taking a uniform mesh over some interval with a modest number of mesh points and solving equation (2.6) to find the profile log-likelihood at each mesh point. In practice, we determined an appropriate interval by computing the profile log-likelihood across a trial interval and, if necessary, widening the interval until it contained the 95% CI.
For the first mesh point to the right of the MLE, the optimization algorithm was initialized using the MLE values for each parameter. For each subsequent mesh point stepping from left to right, the algorithm was initialized using the previous profile solution. This procedure was then repeated for mesh points to the left of the MLE by stepping right to left. The boundaries of the 95% CI were estimated using one-dimensional linear interpolation to compute the value of the target parameter at which the normalized profile log-likelihood equalled the threshold value of $-\Delta_{0.95,1}/2 \approx -1.92$.
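The interpolation of the CI boundaries can be sketched as below. This is a minimal illustration, assuming the normalized profile is available on a mesh and crosses the threshold once on each side of its peak. The function name `ci_from_profile` and the quadratic test profile are hypothetical.

```python
import math

def ci_from_profile(mesh, profile, threshold=-1.92):
    """Estimate CI end points by linear interpolation of threshold crossings."""
    crossings = []
    for i in range(len(mesh) - 1):
        l0, l1 = profile[i], profile[i + 1]
        # A sign change of (profile - threshold) brackets a crossing.
        if (l0 - threshold) * (l1 - threshold) < 0.0:
            frac = (threshold - l0) / (l1 - l0)
            crossings.append(mesh[i] + frac * (mesh[i + 1] - mesh[i]))
    if len(crossings) < 2:
        raise ValueError("interval does not contain the CI; widen the mesh")
    return crossings[0], crossings[-1]

# Hypothetical quadratic profile peaked at 2 on a uniform mesh over [0, 4].
mesh = [0.1 * i for i in range(41)]
profile = [-2.0 * (m - 2.0) ** 2 for m in mesh]
lower, upper = ci_from_profile(mesh, profile)
print(lower, upper)
```

For this quadratic profile the exact end points are `2 - sqrt(0.96)` and `2 + sqrt(0.96)`, so the interpolated values can be compared against them directly. The `ValueError` branch corresponds to the case where the trial interval would need widening.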
We used a similar procedure for the structured inference method, finding the MLE by solving the nested optimization problem in equations (2.4) and (2.5) (see figure 1b). We then constructed univariate profiles as described above, but with equation (2.6) similarly recast as a nested pair of optimization problems. This means that, with the structured method, there is one fewer parameter to profile because, for a given dataset, the inner parameter is treated as a deterministic function of the outer parameters and so does not need to be profiled separately.
All optimization problems were solved numerically using the constrained optimization function fmincon in Matlab R2022b with the interior point algorithm and default tolerances. We calculated the relative error between the MLE and the true parameter values, and the number of calls to the forward model required by both the basic and structured methods. This enabled us to compare the accuracy and efficiency of the two methods. Because the results will depend on the particular realization of the noise model, we repeated this process on multiple independently generated datasets for each case study, calculating the MLE and 95% CIs for each dataset using fixed parameter values (see electronic supplementary material, table S1). We report the median and interquartile range of the relative error and the number of function calls across the datasets. We also report the proportion of the datasets for which the 95% CI for each parameter contained the true value. Finally, we carried out a sensitivity analysis by randomly varying the model parameters for each synthetically generated dataset (see electronic supplementary material for details).
Documented Matlab code that implements both the basic and the structured method on the three models described above is publicly available at: https://github.com/michaelplanknz/structured-inference. The code can also be run on a user-supplied model, either with synthetically generated data from the specified model or with user-supplied data. This requires the user to provide a function that returns the model solution for specified parameter values, and a separate function that transforms the reference model solution to the model solution for a specified value of the inner parameters. The user can choose from a range of pre-supplied noise models, including additive or multiplicative Gaussian, Poisson and negative binomial. Detailed instructions for running the method on a user-supplied model and/or data are available in the ReadMe file at the URL above.
3. Results
3.1. Predator–prey model
The model exhibited limit cycle dynamics for the parameter values chosen, completing around three cycles during the time period observed (figure 2). Both the basic and the structured methods were able to successfully recover the correct parameter values from the observed data. The univariate likelihood profiles were almost identical for both methods and were unimodal, implying that the parameters are identifiable (figure 3). The 95% confidence intervals (range of values above the dotted horizontal line in figure 3) contained the true parameter value for all target parameters for both methods. Obtaining the same results from both methods was an expected result, which provided confirmation that the structured method was working correctly.
Figure 2.
Results for the predator–prey model from: (a) the basic method and (b) the structured method. Each panel shows the solution under the actual parameter values (solid curves); the solution under the maximum-likelihood estimate for the parameter values (dashed curves); and the simulated data (dots) for the prey (blue) and predator (red). Where the dashed curve representing the solution at the MLE is not visible, this is because it coincides with the solid curve representing the solution at the true parameter values.
Figure 3.
Normalized likelihood profiles for inferred parameters of the predator–prey model: (a) prey intrinsic growth rate, (b) predation coefficient, (c) predator death rate and (d) observation probability. The blue curves are from the basic method; the red curves are from the structured method; dashed vertical lines indicate the actual values (black) and the maximum-likelihood estimates for each parameter under the basic method (blue) and structured method (red). Where the blue curve is not visible, this is because it coincides exactly with the red curve. The dotted horizontal line shows the threshold normalized log-likelihood for the 95% confidence interval.
When run on independently generated synthetic datasets, both methods had similar levels of accuracy in the MLE, with a median relative error of 0.5% (interquartile range [0.3%, 0.7%]) (table 1). The coverage properties of both methods were good, with the 95% CI containing the true parameter value in 94–96% of cases for both methods. However, the structured method required a median of only 7492 calls (i.e. evaluations of the forward model), compared to 15 054 for the basic method. The reduction in the number of calls for the structured method compared to the basic method was 50.2% (interquartile range [48.5%, 52.1%]). The efficiency improvement was greater in the profile likelihood component of the algorithm than in the calculation of the MLE, where the reduction in calls was 18.4% [6.3%, 28.8%]. This is partly because, with the structured method, the inner parameter is effectively a function of the outer parameters (i.e. the solution of the inner optimization problem in equation (2.5)), so does not need to be profiled separately. However, there was also a reduction in the number of calls per parameter profile.
Table 1.
Relative error in the maximum-likelihood estimate, coverage of the 95% CI (i.e. the proportion of datasets for which the 95% CI contained the true parameter value) for each profiled parameter, and the number of calls to the forward model solver (number to calculate the MLE, number to calculate the likelihood profiles and total number) under the basic and structured inference methods for each of the three model case studies. Note that coverage statistics for the inner parameter of each model are not applicable for the structured method, as the inner parameter is calculated as a function of the outer parameters. The ‘improvement’ column shows the reduction in the number of function calls under the structured method relative to the basic method. Results show the median and interquartile range across independently generated datasets for each model.
| quantity | model | parameter | basic | structured | improvement (%) |
|---|---|---|---|---|---|
| relative error (%) | predator–prey | | 0.5 [0.3, 0.7] | 0.5 [0.3, 0.7] | |
| relative error (%) | SEIRS | | 2.6 [1.3, 4.9] | 2.6 [1.2, 4.6] | |
| relative error (%) | adv. diff. | | 11.0 [6.6, 17.3] | 11.0 [6.6, 17.3] | |
| 95% CI coverage | predator–prey | prey intrinsic growth rate | 94.8% | 94.8% | |
| 95% CI coverage | predator–prey | predation coefficient | 94.6% | 94.0% | |
| 95% CI coverage | predator–prey | predator death rate | 95.4% | 95.2% | |
| 95% CI coverage | predator–prey | observation probability (inner) | 95.4% | n/a | |
| 95% CI coverage | SEIRS | basic reproduction number | 89.0% | 92.2% | |
| 95% CI coverage | SEIRS | recovered to susceptible rate | 87.2% | 90.4% | |
| 95% CI coverage | SEIRS | observation probability (inner) | 86.2% | n/a | |
| 95% CI coverage | SEIRS | negative binomial dispersion | 88.2% | 91.8% | |
| 95% CI coverage | adv. diff. | | 89.8% | 89.8% | |
| 95% CI coverage | adv. diff. | | 90.8% | 90.8% | |
| 95% CI coverage | adv. diff. | (inner parameter) | 90.2% | n/a | |
| 95% CI coverage | adv. diff. | | 88.4% | 88.4% | |
| function calls (MLE) | predator–prey | | 208 [194, 228] | 171 [153, 194] | 18.4 [6.3, 28.8] |
| function calls (MLE) | SEIRS | | 143 [134, 154] | 130 [121, 143] | 9.2 [−2.6, 19.5] |
| function calls (MLE) | adv. diff. | | 173 [158, 191] | 98 [88, 111] | 42.9 [33.9, 50.3] |
| function calls (profiles) | predator–prey | | 14 839 [14 084, 15 815] | 7320 [7017, 7662] | 50.7 [49.0, 52.7] |
| function calls (profiles) | SEIRS | | 13 389 [12 854, 13 750] | 7339 [7123, 7545] | 44.9 [42.6, 46.7] |
| function calls (profiles) | adv. diff. | | 10 247 [9699, 10 893] | 4675 [4339, 5059] | 54.6 [51.5, 56.9] |
| function calls (total) | predator–prey | | 15 054 [14 294, 16 029] | 7492 [7184, 7834] | 50.2 [48.5, 52.1] |
| function calls (total) | SEIRS | | 13 542 [12 998, 13 900] | 7476 [7251, 7693] | 44.5 [42.3, 46.3] |
| function calls (total) | adv. diff. | | 10 424 [9870, 11 064] | 4772 [4433, 5158] | 54.4 [51.2, 56.7] |
3.2. SEIRS model
The model exhibits a large epidemic wave, followed by a series of successively smaller waves as the model approaches the stable endemic equilibrium (figure 4). Again, both the basic and the structured methods successfully recovered the correct parameter values and produced almost identical MLEs and univariate likelihood profiles (figure 5). The profiles were unimodal, meaning that the parameters were identifiable.
Figure 4.
Results for the SEIRS model from: (a) the basic method and (b) the structured method. Each panel shows the solution under the actual parameter values (solid curves); the solution under the maximum-likelihood estimate for the parameter values (dashed curves) and the simulated data (dots). Where the dashed curve representing the solution at the MLE is not visible, this is because it coincides with the solid curve representing the solution at the true parameter values.
Figure 5.
Normalized likelihood profiles for inferred parameters of the SEIRS model: (a) basic reproduction number, (b) recovered to susceptible rate, (c) observation probability and (d) negative binomial dispersion parameter for observed data. The blue curves are from the basic method; the red curves are from the structured method; dashed vertical lines indicate the actual values (black) and the maximum-likelihood estimates for each parameter under the basic method (blue) and structured method (red). Where the blue curve is not visible, this is because it coincides exactly with the red curve. The dotted horizontal line shows the threshold normalized log-likelihood for the 95% confidence interval.
When run on independently generated datasets, both methods again had similar accuracy, with a relative error in the MLE of 2.6% [1.3%, 4.9%] for the basic method, compared to 2.6% [1.2%, 4.6%] for the structured method (table 1). The coverage rates were slightly better for the structured method (90–93%) compared to the basic method (86–90%). However, the main advantage of the structured method was its efficiency, with a 44.5% [42.3%, 46.3%] reduction in the total number of calls required compared to the basic method. As for the predator–prey model, most of the efficiency improvement was in the profile likelihood component, although there was also a reduction of 9.2% [−2.6%, 19.5%] in the number of calls required for the MLE.
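As an illustration of how a 95% confidence interval of the kind assessed by these coverage rates can be read off a univariate profile, the following Python sketch applies the standard asymptotic threshold (minus one half of the 0.95 quantile of a chi-squared distribution with one degree of freedom, approximately −1.92) to a normalized profile log-likelihood evaluated on a grid. The function name `profile_ci`, the grid-based interval without interpolation, and the quadratic test profile are assumptions made for illustration; they are not taken from the accompanying repository.

```python
import numpy as np
from scipy.stats import chi2


def profile_ci(theta_grid, log_lik_profile, level=0.95):
    """Grid-based confidence interval from a univariate profile log-likelihood.

    Assumes the profile is unimodal and that the interval lies inside the grid.
    """
    # Normalize so that the maximum of the profile is zero
    normalized = log_lik_profile - np.max(log_lik_profile)
    # Asymptotic threshold for a single parameter of interest
    threshold = -0.5 * chi2.ppf(level, df=1)
    # Keep the grid points whose normalized profile lies above the threshold
    inside = theta_grid[normalized >= threshold]
    return inside.min(), inside.max(), threshold


# Hypothetical quadratic profile centred on 2.0 with standard error 0.1
theta_grid = np.linspace(1.5, 2.5, 2001)
profile = -0.5 * ((theta_grid - 2.0) / 0.1) ** 2
lower, upper, threshold = profile_ci(theta_grid, profile)
```

For a quadratic profile such as this one, the interval should agree with the familiar estimate plus or minus 1.96 standard errors, up to the grid spacing. Coverage can then be estimated as the proportion of simulated datasets for which the true parameter value falls between `lower` and `upper`.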
3.3. Advection–diffusion model
The model exhibits a sigmoidal decay in concentration with distance from the source at the observation time (figure 6). Again, both methods successfully recovered the correct parameter values. Although all univariate likelihood profiles were unimodal, the confidence intervals for the diffusion coefficient and noise magnitude (the range of values for which the profile lies above the dotted horizontal line in figure 7) were relatively wide compared with those for the other target parameters. This indicates that the diffusion coefficient and noise magnitude were relatively weakly identified by the data, and that a range of values of these parameters was approximately consistent with the observed data.
Figure 6.
Results for the advection–diffusion model from: (a) the basic method and (b) the structured method. Each panel shows the solution under the actual parameter values (solid curves); the solution under the maximum-likelihood estimate for the parameter values (dashed curves) and the simulated data (dots). Where the dashed curve representing the solution at the MLE is not visible, this is because it coincides with the solid curve representing the solution at the true parameter values.
Figure 7.
Normalized likelihood profiles for inferred parameters of the advection–diffusion model: (a) diffusion coefficient, (b) advection velocity, (c) retardation factor and (d) standard deviation of noise in observed data. The blue curves are from the basic method; the red curves are from the structured method; dashed vertical lines indicate the actual values (black) and the maximum-likelihood estimates for each parameter under the basic method (blue) and structured method (red). Where the blue curve is not visible, this is because it coincides exactly with the red curve. The dotted horizontal line shows the threshold normalized log-likelihood for the 95% confidence interval.
When run on independently generated datasets, the relative error was 11.0% [6.6%, 17.3%] for both methods (table 1). Coverage rates were reasonable (88–91% for both methods). The structured method required 54.4% [51.2%, 56.7%] fewer calls to the model than the basic method did. Even for the MLE component alone, the structured method provided a substantial improvement in efficiency with 42.9% [33.9%, 50.3%] fewer calls than the basic method.
3.4. Sensitivity analysis
The results described above for each model are for independent datasets, each generated using the same set of parameter values. To test the sensitivity of our results to the true parameter values, we repeated the analysis with datasets, each generated from an independently chosen set of parameter values. For each target parameter, we chose the value from an independent normal distribution with the same mean as previously (see electronic supplementary material, table S1) and with a coefficient of variation of .
The results are similar to those described above for fixed parameters. The accuracy and coverage properties are similar for both methods, but the structured method is consistently more efficient than the basic method (see electronic supplementary material for details).
4. Discussion
We have developed an adapted method for likelihood-based parameter inference and uncertainty quantification. The method is based on standard maximum-likelihood estimation and profile likelihood, but exploits known structure in the mechanistic model to split a high-dimensional optimization problem into a nested pair of lower dimensional problems using a similar approach to Loos et al. [17]. Only the outer problem requires the forward model to be solved at each step. Like standard maximum-likelihood estimation, our method can be used with a range of different types of mathematical models and noise models, provided a likelihood function exists.
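The nested structure described above can be sketched in Python for a deliberately simple case. The toy forward model (an exponential decay with rate `k`), the linear scaling parameter `a`, the Gaussian noise with known standard deviation and all function names are assumptions made for illustration and are not the paper's case studies or code. Under these assumptions the inner problem has a closed-form least-squares solution, so only the outer one-dimensional search calls the forward model; in other settings the inner problem may need to be solved numerically.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 30)
k_true, a_true, sigma = 0.8, 3.0, 0.05  # hypothetical values for the toy problem

calls = {"n": 0}  # counts evaluations of the forward model


def forward_model(k):
    """Stand-in for an expensive solver; returns the unscaled solution."""
    calls["n"] += 1
    return np.exp(-k * t)


# Simulated data: scaled model solution plus Gaussian noise
y = a_true * np.exp(-k_true * t) + sigma * rng.normal(size=t.size)


def nll(a, f):
    """Negative log-likelihood (up to a constant) for Gaussian noise."""
    return 0.5 * np.sum((y - a * f) ** 2) / sigma**2


def nll_joint(params):
    """Basic method: optimize outer and inner parameters together."""
    k, a = params
    return nll(a, forward_model(k))


def inner_optimum(f):
    """Inner problem: best scaling for a given unscaled solution (closed form)."""
    return np.dot(f, y) / np.dot(f, f)


def nll_outer(k):
    """Structured method: one forward solve, then the inner optimum for free."""
    f = forward_model(k)
    return nll(inner_optimum(f), f)


calls["n"] = 0
basic = minimize(nll_joint, x0=[1.0, 1.0], method="Nelder-Mead")
calls_basic = calls["n"]

calls["n"] = 0
outer = minimize_scalar(nll_outer, bounds=(0.01, 5.0), method="bounded")
calls_structured = calls["n"]

k_hat = outer.x
a_hat = inner_optimum(np.exp(-k_hat * t))  # recomputed without counting a call
```

Comparing `calls_basic` with `calls_structured` gives a rough sense of the saving in this toy problem, although the size of any saving will depend on the optimizer, its tolerances and the model.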
We have illustrated the new method on three toy models from different areas of the life sciences: a predator–prey model, a compartment-based epidemic model and an advection–diffusion PDE model. We have shown that the standard method and the new structured method provided comparable levels of accuracy in parameter estimates and coverage of inferred confidence intervals. However, the structured method was consistently more efficient, requiring substantially fewer calls to the forward model. In more complex models that are computationally expensive to solve, this would provide a major improvement in computation time.
Although we have illustrated our method on three simple models, each with a single inner parameter, we anticipate there will be a broad class of models for which our approach could be applied. These include models where one or more dependent variables are linear in one or more parameters, as considered by Loos et al. and Schmiester et al. [17,18]. This was the situation in the first two of our case studies. A similar situation would arise if observed model components were linear in an initial or boundary condition, or in a model with one-way coupling from a nonlinear to a linear component. Our approach is also applicable to models where a parameter effectively rescales one of the independent variables (typically space or time), which was the situation in the third case study. More generally, models which possess some sort of symmetry or invariance may also be candidates for applying our approach. These include travelling wave solutions (such as in models of a spreading population or other reaction–diffusion equations [33,34]), similarity solutions (such as in models of chemotaxis [35,36]) or scale invariance (such as in size-spectrum models of marine ecosystem dynamics [37,38]). Exploring the range of models and classes of parameter relationships for which our methodology is applicable is an avenue for further research.
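To make the second class of structure concrete (a parameter that rescales an independent variable), the following Python sketch uses a hypothetical toy problem, R dc/dt = −c² with c(0) = 1, which is not one of the case studies. Because the parameter R only rescales time, the solution for any R can be obtained from a single reference solve with R = 1 evaluated at the rescaled times t/R. The function names and the choice of ODE are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp


def rhs(t, c):
    """Right-hand side of the reference problem (R = 1): dc/dt = -c^2."""
    return -c**2


def solve_reference(t_max):
    """Single forward solve with R = 1, returned as a continuous interpolant."""
    result = solve_ivp(rhs, (0.0, t_max), [1.0], dense_output=True,
                       rtol=1e-8, atol=1e-10)
    return result.sol


def solution_for_R(ref_sol, t_obs, R):
    """Use c(t; R) = c(t / R; 1): evaluate the reference solution at t / R."""
    return ref_sol(np.asarray(t_obs) / R)[0]


t_obs = np.linspace(0.0, 10.0, 21)
R_min = 0.5  # smallest value of R to be considered (sets the reference time span)
ref_sol = solve_reference(t_max=t_obs.max() / R_min)

# Solutions for two values of R, obtained without further forward solves
c_slow = solution_for_R(ref_sol, t_obs, R=2.5)
c_fast = solution_for_R(ref_sol, t_obs, R=0.5)

# Exact solution of the toy problem, used only as a check
c_slow_exact = 1.0 / (1.0 + t_obs / 2.5)
c_fast_exact = 1.0 / (1.0 + t_obs / 0.5)
```

The reference solve must cover the longest rescaled time that will be requested, which is why the time span is set from the smallest value of R under consideration.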
The process of specifying inner parameters and identifying the transformation relationship will in general be model dependent. In models where such relationships exist, they may be revealed by a non-dimensionalization procedure, which is a well-established technique in mathematical modelling [39]. A clear example is the broad class of reaction–diffusion models with a logistic source term, where changing the value of the carrying capacity linearly rescales the solution of the model [40]. However, we acknowledge that not all models will be amenable to the structured inference we have presented because the relationship between parameters of interest and model solution will be unknown or too complex to identify a priori.
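As a brief worked illustration of the logistic example, consider a one-dimensional reaction–diffusion equation with linear diffusion. The symbols D (diffusivity), r (growth rate), K (carrying capacity) and w (scaled density) are notation assumed here for the sketch, and the rescaling holds only if the initial and boundary data are also proportional to K.

```latex
\begin{align}
\frac{\partial u}{\partial t} &= D \frac{\partial^2 u}{\partial x^2}
  + r u \left(1 - \frac{u}{K}\right), \\
u = K w \quad &\Longrightarrow \quad
\frac{\partial w}{\partial t} = D \frac{\partial^2 w}{\partial x^2}
  + r w \left(1 - w\right), \\
u(x, t; K) &= K \, w(x, t).
\end{align}
```

The equation for w does not involve K, so under these assumptions a single solve for w gives the solution for every value of K by multiplication, which is the kind of relationship that allows K to be treated as an inner parameter.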
Our new method does not remove the risk of the optimization routine converging to a local minimum instead of a global minimum. One way to mitigate this risk is to use a global search algorithm, and using our approach to reduce the dimensionality of the parameter space may help such a search to succeed.
We have examined case studies in which data come from some specified noise process applied to the solution of an underlying deterministic model. In the second and third case studies, where the noise distribution had a variance parameter, we did not assume this was known a priori, but included it as a target for inference for both the basic and the structured methods. Profile likelihood has also been applied to stochastic models, for example a stochastic model of diffusion in heterogeneous media [1]. Another feature of our study is that we have applied our methods to standard cases where we treat the available data as fixed, but we note that we could adapt our method to deal with cases where inference and identifiability analysis are dynamically updated as new data become available [41]. How the size of the efficiency improvement scales with the number of inner and outer parameters in more complex models is also an interesting question for future work.
In this article, we have focused on parameter estimation and identifiability analysis. Given our univariate profile likelihood functions, it is possible to make profile-wise predictions, propagating uncertainty in parameter estimates through to uncertainty in model predictions [15]. This enables an understanding of how variability in different parameters impacts the solution of the mathematical model. Given that our structured profile likelihood functions can be computed with far less computational overhead than the standard approach, it is possible to use our structured approach to speed up the calculation of likelihood-based prediction intervals [15].
All results in this study focus on assessing practical identifiability since we are interested in real-world problems, which involve working with imperfect, noisy and sparse data. Alternatively, we could consider structural identifiability, which concerns the case where we have highly idealized, noise-free observations [42]. Structural identifiability for ODE models can be assessed using algebraic methods that are available in several software packages, such as GenSSI [43], DAISY [44] and STRIKE-GOLDD [45]. Practical identifiability is a stronger condition than structural identifiability because structurally identifiable models can turn out to be practically non-identifiable when working with finite, noisy data. Previous work has established that likelihood-based methods can perform well regardless of whether a model is structurally or practically non-identifiable [46,47].
We have worked with a likelihood-based framework, which we chose for the sake of algorithmic simplicity and computational efficiency. In particular, working with profile likelihood is often faster than sampling-based methods, such as MCMC [8], and this is particularly relevant for poorly identified problems. All problems we have considered here involve parameters that are well identified by the data. We note that if the variance in the noise was larger than assumed here, or if the data were sparser, this could result in parameters being poorly identified. We anticipate that, where identifiability issues exist, they would affect both the basic and the structured methods similarly. One way of addressing such issues would be to seek an appropriate re-parametrization of the log-likelihood function. It is also possible that the structured approach might speed up the computations needed to diagnose the identifiability issue. We leave a detailed exploration of these questions for future work.
Our structured inference approach could, in principle, be used in a Bayesian setting by sampling from the distribution of outer parameters and, for each sample point, optimizing the inner parameters using a likelihood or approximate likelihood function (see also [48]). Implementing this is beyond the scope of this paper but would be an interesting aim for future work.
Acknowledgements
M.J.P. acknowledges travel support from the Australian Research Council (DP200100177). The authors thank the organizers of the third New Zealand Workshop on Uncertainty Quantification and Inverse Problems, held at the University of Canterbury in 2023. The authors are grateful to five anonymous reviewers for comments on previous versions of the manuscript.
Contributor Information
Michael J. Plank, Email: michael.plank@canterbury.ac.nz.
Matthew J. Simpson, Email: matthew.simpson@qut.edu.au.
Ethics
This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility
Data and relevant code for this research work are stored in GitHub at https://github.com/michaelplanknz/structured-inference and have been archived within the Zenodo repository [49].
Supplementary material is available online [50].
Declaration of AI use
We have not used AI-assisted technologies in creating this article.
Conflict of interest declaration
We declare we have no competing interests.
Funding
M.J.S. is supported by the Australian Research Council (DP230100025).
References
- 1. Simpson MJ, Browning AP, Drovandi C, Carr EJ, Maclaren OJ, Baker RE. 2021. Profile likelihood analysis for a stochastic model of diffusion in heterogeneous media. Proc. Math. Phys. Eng. Sci. 477, 20210214. (doi:10.1098/rspa.2021.0214)
- 2. Shuttleworth JG, Lei CL, Whittaker DG, Windley MJ, Hill AP, Preston SP, Mirams GR. 2024. Empirical quantification of predictive uncertainty due to model discrepancy by training with an ensemble of experimental designs: an application to ion channel kinetics. Bull. Math. Biol. 86, 2. (doi:10.1007/s11538-023-01224-6)
- 3. Sisson SA, Fan Y, Tanaka MM. 2007. Sequential Monte Carlo without likelihoods. Proc. Natl Acad. Sci. USA 104, 1760–1765. (doi:10.1073/pnas.0607208104)
- 4. Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. 2009. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202. (doi:10.1098/rsif.2008.0172)
- 5. Sunnåker M, Busetto AG, Numminen E, Corander J, Foll M, Dessimoz C. 2013. Approximate Bayesian computation. PLoS Comput. Biol. 9, e1002803. (doi:10.1371/journal.pcbi.1002803)
- 6. Hines KE, Middendorf TR, Aldrich RW. 2014. Determination of parameter identifiability in nonlinear biophysical models: a Bayesian approach. J. Gen. Physiol. 143, 401–416. (doi:10.1085/jgp.201311116)
- 7. Siekmann I, Sneyd J, Crampin EJ. 2012. MCMC can detect nonidentifiable models. Biophys. J. 103, 2275–2286. (doi:10.1016/j.bpj.2012.10.024)
- 8. Simpson MJ, Baker RE, Vittadello ST, Maclaren OJ. 2020. Practical parameter identifiability for spatio-temporal models of cell invasion. J. R. Soc. Interface 17, 20200055. (doi:10.1098/rsif.2020.0055)
- 9. Raue A, Kreutz C, Theis FJ, Timmer J. 2013. Joining forces of Bayesian and frequentist methodology: a study for inference in the presence of non-identifiability. Phil. Trans. R. Soc. A 371, 20110544. (doi:10.1098/rsta.2011.0544)
- 10. Bates DM, Watts DG. 1988. Nonlinear regression analysis and its applications. New York, NY: Wiley. (doi:10.1002/9780470316757)
- 11. Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmüller U, Timmer J. 2009. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics 25, 1923–1929. (doi:10.1093/bioinformatics/btp358)
- 12. Kreutz C, Raue A, Kaschek D, Timmer J. 2013. Profile likelihood in systems biology. FEBS J. 280, 2564–2571. (doi:10.1111/febs.12276)
- 13. Kreutz C, Raue A, Timmer J. 2012. Likelihood based observability analysis and confidence intervals for predictions of dynamic models. BMC Syst. Biol. 6, 120. (doi:10.1186/1752-0509-6-120)
- 14. Ciocanel MV, Ding L, Mastromatteo L, Reichheld S, Cabral S, Mowry K, Sandstede B. 2024. Parameter identifiability in PDE models of fluorescence recovery after photobleaching. Bull. Math. Biol. 86, 36. (doi:10.1007/s11538-024-01266-4)
- 15. Simpson MJ, Maclaren OJ. 2023. Profile-wise analysis: a profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models. PLoS Comput. Biol. 19, e1011515. (doi:10.1371/journal.pcbi.1011515)
- 16. Lustig A, Vattiato G, Maclaren O, Watson LM, Datta S, Plank MJ. 2023. Modelling the impact of the omicron BA.5 subvariant in New Zealand. J. R. Soc. Interface 20, 20220698. (doi:10.1098/rsif.2022.0698)
- 17. Loos C, Krause S, Hasenauer J. 2018. Hierarchical optimization for the efficient parametrization of ODE models. Bioinformatics 34, 4266–4273. (doi:10.1093/bioinformatics/bty514)
- 18. Schmiester L, Schälte Y, Fröhlich F, Hasenauer J, Weindl D. 2020. Efficient parameterization of large-scale dynamic models based on relative measurements. Bioinformatics 36, 594–602. (doi:10.1093/bioinformatics/btz581)
- 19. Pawitan Y. 2001. In all likelihood: statistical modelling and inference using likelihood. Oxford, UK: Oxford University Press. (doi:10.1093/oso/9780198507659.001.0001)
- 20. Royston P. 2007. Profile likelihood for estimation and confidence intervals. Stata J. 7, 376–387. (doi:10.1177/1536867X0700700305)
- 21. Rosenzweig ML, MacArthur RH. 1963. Graphical representation and stability conditions of predator-prey interactions. Am. Nat. 97, 209–223. (doi:10.1086/282272)
- 22. Lotka AJ. 1925. Elements of physical biology. Baltimore, MD: Williams & Wilkins.
- 23. Volterra V. 1926. Variazioni e fluttuazioni del numero d’individui in specie animali conviventi. Mem. Reale Accad. Naz. Lincei. 2, 31–113.
- 24. Diekmann O, et al. 2000. Mathematical epidemiology of infectious diseases: model building, analysis and interpretation, 5th edn. Chichester, UK: Wiley.
- 25. Russell TW, et al. 2020. Reconstructing the early global dynamics of under-ascertained COVID-19 cases and infections. BMC Med. 18, 332. (doi:10.1186/s12916-020-01790-9)
- 26. Flaxman S, et al. 2020. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature 584, 257–261. (doi:10.1038/s41586-020-2405-7)
- 27. Abbott S, et al. 2020. Estimating the time-varying reproduction number of SARS-CoV-2 using national and subnational case counts. Wellcome Open Res. 5, 112. (doi:10.12688/wellcomeopenres.16006.1)
- 28. Golding N, Price DJ, Ryan G, McVernon J, McCaw JM, Shearer FM. 2023. A modelling approach to estimate the transmissibility of SARS-CoV-2 during periods of high, low, and zero case incidence. eLife 12, e78089. (doi:10.7554/eLife.78089)
- 29. Herzer J, Kinzelbach W. 1989. Coupling of transport and chemical processes in numerical transport models. Geoderma 44, 115–127.
- 30. Ogata A, Banks RB. 1961. A solution of the differential equation of longitudinal dispersion in porous media. US Geological Survey Professional Paper 411-A. Washington, DC: US Government Printing Office.
- 31. van Genuchten MT, Wierenga PJ. 1976. Mass transfer studies in sorbing porous media. I. Analytical solutions. Soil Sci. Soc. Am. J. 40, 473–480. (doi:10.2136/sssaj1976.03615995004000040011x)
- 32. Murphy RJ, Maclaren OJ, Simpson MJ. 2024. Implementing measurement error models with mechanistic mathematical models in a likelihood-based framework for estimation, identifiability analysis and prediction in the life sciences. J. R. Soc. Interface 21, 20230402. (doi:10.1098/rsif.2023.0402)
- 33. Baker RE, Maini PK. 2007. Travelling gradients in interacting morphogen systems. Math. Biosci. 209, 30–50. (doi:10.1016/j.mbs.2007.01.006)
- 34. Baker RE, Simpson MJ. 2012. Models of collective cell motion for cell populations with different aspect ratio: diffusion, proliferation and travelling waves. Physica A 391, 3729–3750. (doi:10.1016/j.physa.2012.01.009)
- 35. Rascle M, Ziti C. 1995. Finite time blow-up in some models of chemotaxis. J. Math. Biol. 33, 388–414. (doi:10.1007/BF00176379)
- 36. Byrne HM, Cave G, McElwain DLS. 1998. The effect of chemotaxis and chemokinesis on leukocyte locomotion: a new interpretation of experimental results. Math. Med. Biol. 15, 235–256. (doi:10.1093/imammb/15.3.235)
- 37. Capitán JA, Delius GW. 2010. Scale-invariant model of marine population dynamics. Phys. Rev. E 81, 061901. (doi:10.1103/PhysRevE.81.061901)
- 38. Plank MJ, Law R. 2012. Ecological drivers of stability and instability in marine ecosystems. Theor. Ecol. 5, 465–480. (doi:10.1007/s12080-011-0137-x)
- 39. Murray JD. 2003. Mathematical biology: I. An introduction. Berlin, Germany: Springer.
- 40. Haridas P, Penington CJ, McGovern JA, McElwain DLS, Simpson MJ. 2017. Quantifying rates of cell migration and cell proliferation in co-culture barrier assays reveals how skin and melanoma cells interact during melanoma spreading and invasion. J. Theor. Biol. 423, 13–25. (doi:10.1016/j.jtbi.2017.04.017)
- 41. Cassidy T. 2023. A continuation technique for maximum likelihood estimators in biological models. Bull. Math. Biol. 85, 90. (doi:10.1007/s11538-023-01200-0)
- 42. Chis OT, Banga JR, Balsa-Canto E. 2011. Structural identifiability of systems biology models: a critical comparison of methods. PLoS One 6, e27755. (doi:10.1371/journal.pone.0027755)
- 43. Chiş O, Banga JR, Balsa-Canto E. 2011. GenSSI: a software toolbox for structural identifiability analysis of biological models. Bioinformatics 27, 2610–2611. (doi:10.1093/bioinformatics/btr431)
- 44. Bellu G, Saccomani MP, Audoly S, D’Angiò L. 2007. DAISY: a new software tool to test global identifiability of biological and physiological systems. Comput. Methods Programs Biomed. 88, 52–61. (doi:10.1016/j.cmpb.2007.07.002)
- 45. Villaverde AF, Barreiro A, Papachristodoulou A. 2016. Structural identifiability of dynamic systems biology models. PLoS Comput. Biol. 12, e1005153. (doi:10.1371/journal.pcbi.1005153)
- 46. Fröhlich F, Theis FJ, Hasenauer J. 2014. Uncertainty analysis for non-identifiable dynamical systems: profile likelihoods, bootstrapping and more. In Computational methods in systems biology (eds Mendes P, Dada JO, Smallbone K), pp. 61–72. Cham, Switzerland: Springer. (doi:10.1007/978-3-319-12982-2_5)
- 47. Simpson MJ, Maclaren OJ. 2024. Making predictions using poorly identified mathematical models. Bull. Math. Biol. 86, 80. (doi:10.1007/s11538-024-01294-0)
- 48. Raimúndez E, Fedders M, Hasenauer J. 2023. Posterior marginalization accelerates Bayesian inference for dynamical models of biological processes. iScience 26, 108083. (doi:10.1016/j.isci.2023.108083)
- 49. Plank M. 2024. michaelplanknz/structured-inference: accepted version (v1.1). Zenodo. (doi:10.5281/zenodo.12686724)
- 50. Plank MJ, Simpson MJ. 2024. Supplementary material from: Structured methods for parameter inference and uncertainty quantification for mechanistic models in the life sciences. Figshare. (doi:10.6084/m9.figshare.c.7410585)