Testing structural identifiability by a simple scaling method

Mario Castro; Rob J de Boer

doi:10.1371/journal.pcbi.1008248

2020 Nov 3;16(11):e1008248.

Editor: Miles P Davenport⁴

PMCID: PMC7665633 PMID: 33141821

Abstract

Successful mathematical modeling of biological processes relies on the expertise of the modeler to capture the essential mechanisms in the process at hand and on the ability to extract useful information from empirical data. A model is said to be structurally unidentifiable if different sets of parameter values produce the same observable outcome. This is typical (but not exclusive) of partially observed problems in which only a few variables can be experimentally measured. Most of the available methods to test the structural identifiability of a model are either too mathematically complex for the general practitioner to apply, or require involved calculations or numerical computation for complex non-linear models. In this work, we present a new analytical method to test the structural identifiability of models based on ordinary differential equations, which exploits the invariance of the equations under scaling transformations of their parameters. The method is based on rigorous mathematical results but it is easy and quick to apply, even to test the identifiability of sophisticated highly non-linear models. We illustrate our method by example and compare its performance with other existing methods in the literature.

Author summary

Theoretical Biology is a useful approach to explain, generate hypotheses, or discriminate among competing theories. A well-formulated model has to be complex enough to capture the relevant mechanisms of the problem, and simple enough to be fitted to data. Structural identifiability tests aim to recognize, in advance, if the structure of the model allows parameter fitting even with unlimited high-quality data. Available methods require advanced mathematical skills, or are too costly for high-dimensional non-linear models. We propose an analytical method based on scale invariance of the equations. It provides definite answers to the structural identifiability problem while being simple enough to be performed in a few lines of calculations without any computational aid. It favorably compares with other existing methods.

This is a PLOS Computational Biology Methods paper.

Introduction

Mathematical models contribute to our understanding of Biology in several ways ranging from the quantification of biological processes to reconciling conflicting experiments [1]. In many cases, this requires formulating a mathematical model and extracting quantitative estimates of its parameters from the experimental data. Parameters are typically unknown constants that change the behavior of the model. While it is usually recognized that parameter estimation requires the availability of sufficient informative data, sometimes it is not possible to estimate all parameters due to the structure of the model (whatever the quantity or quality of the data), even with large amounts of noiseless observations. This inability is referred to as a lack of ‘structural identifiability’, a concept introduced decades ago by Bellman and Åström [2, 3], as opposed to the ‘practical identifiability’ that depends on limitations set by the data. Practical identifiability has important consequences that can lead to questionable interpretations of the data, which has led to some recent controversy around this point [4, 5]. A lack of structural identifiability poses an insurmountable limitation as it is unrelated to the resolution of the experimental data collection or the number of observations.

Structural identifiability is a necessary condition for model fitting and should be used before any attempt to extract information about the parameters, and as a test of the applicability of the model itself. Importantly, the quality of the fit does not guarantee that the estimated parameters are meaningful. In practice, this is both uncontrolled and misleading, as many fitting tools provide information about the goodness of fit but do not check sensitivity or identifiability. Structural identifiability can be qualified as global or local [6–10]. Global structural identifiability tests the ability to estimate unique sets of parameters, while local (or simply, structural identifiability) means that parameters can be estimated only in a limited subset of the space of parameters. In practical terms, these definitions can be translated into the language of sensitivity analysis as identifiability requires that (i) the columns of the sensitivity matrix are linearly independent, and (ii) each of its columns has at least one large entry [11, 12].

Traditionally, work primarily focused on linear systems [2, 3, 13] based on ordinary differential equations (ODE). For non-linear models, those methods cannot be applied, so many methods have been proposed in the literature to address structural identifiability. Early attempts were based on power series expansions of the original non-linear system [14], the similarity transformation method [15–17] or the so-called direct-test method proposed by Denis-Vidal and Joly-Blanchard [18, 19]. These methods exploit the definition of identifiability either analytically [18] or numerically [20–25], but they are not generically suitable for high-dimensional problems. Xia and Moog [6, 26] proposed an alternative to these classical methods based on the implicit function theorem, but this method also becomes involved to apply for complex models [27].

Another approach that is becoming mainstream is based on the framework of differential algebra [28–31]. These methods are also difficult to apply, as they require advanced mathematical skills and, in some cases, replace highly non-linear terms by polynomial approximations that simplify the analysis. On the positive side, they are based on rigorous mathematical theories, are suitable for non-linear models and, more importantly, they can be coded using existing symbolic computational libraries. In this regard, it is worth mentioning DAISY [32], GenSSI [33], COMBOS [34] or, more recently, SIAN [35].

In almost all cases, the major disadvantage of these methods is the difficulty of applying them to even a few differential equations, which requires advanced mathematical skills and/or dedicated numerical or symbolic software (that is frequently unable to handle the complexity of the problem). This explains why, despite the huge volume of publications in the field of theoretical biology, only a few address parameter identifiability explicitly. In this paper, we introduce a simple method to assess local structural identifiability of ODE models that reduces the complexity of existing methods and can bring identifiability testing to a broader audience. Our method is based on simple scaling transformations, and the solution of simple sparse systems of equations. Identifiability for stochastic models [36] is out of the scope of our work.

Method

A couple of motivating examples

Consider a simple death model in which the death rate is the product of two parameters λ₁ and λ₂, namely

\begin{matrix} \frac{d x}{d t} = - λ_{1} λ_{2} x, x (0) = x_{0}, \end{matrix}

(1)

with the solution

\begin{matrix} x = x (λ_{1}, λ_{2}, t) = x_{0} e^{- λ_{1} λ_{2} t} . \end{matrix}

(2)

It is evident that from an experiment only the product λ₁λ₂ can be inferred, and not either parameter independently. Following the ‘actionable’ definition in Ref. [11], local structural identifiability is directly linked to the linear independence of the columns of the sensitivity matrix, S_ij, of the variable x_i with respect to parameter λ_j

\begin{matrix} S_{i j} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) \equiv \frac{\partial x_{i}}{\partial λ_{j}} \end{matrix}

(3)

Here, we will work with a related (dimensionless) quantity called the relative sensitivity, or simply the elasticity matrix K with elements K_ij given by

\begin{matrix} K_{i j} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) \equiv \frac{\partial log x_{i}}{\partial log λ_{j}} = \frac{λ_{j}}{x_{i}} \frac{\partial x_{i}}{\partial λ_{j}} = \frac{λ_{j}}{x_{i}} S_{i j} . \end{matrix}

(4)

The logarithm in the definition of the elasticity matrix provides a clear-cut interpretation of its coefficients. Thus, if K_ij = 1, a 10% increase in λ_j implies a 10% increase in x_i, and if K_ij = 0.5, that very same increase in λ_j translates only to a 5% increase in x_i.

For Eq (1), the elasticity matrix would be simply a 1 × 2 matrix,

\begin{matrix} K = (K_{11} K_{12}), \end{matrix}

with

\begin{matrix} K_{11} \equiv \frac{λ_{1}}{x} \frac{\partial x}{\partial λ_{1}}, and K_{12} \equiv \frac{λ_{2}}{x} \frac{\partial x}{\partial λ_{2}} . \end{matrix}

(5)

We now propose to multiply λ₁ by a generic scale factor u, and to divide λ₂ by the same factor, such that the solution remains invariant. Differentiating the scaled solution of Eq (2) with respect to that scale factor u, and using the chain rule,

\begin{matrix} \frac{d x}{d u} = 0 (as u is arbitrary) \end{matrix}

(6)

and, also,

\begin{matrix} \frac{d x (u λ_{1}, λ_{2} / u, t)}{d u} = \frac{\partial x}{\partial λ_{1}} λ_{1} - \frac{λ_{2}}{u^{2}} \frac{\partial x}{\partial λ_{2}} = 0 \end{matrix}

(7)

where the last equality follows from Eq (6).

Rearranging Eq (7) and dividing by x,

\begin{matrix} \frac{λ_{1}}{x} \frac{\partial x}{\partial λ_{1}} = \frac{λ_{2}}{u^{2} x} \frac{\partial x}{\partial λ_{2}} \Rightarrow K_{11} = \frac{1}{u^{2}} K_{12}, \end{matrix}

(8)

so both columns of the elasticity matrix are linearly dependent and, accordingly, λ₁ and λ₂ are unidentifiable. In this particular case, the exact solution confirms this result:

\begin{matrix} K_{11} = K_{12} = - λ_{1} λ_{2} t . \end{matrix}
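The equality of the two elasticities can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it estimates K₁₁ and K₁₂ for the closed-form solution in Eq (2) by a central difference on the logarithmic scale. The helper names (`x_death`, `elasticity`) and the parameter values are illustrative assumptions.

```python
import math

def x_death(lam1, lam2, t, x0=1.0):
    """Closed-form solution of the pure death model, Eq (2)."""
    return x0 * math.exp(-lam1 * lam2 * t)

def elasticity(f, params, j, h=1e-6):
    """Central-difference estimate of d log f / d log params[j]."""
    up = list(params)
    down = list(params)
    up[j] = params[j] * math.exp(h)     # shift log(params[j]) by +h
    down[j] = params[j] * math.exp(-h)  # shift log(params[j]) by -h
    return (math.log(f(*up)) - math.log(f(*down))) / (2.0 * h)

# Illustrative values (assumptions, not taken from the paper).
lam1, lam2, t = 0.7, 1.3, 2.0

def model(a, b):
    return x_death(a, b, t)

K11 = elasticity(model, (lam1, lam2), 0)
K12 = elasticity(model, (lam1, lam2), 1)
print(K11, K12)
```

Both estimates agree with each other and with the logarithmic derivative of Eq (2), −λ₁λ₂t, so the single row of K has two proportional entries and the parameters cannot be separated.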

In this case we had complete knowledge of the solution, and consequently, it was straightforward to find the right way to introduce the scaling u. Fortunately, this simple scaling calculation can also be performed directly on Eq (1). Introducing two unknown scaling factors, u₁ and u₂, into that equation,

\begin{matrix} \frac{d x}{d t} = - u_{1} λ_{1} u_{2} λ_{2} x . \end{matrix}

Requiring that this remains identical (or, more formally, invariant) to Eq (1), i.e., λ₁λ₂x = u₁λ₁ u₂λ₂ x, we conclude that u₁ u₂ = 1. The fact that u₁ and u₂ cannot be solved for individually also means that the true values of λ₁ and λ₂ cannot be determined; namely, both parameters are unidentifiable.

Next consider a death model with immigration:

\begin{matrix} \frac{d x}{d t} = λ_{1} - λ_{2} x . \end{matrix}

(9)

In this case, to leave the system invariant we need to find u₁ and u₂ such that

\begin{matrix} λ_{1} - λ_{2} x (t) = u_{1} λ_{1} - u_{2} λ_{2} x (t) \end{matrix}

for all values of x at any time. Rearranging the latter equation,

\begin{matrix} (1 - u_{1}) λ_{1} = (1 - u_{2}) λ_{2} x (t), \end{matrix}

where the left-hand side of the last equation is a constant and the right-hand side depends on time. Hence the only possible solution to the latter equation is u₁ = u₂ = 1 implying that both λ₁ and λ₂ are locally identifiable. Notice the difference with the preceding case, Eq (1), in which an infinite number of combinations of the scaling factors satisfy the invariance condition.
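The contrast between the two motivating examples can be illustrated numerically. The sketch below is an illustration under stated assumptions: it uses the standard closed-form solutions of Eqs (1) and (9) with an assumed initial condition x(0) = x₀ = 1, and the function names and parameter values are hypothetical. It measures how much each trajectory changes when the parameters are re-scaled.

```python
import math

def x_death(lam1, lam2, t, x0=1.0):
    """Solution of Eq (1): dx/dt = -lam1*lam2*x."""
    return x0 * math.exp(-lam1 * lam2 * t)

def x_immigration(lam1, lam2, t, x0=1.0):
    """Solution of Eq (9): dx/dt = lam1 - lam2*x, with x(0) = x0 (assumed)."""
    x_inf = lam1 / lam2  # steady state
    return x_inf + (x0 - x_inf) * math.exp(-lam2 * t)

def max_gap(model, lam1, lam2, u1, u2, times):
    """Largest change in the trajectory after scaling lam1 by u1 and lam2 by u2."""
    return max(abs(model(u1 * lam1, u2 * lam2, t) - model(lam1, lam2, t))
               for t in times)

times = [0.1 * k for k in range(1, 51)]
lam1, lam2 = 0.7, 1.3  # illustrative values (assumptions)

# u1*u2 = 1 leaves the pure death model unchanged ...
gap_death = max_gap(x_death, lam1, lam2, 2.0, 0.5, times)
# ... but the same scaling changes the model with immigration,
gap_immigration = max_gap(x_immigration, lam1, lam2, 2.0, 0.5, times)
# and so does a scaling that preserves the steady state lam1/lam2.
gap_immigration_ratio = max_gap(x_immigration, lam1, lam2, 2.0, 2.0, times)

print(gap_death, gap_immigration, gap_immigration_ratio)
```

Only the trivial scaling leaves the trajectory of Eq (9) unchanged, in line with u₁ = u₂ = 1 being the only solution of the invariance condition.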

These simple examples illustrate how scaling invariance of the model equations can be used to determine whether the parameters are unidentifiable or not. We prove this result more rigorously in S1 Text.

Description of the method

Let us define a general ODE model characterized by the time evolution of n variables, x_i(t), depending on m parameters λ_j,

\frac{d x_{i}}{d t} = f_{i} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) i = 1, \dots, n,

(10)

x_{i} (0) = x_{i, 0}, i = 1, \dots, n,

(11)

where the functions f_i depend on the specific details of the problem at hand and x_i,0 are the initial conditions. We need to distinguish between those variables that can be observed (measured) in the experiment, x₁ … x_r, and those which cannot (they are often referred to as latent variables), x_r+1 … x_n.

As we will prove below, the simplicity of our method relies on the ability to decompose the functions f_i as a sum of M functionally independent summands, f_ik,

\begin{matrix} f_{i} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) = \sum_{k = 1}^{M} f_{i k} ({\tilde{x}}_{k}, {\tilde{λ}}_{k}), \end{matrix}

(12)

having the property that f_ik is functionally independent of f_il for every k ≠ l. For brevity, ${\tilde{x}}_{k}, {\tilde{λ}}_{k}$ denote the subset of variables and parameters included in the function f_ik.

The notion of linearly independent functions, and how to test for it, is summarized in S1 Text. However, a simple definition would be: if f₁(x₁, x₂, …), …, f_n(x₁, x₂, …) are linearly independent functions, then the only solution of the equation

\begin{matrix} \sum_{i = 1}^{n} a_{i} f_{i} (x_{1}, x_{2}, \dots) = 0 \end{matrix}

(13)

is a₁ = … = a_n = 0.

Typical examples of functionally independent functions are summarized in Table 1. For instance, f₁₁ = ax₁, f₁₂ = bx₁x₃, f₁₃ = (c + x₄)⁻¹ are functionally independent, whereas examples of dependent functions would be f₁₁ = ax₁x₂ and f₁₂ = bx₁x₂. Note that it is not required that f_ij and f_kj are independent (as they appear in different equations). For instance, the right-hand side of Eq (9) can be decomposed into polynomials of degree 0 (a constant) and 1 (a linear function), namely

\begin{matrix} f_{11} = λ_{1}, and f_{12} = - λ_{2} x . \end{matrix}
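As a quick check of this decomposition, the two summands can be tested with the ordinary Wronskian. This is only a sketch: it treats both summands as functions of the single variable x and assumes λ₁λ₂ ≠ 0, whereas the general multi-variable test is the one referred to in S1 Text.

```latex
W (f_{11}, f_{12}) =
\det \begin{pmatrix} f_{11} & f_{12} \\ \frac{d f_{11}}{d x} & \frac{d f_{12}}{d x} \end{pmatrix}
= \det \begin{pmatrix} λ_{1} & - λ_{2} x \\ 0 & - λ_{2} \end{pmatrix}
= - λ_{1} λ_{2} \neq 0 .
```

By contrast, for the dependent pair f₁₁ = ax₁x₂ and f₁₂ = bx₁x₂ mentioned above, the non-trivial combination b f₁₁ − a f₁₂ vanishes identically, so the two terms cannot be split into separate invariance conditions.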

Table 1. A collection of frequent linearly independent functions: All the functions listed in the Table are independent of each other (of the same or different type).

We assume that λ₁ ≠ λ₂ in all of the cases.

Type	Examples
Polynomial (one variable)	x⁰, x, x², x³, …
Polynomial (more than one variable)	$x_{1} x_{2}, x_{1}^{2} x_{2}, x_{1} x_{2} x_{3}, \dots$
Rational	$\frac{1}{λ + x_{1}}, \frac{x_{1}}{λ + x_{1}}, \frac{1}{x_{1} + x_{2}}, \frac{1}{λ_{1} + x_{1} + x_{2}}, \frac{1}{{(λ + x_{1})}^{2}}, \dots$
Exponential	$e^{λ_{1} x_{1}}, e^{λ_{2} x_{1}}, e^{λ_{1} x_{2}}$
Sigmoid	$\frac{1}{λ_{1} + e^{- λ_{1} x_{1}}}, \frac{1}{λ_{1} + e^{- λ_{1} x_{2}}}, \frac{1}{λ_{1} + e^{- λ_{2} x_{1}}}, \frac{1}{{(λ_{1} + e^{- λ_{1} x_{1}})}^{2}}, \dots$
Trigonometric	sin λ₁x₁, sin λ₁x₂, sin λ₂x₁, cos λ₁x₁, tan λ₁x₂, …


We summarize our method in Box 1.

Box 1: Summary of the scale invariance local structural identifiability method introduced in this work

1. Scale all parameters and all unobserved variables by unknown scaling factors, u:
$\begin{matrix} \begin{matrix} λ_{i} \to u_{λ_{i}} λ_{i} & i = 1, \dots, m \\ x_{j} \to u_{x_{j}} x_{j} & j = r + 1, \dots, n \end{matrix} \end{matrix}$
and substitute them into Eq (15) below.
2. Equate each functionally independent function, f_ik, to its scaled version. Namely,
$\begin{matrix} f_{i k} (\tilde{x}, \tilde{λ}) = \frac{1}{u_{x_{i}}} f_{i k} (u_{\tilde{x}} \tilde{x}, u_{\tilde{λ}} \tilde{λ}) \end{matrix}$ (14)
where $u_{x_{i}} = 1$ for 1 ≤ i ≤ r and the prefactor in the right-hand side of the equation comes from the scaling of $\frac{d x_{i}}{d t} \to u_{x_{i}} \frac{d x_{i}}{d t}$ . From Eq (11) it follows that $u_{x_{i}} = u_{x_{i, 0}}$ .
3. From Eq (14), find combinations of the scaling factors u that leave the system invariant. Hereafter, we will denote these as the identifiability equations of the model (see Eq (24) below).
4. Only the parameters λ_i with a solution $u_{λ_{i}} = 1$ are identifiable. Only the variables x_i with $u_{x_{i}} = 1$ are observable. Otherwise, parameters whose scaling factors are coupled form identifiable groups but cannot be identified independently.

In summary, our method reduces the complexity of finding identifiable parameters to finding which scaling factors do not satisfy the trivial solution u_i = 1. In the literature, when a scaling factor is related to one of the latent variables x_r+1 … x_n, if $u_{x_{k}} = 1$ , then x_k is said to be observable [10]. Thus, our method addresses at the same time identifiability and observability. Additionally, irreducible equations involving two or more parameters provide the so-called identifiable groups of variables that cannot be fitted independently. In the case of the pure death model above, the identifiability equation $u_{λ_{1}} u_{λ_{2}} = 1$ is a signature of the unidentifiable group λ₁λ₂. This is interesting as groups involving latent variables (for instance, $u_{x_{j}} u_{λ_{k}}$ ) would inform future experiments aimed to observe that variable and decouple that group.

It is also worth mentioning that our identifiability test (illustrated by example in S1 Text) provides a simple way to find a type of symmetry that is related to scale invariance. More sophisticated methods have been introduced in the literature to address other symmetries [37–39] using the theory of Lie group transformations; however, that approach involves complex calculations assisted by symbolic computations.

Results

The main result

Now we are equipped to prove the main result of the paper. We will proceed in two steps: firstly, we will show how Eq (14) is translated into a set of equations for the scaling factors u. Secondly, we will connect the elasticity matrix with the solution of the identifiability equations and the identifiability of the parameters.

Consider a model described by a set of n ordinary differential equations (ODE)

\begin{matrix} \frac{d x_{i}}{d t} = f_{i} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) = \sum_{k = 1}^{M} f_{i k} ({\tilde{x}}_{k}, {\tilde{λ}}_{k}), \end{matrix}

(15)

where f_ik is functionally independent of f_il for every k ≠ l (namely, they satisfy the generalized Wronskian theorem; see S1 Text). For the sake of simplicity, we denote ${\tilde{x}}_{k}$ and ${\tilde{λ}}_{k}$ the subset of variables and parameters of function f_ik.

Motivated by Eqs (1)–(5), we seek scalings of the parameters that leave the system invariant. As we prove below, this invariance (or lack thereof) is related to the identifiability of the parameters. Hence, if we define the following scaling transformation:

\begin{matrix} λ_{i} \to u_{λ_{i}} λ_{i}, & i = 1, \dots, m \\ x_{j} \to u_{x_{j}} x_{j}, & j = r + 1, \dots, n \end{matrix}

(16)

(where the variables x₁ … x_r are unmodified as we can measure them in the experiment) we can write the following set of re-scaled equations:

\frac{d x_{i}}{d t} = \sum_{k = 1}^{M} f_{i k} (u_{{\tilde{x}}_{k}} {\tilde{x}}_{k}, u_{{\tilde{λ}}_{k}} {\tilde{λ}}_{k}), i = 1, \dots, r

(17)

x_{i} (0) = x_{i, 0}, i = 1, \dots, r

(18)

u_{x_{i}} \frac{d x_{i}}{d t} = \sum_{k = 1}^{M} f_{i k} (u_{{\tilde{x}}_{k}} {\tilde{x}}_{k}, u_{{\tilde{λ}}_{k}} {\tilde{λ}}_{k}), i = r + 1, \dots, n

(19)

u_{x_{i}} x_{i} (0) = u_{x_{i, 0}} x_{i, 0}, i = r + 1, \dots, n

(20)

where M is the number of functionally independent summands in the equation. It is convenient to rewrite Eq (19) as

\begin{matrix} \frac{d x_{i}}{d t} = \frac{1}{u_{x_{i}}} \sum_{k = 1}^{M} f_{i k} (u_{{\tilde{x}}_{k}} {\tilde{x}}_{k}, u_{{\tilde{λ}}_{k}} {\tilde{λ}}_{k}), i = r + 1, \dots n \end{matrix}

(21)

to perform the scale invariance analysis below in a simpler way.

If the solution is invariant under this transformation, then the right-hand side of Eq (15) and those of the re-scaled equations above should be equal. Besides, by the functional independence of the functions f_ik we can split each summand. Thus,

\begin{matrix} f_{i k} ({\tilde{x}}_{k}, {\tilde{λ}}_{k}) = f_{i k} (u_{{\tilde{x}}_{k}} {\tilde{x}}_{k}, u_{{\tilde{λ}}_{k}} {\tilde{λ}}_{k}), i = 1, \dots, r \end{matrix}

(22)

and

\begin{matrix} f_{i k} ({\tilde{x}}_{k}, {\tilde{λ}}_{k}) = \frac{1}{u_{x_{i}}} f_{i k} (u_{{\tilde{x}}_{k}} {\tilde{x}}_{k}, u_{{\tilde{λ}}_{k}} {\tilde{λ}}_{k}), i = r + 1, \dots, n \end{matrix}

(23)

This new set of equations is much easier to solve than the one that we would obtain from Eqs (17)–(19) (which would be equivalent to the so-called direct-test method [18]). Eqs (22) and (23) admit the trivial solution $u_{{\tilde{x}}_{k}} = u_{{\tilde{λ}}_{k}} = 1$ . Alternatively, some of the parameters are functionally related to each other. Generically, they can be written as

\begin{matrix} u_{λ_{k}} = F_{k} (u_{m_{1}}, u_{m_{2}}, \dots), \end{matrix}

(24)

Note that, for each parameter k, the scaling $u_{λ_{k}}$ will depend only on a subset of all the scaling factors m₁, m₂, … We denote Eq (24) the identifiability equations of the model. A third possibility would be that some scaling factors take fixed values different from 1. We discuss that case below.

Let us now connect the identifiability equations with the concept of local structural identifiability. If we take the partial derivative of the following invariance condition

\begin{matrix} x_{i} (x_{1}, \dots, x_{r}, u_{x_{r + 1}} x_{r + 1} \dots u_{x_{n}} x_{n}; u_{λ_{1}} λ_{1}, \dots, u_{λ_{m}} λ_{m}) = x_{i} (x_{1}, \dots, x_{r}, x_{r + 1} \dots x_{n}; λ_{1}, \dots, λ_{m}) \end{matrix}

with respect to $u_{λ_{k}}$ , by the chain rule, it follows that

\begin{matrix} \frac{\partial x_{i}}{\partial λ_{k}} λ_{k} + \frac{\partial x_{i}}{\partial m_{1}} m_{1} β_{m_{1} k} + \frac{\partial x_{i}}{\partial m_{2}} m_{2} β_{m_{2} k} + \dots = 0 \end{matrix}

(25)

where, for convenience, we have defined

\begin{matrix} β_{m k} \equiv \frac{\partial u_{m}}{\partial u_{λ_{k}}} = (\frac{\partial F_{k}}{\partial u_{m}})^{- 1} . \end{matrix}

Finally, dividing Eq (25) by x_i:

\begin{matrix} K_{i k} + β_{m_{1} k} K_{i m_{1}} + β_{m_{2} k} K_{i m_{2}} + \dots = 0, \end{matrix}

(26)

where K_im are the elements of the elasticity matrix defined in Eq (4). Eq (26) implies that K_ik can be written as a linear combination of other column(s) of the elasticity matrix. According to our discussion in the Introduction (see also Refs. [11, 12]) this means that λ_k is not identifiable.
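As a minimal worked check of Eq (26), consider again the pure death model of Eq (1), whose identifiability equation is $u_{λ_{1}} u_{λ_{2}} = 1$ . The sketch below assumes that the derivatives are evaluated at the trivial point $u_{λ_{1}} = u_{λ_{2}} = 1$ , and takes k = 1 and m₁ = λ₂ in the notation of Eq (24):

```latex
u_{λ_{1}} = F_{1} (u_{λ_{2}}) = \frac{1}{u_{λ_{2}}}, \qquad
β_{λ_{2} 1} = \left( \frac{\partial F_{1}}{\partial u_{λ_{2}}} \right)^{- 1} = - u_{λ_{2}}^{2} = - 1, \qquad
K_{11} + β_{λ_{2} 1} K_{12} = K_{11} - K_{12} = 0 .
```

This reproduces Eq (8) at u = 1: the two columns of the elasticity matrix coincide, so neither λ₁ nor λ₂ is identifiable on its own.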

Summarising, for each parameter λ_k either $u_{λ_{k}} = 1$ or it is not identifiable. The adjective “local” follows because the method relies on the continuity of the derivative of x_i(t) with respect to λ_k to derive Eq (25). Thus, it is unable to capture any discrete transformations like, for instance,

\begin{matrix} \{ u_{c} \to 1, u_{δ} \to 1 \}, \quad \{ u_{c} \to \frac{δ}{c}, u_{δ} \to \frac{c}{δ} \} \end{matrix}

discussed for Model 8 in S1 Text and that, as we anticipated above, is the third possible solution of the identifiability Eq (24).

Example: An unidentifiable nonlinear model [16]

Here we show how to apply our method to a nonlinear model introduced in Ref. [16] (this model is mathematically equivalent to Model 2 in S1 Text).

{\dot{x}}_{1} = λ_{1} x_{1}^{2} + λ_{2} x_{1} x_{2},

(27)

{\dot{x}}_{2} = λ_{3} x_{1}^{2} + λ_{4} x_{1} x_{2},

(28)

x_{1} (0) = 0,

(29)

x_{2} (0) = 0,

(30)

x_{1} is observed

(31)

Following Box 1:

1. We re-scale the non-observed variables and parameters:
$\begin{matrix} \begin{matrix} x_{2} & \to & u_{x_{2}} x_{2} \\ λ_{1} & \to & u_{λ_{1}} λ_{1} \\ λ_{2} & \to & u_{λ_{2}} λ_{2} \\ λ_{3} & \to & u_{λ_{3}} λ_{3} \\ λ_{4} & \to & u_{λ_{4}} λ_{4} \end{matrix}, \end{matrix}$ (32)
as x₁ is observed (so, $u_{x_{1}} = 1$ ).
2. We define the functionally independent functions:
$\begin{matrix} f_{11} = λ_{1} x_{1}^{2}, \quad f_{12} = λ_{2} x_{1} x_{2}, \quad f_{21} = λ_{3} x_{1}^{2}, \quad f_{22} = λ_{4} x_{1} x_{2}, \end{matrix}$
and from Eq (14)
$\begin{matrix} u_{λ_{1}} λ_{1} x_{1}^{2} = λ_{1} x_{1}^{2}, \quad u_{λ_{2}} u_{x_{2}} λ_{2} x_{1} x_{2} = λ_{2} x_{1} x_{2} \end{matrix}$
and
$\begin{matrix} \frac{u_{λ_{3}}}{u_{x_{2}}} λ_{3} x_{1}^{2} = λ_{3} x_{1}^{2}, \quad u_{λ_{4}} λ_{4} x_{1} x_{2} = λ_{4} x_{1} x_{2}, \end{matrix}$
respectively (in the last equation, the factor $u_{x_{2}}$ from the scaling of x₂ cancels the prefactor $1 / u_{x_{2}}$ ).
3. Manipulating the previous equations (cancelling the common factors on both sides), the identifiability equations are
$\begin{matrix} \left\{ \begin{matrix} u_{λ_{1}} = 1 \\ u_{λ_{2}} u_{x_{2}} = 1 \\ u_{λ_{3}} = u_{x_{2}} \\ u_{λ_{4}} = 1 \end{matrix} \right. \end{matrix}$ (33)
4. As the system has solutions other than the trivial one ( $u_{λ_{1}} = u_{λ_{2}} = \dots = 1$ ), it follows that the model is unidentifiable. Moreover, Eq (33) allows one to conclude that (i) if x₂ were to be observed ( $u_{x_{2}} = 1$ ), all the parameters would be identifiable, and (ii) λ₂λ₃ is an identifiable group because, for any scale of x₂, substituting the third identifiability equation into the second gives $u_{λ_{2}} u_{λ_{3}} = 1$ .
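The one-parameter family of solutions of Eq (33) can be illustrated by direct numerical integration. The sketch below is an illustration, not the authors' procedure: it integrates Eqs (27) and (28) with a hand-written classical Runge-Kutta scheme and compares the observed variable x₁ before and after the scaling $u_{x_{2}} = u$ , $u_{λ_{2}} = 1 / u$ , $u_{λ_{3}} = u$ . The parameter values and the nonzero initial state are assumptions chosen so that the trajectory is non-trivial and bounded.

```python
def rhs(state, lam):
    """Right-hand sides of Eqs (27)-(28)."""
    x1, x2 = state
    l1, l2, l3, l4 = lam
    return (l1 * x1 * x1 + l2 * x1 * x2,
            l3 * x1 * x1 + l4 * x1 * x2)

def rk4_x1(state, lam, dt=0.01, steps=200):
    """Integrate with classical RK4 and return the observed x1 trajectory."""
    x1_path = [state[0]]
    for _ in range(steps):
        k1 = rhs(state, lam)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), lam)
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), lam)
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), lam)
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        x1_path.append(state[0])
    return x1_path

# Illustrative values (assumptions): nonzero initial state, decaying dynamics.
lam = (-1.0, 0.5, 0.3, -0.8)
start = (1.0, 1.0)
u = 2.0  # u_{x2}; Eq (33) then gives u_{lam2} = 1/u and u_{lam3} = u

reference = rk4_x1(start, lam)
# Scaling that satisfies Eq (33): x2 -> u*x2, lam2 -> lam2/u, lam3 -> u*lam3.
scaled = rk4_x1((start[0], u * start[1]),
                (lam[0], lam[1] / u, u * lam[2], lam[3]))
# Scaling that violates Eq (33): only lam2 is changed.
only_lam2 = rk4_x1(start, (lam[0], lam[1] / u, lam[2], lam[3]))

gap_scaled = max(abs(a - b) for a, b in zip(reference, scaled))
gap_only_lam2 = max(abs(a - b) for a, b in zip(reference, only_lam2))
print(gap_scaled, gap_only_lam2)
```

The observed x₁ is indistinguishable under the scaling allowed by Eq (33), whereas changing λ₂ alone alters it, which is what the identifiable group λ₂λ₃ expresses.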

Comparison with other methods

We have applied the method outlined in Box 1 to 13 different models defined and analyzed in detail in S1 Text. The choice is based on two criteria: on the one hand, models 1-5 are included for pedagogical purposes. They are simple enough to illustrate the novel method and most of the existing methods also provide the same definite answers. Models 6-13 were chosen because they have previously been analyzed using the methods summarized in the Introduction and in Table 2. This allows us to put our method in direct competition with those methods and to highlight their merits and limitations.

Table 2. List of current methods testing structural identifiability.

We introduce here the acronyms referred to in Table 3.

Method	Acronym	Main Ref.	Pros	Cons
Direct test method	DT	[18, 20]	Simple	Limited
Implicit function theorem	IFT	[26]	Software	Limited
Taylor series approach	TS	[14]	Simple	Computationally Expensive
Generating series approach	GS	[13]	Simple, Software	Computationally Expensive
Similarity Transformation	ST	[16]	Software	Computationally Expensive
Differential algebra	DA	[29, 32, 34]	Software, Conclusive	Limited, Comp. Expensive
Reaction Network theory	RNT	[40, 41]	Simple, Hybrid with other	Only reaction systems
STRIKE-GOLDD	SG	[9, 22]	Powerful, Software	Computationally Expensive
Scaling Invariance Method	SIM	This work	Simple, Widely applicable	Only Local Identifiability


The results of this comparison are summarized in Table 3, which is an extension of a similar table in Ref. [7]. The column Not Conclusive/Not Applicable groups different situations in which a particular method does not provide a conclusive answer (or no answer at all). In general, it captures the fact that many of these methods are computationally demanding (after several hours they do not provide any answer) or that the computations do not converge numerically. For instance, in some implementations of the Differential Algebra method [32], when the number of observables is lower than the number of parameters, the computation requires the evaluation of high-order derivatives of the functions f_i in Eq (10), which can be computationally prohibitive. In other cases, some criterion of applicability is not fulfilled (for instance, the observability rank condition for the similarity transformation method) or the method cannot be completed because it involves the solution of high-degree polynomial or transcendental equations (Direct Test method). These limitations are summarized succinctly in the Cons column in Table 2.

Table 3. Summary of models compared in the literature: The number in brackets in the Model Name column corresponds to the number of observed variables.

Model Numbers correspond to those in Table A in S1 Text. The acronyms for the methods are summarized in Table 2. This table is an extension of Table 1 in Ref. [7].

Model name	Main Ref.	Model Number	Global Struct. Id.	Local Struct. Id.	Unidentifiable	Not Conclusive/Not Applicable
Goodwin model (1)	[7]	6			SG,SIM	TS,GS,ST,DT,DA,IFT,RNT
Goodwin model (all)	[7]	6bis		TS,GS,IFT,RNT	DA,SG,SIM	ST,DT
Circadian clock model	[42]	7			TS,GS,RNT,SG,SIM	ST,DT,DA,IFT
HIV model (1)	[6, 43]	8			All
HIV model (2)	[6, 43]	8bis	DA,IFT,RNT	TS,GS,SIM		DT,ST
Linear HIV model (1)	[6, 43, 44]	8ter	DA,IFT,RNT,SG	DT,ST,TS,GS,SIM
Glycolysis model	[45]	9	GS,DA,RNT	TS,SIM		ST,DT
High dimensional model	[42]	10	TS,GS,DA,RNT	IFT,SIM		ST,DT
NF-κB model (1)	[46]	11			SG,SIM	TS,GS,ST,DT,DA,IFT,RNT
NF-κB model (2)	[46]	11bis	GS,RNT	TS,SIM	SG	ST,DT,DA,IFT
Pharmacokinetics model (1)	[47]	12		TS,GS,RNT,SG,SIM		ST,DT,DA,IFT
Pharmacokinetics model (2)	[47]	12bis	DA	GS,SG,SIM		ST,DT,IFT,RNT
Within-host virus model	[27]	13	DA	SIM		TS,GS,ST,DT,IFT,RNT


Discussion and conclusions

Table 3 shows that our method can handle all the complex models considered here and provides a local structural identifiability criterion that is compatible with those methods capable of producing an answer. Thus, our method is widely applicable. It is worth noting that in several cases where our scaling method gives a conclusive answer, other more complicated methods cannot address those cases (rightmost column in the table). As any globally structurally identifiable model is also locally identifiable, our results are compatible with those methods that can address that difference.

Table 3 also highlights the huge discrepancies among methods. These conflicting conclusions are rather discomforting and deserve deeper clarification. The main source of conflict arises when comparing the Taylor series and the Generating series methods, as they transform the original problem into an approximate one. Also, they incorporate (rightly) the initial conditions into the computation while some implementations of the Differential algebra (DA) method do not (see the DAISY implementation [32]), which can lead to different conclusions. Regarding the DA method, in some instances random values are used for the parameters to handle the complexity of some models, which can lead to wrong conclusions if those parameter values are not properly explored.

So overall, we can distinguish three sources of discrepancy: local vs global structural identifiability (which is not an incompatibility as Global implies Local and our method is restricted to the latter); conclusive vs not conclusive (which favors our method as it is not limited by any computational constraint); and, most concerning, incompatible conclusions. Here, our method is compatible with the conclusions of DA and hybrid methods such as Reaction network theory or STRIKE-GOLDD. As we mentioned in the introduction, Differential Algebra methods (and extensions) are considered the most reliable (when computable) and our method either agrees, or provides an answer where the other methods cannot. The discrepancies with other methods are due to limitations or uncontrolled approximations when applied to complex problems and have already been raised by other authors [7].

From the viewpoint of performance, it is worth emphasizing that we have performed our test by hand, as illustrated in S1 Text, and that, after some practice (and by recognizing recurring motifs, such as sums of different parameters or the coefficients related to diagonal terms in the system of equations), the calculations can be made in a few minutes. This contrasts with the most sophisticated methods that, by hand, can fill several pages [27] or take hours using symbolic computation packages.

Together, broad applicability and simplicity are the main signatures of our method and this may attract the interest of mathematical modelers and spread the culture of checking structural identifiability as a mandatory step when fitting experimental data.

We would like to highlight a connection with the so-called Buckingham-Π theorem of dimensional analysis [48]. In some sense, the scale invariance property is related to the principle of dimensional homogeneity, i.e., the constraints on the functional form of the independent variables with the parameters. Our identifiability equations are therefore similar to finding the so-called Π-groups in the theorem.

A limitation of the method is that it is restricted to testing local identifiability. This is implicit in the differentiability of the elasticity matrix which, by definition, is a local operation. Discrete symmetries are not captured, and more sophisticated methods (based on Lie group transformations [39]) are required. However, simple manipulation of the equations to remove the latent variables can improve the explanatory power of the method and might capture those discrete symmetries (see Sec. 3.8 of S1 Text). We leave that extension for future developments.

Finally, in this work we have chosen to solve the scaling factor equations directly, as this is easy to do with pen and paper. However, if we were to redefine the scaling factors as $u_{i} = e^{w_{i}}$, the new factors $w_{i}$ would obey a homogeneous system of linear equations. The identifiability problem is therefore expected to be related to the rank of the matrix defining that linear system. In that regard, the theorems presented in S1 Text could be supplemented with generic results on homogeneous systems of equations. Thus, our results provide solid ground for the method and indicate an avenue for further development in other systems, such as delay-differential or partial differential equations.
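As a minimal sketch of this log-transformed view, consider a hypothetical illustration (not one of the models analysed in S1 Text): the predator-prey system dx1/dt = a x1 − b x1 x2, dx2/dt = c x1 x2 − d x2 with only x1 observed. Scaling the parameters and the unobserved variable x2 gives the conditions u_a = 1, u_b u_x2 = 1, u_c = 1 and u_d = 1, which become linear in w_i = ln u_i. The helper `null_space` and the variable names below are assumptions made for this example; the code computes the null space of that linear system with exact rational arithmetic.

```python
from fractions import Fraction

def null_space(rows):
    """Basis of the null space of a rational matrix (Gauss-Jordan elimination)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [v / piv for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row, pc in zip(m, pivots):
            vec[pc] = -row[free]
        basis.append(vec)
    return basis

# Unknowns: logarithms of the scaling factors (hypothetical example model).
names = ["w_a", "w_b", "w_c", "w_d", "w_x2"]
system = [
    [1, 0, 0, 0, 0],  # w_a = 0          (from u_a = 1)
    [0, 1, 0, 0, 1],  # w_b + w_x2 = 0   (from u_b * u_x2 = 1)
    [0, 0, 1, 0, 0],  # w_c = 0          (from u_c = 1)
    [0, 0, 0, 1, 0],  # w_d = 0          (from u_d = 1)
]

basis = null_space(system)
# Factors that appear in some null vector are not pinned to w = 0 (u = 1).
unconstrained = sorted({names[j] for v in basis for j, x in enumerate(v) if x != 0})
print(len(basis), unconstrained)
```

In this sketch a nonempty null space signals a scaling freedom: the factors listed in `unconstrained` are not forced to u = 1, while a parameter whose w is zero in every basis vector passes the local test. Conditions that involve sums of scaled parameters are not linear in w and would need separate treatment.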

Another open question is the identifiability problem of mixed-effect models, where parameters are not fixed quantities for each observation but, rather, they are drawn from a meta-distribution linking different subjects [49]. For instance, if one considers the simple model

\begin{matrix} \dot{x} = (a + b) x, \end{matrix}

a and b are not identifiable. However, if they are assumed to be drawn from, say, two exponential distributions with different rate parameters μ_a and μ_b (that is, means 1/μ_a and 1/μ_b), then the distribution of λ ≡ a + b is given by

\begin{matrix} p (λ; μ_{a}, μ_{b}) = \frac{μ_{a} μ_{b}}{μ_{a} - μ_{b}} (e^{- μ_{b} λ} - e^{- μ_{a} λ}), \end{matrix}

which is formed by two linearly independent functions (if μ_a ≠ μ_b), $e^{- μ_{b} λ}$ and $e^{- μ_{a} λ}$, so μ_a and μ_b are identifiable, since the unique solution of the identifiability equations

\begin{matrix} \frac{u_{μ_{a}} μ_{a} u_{μ_{b}} μ_{b}}{u_{μ_{a}} μ_{a} - u_{μ_{b}} μ_{b}} e^{- u_{μ_{b}} μ_{b} λ} = \frac{μ_{a} μ_{b}}{μ_{a} - μ_{b}} e^{- μ_{b} λ} \end{matrix}

is $u_{μ_{b}} = 1$ (because of the exponential), after which the prefactor forces $u_{μ_{a}} = 1$. Models of this kind need further analysis, but they seem to be amenable to our approach.
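The closed-form density used in this example can be checked numerically. The sketch below assumes that μ_a and μ_b are the rate parameters of two independent exponential distributions and that λ is the sum of the two draws. It compares the closed form with a direct numerical convolution of the two densities. The function names `sum_density` and `convolution_density` are hypothetical and introduced only for illustration.

```python
import math

def sum_density(lam, mu_a, mu_b):
    """Closed-form density of lam = a + b for independent exponential a, b
    with distinct rates mu_a and mu_b."""
    return mu_a * mu_b / (mu_a - mu_b) * (math.exp(-mu_b * lam) - math.exp(-mu_a * lam))

def convolution_density(lam, mu_a, mu_b, steps=2000):
    """Midpoint-rule approximation of the convolution integral
    int_0^lam  mu_a e^{-mu_a a} * mu_b e^{-mu_b (lam - a)}  da."""
    h = lam / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * h
        total += mu_a * math.exp(-mu_a * a) * mu_b * math.exp(-mu_b * (lam - a))
    return total * h

closed = sum_density(1.3, 2.0, 0.5)
numeric = convolution_density(1.3, 2.0, 0.5)
swapped = sum_density(1.3, 0.5, 2.0)  # same density with mu_a and mu_b exchanged
print(closed, numeric, swapped)
```

Under these assumptions the two evaluations agree to within the quadrature error. The closed form is also unchanged when μ_a and μ_b are exchanged, so the data alone cannot tell which rate belongs to which parameter. This is a discrete symmetry of the kind that a local test is not designed to detect.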

Finally, while we emphasize the simplicity of the method, it is also amenable to implementation in symbolic computation packages, particularly for systems with a large number of equations/reactions.

Supporting information

S1 Text. In S1 Text we collect the theorems supporting the method and a catalogue of models with a detailed computation of the identifiability equations that were used to build Table 3. (TEX)

Acknowledgments

This work was initiated during summer visits of the authors to the Los Alamos National Laboratory, and we thank Nick Hengartner and Alan Perelson (LANL) for their hospitality and helpful comments on this work, and the Santa Fe Institute for supporting the summer visits of RdB.

Data Availability

All relevant data are within the manuscript and its Supporting information files.

Funding Statement

This work was funded by Agencia Estatal de Investigación (FIS2016-78883-C2-2-P, PID2019-106339GB-I00) to MC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Castro M, Lythe G, Molina-París C, Ribeiro RM. Mathematics in Modern Immunology. Interface focus. 2016;6(2):20150093. doi:10.1098/rsfs.2015.0093
2. Bellman R, Åström KJ. On structural identifiability. Mathematical Biosciences. 1970;7(3-4):329–339. doi:10.1016/0025-5564(70)90132-X
3. Jacquez JA, et al. Compartmental analysis in biology and medicine. New York, Elsevier Pub. Co.; 1972.
4. Balsa-Canto E, Alonso-del Real J, Querol A. Mixed growth curve data do not suffice to fully characterize the dynamics of mixed cultures. Proceedings of the National Academy of Sciences. 2020;117(2):811. doi:10.1073/pnas.1916774117
5. Ram Y, Obolski U, Feldman MW, Berman J, Hadany L. Reply to Balsa-Canto et al.: Growth models are applicable to growth data, not to stationary-phase data. Proceedings of the National Academy of Sciences. 2020;117(2):814–815. doi:10.1073/pnas.1917758117
6. Miao H, Xia X, Perelson AS, Wu H. On the identifiability of nonlinear ODE models and applications in viral dynamics. SIAM review. 2011;53(1):3–39. doi:10.1137/090757009
7. Chis OT, Banga JR, Balsa-Canto E. Structural identifiability of systems biology models: a critical comparison of methods. PloS one. 2011;6(11):e27755. doi:10.1371/journal.pone.0027755
8. Villaverde AF, Barreiro A. Identifiability of large nonlinear biochemical networks. Match Commun Math Comput Chem (Mulheim an der Ruhr, Germany). 2016;76(2):259–276.
9. Villaverde AF, Barreiro A, Papachristodoulou A. Structural Identifiability of Dynamic Systems Biology Models. PLoS Computational Biology. 2016;12(10):1–22. doi:10.1371/journal.pcbi.1005153
10. Villaverde AF, Tsiantis N, Banga JR. Full observability and estimation of unknown inputs, states and parameters of nonlinear biological models. Journal of the Royal Society Interface. 2019;16(156):20190043. doi:10.1098/rsif.2019.0043
11. Jaqaman K, Danuser G. Linking data to models: data regression. Nature Reviews Molecular Cell Biology. 2006;7(11):813. doi:10.1038/nrm2030
12. Komorowski M, Costa MJ, Rand DA, Stumpf MP. Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proceedings of the National Academy of Sciences. 2011;108(21):8645–8650. doi:10.1073/pnas.1015814108
13. Walter E, Lecourtier Y. Unidentifiable compartmental models: What to do? Mathematical Biosciences. 1981;56(1-2):1–25.
14. Pohjanpalo H. System identifiability based on the power series expansion of the solution. Mathematical Biosciences. 1978;41(1-2):21–33. doi:10.1016/0025-5564(78)90063-9
15. Vajda S, Rabitz H. State isomorphism approach to global identifiability of nonlinear systems. IEEE Transactions on Automatic Control. 1989;34(2):220–223. doi:10.1109/9.21105
16. Vajda S, Godfrey KR, Rabitz H. Similarity transformation approach to identifiability analysis of nonlinear compartmental models. Mathematical Biosciences. 1989;93(2):217–248. doi:10.1016/0025-5564(89)90024-2
17. Chappell MJ, Godfrey KR. Structural identifiability of the parameters of a nonlinear batch reactor model. Mathematical Biosciences. 1992;108(2):241–251. doi:10.1016/0025-5564(92)90058-5
18. Denis-Vidal L, Joly-Blanchard G. An easy to check criterion for (un) indentifiability of uncontrolled systems and its applications. IEEE Transactions on Automatic Control. 2000;45(4):768–771. doi:10.1109/9.847119
19. Raksanyi A, Lecourtier Y, Walter E, Venot A. Identifiability and distinguishability testing via computer algebra. Mathematical Biosciences. 1985;77(1-2):245–266. doi:10.1016/0025-5564(85)90100-2
20. Walter E, Braems I, Jaulin L, Kieffer M. Guaranteed numerical computation as an alternative to computer algebra for testing models for identifiability. In: Numerical Software with Result Verification. Springer; 2004. p. 124–131.
21. Maiwald T, Hass H, Steiert B, Vanlier J, Engesser R, Raue A, et al. Driving the model to its limit: profile likelihood based model reduction. PloS one. 2016;11(9):e0162366. doi:10.1371/journal.pone.0162366
22. Villaverde AF, Barreiro A, Papachristodoulou A. Structural Identifiability Analysis via Extended Observability and Decomposition. IFAC-PapersOnLine. 2016;49(26):171–177. doi:10.1016/j.ifacol.2016.12.121
23. Kreutz C. An easy and efficient approach for testing identifiability. Bioinformatics. 2018;34(11):1913–1921. doi:10.1093/bioinformatics/bty035
24. Tönsing C, Timmer J, Kreutz C. Profile likelihood-based analyses of infectious disease models. Statistical methods in medical research. 2018;27(7):1979–1998. doi:10.1177/0962280217746444
25. Stigter JD, Molenaar J. A fast algorithm to assess local structural identifiability. Automatica. 2015;58:118–124. doi:10.1016/j.automatica.2015.05.004
26. Xia X, Moog CH. Identifiability of nonlinear systems with application to HIV/AIDS models. IEEE transactions on automatic control. 2003;48(2):330–336. doi:10.1109/TAC.2002.808494
27. Koelle K, Farrell AP, Brooke CB, Ke R. Within-host infectious disease models accommodating cellular coinfection, with an application to influenza. Virus evolution. 2019;5(2):vez018. doi:10.1093/ve/vez018
28. Walter E, Pronzato L. On the identifiability and distinguishability of nonlinear parametric models. Mathematics and computers in simulation. 1996;42(2-3):125–134. doi:10.1016/0378-4754(95)00123-9
29. Ljung L, Glad T. On global identifiability for arbitrary model parametrizations. Automatica. 1994;30(2):265–276. doi:10.1016/0005-1098(94)90029-9
30. Ollivier F. Identifiabilité et identification: du Calcul Formel au Calcul Numérique? In: ESAIM: Proceedings. vol. 9. EDP Sciences; 2000. p. 93–99.
31. Meshkat N, Eisenberg M, DiStefano JJ III. An algorithm for finding globally identifiable parameter combinations of nonlinear ODE models using Gröbner Bases. Mathematical Biosciences. 2009;222(2):61–72. doi:10.1016/j.mbs.2009.08.010
32. Bellu G, Saccomani MP, Audoly S, D’Angiò L. DAISY: A new software tool to test global identifiability of biological and physiological systems. Computer methods and programs in biomedicine. 2007;88(1):52–61. doi:10.1016/j.cmpb.2007.07.002
33. Chiş O, Banga JR, Balsa-Canto E. GenSSI: a software toolbox for structural identifiability analysis of biological models. Bioinformatics. 2011;27(18):2610–2611. doi:10.1093/bioinformatics/btr431
34. Meshkat N, Kuo CEz, DiStefano J III. On finding and using identifiable parameter combinations in nonlinear dynamic systems biology models and COMBOS: a novel web implementation. PLoS One. 2014;9(10):e110261. doi:10.1371/journal.pone.0110261
35. Hong H, Ovchinnikov A, Pogudin G, Yap C. SIAN: software for structural identifiability analysis of ODE models. Bioinformatics. 2019;35(16):2873–2874. doi:10.1093/bioinformatics/bty1069
36. Brouwer AF, Meza R, Eisenberg MC. Parameter estimation for multistage clonal expansion models from cancer incidence data: A practical identifiability analysis. PLoS computational biology. 2017;13(3):e1005431. doi:10.1371/journal.pcbi.1005431
37. Yates JWT, Evans ND, Chappell MJ. Structural identifiability analysis via symmetries of differential equations. Automatica. 2009;45(11):2585–2591. doi:10.1016/j.automatica.2009.07.009
38. Anguelova M, Karlsson J, Jirstrand M. Minimal output sets for identifiability. Mathematical Biosciences. 2012;239(1):139–153. doi:10.1016/j.mbs.2012.04.005
39. Merkt B, Timmer J, Kaschek D. Higher-order Lie symmetries in identifiability and predictability analysis of dynamic models. Physical Review E. 2015;92(1):012920. doi:10.1103/PhysRevE.92.012920
40. Craciun G, Kim J, Pantea C, Rempala GA. Statistical model for biochemical network inference. Communications in Statistics-Simulation and Computation. 2013;42(1):121–137. doi:10.1080/03610918.2011.633200
41. Davidescu FP, Jørgensen SB. Structural parameter identifiability analysis for dynamic reaction networks. Chemical Engineering Science. 2008;63(19):4754–4762. doi:10.1016/j.ces.2008.06.009
42. Locke J, Millar A, Turner M. Modelling genetic networks with noisy and varied experimental data: the circadian clock in Arabidopsis thaliana. Journal of theoretical biology. 2005;234(3):383–393. doi:10.1016/j.jtbi.2004.11.038
43. Wu H, Zhu H, Miao H, Perelson AS. Parameter identifiability and estimation of HIV/AIDS dynamic models. Bulletin of mathematical biology. 2008;70(3):785–799. doi:10.1007/s11538-007-9279-9
44. Ho DD, Neumann AU, Perelson AS, Chen W, Leonard JM, Markowitz M. Rapid turnover of plasma virions and CD4 lymphocytes in HIV-1 infection. Nature. 1995;373(6510):123–126. doi:10.1038/373123a0
45. Bartl M, Kötzing M, Kaleta C, Schuster S, Li P. Just-in-time activation of a glycolysis inspired metabolic network-solution with a dynamic optimization approach. In: Crossing Borders within the ABC: Automation, Biomedical Engineering and Computer Science. vol. 55; 2010. p. 217–222.
46. Lipniacki T, Paszek P, Brasier AR, Luxon B, Kimmel M. Mathematical model of NF-κB regulatory module. Journal of theoretical biology. 2004;228(2):195–215. doi:10.1016/j.jtbi.2004.01.001
47. Domurado M, Domurado D, Vansteenkiste S, De Marre A, Schacht E. Glucose oxidase as a tool to study in vivo the interaction of glycosylated polymers with the mannose receptor of macrophages. Journal of controlled release. 1995;33(1):115–123. doi:10.1016/0168-3659(94)00074-5
48. Buckingham E. Illustrations of the use of dimensional analysis on physically similar systems. Physical Review. 1914;4(4):354–377.
49. Lavielle M, Aarons L. What do we mean by identifiability in mixed effects models? Journal of pharmacokinetics and pharmacodynamics. 2016;43(1):111–122. doi:10.1007/s10928-015-9459-4

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r001

Decision Letter 0

Douglas A Lauffenburger, Miles P Davenport

2 Jul 2020

Dear Dr Castro,

Thank you very much for submitting your manuscript "Testing structural identifiability by a simple scaling method" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Miles P. Davenport, MB BS, D.Phil

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Dear Editor,

The manuscript “Testing structural identifiability by a simple scaling method” focuses on an interesting problem regarding structural identifiability of mathematical models, or the ability to uniquely determine all or some of the parameters or parameter combinations based on observed data.

The paper presents a concise, simple approach to this problem. It is an interesting article that could be of use in the field of mathematical modelling.

I was confused by Table 2, where the authors compared different methods for determining identifiability. Why do different methods arrive at disagreeing results for the same models? Could you explain and clarify how some tests find that the same model is globally structurally identifiable, while others find that the same model is unidentifiable (or locally structurally identifiable)?

I found the examples that were discussed and worked out in the supplementary information very useful, so I think that this information could be included in the main text.

Minor comments:

In the supplement in Section 2.2 in the first sentence, it says “when only x2 is observed”. Should it say “when only x1 is observed”?

Typo right after (18) on p. 5 of supplement. It says “u_x1)1” but should say “u_x1=1”.

Typo right before section 2.9 on p. 10 of supplement. The “u_d”s should be “u_\delta”s.

Reviewer #2: the review is uploaded as an attachment

Reviewer #3: See attached

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. 2020 Nov 3;16(11):e1008248. doi: 10.1371/journal.pcbi.1008248.r002

Author response to Decision Letter 0

10 Aug 2020

Attachment

Submitted filename: reply.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r003

Decision Letter 1

Douglas A Lauffenburger, Miles P Davenport

14 Aug 2020

Dear Dr. Castro,

We are pleased to inform you that your manuscript 'Testing structural identifiability by a simple scaling method' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Miles P. Davenport, MB BS, D.Phil

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r004

Acceptance letter

Douglas A Lauffenburger, Miles P Davenport

20 Oct 2020

PCOMPBIOL-D-20-00674R1

Testing structural identifiability by a simple scaling method

Dear Dr Castro,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard


[pcbi.1008248.ref043] 43. Wu H, Zhu H, Miao H, Perelson AS. Parameter identifiability and estimation of HIV/AIDS dynamic models. Bulletin of mathematical biology. 2008;70(3):785–799. 10.1007/s11538-007-9279-9 [DOI] [PubMed] [Google Scholar]

[pcbi.1008248.ref044] 44. Ho DD, Neumann AU, Perelson AS, Chen W, Leonard JM, Markowitz M. Rapid turnover of plasma virions and CD4 lymphocytes in HIV-1 infection. Nature. 1995;373(6510):123–126. 10.1038/373123a0 [DOI] [PubMed] [Google Scholar]

[pcbi.1008248.ref045] 45.Bartl M, Kötzing M, Kaleta C, Schuster S, Li P. Just-in-time activation of a glycolysis inspired metabolic network-solution with a dynamic optimization approach. In: Crossing Borders within the ABC: Automation, Biomedical Engineering and Computer Science. vol. 55; 2010. p. 217–222.

[pcbi.1008248.ref046] 46. Lipniacki T, Paszek P, Brasier AR, Luxon B, Kimmel M. Mathematical model of NF-κB regulatory module. Journal of theoretical biology. 2004;228(2):195–215. 10.1016/j.jtbi.2004.01.001 [DOI] [PubMed] [Google Scholar]

[pcbi.1008248.ref047] 47. Domurado M, Domurado D, Vansteenkiste S, De Marre A, Schacht E. Glucose oxidase as a tool to study in vivo the interaction of glycosylated polymers with the mannose receptor of macrophages. Journal of controlled release. 1995;33(1):115–123. 10.1016/0168-3659(94)00074-5 [DOI] [Google Scholar]

48. Buckingham E. Illustrations of the use of dimensional analysis on physically similar systems. Physical Review. 1914;4(4):354–377.

49. Lavielle M, Aarons L. What do we mean by identifiability in mixed effects models? Journal of pharmacokinetics and pharmacodynamics. 2016;43(1):111–122. doi:10.1007/s10928-015-9459-4
