PLOS Computational Biology. 2020 Nov 3;16(11):e1008248. doi: 10.1371/journal.pcbi.1008248

Testing structural identifiability by a simple scaling method

Mario Castro 1,2,☯,*, Rob J de Boer 3
Editor: Miles P Davenport 4
PMCID: PMC7665633  PMID: 33141821

Abstract

Successful mathematical modeling of biological processes relies on the expertise of the modeler to capture the essential mechanisms in the process at hand and on the ability to extract useful information from empirical data. A model is said to be structurally unidentifiable if different sets of parameter values produce the same observable outcome. This is typical (but not exclusive) of partially observed problems in which only a few variables can be experimentally measured. Most of the available methods to test the structural identifiability of a model are either too mathematically complex for the general practitioner to apply, or require involved calculations or numerical computation for complex non-linear models. In this work, we present a new analytical method to test the structural identifiability of models based on ordinary differential equations, which relies on the invariance of the equations under a scaling transformation of their parameters. The method is based on rigorous mathematical results, but it is easy and quick to apply, even to sophisticated, highly non-linear models. We illustrate our method by example and compare its performance with other existing methods in the literature.

Author summary

Theoretical Biology is a useful approach to explain, generate hypotheses, or discriminate among competing theories. A well-formulated model has to be complex enough to capture the relevant mechanisms of the problem, and simple enough to be fitted to data. Structural identifiability tests aim to recognize, in advance, if the structure of the model allows parameter fitting even with unlimited high-quality data. Available methods require advanced mathematical skills, or are too costly for high-dimensional non-linear models. We propose an analytical method based on scale invariance of the equations. It provides definite answers to the structural identifiability problem while being simple enough to be performed in a few lines of calculations without any computational aid. It favorably compares with other existing methods.


This is a PLOS Computational Biology Methods paper.

Introduction

Mathematical models contribute to our understanding of Biology in several ways, ranging from the quantification of biological processes to reconciling conflicting experiments [1]. In many cases, this requires formulating a mathematical model and extracting quantitative estimates of its parameters from the experimental data. Parameters are typically unknown constants that change the behavior of the model. While it is usually recognized that parameter estimation requires the availability of sufficient informative data, sometimes it is not possible to estimate all parameters because of the structure of the model, whatever the quantity or quality of the data, even with large amounts of noiseless observations. This inability is referred to as a lack of ‘structural identifiability’, a concept introduced decades ago by Bellman and Åström [2, 3], as opposed to ‘practical identifiability’, which depends on limitations set by the data. A lack of practical identifiability can lead to questionable interpretations of the data, which has caused some recent controversy [4, 5]. A lack of structural identifiability poses an insurmountable limitation, as it is unrelated to the resolution of the experimental data collection or the number of observations.

Structural identifiability is a necessary condition for model fitting and should be tested before any attempt to extract information about the parameters, and as a test of the applicability of the model itself. Importantly, the quality of the fit does not guarantee that the estimated parameters are meaningful. In practice, this is both uncontrolled and misleading, as many fitting tools provide information about the goodness of fit but do not check sensitivity or identifiability. Structural identifiability can be qualified as global or local [6–10]. Global structural identifiability tests the ability to estimate unique sets of parameters, while local structural identifiability (or simply, structural identifiability) means that parameters can be estimated only in a limited subset of the space of parameters. In practical terms, these definitions can be translated into the language of sensitivity analysis, as identifiability requires that (i) the columns of the sensitivity matrix are linearly independent, and (ii) each of its columns has at least one large entry [11, 12].

Traditionally, work focused primarily on linear systems [2, 3, 13] based on ordinary differential equations (ODE). Those methods cannot be applied to non-linear models, so many alternative methods have been proposed in the literature to address their structural identifiability. Early attempts were based on power series expansions of the original non-linear system [14], the similarity transformation method [15–17] or the so-called direct-test method proposed by Denis-Vidal and Joly-Blanchard [18, 19]. These methods exploit the definition of identifiability either analytically [18] or numerically [20–25], but they are not generically suitable for high-dimensional problems. Xia and Moog [6, 26] proposed an alternative to these classical methods based on the implicit function theorem, but this method also becomes cumbersome to apply to complex models [27].

Another approach that is becoming mainstream is based on the framework of differential algebra [28–31]. These methods are also difficult to apply, requiring advanced mathematical skills and, in some cases, replacing highly non-linear terms by polynomial approximations that simplify the analysis. On the positive side, they are based on rigorous mathematical theories, are suitable for non-linear models and, more importantly, they can be coded using existing symbolic computational libraries. In this regard, it is worth mentioning DAISY [32], GenSSI [33], COMBOS [34] or, more recently, SIAN [35].

In almost all cases, the major disadvantage of these methods is the difficulty of applying them to even a few differential equations, which requires advanced mathematical skills and/or dedicated numerical or symbolic software (that is frequently unable to handle the complexity of the problem). This explains why, despite the huge volume of publications in the field of theoretical biology, only a few address parameter identifiability explicitly. In this paper, we introduce a simple method to assess local structural identifiability of ODE models that reduces the complexity of existing methods and can bring identifiability testing to a broader audience. Our method is based on simple scaling transformations and the solution of simple sparse systems of equations. Identifiability for stochastic models [36] is out of the scope of our work.

Method

A couple of motivating examples

Consider a simple death model in which the death rate is the product of two parameters λ1 and λ2, namely

$$\frac{dx}{dt} = -\lambda_1 \lambda_2 x, \qquad x(0) = x_0, \tag{1}$$

with the solution

$$x = x(\lambda_1, \lambda_2, t) = x_0 e^{-\lambda_1 \lambda_2 t}. \tag{2}$$

It is evident that from an experiment only the product λ1λ2 can be inferred, and not any of the two independently. Following the ‘actionable’ definition in Ref. [11], local structural identifiability is directly linked to the linear independence of the columns of the sensitivity matrix, Sij, of the variable xi with respect to parameter λj

$$S_{ij}(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m) \equiv \frac{\partial x_i}{\partial \lambda_j}. \tag{3}$$

Here, we will work with a related (dimensionless) quantity called the relative sensitivity, or simply the elasticity matrix K with elements Kij given by

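$$K_{ij}(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m) \equiv \frac{\partial \log x_i}{\partial \log \lambda_j} = \frac{\lambda_j}{x_i}\frac{\partial x_i}{\partial \lambda_j} = \frac{\lambda_j}{x_i} S_{ij}. \tag{4}$$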

The logarithm in the definition of the elasticity matrix provides a clear-cut interpretation of its coefficients. Thus, if Kij = 1, a 10% increase in λj implies a 10% increase in xi, and if Kij = 0.5, that very same increase in λj translates only to a 5% increase in xi.

For Eq (1), the elasticity matrix would be simply a 1 × 2 matrix,

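$$K = \begin{pmatrix} K_{11} & K_{12} \end{pmatrix},$$

with

$$K_{11} \equiv \frac{\lambda_1}{x}\frac{\partial x}{\partial \lambda_1}, \quad \text{and} \quad K_{12} \equiv \frac{\lambda_2}{x}\frac{\partial x}{\partial \lambda_2}. \tag{5}$$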

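We now propose to multiply λ1 by a generic scale factor u, and to divide λ2 by the same factor, such that the solution remains invariant. Differentiating the scaled solution of Eq (2) with respect to that scale factor u, we have

$$\frac{dx}{du} = 0 \quad (\text{as } u \text{ is arbitrary}) \tag{6}$$

and also, by the chain rule,

$$\frac{dx(u\lambda_1, \lambda_2/u, t)}{du} = \frac{\partial x}{\partial \lambda_1}\lambda_1 - \frac{\lambda_2}{u^2}\frac{\partial x}{\partial \lambda_2} = 0, \tag{7}$$

where the last equality follows from Eq (6).

Rearranging Eq (7) and dividing by x,

$$\frac{\lambda_1}{x}\frac{\partial x}{\partial \lambda_1} = \frac{\lambda_2}{u^2 x}\frac{\partial x}{\partial \lambda_2} \;\Rightarrow\; K_{11} = \frac{1}{u^2} K_{12}, \tag{8}$$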

so both columns of the elasticity matrix are linearly dependent and, accordingly, λ1 and λ2 are unidentifiable. In this particular case, the exact solution confirms this result:

$$K_{11} = K_{12} = -\lambda_1 \lambda_2 t.$$
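As a quick numerical sanity check of this result, the elasticities of the solution in Eq (2) can be estimated by central finite differences in log-space. The sketch below is illustrative only: the helper names (`x_death`, `elasticity`), the step size and the parameter values are assumptions of this example and are not taken from the paper. Both columns come out equal to $-\lambda_1\lambda_2 t$ (the negative sign reflects that x decreases when either parameter increases), so they are linearly dependent.

```python
import math

def x_death(lam1, lam2, t, x0=1.0):
    """Closed-form solution of Eq (2): x = x0 * exp(-lam1 * lam2 * t)."""
    return x0 * math.exp(-lam1 * lam2 * t)

def elasticity(model, params, j, t, h=1e-5):
    """Central-difference estimate of d log(x) / d log(params[j]) at time t."""
    up = list(params)
    down = list(params)
    up[j] *= math.exp(h)      # increase log(param) by h
    down[j] *= math.exp(-h)   # decrease log(param) by h
    return (math.log(model(*up, t)) - math.log(model(*down, t))) / (2.0 * h)

lam = (0.7, 1.3)  # illustrative values (assumption)
for t in (0.5, 1.0, 2.0):
    k11 = elasticity(x_death, lam, 0, t)
    k12 = elasticity(x_death, lam, 1, t)
    # Third printed value is the analytical elasticity -lam1 * lam2 * t
    print(t, k11, k12, -lam[0] * lam[1] * t)
```

The two estimated columns agree at every time point, which is the numerical counterpart of the linear dependence stated in Eq (8).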

In this case we had complete knowledge of the solution, and consequently, it was straightforward to find the right way to introduce the scaling u. Fortunately, this simple scaling calculation can also be performed directly on Eq (1). Introducing two unknown scaling factors, u1 and u2, into that equation,

$$\frac{dx}{dt} = -u_1\lambda_1 u_2\lambda_2 x.$$

Requiring that this remains identical (or, more formally, invariant) to Eq (1), i.e., λ1λ2x = u1λ1 u2λ2 x, we conclude that u1 u2 = 1. The fact that u1 and u2 cannot be solved for individually also means that the true values of λ1 and λ2 cannot be determined; that is, both parameters are unidentifiable.

Next consider a death model with immigration:

$$\frac{dx}{dt} = \lambda_1 - \lambda_2 x. \tag{9}$$

In this case, to leave the system invariant we need to find u1 and u2 such that

$$\lambda_1 - \lambda_2 x(t) = u_1\lambda_1 - u_2\lambda_2 x(t)$$

for all values of x at any time. Rearranging the latter equation,

$$(1-u_1)\lambda_1 = (1-u_2)\lambda_2 x(t),$$

where the left-hand side of the last equation is a constant and the right-hand side depends on time. Hence the only possible solution to the latter equation is u1 = u2 = 1 implying that both λ1 and λ2 are locally identifiable. Notice the difference with the preceding case, Eq (1), in which an infinite number of combinations of the scaling factors satisfy the invariance condition.
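The contrast between Eqs (1) and (9) can also be illustrated numerically. The sketch below uses the standard closed-form solution of Eq (9), $x(t) = \lambda_1/\lambda_2 + (x_0 - \lambda_1/\lambda_2)e^{-\lambda_2 t}$, estimates the two elasticity columns at two time points, and computes the determinant of the resulting 2 × 2 matrix. A vanishing determinant signals linearly dependent columns. All function names, the sampling times and the parameter values are assumptions made for this illustration.

```python
import math

def x_death(lam1, lam2, t, x0=1.0):
    """Solution of Eq (1): pure death with rate lam1 * lam2."""
    return x0 * math.exp(-lam1 * lam2 * t)

def x_immigration(lam1, lam2, t, x0=1.0):
    """Solution of Eq (9): dx/dt = lam1 - lam2 * x with x(0) = x0."""
    x_inf = lam1 / lam2
    return x_inf + (x0 - x_inf) * math.exp(-lam2 * t)

def elasticity(model, params, j, t, h=1e-5):
    """Central-difference estimate of d log(x) / d log(params[j]) at time t."""
    up = list(params)
    down = list(params)
    up[j] *= math.exp(h)
    down[j] *= math.exp(-h)
    return (math.log(model(*up, t)) - math.log(model(*down, t))) / (2.0 * h)

def elasticity_det(model, params, t_a, t_b):
    """Determinant of the 2x2 elasticity matrix sampled at times t_a and t_b."""
    k = [[elasticity(model, params, j, t) for j in (0, 1)] for t in (t_a, t_b)]
    return k[0][0] * k[1][1] - k[0][1] * k[1][0]

params = (2.0, 0.5)  # illustrative values (assumption)
det_death = elasticity_det(x_death, params, 0.5, 3.0)
det_immigration = elasticity_det(x_immigration, params, 0.5, 3.0)
print("pure death:", det_death)               # numerically zero
print("with immigration:", det_immigration)   # clearly non-zero
```

Two sampling times suffice here because there are only two parameters; with more parameters one would sample more times and inspect the rank of the full elasticity matrix instead of a determinant.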

These simple examples illustrate how scaling invariance of the model equations can be used to determine whether the parameters are unidentifiable or not. We prove this result more rigorously in S1 Text.

Description of the method

Let us define a general ODE model characterized by the time evolution of n variables, xi(t), depending on m parameters λj,

$$\frac{dx_i}{dt} = f_i(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m), \quad i = 1,\dots,n, \tag{10}$$
$$x_i(0) = x_{i,0}, \quad i = 1,\dots,n, \tag{11}$$

where the functions $f_i$ depend on the specific details of the problem at hand and $x_{i,0}$ are the initial conditions. We need to distinguish between those variables that can be observed (measured) in the experiment, $x_1,\dots,x_r$, and those which cannot (often referred to as latent variables), $x_{r+1},\dots,x_n$.

As we will prove below, the simplicity of our method relies on the ability to decompose the functions $f_i$ as a sum of M functionally independent summands, $f_{ik}$,

$$f_i(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m) = \sum_{k=1}^{M} f_{ik}(\tilde{x}_k, \tilde{\lambda}_k), \tag{12}$$

having the property that $f_{ik}$ is functionally independent of $f_{il}$ for every $k \neq l$. For brevity, $\tilde{x}_k$ and $\tilde{\lambda}_k$ denote the subsets of variables and parameters included in the function $f_{ik}$.

The notion of linearly independent functions and how to test it is summarized in S1 Text. However, a simple definition would be: If f1(x1, x2, …), …, fn(x1, x2, …) are linearly independent functions, then the only solution of the equation

$$\sum_{i=1}^{n} a_i f_i(x_1, x_2, \dots) = 0 \tag{13}$$

is a1 = … = an = 0.

Typical examples of functionally independent functions are summarized in Table 1. For instance, $f_{11} = a x_1$, $f_{12} = b x_1 x_3$, $f_{13} = (c + x_4)^{-1}$ are functionally independent, whereas examples of dependent functions would be $f_{11} = a x_1 x_2$ and $f_{12} = b x_1 x_2$. Note that it is not required that $f_{ij}$ and $f_{kj}$ are independent (as they appear in different equations). For instance, the right-hand side of Eq (9) can be decomposed into polynomials of degree 0 (a constant) and 1 (a linear function), namely

$$f_{11} = \lambda_1, \quad \text{and} \quad f_{12} = -\lambda_2 x.$$

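Table 1. A collection of frequent linearly independent functions: All the functions listed in the Table are independent of each other (of the same or different type).

We assume that λ1 ≠ λ2 in all of the cases.

| Type | Examples |
| --- | --- |
| Polynomial (one variable) | $x^0,\ x,\ x^2,\ x^3,\ \dots$ |
| Polynomial (more than one variable) | $x_1 x_2,\ x_1^2 x_2,\ x_1 x_2 x_3,\ \dots$ |
| Rational | $\frac{1}{\lambda + x_1},\ \frac{x_1}{\lambda + x_1},\ \frac{1}{x_1 + x_2},\ \frac{1}{\lambda_1 + x_1 + x_2},\ \frac{1}{(\lambda + x_1)^2},\ \dots$ |
| Exponential | $e^{\lambda_1 x_1},\ e^{\lambda_2 x_1},\ e^{\lambda_1 x_2}$ |
| Sigmoid | $\frac{1}{\lambda_1 + e^{-\lambda_1 x_1}},\ \frac{1}{\lambda_1 + e^{-\lambda_1 x_2}},\ \frac{1}{\lambda_1 + e^{-\lambda_2 x_1}},\ \frac{1}{(\lambda_1 + e^{-\lambda_1 x_1})^2},\ \dots$ |
| Trigonometric | $\sin \lambda_1 x_1,\ \sin \lambda_1 x_2,\ \sin \lambda_2 x_1,\ \cos \lambda_1 x_1,\ \tan \lambda_1 x_2,\ \dots$ |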

We summarize our method in Box 1.

Box 1: Summary of the scale invariance local structural identifiability method introduced in this work

  1. Scale all parameters and all unobserved variables by unknown scaling factors, u:
    $$\lambda_i \to u_{\lambda_i}\lambda_i, \quad i = 1,\dots,m; \qquad x_j \to u_{x_j} x_j, \quad j = r+1,\dots,n$$
    and substitute them into Eq (15) below.
  2. Equate each functionally independent function, fik, to its scaled version. Namely,
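    $$f_{ik}(\tilde{x}, \tilde{\lambda}) = \frac{1}{u_{x_i}} f_{ik}(u_{\tilde{x}}\tilde{x}, u_{\tilde{\lambda}}\tilde{\lambda}) \tag{14}$$
    where $u_{x_i} = 1$ for $1 \le i \le r$ and the prefactor on the right-hand side of the equation comes from the scaling $\frac{dx_i}{dt} \to u_{x_i}\frac{dx_i}{dt}$. From Eq (11) it follows that $u_{x_i} = u_{x_{i,0}}$.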
  3. From Eq (14), find combinations of the scaling factors u that leave the system invariant. Hereafter, we will denote these as the identifiability equations of the model (see Eq (24) below).

  4. Only the parameters $\lambda_i$ with a solution $u_{\lambda_i} = 1$ are identifiable. Only the variables $x_i$ with $u_{x_i} = 1$ are observable. Otherwise, parameters whose scaling factors are coupled form identifiable groups but cannot be identified independently.

In summary, our method reduces the complexity of finding identifiable parameters to finding which scaling factors are not restricted to the trivial solution $u_i = 1$. In the literature, when a scaling factor is related to one of the latent variables $x_{r+1},\dots,x_n$, if $u_{x_k} = 1$, then $x_k$ is said to be observable [10]. Thus, our method addresses identifiability and observability at the same time. Additionally, irreducible equations involving two or more parameters provide the so-called identifiable groups of variables that cannot be fitted independently. In the case of the pure death model above, the identifiability equation $u_{\lambda_1}u_{\lambda_2} = 1$ is a signature of the unidentifiable group $\lambda_1\lambda_2$. This is interesting as groups involving latent variables (for instance, $u_{x_j}u_{\lambda_k}$) would inform future experiments aimed at observing that variable and decoupling that group.

It is also worth mentioning that our identifiability test (illustrated by example in S1 Text) provides a simple way to find a type of symmetry that is related to scale invariance. More sophisticated methods have been introduced in the literature to address other symmetries [37–39] using the theory of Lie group transformations; however, that approach involves complex calculations assisted by symbolic computations.

Results

The main result

Now we are equipped to prove the main result of the paper. We will proceed in two steps: firstly, we will show how Eq (14) is translated into a set of equations for the scaling factors u. Secondly, we will connect the elasticity matrix with the solution of the identifiability equations and the identifiability of the parameters.

Consider a model described by a set of n ordinary differential equations (ODE)

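$$\frac{dx_i}{dt} = f_i(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m) = \sum_{k=1}^{M} f_{ik}(\tilde{x}_k, \tilde{\lambda}_k), \tag{15}$$

where $f_{ik}$ is functionally independent of $f_{il}$ for every $k \neq l$ (namely, they satisfy the generalized Wronskian theorem; see S1 Text). For the sake of simplicity, we denote by $\tilde{x}_k$ and $\tilde{\lambda}_k$ the subsets of variables and parameters of function $f_{ik}$.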

Motivated by Eqs (1)–(5), we seek scalings of the parameters that leave the system invariant. As we prove below, this invariance (or lack thereof) is related to the identifiability of the parameters. Hence, if we define the following scaling transformation:

$$\lambda_i \to u_{\lambda_i}\lambda_i, \quad i = 1,\dots,m; \qquad x_j \to u_{x_j}x_j, \quad j = r+1,\dots,n \tag{16}$$

(where the variables $x_1,\dots,x_r$ are unmodified as we can measure them in the experiment) we can write the following set of re-scaled equations:

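$$\frac{dx_i}{dt} = \sum_{k=1}^{M} f_{ik}(u_{\tilde{x}_k}\tilde{x}_k, u_{\tilde{\lambda}_k}\tilde{\lambda}_k), \quad i = 1,\dots,r \tag{17}$$
$$x_i(0) = x_{i,0}, \quad i = 1,\dots,r \tag{18}$$
$$u_{x_i}\frac{dx_i}{dt} = \sum_{k=1}^{M} f_{ik}(u_{\tilde{x}_k}\tilde{x}_k, u_{\tilde{\lambda}_k}\tilde{\lambda}_k), \quad i = r+1,\dots,n \tag{19}$$
$$u_{x_i}x_i(0) = u_{x_{i,0}}x_{i,0}, \quad i = r+1,\dots,n \tag{20}$$

where M is the number of functionally independent summands in the equation. It is convenient to rewrite Eq (19) as

$$\frac{dx_i}{dt} = \frac{1}{u_{x_i}}\sum_{k=1}^{M} f_{ik}(u_{\tilde{x}_k}\tilde{x}_k, u_{\tilde{\lambda}_k}\tilde{\lambda}_k), \quad i = r+1,\dots,n \tag{21}$$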

to perform the scale invariance analysis below in a simpler way.

If the solution is invariant under this transformation, then the right-hand side of Eq (15) and those of the re-scaled equations above should be equal. Besides, by the functional linear independence of the functions $f_{ik}$ we can split each summand. Thus,

$$f_{ik}(\tilde{x}_k, \tilde{\lambda}_k) = f_{ik}(u_{\tilde{x}_k}\tilde{x}_k, u_{\tilde{\lambda}_k}\tilde{\lambda}_k), \quad i = 1,\dots,r \tag{22}$$

and

$$f_{ik}(\tilde{x}_k, \tilde{\lambda}_k) = \frac{1}{u_{x_i}} f_{ik}(u_{\tilde{x}_k}\tilde{x}_k, u_{\tilde{\lambda}_k}\tilde{\lambda}_k), \quad i = r+1,\dots,n \tag{23}$$

This new set of equations is much easier to solve than the one that we would obtain from Eqs (17)–(19) (which would be equivalent to the so-called direct-test method [18]). Eqs (22) and (23) admit the trivial solution $u_{\tilde{x}_k} = u_{\tilde{\lambda}_k} = 1$. Alternatively, some of the scaling factors are functionally related to each other. Generically, they can be written as

$$u_{\lambda_k} = F_k(u_{m_1}, u_{m_2}, \dots). \tag{24}$$

Note that, for each parameter k, the scaling $u_{\lambda_k}$ will depend only on a subset of all the scaling factors, $u_{m_1}, u_{m_2}, \dots$ We call Eq (24) the identifiability equations of the model. A third possibility would be that some scaling factors take fixed values different from 1. We discuss that case below.

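Let us now connect the identifiability equations with the concept of local structural identifiability. If we take the partial derivative of the following invariance condition

$$x_i(x_1,\dots,x_r,u_{x_{r+1}}x_{r+1},\dots,u_{x_n}x_n;u_{\lambda_1}\lambda_1,\dots,u_{\lambda_m}\lambda_m) = x_i(x_1,\dots,x_r,x_{r+1},\dots,x_n;\lambda_1,\dots,\lambda_m)$$

with respect to $u_{\lambda_k}$, by the chain rule, it follows that

$$\frac{\partial x_i}{\partial \lambda_k}\lambda_k + \frac{\partial x_i}{\partial m_1}m_1\beta_{m_1 k} + \frac{\partial x_i}{\partial m_2}m_2\beta_{m_2 k} + \dots = 0, \tag{25}$$

where, for convenience, we have defined

$$\beta_{mk} \equiv \frac{\partial u_m}{\partial u_{\lambda_k}} = \left(\frac{\partial F_k}{\partial u_m}\right)^{-1}.$$

Finally, dividing Eq (25) by $x_i$:

$$K_{ik} + \beta_{m_1 k}K_{im_1} + \beta_{m_2 k}K_{im_2} + \dots = 0, \tag{26}$$

where $K_{im}$ are the elements of the elasticity matrix defined in Eq (4). Eq (26) implies that $K_{ik}$ can be written as a linear combination of other column(s) of the elasticity matrix. According to our discussion in the Introduction (see also Refs. [11, 12]), this means that $\lambda_k$ is not identifiable.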

Summarising, for each parameter $\lambda_k$ either $u_{\lambda_k} = 1$ or it is not identifiable. The adjective “local” follows because the method relies on the continuity of the derivative of $x_i(t)$ with respect to $\lambda_k$ to derive Eq (25). Thus, it is unable to capture any discrete transformations like, for instance,

$$\begin{cases} u_c = 1, \\ u_\delta = 1, \end{cases} \qquad \begin{cases} u_c = \delta/c, \\ u_\delta = c/\delta, \end{cases}$$

discussed for Model 8 in S1 Text and that, as we anticipated above, is the third possible solution of the identifiability Eq (24).

Example: An unidentifiable nonlinear model [16]

Here we show how to apply our method to a nonlinear model introduced in Ref. [16] (this model is mathematically equivalent to Model 2 in S1 Text).

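$$\dot{x}_1 = \lambda_1 x_1^2 + \lambda_2 x_1 x_2, \tag{27}$$
$$\dot{x}_2 = \lambda_3 x_1^2 + \lambda_4 x_1 x_2, \tag{28}$$
$$x_1(0) = 0, \tag{29}$$
$$x_2(0) = 0, \tag{30}$$
$$x_1 \text{ is observed.} \tag{31}$$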

Following Box 1:

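  1. We re-scale the non-observed variables and parameters:
    $$x_2 \to u_{x_2}x_2, \quad \lambda_1 \to u_{\lambda_1}\lambda_1, \quad \lambda_2 \to u_{\lambda_2}\lambda_2, \quad \lambda_3 \to u_{\lambda_3}\lambda_3, \quad \lambda_4 \to u_{\lambda_4}\lambda_4, \tag{32}$$
    as $x_1$ is observed (so $u_{x_1} = 1$).
  2. We define the functionally independent functions:
    $$f_{11} = \lambda_1 x_1^2, \quad f_{12} = \lambda_2 x_1 x_2, \quad f_{21} = \lambda_3 x_1^2, \quad f_{22} = \lambda_4 x_1 x_2,$$
    and from Eq (14)
    $$u_{\lambda_1}\lambda_1 x_1^2 = \lambda_1 x_1^2, \qquad u_{\lambda_2}u_{x_2}\lambda_2 x_1 x_2 = \lambda_2 x_1 x_2$$
    and
    $$\frac{u_{\lambda_3}}{u_{x_2}}\lambda_3 x_1^2 = \lambda_3 x_1^2, \qquad u_{\lambda_4}\lambda_4 x_1 x_2 = \lambda_4 x_1 x_2,$$
    respectively.
  3. Cancelling the common factors on both sides of the previous equations, the identifiability equations are
    $$\begin{cases} u_{\lambda_1} = 1 \\ u_{\lambda_2}u_{x_2} = 1 \\ u_{\lambda_3} = u_{x_2} \\ u_{\lambda_4} = 1 \end{cases} \tag{33}$$
  4. As the system has solutions other than the trivial one ($u_{\lambda_1} = u_{\lambda_2} = \dots = 1$), it follows that the model is unidentifiable. Moreover, Eq (33) allows one to conclude that (i) if $x_2$ were observed ($u_{x_2} = 1$), all the parameters would be identifiable, and (ii) the combination $u_{\lambda_2}u_{\lambda_3}$ is identifiable as, for any scale of $x_2$, the condition $u_{\lambda_2}u_{\lambda_3} = 1$ is always fulfilled, and hence $\lambda_2\lambda_3$ is an identifiable group.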
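The conclusion of this example can be checked by direct simulation. The sketch below integrates Eqs (27) and (28) with a hand-written classical Runge-Kutta scheme and compares the observed trajectory $x_1(t)$ for (a) a reference parameter set, (b) a set re-scaled according to Eq (33), with $u_{x_2} = s$, $u_{\lambda_2} = 1/s$ and $u_{\lambda_3} = s$, and (c) a set in which only $\lambda_4$ is changed. The parameter values, step size, function names and, in particular, the non-zero initial conditions are assumptions of this illustration (zero initial conditions would give a constant solution for which every parameter set looks identical).

```python
def rhs(state, lam):
    """Right-hand side of Eqs (27)-(28)."""
    x1, x2 = state
    l1, l2, l3, l4 = lam
    return (l1 * x1 * x1 + l2 * x1 * x2, l3 * x1 * x1 + l4 * x1 * x2)

def observed_x1(state, lam, dt=0.01, steps=200):
    """Integrate with classical RK4 and return the observed x1 trajectory."""
    out = [state[0]]
    for _ in range(steps):
        k1 = rhs(state, lam)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), lam)
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), lam)
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), lam)
        state = tuple(
            s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        out.append(state[0])
    return out

lam = (-1.0, 0.5, 0.3, -0.8)   # illustrative parameters (assumption)
x0 = (1.0, 0.5)                # non-zero initial conditions (assumption)
s = 3.0                        # arbitrary scale of the latent variable x2

base = observed_x1(x0, lam)

# (b) Re-scaling allowed by Eq (33); the latent initial condition scales with x2
lam_scaled = (lam[0], lam[1] / s, lam[2] * s, lam[3])
scaled = observed_x1((x0[0], x0[1] * s), lam_scaled)

# (c) Changing lambda_4 alone is not allowed by Eq (33)
lam_other = (lam[0], lam[1], lam[2], lam[3] * s)
other = observed_x1(x0, lam_other)

gap_scaled = max(abs(a - b) for a, b in zip(base, scaled))
gap_other = max(abs(a - b) for a, b in zip(base, other))
print("invariant re-scaling, max |dx1|:", gap_scaled)
print("lambda_4 changed,     max |dx1|:", gap_other)
```

The first gap stays at the level of floating-point round-off, while the second is clearly visible, in line with $\lambda_4$ being identifiable and $\lambda_2$, $\lambda_3$ only appearing through the group $\lambda_2\lambda_3$.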

Comparison with other methods

We have applied the method outlined in Box 1 to 13 different models defined and analyzed in detail in S1 Text. The choice is based on two criteria: on the one hand, models 1-5 are included for pedagogical purposes. They are simple enough to illustrate the novel method and most of the existing methods also provide the same definite answers. Models 6-13 were chosen because they have previously been analyzed using the methods summarized in the Introduction and in Table 2. This allows us to put our method in direct competition with those methods and to highlight their merits and limitations.

Table 2. List of current methods testing structural identifiability.

We introduce here the acronyms referred to in Table 3.

| Method | Acronym | Main Ref. | Pros | Cons |
| --- | --- | --- | --- | --- |
| Direct test method | DT | [18, 20] | Simple | Limited |
| Implicit function theorem | IFT | [26] | Software | Limited |
| Taylor series approach | TS | [14] | Simple | Computationally Expensive |
| Generating series approach | GS | [13] | Simple, Software | Computationally Expensive |
| Similarity Transformation | ST | [16] | Software | Computationally Expensive |
| Differential algebra | DA | [29, 32, 34] | Software, Conclusive | Limited, Comp. Expensive |
| Reaction Network theory | RNT | [40, 41] | Simple, Hybrid with other | Only reaction systems |
| STRIKE-GOLDD | SG | [9, 22] | Powerful, Software | Computationally Expensive |
| Scaling Invariance Method | SIM | This work | Simple, Widely applicable | Only Local Identifiability |

The results of this comparison are summarized in Table 3, which is an extension of a similar table in Ref. [7]. The column Not Conclusive/Not Applicable groups different situations in which a particular method does not provide a conclusive answer (or no answer at all). In general, it captures the fact that many of these methods are computationally demanding (after several hours they do not provide any answer) or that the computations do not converge numerically. For instance, in some implementations of the Differential Algebra method [32], when the number of observables is lower than the number of parameters, the computation requires the evaluation of high-order derivatives of the functions fi in Eq (10), which can be computationally prohibitive. In other cases, some criterion of applicability is not fulfilled (for instance, the observability rank condition for the similarity transformation method) or the method cannot be completed because it involves the solution of high-degree polynomial or transcendental equations (Direct Test method). These limitations are summarized succinctly in the Cons column in Table 2.

Table 3. Summary of models compared in the literature: The number in brackets in the Model Name column corresponds to the number of observed variables.

Model Numbers correspond to those in Table A in S1 Text. The acronyms for the methods are summarized in Table 2. This table is an extension of Table 1 in Ref. [7].

Model name Main Ref. Model Number Global Struct. Id. Local Struct. Id. Unidentifiable Not Conclusive/Not Applicable
Goodwin model (1) [7] 6 SG,SIM TS,GS,ST,DT,DA,IFT,RNT
Goodwin model (all) [7] 6bis TS,GS,IFT,RNT DA,SG,SIM ST,DT
Circadian clock model [42] 7 TS,GS,RNT,SG,SIM ST,DT,DA,IFT
HIV model (1) [6, 43] 8 All
HIV model (2) [6, 43] 8bis DA,IFT,RNT TS,GS,SIM DT,ST
Linear HIV model (1) [6, 43, 44] 8ter DA,IFT,RNT,SG DT,ST,TS,GS,SIM
Glycolysis model [45] 9 GS,DA,RNT TS,SIM ST,DT
High dimensional model [42] 10 TS,GS,DA,RNT IFT,SIM ST,DT
NF-κB model (1) [46] 11 SG,SIM TS,GS,ST,DT,DA,IFT,RNT
NF-κB model (2) [46] 11bis GS,RNT TS,SIM SG ST,DT,DA,IFT
Pharmacokinetics model (1) [47] 12 TS,GS,RNT,SG,SIM ST,DT,DA,IFT
Pharmacokinetics model (2) [47] 12bis DA GS,SG,SIM ST,DT,IFT,RNT
Within-host virus model [27] 13 DA SIM TS,GS,ST,DT,IFT,RNT

Discussion and conclusions

Table 3 shows that our method handled all of the complex models considered and provides a local structural identifiability criterion that is compatible with those methods capable of producing an answer. Thus, our method is widely applicable. It is worth noting that in several cases where our scaling method gives a conclusive answer, other more complicated methods cannot address those cases (rightmost column in the table). As any globally structurally identifiable model is also locally identifiable, our results are compatible with those methods that can address that difference.

Table 3 also highlights the large discrepancies among methods. These conflicting conclusions are rather disconcerting and deserve deeper clarification. The main source of conflict arises when comparing the Taylor series and the Generating series methods, as they transform the original problem into an approximate one. Also, they (rightly) incorporate the initial conditions into the computation, while some implementations of the Differential algebra (DA) method do not (see the DAISY implementation [32]), which can lead to different conclusions. Regarding the DA method, in some instances random values are used for the parameters to handle the complexity of some models, which can lead to wrong conclusions if the parameter space is not properly explored.

Overall, we can distinguish three sources of discrepancy: local vs global structural identifiability (which is not an incompatibility, as Global implies Local and our method is restricted to the latter); conclusive vs not conclusive (which favors our method, as it is not limited by any computational constraint); and, most concerning, incompatible conclusions. Here, our method is compatible with the conclusions of DA and hybrid methods such as Reaction network theory or STRIKE-GOLDD. As we mentioned in the introduction, Differential Algebra methods (and extensions) are considered the most reliable (when computable), and our method either agrees or provides an answer where the other methods cannot. The discrepancies with other methods are due to limitations or uncontrolled approximations when applied to complex problems and have already been raised by other authors [7].

From the viewpoint of performance, it is worth emphasizing that we have performed our test by hand, as illustrated in S1 Text, and that, after some practice (and using some interesting motifs, such as sums of different parameters or the coefficients related to diagonal terms in the system of equations), the calculations can be made in a few minutes. This contrasts with the most sophisticated methods, which can fill several pages when done by hand [27] or take hours using symbolic computation packages.

Together, broad applicability and simplicity are the main signatures of our method and this may attract the interest of mathematical modelers and spread the culture of checking structural identifiability as a mandatory step when fitting experimental data.

We would like to highlight a connection with the so-called Buckingham-Π theorem of dimensional analysis [48]. In some sense, the scale invariance property is related to the principle of dimensional homogeneity, i.e., the constraints on the functional form of the independent variables with the parameters. Our identifiability equations are therefore similar to finding the so-called Π-groups in the theorem.

A limitation of the method is that it is restricted to testing local identifiability. This is implicit in the differentiability of the elasticity matrix which, by definition, is a local operation. Discrete symmetries are not captured, and more sophisticated methods (based on Lie group transformations [39]) are required. However, simple manipulation of the equations to remove the latent variables can improve the explanatory power of the method and might capture those discrete symmetries (see Sec. 3.8 of S1 Text). We leave that extension for future developments.

Finally, in this work we have chosen to solve the scaling factor equations directly, as this is easy to do with pen and paper. However, if we were to redefine the scaling factors as $u_i = e^{w_i}$, the new factors $w_i$ would obey a linear system of homogeneous equations. It is therefore expected that the problem of identifiability is related to the rank of the matrix defining the linear system of equations. In that regard, the theorems presented in S1 Text could be supplemented with generic results on homogeneous systems of equations. Thus, our results provide a solid ground for the method and indicate an avenue for further development in other systems like delay-differential or partial differential equations.
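To make the logarithmic reformulation concrete, the sketch below applies it to the identifiability equations of the nonlinear example, Eq (33). With $u = e^{w}$ they become the homogeneous linear system $w_{\lambda_1} = 0$, $w_{\lambda_2} + w_{x_2} = 0$, $w_{\lambda_3} - w_{x_2} = 0$, $w_{\lambda_4} = 0$. A scaling factor is forced to the trivial value exactly when the corresponding unit vector lies in the row space of the coefficient matrix, and a product of parameters is an identifiable group when its exponent vector does. The code is a minimal illustration under these assumptions: the function names, the variable ordering and the exact-arithmetic rank routine are choices made here, not part of the original work.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    rk = 0
    for c in range(n_cols):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def in_row_space(A, v):
    """True if v . w = 0 for every solution w of A w = 0."""
    return rank(A + [list(v)]) == rank(A)

# Unknowns w = log(u), ordered as below (ordering is an assumption)
names = ["lam1", "lam2", "lam3", "lam4", "x2"]
A = [
    [1, 0, 0, 0, 0],    # w_lam1 = 0
    [0, 1, 0, 0, 1],    # w_lam2 + w_x2 = 0
    [0, 0, 1, 0, -1],   # w_lam3 - w_x2 = 0
    [0, 0, 0, 1, 0],    # w_lam4 = 0
]

free_dims = len(names) - rank(A)  # number of independent scaling symmetries
identifiable = [
    n for j, n in enumerate(names)
    if in_row_space(A, [1 if k == j else 0 for k in range(len(names))])
]
group_lam2_lam3 = in_row_space(A, [0, 1, 1, 0, 0])  # product lam2 * lam3

print("free scaling directions:", free_dims)
print("individually identifiable/observable:", identifiable)
print("lam2*lam3 is an identifiable group:", group_lam2_lam3)
```

Exact rational arithmetic is used so that the rank is not affected by round-off; for large models a numerical or symbolic linear-algebra library would be a natural substitute.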

Another open question is the identifiability problem of mixed-effect models, where parameters are not fixed quantities for each observation but, rather, are drawn from a meta-distribution linking different subjects [49]. For instance, if one considers the simple model

$$\dot{x} = (a+b)x,$$

a and b are not identifiable. However, if they are assumed to be drawn from, say, two exponential distributions with different rates (inverse means) $\mu_a$ and $\mu_b$, then the distribution of $\lambda \equiv a + b$ is given by

$$p(\lambda;\mu_a,\mu_b) = \frac{\mu_a\mu_b}{\mu_a-\mu_b}\left(e^{-\mu_b\lambda} - e^{-\mu_a\lambda}\right),$$

which is formed by two linearly independent functions (if $\mu_a \neq \mu_b$), $\sim e^{-\mu_b\lambda}$ and $\sim e^{-\mu_a\lambda}$, so $\mu_a$ and $\mu_b$ are identifiable, as the unique solution of the identifiability equations

$$\frac{u_{\mu_a}\mu_a u_{\mu_b}\mu_b}{u_{\mu_a}\mu_a - u_{\mu_b}\mu_b}e^{-u_{\mu_b}\mu_b\lambda} = \frac{\mu_a\mu_b}{\mu_a-\mu_b}e^{-\mu_b\lambda}$$

is $u_{\mu_b} = 1$ (because of the exponential). These kinds of models need further analysis, but they seem to be amenable to our approach.
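As a small consistency check on the density quoted above, the sketch below integrates it numerically and verifies that it is normalized and that its mean equals $1/\mu_a + 1/\mu_b$, which is what one expects for the sum of two independent exponential variables when $\mu_a$ and $\mu_b$ are interpreted as rate parameters. That interpretation, the function names, the integration range and the numerical values are assumptions of this illustration.

```python
import math

def p_sum(lam, mu_a, mu_b):
    """Density of lam = a + b as quoted in the text (requires mu_a != mu_b)."""
    return mu_a * mu_b / (mu_a - mu_b) * (math.exp(-mu_b * lam) - math.exp(-mu_a * lam))

def integrate(f, lo, hi, n=100000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

mu_a, mu_b = 2.0, 0.5   # illustrative rates (assumption)
upper = 60.0            # the tail beyond this point is negligible for these rates

norm = integrate(lambda lam: p_sum(lam, mu_a, mu_b), 0.0, upper)
mean = integrate(lambda lam: lam * p_sum(lam, mu_a, mu_b), 0.0, upper)

print("normalization:", norm)
print("mean:", mean, "expected:", 1.0 / mu_a + 1.0 / mu_b)
```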

Finally, while we emphasize the simplicity of the method, it is also amenable to be implemented using symbolic computation packages, particularly for systems with a large number of equations/reactions.

Supporting information

S1 Text. In S1 Text we collect the theorems sustaining the method and a catalogue of models with a detailed computation of the identifiability equations that were used to build Table 3.

(TEX)

Acknowledgments

This work was initiated during summer visits of the authors to the Los Alamos National Laboratory, and we thank Nick Hengartner and Alan Perelson (LANL) for their hospitality and helpful comments on this work, and the Santa Fe Institute for supporting the summer visits of RdB.

Data Availability

All relevant data are within the manuscript and its Supporting information files.

Funding Statement

This work was funded by Agencia Estatal de Investigación (FIS2016-78883-C2-2-P, PID2019-106339GB-I00) to MC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Castro M, Lythe G, Molina-París C, Ribeiro RM. Mathematics in Modern Immunology. Interface focus. 2016;6(2):20150093. doi:10.1098/rsfs.2015.0093
  • 2. Bellman R, Åström KJ. On structural identifiability. Mathematical Biosciences. 1970;7(3-4):329–339. doi:10.1016/0025-5564(70)90132-X
  • 3. Jacquez JA, et al. Compartmental analysis in biology and medicine. New York, Elsevier Pub. Co.; 1972.
  • 4. Balsa-Canto E, Alonso-del Real J, Querol A. Mixed growth curve data do not suffice to fully characterize the dynamics of mixed cultures. Proceedings of the National Academy of Sciences. 2020;117(2):811. doi:10.1073/pnas.1916774117
  • 5. Ram Y, Obolski U, Feldman MW, Berman J, Hadany L. Reply to Balsa-Canto et al.: Growth models are applicable to growth data, not to stationary-phase data. Proceedings of the National Academy of Sciences. 2020;117(2):814–815. doi:10.1073/pnas.1917758117
  • 6. Miao H, Xia X, Perelson AS, Wu H. On the identifiability of nonlinear ODE models and applications in viral dynamics. SIAM review. 2011;53(1):3–39. doi:10.1137/090757009
  • 7. Chis OT, Banga JR, Balsa-Canto E. Structural identifiability of systems biology models: a critical comparison of methods. PLoS One. 2011;6(11):e27755. doi:10.1371/journal.pone.0027755
  • 8. Villaverde AF, Barreiro A. Identifiability of large nonlinear biochemical networks. Match Commun Math Comput Chem (Mulheim an der Ruhr, Germany). 2016;76(2):259–276.
  • 9. Villaverde AF, Barreiro A, Papachristodoulou A. Structural Identifiability of Dynamic Systems Biology Models. PLoS Computational Biology. 2016;12(10):1–22. doi:10.1371/journal.pcbi.1005153
  • 10. Villaverde AF, Tsiantis N, Banga JR. Full observability and estimation of unknown inputs, states and parameters of nonlinear biological models. Journal of the Royal Society Interface. 2019;16(156):20190043. doi:10.1098/rsif.2019.0043
  • 11. Jaqaman K, Danuser G. Linking data to models: data regression. Nature Reviews Molecular Cell Biology. 2006;7(11):813. doi:10.1038/nrm2030
  • 12. Komorowski M, Costa MJ, Rand DA, Stumpf MP. Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proceedings of the National Academy of Sciences. 2011;108(21):8645–8650. doi:10.1073/pnas.1015814108
  • 13. Walter E, Lecourtier Y. Unidentifiable compartmental models: What to do? Mathematical Biosciences. 1981;56(1-2):1–25.
  • 14. Pohjanpalo H. System identifiability based on the power series expansion of the solution. Mathematical Biosciences. 1978;41(1-2):21–33. doi:10.1016/0025-5564(78)90063-9
  • 15. Vajda S, Rabitz H. State isomorphism approach to global identifiability of nonlinear systems. IEEE Transactions on Automatic Control. 1989;34(2):220–223. doi:10.1109/9.21105
  • 16. Vajda S, Godfrey KR, Rabitz H. Similarity transformation approach to identifiability analysis of nonlinear compartmental models. Mathematical Biosciences. 1989;93(2):217–248. doi:10.1016/0025-5564(89)90024-2
  • 17. Chappell MJ, Godfrey KR. Structural identifiability of the parameters of a nonlinear batch reactor model. Mathematical Biosciences. 1992;108(2):241–251. doi:10.1016/0025-5564(92)90058-5
  • 18. Denis-Vidal L, Joly-Blanchard G. An easy to check criterion for (un)identifiability of uncontrolled systems and its applications. IEEE Transactions on Automatic Control. 2000;45(4):768–771. doi:10.1109/9.847119
  • 19. Raksanyi A, Lecourtier Y, Walter E, Venot A. Identifiability and distinguishability testing via computer algebra. Mathematical Biosciences. 1985;77(1-2):245–266. doi:10.1016/0025-5564(85)90100-2
  • 20. Walter E, Braems I, Jaulin L, Kieffer M. Guaranteed numerical computation as an alternative to computer algebra for testing models for identifiability. In: Numerical Software with Result Verification. Springer; 2004. p. 124–131.
  • 21. Maiwald T, Hass H, Steiert B, Vanlier J, Engesser R, Raue A, et al. Driving the model to its limit: profile likelihood based model reduction. PLoS One. 2016;11(9):e0162366. doi:10.1371/journal.pone.0162366
  • 22. Villaverde AF, Barreiro A, Papachristodoulou A. Structural Identifiability Analysis via Extended Observability and Decomposition. IFAC-PapersOnLine. 2016;49(26):171–177. doi:10.1016/j.ifacol.2016.12.121
  • 23. Kreutz C. An easy and efficient approach for testing identifiability. Bioinformatics. 2018;34(11):1913–1921. doi:10.1093/bioinformatics/bty035
  • 24. Tönsing C, Timmer J, Kreutz C. Profile likelihood-based analyses of infectious disease models. Statistical methods in medical research. 2018;27(7):1979–1998. doi:10.1177/0962280217746444
  • 25. Stigter JD, Molenaar J. A fast algorithm to assess local structural identifiability. Automatica. 2015;58:118–124. doi:10.1016/j.automatica.2015.05.004
  • 26. Xia X, Moog CH. Identifiability of nonlinear systems with application to HIV/AIDS models. IEEE transactions on automatic control. 2003;48(2):330–336. doi:10.1109/TAC.2002.808494
  • 27. Koelle K, Farrell AP, Brooke CB, Ke R. Within-host infectious disease models accommodating cellular coinfection, with an application to influenza. Virus evolution. 2019;5(2):vez018. doi:10.1093/ve/vez018
  • 28. Walter E, Pronzato L. On the identifiability and distinguishability of nonlinear parametric models. Mathematics and computers in simulation. 1996;42(2-3):125–134. doi:10.1016/0378-4754(95)00123-9
  • 29. Ljung L, Glad T. On global identifiability for arbitrary model parametrizations. Automatica. 1994;30(2):265–276. doi:10.1016/0005-1098(94)90029-9
  • 30. Ollivier F. Identifiabilité et identification: du Calcul Formel au Calcul Numérique? In: ESAIM: Proceedings. vol. 9. EDP Sciences; 2000. p. 93–99.
  • 31. Meshkat N, Eisenberg M, DiStefano JJ III. An algorithm for finding globally identifiable parameter combinations of nonlinear ODE models using Gröbner Bases. Mathematical Biosciences. 2009;222(2):61–72. doi:10.1016/j.mbs.2009.08.010
  • 32. Bellu G, Saccomani MP, Audoly S, D’Angiò L. DAISY: A new software tool to test global identifiability of biological and physiological systems. Computer methods and programs in biomedicine. 2007;88(1):52–61. doi:10.1016/j.cmpb.2007.07.002
  • 33. Chiş O, Banga JR, Balsa-Canto E. GenSSI: a software toolbox for structural identifiability analysis of biological models. Bioinformatics. 2011;27(18):2610–2611. doi:10.1093/bioinformatics/btr431
  • 34. Meshkat N, Kuo CEz, DiStefano J III. On finding and using identifiable parameter combinations in nonlinear dynamic systems biology models and COMBOS: a novel web implementation. PLoS One. 2014;9(10):e110261. doi:10.1371/journal.pone.0110261
  • 35. Hong H, Ovchinnikov A, Pogudin G, Yap C. SIAN: software for structural identifiability analysis of ODE models. Bioinformatics. 2019;35(16):2873–2874. doi:10.1093/bioinformatics/bty1069
  • 36. Brouwer AF, Meza R, Eisenberg MC. Parameter estimation for multistage clonal expansion models from cancer incidence data: A practical identifiability analysis. PLoS computational biology. 2017;13(3):e1005431. doi:10.1371/journal.pcbi.1005431
  • 37. Yates JWT, Evans ND, Chappell MJ. Structural identifiability analysis via symmetries of differential equations. Automatica. 2009;45(11):2585–2591. doi:10.1016/j.automatica.2009.07.009
  • 38. Anguelova M, Karlsson J, Jirstrand M. Minimal output sets for identifiability. Mathematical Biosciences. 2012;239(1):139–153. doi:10.1016/j.mbs.2012.04.005
  • 39. Merkt B, Timmer J, Kaschek D. Higher-order Lie symmetries in identifiability and predictability analysis of dynamic models. Physical Review E. 2015;92(1):012920. doi:10.1103/PhysRevE.92.012920
  • 40. Craciun G, Kim J, Pantea C, Rempala GA. Statistical model for biochemical network inference. Communications in Statistics-Simulation and Computation. 2013;42(1):121–137. doi:10.1080/03610918.2011.633200
  • 41. Davidescu FP, Jørgensen SB. Structural parameter identifiability analysis for dynamic reaction networks. Chemical Engineering Science. 2008;63(19):4754–4762. doi:10.1016/j.ces.2008.06.009
  • 42. Locke J, Millar A, Turner M. Modelling genetic networks with noisy and varied experimental data: the circadian clock in Arabidopsis thaliana. Journal of theoretical biology. 2005;234(3):383–393. doi:10.1016/j.jtbi.2004.11.038
  • 43. Wu H, Zhu H, Miao H, Perelson AS. Parameter identifiability and estimation of HIV/AIDS dynamic models. Bulletin of mathematical biology. 2008;70(3):785–799. doi:10.1007/s11538-007-9279-9
  • 44. Ho DD, Neumann AU, Perelson AS, Chen W, Leonard JM, Markowitz M. Rapid turnover of plasma virions and CD4 lymphocytes in HIV-1 infection. Nature. 1995;373(6510):123–126. doi:10.1038/373123a0
  • 45. Bartl M, Kötzing M, Kaleta C, Schuster S, Li P. Just-in-time activation of a glycolysis inspired metabolic network-solution with a dynamic optimization approach. In: Crossing Borders within the ABC: Automation, Biomedical Engineering and Computer Science. vol. 55; 2010. p. 217–222.
  • 46. Lipniacki T, Paszek P, Brasier AR, Luxon B, Kimmel M. Mathematical model of NF-κB regulatory module. Journal of theoretical biology. 2004;228(2):195–215. doi:10.1016/j.jtbi.2004.01.001
  • 47. Domurado M, Domurado D, Vansteenkiste S, De Marre A, Schacht E. Glucose oxidase as a tool to study in vivo the interaction of glycosylated polymers with the mannose receptor of macrophages. Journal of controlled release. 1995;33(1):115–123. doi:10.1016/0168-3659(94)00074-5
  • 48. Buckingham E. Illustrations of the use of dimensional analysis on physically similar systems. Physical Review. 1914;4(4):354–377.
  • 49. Lavielle M, Aarons L. What do we mean by identifiability in mixed effects models? Journal of pharmacokinetics and pharmacodynamics. 2016;43(1):111–122. doi:10.1007/s10928-015-9459-4
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r001

Decision Letter 0

Douglas A Lauffenburger, Miles P Davenport

2 Jul 2020

Dear Dr Castro,

Thank you very much for submitting your manuscript "Testing structural identifiability by a simple scaling method" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response that, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Miles P. Davenport, MB BS, D.Phil

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Dear Editor,

The manuscript “Testing structural identifiability by a simple scaling method” focuses on an interesting problem regarding structural identifiability of mathematical models, or the ability to uniquely determine all or some of the parameters or parameter combinations based on observed data.

The paper presents a concise, simple approach to this problem. It is an interesting article that could be of use in the field of mathematical modelling.

I was confused by Table 2, where the authors compared different methods for determining identifiability. Why do different methods arrive at disagreeing results for the same models? Could you explain and clarify how some tests find that the same model is globally structurally identifiable, while others find that the same model is unidentifiable (or locally structurally identifiable)?

I found the examples that were discussed and worked out in the supplementary information very useful, so I think that this information could be included in the main text.

Minor comments:

In the supplement in Section 2.2 in the first sentence, it says “when only x2 is observed”. Should it say “when only x1 is observed”?

Typo right after (18) on p. 5 of supplement. It says “u_x1)1” but should say “u_x1=1”.

Typo right before section 2.9 on p. 10 of supplement. The “u_d”s should be “u_\delta”s.

Reviewer #2: the review is uploaded as an attachment

Reviewer #3: See attached

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r003

Decision Letter 1

Douglas A Lauffenburger, Miles P Davenport

14 Aug 2020

Dear Dr. Castro,

We are pleased to inform you that your manuscript 'Testing structural identifiability by a simple scaling method' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Miles P. Davenport, MB BS, D.Phil

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008248.r004

Acceptance letter

Douglas A Lauffenburger, Miles P Davenport

20 Oct 2020

PCOMPBIOL-D-20-00674R1

Testing structural identifiability by a simple scaling method

Dear Dr Castro,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    Supplementary Materials

    S1 Text. Theorems supporting the method, and a catalogue of models with detailed computations of the identifiability equations that were used to build Table 3.

    (TEX)

    Attachment

    Submitted filename: reply.pdf

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting information files.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
