Author manuscript; available in PMC: 2019 Jul 23.
Published in final edited form as: Stat Interface. 2019;12(3):365–375. doi: 10.4310/18-SII550

Dynamic Structural Equation Models for Directed Cyclic Graphs: the Structural Identifiability Problem*

Yulin Wang 1, Yu Luo 2, Hulin Wu 3, Hongyu Miao 4
PMCID: PMC6648698  NIHMSID: NIHMS1040896  PMID: 31338140

Abstract

Network systems are commonly encountered and investigated in various disciplines, and network dynamics, which refer to collective node state changes over time, are an area of particular interest to many researchers. Recently, the dynamic structural equation model (DSEM) has been introduced into the field of network dynamics as a powerful statistical inference tool. In this study, in recognition that parameter identifiability is a prerequisite of reliable parameter inference, a general and efficient approach is proposed for the first time to address the structural parameter identifiability problem of linear DSEMs for cyclic networks. The key idea is to transform a DSEM to an equivalent frequency domain representation; Mason’s gain is then employed to deal with feedback loops in cyclic networks when generating identifiability equations. The identifiability result of every unknown parameter is obtained with the identifiability matrix method. The proposed approach is computationally efficient because no symbolic or expensive numerical computations are involved, and it is applicable to a broad range of linear DSEMs. Finally, selected benchmark examples of brain networks, social networks and molecular interaction networks are given to illustrate the potential application of the proposed method, and we compare the results from DSEMs, state-transition models and ordinary differential equation models.

AMS 2000 subject classifications: 00K00, 00K01, 00K02, 00K03

Keywords and phrases: Cyclic network, Dynamic structural equation model, Structural identifiability analysis, Feedback loop

1. INTRODUCTION

Network systems are common in a variety of research fields, including chemistry, physics, economics, computer science, sociology and biology [1, 2]. One particularly interesting research question on network systems is to quantitatively understand network dynamics, where network status evolves over time [3, 4]. The evolution of network status may refer to the temporal changes in network structures (e.g., edge connection), node states, or both. In this study, network structures are assumed to remain the same for simplicity, so the dynamics here refer only to the collective changes of node states [1, 5]. Considering that accurate and reliable estimation of key parameter values is critical to quantitatively characterizing network dynamics [1, 5], it has been repeatedly stressed in previous studies [6] that parameter identifiability analyses should be performed before any statistical inference techniques are applied to obtain parameter estimates. However, to the best of our knowledge, structural identifiability analysis (SIA) of complex dynamic systems like cyclic networks is a largely underexplored problem, and this study makes an attempt to fill the methodological gap.

Graphical models have long been used to mathematically describe networks and associated characteristics. Since graphical models refer to a broad range of mathematical formulations [6, 7, 8] and resources are limited, here we focus on the structural equation model (SEM) representation of networks. SEM is a powerful statistical tool employed in many research fields such as economics [9, 10], environmental science [11], multivariate statistics [12, 13], social science [14] and biomedical engineering [15, 16]. While many previous studies considered static SEMs, many real systems are dynamic in nature, so dynamic SEMs (DSEMs) have been proposed to, e.g., accommodate time course observations [11, 14, 17, 18]. Different from ordinary differential equation (ODE) models [1, 5, 6], DSEMs are discrete; also, DSEMs are more general than state-transition models [19, 20] since DSEMs can accommodate both concurrent effects and memory effects [9, 12, 14]. Therefore, DSEMs are deemed a powerful and flexible mathematical language for describing complex dynamic network systems.

The purpose of SIA is to verify whether unknown model parameters can be unambiguously determined for a given model structure and observation strategy. The importance of SIA has been increasingly recognized because identifiability is a prerequisite of obtaining reliable parameter estimates from complex dynamic models, and unidentifiability is likely to occur due to model misspecification and/or lack of information (e.g., latent variables or unobserved local structures) [6, 21]. Particularly for DSEMs, while parameter estimation techniques have drawn significant attention from researchers [10, 22], to the best of our knowledge, this is the first study that attempts to address the SIA problem of DSEMs. That said, there exist a number of previous studies on the SIA problem for static SEMs, but many of them are for recursive static SEMs that contain no feedback loops. For example, several theoretical criteria or computational methods have been established for examining model structural identifiability, such as Brito and Pearl’s conditions for bow-free models [23], Pearl’s back door and front door criteria [24], Tian’s accessory set approach [25], Brito and Pearl’s auxiliary sets condition [26], Tian’s partial regression analysis technique [27], and Wang’s identifiability matrix method [28, 29, 30]. For static SEMs with feedback loops (i.e., nonrecursive SEMs), only a few previous studies are publicly available, including the computer algebra approach, which is computationally prohibitive [31], and the cycle simplification method, which treats a cycle as two connected paths in a directed acyclic graph [28, 32]. Since none of the existing works above specifically targets DSEMs, novel SIA techniques applicable to such models need to be developed.

DSEMs refer to a broad range of models that differ in assumptions and structures, and it is impossible to explore all such models in one study. Therefore, here we only focus on DSEMs that are: 1) linear; 2) with time-invariant parameters; 3) with equal time intervals between observations; and 4) with short-term memory (i.e., a finite-order Markov chain). In this study, a general and efficient approach is proposed to determine the structural identifiability of DSEMs with (or without) latent variables, feedback loops, and an arbitrary-order Markov property. The basic idea is to transform a DSEM to its frequency domain representation via z-transform, then employ Mason’s gain [33] to handle cyclic networks and generate identifiability equations. The identifiability of each unknown parameter is then determined with Wang’s identifiability matrix method [28]. The proposed method involves no symbolic or expensive numerical computation, and is thus efficient. A theoretical result is obtained for a special class of DSEMs that contain only self-loops but no latent variables. Also, we have selected three benchmark models from brain networks, social networks and biological networks to illustrate the application of the proposed method in practice, and discussed the differences in the identifiability analysis results between DSEMs and ordinary differential equation (ODE) models.

This article is organized as follows. The SIA problem of DSEMs is defined in Section 2. We then introduce the identifiability equation generation method that incorporates z-transform and Mason’s gain in Section 3. How to determine the identifiability of every single model parameter based on the generated identifiability equations is described in Section 4. We also compare DSEMs and ODE models and present some benchmark models and applications in Section 5. Conclusions and discussions are presented in Section 6.

2. PROBLEM DEFINITION

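Let $Y_{it}$ denote the observed value of variable $Y_i$ at time $t$. Without distinguishing endogenous from exogenous variables, a linear DSEM can be given as follows

$$Y_{it} = \sum_{j \neq i} c_{ij} Y_{jt} + \sum_{j} \left( \sum_{k=1}^{o} b_{ijk} Y_{j,t-k} \right) + \varepsilon_{it}, \quad i, j = 1, \ldots, n, \tag{1}$$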

where $c_{ij}$ denotes the concurrent effect of variable $Y_j$ on $Y_i$ at time $t$, $b_{ijk}$ denotes the delayed effect of variable $Y_j$ on $Y_i$ after $k$ time intervals, and $\varepsilon_{it}$ is a homoscedastic Gaussian white noise process with zero mean. In this model specification, only historical self-dependence is allowed (i.e., $j \neq i$ at time $t$); also, the constant $o$ specifies the order of the Markov property. Eq. (1) has a corresponding directed cyclic graph (DCG) representation $G = (V, E) = (\{V_i\}, \{E_{ij}\})$, where node $V_i$ represents variable $Y_i$ and a directed edge $E_{ij}$ from node $V_j$ to node $V_i$ represents the effect $c_{ij}$ (or $b_{ijk}$) of $Y_{jt}$ (or $Y_{j,t-k}$) on $Y_{it}$. For illustration, a DSEM example with the first-order Markov property is given in Eq. (2), and its corresponding DCG representation is shown in Fig. 1.

Figure 1.

A simple DSEM example. An orange edge represents the interaction between two nodes at the same time point, and a light blue edge represents the node variable dependence at two consecutive time points.

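$$\left\{ \begin{aligned} y_{1t} &= \varepsilon_{1t} \\ y_{2t} &= c_{21} y_{1t} + c_{23} y_{3t} + b_{211} y_{1,t-1} + b_{221} y_{2,t-1} + b_{231} y_{3,t-1} + \varepsilon_{2t} \\ y_{3t} &= c_{32} y_{2t} + b_{321} y_{2,t-1} + b_{331} y_{3,t-1} + \varepsilon_{3t} \end{aligned} \right\}. \tag{2}$$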
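To make the role of the concurrent feedback between nodes 2 and 3 in Eq. (2) concrete, the following is a minimal simulation sketch. It assumes a zero initial state, unit-variance Gaussian noise and illustrative parameter values; the function name `simulate_dsem` and the dictionary `EXAMPLE_PARAMS` are hypothetical and not part of the original study. Because $y_{2t}$ and $y_{3t}$ depend on each other at the same time point, the two equations are solved simultaneously at every step, which requires $c_{23} c_{32} \neq 1$.

```python
import random

# Illustrative parameter values (assumptions, not taken from the paper).
EXAMPLE_PARAMS = {
    "c21": 0.5, "c23": 0.3, "c32": 0.4,
    "b211": 0.2, "b221": 0.1, "b231": 0.25, "b321": 0.15, "b331": 0.35,
}

def simulate_dsem(params, n_steps, noise_sd=1.0, seed=0):
    """Simulate the three-node DSEM of Eq. (2).

    Returns a list of rows (y1, y2, y3, e1, e2, e3), one per time point.
    """
    rng = random.Random(seed)
    p = params
    loop = 1.0 - p["c23"] * p["c32"]  # concurrent feedback loop between nodes 2 and 3
    if abs(loop) < 1e-12:
        raise ValueError("c23 * c32 must differ from 1")
    prev = (0.0, 0.0, 0.0)  # assumed initial state
    rows = []
    for _ in range(n_steps):
        e1, e2, e3 = (rng.gauss(0.0, noise_sd) for _ in range(3))
        y1 = e1
        # Right-hand sides of the y2 and y3 equations without the concurrent y3 / y2 terms.
        r2 = (p["c21"] * y1 + p["b211"] * prev[0] + p["b221"] * prev[1]
              + p["b231"] * prev[2] + e2)
        r3 = p["b321"] * prev[1] + p["b331"] * prev[2] + e3
        # Solve y2 = c23*y3 + r2 and y3 = c32*y2 + r3 simultaneously.
        y2 = (r2 + p["c23"] * r3) / loop
        y3 = p["c32"] * y2 + r3
        rows.append((y1, y2, y3, e1, e2, e3))
        prev = (y1, y2, y3)
    return rows
```

Calling `simulate_dsem(EXAMPLE_PARAMS, 100)` yields a time series whose every row satisfies the three equations of Eq. (2) for the chosen parameter values.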

Introduce the matrix notations $C = [c_{ij}]$ and $B_k = [b_{ijk}]$ ($k = 1, \ldots, o$); it is then straightforward to tell that there are $o + 1$ coefficient matrices in total for a DSEM with an $o$th-order Markov property. In SIA, the model structure and observation strategy are pre-specified; therefore, it is known which parameters are zero or non-zero in $C$ and $B_k$, and which node variables are observed or unobserved (i.e., latent variables). For a DSEM, the goal of SIA is to determine whether the non-zero parameters in $C$ and $B_k$ can be unambiguously determined given the observed variables. For this purpose, a set of identifiability equations will be generated to symbolically solve for the unknown non-zero parameters, as described in the next section. If there exists only one solution to an unknown parameter, this parameter is called globally identifiable; if there exist a finite number of solutions, an unknown parameter is called locally identifiable; and if there exist an infinite number of solutions, an unknown parameter is called unidentifiable. For instance, in Fig. 1, the coefficient matrices of the DSEM are

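$$C = \begin{bmatrix} 0 & 0 & 0 \\ c_{21} & 0 & c_{23} \\ 0 & c_{32} & 0 \end{bmatrix} \quad \text{and} \quad B_1 = \begin{bmatrix} 0 & 0 & 0 \\ b_{211} & b_{221} & b_{231} \\ 0 & b_{321} & b_{331} \end{bmatrix}, \tag{3}$$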

and we assume all the variables are observed. The corresponding SIA problem is to determine the number of solutions to each non-zero parameter in matrices $C$ and $B_1$ (i.e., $c_{21}$, $c_{23}$, $c_{32}$, $b_{211}$, $b_{221}$, $b_{231}$, $b_{321}$ and $b_{331}$).
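As a small illustration of how the pre-specified model structure fixes the list of unknowns, the sketch below encodes the zero/non-zero patterns of the matrices in Eq. (3) and enumerates the parameters that SIA must examine. The names `C_PATTERN`, `B_PATTERNS` and `unknown_parameters` are hypothetical helpers introduced only for this example.

```python
# 1 marks a structurally non-zero entry, 0 a structural zero (pattern of Eq. (3)).
C_PATTERN = [
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# One pattern per lag k = 1, ..., o; the example in Fig. 1 has o = 1.
B_PATTERNS = [
    [
        [0, 0, 0],
        [1, 1, 1],
        [0, 1, 1],
    ],
]

def unknown_parameters(c_pattern, b_patterns):
    """List the names of all non-zero parameters, concurrent effects first."""
    names = []
    for i, row in enumerate(c_pattern, start=1):
        for j, flag in enumerate(row, start=1):
            if flag:
                names.append(f"c{i}{j}")
    for k, pattern in enumerate(b_patterns, start=1):
        for i, row in enumerate(pattern, start=1):
            for j, flag in enumerate(row, start=1):
                if flag:
                    names.append(f"b{i}{j}{k}")
    return names

PARAMETERS = unknown_parameters(C_PATTERN, B_PATTERNS)
```

For the patterns above, `PARAMETERS` contains the same eight parameter names that are listed in the text for the model in Fig. 1.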

3. IDENTIFIABILITY EQUATION GENERATION

For static SEMs, identifiability equations are usually generated using the covariance between two observed variables [28, 32]. However, this idea is not directly applicable to DSEMs since the measurements of a variable in a DSEM are now time series, and one has to consider the time dependence between the measurements of the same variable and between the measurements of different variables. In addition, DSEMs for cyclic networks inevitably involve feedback loops, so it is necessary to consider the effects of feedback loops on parameter identifiability. To handle these technical difficulties, a new method based on z-transform [34, 35] is described in this section to generate identifiability equations for linear DSEMs. The method consists of three key steps: 1) employ z-transform to get the frequency domain representation of a DSEM; 2) obtain the z-transfer function between each pair of observed variables based on Mason’s gain; 3) derive identifiability equations from the z-transfer functions.

Specifically, applying z-transform to the model in Eq. (1) gives the following equation

$$Y_i(z) = \sum_{j \neq i} c_{ij} Y_j(z) + \sum_{j} \left[ \sum_{k=1}^{o} b_{ijk} z^{-k} Y_j(z) \right] + \sigma^2, \tag{4}$$

where $z$ is a complex number [34, 35] and $\sigma^2$ is the known constant variance of $\varepsilon_i$. The equation above can be rearranged as follows

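$$\left( 1 - \sum_{k=1}^{o} b_{iik} z^{-k} \right) Y_i(z) = \sum_{j \neq i} \left( c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k} \right) Y_j(z) + \sigma^2, \tag{5}$$

$$Y_i(z) = \sum_{j \neq i} \frac{c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}}{1 - \sum_{k=1}^{o} b_{iik} z^{-k}} Y_j(z) + \frac{\sigma^2}{1 - \sum_{k=1}^{o} b_{iik} z^{-k}}. \tag{6}$$

Let $G_{ij}(z) = \dfrac{c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}}{1 - \sum_{k=1}^{o} b_{iik} z^{-k}}$ and $\xi_i(z) = \dfrac{\sigma^2}{1 - \sum_{k=1}^{o} b_{iik} z^{-k}}$, and substitute them into Eq. (6) to obtain

$$Y_i(z) = \sum_{j \neq i} \left[ G_{ij}(z) Y_j(z) \right] + \xi_i(z), \tag{7}$$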

where $G_{ij}$ denotes the effect of variable $Y_j$ on variable $Y_i$ in the frequency domain. Eq. (7) is thus the frequency domain representation of Eq. (1), which also has a corresponding graphical representation. For example, the frequency domain representation of the DSEM in Fig. 1 is visualized in Fig. 2 as a DCG.

Figure 2.

The equivalent frequency domain representation of the dynamic SEM in Fig. 1, where $\xi_1(z)$, $\xi_2(z)$ and $\xi_3(z)$ in Eq. (7) are not shown since they are not involved in node interactions.
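The edge coefficients of the frequency domain representation can be evaluated directly from the coefficient matrices. The sketch below computes the edge gain of Eq. (7) (concurrent plus lagged effects, divided by one minus the lagged self-effects of the receiving node) at a given value of z, using exact rational arithmetic. The numerical parameter values are illustrative assumptions for the model in Fig. 1, and `transfer_gain` is a hypothetical helper name.

```python
from fractions import Fraction as F

def transfer_gain(i, j, z, C, B):
    """Evaluate the edge gain G_ij(z) of Eq. (7) for i != j (0-based indices).

    C is the concurrent-effect matrix and B is the list [B_1, ..., B_o].
    """
    if i == j:
        raise ValueError("G_ij is defined for i != j only")
    numerator = C[i][j] + sum(Bk[i][j] * z ** (-k) for k, Bk in enumerate(B, start=1))
    denominator = 1 - sum(Bk[i][i] * z ** (-k) for k, Bk in enumerate(B, start=1))
    return numerator / denominator

# Illustrative values (assumptions, not taken from the paper) with the structure of Fig. 1.
C = [[F(0), F(0), F(0)],
     [F(1, 2), F(0), F(3, 10)],
     [F(0), F(2, 5), F(0)]]
B1 = [[F(0), F(0), F(0)],
      [F(1, 5), F(1, 10), F(1, 4)],
      [F(0), F(3, 20), F(7, 20)]]

# The three edge gains that appear in Fig. 2, evaluated at z = 2.
G21 = transfer_gain(1, 0, F(2), C, [B1])
G23 = transfer_gain(1, 2, F(2), C, [B1])
G32 = transfer_gain(2, 1, F(2), C, [B1])
```

Using `Fraction` keeps the results exact, which makes it easy to compare them with hand calculations.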

Based on the frequency domain representation of a DSEM, we can get the z-transfer function between any two observed nodes. For this purpose, Mason’s gain is considered here, which is applicable to directed cyclic networks for finding the input-output relationship between two nodes. Actually, it turns out that Mason’s gain in the frequency domain representation of a DSEM is equivalent to the z-transfer function of the DSEM. Specifically, consider a sub-graph of the frequency domain representation, consisting of an input node $V_j$, an output node $V_i$, and all the paths and loops between $V_i$ and $V_j$. Let $GL_{L_t}$ denote the gain of loop $L_t$ in the frequency domain (i.e., $GL_{L_t} = \prod_l G_l$, where $G_l$ is the $l$-th edge coefficient on loop $L_t$ in the frequency domain) and let $r$ denote the maximum number of non-overlapping loops in the sub-graph; then the system determinant $SD$ of the sub-graph is defined as follows

$$SD = 1 + \sum_{s=1}^{r} \left( (-1)^{s} \sum_{m} H_m \right), \tag{8}$$

where $H_m = \prod_{t=1}^{s} GL_{L_t}$ is the $m$-th gain product of $s$ non-overlapping loops. Similarly, let $GP_{P_k}$ denote the gain associated with the forward path $P_k$ in the frequency domain (i.e., $GP_{P_k} = \prod_l G_l$, where $G_l$ is the $l$-th edge coefficient on path $P_k$ in the frequency domain); then the path determinant $SD_{P_k}$ of $P_k$ is calculated in the same way as $SD$ after excluding all feedback loops that intersect with $P_k$. The z-transfer function (i.e., Mason’s gain) between two observed nodes $V_i$ and $V_j$ is thus given as follows [33]

$$MG_{ij} = \frac{Y_{out}}{Y_{in}} = \frac{Y_i}{Y_j} = \frac{\sum_{k=1}^{q} GP_{P_k} \, SD_{P_k}}{SD}, \tag{9}$$

where $q$ is the total number of forward paths from $V_j$ to $V_i$. When two observed nodes $V_i$ and $V_j$ are in the same strongly connected component (SCC), $MG_{ij}$ and $MG_{ji}$ are dependent because $MG_{ij} MG_{ji} = \frac{Y_i}{Y_j} \cdot \frac{Y_j}{Y_i} = 1$. For this reason, we only need to keep one of the two gains. Note that the use of Mason’s gain assumes that feedback cycling is infinite; for the finite feedback cycling case, the interested reader is referred to Hayduk [36]. For illustration, three z-transfer functions can be obtained from Fig. 2 as below

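$$\left\{ \begin{aligned} MG_{12} &= \frac{G_{21}}{1 - G_{23} G_{32}} = \frac{c_{21} + (b_{211} - c_{21} b_{331}) z^{-1} - b_{211} b_{331} z^{-2}}{1 - c_{23} c_{32} - (b_{221} + b_{331} + c_{23} b_{321} + c_{32} b_{231}) z^{-1} + (b_{221} b_{331} - b_{231} b_{321}) z^{-2}} \\ MG_{23} &= \frac{G_{32}}{1 - G_{23} G_{32}} = \frac{c_{32} + (b_{321} - c_{32} b_{221}) z^{-1} - b_{221} b_{321} z^{-2}}{1 - c_{23} c_{32} - (b_{221} + b_{331} + c_{23} b_{321} + c_{32} b_{231}) z^{-1} + (b_{221} b_{331} - b_{231} b_{321}) z^{-2}} \\ MG_{13} &= \frac{G_{21} G_{32}}{1 - G_{23} G_{32}} = \frac{c_{21} c_{32} + (c_{21} b_{321} + c_{32} b_{211}) z^{-1} + b_{211} b_{321} z^{-2}}{1 - c_{23} c_{32} - (b_{221} + b_{331} + c_{23} b_{321} + c_{32} b_{231}) z^{-1} + (b_{221} b_{331} - b_{231} b_{321}) z^{-2}} \end{aligned} \right\} \tag{10}$$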
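The closed forms in Eq. (10) are instances of Eq. (9) in which a single loop (between nodes 2 and 3) touches every forward path, so each path determinant equals 1 and the system determinant is one minus the loop gain. The sketch below implements only this special case and also illustrates the remark that Mason’s gain assumes infinite feedback cycling: truncating the signal to a finite number of trips around the loop gives a partial geometric sum that converges to the closed form when the loop gain has magnitude below 1. The function names and the numerical edge gains are illustrative assumptions.

```python
from fractions import Fraction as F

def mason_gain_single_loop(path_gains, loop_gain):
    """Eq. (9) when the only loop touches every forward path (path determinants are 1)."""
    system_determinant = 1 - loop_gain  # Eq. (8) with r = 1
    return sum(path_gains) / system_determinant

def truncated_gain(path_gains, loop_gain, n_cycles):
    """Gain when the signal travels around the loop at most n_cycles times."""
    return sum(path_gains) * sum(loop_gain ** s for s in range(n_cycles + 1))

# Illustrative edge gains at one fixed value of z (assumed values).
G21, G23, G32 = F(12, 19), F(17, 38), F(19, 33)

# Transfer function from node 1 to node 3: one forward path 1 -> 2 -> 3, one loop 2 <-> 3.
MG13 = mason_gain_single_loop([G21 * G32], G23 * G32)
MG13_30_CYCLES = truncated_gain([G21 * G32], G23 * G32, 30)
```

With these values the loop gain is 17/66, so the truncated gain after 30 cycles already agrees with the closed form to well below floating-point precision.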

Now we can derive the identifiability equations from the z-transfer functions. In a z-transfer function, each coefficient in front of $z^{-k}$ ($k = 0, 1, \cdots, p$) is a polynomial $f(c_{ij}, b_{ijk})$ of unknown model parameters. If two variables $Y_i$ and $Y_j$ are observed in a DSEM, then the corresponding z-transfer function $MG_{ij}$ can be uniquely determined through their observation values [34, 35], i.e., $MG_{ij}$ is known, because their observation values are time series rather than a single value. Thus all the coefficients $f(c_{ij}, b_{ijk})$ in front of $z^{-k}$ of $MG_{ij}$ are also known, i.e., $f(c_{ij}, b_{ijk}) = C$, where $C$ is a known constant, although its exact value may not be given. This fact allows us to obtain one identifiability equation from each $z^{-k}$ term. Obviously, from one z-transfer function, we can get multiple identifiability equations, as suggested in Eq. (10) and shown in Fig. 3(a). If there are duplicate identifiability equations, we keep only one of them. For example, the three z-transfer functions in Eq. (10) have the same denominator, so we keep only one denominator and ignore the other two when generating identifiability equations.
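Reading off one identifiability equation per power of z requires expanding each z-transfer function into a ratio of polynomials. The following sketch does this for the first transfer function of Eq. (10) with polynomials stored as coefficient lists in increasing powers of the inverse of z, so the resulting entries are the quantities that the text treats as known constants. The helper names are hypothetical, and numbers are used in place of symbols only to keep the example dependency-free.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists [a0, a1, ...]."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for a, x in enumerate(p):
        for b, y in enumerate(q):
            out[a + b] += x * y
    return out

def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p))
    q = q + [F(0)] * (n - len(q))
    return [x - y for x, y in zip(p, q)]

def mg12_coefficients(c21, c23, c32, b211, b221, b231, b321, b331):
    """Numerator and denominator coefficients of G21 / (1 - G23*G32) for the model in Fig. 1."""
    # Edge gains written as ratios: G21 = n21/d2, G23 = n23/d2, G32 = n32/d3.
    n21, n23, n32 = [c21, b211], [c23, b231], [c32, b321]
    d2, d3 = [F(1), -b221], [F(1), -b331]
    numerator = poly_mul(n21, d3)
    denominator = poly_sub(poly_mul(d2, d3), poly_mul(n23, n32))
    return numerator, denominator
```

Each entry of the two returned lists corresponds to one identifiability equation; the denominator entries are shared by all three transfer functions in Eq. (10), which is why only one copy of them is kept.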

Figure 3.

Illustration of identifiability equations, identifiability matrices and the reduction results for the DSEM in Fig. 1.

4. STRUCTURAL IDENTIFIABILITY DETERMINATION

Since each identifiability equation is a symbolic polynomial equation and the order of each unknown parameter is at most one in all the identifiability equations, we can determine the structural identifiability of each parameter using the previously proposed identifiability matrix method [28]. Briefly, this method consists of two steps: conversion of identifiability equations to identifiability matrices, and determination of parameter identifiability by matrix reduction and grouping. Here we illustrate this method using the example in Fig. 3(a). First, we convert each identifiability equation in Fig. 3(a) to an identifiability matrix, as shown in Fig. 3(b). Then we reduce all the identifiability matrices by simplifying and removing dependent rows. The reduction results of the identifiability matrices in Fig. 3(b) are shown in Fig. 3(c), and we can tell from the identifiability matrices IE1, IE4, IE7 and IE10 that the unknown parameters $c_{21}$, $c_{23}$ and $c_{32}$ are globally identifiable because each of these four matrices has only one row and there is only one 1 element in this row. We then find locally identifiable and unidentifiable parameters by grouping the remaining matrices. In Fig. 3(c), after excluding IE1, IE4, IE7 and IE10, the remaining identifiability matrices are found to be in the same group because each matrix has more than one 1 element and these matrices directly or indirectly couple with each other, in the sense that two coupled matrices both have at least one 1 element in the same $j$-th column. There are 5 unknown parameters and 8 identifiability matrices in the group, so all the remaining parameters, i.e., $b_{211}$, $b_{221}$, $b_{231}$, $b_{321}$ and $b_{331}$, are locally identifiable. It is worth noting that structural identifiability is determined by the number of symbolic solutions in the identifiability matrix method [37], since structural identifiability analysis does not use real data.
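To see why the three concurrent effects have a unique solution, it helps to solve the relevant equations explicitly. The sketch below assumes, as the text does, that the constant terms of Eq. (10) are known; it uses the constant terms of the first two numerators and of the shared denominator. This is only a worked illustration of global identifiability, not the identifiability matrix method itself, and the function name is hypothetical.

```python
from fractions import Fraction as F

def recover_concurrent_effects(mg12_const, mg23_const, denominator_const):
    """Solve the constant-term equations of Eq. (10) for (c21, c32, c23).

    mg12_const        = c21
    mg23_const        = c32
    denominator_const = 1 - c23 * c32
    """
    c21 = mg12_const
    c32 = mg23_const
    if c32 == 0:
        raise ValueError("c32 is assumed non-zero by the model structure")
    c23 = (1 - denominator_const) / c32
    return c21, c32, c23

# Constants generated from assumed true values c21 = 1/2, c32 = 2/5, c23 = 3/10.
RECOVERED = recover_concurrent_effects(F(1, 2), F(2, 5), 1 - F(3, 10) * F(2, 5))
```

Because each step has exactly one solution, the recovered triple is unique, which is what global identifiability means for these three parameters.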

As illustrated in Fig. 3, the proposed method is applicable to general linear DSEMs; however, for specific DSEMs, we can directly determine structural identifiability of every unknown parameter according to the following theorem (see Appendix for proof).

Theorem 1. In a DSEM with only observed variables and self-feedback loops (i.e., $b_{ijk} \neq 0$), an edge parameter is globally identifiable if the corresponding edge is not on any self-loops or if the corresponding edge is on a self-loop of a node and this node has at least one first-order neighbor node that has no self-loops; otherwise, the parameter is locally identifiable.

5. RESULTS

5.1. Comparisons

While the focus of this study is DSEM identifiability, it is worth mentioning other mainstream dynamic models like ODEs and state-transition models. DSEMs, ODEs and state-transition models can all describe dynamic systems; however, ODEs are typically continuous and deterministic while DSEMs and state-transition models are discrete and stochastic. More importantly, DSEMs consider both concurrent effects and memory effects, whereas ODEs only consider concurrent interactions and state-transition models only consider memory effects. Therefore, state-transition models may be deemed a special case of DSEMs. For illustration, consider a dynamic system that contains 3 observed nodes and 1 unobserved node, with node $V_1$ having an input $u$. The three different models of this system are given in Eq. (11), Eq. (12) and Eq. (13), and their graphical representations are shown in Fig. 4(a), (b) and (c), respectively.

Figure 4.

Three different models of a dynamic system. Green nodes are observed and gray nodes are unobserved, orange edges indicate concurrent effects and light blue edges indicate memory effects.

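$$\left\{ \begin{aligned} \dot{y}_1 &= u \\ \dot{y}_2 &= c_{21} y_1 + c_{23} y_3 \\ \dot{y}_3 &= c_{32} y_2 \\ \dot{y}_4 &= c_{41} y_1 + c_{43} y_3 \end{aligned} \right\}, \tag{11}$$

$$\left\{ \begin{aligned} y_{1t} &= b_{111} y_{1,t-1} + \varepsilon_{1t} \\ y_{2t} &= b_{221} y_{2,t-1} + b_{231} y_{3,t-1} + \varepsilon_{2t} \\ y_{3t} &= b_{311} y_{1,t-1} + \varepsilon_{3t} \\ y_{4t} &= b_{431} y_{3,t-1} + b_{441} y_{4,t-1} + \varepsilon_{4t} \end{aligned} \right\}, \tag{12}$$

$$\left\{ \begin{aligned} y_{1t} &= b_{111} y_{1,t-1} + \varepsilon_{1t} \\ y_{2t} &= c_{21} y_{1t} + c_{23} y_{3t} + b_{221} y_{2,t-1} + b_{231} y_{3,t-1} + \varepsilon_{2t} \\ y_{3t} &= c_{32} y_{2t} + b_{311} y_{1,t-1} + \varepsilon_{3t} \\ y_{4t} &= c_{41} y_{1t} + c_{43} y_{3t} + b_{431} y_{3,t-1} + b_{441} y_{4,t-1} + \varepsilon_{4t} \end{aligned} \right\}. \tag{13}$$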

DAISY [38] is a widely used software package for structural identifiability analysis of ODE systems. It is based on differential algebra methods and can handle both linear and non-linear systems. We analyze the ODE model in Fig. 4(a) with DAISY, and the identifiability result is shown in Fig. 5(a). To the best of our knowledge, there are no existing methods or tools that can determine parameter identifiability of state-transition models or DSEMs; therefore, we resort to the method proposed in this study. We obtain the identifiability results of the two models in Fig. 4(b) and (c), as shown in Fig. 5(b) and (c), respectively. Although the DSEM in Fig. 5(c) has more unknown model parameters than the other two models, all its edges are at least locally identifiable, while in Fig. 5(a) and (b) all the edges connected with the unobserved node $V_3$ are unidentifiable. The reason is that here the DSEM corresponds to a more complex network structure (i.e., there are more paths between two observed nodes), so we can generate a larger number of identifiability equations to verify the identifiability of unknown parameters. More specifically, for the three parameters $c_{23}$, $c_{32}$ and $c_{43}$ of Fig. 5(a), we can generate only two identifiability equations: $C_1 = c_{23} c_{32}$ and $C_2 = c_{32} c_{43}$. Also, for the three parameters $b_{231}$, $b_{311}$ and $b_{431}$ of Fig. 5(b), we can get only two identifiability equations: $C_1 = b_{231} b_{311}$ and $C_2 = b_{311} b_{431}$. For these six parameters of Fig. 5(c), however, we can obtain seven reduced identifiability equations: $C_1 = c_{23} c_{32}$, $C_2 = c_{23} b_{311}$, $C_3 = b_{231} b_{311}$, $C_4 = c_{32} c_{43}$, $C_5 = b_{431} c_{32}$, $C_6 = b_{311} c_{43}$ and $C_7 = b_{311} b_{431}$.

Figure 5.

The identifiability analysis results of the three models in Fig. 4. Green edges are globally identifiable, purple edges are locally identifiable and red edges are unidentifiable.
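The unidentifiability of the edges attached to the unobserved node in Fig. 5(a) can be checked by hand: with only the two product equations quoted above for the ODE model and three unknowns, any solution can be rescaled into another one. The sketch below demonstrates this one-parameter family of solutions; the function names and numerical values are illustrative assumptions.

```python
from fractions import Fraction as F

def product_equations(c23, c32, c43):
    """The two identifiability equations quoted for Fig. 5(a): C1 = c23*c32 and C2 = c32*c43."""
    return c23 * c32, c32 * c43

def rescaled(c23, c32, c43, alpha):
    """A different parameter set that leaves C1 and C2 unchanged for any non-zero alpha."""
    if alpha == 0:
        raise ValueError("alpha must be non-zero")
    return c23 / alpha, c32 * alpha, c43 / alpha

TRUE_VALUES = (F(3, 10), F(2, 5), F(1, 2))                    # assumed values
ALTERNATIVES = [rescaled(*TRUE_VALUES, F(a)) for a in (2, 3, 5)]
```

Every element of `ALTERNATIVES` produces the same pair of constants as `TRUE_VALUES`, and since the scaling factor can be any non-zero number, there are infinitely many solutions. The same argument applies to the two equations quoted for the state-transition model in Fig. 5(b).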

5.2. Application

To illustrate the application of the proposed method, we select three benchmark models in different disciplines from the published literature. The first benchmark model is for investigating the neural correlation of speech [39]. The original graph contains 8 observed nodes and 8 directed edges, in which each node represents a region of interest (ROI) and each directed edge represents a theoretically plausible neural activation path. We assume that the corresponding DSEM has a first-order Markov property, and randomly add 3 new edges between two successive time points to get the DSEM shown in Fig. 6(a). With the proposed method, we can determine that all the parameters, including the original 8 edge parameters and the 3 newly added parameters, are globally identifiable. This is consistent with the conclusion drawn from Theorem 1, that is, all the edge parameters are globally identifiable for a DSEM without feedback loops and latent variables.

Figure 6.

Two benchmark models without latent variables. Orange edges indicate concurrent effects and light blue edges indicate memory effects.

The second benchmark model is about social networks, selected from the well-known karate club study of Zachary [40]. Zachary tracked the club members for over two years, and a total of 34 members were divided into two groups: the club administrator’s group and the instructor’s group. Here we choose the sub-network among the first six members. Different from Zachary’s study, we consider the sub-network as a directed graph by randomly setting edge directions. To introduce a 2nd-order Markov property to the model, we add three new edges between two successive time points, and two new edges between time points $t$ and $t+2$. We thus derive the DSEM shown in Fig. 6(b) from Zachary’s karate club study. With the proposed method, we can determine that every parameter is also globally identifiable, although this model has a higher-order Markov property. Since the model contains feedback loops at the same time point (i.e., $V_1^{t} \to V_3^{t} \to V_4^{t} \to V_1^{t}$) and between different time points (i.e., $V_1^{t} \to V_6^{t} \to V_1^{t+1}$, $V_1^{t-2} \to V_6^{t} \to V_1^{t+1}$ and $V_1^{t-1} \to V_5^{t-1} \to V_6^{t} \to V_1^{t+1}$) in addition to self-feedback loops (i.e., $V_2^{t-1} \to V_2^{t}$), Theorem 1 is not applicable to this model.

The third benchmark model is about biological networks, and is derived from the replication subnetwork of the within-host influenza virus life cycle [41]. Influenza A virus replication is a complex process, involving the interactions among many different biomolecules. It is therefore usually infeasible for one single experimental study to observe all the components and their interactions simultaneously, leading to the existence of latent variables. This subnetwork has 7 nodes, and we select one node as unobserved; also, we assume that this system has a first-order Markov property. With three new edges added between two successive time points, we get the DSEM shown in Fig. 7(a). Different from the previous two benchmark models, this model contains one latent variable $V_{\mathrm{UAP56}}$ and one loop that consists of edges between different time points, i.e., $V_{\mathrm{vRNPs}}^{t-1} \to V_{\mathrm{UAP56}}^{t} \to V_{\mathrm{vRNPs}}^{t+1}$. Because of the existence of feedback loops, Theorem 1 is also not applicable to this model. The identifiability results obtained with the proposed method are given in Fig. 7(b), which shows that all the edges associated with the latent variable are unidentifiable and the other edges are globally identifiable. More specifically, for the four unknown edge parameters associated with node $V_{\mathrm{UAP56}}$, we can only get three identifiability equations after reduction: $C_1 = b_{\mathrm{vRNPs},\mathrm{UAP56},1} \, b_{\mathrm{UAP56},\mathrm{vRNPs},1}$, $C_2 = c_{\mathrm{RNPcomplex},\mathrm{UAP56}} \, b_{\mathrm{UAP56},\mathrm{vRNPs},1}$ and $C_3 = b_{\mathrm{UAP56},\mathrm{vRNPs},1} \, b_{\mathrm{viralRNA},\mathrm{UAP56},1}$.

Figure 7.

One benchmark model with latent variables. Observed nodes are colored green and unobserved nodes are colored gray. Orange edges and light blue edges of the original model represent interactions at the same time point and between two successive time points, respectively. Green edges are globally identifiable and red edges are unidentifiable in the identifiability results.

6. CONCLUSIONS AND DISCUSSIONS

The SIA of DSEMs has not been addressed in existing investigations. In this study, we systematically investigated DSEMs and their structural identifiability problem. A general and computerizable solution is proposed for the first time. More specifically, the proposed approach can handle a broad range of DSEMs (e.g., with latent variables, a finite-order Markov property and feedback loops). Similar to the identifiability matrix method, the proposed method can determine the structural identifiability of every single parameter. Because it involves no symbolic or expensive numerical computations, the proposed method is computationally efficient. A theoretical result is also obtained for a special class of DSEMs that contain only self-loops but no latent variables. Finally, we compare DSEMs and other dynamic models to illustrate the differences between these model structures and their identifiability results, and three selected benchmark DSEMs from different disciplines are given to illustrate the application of the proposed method.

It is worth mentioning that in addition to explicit latent variables, the proposed method can also handle implicit latent variables, i.e., the case of disturbance correlation. Similar to the work of Wang et al. [28], disturbance correlation can be handled by adding new dummy variables. That is, if two disturbance terms $\varepsilon_i$ and $\varepsilon_j$ are dependent, there exists an implicit latent variable or structure that affects both $V_i$ and $V_j$. A dummy unobserved variable $V_k$ can be added, and it will introduce additional edges such as $V_k \to V_i$ and $V_k \to V_j$ at the current time point as well as edges between different time points.

It should be stressed that the SIA of dynamic SEMs is quite different from the SIA of static SEMs. Firstly, a dynamic SEM contains both concurrent effects and memory effects between variables, while a static SEM cannot account for memory effects. This difference in focus leads to different methods for generating identifiability equations. Wright’s path coefficient method or its variations are adopted for static SEMs to generate identifiability equations [28]. In comparison, z-transfer functions are used for dynamic SEMs to generate identifiability equations. Wright’s path coefficient method considers the common ancestor node effects in addition to path effects, whereas a z-transfer function deals with only path effects. For example, given a graph with 3 edges, $V_i \to V_j$, $V_k \to V_i$ and $V_k \to V_j$, the z-transfer function $MG_{ij}$ contains only the effect of the path $V_i \to V_j$, but Wright’s path coefficient $Cov_{ij}$ includes the effect of $V_k$ on both $V_i$ and $V_j$ (i.e., the two paths $V_k \to V_i$ and $V_k \to V_j$) in addition to the effect of the path $V_i \to V_j$. Since the methods for generating identifiability equations are different, unsurprisingly, the identifiability results are also different. Given a static SEM $G_s$, we can generate a dynamic SEM $G_d$ by adding some new edges between different time points (i.e., adding some memory effects between variables). Generally speaking, the identifiability results of $G_d$ are better than those of $G_s$ due to the introduction of temporal observations. That is, some unidentifiable edge parameters of $G_s$ may become identifiable in graph $G_d$. One can verify this conclusion with the example in Fig. 6(b).

This work also has some limitations. For instance, to use z-transform, all the observations are assumed to be equally spaced, which may not be the case in practice. Real dynamic systems may also contain both discrete and continuous variables, or involve nonlinear interactions, to which the proposed method is not applicable. In addition, the computational complexity of the proposed method is $O(N_o^2 \cdot N^3)$, where $N_o$ is the number of observed variables and $N$ is the total number of variables. The proposed method is therefore not directly applicable to high-dimensional systems until it has been specially optimized for them. However, we expect to tackle more complex SIA problems for DSEMs in the future, and this study provides a foundation for such investigations.

APPENDIX A. PROOF OF THEOREM 1

Theorem 1. In a DSEM with only observed variables and self-feedback loops (i.e., $b_{ijk} \neq 0$), an edge parameter is globally identifiable if the corresponding edge is not on any self-loops or if the corresponding edge is on a self-loop of a node and this node has at least one first-order neighbor node that has no self-loops; otherwise, the parameter is locally identifiable.

Proof: Consider first a DSEM $G$ without any feedback loops or latent variables. After z-transform, let $G'$ denote the frequency domain representation of $G$. In this case, the coefficient associated with each edge in $G'$ is the Mason’s gain between the two end nodes of the corresponding edge, and the denominator of the Mason’s gain is 1 because there are no feedback loops. Since every node of $G'$ is observed and $G'$ is a DAG, all edge coefficients of $G'$ are globally identifiable according to Kline’s work [42]. More specifically, for any two observed nodes $V_i$ and $V_j$, the Mason’s gain $MG_{ij} = c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}$ is known, and thus we can get $o + 1$ identifiability equations

$$\begin{aligned} IE_1 &: C_1 = c_{ij} \\ IE_2 &: C_2 = b_{ij1} \\ IE_3 &: C_3 = b_{ij2} \\ &\;\;\vdots \\ IE_{o+1} &: C_{o+1} = b_{ijo} \end{aligned} \tag{A-1}$$

Since each identifiability equation in Eq. (A-1) contains only one monomial that has only one unknown parameter of order 1, then every unknown parameter has a unique solution and is therefore globally identifiable.

Now consider a DSEM $G$ with only self-loops but no latent variables. Without loss of generality, assume node $V_j$ has a self-loop (multiple self-loops are not allowed). There are two different cases to consider: 1) at least one of the first-order neighbor nodes of $V_j$ has no self-loops, and 2) every one of the first-order neighbor nodes of $V_j$ has a self-loop.

For the first case, assume that node $V_i$ is a first-order in-neighbor of $V_j$ and that $V_i$ has no self-loop. Because the path from $V_i$ to $V_j$ touches no loop other than the self-loop of $V_j$, the Mason's gain between $V_i$ and $V_j$ is $MG_{ij} = \dfrac{c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}}{1 - \sum_{k=1}^{o} b_{jjk} z^{-k}}$. Since both $V_i$ and $V_j$ are observed, $MG_{ij}$ is known. We can then generate $2o + 1$ identifiability equations from $MG_{ij}$: the same $o + 1$ equations as in Eq. (A-1), plus $o$ additional equations on the self-loop parameters $b_{jjk}$ ($k = 1, \cdots, o$). As in Eq. (A-1), each of these $o$ additional equations contains only one monomial with only one unknown parameter of order 1. Therefore, $b_{ijk}$ and $b_{jjk}$ are all globally identifiable.
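The first case can be illustrated numerically. The sketch below is not the authors' implementation. It assumes the Mason's gain is available as numerator and denominator coefficient vectors in powers of $z^{-1}$, that the two polynomials share no common factor, and that the denominator is normalized to a constant term of 1. The function names and parameter values are hypothetical.

```python
import numpy as np


def mason_gain_coeffs(c_ij, b_ij, b_jj):
    """Coefficients (in powers of z^-1) of the case-1 Mason's gain:
    numerator   c_ij + sum_k b_ijk z^-k
    denominator 1    - sum_k b_jjk z^-k   (self-loop on the target node only)
    """
    num = np.concatenate(([c_ij], np.asarray(b_ij, dtype=float)))
    den = np.concatenate(([1.0], -np.asarray(b_jj, dtype=float)))
    return num, den


def recover_parameters(num, den):
    """Read the parameters back from the coefficient vectors.

    A rational function is defined only up to a common scale factor, so
    both vectors are first rescaled to make the denominator constant term 1.
    Each remaining coefficient then equals exactly one parameter.
    """
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    scale = den[0]
    num, den = num / scale, den / scale
    return num[0], num[1:], -den[1:]


# Hypothetical parameter values with lag order o = 2.
true_c, true_b_ij, true_b_jj = 0.5, [0.3, -0.2], [0.4, 0.1]
num, den = mason_gain_coeffs(true_c, true_b_ij, true_b_jj)

# Recovery is unaffected by an arbitrary common rescaling of num and den.
c_hat, b_ij_hat, b_jj_hat = recover_parameters(2.0 * num, 2.0 * den)
```

Because every coefficient maps to a single parameter, the recovered values are unique, which is the sense in which these parameters are globally identifiable.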

For the second case, every first-order neighbor node of $V_j$ has a self-loop. Without loss of generality, let node $V_i$ be an in-neighbor of $V_j$ that has a self-loop. Unlike the previous case, the Mason's gain between $V_i$ and $V_j$ is

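$$
\begin{aligned}
MG_{ij} &= \frac{c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}}{\left(1 - \sum_{k=1}^{o} b_{iik} z^{-k}\right)\left(1 - \sum_{k=1}^{o} b_{jjk} z^{-k}\right)} \\
&= \frac{c_{ij} + \sum_{k=1}^{o} b_{ijk} z^{-k}}{1 + \sum_{l=1}^{2o} SP_l z^{-l}}
\end{aligned}
\tag{A-2}
$$

with $SP_l$ being a symbolic polynomial as follows

$$
SP_l =
\begin{cases}
-(b_{ii1} + b_{jj1}), & l = 1 \\
\sum_{m=1}^{l-1} b_{iim} b_{jj(l-m)} - (b_{iil} + b_{jjl}), & 1 < l \le o \\
\sum_{m=\lfloor l/2 \rfloor}^{o} b_{iim} b_{jj(l-m)}, & o < l \le 2o
\end{cases}
\tag{A-3}
$$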

where $\lfloor l/2 \rfloor$ denotes the largest integer not exceeding $l/2$. We can generate $2o$ new identifiability equations from Eq. (A-3), in addition to the $o + 1$ equations of the form in Eq. (A-1). As in the case without feedback loops or latent variables, we can conclude from these $o + 1$ equations that parameters $c_{ij}$ and $b_{ijk}$ are globally identifiable. From Eq. (A-3), the $2o$ new equations contain only $2o$ unknown parameters, i.e., $b_{iik}$ and $b_{jjk}$ ($k = 1, \cdots, o$), but these equations are not always linear. We can therefore determine the structural identifiability of $b_{iik}$ and $b_{jjk}$ ($k = 1, \cdots, o$) using the Jacobian matrix. None of the equations can be represented as a linear combination of the other equations; that is, the rank of the corresponding Jacobian matrix is $2o$. However, since the order of many of these equations is greater than 1, the system admits a finite number of solutions rather than a unique one. Therefore, the parameters associated with self-loops (i.e., $b_{iik}$ and $b_{jjk}$) are locally identifiable in this case.

In summary, the theorem holds. □
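The second case of the proof can be checked numerically with a short sketch. This is an illustration under stated assumptions, not the authors' implementation: the denominator of Eq. (A-2) is expanded by polynomial multiplication (`numpy.convolve`), the Jacobian is approximated by central differences at a single hypothetical parameter point with $o = 2$, and all function names and parameter values are invented for the example.

```python
import numpy as np


def denominator_coeffs(b_ii, b_jj):
    """Coefficients, in powers of z^-1, of
    (1 - sum_k b_iik z^-k) * (1 - sum_k b_jjk z^-k)."""
    p_i = np.concatenate(([1.0], -np.asarray(b_ii, dtype=float)))
    p_j = np.concatenate(([1.0], -np.asarray(b_jj, dtype=float)))
    return np.convolve(p_i, p_j)  # polynomial product


def sp_map(theta, o):
    """Map theta = (b_ii1..b_iio, b_jj1..b_jjo) to (SP_1, ..., SP_2o)."""
    return denominator_coeffs(theta[:o], theta[o:])[1:]


def numerical_jacobian(theta, o, eps=1e-6):
    """Central-difference Jacobian of sp_map at theta (2o x 2o)."""
    theta = np.asarray(theta, dtype=float)
    jac = np.zeros((2 * o, 2 * o))
    for n in range(2 * o):
        step = np.zeros(2 * o)
        step[n] = eps
        jac[:, n] = (sp_map(theta + step, o) - sp_map(theta - step, o)) / (2 * eps)
    return jac


# Hypothetical self-loop parameters with lag order o = 2.
o = 2
b_ii = [0.5, -0.2]
b_jj = [0.1, 0.3]
theta = np.array(b_ii + b_jj)
swapped = np.array(b_jj + b_ii)  # exchange the two self-loops

# Full rank at this point: the solution is locally isolated.
rank = np.linalg.matrix_rank(numerical_jacobian(theta, o))

# A different parameter vector yields the same SP_l values: no uniqueness.
same_sp = np.allclose(sp_map(theta, o), sp_map(swapped, o))
```

At this example point the Jacobian has rank $2o$, yet exchanging the two sets of self-loop parameters leaves every $SP_l$ unchanged. The equations therefore have more than one solution, which matches the conclusion that these parameters are locally rather than globally identifiable.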

Footnotes

* We thank the AE and the referees for review.

Wang was partially supported by Fundamental Research Funds for the Central Universities of China under Grant ZYGX2014J064.

Wu was partially supported by NIH/NIAID grant R01AI087135–08.

§ Miao was partially supported by NSF grant DMS-1620957.

Contributor Information

Yulin Wang, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, China.

Yu Luo, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China.

Hulin Wu, Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center, Houston, TX, 77030, USA.

Hongyu Miao, Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center, Houston, TX, 77030, USA.

REFERENCES

  • [1]. VINAYAGAM A, GIBSON TE, LEE H, et al. (2016). Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets. Proceedings of the National Academy of Sciences of the United States of America. 113 4976–4981.
  • [2]. LI A, CORNELIUS SP, LIU Y-Y, et al. (2017). The fundamental advantages of temporal networks. Science. 358 1042–1046.
  • [3]. JENNINGS JH, UNG RL, RESENDEZ SL, et al. (2015). Visualizing hypothalamic network dynamics for appetitive and consummatory behaviors. Cell. 160 516–527.
  • [4]. BURNS SP, SANTANIELLO S, YAFFE R, et al. (2014). Network dynamics of the brain and influence of the epileptic seizure onset zone. Proceedings of the National Academy of Sciences of the United States of America. 111 5321–5330.
  • [5]. LIU Y, SLOTINE JE and BARABASI A (2011). Controllability of complex networks. Nature. 473 167–173.
  • [6]. MIAO H, XIA X, PERELSON AS, et al. (2011). On identifiability of nonlinear ODE models and applications in viral dynamics. SIAM Review. 53 3–39.
  • [7]. MINCHEVA M and ROUSSEL MR (2007). Graph-theoretic methods for the analysis of chemical and biochemical networks. I. Multistability and oscillations in ordinary differential equation models. Journal of Mathematical Biology. 55 61–86.
  • [8]. DOMKE J (2013). Learning graphical model parameters with approximate marginal inference. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 35 2454–2467.
  • [9]. VALENTINI P, IPPOLITI L and FONTANELLA L (2013). Modeling US housing prices by spatial dynamic structural equation models. The Annals of Applied Statistics. 7 763–798.
  • [10]. CZIRAKY D and SKRTIC MM (2004). Maximum likelihood estimation of dynamic structural equation models for ICT impact. 26th International Conference on Information Technology Interfaces. Cavtat, Croatia.
  • [11]. FONTANELLA L, IPPOLITI L and VALENTINI P (2007). Environmental pollution analysis by dynamic structural equation models. Environmetrics. 18 265–283.
  • [12]. PRINDLE JJ and MCARDLE JJ (2012). An examination of statistical power in multigroup dynamic structural equation models. Structural Equation Modeling. 19 351–371.
  • [13]. ASPAROUHOV T, HAMAKER EL and MUTHÉN B (2017). Dynamic latent class analysis. Structural Equation Modeling: A Multidisciplinary Journal. 24 257–269.
  • [14]. BAINGANA B, MATEOS G and GIANNAKIS GB (2013). Dynamic structural equation models for social network topology inference. arXiv: Social and Information Networks.
  • [15]. STEPHAN KE, KASPER L, HARRISON LM, et al. (2008). Nonlinear dynamic causal models for fMRI. NeuroImage. 42 649–662.
  • [16]. MARREIROS AC, KIEBEL SJ and FRISTON KJ (2008). Dynamic causal modelling for fMRI: A two-state model. NeuroImage. 39 269–278.
  • [17]. CZIRAKY D (2004). Estimation of dynamic structural equation models with latent variables. Advances in Methodology and Statistics. 1 185–204.
  • [18]. BAINGANA B, MATEOS G and GIANNAKIS GB (2013). Dynamic structural equation models for tracking topologies of social networks. Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2013 IEEE 5th International Workshop on. St. Martin, France.
  • [19]. BRISKE DD, FUHLENDORF SD and SMEINS FE (2005). State-and-transition models, thresholds, and rangeland health: A synthesis of ecological concepts and perspectives. Rangeland Ecology & Management. 58 1–10.
  • [20]. BAGCHI S, BRISKE DD, BESTELMEYER BT, et al. (2013). Assessing resilience and state-transition models with historical records of cheatgrass Bromus tectorum invasion in North American sagebrush-steppe. Journal of Applied Ecology. 50 1131–1141.
  • [21]. SHAMAIAH M, LEE SH and VIKALO H (2012). Graphical models and inference on graphs in genomics: Challenges of high-throughput data analysis. IEEE Signal Processing Magazine. 29 51–65.
  • [22]. AKASHI K and KUNITOMO N (2015). The limited information maximum likelihood approach to dynamic panel structural equation models. Annals of the Institute of Statistical Mathematics. 67 39–73.
  • [23]. BRITO C and PEARL J (2002). A new identification condition for recursive models with correlated errors. Structural Equation Modeling. 9 459–474.
  • [24]. PEARL J (2009). Causality: Models, reasoning, and inference (2nd edition). Cambridge University Press; New York, NY, USA.
  • [25]. TIAN J (2012). A criterion for parameter identification in structural equation models. arXiv preprint arXiv:1206.5289.
  • [26]. BRITO C and PEARL J (2012). Graphical condition for identification in recursive SEM. arXiv preprint arXiv:1206.6821.
  • [27]. TIAN J (2009). Parameter identification in a class of linear structural equation models. IJCAI. Pasadena, California, USA.
  • [28]. WANG Y, LU N and MIAO H (2016). Structural identifiability of cyclic graphical models of biological networks with latent variables. BMC Systems Biology. 10 1–15.
  • [29]. WANG Y and MIAO H (2017). Parameter identifiability-based optimal observation remedy for biological networks. BMC Systems Biology. 11 53.
  • [30]. WANG Y, LUO Y, WANG M, et al. (2018). Time-invariant biological networks with feedback loops: Structural equation models and structural identifiability. IET Systems Biology. DOI: 10.1049/IET-SYB.2018.5004.
  • [31]. SULLIVANT S, GARCIA-PUENTE LD and SPIELVOGEL S (2010). Identifying causal effects with computer algebra. Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI). Catalina Island, CA, USA.
  • [32]. DRTON M, FOYGEL R and SULLIVANT S (2011). Global identifiability of linear structural equation models. Annals of Statistics. 39 865–886.
  • [33]. MASON SJ (1956). Feedback theory-further properties of signal flow graphs. Proceedings of The Institute of Radio Engineers. 44 920–926.
  • [34]. SANDRYHAILA A and MOURA JMF (2013). Discrete signal processing on graphs. IEEE Transactions on Signal Processing. 61 1644–1656.
  • [35]. NAYYERI V, SOLEIMANI M, MOHASSEL JR, et al. (2011). FDTD modeling of dispersive bianisotropic media using z-transform method. IEEE Transactions on Antennas and Propagation. 59 2268–2279.
  • [36]. HAYDUK LA (2009). Finite feedback cycling in structural equation models. Structural Equation Modeling. 16 658–675.
  • [37]. GARCIA C and LI T (1980). On the number of solutions to polynomial systems of equations. SIAM Journal on Numerical Analysis. 11 540–546.
  • [38]. BELLU G, SACCOMANI M, AUDOLY S, et al. (2007). DAISY: A new software tool to test global identifiability of biological and physiological systems. Computer Methods and Programs in Biomedicine. 88 52.
  • [39]. PRICE LR, LAIRD AR, FOX PT, et al. (2009). Modeling dynamic functional neuroimaging data using structural equation modeling. Structural Equation Modeling. 16 147–162.
  • [40]. ZACHARY WW (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research. 33 452–473.
  • [41]. MATSUOKA Y, MATSUMAE H, KATOH M, et al. (2013). A comprehensive map of the influenza A virus replication cycle. BMC Systems Biology. 7 97.
  • [42]. KLINE RB (2005). Principles and practice of structural equation modeling (2nd ed.). Guilford Press; New York, NY, USA.
