Published in final edited form as: IEEE Trans Automat Contr. 2019 Jul 4;65(5):1901–1910. doi: 10.1109/TAC.2019.2926854

Smooth interpolation of covariance matrices and brain network estimation: Part II

Lipeng Ning

Abstract

This work focuses on the modeling of time-varying covariance matrices using the state covariance of linear systems. Following concepts from optimal mass transport, we investigate and compare three types of covariance paths which are solutions to different optimal control problems. One of the covariance paths solves the Schrödinger bridge problem (SBP). The other two types of covariance paths are based on generalizations of the Fisher-Rao metric in information geometry, which are the major contributions of this work. The general framework is an extension of the approach in [1], which focuses on linear systems without stochastic input. The performance of the three covariance paths is compared using synthetic data and a real-data example on the estimation of dynamic brain networks using functional magnetic resonance imaging.

Index Terms: Optimal control, linear stochastic system, Fisher-Rao metric, optimal mass transport

I. Introduction

Modeling of time-varying covariance matrices of non-stationary multivariate time series is a fundamental problem in many scientific applications [2]–[8]. In this paper, we investigate a control-theoretic approach that uses the state covariance of linear systems with stochastic input to model time-varying covariance matrices. The goal is to use suitable parametric models of time-varying covariance matrices to understand the dynamic interactions among multivariable states. This approach is an extension of the framework proposed in [1], which focuses on a simpler situation based on linear systems without stochastic input. The motivation for this series of papers comes from a neuroimaging application that uses time-varying covariance matrices of resting-state functional magnetic resonance imaging (rsfMRI) data [9]–[13] to investigate dynamic brain networks.

Let us consider the following linear system

$$\dot{x}_t = A_t x_t + \sigma\,\mathrm{d}w_t, \tag{1}$$

where $A_t \in \mathbb{R}^{n\times n}$, $w_t$ denotes a standard $n$-dimensional Wiener process, and $\mathrm{d}w_t$ denotes the differential form of $w_t$. The corresponding state covariance $P_t = \mathbb{E}(x_t x_t')$ evolves according to

$$\dot{P}_t = A_t P_t + P_t A_t' + \sigma^2 I, \tag{2}$$

where I denotes the identity matrix. We assume that two state covariance matrices P0, P1 are given and the underlying system parameters in (1) are unknown. The problem considered in this paper is to find a smooth covariance path that simultaneously connects the two endpoints at t = 0, 1 and satisfies (2). This problem is underdetermined since the solution is not unique. In order to obtain a meaningful solution, it is natural to use a cost function that regularizes the feasible paths. To this end, we consider covariance paths that are optimal solutions to problems of the following form

$$\min_{P_t, A_t}\left\{\int_0^1 f(A_t)\,\mathrm{d}t \;\middle|\; \text{(2) holds},\ P_0, P_1 \text{ specified}\right\}, \tag{3}$$

where $f(A_t)$ denotes a quadratic function of $A_t$ which may depend on $P_t$. The optimal solution to (3) not only provides a smooth covariance path $P_t$ but also a dynamical system that describes the interactions among the multivariable states. These parametric models of covariance paths can be applied in regression problems to fit noisy sample covariances in order to understand the underlying system dynamics of stochastic processes.
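To make the role of (2) concrete, the following sketch (not from the paper; the constant system matrix $A$, the noise level σ, and the initial covariance are hypothetical choices) propagates a state covariance by numerically integrating the Lyapunov differential equation (2):

```python
# A minimal sketch of propagating the state covariance of (1) by integrating
# the Lyapunov differential equation (2); A, sigma, and P0 are made-up values.
import numpy as np
from scipy.integrate import solve_ivp

n, sigma = 2, 0.5
A = np.array([[-1.0, 0.8],
              [-0.8, -1.0]])          # assumed constant system matrix A_t = A
P0 = np.array([[1.0, 0.2],
               [0.2, 0.5]])

def lyapunov_rhs(t, p):
    """Right-hand side of (2); P is flattened to a vector for the ODE solver."""
    P = p.reshape(n, n)
    dP = A @ P + P @ A.T + sigma**2 * np.eye(n)
    return dP.ravel()

sol = solve_ivp(lyapunov_rhs, (0.0, 1.0), P0.ravel(), rtol=1e-10, atol=1e-12)
print(sol.y[:, -1].reshape(n, n))     # state covariance P_1 at t = 1
```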

In this paper, we investigate the optimal solutions to (3) corresponding to three objective functions. These functions were investigated in [1], which focuses on a simpler situation without stochastic input, i.e. σ = 0. The motivations of these objective functions are based on concepts in optimal mass transport (OMT) [14]–[17] and the Fisher-Rao metric in information geometry [18], [19]. In particular, the optimization problem in (3) with the OMT-based objective function is related to the Schrödinger Bridge Problem (SBP) [20], which has been extensively investigated in recent years [21], [22]. The main contributions of this work are the other two solutions based on generalizations of the Fisher-Rao metric in information geometry. Moreover, the general theme of this paper is closely related to the results in [5], [23]–[27], which all focus on the modeling of smooth covariance paths, though the methods and solutions proposed in this paper are substantially different from those references. In particular, this paper introduces a family of linear-system-based covariance paths for modeling the rotation of energy among multivariate time series, providing a potentially useful solution for investigating oscillations in brain networks using rsfMRI data.

The organization of this paper is as follows. In Section II we revisit the covariance paths based on optimal mass transport and the SBP. In Section III, we derive the optimal solution of (3) corresponding to a Fisher-Rao metric based objective function. In Section IV, we investigate a family of covariance paths obtained using a weighted least-squares function and discuss their relations with the Fisher-Rao based solutions. The three types of covariance paths are compared in Section V using simulations and real data based on a neuroimaging application. Section VI concludes the paper with discussions.

For notation, $\mathcal{S}^n$, $\mathcal{S}_+^n$, and $\mathcal{S}_{++}^n$ denote the sets of symmetric, positive semidefinite, and strictly positive definite matrices of size $n \times n$, respectively. Small boldface letters, e.g. $\mathbf{x}$, $\mathbf{v}$, represent column vectors. Capital letters, e.g. $P$, $A$, denote matrices. Regular small letters, e.g. $w$, $h$, are for scalars or scalar-valued functions.

II. Optimal-mass-transport based covariance paths

Let us consider $u_t = A_t x_t$ as the control input that steers the state covariance matrices according to (2). We define

$$f_{P_t}^{\mathrm{omt}}(A_t) = \frac{1}{2}\mathbb{E}\left(\|u_t\|_2^2\right) = \frac{1}{2}\operatorname{tr}\left(A_t P_t A_t'\right),$$

as the objective function in (3). Then the optimization problem becomes

$$\min_{P_t, A_t}\left\{\int_0^1 \operatorname{tr}\left(A_t P_t A_t'\right)\mathrm{d}t \;\middle|\; \dot{P}_t = A_t P_t + P_t A_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}. \tag{4}$$

If the input noise vanishes, i.e. σ = 0, then the optimal value of (4) is equal to the optimal-mass-transport distance [14]–[17] between two zero-mean Gaussian distributions with covariance matrices $P_0$ and $P_1$, respectively. It is also equal to the Bures distance between density matrices in quantum mechanics [23], [28], [29]. In the general situation when σ is non-zero, (4) can be viewed as the Schrödinger Bridge Problem (SBP) [20] between two zero-mean Gaussian probability density functions with covariances $P_0$ and $P_1$, respectively. The basic idea of the SBP is to find a stochastic system whose probability law on diffusion paths is closest, as measured by relative entropy, to that of a reference system while satisfying the initial and final marginal probability distributions. Its relations with stochastic optimal control and mass transport have been extensively studied recently [21], [22]. The more general situation in which the reference model is a linear time-varying system with degenerate diffusion, i.e. noise that only influences a subspace of the random states, has also recently been studied in [30], [31]. The following proposition presents the solution to (4).

Proposition 1. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and a scalar σ, the unique solution to (4) is given by

$$A_t^{\mathrm{omt}} = -\Pi_0\left(I - \Pi_0 t\right)^{-1}, \tag{5}$$
$$P_t^{\mathrm{omt}} = \left(I - \Pi_0 t\right) P_0 \left(I - \Pi_0 t\right) + \sigma^2\left(It - \Pi_0 t^2\right), \tag{6}$$

where

$$\Pi_0 = I - P_0^{-\frac{1}{2}}\left(\left(P_0^{\frac{1}{2}} P_1 P_0^{\frac{1}{2}} + \tfrac{1}{4}\sigma^4 I\right)^{\frac{1}{2}} - \tfrac{1}{2}\sigma^2 I\right)P_0^{-\frac{1}{2}}. \tag{7}$$

This proposition is a special case of Proposition 4 in [30]. For completeness, an independent proof is provided in the Appendix.

We note that if $\Pi_0$ is singular, then $A_t^{\mathrm{omt}}$ is also singular for all $t \in [0, 1]$, implying free diffusion in the subspace spanned by the eigenvectors of $\Pi_0$ corresponding to zero eigenvalues. In the noiseless situation when σ = 0, the covariance path in (6) is equal to the geodesic induced by the Wasserstein-2 metric.
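As a numerical illustration (a sketch under the stated assumptions, not the author's code), Proposition 1 can be evaluated directly: compute $\Pi_0$ from (7) with a matrix square root and then evaluate the path (6) on a time grid:

```python
# A sketch of the closed-form SBP solution of Proposition 1: Pi_0 from (7),
# then the covariance path (6); the endpoints and sigma are example values.
import numpy as np
from scipy.linalg import sqrtm

def omt_path(P0, P1, sigma, ts):
    I = np.eye(P0.shape[0])
    P0h = sqrtm(P0).real                 # P0^{1/2}
    P0hi = np.linalg.inv(P0h)            # P0^{-1/2}
    inner = sqrtm(P0h @ P1 @ P0h + 0.25 * sigma**4 * I).real
    Pi0 = I - P0hi @ (inner - 0.5 * sigma**2 * I) @ P0hi          # (7)
    return [(I - Pi0 * t) @ P0 @ (I - Pi0 * t)
            + sigma**2 * (I * t - Pi0 * t**2) for t in ts]        # (6)

P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
path = omt_path(P0, P1, sigma=0.5, ts=np.linspace(0.0, 1.0, 11))
print(np.round(path[-1], 6))             # reproduces P1 up to numerical error
```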

III. Information-geometry based covariance paths

The Fisher information metric provides a well-defined distance measure between probability distributions. For zero-mean multivariate Gaussian distributions, the Fisher information metric can be expressed as a quadratic form of the covariance matrices, which is referred to as the Fisher-Rao metric [18], [19]. Specifically, let $P$ be a positive definite covariance matrix and let $\Delta$ be a symmetric matrix denoting a tangent direction at $P$ on the manifold of positive-definite matrices. Then the Fisher-Rao metric has the following form [19]:

$$g_P(\Delta) = \operatorname{tr}\left(P^{-1}\Delta P^{-1}\Delta\right).$$

The geodesic connecting P0, P1 induced by the Fisher-Rao metric is the optimal solution to

$$\min_{P_t, \dot{P}_t}\left\{\int_0^1 \operatorname{tr}\left(P_t^{-1}\dot{P}_t P_t^{-1}\dot{P}_t\right)\mathrm{d}t \;\middle|\; P_0, P_1 \text{ specified}\right\}, \tag{8}$$

which has the following well-known expression [32]

$$P_t = P_0^{\frac12}\left(P_0^{-\frac12} P_1 P_0^{-\frac12}\right)^{t} P_0^{\frac12}. \tag{9}$$

In [1], we have shown that (9) is also the optimal solution to

$$\min_{P_t, A_t}\left\{\int_0^1 f_{P_t}^{\mathrm{info}}(A_t)\,\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t',\ P_0, P_1 \text{ specified}\right\}, \tag{10}$$

where the objective function is given by

$$f_{P_t}^{\mathrm{info}}(A_t) = \mathbb{E}\left(\|u_t\|_{P_t^{-1}}^2\right) = \mathbb{E}\left(u_t' P_t^{-1} u_t\right) = \operatorname{tr}\left(P_t^{-1}A_tP_tA_t'\right).$$

Thus the Fisher-Rao metric can be viewed as a weighted-mass-transport cost function for zero-mean Gaussian probability distribution functions.

Following (3) and (10), we consider the following optimization problem:

$$\min_{P_t, A_t}\left\{\int_0^1 f_{P_t}^{\mathrm{info}}(A_t)\,\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}, \tag{11}$$

which generalizes (10) with the additional noise term $\sigma^2 I$.

A. On the optimal solution

The optimal solution to (11) is presented as follows.

Proposition 2. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and a scalar σ, if there exists a path that satisfies the following differential equations

$$\dot{P}_t = -2P_t\Pi_tP_t + \sigma^2 I, \tag{12}$$
$$\dot{\Pi}_t = 2\Pi_tP_t\Pi_t, \tag{13}$$

with $\Pi_t \in \mathcal{S}^n$ and $P_t$ being equal to $P_0$ and $P_1$ at $t = 0$ and $t = 1$, respectively, then $P_t$ is an optimal solution to (11). The corresponding $A_t$ is equal to

$$A_t = -P_t\Pi_t. \tag{14}$$

Moreover, Pt also satisfies the following differential equation

$$\ddot{P}_t - \dot{P}_tP_t^{-1}\dot{P}_t + \sigma^4 P_t^{-1} = 0. \tag{15}$$

The proof of the above proposition is provided in the Appendix.

In the noiseless situation when σ = 0, (15) becomes

$$\ddot{P}_t - \dot{P}_tP_t^{-1}\dot{P}_t = 0,$$

which is the geodesic equation induced by the Fisher-Rao metric [6], [32]. In this case, the closed-form expression of Pt is given by (9).
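For reference, the geodesic (9) is straightforward to evaluate numerically; the following sketch (illustrative, not from the paper) computes the fractional matrix power through an eigendecomposition:

```python
# A sketch of the Fisher-Rao geodesic (9), i.e., the noiseless (sigma = 0) case.
import numpy as np
from scipy.linalg import sqrtm

def fisher_rao_geodesic(P0, P1, t):
    P0h = sqrtm(P0).real                 # P0^{1/2}
    P0hi = np.linalg.inv(P0h)            # P0^{-1/2}
    M = P0hi @ P1 @ P0hi                 # symmetric positive definite
    w, V = np.linalg.eigh(M)
    Mt = V @ np.diag(w**t) @ V.T         # fractional power M^t
    return P0h @ Mt @ P0h                # equation (9)

P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
print(fisher_rao_geodesic(P0, P1, 0.5))  # geodesic midpoint
```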

B. The scalar case

The closed-form expression of the optimal solution to (11) is currently unknown to the author except for scalar-valued covariance. In this case, the optimal covariance path has one of the following expressions:

$$p_t = p_0 + \sigma^2 t, \tag{16}$$
$$p_t = \alpha e^{\beta t} - \frac{\sigma^4}{4\alpha\beta^2}e^{-\beta t}, \tag{17}$$
$$p_t = \frac{\sigma^2}{\omega}\cos(\omega t + \theta), \quad \text{with } 0 < \omega < \pi,\ -\frac{\pi}{2} < \theta < \frac{\pi}{2}. \tag{18}$$

It is straightforward to verify that all the above expressions satisfy
$$p_t\ddot{p}_t - \left(\dot{p}_t\right)^2 + \sigma^4 = 0,$$

which is a scalar version of (15). Clearly, (16) corresponds to a covariance path purely driven by the input noise. In (18), the constraint $0 < \omega < \pi$ ensures that $p_t$ does not take negative values for $t \in [0, 1]$. Moreover, the constraint $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$ guarantees that $p_0$ is positive. The following proposition shows that the covariance paths in (17) and (18) connect two endpoints that satisfy different conditions.

Proposition 3. Given three scalars $p_0, p_1 > 0$ and σ, if $|p_1 - p_0| > \sigma^2$, then the unique solution to (11) is given by (17) with β being a non-zero solution to

$$4\left(p_1 - p_0 e^{\beta}\right)\left(p_1 - p_0 e^{-\beta}\right)\beta^2 - \sigma^4\left(e^{\beta} - e^{-\beta}\right)^2 = 0, \tag{19}$$

and

$$\alpha = \frac{p_1 - p_0 e^{-\beta}}{e^{\beta} - e^{-\beta}}. \tag{20}$$

If $|p_1 - p_0| < \sigma^2$, then the unique solution to (11) is given by (18) with ω being the unique solution to

$$\frac{\sigma^4}{\omega^2}\sin^2(\omega) + 2 p_0 p_1 \cos(\omega) = p_0^2 + p_1^2, \tag{21}$$

and $\theta = \arccos\left(\frac{p_0\omega}{\sigma^2}\right)$.

The proof is provided in the Appendix. It should be noted that the optimal $a_t$ is computed using (2) as $a_t = (\dot{p}_t - \sigma^2)/(2p_t)$. If $|p_1 - p_0| > \sigma^2$, then

$$a_t = \frac{1}{2p_t}\left(\alpha\beta e^{\beta t} + \frac{\sigma^4}{4\alpha\beta}e^{-\beta t} - \sigma^2\right) > 0.$$

If $|p_1 - p_0| < \sigma^2$, then

$$a_t = \frac{1}{2p_t}\left(-\sigma^2\sin(\omega t + \theta) - \sigma^2\right) < 0.$$
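Proposition 3 can be turned into a small numerical recipe: solve (19) or (21) with a bracketing root finder and assemble $p_t$. The sketch below is illustrative (not the author's code); the root brackets are heuristic choices.

```python
# A sketch of Proposition 3 for scalar covariances.
import numpy as np
from scipy.optimize import brentq

def scalar_path(p0, p1, sigma):
    s4 = sigma**4
    if abs(p1 - p0) > sigma**2:                  # exponential case (17)
        phi = lambda b: (4*(p1 - p0*np.exp(b))*(p1 - p0*np.exp(-b))*b**2
                         - s4*(np.exp(b) - np.exp(-b))**2)
        beta = brentq(phi, 1e-6, 50.0)           # unique positive root of (19)
        alpha = (p1 - p0*np.exp(-beta)) / (np.exp(beta) - np.exp(-beta))
        return lambda t: alpha*np.exp(beta*t) - s4/(4*alpha*beta**2)*np.exp(-beta*t)
    else:                                        # trigonometric case (18)
        psi = lambda w: s4/w**2*np.sin(w)**2 + 2*p0*p1*np.cos(w) - p0**2 - p1**2
        omega = brentq(psi, 1e-6, np.pi - 1e-6)  # unique root of (21) in (0, pi)
        c, d = p0, (p1 - p0*np.cos(omega))/np.sin(omega)
        theta = np.arctan2(-d, c)                # matches both endpoints
        return lambda t: sigma**2/omega*np.cos(omega*t + theta)

pt = scalar_path(6.0, 30.0, 4.0)                 # |p1 - p0| > sigma^2 = 16
print(pt(0.0), pt(1.0))                          # approximately 6 and 30
```

Here θ is recovered with atan2 so that both endpoint constraints are met regardless of the sign of $\dot{p}_0$; this is equivalent to the arccos expression in Proposition 3 up to the sign of θ.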

IV. Weighted-least-squares based covariance paths

For scalar-valued covariance, the Fisher-Rao metric $g_p(\dot{p})$ at a covariance $p$ is equal to the squared norm $(\dot{p}/p)^2$. For matrix-valued covariance, there is in general no unique way to generalize the division $\dot{p}/p$. As an example, in the following equation,

$$\dot{P}_t = A_t P_t + P_t A_t', \tag{22}$$

the matrix $A_t$ can be viewed as a non-commutative division of $\frac12\dot{P}_t$ by $P_t$. For a given pair of matrices $P_t$ and $\dot{P}_t$, there are infinitely many $A_t$ that satisfy (22). But one could find the unique $A_t$ that minimizes a quadratic function $f(A_t)$ for the given pair $\dot{P}_t$ and $P_t$. The optimal value of the quadratic function provides a generalization of the Fisher-Rao metric, which motivates the choice of the third objective function proposed in [1]. We will apply this objective function in the general problem (3).

Specifically, for a matrix $A_t \in \mathbb{R}^{n\times n}$, we decompose it as $A_t = A_{t,s} + A_{t,a}$, where
$$A_{t,s} = \frac12\left(A_t + A_t'\right), \quad A_{t,a} = \frac12\left(A_t - A_t'\right)$$
are the symmetric and asymmetric parts of $A_t$, respectively. For a given scalar ϵ > 0, we define the following weighted squared norm of $A_t$:
$$f_\epsilon^{\mathrm{wls}}(A_t) \triangleq \|A_{t,s}\|_F^2 + \epsilon\|A_{t,a}\|_F^2 = \frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right).$$
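The identity between the two expressions of $f_\epsilon^{\mathrm{wls}}$ follows from $\operatorname{tr}(A_{t,s}A_{t,a}) = 0$; a quick numerical check (illustrative only):

```python
# A small check of the identity ||A_s||_F^2 + eps*||A_a||_F^2
#   = (1+eps)/2 tr(A A') + (1-eps)/2 tr(A A).
import numpy as np

rng = np.random.default_rng(1)
A, eps = rng.standard_normal((4, 4)), 0.3
As, Aa = 0.5*(A + A.T), 0.5*(A - A.T)            # symmetric/asymmetric parts
lhs = np.linalg.norm(As, 'fro')**2 + eps*np.linalg.norm(Aa, 'fro')**2
rhs = 0.5*(1+eps)*np.trace(A @ A.T) + 0.5*(1-eps)*np.trace(A @ A)
print(np.isclose(lhs, rhs))                      # True
```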

Next, we follow (3) to consider the problem:

$$\min_{P_t, A_t}\left\{\int_0^1 \left(\frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right)\right)\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}. \tag{23}$$

The expression of the optimal solution is provided in the following proposition.

Proposition 4. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and two scalars ϵ, σ > 0, if there exists a pair of matrix-valued functions $P_t$, $\Pi_t$ that satisfy

$$\dot{P}_t = -\frac{1+\epsilon}{2\epsilon}\left(\Pi_t P_t^2 + P_t^2\Pi_t\right) + \frac{1-\epsilon}{\epsilon} P_t\Pi_tP_t + \sigma^2 I, \tag{24}$$
$$\dot{\Pi}_t = \frac{1+\epsilon}{2\epsilon}\left(\Pi_t^2 P_t + P_t\Pi_t^2\right) - \frac{1-\epsilon}{\epsilon}\Pi_tP_t\Pi_t, \tag{25}$$

with $P_t$ being equal to $P_0$, $P_1$ at $t = 0, 1$, respectively, then $P_t$ is an optimal solution to (23). Moreover, the optimal $A_t$ is given by

$$A_t = -\frac12\left(\Pi_tP_t + P_t\Pi_t\right) + \frac{1}{2\epsilon}\left(P_t\Pi_t - \Pi_tP_t\right). \tag{26}$$

The proof is provided in the Appendix.

A. On the relation with information-geometry based covariance paths

As mentioned above, the cost function $f_\epsilon^{\mathrm{wls}}(A_t)$ is equal to $f_{P_t}^{\mathrm{info}}(A_t)$ for scalar-valued covariance. Therefore, the path defined by (24) and (25) coincides with the solution given by (12) and (13). Thus, the closed-form expressions provided in Proposition 3 are also optimal solutions to (23).

For matrix-valued covariance, we consider a more general family of covariance paths defined by (24) and (25) with possibly negative ϵ. Though ϵ was restricted to be positive in (23) to ensure that the cost function $f_\epsilon^{\mathrm{wls}}$ is positive, the covariance path in (24) and (25) is still well-defined for any given initial values with $P_0 \in \mathcal{S}_{++}^n$ and $\Pi_0 \in \mathcal{S}^n$, even if ϵ < 0. In this case, (24) and (25) still satisfy the first-order necessary condition for optimality, though the corresponding paths may only be local minima. It is interesting to notice that in the special case when ϵ = −1, (24) and (25) are the same as (12) and (13), respectively, implying that the corresponding covariance path is equal to the information-geometry based path characterized by (15). Therefore, the equations in (24) and (25) with possibly negative ϵ define a more general family of covariance paths.

B. On the existence and uniqueness of covariance paths

Based on the aforementioned relation with information geometry, the geodesic in (9) is the unique solution to (24) when σ = 0 and ϵ = −1. Though the general solution to the covariance path with arbitrary σ and ϵ is currently unknown, we will analyze the existence and uniqueness of the covariance path in the special case when σ = 0.

In Theorem 4 of [1], it is shown that the covariance path with σ = 0 has the following closed-form expression:

$$P_1 = e^{\frac{1+\epsilon}{2\epsilon}\left(P_0\Pi_0 - \Pi_0P_0\right)}\, e^{-P_0\Pi_0}\, P_0\, e^{-\Pi_0P_0}\, e^{\frac{1+\epsilon}{2\epsilon}\left(\Pi_0P_0 - P_0\Pi_0\right)}. \tag{27}$$

If ϵ = −1, then the unique $\Pi_0$ that satisfies (27) is equal to

$$\Pi_0 = -\frac12 P_0^{-\frac12}\log\left(P_0^{-\frac12} P_1 P_0^{-\frac12}\right)P_0^{-\frac12}. \tag{28}$$

By exploring the continuous dependence of $\Pi_0$ on ϵ in (27), we are able to derive a range of ϵ that ensures a unique covariance path connecting $P_0$ and $P_1$. To introduce the result, we let $\lambda_{\min}(P)$ and $\lambda_{\max}(P)$ denote the smallest and largest eigenvalues of $P \in \mathcal{S}_{++}^n$. Moreover, we define the following pseudo-norm:

$$\delta(P) \triangleq \max_{\Delta \in \mathcal{S}^n,\, \Delta \neq 0} \frac{\|\Delta P - P\Delta\|_2}{\|\Delta\|_2}. \tag{29}$$

Then, the following proposition holds.

Proposition 5. Given $P_0, P_1 \in \mathcal{S}_{++}^n$, if the scalar ϵ satisfies

$$\left|\frac{1+\epsilon}{2\epsilon}\right| < \max\left\{\frac{\lambda_{\min}(P_0)\lambda_{\min}(P_1)}{\delta(P_0)\lambda_{\max}(P_1)},\ \frac{\lambda_{\min}(P_0)\lambda_{\min}(P_1)}{\delta(P_1)\lambda_{\max}(P_0)}\right\}, \tag{30}$$

then there exists a unique $\Pi_0 \in \mathcal{S}^n$ that satisfies (27).

The proof is provided in the Appendix. We expect that the covariance path exists for a more general range of ϵ but the solution may not be unique.
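The pseudo-norm (29) is easy to evaluate: interpreting $\|\cdot\|_2$ as the spectral norm and writing Δ in the eigenbasis of $P$, one can check that the maximum in (29) equals $\lambda_{\max}(P) - \lambda_{\min}(P)$. The sketch below (an illustration, not from the paper) compares this closed form against a random search over symmetric directions, which makes condition (30) straightforward to evaluate for given endpoints:

```python
# Evaluating the pseudo-norm delta(P) in (29) for the spectral norm.
import numpy as np

def delta_closed_form(P):
    w = np.linalg.eigvalsh(P)
    return w[-1] - w[0]                          # lambda_max - lambda_min

def delta_random_search(P, trials=5000, seed=0):
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        D = rng.standard_normal(P.shape)
        D = D + D.T                              # restrict to symmetric Delta
        best = max(best, np.linalg.norm(D @ P - P @ D, 2)
                          / np.linalg.norm(D, 2))
    return best

P = np.diag([1.0, 0.3])
print(delta_closed_form(P))                      # 0.7
print(delta_random_search(P))                    # approaches 0.7 from below
```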

C. On the relation between input noise and system matrices

To understand the influence of the input noise on the optimal solutions, we consider a trajectory defined by (24) to (26) with given initial values $P_0$ and $\Pi_0$. With the system matrix $A_t$ given by (26), it is straightforward to derive that

$$\dot{A}_{t,a} = \frac{1}{2\epsilon}\left(\dot{P}_t\Pi_t + P_t\dot{\Pi}_t - \dot{\Pi}_tP_t - \Pi_t\dot{P}_t\right) = 0.$$

Therefore, the asymmetric part of $A_t$ is constant and equal to $\frac{1}{2\epsilon}\left(P_0\Pi_0 - \Pi_0P_0\right)$.

On the other hand, by taking the derivative of the symmetric part of At, denoted by At,s, we obtain that

$$\dot{A}_{t,s} = (1+\epsilon)\left(A_{0,a}A_{t,s} - A_{t,s}A_{0,a}\right) - \sigma^2\Pi_t. \tag{31}$$

Next, we apply a change of variables and define
$$\hat{A}_t = e^{-(1+\epsilon)A_{0,a}t}\left(A_{t,s} - \epsilon A_{0,a}\right)e^{(1+\epsilon)A_{0,a}t}, \quad \hat{P}_t = e^{-(1+\epsilon)A_{0,a}t}\,P_t\,e^{(1+\epsilon)A_{0,a}t}, \quad \hat{\Pi}_t = e^{-(1+\epsilon)A_{0,a}t}\,\Pi_t\,e^{(1+\epsilon)A_{0,a}t}.$$

By taking the derivative of the above equations, we obtain that

$$\dot{\hat{P}}_t = \hat{A}_t\hat{P}_t + \hat{P}_t\hat{A}_t' + \sigma^2 I, \tag{32}$$
$$\dot{\hat{\Pi}}_t = -\hat{A}_t'\hat{\Pi}_t - \hat{\Pi}_t\hat{A}_t, \tag{33}$$
$$\dot{\hat{A}}_t = -\sigma^2\hat{\Pi}_t, \tag{34}$$

which shows that only the symmetric part of $\hat{A}_t$ changes over time when the input noise is non-zero.

V. Example

A. Comparing scalar-valued covariance paths

In this example, we compare the scalar-valued covariance paths given by the closed-form expressions in (6) and in (17)–(18). The plots in Fig. 1 illustrate the covariance paths that connect $p_0 = 6$ to several different values of $p_1$ with σ = 4. The OMT and information-geometry based solutions are shown using dashed red and solid blue lines, respectively. If the endpoint $p_1$ is close to $p_0 + \sigma^2 = 22$, then the paths are close to the straight line $p_0 + \sigma^2 t$. If the endpoint is far away from 22, e.g. the endpoints at 30 and 1, then the OMT and information-geometry based paths are very different from each other.

Fig. 1: A comparison of OMT and Fisher-Rao metric based covariance paths. The dashed red lines denoted by $p_t^{\mathrm{omt}}$ illustrate the paths given by (6). The solid blue lines with $p_1$ above and below $p_0 + \sigma^2 = 22$ illustrate the paths given by (17) and (18), respectively.

B. Comparing matrix-valued covariance paths

In this example, we illustrate the difference between the above covariance paths using two fixed endpoints given by

$$P_0 = \begin{bmatrix} 1 & 0 \\ 0 & 0.3 \end{bmatrix}, \quad P_1 = \begin{bmatrix} 0.3 & 0 \\ 0 & 1 \end{bmatrix}. \tag{35}$$

Fig. 2a shows the OMT-based paths in (6) for several different values of σ, where the ellipsoids denote isocontours of the quadratic function $x'P_tx = r^2$ with $r = 0.5$. Fig. 2b illustrates the information-geometry based paths defined by (15). Since the two matrices $P_0$ and $P_1$ commute, the paths are obtained by applying the closed-form expressions in (17)–(18) to the diagonal entries. Though there are minor differences between the two sets of covariance paths, all the $P_t$'s along the two paths have the same eigenspace.

Fig. 2: An illustration of the covariance paths obtained using (6) (left) and (15) (right), respectively, with the two endpoints given by $P_0$ and $P_1$ in (35).

The closed-form solution of the covariance path defined by (24) and (25) is currently unknown, and there may also exist multiple locally optimal solutions. Specifically, we consider the solution to (23) in the extreme situation when ϵ = 0 and σ = 0. In this case, any asymmetric system matrix $A$ of the form

$$A = \begin{bmatrix} 0 & \pm(2k+1)\frac{\pi}{2} \\ \mp(2k+1)\frac{\pi}{2} & 0 \end{bmatrix}, \tag{36}$$

with $k$ a nonnegative integer, is an optimal solution because the corresponding objective value is equal to zero. The corresponding covariance paths are equal to

$$P_{\pm,t} = \begin{bmatrix} 0.3 + 0.7\cos^2\left((2k+1)\frac{\pi}{2}t\right) & \pm 0.7\cos\left((2k+1)\frac{\pi}{2}t\right)\sin\left((2k+1)\frac{\pi}{2}t\right) \\ \pm 0.7\cos\left((2k+1)\frac{\pi}{2}t\right)\sin\left((2k+1)\frac{\pi}{2}t\right) & 0.3 + 0.7\sin^2\left((2k+1)\frac{\pi}{2}t\right) \end{bmatrix}.$$

In order to obtain a numerical solution for the covariance paths given by (24) and (25) with non-zero ϵ, we apply the lsqnonlin nonlinear optimization routine and the ode45 function in MATLAB (The MathWorks, Inc., Natick, MA) to solve for $\Pi_0$ such that the path $P_t$ starting from $P_0$ has the least squared error relative to $P_1$ at $t = 1$. We first set ϵ = 0.001 and choose two different initial values for $\hat{\Pi}_0$ used in the optimization algorithms, given by
$$\hat{\Pi}_{\pm,0} = \pm\frac{1}{700}\begin{bmatrix} 0 & \pi \\ \pi & 0 \end{bmatrix},$$

so that the corresponding system matrices given by (26) approximately satisfy (36). Next, we gradually increase ϵ using a step size of 0.001 and use the optimal $\Pi_0$ from the previous step as the initial value. The left and right panels of Fig. 3 illustrate two branches of locally optimal paths corresponding to the two different initial values of $\Pi_0$, with σ = 0.5 and several different values of ϵ. All the numerical solutions satisfy the endpoint constraints with the Frobenius norm of the residuals on the order of $10^{-6}$ or smaller. Different from the paths shown in Fig. 2, the paths in Fig. 3 have rotating eigenspaces, where the rotation direction depends on the initial choice of $\Pi_0$. Moreover, as ϵ increases, the paths become similar to those shown in Fig. 2.

Fig. 3: An illustration of two branches, i.e. (a) and (b), of locally optimal covariance paths obtained using (24) and (25) with different initial values for $\Pi_0$.
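A Python analogue of the shooting procedure described above (a sketch, not the paper's MATLAB implementation; the sign conventions follow (24)–(26) as printed above, and the vectorization, tolerances, and optimizer settings are assumptions — the very small ϵ makes the ODE stiff, so treat this as illustrative):

```python
# A sketch of the shooting method for (24)-(25): integrate from a guessed Pi_0
# and adjust it so that the covariance path reaches P1 at t = 1.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

n, sigma, eps = 2, 0.5, 0.001
P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
g1, g2 = (1 + eps)/(2*eps), (1 - eps)/eps

def rhs(t, y):
    P, Pi = y[:n*n].reshape(n, n), y[n*n:].reshape(n, n)
    dP = -g1*(Pi @ P @ P + P @ P @ Pi) + g2*(P @ Pi @ P) + sigma**2*np.eye(n)  # (24)
    dPi = g1*(Pi @ Pi @ P + P @ Pi @ Pi) - g2*(Pi @ P @ Pi)                    # (25)
    return np.concatenate([dP.ravel(), dPi.ravel()])

def endpoint_residual(pi0_vec):
    Pi0 = pi0_vec.reshape(n, n)
    Pi0 = 0.5*(Pi0 + Pi0.T)                       # keep the costate symmetric
    y0 = np.concatenate([P0.ravel(), Pi0.ravel()])
    sol = solve_ivp(rhs, (0.0, 1.0), y0, rtol=1e-9, atol=1e-11)
    return (sol.y[:n*n, -1].reshape(n, n) - P1).ravel()

Pi0_init = (1.0/700.0)*np.array([[0.0, np.pi], [np.pi, 0.0]])  # guess from the text
fit = least_squares(endpoint_residual, Pi0_init.ravel())
print(np.linalg.norm(fit.fun))                    # endpoint mismatch after fitting
```

Continuation in ϵ, as described in the text, can then reuse the converged $\Pi_0$ as the initial guess for the next value of ϵ.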

C. Fitting noisy measurements of functional MRI data

In this example, we apply the proposed covariance paths to fit noisy sample covariance matrices of a stochastic process based on a resting-state functional MRI dataset used in [1]. The covariance matrix of time-series data measured by rsfMRI is usually referred to as the functional connectivity matrix, which is the standard tool for investigating functional networks between brain regions [9]–[11]. However, the underlying time series are non-stationary [12], [13], implying the existence of dynamic changes in brain networks. A review of existing methods for modeling and analyzing time-resolved functional connectivity can be found in [33]. In particular, both neural field analysis and real-data experiments have revealed ultra-slow oscillations in brain activity [34], [35]. The proposed covariance paths provide an approach that uses linear systems to model the rotation of energy related to the oscillations in brain networks. Moreover, the underlying model parameters are potentially useful for understanding the relation between brain dynamics and structural connectivity between brain regions. The interested reader is referred to [1], [36] for more detailed background information and data processing methods.

This covariance fitting problem is formulated as follows. Given $K = 10$ sample covariance matrices, denoted by $\tilde{P}_{t_1}, \ldots, \tilde{P}_{t_K}$, based on $K$ segments of rsfMRI time-series data, we look for a smooth covariance path $P_t$ that minimizes

$$\min_{P_t} \sum_{k=1}^{K} \left\|P_{t_k} - \tilde{P}_{t_k}\right\|_F^2. \tag{37}$$

In this example, the $\tilde{P}_{t_k}$'s are the sample covariances of a 7-dimensional rsfMRI time series based on the 7 brain regions proposed in [37]. We apply the three proposed parametric models of covariance paths to fit these measurements using the fminsdp function in MATLAB. The first model is the closed-form expression given by (6), which is parameterized by $P_0$, $\Pi_0$ and σ. The optimal solution is denoted by $\hat{P}_t^{\mathrm{omt}}$. The second model is based on the differential equations (12) and (13), which are integrated by the ode45 function in MATLAB. The estimated path is denoted by $\hat{P}_t^{\mathrm{info}}$. Note again that this model is a special case of (24) and (25) with ϵ = −1. The more general model in (24) and (25) relies on the estimation of an additional parameter ϵ. Since the differential equations are highly nonlinear in terms of ϵ, we find that the MATLAB algorithm only provides a locally optimal value of ϵ that depends on its initial value. In order to obtain a reliable covariance path for understanding the rotation of energy among the variables, we fix ϵ = 20, as used in [1], to analyze the influence of stochastic input. The estimated path is denoted by $\hat{P}_t^{\mathrm{wls}}$.

The discrete markers in Fig. 4 illustrate the noisy sample covariances of 6 entries of the covariance matrices. The blue, green and red plots are the estimated paths given by $\hat{P}_t^{\mathrm{omt}}$, $\hat{P}_t^{\mathrm{info}}$, and $\hat{P}_t^{\mathrm{wls}}$, respectively. The normalized squared errors corresponding to the three paths are equal to 0.1674, 0.1668 and 0.1388, respectively, which are all lower than the corresponding fitting errors in [1] because of the additional $\sigma^2 I$ term in the models. The much lower estimation error of $\hat{P}_t^{\mathrm{wls}}$ indicates that the weighted-least-squares based solution is more capable of tracking the rotation of energy among brain regions.

Fig. 4: A comparison of the fitting results of the three covariance paths. The discrete markers illustrate several entries of the noisy sample covariances $\tilde{P}_t$. The dashed blue plots illustrate the fitted covariance paths based on (6). The dashed green plots show the fitting results based on (12) and (13). The solid red lines show the results based on (24) and (25).

VI. Discussion and conclusion

In this paper, we have investigated three models for time-varying covariance matrices using the state covariance of linear systems with stochastic input. The main motivation is to use these parametric models to investigate dynamic interactions among brain regions using time-varying covariance matrices of rsfMRI data. The three parametric covariance paths are derived as the optimal solutions to three quadratic regularization problems. The first solution is obtained using an optimal-mass-transport based objective function, which is related to the well-known solution to the Schrödinger bridge problem (SBP) [21], [22], [30], [31]. The derivations of the other two models are the main contributions of this work. In particular, the second type of covariance path is based on a generalization of the Fisher-Rao metric in information geometry. The third family of covariance paths is obtained using a weighted-least-squares cost function of the underlying system matrices, which includes the Fisher-Rao based solution as a special case. Moreover, the weighted-least-squares based covariance paths are able to model the rotation of energy among multidimensional variables, which cannot be done by the other two types of paths. The three models of covariance paths generalize the results from [1], which focuses on linear systems without stochastic input. The performance of the three covariance paths is compared using synthetic data and a real-data example on estimating dynamic brain networks using rsfMRI. Our future work will focus on developing more effective computational algorithms to solve the covariance fitting problem and on applying the proposed solutions to analyze abnormal brain networks related to mental disorders.

Acknowledgments

This work was supported in part under grants R21MH115280, R21MH116352, K01MH117346 (PI: Ning), R01MH097979, R01MH111917 (PI: Rathi), R01MH074794 (PI: Westin).

Biography


Lipeng Ning received his B.Sc. and M.Sc. in Control Science and Engineering from the Beijing Institute of Technology, China, in 2006 and 2008, respectively. He obtained his Ph.D. in Electrical and Computer Engineering from the University of Minnesota in November 2013. He is currently an Assistant Professor at Brigham and Women's Hospital and Harvard Medical School, Boston, MA.

He is interested in the application of mathematics in neuroimaging and neuroscience research. His current research focuses on stochastic processes, dynamical systems, machine learning and brain connectomics.

Appendix

Proof of Proposition 1.

We consider (4) as an optimal control problem with At being matrix-valued control input. A necessary condition for the optimal solution is that the derivative of the Hamiltonian

$$h_1(P_t, A_t, \Pi_t) = \operatorname{tr}\left(A_tP_tA_t' + \Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right)$$

with respect to $A_t$ vanishes at the optimal control. This gives rise to

$$A_tP_t + \Pi_tP_t = 0.$$

Thus,

$$A_t = -\Pi_t. \tag{38}$$

Next, the optimal costate must satisfy that $\dot{\Pi}_t$ equals the negative of the derivative of $h_1(\cdot)$ with respect to $P_t$, which leads to

$$\dot{\Pi}_t = -A_t'A_t - A_t'\Pi_t - \Pi_tA_t. \tag{39}$$

Substituting (38) into (2) and (39), we deduce that

$$\dot{P}_t = -\Pi_tP_t - P_t\Pi_t + \sigma^2 I, \tag{40}$$
$$\dot{\Pi}_t = \Pi_t^2. \tag{41}$$

Next, we show that all eigenvalues of the initial $\Pi_0$ must be smaller than one. For this purpose, we note that the solutions for $P_t$ and $\Pi_t$ from (40) and (41) have the following form:

$$P_t = \left(I - \Pi_0 t\right)P_0\left(I - \Pi_0 t\right) + \sigma^2\left(It - \Pi_0 t^2\right), \tag{42}$$
$$\Pi_t = \Pi_0\left(I - \Pi_0 t\right)^{-1}. \tag{43}$$

Assume that $\Pi_0$ has an eigenvalue $\lambda_0 > 1$. Then $A_t = -\Pi_t$ becomes unbounded as $t$ increases to $1/\lambda_0$. At the same time, $P_t$ is singular at $t = 1/\lambda_0$ and fails to be positive semidefinite for $t > 1/\lambda_0$. Therefore, all eigenvalues of $\Pi_0$ are smaller than one.

Next, setting $t = 1$ in (42) and multiplying both sides by $P_0^{\frac12}$ on the left and right, we obtain

$$P_0^{\frac12}P_1P_0^{\frac12} = \left(P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12}\right)^2 + \sigma^2 P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12}.$$

Therefore $P_0^{\frac12}(I - \Pi_0)P_0^{\frac12}$ has the same eigenvectors as $P_0^{\frac12}P_1P_0^{\frac12}$. If $y$ is an eigenvalue of $P_0^{\frac12}P_1P_0^{\frac12}$, then the corresponding eigenvalue of $P_0^{\frac12}(I - \Pi_0)P_0^{\frac12}$, denoted by $x$, satisfies $x^2 + \sigma^2 x = y$. The two solutions are given by

$$x_{\pm} = -\frac12\sigma^2 \pm \left(y + \frac14\sigma^4\right)^{\frac12}.$$

But $x_-$ is negative, which contradicts the fact that $I - \Pi_0$ is positive definite. Thus $x_+$ is the only feasible solution. Therefore,

$$P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12} = \left(P_0^{\frac12}P_1P_0^{\frac12} + \frac14\sigma^4 I\right)^{\frac12} - \frac12\sigma^2 I,$$

which gives rise to the optimal solution in (7). Then, the proposition is proved. ☐

Proof of Proposition 2.

Following the same method as in the proof of Proposition 1, we derive the solution to the optimal control problem (11) using the following Hamiltonian:

$$h_2(P_t, A_t, \Pi_t) = \operatorname{tr}\left(P_t^{-1}A_tP_tA_t' + \Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right).$$

It is necessary that $\dot{\Pi}_t$ equal the negative of the partial derivative of $h_2(\cdot)$ with respect to $P_t$, which leads to

$$\dot{\Pi}_t = -A_t'P_t^{-1}A_t + P_t^{-1}A_tP_tA_t'P_t^{-1} - \Pi_tA_t - A_t'\Pi_t. \tag{44}$$

Moreover, setting the derivative of $h_2(\cdot)$ with respect to $A_t$ to zero, we obtain

$$A_t = -P_t\Pi_t. \tag{45}$$

Then, (12) and (13) can be obtained by substituting (45) into (2) and (44), respectively. From (12), we obtain the following expression:

$$\Pi_t = \frac12\left(\sigma^2 P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right).$$

Next, taking the derivative of $\Pi_t$ and setting it equal to (13), we obtain

$$\begin{aligned}\dot{\Pi}_t &= \frac12\left(-\sigma^2\left(P_t^{-1}\dot{P}_tP_t^{-2} + P_t^{-2}\dot{P}_tP_t^{-1}\right) + 2P_t^{-1}\dot{P}_tP_t^{-1}\dot{P}_tP_t^{-1} - P_t^{-1}\ddot{P}_tP_t^{-1}\right) \\ &= \frac12\left(\sigma^2P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right)P_t\left(\sigma^2P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right).\end{aligned}$$

Then (15) can be obtained from the above equation after simplifications. Thus, the proof is complete. ☐

Proof of Proposition 3.

Since the paths in (17) and (18) both satisfy (15), for the first argument we only need to prove that if $|p_1 - p_0| > \sigma^2$, then there exists a unique pair of parameters $\alpha$, $\beta$ such that $p_t$ from (17) is equal to $p_0$ and $p_1$ at $t = 0, 1$, respectively. For this purpose, we set $p_t$ in (17) to $p_0$ and $p_1$ at $t = 0, 1$, respectively, to obtain

$$p_0 = \alpha - \frac{\sigma^4}{4\alpha\beta^2}, \quad p_1 = \alpha e^{\beta} - \frac{\sigma^4}{4\alpha\beta^2}e^{-\beta}. \tag{46}$$

Then, we can derive that
$$\alpha = \frac{p_1 - p_0e^{-\beta}}{e^{\beta} - e^{-\beta}}.$$

Next, substituting the above expression into (46) and multiplying both sides by $4\left(p_1 - p_0e^{-\beta}\right)\beta^2\left(e^{\beta} - e^{-\beta}\right)$, we obtain

$$4\left(p_1 - p_0e^{\beta}\right)\left(p_1 - p_0e^{-\beta}\right)\beta^2 - \sigma^4\left(e^{\beta} - e^{-\beta}\right)^2 = 0. \tag{47}$$

The above equation has a trivial solution at β = 0. Moreover, if a non-zero β is a solution to (47), then so is −β. In this case, the coefficient of the second term of $p_t$ is equal to

$$\frac{\sigma^4}{4\alpha\beta^2} = \frac{p_1 - p_0e^{\beta}}{e^{\beta} - e^{-\beta}}.$$

Therefore, (17) is equivalent to

$$p_t = \frac{p_1 - p_0e^{-\beta}}{e^{\beta} - e^{-\beta}}\,e^{\beta t} - \frac{p_1 - p_0e^{\beta}}{e^{\beta} - e^{-\beta}}\,e^{-\beta t}.$$

It is interesting to note that switching β to −β does not change the covariance path. Therefore, we only consider positive solutions to (47). To this end, we denote the left-hand side of (47) by ϕ(β). It is straightforward to derive that
$$\left.\frac{\mathrm{d}\phi(\beta)}{\mathrm{d}\beta}\right|_{\beta=0} = 0, \quad \left.\frac{\mathrm{d}^2\phi(\beta)}{\mathrm{d}\beta^2}\right|_{\beta=0} = 8\left(\left(p_1 - p_0\right)^2 - \sigma^4\right).$$

Moreover,
$$\frac{\mathrm{d}^3\phi(\beta)}{\mathrm{d}\beta^3} = -4p_0p_1\left(\left(e^{\beta} - e^{-\beta}\right)\left(\beta^2 + 6\right) + 6\beta\left(e^{\beta} + e^{-\beta}\right)\right) - 8\sigma^4\left(e^{2\beta} - e^{-2\beta}\right),$$

which is negative for all β > 0. Therefore, if $(p_1 - p_0)^2 - \sigma^4 > 0$, then the function ϕ(β) is convex and positive near β = 0, but ϕ(β) < 0 as β → ∞. Because its second-order derivative $\frac{\mathrm{d}^2\phi(\beta)}{\mathrm{d}\beta^2}$ is monotonically decreasing, we conclude that there exists a unique positive solution to ϕ(β) = 0.

To prove the second argument, we take the derivative of pt given by (18) to obtain

$$\dot{p}_t = -\sigma^2\sin(\omega t + \theta).$$

If p0, p1 are the endpoints of pt, then it is necessary that

$$|p_1 - p_0| = \left|\int_0^1 \dot{p}_t\,\mathrm{d}t\right| \le \int_0^1 |\dot{p}_t|\,\mathrm{d}t \le \sigma^2.$$

To show that there exists a path of the form (18) for any $p_0$, $p_1$ that satisfy $|p_1 - p_0| < \sigma^2$, we re-parameterize (18) as follows:
$$p_t = c\cos(\omega t) + d\sin(\omega t),$$

with $(c^2 + d^2)\omega^2 = \sigma^4$. By setting $p_t$ to $p_0$, $p_1$ at $t = 0, 1$, respectively, we obtain

$$p_0 = c, \quad p_1 = c\cos(\omega) + d\sin(\omega). \tag{48}$$

Therefore,

$$d = \frac{p_1 - p_0\cos(\omega)}{\sin(\omega)}. \tag{49}$$

Next, substituting (48) and (49) into $\sin^2(\omega)\left(c^2 + d^2\right) = \sin^2(\omega)\frac{\sigma^4}{\omega^2}$, we obtain

$$\frac{\sigma^4}{\omega^2}\sin^2(\omega) + 2p_0p_1\cos(\omega) = p_0^2 + p_1^2. \tag{50}$$

We denote the left-hand side of the above equation by ψ(ω). Then, the following holds:

$$\lim_{\omega \to 0}\psi(\omega) = \sigma^4 + 2p_0p_1 > \left(p_1 - p_0\right)^2 + 2p_0p_1 = p_0^2 + p_1^2.$$

Moreover, $\psi(\pi) = -2p_0p_1 < 0$. Furthermore, it can be shown that

$$\frac{\mathrm{d}\psi(\omega)}{\mathrm{d}\omega} = \left(\frac{2\sigma^4}{\omega^3}\left(\omega\cos(\omega) - \sin(\omega)\right) - 2p_0p_1\right)\sin(\omega),$$

which is negative for all ω ∈ (0, π). Therefore, there exists a unique solution to (50) for ω ∈ (0, π), which completes the proof. ☐

Proof of Proposition 4.

We follow the same method as in the proof of Proposition 2 to derive the necessary conditions for a stationary value using the following Hamiltonian:

$$h_3(P_t, A_t, \Pi_t) = \frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right) + \operatorname{tr}\left(\Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right).$$

Setting $\dot{\Pi}_t$ equal to the negative of the partial derivative of $h_3(\cdot)$ with respect to $P_t$, we obtain

$$\dot{\Pi}_t = -\Pi_tA_t - A_t'\Pi_t. \tag{51}$$

Moreover, setting the partial derivative of $h_3(\cdot)$ with respect to $A_t$ equal to zero, we obtain

$$(1+\epsilon)A_t + (1-\epsilon)A_t' + 2\Pi_tP_t = 0. \tag{52}$$

Thus, the symmetric and asymmetric parts of $A_t$ are equal to
$$A_{t,s} = -\frac12\left(\Pi_tP_t + P_t\Pi_t\right), \quad A_{t,a} = \frac{1}{2\epsilon}\left(P_t\Pi_t - \Pi_tP_t\right).$$

Therefore (26) holds. Then, (24) and (25) can be obtained by substituting (26) into (2) and (51), respectively. ☐

Proof of Proposition 5.

To simplify notation, we denote γ = (1 + ϵ)/(2ϵ). Then (27) implies that the matrix $\Pi_0$ that maps $P_0$ to $P_1$ satisfies

$$e^{\gamma\left(P_0\Pi_0 - \Pi_0P_0\right)}\, e^{-P_0\Pi_0}\, P_0\, e^{-\Pi_0P_0}\, e^{\gamma\left(\Pi_0P_0 - P_0\Pi_0\right)} = P_1, \tag{53}$$

which is equivalent to

$$\Pi_0 = -\frac12 P_0^{-\frac12}\log\left(P_0^{-\frac12}\, U P_1 U'\, P_0^{-\frac12}\right)P_0^{-\frac12}, \tag{54}$$

where

$$U = e^{\gamma\left(\Pi_0P_0 - P_0\Pi_0\right)}.$$

If γ = 0, i.e. ϵ = −1, then $\Pi_0$ given by (28) satisfies the above equation. Next, we apply a perturbation analysis to this equation to understand the solutions associated with different values of γ. Specifically, let δγ and $\Delta_\Pi$ denote perturbations to γ and $\Pi_0$, respectively, so that γ + δγ and $\Pi_0 + \Delta_\Pi$ still satisfy (54). Then, for small perturbations, perturbing both sides of (54) gives rise to

$$\begin{aligned}\Delta_\Pi = -\frac12 P_0^{-\frac12} M_Q^{-1}\Big(&\gamma\, P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12} + \gamma\, P_0^{-\frac12} U P_1 M_{U'}\!\left(P_0\Delta_\Pi - \Delta_\Pi P_0\right) P_0^{-\frac12} \\ &+ \delta\gamma\, P_0^{-\frac12}\left(\Pi_0P_0 - P_0\Pi_0\right) U P_1 U' P_0^{-\frac12} + \delta\gamma\, P_0^{-\frac12} U P_1 U'\left(P_0\Pi_0 - \Pi_0P_0\right) P_0^{-\frac12}\Big) P_0^{-\frac12} \\ &+ o(|\delta\gamma|) + o(\|\Delta_\Pi\|), \end{aligned} \tag{57}$$

where $o(|\delta\gamma|) + o(\|\Delta_\Pi\|)$ denotes all higher-order terms of the perturbations, $Q = P_0^{-\frac12} U P_1 U' P_0^{-\frac12}$, and $M_X(\cdot)$ and $M_X^{-1}(\cdot)$ are defined in (55) and (56) below, respectively.

For $A, \Delta \in \mathbb{R}^{n\times n}$, $e^{A+\Delta} = e^{A} + M_{e^{A}}(\Delta) + o(\|\Delta\|)$, where $M_X(\Delta)$ denotes the non-commutative multiplication of $\Delta$ by $X$, which is defined as

$$M_X(\Delta) = \int_0^1 X^{1-\tau}\,\Delta\, X^{\tau}\,\mathrm{d}\tau. \tag{55}$$

For positive definite matrices $P, P+\Delta \in \mathcal{S}_{++}^n$,

$$\log(P+\Delta) = \log(P) + M_P^{-1}(\Delta) + o(\|\Delta\|),$$

where $M_X^{-1}(\Delta)$ denotes the non-commutative division of $\Delta$ by $X$, which is defined as

$$M_X^{-1}(\Delta) = \int_0^\infty (X+\tau I)^{-1}\,\Delta\,(X+\tau I)^{-1}\,\mathrm{d}\tau. \tag{56}$$

Next, we combine all the terms containing $\Delta_\Pi$ on the right-hand side of (57) to define the following linear mapping $h_{\gamma,P_0,\Pi_0}: \mathcal{S}^n \to \mathcal{S}^n$:

$$h_{\gamma,P_0,\Pi_0}(\Delta_\Pi) = -\frac12 P_0^{-\frac12} M_Q^{-1}\left(P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12} + P_0^{-\frac12} U P_1 M_{U'}\!\left(P_0\Delta_\Pi - \Delta_\Pi P_0\right) P_0^{-\frac12}\right) P_0^{-\frac12}. \tag{58}$$

The sum of all terms on the r.h.s. of (57) that contain δγ is equal to $\delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0)$. Then (57) can be simplified as

$$\Delta_\Pi = \gamma\, h_{\gamma,P_0,\Pi_0}(\Delta_\Pi) + \delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0) + o(|\delta\gamma|) + o(\|\Delta_\Pi\|),$$

which implies that

$$\left(\mathcal{I} - \gamma\, h_{\gamma,P_0,\Pi_0}\right)(\Delta_\Pi) = \delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0) + o(|\delta\gamma|) + o(\|\Delta_\Pi\|), \tag{59}$$

where $\mathcal{I}$ denotes the identity mapping.

Let $\gamma_\tau = \tau\gamma$ denote a smooth trajectory on the interval τ ∈ [0, 1] for a given γ. Let $\hat{\Pi}_\tau$ denote a trajectory on $\mathcal{S}^n$ for τ ∈ [0, 1] with the initial value $\hat{\Pi}_0$ given by (28). If the linear mapping $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible and

$$\frac{\mathrm{d}}{\mathrm{d}\tau}\hat{\Pi}_\tau = \left(\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}\right)^{-1}\gamma\, h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}(\hat{\Pi}_\tau), \tag{60}$$

then the pair $\gamma_\tau$, $\hat{\Pi}_\tau$ satisfies

$$e^{\gamma_\tau\left(P_0\hat{\Pi}_\tau - \hat{\Pi}_\tau P_0\right)}\, e^{-P_0\hat{\Pi}_\tau}\, P_0\, e^{-\hat{\Pi}_\tau P_0}\, e^{\gamma_\tau\left(\hat{\Pi}_\tau P_0 - P_0\hat{\Pi}_\tau\right)} = P_1.$$

Therefore, the endpoint $\Pi_0 = \hat{\Pi}_1$ is the unique solution for $\Pi_0$ that satisfies (53).

To prove Proposition 5, we only need to show that if (30) holds, then $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible. It suffices to prove that the singular values of $\gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ are all smaller than 1. For this purpose, we bound the norm of $h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}(\Delta_\Pi)$ defined by (58). The first of the two terms containing $\Delta_\Pi$ can be bounded as follows:

$$\begin{aligned}&\frac12\left\|P_0^{-\frac12} M_Q^{-1}\!\left(P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12}\right) P_0^{-\frac12}\right\| \\ &\quad= \frac12\left\|\int_0^\infty\!\!\int_0^1 P_0^{-\frac12}\left(Q + t_1 I\right)^{-1} P_0^{-\frac12}\, U^{1-t_2}\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) U^{t_2}\, P_1 U' P_0^{-\frac12}\left(Q + t_1 I\right)^{-1} P_0^{-\frac12}\,\mathrm{d}t_2\,\mathrm{d}t_1\right\| \\ &\quad\le \frac12\int_0^\infty \left\|\left(P_0^{\frac12} Q P_0^{\frac12} + t_1 P_0\right)^{-1}\right\|^2 \mathrm{d}t_1\; \lambda_{\max}(P_1)\,\delta(P_0)\,\|\Delta_\Pi\| \\ &\quad\le \frac12\int_0^\infty \left(\lambda_{\min}(P_1) + t_1\lambda_{\min}(P_0)\right)^{-2} \mathrm{d}t_1\; \lambda_{\max}(P_1)\,\delta(P_0)\,\|\Delta_\Pi\| \\ &\quad= \frac12\,\frac{\delta(P_0)\,\lambda_{\max}(P_1)}{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}\,\|\Delta_\Pi\|.\end{aligned}$$

The same upper bound also holds for the other term in (58) containing $\Delta_\Pi$. Combining the bounds for the two terms, we obtain

$$\left\|\gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}\right\| \le |\gamma|\,\frac{\delta(P_0)\,\lambda_{\max}(P_1)}{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}.$$

Therefore, if

$$|\gamma| < \frac{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}{\delta(P_0)\,\lambda_{\max}(P_1)}, \tag{61}$$

then $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible for all τ ∈ [0, 1]. Thus, we have proved the first term in the bracket on the r.h.s. of (30). The second term on the r.h.s. of (30) can be obtained in a similar way by considering a time-reversed path from $P_1$ to $P_0$, i.e., by switching the roles of $P_0$ and $P_1$ in (24) and (25). Thus, the proposition is proved. ☐

REFERENCES

  • [1] Ning L, "Smooth interpolation of covariance matrices and brain network estimation," IEEE Transactions on Automatic Control, pp. 1–10, 2018.
  • [2] Porikli F, Tuzel O, and Meer P, "Covariance tracking using model update based on Lie algebra," in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 1, June 2006, pp. 728–735.
  • [3] Wu Y, Cheng J, Wang J, Lu H, Wang J, Ling H, Blasch E, and Bai L, "Real-time probabilistic covariance tracking with efficient model update," IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2824–2837, May 2012.
  • [4] Yang JF and Kaveh M, "Adaptive eigensubspace algorithms for direction or frequency estimation and tracking," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 2, pp. 241–251, February 1988.
  • [5] Jiang X, Ning L, and Georgiou TT, "Distances and Riemannian metrics for multivariate spectral densities," IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1723–1735, July 2012.
  • [6] Lenglet C, Rousson M, Deriche R, and Faugeras O, "Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing," Journal of Mathematical Imaging and Vision, vol. 25, no. 3, pp. 423–444, October 2006.
  • [7] Dryden IL, Koloydenko A, and Zhou D, "Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging," The Annals of Applied Statistics, vol. 3, no. 3, pp. 1102–1123, 2009. http://www.jstor.org/stable/30242879
  • [8] Hao X, Whitaker RT, and Fletcher PT, Adaptive Riemannian Metrics for Improved Geodesic Tracking of White Matter. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 13–24.
  • [9] Biswal B, Zerrin Yetkin F, Haughton VM, and Hyde JS, "Functional connectivity in the motor cortex of resting human brain using echo-planar MRI," Magnetic Resonance in Medicine, vol. 34, no. 4, pp. 537–541, 1995. doi: 10.1002/mrm.1910340409
  • [10] Buckner RL, Krienen FM, and Yeo BTT, "Opportunities and limitations of intrinsic functional connectivity MRI," Nature Neuroscience, vol. 16, pp. 832–837, 2013.
  • [11] Smith SM, Vidaurre D, Beckmann CF, et al., "Functional connectomics from resting-state fMRI," Trends in Cognitive Sciences, vol. 17, no. 12, pp. 666–682, 2013.
  • [12] Chang C and Glover GH, "Time-frequency dynamics of resting-state brain connectivity measured with fMRI," NeuroImage, vol. 50, no. 1, pp. 81–98, 2010.
  • [13] Preti MG, Bolton TA, and Ville DVD, "The dynamic functional connectome: State-of-the-art and perspectives," NeuroImage, vol. 160, pp. 41–54, 2017.
  • [14] Villani C, Topics in Optimal Transportation. Amer. Math. Soc., 2003.
  • [15] Rachev S and Rüschendorf L, Mass Transportation Problems. Vol. I and II. Probability and its Applications. Springer, New York, 1998.
  • [16] Knott M and Smith CS, "On the optimal mapping of distributions," Journal of Optimization Theory and Applications, vol. 43, no. 1, pp. 39–49, May 1984.
  • [17] Takatsu A, "On Wasserstein geometry of the space of Gaussian measures," ArXiv e-prints, January 2008.
  • [18] Rao C, "Information and the accuracy attainable in the estimation of statistical parameters," Bull. Calcutta Math. Soc., vol. 37, pp. 81–91, 1945.
  • [19] Amari S-I and Nagaoka H, Methods of Information Geometry. Amer. Math. Soc., 2000.
  • [20] Schrödinger E, "Über die Umkehrung der Naturgesetze," Sitzungsberichte Preuss. Akad. Wiss. Berlin. Phys. Math., pp. 144–153, 1931.
  • [21] Léonard C, "A survey of the Schrödinger problem and some of its connections with optimal transport," ArXiv e-prints, August 2013.
  • [22] Chen Y, Georgiou TT, and Pavon M, "On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint," Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 671–691, May 2016.
  • [23] Ning L, Jiang X, and Georgiou T, "On the geometry of covariance matrices," IEEE Signal Processing Letters, vol. 20, no. 8, pp. 787–790, August 2013.
  • [24] Ning L, Georgiou TT, and Tannenbaum A, "On matrix-valued Monge-Kantorovich optimal mass transport," IEEE Transactions on Automatic Control, vol. 60, no. 2, pp. 373–382, 2015.
  • [25] Chen Y, Georgiou T, and Tannenbaum A, "Matrix optimal mass transport: A quantum mechanical approach," IEEE Transactions on Automatic Control, vol. 63, no. 8, pp. 2612–2619, 2018.
  • [26] Chen Y, Georgiou TT, Ning L, and Tannenbaum A, "Matricial Wasserstein-1 distance," IEEE Control Systems Letters, vol. 1, no. 1, pp. 14–19, July 2017.
  • [27] Yamamoto K, Chen Y, Ning L, Georgiou TT, and Tannenbaum A, "Regularization and interpolation of positive matrices," IEEE Transactions on Automatic Control, vol. PP, no. 99, pp. 1–1, 2017.
  • [28] Uhlmann A, "The metric of Bures and the geometric phase," in Quantum Groups and Related Topics: Proceedings of the First Max Born Symposium, Gielerak R, Lukierski J, and Popowicz Z, Eds., 1992, p. 267.
  • [29] Bhatia R, Jain T, and Lim Y, "On the Bures-Wasserstein distance between positive definite matrices," ArXiv e-prints, December 2017.
  • [30] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part I," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1158–1169, May 2016.
  • [31] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part II," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1170–1180, May 2016.
  • [32] Moakher M, "Means and averaging in the groups of rotations," SIAM J. Matrix Anal. Appl., vol. 24, no. 1, pp. 1–16, 2002.
  • [33] Keilholz S, Caballero-Gaudes C, Bandettini P, Deco G, and Calhoun V, "Time-resolved resting-state functional magnetic resonance imaging analysis: Current status, challenges, and new directions," Brain Connectivity, vol. 7, no. 8, pp. 465–481, 2017.
  • [34] Deco G, Jirsa V, McIntosh AR, Sporns O, and Kötter R, "Key role of coupling, delay, and noise in resting brain fluctuations," Proceedings of the National Academy of Sciences, vol. 106, no. 25, pp. 10302–10307, 2009.
  • [35] Ghosh A, Rho Y, McIntosh AR, Kötter R, and Jirsa VK, "Noise during rest enables the exploration of the brain's dynamic repertoire," PLOS Computational Biology, vol. 4, no. 10, pp. 1–12, October 2008. doi: 10.1371/journal.pcbi.1000196
  • [36] Ning L and Rathi Y, "A dynamic regression approach for frequency-domain partial coherence and causality analysis of functional brain networks," IEEE Transactions on Medical Imaging, vol. 37, no. 9, pp. 1957–1969, 2017.
  • [37] Yeo BT, Krienen FM, Sepulcre J, Sabuncu MR, Lashkari D, Hollinshead M, Roffman JL, Smoller JW, Zöllei L, Polimeni JR, Fischl B, Liu H, and Buckner RL, "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," J. Neurophysiol., vol. 106, pp. 1125–1165, 2011.
