Published in final edited form as: IEEE Trans Automat Contr. 2019 Jul 4;65(5):1901–1910. doi: 10.1109/TAC.2019.2926854

Smooth interpolation of covariance matrices and brain network estimation: Part II

Lipeng Ning

Abstract

This work focuses on the modeling of time-varying covariance matrices using the state covariance of linear systems. Following concepts from optimal mass transport, we investigate and compare three types of covariance paths which are solutions to different optimal control problems. One of the covariance paths solves the Schrödinger bridge problem (SBP). The other two types of covariance paths are based on generalizations of the Fisher-Rao metric in information geometry, which are the major contributions of this work. The general framework is an extension of the approach in [1], which focuses on linear systems without stochastic input. The performance of the three covariance paths is compared using synthetic data and a real-data example on the estimation of dynamic brain networks using functional magnetic resonance imaging.

Index Terms: Optimal control, linear stochastic system, Fisher-Rao metric, optimal mass transport

I. Introduction

Modeling of time-varying covariance matrices of non-stationary multivariate time series is a fundamental problem in many scientific applications [2]–[8]. In this paper, we investigate a control-theoretic approach that uses the state covariance of linear systems with stochastic input to model time-varying covariance matrices. The goal is to use suitable parametric models of time-varying covariance matrices to understand the dynamic interactions among multivariable states. This approach is an extension of the framework proposed in [1], which focuses on a simpler situation based on linear systems without stochastic input. The motivation for this series of papers comes from a neuroimaging application that uses time-varying covariance matrices of resting-state functional magnetic resonance imaging (rsfMRI) data [9]–[13] to investigate dynamic brain networks.

Let us consider the following linear system

$$\dot{x}_t = A_t x_t + \sigma\,\mathrm{d}w_t, \tag{1}$$

where $A_t \in \mathbb{R}^{n\times n}$, $w_t$ denotes a standard $n$-dimensional Wiener process, and $\mathrm{d}w_t$ denotes the differential form of $w_t$. The corresponding state covariance $P_t = \mathbb{E}(x_t x_t')$ evolves according to

$$\dot{P}_t = A_t P_t + P_t A_t' + \sigma^2 I, \tag{2}$$

where I denotes the identity matrix. We assume that two state covariance matrices P0, P1 are given and the underlying system parameters in (1) are unknown. The problem considered in this paper is to find a smooth covariance path that simultaneously connects the two endpoints at t = 0, 1 and satisfies (2). This problem is underdetermined since the solution is not unique. In order to obtain a meaningful solution, it is natural to use a cost function that regularizes the feasible paths. To this end, we consider covariance paths that are optimal solutions to problems of the following form

$$\min_{P_t, A_t}\left\{\int_0^1 f(A_t)\,\mathrm{d}t \;\middle|\; \text{(2) holds},\ P_0, P_1 \text{ specified}\right\}, \tag{3}$$

where $f(A_t)$ denotes a quadratic function of $A_t$ which may depend on $P_t$. The optimal solution to (3) not only provides a smooth covariance path $P_t$ but also a dynamical system that describes the interactions among the multivariable states. These parametric models of covariance paths can be applied in regression problems to fit noisy sample covariances in order to understand the underlying system dynamics of stochastic processes.
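To make the role of (2) concrete, the following sketch (not from the paper; the constant system matrix $A$, the noise level σ, and the initial covariance are hypothetical choices) propagates a state covariance by numerically integrating the Lyapunov differential equation (2):

```python
# A minimal sketch of propagating the state covariance of (1) by integrating
# the Lyapunov differential equation (2); A, sigma, and P0 are made-up values.
import numpy as np
from scipy.integrate import solve_ivp

n, sigma = 2, 0.5
A = np.array([[-1.0, 0.8],
              [-0.8, -1.0]])          # assumed constant system matrix A_t = A
P0 = np.array([[1.0, 0.2],
               [0.2, 0.5]])

def lyapunov_rhs(t, p):
    """Right-hand side of (2); P is flattened to a vector for the ODE solver."""
    P = p.reshape(n, n)
    dP = A @ P + P @ A.T + sigma**2 * np.eye(n)
    return dP.ravel()

sol = solve_ivp(lyapunov_rhs, (0.0, 1.0), P0.ravel(), rtol=1e-10, atol=1e-12)
print(sol.y[:, -1].reshape(n, n))     # state covariance P_1 at t = 1
```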

In this paper, we investigate the optimal solutions to (3) corresponding to three objective functions. These functions were investigated in [1], which focuses on a simpler situation without stochastic input, i.e. σ = 0. The motivations of these objective functions are based on concepts in optimal mass transport (OMT) [14]–[17] and the Fisher-Rao metric in information geometry [18], [19]. In particular, the optimization problem in (3) with the OMT-based objective function is related to the Schrödinger Bridge Problem (SBP) [20], which has been extensively investigated in recent years [21], [22]. The main contributions of this work are the other two solutions based on generalizations of the Fisher-Rao metric in information geometry. Moreover, the general theme of this paper is closely related to the results in [5], [23]–[27], which all focus on the modeling of smooth covariance paths, though the methods and solutions proposed in this paper are substantially different from those references. In particular, this paper introduces a family of linear-system-based covariance paths for modeling the rotation of energy among multivariate time series, providing a potentially useful solution for investigating oscillations in brain networks using rsfMRI data.

The organization of this paper is as follows. In Section II we revisit the covariance paths based on optimal mass transport and the SBP. In Section III, we derive the optimal solution of (3) corresponding to a Fisher-Rao metric based objective function. In Section IV, we investigate a family of covariance paths obtained using a weighted least-squares function and discuss their relations with the Fisher-Rao based solutions. The three types of covariance paths are compared in Section V using simulations and real data based on a neuroimaging application. Section VI concludes the paper with discussions.

For notation, $\mathcal{S}^n$, $\mathcal{S}_+^n$, and $\mathcal{S}_{++}^n$ denote the sets of symmetric, positive semidefinite, and strictly positive definite matrices of size $n \times n$, respectively. Small boldface letters, e.g. $\mathbf{x}$, $\mathbf{v}$, represent column vectors. Capital letters, e.g. $P$, $A$, denote matrices. Regular small letters, e.g. $w$, $h$, are for scalars or scalar-valued functions.

II. Optimal-mass-transport based covariance paths

Let us consider $u_t = A_t x_t$ as the control input that steers the state covariance matrices according to (2). We define

$$f_{P_t}^{\mathrm{omt}}(A_t) = \frac{1}{2}\mathbb{E}\left(\|u_t\|_2^2\right) = \frac{1}{2}\operatorname{tr}\left(A_t P_t A_t'\right),$$

as the objective function in (3). Then the optimization problem becomes

$$\min_{P_t, A_t}\left\{\int_0^1 \operatorname{tr}\left(A_t P_t A_t'\right)\mathrm{d}t \;\middle|\; \dot{P}_t = A_t P_t + P_t A_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}. \tag{4}$$

If the input noise vanishes, i.e. σ = 0, then the optimal value of (4) is equal to the optimal-mass-transport distance [14]–[17] between two zero-mean Gaussian distributions with covariance matrices $P_0$ and $P_1$, respectively. It is also equal to the Bures distance between density matrices in quantum mechanics [23], [28], [29]. In the general situation when σ is non-zero, (4) can be viewed as the Schrödinger Bridge Problem (SBP) [20] between two zero-mean Gaussian probability density functions with covariances $P_0$ and $P_1$, respectively. The basic idea of the SBP is to find a stochastic system whose probability law on diffusion paths is closest, as measured by relative entropy, to that of a reference system while satisfying the initial and final marginal probability distributions. Its relations with stochastic optimal control and mass transport have been extensively studied recently [21], [22]. The more general situation in which the reference model is a linear time-varying system with degenerate diffusion, i.e. noise that only influences a subspace of the random states, has also recently been studied in [30], [31]. The following proposition presents the solution to (4).

Proposition 1. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and a scalar σ, the unique solution to (4) is given by

$$A_t^{\mathrm{omt}} = -\Pi_0\left(I - \Pi_0 t\right)^{-1}, \tag{5}$$
$$P_t^{\mathrm{omt}} = \left(I - \Pi_0 t\right) P_0 \left(I - \Pi_0 t\right) + \sigma^2\left(It - \Pi_0 t^2\right), \tag{6}$$

where

$$\Pi_0 = I - P_0^{-\frac{1}{2}}\left(\left(P_0^{\frac{1}{2}} P_1 P_0^{\frac{1}{2}} + \tfrac{1}{4}\sigma^4 I\right)^{\frac{1}{2}} - \tfrac{1}{2}\sigma^2 I\right)P_0^{-\frac{1}{2}}. \tag{7}$$

This proposition is a special case of Proposition 4 in [30]. For completeness, an independent proof is provided in the Appendix.

We note that if $\Pi_0$ is singular, then $A_t^{\mathrm{omt}}$ is also singular for all $t \in [0, 1]$, implying free diffusion in the subspace spanned by the eigenvectors of $\Pi_0$ corresponding to zero eigenvalues. In the noiseless situation when σ = 0, the covariance path in (6) is equal to the geodesic induced by the Wasserstein-2 metric.
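As a numerical illustration (a sketch under the stated assumptions, not the author's code), Proposition 1 can be evaluated directly: compute $\Pi_0$ from (7) with a matrix square root and then evaluate the path (6) on a time grid:

```python
# A sketch of the closed-form SBP solution of Proposition 1: Pi_0 from (7),
# then the covariance path (6); the endpoints and sigma are example values.
import numpy as np
from scipy.linalg import sqrtm

def omt_path(P0, P1, sigma, ts):
    I = np.eye(P0.shape[0])
    P0h = sqrtm(P0).real                 # P0^{1/2}
    P0hi = np.linalg.inv(P0h)            # P0^{-1/2}
    inner = sqrtm(P0h @ P1 @ P0h + 0.25 * sigma**4 * I).real
    Pi0 = I - P0hi @ (inner - 0.5 * sigma**2 * I) @ P0hi          # (7)
    return [(I - Pi0 * t) @ P0 @ (I - Pi0 * t)
            + sigma**2 * (I * t - Pi0 * t**2) for t in ts]        # (6)

P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
path = omt_path(P0, P1, sigma=0.5, ts=np.linspace(0.0, 1.0, 11))
print(np.round(path[-1], 6))             # reproduces P1 up to numerical error
```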

III. Information-geometry based covariance paths

The Fisher information metric provides a well-defined distance measure between probability distributions. For zero-mean multivariate Gaussian distributions, the Fisher information metric can be expressed as a quadratic form of the covariance matrices, which is referred to as the Fisher-Rao metric [18], [19]. Specifically, let $P$ be a positive definite covariance matrix and let $\Delta$ be a symmetric matrix denoting a tangent direction at $P$ on the manifold of positive-definite matrices. Then the Fisher-Rao metric has the following form [19]:

$$g_P(\Delta) = \operatorname{tr}\left(P^{-1}\Delta P^{-1}\Delta\right).$$

The geodesic connecting P0, P1 induced by the Fisher-Rao metric is the optimal solution to

$$\min_{P_t, \dot{P}_t}\left\{\int_0^1 \operatorname{tr}\left(P_t^{-1}\dot{P}_t P_t^{-1}\dot{P}_t\right)\mathrm{d}t \;\middle|\; P_0, P_1 \text{ specified}\right\}, \tag{8}$$

which has the following well-known expression [32]

$$P_t = P_0^{\frac12}\left(P_0^{-\frac12} P_1 P_0^{-\frac12}\right)^{t} P_0^{\frac12}. \tag{9}$$

In [1], we have shown that (9) is also the optimal solution to

$$\min_{P_t, A_t}\left\{\int_0^1 f_{P_t}^{\mathrm{info}}(A_t)\,\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t',\ P_0, P_1 \text{ specified}\right\}, \tag{10}$$

where the objective function is given by

$$f_{P_t}^{\mathrm{info}}(A_t) = \mathbb{E}\left(\|u_t\|_{P_t^{-1}}^2\right) = \mathbb{E}\left(u_t' P_t^{-1} u_t\right) = \operatorname{tr}\left(P_t^{-1}A_tP_tA_t'\right).$$

Thus the Fisher-Rao metric can be viewed as a weighted-mass-transport cost function for zero-mean Gaussian probability distribution functions.

Following (3) and (10), we consider the following optimization problem:

$$\min_{P_t, A_t}\left\{\int_0^1 f_{P_t}^{\mathrm{info}}(A_t)\,\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}, \tag{11}$$

which generalizes (10) with the additional noise term $\sigma^2 I$.

A. On the optimal solution

The optimal solution to (11) is presented as follows.

Proposition 2. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and a scalar σ, if there exists a path that satisfies the following differential equations

$$\dot{P}_t = -2P_t\Pi_tP_t + \sigma^2 I, \tag{12}$$
$$\dot{\Pi}_t = 2\Pi_tP_t\Pi_t, \tag{13}$$

with $\Pi_t \in \mathcal{S}^n$ and $P_t$ being equal to $P_0$ and $P_1$ at $t = 0$ and $t = 1$, respectively, then $P_t$ is an optimal solution to (11). The corresponding $A_t$ is equal to

$$A_t = -P_t\Pi_t. \tag{14}$$

Moreover, Pt also satisfies the following differential equation

$$\ddot{P}_t - \dot{P}_tP_t^{-1}\dot{P}_t + \sigma^4 P_t^{-1} = 0. \tag{15}$$

The proof of the above proposition is provided in the Appendix.

In the noiseless situation when σ = 0, (15) becomes

$$\ddot{P}_t - \dot{P}_tP_t^{-1}\dot{P}_t = 0,$$

which is the geodesic equation induced by the Fisher-Rao metric [6], [32]. In this case, the closed-form expression of Pt is given by (9).
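For reference, the geodesic (9) is straightforward to evaluate numerically; the following sketch (illustrative, not from the paper) computes the fractional matrix power through an eigendecomposition:

```python
# A sketch of the Fisher-Rao geodesic (9), i.e., the noiseless (sigma = 0) case.
import numpy as np
from scipy.linalg import sqrtm

def fisher_rao_geodesic(P0, P1, t):
    P0h = sqrtm(P0).real                 # P0^{1/2}
    P0hi = np.linalg.inv(P0h)            # P0^{-1/2}
    M = P0hi @ P1 @ P0hi                 # symmetric positive definite
    w, V = np.linalg.eigh(M)
    Mt = V @ np.diag(w**t) @ V.T         # fractional power M^t
    return P0h @ Mt @ P0h                # equation (9)

P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
print(fisher_rao_geodesic(P0, P1, 0.5))  # geodesic midpoint
```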

B. The scalar case

The closed-form expression of the optimal solution to (11) is currently unknown to the author except for scalar-valued covariance. In this case, the optimal covariance path has one of the following expressions:

$$p_t = p_0 + \sigma^2 t, \tag{16}$$
$$p_t = \alpha e^{\beta t} - \frac{\sigma^4}{4\alpha\beta^2}e^{-\beta t}, \tag{17}$$
$$p_t = \frac{\sigma^2}{\omega}\cos(\omega t + \theta), \quad \text{with } 0 < \omega < \pi,\ -\frac{\pi}{2} < \theta < \frac{\pi}{2}. \tag{18}$$

It is straightforward to verify that all the above expressions satisfy
$$p_t\ddot{p}_t - \left(\dot{p}_t\right)^2 + \sigma^4 = 0,$$

which is a scalar version of (15). Clearly, (16) corresponds to a covariance path purely driven by the input noise. In (18), the constraint $0 < \omega < \pi$ ensures that $p_t$ does not take negative values for $t \in [0, 1]$. Moreover, the constraint $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$ guarantees that $p_0$ is positive. The following proposition shows that the covariance paths in (17) and (18) connect two endpoints that satisfy different conditions.

Proposition 3. Given three scalars $p_0, p_1 > 0$ and σ, if $|p_1 - p_0| > \sigma^2$, then the unique solution to (11) is given by (17) with β being a non-zero solution to

$$4\left(p_1 - p_0 e^{\beta}\right)\left(p_1 - p_0 e^{-\beta}\right)\beta^2 - \sigma^4\left(e^{\beta} - e^{-\beta}\right)^2 = 0, \tag{19}$$

and

$$\alpha = \frac{p_1 - p_0 e^{-\beta}}{e^{\beta} - e^{-\beta}}. \tag{20}$$

If $|p_1 - p_0| < \sigma^2$, then the unique solution to (11) is given by (18) with ω being the unique solution to

$$\frac{\sigma^4}{\omega^2}\sin^2(\omega) + 2 p_0 p_1 \cos(\omega) = p_0^2 + p_1^2, \tag{21}$$

and $\theta = \arccos\left(\frac{p_0\omega}{\sigma^2}\right)$.

The proof is provided in the Appendix. It should be noted that the optimal $a_t$ is computed using (2) as $a_t = (\dot{p}_t - \sigma^2)/(2p_t)$. If $|p_1 - p_0| > \sigma^2$, then

$$a_t = \frac{1}{2p_t}\left(\alpha\beta e^{\beta t} + \frac{\sigma^4}{4\alpha\beta}e^{-\beta t} - \sigma^2\right) > 0.$$

If $|p_1 - p_0| < \sigma^2$, then

$$a_t = \frac{1}{2p_t}\left(-\sigma^2\sin(\omega t + \theta) - \sigma^2\right) < 0.$$
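Proposition 3 can be turned into a small numerical recipe: solve (19) or (21) with a bracketing root finder and assemble $p_t$. The sketch below is illustrative (not the author's code); the root brackets are heuristic choices.

```python
# A sketch of Proposition 3 for scalar covariances.
import numpy as np
from scipy.optimize import brentq

def scalar_path(p0, p1, sigma):
    s4 = sigma**4
    if abs(p1 - p0) > sigma**2:                  # exponential case (17)
        phi = lambda b: (4*(p1 - p0*np.exp(b))*(p1 - p0*np.exp(-b))*b**2
                         - s4*(np.exp(b) - np.exp(-b))**2)
        beta = brentq(phi, 1e-6, 50.0)           # unique positive root of (19)
        alpha = (p1 - p0*np.exp(-beta)) / (np.exp(beta) - np.exp(-beta))
        return lambda t: alpha*np.exp(beta*t) - s4/(4*alpha*beta**2)*np.exp(-beta*t)
    else:                                        # trigonometric case (18)
        psi = lambda w: s4/w**2*np.sin(w)**2 + 2*p0*p1*np.cos(w) - p0**2 - p1**2
        omega = brentq(psi, 1e-6, np.pi - 1e-6)  # unique root of (21) in (0, pi)
        c, d = p0, (p1 - p0*np.cos(omega))/np.sin(omega)
        theta = np.arctan2(-d, c)                # matches both endpoints
        return lambda t: sigma**2/omega*np.cos(omega*t + theta)

pt = scalar_path(6.0, 30.0, 4.0)                 # |p1 - p0| > sigma^2 = 16
print(pt(0.0), pt(1.0))                          # approximately 6 and 30
```

Here θ is recovered with atan2 so that both endpoint constraints are met regardless of the sign of $\dot{p}_0$; this is equivalent to the arccos expression in Proposition 3 up to the sign of θ.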

IV. Weighted-least-squares based covariance paths

For scalar-valued covariance, the Fisher-Rao metric $g_p(\dot{p})$ at a covariance $p$ is equal to the squared norm $(\dot{p}/p)^2$. For matrix-valued covariance, there is in general no unique way to generalize the division $\dot{p}/p$. As an example, in the following equation,

$$\dot{P}_t = A_t P_t + P_t A_t', \tag{22}$$

the matrix $A_t$ can be viewed as a non-commutative division of $\frac12\dot{P}_t$ by $P_t$. For a given pair of matrices $P_t$ and $\dot{P}_t$, there are infinitely many $A_t$ that satisfy (22). But one could find the unique $A_t$ that minimizes a quadratic function $f(A_t)$ for the given pair $\dot{P}_t$ and $P_t$. The optimal value of the quadratic function provides a generalization of the Fisher-Rao metric, which motivates the choice of the third objective function proposed in [1]. We will apply this objective function in the general problem (3).

Specifically, for a matrix $A_t \in \mathbb{R}^{n\times n}$, we decompose it as $A_t = A_{t,s} + A_{t,a}$, where
$$A_{t,s} = \frac12\left(A_t + A_t'\right), \quad A_{t,a} = \frac12\left(A_t - A_t'\right)$$
are the symmetric and asymmetric parts of $A_t$, respectively. For a given scalar ϵ > 0, we define the following weighted squared norm of $A_t$:
$$f_\epsilon^{\mathrm{wls}}(A_t) \triangleq \|A_{t,s}\|_F^2 + \epsilon\|A_{t,a}\|_F^2 = \frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right).$$
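The identity between the two expressions of $f_\epsilon^{\mathrm{wls}}$ follows from $\operatorname{tr}(A_{t,s}A_{t,a}) = 0$; a quick numerical check (illustrative only):

```python
# A small check of the identity ||A_s||_F^2 + eps*||A_a||_F^2
#   = (1+eps)/2 tr(A A') + (1-eps)/2 tr(A A).
import numpy as np

rng = np.random.default_rng(1)
A, eps = rng.standard_normal((4, 4)), 0.3
As, Aa = 0.5*(A + A.T), 0.5*(A - A.T)            # symmetric/asymmetric parts
lhs = np.linalg.norm(As, 'fro')**2 + eps*np.linalg.norm(Aa, 'fro')**2
rhs = 0.5*(1+eps)*np.trace(A @ A.T) + 0.5*(1-eps)*np.trace(A @ A)
print(np.isclose(lhs, rhs))                      # True
```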

Next, we follow (3) to consider the problem:

$$\min_{P_t, A_t}\left\{\int_0^1 \left(\frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right)\right)\mathrm{d}t \;\middle|\; \dot{P}_t = A_tP_t + P_tA_t' + \sigma^2 I,\ P_0, P_1 \text{ specified}\right\}. \tag{23}$$

The expression of the optimal solution is provided in the following proposition.

Proposition 4. Given $P_0, P_1 \in \mathcal{S}_{++}^n$ and two scalars ϵ, σ > 0, if there exists a pair of matrix-valued functions $P_t$, $\Pi_t$ that satisfy

$$\dot{P}_t = -\frac{1+\epsilon}{2\epsilon}\left(\Pi_t P_t^2 + P_t^2\Pi_t\right) + \frac{1-\epsilon}{\epsilon} P_t\Pi_tP_t + \sigma^2 I, \tag{24}$$
$$\dot{\Pi}_t = \frac{1+\epsilon}{2\epsilon}\left(\Pi_t^2 P_t + P_t\Pi_t^2\right) - \frac{1-\epsilon}{\epsilon}\Pi_tP_t\Pi_t, \tag{25}$$

with $P_t$ being equal to $P_0$, $P_1$ at $t = 0, 1$, respectively, then $P_t$ is an optimal solution to (23). Moreover, the optimal $A_t$ is given by

$$A_t = -\frac12\left(\Pi_tP_t + P_t\Pi_t\right) + \frac{1}{2\epsilon}\left(P_t\Pi_t - \Pi_tP_t\right). \tag{26}$$

The proof is provided in the Appendix.

A. On the relation with information-geometry based covariance paths

As mentioned above, the cost function $f_\epsilon^{\mathrm{wls}}(A_t)$ is equal to $f_{P_t}^{\mathrm{info}}(A_t)$ for scalar-valued covariance. Therefore, the path defined by (24) and (25) coincides with the solution given by (12) and (13). Thus, the closed-form expressions provided in Proposition 3 are also optimal solutions to (23).

For matrix-valued covariance, we consider a more general family of covariance paths defined by (24) and (25) with possibly negative ϵ. Though ϵ was restricted to be positive in (23) to ensure that the cost function $f_\epsilon^{\mathrm{wls}}$ is positive, the covariance path in (24) and (25) is still well-defined for any given initial values with $P_0 \in \mathcal{S}_{++}^n$ and $\Pi_0 \in \mathcal{S}^n$, even if ϵ < 0. In this case, (24) and (25) still satisfy the first-order necessary condition for optimality, though the corresponding paths may only be local minima. It is interesting to notice that in the special case when ϵ = −1, (24) and (25) are the same as (12) and (13), respectively, implying that the corresponding covariance path is equal to the information-geometry based path characterized by (15). Therefore, the equations in (24) and (25) with possibly negative ϵ define a more general family of covariance paths.

B. On the existence and uniqueness of covariance paths

Based on the aforementioned relation with information geometry, the geodesic in (9) is the unique solution to (24) when σ = 0 and ϵ = −1. Though the general solution to the covariance path with arbitrary σ and ϵ is currently unknown, we will analyze the existence and uniqueness of the covariance path in the special case when σ = 0.

In Theorem 4 of [1], it is shown that the covariance path with σ = 0 has the following closed-form expression:

$$P_1 = e^{\frac{1+\epsilon}{2\epsilon}\left(P_0\Pi_0 - \Pi_0P_0\right)}\, e^{-P_0\Pi_0}\, P_0\, e^{-\Pi_0P_0}\, e^{\frac{1+\epsilon}{2\epsilon}\left(\Pi_0P_0 - P_0\Pi_0\right)}. \tag{27}$$

If ϵ = −1, then the unique $\Pi_0$ that satisfies (27) is equal to

$$\Pi_0 = -\frac12 P_0^{-\frac12}\log\left(P_0^{-\frac12} P_1 P_0^{-\frac12}\right)P_0^{-\frac12}. \tag{28}$$

By exploring the continuous dependence of $\Pi_0$ on ϵ in (27), we are able to derive a range of ϵ that ensures a unique covariance path connecting $P_0$ and $P_1$. To introduce the result, we let $\lambda_{\min}(P)$ and $\lambda_{\max}(P)$ denote the smallest and largest eigenvalues of $P \in \mathcal{S}_{++}^n$. Moreover, we define the following pseudo-norm:

$$\delta(P) \triangleq \max_{\Delta \in \mathcal{S}^n,\, \Delta \neq 0} \frac{\|\Delta P - P\Delta\|_2}{\|\Delta\|_2}. \tag{29}$$

Then, the following proposition holds.

Proposition 5. Given $P_0, P_1 \in \mathcal{S}_{++}^n$, if the scalar ϵ satisfies

$$\left|\frac{1+\epsilon}{2\epsilon}\right| < \max\left\{\frac{\lambda_{\min}(P_0)\lambda_{\min}(P_1)}{\delta(P_0)\lambda_{\max}(P_1)},\ \frac{\lambda_{\min}(P_0)\lambda_{\min}(P_1)}{\delta(P_1)\lambda_{\max}(P_0)}\right\}, \tag{30}$$

then there exists a unique $\Pi_0 \in \mathcal{S}^n$ that satisfies (27).

The proof is provided in the Appendix. We expect that the covariance path exists for a more general range of ϵ but the solution may not be unique.
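The pseudo-norm (29) is easy to evaluate: interpreting $\|\cdot\|_2$ as the spectral norm and writing Δ in the eigenbasis of $P$, one can check that the maximum in (29) equals $\lambda_{\max}(P) - \lambda_{\min}(P)$. The sketch below (an illustration, not from the paper) compares this closed form against a random search over symmetric directions, which makes condition (30) straightforward to evaluate for given endpoints:

```python
# Evaluating the pseudo-norm delta(P) in (29) for the spectral norm.
import numpy as np

def delta_closed_form(P):
    w = np.linalg.eigvalsh(P)
    return w[-1] - w[0]                          # lambda_max - lambda_min

def delta_random_search(P, trials=5000, seed=0):
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        D = rng.standard_normal(P.shape)
        D = D + D.T                              # restrict to symmetric Delta
        best = max(best, np.linalg.norm(D @ P - P @ D, 2)
                          / np.linalg.norm(D, 2))
    return best

P = np.diag([1.0, 0.3])
print(delta_closed_form(P))                      # 0.7
print(delta_random_search(P))                    # approaches 0.7 from below
```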

C. On the relation between input noise and system matrices

To understand the influence of the input noise on the optimal solutions, we consider a trajectory defined by (24) to (26) with given initial values $P_0$ and $\Pi_0$. With the system matrix $A_t$ given by (26), it is straightforward to derive that

$$\dot{A}_{t,a} = \frac{1}{2\epsilon}\left(\dot{P}_t\Pi_t + P_t\dot{\Pi}_t - \dot{\Pi}_tP_t - \Pi_t\dot{P}_t\right) = 0.$$

Therefore, the asymmetric part of $A_t$ is constant and equal to $\frac{1}{2\epsilon}\left(P_0\Pi_0 - \Pi_0P_0\right)$.

On the other hand, by taking the derivative of the symmetric part of At, denoted by At,s, we obtain that

$$\dot{A}_{t,s} = (1+\epsilon)\left(A_{0,a}A_{t,s} - A_{t,s}A_{0,a}\right) - \sigma^2\Pi_t. \tag{31}$$

Next, we apply a change of variables and define
$$\hat{A}_t = e^{-(1+\epsilon)A_{0,a}t}\left(A_{t,s} - \epsilon A_{0,a}\right)e^{(1+\epsilon)A_{0,a}t}, \quad \hat{P}_t = e^{-(1+\epsilon)A_{0,a}t}\,P_t\,e^{(1+\epsilon)A_{0,a}t}, \quad \hat{\Pi}_t = e^{-(1+\epsilon)A_{0,a}t}\,\Pi_t\,e^{(1+\epsilon)A_{0,a}t}.$$

By taking the derivative of the above equations, we obtain that

$$\dot{\hat{P}}_t = \hat{A}_t\hat{P}_t + \hat{P}_t\hat{A}_t' + \sigma^2 I, \tag{32}$$
$$\dot{\hat{\Pi}}_t = -\hat{A}_t'\hat{\Pi}_t - \hat{\Pi}_t\hat{A}_t, \tag{33}$$
$$\dot{\hat{A}}_t = -\sigma^2\hat{\Pi}_t, \tag{34}$$

which shows that only the symmetric part of $\hat{A}_t$ changes over time when the input noise is non-zero.

V. Example

A. Comparing scalar-valued covariance paths

In this example, we compare the scalar-valued covariance paths given by the closed-form expressions in (6) and in (17)–(18). The plots in Fig. 1 illustrate the covariance paths that connect $p_0 = 6$ to several different values of $p_1$ with σ = 4. The OMT and information-geometry based solutions are shown using dashed red and solid blue lines, respectively. If the endpoint $p_1$ is close to $p_0 + \sigma^2 = 22$, then the paths are close to the straight line $p_0 + \sigma^2 t$. If the endpoint is far away from 22, e.g. the endpoints at 30 and 1, then the OMT and information-geometry based paths are very different from each other.

Fig. 1: A comparison of OMT and Fisher-Rao metric based covariance paths. The dashed red lines denoted by $p_t^{\mathrm{omt}}$ illustrate the paths given by (6). The solid blue lines with $p_1$ above and below $p_0 + \sigma^2 = 22$ illustrate the paths given by (17) and (18), respectively.

B. Comparing matrix-valued covariance paths

In this example, we illustrate the difference between the above covariance paths using two fixed endpoints given by

$$P_0 = \begin{bmatrix} 1 & 0 \\ 0 & 0.3 \end{bmatrix}, \quad P_1 = \begin{bmatrix} 0.3 & 0 \\ 0 & 1 \end{bmatrix}. \tag{35}$$

Fig. 2a shows the OMT-based paths in (6) for several different values of σ, where the ellipsoids denote isocontours of the quadratic function $x'P_tx = r^2$ with $r = 0.5$. Fig. 2b illustrates the information-geometry based paths defined by (15). Since the two matrices $P_0$ and $P_1$ commute, the paths are obtained by applying the closed-form expressions in (17)–(18) to the diagonal entries. Though there are minor differences between the two sets of covariance paths, all the $P_t$'s along the two paths have the same eigenspace.

Fig. 2: An illustration of the covariance paths obtained using (6) (left) and (15) (right), respectively, with the two endpoints given by $P_0$ and $P_1$ in (35).

The closed-form solution of the covariance path defined by (24) and (25) is currently unknown, and there may also exist multiple locally optimal solutions. Specifically, we consider the solution to (23) in the extreme situation when ϵ = 0 and σ = 0. In this case, any asymmetric system matrix $A$ of the form

$$A = \begin{bmatrix} 0 & \pm(2k+1)\frac{\pi}{2} \\ \mp(2k+1)\frac{\pi}{2} & 0 \end{bmatrix}, \tag{36}$$

with $k$ a nonnegative integer, is an optimal solution because the corresponding objective value is equal to zero. The corresponding covariance paths are equal to

$$P_{\pm,t} = \begin{bmatrix} 0.3 + 0.7\cos^2\left((2k+1)\frac{\pi}{2}t\right) & \pm 0.7\cos\left((2k+1)\frac{\pi}{2}t\right)\sin\left((2k+1)\frac{\pi}{2}t\right) \\ \pm 0.7\cos\left((2k+1)\frac{\pi}{2}t\right)\sin\left((2k+1)\frac{\pi}{2}t\right) & 0.3 + 0.7\sin^2\left((2k+1)\frac{\pi}{2}t\right) \end{bmatrix}.$$

In order to obtain a numerical solution for the covariance paths given by (24) and (25) with non-zero ϵ, we apply the lsqnonlin nonlinear optimization routine and the ode45 function in MATLAB (The MathWorks, Inc., Natick, MA) to solve for $\Pi_0$ such that the path $P_t$ starting from $P_0$ has the least squared error relative to $P_1$ at $t = 1$. We first set ϵ = 0.001 and choose two different initial values for $\hat{\Pi}_0$ used in the optimization algorithms, given by
$$\hat{\Pi}_{\pm,0} = \pm\frac{1}{700}\begin{bmatrix} 0 & \pi \\ \pi & 0 \end{bmatrix},$$

so that the corresponding system matrices given by (26) approximately satisfy (36). Next, we gradually increase ϵ using a step size of 0.001 and use the optimal $\Pi_0$ from the previous step as the initial value. The left and right panels of Fig. 3 illustrate two branches of locally optimal paths corresponding to the two different initial values of $\Pi_0$, with σ = 0.5 and several different values of ϵ. All the numerical solutions satisfy the endpoint constraints with the Frobenius norm of the residuals on the order of $10^{-6}$ or smaller. Different from the paths shown in Fig. 2, the paths in Fig. 3 have rotating eigenspaces, where the rotation direction depends on the initial choice of $\Pi_0$. Moreover, as ϵ increases, the paths become similar to those shown in Fig. 2.

Fig. 3: An illustration of two branches, i.e. (a) and (b), of locally optimal covariance paths obtained using (24) and (25) with different initial values for $\Pi_0$.
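A Python analogue of the shooting procedure described above (a sketch, not the paper's MATLAB implementation; the sign conventions follow (24)–(26) as printed above, and the vectorization, tolerances, and optimizer settings are assumptions — the very small ϵ makes the ODE stiff, so treat this as illustrative):

```python
# A sketch of the shooting method for (24)-(25): integrate from a guessed Pi_0
# and adjust it so that the covariance path reaches P1 at t = 1.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

n, sigma, eps = 2, 0.5, 0.001
P0, P1 = np.diag([1.0, 0.3]), np.diag([0.3, 1.0])
g1, g2 = (1 + eps)/(2*eps), (1 - eps)/eps

def rhs(t, y):
    P, Pi = y[:n*n].reshape(n, n), y[n*n:].reshape(n, n)
    dP = -g1*(Pi @ P @ P + P @ P @ Pi) + g2*(P @ Pi @ P) + sigma**2*np.eye(n)  # (24)
    dPi = g1*(Pi @ Pi @ P + P @ Pi @ Pi) - g2*(Pi @ P @ Pi)                    # (25)
    return np.concatenate([dP.ravel(), dPi.ravel()])

def endpoint_residual(pi0_vec):
    Pi0 = pi0_vec.reshape(n, n)
    Pi0 = 0.5*(Pi0 + Pi0.T)                       # keep the costate symmetric
    y0 = np.concatenate([P0.ravel(), Pi0.ravel()])
    sol = solve_ivp(rhs, (0.0, 1.0), y0, rtol=1e-9, atol=1e-11)
    return (sol.y[:n*n, -1].reshape(n, n) - P1).ravel()

Pi0_init = (1.0/700.0)*np.array([[0.0, np.pi], [np.pi, 0.0]])  # guess from the text
fit = least_squares(endpoint_residual, Pi0_init.ravel())
print(np.linalg.norm(fit.fun))                    # endpoint mismatch after fitting
```

Continuation in ϵ, as described in the text, can then reuse the converged $\Pi_0$ as the initial guess for the next value of ϵ.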

C. Fitting noisy measurements of functional MRI data

In this example, we apply the proposed covariance paths to fit noisy sample covariance matrices of a stochastic process based on a resting-state functional MRI dataset used in [1]. The covariance matrix of time-series data measured by rsfMRI is usually referred to as the functional connectivity matrix, which is the standard tool for investigating functional networks between brain regions [9]–[11]. However, the underlying time series are non-stationary [12], [13], implying the existence of dynamic changes in brain networks. A review of existing methods for modeling and analyzing time-resolved functional connectivity can be found in [33]. In particular, both neural field analysis and real-data experiments have revealed ultra-slow oscillations in brain activity [34], [35]. The proposed covariance paths provide an approach that uses linear systems to model the rotation of energy related to the oscillations in brain networks. Moreover, the underlying model parameters are potentially useful for understanding the relation between brain dynamics and structural connectivity between brain regions. The interested reader is referred to [1], [36] for more detailed background information and data processing methods.

This covariance fitting problem is formulated as follows. Given $K = 10$ sample covariance matrices, denoted by $\tilde{P}_{t_1}, \ldots, \tilde{P}_{t_K}$, based on $K$ segments of rsfMRI time-series data, we look for a smooth covariance path $P_t$ that minimizes

$$\min_{P_t} \sum_{k=1}^{K} \left\|P_{t_k} - \tilde{P}_{t_k}\right\|_F^2. \tag{37}$$

In this example, the $\tilde{P}_{t_k}$'s are the sample covariances of a 7-dimensional rsfMRI time series based on the 7 brain regions proposed in [37]. We apply the three proposed parametric models of covariance paths to fit these measurements using the fminsdp function in MATLAB. The first model is the closed-form expression given by (6), which is parameterized by $P_0$, $\Pi_0$ and σ. The optimal solution is denoted by $\hat{P}_t^{\mathrm{omt}}$. The second model is based on the differential equations (12) and (13), which are integrated by the ode45 function in MATLAB. The estimated path is denoted by $\hat{P}_t^{\mathrm{info}}$. Note again that this model is a special case of (24) and (25) with ϵ = −1. The more general model in (24) and (25) relies on the estimation of an additional parameter ϵ. Since the differential equations are highly nonlinear in terms of ϵ, we find that the MATLAB algorithm only provides a locally optimal value of ϵ that depends on its initial value. In order to obtain a reliable covariance path for understanding the rotation of energy among the variables, we fix ϵ = 20, as used in [1], to analyze the influence of stochastic input. The estimated path is denoted by $\hat{P}_t^{\mathrm{wls}}$.

The discrete markers in Fig. 4 illustrate the noisy sample covariances of 6 entries of the covariance matrices. The blue, green and red plots are the estimated paths given by $\hat{P}_t^{\mathrm{omt}}$, $\hat{P}_t^{\mathrm{info}}$, and $\hat{P}_t^{\mathrm{wls}}$, respectively. The normalized squared errors corresponding to the three paths are equal to 0.1674, 0.1668 and 0.1388, respectively, which are all lower than the corresponding fitting errors in [1] because of the additional $\sigma^2 I$ term in the models. The much lower estimation error of $\hat{P}_t^{\mathrm{wls}}$ indicates that the weighted-least-squares based solution is more capable of tracking the rotation of energy among brain regions.

Fig. 4: A comparison of the fitting results of the three covariance paths. The discrete markers illustrate several entries of the noisy sample covariances $\tilde{P}_t$. The dashed blue plots illustrate the fitted covariance paths based on (6). The dashed green plots show the fitting results based on (12) and (13). The solid red lines show the results based on (24) and (25).

VI. Discussion and conclusion

In this paper, we have investigated three models for time-varying covariance matrices using the state covariance of linear systems with stochastic input. The main motivation is to use these parametric models to investigate dynamic interactions among brain regions using time-varying covariance matrices of rsfMRI data. The three parametric covariance paths are derived as the optimal solutions to three quadratic regularization problems. The first solution is obtained using an optimal-mass-transport based objective function, which is related to the well-known solution to the Schrödinger bridge problem (SBP) [21], [22], [30], [31]. The derivations of the other two models are the main contributions of this work. In particular, the second type of covariance path is based on a generalization of the Fisher-Rao metric in information geometry. The third family of covariance paths is obtained using a weighted-least-squares cost function of the underlying system matrices, which includes the Fisher-Rao based solution as a special case. Moreover, the weighted-least-squares based covariance paths are able to model the rotation of energy among multidimensional variables, which cannot be done by the other two types of paths. The three models of covariance paths generalize the results from [1], which focuses on linear systems without stochastic input. The performance of the three covariance paths is compared using synthetic data and a real-data example on estimating dynamic brain networks using rsfMRI. Our future work will focus on developing more effective computational algorithms to solve the covariance fitting problem and on applying the proposed solutions to analyze abnormal brain networks related to mental disorders.

Acknowledgments

This work was supported in part under grants R21MH115280, R21MH116352, K01MH117346 (PI: Ning), R01MH097979, R01MH111917 (PI: Rathi), R01MH074794 (PI: Westin).

Biography


Lipeng Ning received his B.Sc. and M.Sc. in Control Science and Engineering from the Beijing Institute of Technology, China, in 2006 and 2008, respectively. He obtained his Ph.D. in Electrical and Computer Engineering from the University of Minnesota in November 2013. He is currently an Assistant Professor at Brigham and Women's Hospital and Harvard Medical School, Boston, MA.

He is interested in the application of mathematics in neuroimaging and neuroscience research. His current research focuses on stochastic processes, dynamical systems, machine learning and brain connectomics.

Appendix

Proof of Proposition 1.

We consider (4) as an optimal control problem with At being matrix-valued control input. A necessary condition for the optimal solution is that the derivative of the Hamiltonian

$$h_1(P_t, A_t, \Pi_t) = \operatorname{tr}\left(A_tP_tA_t' + \Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right)$$

with respect to $A_t$ vanishes at the optimal control. This gives rise to

$$A_tP_t + \Pi_tP_t = 0.$$

Thus,

$$A_t = -\Pi_t. \tag{38}$$

Next, the optimal costate must satisfy that $\dot{\Pi}_t$ equals the negative of the derivative of $h_1(\cdot)$ with respect to $P_t$, which leads to

$$\dot{\Pi}_t = -A_t'A_t - A_t'\Pi_t - \Pi_tA_t. \tag{39}$$

Substituting (38) into (2) and (39), we deduce that

$$\dot{P}_t = -\Pi_tP_t - P_t\Pi_t + \sigma^2 I, \tag{40}$$
$$\dot{\Pi}_t = \Pi_t^2. \tag{41}$$

Next, we show that all eigenvalues of the initial $\Pi_0$ must be smaller than one. For this purpose, we note that the solutions for $P_t$ and $\Pi_t$ from (40) and (41) have the following form:

$$P_t = \left(I - \Pi_0 t\right)P_0\left(I - \Pi_0 t\right) + \sigma^2\left(It - \Pi_0 t^2\right), \tag{42}$$
$$\Pi_t = \Pi_0\left(I - \Pi_0 t\right)^{-1}. \tag{43}$$

Assume that $\Pi_0$ has an eigenvalue $\lambda_0 > 1$. Then $A_t = -\Pi_t$ becomes unbounded as $t$ increases to $1/\lambda_0$. At the same time, $P_t$ is singular at $t = 1/\lambda_0$ and fails to be positive semidefinite for $t > 1/\lambda_0$. Therefore, all eigenvalues of $\Pi_0$ are smaller than one.

Next, setting $t = 1$ in (42) and multiplying both sides by $P_0^{\frac12}$ on the left and right, we obtain

$$P_0^{\frac12}P_1P_0^{\frac12} = \left(P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12}\right)^2 + \sigma^2 P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12}.$$

Therefore $P_0^{\frac12}(I - \Pi_0)P_0^{\frac12}$ has the same eigenvectors as $P_0^{\frac12}P_1P_0^{\frac12}$. If $y$ is an eigenvalue of $P_0^{\frac12}P_1P_0^{\frac12}$, then the corresponding eigenvalue of $P_0^{\frac12}(I - \Pi_0)P_0^{\frac12}$, denoted by $x$, satisfies $x^2 + \sigma^2 x = y$. The two solutions are given by

$$x_{\pm} = -\frac12\sigma^2 \pm \left(y + \frac14\sigma^4\right)^{\frac12}.$$

But $x_-$ is negative, which contradicts the fact that $I - \Pi_0$ is positive definite. Thus $x_+$ is the only feasible solution. Therefore,

$$P_0^{\frac12}\left(I - \Pi_0\right)P_0^{\frac12} = \left(P_0^{\frac12}P_1P_0^{\frac12} + \frac14\sigma^4 I\right)^{\frac12} - \frac12\sigma^2 I,$$

which gives rise to the optimal solution in (7). Then, the proposition is proved. ☐

Proof of Proposition 2.

Following the same method as in the proof of Proposition 1, we derive the solution to the optimal control problem (11) using the following Hamiltonian:

$$h_2(P_t, A_t, \Pi_t) = \operatorname{tr}\left(P_t^{-1}A_tP_tA_t' + \Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right).$$

It is necessary that $\dot{\Pi}_t$ equal the negative of the partial derivative of $h_2(\cdot)$ with respect to $P_t$, which leads to

$$\dot{\Pi}_t = -A_t'P_t^{-1}A_t + P_t^{-1}A_tP_tA_t'P_t^{-1} - \Pi_tA_t - A_t'\Pi_t. \tag{44}$$

Moreover, setting the derivative of $h_2(\cdot)$ with respect to $A_t$ to zero, we obtain

$$A_t = -P_t\Pi_t. \tag{45}$$

Then, (12) and (13) can be obtained by substituting (45) into (2) and (44), respectively. From (12), we obtain the following expression:

$$\Pi_t = \frac12\left(\sigma^2 P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right).$$

Next, taking the derivative of $\Pi_t$ and setting it equal to (13), we obtain

$$\begin{aligned}\dot{\Pi}_t &= \frac12\left(-\sigma^2\left(P_t^{-1}\dot{P}_tP_t^{-2} + P_t^{-2}\dot{P}_tP_t^{-1}\right) + 2P_t^{-1}\dot{P}_tP_t^{-1}\dot{P}_tP_t^{-1} - P_t^{-1}\ddot{P}_tP_t^{-1}\right) \\ &= \frac12\left(\sigma^2P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right)P_t\left(\sigma^2P_t^{-2} - P_t^{-1}\dot{P}_tP_t^{-1}\right).\end{aligned}$$

Then (15) can be obtained from the above equation after simplifications. Thus, the proof is complete. ☐

Proof of Proposition 3.

Since the paths in (17) and (18) both satisfy (15), for the first argument we only need to prove that if $|p_1 - p_0| > \sigma^2$, then there exists a unique pair of parameters $\alpha$, $\beta$ such that $p_t$ from (17) is equal to $p_0$ and $p_1$ at $t = 0, 1$, respectively. For this purpose, we set $p_t$ in (17) to $p_0$ and $p_1$ at $t = 0, 1$, respectively, to obtain

$$p_0 = \alpha - \frac{\sigma^4}{4\alpha\beta^2}, \quad p_1 = \alpha e^{\beta} - \frac{\sigma^4}{4\alpha\beta^2}e^{-\beta}. \tag{46}$$

Then, we can derive that
$$\alpha = \frac{p_1 - p_0e^{-\beta}}{e^{\beta} - e^{-\beta}}.$$

Next, substituting the above expression into (46) and multiplying both sides by $4\left(p_1 - p_0e^{-\beta}\right)\beta^2\left(e^{\beta} - e^{-\beta}\right)$, we obtain

$$4\left(p_1 - p_0e^{\beta}\right)\left(p_1 - p_0e^{-\beta}\right)\beta^2 - \sigma^4\left(e^{\beta} - e^{-\beta}\right)^2 = 0. \tag{47}$$

The above equation has a trivial solution at β = 0. Moreover, if a non-zero β is a solution to (47), then so is −β. In this case, the coefficient of the second term of $p_t$ is equal to

$$\frac{\sigma^4}{4\alpha\beta^2} = \frac{p_1 - p_0e^{\beta}}{e^{\beta} - e^{-\beta}}.$$

Therefore, (17) is equivalent to

$$p_t = \frac{p_1 - p_0e^{-\beta}}{e^{\beta} - e^{-\beta}}\,e^{\beta t} - \frac{p_1 - p_0e^{\beta}}{e^{\beta} - e^{-\beta}}\,e^{-\beta t}.$$

It is interesting to note that switching β to −β does not change the covariance path. Therefore, we only consider positive solutions to (47). To this end, we denote the left-hand side of (47) by ϕ(β). It is straightforward to derive that
$$\left.\frac{\mathrm{d}\phi(\beta)}{\mathrm{d}\beta}\right|_{\beta=0} = 0, \quad \left.\frac{\mathrm{d}^2\phi(\beta)}{\mathrm{d}\beta^2}\right|_{\beta=0} = 8\left(\left(p_1 - p_0\right)^2 - \sigma^4\right).$$

Moreover,
$$\frac{\mathrm{d}^3\phi(\beta)}{\mathrm{d}\beta^3} = -4p_0p_1\left(\left(e^{\beta} - e^{-\beta}\right)\left(\beta^2 + 6\right) + 6\beta\left(e^{\beta} + e^{-\beta}\right)\right) - 8\sigma^4\left(e^{2\beta} - e^{-2\beta}\right),$$

which is negative for all β > 0. Therefore, if $(p_1 - p_0)^2 - \sigma^4 > 0$, then the function ϕ(β) is convex and positive near β = 0, but ϕ(β) < 0 as β → ∞. Because its second-order derivative $\frac{\mathrm{d}^2\phi(\beta)}{\mathrm{d}\beta^2}$ is monotonically decreasing, we conclude that there exists a unique positive solution to ϕ(β) = 0.

To prove the second argument, we take the derivative of pt given by (18) to obtain

$$\dot{p}_t = -\sigma^2\sin(\omega t + \theta).$$

If p0, p1 are the endpoints of pt, then it is necessary that

$$|p_1 - p_0| = \left|\int_0^1 \dot{p}_t\,\mathrm{d}t\right| \le \int_0^1 |\dot{p}_t|\,\mathrm{d}t \le \sigma^2.$$

To show that there exists a path of the form (18) for any $p_0$, $p_1$ that satisfy $|p_1 - p_0| < \sigma^2$, we re-parameterize (18) as follows:
$$p_t = c\cos(\omega t) + d\sin(\omega t),$$

with $(c^2 + d^2)\omega^2 = \sigma^4$. By setting $p_t$ to $p_0$, $p_1$ at $t = 0, 1$, respectively, we obtain

$$p_0 = c, \quad p_1 = c\cos(\omega) + d\sin(\omega). \tag{48}$$

Therefore,

$$d = \frac{p_1 - p_0\cos(\omega)}{\sin(\omega)}. \tag{49}$$

Next, substituting (48) and (49) into $\sin^2(\omega)\left(c^2 + d^2\right) = \sin^2(\omega)\frac{\sigma^4}{\omega^2}$, we obtain

$$\frac{\sigma^4}{\omega^2}\sin^2(\omega) + 2p_0p_1\cos(\omega) = p_0^2 + p_1^2. \tag{50}$$

We denote the left-hand side of the above equation by ψ(ω). Then, the following holds:

$$\lim_{\omega \to 0}\psi(\omega) = \sigma^4 + 2p_0p_1 > \left(p_1 - p_0\right)^2 + 2p_0p_1 = p_0^2 + p_1^2.$$

Moreover, $\psi(\pi) = -2p_0p_1 < 0$. Furthermore, it can be shown that

$$\frac{\mathrm{d}\psi(\omega)}{\mathrm{d}\omega} = \left(\frac{2\sigma^4}{\omega^3}\left(\omega\cos(\omega) - \sin(\omega)\right) - 2p_0p_1\right)\sin(\omega),$$

which is negative for all ω ∈ (0, π). Therefore, there exists a unique solution to (50) for ω ∈ (0, π), which completes the proof. ☐

Proof of Proposition 4.

We follow the same method as in the proof of Proposition 2 to derive the necessary conditions for a stationary value using the following Hamiltonian:

$$h_3(P_t, A_t, \Pi_t) = \frac{1+\epsilon}{2}\operatorname{tr}\left(A_tA_t'\right) + \frac{1-\epsilon}{2}\operatorname{tr}\left(A_tA_t\right) + \operatorname{tr}\left(\Pi_t\left(A_tP_t + P_tA_t' + \sigma^2 I\right)\right).$$

Setting $\dot{\Pi}_t$ equal to the negative of the partial derivative of $h_3(\cdot)$ with respect to $P_t$, we obtain

$$\dot{\Pi}_t = -\Pi_tA_t - A_t'\Pi_t. \tag{51}$$

Moreover, setting the partial derivative of $h_3(\cdot)$ with respect to $A_t$ equal to zero, we obtain

$$(1+\epsilon)A_t + (1-\epsilon)A_t' + 2\Pi_tP_t = 0. \tag{52}$$

Thus, the symmetric and asymmetric parts of $A_t$ are equal to
$$A_{t,s} = -\frac12\left(\Pi_tP_t + P_t\Pi_t\right), \quad A_{t,a} = \frac{1}{2\epsilon}\left(P_t\Pi_t - \Pi_tP_t\right).$$

Therefore (26) holds. Then, (24) and (25) can be obtained by substituting (26) into (2) and (51), respectively. ☐

Proof of Proposition 5.

To simplify notation, we denote γ = (1 + ϵ)/(2ϵ). Then (27) implies that the matrix $\Pi_0$ that maps $P_0$ to $P_1$ satisfies

$$e^{\gamma\left(P_0\Pi_0 - \Pi_0P_0\right)}\, e^{-P_0\Pi_0}\, P_0\, e^{-\Pi_0P_0}\, e^{\gamma\left(\Pi_0P_0 - P_0\Pi_0\right)} = P_1, \tag{53}$$

which is equivalent to

$$\Pi_0 = -\frac12 P_0^{-\frac12}\log\left(P_0^{-\frac12}\, U P_1 U'\, P_0^{-\frac12}\right)P_0^{-\frac12}, \tag{54}$$

where

$$U = e^{\gamma\left(\Pi_0P_0 - P_0\Pi_0\right)}.$$

If γ = 0, i.e. ϵ = −1, then $\Pi_0$ given by (28) satisfies the above equation. Next, we apply a perturbation analysis to this equation to understand the solutions associated with different values of γ. Specifically, let δγ and $\Delta_\Pi$ denote perturbations to γ and $\Pi_0$, respectively, so that γ + δγ and $\Pi_0 + \Delta_\Pi$ still satisfy (54). Then, for small perturbations, perturbing both sides of (54) gives rise to

$$\begin{aligned}\Delta_\Pi = -\frac12 P_0^{-\frac12} M_Q^{-1}\Big(&\gamma\, P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12} + \gamma\, P_0^{-\frac12} U P_1 M_{U'}\!\left(P_0\Delta_\Pi - \Delta_\Pi P_0\right) P_0^{-\frac12} \\ &+ \delta\gamma\, P_0^{-\frac12}\left(\Pi_0P_0 - P_0\Pi_0\right) U P_1 U' P_0^{-\frac12} + \delta\gamma\, P_0^{-\frac12} U P_1 U'\left(P_0\Pi_0 - \Pi_0P_0\right) P_0^{-\frac12}\Big) P_0^{-\frac12} \\ &+ o(|\delta\gamma|) + o(\|\Delta_\Pi\|), \end{aligned} \tag{57}$$

where $o(|\delta\gamma|) + o(\|\Delta_\Pi\|)$ denotes all higher-order terms of the perturbations, $Q = P_0^{-\frac12} U P_1 U' P_0^{-\frac12}$, and $M_X(\cdot)$ and $M_X^{-1}(\cdot)$ are defined in (55) and (56) below, respectively.

For $A, \Delta \in \mathbb{R}^{n\times n}$, $e^{A+\Delta} = e^{A} + M_{e^{A}}(\Delta) + o(\|\Delta\|)$, where $M_X(\Delta)$ denotes the non-commutative multiplication of $\Delta$ by $X$, which is defined as

$$M_X(\Delta) = \int_0^1 X^{1-\tau}\,\Delta\, X^{\tau}\,\mathrm{d}\tau. \tag{55}$$

For positive definite matrices $P, P+\Delta \in \mathcal{S}_{++}^n$,

$$\log(P+\Delta) = \log(P) + M_P^{-1}(\Delta) + o(\|\Delta\|),$$

where $M_X^{-1}(\Delta)$ denotes the non-commutative division of $\Delta$ by $X$, which is defined as

$$M_X^{-1}(\Delta) = \int_0^\infty (X+\tau I)^{-1}\,\Delta\,(X+\tau I)^{-1}\,\mathrm{d}\tau. \tag{56}$$

Next, we combine all the terms containing $\Delta_\Pi$ on the right-hand side of (57) to define the following linear mapping $h_{\gamma,P_0,\Pi_0}: \mathcal{S}^n \to \mathcal{S}^n$:

$$h_{\gamma,P_0,\Pi_0}(\Delta_\Pi) = -\frac12 P_0^{-\frac12} M_Q^{-1}\left(P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12} + P_0^{-\frac12} U P_1 M_{U'}\!\left(P_0\Delta_\Pi - \Delta_\Pi P_0\right) P_0^{-\frac12}\right) P_0^{-\frac12}. \tag{58}$$

The sum of all terms on the r.h.s. of (57) that contain δγ is equal to $\delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0)$. Then (57) can be simplified as

$$\Delta_\Pi = \gamma\, h_{\gamma,P_0,\Pi_0}(\Delta_\Pi) + \delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0) + o(|\delta\gamma|) + o(\|\Delta_\Pi\|),$$

which implies that

$$\left(\mathcal{I} - \gamma\, h_{\gamma,P_0,\Pi_0}\right)(\Delta_\Pi) = \delta\gamma\, h_{\gamma,P_0,\Pi_0}(\Pi_0) + o(|\delta\gamma|) + o(\|\Delta_\Pi\|), \tag{59}$$

where $\mathcal{I}$ denotes the identity mapping.

Let $\gamma_\tau = \tau\gamma$ denote a smooth trajectory on the interval τ ∈ [0, 1] for a given γ. Let $\hat{\Pi}_\tau$ denote a trajectory on $\mathcal{S}^n$ for τ ∈ [0, 1] with the initial value $\hat{\Pi}_0$ given by (28). If the linear mapping $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible and

$$\frac{\mathrm{d}}{\mathrm{d}\tau}\hat{\Pi}_\tau = \left(\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}\right)^{-1}\gamma\, h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}(\hat{\Pi}_\tau), \tag{60}$$

then the pair $\gamma_\tau$, $\hat{\Pi}_\tau$ satisfies

$$e^{\gamma_\tau\left(P_0\hat{\Pi}_\tau - \hat{\Pi}_\tau P_0\right)}\, e^{-P_0\hat{\Pi}_\tau}\, P_0\, e^{-\hat{\Pi}_\tau P_0}\, e^{\gamma_\tau\left(\hat{\Pi}_\tau P_0 - P_0\hat{\Pi}_\tau\right)} = P_1.$$

Therefore, the endpoint $\Pi_0 = \hat{\Pi}_1$ is the unique solution for $\Pi_0$ that satisfies (53).

To prove Proposition 5, we only need to show that if (30) holds, then $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible. It suffices to prove that the singular values of $\gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ are all smaller than 1. For this purpose, we bound the norm of $h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}(\Delta_\Pi)$ defined by (58). The first of the two terms containing $\Delta_\Pi$ can be bounded as follows:

$$\begin{aligned}&\frac12\left\|P_0^{-\frac12} M_Q^{-1}\!\left(P_0^{-\frac12} M_U\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) P_1 U' P_0^{-\frac12}\right) P_0^{-\frac12}\right\| \\ &\quad= \frac12\left\|\int_0^\infty\!\!\int_0^1 P_0^{-\frac12}\left(Q + t_1 I\right)^{-1} P_0^{-\frac12}\, U^{1-t_2}\!\left(\Delta_\Pi P_0 - P_0\Delta_\Pi\right) U^{t_2}\, P_1 U' P_0^{-\frac12}\left(Q + t_1 I\right)^{-1} P_0^{-\frac12}\,\mathrm{d}t_2\,\mathrm{d}t_1\right\| \\ &\quad\le \frac12\int_0^\infty \left\|\left(P_0^{\frac12} Q P_0^{\frac12} + t_1 P_0\right)^{-1}\right\|^2 \mathrm{d}t_1\; \lambda_{\max}(P_1)\,\delta(P_0)\,\|\Delta_\Pi\| \\ &\quad\le \frac12\int_0^\infty \left(\lambda_{\min}(P_1) + t_1\lambda_{\min}(P_0)\right)^{-2} \mathrm{d}t_1\; \lambda_{\max}(P_1)\,\delta(P_0)\,\|\Delta_\Pi\| \\ &\quad= \frac12\,\frac{\delta(P_0)\,\lambda_{\max}(P_1)}{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}\,\|\Delta_\Pi\|.\end{aligned}$$

The same upper bound also holds for the other term in (58) containing $\Delta_\Pi$. Combining the bounds for the two terms, we obtain

$$\left\|\gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}\right\| \le |\gamma|\,\frac{\delta(P_0)\,\lambda_{\max}(P_1)}{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}.$$

Therefore, if

$$|\gamma| < \frac{\lambda_{\min}(P_0)\,\lambda_{\min}(P_1)}{\delta(P_0)\,\lambda_{\max}(P_1)}, \tag{61}$$

then $\mathcal{I} - \gamma_\tau h_{\gamma_\tau,P_0,\hat{\Pi}_\tau}$ is invertible for all τ ∈ [0, 1]. Thus, we have proved the first term in the bracket on the r.h.s. of (30). The second term on the r.h.s. of (30) can be obtained in a similar way by considering a time-reversed path from $P_1$ to $P_0$, i.e., by switching the roles of $P_0$ and $P_1$ in (24) and (25). Thus, the proposition is proved. ☐

REFERENCES

  • [1] Ning L, "Smooth interpolation of covariance matrices and brain network estimation," IEEE Transactions on Automatic Control, pp. 1–10, 2018.
  • [2] Porikli F, Tuzel O, and Meer P, "Covariance tracking using model update based on Lie algebra," in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 1, June 2006, pp. 728–735.
  • [3] Wu Y, Cheng J, Wang J, Lu H, Wang J, Ling H, Blasch E, and Bai L, "Real-time probabilistic covariance tracking with efficient model update," IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2824–2837, May 2012.
  • [4] Yang JF and Kaveh M, "Adaptive eigensubspace algorithms for direction or frequency estimation and tracking," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 2, pp. 241–251, February 1988.
  • [5] Jiang X, Ning L, and Georgiou TT, "Distances and Riemannian metrics for multivariate spectral densities," IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1723–1735, July 2012.
  • [6] Lenglet C, Rousson M, Deriche R, and Faugeras O, "Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing," Journal of Mathematical Imaging and Vision, vol. 25, no. 3, pp. 423–444, October 2006.
  • [7] Dryden IL, Koloydenko A, and Zhou D, "Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging," The Annals of Applied Statistics, vol. 3, no. 3, pp. 1102–1123, 2009. http://www.jstor.org/stable/30242879
  • [8] Hao X, Whitaker RT, and Fletcher PT, Adaptive Riemannian Metrics for Improved Geodesic Tracking of White Matter. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 13–24.
  • [9] Biswal B, Zerrin Yetkin F, Haughton VM, and Hyde JS, "Functional connectivity in the motor cortex of resting human brain using echo-planar MRI," Magnetic Resonance in Medicine, vol. 34, no. 4, pp. 537–541, 1995. doi: 10.1002/mrm.1910340409
  • [10] Buckner RL, Krienen FM, and Yeo BTT, "Opportunities and limitations of intrinsic functional connectivity MRI," Nature Neuroscience, vol. 16, pp. 832–837, 2013.
  • [11] Smith SM, Vidaurre D, Beckmann CF, et al., "Functional connectomics from resting-state fMRI," Trends in Cognitive Sciences, vol. 17, no. 12, pp. 666–682, 2013.
  • [12] Chang C and Glover GH, "Time-frequency dynamics of resting-state brain connectivity measured with fMRI," NeuroImage, vol. 50, no. 1, pp. 81–98, 2010.
  • [13] Preti MG, Bolton TA, and Ville DVD, "The dynamic functional connectome: State-of-the-art and perspectives," NeuroImage, vol. 160, pp. 41–54, 2017.
  • [14] Villani C, Topics in Optimal Transportation. Amer. Math. Soc., 2003.
  • [15] Rachev S and Rüschendorf L, Mass Transportation Problems. Vol. I and II. Probability and its Applications. Springer, New York, 1998.
  • [16] Knott M and Smith CS, "On the optimal mapping of distributions," Journal of Optimization Theory and Applications, vol. 43, no. 1, pp. 39–49, May 1984.
  • [17] Takatsu A, "On Wasserstein geometry of the space of Gaussian measures," ArXiv e-prints, January 2008.
  • [18] Rao C, "Information and the accuracy attainable in the estimation of statistical parameters," Bull. Calcutta Math. Soc., vol. 37, pp. 81–91, 1945.
  • [19] Amari S-I and Nagaoka H, Methods of Information Geometry. Amer. Math. Soc., 2000.
  • [20] Schrödinger E, "Über die Umkehrung der Naturgesetze," Sitzungsberichte Preuss. Akad. Wiss. Berlin. Phys. Math., pp. 144–153, 1931.
  • [21] Léonard C, "A survey of the Schrödinger problem and some of its connections with optimal transport," ArXiv e-prints, August 2013.
  • [22] Chen Y, Georgiou TT, and Pavon M, "On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint," Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 671–691, May 2016.
  • [23] Ning L, Jiang X, and Georgiou T, "On the geometry of covariance matrices," IEEE Signal Processing Letters, vol. 20, no. 8, pp. 787–790, August 2013.
  • [24] Ning L, Georgiou TT, and Tannenbaum A, "On matrix-valued Monge-Kantorovich optimal mass transport," IEEE Transactions on Automatic Control, vol. 60, no. 2, pp. 373–382, 2015.
  • [25] Chen Y, Georgiou T, and Tannenbaum A, "Matrix optimal mass transport: A quantum mechanical approach," IEEE Transactions on Automatic Control, vol. 63, no. 8, pp. 2612–2619, 2018.
  • [26] Chen Y, Georgiou TT, Ning L, and Tannenbaum A, "Matricial Wasserstein-1 distance," IEEE Control Systems Letters, vol. 1, no. 1, pp. 14–19, July 2017.
  • [27] Yamamoto K, Chen Y, Ning L, Georgiou TT, and Tannenbaum A, "Regularization and interpolation of positive matrices," IEEE Transactions on Automatic Control, vol. PP, no. 99, pp. 1–1, 2017.
  • [28] Uhlmann A, "The metric of Bures and the geometric phase," in Quantum Groups and Related Topics: Proceedings of the First Max Born Symposium, Gielerak R, Lukierski J, and Popowicz Z, Eds., 1992, p. 267.
  • [29] Bhatia R, Jain T, and Lim Y, "On the Bures-Wasserstein distance between positive definite matrices," ArXiv e-prints, December 2017.
  • [30] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part I," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1158–1169, May 2016.
  • [31] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part II," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1170–1180, May 2016.
  • [32] Moakher M, "Means and averaging in the groups of rotations," SIAM J. Matrix Anal. Appl., vol. 24, no. 1, pp. 1–16, 2002.
  • [33] Keilholz S, Caballero-Gaudes C, Bandettini P, Deco G, and Calhoun V, "Time-resolved resting-state functional magnetic resonance imaging analysis: Current status, challenges, and new directions," Brain Connectivity, vol. 7, no. 8, pp. 465–481, 2017.
  • [34] Deco G, Jirsa V, McIntosh AR, Sporns O, and Kötter R, "Key role of coupling, delay, and noise in resting brain fluctuations," Proceedings of the National Academy of Sciences, vol. 106, no. 25, pp. 10302–10307, 2009.
  • [35] Ghosh A, Rho Y, McIntosh AR, Kötter R, and Jirsa VK, "Noise during rest enables the exploration of the brain's dynamic repertoire," PLOS Computational Biology, vol. 4, no. 10, pp. 1–12, October 2008. doi: 10.1371/journal.pcbi.1000196
  • [36] Ning L and Rathi Y, "A dynamic regression approach for frequency-domain partial coherence and causality analysis of functional brain networks," IEEE Transactions on Medical Imaging, vol. 37, no. 9, pp. 1957–1969, 2017.
  • [37] Yeo BT, Krienen FM, Sepulcre J, Sabuncu MR, Lashkari D, Hollinshead M, Roffman JL, Smoller JW, Zöllei L, Polimeni JR, Fischl B, Liu H, and Buckner RL, "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," J. Neurophysiol., vol. 106, pp. 1125–1165, 2011.
