Abstract
The Horn and Schunck (HS) method, which amounts to the Jacobi iterative scheme in the interior of the image, was one of the first optical flow algorithms. In this article, we prove the convergence of the HS method whenever the problem is well-posed. Our result is established in the framework of a generalization of the HS method in dimension n ≥ 1, with a broad definition of the discrete Laplacian. In this context, the condition for convergence is that the intensity gradients are not all contained in the same hyperplane. Two other articles ([17] and [13]) claimed to solve this problem in the case n = 2, but it appears that both of these proofs are erroneous. Moreover, we explain why some standard results on the convergence of the Jacobi method do not apply to the HS problem, unless n = 1. It is also shown that the convergence of the HS scheme implies the convergence of the Gauss-Seidel and SOR schemes for the HS problem.
Keywords: Optical flow, Horn and Schunck algorithm, Jacobi iterations
1. Introduction
Optical flow refers to the distribution of apparent motion of intensity patterns in an image, caused by the relative motion between an observer and the scene. The Horn and Schunck (HS) method was one of the first optical flow algorithms used to determine a displacement field from several successive images [10]. The original HS method is based on a global approach and adds a quadratic smoothness prior to the classical optical flow equation. This algorithm is especially suited to speckled or diffuse images, like those encountered in several imaging modalities where a displacement field without discontinuities or very high gradients is expected [24, 16, 14]. Thus, the HS method and its derived forms remain of high interest in some areas of motion imaging. More complex and very effective optical flow estimators, however, now exist in the context of natural scenes [2, 3] to take into account discontinuities at object edges.
Based on a discretization of the differential operators appearing in the HS optical flow formulation, the HS method results in a linear system that can be solved with direct or iterative methods. In comparison with direct methods, iterative solvers have the advantage of requiring less data storage and of being easy to program. It is well known that the matrix involved in the HS linear system is symmetric positive definite, as a consequence of the V-ellipticity of the HS functional [18]. This ensures, for example, the efficiency of the direct Cholesky decomposition and of the iterative Gauss-Seidel or SOR (successive over-relaxation) solvers. However, positive definiteness does not allow one to conclude anything about the method proposed in the original Horn and Schunck paper, which is an iterative 2 × 2 blockwise solver [10] and corresponds to the Jacobi solver only at the interior points of the image. In fact, it is shown in this work that positive definiteness is implied by the convergence of the HS scheme. The Gauss-Seidel and SOR solvers are known to converge at least twice as fast as the Jacobi solver [4, Theorem 5.3-4]. These iterative solvers can be made parallelizable using, for instance, a special red-black reordering of the unknowns in the linear system [6, 25]. The Jacobi iterative solver, however, has the advantage of being directly parallelizable, since it does not use values computed in the current iteration step [20, 21].
One known general result about the convergence of the Jacobi method concerns strictly (block) diagonally dominant matrices, which is not the case here. Another result concerns (block) irreducible and weakly diagonally dominant matrices, an assumption which is not satisfied for images of dimension greater than 1 under the appropriate Neumann boundary conditions. Whether the iterative method for the HS linear system with the Neumann boundary conditions converges has thus remained unsolved. Indeed, the paper of Horn and Schunck did not include a proof of convergence [10]. Two proofs of convergence have been published since then, in [17] and [13], both for 2-dimensional images. However, as far as we can tell, these two proofs are erroneous. There is also a short argument in [23, p. 249], based on diagonally dominant matrices (without blocks), for the convergence of the pointwise Jacobi method, which is erroneous as well.
Under a general perspective, there are three main points in an optical flow algorithm: 1) the formulation of the continuous energy (functional) to be minimized; 2) the discretization scheme; and 3) the solver used to minimize the energy. The aim of this work is to prove that the HS iterative solver (and hence, the Gauss-Seidel and SOR solvers) converges for the original quadratic HS functional under the generic discretization scheme adopted in this paper.
In Section 2, we state a generalization of the HS method in dimension n. In Section 3, we explain why the previous proofs are erroneous and cannot be fixed. In Section 4, we define some hypotheses about the discrete Laplacian, propose a necessary and sufficient condition for the linear system of Horn and Schunck to be invertible, and state our convergence result. The proof is presented in Section 5. In Section 6, we define a general way of calculating a discrete Laplacian in dimension n. In Section 7, we show that our general discrete Laplacian satisfies the hypotheses imposed to get the convergence result. In Appendix A, the HS iterative scheme is derived in detail from the discretization of the HS problem. In Appendix B, it is explained why the coefficient matrix of the HS scheme is neither strictly (block) diagonally dominant nor (block) irreducible and weakly diagonally dominant under the appropriate Neumann boundary conditions for images of dimension greater than 1. A result is shown in Appendix C that implies the convergence of the Gauss-Seidel and SOR iterative schemes whenever the Jacobi method converges, under appropriate conditions. This result also implies that the Gauss-Seidel and SOR methods converge for the HS problem, as a consequence of the convergence of the HS iterative scheme. In Appendix D, details are given to explain why the proofs of [17, 13] are erroneous.
2. Statement of the problem
The optical flow problem is usually applied to two-dimensional images of a moving scene [10]. Optical flow has also been used to analyze motion in one, three or four dimensions [22, 9, 5]. In this work, we investigate the convergence of the HS optical flow problem in the generalized case of dimension n ≥ 1. We thus consider an orthotope V ⊂ ℝn, i.e. a parallelotope whose edges are all mutually perpendicular (a segment if n = 1, a rectangle if n = 2 or a cuboid if n = 3). In the optical flow problem, each element of V generally corresponds to an intensity or brightness that varies over time. Given the intensity field over two or more successive instants, the aim of the HS method is to determine the corresponding displacement field. As we propose here a proof of convergence in the context of n-dimensional arrays, we first state an n-dimensional generalization of the HS method (for the classical 2-dimensional formulation, we refer the reader to [10]).
Let I denote the intensity field on V, It its derivative with respect to time t, ∇I its gradient with respect to position, and u the displacement field. We start from the well-known optical flow identity:
∇ITu + It = 0,   (2.1)
which means that a given (apparently moving) point of V keeps its initial intensity during its displacement. Then, a regularization method is employed to impose low spatial variations in the displacement field. By definition, the HS method consists in minimizing the unconstrained functional:
E(u) = ∫V { (∇ITu + It)² + μ Σ1≤ℓ≤n ||∇uℓ||² } dx,   (2.2)
where μ > 0 is a positive real number and || · ||, in the entire article, represents the Euclidean norm. The Euler-Lagrange equation corresponding to this minimization problem reads as follows:
∇I (∇ITu + It) − μ Δu = 0 in V,   (2.3)

∂u/∂n = 0 on ∂V.   (2.4)
Here, ∂/∂n is the differentiation operator in the direction of the normal n to the boundary ∂V, and the superscript T denotes transposition of matrices. The displacements without superscript T are considered to be column vectors (in ℝn). Note that the Neumann boundary conditions (2.4) arise naturally from the unconstrained minimization problem (2.2).
Now, we will discretize the expression of Eq. (2.3) on a lattice Λ covering the orthotope V. The restriction of the intensity field on the lattice Λ can be viewed as a (discretized) image. In the sequel, we assume that there are N ≥ 2 elements in the lattice Λ. Then, a discretized displacement field is of the form u = (ui)i∈Λ, where ui = (ui1, …, uin)T denotes the displacement vector at the point i. In the following, we will say that a displacement field u is uniform if all the displacement vectors ui are identical. Now, Eq. (2.3) can be written for i ∈ Λ as:
∇Ii (∇IiT ui + It,i) − μ Δ(u)i = 0,   (2.5)
where It,i denotes the partial derivative of the intensity I with respect to t evaluated at the point i. In Eq. (2.5), Δ(u)i is a discretized Laplacian that depends linearly on the vectors ui, for i ∈ Λ. Hence, the consideration of Eq. (2.5) for i ∈ Λ yields a linear system of n N equations and n N unknowns, where N is the number of points in Λ. The discretization of the Laplacian can classically be written as:
Δ(u)i = κ (M(u)i − ui),   (2.6)
where κ > 0 is a positive real number and M is a linear transformation (on the vector space of displacement fields) that returns for each point an average of the displacement field over its neighbors:
M(u)i = Σj∈Λ λij uj,   (2.7)
where λij, for i, j ∈ Λ, are non negative real numbers. We will adopt in Section 6 a general expression of this operator. In the following, we denote for notational convenience:
α = μ κ,   (2.8)
where μ is the regularization weight of Eq. (2.2), so that the coefficient α is a positive real number. In order to solve the linear system (2.5), Horn and Schunck [10] proposed an iterative method that is assumed to converge to the solution. Let P be the linear transformation (on the vector space of displacement fields) defined by P(u)i = Pi ui for i ∈ Λ, where

Pi = In − ∇Ii ∇IiT / (α + ||∇Ii||²)

and In is the n × n identity matrix. Let d be the displacement field defined by

di = −It,i ∇Ii / (α + ||∇Ii||²)

for i ∈ Λ. Then, the HS iterative scheme is expressed as follows:

uk+1 = P M(uk) + d.   (2.9)
See Appendix A for a derivation of Equation (2.9). Also, it is shown in Appendix B that the HS iterative scheme of Eq. (2.9) amounts to the Jacobi iterative scheme in the interior of the orthotope V, but never at its boundary points. Moreover, in that Appendix, we explain why standard results (based on block diagonally dominant matrices) on the convergence of the Jacobi iterative scheme do not apply in this context, due to the natural Neumann boundary conditions (2.4). We also explain why the short argument of [23, p. 249] based on diagonally dominant matrices (without blocks) for the pointwise Jacobi method is erroneous for the HS problem.
In Appendix C, it is shown that the convergence of the Gauss-Seidel and SOR iterative schemes is implied by the convergence of the Jacobi method, under appropriate conditions, based on a result on symmetric positive definite matrices. In particular, this result implies that the Gauss-Seidel and SOR methods converge for the HS problem, as a consequence of the convergence of the HS iterative scheme.
It would be straightforward to prove the convergence of the Jacobi solver in the presence of Dirichlet boundary conditions since the matrix would be block irreducible and weakly block diagonally dominant in that case. However, we recall that the Neumann conditions are intrinsically related to the minimization of the cost function (2.2).
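To make the scheme (2.9) concrete, here is a minimal numerical sketch in Python/NumPy for an n-dimensional image. It uses the simple nearest-neighbor average (the case w1 = 1 of Section 6) for the operator M, the boundary convention of Section 6.2 (values outside the image are copied from the closest interior point), and np.gradient as one possible estimate of the spatial derivatives; the function name hs_flow, the parameter values and the choice κ = 2n (unit grid spacing) are illustrative assumptions, not part of the original method.

    import numpy as np

    def hs_flow(I0, I1, mu=100.0, n_iter=500):
        """Minimal sketch of the generalized HS iterations (2.9).
        Uses the nearest-neighbor average (w1 = 1) for M and kappa = 2 n,
        which assumes a unit grid spacing (an assumption of this sketch)."""
        I0 = np.asarray(I0, dtype=float)
        I1 = np.asarray(I1, dtype=float)
        n = I0.ndim
        alpha = mu * 2.0 * n                                 # Eq. (2.8), alpha = mu * kappa
        grad = np.stack(np.gradient(I0), axis=-1)            # estimate of the gradients (one choice among many)
        It = I1 - I0                                         # estimate of I_t
        denom = alpha + np.sum(grad ** 2, axis=-1)           # alpha + ||grad I_i||^2
        d = -It[..., None] * grad / denom[..., None]         # the field d of Section 2
        u = np.zeros(I0.shape + (n,))
        for _ in range(n_iter):
            # M(u): average over the 2 n closest neighbors, edge values replicated (Neumann).
            padded = np.pad(u, [(1, 1)] * n + [(0, 0)], mode="edge")
            Mu = np.zeros_like(u)
            for ax in range(n):
                lo = [slice(1, -1)] * n + [slice(None)]
                hi = [slice(1, -1)] * n + [slice(None)]
                lo[ax], hi[ax] = slice(0, -2), slice(2, None)
                Mu += padded[tuple(lo)] + padded[tuple(hi)]
            Mu /= 2.0 * n
            # u <- P M(u) + d: remove the weighted component of M(u) along grad I_i, then add d.
            proj = np.sum(grad * Mu, axis=-1) / denom
            u = Mu - proj[..., None] * grad + d
        return u

    # Example on a small synthetic 2-dimensional image pair:
    rng = np.random.default_rng(0)
    I0 = rng.random((32, 32))
    I1 = np.roll(I0, shift=1, axis=0)            # I0 translated by one pixel
    u = hs_flow(I0, I1, mu=1.0, n_iter=200)
    print(u.shape)                               # (32, 32, 2)

For n = 2 this is a variant of the original HS algorithm in which the 8-point average of Fig. 1 is replaced by the simpler 4-neighbor average.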
3. Previous proofs
As stated in the introduction, the two existing proofs of convergence of the Jacobi solver in the context of the HS problem ([17] and [13]) appear to be erroneous. Let us now see in detail where the errors occurred and why we think that they cannot be fixed.
In [17], the cornerstone of the proof of convergence of the Jacobi method to solve the HS linear system relies on [17, Eq. (16)], which states that the function defined by the matrix “P” of [17, Eq. (9)] (not to be confused with the linear transformation P of Eq. (2.9) of the present paper) is contracting for the norm defined by [17, Eq. (10)], for any image. However, it appears that the only case for which this can be true is if the image is uniform, as explained in detail in Appendix D.
In [13, Eq. (20)], we notice that no condition is given for the convergence of the HS iterations. However, in view of Theorem 4.1 below, that assertion is false (a condition on the image gradients is needed to make the HS problem well-posed). Thus, the proof in [13] must be erroneous. In Appendix D, we give further details on intermediate statements that are false in [13].
Finally, in [23, p. 249], the special case of the HS problem amounts to Ψ′(s2) = 1 (see also [23, p. 247, second column]). In that case, the iterations of [23, Eqs. (14) and (15)] amount to the Jacobi iterative scheme for the system (2.5), but without considering n × n blocks. It is asserted that “If the discrete image gradient does not vanish at one point, the system matrix of these equations is irreducibly diagonally dominant. This guarantees the existence of a unique solution of the linear system and global convergence of the Jacobi iterations [26]”. But, as shown in Appendix B, the coefficient matrix of the system is not even diagonally dominant, except in a special case. Thus, that argument is also erroneous.
4. Statement of the main result
First, the operator M of Eqs. (2.6) and (2.9) comes from the Laplacian discretization and returns, for each point, an average of the displacement field over its neighbors as in Eq. (2.7). We will have to define several hypotheses about M. In the following, we will assume that:
-
(H1)
For every points i and j of Λ, λij = λji.
-
(H2)
At every point i of Λ, Σj∈Λ λij = 1.
Intuitively, (H1) comes from the isotropy property of the smooth Laplacian, and (H2) is necessary in order to have a null Laplacian when the displacement field is uniform. As we will see in Section 7, these hypotheses are satisfied by the general discretization scheme of Section 6. In order to state the last hypothesis, we have to define the graph G by its set of vertices V(G) = Λ and its set of edges E(G) = {(i, j) ∈ Λ² : λij ≠ 0}. If (i, j) ∈ E(G), we write i ~G j. We assume that the lattice Λ is of the form {(i1, i2, …, in) : iℓ is an integer ranging from 0 to Nℓ − 1 for 1 ≤ ℓ ≤ n}, where Nℓ ≥ 1 for 1 ≤ ℓ ≤ n. Thus, the number of points in Λ is equal to N = N1 N2 ⋯ Nn. From (H1), this graph is undirected (i.e. i ~G j if j ~G i). Let us now recall that an undirected graph G is connected if, for any two vertices i and j of G, there exists a path from i to j in G. We can now state the last hypothesis:
-
(H3)
The graph G is connected.
We will see in Section 7 that (H3) is also true with the general discretization scheme of Section 6. Actually, this is an immediate consequence of the fact that the closest neighbors of a point are taken into account in the average calculation at this point. In the following, we will call rank of (∇Ii) the dimension of the subspace of ℝn that is spanned by the vectors ∇Ii, i ∈ Λ.
Theorem 4.1
Under Hypotheses (H1), (H2) and (H3):
if the rank of (∇Ii) is n, the linear system (2.5) has a unique solution and the iterations (2.9) converge to this solution;
if the rank of (∇Ii) is not n, the problem is ill-posed, i.e. the linear system (2.5) does not have a unique solution.
Let us notice that the rank of (∇Ii) is different from n if and only if the intensity gradients are all contained in the same hyperplane. In that case, the image is invariant along the direction orthogonal to this hyperplane. The fact that this condition makes the problem ill-posed is not surprising, as it is clear that a displacement along this particular direction cannot be detected by studying the variations of intensity over time.
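In practice, the condition of Theorem 4.1 is easy to test: stack the discrete gradients ∇Ii into an N × n matrix and compute its rank. A minimal sketch (the helper name and the use of np.gradient to estimate the gradients are assumptions of the illustration):

    import numpy as np

    def gradients_have_full_rank(I, tol=1e-10):
        """True if the discrete intensity gradients span R^n (well-posed HS problem)."""
        I = np.asarray(I, dtype=float)
        n = I.ndim
        G = np.stack(np.gradient(I), axis=-1).reshape(-1, n)   # one row per lattice point
        return np.linalg.matrix_rank(G, tol=tol) == n

    I_bad = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))    # invariant along one direction: ill-posed
    I_good = np.random.default_rng(0).random((8, 8))     # generic image: well-posed
    print(gradients_have_full_rank(I_bad), gradients_have_full_rank(I_good))   # False True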
5. Proof of the main result
The linear transformation M and the coefficients λij are defined in Eq. (2.7). The linear transformation P and the matrix Pi are defined in Section 2 before Eq. (2.9). We define the norm of a displacement field u by ||u|| = (Σi∈Λ ||ui||2)1/2, where ||ui|| is the Euclidean norm on ℝn.
Lemma 5.1
Under Hypotheses (H1) and (H2):
For every displacement field u, ||M(u)|| ≤ ||u||.
If equality holds, then for any two points i ~G j, we have M (u)i = uj.
Proof
For each point i, we get by hypothesis (H1) and (H2) that Σj∈Λ λij = Σj∈Λ λji = 1. Then, for each direction ℓ, Jensen’s inequality [12] applied to the strictly convex function x → x2 yields:
( Σj∈Λ λij ujℓ )² ≤ Σj∈Λ λij ujℓ².   (5.1)
We can now write:
||M(u)||² = Σi∈Λ Σ1≤ℓ≤n ( Σj∈Λ λij ujℓ )² ≤ Σi∈Λ Σ1≤ℓ≤n Σj∈Λ λij ujℓ² = Σj∈Λ Σ1≤ℓ≤n ( Σi∈Λ λij ) ujℓ² = Σj∈Λ ||uj||² = ||u||².   (5.2)
Moreover, let us suppose that ||M(u)|| = ||u||. Then, for each point i and each ℓ, the equality in Eq. (5.1) is reached. Thus, the coordinates ujℓ associated with the non-vanishing coefficients λij are all identical (cf. [8]). This means that uj = uj′, for any two points i ~G j and i ~G j′. Therefore, it follows that M (u)i = Σj′∈Λ λij′uj′ = (Σj′∈Λ λij′)uj = uj, where j is any point such that i ~G j.
Lemma 5.2
For every displacement field u, we have ||P(u)|| ≤ ||u||. The equality holds if and only if ui is orthogonal to the gradient ∇Ii at any point i of Λ. In that case, P(u)i = ui at any point i of Λ.
Proof
Let u be a displacement field and i a point of the image. There exists a vector ai of ℝn orthogonal to ∇Ii and a real number b such that ui = ai + b ∇Ii. From the expression of Pi, we find Pi ai = ai (because ∇IiT ai = 0) and Pi ∇Ii = α ∇Ii / (α + ||∇Ii||²). Thus, Pi ui = ai + b α ∇Ii / (α + ||∇Ii||²). We notice here that ||ui||² = ||ai||² + b² ||∇Ii||² and ||Pi ui||² = ||ai||² + b² α² ||∇Ii||² / (α + ||∇Ii||²)². So, we get that ||Pi ui|| ≤ ||ui||, and that the equality holds if and only if b = 0 or ∇Ii = 0 (i.e. ui = ai), which means if and only if ui is orthogonal to ∇Ii. Finally, ||P(u)||² = Σi∈Λ ||Pi ui||² ≤ Σi∈Λ ||ui||² = ||u||², with equality if and only if ui is orthogonal to ∇Ii at every point i of Λ. In that case, it is clear that Pi ui = ui at every point i of Λ.
Let us recall that the HS iterations read as uk+1 = P M(uk) + d. We can now show the convergence of these iterations, under our condition on the intensity field. From Lemmas 5.1 and 5.2, we find that ||PM(u)|| ≤ ||u|| for any displacement field u. A feature of the following proof consists in showing that ||(PM)N(u)|| < ||u|| for any non-zero displacement field u, where N is the number of points in Λ.
Proof of Theorem 4.1
We still suppose that (H1), (H2) and (H3) are verified. Let us assume that the rank of (∇Ii) is n. Let us assume, by contradiction, that there is a displacement field u ≠ 0 such that ||(PM)N(u)|| = ||u||. Then, there is a point i* ∈ Λ such that ui* ≠ 0. Let i be any point of Λ. We claim that there is a path from i to i* in the graph G of length 1 ≤ L ≤ N (recall that N is the number of elements in Λ). Indeed, if i ≠ i*, a minimal path will do; if i = i*, the path i0 = i ~G i1 ~G i2 = i* = i will do, where i1 is any neighbor of i in the graph G. Let this path be of the form i0 = i ~G i1 ~G i2 ~G … ~G iL = i*. From Lemmas 5.1 and 5.2, the assumption ||(PM)N(u)|| = ||u|| implies that ||(PM)L(u)|| = ||M(PM)L−1(u)|| = ||(PM)L−1(u)|| = … = ||u||. Moreover, again from Lemmas 5.1 and 5.2, we have (PM)L(u)i0 = M(PM)L−1(u)i0 = (PM)L−1(u)i1 = … = uiL. Also, from Lemma 5.2, we have that M(PM)L−1(u)i0 is orthogonal to ∇Ii0, and thus that uiL = ui* is orthogonal to ∇Ii0 = ∇Ii. Since the point i is arbitrary, we deduce that the space spanned by the gradient vectors ∇Ii is orthogonal to the non-zero vector ui*, which is a contradiction. Thus, under the condition of convergence stated in the theorem and for u ≠ 0, we have ||(PM)N(u)|| < ||u||.
We now consider the function u → ||(P M)N(u)|| defined on the hypersphere {u| ||u|| = 1}. This function is continuous and defined on a compact set, i.e. a bounded closed subset of the vector space of displacement fields. Therefore, the function is bounded and reaches its maximal value. This ensures that there exists β < 1 such that for every displacement field u, ||(P M)N(u)|| ≤ β ||u||. Since moreover ||PM(u)|| ≤ ||u||, the conclusion about the existence of a solution for the linear system (2.5), its uniqueness and the convergence of the iterations (2.9) to this solution, is then a classical result (see [19, p. 101] for example).
We now suppose that the rank of (∇Ii) is less than n. In this case, the intensity gradients are all contained in the same hyperplane. Let us consider a displacement field u* that is uniform, different from zero and orthogonal to this hyperplane. Because of Hypothesis (H2), which imposes Σj∈Λ λij = 1 at each point i, and because u* is uniform, we get M(u*) = u*. Moreover, because u*i is orthogonal to ∇Ii at each point i, Lemma 5.2 says that P(u*) = u*. Thus, PM(u*) = u*. This shows that the linear system u = PM(u) + d (equivalent to the linear system (2.5)) has a non-zero solution when d = 0, so that the coefficient matrix of the linear system (2.5) is not invertible.
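The contraction property shown above can also be observed numerically: assemble the matrices of P and M for a small image and compute the spectral radius of PM, which is below 1 exactly when the gradients span ℝn. The sketch below uses the simple nearest-neighbor average (w1 = 1) and the boundary convention of Section 6.2; the helper name build_PM and the parameter values are illustrative assumptions:

    import numpy as np

    def build_PM(I, mu=10.0):
        """Matrices of the transformations P and M (nearest-neighbor average, copied boundary values)."""
        I = np.asarray(I, dtype=float)
        n, shape, N = I.ndim, I.shape, I.size
        alpha = mu * 2 * n                          # alpha = mu * kappa, with kappa = 2 n (unit spacing, assumption)
        grads = np.stack(np.gradient(I), axis=-1).reshape(N, n)
        idx = np.arange(N).reshape(shape)
        M = np.zeros((N, N))                        # averaging weights lambda_ij
        for i, multi in enumerate(np.ndindex(*shape)):
            for ax in range(n):
                for step in (-1, 1):
                    nb = list(multi)
                    nb[ax] = min(max(nb[ax] + step, 0), shape[ax] - 1)   # copy the closest point inside the image
                    M[i, idx[tuple(nb)]] += 1.0 / (2 * n)
        M_full = np.kron(M, np.eye(n))
        P_full = np.zeros((n * N, n * N))           # block diagonal matrix with blocks P_i
        for i, g in enumerate(grads):
            P_full[n*i:n*(i+1), n*i:n*(i+1)] = np.eye(n) - np.outer(g, g) / (alpha + g @ g)
        return P_full, M_full

    rng = np.random.default_rng(1)
    I_good = rng.random((5, 4))                     # generic image: the gradients span R^2
    I_bad = np.tile(np.linspace(0, 1, 4), (5, 1))   # image invariant along one direction
    for I in (I_good, I_bad):
        P, M = build_PM(I)
        rho = max(abs(np.linalg.eigvals(P @ M)))
        print(round(float(rho), 6))                 # < 1 for I_good; equal to 1 for I_bad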
6. The discrete Laplacian in dimension n
6.1. Description of a general scheme
Recall that the lattice Λ is assumed to be of the form {(i1, i2, …, in) : iℓ is an integer ranging from 0 to Nℓ − 1 for 1 ≤ ℓ ≤ n}, where Nℓ ≥ 1 for 1 ≤ ℓ ≤ n. The lattice Λ is viewed as a subset of the Cartesian product ℤn. In the following, the norms L1 and L∞ are denoted ||(i1, i2, …, in)||L1 = Σ1≤ℓ≤n |iℓ| and ||(i1, i2, …, in)||L∞ = max1≤ℓ≤n |iℓ|. We will now define a general way of calculating a discrete Laplacian in dimension n, based on [15]. As proposed in [15], we consider the n-dimensional finite-difference stencil Si around a point i, consisting of the 3^n − 1 points k ∈ ℤn that verify ||i − k||L∞ = 1. Then, we divide these stencil points into the sets Si(r), 1 ≤ r ≤ n, of points k ∈ ℤn that verify ||i − k||L1 = r. As explained in [15], it turns out that for each r in {1, …, n}, a discretization of the Laplacian can be constructed from the Taylor expansions of the points of Si(r) about the point i. The remaining part of this section concerns only interior points of the lattice Λ; the boundary cases are discussed in Section 6.2. So, if i is not a boundary point, the discretization of the Laplacian is given in [15, formula (2.2)]:
Δ(r)(u)i = κr ( (1/|Si(r)|) Σk∈Si(r) uk − ui ),   (6.1)

where κr > 0 is a constant, given by the Taylor expansions, that depends only on r and on the dimension n. Based on the definition of Si(r), it is clear that |Si(r)| = 2^r C(n, r), where C(n, r) denotes the binomial coefficient. Thus, κr is independent of the point i. Then, a general way to calculate a global discrete Laplacian at the point i is to make a weighted average of the Laplacians obtained for the different sets Si(r). Such a discretization can be written:

Δ(u)i = Σ1≤r≤n wr Δ(r)(u)i,   (6.2)

where the weights wr ≥ 0 are non negative real numbers such that Σ1≤r≤n wr = 1. We also denote κ = Σ1≤r≤n wr κr, independent of i, and γr = wr κr / κ, so that:

Δ(u)i = κ ( Σ1≤r≤n γr (1/|Si(r)|) Σk∈Si(r) uk − ui ).   (6.3)

We notice here that the coefficients γr are non negative and verify:

Σ1≤r≤n γr = 1.   (6.4)
In the following, we will impose w1 ≠ 0. This is a natural hypothesis because it means that Δ(1)(u)i, which is calculated from the closest neighbors of i, is taken into account in the Laplacian calculation at i. Therefore, we have γ1 ≠ 0. Note that a simple way of calculating a discrete Laplacian in dimension n is to set w1 = 1. The dimension independent Laplacian given in [15] corresponds to a particular choice of the weights wr; these coefficients are chosen so that some properties of the smooth Laplacian are kept with the discrete Laplacian (see [15] for more details). The scheme chosen by Horn and Schunck [10] in the 2-dimensional case, detailed below, is obtained by choosing the weights w1 and w2 so that γ1 = 2/3 and γ2 = 1/3 (which is the dimension independent Laplacian in the case n = 2).
In Fig. 1, as an example, the stencil Si is composed of the points {a, b, c, d, e, f, g, h}. The set Si(1) consists of the four points of the stencil at L1-distance 1 from i (the horizontal and vertical neighbors of i), and Si(2) consists of the four diagonal points. The scheme of Horn and Schunck [10] is Δ(u)i = 3 ( (1/6) Σk∈Si(1) uk + (1/12) Σk∈Si(2) uk − ui ), with κ = 3. Here, the coefficient γ1 associated with Si(1) is 2/3, and the coefficient γ2 associated with Si(2) is 1/3.
Fig. 1.
The 2-dimensional finite-difference stencil for the Laplacian calculation at an interior point i of a lattice Λ.
Finally, from Eq. (2.4), the boundary conditions considered here are that the normal derivatives vanish at the boundary of the image. In [10], Horn and Schunck explained how to deal with these conditions: when a point outside the image is needed, the displacement of the closest point inside the image is copied.
The description of the discretization scheme given above is sufficient to program the HS algorithm: just choose some coefficients wr, calculate the corresponding coefficients γr and κ, use Eq. (2.9) with α = μκ (cf. Eqs. (2.6) and (6.3)), and apply the boundary conditions when necessary.
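As an illustration of this recipe, here is a short Python sketch that computes the stencil sets Si(r) and the coefficients γr and κ for a given choice of weights wr. The constants κr are not given explicitly above; the sketch assumes the values κr = 2n/(r h²) (unit grid spacing h = 1 below), which are consistent with the 2-dimensional HS values κ = 3, γ1 = 2/3 and γ2 = 1/3 for w1 = w2 = 1/2; the helper names are illustrative.

    import numpy as np
    from math import comb
    from itertools import product

    def laplacian_coefficients(n, w, h=1.0):
        """Coefficients kappa and gamma_r of Eq. (6.3) for given weights w_1, ..., w_n.
        Assumes kappa_r = 2 n / (r h^2), which is an assumption of this sketch."""
        w = np.asarray(w, dtype=float)
        kappa_r = np.array([2.0 * n / (r * h * h) for r in range(1, n + 1)])
        kappa = float(np.sum(w * kappa_r))       # kappa = sum_r w_r kappa_r
        gamma = w * kappa_r / kappa              # gamma_r = w_r kappa_r / kappa
        return kappa, gamma

    def stencil_sets(n):
        """The sets S^(r), r = 1..n, of offsets k with ||k||_Linf = 1 and ||k||_L1 = r."""
        sets = {r: [] for r in range(1, n + 1)}
        for k in product((-1, 0, 1), repeat=n):
            r = sum(abs(c) for c in k)
            if r >= 1:
                sets[r].append(k)
        return sets

    kappa, gamma = laplacian_coefficients(2, [0.5, 0.5])
    print(kappa, np.round(gamma, 4))              # 3.0 [0.6667 0.3333]: the HS scheme of Fig. 1
    sizes = {r: len(S) for r, S in stencil_sets(3).items()}
    print(sizes, {r: 2**r * comb(3, r) for r in (1, 2, 3)})   # both give {1: 6, 2: 12, 3: 8}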
6.2. Determination of the weights in the average calculation
We will now give an expression of the coefficients λij defined in Eq. (2.7), in order to verify the Hypotheses (H1), (H2) and (H3) in the next section.
Let us give a rigorous definition of the boundary conditions. We denote Λ′ = {(k1, k2, …, kn) : −1 ≤ kℓ ≤ Nℓ, 1 ≤ ℓ ≤ n}, and define the function f from Λ′ to Λ such that f((k1, k2, …, kn)) = (j1, j2, …, jn) with, for each ℓ in {1, …, n}:
if 0 ≤ kℓ ≤ Nℓ − 1, then jℓ = kℓ;
if kℓ = −1, then jℓ = 0;
if kℓ = Nℓ, then jℓ = Nℓ − 1.
Then, our discretization scheme can be written at each point i of Λ, even if i is a boundary point:
Δ(u)i = κ ( Σ1≤r≤n γr (1/|Si(r)|) Σk∈Si(r) uf(k) − ui ),   (6.5)

where the stencil sets Si(r) are defined as in Section 6.1, but with points of Λ′ allowed, i.e. Si(r) = {k ∈ Λ′ : ||i − k||L∞ = 1 and ||i − k||L1 = r}.
Now, given two points i and j of Λ and an integer r in {1, …, n}, we denote Sij(r) the set of points defined by

Sij(r) = {k ∈ Si(r) : f(k) = j}.   (6.6)
Then, for two points i and j of Λ, we set:
λij = Σ1≤r≤n γr |Sij(r)| / |Si(r)|.   (6.7)
It is clear that at each point i of Λ and for every displacement field u:
Σj∈Λ λij uj = Σ1≤r≤n γr (1/|Si(r)|) Σk∈Si(r) uf(k).   (6.8)
Thus, from Eq. (6.5), our discretization scheme can be written as in Eqs. (2.6) and (2.7): Δ(u)i = κ (M(u)i − ui) with M(u)i = Σj∈Λ λij uj.
In Fig. 2, as a complement to the example of Fig. 1, the stencil Si of the boundary point i (located at a corner of the lattice Λ) is composed of the points {a, b, c, d, e, f, g, h}. As for Fig. 1, Si(1) consists of the four points of the stencil at L1-distance 1 from i and Si(2) of the four diagonal points; here, however, some of these stencil points lie outside Λ (in Λ′) and are mapped back onto Λ by the function f. Based on the definition of Eq. (6.6), a set Sij(r) may thus contain more than one stencil point, so that several coefficients λij receive contributions from two or more stencil points. The corresponding scheme of Horn and Schunck [10] is again Δ(u)i = 3 (M(u)i − ui), with κ = 3 and M given by Eqs. (6.7) and (2.7). Again, the coefficient γ1 associated with Si(1) is 2/3, and the coefficient γ2 associated with Si(2) is 1/3. The case of a boundary point not located at the corner of Λ (such as the point e in Fig. 2) can be treated similarly.
Fig. 2.
The 2-dimensional finite-difference stencil for the Laplacian calculation at a boundary (corner) point i.
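The weights λij of Eq. (6.7), including the boundary treatment through the map f, can be computed directly from their definition. The following sketch does so for an arbitrary lattice shape; the helper name lambda_matrix and the clipping implementation of f are illustrative:

    import numpy as np
    from math import comb
    from itertools import product

    def lambda_matrix(shape, gamma):
        """N x N matrix of the weights lambda_ij of Eq. (6.7) for a lattice of the given shape.
        gamma : coefficients gamma_1, ..., gamma_n of Eq. (6.3).
        Stencil points outside the lattice are mapped back by the function f of Section 6.2
        (each coordinate is clipped to its admissible range)."""
        n = len(shape)
        points = list(product(*[range(N_l) for N_l in shape]))
        index = {p: a for a, p in enumerate(points)}
        offsets = [k for k in product((-1, 0, 1), repeat=n) if any(k)]
        size = {r: 2**r * comb(n, r) for r in range(1, n + 1)}   # |S_i^(r)|, independent of i
        lam = np.zeros((len(points), len(points)))
        for i, p in enumerate(points):
            for k in offsets:
                r = sum(abs(c) for c in k)
                q = tuple(min(max(pl + kl, 0), Nl - 1) for pl, kl, Nl in zip(p, k, shape))
                lam[i, index[q]] += gamma[r - 1] / size[r]
        return lam

    lam = lambda_matrix((3, 3), gamma=[2/3, 1/3])     # the HS coefficients in dimension 2
    print(round(lam[0, 0], 4), lam[4, 4])             # 0.4167 0.0: lambda_ii > 0 at a corner, 0 at the interior point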
7. Verification of the hypotheses
We now have to verify that the general n-dimensional scheme described in Section 6 fulfills the hypotheses of Section 4:
-
(H1)
For all points i and j of Λ, λij = λji.
-
(H2)
At every point i of Λ, Σj∈Λ λij = 1.
-
(H3)
The graph G defined in Section 4 is connected.
Proposition 7.1
With the discretization scheme of Section 6, (H1) is satisfied.
Proof
Let i and j be two points of Λ, and r be an integer in {1, …, n}. As in Subsection 6.2, we denote Sij(r) the set of points k belonging to Si(r) and satisfying f(k) = j. By definition, a point k belongs to Sij(r) if and only if:
||k − i||L∞ = 1;
||k − i||L1 = r;
f(k) = j.
Let us now define the function gij from ℤn to ℤn by gij(k) = k + i − j. We will show that gij(Sij(r)) ⊂ Sji(r). We denote i = (i1, i2, …, in) and j = (j1, j2, …, jn). Let k = (k1, k2, …, kn) be a point of Sij(r), and ℓ be an integer in {1, …, n}.
If kℓ = −1 then jℓ = 0 (because f(k) = j) and iℓ = 0 (because ||k − i||L∞ ≤ 1 and i belongs to Λ), so that kℓ + iℓ − jℓ = −1.
If kℓ = Nℓ, similarly, jℓ = iℓ = Nℓ − 1 and kℓ + iℓ − jℓ = Nℓ.
In the other cases, kℓ = jℓ (because f(k) = j), so that kℓ + iℓ − jℓ = iℓ.
First, from the definition of the function f, the three previous cases applied to each coordinate ℓ of {1, …, n} yield f(gij(k)) = f(k + i − j) = i. Moreover, in each of these three cases, it is clear that |(kℓ + iℓ − jℓ) − jℓ| = |kℓ − iℓ| (either because jℓ = iℓ or because kℓ = jℓ). Thus, from ||k − i||L∞ ≤ 1 and ||k − i||L1 = r, we obtain ||gij(k) − j||L∞ = 1 and ||gij(k) − j||L1 = r. This permits us to conclude that gij(Sij(r)) ⊂ Sji(r).
Now, as gij is a translation, it is injective, so that |Sij(r)| ≤ |Sji(r)|. Then, as we did not impose any hypothesis about i and j, we can exchange them and write |Sji(r)| ≤ |Sij(r)|, so that |Sij(r)| = |Sji(r)|. Finally, since |Si(r)| = |Sj(r)| does not depend on the point, Eq. (6.7) imposes that λij = λji, so that (H1) is verified.
Proposition 7.2
With the discretization scheme of Section 6, (H2) is satisfied.
Proof
Let i be a point of Λ. Eq. (6.8) applied to a displacement field u that is uniform and different from zero yields (Σj∈Λ λij) ui = (Σ1≤r≤n γr) ui. Then, Eq. (6.4) imposes Σj∈Λ λij = 1, so that (H2) is verified.
Proposition 7.3
With the discretization scheme of Section 6, (H3) is satisfied.
Proof
Let i be a point of the image lattice Λ. By hypothesis, we have that γ1 ≠ 0. So, from Eq. (6.7), we have that λij ≠ 0 for two close neighbors i and j of Λ, i.e. such that ||i − j||L1 = 1. Indeed, in that case, j belongs to Sij(1), so that λij ≥ γ1 / |Si(1)| > 0. So, two close neighbors are always linked in G, and the connectedness of G follows immediately, since any two points of Λ can be joined by a chain of close neighbors.
So, (H1), (H2) and (H3) are fulfilled with the discretization scheme of Section 6, and we are under the conditions of Theorem 4.1.
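These hypotheses can also be checked numerically: (H1) is the symmetry of the matrix (λij), (H2) states that its rows sum to 1, and (H3) is the connectedness of the graph whose edges are the non-zero entries of (λij). A short sketch, reusing the hypothetical lambda_matrix helper introduced after Section 6.2:

    import numpy as np
    from collections import deque

    def graph_is_connected(lam, tol=1e-12):
        """Breadth-first search on the graph G with edges {(i, j) : lambda_ij != 0}."""
        N = lam.shape[0]
        seen = np.zeros(N, dtype=bool)
        seen[0] = True
        queue = deque([0])
        while queue:
            i = queue.popleft()
            for j in np.nonzero(lam[i] > tol)[0]:
                if not seen[j]:
                    seen[j] = True
                    queue.append(j)
        return bool(seen.all())

    lam = lambda_matrix((4, 3), gamma=[2/3, 1/3])     # helper sketched after Section 6.2
    print(np.allclose(lam, lam.T))                    # (H1): True
    print(np.allclose(lam.sum(axis=1), 1.0))          # (H2): True
    print(graph_is_connected(lam))                    # (H3): True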
8. Conclusion
The proposed convergence result was shown using a general definition of the discrete Laplacian. That definition includes the classical scheme of Horn and Schunck in dimension 2, as well as a general scheme (see Section 6) for n-dimensional Laplacians. In this context, a necessary and sufficient condition for the problem to be well-posed (i.e., to have a unique solution) is that the intensity gradients are not all contained in the same hyperplane. Under that condition, the Horn and Schunck iterations converge to the solution. It was also shown that the convergence of the HS iterative scheme implies the convergence of the Gauss-Seidel and SOR solvers for the HS problem.
Acknowledgments
Financial support for this project was obtained from the Canadian Institutes of Health Research (Operating grant MOP-106465) and the Natural Sciences and Engineering Research Council of Canada (Discovery grant 138570-2011 and Strategic grant STPGP-381136-09). The authors are grateful to the anonymous reviewers for their comments that helped improve the presentation and the content of this work.
Appendix A
Here are presented the details of the derivation of Equation (2.9) in dimension n ≥ 1. From Equations (2.6) and (2.8), Equation (2.5) is equivalent to:

(∇Ii ∇IiT + α In) ui = α M(u)i − It,i ∇Ii,   (A.1)

where In denotes the n × n identity matrix. Let us now notice that:

(∇Ii ∇IiT)(∇Ii ∇IiT) = ||∇Ii||² ∇Ii ∇IiT.   (A.2)

Thus,

(∇Ii ∇IiT + α In)(In − ∇Ii ∇IiT / (α + ||∇Ii||²)) = ∇Ii ∇IiT + α In − (||∇Ii||² + α) ∇Ii ∇IiT / (α + ||∇Ii||²) = α In.   (A.3)

So,

(∇Ii ∇IiT + α In)⁻¹ = (1/α) (In − ∇Ii ∇IiT / (α + ||∇Ii||²)) = (1/α) Pi.   (A.4)

We also have:

Pi ∇Ii = (In − ∇Ii ∇IiT / (α + ||∇Ii||²)) ∇Ii = α ∇Ii / (α + ||∇Ii||²).   (A.5)

Now, from Eqs. (A.4) and (A.5), the expression (A.1) can be rewritten:

ui = (1/α) Pi (α M(u)i − It,i ∇Ii) = Pi M(u)i − It,i ∇Ii / (α + ||∇Ii||²) = Pi M(u)i + di.   (A.6)

This equality leads us to write the general HS iterations for an n-dimensional image:

uik+1 = Pi M(uk)i + di, for i ∈ Λ.   (A.7)
With the notation introduced in Section 2, we thus obtain Equation (2.9).
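The key identity (A.4), namely that (∇Ii ∇IiT + α In) has inverse (1/α) Pi, can be confirmed numerically for an arbitrary gradient vector; a short check (the numerical values are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n, alpha = 3, 5.0
    g = rng.normal(size=n)                                    # plays the role of the gradient at one point
    P_i = np.eye(n) - np.outer(g, g) / (alpha + g @ g)        # the matrix P_i of Section 2
    inv = np.linalg.inv(np.outer(g, g) + alpha * np.eye(n))   # inverse computed directly
    print(np.allclose(inv, P_i / alpha))                      # True: this is Eq. (A.4)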
Appendix B
We discuss the condition of block diagonally dominant matrices in the context of the Jacobi solver for the HS problem. We refer the reader to [11, 1] for results on the convergence of the Jacobi method for strictly diagonally dominant matrices or irreducible and weakly diagonally dominant matrices, as well as [7] for the corresponding notions in the case of block matrices.
Firstly, using Eqs. (2.5), (2.6), (2.7) and (2.8), we observe that Eq. (2.5) can be rewritten in the form (see Eq. (A.1))
(∇Ii ∇IiT + α In) ui − α Σj∈Λ λij uj = −It,i ∇Ii, for i ∈ Λ, i.e. A u = b with bi = −It,i ∇Ii.   (B.1)
Let Aij, for i, j ∈ Λ, be the n × n matrices defined by:
Aij = −α λij In, for j ≠ i,   (B.2)

Aii = ∇Ii ∇IiT + α (1 − λii) In.   (B.3)
Then, the Jacobi iteration is expressed as:
uik+1 = Aii⁻¹ ( bi − Σj≠i Aij ujk ) = Aii⁻¹ ( α Σj≠i λij ujk − It,i ∇Ii ).   (B.4)
Lemma B.1
Assume that 0 ≤ λii < 1. Then, the inverse matrix of the block Aii is equal to (1/α′)(In − ∇Ii ∇IiT / (α′ + ||∇Ii||²)), where α′ = α(1 − λii).
Proof
The Lemma follows directly from Eq. (A.4) upon replacing α by α′ = α(1 − λii).
So, let i be a point in the interior of Λ. From Section 6, λii is then equal to 0. Then, Aii⁻¹ = (1/α) Pi and the Jacobi iteration for the point i of Eq. (B.4) reads as:

uik+1 = (1/α) Pi ( α Σj≠i λij ujk − It,i ∇Ii )   (B.5)

= Pi M(uk)i + di,   (B.6)

which amounts to the HS iteration (2.9). On the other hand, since λii ≠ 0 if i is a boundary point, the Jacobi iteration is never the HS iteration at boundary points.
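The agreement at interior points and the disagreement at boundary points can be observed numerically by performing one update of each scheme on the same displacement field. A self-contained sketch for a 4 × 4 image with the simple 4-neighbor average (the case w1 = 1); the helper names and numerical values are illustrative:

    import numpy as np

    rng = np.random.default_rng(2)
    shape, n = (4, 4), 2
    alpha = 30.0                      # alpha = mu * kappa; its exact value is irrelevant for this comparison
    I = rng.random(shape)
    grads = np.stack(np.gradient(I), axis=-1).reshape(-1, n)
    It = rng.random(shape).reshape(-1)                 # arbitrary temporal derivative
    pts = [(a, b) for a in range(shape[0]) for b in range(shape[1])]
    N = len(pts)
    lam = np.zeros((N, N))                             # 4-neighbor average, copied boundary values
    for i, (a, b) in enumerate(pts):
        for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (min(max(a + da, 0), shape[0] - 1), min(max(b + db, 0), shape[1] - 1))
            lam[i, pts.index(q)] += 0.25
    u = rng.normal(size=(N, n))                        # arbitrary current iterate

    def hs_step(u):
        Mu = lam @ u                                   # M(u)_i = sum_j lambda_ij u_j
        coeff = (np.sum(grads * Mu, axis=1) + It) / (alpha + np.sum(grads**2, axis=1))
        return Mu - coeff[:, None] * grads             # P_i M(u)_i + d_i, i.e. one HS update

    def jacobi_step(u):
        new = np.empty_like(u)
        for i in range(N):
            A_ii = np.outer(grads[i], grads[i]) + alpha * (1 - lam[i, i]) * np.eye(n)   # Eq. (B.3)
            rhs = -It[i] * grads[i] + alpha * (lam[i] @ u - lam[i, i] * u[i])           # b_i - sum_{j != i} A_ij u_j
            new[i] = np.linalg.solve(A_ii, rhs)
        return new

    interior, corner = pts.index((1, 1)), pts.index((0, 0))
    print(np.allclose(hs_step(u)[interior], jacobi_step(u)[interior]))   # True: same update at an interior point
    print(np.allclose(hs_step(u)[corner], jacobi_step(u)[corner]))       # False: the updates differ at a boundary point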
Let ||B|| be the norm of an n × n matrix B defined by ||B|| = supx≠0 ||Bx|| / ||x||, based on any norm || · || of ℝn.
Lemma B.2
Let n ≥ 2 and consider a non-zero vector ui in ℝn that is orthogonal to ∇Ii. Then, Aii ui = α(1 − λii) ui. Therefore, ||Aii⁻¹|| ≥ 1/(α(1 − λii)), no matter the norm used on ℝn.
Proof
This result follows directly from the proof of Lemma 5.2.
Lemma B.3
If n ≥ 2 and hypothesis (H2) is fulfilled, then ||Aii⁻¹||⁻¹ ≤ Σj≠i ||Aij||, for any i, no matter the norm used on ℝn.
Proof
From the definition (B.2) of Aij, j ≠ i, we have Σj≠i ||Aij|| = α Σj≠i λij. Then, from Lemmas B.1 and B.2 and hypothesis (H2), we have ||Aii⁻¹||⁻¹ ≤ α(1 − λii) = α Σj≠i λij = Σj≠i ||Aij||.
From Lemma B.3, one concludes that the matrix A is never weakly (or strictly) block diagonally dominant if n ≥ 2 under hypothesis (H2) (see footnote 1). On the other hand, if one uses the Euclidean norm on ℝn, one can easily show that ||Aii⁻¹||⁻¹ = Σj≠i ||Aij|| for any i, because ||Aii⁻¹|| = 1/(α(1 − λii)) for that norm. So, in that case, A is block diagonally dominant (i.e. ||Aii⁻¹||⁻¹ ≥ Σj≠i ||Aij|| for any i) but the inequality is never strict.
Next, we show that the matrix A defined by Eqs. (B.2) and (B.3) is not diagonally dominant if n ≥ 2 (here, the matrix is not viewed as a block matrix), except in very special cases. We first treat the case n = 2. The absolute values of the diagonal elements of the matrix Aii are equal to α(1 − λii) + Ix,i² and α(1 − λii) + Iy,i², whereas the sums of the absolute values of the off-diagonal elements of the corresponding rows of the matrix A are equal to Σj≠i αλij + |Ix,i||Iy,i| and Σj≠i αλij + |Iy,i||Ix,i|. Using the identity Σj λij = 1, diagonal dominance is then equivalent to Ix,i² ≥ |Ix,i||Iy,i| and Iy,i² ≥ |Iy,i||Ix,i|, which implies that |Ix,i| = |Iy,i| for each i such that Ix,i and Iy,i are both different from 0. This is a very special case, so that the assertion that the matrix A is diagonally dominant (in general) is false. If n > 2, the absolute values of the diagonal elements of the matrix Aii are equal to α(1 − λii) + Ixℓ,i² for 1 ≤ ℓ ≤ n, whereas the sums of the absolute values of the off-diagonal elements of the corresponding rows of the matrix A are equal to Σj≠i αλij + |Ixℓ,i| Σℓ′≠ℓ |Ixℓ′,i| for 1 ≤ ℓ ≤ n. Therefore, the diagonal dominance of A implies that Ixℓ,i² ≥ |Ixℓ,i| Σℓ′≠ℓ |Ixℓ′,i| for 1 ≤ ℓ ≤ n, which can hold only if, at each point i, at most two components of ∇Ii are different from 0 and, when exactly two are non-zero, they have the same absolute value. So, again, the assertion that the matrix A is diagonally dominant (in general) is false. Therefore, it appears that the short argument given in [23, p. 249] for the convergence of the pointwise Jacobi method is erroneous.
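This failure of diagonal dominance is easy to observe numerically by assembling the matrix A of Eq. (B.1) for a small image; in the sketch below, the 4-neighbor average is used for the weights λij and the construction is illustrative:

    import numpy as np

    rng = np.random.default_rng(3)
    shape, n, alpha = (4, 4), 2, 30.0
    I = rng.random(shape)
    grads = np.stack(np.gradient(I), axis=-1).reshape(-1, n)
    N = grads.shape[0]
    pts = [(a, b) for a in range(shape[0]) for b in range(shape[1])]
    lam = np.zeros((N, N))                       # 4-neighbor average with copied boundary values
    for i, (a, b) in enumerate(pts):
        for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (min(max(a + da, 0), shape[0] - 1), min(max(b + db, 0), shape[1] - 1))
            lam[i, pts.index(q)] += 0.25
    # Full (n N) x (n N) matrix A: off-diagonal blocks from Eq. (B.2), diagonal blocks from Eq. (B.3).
    A = -alpha * np.kron(lam, np.eye(n))
    for i in range(N):
        A[n*i:n*(i+1), n*i:n*(i+1)] = np.outer(grads[i], grads[i]) + alpha * (1 - lam[i, i]) * np.eye(n)
    off = np.sum(np.abs(A), axis=1) - np.abs(A.diagonal())
    row_ok = np.abs(A.diagonal()) >= off
    print(int(row_ok.sum()), "of", A.shape[0], "rows satisfy the dominance inequality")
    print(bool(row_ok.all()))                    # False: A is not diagonally dominant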
Remarks
The HS iterative scheme amounts to the Jacobi iterative scheme at the interior points of the image, but never at its boundary points. Nevertheless, we believe that it is usually the HS scheme that is implemented rather than the Jacobi method. Indeed, it is easy to implement (cf. the end of Section 6.1), it is still fully parallelizable, and it is the original method proposed by Horn and Schunck. The difference between the two schemes is due to the Neumann boundary conditions (because then, λii ≠ 0 at a boundary point).
The Neumann boundary conditions (2.4) that come from the unconstrained minimization problem, are very important. In particular, they imply that the Laplacian of a uniform displacement field vanishes, i.e. Δ(u)i = κ(M (u)i − ui) = κ(Σj∈Λ λij − 1)ui, so that we must have Σj∈Λ λij = 1.
Due to this condition, known convergence results for the (block) Jacobi and Gauss-Seidel methods do not apply, unless n = 1. The result [1, Theorem 1, (a)] assumes that the matrix A is strictly diagonally dominant, which is not the case here. Also, the result [1, Theorem 1, (b)] assumes that A is irreducible and weakly diagonally dominant, which is not the case either. Note that one can generalize [1, Theorem 1] using the notion of block diagonally dominant matrices [7]; namely, one can prove along the lines of [1] that if A is strictly block diagonally dominant, or if it is block irreducible and weakly block diagonally dominant, then both the block Jacobi and Gauss-Seidel solvers converge. But again, these hypotheses never hold for the HS problem unless n = 1.
On the other hand, if one wants to relax the boundary condition (2.4) and allow Σj∈Λ λij < 1 at a boundary point, then one can show that A is weakly block diagonally dominant for the Euclidean norm and block irreducible (based on the connectedness of the graph G, i.e. hypothesis (H3)), so that both the block Jacobi and Gauss-Seidel solvers then converge. This may happen if one considers a minimization problem with constraints, for instance if the displacement is known at some points of the image.
Appendix C
In this appendix, we discuss the implications of Theorem 4.1 (i.e., the convergence of the HS method) on the convergence of the Gauss-Seidel and SOR iterative schemes through the property of positive definiteness of the coefficient matrix of the HS problem. We also present a more general result that states conditions under which the convergence of the Gauss-Seidel and SOR methods is implied by the convergence of the Jacobi method. In the sequel, ρ(A) denotes the spectral radius of a square matrix A.
Proposition C.1
Let B̃ and C̃ be real symmetric matrices of same dimensions such that B̃ is positive definite and ρ(B̃−1C̃) < 1. Then, the matrix B̃ + C̃ is symmetric positive definite.
Proof
Since the matrix B̃ is symmetric positive definite, it can be expressed in the form LL, where L is a symmetric invertible matrix. Indeed, one can write B̃ = RΨRT, where RRT = I (the identity matrix) and Ψ is a diagonal positive definite matrix; thus, B̃ = LL, where L = RΨ1/2RT. Then, A = B̃ + C̃ = L(I + L⁻¹C̃L⁻¹)L. Since L is symmetric and invertible, A is positive definite if and only if the symmetric matrix A′ = I + L⁻¹C̃L⁻¹ is positive definite. Now, one has that ρ(L⁻¹C̃L⁻¹) = ρ(L⁻¹L⁻¹C̃L⁻¹L) = ρ(B̃⁻¹C̃) < 1. Therefore, the real symmetric matrix L⁻¹C̃L⁻¹ can be written as QTΛQ, where QTQ = I and Λ is a diagonal matrix such that ρ(Λ) = ρ(L⁻¹C̃L⁻¹) < 1. It follows that A′ = QT(I + Λ)Q, where I + Λ is a diagonal positive definite matrix (because any eigenvalue λ of Λ is such that |λ| < 1). Thus, A′ is a symmetric positive definite matrix, and so is A.
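Proposition C.1 can be checked numerically on randomly generated matrices: draw a symmetric positive definite B̃ and a symmetric C̃, rescale C̃ so that ρ(B̃⁻¹C̃) < 1, and verify that B̃ + C̃ is positive definite. A short sketch (the construction is arbitrary):

    import numpy as np

    rng = np.random.default_rng(4)
    m = 6
    X = rng.normal(size=(m, m))
    B = X @ X.T + m * np.eye(m)                   # symmetric positive definite
    C = rng.normal(size=(m, m))
    C = 0.5 * (C + C.T)                           # symmetric
    C *= 0.9 / max(abs(np.linalg.eigvals(np.linalg.solve(B, C))))   # enforce rho(B^-1 C) = 0.9 < 1
    rho = max(abs(np.linalg.eigvals(np.linalg.solve(B, C))))
    print(round(float(rho), 3), bool(np.linalg.eigvalsh(B + C).min() > 0))   # 0.9 True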
Corollary C.2
Let Ax = b be a linear system, where A is a real symmetric matrix. Let A be written in the form D − B − C, where D, B and C are block diagonal, block upper triangular and block lower triangular matrices, respectively. Assume that D is positive definite. Then, the convergence of the Jacobi iterative scheme xk+1 = D⁻¹((B + C)xk + b) implies the convergence of the Gauss-Seidel and the SOR iterative schemes. In fact, the matrix A is positive definite under the assumptions.
Proof
Let B̃ = D and C̃ = −B − C. The convergence of the Jacobi iterative scheme is equivalent to ρ(D−1(B + C)) < 1. Thus, from Proposition C.1, the matrix A is positive definite. Henceforth, the Gauss-Seidel and the SOR methods converge; see for instance [4, Theorem 5.3-2].
Corollary C.3
Under Hypotheses (H1), (H2) and (H3), assume that the rank of (∇Ii) is n. Then, the coefficient matrix A of Eq. (B.1) (with blocks defined by Eqs. (B.2) and (B.3)) is symmetric positive definite. In particular, the Gauss-Seidel and the SOR iterative schemes converge under these conditions.
Proof
Let B̃ = αP⁻¹ and C̃ = −αM, where P and M are as in Eq. (2.9). Then, B̃ is the block diagonal matrix with diagonal blocks αPi⁻¹ = ∇Ii ∇IiT + α In, as follows from Appendix A. Moreover, the eigenvalues of ∇Ii ∇IiT + α In are α with multiplicity n − 1 and α + ||∇Ii||² with multiplicity 1. Thus, the symmetric matrix B̃ is positive definite. Also, Theorem 4.1 implies that ρ(B̃⁻¹C̃) = ρ(PM) < 1. Finally, A = αP⁻¹ − αM = B̃ + C̃, using Appendix A. The statement on the positive definiteness of the matrix A now follows from Proposition C.1, since M is symmetric. Hence, the Gauss-Seidel and the SOR iterative schemes converge under these conditions, as in the proof of Corollary C.2.
Remark
The positive definiteness of the coefficient matrix of the HS problem has been proved directly in [17]. Moreover, as mentioned in Section 1, the V-ellipticity of the HS functional [18] implies the positive definiteness of the coefficient matrix of the HS problem. Thus, Corollary C.3 is not a new result. However, the more general result Corollary C.2 might be of interest to further understand the convergence of the Jacobi, Gauss-Seidel and SOR methods.
Appendix D
In this appendix, we give more details to explain why we think the proofs presented in [17, 13] are erroneous. We show that the matrix “P” of [17, Eq. (9)] (denoted here P* to avoid confusion with the linear transformation P of Eq. (2.9)) is not contracting for the norm defined by [17, Eq. (10)], for any non-uniform image. Indeed, let i0 be a point where ∇Ii0 ≠ 0. We consider the displacement field u defined by u2i−1 = Iy,i0 and u2i = −Ix,i0 if i ∈ Ni0 (the set of 4-neighbors of i0), and u2i−1 = u2i = 0 otherwise. The norm defined in [17, Eq. (10)] is denoted || · ||* here to avoid any confusion. With this displacement field, the non-zero components of u are equal to ±Ix,i0 and ±Iy,i0, and we find that P*(u)2i0−1 = Iy,i0 and P*(u)2i0 = −Ix,i0: the values of u at the neighbors of i0 are reproduced by P*(u) at the point i0 itself. Therefore, ||P*(u)||* ≥ ||u||*. Thus, P* is not contracting, due to this counter-example. We think that the error occurred in [17, formula (13)]: a factor ci should be added in the second member to take into account that the sum in the first term includes all the neighbors of i. Thus, in the inequality [17, formula (15)], one should use a factor involving the coefficients ci instead of 1, which makes that proof break down.
In [13, Eq. (20)], the Laplacian corresponding to the Neumann boundary conditions (which usually correspond to the HS problem) is denoted L2. The matrix N2 is defined by the relation N2(u) = L2(u) + u (cf. [13, Eq. (22)]; see footnote 2). Since that Laplacian operator vanishes on uniform displacement fields, any such displacement field is an eigenvector of the matrix N2 for the eigenvalue 1. Therefore, the assertion after [13, Eq. (23)] that the spectral radius ρ(N2) of the matrix N2 (i.e. the maximal modulus of the eigenvalues of N2) is less than 1 is erroneous. Incidentally, in [13, formula (22)], a factor is missing to get a correct expression of the average. In [13, formulas (38) and (40)], the authors also assert that ρ(Id − F⁻¹Diag(Sij)) < 1. But, at every point, the determinant of the 2 × 2 matrix Sij is null (indeed, Sij = [∇I∇IT]i has rank at most 1). Then, the matrices Sij are singular, and so is F⁻¹Diag(Sij). Thus, 1 is an eigenvalue of Id − F⁻¹Diag(Sij), and so, the assertion is flawed. Hence, the two main intermediate results of [13] are both erroneous.
Footnotes
1. Recall that a matrix A is weakly (or strictly) block diagonally dominant if ||Aii⁻¹||⁻¹ ≥ Σj≠i ||Aij|| for any i and if that inequality is strict for some (or any, respectively) i.
2. In our notation, “(i, j)” corresponds to a point i and “Sij” corresponds to the n × n matrix [∇I∇IT]i. “Id” corresponds to the n × n identity matrix In.
References
- 1. Bagnara R. A unified proof for the convergence of Jacobi and Gauss-Seidel methods. SIAM Review. 1995;37:93–97.
- 2. Baker S, Scharstein D, Lewis J, Roth S, Black M, Szeliski R. A database and evaluation methodology for optical flow. International Journal of Computer Vision. 2011;92:1–31.
- 3. Chambolle A, Pock T. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision. 2011;40:120–145.
- 4. Ciarlet PG. Introduction à l'analyse matricielle et à l'optimisation. Masson; Paris: 1982.
- 5. Duan Q, Angelini ED, Lorsakul A, Homma S, Holmes JW, Laine AF. Coronary occlusion detection with 4D optical flow based strain estimation on 4D ultrasound. In: Ayache N, Delingette H, Sermesant M, editors. Functional Imaging and Modeling of the Heart, vol. 5528 of Lecture Notes in Computer Science. Springer; Berlin Heidelberg: 2009. pp. 211–219.
- 6. Evans DJ. Parallel S.O.R. iterative methods. Parallel Computing. 1984;1:3–18.
- 7. Feingold DG, Varga RS. Block diagonally dominant matrices and generalizations of the Gerschgorin circle theorem. Pacific Journal of Mathematics. 1962;12:1241–1250.
- 8. Franjic I, Khalid S, Pecaric J. On the refinements of the Jensen-Steffensen inequality. Journal of Inequalities and Applications. 2011;12:1–11.
- 9. Guerrero T, Zhang G, Huang TC, Lin KP. Intrathoracic tumour motion estimation from CT imaging using the 3D optical flow method. Physics in Medicine and Biology. 2004;49:41–47. doi: 10.1088/0031-9155/49/17/022.
- 10. Horn BKP, Schunck BG. Determining optical flow. Artificial Intelligence. 1981;17:185–203.
- 11. James KR. Convergence of matrix iterations subject to diagonal dominance. SIAM Journal on Numerical Analysis. 1973;10:478–484.
- 12. Jensen J. Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta Mathematica. 1906;30:175–193.
- 13. Kameda Y, Imiya A, Ohnishi N. A convergence proof for the Horn-Schunck optical-flow computation scheme using neighborhood decomposition. Lecture Notes in Computer Science. 2008;4958:262–273.
- 14. Kirisits C, Lang LF, Scherzer O. Optical flow on evolving surfaces with an application to the analysis of 4D microscopy data. In: Scale Space and Variational Methods in Computer Vision. Springer; 2013. pp. 246–257.
- 15. Kumar A. A discretization of the n-dimensional Laplacian for a dimension-independent stability limit. Proceedings of the Royal Society of London. 2001;457:2667–2674.
- 16. Liu T, Shen L. Fluid flow and optical flow. Journal of Fluid Mechanics. 2008;614:253–291.
- 17. Mitiche A, Mansouri A. On convergence of the Horn and Schunck optical flow method. IEEE Transactions on Image Processing. 2004;13:848–852. doi: 10.1109/tip.2004.827235.
- 18. Schnörr C. Determining optical flow for irregular domains by minimizing quadratic functionals of a certain class. International Journal of Computer Vision. 1991;6:25–38.
- 19. Schwartz L. Cours d'analyse. Hermann; Paris: 1967.
- 20. Thiyagalingam J, Goodman D, Schnabel JA, Trefethen A, Grau V. On the usage of GPUs for efficient motion estimation in medical image sequences. International Journal of Biomedical Imaging. 2011:1–15. doi: 10.1155/2011/137604.
- 21. Treibig J, Wellein G, Hager G. Efficient multicore-aware parallelization strategies for iterative stencil computations. Journal of Computational Science. 2011;2:130–137.
- 22. Tsai DM, Tsai HY. Low-contrast surface inspection of mura defects in liquid crystal displays using optical flow-based motion analysis. Machine Vision and Applications. 2011;22:629–649.
- 23. Weickert J, Schnörr C. Variational optic flow computation with a spatio-temporal smoothness constraint. Journal of Mathematical Imaging and Vision. 2001;14:245–255.
- 24. Wildes RP, Amabile MJ, Lanzillotto AM, Leu TS. Recovering estimates of fluid flow from image sequence data. Computer Vision and Image Understanding. 2000;80:246–266.
- 25. Young DM. Iterative Solution of Large Linear Systems. Courier Dover Publications; 2003.