Local Polynomial Regression for Symmetric Positive Definite Matrices

Ying Yuan; Hongtu Zhu; Weili Lin; J S Marron

doi:10.1111/j.1467-9868.2011.01022.x

Author manuscript; available in PMC: 2013 Sep 1.

Published in final edited form as: J R Stat Soc Series B Stat Methodol. 2012 Mar 16;74(4):697–719. doi: 10.1111/j.1467-9868.2011.01022.x


PMCID: PMC3448376 NIHMSID: NIHMS336025 PMID: 23008683

Summary

Local polynomial regression has received extensive attention for the nonparametric estimation of regression functions when both the response and the covariate are in Euclidean space. However, little has been done when the response is in a Riemannian manifold. We develop an intrinsic local polynomial regression estimate for the analysis of symmetric positive definite (SPD) matrices as responses that lie in a Riemannian manifold with covariate in Euclidean space. The primary motivation and application of the proposed methodology is in computer vision and medical imaging. We examine two commonly used metrics, including the trace metric and the Log-Euclidean metric on the space of SPD matrices. For each metric, we develop a cross-validation bandwidth selection method, derive the asymptotic bias, variance, and normality of the intrinsic local constant and local linear estimators, and compare their asymptotic mean square errors. Simulation studies are further used to compare the estimators under the two metrics and to examine their finite sample performance. We use our method to detect diagnostic differences between diffusion tensors along fiber tracts in a study of human immunodeficiency virus.

1. Introduction

Symmetric positive-definite (SPD) matrix-valued data occur in a wide variety of important applications. For instance, in computational anatomy, a SPD deformation tensor (JJ^T)^{1/2} is computed to capture the directional information of shape change encoded in the Jacobian matrices J at each location in an image (Grenander and Miller, 2007). In diffusion tensor imaging (Basser et al., 1994), a 3 × 3 SPD diffusion tensor, which tracks the effective diffusion of water molecules, is estimated at each voxel (a three-dimensional (3D) pixel) of an imaging space. In functional magnetic resonance imaging, a SPD covariance matrix is calculated to delineate functional connectivity between different neural assemblies involved in achieving a complex cognitive task or perceptual process (Fingelkurts et al., 2005). In classical multivariate statistics, a common research focus is to model and estimate SPD covariance matrices for multivariate measurements, longitudinal data, and time series data among many others (Pourahmadi, 2000; Anderson, 2003).

Despite the popularity of SPD matrix-valued data, only a handful of methods have been developed for the statistical analysis of SPD matrices as response variables in a Riemannian manifold. In the medical imaging literature (Fletcher and Joshi, 2007; Batchelor et al., 2005; Pennec et al., 2006), various image processing methods have recently been developed to segment, deform, interpolate, extrapolate and regularize diffusion tensor images (DTIs). Schwartzman (2006) proposed several parametric models for analyzing SPD matrices and derived the distributions of several test statistics for comparing differences between the means of the two (or multiple) groups of SPD matrices. Kim and Richards (2010) developed a nonparametric estimator for the common density function of a random sample of positive definite matrices. Zhu et al. (2009) developed a semi-parametric regression model with SPD matrices as responses in a Riemannian manifold and the covariates in a Euclidean space. Barmpoutis et al. (2007) and Davis et al. (2010) proposed tensor splines and local constant regressions for interpolating DTI tensor fields, but they did not address several important issues of analyzing random SPD matrices including the asymptotic properties of the nonparametric estimate proposed. All these methods for SPD matrices discussed above are based on the trace metric (or affine invariant metric) in the SPD space (Lang, 1999; Terras, 1988). Recently, Arsigny et al. (2007) proposed a Log-Euclidean metric and showed its excellent theoretical and computational properties. Dryden et al. (2009) compared various metrics of the space of SPD matrices and their properties.

To the best of our knowledge, this is the first paper to develop an intrinsic local polynomial regression (ILPR) model for estimating an intrinsic conditional expectation of a SPD matrix response, S, given a covariate vector x from a set of observations (x₁, S₁), ···, (x_n, S_n), where the x_i can be either univariate or multivariate. In practice, x can be the arc-length of a specific fiber tract (e.g., right internal capsule tract), the coordinates in the 3D imaging space, or demographic variables such as age. Important applications of ILPR include smoothing diffusion tensors along fiber tracts and smoothing diffusion and deformation tensor fields. Another application is quantifying the change of diffusion and deformation tensors as well as the inter-regional functional connectivity matrix across groups and over time.

Relative to the existing literature on the analysis of SPD matrices, we make several important contributions in this paper.

To account for the curved nature of the SPD space, we propose the ILPR method for estimating the intrinsic conditional expectation of random SPD responses given the covariate. We also derive an approximation of a cross-validation method for bandwidth selection.
Theoretically, we compare the trace metric and the Log-Euclidean metric and establish the asymptotic properties of the ILPR estimators corresponding to each metric.
Theoretically and numerically, we examine the effect that the use of different metrics has on statistical inference in the SPD space.

The rest of the paper is organized as follows. In Section 2, we develop the ILPR method and a cross-validated bandwidth method for nonparametric analysis of random SPD matrix-valued data. In Section 3, we compare the trace metric and the Log-Euclidean metric and derive their ILPR estimators. We investigate the asymptotic properties of the estimators proposed under the Log-Euclidean metric and the estimators under the trace metric in Sections 4.1 and 4.2, respectively. We examine the finite sample performance of the ILPR estimators via simulation studies in Section 5. We analyze a real data set to illustrate a real-world application of the proposed ILPR method in Section 6 before offering some concluding remarks in Section 7.

2. Intrinsic Local Polynomial Regression for SPD Matrices

In this section, we develop a general framework for using intrinsic local polynomial regression in the analysis of SPD matrices and will examine two examples in Section 3. Let Sym⁺(m) and Sym(m) be, respectively, the set of m × m SPD matrices and the set of m × m symmetric matrices with real entries. The space Sym(m) is a Euclidean space with the Frobenius metric (or Euclidean inner product) given by tr(A₁A₂) for any A₁, A₂ ∈ Sym(m), whereas Sym⁺(m) is a Riemannian manifold, which will be detailed below. There is a one-to-one correspondence between Sym(m) and Sym⁺(m) through matrix exponential and logarithm. For any matrix A ∈ Sym(m), its matrix exponential is given by $exp (A) = \sum_{k = 0}^{\infty} A^{k} / k! \in {Sym}^{+} (m)$ . Conversely, for any matrix S ∈ Sym⁺(m), there is a log(S) = A ∈ Sym(m) such that exp(A) = S.
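The correspondence between Sym(m) and Sym⁺(m) can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `sym_exp`, `spd_log` and `exp_series` are hypothetical and not taken from the paper. It computes exp and log through the eigendecomposition, which for symmetric matrices agrees with the power series above.

```python
import numpy as np

def sym_exp(A):
    """exp(A) for symmetric A, computed as V diag(exp(w)) V^T."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def spd_log(S):
    """log(S) for SPD S, computed as V diag(log(w)) V^T."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def exp_series(A, terms=40):
    """Truncated power series sum_k A^k / k!, used only as a cross-check."""
    out = np.zeros_like(A)
    term = np.eye(A.shape[0])           # A^0 / 0!
    for k in range(terms):
        out = out + term
        term = term @ A / (k + 1)       # becomes A^(k+1) / (k+1)!
    return out

# A symmetric matrix that is not positive definite (illustrative values).
A = np.array([[0.5, -1.0, 0.2],
              [-1.0, 0.0, 0.3],
              [0.2, 0.3, -2.0]])
S = sym_exp(A)                          # lands in Sym+(m)
eigenvalues_of_S = np.linalg.eigvalsh(S)
A_back = spd_log(S)                     # recovers A
```

The eigendecomposition route is chosen here because it keeps the result exactly symmetric in structure; a general-purpose matrix exponential routine would also work.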

Standard nonparametric regression models for responses in the Euclidean space estimate E(S|X = x). However, for a random S in a curved space, one cannot directly define the conditional expectation of S given X = x with the usual expectation in Euclidean space. We are interested in answering the following question.

(Q1) How do we define an intrinsic conditional expectation of S at each x, denoted by D(x), in Sym⁺(m)?

To appropriately define D(x), we review some basic facts about the geometrical structure of Sym⁺(m) near D(x) (Lang, 1999; Terras, 1988). See Figure 1 for a graphical illustration. We first introduce the tangent vector and tangent space at D(x) in Sym⁺(m). For a small scalar δ > 0, let C(t) be a differentiable map from (−δ, δ) to Sym⁺(m) passing through C(0) = D(x). A tangent vector at D(x) is defined as the derivative of the smooth curve C(t) with respect to t evaluated at t = 0. The set of all tangent vectors at D(x) forms the tangent space of Sym⁺(m) at D(x), denoted as T_{D(x)}Sym⁺(m), which can be identified with Sym(m). The tangent space T_{D(x)}Sym⁺(m) is equipped with an inner product 〈·, ·〉, called a Riemannian metric, which varies smoothly from point to point. For instance, one may use the Frobenius metric as a Riemannian metric. Two additional Riemannian metrics for Sym⁺(m) will be given in Section 3. For a given Riemannian metric, we can calculate 〈U, V〉 for any U and V in T_{D(x)}Sym⁺(m), and then the length of a smooth curve C(t): [t₀, t₁] → Sym⁺(m), which equals $\int_{t_{0}}^{t_{1}} \sqrt{\langle \dot{C} (t), \dot{C} (t) \rangle} d t$, where Ċ(t) is the derivative of C(t) with respect to t. A geodesic is a smooth curve on Sym⁺(m) whose tangent vector does not change in length or direction as one moves along the curve. For any U ∈ T_{D(x)}Sym⁺(m), there is a unique geodesic, denoted by γ_{D(x)}(t; U), whose domain contains [0, 1], such that γ_{D(x)}(0; U) = D(x) and γ̇_{D(x)}(0; U) = U. The Riemannian exponential mapping Exp_{D(x)}: T_{D(x)}Sym⁺(m) → Sym⁺(m) of the tangent vector U is defined as Exp_{D(x)}(U) = γ_{D(x)}(1; U). The inverse of the Riemannian exponential map, ${Log}_{D (x)} (\cdot) = {Exp}_{D (x)}^{- 1} (\cdot)$, is called the Riemannian logarithmic map from Sym⁺(m) to T_{D(x)}Sym⁺(m).
Finally, the shortest distance between two points D₁(x) and D₂(x) in Sym⁺(m) is called the geodesic distance between D₁(x) and D₂(x), denoted as g(D₁(x), D₂(x)), which satisfies

g (D_{1} (x), D_{2} (x))^{2} = \langle {Log}_{D_{1} (x)} (D_{2} (x)), {Log}_{D_{1} (x)} (D_{2} (x)) \rangle .

(1)

Fig. 1. Graphical illustration of the geometrical structure of Sym⁺(m) near D(x).

We define ℰ_D(X) to be Log_{D(X)}(S) in T_{D(X)}Sym⁺(m). Statistically, ℰ_D(X) can be regarded as the residual of S relative to D(X). Let vecs(C) = (c₁₁, c₂₁, c₂₂, ···, c_{m1}, ···, c_{mm})^T be an m(m + 1)/2 × 1 vector for any m × m symmetric matrix C = (c_ij). Thus, the intrinsic conditional expectation of S at X = x is defined as D(x) ∈ Sym⁺(m) such that

E \{{Log}_{D (X)} (S) ∣ X = x\} = O_{m},

(2)

where O_m is the m × m matrix with all elements zero and the expectation is taken componentwise with respect to the multivariate random vector vecs(Log_{D(x)}(S)). In fact, (2) characterizes intrinsic means (Bhattacharya and Patrangenaru, 2005).
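The half-vectorization vecs(·) used in (2) is easy to get wrong in code because of element ordering. A minimal sketch follows, assuming NumPy; `vecs` and `unvecs` are hypothetical helper names. It stacks the lower triangle row by row, matching the order (c₁₁, c₂₁, c₂₂, ···, c_{m1}, ···, c_{mm}).

```python
import numpy as np

def vecs(C):
    """Stack the lower triangle of a symmetric matrix row by row:
    (c11, c21, c22, ..., cm1, ..., cmm)."""
    rows, cols = np.tril_indices(C.shape[0])
    return C[rows, cols]

def unvecs(v):
    """Rebuild the symmetric m x m matrix from its m(m+1)/2 vecs entries."""
    m = int(round((np.sqrt(8 * len(v) + 1) - 1) / 2))
    C = np.zeros((m, m))
    rows, cols = np.tril_indices(m)
    C[rows, cols] = v
    C[cols, rows] = v                   # mirror into the upper triangle
    return C

C = np.array([[1.0, 2.0, 4.0],
              [2.0, 3.0, 5.0],
              [4.0, 5.0, 6.0]])
v = vecs(C)                             # length m(m+1)/2 = 6
```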

Suppose that (x_i, S_i), i = 1, ···, n is an independent and identically distributed random sample, where S_i ∈ Sym⁺(m). For notational simplicity, we focus on a univariate covariate throughout the paper. We are interested in using the observed data {(x_i, S_i), i = 1, ···, n} to estimate D(X) defined in (2) at each X = x₀. By ignoring the Riemannian metric introduced in T_{D(X)}Sym⁺(m), we can directly minimize a weighted least squares criterion based on the metric related to the regular Frobenius inner product, which is given by

L_{n} (D (x_{0})) = \sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) tr ({Log}_{D (x_{0})} {(S_{i})}^{2}) .

(3)

In (3), K_h(u) = K(u/h)h⁻¹, in which h is a positive scalar, and K(·) is a kernel function such as the Epanechnikov kernel (Fan and Gijbels, 1996; Wand and Jones, 1995). However, it is unclear whether the estimate, which minimizes L_n(D(x₀)), is truly consistent or not. Therefore, we are interested in solving the second question below.
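As a small illustration of the rescaled kernel K_h(u) = K(u/h)h⁻¹, the sketch below uses the Epanechnikov kernel K(u) = 0.75(1 − u²) on [−1, 1]. The function names and the bandwidth value are assumptions made for the example only.

```python
def epanechnikov(u):
    """K(u) = 0.75 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_h(u, h):
    """Rescaled kernel K_h(u) = K(u / h) / h."""
    return epanechnikov(u / h) / h

def integrate(f, a, b, n=20000):
    """Midpoint rule, accurate enough to check the normalisation."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

h = 0.3                                           # illustrative bandwidth
total_mass = integrate(lambda u: kernel_h(u, h), -h, h)

# Weights K_h(x_i - x0) at x0 = 0.5: observations farther than h get weight zero.
weights = [kernel_h(x - 0.5, h) for x in (0.1, 0.4, 0.5, 0.8)]
```

Rescaling by h keeps the kernel a density, so the bandwidth controls only how local the weighting is.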

(Q2) How do we use the observed data to consistently estimate D(X) in (2) at each X = x₀?

For a specific Riemannian metric, we consider estimating D(X) at X = x₀ by minimizing a weighted intrinsic least square criterion, denoted by G_n(D(x₀)) given by

\sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) 〈 {Log}_{D (x_{0})} (S_{i}), {Log}_{D (x_{0})} (S_{i}) 〉 = \sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) g {(D (x_{0}), S_{i})}^{2} .

(4)

Directly minimizing G_n(D(x₀)) with respect to D(x₀) leads to a weighted intrinsic mean of S₁, ···, S_n ∈ Sym⁺(m) at x₀, denoted by D̂_I(x₀) (Bhattacharya and Patrangenaru, 2005). It will be shown below that D̂_I(x₀) is truly a consistent estimate of D(x₀).

Local polynomial regression has received extensive attention for the nonparametric estimation of regression functions when both response and covariate are in Euclidean space (Fan and Gijbels, 1996; Wand and Jones, 1995). However, little has been done on developing local polynomial regression when the response is in a Riemannian manifold and the covariates are in Euclidean space. Therefore, we are interested in answering a third question below.

(Q3) How do we define the intrinsic local polynomial regression for estimating D(X) in (2) at each X = x₀?

We propose the intrinsic local polynomial regression for estimating D(X) at X = x₀ as follows. Since D(x) is in the curved space, we cannot directly expand D(x) at x₀ by using a Taylor series expansion. Instead, we consider the Riemannian logarithmic map of D(x) at D(x₀) in T_{D(x₀)}Sym⁺(m). Let I_m be an m × m identity matrix. Since Log_{D(x₀)}(D(x)) for different x₀ are in different tangent spaces, we may transport them from T_{D(x₀)}Sym⁺(m) to the same tangent space T_{I_m}Sym⁺(m) through a parallel transport given by

φ_{D (x_{0})} : T_{D (x_{0})} {Sym}^{+} (m) \to T_{I_{m}} {Sym}^{+} (m) .

That is, we have

Y (x) = φ_{D (x_{0})} ({Log}_{D (x_{0})} (D (x))) \in T_{I_{m}} {Sym}^{+} (m) and {Log}_{D (x_{0})} (D (x)) = φ_{D (x_{0})}^{- 1} (Y (x)),

(5)

where $φ_{D (x_{0})}^{- 1} (\cdot)$ is the inverse map of φ_{D(x₀)}(·). Moreover, since Y(x₀) = φ_{D(x₀)}(O_m) = O_m and Y(x) are in the same space T_{I_m}Sym⁺(m), we expand Y(x) at x₀ by using the Taylor series expansion as follows:

{Log}_{D (x_{0})} (D (x)) = φ_{D (x_{0})}^{- 1} (Y (x)) \approx φ_{D (x_{0})}^{- 1} (\sum_{k = 1}^{k_{0}} Y^{(k)} (x_{0}) {(x - x_{0})}^{k}),

(6)

where k₀ is an integer and Y^{(k)}(x) is the k-th derivative of Y(x) with respect to x divided by k!. Equivalently, D(x) can be approximated by

D (x) \approx {Exp}_{D (x_{0})} (φ_{D (x_{0})}^{- 1} (\sum_{k = 1}^{k_{0}} Y^{(k)} (x_{0}) {(x - x_{0})}^{k})) = D (x, α (x_{0}), k_{0}),

(7)

where α(x₀) contains all unknown parameters in {D(x₀), Y^{(1)}(x₀), ···, Y^{(k₀)}(x₀)}.

To estimate α(x₀), we substitute the approximation of D(x) in (7) into (4) to obtain G_n(α(x₀)), which is given by

G_{n} (α (x_{0})) = \sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) g ({Exp}_{D (x_{0})} (φ_{D (x_{0})}^{- 1} (\sum_{k = 1}^{k_{0}} Y^{(k)} (x_{0}) (x_{i} - x_{0})^{k})), S_{i})^{2} .

(8)

Subsequently, we calculate an intrinsic weighted least square estimator of α(x₀) defined by

{\hat{α}}_{I} (x_{0}; h) = {argmin}_{α (x_{0})} G_{n} (α (x_{0})) .

(9)

Then we can calculate D(x, α̂_I(x₀; h), k₀), denoted by D̂_I(x, h), as an intrinsic local polynomial regression estimator (ILPRE) of D(x). When k₀ = 0, D(x, α̂_I(x₀; h), 0) is exactly the intrinsic local constant estimator of D(x₀) considered in Davis et al. (2010).

We propose using a leave-one-out cross validation method for bandwidth selection due to its conceptual simplicity. Let ${\hat{D}}_{I}^{(- i)} (x_{i}; h)$ be the estimate of D(x_i) obtained by minimizing G_n(α(x_i)) with (x_i, S_i) deleted for a given bandwidth h and all i. The cross-validation score is defined as follows:

CV (h) = n^{- 1} \sum_{i = 1}^{n} g {(S_{i}, {\hat{D}}_{I}^{(- i)} (x_{i}; h))}^{2} .

(10)

The optimal h, denoted by ĥ, can be obtained by minimizing CV(h). However, since computing ${\hat{D}}_{I}^{(- i)} (x_{i}; h)$ for all i can be computationally prohibitive, we suggest using a first-order approximation of CV(h), whose details will be given below under each specific metric. Although it is possible to develop other bandwidth selection methods, such as plug-in and bootstrap methods (Rice, 1984; Park and Marron, 1990; Hall et al., 1992; Hardle et al., 1992), they bring additional computational and theoretical challenges, which are left for future research.

3. ILPR under Log-Euclidean Metric and Trace Metric

As discussed in Dryden et al. (2009), various metrics can be defined for tangent vectors on T_{D(x)}Sym⁺(m). To assess the effect of different metrics on ILPREs, we develop ILPR under two commonly used metrics: the Log-Euclidean metric and the trace metric.

3.1. Log-Euclidean Metric

In this section, we review some basic facts about the theory of the Log-Euclidean metric, details of which have been given in Arsigny et al. (2007). We add the subscript ‘L’ to the relevant quantities under the Log-Euclidean metric. We use exp(.) and log(.) to represent the matrix exponential and the matrix logarithm, respectively, whereas we use Exp and Log to represent the Riemannian exponential and logarithm maps, respectively. Let ∂_{D(x)} log.(U) be the differential of the matrix logarithm at D(x) ∈ Sym⁺(m) acting on an infinitesimal displacement U ∈ T_{D(x)}Sym⁺(m) (Arsigny et al., 2007). The Log-Euclidean metric on Sym⁺(m) is defined as

〈 U, V 〉 = tr ({\partial_{D (x)} log \cdot (U)} {\partial_{D (x)} log \cdot (V)}),

(11)

where U and V are in T_{D(x)}Sym⁺(m). The geodesic γ_{D(x),L}(t; U) is given by exp(log(D(x)) + t∂_{D(x)} log.(U)) for any t ∈ R. Let ∂_{log(D(x))} exp.(A) be the differential of the matrix exponential at log(D(x)) ∈ Sym(m) acting on an infinitesimal displacement A ∈ T_{log(D(x))}Sym(m) (Arsigny et al., 2007). The Riemannian exponential and logarithm maps are, respectively, given by

\begin{array}{l} {Exp}_{D (x), L} (U) = exp (log (D (x)) + \partial_{D (x)} log \cdot (U)), \\ {Log}_{D (x), L} (S) = \partial_{log (D (x))} exp \cdot (log (S) - log (D (x))) . \end{array}

(12)

The geodesic distance between D(x) and S is uniquely given by

g_{L} (D (x), S) = \sqrt{tr [\{log (D (x)) - log (S)\}^{\otimes 2}]} .

(13)
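The distance (13) and the Log-Euclidean geodesic can be evaluated directly from matrix logarithms. The sketch below assumes NumPy; the function names and the two test matrices are illustrative. It uses the fact that the geodesic from S₁ to S₂ is exp(log S₁ + t(log S₂ − log S₁)), so its midpoint should sit at half the distance from each end.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def dist_log_euclidean(S1, S2):
    """g_L(S1, S2) = sqrt(tr[(log S1 - log S2)^2]), as in (13)."""
    diff = spd_log(S1) - spd_log(S2)
    return np.sqrt(np.trace(diff @ diff))

def geodesic_log_euclidean(S1, S2, t):
    """Point at time t on the Log-Euclidean geodesic from S1 (t=0) to S2 (t=1)."""
    L1, L2 = spd_log(S1), spd_log(S2)
    return sym_exp(L1 + t * (L2 - L1))

S1 = np.array([[2.0, 0.5], [0.5, 1.0]])
S2 = np.array([[1.0, -0.3], [-0.3, 3.0]])
d = dist_log_euclidean(S1, S2)
mid = geodesic_log_euclidean(S1, S2, 0.5)
d_to_mid = dist_log_euclidean(S1, mid)
```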

We consider two SPD matrices D(x) and D(x₀). For any U_{D(x₀)} ∈ T_{D(x₀)}Sym⁺(m), the parallel transport φ_{D(x₀),L}: T_{D(x₀)}Sym⁺(m) → T_{I_m}Sym⁺(m) is defined by

φ_{D (x_{0}), L} (U_{D (x_{0})}) = \partial_{D (x_{0})} log \cdot (U_{D (x_{0})}) \in T_{I_{m}} {Sym}^{+} (m) .

(14)

Combining (12) and (14) yields

\begin{array}{l} Y (x) = φ_{D (x_{0}), L} ({Log}_{D (x_{0}), L} (D (x))) = log (D (x)) - log (D (x_{0})), \\ D (x) = exp (log (D (x_{0})) + Y (x)) . \end{array}

(15)

In this case, ℰ_D(X) = log(S) − log(D(X)) and E{log(S)|X = x} = log(D(x)).

Let vec(A) = (a₁₁, …, a_{1m}, a₂₁, …, a_{2m}, ···, a_{m1}, ···, a_{mm})^T be the vectorization of an m × m matrix A = (a_ij). Under the Log-Euclidean metric, G_n(D(x₀)) in (4) can be written as

G_{n} (D (x_{0})) = \sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) tr [{log (D (x_{0})) - log (S_{i})}^{2}] .

(16)

To compute the ILPR estimator, we use the Taylor’s series expansion to expand log(D(x)) at x₀ as follows:

log (D (x)) \approx \sum_{k = 0}^{k_{0}} log {(D (x_{0}))}^{(k)} {(x - x_{0})}^{k} = log (D_{L} (x, α_{L} (x_{0}), k_{0})),

(17)

where α_L(x₀) contains all unknown parameters in log(D(x₀))^{(k)} for k = 0, ···, k₀. We compute α̂_IL(x₀; h) by minimizing G_n(D_L(x, α_L(x₀), k₀)). It can be shown that α̂_IL(x₀; h) has the explicit expression

{\hat{α}}_{I L} (x_{0}; h) = vec ({\sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) X_{i} {(x_{0})}^{\otimes 2}}^{- 1} \sum_{i = 1}^{n} K_{h} (x_{i} - x_{0}) X_{i} (x_{0}) vecs {(log (S_{i}))}^{T}),

(18)

where X_i(x) = (1, (x_i − x), ···, (x_i − x)^{k₀})^T. By substituting α̂_IL(x₀; h) into D_L(x, α_L(x₀), k₀), we have D̂_IL(x; h, k₀) = D_L(x, α̂_IL(x₀; h), k₀).
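Because (18) is a kernel-weighted least squares fit on vecs(log(S_i)), the Log-Euclidean ILPR estimator can be sketched in a few lines. The code below assumes NumPy and an Epanechnikov kernel; all function names and the noise-free toy data are hypothetical. It returns the coefficient block as a (k₀ + 1) × q matrix rather than the stacked vector in (18).

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def vecs(C):
    """Lower triangle of a symmetric matrix, stacked row by row."""
    return C[np.tril_indices(C.shape[0])]

def unvecs(v, m):
    """Inverse of vecs for an m x m symmetric matrix."""
    C = np.zeros((m, m))
    rows, cols = np.tril_indices(m)
    C[rows, cols] = v
    C[cols, rows] = v
    return C

def kernel_h(u, h):
    """Epanechnikov K_h(u) = K(u / h) / h."""
    z = u / h
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z ** 2), 0.0) / h

def ilpr_log_euclidean(x, S_list, x0, h, k0=1):
    """Evaluate (18) at x0 and map the fitted log-matrix back to Sym+(m)."""
    m = S_list[0].shape[0]
    w = kernel_h(x - x0, h)
    X = np.vander(x - x0, k0 + 1, increasing=True)      # row i is X_i(x0)^T
    Y = np.array([vecs(spd_log(S)) for S in S_list])    # row i is vecs(log S_i)^T
    XtW = X.T * w                                       # column i scaled by K_h(x_i - x0)
    coef = np.linalg.solve(XtW @ X, XtW @ Y)            # (k0 + 1) x q coefficients
    return sym_exp(unvecs(coef[0], m)), coef

# Noise-free toy data (hypothetical): log D(x) = A0 + A1 * x, so a local
# linear fit should reproduce D(x0) up to rounding error.
A0 = np.array([[0.2, 0.1], [0.1, -0.3]])
A1 = np.array([[1.0, -0.5], [-0.5, 0.4]])
x = np.linspace(0.0, 1.0, 41)
S_list = [sym_exp(A0 + A1 * xi) for xi in x]
D_hat, coef = ilpr_log_euclidean(x, S_list, x0=0.5, h=0.2, k0=1)
D_true = sym_exp(A0 + 0.5 * A1)
```

The first row of `coef` estimates vecs(log D(x₀)) and the second row estimates the first-order coefficient, which is why the estimator stays in Sym⁺(m) after the final matrix exponential.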

Let e_{k₀+1,i} be the (k₀ + 1) × 1 unit vector having 1 in the i-th entry and 0 elsewhere. Let $e_{k_{0} + 1, 1}^{T} \{\sum_{j = 1}^{n} K_{h} (x_{j} - x) X_{j} (x)^{\otimes 2}\}^{- 1} K_{h} (x_{i} - x) X_{i} (x) = a_{i} (x)$. The cross-validation score CV(h) can be simplified as follows:

CV (h) = n^{- 1} \sum_{i = 1}^{n} g_{L} {(S_{i}, {\hat{D}}_{I L} (x_{i}; h))}^{2} / {1 - a_{i} (x_{i})}^{2} .

(19)

Replacing a_i(x_i) in (19) by the average of a₁(x₁), ···, a_n(x_n), we can get the generalized cross-validation (GCV) score as follows:

GCV (h) = n^{- 1} \sum_{i = 1}^{n} g_{L} {(S_{i}, {\hat{D}}_{I L} (x_{i}; h))}^{2} / {1 - \sum_{i = 1}^{n} a_{i} (x_{i}) / n}^{2} .

(20)

Unless otherwise stated, we use GCV(h) to select the bandwidth under the Log-Euclidean metric throughout this paper.
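The scores (19) and (20) can be sketched as follows, assuming NumPy, an Epanechnikov kernel, and the weight a_i(x_i) taken from the first row of the local smoother. The function names, the simulated data and the bandwidth grid are all hypothetical. A brute-force leave-one-out score from (10) is computed alongside as a check on (19).

```python
import numpy as np

def spd_log(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def kernel_h(u, h):
    z = u / h
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z ** 2), 0.0) / h

def fit_at(x, logs, x0, h, k0, skip=None):
    """Local polynomial fit of the log-matrices at x0.

    Returns the fitted log-matrix and the smoother weights
    e_1^T (sum_j K_h X_j X_j^T)^{-1} K_h(x_j - x0) X_j for j = 1..n."""
    w = kernel_h(x - x0, h)
    if skip is not None:
        w[skip] = 0.0                         # leave observation `skip` out
    X = np.vander(x - x0, k0 + 1, increasing=True)
    XtW = X.T * w
    row = np.linalg.solve(XtW @ X, XtW)[0]
    return np.tensordot(row, logs, axes=1), row

def cv_scores(x, S_list, h, k0=1):
    """Return (CV as in (19), GCV as in (20), brute-force leave-one-out CV)."""
    logs = np.array([spd_log(S) for S in S_list])
    n = len(x)
    sq_res, lever, sq_loo = np.empty(n), np.empty(n), np.empty(n)
    for i in range(n):
        fit, row = fit_at(x, logs, x[i], h, k0)
        d = logs[i] - fit
        sq_res[i] = np.trace(d @ d)           # g_L(S_i, D_hat(x_i; h))^2
        lever[i] = row[i]                     # a_i(x_i)
        fit_loo, _ = fit_at(x, logs, x[i], h, k0, skip=i)
        d = logs[i] - fit_loo
        sq_loo[i] = np.trace(d @ d)
    cv = np.mean(sq_res / (1.0 - lever) ** 2)
    gcv = np.mean(sq_res) / (1.0 - lever.mean()) ** 2
    return cv, gcv, np.mean(sq_loo)

# Simulated data (hypothetical): a smooth curve in log space plus symmetric noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
A0 = np.array([[0.2, 0.1], [0.1, -0.3]])
A1 = np.array([[1.0, -0.5], [-0.5, 0.4]])
S_list = []
for xi in x:
    E = 0.1 * rng.normal(size=(2, 2))
    S_list.append(sym_exp(A0 + np.sin(2 * np.pi * xi) * A1 + (E + E.T) / 2))

bandwidths = [0.15, 0.2, 0.3, 0.5]
results = {h: cv_scores(x, S_list, h) for h in bandwidths}
h_gcv = min(bandwidths, key=lambda h: results[h][1])   # GCV-selected bandwidth
```

The brute-force loop refits n times per bandwidth, whereas (19) and (20) need only the full-sample fits, which is the practical reason for preferring them.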

3.2. Trace Metric

We review some basic facts about the theory of the trace metric (Schwartzman, 2006; Lang, 1999; Terras, 1988; Fletcher et al., 2004; Batchelor et al., 2005; Pennec et al., 2006). We add the subscript ‘T’ to the relevant geometric quantities under the trace metric. Under the trace metric, an inner product of U and V in T_{D(x)}Sym⁺(m) is defined as

〈 U, V 〉 = tr (U D {(x)}^{- 1} V D {(x)}^{- 1}) .

(21)

The geodesic γ_{D(x),T}(t; U) is given by G(x) exp(tG(x)^{-1}UG(x)^{-T})G(x)^T for any t, where G(x) is any square root of D(x) such that D(x) = G(x)G(x)^T. The Riemannian exponential and logarithm maps are, respectively, given by

\begin{array}{l} {Exp}_{D (x), T} (U) = γ_{D (x), T} (1; U) = G (x) exp (G {(x)}^{- 1} U G {(x)}^{- T}) G {(x)}^{T}, \\ {Log}_{D (x), T} (S) = G (x) log (G {(x)}^{- 1} S G {(x)}^{- T}) G {(x)}^{T} . \end{array}

(22)
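The maps in (22) can be sketched with the Cholesky factor as one admissible choice of square root G(x). The code assumes NumPy; the function names and test matrices are illustrative, not the authors' implementation.

```python
import numpy as np

def spd_log(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def exp_trace(D, U):
    """Exp_{D,T}(U) = G exp(G^{-1} U G^{-T}) G^T with D = G G^T."""
    G = np.linalg.cholesky(D)
    Ginv = np.linalg.inv(G)
    return G @ sym_exp(Ginv @ U @ Ginv.T) @ G.T

def log_trace(D, S):
    """Log_{D,T}(S) = G log(G^{-1} S G^{-T}) G^T with D = G G^T."""
    G = np.linalg.cholesky(D)
    Ginv = np.linalg.inv(G)
    return G @ spd_log(Ginv @ S @ Ginv.T) @ G.T

D = np.array([[2.0, 0.5], [0.5, 1.0]])
S = np.array([[1.0, -0.3], [-0.3, 3.0]])
U = log_trace(D, S)          # tangent vector at D pointing towards S
S_back = exp_trace(D, U)     # following the geodesic for unit time returns S
D_back = exp_trace(D, np.zeros((2, 2)))
```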

The geodesic distance between D(x) and S, denoted by g_T (D(x), S), is given by

\sqrt{tr {{log}^{2} (G {(x)}^{- 1} S G {(x)}^{- T})}} = \sqrt{tr {{log}^{2} (S^{- 1 / 2} D (x) S^{- T / 2})}},

(23)

where S^1/2 is any square root of S.
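A short numerical check of (23) is given below, assuming NumPy; function names and matrices are illustrative. It verifies the symmetry stated in (23) and the invariance g_T(ADA^T, ASA^T) = g_T(D, S) for an invertible A, the property behind the alternative name "affine invariant metric". The Log-Euclidean distance (13) is computed for comparison; it is not expected to share this invariance in general, although the two distances coincide for commuting matrices.

```python
import numpy as np

def spd_log(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def dist_trace(D, S):
    """g_T(D, S) from (23), with G the Cholesky factor of D."""
    Ginv = np.linalg.inv(np.linalg.cholesky(D))
    eigvals = np.linalg.eigvalsh(Ginv @ S @ Ginv.T)
    return np.sqrt(np.sum(np.log(eigvals) ** 2))   # tr{log^2(.)}

def dist_log_euclidean(D, S):
    """g_L(D, S) from (13), for comparison."""
    diff = spd_log(D) - spd_log(S)
    return np.sqrt(np.trace(diff @ diff))

D = np.array([[2.0, 0.5], [0.5, 1.0]])
S = np.array([[1.0, -0.3], [-0.3, 3.0]])
A = np.array([[1.0, 2.0], [0.5, -1.0]])            # an invertible matrix

d_T = dist_trace(D, S)
d_T_swapped = dist_trace(S, D)
d_T_transformed = dist_trace(A @ D @ A.T, A @ S @ A.T)
d_L = dist_log_euclidean(D, S)
d_L_transformed = dist_log_euclidean(A @ D @ A.T, A @ S @ A.T)

# For commuting (here diagonal) matrices the two distances agree.
D_diag, S_diag = np.diag([1.0, 4.0]), np.diag([2.0, 3.0])
d_T_diag = dist_trace(D_diag, S_diag)
d_L_diag = dist_log_euclidean(D_diag, S_diag)
```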

We consider two SPD matrices D(x) and D(x₀) = G(x₀)G(x₀)^T. For any U_{D(x₀)} ∈ T_{D(x₀)}Sym⁺(m), the parallel transport φ_{D(x₀),T} is defined by

φ_{D (x_{0}), T} (U_{D (x_{0})}) = G {(x_{0})}^{- 1} U_{D (x_{0})} G {(x_{0})}^{- T} \in T_{I_{m}} {Sym}^{+} (m) .

(24)

Thus, combining (22) and (24) yields

\begin{array}{l} Y (x) = φ_{D (x_{0}), T} ({Log}_{D (x_{0}), T} (D (x))) = log (G {(x_{0})}^{- 1} D (x) G {(x_{0})}^{- T}), \\ D (x) = G (x_{0}) exp (Y (x)) G {(x_{0})}^{T} . \end{array}

(25)

In this case, ℰ_D(X) = log(G(X)^{-1}SG(X)^{-T}).

To compute the ILPR estimator, we use the Taylor’s series expansion to expand Y (x) at x₀ as follows:

D (x) \approx G (x_{0}) exp (\sum_{k = 1}^{k_{0}} Y^{(k)} (x_{0}) {(x - x_{0})}^{k}) G {(x_{0})}^{T} = D_{T} (x, α_{T} (x_{0}), k_{0}),

(26)

where α_T(x₀) contains all unknown parameters in G(x₀) and Y^{(k)}(x₀) for k = 1, ···, k₀. Thus, we can compute α̂_IT(x₀; h) by minimizing G_n(α_T(x₀)). Under the trace metric, minimizing G_n(α_T(x₀)) is computationally challenging when k₀ > 0, since G_n(α_T(x₀)) is not convex and may have multiple local minimizers. Thus, standard gradient methods, which strongly depend on the starting value of α_T(x₀), do not perform well for optimizing G_n(α_T(x₀)) when k₀ > 0. Hence, we develop an annealing evolutionary stochastic approximation Monte Carlo algorithm (see Liang (2010) for a good discussion) for computing α̂_IT(x₀; h). Details can be found in the supplementary document.
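For the local constant case k₀ = 0 only, the criterion reduces to a kernel-weighted intrinsic mean, for which a simple fixed-point (Riemannian gradient) iteration is a common choice. The sketch below is that swapped-in technique, not the authors' stochastic approximation Monte Carlo algorithm, and it does not address k₀ > 0. It assumes NumPy and an Epanechnikov kernel; the function names, the unit step size, the Log-Euclidean starting value and the toy data are all assumptions.

```python
import numpy as np

def spd_log(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def kernel_h(u, h):
    z = u / h
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z ** 2), 0.0) / h

def transported_mean_residual(D, S_list, w):
    """Return G and T = sum_i w_i log(G^{-1} S_i G^{-T}), i.e. the weighted
    mean of Log_D(S_i) transported to the identity as in (24)."""
    G = np.linalg.cholesky(D)
    Ginv = np.linalg.inv(G)
    T = sum(wi * spd_log(Ginv @ S @ Ginv.T) for wi, S in zip(w, S_list))
    return G, T

def local_constant_trace(x, S_list, x0, h, tol=1e-12, max_iter=200):
    """Kernel-weighted intrinsic mean at x0 under the trace metric (k0 = 0)."""
    w = kernel_h(x - x0, h)
    w = w / w.sum()
    # Start from the Log-Euclidean weighted mean (an assumption of this sketch).
    D = sym_exp(sum(wi * spd_log(S) for wi, S in zip(w, S_list)))
    for _ in range(max_iter):
        G, T = transported_mean_residual(D, S_list, w)
        if np.linalg.norm(T) < tol:
            break
        D = G @ sym_exp(T) @ G.T          # D <- Exp_D(weighted mean of Log_D(S_i))
        D = (D + D.T) / 2                 # remove rounding asymmetry
    return D, w

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Toy data (hypothetical): SPD matrices with slowly rotating eigenvectors.
x = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
S_list = [rot(0.3 * i) @ np.diag([1.0 + 0.5 * i, 0.5 + 0.1 * i]) @ rot(0.3 * i).T
          for i in range(5)]
D_hat, w = local_constant_trace(x, S_list, x0=0.2, h=0.5)
_, T_final = transported_mean_residual(D_hat, S_list, w)
residual_norm = np.linalg.norm(T_final)   # empirical version of condition (2)
```

At a solution the weighted mean of Log_D(S_i) vanishes, which is the sample analogue of the defining equation (2); convergence of the unit-step iteration is typical for moderately dispersed data but is not guaranteed in general.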

To simplify the computation of CV_T (h), we suggest the first-order approximation to CV_T (h) as follows:

{CV}_{T} (h) \approx n^{- 1} \sum_{i = 1}^{n} g_{T} {(S_{i}, {\hat{D}}_{I T} (x_{i}; h, k_{0}))}^{2} + 2 p_{n} (h),

(27)

where D̂_IT (x; h, k₀) = D_T (x, α̂_IT (x₀; h), k₀). The CV_T (h) is close to Akaike’s information criterion (AIC) (Sakamoto et al., 1999) and p_n(h) can be regarded as the number of degrees of freedom. The explicit form of p_n(h) is presented in the supplementary document.

4. Asymptotic Properties

We derive the asymptotic properties of ILPREs, such as asymptotic normality, under the Log-Euclidean and trace metrics. Furthermore, we systematically compare the intrinsic local constant and linear estimators under each metric and between the two metrics.

4.1. Log-Euclidean Metric

Under the Log-Euclidean metric, ILPRE is almost equivalent to the LPR estimator for multivariate response in Euclidean space. Thus, we can generalize the existing theory of the local polynomial regression estimator (Fan and Gijbels, 1996; Wand and Jones, 1995). Moreover, we only present the consistency and asymptotic normality of ILPRE for interior points, since the asymptotic properties of ILPRE for boundary points are similar to those for interior points in Euclidean space (Fan and Gijbels, 1996).

To proceed, we need some additional notation. Let a^⊗2 = aa^T for any vector or matrix a and I_q be an identity matrix of size q = m(m + 1)/2. Let H = diag(1, h, ···, h^{k₀}) ⊗ I_q. Let u = (u₁, ···, u_{k₀})^T and v = (v₁, ···, v_{k₀})^T be k₀ × 1 vectors, where u_k = ∫x^kK(x)dx and v_k = ∫x^kK(x)²dx for k ≥ 0. Let U₀ = (u_{i+j}) and V₀ = (v_{i+j}) for 0 ≤ i, j ≤ k₀ be two (k₀ + 1) × (k₀ + 1) matrices. Let f_X(x) and $f_{X}^{(1)} (x)$ be the marginal density function of X and its first-order derivative with respect to x, respectively. We define M(x₀; h) = (M₁(x₀; h)^T, ···, M_{k₀+1}(x₀; h)^T)^T, in which we have

M_{k} (x_{0}; h) = \begin{cases} u_{k_{0} + k} vecs (log \{D (x_{0})\}^{(k_{0} + 1)}), & \text{for even } k_{0} + k \text{ as } 0 < k \leq k_{0} + 1; \\ h u_{k_{0} + k + 1} vecs (log \{D (x_{0})\}^{(k_{0} + 1)} log (f_{X} (x_{0}))^{(1)} + log \{D (x_{0})\}^{(k_{0} + 2)} (k_{0} + 2)^{- 1}), & \text{for odd } k_{0} + k . \end{cases}
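The kernel moments u_k and v_k that enter these quantities have closed forms for a given kernel. As an illustration, the sketch below evaluates them exactly for the Epanechnikov kernel K(x) = 0.75(1 − x²) on [−1, 1] and assembles the matrices (u_{i+j}) and (v_{i+j}) for 0 ≤ i, j ≤ k₀. The choice of kernel and the function names are assumptions made for the example.

```python
from fractions import Fraction

def u_moment(k):
    """u_k = integral of x^k K(x) dx for K(x) = 0.75 (1 - x^2) on [-1, 1]."""
    if k % 2:                                  # odd moments vanish by symmetry
        return Fraction(0)
    return Fraction(3, 4) * (Fraction(2, k + 1) - Fraction(2, k + 3))

def v_moment(k):
    """v_k = integral of x^k K(x)^2 dx for the same kernel."""
    if k % 2:
        return Fraction(0)
    return Fraction(9, 16) * (Fraction(2, k + 1) - Fraction(4, k + 3)
                              + Fraction(2, k + 5))

def moment_matrix(moment, k0, start=0):
    """Matrix (moment(i + j)) for start <= i, j <= k0."""
    idx = range(start, k0 + 1)
    return [[moment(i + j) for j in idx] for i in idx]

U0 = moment_matrix(u_moment, k0=1)   # (u_{i+j}), 0 <= i, j <= 1
V0 = moment_matrix(v_moment, k0=1)   # (v_{i+j}), 0 <= i, j <= 1
```

With this kernel u₀ = 1, u₂ = 1/5 and v₀ = 3/5, the constants that appear in the local constant and local linear bias and variance expressions.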

We have the following results, whose proof is similar to that of Theorem 2 in the supplementary document.

Theorem 1

Suppose that x₀ is an interior point of f_X(.). Under the Log-Euclidean metric and conditions (C1)-(C4) in the appendix, we have the following results.

(i) {α̂_IL(x₀; h) − α_L(x₀)} converges to 0 in probability as n → ∞.
(ii) For k₀ = 0, under an additional condition (C10) in the appendix and that $f_{X}^{(1)} (x)$ is continuous in a neighborhood of x₀, we have
$\sqrt{n h} [H \{{\hat{α}}_{I L} (x_{0}; h) - α_{L} (x_{0})\} - h^{2} u_{2} vecs (0.5 log \{D (x_{0})\}^{(2)} + \frac{f_{X}^{(1)} (x_{0})}{f_{X} (x_{0})} log \{D (x_{0})\}^{(1)})] \to^{L} N (0, Σ_{0} (x_{0})),$ (28)

where $Σ_{0} (x_{0}) = f_{X}^{- 1} (x_{0}) v_{0} Σ_{ℰ_{D}} (x_{0})$ with Σ_{ℰ_D}(x) = Cov(vecs[log(S) − log{D(x)}] | X = x), and →^L denotes convergence in distribution.
(iii) For k₀ > 0, under the conditions of Theorem 1 (ii), we have
$\sqrt{n h} [H \{{\hat{α}}_{I L} (x_{0}; h) - α_{L} (x_{0})\} - \frac{h^{k_{0} + 1}}{(k_{0} + 1)!} (U_{0}^{- 1} \otimes I_{q}) M (x_{0}; h)] \to^{L} N (0, Σ (x_{0})),$ (29)

where $Σ (x_{0}) = f_{X}^{- 1} (x_{0}) (U_{0}^{- 1} V_{0} U_{0}^{- 1}) \otimes Σ_{ℰ_{D}} (x_{0})$.

Theorem 1 delineates the asymptotic properties of α̂_IL(x₀; h) for k₀ ≥ 0, which covers the asymptotic properties of the intrinsic local constant and linear estimators of D(x₀) as k₀ = 0, 1. In particular, the asymptotic bias and variance of D̂_IL(x₀; h, 0) are closely related to those of the Nadaraya-Watson estimator when both response and covariate are in Euclidean space (Fan, 1992). Since vecs(log{D̂_IL(x₀; h, k₀)}) is a subvector of α̂_IL(x₀; h), we calculate the asymptotic average mean squared error (AMSE) conditional on x = {x₁, …, x_n} as

AMSE (log {{\hat{D}}_{I L} (x_{0}; h, k_{0})}) = E {tr ({[log {{\hat{D}}_{I L} (x_{0}; h, k_{0})} - log {D (x_{0})}]}^{2}) ∣ x} .

Furthermore, for a given weight function w(x), we may consider a constant bandwidth that minimizes the asymptotic average mean integrated squared error (AMISE) as

AMISE (log {{\hat{D}}_{I L} (.; h, k_{0})}) = \int AMSE (log {{\hat{D}}_{I L} (x; h, k_{0})}) w (x) d x .

Finally, we can calculate the asymptotically optimal local bandwidth, denoted by h_{opt,L}(x₀; k₀), for minimizing AMSE(log{D̂_IL(x₀; h, k₀)}) and the optimal bandwidth, denoted by h_{opt,L}(k₀), for minimizing AMISE(log{D̂_IL(.; h, k₀)}).

By Theorem 1 (iii), AMSE(log{D̂_IL(x₀; h, 0)}) equals $v_{0} \{n h f_{X} (x_{0})\}^{- 1} tr \{Σ_{ℰ_{D}} (x_{0})\} + h^{4} u_{2}^{2} tr \{(vecs [0.5 log \{D (x_{0})\}^{(2)} + f_{X}^{(1)} (x_{0}) f_{X} (x_{0})^{- 1} log \{D (x_{0})\}^{(1)}])^{\otimes 2}\}$. For the intrinsic local linear estimator, AMSE(log{D̂_IL(x₀; h, 1)}) is given by $0.25 h^{4} u_{2}^{2} tr \{(vecs [log \{D (x_{0})\}^{(2)}])^{\otimes 2}\} + v_{0} \{n h f_{X} (x_{0})\}^{- 1} tr \{Σ_{ℰ_{D}} (x_{0})\}$. Intrinsic local constant and linear estimators have the same asymptotic covariance, and they differ only in their biases. The bias of the local constant estimator has one more term, $h^{2} u_{2} f_{X}^{(1)} (x_{0}) f_{X} (x_{0})^{- 1} vecs [log \{D (x_{0})\}^{(1)}]$, which depends on the marginal density f_X(.). Subsequently, we can get the optimal bandwidths, whose detailed expressions can be found in the supplementary document.
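As a worked step, the local linear AMSE above has the form a h⁴ + b/(nh), so a pointwise minimizer follows from a single differentiation. The shorthand B(x₀) and V(x₀) is introduced here for illustration only, and B(x₀) > 0 is assumed; the authors' own expressions are in the supplementary document and may be arranged differently.

```latex
\begin{aligned}
B(x_0) &= tr\{(vecs[\log\{D(x_0)\}^{(2)}])^{\otimes 2}\}, \qquad
V(x_0) = tr\{\Sigma_{\mathcal{E}_D}(x_0)\}, \\
AMSE(h) &= 0.25\, h^{4} u_2^{2} B(x_0) + \frac{v_0 V(x_0)}{n h f_X(x_0)}, \\
\frac{\partial AMSE(h)}{\partial h} &= h^{3} u_2^{2} B(x_0) - \frac{v_0 V(x_0)}{n h^{2} f_X(x_0)} = 0, \\
h &= \left\{ \frac{v_0 V(x_0)}{n f_X(x_0)\, u_2^{2} B(x_0)} \right\}^{1/5}.
\end{aligned}
```

The minimizer is of order n^{-1/5}, the usual rate for local linear smoothing with a scalar covariate. For the Epanechnikov kernel, v₀/u₂² = (3/5)/(1/25) = 15.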

4.2. Trace Metric

Under the trace metric, since ILPRE is different from the LPR estimator for multivariate response in Euclidean space, we study the consistency and asymptotic normality of ILPRE for both interior and boundary points.

We need to introduce some notation for discussion. Consider a function

ψ (S, G, Y) = g_{T} {(S, G exp (Y) G^{T})}^{2},

(30)

where G is an m × m lower triangular matrix, S ∈ Sym⁺(m), and Y ∈ Sym(m). Let α = (α_G, α_Y), in which α_G = vecs(G) and α_Y = vecs(Y). Let ∂_αψ(S, G, Y) and $\partial_{α}^{2} ψ (S, G, Y)$ be the first and second order derivatives of ψ(S, G, Y) with respect to α, respectively. By substituting Y(X) into ∂_αψ(S, G, Y) and $\partial_{α}^{2} ψ (S, G, Y)$ and using the decomposition of α = (α_G, α_Y), we define

\begin{array}{l} (\begin{matrix} Ψ_{1} (x) & Ψ_{2} (x) \\ Ψ_{2} {(x)}^{T} & Ψ_{3} (x) \end{matrix}) = E {\partial_{α}^{2} ψ (S, G, Y (X)) ∣ X = x}, \\ (\begin{matrix} Ψ_{11} (x) & Ψ_{12} (x) \\ Ψ_{12} {(x)}^{T} & Ψ_{22} (x) \end{matrix}) = E [{\partial_{α} ψ (S, G, Y (X))}^{\otimes 2} ∣ X = x], \end{array}

where the expectation is taken with respect to S given X = x. Let 1_{k₀} be a k₀ × 1 column vector with all elements ones. Let U₂ = (u_{i+j}) and V₂ = (v_{i+j}) for 1 ≤ i, j ≤ k₀ be two k₀ × k₀ matrices. We define

ℵ (x_{0}; h) = {(w_{1} {(x_{0}; h)}^{T} Ψ_{2} (x_{0}), w (x_{0}; h) {1_{k_{0}} \otimes Ψ_{3} (x_{0})})}^{T}

and w(x₀; h) = (w₂(x₀; h)^T, ···, w_{k₀+1}(x₀; h)^T), in which

w_{k} (x_{0}; h) = \begin{cases} u_{k_{0} + k} vecs (Y^{(k_{0} + 1)} (x_{0})), & \text{for even } k_{0} + k \text{ as } 0 < k \leq k_{0} + 1, \\ h u_{k_{0} + k + 1} vecs (Y^{(k_{0} + 1)} (x_{0}) log (f_{X} (x_{0}))^{(1)} + Y^{(k_{0} + 2)} (x_{0}) (k_{0} + 2)^{- 1}), & \text{for odd } k_{0} + k . \end{cases}

Finally, let α_T(x) = (vecs{G(x)}^T, vecs{Y^{(1)}(x)}^T, ···, vecs{Y^{(k₀)}(x)}^T)^T.

Theorem 2

Suppose that x₀ is an interior point of f_X(·). Under the trace metric and conditions (C1)–(C8) in the appendix, we have the following results.

(i) There exist solutions α̂_IT(x₀; h) to the equation ∂G_n(α_T(x₀))/∂α_T(x₀) = 0 such that {α̂_IT(x₀; h) − α_T(x₀)} converges to 0 in probability as n → ∞.
(ii) For k₀ = 0, if $f_{X}^{(1)} (x)$ is continuous in a neighborhood of x₀, then we have
$\sqrt{n h} [H \{{\hat{α}}_{I T} (x_{0}; h) - α_{T} (x_{0})\} - h^{2} u_{2} vecs \{G^{(1)} (x_{0}) \frac{f_{X}^{(1)} (x_{0})}{f_{X} (x_{0})} + 0.5 G^{(2)} (x_{0})\}] \to^{L} N (0, Ω_{0} (x_{0})),$ (31)

where $Ω_{0} (x_{0}) = u_{0}^{- 2} f_{X}^{- 1} (x_{0}) v_{0} Ψ_{1} (x_{0})^{- 1} Ψ_{11} (x_{0}) Ψ_{1} (x_{0})^{- 1}$.
(iii) For k₀ > 0, if condition (C9) in the appendix is also true, we have
$\sqrt{n h} [H \{{\hat{α}}_{I T} (x_{0}; h) - α_{T} (x_{0})\} - \frac{h^{k_{0} + 1}}{(k_{0} + 1)!} N (x_{0})^{- 1} ℵ (x_{0}; h)] \to^{L} N (0, Ω (x_{0})),$ (32)

where $Ω (x_{0}) = f_{X}^{- 1} (x_{0}) N (x_{0})^{- 1} N^{*} (x_{0}) N (x_{0})^{- 1}$ and N(x) and N*(x) are, respectively, given by
$N (x) = (\begin{matrix} u_{0} Ψ_{1} (x) & u \otimes Ψ_{2} (x) \\ u^{T} \otimes Ψ_{2} (x)^{T} & U_{2} \otimes Ψ_{3} (x) \end{matrix}), N^{*} (x) = (\begin{matrix} v_{0} Ψ_{11} (x) & v \otimes Ψ_{12} (x) \\ v^{T} \otimes Ψ_{12} (x)^{T} & V_{2} \otimes Ψ_{22} (x) \end{matrix}) .$

Theorem 2 delineates the asymptotic bias, covariance, and asymptotic normality of α̂_IT (x₀; h) for k₀ ≥ 0. Based on Theorem 2, it is straightforward to derive the asymptotic bias, covariance, and asymptotic normality of D̂_IT (x₀; h, k₀) for k₀ ≥ 0. Moreover, to have a direct comparison between the trace and Log-Euclidean metrics, we calculate the asymptotic biases and covariances of log{D̂_IT (x₀; h, k₀)} under these two metrics. Subsequently, we calculate AMSE(log{D̂_IT (x₀; h, k₀)}) and AMISE(log{D̂_IT (·; h, k₀)}) for a given weight function w(x). Minimizing AMSE(log{D̂_IT (x₀; h, k₀)}) and AMISE(log{D̂_IT (·; h, k₀)}), respectively, leads to the optimal bandwidths, whose detailed expressions can be found in the supplementary document.

We are interested in comparing the asymptotic properties of the intrinsic local constant D̂_IT (x₀; h, 0) and the local linear estimator D̂_IT (x₀; h, 1). It follows from the delta method that AMSE(log{D̂_IT (x₀; h, 0)}) can be approximated as

h^{4} u_{2}^{2} tr ({[G_{D} {(x_{0})}^{T} vecs {G^{(1)} (x_{0}) f_{X}^{(1)} (x_{0}) f_{X} {(x_{0})}^{- 1} + 0.5 G^{(2)} (x_{0})}]}^{\otimes 2}) + {(n h)}^{- 1} tr {G_{D} {(x_{0})}^{\otimes 2} Ω_{0} (x_{0})} + o (h^{4} + {(n h)}^{- 1}),

(33)

where G_D(x₀) = {∂vec(log(G(x₀)^⊗2))/∂vecs(G(x₀))^T}^T. The asymptotic bias and variance of D̂_IT (x₀; h, 0) are similar to those of the Nadaraya-Watson estimator when the response is in Euclidean space (Fan, 1992). For the intrinsic local linear estimator, $AMSE (log ({\hat{D}}_{I T} (x_{0}; h, 1)))$ equals $0.25 h^{4} u_{2}^{2} tr [{G_{D} {(x_{0})}^{T} Ψ_{1} {(x_{0})}^{- 1} Ψ_{2}^{T} (x_{0}) vecs (Y^{(2)} (x_{0}))}^{\otimes 2}] + {(n h)}^{- 1} tr {G_{D} {(x_{0})}^{\otimes 2} Ω_{0} (x_{0})}$ .
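As a hedged illustration of how a bandwidth follows from an AMSE of this form, collect the squared-bias constant into A and the variance constant into B. Both are shorthand introduced here only for illustration and are not the paper's notation; the paper's explicit optimal bandwidths are in the supplementary document. Minimizing over h then gives the familiar n^{-1/5} rate:

```latex
\mathrm{AMSE}(h) \approx A h^{4} + \frac{B}{n h},
\qquad
\frac{d}{dh}\,\mathrm{AMSE}(h) = 4 A h^{3} - \frac{B}{n h^{2}} = 0
\;\Longrightarrow\;
h_{\mathrm{opt}} = \Bigl(\frac{B}{4 A n}\Bigr)^{1/5},
\qquad
\mathrm{AMSE}(h_{\mathrm{opt}}) = O\bigl(n^{-4/5}\bigr).
```

Under this shorthand, the local constant and local linear estimators share the same B at interior points and differ only through A, that is, through their bias terms.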

We consider ILPRE near the edge of the support of f_X(x). Without loss of generality, we assume that the design density f_X(.) has a bounded support [0, 1] and consider the left-boundary point x₀ = dh for some positive constant d. The asymptotic consistency and normality of ILPRE are valid for the boundary points after slight modifications of the definitions of u_k and v_k. Denote $u_{k, d} = \int_{- d}^{\infty} x^{k} K (x) d x$ and $v_{k, d} = \int_{- d}^{\infty} x^{k} K^{2} (x) d x$ . Correspondingly, u, v, U₂, and V₂ are replaced by u_d, v_d, U_{2,d}, and V_{2,d}, respectively. Let c_{k₀+2,d} = (u_{k₀+2,d}, ···, u_{2k₀+1,d})^T and ℵ_d(0+) = (u_{k₀+1,d}Ψ₂(0+), c_{k₀+2,d} ⊗ Ψ₃(0+))^T vecs(Y^(k₀+1)(0+)). For the boundary points, we have the following asymptotic results under the trace metric.

Theorem 3

Suppose that x₀ = dh is a left boundary point of f_X(.). Under the trace metric and conditions (C1)–(C8) in the appendix, we have the following results.

(i) There exist solutions, denoted by α̂_IT (x₀; h), to the equation ∂G_n(α_T (x₀))/∂α_T (x₀) = 0 such that {α̂_IT (x₀; h) – α_T (x₀)} converges to 0 in probability as n → ∞.
(ii) For k₀ = 0, conditioning on x = {x₁, ···, x_n}, we have
$\sqrt{n h} [H {{\hat{α}}_{I T} (0 +; h) - α_{T} (0 +)} - h u_{0, d}^{- 1} u_{1, d} G^{(1)} (0 +)] \to^{L} N (0, Ω_{0, d} (0 +)),$ (34)

where $Ω_{0, d} (0 +) = f_{X}^{- 1} (0 +) u_{0, d}^{- 2} v_{0, d} Ψ_{1} {(0 +)}^{- 1} Ψ_{11} (0 +) Ψ_{1} {(0 +)}^{- 1}$ .
(iii) For k₀ > 0, if condition (C9) in the appendix is also true, conditioning on x = {x₁, ···, x_n}, we have
$\sqrt{n h} [H {{\hat{α}}_{I T} (0 +; h) - α_{T} (0 +)} - \frac{h^{k_{0} + 1}}{(k_{0} + 1)!} N_{d} {(0 +)}^{- 1} ℵ_{d} (0 +)] \to^{L} N (0, Ω_{d} (0 +)),$ (35)

where $Ω_{d} (0 +) = f_{X}^{- 1} (0 +) N_{d} {(0 +)}^{- 1} N_{d}^{*} (0 +) N_{d} {(0 +)}^{- 1}$ , and $N_{d} (0 +)$ and $N_{d}^{*} (0 +)$ are, respectively, given by
$\begin{array}{l} N_{d} (0 +) = (\begin{matrix} u_{0, d} Ψ_{1} (0 +) & u_{d} \otimes Ψ_{2} (0 +) \\ {u_{d}}^{T} \otimes Ψ_{2} {(0 +)}^{T} & U_{2, d} \otimes Ψ_{3} (0 +) \end{matrix}), \\ N_{d}^{*} (0 +) = (\begin{matrix} v_{0, d} Ψ_{11} (0 +) & {v_{d}}^{T} \otimes Ψ_{12} (0 +) \\ v_{d} \otimes Ψ_{12}^{T} (0 +) & V_{2, d} \oplus Ψ_{22} (0 +) \end{matrix}) . \end{array}$

It follows from Theorem 3 (ii) and (iii) that when x₀ is at the boundary, the asymptotic average mean squared errors of intrinsic local constant and linear estimators are, respectively, AMSE(log{D̂_IT (0+; h, 0)}) = O_p(h² + n⁻¹h⁻¹) and AMSE(log{D̂_IT (0+; h, 1)}) = O_p(h⁴ + n⁻¹h⁻¹). The rate of convergence for the intrinsic local constant estimator at boundary points is slower than that at interior points, and thus the intrinsic local constant estimator suffers from the well-known boundary effects. However, the intrinsic local linear estimator adapts automatically at the boundary points and its rate of convergence is not influenced by the location of points. Thus, the intrinsic local linear (or polynomial) estimators share the same property of automatic adaptation to the boundary points as the local polynomial estimators in Euclidean space (Fan and Gijbels, 1996).
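The boundary moments u_{k,d} and v_{k,d} can be evaluated numerically for a concrete kernel. The sketch below assumes the Epanechnikov kernel on [−1, 1], which satisfies (C1) but is not stated in this section to be the kernel used by the authors; the function names are hypothetical. It shows that u_{1,d} is nonzero for d < 1, which is what produces the order-h bias term in (34), and that it vanishes once d ≥ 1, where x₀ behaves like an interior point.

```python
import numpy as np

def epanechnikov(x):
    """Epanechnikov kernel on [-1, 1] (an assumed choice satisfying (C1))."""
    return 0.75 * np.clip(1.0 - x ** 2, 0.0, None)

def _trapezoid(y, dx):
    """Composite trapezoidal rule on an equally spaced grid."""
    return float(dx * (y.sum() - 0.5 * (y[0] + y[-1])))

def kernel_moments(k, d, kernel=epanechnikov, n_grid=200001):
    """Return (u_{k,d}, v_{k,d}): integrals of x^k K(x) and x^k K(x)^2 over [-d, inf).

    The kernel is assumed to vanish outside [-1, 1], so the range is truncated there.
    """
    lo = max(-d, -1.0)
    grid = np.linspace(lo, 1.0, n_grid)
    dx = grid[1] - grid[0]
    kx = kernel(grid)
    u = _trapezoid(grid ** k * kx, dx)
    v = _trapezoid(grid ** k * kx ** 2, dx)
    return u, v

# Ratio u_{1,d} / u_{0,d} multiplying h in the local constant bias of (34)
for d in (0.0, 0.5, 1.0):
    u0, _ = kernel_moments(0, d)
    u1, _ = kernel_moments(1, d)
    print(f"d = {d:.1f}: u_1d / u_0d = {u1 / u0:.4f}")
```

The ratio shrinks to zero as d approaches 1, consistent with the boundary effect of the local constant estimator disappearing in the interior.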

5. Simulation

We conducted four sets of Monte Carlo simulations to examine the finite sample performance of ILPREs for SPD matrices under different metrics and noise distributions. It should be emphasized that these simulation studies are intended to be relevant to a wide range of applications of SPD matrices, and thus they are deliberately not limited to DTI.

We set m = 3 and assumed that the true SPD matrix function has the following form:

D (x) = exp ((\begin{matrix} - 0.1 (x + 0.1) & 0.2 (x + 0.1) & sin (0.75 x) \\ 0.2 (x + 0.1) & 0.6 (x + 0.1) & - 0.4 (x + 0.1) \\ sin (0.75 x) & - 0.4 (x + 0.1) & 0.5 (x + 0.1) \end{matrix})) .
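The true SPD function above is the matrix exponential of a symmetric matrix, so it can be evaluated through an eigendecomposition. The following is a minimal sketch; the function names `sym_expm` and `true_spd` are introduced here for illustration and are not from the paper's code.

```python
import numpy as np

def sym_expm(A):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def true_spd(x):
    """Evaluate the simulated SPD matrix function D(x) for m = 3."""
    t = x + 0.1
    s = np.sin(0.75 * x)
    A = np.array([
        [-0.1 * t, 0.2 * t, s],
        [0.2 * t, 0.6 * t, -0.4 * t],
        [s, -0.4 * t, 0.5 * t],
    ])
    return sym_expm(A)

D0 = true_spd(0.0)
print(np.linalg.eigvalsh(D0))  # all eigenvalues are strictly positive
```

Because the exponential of any symmetric matrix is SPD, D(x) is positive definite for every x without further constraints.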

We considered three noise distributions including a Riemannian log-normal distribution, a log-normal distribution, and the Rician distribution. We used the Rician noise to simulate the ideal noise in diffusion tensor imaging. The three noise models are given as follows:

(a) Riemannian log normal model: S_i = G(x_i) exp(ε_i)G(x_i)^T follows the Riemannian log normal distribution, where D(x_i) = G(x_i)^⊗2 and ε_i ∈ Sym(3) follows a symmetric matrix variate normal distribution N(0, Σ), in which Σ is a covariance matrix (Schwartzman, 2006).
(b) Log normal model: log(S_i) follows a symmetric matrix variate normal distribution N[log{D(x_i)}, Σ].
(c) Rician noise model: this noise model is commonly used to simulate ideal noise in diffusion weighted images (Zhu et al., 2007). The diffusion-weighted signal was simulated for 31 gradient directions r_k, k = 1, ···, 31 with b-factor b_k = 1000 s/mm² and four baselines with b_k = 0 s/mm² for k = 32, ···, 35. The baseline signal intensity W₀ was set at 1500. For a given diffusion tensor D(x_i), ε_{R,k} and ε_{I,k} were independently simulated from a Gaussian random generator with mean zero and standard deviation 60. The diffusion-weighted signal was calculated as $W_{i, k} = \sqrt{{[W_{0} exp {- b_{k} r_{k}^{T} D (x_{i}) r_{k}} + ε_{R, k}]}^{2} + ε_{I, k}^{2}}$ for k = 1, ···, 35. Subsequently, the weighted least squares estimate was used to estimate S_i (Zhu et al., 2007).
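The signal-generation step of noise model (c) can be sketched as follows. Several ingredients are assumptions made only for this illustration: the gradient directions are random unit vectors (the paper's 31 directions are not listed here), the example tensor is a hypothetical one on the mm²/s scale so that b_k r_k^T D r_k is of order one, and the function name `rician_signals` is hypothetical. The subsequent weighted least squares reconstruction of S_i is not shown.

```python
import numpy as np

def rician_signals(D, grads, bvals, w0=1500.0, sigma=60.0, rng=None):
    """Simulate magnitude signals W_k = sqrt((W0 exp(-b_k r_k^T D r_k) + e_R)^2 + e_I^2)."""
    rng = np.random.default_rng() if rng is None else rng
    quad = np.einsum("ki,ij,kj->k", grads, D, grads)  # r_k^T D r_k for each direction
    clean = w0 * np.exp(-bvals * quad)                 # noise-free signal
    eps_r = rng.normal(0.0, sigma, size=clean.shape)   # real-channel noise
    eps_i = rng.normal(0.0, sigma, size=clean.shape)   # imaginary-channel noise
    return np.sqrt((clean + eps_r) ** 2 + eps_i ** 2)

rng = np.random.default_rng(0)
grads = rng.normal(size=(35, 3))
grads /= np.linalg.norm(grads, axis=1, keepdims=True)        # assumed unit directions
bvals = np.concatenate([np.full(31, 1000.0), np.zeros(4)])   # 31 weighted scans, 4 baselines
D = 1e-3 * np.diag([1.7, 0.3, 0.3])                          # hypothetical tensor in mm^2/s
W = rician_signals(D, grads, bvals, rng=rng)
print(W.shape)
```

For the baseline scans the direction is irrelevant because b_k = 0, so the noise-free signal there equals W₀.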

For each simulated data set, we considered three metrics including the trace metric, the Log-Euclidean metric, and the Euclidean metric. For the trace and Log-Euclidean metrics, we calculated the intrinsic local constant and linear estimators developed above for each data set. By following the arguments in Pasternak et al. (2010), we employed the Euclidean metric for estimated diffusion tensors. Under the Euclidean metric, we applied the standard local constant and linear regression methods to estimate the SPD matrix function for each simulated data set, while the bandwidth was selected by using its corresponding generalized cross-validation method. For comparison, we also included a tensor spline method (Barmpoutis et al., 2007) based on the trace metric.

We first generated n = 50 design points x_i, i = 1, ···, 50 independently from a N(0, 0.25) distribution. Then we calculated D(x_i) and used it to simulate S_i according to one of the three noise distributions (a)–(c). Unless stated otherwise, the covariance matrix Σ for the noise models (a) and (b) was set as follows:

Σ_{1} = 2 × (\begin{matrix} 0.3 & 0.049 & 0.052 \\ 0.049 & 0.2 & 0.0424 \\ 0.052 & 0.0424 & 0.1 \end{matrix}) .

(36)

Figure 2(a)–(d) presents the true SPD matrix data and a set of simulated data {(x_i, S_i): i = 1, ···, 50} under the three noise models. Each SPD matrix S_i at the point x_i is geometrically represented by an ellipsoid. In this representation, the lengths of the semiaxes of the ellipsoid equal the square roots of the eigenvalues of an SPD matrix, while the eigenvectors define the directions of the three axes. In DTI, the ellipsoidal representation is used to represent the local Brownian motion of water molecules in the brain. Isotropic diffusion is represented by a sphere, while anisotropic diffusion is represented by an anisotropic ellipsoid. We simulated 100 data sets for each scenario. Note that the Rician noise level is visually less variable than the relatively high levels of the other two.
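The ellipsoidal representation described above amounts to an eigendecomposition. A minimal sketch, with the hypothetical helper name `ellipsoid_axes`:

```python
import numpy as np

def ellipsoid_axes(S):
    """Semi-axis lengths (square roots of eigenvalues) and axis directions (eigenvectors)."""
    w, V = np.linalg.eigh(S)   # eigenvalues in ascending order, eigenvectors as columns
    return np.sqrt(w), V

S = np.diag([4.0, 9.0, 1.0])
lengths, directions = ellipsoid_axes(S)
print(lengths)  # semi-axis lengths in ascending order
```

An isotropic tensor (a multiple of the identity) gives three equal lengths, that is, a sphere.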

Fig. 2 — Ellipsoidal representations of (a) the true SPD matrix data along the design points; simulated SPD matrix data along the design points under the three different noise models: (b) Riemannian log normal, (c) log normal and (d) Rician noise models; and estimated SPD matrix data along the design points using three smoothing methods: (e), (h) and (k): ILPR under the trace metric; (f), (i) and (l): ILPR under the Log-Euclidean metric; and (g), (j) and (m): LPR under the Euclidean metric; and under the three different noise models: (e)–(g): Riemannian log normal model; (h)–(j): log normal model; and (k)–(m): Rician noise model, colored with FA values defined in (37).

To compare different smoothing methods for SPD matrices under different scenarios, we calculated two summary statistics including an Average Geodesic Distance (AGD) over all design points and a Local Average Geodesic Distance (LAGD) at each design point. Specifically, AGD is defined as $AGD = n^{- 1} \sum_{i = 1}^{n} g (\hat{D} (x_{i}), D (x_{i}))$ , where D̂(x_i) is an estimate of D(x_i) based on a specific smoothing method. At each sample point x_i, LAGD is given by $LAGD (x_{i}) = \sum_{j = 1}^{100} g ({\hat{D}}_{(j)} (x_{i}), D (x_{i})) / 100$ , where D̂_(j)(x_i) is the estimated SPD matrix at x_i based on the j-th simulated replication. Although we used all three metrics for calculating AGD and LAGD, we only present those based on the Euclidean metric for the sake of space. The results of AGD and LAGD for the other two metrics are included in the supplementary document.
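The distance g in AGD and LAGD depends on the chosen metric. The sketch below uses standard closed forms: the affine-invariant distance for the trace metric (assumed here to coincide with the paper's g_T), the Frobenius distance between matrix logarithms for the Log-Euclidean metric, and the Frobenius distance for the Euclidean metric. Function names are illustrative.

```python
import numpy as np

def _sym_fun(A, f):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def dist_trace(A, B):
    """Geodesic distance under the trace metric: sqrt(tr(log^2(A^{-1/2} B A^{-1/2})))."""
    A_inv_half = _sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    M = A_inv_half @ B @ A_inv_half
    lam = np.linalg.eigvalsh(0.5 * (M + M.T))
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def dist_log_euclidean(A, B):
    """Frobenius distance between matrix logarithms."""
    return float(np.linalg.norm(_sym_fun(A, np.log) - _sym_fun(B, np.log), "fro"))

def dist_euclidean(A, B):
    """Frobenius distance between the matrices themselves."""
    return float(np.linalg.norm(A - B, "fro"))

def agd(estimates, truths, dist):
    """Average geodesic distance over design points for a given distance function."""
    return float(np.mean([dist(Dh, D) for Dh, D in zip(estimates, truths)]))

A = np.eye(3)
B = np.e * np.eye(3)
print(dist_trace(A, B), dist_log_euclidean(A, B), dist_euclidean(A, B))
```

LAGD at a fixed x_i is the same average taken over replications instead of over design points, so `agd` can be reused with the list of replicate estimates at that point.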

5.1. Simulation 1

The first set of simulations compared the finite sample performance of the intrinsic local linear estimators under different metrics and noise distributions. Figure 2(e)–(m) displays a set of the estimated SPD functions using local linear regression estimators under the three metrics for the three different noise models. Inspecting Figure 2(k)–(m) reveals that under the Rician noise model, all three metrics perform well in recovering the true SPD function. This is not surprising given the relatively low noise level shown in Figure 2(d). However, for the other two noise models, our intrinsic local linear regression methods visually outperform the local linear regression based on the Euclidean metric. In particular, a clear swelling effect is observed for the Euclidean metric (Figure 2(g) and (j)). This indicates the importance of appropriate metric selection according to the distribution of a specific SPD data set, which is partially in agreement with the suggestion given in Pasternak et al. (2010). However, our findings also suggest that both the trace and Log-Euclidean metrics are appropriate for the nonparametric analysis of SPD matrices for all three noise distributions. This also agrees with the findings in the medical imaging literature (Fletcher and Joshi, 2007; Batchelor et al., 2005; Pennec et al., 2006) on the interpolation and extrapolation of diffusion tensor fields. It should be noted that the simulation studies in Pasternak et al. (2010) solely focus on the effect of metric on the estimated diffusion tensors and their associated scalar measures, such as the apparent diffusion coefficient. Thus, the recommendation in Pasternak et al. (2010) may not apply to the nonparametric analysis of SPD matrices.
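The swelling effect noted above can be reproduced with two tensors of equal determinant. In this small sketch (the matrices are chosen only for illustration), the Euclidean average has a larger determinant than either input, whereas the Log-Euclidean average preserves it.

```python
import numpy as np

def _sym_fun(A, f):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

A = np.diag([10.0, 1.0, 1.0])
B = np.diag([1.0, 10.0, 1.0])   # same determinant as A, different orientation

mean_euclid = 0.5 * (A + B)
mean_log_euclid = _sym_fun(0.5 * (_sym_fun(A, np.log) + _sym_fun(B, np.log)), np.exp)

det_euclid = np.linalg.det(mean_euclid)          # 5.5 * 5.5 * 1 = 30.25
det_log_euclid = np.linalg.det(mean_log_euclid)  # sqrt(10) * sqrt(10) * 1 = 10
print(det_euclid, det_log_euclid)
```

Since kernel smoothing is a locally weighted average, the same inflation of determinants carries over to local constant and local linear fits under the Euclidean metric.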

5.2. Simulation 2

The second set of simulations compared local constant estimators with local linear estimators under the three metrics and the three noise distributions. In addition, we also compared all local regression methods with the tensor spline estimators in Barmpoutis et al. (2007). Inspecting Figure 3 reveals the following findings. As expected, under all metrics, the local linear estimator is superior to the local constant estimator. Also, our ILPREs outperform the corresponding estimators under the Euclidean metric and the tensor spline estimators under the noise models (a) and (b). For the Rician noise model, our ILPREs under the Log-Euclidean metric slightly outperform those under the trace and Euclidean metrics. Moreover, the local constant and linear estimators outperform the tensor spline estimators under all noise distributions. The variations of AGDs for ILPREs under the trace metric are larger than those under the Log-Euclidean metric under all three noise distributions. The U shape of the LAGD curves indicates that interior points have smaller LAGDs than those near the boundaries since there are more design points in the interior than at the boundaries.

Fig. 3 — Comparisons of the local constant and linear estimators under the three metrics and the tensor spline estimators under the three noise models. Panels (a)–(c) show the boxplots of 1000×AGDs obtained from seven different estimators, where LCL, LCT, and LCE, respectively, represent the local constant estimators under the Log-Euclidean, trace and Euclidean metrics, where LLL, LLT, and LLE, respectively, represent the corresponding local linear estimators under the same metrics, and where SP represents the tensor spline estimator. Panels (d)–(f) of the second row show the log₁₀(LAGD) curves based on LCL (dash-dotted line), LCT (dashed line), LCE (dotted line), and SP (solid line). Panels (g)–(i) of the third row show the log₁₀(LAGD) curves based on LLL (dash-dotted line), LLT (dashed line), LLE (dotted line), and SP (solid line). The columns correspond to the three noise models: column 1: Riemannian log normal; column 2: log normal; and column 3: Rician.

5.3. Simulation 3

The third set of simulation studies compared the finite sample performance of the intrinsic local linear estimators under the trace, Log-Euclidean and Euclidean metrics and also tensor spline method in Barmpoutis et al. (2007) at a higher noise level. Specifically, we assumed Σ = 4Σ₁ for the covariance matrix of N(0, Σ) in the noise models (a) and (b). At high noise levels, most local linear estimators cannot retain the positive definiteness under the Euclidean metric, while the tensor spline method does not converge. Thus, Figure 4 only presents the results under the trace and Log-Euclidean metrics. Inspecting this figure reveals that when the noise level is high, the intrinsic local linear estimators under the trace metric slightly outperform those under the Log-Euclidean metric under the noise models (a) and (b).

Fig. 4 — Comparison of the intrinsic local linear estimators under the Log-Euclidean and trace metrics for the first two noise models at a higher noise level: Panels (a) and (c): Riemannian log normal; Panels (b) and (d): log normal; Panels (a) and (b): the boxplots of AGDs for LLL and LLT; Panels (c) and (d): the log₁₀(LAGD) curves of LLL and LLT. It shows that at a high noise level, the intrinsic local linear estimators under the trace metric slightly outperform those under the Log-Euclidean metric for the first two noise models.

5.4. Simulation 4

The fourth set of simulation studies examined the importance and effect of directly smoothing SPDs on some SPD-derived scalar summary measures under the three noise models. We considered a well-known scalar measure derived from a 3 × 3 SPD matrix, called fractional anisotropy (FA), which describes the variation of the three eigenvalues of a 3 × 3 SPD matrix. FA is a scalar value between zero (all eigenvalues are the same) and one (two eigenvalues equal 0) and is given by

F A = \sqrt{\frac{3 {{(λ_{1} - \bar{λ})}^{2} + {(λ_{2} - \bar{λ})}^{2} + {(λ_{3} - \bar{λ})}^{2}}}{2 (λ_{1}^{2} + λ_{2}^{2} + λ_{3}^{2})}}

(37)

with eigenvalues λ₁, λ₂, λ₃ and their average λ̄. We compared two different approaches for smoothing FA values, here referred to as method A and method B, respectively. Method A (method 1) first calculates the FA values from all SPD matrices and then uses the classic local linear regression in Euclidean space to smooth them. Method B first applies the intrinsic local linear estimator to smooth SPD matrices and then calculates smoothed FA curves based on the smoothed SPD matrices. We further divided method B into three methods according to the metric used for smoothing the SPD matrices: trace (method 2), Log-Euclidean (method 3) and Euclidean (method 4) metrics. We assessed each method’s performance via the Mean Absolute Deviation Error (MADE) defined by $MADE = n^{- 1} \sum_{i = 1}^{n} ∣ F A (x_{i}) - \hat{F A} (x_{i}) ∣$ , where FA(x_i) and $\hat{F A} (x_{i})$ are, respectively, the true and estimated FA values at the design point x_i.
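Equation (37) and the MADE criterion translate directly into code. The following is a minimal sketch with hypothetical function names:

```python
import numpy as np

def fractional_anisotropy(S):
    """FA of a 3 x 3 SPD matrix as in (37), computed from its eigenvalues."""
    lam = np.linalg.eigvalsh(S)
    num = 3.0 * np.sum((lam - lam.mean()) ** 2)
    den = 2.0 * np.sum(lam ** 2)
    return float(np.sqrt(num / den))

def made(fa_true, fa_hat):
    """Mean absolute deviation error between true and estimated FA values."""
    fa_true = np.asarray(fa_true, dtype=float)
    fa_hat = np.asarray(fa_hat, dtype=float)
    return float(np.mean(np.abs(fa_true - fa_hat)))

print(fractional_anisotropy(np.eye(3)))                  # isotropic tensor: FA = 0
print(fractional_anisotropy(np.diag([1.0, 0.0, 0.0])))   # limiting case: FA = 1
```

In method B, `fractional_anisotropy` would be applied to the smoothed SPD matrices, whereas in method A it is applied to the raw matrices before scalar smoothing.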

Figure 5 reveals that method B outperforms method A under the noise models (a) and (b). For the Rician noise model, methods A and B are fairly comparable, but method B based on the Log-Euclidean metric is slightly better. This may indicate the potential improvement gained by directly smoothing DT data over the post smoothing method A. Based on the medians of MADEs (see Figure 5(d)–(f)), method A cannot faithfully reconstruct the trend of the FA curve for the noise models (a) and (b), whereas method B can accurately estimate the FA curve and reveal its critical features such as the valley. It should be noted that the true FA value at the valley does not equal zero.

Fig. 5 — Boxplot of the MADE’s using the four smoothing methods 1–4 representing the first-fourth methods based on 100 replications for three noise models: (a) Riemannian log normal model; (b) log normal model; and (c) Rician model. Smoothed FA curves for the realizations with median MADE for three noise models: (d) Riemannian log normal model; (e) log normal model; and (f) Rician model. In panels (d)–(f), the raw FA curve is the dotted line with circle, true FA curve is the solid line, the estimated FA curve for the first method is the dash-dotted line with circle, the estimated FA curve for the second method is the dotted line, the estimated FA curve for the third method is dash-dotted line and the estimated FA curve for the fourth method is dashed line.

6. HIV Imaging Data

The aim of this analysis is to assess the integrity of white matter in human immunodeficiency virus (HIV) by using DTI and our ILPRE. This clinical study was approved by the Institutional Review Board of the University of North Carolina at Chapel Hill. A sample data set and the code for ILPRE along with its documentation will be accessible from the website http://www.bios.unc.edu/research/bias. We considered 46 subjects with 28 HIV+ subjects (20 males and 8 females whose mean age is 40.0 with SD 5.6 years) and 18 healthy controls (9 males and 9 females whose mean age is 41.2 with SD 7.4 years). Diffusion-weighted images and T1 weighted images were acquired for each subject. The diffusion tensor acquisition scheme includes 18 repeated measures of six non-collinear directions, (1,0,1), (−1,0,1), (0,1,1), (0,1,−1), (1,1,0), and (−1,1,0) at a b-value of 1000 s/mm² and a b = 0 reference scan. Forty-six contiguous slices with a slice thickness of 2 mm covered a field of view (FOV) of 256 mm² with an isotropic voxel size of 2 × 2 × 2 mm³. High resolution T1 weighted (T1W) images were acquired using a 3D MP-RAGE sequence. A weighted least squares estimation method was used to construct the diffusion tensors (Zhu et al., 2007). Since previous DTI studies found the diffusion tensors in the splenium of the corpus callosum to be significantly different between the HIV+ and control groups, we examine the finite sample performance of our method by using this fiber tract. The tensors along the tract were extracted using the methodology described in Zhu et al. (2010), which involves three steps: (i) registration and atlas construction, (ii) fiber tracking on the atlas and (iii) collection of tensor data on the atlas fiber tracts. Figure 6 displays the splenium of the corpus callosum and the ellipsoidal representation of the full tensors on that tract from one selected subject.

Fig. 6 — (a)The splenium of the corpus callosum in the analysis of HIV DTI data. (b)The ellipsoidal representation of full tensors colored with FA values on the fiber tract from a selected subject.

We calculated the intrinsic local linear estimator of the SPD matrices along this selected tract for each subject under the trace and Log-Euclidean metrics and also calculated the local linear estimator under the Euclidean metric. See Figure 7 for the raw and estimated tensors along the fiber tracts from one subject. It is observed from the ellipsoidal representation of diffusion tensor data (Figure 7(a)) that the data are noisy. Figure 7(b)–(e) show that the tensors are more spherical at the beginning with low FA values and more anisotropic in the middle part with high FA values. The methods under the three metrics reveal very similar trends of diffusion tensors changing along the fiber tract, especially in the first row and the last row. This agrees with our simulation results that for diffusion tensor data, all three metrics are comparable. However, some differences appear on the right side of the middle row. The estimated tensors in the middle row are very anisotropic when using the method under the Euclidean metric compared to the other two metrics.

Fig. 7 — (a) The ellipsoidal representations of the diffusion tensor data and estimated tensors using the intrinsic local linear regression under the (b) Log-Euclidean, (c) trace and (d) Euclidean metrics along the splenium of the corpus callosum, colored with FA values. The estimated tensors in the middle right part are more anisotropic using the method under the Euclidean metric. Each set of 3 rows in (a)–(d) represents one tract of tensors and the three rows are read from left to right in the top row, right to left in the middle row and then left to right in the bottom row. (e) FA’s, (f) MD’s and (g) PE’s derived from the raw tensor data (dot line) and estimated tensors using the intrinsic local linear regression under the trace (dash-dot line), Log-Euclidean (solid line) and Euclidean (dashed line) metrics as the function of arc-length along the splenium of the corpus callosum. Estimated FA, MD and PE function along the splenium of the corpus callosum by using the standard local linear regression for scalars (dotted line with circles).

In many applications, it is common to calculate some tensor-derived diffusion measures, including FA, the trace of a diffusion tensor, called MD, and the largest eigenvalue of a diffusion tensor, called PE, based on noisy diffusion tensor data and then apply standard statistical methods to directly carry out statistical inference on these diffusion measures. Since these scalar measures do not capture all information in the full diffusion tensor, they can decrease the sensitivity of detecting subtle changes of the white matter structure. Similar to the simulation of Section 5.4, we applied method A to directly smooth FA, MD and PE values along the selected tract and then we compared them with method B based on the trace, Log-Euclidean, and Euclidean metrics. Figure 7(e)–(g) shows that there is no large difference among the versions of method B under the three metrics. Methods A and B perform the same for smoothing MD data, whereas they perform differently for smoothing PE and FA curves, especially in the middle part from 10 to 25. This seems to be caused by the fact that FA and PE values are biased due to the well known ‘sorting’ bias in estimating the eigenvalues of DT (Zhu et al., 2007), whereas the estimated MD value is unbiased.

Finally, we estimated the mean diffusion tensor curve for each of the two groups: HIV and control groups. In order to detect meaningful group differences, registration is crucial. The 46 HIV DTI data sets used in our studies, including the splenium tracts and diffusion tensors on them, were registered in the same atlas space. Figure 8(a)–(f) displays the estimated mean diffusion tensors along the fiber tract for the two groups using the intrinsic local linear regression for SPD matrices under both the Log-Euclidean and trace metrics and also using the local linear regression for SPD matrices under the Euclidean metric. We can observe some obvious changes of diffusion tensors of HIV subjects along the splenium of the corpus callosum compared with those in the control group. We also calculated the differences of FA values derived from the estimated mean diffusion tensors, which correspond to the color differences in Figure 8(g), and the geodesic distances between estimated mean tensors at each point along the tract in Figure 8(h). This result agrees with previous DTI findings that the splenium of the corpus callosum has been detected as abnormal for the HIV group (Filippi et al., 2001; Chen et al., 2009).

Fig. 8 — Ellipsoidal representations of estimated mean tensors along the splenium of the corpus callosum for the control and HIV groups using the intrinsic local linear regression under the Log-Euclidean ((a) and (b)), trace ((c) and (d)) and Euclidean ((e) and (f)) metrics colored with FA values. Each set of 3 rows in (a)–(d) represents one tract of tensors and the three rows are read from left to right in the top row, right to left in the middle row and then left to right in the bottom row. (g) FA differences and (h) geodesic distances (GD) between the mean diffusion tensors of HIV and control groups along the splenium of the corpus callosum under the Log-Euclidean (the solid line), trace (the dashed line) and Euclidean (dash-dot line) metrics.

7. Conclusion and Discussion

We have systematically investigated the intrinsic local polynomial regression methods under the trace and Log-Euclidean metrics on the space of SPD matrices. Many issues still merit further research. The proposed cross validation bandwidth selector is straightforward and relatively simple to derive and implement for SPD matrix variate data. However, the relatively high variance of the cross validation bandwidth selector is widely regarded as an impediment to its good performance (Jones et al., 1996; Hardle et al., 1992). It would be of great interest to develop variable bandwidth selection methods to capture complicated variations of SPD matrices in the covariate space and better bandwidth selection methods to reduce the variability of cross validation (Fan et al., 1996). From the average diffusion tensor curves for the HIV and control groups in Section 6, we can observe some obvious changes in diffusion tensors of HIV subjects along the splenium of the corpus callosum compared with the control group. A topic of future interest is to develop tests for comparing the differences across multiple groups of SPD curves by considering varying-coefficient models and additive models among others. The real applicability of the log normal and Riemannian log normal noise models remains unclear. It is of great interest to explore the fit of those noise models to real data in different applications.

Finally, the ILPR method proposed here and theory may be extended to other nonparametric methods (e.g., tensor splines) and manifold-valued data, such as directional data and rotation matrices. For instance, although we have compared our ILPR with Barmpoutis et al. (2007)’s tensor splines, it would be interesting to develop free-knot regression splines for SPD matrices and compare them with our ILPR both theoretically and numerically (Sangalli et al., 2009). Moreover, a smooth spline method based on unrolling and unwrapping procedures in Riemannian manifolds has been developed for fitting smooth curves to spherical data, rotation matrices, and planar landmark data (Jupp and Kent, 1987; Prentice, 1987; Kume et al., 2007). Development of other nonparametric methods, such as ILPR, for the analysis of manifold-valued data and examination of the asymptotic properties of nonparametric estimates under different metrics should be pursued in future research.

Acknowledgments

We thank the Editor, an Associate Editor, two referees, and Martha Skup for valuable suggestions, which helped to improve our presentation greatly. We are also thankful to Martin Styner, Pew-Thian Yap, and Zhexing Liu for their help with the visualizations. This work was supported in part by a National Science Foundation grant and National Institute of Health grants.

Appendix. Assumptions

The following assumptions are needed to facilitate development of our methods, although they are not the weakest possible conditions. We need some notation. Recall that ψ(S, G, Y) = g_T (S, G exp(Y)G^T)² and α = (α_G, α_Y), where G is an m × m lower triangular matrix, S ∈ Sym⁺(m), Y ∈ Sym(m), α_G = vecs(G), and α_Y = vecs(Y). We define

(\begin{matrix} \partial_{α_{G}} ψ (S, G, Y) \\ \partial_{α_{Y}} ψ (S, G, Y) \end{matrix}) = (\begin{matrix} ψ_{G} (S, G, Y) \\ ψ_{Y} (S, G, Y) \end{matrix}), (\begin{matrix} \partial_{α_{G}}^{2} ψ (S, G, Y) & \partial_{α_{G} α_{Y}}^{2} ψ (S, G, Y) \\ \partial_{α_{Y} α_{G}}^{2} ψ (S, G, Y) & \partial_{α_{Y}}^{2} ψ (S, G, Y) \end{matrix}) = (\begin{matrix} ψ_{G G} (S, G, Y) & ψ_{G Y} (S, G, Y) \\ ψ_{Y G} (S, G, Y) & ψ_{Y Y} (S, G, Y) \end{matrix}) .

(C1)
The kernel function K(.) is a continuous symmetric probability density function with bounded support, say [−1, 1].
(C2)
The regression function D(x) ∈ Sym⁺(m) has a continuous (k₀ + 1)-th order derivative in a neighborhood of x₀.
(C3)
The bandwidth h tends to zero and nh → ∞.
(C4)
The design density f_X(.) is continuous in a neighborhood of x₀ and f_X(x₀) > 0.
(C5)
The conditional density f(S|X = x) is continuous in a neighborhood of x₀.
(C6)
$E {\partial_{α}^{2} ψ (S, G, Y (X)) ∣ X = x}$ and E[{∂_αψ(S, G, Y (X))}^⊗2|X = x] are continuous in a neighborhood of x₀.
(C7)
The matrix $N (x) = (\begin{matrix} u_{0} Ψ_{1} (x) & u \otimes Ψ_{2} (x) \\ u^{T} \otimes Ψ_{2} {(x)}^{T} & U_{2} \otimes Ψ_{3} (x) \end{matrix})$ is positive definite in a neighborhood of x₀.
(C8)
Let ||·|| be the L₂ norm of a matrix, η₀ be a lower triangular matrix, η₁ ∈ Sym(m), and U_δ = {(η₀, η₁): ||η₀||² + ||η₁||² ≤ δ²}. As δ → 0,

\begin{array}{l} E {sup_{U_{δ}} | | \partial_{α}^{2} ψ (S, G (x_{0}) + η_{0}, Y (X) + η_{1}) - \partial_{α}^{2} ψ (S, G (x_{0}), Y (X)) | | ∣ X = x} = o (1), \\ E {sup_{U_{δ}} | | \partial_{α} ψ (S, G (x_{0}) + η_{0}, Y (X) + η_{1}) - \partial_{α} ψ (S, G (x_{0}), Y (X)) - \partial_{α}^{2} ψ (S, G (x_{0}), Y (X)) {(vecs {(η_{0})}^{T}, vecs {(η_{1})}^{T})}^{T} | | ∣ X = x} = o (δ), \end{array}

hold uniformly in x in a neighborhood of x₀.
(C9)
There exists a b > 0 such that E{||∂_αψ(S, G(x₀), Y (X))||^{b+2}|X = x} is bounded in a neighborhood of x₀.
(C10)
The (x) = Cov{ (X)|X = x} is continuous in a neighborhood of x₀ and there exists a b > 0 such that E{|| (X)||^b⁺²|X = x} is bounded in a neighborhood of x₀.

Remark

Assumptions (C1)–(C10) are standard conditions for ensuring the asymptotic properties of local polynomial estimators when x₀ is an interior point of f_X(·) (Fan and Gijbels, 1996; Wand and Jones, 1995). Some conditions can be relaxed at the cost of additional technicalities in the proofs. For instance, the bounded support restriction on K(·) in (C1) is not essential and can be removed if we put a restriction on the tail of K(·). Condition (C2) ensures that Y (x) = log(G(x₀)⁻¹D(x)G(x₀)^{-T}), G(x), and log(D(x)) have a continuous (k₀ + 1)-th order derivative in a neighborhood of x₀. Moreover, assume that f_X(.) has a bounded support [0, 1]. All assumptions can be easily modified when x₀ is a boundary point, say left boundary point x₀ = dh or right boundary point x₀ = 1 − dh for some d > 0. For instance, we require that conditions (C2)–(C10) hold in the left neighborhood of 0 or the right neighborhood of 1. For condition (C2), we also need to introduce f_X(0+) when x₀ is the left boundary point and f_X(1−) when x₀ is the right boundary point. For condition (C7), N(x) also needs some modifications. For simplicity, we omit these details.

References

Anderson TW. An introduction to multivariate statistical analysis. 3rd ed. Wiley Series in Probability and Statistics; 2003.
Arsigny V, Fillard P, Pennec X, Ayache N. Geometric means in a novel vector space structure on symmetric positive definite matrices. SIAM J Matrix Anal Appl. 2007;29:328–347.
Barmpoutis A, Vemuri BC, Shepherd TM, Forder JR. Tensor splines for interpolation and approximation of DT-MRI with applications to segmentation of isolated rat hippocampi. IEEE Transactions on Medical Imaging. 2007;26:1537–1546. doi: 10.1109/TMI.2007.903195.
Basser PJ, Mattiello J, LeBihan D. MR diffusion tensor spectroscopy and imaging. Biophysical Journal. 1994;66:259–267. doi: 10.1016/S0006-3495(94)80775-1.
Batchelor P, Moakher M, Atkinson D, Calamante F, Connelly A. A rigorous framework for diffusion tensor calculus. Magnetic Resonance in Medicine. 2005;53:221–225. doi: 10.1002/mrm.20334.
Bhattacharya R, Patrangenaru V. Large sample theory of intrinsic and extrinsic sample means on manifolds-II. Annals of Statistics. 2005;33:1225–1259.
Chen YS, An HT, Zhu HY, Stone T, Smith JK, Hall C, Bullitt E, Shen DG, Lin WL. White matter abnormalities revealed by diffusion tensor imaging in non-demented and demented HIV+ patients. NeuroImage. 2009;47:1154–1162. doi: 10.1016/j.neuroimage.2009.04.030.
Davis BC, Bullitt E, Fletcher PT, Joshi S. Population shape regression from random design data. International Journal of Computer Vision. 2010;90:255–266.
Dryden IL, Koloydenko A, Zhou D. Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Annals of Applied Statistics. 2009;3:1102–1123.
Fan J, Gijbels I. Local polynomial modelling and its applications. Chapman and Hall; 1996.
Fan J, Gijbels I, Hu T-C, Huang L-S. A study of variable bandwidth selection for local polynomial regression. Statist Sinica. 1996;6(1):113–127.
Filippi C, Ulug A, Ryan E, Ferrando S, van Gorp W. Diffusion tensor imaging of patients with HIV and normal-appearing white matter on MR images of the brain. AJNR Am J Neuroradiol. 2001;22:277–283.
Fingelkurts AA, Fingelkurts AA, Kahkonen S. Functional connectivity in the brain-is it an elusive concept? Neuroscience and Biobehavioral Reviews. 2005;28:827–836. doi: 10.1016/j.neubiorev.2004.10.009.
Fletcher P, Joshi S. Riemannian geometry for the statistical analysis of diffusion tensor data. Signal Processing. 2007;87:250–262.
Fletcher P, Joshi S, Lu C, Pizer S. Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Transactions on Medical Imaging. 2004;23:995–1005. doi: 10.1109/TMI.2004.831793.
Grenander U, Miller MI. Pattern Theory: From Representation to Inference. Oxford University Press; 2007.
Hall P, Marron JS, Park BU. Smoothed cross-validation. Probability Theory and Related Fields. 1992;92:1–20.
Hardle W, Hall P, Marron JS. Regression smoothing parameters that are not far from their optimum. Journal of the American Statistical Association. 1992;87:227–233.
Jones M, Marron J, Sheather S. A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association. 1996;91:401–407.
Jupp PE, Kent JT. Fitting smooth paths to spherical data. Applied Statistics. 1987;36:34–46.
Kim PT, Richards DS. Deconvolution density estimation on spaces of positive definite symmetric matrices. IMS Lecture Notes Monograph Series. A Festschrift of Tom Hettmansperger. 2010.
Kume A, Dryden IL, Le H. Shape-space smoothing splines for planar landmark data. Biometrika. 2007;94:513–528.
Lang S. Fundamentals of Differential Geometry. Graduate Texts in Mathematics, Vol. 191. New York: Springer Verlag; 1999.
Liang F. Evolutionary stochastic approximation Monte Carlo for global optimization. Statistics and Computing. 2010. In press.
Park BU, Marron JS. Comparison of data-driven bandwidth selectors. Journal of the American Statistical Association. 1990;85:66–72.
Pasternak O, Sochen N, Basser PJ. The effect of metric selection on the analysis of diffusion tensor MRI data. NeuroImage. 2010;49:2190–2204. doi: 10.1016/j.neuroimage.2009.10.071.
Pennec X, Fillard P, Ayache N. A Riemannian framework for tensor computing. International Journal of Computer Vision. 2006;66:41–66.
Pourahmadi M. Maximum likelihood estimation of generalized linear models for multivariate normal covariance matrix. Biometrika. 2000;87:425–435.
Prentice MJ. Fitting smooth paths to rotation data. Journal of the Royal Statistical Society Series C (Applied Statistics). 1987;36:325–331.
Rice J. Bandwidth choice for nonparametric regression. Annals of Statistics. 1984;12:1215–1230.
Sakamoto Y, Ishiguro M, Kitagawa G. Akaike Information Criterion Statistics (Mathematics and its Applications). Springer; 1999.
Sangalli L, Secchi P, Vantini S, Veneziani A. Efficient estimation of three-dimensional curves and their derivatives by free-knot regression splines, applied to the analysis of inner carotid artery centrelines. Journal of the Royal Statistical Society Series C (Applied Statistics). 2009;58:285–306.
Schwartzman A. Random ellipsoids and false discovery rates: Statistics for diffusion tensor imaging data. PhD thesis. Stanford University; 2006.
Terras A. Harmonic Analysis on Symmetric Spaces and Applications II. Berlin, Heidelberg and New York: Springer-Verlag; 1988.
Wand MP, Jones MC. Kernel Smoothing. London: Chapman and Hall; 1995.
Zhu HT, Chen YS, Ibrahim JG, Li YM, Lin WL. Intrinsic regression models for positive-definite matrices with applications to diffusion tensor imaging. Journal of the American Statistical Association. 2009;104:1203–1212. doi: 10.1198/jasa.2009.tm08096.
Zhu HT, Styner M, Tang NS, Liu ZX, Lin WL, Gilmore J. FRATS: Functional regression analysis of DTI tract statistics. IEEE Transactions on Medical Imaging. 2010;29:1039–1049. doi: 10.1109/TMI.2010.2040625.
Zhu HT, Zhang HP, Ibrahim JG, Peterson BG. Statistical analysis of diffusion tensors in diffusion-weighted magnetic resonance image data (with discussion). Journal of the American Statistical Association. 2007;102:1085–1102.


