Abstract
Manifold-valued data arise frequently in medical imaging, surface modeling, computational biology, and computer vision, among many other areas. The aim of this paper is to introduce a conditional local distance correlation measure for characterizing a nonlinear association between manifold-valued data, denoted by X, and a set of variables (e.g., diagnosis), denoted by Y, conditional on another set of variables (e.g., gender and age), denoted by Z. Our nonlinear association measure is based solely on the distances of the metric spaces in which X, Y, and Z reside, so it avoids both the specification of any parametric distribution or link function and the projection of data onto local tangent planes. It extends easily to the case in which both X and Y are manifold-valued data. We develop a computationally fast estimation procedure for this nonlinear association measure. Moreover, we use a bootstrap method to approximate its null distribution and compute a p-value for testing the key hypothesis of conditional independence. Simulation studies and a real data analysis are used to evaluate the finite sample properties of our methods.
Keywords: Manifold-valued, Local distance correlation, Shape statistics
1 Introduction
Manifold-valued data frequently arise in many domains, such as medical imaging, computational biology, and computer vision, among many others [2, 23, 28, 11, 19]. Examples of manifold-valued data in medical imaging analysis include the Grassmann manifold, planar shapes, matrix Lie groups, deformation fields, symmetric positive definite (SPD) matrices, and the shape representation of cortical and subcortical structures. Most manifold-valued objects are inherently nonlinear and high-dimensional (or even infinite-dimensional), so the analysis of these complex objects presents many mathematical and computational challenges.
Motivated by shape analysis, the aim of this paper is to measure a linear/nonlinear association between manifold-valued data (e.g., a shape representation) and a random vector/variable (e.g., diagnosis), while controlling for another random vector (e.g., age). Specifically, consider n independent observations {(Xi, Yi, Zi)}1≤i≤n, where Xi, Yi, and Zi are elements of metric spaces 𝒳, 𝒴, and 𝒵, respectively. In traditional statistics, these metric spaces are Euclidean spaces of arbitrary dimension. Correlation and regression analyses are the fundamental statistical techniques for quantifying the degree of association between X and Y, with or without the effect of a set of controlling random variables Z removed. For instance, Pearson correlation and its multivariate extension, canonical correlation analysis (CCA), are powerful tools for measuring the degree of linear association between X and Y. Moreover, partial correlation measures the degree of linear association between two random variables while controlling for a random vector. Alternatively, one may fit a regression with Yi as the response and both Xi and Zi as covariates, such that Yi = β0 + βxT Xi + βzT Zi + εi, where β0, βx, and βz are regression coefficients and the εi are measurement errors.
Generalizations of correlation and regression analyses to manifold-valued data have recently gained popularity. Most existing methods for manifold-valued data focus primarily on their mean and variation [5, 6, 10, 12]. Some nonparametric methods were subsequently developed for density estimation of manifold-valued data [3, 19, 21]. Recently, a Riemannian CCA model was proposed in [14] to measure an intrinsically linear association between manifold-valued data and a random vector (or between two manifold-valued objects). Furthermore, various intrinsic regression models have been developed for manifold-valued data [13, 3, 4, 16, 24, 29, 7, 2, 19, 10, 9]. Most of these regression methods require specifying a link function, projecting manifold-valued data onto local tangent planes to compute residuals, and transporting all residuals to a common space [7].
However, when Xi and/or Yi are manifold-valued data, little has been done on the analysis of X and Y while controlling for Z, due to at least two major challenges. First, when the dimension of X is relatively high, it is computationally challenging to optimize the objective function for the regression analysis of X and Y. Such an objective function is generally nonconvex and has a large number of parameters; in particular, the standard gradient-based optimization methods used in the literature depend strongly on the starting values of the unknown parameters. Second, most intrinsic regression models for manifold-valued data require the specification of a link function (e.g., a geodesic link), but it is conceptually challenging to choose an appropriate link function that provides good fit to a given data set. Due to these challenges, it is difficult to carry out further statistical inference (e.g., hypothesis testing).
We propose a conditional local distance correlation to measure the nonlinear association between X and Y, while controlling for Z. Since this distance correlation measure requires only the specification of the distances on the metric spaces 𝒳, 𝒴, and 𝒵, it is applicable even when both X and Y are manifold-valued data lying in different spaces. It also enjoys four major advantages. First, it avoids the optimization of a complex objective function, since the empirical distance correlation is a function of the pairwise distances between sample points. Second, it avoids the projection of manifold-valued data onto local tangent planes. Third, it has high statistical power for detecting complex and unknown nonlinear relationships between X and Y. Fourth, it is easy to make statistical inference on the nonlinear association between X and Y, while controlling for Z.
2 Methods
2.1 Conditional local distance correlation
We first review distance correlation, a measure for characterizing statistical dependence between two random variables or two random vectors of arbitrary dimensions [25]. In [15], distance correlation was further extended to metric spaces of strong negative type. Distance correlation, as an extension of Pearson correlation, has several important properties. The first and most important one is that it is zero if and only if the two random vectors are independent. The second is its computational simplicity, since the empirical distance correlation is a function of the pairwise distances between sample points.
We introduce a conditional local distance correlation to measure the nonlinear association between X and Y, while controlling for Z, when 𝒳, 𝒴, and 𝒵 are metric spaces. Let dX(·, ·), dY(·, ·), and dZ(·, ·) be, respectively, the metrics of 𝒳, 𝒴, and 𝒵. Let M(𝒳 | Z) (or M(𝒴 | Z)) denote the set of finite conditional probability measures on 𝒳 (or 𝒴) given Z. We say that μ ∈ M(𝒳 | Z) has a finite first moment if ∫ dX(o, x) dμ(x | z) < ∞ for some o ∈ 𝒳. Similarly, we can define ν ∈ M(𝒴 | Z). Define aμ(x | z) := ∫ dX(x, x′) dμ(x′ | z) and DX(μ | z) := ∫∫ dX(x, x′) dμ(x′ | z) dμ(x | z), which are finite when μ ∈ M(𝒳 | Z) has a finite first moment. Analogously, we define aν(y | z) and DY(ν | z) for ν ∈ M(𝒴 | Z).
Definition 1
The local distance covariance between random processes X and Y with finite first moments given Z is defined as the square root of

LDV(X, Y | z)² = E[{dX(X, X′) − aμ(X | z) − aμ(X′ | z) + DX(μ | z)}{dY(Y, Y′) − aν(Y | z) − aν(Y′ | z) + DY(ν | z)} | Z = z],

where X′ and Y′ are independent copies of X and Y given Z = z, respectively, and μ and ν denote the conditional distributions of X and Y given Z = z.
By setting X = Y, we obtain the local distance variance as LDV(X | z)² = LDV(X, X | z)².
Definition 2
The local distance correlation between random processes X and Y with finite first moments given Z is defined as the square root of

LDC(X, Y | z)² = LDV(X, Y | z)² / √{LDV(X | z)² LDV(Y | z)²}

if LDV(X | z)² LDV(Y | z)² > 0, or 0 otherwise.
Under the condition that the metric spaces are of strong negative type, it can be shown that LDC(X, Y | z) = 0 if and only if X and Y are conditionally independent given Z = z. This property distinguishes the local distance correlation from the existing methods in the literature [14, 13, 3, 4, 16, 24, 29, 2, 19, 7]. As shown below, LDC(X, Y | z) is easy to estimate, and its estimate can be used for statistical inference.
2.2 Estimation procedure
The next interesting question is how to estimate the local distance covariance and correlation. The local distance dependence statistics are defined as follows. Let (Xi, Yi, Zi), i = 1,…,n, be a random sample of n independent and identically distributed observations from the joint distribution of (X, Y, Z). We compute two distance matrices (akl) and (bkl), where akl = dX(Xk, Xl) and bkl = dY(Yk, Yl). For notational simplicity, it is assumed that 𝒵 = Rr. We consider a kernel function K(·) on Rr and a bandwidth h satisfying two regularity conditions:
- (C1) K(·) is a bounded, symmetric probability density function on Rr with finite second moments;
- (C2) hr → 0 and nhr → ∞ as n → ∞.
Let ωh,k(Z) = Kh(Z − Zk), ωh,kl(Z) = Kh(Z − Zk)Kh(Z − Zl), ωh,ijkl(Z) = Kh(Z − Zi)Kh(Z − Zj)Kh(Z − Zk)Kh(Z − Zl), and ωh(Z) = Σk ωh,k(Z), where Kh(u) = K(u/h)/hr. We then introduce Akl(Z; h) as follows:

Akl(Z; h) = akl − āk·(Z; h) − ā·l(Z; h) + ā··(Z; h),

where akl = dX(Xk, Xl), āk·(Z; h) = Σl ωh,l(Z) akl / ωh(Z), ā·l(Z; h) = Σk ωh,k(Z) akl / ωh(Z), and ā··(Z; h) = Σk Σl ωh,kl(Z) akl / ωh(Z)².
Similarly, we can define bkl = dY(Yk, Yl) and Bkl(Z; h). Then, the empirical local distance covariance can be defined as

LDVn(X, Y | Z)² = Σk Σl ωh,kl(Z) Akl(Z; h) Bkl(Z; h) / Σk Σl ωh,kl(Z).
We define the empirical local distance correlation as

LDCn(X, Y | Z)² = LDVn(X, Y | Z)² / √{LDVn(X | Z)² LDVn(Y | Z)²}

if LDVn(X | Z)² LDVn(Y | Z)² > 0, and LDCn(X, Y | Z)² = 0 otherwise, where LDVn(X | Z)² = LDVn(X, X | Z)² and LDVn(Y | Z)² = LDVn(Y, Y | Z)².
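To make the estimation procedure concrete, the following Python sketch (an illustration under the notation above and our own reconstruction, not the authors' code; all function names are ours) computes the kernel-weighted double centering and the resulting empirical local distance covariance and correlation at a fixed location z from precomputed distance matrices, using a Gaussian kernel.

```python
import numpy as np

def gaussian_kernel(u, h):
    """Product Gaussian kernel Kh(u) = K(u/h) / h^r for u in R^r."""
    u = np.asarray(u, dtype=float)
    if u.ndim == 1:
        u = u[:, None]
    r = u.shape[1]
    return np.exp(-0.5 * np.sum((u / h) ** 2, axis=1)) / ((2.0 * np.pi) ** (r / 2) * h ** r)

def local_double_center(D, w):
    """Kernel-weighted double centering: Akl = akl - abar_k. - abar_.l + abar_.."""
    W = w.sum()
    row = (D * w[None, :]).sum(axis=1) / W      # weighted row means
    col = (D * w[:, None]).sum(axis=0) / W      # weighted column means
    grand = (w[:, None] * D * w[None, :]).sum() / W ** 2
    return D - row[:, None] - col[None, :] + grand

def local_distance_covariance(DX, DY, Z, z, h):
    """Empirical local distance covariance LDVn(X, Y | z)^2 from distance matrices DX, DY."""
    w = gaussian_kernel(Z - z, h)               # omega_{h,k}(z)
    ww = np.outer(w, w)                         # omega_{h,kl}(z)
    A = local_double_center(DX, w)
    B = local_double_center(DY, w)
    return (ww * A * B).sum() / ww.sum()

def local_distance_correlation(DX, DY, Z, z, h):
    """Empirical local distance correlation LDCn(X, Y | z)."""
    cov = local_distance_covariance(DX, DY, Z, z, h)
    vx = local_distance_covariance(DX, DX, Z, z, h)
    vy = local_distance_covariance(DY, DY, Z, z, h)
    denom = np.sqrt(vx * vy)
    return np.sqrt(max(cov, 0.0) / denom) if denom > 0 else 0.0
```

Here DX and DY would be built with the metrics of 𝒳 and 𝒴 (e.g., the geodesic distance arccos(XkT Xl) when X lies on a sphere), and small negative covariances caused by sampling variability are truncated at zero before taking the square root.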
2.3 Inference procedure
The next question is how to make statistical inference based on the LDV and LDC estimates. Our inference procedure consists of carrying out hypothesis tests and constructing confidence intervals.
To test the dependence of X and Y at a fixed location Z = z, we formulate the hypotheses as

H0: LDC(X, Y | z) = 0 versus H1: LDC(X, Y | z) > 0. (1)
We calculate the p-value for (1) by using a local bootstrap procedure [18, 25, 26] as follows:
- (i) Generate a local bootstrap sample X*1,…,X*n from {X1,…,Xn}, where X*i is drawn with the probability P(X*i = Xj) = Kh(Zi − Zj) / Σk Kh(Zi − Zk) for j = 1,…,n. Then, we compute LDC*n(X*, Y | z) by using the local bootstrap sample {(X*i, Yi, Zi): i = 1,…,n}.
- (ii) Select a resampling number S, say 1,000. Repeat Step (i) S times and obtain LDC*n,s(X*, Y | z) for s = 1,…,S. The p-value of the test is then given by p̂ = (1/S) Σs 1{LDC*n,s(X*, Y | z) ≥ LDCn(X, Y | z)}.
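A minimal sketch of this local bootstrap test is given below, reusing the helpers from the sketch in Section 2.2; the resampling probabilities and the p-value formula follow the reconstruction above and are illustrative rather than the authors' exact implementation.

```python
import numpy as np

def local_bootstrap_pvalue(DX, DY, Z, z, h, S=1000, rng=None):
    """Local bootstrap p-value for H0: LDC(X, Y | z) = 0 at a fixed location z."""
    rng = np.random.default_rng(rng)
    n = DX.shape[0]
    obs = local_distance_correlation(DX, DY, Z, z, h)
    # Resampling probabilities P(X*_i = X_j) proportional to Kh(Z_i - Z_j):
    # this breaks the link between X and Y while preserving the X-Z relationship.
    P = np.stack([gaussian_kernel(Z - Z[i], h) for i in range(n)])
    P = P / P.sum(axis=1, keepdims=True)
    stats = np.empty(S)
    for s in range(S):
        idx = np.array([rng.choice(n, p=P[i]) for i in range(n)])
        DX_star = DX[np.ix_(idx, idx)]          # distances among the resampled X*
        stats[s] = local_distance_correlation(DX_star, DY, Z, z, h)
    return np.mean(stats >= obs)
```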
Given a confidence level α, we construct a simultaneous confidence band [L(z), U(z)] for LDC(X, Y | z), where L(z) and U(z) are the lower and upper limits of the simultaneous confidence band, respectively. We use a bootstrap method to approximate these bounds:
- (I) Resample (X*k, Y*k, Z*k) for k = 1,…,n from {(Xk, Yk, Zk): k = 1,…,n}, where the triple (Xj, Yj, Zj) is selected with the probability Kh(z − Zj) / Σk Kh(z − Zk) for j = 1,…,n, and then compute LDC*n(X*, Y* | z) from the bootstrap sample {(X*k, Y*k, Z*k): k = 1,…,n}.
- (II) Repeat Step (I) S times and obtain LDC*n,s(X*, Y* | z) for s = 1,…,S. The simultaneous confidence band is then given by the quantiles at α and 1 − α of {LDC*n,s(X*, Y* | z): s = 1,…,S}.
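A corresponding sketch for the bootstrap confidence band at a fixed z is shown below, again assuming the helpers from Section 2.2 and the locally weighted resampling of full triples reconstructed above; it is an illustration rather than the authors' code.

```python
import numpy as np

def local_bootstrap_band(DX, DY, Z, z, h, alpha=0.05, S=1000, rng=None):
    """Bootstrap (alpha, 1 - alpha) quantile band for LDC(X, Y | z)."""
    rng = np.random.default_rng(rng)
    n = DX.shape[0]
    p = gaussian_kernel(Z - z, h)               # resampling weights around z
    p = p / p.sum()
    stats = np.empty(S)
    for s in range(S):
        idx = rng.choice(n, size=n, replace=True, p=p)
        stats[s] = local_distance_correlation(DX[np.ix_(idx, idx)],
                                              DY[np.ix_(idx, idx)],
                                              Z[idx], z, h)
    return np.quantile(stats, [alpha, 1.0 - alpha])
```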
Like many other smoothing-based methods, the performance of the proposed method depends on the bandwidth h. It is widely acknowledged that the optimal h for nonparametric estimation is generally not optimal for testing, and selecting h to achieve optimal statistical power for (1) remains an open problem. In practice, h cannot be too large, since the conditional local distance covariance then tends toward the unconditional one; an inappropriately large bandwidth h will therefore inflate the false positive rate when X and Y are dependent but conditionally independent given Z. For simplicity, we choose the bandwidth h to eliminate the effect of Z on X and Y to the maximum extent. That is, h is chosen to minimize the mean of the local distance covariance over the locations Z = z. The intuition for this choice comes from partial correlation, whose aim is to eliminate the effect of Z on X and Y by regressing X and Y on Z.
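One possible implementation of this heuristic is sketched below (our own illustration, with a hypothetical grid of candidate bandwidths): the bandwidth is chosen by a grid search that minimizes the average empirical local distance covariance over the observed values of Z, reusing local_distance_covariance from the sketch in Section 2.2.

```python
import numpy as np

def select_bandwidth(DX, DY, Z, candidate_hs, z_grid=None):
    """Pick the bandwidth minimizing the mean local distance covariance over z."""
    z_grid = Z if z_grid is None else z_grid
    mean_ldv = []
    for h in candidate_hs:
        vals = [local_distance_covariance(DX, DY, Z, z, h) for z in z_grid]
        mean_ldv.append(np.mean(vals))
    return candidate_hs[int(np.argmin(mean_ldv))]
```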
3 Numerical Studies
3.1 Simulations
We use two simulation studies to examine the finite sample performance of the LDC. We consider directional data on the unit sphere Sp−1 in Rp for both X and Y. Under the canonical Riemannian metric on Sp−1 induced by the canonical inner product on Rp, the geodesic distance between any two points X and X′ is dX(X, X′) = arccos(XT X′). The sample size is set to n = 300 or 400 in order to examine the finite sample performance of the local distance correlation estimate. We calculate the rejection rate at the significance level α = 0.05 with S = 200. Moreover, 200 replications are used for each simulation setting.
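For reference, the pairwise geodesic distance matrix used throughout the simulations can be computed directly from the Gram matrix of the points; the short helper below is our own and clips the inner products to guard against floating-point values just outside [−1, 1].

```python
import numpy as np

def sphere_geodesic_distances(X):
    """Pairwise geodesic distances arccos(Xk^T Xl) for unit vectors in the rows of X."""
    G = np.clip(X @ X.T, -1.0, 1.0)     # Gram matrix, clipped for numerical safety
    return np.arccos(G)
```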
Simulation 1
We set p = 3 and consider the spherical coordinates of S2, denoted as (r, θ, ϕ), where r ∈ [0, ∞) is the radial distance, θ ∈ [0, 2π] is the azimuth, and ϕ ∈ [0, π] is the inclination (or elevation). The simulated datasets {(Xi, Yi, Zi) ∈ S2×S2×R: i = 1,…,n} were generated as follows:
- (I.1) the spherical angles of Xi and Yi and the covariate Zi are generated independently of one another;
- (I.2) the spherical angles of both Xi and Yi are generated as functions of Zi plus independent noise;
- (I.3) the spherical angles of Yi are generated as functions of those of Xi and of Zi plus independent noise;
- (I.4) half of the samples (Z ∼ U(−π+0.5, 0)) are generated from (I.2) and the other half (Z ∼ U(0, π−0.5)) are from (I.3);
where the remaining angle variables were independently simulated from the uniform distribution U(−π, π) and the noise terms εi were independently simulated from the normal distribution N(0, 0.2). Therefore, X and Y are independent in (I.1), whereas they are dependent in (I.2) and (I.3). In contrast, X and Y are conditionally independent given Z in (I.1) and (I.2), whereas they are conditionally dependent in (I.3). For (I.4), the first half of the samples are conditionally independent and the second half are conditionally dependent given Z.
Figure 1 presents the estimated LDCs between X and Y given Z and their p-values. The Type I error rates based on the local bootstrap procedure are well controlled at the prespecified significance level, while the power for rejecting the null hypothesis is high. As the sample size n increases, the simultaneous confidence bands become narrower and the value of the local distance correlation gets closer to zero under true conditional independence. We use the function e.cp3o in the R package ecp to detect the change point of the local distance correlation in (I.4). The estimated change point is very close to the true change point.
Simulation 2
We consider the von Mises-Fisher distribution, which models spherical or hyper-spherical data. A p-dimensional unit random vector with mean direction μ and concentration parameter κ is said to follow the p-variate von Mises-Fisher distribution Mp(μ, κ). We set p = 10 and simulated the mean directions from the multivariate normal distribution N(0, I10), normalized to unit length. The simulated datasets were generated as follows:
- (II.1) Xi ∼ M10(μx, 15), Yi ∼ M10(μy, 15), and Zi ∼ N(0, 0.5);
- (II.2) Xi and Yi are both generated using a common term ξi = (u1Zi,…, u10Zi), where Zi ∼ U(−1, 1) and each uj takes the values ±1 with equal probability; Xi and Yi are then projected onto the unit sphere;
- (II.3) Xi ∼ M10(μx, 15), Yi = RXi, and Zi ∼ N(0, 0.5), where R is the rotation matrix that rotates the direction μx to μy;
- (II.4) half of the samples (Z ∼ U(−5, 0)) were generated from (II.1) and the other half (Z ∼ U(0, 5)) were from (II.3).
Similar to Simulation 1, X and Y are independent in (II.1), whereas they are dependent in (II.2) and (II.3). However, X and Y are conditionally independent given Z in (II.1) and (II.2), while they are conditionally dependent given Z in (II.3). The first half of samples are conditionally independent and the second half of samples are conditionally dependent given Z in (II.4). Inspecting Figure 2 reveals that the proposed methods work well.
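For illustration, von Mises-Fisher samples as in settings (II.1) and (II.3) can be generated with SciPy, whose scipy.stats.vonmises_fisher distribution is available in recent releases (roughly 1.11 onward); the sketch below mimics setting (II.1) and is not the authors' simulation code.

```python
import numpy as np
from scipy.stats import vonmises_fisher

rng = np.random.default_rng(0)
p, n, kappa = 10, 300, 15

# Random mean directions, normalized to the unit sphere.
mu_x = rng.standard_normal(p)
mu_x /= np.linalg.norm(mu_x)
mu_y = rng.standard_normal(p)
mu_y /= np.linalg.norm(mu_y)

# Setting (II.1): X and Y are independent von Mises-Fisher samples, Z is unrelated to both.
X = vonmises_fisher(mu_x, kappa).rvs(size=n, random_state=rng)
Y = vonmises_fisher(mu_y, kappa).rvs(size=n, random_state=rng)
Z = rng.normal(0.0, np.sqrt(0.5), size=n)   # N(0, 0.5), with 0.5 taken as the variance
```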
3.2 Real Data Analysis
Alzheimer disease (AD) is a disorder of cognitive and behavioral impairment that markedly interferes with social and occupational functioning. It is an irreversible and progressive brain disease that slowly destroys memory and thinking skills, and eventually even the ability to carry out the simplest tasks. AD affects almost 50% of those over the age of 85 and is the sixth leading cause of death in the United States. The corpus callosum (CC), the largest white matter structure in the brain, has been a structure of high interest in many neuroimaging studies of neurodevelopmental pathology. It contains homotopic and heterotopic interhemispheric connections and is essential for communication between the two cerebral hemispheres. Individual differences in the CC and their possible implications for interhemispheric connectivity have been investigated over the last several decades [27, 20].
We consider the CC contour data from the ADNI1 study. We processed the CC shape data for each subject as follows: we used the FreeSurfer package [8] to process each T1-weighted MRI, and the midsagittal CC area was calculated with the CCseg package.
We are interested in characterizing the change of the CC contour shape and its association with several key covariates of interest, such as age and diagnosis. We focused on n = 409 subjects, consisting of 223 healthy controls (HCs) and 186 AD patients, at baseline of the ADNI1 study. Each subject has a CC planar contour Yi with 50 landmarks and nine covariates, including gender, age, handedness, marital status (three indicators: widowed, divorced, and never married), education length, retirement, and diagnosis. The demographic information is shown in Table 1. We treat the CC planar contour Yi as a manifold-valued response in the Kendall planar shape space and all covariates as lying in Euclidean space.
Table 1. Demographic information of the study subjects.

Disease status | Number of subjects | Age in years: range (mean) | Females/males
---|---|---|---
Healthy control | 223 | 62–90 (76.25) | 107/116
AD | 186 | 55–92 (75.42) | 88/98
The first scientific question of interest is to characterize the relationship between the CC shape data and each of the nine covariates. Table 2 presents the distance correlation statistics for correlating the CC data with each of the nine covariates. It reveals that the shape of the CC planar contour is significantly associated with gender, education length, age, and AD diagnosis at the significance level α = 0.05. This may indicate that gender, age, and AD diagnosis are among the most influential factors for the CC planar contour, which agrees with [1, 17, 22].
Table 2. Distance correlation (Dcor) between the CC contour data and each covariate, with the corresponding p-values.

Covariate | Dcor | p-value
---|---|---
Gender | 0.186 | 0.001
Handedness | 0.094 | 0.420
Marital status | 0.108 | 0.383
Education length | 0.166 | 0.010
Retirement | 0.108 | 0.165
Age | 0.245 | 0.001
Diagnosis | 0.190 | 0.001
The second scientific question of interest is to characterize the relationship between the CC shape data and AD diagnosis given age. Figure 3 presents the conditional local distance correlation between the CC planar contour and AD diagnosis given age, as a function of age. As age increases, the value of the conditional local distance correlation increases, implying that the dependence between diagnosis and CC shape strengthens with age. Figure 4 presents the mean age-dependent CC trajectories for healthy controls and AD patients within each gender group. There is a clear difference in shape between the AD and healthy groups in both males and females: the splenium appears thinner and the isthmus more rounded in subjects with AD than in healthy controls.
4 Conclusion
We proposed a local distance correlation for modeling data with manifold-valued responses and applied the method to several settings, such as responses restricted to the sphere and to shape spaces. The proposed method can detect complex nonlinear relationships while retaining computational simplicity. In future work, we will further investigate the theoretical properties of the new method and other applications in imaging analysis.
References
- 1. Allen LS, Richey M, Chai YM, Gorski RA. Sex differences in the corpus callosum of the living human being. The Journal of Neuroscience. 1991;11(4):933–942. doi: 10.1523/JNEUROSCI.11-04-00933.1991.
- 2. Banerjee M, Chakraborty R, Ofori E, Okun MS, Viallancourt DE, Vemuri BC. A nonlinear regression technique for manifold valued data with applications to medical image analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:4424–4432.
- 3. Bhattacharya A, Dunson DB. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes. Biometrika. 2010;97(4):851–865. doi: 10.1093/biomet/asq044.
- 4. Bhattacharya A, Dunson DB. Nonparametric Bayes classification and hypothesis testing on manifolds. Journal of Multivariate Analysis. 2012;111:1. doi: 10.1016/j.jmva.2012.02.020.
- 5. Bhattacharya R, Patrangenaru V. Large sample theory of intrinsic and extrinsic sample means on manifolds I. Annals of Statistics. 2003;31(1):1–29.
- 6. Bhattacharya R, Patrangenaru V. Large sample theory of intrinsic and extrinsic sample means on manifolds II. Annals of Statistics. 2005;33(3):1225–1259.
- 7. Cornea E, Zhu H, Kim PT, Ibrahim JG. Regression models on Riemannian symmetric spaces. Journal of the Royal Statistical Society, Series B (Statistical Methodology). 2016. doi: 10.1111/rssb.12169.
- 8. Dale AM, Fischl B, Sereno MI. Cortical surface-based analysis. I. Segmentation and surface reconstruction. NeuroImage. 1999;9(2):179–194. doi: 10.1006/nimg.1998.0395.
- 9. Davis B, Fletcher PT, Bullitt E, Joshi S. Population shape regression from random design data. Proceedings of the IEEE International Conference on Computer Vision. 2007:1–7.
- 10. Fletcher PT, Lu C, Pizer SM, Joshi S. Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Transactions on Medical Imaging. 2004;23(8):995–1005. doi: 10.1109/TMI.2004.831793.
- 11. Grenander U, Miller MI. Pattern Theory: From Representation to Inference. Oxford University Press; 2007.
- 12. Huckemann S, Hotz T, Munk A. Intrinsic MANOVA for Riemannian manifolds with an application to Kendall's space of planar shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2010;32(4):593–603. doi: 10.1109/TPAMI.2009.117.
- 13. Kent JT. The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B (Methodological). 1982:71–80.
- 14. Kim HJ, Adluru N, Bendlin BB, Johnson SC, Vemuri BC, Singh V. Canonical correlation analysis on Riemannian manifolds and its applications. In: European Conference on Computer Vision. Springer; 2014. pp. 251–267.
- 15. Lyons R. Distance covariance in metric spaces. The Annals of Probability. 2013;41(5):3284–3305.
- 16. Machado L, Leite FS, Krakowski K. Higher-order smoothing splines versus least squares problems on Riemannian manifolds. Journal of Dynamical and Control Systems. 2010;16(1):121–148.
- 17. Ota M, Obata T, Akine Y, Ito H, Ikehira H, Asada T, Suhara T. Age-related degeneration of corpus callosum measured with diffusion tensor imaging. NeuroImage. 2006;31(4):1445–1452. doi: 10.1016/j.neuroimage.2006.02.008.
- 18. Paparoditis E, Politis D. The local bootstrap for kernel estimators under general dependence conditions. Annals of the Institute of Statistical Mathematics. 2000;52(1):139–159.
- 19. Patrangenaru V, Ellingson L. Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis. CRC Press; 2015.
- 20. Paul LK, Brown WS, Adolphs R, Tyszka JM, Richards LJ, Mukherjee P, Sherr EH. Agenesis of the corpus callosum: genetic, developmental and functional aspects of connectivity. Nature Reviews Neuroscience. 2007;8(4):287–299. doi: 10.1038/nrn2107.
- 21. Pelletier B. Kernel density estimation on Riemannian manifolds. Statistics & Probability Letters. 2005;73(3):297–304.
- 22. Shuyu L, Fang P, Xiangqi H, Li D, Tianzi J. Shape analysis of the corpus callosum in Alzheimer's disease. 2007:1095–1098.
- 23. Srivastava A, Klassen EP. Functional and Shape Data Analysis. Springer; 2016.
- 24. Su J, Dryden IL, Klassen E, Le H, Srivastava A. Fitting smoothing splines to time-indexed, noisy points on nonlinear manifolds. Image and Vision Computing. 2012;30(6):428–442.
- 25. Székely G, Rizzo M, Bakirov N. Measuring and testing dependence by correlation of distances. The Annals of Statistics. 2007;35(6):2769–2794.
- 26. Wang X, Pan W, Hu W, Tian Y, Zhang H. Conditional distance correlation. Journal of the American Statistical Association. 2015;110(512):1726. doi: 10.1080/01621459.2014.993081.
- 27. Witelson SF. Hand and sex differences in the isthmus and genu of the human corpus callosum: a postmortem morphological study. Brain. 1989;112(3):799–835. doi: 10.1093/brain/112.3.799.
- 28. Younes L. Shapes and Diffeomorphisms. Springer; 2010.
- 29. Yuan Y, Zhu H, Lin W, Marron JS. Local polynomial regression for symmetric positive definite matrices. Journal of the Royal Statistical Society, Series B (Statistical Methodology). 2012;74(4):697–719. doi: 10.1111/j.1467-9868.2011.01022.x.