Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2017 Oct 1.
Published in final edited form as: Med Image Comput Comput Assist Interv. 2016 Oct 2;9902:166–173. doi: 10.1007/978-3-319-46726-9_20

Low-Dimensional Statistics of Anatomical Variability via Compact Representation of Image Deformations

Miaomiao Zhang 1,, William M Wells III 1,2, Polina Golland 1
PMCID: PMC5484008  NIHMSID: NIHMS850833  PMID: 28664199

Abstract

Using image-based descriptors to investigate clinical hypotheses and therapeutic implications is challenging due to the notorious “curse of dimensionality” coupled with a small sample size. In this paper, we present a low-dimensional analysis of anatomical shape variability in the space of diffeomorphisms and demonstrate its benefits for clinical studies. To combat the high dimensionality of the deformation descriptors, we develop a probabilistic model of principal geodesic analysis in a bandlimited low-dimensional space that still captures the underlying variability of image data. We demonstrate the performance of our model on a set of 3D brain MRI scans from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Our model yields a more compact representation of group variation at substantially lower computational cost than models based on the high-dimensional state-of-the-art approaches such as tangent space PCA (TPCA) and probabilistic principal geodesic analysis (PPGA).

1 Introduction

Shape analysis is critical for image-based studies of disease as it offers characterizations of anatomical variability between different groups, or in the course of a disease. Analysis of shape changes can provide new insights into the nature of the disease and support treatment. For example, brain atrophy has been identified in patients affected by neuro-degenerative diseases such as Parkinson’s, Huntington’s, and Alzheimer’s [5,10]. When combined with other clinical information, characterization of shape differences between clinical cohorts and a healthy population can be useful in predicting disease progression. Landmarks [3], distance transforms [6,9], and medial cores [11] are examples of image-based shape descriptors often used in medical image analysis. Most of these descriptors require informative feature points or a segmented binary image as input to the shape extraction procedure. In this paper, we focus on diffeomorphic transformations estimated from full images as a way to represent shape in a group of images [12,14].

The high-dimensional nature of the data (e.g., a 1283 displacement grid as a shape descriptor for a 3D brain MRI) presents significant challenges for the statistical methods when extracting relevant latent structure from image transformations. The barriers for effective statistical analysis include (i) requiring greater computational resources and special programming techniques for statistical inference and (ii) numerous local minima. Two main ways of overcoming this problem via data dimensionality reduction have been recently proposed in the diffeomorphic setting. One is to perform statistical modeling of the transformations as a step that follows the estimation of deformations, for instance, by carrying out principal component analysis (PCA) in the tangent space of diffeomorphisms (TPCA) [14]. An empirical shape distribution can be constructed by using TPCA to estimate the intrinsic dimensionality of the diffeomorphic surface variation [12]. Later, a Bayesian model of shape variability was demonstrated to extract the principal modes after estimating a covariance matrix of transformations [7]. Alternatively, one could infer the principal modes of variation and transformations simultaneously. Principal geodesic analysis (PGA) generalized PCA to finite-dimensional manifolds and estimated the geodesic subspaces by minimizing the sum-of-squared geodesic distances to the data [4]. This enabled factor analysis of diffeomorphisms that treats data variability as a joint inference problem in a probabilistic principal geodesic analysis (PPGA) [17]. While these models were designed to find a concise low-dimensional space to represent the data, the estimation must be performed numerically on dense image grids in a high-dimensional space.

In contrast, we use the finite-dimensional representation of the tangent space of diffeomorphisms [18] to investigate shape variability using bandlimited velocity fields as a representation. We call this approach low-dimensional probabilistic principal geodesic analysis (LPPGA). We define a low-dimensional probabilistic framework for factor analysis in the context of diffeomorphic atlas building. Our model dramatically reduces the computational cost by employing a low-dimensional parametrization in the Fourier space. Furthermore, we enforce the orthogonality constraints on the principal modes, which is computationally intractable in high-dimensional models like PPGA [17]. We report estimated principal modes in the ADNI brain MRI dataset [8] and compare them with the results of TPCA and PPGA of diffeomorphisms in the full dimensional space. The experimental results show that the low-dimensional statistics encode the features of interest in the data, better capture the group variation and improve data interpretability. Moreover, our model requires much less computational resources.

2 Diffeomorphic Atlas Building with Geodesic Shooting

We first briefly review the mathematical background of diffeomorphic atlas building in the setting of large deformation diffeomorphic metric mapping (LDDMM) [1] with geodesic shooting [15,16].

We let J1,⋯, JN be the N input images that are assumed to be square integrable functions defined on d-dimensional torus domain Ω = ℝd/ℤd (JnL2(Ω, ℝ),n ∈ {1, ⋯ ,N}). We use I to denote the atlas template and ϕn to denote the deformation from template I to image Jn. The time-varying deformation ϕn(t, x) : t ∈ [0,1],x ∈ Ω is defined as the integral flow of time-varying velocity field vn(t, x) in a reproducing kernel Hilbert space V:

d/dtϕn(t,x)=vn(t,ϕn(t,x)).

The geodesic path of the deformation is uniquely determined by integrating the Euler-Poincaré differential equation (EPDiff) [18] with an initial condition of vn(t, x) at t = 0:

vnt=K[(Dvn)Tmn+Dmnvn+mndiv(vn)], (1)

where D is the Jacobian matrix and ÷ is the divergence operator. The operator K is the inverse of a symmetric, positive-definite differential operator :VV that maps a velocity field vnV to a momentum vector mnV* such that mn=vn and vn=Kmn. This process is known as geodesic shooting [15,16].

With a slight abuse of notation, we define ϕn = ϕn(1, ·), vn = vn(0, ·), allowing us to drop the time index in the subsequent derivations. Geodesic shooting (1) enables differentiation of the image differences with respect to the initial velocity field, leading to a gradient decent minimization of the energy function

E(vn,I)=n=1N12σ2JnIϕn1L22+(vn,vn), (2)

where σ2 is the image noise variance. In this paper, we use =(αΔ+e)c, where Δ is the discrete Laplacian operator, e is the identity matrix, c is a positive scalar controlling smoothness, and α is a positive regularity parameter. The notation (·,·) denotes the pairing of a momentum vector with a tangent vector, similar to an inner product.

It has been recently demonstrated that the initial velocity vn can be efficiently captured via a discrete low-dimensional bandlimited representation in the Fourier space [18]. We adopt this low-dimensional representation for statistical shape analysis.

3 Generative Model

We build our generative model in the discrete finite-dimensional space V that represents bandlimited velocity fields. Elements of this space vV are complexvalued vector fields in the Fourier domain that represent conjugate frequencies: v=v, where is the Fourier basis that maps from the frequency domain to the image domain.

Let Wp×q be a matrix in the Fourier space whose q columns (q < N) are orthonormal principal initial velocities in a low p-dimensional space (p ≪ d), Λ ∈ ℝq×q be a diagonal matrix of scale factors for the columns of W, and s ∈ ℝq be a vector that parameterizes the space of transformations. The initial velocity is therefore represented as v=WΛs in the low-dimensional space. Assuming i.i.d. Gaussian noise on image intensities, we obtain

p(Jn|sn;W,Λ,I,σ)=N(Jn;Iϕn1,σ2),, (3)

where ϕn is a deformation that corresponds to the initial velocity vn=WΛsn in the image space, that is, d/dtϕn=WΛsn, and N(·;μ,σ2) is a Gaussian distribution with mean μ and variance σ2.

The prior on the loading coefficients sn is the combination of a Gaussian distribution N(0,e) (e is the identity matrix) with a complex multivariate Gaussian distribution N(0,(WTΛ2W)1) that ensures the smoothness of the geodesic path. Similar to the operator, :VV is also a symmetric, positive definite operator that maps a complex tangent vector vV in the Fourier domain to its dual momentum vector mV. For a D1 × D2 × D3 grid, the operator value Ld1d2d3 at location (d1,d2,d3) is given by

Ld1d2d3=[2α(cos2πd1D1+cos2πd2D2+cos2πd3D33)+1]c,

and its inverse is Ld1d2d31=Kd1d2d3. Finally, we formulate the prior as

p(sn|W,Λ)=N(sn;0,(WTΛ2W)1+e). (4)

We now arrive at the posterior distribution of s1,⋯ sN

Qlogp(s1,,sn|J1,,JN;W,Λ,Ι,σ2)=n=1Nlogp(Jn|sn;W,Λ,I,σ)+logp(sn|W,Λ)+const.=n=1NJnIϕn1L222σ2snT(WTΛ2W+e)sn2dN2logσ+const. (5)

4 Inference

We use alternating gradient accent to maximize the posterior probability (5) with respect to the model parameters θ={W,Λ,Ι,σ2} and latent variables {s1,⋯,sN}.

By setting the derivative of Q with respect to I and σ to zero, we obtain closed-form updates for the atlas template I and noise variance σ2:

I=n=1NJnϕn|Dϕn|n=1N|Dϕn|,σ2=1dNn=1NJnIϕn1L22.

To estimate the principal initial velocity basis W, the scaling factor Λ, and the loading coefficients {sn}, we follow the derivations in [18] and first obtain the gradient of Q w.r.t. the initial velocity vn as follows:

  1. Forward integrate the geodesic evolution equation (1) to generate time-dependent diffeomorphic deformation σn (t,x).

  2. Compute the gradient vnQ at time point t = 1 as
    [vnQ]t=1=K[1σ2(JnIϕn1)(Iϕn1)+Lvn]. (6)
  3. Backward integrate the gradient (6) to t = 0 to obtain [vnQ]t=0.

After applying the chain rule, we have the gradient of Q for updating the loading factor sn:

snQ=ΛWT[vnQ]t=0sn.

The gradients of Q w.r.t. W,Λ are given as follows:

WQ=n=1N[vnQ]t=0snTΛ,ΛQ=n=1NWsnT[vnQ]t=0.

Unlike the PPGA model [17], we can readily enforce the mutual orthogonality constraint on the columns of W Here, we choose to employ Gram-Schmidt orthogonalization [2] on the column vectors of W in a complex inner product space.

5 Results

Data

To evaluate the effectiveness of the proposed low-dimensional principal geodesic analysis (LPPGA), we applied the algorithm to brain MRI scans of 90 subjects from the ADNI [8] study, aged 60 to 90. Fifty subjects have Alzheimer’s disease and the remaining 40 subjects are healthy controls. All MRIs have the same resolution 128 × 128 × 128 with the voxel size of 1.25 × 1.25 × 1.25mm3. All images underwent skull-stripping, downsampling, intensity normalization, bias field correction, and co-registration with affine transformations.

Experiments

We first estimate a full collection of principal modes q = 89 for our model, using α = 3.0, c = 3.0 for the operator with p = 163 dimensions of initial velocity v, similar to the settings used in pairwise diffeomorphic image registration [18]. The number of time steps for integration in geodesic shooting is set to 10. We initialize the atlas I to be the average of image intensities, Λ to be the identity matrix, sn to be the all-ones vector, and the initial velocity matrix W to be the principal components estimated by TPCA [14]. We then compare the results with PPGA [17] and TPCA on the same dataset. In order to conduct a fair comparison, we keep all the parameters including regularization and time steps for numerical integration fixed across the three algorithms. To evaluate the model stability, we randomly select 50 images out of 90 and rerun the entire experiment 50 times.

To investigate the ability of our model to capture anatomical variability, we use the loading coefficients s = {s1,⋯,sN} as a shape descriptor in a statistical study. The idea is to test the hypothesis that the principal modes estimated by our method are correlated significantly with clinical information such as mini-mental state examination (MMSE), Alzheimer’s Disease Assessment Scale (ADAS), and Clinical Dementia Rating (CDR). We focus on MMSE and fit it to a linear regression model using the loadings for all 90 subjects in the training dataset as predictors. Similar analysis is performed on the results of PPGA and TPCA.

Experimental Results

Figure 1 visualizes the first three modes of variation in this cohort by shooting the estimated atlas I along the initial velocities v=aiWiΛi (ai = {−2, −1, 0,1, 2},i = 1, 2, 3). We also show the log determinant of Jacobians at ai = 2 (regions of expansion in red and contraction in blue). The first mode of variation clearly shows that changes in ventricle size is the dominant source of variability in the brain shape. The algorithm estimates standard deviation of the image noise to be σ = 0.02.

Fig. 1.

Fig. 1

Top to bottom: first, second and third principal modes of brain shape variation evaluated for varying amounts of the corresponding principal mode, and log determinant of Jacobians at i. Coronal views are shown.

Figure 2 reports the cumulative variance explained by the model as a function of the model size. Our approach achieves higher representation accuracy than the two state-of-the-art baseline algorithms across the entire range of model sizes.

Fig. 2.

Fig. 2

Left: cumulative variance explained by principal modes estimated through our method (LPPGA) and baseline algorithms (PPGA and TPCA). Right: number of principal modes that explain 90% and 95% of total variance respectively.

Table 1 compares the regression results of our model and the two baseline algorithms using the first principal mode. The higher F and R2 statistics indicate that our approach captures more variation of the MMSE score than the other models. Table 1 also reports run time and memory consumption for building the full model of anatomical variability. It demonstrates that our approach offers an order of magnitude improvement in both the run time and memory requirements while providing a more powerful model of variability.

Table 1.

Left: Comparison of linear regression models on the first principal mode for our model (LPPGA) and the baseline algorithms (PPGA and TPCA) on 90 brain MRIs from ADNI. Right: Comparison of run time and memory consumption. The implementation employed a message passing interface (MPI) parallel programming for all methods and distributed 90 subjects to 10 processors.

Model Residual R2 F p-value
LPPGA 4.45 0.18 19.47 2.18e−5
PPGA 4.49 0.16 17.96 5.54e−5
TPCA 4.53 0.14 16.34 1.10e−4
Time (Hours) Memory (MB)
  1.2   168.4
17.8 1708.1
16.7 1708.1

6 Conclusion

We presented a low-dimensional probabilistic framework for factor analysis in the space of diffeomorphisms. Our model reduces the computational cost and amplifies the statistical power of shape analysis by using a low-dimensional parametrization. This work represents the first step towards efficient probabilistic models of shape variability based on high-dimensional diffeomorphisms. Future work will explore Bayesian variants of shape analysis. A multiscale strategy like [13] can be added to our model to make the inference even faster.

Acknowledgments

This work was supported by NIH NIBIB NAC P41EB015902, NIH NINDS R01NS086905, NIH NICHD U01HD087211, and Wistron Corporation.

References

  • 1.Beg M, Miller M, Trouvé A, Younes L. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. Int J Comput Vis. 2005;61(2):139–157. [Google Scholar]
  • 2.Cheney W, Kincaid D. Linear Algebra: Theory and Applications. Vol. 110. The Australian Mathematical Society; Canberra: 2009. [Google Scholar]
  • 3.Cootes TF, Taylor CJ, Cooper DH, Graham J. Active shape models-their training and application. Comput Vis Image Underst. 1995;61(1):38–59. [Google Scholar]
  • 4.Fletcher PT, Lu C, Joshi S. Statistics of shape via principal geodesic analysis on Lie groups. Comput. Vis Pattern Recogn. 2003;1:1–95. IEEE. [Google Scholar]
  • 5.Gerig G, Styner M, Shenton ME, Lieberman JA. Shape versus size: improved understanding of the morphology of brain structures. In: Niessen WJ, Viergever MA, editors. MICCAI 2001. Vol. 2208. Springer; Heidelberg: 2001. pp. 24–32. (LNCS). [DOI] [Google Scholar]
  • 6.Golland P, Grimson WEL, Shenton ME, Kikinis R. Small sample size learning for shape analysis of anatomical structures. In: Delp SL, DiGoia AM, Jaramaz B, editors. MICCAI 2000. Vol. 1935. Springer; Heidelberg: 2000. pp. 72–82. (LNCS). [DOI] [Google Scholar]
  • 7.Gori P, Colliot O, Worbe Y, Marrakchi-Kacem L, Lecomte S, Poupon C, Hartmann A, Ayache N, Durrleman S. Bayesian atlas estimation for the variability analysis of shape complexes. In: Mori K, Sakuma I, Sato Y, Barillot C, Navab N, editors. MICCAI 2013. Vol. 8149. Springer; Heidelberg: 2013. pp. 267–0274. (LNCS). Part I. [DOI] [PubMed] [Google Scholar]
  • 8.Jack CR, Bernstein MA, Fox NC, Thompson P, Alexander G, Harvey D, Borowski B, Britson PJ, Whitwell JL, Ward C, et al. The alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J Magn Reson Imaging. 2008;27(4):685–691. doi: 10.1002/jmri.21049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Leventon ME, Grimson WEL, Faugeras O. Statistical shape influence in geodesic active contours. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2000;1:316–323. IEEE. [Google Scholar]
  • 10.Nemmi F, Sabatini U, Rascol O, Péran P. Parkinson’s disease and local atrophy in subcortical nuclei: insight from shape analysis. Neurobiol Aging. 2015;36(1):424–433. doi: 10.1016/j.neurobiolaging.2014.07.010. [DOI] [PubMed] [Google Scholar]
  • 11.Pizer SM, Fritsch DS, Yushkevich PA, Johnson VE, Chaney EL. Segmentation, registration, and measurement of shape variation via image object shape. IEEE Trans Med Imaging. 1999;18(10):851–865. doi: 10.1109/42.811263. [DOI] [PubMed] [Google Scholar]
  • 12.Qiu A, Younes L, Miller MI. Principal component based diffeomorphic surface mapping. IEEE Trans Med Imaging. 2012;31(2):302–311. doi: 10.1109/TMI.2011.2168567. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Sommer S, Lauze F, Nielsen M, Pennec X. Sparse multi-scale diffeomorphic registration: the kernel bundle framework. J Math Imaging Vis. 2013;46(3):292–308. [Google Scholar]
  • 14.Vaillant M, Miller MI, Younes L, Trouvé A. Statistics on diffeomorphisms via tangent space representations. NeuroImage. 2004;23:S161–S169. doi: 10.1016/j.neuroimage.2004.07.023. [DOI] [PubMed] [Google Scholar]
  • 15.Vialard FX, Risser L, Rueckert D, Cotter CJ. Diffeomorphic 3D image registration via geodesic shooting using an efficient adjoint calculation. Int J Comput Vis. 2012;97(2):229–241. [Google Scholar]
  • 16.Younes L, Arrate F, Miller M. Evolutions equations in computational anatomy. NeuroImage. 2009;45(1):S40–S50. doi: 10.1016/j.neuroimage.2008.10.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Zhang M, Fletcher PT. Bayesian principal geodesic analysis in diffeomorphic image registration. In: Golland P, Hata N, Barillot C, Hornegger J, Howe R, editors. MICCAI 2014. Vol. 8675. Springer; Heidelberg: 2014. pp. 121–128. (LNCS). [DOI] [PubMed] [Google Scholar]
  • 18.Zhang M, Fletcher PT. Finite-dimensional Lie algebras for fast diffeomorphic image registration. In: Ourselin S, Alexander DC, Westin C-F, Cardoso MJ, editors. IPMI 2015. Vol. 9123. Springer; Heidelberg: 2015. pp. 249–260. (LNCS). [DOI] [PubMed] [Google Scholar]

RESOURCES