Abstract
Longitudinal regression analysis for clinical imaging studies is essential to investigate unknown relationships between subject-wise changes over time and subject-specific characteristics, represented by covariates such as disease severity or a level of genetic risk. Image-derived data in medical image analysis, e.g. diffusion tensors or geometric shapes, are often represented on nonlinear Riemannian manifolds. Hierarchical geodesic models were suggested to characterize subject-specific changes of nonlinear data on Riemannian manifolds as extensions of a linear mixed effects model. We propose a new hierarchical multi-geodesic model to enable analysis of the relationship between subject-wise anatomical shape changes on a Riemannian manifold and multiple subject-specific characteristics. Each individual subject-wise shape change is represented by a univariate geodesic model. The effects of subject-specific covariates on the estimated subject-wise trajectories are then modeled by multivariate intercept and slope models, which together form a multi-geodesic model. Validation was performed with a synthetic example on an S² manifold. The proposed method was applied to a longitudinal set of 72 corpus callosum shapes from 24 autism spectrum disorder subjects to study the relationship between anatomical shape changes and the autism severity score, yielding statistics for both the population and each subject. To our knowledge, this is the first longitudinal framework to model anatomical developments over time as functions of both continuous and categorical covariates on a nonlinear shape space.
1. Introduction
Recent advances in medical image analysis allow researchers to track an individual subject’s development with multiple repeated observations [4]. Longitudinal regression analysis, which adequately accounts for intra-subject correlation, is essential to estimate unknown relationships between subject-specific temporal changes and characteristics of individuals via repeated observations [2].
Data derived from medical images, such as diffusion tensors, diffeomorphic deformations, or geometric shapes, are to be analyzed on their natural nonlinear spaces, e.g. Riemannian manifolds. For longitudinal analysis of nonlinear data on a Riemannian manifold, hierarchical geodesic models were suggested as extensions of a linear mixed effects model on Euclidean space [6, 7, 9]. These methods estimate the same model for both subject and population levels which is a direct reformulation of a linear mixed effects model on a Riemannian manifold. The previous methods face challenges from too few observations per subject to hierarchically solve a multivariate model at a subject level even though the number of subjects may be sufficient. Despite the importance of including subject-specific characteristics for longitudinal analysis, a hierarchical model that analyzes the relationship between subject-wise morphological changes over time and multiple covariates on a Riemannian manifold has not been shown before.
We propose a novel hierarchical multi-geodesic model for longitudinal analysis of subject-specific morphological changes of anatomical structures on a Riemannian manifold. Each subject-wise morphological change over time is modeled by a univariate geodesic model which only requires a minimum of two observations per subject. The effects of subject-specific characteristics, represented by subject-specific covariates, on subject-wise trajectories are modeled by a multi-geodesic model. The covariates are fixed for an individual subject but varying across a population. The multi-geodesic model consists of multivariate intercept and slope models which account for the effects of the subject-specific characteristics to subject-wise baselines and developments over time, respectively. It is worth noting that the proposed method is different from the direct extension of a linear mixed effects model because the relationship between subject-specific characteristics and the slopes of subject-wise temporal trajectories cannot be modeled by linearly adding the covariates as additional explanatory variables.
A synthetic example on an S² manifold with a comparison to the ground truth showed the feasibility of the proposed method. Experimental validation on 72 corpus callosum shapes, represented on a nonlinear shape space, from 24 autism spectrum disorder subjects with different autism severity scores demonstrates the capability of the proposed method for longitudinal analysis of the relationship between subject-specific morphological changes and multiple covariates.
2. Method
Background on Riemannian Geometry:
A geodesic is a zero-acceleration curve on an n-dimensional Riemannian manifold M. It is locally length-minimizing: within a small neighborhood, no curve between two points is shorter than the geodesic. The exponential map Exp(p, v) = q maps p ∈ M to q ∈ M along the geodesic going out from p in the direction and with the magnitude of v ∈ TpM. Its inverse, the log map Log(p, q) = v, is defined on a neighborhood U(p) of p. The Riemannian distance between p and q is the length of the geodesic between the two, d(p, q) = ∥Log(p, q)∥. Parallel transport ψp→q(u) of a tangent vector u ∈ TpM along a differentiable curve c(t) : I → M from p to q is defined by the unique parallel vector field V(t) along c(t), where I = [0, 1], c(0) = p, and c(1) = q. V(t) satisfies V(0) = u and $\nabla_{\dot{c}(t)} V = 0$, where ∇ is the covariant derivative of a vector field V. Parallel transport has the following angle and scale preserving properties: the angle of u along c(t) does not change, and the scale of u also does not change, ∥u∥ = ∥V(t)∥, at any t ∈ I [1]. In general, ψp→q(u) is not unique because it depends on the curve c that connects p and q. We only use ψp→q along the unique geodesic between p and q, which makes the transport unique.
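For the unit sphere S² embedded in ℝ³, the manifold of the synthetic experiment in Section 3, these three operations have well-known closed forms. The following is a minimal NumPy sketch rather than the authors' implementation; the names `exp_map`, `log_map`, and `parallel_transport` are illustrative, and the antipodal case, where the geodesic is not unique, is not handled.

```python
import numpy as np

def exp_map(p, v):
    """Exp(p, v) on the unit sphere: walk from p along v for length |v|."""
    n = np.linalg.norm(v)
    return p.copy() if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def log_map(p, q):
    """Log(p, q) on the unit sphere: tangent vector at p with length d(p, q)."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - c * p                      # part of q orthogonal to p
    n = np.linalg.norm(w)
    return np.zeros(3) if n < 1e-12 else np.arccos(c) * w / n

def parallel_transport(p, q, u):
    """Transport u from T_p to T_q along the shortest geodesic from p to q."""
    v = log_map(p, q)
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return u.copy()
    e = v / theta                      # unit direction of the geodesic at p
    along = np.dot(e, u)               # component of u along the geodesic
    # The along-geodesic component rotates with the geodesic's velocity;
    # the component orthogonal to the geodesic is left unchanged.
    return u - along * e + along * (np.cos(theta) * e - np.sin(theta) * p)

# Usage: move from p along v, recover v with the log map, transport u to q.
p = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 0.3, 0.4])          # tangent at p (orthogonal to p)
q = exp_map(p, v)
u = np.array([0.0, 1.0, -2.0])         # another tangent vector at p
u_at_q = parallel_transport(p, q, u)
```

The transported vector keeps its length and lies in the tangent plane at q, which are the scale and tangency properties used throughout the method.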
Subject-wise Trajectory Estimation:
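Let yij ∈ M be the jth observation of the ith subject associated with time $t_{ij}$, i = 1, …, $N_s$, j = 1, …, $N_i$. $N_s$ and $N_i$ are the number of subjects and the number of observations of the ith subject, respectively. We estimate a subject-wise trajectory Yi by the least squares geodesic regression model [3],

$$Y_i = (a_i, b_i) = \arg\min_{(a, b) \in TM} \sum_{j=1}^{N_i} d\big(\mathrm{Exp}(a,\, b\, t_{ij}),\; y_{ij}\big)^2, \tag{1}$$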
where ai ∈ M and bi ∈ TaiM are an intercept and a slope tangent vector of the geodesic model Yi for the ith subject’s trajectory as shown in Fig. 1 (a). It is worth noting that Eq. 1 only requires the minimum of two observations per subject to solve for two coefficients, an intercept ai and a slope bi.
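To illustrate the two-observation case, the sketch below builds a zero-residual geodesic through two time-stamped points on S², which is therefore one minimizer of the least squares criterion. This is an illustrative special case, not the authors' solver; with more than two observations an iterative optimization as in [3] is generally needed. The function `fit_two_point_geodesic` and the sphere helpers are hypothetical names, and the two points are assumed not to be antipodal.

```python
import numpy as np

def exp_map(p, v):
    """Exp(p, v) on the unit sphere S^2."""
    n = np.linalg.norm(v)
    return p.copy() if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def log_map(p, q):
    """Log(p, q) on the unit sphere S^2."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros(3) if n < 1e-12 else np.arccos(c) * w / n

def parallel_transport(p, q, u):
    """Transport u from T_p to T_q along the shortest geodesic from p to q."""
    v = log_map(p, q)
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return u.copy()
    e = v / theta
    along = np.dot(e, u)
    return u - along * e + along * (np.cos(theta) * e - np.sin(theta) * p)

def fit_two_point_geodesic(t1, y1, t2, y2):
    """Return (a, b) such that Exp(a, b*t1) = y1 and Exp(a, b*t2) = y2."""
    velocity_at_y1 = log_map(y1, y2) / (t2 - t1)   # slope expressed at y1
    a = exp_map(y1, -t1 * velocity_at_y1)          # follow the geodesic back to t = 0
    # The velocity of a geodesic is parallel along the geodesic itself,
    # so transporting it from y1 to a gives the slope at the intercept.
    b = parallel_transport(y1, a, velocity_at_y1)
    return a, b

# Usage: two observations of one subject generated from a known geodesic.
a_true = np.array([1.0, 0.0, 0.0])
b_true = np.array([0.0, 0.1, 0.2])
y1, y2 = exp_map(a_true, 1.0 * b_true), exp_map(a_true, 3.0 * b_true)
a_hat, b_hat = fit_two_point_geodesic(1.0, y1, 3.0, y2)
```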
Hierarchical Multi-Geodesic Model:
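Let the ith subject be associated with two sets of subject-specific covariates, $\eta_i = [\eta_{i1}, \ldots, \eta_{iN_\eta}]$ for intercepts and $\theta_i = [\theta_{i1}, \ldots, \theta_{iN_\theta}]$ for slopes, where Nη and Nθ are the numbers of the covariates.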
We aim to model the effects of η and θ on subject-wise trajectories. The proposed Hierarchical Multi-Geodesic model (HMG) consists of an intercept model f(η) and a slope model g(θ),
$$Y(\eta, \theta, t) = \mathrm{Exp}\big(f(\eta),\; g(\theta)\, t\big). \tag{2}$$
Intercept Model:
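The intercept model f(η) is formulated as a multivariate geodesic model with a base intercept β0 ∈ M and tangent vectors $\beta_k \in T_{\beta_0}M$ [5],

$$f(\eta) = \mathrm{Exp}\Big(\beta_0,\; \sum_{k=1}^{N_\eta} \beta_k \eta_k\Big). \tag{3}$$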
βk, k = 1, …, Nη, are coefficients that represent the effects of subject-specific covariates ηk on the intercepts of subject-wise trajectories. For example, a hypothesis that a subject diagnosed with autism and a healthy subject might have different baseline corpus callosum shapes at 3 months after birth can be modeled by the intercept model f. The coefficients βk can be estimated by a least squares formulation of the multivariate geodesic model,

$$(\hat{\beta}_0, \ldots, \hat{\beta}_{N_\eta}) = \arg\min_{\beta_0, \ldots, \beta_{N_\eta}} \sum_{i=1}^{N_s} d\Big(\mathrm{Exp}\Big(\beta_0,\; \sum_{k=1}^{N_\eta} \beta_k \eta_{ik}\Big),\; a_i\Big)^2. \tag{4}$$
We optimize Eq. 4 by a Euclideanized optimization scheme similar to [5] with an iterative update of an anchor point to the estimated intercept, instead of fixing it at the intrinsic mean of given data. The estimated intercept model is given as $\hat{f}(\eta) = \mathrm{Exp}\big(\hat{\beta}_0, \sum_{k=1}^{N_\eta} \hat{\beta}_k \eta_k\big)$ after optimization.
The least squares formulation Eq. 4 assumes that the distribution of subject-wise intercepts $a_i$ around $\hat{f}(\eta_i)$ is the generalized normal distribution [3],

$$a_i = \mathrm{Exp}\big(\hat{f}(\eta_i),\; \epsilon_{a_i}\big), \tag{5}$$

where $\epsilon_{a_i} \in T_{\hat{f}(\eta_i)}M$ can be directly interpreted as random effects of subject-wise intercepts that indicate longitudinal effects of repeated observations of individual subjects [2]. Fig. 1 (b) shows the illustration of the concept of the intercept model f with an example model with one continuous covariate c.
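One possible reading of a Euclideanized scheme with a moving anchor is: map the subject-wise intercepts to the tangent space at the current anchor, solve an ordinary linear regression there, move the anchor by the fitted constant term, and repeat until that term vanishes. The sketch below does this on S² for illustration only. It is an assumption about the scheme, not the authors' code or the exact algorithm of [5]; `fit_intercept_model` and its arguments are hypothetical names, and convergence is not guaranteed for widely spread data.

```python
import numpy as np

def exp_map(p, v):
    """Exp(p, v) on the unit sphere S^2."""
    n = np.linalg.norm(v)
    return p.copy() if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def log_map(p, q):
    """Log(p, q) on the unit sphere S^2."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros(3) if n < 1e-12 else np.arccos(c) * w / n

def fit_intercept_model(intercepts, eta, n_iter=50, tol=1e-10):
    """Estimate (beta_0, [beta_1..beta_K]) from subject-wise intercepts a_i.

    intercepts: (N, 3) unit vectors a_i; eta: (N,) or (N, K) covariates.
    """
    X = np.column_stack([np.ones(len(eta)), eta])   # constant column + covariates

    def tangent_fit(anchor):
        # Linear regression of Log(anchor, a_i) on the covariates.
        V = np.array([log_map(anchor, a) for a in intercepts])
        coef, *_ = np.linalg.lstsq(X, V, rcond=None)
        return coef[0], coef[1:]                    # constant term, slopes

    anchor = intercepts[0].copy()                   # arbitrary starting anchor
    for _ in range(n_iter):
        shift, _ = tangent_fit(anchor)
        if np.linalg.norm(shift) < tol:
            break
        anchor = exp_map(anchor, shift)             # move anchor to the fitted intercept
        anchor /= np.linalg.norm(anchor)            # guard against round-off drift
    _, betas = tangent_fit(anchor)                  # coefficients at the final anchor
    return anchor, betas

# Usage: noise-free intercepts generated from a known one-covariate model.
beta0_true = np.array([1.0, 0.0, 0.0])
beta1_true = np.array([0.0, 0.0, 0.4])
eta = np.linspace(-1.0, 1.0, 9)
intercepts = np.array([exp_map(beta0_true, beta1_true * e) for e in eta])
beta0_hat, betas_hat = fit_intercept_model(intercepts, eta)
```

The residual tangent vectors Log(f(η_i), a_i) at the fitted model then play the role of the intercept random effects in Eq. 5.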
Tangent Vector Space of Slope Model:
The effects of covariates on the slopes of subject-wise trajectories are modeled as a linear model g0(θ) on a tangent vector space from a set of subject-wise slope tangent vectors $b_i$,

$$g_0(\theta) = \gamma_0 + \sum_{k=1}^{N_\theta} \gamma_k \theta_k, \tag{6}$$
where γk, k = 0, …, Nθ, are coefficient tangent vectors associated with θk. The model can explain a hypothesis on the relationship between covariates and the slopes. For example, a corpus callosum may develop differently from a baseline for a subject diagnosed with autism versus a healthy subject.
There are two problems that we need to consider to model subject-wise slopes. First, γk must be on a single tangent vector space to be linearly combined as in Eq. 6. For consistent modeling and comprehensible interpretation, we set γk to be on the tangent vector space of the base intercept of the intercept model, $T_{\hat{\beta}_0}M$, which makes $g_0(\theta) \in T_{\hat{\beta}_0}M$. Second, the $b_i$ are not directly comparable to each other because they are on different tangent vector spaces $T_{a_i}M$. Therefore, we need to properly transport each $b_i$ to $T_{\hat{\beta}_0}M$ to estimate γk.
Stop-Over Parallel Transport ϕ:
Recall that subject-wise intercepts $a_i$ are the combination of the fixed effects intercept model $\hat{f}(\eta_i)$ and the random effects ϵa on subject-wise intercepts as explained in Eq. 5. In other words, each $a_i$ is randomly distributed in the normal distribution of the random effects centered at $\hat{f}(\eta_i)$ [8]. Therefore, we need to transport a tangent vector $b_i$ on $T_{a_i}M$ to $T_{\hat{f}(\eta_i)}M$ first to account for the random effects and then from $T_{\hat{f}(\eta_i)}M$ to $T_{\hat{\beta}_0}M$ to account for the fixed effects as shown in Fig. 1 (c) by a stop-over parallel transport ϕ,

$$\phi(b_i) = \psi_{\hat{f}(\eta_i) \to \hat{\beta}_0}\Big(\psi_{a_i \to \hat{f}(\eta_i)}(b_i)\Big). \tag{7}$$

Here, ψ is a parallel transport along the geodesic between two points, whose angle and scale preserving properties suit our need to transport consistently at each stage of the stop-over transport. Fig. A1 in the appendix shows the effect of the stop-over parallel transport with a synthetic experiment on an S² manifold. The direct parallel transport to the intercept point may arbitrarily rotate tangent vectors, whereas the stop-over transport preserves the directions of the tangent vectors consistently.
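The two-stage transport of Eq. 7 can be sketched on S² as a composition of two geodesic parallel transports. The code below is a hedged illustration with hypothetical names (`stop_over_transport`, `angle_between`) and made-up example points, not the authors' implementation. It also compares the result with a direct transport from the subject intercept to the base intercept; on a curved manifold the two generally differ by a rotation.

```python
import numpy as np

def exp_map(p, v):
    """Exp(p, v) on the unit sphere S^2."""
    n = np.linalg.norm(v)
    return p.copy() if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def log_map(p, q):
    """Log(p, q) on the unit sphere S^2."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros(3) if n < 1e-12 else np.arccos(c) * w / n

def parallel_transport(p, q, u):
    """Transport u from T_p to T_q along the shortest geodesic from p to q."""
    v = log_map(p, q)
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return u.copy()
    e = v / theta
    along = np.dot(e, u)
    return u - along * e + along * (np.cos(theta) * e - np.sin(theta) * p)

def stop_over_transport(a_i, f_i, beta0, b_i):
    """phi(b_i): transport b_i from a_i to f_i, then from f_i to beta0."""
    at_model_intercept = parallel_transport(a_i, f_i, b_i)   # random-effect leg
    return parallel_transport(f_i, beta0, at_model_intercept)  # fixed-effect leg

def angle_between(u, v):
    """Angle in radians between two vectors of the same tangent plane."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))

# Illustrative points: base intercept, model intercept, subject intercept.
beta0 = np.array([1.0, 0.0, 0.0])
f_i = exp_map(beta0, np.array([0.0, 0.0, 0.8]))
a_i = exp_map(f_i, np.array([0.0, 0.3, 0.0]))
raw = np.array([0.0, 0.01, 0.02])
b_i = raw - np.dot(raw, a_i) * a_i       # project onto the tangent plane at a_i

stop_over = stop_over_transport(a_i, f_i, beta0, b_i)
direct = parallel_transport(a_i, beta0, b_i)
rotation = angle_between(stop_over, direct)   # nonzero because S^2 is curved
```

Both results have the same length as `b_i` and lie in the tangent plane at `beta0`; only their direction differs.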
Slope Model Estimation:
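With $b_i$ transported to $T_{\hat{\beta}_0}M$ by ϕ, we estimate slope coefficients γk, k = 0, …, Nθ, in g0(θ). It can be formulated as the least squares formulation of a standard multivariate linear regression problem [2],

$$(\hat{\gamma}_0, \ldots, \hat{\gamma}_{N_\theta}) = \arg\min_{\gamma_0, \ldots, \gamma_{N_\theta}} \sum_{i=1}^{N_s} \Big\| \gamma_0 + \sum_{k=1}^{N_\theta} \gamma_k \theta_{ik} - \phi(b_i) \Big\|^2. \tag{8}$$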
Eq. 8 is optimized by the closed-form solution of a multivariate linear regression problem with the assumption of no correlation in γk. The estimated slope model $\hat{g}_0(\theta)$ is transported to the respective tangent vector space of the intercept model, $\hat{g}(\theta) = \psi_{\hat{\beta}_0 \to \hat{f}(\eta)}\big(\hat{g}_0(\theta)\big)$.
The least squares formulation in Eq. 8 is related to the random effects of subject-wise slopes, similar to Eq. 5 for the intercept model. The complete estimated multi-geodesic model is then formulated as $\hat{Y}(\eta, \theta, t) = \mathrm{Exp}\big(\hat{f}(\eta),\; \hat{g}(\theta)\, t\big)$.
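Once the slopes share one tangent space, estimating the slope coefficients is an ordinary multivariate least squares problem. The sketch below assumes the transported slopes are stored as rows in ambient coordinates and uses `numpy.linalg.lstsq`; the function names `fit_slope_model` and `predict_slope` are hypothetical, and the example coefficients are borrowed from the synthetic experiment only for illustration.

```python
import numpy as np

def fit_slope_model(transported_slopes, theta):
    """Ordinary least squares for g0(theta) = gamma_0 + sum_k gamma_k * theta_k.

    transported_slopes: (N, d) array, row i holds phi(b_i).
    theta: (N,) or (N, K) array of slope covariates.
    Returns a (1 + K, d) array whose rows are gamma_0, ..., gamma_K.
    """
    X = np.column_stack([np.ones(len(theta)), theta])   # constant column + covariates
    gammas, *_ = np.linalg.lstsq(X, transported_slopes, rcond=None)
    return gammas

def predict_slope(gammas, theta_new):
    """Evaluate g0 at one covariate value (scalar or length-K vector)."""
    return gammas[0] + np.atleast_1d(theta_new) @ gammas[1:]

# Usage: noise-free slopes in the tangent plane at beta_0 = [1, 0, 0].
gamma_true = np.array([[0.0, 0.01, 0.02],
                       [0.0, 0.002, -0.003]])
theta = np.linspace(0.0, 5.0, 11)
slopes = gamma_true[0] + theta[:, None] * gamma_true[1]
gammas = fit_slope_model(slopes, theta)
slope_at_2 = predict_slope(gammas, 2.0)
```

Because each fitted coefficient is a linear combination of the input rows, the estimates stay in the same tangent plane as the transported slopes.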
3. Experiments
Synthetic Example on S2 Manifold:
We tested our method with a synthetic example on an S² manifold. We used the exponential map, log map, and parallel transport of the S² manifold as in [7]. 3527 points of 1000 subjects were generated by the multi-geodesic model of Eq. 2 with time t ∈ (20, 70) and a continuous covariate c ∈ (0, 5),
where we assigned β0 = [1, 0, 0], β1 = [0, 0, 0.4], γ0 = [0, 0.01, 0.02], and γ1 = [0, 0.002, −0.003] with random effects on intercepts ϵa ~ N(0, 0.05²) and slopes ϵb ~ N(0, 0.001²) and the data observation noise ϵ ~ N(0, 0.001²).
Fig. 2(a) shows the synthetic data colored by c from blue to green. The estimated subject level geodesic models are plotted as translucent white curves. Fig. 2(b) shows the estimated geodesic of a geodesic regression model (SG, magenta) with data points colored by t from black to white [3]. Fig. 2(c) illustrates the results of a Hierarchical Single Geodesic model (HSG, brown) [5, 7]. The results of the proposed Hierarchical Multi-Geodesic model (HMG) and the ground truth (yellow curves) are displayed in Fig. 2(d) with uniformly selected values of c with a unit interval from 0.5 to 4.5. The generalized $R^2$ values with respect to individual data of SG, HSG, and the proposed HMG were 0.31, 0.29, and 0.96, respectively [3]. The $R^2$ values with respect to subject-wise intercepts ($R^2_a$) and slopes ($R^2_b$) of the proposed HMG were 0.98 and 0.90, respectively. $R^2_a$ and $R^2_b$ of HSG are zero since the HSG is the average trajectory of the subject-wise trajectories [7]. Validation with respect to subject-wise trajectories is not available for the non-longitudinal SG. The standard deviations of the random effects of subject-wise intercepts and slopes estimated by HMG are σa = 0.09 and σb = 0.002, respectively.
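As a reminder of how a generalized R² can be computed on a manifold, the sketch below follows the usual definition of one minus the ratio of unexplained to total variance, with the total variance taken around the Fréchet mean. It is an illustrative S² version under that assumed definition, not the evaluation code used for the reported numbers; `frechet_mean`, `sphere_dist`, and `generalized_r2` are hypothetical names.

```python
import numpy as np

def exp_map(p, v):
    """Exp(p, v) on the unit sphere S^2."""
    n = np.linalg.norm(v)
    return p.copy() if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def log_map(p, q):
    """Log(p, q) on the unit sphere S^2."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros(3) if n < 1e-12 else np.arccos(c) * w / n

def sphere_dist(p, q):
    """Geodesic distance d(p, q) on the unit sphere."""
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))

def frechet_mean(points, n_iter=100, tol=1e-12):
    """Intrinsic mean by repeatedly stepping along the mean log-map vector."""
    mu = points[0].copy()
    for _ in range(n_iter):
        step = np.mean([log_map(mu, y) for y in points], axis=0)
        if np.linalg.norm(step) < tol:
            break
        mu = exp_map(mu, step)
    return mu

def generalized_r2(observations, predictions):
    """1 - (sum of squared residual distances) / (sum of squared distances to the mean)."""
    mu = frechet_mean(observations)
    total = sum(sphere_dist(mu, y) ** 2 for y in observations)
    unexplained = sum(sphere_dist(y_hat, y) ** 2
                      for y_hat, y in zip(predictions, observations))
    return 1.0 - unexplained / total

# Usage: observations on a geodesic, scored against two sets of predictions.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 0.3, 0.4])
y = np.array([exp_map(a, b * t) for t in np.linspace(-1.0, 1.0, 5)])
r2_perfect = generalized_r2(y, y)                       # predictions equal the data
mean_only = np.tile(frechet_mean(y), (len(y), 1))
r2_mean = generalized_r2(y, mean_only)                  # predicting the mean everywhere
```

A model that reproduces the data scores close to one, and a model that only predicts the Fréchet mean scores zero, which matches the interpretation of the values reported above.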
Longitudinal Corpus Callosum Shape Changes:
Previous research demonstrated differences of corpus callosum (CC) size in autism [10], stating that it is thicker in infants later diagnosed with autism, based on multi-level linear analysis of features derived from shapes, such as mean thickness. Applying the proposed framework, quantitative exploration of longitudinal shape changes as a function of diagnostic scores becomes possible, resulting in population level and subject-specific models. We modeled HMG with sex s and the Autism Diagnostic Observation Schedule (ADOS) severity score AS, which combines symptoms related to social interaction and repetitive behavior, with scores ranging from 4 to 10 for autism spectrum disorder (ASD) subjects. A larger AS indicates higher severity of autism. Seventy-two CC shapes from 24 ASD subjects (9 females and 15 males) from the ACE-IBIS study were used for the experiment [10]. Each subject was scanned three times. Because the development of CC is known to be asymptotic to a logarithm [10], we reparametrized time t by taking the natural log to model the asymptotic shape changes over time.
The model uses sex s (0 for male and 1 for female), age t ∈ (6, 25) months, and AS ∈ (4, 10).
CC shapes were represented on a product manifold M of a scale ρ, which represents a shape size, and a Kendall shape z with k points in the 2D Kendall shape space $\Sigma_2^k$, where translation, rotation, and scale of shapes are removed. The squared distance between p = (ρp, zp) and q = (ρq, zq) on M is a weighted sum of the distances on the element spaces. The distance on the scale space is normalized by the ratio of the variances of the data distributions of scales ρ and Kendall shapes z, where $\sigma_\rho^2$ and $\sigma_z^2$ are the variances of the input data of scales ρ and shapes z, respectively [8]. One hundred landmark points were sampled at corresponding locations from each shape boundary. We used the exponential map, log map, and parallel transport of $\Sigma_2^k$ as in [3]. The exponential and log maps of the scale space are addition and subtraction, respectively, and its parallel transport is the identity function.
Fig. 3 displays the estimated longitudinal trends with s fixed to 0, i.e. for the 15 male subjects. Fig. 3(a) shows the estimated longitudinal shape trends of the lowest and the highest ADOS scores over time. The difference of the shape changes is more evident in Fig. 3(b), which illustrates estimated baseline and end point shapes with varying ADOS scores at 3 and 24 months of age. One can observe the increased expansion of the anterior CC (the genu and rostral body) for subjects with higher ADOS scores, which is consistent with the previous clinical finding that subjects diagnosed with autism tend to have a larger corpus callosum [10]. The population level longitudinal trends of shape sizes estimated from the subject-wise trends in Fig. 3(c) quantitatively show increasing trends of shape sizes with higher ADOS scores in Fig. 3(d). The overall $R^2$ values of SG, HSG, and HMG were 0.23, 0.23, and 0.27, respectively. The root mean squared errors, measured by average surface boundary landmark distances, of SG, HSG, and HMG were 0.287, 0.287, and 0.277 mm, respectively. Table 1 summarizes the quantitative evaluations. The mean and standard deviation of the $R^2$ values of the subject-wise trends were 0.89±0.04.
Table 1:
| | Synthetic | | | | | CC-Autism Spectrum | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | $R^2$ | $R^2_a$ | $R^2_b$ | σa | σb | $R^2$ | $R^2_a$ | $R^2_b$ | σa | σb | RMSE (mm) |
| SG | 0.31 | N/A | N/A | N/A | N/A | 0.23 | N/A | N/A | N/A | N/A | 0.287 |
| HSG | 0.29 | 0.0 | 0.0 | 0.58 | 5.0e−3 | 0.23 | 0.0 | 0.0 | 1.02e−a | 5.15e−3 | 0.287 |
| HMG | 0.96 | 0.98 | 0.90 | 0.09 | 2.0e−3 | 0.27 | 0.06 | 0.09 | 9.55e−3 | 4.68e−3 | 0.277 |
4. Discussion
The proposed hierarchical multi-geodesic model is a novel method for longitudinal analysis of subject-specific anatomical shape changes on a Riemannian manifold. It enables longitudinal analysis with multiple covariates directly on a nonlinear shape space, which has not yet been possible for clinical studies. The application to subject-specific corpus callosum shape changes demonstrated promising results that were consistent with the clinical finding of a relationship between anatomical development and diagnostic scores of individual subjects. In future work, we will focus on a hypothesis testing framework for the proposed model to further explore relationships between temporal changes of anatomical structures and covariates.
Acknowledgements
Funding was provided by the IBIS (Infant Brain Imaging Study) Network, an NIH funded Autism Center of Excellence (2R01HDO55741) that consists of a consortium of 7 Universities in the U.S. and Canada. This research was also supported by NIH 1R01DA038215-01A1 (Cocaine effects), and R01EB021391 (SlicerSALT).
References
- 1. do Carmo MP: Riemannian geometry. Birkhäuser (1992)
- 2. Fitzmaurice GM, Laird NM, Ware JH: Applied longitudinal analysis, vol. 998. John Wiley & Sons (2012)
- 3. Fletcher PT: Geodesic regression and the theory of least squares on Riemannian manifolds. International Journal of Computer Vision 105(2), 171–185 (2013)
- 4. Gerig G, Fishbaugh J, Sadeghi N: Longitudinal modeling of appearance and shape and its potential for clinical use. Med Image Anal 33, 114–121 (2016)
- 5. Kim HJ, Adluru N, Collins MD, Chung MK, Bendlin BB, Johnson SC, Davidson RJ, Singh V: MGLM on Riemannian manifolds with applications to statistical analysis of diffusion weighted images. In: CVPR. IEEE (2014)
- 6. Kim HJ, Adluru N, Suri H, Vemuri BC, Johnson SC, Singh V: Riemannian nonlinear mixed effects models: Analyzing longitudinal deformations in neuroimaging. In: CVPR. IEEE (2017)
- 7. Muralidharan P, Fletcher PT: Sasaki metrics for analysis of longitudinal data on manifolds. In: CVPR. IEEE (2012)
- 8. Pennec X: Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision 25(1), 127 (2006)
- 9. Singh N, Hinkle J, Joshi S, Fletcher PT: Hierarchical geodesic models in diffeomorphisms. International Journal of Computer Vision 117(1), 70–92 (2016)
- 10. Wolff JJ, Gerig G, Lewis JD, Soda T, Styner MA, Vachet C, et al.: Altered corpus callosum morphology associated with autism over the first 2 years of life. Brain 138(7), 2046–2058 (2015)