Abstract
Longitudinal analysis is a core aspect of many medical applications for understanding the relationship between an anatomical subject’s function and its trajectory of shape change over time. Whereas mixed-effects (or hierarchical) modeling is the statistical method of choice for analysis of longitudinal data, we here propose its extension as a hierarchical geodesic polynomial model (HGPM) for multilevel analyses of longitudinal shape data. 3D shapes are transformed to a non-Euclidean shape space for regression analysis using geodesics on a high dimensional Riemannian manifold. At the subject-wise level, each individual trajectory of shape change is represented by a univariate geodesic polynomial model on timestamps. At the population level, multivariate polynomial expansion is applied to uni/multivariate geodesic polynomial models for both anchor points and tangent vectors. As such, the trajectory of an individual subject’s shape changes over time can be modeled accurately with a reduced number of parameters, and population-level effects from multiple covariates on trajectories can be well captured. The implemented HGPM is validated on synthetic examples of points on a unit 3D sphere. Further tests on clinical 4D right ventricular data show that HGPM is capable of capturing observable effects on shapes attributed to changes in covariates, which are consistent with qualitative clinical evaluations. HGPM demonstrates its effectiveness in modeling shape changes at both subject-wise and population levels, which is promising for future studies of the relationship between shape changes over time and the level of dysfunction severity on anatomical objects associated with disease.
Keywords: geodesic regression, statistical shape analysis, hierarchical modeling, longitudinal data
1. Introduction
Studying change over time is a core aspect of many medical applications. Trajectories of change are followed in studies of childhood development, aging, and disease development. This can involve sampling a cross-sectional population to estimate a possible time course. However, such cross-sectional studies may show large variability across the population and may not properly reflect the nature of longitudinal changes of individuals. Longitudinal studies, on the other hand, follow subjects over time, which allows trajectories to be captured at both the subject-wise and the population level. Longitudinal study design comes with data challenges such as staggered time points, missing time points, and subjects with different numbers of observations. Dedicated modeling schemes are needed to correctly account for the correlated measurements within subjects. These models are known as mixed-effects or hierarchical models and have shown great promise for modeling derived measures in medical imaging studies [1,17].
Several models have been explored for longitudinal analysis of higher dimensional data [3,4,11,13,16,18]. Two main directions have been followed. The first comprises dedicated methods designed with a specific data representation in mind, such as diffeomorphisms on images or shapes, in which image or shape change is represented as continuous diffeomorphic deformations at the subject and population level. The second comprises intrinsic Riemannian models, which may be adapted to new data representations from a variety of manifold representations. These models require the definition of a few key manifold-specific operations, to be discussed in Section 2, in order to be applied to new data types. In this work, we favor this second approach due to the potential for extension to new data representations and thus a variety of clinical problems.
In this paper we propose to extend the hierarchical multi-geodesic model in [11] by applying polynomial expansion at both subject-wise and population levels, with the aim of modeling shape trajectories associated with changes in related covariates, such as sex, cognitive scores, or disease severity. Our development of polynomial regression allows for more flexibility in data-matching than the traditional geodesic model, while still enabling the choice of a geodesic as a polynomial of degree 1. The polynomial expansions at different model levels are inherently compatible with the hierarchical modeling framework where subjects may have a different number of observations. Due to the non-Euclidean nature of the shape space, a fast and efficient model estimation algorithm similar to the computation of the Fréchet mean is implemented. We validate our method on synthetic data as well as on clinical data of right ventricle shape change over the cardiac cycle as it relates to covariates such as dysfunction severity. This promises to meet a currently unmet need of clinical researchers to correlate geometry with function.
2. Methods
2.1. Shape Space and Geodesics
We define shape space as the pre-shape space of Kendall space [12] with rotations removed. As such, the final shapes are obtained by removing their translation, rotation, and similarity components through partial Procrustes alignment. The shape space is formed as a hyper-sphere and can be treated as a high dimensional Riemannian manifold $M$. Performing geodesic regressions in the shape space allows for efficient computation with proven Kendall-space equivalence as indicated in [8,15]. A geodesic on $M$ is a zero-acceleration curve with the minimizing property that there is no curve shorter than a geodesic between any two points within a small neighborhood. Three geodesic-related operations are extensively used in this work: exponential map, log map, and parallel transport. An exponential map $\mathrm{Exp}(p, v)$ maps a shape $p \in M$ to another shape $q \in M$ in the direction and magnitude of a tangent vector $v \in T_pM$. A log map $\mathrm{Log}(p, q)$ is the inverse of the exponential map, in which two shapes $p$ and $q$ are given and the unique tangent vector $v$ that maps $p$ to $q$ is obtained. The Riemannian distance between the two shapes is then defined as the L2-norm of their log map, $d(p, q) = \|\mathrm{Log}(p, q)\|$. The parallel transport operation $\Gamma_{p \to q}(v)$ transports a tangent vector $v$ from $p$ to $q$ while maintaining angle and scale preservation properties. For rigorous and complete definitions, please see [2].
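To make the three operations concrete, the following is a minimal sketch of the exponential map, log map, and parallel transport on a unit sphere, which is the simplest instance of the hyper-spherical shape space described above. The function names `exp_map`, `log_map`, and `parallel_transport` are illustrative choices rather than the authors' implementation, and the closed-form expressions are the standard ones for a unit sphere embedded in Euclidean space.

```python
import numpy as np

def exp_map(p, v):
    """Move from the unit vector p along the tangent vector v (v is orthogonal to p)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def log_map(p, q):
    """Tangent vector at p that points to q; its length is the geodesic distance."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = q - cos_theta * p  # component of q orthogonal to p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        # q coincides with p (antipodal points have no unique log map and
        # are not handled in this sketch).
        return np.zeros_like(p)
    return theta * w / norm_w

def parallel_transport(p, q, v):
    """Transport the tangent vector v from p to q along the connecting geodesic."""
    u = log_map(p, q)
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return v.copy()
    e = u / theta  # unit direction of the geodesic at p
    a = np.dot(v, e)
    # Only the component of v along the geodesic direction rotates;
    # the orthogonal component is left unchanged.
    return v + a * ((np.cos(theta) - 1.0) * e - np.sin(theta) * p)

def distance(p, q):
    """Riemannian distance as the norm of the log map."""
    return np.linalg.norm(log_map(p, q))
```

On the sphere, transporting a vector preserves its length and keeps it tangent at the destination, which is the angle and scale preservation property mentioned in the text.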
2.2. Hierarchical Geodesic Model for Manifold-valued Data
Geodesic regression is very similar to linear regression in Euclidean space, with analogies of the anchor point to the intercept and tangent vector to the slope. In a similar way, multilevel models could be constructed in a “geodesic” way by following the framework of hierarchical linear models [19]. In this study, we further extend geodesic regressions in [11] to higher order polynomial versions at both the subject-specific trajectory level and the population level for better adaptability in the modeling of longitudinal data with covariate induced variability. Unlike the Riemannian polynomial described in [9,10] where the polynomial is defined in a differential manner with covariant derivatives, our polynomial expansions are applied to the tangent vectors under the geodesic regression model in an algebraic form, making them straightforward and consistent to fit into different levels of the hierarchical model. Thus, geodesic polynomial regression in this study refers to expanding the composition of tangent vectors in their hyper-tangent space with polynomials of different orders.
Subject-wise Level Model
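We first perform geodesic polynomial regression at the subject-wise level (level 1). The $n$th order polynomial model on the subject-specific trajectory $y_i(t)$ of subject $i$ is formulated as

$$y_i(t) = \mathrm{Exp}\Big(p_i, \sum_{k=1}^{n} v_i^{(k)}\, t^k\Big) \tag{1}$$

where $p_i$ is the anchor point of the subject-specific trajectory $y_i$, $v_i^{(k)}$ is the tangent vector of the $k$th polynomial term, and $t$ is the independent time variable. Given the input observations $\{(t_{ij}, y_{ij})\}_{j=1}^{N_i}$, $p_i$ and the $v_i^{(k)}$’s are estimated by least squares geodesic regression

$$(\hat{p}_i, \hat{V}_i) = \arg\min_{p_i, V_i} \sum_{j=1}^{N_i} d\Big(\mathrm{Exp}\Big(p_i, \sum_{k=1}^{n} v_i^{(k)}\, t_{ij}^k\Big),\, y_{ij}\Big)^2 \tag{2}$$

where $V_i$ is the combined representation of all $v_i^{(k)}$, $d(\cdot,\cdot)$ is the Riemannian distance, and $y_{ij}$ and $t_{ij}$ are the $j$th observation and corresponding time variable of subject $i$. Note that due to the number of free parameters in the regressing polynomial (one anchor point and $n$ tangent vectors), $N_i \geq n + 1$ is required to avoid singularity in the solution.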
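As an illustration of the forward model in Eq. (1), the sketch below evaluates a geodesic polynomial on the unit sphere by summing the time-weighted tangent vectors at the anchor point and applying the exponential map once. The helper name `geodesic_polynomial` and the numerical values are assumptions made for illustration only.

```python
import numpy as np

def exp_map(p, v):
    """Exponential map on the unit sphere (v is tangent at p)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def geodesic_polynomial(anchor, tangent_bases, t):
    """Evaluate Exp(anchor, sum_k tangent_bases[k-1] * t**k) for k = 1..n."""
    tangent = np.zeros_like(anchor)
    for k, basis in enumerate(tangent_bases, start=1):
        tangent = tangent + basis * t ** k
    return exp_map(anchor, tangent)

# Illustrative anchor point and two tangent bases (a 2nd order model).
anchor = np.array([0.0, 0.0, 1.0])
bases = [np.array([0.3, 0.0, 0.0]), np.array([0.0, 0.1, 0.0])]
trajectory = [geodesic_polynomial(anchor, bases, t) for t in (0.0, 1.0, 2.0)]
```

With a single tangent basis the model reduces to an ordinary geodesic, so the point at time `t` lies at distance `|v| * t` from the anchor.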
Population Level Model
At the population level (level 2), let the subject trajectories be associated with a set of covariates $\theta = (\theta_1, \dots, \theta_C)$. The final form of the hierarchical geodesic polynomial model can then be written as

$$y(\theta, t) = \mathrm{Exp}\Big(f(\theta), \sum_{k=1}^{n} g_k(\theta)\, t^k\Big) \tag{3}$$

where $f(\theta)$ and $g_k(\theta)$ are the models for the anchor point and the basis tangent vector of polynomial order $k$ respectively. Technically speaking, the two models do not necessarily share the same set of covariates, which allows for more flexible regression. However, in most cases, the same set of covariates would be used for both models if there is no compelling indication that a specific covariate is solely associated with one of the models. We formulate both the anchor point model and the tangent vector model with quadratic expansion on the covariates. The aim is to associate each model with higher order terms of the covariates as well as cross terms to accurately model their combined effects on subject trajectories.
Anchor Point Model
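The anchor point model with quadratic expansion on covariates is written as

$$f(\theta) = \mathrm{Exp}\Big(p_0, \sum_{c} a_c\, \theta_c + \sum_{c} b_c\, \theta_c^2 + \sum_{c<d} e_{cd}\, \theta_c \theta_d\Big) \tag{4}$$

where $p_0$ is a base anchor point and $a_c$, $b_c$, $e_{cd}$ are basis vectors of the anchor point polynomial for the linear, squared, and cross terms of the covariates. These coefficients can be estimated from the results of subject-specific trajectory regression as

$$\hat{\Phi} = \arg\min_{\Phi} \sum_{i=1}^{S} d\big(f(\theta_i; \Phi),\, \hat{p}_i\big)^2 \tag{5}$$

where $\theta_i$ and $\hat{p}_i$ are the covariates and regressed anchor point of subject $i$, $S$ is the total number of input subject trajectories, and $\Phi$ is the combined representation of $p_0$, $a_c$, $b_c$, $e_{cd}$ in the anchor point model.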
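The quadratic expansion on covariates can be sketched as a small feature map: for a covariate vector it returns the linear terms, the squared terms, and the pairwise cross terms, each of which multiplies one basis vector in the anchor point model. The function name `quadratic_expansion` and the ordering of the terms are assumptions for illustration.

```python
from itertools import combinations

def quadratic_expansion(theta):
    """Return linear, squared, and pairwise cross terms of the covariates."""
    theta = list(theta)
    linear = theta
    squares = [x * x for x in theta]
    cross = [theta[c] * theta[d] for c, d in combinations(range(len(theta)), 2)]
    return linear + squares + cross

# Two covariates, e.g. one TRS value and one RVF value (illustrative numbers).
features = quadratic_expansion([1.5, 2.5])
# features holds [theta_1, theta_2, theta_1^2, theta_2^2, theta_1 * theta_2]
```

For $C$ covariates this yields $2C + C(C-1)/2$ terms, so two covariates give five basis vectors in addition to the base anchor point.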
Tangent Vector Model
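Recall that $V_i$ is the combined representation of the subject-wise tangent vector bases $v_i^{(k)}$ of different orders from the level 1 regression. Thus, there are $n$ corresponding tangent vector models $g_k(\theta)$, $k = 1, \dots, n$, at the population level. Each tangent vector model with quadratic expansion on associated covariates can be formulated as

$$u(\theta) = u_0 + \sum_{c} \alpha_c\, \theta_c + \sum_{c} \beta_c\, \theta_c^2 + \sum_{c<d} \eta_{cd}\, \theta_c \theta_d \tag{6}$$

$$g(\theta) = \Gamma_{p_0 \to f(\theta)}\big(u(\theta)\big) \tag{7}$$

where $u(\theta)$ and $g(\theta)$ are the tangent vectors at the base anchor point $p_0$ and at the modeled anchor point $f(\theta)$ respectively, and $u_0$, $\alpha_c$, $\beta_c$, $\eta_{cd}$ are bases of the tangent vector polynomial at $p_0$. Note that the subscript $k$ referring to the order of the polynomial model is omitted here for readability. From the above formulation, the final tangent vector basis is obtained by calculating a tangent vector at $p_0$ from the polynomial model and then transporting it to the corresponding anchor point $f(\theta)$ by a parallel transport function $\Gamma$ defined on $M$. The reason is that subject-specific tangent vectors must be comparable with each other to perform regression in a consistent manner. Therefore, we need to transport all of them to the same tangent vector space for the regression calculation, as well as to transport them to their corresponding anchor point in the forward calculation. Due to the existence of the parallel transport functions and the fact that the regressed subject-specific anchor points $\hat{p}_i$ do not necessarily lie on the modeled anchor points $f(\theta_i)$, the actual tangent vector being used for regression calculation is obtained as

$$\tilde{v}_i = \Gamma_{f(\theta_i) \to p_0}\Big(\Gamma_{\hat{p}_i \to f(\theta_i)}\big(\hat{v}_i\big)\Big) \tag{8}$$

so that all $\tilde{v}_i \in T_{p_0}M$. This stop-over transport avoids the arbitrary rotation that results from the direct transport $\Gamma_{\hat{p}_i \to p_0}$, as explained in [11]. Regression on the basis tangent vector of a specific polynomial order is then formulated as

$$\hat{\Psi} = \arg\min_{\Psi} \sum_{i=1}^{S} \big\| u(\theta_i; \Psi) - \tilde{v}_i \big\|^2 \tag{9}$$

where $S$ is the total number of input subject trajectories and $\Psi$ is the combined representation of $u_0$, $\alpha_c$, $\beta_c$, and $\eta_{cd}$.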
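The difference between stop-over and direct transport in Eq. (8) can be illustrated on the unit sphere. The sketch below moves a subject-specific tangent vector first to the modeled anchor point and then to the base anchor point, and compares the result with a single direct transport. The function names and the three example points are hypothetical; they form one octant of the sphere, which is chosen so that the path dependence of parallel transport is easy to see.

```python
import numpy as np

def log_map(p, q):
    """Log map on the unit sphere."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_theta * p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * w / norm_w

def parallel_transport(p, q, v):
    """Transport the tangent vector v from p to q along the connecting geodesic."""
    u = log_map(p, q)
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return v.copy()
    e = u / theta
    a = np.dot(v, e)
    return v + a * ((np.cos(theta) - 1.0) * e - np.sin(theta) * p)

def stopover_transport(v, subject_anchor, model_anchor, base_anchor):
    """Transport v from the subject anchor to the base anchor via the model anchor."""
    at_model = parallel_transport(subject_anchor, model_anchor, v)
    return parallel_transport(model_anchor, base_anchor, at_model)

subject_anchor = np.array([1.0, 0.0, 0.0])  # regressed subject anchor point
model_anchor = np.array([0.0, 1.0, 0.0])    # anchor point predicted by the model
base_anchor = np.array([0.0, 0.0, 1.0])     # base anchor point
v = np.array([0.0, 1.0, 0.0])               # tangent vector at subject_anchor

via_model = stopover_transport(v, subject_anchor, model_anchor, base_anchor)
direct = parallel_transport(subject_anchor, base_anchor, v)
```

Both results are tangent at the base anchor point and have the same length, but they differ by a rotation because parallel transport depends on the path taken. In practice the subject anchor lies close to the modeled anchor, so the example exaggerates the effect.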
Iterative Optimization Scheme
Since our shape space is constructed as a Riemannian manifold and is thus not Euclidean, no closed-form solution exists for the geodesic polynomial regressions of the subject-wise trajectory and the anchor point model. Similar to the calculation of the Fréchet mean, we employ an iterative solution scheme for obtaining the optimal parameters in Eq. (2) and Eq. (5). Algorithm 1 illustrates how the parameters are updated over iterations. Note that the function Least Squares Polynomial Fitting depends on the current anchor point estimate $p$, which means that all input points are transformed to the hyper-tangent space at $p$ using a log map for calculating the new parameters.
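One way such an iterative scheme could look on the unit sphere is sketched below. It is a tangent-space approximation written under stated assumptions, not a transcription of Algorithm 1: the observations are mapped to the tangent space at the current anchor estimate, an ordinary least squares polynomial with an intercept is fitted there, and the anchor is moved along the fitted intercept until the intercept vanishes. The name `fit_geodesic_polynomial`, the initialization at the first observation, and the stopping rule are all illustrative choices.

```python
import numpy as np

def exp_map(p, v):
    """Exponential map on the unit sphere."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def log_map(p, q):
    """Log map on the unit sphere."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_theta * p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * w / norm_w

def fit_geodesic_polynomial(points, times, order, n_iter=100, tol=1e-10):
    """Estimate an anchor point and tangent bases for an order-n geodesic polynomial."""
    points = np.asarray(points, dtype=float)
    times = np.asarray(times, dtype=float)
    # Design matrix with columns 1, t, t^2, ..., t^order.
    design = np.vander(times, order + 1, increasing=True)
    anchor = points[0] / np.linalg.norm(points[0])  # assumed initialization
    for _ in range(n_iter):
        # Map all observations to the tangent space at the current anchor.
        logs = np.array([log_map(anchor, y) for y in points])
        coeffs, *_ = np.linalg.lstsq(design, logs, rcond=None)
        shift = coeffs[0]  # fitted intercept: how far the anchor should move
        anchor = exp_map(anchor, shift)
        if np.linalg.norm(shift) < tol:
            break
    # Refit at the final anchor so the tangent bases live in its tangent space.
    logs = np.array([log_map(anchor, y) for y in points])
    coeffs, *_ = np.linalg.lstsq(design, logs, rcond=None)
    return anchor, coeffs[1:]
```

When the number of observations equals the polynomial order plus one, the tangent-space fit interpolates the mapped observations, which mirrors the near-perfect fit of four points by a 3rd order model reported in Section 3.1.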
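Algorithm 1:
$R^2$ and Hypothesis Testing
At the subject-wise level, the coefficient of determination $R^2$ is calculated using the Fréchet variance [14], intrinsically defined by

$$R^2 = 1 - \frac{\sigma^2_{\text{unexplained}}}{\sigma^2_{\text{total}}} \tag{10}$$

$$\sigma^2_{\text{total}} = \frac{1}{N} \sum_{j=1}^{N} d(\mu, y_j)^2 \tag{11}$$

$$\sigma^2_{\text{unexplained}} = \frac{1}{N} \sum_{j=1}^{N} \|\epsilon_j\|^2, \qquad \epsilon_j = \mathrm{Log}(\hat{y}_j, y_j) \tag{12}$$

where $\mu$ is the Fréchet mean of the $N$ observations, $\hat{y}_j$ is the regressed point at time $t_j$, and $\epsilon_j$ is the error between $\hat{y}_j$ and the observation point $y_j$. To test the statistical significance of fitting a geodesic polynomial model with respect to time, a hypothesis test is conducted against the null hypothesis $H_0$: $t$ is irrelevant to change in shape, using the permutation approach described in [6,7].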
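A minimal sketch of these quantities on the unit sphere is given below, assuming the Fréchet mean is computed by the usual iteration of averaging log maps. The permutation test shuffles the time variable and counts how often the permuted fit reaches the observed statistic; `r2_of_fit` is a hypothetical callable that fits a model to `(points, times)` and returns its $R^2$, and is not part of the original text.

```python
import numpy as np

def exp_map(p, v):
    """Exponential map on the unit sphere."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def log_map(p, q):
    """Log map on the unit sphere."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_theta * p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * w / norm_w

def frechet_mean(points, n_iter=100, tol=1e-12):
    """Iteratively move the estimate along the average log map."""
    mean = np.asarray(points[0], dtype=float)
    for _ in range(n_iter):
        step = np.mean([log_map(mean, y) for y in points], axis=0)
        mean = exp_map(mean, step)
        if np.linalg.norm(step) < tol:
            break
    return mean

def r_squared(observed, regressed):
    """R^2 = 1 - unexplained variance / total (Frechet) variance."""
    mean = frechet_mean(observed)
    total = np.mean([np.linalg.norm(log_map(mean, y)) ** 2 for y in observed])
    unexplained = np.mean(
        [np.linalg.norm(log_map(y_hat, y)) ** 2 for y_hat, y in zip(regressed, observed)]
    )
    return 1.0 - unexplained / total

def permutation_p_value(r2_of_fit, points, times, n_perm=1000, seed=0):
    """Share of time permutations whose statistic reaches the observed one."""
    rng = np.random.default_rng(seed)
    observed = r2_of_fit(points, times)
    count = sum(
        r2_of_fit(points, rng.permutation(times)) >= observed for _ in range(n_perm)
    )
    return (count + 1) / (n_perm + 1)
```

A perfect fit gives $R^2 = 1$, while a model that predicts the Fréchet mean at every time point gives $R^2 = 0$.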
3. Results and Discussion
3.1. Test on Low Dimensional Synthetic Data
In order to validate as well as to visualize our hierarchical geodesic polynomial model, we first test our implementation on points on the unit sphere. Points on the sphere represent shapes with only one 3D point and the sphere is the corresponding shape space. The left of Fig. 1 shows the result from fitting four input points (red) with a geodesic model (green) and a 3rd order geodesic polynomial model (blue) respectively. The four input points have integer time steps ranging from 0 to 3. The 3rd order geodesic polynomial is able to fit the input points almost perfectly, whereas the linear geodesic model can only fit the inputs in a least-squares sense, which is similar to the case in Euclidean space.
The right of Fig. 1 shows regression results from fitting three input trajectories with a hierarchical geodesic polynomial model. Each input trajectory contains three points with integer timestamps ranging from 0 to 2. The three input trajectories represent trajectories with integer covariate values ranging from 0 to 2 respectively. A quadratic model is used at both the subject-specific and the population level. The green points on the left of Fig. 1 also show the changes in anchor points from fitting a quadratic model, with covariate values ranging from 0 to 2 with an interval of 0.25. The blue points demonstrate the changes in regressed trajectories for covariates ranging from 0 to 2 with an interval of 0.5. Given the three input trajectories, our hierarchical geodesic polynomial model is capable of fitting the inputs perfectly with a quadratic model at both subject-wise level and population level.
3.2. Analysis of 4D Pediatric Right Ventricular Data
The shape of the right ventricle (RV) is known to influence the function of the tricuspid valve, but precise shape-based characterization of the RV in hypoplastic left heart syndrome (HLHS) has not been described. The 4D trajectories of 94 pediatric RVs are acquired from 3D echocardiogram-based speckle tracking, and then transferred into the TOMTEC imaging system for the computation of the 3D models of the RV chamber. Each acquisition contains approximately one cardiac cycle, with 10 to 30 captured frames. Due to some incomplete cardiac cycle acquisitions as well as the fact that some obtained cardiac cycles do not perfectly repeat themselves from the end to the start point, the data is divided into two subsets, the systolic and diastolic phases. Since lengths of individual cardiac cycles can be different, we standardize trajectories for each systolic and diastolic phase with 50 equally spaced frames in each trajectory [5]. As such, we finally obtain 58 systolic and 36 diastolic trajectories for hierarchical geodesic regression analysis. To evaluate the impact of our polynomial model, we compare it to the geodesic model [11], as to our knowledge it is the only longitudinal shape model that incorporates multiple covariates.
Subject-wise Model
We first fit a polynomial model to subject-specific trajectories, with the left of Fig. 2 showing the resulting error from fitting models to a representative single subject trajectory. As the order of the polynomial model increases, the regression error decreases significantly from $7.65 \times 10^{-3}$ (1st order) to $3.82 \times 10^{-5}$ (5th order), a 99.5% reduction (the error is unitless because shapes are normalized to unit size). Fig. 3 shows that the trajectory from a higher order polynomial regression exhibits more nonlinearity than the geodesic model, as expected. Though not obvious, it is observed that the maximum distance point shifts from the left to the right side of the shape in the polynomial model, whereas the most distant point remains the same point in the geodesic model throughout time. The right of Fig. 2 shows the ranges and the mean values of $R^2$ across the systole population, which indicates that trajectories of shape change over time better match the observed data with models of higher order.
Anchor Point Model
As more samples are desirable for regression analysis, we first test our anchor point model on all end systolic and end diastolic shapes in both systolic and diastolic trajectories. Two covariates from the demographics are chosen for level 2 models: tricuspid regurgitation severity (TRS) and right ventricle function (RVF), which take on the values shown in Table 1.
Table 1:
Covariate Value | TRS | RVF |
---|---|---|
0 | Trivial | Normal |
1 | Mild | Low normal |
1.5 | Mild to moderate | Low normal to mildly diminished |
2 | Moderate | Mildly diminished |
2.5 | Moderate to severe | Mildly to moderately diminished |
3 | Severe | Moderately diminished |
3.5 | | Moderately to severely diminished |
4 | | Severely diminished |
We fit both geodesic linear models and quadratic polynomial models to the end systolic and end diastolic shapes, with the RVF and/or TRS as covariates. Table 2 shows regression errors. It can be seen that (i) quadratic models lead to smaller errors than geodesic models, (ii) model fitting with respect to TRS leads to smaller errors than fitting with respect to RVF, which indicates that the shape changes are better aligned with changes in TRS, and (iii) regressions using both covariates outperform those using a single covariate in terms of model fitting, which is expected as more independent covariates are taken into account. From visual observations of the shape changes, similar to the results from fitting a polynomial model to subject-specific trajectories, the higher order polynomial model fitting including covariates yields more nonlinear changes in the end systolic and end diastolic shapes. While RV shape changes with respect to RVF are smoother, shape changes associated with TRS show more local variability over time, leading to an obvious compressed edge between the RV top region and the septal wall at the most severe level of tricuspid regurgitation.
Table 2:
Phase | Covariates | Linear model errors(1) | Quadratic model errors(1) |
---|---|---|---|
End diastole | RVF | 5.521 | 5.487 |
End diastole | TRS | 5.515 | 5.442 |
End diastole | RVF & TRS | 5.454 | 5.208 |
End systole | RVF | 5.178 | 5.138 |
End systole | TRS | 5.147 | 5.062 |
End systole | RVF & TRS | 5.102 | 4.840 |
It is also observed that changes in shape with respect to the same covariates are more prominent in the multivariate regression. Figs. 4 and 5 show that the multivariate regressions yield more observable changes in shapes, as the covariate-specific changes of the shape are captured separately by the different covariates.
Fig. 6 shows the full spectrum of how the shape of the RV changes with respect to different values of the covariates at end diastole. Due to the sparsity and large variability in the input data set, extrapolating RV shape to extreme values of both covariates leads to a non-feasible real world shape as no such combination appeared in the input data set.
3.3. Future Work
Our current work can be extended in a few respects. First, the current parallel transport of tangent vectors is computed along the geodesic between the start and end points. At the population level, however, it is also possible that the directions of the tangent vectors depend on the anchor points’ trajectories, in which case parallel transport should be computed along a certain path on the Riemannian manifold (e.g. the regressed anchor point’s trajectory in the single covariate case). In the case with multiple covariates, the choice of the path requires further study. Second, if scale is a key factor to consider in the shape model, it is also feasible to append a scale factor to the existing model, which is regarded as an additional entry in shape space, and the solution process would be almost identical to that for the other entries in the anchor point or tangent vector. As we collect more data, we will also investigate modeling growth or pathology models of HLHS over a larger time period (i.e. years) instead of the cardiac cycle.
4. Conclusions
In comparison with previous geodesic models, polynomial regression leads to more accurate and flexible data-matching results at both subject-wise and population levels. Population-level regression with respect to multiple covariates leads to clearer separation between covariate effects on the shapes as indicated from validation on 4D right ventricular data. The regressed model is able to yield results that are consistent with qualitative clinical evaluations.
Given the sparsity and large variability in the input right ventricular data set, extrapolating shapes outside of the input covariate combination range may lead to irregular reconstructed shapes, which is understandable. The proposed HGPM can be further extended with higher order polynomial expansion on covariates as well as with other basis functions (e.g. kernel functions) for better fitting on the input. Overall, the proposed HGPM can be used for multilevel analysis of longitudinal shape data, leading to interpretable results relating functions (covariates) with shape trajectories, thus being promising for a variety of relevant clinical research in the future.
5. Acknowledgements
This work is supported by the National Institutes of Health grants R01EB021391 and R01HL153166.
References
- 1. Bernal-Rusiel JL, Greve DN, Reuter M, Fischl B, Sabuncu MR, Initiative ADN, et al.: Statistical analysis of longitudinal neuroimage data with linear mixed effects models. Neuroimage 66, 249–260 (2013)
- 2. do Carmo MP: Differential Geometry of Curves and Surfaces. Prentice Hall (1976)
- 3. Durrleman S, Pennec X, Trouvé A, Braga J, Gerig G, Ayache N: Toward a comprehensive framework for the spatiotemporal statistical analysis of longitudinal shape data. International journal of computer vision 103(1), 22–59 (2013)
- 4. Durrleman S, Pennec X, Trouvé A, Gerig G, Ayache N: Spatiotemporal atlas estimation for developmental delay detection in longitudinal datasets. In: MICCAI. pp. 297–304. Springer (2009)
- 5. Fishbaugh J, Gerig G: Acceleration controlled diffeomorphisms for nonparametric image regression. In: ISBI. pp. 1488–1491 (2019)
- 6. Fletcher PT: Geodesic Regression on Riemannian Manifolds. In: MICCAI MFCA. pp. 75–86 (2011), https://hal.inria.fr/inria-00623920
- 7. Fletcher PT: Geodesic regression and the theory of least squares on riemannian manifolds. IJCV 105(2), 171–185 (2013)
- 8. Guigui N, Maignant E, Trouvé A, Pennec X: Parallel transport on kendall shape spaces. In: GSI. pp. 103–110 (2021)
- 9. Hinkle J, Muralidharan P, Fletcher PT, Joshi S: Polynomial regression on riemannian manifolds. In: ECCV. pp. 1–14 (2012)
- 10. Hinkle J, Muralidharan P, Fletcher PT, Joshi S: Intrinsic polynomials for regression on riemannian manifolds. J. of Mathematical Imaging and Vision (2014)
- 11. Hong S, Fishbaugh J, Wolff JJ, Styner MA, Gerig G: Hierarchical multi-geodesic model for longitudinal analysis of temporal trajectories of anatomical shape and covariates. In: MICCAI. pp. 57–65 (2019)
- 12. Klingenberg CP: Walking on kendall’s shape space: Understanding shape spaces and their coordinate systems. Evolutionary Biology pp. 1–19 (2020)
- 13. Lorenzi M, Pennec X, Frisoni GB, Ayache N, Initiative ADN, et al.: Disentangling normal aging from alzheimer’s disease in structural magnetic resonance images. Neurobiology of aging 36, S42–S52 (2015)
- 14. Lou A, Katsman I, Jiang Q, Belongie S, Lim SN, De Sa C: Differentiating through the fréchet mean. In: ICML (2020)
- 15. Nava-Yazdani E, Hege HC, Sullivan TJ, von Tycowicz C: Geodesic analysis in kendall’s shape space with epidemiological applications. Journal of Mathematical Imaging and Vision 62(4), 549–559 (2020)
- 16. Nava-Yazdani E, Hege HC, von Tycowicz C: A hierarchical geodesic model for longitudinal analysis on manifolds. J. Math. Imaging Vis. 64(4), 395–407 (2022)
- 17. Sadeghi N, Prastawa M, Fletcher PT, Wolff J, Gilmore JH, Gerig G: Regional characterization of longitudinal dt-mri to study white matter maturation of the early developing brain. Neuroimage 68, 236–247 (2013)
- 18. Singh N, Hinkle J, Joshi S, Fletcher PT: A hierarchical geodesic model for diffeomorphic longitudinal shape analysis. In: IPMI. pp. 560–571 (2013)
- 19. Woltman H, Feldstain A, MacKay JC, Rocchi M: An introduction to hierarchical linear modeling. Tutorials in Quantitative Methods for Psychology 8(1), 52–69 (2012)