Abstract
We consider in this paper a statistical two-phase regression model in which the change point of a disease biomarker is measured relative to another point in time, such as the manifestation of the disease, which is subject to right-censoring (i.e., possibly unobserved over the entire course of the study). We develop point estimation methods for this model, based on maximum likelihood, and bootstrap validation methods. The effectiveness of our approach is illustrated by numerical simulations, and by the estimation of a change point for amygdalar atrophy in the context of Alzheimer’s disease, wherein it is related to the cognitive manifestation of the disease.
Keywords and Phrases: Change-point estimation, right censoring, medical imaging
1. Introduction
The manifestation of an event, such as the onset of a disease, is not always immediate and often requires some time for its repercussions to become observable. Slowly progressing diseases, and in particular neuro-degenerative disorders such as Alzheimer’s disease (AD), which is a focus of the current paper, fall into this category. The manifestation of such diseases is related to the onset of cognitive or functional impairment and, at the time when this occurs, the disease may already have been affecting the brain anatomically and functionally for a considerable time. Such effects, however, are only observable through costly and sometimes even invasive medical examinations, which are not routinely performed on healthy, or apparently healthy, populations.
It is however extremely important scientifically and clinically to determine how the disease evolves and the time when brain change begins, especially when the disease’s pathology is (currently) irreversible like that of AD. The goal of this paper is to propose and analyze a statistical model that addresses this issue by determining a change point, at the population level, at which the evolution of a given biomarker develops a regime change. Assuming the measurements of these biomarkers are taken from a dataset including asymptomatic subjects, with a subsequent determination of the disease manifestation made at a later time, we will describe the associated estimation procedure and provide numerical experiments on both simulated and real data.
It is worthy of note that performing such disease-related observations on human beings is a very challenging process. It requires following subjects regularly over many years, starting at a time point when they have not manifested any sign of the disease yet, with an uncertainty over how many subjects will have converted to diseased status within the time frame of the study. While some genetic or family history information can be used to increase the likelihood of observing disease manifestation, the difficulty of this data acquisition task explains why datasets of this kind are still relatively rare today. One may expect, however, that systematic patient monitoring and computerized medical recording will lead to more such datasets emerging in the future.
The real data that we will use in this paper are provided by the BIOCARD study, which is a longitudinal study of AD in which subjects have been continuously followed for more than 20 years. BIOCARD has, compared to other longitudinal studies on AD such as ADNI [Mueller et al. (2005)], the distinction of having only included individuals who had no sign of cognitive impairment at baseline. The BIOCARD study was initiated in 1995 with the subjects having received their most recent cognitive assessment in 2012, resulting in about one-fourth of the group being diagnosed with mild cognitive impairment (MCI) or AD. This dataset motivates the development of the statistical model and parameter estimation method presented in this paper, and more detailed information about it will be provided in Section 4.2.
We will describe the basic assumptions of our model and our notation in Section 2. Section 3 will describe the parameter estimation procedure. Section 4 will provide experimental results, both on simulated data and on the BIOCARD dataset.
2. Statistical model and notation
We will let Y denote the dependent variable, which in our application, will be associated to the measurement of a biomarker for AD. The value of this variable is assumed to depend on time (which, in this paper, will always be the subject’s age), disease status, and possibly other covariates (e.g., gender, intracranial volume, etc.). The general model assumes that the disease status results in a change point in the evolution of the biomarker, at a time which is indirectly observable through the induced external manifestation (e.g., cognitive impairment) which happens after a delay, Δ, from the change point time. We will assume that Δ is a fixed parameter specific to the biomarker.
Let the random variable U denote the time of disease manifestation itself (and thus the biomarker’s change point occurs at time U − Δ). We will refer to U as the “manifest onset time”. We will assume that it is always finite, in the sense that everyone in the considered population would eventually develop the disease if they were to live indefinitely. This manifest onset time is not always observable, since patients may still be healthy—or have an undetected onset—at the end of the study, resulting in right censoring. Some form of left censoring is also possible, depending on how subjects participating in the study were selected.
We let n denote the total number of subjects involved in the study, each with their own manifest onset times U1, …, Un. We let T1, …, Tn be the ages of the subjects at the end of the study, which are the right censoring times, so that Uk is observable only when it is less than the corresponding Tk. A value ykj of the biomarker is measured for subject k at age tkj, with j = 1, …, pk (pk therefore denotes the total number of longitudinal observations for subject k). We will work with a model assuming a linear dependence of Y with respect to age and onset time, with a rate change at t = U − Δ, in the form
$$Y(t) = a + b_1 t + b_2 U + c\,(t - U + \Delta)^+ + \eta + \varepsilon(t), \tag{1}$$
where (x)+ = max(x, 0). Here, a, b1, b2, c, and Δ are parameters, η ∼ N(0, ρ²) models a time-independent random effect, and ε(t) ∼ N(0, σ²) is a noise variable modeling the longitudinal intra-subject variation. In our model, η and all ε(t) are mutually independent, and all are independent of U. Note that the well-posedness of the model requires some limits on the values of Δ; if Δ ≪ 0, then (t − U + Δ)+ is almost always zero, and thus c and Δ become barely identifiable; conversely, if Δ ≫ 0, then (t − U + Δ)+ ≃ t − U + Δ and the model becomes over-parametrized. We will return to these issues later.
Assuming n independent realizations of this model, the observations are: ykj = Y(tkj), for k = 1, …, n and j = 1, …, pk; the end-of-study ages, Tk; and the censored manifest onset times, Zk = min(Uk, Tk). We will assume that all Tk’s are deterministic, or equivalently, that they are independent of other variables, and work conditionally on them. The final piece of the model is the distribution of the manifest onset time, U. We will use either a Gaussian or an exponentially modified Gaussian distribution (see Section 3.1).
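To make the structure of model (1) concrete, the following minimal Python sketch simulates the measurements of a single subject. It assumes Gaussian random effect and noise with standard deviations ρ and σ; the function names and the numerical values in the usage lines are illustrative choices, not part of the original implementation.

```python
import random

def positive_part(x):
    """Return (x)+ = max(x, 0)."""
    return max(x, 0.0)

def mean_biomarker(t, u, a, b1, b2, c, delta):
    """Fixed-effect part of model (1) at age t for manifest onset time u."""
    return a + b1 * t + b2 * u + c * positive_part(t - u + delta)

def simulate_subject(ages, u, a, b1, b2, c, delta, rho, sigma, rng):
    """Simulate one subject: a shared random effect eta plus independent noise per visit."""
    eta = rng.gauss(0.0, rho)  # time-independent random effect (assumed Gaussian)
    return [mean_biomarker(t, u, a, b1, b2, c, delta) + eta + rng.gauss(0.0, sigma)
            for t in ages]

# Illustrative usage: four visits, two years apart, change point at u - delta = 60.
rng = random.Random(0)
ages = [60.0, 62.0, 64.0, 66.0]
y = simulate_subject(ages, u=70.0, a=1.4, b1=-0.10, b2=0.0, c=-0.02,
                     delta=10.0, rho=0.16, sigma=0.12, rng=rng)
```

Before the change point (t < u − Δ) the mean is linear in t with slope b1; after it, the slope becomes b1 + c.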
Our model is therefore a two-phase (or segmented) regression model with right-censoring on the time variable. Parametric inference and testing for multi-phase regression were studied in Quandt (1958), Sprent (1961), Hudson (1966), Hinkley (1969, 1971), Farley and Hinich (1970), Feder (1975), Gombay and Horváth (1994a, 1994b). Change-point models have also been introduced for survival analysis and hazard estimation, especially in the context of right censoring [Nguyen, Rogers and Walker (1984), Pons (2003), Wu, Zhao and Wu (2003), Dupuy (2006), Li, Qian and Zhang (2013)] [see also Chen and Gupta (2000) and references cited there].
Note that, even though our data includes right censoring, we are using a linear regression model rather than a hazard or Cox regression model. The model in (1) provides the correct “causal” relationship in which the change point U − Δ triggers a change of regime in the dependent variable Y.
3. Parameter estimation
3.1. Manifest onset time: Prior model
We here describe how the parameters of the manifest onset time distribution can be estimated from data under various hypotheses of right and left censoring. We consider it as a prior distribution and, therefore, assume that it is estimated separately from the other model parameters. This is justified by the fact that, while it would be possible to estimate the joint distribution of (Y, U) for any given biomarker, it is generally preferable to work with a single model of U shared by all biomarker variables Y. Moreover, manifest onset time information for medical data is usually available for more diverse and larger datasets than those on which biomarkers are measured, and this information can naturally be used to estimate their distribution.
In typical studies, one can generally separate the subjects into three groups: the subjects who converted during the study, that we will denote J0, the subjects who converted after study end (right censored), denoted J1 and those who entered the study with the disease (left censored), denoted J2. Some study designs (such as BIOCARD) focused on incident disease are restricted to disease-free cohorts, thus eliminating the last group.
Assuming right censoring only (J2 = ∅) and letting as before Tk be the age at the end of the study, J1 ⊂ {1, …, n} is the subset of subject indexes for which Uk ≥ Tk and J0 = {1, …, n} \ J1. Denote by fU(u|θ) the p.d.f. of the variable U (for a given parameter θ) and by FU(u|θ) the corresponding c.d.f. The log-likelihood of the observed data, which is (min(Uk, Tk), k = 1, …, n), is

$$\ell(\theta) = \sum_{k \in J_0} \log f_U(u_k \mid \theta) + \sum_{k \in J_1} \log\bigl(1 - F_U(T_k \mid \theta)\bigr).$$
For studies in which diseased subjects are not enrolled by design, the likelihood must be modified to account for this bias, which requires that Uk ≥ tk1 for all k. Taking the likelihood conditional on this event, we obtain
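$$\ell(\theta) = \sum_{k \in J_0} \log f_U(u_k \mid \theta) + \sum_{k \in J_1} \log\bigl(1 - F_U(T_k \mid \theta)\bigr) - \sum_{k=1}^{n} \log\bigl(1 - F_U(t_{k1} \mid \theta)\bigr). \tag{2}$$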
Finally, if diseased subjects can be included in the study, yielding a non-empty set J2, the resulting likelihood is
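$$\ell(\theta) = \sum_{k \in J_0} \log f_U(u_k \mid \theta) + \sum_{k \in J_1} \log\bigl(1 - F_U(T_k \mid \theta)\bigr) + \sum_{k \in J_2} \log F_U(t_{k1} \mid \theta). \tag{3}$$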
First, consider the case in which fU is a Gaussian distribution N(m1, σ1²), which is probably the simplest choice. In this case, the log-likelihood is
| (4) |
and its gradient can be easily computed.
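As an illustration, a censored Gaussian log-likelihood of this kind can be evaluated with the Python standard library alone. The sketch below is not the authors' Matlab code: the function and argument names are hypothetical, observed onsets contribute a log-density, right-censored subjects a log-survival term at Tk, left-censored subjects a log-c.d.f. term at tk1, and the optional `entry_ages` argument subtracts the log-survival at enrollment to condition on Uk ≥ tk1 for disease-free cohorts.

```python
import math

SQRT2 = math.sqrt(2.0)

def norm_logpdf(u, m, s):
    """Log-density of N(m, s^2) at u."""
    z = (u - m) / s
    return -0.5 * z * z - math.log(s) - 0.5 * math.log(2.0 * math.pi)

def norm_log_sf(x, m, s):
    """log P(U >= x) for U ~ N(m, s^2)."""
    return math.log(0.5 * math.erfc((x - m) / (s * SQRT2)))

def norm_log_cdf(x, m, s):
    """log P(U <= x) for U ~ N(m, s^2)."""
    return math.log(0.5 * math.erfc(-(x - m) / (s * SQRT2)))

def onset_loglik(m, s, observed=(), right_censored=(), left_censored=(), entry_ages=()):
    """Censored log-likelihood of the onset-time model (Gaussian case).

    observed:       onset ages u_k for converters (J0)
    right_censored: end-of-study ages T_k for non-converters (J1)
    left_censored:  enrollment ages t_k1 for subjects enrolled with the disease (J2)
    entry_ages:     enrollment ages used to condition on U_k >= t_k1 (disease-free design)
    """
    ll = sum(norm_logpdf(u, m, s) for u in observed)
    ll += sum(norm_log_sf(T, m, s) for T in right_censored)
    ll += sum(norm_log_cdf(t1, m, s) for t1 in left_censored)
    ll -= sum(norm_log_sf(t1, m, s) for t1 in entry_ages)
    return ll
```

For extreme arguments the tail probabilities can underflow to zero, so a production implementation would need a more careful log-tail evaluation.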
One may however prefer using a distribution with heavier tails, allowing for large values of the variable to occasionally occur. Such a behavior may be important to allow for healthy controls to have a manifest onset time so far in the future that they do not enter the second phase of the regression model during the study time. We will present simulations using an exponentially modified Gaussian, which can be written as U = W + S, where W ∼ N(m1, σ1²) and S ∼ exp(α) (an exponential variable with mean α). Its p.d.f. is the convolution of the Gaussian and exponential densities, and is given by
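$$f_U(u \mid m_1, \sigma_1, \alpha) = \frac{1}{\alpha} \exp\left(\frac{m_1 - u}{\alpha} + \frac{\sigma_1^2}{2\alpha^2}\right) \Phi\left(\frac{u - m_1}{\sigma_1} - \frac{\sigma_1}{\alpha}\right), \tag{5}$$

where Φ is the cumulative distribution function (c.d.f.) of a standard Gaussian variable. Similarly, the c.d.f. of U is

$$F_U(u \mid m_1, \sigma_1, \alpha) = \Phi\left(\frac{u - m_1}{\sigma_1}\right) - \exp\left(\frac{m_1 - u}{\alpha} + \frac{\sigma_1^2}{2\alpha^2}\right) \Phi\left(\frac{u - m_1}{\sigma_1} - \frac{\sigma_1}{\alpha}\right).$$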
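The density and c.d.f. of an exponentially modified Gaussian are straightforward to code. The sketch below uses the standard closed forms for U = W + S with W Gaussian and S exponential with mean α, written with Python's `math.erfc`; the function names are illustrative, and no safeguard against overflow of the exponential factor is included.

```python
import math

def std_norm_cdf(z):
    """C.d.f. of a standard Gaussian variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def emg_pdf(u, m1, s1, alpha):
    """Density of U = W + S, W ~ N(m1, s1^2), S exponential with mean alpha > 0."""
    z = (u - m1) / s1
    expo = math.exp((m1 - u) / alpha + s1 * s1 / (2.0 * alpha * alpha))
    return expo * std_norm_cdf(z - s1 / alpha) / alpha

def emg_cdf(u, m1, s1, alpha):
    """C.d.f. of the same variable; note F(u) = Phi(z) - alpha * f(u)."""
    z = (u - m1) / s1
    return std_norm_cdf(z) - alpha * emg_pdf(u, m1, s1, alpha)
```

A finite-difference derivative of `emg_cdf` should agree with `emg_pdf`, which gives a simple consistency check of the two formulas.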
These expressions can be plugged into (3), and the gradient of the resulting log-likelihood with respect to each of the three parameters m1, σ1, and α can also be computed analytically. Our implementation uses the Matlab optimization toolbox to compute the maximum likelihood estimators of those three parameters.
This exponentially modified Gaussian model obviously reduces to the Gaussian one when α = 0, and we used a log-likelihood ratio test in order to assess the validity of the hypothesis α > 0.
Note that we should in principle have conditioned our prior distribution to take only positive values. However, the models estimated in our applications are such that the probability of taking a negative value is negligible (less than 10⁻¹⁰) and such a modification was not necessary. If implemented, this modification would only have affected the case J2 ≠ ∅ since the likelihood for subjects in J0 and J1 is already left-censored by U > tk,1, and would have resulted in the addition of the term
to the log-likelihood in (3).
Illustration
We first provide an application using simulations based on population data relative to Alzheimer’s disease. The prevalence of Alzheimer’s disease over various age groups was published in Hebert et al. (2013). From this source, prevalence is about 3% in the 65–74 age group, 17.6% in the 75–84 age group and 30% in the 85–94 age group. Similarly, data in the Alzheimer’s Association (2015) indicate that prevalence among people above 95 years may be as high as 50%. Based on results such as those provided in Larrieu et al. (2002), one may add about 5% to these numbers to also include the fraction of population with mild cognitive impairment, a precursor state to Alzheimer’s. Using this information, it is easy to derive a logistic regression model that provides the probability of disease conditional on age.
We can then use this information combined with census data to simulate a large-scale sample of population at various ages and their disease status. For such a sample, which is purely cross-sectional (i.e., does not contain any longitudinal information), one can estimate an exponentially modified Gaussian distribution with (using the previous notation) uk = Tk = tk1, J0 = ∅, J1 being the set of healthy subjects and J2 the set of diseased subjects. Doing so, the obtained parameters provide a Gaussian term with a very large variance (m1 ≃ 95, σ1 ≃ 20) and a small exponential term (α ≃ 0.1) which is not significant (implying that a Gaussian model can be used). The values that we used in our experiments were slightly different (m1 = 93, σ1 = 14.5), because the BIOCARD dataset is slightly biased, in the sense that it enrolled a majority of patients with a family history of AD. We estimated these parameters from another dataset, with enrollment procedures similar to BIOCARD.
As a further illustration, Tables 1 to 3 provide simulation results for various parameters of an exponentially modified Gaussian model, with 1000 subjects initially generated with ages normally distributed with mean 57 and standard deviation 10, and true model parameters m1, σ1, and α. We assumed both left and right censoring: only healthy subjects were kept at the beginning of the “study”, with a study length being 15 years (so that T − t1 = 15). We kept α = 2 years and ran simulations with m1 = 55, 65, 75 years and σ1 = 1, 2, 3, 5, 10, and 15 years.
Table 1.
Parametric estimation of the onset time model averaged over 10,000 simulations. Here, n is the number of subjects after left censoring (out of 1000), and |J1| is the average number of right-censored subjects (healthy 15 years after enrollment). Numbers between parentheses provide information on the estimation of a Gaussian model with the same mean and variance. The power is computed as the fraction of likelihood ratios relative to the submodel α = 0 that were larger than 3.84.
| True m1 | True σ1 | True α | n | \|J1\| | Bias m1 | Bias σ1 | Bias α | Std. dev. m1 | Std. dev. σ1 | Std. dev. α | Power (in %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 55 (57) | 1 (2.2) | 2 (0) | 384 | 40 | 0.008 (−0.100) | −0.005 (0.228) | −0.006 | 0.150 (0.122) | 0.122 (0.159) | 0.165 | 100.0 |
| 55 (57) | 2 (2.8) | 2 (0) | 386 | 42 | 0.023 (−0.100) | −0.001 (0.187) | −0.022 | 0.265 (0.170) | 0.194 (0.163) | 0.245 | 98.7 |
| 55 (57) | 3 (3.6) | 2 (0) | 388 | 45 | 0.076 (−0.102) | 0.004 (0.149) | −0.078 | 0.474 (0.241) | 0.286 (0.184) | 0.444 | 67.2 |
| 55 (57) | 5 (5.4) | 2 (0) | 395 | 57 | 0.400 (−0.094) | −0.003 (0.094) | −0.392 | 1.076 (0.428) | 0.457 (0.274) | 1.049 | 11.1 |
| 55 (57) | 10 (10.2) | 2 (0) | 417 | 104 | 0.090 (−0.124) | −0.455 (0.059) | 0.108 | 2.167 (1.233) | 1.101 (0.695) | 2.162 | 3.0 |
| 55 (57) | 15 (15.1) | 2 (0) | 434 | 160 | −0.942 (−0.335) | −1.291 (0.112) | 1.577 | 3.632 (2.658) | 2.335 (1.443) | 3.781 | 3.0 |
Table 3.
Same as Table 1, with m1 = 75 years
| True m1 | True σ1 | True α | n | \|J1\| | Bias m1 | Bias σ1 | Bias α | Std. dev. m1 | Std. dev. σ1 | Std. dev. α | Power (in %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 75 (77) | 1 (2.2) | 2 (0) | 952 | 576 | 0.005 (−0.033) | −0.005 (−0.201) | −0.005 | 0.138 (0.110) | 0.096 (0.108) | 0.178 | 100.0 |
| 75 (77) | 2 (2.8) | 2 (0) | 949 | 576 | 0.026 (−0.032) | −0.002 (−0.151) | −0.024 | 0.285 (0.137) | 0.158 (0.112) | 0.309 | 94.8 |
| 75 (77) | 3 (3.6) | 2 (0) | 945 | 574 | 0.120 (−0.034) | 0.010 (−0.112) | −0.120 | 0.602 (0.172) | 0.233 (0.134) | 0.623 | 48.2 |
| 75 (77) | 5 (5.4) | 2 (0) | 933 | 569 | 0.336 (−0.027) | −0.023 (−0.064) | −0.322 | 1.252 (0.256) | 0.368 (0.203) | 1.284 | 8.0 |
| 75 (77) | 10 (10.2) | 2 (0) | 883 | 556 | −0.639 (−0.014) | −0.423 (−0.016) | 0.794 | 2.710 (0.511) | 0.864 (0.496) | 2.899 | 2.8 |
| 75 (77) | 15 (15.1) | 2 (0) | 826 | 544 | −2.353 (−0.037) | −1.133 (0.015) | 2.857 | 4.665 (0.886) | 1.846 (1.026) | 5.326 | 2.8 |
From these results, we note that α becomes almost impossible to separate from 0 when σ1 increases, inducing biases in the combined estimation of m1 and α. For smaller values of σ1, the bias is small and the likelihood ratio test has high power; this is mostly independent of the proportion of left-censored data. For large standard deviations, the null hypothesis is almost always accepted (about 97% of the time). The estimations of the mean (m1 + α) and standard deviation (√(σ1² + α²)) remain relatively accurate via the Gaussian submodel, even when the bias and variance of the full model parameter estimates increase.
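As a quick check of the values shown in parentheses in the "true parameters" columns, the first two moments of the exponentially modified Gaussian follow from U = W + S, with W and S taken to be independent (as is standard for this distribution):

```latex
\mathbb{E}[U] = m_1 + \alpha, \qquad \operatorname{Var}(U) = \sigma_1^2 + \alpha^2 .
% Example: m_1 = 55, \sigma_1 = 1, \alpha = 2 gives mean 57 and standard deviation \sqrt{5} \approx 2.24.
% Example: \sigma_1 = 5, \alpha = 2 gives standard deviation \sqrt{29} \approx 5.39.
```

These agree with the rounded entries 57, 2.2, and 5.4 reported in the tables.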
3.2. Change point onset model
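We now describe the estimation of the parameters a, b1, b2, c, ρ², and σ² which affect the change-point onset model (1), assuming that m1, σ1, and α are fixed. The joint p.d.f. of the model variables is ∏k f(yk, uk, ηk) where, for a subject with p observations y = (y1, …, yp) at ages t1, …, tp,

$$f(y, u, \eta) = f_U(u)\, \frac{1}{\rho}\,\varphi\!\left(\frac{\eta}{\rho}\right) \prod_{j=1}^{p} \frac{1}{\sigma}\,\varphi\!\left(\frac{y_j - a - b_1 t_j - b_2 u - c\,(t_j + \Delta - u)^+ - \eta}{\sigma}\right),$$

with

$$f_U(u) = \frac{1}{\alpha} \exp\left(\frac{m_1 - u}{\alpha} + \frac{\sigma_1^2}{2\alpha^2}\right) \Phi\left(\frac{u - m_1}{\sigma_1} - \frac{\sigma_1}{\alpha}\right),$$

or simply φ((u − m1)/σ1)/σ1 if α = 0, where φ and Φ are the p.d.f. and c.d.f. of a standard Gaussian variable. Because we assume that m1, σ1, and α are fixed, we omit the left-censoring normalization which only depends on them and, therefore, will not impact the maximization of the likelihood of the observation process.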
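The observations are either (yk, uk) for k ∈ J0, or just yk and the additional information that uk ≥ Tk (resp., uk ≤ tk1) for k ∈ J1 (resp., k ∈ J2). The likelihood of the observed data is therefore given by

$$\prod_{k \in J_0} f_{Y,U}(y_k, u_k) \prod_{k \in J_1} f_Y(y_k \mid U_k \ge T_k) \prod_{k \in J_2} f_Y(y_k \mid U_k \le t_{k1}),$$

where fY,U is the marginal density of Y and U in the first product, and the p.d.f.’s in the next two products are conditional densities of Y given the relevant event for U. Note that

$$f_Y(y_k \mid U_k \ge T_k) = \frac{\int_{T_k}^{\infty} f_{Y,U}(y_k, u)\,du}{P(U_k \ge T_k)}, \qquad f_Y(y_k \mid U_k \le t_{k1}) = \frac{\int_{-\infty}^{t_{k1}} f_{Y,U}(y_k, u)\,du}{P(U_k \le t_{k1})}.$$

The denominators were computed in the previous section. They depend only on the fixed distribution of U, and can therefore be treated as constants. Using the notation

$$f_Y(y_k, U_k \ge T_k) = \int_{T_k}^{\infty} f_{Y,U}(y_k, u)\,du, \qquad f_Y(y_k, U_k \le t_{k1}) = \int_{-\infty}^{t_{k1}} f_{Y,U}(y_k, u)\,du,$$

we therefore need to maximize

$$\sum_{k \in J_0} \log f_{Y,U}(y_k, u_k) + \sum_{k \in J_1} \log f_Y(y_k, U_k \ge T_k) + \sum_{k \in J_2} \log f_Y(y_k, U_k \le t_{k1}).$$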
We now provide the expression of the marginal densities in the product. For simplicity, we will drop the index k in the rest of the computation, therefore letting y = (y1, …, yp) and t = (t1, …, tp).
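We start with

$$f_{Y,U}(y, u) = \int_{\mathbb{R}} f(y, u, \eta)\,d\eta.$$

Using rj = yj − a − b1tj − b2u − c(tj + Δ − u)+, we have

$$f_{Y,U}(y, u) = \frac{f_U(u)}{Z} \exp\left(-\frac{1}{2\sigma^2}\left(\sum_{j=1}^{p} r_j^2 - \frac{\rho^2}{\sigma^2 + p\rho^2}\Bigl(\sum_{j=1}^{p} r_j\Bigr)^2\right)\right)$$

with

$$Z = (2\pi)^{p/2}\, \sigma^{p-1} \sqrt{\sigma^2 + p\rho^2}.$$

Given this, we have

$$f_Y(y, U \ge T) = \frac{1}{Z} \int_{T}^{\infty} f_U(u) \exp\left(-\frac{1}{2\sigma^2}\left(\sum_{j=1}^{p} r_j(u)^2 - \frac{\rho^2}{\sigma^2 + p\rho^2}\Bigl(\sum_{j=1}^{p} r_j(u)\Bigr)^2\right)\right) du, \tag{6}$$

in which we have made explicit the dependence of rj on u (the expression for the event U ≤ t1 is analogous, with the integral taken over (−∞, t1]). The resulting integral cannot be computed analytically and our implementation is based on numerical evaluations using Matlab. Let ψ(θ, u) denote the term in the exponential in (6). The gradient of (6) with respect to the parameters then takes the form

$$\frac{1}{Z} \int_{T}^{\infty} f_U(u) \bigl(\partial_\theta \psi(\theta, u) - \partial_\theta \log Z\bigr)\, e^{\psi(\theta, u)}\,du,$$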
which can also be computed numerically. We used Matlab’s gradient-based optimization functions to maximize the log-likelihood. Because of the non-convexity of the likelihood function, we found it necessary to perform several runs with different initial conditions. Our numerical procedure is summarized below. When estimating the optimal value of Δ, we fix an interval [Δmin, Δmax] and step δ for incrementing Δ. (To ensure that the likelihood is differentiable in Δ, we have replaced the positive part function x ↦ x+ by a smooth approximation, replacing x+ with (x + ε)²/(4ε) for x ∈ [−ε, ε] and ε small enough.)
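The smooth approximation of the positive part mentioned above can be written in a few lines. The following Python sketch (the function name and default ε are illustrative) uses the quadratic (x + ε)²/(4ε) on [−ε, ε], which matches both the value and the slope of x+ at the two endpoints.

```python
def smooth_positive_part(x, eps=1e-3):
    """Differentiable approximation of max(x, 0).

    Equals 0 for x <= -eps, x for x >= eps, and the quadratic
    (x + eps)^2 / (4 eps) in between, so value and derivative are continuous.
    """
    if x <= -eps:
        return 0.0
    if x >= eps:
        return float(x)
    return (x + eps) ** 2 / (4.0 * eps)
```

At x = ε the quadratic evaluates to ε with derivative 1, and at x = −ε it evaluates to 0 with derivative 0.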
Optimization Algorithm
Initialization
We first estimate a sub-model with c = 0, estimating initial values for a, b1, b2, ρ², and σ². To simplify this initialization step, we impute values for the missing observations of U (using the conditional expectation of U given U ≥ T in place of right-censored observations), therefore reducing the problem to a linear model with random effects.
Preliminary step
Maximize the complete model log-likelihood initializing the gradient ascent algorithm with the parameters obtained at the previous step completed with c = 0 and Δ = Δmin. Assuming that Δmin is small enough for (t − U + Δmin)+ to vanish most of the time, the parameters estimated for the reduced model can be expected to provide a reasonable initial guess.
Step m
If Δmin + mδ > Δmax, go to the final step. Otherwise, maximize the likelihood using as an initial condition the parameters found in the previous iteration, but replacing Δ with Δmin + mδ.
Final step
Keep the set of parameters that provided the largest value of the log-likelihood.
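The warm-start scheme over the grid of Δ values can be sketched as follows. This is only an outline under stated assumptions: `local_maximizer` is a hypothetical user-supplied routine that runs a gradient-based ascent of the log-likelihood from a given parameter dictionary (including the key `"delta"`) and returns the optimized parameters and the attained log-likelihood; it stands in for the Matlab optimizer used in the paper.

```python
def fit_over_delta_grid(local_maximizer, init_params, delta_min, delta_max, step):
    """Run local maximizations restarted along a grid of delta values.

    local_maximizer(params) -> (params_hat, loglik) is a hypothetical callable.
    init_params holds the estimates from the c = 0 sub-model.
    Returns the best (params, loglik) pair over all restarts.
    """
    # Preliminary step: start from the sub-model estimates with c = 0, delta = delta_min.
    params = dict(init_params, c=0.0, delta=delta_min)
    params, loglik = local_maximizer(params)
    best_params, best_loglik = dict(params), loglik

    # Step m: restart from the previous solution, with delta reset to delta_min + m * step.
    m = 1
    while delta_min + m * step <= delta_max + 1e-12:
        start = dict(params, delta=delta_min + m * step)
        params, loglik = local_maximizer(start)
        if loglik > best_loglik:
            best_params, best_loglik = dict(params), loglik
        m += 1

    # Final step: keep the parameters with the largest log-likelihood.
    return best_params, best_loglik
```

Each restart reuses the previous solution for all parameters except Δ, which is the mechanism described in "Step m" above.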
3.3. Validation
Since Δ is not identifiable when c = 0, it is important to reject this hypothesis to ensure that the estimated value of the change point is meaningful. We used a likelihood ratio test, therefore comparing the maximum log-likelihood obtained in the previous section to the one obtained in the case of c = 0. The likelihood in this latter case (which can be computed using the same approach as for the general model) was also maximized using a gradient-based method, and in this case, we used the better result from two runs: the first one starts from the parameters estimated from the imputed model in the initialization step of the general algorithm. The second starts with a + cΔ, b1 + c, b2 − c, ρ and σ, where (a, b1, b2, c, Δ, ρ, σ) are the maximum likelihood parameters estimated for the general model. This choice relies on the fact that for large values of Δ, ensuring t − U + Δ > 0 with large probability, the general model becomes very close to the linear submodel with the chosen transformation of parameters.
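The transformation of parameters used for the second run can be verified directly: whenever t − U + Δ > 0, the positive part in model (1) is inactive and the fixed-effect part becomes linear in t and U.

```latex
a + b_1 t + b_2 U + c\,(t - U + \Delta)
  = (a + c\Delta) + (b_1 + c)\, t + (b_2 - c)\, U .
```

This is exactly the linear submodel (c = 0) with intercept a + cΔ and slopes b1 + c and b2 − c.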
A p-value for the likelihood ratio can be computed based on bootstrap estimates. We used for this purpose a semi-parametric approach in which each bootstrap sample was computed as follows, given the maximum likelihood parameters (a, b1, b2, c, Δ, ρ, σ).
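(i) Impute random values uk for the right-censored U’s, drawn, for each k, according to the conditional distribution of U given the observed variables.

(ii) Compute “model residuals”

$$r_{kj} = y_{kj} - a - b_1 t_{kj} - b_2 u_k - c\,(t_{kj} + \Delta - u_k)^+.$$

(iii) Whiten residuals according to the random effect model as follows. For each k, with pk observations, let $\bar r_k = \frac{1}{p_k}\sum_{j=1}^{p_k} r_{kj}$. Set

$$W_{kj} = r_{kj} - (1 - \lambda_k)\,\bar r_k,$$

where $\lambda_k = \sigma / \sqrt{\sigma^2 + p_k \rho^2}$.

(iv) Stack all Wkj in an N = p1 + ⋯ + pn vector, and sample with replacement new values W*kj from this vector.

(v) Reconstruct bootstrap residuals using

$$r^*_{kj} = W^*_{kj} + (\lambda_k^{-1} - 1)\,\bar W^*_k, \qquad \bar W^*_k = \frac{1}{p_k}\sum_{j=1}^{p_k} W^*_{kj}.$$

(vi) Define the complete bootstrap sample by

$$y^*_{kj} = a + b_1 t_{kj} + b_2 u_k + c\,(t_{kj} + \Delta - u_k)^+ + r^*_{kj}$$

and the null bootstrap sample by

$$y^{*,0}_{kj} = a + b_1 t_{kj} + b_2 u_k + r^*_{kj}.$$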
A p-value for the hypothesis c = 0 can then be obtained by computing the fraction of times the likelihood ratio obtained on the true sample is smaller than the ratios obtained on a large number of null bootstrap samples. When this p-value is small enough, standard deviations, or confidence intervals on the estimated parameters can be deduced from the distribution of the complete bootstrap samples.
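The bootstrap p-value described above is simply an exceedance fraction. A minimal Python sketch, with hypothetical argument names, is:

```python
def bootstrap_p_value(observed_ratio, null_ratios):
    """Fraction of null bootstrap likelihood ratios that exceed the observed one."""
    exceed = sum(1 for r in null_ratios if r > observed_ratio)
    return exceed / len(null_ratios)
```

Here `null_ratios` would hold the likelihood ratios recomputed on each null bootstrap sample, so that a small returned value indicates evidence against c = 0.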
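Steps (iii) and (v) above are justified as follows (fixing k and letting p = pk). The covariance matrix between the residuals is S = (sij, 1 ≤ i, j ≤ p) with sij = σ²δij + ρ². The vector 1p/√p (where 1p is p-dimensional and composed entirely of ones) is an eigenvector with eigenvalue σ² + pρ² and the other p − 1 eigenvectors span the space orthogonal to it, with eigenvalue σ². This implies that, for x ∈ ℝp, and letting x̄ = (x1 + ⋯ + xp)/p,

$$S^{-1/2} x = \frac{1}{\sigma}\,(x - \bar x\,1_p) + \frac{1}{\sqrt{\sigma^2 + p\rho^2}}\,\bar x\,1_p = \frac{1}{\sigma}\left(x - \Bigl(1 - \frac{\sigma}{\sqrt{\sigma^2 + p\rho^2}}\Bigr)\bar x\,1_p\right).$$

Similarly,

$$S^{1/2} x = \sigma\,(x - \bar x\,1_p) + \sqrt{\sigma^2 + p\rho^2}\;\bar x\,1_p = \sigma\left(x + \Bigl(\frac{\sqrt{\sigma^2 + p\rho^2}}{\sigma} - 1\Bigr)\bar x\,1_p\right),$$

which justifies steps (iii) and (v), in which we have used the fact that the factors σ and σ⁻¹ cancel in the overall computation.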
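The whitening and reconstruction maps implied by this eigen-decomposition can be sketched in Python as below. The functions apply σS^(−1/2) and σ^(−1)S^(1/2) to one subject's vector, under the compound-symmetry covariance σ²I + ρ²11ᵀ stated above; the function names are illustrative.

```python
import math

def whiten(residuals, rho, sigma):
    """Apply sigma * S^(-1/2) to one subject's residual vector."""
    p = len(residuals)
    mean = sum(residuals) / p
    lam = sigma / math.sqrt(sigma ** 2 + p * rho ** 2)
    # Shrink the component along the all-ones direction by the factor lam.
    return [r - (1.0 - lam) * mean for r in residuals]

def recolor(whitened, rho, sigma):
    """Apply S^(1/2) / sigma, the inverse of `whiten`, to a vector of the same length."""
    p = len(whitened)
    mean = sum(whitened) / p
    lam = sigma / math.sqrt(sigma ** 2 + p * rho ** 2)
    # Inflate the component along the all-ones direction by the factor 1 / lam.
    return [w + (1.0 / lam - 1.0) * mean for w in whitened]
```

Applying `recolor` to the output of `whiten` returns the original vector, while in the bootstrap the whitened values are first resampled across subjects before being recolored with the target subject's number of observations.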
3.4. Remarks
Our change-point onset model is similar to the one proposed in Younes et al. (2014), which was applied to anatomical changes of brain structures based on the BIOCARD dataset. The approach in Younes et al. (2014), however, made the simplifying assumption that whenever uk ≥ Tk (and, therefore, the manifest onset time is not observed), then uk − Δ was also larger than the last measurement time, tk,pk. This was justified by the rather large delay between this last measurement and the last cognitive assessment (about five years), and the belief that Δ would be less than or comparable to this delay. Some other biomarkers, however, which were not considered in Younes et al. (2014), appear to be associated with large values of Δ for which these assumptions would not be justified, motivating the more complex procedure proposed in this paper.
Even though we have limited our discussion to the model described in equation (1), it is easy to generalize it to more complex models, including for example additional variables (covariates) or higher-order dependency with respect to age.
4. Experiments
4.1. Simulations
In our first analysis, we conducted simulation experiments with synthetic data, which allowed us to evaluate the performance of our estimation algorithm with a known ground truth. We used two sets of parameter values for a, b1, b2, c, ρ, and σ, and for each of these sets, simulated data with Δ = 5, 10, 15, or 20 years (see Table 4). These values were based on two sets of observed values from real-data experiments. We also simulated samples with c = 0 to estimate a threshold for the likelihood ratio under the null hypothesis.
Table 4.
Simulation results. Each group of results provides the true values of the model parameters, followed by the 25th, 50th, and 75th percentiles of the estimated values based on 1000 independent simulations.
| | a | b1 | b2 | c | Δ | ρ | σ |
|---|---|---|---|---|---|---|---|
| True values | 20.00 | 1.70 | 0.01 | 1.40 | 5.00 | 10.00 | 10.00 |
| 25 percentile | 9.485 | 1.619 | −0.014 | 1.169 | 4.181 | 9.174 | 9.873 |
| Median | 18.15 | 1.668 | 0.047 | 1.489 | 5.438 | 9.539 | 10.125 |
| 75 percentile | 21.318 | 1.722 | 0.171 | 1.816 | 7.011 | 10.132 | 10.243 |
| True values | 20.00 | 1.70 | 0.01 | 1.40 | 10.00 | 10.00 | 10.00 |
| 25 percentile | 7.969 | 1.578 | −0.025 | 1.382 | 9.002 | 9.118 | 9.870 |
| Median | 14.59 | 1.671 | 0.113 | 1.524 | 10.160 | 9.540 | 10.100 |
| 75 percentile | 20.906 | 1.738 | 0.207 | 1.774 | 11.797 | 10.120 | 10.207 |
| True values | 20.00 | 1.70 | 0.01 | 1.40 | 15.00 | 10.00 | 10.00 |
| 25 percentile | 6.784 | 1.547 | −0.124 | 1.308 | 12.166 | 8.741 | 9.871 |
| Median | 16.68 | 1.693 | 0.051 | 1.556 | 13.717 | 9.533 | 10.103 |
| 75 percentile | 26.955 | 1.796 | 0.250 | 1.772 | 17.708 | 10.012 | 10.200 |
| True values | 20.00 | 1.70 | 0.01 | 1.40 | 20.00 | 10.00 | 10.00 |
| 25 percentile | 4.713 | 1.471 | −0.221 | 1.205 | 15.678 | 8.645 | 9.911 |
| Median | 17.96 | 1.746 | −0.039 | 1.486 | 18.070 | 9.370 | 10.087 |
| 75 percentile | 31.384 | 1.892 | 0.297 | 1.729 | 23.962 | 9.992 | 10.218 |
| True values | 1.40 | −0.10 | 0.00 | −0.02 | 5.00 | 0.16 | 0.12 |
| 25 percentile | 1.304 | −0.101 | −0.001 | −0.025 | 3.702 | 0.150 | 0.118 |
| Median | 1.40 | −0.100 | 0.000 | −0.021 | 4.770 | 0.155 | 0.121 |
| 75 percentile | 1.458 | −0.099 | 0.001 | −0.016 | 6.331 | 0.161 | 0.123 |
| True values | 1.40 | −0.10 | 0.00 | −0.02 | 10.00 | 0.16 | 0.12 |
| 25 percentile | 1.263 | −0.101 | −0.001 | −0.024 | 8.747 | 0.150 | 0.118 |
| Median | 1.39 | −0.100 | 0.000 | −0.020 | 9.847 | 0.156 | 0.121 |
| 75 percentile | 1.433 | −0.099 | 0.002 | −0.018 | 10.750 | 0.162 | 0.123 |
| True values | 1.40 | −0.10 | 0.00 | −0.02 | 15.00 | 0.16 | 0.12 |
| 25 percentile | 1.265 | −0.101 | −0.001 | −0.022 | 12.436 | 0.149 | 0.118 |
| Median | 1.32 | −0.100 | 0.001 | −0.020 | 14.273 | 0.153 | 0.121 |
| 75 percentile | 1.420 | −0.098 | 0.003 | −0.017 | 17.222 | 0.161 | 0.123 |
| True values | 1.40 | −0.10 | 0.00 | −0.02 | 20.00 | 0.16 | 0.12 |
| 25 percentile | 1.143 | −0.103 | −0.002 | −0.022 | 16.107 | 0.145 | 0.119 |
| Median | 1.33 | −0.101 | 0.002 | −0.020 | 20.438 | 0.150 | 0.122 |
| 75 percentile | 1.481 | −0.097 | 0.005 | −0.017 | 22.806 | 0.157 | 0.123 |
Our simulations try to follow a subject recruitment process. We started with 300 “subjects”, with age at the beginning of the study (tk1) simulated from a Gaussian distribution with mean 60 and standard deviation 10. The manifest onset time (uk) was sampled according to an exponentially modified Gaussian distribution with m1 = 73, σ1 = 17, and α = 2. Subjects that did not satisfy the left-censoring condition uk ≥ tk1 (i.e., whose manifest onset preceded enrollment) were automatically excluded from analysis (the average number of excluded subjects was 67.2). We also assumed that the number of biomarker observations for each subject followed a uniform distribution over {1, …, 5}. The interval between every two consecutive longitudinal measurements was fixed, equal to 2 years. The length of the entire study was assumed to be 15 years, and thus Tk = tk1 + 15 for every k. About 64.5% of the selected subjects were right censored (i.e., had a manifest onset age posterior to the end of the study). An example of a simulated dataset is illustrated in Figure 1.
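The recruitment process described above can be mimicked with a short Python sketch. It follows the stated settings (enrollment ages N(60, 10²), exponentially modified Gaussian onset with m1 = 73, σ1 = 17, α = 2, a 15-year study, 1 to 5 visits spaced 2 years apart); the function name and the dictionary layout are illustrative, and the biomarker values themselves are not generated here.

```python
import random

def simulate_cohort(n, rng, m1=73.0, s1=17.0, alpha=2.0, study_length=15.0):
    """Simulate enrollment, onset times, and censoring for a disease-free cohort."""
    cohort = []
    for _ in range(n):
        entry_age = rng.gauss(60.0, 10.0)
        # Exponentially modified Gaussian: Gaussian plus exponential with mean alpha.
        onset = rng.gauss(m1, s1) + rng.expovariate(1.0 / alpha)
        if onset < entry_age:
            continue  # onset before enrollment: subject excluded from the study
        end_age = entry_age + study_length
        n_visits = rng.randint(1, 5)  # uniform over {1, ..., 5}
        cohort.append({
            "entry": entry_age,
            "end": end_age,
            "onset_observed": min(onset, end_age),  # Z_k = min(U_k, T_k)
            "censored": onset >= end_age,
            "visits": [entry_age + 2.0 * j for j in range(n_visits)],
        })
    return cohort

cohort = simulate_cohort(300, random.Random(1))
censored_fraction = sum(s["censored"] for s in cohort) / len(cohort)
```

The size of `cohort` and `censored_fraction` can then be compared with the averages reported in the text; exact values depend on the random seed.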
FIG. 1.

Simulated dataset with 206 subjects and 599 total observations represented without right censoring. Lines represent observed data (one line per subject) while gray triangles are the model predictions. The true change point is 10 years before onset (represented by vertical lines in the second chart). The other parameters are those used in the first four simulations in Table 4.
For each of these 8 estimation experiments (and for the two simulations made under the null hypothesis), a total of 1000 independent simulations were performed. Table 4 provides 25, 50, and 75 percentiles of the observed distribution of the estimated coefficients. The estimated 95 percentile of the log-likelihood difference under the null hypothesis (c = 0) was found to equal 3.6 in both models. This allows us to estimate the power of the rejection test in the other cases. The fraction of simulations for which the log-likelihood difference was larger than this threshold was 87%, 99.2%, 99.5%, and 99.5% for Δ = 5, 10, 15, and 20, respectively, under the first parameter set (for which a = 20) and 94%, 99.6%, 99.5%, and 99.3% for the second parameter set (a = 1.4).
These results indicate a reasonable accuracy in the estimation of the change point. Small change point values are more accurately estimated. The estimation of a also seems to degrade slightly with large change points. The estimation of the change point’s slope is also quite accurate and the fixed and random effect standard deviations (ρ and σ) are consistently well estimated.
4.2. Onset of Alzheimer’s disease
We now provide a few results based on real data, focusing on biomarkers related to AD. Magnetic resonance imaging (MRI) measures are an indirect reflection of the neuronal injury that occurs in the brain as the AD pathophysiological process evolves. The volumetric measurements of medial temporal lobe structures, such as the hippocampus and the amygdala, have been shown to be important anatomical hallmarks for AD, exhibiting significant atrophy in patients with both AD and MCI as compared to their healthy counterparts [Jack et al. (1997, 1992)]. Those volumetric measurements have also been shown to be associated with time to progress from MCI to AD dementia [Atiya et al. (2003), Kantarci and Jack (2003)]. In addition to volumetric measurements, shape-based biomarkers have also been found to be sensitive to the pathology of AD, revealing region-specific heterogeneous atrophy patterns in the hippocampus and the amygdala [Tang et al. (2014), Miller et al. (2015)]. It has also been demonstrated that baseline morphometric measures, in terms of both volume and shape, of the hippocampus and the amygdala in healthy controls were capable of predicting subsequent development of MCI [den Heijer et al. (2006)], with hippocampal shape differences detected among healthy controls who subsequently developed cognitive impairment [Csernansky et al. (2005), den Heijer et al. (2006), Kantarci and Jack (2003), Rusinek et al. (2003), Thambisetty et al. (2010)].
Extracting shape measurements of the structural biomarkers of AD, such as the hippocampus and the amygdala, from MRI datasets usually requires a complex processing pipeline before the statistical analysis described in this paper can be performed. This starts with the extraction of “regions of interest” (ROI) which are 3D volumes or 2D surfaces of specific anatomical structures of the human brain that are affected by the disease, such as the entorhinal cortex, the hippocampus or the amygdala in AD [Fischl (2012), Pierson et al. (2011), Tang et al. (2013)]. This segmentation step, even if mostly automated, still requires significant human intervention for quality control and small manual corrections. The 2D surfaces contouring the boundary of the segmentations provide the collection of “shapes” on which the second step, non-rigid alignment, will be performed.
Non-rigid alignment can be interpreted as an operation that places all shapes in a common “coordinate system” in an infinite-dimensional “shape space”. While all of this can be mathematically formalized [Miller, Younes and Trouvé (2014), Miller, Trouvé and Younes (2015), Younes (2010), Bauer, Bruveris and Michor (2014)], from a computational point of view, the operation boils down to solving a collection of optimal control problems, each of which optimizes a deformation process in which an initial shape (called template) is mapped onto a subject shape, with the template being optimized at the same time [Ma, Miller and Younes (2010)]. At the end of the process, each subject shape is represented as a diffeomorphic transformation of a common shape (the template), and the problem is reduced to the study of the resulting collection of diffeomorphisms (yielding the term “diffeomorphometry” [Miller, Younes and Trouvé (2014)]). A variety of mathematically refined descriptors of the diffeomorphisms can then be analyzed, with the simplest case usually being their Jacobian determinant indexed at each vertex of the template surface, where the diffeomorphism is either considered as a 3D transformation (interpreting the Jacobian as a volume ratio), or a 2D transformation between surfaces (interpreting the Jacobian as a surface area ratio). As a result, shape data are transformed so that each individual surface is represented as a large collection of variables, a random field supported by the template surface. In this paper, we take an additional step to reduce the dimension of the shape variables by averaging these variables over sub-regions of the template obtained via spectral segmentation [Reuter (2010)]. The template surface for the amygdala, on which we will focus here, and the computed subregions are illustrated in Figure 2.
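The dimension-reduction step can be sketched as follows in Python, assuming that each template vertex carries one scalar (for example a Jacobian determinant) and one integer sub-region label. The function name, the unweighted average and the toy values are assumptions made for illustration only; an area-weighted average would be an equally plausible choice.

```python
from collections import defaultdict


def region_means(vertex_values, vertex_labels):
    """Average a per-vertex scalar within each labelled sub-region of the template."""
    if len(vertex_values) != len(vertex_labels):
        raise ValueError("one label is required per vertex")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(vertex_values, vertex_labels):
        sums[label] += value
        counts[label] += 1
    # One summary variable per sub-region (unweighted mean over its vertices).
    return {label: sums[label] / counts[label] for label in sums}


# Toy example: five vertices split into two hypothetical sub-regions.
jacobians = [1.10, 0.95, 1.02, 0.80, 0.90]
labels = [1, 1, 2, 2, 2]
print(region_means(jacobians, labels))
```

Applied to one subject's surface, this returns one variable per sub-region, which is the kind of low-dimensional summary on which the change-point model is then fitted.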
FIG. 2.

Two views of the template surface for the amygdala. Gray-levels are associated with spectral segmentation labels.
With seven segmented sub-regions, and a separate analysis of the amygdalar surfaces in both hemispheres, our final real data consist of a collection of fourteen variables that provide an average amount of atrophy or expansion measured for each subject relative to the template. The dataset we used included 292 subjects among which 70 were diagnosed as MCI before the end of the study (right-censoring therefore applying to 222 subjects).
Based on the discussion made at the end of Section 3.1, we used a Gaussian prior with m1 = 93 and σ1 = 14.5. We applied the change-point model separately to each surface, obtaining in this way fourteen regional estimates. Figure 3 provides likelihood plots (maximum log-likelihood as a function of Δ for the first four regions on the left hemisphere). We used 1000 bootstrap replicates to estimate the 25th and 75th percentiles of the estimators’ distribution, and 1000 additional replicates to estimate p-values for testing the null hypothesis of no change point. These results are summarized in Table 5. Note that the p-values we tabulate in this table were computed separately for each variable, and were therefore not corrected for multiple comparisons. Subregions 4 and 5 (p-values: 0.088, 0.130) on the left amygdala and subregions 1, 2, 4, 5, and 6 (p-values: 0.231, 0.078, 0.240, 0.127, 0.182) on the right amygdala were not significant and are not reported in this table. Figures 4, 5, and 6 illustrate the results on regions 1, 2, and 3 of the left amygdala by plotting the biomarker values and the model predictions as functions of age and of years before onset. Figure 7 provides a visual representation of the estimated change point results for each subregion of the bilateral amygdalas.
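The bootstrap summaries mentioned above can be illustrated by the following Python sketch. It assumes a nonparametric bootstrap that resamples subjects with replacement and a p-value computed as the proportion of null replicates whose statistic is at least as large as the observed one, with the common "plus one" correction. Both choices, as well as all function names and the toy data, are assumptions made for illustration and may differ from the procedure described earlier in the paper.

```python
import random
import statistics


def bootstrap_estimates(subjects, estimator, n_replicates=1000, seed=0):
    """Re-estimate on datasets obtained by resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(subjects)
    return [estimator([subjects[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_replicates)]


def quartiles(estimates):
    """Return the 25th and 75th percentiles of the bootstrap estimates."""
    q1, _, q3 = statistics.quantiles(estimates, n=4)
    return q1, q3


def median_absolute_deviation(estimates):
    """Median of the absolute deviations from the median."""
    center = statistics.median(estimates)
    return statistics.median(abs(e - center) for e in estimates)


def bootstrap_p_value(observed_stat, null_stats):
    """Proportion of null replicates at least as extreme as the observed statistic."""
    exceed = sum(1 for s in null_stats if s >= observed_stat)
    return (exceed + 1) / (len(null_stats) + 1)


# Toy data: one number per subject, with the median standing in for the estimator.
toy_rng = random.Random(1)
toy_subjects = [toy_rng.gauss(10.0, 2.0) for _ in range(100)]
replicates = bootstrap_estimates(toy_subjects, statistics.median, n_replicates=200)
print("quartiles:", quartiles(replicates))
print("MAD:", median_absolute_deviation(replicates))
```

In the setting of this paper, the `estimator` argument would be the full maximum-likelihood fit returning, for instance, the estimated Δ, and the null replicates would be generated under the submodel without change point.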
FIG. 3.

Likelihood profiles (maximum log-likelihood with fixed Δ as a function of Δ) for the first four regions on the left amygdala (from left to right and top to bottom). Note that the change point model for the fourth region is not significant, with a likelihood exhibiting a smaller spread in value. These plots also illustrate the non-concavity of the likelihood function.
Table 5.
Results of change point estimation for the amygdala, based on the BIOCARD dataset. Each group of results provides the estimated values of the model parameters, followed by the median absolute deviation (MAD). Only results with uncorrected p-value p < 0.05 are provided.
| | a | b1 | b2 | c | Δ | ρ | σ |
|---|---|---|---|---|---|---|---|
| Region 1 (left), p = 0.002 | |||||||
| Estimated value | 0.004 | 0.000 | −0.001 | −0.007 | 11.052 | 0.067 | 0.064 |
| MAD | 0.054 | 0.000 | 0.001 | 0.002 | 2.109 | 0.003 | 0.002 |
| Region 2 (left), p = 0.002 | |||||||
| Estimated value | 0.086 | 0.000 | −0.002 | −0.010 | 8.394 | 0.079 | 0.081 |
| MAD | 0.057 | 0.000 | 0.000 | 0.002 | 1.670 | 0.004 | 0.002 |
| Region 3 (left), p = 0.001 | |||||||
| Estimated value | 0.051 | 0.000 | −0.001 | −0.011 | 9.723 | 0.071 | 0.070 |
| MAD | 0.056 | 0.000 | 0.000 | 0.002 | 1.408 | 0.004 | 0.002 |
| Region 7 (left), p = 0.006 | |||||||
| Estimated value | 0.044 | 0.000 | −0.001 | −0.009 | 9.225 | 0.084 | 0.085 |
| MAD | 0.063 | 0.000 | 0.001 | 0.002 | 2.120 | 0.004 | 0.002 |
| Region 2 (right), p = 0.021 | |||||||
| Estimated value | 0.079 | −0.000 | −0.001 | −0.007 | 11.199 | 0.079 | 0.084 |
| MAD | 0.071 | 0.000 | 0.001 | 0.002 | 3.032 | 0.004 | 0.003 |
| Region 3 (right), p = 0.012 | |||||||
| Estimated value | −0.019 | 0.000 | −0.001 | −0.011 | 5.469 | 0.066 | 0.077 |
| MAD | 0.047 | 0.000 | 0.000 | 0.003 | 1.234 | 0.004 | 0.003 |
| Region 5 (right), p = 0.036 | |||||||
| Estimated value | 0.176 | −0.001 | −0.001 | −0.007 | 11.348 | 0.093 | 0.091 |
| MAD | 0.088 | 0.000 | 0.001 | 0.002 | 4.027 | 0.005 | 0.003 |
| Region 7 (right), p = 0.044 | |||||||
| Estimated value | 0.103 | 0.000 | −0.001 | −0.006 | 10.834 | 0.078 | 0.083 |
| MAD | 0.067 | 0.000 | 0.001 | 0.002 | 3.260 | 0.004 | 0.002 |
FIG. 4.

Shape biomarker for the left amygdala (region 1) plotted as a function of age (top) and as a function of years before onset (bottom). Lines represent observed data (one line per subject) while gray circles and triangles are the model predictions, respectively, for right-censored (controls) and diseased subjects. On the second chart, the time before onset for right-censored subjects (circles) is replaced by its posterior mean. The distance between the two vertical lines in this chart is the estimated change-point time before onset.
FIG. 5.

Shape biomarker for the left amygdala (region 2) plotted as a function of age (top) and as a function of years before onset (bottom). See Figure 4 for more details.
FIG. 6.

Shape biomarker for the left amygdala (region 3) plotted as a function of age (top) and as a function of years before onset (bottom). See Figure 4 for more details.
FIG. 7.

Change point estimates mapped on segmented regions in the amygdala (two views of left-side results, followed by two views of the right side). Black areas are not significant.
For regions on which p-values were not significant, we also tested the hypothesis b2 ≠ 0 within the submodel c = Δ = 0, in order to evaluate the significance of the onset time U on a linear model without change point. None of these tests showed significance.
The amygdala is suggested to play a major role in enhancing the explicit memory related to emotional stimuli, by modulating the consolidation of memory [Hamann (2001)], which is greatly affected by the pathology of AD. Studies have reported amygdalar abnormalities induced by AD, such as loss of neurons as measured in neuropathological analyses [Tsuchiya and Kosaka (1990), Scott, DeKosky and Scheff (1991), Scott et al. (1992)]; loss of volume measured from MRI [den Heijer et al. (2006), Poulin et al. (2011)]; and local atrophy based on shape analysis on MRI [Cavedo et al. (2011), Miller et al. (2015)]. Our observations of amygdalar shape atrophy (see Figure 7 and Table 5) are in keeping with previous findings. In addition, results from our method revealed an accelerated amygdalar atrophy rate induced by AD relative to normal aging. This finding is also consistent with those reported in longitudinal studies of AD focusing on amygdalar shape [Tang et al. (2015)]. The localization of subregions found to be significantly affected by AD in our experiment is also roughly consistent with previous longitudinal results obtained from high-field subregion segmentations [Miller et al. (2015), Tang et al. (2015)], even though variation may occur due to the limited resolution of our MRI data. As shown in Table 5, mainly the amygdalar subregions 2, 3, and 7 were identified, which roughly correspond to the basolateral and basomedial subregions of the amygdala, the core subregions of the amygdala as defined according to functional characteristics [Price (2003), Sheline, Gado and Price (1998)]. That being said, none of this previous work has provided the information that seems to be emerging from our analyses, namely an acceleration or start of amygdalar atrophy about 10 years before the onset of AD.
Our analysis is, to the best of our knowledge, the first to demonstrate that there are subregion-dependent amygdalar atrophy onsets ranging from 8 to 11 years before the clinical onset of AD. This provides unique and important information that furthers our understanding of the pathology of AD, especially its influence on the amygdalar morphometry.
5. Discussion
We have described in this paper an approach to estimate a change point for a biomarker relative to the occurrence of an event (manifest onset), which may be only partially observed. We have described parameter estimation procedures for the prior model on the manifest onset time, and for the two-phase regression model on the biomarkers, with a bootstrap-based model validation scheme.
Our simulation study shows that the learning procedures perform satisfactorily in the ideal case (correct model class), with parameters akin to those estimated from some of the real world data we considered later. With roughly 230 observations, among which about 2/3 were right-censored, change point estimates for true values of 5, 10, 15, and 20 years showed little or even no bias, with the gap between the first and third quartiles being between about 1 and 2 years. Most of the other coefficients were estimated with very good accuracy, except for the intercept (the variations of which are exacerbated by the fact that the sample ages were far away from 0). It is also important for the validation of the real-data study that we found a rather large power (close to or larger than 90%) for the likelihood-ratio test of the change point detection.
The bootstrap-estimated variations of the change point estimated from real shape data were consistent with those observed in the simulations. They indicated a disease effect around 10 years before the manifest (cognitive) onset. The likelihood profiles computed in Figure 3 illustrate the difficulty of the estimation of the change point, with a non-concave likelihood exhibiting several local maxima. This observation, which is typical of change-point estimation problems, is reinforced in our case by the fact that we are working with a large number of right-censored subjects.
On the theoretical side, important problems are raised by the presented approach, this paper being limited to experimental validations. The consistency of the maximum-likelihood estimate and its asymptotic accuracy need to be studied. A rigorous justification of the bootstrap procedure also needs to be developed. These issues, which are left open in the present paper, will be the subject of future work in our group. Future work will also focus on extensions of the model, allowing for evolutions that are more complex than a two-phase linear regression: estimating more than one change point or allowing for non-linear changes in each of the phases.
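Because the likelihood is not concave in Δ, a profile evaluated on a grid of Δ values is a natural safeguard against local maxima. The Python sketch below illustrates this on a deliberately simplified version of the problem: all onset times are assumed to be observed (no right-censoring), there is no subject-level random effect, and the mean is assumed to take the form a + b1·t + b2·U + c·max(t − U + Δ, 0), where t is the age at measurement and U the onset age. This functional form, the function names and the synthetic data are assumptions made for illustration; the sketch is not the estimation procedure used in this paper, whose likelihood also integrates over unobserved onset times.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def fit_fixed_delta(ages, onsets, values, delta):
    """Least-squares fit of (a, b1, b2, c) for a fixed delta; returns (coeffs, rss).

    Ages and onsets are centred for numerical stability, so the intercept is
    expressed relative to the sample means of both covariates.
    """
    mean_t = sum(ages) / len(ages)
    mean_u = sum(onsets) / len(onsets)
    rows = [[1.0, t - mean_t, u - mean_u, max(t - u + delta, 0.0)]
            for t, u in zip(ages, onsets)]
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    moment = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(4)]
    coeffs = solve_linear_system(gram, moment)
    rss = sum((y - sum(c * x for c, x in zip(coeffs, r))) ** 2
              for r, y in zip(rows, values))
    return coeffs, rss


def profile_log_likelihood(ages, onsets, values, delta_grid):
    """Gaussian log-likelihood maximized over all parameters except delta."""
    n = len(values)
    profile = []
    for delta in delta_grid:
        _, rss = fit_fixed_delta(ages, onsets, values, delta)
        profile.append(-0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0))
    return profile


# Synthetic placeholder data (NOT the BIOCARD data), with a known change point.
rng = random.Random(0)
true_delta = 10.0
ages = [rng.uniform(55.0, 90.0) for _ in range(300)]
onsets = [rng.uniform(70.0, 95.0) for _ in range(300)]
values = [0.5 + 0.01 * t - 0.02 * u - 1.0 * max(t - u + true_delta, 0.0)
          + rng.gauss(0.0, 0.05) for t, u in zip(ages, onsets)]

delta_grid = [float(d) for d in range(2, 21)]
profile = profile_log_likelihood(ages, onsets, values, delta_grid)
best_delta = delta_grid[max(range(len(profile)), key=profile.__getitem__)]
print("grid maximizer of the profile log-likelihood:", best_delta)
```

For a fixed Δ the simplified model is linear in the remaining coefficients, which is why each grid point only requires a least-squares fit. The grid maximizer can then serve as a starting point for a local refinement.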
The results presented here are valid at the population level. Even though, using our model, we were able to compute an individual estimator of the time to onset (as used in Figures 4, 5, and 6), this estimator is very crude and does not provide a reliable individual prediction. Research in this direction is likely to intensify in the near future, however, and we can expect that several weak predictors, such as those derived here for the amygdala, will need to be combined for early diagnosis of AD.
Table 2.
Same as Table 1, with m1 = 65 years
| True m1 | True σ1 | True α | n | \|J1\| | Bias m1 | Bias σ1 | Bias α | Standard dev. m1 | Standard dev. σ1 | Standard dev. α | Power (in %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 65 (67) | 1 (2.2) | 2 (0) | 752 | 217 | 0.007 (−0.034) | −0.001 (−0.016) | −0.007 | 0.138 (0.093) | 0.100 (0.105) | 0.157 | 99.8 |
| 65 (67) | 2 (2.8) | 2 (0) | 749 | 220 | 0.018 (−0.034) | −0.000 (−0.008) | −0.017 | 0.218 (0.122) | 0.142 (0.109) | 0.221 | 99.6 |
| 65 (67) | 3 (3.6) | 2 (0) | 745 | 225 | 0.071 (−0.033) | 0.009 (−0.003) | −0.069 | 0.421 (0.160) | 0.211 (0.127) | 0.416 | 72.0 |
| 65 (67) | 5 (5.4) | 2 (0) | 731 | 240 | 0.359 (−0.032) | 0.004 (0.004) | −0.353 | 1.070 (0.259) | 0.362 (0.198) | 1.073 | 11.5 |
| 65 (67) | 10 (10.2) | 2 (0) | 688 | 287 | −0.181 (−0.052) | −0.371 (0.017) | 0.279 | 2.289 (0.646) | 0.884 (0.527) | 2.370 | 3.0 |
| 65 (67) | 15 (15.1) | 2 (0) | 650 | 329 | −1.569 (−0.129) | −1.107 (0.057) | 1.980 | 3.888 (1.392) | 1.929 (1.122) | 4.310 | 2.8 |
Footnotes
Supported in part by the National Natural Science Foundation of China (81501546), the SYSU-CMU Shunde International Joint Research Institute Start-up Grant (20150306), the National Institutes of Health (U01-AG03365, P50-AG005146, R01-EB017638, R01-NS084957, and P41-EB015909), the Office of Naval Research (ONR-90048203), and the National Science Foundation (ACI-1053575) for the Computational Anatomy Gateway via Extreme Science and Engineering Discovery Environment (XSEDE) and the Kavli foundation via the Kavli Neuroscience Discovery Institute.
References
- Alzheimer’s Association. 2015 Alzheimer’s disease facts and figures. Alzheimer’s Dement. 2015;11:332–384. doi: 10.1016/j.jalz.2015.02.003.
- Atiya M, Hyman BT, Albert M, Killiany R. Structural magnetic resonance imaging in established and prodromal Alzheimer disease: A review. Alzheimer Dis Assoc Disord. 2003;17:177–195. doi: 10.1097/00002093-200307000-00010.
- Bauer M, Bruveris M, Michor PW. Overview of the geometries of shape spaces and diffeomorphism groups. J Math Imaging Vision. 2014;50:60–97. MR3233135.
- Cavedo E, Boccardi M, Ganzola R, Canu E, Beltramello A, Caltagirone C, Thompson PM, Frisoni GB. Local amygdala structural differences with 3T MRI in patients with Alzheimer disease. Neurology. 2011;76:727–733. doi: 10.1212/WNL.0b013e31820d62d9.
- Chen J, Gupta AK. Parametric Statistical Change Point Analysis. Birkhäuser; Boston, MA: 2000. MR1761850.
- Csernansky JG, Wang L, Swank J, Miller JP, Gado M, Mckeel D, Miller MI, Morris JC. Preclinical detection of Alzheimer’s disease: Hippocampal shape and volume predict dementia onset in the elderly. NeuroImage. 2005;25:783–792. doi: 10.1016/j.neuroimage.2004.12.036.
- Den Heijer T, Geerlings MI, Hoebeek FE, Hofman A, Koudstaal PJ, Breteler MMB. Use of hippocampal and amygdalar volumes on magnetic resonance imaging to predict dementia in cognitively intact elderly people. Arch Gen Psychiatry. 2006;63:57–62. doi: 10.1001/archpsyc.63.1.57.
- Dupuy JF. Estimation in a change-point hazard regression model. Statist Probab Lett. 2006;76:182–190. MR2233390.
- Farley JU, Hinich MJ. A test for a shifting slope coefficient in a linear model. J Amer Statist Assoc. 1970;65:1320–1329.
- Feder PI. On asymptotic distribution theory in segmented regression problems--Identified case. Ann Statist. 1975;3:49–83. MR0378267.
- Fischl B. FreeSurfer. NeuroImage. 2012;62:774–781. doi: 10.1016/j.neuroimage.2012.01.021.
- Gombay E, Horváth L. Limit theorems for change in linear regression. J Multivariate Anal. 1994a;48:43–69. MR1256834.
- Gombay E, Horváth L. An application of the maximum likelihood test to the change-point problem. Stochastic Process Appl. 1994b;50:161–171. MR1262337.
- Hamann S. Cognitive and neural mechanisms of emotional memory. Trends Cogn Sci. 2001;5:394–400. doi: 10.1016/s1364-6613(00)01707-1.
- Hebert LE, Weuve J, Scherr PA, Evans DA. Alzheimer disease in the United States (2010-2050) estimated using the 2010 census. Neurology. 2013;80:1778–1783. doi: 10.1212/WNL.0b013e31828726f5.
- Hinkley DV. Inference about the intersection in two-phase regression. Biometrika. 1969;56:495–504.
- Hinkley DV. Inference in two-phase regression. J Amer Statist Assoc. 1971;66:736–743.
- Hudson DJ. Fitting segmented curves whose join points have to be estimated. J Amer Statist Assoc. 1966;61:1097–1129. MR0210243.
- Jack CR, Petersen RC, O’Brien PC, Tangalos EG. MR-based hippocampal volumetry in the diagnosis of Alzheimer’s disease. Neurology. 1992;42:183–188. doi: 10.1212/wnl.42.1.183.
- Jack CR, Petersen RC, Xu YC, Waring SC, O’Brien PC, Tangalos EG, Smith GE, Ivnik RJ, Kokmen E. Medial temporal atrophy on MRI in normal aging and very mild Alzheimer’s disease. Neurology. 1997;49:786–794. doi: 10.1212/wnl.49.3.786.
- Kantarci KK, Jack CR. Neuroimaging in Alzheimer disease: An evidence-based review. Neuroimaging Clin N Am. 2003;13:197–209. doi: 10.1016/s1052-5149(03)00025-x.
- Larrieu S, Letenneur L, Orgogozo JM, Fabrigoule C, Amieva H, Le Carret N, Barberger-Gateau P, Dartigues JF. Incidence and outcome of mild cognitive impairment in a population-based prospective cohort. Neurology. 2002;59:1594–1599. doi: 10.1212/01.wnl.0000034176.07159.f8.
- Li Y, Qian L, Zhang W. Estimation in a change-point hazard regression model with long-term survivors. Statist Probab Lett. 2013;83:1683–1691. MR3062282.
- Ma J, Miller MI, Younes L. A Bayesian generative model for surface template estimation. Int J Biomed Imaging. 2010;2010:974957. doi: 10.1155/2010/974957.
- Miller MI, Trouvé A, Younes L. Hamiltonian systems in computational anatomy: 100 years since D’Arcy Thompson. Annu Rev Biomed Eng. 2015;17. doi: 10.1146/annurev-bioeng-071114-040601.
- Miller MI, Younes L, Trouvé A. Diffeomorphometry and geodesic positioning systems for human anatomy. Technology. 2014;2:36–43. doi: 10.1142/S2339547814500010.
- Miller MI, Younes L, Ratnanather JT, Brown T, Trinh H, Lee DS, Tward D, Mahon PB, Mori S, Albert M. Amygdalar atrophy in symptomatic Alzheimer’s disease based on diffeomorphometry: The BIOCARD cohort. Neurobiol Aging. 2015;36:S3–S10. doi: 10.1016/j.neurobiolaging.2014.06.032. Supplement 1: Novel Imaging Biomarkers for Alzheimer’s Disease and Related Disorders (NIBAD).
- Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack CR, Jagust W, Trojanowski JQ, Toga AW, Beckett L. Ways toward an early diagnosis in Alzheimer’s disease: The Alzheimer’s Disease Neuroimaging Initiative (ADNI). Alzheimer’s Dement. 2005;1:55–66. doi: 10.1016/j.jalz.2005.06.003.
- Nguyen HT, Rogers GS, Walker EA. Estimation in change-point hazard rate models. Biometrika. 1984;71:299–304. MR767158.
- Pierson R, Johnson H, Harris G, Keefe H, Paulsen JS, Andreasen NC, Magnotta VA. Fully automated analysis using BRAINS: AutoWorkup. NeuroImage. 2011;54:328–336. doi: 10.1016/j.neuroimage.2010.06.047.
- Pons O. Estimation in a Cox regression model with a change-point according to a threshold in a covariate. Ann Statist. 2003;31:442–463. Dedicated to the memory of Herbert E. Robbins. MR1983537.
- Poulin SP, Dautoff R, Morris JC, Barrett LF, Dickerson BC, Alzheimer’s Disease Neuroimaging Initiative et al. Amygdala atrophy is prominent in early Alzheimer’s disease and relates to symptom severity. Psychiatry Research Neuroimaging. 2011;194:7–13. doi: 10.1016/j.pscychresns.2011.06.014.
- Price JL. Comparative aspects of amygdala connectivity. Ann NY Acad Sci. 2003;985:50–58. doi: 10.1111/j.1749-6632.2003.tb07070.x.
- Quandt RE. The estimation of the parameters of a linear regression system obeying two separate regimes. J Amer Statist Assoc. 1958;53:873–880. MR0100314.
- Reuter M. Hierarchical shape segmentation and registration via topological features of Laplace–Beltrami eigenfunctions. Int J Comput Vis. 2010;89:287–308.
- Rusinek H, de Santi S, Frid D, Tsui WH, Tarshish CY, Convit A, de Leon MJ. Regional brain atrophy rate predicts future cognitive decline: 6-year longitudinal MR imaging study of normal aging. Radiology. 2003;229:691–696. doi: 10.1148/radiol.2293021299.
- Scott SA, Dekosky ST, Scheff SW. Volumetric atrophy of the amygdala in Alzheimer’s disease: Quantitative serial reconstruction. Neurology. 1991;41:351–356. doi: 10.1212/wnl.41.3.351.
- Scott SA, Sparks DL, Scheff SW, Dekosky ST, Knox CA. Amygdala cell loss and atrophy in Alzheimer’s disease. Ann Neurol. 1992;32:555–563. doi: 10.1002/ana.410320412.
- Sheline YI, Gado MH, Price JL. Amygdala core nuclei volumes are decreased in recurrent major depression. NeuroReport. 1998;9:2023–2028. doi: 10.1097/00001756-199806220-00021.
- Sprent P. Some hypotheses concerning two phase regression lines. Biometrics. 1961;17:634–645.
- Tang X, Oishi K, Faria AV, Hillis AE, Albert MS, Mori S, Miller MI. Bayesian parameter estimation and segmentation in the multi-atlas random orbit model. PLoS ONE. 2013;8:e65591. doi: 10.1371/journal.pone.0065591.
- Tang X, Holland D, Dale AM, Younes L, Miller MI, Alzheimer’s Disease Neuroimaging Initiative. Shape abnormalities of subcortical and ventricular structures in mild cognitive impairment and Alzheimer’s disease: Detecting, quantifying, and predicting. Hum Brain Mapp. 2014;35:3701–3725. doi: 10.1002/hbm.22431.
- Tang X, Holland D, Dale AM, Younes L, Miller MI, Alzheimer’s Disease Neuroimaging Initiative. The diffeomorphometry of regional shape change rates and its relevance to cognitive deterioration in mild cognitive impairment and Alzheimer’s disease. Hum Brain Mapp. 2015;36:2093–2117. doi: 10.1002/hbm.22758.
- Thambisetty M, Simmons A, Velayudhan L, Hye A, Campbell J, Zhang Y, Wahlund LO, Westman E, Kinsey A, Güntert A, et al. Association of plasma clusterin concentration with severity, pathology, and progression in Alzheimer disease. Arch Gen Psychiatry. 2010;67:739–748. doi: 10.1001/archgenpsychiatry.2010.78.
- Tsuchiya K, Kosaka K. Neuropathological study of the amygdala in presenile Alzheimer’s disease. J Neurol Sci. 1990;100:165–173. doi: 10.1016/0022-510x(90)90029-m.
- Wu CQ, Zhao LC, Wu YH. Estimation in change-point hazard function models. Statist Probab Lett. 2003;63:41–48. MR1973402.
- Younes L. Shapes and Diffeomorphisms. Applied Mathematical Sciences. Vol. 171. Springer; Berlin: 2010. MR2656312.
- Younes L, Albert M, Miller MI, Biocard Research Team et al. Inferring changepoint times of medial temporal lobe morphometric change in preclinical Alzheimer’s disease. NeuroImage Clin. 2014;5:178–187. doi: 10.1016/j.nicl.2014.04.009.
