Summary
Predicting binary events such as newborns with large birthweight is important for obstetricians in their attempt to reduce both maternal and fetal morbidity and mortality. Such predictions have been a challenge in obstetric practice, where longitudinal ultrasound measurements taken at multiple gestational times during pregnancy may be useful for predicting various poor pregnancy outcomes. The focus of this paper is on developing a flexible class of joint models for the multivariate longitudinal ultrasound measurements that can be used for predicting a binary event at birth. A skewed multivariate random effects model is proposed for the ultrasound measurements, and the skewed generalized t-link is assumed for the link function relating the binary event and the underlying longitudinal processes. We consider a shared random effect to link the two processes together. Markov chain Monte Carlo sampling is used to carry out Bayesian posterior computation. Several variations of the proposed model are considered and compared via the deviance information criterion, the logarithm of pseudomarginal likelihood, and with a training-test set prediction paradigm. The proposed methodology is illustrated with data from the NICHD Successive Small-for-Gestational-Age Births study, a large prospective fetal growth cohort conducted in Norway and Sweden.
Keywords: Asymmetric link, Generalized t-distribution, Macrosomia, Skewed multivariate random effects model, Prediction, Ultrasound measurement
1. Introduction
Predicting binary events such as the birth of newborns with large birthweight is important for obstetricians in their attempt to reduce both maternal and fetal morbidity and mortality. Such predictions have been a challenge in obstetric practice, where longitudinal ultrasound measurements taken at multiple irregularly spaced times in gestation may be useful for predicting various poor pregnancy outcomes. In the NICHD Successive Small-for-Gestational-Age Births (SGA) study, a large prospective study conducted in Norway and Sweden from 1986 to 1989, each pregnant woman was targeted to have four ultrasound examinations at 17, 25, 33 and 37 weeks of gestation. At each visit, sonography was conducted to measure the following fetal growth anthropometry variables: biparietal diameter (BPD), middle abdominal diameter (MAD) and femur length (FL). Figure 1 shows the longitudinal trajectories for BPD, MAD, and FL on the original scale. The figure shows that the trajectories are rather smooth and that, although observations were targeted at particular times in gestation, there was sizable variation in the observation times across subjects.
Figure 1.
Plots of longitudinal trajectory for BPD, MAD, and FL by gestational age in the study population: the solid lines represent a lowess smooth curve.
The development of accurate methods for predicting a newborn larger than 4,000 grams, defined as macrosomia, using longitudinal ultrasound measurements is important for the clinical management of both mother and baby. Recently, Albert (2012) proposed a shared random effects model to evaluate the predictive accuracy of longitudinal ultrasound measurements using a two-stage procedure. In the first stage, a linear mixed model is fitted to the longitudinal measurements, and in the second stage, the binary outcome is modeled with a probit link function where the predicted random effects are treated as covariates. The probit link was chosen since measurement error of the predicted random effects can be easily incorporated in closed form. This two-stage approach can easily be extended to the multivariate setting, as illustrated by Zhang et al. (2012).
The Gaussian assumption for the residual error in Albert (2012) may not hold. Figure 2 shows residuals obtained using linear mixed models with cubic spline fixed and random effects for BPD, MAD, and FL, fit on both the original and log scales. For both the original and log scales, the empirical residuals suggest that the residual error distributions are heavy tailed and possibly skewed. Other transformations also showed these features, suggesting that directly transforming the longitudinal measurements will not address this problem. Further, although computationally feasible, the probit link function may not ideally model the probability of a poor pregnancy outcome since it is symmetric around 0.5, a strong assumption in this analysis. Thus, new statistical methodology is necessary to incorporate flexibility in the error distributions as well as in the link function relating the longitudinal trajectories to the binary response.
Figure 2.
Q-Q plots of the residuals for BPD, MAD, and FL from fitting the linear mixed random effects model separately: (a), (b), and (c) for original scale; (d), (e), and (f) for log scale.
In this paper, we develop a flexible class of joint models for the longitudinal ultrasound measurements to predict a binary event at birth, such as macrosomia. A skewed multivariate random effects model is proposed for the ultrasound measurements, and a skewed link is assumed for the link function relating the binary event and the underlying longitudinal processes. We consider a shared random effect to link the two processes together. We also use a polynomial regression spline based on a truncated power basis to capture the nonlinear structure in the longitudinal mean trajectory. To fit this joint model, we develop efficient Bayesian computational methods via several modified collapsed Gibbs samplers (Chen et al., 2000; Liu, 1994). In addition, we derive the deviance information criterion (DIC) and logarithm of pseudomarginal likelihood (LPML) for comparing several variations of the proposed joint models.
We organize the rest of the paper as follows. We provide in Section 2 the methodological development of a flexible class of joint models for the longitudinal ultrasound measurements and a subsequent binary event, and show how this approach generalizes methodology already developed for this problem. In Section 3, we discuss the prediction of an adverse pregnancy outcome from a series of multiple ultrasound measurements taken at various gestational ages that can be irregularly spaced across individuals. The prior and posterior distributions as well as the goodness-of-fit criteria are discussed in Section 4. Section 5 presents an analysis of the SGA study data. A discussion follows in Section 6.
2. Model Framework
Let i denote individual, j denote time point, and k denote ultrasound measurement. We assume that each measurement is taken at repeated time points, which are potentially irregularly spaced times in gestation. Further, we assume that there are I individuals in the study, each contributing Ji time points, where Ji denotes the number of repeated time points on the ith individual. Let xijk = (xijk1, . . ., xijk,p1)′ and zi = (zi1, . . ., zi,p2)′ denote the vectors of fixed effect covariates, where zi may share common components with xijk. Also, let βk = (βk1, . . ., βk,p1)′ and γ = (γ1, . . ., γp2)′ denote the corresponding vector of regression coefficients for the longitudinal and binary model components, respectively, i = 1, . . ., I, j = 1, . . ., Ji, and k = 1, . . ., K. Note that βk does not vary by i or j and γ is constant across i, j, and k. Also let , and . Furthermore, we assume that tijk denotes the time (i.e., gestational age) of the kth type of ultrasound measurement (i.e., BPD, MAD, or FL) at the jth time point on the ith individual.
2.1 Joint models
Let yijk denote the kth type of longitudinal ultrasound measurement at the jth time point on the ith individual. Also let Si denote an adverse binary event (e.g., macrosomia) for the ith individual. We propose the following joint models with shared random effects for the longitudinal measurements and binary outcome,
$$y_{ijk} = x_{ijk}'\beta_k + g(t_{ijk}) + g_b(t_{ijk}; b_{ik}) + \varepsilon^{*}_{ijk}, \tag{1}$$
$$S_i = \begin{cases} 1 & \text{if } R_i > 0, \\ 0 & \text{otherwise,} \end{cases} \qquad R_i = z_i'\gamma + \sum_{k=1}^{K} \alpha_k h(b_{ik}) + \eta^{*}_i. \tag{2}$$
For the longitudinal data, g(tijk) and gb(tijk; bik) in (1) are functions of time tijk corresponding to fixed and random effects, where bik is a vector of random effects. In addition, the ε*ijk are random variables for the error distributions that allow for long-tailed and skewed distributions, which are apparent in our longitudinal imaging data. These model components for the mean, random effects, and errors are flexible and are described in Sections 2.2 and 2.3. For the binary data, we introduce a latent variable Ri that characterizes the risk of the binary outcome (Albert and Chib, 1993) and is linked to the longitudinal processes through a function of random effects, h(bik), where the random effects are shared between the longitudinal and binary event processes. For each longitudinal variable (k), the parameter αk explicitly introduces this dependence. Further, to provide a flexible link function, the error distribution incorporates the possibility of both long tails and skewness as described in Section 2.3. This approach generalizes the work of Albert (2012) and Zhang et al. (2012), who, in the same setting, proposed a shared random effects model with Gaussian error structures and a probit link function. With these simplifying assumptions they show that a simple two-stage estimation procedure is possible. Specifically, they assumed quadratic fixed and random effects for g(tijk) and gb(tijk; bik), a normal distribution for ε*ijk, and a normal distribution for η*i, which results in a probit link function for the binary process. As is evident from Figure 2, the normal distribution is not adequate for the longitudinal fetal growth anthropometry data. Further, the probit link function in Albert (2012) might be too restrictive in this application.
Similar to the procedure used by Albert (2012) and Zhang et al. (2012), h(bik) is chosen so that the function relates the random effects to the underlying longitudinal processes close to birth. This is scientifically reasonable since, in the fetal growth example, the dependence between the longitudinal fetal growth and the binary birth outcome should be at a time close to birth. However, in other applications, we could introduce a parameter associated with each random effect component (e.g., linear, quadratic, and cubic terms) for each of the multivariate longitudinal measurements into the link function. The following sections describe each of the model components in more detail.
2.2 Longitudinal model with flexible mean structures
Longitudinal models have been proposed for characterizing fetal growth pattern (Deter, 2004; Slaughter et al., 2009). For fetal growth, the longitudinal profile for ultrasound measurements can be characterized to capture the nonlinear structure in the mean trajectory. To this end, we assume the following polynomial regression spline for g(tijk) and gb(tijk; bik):
$$g(t_{ijk}) = w_{ijk}'\phi_k \quad \text{and} \quad g_b(t_{ijk}; b_{ik}) = w_{ijk}'b_{ik}, \tag{3}$$
where q is a pre-specified degree of the polynomial spline, m is the number of knots, (a)+ = max(a, 0), ζk = (ζk1, . . ., ζkm)′ is the knot sequence with aζk < ζk1 < ··· < ζkm < bζk, wijk = (1, tijk, . . ., tijk^q, (tijk − ζk1)+^q, . . ., (tijk − ζkm)+^q)′ is the vector of truncated polynomial basis functions of degree q, and ϕk = (ϕk0, . . ., ϕk,q+m)′ and bik = (bik0, . . ., bik,q+m)′ are the corresponding vectors of parameters and random effects, respectively. Let bi = (bi1′, . . ., biK′)′. We then assume that the random effect bi follows a multivariate normal distribution with mean 0 and the K(q+m+1)×K(q+m+1) unstructured variance-covariance matrix Ω. In (1), the random effects for the kth type of ultrasound measurement are interpreted as individual departures in an individual’s growth curve relative to the average fetal growth curve in the population. This correlated random effects structure allows for a flexible correlation in the longitudinal measurements over time and across type (i.e., BPD, MAD, and FL).
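As an illustration of the truncated power basis behind (3), the following minimal Python sketch evaluates the basis vector and a spline value at a given time. The function names are hypothetical, and the knots are the values reported later in the data analysis, used here on the original week scale purely for illustration (the analysis itself rescales time to the unit interval).

```python
def truncated_power_basis(t, knots, degree=3):
    """Return (1, t, ..., t^q, (t - z_1)_+^q, ..., (t - z_m)_+^q) for degree q >= 1."""
    polynomial_terms = [t ** power for power in range(degree + 1)]
    truncated_terms = [max(t - knot, 0.0) ** degree for knot in knots]
    return polynomial_terms + truncated_terms


def spline_value(t, coefficients, knots, degree=3):
    """Evaluate the inner product of the basis at t with a coefficient vector."""
    basis = truncated_power_basis(t, knots, degree)
    if len(coefficients) != len(basis):
        raise ValueError("expected q + m + 1 coefficients")
    return sum(c * w for c, w in zip(coefficients, basis))


# Illustrative knots (weeks of gestation), taken from the data analysis section.
example_knots = [18.86, 26.14, 33.71]
basis_at_20_weeks = truncated_power_basis(20.0, example_knots)
```

With q = 3 and m = 3 the basis has q + m + 1 = 7 elements, which matches the length of ϕk and bik in the text. The same function evaluated at a time close to birth gives a vector that could play the role of uk in h(bik).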
2.3 Error distributions for longitudinal measurements and link functions
We consider flexible distributions for ε*ijk in (1) and η*i in (2) that allow for flexibility in the longitudinal error distribution as well as in the link function. In this paper, we propose to use long-tailed and skewed distributions to accomplish this goal. Let Δθ = diag(θ1, θ2, . . ., θK), ξij = (ξij1, . . ., ξijK)′, εij = (εij1, . . ., εijK)′, and ε*ij = (ε*ij1, . . ., ε*ijK)′. Specifically, we model
$$\varepsilon^{*}_{ij} = \Delta_\theta(\xi_{ij} - E\xi_{ij}) + \varepsilon_{ij}, \tag{4}$$
$$\eta^{*}_i = \delta(\psi_i - E\psi_i) + \eta_i, \tag{5}$$
where the first terms in (4) and (5) reflect components for the skewness and the second terms reflect components for the long tails. In (4), we assume that (i) ξijk and εijk are independent; (ii) ξijk ~ Gξ, and ξijk and ξi′j′k′ are independent, where Gξ is the cumulative distribution function (cdf) of a skewed distribution defined on R+ = (0, ∞); and (iii) εij follows a multivariate symmetric distribution with the K×K unstructured variance-covariance matrix Σj = (σjkk′), where k, k′ = 1, ···, K, and εij and εi′j′ are independent. In (5), we assume that (i) ψi and ηi are independent; (ii) the ψi ~ Gψ are mutually independent, where Gψ is the cdf of a skewed distribution defined on R+ = (0, ∞); and (iii) ηi follows a symmetric distribution, and ηi and ηi′ are independent. Further, θk in (4) and δ in (5) are skewness parameters. When θk = 0 (δ = 0), the distribution of yijk (Ri) is symmetric. Following Chen et al. (1999) and Kim et al. (2008), we assume that Gξ and Gψ are known cdfs to ensure model identifiability. In this paper, we first specify several different distributions for Gξ and Gψ. We then adopt the DIC proposed by Spiegelhalter et al. (2002) and the LPML (Ibrahim et al., 2001) to determine which Gξ and Gψ fit the data best.
For characterizing skewness (the first terms in (4) and (5)), we consider the following distributions for Gξ and Gψ: (a) Gξ is degenerate at 0, denoted by Δ{0}, yielding a multivariate symmetric distribution; Gψ is degenerate at 0 for a symmetric link; (b) Gξ is a standard exponential distribution (ℰ) with probability density function (pdf) fℰ(ξijk) = exp(−ξijk) if ξijk > 0 and 0 otherwise; Gψ is ℰ; and (c) Gξ is a half normal distribution (ℋ𝒩) with pdf fℋ𝒩(ξijk) = (2/π)^{1/2} exp(−ξijk²/2) if ξijk > 0 and 0 otherwise; Gψ is ℋ𝒩. Thus, given a non-degenerate Gξ, the model (4) yields a skewed multivariate distribution. Also, ℰ and ℋ𝒩 for Gψ in (5) both lead to skewed links. For the symmetric component of the error terms in the longitudinal outcomes (second term in (4)), we assume the following multivariate scale-mixture normal distribution for εij
$$\varepsilon_{ij} \mid \lambda_{ij} \sim N_K\left(0, \lambda_{ij}^{-1}\Sigma_j\right), \tag{6}$$
where λij’s are independent across j and each follows the distribution Gamma (ν1/2, ν2/2), where Gamma (a, b) is a Gamma distribution with mean a/b. The marginal distribution of εij is then a multivariate generalized t-distribution with parameters ν1 and ν2. Specifically, the pdf of εij is given by
(7) |
with
(8) |
and
(9) |
where ν1 is a shape parameter (or degrees of freedom) and ν2 is a scale parameter. When ν1 = ν2 = ν, (7) reduces to a multivariate t-distribution with ν degrees of freedom. Similar to a multivariate t-distribution, the pdf of a multivariate generalized t-distribution is symmetric about zero, with a small value of ν1 corresponding to a heavy-tailed distribution. For a multivariate generalized t-distribution, ν2 is assumed to be fixed to ensure identifiability. Without loss of generality, we assume ν2 to be 1 in this paper. For the symmetric component of the link function for the binary outcome (second term in (5)), ηi, we assume a generalized t-distribution (Abramowitz and Stegun, 1972) that is a univariate version of (7) with variance 1 given by
(10) |
where is a shape parameter (or degrees of freedom) and is a scale parameter. To ensure identifiability, we assume to be 1 (Kim et al., 2008).
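As a check on the scale-mixture construction, the marginal density implied by the mixture described around (6) can be worked out directly. The derivation below is a sketch under the assumption that εij given λij is NK(0, Σj/λij) and that λij follows Gamma(ν1/2, ν2/2) with mean ν1/ν2; it is a standard gamma-mixture calculation and is not necessarily written in the same form as the authors' displays (7) to (9).

```latex
\begin{aligned}
f(\varepsilon_{ij})
 &= \int_0^\infty
    \frac{\lambda^{K/2}}{(2\pi)^{K/2}|\Sigma_j|^{1/2}}
    \exp\!\Big(-\tfrac{\lambda}{2}\,\varepsilon_{ij}'\Sigma_j^{-1}\varepsilon_{ij}\Big)
    \frac{(\nu_2/2)^{\nu_1/2}}{\Gamma(\nu_1/2)}
    \lambda^{\nu_1/2-1}\exp\!\Big(-\tfrac{\nu_2}{2}\lambda\Big)\,d\lambda \\
 &= \frac{\Gamma\!\big((\nu_1+K)/2\big)}
         {\Gamma(\nu_1/2)\,(\pi\nu_2)^{K/2}\,|\Sigma_j|^{1/2}}
    \Big(1+\frac{\varepsilon_{ij}'\Sigma_j^{-1}\varepsilon_{ij}}{\nu_2}\Big)^{-(\nu_1+K)/2}.
\end{aligned}
```

Setting ν1 = ν2 = ν recovers the usual multivariate t density with ν degrees of freedom, in agreement with the text, and for ν1 > 2 the implied covariance matrix is ν2 Σj/(ν1 − 2).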
We note that, based on (7) and (10), ε*ij in (4) follows a multivariate skewed generalized t-distribution, and η*i in (5) leads to a skewed generalized t-link for Si. Web Figure 1 illustrates the skewed generalized t-distribution. The joint models defined in (1), (2), (3), (4), (5), (7), and (10) are general and flexible and include the normal/probit and skewed t/skewed t-link models as special cases for modeling the longitudinal measurements/binary outcome.
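The gamma scale-mixture representation of the symmetric generalized t component can be simulated in a few lines. The sketch below is univariate (K = 1) for simplicity, uses only the Python standard library, and the function name is hypothetical. It assumes the mixing variable has shape ν1/2 and rate ν2/2, as stated in the text.

```python
import math
import random


def sample_generalized_t(nu1, nu2=1.0, sigma=1.0, rng=random):
    """Draw one value from a univariate generalized t via its scale mixture.

    lambda ~ Gamma(shape = nu1 / 2, rate = nu2 / 2); random.gammavariate takes
    (shape, scale), so the scale argument is 2 / nu2.
    """
    lam = rng.gammavariate(nu1 / 2.0, 2.0 / nu2)
    return rng.gauss(0.0, sigma / math.sqrt(lam))


def sample_variance(values):
    """Unbiased sample variance, used to compare against nu2 / (nu1 - 2)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)
```

For ν1 > 2 the draws should have variance close to σ² ν2/(ν1 − 2), and smaller values of ν1 produce visibly heavier tails.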
To complete the model specification, we need to define h(bik). For the fetal growth example, h(bik) is chosen so that the longitudinal process is linked with the binary process through an individual’s projected fetal growth at a point close to birth (Albert, 2012). Thus, h(bik) = uk′bik in (2) is chosen, where uk = (1, t*, . . ., t*^q, (t* − ζk1)+^q, . . ., (t* − ζkm)+^q)′ is the vector of truncated polynomial basis functions of degree q with m knots evaluated at t*, a time point near the time of birth, such as 39 weeks of gestation.
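To see why a skewed link differs from the probit link, the success probability can be approximated by Monte Carlo for a given linear predictor. The sketch below is an assumption-laden illustration: it takes the symmetric component to be standard normal (the skewed probit special case), takes the skewness variable to be half normal, and centers the skewness term at its mean. The function name and defaults are hypothetical.

```python
import math
import random

# Mean of |Z| for Z ~ N(0, 1), used to center the half-normal skewness term.
HALF_NORMAL_MEAN = math.sqrt(2.0 / math.pi)


def skewed_probit_prob(linear_predictor, delta, n_draws=100_000, seed=0):
    """Approximate P(S = 1) = P(R > 0) for a skewed probit link by simulation.

    Assumed latent model: R = linear_predictor + delta * (psi - E[psi]) + eta,
    with psi half normal and eta standard normal.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_draws):
        psi = abs(rng.gauss(0.0, 1.0))
        eta = rng.gauss(0.0, 1.0)
        latent = linear_predictor + delta * (psi - HALF_NORMAL_MEAN) + eta
        successes += latent > 0.0
    return successes / n_draws
```

With delta = 0 the function approximates the probit link, for which the probabilities at a linear predictor and at its negative sum to one. With a nonzero delta that symmetry no longer holds, which is the flexibility the skewed link is meant to provide.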
2.4 Likelihood functions
For the longitudinal data, we let y = (y111, . . ., yI,JI,K)′, , θ = (θ1, . . .,θK)′, Σ = diag(Σ1, . . ., ΣJI), and . Also let denote the observed data. Given b and , the likelihood function of (β, ϕ, θ, Σ, ν1) for longitudinal ultrasound measurements is given by
(11) |
where μij = Xijβ + Wijϕ + Wijbi + Δθ(ξij − Eξij). For the binary data, we let S = (S1, . . ., SI)′, , α = (α1, . . ., αK)′, and ψ = (ψ1, . . ., ψI)′. Also let denote the observed data. The likelihood function of (γ, α, δ, ) for the binary outcome is
(12) |
Furthermore, the observed joint likelihood function of (β, ϕ, θ, Σ, ν1, γ, α, δ, , Ω) is
(13) |
where the likelihood functions for the longitudinal measurements and the binary outcome are given in (11) and (12), respectively. Because the observed joint likelihood function in (13) is difficult to work with directly, it is not feasible to base an efficient Markov chain Monte Carlo (MCMC) sampling algorithm on it. Instead, we use the fact that the generalized t-distribution can be represented as a gamma mixture of normal distributions for εij and ηi, and we introduce the complete data likelihood function, as described in Web Appendix A.
3. Predicting the Binary Outcome
The joint model in (1) and (2) relates the longitudinal fetal growth pattern to the probability of an abnormal birth outcome through an individual’s predicted measurement at time t*, and it can be used to develop a predictor of the binary outcome from longitudinally collected measurements. We are interested in predicting an adverse pregnancy outcome such as macrosomia from a series of multivariate ultrasound measurements taken at various gestational ages. To predict the abnormal binary outcome at birth using the multivariate longitudinal ultrasound measurements, we let yP = (yt1, yt2, . . ., ytL) denote the longitudinal measurements taken at time points t1, t2, . . ., tL, where L is the number of repeated measurements in the predictor. We also let SP denote the binary outcome we wish to predict. Let Θ denote the collection of all model parameters. Then the posterior predictive probability for SP based on longitudinal measurements yP is given by
$$P(S_P = 1 \mid y_P) = \int P(S_P = 1 \mid \Theta, b_P)\, \pi(\Theta, b_P \mid y_P)\, d\Theta\, db_P, \tag{14}$$
where bP is a multivariate random effect (bP ~ N (0, Ω)) and π(Θ, bP |yP) is the posterior predictive distribution for Θ and bP based on yP. To evaluate the prediction accuracy, we divide the data into training and test sets. To obtain the posterior predictive probability in (14), we sample bP from the joint posterior distribution based on the test set data (with parameter estimates obtained from the training set data). We consider two approaches to assess the predictive ability of the longitudinal classifiers. One approach is the receiver operating characteristic (ROC) curve used by Albert (2012), which is a standard approach for estimating the accuracy of a binary classification using a continuous marker. The ROC curve is a plot of sensitivity versus 1 − specificity for multiple cut-off values of the predictor. In particular, we compute the area under the ROC curve (AUC), where a value of 1 corresponds to perfect classification, while a value of 0.5 corresponds to completely random classification. We also use the mean-squared error (MSE) of prediction, which is a measure for assessing absolute risk (Gail and Pfeiffer, 2005) and is defined as the average squared difference between the predicted probability and the binary outcome. To perform valid prediction assessment, we estimate the model parameters of the joint model in (1) and (2) and formulate the predictor using the training set data and then validate the predictor using the test set data.
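The two accuracy measures can be computed from predicted probabilities and observed outcomes with a few lines of code. The sketch below is a generic illustration with hypothetical function names, not the authors' implementation: the AUC is computed as the proportion of case/non-case pairs in which the case receives the higher predicted probability (ties count one half), and the MSE is the average squared difference between predicted probability and outcome.

```python
def auc(predicted, outcomes):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [p for p, s in zip(predicted, outcomes) if s == 1]
    controls = [p for p, s in zip(predicted, outcomes) if s == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return concordant / (len(cases) * len(controls))


def mse(predicted, outcomes):
    """Mean squared difference between predicted probabilities and outcomes."""
    return sum((p - s) ** 2 for p, s in zip(predicted, outcomes)) / len(outcomes)


# Toy test-set example: four subjects, two with the event.
toy_probabilities = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
toy_auc = auc(toy_probabilities, toy_outcomes)
toy_mse = mse(toy_probabilities, toy_outcomes)
```

In a Bayesian analysis these summaries can be evaluated at each retained MCMC draw of the predicted probabilities, which yields posterior means and intervals for the AUC and MSE.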
4. Posterior Inference
4.1 Prior and Posterior Distributions
We assume that β, ϕ, θ, Σ, ν1, γ, α, δ, , and Ω are independent a priori. Thus, the joint prior for (β, ϕ, θ, Σ, ν1, γ, α, δ, , Ω) is of the form
(15) |
We further assume that βk ~ Np1 (0, c1Ip1), ϕk ~ Nq+m+1(0, c2Iq+m+1), θ ~ NK(0, c3IK), γ ~ Np2 (0, c4Ip2), α ~NK(0, c5IK), δ ~ N (0, c6), ν1 ~ Gamma (a1,b1) with with , and Ω−1 ~ WishartK(q+m+1) (d1, V1), where c1, c2, c3, c4, c5, c6, a1, b1, a2, b2, d0, V0, d1, and V1 are the prespecified hyperparameters. Here, WishartK (d, V) denotes a Wishart prior distribution with d degrees of freedom and mean dV. We recommend choosing vague priors for all parameters except , since posterior inference was not sensitive to the choice of these hyperparameters. For this application, an informative prior was necessary for , since there was little information about the long tails in the link function for the binary outcome. However, assuming an informative prior for still provides a very flexible form for the link function and ensures convergence of the Gibbs sampler.
Based on the prior distributions specified above, the joint posterior distribution of β, ϕ, θ, Σ, ν1, γ, α, δ, , and Ω is
(16) |
where the observed joint likelihood function is defined in (13). A description of the MCMC algorithm is given in Web Appendix B.
4.2 Model Comparison
To assess the goodness of fit of the models, we use the LPML given in Ibrahim et al. (2001) and the DIC proposed by Spiegelhalter et al. (2002). First, LPML is a well-established Bayesian model comparison criterion based on conditional predictive ordinate (CPO) statistics. As suggested in Ibrahim et al. (2001), a natural summary statistic of the CPOi's is the LPML, defined as LPML = Σ_{i=1}^{I} log(CPOi), where the CPO statistic for the ith subject is the marginal posterior predictive density of yi and Si. Second, for computing the DIC, we are unable to easily integrate out b analytically in (1) and (2). We therefore take a different approach that uses an extension of the DIC (Huang et al., 2005), given in Web Appendix C. The larger the LPML value and the smaller the DIC value, the better the model fits the data.
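For readers who want to see how an LPML can be computed from MCMC output, the sketch below uses the common harmonic-mean Monte Carlo estimate of each CPO, evaluated on the log scale for numerical stability. It assumes that a subject-level log-likelihood value is available for every retained draw; how that quantity is obtained for the joint model (with its random effects and latent variables) is described in the Web Appendix and is not reproduced here. The function names are hypothetical.

```python
import math


def log_cpo(loglik_draws):
    """Log CPO for one subject from its log-likelihood at each MCMC draw.

    Uses CPO ~= M / sum_m exp(-loglik_m), computed with a log-sum-exp.
    """
    negated = [-value for value in loglik_draws]
    largest = max(negated)
    log_sum = largest + math.log(sum(math.exp(v - largest) for v in negated))
    return math.log(len(loglik_draws)) - log_sum


def lpml(loglik_by_subject):
    """Sum of log CPO values over subjects (one list of draws per subject)."""
    return sum(log_cpo(draws) for draws in loglik_by_subject)
```

As a sanity check, if a subject's likelihood equals 0.5 at one draw and 0.25 at another, the estimated CPO is the harmonic mean of the two values, 1/3.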
5. Analysis of the Successive Small-for-Gestational-Age Births Study Data
We used the proposed joint model in (1) and (2) to analyze data from the SGA study discussed in Section 1, and we focus on predicting macrosomia, defined as a birthweight > 4,000 g, using the longitudinal ultrasound measurements. By considering macrosomia, we are examining the occurrence of large birthweight that has the potential to have pathological effects. Alternatively, we could have used other measures of large birthweight, including a commonly used measure called large for gestational age (LGA), that explicitly account for gestational age. An analysis of the actual birthweight is also possible but, we think, less relevant, since our interest is predominantly in identifying (diagnosing) abnormally large neonates.
The response variable yij = (yij1, yij2, yij3)′ is the vector of anthropometric ultrasound measurements for the ith woman at the jth time point: BPD (mm), MAD (mm), and FL (mm), where each pregnant woman was targeted to have four ultrasound examinations at approximately 17, 25, 33 and 37 weeks of gestation. The time point tijk is the jth gestational age (GA, weeks) for the ith woman and kth type of ultrasound measurement. We consider four covariates: maternal age (Age, years), pre-pregnancy body mass index (BMI, kg/m²), history of small-for-gestational-age birth (SGA, yes/no), and smoking during pregnancy (Smoking, number of cigarettes per day). Furthermore, the adverse binary outcome Si is macrosomia, which is defined as birthweight > 4,000 g (Zhang et al., 2012). We focus on 1474 women who had complete or partial longitudinal ultrasound measurements along with the birth outcome and relevant covariates in this analysis. Thus, I = 1474, J = 4 is the maximum number of repeated ultrasound measurements, and K = 3 is the number of ultrasound measurement types. Descriptive statistics for the SGA study data are presented in Web Table 1. In addition, Web Figure 2 illustrates the timing of ultrasound examinations and birth. We further use m = 3 for the number of knots and ζk = (18.86, 26.14, 33.71)′ for the locations of the knot points corresponding to the 25th, 50th, and 75th percentiles of all measurement times. As described in Section 2.2, we use cubic splines for g(tijk) and gb(tijk; bik) by setting q = 3 in (3) to incorporate a flexible mean structure. We also use q = 3 and t* = 39 for h(bik) = uk′bik in (2). In addition, we randomly split the data into a training set (60%) and a test set (40%), giving 884 women in the training set and 590 in the test set. We use the training set data to develop the predictor by first fitting the joint model in (1) and (2), while the test set data are used to validate the predictor with the accuracy measures (i.e., ROC, AUC, and MSE).
In all of the analyses, we standardized the covariates: the sample mean was subtracted from each covariate, and the result was divided by the sample standard deviation (SD). This was done to help the numerical stability in the posterior computation using the MCMC sampling algorithm in Web Appendix B. The means and standard deviations are (28.33, 4.26) for Age, (21.51, 3.17) for BMI, (0.24, 0.43) for SGA, and (6.78, 7.15) for Smoking. Furthermore, tijk is re-scaled to the unit interval by dividing by the maximum value of tijk, so that 0 < tijk ≤ 1. The location of knots ζk is also re-scaled by the maximum value of tijk. The hyperparameters of the prior in (15) were specified as c1 = 100, c2 = 100, c3 = 100, c4 = 100, c5 = 100, c6 = 100, a1 = 1, b1 = 0.1, a2 = 1, b2 = 1, d0 = K + 0.1, V0 = 0.1, d1 = q + m + 1.00001, and V1 = 0.00001 in the analysis. For all of the posterior computations, we generated 100,000 Gibbs samples after a burn-in of 20,000 iterations, and we then retained every 5th iteration, giving 20,000 samples for computing all the posterior estimates, including posterior means, posterior standard deviations, 95% highest posterior density (HPD) intervals, and the LPMLs and DICs for model comparison. The computer programs were written in FORTRAN 95 using IMSL subroutines with double precision accuracy. The convergence of the MCMC sampling algorithm for all the parameters was checked based on the recommendations of Cowles and Carlin (1996). All trace and autocorrelation plots showed good convergence and excellent mixing of the MCMC sampling algorithm.
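The preprocessing and thinning steps described above amount to simple arithmetic, sketched below in Python. The helper names are hypothetical, the covariate means and SDs are those reported in the text, and the thinning example assumes that the 100,000 draws are counted after the 20,000 burn-in iterations, which is the reading under which every 5th draw yields 20,000 retained samples.

```python
def standardize(value, sample_mean, sample_sd):
    """Subtract the sample mean and divide by the sample standard deviation."""
    return (value - sample_mean) / sample_sd


def rescale_to_unit_interval(times):
    """Divide all times by the largest observed time, so that 0 < t <= 1."""
    largest = max(times)
    return [t / largest for t in times]


def thin_chain(draws, burn_in, step):
    """Discard the first `burn_in` draws, then keep every `step`-th draw."""
    return draws[burn_in::step]


# Reported (mean, SD) pairs for the four covariates.
COVARIATE_SUMMARY = {
    "Age": (28.33, 4.26),
    "BMI": (21.51, 3.17),
    "SGA": (0.24, 0.43),
    "Smoking": (6.78, 7.15),
}

# Example: a 30-year-old woman (illustrative value, not from the data).
age_z = standardize(30.0, *COVARIATE_SUMMARY["Age"])

# Example: the four target visit times in weeks (illustrative, not the data).
scaled_times = rescale_to_unit_interval([17.0, 25.0, 33.0, 37.0])

# Example: indices of retained iterations under the assumed sampling scheme.
retained = thin_chain(list(range(120_000)), burn_in=20_000, step=5)
```

Applying the same divisor to the knots keeps them aligned with the rescaled measurement times.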
We are interested in investigating how the goodness of fit might be affected by the distributions of εij and ηi and by the choices of Gξ and Gψ for the joint model in (1) and (2) using the DIC and LPML discussed in Section 4.2. This investigation involves the following distributions: (i) normal, generalized t (GT), and skewed generalized t (SGT) for εij and ηi; (ii) Δ{0}, ℰ, and ℋ𝒩 for Gξ and Gψ. With the combination of those distributions, we have the following ten models for model comparison: (1) Probit-Normal model with symmetric normal for ηi and εij; (2) SProbitE-Normal model with normal for ηi and εij, and ℰ for Gψ; (3) SProbitN-Normal model with normal for ηi and εij, and ℋ𝒩 for Gψ; (4) GT-GT model with symmetric GT for ηi and εij; (5) SGTE-GT model with GT for ηi and εij, and ℰ for Gψ; (6) SGTE-SGTE model with GT for ηi and εij, and ℰ for Gψ and Gξ; (7) SGTE-SGTN model with GT for ηi and εij, ℰ for Gψ, and ℋ𝒩 for Gξ; (8) SGTN-GT model with GT for ηi and εij, and ℋ𝒩 for Gψ; (9) SGTN-SGTE model with GT for ηi and εij, ℋ𝒩 for Gψ, and ℰ for Gξ; (10) SGTN-SGTN model with GT for ηi and εij, and ℋ𝒩 for Gψ and Gξ.
Table 1 shows the DIC and LPML values for the ten models under consideration, with the smallest value for DIC (33878.04) and largest value for LPML (−19351.22) corresponding to the SGTN-SGTN model. This demonstrates that the SGTN-SGTN model fits the training data the best among all models considered. This affirms the need for considering heavy-tailed distributions for both ηi and εij and skewed distributions with ℋ𝒩 for both Gξ and Gψ. The models with symmetric distributions for ηi and εij have larger DIC values and smaller LPML values, suggesting that these models fit the data worse than the skewed models. Furthermore, among the skewed models, the models with ℋ𝒩 for Gξ have a better fit than the models with ℰ for Gξ. Importantly, all other models considered show a better fit than the Probit-Normal model proposed by Albert (2012) and Zhang et al. (2012).
Table 1.
The values of DIC and LPML based on the training set data
Model | D(Θ̄) | PD | DIC | LPML |
---|---|---|---|---|
Probit-Normal | 35138.35 | 2349.66 | 39837.67 | −20648.07 |
SProbitE-Normal | 34511.85 | 2581.42 | 39674.68 | −20583.01 |
SProbitN-Normal | 34407.83 | 2573.29 | 39554.40 | −20531.47 |
GT-GT | 31788.66 | 3419.64 | 38627.94 | −20495.25 |
SGTE-GT | 31409.85 | 3463.43 | 38336.71 | −20424.44 |
SGTE-SGTE | 26532.40 | 4234.55 | 35001.50 | −19679.18 |
SGTE-SGTN | 23645.86 | 5218.22 | 34082.31 | −19469.08 |
SGTN-GT | 31369.51 | 3464.70 | 38298.90 | −20380.60 |
SGTN-SGTE | 26124.64 | 4242.18 | 34608.99 | −19513.09 |
SGTN-SGTN | 23334.20 | 5271.92 | 33878.04 | −19351.22 |
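The entries of Table 1 appear to be consistent with the relation DIC = D(Θ̄) + 2 PD, up to rounding. The short Python check below, which simply re-enters the table values, verifies this relation for every row and identifies the preferred model under each criterion. The variable and function names are hypothetical.

```python
# (D(theta_bar), PD, DIC, LPML) for each model, copied from Table 1.
TABLE_1 = {
    "Probit-Normal": (35138.35, 2349.66, 39837.67, -20648.07),
    "SProbitE-Normal": (34511.85, 2581.42, 39674.68, -20583.01),
    "SProbitN-Normal": (34407.83, 2573.29, 39554.40, -20531.47),
    "GT-GT": (31788.66, 3419.64, 38627.94, -20495.25),
    "SGTE-GT": (31409.85, 3463.43, 38336.71, -20424.44),
    "SGTE-SGTE": (26532.40, 4234.55, 35001.50, -19679.18),
    "SGTE-SGTN": (23645.86, 5218.22, 34082.31, -19469.08),
    "SGTN-GT": (31369.51, 3464.70, 38298.90, -20380.60),
    "SGTN-SGTE": (26124.64, 4242.18, 34608.99, -19513.09),
    "SGTN-SGTN": (23334.20, 5271.92, 33878.04, -19351.22),
}


def dic_from_components(deviance_at_mean, effective_parameters):
    """DIC as deviance at the posterior mean plus twice the effective dimension."""
    return deviance_at_mean + 2.0 * effective_parameters


# Largest discrepancy between the recomputed and the tabulated DIC.
largest_gap = max(
    abs(dic_from_components(dev, pd) - dic) for dev, pd, dic, _ in TABLE_1.values()
)

# Smaller DIC and larger LPML both indicate a better fit.
best_by_dic = min(TABLE_1, key=lambda model: TABLE_1[model][2])
best_by_lpml = max(TABLE_1, key=lambda model: TABLE_1[model][3])
```

Both criteria select the same model, which matches the conclusion drawn in the text.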
Tables 2 and 3 show the posterior means, standard deviations and 95% HPD intervals of the parameters under the best model (SGTN-SGTN model) based on the training set data. The results in Table 2 show that maternal age is significantly associated only with MAD, and this association is positive, suggesting that older women have fetuses with larger abdominal diameter. Further, a woman’s BMI is negatively associated with BPD and positively associated with MAD. The skewness parameters for MAD and FL were significantly different from zero, with the estimate for MAD being negative and that for FL positive. The posterior estimate for ν1 is 3.92, suggesting that the ultrasound measurements have heavy-tailed distributions. Posterior estimates of g(tijk) and Σ under the best model (SGTN-SGTN model) are presented in Web Tables 2 and 3. The estimated longitudinal trajectory plots of BPD, MAD, and FL over gestational age are given in Web Figure 3. This plot adjusts for Age, BMI, SGA, and Smoking and takes full advantage of the specification of our flexible model. The estimated longitudinal trajectories for MAD and FL are steadily increasing across gestation, and the trajectory for BPD appears to level off at later gestational ages. Furthermore, ultrasound measurements are positively correlated with each other at each of the follow-up times, with the correlation being the lowest at the third follow-up time (Web Table 4).
Table 2.
Posterior estimates of the parameters for ultrasound measurements under the best model: fit to training set data
Measurement | Variable | Parameter | Mean | SD | 95% HPD Interval |
---|---|---|---|---|---|
BPD | Age | β11 | −0.029 | 0.033 | (−0.094, 0.036) |
BPD | BMI | β12 | −0.099 | 0.033 | (−0.163, −0.033) |
BPD | SGA | β13 | 0.041 | 0.034 | (−0.026, 0.106) |
BPD | Smoking | β14 | 0.051 | 0.034 | (−0.013, 0.120) |
BPD | Skewness | θ1 | −0.072 | 0.647 | (−0.870, 0.815) |
MAD | Age | β21 | 0.122 | 0.057 | (0.011, 0.231) |
MAD | BMI | β22 | 0.126 | 0.058 | (0.010, 0.237) |
MAD | SGA | β23 | −0.048 | 0.057 | (−0.157, 0.066) |
MAD | Smoking | β24 | 0.074 | 0.057 | (−0.042, 0.182) |
MAD | Skewness | θ2 | −1.983 | 0.122 | (−2.215, −1.742) |
FL | Age | β31 | −0.069 | 0.052 | (−0.171, 0.030) |
FL | BMI | β32 | 0.067 | 0.052 | (−0.034, 0.170) |
FL | SGA | β33 | 0.042 | 0.052 | (−0.059, 0.145) |
FL | Smoking | β34 | −0.064 | 0.052 | (−0.163, 0.040) |
FL | Skewness | θ3 | 1.243 | 0.139 | (0.957, 1.500) |
d.f. | | ν1 | 3.924 | 0.313 | (3.303, 4.524) |
Table 3.
Posterior estimates for macrosomia under the best model: fit to training set data
Variable | Parameter | Mean | SD | 95% HPD Interval |
---|---|---|---|---|
Intercept | γ1 | −17.299 | 3.733 | (−24.357, −10.989) |
Age | γ2 | 1.062 | 0.711 | (−0.296, 2.495) |
BMI | γ3 | 0.986 | 0.721 | (−0.358, 2.475) |
SGA | γ4 | −3.368 | 1.119 | (−5.676, −1.410) |
Smoking | γ5 | −1.173 | 0.717 | (−2.661, 0.187) |
BPD trajectory | α1 | 0.989 | 0.316 | (0.435, 1.594) |
MAD trajectory | α2 | 1.583 | 0.419 | (0.835, 2.404) |
FL trajectory | α3 | −0.022 | 0.453 | (−0.998, 0.798) |
Skewness | δ | −21.619 | 5.278 | (−31.637, −12.171) |
d.f. (link) | | 1.461 | 0.606 | (1.000, 2.570) |
Furthermore, the results in Table 3 show that, among the covariates, only SGA is significant, with a negative association with macrosomia, demonstrating that the probability of macrosomia is lower for women with a history of small-for-gestational-age birth. The parameters α1 for BPD and α2 for MAD, which link the two processes, are positive and highly statistically significant, while α3 for FL is not significant, suggesting that the trajectories for BPD and MAD are positively associated with macrosomia, while the trajectory for FL is not related to macrosomia. Further, the posterior estimate of the skewness parameter δ is negative and highly significant (−21.62), demonstrating that a negatively skewed link is needed for appropriately modeling macrosomia. In addition, the remaining parameter of the negatively skewed link, reported in the last row of Table 3, has a small posterior estimate (1.46). The posterior estimates of Ω for bi are presented in Web Tables 4 to 9.
Using the test set data, we assessed the overall diagnostic accuracy for predicting macrosomia. Specifically, we estimated AUC and MSE under each of the ten models under consideration. Table 4 presents the posterior means, standard deviations and 95% HPD intervals of the AUC and MSE. The results from Table 4 show that the skewed models have higher AUC and smaller MSE than the models with symmetric distributions. The model with the highest AUC and lowest MSE was SGTN-SGTN, demonstrating the importance of incorporating long-tailed skewed distributions in the longitudinal error distribution as well as in the link function formulation. ROC curves corresponding to these AUCs are presented in Web Figure 4. Interestingly, all extended models had sizable increases in diagnostic accuracy as compared to the Probit-Normal model, the special case that reduces to the model developed by Albert (2012) and Zhang et al. (2012). The estimated individual probabilities of prediction for the Probit-Normal model, SGTN-SGTE model, and SGTN-SGTN model (the best model) for the test set data are given in Web Figure 5. These plots suggest that the Probit-Normal model provides an overestimate when the probability for the SGTN-SGTN model is greater than about 0.2. The individual probabilities of prediction are similar across the skewed models (e.g. the SGTN-SGTE and SGTN-SGTN models).
Table 4.
AUC and MSE values based on the test set data
Model | AUC Mean | AUC Std | AUC 95% HPD | MSE Mean | MSE Std | MSE 95% HPD |
---|---|---|---|---|---|---|
Probit-Normal | 0.817 | 0.014 | (0.790, 0.843) | 0.123 | 0.004 | (0.114, 0.132) |
SProbitE-Normal | 0.838 | 0.013 | (0.812, 0.864) | 0.106 | 0.004 | (0.098, 0.114) |
SProbitN-Normal | 0.838 | 0.013 | (0.812, 0.863) | 0.106 | 0.004 | (0.098, 0.114) |
GT-GT | 0.847 | 0.012 | (0.823, 0.871) | 0.102 | 0.004 | (0.095, 0.111) |
SGTE-GT | 0.864 | 0.008 | (0.848, 0.880) | 0.098 | 0.003 | (0.093, 0.104) |
SGTE-SGTE | 0.868 | 0.007 | (0.854, 0.881) | 0.098 | 0.002 | (0.093, 0.103) |
SGTE-SGTN | 0.868 | 0.007 | (0.854, 0.881) | 0.097 | 0.002 | (0.093, 0.102) |
SGTN-GT | 0.865 | 0.008 | (0.850, 0.881) | 0.098 | 0.003 | (0.093, 0.103) |
SGTN-SGTE | 0.868 | 0.007 | (0.854, 0.881) | 0.097 | 0.002 | (0.093, 0.102) |
SGTN-SGTN | 0.869 | 0.006 | (0.857, 0.881) | 0.096 | 0.002 | (0.092, 0.099) |
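The test-set accuracy measures summarized in Table 4 can be illustrated with a short sketch. It computes the AUC as the proportion of (event, non-event) pairs in which the event case receives the higher predicted probability, and the MSE as the average squared difference between the binary outcome and the predicted probability. That definition of MSE, the function names, and the toy data are assumptions made for illustration; the paper's exact definitions appear in earlier sections. Averaging both measures over posterior draws of the predicted probabilities gives summaries of the kind reported in the table.

```python
def auc(y_true, p_hat):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mse(y_true, p_hat):
    """Mean squared difference between outcome (0/1) and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def summarize_over_draws(y_true, prob_draws):
    """Average AUC and MSE across posterior draws of the predicted probabilities."""
    aucs = [auc(y_true, p) for p in prob_draws]
    mses = [mse(y_true, p) for p in prob_draws]
    return sum(aucs) / len(aucs), sum(mses) / len(mses)


# Toy test set: 1 = macrosomia, 0 = no macrosomia (hypothetical values).
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
print(auc(y, p), mse(y, p))
```

The pairwise AUC computation is quadratic in the number of subjects, which is adequate for a sketch; a rank-based formula would be preferable for large test sets.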
6. Discussion
This paper presents a new class of models that can be used to predict a binary outcome (e.g. macrosomia) from multivariate longitudinal data (e.g. ultrasound measurements of fetal growth). The models are flexible in that they allow for skewed long-tailed distributions for the ultrasound measurements and a very flexible link function for relating the longitudinal trajectories to the binary outcome of macrosomia. We demonstrate with a fetal growth study that this flexible modeling improves diagnostic and prediction accuracy relative to more standard approaches. The model extends the work of Albert (2012) and Zhang et al. (2012), who proposed a similar analytical framework with the assumption of a normal longitudinal error structure and a probit link function. In this paper, we show that a more general model provides improved model fit as well as substantially improved diagnostic accuracy as compared with the simpler analysis. We recognize that this improved performance comes at the cost of increased computational expense, since a Bayesian approach is needed to handle the multivariate random effects in this complex setting.
The methodology was developed specifically to address the important medical/epidemiologic question of predicting poor pregnancy outcomes from longitudinal ultrasound data. However, the methodology can be applied more generally to any situation where we are predicting a binary event whose timing is of secondary interest and which does not cause the longitudinal data to be censored. When the event does censor the longitudinal data, a joint model of the longitudinal and time-to-event data would be more appropriate.
In this approach, we incorporate a dependence between the longitudinal and binary processes with shared random effects that link the two processes through a function h(bik). We specified h(bik) to be the projection of the fetal growth process to a time close to birth. In our analyses, we chose this time to be 39 gestational weeks, but results were not sensitive to this assumption. Alternatively, one could link the two processes with a linear combination of all the random effects and unknown parameters. However, in our situation, where we have three longitudinal outcomes and seven random effects per outcome, this would be problematic (21 coefficients to estimate). The current approach requires us to estimate only one parameter for each longitudinal outcome (a total of three coefficients).
We incorporated a flexible mean structure for the longitudinal trajectories using cubic splines. A priori, we chose three knot points at the gestational times corresponding to the three quartiles. For the fetal growth application, this was sensible, since the knot point locations were close to the targeted measurement times. However, in other situations where observations are more irregularly spaced and the mean trajectories follow more complex patterns, a larger number of knot points may be necessary. Future research may focus on estimating the optimal number of knot points in these situations.
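To make the knot placement concrete, the sketch below builds a cubic spline basis with three knots at the quartiles of the observation times. The truncated power basis used here is an assumption chosen for simplicity, since the particular spline basis is not restated in this section, and the observation times are hypothetical values scattered around the targeted visits at 17, 25, 33 and 37 weeks. With three knots this basis has seven columns.

```python
import statistics


def cubic_spline_basis(t, knots):
    """Truncated power basis: 1, t, t^2, t^3, then (t - k)_+^3 for each knot k."""
    return [1.0, t, t ** 2, t ** 3] + [max(t - k, 0.0) ** 3 for k in knots]


# Hypothetical gestational ages (weeks) at the ultrasound examinations.
weeks = [16.5, 17.0, 17.8, 24.6, 25.1, 25.9, 32.4, 33.0, 33.5, 36.8, 37.2, 37.9]

# Three knots placed at the quartiles of the observed times.
knots = statistics.quantiles(weeks, n=4)

# Basis row at 39 weeks, the time close to birth discussed above.
row_at_39 = cubic_spline_basis(39.0, knots)
print(knots, len(row_at_39))
```

A B-spline basis spanning the same function space would usually be better conditioned numerically; the truncated power form is shown only because it makes the role of each knot explicit.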
The model induced dependence between the longitudinal and binary outcomes using shared random effects. We recognize that this does place a constraint on the correlation between these processes. However, including separate correlated random effects for the longitudinal and binary components is not possible, since random effects cannot be incorporated for a single binary response. For applications with a repeated binary response, such an extension could be considered.
Acknowledgments
The research of Drs. Kim and Albert was supported by the Intramural Research program of the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development.
Footnotes
Web Appendices, Tables, and Figures, referenced in Sections 2.3, 2.4, 4.1, 4.2, and 5, along with Fortran code for conducting analysis in Section 5 are available with this paper at the Biometrics website on Wiley Online Library.
References
- Abramowitz M, Stegun IA. Handbook of mathematical functions with formulas, graphs, and mathematical tables. New York: Dover Publications, Inc; 1972.
- Albert JH, Chib S. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association. 1993;88:669–679.
- Albert PS. A linear mixed model for predicting a binary event from longitudinal data under random effects misspecification. Statistics in Medicine. 2012;31:145–154. doi: 10.1002/sim.4405.
- Chen MH, Dey DK, Shao QM. A new skewed link model for dichotomous quantal response data. Journal of the American Statistical Association. 1999;94:1172–1186.
- Chen MH, Shao QM, Ibrahim JG. Monte Carlo Methods in Bayesian Computation. New York: Springer-Verlag; 2000.
- Cowles C, Carlin BP. Markov chain Monte Carlo convergence diagnostics: a comparative review. Journal of the American Statistical Association. 1996;91:883–904.
- Deter RL. Individualized growth assessments: evaluation of growth using each fetus as its own control. Seminars in Perinatology. 2004;28:23–32. doi: 10.1053/j.semperi.2003.10.011.
- Gail M, Pfeiffer RM. On criteria for evaluating models of absolute risk. Biostatistics. 2005;6:227–239. doi: 10.1093/biostatistics/kxi005.
- Huang L, Chen MH, Ibrahim JG. Bayesian analysis for generalized linear models with nonignorably missing covariates. Biometrics. 2005;61:767–780. doi: 10.1111/j.1541-0420.2005.00338.x.
- Ibrahim JG, Chen MH, Sinha D. Bayesian survival analysis. New York: Springer-Verlag; 2001.
- Kim S, Chen MH, Dey DK. Flexible generalized t-link models for binary response data. Biometrika. 2008;95:93–106.
- Liu JS. The collapsed Gibbs sampler in Bayesian computations with applications to a gene regulation problem. Journal of the American Statistical Association. 1994;89:958–966.
- Slaughter JC, Herring AH, Thorp JM. A Bayesian latent variable mixture model for longitudinal fetal growth. Biometrics. 2009;65:1233–1242. doi: 10.1111/j.1541-0420.2009.01188.x.
- Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B. 2002;64:583–639.
- Zhang J, Kim S, Grewal J, Albert PS. Predicting large fetuses at birth: do multiple ultrasound examinations and longitudinal statistical modelling improve prediction. Paediatric and Perinatal Epidemiology. 2012;26:199–207. doi: 10.1111/j.1365-3016.2012.01261.x.