Modeling longitudinal data and its impact on survival in observational nephrology studies: tools and considerations

Elani Streja; Leanne Goldstein; Melissa Soohoo; Yoshitsugu Obi; Kamyar Kalantar-Zadeh; Connie M Rhee

doi:10.1093/ndt/gfx015

2017 Mar 3;32(Suppl 2):ii77–ii83. doi: 10.1093/ndt/gfx015

PMCID: PMC6251642 PMID: 28340135

Abstract

Nephrologists and kidney disease researchers are often interested in monitoring how patients’ clinical and laboratory measures change over time, what factors may impact these changes, and how these changes may lead to differences in morbidity, mortality and other outcomes. When longitudinal data with repeated measures over time in the same patients are available, a number of analytical approaches can be employed to describe the trends and changes in these measures, and to explore the associations of these changes with outcomes. Researchers may choose a streamlined and simplified analytic approach to relate trajectories to subsequent outcomes, such as estimating deltas (subtraction of the first observation from the last observation) or estimating per-patient slopes with linear regression. Conversely, they could more fully address the data complexity by using a longitudinal mixed model to estimate change as a predictor, or employ a joint model, which can simultaneously model the longitudinal effect and its impact on an outcome such as survival. In this review, we aim to assist nephrologists and clinical researchers by reviewing these approaches to modeling the association of longitudinal change in a marker with outcomes, while appropriately considering the data complexity. Namely, we will discuss the use of simplified approaches for creating predictor variables representing change in measurements, including deltas and patient slopes, as well as more sophisticated longitudinal models, including joint models, which can be used in addition to simplified models as warranted by the indications and objectives of the study.

Keywords: change analysis, joint models, longitudinal, mixed-effects models, repeated measures

INTRODUCTION

In a large number of survival analysis studies, laboratory and clinical measurements are obtained at a single point in time at the beginning of the study (baseline), and patients are then followed for a period of time subsequent to this baseline time point to examine associations of these markers with morbidity and mortality outcomes. However, when researchers have data with repeated measurements, they may prefer to measure variability that occurs over a specified time period and investigate how changes in these variables impact outcomes after the exposure period. They may use methods such as time-dependent (or time-varying covariate) analyses, where the single measurement is updated over time and replaced with subsequent measurements to examine short-term associations, or time-averaged analyses, where a patient’s baseline measurement is represented by a single estimate derived from the average of measurements over a specified period. Additionally, with repeated measurements, researchers can also examine changes or fluctuations in these laboratory measurements and how they impact clinical outcomes. Researchers may use these changes and variations as a predictor of an outcome, or examine and describe trends and the factors that impact these trends. A previous review by Leffondre et al. [1] superbly summarizes and compares three methods of estimating trajectories of renal function over time and discusses their advantages and limitations, especially when renal function trajectories are not completely observable due to patient dropout. In this review, we will extend this discussion by examining the modeling of repeated measures as predictors used in survival analyses. We will provide a brief overview of the basics of linear and longitudinal modeling based on the texts of Weiss [2] and Fitzmaurice et al. [3], and additionally explain how longitudinal models can be used to provide better estimates of patient slopes to be used as predictors in survival models.

DELTA METHOD

In change analyses, a researcher may be interested in addressing how a change in a laboratory marker between two time points affects the outcome. In this case, the researcher will estimate a ‘delta’, i.e. the difference obtained by subtracting the first measurement from the later measurement. This method may closely represent a situation in clinical practice, where a physician must determine an action by examining the change between the current visit marker levels and the previous marker levels. In our study [4] on mortality prediction by surrogates of body composition, we examined the associations of changes in dry weight and serum creatinine over a 6-month period with mortality. We calculated the delta as the difference between the patients’ dry weight and serum creatinine values at 6 months and their baseline values, and estimated survival differences after this point. As change and its impact on outcomes may be relative to baseline values, we additionally adjusted for the baseline value of each marker. Since the deltas of weight and creatinine were the exposures of interest, analyses were restricted to patients with an available delta estimate, i.e. measurements at both baseline and 6 months of follow-up. Similar analyses were conducted in more recent studies by our group examining changes in mineral and bone disorder markers over 6 months or residual kidney function over 1 year with mortality outcomes in hemodialysis patients [5, 6]. The limitation of this delta method is that analyses are restricted to patients with measurements at both time points, since only these patients can contribute to the evaluation of subsequent outcomes.
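As a minimal sketch of the delta calculation described above, the following Python snippet computes each patient's delta as the follow-up value minus the baseline value and excludes patients who lack either measurement. The function name `compute_deltas`, the patient identifiers and the weights are hypothetical illustration values, not study data.

```python
from typing import Dict, Optional, Tuple


def compute_deltas(
    measurements: Dict[str, Tuple[Optional[float], Optional[float]]]
) -> Dict[str, float]:
    """Return follow-up minus baseline for patients who have both values."""
    deltas = {}
    for patient_id, (baseline, follow_up) in measurements.items():
        # Patients missing either time point cannot contribute a delta.
        if baseline is None or follow_up is None:
            continue
        deltas[patient_id] = follow_up - baseline
    return deltas


# Hypothetical dry weights (kg) at baseline and at 6 months.
weights = {
    "P1": (80.0, 77.5),
    "P2": (65.0, None),  # no 6-month value, so this patient is excluded
    "P3": (72.0, 74.0),
}
deltas = compute_deltas(weights)
print(deltas)
```

In a subsequent survival model, the delta would typically enter alongside the baseline value, in line with the baseline adjustment described above.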

SLOPE METHOD

If more than two data points are available, one may examine associations of marker change with outcomes by estimating the individual slope of that marker in each patient. The estimated slopes (calculated by simple or ordinary least squares (OLS) linear regression in each patient) would be used as the exposure. In such studies, some analysts may choose to limit their analyses to patients with at least three data points measured at specified times (such as beginning, middle and end, or at least 3 months apart) during a specified discrete time period. In a study by Derose et al. [7], analyses were restricted to patients who had more than two creatinine measurements, with at least 180 days between measurements, in order to estimate per-patient slopes of estimated glomerular filtration rate (eGFR) during 1, 3 and 5 years after entry as a predictor of subsequent mortality after the exposure period.

In both delta and slope analyses, researchers can estimate change as a predictor for an outcome and analyses are restricted to patients with the available amount of data. However, when repeated measures data are available for a population of patients, an analyst may want to estimate the exposure of the patient slope as a function of what is occurring for all patients in the population. In this case, more considerations are needed for the data structure in longitudinal (or repeated measures) analysis. Of benefit, appropriate longitudinal modeling can maximally leverage the rich data available and provide correct standard error estimates or better estimates of slopes for use as predictors. Although seemingly complex, examining trends in repeated measures with longitudinal models is an extension of simple linear regression analysis.

LINEAR REGRESSION REVIEW

In an elementary statistics course, researchers learn the basics of linear regression (or OLS regression) modeling, where the relationship between a continuous predictor variable and an outcome variable is modeled with an equation representing the best fit line, Y = α + βX + Ԑ, where Y is the outcome variable, X is the predictor variable, α is the intercept, which represents the mean of the outcome variable when the predictor equals zero, β is the slope of the regression line (the regression coefficient for X) and Ԑ is the total error, i.e. the difference between the observed and predicted values. Researchers also learn that in order to use the linear regression model appropriately, assumptions regarding the data must be satisfied. First is the ‘assumption of linearity’, or that the relationship between the predictor and outcome follows a linear or straight pattern. In practice, the assumption of linearity is checked by plotting the residuals against the predictor and confirming that they scatter randomly around zero without curvature or clustering patterns. Second is the ‘assumption of independence’, that the residuals (the differences between the actual points measured and the outcomes estimated from the model or best fit line) are independent or random, and that there is no correlation between residuals for consecutive predictor values. To further this point are the third and fourth assumptions, the ‘assumption of homoscedasticity’ and the ‘assumption of error normality’, which state that the residuals or error values must have an equal variance across values of the predictor and must be normally distributed, respectively. Homoscedasticity ensures that the model does not have a smaller error, or create a better prediction, for some predictor values than for others. These assumptions must hold for linear regression models to support correct inference about regression coefficients, e.g. computing P-values or confidence intervals (CIs).

In longitudinal studies, the goal is to ‘characterize trends in patients’ repeated measurements over time’. In this case, the predictor becomes time and the outcome is the variable of interest. However, by the nature of longitudinal data, the ‘assumption of independence’ is violated. Since repeated measures come from the same patient, repeated values would naturally not be independent or random. There would be ‘within-patient’ factors that impact the measurement values repeated from the same patient. We would expect that two measurements taken from the same person would be more similar than two measurements from different people. For example, if a patient’s first measurement is higher than average, we anticipate that the same patient will have a higher than average second measurement. The distance between the actual value and predicted value for each measurement from that patient would be correlated. That is to say, in longitudinal data the error would be correlated across repeated values. This information about how the repeated measures or errors are correlated must be modeled in the analysis to properly analyze the data. Not accounting for a correlated error or treating the error as independent can lead to large under- or overestimates of the predictor–outcome relationship. Therefore, in longitudinal models, the overall model error from the regression model is partitioned into two parts: the ‘subject-specific random error’ and the remaining model ‘random error’. The ‘subject-specific random error’ accounts for the fact that repeated measures are derived from the same subject and these repeated measures are most likely correlated, while the remaining model ‘random error’ accounts for how much the regression model differs from patients’ actual measured data points overall after accounting for the ‘subject-specific error’. The modeled ‘subject-specific error’ is also known as the ‘random effects’ of the longitudinal model.

FIXED AND RANDOM EFFECTS ON LONGITUDINAL MODELS

Longitudinal models are often referred to as mixed models, since they account for both ‘fixed’ and ‘random effects’. The ‘fixed-effects’ components are the relationships between exposure and confounder variables with outcomes at the population level, assuming fixed and common relationships across individuals. Similar to the simple linear regression model, ‘fixed effects’ are expressed as betas (β). If the ‘fixed-effects’ relationships are not perfectly linear, we can include transformations of the exposure (time), such as its logarithm, or interaction terms of time with itself to represent quadratic, cubic or higher order polynomial relationships, or generate spline functions of time in the model.

The ‘random-effects’ component is the subject-specific effect, which recognizes that multiple observations should be clustered together in the analysis, since they are derived from the same person. When modeling the ‘random effects’ or ‘subject-specific’ contribution to the model, we can account for the correlation of values in repeated measurements taken from the same person. In addition, we can account for whether effects at the subject level differ from the overall population by their baseline levels, the rate of their trajectories or a combination of both. In longitudinal regression models, the error Ԑ in Y = α + βX + Ԑ is split into two parts, giving Y = α + βX + D + ε, where D represents the subject-specific contribution that arises because multiple measures are taken from the same person, and ε is the remaining random error.

MODELING THE PATTERN OF REPEATED MEASUREMENTS: COVARIANCE STRUCTURE

An appropriate longitudinal model accounts for how repeated measurements taken from the same individual are correlated over time. In longitudinal models, the pattern of correlation between measurements being modeled is called the ‘covariance structure’. Covariance, similar to correlation, is a measure of the relationship or linear dependence between two measurements. Unlike correlations, which are unitless, covariances carry the units of the measured variables. The covariance of a variable with itself is called the variance, or the spread of the distribution of values about the variable’s mean. When the covariance is zero, we say that there is no linear relationship between the two repeated measurements. Otherwise, the value of the covariance needs to be interpreted relative to the variance of the evaluated variable.

In general, there are three basic covariance model structures: (i) unstructured, (ii) compound symmetry and (iii) autoregressive. Evaluating which covariance structure best fits the data is a separate step when running a longitudinal model. Inspecting the covariance between observations helps determine the covariance model structure that should be used. For each pair of time points, we would measure the covariance of values from the first and second time points (or each respective pair) for all subjects that had data for those two time points. Using most statistical software programs, researchers can request a table summary of the covariance matrix representing the covariances of all paired measured observations. For ease of interpretation, the statistical program can also output the normalized correlations or correlation structure between measurements instead.
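The inspection step described above can be sketched in a few lines of Python. For each pair of time points, the snippet below computes the sample covariance using only the patients who have a value at both time points. The function name `pairwise_covariance` and the weights are hypothetical, and a real analysis would normally request this matrix from the statistical software rather than compute it by hand.

```python
from typing import Dict, List, Optional


def pairwise_covariance(
    data: Dict[str, List[Optional[float]]], j: int, k: int
) -> Optional[float]:
    """Sample covariance between time points j and k over patients with both."""
    pairs = [
        (values[j], values[k])
        for values in data.values()
        if values[j] is not None and values[k] is not None
    ]
    n = len(pairs)
    if n < 2:
        return None  # not enough complete pairs to estimate a covariance
    mean_j = sum(a for a, _ in pairs) / n
    mean_k = sum(b for _, b in pairs) / n
    return sum((a - mean_j) * (b - mean_k) for a, b in pairs) / (n - 1)


# Hypothetical weights (kg) at three time points; None marks a missing value.
weights = {
    "P1": [70.0, 71.0, 72.0],
    "P2": [80.0, 79.0, None],
    "P3": [60.0, 62.0, 63.0],
    "P4": [90.0, None, 88.0],
}
n_times = 3
cov_matrix = [
    [pairwise_covariance(weights, j, k) for k in range(n_times)]
    for j in range(n_times)
]
for row in cov_matrix:
    print(row)
```

The diagonal entries are the variances at each time point, and the off-diagonal entries are the covariances whose pattern suggests which structure to specify.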

Unstructured

In an ‘unstructured covariance’, there is no discernible pattern of covariances. The pattern appears ‘arbitrary’ and no explicit structure is assumed, thus making this the most flexible covariance structure. In unstructured covariance models, a parameter is estimated for each unique pair of observations (first-second, first-third, etc.), which increases the number of model parameters (degrees of freedom). This can be problematic, especially when dealing with small sample sizes, or if the number of parameters (or degrees of freedom) is large relative to the number of observations and patients. When there are too many parameters in a model for a given sample size, or the model is ‘overfit’ for that particular dataset, the model estimates can become unstable (leading to imprecise standard errors) and are less likely to be replicated in another random sample of the data. In order to simplify the model and have fewer parameters, it is preferable to identify a more structured covariance pattern where one exists.

Compound symmetry

A ‘compound symmetry’ covariance pattern is one in which the covariance is constant across all pairs of measurements, i.e. there is no pattern in covariance due to the proximity of measurements in time. The covariance between measures arises solely because the measurements are derived from the same individual.

Autoregressive

‘Autoregressive covariance’ is when the covariance is a function of the absolute difference in time between repeated measures, such that the correlation between measures decreases with increasing separation. This pattern stipulates that measurements taken closer together in time tend to be more highly correlated than measures far apart in time, and that the correlation/covariance continues to weaken as the time between the measurements of the pair increases.
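To make the contrast between the three structures concrete, the sketch below builds small covariance matrices in Python. It assumes equally spaced measurements, a common variance at every time point and, for the autoregressive case, the first-order form in which the correlation is rho raised to the power of the separation. The function names and the values of the variance and rho are hypothetical.

```python
from typing import List


def compound_symmetry(n: int, variance: float, rho: float) -> List[List[float]]:
    """Same covariance for every pair of distinct time points."""
    return [
        [variance if i == j else variance * rho for j in range(n)]
        for i in range(n)
    ]


def autoregressive(n: int, variance: float, rho: float) -> List[List[float]]:
    """First-order autoregressive: covariance decays with separation |i - j|."""
    return [
        [variance * rho ** abs(i - j) for j in range(n)]
        for i in range(n)
    ]


def unstructured_parameter_count(n: int) -> int:
    """One variance per time point plus one covariance per unique pair."""
    return n * (n + 1) // 2


cs = compound_symmetry(4, variance=2.0, rho=0.5)
ar = autoregressive(4, variance=2.0, rho=0.5)
print("compound symmetry, first row:", cs[0])
print("autoregressive, first row:  ", ar[0])
print("unstructured parameters for 13 time points:", unstructured_parameter_count(13))
```

Under these assumptions, compound symmetry and the first-order autoregressive structure each need only two covariance parameters, whereas the unstructured matrix needs a separate parameter for every variance and every pair.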

DESCRIPTION OF RANDOM INTERCEPT AND SLOPE

In a longitudinal model, the ‘fixed-effect’ component describes the mean population response for that variable over time, and the ‘random-effect’ component accounts for the deviation from that trajectory explained by differences between patients. Each patient has a subject-specific trajectory over time. There are many characteristics that vary randomly between patients and can account for the heterogeneity in observed trends or the model error. Patients may start at a different baseline point (intercept) or have a different trajectory (slope) from the overall population.

Random intercept model

In building a longitudinal model, it is possible that individual patients may start at a different baseline point from the rest of the population. Although their trajectory of change may be similar, the model would need to account for the fact that the patient’s starting point may be different. This can be modeled in a ‘random intercept model’. In a random intercept model, a patient’s trend over time may deviate from the population average because the patient’s baseline (intercept) value differed from the population average. ‘Random intercept models’ correspond to ‘compound symmetry’ covariance models. In both models, the only component accounting for a ‘random effect’ is the patient-specific baseline level, which shifts the patient’s starting point and makes that patient’s repeated measures equally correlated regardless of their spacing in time. However, random intercept models assume that the patient measurement trajectories (slopes) do not vary between patients and that the covariance between measurements does not vary over time.
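The link between a random intercept and compound symmetry can be seen in a short derivation. This is a sketch in assumed notation that does not appear in the text above: i indexes patients, j and k index measurement occasions, t is time, b_i is the random intercept, and b_i and the residuals are taken to be mutually independent.

```latex
\begin{aligned}
Y_{ij} &= \alpha + \beta t_{ij} + b_{i} + \varepsilon_{ij},
  \qquad b_{i} \sim N(0, \sigma_{b}^{2}), \quad \varepsilon_{ij} \sim N(0, \sigma^{2}), \\
\operatorname{Var}(Y_{ij}) &= \sigma_{b}^{2} + \sigma^{2}, \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \sigma_{b}^{2} \quad (j \neq k), \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_{b}^{2}}{\sigma_{b}^{2} + \sigma^{2}} \quad (j \neq k).
\end{aligned}
```

The covariance between any two measurements from the same patient is the same value regardless of how far apart they are in time, which is the compound symmetry pattern.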

Random slope model

In a ‘random slope model’, the subject may have a starting point similar to the population mean, but the patient’s trajectory is different from that of the population average (e.g. increasing while the population mean decreases, or changing faster or slower). In this model, the patient’s trend over time deviates from the population average because of the patient’s own trajectory, not because the patient started at a different level than the population average.

Random intercept and random slope

In some models, both the intercept and slope are different for individuals compared with the population average. Thereby, both a random intercept and random slope need to be accounted for in the subject-specific error component of the model. In this model, each subject varies not only in their baseline level of response, but also in terms of their responses over time.
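The three specifications above can be summarized in a single equation. The following is a sketch in assumed notation (the patient index i, the occasion index j, the time variable t, the random effects b_0i and b_1i, and the matrix G are introduced here for illustration), with the subject-specific term D written out explicitly and the conventional normality assumptions for the random effects and residuals:

```latex
\begin{aligned}
Y_{ij} &= \alpha + \beta t_{ij} + D_{ij} + \varepsilon_{ij}, \\
D_{ij} &= b_{0i} + b_{1i} t_{ij}, \\
(b_{0i}, b_{1i})^{\top} &\sim N(\mathbf{0}, \mathbf{G}), \qquad \varepsilon_{ij} \sim N(0, \sigma^{2}).
\end{aligned}
```

Setting b_1i to zero gives the random intercept model, setting b_0i to zero gives the random slope model, and retaining both gives the random intercept and random slope model. The matrix G holds the variances of the two random effects and their covariance, and the patient-specific slope is β + b_1i.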

Although there are a number of random effects and covariance structures that can be specified in a longitudinal model, the above-listed are the most common. Statistical programs can assist in determining which effect structure best fits the data using ‘goodness of fit criteria’, such as the Akaike information criterion (AIC). The AIC is calculated from the model’s log likelihood (a measure of how well the specified model fits the data) and adds a penalty for the number of parameters in the model. When selecting the random effect and/or covariance structure, one may consider using the restricted maximum likelihood option, which will allow the program to focus on examining the random or repeated effect. However, for comparing nested regression models that differ in their fixed effects, the maximum likelihood option should be used. The analyst should run a number of different structures and compare the AIC, with priority given to the model with the smallest AIC value. Researchers can use a combination of the AIC values and assumptions about the known pathophysiology of the data in order to select an appropriate model.
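The comparison described above can be sketched as follows, using the standard definition AIC = 2k − 2 × (log likelihood), where k is the number of estimated parameters. The log likelihoods and parameter counts below are hypothetical placeholders for values that statistical software would report for models fitted to the same data with the same fixed effects.

```python
def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike information criterion: smaller values indicate a preferred model."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical (log likelihood, number of parameters) for candidate structures.
candidates = {
    "random intercept": (-1520.4, 4),
    "random intercept and slope": (-1498.7, 6),
    "unstructured covariance": (-1495.9, 12),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("smallest AIC:", best)
```

In this made-up example the unstructured model has the highest log likelihood, but the penalty for its additional parameters leaves the random intercept and slope model with the smallest AIC.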

In our recent article [8], we examined trends in the change in ferritin over time in incident hemodialysis patients. We adjusted for demographics and markers of malnutrition-inflammatory-cachexia syndrome to account for their effects on ferritin trends, and also examined trends in strata according to baseline ferritin, and intravenous iron and erythropoietin use. Ferritin trajectories were best described using a random intercept only mixed-effects model. The results showed that ferritin increased over time on dialysis in most models independent of fixed-effects covariates. In another article, we examined patients’ hemoglobin trajectories and their responsiveness to erythropoietin [9]. In this analysis, the best specified model was a combined random slope and random intercept model. Furthermore, we were able to additionally estimate per patient slopes from the mixed-effect model and use them as outcomes in further analyses examining odds of the erythropoietin responsiveness slope. Patient slopes estimated from a mixed-effect model take into account subject level effects and are estimated as a function of trends occurring over the entire population. These slopes are distinct from simplified methods using linear regression of at least three measurements for each patient, which does not account for population effects and requires a certain number of measurements. We have additionally used post-estimation patient slopes from mixed-effects models when examining eGFR trajectories [10, 11]. Estimating patient slopes as a predictor using post-estimation from a mixed-effects model, rather than using regression slopes individually for each patient, may allow for more inclusion of patients and account for the patient’s parameters as a function of what is occurring at the population level. 
As compared with patient slopes estimated by OLS regression, patient slopes from post-estimations in mixed-effects models describe how the patient slope compares to the mean slope over the entire population of patients, and can then be used to examine the association of this slope with clinical outcomes.

MIXED-EFFECT MODELS FOR DISCRETE OUTCOMES

We have discussed obtaining slopes by modeling continuous outcomes in longitudinal models; however, outcomes may be discrete: binary, ordinal or count data. The term ‘generalized linear mixed model’ is used to represent regression for both continuous and discrete outcomes. In fixed-effects-only models, generalized linear models such as logistic or Poisson regression are used to model associations for binary and count data outcomes, respectively. However, when examining discrete outcomes over time, one must similarly model the repeated or subject-specific contribution. For continuous outcomes that are not normally distributed, a generalized linear mixed model may be preferred, since it does not require the outcome variable to be normally distributed. Similar to linear mixed-effects models, patient slopes from generalized linear mixed models can be obtained through post-estimation.
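As a sketch of how the subject-specific contribution enters a model for a discrete outcome, a logistic mixed model for a binary outcome and a Poisson mixed model for a count outcome can be written as follows. The notation (patient i, occasion j, time t, random effects b_0i and b_1i) is assumed for illustration and does not come from the text above.

```latex
\begin{aligned}
\operatorname{logit} \Pr(Y_{ij} = 1 \mid b_{0i}, b_{1i}) &= \alpha + \beta t_{ij} + b_{0i} + b_{1i} t_{ij}, \\
\log \operatorname{E}(Y_{ij} \mid b_{0i}, b_{1i}) &= \alpha + \beta t_{ij} + b_{0i} + b_{1i} t_{ij}.
\end{aligned}
```

In both cases β + b_1i plays the role of the patient slope, expressed on the log-odds scale for the binary outcome and on the log scale of the expected count for the count outcome.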

DIFFERENCES IN PREDICTOR ESTIMATES OF CHANGE IN A MARKER OVER TIME ACROSS THREE MODELS

For this illustration, we identified four patients at random from an incident hemodialysis cohort who survived the first year of hemodialysis [12]. We estimated per-patient slopes of monthly averaged post-dialysis weight (kg) over 1 year (13 months = baseline month + 12 months) using OLS regression per patient, and post-estimation from a longitudinal mixed-effects model with unstructured covariance and random slope. Output from the mixed-effects model using time as a predictor for the outcome of weight (kg) showed that for each unit increase in time (1 year), a patient’s weight changed on average by −0.87 kg/year (95% CI −0.92, −0.81). Averages (± standard deviation) of the patient slopes were −1.11 (±10.72) kg/year for the OLS regression-estimated slopes and −0.87 (±7.38) kg/year for slopes generated from the longitudinal mixed model. Patients had an average of 11 monthly averaged weight measurements from the start of dialysis to the end of the 1-year period. As observed in Table 1, the differences between patient slope estimates from mixed models and OLS regression may become more extreme when fewer data points are available to estimate the slope predictor. In Table 1, we provide examples of patient measurements over time, showing that Patient A had only two data points available, at Month 2 and Month 3. Over this period, Patient A experienced a 2 kg weight loss from 75 kg to 73 kg. In the OLS regression, this 1-month change was extrapolated as if it persisted throughout the entire observation period, implying that the patient would lose 24 kg over a year. However, the mixed model takes into consideration the patient’s measurements as a function of the population trajectory of weight change over a year and provides a more conservative estimate of a 1.5 kg/year decrease in weight, which may seem more reasonable given the population average estimate of −0.87 kg/year.
For patients who had only two data points, located at the beginning and end of the exposure time period, the OLS regression slope would equal the delta estimate of difference. Again, the mixed-model slope would provide a more conservative slope estimate, reflecting the slope as a function of the population trajectory. Theoretically, as patient slopes generated from longitudinal mixed models are a function of the population trajectory of the marker, patients with as few as one measurement can contribute to the estimated population slope and thereby have an estimated per-patient slope to be used in subsequent analyses. However, because the slopes of such patients are drawn toward the population average, survival estimates for those patients may be biased towards the null.
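The pull of a sparse patient's slope toward the population trajectory can be illustrated with a simplified precision-weighted calculation. This is not the mixed model used in the analysis above: it is a two-stage normal approximation in which the patient's OLS slope and the population slope are averaged according to their variances. The function name `shrink_slope`, the between-patient variance and the sampling variances are hypothetical values chosen for illustration; only the population slope of −0.87 kg/year is taken from the text.

```python
def shrink_slope(
    ols_slope: float,
    sampling_var: float,
    population_slope: float,
    between_patient_var: float,
) -> float:
    """Precision-weighted compromise between a patient's OLS slope and the population slope."""
    # Weight on the patient's own data: close to 1 when the OLS slope is precise,
    # close to 0 when it rests on few measurements (large sampling variance).
    weight = between_patient_var / (between_patient_var + sampling_var)
    return population_slope + weight * (ols_slope - population_slope)


POPULATION_SLOPE = -0.87    # kg/year, population average reported in the text
BETWEEN_PATIENT_VAR = 50.0  # hypothetical variance of true slopes, (kg/year)^2

# Same OLS slope, with hypothetical sampling variances for sparse and dense data.
sparse = shrink_slope(-24.0, 450.0, POPULATION_SLOPE, BETWEEN_PATIENT_VAR)
dense = shrink_slope(-24.0, 5.0, POPULATION_SLOPE, BETWEEN_PATIENT_VAR)
print(f"sparse data: {sparse:.2f} kg/year, dense data: {dense:.2f} kg/year")
```

With few measurements the estimate moves most of the way to the population slope, whereas with many measurements it stays close to the patient's own OLS slope, which mirrors the behavior described for Patient A.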

Table 1.

Random example of four patients comparing slopes estimated from mixed models, OLS regression and the delta method

Patient ID	Count of measures	Month	Weight (kg)	Mixed-model slope (kg/year)	OLS slope (kg/year)	Delta, Month 13 – Month 1 (kg)
A	1	2	75	−1.49	−24.00	n/a
A	2	3	73
B	1	1	107	0.21	6.00	n/a
B	2	2	107
B	3	3	109
B	4	4	108
C	1	1	74	0.68	0.79	−2.00
C	2	2	69
C	3	12	73
C	4	13	72
D	1	1	65	4.23	4.48	4.00
D	2	2	65
D	3	3	65
D	4	4	66
D	5	5	66
D	6	6	66
D	7	7	66
D	8	8	68
D	9	9	68
D	10	10	68
D	11	11	68
D	12	12	69
D	13	13	69

n/a, not applicable.
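The OLS slope and delta columns of Table 1 can be checked with a short calculation. The sketch below fits a per-patient least squares line of weight on month, multiplies the monthly slope by 12 to express it in kg/year, and computes the delta only when both Month 1 and Month 13 are present. The function names are assumptions for illustration; the mixed-model slopes cannot be reproduced this way because they depend on the full cohort.

```python
from typing import Dict, List, Optional, Tuple

Points = List[Tuple[int, float]]


def ols_slope(points: Points) -> float:
    """Ordinary least squares slope of weight on month for one patient."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def delta_month13_minus_month1(points: Points) -> Optional[float]:
    """Month 13 weight minus Month 1 weight, or None if either is missing."""
    by_month = dict(points)
    if 1 in by_month and 13 in by_month:
        return by_month[13] - by_month[1]
    return None


# (month, weight in kg) pairs copied from Table 1.
patient_d = [65, 65, 65, 66, 66, 66, 66, 68, 68, 68, 68, 69, 69]
table1: Dict[str, Points] = {
    "A": [(2, 75), (3, 73)],
    "B": [(1, 107), (2, 107), (3, 109), (4, 108)],
    "C": [(1, 74), (2, 69), (12, 73), (13, 72)],
    "D": list(enumerate(patient_d, start=1)),
}

results = {}
for patient_id, points in table1.items():
    annual_slope = round(12 * ols_slope(points), 2)  # kg/month to kg/year
    results[patient_id] = (annual_slope, delta_month13_minus_month1(points))
    print(patient_id, results[patient_id])
```

Patient A shows how two adjacent monthly values extrapolate to a large annual slope, while Patients C and D show that the delta ignores everything between the first and last months.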

In a subset of 67 007 patients who had measurements at both baseline (Month 1, time 0) and 1 year (Month 13, time 1), the average patient slope estimates by OLS regression, longitudinal mixed model and delta method were −0.87 ± 7.76, −0.87 ± 7.36 and −1.83 ± 7.80 kg/year, respectively. Since the delta could not take into account fluctuations in weight that occur over a year, the estimated average change in the predictor was larger than those estimated by the mixed models and OLS regression, which better accounted for weight fluctuations. Nonetheless, mixed models may provide superior estimates of exposure trajectories when time point measurements are missing because they can estimate a patient’s exposure trajectory as a function of the population trajectory. Yet, the delta method may better reflect clinical practice, where a clinician can often observe only two exposure time points for a patient and does not have the opportunity to estimate a marker’s fluctuation or change over time. The delta method may also be preferred when estimating change in an exposure up to a landmark or pivotal time point. In our recent manuscript, Chang et al. [12] showed that the average hemodialysis patient experiences a J-shaped trajectory of weight change, where patients lose weight in the first 5 months of dialysis to a nadir and regain weight in subsequent months up to Month 12 before reaching a plateau. Thereby, in our above example, delta methods could not accurately reflect this weight change trajectory. In Chang’s analysis, the delta method was not used to examine weight trajectories and fluctuation, but only to estimate subsequent survival as a function of the delta change of weight to the nadir at the 5-month mark of hemodialysis.
Other models such as latent class models have also been developed to identify distinct patterns of change in an exposure trajectory over time and associations of subsequent outcomes depending on trajectory pattern exposure category [13].

JOINT MODELS

In recent years, statistical methods have been developed to further characterize changes in patient markers and their impact on survival or clinical outcomes by modeling both simultaneously using a joint model. In contrast to the above-mentioned models, where a time period of exposure must be specified to estimate marker changes prior to the start of follow-up for the survival model, in a joint model the patient trajectories and the survival estimates are modeled simultaneously from the same start of follow-up. The shared parameter in the joint model measures the strength of the association between the longitudinal response at any particular time point and the hazard of an event at the same time point. Asar et al. [14] provide a stellar point-by-point explanation and example of joint modeling using the change of kidney function as a predictor of transition to renal replacement therapy. In their manuscript, Asar et al. explained how joint models can overcome bias in trajectory estimates from longitudinal mixed models due to informative right censoring from survival differences. Mixed models provide superior estimates of population slopes compared with linear regression models because they remain valid when data are missing at random due to dropout, as in their example where dropout depends on previously observed data. However, linear mixed-effects models may still be subject to bias from informative right censoring if patients dropped out (or were censored) over time due to data missing not at random, i.e. because of ‘unobserved’ correlations between censoring and the trajectory of the exposure. Conversely, the joint model can utilize information about dropout time or right censoring to inform on the missing data and the longitudinal process of an exposure prior to dropout. Consequently, joint models would give better estimates of survival from a time-varying exposure that may be impacted by measurement errors, data missing not at random or censored data.
Joint models achieve this superiority by fully exploiting the totality of the data and the dependence between repeated measurements, measurement error and the hazard for survival by accounting for random-effects estimates in models. Asar’s explanation of joint models builds upon the seminal work of Wulfsohn and Tsiatis [15], and that of Crowther et al., who further extend joint models to include more flexible modeling estimators [16]. Wulfsohn and Tsiatis additionally provide an in-depth explanation of how the joint model accounts for measurement error and may provide a more precise estimate of a marker’s value at any given time point; this more precise estimate can in turn better describe longitudinal trajectories and the impact of the marker on survival outcomes.
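One common way to write a shared-parameter joint model, given here as a sketch in assumed notation rather than as the exact specification used in the cited articles, links a linear mixed model for the marker to a proportional hazards model for the event:

```latex
\begin{aligned}
Y_{i}(t) &= m_{i}(t) + \varepsilon_{i}(t), \qquad m_{i}(t) = \alpha + \beta t + b_{0i} + b_{1i} t, \\
h_{i}(t) &= h_{0}(t) \exp\{\boldsymbol{\gamma}^{\top} \mathbf{w}_{i} + \theta \, m_{i}(t)\}.
\end{aligned}
```

Here m_i(t) is the underlying marker trajectory of patient i free of measurement error, b_0i and b_1i are the patient's random intercept and slope, h_0(t) is the baseline hazard, w_i is a vector of baseline covariates with coefficients γ, and θ is the association parameter that measures how strongly the current value of the marker relates to the hazard of the event at the same time point. Because the random effects appear in both submodels, the event or dropout times carry information about the longitudinal process.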

The major drawback of joint models is that they are computationally expensive and time-consuming, particularly in datasets with larger numbers of patients and repeated measures. These factors may need to be taken into account in consideration of model selection for analyses. In our manuscript examining the impact of age on the obesity paradox in dialysis patients, we provided survival estimates by body mass index from computation of joint models in the supplement, which provided mostly similar results to that of the main analyses [17]. Joint models may be superior when examining the longitudinal outcome of a variable and time-to-event data, where considerations for the interdependency between these two are needed. However, if the research question involves investigating the association of the linear trajectory of a variable during a set exposure period with subsequent mortality, where considerations for measurement errors or data missingness are less warranted, then models using slope estimates from longitudinal models with subsequent survival modeling may be preferred.

CONCLUSION

In summary, numerous methods can be used to analyze the association between change in a marker over time and the risk of an outcome such as survival. These methods include estimating marker change as a predictor using deltas, or using patient slopes from OLS regression or longitudinal mixed models. However, to appropriately evaluate how clinical measurements in a population change over time, and in turn how those changes impact survival outcomes, models are needed that partition the model error as fully as possible by accounting for the repeated-measurement structure and the contribution of subject-level effects. Appropriate longitudinal modeling (including joint models) uses all available data efficiently, leading to valid results and conclusions. However, computational efficiency and data availability must also be considered.

FUNDING

E.S. is supported by a career development award from the Office of Research and Development of the Department of Veterans Affairs (IK2-CX001266-01); C.M.R. is supported by the NIH/NIDDK grant K23-DK102903; and K.K.-Z. is supported by the NIH/NIDDK grants K24-DK091419, R01-DK09568, R01-DK078106, R01-DK096920 and U01-DK102163 and philanthropic grants from Mr Harold Simmons, Mr Louis Chang and Dr Joseph Lee. Y.O. is supported by the Uehara Memorial Foundation Research Fellowship.

CONFLICT OF INTEREST STATEMENT

None declared.

REFERENCES

1. Leffondre K, Boucquemont J, Tripepi G et al. Analysis of risk factors associated with renal function trajectory over time: a comparison of different statistical approaches. Nephrol Dial Transplant 2015; 30: 1237–1243
2. Weiss RE. Modeling Longitudinal Data. New York, NY: Springer, 2005
3. Fitzmaurice GM, Laird NM, Ware JH. Applied Longitudinal Analysis. Hoboken, NJ: John Wiley & Sons, Inc., 2011
4. Kalantar-Zadeh K, Streja E, Molnar MZ et al. Mortality prediction by surrogates of body composition: an examination of the obesity paradox in hemodialysis patients using composite ranking score analysis. Am J Epidemiol 2012; 175: 793–803
5. Soohoo M, Feng M, Obi Y et al. Changes in markers of mineral and bone disorders in the first six months post-dialysis and subsequent mortality in incident hemodialysis patients. Am J Nephrol 2016; 43: 85–96
6. Obi Y, Rhee CM, Mathew AT et al. Residual kidney function decline and mortality in incident hemodialysis patients. J Am Soc Nephrol 2016; 27: 3758–3768
7. Derose SF, Rutkowski MP, Crooks PW et al. Racial differences in estimated GFR decline, ESRD, and mortality in an integrated health system. Am J Kidney Dis 2013; 62: 236–244
8. Kim T, Rhee CM, Streja E et al. Longitudinal trends in serum ferritin levels and associated factors in a national incident hemodialysis cohort. Nephrol Dial Transplant 2016
9. Kalantar-Zadeh K, Lee GH, Miller JE et al. Predictors of hyporesponsiveness to erythropoiesis-stimulating agents in hemodialysis patients. Am J Kidney Dis 2009; 53: 823–834
10. Sumida K, Molnar MZ, Potukuchi PK et al. Association of slopes of estimated glomerular filtration rate with post-end-stage renal disease mortality in patients with advanced chronic kidney disease transitioning to dialysis. Mayo Clin Proc 2016; 91: 196–207
11. Kovesdy CP, Anderson JE, Derose SF et al. Outcomes associated with race in males with nondialysis-dependent chronic kidney disease. Clin J Am Soc Nephrol 2009; 4: 973–978
12. Chang TI, Ngo V, Streja E et al. Association of body weight changes with mortality in incident hemodialysis patients. Nephrol Dial Transplant 2016
13. O'Hare AM, Batten A, Burrows NR et al. Trajectories of kidney function decline in the 2 years before initiation of long-term dialysis. Am J Kidney Dis 2012; 59: 513–522
14. Asar Ö, Ritchie J, Kalra PA et al. Joint modelling of repeated measurement and time-to-event data: an introductory tutorial. Int J Epidemiol 2015; 44: 334–344
15. Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics 1997; 53: 330–339
16. Crowther MJ, Abrams KR, Lambert PC. Flexible parametric joint modelling of longitudinal and survival data. Stat Med 2012; 31: 4456–4471
17. Vashistha T, Mehrotra R, Park J et al. Effect of age and dialysis vintage on obesity paradox in long-term hemodialysis patients. Am J Kidney Dis 2014; 63: 612–622

