Abstract
Survival analysis is a class of models that are ideal for evaluating questions of timing of events, which makes them well-suited for modeling the development of a process such as initiation of substance use, development of addiction, or post-treatment recovery. The focus of this review paper is to demonstrate how survival models operate in a broader developmental framework and to offer guidance on selecting the appropriate model on the basis of the research question at hand. We provide a basic overview of survival models and then identify several key issues, explain how they pertain to research in the addiction field, and describe studies that utilize survival models to address questions about timing. We discuss the importance of carefully selecting the metric and origin of the time scale that corresponds to the developmental process under investigation, and we describe types of censoring/truncation. We describe the value of modeling covariates as time-invariant versus time-varying, and make the distinction between time-varying covariates and time-varying effects of covariates. We also explain how to test for substantive differences due to the timing of the assessment of the predictor. We finish the paper with a presentation of relatively novel extensions of survival models, including models that integrate standard statistical mediational analysis with discrete-time survival analysis, models that simultaneously consider order and timing of multiple events, and models that involve joint modeling of longitudinal and survival data. We also present our own substantive examples of various models in an Appendix containing annotated syntax and output.
Keywords: survival, hazard, development, initiation, timing, event
A major objective of developmental science is to describe how, when, and why individuals’ behaviors change over time (Nesselroade & Baltes, 1979). Development is most frequently marked by chronological age but it can also refer to a process or progression through stages or events. In the context of substance use behaviors, this might include time of first use of a substance (e.g., a drinking “milestone” such as first sip or full drink) or age at substance use treatment entry. It could also correspond to survival time (time to death) from a substance use-related disease or time to relapse following treatment. Thus, time-to-event can be conceptualized in different ways including both age-based development and progression through events. Throughout this paper, we provide examples of different types of development, and describe survival models that are based on age, time between event dates, and sometimes a combination of the two.
Survival analysis is a class of models that are ideal for evaluating questions of timing of events (Singer & Willett, 1991; 1994; 2003): if and when an event occurs. The survivor function is the probability that an individual will “survive” (not experience the event); this function begins at 1 and declines over time. The estimated median lifetime indicates how much time passes before half the sample experiences the event. A related mathematical function is the hazard function, which is the negative slope of the (log) survivor function. This function corresponds to the risk of experiencing the event in a given time period, among those who have not already done so. An important consideration with survival models is how time is measured. Time can be measured continuously, in the case of events measured in fine-grained detail (e.g., days), or in a discrete manner, such as a calendar year; this determines the class of survival model: discrete-time or continuous-time.
In both classes of models, the key estimated parameters pertain to hazard. The discrete-time hazard, $h_j$, is defined as the conditional probability that the target event will occur in time period $j$, given that it has not already occurred prior to $j$:
$$h_j = \Pr(T = j \mid T \ge j),$$
where $T$ denotes the time period in which the event occurs.
The value of $h_j$ always lies between 0 and 1, as it is a probability.
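The definitions of the discrete-time hazard, the survivor function, and the median lifetime can be illustrated with a small life table. The following is a minimal sketch in Python, not the syntax used in the Appendix; the counts are hypothetical, censored cases are assumed to leave the risk set at the end of the period in which they were last observed, and the median lifetime is taken crudely as the first period in which estimated survival reaches 0.5 (no interpolation).

```python
from typing import List, Optional, Tuple

Row = Tuple[int, int, float, float]  # (period, at_risk, hazard, survival)


def life_table(n_at_start: int, events: List[int], censored: List[int]) -> List[Row]:
    """Build a discrete-time life table.

    events[j] and censored[j] are the numbers of people who experienced the
    event or were right-censored in period j + 1 (hypothetical input format).
    """
    rows: List[Row] = []
    at_risk = n_at_start
    survival = 1.0  # the survivor function begins at 1
    for period, (d, c) in enumerate(zip(events, censored), start=1):
        # Hazard: conditional probability of the event among those still at risk.
        hazard = d / at_risk if at_risk > 0 else 0.0
        # Survival declines by the proportion experiencing the event.
        survival *= 1.0 - hazard
        rows.append((period, at_risk, hazard, survival))
        # People with the event, and censored people, leave the risk set.
        at_risk -= d + c
    return rows


def median_lifetime(rows: List[Row]) -> Optional[int]:
    """First period in which estimated survival is at or below 0.5, else None."""
    for period, _, _, survival in rows:
        if survival <= 0.5:
            return period
    return None


# Hypothetical counts: 100 adolescents followed over four periods.
table = life_table(100, events=[10, 18, 24, 12], censored=[0, 2, 6, 28])
for period, at_risk, hazard, survival in table:
    print(f"period {period}: at risk={at_risk}, hazard={hazard:.3f}, survival={survival:.3f}")
print("median lifetime (period):", median_lifetime(table))
```

Because each hazard is computed only among those still at risk, right-censored cases contribute information for every period in which they were observed.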
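When time is measured continuously, hazard is a rate rather than a probability, specifically, the instantaneous rate of event occurrence:
$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t},$$
where $T$ is the (continuous) event time.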
This is because with continuous time, the probability that an event occurs at any instant of time will approach 0. Thus, when time is measured continuously, hazard can assume any value greater than or equal to 0. The most common type of continuous model is the Cox proportional hazards model (Cox, 1972).
In both discrete and continuous models, we conditionally estimate the hazard based on the values of certain covariates or predictors, as we might in a standard regression model. The Cox model requires that the assumption of proportional hazards (PH) be met, that is, the effects of covariates are constant over time. With discrete models, we assume the covariate effects are the same for all values of t but it is possible to relax this assumption (see discussion of PH assumption below).
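Discrete-time hazard models are commonly estimated by restructuring the data so that each person contributes one record per time period at risk, with a binary event indicator as the outcome (Singer & Willett, 2003). The sketch below illustrates only this restructuring step and the resulting observed hazards; the records, variable names, and helper functions are hypothetical and are not taken from the Appendix.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical person-level records: (id, last period observed, event, female).
# event = 1: the event occurred in the last period; event = 0: right-censored there.
people: List[Tuple[int, int, int, int]] = [
    (1, 2, 1, 0),
    (2, 3, 0, 0),
    (3, 1, 1, 1),
    (4, 3, 1, 1),
    (5, 2, 0, 1),
]


def to_person_period(records: List[Tuple[int, int, int, int]]) -> List[Dict[str, int]]:
    """Expand one row per person into one row per person per period at risk."""
    rows = []
    for pid, last_period, event, female in records:
        for period in range(1, last_period + 1):
            rows.append({
                "id": pid,
                "period": period,
                "female": female,
                # The event indicator is 1 only in the period the event occurs.
                "y": int(event == 1 and period == last_period),
            })
    return rows


def observed_hazard(rows: List[Dict[str, int]]) -> Dict[int, float]:
    """Proportion of the risk set experiencing the event in each period."""
    at_risk: Dict[int, int] = defaultdict(int)
    events: Dict[int, int] = defaultdict(int)
    for row in rows:
        at_risk[row["period"]] += 1
        events[row["period"]] += row["y"]
    return {p: events[p] / at_risk[p] for p in sorted(at_risk)}


person_period = to_person_period(people)
print(len(person_period), "person-period rows")
print(observed_hazard(person_period))
```

A logistic regression of `y` on period indicators and covariates, fit to data in this layout, is one common way to obtain conditional hazard estimates; a single coefficient for a covariate across all periods corresponds to the constant-effect assumption described above.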
The present study describes issues inherent to modeling covariates, as well as topics including metric, time scale origin, and censoring/truncation. Although we present each topic separately, it is important to be attentive to all facets. We end with a presentation of novel extensions of survival models. Throughout, we describe substantive research in addictions that used a given survival modeling approach, and present our own annotated examples in the Appendix. Our explanations and examples are primarily based on a latent variable framework (structural equation modeling framework; SEM; Masyn, 2003), which provides maximum likelihood estimates of (binary) event indicators. These SEM extensions permit testing of more complicated models, including specification of temporal ordering (e.g., positing mediators of distal predictors of a discrete-time survival curve) and inclusion of latent variables as predictors and outcomes. SEM-based models lend themselves well to representing developmental models in the way they are conceived theoretically (see Figures in Appendix Example 5). SEM approaches deal with missing data under the assumption that correlates of missingness are modeled, and can provide estimates with robust standard errors (Fairchild et al., 2015). Although many of the models covered in this manuscript can be conducted in other software packages, some are (currently) reliant on the latent variable framework supported by Mplus (Muthén & Muthén, 1998–2017). For additional examples, we point the reader to the online material associated with Singer and Willett (2003) displaying SAS, Stata, R, SPSS, and Mplus syntax (https://stats.idre.ucla.edu/other/examples/alda/).
We present syntax and output for our own examples in the Appendix. We draw on a prospective study on substance use initiation and progression (see Jackson et al., 2015 for details); briefly, participants (N=1,023; 52% female; 24% non-White/12% Hispanic) enrolled in 6th, 7th, and 8th grades (cohort-sequential design) completed a series of web-surveys through high-school end. The outcome variable for most models is (initiation of) marijuana use [Appendix Example Dataset 1 presents data with wave converted to age in half-years, followed by discrete-time (1a) and continuous-time (1b) survival models with a single covariate, sex].
Metric, Origin of the Time Scale, and Censoring/Truncation
Metric.
Before discussing issues concerning predicted survival outcomes, we discuss the importance of the time-scale. Although developmental researchers commonly use assessment occasion (wave) as the metric of time, change processes can occur on any time-scale. The most intuitive scale is chronological age (Ram & Grimm, 2007), but alternate metrics may better theoretically guide the process, for instance social time or biological time such as pubertal stage (Ram et al., 2010).
Although stage of substance use and chronological age are associated, they are not synonymous (Colder et al., 2010). Deutsch et al. (2017) modeled stage rather than age in order to examine the development of alcohol stages separate from overall development. They examined time between antecedent milestones, for example, time from first alcohol use disorder (AUD) symptom to AUD diagnosis. A similar approach was taken by Jackson (2010), where the metric was years between adjacent milestones (e.g., first drink through daily heavy drinking).
Even when choice of metric is pre-determined, decisions remain with regard to choosing the indicator. Although respondents usually report age of initiation retrospectively, prospective studies may assess age of first use repeatedly, to capture initiation as it occurs, in the interest of minimizing retrospective bias. For studies that use both approaches, little guidance is given regarding which report to use when ages are inconsistent with each other. The first report may be the most accurate, as the ability to recall details about the event diminishes with time (Johnson, Gerstein, & Rasinski, 1997); alternately, age-related increases in maturity may improve accuracy, suggesting the earliest report is not the most reliable indicator (Kaestle, 2015; Wellman & O’Loughlin, 2015). A third option is to take the minimum age across assessments. We conducted a study where we examined three alternate reports (first-reported; last-reported; minimum) of age of first drink. We directly compared first-age to last-age-reported correlates of adolescent drinking such as delinquency, school disengagement, alcohol cognitions, anxiety, alcohol availability, and norms (minimum age was very similar to first-reported) (Rogers & Jackson, 2017). Findings indicated poor agreement, with only 30% of participants reporting the same age for first-age and last-age reported. However, tests of differences (hazard ratios converted to correlated correlation coefficients) indicated no meaningful difference in associations with correlates as a function of which report was used, suggesting substantive conclusions were not greatly impacted by misreporting.
Origin of the time scale.
The origin, or initial point, in a survival model usually corresponds to birth, or time of diagnosis or treatment entry, but data can be structured to specify another time origin. The intercept as a meaningful Time 0 could be defined as initiation of substance use, as done for studies examining progression from an early milestone to a later milestone (for alcohol: Huggett et al., 2018; Jackson, 2010; Sartor et al., 2007; 2016; smoking: Duncan et al., 2012; Huggett et al., 2018; cannabis: Butterworth, Slade, & Degenhardt, 2014; Wagner & Anthony, 2002). This approach also permits a test of the extent to which speed (vs. likelihood) of progression to a subsequent milestone is affected by development: specifically, whether progression is differential for those who achieved the antecedent milestone at an early vs. late age. Studies in the substance use field have documented the phenomenon of “telescoping” (Deutsch et al., 2017; Huggett et al., 2018; Jackson, 2010; Sartor et al., 2007; 2008; 2010) whereby youth with younger ages of initiation progress more slowly to later milestones. Substantively, this suggests the existence of distinct processes underlying any use and progression of use within an individual; methodologically, this approach avoids the confounding of chronological age with stage of use (Deutsch et al., 2017).
Multiple events, such as repeated occasions of substance use in a given time period, could be handled by re-setting each new event to Time 0. Alternately, they can be handled by models that partition time into discrete periods (Willett & Singer, 1995) or by a type of model called the frailty model (Wei & Glidden, 1997; Wienke, 2010) (also see Hedeker, Siddiqui, & Hu, 2000 for options for discrete-time multiple time-to-event data). Along with the baseline hazard function, these models include a random effect (frailty) that accounts for dependence of event times, which makes them also suitable for dealing with dependent event data such as dyadic data. Beets et al. (2009) applied the frailty model to repeated occurrences of heavy drinking episodes across 35 weeks. Elevated drinking was associated with peer disapproval and risk of harm across all timepoints; there also was increased risk on specific calendar events. An alternate model, the mixed-effects location-scale model, also partitions between- and within-person variability in multiple event models (see Courvoisier, Walls, Cheval, & Hedeker, 2018 for an application to daily onset-to-first-cigarette data). Singer and Willett (1991) observed that failing to account for multiple relapse attempts may lead to bias, as previous treatments may increase the success of subsequent treatments. This highlights the importance of measuring and modeling multiple spells (also see Willett & Singer, 1995).
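Re-setting the clock to Time 0 after each event amounts to converting a person's event times into a sequence of spells (gap times). The following is a minimal sketch of that conversion under stated assumptions: study entry is taken as time 0, all event times fall within follow-up, and the function name and example values are hypothetical.

```python
from typing import List, Tuple

Spell = Tuple[int, float, int]  # (spell number, duration, event indicator)


def gap_time_spells(event_times: List[float], follow_up_end: float) -> List[Spell]:
    """Convert one person's event times into spells measured from the prior event.

    Each observed event closes a spell with event indicator 1. The time from the
    last event to the end of follow-up forms a final, right-censored spell.
    """
    spells: List[Spell] = []
    previous = 0.0  # assumption: the first spell starts at study entry (time 0)
    for number, t in enumerate(sorted(event_times), start=1):
        spells.append((number, t - previous, 1))
        previous = t  # reset the clock at each event
    if follow_up_end > previous:
        spells.append((len(spells) + 1, follow_up_end - previous, 0))
    return spells


# Hypothetical person with events in weeks 3, 10, and 12 of a 35-week follow-up.
print(gap_time_spells([3, 10, 12], follow_up_end=35))
```

Spells from the same person are not independent, which is why this layout is typically paired with a model that accounts for within-person dependence, such as the frailty model described above.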
Censoring/truncation.
One other important matter when considering origin is that of left censoring and left truncation. Survival models handle right censoring due to attrition/study end, but left censoring is inherent when modeling outcomes that are slow to evolve, especially when the assessment window is short. Ideally all participants would be enrolled prior to the first event and followed until the final event so the entire process is observed for all participants (“incident-cohort design”) (Cain et al., 2011; Gail et al., 2009); this is similar to a closed-cohort design where all enrolled subjects are at risk (Thiébaut & Bénichou, 2004). However, this design is not feasible when the process develops over many years and survival times vary greatly. When the event of interest occurs prior to study entry and event age is unknown, left censoring occurs. When individuals who already experienced the event at enrollment are not included in the study, left truncation occurs. Here, the entire distribution is estimated on the basis of data from the right tail of the distribution (arguably the “healthier” individuals). It is important to bear in mind that our “event” may exert an influence on the covariates themselves (e.g., expectancies about the effects of alcohol may be influenced by earlier drinking experiences) (Dekker et al., 2008). The best solution may be to exclude outcome assessments that took place prior to or concurrent with predictor assessments, if violation of temporal ordering is to be avoided. [Example 2 presents syntax for this situation]. Further, in cases with poor retrospective measurement of event timing (e.g., failure to assess a precise age) or widely spaced assessments, data may be prone to interval censoring, where the event is known only to fall within some interval of time.
The tension between capturing events as they occur, precise reliable measurement, and sufficient baserates is inherent to longitudinal research and should be thoughtfully considered in both planning and analyzing a prospective study.
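One way to keep the types of censoring distinct when preparing data is to record, for each participant, the lower and upper bounds on the event time that the assessments support. The sketch below assumes that convention (a missing bound coded as `None`); the function and the example ages are hypothetical illustrations, not part of the Appendix.

```python
from typing import Optional


def censoring_type(lower: Optional[float], upper: Optional[float]) -> str:
    """Classify an observation from the bounds known for its event time.

    lower: latest age at which the person was known to be event-free
           (None if the event had already occurred at study entry).
    upper: earliest age by which the event was known to have occurred
           (None if the event was never observed).
    """
    if lower is None and upper is None:
        raise ValueError("at least one bound must be known")
    if lower is None:
        return "left-censored"       # event before entry, timing unknown
    if upper is None:
        return "right-censored"      # event not observed by end of follow-up
    if lower == upper:
        return "exact"               # event time observed precisely
    if lower < upper:
        return "interval-censored"   # event known only to lie between bounds
    raise ValueError("lower bound exceeds upper bound")


# Hypothetical ages in years.
print(censoring_type(None, 12.0))   # already drinking at enrollment at age 12
print(censoring_type(17.5, None))   # still abstinent at last assessment
print(censoring_type(14.0, 15.0))   # initiated between two annual assessments
print(censoring_type(13.5, 13.5))   # initiation age recorded precisely
```

Left truncation is not a property of an individual record in this scheme: it arises from who is enrolled, so it has to be addressed through the study design or the risk-set definition rather than by recoding observations.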
Modeling Covariates in Survival Models
A major goal of developmental research is to identify constructs that explain individual differences in change (inter-individual variability in intraindividual change). This is done in survival models by examining the (log) hazard function across levels of a risk factor (covariate). These exogenous variable effects are expressed in terms of hazard ratios (HR; e.g., the ratio of the hazard of drinking for men vs. women, which corresponds to a difference in log hazard). Nested models are estimated to determine unique effects of a predictor, and/or the confidence interval of the HR can be examined.
Covariates can be classified as time-invariant or time-varying. Time-invariant covariates are assumed to be independent of the passage of time. Some variables are inherently time-invariant (sex assigned at birth, prenatal opioid exposure). These factors can be distal or proximal, with proximal risk factors serving as mediators for distal risk (e.g., cognitions mediating social influences on drinking). [see Examples 1a, 1b].
Covariates that vary over time.
Developmental theory may posit that explanatory variables themselves vary over time (time-varying covariates). If the magnitude and/or direction of a covariate is expected to change, a time-varying covariate takes this updated information on a risk factor into account when studying its association with the outcome (Dekker et al., 2008). For example, there is a shift in the relative influence of parents and peers during adolescence. In this case, the association between time-varying variables and growth can be modeled by predicting each survival indicator contemporaneously from the time-varying variable (or its lag, in an effort to preserve directionality). Here, a parameter (HR) is assigned to each timepoint. To determine whether effects are similar across timepoints (time-invariant effects of the time-varying predictor) or vary across timepoints (time-varying effects of the time-varying predictor), they can be constrained to be equal to each other; this results in one HR that can be considered as a weighted average of short-term effects [Examples 3a, 3b]. If fit of the time-invariant model is not measurably worse than the time-varying model, the more parsimonious time-invariant model is selected.
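Pairing each survival indicator with the contemporaneous or lagged value of a time-varying covariate can be sketched as a data-preparation step. The example below is a hypothetical illustration in Python rather than the Mplus specification used in Examples 3a and 3b; the function name, the wave numbering (wave 0 as baseline), and the covariate values are assumptions.

```python
from typing import Dict, List, Optional


def person_period_with_lag(
    last_period: int,
    event: int,
    covariate: Dict[int, float],
    lag: int = 1,
) -> List[Dict[str, Optional[float]]]:
    """Build person-period rows pairing each event indicator with a covariate value.

    covariate maps wave number to the value measured at that wave.
    lag = 1 uses the value from the preceding wave (to preserve directionality);
    lag = 0 uses the contemporaneous value.
    """
    rows = []
    for period in range(1, last_period + 1):
        rows.append({
            "period": period,
            "y": int(event == 1 and period == last_period),
            # None marks periods for which no suitably timed measurement exists.
            "x": covariate.get(period - lag),
        })
    return rows


# Hypothetical participant: event in period 3; covariate measured at waves 0-3.
peer_use = {0: 1.0, 1: 2.0, 2: 4.0, 3: 5.0}
for row in person_period_with_lag(3, 1, peer_use, lag=1):
    print(row)
```

Estimating a separate coefficient for `x` in each period corresponds to time-varying effects of the time-varying predictor; constraining those coefficients to be equal corresponds to the time-invariant effect described above.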
Covariate effects that vary over time.
Regardless of whether the covariate is time-invariant or time-varying, it may exert its influence on the outcome at different chronological ages or stages of development. That is, as opposed to risk being constant over time, there is a weakening or strengthening of associations, with the HR associated with the predictor changing with respect to time. Deutsch et al. (2017) found such a time-dependent effect whereby age at drinking initiation predicted progression from drinking initiation to first AUD symptom more strongly at earlier ages (drinking initiation had a stronger effect on short-term survival) whereas age at first drunkenness was a stronger predictor at later ages (drunk initiation had a stronger effect on long-term survival). Essentially, this tests whether there is a different impact of the covariate on the time-to-event outcome across time.
Whether there are time-dependent effects of the predictor is examined by testing the PH assumption. This is done in several ways, including via graphical methods (e.g., plotting survival estimates for different levels of the covariate to look for survival curves that converge or cross), formal tests of Schoenfeld residuals for each covariate in the model (Grambsch & Therneau, 1994), or by testing interactions between the covariate and time (with significant positive interaction terms indicating the HR increasing with time). [see Example 4a for test of the PH assumption]. We can test changes in model fit across increasingly complex models that permit unique (residual) effects at specific timepoints. If there are time-dependent effects, separate rates can be reported for each window under study (i.e., stratifying by time), or effects of predictors can be modeled to increase or decrease linearly over time. [see Example 4b].
Substantive examples of handling violations in the assumption that risk remains constant over time include stratification for time-invariant (Duncan et al., 2012) and time-varying (Doran & Waldron, 2017) covariates. Duncan et al. (2012) handled a violation of the PH assumption for race/ethnicity in the development of nicotine dependence by creating two time periods (<age 18/age 18+, two dummy codes) and separately estimating the effect of race/ethnicity for each time period. Doran and Waldron (2017) examined whether timing of first sexual intercourse was associated with timing of first alcohol use, modeled as a time-varying predictor (and thus conditioned to occur before/concurrent with first sexual intercourse). The PH assumption was violated, so an age interaction was modeled with separate risk periods for sexual intercourse modeled through age 13 and age 14 onward.
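A simple descriptive check on whether a covariate effect is constant over time in discrete-time data is to compute the effect separately within each period and inspect whether the estimates are similar. The sketch below does this with period-specific hazard odds ratios; it is an informal diagnostic under stated assumptions (a binary covariate, hypothetical counts, no period with a hazard of 0 or 1), not a substitute for the formal tests described above.

```python
import math
from typing import Dict, Tuple

# Hypothetical (events, at risk) counts by period for two groups.
counts: Dict[int, Dict[str, Tuple[int, int]]] = {
    1: {"exposed": (12, 100), "unexposed": (6, 100)},
    2: {"exposed": (15, 80), "unexposed": (9, 90)},
    3: {"exposed": (10, 60), "unexposed": (12, 75)},
}


def hazard_odds(events: int, at_risk: int) -> float:
    """Odds of the event among those at risk (assumes 0 < events < at_risk)."""
    hazard = events / at_risk
    return hazard / (1.0 - hazard)


def period_log_odds_ratios(data: Dict[int, Dict[str, Tuple[int, int]]]) -> Dict[int, float]:
    """Log hazard odds ratio (exposed vs. unexposed) within each period."""
    return {
        period: math.log(hazard_odds(*groups["exposed"]) / hazard_odds(*groups["unexposed"]))
        for period, groups in sorted(data.items())
    }


log_ors = period_log_odds_ratios(counts)
for period, value in log_ors.items():
    print(f"period {period}: log odds ratio = {value:.3f}, odds ratio = {math.exp(value):.2f}")
```

In these hypothetical data the ratio is similar in the first two periods and much closer to 1 in the third, the kind of pattern that would motivate a covariate-by-time interaction or separate estimates by time window.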
Differences due to the timing of the predictor.
So far, our discussion of covariates has included two scenarios: (1) the covariates themselves vary over time and (2) the covariate effects vary over time. A third scenario is that there are differences in the outcome as a function of the timing of the predictor (predictors that are themselves time-specific). A risk factor could be categorized into early vs. late, younger vs. older, or distal vs. proximal. In our prior work, parental divorce/separation was categorized based on age experienced (ages 0–5/ages 6–9/age 10+) and we examined whether an early window of exposure renders youth susceptible to stress (Jackson, Rogers, & Sartor, 2016). In other examples, Butterworth et al. (2014) examined whether age of first cannabis use was differentially predicted by early (by age 14) vs. late initiation of drinking and smoking, and Roberts et al. (2016) looked at the role of early vs. late (> age 18) smoking initiation in predicting all-cause mortality. Finally, Kelley et al. (2016) tested the effects of marijuana use on initiation of psychotic disorders at different periods of development: early-adolescence (ages 12–14), late-adolescence (ages 15–17), and adulthood (age 18+). Thus, these studies test whether there are substantive differences in the effect of a predictor due to its timing.
Tests of these three scenarios are not mutually exclusive. For example, a test of differences due to timing of the predictor (e.g., when separation/divorce is experienced) could be combined with a test of differences due to timing of the outcome (e.g., age of first drink) to inform us about whether parental divorce/separation has a stronger effect among youth who had a very early age of first drink. This requires a test for whether risk is constant over time (the PH assumption) for each of three variables: divorce/separation at ages 0–5, divorce/separation at ages 6–9, and divorce/separation at age 10+.
Novel Extensions of Survival Models
New techniques are possible through advancements in software, including user-contributed packages for R as well as Mplus. The content below provides a brief overview for the applied user, but we urge the user to be cognizant of the applicability, assumptions, and shortcomings of these models, as discussed in detail in the source articles cited.
Survival mediation analysis.
One technique of utility to addiction researchers is the SEM-based discrete-time survival mediation analysis (Fairchild et al., 2015; Janssen et al., 2017a; Mason et al., 2017). As the name implies, this model integrates standard statistical mediational analysis (Baron & Kenny, 1986; MacKinnon, 2008) with discrete-time survival analysis. As one example, Mason et al. (2017) examined childhood (age 8) peer marginalization and aggression as mediators of the association between prenatal risk factors and substance initiation (across ages 11–16). Whereas this mediator was a single time-invariant variable, other studies model time-varying mediators. Slater and Henry (2013) tested mediated effects of music-related media exposure, lagged (time W-1), on subsequent alcohol and cigarette uptake (time W) through time-varying associations with substance-using peers (time W). Janssen et al. (2017a) tested mediation of bi-directional effects of movie alcohol exposure and social/cognitive processes on drinking initiation. Predictors and mediators were modeled as time-varying, such that for instance, alcohol initiation between Time 2 and Time 3 was predicted by Time 1 movie alcohol exposure as mediated through Time 2 social/cognitive processes. Atherton et al. (2015) modeled the mediating constructs as latent growth factors, demonstrating significant mediation of the association between peer deviance at intercept (age 10) and substance use initiation by a slope growth factor (across ages 10–16) of access to substances. [see Appendix Example 5 for time-varying (5a) and time-invariant (5b) mediation, and moderated mediation (5c)].
Multiple outcomes.
The discrete-time multiple-event process survival mixture (MEPSUM) model (Almansa et al., 2014; Dean, Bauer, & Shanahan, 2014) simultaneously models order and timing of multiple events [see Example 6a for MEPSUM and 6b, 6c for MEPSUM relaxing the PH assumption]. These models use mixture models (Nagin, 1999; Nagin, 2005) to identify latent subgroups (classes) of individuals exhibiting different patterns of event occurrence over time. That is, classes are composed of individuals with a similar hazard, across multiple outcomes (Richmond-Rakerd, Fleming, & Slutske, 2016). Dean, Cole, and Bauer (2015) characterized initiation of nine substances across ages 10–30. Six patterns (latent classes) were detected on the basis of both timing of initiation and cumulative risk of substance use. Richmond-Rakerd et al. (2016) similarly used the MEPSUM model to examine initiation of tobacco, alcohol, and cannabis use. The period of greatest risk differentiated classes more so than did sequencing, and all classes supported a clear “gateway” from licit to illicit drugs (tobacco before alcohol before cannabis). Intriguing future avenues include looking at ordering of symptom onsets within a disorder (Richmond-Rakerd et al., 2016). A somewhat related model is the dual-process discrete-time survival analysis (DPDTSA) model (Malone, Lamis, Masyn, & Northrup, 2010). For each behavior (e.g., two substances), age of onset is modeled as a latent class variable with known class membership. At Time 1, there are two known classes (any use vs. no use); at subsequent times, three known classes are estimated (1st time use, no use, past use). Initiation of each substance is predicted by initiation of the other substance at the immediately preceding occasion. This model is an advancement over simply correlating initiation ages as it also resolves ordering.
Joint modeling of growth and survival data.
This class of models incorporates latent growth curve models with discrete-time survival models. The covariate is treated as a growth process, as was done in Janssen et al. (2018) who tested whether initial levels of and changes in social norms in early adolescence (ages 11.5–14.5) prospectively predicted initiation of alcohol use (ages 15–18.5). These models also can be extended to latent growth mixture models, with growth in the predictor modeled as group-based trajectories (e.g., Kelley et al., 2016; Northrup et al., 2015). Survival time is plotted for each trajectory (class), and hazard is compared across classes in a pairwise fashion. Finally, one last model that utilizes survival and growth functions is the multi-facet longitudinal model (MFLM) (Malone et al., 2011; 2012; Witkiewitz & Masyn, 2009). In this “onset-to-growth model,” initiation is modeled as a latent class, with a separate class for each possible age of initiation (and a class for no initiation). Persistence (continued use) after initiation is simultaneously modeled as a latent growth model. These models have utility for understanding common vs. unique etiological processes underlying initiation and persistence/desistance following (adjusting for) initiation of use. They are also useful for understanding patterns of behavior following initial lapse after substance abuse treatment (Witkiewitz & Masyn, 2009).
Statistical Power
We briefly describe the factors that go into power estimates for survival models. Power is driven by the total number of events, which is a function of number of people being observed, probability of event occurrence during the study, and length of time under investigation. Because power is driven by number of events, we can ignore (right) censoring (although we should not ignore drop-out). Thus, it is critical that there be sufficient follow-up time to observe our event. Studies on addictive behaviors in adolescent and community samples will likely require longer follow-up periods than clinical or high-risk samples. When considering power to detect a covariate effect, effect size (ES) can be expressed as a ratio of median survival times in two groups (e.g., ES=2.0 if median lifetime is twice as long), and power can be computed based on ES and number of events (Singer & Willett, 1991). For a given sample size, data should be collected for a longer time when hazard is low and a shorter time when hazard is high, with small anticipated ES obviously requiring either larger N or longer study duration. For additional discussion of power in survival models, see Schoenfeld (1983).
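The dependence of power on the number of events can be made concrete with the approximation commonly attributed to Schoenfeld (1983) for comparing two groups under proportional hazards. The sketch below is illustrative only: the function names are hypothetical, a two-sided test is assumed, and treating a ratio of median lifetimes as the inverse of the HR holds only under the additional assumption of constant (exponential) hazards.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.80, prop_group1: float = 0.5) -> int:
    """Approximate number of events needed to detect a given hazard ratio.

    Uses d = (z_{1-alpha/2} + z_{power})^2 / (p1 * p2 * (ln HR)^2),
    where p1 and p2 are the proportions of the sample in each group.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    p1, p2 = prop_group1, 1.0 - prop_group1
    events = (z_alpha + z_power) ** 2 / (p1 * p2 * math.log(hazard_ratio) ** 2)
    return math.ceil(events)


def required_sample_size(events: int, prob_event: float) -> int:
    """Participants needed if a proportion prob_event experience the event."""
    return math.ceil(events / prob_event)


# Hypothetical planning scenario: HR = 2.0, equal group sizes, 80% power.
d = required_events(2.0)
print("events needed:", d)
print("N if half the sample experiences the event:", required_sample_size(d, 0.5))
```

The second function shows why follow-up length matters: for a fixed number of required events, a lower probability of observing the event during the study translates directly into a larger required N.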
Conclusions
A host of flexible analytic methods have been developed that allow for testing of developmental theory (Curran et al., 2010). Readily available specialized software enables scientists to apply sophisticated analyses (Collins, 2006) although programs are perhaps too user-friendly for researchers unaware of the assumptions and defaults. We strongly encourage the applied user to carefully consult the literature to understand the parameterization of a model as well as software manuals prior to programming the corresponding syntax. Hertzog and Nesselroade (2003) caution that a recent emphasis on statistical advances diverts attention from the importance of matching theory with research design, measurement, and analysis decisions. To the degree that these issues affect the correspondence between our statistical models and our theories of development, our ability to validly test developmental theory is threatened. We hope this paper enables addiction researchers to closely match their research question about timing of events with the appropriate model.
Supplementary Material
Highlights.
Survival models are ideal for examining the development of addictive behaviors
The metric and origin of the time scale must correspond to the developmental process
Time-varying covariates and time-varying effects of covariates are both testable
Novel extensions of survival models can handle mediation, multiple events, and growth
Footnotes
Author disclosure
The authors have no Conflicts of Interest to declare.
Declarations of interest
none.
References
- Almansa J, Vermunt JK, Forero CG, & Alonso J (2014). A factor mixture model for multivariate survival data: an application to the analysis of lifetime mental disorders. Journal of the Royal Statistical Society, 63, 85–102.
- Atherton OE, Conger RD, Ferrer E, & Robins RW (2015). Risk and protective factors for early substance use initiation: A longitudinal study of Mexican‐Origin youth. Journal of Research on Adolescence, 26, 864–879.
- Asparouhov T, & Muthén B (2014). Auxiliary variables in mixture modeling: Using the BCH method in Mplus to estimate a distal outcome model and an arbitrary second model. Web note 21.
- Baggio S, Spilka S, Studer J, Iglesias K, & Gmel G (2016). Trajectories of drug use among French young people: Prototypical stages of involvement in illicit drug use. Journal of Substance Use, 21, 485–490.
- Baron RM, & Kenny DA (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
- Beets MW, Flay BR, Vuchinich S, Li KK, Acock A, Snyder FJ, & Tobacco Etiology Research Network. (2009). Longitudinal patterns of binge drinking among first year college students with a history of tobacco use. Drug and Alcohol Dependence, 103, 1–8.
- Butterworth P, Slade T, & Degenhardt L (2014). Factors associated with the timing and onset of cannabis use and cannabis use disorder: Results from the 2007 Australian National Survey of Mental Health and Well‐Being. Drug and Alcohol Review, 33, 555–564.
- Cain KC, Harlow SD, Little RJ, Nan B, Yosef M, Taffe JR, & Elliott MR (2011). Bias due to left truncation and left censoring in longitudinal studies of developmental and disease processes. American Journal of Epidemiology, 173, 1078–1084.
- Chang S-H (2004). Estimating marginal effects in accelerated failure time models for serial sojourn times among repeated events. Lifetime Data Analysis, 10, 175–190.
- Chen X, Yu B, Lasopa SO, & Cottler LB (2017). Current patterns of marijuana use initiation by age among US adolescents and emerging adults: Implications for intervention. The American Journal of Drug and Alcohol Abuse, 43, 261–270.
- Collins LM (2006). Analysis of longitudinal data: The integration of theoretical models, design, and statistical model. Annual Review of Psychology, 57, 505–528.
- Courvoisier D, Walls TA, Cheval B, & Hedeker D (2018). A mixed-effects location scale model for time-to-event data: A smoking behavior application. Addictive Behaviors. doi: 10.1016/j.addbeh.2018.08.032
- Cox DR (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society B, 34, 187–220.
- Crowther MJ, Andersson TML, Lambert PC, Abrams KR, & Humphreys K (2016). Joint modelling of longitudinal and survival data: incorporating delayed entry and an assessment of model misspecification. Statistics in Medicine, 35, 1193–1209.
- Curran PJ, Obeidat K, & Losardo D (2010). Twelve frequently asked questions about growth curve modeling. Journal of Cognition and Development, 11, 121–136.
- Dean DO, Bauer DJ, & Shanahan MJ (2014). A discrete-time Multiple Event Process Survival Mixture (MEPSUM) model. Psychological Methods, 19, 251–264.
- Dean DO, Cole V, & Bauer DJ (2015). Delineating prototypical patterns of substance use initiations over time. Addiction, 110, 585–594.
- Dekker FW, De Mutsert R, Van Dijk PC, Zoccali C, & Jager KJ (2008). Survival analysis: Time-dependent effects and time-varying risk factors. Kidney International, 74, 994–997.
- Deutsch AR, Slutske WS, Lynskey MT, Bucholz KK, Madden PA, Heath AC, & Martin NG (2017). From alcohol initiation to tolerance to problems: Discordant twin modeling of a developmental process. Development and Psychopathology, 29, 845–861.
- Doran KA, & Waldron M (2017). Timing of first alcohol use and first sex in male and female adolescents. Journal of Adolescent Health, 61, 606–611.
- Duncan AE, Lessov-Schlaggar CN, Sartor CE, & Bucholz KK (2012). Differences in time to onset of smoking and nicotine dependence by race/ethnicity in a Midwestern sample of adolescents and young adults from a high risk family study. Drug & Alcohol Dependence, 125, 140–145.
- Fairchild AJ, Abara WE, Gottschall AC, Tein JY, & Prinz RJ (2015). Improving our ability to evaluate underlying mechanisms of behavioral onset and other event occurrence outcomes: A discrete-time survival mediation model. Evaluation & the Health Professions, 38, 315–342.
- Gail MH, Graubard B, Williamson DF, & Flegal KM (2009). Comment on “Choice of time scale and its effect on significance of predictors in longitudinal studies.” Statistics in Medicine, 28, 1315.
- Grambsch PM, & Therneau TM (1994). Proportional hazards tests and diagnostics based on weighted residuals. Biometrika, 81, 515–526.
- Hedeker D, Siddiqui O, & Hu FB (2000). Random-effects regression analysis of correlated grouped-time survival data. Statistical Methods in Medical Research, 9, 161–179.
- Hertzog C, & Nesselroade JR (2003). Assessing psychological change in adulthood: An overview of methodological issues. Psychology and Aging, 18, 639–657.
- Huang YT, & Yang HI (2017). Causal mediation analysis of survival outcome with multiple mediators. Epidemiology, 28, 370–378.
- Huggett SB, Hatoum AS, Hewitt JK, & Stallings MC (2018). The speed of progression to tobacco and alcohol dependence: A twin study. Behavior Genetics, 48, 109–124.
- Hussong A, Bauer D, & Chassin L (2008). Telescoped trajectories from alcohol initiation to disorder in children of alcoholic parents. Journal of Abnormal Psychology, 117, 63–78.
- Ibrahim JG, Chu H, & Chen LM (2010). Basic concepts and methods for joint models of longitudinal and survival data. Journal of Clinical Oncology, 28, 2796.
- Jackson KM (2010). Progression through early drinking milestones in an adolescent treatment sample. Addiction, 105, 438–449.
- Jackson KM, Colby SM, Barnett NP, & Abar CC (2015). Prevalence and correlates of sipping alcohol in a prospective middle school sample. Psychology of Addictive Behaviors, 29, 766–778.
- Jackson KM, Rogers ML, & Sartor C (2016). Parental divorce and initiation of alcohol use in early adolescence. Psychology of Addictive Behaviors, 30, 450–461.
- Janssen T, Cox MJ, Merrill JE, Barnett NP, Sargent JD, & Jackson KM (2017a). Peer norms and susceptibility mediate the effect of movie alcohol exposure on alcohol initiation in adolescents. Psychology of Addictive Behaviors. Advance online publication. doi: 10.1037/adb0000338
- Janssen T, Jackson KM, Cox M, Barnett NP, & Stoolmiller M (2017b). The role of sensation seeking and R-rated movie watching in early substance use initiation. Journal of Youth and Adolescence. doi: 10.1007/s10964-017-0742-0
- Janssen T, Treloar Padovano H, Merrill JE, & Jackson KM (2018). Developmental relations between alcohol expectancies and social norms in predicting alcohol onset. Developmental Psychology, 54, 281–292.
- Johnson RA, Gerstein DR, & Rasinski KA (1997). Recall decay and telescoping in self-reports of alcohol and marijuana use: Results from the National Household Survey on Drug Abuse (NHSDA). Proceedings of the American Association for Public Opinion Research.
- Joeng HK, Chen MH, & Kang S (2016). Proportional exponentiated link transformed hazards (ELTH) models for discrete time survival data with application. Lifetime Data Analysis, 22, 38–62.
- Kaestle CE (2015). Age of smoking milestones: Longitudinal inconsistencies and recanting. Journal of Adolescent Health, 56, 382–388.
- Kelley ME, Wan CR, Broussard B, Crisafio A, Cristofaro S, Johnson S, Reed TA, Amar P, Kaslow NJ, Walker EF, & Compton MT (2016). Marijuana use in the immediate 5-year premorbid period is associated with increased risk of onset of schizophrenia and related psychotic disorders. Schizophrenia Research, 171, 62–67.
- Kennedy MC, Marshall BD, Hayashi K, Nguyen P, Wood E, & Kerr T (2015). Heavy alcohol use and suicidal behavior among people who use illicit drugs: a cohort study. Drug & Alcohol Dependence, 151, 272–277.
- MacKinnon D (2008). Introduction to Statistical Mediation Analysis. New York: Routledge.
- Malone PS, Lamis DA, Masyn KE, & Northrup TF (2010). A dual-process discrete-time survival analysis model: Application to the gateway drug hypothesis. Multivariate Behavioral Research, 45, 790–805.
- Malone PS, Northrup TF, Masyn KE, Lamis DA, & Lamont AE (2012). Initiation and persistence of alcohol use in United States Black, Hispanic, and White male and female youth. Addictive Behaviors, 37, 299–305.
- Mason WA, Patwardhan I, Smith GL, Chmelka MB, Savolainen J, January SAA, Miettunen J, & Järvelin MR (2017). Cumulative contextual risk at birth and adolescent substance initiation: Peer mediation tests. Drug and Alcohol Dependence, 177, 291–298.
- Masyn KE (2003). Discrete-time survival mixture analysis for single and recurrent events using latent variables (Unpublished doctoral dissertation). University of California, Los Angeles.
- Muthén B, & Masyn K (2005). Discrete-time survival mixture analysis. Journal of Educational and Behavioral Statistics, 30, 27–58.
- Muthén LK, & Muthén BO (1998–2017). Mplus User’s Guide. Eighth Edition. Los Angeles, CA: Muthén & Muthén.
- Nagin DS (1999). Analyzing developmental trajectories: A semiparametric group-based approach. Psychological Methods, 4, 139–157.
- Nagin DS (2005). Group-based modeling of development. Cambridge, MA: Harvard University Press.
- Nesselroade JR, & Baltes PB (1979). Longitudinal research in the study of behavior and development. Academic Press, San Diego, CA.
- Northrup TF, Stotts AL, Green C, Potter JS, Marino EN, Walker R, Weiss RD, & Trivedi M (2015). Opioid withdrawal, craving, and use during and after outpatient buprenorphine stabilization and taper: a discrete survival and growth mixture model. Addictive Behaviors, 41, 20–28.
- Piccorelli AV, & Schluchter MD (2012). Jointly modeling the relationship between longitudinal and survival data subject to left truncation with applications to cystic fibrosis. Statistics in Medicine, 31, 3931–3945.
- Preacher KJ, & Hayes AF (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879–891.
- Proust-Lima C, Joly P, Dartigues JF, & Jacqmin-Gadda H (2009). Joint modelling of multivariate longitudinal outcomes and a time-to-event: a nonlinear latent class approach. Computational Statistics and Data Analysis, 53, 1142–1154.
- Ram N, Gerstorf D, Fauth E, Zarit S, & Malmberg B (2010). Aging, disablement, and dying: Using time-as-process and time-as-resources metrics to chart late-life change. Research in Human Development, 7, 27–44.
- Ram N, & Grimm KJ (2007). Using simple and complex growth models to articulate developmental change: Matching method to theory. International Journal of Behavioral Development, 31, 303–316.
- Raykov T, Zajacova A, Gorelick PB, & Marcoulides GA (2018). Using latent variable modeling for discrete time survival analysis: examining the links of depression to mortality. Structural Equation Modeling: A Multidisciplinary Journal, 25, 287–293.
- Richmond-Rakerd LS, Fleming KA, & Slutske WS (2016). Investigating progression in substance use initiation using a discrete-time multiple event process survival mixture (MEPSUM) approach. Clinical Psychological Science, 4, 167–182.
- Roberts ME, Colby SM, Lu B, & Ferketich AK (2016). Understanding tobacco use onset among African Americans. Nicotine & Tobacco Research, 18(suppl_1), S49–S56.
- Rogers ML, & Jackson KM (2017). Alcohol consumption milestones: Comparing first versus most recent report of onset. Journal of Child & Adolescent Substance Abuse, 26, 258–264.
- Sartor C, Jackson K, McCutcheon V, Duncan A, Grant J, Werner K, & Bucholz K (2016). Progression from first drink, first intoxication and regular drinking to alcohol use disorder: A comparison of African-American and European-American youth. Alcoholism: Clinical and Experimental Research, 40, 1515–1523.
- Sartor CE, Lessov-Schlaggar CN, Scherrer JF, Bucholz KK, Madden PA, Pergadia ML, Grant JD, Jacob T, & Xian H (2010). Initial response to cigarettes predicts rate of progression to regular smoking: findings from an offspring-of-twins design. Addictive Behaviors, 35, 771–778.
- Sartor CE, Lynskey MT, Heath AC, Jacob T, & True W (2007). The role of childhood risk factors in initiation of alcohol use and progression to alcohol dependence. Addiction, 102, 216–225.
- Sartor CE, Xian H, Scherrer JF, Lynskey MT, Duncan AE, Haber JR, Bucholz KK, & Jacob T (2008). Psychiatric and familial predictors of transition times between smoking stages: Results from an offspring-of-twins study. Addictive Behaviors, 33, 235–251.
- Schoenfeld D (1983). Sample-size formula for the proportional-hazards regression model. Biometrics, 39, 499–503.
- Singer JD, & Willett JB (1991). Modeling the days of our lives: Using survival analysis when designing and analyzing longitudinal studies of duration and the timing of events. Psychological Bulletin, 110, 268–290.
- Singer JD, & Willett JB (1994). Modeling duration and the timing of events: Using survival analysis in long-term follow-up studies. In Friedman SL & Heywood HC (Eds.), Developmental Follow-Up: Concepts, Genres, Domains and Methods (pp. 315–330). San Diego, CA: Academic Press.
- Singer JD, & Willett JB (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford University Press.
- Slater MD, & Henry KL (2013). Prospective influence of music-related media exposure on adolescent substance-use initiation: A peer group mediation model. Journal of Health Communication, 18, 291–305.
- Thiébaut A, & Bénichou J (2004). Choice of time‐scale in Cox’s model analysis of epidemiologic cohort data: a simulation study. Statistics in Medicine, 23, 3803–3820.
- Van den Hout A, & Muniz‐Terrera G (2016). Joint models for discrete longitudinal outcomes in aging research. Journal of the Royal Statistical Society: Series C (Applied Statistics), 65, 167–186.
- Wagner FA, & Anthony JC (2002). Into the world of illegal drug use: exposure opportunity and other mechanisms linking the use of alcohol, tobacco, marijuana, and cocaine. American Journal of Epidemiology, 155, 918–925.
- Wellman RJ, & O’Loughlin J (2015). Data dilemmas and difficult decisions: on dealing with inconsistencies in self-reports. Journal of Adolescent Health, 56, 365–366.
- Wienke A (2010). Frailty models in survival analysis. Chapman and Hall/CRC.
- Willett JB, & Singer JD (1993). Investigating onset, cessation, relapse, and recovery: Why you should, and how you can, use discrete-time survival analysis to examine event occurrence. Journal of Consulting and Clinical Psychology, 61, 952–965.
- Willett JB, & Singer JD (1995). It’s déjà vu all over again: Using multiple-spell discrete-time survival analysis. Journal of Educational and Behavioral Statistics, 20, 41–67.
- Witkiewitz K, & Masyn KE (2008). Drinking trajectories following an initial lapse. Psychology of Addictive Behaviors, 22, 157.
- Yuan KH, & Bentler PM (1998). Robust mean and covariance structure analysis. British Journal of Mathematical and Statistical Psychology, 51, 63–88.