Author manuscript; available in PMC: 2015 May 1.
Published in final edited form as: Sociol Methods Res. 2013 Dec 13;43(2):248–279. doi: 10.1177/0049124113511594

Measuring Discontinuity in Binary Longitudinal Data: Applications to Drug Use Trajectories

Thor Whalen, Miriam Boeri
PMCID: PMC4190590  NIHMSID: NIHMS521467  PMID: 25309006

Abstract

Life course perspectives focus on the variation in trajectories, generally to identify differences in variation dynamics and classify trajectories accordingly. Our goal here is to develop methods to gauge the discontinuity characteristics that trajectories exhibit and to demonstrate how these measures facilitate analyses that evaluate, compare, aggregate, and classify behaviors based on the event discontinuity they manifest. We restrict ourselves here to binary event sequences, providing directions for extending the methods in future research. We illustrate our techniques with data on older drug users. Note, however, that these techniques are not restricted to drug use and can be applied to a wide range of trajectory types. We suggest that the innovative measures of discontinuity presented can be further developed to provide additional analytical tools in social science research and in future applications. Our novel discontinuity measure visualizations have the potential to be valuable assessment strategies for interventions, prevention efforts, and other social services utilizing life course data.


Sociological analysts often want to detect patterns in sequence data, such as those found in longitudinal/panel data and event-histories. In drug use research, the life course conceptual framework helps focus attention on trajectories, particularly transitions and turning points in a drug career (Laub and Sampson 2003; Hser et al. 2008; Hser, Longshore and Anglin 2007; Schulenberg, Maggs and O’Malley 2003). A trajectory is defined as “a pathway or line of development over the life span” (Hser et al. 2007:227). Much existing research aimed at analyzing trajectory characteristics identifies groups of trajectories according to these characteristics, and pinpoints causal features of these groups (Brecht et al. 2008; Chassin, Flora and King 2004; Laub and Sampson 1998; White, Pandina and Chen 2002; Yamaguchi 2008). Most of these studies gauge and differentiate trajectories on the basis of their growth characteristics, that is, the evolution of a continuous variable over time (Ellickson, Martino and Collins 2004; Hser, Huang, Brecht, Li and Evans 2008; Juon, Ensminger and Syndor 2002).

In this paper, we are interested in discrete changes in a trajectory, not particularly if the change is part of a continuous decreasing, persistent, or increasing trend, or belonging to a particular growth pattern. In order to be able to develop our analysis at a satisfactory level of detail without detracting from the principles involved, we will restrict our development to a sequence of binary events. Suggestions of possible extensions will be made in the conclusion of this article.

Hazard-rate models are often used to model the occurrences of a specific event in a binary trajectory. Usually, though, the occurrence of this event is considered to be static or independent of the current state of the trajectory. It makes sense to apply such models to accidental events, where the occurrence of the event depends little on its occurrence in the recent past. By contrast, in many behavioral contexts, the state of a trajectory variable depends greatly on its state in the recent past. For example, using a drug one year will largely influence the use of this drug the next year. This dynamic aspect is not taken into consideration in typical hazard-rate models.

Growth curve models take into consideration the dynamics of a trajectory, and are indeed a popular approach in the field of drug use. Both hazard-rate and growth models can, and have, been extended to describe time-dependent discrete variables: For example, classical growth models have been extended to what is referred to as latent-class (LC) growth modeling, which can be seen as a special case of mixture generalized linear modeling (Vermunt, Tran, and Magidson 2008; Andruff et al. 2009; Jung and Wickrama 2008; Muthen and Masyn 2005). Typically though, growth curve models are applied to continuous variables, and are usually concerned with capturing consistent trends over time, but less so with modeling cyclic transitions between discrete states (Ip, Jones, Heckert, Zhang and Gondolf 2010). Ip and coauthors used Markov-models to describe the transition between discrete variables of behavior. We too will base our analysis on such a perspective.

The purpose of this paper, though, is not to develop dynamical models of discrete trajectories, but to develop descriptive and inferential measures of the inherent discontinuity of trajectories based on a simple Markov model of its dynamics. Namely, we will measure latent discontinuity as the long-term expected transition rate, given a simple Markov model of the trajectory. Note too, that we are not concerned here with exploring the possible enablers of discontinuity based on a choice of explanatory variables, but rather in describing discontinuity itself. To further elaborate, although there are many influences on a trajectory, such as the drug trajectory we use as an example in this paper, we are not attempting to model the effect of the many factors affecting the trajectory but instead measure the discontinuity found in trajectories. Drug use trajectories are convenient examples we use to illustrate our measure of discontinuity.

The concept of discontinuity has been identified as fundamental to development research. Accurately assessing trajectory variability provides a better understanding for developing and implementing needed interventions, such as those needed to address problematic substance use among youth; and failure to recognize discontinuity patterns can lead to inaccurate conclusions (Schulenberg et al. 2003). While others focus on the meaning, definition, and function of discontinuity in a life course trajectory, in this paper we aim to evaluate the discontinuous characteristics of trajectories by developing measures that can be used to acquire insight into, and/or compare, the discontinuity of drug use of certain users and/or the typical use dynamics of certain drugs. In doing so we address the need to distinguish transitions from turning points (Hser et al. 2007; Schulenberg et al. 2003).

The techniques developed herein were conceived with drug use in mind, and drug use will be used to illustrate all the concepts and methods we develop based on retrospective drug use data we have collected. A brief overview of the methods used to collect the sample is found in Appendix A. The reader should bear in mind though, that the applications of these techniques are not restricted to drug use alone. In fact, these can be applied to any sequence of a binary variable, though they will be more appropriate if the value of the variable at time t + 1 is correlated to its value at time t. This property makes sense in the case of drug use given the dynamics of addiction, but this dynamic can also be found in various risk-behaviors, marital status, career sector, and other trajectories with binary states (see for example Elder 1985; Laub and Sampson 1998).

In the first section, we will introduce several descriptive statistics for binary trajectory discontinuity, culminating with four interrelated measures: relapse rate, remission rate, sample activity and sample discontinuity. In the next section, we will show how the corresponding latent parameters of behavior can be estimated from an observed trajectory. The following section will then use these developments to show how one can compare the latent behaviors of two trajectories, aggregate the behaviors of a collection of trajectories to assess global behavior and behavior heterogeneity, compare different global behaviors, and finally assess individual behavior by contrasting it with comparable behaviors of a same context.

1 DESCRIPTIVE STATISTICS FOR TRAJECTORY DISCONTINUITY

In (Authors, 2011), we introduced several measures to assess observed discontinuity of drug use trajectories; namely, transition count, transition rate, relapse rate, and remission rate. These descriptive statistics were used to pinpoint interesting patterns of trajectories of 92 respondents’ drug use of 10 different drugs. In this section, we present these again briefly since these constitute the basis of the discontinuity inferential statistics we will develop in the next section.

1.1 Binary Trajectory

A binary trajectory is a sequence $\tau = \tau_0 \cdots \tau_n$ of $n + 1$ binary time-indexed variables, each being in one of two states (0 or 1). In the case of our ongoing example of drug use trajectories, each element of the sequence represents a period of one year of a respondent’s life until the time of the interview; 0 will represent non-use (or “non-active,” which will mean the respondent never took the drug that year) and 1 will represent use (or “active,” which will mean the respondent took the drug at least once that year). For example, consider the following trajectory for tobacco of a respondent (#026) of our study (the left-most digit corresponds to age 0, and digits are grouped in fives to facilitate readability):

  • τ = 00000 00000 11111 11111 11111 11111 11111 11111 11111 11111 10011 11

This trajectory shows that 026 started smoking at age 10, and did not stop until age 51, when he stopped for two years and started again at age 53, and continued until the time of the interview (at age 56). A 1-year interval is a time period used in previous retrospective studies (Hser et al. 2008; Nurco et al. 1975).

Also, the definitions of the 0 and 1 states were chosen so that these may represent two significantly different states. One could argue for an alternate definition, and indeed both the definition of the states and the periods they cover are relevant to the meaningfulness of the results, but they are of little concern to the actual methods we will develop herein. Again, the methods we will develop can be applied to any binary sequence, but the choice of what the sequence states represent may be more or less appropriate, which can be the object of another analysis.

1.2 Trajectory Discontinuity

The images in Figure 1 exhibit the trajectories of two actual users in the study for each of the 10 drugs in the survey. For this paper we look at yearly use and non-use of each of the following drugs: tobacco (TOB); marijuana (MAR); alcohol (ALC); hallucinogens (HAL); cocaine (COC); crack cocaine (CRK); heroin (HRN); amphetamine (AMP); methamphetamine (MET) and prescription pill misuse (PRP).

Figure 1. Drug use trajectories for two respondents. A row represents a trajectory for a given drug over the life of the individual. A dark cell (for a given age on the x-axis and drug on the y-axis) represents active use (1) and a light cell represents non-active (0).

As shown in Figure 1, the first user (respondent 024) exhibits long stretches of use and long stretches of non-use of every drug. Contrast this with the use trajectories of the second user (respondent 026), which reveal several periods of activity or non-activity for all drugs he used except CRK. It appears that the second respondent exhibits more discontinuous use than the first. We wish to construct measures of trajectories that gauge such discontinuous aspects of a trajectory.

Of special interest here are the pairs $\tau_i\tau_{i+1}$ of consecutive states of a trajectory. For a given trajectory $\tau$ and $x, y \in \{0, 1\}$ we let

$$n_{xy} = \left|\{\tau_i\tau_{i+1} : \tau_i = x \text{ and } \tau_{i+1} = y\}\right| \qquad (1)$$

(where $0 \le i < |\tau| - 1$, since the states are indexed from 0) be the number of times an x is followed by a y in the trajectory τ. Further, we let $n = n_{00} + n_{01} + n_{10} + n_{11}$; note that n is equal to the length of τ minus one.
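The pair counts of equation (1) can be obtained by sliding over consecutive states. The sketch below is an illustration only: the function name `pair_counts` and the encoding of a trajectory as a string of digits are assumptions made for this example, not part of the study's own tooling. It is applied to respondent 026's tobacco trajectory quoted earlier.

```python
from collections import Counter


def pair_counts(trajectory: str) -> dict:
    """Count the consecutive state pairs n_xy of a binary trajectory string."""
    states = trajectory.replace(" ", "")  # spaces only group digits for readability
    counts = Counter(a + b for a, b in zip(states, states[1:]))
    return {xy: counts[xy] for xy in ("00", "01", "10", "11")}


# Respondent 026's tobacco trajectory, as printed in Section 1.1.
tobacco_026 = ("00000 00000 11111 11111 11111 11111 11111 11111 "
               "11111 11111 10011 11")
counts = pair_counts(tobacco_026)
n = sum(counts.values())
print(counts)  # {'00': 10, '01': 2, '10': 1, '11': 43}
print(n)       # 56, the trajectory length (57) minus one
```

Keeping the four counts in one dictionary is a convenience: every statistic introduced afterwards is a function of these four numbers.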

1.3 Transition Count

If τi ≠ τi+1 we say there was a transition: 10 is a transition from active to non-active, which we will call remission, and 01 is a transition from non-active to active, which we will call relapse—though this term is not completely suitable for the first such transition (we don’t relapse into drug use the first time we use it), but more so for all subsequent 01-transitions. We start with a straightforward measure that we’ll refer to as transition count TC = n01 + n10, which is simply the number of transitions we observe in the trajectory. For example, if τ = 00000 00111 10000 1111, TC(τ) = 3.

The transition count encodes several noteworthy properties of trajectories: (1) if the TC is 0, it means that the individual never used that particular drug (assuming all drug trajectories start with “non-active” state 0); (2) a TC of 1 means that the individual started and never stopped (at the granularity of a year) the drug; (3) a TC of 2 means that the individual used the drug at some point, but then stopped indefinitely (that is, until the time of the interview); (4) an even TC means that the individual is NOT currently (at the time of the interview) using the drug; and (5) an odd TC means that the individual is currently using the particular drug.

Though elementary, these aspects are characteristic of many trajectories. The transition count provides a single descriptive quantity that captures several of these characteristics and can therefore be a useful statistic to diagnose the data and pinpoint interesting patterns. Moreover, the number of transitions is the basis of the measures presented here, since we are gauging discontinuity as having to do with the frequency of these transitions.
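The properties listed above can be read off mechanically from the transition count. The following is a minimal sketch, assuming (as the text does) that a trajectory starts in the non-active state 0; the helper names `transition_count` and `describe` are hypothetical.

```python
def transition_count(trajectory: str) -> int:
    """Number of positions where consecutive yearly states differ (TC)."""
    states = trajectory.replace(" ", "")
    return sum(a != b for a, b in zip(states, states[1:]))


def describe(trajectory: str) -> str:
    """Read usage status off TC, assuming the trajectory starts in state 0."""
    tc = transition_count(trajectory)
    if tc == 0:
        return "never used"
    if tc == 1:
        return "started and never stopped"
    # Odd TC ends in state 1 (using); even TC ends in state 0 (not using).
    return "currently using" if tc % 2 == 1 else "not currently using"


example = "00000 00111 10000 1111"
print(transition_count(example), describe(example))  # 3 currently using
```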

1.4 Transition Rate

The transition count has some shortcomings in emulating how one might commonly compare the discontinuity of several trajectories. One critical inadequacy is that transition count alone does not take into account the amount of time the user has had to accrue these transitions, and this could be misleading. One straightforward way of integrating time as the duration of the respondent’s career with the drug into a transition count is to divide the latter by the number of years since the year of first use; we will call this quantity the transition rate.

Given a trajectory τ, let |τ| be the length of the trajectory (the number of 0s and 1s) and let τ̇ be the part of the trajectory starting at the year of first use, which we call the career; we let τ̇ = ∅ if TC(τ) = 0, that is, if the respondent never used the drug in question. We define the transition rate TR of a trajectory τ with TC(τ) ≠ 0 to be

$$TR(\tau) = \frac{TC(\tau) - 1}{|\dot\tau| - 1}.$$

We subtract 1 from both TC(τ) and |τ̇| because we are counting the transitions and the years that follow the year of first use: the initial transition into use is excluded, and a career of |τ̇| years contains |τ̇| − 1 consecutive pairs. The transition rate gives us a sense of how often transitions occur during the respondent’s career with this drug, that is, once he/she has been exposed to the drug.

From now on we will, by default, consider only trajectories with TC ≠ 0, that is, trajectories of users that used the drug at least once. Moreover, we will consider the part of the trajectory starting at the year of first use: We call this the career trajectory. Unless otherwise specified we will consider only career trajectories, therefore τ = τ̇, so τ will always start with a 1. The definition of the transition rate for a career trajectory τ becomes

$$TR(\tau) = \frac{TC(\tau)}{|\tau| - 1} = \frac{n_{01} + n_{10}}{n} = \frac{n_{01} + n_{10}}{n_{00} + n_{01} + n_{10} + n_{11}}. \qquad (2)$$
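As an illustration of the career trajectory and of equation (2), the sketch below extracts the career from a full trajectory and divides its transitions by its number of consecutive pairs. The function names are hypothetical, and returning `None` when no pair exists is a choice made for this example.

```python
from typing import Optional


def career(trajectory: str) -> str:
    """Part of the trajectory from the first year of use onward ('' if never used)."""
    states = trajectory.replace(" ", "")
    first_use = states.find("1")
    return states[first_use:] if first_use >= 0 else ""


def transition_rate(trajectory: str) -> Optional[float]:
    """TR of equation (2), computed on the career trajectory."""
    c = career(trajectory)
    if len(c) < 2:  # no consecutive pair to count
        return None
    transitions = sum(a != b for a, b in zip(c, c[1:]))
    return transitions / (len(c) - 1)


print(career("00000 00111 10000 1111"))           # 111100001111
print(transition_rate("00000 00111 10000 1111"))  # 2 transitions over 11 pairs
print(transition_rate("11111 01110"))             # 3 transitions over 9 pairs
```

For the first example the full trajectory has TC = 3 and a career of 12 years, so (TC − 1)/(|τ̇| − 1) = 2/11, which agrees with counting transitions directly on the career.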

1.5 Remission Rate and Relapse Rate

Consider the trajectories τ = 11111 01110 and τ′ = 10000 01100 (though we are merely considering these to illustrate a point, these are patterns found at different points in the trajectories of a number of the participants). One may verify that these have the same TC and TR, namely TC = 3 and TR = 0.3333, yet τ exhibits markedly more drug use than τ′. This is so because the transition count and rate capture the tendency to transition from one drug use state to another but do not indicate whether the transitions are relapses or remissions; they aggregate the two without taking into account the respective durations of the periods they initiate.

We therefore introduce two new measures: (1) the relapse rate $\bar r = \frac{n_{01}}{n_{01} + n_{00}}$, the proportion of non-using years that were followed by a relapse into a using year, and (2) the remission rate $\bar R = \frac{n_{10}}{n_{10} + n_{11}}$, the proportion of using years that were followed by a remission into a non-using year. These measures are not defined for empty career trajectories, where a respondent never took the drug. Moreover, the relapse rate is not defined if a user never stopped using a drug, since then both $n_{01}$ and $n_{00}$ are zero; indeed, we cannot measure the tendency of a former user to relapse back into drug use if the respondent never quit the drug in the first place. Considering the two trajectories introduced in the last paragraph, the reader may verify that $\bar r(\tau) = 100\%$, $\bar R(\tau) = 25\%$, $\bar r(\tau') \approx 16.7\%$, and $\bar R(\tau') \approx 66.7\%$. Note that $1 - \bar r$ indicates the tendency to remain in state 0 and $1 - \bar R$ the tendency to remain in state 1. This kind of model of a sequence of states, which specifies the rates of all possible pairs of consecutive states, is known as a Markov process. The Markov processes for τ and τ′ are illustrated by the digraphs in Figure 2.

Figure 2. Digraphs representing the Markov processes for τ and τ′.
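The relapse and remission rates of the two example trajectories can be checked with a few lines of code. This is a sketch under the definitions above (relapse rate n01/(n01 + n00), remission rate n10/(n10 + n11)); the function name and the use of `None` for an undefined rate are assumptions of the example.

```python
from typing import Optional, Tuple


def relapse_remission(career: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (relapse rate, remission rate); None marks an undefined rate."""
    states = career.replace(" ", "")
    pairs = [a + b for a, b in zip(states, states[1:])]
    n00, n01 = pairs.count("00"), pairs.count("01")
    n10, n11 = pairs.count("10"), pairs.count("11")
    relapse = n01 / (n01 + n00) if n01 + n00 else None
    remission = n10 / (n10 + n11) if n10 + n11 else None
    return relapse, remission


print(relapse_remission("11111 01110"))  # (1.0, 0.25)
print(relapse_remission("10000 01100"))  # relapse 1/6, remission 2/3
print(relapse_remission("11111"))        # (None, 0.0): the user never stopped
```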

The purpose of this paper is to measure discontinuity not use rate—that is, the rate at which one transitions from one state to another, not the relative proportion of being in one state or another. The reason for introducing relapse and remission rate rather than contenting ourselves with the transition rate is that in many contexts transitions from 0 to 1 and from 1 to 0 have dissimilar dynamics, which are hidden by the transition rate. If we were to undertake inferential analysis based on the transition rate alone, therefore, any existing asymmetries in the transition dynamics would affect our conclusions. Relapse and remission rates evaluate both types of transitions separately, so they will serve as a good basis for gauging discontinuity.

1.6 Activity and Discontinuity

The relapse and remission rates give us, together, a sense of both how much use and non-use the trajectory exhibits, which we will call activity, and how much discontinuity the trajectory exhibits. We propose two measures to gauge activity and discontinuity using $\bar r$, the relapse rate, and $\bar R$, the remission rate: Let $\bar a = \frac{\bar r}{\bar r + \bar R}$ be the (sample) activity of the trajectory, and $\bar d = \frac{2\bar r\bar R}{\bar r + \bar R}$ be the (sample) discontinuity of the trajectory.¹ Note that the discontinuity is the harmonic mean of the relapse and remission rates. The rationale behind these measures is explained in Appendix B, but requires the inferential statistics we will develop in the next section. Suffice it to note here that $\bar d$ is the expected long-term transition rate and $\bar a$ is the expected long-term proportion of active (use) years.
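To make the two statistics concrete, the sketch below evaluates the sample activity r̄/(r̄ + R̄) and the sample discontinuity 2r̄R̄/(r̄ + R̄) for the two example trajectories of Section 1.5, whose rates are (100%, 25%) and (1/6, 2/3). Exact fractions are used only to keep the arithmetic visible; the function names are hypothetical.

```python
from fractions import Fraction


def activity(r, R):
    """Sample activity: r / (r + R)."""
    return r / (r + R)


def discontinuity(r, R):
    """Sample discontinuity: harmonic mean of r and R, 2rR / (r + R)."""
    return 2 * r * R / (r + R)


# (relapse rate, remission rate) of the career trajectories
# 11111 01110 and 10000 01100.
tau = (Fraction(1, 1), Fraction(1, 4))
tau_prime = (Fraction(1, 6), Fraction(2, 3))

print(activity(*tau), discontinuity(*tau))              # 4/5 2/5
print(activity(*tau_prime), discontinuity(*tau_prime))  # 1/5 4/15
```

The two trajectories share the same transition count and rate, yet their activities (0.8 against 0.2) separate them clearly.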

Each of the four statistics r̄, R̄, ā and d̄ conveys an interesting aspect of the trajectory and the behavior underlying it, but the four are tightly interrelated. Indeed, any two of these completely determine the other two. Table 1.6 shows equations relating each of these four statistics to any two of the remaining ones. For example, $\bar R = \frac{\bar d}{2\bar a}$ shows that the remission rate is half of the discontinuity divided by the activity. Since $1 - \bar a$, which we could call inactivity, is the expected long-term proportion of non-active years, the relapse rate $\bar r = \frac{\bar d}{2(1 - \bar a)}$ is half of the discontinuity divided by the inactivity. Also, $\bar r = \bar R\,\frac{\bar a}{1 - \bar a}$ and $\bar R = \bar r\,\frac{1 - \bar a}{\bar a}$ show that the ratio of activity to inactivity is the factor relating $\bar r$ to $\bar R$.
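These relations follow from substituting the definitions of sample activity and discontinuity given in this section, $\bar a = \bar r/(\bar r + \bar R)$ and $\bar d = 2\bar r\bar R/(\bar r + \bar R)$. A short worked derivation, offered as a check rather than as the authors' own presentation:

```latex
\begin{align*}
1 - \bar a &= \frac{\bar R}{\bar r + \bar R}, \\
\frac{\bar d}{2\bar a}
  &= \frac{2\bar r\bar R}{\bar r + \bar R}\cdot\frac{\bar r + \bar R}{2\bar r}
   = \bar R, \\
\frac{\bar d}{2(1 - \bar a)}
  &= \frac{2\bar r\bar R}{\bar r + \bar R}\cdot\frac{\bar r + \bar R}{2\bar R}
   = \bar r, \\
\frac{\bar a}{1 - \bar a} &= \frac{\bar r}{\bar R}
  \quad\Longrightarrow\quad
  \bar r = \bar R\,\frac{\bar a}{1 - \bar a}.
\end{align*}
```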

As a consequence we can choose any two statistics to analyze according to the particular facets we want to focus on, while still being able to derive the other perspectives on behavior seamlessly. In this paper though, we will usually examine relapse along with remission, and activity along with discontinuity. Figure 3 exhibits the (r̄, R̄) and (ā, d̄) of the use of four drugs of an actual respondent (005) of our study, which we will use to illustrate characteristics of relapse, remission, activity and discontinuity. In the relapse/remission graph we included black lines indicating different ā levels (for ā = 0.1, 0.2, …, 0.9) and blue lines indicating different d̄ levels (for d̄ = 0.1, 0.2, …, 0.9). Similarly, the activity/discontinuity graph includes black and blue lines indicating respectively different r̄ and R̄ levels.

Figure 3. On the left, relapse and remission rates of respondent 005’s use of CRK, COC, HER, and MAR are plotted. The lines show relapse/remission pairs that have the same activity (in black) and the same discontinuity (in blue). The corresponding activity and discontinuity for the four drugs are plotted on the right. The lines here show activity/discontinuity pairs that have the same relapse (in black) and the same remission (in blue).

The triangle appearing in the right activity/discontinuity graph of Figure 3 delimits the region of feasible (activity, discontinuity) pairs. Indeed, the fact (see Table 2) that $\bar a = 1 - \frac{\bar d}{2\bar r} = \frac{\bar d}{2\bar R}$ and that $\bar r, \bar R \le 1$ implies that $\frac{\bar d}{2} \le \bar a \le 1 - \frac{\bar d}{2}$. Informally, this reveals the fact that as discontinuity increases, the range of possible activity narrows, since discontinuous behavior allows the respondent neither to be active too often nor to be inactive too often. If the discontinuity were 1, its highest possible value, the respondent would transition every year, necessarily implying as many active as non-active years and therefore an activity of 0.5.
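The triangular feasible region can be checked numerically. The sketch below, an illustration with hypothetical function names, sweeps a grid of relapse and remission rates in (0, 1], maps each pair to (activity, discontinuity) using r/(r + R) and 2rR/(r + R), and tests the bounds d/2 ≤ a ≤ 1 − d/2.

```python
def activity_discontinuity(r: float, R: float):
    """Map (relapse rate, remission rate) to (activity, discontinuity)."""
    return r / (r + R), 2 * r * R / (r + R)


def in_feasible_triangle(a: float, d: float, tol: float = 1e-12) -> bool:
    """Check d/2 <= a <= 1 - d/2, with a small tolerance for rounding."""
    return d / 2 - tol <= a <= 1 - d / 2 + tol


grid = [k / 20 for k in range(1, 21)]  # rates 0.05, 0.10, ..., 1.00
all_feasible = all(
    in_feasible_triangle(*activity_discontinuity(r, R))
    for r in grid
    for R in grid
)
print(all_feasible)                      # True
print(activity_discontinuity(1.0, 1.0))  # (0.5, 1.0): the apex of the triangle
```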

Table 2. Mean overall behavior measures (in %)

| Measure | TOB | ALC | MAR | PRP | COC | CRK | HER | MET |
|---|---|---|---|---|---|---|---|---|
| Mean relapse propensity r_D | 28.37 | 22.31 | 10.96 | 14.00 | 11.12 | 27.00 | 15.61 | 22.69 |
| Mean remission propensity R_D | 4.83 | 8.34 | 11.56 | 21.53 | 18.99 | 20.29 | 21.95 | 30.15 |
| Mean latent activity a_D | 85.5 | 72.8 | 48.7 | 39.4 | 36.9 | 57.1 | 41.6 | 42.9 |
| Mean latent discontinuity d_D | 8.3 | 12.1 | 11.3 | 17.0 | 14.0 | 23.2 | 18.2 | 25.9 |

Note the null activity and discontinuity for CRK in Figure 3. This is so because the respondent used the drug continuously for a period, and then stopped until the time of the interview (i.e. TC = 2), so the relapse rate is zero, implying both null activity and null discontinuity. This is a degenerate case of the activity and discontinuity statistics, which reveals the distinction between these and the observed transition rate and proportion of active years. One should remember that ā and d̄ are long-term estimates of the expectation of their observed counterparts, and indeed, if the true probability of 005’s relapse were zero, the respondent would never relapse, and therefore her transition rate and ratio of use years over total career years would both keep decreasing with every new year. It is in this sense that the long-term estimates ā and d̄ are null.

Yet an observed relapse rate of 0 does not actually imply a relapse probability of zero. The link between observed rates and underlying probabilities will be examined in the next section, and better estimates of the underlying activity and discontinuity thus derived.

2 INFERENTIAL STATISTICS

The relapse and remission rates are maximum likelihood estimates of the underlying probabilities of transitioning from non-active to active and from active to non-active, respectively. As we have mentioned earlier, this type of model of a sequence of states is typically known as a Markov process or Markov chain. It is defined as a sequence of states having the property that the next state depends only on the current state; this property is called the Markov property. A Markov process models a sequence of states by specifying an initial state as well as the probability of the next state of the sequence conditional on the current state. In our case of drug use, we use two states (active (1) and non-active (0)) and consider the sub-trajectory starting at the age of onset (the first 1 of the sequence), thereby setting the initial state to be 1 (active).
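A two-state Markov process of this kind is easy to simulate, which gives a feel for how the observed rates relate to the underlying probabilities. The sketch below is illustrative: the relapse and remission probabilities (0.2 and 0.1), the career length, and the seed are arbitrary values chosen for the example, not estimates from the study.

```python
import random


def simulate_career(r: float, R: float, years: int, rng: random.Random) -> str:
    """Simulate a career trajectory that starts active (1).

    r is the probability of moving 0 -> 1, R the probability of moving 1 -> 0.
    """
    state = 1
    states = [state]
    for _ in range(years - 1):
        p_active = r if state == 0 else 1.0 - R
        state = 1 if rng.random() < p_active else 0
        states.append(state)
    return "".join(str(s) for s in states)


rng = random.Random(0)  # fixed seed so the run is repeatable
career = simulate_career(r=0.2, R=0.1, years=50_000, rng=rng)

pairs = [a + b for a, b in zip(career, career[1:])]
n00, n01, n10, n11 = (pairs.count(p) for p in ("00", "01", "10", "11"))
r_hat = n01 / (n01 + n00)            # observed relapse rate
R_hat = n10 / (n10 + n11)            # observed remission rate
transition_rate = (n01 + n10) / len(pairs)
active_share = career.count("1") / len(career)
print(round(r_hat, 3), round(R_hat, 3))
print(round(transition_rate, 3), round(active_share, 3))
```

On a long simulated career the observed rates should settle near the chosen probabilities, and the observed transition rate and share of active years should approach 2rR/(r + R) ≈ 0.133 and r/(r + R) ≈ 0.667, the long-term quantities that the discontinuity and activity measures are meant to capture.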

We will assume that the trajectories we observe are generated by such a latent Markov process, parameterized by underlying relapse and remission probabilities which we will now estimate. The relapse and remission rates are point estimates, but in order to properly infer any conclusions based on our data, we should acquire a probability distribution of the “true” underlying quantities we are estimating, conditional on the data we used for these estimates.

2.1 Sampling Distribution of r and R

Let us consider the probability r of a relapse, which we’ll call relapse propensity in order to avoid some confusion that repeated uses of the word probability could cause. What we are assuming here is that, for a given respondent and drug, there is an underlying probability r that, if not using the drug in a given year, the respondent will use the drug again the following year. That is, the state of the year following a non-using year is determined by a Bernoulli trial² with the probability $r = \mathrm{prob}(\tau_{i+1} = 1 \mid \tau_i = 0)$ of being in state 1 and $1 - r = \mathrm{prob}(\tau_{i+1} = 0 \mid \tau_i = 0)$ of remaining in state 0.

The sampling distribution is therefore prob(r | τ), the posterior probability distribution of r given the evidence which the trajectory provides. Since we assume the Markov property, the evidence the trajectory provides is entirely given by $n_{01}$ and $n_{00}$, and Bayes’ theorem gives us

$$\mathrm{prob}(r \mid \tau) = \mathrm{prob}(r \mid n_{01}, n_{00}) = \frac{\mathrm{prob}(n_{01}, n_{00} \mid r)\,\mathrm{prob}(r)}{\mathrm{prob}(n_{01}, n_{00})}.$$

Let Beta(α, β) be the beta distribution, the conjugate prior of the binomial and Bernoulli distributions, with parameters α and β. We will assume no prior knowledge of r, taking the uniform distribution Beta(1, 1) to be the prior of our inference. The posterior distribution is then prob(r | τ) = Beta(n01 + 1, n00 + 1).

The derivation is similar for the remission propensity R; in this case we get prob(R | τ) = Beta(n10 + 1, n11 + 1). Together, the relapse and remission posterior propensity distributions yield³

$$\mathrm{prob}(r, R \mid \tau) = \mathrm{Beta}(n_{01} + 1, n_{00} + 1) \times \mathrm{Beta}(n_{10} + 1, n_{11} + 1) \qquad (3)$$

Figure 4 illustrates the Bayesian inference principle we use to model individual trajectories, and Figure 5 shows the relapse and remission propensity distributions inferred from an actual trajectory of respondent 005 in our study: the career trajectory

  • τ = 11111 11111 11111 10000 00110 1100

of 005’s use of HER. We have n01 = 3, n00 = 7, n10 = 3, and n11 = 17. The top left and bottom right graphs show prob(R | τ) and prob(r | τ) respectively.⁴ The top right and bottom left show prob(r, R | τ) in two different ways. The top right is a density heat map, where color shades from black (low probability density) to white (high probability density). The bottom left shows filled contours determined by percentiles 95%, 75%, 50%, and 25% of total probability weight. These should be interpreted as follows: Given the evidence, there is a 95% chance that (r, R) lies inside the first, lightest region (which contains the other regions), a 75% chance that it is within the next, slightly darker region, a 50% chance that it is within the next, and finally a 25% chance that it is within the last, darkest region.
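The posterior of equation (3) is straightforward to work with using only the Python standard library. The sketch below takes the four counts reported in the text for respondent 005's HER career (n01 = 3, n00 = 7, n10 = 3, n11 = 17), builds the Beta parameters under the uniform Beta(1, 1) prior, and draws joint samples with `random.betavariate`. The function names and the seed are assumptions of the example.

```python
import random


def posterior_params(n01: int, n00: int, n10: int, n11: int):
    """Beta parameters of the r and R posteriors under uniform Beta(1, 1) priors."""
    return (n01 + 1, n00 + 1), (n10 + 1, n11 + 1)


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def sample_joint(params, size: int, rng: random.Random):
    """Draw (r, R) pairs; r and R are sampled independently, as in equation (3)."""
    (a_r, b_r), (a_R, b_R) = params
    return [(rng.betavariate(a_r, b_r), rng.betavariate(a_R, b_R))
            for _ in range(size)]


# Counts reported in the text for respondent 005's HER career.
her = posterior_params(n01=3, n00=7, n10=3, n11=17)
print(her)  # ((4, 8), (4, 18))
print(beta_mean(*her[0]), beta_mean(*her[1]))  # posterior means of r and R
draws = sample_joint(her, size=5, rng=random.Random(0))
```

Note that the posterior mean of r, 4/12, differs slightly from the observed rate 3/10 because of the prior; this is also why an observed relapse rate of zero does not translate into a zero relapse propensity.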

Figure 4. Markov model of an individual trajectory. We assume the trajectory was generated by a Markov process with parameters r, the relapse propensity, and R, the remission propensity, and estimate these parameters from the observed trajectory using Bayesian inference.

Figure 5. Relapse and remission propensity posterior distributions for 005’s HER career trajectory.

This sampling distribution prob(r, R | τ), the probability of the whereabouts of the (r, R) pair given an observed trajectory, will be the basis of all the inferential statistical analysis we engage in. We define $\mathbf{r} = r \mapsto \mathrm{prob}(r \mid \tau)$ and $\mathbf{R} = R \mapsto \mathrm{prob}(R \mid \tau)$ to be the random variables linking the latent r and R values to their probabilities given the observations, and $(\mathbf{r}, \mathbf{R}) = (r, R) \mapsto \mathrm{prob}(r, R \mid \tau)$ the corresponding joint random variable. Notice the use of boldface letters to represent the random variables corresponding to the unobserved parameters. These random variables encapsulate all the knowledge we have about the underlying propensities, and therefore about the latent behavior we are investigating.

2.2 Sampling Distribution of a and d

For a given relapse propensity r and remission propensity R of a trajectory, we define the latent activity $a = \frac{r}{r + R}$ and latent discontinuity $d = \frac{2rR}{r + R}$ of the trajectory. The rationale behind these definitions, somewhat mathematically involved, is presented in Appendix B. Since our knowledge of r and R is encapsulated in the distributions of the random variables $\mathbf{r}$ and $\mathbf{R}$, we can get the probability distributions $\mathbf{a}$ and $\mathbf{d}$ of latent activity and discontinuity directly, by operating on the random variables:

$$\mathbf{a} = \frac{\mathbf{r}}{\mathbf{r} + \mathbf{R}}, \quad\text{and} \qquad (4)$$

$$\mathbf{d} = \frac{2\mathbf{r}\mathbf{R}}{\mathbf{r} + \mathbf{R}}. \qquad (5)$$

Using (4) and (5), where $(\mathbf{r}, \mathbf{R})$ is given by the sampling distribution specified in (3), we can compute the sampling distribution of activity and discontinuity. For example, in the case of 005’s HER career trajectory, whose (r, R) probability density was displayed in Figure 5, we compute $(\mathbf{a}, \mathbf{d})$, displayed in Figure 6. Note that the distributions of r and R can be computed analytically, since their probability densities are directly given by the beta distribution, but those of a and d have no closed analytical form and were therefore computed numerically, using Monte Carlo methods.
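A Monte Carlo computation of this kind can be sketched as follows. This is one plausible implementation, not necessarily the procedure the authors used: it draws r and R from their Beta posteriors, transforms each draw with equations (4) and (5), and summarizes the resulting samples. The counts are those reported for 005's HER career; the sample size, seed, and helper names are assumptions.

```python
import random
import statistics


def sample_activity_discontinuity(n01, n00, n10, n11, size, rng):
    """Draw (a, d) pairs by transforming Beta-posterior draws of (r, R)."""
    draws = []
    for _ in range(size):
        r = rng.betavariate(n01 + 1, n00 + 1)
        R = rng.betavariate(n10 + 1, n11 + 1)
        draws.append((r / (r + R), 2 * r * R / (r + R)))
    return draws


def central_interval(values, mass=0.95):
    """Empirical interval holding roughly the central `mass` of the samples."""
    ordered = sorted(values)
    lo = int(len(ordered) * (1 - mass) / 2)
    hi = int(len(ordered) * (1 + mass) / 2) - 1
    return ordered[lo], ordered[hi]


rng = random.Random(0)
draws = sample_activity_discontinuity(3, 7, 3, 17, size=20_000, rng=rng)
a_samples = [a for a, _ in draws]
d_samples = [d for _, d in draws]
mean_a = statistics.mean(a_samples)
mean_d = statistics.mean(d_samples)
print(round(mean_a, 3), central_interval(a_samples))
print(round(mean_d, 3), central_interval(d_samples))
```

A two-dimensional histogram of `draws` would give an approximation of the kind of joint density that Figure 6 displays.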

Figure 6. 005’s HER career activity and discontinuity. The left figure depicts the probability density function as a heat map, the right as filled contours circumscribing respectively (from lighter to darker filled contours) 95%, 75%, 50%, and 25% of the probability weight.

3 APPLICATIONS

In our present analysis, we represent behavior using the random variables for relapse propensity r, remission propensity R, latent activity a, and/or latent discontinuity d. The information we have about these parameters (given the observed trajectory) is totally contained within the probability densities of these random variables. As we have seen earlier, any two of our four parameters determine the other two, so we can use any pair of these to describe the behavior totally. We will choose to use activity and discontinuity, letting the random variable b = (a, d) embody the latent behavior we wish to examine. We now show how these random variables can be used to answer questions about drug use behavior.

Namely, we address the following matters: (a) how to compare two or several behaviors, (b) how to assess global behavior and behavior heterogeneity, (c) how to compare the behaviors brought about by different drugs, and (d) how to factor in global behavior when assessing individual behavior.

3.1 Comparing Several Behaviors

We are concerned here with comparing the latent behaviors exhibited by different trajectories. Say, for example, that we want to compare the discontinuity of 005’s use of COC with that of CRK. Given the career trajectories $\tau_{COC}$ = 11111110011110000000 and $\tau_{CRK}$ = 111100000000000000000000, we compute the sampling distributions of relapse propensity and remission propensity for these two drugs, and using (5) we get the probability density functions of $\mathbf{d}_{COC}$ and $\mathbf{d}_{CRK}$, the latent discontinuities for the two drugs, which are shown in Figure 7. Comparing $\mathbf{d}_{COC}$ and $\mathbf{d}_{CRK}$ comes down to examining the distribution of their difference $\mathbf{d}_{COC} - \mathbf{d}_{CRK}$, shown in Figure 9. Indeed, this distribution gives us the probability density for every possible value of $d_{COC} - d_{CRK}$. Therefore we can compute the probability that the difference is positive, which means that $d_{COC} > d_{CRK}$, the probability that the difference is negative, which means that $d_{CRK} > d_{COC}$, and also evaluate the significance of any other hypothesis about the difference in discontinuity. We see that the probability that $d_{COC} - d_{CRK} > 0$ is 91.9% (87.0% + 4.9%). Therefore, if we were to take 10% as the significance level, we could conclude that 005’s latent discontinuity for COC is indeed higher than that for CRK.
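The comparison described above can be approximated by sampling. The sketch below counts the pairs in the two career trajectories as printed, draws latent discontinuities for each drug from the Beta posteriors, and estimates the probability that the COC discontinuity exceeds the CRK one. The helper names, sample size, and seed are assumptions, and a Monte Carlo estimate will vary with them, so it should only be expected to fall in the neighborhood of the 91.9% quoted in the text.

```python
import random


def pair_counts(career: str) -> dict:
    """Counts of consecutive state pairs in a career trajectory."""
    pairs = [a + b for a, b in zip(career, career[1:])]
    return {p: pairs.count(p) for p in ("00", "01", "10", "11")}


def sample_discontinuity(career: str, size: int, rng: random.Random):
    """Draw latent discontinuities d = 2rR / (r + R) from the posteriors of r, R."""
    n = pair_counts(career)
    out = []
    for _ in range(size):
        r = rng.betavariate(n["01"] + 1, n["00"] + 1)
        R = rng.betavariate(n["10"] + 1, n["11"] + 1)
        out.append(2 * r * R / (r + R))
    return out


tau_coc = "11111110011110000000"
tau_crk = "111100000000000000000000"
rng = random.Random(0)
size = 20_000
d_coc = sample_discontinuity(tau_coc, size, rng)
d_crk = sample_discontinuity(tau_crk, size, rng)

# Share of paired draws in which the COC discontinuity is the larger one.
p_coc_higher = sum(x > y for x, y in zip(d_coc, d_crk)) / size
print(round(p_coc_higher, 3))
```

Pairing independent draws and comparing them is equivalent to sampling the difference of the two random variables and looking at its sign.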

Figure 7. The latent activity and discontinuity (distributions) of 005’s behavior with three drugs.

Figure 9. Comparison of the behavior of 005 with HER, CRK, and COC.

The same method can be applied to compare relapse propensity, remission propensity, or latent activity. Therefore we can compare our two-dimensional latent behaviors as well: the comparison of the behaviors bτ(1) and bτ(2) of two trajectories τ(1) and τ(2) is entirely determined by the distribution of the difference bτ(1) − bτ(2).
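The comparison of two latent discontinuities can be sketched with a small Monte Carlo simulation. This is only an illustration under stated assumptions: since equation (3) is not reproduced in this section, we assume that a uniform (non-informative) prior yields independent Beta posteriors for r and R, namely r ~ Beta(n01 + 1, n00 + 1) and R ~ Beta(n10 + 1, n11 + 1). The function names are hypothetical, and the estimate need not reproduce the 91.9% reported above.

```python
import random


def transition_counts(trajectory):
    """Count the four transition types in a binary trajectory string."""
    counts = {"00": 0, "01": 0, "10": 0, "11": 0}
    for prev, nxt in zip(trajectory, trajectory[1:]):
        counts[prev + nxt] += 1
    return counts


def sample_discontinuity(trajectory, n_samples, rng):
    """Draw latent discontinuity values d = 2rR / (r + R).

    Assumption: Beta posteriors under a uniform prior on r and R.
    """
    c = transition_counts(trajectory)
    samples = []
    for _ in range(n_samples):
        r = rng.betavariate(c["01"] + 1, c["00"] + 1)  # relapse propensity
        R = rng.betavariate(c["10"] + 1, c["11"] + 1)  # remission propensity
        samples.append(2 * r * R / (r + R))
    return samples


rng = random.Random(0)  # fixed seed so the sketch is repeatable
d_coc = sample_discontinuity("11111110011110000000", 20000, rng)
d_crk = sample_discontinuity("111100000000000000000000", 20000, rng)

# Share of paired draws in which the difference d_COC - d_CRK is positive.
p_coc_higher = sum(x > y for x, y in zip(d_coc, d_crk)) / len(d_coc)
print(f"Estimated prob(d_COC > d_CRK) = {p_coc_higher:.3f}")
```

Pairing independent draws from the two posteriors is a simple way to approximate the distribution of the difference without computing the convolution explicitly.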

Let us consider the behavior of respondent 005’s use of HER, CRK, and COC. The probability densities of bHER, bCRK and bCOC are shown in Figure 7, and the probability densities of the pairwise differences between them (such as bCRK − bHER) are shown in Figure 9. The four quadrants of the graphs correspond to the four possibilities when comparing the activity and discontinuity of two behaviors, as described in Figure 8. In each quadrant of Figure 9 we included the percentage of the bτ(1) − bτ(2) probability distribution contained in that quadrant.

Figure 8.

The meaning of the four quadrants in which the difference of two latent (activity, discontinuity) pairs can fall.

For example, Figure 9(a) shows that we are 87.5% certain that 005’s CRK use will be both less active and less discontinuous than that of HER. Figure 9(b) shows that 005’s discontinuity for COC and HER are similar, but her activity with HER is probably slightly higher than for COC. As we can see in Figure 9(c), 005’s activity and discontinuity are both considerably higher for COC than for CRK.

This method of comparing the latent behavior of two trajectories was illustrated by comparing the same respondent’s use of different drugs, but it could be used to compare different respondents’ use of the same drug, or any two behaviors expressed by random variables over the same parameters. In fact, the method of comparing two behaviors by examining the difference b1 − b2 of the random variables embodying them will be applied later in this article to compare the global behavior of different drugs, or an individual behavior with a drug with respect to the global behavior for that same drug.
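The quadrant percentages used in this kind of comparison can be approximated from paired samples of two behaviors. The sketch below assumes each behavior is available as a list of (activity, discontinuity) draws, however they were obtained; the function name, the quadrant labels, and the toy sample values are hypothetical.

```python
def quadrant_shares(samples_1, samples_2):
    """Share of paired draws of b1 - b2 falling in each quadrant.

    Keys are (activity, discontinuity) labels describing b1 relative to b2.
    Exact ties are counted as "lower"; with continuous draws they have
    probability zero.
    """
    shares = {
        ("higher", "higher"): 0,
        ("higher", "lower"): 0,
        ("lower", "higher"): 0,
        ("lower", "lower"): 0,
    }
    n_pairs = 0
    for (a1, d1), (a2, d2) in zip(samples_1, samples_2):
        activity = "higher" if a1 - a2 > 0 else "lower"
        discontinuity = "higher" if d1 - d2 > 0 else "lower"
        shares[(activity, discontinuity)] += 1
        n_pairs += 1
    return {key: count / n_pairs for key, count in shares.items()}


# Toy draws (hypothetical values, not taken from the study data).
b1 = [(0.5, 0.2), (0.4, 0.1), (0.6, 0.3), (0.3, 0.25)]
b2 = [(0.4, 0.1), (0.5, 0.2), (0.5, 0.35), (0.2, 0.2)]
shares = quadrant_shares(b1, b2)
for key, share in shares.items():
    print(key, share)
```

Summing two adjacent quadrants gives the marginal probability for a single dimension, for example the probability that the first behavior has the higher discontinuity regardless of activity.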

3.2 Global Behavior and Behavior Heterogeneity

In this section, we are concerned with measuring the behavior of a collection of trajectories. This collection of trajectories may be given by multiple respondents for the same drug, or multiple drugs of the same respondent, or determined by some demographic attribute, and so on. The heart of the matter is to aggregate the information given by each individual trajectory of the collection in a way that conveys the behavior of the group as a whole.

We will take the example of a collection of trajectories of multiple respondents’ use of the same drug. This will allow us, in later subsections, to assess and compare the behaviors for different drugs as well as gauge an individual behavior with respect to other behaviors for the same drug. One should bear in mind that the same analysis can be done for trajectory collections defined differently. For example, aggregating the trajectories of the same respondent for different drugs, we would be able to acquire an overview of an individual’s behavior as well as compare this behavior to other behaviors.

What we are assuming here is that there is an underlying model of global behavior B (a random variable) from which each individual behavior b is drawn. However, unlike an individual behavior variable bi, whose probability density prob(bi = b | τ(i)) is inferred from a single trajectory τ(i), the probability density prob(B = b | τ(1), …, τ(m)) of B is inferred from all trajectories of the collection.

Informally, this means that B tells us that if we draw a random trajectory out of the group B is modeling, we have a specific probability prob(B = b) that this trajectory will exhibit a “b-behavior”, which in turn will govern the dynamics of the trajectory. We used probabilistic inference to infer information about b given an observed trajectory, and we will use probabilistic inference again to infer B from there, as illustrated in Figure 10.

Figure 10.

Relation between global behavior model, individual behavior model, and individual trajectory.

What is left to do therefore, is to specify B. One common way to carry this out is to assume a finite (known or unknown) number of archetypical behaviors. We will demonstrate both extremes of this approach: One assuming a single behavior (over all respondents) for each drug, and the other assuming an infinite number of possible behaviors.

Let us start with a simple specification of B assuming that, for a given drug D, there is a single underlying behavior bD = (rD, RD) that governs all behaviors for this drug. In this case, Bayes’ theorem shows us that this (rD, RD) could be inferred from the observed trajectories τ(1), ⋯, τ(m) by computing the total n00, n01, n10, and n11 of all trajectories and applying (3) to these. Computing such overall relapse and remission propensities (and their corresponding latent activity and discontinuity) for eight drugs of our study, we get the means shown in Table 2. These overall relapse and remission means, as well as their corresponding activity and discontinuity means, are shown in Figure 11.
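The pooling step of this single-behavior model can be sketched as follows. The trajectories are hypothetical, and because equation (3) is not reproduced in this section, the posterior means below assume Beta posteriors under a uniform prior, which is an assumption rather than a restatement of the exact formula used for Table 2.

```python
def transition_counts(trajectory):
    """Count the four transition types in a binary trajectory string."""
    counts = {"00": 0, "01": 0, "10": 0, "11": 0}
    for prev, nxt in zip(trajectory, trajectory[1:]):
        counts[prev + nxt] += 1
    return counts


def pooled_counts(trajectories):
    """Sum the transition counts over all trajectories for one drug."""
    total = {"00": 0, "01": 0, "10": 0, "11": 0}
    for trajectory in trajectories:
        for key, value in transition_counts(trajectory).items():
            total[key] += value
    return total


def posterior_means(counts):
    """Posterior means of r and R, assuming Beta posteriors (uniform prior)."""
    r_mean = (counts["01"] + 1) / (counts["01"] + counts["00"] + 2)
    R_mean = (counts["10"] + 1) / (counts["10"] + counts["11"] + 2)
    return r_mean, R_mean


# Hypothetical trajectories of three respondents for one drug.
trajectories = ["1110011", "110000", "10101"]
pooled = pooled_counts(trajectories)
r_mean, R_mean = posterior_means(pooled)
print(pooled, r_mean, R_mean)
```

Note that the mean latent activity and discontinuity are not obtained by plugging these two means into (4) and (5); they would be computed from the full posterior, for instance by sampling.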

Figure 11.

Mean relapse and remission propensities (left) and mean activity and discontinuity (right) for eight drugs in the study. The grayed-out triangles in the right figure show regions where (activity, discontinuity) combinations are impossible.

Though this general model of behavior gives us some conception of the difference of behavior between drugs, it doesn’t convey how much behavior heterogeneity there is for the same drug. Indeed, it can be shown (using a likelihood-ratio test for example) that the model, which assumes a single underlying behavior pattern for each drug, has only a limited ability to account for the variations in behaviors we observe. This is not surprising, given that our simple model has only two degrees of freedom to describe all behaviors.

We can extend this model by supposing not one single latent behavior, but a set K of k “archetypical” behaviors b1, …, bk and a vector (p1, …, pk) (with $\sum_{i=1}^{k} p_i = 1$) of probabilities of having each of the k archetypical behaviors. Given the observed trajectories, we can use techniques such as the maximum likelihood method to find the parametrization of this model that accounts for the most variation. The simple model we introduced previously is in this case equivalent to setting k = 1. This kind of approach can be compared to the classification techniques used in latent growth modeling to categorize trajectories into archetypical latent classes (Chassin et al. 2004:483; Chung et al. 2002:663; Ellickson et al. 2004:299; Gamerman and Smith 1996:587).

Finally, we introduce one more global model that doesn’t assume a fixed number of latent behaviors but as many behaviors as there are individual trajectories in the observed group we are examining. In this model, the probability of a latent behavior b given the observed trajectories is the normalized sum of the probabilities of this behavior given each individual trajectory:

$$\mathrm{prob}(B=b)=\frac{1}{m}\sum_{i=1}^{m}\mathrm{prob}\bigl(b \mid \tau^{(i)}\bigr). \qquad (6)$$

This model provides the probability density of the latent behavior of a randomly chosen individual in the group. This equation can be derived by assuming that a behavior b is drawn by first picking at random one of the equiprobable trajectories τ(i) and then picking a behavior b from the distribution prob(b | τ(i)) of possible latent behaviors for that trajectory. Using activity and discontinuity to represent behavior, the global behaviors for nine drugs in our study are shown in Figure 12, along with the median latent activity and latent discontinuity. We see that AMP has the lowest median activity (18.93%) and discontinuity (5.90%), CRK has the highest median discontinuity (14.26%) and TOB, not surprisingly, has the highest median activity (63.71%). Another interesting characteristic to notice with TOB is that, unlike all other drugs, its global behavior distribution is bimodal. This bimodality comes from the latent activity dimension: A large proportion of behaviors with TOB can be expected to have either very low activity or very high activity.

Figure 12.

Global activity and discontinuity distributions for nine drugs.

The global model provides a detailed view of behavior heterogeneity for the same drug. This behavior heterogeneity is given by a probability distribution of possible latent behaviors, so we can compute heterogeneity measures from it (using the Gini coefficient (Gini 1912) or entropy (Shannon and Weaver 1949), for instance) as well as compare global behaviors of different drugs, as we show next.

The global model (6) aggregates information from the observed trajectories with equal weighting, which assumes that each contributing behavior is equally likely to be observed in the population we are presumably modeling. This method could be extended further to account for some of the possible biases we may have in our sample by weighting the terms of the sum in (6) appropriately. If, for example, our sample contains 70 male subjects and 30 female subjects and we want to assess the behavior of a population having 40% males and 60% females, then by weighting the male trajectories by 0.4/70 and the female trajectories by 0.6/30 in the sum (6), our resulting global model will be better suited to represent our population of interest.
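The derivation of (6), and its weighted extension, can be sketched as a two-stage sampling procedure: pick a trajectory (with equal or custom weights), then draw a behavior from that trajectory's posterior. As before, the Beta posteriors under a uniform prior are an assumption made for this sketch, and the trajectories, weights, and function names are hypothetical.

```python
import random


def transition_counts(trajectory):
    """Count the four transition types in a binary trajectory string."""
    counts = {"00": 0, "01": 0, "10": 0, "11": 0}
    for prev, nxt in zip(trajectory, trajectory[1:]):
        counts[prev + nxt] += 1
    return counts


def sample_behavior(trajectory, rng):
    """Draw one (activity, discontinuity) pair for a single trajectory.

    Assumption: Beta posteriors for r and R under a uniform prior.
    """
    c = transition_counts(trajectory)
    r = rng.betavariate(c["01"] + 1, c["00"] + 1)
    R = rng.betavariate(c["10"] + 1, c["11"] + 1)
    return r / (r + R), 2 * r * R / (r + R)


def sample_global_behavior(trajectories, n_samples, rng, weights=None):
    """Draw from the mixture (6); weights=None gives equal weighting."""
    draws = []
    for _ in range(n_samples):
        trajectory = rng.choices(trajectories, weights=weights, k=1)[0]
        draws.append(sample_behavior(trajectory, rng))
    return draws


rng = random.Random(1)
# Hypothetical trajectories; the first two play the role of male subjects.
trajectories = ["1110011100", "1111000000", "1010101111"]
equal = sample_global_behavior(trajectories, 5000, rng)
# Reweight so that the third trajectory stands for 60% of the population.
weighted = sample_global_behavior(trajectories, 5000, rng, weights=[0.2, 0.2, 0.6])
print(len(equal), len(weighted))
```

Histograms of the draws approximate the global activity and discontinuity distributions, and medians or heterogeneity measures can be computed directly from them.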

3.3 Comparing Global Behavior of Different Drugs

We have seen earlier that we represent behavior by a random variable b representing the probability distribution of latent parameters of the behavior, and that comparing two behaviors b1 and b2 can then be carried out by examining the distribution of b1 − b2. We have illustrated this by comparing two individual trajectories, but the same method can be used to compare two global behaviors specified by their probability distributions. Figure 13 shows the comparison of the behaviors of a few pairs of drugs.

Figure 13.

Comparison of the global behavior for CRK-HER, CRK-MAR, and HER-TOB. The percentages in the corner of the quadrants show the percentage of probability density in that quadrant, and the percentages overlapping two quadrants show the sum of the probability density of both quadrants.

3.4 Factoring Global Behavior in Individual Behavior Assessment

The strength of the latent discontinuity measure for each individual trajectory depends on the global behavior of the drug trajectory being measured. Figure 12 shows that there’s a 50% probability of having a latent discontinuity greater than 14.26% (the median discontinuity) with CRK, so this level of discontinuity is average for CRK. On the other hand, we could show that this level of discontinuity corresponds to the 83rd percentile for MAR; such a level of discontinuity is therefore high for MAR.

As we see, in order to assess the individual behavior with a given drug, we should take into account the global behavior for this drug. This conversion process is called normalizing or standardizing, and there are many different methods to do this; the most common is the standard score, which uses the mean and standard deviation as global statistics against which individual values are compared. Continuing with our approach of evaluating and visualizing the intrinsic diversity of the notions we address, we propose to assess the latent behavior of an individual by contrasting it not only with a few statistics of global behavior, but with the entire probability distribution of global behavior.

This can be done simply by examining the difference b − B of the individual behavior probability distribution and the global behavior probability distribution. In Figure 14 we show the normalized behavior of 005’s use of HER, CRK, and COC. Recall that Figure 7 displayed 005’s behavior with these drugs as the probability distribution of latent activity and discontinuity, but this did not give a sense of how these compared to normal use of these drugs. The normalized behavior distributions in Figure 14, on the other hand, do exhibit the behaviors relative to other behaviors for the same drug. For example, Figure 7 shows that 005’s use of CRK seems to exhibit significantly low activity and discontinuity, and indeed, compared to 005’s use of HER and COC, her use of CRK displays considerably lower activity and discontinuity, as can be seen in Figure 9. Yet if we compare her use of CRK to the use of CRK of other respondents, shown in Figure 14(b), we see that compared to another randomly chosen user of CRK, there’s a 72.8% chance that her latent activity is lower, a 77.7% chance that her latent discontinuity is lower, and only a 59.5% chance that her latent activity and discontinuity are both lower.

Figure 14.

Normalized behavior of respondent 005 for HER, CRK, and COC. The percentages in the corner of quadrants show the percentage of probability density in that quadrant, and the percentages overlapping two quadrants show the sum of the probability density of both quadrants.

4 DISCUSSION AND CONCLUSIONS

After introducing the concepts of transition count and transition rate, we introduced two additional descriptive statistics–the relapse rate and the remission rate–that conveyed the rate at which both types of transitions occurred. Both of these carry information on both the discontinuity and activity of a trajectory. We then proposed two further descriptive statistics: (1) the observed activity–which captures the activity information provided by the relapse and remission rates–conveys a sense of expected long-term proportion of active states in the trajectory, and (2) the observed discontinuity–which captures the discontinuity information contained in relapse and remission rates–conveys a sense of expected long term transition rate.

These descriptive statistics were not fit, as is, to carry out inferential analysis of the behavior exhibited by the trajectories. In order to do so, we proposed to assume that a trajectory is generated by a latent Markov process whose parameters (relapse propensity and remission propensity) are hidden, but whose probability distribution can be inferred from the observed trajectory. Based on relapse and remission propensity we can compute latent activity and discontinuity. Any two of these four behavior parameters determine the other two. Each one of these four behavior measures is expressed by a random variable carrying the probability distribution of the underlying parameter. This probability distribution is the posterior probability of the parameter, given the observed trajectory.

We then showed how these random variables could be used to (1) compare two behaviors by examining the difference of the random variables describing them, (2) generate a global behavior model by aggregating the random variables, (3) compare two global behaviors with each other, and (4) normalize a behavior so as to factor global behavior into the assessment of individual behavior.

The methods presented herein can be applied, with little or no adaptation, to any sequence of binary events. The choice of the length of a period (one year in our case) and the definition of what 0 and 1 represent are crucial to the meaningfulness of the results one will obtain. On the other hand, given that we assume an underlying Markov process, our approach will be that much more fitting if there is a strong correlation between the state of one period and the state of the next period. This relationship doesn’t have to be a direct one nor a causal one. Indeed, being employed does not cause one to be unemployed, but there is always a probability of becoming unemployed, and this probability might be used as an indicator of a related behavior. The same is true for relapse and remission propensity. For example, if we define the active state to be “was in prison that year,” the relapse propensity may be an indicator of criminal activity, and the remission propensity would indicate the gravity of this criminal behavior.

Although we use a Markov process defined by a relapse and remission propensity as the basis of our measures, we are not claiming that such a model is appropriate to predict relapse and remission. In the case of drug use, where there is indeed a causal relationship between using a drug one year and using it the next year (see footnote 5), the Markov model does provide some predictive ability. If one’s goal is to predict relapse and remission though, one should take into account more influential factors, such as social variables, age of onset, drug switching patterns, route of administration and other significant covariates. Yet this was not our goal here.

Our goals in this paper were to develop measures of the discontinuity manifested by a sequence of binary events and the behaviors underlying them by employing Bayesian methods and proposing visualizations related to these measures. Discontinuity was defined to be the long-term expected transition rate, this expectation in turn being based on the estimate of a latent transition propensity as described by a simple Markov model. We used drug use trajectories to illustrate these measures. While influencing factors, such as switching drugs, were not included in this paper, our measures remain applicable, since we would still be gauging behavior with respect to a single drug. If the switching phenomenon is observed or inferred, the next step would be to take this into account and gauge behavior over a group of drugs. One way to do this would be to merge the relevant trajectories together by defining “active” to be “active on either of the drugs.”

We used relapse and remission to designate a transition from non-active to active and vice-versa, and incorporated graphic models (digraphs) to visualize these transitions. While these are not independent states, they were treated as such in order to develop innovative measures of discontinuity that can be further developed to provide an additional analytical tool in future social science applications. Our goal was not to apply the measure here; yet the discontinuity measure illustrated in this paper can have important implications for interventions, prevention efforts and other public health and social services with further extensions for applied research.

4.1 Limitations, Extensions and Future Research

The Markov model can have significant limitations in describing transition dynamics depending on the context. For example, if the events under consideration have little or no influence on each other in time, using a hazard-rate model would be more appropriate. On the other hand, if past events have a significant influence on future events that is not captured by the events in between (violation of the Markov assumption), it may be appropriate to consider more complex Markov models, including larger time-frames and/or other explanatory variables. Regardless of the model that is chosen to explain transitions, it should be possible to derive from it an estimate of the long-term transition rate, and therefore an alternate measure of discontinuity.

In the section on inferential statistics, we considered the sampling distribution of hidden parameters given an observed trajectory. Another type of error that could be included is the observational error of the trajectory. This would be especially appropriate when the trajectory information is obtained by retrospective reports, which raises concerns about the reliability of the data; these concerns are further addressed for the study sample in Appendix A. If the observational errors can be modeled in a way that produces a probability distribution of the true value of an observation, the observational errors could be integrated into the behavior models we have developed with relative ease. For example, Manzoni et al. (2010) use a Markov process to model memory bias in retrospectively collected data. In their study, the authors quantified recall bias and identified its determinants, suggesting that the next step would be to correct for measurement errors, which the latent Markov modeling framework allows. Likewise, we posit that this measure is only a first step of a novel strategy with potential for expanding analytical tools used in applied social science research.

Further, we derived sampling distributions assuming no prior knowledge of the whereabouts of the parameters we were estimating. In many cases, it should be possible to incorporate a suitable prior distribution and generate more precise estimates of the behavior parameters using Bayesian inference. For example, it is clear from several of the drug behavior graphs above that there are numerous types of behaviors gravitating around different (activity, discontinuity) pairs. Taking this into account, more realistic comparisons of behaviors can be developed in future applications.

The methods developed herein could also be extended to event-histories exhibiting more than two states. The natural way to do this would be to consider a Markov process over all possible (but finite in number) states, and defining a transition to be any transition from one state to another different state. The concepts of relapse, remission, and activity would not apply here, but discontinuity could still be defined to be the long-term transition rate expected by the latent Markov process. Similar calculations as we carried out in Appendix B should be made to compute the expected number of transitions given the Markov process parameters. This generalization of our method is especially fortuitous since it can be applied to nominal variables–which are not handled by standard growth models–for example transitioning through social roles over the life course. Moreover, a transition can be defined as any sudden change in the variable, although in some cases, assuming a latent Poisson process, for instance, might be more suitable.

Table 1.

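Equations exhibiting relationships between r̄, R̄, ā, and d̄. The bars (indicating these are sample statistics) have been removed for sake of legibility. Note though, that these equations also apply when considering the corresponding underlying parameters r, R, a, and d which we will introduce later. Each row expresses one parameter in terms of the pair heading the column; a cell is empty when the pair already contains that parameter.

|     | (r, R)       | (r, a)       | (r, d)      | (R, a)     | (R, d)      | (a, d)       |
|-----|--------------|--------------|-------------|------------|-------------|--------------|
| r = |              |              |             | Ra/(1 − a) | Rd/(2R − d) | d/(2(1 − a)) |
| R = |              | r(1 − a)/a   | dr/(2r − d) |            |             | d/(2a)       |
| a = | r/(r + R)    |              | 1 − d/(2r)  |            | d/(2R)      |              |
| d = | 2rR/(r + R)  | 2r(1 − a)    |             | 2Ra        |             |              |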

APPENDIX A

Methods

The data employed here to develop the measures of discontinuity are from the Older Drug User Study (ODUS) conducted in a metropolitan area in southeastern USA. Specifically we used the 10 drug trajectories over the life course of 92 study participants. This study employed a retrospective longitudinal design, which is typical in life course research (Elder 1985; Laub and Sampson 2003; Hser et al. 2008; Hser et al. 2007; Schulenberg et al. 2003). The assertions made here about the characteristics of the users and the drugs are descriptive of this sample only and not meant to be representative of other drug user trajectories. This sample of older drug users provided the data analyzed to identify transitions in drug trajectories. Here we provide a brief overview of the methods used to collect this sample.

To be eligible to participate in ODUS, respondents had to be at least 45 years old at the time of the interview and either active or former users of heroin, cocaine/crack, or methamphetamine. These drugs were chosen because they represent the three major drugs of use associated with the most severe consequences (Brecht et al. 2008; Hser et al. 2008). Active use was defined as having used one of these drugs in the past year and continual use for more than one month. The community-drawn sample consisted of 92 respondents from 45 to 65 years old. African Americans made up 50 percent of the sample, 46 percent were white, and the remainder were Latino or American Indian. The sample was 40 percent female, and 60 percent reported some college education. For this paper, the sample characteristics are not as important as its diversity in terms of drug use heterogeneity. The data and related description can be accessed on the Inter-University Consortium for Political and Social Research (ICPSR) website.

Of interest to this paper is the collection of the same data for all respondents starting from year of birth and including every year of life until the time of the interview. Retrospective designs that collect self-reported data typically use one-year periods that extend to five or fewer years, or they investigate an “ever in the past” period of time (Anthony et al. 1991; Fuller et al. 2002; Hser et al. 2008). Instead, this data collection starts at year of birth and continues to the current year at the time of the interview. We realize that this raises questions regarding validity and reliability, which include two forms of bias: recall accuracy and social desirability (Guest, Bunce and Johnson 2007). Retrospective designs increase recall bias; however, retrospective data collection on illegal behaviors reduces social desirability bias (Murphy et al. 2010). Data collectors were trained on how to develop rapport with respondents to reduce social desirability bias (Anglin, Hser and Chou 1993; Weatherby et al. 1994). In addition, a number of strategies to reduce recall bias were used in the study, such as incorporating historical events to trigger the memory and employing timelines and memory aids targeted for each individual (Agar 1980; Becker 1998; Darke 1998; Fontana and Frey 1998; Lambert 1990; Hser et al. 2008; Murphy et al. 2010; Nurco et al. 1975; Shaw 2005; Sobell et al. 1988).

Three sources of direct data were collected in a face-to-face setting: an in-depth life-history interview (digitally recorded and transcribed), a Life History Matrix (on paper), and yearly survey data (computer-assisted). Two interviewers were present for each interview with one study participant. One interviewer conducted the audio-recorded life history, writing important aspects of social context, risk behaviors, and major social roles on the Life History Matrix for each year. The second interviewer used a timeline with historical reference points to provide a context in which respondents could place events in their lives. The Life History Matrix and timeline facilitated the recall of the specific events and time periods regarding drug use and social events over the life course, thereby enhancing the reliability and validity of the survey data.

After the life history was completed, the interviewers began the quantitative survey data collection. One interviewer asked the questions and entered the data on the computer while the other interviewer checked the Life History Matrix and timeline for consistency. The entire interview process was audio recorded. Recordings were used during quality control of the survey data. Interviews lasted between 3 and 5 hours on average, with some exceptions. Respondents were paid $40 cash and provided some refreshments.

APPENDIX B

Support for the definitions of latent activity and discontinuity

The rationale behind the discontinuity measure we introduced is that it may serve as a statistic conveying the propensity to transition from one state to another. If we assume the trajectory was indeed generated by a latent Markov process, we can estimate the n01, n00, n10, and n11 that we can expect to observe, conditioned on r, R, the parameters of the Markov process, and n, where n + 1 is the number of observed years. Namely, letting

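$$\Lambda(r,R,n)=R\,\frac{1-(1-r-R)^{n}}{r+R}, \qquad (7)$$

we have

$$E(n_{01}\mid r,R,n)=\frac{r}{r+R}\bigl(nR-\Lambda(r,R,n)\bigr),\quad E(n_{00}\mid r,R,n)=\frac{1-r}{r+R}\bigl(nR-\Lambda(r,R,n)\bigr), \qquad (8)$$
$$E(n_{10}\mid r,R,n)=\frac{R}{r+R}\bigl(nr+\Lambda(r,R,n)\bigr),\quad\text{and} \qquad (9)$$
$$E(n_{11}\mid r,R,n)=\frac{1-R}{r+R}\bigl(nr+\Lambda(r,R,n)\bigr). \qquad (10)$$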

The proof of this is involved, but may be found in Assoudou and Essebbar (2004:423).

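Recall that the transition rate was defined in (2) to be $\frac{n_{01}+n_{10}}{n}$. Replacing n01 and n10 by the expected values (8) and (9) above, we define the n-discontinuity dn of the trajectory to be the transition rate we can expect if the latent Markov process generated the sequence of length n + 1:

$$\begin{aligned}
d_n &= \frac{E(n_{01}\mid r,R,n)+E(n_{10}\mid r,R,n)}{n}\\
&= \frac{1}{n}\left[\frac{r}{r+R}\bigl(nR-\Lambda(r,R,n)\bigr)+\frac{R}{r+R}\bigl(nr+\Lambda(r,R,n)\bigr)\right]\\
&= \frac{1}{n}\left[\frac{2nrR}{r+R}+\frac{R-r}{r+R}\,\Lambda(r,R,n)\right]\\
&= \frac{2rR}{r+R}+\frac{1}{n}\,\frac{R-r}{r+R}\,R\,\frac{1-(1-r-R)^{n}}{r+R} \quad\text{using (7)}\\
&= \frac{1}{r+R}\left(2rR+\frac{R(R-r)}{r+R}\cdot\frac{1-(1-r-R)^{n}}{n}\right) \qquad (11)
\end{aligned}$$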

Equation (11) allows us to give a mathematical meaning to our earlier claim that the observed discontinuity of a trajectory corresponds to the expected long-term transition rate. Again, dn is the expected transition rate of a trajectory of length n + 1 generated by a Markov process with parameters r and R. The long-term aspect of the discontinuity comes from the fact that we define the latent discontinuity d to be limn→∞ dn, the limit of the n-discontinuity as n tends to infinity. Given that drug use careers are finite and of various lengths, the n-discontinuity would be a better estimate of the latent discontinuity of the drug use of a specific career. Yet d has the advantage of not depending on n, and its expression is much simpler than that of (11). Indeed, the $\frac{1-(1-r-R)^{n}}{n}$ component of (11) tends to 0 as n → ∞, therefore we get (5):

$$d=\frac{2rR}{r+R}.$$

Similarly, we can verify that the latent activity a, defined to be the expected long-term use proportion, is (4):

$$a=\frac{r}{r+R}.$$

Indeed, note that the proportion of use years in a trajectory can be expressed as $\frac{1+n_{01}+n_{11}}{1+n}$. Then, using (8) and (10) we get the expected proportion of use years in a trajectory of length n + 1 generated by the Markov process:

$$\frac{1}{1+n}\left[1+\frac{1}{r+R}\Bigl(nr+(1-r-R)\,\Lambda(r,R,n)\Bigr)\right]$$

which tends to $\frac{r}{r+R}$ when n → ∞, since Λ(r, R, n) remains bounded.
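The closed form (11) and the limit (5) can be checked numerically. The sketch below compares (11) with a direct computation that propagates the state probabilities year by year; it assumes the trajectory starts in the active state, which is our reading of the expectations (8) and (9) rather than something stated explicitly here. The function names and parameter values are hypothetical.

```python
def d_n_closed_form(r, R, n):
    """Expected transition rate over n transitions, from equation (11)."""
    decay = (1 - r - R) ** n
    return (2 * r * R + R * (R - r) / (r + R) * (1 - decay) / n) / (r + R)


def d_n_by_recursion(r, R, n):
    """Same quantity, obtained by propagating the probability of being active.

    Assumption: the first observed year is an active year.
    """
    p_active = 1.0
    expected_transitions = 0.0
    for _ in range(n):
        # Expected remissions (1 -> 0) plus expected relapses (0 -> 1).
        expected_transitions += p_active * R + (1 - p_active) * r
        # Probability of being active in the following year.
        p_active = p_active * (1 - R) + (1 - p_active) * r
    return expected_transitions / n


r, R = 0.2, 0.3  # hypothetical propensities
d_limit = 2 * r * R / (r + R)  # latent discontinuity, equation (5)
for n in (1, 5, 20, 100, 1000):
    print(n, d_n_closed_form(r, R, n), d_n_by_recursion(r, R, n), d_limit)
```

For short careers the n-discontinuity can differ noticeably from d, and the gap shrinks at rate 1/n, which is the sense in which d describes the long-term transition rate.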

Footnotes

1

We prefix activity and discontinuity with the modifier sample to distinguish these measures from latent activity and latent discontinuity introduced later. The difference between the two measures can be likened to that between the sample mean and the population mean. In order to lighten the reading, we will drop the modifiers sample and latent when the measure we are referring to is clear.

2

Simply put, a Bernoulli trial is an experiment whose outcome is either of two possibilities.

3

Note that the relapse and remission contributions are independent from each other. This independence was not assumed, but is a result of the Markov property and the non-informative prior.

4

The scale of the probability density axes is irrelevant and was omitted. Suffice it to say that these curves integrate to 1.

5

if we at all assume some addiction to the drug

REFERENCES

  1. Agar Michael. The Professional Stranger: An Informal Introduction to Ethnography. New York: Academic Press; 1980.
  2. Andruff Heather, Carraro Natasha, Thompson Amanda, Gaudreau Patrick, Louvet Benoît. Latent Class Growth Modelling: A Tutorial. Tutorials in Quantitative Methods for Psychology. 2009;5(1):11–24.
  3. Anglin M Douglas, Hser Yih-Ing, Chou Chih-Ping. Reliability and Validity of Retrospective Behavioral Self-Report by Narcotics Addicts. Evaluation Review. 1993;17(1):91–103.
  4. Anthony James C, Vlahov David, Celentano David D, Menon AS, Margolick Joseph B, Cohn Sylvia, Nelson Kenrad E, Polk B Frank. Self-Report Interview Data for a Study of HIV-1 Infection Among Intravenous Drug Users: Description of Methods and Preliminary Evidence on Validity. Journal of Drug Issues. 1991;21(4):739–848.
  5. Assoudou Souad, Essebbar Belkheir. A Bayesian Model for Binary Markov Chains. International Journal of Mathematics and Mathematical Sciences. 2004;8:421–429.
  6. Authors. Drug Use Trajectory Patterns among Older Drug Users. Substance Abuse and Rehabilitation. 2011;2:89–102. doi: 10.2147/SAR.S14871.
  7. Becker Howard S. Tricks of the Trade: How to Think about Your Research While You’re Doing It. Chicago: University of Chicago Press; 1998.
  8. Brecht Mary-Lynn, Huang David, Evans Elizabeth, Hser Yih-Ing. Polydrug Use and Implications for Longitudinal Research: Ten-year Trajectories for Heroin, Cocaine, and Methamphetamine Users. Drug and Alcohol Dependence. 2008;96:193–201. doi: 10.1016/j.drugalcdep.2008.01.021.
  9. Chassin Laurie, Flora David B, King Kevin M. Trajectories of Alcohol and Drug Use and Dependence from Adolescence to Adulthood: The Effects of Familial Alcoholism and Personality. Journal of Abnormal Psychology. 2004;113:483–498. doi: 10.1037/0021-843X.113.4.483.
  10. Chung Ick-Joong, Hawkins J David, Gilchrist Lewayne D, Hill Karl G, Nagin Daniel S. Identifying and Predicting Trajectories among Poor Children. Social Service Review. 2002;76:664–685. doi: 10.1086/342999.
  11. Darke Shane. Self-report Among Injecting Drug Users: A Review. Drug and Alcohol Dependence. 1998;51:253–263. doi: 10.1016/s0376-8716(98)00028-3.
  12. Elder Glen H., Jr. Life Course Dynamics. Ithaca, NY: Cornell University Press; 1985.
  13. Ellickson Phyllis L, Martino Steven C, Collins Rebecca L. Marijuana Use from Adolescence to Young Adulthood: Multiple Developmental Trajectories and their Associated Outcomes. Health Psychology. 2004;23:299–307. doi: 10.1037/0278-6133.23.3.299.
  14. Fontana Andrea, Frey James H. Interviewing: The Art of Science. In: Denzin Norman K, Lincoln Yvonna S, editors. Collecting and Interpreting Qualitative Material. Thousand Oaks, CA: Sage; 1998. pp. 47–78.
  15. Fuller Crystal M, Vlahov David, Ompad Danielle C, Shah Nina, Arria Amelia, Strathdee Steffanie A. High-Risk Behaviors Associated with Transition from Illicit Non-Injection to Injection Drug Use Among Adolescent and Young Adult Drug Users: A Case-Control Study. Drug and Alcohol Dependence. 2002;66:189–198. doi: 10.1016/s0376-8716(01)00200-9.
  16. Gamerman Dani, Smith Adrian FM. Bayesian Analysis of Longitudinal Data Studies. In: Bernardo JM, Smith AFM, editors. Bayesian Statistics. Vol. 5. Oxford: Oxford University Press; 1996. pp. 587–598.
  17. Gini C. "Variabilità e mutabilità" [1912] In: Pizetti E, Salvemini T, editors. Reprinted in Memorie di metodologia statistica. Rome: Libreria Eredi Virgilio Veschi; 1955.
  18. Guest Greg, Bunce Arwen, Johnson Laura. How Many Interviews are Enough?: An Experiment with Data Saturation and Variability. Field Methods. 2006;18:59–82.
  19. Hser Yih-Ing, Longshore Douglas, Anglin M Douglas. The Life Course Perspective on Drug Use. Evaluation Review. 2007;31:515–547. doi: 10.1177/0193841X07307316.
  20. Hser Yih-Ing, Evans Elizabeth, Huang David, Brecht Mary-Lynn, Li Libo. Comparing the Dynamic Course of Heroin, Cocaine, and Methamphetamine Use over 10 Years. Addictive Behaviors. 2008;33:1581. doi: 10.1016/j.addbeh.2008.07.024.
  21. Hser Yih-Ing, Huang David, Brecht Mary-Lynn, Li Libo, Evans Elizabeth. Contrasting Trajectories of Heroin, Cocaine and Methamphetamine Use. Journal of Addictive Diseases. 2008;27:13. doi: 10.1080/10550880802122554.
  22. Ip Edward H, Jones Alison Snow, Heckert Alex, Zhang Qiang, Gondolf Edward D. Latent Markov Model for Analyzing Temporal Configuration for Violence Profiles and Trajectories in a Sample of Batterers. Sociological Methods & Research. 2010;39:222–255.
  23. Jung Tony, Wickrama KAS. An Introduction to Latent Class Growth Analysis and Growth Mixture Modeling. Social and Personality Psychology Compass. 2008;8:302–317.
  24. Juon Hee-Soon, Ensminger Margaret E, Sydnor Kim Dobson. A Longitudinal Study of Developmental Trajectories to Young Adult Cigarette Smoking. Drug and Alcohol Dependence. 2002;66:303–314. doi: 10.1016/s0376-8716(02)00008-x.
  25. Lambert Elizabeth, editor. The Collection and Interpretation of Data from Hidden Populations. National Institute on Drug Abuse Research Monograph 98. Washington, DC: US Government Printing Office; 1990. DHHS Pub. No. (ADM)90-1678.
  26. Laub John H, Sampson Robert J. Integrating Quantitative and Qualitative Data. In: Giele Janet Z, Elder Glen H., Jr, editors. Methods of Life Course Research: Qualitative and Quantitative Approaches. Thousand Oaks, CA: Sage; 1998. pp. 213–229.
  27. Laub John H, Sampson Robert J. Shared Beginnings, Divergent Lives: Delinquent Boys to Age 70. Cambridge, MA: Harvard University Press; 2003.
  28. Manzoni Anna, Vermunt Jeroen K, Luijkx Ruud, Muffels Ruud. Memory Bias in Retrospectively Collected Employment Careers: A Model Based Approach to Correct for Measurement Error. Sociological Methodology. 2010;40:39–73.
  29. Murphy Debra A, Hser Yih-Ing, Huang David, Brecht Mary-Lynn, Herbeck Diane. Self Report of Longitudinal Substance Use: A Comparison of the UCLA Natural History Interview and the Addiction Severity Index. The Journal of Drug Issues. 2010;40(2):495–515. doi: 10.1177/002204261004000210.
  30. Muthen Bengt, Masyn Katherine. Discrete-Time Survival Mixture Analysis. Journal of Educational and Behavioral Statistics. 2004;30(1):25–58.
  31. Nurco David N, Bonito Arthur J, Lerner M, Balter Mitchell B. The Natural History of Narcotic Addiction: A First Report. Washington, DC: Committee on Problems of Drug Dependence; 1975.
  32. Schulenberg John E, Maggs Jennifer L, O’Malley Patrick M. How and Why the Understanding of Developmental Continuity and Discontinuity is Important. In: Mortimer Jeylan T, Shanahan Michael J, editors. Handbook of the Life Course. New York: Plenum Publishers; 2003. pp. 413–436.
  33. Shannon Claude E, Weaver Warren. The Mathematical Theory of Communication. Urbana, IL: University of Illinois Press; 1949.
  34. Shaw Victor N. Research with Participants in Problem Experience: Challenges and Strategies. Qualitative Health Research. 2005;15:841–854. doi: 10.1177/1049732305275639.
  35. Sobell Linda C, Sobell Mark B, Leo Gloria I, Cancilla Anthony. Reliability of a Timeline Method: Assessing Normal Drinkers’ Reports of Recent Drinking and a Comparative Evaluation across Several Populations. British Journal of Addiction. 1988;83:393–402. doi: 10.1111/j.1360-0443.1988.tb00485.x.
  36. Vermunt Jeroen K, Bac Tran, Magidson Jay. Latent Class Models in Longitudinal Research. In: Menard Scott, editor. Handbook of Longitudinal Research: Design, Measurement, and Analysis. New York: Academic Press; 2008. pp. 373–385.
  37. Weatherby Norman L, Needle Richard, Cesari Helen, Booth Robert, McCoy Clyde B, Watters John K, Williams Mark, Chitwood Dale D. Validity of Self-Reported Drug Use Among Injection Drug Users and Crack Cocaine Users Recruited through Street Outreach. Evaluation and Program Planning. 1994;17:347–355.
  38. White Helena Raskin, Pandina Robert J, Chen Ping-Hsin. Developmental Trajectories of Cigarette Use from Early Adolescence into Young Adulthood. Drug and Alcohol Dependence. 2002;65:167–178. doi: 10.1016/s0376-8716(01)00159-4.
  39. Yamaguchi Kazuo. Four Useful Finite Mixture Models for Regression Analyses of Panel Data with a Categorical Dependent Variable. Sociological Methodology. 2008;38:283–328.
