Author manuscript; available in PMC: 2023 Jun 11.
Published in final edited form as: Stat Med. 2021 Apr 12;40(15):3460–3476. doi: 10.1002/sim.8977

Hidden mover-stayer model for disease progression accounting for misclassified and partially observed diagnostic tests: Application to the natural history of human papillomavirus and cervical precancer

Jordan Aron 1, Paul S Albert 1, Nicolas Wentzensen 2, Li C Cheung 1
PMCID: PMC10257883  NIHMSID: NIHMS1884073  PMID: 33845514

Abstract

Hidden Markov models (HMMs) have been proposed to model the natural history of diseases while accounting for misclassification in state identification. We introduce a discrete time HMM for human papillomavirus (HPV) and cervical precancer/cancer where the hidden and observed state spaces are defined by all possible combinations of HPV, cytology, and colposcopy results. Because the population of women undergoing cervical cancer screening is heterogeneous with respect to sexual behavior, and therefore risk of HPV acquisition and subsequent precancers, we use a mover-stayer mixture model that assumes a proportion of the population will stay in the healthy state and are not subject to disease progression. As each state is a combination of three distinct tests that characterize the cervix, partially observed data arise when at least one but not every test is observed. The standard forward-backward algorithm, used for evaluating the E-step within the E-M algorithm for maximum-likelihood estimation of HMMs, cannot incorporate time points with partially observed data. We propose a new forward-backward algorithm that considers all possible fully observed states that could have occurred across a participant’s follow-up visits. We apply our method to data from a large management trial for women with low-grade cervical abnormalities. Our simulation study found that our method has relatively little bias and outperforms simpler methods, which resulted in larger bias.

Keywords: EM algorithm, forward-backward algorithm, measurement error, partial missing data

1 |. INTRODUCTION

Markov models have previously been used to model the natural history of diseases.1,2 With some diseases, such as cervical cancer, the disease state is assessed using multiple tests, each of which is subject to misclassification. When the results of one of these tests are missing, the disease state is only partially observed. Hidden Markov models (HMMs) have been proposed to account for misclassification in state identification3 with parameters estimated using a forward-backward algorithm. Adaptations to the standard forward-backward algorithm have been proposed to handle missing observations,4 but these approaches cannot be directly applied to partially observed data. This article proposes a maximum-likelihood approach for HMMs when the observed measurements are categorical and sometimes only partially observed. We extend our approach to also allow for the population to be a heterogeneous mixture of individuals susceptible and not susceptible to the disease (mover-stayer models5). These new methods are illustrated using data from a cervical cancer management trial.

Natural history models that describe the acquisition and clearance of human papillomavirus (HPV) infections in the cervix and the progression of persistently infected epithelium to cervical precancers/cancers are used to inform vaccination and screening strategies.6–8 However, developing natural history models for HPV and cervical precancer/cancer poses multiple challenges. First, modeling strategies need to recognize that a portion of the population will not get infected with HPV and are therefore not susceptible to precancer/cancer progression. We propose a mover-stayer formulation5 for the underlying natural history to account for this heterogeneity. Second, population-level screening for cervical cancer entails multiple tests in sequential fashion, beginning with minimally invasive screening tests, such as testing for high-risk HPV types in the cervix and/or cytologic examination of cervical cells for abnormalities. Depending on the results of these tests, patients may then be referred to a colposcopic examination where areas suspicious of precancers are sampled using punch biopsies. As a result, any one of the three tests used to characterize the state of the cervix (HPV, cytology, or colposcopy) may be missing, either by design or by missing scheduled visits. Finally, while HPV tests have high accuracy, cytology tests are considerably less accurate9 and colposcopies, while more reliable than cytology, can still be insensitive for cervical precancers/cancers.10,11

We propose a discrete-time HMM for HPV and cervical precancer/cancer in which the hidden and observed state spaces are defined by possible combinations of nonmissing HPV, cytology, and colposcopy test results. While other formulations of the hidden states are possible, we chose to mirror the observed states to allow for easy interpretability of the misclassification matrix. Most new HPV infections clear rapidly, but persistent HPV infections can cause precancerous lesions. To keep the Markov property of the model intact, we expand the state space beyond the positive-negative HPV test result to a trichotomous negative-positive-persistent scale that uses the HPV test result from the previous visit. We also treat the observed state as trichotomous as clinical management guidelines differ for newly positive vs persistent infections.12 In our model, there are two scenarios that cause a partially observed state: (1) when any of the three current test results are missing or (2) when the current HPV result is positive but the previous HPV result is missing (ie, the observed HPV state could be either new or persistent). When the state is only partially observed, there exists a set of possible full observations that could have been observed.

Zucchini previously described how to handle fully missing observations,4 but the handling of partially observed data is not straightforward. While fully missing observations are simply transitioned over, partially observed data require a reworking of the forward-backward algorithm. To properly account for partially observed data, each possible full observation borne out by a partial observation must be considered in the algorithm. Our proposed modification considers this set of all possible full observations by assuming that missingness of a test occurs at random (MAR) given the observed data. The MAR assumption is reasonable since a majority of the missing tests are by design, in that whether they are done depends on observed test results.

The model formulation is described in Section 2. In Section 3, we describe our novel version of the forward-backward algorithm for partially observed data. Section 4 illustrates the proposed method by fitting HMMs, with and without misclassification, to data from the ASCUS/LSIL Triage Study for Cervical Cancer (ALTS).13 In Section 5, we conducted simulation studies that compared three estimation approaches: (1) our proposed method with misclassification; (2) our proposed method without misclassification; and (3) standard methods using only fully observed measurements. Finally, we discuss our findings and implications for future work in Section 6.

2 |. MODEL FORMULATION

2.1 |. Partially observed data

We introduce the problem of partially missing measurements, followed by the full development of the model in Section 2.2. Let $X_{it}$ and $Y_{it}$ denote the latent (true) and observed (subject to misclassification) state, respectively, for the $t$th time point on the $i$th participant. Each state takes a value between 1 and $k$, and $X_i = (X_{i1}, X_{i2}, \ldots, X_{in})$ and $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in})$. We assume all participants have the same follow-up length $n_i = n$ and that all observations are equally spaced.

For the ALTS example each state is composed of three tests: HPV test (HPV), cytology, and colposcopy. HPV is 0 if the HPV test is negative, 1 if the test is newly positive, or 2 if the test has been positive for two or more tests in a row (persistence). Cytology is 0 if the cervical cytology result is normal, 1 if cytology is equivocal/low-grade, or 2 if cytology is HSIL/carcinoma. Likewise, colposcopy is 0 if the colposcopy is <CIN2 or 1 if CIN2+. Each state is represented as a tuple of the three tests, (HPV, cytology, colposcopy). Yi is the observed result of the three tests with misclassification and Xi is the underlying disease status (eg, what the result would be without misclassification). In our example there are 18 possible states (3 × 3 × 2) and five equally spaced observations taken semiannually per individual.

$Y_{it}$ is considered to be partially observed when certain values of the state space are ruled out but at least two values are still possible. Suppose we know that $Y_{it}$ cannot be odd and $0 < Y_{it} \le k$; we can then infer that the possible values for $Y_{it}$ are all even numbers less than or equal to $k$. Partially observed data occur only in the observable data because the true state of the individual exists with or without observations. For estimation, we replace the partially observed data with the set of all possible full observations.

In our example, there are two scenarios that cause partially observed data. The first is when one or two of the three tests are missing. Since we have information about some but not all of the tests, there are multiple possibilities for what the fully observed data could be. The second case is when an individual tests HPV positive but we have no knowledge of their HPV test results 6 months prior and do not know if the HPV is either newly positive or persistent. Three possible participants’ test results can be seen in Figure 1.

FIGURE 1.

Participants are grouped by column, tests are grouped by row. Filled in circles represent fully observed data and open circles represent partially observed data. Participant one has partially observed data from a missed colposcopy at time 3. Participant two has partially observed data from an HPV positive test result at time one of the trial and a missed colposcopy at time two. Participant three has partially observed data due to a missed HPV test at time two and subsequent positive result at time three, and three missed colposcopies from times two to four. HPV, human papillomavirus

Define $Y_{it}^* = \{Y_{it,1}^*, Y_{it,2}^*, \ldots, Y_{it,|Y_{it}^*|}^*\}$ as the set of all possible fully observed states for $Y_{it}$, where $|Y_{it}^*|$ is the size of the set $Y_{it}^*$. When the data are fully observed, $Y_{it}^* = Y_{it}$; when the data are partially observed, $Y_{it}^*$ contains all possible fully observed states. For example, let $Y_{it} = (2,2,1)$. As all three tests were observed, $Y_{it}^* = (2,2,1)$ as well. Now suppose that the colposcopy was not observed, so $Y_{it} = (2,2,-)$. Then, because a colposcopy can be either 0 or 1, $Y_{it}^* = \{(2,2,0), (2,2,1)\}$.
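As an illustration of how the set of possible full observations can be enumerated, the following is a minimal sketch. The function name `possible_full_states`, the use of `None` for a missing test, and the `hpv_history_unknown` flag are assumptions made for this example and are not part of the original method description.

```python
from itertools import product

# Value ranges per test as described in the text:
# HPV 0-2, cytology 0-2, colposcopy 0-1 (18 combinations in total).
TEST_LEVELS = ((0, 1, 2), (0, 1, 2), (0, 1))


def possible_full_states(partial, hpv_history_unknown=False):
    """Enumerate every full (HPV, cytology, colposcopy) tuple compatible
    with a partially observed tuple. None marks a missing test."""
    choices = [
        levels if value is None else (value,)
        for value, levels in zip(partial, TEST_LEVELS)
    ]
    # Second scenario in the text: a positive HPV result with an unknown
    # previous result could be either new (1) or persistent (2).
    if hpv_history_unknown and partial[0] in (1, 2):
        choices[0] = (1, 2)
    return list(product(*choices))


# Colposcopy missing, as in the (2,2,-) example from the text.
example = possible_full_states((2, 2, None))
```

A fully observed tuple yields a single-element list, so the same representation can be used for every visit.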

Let $W_i = (W_{i1}, \ldots, W_{in})$ be a vector of the fully observed state if $Y_{it}$ is not partially missing or what the fully observed state would be if $Y_{it}$ is partially missing (what the value would be if all tests were observed). It is important to note that for both possibilities, $W_i$ is subject to misclassification as it corresponds to observable states. We can formally define $W_i$ such that:

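$$W_{it} = \begin{cases} Y_{it} & \text{if } Y_{it} \text{ is fully observed} \\ Y_{it,j}^* & \text{if } Y_{it} \text{ is partially observed,} \end{cases} \tag{1}$$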

where Yit,j* is the fully observed state if all tests had been done and refers to the jth element of the set Yit*. Continuing the previous example, let Xit=(2,2,1), Yit=(2,2,-), and Yit*={(2,2,0),(2,2,1)}. If a colposcopy were to be done (and therefore if Yit were to be fully observed) with no misclassification, then Wit=(2,2,1); however, if there would have been a misclassification in the observation, then Wit=(2,2,0).

We assume an individual cannot transition from a colposcopy CIN2+ to a colposcopy <CIN2 state. Of note, we censor individuals at the first observed CIN2+ as patients with observed CIN2+ were often treated. We also assume that the HPV tests have no misclassification14,15 and colposcopies have a specificity of 1. The transitions allowed for in our model are visualized in Figure 2. We also assume that the three tests can be modeled as a first-order Markov chain and that the misclassification probabilities are independent over visits on the same individual. Denote the probability of the Markov chain starting in state $l_1$ by $r_{l_1} = P(X_{i1} = l_1)$, the probability of transitioning from state $l_1$ to state $l_2$ by $p_{l_1 l_2} = P(X_{it} = l_2 \mid X_{i,t-1} = l_1)$, and the probability of classifying state $l_1$ as state $l_2$ by $M_{l_1 l_2} = P(W_{it} = l_2 \mid X_{it} = l_1)$.

FIGURE 2.

Possible state transitions by diagnostic test for hidden Markov model

The joint distribution of the latent and observed states can be calculated by multiplying (1) the probability of starting at the initial latent state, $r_{X_{i1}}$, (2) the transition probabilities for the latent states, $p_{X_{i,t-1} X_{it}}$, and (3) the probability of classifying the latent state as the observed state, $M_{X_{it} W_{it}}$. The complete-data joint distribution can be written as

$$P(X_i, W_i) = r_{X_{i1}} \prod_{t=2}^{n} p_{X_{i,t-1} X_{it}} \prod_{t=1}^{n} M_{X_{it} W_{it}}. \tag{2}$$
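To make Equation (2) concrete, the sketch below evaluates the complete-data joint probability for one latent path and one (would-be) observed path. States are coded as integer indices, and the two-state parameter values are hypothetical toy numbers chosen for illustration, not estimates from ALTS.

```python
import numpy as np


def joint_probability(x, w, r, p, M):
    """Equation (2): r[x_1] * prod_t p[x_{t-1}, x_t] * prod_t M[x_t, w_t]."""
    prob = r[x[0]]
    # Transition terms for t = 2, ..., n.
    for prev, cur in zip(x[:-1], x[1:]):
        prob *= p[prev, cur]
    # Classification terms for t = 1, ..., n.
    for latent, observed in zip(x, w):
        prob *= M[latent, observed]
    return float(prob)


# Hypothetical toy parameters (state 0 = healthy, state 1 = diseased).
r = np.array([0.6, 0.4])                 # initial state probabilities
p = np.array([[0.7, 0.3], [0.0, 1.0]])   # transitions; state 1 is absorbing
M = np.array([[0.9, 0.1], [0.2, 0.8]])   # classification probabilities
example = joint_probability([0, 0, 1], [0, 1, 1], r, p, M)
```

A path that uses a disallowed transition (here, leaving state 1) receives probability zero, which is how structural zeros in the transition matrix constrain the model.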

In the next subsection, we incorporate population heterogeneity by using a mover-stayer model. Specifically, we incorporate a two-group mixture where one group is composed of individuals who have no HPV or cervical abnormality, and the other group is susceptible to cervical cancer progression. Similar models have been proposed for breast cancer screening without incorporating misclassification.16

2.2 |. Mover-Stayer

Let $Q_i$ be an indicator of not being susceptible to HPV infection and cervical cancer progression (ie, of being a stayer), and let $\pi = P(Q_i = 1)$. When $\pi = 0$, this model reduces to the standard HMM described in the previous subsection. If individual $i$ is a stayer, then they remain in the same latent state with probability one. In our example, we only allow stayers in the healthy (0,0,0) latent state. Therefore, if individual $i$ is a stayer, then $X_i = ((0,0,0), \ldots, (0,0,0))$. As being a stayer only affects the latent state, there are no further restrictions on classification. The joint distribution for individual $i$ with partially observed data and the mover-stayer component can be written as

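$$f_i(X_i, W_i, Q_i) = \left(\pi \prod_{t=1}^{n} M_{0 W_{it}}\right)^{Q_i} \times \left((1-\pi)\, r_{X_{i1}} \prod_{t=2}^{n} p_{X_{i,t-1} X_{it}} \prod_{t=1}^{n} M_{X_{it} W_{it}}\right)^{1-Q_i}. \tag{3}$$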

If the individual is a stayer ($Q_i = 1$), then the probability is found by multiplying the stayer proportion $\pi$ by the classification probabilities for each observed data point given that the latent state is 0. If the individual is a mover ($Q_i = 0$), then the joint distribution is $(1-\pi)$ times the HMM joint distribution in Equation (2). Note that while we directly observe $Y_i$, both $X_i$ and $Q_i$ are hidden and $W_i$ is partially hidden.
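The two branches of Equation (3) can be sketched as follows. This is an illustrative implementation under the assumption that states are integer indices with index 0 denoting the healthy state; the function name and toy parameter values are hypothetical.

```python
import numpy as np


def mover_stayer_joint(x, w, q, pi, r, p, M):
    """Equation (3) for a single individual.

    q = 1: stayer, latent path fixed at healthy state 0.
    q = 0: mover, standard HMM joint distribution weighted by (1 - pi).
    """
    if q == 1:
        # Only the classification probabilities from latent state 0 remain.
        return float(pi * np.prod([M[0, wt] for wt in w]))
    prob = (1.0 - pi) * r[x[0]]
    for prev, cur in zip(x[:-1], x[1:]):
        prob *= p[prev, cur]
    for latent, observed in zip(x, w):
        prob *= M[latent, observed]
    return float(prob)


# Hypothetical toy parameters (not estimates from the paper).
r = np.array([0.6, 0.4])
p = np.array([[0.7, 0.3], [0.0, 1.0]])
M = np.array([[0.9, 0.1], [0.2, 0.8]])
stayer_term = mover_stayer_joint([0, 0], [0, 0], q=1, pi=0.1, r=r, p=p, M=M)
mover_term = mover_stayer_joint([0, 1], [0, 1], q=0, pi=0.1, r=r, p=p, M=M)
```

Setting `pi=0` in the mover branch recovers the plain HMM joint probability, matching the reduction described in the text.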

The proposed approach incorporates potential misclassification in state identification. A special case of the approach is when there is no misclassification. This is a mover-stayer model with partially observed states, which in itself has not been considered before.

As a sensitivity analysis we examined the first-order Markov assumption as well. The mixture transition distribution model17 parametrizes a second-order Markov chain while only creating one additional weight parameter. One assumption inherent to the model is that the transition matrix is the same for both the first- and second-order transition rates, with the weight parameter changing to account for the differences. To transition to a state at time $t$, the states at time $t-2$ and $t-1$ must now be taken into account. The effect of each state is considered separately, and the conditional probability is

$$P(X_{it} = l_3 \mid X_{i,t-2} = l_1, X_{i,t-1} = l_2) = \lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}, \tag{4}$$

where $l_1$, $l_2$, and $l_3$ all correspond to possible states, $\lambda_1$ is the weight parameter associated with the first lag, $\lambda_2$ is the weight parameter associated with the second lag, and $\lambda_1 + \lambda_2 = 1$. $P_{l_1 l_2}$, the transition rate from state $l_1$ to state $l_2$, can correspond to either a first- or second-order transition depending on which $\lambda$ it is multiplied by. Note that this model reduces to a first-order Markov chain when $\lambda_2 = 0$. Although only one more parameter is necessary, this considerably increases the computational complexity. Appendix A provides the details for this extension in full.
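A small sketch of the mixture transition distribution in Equation (4) may help show why only one extra parameter is needed. The function name and the two-state transition matrix below are hypothetical and used only for illustration.

```python
import numpy as np


def mtd_transition_row(P, lam1, l1, l2):
    """Distribution of X_t given X_{t-2} = l1 and X_{t-1} = l2 under Eq. (4).

    The same matrix P is reused for both lags; lam1 weights the first lag
    and lam2 = 1 - lam1 weights the second lag.
    """
    lam2 = 1.0 - lam1
    return lam1 * P[l2] + lam2 * P[l1]


# Hypothetical two-state transition matrix.
P = np.array([[0.7, 0.3], [0.2, 0.8]])
row = mtd_transition_row(P, lam1=0.75, l1=0, l2=1)
first_order = mtd_transition_row(P, lam1=1.0, l1=0, l2=1)
```

Because each result is a convex combination of two rows of `P`, it is itself a valid probability distribution, and with `lam1=1.0` it coincides with the first-order row `P[l2]`.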

3 |. ESTIMATION

3.1 |. Complete data log likelihood

Here, we will introduce the procedure for estimation assuming no fully missing data but allow partially observed data. The method for dealing with fully missing data (ie, all tests are missing at a particular discrete visit) is similar to the method proposed by Zucchini et al4 and is given in Appendix B. Define the indicator variable $Z_{l_1}(X_{it})$ as 1 if $l_1 = X_{it}$ and 0 otherwise. We can then rewrite the complete-data joint distribution as

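$$f_i(X_i, W_i, Q_i) = \left[\pi \prod_{t=1}^{n} \prod_{l_1=0}^{k} M_{0 l_1}^{Z_0(X_{it}) Z_{l_1}(W_{it})}\right]^{Q_i} \times \left[(1-\pi) \prod_{l_1=0}^{k} r_{l_1}^{Z_{l_1}(X_{i1})} \times \prod_{t=2}^{n} \prod_{l_1=0}^{k} \prod_{l_2=0}^{k} p_{l_1 l_2}^{Z_{l_1}(X_{i,t-1}) Z_{l_2}(X_{it})} \times \prod_{t=1}^{n} \prod_{l_1=0}^{k} \prod_{l_2=0}^{k} M_{l_1 l_2}^{Z_{l_1}(X_{it}) Z_{l_2}(W_{it})}\right]^{1-Q_i}. \tag{5}$$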

Because $X_i$, $W_i$, and $Q_i$ are all latent, to find the maximum likelihood estimates we will maximize the marginal likelihood of the observed data, or $Y_i^*$'s, by evaluating the complete-data joint distribution at every possible value that the latent $X_i$'s, $W_i$'s, and $Q_i$'s could take. Let $I$ denote the number of individuals. Therefore:

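$$P(Y_1^*, Y_2^*, \ldots, Y_I^*) = \prod_{i=1}^{I} P(Y_i^*) \quad \text{and} \quad P(Y_i^*) = \sum_{l_1=0}^{k} \cdots \sum_{l_n=0}^{k} \sum_{w_1=0}^{k} \cdots \sum_{w_n=0}^{k} \sum_{q=0}^{1} f_i(l_1, \ldots, l_n, w_1, \ldots, w_n, q), \tag{6}$$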

where $l_t$ is the latent state at time $t$ for individual $i$, $w_t$ is the observed, or would-be observed, state at time $t$ for individual $i$, and $q$ is an indicator for whether individual $i$ is a stayer. Although the summation of each (would-be) observed state ($w_t$) is from 0 to $k$, values that are not in the set of all possible fully observed states $Y_{it}^*$ will have probabilities of 0 in the joint distribution. In the above joint distribution, $l_1, \ldots, l_n$ takes the place of the vector of latent states $X_i$ and $w_1, \ldots, w_n$ takes the place of the vector of possible fully observed states $W_i$. Although this likelihood can be directly maximized in theory, the number of terms makes the optimization intractable. Instead, we use the EM algorithm, iterating between an expectation and maximization step until the parameter estimates converge. The E step calculates the expected value of the complete data log likelihood as follows,
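To give a rough sense of the scale involved, consider the ALTS setting described in Section 2.1, with 18 states and $n = 5$ visits. Treating 18 as the number of values each sum ranges over (an assumption made here for illustration), an upper bound on the number of summands in Equation (6) for a single individual is

$$\underbrace{18^{5}}_{\text{latent paths}} \times \underbrace{18^{5}}_{\text{observed paths}} \times \underbrace{2}_{q} = 2 \times 18^{10} \approx 7.1 \times 10^{12}.$$

Many of these terms are zero because of disallowed transitions and observed states outside $Y_{it}^*$, but the count illustrates why a recursive algorithm is preferred over direct enumeration.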

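$$E\left[\sum_{i=1}^{I} \log f_i(X_i, W_i, Q_i) \,\middle|\, Y_i\right] = E[Q_i \mid Y_i] \left(\log \pi + \sum_{i=1}^{I} \sum_{t=1}^{n} \sum_{l_1=0}^{k} E[Z_0(X_{it}) Z_{l_1}(W_{it}) \mid Y_i] \log M_{0 l_1}\right) + \left(1 - E[Q_i \mid Y_i]\right) \left\{\log(1-\pi) + \sum_{i=1}^{I} \sum_{l_1=0}^{k} E[Z_{l_1}(X_{i1}) \mid Y_i] \log(r_{l_1}) + \sum_{i=1}^{I} \sum_{t=2}^{n} \sum_{l_1=0}^{k} \sum_{l_2=0}^{k} E[Z_{l_1}(X_{i,t-1}) Z_{l_2}(X_{it}) \mid Y_i] \log(p_{l_1 l_2}) + \sum_{i=1}^{I} \sum_{t=1}^{n} \sum_{l_1=0}^{k} \sum_{l_2=0}^{k} E[Z_{l_1}(X_{it}) Z_{l_2}(W_{it}) \mid Y_i] \log(M_{l_1 l_2})\right\}, \tag{7}$$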

where $E[Q_i \mid Y_i]$, $E[Z_{l_1}(X_{i1}) \mid Y_i]$, $E[Z_{l_1}(X_{i,t-1}) Z_{l_2}(X_{it}) \mid Y_i]$, and $E[Z_{l_1}(X_{it}) Z_{l_2}(W_{it}) \mid Y_i]$ are calculated using an adaptation of the forward-backward algorithm, explained in Section 3.4. The standard EM and forward-backward algorithm can be used when the complete set of tests is missing at a particular follow-up time. In this case, we simply apply the forward-backward algorithm, removing the misclassification probabilities from the recursion steps associated with the missing measurements. However, this simple change in the forward-backward algorithm does not work for partially missing data. In the next two subsections we propose an alternative recursion algorithm that specifically addresses the issue of partially missing data.

Regarding the two alternative approaches mentioned in the introduction, the no-misclassification method is the special case in which misclassification is ignored by setting $M_{l_1 l_2}$ to 0 if $l_1 \neq l_2$ and 1 otherwise. For the method without partial observations, all partially observed data are removed and estimation proceeds as described.

3.2 |. Forward algorithm

To include the partially missing values in the forward algorithm we will consider every possible fully observed state. It is necessary to keep track of exactly which partially observed state is being used at each point for the E-step calculations. Define the forward quantity as:

$$\alpha_{l_1}(it)_j = P(Y_{i1}^*, Y_{i2}^*, \ldots, Y_{i,t-1}^*, Y_{it,j}^*, X_{it} = l_1 \mid Q_i = 0) \tag{8}$$

and is calculated by

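$$\alpha_{l_1}(it)_j = \begin{cases} r_{l_1} M_{l_1 Y_{i1,j}^*} & \text{if } t = 1 \\ \sum_{l_0=0}^{k} \sum_{m=1}^{|Y_{i,t-1}^*|} \alpha_{l_0}(i,t-1)_m \, p_{l_0 l_1} M_{l_1 Y_{it,j}^*} & \text{if } t > 1. \end{cases} \tag{9}$$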

Note that the $j$ in $\alpha_{l_1}(it)_j$ indicates that the fully observed data point $Y_{it,j}^*$ is used rather than the (possibly partially observed) $Y_{it}^*$. By calculating the forward quantity individually for each possible fully observed state, we will then be able to evaluate expectations needed to perform the E-step. This involves computing $\alpha_{l_1}^*(it)$ where

$$\alpha_{l_1}^*(it) = P(Y_{i1}^*, \ldots, Y_{it}^*, X_{it} = l_1 \mid Q_i = 0) = \sum_{j=1}^{|Y_{it}^*|} \alpha_{l_1}(it)_j. \tag{10}$$

The additional computation time of our approach scales linearly with the size of the set of possible full observations that could have been observed.
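The forward recursion in Equations (8) to (10) can be sketched as below. This is an illustrative implementation, not the authors' code: states are integer indices, each visit is represented by a list of candidate fully observed states (the set $Y_{it}^*$), and the three-state toy parameters are hypothetical.

```python
import numpy as np


def forward(obs_sets, r, p, M):
    """Forward quantities for a mover with partially observed visits.

    obs_sets[t] lists the candidate fully observed states at visit t.
    Returns a list where alphas[t][j, l] corresponds to alpha_l(it)_j.
    """
    alphas = []
    for t, candidates in enumerate(obs_sets):
        emit = M[:, candidates].T  # emit[j, l] = M[l, candidate j]
        if t == 0:
            a = r[None, :] * emit
        else:
            # Sum over candidates m at the previous visit gives alpha*(i,t-1).
            prev_star = alphas[-1].sum(axis=0)
            # Sum over previous latent states l0, then apply classification.
            a = (prev_star @ p)[None, :] * emit
        alphas.append(a)
    return alphas


# Hypothetical three-state toy parameters (state 2 is absorbing).
r = np.array([0.5, 0.3, 0.2])
p = np.array([[0.8, 0.15, 0.05], [0.1, 0.7, 0.2], [0.0, 0.0, 1.0]])
M = np.array([[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.0, 0.3, 0.7]])
obs_sets = [[0], [1, 2], [2]]  # the second visit is partially observed
alphas = forward(obs_sets, r, p, M)
mover_likelihood = alphas[-1].sum()  # P(Y_i* | Q_i = 0)
```

Keeping one row per candidate state is what allows the later E-step to attribute probability to each possible full observation, at a cost proportional to the number of candidates at each visit.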

3.3 |. Backward algorithm

The backward algorithm adaptation is defined slightly differently than its forward counterpart. Because we do not need to keep track of each specific partially observed state for any future calculations, we can sum over all partially observed states. We define the backward quantity as:

$$\beta_{l_1}(it) = P(Y_{i,t+1}^*, Y_{i,t+2}^*, \ldots, Y_{in}^* \mid X_{it} = l_1, Q_i = 0), \tag{11}$$

which is calculated by

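$$\beta_{l_1}(it) = \begin{cases} 1 & \text{if } t = n \\ \sum_{l_2=0}^{k} p_{l_1 l_2} \sum_{j=1}^{|Y_{i,t+1}^*|} M_{l_2 Y_{i,t+1,j}^*} \, \beta_{l_2}(i,t+1) & \text{if } t < n. \end{cases} \tag{12}$$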

In contrast with the forward step, the backward step does not individually calculate backward quantities for each possible fully observed state. Specifically, the backward probabilities $\beta_{l_1}(it)$ are evaluated by including an additional summation over the possible fully observed states in the recursion equation. Compared with the standard forward-backward algorithm, our proposed adaptation takes additional time directly proportional to the size of the set of possible full observations at each partially observed visit.
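A corresponding sketch of the backward recursion in Equations (11) and (12) is shown below, again with integer-coded states, candidate lists per visit, and hypothetical toy parameters. It is meant only to illustrate the extra summation over candidate states.

```python
import numpy as np


def backward(obs_sets, p, M):
    """Backward quantities beta_l(it), summing over candidate observations."""
    n = len(obs_sets)
    betas = [None] * n
    betas[n - 1] = np.ones(p.shape[0])
    for t in range(n - 2, -1, -1):
        # Sum of classification probabilities over candidates at visit t+1.
        emit_sum = M[:, obs_sets[t + 1]].sum(axis=1)
        # Sum over the next latent state l2.
        betas[t] = p @ (emit_sum * betas[t + 1])
    return betas


# Hypothetical three-state toy parameters (state 2 is absorbing).
r = np.array([0.5, 0.3, 0.2])
p = np.array([[0.8, 0.15, 0.05], [0.1, 0.7, 0.2], [0.0, 0.0, 1.0]])
M = np.array([[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.0, 0.3, 0.7]])
obs_sets = [[0], [1, 2], [2]]  # the second visit is partially observed
betas = backward(obs_sets, p, M)
# Combining with the first-visit terms gives P(Y_i* | Q_i = 0).
mover_likelihood = float(np.sum(r * M[:, obs_sets[0]].sum(axis=1) * betas[0]))
```

Unlike the forward pass, only one vector per visit is stored, since the candidate states are summed out inside the recursion.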

3.4 |. Calculating expectations

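The following are the E-step expectations, calculated using $\alpha_{l_1}(it)_j$ and $\beta_{l_1}(it)$.

$$E[Q_i \mid Y_i] = \frac{P(Q_i = 1, Y_i)}{P(Y_i)} = \frac{\pi \prod_{t=1}^{n} \sum_{j=1}^{|Y_{it}^*|} P(Y_{it,j}^* \mid X_{it} = 0)}{P(Y_i)}, \tag{13}$$

$$E[Z_{l_1}(X_{i1}) \mid Y_i] = \frac{P(X_{i1} = l_1, Y_i)}{P(Y_i)} = \frac{\alpha_{l_1}^*(i1)\, \beta_{l_1}(i1)}{P(Y_i)}, \tag{14}$$

$$E[Z_{l_1}(X_{i,t-1}) Z_{l_2}(X_{it}) \mid Y_i] = \frac{P(X_{i,t-1} = l_1, X_{it} = l_2, Y_i)}{P(Y_i)} = \frac{\alpha_{l_1}^*(i,t-1)\, p_{l_1 l_2} \sum_{j=1}^{|Y_{it}^*|} M_{l_2 Y_{it,j}^*}\, \beta_{l_2}(it)}{P(Y_i)}, \tag{15}$$

$$E[Z_{l_1}(X_{it}) Z_{l_2}(W_{it}) \mid Y_i] = P(X_{it} = l_1 \mid Y_i)\, P(W_{it} = l_2 \mid X_{it} = l_1, Y_i) = \frac{\alpha_{l_1}^*(it)\, \beta_{l_1}(it)}{P(Y_i)} \times \frac{P(Y_{it,j}^* = l_2 \mid X_{it} = l_1)}{\sum_{m=1}^{|Y_{it}^*|} P(Y_{it,m}^* \mid X_{it} = l_1)}, \tag{16}$$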

where

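$$P(Y_i) = P(Y_i, Q_i = 1) + P(Y_i, Q_i = 0) = \pi \prod_{t=1}^{n} \sum_{j=1}^{|Y_{it}^*|} P(Y_{it,j}^* \mid X_{it} = 0) + (1 - \pi) \sum_{l_1=0}^{k} \alpha_{l_1}^*(in). \tag{17}$$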
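As a sketch of how Equations (13) and (17) combine, the following computes the marginal likelihood of one individual's data and the posterior probability of being a stayer. It assumes integer-coded states with index 0 as the healthy state and one list of candidate fully observed states per visit; the function name and toy parameters are hypothetical.

```python
import numpy as np


def stayer_posterior(obs_sets, pi, r, p, M):
    """Return (E[Q_i | Y_i], P(Y_i)) for one individual."""
    # Stayer branch: latent state is 0 at every visit.
    stayer_lik = float(np.prod([M[0, c].sum() for c in obs_sets]))
    # Mover branch: recursion on alpha*, summing over candidates per visit.
    a = r * M[:, obs_sets[0]].sum(axis=1)
    for c in obs_sets[1:]:
        a = (a @ p) * M[:, c].sum(axis=1)
    mover_lik = float(a.sum())
    total = pi * stayer_lik + (1.0 - pi) * mover_lik
    return pi * stayer_lik / total, total


# Hypothetical three-state toy parameters (state 0 = healthy).
r = np.array([0.5, 0.3, 0.2])
p = np.array([[0.8, 0.15, 0.05], [0.1, 0.7, 0.2], [0.0, 0.0, 1.0]])
M = np.array([[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.0, 0.3, 0.7]])
post_healthy, lik_healthy = stayer_posterior([[0], [0, 1], [0]], 0.1, r, p, M)
post_diseased, lik_diseased = stayer_posterior([[0], [2], [2]], 0.1, r, p, M)
```

In this toy setup an observation that cannot arise from the healthy latent state (here, observed state 2, since `M[0, 2]` is zero) forces the stayer posterior to zero.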

4 |. APPLICATION

We illustrate our proposed method on data from the ALTS.13 The ALTS was conducted from 1996 to 2000 to evaluate management strategies for mildly abnormal and very common cytology results. Nonpregnant women, 18 years or older, with equivocal (n = 3844) or low-grade squamous intraepithelial lesions (LSIL) (n = 1572) cytology results within 6 months prior to enrollment were randomly assigned to one of three management arms: immediate colposcopy (IC), HPV triage, or conservative management (CM). We apply our methods to the larger population of patients referred for equivocal cytology results by fitting models that (1) account for misclassification and (2) assume no misclassification.

Women had colposcopy at enrollment if they were (1) randomized to the IC arm, (2) randomized to the HPV triage arm and were enrollment HPV positive or had an enrollment cytology result of HSIL/carcinoma, or (3) randomized to the CM arm and had an enrollment cytology result of HSIL/carcinoma. Regardless of randomization arm, patients in the study were screened biannually over 2 years with HPV and cytology. In addition, patients had colposcopies at postenrollment visits if cytology results were HSIL/carcinoma, and at the 2-year exit screen. Our HMM treats each of the biannual screens and any attendant colposcopies as discrete times spaced 6 months apart. As persistent HPV infections are a necessary precursor to cervical cancer, we classified HPV results as negative, new, or persistent for at least 6 months. The HPV result was missing for 5% of visits. For the 43% of HPV positives found at the enrollment visit or at a visit following a missing HPV test result, it is not known if the HPV infection is new or persistent. Cytology results were classified as normal, abnormal/low-grade, or HSIL/carcinoma and were rarely missing (0.3% of all visits). Colposcopy results were classified as cervical intraepithelial neoplasia grade 2+ (CIN2+, aka precancers) or <CIN2. For women who tested negative for both the HPV and cytology tests, we assumed colposcopies of <CIN2 as this is the common clinical assumption. Even after applying this assumption, colposcopy results were missing for 42% of visits, primarily by design. In addition, some results were missing due to skipped visits or loss to follow-up. In sum, the three tests were fully observed at 38% of visits, partially observed at 40% of visits, and completely unobserved at 22% of visits.

Our HMM allowed for certain state transitions (Figure 2) for HPV, cytology, and colposcopy as per the known natural history of HPV and cervical cancer.6 Most HPV infections clear quickly, but persistent HPV infections can cause cervical lesions that are detectable via cytology tests and colposcopies. While CIN2+ lesions can regress to <CIN2, we could not model this process using the data from ALTS as women with observed CIN2+ have their natural history interrupted by treatment. We allowed for a heterogeneous population of “movers,” women who may transition from state to state, and “stayers,” women whose latent states for HPV/cytology/colposcopy remained negative/normal/<CIN2 throughout the 2 years of the study. This mirrors the realities of population-wide cervical screening, which will always include some women who are not at risk for new HPV infections and subsequent precancer/cancers (eg, women in monogamous relationships).

In the model that accounts for misclassification, we fixed certain parameters in the misclassification matrix of the HMM to mirror the known behavior of each test. As HPV tests have high accuracy,14,15 we assumed no misclassification of the HPV state. Because cytology results have poorer accuracy,9 we allowed misclassification of each cytology state. As colposcopy diagnoses in ALTS underwent extensive quality control, we assumed that CIN2+ was not misclassified. However, we allow for the possibility that a <CIN2 colposcopy result could be a misclassified CIN2+, which can occur if a CIN2+ lesion was not biopsied.

We fit an HMM to the ALTS dataset using our proposed method for partially observed data with and without misclassification (models A and B, respectively). To check that the EM algorithm arrived at a global maximum, we used multiple different starting parameters and confirmed that the algorithm estimated similar parameter values. In general, estimates allowing for misclassification had larger standard errors. We present some of the more precise and scientifically interesting estimates in Table 1. When testing the sensitivity of the first-order Markov assumption, $\lambda_1$ approached 1, indicating that the first-order Markov assumption holds. The second-order estimated parameters are also similar to those of the first-order model.

TABLE 1.

Stayer, initial HPV, selected transition, and selected misclassification probabilities for ALTS using the proposed method

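Model A: with misclassification. Model B: without misclassification. Transition rows read previous state → current state; misclassification rows read true state → observed state (model A only).

| Parameter | State(s) | Model A Est | Model A Bootstrap SE | Model B Est | Model B Bootstrap SE |
|---|---|---|---|---|---|
| Stayers | | 0.059 | 0.032 | 0.017 | 0.011 |
| Movers: initial state | New HPV | 0.35 | 0.103 | 0.29 | 0.056 |
| Movers: initial state | Persistent HPV | 0.22 | 0.083 | 0.25 | 0.056 |
| Transition probability | (0,0,0) → (0,0,0) | 0.69 | 0.09 | 0.71 | 0.01 |
| Transition probability | (0,0,0) → (0,1,0) | 0 | 0.04 | 0.14 | 0.01 |
| Transition probability | (0,0,0) → (1,0,0) | 0.31 | 0.07 | 0.11 | 0.01 |
| Transition probability | (0,0,0) → (1,1,0) | 0 | 0.02 | 0.04 | 0 |
| Transition probability | (1,0,0) → (0,0,0) | 0.36 | 0.13 | 0.49 | 0.04 |
| Transition probability | (1,0,0) → (2,-,-) | 0.37 | 0.12 | 0.45 | 0.04 |
| Transition probability | (2,0,0) → (0,0,0) | 0.46 | 0.14 | 0.24 | 0.03 |
| Transition probability | (2,0,0) → (2,-,-) | 0.54 | 0.17 | 0.73 | 0.05 |
| Transition probability | (2,1,0) → (-,-,1) | 0.17 | 0.13 | 0.04 | 0.02 |
| Misclassification probability | (2,1,1) → (2,0,0) | 0 | 0.06 | | |
| Misclassification probability | (2,1,1) → (2,1,0) | 0.5 | 0.21 | | |
| Misclassification probability | (2,1,1) → (2,2,0) | 0.02 | 0.02 | | |
| Misclassification probability | (2,1,1) → (2,0,1) | 0 | 0.04 | | |
| Misclassification probability | (2,1,1) → (2,1,1) | 0.43 | 0.22 | | |
| Misclassification probability | (2,1,1) → (2,2,1) | 0.05 | 0.02 | | |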

Note: State tuples correspond to (HPV, cytology, colposcopy). HPV values are negative, new, or persistent (0–2), cytology values are normal, equivocal/low-grade, and HSIL/carcinoma (0–2), and colposcopy values are <CIN2 and CIN2+ (0–1). State (2,1,0) corresponds to HPV persistent, cytology equivocal/low-grade, and colposcopy <CIN2. (2,-,-) and (-,-,1) are marginal transition probabilities for being in the HPV-persistent state and in the CIN2+ state, respectively.

Abbreviation: HPV, human papillomavirus.

Bootstrap standard errors were calculated by sampling n = 3488 individuals, with replacement. Models A and B were then applied to the sampled data. This process was repeated 2500 times. Bootstrap standard errors were then calculated for each parameter by using the 2500 bootstrap estimates.
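The resampling scheme described above can be sketched generically as follows. The `fit` argument is a hypothetical placeholder for any estimation routine that maps a list of individuals to a parameter vector (for example, an EM fit of model A or B); it is not a function defined in the paper. The default of 200 replicates is chosen only to keep the example fast; the text reports 2500.

```python
import numpy as np


def bootstrap_se(individuals, fit, n_boot=200, seed=0):
    """Nonparametric bootstrap standard errors over individuals.

    individuals: sequence with one entry per participant.
    fit: callable returning a 1-D array of parameter estimates.
    """
    rng = np.random.default_rng(seed)
    n = len(individuals)
    estimates = []
    for _ in range(n_boot):
        # Resample whole individuals with replacement, keeping each
        # participant's visit sequence intact.
        idx = rng.integers(0, n, size=n)
        estimates.append(fit([individuals[i] for i in idx]))
    # Standard deviation of the replicate estimates, per parameter.
    return np.std(np.asarray(estimates), axis=0, ddof=1)


# Toy usage with a stand-in estimator (the sample mean).
toy_data = list(np.random.default_rng(1).normal(size=100))
toy_se = bootstrap_se(toy_data, lambda d: np.array([np.mean(d)]))
```

Resampling at the level of the individual, rather than the visit, preserves the within-person dependence that the HMM is designed to model.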

One question of interest for vaccination and screening strategies is the transition rates among healthy women. Model A estimated 5.9% (SE: 3.2%) of the population to be stayers, whereby these participants were not susceptible to HPV over the 2-year time period. Interestingly, assuming no misclassification (model B) resulted in an estimate of only 1.7% (SE: 1.1%) stayers, primarily because all women with equivocal/low-grade cytology results would be treated as having cervical changes. Overall, the percentage of stayers in ALTS is lower than what one would expect to find in the general population due to their relatively young ages (75%+ of women were under age 35 at enrollment) and their equivocal cytology test entry criteria.

Among movers susceptible to HPV and subsequent cervical precancers, healthy women in ALTS were estimated to have 6-month transition probabilities of 69% (SE: 9%) and 31% (SE: 7%) for remaining in a healthy state vs acquiring a new HPV infection without developing cervical abnormalities, respectively. When no misclassification was assumed, these estimates were 71% (SE: 1%) and 11% (SE: 1%) with an 18% (SE: 1%) probability of transitioning to an equivocal/low-grade cervix state.

Because one cannot know whether HPV infections present at the first visit are new or persistent, standard methods typically cannot use these infections to estimate differential transition rates for new vs persistent infections. Our proposed method treats HPV-positives present at the first visit as partially observed (either new or persistent HPV) and subsequently estimates transition rates. Using models A and B, respectively, we estimated that 61% and 54% of women HPV-positive at the first visit were newly infected. Estimated transition probabilities for clearance and progression then differed depending on whether the infection was new or persistent. For example, under model A, HPV-positive women with normal cervices had probabilities of having their HPV infection persist 6 months later of 36% (SE: 13%) vs 54% (SE: 17%) for new vs persistent infections, respectively.
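The 61% and 54% figures appear consistent with the initial-state estimates for movers in Table 1, taking the share of new infections among all HPV-positive initial states (this reading of the calculation is an assumption, as the text does not spell it out):

$$\text{Model A: } \frac{0.35}{0.35 + 0.22} \approx 0.61, \qquad \text{Model B: } \frac{0.29}{0.29 + 0.25} \approx 0.54.$$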

In general, transition rate estimates for progressing to CIN2+ were higher and had greater standard errors when assuming misclassification of colposcopy results (model A) than when assuming no misclassification of colposcopy results (model B). For example, women in the state of persistently HPV-positive, equivocal/low-grade cytology, and <CIN2 colposcopy had a 17% (SE: 13%) vs 4% (SE: 2%) probability of transitioning to CIN2+ with models A and B, respectively.

Estimates for misclassification of cytology and colposcopy in model A had high standard errors but often also high estimates, indicating that misclassification should be accounted for. For example, women in the state of persistently HPV-positive, with equivocal/low-grade cytology, and CIN2+ colposcopy had a 50% (SE: 21%) probability of being misclassified as persistently HPV-positive, with equivocal/low-grade cytology, and <CIN2 colposcopy. Our example illustrates that in applications, two competing priorities will need to be balanced: the incorporation of misclassification and the precise estimation of transition and classification probabilities for rare states.

5 |. SIMULATION

We compare the performance of the proposed method with misclassification (method A) and without misclassification (method B) with an alternative approach that applies standard methods for an HMM with a mover-stayer component after excluding partial observations (method C). Simulated latent data are generated from the proposed model with misclassification (Section 3) at its maximum likelihood estimates, with the same number of participants and follow-up length as in the ALTS data. Observed values after a positive colposcopy result are removed to mimic patient treatment and their subsequent removal from the trial. Partial missingness is introduced independently to HPV and cytology according to their global partial missing percentage in the data, 5.5% and 0.3%, respectively. Colposcopy partial missingness is conditional on HPV and cytology where the probabilities are calculated from the ALTS data. Full observations are then removed at the same rate as in the ALTS data. Finally, any new HPV positive result not immediately following an HPV negative result and any HPV persistent result not immediately following a new HPV positive or persistent result is considered to be partially observed, as it is unclear whether the HPV is newly acquired or persistent. Simulation results are based on 2500 simulated realizations.
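One step of this data-generating process, the independent partial missingness for HPV and cytology, can be sketched as below. The function name, array layout (one row per visit, columns for HPV, cytology, colposcopy), and the use of NaN to mark a missing test are assumptions for illustration. Colposcopy missingness is left untouched here because its conditional probabilities are not given in this section.

```python
import numpy as np


def add_partial_missingness(states, p_hpv=0.055, p_cyt=0.003, seed=0):
    """Independently blank out HPV and cytology results at the stated rates.

    states: array-like of shape (n_visits, 3) holding
            (HPV, cytology, colposcopy) codes.
    Returns a float copy with NaN marking a missing test.
    """
    rng = np.random.default_rng(seed)
    out = np.asarray(states, dtype=float).copy()
    out[rng.random(len(out)) < p_hpv, 0] = np.nan  # HPV column
    out[rng.random(len(out)) < p_cyt, 1] = np.nan  # cytology column
    return out


# Toy usage on 20000 simulated visits, all in state (0, 0, 0).
toy_states = np.zeros((20000, 3))
toy_missing = add_partial_missingness(toy_states)
hpv_rate = np.isnan(toy_missing[:, 0]).mean()
cyt_rate = np.isnan(toy_missing[:, 1]).mean()
```

The empirical missing rates in the toy run should fall close to the nominal 5.5% and 0.3%.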

Tables 2 to 4 show the simulation results for estimating the initial and mover-stayer proportions, the transition probabilities, and the misclassification probabilities, respectively. Discussing each method individually, the proposed approach with misclassification (method A) has good finite sample properties for estimating the parameters of all four components and results in unbiased estimation. The 95% confidence interval for the bias contains zero for every parameter, and the empirical standard deviation (SD; labeled SE in the tables) remains small for most states. The stayer proportion (Table 2) is accurately estimated, with both extremely low bias and low SD. Even for the CIN2+ classification probabilities (bottom half of Table 4), the proposed method has minimal bias (barring one state) and reasonable SDs.

TABLE 2.

Initial state probabilities and SE for simulations

Column groups: Method A (with misclassification), Method B (without misclassification), Method C (without partial observations)

State Truth N Bias(A) SE(A) MSE(A) Bias(B) SE(B) MSE(B) Bias(C) SE(C) MSE(C)
Stayer 0.059 206 −0.002 0.016 0.0002 −0.041 0.008 0.0017 0.017 0.085 0.0076
Mover 0.941 3282 0.002 0.016 0.0002 0.041 0.008 0.0017 −0.017 0.085 0.0076
(0,0,0) 0.11 361 −0.006 0.019 0.0004 0.159* 0.01 0.0252 0.028 0.104 0.0117
(0,0,0) 0.235 770 −0.005 0.026 0.0007 −0.051* 0.007 0.0027 0.089 0.126 0.0238
(0,1,0) 0.046 150 0.002 0.018 0.0003 −0.044* 0.001 0.0019 0.123 0.101 0.0253
(0,1,0) 0.031 103 0.004 0.024 0.0006 0.022 0.022 0.001 −0.031* 0.003 0.001
(0,2,0) 0.05 164 0.011 0.031 0.001 0.07 0.071 0.0099 −0.014 0.044 0.0021
(0,2,0) 0.045 147 0.002 0.021 0.0004 −0.02* 0.005 0.0004 0.008 0.045 0.0021
(1,0,0) 0.018 59 0.005 0.02 0.0004 0.056* 0.023 0.0036 −0.017* 0.007 0.0003
(1,0,0) 0.112 368 0.022 0.037 0.0019 0.054 0.071 0.008 −0.078 0.042 0.0079
(1,1,0) 0 0 0 0 0 0.003 0.005 0 0 0 0
(1,1,0) 0.041 135 0.011 0.023 0.0006 −0.04* 0.001 0.0016 0.042 0.078 0.0077
(1,2,0) 0 0 0 0 0 0.003 0.002 0 0 0 0
(1,2,0) 0.003 11 0 0.004 0 −0.003* 0 0 0.007 0.025 0.0007
(2,0,0) 0.075 245 −0.017 0.04 0.0018 −0.066* 0.005 0.0043 −0.06* 0.025 0.0041
(2,0,0) 0.006 21 0.001 0.003 0 0.009 0.017 0.0004 0.016 0.026 0.0009
(2,1,0) 0.138 452 −0.042 0.029 0.0026 −0.097* 0.006 0.0095 −0.037 0.053 0.0041
(2,1,0) 0.04 131 0.005 0.03 0.0009 −0.036* 0.005 0.0013 −0.04* 0.001 0.0016
(2,2,0) 0.05 165 0.008 0.036 0.0014 −0.021 0.017 0.0008 −0.038 0.025 0.0021
(2,2,0) 0 0 0 0 0 0.001 0.005 0 0 0 0

Note: State tuples correspond to (HPV, cytology, colposcopy). State (2,1,0) corresponds to HPV 2, cytology 1, and colposcopy 0. Method A is the proposed method with misclassification, B is the proposed method without misclassification, and C is an HMM where the partially observed data are discarded.

* indicates that the 95% CI for the estimated bias does not include zero. N is the average number of individuals with a particular state at time 1.

Abbreviations: HMM, hidden Markov model; HPV, human papillomavirus.

TABLE 4.

Classification probabilities and SE for simulations

Column groups: Method A (with misclassification), Method C (without partial observations)

True state Class. state Truth N Bias(A) SE(A) MSE(A) Bias(C) SE(C) MSE(C)
(0,0,0) (0,0,0) 0.925 2988 −0.002 0.019 0.0003 0.011 0.121 0.0147
(0,1,0) (0,1,0) 0.506 1324 −0.005 0.05 0.0024 −0.018 0.275 0.0759
(0,2,0) (0,2,0) 0.025 13 −0.002 0.012 0.0001 −0.16 0.161 0.0511
(1,0,0) (1,0,0) 0.901 572 −0.006 0.043 0.0018 0.099 0.202 0.0505
(1,1,0) (1,1,0) 0.737 243 −0.008 0.075 0.0052 0.003 0.375 0.1397
(1,2,0) (1,2,0) 0.484 71 −0.049 0.224 0.0507 0.136 0.458 0.2296
(2,0,0) (2,0,0) 0.873 444 −0.014 0.065 0.0043 0.13 0.321 0.1197
(2,1,0) (2,1,0) 0.952 799 −0.003 0.05 0.0025 0.034 0.179 0.0333
(2,2,0) (2,2,0) 0.392 42 −0.044 0.156 0.0236 −0.407 0.298 0.2548
(0,0,1) (0,0,1) 0.039 10 0.007 0.04 0.0016 −0.003 0.088 0.0076
(0,1,1) (0,1,1) 0.016 41 0.002 0.008 0.0001 −0.007 0.023 0.0006
(0,2,1) (0,2,1) 0 0 0 0 0 0 0 0
(1,0,1) (1,0,1) 0.038 19 −0.012 0.058 0.0034 −0.018 0.131 0.0174
(1,1,1) (1,1,1) 0 0 0 0 0 0 0 0
(1,2,1) (1,2,1) 0.301 142 −0.154 0.144 0.0438 −0.429 0.283 0.2631
(2,0,1) (2,0,1) 0.144 122 0.005 0.047 0.0022 0.031 0.068 0.0055
(2,1,1) (2,1,1) 0.43 280 −0.016 0.155 0.024 0.126 0.251 0.0786
(2,2,1) (2,2,1) 0.365 34 0.041 0.121 0.0144 −0.029 0.212 0.0457

Note: State tuples correspond to (HPV, cytology, colposcopy). State (2,1,0) corresponds to HPV 2, cytology 1, and colposcopy 0. Method A is the proposed method with misclassification and C is an HMM where the partially observed data are discarded.

* indicates that the 95% CI for the estimated bias does not include zero. N is the average number of individuals with each state from time 1 to n.

Abbreviations: HMM, hidden Markov model; HPV, human papillomavirus.

Ignoring misclassification (method B) emulates a Markov chain and results in substantial bias and smaller standard errors for the initial and transition probabilities. This is especially noticeable for the transition rates (Table 3) where the SD is minimal and the bias is often comparable to the truth. Despite the small SD, this method still has a quite large MSE due to overwhelmingly large bias; for most estimates method B has a higher MSE than method A. In addition, the difference in relative size of the bias and SD means that most 95% CIs for the bias do not include zero (indicated by a *).
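The trade-off described here follows from the decomposition of MSE into squared bias plus variance. For instance, for the stayer proportion under method B in Table 2, (−0.041)² + 0.008² ≈ 0.0017, which matches the reported MSE. The short sketch below shows how these three summaries could be computed from a set of simulated estimates; the function name `summarize` and the toy inputs are illustrative assumptions, not taken from the study's code.

```python
import numpy as np

def summarize(estimates, truth):
    """Return (bias, SD, MSE) of simulated estimates of a single parameter."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - truth
    sd = estimates.std(ddof=0)                 # empirical SD across realizations
    mse = np.mean((estimates - truth) ** 2)    # equals bias**2 + sd**2 when ddof=0
    return bias, sd, mse

# Toy illustration: a precise but biased estimator versus a noisier unbiased one.
rng = np.random.default_rng(1)
truth = 0.3
biased_precise = truth + 0.2 + 0.01 * rng.standard_normal(2500)
unbiased_noisy = truth + 0.05 * rng.standard_normal(2500)
bias_b, sd_b, mse_b = summarize(biased_precise, truth)
bias_a, sd_a, mse_a = summarize(unbiased_noisy, truth)
```

In this toy setting the biased estimator has the smaller SD but the larger MSE, which mirrors the pattern reported for method B relative to method A.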

TABLE 3.

Six month transition rates and SE for simulations

Column groups: Method A (with misclassification), Method B (without misclassification), Method C (without partial observations)

Prev. state Curr. state Truth N Bias(A) SE(A) MSE(A) Bias(B) SE(B) MSE(B) Bias(C) SE(C) MSE(C)
(0,0,0) (0,0,0) 0.689 1199 0.021 0.052 0.0031 −0.02 0.011 0.0006 0.1 0.266 0.0804
(0,0,0) (1,0,0) 0.306 532 −0.023 0.052 0.0031 0.194* 0.006 0.0377 −0.071 0.247 0.0658
(0,1,0) (0,1,0) 0.675 1574 −0.014 0.041 0.0018 0.283* 0.02 0.0807 0.092 0.233 0.0628
(0,1,0) (0,1,1) 0.193 450 0.017 0.053 0.003 0.192* 0.002 0.037 0.034 0.18 0.0335
(0,2,0) (0,2,0) 0.795 367 −0.009 0.065 0.004 0.786* 0.022 0.6187 0.062 0.132 0.0211
(0,2,0) (0,1,1) 0.163 75 0.008 0.063 0.0037 0.143* 0.056 0.0238 −0.061 0.137 0.0226
(1,0,0) (2,0,0) 0.347 170 0.002 0.049 0.002 0.074* 0.027 0.006 −0.063 0.24 0.061
(1,0,0) (0,1,1) 0.255 125 0 0.128 0.016 0.25* 0.007 0.0631 0.031 0.274 0.0765
(1,1,0) (0,1,0) 0.472 143 0.015 0.085 0.0068 0.319* 0.032 0.1034 0.307 0.293 0.179
(1,1,0) (2,2,0) 0.112 34 0.006 0.052 0.0024 0.087* 0.012 0.008 −0.215 0.32 0.1477
(1,2,0) (0,1,0) 0.163 24 −0.006 0.114 0.0121 0.032 0.052 0.0039 0.158* 0.063 0.0297
(1,2,0) (2,0,1) 0.188 28 0.016 0.083 0.0061 0.157* 0.033 0.0267 −0.016 0.306 0.0933
(2,0,0) (0,0,0) 0.463 181 0.002 0.088 0.0071 0.179* 0.024 0.0329 −0.1 0.358 0.1385
(2,0,0) (2,0,0) 0.505 198 0.005 0.095 0.0084 0.088* 0.033 0.0089 0.153 0.338 0.1383
(2,1,0) (2,1,0) 0.363 284 −0.032 0.073 0.0061 −0.029 0.027 0.0015 0.116 0.229 0.0656
(2,1,0) (0,1,1) 0.102 80 0.018 0.064 0.0043 0.1* 0.004 0.0102 0.067 0.115 0.0177
(2,2,0) (2,1,0) 0.679 64 0 0.187 0.0329 0.101 0.097 0.0201 0.172 0.362 0.1582
(2,2,0) (2,1,1) 0.126 12 −0.001 0.177 0.0308 0.065 0.065 0.0089 −0.086 0.308 0.1015

Note: State tuples correspond to (HPV, cytology, colposcopy). State (2,1,0) corresponds to HPV 2, cytology 1, and colposcopy 0. Method A is the proposed method with misclassification, B is the proposed method without misclassification, and C is an HMM where the partially observed data are discarded.

* indicates that the 95% CI for the estimated bias does not include zero. N is the average number of individuals with a particular state for times 2 to n.

Abbreviations: HMM, hidden Markov model; HPV, human papillomavirus.

Discarding partially observed measurements (method C) results in nearly unbiased estimation, but also causes a substantial loss of precision. In Section 3, it was shown that one major drawback of method A is its high standard errors, and method C exacerbates this issue. Because about 50% of all observable data are partially observed, roughly half of the available data are ignored by this method. For the classification rates (Table 4), this method has a substantially larger SD than method A for every state. Quantitatively, the median relative efficiency of method C relative to the proposed method with misclassification (method A) is 0.12. Method C also relies on the assumption that the data are missing completely at random, and because we explicitly removed colposcopy results based on observed HPV and cytology, this assumption is violated.

In summary, although the proposed approach with misclassification (method A) results in a loss of efficiency relative to ignoring the misclassification error (method B), in terms of MSE method A outperforms methods B and C.

6 | DISCUSSION

When the state space is defined by multiple tests, partially observed data occur when some tests are missing while others are observed. This is especially common when studying the natural history of a disease whose progression is monitored by multiple diagnostic tests. HPV tests, cytology, and colposcopies are used to screen women for cervical precancers in ALTS. If at least one but not all three tests are missing, the data point is considered partially observed and standard methods for HMMs cannot be applied. Given the large proportion of partially observed data in HPV cohort analyses, incorporating this type of incomplete data is very important.

This article presents a unique adaptation of the forward-backward algorithm that handles partial observations by considering the set of all possible full observations. The first proposed method incorporates misclassification of test results and population mixtures, where a subset of individuals move through different states and other individuals stay only in the healthy state. Interestingly, the problem of fitting Markov chain models (without misclassification) with a partially observed state space is in itself a computationally difficult problem. The proposed algorithm also solves this simpler problem. We apply these approaches to a dataset of women undergoing cervical cancer screening.

To properly account for partially observed data, we use the set of all possible full observations that could have been observed if no test was missing. To do this, we calculate forward probabilities for each possible full observation: the single observed state for a fully observed time point, and each potentially observed state for a partially observed time point. Although this leads to additional computations, the adaptation scales linearly with the size of the set of possible full observations. For example, a partially observed time point with five possible full observations requires four more calculations than one fully observed time point. Because our method treats only partially observed data differently, our adaptation reduces to the standard forward-backward algorithm for fully observed data. Both partially and fully missing data are assumed to be missing at random (MAR). We believe this to be the case because the data are missing based on prior observed test results, such as a colposcopy not being needed given the HPV and cytology results. Through our simulation study, we found that our proposed method with misclassification more accurately estimates the parameters used to create the simulation data than the other approaches.
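The idea of carrying one forward probability per possible full observation can be illustrated with a small first-order sketch. This is an assumption-laden illustration rather than the authors' code: the function name `forward_partial` and the `candidates` representation (for each visit, the list of indices of full observations compatible with what was recorded) are hypothetical, and the mover-stayer mixture and fully missing visits are left out.

```python
import numpy as np

def forward_partial(init, P, M, candidates):
    """Likelihood of a sequence of (possibly partially observed) visits for a mover.

    init: initial latent-state probabilities, P: transition matrix,
    M: classification matrix (rows = latent state, columns = full observation),
    candidates[t]: indices of the full observations compatible with visit t
                   (a single index when the visit is fully observed).
    """
    init, P, M = (np.asarray(a, dtype=float) for a in (init, P, M))
    # alpha[l, j]: probability of the data so far, latent state l, and candidate j now.
    alpha = init[:, None] * M[:, candidates[0]]
    for cand in candidates[1:]:
        prev = alpha.sum(axis=1)                  # sum over the previous visit's candidates
        alpha = (prev @ P)[:, None] * M[:, cand]  # one column per candidate at this visit
    return float(alpha.sum())

init = [0.6, 0.4]
P = [[0.9, 0.1], [0.2, 0.8]]
M = [[0.8, 0.2], [0.3, 0.7]]
# Visit 1 fully observed as 0; visit 2 partially observed, compatible with 0 or 1.
lik = forward_partial(init, P, M, [[0], [0, 1]])
```

When every visit has a single candidate, the recursion is the standard forward algorithm; each extra candidate adds one column to `alpha`, which is the linear scaling described above.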

We applied these methods to 3488 women undergoing cervical cancer screening over 2 years in ALTS. Only 38% of ALTS visits were fully observed; 40% were partially observed and 22% were completely unobserved. Although the existing literature has shown that cytology and colposcopy can disagree, current models of natural history have primarily ignored measurement error and treated test results as the gold standard when estimating transition probabilities. Our application shows how transition probabilities can be estimated while incorporating misclassification, and indicates that the amount of misclassification cannot be ignored, as not accounting for misclassification will produce biased estimates. However, there are tradeoffs between unbiased estimation and large standard errors in models that incorporate misclassification. Approaches such as ours, applied to larger datasets that also include HPV genotyping, can further understanding of the natural history of this disease and inform strategies for primary and secondary prevention.

The mover-stayer model can incorporate heterogeneity in natural history beyond what would be expected from a standard Markov model. However, care should be taken not to overinterpret the probability of being a mover on an individual basis. In particular, in studies with follow-up such as that in ALTS, we cannot characterize a stayer as someone who is unlikely to progress in any longer-term sense.

A number of methodological extensions to this approach are possible. In settings that are more heterogeneous, covariates could be added to the parameters of the mover-stayer model. For highly irregular follow-up time, the discrete Markov chain can either be replaced by a Markov process,18,19 or be sampled at finer intervals. Another possible direction is accounting for nonignorable missing data. Although unnecessary for our application, there are scenarios in which tests can be missing based on prior unobserved results, requiring a further adaptation of our method.

ACKNOWLEDGEMENT

This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov).

APPENDIX A. MIXTURE TRANSITION DISTRIBUTION

A.1. Complete data log likelihood

The complete data joint distribution for the mixture transition distribution model is

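$$
\begin{aligned}
f_i(X_i, W_i, Q_i) ={}& \left[\pi \prod_{t=1}^{n}\prod_{l_1=0}^{k} M_{0 l_1}^{Z_0(X_{it})\, Z_{l_1}(W_{it})}\right]^{Q_i} \\
&\times \Bigg[(1-\pi)\prod_{l_1=0}^{k}\prod_{l_2=0}^{k} r_{l_1 l_2}^{Z_{l_1}(X_{i1})\, Z_{l_2}(X_{i2})} \\
&\qquad\times \prod_{t=3}^{n}\prod_{l_1=0}^{k}\prod_{l_2=0}^{k}\prod_{l_3=0}^{k}\left(\lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}\right)^{Z_{l_1}(X_{i,t-2})\, Z_{l_2}(X_{i,t-1})\, Z_{l_3}(X_{it})} \\
&\qquad\times \prod_{t=1}^{n}\prod_{l_1=0}^{k}\prod_{l_2=0}^{k} M_{l_1 l_2}^{Z_{l_1}(X_{it})\, Z_{l_2}(W_{it})}\Bigg]^{1-Q_i}
\end{aligned}
\tag{A1}
$$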

To accommodate the second-order Markov chain, the initial probabilities are a matrix describing the first two states rather than a vector: $r_{l_1 l_2}$ is the probability of being in state $l_1$ at time $t=1$ and state $l_2$ at time $t=2$. As in Section 3.1, because $X_i$, $W_i$, and $Q_i$ are all latent, we use the EM algorithm to estimate the parameters. The expected complete data log likelihood is

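$$
\begin{aligned}
E\left[\sum_{i=1}^{I}\log f_i(X_i, W_i, Q_i) \,\Big|\, Y_i\right] ={}& \sum_{i=1}^{I} E[Q_i \mid Y_i]\left(\log\pi + \sum_{t=1}^{n}\sum_{l_1=0}^{k} E\left[Z_0(X_{it})\, Z_{l_1}(W_{it}) \mid Y_i\right]\log M_{0 l_1}\right) \\
&+ \sum_{i=1}^{I}\left(1 - E[Q_i \mid Y_i]\right)\log(1-\pi) \\
&+ \sum_{i=1}^{I}\sum_{l_1=0}^{k}\sum_{l_2=0}^{k} E\left[Z_{l_1}(X_{i1})\, Z_{l_2}(X_{i2}) \mid Y_i\right]\log r_{l_1 l_2} \\
&+ \sum_{i=1}^{I}\sum_{t=3}^{n}\sum_{l_1=0}^{k}\sum_{l_2=0}^{k}\sum_{l_3=0}^{k} E\left[Z_{l_1}(X_{i,t-2})\, Z_{l_2}(X_{i,t-1})\, Z_{l_3}(X_{it}) \mid Y_i\right]\log\left(\lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}\right) \\
&+ \sum_{i=1}^{I}\sum_{t=1}^{n}\sum_{l_1=0}^{k}\sum_{l_2=0}^{k} E\left[Z_{l_1}(X_{it})\, Z_{l_2}(W_{it}) \mid Y_i\right]\log M_{l_1 l_2},
\end{aligned}
\tag{A2}
$$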

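where $E[Q_i \mid Y_i]$, $E[Z_{l_1}(X_{i,t-2})\, Z_{l_2}(X_{i,t-1})\, Z_{l_3}(X_{it}) \mid Y_i]$, $E[Z_{l_1}(X_{i1})\, Z_{l_2}(X_{i2}) \mid Y_i]$, and $E[Z_{l_1}(X_{it})\, Z_{l_2}(W_{it}) \mid Y_i]$ are calculated by an adaptation of the forward-backward algorithm.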
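The mixture transition term $\lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}$ in (A1) can be viewed as a three-way array indexed by the two previous states and the current state. The sketch below builds that array under the assumption, standard for mixture transition distribution models but not restated in this appendix, that $\lambda_1 + \lambda_2 = 1$ with both weights nonnegative; the function name `mtd_transition` is hypothetical.

```python
import numpy as np

def mtd_transition(P, lam1, lam2):
    """Return T with T[l1, l2, l3] = lam1 * P[l2, l3] + lam2 * P[l1, l3].

    l1 is the state at time t-2, l2 the state at time t-1, l3 the state at time t.
    """
    P = np.asarray(P, dtype=float)
    # Broadcasting: P[None, :, :] varies with (l2, l3); P[:, None, :] with (l1, l3).
    return lam1 * P[None, :, :] + lam2 * P[:, None, :]

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
T = mtd_transition(P, lam1=0.7, lam2=0.3)
```

Because each row of `P` sums to one and the weights sum to one, every slice `T[l1, l2, :]` is itself a probability distribution, so a single transition matrix and two weights suffice to parameterize the second-order chain.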

A.2. Second-order forward algorithm

The second-order forward algorithm is defined similarly to the first-order algorithm in Section 3.2; however, the forward quantity now gives the probability of being in specific latent states at times t-1 and t. The forward quantity is

$$
\alpha_{l_1 l_2}(it)_j = P\left(Y^*_{i1}, Y^*_{i2}, \ldots, Y^*_{it,j}, X_{i,t-1} = l_1, X_{it} = l_2\right)
\tag{A3}
$$

and is calculated by

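$$
\begin{aligned}
\alpha_{l_1 l_2}(it)_{jj'} &= r_{l_1 l_2}\, M_{l_1 Y^*_{i1,j}}\, M_{l_2 Y^*_{i2,j'}} && \text{if } t = 2, \\
\alpha_{l_1 l_2}(it)_{j} &=
\begin{cases}
\displaystyle\sum_{l_0=0}^{k}\sum_{m=1}^{|Y^*_{i,t-2}|}\sum_{m'=1}^{|Y^*_{i,t-1}|} \alpha_{l_0 l_1}(i,t-1)_{mm'}\left(\lambda_1 P_{l_1 l_2} + \lambda_2 P_{l_0 l_2}\right) M_{l_2 Y^*_{it,j}} & \text{if } t = 3, \\[2ex]
\displaystyle\sum_{l_0=0}^{k}\sum_{m=1}^{|Y^*_{i,t-1}|} \alpha_{l_0 l_1}(i,t-1)_{m}\left(\lambda_1 P_{l_1 l_2} + \lambda_2 P_{l_0 l_2}\right) M_{l_2 Y^*_{it,j}} & \text{if } t > 3.
\end{cases}
\end{aligned}
\tag{A4}
$$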

Unlike the first-order forward algorithm which starts at t=1, this algorithm is only defined at t=2 and later. For times t>2 the forward quantity can sum over all partially observed values for the state at time t-1 (while still keeping track of all partially observed values for the state at time t). When t=2, partially observed values for the first two states cannot be summed over, as they are necessary for later calculations. Therefore, the forward quantity at time t=2 is calculated slightly differently than at times t>2. Finally, define

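$$
\alpha^*_{l_1 l_2}(it) = P\left(Y^*_{i1}, Y^*_{i2}, \ldots, Y^*_{it}, X_{i,t-1} = l_1, X_{it} = l_2\right) = \sum_{j=1}^{|Y^*_{it}|} \alpha_{l_1 l_2}(it)_j
\tag{A5}
$$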

A.3. Second-order backward algorithm

The second-order backward algorithm, similarly to the second-order forward algorithm, includes two latent states in each backward quantity. The backward quantity is:

$$
\beta_{l_1 l_2}(it) = P\left(Y^*_{i,t+1}, Y^*_{i,t+2}, \ldots, Y^*_{in} \mid X_{i,t-1} = l_1, X_{it} = l_2\right)
\tag{A6}
$$

and is calculated by

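$$
\beta_{l_1 l_2}(it) =
\begin{cases}
1 & \text{if } t > n-2, \\[1ex]
\displaystyle\sum_{l_3=0}^{k}\sum_{j=1}^{|Y^*_{i,t+2}|} \beta_{l_2 l_3}(i,t+1)\left(\lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}\right) M_{l_3 Y^*_{i,t+2,j}} & \text{if } t \le n-2.
\end{cases}
\tag{A7}
$$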

A.4. Calculating expectations

The following are the E-step expectations, calculated using $\alpha^*_{l_1 l_2}(it)$ and $\beta_{l_1 l_2}(it)$:

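$$
E[Q_i \mid Y_i] = \frac{P(Q_i = 1, Y_i)}{P(Y_i)} = \frac{\pi \prod_{t=1}^{n} P(Y_{it} \mid X_{it} = 0)}{P(Y_i)},
\tag{A8}
$$

$$
E\left[Z_{l_1}(X_{i1})\, Z_{l_2}(X_{i2}) \mid Y_i\right] = \frac{P(X_{i1} = l_1, X_{i2} = l_2, Y_i)}{P(Y_i)} = \frac{\alpha^*_{l_1 l_2}(i2)\, \beta_{l_1 l_2}(i2)}{P(Y_i)},
\tag{A9}
$$

$$
\begin{aligned}
E\left[Z_{l_1}(X_{i,t-2})\, Z_{l_2}(X_{i,t-1})\, Z_{l_3}(X_{it}) \mid Y_i\right] &= \frac{P(X_{i,t-2} = l_1, X_{i,t-1} = l_2, X_{it} = l_3, Y_i)}{P(Y_i)} \\
&= \frac{\alpha^*_{l_1 l_2}(i,t-1)\left(\lambda_1 P_{l_2 l_3} + \lambda_2 P_{l_1 l_3}\right)\sum_{j=1}^{|Y^*_{it}|} M_{l_3 Y^*_{it,j}}\, \beta_{l_2 l_3}(it)}{P(Y_i)},
\end{aligned}
\tag{A10}
$$

$$
\begin{aligned}
E\left[Z_{l_1}(X_{it})\, Z_{l_2}(W_{it}) \mid Y_i\right] &= P(X_{it} = l_1 \mid Y_i)\, P(W_{it} = l_2 \mid X_{it} = l_1, Y_i) \\
&=
\begin{cases}
\displaystyle\frac{\sum_{l_0=0}^{k} \alpha^*_{l_1 l_0}(i2)\, \beta_{l_1 l_0}(i2)}{P(Y_i)} \times \frac{P(Y^*_{it,j} = l_2 \mid X_{it} = l_1)}{\sum_{m=1}^{|Y^*_{it}|} P(Y^*_{it,m} \mid X_{it} = l_1)} & \text{if } t = 1, \\[3ex]
\displaystyle\frac{\sum_{l_0=0}^{k} \alpha^*_{l_0 l_1}(it)\, \beta_{l_0 l_1}(it)}{P(Y_i)} \times \frac{P(Y^*_{it,j} = l_2 \mid X_{it} = l_1)}{\sum_{m=1}^{|Y^*_{it}|} P(Y^*_{it,m} \mid X_{it} = l_1)} & \text{if } t > 1,
\end{cases}
\end{aligned}
\tag{A11}
$$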

where

$$
P(Y_i) = P(Y_i, Q_i = 1) + P(Y_i, Q_i = 0) = \pi \prod_{t=1}^{n} P(Y_{it} \mid X_{it} = 0) + (1 - \pi) \sum_{l_1=0}^{k}\sum_{l_2=0}^{k} \alpha^*_{l_1 l_2}(in).
\tag{A12}
$$
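Expressions (A8) and (A12) together give the posterior probability that a participant is a stayer. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, the mover likelihood (the double sum of forward quantities at time n) is taken as an already computed input, and for a partially observed visit the stayer term $P(Y_{it} \mid X_{it} = 0)$ is assumed to be the sum of the first row of the classification matrix over the compatible full observations.

```python
import numpy as np

def stayer_likelihood(M, candidates):
    """Product over visits of P(Y_it | X_it = 0), summing row 0 of M over candidates."""
    M = np.asarray(M, dtype=float)
    return float(np.prod([M[0, cand].sum() for cand in candidates]))

def stayer_posterior(pi, stayer_lik, mover_lik):
    """E[Q_i | Y_i]: posterior probability of being a stayer given the observed data."""
    joint_stayer = pi * stayer_lik          # P(Y_i, Q_i = 1)
    joint_mover = (1.0 - pi) * mover_lik    # P(Y_i, Q_i = 0)
    return joint_stayer / (joint_stayer + joint_mover)

M = [[0.8, 0.2], [0.3, 0.7]]
# Two visits: the first fully observed as 0, the second compatible with 0 or 1.
s_lik = stayer_likelihood(M, [[0], [0, 1]])
post = stayer_posterior(pi=0.059, stayer_lik=s_lik, mover_lik=0.5)
```

If a participant is ever observed in a state that a healthy individual cannot be classified into (a zero entry in row 0 of the classification matrix), the stayer likelihood is zero and so is the posterior stayer probability.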

APPENDIX B. COMPLETELY MISSING DATA

When the data are missing at random, we can adapt the EM algorithm to deal with the missing values. We treat the missing data as missing only in the observed states. Let $Y_i = (Y_{if(1)}, \ldots, Y_{if(s)}, \ldots, Y_{if(n_i)})$ be the vector of observed states for the $i$th individual, where $f(s)$ denotes the $s$th nonmissing time point and $n_i$ is the number of nonmissing time points for individual $i$. Now, define $W_i = (W_{if(1)}, \ldots, W_{if(n_i)})$ so that

$$
W_{if(s)} =
\begin{cases}
Y_{if(s)} & \text{if } Y_{if(s)} \text{ is fully observed}, \\
Y^*_{if(s)j} & \text{if } Y_{if(s)} \text{ is partially observed},
\end{cases}
\tag{B1}
$$

for a $j$ such that $1 \le j \le |Y^*_{if(s)}|$, where $Y^*_{if(s)j}$ is the full observation that would have been recorded had $Y^*_{if(s)}$ been fully observed. The adjusted E-step is

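$$
\begin{aligned}
E\left[\sum_{i=1}^{I}\log f_i(X_i, W_i, Q_i) \,\Big|\, Y_i\right] ={}& \sum_{i=1}^{I} E[Q_i \mid Y_i]\log\pi + \sum_{i=1}^{I}\sum_{t=1}^{n_i}\sum_{l_1=0}^{k} E\left[Z_0(X_{it})\, Z_{l_1}(W_{if(t)}) \mid Y_i\right]\log M_{0 l_1} \\
&+ \sum_{i=1}^{I}\left(1 - E[Q_i \mid Y_i]\right)\log(1-\pi) + \sum_{i=1}^{I}\sum_{l_1=0}^{k} E\left[Z_{l_1}(X_{i1}) \mid Y_i\right]\log r_{l_1} \\
&+ \sum_{i=1}^{I}\sum_{t=2}^{n}\sum_{l_1=0}^{k}\sum_{l_2=0}^{k} E\left[Z_{l_1}(X_{i,t-1})\, Z_{l_2}(X_{it}) \mid Y_i\right]\log p_{l_1 l_2} \\
&+ \sum_{i=1}^{I}\sum_{t=1}^{n_i}\sum_{l_1=0}^{k}\sum_{l_2=0}^{k} E\left[Z_{l_1}(X_{it})\, Z_{l_2}(W_{if(t)}) \mid Y_i\right]\log M_{l_1 l_2}
\end{aligned}
\tag{B2}
$$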

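To calculate the E-step with missing data, the only change we need to make is to the forward-backward algorithm. Define $\alpha^*_{l_1}(it) = P(Y^*_{if(1)}, \ldots, Y^*_{if(s_L)}, X_{it} = l_1) = \sum_{j=1}^{|Y^*_{it}|} \alpha_{l_1}(it)_j$, where $f(s_L) = \max(f(s) \mid f(s) \le t)$. Then we can calculate the forward algorithm as:

$$
\alpha_{l_1}(it)_j =
\begin{cases}
r_{l_1} & \text{if } t = 1, \\[1ex]
r_{l_1} M_{l_1 Y^*_{i1,j}} & \text{if } t = f(1) = 1, \\[1ex]
\displaystyle\sum_{l_0=0}^{k}\sum_{m=1}^{|Y^*_{i,t-1}|} \alpha_{l_0}(i,t-1)_m\, p_{l_0 l_1} & \text{if } f(s_L) < t, \\[2ex]
\displaystyle\sum_{l_0=0}^{k}\sum_{m=1}^{|Y^*_{i,t-1}|} \alpha_{l_0}(i,t-1)_m\, p_{l_0 l_1}\, M_{l_1 Y^*_{it,j}} & \text{if } f(s_L) = t.
\end{cases}
\tag{B3}
$$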

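Also define $\beta_{l_1}(it) = P(Y^*_{if(s_U)}, \ldots, Y^*_{if(n_i)} \mid X_{it} = l_1)$, where $f(s_U) = \min(f(s) \mid f(s) \ge t+1)$. The backward quantity is then calculated as

$$
\beta_{l_1}(it) =
\begin{cases}
1 & \text{if } t \ge f(n_i), \\[1ex]
\displaystyle\sum_{l_2=0}^{k} p_{l_1 l_2}\, \beta_{l_2}(i,t+1) & \text{if } f(s_U) > t+1, \\[2ex]
\displaystyle\sum_{l_2=0}^{k} p_{l_1 l_2} \sum_{j=1}^{|Y^*_{i,t+1}|} M_{l_2 Y^*_{i,t+1,j}}\, \beta_{l_2}(i,t+1) & \text{if } f(s_U) = t+1.
\end{cases}
\tag{B4}
$$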
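The recursions (B3) and (B4) amount to skipping the classification term at any completely missing visit while still applying the transition step. A minimal first-order sketch follows; it is an illustration under assumptions rather than the authors' implementation. The function name `forward_backward_missing` and the `candidates` representation are hypothetical (`None` marks a completely missing visit, otherwise a list of compatible full observations), the forward values are already summed over candidates, and the mover-stayer mixture is omitted.

```python
import numpy as np

def forward_backward_missing(init, P, M, candidates):
    """First-order forward and backward passes that tolerate missing visits.

    Returns (alpha, beta), each of shape (n_visits, n_states).
    """
    init, P, M = (np.asarray(a, dtype=float) for a in (init, P, M))
    n, k = len(candidates), len(init)
    # Emission weight per latent state: 1 for a missing visit,
    # otherwise the sum of M over the compatible full observations.
    emis = np.ones((n, k))
    for t, cand in enumerate(candidates):
        if cand is not None:
            emis[t] = M[:, cand].sum(axis=1)
    alpha = np.zeros((n, k))
    beta = np.ones((n, k))        # beta stays 1 after the last nonmissing visit
    alpha[0] = init * emis[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ P) * emis[t]
    for t in range(n - 2, -1, -1):
        beta[t] = P @ (emis[t + 1] * beta[t + 1])
    return alpha, beta

init = [0.6, 0.4]
P = [[0.9, 0.1], [0.2, 0.8]]
M = [[0.8, 0.2], [0.3, 0.7]]
# Visit 2 is completely missing; visit 3 is partially observed.
alpha, beta = forward_backward_missing(init, P, M, [[0], None, [0, 1], [1]])
likelihood_by_time = (alpha * beta).sum(axis=1)
```

A useful sanity check on any such implementation is that the sum over latent states of `alpha[t] * beta[t]` gives the same likelihood at every time point, including the missing ones.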

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions. Code and instructions on how to install and use the proposed method and the mixture transition distribution method are available at https://github.com/jordanaron22/PartiallyObservedHMM.

REFERENCES

  • 1. Kay R. A Markov model for analysing cancer markers and disease states in survival studies. Biometrics. 1986;42:855–865. doi:10.2307/2530699.
  • 2. Albert PS. A Markov model for sequences of ordinal data from a relapsing-remitting disease. Biometrics. 1994;50(1):51–60.
  • 3. Albert PS, Hunsberger SA, Biro FM. Modeling repeated measures with monotonic ordinal responses and misclassification, with applications to studying maturation. J Am Stat Assoc. 1997;92:1304–1311. doi:10.1080/01621459.1997.10473651.
  • 4. Zucchini W, MacDonald I, Langrock R. Hidden Markov Models for Time Series: An Introduction Using R. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC Press; 2017.
  • 5. Goodman LA. Statistical methods for the mover-stayer model. J Am Stat Assoc. 1961;56:841–868. doi:10.1080/01621459.1961.10482130.
  • 6. Schiffman M, Wentzensen N. Human papillomavirus infection and the multistage carcinogenesis of cervical cancer. Cancer Epidemiol Biomark Prev. 2013;22(4):553–560. doi:10.1158/1055-9965.EPI-12-1406.
  • 7. Campos NG, Burger EA, Sy S, et al. An updated natural history model of cervical cancer: derivation of model parameters. Am J Epidemiol. 2014;180(5):545–555.
  • 8. Castanon A, Landy R, Sasieni P. By how much could screening by primary human papillomavirus testing reduce cervical cancer incidence in England? J Med Screen. 2016;24:110–112. doi:10.1177/0969141316654197.
  • 9. Sørbye SW, Suhrke P, Revå BW, Berland J, Maurseth RJ, Al-Shibli K. Accuracy of cervical cytology: comparison of diagnoses of 100 Pap smears read by four pathologists at three hospitals in Norway. BMC Clinical Pathology. 2017;17(1). doi:10.1186/s12907-017-0058-8.
  • 10. Gage J, Hanson V, Abbey K, et al. Number of cervical biopsies and sensitivity of colposcopy. Obstet Gynecol. 2006;108:264–272. doi:10.1097/01.AOG.0000220505.18525.85.
  • 11. Wentzensen N, Walker J, Gold M, et al. Multiple biopsies and detection of cervical cancer precursors at colposcopy. J Clin Oncol. 2015;33:83–89. doi:10.1200/JCO.2014.55.9948.
  • 12. Egemen D, Cheung LC, Chen X, et al. Risk estimates supporting the 2019 ASCCP risk-based management consensus guidelines. J Low Genit Tract Dis. 2020;24(2):132–143.
  • 13. Schiffman M, Adrianza ME. ASCUS-LSIL triage study. Design, methods and characteristics of trial participants. Acta Cytol. 2000;44(5):726–742.
  • 14. Venturoli S, Cricca M, Bonvicini F, et al. Human papillomavirus DNA testing by PCR-ELISA and hybrid capture II from a single cytological specimen: concordance and correlation with cytological results. J Clin Virol. 2002;25:177–185. doi:10.1016/S1386-6532(02)00007-0.
  • 15. Peyton C, Schiffman M, Lörincz A, et al. Comparison of PCR- and hybrid capture-based human papillomavirus detection systems using multiple cervical specimen collection strategies. J Clin Microbiol. 1998;36:3248–3254.
  • 16. Chen HH, Duffy SW, Tabar L. A Markov chain method to estimate the tumour progression rate from preclinical to clinical phase, sensitivity and positive predictive value for mammography in breast cancer screening. Statistician. 1996;45(3):307–317. doi:10.2307/2988469.
  • 17. Raftery AE. A model for high-order Markov chains. J R Stat Soc Ser B Stat Methodol. 1985;47(3):528–539.
  • 18. Kalbfleisch JD, Lawless JF. The analysis of panel data under a Markov assumption. J Am Stat Assoc. 1985;80(392):863–871. doi:10.1080/01621459.1985.10478195.
  • 19. Jackson CH, Sharples LD, Thompson SG, Duffy SW, Couto E. Multistate Markov models for disease progression with classification error. Statistician. 2003;52(2):193–209. doi:10.1111/1467-9884.00351.
