Author manuscript; available in PMC: 2018 Mar 1.
Published in final edited form as: Health Care Manag Sci. 2015 Jul 19;20(1):16–32. doi: 10.1007/s10729-015-9330-6

Optimizing Patient Treatment Decisions in an Era of Rapid Technological Advances: The Case of Hepatitis C Treatment

Shan Liu 1, Jeremy D Goldhaber-Fiebert 2, Margaret L Brandeau 3
PMCID: PMC4718905  NIHMSID: NIHMS731794  PMID: 26188961

Abstract

How long should a patient with a treatable chronic disease wait for more effective treatments before accepting the best available treatment? We develop a framework to guide optimal treatment decisions for a deteriorating chronic disease when treatment technologies are improving over time. We formulate an optimal stopping problem using a discrete-time, finite-horizon Markov decision process. The goal is to maximize a patient’s quality-adjusted life expectancy. We derive structural properties of the model and analytically solve a three-period treatment decision problem. We illustrate the model with the example of treatment for chronic hepatitis C virus (HCV). Chronic HCV affects 3–4 million Americans and has been historically difficult to treat, but increasingly effective treatments have been commercialized in the past few years. We show that the optimal treatment decision is more likely to be to accept currently available treatment—despite expectations for future treatment improvement—for patients who have high-risk history, who are older, or who have more comorbidities. Insights from this study can guide HCV treatment decisions for individual patients. More broadly, our model can guide treatment decisions for curable chronic diseases by finding the optimal treatment policy for individual patients in a heterogeneous population.

Keywords: technology adoption, medical decision making, dynamic programming, Markov decision process, hepatitis C treatment, decision analysis

1. INTRODUCTION

1.1 Background

As everyday consumers of goods and services, people frequently face technology adoption decisions—whether to purchase a good now or wait for a better good that might be developed in the future. It can be difficult to choose between adopting a low-quality technology with immediate benefits versus waiting for a better technology to emerge at potentially higher cost. The decision is particularly challenging when applied to medical technologies (i.e., pharmaceuticals, procedures and devices that are used to prevent, diagnose, monitor, and treat diseases and improve health) since people may assign substantially higher utility to health outcomes than to outcomes derived from other types of consumer goods [1]. Moreover, medical treatments can be irreversible; hence, getting the decision wrong is particularly costly.

One important distinction between the adoption decision for other consumer goods versus medical technologies is that the benefits of medical technologies are generally more uncertain and may operate on multiple dimensions (quality and duration of life), thus complicating decision making about which technologies are best. One way to measure the value of better health is with quality-adjusted life years (QALYs), an approach that accounts for improvements in both length and quality of life [2]. The decision to wait for better medical technology involves a tradeoff between the deterioration of a patient’s health over time and the expected timing and magnitude of future technological improvement. Consumers of health care may be taking a higher risk by delaying treatment than consumers of electronics who delay buying a new product.

In this study, we consider the question of how long a patient with a treatable chronic disease should wait for more effective treatments to emerge before accepting the best available treatment. We are motivated by chronic diseases that share certain features, such as consistent health deterioration over time and ongoing development of newer and better treatments. The treatment decision facing patients chronically infected with hepatitis C virus (HCV) is a timely and important example. Chronic HCV affects approximately 3–4 million Americans and has been historically difficult to treat [3]. In recent years, new and emerging treatments have shown great promise in improving efficacy and hence health outcomes (Figure 1) [4].

Figure 1. Effectiveness of best available treatments for chronic HCV infection (genotype 1), measured as a percentage of patients who achieve sustained virologic response

We model the patient-level treatment adoption decision problem as an optimal stopping problem using a discrete-time, finite-horizon Markov decision process (MDP). The goal is to maximize the patient’s quality-adjusted life expectancy. We examine two cases of medical technology innovation—incremental innovation and radical innovation. We derive structural properties of the optimal solution and analytically solve a three-period treatment decision problem. We then present an example of chronic HCV treatment adoption decisions and compare decisions for patients with various demographic characteristics (age, race, risk history, disease severity). Our research bridges a gap between the technology adoption and the medical decision making literature by simultaneously modeling a patient’s changing health and stochastically improving technology over time.

1.2 Relevant Literature

A broad theoretical literature on technology adoption investigates the impact of uncertainty about future technological improvements on current adoption decisions, but with assumptions that may be more easily satisfied in contexts other than health. Balcer and Lippman [5] find that it is optimal to update the currently owned technology with the current best technology if the lag between the best technology and the technology already owned exceeds a threshold. Farzin et al. [6] use a continuous-time model with uncertainty about the timing and magnitude of technological improvements and show that adoption is slower under a dynamic model than under a net present value (NPV) method. Smith and Ulu [7] formulate the uncertainty in a technology’s quality and cost as a Markov process and compare three models of the adoption decision: NPV, single purchase, and repeat purchase. They find that improvements in the technology encourage adoption in the first two models, but under certain conditions improvement may discourage adoption in the third model. Zivin and Neidell [8] incorporate option values in medical technology evaluation by studying the influence of current medical technology adoption decisions on future interventions. They show that irreversibility in medical treatment raises the value of treatments that preserve future treatment options. Shechter et al. [9] use an MDP to model irreversible treatment decisions considering possible downstream availability of a single improved treatment that potentially emerges within N periods. They show that models that ignore foresight on future technology improvement can lead to suboptimal decisions.

New medical interventions and technologies are frequently evaluated using cost-effectiveness analysis [2], but most such analyses ignore the influence of future technology improvement on the adoption decisions of the current alternatives. Salomon et al. [10] numerically explore various scenarios for HCV treatment and show that taking into account potential future advances in treatment can change conclusions about the cost-effectiveness of current treatment—and hence the current best decision.

A growing literature on stochastic and dynamic models in medical decision making aims to optimize disease screening and treatment decisions over time, typically focusing on uncertainties in disease progression, but often holding technology as constant. MDPs provide a general framework for finding optimal solutions for stochastic and dynamic decisions, and have been used in a variety of health applications including epidemic control [11], organ transplantation [12, 13], breast cancer screening and treatment [14], and HIV therapy sequencing [15].

Our model differs from Smith and Ulu’s [7] single-purchase model by considering a chronic disease treatment decision that includes time-varying per period health rewards (representing a patient’s current health) and time-varying terminal health rewards (representing a patient’s expected lifetime health after treatment). Our approach is broader than that of Shechter et al. [9] in that we model the stochastic process of new treatment emergence by allowing for treatment effectiveness to improve by random amounts at multiple time points within the patient’s decision horizon; thus, in our framework the patient may choose to continue waiting when an improved treatment emerges. Additionally, we conceptually characterize patients’ decisions for the currently challenging problem of treatment for chronic HCV infection.

2. TREATMENT ADOPTION MODEL

2.1 Model Framework

We consider a patient with a steadily progressing disease who must decide when to initiate a one-time treatment. The current treatment is non-ideal; that is, it has low effectiveness and/or high toxicity. Successful treatment cures the patient. We assume there may be steady or rapid development toward better treatment (e.g., a new drug or device that has higher effectiveness). We also assume that treatment provides higher expected health benefits than never receiving treatment as long as its effectiveness is greater than zero.

Under this framework, the patient visits a physician periodically to monitor the progression of the disease and determine whether to start treatment during the visit or to continue waiting. Treatment effectiveness is improving over time according to some known probability distribution. At each visit, the patient’s current health and expected future health and the current best available treatment are known. If the patient decides to adopt treatment in any given period, a terminal reward (e.g., expected QALYs) is received and the process terminates. If the patient decides to wait, an immediate reward (e.g., quality-adjusted time until the next doctor visit) is accumulated for this period, and the patient reevaluates the treatment decision in the next period. We assume that the patient’s disease progression is deterministic and fully observable, and the chronic disease alone does not substantially increase the patient’s mortality rate during the decision horizon. We also assume that if the patient fails treatment, re-treatment is not possible (often due to low effectiveness from repeating treatment). The objective is to maximize the expected total health benefit for the patient, measured in terms of total lifetime QALYs.

We formulate the patient’s problem as a discrete-time finite-horizon optimal stopping problem. We use the following notation:

  • k: index for discrete time periods, k = 0, 1, …, N − 1

  • T: a terminal state indicating the patient has already been treated

  • pk: state of the system; pk denotes treatment effectiveness in period k when the patient has not been treated, expressed as the probability that the treatment will cure the patient’s disease; pk ∈ [0, 1]. If the patient has been treated, pk = T

  • wk: improvement in treatment effectiveness in period k; random variable taking positive values with a given probability distribution; wk ∈ [0, δ], δ <1, and pk + wk ≤ 1

  • uk: decision variable for time k; uk = 1: treat at time k; uk = 0: wait

  • qk: expected health reward in period k without treatment

  • Hk: expected total future health reward if treatment accepted in period k is successful (e.g., expected quality-adjusted life expectancy)

  • Fk: expected total future health reward if treatment accepted in period k is unsuccessful (e.g., expected quality-adjusted life expectancy)

  • fk(pk, wk, uk): function that describes the system dynamics; pk+1 = fk(pk, wk, uk)

  • gk(pk, wk, uk): reward function for period k, k = 0, 1, …, N − 1

  • gN(pN): reward function for period N

  • Vk(pk): value function for period k

The expected health rewards Hk, Fk and qk are used to account for the risk of death and can be estimated from survival or simulation models (e.g., individual-level simulation or Markov cohort models). The terms Hk and Fk can also include a disutility associated with treatment side effects. We note that if treatment is unsuccessful, the patient will continue on with disease progression and receive no benefit from treatment.

Using this notation, the system state dynamics are:

$$p_{k+1} = f_k(p_k, w_k, u_k), \quad k = 0, 1, \ldots, N-1$$

where fk is defined as:

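$$p_{k+1} = \begin{cases} T, & \text{if } p_k = T, \text{ or if } p_k \ne T \text{ and } u_k = 1 \text{ (treat)} \\ p_k + w_k, & \text{otherwise} \end{cases}$$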

Our assumption is that the current best treatment provides a lower bound on treatment effectiveness; if future treatment is worse than current treatment, the patient could always default back to the current best treatment.

Medical technology improvement can either be incremental or radical. Medical device development is often incremental; new devices with relatively small improvements in efficacy appear frequently. In contrast, truly novel pharmaceutical drug development is often radical (i.e., entirely new classes or combinations of drugs), with relatively large but less frequent improvements appearing due to the rigorous regulatory approval process. (We ignore “me too” drugs which have efficacy and safety profiles that are essentially the same as the existing drugs.) We examine each case below. The system dynamic is pk+1 = pk + wk.

Case 1. Incremental Innovation

The wk’s are independent identically distributed random variables taking positive values within a small bounded interval between 0 and δ, 0 < δ ≪ 1. We assume that pN < 1; that is, treatment effectiveness will not reach 100% before the end of the decision horizon N.

Case 2. Radical Innovation

The wk’s are correlated random variables, each with a probability distribution with support [0, θ(1 − pk)]. Thus, wk is bounded between 0 and θ(1 − pk), 0 < θ ≤ 1. This condition guarantees pk+1 ≤ 1 for all k = 0, …, N − 1.

The term θ captures the upper bound on the potential jump in treatment effectiveness in period k; a larger θ reflects a belief that a larger increment in technology improvement over time is possible. For example, if θ =1, treatment effectiveness can potentially jump to 100%; if θ = 0.1, the maximum treatment improvement is only 10% of the gap. Therefore, θ can be viewed as a control parameter for the decision maker’s range of belief or confidence on the bound of future treatment improvement.
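The two innovation cases can be illustrated with a small simulation of the dynamics pk+1 = pk + wk. This is only a sketch: the uniform distributions, the default values of `delta` and `theta`, and the function name `simulate_effectiveness` are assumptions made for illustration, since the model itself only bounds the support of wk.

```python
import random


def simulate_effectiveness(p0, n_periods, case, delta=0.05, theta=0.5, rng=None):
    """Simulate one path [p_0, ..., p_N] under p_{k+1} = p_k + w_k.

    case="incremental": w_k drawn from U[0, delta] (uniform is an assumption),
                        capped so that p_k + w_k <= 1.
    case="radical":     w_k drawn from U[0, theta * (1 - p_k)] (uniform is an assumption).
    """
    rng = rng or random.Random()
    path = [p0]
    for _ in range(n_periods):
        p = path[-1]
        if case == "incremental":
            w = min(rng.uniform(0.0, delta), 1.0 - p)
        elif case == "radical":
            w = rng.uniform(0.0, theta * (1.0 - p))
        else:
            raise ValueError("case must be 'incremental' or 'radical'")
        # Guard against floating-point overshoot above 100% effectiveness.
        path.append(min(1.0, p + w))
    return path


if __name__ == "__main__":
    rng = random.Random(1)
    print("incremental:", simulate_effectiveness(0.4, 3, "incremental", rng=rng))
    print("radical:    ", simulate_effectiveness(0.4, 3, "radical", rng=rng))
```

With p0 = 0.4 and θ = 0.5, the first radical jump can be at most 0.5 × (1 − 0.4) = 0.3, whereas each incremental step is at most δ.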

The reward function to be maximized is

$$E_{w_k}\left\{ g_N(p_N) + \sum_{k=0}^{N-1} g_k(p_k, w_k, u_k) \right\}, \quad k = 0, 1, \ldots, N-1$$

where

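$$g_N(p_N) = \begin{cases} p_N H_N + (1-p_N) F_N, & \text{if } p_N \ne T \\ 0, & \text{otherwise} \end{cases}$$

$$g_k(p_k, w_k, u_k) = \begin{cases} p_k H_k + (1-p_k) F_k, & \text{if } p_k \ne T \text{ and } u_k = 1 \text{ (treat)} \\ q_k, & \text{if } p_k \ne T \text{ and } u_k = 0 \text{ (wait)} \\ 0, & \text{otherwise} \end{cases}$$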

We write the optimal value functions as:

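$$V_N(p_N) = \begin{cases} p_N H_N + (1-p_N) F_N, & \text{if } p_N \ne T \\ 0, & \text{if } p_N = T \end{cases}$$

$$V_k(p_k) = \begin{cases} \max\left[\, p_k H_k + (1-p_k) F_k,\; q_k + E\{V_{k+1}(p_{k+1})\} \,\right], & \text{if } p_k \ne T \\ 0, & \text{if } p_k = T \end{cases} \tag{1}$$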

We assume that decision makers are risk neutral and do not have a time preference for health. At the last period N an untreated patient will adopt treatment because treatment, in expectation, can never yield worse health outcomes than never receiving treatment. This assumption might be violated in extreme cases when treatment has very low effectiveness and/or severe side effects (e.g. high reduction in quality of life; high probability of immediate death); however, since FDA-approved treatments must reach a minimum level of safety and efficacy, we assume these extreme cases will not occur. In any other period k, the patient either decides to treat and receive the expected total health reward, a weighted average between successful treatment and failed treatment pkHk + (1 − pk)Fk, or to wait and receive the current period reward qk plus an expected future reward E{Vk+1(pk+1)}. The optimal solution is obtained by solving (1) recursively.

The optimal policy is to adopt treatment when treatment effectiveness is greater than a threshold, which we denote by αk:

$$p_k > \frac{q_k - F_k + E[V_{k+1}(p_{k+1})]}{H_k - F_k} = \alpha_k \tag{2}$$

If pk > αk, then the patient selects treatment; otherwise, the patient waits. We note that αk is a function of pk, and values of p are correlated across periods (i.e., treatment effectiveness can only improve over time). If we can find a threshold p̄k that uniquely solves p̄k = αk(p̄k), then the decision rule is as follows: if pk > p̄k, the patient accepts treatment, and if pk ≤ p̄k, the patient waits. The term αk must be convex and increasing in pk for p̄k to exist; the proof is included in the Appendix in the proofs of Proposition 3 (Appendix page A2) and Proposition 4 (middle section of Appendix page A3).
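The threshold in (2) comes from comparing the two arguments of the maximum in (1). A short sketch of the algebra, assuming Hk > Fk (which the paper states as part of Assumption 3 in Section 2.2) so that dividing by Hk − Fk preserves the inequality:

```latex
\begin{align*}
p_k H_k + (1-p_k) F_k &> q_k + E\{V_{k+1}(p_{k+1})\} \\
\iff\quad p_k (H_k - F_k) &> q_k - F_k + E\{V_{k+1}(p_{k+1})\} \\
\iff\quad p_k &> \frac{q_k - F_k + E\{V_{k+1}(p_{k+1})\}}{H_k - F_k} = \alpha_k .
\end{align*}
```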

2.2 Structural Properties of the Optimal Solution

We now derive several structural properties of the value function and optimal policy. All proofs are in the Appendix. We make the following intuitive assumptions:

Assumption 1

The per period health reward is non-increasing in time: qk ≥ qk+1. That is, as the patient’s health deteriorates with aging and disease progression, the patient’s per period health reward declines or at most remains constant.

Assumption 2

The expected total future health rewards after treatment are decreasing in time, Hk > Hk+1 and Fk > Fk+1, whether or not the treatment successfully cures the patient. That is, as the patient’s health deteriorates with aging and disease progression, the patient’s expected total future health reward post-treatment decreases. We also assume Hk − Hk+1 > qk and Fk − Fk+1 ≤ qk. That is, successful treatment yields higher per period health than does no treatment, whereas unsuccessful treatment yields equal or lower per period health than does no treatment.

Assumption 3

The expected total future health reward from treatment success is always larger than the reward from treatment failure, Hk > Fk, and the incremental benefit from successful treatment is decreasing in time, Hk − Fk > Hk+1 − Fk+1. That is, as the patient’s health deteriorates with aging and disease progression, the added benefit of successful treatment over failed treatment (i.e., H − F) decreases over time. (Note that this is also a direct result of assuming Hk − Hk+1 > qk and Fk − Fk+1 ≤ qk in Assumption 2.)

We note that if treatment effectiveness is 0% in period k, then the value of waiting is greater than or equal to the value of treatment failure: if pk = 0, then Fk ≤ qk + E{Vk+1(pk+1)}. If treatment effectiveness is 100%, then the value of treatment success is greater than the value of waiting: if pk = 1, then Hk > qk + E{Vk+1(pk+1)}. Thus, the patient will always wait when there is no chance that treatment will be successful, and will never wait if current treatment is perfect.
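Assumptions 1 to 3 are simple inequalities on the sequences qk, Hk and Fk, so they can be checked mechanically for any set of health rewards. The sketch below does this for the base-case values reported later in Table 1 (white male, age 50, FS2, high risk); the function name `check_assumptions` and the dictionary labels are illustrative choices, not part of the paper.

```python
def check_assumptions(q, H, F):
    """Check Assumptions 1-3 for reward sequences q, H, F of equal length.

    Returns a dict mapping a short label to True/False.
    """
    ks = range(len(q) - 1)
    return {
        "A1: q_k >= q_{k+1}": all(q[k] >= q[k + 1] for k in ks),
        "A2: H and F decreasing": all(
            H[k] > H[k + 1] and F[k] > F[k + 1] for k in ks
        ),
        "A2: H_k - H_{k+1} > q_k": all(H[k] - H[k + 1] > q[k] for k in ks),
        "A2: F_k - F_{k+1} <= q_k": all(F[k] - F[k + 1] <= q[k] for k in ks),
        "A3: H_k > F_k": all(h > f for h, f in zip(H, F)),
        "A3: H_k - F_k decreasing": all(
            H[k] - F[k] > H[k + 1] - F[k + 1] for k in ks
        ),
    }


# Base case from Table 1: white male, age 50, FS2, high risk.
q = [1.452, 1.400, 1.338]
H = [16.921, 15.254, 13.637]
F = [12.836, 11.400, 10.016]

results = check_assumptions(q, H, F)
for label, holds in results.items():
    print(f"{label}: {holds}")
```

For this patient profile every check passes; for example, H0 − H1 = 1.667 exceeds q0 = 1.452, while F0 − F1 = 1.436 does not.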

We first examine the impact of changes in the patient’s health on the value function and optimal policy, holding treatment effectiveness constant.

Proposition 1

The value function is increasing in the patient’s health in period k, qk, and the patient’s expected post-treatment total future health rewards, Hk and Fk (separately and simultaneously). That is, for all k, pk, and $q_k^1 \le q_k^2$, $H_k^1 \le H_k^2$, $F_k^1 \le F_k^2$, we have $V_k(p_k, H_k^1, F_k^1, q_k^1) \le V_k(p_k, H_k^2, F_k^2, q_k^2)$. Healthier patients have a higher value function than less healthy patients.

We now establish conditions under which it is optimal to accept treatment. For Proposition 2a, we assume qk is independent of Hk and Fk. For Proposition 2b, we assume that Hk and Fk are both increasing with qk; that is, healthier patients have better expected future health outcomes post-treatment than sicker patients.

Proposition 2

  1. Assume qk is independent of Hk and Fk and $q_k^1 \le q_k^2$. If it is optimal to accept treatment at effectiveness pk with health $q_k^2$, then it is also optimal to accept treatment pk with health $q_k^1$. That is, if it is optimal in period k to adopt treatment with effectiveness pk at current health qk, then it is also optimal to adopt treatment for all lower values of qk in period k.

  2. Assume Hk and Fk are both increasing with qk and $q_k^1 \le q_k^2$, $H_k^1 \le H_k^2$, $F_k^1 \le F_k^2$. If it is optimal to accept treatment at effectiveness pk with health $q_k^2$, then it is also optimal to accept treatment pk with health $q_k^1$, if $p_k \Delta H_k + (1 - p_k)\Delta F_k - \Delta q_k \le 0$, where $\Delta q_k = q_k^2 - q_k^1$, $\Delta H_k = H_k^2 - H_k^1$, $\Delta F_k = F_k^2 - F_k^1$. This is a sufficient condition.

We note that the condition $p_k \Delta H_k + (1 - p_k)\Delta F_k - \Delta q_k \le 0$ is sufficient to guarantee that the optimal decision is to accept treatment for a patient with health $q_k^1$. The condition states that if the difference between person 1’s and person 2’s expected health outcomes post-treatment is smaller than the difference between their per period health, then whenever person 2 should accept treatment, person 1 should also accept treatment. If $p_k \Delta H_k + (1 - p_k)\Delta F_k - \Delta q_k > 0$, the optimal decision could still be to accept treatment or could be to wait.

We now examine the impact of changes in treatment effectiveness on the value function and optimal policy. Examining the structure of the optimal stopping problem from period N backward, we have

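$$V_N(p_N) = p_N H_N + (1-p_N) F_N$$

$$V_{N-1}(p_{N-1}) = \max \begin{cases} (H_{N-1} - F_{N-1})\, p_{N-1} + F_{N-1}, & u_{N-1} = 1 \\ (H_N - F_N)\, p_{N-1} + q_{N-1} + F_N + (H_N - F_N)\, \bar{w}_{N-1}, & u_{N-1} = 0 \end{cases}$$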

We note that w̄N−1, the expectation of wN−1, is now a function of pN−1 and, depending on the probability distribution of wk, the function can be nonlinear in pN−1.

Proposition 3

Monotonicity of value function: For the cases of both incremental and radical innovation, Vk(pk) is monotonically increasing in treatment effectiveness pk.

In general, as treatment effectiveness pk improves, the rewards associated with both adopting and waiting increase. From Assumptions 1 and 2, we note that improving the current treatment effectiveness pk, holding all else unchanged, increases the value of adopting in the current period k more than it increases the value of waiting if the next period decision is to adopt treatment. This is because the additional improvement in pk has less impact on the value of adopting when Hk and Fk are decreasing in later time periods.

Proposition 4

Monotonicity of policy: For the case of incremental innovation, if it is optimal to adopt treatment with effectiveness pk, then it is also optimal to adopt any treatment with higher effectiveness. As current treatment effectiveness pk improves, the decision moves from waiting toward adopting treatment.

We note that the optimal policy may also be monotonic for the case of radical innovation, but we have been unable to establish the result analytically.

Proposition 5

For the case of incremental innovation, the threshold effectiveness p̄k to adopt treatment decreases as the end of the decision horizon gets closer: p̄k > p̄k+1, if $\bar{p}_{k+1} < \frac{H_{k+1} - F_{k+1}}{(H_k - F_k) - (H_{k+1} - F_{k+1})}\, \bar{w}$. That is, if it is optimal to adopt treatment in period k, the decision should also be adoption in periods later than k if the above condition is satisfied.

The condition $\bar{p}_{k+1} < \frac{H_{k+1} - F_{k+1}}{(H_k - F_k) - (H_{k+1} - F_{k+1})}\, \bar{w}$ provides an upper bound on the threshold effectiveness. The condition indicates that if the next period threshold is smaller than the expected treatment effectiveness improvement multiplied by a factor (the inverse of the percentage change in the difference between the value of treatment success and failure in the next period), then the threshold effectiveness decreases with time. We note that the bound on the threshold effectiveness increases with higher w̄ (thus the patient is more likely to wait longer), and decreases with faster health deterioration (smaller Hk+1 − Fk+1; thus the patient is more likely to accept treatment).

Proposition 6

For the cases of both incremental and radical innovation, increasing expectation on future treatment improvement favors waiting.

2.3 Analytical Solution to the Three-Period Problem

In this section we analytically solve the optimal stopping problem with radical innovation in treatment improvement (Case 2) for a decision problem of three time periods (N=2). A three-period problem can be interpreted as a patient having to decide whether to accept treatment in period 1 or 2. If the patient waits until period 3, then he or she will accept treatment in period 3, as it is assumed that expected health after receiving treatment is always better than expected health with no treatment. With a three-period problem each time period is sufficiently long to allow a moderate to high likelihood of treatment improvement (for medical innovations, each period is likely measured in years) and, for most chronic diseases, the patient and doctor would want to treat within a reasonable time frame (e.g., less than 5–10 years).

Depending on the distribution of wk, solving a multi-period problem can quickly become analytically challenging. Here we assume that wk ~ U[0, θ(1 − pk)]. One can think of this case as one in which the decision maker (e.g., the patient or doctor) has little information about how much treatment effectiveness will likely be improved in the next period except for the upper bound on treatment improvement, and thus assumes there is an equal probability of improvement between 0 and the maximum improvement possible θ(1 − pk).

The optimal solution (derivation in the Appendix) is as follows:

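At k = 0, if $p_0 > \alpha_0 = \frac{q_0 - F_0 + E[V_1(p_1)]}{H_0 - F_0}$, then the optimal decision is to treat in period 1; otherwise, the optimal decision is to not treat in period 1. The expectation of the value function is

$$E[V_1(p_1)] = \begin{cases} C_1 + \dfrac{(H_1 - F_1)\,\theta(1 - p_0)}{2}, & \text{if } C_2 < C_1 \\[2ex] \dfrac{1}{\theta(1 - p_0)}\, \dfrac{(C_2 - C_1)^2}{2(H_1 - F_1) - (2 - \theta)(H_2 - F_2)} + C_1 + \dfrac{H_1 - F_1}{2}\,(1 - p_0)\,\theta, & \text{if } C_2 > C_1 \end{cases}$$

where the constants C1 and C2 are:

$$C_1 = (H_1 - F_1)\, p_0 + F_1, \qquad C_2 = q_1 + F_2 + (H_2 - F_2)\, \frac{\theta + (2 - \theta)\, p_0}{2}.$$

At k = 1, if $p_1 > \alpha_1 = \frac{q_1 - F_1 + E[V_2(p_2)]}{H_1 - F_1}$, then the optimal decision is to treat in period 2. The expectation of the value function is

$$E[V_2(p_2)] = (H_2 - F_2)\, p_1 + F_2 + \frac{(H_2 - F_2)\,\theta(1 - p_1)}{2}.$$

Alternatively, solving for p̄1 = α1(p̄1), we have

$$\bar{p}_1 = \frac{2(q_1 - F_1 + F_2) + \theta(H_2 - F_2)}{2(H_1 - F_1) - (H_2 - F_2)(2 - \theta)}.$$

If p1 > p̄1, it is optimal to accept treatment in period 2. If p1 ≤ p̄1, it is optimal to wait and accept treatment in period 3.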
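The period-2 rule can be evaluated directly. Because V2 is linear in p2 and the uniform assumption gives E[w1] = θ(1 − p1)/2, the value of waiting is linear in p1, and equating it with the value of treating yields the closed-form threshold. The sketch below codes both the threshold and the underlying comparison, using the base-case health rewards from Table 1 (white male, age 50, FS2, high risk); the function names are illustrative, and the formula is only meaningful when its denominator is positive, which holds for these inputs.

```python
def period2_threshold(q1, H1, F1, H2, F2, theta):
    """Closed-form threshold for treating in period 2 (k = 1),
    assuming w_1 ~ U[0, theta * (1 - p_1)]."""
    numerator = 2.0 * (q1 - F1 + F2) + theta * (H2 - F2)
    denominator = 2.0 * (H1 - F1) - (H2 - F2) * (2.0 - theta)
    return numerator / denominator


def treat_in_period2(p1, q1, H1, F1, H2, F2, theta):
    """Compare treating now with waiting one period (then treating)."""
    treat = p1 * H1 + (1.0 - p1) * F1
    expected_p2 = p1 + theta * (1.0 - p1) / 2.0  # E[p_2] under the uniform assumption
    wait = q1 + expected_p2 * H2 + (1.0 - expected_p2) * F2
    return treat > wait


# Base case from Table 1: white male, age 50, FS2, high risk.
base = dict(q1=1.400, H1=15.254, F1=11.400, H2=13.637, F2=10.016)

for theta in (0.1, 0.5, 1.0):
    threshold = period2_threshold(theta=theta, **base)
    print(f"theta = {theta}: treat in period 2 if p1 > {threshold:.3f}")
```

For these inputs the threshold is roughly 0.81 at θ = 0.5 and rises with θ, in line with Proposition 6: greater potential for future improvement favors waiting.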

Going from a three-period problem to more periods is analytically challenging even with uniformly distributed treatment improvements (see Appendix). The n-period problem can be solved numerically using backward induction and discretizing the probability distribution on wk. The optimal policy can be found through careful evaluation of all possible states pk and their corresponding value function Vk and optimal action in every time period.
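A numerical solution along these lines can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the uniform radical-innovation distribution wk ~ U[0, θ(1 − pk)] from this section, discretizes pk on an evenly spaced grid, approximates the expectation with an equally weighted midpoint rule, and interpolates Vk+1 linearly between grid points. The function name `solve_backward` and the grid sizes are hypothetical choices.

```python
def solve_backward(q, H, F, theta, n_grid=201, n_w=101):
    """Backward induction for the optimal stopping problem (radical innovation).

    q: waiting rewards q_0..q_{N-1}; H, F: terminal rewards for k = 0..N.
    Returns (grid, V, U) where V[k][i] is the value at grid[i] in period k and
    U[k][i] is the optimal action (1 = treat, 0 = wait) for k = 0..N-1.
    """
    N = len(q)
    grid = [i / (n_grid - 1) for i in range(n_grid)]

    def interp(values, p):
        # Linear interpolation of a function tabulated on the uniform grid.
        x = min(max(p, 0.0), 1.0) * (n_grid - 1)
        i = min(int(x), n_grid - 2)
        t = x - i
        return (1.0 - t) * values[i] + t * values[i + 1]

    V = [None] * (N + 1)
    U = [None] * N
    # Terminal period: an untreated patient always accepts treatment.
    V[N] = [p * H[N] + (1.0 - p) * F[N] for p in grid]

    for k in range(N - 1, -1, -1):
        Vk, Uk = [], []
        for p in grid:
            treat = p * H[k] + (1.0 - p) * F[k]
            w_max = theta * (1.0 - p)
            # Midpoint rule: n_w equally weighted draws from U[0, w_max].
            total = 0.0
            for j in range(n_w):
                total += interp(V[k + 1], p + w_max * (j + 0.5) / n_w)
            wait = q[k] + total / n_w
            Vk.append(max(treat, wait))
            Uk.append(1 if treat > wait else 0)  # ties resolved as "wait"
        V[k], U[k] = Vk, Uk
    return grid, V, U


# Three-period example (N = 2) with the Table 1 base case:
# white male, age 50, FS2, high risk.
q = [1.452, 1.400]
H = [16.921, 15.254, 13.637]
F = [12.836, 11.400, 10.016]
grid, V, U = solve_backward(q, H, F, theta=0.5)

first_treat = next(p for p, u in zip(grid, U[1]) if u == 1)
print("smallest grid p1 at which treating in period 2 is optimal:", first_treat)
```

On this grid the period-2 policy switches from waiting to treating between p1 = 0.805 and p1 = 0.81, which agrees with the closed-form threshold p̄1 to within the grid spacing. The same function accepts longer horizons by passing longer q, H and F sequences.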

3. EXAMPLE: CHRONIC HEPATITIS C TREATMENT ADOPTION

We now present a numerical example of optimal chronic HCV treatment adoption decisions. We assume there are three time periods with radical innovation in treatment improvement. We omit the incremental innovation case since HCV drug development has shown radical jumps over time (see Figure 1). For a patient who is diagnosed early with minimal liver fibrosis, the disease typically progresses slowly and steadily and can take 15 to 30 years to cause severe liver damage [16]. Therefore, patients with no significant fibrosis can wait for better treatment without the threat of immediate adverse health outcomes (i.e., development of liver decompensation or liver cancer). We model a decision horizon of 6 years; each period is 2 years to allow sufficient time for new drug commercialization. For HCV drugs, a major new breakthrough has occurred on average every 2 years in recent times. Within 6 years, HCV may be a completely curable disease (treatment effectiveness close to 100%) and thus the treatment adoption decision will not be as relevant.

HCV has several genotypes. The most common and difficult-to-treat type in the US is genotype 1. The previous standard therapy (pegylated interferon and ribavirin) was effective in only 40% of genotype 1 patients [17]. Between 2011 and 2013, several newly approved viral protease inhibitors (e.g., boceprevir) and a polymerase inhibitor (sofosbuvir), used in combination with standard therapy, increased treatment success to 70–90% [18–20]. More effective interferon-free treatment is currently being commercialized: for example, an all-oral ledipasvir and sofosbuvir combination has shown promise in achieving sustained viral response (SVR) in more than 90% of patients [21, 22]. In the recent environment of rapid HCV drug development, patients and their doctors often delay treatment initiation, with the hope that more effective or less toxic treatment (which leads to better adherence and higher chance of success) will emerge soon. However, the newer drugs are costly, so physicians also implicitly make treatment decisions based on patients’ age, gender, disease severity, risk history, and comorbidities, often prioritizing patients with higher risk of disease progression toward immediate treatment.

3.1 Model and Data

We estimate values for qk, Hk and Fk by modifying a previously developed Markov model [23] that simulates the lifetime disease progression and death of patients with chronic HCV infection (Appendix Figure 1) and then using it to compute these values. The model stratifies patients by age, sex (male, female), race (white, black), risk status (high risk, low risk) and initial liver fibrosis stage. High risk is defined as having a history of injection drug use, transfusion prior to 1992, or more than 20 lifetime sex partners. These demographic factors influence disease progression and/or survival.

Rates of disease progression depend on age and sex. Health states include healthy (no HCV), no fibrosis (FS0), portal fibrosis with no septa (FS1), portal fibrosis with few septa (FS2), numerous septa without cirrhosis (FS3), compensated cirrhosis (FS4), decompensated cirrhosis, hepatocellular carcinoma (HCC, a form of liver cancer), and liver transplantation. Patients can start in disease stages FS0 to FS4 (top row of Appendix Figure 1). Patients who achieve sustained viral response (SVR) from treatment transition to recovered health states stratified by fibrosis severity; those who do not achieve SVR return to natural disease progression. A proportion of patients with decompensated cirrhosis and HCC receive liver transplants. Death can occur from any state based on calibrated background mortality rates that differ by patient age, sex, race and risk status [24]. There are additional risks of death in the decompensated cirrhosis, HCC, and liver transplantation states.

Model inputs include age- and sex- based chronic HCV progression rates [16]; mortality rates by age, sex, race, and risk status (calculated using hazard ratios estimated from the NHANES III linked mortality data and US life tables [25]); and age- and health-related quality-of-life weights. All model inputs can be found in Liu et al. [23]. The immediate health rewards, q0, q1, q2, are the per period expected QALYs experienced in periods 1, 2 and 3, respectively. The terminal health rewards, H0, H1, H2, F0, F1, and F2, are expected total QALYs over a lifetime horizon for patients who are treated in periods 1, 2, or 3 (Table 1). We do not discount health benefits in this example.

Table 1.

Health rewards (expected quality-adjusted life years) for HCV patients by race and gender, starting age, initial fibrosis stage, and risk status

Column groups: q0, q1, q2 = health reward in each period k = 0, 1, 2 without treatment; H0, H1, H2 = future health reward if treatment accepted in period k = 0, 1, 2 is successful*; F0, F1, F2 = future health reward if treatment accepted in period k = 0, 1, 2 is not successful*.

| Patient Description | q0 | q1 | q2 | H0 | H1 | H2 | F0 | F1 | F2 |
|---|---|---|---|---|---|---|---|---|---|
| Base Case | | | | | | | | | |
| White Male, Age 50, FS2, High Risk | 1.452 | 1.400 | 1.338 | 16.921 | 15.254 | 13.637 | 12.836 | 11.400 | 10.016 |
| White Male, Age 50, FS2, Low Risk | 1.452 | 1.412 | 1.364 | 19.146 | 17.494 | 15.877 | 14.194 | 12.758 | 11.361 |
| White Female, Age 50, FS2, High Risk | 1.452 | 1.410 | 1.363 | 18.717 | 17.062 | 15.449 | 14.847 | 13.411 | 12.016 |
| White Female, Age 50, FS2, Low Risk | 1.452 | 1.420 | 1.383 | 20.907 | 19.266 | 17.656 | 16.382 | 14.945 | 13.540 |
| Black Male, Age 50, FS2, High Risk | 1.452 | 1.369 | 1.274 | 13.382 | 11.691 | 10.089 | 10.502 | 9.069 | 7.719 |
| Black Male, Age 50, FS2, Low Risk | 1.452 | 1.390 | 1.317 | 15.726 | 14.034 | 12.403 | 11.947 | 10.512 | 9.139 |
| Black Female, Age 50, FS2, High Risk | 1.452 | 1.386 | 1.313 | 15.111 | 13.432 | 11.822 | 12.210 | 10.776 | 9.407 |
| Black Female, Age 50, FS2, Low Risk | 1.452 | 1.403 | 1.347 | 17.074 | 15.428 | 13.832 | 13.847 | 12.412 | 11.025 |
| Sensitivity Analyses | | | | | | | | | |
| White Male, Age 60, FS2, High Risk | 1.409 | 1.314 | 1.198 | 11.342 | 9.696 | 8.146 | 8.773 | 7.384 | 6.089 |
| White Male, Age 50, FS3, High Risk | 1.452 | 1.378 | 1.303 | 16.921 | 15.232 | 13.331 | 11.512 | 10.076 | 8.717 |
| White Male, Age 50, FS4, High Risk | 1.349 | 1.288 | 1.163 | 16.818 | 13.758 | 11.106 | 9.160 | 7.840 | 6.577 |

* The F values (expected total future health reward if treatment is unsuccessful) are close to the H values (expected total future health reward if treatment is successful) due to the nature of chronic HCV progression, which is a slow progressing disease that can take 15–30 years to cause major liver damage (e.g. decompensated cirrhosis and hepatocellular carcinoma).
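The rewards in Table 1 can be checked against the two inequalities that the appendix proofs lean on: the gap Hk − Fk shrinks from one period to the next, and qk + Fk+1 is at least Fk. The following is a minimal sketch of that arithmetic for four rows copied from the table; the dictionary and function names are illustrative only and do not come from the original analysis.

```python
# Four rows copied from Table 1: (q0, q1, q2), (H0, H1, H2), (F0, F1, F2).
# The names TABLE_1_ROWS and check_row are hypothetical helpers for this sketch.
TABLE_1_ROWS = {
    "White Male, Age 50, FS2, High Risk": (
        (1.452, 1.400, 1.338), (16.921, 15.254, 13.637), (12.836, 11.400, 10.016)),
    "Black Male, Age 50, FS2, High Risk": (
        (1.452, 1.369, 1.274), (13.382, 11.691, 10.089), (10.502, 9.069, 7.719)),
    "White Male, Age 60, FS2, High Risk": (
        (1.409, 1.314, 1.198), (11.342, 9.696, 8.146), (8.773, 7.384, 6.089)),
    "White Male, Age 50, FS4, High Risk": (
        (1.349, 1.288, 1.163), (16.818, 13.758, 11.106), (9.160, 7.840, 6.577)),
}


def check_row(q, H, F):
    """Return (gap_shrinks, waiting_floor_holds) for one patient profile.

    gap_shrinks: H[k+1] - F[k+1] < H[k] - F[k] for every consecutive pair.
    waiting_floor_holds: q[k] + F[k+1] >= F[k] for every consecutive pair.
    """
    gaps = [h - f for h, f in zip(H, F)]
    gap_shrinks = all(gaps[k + 1] < gaps[k] for k in range(len(gaps) - 1))
    waiting_floor_holds = all(
        q[k] + F[k + 1] >= F[k] for k in range(len(F) - 1)
    )
    return gap_shrinks, waiting_floor_holds


for name, (q, H, F) in TABLE_1_ROWS.items():
    print(name, check_row(q, H, F))
```

The same check can be run on the remaining rows by adding them to the dictionary.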

3.2 Results for HCV Example

We consider the treatment decisions over a three-period time horizon for a 50-year-old patient who is in FS2 (just before developing significant fibrosis) at the beginning of period 1 (base case). Figure 2 shows the treatment acceptance and waiting regions in periods 1 and 2 for a high-risk white male, considering all possible values of treatment effectiveness, as parameterized by p0, p1, and θ (full results stratified by sex, race and risk status are presented in Appendix Figure 2). We ran the numerical example with 1,000 data points for each p0, p1, and θ, with 0.001 intervals, to create the decision regions. The decision regions for periods 1 and 2 are overlaid in the figure: the white region is the region in which the patient waits in period 2; the yellow region is the additional area in which the patient waits in period 1. From these numerical results, we make the following observations, which are also illustrative of the general results from Propositions 4 and 6, about the decision regions:

Figure 2. Hepatitis C example: decision regions for a 50-year-old high-risk white male in fibrosis stage FS2.

Decision regions for optimal treatment adoption as a function of pk (treatment effectiveness in period k) and θ (potential for future treatment improvement). The white region is the region in which the patient waits in period 2; the yellow region is the additional area in which the patient waits in period 1. The green region is the treating region in period 1. The green plus yellow region is the treating region in period 2.

  1. For periods k = 0 and 1, θ ↑ ⇒ αk (adoption threshold) ↑. That is, as the expectation of future treatment improvement increases, the patient’s decision moves toward waiting.

  2. For periods k = 0 and 1, $\partial \alpha_k / \partial p_k < 1$. That is, as the current treatment effectiveness increases, the patient’s decision moves toward adopting treatment now.

  3. If p0 > α0(p0) (the decision is to treat in period 1) and α0(p0) > α1(p0), then p0 > α1(p0) by transitivity. Given p1 > p0 and p0 > α1(p0), then p1 > α1(p1). This result comes from $\dfrac{\partial \alpha_1}{\partial p} = \dfrac{(2-\theta)(H_2-F_2)}{2(H_1-F_1)} < 1$. That is, if the decision is to treat in period 1, then the decision is also to treat in period 2.
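Observations 1 to 3 can be illustrated numerically for period 2 (k = 1). The sketch below assumes the uniform improvement model from the appendix, wk ~ U[0, θ(1 − pk)], and solves p1H1 + (1 − p1)F1 = q1 + (H2 − F2)p1 + F2 + (H2 − F2)θ(1 − p1)/2 for the break-even effectiveness. The function names are hypothetical, and the inputs are the base-case row of Table 1.

```python
def period2_break_even(q1, H1, F1, H2, F2, theta):
    """Effectiveness p1 at which treating and waiting in period 2 have equal value.

    Obtained by solving the linear indifference equation for p1 under the
    assumption w1 ~ U[0, theta * (1 - p1)], so that E[w1] = theta * (1 - p1) / 2.
    """
    gap1 = H1 - F1
    gap2 = H2 - F2
    numerator = q1 + F2 - F1 + gap2 * theta / 2.0
    denominator = gap1 - gap2 * (2.0 - theta) / 2.0
    return numerator / denominator


def alpha1_slope(H1, F1, H2, F2, theta):
    """Slope of the period-2 adoption threshold alpha_1 with respect to p."""
    return (2.0 - theta) * (H2 - F2) / (2.0 * (H1 - F1))


# Base case from Table 1: white male, age 50, FS2, high risk.
q1, H1, F1, H2, F2 = 1.400, 15.254, 11.400, 13.637, 10.016
for theta in (0.0, 0.25, 0.5, 0.75, 1.0):
    p_bar = period2_break_even(q1, H1, F1, H2, F2, theta)
    slope = alpha1_slope(H1, F1, H2, F2, theta)
    print(f"theta={theta:.2f}  break-even p1={p_bar:.4f}  slope={slope:.4f}")
```

For these inputs the break-even effectiveness rises with θ and the slope stays below 1, which is the pattern the observations describe.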

Observation 3 is more apparent when the patient’s initial disease stage is FS3 fibrosis, as illustrated in Figure 3. Panel (a) shows that the treatment effectiveness threshold (αk, the minimum acceptable threshold at which the patient will accept treatment) is higher in period 1 than in period 2. Panel (b) shows that the period 1 waiting region (white plus yellow) extends outward to the right from the period 2 waiting region (white).

Figure 3. Hepatitis C example: treatment effectiveness threshold and decision regions for a 50-year-old, high-risk white male in fibrosis stage FS3.

Panel (a) shows the treatment effectiveness threshold αk (see equation (2) in text) for period 1 (blue line) and period 2 (red line); if treatment effectiveness is above this line, the patient accepts treatment; otherwise, the patient waits. Panel (b) shows the waiting regions in each period as a function of pk (treatment effectiveness in period k) and θ (potential for future treatment improvement): the white region is the region in which the patient waits in period 2; the yellow region is the additional area in which the patient waits in period 1. The green region is the treating region in period 1. The green plus yellow region is the treating region in period 2.

Using a 50-year-old, high-risk white male with initial disease stage FS2 as the base case, we explore the impact of low risk status, older age, more severe fibrosis stage, and race on the optimal treatment adoption policy. The decision regions for period 1 (k=0) are shown in Figure 4. In each panel, the arrow indicates the base case decision boundary, and the other line is the decision boundary for the single factor change. The results provide insights that can inform HCV treatment adoption decisions: the optimal decision is more likely to be waiting for patients who are low risk (panel a), younger (panel b), healthier (panel c), or white (panel d). For example, in the era of boceprevir triple therapy, we estimate that the treatment effectiveness for a white patient is 85% and for a black patient is 45% [19], and we assume that θ equals 1. Then, for a 50-year-old high-risk white male with FS2 fibrosis, the decision is to wait, whereas for the same patient with FS4 fibrosis the decision switches to treat immediately in period 1 (panel c). For a 50-year-old high-risk black male with FS2 fibrosis, the decision is to wait in period 1, if the current treatment effectiveness for black patients is lower than for white patients. However, if the current effectiveness for black patients is the same as for white patients (85%), the decision switches to treat immediately for black patients; but the decision for white patients is still to wait (panel d). The optimal treatment decision differs by race because race is used as a proxy for background mortality. In the era of highly effective all-oral antiviral treatment with a cure rate above 90%, all patient types would benefit from immediate treatment for all values of θ.

Figure 4. Hepatitis C example: comparison of decision regions in period 1 for different types of patients.

Comparison of decision regions in period 1 for high-risk vs. low-risk patients (panel a), younger vs. older patients (panel b), early vs. advanced disease (panel c), and white vs. black patients (panel d). The arrow indicates the decision boundary line for the base case, a 50-year-old, high-risk white male with FS2 fibrosis.

4. DISCUSSION AND FUTURE WORK

Medical decision analyses rarely include consideration of how future medical technology improvement influences the current technology adoption decision. However, this is a key aspect of many medical decisions: whether to take the best treatment available now or to wait, based on the belief that better technologies will emerge soon. We formulate the patient-level treatment adoption decision problem as an optimal stopping problem using a discrete-time finite-horizon model. This formulation is based on assumptions that are relevant and plausible for real-world chronic diseases. Our analyses show that the model has several intuitive properties: patients with less advanced chronic disease should wait for more effective treatment longer than patients with more advanced disease; the threshold effectiveness for treatment adoption decreases as the end of the horizon gets closer; improving current treatment effectiveness favors adoption; and increasing expectation on future treatment improvement favors waiting.

Our dynamic decision framework provides several advantages over other approaches. Analyses using non-dynamic models (ones that do not account for time-dependent changes in the state of the system) that find marginally effective treatments to be cost-effective for immediate use may suggest a suboptimal decision, especially for patients with minimal disease progression and in situations where significant increases in treatment effectiveness are expected based on reports from ongoing clinical trials. Some patients and doctors may adopt a myopic decision rule and only compare what is available in the current period to what is expected to be available in the next. Such a one-period look-ahead rule is optimal in a finite-horizon monotone stopping problem, but monotonicity properties are easily violated in many disease applications and thus a myopic policy may not be optimal for the multi-period decision problem.

Our numerical example of chronic HCV treatment suggests that the optimal decision is more likely to be to accept treatment at any point in time for patients who are high risk, older, sicker, or black. Risk status and race are used as proxies for comorbidities. Even with a simple uniform distribution assumption for treatment improvement and with relatively effective current treatment (e.g., 85% effectiveness), some patients may still want to wait for better treatment as long as the expectation on future treatment improvement is high (e.g., θ > 0.7).

These findings have immediate policy relevance. Previous cost-effectiveness analyses of HCV treatment indicate that it is more cost-effective to treat patients who will have higher gains in expected long-term health outcomes rather than those who will gain less health benefit [23, 26]. These patients are younger and have lower risk of comorbidities and a lower baseline mortality rate. Results from our study highlight the fact that if treatments are improving over time, it is sometimes optimal to treat sooner those people who appear less cost-effective to treat in a non-dynamic analysis (versus people who appear more cost-effective to treat). In healthcare systems where HCV treatment and care may be limited due to capacity constraints, our work may suggest a different prioritization scheme based on maximizing total population health: for example, treating older and sicker patients earlier than other groups during periods of rapid technological advancement.

Our research bridges a gap between the technology adoption literature and the medical decision making literature. The technology adoption literature primarily studies the impact of uncertainty about future technological progress on adoption decisions. These studies tend to be conceptual mathematical models built to study the impact of the speed of arrival and magnitude of technological improvement on the optimal decisions. The adopters often hold a constant current best technology. The medical decision making literature primarily studies the impact of uncertainty in patients’ health states with technology assumed to be static. These studies often assume that adopters lack the option to wait for improved future treatments. Our analysis considers both aspects of the decision: the adopter has a changing (decreasing) baseline state (i.e., the patient’s health) and future technology (i.e., treatment effectiveness) is improving stochastically over time.

Our model can be extended to consider additional uncertainties in disease progression and technological change. First, our model assumes that decisions are made based on perfect knowledge of expected value of future health, but in fact disease progression can be stochastic and patients may experience different progression rates. A future version of the model could allow patients’ health states to evolve stochastically. Since the decision maker often cannot fully observe how health and technology change with time, a partially observable Markov decision process approach could be useful for modeling the patient’s health state and available technologies. Second, the uniform distribution assumption for how technology may improve approximates cases in which patients and doctors have little information about treatment improvement in the near future. However, early clinical trial results can often narrow the uncertainty about likely treatment improvements and allow one to infer more accurate distributions on future treatment effectiveness. Third, the decision maker may want to update her belief about health, treatment effectiveness and side effects upon taking an action, and the updating process may be costly. Thus, another relevant issue is value of information, which in this context is the amount of money a decision maker is willing to pay for information prior to making a decision. Value of information fits nicely in this framework: the model could be extended to explore a decision maker’s willingness to actively monitor technological progress by collecting sample information on updated developments from clinical trials. Fourth, for other disease applications, a longer decision period could be insightful. Such analyses would require detailed understanding of the disease’s natural history, treatment process and drug development trajectory. Use of discrete percentages for pk would facilitate numerical solution of the model.

Our model can also be extended to consider relevant factors such as re-treatment, risk aversion, discounting, and cost. The current model formulation is only suited to steadily progressing chronic diseases that can be cured, and re-treatment is not considered. The model was motivated by the development trajectory of chronic HCV treatment up to the year 2012. Re-treatment was not common due to low drug effectiveness among non-responders to previous treatment and the long and arduous drug regimen (24–48 weeks of having flu-like symptoms on treatment). Our model is suited to diseases in which repeating treatment is to be avoided due to ineffectiveness of repeat treatment or the potential for a large decrement in quality of life. More sophisticated models could consider the possibility of repeated treatment. Additionally, patients’ risk attitudes are currently incorporated in the quality-adjusted health rewards (q, H and F) by utility elicitation methods used in obtaining the quality weights for various disease states (e.g., standard gamble) [2]. A future extension could explicitly model risk aversion by including utility preference functions on the expected health rewards. One could also extend the model to incorporate time preference for health (thus, discounting). Finally, our analysis takes the perspective of a fully insured/price insensitive decision maker and thus does not include treatment costs. Since the costs of treatment are borne by several parties (e.g., health insurers, patients, and healthcare providers), patients and doctors often make treatment decisions without considering the full societal costs of their decisions. Future treatments could be more effective but also more expensive. Cost-effectiveness analysis could be used to determine whether it is worthwhile to adopt a new treatment or to continue with existing treatment options.

Our dynamic model determines the optimal timing of treatment adoption when health and technology are both evolving over time. The model can be used to help chronically ill patients determine when to accept the currently available treatment, considering their own demographic and health characteristics, when rapid treatment advances are occurring. With many newly screen-detected HCV-positive individuals needing treatment through expanded HCV screening programs in the US, and improved HCV drugs on the horizon, both individuals and organizations must make treatment decisions in the presence of resource constraints and uncertainty about future treatments. Our work can provide insights into the improved use of new treatments in chronic HCV management and care. More broadly, for many types of chronic diseases, patients who are waiting for treatment have diverse genetic and behavioral risk factors that make delaying treatment more beneficial to some patients than to others. Going forward, new clinical applications include treatments for certain types of cancer (waiting with palliative care vs. initiating toxic therapy) and for chronic kidney disease (waiting on kidney dialysis or receiving a kidney transplant with improving immunosuppressant drugs in the future). Our framework can help maximize patients’ lifetime health by finding the optimal treatment policy for individuals in heterogeneous populations. This is a form of personalized medicine—an area of keen interest for healthcare providers worldwide.

Acknowledgments

Funding Sources: Jeremy Goldhaber-Fiebert was supported in part by a National Institutes of Health National Institute on Aging Career Development Award (K01AG037593-01A1; principal investigator, Dr. Goldhaber-Fiebert). Margaret Brandeau was supported by Grant R01-DA15612 from the National Institute on Drug Abuse. The funding agreements ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

APPENDIX

Proof of Proposition 1

The result holds for VN(pN) trivially. Assume it holds for any period k+1. Then the value of waiting, qk + E{Vk+1(pk+1)}, is increasing in qk, Hk, Fk. The value of adopting is increasing in Hk and Fk. Using the induction hypothesis, the optimal value function is increasing in qk, Hk, and Fk since it is the maximum of two increasing functions.

Proof of Proposition 2

  1. Increasing qk only increases the value of waiting. Superscripts 1 and 2 index two patients with $q_k^1 \le q_k^2$. If the decision is to accept treatment with $q_k^2$, then $p_kH_k + (1-p_k)F_k \ge q_k^2 + E\{V_{k+1}(p_{k+1})\} \ge q_k^1 + E\{V_{k+1}(p_{k+1})\}$. The decision is also to accept treatment with $q_k^1$ since $q_k^1 \le q_k^2$, holding everything else equal. That is, if it is optimal for healthier patients to adopt the current best treatment, it is also optimal for less healthy patients to adopt current treatment. Conversely, if the decision is to wait with $q_k^1$, then $q_k^2 + E\{V_{k+1}(p_{k+1})\} \ge q_k^1 + E\{V_{k+1}(p_{k+1})\} \ge p_kH_k + (1-p_k)F_k$, and the decision is also to wait with $q_k^2$, which implies healthier patients wait at least as long to accept treatment as sicker patients.

  2. Assume $q_k^1 \le q_k^2$, $H_k^1 \le H_k^2$, $F_k^1 \le F_k^2$ for all k. If the decision is to accept treatment with $q_k^2$, then $p_kH_k^2 + (1-p_k)F_k^2 \ge q_k^2 + E\{V_{k+1}^2(p_{k+1})\}$. Define $\Delta q_k = q_k^2 - q_k^1$, $\Delta H_k = H_k^2 - H_k^1$, $\Delta F_k = F_k^2 - F_k^1$. Then we have:
    $$p_kH_k^2 + (1-p_k)F_k^2 \ge q_k^2 + E\{V_{k+1}^2(p_{k+1})\}.$$
    Substituting for $H_k^2$, $F_k^2$, and $q_k^2$, we have
    $$p_k(H_k^1 + \Delta H_k) + (1-p_k)(F_k^1 + \Delta F_k) \ge (q_k^1 + \Delta q_k) + E\{V_{k+1}^2(p_{k+1})\}$$
    which can be written as
    $$p_kH_k^1 + p_k\Delta H_k + (1-p_k)F_k^1 + (1-p_k)\Delta F_k \ge (q_k^1 + \Delta q_k) + E\{V_{k+1}^2(p_{k+1})\}.$$
    Rewriting, we obtain
    $$p_kH_k^1 + (1-p_k)F_k^1 + [p_k\Delta H_k + (1-p_k)\Delta F_k] \ge (q_k^1 + \Delta q_k) + E\{V_{k+1}^2(p_{k+1})\} \ge (q_k^1 + \Delta q_k) + E\{V_{k+1}^1(p_{k+1})\}.$$
    The last inequality is due to $V_{k+1}^2(p_{k+1}) \ge V_{k+1}^1(p_{k+1})$ from Proposition 1 and by the properties of the expectation operator, $E\{V_{k+1}^2(p_{k+1})\} \ge E\{V_{k+1}^1(p_{k+1})\}$. Additionally, we have
    $$p_kH_k^1 + (1-p_k)F_k^1 + [p_k\Delta H_k + (1-p_k)\Delta F_k] \ge (q_k^1 + \Delta q_k) + E\{V_{k+1}^1(p_{k+1})\}$$
    which can be written as
    $$p_kH_k^1 + (1-p_k)F_k^1 + [p_k\Delta H_k + (1-p_k)\Delta F_k - \Delta q_k] \ge q_k^1 + E\{V_{k+1}^1(p_{k+1})\}.$$

    If $[p_k\Delta H_k + (1-p_k)\Delta F_k - \Delta q_k] \le 0$, we can guarantee that patient 1 accepts treatment because $p_kH_k^1 + (1-p_k)F_k^1 \ge q_k^1 + E\{V_{k+1}^1(p_{k+1})\}$.

Proof of Proposition 3

VN(pN) = pNHN + (1 − pN)FN is increasing in pN. Assume that Vk+1(pk+1) is increasing in pk+1. The value of adopting, pkHk + (1−pk)Fk, and the value of waiting, qk+ E{Vk+1(pk + wk)}, are both increasing in pk, where pk+1(pk) is stochastically increasing in pk. Using the induction hypothesis, the optimal value function is increasing in pk since it is the maximum of two increasing functions.

Proof of Proposition 4

The logic of the proof is similar to the proof for the optimal stopping problem in the case of correlated prices in Bertsekas [27]. We write the DP algorithm in a less formal form (here we omit explicitly writing out the terminal state T):

$$
\begin{aligned}
V_N(p_N) &= p_NH_N + (1-p_N)F_N \\
V_k(p_k) &= \max\bigl[\,p_kH_k + (1-p_k)F_k,\; q_k + E\{V_{k+1}(p_k + w_k)\}\,\bigr]
\end{aligned}
$$

where the value associated with adopting is pkHk + (1−pk)Fk and the value associated with waiting is qk +E{Vk+1(pk + wk)}.

In period N−1,

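$$
\begin{aligned}
V_{N-1}(p_{N-1})
&= \max\begin{cases}
p_{N-1}H_{N-1} + (1-p_{N-1})F_{N-1}, & u_{N-1}=1 \text{ (treat)} \\
q_{N-1} + E[\,p_NH_N + (1-p_N)F_N\,], & u_{N-1}=0 \text{ (wait)}
\end{cases} \\
&= \max\begin{cases}
p_{N-1}H_{N-1} + (1-p_{N-1})F_{N-1}, & u_{N-1}=1 \\
q_{N-1} + F_N + (H_N-F_N)p_{N-1} + (H_N-F_N)E[w_{N-1}], & u_{N-1}=0
\end{cases} \\
&= \max\begin{cases}
(H_{N-1}-F_{N-1})p_{N-1} + F_{N-1}, & u_{N-1}=1 \\
(H_N-F_N)p_{N-1} + q_{N-1} + F_N + (H_N-F_N)\bar{w}, & u_{N-1}=0
\end{cases}
\end{aligned}
$$

where $\bar{w} = E[w_{N-1}]$. We define $A \equiv (H_{N-1}-F_{N-1})p_{N-1} + F_{N-1}$, $B \equiv (H_N-F_N)p_{N-1} + q_{N-1} + F_N + (H_N-F_N)\bar{w}$, and a constant $C_{N-1} \equiv q_{N-1} + F_N + (H_N-F_N)\bar{w}$. Both A and B are functions of $p_{N-1}$. They are illustrated schematically in Appendix Figure 3a.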

From Assumptions 2 and 3, we know that the slope of function A is greater than the slope of function B and CN−1 > FN−1. As shown in Appendix Figure 3a, an optimal policy in period N−1 is given by:

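$$
\begin{cases}
\text{Treat} & \text{if } p_{N-1} > \bar{p}_{N-1} \\
\text{Wait} & \text{if } p_{N-1} < \bar{p}_{N-1}
\end{cases}
$$

where $\bar{p}_{N-1}$ is the solution obtained from setting A = B.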
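Setting A = B can be solved explicitly. The following worked step is a sketch that treats the mean improvement $\bar{w}$ as a constant that does not depend on $p_{N-1}$, as the proof does at this point:

```latex
% Sketch: solve A = B for the period N-1 threshold, with \bar{w} held constant.
\begin{aligned}
(H_{N-1}-F_{N-1})\,\bar{p}_{N-1} + F_{N-1}
  &= (H_N-F_N)\,\bar{p}_{N-1} + q_{N-1} + F_N + (H_N-F_N)\,\bar{w} \\
\bigl[(H_{N-1}-F_{N-1}) - (H_N-F_N)\bigr]\,\bar{p}_{N-1}
  &= q_{N-1} + F_N - F_{N-1} + (H_N-F_N)\,\bar{w} \\
\bar{p}_{N-1}
  &= \frac{q_{N-1} + F_N - F_{N-1} + (H_N-F_N)\,\bar{w}}
          {(H_{N-1}-F_{N-1}) - (H_N-F_N)}
\end{aligned}
```

The denominator is positive because A has the steeper slope, and the numerator is positive because CN−1 > FN−1, so the threshold is positive. Whether it also falls below 1 depends on the reward values.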

For any period k, suppose the system and rewards are time invariant, so that p = f(p, w, u). We note that $V_{N-1}(p) \ge V_N(p)$ for all p, because $\max[\,pH_{N-1} + (1-p)F_{N-1},\; (H_N-F_N)p + q_{N-1} + F_N + (H_N-F_N)\bar{w}\,] \ge pH_N + (1-p)F_N$ and $V_{N-1}$ is convex and increasing in p. The monotonicity property of DP [27] is proved by induction:

Assume Vk+1(p) ≥ Vk+2(p). Then

$$
\begin{aligned}
V_k(p) &= \max_u E\{g_k(p_k, w_k, u_k) + V_{k+1}(f(p, w, u))\} \\
&\ge \max_u E\{g_k(p_k, w_k, u_k) + V_{k+2}(f(p, w, u))\} = V_{k+1}(p).
\end{aligned}
$$

Thus, Vk(p) ≥ Vk+1(p) for all p and k and, by Proposition 3, Vk is increasing in p for all k. Vk(pk) is also convex, as we prove by induction:

$$V_k(p_k) = \max\bigl[\,p_kH_k + (1-p_k)F_k,\; q_k + E\{V_{k+1}(p_{k+1})\}\,\bigr]$$

Assume that Vk+1(pk+1) is convex in pk+1. In period k, the value of adopting treatment, pkHk + (1−pk)Fk, is convex in pk. The value of waiting, qk +E{Vk+1(pk+1)}, is also convex because Vk+1(pk+1) is convex by the induction hypothesis, and the state transitions are convex [7]. The function Vk(pk) is convex because the maximum of two convex functions is convex.

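The proof of convex transitions can be found in Smith and Ulu [7], and is modified for our problem here. The state transitions are convex if $E[f_{k+1}(p_{k+1}) \mid p_k]$ is a convex function of $p_k$ for all convex functions $f_{k+1}(p_{k+1})$. Assume $p_k^{\alpha} = \alpha p_k^1 + (1-\alpha)p_k^2$, for any convex function $f_{k+1}(p_{k+1})$. Then

$$
\begin{aligned}
\alpha E[f_{k+1}(p_{k+1}) \mid p_k^1] + (1-\alpha)E[f_{k+1}(p_{k+1}) \mid p_k^2]
&= E[\alpha f_{k+1}(p_k^1 + w_k) + (1-\alpha)f_{k+1}(p_k^2 + w_k)] \\
&\ge E[f_{k+1}(p_k^{\alpha} + w_k)] \\
&= E[f_{k+1}(p_{k+1}) \mid p_k^{\alpha}].
\end{aligned}
$$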

By the definition of a convex function, $E[f_{k+1}(p_{k+1}) \mid p_k]$ is a convex function of $p_k$. We also know that $E[V_{k+1}] > 0$. These facts imply (Appendix Figure 3b) that the optimal policy for period k is

$$
\begin{cases}
\text{Treat} & \text{if } p_k > \bar{p}_k \\
\text{Wait} & \text{if } p_k \le \bar{p}_k
\end{cases}
$$

where $\bar{p}_k$ is the unique solution of the equation $p_kH_k + (1-p_k)F_k = q_k + E\{V_{k+1}(p_k + w_k)\}$ and $0 < \bar{p}_k < 1$.

We now show that as we increase the current treatment effectiveness pk, the decision moves from waiting toward adopting. We prove this statement in two steps. The logic of the proof is similar to that in Smith and Ulu [7].

Step 1

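Prove that the difference between the value of accepting treatment in the next period and the value of accepting treatment in the current period is decreasing with increasing $p_k$. Assume $p_k^2 > p_k^1$.

$$
\begin{aligned}
&E[\,p_{k+1}H_{k+1} + (1-p_{k+1})F_{k+1} \mid p_k^2\,] - [\,p_k^2H_k + (1-p_k^2)F_k\,] \\
&\quad= E[(p_k^2 + w)H_{k+1} + (1-p_k^2-w)F_{k+1}] - p_k^2H_k - (1-p_k^2)F_k \\
&\quad= p_k^2H_{k+1} + (1-p_k^2)F_{k+1} - p_k^2H_k - (1-p_k^2)F_k + E[wH_{k+1} - wF_{k+1}] \\
&\quad= (H_{k+1}-H_k)p_k^2 + (F_{k+1}-F_k)(1-p_k^2) + E[(H_{k+1}-F_{k+1})w] \\
&\quad= (H_{k+1}-H_k)p_k^2 + (F_{k+1}-F_k)(1-p_k^2) + (H_{k+1}-F_{k+1})\bar{w} \\
&\quad= [(H_{k+1}-H_k) - (F_{k+1}-F_k)]\,p_k^2 + (F_{k+1}-F_k) + (H_{k+1}-F_{k+1})\bar{w} \\
&\quad= [(H_{k+1}-F_{k+1}) - (H_k-F_k)]\,p_k^2 + (F_{k+1}-F_k) + (H_{k+1}-F_{k+1})\bar{w} \\
&\quad< [(H_{k+1}-F_{k+1}) - (H_k-F_k)]\,p_k^1 + (F_{k+1}-F_k) + (H_{k+1}-F_{k+1})\bar{w} \\
&\quad= E[\,p_{k+1}H_{k+1} + (1-p_{k+1})F_{k+1} \mid p_k^1\,] - [\,p_k^1H_k + (1-p_k^1)F_k\,].
\end{aligned}
$$

The inequality is true because $[(H_{k+1}-F_{k+1}) - (H_k-F_k)] < 0$ from Assumption 3.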

Step 2

Prove the difference between the current value function and the value of immediately accepting treatment is decreasing with increasing pk.

Define Dk(pk) = Vk(pk) − [pkHk + (1 − pk)Fk]. Then

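$$
\begin{aligned}
D_N(p_N) &= 0 \\
D_{N-1}(p_{N-1}) &= \max\{\,p_{N-1}H_{N-1} + (1-p_{N-1})F_{N-1},\; q_{N-1} + E[\,p_NH_N + (1-p_N)F_N\,]\,\} - [\,p_{N-1}H_{N-1} + (1-p_{N-1})F_{N-1}\,] \\
&= \max\{\,0,\; q_{N-1} + E[(p_{N-1}+w)H_N + (1-p_{N-1}-w)F_N] - p_{N-1}H_{N-1} - (1-p_{N-1})F_{N-1}\,\} \\
&= \max\{\,0,\; q_{N-1} + (H_N-F_N)\bar{w} + F_N - F_{N-1} + p_{N-1}[(H_N-F_N) - (H_{N-1}-F_{N-1})]\,\}
\end{aligned}
$$

As $p_{N-1}$ increases, $D_{N-1}(p_{N-1})$ decreases since $(H_N-F_N) - (H_{N-1}-F_{N-1}) < 0$ from Assumption 3.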

Assume Dk+1 (pk+1) is decreasing with increasing pk+1. By induction, we have

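$$
\begin{aligned}
D_k(p_k) &= \max\{\,0,\; q_k + E[V_{k+1}(p_{k+1})] - p_kH_k - (1-p_k)F_k\,\} \\
&= \max\bigl\{\,0,\; q_k + E\{\,p_{k+1}H_{k+1} + (1-p_{k+1})F_{k+1} - [\,p_kH_k + (1-p_k)F_k\,]\,\} \\
&\qquad\qquad + E\{\,V_{k+1}(p_{k+1}) - [\,p_{k+1}H_{k+1} + (1-p_{k+1})F_{k+1}\,]\,\}\bigr\}.
\end{aligned}
$$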

The first expectation is decreasing with increasing pk from Step 1. The second expectation is decreasing with increasing pk from the induction hypothesis. Therefore, Dk(pk) is decreasing with increasing pk. This means that with increasing effectiveness pk, the decision moves toward adopting.

Proof of Proposition 5

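$D_k(\bar{p}_{k+1}) > 0$ implies that $\bar{p}_k > \bar{p}_{k+1}$ (Appendix Figure 4), where $\bar{p}_k$ and $\bar{p}_{k+1}$ are the corresponding threshold effectiveness levels to adopt treatment in periods k and k+1. We want to prove that

$$D_k(\bar{p}_{k+1}) = V_k(\bar{p}_{k+1}) - [\,\bar{p}_{k+1}H_k + (1-\bar{p}_{k+1})F_k\,] > 0,$$

which is the same as

$$V_k(\bar{p}_{k+1}) > \bar{p}_{k+1}H_k + (1-\bar{p}_{k+1})F_k.$$

This can be written as

$$\max\{\,\bar{p}_{k+1}H_k + (1-\bar{p}_{k+1})F_k,\; q_k + E[V_{k+1}(\bar{p}_{k+1} + w)]\,\} > \bar{p}_{k+1}H_k + (1-\bar{p}_{k+1})F_k$$

or

$$q_k + E[V_{k+1}(\bar{p}_{k+1} + w)] > (H_k-F_k)\bar{p}_{k+1} + F_k. \qquad \text{(A1)}$$

The left-hand side of (A1) can be written as

$$
\begin{aligned}
q_k + E[V_{k+1}(\bar{p}_{k+1} + w)] &= q_k + E[(\bar{p}_{k+1}+w)H_{k+1} + (1-\bar{p}_{k+1}-w)F_{k+1}] \\
&= q_k + (H_{k+1}-F_{k+1})\bar{p}_{k+1} + (H_{k+1}-F_{k+1})\bar{w} + F_{k+1}.
\end{aligned}
$$

The first equality comes from knowing that $\bar{p}_{k+1} + w$ is above the treatment threshold $\bar{p}_{k+1}$ in period k+1, and thus the decision is to treat.

Compare this to the right-hand side of (A1). We know $q_k + F_{k+1} \ge F_k$ from Assumption 2. Removing $q_k + F_{k+1}$ from the left-hand side and $F_k$ from the right-hand side therefore gives a condition that is sufficient for (A1) to hold:

$$(H_{k+1}-F_{k+1})\bar{p}_{k+1} + (H_{k+1}-F_{k+1})\bar{w} > (H_k-F_k)\bar{p}_{k+1}.$$

Solving for $\bar{p}_{k+1}$, we have

$$
\begin{aligned}
[(H_k-F_k) - (H_{k+1}-F_{k+1})]\,\bar{p}_{k+1} &< (H_{k+1}-F_{k+1})\bar{w} \\
\bar{p}_{k+1} &< \frac{H_{k+1}-F_{k+1}}{(H_k-F_k) - (H_{k+1}-F_{k+1})}\,\bar{w}.
\end{aligned}
$$

Solving for $\bar{w}$, we have

$$\bar{w} > \frac{(H_k-F_k) - (H_{k+1}-F_{k+1})}{H_{k+1}-F_{k+1}}\,\bar{p}_{k+1}.$$

If this condition is true, then $\bar{p}_k > \bar{p}_{k+1}$.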

Proof of Proposition 6

Increasing the expectation on future treatment improvement only affects the value qk +E{Vk+1 (pk +wk)} of waiting in period k. The value of adopting, pkHk + (1−pk)Fk, stays the same. The change indicates that higher expectation on future treatment improvement makes waiting more attractive.

Solution of Three-Period Problem with Radical Innovation

The value functions for the three periods are:

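$$
\begin{aligned}
V_2(p_2) &= p_2H_2 + (1-p_2)F_2, \quad u_2 = 1 \text{ (treat in the last period)} \\
V_1(p_1) &= \max\begin{cases}
p_1H_1 + (1-p_1)F_1, & u_1 = 1 \text{ (treat)} \\
q_1 + E[V_2(p_2)], & u_1 = 0 \text{ (wait)}
\end{cases} \\
V_0(p_0) &= \max\begin{cases}
p_0H_0 + (1-p_0)F_0, & u_0 = 1 \text{ (treat)} \\
q_0 + E[V_1(p_1)], & u_0 = 0 \text{ (wait)}
\end{cases}
\end{aligned}
$$

Assume $w_k \sim U[0, \theta(1-p_k)]$. Then $E[w_k] = \dfrac{\theta(1-p_k)}{2}$.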

At k=1, the only unknown quantity is E[V2(p2)]. We solve

$$
\begin{aligned}
E[V_2(p_2)] &= E[\,p_2H_2 + (1-p_2)F_2\,] \\
&= E[(H_2-F_2)(p_1+w_1) + F_2] \\
&= (H_2-F_2)p_1 + F_2 + (H_2-F_2)E[w_1]
\end{aligned}
$$

to obtain

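$$
V_1(p_1) = \max\begin{cases}
p_1H_1 + (1-p_1)F_1, & u_1 = 1 \quad (A) \\
q_1 + (H_2-F_2)p_1 + F_2 + (H_2-F_2)\dfrac{\theta(1-p_1)}{2}, & u_1 = 0 \quad (B)
\end{cases}
$$

If $p_1 > \dfrac{q_1 - F_1 + E[V_2(p_2)]}{H_1-F_1}$, then the optimal decision is to treat in period 2.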

At k=0, the only unknown quantity is E[V1(p1)]. We solve

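$E[V_1(p_1)] = E[\max\{A, B\}]$. The function A is

$$
\begin{aligned}
A &= p_1H_1 + (1-p_1)F_1 = (H_1-F_1)p_1 + F_1 \\
&= (H_1-F_1)(p_0+w_0) + F_1 \\
&= (H_1-F_1)p_0 + F_1 + (H_1-F_1)w_0 \\
&= C_1 + (H_1-F_1)w_0
\end{aligned}
$$

where $C_1 = (H_1-F_1)p_0 + F_1$.

The function B is

$$B = q_1 + (H_2-F_2)p_1 + F_2 + (H_2-F_2)E[w_1]$$

where

$$E[w_1] = \frac{\theta(1-p_1)}{2} = \frac{\theta(1-p_0-w_0)}{2}.$$

We can write B as

$$
\begin{aligned}
B &= q_1 + (H_2-F_2)(p_0+w_0) + F_2 + (H_2-F_2)\frac{\theta(1-p_0-w_0)}{2} \\
&= q_1 + (H_2-F_2)p_0 + F_2 + (H_2-F_2)\frac{\theta(1-p_0)}{2} + (H_2-F_2)\frac{2-\theta}{2}\,w_0 \\
&= q_1 + F_2 + (H_2-F_2)\frac{\theta + (2-\theta)p_0}{2} + (H_2-F_2)\frac{2-\theta}{2}\,w_0 \\
&= C_2 + (H_2-F_2)\frac{2-\theta}{2}\,w_0
\end{aligned}
$$

where $C_2 = q_1 + F_2 + (H_2-F_2)\dfrac{\theta + (2-\theta)p_0}{2}$.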

We have $A = C_1 + (H_1-F_1)w_0$ and $B = C_2 + (H_2-F_2)\dfrac{2-\theta}{2}\,w_0$. The functions A and B are both linear in $w_0$, and A has a steeper slope in $w_0$ than B from Assumption 3 (Appendix Figure 5). We now examine the two cases for $C_1$ and $C_2$.

  1. If $C_2 > C_1$, we solve for $\bar{w}_0$, which is the unique solution of A = B. The solution is
    $$\bar{w}_0 = \frac{2(C_2-C_1)}{2(H_1-F_1) - (2-\theta)(H_2-F_2)}$$
  2. If $C_2 < C_1$, then A > B, which means the decision is always to treat in k = 1. Then
    $$E[V_1(p_1)] = E[A] = C_1 + (H_1-F_1)E[w_0] = C_1 + (H_1-F_1)\frac{\theta(1-p_0)}{2}$$

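For case 1, $E[V_1(p_1)] = E[\max\{A,B\}] = \int_0^{\bar{w}_0} B\,f(w_0)\,dw_0 + \int_{\bar{w}_0}^{\theta(1-p_0)} A\,f(w_0)\,dw_0$, where $f(w_0)$ is the probability density function of $w_0$. For the uniform distribution assumption, $f(w_0) = \dfrac{1}{\theta(1-p_0)}$. Substituting this in, we have

$$E[\max\{A,B\}] = \int_0^{\bar{w}_0}\Bigl[C_2 + (H_2-F_2)\frac{2-\theta}{2}\,w_0\Bigr]\frac{1}{\theta(1-p_0)}\,dw_0 + \int_{\bar{w}_0}^{\theta(1-p_0)}\bigl[C_1 + (H_1-F_1)w_0\bigr]\frac{1}{\theta(1-p_0)}\,dw_0$$

Solving the above integration, we obtain

$$E[\max\{A,B\}] = \frac{1}{\theta(1-p_0)}\cdot\frac{(C_2-C_1)^2}{2(H_1-F_1) - (2-\theta)(H_2-F_2)} + C_1 + \frac{H_1-F_1}{2}(1-p_0)\theta$$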
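The intermediate algebra behind this result can be sketched as follows. The shorthand symbols a, b, and U are introduced here only for compactness and are not part of the original notation; the steps assume that the crossover point lies inside the support, 0 ≤ w̄0 ≤ θ(1 − p0).

```latex
% Shorthand (not in the original): a = H_1 - F_1,  b = (2-\theta)(H_2 - F_2)/2,  U = \theta(1 - p_0),
% so that \bar{w}_0 = (C_2 - C_1)/(a - b).
\begin{aligned}
E[\max\{A,B\}]
 &= \frac{1}{U}\Bigl[\int_0^{\bar{w}_0} (C_2 + b\,w_0)\,dw_0
      + \int_{\bar{w}_0}^{U} (C_1 + a\,w_0)\,dw_0\Bigr] \\
 &= \frac{1}{U}\Bigl[C_2\bar{w}_0 + \tfrac{b}{2}\bar{w}_0^{2}
      + C_1(U-\bar{w}_0) + \tfrac{a}{2}\bigl(U^{2}-\bar{w}_0^{2}\bigr)\Bigr] \\
 &= C_1 + \tfrac{a}{2}\,U
      + \frac{1}{U}\Bigl[(C_2-C_1)\bar{w}_0 - \tfrac{a-b}{2}\,\bar{w}_0^{2}\Bigr] \\
 &= C_1 + \tfrac{a}{2}\,U + \frac{1}{U}\cdot\frac{(C_2-C_1)^{2}}{2(a-b)}
\end{aligned}
```

Since 2(a − b) = 2(H1 − F1) − (2 − θ)(H2 − F2), the last line agrees term by term with the stated closed form.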

If $p_0 > \dfrac{q_0 - F_0 + E[V_1(p_1)]}{H_0-F_0}$, then the optimal decision is to treat in period 1.
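The three-period solution above can be sketched in a few lines of Python. This is an illustrative implementation under the uniform assumption w0 ~ U[0, θ(1 − p0)], not the authors' code; the function names are hypothetical. One assumption goes beyond the text: the crossover point is clipped to the support of w0, so the same expression also covers the situation in which waiting is preferred in period 2 for every possible improvement.

```python
def expected_v1(p0, theta, q1, H1, F1, H2, F2):
    """E[V1(p1)] given p0, assuming w0 ~ U[0, theta * (1 - p0)]."""
    a = H1 - F1                            # slope of A (treat at k=1) in w0
    b = (H2 - F2) * (2.0 - theta) / 2.0    # slope of B (wait at k=1) in w0
    c1 = a * p0 + F1
    c2 = q1 + F2 + (H2 - F2) * (theta + (2.0 - theta) * p0) / 2.0
    upper = theta * (1.0 - p0)
    if upper <= 0.0:
        # No improvement is possible, so w0 = 0 with certainty.
        return max(c1, c2)
    # Crossover of A and B, clipped to [0, upper] (assumption of this sketch).
    w_bar = min(max((c2 - c1) / (a - b), 0.0), upper)
    wait_part = c2 * w_bar + b * w_bar ** 2 / 2.0
    treat_part = c1 * (upper - w_bar) + a * (upper ** 2 - w_bar ** 2) / 2.0
    return (wait_part + treat_part) / upper


def period1_decision(p0, theta, q, H, F):
    """Return 'treat' or 'wait' for period 1 (k = 0).

    q = (q0, q1), H = (H0, H1, H2), F = (F0, F1, F2).
    """
    treat_value = p0 * H[0] + (1.0 - p0) * F[0]
    wait_value = q[0] + expected_v1(p0, theta, q[1], H[1], F[1], H[2], F[2])
    return "treat" if treat_value > wait_value else "wait"


# Rows from Table 1 (white male, age 50, high risk), with p0 = 0.85 and theta = 1.
fs2 = ((1.452, 1.400), (16.921, 15.254, 13.637), (12.836, 11.400, 10.016))
fs4 = ((1.349, 1.288), (16.818, 13.758, 11.106), (9.160, 7.840, 6.577))
print("FS2:", period1_decision(0.85, 1.0, *fs2))
print("FS4:", period1_decision(0.85, 1.0, *fs4))
```

With these inputs the sketch selects waiting for the FS2 patient and immediate treatment for the FS4 patient, in line with the example discussed in Section 3.2.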

Going from a three-period problem to more periods (N=3) is analytically challenging even with uniformly distributed treatment improvements. For example, recursing back for one more period, we have

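$$
V_{N-3}(p_{N-3}) = \max\begin{cases}
p_{N-3}H_{N-3} + (1-p_{N-3})F_{N-3}, & u_{N-3} = 1 \\
q_{N-3} + E[V_{N-2}(p_{N-2})], & u_{N-3} = 0
\end{cases}
$$

The only unknown quantity $E[V_{N-2}(p_{N-2})]$ is

$$E[V_{N-2}(p_{N-2})] = E\bigl[\max\{\,p_{N-2}H_{N-2} + (1-p_{N-2})F_{N-2},\; q_{N-2} + E[V_{N-1}(p_{N-1})]\,\}\bigr]$$

and we know $p_{N-2} = p_{N-3} + w_{N-3}$. The function $E[V_{N-1}(p_{N-1})]$ is not linear in $w_{N-3}$ and has two cases, which makes $E[V_{N-2}(p_{N-2})]$ difficult to solve analytically.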
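When the analytical route becomes unwieldy, the recursion can still be evaluated numerically. The sketch below is one possible approach and not the authors' implementation: it runs backward induction on a grid of effectiveness values and approximates the expectation over wk ~ U[0, θ(1 − pk)] with midpoint quadrature and linear interpolation. The function name, grid size, and number of quadrature nodes are arbitrary illustrative choices.

```python
import numpy as np


def backward_induction(q, H, F, theta, n_grid=201, n_quad=200):
    """Approximate V_k and the treat/wait policy on a grid of p values.

    q[k] is given for k = 0..N-1; H[k] and F[k] are given for k = 0..N.
    Assumes w_k ~ U[0, theta * (1 - p_k)] with 0 <= theta <= 1.
    Returns the grid, a list of value arrays V_0..V_N, and a list of
    boolean arrays (True = treat) for periods k = 0..N-1.
    """
    q, H, F = (np.asarray(x, dtype=float) for x in (q, H, F))
    n_periods = len(H) - 1
    p = np.linspace(0.0, 1.0, n_grid)
    u = (np.arange(n_quad) + 0.5) / n_quad      # midpoint nodes on [0, 1]
    value = p * H[-1] + (1.0 - p) * F[-1]       # last period: treatment is accepted
    values = [value]
    treat_now = []
    for k in range(n_periods - 1, -1, -1):
        # Next-period effectiveness for every grid point and quadrature node.
        p_next = p[:, None] + u[None, :] * theta * (1.0 - p[:, None])
        v_next = np.interp(p_next.ravel(), p, value).reshape(p_next.shape)
        wait = q[k] + v_next.mean(axis=1)
        adopt = p * H[k] + (1.0 - p) * F[k]
        treat_now.append(adopt > wait)
        value = np.maximum(adopt, wait)
        values.append(value)
    return p, values[::-1], treat_now[::-1]


# Three-period base case from Table 1 (white male, age 50, FS2, high risk).
grid, values, policy = backward_induction(
    q=[1.452, 1.400],
    H=[16.921, 15.254, 13.637],
    F=[12.836, 11.400, 10.016],
    theta=1.0,
)
i = int(np.argmin(np.abs(grid - 0.85)))
print("p0 = 0.85:", "treat" if policy[0][i] else "wait", "V0 =", round(float(values[0][i]), 3))
```

Longer horizons only require longer q, H, and F sequences. Any reward values beyond the three periods reported in Table 1 would have to be supplied by the user and are not available from this paper.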

Appendix Figure 1.

Hepatitis C virus (HCV) natural history model

Appendix Figure 2.

Hepatitis C example: decision regions for a 50-year-old in fibrosis stage FS2

Decision regions for optimal treatment adoption as a function of pk (treatment effectiveness in period k) and θ (potential for future treatment improvement) for the case of a 50-year-old patient who begins in fibrosis stage FS2. The white region is the region in which the patient waits in period 2; the yellow region is the additional area in which the patient waits in period 1. The green region is the treating region in period 1. The green plus yellow region is the treating region in period 2.

Appendix Figure 3.

Schematic of functions VN−1 (pN−1) and Vk (pk) from proof of Proposition 4

Appendix Figure 4.

Schematic of functions Vk(pk) and Vk+1 (pk+1) from proof of Proposition 5

Appendix Figure 5.

Schematic of function E[V1 (p1)] from derivation of solution to three-period problem with radical innovation

References

  • 1.Viscusi WK. The value of risks to life and health. J Econ Lit. 1993;31(4):1912–1946. [Google Scholar]
  • 2.Gold MR, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in health and medicine. Oxford University Press; New York: 1996. [Google Scholar]
  • 3.Armstrong GL, Wasley A, Simard EP, McQuillan GM, Kuhnert WL, Alter MJ. The prevalence of hepatitis C virus infection in the United States, 1999 through 2002. Ann Intern Med. 2006;144(10):705–714. doi: 10.7326/0003-4819-144-10-200605160-00004. [DOI] [PubMed] [Google Scholar]
  • 4.Alter HJ, Liang TJ. Hepatitis C: the end of the beginning and possibly the beginning of the end. Ann Intern Med. 2012;156(4):317–318. doi: 10.1059/0003-4819-156-4-201202210-00014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Balcer Y, Lippman SA. Technological expectations and adoption of improved technology. J Econ Theory. 1984;34(2):292–318. [Google Scholar]
  • 6.Farzin YH, Huisman KJM, Kort PM. Optimal timing of technology adoption. J Econ Dynam Control. 1998;22(5):779–799. [Google Scholar]
  • 7.Smith JE, Ulu C. Technology adoption with uncertain future costs and quality. Oper Res. 2012;60(2):262–274. [Google Scholar]
  • 8.Zivin JG, Neidell M. Medical technology adoption, uncertainty, and irreversibilities: is a bird in the hand really worth more than in the bush? Health Econ. 2010;19(2):142–153. doi: 10.1002/hec.1455. [DOI] [PubMed] [Google Scholar]
  • 9.Shechter S, Alagoz O, Roberts M. Irreversible treatment decisions under consideration of the research and development pipeline for new therapies. IIE Trans. 2010;42(9):632–642. [Google Scholar]
  • 10.Salomon JA, Weinstein MC, Goldie SJ. Taking account of future technology in cost effectiveness analysis. BMJ. 2004;329(7468):733–736. doi: 10.1136/bmj.329.7468.733. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Lefevre C. Optimal-control of a birth and death epidemic process. Oper Res. 1981;29(5):971–982. doi: 10.1287/opre.29.5.971. [DOI] [PubMed] [Google Scholar]
  • 12.Ahn JH, Hornberger JC. Involving patients in the cadaveric kidney transplant allocation process: A decision-theoretic perspective. Manage Sci. 1996;42(5):629–641. [Google Scholar]
  • 13.Alagoz O, Maillart LM, Schaefer AJ, Roberts MS. Determining the acceptance of cadaveric livers using an implicit model of the waiting list. Oper Res. 2007;55(1):24–36. [Google Scholar]
  • 14.Maillart LM, Ivy JS, Ransom S, Diehl K. Assessing dynamic breast cancer screening policies. Oper Res. 2008;56(6):1411–1427. [Google Scholar]
  • 15.Shechter S. Doctoral thesis. Pittsburgh, PA: University of Pittsburgh; 2006. When to initiate, when to switch, and how to sequence HIV therapies: a Markov decision process approach. [Google Scholar]
  • 16. Salomon JA, Weinstein MC, Hammitt JK, Goldie SJ. Cost-effectiveness of treatment for chronic hepatitis C infection in an evolving patient population. J Amer Med Assoc. 2003;290(2):228–237. doi: 10.1001/jama.290.2.228.
  • 17. Ghany MG, Strader DB, Thomas DL, Seeff LB. Diagnosis, management, and treatment of hepatitis C: an update. Hepatology. 2009;49(4):1335–1374. doi: 10.1002/hep.22759.
  • 18. Jacobson IM, McHutchison JG, Dusheiko G, Di Bisceglie AM, Reddy KR, Bzowej NH, Marcellin P, Muir AJ, Ferenci P, Flisiak R, George J, Rizzetto M, Shouval D, Sola R, Terg RA, Yoshida EM, Adda N, Bengtsson L, Sankoh AJ, Kieffer TL, George S, Kauffman RS, Zeuzem S. Telaprevir for previously untreated chronic hepatitis C virus infection. N Engl J Med. 2011;364(25):2405–2416. doi: 10.1056/NEJMoa1012912.
  • 19. Poordad F, McCone J, Bacon BR, Bruno S, Manns MP, Sulkowski MS, Jacobson IM, Reddy KR, Goodman ZD, Boparai N, DiNubile MJ, Sniukiene V, Brass CA, Albrecht JK, Bronowicki JP, SPRINT-2 Investigators. Boceprevir for untreated chronic HCV genotype 1 infection. N Engl J Med. 2011;364(13):1195–1206. doi: 10.1056/NEJMoa1010494.
  • 20. Lawitz E, Mangia A, Wyles D, Rodriguez-Torres M, Hassanein T, Gordon SC, Schultz M, Davis MN, Kayali Z, Reddy KR, Jacobson IM, Kowdley KV, Nyberg L, Subramanian GM, Hyland RH, Arterburn S, Jiang D, McNally J, Brainard D, Symonds WT, McHutchison JG, Sheikh AM, Younossi Z, Gane EJ. Sofosbuvir for previously untreated chronic hepatitis C infection. N Engl J Med. 2013;368(20):1878–1887. doi: 10.1056/NEJMoa1214853.
  • 21. Afdhal N, Zeuzem S, Kwo P, Chojkier M, Gitlin N, Puoti M, Romero-Gomez M, Zarski JP, Agarwal K, Buggisch P, Foster GR, Brau N, Buti M, Jacobson IM, Subramanian GM, Ding X, Mo H, Yang JC, Pang PS, Symonds WT, McHutchison JG, Muir AJ, Mangia A, Marcellin P, ION Investigators. Ledipasvir and sofosbuvir for untreated HCV genotype 1 infection. N Engl J Med. 2014;370(20):1889–1898. doi: 10.1056/NEJMoa1402454.
  • 22. Kowdley KV, Gordon SC, Reddy KR, Rossaro L, Bernstein DE, Lawitz E, Shiffman ML, Schiff E, Ghalib R, Ryan M, Rustgi V, Chojkier M, Herring R, Di Bisceglie AM, Pockros PJ, Subramanian GM, An D, Svarovskaia E, Hyland RH, Pang PS, Symonds WT, McHutchison JG, Muir AJ, Pound D, Fried MW, ION Investigators. Ledipasvir and sofosbuvir for 8 or 12 weeks for chronic HCV without cirrhosis. N Engl J Med. 2014;370(20):1879–1888. doi: 10.1056/NEJMoa1402355.
  • 23. Liu S, Cipriano LE, Holodniy M, Owens DK, Goldhaber-Fiebert JD. New protease inhibitors for the treatment of chronic hepatitis C: a cost-effectiveness analysis. Ann Intern Med. 2012;156(4):279–290. doi: 10.1059/0003-4819-156-4-201202210-00005.
  • 24. Liu S, Cipriano LE, Holodniy M, Goldhaber-Fiebert JD. Cost-effectiveness analysis of risk-factor guided and birth-cohort screening for chronic hepatitis C infection in the United States. PLoS One. 2013;8(3):e58975. doi: 10.1371/journal.pone.0058975.
  • 25. Arias E. United States Life Tables 2006. 2010. National Vital Statistics Reports.
  • 26. Liu S, Watcha D, Holodniy M, Goldhaber-Fiebert JD. Sofosbuvir-based treatment regimens for chronic, genotype 1 hepatitis C virus infection in U.S. incarcerated populations: a cost-effectiveness analysis. Ann Intern Med. 2014;161(8):546–553. doi: 10.7326/M14-0602.
  • 27. Bertsekas DP. Dynamic programming and optimal control. Belmont, MA: Athena Scientific; 1995.
