Author manuscript; available in PMC: 2022 Apr 19.
Published in final edited form as: Stat Med. 2021 Jun 29;40(23):4996–5005. doi: 10.1002/sim.9107

Estimating optimal dynamic treatment strategies under resource constraints using dynamic marginal structural models

Ellen C Caniglia 1, Eleanor J Murray 2, Miguel A Hernán 3,4, Zach Shahn 5,6
PMCID: PMC9017598  NIHMSID: NIHMS1792396  PMID: 34184763

Abstract

Methods for estimating optimal treatment strategies typically assume unlimited access to resources. However, when a health system has resource constraints, such as limited funds, access to medication, or monitoring capabilities, medical decisions must account for competition between individuals in resource usage. The problem of incorporating resource constraints into optimal treatment strategies has been solved for point exposures (1), that is, treatment strategies entailing a decision at just one time point. However, attempts to directly generalize the point exposure solution to dynamic time-varying treatment strategies run into complications. We sidestep these complications by targeting the optimal strategy within a clinically defined subclass. Our approach is to employ dynamic marginal structural models to estimate (counterfactual) resource usage under the class of candidate treatment strategies and solve a constrained optimization problem to choose the optimal strategy for which expected resource usage is within acceptable limits. We apply this method to determine the optimal dynamic monitoring strategy for people living with HIV when resource limits on monitoring exist using observational data from the HIV-CAUSAL Collaboration.

Keywords: dynamic strategy, HIV, marginal structural models, monitoring, optimal strategy, resource constraints

1 |. INTRODUCTION

In resource limited settings, where health system constraints prevent immediate initiation of treatment in all individuals, the optimal treatment strategy is the strategy which, if implemented by all doctors, would lead to the best population health outcomes while ‘respecting’ the system’s resource constraints (in a sense we will make rigorous below). In many care settings, physicians repeatedly assess patients and make treatment decisions based on patient history at each assessment. In such settings, ‘dynamic treatment strategies’ mapping a patient’s treatment and covariate history to a treatment decision at each time point are of interest. (Here ‘treatment’ refers to any intervention, including monitoring through lab tests to inform future decisions.) In this article, we propose a method to identify optimal dynamic treatment strategies (from a subclass of all possible strategies) respecting population-level resource constraints.

Luedtke and van der Laan1 considered the problem of estimating optimal resource constrained treatment strategies in the case of a point treatment—that is, when there is only one time point at which a treatment decision is made (see also Wang et al2 and Huang et al3). They considered resource constraints that place an upper limit κ on the expected proportion of treated patients, and defined the set of strategies which respected resource constraint κ as all strategies for which the expected proportion of treated patients in the population under that strategy is less than κ. The optimal point treatment strategy was then identified as the optimal strategy among those respecting the system level constraint. However, optimal point treatment strategies are often of limited utility in clinical decision-making, especially in the context of chronic disease or long-term therapy. Such strategies cannot recommend, for example, “Come back next month, and if the problem has progressed then begin treatment”.

Here, we consider the optimal resource-constrained dynamic strategy and present a method for estimating it from a parameterized subclass of all strategies. For example, suppose we restrict our attention to the class of monitoring strategies $\{g_x : g_x(L_t) = 1\{L_t < x\}\}$ (where $1\{\cdot\}$ denotes the indicator function) that monitors at time t if and only if the covariate $L_t$ is less than x. Such a class of strategies might be approximately appropriate, for example, for the decision of how often to monitor people living with HIV, where $L_t$ represents CD4 count at time t.

In the absence of resource constraints, Orellana et al4 and Robins et al5 describe how to estimate the optimal strategy from a parameterized class of strategies using a dynamic marginal structural model (MSM). A dynamic MSM models the expected counterfactual outcomes under strategies parameterized by x as a function $b(x;\beta)$ of x. With an estimate $\hat{\beta}$ of the dynamic MSM parameter β, estimating the optimal strategy simply reduces to finding $x^{opt}$ maximizing $b(x;\hat{\beta})$ (assuming larger values of $b(x;\beta)$ are preferable). To accommodate resource constraints, we propose a fairly straightforward extension of this procedure that entails fitting two dynamic MSMs: one estimating the expected counterfactual clinical outcome $b(x;\beta)$ and one estimating expected counterfactual treatment utilization $h(x;\theta)$. The optimal resource-constrained dynamic strategy is then $x^{opt}_{rc}$ maximizing $b(x;\beta)$ subject to $h(x;\theta) < \kappa$.

Several methods have been proposed to find dynamic treatment strategies that respect constraints using observational data.6-10 Sometimes the constraint is a maximum acceptable level for an adverse outcome rather than a resource constraint, for example, prevalence of a side effect, but the resulting methodology is similar. These methods employ the parametric g-formula11 to estimate effects of candidate treatment strategies and then limit attention to those that satisfy constraints. The well-known advantages and drawbacks of the g-formula compared with dynamic MSMs in the unconstrained setting extend to the resource constrained setting. A commonly used algorithm to estimate the parametric g-formula requires correctly specified models for the conditional distribution of the outcome and each of the time-varying covariates over time, whereas fitting a dynamic MSM by inverse probability weighting requires correct specification of a model for treatment and a structural model for the counterfactual outcome. When all models are correctly specified, the g-formula is statistically more precise and can be used to explore a collection of strategies that may not fit neatly into a class indexed by a low dimensional parameter. However, the parametric g-formula requires correct specification of more models, is computationally intensive, and does not directly estimate an optimal strategy within a class (though this latter point will often be of minor practical significance). Thus, the dynamic MSM based approach we propose provides a useful complement to methods previously employed for this problem. We also note that there exists a somewhat related literature in adaptive trial planning.12-16 In this setting, the constrained resource is sample size and the intervention to be optimized is the rate at which patients are assigned to treatment arms in order to obtain a maximally precise effect estimate.

We apply this approach to estimate the optimal resource-constrained dynamic strategy for CD4 cell count and HIV-RNA monitoring in people living with HIV who have achieved viral suppression. CD4 cell count and HIV-RNA tests are used to monitor an individual’s response to ART. More frequent monitoring has been shown to be associated with a lower risk of virologic failure (HIV RNA levels > 200 copies/mL) at two years after viral suppression.17 Guidelines recommend dynamic monitoring strategies in which virologically suppressed individuals on ART may be monitored less frequently once their CD4 cell count crosses above a certain threshold. However, the optimal point at which to decrease monitoring frequency is unclear.

In a health system with resource constraints, such as limited funds for monitoring or stock-outs of CD4 cell count and HIV-RNA tests, the CD4 cell count at which monitoring may be decreased must be chosen from a subset of strategies where the average number of tests does not exceed the available resources. Depending on the setting, the cost of a single HIV-RNA test may range from 18 to 36 US dollars,18 whereas the cost of a single CD4 cell count test may range from 15 to 24 US dollars.19,20 While limited funds may be more likely to impact low- and middle-income countries than high-income countries, stock-outs can occur throughout the world, and are often unexpected. In a survey of national AIDS programs in 12 countries in Latin America conducted in 2011, 50% of countries reported stock-outs of HIV laboratory supplies, including no rapid tests in 41% of cases, no CD4 cell count tests in 35%, and no HIV-RNA tests in 18%. This resulted in the discontinuation of services in 88% of cases.21 In a more recent study of gaps in the HIV treatment cascade in 11 West African countries conducted in 2018, ART stock-outs were recorded during 23% of health facility visits and stock-outs of HIV-RNA tests were recorded in 17% of visits.22,23 In 2020, 73 countries reported risk of ARV disruptions in response to the COVID-19 pandemic (including 13% of European countries), and 24 of these countries had critically low stocks of ARVs. Stock-outs were not only reported for ARVs—38 countries reported facing disruption in HIV testing and 23 countries reported facing disruption in HIV-RNA monitoring.24

2 |. NOTATION

Let:

  • t ∈ {0, …, K} denote time, assumed discrete, with K the end of the study;

  • $N_t$ be a variable indicating whether an individual is monitored at time t;

  • Y denote the health outcome we aim to optimize;

  • Lt denote covariates at time t that may influence monitoring decisions;

  • D denote the total number of monitoring tests received by an individual;

  • $\bar{X}_t$ denote $X_0, \ldots, X_t$ and $\underline{X}_t$ denote $X_t, \ldots, X_K$ for an arbitrary time-varying variable X.

We assume that we observe independent and identically distributed (iid) realizations of the random vector O = (L0, …, LK, N0, …, NK, Y). We use capital letters to denote random variables and corresponding lower case letters to denote specific values that random variables might take.

A treatment strategy is a rule or function that determines the value to which $N_t$ will be set for a given observed history, that is, a function $g:(\bar{L}_t,\bar{N}_{t-1}) \to N_t$. A strategy is said to be static if its recommendation for the present does not depend on past covariate and treatment values, and can therefore be specified from baseline. An example of a static strategy would be ‘monitor every 3 months’. A strategy is said to be dynamic if it does depend on past covariate and treatment values. An example of a dynamic strategy would be ‘monitor if time since last monitoring ≥ 6 months or if last observed viral load > 100 copies/mL and time since last monitoring ≥ 2 months’. Most realistic and clinically relevant strategies are dynamic. Compared with the above examples of deterministic strategies, random strategies output a distribution over the treatment space instead of a single treatment value,17 which capture deviations from rigid strategies that occur in realistic clinical practice. Random strategies can also capture lottery treatment assignments, which might be of interest particularly in resource constrained settings.
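To make the distinction concrete, the two example strategies quoted above can be written as small decision functions. This is only an illustrative sketch: the function and argument names are hypothetical, and the dynamic rule is read as "A or (B and C)", which is an assumption about the intended grouping of the quoted conditions.

```python
def static_strategy(month):
    """'Monitor every 3 months': depends only on calendar time since baseline."""
    return month % 3 == 0


def dynamic_strategy(months_since_last, last_viral_load):
    """'Monitor if time since last monitoring >= 6 months, or if last observed
    viral load > 100 copies/mL and time since last monitoring >= 2 months'.

    The decision depends on the patient's own history, so it cannot be
    written down for all future time points at baseline.
    """
    return months_since_last >= 6 or (
        last_viral_load > 100 and months_since_last >= 2
    )


# Two patients seen 3 months after their last test:
low_vl = dynamic_strategy(months_since_last=3, last_viral_load=40)
high_vl = dynamic_strategy(months_since_last=3, last_viral_load=250)
```

Under this reading, only the patient with the elevated viral load is monitored at the 3-month visit, while both would be monitored once 6 months have elapsed.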

We denote arbitrary strategies by g and we adopt the counterfactual framework of Robins11 in which corresponding to each possible strategy g are counterfactual random variables Y(g), Lt(g), and D(g) that would have been observed had strategy g been followed, possibly contrary to fact. Implicit in the notation for counterfactuals (eg, Y(g)) is the assumption that the treatment strategy followed by one patient does not influence any other patient. This implicit assumption is called the ‘No Interference Assumption’ by Rubin.25 Note that in defining our resource constraint, we are optimizing based on the (counterfactual) average number of treatment or monitoring events per individual over a defined time period, but assuming no competition between individuals to access care under the optimal resource-constrained dynamic strategy. We also make the additional assumptions:26

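Sequential positivity: if $f(\bar{l}_t, \bar{n}_{t-1}) > 0$, then $f(N_t = n \mid \bar{L}_t = \bar{l}_t, \bar{N}_{t-1} = \bar{n}_{t-1}) > 0$ for all $n, t$. (1)

Consistency: for any strategy $g$, if for a given individual $N_t = g(\bar{L}_t, \bar{N}_{t-1})$ for each $t$, then $Y = Y(g)$ and $\bar{L}_K = \bar{L}_K(g)$ for that individual. (2)

Sequential exchangeability: $Y(g) \perp N_t \mid \bar{L}_t = \bar{l}_t, \bar{N}_{t-1} = \bar{n}_{t-1}$ for all $t, g, \bar{n}, \bar{l}$. (3)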

We define resource constraints as caps on the expected number of doses or monitoring events per patient over a defined time period. We say that strategy g respects resource constraint κ if

$$E[D(g)] < \kappa \tag{4}$$

We consider parameterized classes of strategies {gx} and seek to estimate

$$x_\kappa^{opt} \equiv \arg\max_x E[Y(g_x)] \quad \text{subject to } E[D(g_x)] < \kappa \tag{5}$$

3 |. REVIEW OF DYNAMIC MARGINAL STRUCTURAL MODELS

A dynamic MSM is a model for expected counterfactual outcomes under a class of strategies {gx} parameterized by x as a function of x, that is,

$$E[Y(g_x)] = b(x;\beta). \tag{6}$$

To estimate β, first note that each individual might follow multiple strategies from the class $\{g_x\}$. Let $\Lambda_i$ denote the number of strategies that individual i follows. Generate an artificial dataset with $\Lambda_i$ contributions from each individual: $(Y_i, x_{i1}), \ldots, (Y_i, x_{i\Lambda_i})$. Using the artificial dataset, we can fit by weighted least squares the regression model

$$E[Y \mid x] = b(x;\gamma)$$

with weights

$$W(x) \equiv \prod_{k=0}^{K} \frac{f^*(x)}{f(N_k \mid \bar{L}_{k-1}, \bar{N}_{k-1})}$$

to obtain $\hat{\gamma}$, where $f(\cdot \mid \cdot)$ denotes conditional density and $f^*(x)$ is an arbitrary function of x. (When treatment strategies in $\{g_x\}$ are stochastic, the numerator of $W(x)$ is set to $f_x(N_k \mid \bar{L}_{k-1}, \bar{N}_{k-1})$, the conditional probability of receiving observed treatment at time k under $g_x$ conditional on patient history, as described in Cain et al17.) When treatment or monitoring probabilities are unknown, as they are in our application, a consistent estimator $\hat{f}(N_k \mid \bar{L}_{k-1}, \bar{N}_{k-1})$ of $f(N_k \mid \bar{L}_{k-1}, \bar{N}_{k-1})$ can be plugged into $W(x)$. Under sequential exchangeability (3) and consistency (2), Orellana et al4 show that the weighted regression parameter estimate $\hat{\gamma}$ approaches the causal estimand β in the limit.
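The construction of the artificial dataset can be illustrated with a toy example. The sketch below assumes the deterministic threshold class from the Introduction (monitor at time t if and only if the covariate is below x) and a coarse, hypothetical grid of thresholds; the patient records and outcome values are invented purely for illustration.

```python
def followed_strategies(covariates, monitored, thresholds):
    """Thresholds x whose rule N_t = 1{L_t < x} reproduces the observed record."""
    return [
        x
        for x in thresholds
        if all((l < x) == bool(n) for l, n in zip(covariates, monitored))
    ]


thresholds = range(200, 501, 50)  # hypothetical grid of candidate thresholds

# Two hypothetical individuals observed at two time points.
followed_a = followed_strategies([350, 420], [1, 0], thresholds)
followed_b = followed_strategies([350, 420], [1, 1], thresholds)

# Artificial dataset: one row (Y_i, x) per strategy that individual i follows.
outcomes = {"a": 0, "b": 1}  # hypothetical outcome values
artificial = [(outcomes["a"], x) for x in followed_a] + [
    (outcomes["b"], x) for x in followed_b
]
```

Here the first individual contributes one row and the second contributes two, so the number of rows per individual plays the role of $\Lambda_i$. Each row would then enter the weighted regression with the weight $W(x)$ defined above.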

4 |. ESTIMATING OPTIMAL TREATMENT STRATEGIES WITH RESOURCE CONSTRAINTS

To estimate the optimal resource-constrained dynamic strategy, we simply estimate the parameters of two dynamic MSMs:

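$$E[Y(g_x)] = b(x;\beta) \tag{7}$$
$$E[D(g_x)] = h(x;\theta) \tag{8}$$

and then estimate the x indexing the optimal strategy as

$$\hat{x}_\kappa^{opt} \equiv \arg\min_x b(x;\hat{\beta}) \quad \text{subject to } h(x;\hat{\theta}) \le \kappa \tag{9}$$

(Here smaller values of the outcome are taken to be preferable, as in the examples that follow; when larger values are preferable, as in (5), the arg min is replaced by an arg max.)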

(9) can be solved using software for non-linear optimization with non-linear inequality constraints such as the NLopt27 implementation of the method-of-moving-asymptotes.28 Optimization algorithms are not generally theoretically guaranteed to find global minima when constraints are nonlinear, but in most practical cases (see our application) a simple grid search would suffice. Standard errors for $\hat{\beta}$, $\hat{\theta}$, and certain derived quantities such as $E[Y(g_x)]$ and $E[D(g_x)]$ for any x can be computed by bootstrap or analytically using formulas in Reference 4. We note that confidence intervals for $x_\kappa^{opt}$ and $E[Y(g_{x_\kappa^{opt}})]$ are not generally available in the resource constrained setting through standard methods. This is due to complications arising from an estimator obtained from a constrained optimization where the constraint itself contains a plugin estimator.

4.1 |. An illustrative simulation

In this section, we illustrate the approach with a simple simulated example. We also compare the performance of the strategy learned by our proposed method (which incorporates the resource constraint) with an alternative approach that applies the optimal unconstrained strategy until the resource limit is reached and then administers no further treatment.

We first generated an “observational” data set of 25 000 subjects according to the following data generating process:

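$$L_0 \sim \text{Normal}(0,1); \quad N_0 \sim \text{Bernoulli}(\text{logit}^{-1}(0.5+|L_0|))$$
$$L_1(0) \sim \text{Normal}(L_0,1); \quad L_1 = L_1(0)\,0.5^{N_0}; \quad N_1 \sim \text{Bernoulli}(\text{logit}^{-1}(0.5+|L_1|))$$
$$Y(0) \sim \text{Normal}(L_1,1); \quad Y = |Y(0)\,0.5^{N_1}| + 0.5(N_0+N_1); \quad D = N_0+N_1.$$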

We set ourselves the goal of finding the treatment rule g setting N0 and N1 that would minimize Y subject to the arbitrary constraint that E[D(g)] ≤ κ = 0.25 for D = N0 + N1. In the absence of treatment, Y would be the absolute value of the third step in a normal random walk in which L0 and L1 are the first two steps. Setting Nt = 1 applies a multiplicative shrinking factor of .5 to step t + 1 of the random walk with a countervailing additive effect of 0.5 on Y. There is time varying confounding as Nt is more likely to be 1 when |Lt| is large under the observational regime.
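As a rough check on this description, counterfactual means under a given rule can be approximated by simulating the process with treatment set by the rule rather than drawn from the observational treatment model. The sketch below is in Python rather than the authors' R code, and it assumes threshold rules that set $N_t = 1$ when $|L_t|$ exceeds a cutoff, with the multiplicative factor 0.5 applied as described above; the function name `mean_outcomes` is hypothetical.

```python
import numpy as np


def mean_outcomes(x1, x2, n=400_000, seed=1):
    """Monte Carlo approximation of (E[Y(g_x)], E[D(g_x)]) for the rule
    'treat at the first time point if |L0| > x1 and at the second if |L1| > x2'."""
    rng = np.random.default_rng(seed)
    L0 = rng.normal(0.0, 1.0, n)
    N0 = (np.abs(L0) > x1).astype(int)            # treatment set by the rule
    L1 = rng.normal(L0, 1.0) * 0.5 ** N0          # shrink step 2 if treated
    N1 = (np.abs(L1) > x2).astype(int)
    Y = np.abs(rng.normal(L1, 1.0) * 0.5 ** N1) + 0.5 * (N0 + N1)
    return Y.mean(), (N0 + N1).mean()


y_never, d_never = mean_outcomes(10.0, 10.0)  # cutoffs so high nobody is treated
y_unc, d_unc = mean_outcomes(3.0, 0.61)       # cutoffs reported later for the unconstrained optimum
y_con, d_con = mean_outcomes(3.0, 1.54)       # cutoffs reported later for the constrained optimum
print(y_never, y_unc, y_con, d_unc, d_con)
```

Because the same seed is reused, the three rules are compared on common random numbers, which keeps the Monte Carlo noise in their differences small.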

We considered strategies {gx} of the form “set Nt = 1 if |Lt| > xt” for x1, x2 ∈ [0, 3]. Note that {gx} allows the threshold to vary by time, so with two time points x is two dimensional. We specified flexible dynamic marginal structural models for outcome and doses:

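$$E[Y(g_x)] = b(x;\beta) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \beta_4 x_1^4 + \beta_5 x_1^5 + \beta_6 x_2 + \beta_7 x_2^2 + \beta_8 x_2^3 + \beta_9 x_2^4 + \beta_{10} x_2^5 + \beta_{11} x_1 x_2 + \beta_{12}(x_1 x_2)^2 + \beta_{13}(x_1 x_2)^3 + \beta_{14}(x_1 x_2)^4$$
$$E[D(g_x)] = h(x;\theta) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3 + \theta_4 x_1^4 + \theta_5 x_1^5 + \theta_6 x_2 + \theta_7 x_2^2 + \theta_8 x_2^3 + \theta_9 x_2^4 + \theta_{10} x_2^5 + \theta_{11} x_1 x_2 + \theta_{12}(x_1 x_2)^2 + \theta_{13}(x_1 x_2)^3 + \theta_{14}(x_1 x_2)^4.$$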

We then estimated parameters β and θ via weighted regression as described in Section 3, plugging probability of treatment estimates from a correctly specified logistic regression model into the weights. Given $\hat{\beta}$ and $\hat{\theta}$, we used the L-BFGS-B optimization algorithm29 implemented in the R function optim() to estimate the unconstrained optimal strategy $\hat{x}^{opt}$, and we used the NLopt27 implementation of the method-of-moving-asymptotes28 for optimization under nonlinear inequality constraints through the R package nloptr to solve (9) and estimate the resource constrained optimal strategy $\hat{x}_\kappa^{opt}$. The unconstrained optimal strategy $\hat{x}^{opt}$ was (3, 0.61), meaning that the strategy “set $N_0 = 1$ if $|L_0| > 3$ and set $N_1 = 1$ if $|L_1| > 0.61$” leads to the optimal expected outcome of all strategies of the form “set $N_0 = 1$ if $|L_0| > x_1$ and set $N_1 = 1$ if $|L_1| > x_2$” in the absence of resource constraints. In contrast, the resource-constrained optimal strategy $\hat{x}_\kappa^{opt}$ was (3, 1.54), reflecting that it was never optimal to give treatment at the first time point and treatment would only be administered under more extreme values of the covariate at the second time point in the resource constrained case. The expected outcome under our estimated unconstrained optimal strategy from the class was $E[Y(g_{\hat{x}^{opt}})] = 1.16$, and the expected outcome under the estimated optimal strategy from the class respecting resource constraints was $E[Y(g_{\hat{x}_\kappa^{opt}})] = 1.20$, reflecting a small but inevitable sacrifice to accommodate resource constraints. (We computed these counterfactual mean outcome values via simulation as the mean outcome of N = 1 000 000 subjects generated under $g_{\hat{x}^{opt}}$ and $g_{\hat{x}_\kappa^{opt}}$. The values therefore reflect ground truth but are also very close to model generated estimates.)
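The estimation pipeline described in this section can be sketched end to end as follows. This is not the authors' R code but a simplified Python illustration under several stated assumptions: the true treatment probabilities are used in the weights instead of fitted logistic regressions, $f^*(x) = 1$, candidate strategies lie on a coarse grid, and a grid search replaces L-BFGS-B and the method of moving asymptotes. Because the regressors depend only on x, weighted least squares on the expanded dataset is equivalent to weighted least squares on the per-strategy weighted means with the summed weights, which is what the code exploits.

```python
import numpy as np

rng = np.random.default_rng(0)


def expit(z):
    return 1.0 / (1.0 + np.exp(-z))


# Observational data from the data-generating process of this section.
n = 20_000
L0 = rng.normal(0.0, 1.0, n)
p0 = expit(0.5 + np.abs(L0))                 # P(N0 = 1 | L0)
N0 = rng.binomial(1, p0)
L1 = rng.normal(L0, 1.0) * 0.5 ** N0
p1 = expit(0.5 + np.abs(L1))                 # P(N1 = 1 | L1)
N1 = rng.binomial(1, p1)
Y = np.abs(rng.normal(L1, 1.0) * 0.5 ** N1) + 0.5 * (N0 + N1)
D = N0 + N1

# Inverse probability of the observed treatment history (assumes known
# probabilities; in practice these would be estimated).
w = 1.0 / (np.where(N0 == 1, p0, 1 - p0) * np.where(N1 == 1, p1, 1 - p1))


def basis(x1, x2):
    """Terms of the MSMs: 1, x1..x1^5, x2..x2^5, (x1*x2)..(x1*x2)^4."""
    cols = [np.ones_like(x1)]
    cols += [x1 ** p for p in range(1, 6)]
    cols += [x2 ** p for p in range(1, 6)]
    cols += [(x1 * x2) ** p for p in range(1, 5)]
    return np.column_stack(cols)


grid = np.linspace(0.0, 3.0, 13)
X1, X2 = [a.ravel() for a in np.meshgrid(grid, grid, indexing="ij")]

# For each candidate x, keep individuals whose observed treatments agree
# with g_x and summarize them by summed weight and weighted means.
sw, ybar, dbar = (np.empty(X1.size) for _ in range(3))
for j, (x1, x2) in enumerate(zip(X1, X2)):
    follows = (N0 == (np.abs(L0) > x1)) & (N1 == (np.abs(L1) > x2))
    sw[j] = w[follows].sum()
    ybar[j] = np.average(Y[follows], weights=w[follows])
    dbar[j] = np.average(D[follows], weights=w[follows])

# Weighted least squares fits of the two dynamic MSMs.
B = basis(X1, X2)
r = np.sqrt(sw)
beta = np.linalg.lstsq(B * r[:, None], ybar * r, rcond=None)[0]
theta = np.linalg.lstsq(B * r[:, None], dbar * r, rcond=None)[0]
b_hat, h_hat = B @ beta, B @ theta

# Grid search: minimize b subject to h <= kappa.
kappa = 0.25
j_unc = int(np.argmin(b_hat))
feasible = np.flatnonzero(h_hat <= kappa)
j_con = int(feasible[np.argmin(b_hat[feasible])])
print("unconstrained:", X1[j_unc], X2[j_unc], b_hat[j_unc])
print("constrained:  ", X1[j_con], X2[j_con], b_hat[j_con], h_hat[j_con])
```

With a finite sample and a coarse grid, the selected cutoffs need not coincide with the values reported above; the sketch is meant to show the mechanics rather than to reproduce the published numbers.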

To assess the gains that would be realized from implementing the resource-constrained optimal strategy as opposed to a realistic alternative, we also computed the mean Y obtained when treatment is administered following the unconstrained optimal strategy $g_{\hat{x}^{opt}}$ until the resource limit is reached and then no further treatment is administered. This resource naive strategy resulted in a mean outcome of 1.3, which was over three times as far away from the unconstrained optimal outcome of 1.16 as the optimal resource constrained strategy ($E[Y(g_{\hat{x}_\kappa^{opt}})] = 1.2$), illustrating the utility of our approach. R code for this simulation is available with supplementary materials, and at https://github.com/CausalInference/Resource_Constrained_dynMSM_Simulation.
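The "over three times" comparison can be checked directly from the rounded values reported in this section, taking the ratio of each strategy's shortfall relative to the unconstrained optimum:

$$\frac{1.30 - 1.16}{1.20 - 1.16} = \frac{0.14}{0.04} = 3.5$$

Since the inputs are rounded to two decimals, this ratio is only approximate.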

5 |. APPLICATION TO MONITORING OF HIV PATIENTS

We apply the method described above to estimate the optimal resource-constrained dynamic strategy for monitoring CD4 cell count and HIV-RNA in people living with HIV using data from the HIV-CAUSAL collaboration.30 The HIV-CAUSAL collaboration combines data from prospective cohorts of people living with HIV enrolled in universal health care systems in Brazil, Canada, France, Greece, Netherlands, Spain, Switzerland, UK, and USA. Due to privacy concerns, the data are available from HIV-CAUSAL only through an approval process.

We have previously reported on the optimal dynamic monitoring strategy in this cohort and showed that decreasing monitoring when CD4 cell count >200 cells/μL compared to >500 cells/μL does not worsen short-term clinical and immunologic outcomes in virologically suppressed individuals living with HIV but may increase the risk of virological failure.31 We now extend these results to identify the optimal resource-constrained dynamic strategy under constraints on the average number of monitoring events over a two-year period.

First we briefly describe the eligibility criteria and monitoring strategies under consideration. We then describe the estimation of the optimal resource-constrained dynamic strategy.

Eligibility criteria:

Previously antiretroviral therapy naive HIV-positive individuals who initiate antiretroviral therapy in 2000 or later and achieve confirmed virologic suppression (2 consecutive HIV-RNA ≤200 copies/mL) within 12 months of initiating therapy are eligible for inclusion in the study. Individuals must meet the following additional eligibility criteria at baseline (date of confirmed virologic suppression): 18 years of age or older, CD4 cell count measurement within the previous 3 months, no history of an AIDS-defining illness, and no pregnancy (when information was available).

Monitoring strategies:

We consider 31 dynamic monitoring strategies, based loosely on current clinical guidelines. Under each strategy, CD4 cell count and HIV-RNA are monitored every 3 to 6 months when CD4 is below the strategy’s threshold and every 9 to 12 months when CD4 is above the threshold. Each strategy corresponds to a CD4 threshold ranging from 200 to 500 cells/μL in increments of 10 cells/μL. All of the monitoring strategies further require individuals to be monitored once every 3 to 6 months when HIV-RNA >200 copies/mL or after diagnosis of an AIDS-defining illness, and that CD4 cell count and HIV-RNA be monitored concurrently.

Follow-up period and outcome:

Individuals are followed from baseline until death, pregnancy (if known), loss to follow-up, or the administrative end of follow-up. The outcome of interest is virologic failure (HIV-RNA >200 copies/mL) at 24 months of follow-up.

Statistical methods:

We compare the 31 monitoring strategies using the replication and censoring approach. Briefly, we create an expanded dataset by making 31 exact replicates of each individual (1 per strategy). If and when an individual’s data are no longer consistent with a given strategy, we artificially censor the corresponding replicate at that time. We compute inverse probability weights to adjust for the potential selection bias induced by the artificial censoring.
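The artificial censoring step can be sketched for a single individual as below. This is a deliberately simplified illustration, not the authors' implementation: it uses the 3 to 6 and 9 to 12 month windows stated above, ignores the additional rules for HIV-RNA >200 copies/mL and AIDS-defining illness, and assumes that a clone is censored either when a measurement occurs before its window opens or in the first month after its window closes without a measurement. The function name and the example record are hypothetical.

```python
def censor_month(measure_months, cd4_values, threshold, end=24):
    """Month at which the clone assigned to `threshold` stops being consistent
    with its strategy, or None if it remains consistent through `end`."""
    last, last_cd4 = measure_months[0], cd4_values[0]  # baseline measurement
    for month, cd4 in zip(measure_months[1:], cd4_values[1:]):
        lo, hi = (3, 6) if last_cd4 < threshold else (9, 12)
        gap = month - last
        if gap < lo:          # measured before the window opened
            return month
        if gap > hi:          # window closed without a measurement
            return last + hi + 1
        last, last_cd4 = month, cd4
    lo, hi = (3, 6) if last_cd4 < threshold else (9, 12)
    if end - last > hi:       # no further measurement before follow-up ends
        return last + hi + 1
    return None


# One hypothetical individual measured at months 0, 4, 8, and 18.
months = [0, 4, 8, 18]
cd4 = [300, 340, 420, 450]
clones = {x: censor_month(months, cd4, x) for x in (320, 400, 500)}
```

In this toy record the clone for the 400 threshold is never censored, while the clones for 320 and 500 are censored at different times, which is why each replicate needs its own inverse probability weights.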

We then fit an inverse-probability weighted Poisson regression model to estimate the risk ratio of virologic failure at 24 months of follow-up among those with measurements at 24 ± 2 months. The model includes a flexible functional form of the strategy variable (restricted cubic splines) and the baseline covariates: sex, CD4 cell count (<200, 200-349, 350-499, ≥500 cells/μL), years since HIV diagnosis (<1, 1 to 4, ≥5 years, unknown), race (white, black, other or unknown), geographic origin (N. America/W. Europe, Sub-Saharan Africa, other, unknown), acquisition group (heterosexual, homosexual or bisexual, injection drug use, other or unknown), calendar year (restricted cubic splines with 3 knots at 2001, 2007, and 2011), age (restricted cubic splines with 3 knots at 25, 39, and 60), cohort, and months from cART initiation to virologic suppression (2-4, 5-8, ≥9). Under the assumptions described above, the parameters of the regression model consistently estimate the parameters of a dynamic marginal structural model $\log \Pr(Y(g_x) = 1 \mid V) = \beta_0 + \beta_1 h(x) + \beta_2 V$. The model’s estimated parameters are used to estimate the standardized risk of virologic failure at 24 months for each monitoring strategy.

Next, we fit an inverse-probability weighted log-linear regression model to estimate the mean number of measurements at 24 months of follow-up. As above, the model includes a flexible functional form of the strategy variable and the baseline covariates. Again, under the assumptions described above, the parameters of the regression model consistently estimate the parameters of a dynamic marginal structural model $E[D(g_x) \mid V] = \theta_0 + \theta_1 f(x) + \theta_2 V$. The predicted values are used to estimate the standardized mean number of measurements at 24 months for each monitoring strategy.

After estimating the counterfactual risk of virologic failure at 24 months and counterfactual mean number of measurements at 24 months, we perform a 3-step algorithm to rank the dynamic strategies and select the best strategy among the subset of strategies that satisfy a hypothetical resource constraint. First, we rank the strategies by the counterfactual mean number of measurements at 24 months (highest to lowest), obtained from the dynamic MSM $E[D(g_x)] = h(x;\theta)$. Second, we restrict our consideration to the strategies that satisfy the resource constraint characterized by a maximum average number of tests per patient. For example, if the resource constraint states that CD4 cell count and HIV-RNA may be measured a maximum of once every 6 months on average, we would only consider the strategies that lead to an average of 4 or fewer measurements per person over 24 months of follow-up. Third, we choose the strategy with the smallest risk of virologic failure among the strategies that satisfy the resource constraint, obtained from the dynamic MSM $E[Y(g_x)] = b(x;\beta)$. In the third step, we use a grid search to find the best strategy among the subset of strategies corresponding to discrete CD4 cell counts rather than solving the optimization problem to find the optimal continuous CD4 cell count. This is the optimal strategy for minimizing the risk of virologic failure among the strategies that satisfy the constraint.

Results:

47 635 individuals initiated antiretroviral therapy in 2000 or later and met the eligibility criteria for our study. At baseline, we made 31 copies of each eligible individual, for a total sample size of 1 476 685. During follow-up, CD4 cell count and HIV-RNA were measured on average every 4 and 3.8 months, respectively. Figure 1 shows the estimates obtained from the dynamic MSMs for virologic failure at 24 months and for mean number of monitoring events over 24 months across the range of CD4 cell count thresholds considered. In this example, as the CD4 cell count threshold decreases, the estimated mean number of measurements decreases monotonically and the estimated risk of virologic failure increases at all thresholds below 410 cells/μL (Table 1). Whenever the constraint excludes the threshold of 410 cells/μL, the optimal resource-constrained dynamic strategy can therefore be identified graphically as the highest CD4 cell count threshold for which the mean number of monitoring events over 24 months is below the resource constraint, κ. Table 1 gives the same information with 95% confidence intervals obtained via bootstrapping.

FIGURE 1. Risk of virologic failure and mean number of measurements per person at 24 months of follow-up by CD4 threshold strategy. The grey line represents one potential resource constraint: a cap of 4 on the average per person number of measurements over 24 months of follow-up. Strategies in the green shaded area meet this restriction, and the CD4 threshold 320 strategy is the optimal resource-constrained dynamic strategy.

TABLE 1. Risk of virologic failure and cumulative number of measurements at 24 months for CD4 threshold

| CD4 threshold* (cells/μL) | Virologic failure: risk (%) | Virologic failure: 95% CI | Cumulative # measurements: expected value | Cumulative # measurements: 95% CI |
|---|---|---|---|---|
| 500 | 6.91 | (3.99, 9.82) | 4.94 | (4.82, 5.06) |
| 490 | 6.85 | (4.09, 9.61) | 4.89 | (4.78, 5.00) |
| 480 | 6.80 | (4.17, 9.43) | 4.84 | (4.73, 4.94) |
| 470 | 6.74 | (4.21, 9.27) | 4.78 | (4.69, 4.88) |
| 460 | 6.69 | (4.23, 9.15) | 4.73 | (4.64, 4.83) |
| 450 | 6.64 | (4.22, 9.05) | 4.68 | (4.59, 4.78) |
| 440 | 6.59 | (4.18, 8.99) | 4.63 | (4.54, 4.73) |
| 430 | 6.54 | (4.13, 8.95) | 4.58 | (4.49, 4.68) |
| 420 | 6.52 | (4.08, 8.95) | 4.53 | (4.43, 4.63) |
| 410 | 6.51 | (4.03, 8.99) | 4.48 | (4.38, 4.58) |
| 400 | 6.54 | (4.02, 9.05) | 4.43 | (4.32, 4.53) |
| 390 | 6.60 | (4.05, 9.15) | 4.37 | (4.27, 4.48) |
| 380 | 6.71 | (4.14, 9.28) | 4.31 | (4.21, 4.42) |
| 370 | 6.87 | (4.26, 9.48) | 4.25 | (4.15, 4.36) |
| 360 | 7.08 | (4.41, 9.76) | 4.19 | (4.08, 4.30) |
| 350 | 7.33 | (4.55, 10.12) | 4.13 | (4.02, 4.24) |
| 340 | 7.62 | (4.67, 10.57) | 4.07 | (3.95, 4.18) |
| 330 | 7.93 | (4.79, 11.08) | 4.01 | (3.89, 4.13) |
| ***320*** | 8.27 | (4.91, 11.62) | 3.96 | (3.84, 4.08) |
| **310** | 8.61 | (5.09, 12.14) | 3.92 | (3.79, 4.04) |
| **300** | 8.97 | (5.31, 12.61) | 3.88 | (3.75, 4.00) |
| **290** | 9.33 | (5.59, 13.06) | 3.84 | (3.71, 3.97) |
| **280** | 9.70 | (5.91, 13.49) | 3.81 | (3.68, 3.93) |
| **270** | 10.08 | (6.22, 13.94) | 3.78 | (3.65, 3.91) |
| **260** | 10.48 | (6.52, 14.43) | 3.75 | (3.62, 3.88) |
| **250** | 10.88 | (6.78, 14.99) | 3.73 | (3.59, 3.86) |
| **240** | 11.31 | (6.98, 15.64) | 3.70 | (3.55, 3.85) |
| **230** | 11.75 | (7.12, 16.38) | 3.67 | (3.52, 3.83) |
| **220** | 12.21 | (7.21, 17.72) | 3.65 | (3.48, 3.82) |
| **210** | 12.68 | (7.25, 18.12) | 3.62 | (3.44, 3.80) |
| **200** | 13.18 | (7.23, 19.13) | 3.60 | (3.41, 3.79) |

Note: The monitoring strategies with thresholds in bold meet the restriction that CD4 cell count and HIV-RNA may only be monitored every six months. Among the monitoring strategies that meet the restriction, the 320 threshold strategy, in bold italic, is the optimal strategy.

* The CD4 threshold corresponds to the CD4 cell count at which monitoring frequency changes from once every 2 to 7 months (if CD4 cell count is below the threshold) to once every 8 to 13 months (if CD4 cell count is above the threshold). Each strategy also includes monitoring once every 2 to 7 months when HIV-RNA >200 copies/mL or after diagnosis of an AIDS-defining illness.

In our example, we consider the case of κ = 4 and identify the optimal threshold for switching monitoring frequency as 320 cells/μL. The optimal resource-constrained dynamic strategy is then ‘monitor CD4 cell count and HIV-RNA every 3 to 6 months when CD4 is below 320 cells/μL and every 9 to 12 months when CD4 is above 320 cells/μL’.

6 |. CONCLUSIONS

Dynamic treatment strategies are a better representation of real-world clinical decision-making processes than static or point intervention strategies. However, resource utilization of dynamic strategies is difficult to assess, since the number of individuals requiring intervention over time under a given strategy cannot be straightforwardly determined at baseline. When a health system faces resource constraints that prohibit implementing the true optimal dynamic treatment strategy, the optimal resource-constrained dynamic strategy is instead required.

Here we propose a method to identify the optimal resource-constrained dynamic strategy within a parameterized class of strategies of interest by estimating the counterfactual resource usage. We apply this method to estimate the optimal resource-constrained dynamic strategy for monitoring frequency among individuals living with HIV who achieve virologic suppression.

Our choice of κ = 4 was somewhat arbitrary. If we had instead chosen κ = 3, we would have found that none of the strategies under consideration would satisfy this resource constraint. Interestingly, if we had chosen κ = 4.7, we would have identified the optimal threshold for switching monitoring frequency as 410 cells/μL, even though all the strategies in the range from 200 to 450 cells/μL would have satisfied the resource constraint (however, the confidence intervals around our estimates are quite wide).
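The selection rule and the three values of κ discussed here can be traced through the point estimates in Table 1. The sketch below is a minimal illustration of the grid search only; it uses the tabulated point estimates, ignores their confidence intervals, and the function name `best_threshold` is hypothetical.

```python
# Table 1 point estimates: threshold -> (risk of virologic failure in %,
# expected cumulative number of measurements over 24 months).
table1 = {
    500: (6.91, 4.94), 490: (6.85, 4.89), 480: (6.80, 4.84), 470: (6.74, 4.78),
    460: (6.69, 4.73), 450: (6.64, 4.68), 440: (6.59, 4.63), 430: (6.54, 4.58),
    420: (6.52, 4.53), 410: (6.51, 4.48), 400: (6.54, 4.43), 390: (6.60, 4.37),
    380: (6.71, 4.31), 370: (6.87, 4.25), 360: (7.08, 4.19), 350: (7.33, 4.13),
    340: (7.62, 4.07), 330: (7.93, 4.01), 320: (8.27, 3.96), 310: (8.61, 3.92),
    300: (8.97, 3.88), 290: (9.33, 3.84), 280: (9.70, 3.81), 270: (10.08, 3.78),
    260: (10.48, 3.75), 250: (10.88, 3.73), 240: (11.31, 3.70), 230: (11.75, 3.67),
    220: (12.21, 3.65), 210: (12.68, 3.62), 200: (13.18, 3.60),
}


def best_threshold(table, kappa):
    """Lowest-risk threshold among those with expected measurements <= kappa,
    or None when no strategy satisfies the constraint."""
    feasible = {x: risk for x, (risk, tests) in table.items() if tests <= kappa}
    if not feasible:
        return None
    return min(feasible, key=feasible.get)


choice_4 = best_threshold(table1, 4.0)
choice_47 = best_threshold(table1, 4.7)
choice_3 = best_threshold(table1, 3.0)
feasible_47 = sorted(x for x, (_, tests) in table1.items() if tests <= 4.7)
```

With κ = 4.7 the constraint is no longer binding at the selected strategy, because the estimated risk is lowest at an interior threshold rather than at the largest feasible one.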

In reality, determining the number of CD4 cell count and HIV-RNA measurements a setting is willing to allocate depends on a complex assessment of the costs and health benefits of monitoring. In our illustrative application, we imposed a constraint on the number of tests, which we imagined was derived from a hypothetical corresponding cost constraint. In other applications, it might be useful to directly bound cost instead. For example, since HIV-RNA tests cost more than CD4 tests, an optimal strategy satisfying a total cost constraint may be a more flexible joint strategy that allows CD4 cell count and HIV-RNA to be monitored with different frequencies. In addition, a constraint on the number of tests could be caused by stock-outs, an increasing global problem given the current COVID-19 pandemic. Future studies should also assess other health outcomes such as quality-adjusted life years associated with various monitoring strategies. Finally, even in the absence of a single hard resource constraint, examining outcomes of optimal strategies over a range of hypothetical cost constraints could allow for computation of incremental cost-effectiveness ratios, which could be useful for key stakeholders and decision-makers.

ACKNOWLEDGEMENTS

We thank Andrew Phillips, Linda Wittkop, Giota Touloumi, and Hansjakob Furrer for useful comments on an earlier draft of this article and James Robins for helpful discussions. This work was partially supported by NIH grants R37 AI102634 and T32 AI007433.

DATA AVAILABILITY STATEMENT

Data from each participating study in HIV-CAUSAL can be requested from the sponsoring institution in accordance with the applicable laws or regulations in each country.

REFERENCES

  • 1. Luedtke AR, van der Laan MJ. Optimal individualized treatments in resource-limited settings. Int J Biostat. 2016;12(1):283–303.
  • 2. Wang Y, Fu H, Zeng D. Learning optimal personalized treatment rules in consideration of benefit and risk: with an application to treating type 2 diabetes patients with insulin therapies. J Am Stat Assoc. 2018;113(521):1–13.
  • 3. Huang X, Xu J. Estimating individualized treatment rules with risk constraint. Biometrics. 2020;76(4):1310–1318.
  • 4. Orellana L, Rotnitzky AG, Robins JM. Generalized Marginal Structural Models for Estimating Optimal Treatment Regimes. Technical Report. Boston, MA: Department of Biostatistics, Harvard School of Public Health; 2006.
  • 5. Robins JM, Orellana L, Rotnitzky AG. Estimation and extrapolation of optimal treatment and testing strategies. Stat Med. 2008;27(23):4678–4721.
  • 6. Guan Q, Reich BJ, Laber EB, Bandyopadhyay D. Bayesian nonparametric policy search with application to periodontal recall intervals. J Am Stat Assoc. 2020;115(531):1066–1078.
  • 7. Guan Q, Reich BJ, Laber EB. A spatiotemporal recommendation engine for malaria control; 2020. arXiv preprint arXiv:2003.05084.
  • 8. Laber EB, Wu F, Munera C, Lipkovich I, Colucci S, Ripa S. Identifying optimal dosage regimes under safety constraints: an application to long term opioid treatment of chronic pain. Stat Med. 2018;37(9):1407–1418.
  • 9. Laber EB, Meyer NJ, Reich BJ, Pacifici K, Collazo JA, Drake JM. Optimal treatment allocations in space and time for on-line control of an emerging infectious disease. J Royal Stat Soc Ser C Appl Stat. 2018;67(4):743.
  • 10. Linn KA, Laber EB, Stefanski LA. Chapter 15: Estimation of dynamic treatment regimes for complex outcomes: balancing benefits and risks. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. Philadelphia, PA: Society for Industrial and Applied Mathematics; 2015:249–262.
  • 11. Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period application to the healthy worker survivor effect. Math Model. 1986;7:1393–1512.
  • 12. Berry DA, Fristedt B. Bandit Problems: Sequential Allocation of Experiments (Monographs on Statistics and Applied Probability). London, UK: Chapman & Hall; 1985.
  • 13. Berry DA, Eick S. Adaptive assignment versus balanced randomization in clinical trials: a decision analysis. Stat Med. 1995;14(3):231–246.
  • 14. Muller P, Xu Y, Thall PF. Clinical trial design as a decision problem. Appl Stoch Models Bus Ind. 2017;33(3):296–301.
  • 15. Jiang F, Lee J, Muller P. A Bayesian decision-theoretic sequential response-adaptive randomization design. Stat Med. 2013;32(12):1975–1994.
  • 16. Ding M, Rosner GL, Muller P. Bayesian optimal design for phase II screening trials. Biometrics. 2008;64(3):886–894.
  • 17. Cain LE, Robins JM, Lanoy E, Logan R, Costagliola D, Hernán MA. When to start treatment? A systematic approach to the comparison of dynamic regimes using observational data. Int J Biostat. 2010;6(2):1–24.
  • 18. Roberts T, Kosack C Frontieres MS. Prices analysis of essential HIV CD4 and VL technologies; 2014. https://www.who.int/hiv/events/MSF_AMDS.pdf?ua=1. Accessed September 30, 2014.
  • 19. Lara AM, Kigozi J, Amurwon J, et al. Cost effectiveness analysis of clinically driven versus routine laboratory monitoring of antiretroviral therapy in Uganda and Zimbabwe. PLoS One. 2012;7(4):e33672.
  • 20. Larson B, Schnippel K, Ndibongo B, Long L. How to estimate the cost of point-of-care CD4 testing in program settings: an example using the Alere Pima Analyzer in South Africa. PLoS One. 2012;7(4):e35444.
  • 21. Sued O, Schreiber C, Giron N, Ghidinelli M. HIV drug and supply stock-outs in Latin America. Lancet Infect Dis. 2011;11(11):P810–P811.
  • 22. Oberth G, Baptiste S, Mosime W, Maouan A. Understanding gaps in the HIV treatment cascade in 11 West African countries: findings from the regional community treatment observatory [abstract]. Paper presented at: Proceedings of the International AIDS Society Conference; 2019; Mexico City, Mexico. Abstract number 2841.
  • 23. Odendal L. Leakages in ART treatment cascades in West Africa and Zambia; 2019. nam aidsmap. https://www.aidsmap.com/news/aug-2019/leakages-art-treatment-cascades-west-africa-and-zambia. Accessed August 8, 2019.
  • 24. World Health Organization. Disruption in HIV, Hepatitis, and STI services due to COVID-19. https://www.who.int/docs/default-source/hiv-hq/presentation-disruption-in-services-international-aids-conference-2020.pdf?sfvrsn=d4bf1f87_7. Accessed July 8, 2020.
  • 25. Rubin DB. Bayesian inference for causal effects: the role of randomization. Ann Stat. 1978;1:34–58.
  • 26. Robins JM, Hernán MA. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, eds. Longitudinal Data Analysis. New York, NY: Chapman & Hall/CRC Press; 2008:533–599.
  • 27. Johnson SG. The NLopt nonlinear-optimization package. 2020. http://github.com/stevengj/nlopt.
  • 28. Svanberg K. A class of globally convergent optimization methods based on conservative convex separable approximations. SIAM J Optim. 2002;12(2):555–573.
  • 29. Byrd RH, Lu P, Nocedal J, Zhu C. A limited memory algorithm for bound constrained optimization. SIAM J Sci Comput. 1995;16(5):1190–1208.
  • 30. The HIV-CAUSAL collaboration; 2000; HIV-CAUSAL.
  • 31. Caniglia EC, Cain LE, Sabin CA, Robins JM. Comparison of dynamic monitoring strategies based on CD4 cell counts in virally suppressed, HIV-positive individuals on combination antiretroviral therapy in high-income countries: a prospective, observational study. Lancet HIV. 2017;4(6):e251–e259.
