Abstract
Causal indirect and direct effects provide an interpretable method for decomposing the total effect of an exposure on an outcome into the indirect effect through a mediator and the direct effect through all other pathways. A natural choice for a mediator in a randomized clinical trial is the treatment’s targeted biomarker. However, when the mediator is a biomarker, its values can be subject to an assay lower limit. Because the mediator is both affected by the treatment and a putative cause of the outcome, the assay lower limit presents a compounded problem in mediation analysis. We propose two approaches to estimate indirect and direct effects with a mediator subject to an assay limit: (1) extrapolation and (2) numerical optimization and integration of the observed likelihood. Since these estimation methods rely solely on the so-called Mediation Formula, they apply to most approaches to causal mediation analysis: natural, separable, and organic indirect and direct effects. A simulation study compares the two estimation approaches to imputing with half the assay limit. Using HIV interruption study data from the AIDS Clinical Trials Group described in Li et al (2016, AIDS) and Lok and Bosch (2021, Epidemiology), we illustrate our methods by estimating the organic/pure indirect effect of a hypothetical HIV curative treatment on viral suppression mediated by two HIV persistence measures: cell-associated HIV-RNA and single-copy plasma HIV-RNA.
Keywords: assay lower limit, causal inference, causal mediation analysis, HIV/AIDS, indirect and direct effects
1 |. INTRODUCTION
Recent causality research extends the notion of a causal effect beyond the “simple” randomized controlled trial (RCT). While RCTs can provide results with a causal interpretation, their findings are often a “black box view of causality.”1 For example, consider an RCT designed to study the effect of a new drug that is hypothesized to target a pathway that may stall disease progression. If designed correctly, a successful RCT can establish causality between the new treatment and disease progression, but the initial biological hypothesis remains unanswered. Mediation analysis answers questions about how and why a drug affects disease progression.
Baron and Kenny2 established a foundation for mediation analysis that decomposes the total effect of an exposure on an outcome into (1) the indirect pathway from the exposure to the outcome through a mediator variable and (2) the direct pathway from the exposure to the outcome through all other pathways. Indirect and direct effects can be visualized through the causal diagram in Figure 1. Let $A$ be a binary randomized treatment, $Y$ the outcome of interest, and $M$ a mediator measured between $A$ and $Y$. Since $M$ is not randomized, pretreatment common causes $C$ of the mediator and the outcome must be adjusted for.3,4
FIGURE 1.

Causal diagram with assumed data structure. Arrows from one node to another indicate a causal relationship.
Since $A$ is assumed randomized, there are no common causes of $A$ and $M$ or of $A$ and $Y$, but the randomization assumption can be relaxed.5
Robins and Greenland3 and Pearl4 established a causal counterfactual framework for mediation analysis known as the natural indirect and direct effects. They use counterfactual quantities: $M_a$ and $Y_a$, the mediator and the outcome with treatment set to $a$ (0 or 1), and $Y_{a,m}$, the outcome with treatment set to $a$ (0 or 1) and mediator set to $m$ (any $m$). The natural indirect and direct effects rely on $Y_{1,M_0}$, the outcome under $a = 1$ but with the mediator taking its value under $a = 0$: a so-called cross-worlds quantity that exists in two mutually exclusive realities, $a = 1$ and $a = 0$. The natural indirect effect
$$E[Y_{1,M_1}] - E[Y_{1,M_0}]$$
is the change in the outcome under $a = 1$ while varying the mediator from $M_0$, its value under $a = 0$, to $M_1$, its value under $a = 1$. The natural direct effect
$$E[Y_{1,M_0}] - E[Y_{0,M_0}]$$
is the change in the outcome under varied treatment statuses $a = 1$ vs $a = 0$ with the mediator kept at $M_0$, the value the mediator would have under treatment $a = 0$. Identifiability of natural indirect and direct effects involves assumptions on cross-worlds quantities that are not verifiable but yield the Mediation Formula,4
$$E[Y_{1,M_0}] = \int_{(m,c)} E[Y \mid M = m, A = 1, C = c]\, f_{M \mid A = 0, C = c}(m)\, f_C(c)\, dm\, dc, \tag{1}$$
which identifies the cross-worlds term from observable and estimable quantities.
Lok6 proposed organic indirect and direct effects, which avoid cross-worlds counterfactuals and offer relaxed identifiability assumptions. Lok and Bosch7 generalized this approach to the organic indirect and direct effects relative to a specific value $a$. Let $I$ be an intervention on the mediator. Denote $M_{0,I=1}$ and $Y_{0,I=1}$ as the mediator and outcome, respectively, under $a = 0$ combined with intervention $I$ on the mediator. Organic indirect and direct effects relative to $a = 1$ generalize the natural indirect and direct effects. Here we consider organic indirect and direct effects relative to $a = 0$. $I$ is said to be an organic intervention relative to $a = 0$ and $C$ if
$$M_{0,I=1} \mid C = c \;\sim\; M_1 \mid C = c \tag{2}$$
and
$$Y_{0,I=1} \mid M_{0,I=1} = m, C = c \;\sim\; Y_0 \mid M_0 = m, C = c. \tag{3}$$
In contrast to natural indirect and direct effects that require setting the value of the mediator, Equation (2) requires that the organic intervention changes the distribution of the mediator under no treatment to the mediator distribution under treatment. Equation (3) requires that $I$ is associated with the outcome only through its effect on the mediator. That the mediator intervention $I$ is associated with the outcome only through its effect on the mediator is basically definitional, to establish the concept of an organic indirect effect. We are not assuming that a treatment under investigation affects the outcome only through the mediator; that would be only the indirect effect of this treatment. The organic indirect and direct effects relative to $a = 0$ and $C$ are defined as
$$E[Y_{0,I=1}] - E[Y_0] \tag{4}$$
and
$$E[Y_1] - E[Y_{0,I=1}]. \tag{5}$$
Assuming a randomized treatment, Lok and Bosch7 showed that the Mediation Formula (1) holds for organic interventions:
$$E[Y_{0,I=1}] = \int_{(m,c)} E[Y \mid M = m, A = 0, C = c]\, f_{M \mid A = 1, C = c}(m)\, f_C(c)\, dm\, dc. \tag{6}$$
Compared to the Mediation Formula (1) relative to $a = 1$ and $C$, the Mediation Formula relative to $a = 0$ and $C$, Equation (6), depends on outcome data exclusively from untreated participants, providing two key advantages.7 First, the mediation formula easily accommodates mediator-treatment interactions in the outcome model. Second, if the effect of treatment on the mediator distribution is known or can be estimated, the indirect effect can be estimated in the absence of outcome data under treatment.
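As a rough numerical illustration of the Mediation Formula relative to $a = 0$, the sketch below replaces the integrals with sums over a discrete mediator and a binary common cause. All numbers and names (`outcome_mean_untreated`, `mediator_pmf_treated`, and so on) are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study data.

```python
def mediation_formula(outcome_mean, mediator_pmf, covariate_pmf):
    """Discrete analog of the Mediation Formula.

    outcome_mean[(m, c)]  : E[Y | M = m, A = 0, C = c]
    mediator_pmf[c][m]    : P(M = m | A = a, C = c) for the chosen arm a
    covariate_pmf[c]      : P(C = c)
    """
    total = 0.0
    for c, p_c in covariate_pmf.items():
        for m, p_m in mediator_pmf[c].items():
            total += outcome_mean[(m, c)] * p_m * p_c
    return total


# Hypothetical toy inputs (assumptions for illustration only).
covariate_pmf = {0: 0.5, 1: 0.5}
outcome_mean_untreated = {(0, 0): 0.8, (1, 0): 0.4, (0, 1): 0.6, (1, 1): 0.2}
mediator_pmf_treated = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}
mediator_pmf_untreated = {0: {0: 0.3, 1: 0.7}, 1: {0: 0.2, 1: 0.8}}

# Untreated outcome model averaged over the TREATED mediator distribution.
ey_0_i1 = mediation_formula(outcome_mean_untreated, mediator_pmf_treated, covariate_pmf)
# Untreated outcome model averaged over the UNTREATED mediator distribution.
ey_0 = mediation_formula(outcome_mean_untreated, mediator_pmf_untreated, covariate_pmf)
indirect_effect = ey_0_i1 - ey_0

print(round(ey_0_i1, 4), round(ey_0, 4), round(indirect_effect, 4))
```

Only the untreated outcome means enter the calculation; the treated arm contributes solely through its mediator distribution, which is the property the text highlights.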
RCTs often test a candidate drug designed to target a specific biomarker believed to result in an improved clinical outcome. These RCTs present a natural causal mediation question: what is the effect of the drug on the clinical outcome through (or mediated by) the biomarker? However, if the biomarker, the mediator, is measured by an assay, measurements can fall below the assay limit. Since the mediator is an effect of the treatment and a cause of the outcome, the problem of an assay limit, sometimes referred to as left censoring, is compounded in mediation analysis. In our motivating example, we are interested in estimating the effect of a new curative HIV treatment on viral suppression, mediated by its effect on viral persistence. However, for many participants the viral persistence measures fall below the assay lower limit, leading to a left-censored mediator. The problem of left censoring has been addressed for scenarios when the censored variable is the outcome8 or the predictor,9 but these methods have not been applied to causal mediation, where the censored mediator is both an outcome and a predictor.
In this paper we develop methods for addressing the challenge of mediators with assay limits when estimating causal indirect and direct effects. The estimation methods described here rely on the Mediation Formula, so they apply to natural,3,4 pure,3 organic,6,7 and separable10 indirect and direct effects. The paper is organized as follows. Section 2 describes a general semi-parametric estimator for indirect and direct effects. Section 3 focuses on estimating indirect and direct effects for mediators with assay limits and proposes two estimation methods that leverage observed data to account for the informatively missing mediator: (1) model extrapolation and (2) numerical integration and optimization of the observed data likelihood function. Section 4 presents the results of a simulation study comparing these two estimation methods to a simple imputation technique replacing censored values with assay limit divided by 2, and evaluating the performance of the methods for a small and large sample size with many mediator values below the assay limit. Section 5 illustrates the proposed methods with HIV treatment interruption data from the AIDS Clinical Trials Group11 and estimates the indirect effect of a curative HIV treatment on viral rebound through week 4 mediated by two viral persistence measures.
2 |. SEMI-PARAMETRIC ESTIMATORS FOR PURE INDIRECT EFFECT/ORGANIC INDIRECT EFFECT RELATIVE TO $a = 0$
Baron and Kenny2 popularized the product method to estimate indirect and direct effects for a continuous mediator and outcome, which involves fitting linear models for the mediator $M$ and the outcome $Y$. If the models are correctly specified, causal mediation analysis approaches that are identified by the Mediation Formula, including organic,6,7 natural,3,4 and separable10 effects, result in the product method formulas.
For pure3 and organic7 indirect and direct effects relative to $a = 0$, the product method still holds if the outcome model has a treatment-mediator interaction. However, non-linear mediator or outcome models and the difficulty of correctly specifying both a mediator model and an outcome model limit the applicability of the product method. Imai et al1 suggest a simulation approach that generalizes to non-linear models but still relies on specifying an outcome and a mediator model. An alternative approach6,7 requires only specification of an outcome model under no treatment to estimate $E[Y_{0,I=1}]$. For observed randomized treatment data $(A_i, M_i, C_i, Y_i)$, $i = 1, \dots, n$, with $n_1$ treated and $n_0$ untreated subjects, an estimate for $E[Y_{0,I=1}]$ is:
$$\hat{E}[Y_{0,I=1}] = \frac{1}{n_1} \sum_{i: A_i = 1} \hat{E}[Y \mid M = M_i, A = 0, C = C_i].$$
To estimate $E[Y \mid M = m, A = 0, C = c]$ we specify an outcome model for untreated subjects and obtain predicted values for treated subjects. The remaining terms $E[Y_0]$ and $E[Y_1]$ are estimated without parametric assumptions using sample averages of the outcome for untreated and treated observations, respectively, yielding a semi-parametric estimator for the indirect and direct effects. The estimation procedure for any outcome $Y$ is summarized by the following three steps:
- Fit a generalized linear model with link function $g$ for $E[Y \mid M = m, A = 0, C = c]$ on untreated subjects to obtain the estimated model coefficients.
- Estimate $\hat{E}[Y \mid M = M_i, A = 0, C = C_i]$ for all $i$ with $A_i = 1$.
- Estimate the pure indirect effect/organic indirect effect relative to $a = 0$ as
$$\frac{1}{n_1} \sum_{i: A_i = 1} \hat{E}[Y \mid M = M_i, A = 0, C = C_i] - \frac{1}{n_0} \sum_{i: A_i = 0} Y_i$$
and estimate the pure direct effect/organic direct effect relative to $a = 0$ as
$$\frac{1}{n_1} \sum_{i: A_i = 1} Y_i - \frac{1}{n_1} \sum_{i: A_i = 1} \hat{E}[Y \mid M = M_i, A = 0, C = C_i].$$
When $M$ is fully observed, this semi-parametric estimator reduces the necessary modeling assumptions by only requiring outcome model specification under $A = 0$.
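The three steps for a fully observed mediator can be sketched as follows for a binary outcome with a logit link. This is a minimal illustration under stated assumptions, not the authors' implementation: the simulated data, the single binary common cause, and the helper names (`fit_logistic`, `organic_effects`) are hypothetical, and the logistic fit uses a plain Newton-Raphson routine so the example needs only the standard library.

```python
import math
import random


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Logistic regression by Newton-Raphson; each row is a covariate list."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = expit(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(p):
                grad[j] += (yi - mu) * x[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def organic_effects(a, m, c, y):
    """Return (indirect, direct) effect estimates relative to a = 0."""
    untreated = [i for i in range(len(a)) if a[i] == 0]
    treated = [i for i in range(len(a)) if a[i] == 1]
    # Step 1: outcome model fitted on untreated subjects only.
    beta = fit_logistic([[1.0, m[i], c[i]] for i in untreated],
                        [y[i] for i in untreated])
    # Step 2: predicted untreated-outcome means at the treated subjects' (M, C).
    preds = [expit(beta[0] + beta[1] * m[i] + beta[2] * c[i]) for i in treated]
    ey_0_i1 = sum(preds) / len(preds)
    # Step 3: contrasts with the arm-specific sample means of Y.
    ey_0 = sum(y[i] for i in untreated) / len(untreated)
    ey_1 = sum(y[i] for i in treated) / len(treated)
    return ey_0_i1 - ey_0, ey_1 - ey_0_i1


# Hypothetical simulated trial: treatment lowers the mediator by 1 unit.
rng = random.Random(2024)
n = 2000
a = [rng.randint(0, 1) for _ in range(n)]
c = [rng.randint(0, 1) for _ in range(n)]
m = [2.0 + 0.5 * ci - 1.0 * ai + rng.gauss(0.0, 1.0) for ai, ci in zip(a, c)]
y = [int(rng.random() < expit(1.0 - 0.8 * mi + 0.3 * ci + 0.2 * ai))
     for ai, mi, ci in zip(a, m, c)]
indirect, direct = organic_effects(a, m, c, y)
print(round(indirect, 3), round(direct, 3))
```

By construction the two estimates add up to the difference in mean outcomes between the treated and untreated arms, so the decomposition of the total effect holds exactly in the sample.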
3 |. SEMI-PARAMETRIC ESTIMATORS FOR INDIRECT AND DIRECT EFFECTS RELATIVE TO $a = 0$ FOR MEDIATORS WITH ASSAY LIMITS
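If $M$ is left censored by an assay lower limit, the missing mediator information can be supplemented by including a model for the mediator. Several existing estimation procedures to handle left censoring can be applied to jointly estimate the parameters of the mediator and outcome models. The three-step estimation approach introduced in Section 2 can be extended to mediators with an assay lower limit as follows. Let AL denote the assay lower limit, and for each subject let an indicator record whether the mediator value is above AL (indicator equal to 1) or below AL (indicator equal to 0).
- Jointly estimate $\sigma^2$, the variance of $M$, and generalized linear models, each with its own link function, for the mediator and the outcome on treated and untreated subjects, respectively. We present methods to fit these models in Sections 3.1 and 3.2.
- For subjects with $A_i = 1$ whose mediator value is below AL, sample a mediator value from the fitted parametric mediator distribution with the estimated mean, truncated to values below AL.
- For all $i$ with $A_i = 1$, predict $\hat{E}[Y \mid M = M_i, A = 0, C = C_i]$, using the sampled mediator value where the measured value is below AL.
- Estimate the pure indirect effect/organic indirect effect relative to $a = 0$ by the average of these predictions over treated subjects minus the average outcome among untreated subjects, and estimate the pure direct effect/organic direct effect relative to $a = 0$ by the average outcome among treated subjects minus the average of these predictions.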
For Step 1 of the algorithm, approaches to parameter estimation with left censoring can be roughly classified into procedures that directly maximize an observed data likelihood with numerical optimization procedures and procedures that maximize a function of the full data likelihood.
The semi-parametric estimator is generalizable to any observed $M$, $Y$, and $C$. Based on the motivating example of the HIV curative treatment, we consider a continuous mediator $M$ measured with an assay lower limit and a binary outcome $Y$. The pretreatment common causes of the mediator and the outcome are collected in a vector $C$ and can be of any type. For $i = 1, \dots, n$, let $M_i$ be the true mediator values. As in the motivating example, we assume that only data for $A = 0$ is available and that treatment shifts the mediator distribution by a specified amount. We focus on estimation of the indirect effect relative to $a = 0$, because the indirect effect does not require outcome data under $A = 1$. We assume that given $A$ and $C$ the mediator follows a normal distribution with a mean that depends on $C$ and a constant variance $\sigma^2$. The observed mediator values equal the true values $M_i$ for values measured above the assay lower limit. Further assume a logistic regression model for $Y$ given $M$ and $C$.
The full data consist of the outcome, the true mediator value, and the common causes for $i = 1, \dots, n$, and the observed data consist of the outcome, the observed mediator value, and the common causes for $i = 1, \dots, n$.
3.1 |. Extrapolation method
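A simplification of the left-censored mediator problem can be achieved by separately estimating the mediator and outcome models, which leads to well-known results and consistent estimates. To estimate the parameters of the mediator model in step one of the algorithm, we apply the maximum likelihood approach described in Aitkin.12 Write $\mu_i$ for the mediator-model mean of subject $i$ given $C_i$, $\sigma$ for the standard deviation, and AL for the assay lower limit. The observed data likelihood of the mediator distribution is given by
$$L = \prod_{i: M_i > \mathrm{AL}} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(M_i - \mu_i)^2}{2\sigma^2}\right\} \prod_{i: M_i \le \mathrm{AL}} \Phi\left(\frac{\mathrm{AL} - \mu_i}{\sigma}\right),$$
where $\Phi$ is the standard normal cumulative distribution function.
The log likelihood is
$$\ell = \sum_{i: M_i > \mathrm{AL}} \left[-\log\left(\sqrt{2\pi}\,\sigma\right) - \frac{(M_i - \mu_i)^2}{2\sigma^2}\right] + \sum_{i: M_i \le \mathrm{AL}} \log \Phi\left(\frac{\mathrm{AL} - \mu_i}{\sigma}\right).$$
McCulloch13 derived an iterative maximum likelihood (ML) procedure to estimate $\sigma$. After initializing the parameter values, the procedure updates $\sigma$ using the data and the parameter values from the previous iteration, through an expression that involves $\phi$, the density of the standard normal distribution. Then, the regression coefficients of the mediator model, collected in a vector $\theta$, are updated by solving the score equation with respect to these coefficients:
$$\sum_{i=1}^{n} \left(\tilde{M}_i - \mu_i\right) \frac{\partial \mu_i}{\partial \theta} = 0, \tag{7}$$
where $\tilde{M}_i = M_i$ for values above AL. Estimation of the coefficients can be simplified by recognizing the familiar form of the score function, which resembles the normal equations given by least squares estimation with left censored observations replaced by
$$\tilde{M}_i = \mu_i - \sigma\, \frac{\phi(z_i)}{\Phi(z_i)}, \qquad z_i = \frac{\mathrm{AL} - \mu_i}{\sigma}. \tag{8}$$
The “normal equations” (7) suggest a simple iterative imputation procedure for estimating the coefficients of the mediator model. After imputing the censored values using (8), a simple OLS model is fit to update the estimates of the coefficients. The algorithm iterates between updating $\sigma$ and the coefficients until a convergence criterion is met. Details of the derivations are presented in Appendix B.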
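The iterative imputation idea can be sketched as below for a normal mediator with one binary common cause. This is an illustration under assumptions rather than the exact procedure of the cited references: the variance update shown is the standard EM update for a censored normal model, swapped in because the specific update expression is not reproduced here, and the function name `fit_censored_normal` and the simulated data are hypothetical.

```python
import math
import random


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ols(x, y):
    """Least squares fit of y on one covariate x; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def fit_censored_normal(m, above, c, limit, n_iter=500, tol=1e-10):
    """EM-style fit of M | C ~ N(a0 + a1 * C, sigma^2) with left censoring.

    m[i] is used only when above[i] is True (value measured above the limit).
    """
    n = len(m)
    # Starting values: substitute the limit for censored values.
    start = [mi if ok else limit for mi, ok in zip(m, above)]
    a0, a1 = ols(c, start)
    sigma = math.sqrt(sum((s - a0 - a1 * ci) ** 2 for s, ci in zip(start, c)) / n)
    for _ in range(n_iter):
        filled, extra_var = [], 0.0
        for mi, ok, ci in zip(m, above, c):
            if ok:
                filled.append(mi)
            else:
                mu = a0 + a1 * ci
                z = (limit - mu) / sigma
                lam = norm_pdf(z) / norm_cdf(z)
                # Conditional mean of M given M < limit.
                filled.append(mu - sigma * lam)
                # Conditional variance of M given M < limit.
                extra_var += sigma ** 2 * (1.0 - z * lam - lam ** 2)
        # OLS on the data with censored values replaced by conditional means.
        new_a0, new_a1 = ols(c, filled)
        rss = sum((f - new_a0 - new_a1 * ci) ** 2 for f, ci in zip(filled, c))
        new_sigma = math.sqrt((rss + extra_var) / n)
        change = abs(new_a0 - a0) + abs(new_a1 - a1) + abs(new_sigma - sigma)
        a0, a1, sigma = new_a0, new_a1, new_sigma
        if change < tol:
            break
    return a0, a1, sigma


# Hypothetical data: true model M = 2.0 + 0.5 * C + N(0, 1), limit 1.96.
rng = random.Random(7)
limit = 1.96
c = [rng.randint(0, 1) for _ in range(2000)]
true_m = [2.0 + 0.5 * ci + rng.gauss(0.0, 1.0) for ci in c]
above = [t > limit for t in true_m]
m = [t if ok else None for t, ok in zip(true_m, above)]
a0_hat, a1_hat, sigma_hat = fit_censored_normal(m, above, c, limit)
print(round(a0_hat, 2), round(a1_hat, 2), round(sigma_hat, 2))
```

Each pass imputes censored values with their conditional mean given the current parameters, refits by ordinary least squares, and then updates the standard deviation, which mirrors the alternation described in the text.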
When the mediator and outcome models are fit separately, the outcome model is a logistic regression model fit on untreated subjects with mediator values above the assay limit. Since the outcome model conditions on $M$, a logistic regression model that is fit on values of $M$ above the assay limit is extrapolated to values below the assay limit.
The advantage of the extrapolation method is implementation speed. However, since the outcome model is only fit on untreated subjects with mediator values above the assay lower limit, the sample used to fit the outcome model can be quite small, leading to large SEs. Furthermore, predictions for observations with mediator values below the assay limit rely entirely on extrapolating the fitted outcome model beyond the range of the data used to fit it.
3.2 |. Numerical optimization and integration
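Cole et al14 estimated the odds of HIV treatment naivete as a function of HIV viral load measured with a lower limit of detection by numerical integration and optimization of the observed data log likelihood. We extend this approach to estimate causal indirect and direct effects. The observed data likelihood is given by:
$$L = \prod_{i: M_i > \mathrm{AL}} f_{Y \mid M, C}(Y_i \mid M_i, C_i)\, f_{M \mid C}(M_i \mid C_i) \prod_{i: M_i \le \mathrm{AL}} \int_{-\infty}^{\mathrm{AL}} f_{Y \mid M, C}(Y_i \mid m, C_i)\, f_{M \mid C}(m \mid C_i)\, dm.$$
Under the data generating assumptions in Section 3, the log likelihood is
$$\ell = \sum_{i: M_i > \mathrm{AL}} \left[ Y_i \log p_i(M_i) + (1 - Y_i) \log\{1 - p_i(M_i)\} - \log\left(\sqrt{2\pi}\,\sigma\right) - \frac{(M_i - \mu_i)^2}{2\sigma^2} \right] + \sum_{i: M_i \le \mathrm{AL}} \log \int_{-\infty}^{\mathrm{AL}} p_i(m)^{Y_i} \{1 - p_i(m)\}^{1 - Y_i}\, \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(m - \mu_i)^2}{2\sigma^2}\right\} dm,$$
where $p_i(m)$ is the logistic-model probability of $Y = 1$ given $M = m$ and $C = C_i$, $\mu_i$ is the mediator-model mean given $C_i$, and AL is the assay lower limit.
The R programming language has efficient built-in functions for numerical integration and optimization, `integrate()` and `optim()`, to accommodate the integrated density function and the nonlinearity in the parameters. `optim()` was initialized with parameter values estimated from imputing the left censored mediator values with half the assay limit. Since $\sigma$ must be nonnegative, we included the argument `method = "L-BFGS-B"` in `optim()`, which implements a quasi-Newton optimization method with box constraints.15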
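To make the structure of the observed data log likelihood concrete, the sketch below evaluates it in Python for a normal mediator model and a logistic outcome model with one common cause. It is only an illustration of the objective function: the parameter layout (`alpha`, `beta`, `sigma`), the composite Simpson rule used in place of R's `integrate()`, and the toy records are assumptions, and no optimizer is included.

```python
import math


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simpson(f, lo, hi, n=400):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def log_lik_contribution(y, m, above, c, limit, alpha, beta, sigma):
    """Log likelihood contribution of one untreated subject.

    alpha = (a0, a1): mediator mean a0 + a1 * c (hypothetical layout).
    beta = (b0, b_m, b_c): logit P(Y = 1) = b0 + b_m * m + b_c * c.
    """
    mu = alpha[0] + alpha[1] * c

    def joint(mv):
        p = expit(beta[0] + beta[1] * mv + beta[2] * c)
        f_y = p if y == 1 else 1.0 - p
        return f_y * norm_pdf((mv - mu) / sigma) / sigma

    if above:
        return math.log(joint(m))
    # Censored: integrate the joint density over mediator values below the
    # limit; the lower bound sits 10 SDs out, where the density is negligible.
    lower = min(mu, limit) - 10.0 * sigma
    return math.log(simpson(joint, lower, limit))


def log_likelihood(data, limit, alpha, beta, sigma):
    """Sum of contributions over records (y, m, above, c)."""
    return sum(log_lik_contribution(y, m, above, c, limit, alpha, beta, sigma)
               for y, m, above, c in data)


# Hypothetical toy records; m is None when the value is below the limit.
toy = [(1, 2.7, True, 0), (0, 3.1, True, 1), (1, None, False, 1), (0, None, False, 0)]
value = log_likelihood(toy, 1.96, (2.0, 0.5), (0.3, -0.8, 0.2), 1.0)
print(round(value, 4))
```

A function of this form could be handed to a box-constrained optimizer with a positive lower bound on `sigma`, in the spirit of the `L-BFGS-B` setup described above.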
4 |. SIMULATION STUDY
This simulation study evaluates the proposed estimators for the pure indirect effect/organic indirect effect of $A$ on $Y$ relative to $a = 0$ through a mediator $M$ that is subject to an assay lower limit. The simulated data closely follow the estimated data distribution of the HIV cure data analyzed in Section 5 and the assumed data models from Section 3. The estimation target of Section 5 is the indirect effect of a hypothetical HIV curative treatment on viral suppression through week 4 after withdrawal of HIV medications, $Y$, mediated by two HIV viral persistence measures $M$. The simulation focuses on one measure of viral persistence, cell-associated HIV RNA. We simulate a single binary pre-treatment common cause of the mediator and the outcome, $C$. The hypothetical treatment is assumed to cause a shift in the mediator distribution given $C$. Appendix C provides more details on estimating the indirect effect of a hypothetical treatment assumed to shift the distribution of the mediator. In the simulations, under $A = 0$, the distribution of $M$, the viral reservoir, given $C$ is normal with a mean that depends on $C$ and a constant variance. The distribution of $Y$, viral suppression through week 4, is Bernoulli with the probability of $Y = 1$ given by a logistic model in $M$ and $C$. The simulated scenarios evaluate the effect of different sample sizes and mediator shifts on the estimators. A moderate sample ($n = 100$) resembles our data example, and a larger sample ($n = 500$) reflects large-sample properties. Mediator shifts on the log10 scale reflect a range of biologically significant effects of a hypothetical curative intervention16: 0.5, 1, and 2. For each sample size, 2000 datasets were simulated.
The simulation study evaluates the bias, variance, and coverage probability of the proposed methods from Section 3: (1) extrapolation method (Section 3.1), (2) numerical optimization and integration of the joint outcome-mediator log likelihood (Section 3.2), and (3) imputing censored mediator values with half the assay lower limit.
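The performance measures can be computed from the simulation replicates as in the short sketch below. The function name `summarize` and the four toy replicates are hypothetical and serve only to show the arithmetic behind bias, rMSE, and coverage probability.

```python
import math


def summarize(estimates, intervals, truth):
    """Return (bias, rMSE, coverage) over simulation replicates."""
    n = len(estimates)
    bias = sum(e - truth for e in estimates) / n
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    # Share of replicates whose interval contains the true value.
    coverage = sum(1 for lo, hi in intervals if lo <= truth <= hi) / n
    return bias, rmse, coverage


# Hypothetical replicates of an indirect effect whose true value is 0.15.
estimates = [0.10, 0.20, 0.15, 0.17]
intervals = [(0.05, 0.16), (0.16, 0.30), (0.10, 0.20), (0.12, 0.22)]
bias, rmse, coverage = summarize(estimates, intervals, truth=0.15)
print(round(bias, 4), round(rmse, 4), coverage)
```

Because rMSE combines bias and variance, an estimator with small bias can still have a large rMSE when its estimates vary widely across replicates.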
Figure 2 and Table 1 show the influence of sample size and mediator shift size on the performance of the estimators. Larger sample sizes and smaller mediator shifts are associated with estimators with lower variance. Since all estimation procedures involve outcome model extrapolation, a larger mediator shift is associated with extrapolations further outside of the observed range of mediator values. For every evaluated sample size and mediator shift, numerical optimization has the lowest rMSE and a coverage probability close to 95%. It also has the smallest absolute bias in every scenario except $n = 100$ with a shift of 2.0, where imputing AL/2 has a slightly smaller bias (0.009 vs −0.018). The extrapolation method has the highest variance, attributed to fitting the outcome model on a limited sample of mediator values above the assay limit. The high variance of the extrapolation method leads to a coverage probability close to 95% even though the method had a high bias. Imputing values below the assay limit with AL/2 led to high bias and coverage of about 85% even at the larger sample size of 500.
FIGURE 2.

Simulation study results evaluating methods to estimate the indirect effect of $A$ on $Y$ relative to $a = 0$ mediated by $M$ with a lower assay limit of 1.96. The simulation study is modeled after the HIV cure application of Section 5, which seeks to estimate the organic indirect effect relative to $a = 0$/pure indirect effect of curative HIV treatments that shift the distribution of HIV persistence measures downwards. As with the HIV motivating example, the simulation study evaluates the estimators' performance under varying shifts in the log-transformed distribution of the mediator: 0.5, 1.0, 1.5, and 2.0. The simulation study evaluates the bias, variance, and coverage probability of the estimates using 2000 simulated datasets. The methods considered are: (1) extrapolation method (Section 3.1), (2) numerical optimization and integration of the joint outcome-mediator log likelihood (Section 3.2), and (3) imputing censored mediator values with half the assay lower limit. The rows depict sample sizes of $n = 100$ or $n = 500$; columns display mediator shifts of 0.5, 1, and 2. The true indirect effects 0.07, 0.15, and 0.28 for shifts of 0.5, 1, and 2, respectively, are displayed as dashed lines.
TABLE 1.
Results of a simulation study evaluating three methods for estimating the organic indirect effects when the mediator has an assay lower limit.
| Sample size | Mediator shift^a | Estimation method^b | rMSE | Bias | Coverage probability |
|---|---|---|---|---|---|
| 100 | 0.5 | Extrapolation | 0.166 | −0.014 | 0.941 |
| 100 | 0.5 | Numerical optimization | 0.044 | −0.000 | 0.942 |
| 100 | 0.5 | AL/2 | 0.053 | 0.013 | 0.917 |
| 100 | 1.0 | Extrapolation | 0.219 | −0.032 | 0.938 |
| 100 | 1.0 | Numerical optimization | 0.082 | −0.003 | 0.942 |
| 100 | 1.0 | AL/2 | 0.095 | 0.020 | 0.917 |
| 100 | 1.5 | Extrapolation | 0.261 | −0.057 | 0.940 |
| 100 | 1.5 | Numerical optimization | 0.112 | −0.009 | 0.941 |
| 100 | 1.5 | AL/2 | 0.124 | 0.018 | 0.918 |
| 100 | 2.0 | Extrapolation | 0.293 | −0.086 | 0.943 |
| 100 | 2.0 | Numerical optimization | 0.134 | −0.018 | 0.941 |
| 100 | 2.0 | AL/2 | 0.143 | 0.009 | 0.922 |
| 500 | 0.5 | Extrapolation | 0.078 | −0.003 | 0.948 |
| 500 | 0.5 | Numerical optimization | 0.019 | −0.000 | 0.934 |
| 500 | 0.5 | AL/2 | 0.026 | 0.013 | 0.851 |
| 500 | 1.0 | Extrapolation | 0.105 | −0.008 | 0.946 |
| 500 | 1.0 | Numerical optimization | 0.036 | −0.001 | 0.936 |
| 500 | 1.0 | AL/2 | 0.048 | 0.024 | 0.850 |
| 500 | 1.5 | Extrapolation | 0.125 | −0.015 | 0.949 |
| 500 | 1.5 | Numerical optimization | 0.049 | −0.002 | 0.933 |
| 500 | 1.5 | AL/2 | 0.061 | 0.029 | 0.851 |
| 500 | 2.0 | Extrapolation | 0.139 | −0.025 | 0.947 |
| 500 | 2.0 | Numerical optimization | 0.057 | −0.004 | 0.933 |
| 500 | 2.0 | AL/2 | 0.068 | 0.030 | 0.850 |
Notes: The simulation study is modeled after the HIV cure application of Section 5, which seeks to estimate the organic indirect effect of a curative HIV treatment that shifts the distribution of HIV persistence measures downwards. As with the HIV motivating example, the simulation study evaluates the estimators' performance under varying shifts in the log-transformed distribution of the mediator: 0.5, 1.0, 1.5, and 2.0. The simulation study evaluates the bias, variance, and coverage probability of the estimates using 2000 simulated datasets. The estimators evaluated are mediator-outcome extrapolation (Section 3.1), numerical optimization of the joint outcome-mediator log likelihood (Section 3.2), and imputing the values below the assay limit by AL/2.
^a Treatment-induced downward shift of the viral persistence measure distribution on the log10 scale.
^b Estimation methods: Extrapolation: mediator-outcome extrapolation (Section 3.1); Numerical optimization: numerical optimization of the joint outcome-mediator log likelihood (Section 3.2); AL/2: imputing censored mediator values with half the assay lower limit.
Abbreviations: Bias, average bias (estimate minus truth); Coverage probability, probability that the true value is contained in the 95% confidence interval, where for each of the 2000 simulated datasets the 95% confidence interval was estimated from 1000 bootstrapped samples using Efron’s percentile method; rMSE, root mean squared error.
5 |. THE INDIRECT EFFECT OF HIV CURATIVE TREATMENTS THAT SHIFT THE DISTRIBUTION OF THE VIRAL RESERVOIR
Antiretroviral therapy (ART), the current standard of care for HIV-infected patients, successfully suppresses the HIV viral load in the blood, but ART is not a cure and requires lifelong use.17 Additionally, ART use is associated with side effects, drug-drug interactions, and drug resistance, and ART may result in pill fatigue from daily use. Testing new curative treatments to replace ART18 requires an analytic treatment interruption (ATI) study, which interrupts ART until the time of viral rebound, when the viral load reaches a predefined threshold. In the absence of ART, the viral reservoir, which consists of latent cells infected with HIV, is activated and viral rebound quickly follows. The risk to participants and the cost of conducting an ATI study limit opportunities for large randomized clinical trials. Using ATI study data, Li et al11 found that biomarkers of a larger on-ART viral reservoir were associated with a shorter off-ART time to viral rebound.
An HIV treatment that meaningfully reduces the viral reservoir may indefinitely postpone viral rebound and maintain viral suppression. Causal mediation analysis provides a methodology for estimating the effect of an HIV curative treatment on viral suppression four weeks after ART interruption through viral persistence measures. Here we consider two viral persistence measures: cell-associated HIV RNA (CA-HIV-RNA) and single-copy HIV RNA (SCA-HIV-RNA). A visualization of the causal structure with CA-HIV-RNA as the mediator is presented in Figure 3. For viral suppression through week 4 after ATI, a relevant common cause of viral persistence measures and viral suppression is the type of HIV treatment regimen when HIV treatment is withdrawn: NNRTI-based ART versus non-NNRTI-based ART. With the ATI data from Li et al,11 we consider the effect of putative HIV curative treatments with hypothesized reductions in viral persistence measures on viral suppression through week 4 after ART interruption. In the available data, we observe data for $A = 0$, but the indirect effect relative to $a = 0$ can be estimated without outcome data under treatment with assumptions or hypotheses on the effect of the treatment on the mediator.7 The indirect effect can inform the reduction in the viral reservoir necessary to meaningfully increase the probability of viral suppression through week 4. Since measures of HIV viral persistence are subject to an assay limit, we applied the methods described in Section 3 to the ATI data. We assume that CA-HIV-RNA and SCA-HIV-RNA given the HIV treatment regimen, NNRTI-based (yes or no), are normally distributed on the log10 scale with a constant variance. This assumption is common in HIV research. A normal probability plot (Figure D1) provided in Appendix D supports that this is a reasonable assumption for these data.
FIGURE 3.

Causal diagram for the effect of an HIV curative treatment that shifts the distribution of the pre-analytic treatment interruption cell-associated HIV RNA.
We modeled the binary outcome $Y$, viral suppression through week 4, with logistic regression, assuming the same model for values below and above the assay limit. We consider the indirect effect of treatments that shift the mediator distributions downward by 0.5, 1.0, 1.5, and 2.0 log10 given $C$. Table 2 presents the indirect effects along with nonparametric bootstrap 95% confidence intervals based on 2000 bootstrapped samples and using Efron’s percentile method.
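Efron's percentile method can be sketched as follows. For brevity the example bootstraps a simple proportion (63 of 124 participants without viral rebound through week 4, as reported in the notes to Table 2) rather than the full indirect effect estimator; in practice the `statistic` argument would be the complete estimation procedure applied to each resampled dataset. The function name `percentile_ci` and the fixed seed are assumptions made for the illustration.

```python
import math
import random


def percentile_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Nonparametric bootstrap confidence interval, percentile method."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    reps = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = reps[int(math.floor(tail * (n_boot - 1)))]
    upper = reps[int(math.ceil((1.0 - tail) * (n_boot - 1)))]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# 63 of 124 untreated participants remained suppressed through week 4.
suppressed = [1] * 63 + [0] * 61
low, high = percentile_ci(suppressed, mean)
print(round(low, 3), round(high, 3))
```

The interval endpoints are simply empirical quantiles of the bootstrap replicates, so no normality assumption on the estimator is needed.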
TABLE 2.
Organic indirect effects of curative HIV-treatments that shift the distribution of HIV persistence measures downwards.
| HIV persistence measure | Mediator shift^a | Estimation method^b | Indirect effect^c | 95% CI^d |
|---|---|---|---|---|
| CA HIV-RNA^e | 0.50 | Extrapolation | 2.9% | (−2.03%, 21.7%) |
| CA HIV-RNA^e | 0.50 | Numerical optimization | 7.7% | (2.7%, 12.5%) |
| CA HIV-RNA^e | 0.50 | AL/2 | 6.4% | (2.0%, 10.2%) |
| CA HIV-RNA^e | 1.00 | Extrapolation | 8.7% | (−23.7%, 32.1%) |
| CA HIV-RNA^e | 1.00 | Numerical optimization | 15.1% | (5.4%, 23.6%) |
| CA HIV-RNA^e | 1.00 | AL/2 | 12.6% | (4.0%, 19.8%) |
| CA HIV-RNA^e | 1.50 | Extrapolation | 14.3% | (−27%, 39.4%) |
| CA HIV-RNA^e | 1.50 | Numerical optimization | 21.9% | (8.1%, 32.7%) |
| CA HIV-RNA^e | 1.50 | AL/2 | 18.4% | (5.9%, 28.0%) |
| CA HIV-RNA^e | 2.00 | Extrapolation | 19.5% | (−30.6%, 44.9%) |
| CA HIV-RNA^e | 2.00 | Numerical optimization | 27.8% | (10.8%, 39.6%) |
| CA HIV-RNA^e | 2.00 | AL/2 | 23.7% | (7.9%, 35.0%) |
| SCA HIV-RNA^f | 0.50 | Extrapolation | −21.7% | (−53.6%, 27.4%) |
| SCA HIV-RNA^f | 0.50 | Numerical optimization | 4.0% | (−1.3%, 8.6%) |
| SCA HIV-RNA^f | 0.50 | AL/2 | 5.1% | (−2.2%, 10.9%) |
| SCA HIV-RNA^f | 1.00 | Extrapolation | −25.0% | (−58%, 33.5%) |
| SCA HIV-RNA^f | 1.00 | Numerical optimization | 7.8% | (−2.7%, 16.6%) |
| SCA HIV-RNA^f | 1.00 | AL/2 | 9.9% | (−4.4%, 20.5%) |
| SCA HIV-RNA^f | 1.50 | Extrapolation | −28.1% | (−60.0%, 38.3%) |
| SCA HIV-RNA^f | 1.50 | Numerical optimization | 11.6% | (−4.0%, 23.7%) |
| SCA HIV-RNA^f | 1.50 | AL/2 | 14.6% | (−6.6%, 28.5%) |
| SCA HIV-RNA^f | 2.00 | Extrapolation | −31.0% | (−61%, 41.7%) |
| SCA HIV-RNA^f | 2.00 | Numerical optimization | 15.1% | (−5.4%, 30.1%) |
| SCA HIV-RNA^f | 2.00 | AL/2 | 18.8% | (−8.7%, 35%) |
Notes: The HIV persistence measures have an assay lower limit of 1.96 and −0.26 on the log10 scale for CA HIV-RNA and SCA HIV-RNA, respectively. The estimated probability of no viral rebound through week 4 without curative treatment was 63/124 or 51%.
^a Treatment-induced downward shift of the viral persistence measure distribution on the log10 scale.
^b Estimation methods: Extrapolation: mediator-outcome extrapolation (Section 3.1); Numerical optimization: numerical optimization of the joint outcome-mediator log likelihood (Section 3.2); AL/2: imputing censored mediator values with half the assay lower limit.
^c Risk difference scale (in percent).
^d Nonparametric bootstrap 95% confidence intervals on risk difference scale (in percent) calculated from 2000 bootstrapped samples using Efron’s percentile method.
^e Cell-Associated HIV-RNA, on-ART (N = 124).
^f Single-Copy HIV-RNA, on-ART (N = 94).
Since the numerical optimization approach performed best in the simulation studies, we will focus on those estimates. A 0.5 log10 downward shift in CA-HIV-RNA is associated with a 7.7% (95% CI: 2.7%, 12.5%) absolute increase in the probability of week 4 viral suppression after ATI. At the other extreme, a 100-fold reduction (equivalent to a 2 log10 downward shift) in CA-HIV-RNA is associated with a 27.8% (95% CI: 10.8%, 39.6%) absolute increase in the probability of week 4 viral suppression after ATI. In contrast, a 0.5 log10 reduction in SCA-HIV-RNA is associated with a 4.0% (95% CI: −1.3%, 8.6%) increase in the probability of week 4 viral suppression after ATI, and a 100-fold reduction (equivalent to a 2 log10 downward shift) in SCA-HIV-RNA is associated with a 15.1% (95% CI: −5.4%, 30.1%) increase in the probability of week 4 viral suppression. Thus, a sizeable mediator shift is necessary to substantially increase the probability of week 4 viral suppression after ATI, consistent with mathematical modeling.16
Lok and Bosch7 addressed a similar HIV curative treatment question and found a similar reduction in risk of viral rebound after a 0.5 log10 reduction of CA-HIV-RNA, 7.0% (95% CI: 1.7%, 13%). However, the maximum achievable risk reduction was estimated to be 13% (3.0%, 23%), consistent with a treatment that shifts all mediator values below the assay limit. The differences between our findings and those in Lok and Bosch7 are attributable to the difference in assumptions. Lok and Bosch7 assumed that how far viral reservoir values fall below the assay limit is not associated with viral suppression through week 4 after ATI. Thus, the maximal shift they could consider was a shift that moves all values below the assay limit. In contrast, our methods assume that the probability of viral suppression depends on how far below the assay limit a mediator value lies, so we can model larger treatment-induced shifts in the viral reservoir.
We evaluated the sensitivity of our estimates by artificially censoring mediator values (Table 3). The mediator values were censored by an artificial assay lower limit of 2.04 with 62 values censored and of 2.18 with 68 values censored. For the numerical optimization method and imputation of AL/2, the estimates with the artificial assay lower limit of 2.04 were similar to those with the observed assay lower limit of 1.96, without loss in precision. The larger assay lower limit of 2.18 resulted in a change in the estimates, possibly attributable to the small sample size. The extrapolation method resulted in different estimates as a result of only estimating the mediator model with the reduced set of values above the assay lower limit.
TABLE 3.
Sensitivity analysis: organic indirect effects of curative HIV-treatments that shift the distribution of HIV persistence measures downwards with artificial assay lower limits.
| Mediator shift (a) | Estimation method (b) | AL = 1.96 (c) | AL = 2.04 (d) | AL = 2.18 (e) |
|---|---|---|---|---|
| 0.50 | Extrapolation | 2.9% (−20%, 21.4%) | −7.2% (−31.5%, 17.9%) | 3.4% (−25.6%, 26.8%) |
| | Numerical optimization | 7.7% (2.4%, 12.4%) | 7.7% (2.7%, 12.5%) | 7.1% (1.6%, 12%) |
| | AL/2 | 8.7% (2.6%, 14.2%) | 8.7% (2.8%, 14.2%) | 7.5% (1.6%, 12.7%) |
| 1.00 | Extrapolation | 8.7% (−23.5%, 32%) | −4.8% (−37.1%, 27.2%) | 9.1% (−28.9%, 36.5%) |
| | Numerical optimization | 15.1% (4.8%, 23.6%) | 15.1% (5.5%, 23.9%) | 13.9% (3.3%, 22.8%) |
| | AL/2 | 16.8% (5.3%, 25.9%) | 16.9% (5.7%, 26.3%) | 14.7% (3.2%, 23.8%) |
| 1.50 | Extrapolation | 14.3% (−27.6%, 39.3%) | −2.4% (−41.5%, 34.8%) | 14.7% (−32.5%, 43.2%) |
| | Numerical optimization | 21.9% (7.1%, 32.7%) | 21.9% (8.2%, 32.9%) | 20.2% (4.9%, 31.7%) |
| | AL/2 | 24.1% (7.9%, 35.2%) | 24.1% (8.4%, 35.7%) | 21.2% (4.8%, 32.9%) |
| 2.00 | Extrapolation | 19.5% (−31.2%, 44.8%) | 0% (−44.6%, 40.6%) | 19.9% (−35.7%, 47.6%) |
| | Numerical optimization | 27.8% (9.5%, 39.5%) | 27.8% (10.9%, 39.9%) | 25.9% (6.6%, 38.7%) |
| | AL/2 | 30.3% (10.4%, 42%) | 30.3% (11.1%, 42.5%) | 27% (6.3%, 39.9%) |

Notes: The organic indirect effects relative to a = 0 are presented for the observed assay lower limit 1.96 as well as 2.04 and 2.18 (all on the log10 scale). Of the 124 observations, the numbers of values below the observed and artificial assay lower limits are 52, 61, and 68. The estimated probability of no viral rebound through week 4 without curative treatment was 63/124 or 51%.
(a) Treatment-induced downward shift of the viral persistence measure distribution on the log10 scale.
(b) Estimation methods: Extrapolation: mediator-outcome extrapolation (Section 3.1); Numerical optimization: numerical optimization of the joint outcome-mediator log likelihood (Section 3.2); AL/2: imputing censored mediator values with half the assay lower limit.
(c) Observed assay lower limit for CA-HIV-RNA. Risk difference scale (in percent) with 95% bootstrapped confidence intervals using Efron’s percentile method with 2000 bootstrapped samples.
(d) Artificial assay lower limit for CA-HIV-RNA of 2.04. Risk difference scale (in percent) with 95% bootstrapped confidence intervals using Efron’s percentile method with 2000 bootstrapped samples.
(e) Artificial assay lower limit for CA-HIV-RNA of 2.18. Risk difference scale (in percent) with 95% bootstrapped confidence intervals using Efron’s percentile method with 2000 bootstrapped samples.
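The confidence intervals in Table 3 use Efron's percentile method. A minimal Python sketch of that general technique follows. It is an illustration under simplifying assumptions, not the authors' code: the function names are hypothetical, the statistic is a plain proportion, and the example data are 63 ones and 61 zeros, chosen only to match the 63/124 proportion quoted in the table notes.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def percentile_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Efron percentile interval: resample with replacement, recompute the
    statistic, and read off the empirical alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo_idx = max(0, math.floor((alpha / 2.0) * n_boot))
    hi_idx = min(n_boot - 1, math.ceil((1.0 - alpha / 2.0) * n_boot) - 1)
    return replicates[lo_idx], replicates[hi_idx]

# Illustrative binary outcomes: 63 of 124 without viral rebound.
outcomes = [1] * 63 + [0] * 61
lower, upper = percentile_ci(outcomes, mean)
```

In a mediation analysis the statistic passed in would re-fit the mediator and outcome models on each resample and return the indirect effect, so the interval reflects the uncertainty of both models.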
6 |. DISCUSSION
Mediation analysis is a key methodological tool in the evaluation of new therapies, but when the mediator is a biomarker subject to an assay limit, additional estimation tools are required. Our semi-parametric estimates of the causal indirect and direct effects build on Lok and Bosch,7 which addressed the assay limit problem by assuming that the outcome is not affected by how far the mediator values fall below the assay limit. We assume that the outcome is associated with how far the mediator values fall below the assay limit, and make parametric assumptions on the mediator distribution and the expected outcome given mediator values above and below the assay limit. We developed two methods: (1) extrapolation and (2) numerical integration and optimization of the observed data likelihood function, and evaluated them against the ad hoc procedure of imputing censored mediator values with half the assay lower limit.
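To make the second method concrete, the following Python sketch writes out an observed-data log likelihood for a left-censored mediator and a binary outcome. It is a simplified illustration rather than the authors' implementation: it assumes a normal mediator with no covariates, a logistic outcome model, and Simpson's rule for the integral over mediator values below the assay limit. All function and parameter names are hypothetical.

```python
import math

def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def normal_pdf(m, mu, sigma):
    z = (m - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def outcome_prob(y, m, b0, b1):
    """P(Y = y | M = m) under the assumed logistic outcome model."""
    p = expit(b0 + b1 * m)
    return p if y == 1 else 1.0 - p

def censored_contribution(y, limit, mu, sigma, b0, b1, n_steps=400):
    """Simpson's rule for the integral of P(Y=y | m) f(m) over m < limit.

    The lower integration bound is truncated 10 SDs below min(mu, limit).
    """
    lower = min(mu, limit) - 10.0 * sigma
    h = (limit - lower) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        m = lower + k * h
        weight = 1 if k in (0, n_steps) else (4 if k % 2 == 1 else 2)
        total += weight * outcome_prob(y, m, b0, b1) * normal_pdf(m, mu, sigma)
    return total * h / 3.0

def log_likelihood(data, limit, mu, sigma, b0, b1):
    """data: (y, m) pairs, with m = None when the mediator is below the limit."""
    ll = 0.0
    for y, m in data:
        if m is None:
            ll += math.log(censored_contribution(y, limit, mu, sigma, b0, b1))
        else:
            ll += math.log(outcome_prob(y, m, b0, b1) * normal_pdf(m, mu, sigma))
    return ll
```

Estimates would then be obtained by passing the negative of `log_likelihood` to a numerical optimizer, for example a bound-constrained quasi-Newton routine with a positivity bound on `sigma`.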
A simulation study designed to emulate the HIV cure application compared the proposed methods to address the assay limit problem. Simulation scenarios included various sample sizes and mediator shifts. Numerical optimization of the observed data log likelihood performed best in all our simulation scenarios. Large mediator shifts can be evaluated using our methods, but the larger the mediator shift the more we extrapolate, and the larger the variability in the estimate, attributable to extending the model further beyond the observed mediator values.
We applied our methods to an HIV cure example that uses existing ART interruption data without curative treatment to evaluate the treatment-induced mediator shift required to substantially increase the probability of viral suppression through week 4. For a 2 log10 downward shift of CA-HIV-RNA, the largest HIV viral persistence shift considered, the indirect effect estimated by numerically optimizing and integrating the log likelihood was 27.8% (10.8%, 39.6%). For an equivalent shift in SCA-HIV-RNA, the indirect effect was 15.1% (−5.4%, 30.1%). Thus, targeting CA-HIV-RNA to delay viral rebound might be more effective than targeting SCA-HIV-RNA. Alternative explanations for the differing estimates between the two measures of viral persistence include the differing proportions of measurements below the assay limit for CA-HIV-RNA (52/124 [42%]) and SCA-HIV-RNA (60/94 [64%]) and measurement error in both measures of viral persistence. The proportions of measurements below the assay limit are constrained by the assay, but incorporating methodology to address left censoring combined with measurement error19 in the measures of viral persistence, the mediator, is an interesting area of future research.
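The way an indirect effect grows with the hypothesized mediator shift can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: all mediator values are treated as fully observed, there are no covariates, and the logistic coefficients and mediator values are made up rather than estimated from the study data. The function name is hypothetical.

```python
import math

def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def shifted_indirect_effect(mediators, b0, b1, shift):
    """Average predicted P(Y=1) after shifting every mediator value down by
    `shift`, minus the average predicted P(Y=1) at the unshifted values."""
    n = len(mediators)
    shifted = sum(expit(b0 + b1 * (m - shift)) for m in mediators) / n
    baseline = sum(expit(b0 + b1 * m) for m in mediators) / n
    return shifted - baseline

# Made-up log10 mediator values and coefficients (b1 < 0: a larger reservoir
# lowers the probability of viral suppression).
example_mediators = [1.5, 2.0, 2.2, 2.6, 3.0, 3.4]
effects = [
    shifted_indirect_effect(example_mediators, 4.0, -2.0, d)
    for d in (0.0, 0.5, 1.0, 1.5, 2.0)
]
```

With a negative mediator coefficient the effect is zero at no shift and increases with the shift, but larger shifts push more of the shifted values beyond the range where the outcome model was actually fitted.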
Hill et al16 used mathematical models to explore the HIV viral reservoir reduction necessary to substantially extend viral suppression. They found that a roughly 2000-fold reduction results in a median extension of viral suppression by about a year. The authors suggest using their models to ascertain uncertainty induced by the assay lower limit of viral reservoir measures. In contrast, we suggest direct estimation methods that incorporate viral persistence measures below the assay limit. Future research could combine our methods with their models to attain a better understanding of the viral reservoir reduction necessary to substantially extend viral suppression after ART withdrawal.
Assay limits are not unique to HIV research. Our methods could be useful in the area of environmental exposures with devices constrained by detection limits. For example, airborne transmission of SARS-CoV-2 is an established mode of viral infection.20 Estimating the effect of a preventative measure such as mask wearing on COVID-19 infection mediated by the quantity of aerosolized SARS-CoV-2 could provide insight into effective prevention of COVID-19 infection. Airborne viral particles such as SARS-CoV-2 are most commonly measured by quantitative reverse transcription polymerase chain reaction, which is subject to a lower limit of detection.21 Occupational exposure is another example of an active research area with a limit of detection. For example, in dental prosthetic laboratories with manual production, particulate matter is released when abrasion and polishing are performed. Particulate matter has been reported to directly cause lung cancer as well as other respiratory illnesses.22 Estimating the effect of an intervention aimed at reducing respiratory disease incidence by limiting (that is, mediated by) occupational exposure to particulate matter could lead to better methods of safeguarding the lab technicians. Particulate matter can be measured using an inductively coupled plasma mass spectrometer device that can only measure particulate matter above a limit of detection.22
We focused on mediation analysis with a continuous mediator and a binary outcome, but an advantage of the described methods is their adaptability to other outcome types. For example, if the outcome of interest is also a continuous biomarker, the methods for estimating the causal indirect and direct effects are easily modified by replacing the logistic regression model with a linear regression model or other generalized linear model. In the example of HIV cure, the outcome was a binary variable of whether or not there was viral rebound through week 4. Our methods can extend to time-to-event outcomes but will require additional methods and assumptions to handle right or interval censoring. This is an interesting subject for future research.
Our methodology applies to organic, natural, and separable indirect and direct effects in settings with pretreatment common causes of the mediator and outcome. Posttreatment common causes of the mediator and outcome are addressed by Lok 2020.23
ACKNOWLEDGMENTS
The authors would like to thank the volunteers who participated in the ACTG ART interruption studies, the ACTG, the study investigators, and Dr. Jonathan Z. Li (Brigham and Women's Hospital) for his leadership in collecting and generating the data we analyzed. This research and the underlying study data were supported by the NIH/NIAID under awards UM1 AI068634 and UM1 AI068636 and by the NSF under award DMS 1854934.
Funding information
National Institute of Allergy and Infectious Diseases, Grant/Award Numbers: UM1 AI068634, UM1 AI068636; Center for Hierarchical Manufacturing, National Science Foundation, Grant/Award Number: DMS 1854934
APPENDIX A. ESTIMATION METHODS IMPLEMENTATION IN R
The necessary code for implementing our estimation methods is available on GitHub (https://github.com/a-chernofsky/mediation_assay_lower_limit).
APPENDIX B. DERIVATIONS FOR THE EXTRAPOLATION METHOD
The extrapolation method fits separate models for the mediator and outcome to simplify the estimation in the presence of left censoring. The procedure to estimate the mediator distribution12 begins with estimation of the variance by differentiating the log likelihood with respect to the variance parameter. Solving the resulting score equation gives an iterative estimate of the variance. Similarly, we differentiate the log likelihood of the mediator model with respect to its regression parameters, which yields a familiar score function similar to the one for linear regression, as described in Section 3.1.
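The displayed equations of this appendix are not reproduced in this version of the text. As a hedged sketch of what such a derivation looks like for a left-censored normal linear model, the block below uses assumed notation that may differ from Section 3.1: mediator $M_i$ with covariates $x_i$, regression parameters $\beta$, standard deviation $\sigma$, assay lower limit $L$, and $n_{\mathrm{obs}}$ values at or above $L$; $\phi$ and $\Phi$ denote the standard normal density and distribution function.

```latex
\begin{align*}
\ell(\beta,\sigma)
  &= \sum_{i:\,M_i \ge L}\left[-\log\sigma - \frac{(M_i - x_i^\top\beta)^2}{2\sigma^2}\right]
   + \sum_{i:\,M_i < L}\log\Phi(z_i) + \text{const},
   \qquad z_i = \frac{L - x_i^\top\beta}{\sigma},\\[4pt]
\frac{\partial \ell}{\partial \sigma}
  &= \sum_{i:\,M_i \ge L}\left[-\frac{1}{\sigma} + \frac{(M_i - x_i^\top\beta)^2}{\sigma^3}\right]
   - \frac{1}{\sigma}\sum_{i:\,M_i < L} z_i\,\lambda(z_i),
   \qquad \lambda(z) = \frac{\phi(z)}{\Phi(z)},\\[4pt]
\sigma^2
  &= \frac{\sum_{i:\,M_i \ge L}(M_i - x_i^\top\beta)^2}
          {n_{\mathrm{obs}} + \sum_{i:\,M_i < L} z_i\,\lambda(z_i)},\\[4pt]
\frac{\partial \ell}{\partial \beta}
  &= \frac{1}{\sigma^2}\left[\sum_{i:\,M_i \ge L} x_i\,(M_i - x_i^\top\beta)
   - \sigma\sum_{i:\,M_i < L} x_i\,\lambda(z_i)\right]
   = \frac{1}{\sigma^2}\sum_{i} x_i\,(\tilde M_i - x_i^\top\beta).
\end{align*}
```

Here $\tilde M_i = M_i$ for observed values and $\tilde M_i = x_i^\top\beta - \sigma\lambda(z_i) = E[M_i \mid M_i < L]$ for censored values. The third line follows from setting $\partial\ell/\partial\sigma = 0$ and multiplying by $\sigma^3$; because $z_i$ itself depends on $\sigma$ and $\beta$, the expression is a fixed-point update rather than a closed form, which is why the estimate is iterative. The last line has the form of the linear regression score with censored values replaced by their conditional expectations.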
APPENDIX C. ESTIMATING THE INDIRECT EFFECT WITH A MEDIATOR SHIFT
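The expense and risk to participant safety limit the ability to collect randomized clinical trial (RCT) data for HIV curative treatments. Lok and Bosch7 propose a method to estimate the indirect effect of an HIV curative treatment on viral suppression, mediated by the viral reservoir. The indirect effect depends on treatment only through the distribution of the mediator under treatment. If the effect of treatment on the distribution of the mediator is known or hypothesized, we can therefore write the indirect effect as a function of the data under no treatment only. For example, assume that treatment reduces the viral reservoir by a fixed amount on the log10 scale, that is, the mediator distribution under treatment equals the distribution without treatment shifted downward by that amount. In that case, the indirect effect equals the expected outcome without treatment, evaluated with each mediator value shifted downward by that amount, minus the expected outcome without treatment.
Thus, the organic indirect effect can be estimated without treatment data by averaging the fitted outcome model over the downward-shifted mediator values and subtracting the estimated expected outcome without treatment.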
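The displayed equations of this appendix are not reproduced in this version of the text. The block below is a hedged reconstruction of the standard form of these expressions under assumed notation, which may differ from the authors': $A$ is treatment, $M_0$ and $M_1$ are the mediator without and with treatment, $C$ denotes pretreatment common causes, $\delta$ is the hypothesized downward shift, $Y_0$ is the outcome without treatment, $Y_{0,I=1}$ is the outcome under the organic intervention that gives the mediator its distribution under treatment, and $g(m,c) = E[Y \mid M = m, A = 0, C = c]$. It assumes that $M_1$ given $C$ has the same distribution as $M_0 - \delta$ given $C$.

```latex
\begin{align*}
E\!\left[Y_{0,I=1}\right] - E\!\left[Y_0\right]
  &= E\!\left[\,g(M_0 - \delta,\, C)\,\right] - E\!\left[Y \mid A = 0\right],\\[4pt]
\widehat{\mathrm{IE}}(\delta)
  &= \frac{1}{n_0}\sum_{i:\,A_i = 0} \hat g(M_i - \delta,\, C_i)
   \;-\; \frac{1}{n_0}\sum_{i:\,A_i = 0} Y_i ,
\end{align*}
```

where $n_0$ is the number of untreated participants and $\hat g$ is the fitted outcome model. When some $M_i$ fall below the assay limit, the first sum cannot be evaluated directly for those participants; their terms have to be replaced by an expectation over the estimated mediator distribution below the limit, which is the role of the estimation methods in Section 3.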
APPENDIX D. NORMAL PROBABILITY PLOT OF VIRAL PERSISTENCE MEASURES: LOG TRANSFORMED CA-HIV-RNA
FIGURE D1.

Normal probability plot for log transformed CA-HIV-RNA by NNRTI category at time of antiretroviral therapy interruption (yes or no). Theoretical quantiles based on normal distribution with mean 0 and SD calculated using the numerical optimization approach of Section 3.2.
DATA AVAILABILITY STATEMENT
Details on data collection methods can be found in Li et al.11 The data are available upon request from the last author.
REFERENCES
- 1. Imai K, Keele L, Yamamoto T. Identification, inference and sensitivity analysis for causal mediation effects. Stat Sci. 2010;25(1):51–71.
- 2. Baron RM, Kenny DA. The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173–1182.
- 3. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143–155.
- 4. Pearl J. Direct and indirect effects. Paper presented at: Proceedings of UAI-01, San Francisco, CA. 2001:411–420.
- 5. VanderWeele T. Explanation in Causal Inference: Methods for Mediation and Interaction. New York, NY: Oxford University Press; 2015.
- 6. Lok JJ. Defining and estimating causal direct and indirect effects when setting the mediator to specific values is not feasible. Stat Med. 2016;35(22):4008–4020.
- 7. Lok JJ, Bosch RJ. Causal organic indirect and direct effects: closer to the original approach to mediation analysis, with a product method for binary mediators. Epidemiology. 2021;32(3):412–420.
- 8. Lyles RH, Fan D, Chuachoowong R. Correlation coefficient estimation involving a left censored laboratory assay variable. Stat Med. 2001;20(19):2921–2933.
- 9. Lynn HS. Maximum likelihood inference for left-censored HIV RNA data. Stat Med. 2001;20(1):33–45.
- 10. Robins JM, Richardson TS. Alternative graphical causal models and the identification of direct effects. In: Shrout PE, Keyes KM, Ornstein K, eds. Causality and Psychopathology: Finding the Determinants of Disorders and their Cures. Oxford, UK: Oxford University Press; 2010:103–158.
- 11. Li JZ, Etemad B, Ahmed H, et al. The size of the expressed HIV reservoir predicts timing of viral rebound after treatment interruption. AIDS. 2016;30(3):343–353.
- 12. Aitkin M. A note on the regression analysis of censored data. Technometrics. 1981;23(2):161–163.
- 13. McCulloch CE. Maximum likelihood algorithms for generalized linear mixed models. J Am Stat Assoc. 1997;92(437):162–170.
- 14. Cole SR, Chu H, Nie L, Schisterman EF. Estimating the odds ratio when exposure has a limit of detection. Int J Epidemiol. 2009;38(6):1674–1680.
- 15. Byrd RH, Lu P, Nocedal J, Zhu C. A limited memory algorithm for bound constrained optimization. SIAM J Sci Comput. 1995;16(5):1190–1208.
- 16. Hill AL, Rosenbloom DI, Fu F, Nowak MA, Siliciano RF. Predicting the outcomes of treatment to eradicate the latent reservoir for HIV-1. Proc Natl Acad Sci. 2014;111(37):13475–13480.
- 17. WHO. Progress report on HIV, viral hepatitis and sexually transmitted infections 2019. Accountability for the global health sector strategies, 2016–2021. Geneva: World Health Organization. 2019.
- 18. Pitman MC, Lau JS, McMahon JH, Lewin SR. Barriers and strategies to achieve a cure for HIV. Lancet HIV. 2018;5(6):e317–e328.
- 19. Valeri L, Lin X, VanderWeele TJ. Mediation analysis when a continuous mediator is measured with error and the outcome follows a generalized linear model. Stat Med. 2014;33(28):4875–4890.
- 20. WHO. Transmission of SARS-CoV-2: implications for infection prevention precautions: scientific brief, 9 July 2020. Technical Report. Geneva, Switzerland: World Health Organization. 2020.
- 21. Kim HR, An S, Hwang J. An integrated system of air sampling and simultaneous enrichment for rapid biosensing of airborne coronavirus and influenza virus. Biosens Bioelectron. 2020;170:112656. doi: 10.1016/j.bios.2020.112656
- 22. Yildirim SA, Pekey B, Pekey H. Assessment of occupational exposure to fine particulate matter in dental prosthesis laboratories in Kocaeli, Turkey. Environ Monitor Assess. 2020;192(10):1–16.
- 23. Lok JJ. Organic direct and indirect effects with post-treatment common causes of mediator and outcome. 2020. doi: 10.48550/ARXIV.1510.02753
