Author manuscript; published in final edited form as: Stat Med. 2017 Feb 13;36(12):1924–1935. doi: 10.1002/sim.7237

Semiparametric profile likelihood estimation for continuous outcomes with excess zeros in a random-threshold damage-resistance model

John D Rice 1, Alex Tsodikov 2
PMCID: PMC5530377  NIHMSID: NIHMS851380  PMID: 28192863

Abstract

Continuous outcome data with a proportion of observations equal to zero (often referred to as semicontinuous data) arises frequently in biomedical studies. Typical approaches involve two-part models, with one part a logistic model for the probability of observing a zero and some parametric continuous distribution for modeling the positive part of the data. We propose a semiparametric model based on a biological system with competing damage manifestation and resistance processes. This allows us to derive a closed-form profile likelihood based on the retro-hazard function, leading to a flexible procedure for modeling continuous data with a point mass at zero. A simulation study is presented to examine the properties of the method in finite samples. We apply the method to a data set consisting of pulmonary capillary hemorrhage area in lab rats subjected to diagnostic ultrasound.

Keywords: Damage threshold modeling, profile likelihood, retro-hazard, semicontinuous data, semiparametric methods

1 Introduction

There is often scientific interest in quantifying the effects of some external stress on a biological system. An area of application of particular relevance is carcinogenesis, for which dose-response models have been used extensively [1]. Often, the statistical question addressed is that of estimation of a threshold below which the probability of toxicity is zero [2]. In this paradigm, the outcome is typically binary: presence or absence of some adverse effect. Another perspective involves analysis of time to failure of a system, as in cumulative damage/shock models [3, 4]. These models, which are also germane to the multi-stage models of carcinogenesis [1], suppose that after some number of insults, the system breaks down.

In both cases, the outcome data take a particular form, binary for the dose-response models and time-to-event for the cumulative damage models. If, however, we are confronted with data containing outcomes that may be either exactly zero or positive (but not necessarily discrete), then another approach is needed. This is known as semicontinuous data, and a large body of research exists on modeling this sort of outcome [5–8]. Such data may occur in experimental setups in which test animals are subjected to external stress and a measure of the damage caused by such pressures is obtained as the outcome. The motivation for this work is a data set consisting of 109 rats subjected to diagnostic ultrasound [9]. From previous studies, it is known that diagnostic ultrasound (DUS) can induce pulmonary capillary hemorrhage (PCH) in rats. This is of clinical relevance for human patients because it demonstrates the potential for pulmonary injury following ultrasound examinations (for example, examinations to diagnose conditions such as pulmonary edema, effusion, and embolism).

The outcome of interest in Miller [9] was the measured area of PCH for each rat, in mm², obtained using photographs from a stereomicroscope with digital camera. The marginal mean of the outcome for all rats, including those with no damage, was 17.63 mm². When restricted to those rats with positive damage, the mean area was 26.69 mm². As 33.9% of rats exhibited no hemorrhagic damage, there is a definite point mass at zero.

The usual approach in analysis of semicontinuous data is based on the two-part model of Aitchison [5], which models the probability of an outcome exactly equal to zero and the distribution of the outcome given that it is greater than zero separately. Over the years, this model has been extended in various ways. Siegel [6] uses what amounts to a profile likelihood method to obtain maximum likelihood estimates for the parameters of a noncentral chi-squared distribution with zero degrees of freedom (a distribution which contains a point mass at zero). Foster and Bravington [8] propose a model based on an extension of the Tweedie generalized linear model, specifically the compound Poisson formulation, in which the outcome $y_i = \sum_{j=1}^{N_i} w_{ij}$, where $N_i$ is Poisson and the $w_{ij}$ are gamma random variables. Polansky [10] provides a nonparametric method for estimation of the distribution function associated with a “nonstandard mixture” model (meaning a model with probability mass at known discrete points) using a combination of an empirical distribution function and a kernel estimate of a distribution function, but does not address regression modeling. Zhou and Liang [11] present a method for the analysis of skewed data with excess zeros based on a two-part model, with the probability of a zero outcome being observed following a logistic model and the continuous positive outcome’s conditional mean being modeled using a nonparametrically estimated smooth link function.

Our goal in this paper is to semiparametrically model data where the outcome represents some measure of damage to a biological system, in which two competing processes are at work. On the one hand, we have the damage manifestation process, which leads to expression of the damage in some observable form; on the other, we have the damage resistance process, which, up to a random, subject-specific threshold, may prevent the expression of the damage entirely, leading to an observed outcome of zero.

Figure 1 depicts schematically the relationship between applied stress and observed damage (left panel) and potential damage and observed damage (right panel) in two alternative models. The left panel represents a model for which the threshold is on the scale of some variable associated with the applied stress. The dose-response model of Crump [1] would be a special case of this: we would observe a binary outcome corresponding to whether or not the dose threshold was exceeded. The right panel shows the model corresponding to our proposed method, for which observed damage is equal to zero up to the threshold, from which point observed damage equals potential damage; in this case, the threshold is measured on the scale of damage itself.

Figure 1. Schematic relationship between applied stress and observed damage (left panel) and potential damage and observed damage (right panel) in two alternative models.

The remainder of this paper is structured as follows: in Section 2, we lay out the details of our model for the competing damage and resistance processes; in Section 3, we propose an estimator for the parametric part of the model based on a profile likelihood defined using a function analogous to the hazard in time-to-event models; Section 4 presents simulation results; and Section 5 describes an application of the proposed method to a study of pulmonary capillary hemorrhage in rats exposed to diagnostic ultrasound.

2 The competitive damage/resistance model

Consider a regression model based on the Lehmann [12] family of alternatives, proposed in the context of nonparametric testing of the equality of distribution functions: for some outcome X ∈ [0, ∞) and baseline cdf F, we have the model

$$P(X \le x \mid z) \equiv F_z(x) = [F(x)]^{\exp\{z'\beta\}}. \quad (1)$$

We may also write this as a linear transformation model [LTM; see 13, e.g.]: rearranging (1) yields $-\log[-\log F_z(x)] = -z'\beta - \log[-\log F(x)]$. Since $X \mid z$ has distribution function $F_z$, simple calculations show that the random variable $-\log[-\log F_z(X)]$ has density $\exp\{-x - e^{-x}\}$. Therefore,

$$g(X) = z'\beta + \epsilon, \quad (2)$$

is equivalent to (1), where $g(x) = -\log[-\log F(x)]$ is an unspecified increasing function and $\epsilon$ has density $\exp\{-x - e^{-x}\}$.

Our two-part model is based on an unknown baseline cdf $F$: if $D_i$ is the random variable representing the damage expression and $R_i$ the damage resistance capacity, then our observed data is

$$X_i = D_i\, 1(D_i > R_i), \quad (3)$$

where 1(·) is the indicator function, equal to 1 if · is true and zero otherwise. This model implies that we observe the damage Di if and only if it exceeds the resistance capacity of the organism Ri; otherwise we observe zero for the outcome. The probability model for Di and Ri is

$$P(R_i \le r) = [F(r)]^{\mu_i} \quad (4)$$
$$P(D_i \le d) = [F(d)]^{\eta_i}. \quad (5)$$

We refer to this as the competitive damage/resistance (CDR) model.
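Both $D$ and $R$ can be simulated from (4) and (5) by inverse-cdf sampling: if $U \sim \mathrm{Unif}(0, 1)$, then $F^{-1}(U^{1/\eta})$ has cdf $F^\eta$. The following R sketch is our illustration (not code from the paper), assuming for concreteness an exponential baseline $F(x) = 1 - e^{-x/10}$ and constant hypothetical values of $\eta_i$ and $\mu_i$:

```r
## Illustrative simulation of the CDR model (3)-(5); the exponential
## baseline and the values of eta and mu are assumptions of this sketch.
set.seed(1)
n   <- 500
eta <- rep(1, n)     # damage parameters:     P(D <= d) = F(d)^eta
mu  <- rep(0.5, n)   # resistance parameters: P(R <= r) = F(r)^mu

Finv <- function(p) -10 * log(1 - p)   # inverse of F(x) = 1 - exp(-x/10)

## inverse-cdf sampling: F^{-1}(U^{1/eta}) has cdf F^eta for U ~ Unif(0,1)
D <- Finv(runif(n)^(1 / eta))
R <- Finv(runif(n)^(1 / mu))
X <- D * (D > R)     # equation (3): damage observed only when D exceeds R

mean(X == 0)         # empirical point mass at zero; here mu/(eta + mu) = 1/3
```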

Although not explicitly a dose-response model, our approach is similar to that of, e.g., Cox [2] or Crump [1]. These authors, however, focused on estimation of the threshold, in contrast to our situation, where the threshold is random and dependent on the subject. One reference in which the threshold is random is Brockhoff and Muller [14], in which the authors make use of quasi-likelihood estimation in the analysis of repeated measures data. Dabrowska and Doksum [15] give an example of a linear transformation model for dose-response studies which has the form of (2), but where $\epsilon \sim N(0, 1)$ and $g(x) = \Phi^{-1}[F(x)]$, with $\Phi(\cdot)$ the standard normal cdf. It is not obvious for this model, however, how the regression coefficients should be interpreted. An advantage of our model (as will be discussed below) is a natural interpretation of the regression coefficients in terms of the probability of damage being greater in one group than another.

The model given by equations (4) and (5) induces dependence between the observed damage X and the resistance capacity R by the shared baseline cdf. That is to say, the fact that the baseline cdf is common to both the damage and resistance variables is a natural way to incorporate the inherent association of the two processes within an organism. In effect, the baseline cdf is analogous to the variance components in a mixed effects model: these are common to all subjects, but also give rise to the random effect which induces the correlation within subjects.

2.1 Biological motivation for the model

The biological motivation for this model derives from the concept in cancer etiology of growth-promoting and growth-inhibitory signals [16]. On the one hand, proto-oncogenes encourage cell proliferation, while on the other, tumor suppressor genes actively inhibit such proliferation. The failure of these tumor suppressor genes can lead to uncontrolled growth and ultimately to the development of a cancerous tumor, but in the normal course of cell functioning, these genes prevent any cancer from manifesting. In the context of our model, we may view the unobserved Ri as representative of the action of growth-inhibitory signals; Di, by contrast, corresponds to the action of growth-promoting signals. The event Di > Ri would then correspond to the point at which the tumor suppressor genes have failed and allowed a tumor to develop due to runaway cell proliferation.

Many studies in the field of cancer epidemiology simply address the question of what factors lead to the development of a tumor, which is analyzed using some binary response model such as logistic regression [17]. This approach produces estimates in the form of odds or risk ratios, but is unable to account for severity of disease in cases that do develop cancer. In contrast to studies seeking to determine the effect of various covariates on risk of developing cancer, other studies collect data on tumor size. However, this data is typically used to stratify an analysis of temporal trends in incidence [18, 19]. Essentially, this becomes an analysis of tumor size conditional on subjects having developed cancer. The model we propose here provides a unifying framework for the analysis of studies investigating the etiology of cancer whereby both incidence and severity of disease may be assessed jointly.

2.2 Specification of the model

Using equations (4) and (5), it may be shown that the cdf of the observed outcome X = D 1(D > R) will be

$$P(X \le x) = P[D\, 1(D > R) \le x] = \frac{\mu + \eta\, [F(x)]^{\eta + \mu}}{\eta + \mu}. \quad (6)$$

Note that for x = 0, the marginal cdf is equal to μ/(η + μ). This corresponds to a point mass at zero in the marginal distribution of damage. The intuition behind this is in the relative magnitudes of η and μ: the larger μ is relative to η, the greater the probability that no damage will be observed because of an increased resistance capacity.
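To make this explicit, the point mass follows in one line by conditioning on $D$ (with $D$ and $R$ conditionally independent given the parameters) and substituting $u = F(d)$:

$$P(X = 0) = P(D \le R) = \int_0^\infty \left[1 - F(d)^{\mu}\right] d\left[F(d)^{\eta}\right] = \int_0^1 \left(1 - u^{\mu}\right) \eta u^{\eta - 1}\, du = 1 - \frac{\eta}{\eta + \mu} = \frac{\mu}{\eta + \mu}.$$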

The parameters η and μ will incorporate covariates zi as follows:

$$\eta_i = e^{z_i'\beta_\eta}, \qquad \mu_i = \frac{\theta_i}{1 - \theta_i}\, \eta_i, \qquad \theta_i = \frac{e^{\beta_0 + z_i'\beta_\theta}}{1 + e^{\beta_0 + z_i'\beta_\theta}}, \quad (7)$$

where zi is a p × 1 vector. The parameter vectors βη, βθ are also each p × 1 vectors, but may have elements constrained to be zero if the corresponding covariate is not wanted in that part of the model. Model identifiability is possible due to the shared baseline cdf between the damage and resistance processes and the exclusion of an intercept term in ηi.

This parameterization follows by defining θi = μi/(ηi + μi), and then using a logistic link function to model θi. This allows for the interpretation of the intercept parameter β0 as log[P(D ≤ R)/P(D > R)] for a subject with covariate vector of 0, while the remaining regression coefficients in this part of the model have the usual interpretation as log odds ratios for the event {D ≤ R}.

The regression coefficients in the continuous part of the model (i.e., βη) have a similar interpretation. An example will serve to illustrate this point: suppose we have a single binary covariate Z, equal to 1 for treatment and zero for control, and associated regression coefficient β. Then $P(D \le d \mid Z = 1) = [F(d)]^{e^\beta}$ and $P(D \le d \mid Z = 0) = F(d)$. Therefore, as in Lehmann [12], we have

$$P(D_{Z=0} < D_{Z=1}) = \int F\, d\big(F^{e^\beta}\big) = \int e^\beta F^{e^\beta}\, dF = \frac{e^\beta}{1 + e^\beta}.$$

That is, β here is the logit of the probability that damage in a treated subject exceeds damage in a control subject. Thus, β < 0 implies a protective effect of treatment, while β > 0 implies a harmful effect.

The derivation of the profile likelihood that follows in Section 3 retains the original parameterization using only η and μ. This allows for simpler expressions throughout, but when it comes to implementation of the method, we will use the parameterization with η and θ.

3 Semiparametric estimation based on profile likelihood

3.1 Left censoring and the retro-hazard function

In order to obtain profile likelihood estimates for the regression parameters, by which we avoid having to specify the baseline cdf, we introduce the retro-hazard function, analogous to the hazard function in survival analysis. Specifically, consider a random variable T taking values on the interval (0, ∞). The baseline cumulative hazard, H(t), is defined as H(t) = − log S(t), where S(t) = P (T > t) is the survival function [20]. This works well for right-censored data, but is an inconvenient way to formulate the model for left-censored data. This arises, in our model, when resistance capacity exceeds damage: the observed outcome will be censored on the left, since we will know only that damage was less than the resistance capacity.

Instead, we define

$$H^*(x) \equiv -\log F(x), \quad (8)$$

where F (x) = P (X ≤ x) is the cdf. Lagakos et al. [21] introduced a similar function in the context of the analysis of right-truncated survival data, which they refer to as a “reverse-time hazard function.” Gross and Huber-Carol [22] further develop the ideas of the “retro-hazard,” but are also primarily interested in dealing with right-truncated data.

3.2 Counting process formulation

In the setting of left-censored data (see Appendix A in the Supplementary Materials), recall the counting process notation of survival analysis, where N(t) denotes the counting process that takes value zero until the event occurs, then jumps to 1 (right continuous); and Y (t), which takes value 1 while the subject is at risk of the event, and zero otherwise [left continuous by convention; see 20, p. 25].

For our purposes, we will imagine a reversal of the time scale [similar to the approach of 21], and define new processes

$$N^*(t) = 1 - N(t^-) \quad (9)$$
$$Y^*(t) = 1 - Y(t^+). \quad (10)$$

The process defined by (9) will be left continuous, while the process defined by (10) will be right continuous [somewhat different from the definitions given by 22, Sections 4.1–4.2]. Appendix A presents our derivation of the nonparametric maximum likelihood estimator of the retro-hazard function in the general case for independent left-censoring. Recall that for our model, the censoring process, while not independent, results in all censored observations being equal to zero: thus, censored observations contain no information about the retro-hazard function.

3.3 Profile likelihood for the CDR model

Based on the marginal cdf (6), and defining $\beta = (\beta_0, \beta_\theta', \beta_\eta')'$, we may now write the marginal likelihood for this data:

$$L(\beta; H^*) = e^{\ell_1(\beta) + \ell_2(\beta; H^*)}, \quad (11)$$

where

$$\ell_1(\beta) = \sum_{i: X_i = 0} \log \frac{\mu_i}{\eta_i + \mu_i}, \qquad \ell_2(\beta; H^*) = \sum_{i: X_i > 0} \log\left[-\eta_i\, e^{-(\eta_i + \mu_i) H^*(X_i)}\, dH^*(X_i)\right].$$

We show in Appendix B of the Supplementary Materials that substitution of the NPMLE of $H^*$ into (11) leads to a profile likelihood (over the infinite-dimensional $H^*$)

$$L(\beta; \hat H^*) \propto \prod_{i: X_i = 0} \frac{\mu_i}{\eta_i + \mu_i} \prod_{i: X_i > 0} \frac{\eta_i}{\sum_{j: 0 < X_j \le X_i} (\eta_j + \mu_j)}. \quad (12)$$

This is analogous to a partial likelihood for β, in that the right-hand side of (12) is proportional to the profile likelihood over H* [see Breslow’s contribution to the discussion of 23, pp. 216–217]. This implies that we may base our inferences about these parameters on

$$\ell_{pr}(\beta) = \sum_{i: X_i = 0} \left[\log \mu_i - \log(\eta_i + \mu_i)\right] + \sum_{i: X_i > 0} \Big[\log \eta_i - \log \sum_{j: 0 < X_j \le X_i} (\eta_j + \mu_j)\Big]. \quad (13)$$

Using the parameterization given by (7), we may rewrite (13) in the form we will be using for estimation:

$$\ell_{pr}(\beta) = \sum_{i: X_i = 0} \left[\beta_0 + z_i'\beta_\theta - \log\left(1 + e^{\beta_0 + z_i'\beta_\theta}\right)\right] + \sum_{i: X_i > 0} \Big[z_i'\beta_\eta - \log \sum_{j: 0 < X_j \le X_i} e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\Big]. \quad (14)$$

The variance-covariance matrix of the parameter estimates may be estimated consistently by $\mathcal{I}^{-1}(\hat\beta)$; see Appendix C in the Supplementary Materials for the derivation of $\mathcal{I}(\beta)$. A proof of the asymptotic normality of a similar estimator is given by Gross and Huber-Carol [22]; only slight modifications of their proof are necessary for our estimator.
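Estimation based on (14) is straightforward to implement. The following R sketch is ours (not the authors' software): it codes the profile log-likelihood directly, maximizes it numerically, and takes standard errors from the inverse of a numerically differentiated observed information; X denotes the outcome vector and Z an assumed n × p covariate matrix, with parameter ordering (β0, βθ, βη):

```r
## Sketch of maximizing the profile log-likelihood (14).
cdr_loglik <- function(par, X, Z) {
  p <- ncol(Z)
  a <- drop(par[1] + Z %*% par[2:(p + 1)])     # beta0 + z'beta_theta
  b <- drop(Z %*% par[(p + 2):(2 * p + 1)])    # z'beta_eta
  w <- exp(b) * (1 + exp(a))                   # eta_j + mu_j under (7)
  pos <- X > 0
  ## risk-set sums over {j : 0 < X_j <= X_i} for each positive X_i
  risk <- sapply(X[pos], function(x) sum(w[pos & X <= x]))
  sum(a[!pos] - log1p(exp(a[!pos]))) + sum(b[pos] - log(risk))
}

fit <- optim(rep(0, 2 * ncol(Z) + 1), cdr_loglik, X = X, Z = Z,
             method = "BFGS", control = list(fnscale = -1), hessian = TRUE)
se <- sqrt(diag(solve(-fit$hessian)))          # numeric observed information
```

In a more careful implementation, the analytic score and information matrix of Appendix C would replace the numeric Hessian.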

Cook and Farewell [24] give a similar example of the use of a partial likelihood in the analysis of left-censored data, but it is based on the conventional hazard rather than the retro-hazard. Our method can in fact be viewed as a generalization of theirs for a random left-censoring point that varies by subject and is related in a specific way to the outcome (in our case, by the proportionality of the retro-hazard functions). The authors do not, however, provide much in the way of interpretation of the model parameters. This is further elaborated on in Farewell [25], although the author simply changes the sign of the original outcomes in order to make use of Kaplan–Meier methodology for estimation of the cdf; the Lehmann family of alternatives is also mentioned [25, pp. 288–289].

4 Simulation studies

This section presents some simulation studies to examine the finite-sample properties of the proposed method. We simulated data under three scenarios: first, with the model correctly specified; second, with misspecification in the form of non-proportionality of the baseline retro-hazards; and third, with misspecification in the form of complete separation between the model for the probability of observing damage and the model for positive damage. The second and third scenarios were designed with the goal of checking the robustness of our proposed method to violations of the model assumptions.

In the second and third scenarios, we compare our proposed method with a standard, flexible semiparametric method of addressing the problem, similar to the model proposed by Zhou and Liang [11]. Specifically, for each data set, we fit a standard logistic model, with the outcome being 1(Xi = 0); this fit corresponds to the θ part of our model. For the subset of observations greater than zero, we fit a semiparametric single-index model [26], implemented in the R package np [27] via the npindex() function. This method fits a model of the form $X = g(z'\gamma) + \epsilon$, where γ is a vector of parameters and g is an unknown function. Estimates of g and γ are obtained by minimizing a least-squares criterion (the bandwidth for estimation of g is chosen by cross-validation).

Since this model is for the conditional mean, the parameter γ has no direct relationship to βη in our model. This means that we are not able to compare our proposed method directly with this competitor on the basis of mean-square error of parameter estimates, for example. However, we may compare the methods indirectly using the fitted values.

To obtain fitted values for our method, we used the resulting parameter estimates and the Breslow-type estimator of the retro-hazard (derived in Appendix B of the Supplementary Materials). Then equation (6) gives the predicted cdf, which will be a step function; the jump sizes in this estimated cdf will correspond to an estimate of the density. If we denote this estimate by $\hat f_i$, then the fitted value (i.e., expected damage conditional on covariates) for subject $i$ will be $\sum_{j: X_j > 0} X_j \hat f_i(X_j)$. A similar method could be used to obtain estimates of the mean of the resistance variable for each subject, by replacing $\hat f_i$ with an estimate of the conditional density of $R_i$ given the observed value of $X_i$.
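A minimal R sketch of this prediction step (ours, continuing the notation of the earlier sketches, with per-subject vectors eta and mu and outcome X assumed in scope):

```r
## Breslow-type decrements (see (B2) in Appendix B) at the distinct positive
## outcomes, then each subject's predicted cdf via equation (6).
w   <- eta + mu
xs  <- sort(unique(X[X > 0]))
dH  <- -sapply(xs, function(s) sum(X == s) / sum(w[X > 0 & X <= s]))
Hst <- -rev(cumsum(rev(dH)))   # H*-hat(t) = -sum over {s >= t} of dH*-hat(s)

fitted_cdr <- sapply(seq_along(X), function(i) {
  Fi <- (mu[i] + eta[i] * exp(-w[i] * Hst)) / w[i]  # predicted cdf, eq. (6)
  fi <- diff(c(mu[i] / w[i], Fi))                   # jumps = density estimate
  sum(xs * fi)   # fitted value; mass beyond the largest outcome is ignored
})
```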

For the comparison method, we predict the expected outcome conditional on covariates as $\left[1 + e^{z'\hat\beta_\theta}\right]^{-1} \hat g(z'\hat\gamma)$. We refer to this method as the logistic/semiparametric single-index model (LSSIM).

4.1 Correct specification

We simulated 1000 data sets for each of three sample sizes and three intercept values; the intercept was varied in order to produce different proportions of zero observations in the response. A baseline retro-hazard of $H^*(t) = -\log(1 - e^{-t/10})$ was used, corresponding to an exponential model. We included two covariates, both of which were included in each part of the model: $Z_1 \sim N(0, 1)$ and $Z_2 \sim \mathrm{Bernoulli}(1/2)$. Then we have for each subject

$$\theta = \frac{e^{\beta_0 + 2Z_1 - Z_2}}{1 + e^{\beta_0 + 2Z_1 - Z_2}} \quad (15)$$
$$\eta = e^{-Z_1 + 2Z_2} \quad (16)$$
$$\mu = \frac{\theta}{1 - \theta}\, \eta. \quad (17)$$

Thus, for the true parameters, we have βθ1 = 2, βθ2 = −1, βη1 = −1, βη2 = 2.

For these simulations, the intercept β0 in the θ part of the model was allowed to take values −2, 0, and 2, corresponding to, respectively, approximately 18%, 43%, and 71% of observations equal to zero.
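As an illustration, the data-generating mechanism of this scenario can be reproduced along the following lines in R (a sketch under the stated design, with b0 denoting the intercept β0):

```r
## Data generation for Section 4.1: baseline F(t) = 1 - exp(-t/10),
## equivalently H*(t) = -log(1 - exp(-t/10)).
set.seed(2)
n  <- 500; b0 <- 0
Z1 <- rnorm(n); Z2 <- rbinom(n, 1, 1/2)
theta <- plogis(b0 + 2 * Z1 - Z2)     # equation (15)
eta   <- exp(-Z1 + 2 * Z2)            # equation (16)
mu    <- theta / (1 - theta) * eta    # equation (17)

Finv <- function(p) -10 * log(1 - p)  # inverse baseline cdf
D <- Finv(runif(n)^(1 / eta))         # P(D <= d) = F(d)^eta
R <- Finv(runif(n)^(1 / mu))          # P(R <= r) = F(r)^mu
X <- D * (D > R); Z <- cbind(Z1, Z2)
mean(X == 0)                          # roughly 0.43 when b0 = 0
```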

The full results of the simulation study for the scenario without misspecification are given in Appendix D, but broadly, the method shows quick reduction in bias of parameter estimates with increases in sample size as well as good agreement between true and estimated standard errors.

4.2 Non-proportional retro-hazards

In order to address the issue of robustness, we generated data according to the following model:

$$P(R_i \le r) = e^{-\mu_i H^*(r)} \quad (18)$$
$$P(D_i \le d) = e^{-\eta_i [H^*(d)]^\alpha}. \quad (19)$$

The function $H^*$ was the same as for the simulations in the first scenario, while we varied α between 0.7 and 1.3 to determine the effect of varying degrees of model misspecification (see Figure 4). Equations (15), (16), and (17) were used as in the first scenario to generate the true values of μ and η; covariate distributions were likewise the same.

Full results for this simulation setting are given in Appendix D. Overall, however, the effect of this kind of misspecification seems to be quite limited, both on our proposed method as well as the standard method. The results do indicate that our method outperforms the standard method uniformly and by a large margin, generally 40–50% regardless of other model parameters.

4.3 Unlinked models

As another check on the robustness of our method, we generated data assuming that the probability of observing damage is not linked with the distribution of positive damage. Specifically, under this scenario, the model for the probability of observing any damage was unchanged, and is given by equation (15). For the distribution of the outcome given it is positive, we used a single-index model:

$$X_i \mid X_i > 0 = g(\gamma_0 + Z_1 - Z_2) + \epsilon, \quad (20)$$

where ∊ ∼ N(0, 5²) for all settings in this scenario; covariate distributions were the same as in the previous scenarios. Since for some possible g functions the outcome could be negative after adding the random error term, the continuous outcome generated was truncated at zero. This means that the LSSIM is not precisely correctly specified, as there are some observations that should have been positive according to the logistic part of the model, but appear as zero due to this truncation.

Two general families were used for the unknown function g, in an attempt to get an idea of the behavior of the estimators over a range of possible shapes for this function. In the first setting, $g(u) = 6|u|^\omega$, with $\omega$ varied between 1 and 3, and $\gamma_0 = g^{-1}(100) = (100/6)^{1/\omega}$, which minimizes the effect of $\omega$ on the mean outcome, keeping it approximately constant. This allows us to focus on the effect of the shape of the unknown function g on the prediction errors produced by our proposed method and the competing method. In the second setting, $g(u) = 100 + \frac{80}{\pi}\tan^{-1}(u/\sigma)$ and $\gamma_0 = 0$, with $\sigma$ varied between 1 and 4. No intercept is necessary here because the location of the function is fixed as we vary σ. See Figure 2 for plots of these functions.

Figure 2. Plots of the unknown function g for the simulations in which the LSSIM is the data-generating mechanism. The x-axis of each plot shows the value of the linear predictor that corresponds to the indicated conditional mean of the response variable (given that the response exceeds zero).

We are now less interested in comparisons of predictive error across different values of ω or σ, as there is no “reference” level corresponding to correct model specification. Therefore, we use the usual mean-square error criterion to compare our proposed method and the standard method within each value of ω:

$$\mathrm{MSEP}_2 = \sqrt{\frac{1}{n} \sum_{i=1}^n \left[\hat X_i - \mathbb{E}(X_i \mid z_i)\right]^2}. \quad (21)$$

Average values of this quantity across 1000 simulated data sets are shown in Tables 1 and 5.

Table 1.

Simulation results under misspecified model, power transformation as unknown function of the index: predictive errors, n = 500. This table shows the root mean-square error of the predictions (MSEP2) for both the standard method (LSSIM) and our proposed method (CDRM); the final column is a measure of relative efficiency, calculated as the ratio of the MSEP2 of the CDRM method to that of the LSSIM method. This is averaged over 1000 simulated data sets at each distinct combination of intercept value β0 and misspecification parameter ω. Also displayed in this table is the average outcome across all subjects and simulated data sets, intended to give an idea of the relative size of the MSEP2 values (which are not normalized as they are for MSEP1). The intercept parameter β0 was allowed to take values −2, −1, and 0 (shown in the first column), corresponding to, respectively, approximately 18%, 29%, and 43% of observations equal to zero.

β0 ω 𝔼X LSSIM CDRM Ratio
−2 1 78.5 2.960 3.312 1.119
1.5 69.1 3.326 3.407 1.024
2 59.3 4.647 4.304 0.926
2.5 54.1 17.147 13.934 0.813
3 54.0 38.750 34.723 0.896
−1 1 67.3 3.005 3.550 1.181
1.5 57.6 3.046 3.181 1.044
2 47.0 3.836 3.680 0.959
2.5 40.4 11.838 8.897 0.752
3 38.0 25.565 20.240 0.792
0 1 53.7 2.930 3.379 1.153
1.5 44.3 2.898 3.012 1.039
2 34.0 3.151 3.056 0.970
2.5 27.2 7.536 5.276 0.700
3 23.7 14.612 9.969 0.682

Table 1 shows that the standard method slightly outperforms our proposed method for smaller values of the exponent ω, but becomes relatively less efficient as ω increases further. However, we must keep in mind that there is a dramatic increase in predictive error for both the proposed method and the standard method as ω increases, likely due to increasing potential for extreme large values of the outcome (see the left panel of Figure 2). This is even more pronounced when we consider the fact that 𝔼 X is decreasing with increasing ω. We may conclude that for shallower functions g, the standard method performs better than our proposed method, albeit only by approximately 10%. For steeper g functions, by contrast, neither method performs well, but our proposed method is relatively more robust than the standard method, with possibly 20–30% increases in efficiency depending on ω.

Results for the Cauchy transformation are given in Appendix D. They are generally more favorable for the standard method than our proposed method, but as the curve becomes increasingly linear (i.e., with increasing σ), our proposed method becomes competitive.

5 Rat PCH data analysis

To evaluate the CDR model in practice, we applied it to the data of Miller [9]. The rats in this study were evaluated at various combinations of ultrasonic frequencies (1.5, 4.5, 7.6, and 12 MHz) and peak rarefactional pressure amplitude (PRPA, referred to hereafter simply as amplitude). There was particular interest in thresholds for PCH expressed in terms of the amplitude, which makes this data particularly suitable for our method, as it explicitly models the probability of exceeding subject-specific damage thresholds as a function of covariates.

The results of applying our method to this data are displayed in Table 2 and Figure 3. Two covariates (along with possible interactions) were considered in this analysis: frequency, which takes only four possible values in this data set; and amplitude. It was found that treating frequency as a categorical rather than a continuous variable in the η part of the model provided a substantial improvement in fit to the data without sacrificing too much in terms of efficiency (as measured by AIC).

Table 2.

Parameter estimates for the rat PCH data. The final model was chosen on the basis of visual fit to the observed data (see Figure 3). The column labeled “Model” denotes the part of the model to which the covariates refer: either the logistic model for the probability of not exceeding the resistance threshold, or the continuous model for the positive responses (i.e., observed damage > 0).

Model Covariate Est. SE p-value
Logistic (Intercept) 6.064 1.621 0.0002
Amplitude −7.696 1.658 0.0000
Frequency 0.356 0.101 0.0004
Continuous Amplitude 8.290 1.009 0.0000
Frequency (1.5 MHz: ref.)
Frequency (4.5 MHz) 2.632 1.271 0.0384
Frequency (7.6 MHz) 3.143 1.461 0.0314
Frequency (12 MHz) 2.230 3.209 0.4871
Amplitude × Frequency (4.5 MHz) −3.907 0.836 0.0000
Amplitude × Frequency (7.6 MHz) −4.420 1.020 0.0000
Amplitude × Frequency (12 MHz) −4.257 2.096 0.0423

Figure 3. Observed and fitted values for the rat PCH data. Curves labeled “observed data” are conditional means for the amplitude and frequency values depicted. Curves labeled “fitted values” were obtained by fitting the CDR model using the partial likelihood technique outlined in Section 3; the retro-hazard was then obtained using the estimation procedure given in Appendix A; finally, these elements were combined to give an estimate of the conditional density function, which was then used along with the observed damage values to obtain expectations numerically.

In Table 2, we see coefficient estimates for amplitude are large in magnitude but opposite in sign in the two parts of the model: this is sensible, recalling that we are modeling the probability of damage not being manifested with the θ part of the model; and that the η part of the model essentially scales the cdf of observed positive damage, so that more positive coefficient estimates indicate increased damage. Specifically, with a coefficient estimate of 8.290, the probability that damage in a rat exposed to an additional unit of amplitude exceeds damage in a “control” rat is essentially 1. The overall interpretation is that larger amplitudes lead both to increased probability of exceeding the resistance threshold as well as to increased damage once the threshold has been exceeded.

The interpretation of the effect of frequency on PCH area is somewhat more complicated, both because it is treated as continuous in the logistic (θ) part of the model and categorical in the positive (η) part, as well as because of the inclusion of an interaction term in the positive part. However, we can say that increasing frequency leads to decreasing probability of exceeding the resistance threshold, since the coefficient estimate for this covariate in the θ part of the model is positive. Although the coefficient estimates for the frequency terms alone are all positive in the η part of the model, which would indicate an association of increasing frequency with increasing damage (given exceedance of the threshold), note that the interaction terms all have greater magnitude and negative sign. Therefore, as long as amplitude is greater than zero, the net effect of frequency will be negative, which coincides with what intuition suggests given the positive sign of this coefficient in the logistic part of the model.

Turning now to Figure 3, we may observe the visual fit of the model to the data, obtained using the procedure outlined above in Section 4. It is clear from this figure that the model provides a good fit to the data for each frequency and across amplitudes. There may be slight overestimation in the fitted values for the highest frequency, but overall we see precisely the patterns in the observed data, with smooth curves rising from zero (no damage observed) at the lowest amplitudes.

6 Discussion

In this paper, we have proposed a model for competitive damage and resistance processes in a biological system, motivated by a data set consisting of test animals subjected to an external force expected to lead to injury. Our model, using the retro-hazard function first proposed by Lagakos et al. [21] and later elaborated upon by Gross and Huber-Carol [22], leads to an estimation procedure based on a closed-form profile likelihood. This procedure is fast, efficient, and does not require any distributional assumptions on the observed damage outcome. Parameter interpretation is provided with reference to the probability of damage exceeding repair capacity (for the logistic part of the model) and to the probability of damage in one group exceeding damage in another group (for the continuous part of the model).

The assumption of a common baseline retro-hazard for both the damage and resistance systems could be questioned in a particular application. However, the inclusion of covariates in each part of the model, which may of course take the same or opposite signs, seems to allow sufficient flexibility in terms of the effect of a particular factor on the observed outcome. There are always tradeoffs between fidelity to biological reality on the one hand and statistical or mathematical convenience on the other. Our modeling approach is motivated by the former, but makes the necessary concessions to the latter in order for the model to be identifiable. In our model, the rationale for the damage and resistance variables sharing the same baseline cdf is that the stressor should provoke similar but opposite reactions from these systems. In the context of the rat PCH data, this stressor is the diagnostic ultrasound: this is applied to each organism and triggers two biological reactions, damage and resistance. Although diametrically opposed to one another, both are responding to the same stressor. We may also imagine, in the more general setting, that such a stressor could be an environmental exposure in a study of the etiology of cancer, for example.

Future research may examine the possibility of relaxing this assumption via inclusion of shared variables, similar to frailties in survival analysis, between the two parts of the model. Another possible direction for further study is explicit incorporation of a dose-response relationship in the model, as is depicted in the left panel of Figure 1 (with dose corresponding to stress). Currently, our approach implicitly assumes that the outcome is the response to some applied dose; however, a dynamic model for variable dose over time could be quite interesting.


Acknowledgments

The authors thank Douglas Miller for providing the rat PCH data used in this paper. This research was partially supported by the grant 1U01CA199338 (CISNET) from the National Cancer Institute; partial support was also provided by the grant 5-R01-HL-116434-02 from the NIH.

Appendices

A Derivation of NPMLE of the retro-hazard

In this section, we briefly discuss some properties of the retro-hazard function $H^*$ and derive the nonparametric maximum likelihood estimator (NPMLE) of $H^*$ under the general condition of left-censored data, of which the CDR model’s data structure constitutes a special case. Suppose we have $V_i = \max\{T_i, C_i\}$, $\Delta_i = 1(V_i = T_i)$, $i = 1, \ldots, n$, where $T_i$ has cdf $e^{-H^*(t)}$. The likelihood for this data is

$$L(H^*) = \prod_{i=1}^n \left[-dH^*(V_i)\right]^{\Delta_i} e^{-H^*(V_i)}. \quad (A1)$$

From equation (8), we have $F(t) = e^{-H^*(t)}$, implying that the pdf of $T$ under this formulation is $f(t) = -dH^*(t)\, e^{-H^*(t)}$. It is also apparent that $dH^*(t) \le 0$ for $t \in (0, \infty)$, so $H^*$ must be nonincreasing. Furthermore, we may deduce that (for a proper distribution of $T$), since $F(0) = 0$ and $\lim_{t \to \infty} F(t) = 1$, we have $H^*(0) = \infty$ and $\lim_{t \to \infty} H^*(t) = 0$. The foregoing also implies that

$$H^*(t) = -\int_t^\infty dH^*(y). \quad (A2)$$

Apart from a sign change, dH* is equivalent to the function ρ introduced by Lagakos et al. [21].

Define differentiation of a linear functional $J$ with respect to $H^*$ as [see 28, Section 3.2]

$$\delta_s J = \frac{\partial J}{\partial\, dH^*(s)}.$$

Now, differentiation of the log-likelihood proceeds using the chain rule and definition (A2):

$$\begin{aligned}
\delta_s \log L(H^*) &= \sum_{i=1}^n \left\{\Delta_i\, \delta_s \log\left[-dH^*(V_i)\right] - \delta_s H^*(V_i)\right\} \\
&= \sum_{i=1}^n \Delta_i \frac{\partial \log\left[-dH^*(V_i)\right]}{\partial\, dH^*(s)} + \sum_{i=1}^n \frac{\partial}{\partial\, dH^*(s)} \int_0^\infty 1(V_i \le t)\, dH^*(t) \\
&= \sum_{i=1}^n \frac{\Delta_i\, 1(V_i = s)}{dH^*(s)} + \sum_{i=1}^n 1(V_i \le s).
\end{aligned}$$

The important identities established here are

$$\delta_s \log\left[-dH^*(t)\right] = \frac{1(t = s)}{dH^*(s)}, \qquad \delta_s H^*(t) = -1(t \le s). \quad (A3)$$

Setting $\delta_s \log L(H^*) = 0$ implies a Nelson–Aalen-type estimator

$$d\hat H^*(s) = -\frac{\sum_{i=1}^n \Delta_i\, 1(V_i = s)}{\sum_{i=1}^n 1(V_i \le s)}.$$

The negative sign of the estimator indicates that these will be decrements instead of the usual increments in the classical Nelson–Aalen estimator. Otherwise, the form of the estimator is identical, with the only difference being that the “risk set” at point $s$ is composed of observations with $V_i \le s$. Recalling the identity in equation (A2), the estimate of $H^*$ is

$$\hat H^*(t) = -\int_t^\infty d\hat H^*(s).$$
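To make the estimator concrete, a small R sketch (a hypothetical helper of ours, not code from the paper) computes this reverse-time Nelson–Aalen estimate from generic left-censored pairs $(V_i, \Delta_i)$:

```r
## NPMLE of the retro-hazard from left-censored data (V, Delta).
retro_npmle <- function(V, Delta) {
  s  <- sort(unique(V[Delta == 1]))      # points carrying decrements
  dH <- -sapply(s, function(u) sum(Delta == 1 & V == u) / sum(V <= u))
  H  <- -rev(cumsum(rev(dH)))            # H*-hat(t) = -sum_{u >= t} dH*-hat(u)
  data.frame(t = s, dH = dH, H = H)
}

## Example: T ~ Exp(1) left-censored by C ~ Exp(1), so V = max(T, C).
set.seed(3)
T0 <- rexp(200); C0 <- rexp(200)
V <- pmax(T0, C0); Delta <- as.numeric(V == T0)
est <- retro_npmle(V, Delta)
## est$H may be compared with the true H*(t) = -log(1 - exp(-t)) at t = est$t
```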

B Derivation of the profile likelihood

We confine ourselves to the observations for which Xi > 0 (that is, observations for which damage is observed), and consider the problem of estimating H*. The log-likelihood for these observations may be written as

$$\ell_2(\beta; H^*) = \sum_{i: X_i > 0} \left\{\int_0^\infty \log\left[-\eta_i\, dH^*(t)\right] dN_i^*(t) + (\eta_i + \mu_i) \int_0^\infty Y_i^*(t)\, dH^*(t)\right\} \quad (B1)$$

using the counting processes defined by (9) and (10). By functional differentiation of (11) with respect to H*, we find

$$\begin{aligned}
U(s) &= \delta_s \log\left\{\prod_{i: X_i > 0} \left[-\eta_i\, e^{-(\eta_i + \mu_i) H^*(X_i)}\, dH^*(X_i)\right]\right\} \\
&= \delta_s \sum_{i: X_i > 0} \left\{\log \eta_i + \log\left[-dH^*(X_i)\right] - (\eta_i + \mu_i)\, H^*(X_i)\right\} \\
&= \sum_{i: X_i > 0} \frac{dN_i^*(s)}{dH^*(s)} + \sum_{i: X_i > 0} (\eta_i + \mu_i)\, Y_i^*(s).
\end{aligned}$$

Note that we have used the identities (A3) and the fact that $Y_i^*(s) = 1(X_i \le s)$. Furthermore, since for this model all observations greater than zero are uncensored, $dN_i^*(s) = 1(X_i = s)$ when $X_i > 0$. Setting $U(s) = 0$ implies a Breslow estimator of

$$d\hat H^*(s) = -\frac{\sum_{i: X_i > 0} dN_i^*(s)}{\sum_{i: X_i > 0} (\eta_i + \mu_i)\, Y_i^*(s)}. \quad (B2)$$

Substitution of (B2) into the log-likelihood (B1) yields

$$\begin{aligned}
\ell_2(\beta; \hat H^*) &= \int_0^\infty \sum_{i: X_i > 0} \log\left[\eta_i \frac{\sum_{j: X_j > 0} dN_j^*(t)}{\sum_{j: X_j > 0} (\eta_j + \mu_j)\, Y_j^*(t)}\right] dN_i^*(t) - \int_0^\infty \sum_{i: X_i > 0} (\eta_i + \mu_i)\, Y_i^*(t)\, \frac{\sum_{j: X_j > 0} dN_j^*(t)}{\sum_{j: X_j > 0} (\eta_j + \mu_j)\, Y_j^*(t)} \\
&= \int_0^\infty \sum_{i: X_i > 0} \log\left[\eta_i \frac{\sum_{j: X_j > 0} dN_j^*(t)}{\sum_{j: X_j > 0} (\eta_j + \mu_j)\, Y_j^*(t)}\right] dN_i^*(t) - \int_0^\infty \sum_{j: X_j > 0} dN_j^*(t) \\
&= \text{const.} + \sum_{i: X_i > 0} \int_0^\infty \Big[\log \eta_i - \log \sum_{j: X_j > 0} (\eta_j + \mu_j)\, Y_j^*(t)\Big]\, dN_i^*(t),
\end{aligned}$$

where in the last line we have absorbed into the constant all terms not involving η or μ.

where in the last line we have absorbed into the constant all terms not involving η or μ. Returning to (11), we see that

$$L(\beta; \hat H^*) = e^{\ell_1(\beta) + \ell_2(\beta; \hat H^*)} \propto \prod_{i: X_i = 0} \frac{\mu_i}{\eta_i + \mu_i} \prod_{i: X_i > 0} \frac{\eta_i}{\sum_{j: X_j > 0} (\eta_j + \mu_j)\, Y_j^*(X_i)} = \prod_{i: X_i = 0} \frac{\mu_i}{\eta_i + \mu_i} \prod_{i: X_i > 0} \frac{\eta_i}{\sum_{j: 0 < X_j \le X_i} (\eta_j + \mu_j)}.$$

C Score components and observed information matrix

The profile likelihood is given by equation (14).

C.1 Score components

In the interests of more compact notation, we hereafter adopt the convention that summations over j refer to the set {j : 0 < Xj ≤ Xi}. The score components are

$$\begin{aligned}
U_0 &\equiv \frac{\partial \ell_{pr}(\beta)}{\partial \beta_0} = \sum_{i: X_i = 0} \frac{1}{1 + e^{\beta_0 + z_i'\beta_\theta}} - \sum_{i: X_i > 0} \frac{\sum_j e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}}{\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)} \\
U_\theta &\equiv \frac{\partial \ell_{pr}(\beta)}{\partial \beta_\theta} = \sum_{i: X_i = 0} \frac{z_i}{1 + e^{\beta_0 + z_i'\beta_\theta}} - \sum_{i: X_i > 0} \frac{\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}}{\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)} \\
U_\eta &\equiv \frac{\partial \ell_{pr}(\beta)}{\partial \beta_\eta} = \sum_{i: X_i > 0} \left[z_i - \frac{\sum_j z_j\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)}{\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)}\right].
\end{aligned}$$

The score vector is $U(\beta) = (U_0, U_\theta', U_\eta')'$.

C.2 Observed information

The observed information matrix will be

$$\mathcal{I}(\beta) = \begin{bmatrix} \mathcal{I}_{00} & \mathcal{I}_{\theta 0}' & \mathcal{I}_{\eta 0}' \\ \mathcal{I}_{\theta 0} & \mathcal{I}_{\theta\theta} & \mathcal{I}_{\eta\theta}' \\ \mathcal{I}_{\eta 0} & \mathcal{I}_{\eta\theta} & \mathcal{I}_{\eta\eta} \end{bmatrix},$$

with component matrices derived below. $\mathcal{I}_{00}$ is a scalar; $\mathcal{I}_{\theta 0}$ and $\mathcal{I}_{\eta 0}$ are $p \times 1$ vectors; and $\mathcal{I}_{\theta\theta}$, $\mathcal{I}_{\eta\theta}$, and $\mathcal{I}_{\eta\eta}$ are $p \times p$ matrices. Clearly, then, $\mathcal{I}(\beta)$ will be a $(2p + 1) \times (2p + 1)$ matrix. Below, we calculate the elements of this matrix.

• Derivatives of the score with respect to $\beta_0$:

$$\begin{aligned}
\mathcal{I}_{00} &\equiv -\frac{\partial U_0}{\partial \beta_0} = \sum_{i: X_i = 0} \frac{e^{\beta_0 + z_i'\beta_\theta}}{\left(1 + e^{\beta_0 + z_i'\beta_\theta}\right)^2} + \sum_{i: X_i > 0} \frac{\left[\sum_j e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j e^{z_j'\beta_\eta}\right]}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2} \\
\mathcal{I}_{\theta 0} &\equiv -\frac{\partial U_\theta}{\partial \beta_0} = \sum_{i: X_i = 0} \frac{z_i\, e^{\beta_0 + z_i'\beta_\theta}}{\left(1 + e^{\beta_0 + z_i'\beta_\theta}\right)^2} + \sum_{i: X_i > 0} \frac{\left[\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j e^{z_j'\beta_\eta}\right]}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2} \\
\mathcal{I}_{\eta 0} &\equiv -\frac{\partial U_\eta}{\partial \beta_0} = \sum_{i: X_i > 0} \frac{\left[\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] - \left[\sum_j z_j\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] \left[\sum_j e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right]}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2}
\end{aligned}$$

• Derivatives of the score with respect to $\beta_\theta$:

$$\begin{aligned}
\mathcal{I}_{\theta\theta} &\equiv -\frac{\partial U_\theta}{\partial \beta_\theta} = \sum_{i: X_i = 0} \frac{z_i z_i'\, e^{\beta_0 + z_i'\beta_\theta}}{\left(1 + e^{\beta_0 + z_i'\beta_\theta}\right)^2} + \sum_{i: X_i > 0} \frac{\left[\sum_j z_j z_j'\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] - \left[\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right]'}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2} \\
\mathcal{I}_{\eta\theta} &\equiv -\frac{\partial U_\eta}{\partial \beta_\theta} = \sum_{i: X_i > 0} \frac{\left[\sum_j z_j z_j'\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right] \left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] - \left[\sum_j z_j\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] \left[\sum_j z_j\, e^{\beta_0 + z_j'\beta_\theta + z_j'\beta_\eta}\right]'}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2}
\end{aligned}$$

• Derivatives of the score with respect to $\beta_\eta$:

$$\mathcal{I}_{\eta\eta} \equiv -\frac{\partial U_\eta}{\partial \beta_\eta} = \sum_{i: X_i > 0} \frac{\left[\sum_j z_j z_j'\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] \left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] - \left[\sum_j z_j\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right] \left[\sum_j z_j\, e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]'}{\left[\sum_j e^{z_j'\beta_\eta}\left(1 + e^{\beta_0 + z_j'\beta_\theta}\right)\right]^2}.$$

D Further simulation results

D.1 Correct specification

For these simulations, the intercept β0 in the θ part of the model was allowed to take values −2, 0, and 2, corresponding to, respectively, approximately 18%, 43%, and 71% of observations equal to zero.

The results of the simulation study for the scenario without misspecification are displayed in Table 3. This table shows that bias and variance decrease with increasing sample size, as we would expect. Bias of all parameter estimates also seems to be adversely affected by intercept values differing from zero, however, as are the ESD and ASE. We also see good agreement between the ESD and ASE for moderate to large samples.

In contrast to results for the logistic part of the model, it is clear that bias and variance of the parameter estimates in the continuous part of the model monotonically increase with increasing proportions of zero observations, which is due to effectively decreasing the sample size available for estimation of the η part of the model. We observe good agreement between the ESD and ASE for moderate to large samples, indicating the adequacy of the asymptotic approximations for the covariance matrix of the parameter estimates.

We observe some interesting bias patterns in these results. When the true value of the parameter is negative (corresponding to approximately 18% of responses equal to zero), the bias is also negative, but decreases quickly with increasing sample size. The reverse is true for the simulations with approximately 71% of responses equal to zero. This shows a consistent pattern of bias away from the null in small samples. Bias tends to be larger for the negative and positive intercept scenarios than for the zero intercept scenario, but this is to be expected, because for a binary variable, maximal information is gained from responses with probability roughly equal to 1/2.

There seems to be little to no bias in some cases for the intercept in the logistic part of the model. To explain this, recall that we are varying this parameter in order to examine the effect of different levels of censoring on the model performance: minimal bias occurs when we set the true value of the intercept equal to zero. Usually, we might report relative bias, where the numbers in this table would be divided by the true values of the parameters in order to facilitate comparisons, but of course this is not sensible for a parameter with a true value of zero.

D.2 Non-proportional retro-hazards

For the expected outcome under the true model, we compute $\mathbb{E}[D_i\, 1(D_i > R_i)]$ for each observation as

$$\int_0^\infty t\, e^{-\mu_i H^*(t)}\, d\!\left(e^{-\eta_i [H^*(t)]^\alpha}\right).$$
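In R, this expectation can be evaluated by one-dimensional quadrature; the following sketch assumes the baseline $H^*(t) = -\log(1 - e^{-t/10})$ used in the simulations (the helper name is ours):

```r
## E[D 1(D > R)] for one subject under (18)-(19), by numerical integration.
expected_damage <- function(mu, eta, alpha) {
  H  <- function(t) -log(1 - exp(-t / 10))                     # baseline H*
  dH <- function(t) -exp(-t / 10) / (10 * (1 - exp(-t / 10)))  # dH*/dt <= 0
  integrand <- function(t) {
    h <- H(t)
    ## d/dt of exp(-eta * h^alpha) equals exp(-eta * h^alpha) times
    ## -eta * alpha * h^(alpha - 1) * dH(t), nonnegative since dH(t) <= 0
    t * exp(-mu * h) * exp(-eta * h^alpha) *
      (-eta * alpha * h^(alpha - 1) * dH(t))
  }
  integrate(integrand, lower = 0, upper = Inf)$value
}
expected_damage(mu = 0.5, eta = 1, alpha = 1.3)  # example call
```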

Because the scale of the outcome may vary with α under this form of misspecification and we would like to be able to compare performance for different degrees of misspecification, we evaluated predictive ability of the models using a modified mean-squared error:

$$\mathrm{MSEP}_1 = \sqrt{\frac{1}{n} \sum_{i=1}^n \left[1 - \frac{\hat X_i}{\mathbb{E}(X_i \mid z_i)}\right]^2}. \quad (D1)$$

Average values of this quantity across 1000 simulated data sets are shown in Table 4.

Figure 4. Baseline cdf and pdf plots for simulations in which the baseline retro-hazard is misspecified in the form of equations (18) and (19). The curve in black is the baseline cdf that would be shared between the resistance and damage processes under correct model specification, while the lines in color represent departures from that. Note that while the left-hand panel has an untransformed x-axis, the right-hand panel’s x-axis is on the log scale in order to give a clearer idea of the behavior of the density curves.

We see from this table that our method outperforms the standard method uniformly and by a large margin, generally 40–50% regardless of other model parameters. The increased efficiency of our method is most pronounced when there is a lower proportion of zero values in the outcome (β0 = −2). The predictive errors increase with increasing proportion of zero values, which is to be expected, as this reduces the amount of information contained in the observed outcome data.

There is an apparent U-shaped relationship between α and the predictive errors under our method (it seems this is also the case for the standard method, although the effect is less obvious). This is sensible, as the model should perform best when it is correctly specified, and indeed α = 1 is where we find the minimal average prediction errors under our method. However, this does not seem to be the case for β0 = 0: the average predictive error seems relatively flat for α ≤ 1, while increasing thereafter.

Overall, however, the effect of this kind of misspecification seems to be quite limited, both on our proposed method as well as the standard method. We would not expect the standard method to be affected in any particular way by this form of model misspecification, as it does not rely on our specific model assumptions. The proposed method, on the other hand, can be said to be quite robust to moderate violations of its primary assumption, the proportionality of retro-hazards between the damage and resistance processes.

D.3 Unlinked models

The results for the Cauchy transformation simulations, shown in Table 5, are more favorable for the standard method than our proposed method. For smaller values of the scale parameter σ, the standard method is substantially more efficient than our proposed method. However, as the curve becomes more linear (i.e., with increasing σ), our proposed method becomes more competitive. Both methods display decreasing trends in predictive error as σ increases, but this effect is much stronger for our proposed method.

Indeed, comparing the results of this table for σ = 4 with those in Table 1 for ω = 1, we see that predictive errors are quite similar, with MSEP2 slightly less than 3 for the standard method and slightly greater than 3 for our proposed method. This reflects the near-linearity of the curve for σ = 4 and the perfect linearity of the curve for ω = 1 (see Figure 2). Otherwise, it is true that the Cauchy transformations lead to smaller predictive errors than the family of power transformations.

Table 3.

Simulation results under correct model specification (bias and standard errors). This table shows the bias, empirical standard deviation (ESD), and average standard error (ASE) of the parameter estimates across all simulated data sets for the part of the model pertaining to the probability of positive damage being observed (“logistic part”) and for the part of the model pertaining to amount of damage (“continuous part”). The intercept parameter β0 was allowed to take values −2, 0, and 2 (shown in the first column), corresponding to, respectively, approximately 18%, 43%, and 71% of observations equal to zero.

Logistic part of model

β0 n β0 βθ1 = 2 βθ2 = −1

Bias ESD ASE Bias ESD ASE Bias ESD ASE
−2 100 −0.177 0.620 0.556 0.248 0.598 0.519 −0.137 0.784 0.693
200 −0.077 0.397 0.367 0.101 0.358 0.334 −0.040 0.468 0.458
500 −0.027 0.224 0.224 0.029 0.210 0.200 −0.004 0.279 0.281
0 100 −0.007 0.380 0.364 0.154 0.428 0.406 −0.070 0.528 0.522
200 0.008 0.253 0.250 0.073 0.294 0.272 −0.026 0.345 0.357
500 −0.006 0.159 0.156 0.022 0.173 0.165 −0.002 0.220 0.222
2 100 0.167 0.576 0.512 0.184 0.501 0.448 −0.091 0.646 0.586
200 0.075 0.351 0.342 0.066 0.306 0.292 −0.058 0.392 0.395
500 0.027 0.212 0.210 0.036 0.178 0.179 −0.011 0.243 0.244
Continuous part of model

β0 n βη1 = −1 βη2 = 2

Bias ESD ASE Bias ESD ASE

−2 100 −0.020 0.162 0.164 0.049 0.318 0.310
200 −0.007 0.113 0.112 0.021 0.220 0.214
500 −0.002 0.071 0.069 0.012 0.132 0.133
0 100 −0.016 0.225 0.210 0.073 0.394 0.389
200 −0.013 0.146 0.143 0.046 0.274 0.266
500 −0.001 0.089 0.088 0.013 0.166 0.164
2 100 −0.013 0.366 0.339 0.133 0.658 0.623
200 −0.015 0.227 0.220 0.090 0.422 0.408
500 −0.010 0.136 0.133 0.026 0.251 0.246
Table 4.

Simulation results under misspecified model, with non-proportional retro-hazards: predictive errors, n = 500. This table shows the root mean-square error of the predictions (MSEP1) for both the standard method (LSSIM) and our proposed method (CDRM); the final column is a measure of relative efficiency, calculated as the ratio of MSEP1 for the CDRM method to that of the LSSIM method. This is averaged over 1000 simulated data sets at each distinct combination of intercept value β0 and misspecification parameter α (where α = 1 corresponds to proportional retro-hazards, i.e., correct model specification). The intercept parameter β0 was allowed to take values −2, −1, and 0 (shown in the first column), corresponding to, respectively, approximately 18%, 29%, and 43% of observations equal to zero.

β0 α LSSIM CDRM Ratio
−2 0.7 0.157 0.077 0.489
0.8 0.155 0.073 0.467
0.9 0.151 0.069 0.455
1 0.152 0.068 0.450
1.1 0.150 0.070 0.468
1.2 0.153 0.073 0.478
1.3 0.150 0.076 0.508
−1 0.7 0.166 0.092 0.558
0.8 0.160 0.083 0.520
0.9 0.155 0.082 0.526
1 0.154 0.079 0.513
1.1 0.156 0.087 0.555
1.2 0.160 0.094 0.588
1.3 0.166 0.107 0.641
0 0.7 0.175 0.107 0.611
0.8 0.172 0.104 0.604
0.9 0.169 0.103 0.607
1 0.169 0.104 0.617
1.1 0.173 0.111 0.642
1.2 0.178 0.124 0.698
1.3 0.191 0.145 0.760
Table 5.

Simulation results under misspecified model, Cauchy transformation as unknown function of the index: predictive errors, n = 500. This table shows the root mean-square error of the predictions (MSEP2) for both the standard method (LSSIM) and our proposed method (CDRM); the final column is a measure of relative efficiency, calculated as the ratio of the MSEP2 of the CDRM method to that of the LSSIM method. This is averaged over 1000 simulated data sets at each distinct combination of intercept value β0 and misspecification parameter σ. Also displayed in this table is the average outcome across all subjects and simulated data sets, intended to give an idea of the relative size of the MSEP2 values (which are not normalized as they are for MSEP1). The intercept parameter β0 was allowed to take values −2, −1, and 0 (shown in the first column), corresponding to, respectively, approximately 18%, 29%, and 43% of observations equal to zero.

β0 σ 𝔼X LSSIM CDRM Ratio
−2 1 72.4 3.166 7.242 2.287
2 75.7 3.017 4.861 1.611
3 77.5 2.983 3.878 1.300
4 78.5 2.912 3.414 1.172
−1 1 61.0 3.108 7.181 2.310
2 64.4 2.959 4.522 1.528
3 66.2 2.953 3.541 1.199
4 67.3 2.964 3.362 1.134
0 1 47.7 2.840 6.207 2.185
2 50.9 2.881 4.043 1.403
3 52.7 2.814 3.263 1.160
4 53.6 2.973 3.259 1.096

Contributor Information

John D. Rice, University of Rochester, Department of Biostatistics and Computational Biology, 265 Crittenden Blvd., Rochester, NY 14642

Alex Tsodikov, University of Michigan, Department of Biostatistics, 1415 Washington Heights, Ann Arbor, MI 48104.

References

• 1. Crump KS. Dose response problems in carcinogenesis. Biometrics. 1979;35:157–167.
• 2. Cox C. Threshold dose-response models in toxicology. Biometrics. 1987;43:511–523.
• 3. Ebrahimi N. Stochastic properties of a cumulative damage threshold crossing model. Journal of Applied Probability. 1999;36:720–732.
• 4. Esary JD, Marshall AW. Shock models and wear processes. The Annals of Probability. 1973;1:627–649.
• 5. Aitchison J. On the distribution of a positive random variable having a discrete probability mass at the origin. Journal of the American Statistical Association. 1955;50:901–908.
• 6. Siegel AF. Modelling data containing exact zeroes using zero degrees of freedom. Journal of the Royal Statistical Society, Series B (Methodological). 1985;47:267–271.
• 7. Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992;34:1–14.
• 8. Foster SD, Bravington MV. A Poisson–gamma model for analysis of ecological non-negative continuous data. Environmental and Ecological Statistics. 2013;20:533–552.
• 9. Miller DL. Induction of pulmonary hemorrhage in rats during diagnostic ultrasound. Ultrasound in Medicine and Biology. 2012;38:1476–1482. doi: 10.1016/j.ultrasmedbio.2012.04.004.
• 10. Polansky AM. Nonparametric estimation of distribution functions of nonstandard mixtures. Communications in Statistics—Theory and Methods. 2005;34:1711–1724.
• 11. Zhou XH, Liang H. Semi-parametric single-index two-part regression models. Computational Statistics and Data Analysis. 2006;50:1378–1390. doi: 10.1016/j.csda.2004.12.001.
• 12. Lehmann EL. The power of rank tests. The Annals of Mathematical Statistics. 1953;24:23–43.
• 13. Cheng S, Wei L, Ying Z. Analysis of transformation models with censored data. Biometrika. 1995;82:835–845.
• 14. Brockhoff PM, Muller HG. Random effect threshold models for dose-response relationships with repeated measurements. Journal of the Royal Statistical Society, Series B (Methodological). 1997;59:431–446.
• 15. Dabrowska DM, Doksum KA. Partial likelihood in transformation models with censored data. Scandinavian Journal of Statistics. 1988;15:1–23.
• 16. Weinberg RA. Tumor suppressor genes. Science. 1991;254:1138–1146. doi: 10.1126/science.1659741.
• 17. Boffetta P, Nyberg F. Contribution of environmental factors to cancer risk. British Medical Bulletin. 2003;68:71–94. doi: 10.1093/bmb/ldg023.
• 18. Enewold L, Zhu K, Ron E, Marrogi AJ, Stojadinovic A, Peoples GE, Devesa SS. Rising thyroid cancer incidence in the United States by demographic and tumor characteristics, 1980–2005. Cancer Epidemiology, Biomarkers & Prevention. 2009;18:784–791. doi: 10.1158/1055-9965.EPI-08-0960.
• 19. Chow WH, Dong LM, Devesa SS. Epidemiology and risk factors for kidney cancer. Nature Reviews Urology. 2010;7:245–257. doi: 10.1038/nrurol.2010.46.
• 20. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Second edition. Wiley; 2002.
• 21. Lagakos SW, Barraj LM, De Gruttola V. Nonparametric analysis of truncated survival data, with application to AIDS. Biometrika. 1988;75:515–523.
• 22. Gross ST, Huber-Carol C. Regression models for truncated survival data. Scandinavian Journal of Statistics. 1992;19:193–213.
• 23. Cox DR. Regression models and life-tables. Journal of the Royal Statistical Society, Series B (Methodological). 1972;34:187–220.
• 24. Cook RJ, Farewell VT. The utility of mixed-form likelihoods. Biometrics. 1999;55:284–288. doi: 10.1111/j.0006-341x.1999.00284.x.
• 25. Farewell VT. Some comments on analysis techniques for censored water quality data. Environmental Monitoring and Assessment. 1989;12:285–294. doi: 10.1007/BF00394234.
• 26. Ichimura H. Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics. 1993;58:71–120.
• 27. Hayfield T, Racine JS. Nonparametric econometrics: The np package. Journal of Statistical Software. 2008;27.
• 28. Hu C, Tsodikov A. Joint modeling approach for semicompeting risks data with missing nonterminal event status. Lifetime Data Analysis. 2014;20:563–583. doi: 10.1007/s10985-013-9288-y.
