Author manuscript; available in PMC: 2015 Feb 15.
Published in final edited form as: Stat Med. 2014 Mar 13;33(18):3114–3129. doi: 10.1002/sim.6139

A Weighted Cumulative Sum (WCUSUM) to monitor medical outcomes with dependent censoring

Rena Jie Sun 1, John D Kalbfleisch 1,*, Douglas E Schaubel 1
PMCID: PMC4200511  NIHMSID: NIHMS633453  PMID: 24623573

Abstract

We develop a Weighted CUmulative SUM (WCUSUM) to evaluate and monitor pre-transplant waitlist mortality of facilities in the context where transplantation is considered to be dependent censoring. Waitlist patients are evaluated multiple times in order to update their current medical condition as reflected in a time dependent variable called the Model for End-Stage Liver Disease (MELD) score. Higher MELD scores are indicative of higher pre-transplant death risk. Moreover, under the current liver allocation system, patients with higher MELD scores receive higher priority for liver transplantation. To evaluate the waitlist mortality of transplant centers, it is important to take this dependent censoring into consideration. We assume a ‘standard’ transplant practice through a transplant model and utilize Inverse Probability Censoring Weights (IPCW) to construct a weighted CUSUM. We evaluate the properties of a weighted zero-mean process as the basis of the proposed weighted CUSUM. We then discuss a resampling technique to obtain control limits. The proposed WCUSUM is illustrated through the analysis of national transplant registry data.

Keywords: cumulative sum (CUSUM), dependent censoring, inverse probability weights, failure time data, quality control, quality improvement, resampling, control limits, risk adjustment

1. Introduction

Control charts are used to continuously monitor outcomes of a process, and hence to guide improvement in quality by providing timely feedback. CUmulative SUM (CUSUM) control charts have been suggested to monitor the performance of medical providers by measuring the rate of deaths or other outcomes after a surgical procedure. This approach enables early detection of an unacceptable number of deaths, for example, and can help with timely identification and correction of problems.

Steiner et al. [13, 14] developed a risk-adjusted one-sided CUSUM procedure based on the likelihood ratio in a logistic model for binary outcomes. The authors proposed a graphical method for identifying either a substantial or consistent change in risk-adjusted mortality. Axelrod et al. [1] demonstrated the utility of the one-sided CUSUM method for tracking and analyzing one-year binary mortality outcomes using a cohort of transplanted patients at multiple centers. However, a built-in one-year lag is necessary in this approach. Biswas and Kalbfleisch [4] developed a risk-adjusted one-sided CUSUM procedure constructed on a continuous time scale to monitor transplant survival outcomes sequentially by incorporating exposure and failures as soon as they occur. They compared the observed number of deaths at a given center to the expected number of deaths at that center assuming that the center has the same adjusted death rates as the overall national average. A sequential probability ratio test (SPRT) forms the basis of the one-sided CUSUM, which examines whether there is evidence that could lead to rejection of the null hypothesis that the center’s rates are the same as the expected value in the overall national experience in favor of ‘worse than expected’ (or ‘better than expected’) performance.

Each method listed in the preceding paragraph was developed under the assumption of independent censoring. This assumption would be violated in many cases, especially in medical settings where preventive approaches are applied selectively to high-risk patients. For example, in such settings, pre-treatment mortality is dependently censored by the receipt of treatment. In particular, the methods we propose in this article were motivated by the evaluation of mortality among patients waitlisted for liver transplantation. Patients on the liver transplant waitlist are regularly evaluated to assess their current medical condition. One particularly important summary measure is the Model for End-Stage Liver Disease (MELD) score, computed as a log linear combination of serum creatinine, bilirubin and International Normalized Ratio of prothrombin time. Waitlisted patients with higher MELD scores have a higher risk of death and consequently are given priority to receive liver transplants when available. The ‘censoring time’, through receiving a transplant, is therefore correlated with the patient’s unobserved time of death on the waitlist that would have occurred had the patient been left untransplanted. To evaluate transplant center-specific waitlist mortality, it is important to take the dependent censoring (transplantation) into consideration; failing to do so may yield substantially biased results.

In this article, we discuss a Weighted CUSUM (WCUSUM) to account for dependent censoring. Motivated by the waitlist mortality issue for liver transplant centers, we present the method to address this case directly. However, the proposed methods could be adapted to monitor other data sets where dependent censoring is present. For example, the methods would be generally useful in dealing with transplant data and evaluating the survival on the wait list, at least in any transplant situation in which the chance of transplant depends on a time dependent risk factor for mortality. Another application could be in patients with prostate cancer where some interventions depend on the current value of a measurement of the Prostate Specific Antigen (PSA). In order to investigate survival without the intervention, one could use these methods. This could be useful in assessing overall survival in a particular region of the country, or in comparing treatment centers.

We assume all centers follow a standard liver transplantation guideline on donor allocation, which can be described by a transplant model. We then make use of inverse weights in order to obtain adjusted CUSUMs (or WCUSUMs) that take account of the dependent censoring, where the weights are determined by the time dependent MELD scores and their relationship to transplant. The resulting WCUSUMs are designed to compare the waitlist mortality at a center to the overall national average, having adjusted for patient mix and dependent censoring through the MELD score. We utilize a resampling technique to obtain control limits for the WCUSUM.

In the following sections, we introduce some basic notation before constructing a WCUSUM, where the weights and the hazard of death are obtained using an inverse probability of censoring approach (Robins and Finkelstein [12]). We then describe the signalling rules for the WCUSUM, illustrated by a case study on waitlisted liver transplant mortality.

2. A Weighted O-E CUSUM

2.1. Notation

Assume patient i from a given facility ℱ of interest enters the cohort at calendar time Si (e.g. time of initial listing on the transplant waitlist). Let Di denote the time to death since entry and Ci denote the time to transplant since entry. Let Xi = min(Di, Ci) be the observed event time since entry to either death or transplant, whichever occurs first. The calendar time of the observed event is Ti = Si + Xi. Finally, Zi(x), 0 ≤ x ≤ Xi, denotes the set of time-dependent covariates (e.g. MELD scores) and Vi is a set of baseline covariates measured at the time of entry. Typically, Vi includes Zi(0).

Assume we have a population model on time to mortality (in the absence of censoring) since entry with a hazard function αi(x) = α(x; Vi) for subject i where

αi(x) = limΔ→0 P{Di ∈ (x, x + Δ) | Di ≥ x, Vi}/Δ.  (1)

Let dΛi(t) = I(t > Si)αi(tSi)dt define the hazard or failure intensity function for subject i at calendar time t.

Suppose that survival over a one-year period is of interest, and refer to a death that occurs at time Di ≤ 1 as a ‘qualifying death’. Note that one-year survival is often used as a critical endpoint in assessing transplant centers, but the methods could be used to assess mortality over other windows of time, such as three-year survival, with minor adjustment. Individual i is at risk of a qualifying death at time t if Yi(t) = 1, where Yi(t) = I{Si < t ≤ min(Ti, Si + 1)}. Let δi = I(Di = Xi) be the failure indicator and Ni(t) count the observed number of qualifying failures in the chronological time interval (0, t] for subject i:

Ni(t) = 0 for t ≤ Si;  Ni(t) = δi I{Ti ≤ t} for Si < t ≤ Si + 1;  and Ni(t) = Ni(Si + 1) for t > Si + 1.

Note that Ni(t) is either 0 or 1. It takes the value 1 if the ith individual enters at a time Si < t and has a qualifying failure before time t. The observed number of qualifying failures in (0, t] for the center ℱ is N(t) = Σi∈ℱ Ni(t).
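As a concrete illustration, the qualifying-failure counting process Ni(t) can be evaluated directly from (Si, Xi, δi). The following is a minimal sketch; the function name and the toy data are ours, not the authors'.

```python
import numpy as np

def qualifying_failures(t, S, X, delta, window=1.0):
    """N_i(t): 1 if subject i had an observed qualifying failure
    (death within `window` years of entry) by calendar time t, else 0."""
    T = S + X                                  # calendar time of observed event
    qualifies = (delta == 1) & (X <= window)   # observed death within the window
    return (qualifies & (T <= t)).astype(int)

S = np.array([0.0, 0.5, 1.0])      # entry (listing) times
X = np.array([0.4, 2.0, 0.3])      # follow-up times since entry
delta = np.array([1, 1, 0])        # 1 = death observed, 0 = censored
# subject 1 dies after the one-year window, subject 2 is censored
print(qualifying_failures(2.0, S, X, delta))   # [1 0 0]
```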

2.2. Definition and Properties of the Weighted O-E CUSUM

In this section, we first state the key assumption regarding the dependent censoring mechanism. We then consider the case of independent censoring and the unweighted O-E CUSUM. Taking the dependent censoring into account, we construct a weighted O-E CUSUM that provides an estimate of the underlying true CUSUM. We show the weighted O-E CUSUM has mean zero and evaluate its variance function. The components of the O-E CUSUM form the foundation to the weighted one-sided CUSUM that is described in the following section.

Assume the cause-specific hazard for censoring is λiC(x | Z̄i(x), Vi) = limΔ→0 P{Ci ∈ (x, x + Δ) | Di ≥ x, Ci ≥ x, Z̄i(x), Vi}/Δ, where Z̄i(x) = {Zi(s), 0 < s ≤ x}. The key assumption needed to construct the weighted CUSUM is

λiC(x | Z̄i(x), Vi) = limΔ→0 P{Ci ∈ (x, x + Δ) | Ci ≥ x, Z̄i(y), Vi, Di = y}/Δ,  (2)

for all y > x. It says that all information about the rate of dependent censoring at time x is contained in Z̄i(x) and the fact that the individual is surviving and uncensored at time x. This rate is not changed by the additional knowledge of the future value of Di = y > x or the additional information on {Zi(v), x < v ≤ y}. Under this assumption, it follows that

P{Ci > x | Vi, Z̄i(y), Di = y} = exp{−∫₀ˣ λiC(u | Z̄i(u), Vi) du}  (3)

for all 0 < x < y. This assumption (2) and its consequence (3) are essential for the inverse weights arising from the process Zi(x) to fully correct for bias due to dependent censoring (Robins and Finkelstein [12]).

Let Ñi(t) represent the underlying counting process of qualifying failures in the absence of dependent censoring, so that Ñi(t) = I(Si + Di ≤ t) if t ≤ Si + 1 and Ñi(t) = Ñi(Si + 1) if t > Si + 1. Similarly, let Ỹi(t) denote the underlying at-risk indicator in the absence of dependent censoring, Ỹi(t) = I{Si < t ≤ min(Si + Di, Si + 1)}. It follows that E(dÑi(t) | Ỹi(t), Vi, Si) = Ỹi(t)αi(t − Si)dt = Ỹi(t)dΛi(t).

Without any censoring, the O-E CUSUM at center ℱ would compare the observed number of failures O(t) = N(t) = Σi∈ℱ Ni(t) with the expected number of failures E(t) = A(t) = Σi∈ℱ ∫₀ᵗ Yi(u) dΛi(u), and O(t) − E(t) is a zero-mean process if center ℱ has the same mortality rates as the reference population. A plot of O(t) − E(t) versus t provides a tracking of the outcomes from this facility as compared to the reference rates described by dΛi(t), i = 1, 2, …. When this plot trends upwards (downwards), the observed failure rates in this center are higher (lower) than those in the reference population, and so these plots provide a useful descriptive tool. Further discussion can be found in Sun and Kalbfleisch [15] or in Collett et al. [5].

Consider now the situation where the center ℱ has the same mortality rates as the reference population, but is subject to dependent censoring as described earlier. We aim to develop a process analogous to O(t) − E(t) that is adjusted for the dependent censoring, but retains the zero-mean property when the death rates in the facility ℱ correspond to the reference rates. Let dMi(t) = dNi(t) − Yi(t)dΛi(t) = Yi(t)[dÑi(t) − dΛi(t)]. Note that Yi(t) = Ỹi(t)I(Ci > t − Si), so that

E[dMi(t)] = E[ E{dMi(t) | Ỹi(t), dÑi(t), Z̄i(t − Si), Si, Vi} ] = E[ E{I(Ci > t − Si) | Ỹi(t), dÑi(t), Z̄i(t − Si), Si, Vi} Ỹi(t){dÑi(t) − dΛi(t)} ].

Under assumption (3), it follows that E{I(Ci > t − Si) | Ỹi(t), dÑi(t), Z̄i(t − Si), Si, Vi} = P{Ci > t − Si | Di > t − Si, Z̄i(t − Si), Si, Vi} = exp{−∫₀^(t−Si) λiC(u | Z̄i(u), Vi) du}. Thus,

E[dMi(t)] = E[ Ỹi(t){dÑi(t) − dΛi(t)} exp{−∫₀^(t−Si) λiC(u | Z̄i(u), Vi) du} ].  (4)

The expression (4) shows that the Mi(t) process does not in general have mean zero under the reference distribution. However, (4) also indicates how to obtain a zero-mean process.

First, we define the weights in chronological time so that wi(t) = w̃i(t − Si) = exp{∫₀^(t−Si) λiC(u | Z̄i(u), Vi) du}. It is now easy to see that

E[wi(t)dMi(t) | Ỹi(t), Vi, Si] = E[ Ỹi(t){dÑi(t) − dΛi(t)} | Ỹi(t), Vi, Si ] = 0.  (5)

This equation (5) shows that the difference between the weighted cumulative observed failures NiW(t) = ∫₀ᵗ wi(u) dNi(u) and the weighted cumulative hazards AiW(t) = ∫₀ᵗ wi(u)Yi(u) dΛi(u) is a zero-mean process, for any subject i.

Thus, the weighted zero-mean process for center ℱ is OW(t) − EW(t), where OW(t) = Σi∈ℱ NiW(t) and EW(t) = Σi∈ℱ AiW(t). In fact, we are replacing O(t) and E(t) above with estimates that are adjusted for the dependent censoring. We refer to OW − EW as the weighted O-E CUSUM. In the case of no dependent censoring, when all weights are equal to 1, this process reduces to the usual zero-mean martingale, with wi(t)dMi(t) = dNi(t) − Yi(t)dΛi(t) = dMi(t) and OW(t) − EW(t) = O(t) − E(t).
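To make the construction concrete, the sketch below evaluates OW(t) − EW(t) on a grid of calendar times, with the weight and reference-hazard functions passed in as callables. In practice both would come from fitted population models; the function names and the crude Riemann integration of EW are our illustrative choices, not the authors' implementation.

```python
import numpy as np

def weighted_o_minus_e(t_grid, S, X, delta, weight_fn, hazard_fn,
                       window=1.0, dx=0.01):
    """O^W(t) - E^W(t): weighted observed minus expected qualifying failures.

    weight_fn(i, x): IPCW weight w_i for subject i at follow-up time x
    hazard_fn(i, x): reference hazard alpha_i(x)
    Both are assumed supplied by previously fitted population models.
    """
    out = np.zeros(len(t_grid))
    for k, t in enumerate(t_grid):
        O = E = 0.0
        for i in range(len(S)):
            T_i = S[i] + X[i]
            # weighted observed failures: jump of size w_i at a qualifying death
            if delta[i] == 1 and X[i] <= window and T_i <= t:
                O += weight_fn(i, X[i])
            # weighted expected failures: Riemann sum for int w_i Y_i dLambda_i
            upper = min(X[i], window, max(t - S[i], 0.0))
            for x in np.arange(0.0, upper, dx):
                E += weight_fn(i, x) * hazard_fn(i, x) * dx
        out[k] = O - E
    return out

# with all weights 1 and a flat reference hazard, this is the ordinary O - E
res = weighted_o_minus_e([2.0], S=[0.0], X=[0.5], delta=[1],
                         weight_fn=lambda i, x: 1.0,
                         hazard_fn=lambda i, x: 0.2)
print(res)   # about [0.9]: O = 1, E = 0.2 * 0.5
```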

The process OW(t) − EW(t) has a jump discontinuity at any time t where a qualifying failure is observed to occur. If it is the ith individual who fails, the size of the jump is wi(t), which is the inverse of the probability that individual i would be untransplanted at time t given survival to that time. The compensating process, EW(t), makes the same adjustment to the conditional probabilities of failure among those at risk at time t. As before, when this process trends up (down), the observed failure rate in center ℱ is higher (lower) than the rate in the general population.

The variance of the process OW(t) − EW(t) under the reference distribution, accounting for all subjects at the center ℱ, is

VarW(t) = Var{OW(t) − EW(t)} = Σi∈ℱ E[ ∫₀ᵗ {wi(u)}² Yi(u) dΛi(u) ],

which is derived in the Appendix. The derivation of this result depends on a novel partition of the relevant integrals that leads to this relatively simple expression. As compared to the O(t) − E(t) process, the weighted process OW(t) − EW(t) has an additional source of variation introduced by the weights. We also consider a more general case in which the facility of interest has relative risk r, so that each patient has hazard rαi(x). In this case, the corresponding mean-zero process is OW(t) − rEW(t), which has variance function

VarrW(t) = Σi∈ℱ r E[ ∫₀ᵗ {wi(u)}² Yi(u) dΛi(u) ].  (6)

The variance of the processes is relatively easily calculated and could be used to assist in the construction of appropriate stopping rules for the CUSUM. Further, the statistic {OW(t) − EW(t)}/√VarW(t) provides a test statistic at time t that could be compared to the N(0, 1) distribution. We will not pursue this approach further in this article, since the resampling approach discussed in Section 3.2 provides a simpler and more easily implemented approach for determining stopping rules.
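Once the per-subject integrals ∫₀ᵗ {wi(u)}² Yi(u) dΛi(u) have been estimated, the standardized statistic is immediate to compute. The helper below is a hypothetical sketch with made-up numbers, shown only to fix ideas.

```python
import numpy as np

def wcusum_z(ow, ew, var_terms):
    """{O^W(t) - E^W(t)} / sqrt(Var^W(t)), compared with N(0, 1).

    var_terms: per-subject values of int_0^t w_i(u)^2 Y_i(u) dLambda_i(u),
    assumed already estimated from the fitted models.
    """
    return (ow - ew) / np.sqrt(np.sum(var_terms))

# illustrative values only: O^W = 14, E^W = 9.8, Var^W = 2 + 3 + 4 = 9
z = wcusum_z(ow=14.0, ew=9.8, var_terms=[2.0, 3.0, 4.0])
print(round(z, 3))   # 1.4, below the two-sided 5% critical value 1.96
```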

3. One-Sided Weighted CUSUM Chart

3.1. The One-Sided CUSUM chart

Biswas and Kalbfleisch [4] proposed a one-sided CUSUM chart, applicable if censoring is independent. At time t, the null hypothesis is taken to be that the rates of qualifying deaths in ℱ correspond to those in the general or reference population. Thus, for individual i, the hazard function is rαi(x) with r = 1. It is convenient to write r = exp(μ), so that the null hypothesis is H0 : μ = 0. To construct the CUSUM, they consider the alternative, ‘worse than expected’, with a relative risk r = eθ > 1. Thus, the alternative hypothesis is H1 : μ = θ for some suitably chosen θ > 0 that typically represents a relative risk of clinical importance. In this paper, as in other similar work, we consider θ = log 2 for the ‘worse than expected’ case. In [4], it is shown that the logarithm of the likelihood ratio of μ = θ versus μ = 0 is proportional to Σi[θNi(t) − (eθ − 1)Ai(t)] = θO(t) − (eθ − 1)E(t). From this, they develop a one-sided CUSUM based on a sequential likelihood ratio test. This approach involves plotting the function Gt, which can be most easily defined in terms of its increments by Gt+dt = max{0, Gt + θ dO(t) − (eθ − 1) dE(t)}, with G0 = 0.

The process Gt remains at 0 until the first qualifying failure occurs when it has a jump discontinuity of size θ. The process then drifts downward according to the term (eθ − 1)dE(t) until it reaches 0 or until the next qualifying failure, whichever occurs first. If another failure occurs, it jumps again by θ and then continues the downward trend. If it reaches 0, this constitutes a renewal and the process remains there until the next qualifying failure occurs. For quality monitoring purposes, the one-sided CUSUM registers a signal of ‘worse than expected’ when Gt exceeds a predetermined control limit L > 0. Refer to Figure 1 for an example chart designed to detect ‘worse than expected’ signals, although this chart is revised to accommodate dependent censoring. Note that a one-sided CUSUM can be designed to help detect either a ‘worse than expected’ performance with θ > 0 or a ‘better than expected’ performance with θ < 0; see [15] and [7].
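Discretized over a grid of increments dO(t) and dE(t), the recursion for Gt is straightforward to implement. The following is a sketch with illustrative increments, not the authors' code; with θ = log 2, the downward drift coefficient eθ − 1 equals 1.

```python
import numpy as np

def one_sided_cusum(dO, dE, theta=np.log(2.0)):
    """G_{t+dt} = max{0, G_t + theta*dO(t) - (e^theta - 1)*dE(t)}, G_0 = 0.

    dO, dE: increments of observed and expected failures per grid step."""
    G = 0.0
    path = []
    for o, e in zip(dO, dE):
        # jump of theta at a failure, downward drift from accumulating hazard,
        # renewal at 0 (the max keeps the process non-negative)
        G = max(0.0, G + theta * o - (np.exp(theta) - 1.0) * e)
        path.append(G)
    return np.array(path)

# one qualifying failure in the first step, then only downward drift
path = one_sided_cusum(dO=[1, 0, 0], dE=[0.1, 0.1, 0.1])
print(np.round(path, 4))   # jump of log 2 minus drift, then down 0.1 per step
```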

Figure 1. The weighted one-sided CUSUM, with control limit, of Center A over a 3.5-year period.

In the presence of dependent censoring, we utilize the weighted cumulative failures and weighted cumulative hazards defined in the last section. Thus, in place of O(t) and E(t), we use OW(t) and EW(t). The one-sided weighted CUSUM GW(t) is defined in terms of its increments by

GW(t + dt) = max{0, GW(t) + θ dOW(t) − (eθ − 1) dEW(t)},  (7)

with GW(0) = 0. This gives rise to a plot that is very similar to the case discussed above with no dependent censoring. In this case, however, the one-sided WCUSUM for ‘worse than expected’ jumps by θ/P(Ci > t − Si) = θwi(t), where i represents the subject experiencing a qualifying failure at time t. When there is no failure, the one-sided WCUSUM trends down by (eθ − 1) times the accumulating hazards scaled by the wi(t)’s. In essence, an individual who is observed to have a qualifying failure, but for whom the chance of being untransplanted at the time of failure is small, yields a relatively much larger jump in the process GW(t) than a similar individual who fails but had a large chance of continuing to be untransplanted at the time of failure. The one-sided WCUSUM for ‘better than expected’ also follows closely the corresponding CUSUMs in the case of no dependent censoring, as discussed in [7] and [15].

The development of the O-E and one-sided WCUSUMs is straightforward. As in the case of no dependent censoring, their implementation requires the definition of suitable control limits in order to detect signals in a desirable way. In the next section, we consider this problem of specifying control limits L for the one-sided WCUSUM for detecting ‘worse than expected’. It is also possible to develop monitoring bands for the O-E WCUSUMs along the same lines as introduced in Sun and Kalbfleisch [15] in the case of no dependent censoring. We illustrate both the one-sided WCUSUM and the O-E WCUSUM in monitoring liver transplant centers in Section 4.

3.2. Obtaining Control Limits by Resampling

Biswas and Kalbfleisch [4] and Sun and Kalbfleisch [15] conducted simulations to determine suitable control limits. For a given center size, they set a false positive rate over a certain period, so that each center is subject to the same error rate if it has failure rates that correspond exactly to the reference or national rates. For example, Biswas and Kalbfleisch [4] use a false positive rate of 8% over a 3.5 year period; these values were chosen to be comparable to the flagging rates in use by the SRTR (Scientific Registry of Transplant Recipients) in monitoring transplant centers. This approach of controlling the false positive rates for all centers yields control limits that are lower for smaller centers and higher for larger centers. Simulations were done using a Poisson process for subject arrivals, with each patient having an exponential failure distribution at the reference failure rate (the national average). In this way, appropriate limits could be obtained for each center according to its size (in terms of average number of arrivals per year). It should be noted that these limits are not adjusted for the particular patient characteristics at the center, though simulations suggested that moderate variation in patient characteristics did not change the false positive rate much. In the situation with dependent censoring and inverse weights, a similar simulation approach could be used. In addition to an assumption about the failure model, however, one would also need a time dependent model for the dependent censoring mechanism, and so would need to generate patient characteristics and a time dependent risk score as well as a suitable censoring model. This seemed very complicated to execute, and the sensitivity to model assumptions would always be a concern. It therefore seemed better to seek a more empirical approach that would require fewer assumptions.

Gandy et al. [7] considered an alternative approach to selecting control limits in the independent censoring case. They defined a revised time scale, s = E(t), and noted that, in terms of this time scale, the counting process of qualifying failures, O#(s) = O(E⁻¹(s)), is a homogeneous Poisson process with rate 1. They showed that the average run length (ARL) in control on this new time scale can be obtained analytically by constructing a Markov chain. This ARL is equal to the expected number of events until stopping on the original scale if the center’s death rates correspond to the reference or national norm. In practice, one can calibrate L to obtain a desired ARL on the transformed time scale; this is equivalent to setting L to correspond to the level needed to achieve a certain average number of qualifying failures at the signal on the original time scale. This approach would have the same limits apply to all centers, with the disadvantage that small centers would be subjected to a smaller risk of a false positive signal over any given time period, whereas the largest facilities would have a relatively much higher probability of a false positive in a given interval. The choice between criteria is a policy decision. In the case of dependent censoring, however, there is no time scale that would map the process OW(t) into a Poisson process, so this approach does not apply in the dependent censoring case.

In order to circumvent these problems, we utilize a resampling technique to calibrate control limits for a center of given size. In effect, this approach could be used any time that the reference is determined by a large national or regional population and whether particular facilities have outcomes that are outside of the national norm is of interest. The idea is to draw patients at random from the population to repeatedly compose a center of given size m and, for each such draw, obtain the corresponding one-sided WCUSUM. Over the repetitions, this gives a realistic picture of the natural variation in the population of interest. The approach has the decided advantage that the time dependent censoring is automatically included in the variation. Various rules could be used to determine appropriate control limits for the center of interest (i.e. of size m). For example, we could choose the limit L so that the resampled or simulated patients would lead to a signal at a given proportion of the time over an interval of specified length. This is analogous to the approach taken in [4] and [15]. Alternatively, we could choose the control limit L in order to fix the average time until a signal occurs for a given center size, provided the reference population remains stable over a sufficiently long period of time. If we are interested in one-year mortality outcomes, a one-year lead-in period to reach equilibrium is important in calibrating the control limit L and in constructing the WCUSUM. The WCUSUM can then be operated continuously under the same L and still maintain a comparable type I error rate.

We develop the first of these ideas more specifically. Suppose that the criterion we wish to implement is that there would be an 8% chance of a false positive signal over a 3.5 year period. We set this criterion when the process is in equilibrium. To do this, we would simulate the process with new arrivals starting at time 0 and extending over a 4.5 year period. Since a qualifying failure must occur within one year of entry, the Gt process will be in equilibrium by the beginning of year 1. We first estimate the weights and true hazards by constructing a dependent censoring model and a weighted death model using the population data. Consider, for example, a center that admits 30 patients per year. To evaluate this, we select at random and with replacement a sample of size 135 from the patients in the population who arrive during a 4.5 year period. For this sample, we construct a one-sided WCUSUM which we begin to observe at year 1 and follow for an additional 3.5 years, and record the maximum value, Gmax = max{Gt : 1 < t ≤ 4.5}, that this WCUSUM achieves. Over a large number B of repetitions (e.g. B = 1000), we find Gmax,i, i = 1, …, B, and choose L so that 8% of these B runs have a maximum one-sided WCUSUM value larger than L; that is, 8% give a false positive signal. This approach can be repeated for various facility sizes.
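The resampling recipe above can be sketched as follows. The construction of the WCUSUM itself is abstracted into a callable supplied by the analyst; the function names, the seed, and the stub used in the example are ours.

```python
import numpy as np

def calibrate_control_limit(wcusum_max_fn, n_pop, m, B=1000,
                            alpha=0.08, seed=0):
    """Calibrate L so that a fraction alpha of resampled 'centers' signal.

    wcusum_max_fn(idx): builds the one-sided WCUSUM for the resampled
    patient indices idx and returns max{G_t : 1 < t <= 4.5}; it is assumed
    supplied by the analyst's WCUSUM implementation.
    """
    rng = np.random.default_rng(seed)
    g_max = np.empty(B)
    for b in range(B):
        # compose a synthetic center of size m by sampling with replacement
        idx = rng.choice(n_pop, size=m, replace=True)
        g_max[b] = wcusum_max_fn(idx)
    # alpha of runs should exceed L: take the (1 - alpha) empirical quantile
    return float(np.quantile(g_max, 1.0 - alpha))

# e.g. a center admitting 30 patients per year followed 4.5 years -> m = 135;
# the lambda below is a placeholder for the real WCUSUM computation
L = calibrate_control_limit(lambda idx: float(idx.mean()),
                            n_pop=2578, m=135, B=200)
```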

This resampling technique can also be used in the case of independent censoring, and should give results similar to those obtained by the approach in [4] and [15].

4. Case Study

4.1. Data Description

We consider mortality rates for patients who have been waitlisted for a liver transplant, using data obtained from the Scientific Registry of Transplant Recipients (SRTR). We consider a three-and-a-half-year cohort of patients waitlisted between January 1st, 2005 and June 30th, 2008 from one of eleven regions in the United States. In this example, the region is being considered as the population in our model. Patients recorded as Status 1 or 1A at the time of waitlisting have acute liver failure at waitlisting and are not included in the analysis. In addition, we exclude patients who were waitlisted in error, who changed to kidney/pancreas transplants, or who had a previous liver transplant. Given that pediatric patients follow a different scheme of transplant, we only include adults of age at least 18 years at the time of waitlisting in the analysis. Two centers with fewer than 5 patients waitlisted over this 3.5 year span are excluded. In the final data set, 2,578 patients from a single region with 7 centers and 5 Organ Procurement Organizations (OPOs) are included.

The following baseline covariates are considered: gender, race, age, diagnosis categories, diabetes, previous malignancy indicator, Body Mass Index (BMI), blood type, and hospitalization and Intensive Care Unit (ICU) status. All these covariates as well as baseline MELD score (see below) and sodium value, are included in both the transplant (censoring) model and the mortality model.

Time dependent variables consist of the Model for End-Stage Liver Disease (MELD) score, inactive period, and sodium value. MELD is a function of measurements on serum bilirubin, serum creatinine and the international normalized ratio for prothrombin time (INR), and the allocation MELD score is used in practice as the main determining factor in the allocation of livers from deceased donors. For analyses reported here, we record MELD using 12 binary indicators for whether the score is in 6–8, 9–11, 12–14, 15–17 (the reference level), 18–20, 21–23, 24–26, 27–29, 30–32, 33–35, 36–39, 40+, or Status 1 or 1A. Assuming that a patient is being monitored sufficiently by the clinician, it is reasonable to assume that the lack of a MELD score update implies that, to a reasonable approximation, the patient’s MELD score has not changed. MELD is updated frequently when a patient’s score is increasing, since it is important to the center to make this known to increase the chance of a transplant, and MELD only rarely decreases. This suggests that coding MELD score as a step function (i.e. last value carried forward) would be appropriate. Waitlisted patients are sometimes declared inactive and so temporarily removed from the waitlist; this may be due to a temporary sickness or other event that makes transplant impossible for a period of time. Inactive patients should not receive transplant offers. Inactive status is tracked with a time dependent indicator variable that takes the value 1 if the patient is inactive. Alternative approaches to handling inactive time were used by Zhang and Schaubel [17]. The inactive indicator is also included in the set of time dependent covariates along with MELD and sodium value in the dependent censoring model. Patients are sometimes permanently removed from the waitlist for reasons such as medical condition, refusing transplant, improved or deteriorated condition, or being inactive on the program for more than 2 years.
Without further assumptions, we treat such removals as independent right censoring.
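The step-function (last value carried forward) coding of MELD described above can be sketched as follows; the function name and the toy trajectory are illustrative.

```python
import numpy as np

def meld_locf(update_times, update_values, query_times):
    """Last-value-carried-forward MELD trajectory: the score at time x is
    the most recent value recorded at or before x."""
    update_times = np.asarray(update_times)
    update_values = np.asarray(update_values)
    # index of the latest update at or before each query time
    idx = np.searchsorted(update_times, query_times, side="right") - 1
    # clip: queries before the first record fall back on the listing value
    return update_values[np.clip(idx, 0, None)]

# MELD recorded at listing (x = 0) and at two later updates
print(meld_locf([0.0, 0.3, 0.8], [16, 22, 31], [0.1, 0.5, 1.0]))   # [16 22 31]
```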

We consider death within one year since waitlisting as the outcome of primary interest in our analysis, though other time periods could clearly be considered as well. A patient is considered as dependently censored if he or she experienced any type of deceased donor transplant or died during a deceased donor transplant procedure. A patient is independently censored if he or she is lost to follow-up, removed from the waitlist, or received a living donor transplant, which typically is not predicted by MELD score.

4.2. Modeling and Control Limits

We first generate a suitable population model based on the totality of the regional data for both the dependent censoring process (transplant) and mortality. We then use these models to construct weighted CUSUMs for each facility under consideration. Appropriate control limits for the one-sided CUSUMs are obtained by resampling the population and defining levels that, on resampling, give rise to prespecified operating characteristics. It should be noted that all centers within a region share the same rules for transplantation (depending on MELD), and so share the same censoring mechanism. This also means that the relatively large sample allows fairly precise estimation of the censoring distribution and the associated weights.

In order to develop a suitable model for the transplant (dependent censoring), we consider a time dependent Cox model with hazard function,

λC(x | Z̄i(x), Vi, Di > x) = λ0C(x) exp{γC′Zi(x) + βC′Vi},  (8)

where λ0C(x) is an unspecified baseline hazard function, Z̄i(x) = {Zi(s), 0 < s ≤ x} and Vi is a set of baseline covariates. In the censoring model, Zi(x) is a vector of time-dependent covariates including MELD, inactive period and sodium level for subject i, with Zi(0) indicating the baseline values of these variables. Vi is the set of baseline covariates, and includes Zi(0). The rate of transplantation is taken to depend only on the current value (most recent measurement) of Zi, which corresponds to the policy that utilizes MELD score as the main determinant of priority for transplantation.

Fitting the model to the dependent censoring data using standard techniques, we obtain estimates γ̂C, β̂C, and Λ̂0C(x), the latter an estimate of Λ0C(x) = ∫₀ˣ λ0C(u) du.
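Given the estimated baseline cumulative hazard Λ̂0C and the fitted coefficients, the IPCW weight wi(x) can be assembled for a subject whose time dependent covariates are piecewise constant between updates. The sketch below makes that piecewise-constant assumption explicit; the function name and the toy inputs are ours.

```python
import numpy as np

def ipcw_weight(x, change_times, lp_segments, Lambda0):
    """w_i(x) = exp{ int_0^x lambda_0^C(u) exp(lp_i(u)) du }.

    change_times: times (starting at 0) where the covariates change
    lp_segments:  linear predictor gamma'Z_i(u) + beta'V_i on each interval
    Lambda0:      estimated baseline cumulative hazard (e.g. a Breslow-type
                  step function), here any callable of follow-up time
    """
    cum = 0.0
    edges = list(change_times) + [x]
    for k, lp in enumerate(lp_segments):
        a, b = edges[k], min(edges[k + 1], x)
        if b <= a:
            break
        # on each interval the covariates are constant, so the integral is
        # the baseline cumulative-hazard increment times exp(lp)
        cum += (Lambda0(b) - Lambda0(a)) * np.exp(lp)
    return np.exp(cum)

# toy check: constant linear predictor 0 and Lambda0(u) = 0.2u give w(1) = e^0.2
w = ipcw_weight(1.0, [0.0], [0.0], lambda u: 0.2 * u)
```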

With the model for censoring developed, estimation of mortality rates on the waitlist is the second part of the population model needed to implement the weighted CUSUM techniques. As above, we assume that the hazard of death in the absence of censoring is again a Cox model, but conditioned only on the baseline covariates, Vi. Thus the model for the hazard αi(x) in (1) is

$$\alpha_i(x) = \lambda_0(x)\exp(\beta' V_i). \tag{9}$$

Appropriate estimation of the parameters in (9) requires accounting for the dependent censoring, which we do through an analysis, stratified on centers, in which the individuals in the risk sets are weighted with stabilized IPCW weights. (Stabilized weights are briefly discussed in Section 4.4.) The weights control for the confounding induced by the censoring mechanism. Because these confounders are controlled by the weights rather than by inclusion as covariates in the Cox models, this approach avoids the problem that such confounders could also be intermediate on the causal pathway to the outcome of death. Once estimates of the regression parameter β are obtained from the model stratified on center, an estimate of an appropriate population-level baseline hazard function is obtained by fitting a nonstratified model of the form (9) with $\hat\beta' V_i$ taken as an offset.

With appropriate models for censoring and mortality now in place, we construct CUSUM charts for each of the facilities in the study. To use the CUSUMs prospectively, we would use the estimates to monitor new events as they occur on the waitlist in each facility in the region, applying the methods for prospective monitoring of center-level mortality rates. For the purpose of this illustration, however, we develop CUSUMs over the period January 1, 2005 to June 30, 2008. We focus attention on one-year mortality and begin monitoring each facility one year earlier, in January 2004. If the facility's mortality and transplant rates accord with the overall average, this one-year lead-in should bring the process approximately to equilibrium. We construct WCUSUMs for each center. The one-sided weighted CUSUM defined in (7) requires the specification of a suitable target relative risk for the alternative hypothesis, and we select e^θ = 2. Thus, the WCUSUM is constructed so as to be particularly sensitive to a relative risk of 2 for waitlist death rates at the center level, as compared to the overall regional data.
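Equation (7) is not reproduced in this section; in risk-adjusted CUSUMs of the continuous-time form studied by Biswas and Kalbfleisch [4], the chart accumulates θ for each (weighted) observed death, drains at rate (e^θ − 1) times the weighted expected count, and is reflected at zero. Under that assumption, a minimal sketch of the one-sided chart on a discretized time grid (names are ours) is:

```python
import numpy as np

def one_sided_wcusum(dO, dE, theta=np.log(2.0)):
    """One-sided weighted CUSUM on a time grid, reflected at zero.
    dO: weighted observed-death increments per interval (dO_W)
    dE: weighted expected-death increments per interval (dE_W)
    theta: log of the target relative risk (e^theta = 2 in the text).
    Returns the CUSUM path; a signal occurs when the path crosses
    the control limit L."""
    w = 0.0
    path = []
    for o, e in zip(dO, dE):
        # log-likelihood-ratio increment for rate r = e^theta vs r = 1
        w = max(0.0, w + theta * o - (np.exp(theta) - 1.0) * e)
        path.append(w)
    return np.array(path)
```

A death with no offsetting expected count raises the chart by θ; long event-free stretches with positive expected counts pull it back toward zero.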

In order to obtain an appropriate control limit for a center of interest that admits on average m patients per year, we use the resampling approach described in Section 3.2. We select an appropriate criterion, following the recommendation of Biswas and Kalbfleisch [4], to obtain a Type I error rate of 8% over a 3.5-year period. For facility size m, we select at random 4.5m patients from those entering the entire population over the 4.5-year period from January 1, 2004 to June 30, 2008 and, for this selection, construct a one-sided WCUSUM beginning in January 2005 and extending for 3.5 years. This process is repeated 1,000 times, and the control limit L is chosen so that 8% of these 1,000 WCUSUMs would yield a signal. The results of this process for facility sizes m = 30, 50, 100, 150, 200 are summarized in Table 1.
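The resampling step above can be sketched as follows. This is an illustrative implementation under simplifying assumptions (patient-level increment streams precomputed on a common time grid; sampling with replacement; the CUSUM increment form of [4]); all names and the synthetic inputs are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def wcusum_peak(dO, dE, theta=np.log(2.0)):
    """Maximum of a one-sided CUSUM path (reflected at zero)."""
    w, peak = 0.0, 0.0
    for o, e in zip(dO, dE):
        w = max(0.0, w + theta * o - (np.exp(theta) - 1.0) * e)
        peak = max(peak, w)
    return peak

def control_limit(pop_dO, pop_dE, m_patients, n_rep=1000, alpha=0.08):
    """Resampling control limit: repeatedly draw m patients' increment
    streams from the population (rows = patients, columns = time grid),
    sum them into a facility-level chart, and take the (1 - alpha)
    quantile of the resulting CUSUM maxima, so that a fraction alpha
    of resampled charts would signal."""
    n = pop_dO.shape[0]
    peaks = []
    for _ in range(n_rep):
        idx = rng.choice(n, size=m_patients, replace=True)
        peaks.append(wcusum_peak(pop_dO[idx].sum(axis=0),
                                 pop_dE[idx].sum(axis=0)))
    return float(np.quantile(peaks, 1.0 - alpha))
```

In the paper's setting, m_patients would be 4.5m and the grid would span the 4.5-year window, with monitoring restricted to the final 3.5 years.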

Table 1. Control limits for the weighted CUSUM

Size (per year)     L       O_W       E_W      Var_W
30                 5.66    15.36     15.48     26.82
50                 6.35    25.03     25.19     43.28
100                7.23    50.64     50.80     85.79
150                8.10    75.98     76.17    132.03
200                8.33   101.60    101.80    174.66

4.3. Analysis Results

Table 1 shows that as size increases, the control limit increases, and the weighted expected number of failures and the variance of the weighted zero-mean process increase linearly. The weighted observed number of failures and the weighted expected number of failures are very close.

Given the estimated expected number of failures at a center, we used linear interpolation based on the values in Table 1 to find an appropriate control limit L. We apply the estimated control limits to the 7 centers in the selected region, and one center signals during the period of interest. Figure 1 shows that the example Center A, with 654 patients over the 3.5-year period, operates at the reference level for the entire period. Figure 2 shows Center B, with 368 patients in the 3.5-year cohort. Several failures are observed in a short period around the end of September 2006, followed by a continuing string of observed failures, which leads to a signal at the beginning of July 2007.
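The interpolation step is elementary; using the E_W and L columns of Table 1, it amounts to:

```python
import numpy as np

# Expected weighted failures (E_W) and control limits (L) from Table 1
EW = np.array([15.48, 25.19, 50.80, 76.17, 101.80])
L = np.array([5.66, 6.35, 7.23, 8.10, 8.33])

def limit_for(expected_failures):
    """Control limit for a center, by linear interpolation between
    the tabulated facility sizes (clipped at the table endpoints)."""
    return float(np.interp(expected_failures, EW, L))
```

For a center whose estimated E_W falls between two tabulated sizes, the returned limit lies between the corresponding tabulated limits.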

Figure 2. The weighted one-sided CUSUM, with control limit, of Center B over a 3.5-year period.

In practice, the signal in Figure 2 should lead to a review by the center for possible process issues that might account for the signal and the apparently very high death rates. Following this signal and possible remedial action, it would be normal to restart the WCUSUM before continuing. There is an advantage to using a 'head start', whereby the CUSUM begins at a position L/2 instead of 0. This has the effect of increasing the chance of an additional signal, especially if the process is currently out of control as the CUSUM suggests. Head starts are discussed in [7] and [11].

As noted earlier, an alternative presentation of CUSUMs was discussed in [15], in which the O−E CUSUM is plotted with monitoring bands to indicate when the CUSUM yields a signal. Figures 3 and 4 present the weighted version O_W − E_W along with the appropriate monitoring bands for the same two facilities as in Figures 1 and 2. The dark path is the CUSUM and the broken path is the monitoring band for a worse-than-expected signal. Figure 3 does not signal, and Figure 4 signals at the same time as the one-sided CUSUM in Figure 2. An advantage of these plots is that when the CUSUM trends up or down, it indicates that the rate of failure in the center is respectively higher or lower than that in the general population.

Figure 3. The weighted O−E CUSUM of Center A for a 3.5-year period. The dotted line is the monitoring bound for a worse-than-expected flag.

Figure 4. The weighted O−E CUSUM of Center B for a 3.5-year period. The dotted line is the monitoring bound for a worse-than-expected flag.


4.4. Unstabilized and stabilized IPCW weights

Under assumption (2), Robins and Finkelstein [12] have shown that we can estimate the true cumulative hazards Λ_i in the presence of dependent censoring using the inverse probability of censoring weighting (IPCW) approach.

We assume a Cox model for the time to transplant (or dependent censoring) with hazard function (8). Under this model and the assumption (2), the conditional probability of not receiving a transplant until time x for subject i whose survival time exceeds x is,

$$K_i^V(x) = P\{C_i \ge x \mid D_i > x, \bar{Z}_i(x), V_i\} = \exp\{-\Lambda_i^C(x)\}, \tag{10}$$

where $\Lambda_i^C(x) = \int_0^x \exp\{\gamma_C' Z_i(s) + \beta_C' V_i\}\,d\Lambda_0^C(s)$. This is estimated as $\hat K_i^V(x)$ by replacing $\gamma_C$, $\beta_C$ and $\Lambda_0^C$ with their estimated values. The commonly used (unstabilized) weights are defined as $\hat w_{i1}(x) = 1/\hat K_i^V(x)$.

To reduce the variation in the weights, we can stabilize them by including a numerator $\hat K_i^0(x)$ obtained by using $Z_i(0)$ in place of $Z_i(s)$ in (10). The stabilized weights are then $\hat w_{i2}(x) = \hat K_i^0(x)/\hat K_i^V(x)$. It can be seen that the stabilized weights also give unbiased estimating equations for the parameters of the marginal death model, and they are often used to reduce the variability of the estimates, as we have done in Section 4.2.
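Once the two cumulative censoring hazards are in hand, both weight versions follow directly from (10); a minimal sketch (names are ours):

```python
import numpy as np

def ipcw_weights(cumhaz_full, cumhaz_base):
    """IPCW weights at a set of evaluation times for one subject.
    cumhaz_full: Lambda_i^C(x) from the model using the full covariate
                 history Z_i(s), 0 <= s <= x
    cumhaz_base: the analogous quantity using only the baseline value
                 Z_i(0) (the stabilizing numerator)
    Returns (unstabilized, stabilized) weights:
      w1 = 1 / K_i^V(x),  w2 = K_i^0(x) / K_i^V(x),
    with K = exp(-cumulative hazard)."""
    K_V = np.exp(-np.asarray(cumhaz_full))  # P(no transplant by x | history)
    K_0 = np.exp(-np.asarray(cumhaz_base))
    return 1.0 / K_V, K_0 / K_V
```

The stabilized weights stay closer to 1 whenever the time-dependent history adds little beyond the baseline value, which is what reduces their variance.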

5. Simulations of O_W − E_W

In this section, we describe an approach that we use to simulate data satisfying the dependent censoring models considered in this paper. These simulations are then used to evaluate the properties of the O_W − E_W process.

Assume patients arrive at a given center according to a homogeneous Poisson process with rate μ0 patients per year. We refer to μ0 as the facility size. For each patient i, assume a baseline covariate $V_i$ that is a Bernoulli(p) variable, and a time-dependent covariate $Z_i(x)$ that follows a Poisson process in the follow-up time x with rate depending on $V_i$; specifically, we assume $Z_i(x) \sim \mathrm{PP}(\mu e^{\gamma_D V_i})$, where PP(·) denotes a Poisson process. Suppose we are interested in one-year mortality. Patients are followed for one year from entry and are censored at one year if they have experienced neither a failure nor a transplant.

Conditional on $Z_i(x)$ and $V_i$, we generate (cause-specific) censoring and mortality according to the hazard functions
$$\lambda_i^C(x \mid V_i, Z_i(x)) = \lambda_0^C \exp\{\gamma_C V_i + \beta_C Z_i(x)\} \quad\text{and}\quad \lambda_i^D(x \mid V_i, Z_i(x)) = \lambda_0^D \exp(\gamma_D V_i) + \beta_D Z_i(x),$$
respectively. As shown in Section 7.2 of the Appendix, the additive form of the conditional mortality model yields a marginal form, after taking an expectation over $Z_i(x)$, that is multiplicative with structure
$$\lambda_i^D(x \mid V_i) = [\lambda_0^D - \mu(e^{-\beta_D x} - 1)]\,e^{\gamma_D V_i}.$$
This step in the simulation allows us to generate a marginal mortality model within the proportional hazards class, so that an ordinary Cox model can be used to estimate the relative risks and baseline hazard. The degree of correlation between the transplant hazard and the mortality hazard is determined by the $Z_i(x)$ process; we use the Spearman rank correlation coefficient between the latent death time and the transplant time to measure it. In practice, we observe only the first of death, transplant, and independent censoring.
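The data-generating mechanism for one patient can be sketched by a discrete-time approximation on a daily grid. This is our own illustrative implementation, not the authors' code; parameter defaults follow the first simulation setup in the text except for λ_0^C, which is not stated there and is assumed here for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_patient(p=0.5, mu=5.0, gamma_D=np.log(2.0), gamma_C=np.log(1.5),
                     lam0_D=0.01, lam0_C=0.05, beta_D=0.06, beta_C=np.log(2.0),
                     dt=1.0 / 365):
    """One patient followed for one year on a daily grid.
    Z jumps as a Poisson process with rate mu * exp(gamma_D * V);
    transplant follows the multiplicative hazard model and death the
    additive hazard model of the text. lam0_C is an assumed value.
    Returns (event_time, event_type)."""
    V = rng.binomial(1, p)
    Z, t = 0, 0.0
    while t < 1.0:
        Z += rng.poisson(mu * np.exp(gamma_D * V) * dt)     # covariate jumps
        haz_D = lam0_D * np.exp(gamma_D * V) + beta_D * Z   # additive death hazard
        haz_C = lam0_C * np.exp(gamma_C * V + beta_C * Z)   # transplant hazard
        if rng.random() < haz_D * dt:
            return t, "death"
        if rng.random() < haz_C * dt:
            return t, "transplant"
        t += dt
    return 1.0, "censored"  # administrative censoring at one year
```

Because both hazards increase with Z, patients whose Z grows quickly tend to be transplanted early, which induces the dependence between the latent death and transplant times.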

We first conduct some simulations to verify the variance formulas given in (6). We consider the following parameter setup: μ0 = 500, p = 0.5, μ = 5, γ_D = log 2, λ_0^D = 0.01, γ_C = log 1.5, β_D = 0.06 and β_C = log 2. The simulation is conducted using 1,000 repetitions.

For relative risks r = 0.5, 1 and 2, Table 2 reports the observed death rates, the dependent censoring rates, and the Spearman rank correlation between the latent death time and the dependent censoring time. In addition, Table 2 reports the mean and variance of O_W(1) − rE_W(1) over the simulations, together with the mean and standard deviation of the empirical variance estimator,

Table 2. Empirical verification of the properties of O_W(1) − rE_W(1) under simulations

                                    O_W(1) − rE_W(1)      Var̂              Var*
r     Death    Censoring   Corr.    mean      Var         mean     SD      mean     SD
0.5   11.1%    0           0        −0.27      55.4        55.4     6.8     55.7     2.5
0.5    8.6%    32.4%       0.13     −0.44      66.4        68.2    29.0     67.8     6.5
1     20.7%    0           0         0.11     103.6       103.7     8.6    103.8     4.5
1     16.3%    29.6%       0.17     −0.37     125.2       124.9    36.0    125.5    12.6
2     36.3%    0           0        −0.14     179.0       181.2    10.7    181.5     8.2
2     29.6%    24.6%       0.22      0.34     220.0       216.8    39.3    216.1    21.8
$$\widehat{\mathrm{Var}} = \widehat{\mathrm{Var}}\{O_W(1) - rE_W(1)\} = \sum_i \left\{\int_0^1 w_i(u)\,dN_i(u) - r\,w_i(u) Y_i(u)\,d\Lambda_i(u)\right\}^2;$$

and the mean and standard deviation of the variance from equation (6),

$$\mathrm{Var}^* = \mathrm{Var}\{O_W(1) - rE_W(1)\} = \sum_i r\int_0^1 [w_i(u)]^2 Y_i(u)\,d\Lambda_i(u).$$

It is expected that both variance estimators have the same mean, with Var* having the smaller standard deviation. Note that in all of these calculations, the weights $w_i(u)$ and the cumulative hazards $\Lambda_i(u)$ are taken as given, their estimates in practice being based on a large population sample.

Table 2 verifies that O_W(1) − rE_W(1) has mean close to 0 under all scenarios, and that Var* and Var̂ are both valid estimates of its variance. However, Var* has much smaller variation than Var̂ under all scenarios.
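Given per-subject counting-process increments on a common time grid, the two variance estimators reduce to simple sums; a minimal sketch (all names ours):

```python
import numpy as np

def variance_estimators(w_dN, w_YdLam, w2_YdLam, r=1.0):
    """Compare the two variance estimators for O_W(1) - r E_W(1).
    Rows index subjects, columns index grid intervals:
    w_dN     : w_i(u) dN_i(u)           (weighted death increments)
    w_YdLam  : w_i(u) Y_i(u) dLambda_i(u)
    w2_YdLam : w_i(u)^2 Y_i(u) dLambda_i(u)
    Returns (empirical Var-hat, model-based Var*)."""
    # Var-hat: per-subject squared integrated difference, summed
    var_hat = np.sum((w_dN.sum(axis=1) - r * w_YdLam.sum(axis=1)) ** 2)
    # Var*: r * sum_i integral of w_i^2 Y_i dLambda_i
    var_star = r * np.sum(w2_YdLam)
    return var_hat, var_star
```

When all weights equal 1 (no dependent censoring), Var* collapses to r times the total expected count, the usual martingale variance.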

In Table 3, we compare the numbers of observed and expected failures in the independent censoring case (Scenario 1) with the weighted observed and weighted expected failures under dependent censoring (Scenario 2). Again, the weights and hazards are assumed known, or estimated precisely from a large sample; in practice, of course, the true weights and hazards are never available. To mimic practical implementation, we also compare with results obtained from estimated weights and hazards (Scenario 3), where we generate a separate large sample (or population) of 5,000 subjects and run the IPCW analysis to obtain the parameter estimates for the censoring and mortality models. We consider the following parameter settings: μ0 = 100, p = 0.5, μ = 3, γ_D = log 2, λ_0^D = 0.01, λ_0^C = 0.05, γ_C = log 2, β_D = 0.1 and β_C = log 2. The simulation is conducted using 100 repetitions. The one-year cohort has 13.9% deaths and 39.3% dependent censoring, while the latent death rate is 20.6%. The Spearman rank correlation between the latent death time and the dependent censoring time is 0.18.

Table 3. Recovery of underlying failures and risks in the case of dependent censoring

                                  Scenario 1 (indep)   Scenario 2 (dep)   Scenario 3 (dep)
                                  mean     SD          mean    SD         mean     SD
Observed (O or O_W)               20.35    4.17        19.21   5.63       19.29    5.43
Expected (E or E_W)               20.77    1.92        20.72   2.14       20.34    2.19
Variance of O−E or O_W−E_W        20.77    1.92        34.77   5.36       36.51   10.47

Table 3 shows that in both Scenarios 2 and 3, the means of the weighted observed and weighted expected failures in the dependent censoring case are close to those in the case of no censoring. Note, however, that the variance of O_W − E_W is inflated under dependent censoring due to the additional uncertainty introduced by the weights. Weighted values using estimated weights and hazards in Scenario 3 agree closely with those obtained using the true weights and hazards in Scenario 2.

6. Discussion

The construction of a weighted CUSUM with IPCW weights requires accurate data, no unmeasured confounding, and a correctly specified censoring model. Given these assumptions, the weighted O−E process under the null hypothesis is a zero-mean process, but with inflated variance as compared to the CUSUM with no dependent censoring.

When the dependent censoring model is misspecified, a WCUSUM might give a variety of results depending on the actual censoring pattern, so a correct dependent censoring model is important. In our case, this does not present a problem because transplant priorities are set nationally to depend on the current MELD score, and should be strictly followed within each region. If the censoring varied across centers, it would still be possible to carry out separate estimation within each center and utilize the corresponding censoring distribution in the WCUSUM. There would be additional uncertainty in this case due to potential error in the weights arising from the smaller sample, and one might wish to take that into account in specifying control limits.

We view the use of CUSUMs as primarily a quality improvement tool, and so they are used to provide quick feedback to the medical providers so that they can monitor outcomes and review for possible problems when a signal occurs. These methods could also be used by an oversight organization to monitor facilities under its purview. The control limits we obtained were with a view to the quality improvement use, and control limits in the context of an oversight organization should depend on the purpose of the flagging and the actions to be taken. For example, very different flagging criteria would be used to suggest what centers might profit from an audit of procedures versus a situation in which the signal would lead to financial penalties or censure.

We presented a resampling technique to obtain the control limit for a center of a given size over a certain period of time. In doing this, it was essentially assumed that each center draws its patients and covariates from the population, so that the future distribution of profiles in each center would mirror that in the overall population. For our example, Figure 5 gives side-by-side boxplots of the distributions of the estimated relative risks for the seven centers. These suggest that there are no large differences in risk profile, and so provide some support for our approach. If these distributions were quite different, and it seemed more reasonable to assume that each center would have future risk distributions resembling those in its past, then we could stratify the bootstrap approach so that bootstrap data for each center would be obtained by sampling nearest neighbors. A lead-in period to reach equilibrium is suggested. Our impression is that the control limits are not terribly sensitive to the risk profile, but a systematic examination of this would be useful. The sampling approach to determining control limits should also be investigated further, since it provides a very simple empirical approach to this problem. It would be useful, for example, to compare the results from this approach in the independent censoring case to those obtained from the approach in [4] and in [?].

Figure 5. Side-by-side boxplots of the estimated relative risks (exp(β̂′Vi)) of patients in the seven centers.

Acknowledgments

We would like to thank the Kidney Epidemiology and Cost Center at the University of Michigan for its support of this research. We also thank Dr. Min Zhang and Dr. Robert Merion for their valuable input to this work. The data for this study were made available by the Scientific Registry of Transplant Recipients, which is funded by a contract from the Health Resources and Services Administration (HRSA), US Department of Health and Human Services.

7. Appendix

7.1. Variance of O_W(t) − rE_W(t) with relative risk r

We begin by examining the process

$$N_i^W(t) - A_i^W(t) = \int_0^t w_i(u)\,dN_i(u) - \int_0^t w_i(u) Y_i(u)\,d\Lambda_i(u)$$

for a given individual i. We have already seen that this process has mean zero under the null hypothesis that individual i has the same mortality rate as an individual with the same covariates in the reference population. We now investigate the variance of this process.

Consider the more general case where the individual has a relative risk r, meaning that the mortality rate for this individual is a constant r times the mortality rate of a similar individual from the population. The corresponding weighted zero-mean process is

$$N_i^W(t) - rA_i^W(t) = \int_0^t w_i(u)\,dN_i(u) - r\int_0^t w_i(u) Y_i(u)\,d\Lambda_i(u). \tag{11}$$

The variance of this process can be evaluated through the following steps:

$$\begin{aligned}
\mathrm{Var}\{N_i^W(t) - rA_i^W(t)\} &= E\left\{\int_0^t w_i(u)\,dN_i(u) - r\,w_i(u) Y_i(u)\,d\Lambda_i(u)\right\}^2 \\
&= E\left\{\int_0^t [w_i(u)]^2\,dN_i(u)\right\} - 2rE\left\{\int_0^t w_i(u)\,dN_i(u)\int_0^t w_i(v) Y_i(v)\,d\Lambda_i(v)\right\} \\
&\quad + r^2 E\left\{\left[\int_0^t w_i(u) Y_i(u)\,d\Lambda_i(u)\right]^2\right\}, \tag{12}
\end{aligned}$$
where the first term uses the fact that $N_i$ has at most one jump, so that $\left[\int_0^t w_i(u)\,dN_i(u)\right]^2 = \int_0^t [w_i(u)]^2\,dN_i(u)$.

The second term in (12) can be written as

$$2rE\left\{\int_0^t w_i(u) Y_i(u)\,dN_i(u)\int_0^u w_i(v)\,d\Lambda_i(v)\right\} + 2rE\left\{\int_0^t w_i(u)\,dN_i(u)\int_u^t w_i(v) Y_i(v)\,d\Lambda_i(v)\right\}, \tag{13}$$

where we have used $Y_i(u)Y_i(v) = Y_i(u)$ for $v < u$. Note that the second term in (13), which is an integral over the range v > u, must be 0: if $Y_i(v) = 1$, then $dN_i(u) = 0$ for all u < v; and conversely, if $dN_i(u) = 1$, then $Y_i(v) = 0$ for all v > u. Under the hypothesis of a relative risk of r, it follows that

$$E\{dN_i(t) \mid Y_i(t), V_i, r, S_i\} = r\,Y_i(t)\,d\Lambda_i(t).$$

Therefore, (13) reduces to

$$2rE\left\{\int_0^t\!\!\int_0^u w_i(u) w_i(v) Y_i(u)\,dN_i(u)\,d\Lambda_i(v)\right\} = 2r^2 E\left\{\int_0^t\!\!\int_0^u w_i(u) w_i(v) Y_i(u)\,d\Lambda_i(u)\,d\Lambda_i(v)\right\} = r^2 E\left\{\left[\int_0^t w_i(u) Y_i(u)\,d\Lambda_i(u)\right]^2\right\}.$$

Thus, the second and the third terms in (12) cancel and

$$\mathrm{Var}\{N_i^W(t) - rA_i^W(t)\} = E\int_0^t [w_i(u)]^2\,dN_i(u) = rE\int_0^t [w_i(u)]^2 Y_i(u)\,d\Lambda_i(u).$$

Consider now a center $\mathcal{F}$ in which each individual has a relative risk r compared with the overall average mortality rate. In this case, $O_W(t) = \sum_{i\in\mathcal{F}} N_i^W(t)$ and $E_W(t) = \sum_{i\in\mathcal{F}} A_i^W(t)$ and, since individuals are independent,

$$\mathrm{Var}\{O_W(t) - rE_W(t)\} = r\sum_{i\in\mathcal{F}} E\int_0^t [w_i(u)]^2 Y_i(u)\,d\Lambda_i(u).$$

In the special case of no dependent censoring, the weights are identically 1 and the individual variance reduces to $rE\int_0^t Y_i(u)\,d\Lambda_i(u) = rE\{A_i(t)\}$, which is the usual martingale result. The case r = 1, corresponding to the usual null hypothesis, is of particular interest.

7.2. Dependent censoring simulation background

For the simulations in Section 5, we utilized a joint model for the mortality and censoring mechanisms, as outlined in this section. In particular, we show that for an additive conditional mortality model given $V_i$ and $Z_i(x)$, taking the expectation of the hazard over the distribution of $Z_i(x)$ results in a multiplicative form, which can then be analyzed using a standard Cox proportional hazards model. Thus, this approach leads to both the time-dependent Cox model for the dependent censoring and the Cox model with fixed baseline covariates for the mortality model. The derivation follows that in Jewell and Kalbfleisch [8].

Let Vi represent the baseline covariate of interest, e.g. treatment assignment, and suppose that Vi has a Bernoulli distribution with probability of success p. The time dependent covariate, Zi(t) (e.g. MELD score) for subject i is generated as a Poisson process with intensity μeγVi depending on the value of Vi. The time dependent model for transplant or dependent censoring has hazard function

$$\lambda_i^C(x \mid Z_i(x), V_i) = \lambda_0^C \exp\{\beta_C Z_i(x) + \gamma_C V_i\}.$$

In an analogous way, the model for mortality given Zi(t) has hazard function

$$\lambda_i^D(x \mid Z_i(x), V_i) = \lambda_0^D \exp(\gamma V_i) + \beta_D Z_i(x).$$

Thus, the time dependent covariate Zi(t) induces a correlation between the death time and the censoring time for the ith individual.

Following the derivations in [8], the marginal survivor function for mortality can be calculated as
$$\begin{aligned}
S(x \mid V_i) &= E\{S(x \mid V_i, \bar Z_i(x)) \mid V_i\} \\
&= E\left\{\exp\left(-\int_0^x [\lambda_0^D e^{\gamma V_i} + \beta_D Z_i(u)]\,du\right) \,\Big|\, V_i\right\} \\
&= \exp\{-\lambda_0^D e^{\gamma V_i} x\}\,E\left\{\exp\left(\int_0^\infty \psi(u) Z_i(u)\,du\right) \,\Big|\, V_i\right\} \\
&= \exp\{-\lambda_0^D e^{\gamma V_i} x\}\exp\{K_Z(\psi)\},
\end{aligned}$$
where $\psi(u) = -\beta_D I(u < x)$ and
$$K_Z(\psi) = -\int_0^x \mu e^{\gamma V_i}\,ds + \int_0^x \mu e^{\gamma V_i}\exp\left\{\int_s^x \psi(v)\,dv\right\} ds = -\mu e^{\gamma V_i} x + \mu e^{\gamma V_i}\int_0^x e^{-\beta_D(x-s)}\,ds = \mu e^{\gamma V_i}\left(\frac{1}{\beta_D} - \frac{e^{-\beta_D x}}{\beta_D} - x\right).$$

Therefore, the marginal hazard for mortality is
$$\lambda^D(x \mid V_i) = -\frac{\partial \log S(x \mid V_i)}{\partial x} = \lambda_0^D e^{\gamma V_i} - \frac{\partial K_Z(\psi)}{\partial x} = [\lambda_0^D - \mu(e^{-\beta_D x} - 1)]\,e^{\gamma V_i} = \lambda_0(x)\,e^{\gamma V_i},$$
which is a standard Cox model with fixed covariates and baseline hazard $\lambda_0(x) = \lambda_0^D - \mu(e^{-\beta_D x} - 1)$.
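The closed-form survivor function above can be checked numerically: since the Poisson jump times on [0, x], conditional on their count, are i.i.d. uniform, $\int_0^x Z_i(u)\,du = \sum_j (x - s_j)$ over the jump times $s_j$. The following sketch (our own, with V_i = 0 and illustrative parameter values from Section 5) compares the analytic survival with a Monte Carlo average:

```python
import numpy as np

rng = np.random.default_rng(2)

def closed_form_survival(x, lam0D=0.01, mu=3.0, betaD=0.1):
    """S(x) = exp(-lam0D * x) * exp(K_Z(psi)), with V_i = 0."""
    K = mu * ((1.0 - np.exp(-betaD * x)) / betaD - x)
    return np.exp(-lam0D * x + K)

def monte_carlo_survival(x, lam0D=0.01, mu=3.0, betaD=0.1, n=100_000):
    """S(x) = E[exp(-lam0D*x - betaD * integral of Z(u) du)], where the
    integral equals sum over Poisson jump times s_j in [0, x] of (x - s_j)."""
    n_jumps = rng.poisson(mu * x, size=n)
    total = np.zeros(n)
    for i, k in enumerate(n_jumps):
        s = rng.uniform(0.0, x, size=k)  # jump times, uniform given the count
        total[i] = np.sum(x - s)
    return float(np.mean(np.exp(-lam0D * x - betaD * total)))
```

Agreement of the two quantities confirms the marginal proportional hazards structure used in the simulations.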

References

  • 1.Axelrod D, Guidinger MK, Metzger RA, Wiesner RH, Webb RL, Merion RM. Transplant center quality assessment using a continuously updatable risk-adjusted technique (CUSUM) American Journal of Transplantation. 2006;6:313–323. doi: 10.1111/j.1600-6143.2005.01191.x. [DOI] [PubMed] [Google Scholar]
  • 2.Axelrod DA, Kalbfleisch JD, Sun RJ, Guidinger MK, Biswas P, Levine GN, Arrington CJ, Merion RM. Innovations in the assessment of transplant center performance: Implications for quality improvement. American Journal of Transplantation. 2009;9:959–969. doi: 10.1111/j.1600-6143.2009.02570.x. [DOI] [PubMed] [Google Scholar]
  • 3.Barnard GA. Control charts and stochastic processes. Journal of the Royal Statistical Society. 1959;21:239–271. Series B. [Google Scholar]
  • 4.Biswas P, Kalbfleisch JD. A risk-adjusted CUSUM in continuous time based on the Cox model. Statistics in Medicine. 2008;27:3382–3406. doi: 10.1002/sim.3216. [DOI] [PubMed] [Google Scholar]
  • 5.Collett D, Sibanda N, Pioli S, Bradley A, Rudge C. The UK scheme for mandatory continuous monitoring of early transplant outcome in all kidney transplant centers. Transplantation. 2009;88:970–975. doi: 10.1097/TP.0b013e3181b997de. [DOI] [PubMed] [Google Scholar]
  • 6.Dickinson DM, Shearon TH, O’Keefe J, Wong H-H, Berg CL, Rosendale JD, Delmonico FL, Webb RL, Wolfe RA. SRTR center-specific reporting tools: Posttransplant outcomes. American Journal of Transplantation. 2006;6:1198–1211. doi: 10.1111/j.1600-6143.2006.01275.x. [DOI] [PubMed] [Google Scholar]
  • 7.Gandy A, Kvaloy JT, Bottle A, Zhou F. Risk-adjusted monitoring of time to event. Biometrika. 2010;97:375–388. [Google Scholar]
  • 8.Jewell NP, Kalbfleisch JD. Marker processes in survival analysis. Lifetime Data Analysis. 1996;2:15–29. doi: 10.1007/BF00128468. [DOI] [PubMed] [Google Scholar]
  • 9.Kalbfleisch JD. Commentary on “The UK scheme for mandatory continuous monitoring of early transplant outcome in all kidney transplant centers” by Collett D, Sibanda N, Pioli S, Bradley A, and Rudge C. Transplantation. 2009;88:968–969. doi: 10.1097/TP.0b013e3181b997de. [DOI] [PubMed] [Google Scholar]
  • 10.Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2. Wiley; New York: 2002. [Google Scholar]
  • 11.Lucas J, Crosier R. Fast initial response for CUSUM quality-control schemes: Give your CUSUM a head start. Technometrics. 1982;24:199–205. [Google Scholar]
  • 12.Robins JM, Finkelstein D. Correcting for non-compliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics. 2000;56:779–788. doi: 10.1111/j.0006-341x.2000.00779.x. [DOI] [PubMed] [Google Scholar]
  • 13.Steiner S, Cook R, Farewell V, Treasure T. Monitoring surgical performance using risk-adjusted cumulative sum charts. Biostatistics. 2000;1:441–452. doi: 10.1093/biostatistics/1.4.441. [DOI] [PubMed] [Google Scholar]
  • 14.Steiner S, Cook R, Farewell V. Risk adjusted monitoring of surgical outcomes. Medical Decision Making. 2001;21:163–169. doi: 10.1177/0272989X0102100301. [DOI] [PubMed] [Google Scholar]
  • 15.Sun RJ, Kalbfleisch JD. A risk-adjusted O-E CUSUM with monitoring bands for monitoring medical outcome. Biometrics. 2013 doi: 10.1111/j.1541-0420.2012.01822.x. [DOI] [PubMed] [Google Scholar]
  • 16.Wetherill GB. Sampling inspection and quality control. 2. London: Chapman & Hall; 1977. [Google Scholar]
  • 17.Zhang M, Schaubel DE. Estimating differences in restricted mean lifetime using observational data subject to dependent censoring. Biometrics. 2011;67:740–9. doi: 10.1111/j.1541-0420.2010.01503.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
