Skip to main content
BMC Medical Research Methodology logoLink to BMC Medical Research Methodology
. 2024 Jan 3;24:3. doi: 10.1186/s12874-023-02123-z

Adverse events in single-arm clinical trials with non-fatal time-to-event efficacy endpoint: from clinical questions to methods for statistical analysis

Elena Tassistro 1,, Davide Paolo Bernasconi 1, Maria Grazia Valsecchi 1,2, Laura Antolini 1
PMCID: PMC10765745  PMID: 38172810

Abstract

Background

In any single-arm trial on novel treatments, assessment of toxicity plays an important role as occurrence of adverse events (AEs) is relevant for application in clinical practice. In the presence of a non-fatal time-to-event(s) efficacy endpoint, the analysis should be broadened to consider AEs occurrence in time.

The AEs analysis could be tackled with two approaches, depending on the clinical question of interest. Approach 1 focuses on the occurrence of AE as first event. Treatment ability to protect from the efficacy endpoint event(s) has an impact on the chance of observing AEs due to competing risks action. Approach 2 considers how treatment affects the occurrence of AEs in the potential framework where the efficacy endpoint event(s) could not occur.

Methods

In the first part of the work we review the strategy of analysis for these two approaches. We identify theoretical quantities and estimators consistent with the following features: (a) estimators should address for the presence of right censoring; (b) theoretical quantities and estimators should be functions of time.

In the second part of the work we propose the use of alternative methods (regression models, stratified Kaplan-Meier curves, inverse probability of censoring weighting) to relax the assumption of independence between the potential times to AE and to event(s) in the efficacy endpoint for addressing Approach 2.

Results

We show through simulations that the proposed methods overcome the bias due to the dependence between the two potential times and related to the use of standard estimators.

Conclusions

We demonstrated through simulations that one can handle patients selection in the risk sets due to the competing event, and thus obtain conditional independence between the two potential times, adjusting for all the observed covariates that induce dependence.

Keywords: Survival analysis, Adverse events, Competing risks, Dependent censoring, IPCW

Introduction and rationale

The evaluation of outcome following a novel therapeutic regimen commonly considers a primary, possibly composite time to event(s) endpoint, related to disease control and survival. However, the assessment of toxicity plays an important role as the occurrence of severe adverse events (AEs) (or reactions) that may be disabling (although not deadly) is required for a careful treatment application in clinical practice. As an example, frontline intensive chemotherapy in children newly diagnosed with acute lymphoblastic leukemia (ALL) has remarkably increased in the last decades the ability to avoid relapse of the disease. Nowadays, more attention is given, when innovating the therapeutic approach, or applying it in clinical practice, to avoid or prevent undesirable disabling conditions (such as severe osteonecrosis) [1]. Thus, the analysis of outcome is broadened to evaluate and describe the occurrence of AEs such as osteonecrosis in addition to the primary endpoint of efficacy, such as relapse. This means that treatment failure considers the first event occurring between AE and relapse, thus placing the analysis in the context of competing risks. Yet, the occurrence of a relapse as first event does not exclude the possibility of observing a subsequent AE, but the relationship to the treatment under analysis is weakened because the patient undergoes a different treatment for relapse [2]. As a consequence, the occurrence of relapse is considered as a competing risk, thus as a sort of right censoring. Another possibility when in the presence of a non-fatal time-to-event efficacy endpoint (such as relapse) is the addition of a lag time where if the AE occurs within the lag time it can be still considered related to the initial treatment. For simplicity, we did not consider that lag time. Descriptive methods commonly used to analyse AEs data are the crude proportion of AEs, obtained by dividing the observed number of AEs by the total number of subjects, and the classical epidemiological AEs rate, obtained by dividing the observed number of AEs by the total time spent free from treatment failure. These measures are commonly reported in clinical papers among the initial descriptive results [36].

In principle, the analysis of AEs could be tackled from two different points of view. Approach 1 requires a competing risk framework for analysis: the clinical question relates to the observed occurrence of AE as first event, in the presence of the event “relapse”. In this case, AE and relapse are competing events, and treatment ability to protect from relapse has an impact on the chance of observing AEs due to the competing risks action [5].

Approach 2 requires a potential (or direct) framework for analysis: the clinical question relates to the treatment causing AE occurrence as if relapse could not occur. In this case, one should consider the occurrence of AEs as if relapse would not exclude the possibility of observing AEs related to the treatment under analysis, thus in the absence of competing risks [7]. These two approaches have very different implications when the description of AEs (and relapse) occurrence is used to comparatively describe two different treatment approaches (novel vs standard, for example). Indeed, in approach 1, the more one of the treatments protects from relapse, the greater is the chance of observing an AE as first event [5]. On the other hand, with approach 2, the effect of treatment on relapse is ruled out, in principle.

Regardless of the approach of interest, estimators and theoretical quantities used in clinical papers to describe AEs data should have the following features:

  1. estimators should address for the presence of right censoring;

  2. theoretical quantities and estimators should be functions of time.

This work has two aims: the first is to critically review the standard theoretical quantities and estimators with reference to their appropriateness for dealing with approaches 1 or 2 and to the desired features (a) and (b). The second aim is to define a strategy to relax the assumption of independence between the potential times to the competing events, such as that to AE and that to relapse, of the commonly used estimators when potential approach 2 is of interest.

The paper is organized as follows: in “Notation, setting and simulated example data” section we define notation and setting and we introduce a simulated dataset that will be used to present the standard methods. In “Standard methods” section we review the standard methods: the crude proportion of AEs and the epidemiological AEs rate, which fail in at least one of the two desirable features; the crude incidence and cause-specific estimators which are instead consistent with the two features (a) and (b). In “The potential approach estimation of the cumulative incidence of AE” section we clarify the impact of the crucial assumption of independence between potential times to competing events (AEs and relapse) of the standard estimators used in the potential approach. We propose the use of regression models, stratified Kaplan-Meier curves and inverse probability of censoring weighting to relax the assumption of independence by achieving conditional independence given covariates. In “Motivating example: osteonecrosis in childhood acute lymphoblastic leukemia” section we present results of the standard and regression-based methods on the motivating example on osteonecrosis in childhood ALL. “Simulation protocol” section describes an extensive simulation protocol developed to show the performance of the proposed methods and the impact of not accounting for an unmeasured covariate and “Simulation results” section presents the results of the simulations. The paper ends with discussion in “Discussion and conclusion” section.

Notation, setting and simulated example data

The occurrence of AEs in time defines a survival time from origin TAE. Similarly, the occurrence of a relapse in time defines the survival time TRL. These survival times are called “potential” since only the minimum is observable as first event. The observable failure time is T=min(TAE,TRL) and the observable cause of failure is E (equal to 1 if AE, equal to 2 if relapse). In the presence of a right censoring time C, the observed time is min(TC) and Δ=I(TC) is the failure indicator. Finally, (ti,δi,δi·ei),i=1,...,N, is used to denote the observed failure time, the failure indicator and the cause of failure, on a sized N sample.

In order to make the standard methods commonly used to analyse AEs data clearer, we simulated an example data of N=300 subjects using the inversion method of Bender et al. [8]. The potential times TAE and TRL were simulated from exponential distributions with parameters depending on two independent binary covariates X1 and X2, with P(X1=1)=0.3 and P(X2=1)=0.4. The combination of X1 and X2 identifies a different hazard profile in patients experiencing an AE or a relapse:

  • if X1=0 and X2=0,TAEExp(1) and TRLExp(2)

  • if X1=0 and X2=1,TAEExp(3) and TRLExp(6)

  • if X1=1 and X2=0,TAEExp(3) and TRLExp(5)

  • if X1=1 and X2=1,TAEExp(9) and TRLExp(15)

One may note that, fixed X1 (e.g. X1=0 or X1=1), if X2 changes from 0 to 1, both the hazard of AE and the hazard of relapse triple. For sake of simplicity, we do not consider the presence of censoring, but we will explain the influence of right censoring on each standard method. The distribution of TAE and TRL is presented in Fig. 1 panel a) and b).

Fig. 1.

Fig. 1

a Scatterplot of the potential times TAE and TRL according to the four groups identified by the binary covariates X1 and X2; b Zoom of the scatterplot in panel a) selecting potential times TAE and TRL lower than 1; c histogram of the distribution of the failure times calculated as the minimum value between TAE and TRL, in the simulated data of N=300 subjects

At first glance, the dependence between times TAE and TRL is not evident. As expected, the times of patients with both covariates equal to 0 are systematically greater than the remaining ones. Indeed, median values for TAE and TRL are: 0.7 and 0.4 (X1=0,X2=0), 0.2 and 0.1 (X1=0,X2=1), 0.2 and 0.2 (X1=1,X2=0), 0.1 and 0.0 (both covariates equal to 1) with a Spearman correlation equal to 0.95. However the overall Spearman correlation between times TAE and TRL is equal to 0.25, a moderate value due to the absence of correlation within each of the four groups of patients conditional on the covariate value identifying the groups. The distribution of the failure times T, calculated as the minimum value between TAE and TRL, is displayed in Fig. 1 panel c). The distinct failure times tj, the number of patients at risk nj at each time point and the number of subjects experiencing an AE or a relapse as first event (djAE and djRL respectively) are displayed in Table 1, where 80 subjects developed an AE and 220 relapsed.

Table 1.

Simulated example data of N=300 subjects and estimators

j tj nj djAE djRL S^(tj) h^AE(tj) h^AE(tj)S^(tj-) CI^AE(tj) AN^AE(tj) KM^AE(tj)
1 0.0 300 29 75 0.653 0.097 0.097 0.097 0.097 0.097
2 0.1 196 21 60 0.383 0.107 0.070 0.167 0.204 0.194
3 0.2 115 11 39 0.217 0.096 0.037 0.204 0.300 0.271
4 0.3 65 4 12 0.163 0.062 0.013 0.217 0.362 0.316
5 0.4 49 6 7 0.120 0.122 0.020 0.237 0.484 0.400
6 0.5 36 1 10 0.083 0.028 0.003 0.240 0.512 0.416
7 0.6 25 3 4 0.060 0.120 0.010 0.250 0.632 0.486
8 0.7 18 0 3 0.050 0.000 0.000 0.250 0.632 0.486
9 0.8 15 2 2 0.037 0.133 0.007 0.257 0.765 0.555
10 1.0 11 1 3 0.023 0.091 0.003 0.260 0.856 0.595
11 1.1 7 0 2 0.017 0.000 0.000 0.260 0.856 0.595
12 1.2 5 0 2 0.010 0.000 0.000 0.260 0.856 0.595
13 1.3 3 1 1 0.003 0.333 0.003 0.263 1.189 0.730
14 2.1 1 1 0 0.000 1.000 0.003 0.266 2.189 1.000

tj are the distinct failure times measured in years; nj is the number of patients at risk at time tj; djAE and djRL are the number of patients developing AE or relapse at time tj, respectively; S^(tj) is the survival function at time tj, estimated according to KM estimator on failures due to AE or relapse; h^AE(tj) corresponds to the instantaneous rate of AE and it is the non-parametric ML estimator of the CSHAE(t); h^AE(tj)S^(tj-) is the product of instantaneous rate of AE with the KM estimator of the proportion of patients free from treatment failure up to time tj-; CI^AE(tj) is the AJ estimator of CIAE(t) of AE; AN^AE(tj) and KM^AE(tj) are the AN and KM estimates of the CSHAE(t)

Standard methods

The crude proportion of AE

The empirical crude proportion (CP) is defined as

CP=i=1NI(δi·ei=1)N

where I(·) is the indicator function. CP is the count of patients who fail due to AEs during the entire follow-up over a total of N patients, regardless of the individual follow-up length.

In our example data, this quantity is CP=80300=0.27. CP is consistent with the first approach of analysis, since it can be thought as a naïve estimate of the probability of observing AEs over the entire follow-up. Of note, AEs are counted in the numerator of CP only if observed as first events, and relapse acts as competing risk. CP is not a function of time in the sense that is not calculated at different time points and it does not address properly for the presence of right censoring that affects the count in the numerator, but N is fixed. Thus, CP fails with respect to both features (a) and (b).

The crude incidence of AE

The theoretical CP can be generalized in time by the crude cumulative incidence (CI) probability (that in the remaining of the work will be called crude incidence) CIAE(t)=P(Tt;E=1), which corresponds to the absolute risk of treatment failure due to AEs up to time t. The non-parametric maximum likelihood (ML) estimator of CIAE(t) is given by the Aalen-Johansen (AJ) formula

CI^AE(t)=tjtS^(tj-)·h^AE(tj) 1

where S^(tj-)=P^(T>tj-) is the KM non-parametric ML estimator of the proportion of patients free from treatment failure, either caused by AE or relapse, up to time tj- and

h^AE(t)=i=1NI(ti=tj;δij·eij=1)nj 2

is the instantaneous rate of AEs, which corresponds to the proportion of patients experiencing AEs at time t over the total number of patients at risk, i.e. free from AEs and relapse (and censoring), at that time. Of note, the denominator corresponds to the person-time spent at risk in the time window [tj,tj+1). In Table 1, at each time tj, the quantities needed to estimate CI^AE(t) are displayed and in Fig. 2 panel a) the graph is shown.

Fig. 2.

Fig. 2

a CIAE(t) estimated through the AJ formula; b ANAE(t) and KMAE(t) step curves of the cumulative CSHAE(t) and of the cumulative incidence of AE, respectively

The AJ estimator of CIAE(t) is consistent with the first approach of analysis, since it can be thought as an estimate of the probability of treatment failure due to AEs over the course of time, where, since AEs are counted only if observed as first events, relapse acts as a competing event. One may note in (1) the indirect protection of relapse, that lowers down S^(tj-) when relapse occurs. The CIAE(t) has both features (a) and (b): it addresses for the presence of right censoring, due to the non-parametric ML estimator property, and it is a function of time. In this setting, the possible impact of covariates on the CIAE(t) can be assessed by the Fine and Gray model or the regression model based on pseudo-values.

The epidemiological AE rate

The epidemiological AE rate is defined as

Rate=i=1NI(δi·ei=1)i=1Nti

and it originates from the count of patients observed to fail due to AEs during the entire follow-up divided by the total time spent free from treatment failure, i.e. spent free both from AEs and relapse. The AE rate represents the number of observed AEs per 1 unit of person-time spent at risk.

In our example, the AE rate is Rate=8056.4·12=0.12 with the total time at the denominator calculated in months from Table 1. The AE rate can be thought as an estimate of the probability of observing AEs in the next time unit for a patient that is now free from AEs and relapse (and censoring), assuming this probability constant in time. If this probability cannot be reasonably assumed constant, the AE rate can be interpreted as an “average” rate over the follow-up. The AE rate is consistent with the second approach of analysis, where the focus is on treatment action in the development of AEs in patients relapse free in time. Indeed, the occurrence of relapse (or of right censoring) implies a contribution to the denominator of the time to relapse (or to right censoring) and a null contribution to the numerator. The AE rate addresses feature (a) but not (b), due to the assumption of constancy in time. The AE rate can be proved to be the parametric ML estimator of the probability of observing AEs in the next time unit for a patient free from treatment failure (either caused by relapse or AE) assuming this probability constant in time.

The cause-specific hazard of AE and related quantities

The theoretical AE rate can be easily generalized in time by relaxing the assumption of constancy in time by

CSHAE(t)=limΔt0+P(t<Tt+Δt;E=1|T>t)Δt 3

This quantity corresponds to the cause-specific hazard (CSH) of AEs and the estimator of the instantaneous hazard is h^AE(t) in (2). Based on cause-specific hazard, two important cumulative step function estimators can be derived, as described below.

The Aalen-Nelson estimator of the cumulative hazard of AE

We may consider the cumulative sum of CSHAE(t) to obtain an estimator of the cumulative hazard 0tCSHAE(u)du by the Aalen-Nelson (AN) formula

AN^AE(t)=tjt1·h^AE(tj) 4

which is based on the non-parametric ML estimator h^AE(t) in (2). The ANAE(t) estimator is consistent with the second approach of analysis, as there is no indirect protection from the competing event (relapse). One may observe by comparing (1) and (4) that S^(tj-), which is lowered down in (1) when a relapse occurs, is replaced in (4) by the fixed value 1, as if relapses were removed.

Estimator (4) addresses for the presence of right censoring (feature (a)) since h^AE(tj) does so, and it is a function of time (feature (b)). The values of the estimates are reported in Table 1 and the corresponding curve is displayed in Fig. 2 panel b).

The ANAE(t) curve can be interpreted as a naïve estimator of the expected number of AEs a patient may experience in the time interval up to t when he/she can develop recurrent AEs in time as if relapse was removed.

The Kaplan-Meier estimator of the cumulative incidence of AE

One may consider the cumulative product of (1-h^AE(t)) terms, with the complement to 1 to obtain an estimator of the cumulative incidence function that corresponds to 1-exp0t-CSHAE(u)du, by the KM formula

KM^AE(t)=1-tjt(1-h^AE(tj)) 5

which is based on the non-parametric ML estimator h^AE(t) in (2).

Estimator (5) addresses for the presence of right censoring (feature (a)) since it is addressed in h^AE(tj), and it is a function of time (feature (b)). In Table 1 the values of the estimates obtained with this method are reported and the corresponding curve is displayed in Fig. 2 panel b).

The KM^AE(t) curve can be interpreted as a naïve estimator of the cumulative incidence of treatment failure only due to AEs, as if relapse was removed. This KM^AE(t) corresponds to a KM estimator where relapse is just a censored observation. Also, 1-KM^AE(t) is often used to naïvely estimate the so called “AE free survival curve” that is the cumulative probability of not having AEs in time (censoring time to relapse).

The potential approach estimation of the cumulative incidence of AE

Issues on the meaning of AN and KM estimators

At first glance, the ANAE(t) and KMAE(t) curves could be interpreted only in terms of treatment action in determining AE occurrence, regardless of the impact of relapse. This interpretation comes natural since hAE(t) is related only to the velocity of development of AEs in time. It is not so for the crude incidence CIAE(t), where the presence of S(tj-) in (1) makes evident that the occurrence of relapse has a direct influence on the estimate.

One may note, however, that the occurrence of relapse may exclude “not at random” patients from the risk set on which the instantaneous rate of AEs is calculated in (2). This indirect patients selection due to relapse may influence the interpretation of CSHAE(t) and of the chosen step function. Due to this, the CSHAE(t) in (3) may not capture entirely how treatment influences the occurrence of AE at time t in patients who, at that time, would be free from AE, which is theoretically represented by the potential hazard (pH) of AE

pHAE(t)=limΔt0+P(t<TAEt+Δt|TAE>t)Δt 6

with corresponding potential cumulative incidence

pIAE(t)=P(TAEt)

that represents treatment action on AE, regardless of the impact of relapse.

Only if there is independence between TAE and TRL, the sub-sample of patients with T>t in (3) is a random sample of patients with TAE>t in (6) and thus the two expressions (3) and (6) coincide. As a consequence, under the assumption of independence, KM^AE(t) in (5) is the suitable estimator of pIAE(t). A similar argument follows for AN^AE(t) as estimator of 0tpHAE(u)du This assumption, however, is often not reasonable and cannot be tested.

To enlighten the problems related to the interpretation of AN^AE(t) and KM^AE(t) in the absence of independence, we simulated different datasets according to the setting specified in “Notation, setting and simulated example data” section, with increasing sample size from N=100 to N=1000, and we compared these quantities with the potential cumulative hazard function pHAE(t) and incidence function CFIAE(t). The potential cumulative hazard function can be calculated as -log(pSAE(t)), where the potential survival function pSAE(t)=1-pIAE(t) defined as

pSAE(t)=P(X1=0)P(X2=0)exp(-λ00AE·t)++P(X1=0)P(X2=1)exp(-λ01AE·t)++P(X1=1)P(X2=0)exp(-λ10AE·t)++P(X1=1)P(X2=1)exp(-λ11AE·t) 7

corresponds to an average of exponential distributions weighted by the proportion of patients with covariates values X1=k,k=0,1 and X2=l,l=0,1.

In Fig. 3 panels a) and b) the AN^AE(t) and KM^AE(t) curves calculated in the simulated datasets are displayed together with the potential quantities. One can notice that, independently from the sample size of the dataset and from the low overall Spearman correlation between times TAE and TRL (0.17, 0.33, 0.20, 0.26 for sample sizes N=100,N=300,N=500,N=1000 respectively), the AN^AE(t) and KM^AE(t) curves do not fit the potential quantities.

Fig. 3.

Fig. 3

a AN^AE(t) calculated for different sample sizes; b KM^AE(t) calculated for different sample sizes

To relax the assumption of independence between the two potential times TAE and TRL one possibility is to estimate the hazard of AE in strata defined by covariates (that influence the AE and relapse time distributions) assuming that only within each stratum there is independence between TAE and the indirect selection due to TRL. This approach can be carried out by averaging stratum estimates obtained either non parametrically or by the use of the Cox regression model on CSHAE(t) leading to a weighted average survival probability. An alternative method is addressing the presence of selection due to relapse through inverse probability of censoring weighting (IPCW) [911]. This method aims at creating a pseudo-population that is similar to the one observable in the absence of relapse by adding a weight to patients who do not develop relapse. On this pseudo-population a survival probability is then calculated and the incidence is derived as its complement to 1.

Weighted average survival probability

The survival probabilities for each combination of the observed covariates values pSAEkl(t)=P(TAE>t|X1=k,X2=l),k=0,1 and l=0,1, can be estimated through the KM estimator within each stratum or by a Cox proportional hazards (PH) model including these covariates among regressors:

CSHAE(t)=CSH0,AE(t)exp(β1X1+β2X2)

where CSH0,AE(t) is the baseline hazard.

The overall average survival is determined by weighting the survival probabilities in each level of the covariates by the proportion of subjects having that covariate levels. If both covariates X1 and X2 are observed, the weighted average survival is calculated as

pS^AE(t)=i=1nI(X1i=0,X2i=0)npS^AE00(t)++i=1nI(X1i=0,X2i=1)npS^AE01(t)++i=1nI(X1i=1,X2i=0)npS^AE10(t)++i=1nI(X1i=1,X2i=1)npS^AE11(t) 8

where pS^AEkl(t) indicates the survival probability at time t obtained through the KM estimator or predicted by the Cox model for a patient having X1=k,k=0,1, and X2=l,l=0,1. This is an estimator of the AE free survival probability curve and the complement to 1 of this estimator represents the incidence probability.

IPCW estimator of the survival probability

The unitary contribution of a subject i in the count of subjects at risk of experiencing an AE at time t is replaced in (2) and (5) by the weight wi(t)=1pS^RLX1i,X2i(t) where pS^RLX1i,X2i(t) is the estimate of the potential conditional probability pSRLX1,X2(t)=P(TRL>t|X1,X2) of being relapse free until time t. The lower is the probability of being relapse free, the greater are the weights, given the inverse proportionality. The estimate of pSRLX1,X2(t) can be based on the KM estimator within each combination of the observed covariates or on the fit of a Cox PH model for relapse CSHRL(t)=CSH0,RL(t)exp(β1X1+β2X2) in which prognostic factors for AE and relapse are entered as covariates. Alternative survival regression models can also be considered as for example the Aalen additive model which does not rely on the PH assumption [12]. Once the weights are calculated, one can estimate the survival probability for time to AE in the absence of relapse, i.e. the AE free survival curve, using the KM estimator [13] and then derive the incidence probability as its complement to 1.

Motivating example: osteonecrosis in childhood acute lymphoblastic leukemia

We show here the application of methods revised in “Standard methods” section and of those proposed in “The potential approach estimation of the cumulative incidence of AE” section to data on children with ALL enrolled in two subsequent multicenter clinical trials conducted in Italy with the Italian Association of Pediatrical Hematology and Oncology (AIEOP) [1]. The aim is to assess how front line chemotherapy treatment is related or affects children with a relatively rare, yet disabling, complication such as severe osteonecrosis (ON). The primary evaluation of outcome usually considers a composite time-to-event endpoint, i.e. time to failure (where failure is defined as resistance, relapse, death or second malignant neoplasm, whichever occurs first). The analysis with focus on ON broadens the evaluation of treatment efficacy to relevant toxic events, as if this relevance would justify considering them as failures, in other words as competing events. In particular, here ON is an AE of the frontline treatment protocol administered to children with ALL and acts as competing risk for death (not due to relapse) or first relapse (which is followed by another type of treatment).

We analysed data on 3691 children aged 1-17 years at diagnosis of ALL and, of those, 99 experienced an ON during or after the end of the front line treatment while 725 children experienced a competing event of the primary endpoint (596 developed a relapse, 96 died, 23 experienced a second malignant tumour and 10 were resistant); all others were right censored at last follow-up. In the following, we will identify the competing event as relapse or death.

The crude proportion of ON is CP=993691=0.027, meaning that 2.7% of the study population developed an AE before relapse or death. In Fig. 4 panel a) the crude incidence of ON calculated through the AJ estimator is displayed. The CI probability of failure due to an ON is lower than 3.0% after 6 years from diagnosis of ALL.

Fig. 4.

Fig. 4

a) CION(t) estimated through the AJ formula; b) ANON(t) and KMON(t) step curves of the cumulative CSHON(t) and of the cumulative incidence of ON, respectively

The epidemiological AE rate, calculated as the number of subjects experiencing an ON over the total time at risk of developing an ON (i.e. time at risk is time to ON, relapse/death, censoring) is Rate=9919940.96=0.005, meaning that 5 ONs per 1000 person-years occurred. The estimates of the CSHON(t) obtained through the AN and KM estimators are displayed in Fig. 4 panel b). Multiplying by 100 the AN estimator at 6 years from diagnosis, the expected number of ONs in 100 hypothetical children is 3.11. The estimated cumulative incidence at 6 years is 0.0306 (SE=0.003) and it is close, as expected due to rarity, to the CSHON(t) estimate (0.0311).

In this context two covariates are of relevance: age at diagnosis of ALL, since incidence of ON and of relapse tends to be higher with higher age, and risk group, which is the stratification of the children, based on genetic features and cytological/molecular early response to treatment, that defines the intensity of the administered treatment (the higher the risk, the higher the intensity). In order to correctly account fo the presence of a dependence between the potential time of ON and the potential time of relapse or death we derived the estimate of the potential incidence of ON from formula (8), including first only risk group as a covariate and then adding also age at diagnosis (>10 versus 10 years).

In Fig. 5 the estimates of the cumulative incidence probability obtained with different methods are displayed. One can see that the naïve KM estimator and the weighted average method considering only risk group as covariate give the same estimates of the incidence probability. Including also age at diagnosis as covariate, a similar but higher incidence probability is obtained. The distance between these two groups of curves suggests that the inclusion of the second covariate removes part of the dependence between time to ON and to relapse/death. However, overall the inclusion of the covariates does not change remarkably the incidence probability compared to the naïve estimator. This could be due to either to the low ability of the covariates to remove the dependence or to the fact that the two potential times do not have a strong dependence as the pathways to relapse or death are not so much related to the determinants of ON.

Fig. 5.

Fig. 5

Incidence probability estimates obtained with the naïve KM estimator, the weighted KM estimator stratifying only for 1 covariate (risk group) or for 2 covariates (risk group and age at diagnosis) and the weighted Cox model including only risk group or both risk group and age at diagnosis as covariates

Simulation protocol

Data generation

We extended the simulation setting of “Notation, setting and simulated example data” section (1000 datasets with N=300 each) by considering 4 different scenarios [8, 14].

The 4 scenarios differ in the parameters of the exponential distribution from which the potential competing time TRL is generated (Table 2). In scenario 1 none of the covariates has an impact on the hazard of the potential time TRL and potential times TAE and TRL are independent. In the other three scenarios there is at least one covariate with an impact on the hazard of the potential time TRL and the hazard of the potential time TAE that generates dependence between these potential times. In particular, in scenario 2 only X1 has an impact on both hazards whereas in scenarios 3 and 4 both covariates have an impact on both hazards (scenario 4 is the same of “Notation, setting and simulated example data” section).

Table 2.

Parameters of the exponential distributions of the potential times TAE and TRL

Scenario AE Relapse
λ00AE λ01AE λ10AE λ11AE λ00RL λ01RL λ10RL λ11RL
1 1 3 3 9 2 2 2 2
2 1 3 3 9 2 2 5 5
3 1 3 3 9 2 6 5 5
4 1 3 3 9 2 6 5 15

λklAE and λklRL are the parameters of the exponential distributions of TAE and TRL, respectively, when X1=k,k=0,1, and X2=l,l=0,1

We simulated also an additional scenario varying the imbalance of the covariates, with two cases, in scenario 4. In case A we set the parameters of the Bernoulli distributions from which the binary covariates X1 and X2 are generated in order to construct the “worst” situation that may happen when analysing real data, that is both covariates are very frequent in the population under study. To do so, we changed the Bernoulli parameters from the original P(X1=1)=0.3 and P(X2=1)=0.4 to P(X1=1)=P(X2=1)=0.5. Then, in case B we set the Bernoulli parameters in order to obtain the “best” scenario, that is at least one covariate (here X2) is rare in the study population. In this case, we kept fixed the prevalence of the first covariate P(X1=1)=0.3 and we changed that of the second one to P(X2=1)=0.1.

We added another simulation where the parameters of the exponential distributions of the hazard of relapse where changed in order to reduce the impact of the competing event in scenario 4. First we set the parameters in order to have, fixed X1=0 (or X1=1), when X2 changes, an increase of 2 times in the hazard of relapse (case A) and then an increase of 1.5 times in the hazard of relapse (case B).

Simulations were carried out using the R software version 4.0.3 available at http://cran.r-project.org/.

Estimated quantities

For each scenario, we calculated the potential incidence probability in an hypothetical world where relapse is absent deriving it from formula (7) at two fixed time-points (t=0.2 and t=0.3). In addition, we estimated the expected number of subjects at risk of developing an event (AE or relapse) at these time-points for an hypothetical sample of N=300 patients as

N·P(TRL>t,TAE>t)==N·[P(X1=0)P(X2=0)exp(-λ00RL·t)exp(-λ00AE·t)++P(X1=0)P(X2=1)exp(-λ01RL·t)exp(-λ01AE·t)++P(X1=1)P(X2=0)exp(-λ10RL·t)exp(-λ10AE·t)++P(X1=1)P(X2=1)exp(-λ11RL·t)exp(-λ11AE·t)]

where the calculation of P(TRL>t,TAE>t) in each stratum is obtained by the product of P(TRL>t) and P(TAE>t) due to the conditional independence, given covariates.

We compared the estimators that can be used to estimate the potential incidence probability for each dataset:

  1. the naïve KMAE(t) presented in formula (5)

  2. the complement to 1 of the weighted average survival probability, obtained through the KM estimator in strata defined according to the observed covariate X1, thus considering covariate X2 as unobserved

  3. the complement to 1 of the weighted average survival probability, obtained through the KM estimator in strata defined according to the observed covariates X1 and X2

  4. the complement to 1 of the weighted average survival probability, obtained through the Cox model with only the observed covariate X1, thus considering covariate X2 as unobserved

  5. the complement to 1 of the weighted average survival probability, obtained through the Cox model with the observed covariates X1 and X2

  6. the complement to 1 of the KM survival probability on the pseudo-population obtained by the IPCW estimator (Cox based) where weights are estimated according to the observed covariate X1, thus considering covariate X2 as unobserved

  7. the complement to 1 of the KM survival probability on the pseudo-population obtained by the IPCW estimator (Cox based) where weights are estimated according to the observed covariates X1 and X2.

Simulation results

Simulation results in the four scenarios are presented in Fig. 6. At each time-point the expected number of subjects at risk of developing an event is displayed and the distance between the estimate and the potential incidence probability (bias) is represented in a boxplot for each of the 7 estimators.

Fig. 6.

Fig. 6

Simulation results in the four scenarios at times t=0.2 and t=0.3. The grey horizontal line is the reference null bias

In scenario 1, where the hazard of relapse is independent from the covariates values, the results of all methods are similar: the median value of the bias of the incidence estimated through each method and the potential incidence is equal to 0. Of note, also the estimate of the incidence obtained with the naïve KM estimator censoring relapsed subjects is unbiased.

In scenario 2, where only X1 has an impact on the hazard of relapse, the naïve KM estimator gives a biased incidence probability. With the other methods unbiased estimates are obtained. Of note, the methods in which only the X1 covariate is considered perform slightly better with respect to those in which both covariates are included, due to the fact that X2 does not have an impact on relapse.

In scenario 3 the hazard of relapse depends on X1 and, when X1=0, also on X2. Methods in which only X1 is included give biased estimates. Methods with X1 and X2 covariates give unbiased estimates with the exception of the IPCW estimator, where there is an overestimation of the incidence probability. This is due to the fact that in this scenario also an interaction between the two covariates is present: X2 has an impact on the hazard of relapse only when X1=0. However this interaction is not accounted for in the model to estimate weights. To corroborate this result, Fig. 7 shows the results of weighted Cox and IPCW approaches when X1,X2 and their interaction are considered. The reader may observe that the inclusion of the interaction overcomes the bias in the IPCW method, while it is not needed in the Cox model since, conditional on X1 and X2, this model requires an assumption of independence between TAE and TRL that is present even if there is an interaction between X1 and X2. In this regard, the Cox model is robust also in the presence of an interaction between the covariates on the hazard of relapse.

Fig. 7.

Fig. 7

Simulation results for scenario 3 of the IPCW estimator accounting for the presence of an interaction between X1 and X2 in the estimate of the weights. The grey horizontal line is the reference null bias

In the last scenario displayed in Fig. 6, when both X1 and X2 have an impact on the hazard of relapse, all methods including one covariate only give similar biased results. However, the distance between the estimated and the potential incidence probabilities is lower than that obtained with the naïve KM estimator. The estimates from the weighted KM or the Cox model and from the IPCW are unbiased when the methods account for the presence of all covariates that have an impact on the hazard of relapse. Of note, the estimates from the IPCW have a greater variability with respect to the others.

In the additional simulations on variations of scenario 4, all the proposed methods with the exception of the IPCW were compared.

Figure 8 shows results when the Bernoulli parameters change, named case A when P(X1=1)=P(X2=1)=0.5 and case B when P(X1=1)=0.3 and P(X2=1)=0.1. As expected, in both cases only estimates obtained from the weighted KM and the weighted Cox model methods with the inclusion of both covariates are unbiased. Comparing the results of case A with the corresponding results of scenario 4 in Fig. 6 one may observe the greater variability due to the lower number of patients at risk of developing an event (AE or relapse). Comparing the results of case B with the corresponding results of scenario 4 in Fig. 6, the estimates of all methods are less biased.

Fig. 8.

Fig. 8

Simulation results for the variation of scenario 4 when P(X1=1)=P(X2=1)=0.5 (Case A) or P(X1=1)=0.3 and P(X2=1)=0.1 (Case B). The grey horizontal line is the reference null bias

Figure 9 shows results when the parameters of the exponential distributions from which the hazard of relapse was simulated change, named case A when λ00RL=2,λ01RL=4,λ10RL=5,λ11RL=10 and case B when λ00RL=2,λ01RL=3,λ10RL=5,λ11RL=7.5. Comparing the results with the corresponding results of scenario 4 in Fig. 6 one can observe that the lower is the hazard of relapse, the lower is the bias of the estimated incidence probability. Of note, in this simulation setting the bias obtained from the naïve KM censoring time to relapse reduces, but it is still the worst incidence estimator.

Fig. 9.

Fig. 9

Simulation results for the variation of scenario 4 when fixed X1=0 (or X1=1), if X2 changes, the hazard of relapse increases of 2 times (Case A) or of 1.5 times (Case B). The grey horizontal line is the reference null bias

Discussion and conclusion

In the majority of clinical studies on novel therapies with time-to-event endpoints, toxicity, related to the occurrence of AEs, is analysed with different estimators (such as crude proportions and rates) apparently disregarding the critical aspects that intervene due to the competing risks of AEs versus the primary endpoint event(s). In this work we propose solutions that relax the assumption of independence between the potential time to AE and the potential time to the efficacy endpoint event(s), thus allowing a proper estimate and interpretation of the cumulative incidence of AE. We addressed the particular case of non-fatal time-to-event efficacy endpoint. Of note, if failure was a fatal endpoint, the analysis of the occurrence of AE over time in principle could be carried out in the same way if the fatal endpoint is considered as a competing risk. However, the interpretation would be rather artificial. This is an obvious limitation of the method we proposed.

In the first part of this work we reviewed two different approaches starting from the type of clinical question when analysing AEs data and considering, for simplicity, one event for the efficacy endpoint, i.e. relapse. When the aim is the description of the observed occurrence of AE as first event in a competing risk framework (approach 1), treatment ability to protect from relapse has an impact on the chance of observing the AE due to the competing risks action. While the frequently presented crude proportion is not a function of time and does not properly account for censoring [5, 14], the Aalen-Johansen estimator of the crude incidence of AEs, commonly used for competing risks analysis [15], gives a proper estimate of the probability of treatment failure due to AEs over the course of time, where relapse acts as competing event (since AEs are counted only if observed as first events). The description of the observed occurrence of AE through the crude incidence together with the crude incidence of the efficacy endpoint enables to quantify the risks and benefits from the patient perspective [16]. Of note, the methods described in the previous citation can be useful to perform hypothesis testing on the two types of competing risks.

When the aim is the description of the potential occurrence of AE in relapse free patients, in a potential framework (approach 2), the epidemiological AE rate, which is not a function of time [4, 5], is often replaced by the Aalen-Nelson or Kaplan-Meier estimators of the cause-specific hazard of AE (as first event) and of the cumulative incidence of AE, regardless of relapse occurrence. However, the occurrence of relapse may exclude, not at random, patients from the risk sets on which the instantaneous rate of AE is calculated. This indirect patients selection operated by relapse, which is due to the dependence between the two potential times to AE and to relapse, leads to biased estimates.

One possibility to handle this dependence could be that of resorting to copula models treating the problem as a bivariate problem. This approach would go beyond our aim which is focused on a single marginal (that of the adverse event) and would require the specification of the type of copula. We think that resorting directly on a single marginal distribution is a more direct approach and has the advantage of not requiring a parametric specification of the type of copula [17].

We proposed alternative methods, such as weighted average survival probability (estimated either by the Kaplan-Meier estimator or by the use of the Cox model) and inverse probability of censoring weighting, and we proved through simulations that they overcome the problem due to the dependence between the potential times to AE and to relapse. In particular, we proved through simulations that one can handle patients selection in the risk sets, and thus obtain conditional independence between the two potential times, adjusting for all the observed covariates that induce dependence. Of note, we also show that, adjusting only for one observed covariate, thus ignoring the full dependence structure, gives anyway a less biased estimate compared to the naïve Kaplan-Meier estimator. The naïve Kaplan-Meier estimator, censoring time to relapse, is always biased, unless the hazard of relapse is independent from the covariates values. In a hypothetical scenario where all the covariates are observed, the weighted average incidence estimate obtained either non parametrically or by the Cox model and the inverse probability of censoring weighting would give an unbiased estimate of the incidence probability of AE or of the AE free survival curve. The same applies (data from simulations not shown) for the Aalen-Nelson estimator of the cumulative hazard of AE.

In addition, we pointed out that with the inverse probability of censoring weighting method one could obtain biased estimates when all the possible interactions between the observed covariates are not included in the model to estimate the weights (scenario 3 in Table 2). However, the inclusion of the interaction is not needed when the weighted Cox model is used, since conditional on the observed covariates, this model is robust in estimating the average incidence. Nevertheless, a limitation in the use of the weighted average survival method is given by the fact that it may be applied only in the presence of binary (or categorical covariates), since if the covariate is continuous it is impossible to identify the subgroups in which the incidence function can be estimated.

In general, our extended simulation protocol confirms that patients selection in risk sets is stronger and thus the bias is larger in the naïve KM estimator the higher is the imbalance in covariate values or the hazard of the competing event, relapse for given covariates [18].

Of note, we did not include the presence of right censoring in the simulation protocol since the estimators we investigated already can account for this additional complexity in the data.

Although clinical trials usually have an efficacy endpoint as a primary endpoint, safety analysis is always an important secondary endpoint, especially for non-inferiority trials. The approaches we proposed can be used for the comparison of the risk of developing adverse events depending on the treatment assigned and will be matter of future work.

Acknowledgements

Liv Andres-Jensen for comments and suggestions on the clinical application.

Abbreviations

AE

Adverse event

AJ

Aalen-Johansen

AIEOP

Italian Association of Pediatrical Hematology and Oncology

ALL

Acute lymphoblastic leukemia

AN

Aalen-Nelson

pH

Potential hazard

pI

Potential cumulative incidence

pS

Potential survival function

CI

Crude cumulative incidence

CP

Crude proportion

CSH

Cause-specific hazard

IPCW

Inverse probability of censoring weighting

KM

Kaplan-Meier

ML

Maximum likelihood

ON

Osteonecrosis

PH

Proportional hazards

RL

Relapse

SE

Standard error

Authors’ contributions

E.T. conceptualized the idea. E.T. and L.A. drafted the original manuscript. D.P.B. and M.G.V. reviewed and provided feedback on all drafts. All authors edited and reviewed the final manuscript. The authors read and approved the final manuscript.

Funding

Grant PRIN project nr. 20178S4EK9 for financial support. The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Availability of data and materials

Available from the authors under request.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Parasole R, Valsecchi MG, Silvestri D, Locatelli F, Barisone E, Petruzziello F, et al. Correspondence: Osteonecrosis in childhood acute lymphoblastic leukemia: a retrospective cohort study of the Italian Association of Pediatric Haemato-Oncology (AIEOP) Blood Cancer J. 2018;8(12):115. doi: 10.1038/s41408-018-0150-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Stegherr R, Beyersmann J, Jehl V, Rufibach K, Leverkus F, Schmoor C, et al. Survival analysis for AdVerse events with VarYing follow-up times (SAVVY): Rationale and statistical concept of a meta-analytic study. Biom J. 2021;63(3):650–670. doi: 10.1002/bimj.201900347. [DOI] [PubMed] [Google Scholar]
  • 3.Bender R, Beckmann L, Lange S. Biometrical issues in the analysis of adverse events within the benefit assessment of drugs. Pharm Stat. 2016;15(4):292–296. doi: 10.1002/pst.1740. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Proctor T, Schumacher M. Analysing adverse events by time-to-event models: the CLEOPATRA study. Pharm Stat. 2016;15(4):306–314. doi: 10.1002/pst.1758. [DOI] [PubMed] [Google Scholar]
  • 5.Unkel S, Amiri M, Benda N, Beyersmann J, Knoerzer D, Kupas K, et al. On estimands and the analysis of adverse events in the presence of varying follow-up times within the benefit assessment of therapies. Pharm Stat. 2019;18(2):166–183. doi: 10.1002/pst.1915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Hengelbrock J, Gillhaus J, Kloss S, Leverkus F. Safety data from randomized controlled trials: applying models for recurrent events. Pharm Stat. 2016;15(4):315–323. doi: 10.1002/pst.1757. [DOI] [PubMed] [Google Scholar]
  • 7.Coemans M, Verbeke G, Döhler B, Süsal C, Naesens M. Bias by censoring for competing events in survival analysis. BMJ (Clin Res ed). 2022;378:e071349. doi: 10.1136/bmj-2022-071349. [DOI] [PubMed] [Google Scholar]
  • 8.Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox proportional hazards models. Stat Med. 2005;24(11):1713–1723. doi: 10.1002/sim.2059. [DOI] [PubMed] [Google Scholar]
  • 9.Robins JM, Rotnitzky A. Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell N, Dietz K, Farewell V, editors. AIDS Epidemiology Methodological Issues. Boston: Birkhuser; 1992. pp. 297–331. [Google Scholar]
  • 10.Cole SR, Hernan MA. Adjusted survival curves with inverse probability weights. Comput Methods Programs Biomed. 2004;75(1):45–49. doi: 10.1016/j.cmpb.2003.10.004. [DOI] [PubMed] [Google Scholar]
  • 11.Robins JM, Finkelstein DM. Correcting for noncompliance and dependent censoring in an aids clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics. 2000;56(3):779–788. doi: 10.1111/j.0006-341X.2000.00779.x. [DOI] [PubMed] [Google Scholar]
  • 12.Bernasconi DP, Antolini L, Rossi E, Blanco-Lopez JG, Galimberti S, Andersen PK, et al. A causal inference approach to compare leukaemia treatment outcome in the absence of randomization and with dependent censoring. Int J Epidemiol. 2022;51(1):314–323. doi: 10.1093/ije/dyab150. [DOI] [PubMed] [Google Scholar]
  • 13.Willems SJW, Schat A, van Noorden MS, Fiocco M. Correcting for dependent censoring in routine outcome monitoring data by applying the inverse probability censoring weighted estimator. Stat Methods Med Res. 2018;27(2):323–335. doi: 10.1177/0962280216628900. [DOI] [PubMed] [Google Scholar]
  • 14.Beyersmann J, Schmoor C. An efficient linearity-and-bound-preserving remapping method. In: Halabi S, Michiels S, editors. Textbook of clinical trials in oncology: a statistical perspective. New York: Chapman and Hall/CRC; 2019. pp. 537–558. [Google Scholar]
  • 15.Allignol A, Beyersmann J, Schmoor C. Statistical issues in the analysis of adverse events in time-to-event data. Pharm Stat. 2016;15(4):297–305. doi: 10.1002/pst.1739. [DOI] [PubMed] [Google Scholar]
  • 16.Furberg JK, Andersen PK, Korn S, Overgaard M, Ravn H. Bivariate pseudo-observations for recurrent event analysis with terminal events. Lifetime Data Anal. 2023;29(2):256–287. doi: 10.1007/s10985-021-09533-5. [DOI] [PubMed] [Google Scholar]
  • 17.Kolev N, UD A, BVDM M. Copulas: a review and recent developments. Stoch Model. 2006;22(4):617–60.
  • 18.Putter H, Fiocco M, Geskus RB. Tutorial in biostatistics: Competing risks and multi-state models. Stat Med. 2007;26(11):2389–2430. doi: 10.1002/sim.2712. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Available from the authors under request.


Articles from BMC Medical Research Methodology are provided here courtesy of BMC

RESOURCES