BMC Medical Research Methodology. 2023 Oct 13;23:233. doi: 10.1186/s12874-023-02055-8

Covariate balance-related propensity score weighting in estimating overall hazard ratio with distributed survival data

Chen Huang 1,#, Kecheng Wei 1,#, Ce Wang 1, Yongfu Yu 1,2,3, Guoyou Qin 1,2,3
PMCID: PMC10576397  PMID: 37833641

Abstract

Background

When data are distributed across multiple sites, sharing information at the individual level among sites may be difficult. In these multi-site studies, the propensity score model can be fitted with data within each site or with data from all sites when using inverse probability-weighted Cox regression to estimate the overall hazard ratio. However, when there is unknown heterogeneity of covariates across sites, either approach may lead to bias or reduced efficiency. In this study, we proposed a method to estimate the propensity score based on a covariate balance-related criterion and to estimate the overall hazard ratio while overcoming data sharing constraints across sites.

Methods

The proposed propensity score was generated by choosing, for each site, between the global propensity score fitted in the entire population and the local propensity score fitted within that site, based on a covariate balance-related criterion. We used this proposed propensity score to estimate the overall hazard ratio of distributed survival data with multiple sites, while requiring only summary-level information across sites. We conducted simulation studies to evaluate the performance of the proposed method. In addition, we applied the proposed method to real-world data to examine the effect of radiation therapy on time to death among breast cancer patients.

Results

The simulation studies showed that the proposed method improved the performance in estimating the overall hazard ratio compared with the global and local propensity score methods, regardless of the number of sites and the sample size in each site. Similar results were observed under both homogeneous and heterogeneous settings. In addition, the proposed method yielded identical results to the pooled individual-level data analysis. The real-world data analysis indicated that the proposed method was more likely to find a significant effect of radiation therapy on mortality compared to the global propensity score method and the local propensity score method.

Conclusions

The proposed covariate balance-related propensity score in multi-site distributed survival data outperformed the global propensity score estimated using data from the entire population or the local propensity score estimated within each site in estimating the overall hazard ratio. The proposed approach can be performed without individual-level data transfer between sites and would yield the same results as the corresponding pooled individual-level data analysis.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-023-02055-8.

Keywords: Cox model, Distributed data networks, Privacy protection, Propensity score weighting, Covariate balance

Background

The growth of large multi-site medical datasets is accelerating with the development of big data and advances in data collection and storage. If data from multiple sources can be combined, the study power and generalizability can be improved, and multi-site research collaboration can also be carried out. However, in research of data from multiple sites, it is generally challenging to share information at the individual level among sites due to privacy, network security, and transmission speed [1]. Therefore, it is necessary to develop statistical methods that only require summary-level information to provide personal privacy protection while analyzing data from multiple sites.

In biomedical research, a common outcome of interest is the time-to-event endpoint, which focuses on whether or not an event occurred and when that event occurred. The Cox proportional hazards model is a popular semi-parametric approach to describe the relationship between time-to-event endpoints and a set of covariates by estimating hazard ratios [2]. For multi-site, distributed data, Lu et al. and Vilk et al. developed distributed Cox models based on iterative methods, which required intermediate data sets to be transferred multiple times between the analysis center and each site [3, 4]. Li et al. proposed a method for distributed Cox regression that did not need multiple iterative file transfers among sites, but instead used the summary-level statistics received from each site to solve for the parameters iteratively in the analysis center [5].

In observational studies, the inverse probability weighted (IPW) Cox regression model can be used to estimate the overall hazard ratio while adjusting for measured confounders through weighting [6]. The propensity score is the probability of treatment assignment conditional on the covariates, and the IPW method assigns each individual a weight equal to the inverse of the probability of receiving the observed treatment [7–9]. In multi-site, distributed studies that use propensity score weighting, Yoshida et al. compared three methods of sharing aggregate-level information to assess the performance of estimating hazard ratios from Cox models in simulated distributed data networks [10]. The estimated results were comparable to the pooled individual-level data analysis. Shu et al. estimated the hazard ratio in multi-site studies based on the IPW Cox model with summary-level information and provided theoretical justification [11]. Most multi-site studies obtained estimates based on the local propensity score and local weight, which fit propensity score models using data within each site. The local propensity score accounts for the possible heterogeneity of each site, but the sample size used to fit the models is reduced. Alternatively, a global propensity score model can be fitted using data from all sites based on distributed logistic regression, and the estimated treatment effect will be equivalent to a weighted pooled individual-level analysis [12]. However, when there is unknown heterogeneity of covariates across sites, using either the global or the local propensity score to estimate the overall treatment effect may result in bias or lower efficiency.

In this article, we propose a new method that uses only the summary-level statistics from each site to estimate the overall hazard ratio based on the new proposed propensity score in distributed survival data. The proposed propensity score is generated by choosing between global and local propensity score based on criteria to better control confounding bias and improve estimation efficiency. Our proposed propensity score is motivated by Dong et al. who proposed the subgroup balancing propensity score to estimate the subgroup treatment effect, which combined the global and local propensity score estimation to ensure covariate balance and control variance inflation [13].

The rest of the article is organized as follows. In Sect. 1 in methods, we present the weighted estimation of the overall hazard ratio through the IPW Cox model. In Sect. 2 in methods, we present the proposed method to estimate the propensity score, and provide the respective algorithms that use summary-level information to obtain the proposed propensity score. In Sect. 3 in methods, we present the methods of solving the estimating equations to estimate the overall hazard ratio based on the proposed propensity score. In the simulations section, we present the simulation results demonstrating the performance of the proposed method and compare it with the global or local propensity score method and the pooled individual-level data analysis. In the application section, we give a real-world data application for illustration. At the end of the article, we conclude with some discussion.

Methods

Weighted estimation of the overall hazard ratio

Let $X$ be a vector of measured confounders and $A$ be a binary treatment variable ($A=1$ if treated and $A=0$ if untreated). $T$ is the true survival time and $C$ is the censoring time, which is assumed to be independent of $T$ given $X$. Due to censoring, we observe $\tilde{T}=\min(T,C)$ and $\delta=I(T\le C)$, where $I(\cdot)$ is the indicator function. Suppose we observe $n$ independent samples $(A_i,\tilde{T}_i,X_i,\delta_i)$, $i=1,\ldots,n$, from $K$ data-contributing sites. Let $\Omega_k=\{i: i \text{ in site } k, \text{ for } i=1,\ldots,n\}$ be the index set for individuals belonging to the $k$th site with size $n_k$, and let $G_i=k$ if individual $i$ belongs to the $k$th site, where $k=1,\ldots,K$.

Suppose we have $d$ distinct observed event times across all sites, where $T_1^D<T_2^D<\cdots<T_d^D$. For $j=1,\ldots,d$, let $D_j$ be the set of individuals who have the observed event time $T_j^D$, $D_j=\{i:\tilde{T}_i=T_j^D,\ \delta_i=1,\ i=1,\ldots,n\}$, and let $R_j$ be the risk set of individuals who are at risk at time $T_j^D$, $R_j=\{i:\tilde{T}_i\ge T_j^D,\ i=1,\ldots,n\}$. Also, let $R_j^{(k)}$ be the risk set of individuals who are at risk at time $T_j^D$ in site $k$, $R_j^{(k)}=\{l:\tilde{T}_l\ge T_j^D,\ l\in\Omega_k \text{ for } l=1,\ldots,n\}$. Similarly, within site $k$, there are $d^{(k)}$ distinct observed event times $T_{k,1}^D<T_{k,2}^D<\cdots<T_{k,d^{(k)}}^D$. For $j=1,\ldots,d^{(k)}$, let $D_{k,j}^{(k)}$ be the set of individuals who have the observed event time $T_{k,j}^D$ in site $k$, $D_{k,j}^{(k)}=\{l:\tilde{T}_l=T_{k,j}^D,\ \delta_l=1,\ l\in\Omega_k \text{ for } l=1,\ldots,n\}$, and let $R_{k,j}^{(k)}$ be the risk set of individuals who are at risk at time $T_{k,j}^D$ in site $k$, $R_{k,j}^{(k)}=\{l:\tilde{T}_l\ge T_{k,j}^D,\ l\in\Omega_k \text{ for } l=1,\ldots,n\}$, where $k=1,\ldots,K$.

In this article, we focus on estimating the overall hazard ratio, exp(θ), between treatment and control groups in the entire population:

$$\lambda(t)=\lambda_0(t)\exp(\theta A)$$

where λ0(t) is the baseline hazard function.

The IPW Cox regression model is commonly used to estimate the hazard ratio. Based on the propensity score $e=P(A=1\mid X)$, the inverse probability weight is $w=\frac{A}{e}+\frac{1-A}{1-e}$. We assume that the hazard ratio is common across the $K$ data-contributing sites and that all sites have a common baseline hazard $\lambda_0(t)$. The weighted partial likelihood score equation for the common log hazard ratio $\theta$ is [14],

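$$\sum_{j=1}^{d}\sum_{i\in D_j}w_i\left\{A_i-\frac{\sum_{l\in R_j}w_l\exp(\theta A_l)A_l}{\sum_{l\in R_j}w_l\exp(\theta A_l)}\right\}=0 \qquad (1)$$

The estimate of the log hazard ratio, $\hat\theta$, can be obtained by solving Eq. (1).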
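As an illustration of Eq. (1) for pooled data, the following is a minimal Python sketch rather than the authors' implementation. The function names `ipw_weights`, `weighted_score` and `solve_theta` are hypothetical, and the root is found by bisection, which is an assumption made here for simplicity (it relies on the score being decreasing in $\theta$ and on a root existing inside the search interval).

```python
import math

def ipw_weights(A, e):
    """Inverse probability weights w = A/e + (1-A)/(1-e)."""
    return [a / p + (1 - a) / (1 - p) for a, p in zip(A, e)]

def weighted_score(theta, time, event, A, w):
    """Left-hand side of Eq. (1) for a binary treatment A."""
    total = 0.0
    event_times = sorted({t for t, d in zip(time, event) if d == 1})
    for tj in event_times:
        # Weighted at-risk totals among treated (r1) and untreated (r0) at time tj.
        r1 = sum(wl for tl, al, wl in zip(time, A, w) if tl >= tj and al == 1)
        r0 = sum(wl for tl, al, wl in zip(time, A, w) if tl >= tj and al == 0)
        frac = math.exp(theta) * r1 / (math.exp(theta) * r1 + r0)
        # Contribution of every individual with an event at tj.
        for ti, di, ai, wi in zip(time, event, A, w):
            if di == 1 and ti == tj:
                total += wi * (ai - frac)
    return total

def solve_theta(time, event, A, w, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection on the score; the search interval [lo, hi] is an assumption."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weighted_score(mid, time, event, A, w) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

In practice the weights would come from an estimated propensity score, for example `w = ipw_weights(A, e_hat)`, and `math.exp(solve_theta(time, event, A, w))` would be the estimated hazard ratio.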

Proposed propensity score weighting method for estimating the overall hazard ratio

We propose a new method to estimate the overall hazard ratio based on our proposed propensity score weight, which does not require individual-level data sharing among sites. Specifically, we first estimate the global propensity score for the entire population by distributed logistic regression and generate a global weight for each individual. Second, we fit a logistic regression within each site to generate the local propensity score and local weight for each individual. Third, we choose between the global and local propensity score for each site based on a covariate balance-related criterion, and use the chosen propensity score in each site to obtain the proposed weight for each individual. Fourth, we estimate the overall hazard ratio based on the proposed weight. All the above steps require only summary-level data to be transferred among sites, which helps protect individual privacy.

Global and local propensity score

In the setting of distributed data with K sites, the propensity score can be estimated globally using data from the entire population or locally within each site.

Taking logistic regression models as an example, global propensity score is estimated by fitting logistic regression models to the overall sample:

$$\operatorname{logit}e(X,\alpha_g)=\delta_{0,g}+\alpha_g^{\top}X \qquad (2.1)$$

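Since we assume that data at the individual level cannot be shared among sites, data from the full sample cannot be directly used to fit the model, and only summary-level statistics can be obtained from each site. The global propensity score can be obtained by distributed logistic regression. Let $e(X,\alpha_g)=P(A=1\mid X)$, and the logistic loss is

$$M_{\mathrm{logis}}(A,X,\alpha_g)=-\{A\log e(X,\alpha_g)+(1-A)\log[1-e(X,\alpha_g)]\} \qquad (2.2)$$

The distributed Newton–Raphson method [15, 16] is used to obtain the empirical loss minimizer $\hat\alpha:=\arg\min_{\alpha}\sum_{i,k}M_{\mathrm{logis}}(A_{i,k},X_{i,k},\alpha)$ through iterations:

$$\alpha^{(t+1)}=\alpha^{(t)}-[H_n^{\mathrm{logis}}(\alpha^{(t)})]^{-1}G_n^{\mathrm{logis}}(\alpha^{(t)}),\quad t=1,2,\ldots \qquad (2.3)$$

where $G_n^{\mathrm{logis}}(\alpha^{(t)})=\frac{1}{n}\sum_{k=1}^{K}\sum_{i\in\Omega_k}\nabla_{\alpha}M_{\mathrm{logis}}(A_i,X_i,\alpha^{(t)})$ is the global gradient and $H_n^{\mathrm{logis}}(\alpha^{(t)})=\frac{1}{n}\sum_{k=1}^{K}\sum_{i\in\Omega_k}\nabla_{\alpha}^{2}M_{\mathrm{logis}}(A_i,X_i,\alpha^{(t)})$ is the global Hessian matrix [15]. The iteration process is as follows: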

  1. Initialize $\alpha^{(0)}=\arg\min_{\alpha}\sum_{i\in\Omega_1}M_{\mathrm{logis}}(A_i,X_i,\alpha)$ based on data from the analysis center (e.g. site 1), and set $t=0$.

  2. Repeat the following steps until $t$ reaches the maximum number of iterations or $\lVert G_n^{\mathrm{logis}}(\alpha^{(t)})\rVert$ falls below a pre-specified threshold.
    1. Transfer $\alpha^{(t)}$ to each site to compute the local gradient $G_{n_k}^{\mathrm{logis}}(\alpha^{(t)})$ and the local Hessian matrix $H_{n_k}^{\mathrm{logis}}(\alpha^{(t)})$, and transfer the local gradient and local Hessian matrix to the analysis center.
    2. Calculate the global gradient $G_n^{\mathrm{logis}}(\alpha^{(t)})=\frac{1}{K}\sum_{k=1}^{K}G_{n_k}^{\mathrm{logis}}(\alpha^{(t)})$ and the global Hessian matrix $H_n^{\mathrm{logis}}(\alpha^{(t)})=\frac{1}{K}\sum_{k=1}^{K}H_{n_k}^{\mathrm{logis}}(\alpha^{(t)})$ in the analysis center.
    3. Update $\alpha^{(t)}$ in the analysis center as $\alpha^{(t+1)}=\alpha^{(t)}-[H_n^{\mathrm{logis}}(\alpha^{(t)})]^{-1}G_n^{\mathrm{logis}}(\alpha^{(t)})$.

Then we can obtain the global propensity score $\hat e_g=e(X,\hat\alpha_g)$ based on the estimated parameter $\hat\alpha_g$ from the iterations. It is worth noting that each site computes its own gradient and Hessian matrix, which are subsequently combined to update the parameters. As a result, any site can be chosen as the analysis center. It is generally recommended to consider the hardware capabilities and computational power of each site when determining the analysis center.
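The iteration above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `site_summaries`, `solve_linear` and `distributed_logistic` are hypothetical, the starting value is zero rather than the site-1 estimate, and each site returns summed (not averaged) gradients and Hessians, which leaves the Newton step unchanged because a common scaling cancels in $H^{-1}G$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def site_summaries(alpha, X, A):
    """Runs inside one site: summed gradient and Hessian of the logistic loss."""
    q = len(alpha)
    grad = [0.0] * q
    hess = [[0.0] * q for _ in range(q)]
    for x, a in zip(X, A):
        z = [1.0] + list(x)                      # intercept plus covariates
        e = sigmoid(sum(c * v for c, v in zip(alpha, z)))
        for r in range(q):
            grad[r] += (e - a) * z[r]
            for c in range(q):
                hess[r][c] += e * (1 - e) * z[r] * z[c]
    return grad, hess

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def distributed_logistic(sites, q, max_iter=50, tol=1e-10):
    """sites: list of (X, A) pairs; only the summaries leave each site."""
    alpha = [0.0] * q                            # assumption: start from zero
    for _ in range(max_iter):
        parts = [site_summaries(alpha, X, A) for X, A in sites]
        G = [sum(g[r] for g, _ in parts) for r in range(q)]
        H = [[sum(h[r][c] for _, h in parts) for c in range(q)] for r in range(q)]
        if max(abs(g) for g in G) < tol:
            break
        step = solve_linear(H, G)
        alpha = [a - s for a, s in zip(alpha, step)]
    return alpha
```

Here `q` is the number of coefficients including the intercept. Because the site-level sums add up to the pooled sums, running the same routine on a single pooled data set gives the same coefficients up to floating-point error.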

An alternative approach is to estimate the local propensity score within each site:

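$$\operatorname{logit}e(X,\alpha_{l,k})=\delta_k+\alpha_{l,k}^{\top}X,\quad k=1,\ldots,K \qquad (2.4)$$

We fit the model at each site using the observations from that site and obtain the local propensity score $\hat e_l=e(X,\hat\alpha_{l,k})$ based on the estimated parameter $\hat\alpha_{l,k}$ from each site.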

Proposed propensity score

Motivated by Dong et al. [13], we propose a balancing propensity score to estimate the overall hazard ratio in distributed data and to improve the estimation efficiency. The proposed method chooses between the global and local propensity score by optimizing the overall confounder balance under propensity score weighting. The balance of the $p$th confounder is measured by

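$$\hat M_p=\frac{1}{n}\left\{\sum_{A_i=1}\frac{1}{\hat e_i}x_{ip}-\sum_{A_i=0}\frac{1}{1-\hat e_i}x_{ip}\right\}\Big/\hat\sigma_p \qquad (2.5)$$

where $\hat e_i$ is the estimated propensity score, $x_{ip}$ is the value of the $p$th measured confounder $X_p$ for individual $i$, and $\hat\sigma_p$ is the standard deviation of $X_p$ in the overall population. $\hat M_p$ measures the balance of confounder $X_p$ in the overall sample.

Notably, $\hat M_p$ cannot be directly estimated in distributed data and needs file transfer between sites.

$\hat M_p$ can be rewritten as:

$$\hat M_p=\frac{1}{n}\left\{\sum_{k=1}^{K}\sum_{A_i=1,G_i=k}\frac{1}{\hat e_i}x_{ip}-\sum_{k=1}^{K}\sum_{A_i=0,G_i=k}\frac{1}{1-\hat e_i}x_{ip}\right\}\Big/\hat\sigma_p \qquad (2.6)$$
$$\hat\sigma_p=\sqrt{\frac{\sum_{k=1}^{K}\sum_{G_i=k}x_{ip}^{2}-\left(\sum_{k=1}^{K}\sum_{G_i=k}x_{ip}\right)^{2}/n}{n-1}}$$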

To obtain $\hat M_p$, each site should transfer the following items to the analysis center:

  1. $\sum_{A_i=1,G_i=k}\frac{1}{\hat e_i}x_{ip}$ and $\sum_{A_i=0,G_i=k}\frac{1}{1-\hat e_i}x_{ip}$.

  2. $\sum_{G_i=k}x_{ip}$ and $\sum_{G_i=k}x_{ip}^{2}$.

$\hat M_p$ can then be calculated in the analysis center using these transferred values from each site based on (2.6). The objective function is the sum of the squares of $\hat M_p$:

$$F=\sum_{p=1}^{P}(\hat M_p)^{2} \qquad (2.7)$$

We choose between global and local propensity scores for each site to minimize the objective function F.
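To make the data flow behind (2.6) and (2.7) concrete, the sketch below separates what a site computes from what the analysis center computes. It is an illustration only: the names `site_balance_summary` and `objective_F` are hypothetical, and it assumes that each site also reports its sample size $n_k$ so that the center can form $n$.

```python
import math

def site_balance_summary(X, A, e):
    """Runs inside one site. X: rows of P confounders; e: propensity scores in use."""
    P = len(X[0])
    # Item 1: weighted covariate sums by treatment arm.
    s1 = [sum(x[p] / ei for x, a, ei in zip(X, A, e) if a == 1) for p in range(P)]
    s0 = [sum(x[p] / (1 - ei) for x, a, ei in zip(X, A, e) if a == 0) for p in range(P)]
    # Item 2: sums and sums of squares, needed for the overall standard deviation.
    sx = [sum(x[p] for x in X) for p in range(P)]
    sxx = [sum(x[p] ** 2 for x in X) for p in range(P)]
    return {"n": len(X), "s1": s1, "s0": s0, "sx": sx, "sxx": sxx}

def objective_F(summaries):
    """Runs in the analysis center: combines site summaries into F of Eq. (2.7)."""
    n = sum(s["n"] for s in summaries)
    P = len(summaries[0]["sx"])
    F = 0.0
    for p in range(P):
        s1 = sum(s["s1"][p] for s in summaries)
        s0 = sum(s["s0"][p] for s in summaries)
        sx = sum(s["sx"][p] for s in summaries)
        sxx = sum(s["sxx"][p] for s in summaries)
        sd = math.sqrt((sxx - sx ** 2 / n) / (n - 1))
        F += ((s1 - s0) / n / sd) ** 2
    return F
```

For example, a single confounder with values 1, 2, 3, 4, treatments 1, 0, 1, 0 and a constant propensity score of 0.5 gives $F=0.6$, whether the four individuals sit in one site or are split across two.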

Stochastic search algorithm to estimate the proposed propensity score

Dong et al. proposed a stochastic search algorithm to minimize the objective function $F$ in (2.7) [13]. For each site $k=1,\ldots,K$, let $S_k=1$ if individuals in site $k$ are weighted based on the estimated global propensity score, and $S_k=2$ if individuals in site $k$ are weighted based on the estimated local propensity score.

The search process is as follows:

  1. Initially, let all sites use the global propensity score, i.e., $S_k=1$ for $k=1,\ldots,K$. The analysis center calculates the initial value $F_{\mathrm{int}}$ of the objective function $F$ using information transferred from each site. Let the minimum value $F_{\min}=F_{\mathrm{int}}$, and let $S_{k,\min}=S_k=1$ for $k=1,\ldots,K$.

  2. Repeat the following steps until the number of repeats is no smaller than $L_1$ or $F_{\min}$ does not change over $L_2$ repeats. The values of $L_1$ and $L_2$ are pre-specified.
    (a) Randomly permute all the sites $\{1,2,\ldots,K\}$ to get a new random ordering of the $K$ sites, $\{A_1,A_2,\ldots,A_K\}$.
    (b) Following the order $\{A_1,A_2,\ldots,A_K\}$ from step (a), for each site, choose the global or local propensity score that gives a smaller value of the objective function $F$ while fixing the propensity scores chosen for the other $K-1$ sites. If site $k$ chooses the global propensity score, then $S_k=1$; if site $k$ chooses the local propensity score, then $S_k=2$.
    (c) After all sites have selected the global or local propensity score, calculate $F_{\mathrm{rep}}$ for this repeat in the analysis center.
    (d) If $F_{\mathrm{rep}}$ in step (c) is smaller than $F_{\min}$, then update $F_{\min}=F_{\mathrm{rep}}$ and $S_{k,\min}=S_k$; if $F_{\mathrm{rep}}\ge F_{\min}$, then keep $F_{\min}$ and $S_{k,\min}$ unchanged.

For each site $k=1,\ldots,K$, if $S_{k,\min}=1$ then the proposed propensity score for site $k$ is equal to the global propensity score; otherwise the proposed propensity score is equal to the local propensity score estimated within that site, i.e.,

$\hat e_p=\hat e_g$ for site $k$, if $S_{k,\min}=1$
$\hat e_p=\hat e_l$ for site $k$, if $S_{k,\min}=2$
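The search can be sketched as a randomized coordinate-wise scan over sites. This is only an illustration: `stochastic_search` and `evaluate_F` are hypothetical names, `evaluate_F(S)` stands for the analysis center recomputing $F$ from site summaries for a given choice vector, and carrying the current choices over from one repeat to the next is an assumption, since the text does not say whether each repeat restarts from the all-global configuration.

```python
import random

def stochastic_search(K, evaluate_F, L1=500, L2=20, seed=0):
    """Return (S_min, F_min); S[k] = 1 means global, 2 means local for site k."""
    rng = random.Random(seed)
    S = [1] * K                          # step 1: every site starts with the global score
    S_min, F_min = S[:], evaluate_F(S)
    unchanged = 0
    for _ in range(L1):                  # step 2: at most L1 repeats
        order = list(range(K))
        rng.shuffle(order)               # (a) random ordering of the sites
        for k in order:                  # (b) best choice for site k, others held fixed
            best = None
            for choice in (1, 2):
                S[k] = choice
                F = evaluate_F(S)
                if best is None or F < best[0]:
                    best = (F, choice)
            S[k] = best[1]
        F_rep = evaluate_F(S)            # (c) objective after the full scan
        if F_rep < F_min:                # (d) keep the best configuration seen so far
            F_min, S_min, unchanged = F_rep, S[:], 0
        else:
            unchanged += 1
            if unchanged >= L2:          # stop when F_min is stable over L2 repeats
                break
    return S_min, F_min
```

With the defaults `L1=500` and `L2=20` used in the simulations, the search stops early once 20 consecutive repeats fail to lower the minimum.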

Estimation of overall hazard ratio with distributed survival data based on proposed propensity score

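Based on the proposed propensity score $\hat e_i=\hat e_{p,i}$, the inverse probability weight for individual $i$ is

$$\hat w_i=\frac{A_i}{\hat e_i}+\frac{1-A_i}{1-\hat e_i}$$

Then we can estimate the log hazard ratio $\hat\theta$ by solving Eq. (1). In order to obtain $\hat\theta$ in distributed data, Eq. (1) can be rewritten as

$$\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_iA_i-\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_i\,\frac{\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)A_l}{\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)}=0 \qquad (3.1)$$

Here, the sums over $k$ inside the fraction run over the risk sets of all $K$ sites at the event time $T_{k,j}^D$. The terms $\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)A_l$ and $\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)$ in the score Eq. (3.1) can be expressed as

$$\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)A_l=\exp(\theta)\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l$$
$$\sum_{l\in R_{k,j}^{(k)}}\hat w_l\exp(\theta A_l)=\exp(\theta)\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l+\sum_{l\in R_{k,j}^{(k)},A_l=0}\hat w_l$$

Then Eq. (3.1) can be further rewritten as

$$\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_iA_i-\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_i\,\frac{\exp(\theta)\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l}{\exp(\theta)\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l+\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=0}\hat w_l}=0 \qquad (3.2)$$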

To solve (3.2), we need to know:

(u1)

$$\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_iA_i$$

(u2)

$$\sum_{k=1}^{K}\sum_{j=1}^{d^{(k)}}\sum_{i\in D_{k,j}^{(k)}}\hat w_i\,\frac{\exp(\theta)\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l}{\exp(\theta)\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l+\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=0}\hat w_l}$$

In particular, the fraction in (u2) should be calculated for all $d$ distinct observed event times across all sites. Therefore, each site needs to know the $d$ distinct observed event times, which requires each site to first send the $d^{(k)}$ observed event times in that site to the analysis center. The analysis center then needs to summarize the event times from each site and send all the $d$ distinct event times back to each site. With information on the $d$ distinct event times, each site $k$ can then calculate $\sum_{l\in R_j^{(k)},A_l=1}\hat w_l$ and $\sum_{l\in R_j^{(k)},A_l=0}\hat w_l$, and send the results to the analysis center to be summed.

Detailed procedures to obtain the estimated log hazard ratio θ^ in distributed data:

  1. Data transfer from each site $k$ to the analysis center: each site transmits its $d^{(k)}$ distinct observed event times, $T_{k,1}^D,T_{k,2}^D,\ldots,T_{k,d^{(k)}}^D$, to the analysis center.

  2. Data transfer from the analysis center to each site: the analysis center summarizes the distinct observed event times across all sites, and transmits all $d$ event times, $T_1^D,T_2^D,\ldots,T_d^D$, to each site.

  3. Calculation in each site and data transfer from each site to the analysis center: each site $k$ calculates $\sum_{l\in R_j^{(k)},A_l=1}\hat w_l$ and $\sum_{l\in R_j^{(k)},A_l=0}\hat w_l$ for the $d$ distinct observed event times, and transmits the results to the analysis center.

  4. Data transfer from the analysis center to each site: the analysis center summarizes $\sum_{k=1}^{K}\sum_{l\in R_j^{(k)},A_l=1}\hat w_l$ and $\sum_{k=1}^{K}\sum_{l\in R_j^{(k)},A_l=0}\hat w_l$ for the $d$ distinct observed event times, and transmits the summarized results to each site.

  5. Data transfer from each site to the analysis center: for the $d^{(k)}$ distinct observed event times within each site, each site generates a summary-level table with 4 columns and $d^{(k)}$ rows. The four columns are (i) $\sum_{i\in D_{k,j}^{(k)}}\hat w_iA_i$, (ii) $\sum_{i\in D_{k,j}^{(k)}}\hat w_i$, (iii) $\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=1}\hat w_l$ for the $d^{(k)}$ distinct observed event times $T_{k,j}^D$ in site $k$, and (iv) $\sum_{k=1}^{K}\sum_{l\in R_{k,j}^{(k)},A_l=0}\hat w_l$ for the $d^{(k)}$ distinct observed event times $T_{k,j}^D$ in site $k$. Each site transmits the 4-column table to the analysis center. An example of the 4-column summary table is presented in Table S1.

     In particular, for all $d$ distinct observed event times across all sites, $\sum_{k=1}^{K}\sum_{l\in R_j^{(k)},A_l=1}\hat w_l$ and $\sum_{k=1}^{K}\sum_{l\in R_j^{(k)},A_l=0}\hat w_l$ have been calculated and transmitted to each site in step 4. Therefore, for the $d^{(k)}$ distinct event times observed in site $k$, columns (iii) and (iv) can be directly obtained from the file transfer in step 4.

  6. The analysis center solves equation (3.1) based on the file transfer in step 5, and obtains the estimated log hazard ratio $\hat\theta$.
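The six steps can be traced in a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: each site is represented by a hypothetical dictionary with keys `time`, `event`, `A` and `w` (the proposed weights), the function names are invented for this example, and the score is solved in the form (3.2) by bisection over an assumed interval.

```python
import math

def site_event_times(site):
    # Step 1: distinct observed event times in this site.
    return sorted({t for t, d in zip(site["time"], site["event"]) if d == 1})

def site_risk_sums(site, all_times):
    # Step 3: weighted at-risk totals by treatment arm at every global event time.
    out = {}
    for tj in all_times:
        r1 = sum(w for t, a, w in zip(site["time"], site["A"], site["w"]) if t >= tj and a == 1)
        r0 = sum(w for t, a, w in zip(site["time"], site["A"], site["w"]) if t >= tj and a == 0)
        out[tj] = (r1, r0)
    return out

def site_table(site, global_risk):
    # Step 5: four-column summary table, one row per event time in this site.
    rows = []
    for tj in site_event_times(site):
        ev = [(a, w) for t, d, a, w in zip(site["time"], site["event"], site["A"], site["w"])
              if d == 1 and t == tj]
        rows.append((sum(w * a for a, w in ev), sum(w for _, w in ev), *global_risk[tj]))
    return rows

def estimate_theta(sites, lo=-10.0, hi=10.0, tol=1e-10):
    all_times = sorted(set().union(*(site_event_times(s) for s in sites)))   # step 2
    per_site = [site_risk_sums(s, all_times) for s in sites]                 # step 3
    global_risk = {t: (sum(p[t][0] for p in per_site), sum(p[t][1] for p in per_site))
                   for t in all_times}                                       # step 4
    rows = [r for s in sites for r in site_table(s, global_risk)]            # step 5

    def score(theta):                                                        # Eq. (3.2)
        et = math.exp(theta)
        return sum(wa - wsum * et * r1 / (et * r1 + r0) for wa, wsum, r1, r0 in rows)

    while hi - lo > tol:                                                     # step 6
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if score(mid) > 0 else (lo, mid)
    return (lo + hi) / 2
```

Only event times and weighted sums cross site boundaries in this sketch. Passing the same individuals as one pooled site or as several sites gives the same estimate up to floating-point error, which mirrors the equivalence with the pooled analysis described above.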

Simulations

Simulation design

To examine the performance of the proposed method, we performed two sets of simulations. The first simulation was to compare the performance of our proposed method with the global propensity score for the entire population or local propensity score estimated within each site in distributed data with K sites. The second simulation was to compare our proposed method to the results obtained from the corresponding pooled individual-level data.

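We assumed there were four covariates $X_1$–$X_4$ and considered two scenarios:

  • (a) Covariates and the treatment assignment in each site were homogenous: $X_1\sim\mathrm{Normal}(0,1)$, $X_2\sim\mathrm{Uniform}(0,1)$, $X_3\sim\mathrm{Normal}(0,1)$, $X_4\sim\mathrm{Bernoulli}(0.4)$. The treatment indicator $A$ was generated from the Bernoulli distribution according to the following propensity score model:

$$\operatorname{logit}e(X,\alpha)=\delta_0+\alpha_1X_1+\alpha_2X_2+\alpha_3X_3+\alpha_4X_4+\alpha_5X_1^{2}+\alpha_6X_1X_4$$

where $\alpha=(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5,\alpha_6)=(-1.5,-0.5,0.5,-0.5,0.5,1.5)$ and $\delta_0=0$.

  • (b) Covariates and the treatment assignment in each site were heterogeneous: $X_1\sim\mathrm{Normal}(\mu_k,1)$ if $G=k$, where $\mu_k=3-3\times\frac{k-1}{K-1}$, $X_2\sim\mathrm{Uniform}(0,1)$, $X_3\sim\mathrm{Normal}(0,1)$, $X_4\sim\mathrm{Bernoulli}(0.4)$. The treatment indicator $A$ was generated from the Bernoulli distribution according to the following propensity score model:

$$\operatorname{logit}e(X,\alpha)=\sum_{k=1}^{K}\delta_k1\{G=k\}+\alpha_1X_1+\alpha_2X_2+\alpha_3X_3+\alpha_4X_4+\alpha_5X_1^{2}+\alpha_6X_1X_4$$

where $\alpha=(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5,\alpha_6)=(-1.5,-0.5,0.5,-0.5,0.5,1.5)$ and $\delta_k=-1+2\times\frac{k-1}{K-1}$.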

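Under each scenario, we also simulated the case where the treatment assignment model only included linear terms. Under the homogenous scenario, the treatment indicator $A$ was generated from

$$\operatorname{logit}e(X,\alpha)=\delta_0+\alpha_1X_1+\alpha_2X_2+\alpha_3X_3+\alpha_4X_4$$

where $\alpha=(\alpha_1,\alpha_2,\alpha_3,\alpha_4)=(-1.5,-0.5,0.5,-0.5)$ and $\delta_0=0$.

Under the heterogeneous scenario, the treatment indicator $A$ was generated from

$$\operatorname{logit}e(X,\alpha)=\sum_{k=1}^{K}\delta_k1\{G=k\}+\alpha_1X_1+\alpha_2X_2+\alpha_3X_3+\alpha_4X_4$$

where $\alpha=(\alpha_1,\alpha_2,\alpha_3,\alpha_4)=(-1.5,-0.5,0.5,-0.5)$ and $\delta_k=-1+2\times\frac{k-1}{K-1}$.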

For the survival outcome, we defined $L=\log(1)A+\log(2)X_1+\log(1.5)X_2+\log(0.5)X_3+\log(5)X_4$ and generated $T$ from a Weibull distribution with a shape parameter of 2 and a scale parameter of $\{0.5\exp(L)\}^{-0.5}$. For censoring, we generated $C$ from an exponential distribution with a rate parameter of $\exp(0.5)$. The observed data were $\tilde{T}=\min(T,C)$ and $\delta=I(T\le C)$.
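The data-generating mechanism for the scenario with quadratic and interaction terms can be sketched with Python's standard `random` module. This is an illustration under stated assumptions: the function name `simulate_site` is hypothetical, the Weibull scale is read as $\{0.5\exp(L)\}^{-0.5}$, the interaction term is taken to be $X_1X_4$, and $K\ge 2$ is required for the site-specific means and intercepts to be defined.

```python
import math
import random

ALPHA = (-1.5, -0.5, 0.5, -0.5, 0.5, 1.5)

def simulate_site(k, K, n_k, rng, heterogeneous=True):
    """Generate n_k individuals for site k (1-based) out of K sites."""
    mu_k = 3 - 3 * (k - 1) / (K - 1) if heterogeneous else 0.0
    delta_k = -1 + 2 * (k - 1) / (K - 1) if heterogeneous else 0.0
    rows = []
    for _ in range(n_k):
        x1 = rng.gauss(mu_k, 1)
        x2 = rng.random()
        x3 = rng.gauss(0, 1)
        x4 = 1 if rng.random() < 0.4 else 0
        # Treatment assignment from the logistic propensity score model.
        lin = delta_k + sum(c * v for c, v in zip(ALPHA, (x1, x2, x3, x4, x1 ** 2, x1 * x4)))
        a = 1 if rng.random() < 1 / (1 + math.exp(-lin)) else 0
        # Survival time: Weibull with shape 2 and scale {0.5 exp(L)}^(-0.5).
        L = (math.log(1) * a + math.log(2) * x1 + math.log(1.5) * x2
             + math.log(0.5) * x3 + math.log(5) * x4)
        t = rng.weibullvariate((0.5 * math.exp(L)) ** -0.5, 2)   # (scale, shape)
        c = rng.expovariate(math.exp(0.5))                       # censoring, rate exp(0.5)
        rows.append({"site": k, "A": a, "X": (x1, x2, x3, x4),
                     "time": min(t, c), "event": int(t <= c)})
    return rows
```

A full simulated data set would concatenate `simulate_site(k, K, n_k, rng)` for `k = 1, ..., K`, using one shared `random.Random` instance so that each replicate is reproducible from its seed.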

In the stochastic search process, $L_1$ was set to 500 and $L_2$ was set to 20. We considered $K=5,10,20$ and $n_k=500,1000,2000$ to evaluate the impact of different numbers of sites and different sample sizes in each site on performance. We reported the following performance measures: absolute bias, root mean squared error (RMSE), and the ratio of the RMSE of each comparison method to that of the proposed method (r-RMSE). We also presented the coverage probability; however, due to computational cost, we only provided results for 5 sites. We compared three methods of generating individual weights when estimating the overall hazard ratio: global weight (weight generated based on the global propensity score $\hat e_g$ for the entire population), local weight (weight generated based on the local propensity score $\hat e_l$ estimated within each site), and proposed weight (weight generated based on our proposed propensity score $\hat e_p$). The statistical performance was evaluated based on 500 simulated datasets.

Simulation results

When the covariates and the treatment assignment in each site were homogenous, the absolute bias was small for all the methods, i.e., weighted using global, local, and proposed propensity score. Compared with global weight and local weight, our proposed weight had a smaller RMSE, regardless of the number of sites and sample size in each site. The ratio of RMSE of the global or local weight to our proposed weight (r-RMSE) was up to 1.578 (Table 1).

Table 1.

Comparisons of proposed weight, global weight and local weight to estimate the overall hazard ratio in the simulations, with homogenous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$

K=5 K=10 K=20
Global weight Local weight Proposed weight Global weight Local weight Proposed weight Global weight Local weight Proposed weight
nk=500 Bias -0.018 -0.014 -0.017 -0.006 -0.012 -0.007 -0.002 -0.004 0.001
RMSE 0.116 0.129 0.106 0.089 0.088 0.073 0.066 0.076 0.056
r-RMSE 1.093 1.215 1.000 (Ref) 1.225 1.211 1.000 (Ref) 1.176 1.354 1.000 (Ref)
nk=1000 Bias -0.006 -0.008 -0.006 -0.002 -0.003 -0.001 0.000 -0.004 0.001
RMSE 0.089 0.100 0.077 0.066 0.068 0.058 0.122 0.093 0.087
r-RMSE 1.160 1.303 1.000 (Ref) 1.140 1.175 1.000 (Ref) 1.401 1.068 1.000 (Ref)
nk=2000 Bias -0.002 -0.003 -0.001 0.000 -0.004 -0.001 -0.002 -0.005 -0.002
RMSE 0.066 0.065 0.061 0.122 0.082 0.077 0.050 0.045 0.040
r-RMSE 1.087 1.071 1.000 (Ref) 1.578 1.060 1.000 (Ref) 1.249 1.124 1.000 (Ref)

Bias, absolute bias; RMSE, root mean squared error; r-RMSE, ratio of RMSE of global weight or local weight against proposed weight

In the heterogeneous setting, the absolute bias of our method was mostly between those of the global and local weights, or close to them. The RMSE of our proposed method remained the smallest, and the r-RMSE was up to 1.540 (Table 2). The results were similar when the treatment assignment was generated with only the linear terms of $X=(X_1,X_2,X_3,X_4)$ (Table 3, Table 4).

Table 2.

Comparisons of proposed weight, global weight and local weight to estimate the overall hazard ratio in the simulations, with heterogeneous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$

K=5 K=10 K=20
Global weight Local weight Proposed weight Global weight Local weight Proposed weight Global weight Local weight Proposed weight
nk=500 Bias 0.045 0.041 0.033 0.026 0.023 0.015 0.025 0.019 0.009
RMSE 0.129 0.125 0.110 0.082 0.097 0.065 0.069 0.070 0.049
r-RMSE 1.176 1.140 1.000 (Ref) 1.255 1.484 1.000 (Ref) 1.400 1.421 1.000 (Ref)
nk=1000 Bias 0.039 0.024 0.022 0.027 0.018 0.012 0.031 0.020 0.012
RMSE 0.096 0.092 0.076 0.083 0.079 0.067 0.105 0.084 0.072
r-RMSE 1.269 1.216 1.000 (Ref) 1.247 1.187 1.000 (Ref) 1.458 1.166 1.000 (Ref)
nk=2000 Bias 0.031 0.014 0.011 0.037 0.021 0.017 0.021 0.005 0.002
RMSE 0.093 0.094 0.077 0.196 0.175 0.161 0.042 0.049 0.032
r-RMSE 1.210 1.223 1.000 (Ref) 1.215 1.084 1.000 (Ref) 1.320 1.540 1.000 (Ref)

Bias, absolute bias; RMSE, root mean squared error; r-RMSE, ratio of RMSE of global weight or local weight against proposed weight

Table 3.

Comparisons of proposed weight, global weight and local weight to estimate the overall hazard ratio in the simulations, with homogenous design and treatment assignment generated with $X_1$–$X_4$

K=5 K=10 K=20
Global weight Local weight Proposed weight Global weight Local weight Proposed weight Global weight Local weight Proposed weight
nk=500 Bias -0.009 -0.011 -0.014 0.003 0.000 -0.001 0.003 0.000 0.002
RMSE 0.104 0.106 0.098 0.071 0.073 0.065 0.053 0.055 0.048
r-RMSE 1.063 1.084 1.000 (Ref) 1.087 1.118 1.000 (Ref) 1.106 1.148 1.000 (Ref)
nk=1000 Bias 0.003 0.002 0.000 0.003 0.002 0.002 0.002 0.000 0.000
RMSE 0.071 0.074 0.068 0.053 0.054 0.049 0.035 0.036 0.032
r-RMSE 1.046 1.091 1.000 (Ref) 1.083 1.104 1.000 (Ref) 1.078 1.109 1.000 (Ref)
nk=2000 Bias 0.003 0.002 0.002 0.002 0.001 0.001 -0.001 -0.001 -0.001
RMSE 0.053 0.052 0.050 0.035 0.035 0.032 0.026 0.026 0.024
r-RMSE 1.064 1.044 1.000 (Ref) 1.087 1.087 1.000 (Ref) 1.069 1.069 1.000 (Ref)

Bias, absolute bias; RMSE, root mean squared error; r-RMSE, ratio of RMSE of global weight or local weight against proposed weight

Table 4.

Comparisons of proposed weight, global weight and local weight to estimate the overall hazard ratio in the simulations, with heterogeneous design and treatment assignment generated with $X_1$–$X_4$

K=5 K=10 K=20
Global weight Local weight Proposed weight Global weight Local weight Proposed weight Global weight Local weight Proposed weight
nk=500 Bias -0.098 -0.140 -0.119 -0.064 -0.098 -0.079 -0.035 -0.073 -0.049
RMSE 0.287 0.254 0.239 0.208 0.226 0.188 0.171 0.178 0.139
r-RMSE 1.200 1.062 1.000 (Ref) 1.106 1.201 1.000 (Ref) 1.232 1.283 1.000 (Ref)
nk=1000 Bias -0.071 -0.073 -0.071 -0.040 -0.049 -0.046 -0.031 -0.028 -0.028
RMSE 0.221 0.247 0.206 0.178 0.174 0.147 0.124 0.144 0.101
r-RMSE 1.071 1.197 1.000 (Ref) 1.215 1.188 1.000 (Ref) 1.231 1.430 1.000 (Ref)
nk=2000 Bias -0.040 -0.031 -0.038 -0.034 -0.017 -0.025 -0.033 -0.019 -0.019
RMSE 0.210 0.199 0.170 0.128 0.142 0.112 0.106 0.100 0.078
r-RMSE 1.235 1.170 1.000 (Ref) 1.142 1.267 1.000 (Ref) 1.351 1.274 1.000 (Ref)

Bias, absolute bias; RMSE, root mean squared error; r-RMSE, ratio of RMSE of global weight or local weight against proposed weight

We also computed the coverage probability of the 95% confidence interval for 5 sites. Our proposed method achieved a coverage probability close to the nominal 95%, and closer to it than the global and local methods (Table 5).

Table 5.

The coverage probability of different propensity score methods with the number of sites set to 5

Setting 1
Global weight Local weight Proposed weight
nk=500 88.2 90.0 91.2
nk=1000 93.0 93.0 95.0
nk=2000 90.8 91.4 92.8
Setting 2
Global weight Local weight Proposed weight
nk=500 88.6 91.0 92.8
nk=1000 89.0 91.0 93.2
nk=2000 87.2 90.2 94.4

Setting 1: homogenous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$; setting 2: heterogeneous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$

When comparing our proposed method to the results obtained from the corresponding pooled individual-level data analysis, as expected, our proposed method in distributed data and pooled individual-level data analysis yielded identical results under all scenarios (Table 6).

Table 6.

Comparisons of the proposed method in distributed data and corresponding pooled individual-level data analysis

Distributed Data Analysis
Setting 1 Setting 2
Global weight Local weight Proposed weight Global weight Local weight Proposed weight
K=5, nk=500 Bias -0.018 -0.014 -0.017 0.045 0.041 0.033
RMSE 0.116 0.129 0.106 0.129 0.125 0.110
r-RMSE 1.093 1.215 1.000 (Ref) 1.176 1.140 1.000 (Ref)
K=5, nk=1000 Bias -0.006 -0.008 -0.006 0.039 0.024 0.022
RMSE 0.089 0.100 0.077 0.096 0.092 0.076
r-RMSE 1.160 1.303 1.000 (Ref) 1.269 1.216 1.000 (Ref)
K=5, nk=2000 Bias -0.002 -0.003 -0.001 0.031 0.014 0.011
RMSE 0.066 0.065 0.061 0.093 0.094 0.077
r-RMSE 1.087 1.071 1.000 (Ref) 1.210 1.223 1.000 (Ref)
Pooled Individual-Level Data Analysis
Setting 1 Setting 2
Global weight Local weight Proposed weight Global weight Local weight Proposed weight
K=5, nk=500 Bias -0.018 -0.014 -0.017 0.045 0.041 0.033
RMSE 0.116 0.129 0.106 0.129 0.125 0.110
r-RMSE 1.093 1.215 1.000 (Ref) 1.176 1.140 1.000 (Ref)
K=5, nk=1000 Bias -0.006 -0.008 -0.006 0.039 0.024 0.022
RMSE 0.089 0.100 0.077 0.096 0.092 0.076
r-RMSE 1.160 1.303 1.000 (Ref) 1.269 1.216 1.000 (Ref)
K=5, nk=2000 Bias -0.002 -0.003 -0.001 0.031 0.014 0.011
RMSE 0.066 0.065 0.061 0.093 0.094 0.077
r-RMSE 1.087 1.071 1.000 (Ref) 1.210 1.223 1.000 (Ref)

Bias, absolute bias; RMSE, root mean squared error; r-RMSE, ratio of RMSE of global weight or local weight against proposed weight

Setting 1: homogenous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$; setting 2: heterogeneous design and treatment assignment generated with $X_1$–$X_4$, $X_1^2$ and $X_1X_4$

Application

We apply the proposed method to real-world triple-negative breast cancer (TNBC) data from Surveillance, Epidemiology, and End Results (SEER) [17]. TNBC is an aggressive subtype of breast cancer, accounting for about 20% of all breast cancer cases [18]. It is known that radiation therapy can improve locoregional control in breast cancer patients and has a positive impact on the long-term survival of high-risk patients [19].

The dataset included 4120 patients aged 20–79 years diagnosed with TNBC in 2010 with complete information. The treatment variable was set to 1 if the patient received radiation therapy and 0 if not. The outcome of interest was the time to death during the follow-up of up to 71 months. Descriptive characteristics of patients according to radiation therapy were presented in Table S2. We estimated the hazard ratio and 95% confidence interval after adjusting for age, race, marital status, laterality, grade, the American Joint Committee on Cancer (AJCC) stage, surgery, distant metastasis, and chemotherapy in the propensity score model.

The patients were from five states: Connecticut (n1=717), Hawaii (n2=274), Iowa (n3=723), Kentucky (n4=1176), and Louisiana (n5=1230). Descriptive characteristics of patients by site are presented in Table S3. We compared the proposed method with methods based on the global or local propensity score in the distributed survival data with 5 sites. We further compared the estimates obtained in the distributed data with the estimates from the corresponding pooled individual-level data analyses.

The confidence intervals were calculated using the bootstrap method with 200 replications [11]. All n individuals in the K sites were assigned IDs from {1, 2, ..., n}. For each bootstrap replication, the analysis center re-sampled with replacement from {1, 2, ..., n}; it then sent the re-sampled IDs for all 200 replications to each site. Each site prepared its 200 bootstrap samples according to these instructions from the analysis center. Because the re-sampling is done over the entire population, the size of a site's bootstrap sample may differ from that site's original sample size.
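The bootstrap procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that global IDs are assigned consecutively site by site, and the function name `global_bootstrap_ids` and the fixed seed are hypothetical choices.

```python
import numpy as np

def global_bootstrap_ids(site_sizes, n_reps=200, seed=0):
    """Analysis-center step: resample global IDs 1..n with replacement and
    split each replicate's draws by the site that owns them."""
    rng = np.random.default_rng(seed)
    n = sum(site_sizes)
    # Assumption: site k owns the consecutive global IDs (bounds[k], bounds[k+1]].
    bounds = np.concatenate(([0], np.cumsum(site_sizes)))
    plans = []
    for _ in range(n_reps):
        draw = rng.integers(1, n + 1, size=n)      # IDs drawn from {1, ..., n}
        per_site = [
            draw[(draw > bounds[k]) & (draw <= bounds[k + 1])] - bounds[k]
            for k in range(len(site_sizes))
        ]                                          # local 1-based row indices
        plans.append(per_site)
    return plans

site_sizes = [717, 274, 723, 1176, 1230]           # the five SEER sites
plans = global_bootstrap_ids(site_sizes, n_reps=200)
first = [len(ids) for ids in plans[0]]
print(first, sum(first))  # per-site sizes vary across replicates; the total stays n
```

Each site would receive only its own list of local indices per replicate, so no individual-level records leave the site.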

Table 7 presents the estimated hazard ratios and their 95% confidence intervals. Results from the proposed method and from the methods based on the global or local propensity score all indicated that radiation therapy had a positive impact on long-term survival in patients with TNBC. The proposed method gave the largest estimated reduction in hazard (hazard ratio, 0.679; 95% confidence interval, 0.585 to 0.789) compared with the global propensity score method (0.737; 0.653 to 0.832) and the local propensity score method (0.709; 0.619 to 0.812). In addition, for all three methods, the hazard ratio estimates and 95% confidence intervals obtained in the distributed data closely matched those obtained from the corresponding pooled individual-level data analyses.

Table 7.

Estimation of overall hazard ratio and the corresponding 95% confidence intervals using different propensity score estimation methods in distributed data and pooled individual-level data

| Analysis | Method | Hazard ratio | 95% confidence interval |
| --- | --- | --- | --- |
| Distributed data | Global weight | 0.737 | 0.653 to 0.832 |
| Distributed data | Local weight | 0.709 | 0.619 to 0.812 |
| Distributed data | Proposed weight | 0.679 | 0.585 to 0.789 |
| Pooled individual-level data | Global weight | 0.735 | 0.651 to 0.831 |
| Pooled individual-level data | Local weight | 0.708 | 0.618 to 0.811 |
| Pooled individual-level data | Proposed weight | 0.669 | 0.575 to 0.777 |
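
As a reading aid for Table 7, the following Python sketch recomputes three simple quantities from the tabulated values: the implied percent reduction in hazard, whether the interval excludes 1, and the absolute gap between the distributed and pooled hazard ratios. The numbers are copied from the table; `summarize` is a hypothetical helper, and reading 1 minus the hazard ratio as a percent reduction in hazard is a conventional interpretation rather than a statement from the paper.

```python
# Hazard ratios and 95% confidence bounds copied from Table 7.
distributed = {
    "Global weight": (0.737, 0.653, 0.832),
    "Local weight": (0.709, 0.619, 0.812),
    "Proposed weight": (0.679, 0.585, 0.789),
}
pooled = {
    "Global weight": (0.735, 0.651, 0.831),
    "Local weight": (0.708, 0.618, 0.811),
    "Proposed weight": (0.669, 0.575, 0.777),
}

def summarize(method):
    """Derive simple descriptive quantities for one weighting method."""
    hr_d, lo_d, hi_d = distributed[method]
    hr_p, _, _ = pooled[method]
    return {
        "hazard_reduction_pct": round(100 * (1 - hr_d), 1),
        "excludes_null": hi_d < 1.0,               # upper bound below 1
        "abs_diff_vs_pooled": round(abs(hr_d - hr_p), 3),
    }

for method in distributed:
    print(method, summarize(method))
```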

Discussion

We have proposed a covariate balance-related propensity score for constructing inverse probability weights and making inferences on the overall hazard ratio in multi-site distributed survival data. The proposed propensity score is produced based on a covariate balance-related criterion evaluated in the entire population. It is shown to perform better than the global propensity score estimated using data from the entire population or the local propensity score estimated within each site. In addition, the proposed method can be conducted without transferring individual-level data among sites and yields results identical to the corresponding pooled individual-level data analysis.
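
The idea of choosing, site by site, between the global and local propensity score according to overall covariate balance can be illustrated with a small Python sketch. The exact balance criterion is defined in the Methods and is not restated here, so the sketch substitutes a stand-in: the sum of squared standardized differences in inverse-probability-weighted covariate means. All function names are hypothetical, the toy propensity scores are placeholders for fitted models, and the exhaustive search over the 2^K combinations is an assumption that is only practical for a small number of sites.

```python
import itertools
import numpy as np

def site_summary(X, A, ps):
    """Summary-level pieces one site could share for a given propensity score:
    IPW weight totals and weighted covariate totals in each treatment arm."""
    w = np.where(A == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    t, c = A == 1, A == 0
    return (w[t].sum(), w[t] @ X[t], w[c].sum(), w[c] @ X[c])

def imbalance(summaries, sd):
    """Stand-in criterion: sum of squared standardized weighted mean differences."""
    w1 = sum(s[0] for s in summaries); x1 = sum(s[1] for s in summaries)
    w0 = sum(s[2] for s in summaries); x0 = sum(s[3] for s in summaries)
    return float(np.sum(((x1 / w1 - x0 / w0) / sd) ** 2))

def choose_scores(global_summ, local_summ, sd):
    """Search all global/local assignments and keep the best-balanced one."""
    best = None
    for choice in itertools.product(("global", "local"), repeat=len(global_summ)):
        summ = [global_summ[k] if c == "global" else local_summ[k]
                for k, c in enumerate(choice)]
        score = imbalance(summ, sd)
        if best is None or score < best[0]:
            best = (score, choice)
    return best

rng = np.random.default_rng(1)
global_summ, local_summ, all_X = [], [], []
for shift in (0.0, 0.5, -0.5):                     # three toy sites
    X = rng.normal(shift, 1.0, size=(400, 2))
    lin = 0.5 * X[:, 0] - 0.3 * X[:, 1]
    ps_local = 1 / (1 + np.exp(-(lin + shift)))    # placeholder for a site-specific fit
    ps_global = 1 / (1 + np.exp(-lin))             # placeholder for a pooled fit
    A = rng.binomial(1, ps_local)
    global_summ.append(site_summary(X, A, ps_global))
    local_summ.append(site_summary(X, A, ps_local))
    all_X.append(X)
# For brevity the pooled SD is computed directly; a distributed analysis
# would derive it from shared sums and sums of squares.
sd = np.vstack(all_X).std(axis=0)
score, choice = choose_scores(global_summ, local_summ, sd)
print(choice, round(score, 5))
```

Only the four totals returned by `site_summary` cross site boundaries in this sketch, which mirrors the summary-level constraint of the distributed setting.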

The proposed method was developed for distributed data with multiple sites. Because the proposed method yields identical results in distributed data and in pooled individual-level data analysis, it also applies to general multi-site studies in which data communication among sites is not restricted. Therefore, for multi-site data, whether or not data transmission between sites is allowed, we recommend using the proposed approach, which selects between the global and local propensity score in each site, to estimate the overall treatment effect efficiently.

In our real-world data analysis, we calculated the 95% confidence intervals using the global bootstrap method, which re-samples from the entire population. In practice, researchers can also use the alternative local bootstrap method for simplicity [11]. Specifically, each site can draw 200 or more bootstrap samples with replacement from its own original sample, which is the conventional bootstrap method applied within the site. We also applied the local bootstrap method to the real-world data; the results were similar and are presented in Table S4.

Our method is proposed based on the unstratified Cox model. Sometimes, if we assume the baseline hazard to vary by site, stratification on site is helpful and the stratified Cox model is used accordingly. In this case, the stratified Breslow-type weighted partial likelihood would be used instead of (1) in our study [11]. The main difference is that each site no longer needs to know the information of all $d$ distinct observed event times across all sites, but only needs to obtain its own summarized information of $d^{(k)}$ distinct observed event times. Accordingly, the detailed steps 1 to 5 in calculating the overall hazard ratio in our study can be replaced by a simple step, i.e., to obtain the following information within each site for the $d^{(k)}$ distinct observed event times $T_{k,j}^{D}$ in site $k$: (i) $\sum_{i \in D_{k,j}^{(k)}} \hat{w}_i A_i$, (ii) $\sum_{i \in D_{k,j}^{(k)}} \hat{w}_i$, (iii) $\sum_{l \in R_{k,j}^{(k)},\, A_l = 1} \hat{w}_l$, and (iv) $\sum_{l \in R_{k,j}^{(k)},\, A_l = 0} \hat{w}_l$. Under such circumstances, only one file transfer from each site to the analysis center is required after obtaining the proposed propensity score.
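
The single-transfer workflow for the stratified model can be sketched in Python. The sketch assumes a single binary treatment as the only covariate, builds the four weighted sums per distinct event time within each site, and then maximizes a stratified Breslow-type weighted partial likelihood by Newton-Raphson at the analysis center. The function names, the simulated data, and the unit weights are hypothetical placeholders; in the actual method the weights would come from the proposed propensity score.

```python
import numpy as np

def site_summary_table(time, event, A, w):
    """Four weighted sums per distinct event time within one site:
    (i) sum of w*A over events, (ii) sum of w over events,
    (iii) sum of w over at-risk treated, (iv) sum of w over at-risk untreated."""
    rows = []
    for t in np.unique(time[event == 1]):
        died = (time == t) & (event == 1)
        at_risk = time >= t
        rows.append((
            np.sum(w[died] * A[died]),
            np.sum(w[died]),
            np.sum(w[at_risk & (A == 1)]),
            np.sum(w[at_risk & (A == 0)]),
        ))
    return np.array(rows)

def fit_log_hr(tables, n_iter=25):
    """Newton-Raphson for the log hazard ratio using only the stacked site tables."""
    ev_a, ev_w, r1, r0 = np.vstack(tables).T
    beta = 0.0
    for _ in range(n_iter):
        p = r1 * np.exp(beta) / (r1 * np.exp(beta) + r0)  # weighted treated share of risk set
        score = np.sum(ev_a - ev_w * p)
        info = np.sum(ev_w * p * (1 - p))
        beta += score / info
    return beta

rng = np.random.default_rng(0)
tables = []
for base_rate in (0.5, 2.0):                       # site-specific baseline hazards
    n = 2000
    A = rng.integers(0, 2, size=n)
    t_event = rng.exponential(1.0 / (base_rate * np.exp(-0.4 * A)))  # true log HR = -0.4
    t_cens = rng.exponential(2.0 / base_rate, size=n)
    time = np.minimum(t_event, t_cens)
    event = (t_event <= t_cens).astype(int)
    w = np.ones(n)                                 # placeholder for the IPW weights
    tables.append(site_summary_table(time, event, A, w))

beta_hat = fit_log_hr(tables)
print(round(float(np.exp(beta_hat)), 3))           # estimated overall hazard ratio
```

Because each site's risk sets are formed only from its own patients, the tables can be stacked without aligning event times across sites, which is why one transfer per site suffices in this setting.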

When conducting propensity score-based analysis, it is crucial to correctly identify the set of confounders and specify the propensity score model. We assume that all confounding variables are measured and known in our study, and that there is no misclassification, missing data, or time-varying covariates. Future studies could extend our method to situations where these assumptions are not satisfied or to data with a large number of candidate covariates [20, 21].

Conclusions

In this study, we proposed a covariate balance-related propensity score to estimate the overall hazard ratio. The method requires only summary-level information from each site, which protects personal privacy. The proposed propensity score was estimated based on a covariate balance-related criterion and was shown to outperform the global propensity score estimated using data from the entire population and the local propensity score estimated within each site.

Supplementary Information

12874_2023_2055_MOESM1_ESM.docx (27KB, docx)

Additional file 1: Table S1. An example of the 4-column summary table transferred from the site to the analysis center (10 rows are shown for illustration). Table S2. Descriptive characteristics of patients according to radiation therapy in real-world data analysis. Values are numbers (percentages) of individuals unless otherwise stated. Table S3. Descriptive characteristics of patients according to five sites in real-world data analysis. Values are numbers (percentages) of individuals unless otherwise stated. Table S4. Hazard ratios and 95% confidence intervals in real-world data analysis with local bootstrap.

Acknowledgements

Not applicable.

Authors’ contributions

Y.Y. and G.Q. conceived the study. C.H. and K.W. performed the analysis and prepared the manuscript, including figures and tables. All authors have provided critical comments on the draft, and read and approved the final manuscript. C.H. and K.W. contributed equally to this work.

Funding

This study was supported by the National Natural Science Foundation of China (No. 82273730 to YY and 82173612 to GQ), the Shanghai Rising-Star Program (21QA1401300 to YY), the Shanghai Municipal Natural Science Foundation (22ZR1414900 to YY) and the Shanghai Municipal Science and Technology Major Project (ZD2021CY001 to GQ). The sponsors had no role in study design, data collection, data analysis, data interpretation, or writing of this report.

Availability of data and materials

Publicly available datasets were analyzed in this study. These data can be found here: https://seer.cancer.gov/data-software/.

Declarations

Ethics approval and consent to participate

Because the simulated datasets did not involve any human data, ethics approval was not applicable to them. The real data are publicly available, so ethics approval was not required.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Chen Huang and Kecheng Wei contributed equally as co-first authors.

Contributor Information

Yongfu Yu, Email: yu@fudan.edu.cn.

Guoyou Qin, Email: gyqin@fudan.edu.cn.

References

1. Ha YJ, Lee G, Yoo M, Jung S, Yoo S, Kim J. Feasibility study of multi-site split learning for privacy-preserving medical systems under data imbalance constraints in COVID-19, X-ray, and cholesterol dataset. Sci Rep. 2022;12(1):1534. doi: 10.1038/s41598-022-05615-y.
2. Cox DR. Regression Models and Life-Tables. J Roy Stat Soc: Ser B (Methodol). 1972;34(2):187–202.
3. Lu CL, Wang S, Ji Z, et al. WebDISCO: a web service for distributed cox model learning without patient-level data sharing. J Am Med Inform Assoc. 2015;22(6):1212–1219. doi: 10.1093/jamia/ocv083.
4. Vilk Y, Zhang Z, Young JG, et al. A distributed regression analysis application based on SAS software Part II: Cox proportional hazards regression. arXiv: Computation. 2018.
5. Li D, Lu W, Shu D, Toh S, Wang R. Distributed Cox proportional hazards regression using summary-level information. Biostatistics. 2022;24(3):776–794. doi: 10.1093/biostatistics/kxac006.
6. Schemper M, Wakounig S, Heinze G. The estimation of average hazard ratios by weighted Cox regression. Stat Med. 2009;28(19):2473–2489. doi: 10.1002/sim.3623.
7. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55. doi: 10.1093/biomet/70.1.41.
8. Curtis LH, Hammill BG, Eisenstein EL, Kramer JM, Anstrom KJ. Using Inverse Probability-Weighted Estimators in Comparative Effectiveness Analyses with Observational Databases. Med Care. 2007;45(10):S103–S107. doi: 10.1097/MLR.0b013e31806518ac.
9. Austin PC, Stuart EA. Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Stat Med. 2015;34(28):3661–3679. doi: 10.1002/sim.6607.
10. Yoshida K, Gruber S, Fireman BH, Toh S. Comparison of privacy-protecting analytic and data-sharing methods: A simulation study. Pharmacoepidemiol Drug Saf. 2018;27(9):1034–1041. doi: 10.1002/pds.4615.
11. Shu D, Yoshida K, Fireman BH, Toh S. Inverse probability weighted Cox model in multi-site studies without sharing individual-level data. Stat Methods Med Res. 2020;29(6):1668–1681. doi: 10.1177/0962280219869742.
12. El Emam K, Samet S, Arbuckle L, Tamblyn R, Earle C, Kantarcioglu M. A secure distributed logistic regression protocol for the detection of rare adverse drug events. J Am Med Inform Assoc. 2012;20(3):453–461. doi: 10.1136/amiajnl-2011-000735.
13. Dong J, Zhang JL, Zeng S, Li F. Subgroup balancing propensity score. Stat Methods Med Res. 2020;29(3):659–676. doi: 10.1177/0962280219870836.
14. Binder DA. Fitting Cox's Proportional Hazards Models from Survey Data. Biometrika. 1992;79(1):139–147. doi: 10.1093/biomet/79.1.139.
15. Jordan MI, Lee JD, Yang Y. Communication-Efficient Distributed Statistical Inference. J Am Stat Assoc. 2019;114(526):668–681. doi: 10.1080/01621459.2018.1429274.
16. Boyd SP, Vandenberghe L. Convex Optimization. IEEE Trans Autom Control. 2004;51:1859–1859.
17. Hayat MJ, Howlader N, Reichman ME, Edwards BK. Cancer statistics, trends, and multiple primary cancer analyses from the Surveillance, Epidemiology, and End Results (SEER) Program. Oncologist. 2007;12(1):20–37. doi: 10.1634/theoncologist.12-1-20.
18. He MY, Rancoule C, Rehailia-Blanchard A, et al. Radiotherapy in triple-negative breast cancer: Current situation and upcoming strategies. Crit Rev Oncol Hematol. 2018;131:96–101. doi: 10.1016/j.critrevonc.2018.09.004.
19. Azoury F, Misra S, Barry A, Helou J. Role of Radiation Therapy in Triple Negative Breast Cancer: Current State and Future Directions—A Narrative Review. Precis Cancer Med. 2022;5:9. doi: 10.21037/pcm-21-9.
20. Wang Y, Hong C, Palmer N, et al. A fast divide-and-conquer sparse Cox regression. Biostatistics. 2019;22(2):381–401. doi: 10.1093/biostatistics/kxz036.
21. Shi J, Qin G, Zhu H, Zhu Z. Communication-efficient distributed M-estimation with missing data. Comput Stat Data Anal. 2021;161:107251. doi: 10.1016/j.csda.2021.107251.

