Author manuscript; available in PMC: 2022 Nov 29.
Published in final edited form as: J Biomed Inform. 2022 Aug 23;134:104176. doi: 10.1016/j.jbi.2022.104176

SurvMaximin: Robust federated approach to transporting survival risk prediction models

Xuan Wang a, Harrison G Zhang b, Xin Xiong a, Chuan Hong b, Griffin M Weber b, Gabriel A Brat b, Clara-Lea Bonzel b, Yuan Luo c, Rui Duan d, Nathan P Palmer b, Meghan R Hutch c, Alba Gutiérrez-Sacristán b, Riccardo Bellazzi e, Luca Chiovato f, Kelly Cho g,h, Arianna Dagliati e, Hossein Estiri i, Noelia García-Barrio j, Romain Griffier k,l, David A Hanauer m, Yuk-Lam Ho h, John H Holmes n, Mark S Keller b, Jeffrey G Klann MEng i, Sehi L’Yi b, Sara Lozano-Zahonero o, Sarah E Maidlow p, Adeline Makoudjou o, Alberto Malovini q, Bertrand Moal k, Jason H Moore r, Michele Morris s, Danielle L Mowery n, Shawn N Murphy t, Antoine Neuraz u, Kee Yuan Ngiam v, Gilbert S Omenn w, Lav P Patel x, Miguel Pedrera-Jiménez j, Andrea Prunotto o, Malarkodi Jebathilagam Samayamuthu s, Fernando J Sanz Vidorreta y, Emily R Schriver z, Petra Schubert h, Pablo Serrano-Balazote j, Andrew M South aa, Amelia LM Tan b, Byorn WL Tan ab, Valentina Tibollo o, Patric Tippmann o, Shyam Visweswaran s, Zongqi Xia ac, William Yuan b, Daniela Zöller o, Isaac S Kohane b, Paul Avillach b,1, Zijian Guo ad,1, Tianxi Cai b,1; The Consortium for Clinical Characterization of COVID-19 by EHR 4CEb
PMCID: PMC9707637  NIHMSID: NIHMS1850166  PMID: 36007785

Abstract

Objective:

For multi-center heterogeneous Real-World Data (RWD) with time-to-event outcomes and high-dimensional features, we propose the SurvMaximin algorithm to estimate Cox model feature coefficients for a target population by borrowing summary information from a set of health care centers without sharing patient-level information.

Materials and Methods:

For each of the centers from which we want to borrow information to improve the prediction performance for the target population, a penalized Cox model is fitted to estimate feature coefficients for the center. Using estimated feature coefficients and the covariance matrix of the target population, we then obtain a SurvMaximin estimated set of feature coefficients for the target population. The target population can be an entire cohort comprised of all centers, corresponding to federated learning, or a single center, corresponding to transfer learning.

Results:

Simulation studies and a real-world international electronic health records application study, with 15 participating health care centers across three countries (France, Germany, and the U.S.), show that the proposed SurvMaximin algorithm achieves accuracy comparable to or higher than that of estimators using only information from the target site and of other existing methods. The SurvMaximin estimator is robust to variations in sample sizes and estimated feature coefficients across centers, which yields substantially improved estimates for target sites with fewer observations.

Conclusions:

The SurvMaximin method is well suited for both federated and transfer learning in the high-dimensional survival analysis setting. SurvMaximin requires only a one-time exchange of summary information from participating centers and accommodates settings where the estimated regression vectors are highly heterogeneous across centers. SurvMaximin provides robust Cox feature coefficient estimates without requiring outcome information in the target population and is privacy-preserving.

1. Introduction

Electronic health records (EHR) have been widely adopted in the U.S. and other countries [1-5]. The EHR contains a wealth of patient medical information collected over time by health care providers; common structured data types include demographics, diagnoses, laboratory test results, medications, and vital signs. Given their longitudinal nature, EHR data have been utilized for various research purposes, including survival analysis [6-8]. For example, the Cox proportional hazards model is commonly used and has been applied to EHR-based risk prediction [9].

With the increasing availability of EHR data, there is a great interest in integrating knowledge from a diverse range of health care centers to improve generalizability and accelerate discoveries. There now exist multiple collaborative consortia each composed of diverse health care centers seeking to leverage their EHR data in unison. For example, the Consortium for Clinical Characterization of COVID-19 by EHR (4CE consortium) is an international research collaborative that collects patient-level EHR data to study the epidemiology and clinical course of COVID-19 [10]. The consortium comprises more than 300 hospitals across seven countries with 83,178 patients, representing a broad range of multi-national health care centers serving diverse patient populations.

However, EHR data obtained from multiple diverse health care centers often exhibit a high degree of heterogeneity due to variability in EHR and data warehouse platforms, patient populations, health care practices, coding, and documentation. Further, patient-level data often cannot be shared directly between health care centers in a timely manner due to patient and institutional privacy laws [11]. Thus, there is a need for robust analytic strategies to overcome the barriers to conducting multi-center EHR studies.

Our objective is to jointly leverage multi-center, high-dimensional EHR data to make more precise inferences for a target population in the survival analysis setting by sharing only summary statistics obtained from each center, such as Cox feature coefficients and covariance matrices. The target population may be the entire population inclusive of all centers, a subset of centers, or a new, separate population. Integrative analysis approaches that only require individual sites to share summary statistics are often referred to as federated learning [12-14].

Most existing federated learning methods focus on settings with a small number of predictors and/or homogeneous settings where the underlying predictive models are shared across sites [12-14]. In addition, existing methods generally require several rounds of communication between sites, which can be inefficient and labor-intensive. To ensure transportability of models across sites, transfer learning methods have been proposed to transfer knowledge from separate but related centers to provide robust and precise estimates for patients in a new center. This approach has widespread applications in medical studies such as drug sensitivity prediction, integrative analysis of "multi-omics" data, and natural language processing [15-18]. However, most transfer learning methods require outcome labels from the target population, which may be difficult and expensive to obtain, and do not consider the federated learning scenario where individual-level data cannot be shared across sites. In the absence of outcome labels in the target population, transfer learning methods require the stringent assumption that the target and source populations share the same underlying risk model, leading to potential transfer failure when the risk model for the target population is similar to only a subset of the source populations [19-21].

With heterogeneous training datasets from multiple centers, one potential limitation of existing federated transfer learning methods is that the performance of the prediction model can vary substantially across centers. Thus, although the overall performance may be satisfactory, the performance of the model in a particular center might be low. Moreover, when trained models are applied to a new population, transferability and portability are not guaranteed. To improve the robustness of prediction models, the maximin effect approach was first proposed in [22-24] and has been used as a metric to build a robust prediction model for continuous outcomes across heterogeneous training datasets. Instead of optimizing the average performance across all training datasets, the maximin effect method aims to train a model that maximizes the minimum gain over the null model among all training datasets. The maximin approach was further extended to a setting that allows for covariate shift between the source and target populations [25]. The group distributionally robust optimization of [26,27] is closely related to the maximin effect: it builds a robust prediction model by minimizing the worst-case training loss over a class of distributions. The maximin projection was developed in [28] to construct optimal treatment regimens for new patients by leveraging training data from different groups with heterogeneity in the optimal treatment decision.

In this paper, motivated by the maximin algorithm for continuous outcomes in [22,25], we propose a maximin transfer learning algorithm for predicting a survival outcome (SurvMaximin) in a target population with high-dimensional features by robustly combining multiple prediction models trained in different source populations. This algorithm only requires sharing of summary statistics across centers and can easily accommodate high-dimensional features. SurvMaximin can be viewed as a robust federated approach to transfer models trained at multiple external centers to a target population, so we refer to it as a federated transfer learning method. SurvMaximin differs from existing transfer learning methods in that it does not require the target population to share the same underlying model with the source population, a highly desirable property when learning with multiple heterogeneous health care systems. The training of the SurvMaximin algorithm also does not require the target population to have gold-standard outcome labels.

2. Methods

2.1. SurvMaximin: Federated robust transfer learning for survival outcomes

The main aim of the SurvMaximin algorithm is to derive a robust risk prediction model for an unlabeled target population based on labeled data from L source populations under data sharing constraints. Suppose there are L source populations, indexed by $l \in \{1, \dots, L\}$, representing L studies, and one target population, denoted by Q. The observed data from the $l$th source population consist of $n_l$ independent and identically distributed random vectors $\mathcal{D}_l = \{(Z_{li}, X_{li}, \delta_{li})\}_{1 \le i \le n_l}$, where $Z_{li}$ denotes the p-dimensional standardized baseline risk factors, with p potentially large relative to the sample sizes; the censored survival times are observed as $X_{li} = \min(T_{li}, C_{li})$ and $\delta_{li} = I(T_{li} \le C_{li})$, with $T_{li}$ and $C_{li}$ denoting the survival time and follow-up time for the $i$th subject in the $l$th population, respectively; and $I(\cdot)$ is the indicator function, which equals 1 if its argument is true and 0 otherwise. In the target population Q, only the baseline features $\{Z_{Qi}\}_{1 \le i \le n_Q}$ are observed. We assume that the survival time in the $l$th source population, $T_l$, given the baseline features $Z_l$, follows a Cox proportional hazards model, $\Lambda_l(t \mid Z_{li}) = \Lambda_{0,l}(t) \exp(b_l^\top Z_{li})$, which can be equivalently expressed [29] as

$$\tilde{T}_{li} \equiv \log \Lambda_{0,l}(T_{li}) = -b_l^\top Z_{li} + \epsilon_{li}, \quad \text{with } \epsilon_{li} \perp Z_{li} \text{ and } P(\epsilon_{li} > x) = \exp\{-\exp(x)\}, \qquad (1)$$

where $\Lambda_l(t \mid Z_{li})$ is the conditional cumulative hazard function given $Z$ for the $l$th population, $\Lambda_{0,l}(t)$ is the cumulative baseline hazard function, $b_l \in \mathbb{R}^p$ denotes the vector of unknown log hazard ratio parameters associated with the risk factors $Z$, $\perp$ denotes independence, and $\epsilon_{li}$ is a random variable with the distribution specified in (1). We assume that the distributions of the baseline risk factors, the hazard ratio parameters, and the baseline hazard functions may vary across the source populations due to study heterogeneity. Similarly, we assume that the survival time from the target population, $T_Q$, follows a Cox model with unknown baseline cumulative hazard $\Lambda_{0,Q}(\cdot)$ and feature effect vector $\beta_Q$:

$$\tilde{T}_{Qi} \equiv \log \Lambda_{0,Q}(T_{Qi}) = -\beta_Q^\top Z_{Qi} + \epsilon_{Qi}, \quad \text{with } \epsilon_{Qi} \perp Z_{Qi} \text{ and } P(\epsilon_{Qi} > x) = \exp\{-\exp(x)\}. \qquad (2)$$
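As a brief verification of the transformation-model representation in (1) and (2) (a standard argument; see [29]), note that under the Cox model $\Lambda_{0,l}(T_{li})\exp(b_l^\top Z_{li})$ given $Z_{li}$ follows a unit exponential distribution. Setting $\epsilon_{li} = \log\{\Lambda_{0,l}(T_{li})\exp(b_l^\top Z_{li})\}$ therefore gives

$$P(\epsilon_{li} > x \mid Z_{li}) = P\big(\Lambda_{0,l}(T_{li})\exp(b_l^\top Z_{li}) > e^{x} \mid Z_{li}\big) = \exp\{-\exp(x)\},$$

which does not depend on $Z_{li}$, so $\epsilon_{li} \perp Z_{li}$ and $\log \Lambda_{0,l}(T_{li}) = -b_l^\top Z_{li} + \epsilon_{li}$, as stated in (1).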

The SurvMaximin algorithm aims to identify a robust approximation to $\beta_Q$ based on the estimated hazard ratio parameters trained from $\{\mathcal{D}_l\}_{1 \le l \le L}$ as well as the target feature distribution. Due to the lack of gold-standard labels on Q and the unspecified heterogeneity among $\{b_l\}_{1 \le l \le L}$ and $\beta_Q$, the target $\beta_Q$ cannot be identified from the observed data. Instead of targeting $\beta_Q$ directly, the central idea of the SurvMaximin algorithm is to identify an approximation to $\beta_Q$ that maximizes the minimum reward across all L source populations. Following [25], we define the hypothetical outcomes for the $n_Q$ subjects in Q generated from the $l$th source model as

$$\tilde{T}_{li} \equiv -b_l^\top Z_{Qi} + \epsilon_{li}, \quad \text{with } \epsilon_{li} \perp Z_{Qi} \text{ and } P(\epsilon_{li} > x) = \exp\{-\exp(x)\}, \quad \text{for } 1 \le l \le L, \qquad (3)$$

where $\tilde{T}_{li}$ can be viewed as the hypothetical outcome (transformed survival time) if the individual with features $Z_{Qi}$ were assigned to the $l$th source population. Then we define a robust prediction model as

$$\beta_Q^* = \arg\max_{\beta \in \mathbb{R}^p} R_Q(\beta), \quad \text{with } R_Q(\beta) = \min_{1 \le l \le L} \left\{ E(\tilde{T}_{li})^2 - E(\tilde{T}_{li} + \beta^\top Z_{Qi})^2 \right\}, \qquad (4)$$

where the expectation is taken with respect to $Z_{Qi}$ and $\{\tilde{T}_{li}\}_{1 \le l \le L}$ defined in (3). Such a covariate-shift maximin effect was defined in [25] for the linear model and is extended here to the Cox regression model. Note that $E(\tilde{T}_{li})^2 - E(\tilde{T}_{li} + \beta^\top Z_{Qi})^2$ is a reward function of $\beta$ that represents the variance of $\tilde{T}_{li}$ explained by the linear prediction $-\beta^\top Z_{Qi}$. The targeted maximin effect maximizes the adversarial reward $R_Q(\beta)$ across the L groups. The SurvMaximin estimate $\beta_Q^*$ leads to a robust prediction model since the optimization in (4) guards against the worst-case scenario. The maximin effect can be interpreted from an adversarial perspective [23]: in a two-player game, we select an effect vector $\beta$ and the counter agent then chooses the most challenging scenario for this $\beta$, that is, the source population for which $\beta$ has the worst predictive performance. Our goal is to choose $\beta$ such that the worst-case reward with respect to predicting the transformed survival time returned by the counter agent is maximized.

As shown in the Supplementary Materials, RQ(β) can be equivalently expressed as:

$$R_Q(\beta) = \min_{1 \le l \le L} \left\{ E(b_l^\top Z_{Qi})^2 - E(b_l^\top Z_{Qi} - \beta^\top Z_{Qi})^2 \right\} = \min_{b \in \mathbb{B}} \left\{ 2 b^\top \Sigma_Q \beta - \beta^\top \Sigma_Q \beta \right\}, \qquad (5)$$

where $\Sigma_Q = E[Z_{Qi} Z_{Qi}^\top]$ and $\mathbb{B} = \{b_l, l = 1, \dots, L\}$. Following [25], we may show that the maximin effect $\beta_Q^*$ defined by (4) can be expressed as a weighted average of $\{b_l\}_{1 \le l \le L}$,

$$\beta_Q^* = B \gamma_Q^*, \quad \text{with } \gamma_Q^* = \arg\min_{\gamma: \|\gamma\|_1 = 1, \, \gamma_l \ge 0} \gamma^\top \Gamma_Q \gamma, \qquad (6)$$

where $B = [b_1, \dots, b_L] \in \mathbb{R}^{p \times L}$, $\|\cdot\|_q$ denotes the $L_q$ norm, $\Gamma_Q = B^\top \Sigma_Q B$ is a similarity matrix, and the minimization above is restricted to the simplex in L-dimensional space. The optimal aggregation weight $\gamma_Q^*$ in (6) depends on both $\{b_l\}_{1 \le l \le L}$ and the covariance matrix $\Sigma_Q$ of the target population. The identification equation (6) reveals an important geometric interpretation of the maximin effect: $\beta_Q^*$ is the point that lies in the convex hull of the regression vectors $\{b_l\}_{1 \le l \le L}$ and has the smallest distance to the origin [22]. The maximin estimator tends to shrink toward zero the components of $\{b_l\}_{1 \le l \le L}$ whose estimated coefficients vary with different signs across studies, and it is not overly sensitive to the inclusion of sites with an extreme hazard ratio regression vector [22]. In the transfer learning setting, we incorporate the target distribution of $Z_Q$ into the definition of the distance.
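For completeness, the second equality in (5) follows by expanding the quadratic form and using $E(v^\top Z_{Qi})^2 = v^\top \Sigma_Q v$ for any fixed vector $v$ (the full argument is given in the Supplementary Materials):

$$E(b_l^\top Z_{Qi})^2 - E\{(b_l - \beta)^\top Z_{Qi}\}^2 = b_l^\top \Sigma_Q b_l - (b_l - \beta)^\top \Sigma_Q (b_l - \beta) = 2 b_l^\top \Sigma_Q \beta - \beta^\top \Sigma_Q \beta.$$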

2.2. Implementation of the SurvMaximin algorithm

The SurvMaximin algorithm involves three key steps: (I) locally train the prediction model at each of the L source sites to obtain $\{\hat{b}_l\}_{1 \le l \le L}$; (II) estimate the covariance matrix $\Sigma_Q$ and obtain a similarity matrix among $\{\hat{b}_l^\top Z_Q, l = 1, \dots, L\}$, denoted by $\hat{\Gamma}_Q$; and (III) obtain the final SurvMaximin estimator as an optimal linear combination of $\{\hat{b}_l\}_{1 \le l \le L}$ according to (6). The schema of the SurvMaximin algorithm is shown in Fig. 1.

Fig. 1. Schematic of the SurvMaximin algorithm for federated transfer learning.

Step I: Training L local risk prediction models.

We first obtain $\hat{b}_l$ as the maximizer of the penalized partial likelihood:

$$\hat{b}_l = \arg\max_b \left\{ \ell_l(b) - \lambda_l \mathcal{P}(b) \right\}, \qquad (7)$$

where $\ell_l(b)$ is the log partial likelihood associated with $\mathcal{D}_l$ and $\mathcal{P}(b) = \alpha \|b\|_1 + (1-\alpha)\|b\|_2^2$ is the elastic net penalty function, which is frequently used to overcome high dimensionality and collinearity of features, with $\alpha = 1$ corresponding to the standard LASSO penalty and $\alpha = 0$ corresponding to the ridge penalty [30]. The non-negative penalty parameter $\lambda_l$ can be selected via standard tuning criteria such as AIC, BIC, or cross-validation.
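As an illustration of Step I, the sketch below fits an elastic-net penalized Cox model at a single source site. It uses the Python lifelines package rather than any software specified by the authors; the data frame layout, column names, and default tuning values are assumptions for illustration only.

```python
import pandas as pd
from lifelines import CoxPHFitter

def fit_local_cox(df: pd.DataFrame, feature_cols, penalizer=0.1, l1_ratio=1.0):
    """Step I sketch: fit a penalized Cox model at one source site.

    df must contain standardized features, a follow-up time column 'time',
    and an event indicator column 'event' (1 = event observed, 0 = censored).
    l1_ratio=1.0 corresponds to the LASSO penalty, 0.0 to the ridge penalty.
    """
    cph = CoxPHFitter(penalizer=penalizer, l1_ratio=l1_ratio)
    cph.fit(df[feature_cols + ["time", "event"]],
            duration_col="time", event_col="event")
    # Only these p-dimensional coefficient estimates (and, for Section 2.3,
    # the empirical feature covariance matrix) need to leave the site.
    return cph.params_.values  # estimated log hazard ratios, \hat{b}_l
```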

Step II: Estimate the similarity matrix among $\{\hat{b}_l^\top Z_Q, l = 1, \dots, L\}$.

We estimate the similarity matrix $\Gamma_Q = B^\top \Sigma_Q B$ of $\{\hat{b}_l^\top Z_Q, l = 1, \dots, L\}$ by $\hat{\Gamma}_Q = \hat{B}^\top \hat{\Sigma}_Q \hat{B}$, where $\hat{B} = [\hat{b}_1, \dots, \hat{b}_L] \in \mathbb{R}^{p \times L}$ and $\hat{\Sigma}_Q$ is the empirical variance-covariance matrix of $Z_Q$ estimated from the unlabeled target population data.

Step III: Maximin aggregation via (6).

Finally, we obtain the SurvMaximin aggregated log hazard ratio estimator as

$$\hat{\beta}_Q = \hat{B} \hat{\gamma}_Q, \quad \text{with } \hat{\gamma}_Q = \arg\min_{\gamma: \|\gamma\|_1 = 1, \, \gamma_l \ge 0} \gamma^\top \hat{\Gamma}_Q \gamma + \eta \|\gamma\|_2^2, \qquad (8)$$

where $\eta \ge 0$ is a tuning parameter and the ridge penalty is included to account for potential high collinearity among $\{\hat{b}_l\}_{1 \le l \le L}$. See the Supplementary Materials for a data-adaptive approach to selecting $\eta$. In practice, we find that when some heterogeneity is present, as in our 4CE studies, setting $\eta = 0$ works well, and the results are not sensitive to the choice of $\eta$ when a relatively small value is chosen.
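The following sketch illustrates Steps II and III: it forms $\hat{\Gamma}_Q$ from the shared coefficient matrix and the unlabeled target features, then solves the simplex-constrained quadratic program in (8) with a generic solver. The function and variable names are ours, and the use of SciPy's SLSQP solver is an implementation choice, not the authors' software.

```python
import numpy as np
from scipy.optimize import minimize

def survmaximin(B_hat: np.ndarray, Z_Q: np.ndarray, eta: float = 0.0) -> np.ndarray:
    """Steps II-III sketch: aggregate local estimates {b_l} into beta_hat_Q.

    B_hat : (p, L) matrix whose columns are the local Cox coefficient estimates.
    Z_Q   : (n_Q, p) matrix of centered, standardized target-site features.
    eta   : ridge tuning parameter in (8); eta = 0 is often adequate.
    """
    p, L = B_hat.shape
    Sigma_Q = np.cov(Z_Q, rowvar=False)      # empirical covariance of Z_Q
    Gamma_Q = B_hat.T @ Sigma_Q @ B_hat      # similarity matrix (L x L)

    def objective(gamma):
        return gamma @ Gamma_Q @ gamma + eta * gamma @ gamma

    # Minimize over the L-dimensional probability simplex {gamma >= 0, sum = 1}.
    gamma0 = np.full(L, 1.0 / L)
    res = minimize(objective, gamma0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * L,
                   constraints=[{"type": "eq", "fun": lambda g: g.sum() - 1.0}])
    gamma_hat = res.x
    return B_hat @ gamma_hat                 # SurvMaximin estimator beta_hat_Q
```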

2.3. Transfer to a target site with missing features

A substantial challenge in transfer learning across different health care centers is that certain risk predictors, such as laboratory test results or demographic information, may be available in one center but not in another. For example, in the 4CE consortium, all U.S. centers report data on race while European centers do not, causing race data to be entirely missing for European centers. To transport a risk prediction model to a target center Q with only a subset of features available, one may fit a reduced model limited to the available features at each source center and transport the reduced risk models from the source centers. Essentially, whenever the target center changes, the model at each source center would need to be retrained according to the feature availability of that target center. Such an approach is not computationally efficient, as each center needs to fit multiple models, and it also increases the number of communications required across centers.

To enable transfer learning in the context of differential feature availability, we propose a simple projection approach that only requires each source center to additionally compute the empirical covariance matrix of its features, $\{\hat{\Sigma}_l\}_{l=1,\dots,L}$, where $\hat{\Sigma}_l = n_l^{-1} \sum_{i=1}^{n_l} Z_{li} Z_{li}^\top$. Let $\mathcal{A} \subset \{1, \dots, p\}$ index the features that are available at the target site and $\mathcal{A}^c = \{1, \dots, p\} \setminus \mathcal{A}$. Let $Z[\mathcal{A}]$ denote the subvector of $Z$ corresponding to $\mathcal{A}$. The key step is to project $\hat{b}_l^\top Z_l$ onto the subspace spanned by $Z_l[\mathcal{A}]$, i.e., $\hat{b}_l^\top Z_l = \hat{\theta}_l^\top Z_l[\mathcal{A}] + e_l$, and to predict $T_l$ based on $\hat{\theta}_l^\top Z_l[\mathcal{A}]$. Since the features are all assumed to be centered, we obtain $\hat{\theta}_l = \hat{b}_l[\mathcal{A}] + \hat{\alpha}_l \hat{b}_l[\mathcal{A}^c]$ with $\hat{\alpha}_l = (\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}])^{-1} \hat{\Sigma}_l[\mathcal{A}, \mathcal{A}^c]$, where $\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}]$ and $\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}^c]$ denote the submatrices of $\hat{\Sigma}_l$ corresponding to $\{\mathcal{A}, \mathcal{A}\}$ and $\{\mathcal{A}, \mathcal{A}^c\}$. The final SurvMaximin estimator for the effects of the features $Z_Q[\mathcal{A}]$ is then constructed by replacing $\{\hat{b}_l\}_{1 \le l \le L}$ with $\{\hat{\theta}_l\}_{l=1,\dots,L}$ and $\hat{\Sigma}_Q$ with $\hat{\Sigma}_Q[\mathcal{A}, \mathcal{A}]$. If $Z_l[\mathcal{A}]$ is high dimensional or its components are collinear, in which case $\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}]$ is not invertible, regularization methods [31,32] can be applied to stably invert $\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}]$ and construct $\hat{\alpha}_l$.
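A minimal sketch of this projection step, assuming centered features and an invertible $\hat{\Sigma}_l[\mathcal{A}, \mathcal{A}]$ (the function and variable names are illustrative, not from the authors' software):

```python
import numpy as np

def project_to_available(b_hat: np.ndarray, Sigma_hat: np.ndarray,
                         avail_mask: np.ndarray) -> np.ndarray:
    """Project a full p-dimensional coefficient vector onto the features
    available at the target site (Section 2.3 sketch).

    b_hat      : (p,) local Cox coefficient estimate from a source site.
    Sigma_hat  : (p, p) empirical covariance of that site's centered features.
    avail_mask : boolean array of length p; True for features in the set A.
    """
    A = np.where(avail_mask)[0]
    Ac = np.where(~avail_mask)[0]
    Sigma_AA = Sigma_hat[np.ix_(A, A)]
    Sigma_AAc = Sigma_hat[np.ix_(A, Ac)]
    alpha = np.linalg.solve(Sigma_AA, Sigma_AAc)   # (|A|, |A^c|) projection weights
    theta = b_hat[A] + alpha @ b_hat[Ac]           # projected coefficients theta_hat_l
    return theta
```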

2.4. Validation of SurvMaximin algorithm

We validated the performance of SurvMaximin in federated transfer learning using both simulation studies and a real-world study where we transported COVID-19 mortality risk prediction models to target centers using EHR data from hospitalized patients with COVID-19.

2.4.1. Simulation studies

Simulation studies were conducted to assess the performance of SurvMaximin and to compare it against existing federated learning methods. Since SurvMaximin transports a risk prediction model to a future target center without observed survival outcomes, we used as comparators other federated learning methods that also do not require supervised training on the target data. Specifically, we considered the standard random-effects meta-analysis estimator (herein referred to as Meta); the One-shot Distributed Algorithm for the Cox model (ODAC) [12]; and locally trained risk prediction models with training sizes of $n_Q$ = 200, 400, and 600. We considered simulation scenarios with L = 15 centers, each with sample size $n_l = 300\lceil l/3 \rceil$, and p = 20 or 50 features in the risk prediction model.

We generated $Z_l$ from a multivariate normal distribution MVN(0, Σ), where Σ is either the first-order autoregressive (AR(1)) correlation matrix $\Sigma = [0.5^{|j-j'|}]_{j,j'=1,\dots,p}$ or a compound symmetry covariance matrix with variance 1 and covariance 0.5. We then generated $Z_Q$ from MVN(0, $\Sigma_Q$) with $\Sigma_Q$ = 0.1 + Σ. Subsequently, we generated $T_l$ and $T_Q$ from:

$$2\log T_{li} = \log\{0.125(1 + 0.05\,l)\} - b_l^\top Z_{li} + \epsilon_{li}, \quad l = 1, \dots, L; \qquad 2\log T_{Qi} = \log 0.225 - b_Q^\top Z_{Qi} + \epsilon_{Qi},$$

where $\epsilon_{li}$ and $\epsilon_{Qi}$ were generated from extreme value distributions. We let

$$b_Q = [\beta_{8\times1}, 0_{1\times(p-8)}]^\top, \quad b_l = [\beta_{8\times1} + e_l, 0_{1\times(p-8)}]^\top, \quad e_l = [e_{l1}, \dots, e_{l8}]^\top,$$

and considered a range of scenarios for β and $\{e_l\}_{l=1,\dots,L}$ to explore how the signal strength, the heterogeneity among the source sites, and the degree of similarity between the target site and the source sites affect the performance of SurvMaximin relative to other methods. Specifically, we considered β = [0.5, 0.4, 0.3, 0.2, −0.2, −0.3, −0.4, −0.5]^T and β = [0.25, 0.2, 0.15, 0.1, −0.1, −0.15, −0.2, −0.25]^T to represent moderate and weak signals, respectively. We considered two settings for $\{e_l\}$, each with three levels of heterogeneity among the source sites. In setting (I), we let $e_{lj} = \tau\{I(l \le 5) + 3I(l > 5)\}(-1)^j$, which makes the first 5 sites more similar to the target site than the remaining 10 sites. In setting (II), we let $e_{lj} = \tau(l-1)$, so that the majority of the source sites are substantially different from the target site. We let τ = 0.05, 0.1, and 0.2 to reflect low, medium, and high degrees of heterogeneity among the source sites; as τ increases, the target site also becomes more dissimilar to the source sites. We generated censoring times from an Exponential(1) distribution, leading to event rates of about 20% to 30% across the L source sites under setting (I) and 20% to 40% under setting (II). A sketch of this data-generating mechanism is given below.
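For concreteness, the following sketch generates one source site's data under setting (I) with the AR(1) covariance, following the model described above; the exact form of $n_l$ and all function names are our assumptions for illustration.

```python
import numpy as np

def simulate_site(l, p=20, tau=0.05, beta=None, rng=None):
    """Generate one source site's data under setting (I) with AR(1) covariance."""
    rng = np.random.default_rng() if rng is None else rng
    n_l = 300 * int(np.ceil(l / 3))                    # assumed site sample size
    if beta is None:                                   # moderate-signal beta
        beta = np.array([0.5, 0.4, 0.3, 0.2, -0.2, -0.3, -0.4, -0.5])
    e_l = tau * (1 if l <= 5 else 3) * (-1.0) ** np.arange(1, 9)
    b_l = np.concatenate([beta + e_l, np.zeros(p - 8)])

    Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # AR(1)
    Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n_l)

    # Extreme-value errors: eps = log(E) with E ~ Exponential(1),
    # so that P(eps > x) = exp{-exp(x)}.
    eps = np.log(rng.exponential(1.0, size=n_l))
    T = np.exp(0.5 * (np.log(0.125 * (1 + 0.05 * l)) - Z @ b_l + eps))
    C = rng.exponential(1.0, size=n_l)                 # censoring times
    X, delta = np.minimum(T, C), (T <= C).astype(int)  # observed time, event indicator
    return Z, X, delta
```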

To evaluate the performance of SurvMaximin in the presence of missing features, we considered setting (I) with moderate signal, Σ being AR(1), p = 20, and τ = 0.05, 0.1, 0.2. We let the first feature of the target site be missing and calculated the projected SurvMaximin estimator described in Section 2.3, denoted by SurvMaximin$_{\text{project}}$. For comparison, we also fitted penalized Cox models at each site with covariates $Z[\mathcal{A}]$, $\mathcal{A} = \{2, \dots, p\}$, to obtain the corresponding effect estimates $\hat{b}_l[\mathcal{A}]$, and then constructed SurvMaximin based on $\{\hat{b}_l[\mathcal{A}]\}_{l=1,\dots,L}$. As naive benchmarks, we additionally constructed the ODAC and Meta models based on the full Z and transported these models to the target site by removing the component associated with the first covariate. Such naive approaches are often adopted in practice due to the inability to refit the reduced models at the source sites.

We also generated censoring times from an Exponential(2.5) distribution to consider scenarios with rare events, with event rates ranging from 4% to 10% across the 15 sites under setting (I) and from 3% to 35% under setting (II).

We evaluated the overall performance of the estimated risk score from each method in predicting the survival time $T_Q$ at the target site using the survival C-statistic with a truncation time close to the largest observed survival time in the target site [33]. We estimated the C-statistics based on an independent validation dataset of size $N_Q$ = 2,000 generated from the target distribution. For each configuration, we summarized results over 500 iterations.
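A sketch of this evaluation step, assuming the truncated C-statistic of [33] as implemented in the scikit-survival package; reusing the validation data to estimate the censoring distribution is a simplification, and the variable names are ours.

```python
import numpy as np
from sksurv.util import Surv
from sksurv.metrics import concordance_index_ipcw

def truncated_c_statistic(beta_hat, Z_val, X_val, delta_val, tau):
    """Estimate the truncated survival C-statistic [33] on validation data.

    beta_hat  : (p,) estimated log hazard ratios (e.g., the SurvMaximin estimator).
    Z_val     : (N_Q, p) validation features.
    X_val     : (N_Q,) observed follow-up times; delta_val: event indicators.
    tau       : truncation time near the largest observed survival time.
    """
    y_val = Surv.from_arrays(event=delta_val.astype(bool), time=X_val)
    risk_score = Z_val @ beta_hat            # higher score = higher predicted risk
    cindex, *_ = concordance_index_ipcw(y_val, y_val, risk_score, tau=tau)
    return cindex
```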

2.4.2. Improving cross-system portability of COVID-19 mortality risk prediction models with SurvMaximin

We further validated the performance of SurvMaximin by deriving robust and transportable mortality risk prediction models for patients hospitalized with COVID-19 using international, multi-institutional EHR data from the 4CE consortium [10,34]. Baseline risk factors and mortality information were available for 83,178 patients from L0 = 17 participating health care centers of the consortium across three countries: France, Germany, and the U.S. Eligibility criteria for the study included a positive SARS-CoV-2 reverse transcription polymerase chain reaction (PCR) test result; an admission date between March 1, 2020 and January 31, 2021; and the admission occurring 7 days before to 14 days after the date of their first positive PCR test result recorded in their EHR. Each health care center performed analyses locally and then reported summary results to the central institution. We considered each of the individual health care centers as a potential target population and sought to derive a mortality risk prediction model that is transportable to this population from multiple external models. Given the multinational nature of our data, we anticipated a significant amount of between-health care center heterogeneity in their mortality risk models.

Baseline risk predictors considered included: age groups (18–25, 26–49, 50–69, 70–80, 80+), sex, and race (White, Black, Asian, Hispanic, and other); the pre-admission Charlson comorbidity index (CCI) derived from diagnostic codes; and laboratory test values at admission [35]. We focused on 10 commonly measured laboratory tests (with missing rates < 30 %), including C-reactive protein (CRP), albumin, aspartate aminotransferase (AST), AST to alanine aminotransferase ratio (AST/ALT), total bilirubin, creatinine, d-dimer, white blood cell count (WBC), lymphocyte count, and neutrophil count. Values of AST, d-dimer, and CRP were log-transformed due to their skewed distributions. Missing baseline laboratory values and CCI were imputed via the multivariate imputation by chained equation method and averaged over five imputed sets [36]. In total, we considered p = 19 potential risk predictors. A few predictors, including race data for the European centers, were not available (Supplementary Figure S4). When a variable was not ascertained at a site, the local Cox model fitting excluded it. We derived and evaluated prediction models for all-cause mortality by 3, 7, and 14 days after the admission date. We excluded patients who died on the day of admission in the survival analysis.

For each of the L0 = 17 health care centers, we transported mortality risk prediction models trained from external analyses via SurvMaximin to the patient population of that center. Specifically, for the $l$th health care center, we fit LASSO-penalized Cox models to estimate the coefficients $\hat{b}_l$, $l = 1, \dots, L_0$, of $Z_l$ on the survival outcome. For $l = 1, \dots, L_0$, we let the $l$th site be the target site and trained the SurvMaximin algorithm based on the source data from all remaining sites that have the predictors $Z_{\mathcal{A}_l}$ ascertained, where $Z_{\mathcal{A}_l}$ denotes the covariate vector available at site l. We used the proposed projection method when the target center had an incomplete set of features. After obtaining the SurvMaximin risk model for the target center, we compared it against each of the $L_0$ supervised locally trained models with respect to accuracy in predicting $t_0$ = 3-, 7-, and 14-day mortality in the target population. We quantified the accuracy of predicting $t_0$-day mortality using the area under the receiver operating characteristic curve (AUC). We repeated this analysis for all $L_0$ = 17 centers, each time considering one of them as the target population Q.

3. Results

3.1. Results for simulation studies

Simulation results for the moderate signal scenario are summarized in Fig. 2. In setting (I), where 5 source sites have feature coefficients similar to the target site, SurvMaximin produces models with accuracy comparable to those from ODAC and Meta when the heterogeneity is low (τ = 0.05, 0.1) and outperforms the other methods when the heterogeneity is high (τ = 0.2). Since 5 source sites are relatively similar to the target site, the transported model from SurvMaximin attained accuracy higher than the locally trained model with nQ = 200 and comparable to models trained with nQ = 600. When p or the correlation among the features increases, the estimated models generally attain lower prediction performance. Nevertheless, SurvMaximin continues to attain robust performance relative to the other federated learning methods across different levels of heterogeneity.

Fig. 2. Average C-statistics under settings (I) and (II) with Σ being either AR(1) or compound symmetry; p = 20 or 50; and τ = 0.05, 0.10, or 0.20 (heterogeneity of the local coefficients) for predicting survival in the target population with risk models trained by SurvMaximin, Meta, and ODAC, as well as supervised penalized Cox regression with nQ = 200, 400, or 600 labeled target data (Local200, Local400, Local600).

In setting (II), where only one other site has feature coefficients similar to the target site, SurvMaximin exhibits substantially better predictive performance than Meta and ODAC across all settings, further highlighting the robustness of SurvMaximin to varying degrees of similarity between the target site and the source sites. Across all levels of heterogeneity, the Meta and ODAC estimators suffer from very small C-statistics, indicating poor predictive performance. We observed similar trends regardless of whether p = 20 or 50 features were used and regardless of the covariance matrix structure. Further, the performance of SurvMaximin remains better than that of the supervised model trained with nQ = 200 labeled target site data and comparable to the locally trained models with nQ = 400 and 600. This suggests that SurvMaximin may improve estimation performance when the target population sample size is small.

With weaker signals, the cross-site heterogeneity is more pronounced, leading to more apparent distinctions between SurvMaximin and the other federated methods (Figure S1 of the Supplement). Only when the heterogeneity is very low (τ = 0.05) under setting (I) do all methods perform similarly. Under all other settings, SurvMaximin substantially outperforms ODAC and Meta. With weaker signals, locally trained models also require a larger sample size to attain performance comparable to SurvMaximin. This further illustrates the advantage of transporting existing models in a robust fashion over training a supervised model when the training sample size is not large relative to the feature dimension. Results for low event rates, shown in Figure S2, exhibit similar trends.

Results assessing the performance of the projected SurvMaximin algorithm in the presence of missing features are summarized in Figure S3. The projected SurvMaximin model attains prediction performance comparable to the SurvMaximin model trained by aggregating the locally refit sub-models based on the reduced feature set. Thus, the projection method provides a comparable alternative SurvMaximin estimator when features are missing for some sites, without the need to unify the set of features across all centers for every new target. The projected SurvMaximin estimator also outperforms the naive approach of removing the component associated with the first covariate from the ODAC or Meta estimators.

3.2. Results for transporting COVID mortality risk models

For each covariate, we compared the L0 local estimates of its log hazard ratio to those based on SurvMaximin in Fig. 3. While these two sets of estimators are generally consistent, SurvMaximin estimators tend to be more concentrated at the center, while local estimators exhibit higher variability in part due to unstable estimates from some sites. For example, the log hazard ratio (HR) of the age group (18–25) ranges from −6.58 to 0 for the local estimates while the SurvMaximin estimates range from −1.43 to −0.7.

Fig. 3. Density plots of local versus SurvMaximin effect estimators for healthcare systems l = 1, …, L0 = 15.

The AUC estimates for the risk models obtained via SurvMaximin and via local supervised training for predicting 3-, 7-, and 14-day mortality are shown in Fig. 4. For each site, we also compared the AUC of the models trained at each of the external sites, the locally trained model, and the SurvMaximin model for predicting 14-day mortality (Figure S5). The accuracy of the risk models transported by SurvMaximin, which do not use the outcome information of the target site, is comparable to, and sometimes higher than, that of the locally trained models. The AUCs of SurvMaximin are also more concentrated at comparatively higher values, suggesting the robustness of the SurvMaximin approach.

Fig. 4. Density plots of estimated AUCs for models trained via local estimation versus SurvMaximin for healthcare systems l = 1, …, L0 = 17.

4. Discussion

We proposed the SurvMaximin approach for deriving a robust risk prediction model for a target population by synthesizing estimated risk models from multiple sites. For the target site, the SurvMaximin estimator $\hat{\beta}_{maximin}$ is a linear combination of the coefficient estimators of the local sites $\{\hat{b}_l\}_{1 \le l \le L}$; equivalently, it lies in the convex hull of $\{\hat{b}_l\}_{1 \le l \le L}$ and is the point closest to the origin with respect to a distance determined by the target population. The method enables us to safely transport a set of existing risk models to a target population in the presence of high cross-site heterogeneity.

Compared with existing federated learning methods, such as Meta and the federated learning method proposed in [12], the proposed maximin method can handle high-dimensional covariates and is robust to heterogeneity between sites. It is also robust to sample size differences and improves inference when the sample size of the target population is small, as seen in the simulation studies. The SurvMaximin algorithm is efficient in time and cost, as it requires only a one-time sharing of summary statistics. Compared with existing transfer learning methods, the proposed maximin method helps preserve the privacy and confidentiality of patients across centers. Further, it requires only limited information: the feature information of the target site and the feature effect estimates from the other sites. Thus, SurvMaximin is flexible and general, adapting to a variety of scenarios while achieving high accuracy with limited information.

5. Conclusion

In this paper, we developed the SurvMaximin covariate effect estimator for multi-center survival data with high-dimensional covariates. Simulation studies and real EHR data analyses show that the proposed estimator achieves high accuracy across a range of settings with different levels of between-site heterogeneity and different sample sizes. SurvMaximin is a highly flexible and robust approach for multi-center survival analysis, enabling federated learning, transfer learning, and federated transfer learning.

Supplementary Material

S-Fig3
S-Fig1
S-Fig2
S-Fig4
S-Fig5
S-Fig6
S-Fig7
S-Fig8
S-Fig9
S-Fig10

Acknowledgement

GMW is supported by National Institutes of Health (NIH)/ National Center for Advancing Translational Sciences (NCATS) UL1TR002541, NIH/NCATS UL1TR000005, NIH/National Library of Medicine (NLM) R01LM013345, NIH/ National Human Genome Research Institute (NHGRI) 3U01HG008685-05S2. YL is supported by NIH/NCATS U01TR003528, and NLM 1R01LM013337. KC is supported by VA MVP000 and CIPHER. NGB is supported by PI18/00981, funded by the Carlos III Health Institute. DAH is supported by NCATS UL1TR002240. MSK is supported by NHGRI 5T32HG002295-18. JHM is supported by NLM 010098. MM is supported by NCATS UL1TR001857. DLM is supported by NIH/NCATS CSTA Award #UL1-TR001878. SNM is supported by NCATS 5UL1TR001857-05 and NHGRI 5R01HG009174-04. GSO is supported by NIH U24CA210867 and P30ES017885. LPP is supported by NCATS Clinical and Translational Science Award (CTSA) Award #UL1TR002366. FSJV is supported by NIH/NCATS UL1TR001881. AMS is supported by NIH/ National Heart, Lung, and Blood Institute (NHLBI) K23HL148394 and L40HL148910, and NIH/NCATS UL1TR001420. SV is supported by NCATS UL1TR001857. ZX is supported by National Institute of Neurological Disorders and Stroke (NINDS) R01NS098023. WY is supported by NIH T32HD040128.

IRB Approval was obtained at Assistance Publique - Hôpitaux de Paris, Beth Israel Deaconess Medical Center, Bordeaux University Hospital, Istituti Clinici Scientifici Maugeri Hospitals, University of Kansas Medical Center, Massachusetts General Brigham, Northwestern University, Medical Center University of Freiburg, University of Pittsburgh, and VA North Atlantic, Southwest, Midwest, Continental and Pacific. An exempt determination was made by Institutional Review Boards at Hospital Universitario 12 de Octubre, University of California Los Angeles, University of Michigan, and University of Pennsylvania.

Footnotes

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jbi.2022.104176.

References

[1] Torda P, Han ES, Scholle SH, Easing the adoption and use of electronic health records in small practices, Health Aff. 29 (4) (2010) 668–675.
[2] Decker SL, Jamoom EW, Sisk JE, Physicians in nonprimary care and small practices and those age 55 and older lag in adopting electronic health record systems, Health Aff. 31 (5) (2012) 1108–1114.
[3] Kim Y-G, Jung K, Park Y-T, Shin D, Cho SY, Yoon D, Park RW, Rate of electronic health record adoption in South Korea: a nationwide survey, Int. J. Med. Inf. 101 (2017) 100–107.
[4] Tavares J, Oliveira T, Electronic health record portal adoption: a cross country analysis, BMC Med. Inform. Decis. Mak. 17 (1) (2017) 1–17.
[5] Kose I, Rayner J, Birinci S, Ulgu MM, Yilmaz I, Guner S, Mahir SK, Aycil K, Elmas BO, Volkan E, Altinbas Z, Gencyurek G, Zehir E, Gundogdu B, Ozcan M, Vardar C, Altinli B, Hasancebi JS, Adoption rates of electronic health records in Turkish Hospitals and the relation with hospital sizes, BMC Health Serv. Res. 20 (1) (2020).
[6] Murphy SN, et al., Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2), J. Am. Med. Inform. Assoc. 17 (2) (2010) 124–130.
[7] Hagar Y, Albers D, Pivovarov R, Chase H, Dukic V, Elhadad N, Survival analysis with electronic health record data: experiments with chronic kidney disease, Stat. Anal. Data Min. 7 (5) (2014) 385–403.
[8] Singal G, Miller PG, Agarwala V, Li G, Kaushik G, Backenroth D, Gossai A, Frampton GM, Torres AZ, Lehnert EM, Bourque D, O'Connell C, Bowser B, Caron T, Baydur E, Seidl-Rathkopf K, Ivanov I, Alpha-Cobb G, Guria A, He J, Frank S, Nunnally AC, Bailey M, Jaskiw A, Feuchtbaum D, Nussbaum N, Abernethy AP, Miller VA, Association of patient characteristics and tumor genomics with clinical outcomes among patients with non–small cell lung cancer using a clinicogenomic database, JAMA 321 (14) (2019) 1391.
[9] Cox DR, Regression models and life-tables, J. Roy. Statist. Soc. Ser. B (Methodological) 34 (2) (1972) 187–202.
[10] Brat GA, et al., International electronic health record-derived COVID-19 clinical course profiles: the 4CE consortium, npj Digital Med. 3 (1) (2020) 1–9.
[11] Wolfson M, et al., DataSHIELD: resolving a conflict in contemporary bioscience? Performing a pooled analysis of individual-level data without sharing the data, Int. J. Epidemiol. 39 (5) (2010) 1372–1382.
[12] Duan R, Luo C, Schuemie MJ, Tong J, Liang CJ, Chang HH, Boland MR, Bian J, Xu H, Holmes JH, Forrest CB, Morton SC, Berlin JA, Moore JH, Mahoney KB, Chen Y, Learning from local to global: an efficient distributed algorithm for modeling time-to-event data, J. Am. Med. Inform. Assoc. 27 (7) (2020) 1028–1036.
[13] Wu Y, Jiang X, Kim J, Ohno-Machado L, Grid Binary LOgistic REgression (GLORE): building shared models without sharing data, J. Am. Med. Inform. Assoc. 19 (5) (2012) 758–764.
[14] Lu C-L, Wang S, Ji Z, Wu Y, Xiong L, Jiang X, Ohno-Machado L, WebDISCO: a web service for distributed Cox model learning without patient-level data sharing, J. Am. Med. Inform. Assoc. 22 (6) (2015) 1212–1219.
[15] Bastani H, Predicting with proxies: transfer learning in high dimension, Manage. Sci. 67 (5) (2021) 2964–2984.
[16] Turki T, Wei Z, Wang JTL, Transfer learning approaches to improve drug sensitivity prediction in multiple myeloma patients, IEEE Access 5 (2017) 7381–7393.
[17] Sun YV, Hu Y-J, Integrative analysis of multi-omics data for discovery and functional studies of complex human diseases, Adv. Genet. 93 (2016) 147–190.
[18] Daumé III H, Frustratingly easy domain adaptation, arXiv preprint arXiv:0907.1815 (2009).
[19] Cai TT, Wei H, Transfer learning for nonparametric classification: minimax rate and adaptive classifier, Ann. Statist. 49 (1) (2021) 100–128.
[20] Li S, Cai TT, Li H, Transfer learning for high-dimensional linear regression: prediction, estimation, and minimax optimality, arXiv preprint arXiv:2006.10593 (2020).
[21] Cai T, Liu M, Xia Y, Individual data protected integrative regression analysis of high-dimensional heterogeneous data, J. Am. Stat. Assoc. (2021).
[22] Bühlmann P, Meinshausen N, Magging: maximin aggregation for inhomogeneous large-scale data, arXiv preprint arXiv:1409.2638 (2014).
[23] Meinshausen N, Bühlmann P, Maximin effects in inhomogeneous large-scale data, Ann. Statist. 43 (4) (2015) 1801–1830.
[24] Rothenhäusler D, Meinshausen N, Bühlmann P, Confidence intervals for maximin effects in inhomogeneous large-scale data, in: Statistical Analysis for High-Dimensional Data, Springer, 2016, pp. 255–277.
[25] Guo Z, Inference for high-dimensional maximin effects in heterogeneous regression models using a sampling approach, arXiv preprint arXiv:2011.07568 (2020).
[26] Hu W, et al., Does distributionally robust supervised learning give robust classifiers? Int. Conf. Mach. Learn., PMLR (2018) 2029–2037.
[27] Sagawa S, et al., Distributionally robust neural networks for group shifts: on the importance of regularization for worst-case generalization, arXiv preprint arXiv:1911.08731 (2019).
[28] Shi C, Song R, Lu W, Fu B, Maximin projection learning for optimal treatment decision with heterogeneous individualized treatment effects, J. Roy. Statist. Soc. Ser. B (Statistical Methodol.) 80 (4) (2018) 681–702.
[29] Cheng SC, Wei LJ, Ying Z, Analysis of transformation models with censored data, Biometrika 82 (4) (1995) 835–845.
[30] Hastie T, Tibshirani R, Wainwright M, Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman and Hall/CRC, 2019.
[31] Cai T, Liu W, Luo X, A constrained ℓ1 minimization approach to sparse precision matrix estimation, J. Am. Stat. Assoc. 106 (494) (2011) 594–607.
[32] Cai TT, Ren Z, Zhou HH, Estimating structured high-dimensional covariance and precision matrices: optimal rates and adaptive estimation, Electron. J. Stat. 10 (1) (2016) 1–59.
[33] Uno H, Cai T, Pencina MJ, D'Agostino RB, Wei LJ, On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data, Stat. Med. 30 (10) (2011) 1105–1117.
[34] Weber GM, et al., International changes in COVID-19 clinical trajectories across 315 hospitals and 6 countries: a 4CE consortium study, J. Med. Internet Res. (2021).
[35] Deyo RA, Cherkin DC, Ciol MA, Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases, J. Clin. Epidemiol. 45 (6) (1992) 613–619.
[36] Van Buuren S, Groothuis-Oudshoorn K, mice: Multivariate imputation by chained equations in R, J. Stat. Softw. 45 (1) (2011) 1–67.
