Abstract
Successful implementation of data-driven artificial intelligence (AI) applications requires access to large datasets. Healthcare institutions can establish coordinated data-sharing networks to address the complexity of making large clinical datasets accessible for scientific advancement. However, persistent challenges around controlled access, secure data transfer, and license restrictions stemming from regulatory and legal concerns discourage data sharing even among in-network hospitals. Out-of-network healthcare institutions, in contrast, are deprived of access to any large EHR database, which limits their research scope. The main objective of this study is to design a privacy-preserved transfer learning architecture that can utilize the knowledge from a federated model developed on in-network hospital-site EHR data to predict diabetic kidney disease cases at out-of-network siloed hospital sites. In all our experiments, transfer learning showed improved performance compared to models trained only on the out-of-network site datasets. Thus, we demonstrate a proof of concept of transferring knowledge from established networks to aid data-driven AI discoveries at siloed sites.
Introduction
Data-driven artificial intelligence (AI) applications open the scope for substantial knowledge discoveries in many industries. However, the successful implementation of data-driven approaches requires large and diverse datasets1. Similarly, AI-driven research using the large amounts of digital healthcare data in electronic health records (EHR) can help address persisting clinical research questions and facilitate evidence-based treatment2. However, existing policies to protect patient privacy and organizations’ proprietary rights over their data, alongside compliance with regulatory requirements from the HIPAA and HITECH Acts, impose legal, technical, and financial burdens on data sharing2,3. In recent years, to address the complexity of data sharing among healthcare institutions, national efforts have emerged to develop coordinated distributed data networks that advance research for improving healthcare services and outcomes.
A group of healthcare institutions coming together to establish a coordinated data-sharing network to conduct large-scale health research is referred to as a clinical research network (or “network”, in general). The affiliated “partners”, or data contributors, of the established network often retain their data locally while accessing some linked data sources under regulatory agreements. One example of such an extensive network is PCORnet, a national-scale clinical research network consisting of 11 partner networks of healthcare institutions, such as the Greater Plains Collaborative (GPC)4, engaging EHR data from over 80 million Americans5. Other examples of data networks include the Food and Drug Administration (FDA) Mini-Sentinel6, the NIH Distributed Research Network7, and ESPnet8.
Although the EHR data procured through a network undergo extensive anonymization of patient identifiers, a few elements may still allow reidentification or information leakage9. Moreover, challenges arising from controlled access, secure data transfer, and license restrictions rooted in regulatory and legal concerns often keep the data-sharing process complex and protracted. Federated learning (FL) is a learning mechanism that can potentially address these persisting concerns of data governance, sharing, and privacy1,10. In FL, machine learning algorithms are trained collaboratively on data stored at independent health systems without exchanging the raw data itself. In this way, FL can be utilized in established networks like GPC and others to build high-performance predictive models on a large-scale, potentially growing, decentralized database, enabling novel research on complex and rare health outcomes.
In contrast, healthcare institutions that do not participate in any larger network are deprived of access to any big EHR database or data-sharing mechanism unless the database is publicly available or can be licensed. This poses major challenges for these siloed healthcare institutions in conducting novel research on healthcare problems, since the only data available to them are their own clinical records. In addition, data-driven research conducted on such local siloed datasets can introduce demographic biases or misrepresent other population characteristics, leading to non-generalizable results2.
We propose one approach to address this important persisting problem using the concept of knowledge transportation through “transfer learning”. Transfer learning (TL) is a learning mechanism emerging from the concept of “model” sharing that can potentially improve a learner in one domain by transferring information (in the form of a pre-trained model) from a related domain when target training data are in limited supply11,12. The key conceptual difference between FL and TL is that FL iteratively trains a model across a set of independent hospital datasets to build a decentralized model, whereas in TL, any target hospital site can start from a transferred pre-trained model and re-train it on its own data for local adaptation. To the best of our knowledge, this study is the first of its kind to implement transfer learning to address the limited availability of data at out-of-network siloed sites. Furthermore, we chose our target disease to be diabetic kidney disease, a long-term complication of chronic diabetes that can result in increased costs related to hospitalizations, medicines, and treatment procedures.
The main objective of this study is to design a privacy-preserved transfer learning architecture that can utilize the knowledge from a privacy-preserved federated model developed on in-network hospital-site EHR data to predict diabetic kidney disease cases at out-of-network siloed hospital sites. In addition, we aim to demonstrate the potential and scope of transferring knowledge from established data-sharing hospital networks to aid data-driven AI discoveries at siloed healthcare entities without sharing data.
The key contributions of our study include: (i) introducing the concept of knowledge sharing from in-network hospitals to out-of-network hospitals for building predictive AI models, (ii) utilizing an extensive database of real-world healthcare data, Health Facts, to demonstrate the proof of concept of transfer learning, (iii) implementing a decentralized privacy-preserved federated learning architecture using in-network hospital data to facilitate a privacy-preserved transfer learning process for predicting diabetic kidney disease at out-of-network siloed hospitals, and (iv) comparing the proposed transfer learning mechanism to cases where only the siloed data at out-of-network hospitals are used to predict diabetic kidney cases, to demonstrate the significance of our approach.
Background
Transfer learning involves improving the target predictive function by using the knowledge from the source domain together with the data from the target domain. Transfer learning applications in machine learning include text sentiment classification, image classification, software defect classification, and multi-language text classification11–16. However, limited research exists on transfer learning adaptations in the healthcare domain. Recently, a study by Gao et al. (2019)17 demonstrated a transfer learning approach in a case study on the MIMIC-III database to predict in-hospital mortality. Our study aims to illustrate the adaptation of transfer learning concepts for predicting diabetic kidney disease at siloed data sites, which is, to the best of our knowledge, the first of its kind. Moreover, we set up the experimental design using natural partitions of healthcare institutions in the Health Facts database to mimic the real-world setting as closely as possible.
Federated learning involves sharing only the mathematical parameters, not the actual data itself, to build a global model iteratively over independent databases18. Earlier efforts to adapt federated learning models in healthcare include predicting mortality and hospital stay-time for ICU patients, cardiac event hospitalization, dyspnea, adverse drug reactions, and diabetes-related complications19–23. Most federated learning healthcare applications applied popular machine learning algorithms such as logistic regression, artificial neural networks, and random forests21,22,24–26. Our previous study23 demonstrated the utilization of a federated learning architecture for binary classification of the incidence of three diabetes-related complications affecting the eyes, kidneys, and peripheral nerves, respectively, using logistic regression and simple artificial neural networks. We observed performance for the federated learning models comparable to the gold standard of centralized learning on a central database. This motivated us to implement a federated learning architecture to build a predictive model for diabetic kidney disease using data from the independent healthcare systems included in a network. In addition, we prioritized the importance of privacy-preserved knowledge transfer for predicting diabetic kidney disease at the siloed data sites, which encouraged us to use the federated models to demonstrate our transfer learning approach.
Methods
Data Source
In this study, we used Cerner’s “Health Facts EMR Data,” a de-identified electronic health records database consolidated from over 90 healthcare systems across the US between 2000 and 2016. This database contains demographics, encounters, diagnoses, lab results, procedures, prescriptions, and other clinical attributes for about 69 million unique patients.
Diabetic Kidney Disease Cohort Selection
A diabetes population was identified using the Surveillance, PREvention, and ManagEment of Diabetes Mellitus (SUPREME-DM) algorithm27 based on eight criteria: six based on lab results and two based on International Classification of Diseases (ICD-9 and ICD-10) diagnosis codes related to inpatient and outpatient encounters (Figure 1). Only patients over 18 years who satisfied at least one criterion were selected into the diabetes population. In addition, a patient cohort with diabetic kidney disease (DKD) was identified from the selected diabetes population using ICD-9 and ICD-10 diagnosis codes (250.4x, E10.2x, and E11.2x). In this study, we considered a binary classification of diabetic kidney disease. The details of the population selection procedure, including the inclusion and exclusion process, are described in our previous study23.
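The selection logic above can be sketched as a small pandas query. This is a minimal illustration only: the column names and the two criteria shown (one SUPREME-DM lab threshold, one diagnosis-code prefix match) are hypothetical stand-ins for the full set of eight criteria.

```python
import pandas as pd

# Hypothetical encounter-level table; column names are illustrative only.
df = pd.DataFrame({
    "patient_id": [1, 1, 2, 3],
    "age":        [54, 54, 42, 17],
    "icd_code":   ["250.40", "E11.9", "401.9", "E10.21"],
    "hba1c":      [7.2, None, 5.1, 9.0],
})

# Example lab-based criterion (one of six in SUPREME-DM): HbA1c >= 6.5%
lab_flag = df["hba1c"] >= 6.5
# Example diagnosis-based criterion (one of two): any diabetes ICD code
dx_flag = df["icd_code"].str.startswith(("250", "E10", "E11"))

# Diabetes population: adults satisfying at least one criterion
diabetes_ids = df.loc[(df["age"] >= 18) & (lab_flag | dx_flag),
                      "patient_id"].unique()

# DKD cases within the diabetes population: codes 250.4x, E10.2x, E11.2x
dkd_mask = df["icd_code"].str.startswith(("250.4", "E10.2", "E11.2"))
dkd_ids = df.loc[dkd_mask & df["patient_id"].isin(diabetes_ids),
                 "patient_id"].unique()
```

Note how the under-18 patient is excluded even though their diagnosis code would otherwise qualify for the DKD cohort.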
Figure 1:
We used the following eight criteria to select the diabetes population from Health Facts. The first six criteria were based on lab results, while the last two were based on ICD-9 and ICD-10 diagnosis codes. The thresholds for the lab tests are chosen based on the SUPREME-DM. Patients satisfying at least one criterion were selected as the diabetes population.
Feature Selection
We used the diagnosis table from the Health Facts database to extract patient-related comorbid features. Next, we mapped the ICD codes to categories from the Clinical Classifications Software (CCS) tool developed by the Healthcare Cost and Utilization Project (HCUP). Finally, by grouping the individual ICD-9 and ICD-10 codes into similar clinical entities, we extracted a total of 283 unique CCS-coded features to predict the complication as a binary response in our experiments.
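The grouping step can be sketched as a crosswalk followed by a binary pivot. The ICD-to-CCS mapping shown here is a made-up fragment for illustration; real mappings come from the HCUP CCS tool, and the study's feature matrix had 283 such columns.

```python
import pandas as pd

# Illustrative ICD-to-CCS crosswalk (hypothetical category labels).
icd_to_ccs = {"250.40": "CCS_49", "E11.21": "CCS_49",
              "401.9": "CCS_98", "272.4": "CCS_53"}

diagnoses = pd.DataFrame({
    "patient_id": [1, 1, 2, 2],
    "icd_code":   ["250.40", "401.9", "E11.21", "272.4"],
})

# Map each ICD code to its CCS clinical entity
diagnoses["ccs"] = diagnoses["icd_code"].map(icd_to_ccs)

# One binary indicator column per CCS category, one row per patient
features = pd.crosstab(diagnoses["patient_id"], diagnoses["ccs"]).clip(upper=1)
```

The `clip(upper=1)` turns repeat diagnoses into a simple presence/absence indicator, matching a binary comorbidity feature set.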
Experimental Data Architecture
To demonstrate the method of transfer learning at siloed healthcare systems from federated models consolidated using in-network hospitals, we utilized the existing hospital identifier for each patient in our DKD cohort. The data for our DKD cohort were partitioned into different hospital sites based on these identifiers to facilitate the federated architecture. However, we discarded hospital sites with fewer than 100 cases from our analysis. Assuming that hospitals with a larger patient population are part of a bigger network, we considered any hospital with a population over 1900 to be an in-network site and the others to be out-of-network siloed sites. In total, 15 sites were included as in-network sites (referred to as vanguard sites), and 16 were considered out-of-network siloed sites (referred to as siloed sites).
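The site assignment rule can be expressed in a few lines. The table below is synthetic and the column names are hypothetical; only the thresholds (100 cases, population 1900) come from the study design.

```python
import pandas as pd

# Synthetic per-patient cohort with a hospital identifier (illustrative).
cohort = pd.DataFrame({
    "hospital_id": ["H1"] * 2500 + ["H2"] * 300 + ["H3"] * 50,
    "dkd_case":    [1, 0] * 1250 + [1] * 150 + [0] * 150 + [1] * 25 + [0] * 25,
})

site_sizes = cohort.groupby("hospital_id").size()
site_cases = cohort.groupby("hospital_id")["dkd_case"].sum()

kept = site_sizes.index[site_cases >= 100]            # drop sites with < 100 cases
vanguard = [h for h in kept if site_sizes[h] > 1900]  # in-network (vanguard) sites
siloed = [h for h in kept if site_sizes[h] <= 1900]   # out-of-network siloed sites
```

Here H3 is discarded (too few cases), H1 becomes a vanguard site, and H2 a siloed site.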
Federated Model for In-Network Hospitals
Using a federated learning approach, we utilized the 15 in-network partitioned sites to build a decentralized model to classify binary DKD cases and non-cases. We call this model our vanguard FL model and the sites vanguard sites. To develop the vanguard FL model, 70% of each of the 15 partitioned sites was used as a training dataset, while the remaining 30% were combined to build a common testing dataset. We used the Python 3.6 libraries scikit-learn, NumPy, pandas, and TensorFlow to develop our machine learning approach. In addition, we used one-hot encoding for the labels in the training set and transformed the features into TensorFlow data objects. Finally, we compiled a logistic regression model and a 3-layer multilayer perceptron (MLP) model using binary cross-entropy as the loss function and stochastic gradient descent as the optimizer.
The training module for federated learning was developed using the federated averaging algorithm18, as demonstrated in our previous study including all 31 healthcare entities23. The federated training approach begins by initiating a global model, which serves as the initial weights for all the vanguard sites. Next, the vanguard sites use the initial weights to train on their local data and obtain updated weights. The updated weights from each vanguard site are sent to the global model without sharing any raw data. A weighted average of the vanguard weights is then used to update the global model. Finally, after many rounds of aggregating vanguard weights to update the global model, we save the final global model (the vanguard FL model) and its weights for further analysis (Algorithm 1, STEP 1).
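The federated averaging loop described above can be sketched in NumPy for the logistic regression case. This is a minimal sketch under simplified assumptions (synthetic data, fixed learning rate, no intercept), not the study's TensorFlow implementation; note that only weights, never rows of data, cross site boundaries.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's logistic-regression training pass from the shared weights."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))       # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)         # binary cross-entropy gradient
    return w

def federated_averaging(sites, n_features, rounds=20):
    """FedAvg: sites return updated weights, averaged by local sample size."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        sizes = np.array([len(y) for _, y in sites], dtype=float)
        global_w = np.average(updates, axis=0, weights=sizes)
    return global_w

# Three synthetic "vanguard sites" of different sizes sharing one true signal
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
sites = []
for n in (200, 400, 300):
    X = rng.normal(size=(n, 2))
    y = (X @ true_w > 0).astype(float)
    sites.append((X, y))

w = federated_averaging(sites, n_features=2)
```

After a few rounds the global weights recover the sign pattern of the underlying signal even though no site ever sees another site's records.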
The Transfer Learning Approach
A siloed hospital site that is not part of a big network of hospitals can utilize the knowledge of the network to build predictive models by adopting a transfer learning approach. As knowledge is transferred from the in-network sites to the out-of-network sites, this process can be referred to as transfer learning. Generally, transfer learning is expressed through a pre-trained model, i.e., a model trained on a large dataset to predict a similar outcome. The out-of-network siloed sites can utilize the knowledge from the pre-trained model, in our case the vanguard FL model, to predict DKD cases without data leaving their doors. We adopted two mechanisms of transfer learning to build predictive models for the siloed sites based on the knowledge of the vanguard FL model28. In mechanism A, the siloed site re-trains the vanguard FL model on its own dataset starting from the transferred weights, treating all model layers as trainable. In mechanism B, the layers of the vanguard FL model are treated as non-trainable with frozen weights, and new trainable deep layers are built on top of them. The required steps of the two mechanisms are presented in STEPS 2 and 3 of Algorithm 1. Figure 3 presents the entire experimental architecture of our study design.
Figure 3:
An infographic representation of our proposed transfer learning approach for the out-of-network siloed hospital sites. The decentralized vanguard FL model is built based on the federated learning approach shown in Algorithm 1: STEP 1 with no sharing of actual data among the vanguard sites. Thus, the out-of-network siloed site can utilize the knowledge base from the vanguard FL model to predict the DKD cases without being part of the vanguard network and without sharing data outside the hospital site. We refer to this process of conveying knowledge as transfer learning. The two mechanisms of transfer learning are shown in Algorithm 1: STEP 2 and 3, respectively.
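The two mechanisms can be contrasted in a small NumPy sketch. The weight shapes, learning rates, and data here are hypothetical stand-ins for the transferred vanguard FL weights and a siloed site's records; the study's actual models were built in TensorFlow.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-ins for the transferred vanguard FL weights; shapes are illustrative.
W1 = rng.normal(size=(5, 10))
W2 = rng.normal(size=(10, 1))

X = rng.normal(size=(100, 5))                  # siloed site's local data
y = (X[:, 0] > 0).astype(float).reshape(-1, 1)

# Mechanism A: all transferred layers stay trainable; local gradient updates
# continue on the transferred weights (one illustrative step on W2 shown).
h = relu(X @ W1)
p = sigmoid(h @ W2)
W2 = W2 - 0.1 * h.T @ (p - y) / len(y)         # transferred layer keeps learning

# Mechanism B: freeze the transferred layers and train a new head on top;
# the frozen layers act as a fixed feature extractor.
base = relu(X @ W1)                            # frozen, never updated
V = np.zeros((10, 1))                          # new trainable output layer
for _ in range(500):
    q = sigmoid(base @ V)
    V -= 0.1 * base.T @ (q - y) / len(y)       # only the new head learns
```

In mechanism B the binary cross-entropy of the new head falls below its starting value (log 2 ≈ 0.693 at zero initialization) while the transferred weights stay untouched.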
Experimentation for Out-of-Network Siloed Sites
To demonstrate the feasibility of the transfer learning approach, we created training (70%) and testing (30%) datasets for each of the 16 siloed sites and used the same train-test datasets to conduct the following comparative experiments.
CASE I (LOCAL): LOCAL MODEL VALIDATION
We trained models on each siloed site's own CCS feature data to predict DKD cases among the patients at the 16 siloed sites. For this purpose, we implemented both logistic regression and 3-layer multilayer perceptron models using the scikit-learn module in Python 3.6. The local model validation mimics the case where the siloed sites are deprived of any existing knowledge when building predictive models.
CASE II (VANGUARD): VANGUARD MODEL VALIDATION
This situation mimics the case where the vanguard FL models are available only for testing at the local siloed sites. Therefore, the local test datasets are used to evaluate the predictive performance of the vanguard FL models without any training on the local data. For both vanguard FL models (logistic regression and MLP), we tested the performance on all 16 siloed sites.
CASE III (TRANSFERRED A or B): TRANSFER LEARNING APPROACH
This case mimics the condition where the vanguard network agrees to transfer knowledge to the local siloed site. For our proposed approach, we assumed that the vanguard FL model and its weights could be transferred from the in-network sites to the local sites for further learning. Each local siloed site then trains the vanguard FL model on its local training data and evaluates the performance on its test dataset. We applied mechanisms A and B (as shown in Algorithm 1 and Figure 3) to build the MLP model, while only mechanism A applies to logistic regression. For our MLP model, we froze the two layers from the vanguard FL model and added two more dense layers with the “relu” activation function alongside an output layer with “sigmoid” activation. The new MLP model was then compiled using stochastic gradient descent as the optimizer and binary cross-entropy as the loss function and initialized with the pre-trained weights from the vanguard FL model. For both MLP and logistic regression, we tested the performance on all 16 siloed sites.
Class Imbalance Learning and Evaluation Metrics
The federated datasets for both vanguard sites and local siloed sites are subject to an unequal class balance between DKD cases and non-cases (see Figure 2 for the class-ratio distribution, also reported in Islam (2021)23). To account for the varying class distribution in our model training, all our experiments were repeated with sampling techniques: oversampling, which replicates minority-class samples, and undersampling, which randomly removes majority-class samples. Thus, for all our experiments, oversampling, undersampling, and no sampling were performed before model compilation. We used the “resample” module from scikit-learn to apply the class-balancing techniques.
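The two balancing strategies can be illustrated with NumPy (the study itself used scikit-learn's `resample`; this sketch mirrors the same behavior on a made-up imbalanced site).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical imbalanced site: 90 non-cases (0) vs 10 DKD cases (1)
y = np.array([0] * 90 + [1] * 10)
X = rng.normal(size=(100, 3))

cases = np.where(y == 1)[0]
noncases = np.where(y == 0)[0]

# Oversampling: resample the minority class WITH replacement up to majority size
up = rng.choice(cases, size=len(noncases), replace=True)
X_over = np.vstack([X[noncases], X[up]])
y_over = np.concatenate([y[noncases], y[up]])

# Undersampling: randomly drop majority rows down to the minority size
down = rng.choice(noncases, size=len(cases), replace=False)
X_under = np.vstack([X[down], X[cases]])
y_under = np.concatenate([y[down], y[cases]])
```

Both resampled sets end up with a 50/50 class ratio; oversampling grows the dataset (180 rows here) while undersampling shrinks it (20 rows).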
Figure 2:
A diagrammatical representation of class imbalance as shown by the varying sample size and the number of cases for vanguard and siloed sites. The vanguard sites form the larger hospital network, while the siloed sites represent the local out-of-network hospitals.
For each vanguard FL model, we ran 48 experiments for the logistic regression case and 64 experiments for the multilayer perceptron model. With the class-balancing experiments, we built three vanguard FL models using logistic regression and three using the multilayer perceptron. In total, we ran 336 experiments for our analysis. Since the F-1 score is a more reliable measure than accuracy in the presence of class imbalance, performance metrics such as F-1 score, recall, and precision were computed. We weighted the three performance metrics by the sample sizes of the local siloed sites to obtain weighted averages with 95% confidence intervals to compare performance among the different case scenarios of our comparative experiments.
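The sample-size weighting can be sketched as follows. The per-site F-1 scores and sizes below are invented, and the interval shown is one plausible normal-approximation construction; the paper does not spell out its exact CI formula, so treat this as an assumption.

```python
import numpy as np

# Hypothetical per-site F-1 scores and sample sizes for five siloed sites
f1 = np.array([0.62, 0.55, 0.70, 0.48, 0.66])
n = np.array([1200, 300, 800, 250, 950])

w = n / n.sum()
wm = np.sum(w * f1)                          # sample-size-weighted mean
wsd = np.sqrt(np.sum(w * (f1 - wm) ** 2))    # weighted standard deviation
half = 1.96 * wsd / np.sqrt(len(f1))         # assumed 95% CI half-width
lower, upper = wm - half, wm + half
```

Larger sites pull the weighted mean toward their scores, which is the intent: metrics from a 1200-patient site should count more than those from a 250-patient site.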
R version 3.4.4 (R Foundation for Statistical Computing, Vienna, Austria) and Python 3.6 were used for data management. All computations were performed on a MacBook Pro running macOS Catalina version 10.15.2 with 16 GB of RAM.
Results
In our analysis, we extracted a sample of 17,455 patients as our diabetic kidney disease cohort from a diabetes population of 102,876 patients in the Health Facts database. In total, there were 31 independent healthcare systems, and we considered a network of 15 hospitals as our vanguard sites and 16 as local out-of-network siloed sites. Figure 2 shows the class-imbalance issue in the datasets through the varying sample sizes and numbers of cases for both the in-network and out-of-network sites. The sample sizes of the vanguard sites, which form the in-network hospitals, vary from 1974 to 9386, and the number of cases ranges from 133 to 2096. The siloed local sites have sample sizes as low as 239 patients.
Table 1 shows the performance of the vanguard FL models for logistic regression and the multilayer perceptron. Both oversampling and undersampling showed better recall than the no-sampling case. These models served as our knowledge base for demonstrating the transfer learning approach for the siloed sites.
Table 1:
A table showing the performance metrics for the logistic regression and multilayer perceptron models under the three class-balancing techniques for the vanguard sites.
| Model | Sampling Method | F1-Score | Precision | Recall |
|---|---|---|---|---|
| LR | Oversample | 0.64 | 0.55 | 0.85 |
| LR | Undersample | 0.63 | 0.53 | 0.85 |
| LR | No sample | 0.58 | 0.73 | 0.52 |
| MLP (10,10,1) | Oversample | 0.67 | 0.59 | 0.83 |
| MLP (10,10,1) | Undersample | 0.64 | 0.55 | 0.86 |
| MLP (10,10,1) | No sample | 0.65 | 0.68 | 0.68 |
The comparative evaluations for all the experiments under the multilayer perceptron and logistic regression models are presented in Figures 4 and 5, respectively. Appendix A shows the combined results for the weighted means from all the experiments. The weighted means provide a comprehensive measure of the performance metrics; hence, we use them to compare the different approaches and experiments. For our multilayer perceptron model, both TRANSFERRED A and B consistently showed improved predictive performance compared to LOCAL validation and VANGUARD validation, as shown by the trend in the weighted means in Figure 4. Under oversampling, TRANSFERRED A showed a 17.9% increase in weighted F-1 score, 25.4% in weighted recall, and 13% in weighted precision compared to LOCAL. TRANSFERRED B showed a 21.4% increase in weighted F-1 score, 10% in weighted recall, and 28% in weighted precision compared to LOCAL. We observed similar improvements under undersampling, with a 27.8% improvement in weighted F-1 (41.9% in weighted precision and 3.3% in weighted recall) for TRANSFERRED A, and 22.2% in weighted F-1 (28.2% in weighted precision and 10% in weighted recall) for TRANSFERRED B. In the case of no sampling, however, the percentage improvement in weighted F-1 score was much smaller than in the above two cases, with only a 14.3% increase for TRANSFERRED B. Moreover, we observed a decrease of about 1.2% in the weighted precision value for TRANSFERRED B in the no-sampling case.
Figure 4:
A figure showing the comparative evaluation for CASE I (LOCAL), CASE II (VANGUARD), and CASE III (TRANSFERRED A and B) under the multilayer perceptron model. TRANSFERRED A treats the layers of the federated vanguard model as trainable. In contrast, in TRANSFERRED B, the federated vanguard layers are frozen, and new trainable layers are added on top of them. Different colors and symbols represent the four cases. The performance metrics (F-1 score, precision, and recall) are presented with their weighted means (red diamonds) and 95% confidence intervals (error bars) for three sampling cases: OVERSAMPLING, UNDERSAMPLING, and NO SAMPLING. The performance of the vanguard federated model (knowledge base) is indicated with grey dashed lines. The weighted means show a consistent improvement for either TRANSFERRED case compared to the LOCAL case across all the experiments.
Figure 5:
A figure showing the comparative evaluation for CASE I (LOCAL), CASE II (VANGUARD), and CASE III (TRANSFERRED A) under the logistic regression model. Different colors and symbols represent the three cases. The performance metrics (F-1 score, precision, and recall) are presented with their weighted means (red diamonds) and 95% confidence intervals (error bars) for three sampling cases: OVERSAMPLING, UNDERSAMPLING, and NO SAMPLING. The performance of the vanguard FL model (knowledge base) is indicated with grey dashed lines. The weighted means show a consistent improvement for the TRANSFERRED case compared to LOCAL across all the experiments.
Additionally, VANGUARD validation on the siloed sites showed performance improvements compared to LOCAL. The TRANSFERRED B approach showed improvement over VANGUARD validation (weighted F-1 ranging from 1.5% to 6.3%, weighted precision from 1.3% to 7.7%, and weighted recall from 1.5% to 8.4%); however, TRANSFERRED A showed a percentage decrease in weighted precision compared to VANGUARD validation (-6.6% for oversampling and -4.7% for undersampling). In contrast, the weighted recall and F-1 scores showed performance improvements for TRANSFERRED A compared to VANGUARD validation. Furthermore, the weighted recall values of TRANSFERRED B for the siloed sites are much closer to the vanguard FL recall value for both the oversampling and undersampling cases, as shown in Figure 4. This shows that TRANSFERRED B performed consistently well in predicting diabetic kidney disease at the siloed sites.
Similarly, for the logistic regression model, the TRANSFERRED A approach showed a consistent improvement in the performance metrics compared to the LOCAL and VANGUARD validation cases. The weighted means for F-1, precision, and recall show an upward trend for all the sampling cases, as shown in Figure 5. The recall values for the siloed sites are much closer to the vanguard FL performance in the oversampling case, while most of the precision values and weighted means exceeded the vanguard FL performance level in the undersampling case. Overall, we observed that transfer learning produced comparably better and more consistent predictive results than the other cases.
Discussion
The implementation of transfer learning can bring numerous opportunities to investigate key healthcare issues and clinical events when only limited knowledge is available. This paper demonstrates the feasibility of creating data-driven AI models at healthcare institutions with limited access to big databases by using, through a transfer learning mechanism, a knowledge base from a network of hospitals. We acknowledge the importance of preserving privacy, the data sharing and transportation challenges of building a central repository, and the lack of data accessibility at out-of-network sites. Thus, through this study, we emphasized the significance of “no data leaving the firewalls of the hospitals” by incorporating the concepts of federated learning using a decentralized AI model-building network for the in-network hospital sites. We also demonstrate that knowledge transferred from the privacy-preserved federated AI model in the form of “model transfer” can enhance data-driven AI applications at out-of-network hospital sites.
We used Cerner’s Health Facts database, which contains independent healthcare system identifiers, to facilitate our federated architecture and to assign the hospital sites as either in-network or out-of-network. We treated sites with smaller patient samples as out-of-network sites, although the proportion of cases could be higher at them. We demonstrated the process of federated learning and transfer learning using an artificial neural network (multilayer perceptron) and a simple case of logistic regression. We compared the performance of the transferred approaches with the case where only local data could be used to train the predictive model. As performance measures, we computed F-1, precision, and recall. However, since we are predicting DKD cases, maximizing the chances of true positives is more relevant; thus, the recall values may provide better insights into model performance. Additionally, we scaled the performance metrics by the individual sample sizes of the out-of-network sites to compute weighted means to guide our comparison. In all our experiments, transfer learning showed improved performance, as shown by the weighted means of the F-1 score and recall, compared to locally trained models, indicating knowledge enhancement over a pre-trained model. Applying the different class-balancing techniques also improved the transferred model performances overall.
Conclusion
In conclusion, our results demonstrate the feasibility of transfer learning for building AI models at out-of-network sites using a pre-trained model built from a decentralized data-sharing network architecture. We present this study as a proof of concept of transfer learning using privacy-preserved knowledge developed through federated learning. However, our model architecture is limited to a single case of an artificial neural network and logistic regression, which can generate reasonably good but not high-performance models. Also, we assumed the feature set to be the same for all our experiments, which does not reflect reality. We will explore further the loss of accuracy from transfer learning and compare dataset characteristics, such as class ratio, with model performance to further illustrate our method. Our future work extends to more classifiers and deep learning models to improve learning performance. We also plan to explore heterogeneous feature and distribution spaces for transfer learning, averaging techniques for federated learning, machine learning methods to address class-imbalance issues, and fine-tuning of the models, and to extract more patient-related information to build a more reliable and robust predictive model for diabetic kidney disease and other complications of diabetes. Our future work will also include more extensive research on making the transfer learning concept more feasible and acceptable for healthcare research advancements.
APPENDIX A
A table showing the combined results of the experiments for the weighted mean of the performance measures (F-1, precision, and recall) with variability measures such as standard deviation (STD) and 95% confidence interval.
**No sampling**

| Model | Source | F-1 | Precision | Recall |
|---|---|---|---|---|
| Logistic Regression | LOCAL | 0.48 ± 0.21 [0.37, 0.58] | 0.67 ± 0.20 [0.57, 0.77] | 0.41 ± 0.22 [0.31, 0.52] |
| Logistic Regression | VANGUARD | 0.60 ± 0.13 [0.53, 0.66] | 0.72 ± 0.14 [0.65, 0.78] | 0.56 ± 0.15 [0.49, 0.63] |
| Logistic Regression | TRANSFERRED A | 0.63 ± 0.16 [0.55, 0.71] | 0.76 ± 0.11 [0.71, 0.82] | 0.58 ± 0.19 [0.49, 0.67] |
| Logistic Regression | TRANSFERRED B | — | — | — |
| Multi-layer Perceptron | LOCAL | 0.56 ± 0.13 [0.50, 0.63] | 0.68 ± 0.11 [0.62, 0.73] | 0.53 ± 0.15 [0.46, 0.60] |
| Multi-layer Perceptron | VANGUARD | 0.62 ± 0.11 [0.57, 0.68] | 0.62 ± 0.15 [0.55, 0.69] | 0.62 ± 0.11 [0.57, 0.68] |
| Multi-layer Perceptron | TRANSFERRED A | 0.66 ± 0.10 [0.61, 0.71] | 0.72 ± 0.11 [0.67, 0.77] | 0.69 ± 0.12 [0.63, 0.75] |
| Multi-layer Perceptron | TRANSFERRED B | 0.64 ± 0.15 [0.56, 0.71] | 0.67 ± 0.19 [0.57, 0.76] | 0.67 ± 0.12 [0.61, 0.73] |

**Oversampling**

| Model | Source | F-1 | Precision | Recall |
|---|---|---|---|---|
| Logistic Regression | LOCAL | 0.62 ± 0.12 [0.56, 0.68] | 0.56 ± 0.14 [0.49, 0.63] | 0.77 ± 0.11 [0.72, 0.82] |
| Logistic Regression | VANGUARD | 0.63 ± 0.12 [0.57, 0.69] | 0.53 ± 0.12 [0.47, 0.59] | 0.86 ± 0.12 [0.80, 0.92] |
| Logistic Regression | TRANSFERRED A | 0.68 ± 0.11 [0.62, 0.74] | 0.60 ± 0.13 [0.53, 0.66] | 0.85 ± 0.09 [0.80, 0.89] |
| Logistic Regression | TRANSFERRED B | — | — | — |
| Multi-layer Perceptron | LOCAL | 0.56 ± 0.14 [0.49, 0.63] | 0.54 ± 0.16 [0.46, 0.61] | 0.62 ± 0.12 [0.56, 0.68] |
| Multi-layer Perceptron | VANGUARD | 0.64 ± 0.11 [0.59, 0.70] | 0.55 ± 0.13 [0.49, 0.61] | 0.83 ± 0.13 [0.77, 0.89] |
| Multi-layer Perceptron | TRANSFERRED A | 0.66 ± 0.12 [0.60, 0.72] | 0.61 ± 0.14 [0.54, 0.68] | 0.78 ± 0.10 [0.73, 0.83] |
| Multi-layer Perceptron | TRANSFERRED B | 0.68 ± 0.11 [0.62, 0.73] | 0.58 ± 0.14 [0.52, 0.65] | 0.86 ± 0.08 [0.82, 0.90] |

**Undersampling**

| Model | Source | F-1 | Precision | Recall |
|---|---|---|---|---|
| Logistic Regression | LOCAL | 0.48 ± 0.12 [0.37, 0.58] | 0.67 ± 0.20 [0.57, 0.77] | 0.41 ± 0.22 [0.31, 0.52] |
| Logistic Regression | VANGUARD | 0.60 ± 0.13 [0.53, 0.66] | 0.72 ± 0.14 [0.65, 0.78] | 0.56 ± 0.15 [0.49, 0.63] |
| Logistic Regression | TRANSFERRED A | 0.63 ± 0.16 [0.55, 0.71] | 0.76 ± 0.11 [0.71, 0.82] | 0.58 ± 0.19 [0.49, 0.67] |
| Logistic Regression | TRANSFERRED B | — | — | — |
| Multi-layer Perceptron | LOCAL | 0.54 ± 0.13 [0.48, 0.60] | 0.44 ± 0.13 [0.38, 0.41] | 0.79 ± 0.11 [0.73, 0.84] |
| Multi-layer Perceptron | VANGUARD | 0.65 ± 0.12 [0.59, 0.71] | 0.56 ± 0.14 [0.49, 0.63] | 0.85 ± 0.10 [0.80, 0.90] |
| Multi-layer Perceptron | TRANSFERRED A | 0.69 ± 0.12 [0.63, 0.74] | 0.63 ± 0.13 [0.56, 0.69] | 0.81 ± 0.10 [0.76, 0.86] |
| Multi-layer Perceptron | TRANSFERRED B | 0.66 ± 0.10 [0.61, 0.71] | 0.57 ± 0.15 [0.49, 0.64] | 0.87 ± 0.10 [0.82, 0.92] |

Each cell shows WM ± SD [95% CI]; — indicates no value reported.
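The WM ± SD [LL, UL] entries above can be reproduced from per-site scores. The exact weighting scheme and interval method are not specified here, so the following is a minimal sketch assuming weights proportional to site test-set size and a normal-approximation 95% confidence interval; the `f1` scores and site sizes `n` are hypothetical.

```python
import numpy as np

def weighted_summary(scores, weights, z=1.96):
    """Weighted mean, weighted SD, and normal-approximation 95% CI."""
    scores = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float) / np.sum(weights)
    wm = np.sum(w * scores)                          # weighted mean
    wsd = np.sqrt(np.sum(w * (scores - wm) ** 2))    # weighted SD
    half = z * wsd / np.sqrt(len(scores))            # CI half-width
    return wm, wsd, (wm - half, wm + half)

# Hypothetical F-1 scores from five siloed test sites, weighted by site size.
f1 = [0.55, 0.62, 0.48, 0.70, 0.60]
n = [120, 300, 80, 250, 150]
wm, wsd, (ll, ul) = weighted_summary(f1, n)
print(f"{wm:.2f} \u00b1 {wsd:.2f} [{ll:.2f}, {ul:.2f}]")
```

A size-weighted mean prevents small sites with noisy scores from dominating the summary, which matters here because the siloed sites are, by definition, the smaller cohorts.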
Glossary terms:
- Vanguard sites: the in-network hospital sites, each with a patient population above 1,900, that together form the data-sharing network.
- Siloed sites: the out-of-network hospital sites, each with a patient population below 1,900.
References
- 1. Rieke N., et al. The future of digital health with federated learning. NPJ Digit. Med. 2020;3:119. doi: 10.1038/s41746-020-00323-1.
- 2. Curtis L. H., Brown J., Platt R. Four Health Data Networks Illustrate The Potential For A Shared National Multipurpose Big-Data Network. Health Aff. 2014;33:1178–1186. doi: 10.1377/hlthaff.2014.0121.
- 3. Zerka F., et al. Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care. JCO Clin. Cancer Inform. 2020;4:184–200. doi: 10.1200/CCI.19.00047.
- 4. Waitman L. R., Aaronson L. S., Nadkarni P. M., Connolly D. W., Campbell J. R. Brief communication: The Greater Plains Collaborative: a PCORnet Clinical Research Data Network. J. Am. Med. Inform. Assoc. 2014;21:637. doi: 10.1136/amiajnl-2014-002756.
- 5. Forrest C. B., et al. PCORnet® 2020: current state, accomplishments, and future directions. J. Clin. Epidemiol. 2021;129:60. doi: 10.1016/j.jclinepi.2020.09.036.
- 6. Curtis L. H., et al. Design considerations, architecture, and use of the Mini-Sentinel distributed data system. 2012.
- 7. NIH Collaboratory. http://nihcollaboratory.org/
- 8. ESPHealth. https://www.esphealth.org/
- 9. Rocher L., Hendrickx J. M., de Montjoye Y.-A. Estimating the success of re-identifications in incomplete datasets using generative models. Nat. Commun. 2019;10:1–9. doi: 10.1038/s41467-019-10933-3.
- 10. Xu J., et al. Federated Learning for Healthcare Informatics. J. Healthc. Inform. Res. 2020:1–19.
- 11. Weiss K., Khoshgoftaar T. M., Wang D. A survey of transfer learning. J. Big Data. 2016;3:1–40.
- 12. Pan S. J., Yang Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010;22:1345–1359.
- 13. Duan L., Xu D., Tsang I. W. Learning with augmented features for heterogeneous domain adaptation. Proc. 29th Int. Conf. Mach. Learn. (ICML 2012). 2012;1:711–718.
- 14. Wang C., Mahadevan S. Heterogeneous domain adaptation using manifold alignment. Proc. Twenty-Second Int. Joint Conf. Artificial Intelligence. 2011.
- 15. Nam J., Fu W., Kim S., et al. Heterogeneous defect prediction. IEEE Trans. Softw. Eng. 2017;44:874–896.
- 16. Prettenhofer P., Stein B. Cross-language text classification using structural correspondence learning. Proc. 48th Annu. Meet. Assoc. Comput. Linguistics. 2010:1118–1127.
- 17. Ju C., et al. Federated Transfer Learning for EEG Signal Classification. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2020:3040–3045. doi: 10.1109/EMBC44109.2020.9175344.
- 18. McMahan B., Moore E., Ramage D., Hampson S., Arcas B. A. Communication-Efficient Learning of Deep Networks from Decentralized Data. Artificial Intelligence and Statistics. PMLR; 2017:1273–1282.
- 19. Huang L., et al. Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. J. Biomed. Inform. 2019;99:103291. doi: 10.1016/j.jbi.2019.103291.
- 20. Brisimi T. S., et al. Federated learning of predictive models from federated Electronic Health Records. Int. J. Med. Inform. 2018;112:59–67. doi: 10.1016/j.ijmedinf.2018.01.007.
- 21. Deist T. M., et al. Infrastructure and distributed learning methodology for privacy-preserving multi-centric rapid learning health care: euroCAT. Clin. Transl. Radiat. Oncol. 2017;4:24–31. doi: 10.1016/j.ctro.2016.12.004.
- 22. Choudhury O., et al. Predicting Adverse Drug Reactions on Distributed Health Data using Federated Learning. AMIA Annu. Symp. Proc. 2019:313–322.
- 23. Islam H., Mosa A. S. M. A Federated Mining Approach on Predicting Diabetes-Related Complications: Demonstration Using Real-World Clinical Data. AMIA Annu. Symp. Proc. 2021.
- 24. Mangold P., et al. A Decentralized Framework for Biostatistics and Privacy Concerns. Stud. Health Technol. Inform. 2020;275:137–141. doi: 10.3233/SHTI200710.
- 25. Lee G. H., Shin S.-Y. Federated Learning on Clinical Benchmark Data: Performance Assessment. J. Med. Internet Res. 2020;22:e20891. doi: 10.2196/20891.
- 26. Li X., et al. Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results. Med. Image Anal. 2020;65:101765. doi: 10.1016/j.media.2020.101765.
- 27. Nichols G. A., et al. Construction of a multisite datalink using electronic health records for the identification, surveillance, prevention, and management of diabetes mellitus: The SUPREME-DM project. Prev. Chronic Dis. 2012;9. doi: 10.5888/pcd9.110311.
- 28. Marcelino P. Transfer learning from pre-trained models. Towards Data Science. 2018.






