Journal of the American College of Emergency Physicians Open. 2023 May 21;4(3):e12968. doi: 10.1002/emp2.12968

Federated data health networks hold potential for accelerating emergency research

Prashant Mahajan 1, Charles Macias 2, Amie Barda 3, Christopher M Fung 3
PMCID: PMC10200260  PMID: 37220474

Abstract

Multi‐center research networks, often supported by centralized data centers, are integral to generating the high‐quality evidence needed to address gaps in emergency care. However, maintaining high‐functioning data centers carries substantial costs. A novel distributed approach, the federated data health network (FDHN), has recently been used to overcome the shortcomings of centralized data approaches. A FDHN in emergency care is comprised of a series of decentralized, interconnected emergency departments (EDs) where each site's data is structured according to a common data model that allows data to be queried and/or analyzed without the data leaving the site's institutional firewall. To best leverage FDHNs for emergency care research networks, we propose a stepwise, 2‐level development and deployment process: creating a less resource‐intense Level I FDHN capable of basic analyses, or a more resource‐intense Level II FDHN capable of sophisticated analyses such as distributed machine learning. Importantly, existing electronic health records‐based analytical tools can be leveraged without substantial cost implications for research networks to implement a Level I FDHN. The fewer regulatory barriers associated with FDHNs create the potential for diverse, non‐network EDs to contribute to research, foster faculty development, and improve patient outcomes in emergency care.

Keywords: data model, emergency care, global research, pediatric emergency medicine

1. TRADITIONAL CENTRALIZED DATA COLLECTION LIMITS COLLABORATION

Successful multicenter research networks in emergency medicine make it possible to conduct high‐quality research on large numbers of eligible patients, which is unlikely in single emergency departments (EDs). This addresses a research gap identified by the National Academies of Sciences, Engineering, and Medicine report on the Future of Emergency Care. 1 Research networks including but not limited to the Pediatric Emergency Care Applied Research Network (PECARN) and Strategies to Innovate Emergency Care Clinical Trials Network (SIREN) have generated high‐quality evidence, nurtured an investigator pipeline, and strengthened the overall emergency research infrastructure. This has directly contributed to improving the care of ill and injured adults and children globally. 2 High‐performing emergency research networks are often supported by a centralized data center to ensure data quality, comply with institutional, national, and international regulatory standards, and curate data to conduct sophisticated analyses (Figure 1).

FIGURE 1.


Schematic of a centralized model. Diagram of centralized data collection and storage by coordinating center. Encounter level data including protected health information leaves each participating institution and is stored by the coordinating center. Network emergency departments can participate after completion of data use agreement (DUA). Non‐network emergency departments generally cannot participate without completing DUA process.

Sustaining centralized data centers is costly and requires a substantial commitment of time and resources from researchers and, specifically, health information technology experts. High costs emanate from the need to: (1) harmonize data from disparate and often highly customized electronic health record systems with varied data formats; (2) ensure compliance with cybersecurity protocols; (3) maintain patient privacy standards, especially when identifiable data is being sent outside the institution; and (4) conduct ongoing and required maintenance of databases. Furthermore, centralized data centers are often constrained by specific data collection protocols that limit conducting observational studies with up‐to‐date data, developing artificial intelligence models requiring many variables, and performing data exploration for future studies across network sites. Consequently, centralized data centers are not nimble, and the cost structure is a barrier for non‐network sites to participate in network‐based studies. This reduces the opportunity to solicit input into research study questions and designs, reduces the diversity of study populations, decreases generalizability, and limits the dissemination and implementation of new evidence.

2. MOVING TOWARD A DE‐CENTRALIZED APPROACH

The COVID pandemic further underscored the ongoing need for nimble, multi‐centered, multi‐disciplinary, collaborative approaches to understand the rapidly evolving epidemiology of the virus and to assess the impact, efficacy, and effectiveness of various operational and therapeutic interventions. Our recent work on COVID's impact on diagnostic delays across 14 EDs in Michigan 3 and a survey on the impact of COVID on provider burnout and innovation across 74 EDs in 28 countries 4 have revealed that EDs are willing to commit site resources to contribute data for research in emergency care. To overcome the shortcomings of centralized data approaches, a distributed/de‐centralized approach called federated data health networks (FDHN) has been successfully deployed in COVID research; this has opened the potential for similar approaches to emergency care network‐based research. 5 , 6

A FDHN in emergency care is comprised of a series of decentralized, interconnected EDs where each site's data stays behind its institutional firewall but is structured according to a common data model that allows data to be queried and/or analyzed without the data leaving the site. 7 Thus, instead of sharing data and developing a central data repository, sites maintain a real‐time dataset based on a common data model, which is a way of organizing data into a standard structure (Figure 2). This data is “visited” by centralized queries and algorithms. 5 Guidelines for the creation of a successful FDHN have been developed by the World Economic Forum and require the commitment of site resources, the development of a robust data governance structure to address compliance with regulatory requirements, and collaboration with health information technology personnel for data quality assurance. 5 , 7
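The "visiting" pattern described above can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the record fields, the site names, and the `run_federated_query` helper are assumptions, and in a real network each site would execute the query on its own infrastructure rather than in one process. The point is only that the same standardized query runs on each site's local data and that nothing but aggregate counts is returned.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical encounter records in a shared structure; field names are assumptions.
Encounter = Dict[str, str]
Query = Callable[[List[Encounter]], Counter]


def count_by_disposition(encounters: List[Encounter]) -> Counter:
    """Standardized query: runs on local data and returns aggregate counts only."""
    return Counter(e["disposition"] for e in encounters)


def run_federated_query(sites: Dict[str, List[Encounter]], query: Query) -> Counter:
    """Apply the same query at every site and sum the aggregates centrally."""
    total: Counter = Counter()
    for site_name, local_data in sites.items():
        site_result = query(local_data)  # stands in for execution behind the site firewall
        total.update(site_result)        # only counts cross the site boundary
    return total


# Toy data standing in for two sites' local datasets.
sites = {
    "ed_a": [
        {"disposition": "discharged"},
        {"disposition": "admitted"},
        {"disposition": "lwbs"},
    ],
    "ed_b": [
        {"disposition": "discharged"},
        {"disposition": "lwbs"},
    ],
}

network_counts = run_federated_query(sites, count_by_disposition)
print(dict(network_counts))
```

In this sketch the coordinating code never inspects an individual encounter; it only receives each site's `Counter`.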

FIGURE 2.


Schematic of a federated model. Diagram of federated model with data extraction, transformation, and loading into a common data model within the data warehouse of each emergency department (ED). Standardized queries are distributed by the research center and run locally. Aggregate analysis is collected by the research center without the need for protected health information to leave individual EDs and generally without the need for data use agreement. Non‐network sites may participate if their data is mapped onto the common data model.

Within the larger context of health informatics, there are existing common data models such as i2b2, PCORNet, the US Food and Drug Administration's Sentinel initiative, and Observational Medical Outcomes Partnership (OMOP) whose impact on improving outcomes continues to evolve. 7 , 8 , 9 , 10 However, the single encounter‐based frame of reference of the ED practice environment calls for a simpler data model than the ones currently in use, which also map health systems data not relevant to emergency care delivery. Certain common data models, such as OMOP, whose architecture, vocabulary, and accompanying analytic tools are specifically designed for federated analysis across various patient encounters (inpatient, ambulatory, etc) and across multiple institutions globally, could be used for ED encounters. 11 However, this approach can be costly to implement, requires ongoing updating and data quality monitoring, and is not tailored to the ED context.
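To make the idea of a simpler, encounter‐based data model concrete, the following Python sketch shows what one row of such a model and one site's transformation step might look like. This is not a proposed standard and is not drawn from OMOP or any other existing model: the `EdEncounter` fields, the local column names (`ArrivalDT`, `Age`, `Dispo`), and the code mapping are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class EdEncounter:
    """Hypothetical minimal common-data-model row for a single ED visit."""
    site_id: str
    arrival_time: datetime
    age_years: int
    disposition: str  # assumed shared vocabulary: "discharged", "admitted", "lwbs"


# Hypothetical mapping from one site's local codes to the shared vocabulary.
SITE_A_DISPOSITION_MAP = {"DC": "discharged", "ADM": "admitted", "LWBS": "lwbs"}


def transform_site_a(raw: Dict[str, str]) -> EdEncounter:
    """Transform one locally formatted record into the common structure."""
    return EdEncounter(
        site_id="site_a",
        arrival_time=datetime.strptime(raw["ArrivalDT"], "%m/%d/%Y %H:%M"),
        age_years=int(raw["Age"]),
        disposition=SITE_A_DISPOSITION_MAP[raw["Dispo"]],
    )


row = transform_site_a({"ArrivalDT": "03/26/2023 14:05", "Age": "7", "Dispo": "LWBS"})
print(row)
```

Each site would maintain its own transformation of this kind, while the target structure and vocabulary stay the same across the network.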

3. HOW CAN WE LEVERAGE THE FDHN TO OPTIMIZE THE PERFORMANCE OF EMERGENCY CARE RESEARCH NETWORKS?

Given the complicated nature of developing and assimilating an emergency research network based FDHN, we propose a stepwise, 2‐level development and deployment process. The first step is to create the FDHN itself by engaging established research networks or a consortium of EDs who are committed to its development. Each site will then obtain institutional review board approval and participate in the development of a common data model (an ED dataset based on shared and accepted definitions of data variables) with the important caveat that the data will be retained at the site and queried in a standardized manner. Creation of this ED dataset will require the commitment of health information technology resources, clinical informatician(s), and clinical champions at each site. Basic descriptive and analytic statistics, including the development of regression and classification models, can then be applied to each ED dataset in a standardized manner. Meta‐analyses of deidentified collated results can be used for hypothesis generation, health surveillance screening, development of dashboards for operational analytics, and quality improvement activities. We propose that a FDHN at this stage of development be called a Level I FDHN: capable of performing basic descriptive analyses/models within each site and aggregating results centrally. At this stage, peer sites within the network have established a clear governance structure overseeing architecture, the common data model and vocabulary, access, and strategy, utilizing primarily structured electronic health records (EHR) data. Future states of FDHNs, which we call Level II, include automation of the data query or “visitation” process as well as inclusion of unstructured data such as text and waveforms, and potentially blending a data lake strategy into the FDHN.
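As an illustration of aggregating site‐level results centrally, the sketch below pools per‐site estimates with inverse‐variance fixed‐effect weighting. This is one common meta‐analytic technique chosen here for simplicity; the text does not specify a pooling method, and the function name and the numbers are made up for the example. Each site contributes only an estimate (for example, a log odds ratio from a locally fitted regression) and its standard error.

```python
import math
from typing import List, Tuple


def pool_fixed_effect(site_results: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Inverse-variance fixed-effect pooling of (estimate, standard_error) pairs."""
    weights = [1.0 / (se ** 2) for _, se in site_results]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, site_results)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Illustrative (estimate, standard error) pairs reported by three sites.
site_results = [(0.40, 0.10), (0.20, 0.20), (0.30, 0.10)]
pooled_estimate, pooled_se = pool_fixed_effect(site_results)
print(f"pooled estimate = {pooled_estimate:.3f}, SE = {pooled_se:.3f}")
```

Sites with smaller standard errors receive more weight. A network expecting real between‐site differences might prefer a random‐effects model instead.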

Once a successful Level I FDHN is established, research networks can consider a more resource‐intense Level II FDHN that would allow for sophisticated analyses such as federated machine learning on the ED datasets. 12 It is imperative to develop such a model as emergency care datasets become more complicated with the increasing availability of complex “omics” data (genome, metabolome, microbiome, transcriptome, etc) that may need to be integrated with continuously obtained physiologic data from health monitoring devices (eg, Fitbit, Apple Watch) into emergency care visits. 13 Federated machine learning involves the development of models locally and subsequently developing a global model by combining local model parameters in a Health Insurance Portability and Accountability Act secure, cloud‐based server. 14 Level II FDHNs will need to follow the guidelines suggested but sites will need additional technological support/investment and regulatory oversight.
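The step of combining local model parameters into a global model can be sketched as a sample‐size‐weighted average, the core of the widely used federated averaging (FedAvg) scheme. The text does not say how parameters would be combined, so this weighting is an assumption, and the function name and values are hypothetical. A real deployment would repeat this over many training rounds and add secure transport, which are omitted here.

```python
from typing import List, Sequence, Tuple


def federated_average(local_updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average local parameter vectors, weighted by each site's sample count."""
    total_n = sum(n for _, n in local_updates)
    n_params = len(local_updates[0][0])
    global_params = [0.0] * n_params
    for params, n in local_updates:
        for i, value in enumerate(params):
            global_params[i] += value * n / total_n
    return global_params


# Each site reports (locally trained parameters, number of local training records).
local_updates = [
    ([0.2, -1.0], 100),
    ([0.4, -0.5], 300),
]
global_params = federated_average(local_updates)
print(global_params)
```

Only the parameter vectors and sample counts leave each site; the training records themselves are never pooled.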

At any level, a successfully implemented FDHN could provide both timely and diverse data at scale across as many sites as are willing to participate in a specific question. For instance, operational questions regarding the number of patients who left without being seen across participating sites during an epidemic, stratified by race/ethnicity, can give insights into health equity aspects of care along with syndromic surveillance. Other potential questions could range from disease, drug, and medical device surveillance to quality improvement initiatives, benchmarking, research on rare diseases or outcomes, and even facilitation of data collection in multicenter prospective clinical trials. Questions may range in complexity from as simple as counting the number of patients who left without being seen as a quality benchmark to more complex analyses such as producing model weights in a federated machine learning model. 15

4. LIMITATIONS OF THE FEDERATED APPROACH

Although FDHNs provide many advantages over a centralized model, there are several technical and administrative challenges that should be considered. First, from a technical perspective, a federated model places the entirety of the extract, transform, and load process within each site, and thus each site must be capable of providing the appropriate information technology support, both at inception and on an ongoing basis for maintenance and quality assurance of the dataset. Second, all participating sites within the FDHN must design, implement, maintain, and adhere to a common data model and common vocabulary that necessitates a carefully planned, potentially less flexible, data governance structure. This contrasts with centralized models where at least some of this work can be done and dictated by the coordinating center. Some of this “upfront cost” can potentially be defrayed by leveraging existing EHR‐based analytical tools. More recently, the ongoing consolidation of EHR vendors and health systems as well as the need for ever larger datasets for research has driven many institutions to adopt either open‐source or proprietary common data models. 7 Most EDs already commit to data gathering from the EHR for operational purposes or to share it as a part of their participation in federal and non‐federal repositories such as the National Hospital Ambulatory Medical Care Survey, the Clinical Emergency Data Registry, or collaborative quality groups such as the Michigan Emergency Department Improvement Collaborative. 16 , 17 Finally, a third limitation is that a FDHN will never yield data as granular as a centralized model. By design, a limitation on the minimum level of aggregation or analysis is enforced by the FDHN, which requires that researchers or nodes requesting federated data access have carefully planned their analysis prior to querying individual sites. This also limits the ability to explore data across sites or repeatedly query sites for more data.
Additionally, each site must be able to provide technical resources such as data analysts and software to process incoming queries from other sites. Although this is a manual process in a Level I framework, in future Level II FDHNs, this work could potentially be automated with data “visitation” rules established at the network level.
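One way a minimum level of aggregation could be enforced at each site is small‐cell suppression, in which counts below a threshold are withheld before results are released. The sketch below assumes this interpretation; the text does not prescribe a mechanism, and the threshold of 11, the function name, and the group labels are hypothetical values that a network's governance body would set for itself.

```python
from typing import Dict, Optional

# Assumed threshold for illustration; a real network would define its own rule.
MIN_CELL_SIZE = 11


def suppress_small_cells(
    counts: Dict[str, int], min_cell_size: int = MIN_CELL_SIZE
) -> Dict[str, Optional[int]]:
    """Replace any count below the threshold with None before release."""
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Stratified local counts as they might exist inside one site.
local_counts = {"group_a": 42, "group_b": 3, "group_c": 11}
released = suppress_small_cells(local_counts)
print(released)
```

Because suppressed cells cannot be recovered later, a rule like this is one reason analyses need to be planned before sites are queried.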

5. HOW CAN FDHNS OVERCOME THESE BARRIERS TO EFFICIENTLY ACHIEVE THE GOALS OF EXISTING EMERGENCY RESEARCH NETWORKS THAT UTILIZE CENTRALIZED DATA CENTERS?

First, having more sites committed to a common ED data model and standardizing the extraction and transformation process itself will improve data quality. Second, both network and non‐network sites that use the common data model will be able to contribute data in an efficient manner. Potentially, sites may also leverage the common data model for internal operational reporting, research, and other business intelligence purposes, and some of each site's health information technology costs can be supported by extramural grants. Third, participating in a Level I FDHN will substantially lower the regulatory barrier to entry for many EDs to contribute an analysis of their data by obviating the need for complex data use agreements, thus adding more diverse sites and populations into a network. This may especially be true for non‐network sites that wish to participate on an ad‐hoc basis in the FDHN. Network expansion allows more rapid cultivation of a culture of research across a specialty, further expanding faculty development, recruitment, retention, and attraction of additional extramural funding. This empowers EDs that are not part of existing research networks to expand and strengthen their research mission and improve the care of emergency patients everywhere. Finally, beyond research, potential steps to advance FDHNs should include consensus building among stakeholders, our professional organizations, and health systems regarding the optimal data architecture and vocabulary for emergency medicine. The establishment of a universal but limited common data model for emergency medicine could lower the barrier for many EDs to participate in multicenter research. Funding organizations and existing research networks should advocate for more limited, easier to adopt interoperability standards focused specifically on emergency medicine.
In summary, despite some technical challenges and the need for investment by individual EDs, there is an unprecedented opportunity to leverage federated data health networks to exponentially enhance the impact of emergency care research networks.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

Mahajan P, Macias C, Barda A, Fung CM. Federated data health networks hold potential for accelerating emergency research. JACEP Open. 2023;4:e12968. 10.1002/emp2.12968

Funding and support: By JACEP Open policy, all authors are required to disclose any and all commercial, financial, and other relationships in any way related to the subject of this article as per ICMJE conflict of interest guidelines (see www.icmje.org). The authors have stated that no such relationships exist.

Supervising Editor: Julie Stilley, PhD, NREMT.

REFERENCES

  • 1. Institute of Medicine. 2007. Hospital‐based emergency care: at the breaking point. The National Academies Press. doi: 10.17226/11621
  • 2. Papa L, Kuppermann N, Lamond K, et al. Structure and function of emergency care research networks: strengths, weaknesses, and challenges. Acad Emerg Med. 2009;16(10):995‐1004.
  • 3. Mangus CW, Parker SJ, DeLaroche AM, et al. Impact of COVID‐19 on the associated complications of high‐risk conditions in a statewide pediatric emergency network. J Am College of Emerg Physicians Open. 2022;3(6):e12865.
  • 4. Mahajan P, Shu‐Ling C, Gutierrez C, et al. A global survey of emergency department responses to the COVID‐19 pandemic. West J Emerg Med. 2021;22(5):1037‐1044. doi: 10.5811/westjem.2021.3.50358. PMID: 34546878; PMCID: PMC8463065.
  • 5. Hallock H, Marshall SE, ’t Hoen PAC, et al. Federated networks for distributed analysis of health data. Front Public Health. 2021;9:712569. doi: 10.3389/fpubh.2021.712569. PMID: 34660512; PMCID: PMC8514765.
  • 6. Dayan I, Roth HR, Zhong A, et al. Federated learning for predicting clinical outcomes in patients with COVID‐19. Nat Med. 2021;27(10):1735‐1743. doi: 10.1038/s41591-021-01506-3. Epub 2021 Sep 15. PMID: 34526699; PMCID: PMC9157510.
  • 7. Weeks J, Pardee R. Learning to share health care data: a brief timeline of influential common data models and distributed health data networks in U.S. health care research. EGEMS (Wash DC). 2019;7(1):1‐6. doi: 10.5334/egems.288
  • 8. Informatics for Integrating Biology and the Bedside (i2b2). (n.d.). Retrieved March 26, 2023, from https://www.i2b2.org/index.html
  • 9. Fleurence RL, Curtis LH, Califf RM, et al. Launching PCORnet, a national patient‐centered clinical research network. J Am Med Inform Assoc. 2014;21(4):578‐582. doi: 10.1136/amiajnl-2014-002747
  • 10. U.S. Food and Drug Administration. (n.d.). Sentinel Initiative. Retrieved March 26, 2023, from https://www.sentinelinitiative.org/
  • 11. Observational Health Data Sciences and Informatics. (n.d.). Common data model. Retrieved March 26, 2023, from https://ohdsi.github.io/CommonDataModel/
  • 12. World Economic Forum. (2020). Sharing sensitive health data: Lessons from COVID‐19 and recommendations for the future. Retrieved March 26, 2023, from https://www3.weforum.org/docs/WEF_Sharing_Sensitive_Health_Data_2020.pdf
  • 13. Rieke N, Hancox J, Li W, et al. The future of digital health with federated learning. NPJ Digit Med. 2020;3(1):1‐7.
  • 14. Macias CG, Remy KE, Barda AJ. Utilizing big data from electronic health records in pediatric clinical care. Pediatr Res. 2022;93(2):382‐389.
  • 15. Dayan I, Roth HR, Zhong A, et al. Federated learning for predicting clinical outcomes in patients with COVID‐19. Nat Med. 2021;27:1735‐1743. doi: 10.1038/s41591-021-01506-3
  • 16. American College of Emergency Physicians. (n.d.). Clinical emergency data registry. Retrieved March 26, 2023, from https://www.acep.org/cedr/
  • 17. MedicQI. (n.d.). Retrieved March 26, 2023, from https://medicqi.org/

Articles from Journal of the American College of Emergency Physicians Open are provided here courtesy of American College of Emergency Physicians
