Abstract
Objective
Among National Institutes of Health Clinical and Translational Science Award (CTSA) hubs, effective approaches for enterprise data warehouses for research (EDW4R) development, maintenance, and sustainability remain unclear. The goal of this qualitative study was to understand CTSA EDW4R operations within the broader contexts of academic medical centers and technology.
Materials and Methods
We performed a directed content analysis of transcripts generated from semistructured interviews with informatics leaders from 20 CTSA hubs.
Results
Respondents referred to services provided by health system, university, and medical school information technology (IT) organizations as “enterprise IT.” Respondents at 75% of hubs stated that the team providing EDW4R service was separate from enterprise IT; strong relationships between EDW4R teams and enterprise IT were critical for success. Managing challenges of EDW4R staffing was made easier by executive leadership support. Data governance appeared to be a work in progress, as most hubs reported complex and incomplete processes, especially for commercial data sharing. Although most hubs (n = 16) described use of cloud computing for specific projects, only 2 hubs reported using a cloud-based EDW4R. Respondents described EDW4R cloud migration facilitators, barriers, and opportunities.
Discussion
Descriptions of approaches to how EDW4R teams at CTSA hubs work with enterprise IT organizations, manage workforces, make decisions about data, and approach cloud computing provide insights for institutions seeking to leverage patient data for research.
Conclusion
Identification of EDW4R best practices is challenging, and this study helps identify a breadth of viable options for CTSA hubs to consider when implementing EDW4R services.
Keywords: data warehouse, secondary use, CTSA, EHR
OBJECTIVE
Delivery of patient data from electronic health record (EHR) systems and other sources to researchers is a core function of Clinical and Translational Science Award (CTSA) hubs. The infrastructure to prepare and deliver these data requires special privacy and security considerations layered into high-performance technology and is crucial for many stages of investigation, such as activities preparatory to research, prospective clinical trial recruitment, retrospective observational studies, and public health surveillance.1–4 Expectations for these systems are high, especially during rapidly evolving events like the coronavirus disease 2019 (COVID-19) pandemic.5 Funded by the National Institutes of Health (NIH) National Center for Advancing Translational Sciences (NCATS), nearly all CTSA hubs have deployed enterprise data warehouses for research (EDW4R).2,6 However, hubs appear to have implemented EDW4R with great heterogeneity and without following a clear set of standards or best practice guidelines, which may complicate the CTSA consortium’s ability to meet current and future investigator, institutional, and collaborative multisite needs.
In a previous study, we illustrated how CTSA hubs have implemented EDW4R functionality with respect to organizational and technical architecture, processes for access, and service management.7 We identified 12 themes describing varying operational approaches and demonstrating a diversity of structures often driven by local cultural needs. Although each hub described some mechanism to deliver EHR data to researchers, variability existed in coordination of EDW4R staff with biostatistics and enterprise information technology (IT) colleagues; data delivery processes and oversight mechanisms; approaches to investigator data literacy, skills development, and engagement; and understanding of future technology needs. In observing challenges for and facilitators of EDW4R operations, we identified 4 areas warranting further investigation: relationships between each hub’s EDW4R group and enterprise IT organization, data governance, workforce, and cloud computing. For example, some hubs mentioned a push, from both enterprise IT and informatics professionals, to move to a cloud-based infrastructure, but approaches and benefits were not clearly defined.
The current qualitative study is part of an ongoing effort to address gaps in understanding EDW4R operations and to provide guidance on the development, maintenance, and sustainability of these foundational components of scientific infrastructure.
BACKGROUND AND SIGNIFICANCE
Nearly ubiquitous in CTSA hubs, EDW4R services have provided critical support for clinical and translational research as exemplified in pandemic response activities. In addition to facilitating COVID-19 clinical trial recruitment,8 patient trajectory prediction,9 illness severity determination,10 symptom understanding,11 and common data model instantiation,12 EDW4R at CTSA hubs have enabled the NCATS National COVID Cohort Collaborative (N3C), an unprecedented resource that aggregates EHR data from more than 50 institutions into a single secure repository for observational studies and could be a model for other efforts.13 Despite the demonstrated benefits of EDW4R supporting COVID-19 research, technological and organizational challenges remain that extend to nonpandemic clinical and translational investigation.5 From our previous study, we identified enterprise IT relationships, data governance, workforce, and cloud computing as sociotechnical challenges facing EDW4R operations.
Enterprise IT refers to broad infrastructure services typically overseen by an institution’s chief information officer (CIO).14 The subject of substantial operations research,15 enterprise IT typically operates an ecosystem in which EDW4R activities exist7 along with patient care16 and other organizational functions. Notably, some respondents in our prior study indicated that the relationship between an EDW4R team and enterprise IT influenced the effectiveness and sustainability of EDW4R operations.
Data governance broadly describes how organizations choose to use data.17,18 Institutional decisions often determine how data are accessed and shared, which is critical given the sensitivity and value of EDW4R contents.19 In particular, stewardship of electronic patient data in collaborations between healthcare provider organizations and industry is complex.20 Our prior study found considerable variation in current practices and future plans for EDW4R data governance among CTSA hubs.
Workforce encompasses team members charged with EDW4R operations. Coauthors here recognized from experience that required EDW4R team skillsets include but are not limited to system administration, computer programming, database design, clinical knowledge, research compliance, IT security, EHR expertise, and epidemiology. Although prior studies have characterized professionals in health IT21 and clinical research data management22 as well as senior leaders in clinical23 and research informatics,24 a comprehensive understanding of the EDW4R workforce is lacking.
Cloud computing provided by Amazon Web Services, Google Cloud Platform, Microsoft Azure, and other commercial vendors is emerging as a standard for many areas of enterprise IT, and informatics researchers have demonstrated benefits of cloud computing for computational biology.25,26 Despite increased adoption of cloud computing for general business functions (eg, email), potential benefits of cloud computing in biomedicine,27,28 and NIH focus on cloud computing through the STRIDES initiative,29 our prior work noted most hubs used on-premises EDW4R approaches and had unclear plans for cloud computing.
EDW4R services are complex and function within the broader contexts of academic medical centers and technology. The goal of this article is to illustrate the current state and potential future direction of EDW4R activities across the CTSA consortium.
MATERIALS AND METHODS
The CTSA consortium’s Informatics Enterprise Committee (iEC) deputized the Enterprise Data Warehouse for Research Working Group (EDW4R WG), which 2 coauthors led (BMK and TRC), to conduct this study in coordination with the NCATS National Center for Data to Health (CD2H), from which 1 coauthor served as liaison (DAD). The University of Iowa Institutional Review Board (IRB) determined this study to be nonhuman subjects research.
Data collection
We conducted semistructured interviews via Zoom with a convenience sample of EDW4R leaders from 20 CTSA hubs. Recruitment of participants occurred through iEC meeting announcements and direct email invitation with focus on (1) hubs that had not participated in our previous study and (2) diversity with respect to CTSA award size and geographic location. To direct interviews with participants, 2 authors (BMK and TRC) created a guide informed by our previous study7 and feedback from EDW4R working group members (Supplementary Appendix S1). Data collection occurred January to October 2020, with interviewers creating handwritten notes, transcribing them into electronic format, and sharing transcripts with interviewees by email for editing and elaboration.
Data analysis
The study team conducted a directed content analysis of interview transcripts.30 Two authors (BMK and TRC) identified initial concepts and relationships. Then peer debriefing occurred with a group of experts (CKC, DAD, and EVB) to iteratively refine understanding of interview transcripts and themes.31 Additional member checking through regular monthly iEC EDW4R working group meetings involved presentation of initial findings for feedback and validation.
RESULTS
We conducted more than 30 h of interviews with 27 participants from 20 hubs, which generated more than 50 pages of transcribed notes. As shown in Table 1, characteristics of participants varied.
Table 1. Characteristics of participants
| Characteristics | N = 27 | % |
|---|---|---|
| Role | | |
| Director | 11 | (41) |
| Chief research informatics officer | 4 | (15) |
| Manager | 4 | (15) |
| Staff | 4 | (15) |
| Other role | 4 | (15) |
| Time in role | | |
| 1–5 years | 15 | (56) |
| 6–10 years | 10 | (37) |
| 10+ years | 2 | (7) |
| Education (highest obtained) | | |
| Nonterminal | 12 | (44) |
| PhD | 7 | (26) |
| MD | 6 | (22) |
| MD, PhD | 2 | (7) |
| Funding tier of CTSA hub (n = 20 hubs) | | |
| >$8M | 8 | (40) |
| $5M–$8M | 5 | (25) |
| <$5M | 7 | (35) |
| Sex | | |
| Male | 23 | (85) |
| Female | 4 | (15) |
CTSA: Clinical and Translational Science Award.
Enterprise information technology relationships
Respondents referred to services provided by health system, university, and, in some cases, medical school IT organizations as “enterprise IT.” Respondents’ characterizations of enterprise IT’s role ranged from “system-level, centrally funded, [and] supports everything we do” to “piping to keep everything alive on campus” to “infrastructure—networking, data centers, servers, security—for enabling clinical, research, education, and administrative missions.” All respondents described having a relationship with at least one enterprise IT organization, the most common of which involved health system IT for EHR data and related services. One hub indicated a lack of communication from enterprise IT regarding change management and organizational policies, and another described a need to understand boundaries of multiple enterprise IT organizations (eg, university and medical school) to effectively obtain resources to deliver EDW4R service.
Respondents at 75% of hubs (n = 15) stated that the team providing EDW4R service at their institution was in an organizational unit, such as a CTSA-funded research institute, separate from an enterprise IT organization. Notably, one respondent described physical colocation of staff from enterprise IT and the EDW4R to deliver health system analytics and EDW4R services, and one respondent described plans to merge health system analytics and EDW4R teams into a single group. Three hubs described legacy reporting groups (eg, physician billing) brokering data for research separate from EDW4R teams, which complicated delivery and compliance activities.
Respondents expressed that enterprise IT and EDW4R groups emphasized service management, or the ability of faculty and staff to request and receive data-related services. The majority of hubs (n = 15) reported using a software system for service management, either a commercial product such as ServiceNow (Santa Clara, California, USA) or a homegrown application. Of note, one respondent described “regrettably” using a popular commercial service management system because the software “tr[ies] to jam research requests” into a platform designed for enterprise IT requests, such as desktop support.
Respondents indicated that enterprise IT effectively delivered baseline infrastructure, such as servers and networking, but often had slow turnaround times to support specific research requests, such as grant-funded EHR data extractions or configurations of clinical decision support. Notably, 2 hubs characterized enterprise IT as consisting of siloed niches of responsibilities with defined processes and EDW4R teams as providing multiple overlapping services with processes not fully defined. As one respondent stated, “[enterprise] IT doesn’t get research” and an institution must instead rely on an EDW4R team to fulfill requests and advocate on behalf of scientists to enterprise IT.
Some respondents (n = 3) described enterprise IT as considering EDW4R work “not a priority” or a “stepchild” to other health system needs. As one respondent indicated, clinical activity “dominates and gets all of the resources” from enterprise IT while research activity “struggles and does not compare.” One respondent stated that enterprise IT regarded Epic Systems Corporation (Verona, Wisconsin, USA) as “the primary vendor” and “everything else [as] a nuisance.”
Data governance
Most hubs (n = 17) indicated some level of existing data governance to make decisions regarding EDW4R operations. More than half (n = 12) said there were multiple levels of data governance, such as an oversight group to review external data sharing agreements and an operationally focused team to review internal requests for data access. In 2 hubs, data oversight committees delivered decisions to investigators about internal and external data use within 48 h. At least 2 hubs required investigators to sign an agreement prior to receiving EDW4R data.
Factors affecting data sharing with external parties varied. One respondent characterized social and perception issues of an academic medical center engaging with tech companies (eg, Facebook, Google) beyond data privacy as a determinant of external data sharing decisions. In another hub, security and compliance were the primary drivers of institutional decision-making for data release. One respondent indicated that a hub pursued commercial data sharing opportunities only when the activity aligned with the institutional mission and provided adequate funding.
In deciding which data to add to an EDW4R, 8 hubs described user requests as dictating priorities, while 2 indicated that the Observational Medical Outcomes Partnership (OMOP) common data model provided a guide. In other hubs, decisions to add data elements to an EDW4R rested with senior executives (n = 2) and faculty committees (n = 2). One respondent described a hub’s IRB as requiring investigators to consolidate activities with existing EDW4R services rather than create separate infrastructure.
Workforce
Workforce approaches for EDW4R varied across respondents. Multiple roles comprised EDW4R teams, including managers who provided oversight and budget control; engineers who delivered data pipelines, extract-transform-load of data, and technical operations; analysts who engaged with faculty, managed project deliverables, and provided phenotyping services; and faculty who served as clinical or methodological (eg, machine learning) domain experts. Whereas some hubs defined roles for delivering portions of EDW4R service, others blended responsibilities into generic “developer” positions.
Respondents highlighted the importance of EDW4R staff understanding clinical data and healthcare processes with one hub leader indicating “[we] can always get SQL skills, but interrogating clinical data is harder” and another emphasizing the value of staff experienced in “negotiating the turbulent waters of an academic medical center.” Respondents valued staff versatility with one participant stating “we always need ninjas” and another describing the ideal hire as “more of a Swiss army knife than a dedicated developer.” Hubs valued a variety of skills including technical (eg, SQL, R, and Python programming), content knowledge (eg, clinical, EHR systems, research funding, and compliance), and interpersonal (eg, relationship management).
More than half (n = 12) of respondents indicated that EDW4R staffing remained fairly stable. Two hubs indicated that turnover was frequent due to competitive market forces. Noted gaps in the workforce included clinicians with EHR data understanding and engineers with expertise in natural language processing and specific technologies (eg, TensorFlow). To recruit EDW4R staff, hubs engaged local training programs at the graduate (eg, informatics), undergraduate (eg, computer science), and internship (eg, EDW4R) level. Hubs also reported success in personnel transferring from other parts of an institution, such as clinical or university IT, to join the EDW4R staff.
Staffing models varied, with some team members located in designated EDW4R units and 10 hubs sharing staff with enterprise IT, especially health system IT staff involved in clinical analytics. Two hubs reported embedding EDW4R staff in either clinical departments or enterprise analytics units as part of recent reorganization efforts. Team size varied widely, averaging 10 members and ranging from 1.5 full-time equivalents for the smallest team to 25 for the largest team. A large-team hub described separating their EDW4R team from enterprise IT.
Managing the challenges of staffing for EDW4R was made easier by leadership support. As 7 hubs indicated, open communications with and engagement of the institutional CIO by EDW4R leaders enabled support of EDW4R activities. Leadership approaches appeared to be critical to EDW4R success for 3 hubs, which described high-performing collaborations based on trust among senior executives such as the CIO, chief medical information officer (CMIO), chief research information officer, chief analytics officer, and biomedical informatics department chair. Two hubs indicated that EDW4R activities succeeded specifically due to CIO and CMIO support of research. In one hub, the CIO previously oversaw EDW4R activities and understood the need for enterprise IT support of the service.
Respondents described a variety of reporting relationships for EDW4R leaders and funding sources for EDW4R activities. EDW4R leaders reported to multiple senior institutional leaders, including CIO or CMIO to enable enterprise IT coordination and research dean or CTSA principal investigator to support scientific strategy. Funding for EDW4R varied with a mix of CTSA, dean’s office, health system, and grant support. Notably, 2 hubs indicated the EDW4R team did not fund enterprise IT resources whereas one hub indicated that the medical school piloted novel data-driven activities with grants and internal funding before transferring responsibility to enterprise IT for ongoing operations. For strategic planning with enterprise IT, 4 hubs reported engaging as partners while 2 noted being excluded from participation.
Cloud computing
Although most hubs (n = 16) described use of cloud computing for specific projects, only 2 hubs reported using a cloud-based EDW4R. Examples of cloud-supported projects include hosting of specific EDW4R-related applications (eg, NCATS Accrual to Clinical Trials web application and data mart), high-performance computing workloads for pathology genomics, specific grant-funded initiatives (eg, The Cancer Genome Atlas Program), and synthetic data generation. One hub described a goal of shifting cloud hosting from 30% to 70% within 5 years.
Factors motivating hubs to pursue cloud computing included scalability (n = 5); potential for cost reduction (n = 4); CIO or other enterprise IT leadership influence (n = 3); institutional strategy following successful efforts in other missions (eg, EHR and finance system migration to cloud; n = 3); security (n = 2); clinical genomics (n = 2); secure enclave capabilities (n = 2); and opportunities for technical and organizational change or modernization (n = 2). Factors inhibiting cloud computing adoption included cost (n = 5), unclear faculty requirements (n = 1), and a lack of trust of cloud by institutional leaders (n = 1). Regarding cost, hubs expressed concern about “paying on the meter” for specific vendor-service usage and resultant unexpectedly high bills; the value of existing on-premises infrastructure; organizational recognition of shifting from capital expenditures for on-premises infrastructure to operating expenditures for cloud infrastructure; and the additional expense of EDW4R staff or consultants with cloud computing skills.
Respondents described opportunities to increase cloud EDW4R adoption. One hub indicated that aligning EDW4R cloud strategy with health system or university IT could help achieve economies of scale. Another hub indicated that consulting firms have a market opportunity to provide enhanced staffing and technical resources to facilitate EDW4R migration to the cloud. Both hubs with a cloud-based EDW4R recommended hiring new staff with cloud expertise to expedite EDW4R cloud migration because current teams’ adaptation is challenging given concurrent cloud training and maintenance of existing on-premises infrastructure. Additionally, the 2 hubs with cloud-based EDW4R indicated that migration required between 3 and 5 years. Another hub currently planning an EDW4R strategy anticipated requiring 2 years to complete implementation. Notably, one hub described its EDW4R data assets as in direct competition with commercial cloud vendors. Another hub suggested that if NCATS had encouraged cloud EDW4R activities, the NCATS Accrual to Clinical Trials (ACT) Network32 could have expanded more rapidly with less effort from hubs.
DISCUSSION
Informatics leaders from 20 CTSA hubs helped describe the current state of EDW4R operations related to enterprise IT, data governance, workforce, and cloud computing. Analysis demonstrated both a wide variety of current states for these aspects as well as emerging patterns.
Enterprise IT is an important partner for EDW4R operations and influences infrastructure, service management, leadership, data governance, workforce management, and cloud computing. Hubs that had strong EDW4R operations described a positive relationship with enterprise IT that was often a collaboration or a partnership. This suggests that hub EDW4R and enterprise IT colleagues need to assess and be intentional about their relationship.
Data governance appeared to be a work in progress, as the majority of hubs reported complex and incompletely defined processes that continued to evolve in support of COVID-19 pandemic response efforts as well as engagement with commercial entities for data collaborations. Few hubs described a comprehensive data governance structure that covered all decision areas around EDW4R data including decisions on what sources of data are used, how data are stored in the EDW4R (eg, application of ontologies), how data access is managed, how data are shared inside and outside the institution, and how to approach sharing data with industry.
Variation in staffing models made it difficult to identify best practices for creating the workforce that is required to successfully operate an EDW4R. Some hubs described experiencing workforce challenges in acquiring and keeping EDW4R staff. Many appear to be experimenting with the organization of their teams to balance integration with clinical departments and/or enterprise IT analytics organizations against fulfilling institution-wide data needs. However, the need for C-suite collaborative support in establishing and supporting EDW4R functional teams was strongly expressed. A note of concern is the lack of women on EDW4R teams; of the 27 participants in this study who held different roles and titles at different levels, only 4 (15%) were women.
Cloud computing offers new capabilities but adoption for EDW4R is currently low. More data are needed regarding the costs and benefits of cloud-based EDW4R compared to on-premises alternatives. Many hubs were unsure about the best uses of the cloud for their purposes and how it realistically fit into their strategic plan. Although the NIH STRIDES initiative has promoted cloud computing, support appears aimed at basic scientists rather than clinical researchers, EDW4R teams, and enterprise IT organizations that may be responsible for large-scale patient data repositories. NIH shepherding multiple academic medical centers into commercial cloud environments for EDW4R deployment is likely more complex and costly than supporting individual investigators or laboratories. As one respondent indicated, use of cloud computing could accelerate EDW4R-oriented activities such as NCATS ACT, which NIH may seek to consider for future efforts.
Limitations of this study include time and participant constraints related to CTSA iEC deliverable schedules. However, our 20-hub sample included institutions with varying CTSA funding levels and a diversity of approaches that may reflect all hubs in the consortium. Additionally, it is unknown whether findings from this study of CTSA hubs pertain to other academic medical centers. Future work can address non-CTSA healthcare provider organizations in the United States and other countries.
EDW4R is an important component of a research enterprise that is evolving rapidly to support clinical and translational investigation. CTSA hubs and other institutions need sustainable methods to enable EDW4R to provide investigators with electronic patient data.
CONCLUSION
Using qualitative methods, this study characterized trends and opportunities for improvement that may contribute toward best practices for EDW4R operations. Findings can potentially inform CTSA hubs in supporting scientists. Identification of best practices may be exceptionally challenging, but this paper helps identify a breadth of viable options for CTSA hubs to consider when implementing EDW4R services.
FUNDING
This study received support from the National Institutes of Health National Center for Advancing Translational Sciences through grant numbers UL1TR002384 (Weill Cornell), UL1TR002537 (Iowa), UL1TR001433 (Mount Sinai), UL1TR002369 (OHSU), and UL1TR003167 (UTHSC). This work was funded in part by the University of Rochester Center for Leading Innovation and Collaboration (CLIC), under Grant U24TR002260.
AUTHOR CONTRIBUTIONS
BMK and TRC conceptualized the study and interview guide, conducted interviews, transcribed notes, and performed analysis. CKC, DAD, and EVB provided iterative interview guide and analytical feedback. BMK wrote the manuscript with contributions from TRC. CKC, DAD, and EVB edited the manuscript. TRC and BMK revised the manuscript.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Journal of the American Medical Informatics Association online.
CONFLICT OF INTEREST STATEMENT
BMK and TRC are guest associate editors of the JAMIA special issue on best practices for patient data repositories, and recuse themselves from consideration of this manuscript for publication.
DATA AVAILABILITY
The data underlying this article will be shared on reasonable request to the corresponding author.
REFERENCES
- 1. Bookman RJ, Cimino JJ, Harle CA, et al. Research informatics and the COVID-19 pandemic: challenges, innovations, lessons learned, and recommendations. J Clin Transl Sci 2021; 5 (1): e110.
- 2. Obeid JS, Beskow LM, Rape M, et al. A survey of practices for the use of electronic health records to support research recruitment. J Clin Transl Sci 2017; 1 (4): 246–52.
- 3. Schenck EJ, Hoffman KL, Cusick M, Kabariti J, Sholle ET, Campion TR. Critical carE Database for Advanced Research (CEDAR): an automated method to support intensive care units with electronic health record data. J Biomed Inform 2021; 118: 103789.
- 4. Coorevits P, Sundgren M, Klein GO, et al. Electronic health records: new opportunities for clinical research. J Intern Med 2013; 274 (6): 547–60.
- 5. Madhavan S, Bastarache L, Brown JS, et al. Use of electronic health records to support a public health response to the COVID-19 pandemic in the United States: a perspective from 15 academic medical centers. J Am Med Inform Assoc 2021; 28 (2): 393–401.
- 6. MacKenzie SL, Wyatt MC, Schuff R, Tenenbaum JD, Anderson N. Practices and perspectives on building integrated data repositories: results from a 2010 CTSA survey. J Am Med Inform Assoc 2012; 19 (e1): e119–24.
- 7. Campion TR, Craven CK, Dorr DA, Knosp BM. Understanding enterprise data warehouses to support clinical and translational research. J Am Med Inform Assoc 2020; 27 (9): 1352–8.
- 8. Helmer TT, Lewis AA, McEver M, et al. Creating and implementing a COVID-19 recruitment data mart. J Biomed Inform 2021; 117: 103765.
- 9. Rodriguez VA, Bhave S, Chen R, et al. Development and validation of prediction models for mechanical ventilation, renal replacement therapy, and readmission in COVID-19 patients. J Am Med Inform Assoc 2021; 28 (7): 1480–8.
- 10. Klann JG, Estiri H, Weber GM, et al. Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. J Am Med Inform Assoc 2021; 28 (7): 1411–20.
- 11. Wang J, Abu-El-Rub N, Gray J, et al. COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model. J Am Med Inform Assoc 2021; 28 (6): 1275–83.
- 12. Lenert LA, Ilatovskiy AV, Agnew J, et al. Automated production of research data marts from a canonical Fast Healthcare Interoperability Resource (FHIR) data repository: applications to COVID-19 research. J Am Med Inform Assoc 2021; 28 (8): 1605–11.
- 13. Haendel MA, Chute CG, Bennett TD, et al.; N3C Consortium. The National COVID Cohort Collaborative (N3C): rationale, design, infrastructure, and deployment. J Am Med Inform Assoc 2021; 28 (3): 427–43.
- 14. Bernstam EV, Hersh WR, Johnson SB, et al.; CTSA Biomedical Informatics Key Function Committee. Synergies and distinctions between computational disciplines in biomedical research: perspective from the Clinical and Translational Science Award programs. Acad Med 2009; 84 (7): 964–70.
- 15. Parlier GH, Liberatore F, Demange M, eds. Operations Research and Enterprise Systems: 7th International Conference, ICORES 2018, Funchal, Madeira, Portugal, January 24–26, 2018, Revised Selected Papers. Cham: Springer International Publishing; 2019.
- 16. Marsolo K. Informatics and operations—let’s get integrated. J Am Med Inform Assoc 2013; 20 (1): 122–4.
- 17. Holmes JH, Elliott TE, Brown JS, et al. Clinical research data warehouse governance for distributed research networks in the USA: a systematic review of the literature. J Am Med Inform Assoc 2014; 21 (4): 730–6.
- 18. Randhawa GS, Slutsky JR. Building sustainable multi-functional prospective electronic clinical data systems. Med Care 2012; 50 (Suppl): S3–6.
- 19. Hripcsak G, Bloomrosen M, Flatley Brennan P, et al. Health data use, stewardship, and governance: ongoing gaps and challenges: a report from AMIA’s 2012 Health Policy Meeting. J Am Med Inform Assoc 2014; 21 (2): 204–11.
- 20. Cole CL, Sengupta S, Rossetti Née Collins S, et al. Ten principles for data sharing and commercialization. J Am Med Inform Assoc 2021; 28 (3): 646–9.
- 21. Hersh WR, Boone KW, Totten AM. Characteristics of the healthcare information technology workforce in the HITECH era: underestimated in size, still growing, and adapting to advanced uses. JAMIA Open 2018; 1 (2): 188–94.
- 22. Zozus MN, Lazarov A, Smith LR, et al. Analysis of professional competencies for the clinical research data management profession: implications for training and professional certification. J Am Med Inform Assoc 2017; 24 (4): 737–45.
- 23. Kannry J, Sengstack P, Thyvalikakath TP, et al. The chief clinical informatics officer (CCIO): AMIA task force report on CCIO knowledge, education, and skillset requirements. Appl Clin Inform 2016; 7 (1): 143–76.
- 24. Sanchez-Pinto LN, Mosa ASM, Fultz-Hollis K, Tachinardi U, Barnett WK, Embi PJ. The emerging role of the chief research informatics officer in academic health centers. Appl Clin Inform 2017; 8 (3): 845–53.
- 25. Krissaane I, De Niz C, Gutiérrez-Sacristán A, et al. Scalability and cost-effectiveness analysis of whole genome-wide association studies on Google Cloud Platform and Amazon Web Services. J Am Med Inform Assoc 2020; 27 (9): 1425–30.
- 26. Alvarez RV, Mariño-Ramírez L, Landsman D. Transcriptome annotation in the cloud: complexity, best practices, and cost. Gigascience 2021; 10 (2): 1–11.
- 27. Ali O, Shrestha A, Soar J, Wamba SF. Cloud computing-enabled healthcare opportunities, issues, and applications: a systematic review. Int J Inf Manage 2018; 43: 146–58.
- 28. Aarestrup FM, Albeyatti A, Armitage WJ, et al. Towards a European health research and innovation cloud (HRIC). Genome Med 2020; 12 (1): 18.
- 29. STRIDES Initiative | Data Science at NIH [Internet]. https://datascience.nih.gov/strides. Accessed December 1, 2019.
- 30. Hsieh H-F, Shannon SE. Three approaches to qualitative content analysis. Qual Health Res 2005; 15 (9): 1277–88.
- 31. Lincoln YS. Naturalistic inquiry. In: Ritzer G, ed. The Blackwell Encyclopedia of Sociology. Oxford: John Wiley & Sons, Ltd; 1985: 357–81.
- 32. Visweswaran S, Becich MJ, D'Itri VS, et al. Accrual to clinical trials (ACT): a Clinical and Translational Science Award consortium network. JAMIA Open 2018; 1 (2): 147–52.