Abstract
Introduction
The rapid development of artificial intelligence (AI) in healthcare has exposed the unmet need for growing a multidisciplinary workforce that can collaborate effectively in learning health systems. Maximizing the synergy among multiple teams is critical for Collaborative AI in Healthcare.
Methods
We have developed a series of data, tools, and educational resources for cultivating the next generation of multidisciplinary workforce for Collaborative AI in Healthcare. We built bulk natural language processing pipelines to extract structured information from clinical notes and stored the results in common data models. We developed multimodal AI/machine learning (ML) tools and tutorials to enrich the toolbox of the multidisciplinary workforce to analyze multimodal healthcare data. We have created a fertile ground to cross‐pollinate clinicians and AI scientists and train the next generation of AI health workforce to collaborate effectively.
Results
Our work has democratized access to unstructured health information, AI/ML tools and resources for healthcare, and collaborative education resources. From 2017 to 2022, this has enabled studies in multiple clinical specialties resulting in 68 peer‐reviewed publications. In 2022, our cross‐discipline efforts converged and were institutionalized as the Center for Collaborative AI in Healthcare.
Conclusions
Our Collaborative AI in Healthcare initiatives have created valuable educational and practical resources. They have enabled more clinicians, scientists, and hospital administrators to successfully apply AI methods in their daily research and practice and to develop closer collaborations, and they have advanced the institution‐level learning health system.
Keywords: artificial intelligence, Collaborative AI in Healthcare, collaborative learning, health workforce, learning health system, multimodal machine learning, team science
1. INTRODUCTION
Artificial intelligence (AI), referred to as using computers to perform intelligent tasks typically done by humans, 1 has received growing attention in healthcare research and application, and has sparked different opinions regarding its role in healthcare. Much of AI's recent success comes from the subfield of machine learning (ML), which uses computers to learn from data to make predictions or decisions without a priori programming. Although some AI proponents previously expected it to displace radiologists and anatomical pathologists, 2 such high expectations for AI to contribute to healthcare in general, and to address paramount health challenges such as the COVID‐19 pandemic in particular, have been largely unrealized. Reflecting on the underlying reasons, we found that instead of replacing clinicians, most current AI workflows depend on human expert input in the steps of model development, validation, and deployment.
To maximize the benefits of precision medicine, we should enhance collaboration between AI and human healthcare experts. At its core, Collaborative AI in Healthcare seeks to address critical challenges in healthcare by fostering trust in AI technologies, democratizing access to AI tools and resources, and building robust infrastructure to support AI's integration into healthcare systems. Our mission encompasses not only the technical advancement of AI but also its ethical, accessible, and equitable application across diverse healthcare contexts. By weaving these elements together, Collaborative AI in Healthcare aims to pioneer a comprehensive approach to harnessing AI's potential for transformative impact on patient health. Collaborative AI endeavors resonate with foundational Learning Health System (LHS) concepts, notably Friedman's Learning Cycle, 3 by emphasizing continuous learning and improvement. Our initiatives reflect the cycle's phases of data collection, knowledge generation, and practice integration, yet we also identify and strive to fill notable gaps such as the occasional lack of trust between clinicians and AI scientists and the need for the paradigm shift from reactive to proactive AI.
For learning health systems to advance, it is critical for AI scientists, clinicians, healthcare service providers, and researchers to collaborate in developing AI. Building trust between clinicians and AI scientists is essential to leverage their combined expertise effectively. However, trust is often missing, partly due to fears that AI could replace certain medical professionals and concerns over AI scientists applying models to data without sufficient clinical context. 2 This leads to skepticism about the clinical relevance and reproducibility of AI findings, compounded by the challenge of interpreting “black box” AI models. Misapplications of techniques to make these models explainable can further obscure flaws, presenting invalid models as plausible. 4 Thus, it is important for a multidisciplinary workforce of clinicians and AI scientists to work together during each important step: model development, validation, deployment, and ongoing model governance. This requires the creation of fertile ground to cross‐pollinate clinicians and AI scientists, for them to learn and practice building collaborative AI for learning health systems as a “team sport.”
2. BUILDING COLLABORATIVE ARTIFICIAL INTELLIGENCE IN HEALTHCARE
Collaborative AI in Healthcare should provide resource and education infrastructure for AI democratization, putting AI into the hands of clinicians and scientists without specialized AI knowledge, and empower them to effectively use the technology to work together. Leveraging the strong partnerships between Northwestern University Feinberg School of Medicine (FSM) and both Northwestern Medicine's hospitals and Lurie Children's Hospital, we have established such infrastructure in data, tooling, and education, enabling numerous collaborators from multiple clinical specialties in adult and pediatric medicine to advance their research and secure extramural funding.
2.1. Governance and oversight of collaborative AI in healthcare
To effectively coordinate and manage its diverse activities, Collaborative AI in Healthcare has established a governance framework that includes an Executive Steering Committee and an Advisory Board, while leveraging the Community Engagement Panel from Northwestern University Clinical and Translational Science Institute (NUCATS) for community outreach. Our advisory board comprises diverse faculty from the schools of medicine, engineering, art and science, as well as leaders of the healthcare system. The 12 members of the Advisory Board bring expertise from core AI techniques applied to multimodal health data (e.g., imaging, clinical notes, multi‐omics), health equity, ethics and patient engagement, various clinical specialties (e.g., from general internal medicine to cardiovascular and pulmonary care), basic science powered translational medicine, as well as education innovation and knowledge management.
This governance structure not only ensures strategic alignment and ethical integrity but also facilitates broad stakeholder engagement, drawing on a wealth of expertise to create an inclusive and collaborative ecosystem. By leveraging the infrastructure of partnering institutions, we maximize resource efficiency and community impact. Designed to be dynamic, our governance framework supports the growth of Collaborative AI in Healthcare and its adaptive response to the rapidly evolving field of healthcare AI. This approach fosters a vibrant ecosystem that promotes research, innovation, and leadership, ensuring that advancements in AI are effectively translated into tangible benefits for biomedicine. Through this governance model, we are poised to navigate the complexities of healthcare AI, advancing the field while prioritizing ethical considerations and the needs of our diverse community.
2.2. Disseminating collaborative educational resources
Since 2020, we have dedicated ourselves to nurturing the next generation of leaders in medical data science and collaborative AI. This commitment led to the launch of the AI for Health (AI4H) Clinic, aimed at providing practical guidance and support to the approximately 4000 practicing clinicians within our faculty. 5 The AI4H clinic sessions, which complement the winter quarter's medical AI courses at Northwestern University Feinberg School of Medicine, serve as a platform where clinicians interested in AI for healthcare can discuss their clinical challenges and ideas. These sessions are grounded in the principles of collaborative AI, drawing together a diverse group of clinicians, AI scientists, data scientists, basic scientists, hospital administrators, and trainees to foster a multidisciplinary approach to healthcare solutions (Figure 1), all operating under ethical guidelines for AI development in healthcare. 6
FIGURE 1.

AI for Health (AI4H) Clinic as a convener for collaborative artificial intelligence (AI) in healthcare. AI4H Clinic brings together clinicians, AI scientists, data scientists, basic scientists, hospital administrators and trainees, and operates under ethical principles guiding development of AI algorithms for healthcare.
The AI4H Clinic has become a catalyst for innovation and collaboration. Clinicians, alongside AI and data scientists, bring forth clinical, research, or operational problems to explore AI/ML‐based solutions through brainstorming, consultation, and iterative solution development. This process not only leads to pilot projects, prototype systems, and academic publications but also deepens the appreciation of the nuances of clinical data among AI professionals. Notably, the clinic has empowered clinicians, especially those previously lacking resources, to develop and deploy AI models with the support of AI scientists and informatics trainees. The engagement spans across various clinical specialties and career stages, fostering clinical implementations and research breakthroughs (Table 1). This vibrant environment encourages early collaboration, nurturing a natural bond between junior clinicians and AI/data science trainees across diverse topics, thereby laying the groundwork for a future where Collaborative AI in Healthcare flourishes through the synergy of next‐generation clinicians and AI scientists.
TABLE 1.
Selected Northwestern Medicine AI4H Clinic sessions across a wide range of clinical specialties by clinicians from all career stages (Assistant/Associate/Full Professors and Chief Medical Officer).
| Date | Specialty | Presenter rank | Topic |
|---|---|---|---|
| 31 January 2019 | Pediatric Critical Care | Assistant Professor | Using machine learning to better understand Multiple Organ Dysfunction Syndrome in Pediatric Intensive Care Unit |
| 6 February 2020 | Emergency Medicine, Medical Ethics | Professor, Chief Medical Officer | Fairness of AI algorithms for risk prediction of heart attack and stroke across demographic and socioeconomic strata |
| 20 February 2020 | Internal Medicine | Associate Professor | Predicting next clinical event risks to ensure outpatient follow‐up appointments occur prior to substantial risk development |
| 5 March 2020 | Radiology | Assistant Professor | Applying machine learning to connectomic neuroimaging for patients' epilepsy risk prediction |
| 14 January 2021 | Emergency Medicine, Geriatrics | Associate Professor | Using machine learning to predict hospital disposition with Geriatric Emergency Department Innovation (GEDI) intervention |
| 18 February 2021 | Emergency Medicine | Assistant Professor | Using AI to predict the resource utilization and time spent for a given patient's Emergency Room stay for better triaging |
| 11 March 2021 | Internal Medicine | Professor | Leveraging multi‐state patient populations for post‐publication evidence appraisals for AI in healthcare studies |
| 13 January 2022 | Orthopedic Surgery | Assistant Professor | Identifying patients at higher risk for poor outcomes after orthopedic surgery and spine surgery using machine learning |
| 3 February 2022 | Neonatal Infectious Disease | Assistant Professor | Discovering umbilical cord biomarkers for diagnosis of early onset neonatal sepsis |
| 10 March 2022 | Cardiology, Epidemiology | Assistant Professor | Using data science and machine learning to untangle the complexities of heart failure diagnosis and prognosis |
| 12 January 2023 | Gastroenterology | Associate Professor | Using spatial transcriptomics and machine learning to elucidate irritable bowel disease mechanism and druggable targets |
| 16 February 2023 | Neurology | Professor | Using deep learning to identify biomarkers for predicting subsequent hematoma expansion after intracerebral hemorrhage |
| 2 March 2023 | Pediatrics, Emergency Medicine | Assistant Professor | Determine an operationalizable outcome for community‐acquired pneumonia (CAP) using machine learning approaches |
Note: The involved patient populations include neonates, children, adults, and senior patients, with diverse gender, racial/ethnic, and socioeconomic profiles.
The AI4H Clinic, initially envisioned for a select group already versed in AI, quickly exceeded its capacity due to high demand from interested clinicians. Rather than limit participation, we innovated by pairing these clinicians with AI trainees, creating a mentorship dynamic where both parties could learn from each other. Many clinicians, eager to apply AI within their specialties yet lacking analysis‐ready data, were supported by the use of the publicly accessible MIMIC database. 7 This approach allowed AI trainees to craft prototype models under the guidance of clinical insights, fostering a practical learning environment.
These collaborative efforts have led to the development of AI models tackling critical clinical challenges. Several prototype models have spanned the phases of data collection and knowledge generation in Friedman's Learning Cycle, on a variety of concrete clinical tasks including early prediction of acute kidney injury, 8 , 9 , 10 antibiotic stewardship, 11 and fluid management for hyperchloremia prevention. 12 Meanwhile, a couple of notable efforts have further included the practice integration phase and thus spanned the entirety of Friedman's Learning Cycle, for example, unsupervised machine learning for pediatric multiple organ dysfunction subphenotyping and targeted interventions, 13 and supervised machine learning to prioritize geriatric emergency department patients for advanced assessment and multidisciplinary care coordination. 14
In 2023, we took a significant step toward expanding AI literacy and fostering patient‐centered innovation by launching the Northwestern Medicine Healthcare AI Forum. This pioneering biweekly forum is uniquely inclusive, inviting not only faculty and students from Northwestern University but also healthcare professionals, patients, and the broader community within the Greater Chicago area. Our sessions (recordings at https://www.youtube.com/watch?v=6OZJFx_yvik&list=PLh0E-LUsGkx3akbCnB9gonkIkKGGbPE1T) are designed to break down the complexities of AI in healthcare, presenting the latest advancements in a manner that is accessible and engaging to everyone, including patients and their advocates.
Each forum features multiple succinct and modular presentations that distill complex research and technological innovations into intuitive, easily understandable insights. These 10–15 min segments avoid technical jargon, opting instead for plain English explanations that invite questions, stimulate open discussion, and encourage participation from all attendees. By prioritizing patient engagement and making AI advancements relatable to their experiences and concerns, we aim to not only educate but also empower our community. This initiative reflects our commitment to not just advancing healthcare through technology but doing so in a way that is inclusive, patient‐focused, and driven by the needs and insights of those we aim to serve.
The multifaceted data and education initiatives and resources have created a constructive and collaborative workspace, which brings clinicians and AI scientists together and enables significant research progress through collaborative AI efforts across multiple clinical specialties in both adult and pediatric medicine. In addition, the collaborative AI efforts are deeply embedded into multiple institutional centers across the Feinberg School of Medicine, including HeartShare Data Translational Center (U54HL160273), Northwestern University Clinical and Translational Science Institute (NUCATS, UL1TR001422), Electronic Medical Records and Genomics (eMERGE) Multi‐center Consortium (U01HG011169), Nutrition Precision Health for All of Us Chicago Center (UG1HD107697), and Network of the National Library of Medicine Evaluation Center (U24LM013751), among others.
2.3. Democratizing access to unstructured health information
Much patient information is locked in narrative clinical notes, which are not analysis ready for data science and AI tools. Extracting structured information and converting it into model features requires NLP. 15 In medical schools and health systems, clinical NLP expertise is often scarce relative to the broad need of practicing clinicians to extract health information from unstructured clinical notes for automated downstream processing. We have observed that many requests from clinicians shared similarities in the desired information (e.g., presence and absence of certain clinical conditions, medications, procedures). Thus, we can bulk process clinical notes to extract desirable information in anticipation of common interest, serving to democratize access to unstructured health information.
We have developed bulk natural language processing (NLP) and data harmonization pipelines to systematically extract structured information from unstructured clinical notes, and stored processing results in interoperable data marts to power augmented intelligence in clinical practice. Figure 2 illustrates the bulk‐NLP pipeline, which begins by identifying sections in clinical notes, breaking paragraphs into sentences, and sentences into words (tokenization). Stemming is used to reduce inflected words to their root form, capturing core meaning. Part‐of‐speech (POS) tagging assigns a POS tag to each word, capturing inflections. Syntax parsing assigns a syntactic structure to a sentence, which, along with stemming and POS tagging results, informs named entity recognition (concept recognition) 16 , 17 and relation extraction. 18 , 19 , 20 Concept recognition can also inform the syntax parser on the relations between tokens and improve the accuracy of parsing. 21 These results generate a graph representation for a sentence, capturing various relations expressed on the mentioned concepts 22 that can be consumed by our graph neural network model TextGCN 23 for relation inferencing from clinical text (e.g., medication causes adverse events).
FIGURE 2.

Bulk natural language processing (NLP) pipeline and resulting data marts with common data model tables. The bulk NLP pipeline takes in unstructured clinical notes and runs full stack syntactic and semantic processing steps to extract structured information and store it in data marts. The data marts use common data models to store the information extracted from clinical notes to augment structured EHR and provide interoperability among hospitals within Northwestern Medicine and with external health systems. These data marts then power business intelligence and drive research and development by assisting clinicians and scientists at both the medical school and the health system.
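The early, rule-based stages of the bulk NLP pipeline described above (section identification, sentence splitting, tokenization, and stemming) can be illustrated with a minimal sketch. This is not the pipeline's actual implementation: the `Header:` section convention, the suffix list, and all function names are assumptions made for illustration, and the note is synthetic. Later stages such as POS tagging, syntax parsing, concept recognition, and relation extraction require trained models and are not attempted here.

```python
import re

# Assumed convention: a section starts with a capitalized "Header:" at line start.
SECTION_HEADER = re.compile(r"^([A-Z][A-Za-z ]+):\s*", re.MULTILINE)

# Naive suffix list standing in for a real stemmer such as Porter's algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def split_sections(note):
    """Split a note into {section_name: text} using 'Header:' markers."""
    sections = {}
    matches = list(SECTION_HEADER.finditer(note))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        sections[m.group(1).strip()] = note[m.end():end].strip()
    return sections


def split_sentences(text):
    """Break text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase alphanumeric tokens; punctuation is dropped."""
    return re.findall(r"[A-Za-z0-9]+", sentence.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def process_note(note):
    """Map each section name to a list of stemmed-token lists, one per sentence."""
    return {
        name: [[stem(tok) for tok in tokenize(sent)] for sent in split_sentences(text)]
        for name, text in split_sections(note).items()
    }


# Synthetic note written for this example; it contains no patient data.
EXAMPLE_NOTE = (
    "History: Patient reports worsening dyspnea. Denies chest pain.\n"
    "Medications: Started metoprolol. Continued lisinopril."
)

if __name__ == "__main__":
    for section, sentences in process_note(EXAMPLE_NOTE).items():
        print(section, sentences)
```

The crude stemmer turns "denies" into "deni", and the sentence splitter would break on abbreviations such as "Dr."; both limitations suggest why production pipelines rely on dedicated clinical NLP components rather than rules of this kind.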
To disseminate the use of this state‐of‐the‐art language model, we have made an easy‐to‐follow tutorial with a simplified version of TextGCN 23 (available at https://github.com/luoyuanlab/text_gcn_tutorial) and introduced it into classroom teaching so that trainees can run a graph deep learning model on their laptop within 10 min. Our pipeline also allows direct regular expression extraction in order to furnish customized concept and relation recognition. 24 Finally, our concept and relation extraction steps produce outputs that are mapped to the Unified Medical Language System (UMLS). 25 The captured relations and concepts are stored in OMOP Common Data Model tables to ensure interoperability across the 12 hospitals in the adult health system, the pediatric hospital and clinics, and with external health systems.
To ensure broad use of the data and tools, we created tutorials and educational resources (e.g., case studies, consulting sessions, currently available to approved Northwestern Medicine Enterprise Data Warehouse [NMEDW] users) for the data marts produced by the bulk NLP pipelines. These resources are designed to simplify the use of data marts generated by our comprehensive NLP pipelines, catering to a wide audience that spans clinicians, researchers, and administrative staff who seek to leverage the wealth of information locked within unstructured clinical notes. To validate the effectiveness of integrating structured information from clinical notes, we conducted studies comparing predictive models based on structured data alone against models enriched with NLP‐extracted information, specifically in the context of breast cancer recurrence adjudication and prediction. The results demonstrated a significant enhancement in model performance when incorporating information extracted through our NLP pipelines. 26 , 27 , 28 , 29
Our bulk NLP effort has powered research and development efforts, driven business intelligence, and facilitated clinical tasks such as computational phenotyping, 30 , 31 disease predictive modeling, 32 , 33 , 34 semantic analysis, 35 and adverse event detection. 36 , 37 For example, we have collaborated with the NMEDW team to deposit results from the NLP pipelines into data marts for breast cancer patients (deployment completed) 38 and cardiovascular disease patients (deployment in progress). The breast cancer data mart has enabled Northwestern investigators to secure federal grants and advance clinical research in breast cancer recurrence adjudication and prediction, 26 , 27 , 28 , 29 genetic risk stratification 39 , 40 , 41 and intervention, and drug delivery assessment. 42 , 43 , 44 We have adapted an NLP pipeline for cardiovascular disease patients, successfully extracting key information such as left ventricular ejection fraction from echocardiography reports across our adult care network and external institutions. 45 This also lays the groundwork for downstream tasks such as drug repurposing for atrial fibrillation. 46 , 47
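As an illustration of the regular expression extraction mentioned above, the following sketch pulls left ventricular ejection fraction (LVEF) values from free text. The pattern, the function name `extract_lvef`, and the example sentences are assumptions made for this example and do not reproduce the pipeline's actual rules; real echocardiography reports vary far more in phrasing.

```python
import re

# Illustrative pattern only: matches phrases such as "LVEF: 55 %", "EF 35-40%",
# or "ejection fraction is estimated at 60%".
LVEF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection\s+fraction)\b"        # the concept mention
    r"[^0-9%\n]{0,40}?"                           # short filler, e.g. ": " or " is estimated at "
    r"(\d{1,2})(?:\s*(?:-|–|to)\s*(\d{1,2}))?"    # a single value or a range
    r"\s*%",
    re.IGNORECASE,
)


def extract_lvef(report):
    """Return a list of (low, high) LVEF percentages found in the report.

    A single value such as "55%" is returned as (55, 55).
    """
    results = []
    for m in LVEF_PATTERN.finditer(report):
        low = int(m.group(1))
        high = int(m.group(2)) if m.group(2) else low
        results.append((low, high))
    return results


if __name__ == "__main__":
    # Synthetic sentences written for this example.
    print(extract_lvef("Left ventricular ejection fraction is estimated at 35-40%."))
    print(extract_lvef("LVEF: 55 %. No pericardial effusion."))
```

Returning a (low, high) pair keeps reported ranges intact so that downstream users can choose their own summary (for example, the midpoint) rather than having that choice fixed at extraction time.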
2.4. Collaboratively advancing artificial intelligence/machine learning tooling and resources for healthcare
The integration of deep phenotyping, multi‐omics, and ML is revolutionizing our understanding of the pathophysiological evolution and disease progression by illuminating complex biological pathways. Achieving this level of insight, however, necessitates a unique blend of AI, data science, biological sciences, and multispecialty clinical expertise. Our team embodies this multidisciplinary synergy, creating multimodal AI/ML tools that harness the complementary nature of diverse datasets to delve into complex diseases with unprecedented depth and precision (Figure 3).
FIGURE 3.

Multimodal healthcare data for artificial intelligence/machine learning.
For example, in our autism study, we combined healthcare claims, electronic health records, familial whole‐exome sequences, and gene expression patterns. 48 This approach, rather than relying on data volume alone, allowed us to identify dyslipidemia as a risk factor for a novel autism subtype. Working with cardiology experts, we have effectively used deep phenotype (e.g., medical imaging 49 , 50 , 51 , 52 ) and genomic data (e.g., Whole Exome Sequencing) to identify distinct patient subgroups with unique cardiac mechanics. 53 , 54 This approach, utilizing unsupervised machine learning, has been particularly beneficial in studying adult hypertension patients at risk of heart failure with preserved ejection fraction (HFpEF). Our methodology has enabled the identification of subgroups with varying cardiac mechanics 53 and HFpEF progression risks. 55 , 56 , 57
Our methodology does not merely categorize patients more accurately; it redefines patient management strategies, enabling the reclassification of certain cases to more specific and treatable conditions like transthyretin cardiac amyloidosis. 58 , 59 Through this work, we are setting new standards for the systematic identification of disease subtypes, facilitating the discovery of targeted interventions and significantly enhancing patient care outcomes. Our work opens new pathways for leveraging AI‐driven insights that lead to more precise diagnoses and tailored treatments, transforming patient care for complex diseases.
Our developed tools are actively supporting research, practice, and education. For instance, they are aiding the HeartShare Data Translational Center in combining multi‐omics, imaging, and phenotyping data to identify novel heart failure subtypes and treatment targets and to promote precision medicine. In practice, we embedded ML models, augmented with bulk NLP extracted features, as part of a human‐in‐the‐loop workflow deployed at Northwestern Medicine to enhance the identification of patients transitioning from moderate to severe HF (a deadly situation sometimes overlooked by primary care physicians) and facilitate timely evaluation by HF specialists for advanced therapies. 60 In education, these tools are used to train clinicians and AI scientists, with trainees developing advanced models for diagnosing complex clinical syndromes such as esophageal motility disorders. 61 These tools, complemented by shared source code and tutorials, provide practical AI and machine learning training for early career investigators.
We have developed public, pan‐disease AI/ML resources to enhance the applicability of our tools across clinical specialties. One such initiative is our foray into spatial transcriptomics, an emerging technology that profiles gene expression in a tissue context. Integrating spatial transcriptomics datasets from published studies can provide unique opportunities for clinicians and scientists to extract and aggregate insights on tissue‐context dependent molecular mechanisms. 62 Recognizing the lack of a shared, systematically processed database for this data, we created the Spatial transcriptOmics Analysis Resource (SOAR). SOAR curates and annotates spatial transcriptomics data from 2785 samples across 40 tissue types from 11 species. 63 It offers a consistent, user‐friendly platform for researchers to visualize and assess spatial gene expression variability and cell–cell interactions, to better understand various diseases' mechanisms and inform targeted drug discovery. Accessible at https://soar.fsm.northwestern.edu/, with tutorials in HTML, PDF, and video forms, SOAR serves as a one‐stop destination for clinicians, scientists, and trainees worldwide, currently supporting approximately 2700 users.
By collaborating with our diverse team of clinical experts, we can effectively identify and address prevalent challenges for a learning health system, such as missing clinical data. Although multiple algorithms have been proposed for imputing missing measurements and applied in clinical settings, 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 widespread community efforts to advance state‐of‐the‐art imputation techniques for clinical data have been lacking. To bridge this gap, we created a dataset with native and artificial missing values, consisting of common laboratory test results from complete blood count and metabolic panels from the MIMIC III dataset, 7 and organized an international challenge. 75 Participants from various industries and academia experimented with a range of algorithms to handle missing data. The shared lessons, benchmarking dataset, source code, and tutorials from this challenge have created a comprehensive resource for clinicians and scientists interested in clinical data imputation.
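The idea of benchmarking imputation with artificial missing values can be sketched as follows: hide a fraction of the observed cells, impute them, and score the imputed values against the hidden truth. This is a minimal sketch under stated assumptions, not the challenge's actual protocol: the column-mean imputer is only a baseline, root-mean-square error is an assumed metric, and the small table of laboratory-like values is synthetic.

```python
import math
import random


def mask_values(table, fraction, seed=0):
    """Hide a random fraction of observed cells.

    Returns (masked_table, held_out), where held_out maps (row, col) to the
    true value. Cells that are already None (native missingness) stay missing.
    """
    rng = random.Random(seed)
    observed = [(r, c) for r, row in enumerate(table)
                for c, v in enumerate(row) if v is not None]
    hidden = rng.sample(observed, max(1, int(fraction * len(observed))))
    masked = [list(row) for row in table]
    held_out = {}
    for r, c in hidden:
        held_out[(r, c)] = masked[r][c]
        masked[r][c] = None
    return masked, held_out


def mean_impute(table):
    """Baseline imputer: fill each missing cell with its column mean."""
    means = []
    for c in range(len(table[0])):
        vals = [row[c] for row in table if row[c] is not None]
        means.append(sum(vals) / len(vals) if vals else 0.0)
    return [[means[c] if v is None else v for c, v in enumerate(row)]
            for row in table]


def rmse(imputed, held_out):
    """Root-mean-square error over the artificially hidden cells only."""
    errors = [(imputed[r][c] - truth) ** 2 for (r, c), truth in held_out.items()]
    return math.sqrt(sum(errors) / len(errors))


# Synthetic rows (hypothetical columns: hemoglobin, sodium, creatinine).
LABS = [
    [13.5, 140.0, 0.9],
    [11.2, None, 1.4],   # natively missing value
    [14.8, 138.0, 1.0],
    [9.7, 135.0, 2.1],
]

if __name__ == "__main__":
    masked, held_out = mask_values(LABS, fraction=0.25, seed=42)
    print("RMSE of mean imputation:", rmse(mean_impute(masked), held_out))
```

Because only artificially hidden cells have known true values, the score is computed on those cells alone; natively missing values are still imputed but cannot be evaluated.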
Addressing healthcare disparities is also a significant challenge for a learning health system. These disparities can bias machine learning models, leading to inequitable decisions. 76 To tackle this, we have developed tools to identify and expose biases and inequity in healthcare practices and policies. 77 , 78 , 79 For instance, our Monte Carlo simulation of a ventilator shortage in a diverse COVID‐19 population revealed that certain triage strategies, despite being intended as “color‐blind,” disproportionately affected Black patients with higher SOFA scores and comorbidities. 78 In response to these findings, we have taken proactive steps by designing machine learning models that adopt a more holistic view. These models incorporate both utilitarian principles, which aim for the greatest good for the greatest number, and egalitarian perspectives, ensuring fair treatment across all patient demographics. By integrating social determinants of health, our models significantly reduced disparities in healthcare predictions and outcomes for different patient groups, while not sacrificing the overall performance. 80
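The mechanism by which a rule that never looks at group membership can still produce unequal allocation may be illustrated with a toy Monte Carlo simulation. This sketch does not reproduce the cited study: the two groups "A" and "B", their score distributions, the lowest-score-first rule, and all parameters are synthetic assumptions chosen only to show the technique.

```python
import random


def simulate_triage(n_trials=2000, n_patients=20, n_ventilators=10, seed=0):
    """Estimate per-group ventilator allocation rates under a score-only rule.

    In each trial, ventilators go to the patients with the lowest SOFA scores.
    Group membership is never used by the allocation rule itself.
    """
    rng = random.Random(seed)
    allocated = {"A": 0, "B": 0}
    total = {"A": 0, "B": 0}
    for _ in range(n_trials):
        patients = []
        for _ in range(n_patients):
            group = rng.choice("AB")
            # Assumed, synthetic distributions: group B presents with higher scores.
            mean = 6.0 if group == "A" else 8.0
            sofa = min(24, max(0, round(rng.gauss(mean, 3.0))))  # SOFA ranges 0-24
            patients.append((sofa, rng.random(), group))  # random tie-breaker
        patients.sort()
        for rank, (_, _, group) in enumerate(patients):
            total[group] += 1
            if rank < n_ventilators:
                allocated[group] += 1
    return {g: allocated[g] / total[g] for g in total}


if __name__ == "__main__":
    for group, rate in simulate_triage().items():
        print(f"group {group}: allocation rate {rate:.2f}")
```

In this toy setting the group with higher baseline scores receives ventilators less often even though the rule is blind to group membership, which is the kind of effect a simulation can expose before a policy is deployed.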
The Northwestern University Collaborative AI in Healthcare initiative has been diligently working on democratizing access to unstructured health information, advancing AI/ML resources, and disseminating collaborative educational resources. This multifaceted approach has significantly propelled the successful completion of numerous studies, and their publications have provided timely value to multiple clinical subspecialties and topics (Figure 4). As a result, our closely knit team has been able to accelerate scientific investigations and expand clinical research. This collaborative effort has significantly contributed to the successful acquisition and implementation of numerous NIH‐funded awards (Figure 5), which will further sustain and scale‐up our research endeavors.
FIGURE 4.

The Collaborative AI in Healthcare Initiative has significantly contributed to the successful execution of numerous studies, analyzed here by the topic and focus trends of their publications. The term co‐occurrence map is built from the titles and abstracts of the publications. A connection denotes the co‐occurrence between two terms. Term nodes are sized by the number of times they occur in the title or abstract of the publication. Distance between two terms indicates how often the terms co‐occur in a title or abstract. (A) All terms are assigned (and colored accordingly) to clusters based on the co‐occurrences. The red cluster mainly contains the terms that represent a method, like “representation,” “classification,” and so forth. The green cluster mainly contains the terms that are related to a medical task, like “COVID,” “treatment,” and so forth. The blue cluster can contain both method terms (e.g., “prediction model”) and medical terms (e.g., “acute kidney injury,” “intensive care unit”). (B) All terms are colored according to their average publication year of usage; yellow indicates newer topics. (C) Publications resulting from the Collaborative AI in Healthcare Initiative, analyzed by the number of publications and citations by year.
FIGURE 5.

The Collaborative AI in Healthcare Initiative has significantly contributed to the successful acquisition and implementation of numerous research grants (R01, R21, R18, R61, R35, R24), funded by various NIH institutes and centers. AHRQ, Agency for Healthcare Research and Quality; NCI, National Cancer Institute; NHGRI, National Human Genome Research Institute; NHLBI, National Heart, Lung, and Blood Institute; NIA, National Institute on Aging; NIAID, National Institute of Allergy and Infectious Diseases; NIAMS, National Institute of Arthritis and Musculoskeletal and Skin Diseases; NICHD, Eunice Kennedy Shriver National Institute of Child Health and Human Development; NIDDK, National Institute of Diabetes and Digestive and Kidney Diseases; NIGMS, National Institute of General Medical Sciences; NINDS, National Institute of Neurological Disorders and Stroke; NLM, National Library of Medicine; OD, Office of the Director.
3. DISCUSSION AND FUTURE WORK
In November 2022, Northwestern University took a significant step forward at the intersection of healthcare and technology by institutionalizing the Collaborative AI in Healthcare Initiative into the Center for Collaborative AI in Healthcare. This pivotal move underscores our dedication to advancing biomedical AI, supported by the center's staff who contribute to numerous NIH‐funded grants, as illustrated in Figure 5. These efforts are further bolstered by non‐sponsored institutional resources from the medical school and partnering departments and institutes, as well as by funding from pharmaceutical and biotech companies for AI‐driven drug discovery endeavors.
The center's evolution from a support role in existing grant efforts to a self‐sustaining entity exemplifies organic growth at its finest—uniting service‐oriented efforts, securing institutional backing, and drawing federal and industry funding. This robust, multi‐pillar support system not only ensures the center's sustainability but also empowers it to expand its key activities significantly. Our mission is to serve as a pivotal convener for a multidisciplinary workforce, seamlessly bridging the gap between clinicians, basic scientists, hospital administrators, and AI scientists. By doing so, we aim to foster an environment where collaborative efforts thrive, transcending traditional boundaries and driving forward the initiatives of learning health systems. This unique position allows the Center for Collaborative AI in Healthcare to spearhead innovations that promise to transform patient care, research, and education in the realm of healthcare AI.
3.1. Lessons learned
The journey of the Center for Collaborative AI in Healthcare from inception to its current status has offered numerous valuable lessons on the importance of organic growth and the adoption of a product‐oriented mindset in shaping the center's resources and programs. These insights not only reflect the center's strategic development but also highlight the adaptable and innovative approach necessary for success in the rapidly evolving field of healthcare AI.
3.1.1. Organic growth
Flexibility is key
The center's ability to adapt its focus based on emerging research findings, technological advancements, and healthcare needs has been crucial (e.g., developing SOAR from spatial transcriptomics advances). This flexibility allowed us to respond dynamically to the challenges and opportunities that arose, ensuring our efforts remained relevant and impactful.
Building on existing strengths
Leveraging the existing expertise and infrastructure within Northwestern University and its partners (e.g., Community Engagement Panel from NUCATS) facilitated a solid foundation for growth (e.g., community outreach). This approach underscored the value of utilizing established resources and relationships as a springboard for expansion and innovation.
Engaging broadly with stakeholders
Early and ongoing engagement with a wide range of stakeholders, including clinicians, scientists, administrators, and industry partners, enriched the center's understanding of diverse needs and perspectives. This inclusivity has been instrumental in designing resources and programs, such as the AI4H clinics and the NM Healthcare AI Forum, that are both comprehensive and targeted.
3.1.2. Adopting a product‐oriented mindset
User‐centered design
Treating the center's offerings as products meant adopting a mindset focused on the end‐user—whether a clinician, researcher, or educator. This shift emphasized the importance of understanding user needs, preferences, and challenges, leading to the development of more accessible, intuitive, and valuable resources (e.g., bulk NLP, SOAR, AI4H clinics, and NM Healthcare AI Forum).
Iterative development and feedback loops
Embracing a product development approach encouraged the adoption of iterative cycles, where resources and programs are continuously refined based on user feedback and performance metrics. This process ensures that the center's offerings remain at the cutting edge of utility and effectiveness (e.g., adding drug discovery function to SOAR for its 2023 cycle).
Scalability and sustainability
Designing with scalability in mind, the center has focused on creating resources and programs that can grow and evolve (e.g., partnering with the health system and the schools of engineering and arts and sciences when launching the NM Healthcare AI Forum). This foresight has been critical for ensuring long‐term sustainability, allowing the center to adjust its strategies in response to changing demands and new opportunities.
3.2. Infrastructure activities
Our commitment to advancing Collaborative AI in Healthcare is underscored by our dedication to constructing flagship and strategic datasets that bolster education and foster the growth of the research community. In the realm of epidemiology, landmark datasets like CARDIA 81 and MESA 82 have catalyzed multigenerational learning and research. However, the field of AI in healthcare has faced a gap in accessible, diverse, and comprehensive datasets that could serve a similar foundational role. To bridge this divide, we are leading the CRITICAL (Collaborative Resource for Intensive care Translational science, Informatics, Comprehensive Analytics, and Learning) consortium (U01TR003528), a collaborative effort between prestigious institutions including Northwestern University, MIT, Tufts Medical Center, University of Alabama at Birmingham, and Washington University.
Through the CRITICAL consortium, we are building a large and shared data, research, and educational platform that harmonizes comprehensive data from ICU patients across multiple institutions to drive next‐generation collaborative AI, benefiting investigators not only within participating institutions but also from the entire digital health community. The CRITICAL platform will support fair and generalizable ML models for advanced patient monitoring and decision support, featuring rich data, particularly on critically ill patients, by having inpatient and outpatient data pre‐, during‐ and post‐ICU admission. In the coming years, we will expand the CRITICAL consortium to cover more of the nation's Clinical and Translational Science Award (CTSA) program hubs.
We also plan to integrate multi‐omic data modalities with phenotypic data, following the late‐fusion strategy 83 that proved to work well in our pilot investigations on understanding complex diseases such as autism 48 and informing drug repurposing prioritization. 84 Building such a comprehensive, multimodal, and diverse data infrastructure is more than an ambitious goal; it is a necessary step toward realizing the full potential of healthcare AI. By establishing a national‐scale flagship dataset spanning multiple disease domains, we aim to power the next wave of innovation in healthcare AI, providing a robust platform for cutting‐edge research and educational endeavors. This vision represents not just the future of AI in healthcare but a fundamental shift toward more informed, holistic, and effective learning health systems.
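As a rough illustration of the late‐fusion idea mentioned above (each modality is modeled separately and only the model outputs are combined), the sketch below takes a weighted average of per‐modality predicted probabilities. It is a minimal, generic example and not the method of reference 83 or of our pilot studies; the function name `late_fusion`, the modality names, the weights, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def late_fusion(prob_by_modality, weights=None):
    """Combine per-modality predicted probabilities by a weighted average.

    prob_by_modality: dict mapping a modality name to a sequence of shape
        (n_samples,) with that modality's predicted probability of the outcome.
    weights: optional dict of non-negative weights per modality (hypothetical);
        equal weights are used when omitted.
    """
    names = sorted(prob_by_modality)
    # One row per modality, one column per sample.
    probs = np.stack([np.asarray(prob_by_modality[n], dtype=float) for n in names])
    if weights is None:
        w = np.ones(len(names))
    else:
        w = np.array([weights[n] for n in names], dtype=float)
    w = w / w.sum()  # normalize so the fused value stays a probability
    return w @ probs


# Toy predictions from two separately trained (hypothetical) models.
predictions = {
    "phenotype": [0.9, 0.2, 0.6],
    "omics": [0.7, 0.4, 0.2],
}
fused_equal = late_fusion(predictions)
fused_weighted = late_fusion(predictions, weights={"phenotype": 3, "omics": 1})
```

Because fusion happens only at the prediction level, each modality can keep its own preprocessing and model class, and a modality can be added or dropped without retraining the others.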
3.3. Training activities
Our commitment to fostering a vibrant AI‐ready healthcare workforce spans all career stages, emphasizing community and capacity building through comprehensive training initiatives. We have instituted a mandatory curriculum in digital health and data science for all MD program students, equipping future physicians with the essential AI/ML competencies needed for modern healthcare practice. Additionally, for three consecutive years, the Bluhm Cardiovascular Institute has offered a prestigious 1‐year AI Fellowship in Cardiovascular Disease, coupled with a Master's in AI from the NU McCormick School of Engineering, available to select cardiologists or cardiac surgeons.
To further enhance AI healthcare education, we are scaling up the AI4H Clinic, creating opportunities for faculty engaged in AI health to mentor a growing workforce. This expansion aims to encompass a wider and more diverse group, including clinicians, students, and health system staff. Plans are underway to broaden our AI Health courses with modular introductory sessions. The launch of the biweekly Northwestern Medicine Healthcare AI Forum makes AI/ML literacy accessible to all healthcare professionals, patients, and scientists. The AI4H Clinic sessions themselves are set to become regular biweekly events year‐round, offering specialized consulting to foster AI health initiatives within our hospitals and strengthen ties with the medical school.
Supporting these endeavors are collaborative efforts across campus, involving NUCATS, the Northwestern University Institute for AI in Medicine, the Department of Medical Education, and more, all working together to create a supportive ecosystem for AI in healthcare. 85 Our educational resources, including tutorials and course modules, are openly available, adhering to FAIR principles (Findable, Accessible, Interoperable, and Reusable). We are also incorporating CARE Principles (Collective Benefit, Authority to Control, Responsibility, and Ethics) for Indigenous Data Governance 86 to ensure our training programs are equity‐focused and interdisciplinary. 87
Our training activities not only prepare individuals for the technical demands of healthcare AI but also foster an understanding of its practical implications, as demonstrated by our early analysis during the COVID‐19 pandemic. This analysis identified outpatient metformin usage as a predictor of inpatient outcomes, leading to our participation in a multisite clinical trial to explore metformin's potential in reducing severe COVID‐19 outcomes. 88 Through these comprehensive training and research efforts, we aim to cultivate an AI‐literate healthcare workforce capable of driving forward the principles of a learning health system.
3.4. Translation activities while incorporating patient perspectives
Our commitment to transforming healthcare through AI and ML extends to developing a comprehensive suite of models and analytical resources that are not only accurate and interpretable, but also adept at harnessing diverse healthcare data modalities. By integrating structured electronic health record (EHR) data, unstructured clinical notes, multi‐omics, and medical imaging, we adopt an integrative approach that enhances disease diagnosis accuracy and the development of targeted therapies. 89 For example, as mentioned in previous sections, our adoption of bulk NLP for automating breast cancer recurrence registration, previously a manual task, has significantly improved data management and impacted clinical practices within our health system.
We have also seen transformative changes in clinical workflows, such as the integration of preoperative MRI evaluations, which has led to more judicious use of preoperative MRIs. 43 In cardiovascular care, information extracted through bulk NLP has facilitated the creation and implementation of ML models that identify patients at risk of progressing from moderate to severe heart failure—a critical condition often missed in primary care. These models are embedded within a human‐in‐the‐loop workflow at Northwestern Medicine, enhancing patient identification for specialist evaluation and timely interventions, including the implantation of left ventricular assist devices, thereby markedly improving patient outcomes. 60
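To make the human‐in‐the‐loop pattern described above more concrete, the sketch below shows one simple way such a workflow could be organized: the model only proposes candidates, and a clinician decides whether each one is referred. This is an illustrative assumption, not the deployed Northwestern Medicine system described in reference 60; the class `Patient`, the functions `flag_for_review` and `record_decisions`, the threshold, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    risk_score: float  # hypothetical model-estimated risk of progression, in [0, 1]


def flag_for_review(patients, threshold=0.5):
    """Return patients at or above the threshold, highest risk first.

    The model output is only a suggestion; nothing is acted on automatically.
    """
    flagged = [p for p in patients if p.risk_score >= threshold]
    return sorted(flagged, key=lambda p: p.risk_score, reverse=True)


def record_decisions(flagged, clinician_decisions):
    """Pair each flagged patient with the clinician's decision.

    True means refer, False means do not refer, None means not yet reviewed.
    """
    return {p.patient_id: clinician_decisions.get(p.patient_id) for p in flagged}


patients = [Patient("A", 0.62), Patient("B", 0.10), Patient("C", 0.91)]
flagged = flag_for_review(patients, threshold=0.5)
decisions = record_decisions(flagged, {"C": True})
```

Keeping the clinician decision as a separate, recorded step also yields labeled feedback that could later be used to audit or recalibrate the model.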
Recognizing the paramount importance of patient involvement, Collaborative AI in Healthcare is actively exploring avenues to engage patients more directly in the design, development, and deployment of AI solutions. Efforts underway include the establishment of patient advisory councils and the integration of patient‐generated data into AI models. As we move forward, Collaborative AI in Healthcare is dedicated to fostering deeper partnerships with patients. Our goal is to democratize AI in healthcare by developing technologies that are not only for patients but also shaped by them. This commitment to patient‐centered innovation ensures that our AI solutions are both relevant and responsive to the people they are designed to benefit. By placing patients at the heart of our AI endeavors, we aim to build a healthcare ecosystem that is inclusive, equitable, and attuned to the diverse needs of the communities we serve.
Looking forward, we are dedicated to continuously assessing the equity and ethical considerations of our AI/ML implementations. This commitment extends to our health system's community clinics, ensuring that our innovations benefit a broad patient base. Through ongoing training initiatives, we aim to equip a multidisciplinary workforce with the tools necessary to unlock the full potential of learning health systems. This strategy not only leverages multimodal health data for a comprehensive understanding of health and disease but also ensures that our healthcare system evolves to meet the needs of all patients, underpinning our vision for a more informed, equitable, and effective healthcare future.
3.5. Paradigm shift to proactive AI to address dynamic healthcare challenges
The shift from reactive to proactive AI represents a transformative approach in healthcare, moving beyond traditional AI systems that respond to preselected data and features. Proactive AI is designed to be inherently collaborative within a learning health system, adept at navigating the complexities of dynamic healthcare challenges such as data shifts and biases. It leverages dual feedback loops: the first (level 1) uses algorithms like deep learning to autonomously learn features, while the second (level 2) employs generative AI and reinforcement learning (RL) to enhance data quality, identify gaps, and refine data collection processes. 90 This methodology has already seen success, notably in Greece, where AI‐driven strategies informed COVID‐19 testing resource allocation and facilitated real‐time case data collection. 91
Proactive AI's strength lies in its ability to foster continuous evolution of solutions, stay abreast of knowledge advancements, and promote integration of diverse health data sources, ensuring the health system is both adaptive and collaborative. This adaptability is crucial for addressing the challenges of modern healthcare and maximizing the benefits of AI in medicine. Building on this foundation, we are exploring deep RL models to develop dynamic policies for selecting lab test panels based on prior patient observations, aiming for accurate diagnoses at reduced costs. 92 Furthermore, we are applying deep RL to devise strategies for the fair and efficient allocation of healthcare resources during crises, continuously adapting to changing conditions and mitigating biases inherent in historical data. 93
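One generic way to frame the lab test selection problem above is as a sequential decision process in which ordering a test incurs a cost and a correct final diagnosis earns a reward. The formulation below is a sketch under stated assumptions, not the formulation used in reference 92; the symbols (state $s_t$, per‐test cost $c_j$, trade‐off weight $\lambda$, horizon $T$) are introduced here only for illustration.

```latex
\begin{aligned}
s_t &= \text{test results observed for the patient up to step } t, \\
a_t &\in \{\text{order test } j\} \cup \{\text{stop and diagnose } \hat{y}\}, \\
r_t &=
\begin{cases}
-\lambda\, c_j & \text{if } a_t \text{ orders test } j, \\
\mathbb{1}[\hat{y} = y] & \text{if } a_t \text{ stops with diagnosis } \hat{y},
\end{cases} \\
\pi^{*} &= \arg\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} r_t\right].
\end{aligned}
```

Under this framing, a larger $\lambda$ pushes the learned policy toward ordering fewer or cheaper tests, while a smaller $\lambda$ favors diagnostic accuracy.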
Looking ahead, our focus on proactive AI opens new pathways for innovation in healthcare. We aim to expand its application across more clinical scenarios, from personalized patient care plans to optimizing hospital workflows, ensuring that our health system not only responds to current needs but anticipates future challenges. This forward‐thinking approach will enable us to harness the full potential of AI in healthcare, leading to more effective, efficient, and equitable patient care.
4. CONCLUSION
As the concept of the learning health system matures into a working reality, we have taken the initiative to establish education and practice resources that function as a hub of collaborative AI expertise. These resources will assist a multidisciplinary workforce of clinicians and scientists in successfully applying AI methods in their daily research and in practicing learning health system principles. Our bulk NLP pipelines and resulting data marts democratize access to unstructured health information; our continued efforts in creating easy‐to‐use AI/ML tools disseminate analysis resources for multimodal healthcare data; and our AI4H clinic democratizes access to collaborative education resources. Together, they foster tighter integration among practicing clinicians, scientists, and AI researchers. By continuing and expanding these efforts, and by disseminating collaborative AI resources and lending them material support, we will continue to enable collaborators to advance clinical research and translational sciences and to support and sustain the development of learning health systems.
FUNDING INFORMATION
This work was supported by the National Institutes of Health (NIH). Grant numbers are R01NS110779, R01LM013337, U01TR003528, U54HL160273, UL1TR001422, U24LM013751, and 1OT2DB000013‐01.
CONFLICT OF INTEREST STATEMENT
This work was supported by grants from the National Institutes of Health (NIH): Yuan Luo, Kristi Holmes, Luke Rasmussen, Andrew Naidech, Lazaro Sanchez‐Pinto, Richard Wunderink, Jennifer Pacheco, Matthew Carson, Susan Clare. Kristi Holmes is a member of the Learning Health Systems Editorial Board. Donald Lloyd‐Jones serves as a board member of the American Heart Association. Michael Markl receives grant support from Siemens and Circle Cardiovascular Imaging and is co‐founder and co‐owner of Third Coast Dynamics. Susanna McColley reports grants from the NIH National Center for Advancing Translational Science, the Centers for Disease Control and Prevention, the Cystic Fibrosis Foundation, and the Rosenau Family Research Foundation. She receives compensation as an advisor to Vertex Pharmaceuticals, Inc. Huiping Liu is the scientific co‐founder of ExoMira Medicine. Justin Starren reports grants from the NIH and Greenwall Foundation. Theresa Walunas receives research funding from Gilead Sciences. Kelly Michelson reports grants from the NIH, Greenwall Foundation, and the Patient‐Centered Outcomes Research Institute. Richard D’Aquila reports grants from the NIH, serving on external advisory boards for NIH‐funded projects, serving on the NIAID AIDS Research Advisory Council, and serving on the editorial board of the Journal of Clinical Investigation. Abel Kho is an advisor to Datavant. Sanjiv Shah is supported by grants from the NIH and AHA. Lee Cooper reports grants from the NIH and has invention disclosures registered at the Northwestern Office of Innovation and New Ventures, consults for Tempus, and advises Veracyte and Targeted Bioscience. Feng Yue is supported by grants from NIH and is a co‐founder of Sariant Therapeutics, Inc. Deyu Fang is co‐founder of ExoMira Medicine. Ronald Ackermann is supported by grants from the NIH, CDC, and the UnitedHealth Group.
Luo Y, Mao C, Sanchez‐Pinto LN, et al. Northwestern University resource and education development initiatives to advance collaborative artificial intelligence across the learning health system. Learn Health Sys. 2024;8(3):e10417. doi: 10.1002/lrh2.10417
REFERENCES
- 1. Russell SJ. Artificial intelligence: a modern approach. Boston, MA: Pearson Education, Inc.; 2010.
- 2. Will artificial intelligence replace doctors? https://www.aamc.org/news-insights/will-artificial-intelligence-replace-doctors
- 3. Friedman CP. What is unique about learning health systems? Learn Health Syst. 2022;6(3):e10328.
- 4. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1(5):206‐215.
- 5. Augmented Intelligence for Healthcare Clinics Provide Arena to Foster Collaboration. https://www.nucats.northwestern.edu/news/2022/ai4h.html
- 6. World Health Organization. Ethics and Governance of Artificial Intelligence for Health: WHO Guidance. 2021.
- 7. Johnson AE, Pollard TJ, Shen L, et al. MIMIC‐III, a freely accessible critical care database. Sci Data. 2016;3:160035.
- 8. Zimmerman L, Reyfman PA, Smith AD, et al. Early prediction of acute kidney injury following ICU admission. BMC Med Inform Decis Mak. 2019;19(S1):5‐16.
- 9. Li Y, Yao L, Mao C, Srivastava A, Jiang X, Luo Y. Early prediction of acute kidney injury in critical care setting using clinical notes. Proceedings (IEEE Int Conf Bioinformatics Biomed). 2018;2018:683‐686.
- 10. Mao C, Yao L, Luo Y. A pre‐trained clinical language model for acute kidney injury. 2020 IEEE International Conference on Healthcare Informatics (ICHI). Oldenburg: IEEE; 2020:1‐2.
- 11. Eickelberg G, Sanchez‐Pinto LN, Luo Y. Predictive modeling of bacterial infections and antibiotic therapy needs in critically ill adults. J Biomed Inform. 2020;109:103540.
- 12. Yeh P, Pan Y, Sanchez‐Pinto LN, Luo Y. Hyperchloremia in critically ill patients: association with outcomes and prediction using electronic health record data. BMC Med Inform Decis Mak. 2020;20(14):1‐10.
- 13. Sanchez‐Pinto LN, Stroup EK, Pendergrast T, Pinto N, Luo Y. Derivation and validation of novel phenotypes of multiple organ dysfunction syndrome in critically ill children. JAMA Netw Open. 2020;3(8):e209271.
- 14. Bunney G, Tran S, Han S, et al. Using machine learning to predict hospital disposition with geriatric emergency department innovation intervention. Ann Emerg Med. 2022;81(3):353‐363.
- 15. Nadkarni PM, Ohno‐Machado L, Chapman WW. Natural language processing: an introduction. J Am Med Inform Assoc. 2011;18(5):544‐551.
- 16. Eickelberg G, Luo Y, Sanchez‐Pinto LN. Development and validation of MicrobEx: an open‐source package for microbiology culture concept extraction. JAMIA Open. 2022;5(2):ooac026.
- 17. Chowdhury S, Zhang C, Yu PS, Luo Y. Med2Meta: learning representations of medical concepts with meta‐Embeddings. Healthinf. 2020;2020:369‐376.
- 18. Luo Y, Cheng Y, Uzuner O, Szolovits P, Starren J. Segment convolutional neural networks (Seg‐CNNs) for classifying relations in clinical notes. J Am Med Inform Assoc. 2018;25(1):93‐98.
- 19. Li Y, Jin R, Luo Y. Classifying relations in clinical narratives using segment graph convolutional and recurrent neural networks (Seg‐GCRNs). J Am Med Inform Assoc. 2018;26(3):262‐268.
- 20. Luo Y. Recurrent neural networks for classifying relations in clinical notes. J Biomed Inform. 2017;72:85‐95.
- 21. Luo Y, Sohani AR, Hochberg EP, Szolovits P. Automatic lymphoma classification with sentence subgraph mining from pathology reports. J Am Med Inform Assoc. 2014;21(5):824‐832.
- 22. Luo Y, Uzuner Ö, Szolovits P. Bridging semantics and syntax with graph algorithms—state‐of‐the‐art of extracting biomedical relations. Brief Bioinform. 2016;18(1):160‐178.
- 23. Yao L, Mao C, Luo Y. Graph convolutional networks for text classification. AAAI. 2019;33:7370‐7377.
- 24. Sharma H, Mao C, Zhang Y, et al. Developing a portable natural language processing based phenotyping system. BMC Med Inform Decis Mak. 2019;19(3):79‐87.
- 25. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proceedings. AMIA Symposium. Washington, DC: American Medical Informatics Association; 2001;2001:17.
- 26. Zeng Z, Yao L, Roy A, et al. Identifying breast cancer distant recurrences from electronic health records using machine learning. J Healthc Inform Res. 2019;3:283‐299.
- 27. Wang H, Li Y, Khan SA, Luo Y. Prediction of breast cancer distant recurrence using natural language processing and knowledge‐guided convolutional neural network. Artif Intell Med. 2020;110:101977.
- 28. Zeng Z, Espino S, Roy A, et al. Using natural language processing and machine learning to identify breast cancer local recurrence. BMC Bioinformatics. 2018;19(17):65‐74.
- 29. Zeng Z, Li X, Espino S, et al. Contralateral breast cancer event detection using natural language processing. AMIA Annual Symposium Proceedings; 2017. Washington, DC: American Medical Informatics Association; 2017:1885‐1892.
- 30. Zeng Z, Deng Y, Li X, Naumann T, Luo Y. Natural language processing for EHR‐based computational phenotyping. IEEE/ACM Trans Comput Biol Bioinform. 2018;16(1):139‐153.
- 31. Yao L, Mao C, Luo Y. Clinical text classification with rule‐based features and knowledge‐guided convolutional neural networks. BMC Med Inform Decis Mak. 2019;19(3):71.
- 32. Shin J, Li Y, Luo Y. Early prediction of mortality in critical care setting in sepsis patients using structured features and unstructured clinical notes. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX: IEEE; 2021:2885‐2890.
- 33. Sun M, Baron J, Dighe A, et al. Early prediction of acute kidney injury in critical care setting using clinical notes and structured multivariate physiological measurements. Stud Health Technol Inform. 2019;264:368‐372.
- 34. Mao C, Xu J, Rasmussen L, et al. AD‐BERT: using pre‐trained contextualized embeddings to predict the progression from mild cognitive impairment to Alzheimer's disease. arXiv Preprint arXiv:221206042. 2022;144:104442.
- 35. Wang HY, Li YK, Hutch M, Naidech A, Luo Y. Using tweets to understand how COVID‐19‐related health beliefs are affected in the age of social media: twitter data analysis study. J Med Internet Res. 2021;23(2):e26302.
- 36. Luo Y, Thompson W, Herr T, et al. Natural language processing for EHR‐based pharmacovigilance: a structured review. Drug Saf. 2017;40:1075‐1089. doi: 10.1007/s40264-017-0558-6
- 37. Zhao Y, Ison MG, Luo Y. COVID vaccine and cardiovascular risks: a natural language analysis of vaccine adverse event reports. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX: IEEE; 2021:614‐618.
- 38. NMEDW. Breast Cancer Datamart. https://www.nucats.northwestern.edu/news/2020/breast-cancer-dataset.html
- 39. Li Y, Luo Y. Optimizing the evaluation of gene‐targeted panels for tumor mutational burden estimation. Sci Rep. 2021;11(1):21072.
- 40. Zeng Z, Mao C, Vo A, et al. Deep learning for cancer type classification and driver gene identification. BMC Bioinformatics. 2021;22(4):1‐13.
- 41. Zeng Z, Vo A, Li X, et al. Somatic genetic aberrations in benign breast disease and the risk of subsequent breast cancer. NPJ Breast Cancer. 2020;6:24.
- 42. Hophan SL, Odnokoz O, Liu H, et al. Ductal carcinoma in situ of breast: from molecular etiology to therapeutic management. Endocrinology. 2022;163(4):bqac027.
- 43. Zeng Z, Amin A, Roy A, et al. Preoperative magnetic resonance imaging use and oncologic outcomes in premenopausal breast cancer patients. NPJ Breast Cancer. 2020;6(1):1‐8.
- 44. Zeng Z, Jiang X, Li X, Wells A, Luo Y, Neapolitan R. Conjugated equine estrogen and medroxyprogesterone acetate are associated with decreased risk of breast cancer relative to bioidentical hormone therapy and controls. PLoS ONE. 2018;13(5):e0197064.
- 45. Adekkanattu P, Jiang G, Luo Y, et al. Evaluating the portability of an NLP system for processing echocardiograms: a retrospective, multi‐site observational study. AMIA Annual Symposium Proceedings; 2019. Washington, DC: American Medical Informatics Association; 2019:190.
- 46. Lal JC, Zhou Y, Gore‐Panter SR, et al. Network‐based prediction and functional validation of metformin for potential treatment of atrial fibrillation using human inducible pluripotent stem cell‐derived atrial‐like cardiomyocytes. bioRxiv. 2021; 2021.09.17.460826.
- 47. Lal JC, Mao CS, Zhou YD, et al. Transcriptomics‐based network medicine approach identifies metformin as a repurposable drug for atrial fibrillation. Cell Rep Med. 2022;3(10):100749.
- 48. Luo Y, Eran A, Palmer N, et al. A multidimensional precision medicine approach identifies an autism subtype characterized by dyslipidemia. Nat Med. 2020;26:1375‐1379.
- 49. Mao C, Yao L, Pan Y, Luo Y, Zeng Z. Deep generative classifiers for thoracic disease diagnosis with chest X‐ray images. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). New York City: IEEE; 2018:1209‐1214.
- 50. Mao CS, Yao L, Luo Y. ImageGCN: multi‐relational image graph convolutional networks for disease identification with chest X‐rays. IEEE Trans Med Imaging. 2022;41(8):1990‐2003.
- 51. Yang MQ, Zhang YH, Chen HN, et al. AX‐Unet: a deep learning framework for image segmentation to assist pancreatic tumor diagnosis. Front Oncol. 2022;12:894970.
- 52. Shah SJ, Katz DH, Selvaraj S, et al. Phenomapping for novel classification of heart failure with preserved ejection fraction. Circulation. 2015;131(3):269‐279.
- 53. Luo Y, Mao C, Yang Y, et al. Integrating hypertension phenotype and genotype with hybrid non‐negative matrix factorization. Bioinformatics. 2019;35(8):1395‐1403.
- 54. Luo Y, Mao CS. PANTHER: pathway augmented nonnegative tensor factorization for HighER‐order feature learning. Proc AAAI Conf Artif Intell. 2021;35:371‐380.
- 55. Li Y, Shah SJ, Arnett D, Irvin R, Luo Y. SNPs filtered by allele frequency improve the prediction of hypertension subtypes. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX: IEEE; 2021:2796‐2802. [Google Scholar]
- 56. Ma Y, Jiang H, Shah SJ, Arnett D, Irvin MR, Luo Y. Genetic‐based hypertension subtype identification using informative SNPs. Genes. 2020;11(11):1265. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57. Ahmad FS, Luo Y, Wehbe RM, Thomas JD, Shah SJ. Advances in machine learning approaches to Heart failure with preserved ejection fraction. Heart Fail Clin. 2022;18(2):287‐300. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58. Shah SJ. Misfolded transthyretin as a novel risk factor for Heart failure: a Rich history with implications for future diagnosis and treatment. JAMA Cardiol. 2020;6:255. [DOI] [PubMed] [Google Scholar]
- 59. Maurer MS, Schwartz JH, Gundapaneni B, et al. Tafamidis treatment for patients with transthyretin amyloid cardiomyopathy. N Engl J Med. 2018;379(11):1007‐1016.
- 60. Cheema B, Mutharasan RK, Sharma A, et al. Augmented intelligence to identify patients with advanced heart failure in an integrated health system. JACC Adv. 2022;1(4):1‐11.
- 61. Kou WJ, Carlson DA, Baumann AJ, et al. A deep‐learning‐based unsupervised model on esophageal manometry using variational autoencoder. Artif Intell Med. 2021;112:102006.
- 62. Zeng Z, Li Y, Li Y, Luo Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol. 2022;23(1):1‐23.
- 63. Li Y, Dennis S, Hutch MR, Ding Y, Zhou Y, Li Y, Pillai M, Ghotbaldini S, Garcia MA, Broad MS, Mao C, Cheng F, Zeng Z, Luo Y. SOAR elucidates disease mechanisms and empowers drug discovery through spatial transcriptomics. bioRxiv. 2022. doi:10.1101/2022.04.17.488596
- 64. van Buuren S, Groothuis‐Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):1‐67.
- 65. Luo Y, Szolovits P, Dighe AS, Baron JM. Using machine learning to predict laboratory test results. Am J Clin Pathol. 2016;145(6):778‐788.
- 66. Stekhoven DJ, Bühlmann P. MissForest—non‐parametric missing value imputation for mixed‐type data. Bioinformatics. 2012;28(1):112‐118.
- 67. Luo Y, Szolovits P, Dighe AS, Baron JM. 3D‐MICE: integration of cross‐sectional and longitudinal imputation for multi‐analyte longitudinal clinical data. J Am Med Inform Assoc. 2017;25(6):645‐653.
- 68. Deng Y, Chang C, Ido MS, Long Q. Multiple imputation for general missing data patterns in the presence of high‐dimensional data. Sci Rep. 2016;6:21689.
- 69. Cao W, Wang D, Li J, Zhou H, Li L, Li Y. BRITS: bidirectional recurrent imputation for time series. Proceedings of the 32nd International Conference on Neural Information Processing Systems. Boston, MA: Curran Associates; 2018:6775‐6785.
- 70. Luo Y, Cai X, Zhang Y, Xu J, Yuan X. Multivariate time series imputation with generative adversarial networks. Proceedings of the 32nd International Conference on Neural Information Processing Systems. Boston, MA: Curran Associates; 2018:1603‐1614.
- 71. Che Z, Purushotham S, Cho K, Sontag D, Liu Y. Recurrent neural networks for multivariate time series with missing values. arXiv Preprint arXiv:1606.01865. 2016;1‐14.
- 72. Mao CS, Yao L, Luo Y. MedGCN: medication recommendation and lab test imputation via graph convolutional networks. J Biomed Inform. 2022;127.
- 73. Ding MH, Luo Y. Unsupervised phenotyping of sepsis using nonnegative matrix factorization of temporal trends from a multivariate panel of physiological measurements. BMC Med Inform Decis Mak. 2021;21(Suppl 5):95.
- 74. Zhao Y, Luo Y. Unsupervised learning to subphenotype delirium patients from electronic health records. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). New York City: IEEE; 2021:2949‐2961.
- 75. Luo Y. Evaluating the state of the art in missing data imputation for clinical data. Brief Bioinform. 2022;23(1):bbab489.
- 76. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447‐453.
- 77. Wang H, Li Y, Naidech A, Luo Y. Comparison between machine learning methods for mortality prediction for sepsis patients with different social determinants. BMC Med Inform Decis Mak. 2022;22(2):1‐13.
- 78. Bhavani SV, Luo Y, Miller WD, et al. Simulation of ventilator allocation in critically ill patients with COVID‐19. Am J Respir Crit Care Med. 2021;204(10):1224‐1227.
- 79. Hutch MR, Liu ML, Avillach P, Luo Y, Bourgeois FT; Consortium for Clinical Characterization of COVID‐19 by EHR (4CE). National trends in disease activity for COVID‐19 among children in the US. Front Pediatr. 2021;9:700656.
- 80. Li Y, Wang H, Luo Y. Improving fairness in the prediction of heart failure length of stay and mortality by integrating social determinants of health. Circ Heart Fail. 2022;15(11):e009473.
- 81. National Heart, Lung, and Blood Institute. Coronary Artery Risk Development in Young Adults (CARDIA). 2005.
- 82. Bild DE, Bluemke DA, Burke GL, et al. Multi‐ethnic study of atherosclerosis: objectives and design. Am J Epidemiol. 2002;156(9):871‐881.
- 83. Kline A, Wang H, Li Y, et al. Multimodal machine learning in precision health. NPJ Digit Med. 2022;5(1):1‐14.
- 84. Zhou Y, Liu Y, Gupta S, et al. A comprehensive SARS‐CoV‐2–human protein–protein interactome reveals COVID‐19 pathobiology and potential host therapeutic targets. Nat Biotechnol. 2022;41:1‐12.
- 85. Carson MB, Gonzales S, Shaw P, Schneider D, Holmes K. Bridging the gap: a library‐based collaboration to enhance data skills for clinical researchers. Learn Health Syst. 2022;7:e10339.
- 86. Carroll SR, Garba I, Figueroa‐Rodríguez OL, et al. The CARE Principles for Indigenous Data Governance. 2020.
- 87. Choi BC, Pak AW. Multidisciplinarity, interdisciplinarity and transdisciplinarity in health research, services, education and policy: 1. Definitions, objectives, and evidence of effectiveness. Clin Invest Med. 2006;29(6):351‐364.
- 88. Bramante CT, Huling JD, Tignanelli CJ, et al. Randomized trial of metformin, ivermectin, and fluvoxamine for Covid‐19. N Engl J Med. 2022;387(7):599‐610.
- 89. Kline A, Wang HY, Li YK, et al. Multimodal machine learning in precision health: a scoping review. NPJ Digit Med. 2022;5(1):171.
- 90. Luo Y, Wunderink RG, Lloyd‐Jones D. Proactive vs reactive machine learning in health care: lessons from the COVID‐19 pandemic. JAMA. 2022;327(7):623‐624.
- 91. Bastani H, Drakopoulos K, Gupta V, et al. Efficient and targeted COVID‐19 border testing via reinforcement learning. Nature. 2021;599:1‐11.
- 92. Yu Z, Li Y, Kim J, Huang K, Luo Y, Wang M. Deep reinforcement learning for cost‐effective medical diagnosis. The Eleventh International Conference on Learning Representations. 2023;1‐25.
- 93. Li Y, Mao C, Huang K, et al. Deep reinforcement learning for efficient and fair allocation of health care resources. arXiv Preprint arXiv:2309.08560. 2023;1‐17.
