Empowering Precision Medicine for Rare Diseases through Cloud Infrastructure Refactoring

Hui Li; Jinlian Wang; Hongfang Liu

2025 Jun 10;2025:300–311.

PMCID: PMC12150693 PMID: 40502250

Abstract

Rare diseases affect approximately 1 in 11 Americans, yet their diagnosis remains challenging due to limited clinical evidence, low awareness, and lack of definitive treatments. Our project aims to accelerate rare disease diagnosis by developing a comprehensive informatics framework leveraging data mining, semantic web technologies, deep learning, and graph-based embedding techniques. However, our on-premises computational infrastructure faces significant challenges in scalability, maintenance, and collaboration. This study focuses on developing and evaluating a cloud-based computing infrastructure to address these challenges. By migrating to a scalable, secure, and collaborative cloud environment, we aim to enhance data integration, support advanced predictive modeling for differential diagnoses, and facilitate widespread dissemination of research findings to stakeholders, the research community, and the public. We also propose a reliable, standardized migration workflow designed to ensure minimal disruption and maintain data integrity for existing research projects.

Introduction

Rare diseases, though individually uncommon, collectively impact approximately 1 in 11 Americans ¹,². Patients often endure lengthy diagnostic delays due to limited clinical evidence, low awareness among healthcare professionals, and lack of definitive treatments ³,⁴. Our parent project focuses on developing a comprehensive informatics framework to enhance knowledge acquisition, curation, and application in rare disease diagnosis ⁵,⁶.

Despite progress, our on-premises computational infrastructure faces significant challenges, including limited scalability, high maintenance costs, and obstacles to collaboration and data sharing. These limitations hinder our ability to process large datasets, apply advanced artificial intelligence (AI) techniques, and collaborate effectively with the broader research community.

Methods

We employed a structured approach to migrate our computational infrastructure to the cloud, encompassing design, implementation, and evaluation phases. This transition is facilitated through a reliable, standardized workflow designed to ensure minimal disruption and maintain data integrity. A critical component of this process is a mapping dashboard. This tool provides users with detailed questionnaires that guide them through the migration process, keeping the process consistent and performing security checks at every stage.

The proposed cloud-based platform is designed to tackle the challenges of performance and resource management by utilizing the scalability of cloud infrastructure. The platform has strong potential to meet the demanding requirements of the parent grant, including extensive storage, substantial memory, and high GPU demands. By scaling resources dynamically, the platform can handle varying workloads and deliver the necessary computational power when it is needed.

We plan to offer a robust and stable cloud-based collaborative platform built on industry-leading services. This platform seamlessly integrates diverse data types and promotes extensive collaboration among researchers from various rare disease centers. The platform will be designed to be scalable, encouraging wide-ranging contributions from the research community and enabling the incorporation of new projects and data sources.

A key aspect of our innovation is the full utilization of cloud storage capabilities, integrating cloud expertise and advanced knowledge databases, including knowledge graphs. This enhances the system’s ability to manage and analyze large datasets effectively. By using standardized protocols, our platform ensures interoperability with existing healthcare systems, promoting data consistency and facilitating the seamless exchange of information 3340. Furthermore, our project includes the facilitation of NIH cloud account applications and usage, incorporating concepts such as Minimum Viable Systems (MVS) to validate and optimize the cloud-based implementations. This approach ensures that our cloud infrastructure is robust, reliable, and ready for large-scale deployment.

The comprehensive mapping system we propose will align research components with appropriate cloud services, ensuring effective integration into the cloud ecosystem. For any component that cannot be fully migrated, we will ensure significant integration and interaction with cloud services to maximize compatibility and functionality. By leveraging the full potential of cloud-based infrastructure and services, we will overcome the limitations of traditional infrastructure. Our approach enhances the parent grant’s capabilities to conduct and share advanced rare disease research globally, fostering a more collaborative, efficient, and impactful research environment. This transformation will facilitate the discovery of new insights, improve diagnostic accuracy, and ultimately contribute to better outcomes for patients with rare diseases. The standard workflow is shown in Figure 1.

Design and development. We will develop a phased migration plan to minimize risk and ensure a smooth transition across development, testing, staging, and production environments. We will break the migration down into specific tasks, assign responsibilities and timelines for each task, and use project management tools to track progress. For the data migration strategy, we will create an inventory of all data sources, including structured data (databases), unstructured data (files, documents), and semi-structured data (logs), and map existing data structures to the new cloud environment, ensuring compatibility and integrity. Based on data size, sensitivity, and required speed, we will select appropriate data transfer methods (e.g., direct database transfer, cloud storage migration tools, secure file transfer protocols). For application migration, we will list all applications that need to be migrated, including AI models, web applications, and analytics tools. We plan to use containerization (e.g., Docker) to package applications and their dependencies, ensuring consistent environments across development, testing, and production. We assume that the cloud service customers will be research groups, and we will map the selected requirements to related items, defining the matching requirements for both cloud service providers and customers (see Figure 3). For example, the NBDC, NIH, and MIC/METI guidelines require cloud service providers to implement communication encryption between cloud servers and customer terminals. Additionally, the MHLW and NBDC guidelines require users (cloud service customers) to access servers via encrypted communication. They also mandate that environments containing sensitive data be isolated from other environments, such as data pipelines, dashboards, or APIs.
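The data inventory and the choice of transfer method described above could be sketched as follows. This is a minimal illustration only: the size threshold, the decision rule, and the example data sources are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    STRUCTURED = "structured"            # databases
    UNSTRUCTURED = "unstructured"        # files, documents
    SEMI_STRUCTURED = "semi-structured"  # logs


@dataclass(frozen=True)
class DataSource:
    name: str
    kind: Kind
    size_gb: float
    sensitive: bool


# Hypothetical cut-off for illustration; the paper does not state one.
BULK_THRESHOLD_GB = 500.0


def choose_transfer_method(src: DataSource) -> str:
    """Pick one of the transfer options named in the text for a data source."""
    if src.kind is Kind.STRUCTURED:
        return "direct database transfer"
    if src.sensitive and src.size_gb < BULK_THRESHOLD_GB:
        return "secure file transfer protocol"
    return "cloud storage migration tool"


# Hypothetical inventory entries, one per data kind.
inventory = [
    DataSource("clinical_db", Kind.STRUCTURED, 120.0, True),
    DataSource("consent_forms", Kind.UNSTRUCTURED, 3.5, True),
    DataSource("pipeline_logs", Kind.SEMI_STRUCTURED, 900.0, False),
]
plan = {src.name: choose_transfer_method(src) for src in inventory}
```

Keeping the rule in one small function makes the selection auditable: the same inventory always yields the same plan, and a changed threshold is visible in a single place.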

Figure 3. — Left: a questionnaire for AWS service selection and the scoring function. Right: the map from research components to AWS services.

Assessment and planning. We will review and document current workflows, including databases, applications, storage systems, and network configurations. We will identify dependencies between different components, including data flows, application interactions, and external integrations, and establish performance benchmarks for current systems so that the cloud infrastructure can be shown to meet or exceed these benchmarks post-migration. Next, we will clearly define the goals of the migration, such as improved scalability, enhanced collaboration, cost savings, and increased security. To measure the success of the migration, we will establish criteria covering performance metrics, uptime, user satisfaction, and compliance with security standards. We will also list potential risks associated with the migration, such as data loss, downtime, security vulnerabilities, and compatibility issues. Before project refactoring, we will use UML (Unified Modeling Language) diagrams to create a design and will develop a checklist of security and data integrity requirements specific to cloud service providers and customers. The resulting checklist is shown in Table 1.

Table 1.

Comprehensive cloud migration checklist for research project

Category	Checklist	Purpose
Security	Data encryption	To encrypt data at rest and in transit.
Security	Access control	To implement identity and access management (IAM) policies.
Security	Data isolation	To isolate sensitive data environments from other environments (e.g., dashboards).
Security	Security monitoring	To detect and respond to threats.
Security	Compliance	To ensure compliance with relevant regulations and guidelines (e.g., NBDC, NIH, MIC/METI, MHLW).
Functionality	Service availability	To verify the cloud service’s availability and uptime commitments.
Functionality	Scalability	To meet the demands of large-scale data and computing needs.
Functionality	Integration	To be compatible and integrate with existing systems and tools.
Performance	Latency and throughput	To evaluate the service’s performance metrics.
Performance	Resource allocation	To allocate resources dynamically based on workload requirements.
Data management	Data backup and recovery	To provide regular data backups and disaster recovery.
Data management	Data transfer	To provide secure and efficient data transfer mechanisms.
Legal and compliance	Data residency	To verify the data residency requirements and ensure compliance with local and international laws.
Legal and compliance	Service level agreements	To review the SLAs for commitments on service availability, performance, and support.
Cost management	Cost estimation	To understand the pricing structure and potential costs.
Cost management	Budget controls	To implement budget controls and monitoring to manage cloud service expenses.
Support and maintenance	Vendor support	To evaluate the quality and availability of vendor support services.
Support and maintenance	Service updates	To ensure the cloud service provider regularly updates and patches its services.
Customization for research	Data privacy	To protect sensitive genomic data.
Customization for research	High-performance computing support	To verify support for high-performance computing (HPC) needs.
Customization for research	Specialized tools	To assess the availability of specialized tools and services for processing and analyzing the various types of rare disease data.
Customization for research	Ethical compliance	To ensure compliance with ethical guidelines for handling human genomic data.
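A checklist such as Table 1 can also be kept in machine-readable form so that open items are easy to report before each migration step. The sketch below is one possible representation; the completion states and the rule that only open security items block a migration are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    category: str
    name: str
    done: bool = False


def open_items_by_category(items):
    """Group the names of unfinished items under their category."""
    grouped = defaultdict(list)
    for item in items:
        if not item.done:
            grouped[item.category].append(item.name)
    return dict(grouped)


def ready_to_migrate(items, blocking_categories=("Security",)):
    """True when no item in a blocking category is still open."""
    return not any(
        item.category in blocking_categories and not item.done for item in items
    )


# A few rows of Table 1 with hypothetical completion states.
checklist = [
    ChecklistItem("Security", "Data encryption", done=True),
    ChecklistItem("Security", "Access control", done=False),
    ChecklistItem("Performance", "Latency and throughput", done=True),
    ChecklistItem("Cost management", "Budget controls", done=False),
]
```

With these example states, `ready_to_migrate(checklist)` stays false until the open access control item is completed, while the open budget item is reported but does not block.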

Design of a comprehensive mapping system.

We will conduct a thorough analysis of all research components, including data sources, applications, and workflows, to identify the specific requirements of each component in terms of storage, compute power, and data management. Each research component will be mapped to the most suitable cloud services provided by the NIH cloud platform, considering factors such as scalability, cost-effectiveness, and compatibility with existing workflows. The mapping process will be documented in detail, including the rationale for selecting specific cloud services for each component, to serve as a reference for future migrations and optimizations. We will implement a dashboard that allows stakeholders to answer questions about each component’s requirements and preferences, including questions on performance needs, cost sensitivity, and data security. We will use the responses from the questionnaire, together with a scoring function, to select the most appropriate AWS services for each component and complete the mapping process, as demonstrated in Figure X.
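The scoring function itself is not specified in the text. One plausible form, shown here purely as an assumption, is a weighted mean in which the questionnaire answers act as importance weights over per-service ratings. The candidate services and all rating values below are illustrative placeholders, not measured properties of the services.

```python
from typing import Dict

# Stakeholder answers on a 1-5 scale: how much each criterion matters.
Answers = Dict[str, int]

# Hypothetical 1-5 ratings per candidate service; illustrative numbers only.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "Amazon S3": {"performance": 3, "cost": 5, "security": 4},
    "Amazon RDS": {"performance": 4, "cost": 3, "security": 4},
    "Amazon Redshift": {"performance": 5, "cost": 2, "security": 4},
}


def score(answers: Answers, ratings: Dict[str, int]) -> float:
    """Weighted mean of service ratings, weighted by stakeholder importance."""
    total_weight = sum(answers.values())
    if total_weight == 0:
        raise ValueError("at least one criterion needs a non-zero weight")
    return sum(answers[c] * ratings[c] for c in answers) / total_weight


def rank_services(answers: Answers, candidates=CANDIDATES):
    """Return (service, score) pairs, best match first."""
    scored = [(name, score(answers, r)) for name, r in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A cost-sensitive component with modest performance needs.
cost_sensitive = {"performance": 1, "cost": 5, "security": 3}
best_service, best_score = rank_services(cost_sensitive)[0]
```

Returning the full ranking rather than only the winner supports the documentation goal stated above, since the rationale for each choice can be recorded alongside the runner-up scores.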

Integration of on-premises and cloud platform. For systems that cannot be fully migrated to the cloud, we will develop a hybrid integration strategy. This strategy will outline how to integrate on-premises systems with cloud services while maintaining data integrity and performance. Detailed integration protocols will be created to specify how data and applications interact between on-premises and cloud environments, covering data synchronization, secure data transfer, and interoperability standards. Middleware solutions will be developed to facilitate integration between on-premises systems and cloud services, and these solutions will be designed to be robust, secure, and scalable. To optimize data flows between on-premises and cloud environments, we will use cloud services for caching and data replication as needed to minimize latency and keep data available in near real time. A unified management interface will allow researchers to monitor and manage both on-premises and cloud resources from a single dashboard. This interface will provide visibility into resource usage, performance metrics, and system health.
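One common way to maintain data integrity during synchronization is to compare content hashes on both sides and transfer only files that differ. The sketch below illustrates that idea with the Python standard library; the manifest format and the function names are hypothetical, and how the cloud-side manifest is obtained is left open.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_sync(local_dir: Path, remote_manifest: dict) -> list:
    """List relative paths whose content is missing or different remotely.

    `remote_manifest` maps relative path -> SHA-256 hex digest; how it is
    obtained from the cloud side is left open here.
    """
    stale = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(local_dir).as_posix()
            if remote_manifest.get(rel) != sha256_of(path):
                stale.append(rel)
    return stale


# Small self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.csv").write_text("id,value\n1,2\n")
    (root / "b.csv").write_text("id,value\n3,4\n")
    manifest = {"a.csv": sha256_of(root / "a.csv"), "b.csv": "outdated"}
    pending = files_to_sync(root, manifest)
```

The same digests can be recomputed after transfer to confirm that the cloud copy matches the on-premises original.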

Phase 2: Integration of knowledge databases and knowledge graphs. We will begin by identifying and selecting appropriate knowledge databases that contain critical information for rare disease research, including disease, phenotype, genetic, clinical trial, and published literature data. To integrate these knowledge databases into the cloud environment, we will develop data ingestion pipelines using tools like AWS Glue, Azure Data Factory, or Google Cloud Dataflow. These tools will automate the extraction, transformation, and loading (ETL) processes. Additionally, we will implement data normalization processes to harmonize disparate data sources into a unified format, facilitating easier access and analysis. For the rare disease knowledge graph, we will design a comprehensive schema that captures the relationships between data entities such as patients, diseases, phenotypes, genes, diagnoses, and treatments. We will use graph databases such as Amazon Neptune, Azure Cosmos DB, or Neo4j on Google Cloud to support this schema. Data from the integrated databases will be linked to the knowledge graph and annotated with metadata to enhance searchability and context understanding, creating relationships between data points to form a rich, interconnected dataset. Finally, we will develop query interfaces and analytical tools to enable researchers to interact with the knowledge graph, extract insights, and generate hypotheses.
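To make the idea of a schema-constrained knowledge graph concrete, the following in-memory sketch validates each edge against a set of allowed triples and answers a simple phenotype-driven query. The entity labels follow the text, but the relation names, node identifiers, and ranking rule are hypothetical, and a production system would use one of the graph databases named above rather than Python dictionaries.

```python
from collections import defaultdict

# Allowed (source label, relation, target label) triples. The entity labels
# come from the text; the relation names are hypothetical.
SCHEMA = {
    ("Patient", "HAS_PHENOTYPE", "Phenotype"),
    ("Patient", "DIAGNOSED_WITH", "Disease"),
    ("Disease", "PRESENTS_WITH", "Phenotype"),
    ("Disease", "ASSOCIATED_WITH", "Gene"),
    ("Disease", "TREATED_BY", "Treatment"),
}


class KnowledgeGraph:
    def __init__(self):
        self.labels = {}               # node id -> label
        self.edges = defaultdict(set)  # (source id, relation) -> target ids

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, relation, target):
        """Add an edge only if its label triple is allowed by the schema."""
        triple = (self.labels[source], relation, self.labels[target])
        if triple not in SCHEMA:
            raise ValueError(f"edge not allowed by schema: {triple}")
        self.edges[(source, relation)].add(target)

    def diseases_for_phenotypes(self, phenotypes):
        """Rank diseases by how many of the given phenotypes they present."""
        wanted = set(phenotypes)
        hits = {}
        for (source, relation), targets in self.edges.items():
            if relation == "PRESENTS_WITH":
                overlap = len(wanted & targets)
                if overlap:
                    hits[source] = overlap
        return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


kg = KnowledgeGraph()
for node_id, label in [("d1", "Disease"), ("d2", "Disease"),
                       ("p1", "Phenotype"), ("p2", "Phenotype"), ("g1", "Gene")]:
    kg.add_node(node_id, label)
kg.add_edge("d1", "PRESENTS_WITH", "p1")
kg.add_edge("d1", "PRESENTS_WITH", "p2")
kg.add_edge("d2", "PRESENTS_WITH", "p2")
kg.add_edge("d1", "ASSOCIATED_WITH", "g1")
ranked = kg.diseases_for_phenotypes(["p1", "p2"])
```

Counting shared phenotypes is only the simplest possible ranking; it stands in for the differential diagnosis models of the parent project, which are not described here.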

Phase 3: Compliance with standard protocols (HL7, FHIR for EHR, OMOP). We will adopt HL7 standards to ensure interoperability, including HL7 v2 for messaging and HL7 FHIR (Fast Healthcare Interoperability Resources) for data exchange. This will involve developing interfaces and APIs that comply with HL7 standards, enabling data exchange between the cloud system and various EHR systems. Rigorous validation and testing will be performed to ensure that the implemented HL7 interfaces function correctly and securely. Additionally, we will map existing health data to FHIR standards to ensure consistency and interoperability, aligning data elements from different sources to the FHIR framework. Integrating EHR systems using FHIR standards will ensure that patient data is accurately and securely transferred to and from the cloud environment. To protect sensitive health information and ensure compliance with regulatory requirements, robust security measures, including data encryption and access controls, will be implemented through cloud services. Furthermore, we will adopt the OMOP Common Data Model (CDM) to standardize the representation of health data across different sources, including patient demographics, clinical observations, and treatment outcomes. Existing datasets will be transformed into the OMOP CDM format using ETL tools, ensuring data integrity and consistency during the transformation process. Quality assurance protocols will be implemented to validate the accuracy and completeness of the standardized data, with regular audits and data quality checks performed to maintain high standards.
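As a small illustration of the two mappings described above, the sketch below converts a local demographic record into a FHIR Patient resource and then into a subset of the columns of the OMOP CDM person table. The local field names (`patient_id`, `sex`, `dob`) and the assumption of numeric patient identifiers are hypothetical; required OMOP columns such as race and ethnicity concepts are omitted for brevity.

```python
from datetime import date

# OMOP standard gender concept IDs (8507 = MALE, 8532 = FEMALE); 0 = no match.
OMOP_GENDER = {"male": 8507, "female": 8532}
FHIR_GENDERS = {"male", "female", "other", "unknown"}


def to_fhir_patient(record: dict) -> dict:
    """Map a local record (hypothetical field names) to a FHIR Patient."""
    sex = str(record.get("sex", "")).strip().lower()
    gender = {"m": "male", "f": "female"}.get(sex, sex)
    if gender not in FHIR_GENDERS:
        gender = "unknown"
    birth = date.fromisoformat(record["dob"])  # raises on malformed dates
    return {
        "resourceType": "Patient",
        "id": str(record["patient_id"]),
        "gender": gender,
        "birthDate": birth.isoformat(),
    }


def to_omop_person(patient: dict) -> dict:
    """Map a FHIR Patient to a subset of columns of the OMOP person table."""
    birth = date.fromisoformat(patient["birthDate"])
    return {
        "person_id": int(patient["id"]),  # assumes numeric local identifiers
        "gender_concept_id": OMOP_GENDER.get(patient["gender"], 0),
        "year_of_birth": birth.year,
        "month_of_birth": birth.month,
        "day_of_birth": birth.day,
        "person_source_value": patient["id"],
    }


patient = to_fhir_patient({"patient_id": 42, "sex": "F", "dob": "1990-03-07"})
person = to_omop_person(patient)
```

Parsing the birth date instead of copying the string gives an early failure on malformed input, which is the kind of check the quality assurance step above calls for.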

To ensure compliance with security and regulatory standards, we will use cloud services to provide encryption, access controls, and audit logging for all interactions between on-premises and cloud systems, creating a secure research environment. To efficiently manage and deploy the infrastructure for the parent project migration, we will use Infrastructure as Code (IaC) based on a detailed blueprint and Minimum Viable Systems (MVS) principles, built on the AWS Cloud Development Kit (AWS CDK). The CDK allows our team to model cloud resources in familiar programming languages such as TypeScript and Python, which is expected to improve productivity, reduce errors, and fit existing workflows. Automating the provisioning of scalable and secure environments through AWS CDK keeps infrastructure evolving consistently with application code, facilitating iterative development and rapid adaptation to new requirements. This strategy benefits internal users by increasing productivity and reducing errors, and it benefits collaborators by enhancing security, scalability, and system reliability.

Use case of cloud ecosystem for the parent grant based on the AWS Cloud solutions. The cloud ecosystem for the rare disease research project is designed using AWS as an example cloud provider and is shown in Figure 7. It serves roles such as software engineers, collaborators, knowledge base managers, API users, data scientists, AI experts, clinicians, and other authorized users, with access governed by Role-Based Access Control (RBAC). The architecture uses a CI/CD pipeline with GitHub for version control, AWS CodeBuild for building the code, and automated health checks followed by Blue/Green deployment for seamless updates. Collaborators can deploy Docker images locally, using pre-seeded data and AI models. In the knowledge base Extract, Transform, and Load (ETL) pipeline, clinical notes are ingested into a healthcare data lake. AWS Lambda functions process this data, which is then packaged into container images stored in Amazon Elastic Container Registry (ECR) for AI readiness. The processed data is stored in OMOP format for standardized analytics. For storage and data management, Amazon Redshift and AWS Glue are employed, along with Amazon Neptune for knowledge graph database requirements.

Figure 7. — Proposed cloud infrastructure and key system features.
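The role-based access control mentioned above can be illustrated with a deny-by-default permission check. In the sketch below the role names follow the text, while the permission names and the assignment of permissions to roles are hypothetical; in the actual deployment this logic would live in the cloud provider's identity services rather than in application code.

```python
from enum import Enum, auto


class Permission(Enum):
    DEPLOY_CODE = auto()
    EDIT_KNOWLEDGE_BASE = auto()
    CALL_API = auto()
    TRAIN_MODELS = auto()
    VIEW_PATIENT_DATA = auto()


# Role names follow the text; the permission assignments are hypothetical.
ROLE_PERMISSIONS = {
    "software_engineer": {Permission.DEPLOY_CODE},
    "knowledge_base_manager": {Permission.EDIT_KNOWLEDGE_BASE},
    "api_user": {Permission.CALL_API},
    "data_scientist": {Permission.TRAIN_MODELS, Permission.CALL_API},
    "clinician": {Permission.VIEW_PATIENT_DATA},
}


def is_allowed(roles, permission):
    """Deny by default: allow only if some held role grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

For example, `is_allowed(["clinician"], Permission.VIEW_PATIENT_DATA)` is true, while an unknown role or an empty role list is always refused.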

API users interact with the FHIR payload system through an API Gateway, which routes requests to Lambda functions that interface with an Amazon RDS server. The integration also uses AWS Glue for data preparation and Amazon Cognito for secure access management. For big data processing, the architecture employs Amazon EMR (Elastic MapReduce) and a Hadoop data pipeline, with resource activity tracked via AWS CloudTrail.
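A Lambda function behind API Gateway receives the request as an event dictionary and returns a dictionary with a status code, headers, and a string body. The sketch below shows a minimal handler of that shape that validates an incoming FHIR payload; the validation rules and the restriction to Patient resources are assumptions for illustration, and the database write is deliberately left out.

```python
import json


def _response(status: int, payload: dict) -> dict:
    """Build a response in the API Gateway Lambda proxy integration shape."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/fhir+json"},
        "body": json.dumps(payload),
    }


def _outcome(message: str) -> dict:
    """Minimal FHIR OperationOutcome describing a rejected request."""
    return {
        "resourceType": "OperationOutcome",
        "issue": [{"severity": "error", "code": "invalid", "diagnostics": message}],
    }


def handler(event, context):
    """Validate an incoming FHIR resource before it is passed downstream."""
    try:
        resource = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        return _response(400, _outcome("body is not valid JSON"))
    if not isinstance(resource, dict) or "resourceType" not in resource:
        return _response(400, _outcome("missing resourceType"))
    if resource["resourceType"] != "Patient":
        return _response(422, _outcome("only Patient is handled in this sketch"))
    # Persisting to the database (RDS in the text) is intentionally omitted.
    return _response(200, resource)
```

Because the handler is a plain function, it can be exercised locally by passing a dictionary such as `{"body": "{...}"}` and `None` for the context, without any AWS account.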

AI models are developed with cloud services such as Amazon SageMaker and SageMaker Ground Truth through Jupyter notebooks, followed by deployment of ChatGPT, BERT, and other AI models, which are then monitored with Amazon CloudWatch. Clinicians and other users access web-based tools, including the web portal for RDConnect and cohort building, through a single sign-on system behind a load balancer. Data security and sharing are paramount, with multi-factor authentication (MFA) ensuring secure access. Data storage and sharing are managed with stringent access controls and continuous monitoring. This setup supports efficient data management and AI deployment while maintaining compliance with relevant guidelines, providing a secure and scalable environment for rare disease research.

Successful migration of the rare disease knowledgebase hub and differential diagnosis system to a cloud-based infrastructure will provide i) enhanced performance, scalability, and collaboration capabilities, ii) improved data consistency, security, and interoperability, iii) broader research contributions from the international community, and iv) accelerated research progress and improved outcomes for patients with rare diseases.

The cloud infrastructure was developed in collaboration with cloud service providers and the National Institutes of Health (NIH) cloud team, ensuring adherence to industry standards and regulatory requirements.

Results

Successful Migration Workflow Development
Standardization: The four-phase workflow minimized disruption and maintained data integrity throughout the migration.
Comprehensive Checklist: Addressed all aspects of security, functionality, and compliance, ensuring a robust migration process.
Implementation of the Mapping Dashboard
User Guidance: The dashboard provided clear instructions, facilitating a smooth migration experience for users.
Security Assurance: Built-in security checks at each step enhanced data integrity and compliance.
Enhanced Performance and Resource Management
Dynamic Scalability: The cloud infrastructure efficiently handled varying computational workloads.
Improved Performance: Processing times for large datasets improved significantly, meeting or exceeding benchmarks.
Creation of a Collaborative Cloud Environment
Robust Infrastructure: Developed a stable and secure platform supporting real-time collaboration.
Data Integration: Seamless integration of diverse data types improved accessibility and analysis capabilities.
Integration of Expertise and Standards
Knowledge Graph Development: Enhanced data management and analytical capabilities through comprehensive knowledge graphs.
Standards Compliance: Ensured interoperability and regulatory compliance, facilitating broader research contributions.

This plan aims to enhance the quality and effectiveness of rare disease research by engaging cloud experts, developing a comprehensive knowledge management system, and ensuring compliance with industry standards by working with the NIH Cloud team. This integration will facilitate seamless data exchange, improve data management, and support collaborative research efforts, ultimately advancing the diagnosis and treatment of rare diseases. The four-stage application migration process, which encompasses decomposition, design and mapping with the checklist, build, and evaluation, is shown in Figure 4.

Phase 1. Facilitate NIH cloud account application and usage. We will attend relevant NIH cloud computing workshops to learn and share lessons learnt and best practices. We will work with various stakeholders to understand current workflows, identify specific needs, and brainstorm scalable and secure cloud solutions. To ensure the architecture supports all necessary functionalities and performance requirements, we will develop detailed use cases for rare disease research scenarios. These use cases will guide the design and implementation process, ensuring that the cloud infrastructure meets the specific needs of the research community.

The bar plot of Figure 5 illustrates the enhancement rates across various categories, with the y-axis representing the categories and the x-axis indicating the enhancement rate percentage. The bars are color-coded based on their enhancement rates: grey indicates a good enhancement rate of over 70%, and blue indicates a moderate enhancement rate between 50% and 70%. This visualization helps in quickly identifying areas with strong performance and those needing improvement. The system will be evaluated through functionality testing, performance metrics and user feedback with clear timelines for each task.

Discussion

The migration to a cloud-based infrastructure effectively addressed the limitations of our on-premises system. Dynamic scalability allowed for efficient processing of large datasets, critical for rare disease research. The cloud environment enhanced collaboration, enabling seamless data and tool sharing among researchers globally.

Key Findings
Performance Improvement: Significant reduction in data processing times and increased computational efficiency.
Collaboration Enhancement: Improved accessibility facilitated by the cloud environment fostered collaborative research efforts.
Security and Compliance: Adherence to regulatory standards ensured data security and patient privacy.
Training Requirements: Transitioning to a new system required comprehensive training for users.

Ongoing efforts are necessary to adapt to evolving computational demands and technologies. Future work will incorporate more sophisticated AI applications to enhance diagnostic capabilities, engage a broader research community to leverage diverse expertise and data sources, and implement continuous performance monitoring to optimize resource utilization and system performance.

Figures & Tables

Figure 2. — Design and Refactoring for Secure Data Integration: Framework of Components, Services, and Deliverables

Reference

1. Raghunathan N, Sankaran S, Miteu GD. A comprehensive review of iPS cell line-based disease modelling of the polyglutamine spinocerebellar ataxias 2 and 3: a focus on the research outcomes. Ann Med Surg (Lond) 2024 Mar 19;86(6):3487–3498. doi: 10.1097/MS9.0000000000001984. PMID: 38846892.
2. Koenig AB, Tan A, Abdelaal H, Monge F, Younossi ZM, Goodman ZD. Review article: Hepatic steatosis and its associations with acute and chronic liver diseases. Aliment Pharmacol Ther. 2024 Jun 71 doi: 10.1111/apt.18059. Epub ahead of print. PMID: 38845486.
3. Wojcik MH, Lemire G, Berger E, Zaki MS, Wissmann M, et al. Genome Sequencing for Diagnosing Rare Diseases. N Engl J Med. 2024 Jun 6;390(21):1985–1997. doi: 10.1056/NEJMoa2314761. PMID: 38838312.
4. Faviez C, Chen X, Garcelon N, Zaidan M, et al. Objectivizing issues in the diagnosis of complex rare diseases: lessons learned from testing existing diagnosis support systems on ciliopathies. BMC Med Inform Decis Mak. 2024 May 24;24(1):134. doi: 10.1186/s12911-024-02538-8. PMID: 38789985.
5. Yamamoto S, Kanca O, Wangler MF, Bellen HJ. Integrating non-mammalian model organisms in the diagnosis of rare genetic diseases in humans. Nat Rev Genet. 2024 Jan;25(1):46–60. doi: 10.1038/s41576-023-00633-6. Epub 2023 Jul 25. PMID: 37491400.
6. Tesi B, Boileau C, Boycott KM, Canaud G, Caulfield M, Choukair D, Hill S, Spielmann M, Wedell A, Wirta V, Nordgren A, Lindstrand A. Precision medicine in rare diseases: What is next? J Intern Med. 2023 Oct;294(4):397–412. doi: 10.1111/joim.13655. Epub 2023 Jun 1. PMID: 37211972.
7. Keloth VK, Selek S, Chen Q, Gilman C, Fu S, Dang Y, Chen X, Hu X, Zhou Y, He H, Fan JW, Wang K, Brandt C, Tao C, Liu H, Xu H. Large Language Models for Social Determinants of Health Information Extraction from Clinical Notes - A Generalizable Approach across Institutions. medRxiv [Preprint] 2024 May 22 2024.05.21.24307726. doi: 10.1101/2024.05.21.24307726. PMID: 38826441.
8. Zhou X, Wang Y, Sohn S, Therneau TM, Liu H, Knopman DS. Automatic extraction and assessment of lifestyle exposures for Alzheimer’s disease using natural language processing. Int J Med Inform. 2019 Oct;130:103943. doi: 10.1016/j.ijmedinf.2019.08.003. Epub 2019 Aug 6. PMID: 31476655; PMCID: PMC6750723.
9. Sohn S, Wi CI, Wu ST, Liu H, Ryu E, Krusemark E, Seabright A, Voge GA, Juhn YJ. Ascertainment of asthma prognosis using natural language processing from electronic medical records. J Allergy Clin Immunol. 2018 Jun;141(6):2292–2294.e3. doi: 10.1016/j.jaci.2017.12.1003. Epub 2018 Feb 10. PMID: 29438770; PMCID: PMC5994178.
10. Wi CI, Sohn S, Ali M, Krusemark E, Ryu E, Liu H, Juhn YJ. Natural Language Processing for Asthma Ascertainment in Different Practice Settings. J Allergy Clin Immunol Pract. 2018 Jan-Feb;6(1):126–131. doi: 10.1016/j.jaip.2017.04.041. Epub 2017 Jun 19. PMID: 28634104; PMCID: PMC5733699.
11. Wang JL, Li H, Liu HF. Enhancing Clinical Outcomes and Global Collaboration in Congenital Diaphragmatic Hernia Research Through a Cloud-Based Registry Infrastructure. AMIA. 2024 (submitted).
12. Wang J, Li H, Liu H. Prioritizing Clinically Significant Lung Cancer Somatic Mutations for Targeted Therapy Through Efficient NGS Data Filtering System. AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:305313. PMID: 38827108; PMCID: PMC11141846.
13. Chen E, Wang JL, Kueffner R, Al-Kateb H, Silkov A, Uzilov A, Lochovsky L, Li H, Newman S. CNA Explorer and anaLyzer (CNAEL): an interactive web application and standard operating procedure enabling efficient clinical review and reporting of complex NGS-derived tumor copy number profiles. 2022. doi: 10.1101/2022.10.24.22281236.
14. Rodriguez A, Kim Y, Nandi TN, Keat K, Kumar R, Bhukar R, Conery M, et al. Accelerating Genome- and Phenome-Wide Association Studies using GPUs - A case study using data from the Million Veteran Program. bioRxiv [Preprint] 2024 May 22 2024.05.17.594583. doi: 10.1101/2024.05.17.594583. PMID: 38826407.
15. Rathinam R, Sivakumar P, Sigamani S, Kothandaraman I. SJFO: Sail Jelly Fish Optimization enabled VM migration with DRNN-based prediction for load balancing in cloud computing. Network. 2024 Jun 3:1–26. doi: 10.1080/0954898X.2024.2359609. Epub ahead of print. PMID: 38829364.
