AMIA Annual Symposium Proceedings. 2020 Mar 4;2019:363–370.

Development of a Genomic Data Flow Framework: Results of a Survey Administered to NIH-NHGRI IGNITE and eMERGE Consortia Participants

Paul Dexter 1,2, Henry Ong 3, Amanda Elsey 4, Gillian Bell 5, Nephi Walton 6, Wendy Chung 7, Luke Rasmussen 8, Kevin Hicks 9, Aniwaa Owusu-obeng 10, Stuart Scott 10, Steve Ellis 10, Josh Peterson 3
PMCID: PMC7153090  PMID: 32308829

Abstract

Precision health’s more individualized molecular approach will enrich our understanding of disease etiology and patient outcomes. Universal implementation of precision health will not be feasible, however, until there is much greater automation of processes related to genomic data transmission, transformation, and interpretation. In this paper, we describe a framework for genomic data flow developed by the Clinical Informatics Work Group of the NIH National Human Genome Research Institute (NHGRI) IGNITE Network consortium. We subsequently report the results of a genomic data flow survey administered to sites funded by NIH-NHGRI for large scale genomic medicine implementations. Finally, we discuss insights and challenges identified through these survey results as they relate to both the current and a desirable future state of genomic data flow.

Introduction

Precision health’s more individualized molecular approach will enrich our understanding of disease etiology and patient outcomes.1 Individuals receiving treatment within healthcare systems will ultimately be characterized by multiple methods, including genomics, epigenomics, metabolomics, and proteomics.1

Acquiring and interpreting these new data sources are now possible because of the decreasing costs of molecular testing, development of large-scale biologic databases, and computational tools that facilitate analysis of large data sets.1 Genetic sequencing in particular is on the verge of becoming a routine part of clinical care.2 However, major challenges to routine use persist. Variant interpretation and reporting is currently extraordinarily complex and time-intensive.3 Genomic lab instrument interoperability with clinical information systems is hindered by a lack of adopted health IT standards.3 Finally, clinical decision support (CDS) necessary for clinicians to understand the significance of all clinically important variants is not widely implemented. At every step, from the point of raw data being generated by the genomics instrument to CDS at the point of care, the delivery of genomic data will require markedly improved automation.

This manuscript focuses on the current status of genomic data transmission, transformation, and interpretation (flow) from instrument to point of care. Specifically, we describe a framework for genomic data flow developed by the Clinical Informatics Work Group (CIWG) of NIH-NHGRI’s “Implementing genomics in practice” (IGNITE) Network. We subsequently report the results of a related survey administered to sites funded by NIH-NHGRI for large scale genomic medicine implementations. Finally, we discuss insights and challenges identified through these survey results as they relate to both the current and a desirable future state of genomic data flow.

Methods

Over the course of several months and through a series of meetings, members of IGNITE’s CIWG committee developed a draft framework for genomic data flow informed by their own institutional experiences and the medical literature. This draft framework was presented to the full IGNITE consortium and refined based on further feedback.

Members of the CIWG Committee subsequently developed a RedCap survey4 that corresponded closely to components of the draft framework. For purposes of our survey, we focused on germline molecular testing, but we anticipate that many aspects would be generalizable to somatic molecular testing. Given our primary goal of refining a genomic data flow framework, we chose to survey institutions with well-developed advanced genomic molecular testing laboratories. To this end, we invited survey respondents from institutions that had been funded by NIH-NHGRI as part of the IGNITE and Electronic Medical Records and Genomics (eMERGE) consortia to carry out large scale genomic medicine implementations. Surveys were conducted from May 2018 to February 2019.

In the following discussion, we illustrate aspects of the data architecture using two important examples of genomic molecular testing, whole genome sequencing (WGS) and pharmacogenetics (PGx).

Results

Development of a genomic data flow framework. The final proposed framework for genomic data flow is depicted in figure 1. For the sake of simplicity, in the descriptions of the components of figure 1, we will refer only to the number (e.g., “component #2”) rather than a full description (e.g., “refer to component #2 of figure 1”).

Figure 1:

Figure 1:

Proposed genomics data flow framework.

In figure 1, genomic data generally flows left to right. Genomic instrument results from the institution’s own laboratory are processed and transformed by a bio-informatics pipeline (component #4). All identified variant results for a patient (component #7) are stored for at least the life of that patient, and separately, interpretative annotations for those results are stored (component #8). For purposes of electronic health record (EHR) performance, clinically actionable subsets of variant data (component #10) and the corresponding interpretative annotations (component #11) are stored,5 which are available through the EHR (component #13) and for CDS (component #12).

For pathology reporting and reasons discussed below, we distinguish between automatically-generated portions of a draft report (component #5) and the final report completed by the laboratory’s genetics expert (component #6). A genome variant knowledge base (component #3) is required in machine-readable form for purposes of the bio-informatics pipeline and automatically-generated portions of a draft report, as well as in human-readable form for the genetics expert. Discrete results and annotations data from an external genomics laboratory (component #1) flow into the full variant (component #7) and interpretative annotations (component #8) databases, while the external pathology report would flow through the pathology reporting system (component #6) and into the EHR (component #13). Finally, processes would allow for ongoing re-interpretation of results as genomic knowledge evolves (component #9) and genomics results would be shared with the patient through a patient portal (component #14). These framework components are discussed more fully below.

Genomic laboratory instrument (internal and external testing). Genomics laboratory instruments rely on a variety of underlying laboratory techniques (e.g., DNA arrays, DNA sequencing, real-time quantitative polymerase chain reaction). Such testing is performed in a variety of clinical contexts (e.g., prenatal testing or determining predisposition to cancer). For purposes of this paper, our focus begins with the raw data generated by the instruments for genomic tests performed both internally within the institution’s laboratory (component #2) and at an external laboratory (component #1).

In the case of WGS, raw data typically consist of fluorescent, chemiluminescent, or electrical current signals.3 These are transformed into sequential base calls using platform-specific algorithms.3 The resultant FASTQ file can contain the results for millions of short DNA sequences. It is this FASTQ file that provides the source input for platform-independent software processing, as discussed below in the bio-informatics pipeline section.3

These sequence reads are appropriately associated with metadata that describe the methods and technologies used to produce those results.6 In the case of WGS, such metadata commonly include sequencing quality scores.7 These quality scores are important for interpretation given that they reflect the statistical confidence that a given base call is correct.3
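The FASTQ record structure and Phred quality decoding described above can be sketched in a few lines. This is an illustrative toy, not a clinical-grade parser (real pipelines use tools such as Biopython or samtools); the example read and its quality string are invented.

```python
# Toy FASTQ reader with Phred+33 quality decoding. Illustrative only:
# the record below is fabricated, and no error handling is shown.

def read_fastq(lines):
    """Yield (read_id, sequence, quality_string) from FASTQ-formatted lines."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)                       # the '+' separator line
        qual = next(it).strip()
        yield header.strip().lstrip("@"), seq, qual

def phred_scores(qual_string, offset=33):
    """Decode ASCII quality characters into integer Phred scores."""
    return [ord(c) - offset for c in qual_string]

def error_probability(q):
    """A Phred score Q corresponds to a base-call error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)

record = ["@read1", "ACGT", "+", "II5!"]
rid, seq, qual = next(read_fastq(record))
print(rid, seq, phred_scores(qual))    # read1 ACGT [40, 40, 20, 0]
```

A score of 40 thus implies a 1-in-10,000 chance the base call is wrong, which is why these metadata matter for downstream interpretation.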

Many genomics laboratory instruments were not designed to work in a clinical networked environment.3 There is need for both greater interoperability3 and implementation of standards.8 To such ends, the HL7 Clinical Genomics Work Group recommends a transcoding process, whereby genomic data are transformed from bio-informatics format into healthcare IT data standards.9 They also describe an alternative approach where genomic data are encapsulated in healthcare standards.9
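As a concrete illustration of the transcoding idea, the sketch below maps one VCF-style variant row into a minimal FHIR-Observation-shaped dictionary. The LOINC component codes (48018-6 gene studied, 48004-6 DNA change) follow HL7's genomics reporting guidance, but this is a sketch of the concept under those assumptions, not a validated FHIR implementation; the variant itself is invented.

```python
# Illustrative transcoding from bio-informatics format (a VCF-style
# tuple) into a healthcare IT standard (a minimal FHIR Observation
# dict). Not a validated implementation; the input row is fabricated.

def transcode_variant(vcf_row):
    chrom, pos, ref, alt, gene = vcf_row
    return {
        "resourceType": "Observation",
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": "69548-6",
                             "display": "Genetic variant assessment"}]},
        "component": [
            {"code": {"coding": [{"system": "http://loinc.org",
                                  "code": "48018-6"}]},   # gene studied
             "valueCodeableConcept": {"text": gene}},
            {"code": {"coding": [{"system": "http://loinc.org",
                                  "code": "48004-6"}]},   # DNA change
             "valueCodeableConcept": {"text": f"{chrom}:{pos}{ref}>{alt}"}},
        ],
    }

obs = transcode_variant(("chr10", 94781859, "G", "A", "CYP2C19"))
print(obs["component"][0]["valueCodeableConcept"]["text"])   # CYP2C19
```

The alternative encapsulation approach mentioned by the HL7 work group would instead attach the raw file (e.g., the VCF itself) to such a resource rather than decomposing it into coded components.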

Bio-informatics pipeline. As described in Leipzig’s review of bio-informatic pipeline frameworks, a bio-informatics pipeline invariably involves the shepherding of files through a series of transformations.10 These pipelines have evolved into frameworks that accommodate integration of user-defined tools, definition of both serial and parallel steps, complex dependencies, and varied data file types.10

In the case of WGS, the source input for a platform-independent bio-informatics pipeline is commonly in the form of a FASTQ file, which can contain millions of short DNA sequences. Through the pipeline, these short sequences are aligned to a human genome reference sequence.11 Through comparison to the reference sequence, variants for an individual are also determined (“called”). Variants can include single nucleotide variants, insertions/deletions, and copy number variants.11 Identified variants are prioritized, and then annotated with respect to the clinical relevance of those particular variants.11 Throughout the process, the data are transformed into successively smaller file formats, from FASTQ to SAM/BAM and finally VCF. Pipeline processing for WGS is frequently time-intensive.10 Both open source and proprietary software are available for these transformations (e.g., GATK, SAMtools, Atlas2).12
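The file-shepherding pattern above can be sketched as a minimal serial pipeline driver: each step consumes one file and produces the next, mirroring the FASTQ-to-VCF progression. The `align` and `call_variants` functions are stand-ins for real tools such as bwa and GATK, and all file contents are fabricated.

```python
# Minimal sketch of a serial bio-informatics pipeline "framework".
# Each step is a function that reads an input file and writes an
# output file; real steps would invoke aligners and variant callers.

import os
import tempfile

def run_pipeline(input_path, steps):
    """Thread a file through a series of (suffix, function) transformations."""
    path = input_path
    for suffix, step in steps:
        out_path = path + "." + suffix
        step(path, out_path)           # each step writes its own output file
        path = out_path
    return path

def align(src, dst):                   # stand-in for read alignment
    with open(src) as f, open(dst, "w") as g:
        g.write("aligned:" + f.read())

def call_variants(src, dst):           # stand-in for variant calling
    with open(src) as f, open(dst, "w") as g:
        g.write("variants:" + f.read())

with tempfile.TemporaryDirectory() as d:
    fastq = os.path.join(d, "sample.fastq")
    with open(fastq, "w") as f:
        f.write("ACGT")
    final = run_pipeline(fastq, [("sam", align), ("vcf", call_variants)])
    result = open(final).read()
print(result)                          # variants:aligned:ACGT
```

Production frameworks add what this sketch omits: parallel steps, dependency tracking, restartability, and heterogeneous file types.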

At an abstract level, pipeline processing for PGx genotyping is similar, insofar as variants are called, prioritized, and annotated for clinical relevance. PGx pipelines commonly rely on translation tables.13 Such tables relate diplotype results (e.g., CYP2C19 *2/*2) to drug-related phenotypes (e.g., poor metabolizer) and CDS recommendations (e.g., “Increased risk for reduced response to clopidogrel. Consider alternative drug”).7
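A toy translation table makes this mapping concrete. The entries below mirror the CYP2C19 example in the text; a production table would cover many more genes and alleles, and the *1/*1 and *1/*2 rows here are illustrative placeholders.

```python
# Toy PGx translation table: (gene, diplotype) -> (phenotype, CDS text).
# Entries are illustrative; real tables are curated from CPIC guidance.

TRANSLATION_TABLE = {
    ("CYP2C19", "*1/*1"): ("Normal metabolizer", "No change recommended."),
    ("CYP2C19", "*1/*2"): ("Intermediate metabolizer",
                           "Consider alternative to clopidogrel."),
    ("CYP2C19", "*2/*2"): ("Poor metabolizer",
                           "Increased risk for reduced response to "
                           "clopidogrel. Consider alternative drug."),
}

def translate(gene, diplotype):
    """Map a called diplotype to (phenotype, recommendation)."""
    return TRANSLATION_TABLE.get((gene, diplotype),
                                 ("Unknown", "Manual review required."))

phenotype, advice = translate("CYP2C19", "*2/*2")
print(phenotype)   # Poor metabolizer
```

The fallback for unlisted diplotypes reflects a practical design point: any result the table cannot translate must route to manual expert review rather than silently driving CDS.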

Such prioritization and curation of variants have become immensely more challenging as a result of WGS and its large number of associated variants.14 For example, the typical human genome has more than 4 million variants. As noted by Kohane et al., the burden of false-positive incidental findings in WGS “threatens current capabilities to deliver clinical-grade whole-genome clinical interpretation.”15
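A minimal sketch of the prioritization step is shown below: filtering millions of called variants down to the few worth manual review, here by population allele frequency and membership in a curated gene list. The gene list, frequency threshold, and all variant records are hypothetical.

```python
# Sketch of variant prioritization: keep rare variants in curated
# genes, rarest first. Thresholds, gene list, and records are all
# illustrative assumptions, not a clinical filtering protocol.

ACTIONABLE_GENES = {"BRCA1", "BRCA2", "CYP2C19"}    # hypothetical curated list

def prioritize(variants, max_allele_freq=0.01):
    """Retain rare variants in genes of interest, ordered rarest first."""
    kept = [v for v in variants
            if v["af"] <= max_allele_freq and v["gene"] in ACTIONABLE_GENES]
    return sorted(kept, key=lambda v: v["af"])

variants = [
    {"gene": "BRCA1", "af": 0.0001, "hgvs": "c.68_69del"},
    {"gene": "TTN",   "af": 0.0001, "hgvs": "c.2T>C"},      # gene not curated
    {"gene": "BRCA2", "af": 0.20,   "hgvs": "c.1114A>C"},   # too common
]
print([v["gene"] for v in prioritize(variants)])   # ['BRCA1']
```

Even aggressive automated filters of this kind leave a residue of candidate variants that, as the Dewey study below shows, still demands substantial expert time.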

Genomics pathology report with interpretation (automatically generated and final). The quality of final genomics pathology reports is a key determinant of the effectiveness of genetic medicine, with high-quality reports concisely explaining both the variants identified in a patient and their clinical relevance.16 Producing these reports is also labor-intensive and expensive.16

While the actual sequencing is becoming more affordable and higher quality, efficient interpretation is lagging. In one study of the clinical implications of WGS, Dewey et al. found that curation of the 90 to 127 genetic variants identified in each participant required a median of 54 minutes per variant17 (i.e., presumably approximately 100 hours of interpretation by genetic experts per sequenced participant).

Given our belief that this level of intensity for manual curation for WGS would not accommodate universal WGS testing, we distinguish between the automatically-generated portions of a draft pathology report (component #5) and the final pathology report completed by the laboratory’s genetics expert (#6). Over time, and out of necessity, we anticipate that the automatically-generated portions of the report will become more complete and valid. Through interfacing to the EHR, it also seems likely that such automatically-generated reports will incorporate an increasing amount of patient-specific clinical context (including family history). While it may never prove completely possible due to inherent complexity, the goal would arguably be for valid WGS results to require no more manual expert curation than is currently associated with complete blood count (CBC) results.

Genomic variant knowledge base. A genome variant knowledge base (component #3) is required in machine-readable or computable form for purposes of the bio-informatics pipeline and automatically-generated portions of a draft pathology report. A human-readable form is required for the genetics expert who is completing the pathology report. Such a genomic variant knowledge base could also be employed for purposes of CDS logic (e.g., if variant patterns for particular conditions are coupled to actionable recommendations).18

Common variant knowledge bases include the Database of Single Nucleotide Polymorphisms (dbSNP), Online Mendelian Inheritance in Man (OMIM), and ClinVar.

Full variant database with meta-data and a separate interpretative annotation database. Given that germline genomic results are applicable for the life of an individual, and the recognized clinical significance of those results will change as a result of evolving knowledge, a full genome database consisting of all variants and metadata requires long term storage.18 Long term storage of a full variant database including metadata (component #7), and a separate interpretative annotation database (component #8) is consistent with at least three of Masys et al.’s desiderata: (1) maintain separation of primary molecular observations from the clinical interpretations of those data, (2) maintain linkage of molecular observations to the laboratory methods used to generate them, and (3) anticipate fundamental changes in the understanding of human molecular variation.5
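The first two desiderata can be made concrete with a minimal two-table schema: one table for primary molecular observations carrying assay-method metadata, and a separate table of interpretations that reference them, so interpretations can be revised without touching the observations. The schema, column names, and rows below are illustrative assumptions, not a production design.

```python
# Minimal sqlite3 sketch of separating primary molecular observations
# (with lab-method metadata) from their revisable interpretations.
# Schema and data are illustrative only.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE variant (
    variant_id   INTEGER PRIMARY KEY,
    patient_id   TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT,
    assay_method TEXT, quality REAL          -- linkage to laboratory method
);
CREATE TABLE interpretation (
    interp_id      INTEGER PRIMARY KEY,
    variant_id     INTEGER REFERENCES variant(variant_id),
    significance   TEXT, interpreted_on TEXT -- revisable over time
);
""")
db.execute("INSERT INTO variant VALUES (1,'pt42','chr17',43045677,'A','G',"
           "'NGS panel v2',38.0)")
db.execute("INSERT INTO interpretation VALUES (1,1,'Uncertain','2018-05-01')")
# Re-interpretation adds a new row; the primary observation is untouched.
db.execute("INSERT INTO interpretation VALUES (2,1,'Pathogenic','2019-02-01')")
row = db.execute("SELECT significance FROM interpretation "
                 "WHERE variant_id=1 ORDER BY interpreted_on DESC").fetchone()
print(row[0])   # Pathogenic
```

Appending interpretations rather than overwriting them also preserves the audit trail of how a variant's recognized significance evolved.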

Given nascent EHR support for genomic data, and the large size of many of these full variant files (e.g., a FASTQ file can be 250 gigabytes for WGS12), these databases typically exist outside of an institution’s EHR. Rapid retrieval and analysis of such full genomic data is challenging for the typical EHR database system and could be expected to lead to performance issues.19 While such files can be compressed, Masys et al. have emphasized the importance of lossless data compression with the ability to produce a fully accurate copy of the original sequence.5
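The lossless-compression requirement is easy to state operationally: decompressing must reproduce a byte-identical copy of the original data. The sketch below demonstrates the round trip with gzip standing in for domain-specific formats such as BAM or CRAM; the "sequence" is a fabricated repetitive string.

```python
# Lossless compression round trip: the decompressed output must be
# byte-identical to the input. gzip is a stand-in here for genomic
# formats such as BAM/CRAM; the sequence data is fabricated.

import gzip

sequence = ("ACGT" * 250_000).encode()   # ~1 MB toy "sequence"
compressed = gzip.compress(sequence)
restored = gzip.decompress(compressed)

print(restored == sequence)              # True: fully accurate copy
print(len(compressed) < len(sequence))   # True: storage savings
```

A lossy scheme (e.g., discarding quality scores to save space) would fail the first check, which is precisely what the desideratum rules out.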

Clinically relevant subset of variant database and separate interpretative annotation database. Similar to the case of the full variant database and their interpretations, and consistent with Masys et al.’s desiderata,5 our proposed framework separates the subset of clinically relevant variants (component #10) from their interpretative annotations (component #11).

Recommendations for storage of a clinically actionable subset of variants are based on the fact that “the amount of molecular sequence data that currently has demonstrated clinical significance is a tiny fraction of the full genome and proteome, and it is neither computationally feasible nor desirable to query or analyze one’s entire genome in real time to support healthcare-related decisions.”5 Such a clinically actionable subset would be accessible to the EHR and CDS.
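Deriving that subset can be sketched as an intersection of the full variant store with a knowledge base of variants that currently have demonstrated clinical significance. The knowledge base contents and variant records below are hypothetical.

```python
# Sketch of deriving the clinically actionable subset from the full
# variant store. The "knowledge base" and records are illustrative
# assumptions, not real curated content.

ACTIONABLE_KB = {("CYP2C19", "*2"), ("TPMT", "*3A")}   # hypothetical KB

def actionable_subset(full_variants):
    """Select the (tiny) subset of variants the EHR and CDS should see."""
    return [v for v in full_variants
            if (v["gene"], v["allele"]) in ACTIONABLE_KB]

full_store = [
    {"gene": "CYP2C19", "allele": "*2"},           # actionable
    {"gene": "CYP2D6",  "allele": "*1"},           # no established action
    {"gene": "OR4F5",   "allele": "rare-variant"}, # no established action
]
print(len(actionable_subset(full_store)))   # 1
```

Because the knowledge base changes over time, this extraction would need to be re-run as knowledge evolves, which connects to the re-interpretation component discussed below.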

Such clinically relevant subsets of variant information seem to be commonly stored in the EHR problem list as a form of work-around for ensuring that clinicians are aware of the clinically actionable variant information for a patient, and for driving CDS.13,20 As EHR vendors improve their systems’ ability to incorporate discrete genomic data, similar to how patient drug allergy information is not typically embedded in problem lists, it is likely that variant data will be both displayed to clinicians and available to CDS in non-problem list sections of the EHR.

Ongoing re-interpretation of individuals’ genomic results as knowledge evolves. Germline genetic results have lifelong ramifications for individuals. While an individual’s germline genome is expected to remain immutable, the clinical significance of that individual’s identified variants will change dramatically over the foreseeable future as a result of ongoing intense genomics research. The need for corresponding ongoing re-interpretation of genetic results (component #9) makes a compelling case for standards-based, structured (computable) electronic reports.8 Both the pathology reports and discrete data that drive CDS will require updating.

As described by Aronson et al., the GeneInsight Suite was specifically designed to accommodate this need for ongoing re-interpretation.16 It relies on a centralized genetic knowledgebase and a highly flexible report-generation tool. A single action by a geneticist updates both future reports and generates alerts for clinicians treating patients. Automatically-generated draft reports and user-defined templates save geneticist time. Clinical alerts are driven off of changes to the genetic knowledge base.
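The knowledge-driven update pattern can be sketched as follows: when the knowledge base entry for a variant changes, previously issued reports that cite the outdated significance are flagged for clinician alerts. The function names and data structures are illustrative assumptions, not the GeneInsight API.

```python
# Sketch of knowledge-driven re-interpretation: one curation action
# updates the knowledge base and flags stale issued reports. Names
# and data are illustrative, not the GeneInsight implementation.

knowledge_base = {"BRCA2:c.1114A>C": "Uncertain"}
issued_reports = [
    {"patient": "pt42", "variant": "BRCA2:c.1114A>C",
     "reported_significance": "Uncertain"},
]

def update_knowledge(variant, new_significance):
    """Single curation action: update the KB and collect clinician alerts."""
    knowledge_base[variant] = new_significance
    return [r for r in issued_reports
            if r["variant"] == variant
            and r["reported_significance"] != new_significance]

alerts = update_knowledge("BRCA2:c.1114A>C", "Benign")
print(len(alerts))   # 1: the pt42 report now cites outdated significance
```

The essential property is that the geneticist touches only the knowledge base; report regeneration and alerting follow automatically from that single change.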

EHR with integrated CDS. As of 2013, there were over 2500 clinical genetic tests available to clinicians.21 Even more overwhelming, the average human genome contains over 4 million variants. The full vision of precision health and genomics (and metabolomics, proteomics, etc.) will not be realized in the absence of robust computerized CDS. Until such time as every variant for every person is stored and accessible, precision health will require both pre-test CDS (triggered when a clinician takes an action that should be informed by a genetic assessment but there is no record of such) and post-test CDS (alerts triggered when an action is taken that may be contraindicated by a patient’s genetic profile).22
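The two CDS modes can be sketched as a single order-check: a pre-test prompt fires when a triggering order has no genetic result on file, and a post-test alert fires when a stored result contraindicates the order. Patients, results, and rule text below are all fabricated for illustration.

```python
# Sketch of pre-test vs. post-test genomics CDS on a drug order.
# All patients, results, and rules are illustrative assumptions.

pgx_results = {"pt42": {"CYP2C19": "Poor metabolizer"}}

CONTRAINDICATIONS = {
    ("clopidogrel", "CYP2C19", "Poor metabolizer"):
        "Consider alternative antiplatelet therapy.",
}

def check_order(patient_id, drug, gene="CYP2C19"):
    """Return (alert_type, message) for a proposed drug order."""
    phenotype = pgx_results.get(patient_id, {}).get(gene)
    if phenotype is None:              # pre-test: no result on file
        return ("pre-test", f"No {gene} result on file; consider testing "
                            f"before prescribing {drug}.")
    advice = CONTRAINDICATIONS.get((drug, gene, phenotype))
    if advice:                         # post-test: stored result conflicts
        return ("post-test", advice)
    return ("none", "No alert.")

print(check_order("pt42", "clopidogrel")[0])   # post-test
print(check_order("pt99", "clopidogrel")[0])   # pre-test
```

In practice such logic would run as a CDS hook or web service against the EHR's order-entry workflow rather than against in-memory dictionaries.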

Given current health care infrastructure, the most straightforward method for robust genomics CDS at the point of care on a widespread basis would currently be to leverage existing EHRs (component #13) with integrated CDS (#12), potentially in the form of CDS web services. Welch et al. have similarly argued that a CDS architecture should primarily rely on EHR capabilities that are either currently supported or likely to be supported in the near future.18

Patient portal. The patient is a central stakeholder with respect to precision health. It has been hypothesized that modules incorporated into the patient portal may substitute for or lessen the burden of genetic counseling.23 Patient portal applications and sharing of results may also increase patients’ participation in their own care.23 For such purposes, both discrete and final report data would be appropriately shared with the patient through a secure patient portal (component #14).

Survey results

Genomic laboratory instrument (internal testing). 86% (6/7) of survey respondents reported that their site performs CLIA-certified germline molecular testing internally. Table 1 lists the types of CLIA-certified germline molecular testing internally performed at these six sites. Table 2 reports the clinical contexts for which internal germline molecular testing occurs at these six sites. In all cases, reports related to internal molecular testing are available for clinician review within the EHR.

Table 1:

Types of CLIA-certified germline molecular testing internally performed at six sites (refer to text)

Type of testing Percentage of sites
PCR-based testing 100%
Fluorescence in-situ hybridization 83%
Sanger sequencing 67%
Chromosomal microarray analysis 67%
NGS targeted genome sequencing 50%
NGS whole exome sequencing 50%
DNA microarray 50%
Microarray comparative genomic hybridization 50%
NGS whole genome sequencing 17%
Immunohistochemistry 17%
Ligase chain reaction-based testing 17%
RNA microarray 17%
Other 33%

Table 2:

Clinical contexts for internal and external germline molecular testing (refer to text for details)

Clinical context for testing INTERNAL testing EXTERNAL testing
Genetic disorders tested perinatally (e.g., congenital genetic syndromes) 83% 71%
Bone marrow and organ transplants (e.g., HLA testing to match donors with recipients) 67% 29%
Testing for neurodevelopmental disorders 67% 71%
Pharmacogenomics 67% 71%
Detection of metabolic and vascular disease markers (e.g., hemochromatosis and blood clotting disorders) 50% 57%
Genetic predisposition to cancer (e.g., BRCA) 17% 86%
Other 17% 14%

60% (3/5) of sites reported that these reports are stored as PDFs in the EHR, while 20% (1/5) reported storage in the structured XML-based Clinical Document Architecture (CDA) format and 20% (1/5) reported storage in a proprietary format. In the case of pharmacogenomics testing, 75% (3/4) of sites reported that discrete genetic results are transmitted from the laboratory instrument to the EHR.

Discrete genetic results are transmitted much less frequently from the instrument to the EHR in other clinical contexts: 0% (0/1) of sites for genetic cancer predisposition testing, 0% (0/5) for perinatal genetic disorder testing, 25% (1/4) for bone marrow and organ transplant testing, and 0% (0/3) for metabolic and vascular disease marker testing.

Genomic laboratory instrument (external testing). 100% (7/7) of survey respondents reported that their site sends at least a portion of their germline molecular tests to an external CLIA-certified laboratory. Table 2 reports the clinical contexts for which such external germline molecular testing occurs at these 7 sites. In all cases, these external reports are available for clinician review in the EHR, almost exclusively either as a PDF or scanned image. The exception was one site that stored their externally-provided pharmacogenomics report in structured CDA format.

For most types of external testing, the results of external germline molecular testing are stored only as a report in the EHR. Discrete genetic data from external pharmacogenomics testing are stored at 40% (2/5) of sites, and from external testing for genetic predisposition to cancer at 17% (1/6) of sites. No discrete genetic data are stored from perinatal genetic disorder testing, bone marrow and organ transplant testing, metabolic and vascular disease marker detection, or neurodevelopmental disorder testing.

Bio-informatics pipeline. 86% (6/7) of sites support a computer-based genomics data pipeline platform. Among these six sites, pipelines are employed for genotyping (83%) and sequencing (67%). The most common reason for supporting a computer-based pipeline among these six sites is pharmacogenomics (83% of sites). Internal personnel maintain these pipelines at all six sites. 50% (3/6) employ an open source pipeline framework, 33% (2/6) employ a commercial framework, and 67% (4/6) employ custom scripting. 50% (3/6) utilize PGx translation tables.

Respondents noted challenges related to these pipelines, including (1) the need to upload the pharmacogenomics phenotype information into the EHR as a discrete field, (2) updating and maintaining the underlying data related to newer medical literature, and (3) non-conformity of different methods of variant calling.

For the six sites that perform internal germline molecular testing, structured variant data is stored outside of the genomics instrument at 67% (4/6) of sites. In all such cases, these structured variant data are stored external to the EHR. For 75% (3/4) of sites, annotations are stored with the structured variant information. For 50% (2/4) of sites, a subset of the variants are available to the EHR (e.g., through entry into the patient’s problem list). There are no sites where all variants are accessible within the EHR.

Genomics pathology report with interpretation (automatically generated and final). 100% (6/6) of sites that perform internal germline molecular testing create interpretative reports. 83% (5/6) of sites automatically generate a portion of the results (e.g., variant calls) on this report. 100% (6/6) automatically generate a portion of the interpretation (e.g., phenotype for PGx variants) on this report. Genetics experts utilize templates to create these reports at 100% (6/6) of sites.

Genomic variant knowledge base. Genetic experts at these six sites rely on Pubmed (100% of sites), PharmGKB (83% of sites), dbSNP (83% of sites), CPIC guidelines (83% of sites), laboratory-maintained resources (67% of sites), OMIM (67% of sites), RefSeq (67% of sites), GeneTests (33% of sites), and other resources (33% of sites). 67% (4/6) of sites noted time-consuming aspects associated with interpretation, including (1) NGS with multiple variants can “easily take 4-8 hours of manual effort,” (2) integration of pharmacogenetic test results with other clinical factors, and (3) rare variants and variants with incomplete penetrance require extensive curation.

Ongoing re-interpretation of individuals’ genomic results as knowledge evolves. 86% (6 of 7) of institutions reported updating prior interpretations as a result of new variant knowledge. At these 6 sites, such updates have been “triggered” by clinician request (50% of sites), patient request (17% of sites), and as part of a scheduled review process (33% of sites). Updates have originated from the internal laboratory (100% of sites), as well as the external laboratory (50% of sites).

Efforts to update prior interpretations have included updating of interpretive reports (83% of sites), updating of stored annotations (83% of sites), notifying ordering providers about reinterpretation (67% of sites), and notifying patients about reinterpretation (50%).

14% (1/7) of sites have policies in place related to updating prior interpretations as new knowledge emerges.

EHR with integrated CDS. 71% (5/7) of sites reported implementing genomics-based clinical decision support within their EHRs (in both the inpatient and outpatient settings in all cases).

Among these five sites, 100% implemented CDS to recommend treatment based on genetic results, 80% implemented order sets that include genetic testing, 40% implemented CDS to recommend genetic testing triggered by particular orders, and 40% implemented genomics test order templates.

All of these sites implement genomics CDS using their institution’s EHR CDS infrastructure, but one site also employs an external CDS infrastructure.

Patient portal. 57% (4/7) of sites report the sharing of genomic reports through a patient portal, while 29% (2/7) report mailing reports to the patient.

Survey results and the proposed data flow framework

We believe the survey results validate our proposed data flow framework (figure 1). Participating institutions implement most of the framework components. These institutions also contend with a number of data flow challenges, such as time-consuming manual curation and incomplete discrete data transmission.

The primary framework component that did not appear to be widely implemented, but we believe should remain in the framework, is a systematic process for re-interpreting discrete genomic results and pathology reports so that they remain in step with evolving knowledge. We attribute this finding to the fact that there are currently no trivial automatic methods of performing this task. There is simply too much manual curation required to re-create final reports and interpretative annotations on all previously sequenced patients on a scheduled basis for the life of a patient.

While all sites create final pathology reports and ensure that these results are shared with the ordering providers (in compliance with CLIA requirements), discrete results are less consistently transmitted to downstream systems. This was particularly the case with external laboratory results. Many reports are also stored in PDF format, rather than structured CDA format. Incomplete population of the EHR with both discrete results and structured reports has potential adverse ramifications for CDS. At least some sites rely on entry of variant results into the EHR problem list as a work-around.

We found that all sites automatically generate portions of their draft reports and employ templates. Most sites rely exclusively on their EHR’s integrated CDS system to implement CDS.

Conclusions

Widespread implementation of precision health will not be feasible in the absence of significant advancements in the automation and standardization of genomic transmission, transformation, and interpretation from instrument to point of care. There is critical need for increased automation of genomic pathology reporting, improved transmission of full discrete variant results and metadata in standardized format to downstream systems, improved methods to store clinically actionable genomic data in EHRs, improved methods to automatically update genomic results for the lifetime of a patient as genomic knowledge evolves, and improved methods to create, maintain, and share comprehensive genomic CDS logic among institutions.

The results of our survey appear to validate our proposed genomic data flow framework. We anticipate that explicit definition of a genomic data flow framework can facilitate discussion of much-needed improvements. Participating survey respondents, arguably leaders in the field of clinical genomics as evidenced by their funding for large scale genomic implementation, are still challenged by current limitations of genomic instruments, bioinformatics pipelines, genomic variant knowledge bases, lab information systems, and EHRs.

Acknowledgements: This work was supported by funding from the NIH-NHGRI grant #U01HG007762 and U01HG010245. We also thank our partners in the IGNITE Network, a consortium of genomic medicine projects funded and guided by the NHGRI (https://ignite-genomics.org/) for their valuable contributions to this project.


References

1. Collins FS, Varmus H. A New Initiative on Precision Medicine. New England Journal of Medicine. 2015;372(9):793–795. doi: 10.1056/NEJMp1500523.
2. Payne TH, Corley S, Cullen TA, et al. Report of the AMIA EHR-2020 Task Force on the status and future direction of EHRs. J Am Med Inform Assoc. 2015;22(5):1102–1110. doi: 10.1093/jamia/ocv066.
3. Roy S, LaFramboise WA, Nikiforov YE, et al. Next-Generation Sequencing Informatics: Challenges and Strategies for Implementation in a Clinical Environment. Arch Pathol Lab Med. 2016;140(9):958–975. doi: 10.5858/arpa.2015-0507-RA.
4. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–381. doi: 10.1016/j.jbi.2008.08.010.
5. Masys DR, Jarvik GP, Abernethy NF, et al. Technical desiderata for the integration of genomic data into Electronic Health Records. J Biomed Inform. 2012;45(3):419–422. doi: 10.1016/j.jbi.2011.12.005.
6. The Precision Medicine Initiative Cohort Program – Building a Research Foundation for 21st Century Medicine. 2015.
7. Al Kawam A, Sen A, Datta A, Dickey N. Understanding the Bioinformatics Challenges of Integrating Genomics Into Healthcare. IEEE Journal of Biomedical and Health Informatics. 2018;22(5):1672–1683. doi: 10.1109/JBHI.2017.2778263.
8. Deckard J, McDonald CJ, Vreeman DJ. Supporting interoperability of genetic data with LOINC. J Am Med Inform Assoc. 2015;22(3):621–627. doi: 10.1093/jamia/ocu012.
9. HL7 Domain Analysis Model: Clinical Genomics, Release 1. 2017.
10. Leipzig J. A review of bioinformatic pipeline frameworks. Briefings in Bioinformatics. 2017;18(3):530–536. doi: 10.1093/bib/bbw020.
11. Aziz N, Zhao Q, Bry L, et al. College of American Pathologists’ Laboratory Standards for Next-Generation Sequencing Clinical Tests. Arch Pathol Lab Med. 2014;139(4):481–493. doi: 10.5858/arpa.2014-0250-CP.
12. He KY, Ge D, He MM. Big Data Analytics for Genomic Medicine. International Journal of Molecular Sciences. 2017;18(2):412. doi: 10.3390/ijms18020412.
13. Dunnenberger HM, Crews KR, Hoffman JM, et al. Preemptive clinical pharmacogenetics implementation: current programs in five US medical centers. Annu Rev Pharmacol Toxicol. 2015;55:89–106. doi: 10.1146/annurev-pharmtox-010814-124835.
14. Sefid Dashti MJ, Gamieldien J. A practical guide to filtering and prioritizing genetic variants. BioTechniques. 2017;62(1):18–30. doi: 10.2144/000114492.
15. Kohane IS, Hsing M, Kong SW. Taxonomizing, sizing, and overcoming the incidentalome. Genet Med. 2012;14(4):399–404. doi: 10.1038/gim.2011.68.
16. Aronson SJ, Clark EH, Babb LJ, et al. The GeneInsight Suite: A Platform to Support Laboratory and Provider Use of DNA-Based Genetic Testing. Human Mutation. 2011;32(5):532–536. doi: 10.1002/humu.21470.
17. Dewey FE, Grove ME, Pan C, et al. Clinical interpretation and implications of whole-genome sequencing. JAMA. 2014;311(10):1035–1045. doi: 10.1001/jama.2014.1717.
18. Welch B, Loya S, Eilbeck K, Kawamoto K. A Proposed Clinical Decision Support Architecture Capable of Supporting Whole Genome Sequence Information. Journal of Personalized Medicine. 2014;4(2):176–199. doi: 10.3390/jpm4020176.
19. DuBravec D. Can EHRs Handle Genomics Data? Journal of AHIMA. 2015;86(11):28–31.
20. Manzi SF, Fusaro VA, Chadwick L, et al. Creating a scalable clinical pharmacogenomics service with automated interpretation and medical record result integration - experience from a pediatric tertiary care facility. J Am Med Inform Assoc. 2017;24(1):74–80. doi: 10.1093/jamia/ocw052.
21. Welch BM, Kawamoto K. Clinical decision support for genetically guided personalized medicine: a systematic review. J Am Med Inform Assoc. 2013;20(2):388–400. doi: 10.1136/amiajnl-2012-000892.
22. Aronson SJ, Rehm HL. Building the foundation for genomics in precision medicine. Nature. 2015;526(7573):336–342. doi: 10.1038/nature15816.
23. Hazin R, Brothers KB, Malin BA, et al. Ethical, legal, and social implications of incorporating genomic information into electronic health records. Genet Med. 2013;15(10):810–816. doi: 10.1038/gim.2013.117.
