Abstract
Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, the Royal College of Pathologists of Australasia (RCPA), in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP), is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.
Keywords: Genetic variation databases, Data quality, Standards, Global knowledge sharing
Highlights
- Standards for genetic variation databases are being developed.
- These standards will cover use in the Australian clinical environment.
- Project outputs will be promoted for application to other environments by the Human Variome Project.
1. Introduction
It has now become routine practice to compare sequence variations identified during clinical genetic testing with variants recorded in a wide range of genetic variation databases as well as in the scientific literature to aid in understanding the potential clinical significance and determining a definitive diagnosis. Although numerous genetic variation databases already exist, there are few that meet the accuracy and reproducibility required for clinical diagnostics. Current databases are of variable quality and many contain errors in variant calls, non-standardized nomenclature, incomplete pathogenicity associations and limited phenotypic information linked to genomic data (Saunders et al., 2012). These all represent limitations and risks to the quality of patient care. Based on the current research experience of highly curated mutation data (Thompson et al., 2014, Sosnay et al., 2013) the curation of databases to clinical standards is likely to require a substantial investment of time and effort.
The increasing ease of access to technologies such as massively parallel sequencing is producing increasing volumes of genomic data that needs to be recorded in an organized, accurate manner. The integrity of this stored data is critical as demand grows for analysis and interpretation in clinical research and diagnostics, a task which now forms a substantial proportion of the genetic diagnostic workload.
There are numerous initiatives and white papers that discuss the steps needed to allow for responsible integration of emerging genomic technologies into mainstream clinical diagnostics, many of which touch on data quality and collection. Some of these are described below.
Data to Discovery: Genomes to Health (Ahalt et al., 2014) made recommendations on data provenance, collection, and management; delineation of phenotypes; adjudication of genomic variants; biostatistics and bioinformatics, data sharing; and bioethics and the law.
The Global Alliance for Genomics and Health has established data, security, regulatory and ethics, and clinical working groups, which have set priorities that include the development of formal data models and application programming interface (API) implementations for submitting, exchanging, querying, and analyzing genomic data (Global Alliance for Genomics and Health, 2014).
The British Society for Genetic Medicine made public the outcomes of the BSGM 100,000 Genome Group which made recommendations on the collaborative development of appropriate genomic standards and policies, promotion of data sharing, and further development of the existing NHS Diagnostic Mutation Database (DMuDB) and DECIPHER database to be more readily usable for the clinical laboratory (Burn and Douglas, 2013). This was followed up by reports on recommendations from the United Kingdom appointed working groups (https://www.gov.uk/government/publications/mapping-100000-genomes-strategic-priorities-data-and-ethics).
Recent challenges being addressed by the eMERGE network and others include collection of phenotype data, the integration of genomic findings into electronic health records, and the current efforts to extend HL7 Version 2 vocabularies for exome and whole genome sequencing within the context of clinical workflows (Kullo et al., 2014, Chute et al., 2013).
In September 2013, the National Human Genome Research Institute (NHGRI) and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) awarded USD25M to support a consortium of three groups to design and implement a framework for evaluating variants and their role in patient care. This consortium is enabling access to this information through the NCBI ClinVar database. The International Collaboration for Clinical Genomics (ICCG) is a part of this project, and is intended to support data collection and sharing (http://www.iccg.org/about-the-iccg/clingen/).
In addition to the white papers and initiatives, there is a growing number of best practice policies and guidelines addressing the responsible integration of genomics into a clinical environment. These include guidelines released by the Association for Clinical Genetic Science (ACGS, part of the British Society for Genetic Medicine (BSGM)) and the Dutch Society of Clinical Genetic Laboratory Specialists (VKGL) (Wallis et al., 2013), the American College of Medical Genetics and Genomics (ACMG) (Rehm et al., 2013), and the Clinical Molecular Genetics Society, UK (CMGS), also part of the BSGM (Ellard et al., 2012); the Best Practice Guidelines for the use of Next-Generation Sequencing Applications in Genome Diagnostics from a National Collaborative Study of Dutch Genome Diagnostic Laboratories (Weiss et al., 2013); a draft NIH Genomic Data Sharing Policy (Draft NIH Genomic Data Sharing Policy: Request for Public Comments, 2013); and conclusions from a working group of experts in genomic research, analysis and clinical diagnostic sequencing convened by the NHGRI (MacArthur et al., 2014). All of these guidelines partially address data within the clinical genomics workflow; however, they do not focus specifically on this area.
Collection of information related to genetic variation is not a new concept, with over 2000 locus specific databases established with disease and/or gene specific variation information. There are currently no established ISO standards which govern sequence variation databases. There are, however, numerous de facto standards and established best practices (Vihinen et al., 2012). While these aid in providing consistent formats, they are in part outdated as genomic data becomes more readily accessible and available. With regard to guidelines for the establishment of locus specific databases (LSDBs), the Human Variome Project (HVP) has been collaborating with the Human Genome Variation Society and the GEN2PHEN project, working towards standardizing the way that variation and pathogenicity data is presented. In addition, Celli et al. developed a supporting document describing curation of a gene variant database as a first step to establishing guidelines for database curation (Celli et al., 2012). The HVP continues to promote global standards and guidelines which encourage the establishment and maintenance of quality-assured sequence variation data repositories. Their ongoing work is described further below.
2. The standards development project
Despite the initiatives and guidelines described above, there are no specific standards or equivalent mechanisms which concentrate on guiding the accreditation of DNA sequence variation databases to ensure the accuracy, quality, and ongoing maintenance of uploaded data into any central repository to meet the needs of the clinical diagnostics environment.
An Australian national project led by the Royal College of Pathologists of Australasia (RCPA) in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP) is developing standards for DNA sequence variation databases intended for use in the clinical environment. This project is being supported by the Australian Department of Health's Quality Use of Pathology Program (QUPP).
The standards under development will be a broad-reaching set of national standards that are sympathetic to the rapidly changing landscape of genomics in the clinic and that seek compliance by both existing and future databases. The fundamental principle of the document is to provide a standard of oversight for DNA sequence variation databases intended to provide utility in clinical diagnostic service delivery, and thereby ensure that they are developed, curated, and maintained as safe, secure, and accurate repositories of genomic data. The standards are intended to complement existing laboratory standards and accreditation requirements, align with global initiatives and guidelines in existence, act as a guide to identify a quality database, help establish new databases as well as improve existing databases that have evolved out of the research environment, and set minimum requirements for clinical purposes within the boundaries of existing legislation both nationally and globally.
3. The standards framework
The framework within which the standards are being developed consists of nine key areas described in Table 1. The framework is intended to adequately address the accreditation requirements in a systematic order with clearly defined and concise criteria. In each section of the document, points deemed important for practice will be identified as either ‘Standards’ or ‘Guidelines’ in the style of current National Pathology Accreditation Advisory Council (NPAAC) documents (National Pathology Accreditation Advisory Council (NPAAC), 2008). A Standard will be considered the minimum requirement for a procedure, method, staffing resource or laboratory facility that is required before a laboratory can attain accreditation. A Guideline will be a consensus recommendation for best practice and should be used if a higher level of practice is appropriate. A Commentary may also be provided to give clarification to the Standards and Guidelines as well as to provide examples and guidance on interpretation of the statements.
Table 1.
Framework for development of standards for DNA sequence variation databases.
| Framework areas | Items being considered in each of the areas (include, but not limited to) |
|---|---|
| Purpose | |
| Governance | |
| Establishment | |
| Protection, privacy, security | |
| Content | |
| Functionality | |
| Currency of information | |
| Access & sharing | |
| Professional use | |
4. Implementation of the standards
Accreditation of pathology laboratories for clinical service delivery in Australia is overseen by the National Pathology Accreditation Advisory Council (NPAAC). NPAAC is an agency within the Commonwealth (Federal) Department of Health. NPAAC plays a key role in ensuring the quality of Australian pathology services, and is responsible for the development and maintenance of standards and guidelines for pathology practices (http://www.health.gov.au/npaac). The National Association of Testing Authorities (NATA) is the authority which provides independent assurance of technical competence in conjunction with the Royal College of Pathologists of Australasia (RCPA) through a proven network of best practice industry experts. NATA/RCPA provides assessment, accreditation, and training services to laboratories and technical facilities throughout Australia and internationally (http://www.nata.asn.au). NATA audits against the standards and guidelines laid down by NPAAC. Laboratories seeking eligibility for Federal government funding for medical tests are required to meet the specified quality standards as expressed by NPAAC in the context of the Australian pathology accreditation framework. There are a number of specialized technical publications that specify requirements for laboratories undertaking specific areas of medical testing in addition to requirements for good medical practice in all pathology laboratories. The DNA Sequence Variation Database Standards under development are intended to be an adjunct to existing NPAAC standards and guidelines such as “Requirements for Medical Pathology Services” (Requirements for Medical Pathology Services (First Edition, 2013); National Pathology Accreditation Advisory Council (NPAAC), 2008) and “Requirements for the Retention of Laboratory Records and Diagnostic Material” (Requirements for the Retention of Laboratory Records and Diagnostic Material (Sixth Edition, 2013); National Pathology Accreditation Advisory Council, 2013).
When completed, the standards will be submitted for potential endorsement by the RCPA and HGSA boards, and will be made available as a tool for laboratories and NATA assessors alike to facilitate accreditation. Further, the RCPA will engage the NPAAC to seek their inclusion of these Standards in the Commonwealth Health Insurance (Accredited Pathology Laboratories) — Approval Principles 2002.
It is recognized that there is a need to bridge the gap between the translational research environment and the clinical diagnostic environment, and therefore to regulate the use of data within the scope of each environment. To address this, in addition to the NATA/RCPA and NPAAC requirements, the Standards will encourage users to comply with the Australian Government National Health and Medical Research Council National Statement on Ethical Conduct in Human Research 2007 (Updated March 2014) (National Statement on Ethical Conduct in Human Research, 2007).
5. Challenges to implementation
There are foreseeable challenges to the implementation of a set of standards such as those described above. Initial acceptance and implementation of a new set of standards can be difficult to achieve without end users supporting the accreditation or compliance requirements. Early communication of this initiative is underway, and includes broad consultation with key experts and stakeholders who will be impacted by the introduction of standards, and presentation of the standards in draft form at local scientific meetings for discussion. It is the intention of the project steering committee that the resulting set of standards gains support prior to their final release.
Further to this, to ensure the implementation of and compliance with the standards, continued accreditation could be monitored through a time-limited license or registration program, developed by a professional organization and applied to the databases and their operators. Elements could include an external quality assessment program and automated auditing or review of the elements, functions, and curation of the databases. Online training and certification of database users under a continuing professional development (CPD) or continuing medical education (CME) program could be implemented to ensure the information held in databases is appropriately utilized in a clinical environment.
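As an illustration only, the kind of automated auditing mentioned above could be sketched as a check that each database record is complete and has been reviewed recently. The field names, the one-year review window, and the example records below are hypothetical assumptions made for this sketch; they are not requirements drawn from the standards under development.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical required fields; the standards do not specify these names.
REQUIRED_FIELDS = ("variant", "gene", "classification", "last_reviewed")


@dataclass
class AuditFinding:
    record_id: str
    problem: str


def audit_records(records, today, max_review_age_days=365):
    """Flag records with missing fields or an out-of-date review."""
    findings = []
    max_age = timedelta(days=max_review_age_days)  # assumed review window
    for record in records:
        record_id = str(record.get("id", "<no id>"))
        # Completeness: every required field must be present and non-empty.
        for name in REQUIRED_FIELDS:
            if not record.get(name):
                findings.append(AuditFinding(record_id, f"missing {name}"))
        # Currency: the last review must fall within the allowed window.
        reviewed = record.get("last_reviewed")
        if reviewed and today - reviewed > max_age:
            findings.append(AuditFinding(record_id, "review out of date"))
    return findings


# Invented example records, used only to exercise the checks.
records = [
    {"id": "r1", "variant": "c.76A>T", "gene": "GENE1",
     "classification": "pathogenic", "last_reviewed": date(2014, 3, 1)},
    {"id": "r2", "variant": "c.100G>A", "gene": "GENE2",
     "classification": "", "last_reviewed": date(2012, 1, 15)},
]

findings = audit_records(records, today=date(2014, 6, 1))
for finding in findings:
    print(finding.record_id, finding.problem)
```

In this sketch the second record is flagged twice, once for its empty classification and once for its stale review date. A real audit program would take its criteria from the finalized Standards and Guidelines rather than from the placeholder rules used here.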
Sequence variation databases are housed both locally and offshore in multiple countries, with ownership existing outside of Australian jurisdiction. It will be difficult for laboratories to apply these standards locally unless they “own” the database. However, the standards will provide them with a tool to judge the integrity and therefore the level of confidence that they might apply to an overseas database, which in turn can be included in their quality systems for future accreditation of bioinformatic pipelines for analysis and interpretation.
This project is being undertaken within the context of the Australian healthcare system and its national- and state-based legislation and regulations governing the quality of medical services. However, given the global reach of individual databases, the findings from this project should be applicable to other countries with similar medico-legal frameworks, and perhaps more broadly. Sharing knowledge and experience, and aligning standards globally in a structured and coordinated manner, is critical to advancing the successful implementation of genomics testing in the clinical environment.
6. Broader adoption and the global view
Gaining international consensus and commitment to consistent standards in medical testing represents a major challenge. One mechanism for achieving this outcome is the Human Variome Project, an international initiative to integrate the routine and responsible sharing of genetic variation information into standard clinical practice. The Human Variome Project is a consortium of researchers, diagnosticians and health-care professionals committed to the free and open sharing of genetic variation information generated during clinical testing, thereby leading to better patient outcomes and more accessible genetic health services. The Project is working towards establishing globally acceptable Standards and Guidelines for the collection, curation, interpretation and sharing of genomic knowledge and enabling the sustainable development and operation of a harmonized and federated global knowledge sharing network. A key aspect of this work is harmonizing national and regional efforts around regulatory frameworks and governance of electronic data repositories and knowledge sharing infrastructure. The Project, through its Variant Database Quality Assessment Working Group, has specified guidelines for quality parameters that should be assessed in a quality accreditation scheme (in press).
In addition to the Human Variome Project's global initiatives, Australia is well represented in the Global Alliance for Genomics and Health (http://www.genomicsandhealth.org), with Alliance partners including the Human Variome Project, National Health and Medical Research Council (NHMRC), Australian Genome Research Facility (http://agrf.org.au), Garvan Institute of Medical Research, Melbourne Genomics Health Alliance and other highly regarded groups (http://genomicsandhealth.org/partners).
7. Conclusion
Regulating the quality, accuracy, and relevance of DNA sequence variation databases and the data held within them through the implementation of standards will reduce the risk of aberrant or uninformative variants being reported, promote the sharing of clinical quality sequencing, and accelerate the delivery of accurate, actionable, and efficient clinical reports to improve patient management and outcomes. The Australian standards development reported above will build on work undertaken to date, and is a promising step towards national and regional harmonization efforts. We hope that the outcome of this project will be of interest to other countries and health systems.
References
- http://www.health.gov.au/npaac
- Commonwealth of Australia; 2013. Requirements for Medical Pathology Services (First Edition 2013) National Pathology Accreditation Advisory Council. (Online)
- Commonwealth of Australia; 2013. Requirements for the Retention of Laboratory Records and Diagnostic Material (Sixth Edition 2013) National Pathology Accreditation Advisory Council. (Online)
- http://www.nata.asn.au
- Ahalt S., Bizon C., Evans J., Erlich Y., Ginsberg G., Krishnamurthy A., Lange L., Maltbie D., Masys D., Schmitt C., Wilhelmsen K. RENCI, University of North Carolina at Chapel Hill; 2014. Data to Discovery: Genomes to Health. A White Paper From the National Consortium for Data Science. (Text. http://dx.doi.org/10.7921/G03X84K4)
- Burn J., Douglas A. 2013. Delivering the 100,000 Genome Project in the NHS. (April 2013) (Downloaded January 29 2014)
- Celli J., Dalgleish R., Vihinen M., Taschner P., den Dunnen J. Curating Gene Variant Databases (LSDBs): toward a universal standard. Hum. Mutat. 2012;33(2):291–297. doi: 10.1002/humu.21626.
- Chute G., Ullman-Cullere M., Wood G., Lin S., He M., Pathak J. Some experiences and opportunities for big data in translational research. Genet. Med. 2013;15(10):802–809. doi: 10.1038/gim.2013.121. (2013 October, Available in PMC January 30 2014)
- Federal Register. vol. 78, no. 183. 2013. Draft NIH genomic data sharing policy: request for public comments. (September, Notices)
- Ellard S., Charlton R., Lindsay H., Camm N., Watson C., Abbs S., Mattocks C., Taylor G. CMGS; 2012. Practice Guidelines for Targeted Next Generation Sequencing Analysis and Interpretation. (Ratified by the CMGS Executive Committee in December 2012)
- Global Alliance for Genomics and Health. 2014. Data Working Group. Priorities for Data Working Group: 2014 and Beyond. (Last Update: April 25 2014) (https://genomicsandhealth.org/files/public/Priorities%202014%2004%2028%20DWG.pdf Viewed April 28 2014)
- http://www.iccg.org/about-the-iccg/clingen/
- http://agrf.org.au
- http://genomicsandhealth.org/partners
- http://www.genomicsandhealth.org
- https://www.gov.uk/government/publications/mapping-100000-genomes-strategic-priorities-data-and-ethics (Published by Department of health July 5 2013)
- Kullo I.J., Haddad R., Prows C.A., Holm I., Sanderson S.C., Garrison N.A., Sharp R.R., Smith M.E., Kuivaniemi H., Bottinger E.P., Connolly J.J., Keating B.J., McCarty C.A., Williams M.S., Jarvik G.P. Return of results in the genomic medicine projects of the eMERGE network. Front. Genet. 2014;5:50. doi: 10.3389/fgene.2014.00050.
- MacArthur D., Manolio T., Dimmock D., Rehm H., Shendure J., Abecasis G., Adams D., Altman R., Antonarakis S., Ashley E., Barrett J., Biesecker L., Conrad D., Cooper G., Cox N., Daly M., Gerstein M., Goldstein D., Hirschhorn J., Leal S., Pennacchio L., Stamatoyannopoulos J., Sunyaev S., Valle D., Voight B., Winckler W., Gunter C. Guidelines for investigating causality of sequence variants in human disease. Nature. 2014;508:469–476. doi: 10.1038/nature13127. (April 24 2014)
- National Pathology Accreditation Advisory Council (NPAAC) Australian Government Department of Health and Ageing; Canberra: 2008. NPAAC Style Guide First Edition 2008.
- National Statement on Ethical Conduct in Human Research. Commonwealth of Australia; Canberra: 2007. The National Health and Medical Research Council, the Australian Research Council and the Australian Vice-Chancellors' Committee. (Updated March 2014) (Online)
- Rehm H., Bale S., Bayrak-Toydemir P., Berg J., Brown K., Deignan J., Friez M., Funke B., Hegde M., Lyon E. ACMG clinical laboratory standards for next generation sequencing. Genet. Med. 2013;15(9):733–747. doi: 10.1038/gim.2013.92. (September 2013)
- Saunders C.J., Miller N.A., Soden S.E., Dinwiddie D.L., Noll A., Alnadi N.A., Andraws N., Patterson M.L., Krivohlavek L.A., Fellis J., Humphray S., Saffrey P., Kingsbury Z., Weir J.C., Betley J., Grocock R.J., Margulies E.H., Farrow E.G., Artman M., Safina N.P., Petrikin J.E., Hall K.P., Kingsmore S.F. Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci. Transl. Med. 2012;4(154):135. doi: 10.1126/scitranslmed.3004041.
- Sosnay P.R., Siklosi K.R., Van Goor F., Kaniecki K., Yu H., Sharma N., Ramalho A.S., Amaral M.D., Dorfman R., Zielenski J., Masica D.L., Karchin R., Millen L., Thomas P.J., Patrinos G.P., Corey M., Lewis M.H., Rommens J.M., Castellani C., Penland C.M., Cutting G.R. Defining the disease liability of variants in the cystic fibrosis transmembrane conductance regulator gene. Nat. Genet. 2013;45(10):1160–1167. doi: 10.1038/ng.2745.
- Thompson B.A., Spurdle A.B., Plazzer J.P., Greenblatt M.S., Akagi K., Al-Mulla F., Bapat B., Bernstein I., Capellá G., den Dunnen J.T., du Sart D., Fabre A., Farrell M.P., Farrington S.M., Frayling I.M., Frebourg T., Goldgar D.E., Heinen C.D., Holinski-Feder E., Kohonen-Corish M., Robinson K.L., Leung S.Y., Martins A., Moller P., Morak M., Nystrom M., Peltomaki P., Pineda M., Qi M., Ramesar R., Rasmussen L.J., Royer-Pokora B., Scott R.J., Sijmons R., Tavtigian S.V., Tops C.M., Weber T., Wijnen J., Woods M.O., Macrae F., Genuardi M., InSiGHT Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database. Nat. Genet. 2014;46(2):107–115. doi: 10.1038/ng.2854.
- Vihinen M., den Dunnen J., Dalgleish R., Cotton R. Guidelines for establishing locus specific databases. Hum. Mutat. 2012;33(2):298–305. doi: 10.1002/humu.21646.
- Wallis Y., Payne S., McAnulty C., Bodmer D., Sistermans E., Robertson K., Moore D., Abbs S., Deans Z., Devereau A. Association for Clinical Genetic Science (ACGS), Dutch Society of Clinical Laboratory Specialists (VKGL); 2013. Practice Guidelines for the Evaluation of Pathogenicity and the Reporting of Sequence Variants in Clinical Molecular Medicine. (Approved September 2013)
- Weiss M., Van der Zwaag B., Jongbloed J., Vogel M., Bruggenwirth H., Lekanne Deprez R., Mook O., Ruivenkamp C., van Slegtenhorst M., van den Wijngaard A., Waisfisz Q., Nelen M., van der Stoep N. Best practice guidelines for the use of next-generation sequencing application in genome diagnostics: a national collaborative study of Dutch genome diagnostic laboratories. Hum. Mutat. 2013;34(10):1313–1321. doi: 10.1002/humu.22368.
