Abstract
As the utility of genetic and genomic testing in healthcare grows, there is a need for a high-quality genomic knowledge base to improve the clinical interpretation of genomic variants. Active patient engagement can enhance communication between clinicians, patients and researchers, contributing to knowledge building. It also encourages data sharing by patients and increases the data available for clinicians to incorporate into individualized patient care, clinical laboratories to utilize in test interpretation and investigators to use for research. GenomeConnect is a patient portal supported by the Clinical Genome Resource (ClinGen), providing an opportunity for patients to add to the knowledge base by securely sharing their health history and genetic test results. Data can be matched with queries from clinicians, laboratory personnel and researchers to better interpret the results of genetic testing and build a foundation to support genomic medicine. Participation is online, allowing patients to contribute regardless of location. GenomeConnect supports longitudinal, detailed clinical phenotyping and robust “matching” among research and clinical communities. Phenotype data are gathered using online health questionnaires; genotype data are obtained from genetic test reports uploaded by participants and curated by staff. GenomeConnect empowers patients to actively participate in the improvement of genomic test interpretation and clinical utility.
Keywords: patient portal, patient registry, data sharing, genotype/phenotype, Matchmaker Exchange
Background
Before the advent of technologies such as next generation sequencing, single gene testing dominated the molecular testing market. Laboratories often developed “niches” in certain disease domains and became experts at testing and interpreting results for certain genes. Over time, laboratories built up databases of these interpreted data. Although phenotype information was generally requested, it was historically not as critical to interpretation because the reasons for referral for any single gene test were relatively narrow in scope.
As technology has advanced, clinical laboratories are now able to offer multi-gene sequencing panels and genome-wide assays, such as cytogenomic microarrays, whole exome sequencing and whole genome sequencing. These types of assays allow laboratories to detect variation across the genome and often in genes with which they have limited clinical experience. Additionally, given the broad-based nature of the testing, the reasons for referral are varied, making the availability of patient phenotype data more critical than ever to variant interpretation. Laboratories have begun to realize that no single laboratory can amass the data needed to effectively interpret genome-wide data on their own, and have begun to support data sharing initiatives, such as those of the Clinical Genome Resource (ClinGen) (Rehm et al. 2015) and its predecessors including the International Standards for Cytogenomic Arrays (ISCA) and International Collaboration for Clinical Genomics (ICCG) (Riggs, 2012; Riggs, 2013).
Through ClinGen, laboratories are encouraged to submit variant-level data to publicly available databases, such as the National Center for Biotechnology Information's (NCBI) ClinVar (Landrum, 2014). Shared variant-level data generally include the laboratory's clinical interpretation and supporting evidence and/or citations. However, what is often lacking is detailed phenotypic information about the individuals in whom the variant was observed. Laboratories, the primary submitters of interpreted variant-level data to ClinVar, often do not have access to a patient's detailed phenotype information. Although clinical information is requested with sample submission, this information is often not received or is extremely limited. Even when laboratories have phenotype data, they may have restrictions on sharing this information without patient consent, given the public nature of the ClinVar database.
The ISCA and ICCG projects brought to light the challenges of capturing phenotypic data from those involved in ordering genetic tests and filling out test requisition forms. Riggs and colleagues reported that fewer than 1 in 5 test requisition forms were supplied with completed phenotypic information (Riggs, 2012). As an alternative to laboratory provider-supplied phenotype information for patients, ClinGen has developed GenomeConnect to partner with the patient community to capture the rich, individual-level phenotype information that is crucial to interpreting genomic variants (Riggs, 2012). Many patients are willing and highly motivated to provide additional information and participate in research (Smith-Packard, unpublished survey data), but historically, registries have been focused on specific diagnoses, such as Alzheimer's disease, Down syndrome, lupus, etc. (NIH, 2015). GenomeConnect, ClinGen's patient portal, provides an environment where any individual who has had genetic testing or has a genetic disorder can participate. Participants can securely share genomic and health information and connect with clinicians, clinical laboratories, researchers and other patients.
Patient self-report data are increasingly recognized as a valuable resource for research in patient outcomes and therapeutic intervention. A recent report from the DuchenneConnect registry demonstrated the utility of patient self-report data to observe significant differences in therapeutic interventions in Duchenne muscular dystrophy (Wang, 2014). Other projects, such as research done at 23andMe and PatientsLikeMe, utilize patient-reported data. ClinGen recognizes that phenotype information is critical for variant interpretation and is empowering patients through GenomeConnect to actively participate in the growth of shared sources of quality variant and health information to improve the quality of genetic tests.
GenomeConnect engages patients with a desire to contribute their data
GenomeConnect is an online patient community, developed and maintained under the oversight of the Institutional Review Board at Geisinger Health System. The project collects genotypic and phenotypic data from patients, engaging them as participants in efforts to build open databases such as ClinVar (Figure 1). Participants register for an account and complete informed consent online (parental consent for all minors, and assent for those aged 10-17 years), through their computer, tablet or smartphone. After completing the consent, the participant receives a general survey about their health and uploads a PDF copy of their genetic test results. See Figure 2 for steps involved in participant engagement.
Phenotype data are captured using the Human Phenotype Ontology (HPO), a system of categorizing medical conditions and symptoms using standard terminology within an ontological framework (Robinson, 2008; 2010). The terminology was adapted into patient-friendly terms so that phenotype can be obtained from members of the general public, yet stored uniformly in a system useful for clinicians and researchers. The health survey developed by ClinGen for GenomeConnect utilizes branching logic in order to minimize time required by patients to complete their health history, thereby increasing completion rate of the survey. This means participants receive a health survey adapted to their unique history, rather than a lengthy survey including questions that would not apply to them. For example, only those who report a history of a cardiovascular condition in initial health questions receive follow-up questions related to cardiovascular diagnoses.
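The branching logic described above can be illustrated with a minimal sketch. The question identifiers, wording and data structure below are hypothetical and are not taken from the GenomeConnect survey; the sketch only shows how follow-up questions can be withheld unless a participant answers "yes" to the parent question.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """One survey question; all IDs and wording here are hypothetical."""
    qid: str
    text: str
    # Follow-ups are shown only when this question is answered "yes" (True).
    follow_ups: list["Question"] = field(default_factory=list)


def questions_to_ask(root_questions, answers):
    """Return question IDs in the order a participant would see them.

    `answers` maps question ID -> True/False; unanswered questions are
    treated as "no", so their follow-ups are skipped.
    """
    asked = []
    stack = list(reversed(root_questions))
    while stack:
        question = stack.pop()
        asked.append(question.qid)
        if answers.get(question.qid) is True:
            # Push follow-ups so they are asked next, in their listed order.
            stack.extend(reversed(question.follow_ups))
    return asked


# Illustrative survey: two body systems, each with follow-up questions.
cardio = Question(
    "cardio",
    "Have you ever had a heart or blood vessel condition?",
    [
        Question("cardio_chd", "Were you born with a heart defect?"),
        Question("cardio_arrhythmia", "Have you had an irregular heartbeat?"),
    ],
)
skeletal = Question(
    "skeletal",
    "Have you ever had a bone or joint condition?",
    [Question("skeletal_scoliosis", "Have you been told you have a curved spine?")],
)

answers = {"cardio": True, "skeletal": False}
shown = questions_to_ask([cardio, skeletal], answers)
print(shown)
```

In this sketch a participant who reports a cardiovascular history sees the two cardiovascular follow-ups, while the skeletal follow-up is never presented.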
Genotype data are provided by participants via genetic test reports that they upload to their private portal account. GenomeConnect personnel review the information on the report and enter the details of up to 10 variants from a single report into the secure database. Curation of the test report in this manner by a genetics professional (certified genetic counselor) assures that the variant data are coded properly and uniformly for storage in the database and secure sharing. This way, participants are able to share their genetic information without having to decode their own reports or risk entering errant data.
For the purpose of broad data sharing, de-identified genotype and phenotype information from GenomeConnect will be shared with publicly available databases, such as ClinVar. Any data provided by participants that are transferred outside of the GenomeConnect database are unlinked from the participants’ personally identifying information. The participants are assigned a unique identifier code so that approved personnel can later re-identify participants for recontacting purposes. Entries representing participants who have a full genome or exome dataset will carry a special notation that these data are available upon request, so that researchers and laboratories are aware the individual has consented to sharing genomic information and is available for recontact. In the future, a mechanism for direct upload, or laboratory transfer, of genomic data, or assigned access to a secure cloud-based server, will be supported. In addition, the clinical laboratory where the participant had their testing is notified that the participant is registered and willing to provide additional information, if needed, for test interpretation or retrospective analysis. This additional information could be useful to the laboratory if any variants were interpreted as being of uncertain significance.
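The unlinking step can be sketched as follows. This is an illustration under stated assumptions, not a description of the GenomeConnect or PatientCrossroads implementation: the field names, the "GC-" code prefix and the in-memory key table are all hypothetical, and a real system would keep the key table in access-controlled storage.

```python
import secrets

# Assumed names of personally identifying fields; purely illustrative.
PII_FIELDS = {"name", "email", "date_of_birth"}


def deidentify(record, key_table):
    """Split a participant record into a shareable part and a private key entry.

    The shareable part carries only a random participant code. The mapping
    from code back to identifying fields is written to `key_table`, which
    stands in for storage that only approved personnel can read.
    """
    code = "GC-" + secrets.token_hex(8)  # hypothetical code format
    key_table[code] = {k: v for k, v in record.items() if k in PII_FIELDS}
    shared = {k: v for k, v in record.items() if k not in PII_FIELDS}
    shared["participant_code"] = code
    return shared


# Made-up example record.
key_table = {}
record = {
    "name": "Jane Doe",
    "email": "jane@example.org",
    "gene": "GENE1",
    "hpo_terms": ["HP:0001250"],
}
shared = deidentify(record, key_table)

# Approved personnel can resolve the code for recontact.
recontact = key_table[shared["participant_code"]]
```

Removing direct identifiers in this way does not by itself prevent identification from very rare genotypes or phenotypes, so a code of this kind addresses only the linkage between shared data and contact details.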
Maintaining participant privacy is a key priority of GenomeConnect. Protected health information (PHI) is not shared, and participants’ data are stored on the secure servers of PatientCrossroads, a company specializing in patient registries. PatientCrossroads has multiple measures in place to protect the privacy and confidentiality of an individual's health information stored on its servers. PatientCrossroads abides by the newest HIPAA Security Rule and the CONNECT platform is Safe Harbor certified, meaning that transfer of data for participants residing in the European Union is also approved by the European Commission's Directive on Data Protection. Additionally, PatientCrossroads hosts Federal Information Security Management Act (FISMA)-compliant programs for the National Institutes of Health (NIH). Per the National Institute of Standards and Technology, “The FISMA Implementation Project was established in January 2003 to produce several key security standards and guidelines required by Congressional legislation.” Details on FISMA can be located by visiting the division's Publications page (http://csrc.nist.gov/publications/).
Data Submission to ClinVar
Before sharing data with ClinVar, it will first be determined if the testing laboratory is a current submitter to ClinVar using the web-based listing (http://www.ncbi.nlm.nih.gov/clinvar/submitters/). If so, concordance between the variant classification on the patient's report and the laboratory's most recent ClinVar submission will be examined and appropriate follow-up will occur with the laboratory and patient if discordant. If concordant, an annotation will be provided on the laboratory's submission to note the presence of case-level data in GenomeConnect along with the patient's overall diagnosis. More detailed phenotypic and genotypic data will also be transferred to controlled-access case-level databases (e.g. dbGaP). If the testing laboratory is not a ClinVar submitter, a new variant-level submission will be provided to ClinVar, noting the testing laboratory as the source of the classification. Providing the testing laboratory will allow connection between the case and a future submission from the laboratory should one be provided. Although this workflow will initially be labor-intensive, as the volume of cases in GenomeConnect scales, more automated approaches will be developed to integrate GenomeConnect and laboratory submitted variants and prevent duplication of supporting case-level data.
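The decision steps in this workflow can be summarized in a short sketch. The function, enum and parameter names are hypothetical, and the case-insensitive string comparison is an assumed stand-in for however concordance is actually judged; the case in which the laboratory is a ClinVar submitter but has no submission for the variant is not described in the text, so the sketch raises an error rather than guessing.

```python
from enum import Enum


class Action(Enum):
    NEW_SUBMISSION = "submit new variant-level record naming the testing laboratory"
    ANNOTATE_LAB_SUBMISSION = "annotate laboratory submission with case-level data flag"
    FOLLOW_UP_DISCORDANT = "follow up with laboratory and patient"


def next_step(lab_is_clinvar_submitter, report_classification,
              lab_latest_classification=None):
    """Pick the next workflow step for one variant on one patient report."""
    if not lab_is_clinvar_submitter:
        return Action.NEW_SUBMISSION
    if lab_latest_classification is None:
        # Not covered by the described workflow; left undecided on purpose.
        raise ValueError("no laboratory submission found for this variant")
    same = (report_classification.strip().lower()
            == lab_latest_classification.strip().lower())
    return Action.ANNOTATE_LAB_SUBMISSION if same else Action.FOLLOW_UP_DISCORDANT


# Illustrative calls with made-up classifications.
step_a = next_step(False, "Pathogenic")
step_b = next_step(True, "Uncertain significance", "uncertain significance")
step_c = next_step(True, "Uncertain significance", "Likely pathogenic")
```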
Clinical laboratories play a key role in GenomeConnect
GenomeConnect supports continuous linkages and the sharing of data among all four major stakeholders in genomic testing: patients, clinicians, clinical laboratories and researchers. GenomeConnect has engaged all of these parties from the beginning to ensure the resource optimizes all of these relationships as it grows. However, a unique aspect that distinguishes GenomeConnect from many other patient registries is its emphasis on the role of clinical laboratories.
To date, clinical laboratories have not been included in most patient registry data sharing projects despite playing a critical role in the interpretation of genomic variants. Laboratory personnel contribute a valuable perspective to the discussions surrounding data sharing and have been identified as an important factor for the provision of optimal patient care (Wain, 2012). Clinical laboratories are a direct source of referrals through standard inclusion of recruitment text on laboratory reports and websites.
Advisory Engagement of Patients
Patients have been involved in decisions surrounding GenomeConnect from the beginning, individually and through the collective professional experience of our partners at PatientCrossroads who supply the registry platform for GenomeConnect. The experience of PatientCrossroads working with 76 different patient advocacy organizations in establishing patient registries enabled GenomeConnect to launch at a well-developed, advanced stage based on years of feedback from patients belonging to registries. A partial list of some of these registries can be located on the PatientCrossroads homepage (https://www.patientcrossroads.com/connect.html). Additionally, individuals from the patient advocacy group Syndromes Without a Name (SWAN) reviewed the consent and assent forms and the health questionnaire. Their feedback helped mold these documents into the current patient-friendly form. A patient liaisons committee is under development and will involve GenomeConnect participants. Future surveys may be targeted at issues that are important to patients, while continuing to contribute to defining the clinical relevance of genomic variation. Participants in GenomeConnect will also be involved in designing patient-friendly surveys and helping with outreach to other patient advocacy and support groups. Involving the patient perspective will ensure that issues important to GenomeConnect participants will remain at the center of discussions as the project evolves.
GenomeConnect Matchmaking
The current focus of the Matchmaker Exchange is to connect cases that have matching phenotypes and candidate genes to build evidence for novel genetic causes of disease, focusing on the connection between and among clinical laboratories and researchers (Philippakis et al., 2015). However, patients can have a direct and major role in this process as well, particularly given their strong and personal motivation (Might, 2014; Lambertson et al., 2015).
GenomeConnect allows bi-directional communication between GenomeConnect and various parties, as demonstrated in Figure 3. The GenomeConnect community provides a forum for those patients with an “orphan” syndrome or condition for which no disease-specific patient registry exists, and brings these individuals together through matchmaking. Phenotype matching will be possible for individuals who report a phenotype during the registration process for their account. To support this, the Orphanet Rare Disease Ontology list (http://www.orphadata.org/cgibin/inc/ordo_orphanet.inc.php) of over 6,000 diagnoses is provided at registration. Prior to GenomeConnect-facilitated matching, participants can first see the aggregated responses of others to the questions provided on the initial “body systems” health survey, so someone with a specific phenotypic feature would be able to see if others have also reported the same feature. Matchmaking between families and individuals affected with rare conditions helps them to connect, enabling better support for these families as well as the ability to build evidence for genomic candidates in common.
Participants consent to be notified about research opportunities when they join GenomeConnect, and each participant sets their preference to receive requests and has the choice whether or not to respond to notifications. The GenomeConnect consent indicates that their contact information will not be shared and that contacts will be managed by GenomeConnect staff and system functions. The system allows secure contact and does not share contact information until both parties agree to be matched. Participants can be introduced to others who have the same phenotype, diagnosis, disease, affected gene, or variant, making it possible to build evidence for gene-disease relationships through patient-initiated matchmaking. Initially, coordinators will be involved in sending queries on behalf of participants interested in connecting with others, with the “matched” additional participants having a choice of whether to respond or not. Over time, this feature will be automated, similar to the current system at www.simonsvipconnect.org. Additional characteristics targeted for matching participants include geographical region, age and sex/gender, to also optimize for emotional and practical support in matching families.
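The rule that contact information is withheld until both parties agree can be expressed as a small check. The function name, participant codes and data shapes below are assumptions made for illustration and do not describe the registry platform's actual interface.

```python
def release_contacts(consents, contacts):
    """Return contact details only if every party to the match has agreed.

    `consents` maps participant code -> True/False for this specific match;
    `contacts` maps participant code -> contact details. If fewer than two
    parties are involved, or anyone has not agreed, nothing is released.
    """
    if len(consents) >= 2 and all(consents.values()):
        return {pid: contacts[pid] for pid in consents}
    return {}


# Made-up participants and contact details.
contacts = {"P1": "p1@example.org", "P2": "p2@example.org"}

pending = release_contacts({"P1": True, "P2": False}, contacts)
matched = release_contacts({"P1": True, "P2": True}, contacts)
```

Here `pending` is empty because the second participant has not yet responded, and `matched` contains both contact entries only after both have agreed.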
Laboratories, clinicians, and researchers are also able to request additional phenotypic information from registered participants, or inform them of potential research opportunities through GenomeConnect. Participants can find out about research projects that may be of interest to them via researcher inquiries: researchers may contact the GenomeConnect staff to see if there are potential participants for a particular study. The GenomeConnect staff then informs participants who meet the target criteria of the potential research opportunity. If the participant is interested in the opportunity, they can then contact the researcher directly and consent to their study. Assembling cohorts has been a struggle for researchers seeking to recruit study participants, and GenomeConnect offers a possible solution by enabling researchers to identify and contact cohorts to study. This enables the “rarest of the rare” to be identified and found. GenomeConnect staff will also contact a participant if a laboratory representative requests a connection.
Given the rarity of some phenotypic features and genomic variants, some participants may be identifiable solely based on the genotypic and phenotypic information in GenomeConnect. The consent that all GenomeConnect participants complete explains that full de-identification may not be possible due to rarity of some diagnoses and features. Project staff will adjust information shown in the matching feature to address this concern and decide what information will help connect participants but not identify them. The contact profile of participants is only viewable to other consented participants who are also registered and logged into the system.
Unique data capturing and sharing aspects of GenomeConnect
GenomeConnect differs from other registries in its unique integration of patient-reported phenotype that captures the full phenotypic spectrum of an individual participant by assessing each body system in patient-friendly terms linked back to a standardized ontology system (HPO). Patient reported outcomes have been introduced into clinical medicine to assess a range of disease features and outcomes and support the value of information reported directly from patients (Broderick, 2013). Whereas many condition-specific registries and most laboratories only assess health history related to the condition at hand (for example, only cancer history questions for those in the PROMPT cancer registry, promptstudy.info), GenomeConnect gathers a broad health history from every participant, regardless of the reason for genetic testing. Participants are provided an online “body systems” survey, available for public viewing at the ClinGen website “Information for Healthcare Providers” tab (http://clinicalgenome.org/genomeconnect/for-providers/), that asks a series of questions assessing history of conditions in a review-of-systems format, organized by organ system (skeletal, cardiovascular, ocular, etc.). Additional questions assess other history, such as prenatal and pregnancy history of the participants and history of specific conditions (cancer, etc.). Follow-up questions are provided, individualized to the participant based on his/her initial responses. This method for extracting complete health history stands apart from health surveys used in other registries, which are often condition- or diagnosis-focused, such as muscular dystrophy-focused questions for the DuchenneConnect registry (Wang, 2014). The GenomeConnect body systems survey allows a wider perspective on health and may allow for identification of related conditions previously missed by condition-focused surveys. This will also be important as the known spectrum of many diseases expands to include previously unrecognized features.
Depending on responses to the general survey, some participants may receive additional follow-up surveys. The purpose of these follow-up surveys will be to obtain targeted information, such as more detail about medical issues reported by the participant on the general survey, while minimizing the survey burden experienced by participants at initial enrollment. For example, if the participant indicated that they had a congenital heart defect in the general survey, they may be invited to participate in a follow-up survey designed to learn more about the specific type of heart defect, when and how it was diagnosed, as well as treatment and outcomes. GenomeConnect will develop new surveys as questions arise from the clinical laboratory and research communities.
GenomeConnect encourages partnerships
GenomeConnect builds on partnerships that have already begun to support broad data sharing between clinicians, clinical laboratories, researchers and patients. It will support partnerships with existing registries by offering services these registries currently do not provide, such as curation of genetic test reports; secure, de-identified sharing of genomic variant data with databases like ClinVar; and guidance for registries that seek to develop their own processes for genotype curation and phenotype capture. Partnering with GenomeConnect will promote the use of standardized ontology language and expert curation of genetic test reports to ensure the quality and utility of shared data while maintaining the goal of putting the patient first.
GenomeConnect's goal is not to replace other projects or registries, but to support and expand those that already exist in capturing data, particularly genomic variant data. Many registries already play an important role for patients and families by providing resources, supporting patient advocacy group growth and outreach, and connecting disease-specific patient support groups with interested researchers, but they do not have staff for curation or phenotype collection. GenomeConnect can provide these. The goal of GenomeConnect is to encourage sharing of phenotypic and genotypic information by all interested parties and to promote the matching of patients with new opportunities, or support them with their submissions to public databases.
Conclusion
The goal of the Human Genome Project was to establish a standard sequence of the human genome (International Human Genome Sequencing Consortium, 2004). Now, efforts like ClinGen are focused on determining how genomic variation among individuals influences human health and disease. ClinGen developed GenomeConnect as one source of genotype and phenotype data derived from direct patient participation. Support is necessary for efforts like GenomeConnect that encourage broad sharing of genotype and phenotype data to establish clinically useful correlations. Partnerships between patients, clinicians, clinical laboratories and researchers will help resolve the issue of limited phenotype information that is available to individuals trying to interpret the clinical significance of genomic variation.
The GenomeConnect model of matchmaking to support the sharing of high quality phenotype and genotype information relies on the active involvement of clinicians and laboratories in supporting its development. Patients must be informed of the importance of making health and variant data available, and of their critical role in this endeavor. In turn, clinicians, laboratory personnel, and researchers must remain active in supporting the sharing of health and variant data and encourage patients to voice their needs and thoughts on how genetic tests can be improved. Only when all parties are engaged in the process of safe and informative data sharing will precision medicine, supported by high quality genetic tests and interpretations, be possible.
Acknowledgments
Patient clinical data have been obtained in a manner conforming with IRB and/or granting agency ethical guidelines. The ClinGen Consortium is funded by the NHGRI, in conjunction with additional funding from the NICHD and NCI, through the following grants and contracts: 1U41HG006834-01A1 (Rehm, Martin, and Ledbetter), 1U01HG007437-01 (Berg), 1U01HG007436-01 (Bustamante), HHSN261200800001E. This work was supported in part by the Intramural Research Program of the NIH, National Library of Medicine. The authors would also like to acknowledge the members of the Phenotype and Education, Counseling, and Engagement Work Groups of the ClinGen Resource for their contributions to GenomeConnect. David Miller serves as Clinical Consultant to Claritas Genomics, a provider of genetic testing services (non-equity compensation agreement), and David Ledbetter serves as a consultant for Natera, Inc.
References
- Broderick JE, Morgan DeWitt E, Rothrock N, Crane PK. Advances in Patient Reported Outcomes: the NIH PROMIS Measures. eGEMs (Generating Evidence & Methods to improve patient outcomes). 2013;1(1):Article 12. doi: 10.13063/2327-9214.1015.
- International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431:931–945. doi: 10.1038/nature03001.
- Lambertson K, Damiani S, Might M, Shelton R, Terry S. Participant-led matchmaking. Hum Mutat. 2015;36:xxx–yyy. doi: 10.1002/humu.22852.
- Mayo Clinic website. Mayo Medical Laboratories Interpretive Handbook. [cited April 10, 2015]. Available from http://www.mayomedicallaboratories.com/interpretive-guide/?alpha=C&unit_code=61835.
- Might M, Wilsey M. The shifting model in clinical diagnostics: how next-generation sequencing and families are altering the way rare diseases are discovered, studied, and treated. Genet Med. 2014;16(10):736–7. doi: 10.1038/gim.2014.23.
- National Institutes of Health (NIH) website. NIH Clinical Research Trials and You: List of Registries. [cited April 10, 2015]. Available from http://www.nih.gov/health/clinicaltrials/registries.htm.
- Philippakis AA, Azzariti DR, Beltran S, Brookes AJ, Brownstein CA, Brudno M, Brunner HG, Buske OJ, Carey WK, Doll C, Dumitriu S, Dyke SOM, et al. The Matchmaker Exchange: A Platform for Rare Disease Gene Discovery. Hum Mutat. 2015;36:xxx–yyy. doi: 10.1002/humu.22858.
- Rehm HL, Berg JS, Brooks LD, Bustamante CD, Evans JP, Landrum MJ, Ledbetter DH, Maglott DR, Martin CL, Nussbaum RL, Plon SE, Ramos EM, et al. ClinGen: The Clinical Genome Resource. N Engl J Med. 2015. Advance online publication. doi: 10.1056/NEJMsr1406261.
- Riggs ER, Jackson L, Miller DT, Van Vooren S. Phenotypic Information in genomic variant databases enhances clinical care and research: the International Standards for Cytogenomic Arrays Consortium experience. Hum Mutat. 2012;33(5):787–96. doi: 10.1002/humu.22052.
- Riggs ER, Wain KE, Riethmaier D, et al. Towards a universal clinical genomics database: the 2012 international standards for cytogenomic arrays consortium meeting. Hum Mutat. 2013;34(6):915–919. doi: 10.1002/humu.22306.
- Robinson PN, Kohler S, Bauer S, et al. The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet. 2008;83(5):610–615. doi: 10.1016/j.ajhg.2008.09.017.
- Robinson PN, Mundlos S. The human phenotype ontology. Clin Genet. 2010;77:525–534. doi: 10.1111/j.1399-0004.2010.01436.x.
- Wain K, Riggs ER, Hanson K, et al. The laboratory-clinician team: a professional call to action to improve communication and collaboration for optimal patient care in chromosomal microarray testing. J Genet Couns. 2012;21(5):631–7. doi: 10.1007/s10897-012-9507-9.
- Wang RT, Silverstein Fadlon CA, Ulm JW, et al. Online Self-Report Data for Duchenne Muscular Dystrophy confirms natural history and can be used to assess for therapeutic benefits. PLoS Curr. 2014;17:6. doi: 10.1371/currents.md.e1e8f2be7c949f9ffe81ec6fca1cce6a.