Author manuscript; available in PMC: 2023 Jan 1.
Published in final edited form as: Genet Med. 2022 Apr 19;24(7):1512–1522. doi: 10.1016/j.gim.2022.03.013

The Clinical Variant Analysis Tool: Analyzing the evidence supporting reported genomic variation in clinical practice

Hui-Lin Chin 1,2,3, Nour Gazzaz 1,2,4,5, Stephanie Huynh 1,2, Iulia Handra 1,2, Lynn Warnock 2, Ashley Moller-Hansen 2, Pierre Boerkoel 6, Julius OB Jacobsen 7, Christèle du Souich 1, Nan Zhang 8, Kent Shefchek 9, Leah M Prentice 10, Nicole Washington 11, Melissa Haendel 9, Linlea Armstrong 1,2, Lorne Clarke 1,2, Wenhui Laura Li 8, Damian Smedley 7, Peter N Robinson 12, Cornelius F Boerkoel 1,2,*
PMCID: PMC9363005  NIHMSID: NIHMS1823181  PMID: 35442193

Abstract

Purpose:

Genomic test results, regardless of laboratory variant classification, require clinical practitioners to judge the applicability of a variant for medical decisions. Teaching and standardizing clinical interpretation of genomic variation calls for a methodology or tool.

Methods:

To generate such a tool, we distilled the Clinical Genome Resource framework of causality and the American College of Medical Genetics/Association of Molecular Pathology and Quest Diagnostic Laboratory scoring of variant deleteriousness into the Clinical Variant Analysis Tool (CVAT). Applying this to 289 clinical exome reports, we compared the performance of junior practitioners with that of experienced medical geneticists and assessed the utility of reported variants.

Results:

CVAT enabled performance comparable to that of experienced medical geneticists. In total, 124 of 289 (42.9%) exome reports and 146 of 382 (38.2%) reported variants supported a diagnosis. Overall, 10.5% (1 pathogenic [P] or likely pathogenic [LP] variant and 39 variants of uncertain significance [VUS]) of variants were reported in genes without established disease association; 20.2% (23 P/LP and 54 VUS) were in genes without sufficient phenotypic concordance; 7.3% (15 P/LP and 13 VUS) conflicted with the known molecular disease mechanism; and 24% (91 VUS) had insufficient evidence for deleteriousness.

Conclusion:

Implementation of CVAT standardized clinical interpretation of genomic variation and emphasized the need for collaborative and transparent reporting of genomic variation.

Keywords: Exome sequencing, Genomic medicine, Precision medicine, Variant classification, Variant interpretation

Introduction

Genomic medicine is predicated on understanding the interaction between human genomic variation and its complex ecosystem. It has had its greatest effect in the diagnosis of rare genetic disorders, precision oncology, pharmacogenetics, and precision microbiology. Successful genomic medicine practice requires delineation of variants that predispose to health or disease from those that are neutral, ie, the establishment of causality.1 Consequently, medical diligence and logical rigor are required to avoid wrongful attribution of disease association and to provide valid molecular diagnoses for medical and life decisions.

In 2015, the American College of Medical Genetics (ACMG) and Association of Molecular Pathology (AMP) developed a scoring system and recommendations for standardization of practice.2 The progressive adoption of these standards is leading to their refinement and to additional recommendations.3-6 In 2017, the Clinical Genome Resource (ClinGen) developed a framework for assessing gene–disease association,7 and it leads an ongoing effort to curate such associations. The extent of adoption in clinical practice and laboratory reporting has not been assessed.

Judging the medical applicability of each variant, regardless of laboratory variant classification, is the responsibility of the clinical practitioner. Such evaluation, which currently constitutes 19% of referrals accepted by the Provincial Medical Genetics Program (PMGP) of British Columbia (BC) (I. Handra, unpublished data), requires the clinician to assess evidence associating the reported genomic variation with the proband’s disease and to determine familial risk.8 A lack of laboratory evidential transparency for reported variation and limited clinician knowledge of genomic causality impede genomic medicine practice.9

We hypothesized that within a genomic medicine practice, genomic variant analysis reduces to a process of tasks, outcomes, and decisions that conform to the ClinGen framework7 for assessment of causality and the ACMG/AMP2 and Quest Diagnostic Laboratory scoring systems10 for assessment of variant deleteriousness; the Quest Diagnostic Laboratory scoring system provides a quantification of potential deleteriousness. To test this hypothesis, we distilled these 3 elements into a single tool, the Clinical Variant Analysis Tool (CVAT); compared the performance of 2 junior practitioners using CVAT with that of 3 experienced medical geneticists; and report here observations from analysis of variants on 289 clinical exome reports. CVAT enabled clinicians to define the evidence supporting reported variation methodically, to differentiate disease-associated variation more clearly, and to highlight evidential expectations for clinical laboratory reports.

Materials and Methods

Development of CVAT

To enable thorough standardized genomic variant analyses by clinicians, we distilled the ClinGen framework, the ACMG/AMP scoring system, and the Quest Diagnostic Laboratory scoring system into a single tool, CVAT. The ClinGen framework evaluates relevant genetic and experimental evidence supporting or contradicting a gene–disease relationship using frequency of prior reports, functional data, replication, and the presence or absence of contradictory data.5,7 The ACMG/AMP guidelines condense population frequency, inheritance, computational data, functional data, predictive data based on type and location of the variant, precedence, and allelic data into standardized terminology: pathogenic (P), likely pathogenic (LP), uncertain significance, likely benign, and benign.2 The Quest Diagnostic Laboratory scoring system quantitatively assigns weighted scores for data from 5 categories: prediction tools, population frequency, co-occurrence, segregation, and functional studies.10 This metric conveys the relative deleteriousness of the variant.

Development of CVAT involved design, implementation, and refinement of 7 processes: (1) review of the proband phenotype and confirmation of the phenotype sufficiency used/reported by the laboratory, (2) evidence for a disease arising from the variation at the reported genomic locus, (3) congruence of the proband phenotype with the established disease phenotype, (4) congruence of the reported variant with the established genetic and biological disease mechanisms, (5) evaluation of the potential deleteriousness of the reported genomic variation, (6) clinical conclusion regarding relevance of the reported variation to the proband’s condition, and (7) disposition regarding closure or pursuit of further studies for understanding of the proband and/or the variant (Supplemental Figure 1, Supplemental Material 1). Definitions and annotations are listed in Supplemental Table 1.
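The ordering of the 7 processes can be illustrated with a minimal Python sketch. It assumes that processes 2 through 5 act as sequential gates, so that the first unmet criterion determines where a variant leaves consideration; the class, field, and function names are hypothetical and are not taken from the CVAT implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VariantAssessment:
    """Clinician judgments for one reported variant (hypothetical field names)."""
    phenotype_sufficient: bool        # process 1: phenotype reviewed and sufficient
    gene_disease_established: bool    # process 2: disease arises from variation at this locus
    phenotype_congruent: bool         # process 3: proband phenotype matches the disease
    mechanism_congruent: bool         # process 4: variant fits the disease mechanism
    deleteriousness_sufficient: bool  # process 5: variant-level evidence is adequate


def clinical_conclusion(a: VariantAssessment) -> Tuple[str, Optional[str]]:
    """Process 6: return the conclusion and the first unmet criterion, if any.

    Assumption: processes 2-5 are applied in order and the first failure ends
    the evaluation. Process 1 is recorded on the assessment but is not used as
    a gate in this sketch.
    """
    gates = [
        ("gene-disease association", a.gene_disease_established),
        ("phenotype-disease congruence", a.phenotype_congruent),
        ("disease mechanism congruence", a.mechanism_congruent),
        ("variant deleteriousness", a.deleteriousness_sufficient),
    ]
    for criterion, met in gates:
        if not met:
            return ("does not support a diagnosis", criterion)
    return ("supports a diagnosis", None)


def disposition(conclusion: str) -> str:
    """Process 7: a simplified closure-or-pursue decision (illustrative only)."""
    if conclusion == "supports a diagnosis":
        return "close"
    return "consider further studies"
```

Recording the first unmet criterion, rather than only a pass or fail result, is what lets the outcome of each step be tallied separately.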

Study cohort

Consecutive exome reports from 289 individuals evaluated through the PMGP BC (BC Women’s and Children’s Hospital, Vancouver) were included in this study (Figure 1, Supplemental Materials 1 and 2). The probands had diverse demographic characteristics and indications for testing (Supplemental Table 2).

Figure 1. Flowchart illustrating the application of the Clinical Variant Analysis Tool to variants listed on 289 clinical laboratory reports for exome sequencing.

Each step of the analysis is shown in a different color. The outcomes for each step are shown along the right side and the overall outcomes are at the bottom. This analytical framework assesses causality and predicted variant deleteriousness. Evaluation of 289 exome reports found that 42.9% of reports and 38.2% of reported variants supported a diagnosis. Of the reports supporting a diagnosis, 89.5% supported a diagnosis of a single disorder and 10.5% supported a diagnosis of multiple disorders based on variants in established disease genes; the latter rose to 14.2% with inclusion of variants in candidate disease genes. Overall, 79.5% of variants supporting a diagnosis were classified as P or LP by the laboratory, whereas 20.5% of variants supporting a diagnosis were classified as VUS by the laboratory. Evidence used by clinicians to interpret a VUS as supporting a diagnosis included reverse phenotyping, functional studies, or parental segregation for 24, 5, and 5 variants, respectively. 3D, three-dimensional; ACMG, American College of Medical Genetics; AMP, Association of Molecular Pathology; HPO, Human Phenotype Ontology; LP, likely pathogenic; P, pathogenic; VUS, variant of uncertain significance.

Clinical exome sequencing

Proband only (n = 171), duo (n = 6), or trio/quad (n = 112) clinical exome sequencing was performed at multiple commercial clinical laboratories. Each was approved by BC’s Agency for Pathology and Laboratory Medicine (June 2018 to March 2021) or its antecedent agency (January 2017 to June 2018) (Supplemental Material 1).

Assessing sufficiency of clinical phenotyping through comparison to annotated diseases

To assess whether sufficient phenotypic features of the proband were available for genomic variant prioritization, we transformed phenotypes on the laboratory report into Human Phenotype Ontology (HPO) terms using the program Phenopacket Generator and scored each phenopacket using the Annotation Sufficiency Tool, which compares the proband phenotype with annotated diseases within the HPO database.11 An annotation score of 1 means that the query terms have the same breadth and depth as those used to annotate diseases in the HPO database, whereas a score of 0.9 means the terms have 90% of the breadth and depth of the annotations. Per previous studies, a sufficiency score ≥0.7 improved the precision of prioritizing genomic variation by removing numerous false positives without dramatically altering the overall recall of diagnoses (P. Robinson, unpublished data).
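As a minimal sketch of how the ≥0.7 cutoff might be applied to a set of reports, the following Python snippet flags scores below the cutoff and computes the fraction of low-sufficiency reports. The function names and the example scores are hypothetical; computing the score itself is the job of the Annotation Sufficiency Tool and is not reproduced here.

```python
from typing import Sequence

SUFFICIENCY_CUTOFF = 0.7  # cutoff stated in the text


def is_phenotype_sufficient(score: float) -> bool:
    """A score at or above the cutoff is treated as sufficient."""
    return score >= SUFFICIENCY_CUTOFF


def fraction_below_cutoff(scores: Sequence[float]) -> float:
    """Fraction of reports whose sufficiency score falls below the cutoff."""
    if not scores:
        raise ValueError("at least one score is required")
    low = sum(1 for s in scores if not is_phenotype_sufficient(s))
    return low / len(scores)


# Hypothetical scores for four reports, for illustration only.
example_scores = [0.9, 0.5, 0.8, 0.6]
print(fraction_below_cutoff(example_scores))
```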

Graphical comparison of the proband phenotype with the annotated disease phenotype

To visualize the overlap of the proband’s phenotypic features with those of the disease(s) associated with the reported genomic variation, we used the graphical output of LIRICAL.12 The graphs were analyzed by 3 clinicians for features suggestive of single or multiple disorders.

Bioinformatic evaluation of reported variants

Exomiser, a tool developed to prioritize genomic variation based on phenotype similarity and variant deleteriousness, provides phenotype, variant, and overall scores for each genomic variant.13,14 Using each of these scores, we assessed whether (1) the reported variants resided in candidate disease loci, (2) the patient phenotype matched that curated to the variant locus-associated human disease, and (3) the reported variants were predicted deleterious. Higher phenotype, variant, and overall scores respectively represent increased phenotype similarity, predicted deleteriousness, and potential relevance to the patient’s disorder. We used phenotype, variant, and overall score cutoffs of 0.6, 0.8, and 0.7, respectively, in accordance with a previous study of several thousand solved cases from the 100,000 Genomes Project; these thresholds improved precision by removing numerous false positives without dramatically altering the overall recall of diagnoses (D. Smedley, unpublished data).
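The three cutoffs can be expressed as a small lookup, sketched here in Python. The dictionary keys and the function name are hypothetical, and the snippet assumes that the three Exomiser scores have already been extracted from the Exomiser output; it does not call Exomiser. Scores strictly below a cutoff are treated as low, which is an assumption about how ties at the cutoff are handled.

```python
from typing import Dict, List

# Cutoffs stated in the text for the Exomiser phenotype, variant, and overall scores.
CUTOFFS: Dict[str, float] = {"phenotype": 0.6, "variant": 0.8, "overall": 0.7}


def low_scores(scores: Dict[str, float]) -> List[str]:
    """Return the names of the scores that fall below their cutoff."""
    return [name for name, cutoff in CUTOFFS.items() if scores[name] < cutoff]


# Hypothetical variant: deleterious by prediction, but a poor phenotype match.
example = {"phenotype": 0.4, "variant": 0.95, "overall": 0.65}
print(low_scores(example))  # ['phenotype', 'overall']
```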

To compute the Exomiser scores, we generated Variant Call Format (VCF) files for reported variants using an in-house custom Java application that queried the Ensembl GRCh37 variant recoder endpoint for a VCF string (https://uswest.ensembl.org/info/docs/tools/vep/recoder/index.html)15 and subsequently generated a VCF file (Supplemental Material 1). We submitted the VCF files and proband phenotype to Exomiser (version 13.0.0, 2102 data release) using the default exome analysis parameters. For additional assessment of predicted deleteriousness of missense variants, we used Local Identity and Shared Taxa (LIST-S2) (https://precomputed.list-s2.msl.ubc.ca), an algorithm that exploits local sequence identity and taxonomy distances.16
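The second half of this step, turning a VCF string into a VCF file, might look like the following Python sketch. It is not the authors' Java application and does not query Ensembl. It assumes that each VCF string has the form chromosome-position-reference-alternate joined by hyphens, and the function names and the example variant are hypothetical.

```python
from typing import Iterable


def vcf_record(vcf_string: str) -> str:
    """Convert 'CHROM-POS-REF-ALT' (assumed input format) into one VCF data line."""
    parts = vcf_string.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected CHROM-POS-REF-ALT, got {vcf_string!r}")
    chrom, pos, ref, alt = parts
    if not pos.isdigit():
        raise ValueError(f"position must be an integer, got {pos!r}")
    # ID, QUAL, FILTER, and INFO are unknown here, so they are written as '.'.
    return "\t".join([chrom, pos, ".", ref, alt, ".", ".", "."])


def vcf_text(vcf_strings: Iterable[str]) -> str:
    """Build the text of a minimal VCF file without sample columns."""
    header = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    return "\n".join(header + [vcf_record(s) for s in vcf_strings]) + "\n"


# Made-up variant string for illustration; not a variant from the study.
print(vcf_text(["1-1000-A-G"]))
```

A file intended for a trio analysis would also need FORMAT and per-sample genotype columns, which this sketch omits.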

Results

Hypothesizing that a clinically relevant and logical process incorporating the ClinGen framework,7 the ACMG/AMP scoring system,2 and the Quest Diagnostic Laboratory scoring system10 provides a transparent and understandable framework for variant interpretation within genomic medicine practice, we developed CVAT (Supplemental Material 1). Two junior practitioners (endocrinology and genetics fellows) without prior formal medical genetics training used CVAT to analyze reported variants for 289 clinical exomes. They had 94% concordance; discordant interpretations were resolved by discussion and consultation with an experienced medical geneticist.

The probands for these exomes were 50% male, 40.5% female, and 10% unknown (fetal samples). They ranged in age from fetal to adult and had diverse indications for testing (Supplemental Table 2) and diverse ethnicities (Supplemental Figure 2). Subsequent sections contain the results that represent the outcomes using CVAT (Figure 1).

CVAT identifies reported variants not meeting the established ClinGen validity framework

Of the 289 clinical exome reports, 219 (75.8%) listed at least 1 variant and 70 (24.2%) listed no variants (Figure 1). Of the total 412 variants analyzed, all were primary diagnostic findings, and none were ACMG secondary findings. In total, 30 (7.3%) variants were categorized as occurring in candidate disease genes, ie, genes not yet associated with human disease; we excluded these from subsequent analyses. The remaining 382 (92.7%) variants were reported as residing in previously established disease genes, genes known to be associated with the phenotype, or genes possibly associated with the phenotype.

Of these 382 variants, 40 (10.5%) resided in genomic loci without an established association with human disease; these variants were not reported as residing in candidate disease genes (Figure 1, Table 1). Another 77 (20.2%) variants resided in genomic loci for which the associated diseases did not explain the proband phenotype, ie, the proband phenotype and the phenotypic features of the disease(s) associated with the variant(s) were incongruent (Figure 1). Another 28 (7.3%) variants were inconsistent with the established mechanism of disease, eg, reporting of an amorphic allele when the established mechanism of disease is hypermorphic variation (Figure 1). The total number of variants deficient for each evidential category is described in Table 1. In summary, 39 reported P or LP variants and 106 reported variants of uncertain significance (VUS) had insufficient genomic locus or variant evidence to be considered causal of human disease.
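The percentages in this paragraph follow directly from the stated counts, as the short Python check below shows. The dictionary labels are paraphrases chosen for this sketch; all counts are taken from the text.

```python
REPORTED_VARIANTS = 382  # variants reported in established or possibly associated genes

# Variants removed at each causality step, as stated in the text.
removed = {
    "no established gene-disease association": 40,
    "proband phenotype incongruent with disease": 77,
    "inconsistent with disease mechanism": 28,
}


def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for reason, count in removed.items():
    print(f"{reason}: {count} ({pct(count, REPORTED_VARIANTS)}%)")

total_removed = sum(removed.values())
print(total_removed, pct(total_removed, REPORTED_VARIANTS))  # 145 38.0

# The same 145 variants split by laboratory classification: 39 P/LP and 106 VUS.
assert total_removed == 39 + 106
```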

Table 1.

Distribution of the evidential deficiencies of causality (gene–disease association, phenotype–disease association, disease mechanism) for 145 variants listed on 289 clinical exome reports

| Inadequate Evidential Feature | Classification (a) | Number of Variants (b) | % Reported Variants |
| --- | --- | --- | --- |
| Gene–disease association | Preliminary | 31 | 8.2 |
| | None | 9 | 2.4 |
| | Total | 40 | 10.5 |
| Phenotype–disease association | Preliminary | 15 | 3.9 |
| | None | 88 | 23 |
| | Total | 103 | 27 |
| Disease mechanism | Inconsistent | 68 | 17.9 |
| | Unknown | 10 | 2.6 |
| | Total | 78 | 20.5 |
(a) See Supplemental Table 1 for definitions of the classifications.

(b) The total number of variants deficient for each evidential category. This is distinct from Figure 1, which shows where in the process variants were removed from consideration.

CVAT identifies reported variants with insufficient deleteriousness as defined by the ACMG/AMP standards

For the 241 variants meeting the antecedent criteria, we assessed variant deleteriousness as defined by the ACMG/AMP standards2 and the Quest Diagnostics scoring system.10 According to these variant-dependent, not proband-dependent, criteria and those defined by MacArthur et al,1 91 (37.8%) variants did not have sufficient deleteriousness to support a diagnosis (Figure 1, Table 1). Across 9 evidential domains of deleteriousness, variants supporting a diagnosis had evidence of deleteriousness in an average of 3.1 categories (SD = 1.4), whereas variants not supporting a diagnosis had evidence for deleteriousness in an average of 1.6 categories (SD = 1.0) (Figure 2A). For variants not supporting a diagnosis, 81.9% had population data evidence (ie, absent/low variant frequency in Genome Aggregation Database), 43.6% had computationally predicted deleteriousness (conservation across species, in silico predictions [Sorting Intolerant From Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), Combined Annotation Dependent Depletion (CADD), LIST-S216-19]), and 22.3% had evidence against deleteriousness.
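The per-variant tally behind the category averages can be sketched as follows in Python. The category labels are shorthand chosen for this sketch for the 9 evidential domains, the example variants are hypothetical, and the use of the sample standard deviation is an assumption because the text does not state which form of SD was reported.

```python
from statistics import mean, stdev
from typing import Dict, List

# Shorthand labels (chosen for this sketch) for the 9 evidential domains.
EVIDENCE_CATEGORIES = (
    "population", "computational", "predictive", "functional", "structural",
    "segregation", "de novo", "allelic", "precedence",
)


def count_supporting_categories(evidence: Dict[str, bool]) -> int:
    """Count the categories in which a variant has evidence of deleteriousness."""
    unknown = set(evidence) - set(EVIDENCE_CATEGORIES)
    if unknown:
        raise KeyError(f"unknown evidence categories: {sorted(unknown)}")
    return sum(1 for present in evidence.values() if present)


# Three hypothetical variants, for illustration only.
variants: List[Dict[str, bool]] = [
    {"population": True},
    {"population": True, "computational": True, "functional": False},
    {"population": True, "predictive": True, "de novo": True},
]
counts = [count_supporting_categories(v) for v in variants]
print(counts, mean(counts), stdev(counts))  # [1, 2, 3] 2 1.0
```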

Figure 2. Evaluation of variant level evidence for deleteriousness.

A. Assessment of 228 variants with sufficient gene–disease association, phenotype–disease concordance, and disease mechanism congruence. Of these variants, 60.5% had features of deleteriousness supporting a diagnosis, whereas 39.5% did not. Variants supporting a diagnosis had more categories of evidence (mean = 3.1 ± 1.4 SD) supporting deleteriousness than variants that did not (mean = 1.6 ± 1.0 SD). Of variants not supporting a diagnosis, 28% had evidence against deleteriousness. Overall, 84% of variants not supporting a diagnosis fulfilled the population data evidence (ie, absent/low variant frequency in Genome Aggregation Database) for deleteriousness. The characteristics of the 9 categories of evidence are as follows: (1) population data assess rarity in or absence from population databases; (2) computational data are generated using in silico tools (SIFT, PolyPhen, CADD, LIST-S2) to predict potential variant deleteriousness; (3) predictive data represent the expected consequence of the variation based on variant type (eg, null variants) or evidence derived from a different pathogenic/likely pathogenic missense variant or amino acid changes at the same site; (4) functional data represent literature or studies done on the proband and/or the variant to assess deleteriousness; (5) structural data assess whether the variant resides in a mutational hotspot or might deleteriously alter a functional domain or a constrained structure; (6) segregation data document if the variant segregates with the phenotype in the family; (7) de novo data define if the variant is absent in the parents in the context of no prior family history of the phenotype; (8) allelic data assess whether the variant occurs in trans with a pathogenic/likely pathogenic variant for a recessive disease; and (9) replicability or precedence data define whether the variant has previously been reported with the associated phenotype.
(B) Overall Exomiser scores and (C) LIST-S2 scores for the 228 variants supporting (left) or not supporting (right) a diagnosis. Exomiser scores (overall [left], phenotype [middle], and predicted variant deleteriousness [right]) for the 228 variants (D) supporting or (E) not supporting a diagnosis. Panels (B–E) represent the 228 variants represented in panel (A). (B, C) X represents the mean score, the line represents the median score, the box represents the first to third quartile of the scores, and the whiskers represent the maximum to minimum score. Additional outlier scores are represented as dots. CADD, Combined Annotation Dependent Depletion; LIST-S2, Local Identity and Shared Taxa; PolyPhen, Polymorphism Phenotyping; SIFT, Sorting Intolerant From Tolerant.

Patient phenotype sufficiency scores are comparable between reports with variants supporting a diagnosis and those without

To assess whether sufficient phenotypic information was provided for laboratory variant analysis, we assumed that a conservative measure of the phenotypic information submitted to and used by the laboratory is the phenotype listed on the report. Using the Annotation Sufficiency Tool,11 we found that the mean sufficiency score was 0.90 ± 0.26 SD (median = 0.9, mode = 0.8). The sufficiency score was not lower for exomes without variants supporting a diagnosis (mean = 0.94 ± 0.28 SD) compared with those with variants supporting a diagnosis (mean = 0.84 ± 0.22 SD) (Figure 3A). This also did not detect major differences in phenotyping among the 3 experienced medical geneticists (Figure 3B). Overall, 13.5% of reports had a sufficiency score of <0.7. Of these reports, 64.1% had variants supporting a diagnosis. In total, 46.2% of reports with low phenotype sufficiency were for probands with neurological disorders, whereas the difficult-to-phenotype prenatal and intensive care probands constituted 10.3% and 28.2%, respectively. Although this highlighted a need for more detailed phenotyping of neurological conditions or a need for more HPO terms describing neurological disorders, it did not detect an overall insufficiency of phenotypic information for laboratory reporting.

Figure 3. Bioinformatic assessment of proband phenotypes and clinically reported exome sequencing variants.

A. Phenotypic sufficiency scores for the 2 cohorts studied. B. Comparison of the phenotypic sufficiency scores for the probands contributed by the different clinicians. C. Pareto chart demonstrating overall Exomiser score for variants supporting a diagnosis. D. Pareto chart demonstrating overall Exomiser score for variants not supporting a diagnosis. E. Examples of categorical graphical representation of phenotype in the context of genotype to detect dual diagnoses. The left shows graphical representations of the phenotypic features of a proband with diagnoses of cerebral cavernous malformations (KRIT1) and polycythemia vera (JAK2). The right shows graphical representations of the phenotypic features of a proband with diagnoses of Wilson disease (ATP7B) and Bethlem myopathy (COL6A1); the patient also has STS deficiency, which explains the ichthyosis. The graphical output was generated using LIRICAL.12 The green bars represent phenotypic features consistent with the disorder. The red bars represent phenotypic features not consistent with the disorder. (A, B) X represents the mean score, the line represents the median score, the box represents the first to third quartile of the scores, and the whiskers represent the maximum to minimum score. Additional outlier scores are represented as dots. HP, Human Phenotype Ontology Term Identifier.

Bioinformatic analysis concurs with results from CVAT

Given the divergence between the laboratory reports and CVAT interpretations, we submitted the phenotype and the variants included on each report for Exomiser analysis.13,14 Exomiser provides a score for predicted variant deleteriousness, a score for phenotypic similarity to traits associated with disease-predisposing variation of the genomic locus in question, and an overall score derived from the variant deleteriousness and phenotype similarity scores.13,14 The phenotype score for a proband and a locus in question reflects similarity to the respective human disease, model organisms (mouse, zebrafish), and disorders associated with interacting proteins.

A total of 202 (52.2%) reported variants had an overall Exomiser score of <0.7; this approximates the 61.8% of variants that CVAT identified as not supporting a diagnosis. Among variants with an overall Exomiser score <0.7, the numbers of variants with a low phenotype score (<0.6), a low variant deleteriousness score (<0.8), or both were 156 (77.2%), 119 (59%), and 84 (41.6%), respectively.
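The three counts overlap, and the breakdown can be made explicit with inclusion-exclusion, sketched below in Python. The sketch assumes that the 84 variants with both low scores are also counted within the 156 and the 119, which is the only reading under which the three counts fit within 202 variants; the variable names are hypothetical.

```python
TOTAL_LOW_OVERALL = 202   # variants with overall Exomiser score < 0.7
LOW_PHENOTYPE = 156       # phenotype score < 0.6
LOW_VARIANT = 119         # variant deleteriousness score < 0.8
LOW_BOTH = 84             # both scores low (assumed to be included in the two counts above)

# Inclusion-exclusion: variants with at least one low component score.
low_either = LOW_PHENOTYPE + LOW_VARIANT - LOW_BOTH
low_phenotype_only = LOW_PHENOTYPE - LOW_BOTH
low_variant_only = LOW_VARIANT - LOW_BOTH
low_neither = TOTAL_LOW_OVERALL - low_either

print(low_either, low_phenotype_only, low_variant_only, low_neither)  # 191 72 35 11
```

Under this reading, a low phenotype score alone accounts for roughly twice as many low overall scores as a low variant deleteriousness score alone.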

Focusing on discrepancies between CVAT and Exomiser, we identified 20 (5.1%) variants interpreted as supporting a diagnosis and with overall Exomiser score <0.7 and 67 (17.3%) variants interpreted as not supporting a diagnosis and with overall Exomiser score >0.7 (Figure 3C and D). For the former (20 variants), 80% had a low phenotype score (<0.6); this was attributable to low phenotype sufficiency, multiple diagnoses (ie, a blended phenotype), or a complex phenotype (eg, features of acquired illness commingled with those of the genetic disease). For the latter (67 variants), 34 (50.7%) had phenotype scores >0.6 and 64 (95.5%) had variant deleteriousness scores >0.8. To test whether these 67 variants were being prioritized by model organism or interactome data, we repeated the Exomiser analysis with human gene-specific phenotype data and found that only 32 (47.8%) of these variants had Exomiser phenotype scores >0.6 and only 35 (52.2%) had overall Exomiser score >0.7. For the 32 variants with human-specific Exomiser score <0.7, we postulate that these reside in candidate disease genes. For the remaining 35 variants, the clinician judged that divergence of the proband phenotype from that annotated to the disease was too great to attribute the diagnosis; however, this finding might suggest an expansion of the disease-associated phenotype and warrant reconsideration of the diagnosis.

To understand better the variants discounted by CVAT, we used Exomiser and LIST-S2 scores to reassess predicted deleteriousness for variants that had met the criteria of (1) sufficient evidence for locus (gene)–disease association, (2) congruence of the proband phenotype with the associated disease, and (3) concordance of the variant with the known biological mechanism of disease (Figure 2B and C). Of those judged to support a diagnosis, 86.9% had overall Exomiser score >0.7, whereas only 30% of those not supporting a diagnosis did. The low overall Exomiser scores of the latter arose predominantly from low predicted variant deleteriousness as opposed to low phenotype scores (Figure 2D and E).

CVAT standardizes evidential transparency and enables genomic medicine practice

To assess the performance of the CVAT process, we compared the application of this tool by 2 fellows with the practice of 3 experienced medical geneticists. For the 146 variants supporting a diagnosis, 116 (79.5%) met the ACMG/AMP criteria for a P or LP classification, were reported by the laboratory as such, and were interpreted by the medical geneticist as supporting a diagnosis. For 30 (20.5%) VUS, the responsible medical geneticist had judged them as supporting a diagnosis; use of CVAT also highlighted these VUS (Quest score ≥4.5) and the additional evidence supporting or needed to support a diagnosis. The medical geneticists decided that these VUS supported a diagnosis based on reverse deep phenotyping (73.3%, eg, pathognomonic trait), functional studies (13.3%, eg, biochemical assay), or familial segregation (13.3%) (Figure 1). The synthesis of this additional information into CVAT supported these decisions. Notification of the testing laboratory about the additional information led to reclassification of 8 (26.7%) of the 30 VUS to P or LP. In summary, CVAT facilitated genomic medicine practice across practitioners by harmonizing evidential interpretation, standardizing communication of evidence, and providing a framework for equitable care.

CVAT also detected that 1 P/LP variant and 2 VUS, which had been considered as supporting a diagnosis, lacked congruence with the established genetic and biological mechanism of disease. This required withdrawal of 3 diagnoses from 2 probands. In conclusion, junior practitioners using CVAT had 99.8% concordance with experienced medical geneticists for variant interpretation.

The identification of multiple diagnoses is assisted by graphical representation of proband–disease phenotype congruence

CVAT identified 146 variants supporting diagnoses for 124 of 289 (42.9%) individuals tested. Of these, 13 (10.5%) were given diagnoses of multiple genetic disorders on initial analysis of the laboratory report by the medical geneticist. Although consistent with prior studies,20-24 this is less than the 14.0% predicted by a Poisson model or the 26.4% predicted by an independence model.25,26 Because current analytical strategies for diagnostic genomics in rare-disease medicine tend to follow a heuristic paradigm of a single unifying diagnosis, we hypothesized that categorical graphical representation of proband–disease phenotype congruence detects additional diagnoses (Figure 3E, Supplemental Material 1). To test this, we used LIRICAL to generate visual representations for 119 of 124 exome reports listing single nucleotide variants supporting a diagnosis.12 This correctly represented the multiple diagnoses of 4 of 12 probands initially given multiple diagnoses by the clinician, aided the attribution of a second molecular diagnosis in a proband initially given a single molecular diagnosis, and identified an additional 20 probands with potential multiple diagnoses. Clinician review concurred for 15 of these 20 probands (Supplemental Tables 3 and 4). Among the 28 (22.6%) probands with established or putative multiple diagnoses, 13 had 2 established molecular etiologies and 15 had a single established molecular etiology (Supplemental Table 5). The findings suggest that with expansion of information on the genetic bases of human disease, the prevalence of established multiple genetic diagnoses will approximate the predicted 14% to 26%.25,26

Clinician analysis of all variants on clinical exome reports is feasible and can be assisted by evidential transparency of laboratory reports

To facilitate evaluation and enable consistency among practitioners, we built CVAT into REDCap27,28 (Supplemental Figure 1, Supplemental Material 1). This also enabled electronic data capture, management, and consolidation. Medical geneticist and genetic counselor use of the tool averaged 20 to 30 minutes per variant (range = 10 minutes to 4 hours). Because these analyses are facilitated by transparent laboratory reports, we propose recommendations for reporting and documentation of variants (Supplemental Tables 6 and 7) and provide an example of an idealized laboratory report that might aid clinical interpretation (Supplemental Material 1).

Discussion

We developed CVAT, a tool that enables clinical staff to define methodically the evidence supporting a reported genomic variant as an etiology for a proband’s disease. Using CVAT to assess the medical actionability of genomic variants listed on 289 clinical exome sequencing reports, we found that 38.2% of reported variants and 42.9% of exome reports supported a diagnosis, whereas 38% of reported variants (39 P/LP and 106 VUS) had insufficient causality and an additional 24% of reported variants (91 VUS) had insufficient deleteriousness. All clinically reported variants, regardless of the reported variant ACMG/AMP classification, required evaluation before use in medical decisions. The systematic evaluation of causality and deleteriousness using CVAT enabled standardized clinical interpretation of genomic variation and enhanced communication and efficiency within genomic medicine practice.

In analyzing why 61.8% of clinically reported variants did not support a diagnosis in the respective proband, we made several observations. Among the clinicians whose exome reports were included in this study, inadequate proband phenotype was not a major issue. Apart from difficulties in describing neurological disorders, the phenotype sufficiency scores for tested individuals had on average 90% of the breadth and depth of the terms used to annotate diseases in the HPO database. Supporting the sufficiency of phenotype, manual and bioinformatic analyses detected that many reported variants resided in candidate disease genes as judged by phenotypic overlap with model organisms or proteins within the interactome. Interestingly, despite the sufficiency of phenotype provided to testing laboratories, many reported variants were judged by manual and bioinformatic evaluation to reside in genes associated with diseases that had poor phenotypic overlap with the probands. We postulate that inclusion of such variants on clinical exome reports arises from narrative potential, ie, the possibility of any given variant providing a seemingly compelling but statistically poorly justified story of influence on phenotype.29 This highlights a need for provision of adequate, informative phenotypic information to the laboratory and collaborative interpretation of genomic variation between the laboratory and the clinicians to mitigate biases, to reduce errors of misattribution, and to ensure rigor in variant assessment through application of a logical framework, statistical evidence, and validated bioinformatic tools.

The deficiency in dual diagnosis detection observed by us and others25,26 could arise in part from bias impeding recognition of multiple diagnoses. Clinicians are trained to look for a single unifying diagnosis explanatory of the proband’s problem, ie, to use Occam's razor. In addition, a blended phenotype is often difficult to separate. Categorical evaluation of phenotypes using the graphical output generated by LIRICAL (Figure 3E) addressed both impediments, at least partially. This output identified additional probands with likely dual diagnoses and increased the frequency of putative dual diagnoses in our cohort to 22.3%, as had been predicted.25,26

The results of our study emphasize that the practice of genomic medicine is dependent on close collaboration and information sharing between the clinical laboratory and the clinical team. Clinicians must provide an appropriately detailed phenotypic evaluation to allow optimal variant prioritization and must rigorously and logically analyze the reported genomic variation. Reciprocally, transparency of the evidence supporting reported variation by the laboratories aids and expedites clinical interpretation. In addition, if laboratory reports provide categorical evaluation and graphical representation of the proband phenotype in the context of the reported variants, this might enhance conceptualization of multiple diagnoses by clinicians. Finally, placing the genomic findings in the context of the proband and providing recommendations for further testing, as appropriate, aids clinicians in the practice of genomic medicine.

Current standards for variant reporting provide robust recommendations for evaluation of variant pathogenicity, although findings herein suggest an inadequacy in the ascertainment of causality for gene–disease and phenotype–disease associations.2,3,30,31 The findings further highlight a need for the application of a systematic framework such as CVAT to contextualize clinically reported genomic variation and a need for clinician education in variant evaluation.

This education necessitates incorporation of genomics and genetic medicine into the core competency of medical geneticists32 and perhaps other subspecialists. Currently, clinicians largely rely on the laboratory to report genomic variants with a high probability of disease association to achieve a binary diagnostic outcome. The practice of genomics is evolving, however, to encompass interpretation of polygenic and multifactorial genetic contributors and recurrent evaluation of genomic data in the context of an evolving proband phenotype. We envision a future state in which referral to medical geneticists will come with access to the full data set (VCF and BAM files) and an integrated interpretation providing a likelihood of a health or disease trajectory, ie, quantitative trait analysis and counseling anchored in genomic, phenotypic, and environmental variation.33

This study has several limitations. It included only 289 exome reports from 3 experienced medical geneticists working in the PMGP BC. Although meticulously curated and evaluated by clinicians, this cohort did not include exome reports from other clinicians within the clinic, and the findings were not replicated in other centers. Replication using larger cohorts might provide more perspective on the issues identified by this study. The analyses focused on clinically reported exome variants and did not re-evaluate the original VCF files for variants that might not have been reported or evaluate bioinformatic pipelines. These analyses were beyond the capacity of the clinic's resources.

In conclusion, use of CVAT enabled definition of the evidence surrounding reported variation, clearer differentiation of disease-associated variation, and assessment of the utility of reported data. The practice of genomic medicine requires rigorous application of a systematic framework incorporating analyses for causality, phenotypic overlap, disease mechanism, and assessment of variant deleteriousness. CVAT enables clinicians to incorporate genomic variation into clinical management consistently, judiciously, and effectively. Furthermore, it not only encourages cooperation among clinicians and laboratories to improve responsible use of genomic data and avoid errors of misattribution, but also emphasizes how transparent, rigorous, and thorough laboratory reports contextualize genomic variants and improve the practice of genomic medicine.
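The four analyses named above can be pictured as a sequence of checks applied to each reported variant. The sketch below is illustrative only and does not reproduce CVAT's actual scoring: the class `VariantEvidence`, its boolean fields, the order of the checks, and the function `first_failed_check` are all assumptions introduced here, with each check reduced to a yes/no judgment that a clinician would supply.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VariantEvidence:
    """Hypothetical yes/no summary of a clinician's judgments for one variant."""
    gene_disease_established: bool    # gene has an established disease association
    phenotype_concordant: bool        # proband phenotype overlaps the disease
    mechanism_consistent: bool        # variant fits the known disease mechanism
    deleteriousness_sufficient: bool  # enough evidence that the variant is deleterious

# Assumed order: the three causality checks first, then deleteriousness.
CHECKS = [
    ("gene_disease_established", "no established gene-disease association"),
    ("phenotype_concordant", "insufficient phenotypic concordance"),
    ("mechanism_consistent", "conflicts with known disease mechanism"),
    ("deleteriousness_sufficient", "insufficient evidence for deleteriousness"),
]

def first_failed_check(evidence: VariantEvidence) -> Optional[str]:
    """Return the reason for the first failed check, or None if all pass."""
    for field_name, reason in CHECKS:
        if not getattr(evidence, field_name):
            return reason
    return None

# Example: a variant in a well-established disease gene whose associated
# disease does not match the proband's phenotype.
example = VariantEvidence(
    gene_disease_established=True,
    phenotype_concordant=False,
    mechanism_consistent=True,
    deleteriousness_sufficient=True,
)
reason = first_failed_check(example)
print(reason if reason else "supports a diagnosis")
```

Stopping at the first failed check assigns each variant to exactly one category, which is how the outcome categories in this study partition the 382 reported variants without overlap.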

Supplementary Material

Supplemental methods and figures
Supplemental table

Acknowledgments

We thank many colleagues for thoughtful discussions and critique of the manuscript. Funding was provided in part by Monarch R24 (2R24OD011883-05A1), the National Institute of Child Health and Human Development, United States (1R01HD103805-01), and the National Human Genome Research Institute, United States (RM1 HG010860).

Footnotes

Conflict of Interest

Julius O.B. Jacobsen and Damian Smedley are paid consultants to Congenica Ltd. All other authors declare no conflicts of interest.

Ethics Declaration

The requirement for ethics approval for this study was waived by the University of British Columbia/BC Women’s and Children’s Hospital Research Ethics Board because the study evaluates de-identified data for a quality improvement purpose.

Additional Information

The online version of this article (https://doi.org/10.1016/j.gim.2022.03.013) contains supplementary material, which is available to authorized users.

Data Availability

The anonymized data that support the findings of this study are available from the corresponding author on request. All variants supporting a diagnosis have been submitted to ClinVar (SCV002320767 - SCV002320876).

References

  • 1. MacArthur DG, Manolio TA, Dimmock DP, et al. Guidelines for investigating causality of sequence variants in human disease. Nature. 2014;508(7497):469–476. 10.1038/nature13127.
  • 2. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17(5):405–424. 10.1038/gim.2015.30.
  • 3. Ellard S, Baple EL, Callaway A, et al. ACGS Best Practice Guidelines for Variant Classification in Rare Disease 2020. Association for Clinical Genomic Science. Published 2020. https://www.acgs.uk.com/media/11631/uk-practice-guidelines-for-variant-classification-v4-01-2020.pdf. Accessed March 10, 2021.
  • 4. Brnich SE, Abou Tayoun AN, Couch FJ, et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019;12(1):3. 10.1186/s13073-019-0690-2.
  • 5. Rivera-Muñoz EA, Milko LV, Harrison SM, et al. ClinGen Variant Curation Expert Panel experiences and standardized processes for disease and gene-level specification of the ACMG/AMP guidelines for sequence variant interpretation. Hum Mutat. 2018;39(11):1614–1622. 10.1002/humu.23645.
  • 6. Tavtigian SV, Greenblatt MS, Harrison SM, et al. Modeling the ACMG/AMP variant classification guidelines as a Bayesian classification framework. Genet Med. 2018;20(9):1054–1060. 10.1038/gim.2017.210.
  • 7. Strande NT, Riggs ER, Buchanan AH, et al. Evaluating the clinical validity of gene-disease associations: an evidence-based framework developed by the Clinical Genome Resource. Am J Hum Genet. 2017;100(6):895–906. 10.1016/j.ajhg.2017.04.015.
  • 8. Kotze MJ, Lückhoff HK, Peeters AV, et al. Genomic medicine and risk prediction across the disease spectrum. Crit Rev Clin Lab Sci. 2015;52(3):120–137. 10.3109/10408363.2014.997930.
  • 9. Rosenbaum JN, Berry AB, Church AJ, et al. A curriculum for genomic education of molecular genetic pathology fellows: a report of the Association for Molecular Pathology Training and Education Committee. J Mol Diagn. 2021;23(10):1218–1240. 10.1016/j.jmoldx.2021.07.001.
  • 10. Karbassi I, Maston GA, Love A, et al. A standardized DNA variant scoring system for pathogenicity assessments in Mendelian disorders. Hum Mutat. 2016;37(1):127–134. 10.1002/humu.22918.
  • 11. Washington NL, Haendel MA, Köhler S, Lewis SE, Robinson P, Smedley D. How good is your phenotyping? Methods for quality assessment. Zenodo. Published April 1, 2014. https://zenodo.org/record/834091#.W8ZnCxhlCV4. Accessed March 10, 2021.
  • 12. Robinson PN, Ravanmehr V, Jacobsen JOB, et al. Interpretable clinical genomics with a likelihood ratio paradigm. Am J Hum Genet. 2020;107(3):403–417. 10.1016/j.ajhg.2020.06.021.
  • 13. Smedley D, Jacobsen JOB, Jäger M, et al. Next-generation diagnostics and disease-gene discovery with the Exomiser. Nat Protoc. 2015;10(12):2004–2015. 10.1038/nprot.2015.124.
  • 14. Cipriani V, Pontikos N, Arno G, et al. An improved phenotype-driven tool for rare Mendelian variant prioritization: benchmarking Exomiser on real patient whole-exome data. Genes (Basel). 2020;11(4):460. 10.3390/genes11040460.
  • 15. Howe KL, Achuthan P, Allen J, et al. Ensembl 2021. Nucleic Acids Res. 2021;49(D1):D884–D891. 10.1093/nar/gkaa942.
  • 16. Malhis N, Jacobson M, Jones SJM, Gsponer J. LIST-S2: taxonomy based sorting of deleterious missense mutations across species. Nucleic Acids Res. 2020;48(W1):W154–W161. 10.1093/nar/gkaa288.
  • 17. Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC. SIFT missense predictions for genomes. Nat Protoc. 2016;11(1):1–9. 10.1038/nprot.2015.123.
  • 18. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–249. 10.1038/nmeth0410-248.
  • 19. Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46(3):310–315. 10.1038/ng.2892.
  • 20. Yang Y, Muzny DM, Reid JG, et al. Clinical whole-exome sequencing for the diagnosis of Mendelian disorders. N Engl J Med. 2013;369(16):1502–1511. 10.1056/NEJMoa1306555.
  • 21. Balci TB, Hartley T, Xi Y, et al. Debunking Occam’s razor: diagnosing multiple genetic diseases in families by whole-exome sequencing. Clin Genet. 2017;92(3):281–289. 10.1111/cge.12987.
  • 22. Farwell KD, Shahmirzadi L, El-Khechen D, et al. Enhanced utility of family-centered diagnostic exome sequencing with inheritance model-based analysis: results from 500 unselected families with undiagnosed genetic conditions. Genet Med. 2015;17(7):578–586. 10.1038/gim.2014.154.
  • 23. Lee H, Deignan JL, Dorrani N, et al. Clinical exome sequencing for genetic identification of rare Mendelian disorders. JAMA. 2014;312(18):1880–1887. 10.1001/jama.2014.14604.
  • 24. Retterer K, Juusola J, Cho MT, et al. Clinical application of whole-exome sequencing across clinical indications. Genet Med. 2016;18(7):696–704. 10.1038/gim.2015.148.
  • 25. Posey JE, Harel T, Liu P, et al. Resolution of disease phenotypes resulting from multilocus genomic variation. N Engl J Med. 2017;376(1):21–31. 10.1056/NEJMoa1516767.
  • 26. Liu P, Meng L, Normand EA, et al. Reanalysis of clinical exome sequencing data. N Engl J Med. 2019;380(25):2478–2480. 10.1056/NEJMc1812033.
  • 27. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research Electronic Data Capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–381. 10.1016/j.jbi.2008.08.010.
  • 28. Harris PA, Taylor R, Minor BL, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208. 10.1016/j.jbi.2019.103208.
  • 29. Goldstein DB, Allen A, Keebler J, et al. Sequencing studies in human genetics: design and interpretation. Nat Rev Genet. 2013;14(7):460–470. 10.1038/nrg3455.
  • 30. Matthijs G, Souche E, Alders M, et al. Guidelines for diagnostic next-generation sequencing. Eur J Hum Genet. 2016;24(1):2–5. 10.1038/ejhg.2015.226.
  • 31. Cresswell L, Wallis Y, Fews G, et al. General genetic laboratory reporting recommendations. Association for Clinical Genomic Science. Published February 4, 2020. https://www.acgs.uk.com/media/11649/acgs-general-genetic-laboratory-reporting-recommendations-2020-v1-1.pdf. Accessed March 5, 2021.
  • 32. Childs B. Genetic Medicine: A Logic of Disease. Johns Hopkins University Press; 1999.
  • 33. Di Sera T, Velinder M, Ward A, et al. Gene.iobio: an interactive web tool for versatile, clinically driven variant interrogation and prioritization. Sci Rep. 2021;11(1):20307. 10.1038/s41598-021-99752-5.
