Abstract
Inherited platelet disorders (IPD) are a heterogeneous group of rare disorders that affect platelet number and function and often predispose to other significant medical complications. In spite of the identification of over 50 IPD disease-associated genes, a molecular diagnosis is only identified in a minority (10%) of affected patients without a clinically suspected etiology. We studied a cohort of 21 pediatric patients with suspected IPDs by exome sequencing (ES) to: (1) examine the performance of the exome test for IPD genes, (2) determine if this exome-wide diagnostic test provided a higher diagnostic yield than has been previously reported, (3) evaluate the frequency of variants of uncertain significance identified, and (4) identify candidate variants for functional evaluation in patients with an uncertain or negative diagnosis. We established a high priority gene list of 53 genes, evaluated exome capture kit performance, and determined the coverage for these genes and disease-related variants. We identified likely disease causing variants in 5 of the 21 probands (23.8%) and variants of uncertain significance in 52% of patients studied. In conclusion, ES has the potential to molecularly diagnose causes of IPD, and to identify candidate genes for functional evaluation. Robust exome sequencing also requires that coverage of genes known to be associated with clinical findings of interest be carefully examined and supplemented if necessary. Clinicians who undertake ES should understand the limitations of the test and the full significance of results that may be returned.
1 ∣. INTRODUCTION
Inherited platelet disorders (IPDs) are complex, genetically heterogeneous, and may have a mild clinical presentation, making diagnosis challenging.1,2 Hematological and biochemical diagnostic assays are essential for clinical diagnostics, and genomic evaluation and identification of a molecular etiology can be critical to guide optimal treatment and follow up, as well as family counseling. As part of the Clinical Sequencing Exploratory Research (CSER) consortium,3 whose goals include the integration of genomic sequencing into clinical care via multidisciplinary approaches across a spectrum of symptomatic and healthy children and adults,3 the Children’s Hospital of Philadelphia (CHOP) Pediatric Sequencing (PediSeq) project focuses on determining best practices for genomic sequencing of pediatric patients with genetically heterogeneous conditions, including IPDs, risk for sudden cardiac arrest,4 hearing loss, intellectual disability/developmental delay, and mitochondrial disorders.
Some IPDs, although rare, are clinically severe, making diagnosis straightforward; however, other disorders, such as some inherited thrombocytopenias, may never present with a clinically significant bleeding event and are only identified incidentally by routine blood tests. In the latter cases, differentiation of an IPD from a secondary form of low platelet count such as immune thrombocytopenia is important for clinical management. Other patients with mild bleeding phenotypes can be diagnosed at a late age and represent a spectrum of disorders that are likely to be multifactorial and more common than previously thought.5 Overall, these issues make connecting genetic variation to an underlying IPD very challenging, and doing so remains an unmet medical need.6
Guidelines have been proposed for the evaluation of patients with suspected inherited platelet function disorders;7 however, the recommended functional tests are not available in all centers and cannot be easily performed on shipped specimens. Molecular evaluation via sequencing comes late in the algorithmic process because of a similar lack of availability and the difficulty of classifying rare novel variants as pathogenic (as opposed to benign polymorphisms); this difficulty results in the identification of many “variants of uncertain significance” (VUS), which are frustrating for patients and physicians alike.8 One major challenge in predicting the clinical significance of novel missense variants is the frequent lack of functional evidence.9 Recently, research and diagnostic laboratories have begun the process of including Next Generation Sequencing (NGS) panel-based assays within their workflows for the molecular diagnosis of IPD.10,11 Additionally, ES provides an effective method for diagnosis in both established IPD genes and discovery of new IPD disease genes.12
Our overall goal was to determine the utility and limitations of ES for analysis and molecular diagnosis of a heterogeneous IPD cohort at a large urban academic pediatric hospital with the following objectives: (1) examine the performance of the exome test for IPD genes, (2) determine if this exome-wide diagnostic test provided a higher diagnostic yield than has been previously reported, (3) evaluate the frequency of variants of uncertain significance identified, and (4) identify candidate variants for further functional evaluation in patients with an uncertain or negative diagnosis.
2 ∣. METHODS
2.1 ∣. Overview of the study population
Patients were eligible for enrollment if they had a suspected inherited thrombocytopenia or platelet function disorder. All of the patients were followed at the Children’s Hospital of Philadelphia (CHOP) Hemophilia Treatment Center at the time of study enrollment and had undergone standard diagnostic evaluation for inherited thrombocytopenia or platelet dysfunction including light transmission aggregometry, light microscopic evaluation and as appropriate, electron microscopy, flow cytometry, and targeted molecular analysis. Families provided informed consent prior to enrolling under an IRB approved protocol. The overall purpose of the CHOP PediSeq study was to identify the best methods for educating patients and families about exome sequencing, analyze exome sequencing data to identify results relative to patients with heterogeneous genetic conditions (including inherited platelet disorders), and give results to families in a clear, appropriate, and informative manner.
We enrolled 21 individuals with a clinical presentation suggestive of an underlying genetic cause of an inherited platelet disorder over a 12-month period (Supporting Information Table S1). Clinical diagnoses for the individuals included two (9.5%) with severe, congenital thrombocytopenia, two (9.5%) with dense granule deficiency, eight (38.1%) with macrothrombocytopenia, one (4.8%) with mild macrothrombocytopenia and platelet dysfunction, five (23.8%) with primarily platelet dysfunction, one (4.8%) with significant bleeding diathesis due to platelet dysfunction that could not be better characterized, and two (9.5%) patients with thrombocytopenia with normal size platelets. Eleven patients were male (52.4%) and ten were female (47.6%). The average age was 11.5 years. Seventeen individuals were White/Caucasian (81.0%), three were Black/African-American (14.3%) and one was more than one race (4.8%). In terms of ethnicity, all individuals identified as non-Hispanic. There was a positive family history in nine individuals (42.9%). Eleven individuals had some type of previous genetic testing (52.4%).
2.2 ∣. DNA isolation and exome sequencing
DNA isolation and exome sequencing were conducted as previously described.4 Briefly, peripheral blood was collected from patients and stored at 4 °C until genomic DNA was extracted using the Gentra Puregene Blood Kit Plus (Qiagen, 158489). For some parental samples, DNA was extracted from saliva using the prepIT-L2P DNA isolation kit (DNA Genotek, PT-L2P-45). For exome sequencing, three to six micrograms of DNA were used for library preparation and sequencing at the Beijing Genomics Institute (BGI) collaborative genome center at CHOP (BGI@CHOP). Exome capture was performed using the Agilent SureSelect Whole Exome, version 4 kit (51 Mb target size), and 100 base pair paired end sequencing was performed on Illumina HiSeq 2500 sequencing machines (4 samples per lane in high throughput mode). Exome sequencing performance was monitored using the following quality metrics: mean coverage depth (≥100X), fraction of target covered with at least 20X (>95%), mapping rate (>95%), and base calling accuracy (Q20 > 85%). The ES analysis strategy combined proband-only ES with Sanger sequencing validation of all reported variants in the proband and, where available, parental samples.
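The quality thresholds listed above lend themselves to a simple per-sample check. The following is a minimal sketch only; the metric names (`mean_depth`, `frac_target_20x`, `mapping_rate`, `q20_fraction`) are hypothetical and are not taken from the study's pipeline.

```python
# Thresholds as stated in the text; the metric names are hypothetical.
QC_THRESHOLDS = {
    "mean_depth": (">=", 100.0),      # mean coverage depth >= 100X
    "frac_target_20x": (">", 0.95),   # fraction of target at >= 20X
    "mapping_rate": (">", 0.95),      # fraction of reads mapped
    "q20_fraction": (">", 0.85),      # fraction of bases at Q20 or better
}


def failed_qc_metrics(metrics):
    """Return the names of the metrics that miss their threshold."""
    failed = []
    for name, (op, cutoff) in QC_THRESHOLDS.items():
        value = metrics[name]
        passed = value >= cutoff if op == ">=" else value > cutoff
        if not passed:
            failed.append(name)
    return failed
```

A sample would be flagged for review whenever the returned list is non-empty.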
2.3 ∣. Creation of the pediatric IPD gene list
Genes associated with platelet disorders were manually curated based on review of the literature. Genes were included that were definitively associated with platelet disease as well as those with some evidence suggesting a possible role in platelet disorder etiology, taking into consideration the quality of evidence based on case reports, animal models, and in vitro cellular experimental models. Where available, evidence on mechanism(s) of pathogenicity, inheritance pattern, age of onset, prevalence and genotype-phenotype relationships were determined for accurate and robust interpretation of sequence variants (see “Variant interpretation workflow” section for additional details).
2.4 ∣. Variant interpretation workflow
Our workflow uses a FileMaker Pro Database containing relevant patient clinical information as well as curated gene and variant annotations and links to useful genomics references and databases. An in-house variant calling and filtering pipeline was used to identify potentially disease-associated variants requiring in-depth analysis and interpretation. Briefly, the reads in FASTQ format were aligned to the GRCh37 (hg19) human genome assembly using Novoalign (http://www.novocraft.com). Picard (http://broadinstitute.github.io/picard/) was used to mark the duplicate reads and GATK Unified Genotyper13 was used to call the variants. Variant calls were then filtered on QualByDepth (QD < 2.0), FisherStrand (FS > 60 [indels], FS > 200 [SNPs]), and RMSMappingQuality (MQ < 40), and a read depth of 5 or greater was required. The pipeline filtered out common variants present in ExAC14 (>0.5% MAF) and our internal cohort (>5% MAF). Variants were annotated using ANNOVAR15,16 and missense, nonsense, frameshift, indel, insertion, deletion, and splicing (± 6 bases surrounding exons) variants were retained. The HGMD-preferred transcript was used for annotation. If an HGMD-preferred transcript was unavailable, the transcript with the most severe change based on functional annotation precedence in ANNOVAR was used. The remaining variants were filtered to only retain those within the IPD gene list and were uploaded to the FileMaker database. Variant interpretation was based on internal best practices and American College of Medical Genetics (ACMG) guidelines.17 All variants were reviewed by an American Board of Medical Genetics and Genomics (ABMGG) board-certified molecular geneticist.
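The quality, frequency, and gene-list filters described above can be illustrated with a short sketch. It is not the in-house pipeline: the `Variant` record and function names are hypothetical, and it assumes that the QD, FS, and MQ expressions in the text are exclusion criteria while the read-depth value is a minimum requirement.

```python
from dataclasses import dataclass


@dataclass
class Variant:              # hypothetical record, for illustration only
    gene: str
    kind: str               # "snp" or "indel"
    qd: float               # QualByDepth
    fs: float               # FisherStrand
    mq: float               # RMSMappingQuality
    depth: int              # read depth at the site
    exac_maf: float         # minor allele frequency in ExAC
    internal_maf: float     # minor allele frequency in the internal cohort


# FisherStrand cutoffs as stated in the text.
FS_CUTOFF = {"indel": 60.0, "snp": 200.0}


def passes_quality(v):
    """Assumed reading: drop calls with QD < 2.0, FS above cutoff, MQ < 40, depth < 5."""
    return (v.qd >= 2.0 and v.fs <= FS_CUTOFF[v.kind]
            and v.mq >= 40.0 and v.depth >= 5)


def is_rare(v):
    """Drop variants above 0.5% MAF in ExAC or above 5% MAF in the internal cohort."""
    return v.exac_maf <= 0.005 and v.internal_maf <= 0.05


def select_candidates(variants, gene_list):
    """Keep rare, good-quality variants that fall within the gene list."""
    return [v for v in variants
            if passes_quality(v) and is_rare(v) and v.gene in gene_list]
```

Functional-class filtering (missense, nonsense, frameshift, splicing, and so on) is omitted here because it depends on the annotation output format.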
Variant pathogenicity determination included evaluation of variant frequency in control population databases (ExAC,14 dbSNP,18 1000G19), computational evidence (nucleotide conservation, amino acid conservation, amino acid physicochemical differences, occurrence in functional protein domains, splicing effect prediction) and HGMD and ClinVar20 references. Patients in the PediSeq study also had the option of receiving several types of incidental findings including pathogenic and likely pathogenic variants in medically-actionable disease genes (2714 genes), carrier status (185 genes), and ACMG-recommended incidental findings (56 genes at the time the study was initiated).21 Findings which suggested an immediate change in medical care, including screening or intervention, were returned for all patients. All variants in IPD genes and their pathogenicity classifications were deposited in ClinVar using Organization ID: 505472 (https://www.ncbi.nlm.nih.gov/clinvar/submitters/505472/).
2.5 ∣. IPD candidate variant analysis
For the 16 patients in our cohort for whom a positive molecular diagnosis was not achieved by ES using our IPD gene list, an unbiased analysis of rare heterozygous, homozygous, and potential compound heterozygous variants was performed without using a gene list. This analysis assessed the zygosity, predicted exonic effect, and incidence of mutations within shared genes among patients based on their IPD phenotype (TCP, MTCP, and PFD). In patients with more than one phenotype (i.e., patients 122, 168, and 182, who had a suspected platelet function disorder and thrombocytopenia), variants were included in both IPD phenotype subgroups. The filtration workflow is outlined in Supporting Information Figure S1.
Across the 16 molecularly undiagnosed patients and 197 control cases, which included unrelated patients without IPD phenotypes and unaffected relatives, 416 043 high quality variants were called. Variants within the 53 genes of our IPD gene list were removed as they were already interpreted for pathogenicity in IPD patients. Nonsynonymous variants that were rare in the general population, i.e., with a minor allele frequency of < 0.5% in dbSNP, 1000G, and ExAC (in the overall cohort and any subpopulations), were retained. This removed ~88.1% of variants and left 49 675 for further filtering and analysis. Similarly, variants identified only in the IPD cohort and not in the control cohort were retained, including homozygous variants in the IPD cohort where a heterozygote was identified in controls. Retaining only variants that matched the expected inheritance pattern based on family history of the patients’ phenotypes left 3 122 variants within 2 613 genes.
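The population-frequency and case-versus-control retention rules above can be sketched as follows. This is an illustrative reading of the text rather than the study's code; the function names, the zygosity labels (`"het"`, `"hom"`), and the treatment of variants missing from a database as frequency 0 are assumptions.

```python
def max_population_maf(freqs):
    """Highest MAF across all databases and subpopulations (absent entries count as 0)."""
    return max(freqs.values(), default=0.0)


def keep_variant(freqs, case_zygosity, control_zygosities, maf_cutoff=0.005):
    """Decide whether a nonsynonymous variant seen in an IPD patient is retained.

    freqs: mapping such as {"ExAC_all": 0.001, "ExAC_AFR": 0.004, "1000G": 0.0}
    case_zygosity: "het" or "hom" in the IPD patient
    control_zygosities: zygosity of each control carrier (empty if none)
    """
    # Rare in the overall cohort and in every subpopulation.
    if max_population_maf(freqs) >= maf_cutoff:
        return False
    # Absent from controls: retain.
    if not control_zygosities:
        return True
    # Present in controls: retain only a homozygous case variant whose
    # control carriers are all heterozygous.
    return case_zygosity == "hom" and all(z == "het" for z in control_zygosities)
```

The subsequent inheritance-pattern filter is not shown, since it depends on pedigree information for each family.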
Variants were further filtered by examining human platelet transcriptomics and proteomics datasets for evidence that the related mRNAs and proteins are expressed, consistent with a potential role in platelet biology and IPD etiology. We used the PlateletWeb resource (http://plateletweb.bioapps.biozentrum.uni-wuerzburg.de/plateletweb.php), which contains manually curated evidence from 22 platelet proteomics studies and a SAGE transcriptome study, yielding over 5000 proteins with evidence of expression in platelets.22 RNA-Seq results, including polyA+ mRNA and rRNA-depleted total RNA obtained from CD45-depleted healthy donor human platelets, were used to examine whether candidate gene mRNAs are expressed in human platelets.23
2.6 ∣. Supporting Information methods
Supporting Information methods including target capture and coverage analysis and copy-number variation analysis are available in the online Supporting Information.
3 ∣. RESULTS
3.1 ∣. Determination of the gene list
A literature search for genes associated with or potentially associated with IPDs resulted in the identification of 53 genes, which constituted our gene list for focused analysis (Supporting Information Table S2). These 53 genes included 873 exons with a range of one to 55 exons per gene.
3.2 ∣. Performance of exome sequencing for disease-associated variant identification
We analyzed the performance of the Agilent SureSelect v4 capture kit with respect to these exons and 808 of 873 exons (92.6%) were targeted as defined in the Methods section (Supporting Information Table S3). At least one exon was targeted for all of the 53 IPD genes and 28 genes (52.8%) had all known exons targeted. We analyzed the depth of coverage and found 76.7% of the 873 exons were completely covered with 100% of bases sequenced at 20X depth or greater on average across 265 identically-processed samples in our entire PediSeq cohort sequenced at an overall 100X depth of coverage (Figure 1A and Supporting Information Table S4). Of the 808 targeted exons, 77.6% were completely covered, and 66.0% of the 65 nontargeted exons were completely covered.
In order to determine if previously reported disease causing variants in these genes would have been detected, we searched the Human Gene Mutation Database (HGMD) for variants in the 53 IPD genes and identified 2 835 total variants that were classified by HGMD as potentially disease-associated (Figure 1B). Two hundred fifty-six (9.0%) of these variants were excluded from analysis as they were not expected to be identified by exome sequencing, including large (>20 bp) deletions, insertions, duplications, indels and complex rearrangements. Therefore, 2 579 variants were further analyzed. These 2 579 variants occurred in 2 210 unique positions, as there were a number of variants that occurred at identical nucleotide positions such as c.212T > G p.(Phe71Cys) and c.212T > C p.(Phe71Ser) in the GP9 gene (NM_000174.4). Overall, we found that 2 160 of the 2 210 (97.7%) likely disease causing variant positions identified in HGMD that were expected to be identified were adequately targeted (Supporting Information Table S5).
We also evaluated the depth of coverage across the IPD exons in order to determine the likelihood of detecting these variants in our pipeline. We randomly selected 95 IPD disease-associated variants (with a maximum of two variants per gene to minimize bias against genes with considerably more disease associated variants than others) and evaluated depth of coverage in a cohort of 265 individuals sequenced as part of our PediSeq study for various clinical indications. We found that 88 of 95 (92.6%) variants had adequate sequencing read depth of 20X or greater on average across the 265 exomes (Supporting Information Table S6). For the remaining seven variants, four (4.2%) had between 10 and 20X read depth and three (3.2%) had <10X read depth on average. Two of the variants with 10–20X average coverage were in the 5′ untranslated region (UTR) of the ANKRD26 gene and were not targeted for capture in the kit used for these studies. As the majority of ANKRD26 disease-associated variants reported in HGMD and in patients are located in the 5′ UTR, we performed targeted Sanger fill-in of this region; however, we identified only common polymorphisms and variants whose inheritance pattern did not fit segregation expectations (data not shown).
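The coverage criteria used in this section (an exon counted as completely covered when every base reaches 20X on average across samples, and variant positions binned by mean depth) can be sketched as below. The function names and the input layout (one list of per-base depths per sample) are hypothetical, and the exact bin boundaries at 10X and 20X are an assumption.

```python
from statistics import mean


def mean_depth_per_base(sample_depths):
    """Average depth at each base across samples.

    sample_depths: one equal-length list of per-base depths per sample.
    """
    if not sample_depths:
        raise ValueError("at least one sample is required")
    return [mean(column) for column in zip(*sample_depths)]


def exon_completely_covered(sample_depths, min_depth=20):
    """True if every base of the exon averages at least min_depth across samples."""
    return all(d >= min_depth for d in mean_depth_per_base(sample_depths))


def depth_category(mean_depth):
    """Bin a variant position by its mean depth across samples."""
    if mean_depth >= 20:
        return ">=20X"
    if mean_depth >= 10:
        return "10-20X"
    return "<10X"
```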
3.3 ∣. Results of testing
Exome sequencing in the IPD cohort of 21 individuals revealed 49 rare variants with an average of 2.3 variants per individual (Table 1). The 49 rare variants were distributed across 23 genes. There were six likely pathogenic variants (CD36, RASGRP2, RUNX1, GP9 and RAB27A genes) and one pathogenic variant (MYH9) identified (Supporting Information Figure S2). Eighteen variants (36.7%) were classified as variants of uncertain significance (VUS); 17 (34.7%) were likely benign and seven (14.3%) were benign. The maximum number of variants identified in any patient was seven, and no potentially disease-causing variants were identified in six patients (28.6%). VWF had the most variants (six), one of which was in the pseudogene region (exons 23–34) (Supporting Information Figure S4), and was confirmed by Sanger sequencing using primers specific to the functional gene and not the pseudogene.24
TABLE 1.
Variable | Count (%) |
---|---|
Patients | 21 |
Rare IPD variants | 49 |
Variant pathogenicity classification | |
Benign | 7 (14.3) |
Likely benign | 17 (34.7) |
VUS | 18 (36.7) |
Likely pathogenic | 6 (12.2) |
Pathogenic | 1 (2.0) |
Average number of variants per patient | 2.3 |
Variants per patient [range] | [0 - 7] |
Number of patients without uncertain/pathogenic variants | 6 (28.6) |
ES, exome sequencing; IPD, inherited platelet disorder; VUS, variant of uncertain significance.
We achieved an overall positive molecular diagnostic yield of 23.8% (5 out of 21 patients). Six of 21 patients (28.6%) had a negative diagnosis with either no rare variants in our 53 gene list or only benign or likely benign variants. Ten of our 21 patients (47.6%) received an uncertain diagnosis as the only variants identified were of uncertain significance without enough clinical information to conclusively give a positive or negative diagnosis (Figure 2A). Seven likely disease causing variants were found in five patients. Five variants resulted in autosomal recessive disease in three patients, including compound heterozygous variants in RASGRP2 and CD36, and a homozygous variant in GP9. Two variants resulted in autosomal dominant disease, including one patient who inherited a heterozygous MYH9 variant from an affected parent, and another patient with a de novo RUNX1 variant (Figure 2B). No patients with a positive molecular diagnosis had an X-linked inheritance pattern. Additional details for positively diagnosed cases are presented in Table 2. In an attempt to improve diagnostic yield, a modified Exome-Depth workflow for CNV detection was used to detect potential CNVs in the IPD gene list (manuscript in preparation). We found only one CNV of uncertain significance involving an IPD gene, a 188 kb duplication on chromosome 10 including ABI1, ANKRD26, LINC00202-1 in a patient with a molecular diagnosis (187).
TABLE 2.
Patient ID | Gene | Molecular diagnosis | Cytoband | Inheritance | Transcript | cDNA^a | Protein^a |
---|---|---|---|---|---|---|---|
PediSeq-136 | RASGRP2 | Platelet-type bleeding disorder | 11q13.1 | AR (compound heterozygous) | NM_153819.1 | c.[542T>C]; [1479dup] | p.[(Phe181Ser)]; [(Arg494Alafs*54)] |
PediSeq-170 | GP9 | Bernard-Soulier syndrome, type C | 3q21.3 | AR (homozygous) | NM_000174.3 | c.[182A>G]; [182A>G] | p.[(Asn61Ser)]; [(Asn61Ser)] |
PediSeq-182 | RUNX1 | Platelet disorder with associated | 21q22.12 | AD (de novo) | NM_001754.4 | c.[497G>A]; [=] | p.[(Arg166Gln)]; [(=)] |
PediSeq-187 | CD36 | Platelet glycoprotein IV deficiency | 7q21.11 | AR (compound heterozygous) | NM_001001547.2 | c.[429+2T>C]; [701+1_701+4dup] | p.[(?)]; [(?)] |
PediSeq-189 | MYH9 | MYH9-related disease | 22q12.3 | AD (inherited) | NM_002473.3 | c.[287C>T]; [=] | p.[(Ser96Leu)]; [(=)] |
^a Variant described using HGVS nomenclature.
AD, autosomal dominant; AR, autosomal recessive; ES, exome sequencing; IPD, inherited platelet disorder.
To examine whether diagnostic rates differed based on clinical diagnosis, we categorized patients into four groups including thrombocytopenia (TCP, 4 patients), macrothrombocytopenia (MTCP, 8 patients), platelet function disorder (PFD, 7 patients), and platelet function disorder with thrombocytopenia/macrothrombocytopenia (PFD and TCP, 2 patients) (Supporting Information Figure S3). The TCP group had the highest positive diagnostic rate (50%, or 2 out of 4 patients), followed by the MTCP group with 25% (2 out of 8 patients), PFD with 14.3% (1 out of 7 patients), and PFD and TCP with no positive diagnoses. The remaining 75% (6 out of 8 patients) in the MTCP group had an uncertain diagnosis. In addition to a low positive diagnostic rate, the PFD group also had the largest negative diagnostic rate (57.1%, or 4 out of 7 patients).
We also examined whether there was an association between a reported family history and molecular diagnosis. There was a positive family history reported in nine patients (42.9%) in our cohort, with the TCP group having the highest reported family history (75%, or 3 out of 4 patients) (Supporting Information Figure S5A). Overall, across all groups, 33.3% (4 out of 12 patients) with a negative family history received a positive molecular diagnosis, while only 11.1% (1 out of 9 patients) with a positive family history received a positive molecular diagnosis; however, this difference was not statistically significant (two-tailed P = .338, Fisher’s exact test) (Supporting Information Figure S5B). For the self-identified non-White/Caucasian patients, one patient who was more than one race (patient 142) received an uncertain diagnosis, and three Black/African-American patients (patients 175, 184, 187) received an uncertain, negative, and positive diagnosis, respectively.
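The family-history comparison can be reproduced from the counts given above (4 of 12 patients with a negative family history versus 1 of 9 with a positive family history receiving a positive diagnosis). The sketch below is a plain standard-library implementation of the two-tailed Fisher's exact test, offered for illustration; the function name is hypothetical and it is not the software used in the study.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Rows: negative vs positive family history; columns: diagnosed vs not diagnosed.
p_value = fisher_exact_two_tailed(4, 8, 1, 8)
print(round(p_value, 3))  # 0.338
```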
For the 16 patients in our cohort for whom a positive molecular diagnosis was not achieved by ES using our IPD gene list, an unbiased analysis of rare heterozygous, homozygous, and potential compound heterozygous variants was performed without using a gene list assessing the zygosity, predicted exonic effect, and incidence of mutations within shared genes among patients based on their IPD phenotype (TCP, MTCP, and PFD) (Supporting Information Figure S1). In terms of zygosity, 2207 (95.8%) of the variants remaining after filtering were heterozygous, nine (0.4%) were homozygous, 16 (0.7%) were X-linked, and 72 (3.1%) were potentially biallelic (Supporting Information Table S7). In terms of functional protein impact of the variants, 90.6% were missense, 5.5% were loss-of-function (nonsense, splicing, frameshift insertions/deletions), and the remaining 3.9% were inframe insertions/deletions, stoploss, and those with unknown functional impact. At the gene level, 275 genes contained rare variants in two patients, 53 genes in three patients, and 20 genes in four patients. Lastly, there were specific variants identified multiple times: one variant in two patients with macrothrombocytopenia (patients 142 and 172), and three variants in six patients with platelet function disorders (patients 133 and 162, 122 and 158, and 158 and 179). Overall, the complete filtering and prioritization analysis yielded 2 230 candidate variants within 1 838 genes (an average of 139 variants per patient), representing a 99.5% reduction in the number of the original high quality variants and a 95.5% reduction in the number of rare nonsynonymous variants. Manual curation of the variants yielded several biologically interesting candidates, and further work to validate and evaluate these results is ongoing.
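The reduction percentages and the per-patient average quoted above follow directly from the variant counts reported in the Methods and in this section. A short arithmetic check, with a hypothetical helper name, is shown here:

```python
def percent_reduction(before, after):
    """Percentage of items removed when going from `before` to `after`."""
    return 100.0 * (before - after) / before


high_quality_variants = 416_043   # called across patients and controls
rare_nonsynonymous = 49_675       # after the population-frequency filter
candidate_variants = 2_230        # after complete filtering and prioritization
undiagnosed_patients = 16

print(round(percent_reduction(high_quality_variants, candidate_variants), 1))  # 99.5
print(round(percent_reduction(rare_nonsynonymous, candidate_variants), 1))     # 95.5
print(round(candidate_variants / undiagnosed_patients))                        # 139
```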
4 ∣. DISCUSSION
Molecular diagnosis of IPDs in pediatric patients remains a challenge affecting patients, families, and clinicians worldwide due to incomplete current knowledge of all disease-causing variants; however, genome and exome sequencing represent powerful techniques to help overcome this challenge. Our results here demonstrate the utility and limitations of ES as a molecular diagnostic tool in a single-center, pediatric clinic for individuals with suspected IPDs. Overall, the average capture and coverage for the 873 exons from our 53-gene platelet disorder list was high, with 92.6% of exons targeted and 88.5% of exons covered partially or completely. We also found a high percentage of completely covered exons (66.0%) in exons that were not targeted, consistent with the biased nature of read distribution around a captured region25 and an analysis examining high quality exome data in nontarget regions.26
Although the primary goal of this study was not to maximize diagnostic yield, we achieved a positive molecular diagnosis in 23.8% of our cohort (5 of 21 unrelated patients). This molecular diagnostic rate is within the range of previously reported exome and gene panel positive diagnostic rates for platelet disorders (10.5% to 45.9%),10,12 for ES applied broadly in individuals with various phenotypes (~20–36%),27–33 and for pediatric patients with risk for sudden cardiac arrest in the PediSeq cohort (17%).4
Approximately 250 gene-disease and 9200 variant-disease associations are reported annually, and this information makes re-analysis of ES data an attractive way to improve diagnostic yield.34 Wenger et al. found a molecular diagnosis in 10% of negative exome cases (with an average lag time of 20 months before re-analysis) due to new supporting literature where the evidence to associate the causative variant had previously been weak or nonexistent. This highlights the significant possibility that the causative disease variants in negative cases do not necessarily fall outside of the exome data already generated. One advantage of our ES IPD gene list approach is that newly-associated disease genes that were initially targeted but ignored for analysis can be easily included into the gene list filter for analysis without additional capture and sequencing, provided they had sufficient capture and coverage during the first sequencing run; this represents an advantage over NGS targeted gene panels, where re-analysis is limited only to genes already on the panel.
Our identification of a pathogenic MYH9 variant in patient 189 and RUNX1 variant in patient 182 highlighted the utility of ES results to leverage existing genotype-phenotype correlations to prompt further clinical evaluation and longitudinal follow up. MYH9-related disease patients can present with Epstein syndrome, Fechtner syndrome and macrothrombocytopenia, where variants in different protein domains can result in these phenotypes associated with individual health concerns.35 The identified c.287C > T; p.(Ser96Leu) variant in patient 189 is within the interface between the SH3-like motif and the motor domain of the Head protein domain (SH3/MD I protein region) of the MYH9 protein, and variants in this region have been associated with a high risk of later onset hearing loss and a lower risk for nephropathy.35–39 The sister and father of patient 189 were found to be heterozygous for the mutation, so this may prompt additional health concerns for them as well, highlighting the potential impact of a pediatric genetic diagnosis on other family members. In addition, patient 182 was found to have a previously reported40–42 de novo RUNX1 c.497G > A; p.(Arg166Gln) variant associated with familial platelet disorder and acute myeloid leukemia. Dominant negative effect mutations in RUNX1 have been associated with an increased incidence of hematological malignancies (56%) when compared with loss of function alleles (33%).42 These findings have prompted follow up for early leukemia screening and potentially consideration of therapies such as allogeneic hematopoietic stem cell transplantation in our patient.
We identified one patient, 136, with a qualitative platelet function defect and compound heterozygous variants in RASGRP2, a gene recently associated with defective αIIbβ3 expression/function due to abnormal inside-out activation of αIIbβ3, similar to the defects in ITGA2B and ITGB3 that are associated with Glanzmann’s thrombasthenia.43 Previous studies identified two novel pathogenic homozygous variants in two unrelated families using a combination of exome sequencing and a novel in vitro assay measuring the nucleotide exchange activity of the protein encoded by RASGRP2. These variants were shown to affect both leukocyte and platelet integrin activation, resulting in bleeding diathesis and platelet dysfunction. Our results here present, to our knowledge, the first case of an individual with a platelet-type bleeding disorder with compound heterozygous RASGRP2 variants.
One limitation of the version of exome capture kit used in this study (Agilent SureSelect v4) was that its targeting design lacked gene 5′ UTRs. In ANKRD26, 15 of the 16 variants reported in HGMD were in the 5′ UTR where they are thought to alter RUNX1 and FLI1 transcription factor binding ability, precluding transcriptional activation leading to ANKRD26 protein deficiency, thrombocytopenia and myeloid malignancies.44–48 ANKRD26-related thrombocytopenia is reported to cause ~10% of inherited thrombocytopenias, and the majority of variants reported in patients are present in the 5′ UTR, so reduced capture or coverage of these regions represented a possible limitation for diagnosis.49 While our targeted Sanger sequencing of the ANKRD26 5′ UTR was able to supplement ES limitations, no pathogenic variants were identified in this region in our cohort. However, these findings highlight the need to be aware of molecular mechanisms of pathogenicity when designing panels using ES workflows,50 in addition to limitations of the tests selected.
We acknowledge that large structural variants or copy number variants (CNVs) may have an etiological role in our patients lacking a positive molecular diagnosis. While the current gold-standard method for performing CNV analysis is copy-number microarray, there are currently over 15 publicly available algorithms that detect copy number variation based on ES sequencing read depth (reviewed by Kadalayil et al.51). While read depth-based CNV detection did not improve our molecular diagnostic rate, it will be interesting to see how future pipelines that incorporate these increasingly sensitive and powerful techniques increase ES diagnostic yield and utility, and current studies in the laboratory are ongoing to evaluate their utility.
In the process of variant classification, we interpreted a total of 49 variants after filtering and retaining only rare nonsynonymous variants (Table 1). More than a third (36.7%) of these variants were classified as “variant of uncertain significance” (VUS), and six individuals (28.6%) had no rare variants of uncertain or likely pathogenicity within genes in the gene list, indicating a potential opportunity for functional studies and gene discovery to improve our understanding of the etiology of inherited platelet disorders. In these patients we applied a systems biology approach combining rare variant prioritization with platelet transcriptome and proteomics datasets to filter candidates for follow up. After a 99.5% reduction in variants via our filtering pipeline, we identified an average of 139 candidate variants per patient; studies to validate and functionally characterize these variants are ongoing in the laboratory. We acknowledge a possible limitation of using platelet-specific expression datasets, as bona fide disease-causing variants may affect non-platelet cell types or a platelet progenitor cell type. However, only two genes (MLPH and HOXA11) out of 53 in our IPD gene list were not found in these datasets, suggesting that this is a reasonable filtration step for identifying causal variants.
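The filtering logic described above can be sketched in a few lines. This is a hedged illustration rather than the study's pipeline: the `Variant` fields, the set of consequence labels treated as nonsynonymous, and the 1% allele-frequency cutoff are all assumptions, and the gene names in the usage note are placeholders.

```python
from dataclasses import dataclass

# Assumed consequence labels counted as nonsynonymous for this sketch.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    allele_freq: float  # population allele frequency, 0.0 if unobserved


def prioritize(variants, expressed_genes, max_af=0.01):
    """Keep rare, nonsynonymous variants in platelet-expressed genes.

    expressed_genes: set of gene symbols present in the platelet
    transcriptome and/or proteome datasets.
    max_af: assumed rarity cutoff, not taken from the study.
    """
    return [
        v
        for v in variants
        if v.allele_freq <= max_af
        and v.consequence in NONSYNONYMOUS
        and v.gene in expressed_genes
    ]


def reduction(before, after):
    """Fractional reduction in variant count achieved by filtering."""
    return 1 - after / before
```

For example, given four hypothetical variants of which only a rare missense variant in `GENE_A` passes all three filters, `prioritize` returns that single variant and `reduction(4, 1)` is 0.75. The expression filter is the step that would discard genes absent from the platelet datasets, which is the trade-off noted above for MLPH and HOXA11.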
The von Willebrand factor gene (VWF) is notable in several ways, including mutation mechanism, number of variants, and coverage. Gene conversions with a 97% identical, partial unprocessed VWF pseudogene comprising exons 23–34 on chromosome 22q11.22-q11.23 are a common cause of von Willebrand disease.52,53 VWF had the largest number of disease-associated variants in HGMD of all genes tested, and the largest number predicted to be missed due to reduced capture and coverage. Accurately identifying variants in regions homologous to pseudogenes is a significant problem with NGS, and longer read length technologies and protocol modifications will hopefully ameliorate some of these issues.54 We identified six VWF variants in six patients by ES, one of which was in the region homologous to the pseudogene (exon 30). Validation by Sanger sequencing using primers specific for the functional gene24 showed concordance with the ES results; however, this region had reduced capture and coverage. Pseudogenes for ANKRD26, CYCS, and TUBB1 also exist, but these were of less diagnostic significance in our study, as no rare variants were identified by ES within ANKRD26 or CYCS and only two likely benign variants were identified within TUBB1. These results demonstrate a common theme: although overall coverage may be high, clinical utility will vary considerably across genes, and a highly suspected gene of interest with no pathogenic variants may require further inspection by orthogonal methods.
5 ∣. CONCLUSIONS
We evaluated ES-based molecular diagnostics in a cohort of 21 pediatric patients with IPDs and found that it performed at or above the expected diagnostic rate. This is the first report of an exclusively pediatric platelet disorder cohort evaluated by ES. In addition to a gene list approach, we explored a robust and reproducible framework for prioritizing candidate variants for functional follow up when direct disease-association evidence is weak or nonexistent in the literature. For the 53 IPD genes tested, we had high rates of capture and coverage, including 92.6% of targeted exons captured and 76.7% of exons completely covered. For disease-associated variants, 97.7% were captured and 92.6% of a randomly selected subset were completely covered. In addition to providing a positive molecular diagnosis for 23.8% of individuals, we also identified long-term implications, including requirements for comorbidity screenings. The large number of variants of uncertain significance identified will prompt further clinical correlation and continual reanalysis in light of new literature findings to provide more certain pathogenicity classifications. These variants create additional ambiguity for clinicians, patients, and families already dealing with elevated risk and uncertainty for these pediatric patients without definitive diagnoses. Additional uncertainty for clinicians also lies in understanding the limitations of a “negative” test, where the result may not necessarily be due to the lack of a pathogenic variant(s), but to challenging loci with reduced capture and/or coverage. We are optimistic about the potential for new and developing technologies, such as whole genome sequencing and functional experiments, to improve diagnosis in these patients.
ACKNOWLEDGMENTS
We thank the patients and their families for their participation and support. We also thank Melissa Gilbert and Leah Dowsett for their critical reading of the manuscript. This work was supported by the National Human Genome Research Institute (NHGRI), which funded The Children’s Hospital of Philadelphia Pediatric Genetic Sequencing (PediSeq) project (NIH Project Number 3U01HG006546-04S1).
Funding information
National Human Genome Research Institute (NHGRI), which funded The Children’s Hospital of Philadelphia Pediatric Genetic Sequencing (PediSeq) project, Grant/Award Number: 3U01HG006546-04S1
Footnotes
DISCLOSURE
MPL has been a consultant for GSK and Novartis and has received research funding from AstraZeneca. She has received an honorarium from Sysmex and has additional research funding from NIAID.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting information tab for this article.
REFERENCES
- [1] Lambert MP. Update on the inherited platelet disorders. Curr Opin Hematol. 2015;22:460–466.
- [2] Nurden AT, Nurden P. Inherited disorders of platelet function: selected updates. J Thromb Haemost. 2015;13(Suppl 1):S2–S9.
- [3] Green RC, Goddard KA, Jarvik GP, et al. Clinical sequencing exploratory research consortium: accelerating evidence-based practice of genomic medicine. Am J Hum Genet. 2016;98:1051–1066.
- [4] Li MH, Abrudan JL, Dulik MC, et al. Utility and limitations of exome sequencing as a genetic diagnostic tool for conditions associated with pediatric sudden cardiac arrest/sudden cardiac death. Hum Genomics. 2015;9:15.
- [5] Drachman JG. Inherited thrombocytopenia: when a low platelet count does not mean ITP. Blood. 2004;103:390–398.
- [6] Maclachlan A, Watson SP, Morgan NV. Inherited platelet disorders: insight from platelet genomics using next-generation sequencing. Platelets. 2017;28:14–19.
- [7] Gresele P, Harrison P, Bury L, et al. Diagnosis of suspected inherited platelet function disorders: results of a worldwide survey. J Thromb Haemost. 2014;12:1562–1569.
- [8] Timmermans S, Tietbohl C, Skaperdas E. Narrating uncertainty: variants of uncertain significance (VUS) in clinical exome sequencing. BioSocieties. 2017;12:439–458.
- [9] Buitrago L, Rendon A, Liang Y, et al. alphaIIbbeta3 variants defined by next-generation sequencing: predicting variants likely to cause Glanzmann thrombasthenia. Proc Natl Acad Sci U S A. 2015;112:E1898–E1907.
- [10] Simeoni I, Stephens JC, Hu F, et al. A high-throughput sequencing test for diagnosing inherited bleeding, thrombotic, and platelet disorders. Blood. 2016;127:2791–2803.
- [11] Watson SP, Lowe GC, Lordkipanidze M, Morgan NV; GAPP consortium. Genotyping and phenotyping of platelet function disorders. J Thromb Haemost. 2013;11(Suppl 1):351–363.
- [12] Johnson B, Lowe GC, Futterer J, et al. Whole exome sequencing identifies genetic variants in inherited thrombocytopenia with secondary qualitative function defects. Haematologica. 2016;101:1170–1179.
- [13] McKenna A, Hanna M, Banks E, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303.
- [14] Lek M, Karczewski KJ, Minikel EV, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285–291.
- [15] Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164.
- [16] Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat Protoc. 2015;10:1556–1566.
- [17] Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–424.
- [18] Sherry ST, Ward M, Sirotkin K. dbSNP-database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Res. 1999;9:677–679.
- [19] 1000 Genomes Project Consortium, Auton A, Brooks LD, et al. A global reference for human genetic variation. Nature. 2015;526:68–74.
- [20] Landrum MJ, Lee JM, Riley GR, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–D985.
- [21] Green RC, Berg JS, Grody WW, et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet Med. 2013;15:565–574.
- [22] Boyanova D, Nilla S, Birschmann I, Dandekar T, Dittrich M. Platelet-Web: a systems biologic analysis of signaling networks in human platelets. Blood. 2012;119:e22–e34.
- [23] Kissopoulou A, Jonasson J, Lindahl TL, Osman A. Next generation sequencing analysis of human platelet PolyA+ mRNAs and rRNA-depleted total RNA. PLoS One. 2013;8:e81809.
- [24] Corrales I, Ramirez L, Altisent C, Parra R, Vidal F. Rapid molecular diagnosis of von Willebrand disease by direct sequencing. Detection of 12 novel putative mutations in VWF gene. Thromb Haemost. 2009;101:570–576.
- [25] Parla JS, Iossifov I, Grabill I, Spector MS, Kramer M, McCombie WR. A comparative analysis of exome capture. Genome Biol. 2011;12:R97.
- [26] Guo Y, Long J, He J, et al. Exome sequencing generates high quality data in non-target regions. BMC Genomics. 2012;13:194.
- [27] Posey JE, Rosenfeld JA, James RA, et al. Molecular diagnostic experience of whole-exome sequencing in adult patients. Genet Med. 2016;18:678–685.
- [28] Jamuar SS, Tan EC. Clinical application of next-generation sequencing for Mendelian diseases. Hum Genomics. 2015;9:10.
- [29] Yang Y, Muzny DM, Reid JG, et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N Engl J Med. 2013;369:1502–1511.
- [30] Lee H, Deignan JL, Dorrani N, et al. Clinical exome sequencing for genetic identification of rare Mendelian disorders. JAMA. 2014;312:1880–1887.
- [31] Yang Y, Muzny DM, Xia F, et al. Molecular findings among patients referred for clinical whole-exome sequencing. JAMA. 2014;312:1870–1879.
- [32] Need AC, Shashi V, Hitomi Y, et al. Clinical application of exome sequencing in undiagnosed genetic conditions. J Med Genet. 2012;49:353–361.
- [33] Deciphering Developmental Disorders Study. Large-scale discovery of novel genetic causes of developmental disorders. Nature. 2015;519:223–228.
- [34] Wenger AM, Guturu H, Bernstein JA, Bejerano G. Systematic reanalysis of clinical exome data yields additional diagnoses: implications for providers. Genet Med. 2017;19:209–214.
- [35] Pecci A, Klersy C, Gresele P, et al. MYH9-related disease: a novel prognostic model to predict the clinical evolution of the disease based on genotype-phenotype correlations. Hum Mutat. 2014;35:236–247.
- [36] Arrondel C, Vodovar N, Knebelmann B, et al. Expression of the nonmuscle myosin heavy chain IIA in the human kidney and screening for MYH9 mutations in Epstein and Fechtner syndromes. J Am Soc Nephrol. 2002;13:65–74.
- [37] Utsch B, DiFeo A, Kujat A, et al. Bladder exstrophy and Epstein type congenital macrothrombocytopenia: evidence for a common cause? Am J Med Genet A. 2006;140:2251–2253.
- [38] Murayama S, Akiyama M, Namba H, Wada Y, Ida H, Kunishima S. Familial cases with MYH9 disorders caused by MYH9 S96L mutation. Pediatr Int. 2013;55:102–104.
- [39] Verver EJ, Topsakal V, Kunst HP, et al. Nonmuscle myosin heavy chain IIA mutation predicts severity and progression of sensorineural hearing loss in patients with MYH9-related disease. Ear Hear. 2016;37:112–120.
- [40] Song WJ, Sullivan MG, Legare RD, et al. Haploinsufficiency of CBFA2 causes familial thrombocytopenia with propensity to develop acute myelogenous leukaemia. Nat Genet. 1999;23:166–175.
- [41] Ouchi-Uchiyama M, Sasahara Y, Kikuchi A, et al. Analyses of genetic and clinical parameters for screening patients with inherited thrombocytopenia with small or normal-sized platelets. Pediatr Blood Cancer. 2015;62:2082–2088.
- [42] Latger-Cannard V, Philippe C, Bouquet A, et al. Haematological spectrum and genotype-phenotype correlations in nine unrelated families with RUNX1 mutations from the French network on inherited platelet disorders. Orphanet J Rare Dis. 2016;11:49.
- [43] Lozano ML, Cook A, Bastida JM, et al. Novel mutations in RASGRP2, which encodes CalDAG-GEFI, abrogate Rap1 activation, causing platelet dysfunction. Blood. 2016;128:1282–1289.
- [44] Pippucci T, Savoia A, Perrotta S, et al. Mutations in the 5′ UTR of ANKRD26, the ankirin repeat domain 26 gene, cause an autosomal-dominant form of inherited thrombocytopenia, THC2. Am J Hum Genet. 2011;88:115–120.
- [45] Noris P, Perrotta S, Seri M, et al. Mutations in ANKRD26 are responsible for a frequent form of inherited thrombocytopenia: analysis of 78 patients from 21 families. Blood. 2011;117:6673–6680.
- [46] Al Daama SA, Housawi YH, Dridi W, et al. A missense mutation in ANKRD26 segregates with thrombocytopenia. Blood. 2013;122:461–462.
- [47] Noris P, Favier R, Alessi MC, et al. ANKRD26-related thrombocytopenia and myeloid malignancies. Blood. 2013;122:1987–1989.
- [48] Bluteau D, Balduini A, Balayn N, et al. Thrombocytopenia-associated mutations in the ANKRD26 regulatory region induce MAPK hyperactivation. J Clin Invest. 2014;124:580–591.
- [49] Balduini CL, Pecci A, Noris P. Inherited thrombocytopenias: the evolving spectrum. Hamostaseologie. 2012;32:259–270.
- [50] Santani A, Murrell J, Funke B, et al. Development and validation of targeted next-generation sequencing panels for detection of germline variants in inherited diseases. Arch Pathol Lab Med. 2017;141:787–797.
- [51] Kadalayil L, Rafiq S, Rose-Zerilli MJ, et al. Exome sequence read depth methods for identifying copy number changes. Brief Bioinform. 2015;16:380–392.
- [52] Patracchini P, Calzolari E, Aiello V, et al. Sublocalization of von Willebrand factor pseudogene to 22q11.22-q11.23 by in situ hybridization in a 46,X,t(X;22)(pter;q11.21) translocation. Hum Genet. 1989;83:264–266.
- [53] Gupta PK, Adamtziki E, Budde U, et al. Gene conversions are a common cause of von Willebrand disease. Br J Haematol. 2005;130:752–758.
- [54] Abou Tayoun AN, Krock B, Spinner NB. Sequencing-based diagnostics for pediatric genetic diseases: progress and potential. Expert Rev Mol Diagn. 2016;16:987–999.