Familial platelet disorder with associated myeloid malignancy (FPD-MM, OMIM:601399) [1,2] is a rare cancer predisposition syndrome caused by pathogenic germline variants in RUNX1 [3]. Despite research dating back over two decades, many challenges remain in improving outcomes for individuals with FPD-MM [4]. Firstly, the syndrome may go unrecognized due to poor recognition of family history and/or limited access to appropriate genetic testing. Secondly, RUNX1 variants found through intentional screening or incidental detection (e.g., tumour sequencing) require expert interpretation. Thirdly, after diagnosis, the relative rarity of the disorder inhibits the collation of sizeable local cohorts, making identification of commonalities in disease course and/or outcome highly challenging. To help overcome these challenges, we have developed an interactive, public, web-based international collaborative database for RUNX1: RUNX1db (https://runx1db.runx1-fpd.org/). RUNX1db is a centralized repository for germline RUNX1 variant information, associated next-generation sequencing (NGS) data, and expert-curated variant information (both germline and somatic).
We recently identified, from publications, 140 different families with germline RUNX1 variants [4]. Although these are a rich resource, historically reported variants are largely not classified according to the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines, which were only established in 2015 [5]. Additionally, the Clinical Genome Resource myeloid malignancy variant curation expert panel (ClinGen MM-VCEP) recently created guidelines specific for classification of germline RUNX1 variants [6]. Gene-specific guidelines, while important, add complexity to the curation of identified variants. Making expert knowledge available to accurately classify these germline variants prevents both the missing of pathogenic variants and the misattribution of benign variants as causative in families [7,8]. Additionally, variants identified through clinical services and research studies do not always make it into the public domain due to constraints associated with the reporting of variants through publication or variant repositories. To address some of these challenges, we updated curated variants from publications and undertook an international survey of colleagues, identifying unpublished variants. This study identified an additional 119 families (259 in total), with 164 unique variants. These included ten variants not previously described (Figure 1 and Table S1). Using these data, we created the first comprehensive RUNX1 germline registry and performed expert curation of all variants according to the RUNX1-specific ACMG classification rules (ALB, CNH, LAG, LM, CDD MM-VCEP members). The registry represents the largest collection of curated and clinically classified RUNX1 germline variants to date, providing a unique clinical resource for researchers, clinical genomics laboratories, and haematologists (Figure 1, Table S1).
Utilizing this resource, we have identified 97 pathogenic/likely pathogenic RUNX1 variants, with 54 located within the Runt homology domain (RHD) (75% of RHD variants), of which 24 are missense mutations. Only one pathogenic missense variant is observed outside of the RHD, suggesting the RHD is highly intolerant to genetic variation. The most commonly observed pathogenic germline RUNX1 variants are whole-gene deletions (21 probands), deletion of exons 1-2 (9 probands), and mutation of amino acid p.Arg201 within the RHD (8 probands) (Table S1). This information is accessible and can be updated through a live web portal that hosts the registry (https://runx1db.runx1-fpd.org/classification/classifications). Each curated variant has links to patient phenotypic information and the current clinical classification, including the evidence for each ACMG code assessed and links to external clinical databases, including ClinVar, and associated publications. Importantly, expert crowdsourcing allows the real-time updating of the database through user profile accounts. Newly identified variants can be easily added to the database and are automatically annotated with over 137 parameters required for accurate classification (e.g., population frequency, pathogenicity predictions). These parameters populate a classification tool that guides users stepwise through the ACMG classification of new variants (or the updating of current classifications with new information). Once curated and classified, collated information can be exported as an automated classification report summary, flagged for expert review, shared with other users, and uploaded to ClinVar.
Table 1.
In addition to a germline RUNX1 variant registry, RUNX1db has the capacity to house NGS datasets, creating the first international genomics cohort of this rare disease. This initiative intends to enable researchers to answer questions about FPD-MM beyond germline variant detection. For example, family members heterozygous for RUNX1 mutations can have varying clinical presentations, indicating variable penetrance and expressivity. In almost all cases, germline RUNX1 carriers present with thrombocytopenia and qualitative platelet defects, whereas progression to hematologic malignancies (HM) is incompletely penetrant, with variable age of onset ranging from early childhood to late adulthood [2]. Patients most frequently develop myeloid malignancies, followed by T-cell and, more rarely, B-cell acute lymphoblastic leukaemia (ALL) [4]. Currently, there is no way to predict which individuals will progress to myelodysplastic syndrome (MDS), acute myeloid leukaemia (AML), or other HM. Accumulation of somatic mutations and additional germline modifier variants are mechanisms proposed to contribute to this heterogeneity [4]. NGS technology is widely used for surveillance and diagnosis of HM [4], accumulating large amounts of data often not utilized beyond RUNX1 variant detection. Individual laboratories often have only small numbers of patients with deleterious RUNX1 germline variants, which makes it difficult to ask larger questions about commonalities of genotype-phenotype, disease progression, monitoring, treatment and outcome [9]. To accumulate the data required to make evidence-based clinical decisions in FPD-MM, a dedicated resource utilizing the collective wealth of NGS data generated from research and diagnostic laboratories internationally is ideal for standardizing and collating disease-specific clinical and genomics data.
The database has also been designed for the accumulation, sharing and curation of genomics data acquired from individuals with germline RUNX1 mutations both pre- and post-malignancy progression. We have collated 179 NGS datasets, comprising both whole-exome sequencing (WES) and HM gene panel data, from 19 distinct research centres worldwide. This includes NGS from 60 FPD-MM families and 120 individuals, making it the largest FPD-MM NGS dataset (Figure 2). The dataset includes individuals ranging in age from 1 to 76 years, with malignancy phenotypes of AML, MDS, myelodysplastic syndrome/myeloproliferative neoplasm (MDS/MPN), and ALL, as well as pre-leukemic phenotypes including thrombocytopenia and asymptomatic carriers (Table 1). Detailed clinical information for each patient and associated samples is stored on the database and can be updated, enabling specific phenotypic-genotypic cohort studies to be performed on the clinical spectrum of FPD-MM. Additionally, the database can be updated easily with new NGS data as they become available, including longitudinal datasets from serial testing of individual patients. The database allows for a comprehensive, unbiased and customizable review of all RUNX1 germline datasets, with all raw sequencing data being analyzed through a standardized bioinformatics pipeline. This pipeline is designed to identify both somatic and germline variants, and its output is available on the database as variant-level data (VCF, Figure S1). Using the integrated VariantGrid (https://github.com/SACGF/variantgrid) genomics analysis software, we have curated a panel of somatic variants for each dataset (including all malignancy and pre-leukemic samples), prioritizing the identification of potentially pathogenic variants in HM (2,643 variants, 167 samples). Standard filtering criteria were adapted for identifying somatic variants (Online Supplementary Figure S1). Variants that passed all filtering criteria were subsequently manually curated.
Variants classified as having no clinical significance (benign/likely benign) according to ACMG/AMP guidelines were excluded. Remaining variants were classified as 1) Clinically relevant, 2) Possibly relevant, or 3) Unknown relevance (Online Supplementary Figure S1) [10,11]. Curated somatic variant data are available through the interactive oncoplot on the database homepage or variant page. Shared in real time with the scientific community, this curated dataset has already allowed the selection of secondary mutations to model FPD-MM disease and therapy in vitro and in animals. Importantly, investigators can interrogate the data to answer additional research questions, as the software provides a fully automated annotation of variants and allows non-bioinformaticians to filter, sort, analyze, and curate genetic variants stored in the database via a graphical interface (Online Supplementary Figure S2).
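The exclusion and tiering step described above can be illustrated with a minimal Python sketch. It is not the RUNX1db or VariantGrid implementation: the record fields, the label strings, the function name `tier_variants`, and the example variants (placeholder genes and changes) are all hypothetical and chosen only to show the logic of dropping benign/likely benign calls and grouping the remainder into the three relevance categories.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical label strings; the database's internal names are not given in the text.
EXCLUDED_CLASSES = {"benign", "likely benign"}
TIERS = ("clinically relevant", "possibly relevant", "unknown relevance")


@dataclass(frozen=True)
class CuratedVariant:
    sample_id: str
    gene: str
    change: str
    acmg_class: str  # ACMG/AMP-style class assigned during curation
    relevance: str   # one of TIERS, assigned during manual curation


def tier_variants(variants):
    """Drop benign/likely benign variants and group the rest by relevance tier."""
    grouped = defaultdict(list)
    for v in variants:
        if v.acmg_class.lower() in EXCLUDED_CLASSES:
            continue  # no clinical significance: excluded from the curated panel
        if v.relevance not in TIERS:
            raise ValueError(f"unexpected relevance label: {v.relevance!r}")
        grouped[v.relevance].append(v)
    # Return every tier, including empty ones, in a fixed order.
    return {tier: grouped.get(tier, []) for tier in TIERS}


# Invented placeholder records, for illustration only.
example = [
    CuratedVariant("S1", "GENE_A", "c.1A>G", "pathogenic", "clinically relevant"),
    CuratedVariant("S1", "GENE_B", "c.2C>T", "benign", "unknown relevance"),
    CuratedVariant("S2", "GENE_C", "c.3G>A", "uncertain significance", "possibly relevant"),
    CuratedVariant("S2", "GENE_D", "c.4T>C", "Likely benign", "unknown relevance"),
]

tiers = tier_variants(example)
for tier, members in tiers.items():
    print(tier, [f"{v.sample_id}:{v.gene}" for v in members])
```

Returning all three tiers, even when one is empty, keeps downstream summaries (for example, per-sample counts feeding a plot) the same shape for every dataset.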
This project serves as a model for data accumulation for rare cancer predisposition syndromes. The adoption of a single database that serves as a repository for patient demographic and clinical data, a mutational germline registry, and patient genomics data, and that can be interrogated as a large cohort, is essential for the diagnosis and treatment of patients with a rare disorder such as FPD-MM. This resource is especially useful in FPD-MM, where the genetic cause is well established but variability in clinical presentation and disease development renders diagnosis challenging. The aggregation of multiple families, individuals, and disease stages into a centralized database where all data undergo rigorous quality control using a single bioinformatics analysis strategy will aid in the exploration and discovery of the molecular progression of the disorder. The harmonized interpretation of genomic variants is imperative to understanding the mutational profile of a malignancy, and is achieved here through a curated list of variants displayed for each sample. Institutional, national, and international ethics and data sharing guidelines may initially limit contributions to initiatives like this one, which are supported by patient advocates, but these barriers need to be overcome given the importance of the work. We envision that information from this database will guide precision-based approaches to patient care plans with reasonable surveillance and adequate counselling and, eventually, the application of new targeted therapies and interventions prior to malignancy development for germline RUNX1 carriers. With the continued accumulation of data and clinical information, this type of gene-specific database can provide the basis for developing evidence-based clinical decisions, such as when to watch and wait and when to apply more aggressive therapies such as stem cell transplantation.
Finally, we hope that this database will serve as a model from which similar efforts will emerge for other HMs, benefiting all our patients and families.
Acknowledgments
The authors would also like to thank the RUNX1 Research Program for their support in helping to facilitate the development of the database and fostering collaborations. We also thank the patients and their family members for their willingness to participate in this study and the RUNX1 international data-sharing consortium for their valuable contributions. This project is also proudly supported by funding from the Leukaemia Foundation of Australia, and project grants APP1145278 and APP1164601 from the National Health and Medical Research Council of Australia. This work was produced with the financial and additional support of Cancer Council SA's Beat Cancer Project on behalf of its donors and the State Government of South Australia, through the Department of Health (PRF Fellowship to HSS). PA is supported by a fellowship from The Hospital Research Foundation. Part of this project was undertaken whilst PA was holding a Royal Adelaide Hospital Mary Overton Early Career Fellowship. LM is supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC) (Accelerator Award Project 22796; 5x1000 Project 21267; Investigator Grant 2017 Project 20125).
Funding Statement
Funding: this work is supported by a grant from the RUNX1 Research Program.
References
- 1. Arber DA, Orazi A, Hasserjian R, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127(20):2391-2405. Blood. 2016;128(3):462-463.
- 2. Brown AL, Hahn CN, Scott HS. Secondary leukemia in patients with germline transcription factor mutations (RUNX1, GATA2, CEBPA). Blood. 2020;136(1):24-35.
- 3. Song WJ, Sullivan MG, Legare RD, et al. Haploinsufficiency of CBFA2 causes familial thrombocytopenia with propensity to develop acute myelogenous leukaemia. Nat Genet. 1999;23(2):166-175.
- 4. Brown AL, Arts P, Carmichael CL, et al. RUNX1-mutated families show phenotype heterogeneity and a somatic mutation profile unique to germline predisposed AML. Blood Adv. 2020;4(6):1131-1144.
- 5. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17(5):405-424.
- 6. Luo X, Feurstein S, Mohan S, et al. ClinGen Myeloid Malignancy Variant Curation Expert Panel recommendations for germline RUNX1 variants. Blood Adv. 2019;3(20):2962-2979.
- 7. Brown AL, Hahn C, Hiwase D, Godley LA, Scott HS. Correct application of variant classification guidelines in germline RUNX1 mutated disorders to assist clinical diagnosis. Leuk Lymphoma. 2020;61(1):246-247.
- 8. Feurstein S, Zhang L, DiNardo CD. Accurate germline RUNX1 variant interpretation and its clinical significance. Blood Adv. 2020;4(24):6199-6203.
- 9. Bellissimo DC, Speck NA. RUNX1 mutations in inherited and sporadic leukemia. Front Cell Dev Biol. 2017;5:111.
- 10. Branford S, Wang P, Yeung DT, et al. Integrative genomic analysis reveals cancer-associated mutations at diagnosis of CML in patients with high-risk disease. Blood. 2018;132(9):948-961.
- 11. Li MM, Datto M, Duncavage EJ, et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer: a joint consensus recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017;19(1):4-23.
- 12. Zhou X, Edmonson MN, Wilkinson MR, et al. Exploring genomic alteration in pediatric cancer using ProteinPaint. Nat Genet. 2016;48(1):4-6.