Abstract
Interpretation of variants identified during genetic testing is a significant clinical challenge. In this study, we developed a high-throughput CDKN2A functional assay and characterized all possible CDKN2A missense variants. We found that 17.7% of all missense variants were functionally deleterious. We also used our functional classifications to assess the performance of in silico models that predict the effect of variants, including recently reported models based on machine learning. Notably, we found that all in silico models performed similarly when compared to our functional classifications with accuracies of 39.5–85.4%. Furthermore, while we found that functionally deleterious variants were enriched within ankyrin repeats, we did not identify any residues where all missense variants were functionally deleterious. Our functional classifications are a resource to aid the interpretation of CDKN2A variants and have important implications for the application of variant interpretation guidelines, particularly the use of in silico models for clinical variant interpretation.
Introduction
Genetic testing of patients with cancer to identify variants associated with an increased cancer risk and sensitivity to targeted therapies is becoming more common as broad testing criteria are integrated into clinical care guidelines (Goggins et al., 2020; Stoffel et al., 2019). The American College of Medical Genetics (ACMG) provides a framework to integrate multiple types of evidence, including variant characteristics, disease epidemiology, clinical information, and functional classifications, to interpret variants in any gene (Richards et al., 2015). In silico variant effect predictors are also integrated into ACMG variant interpretation guidelines as supporting evidence to aid classification of variants. While numerous models have been developed, varied accuracy, poor agreement between models, and inflated performance on publicly available data have been reported (Cubuk et al., 2021; Jaffe et al., 2011; Wilcox et al., 2022). Recently developed variant effect predictors aim to overcome these limitations by incorporating deep learning-based protein structure predictions and by not training on human-annotated datasets (Brandes et al., 2023; Cheng et al., 2023; Gao et al., 2023). However, post-development assessment of machine learning-based variant effect predictors, to determine accuracy on novel experimental datasets and suitability for clinical use, is limited.
Variants that cannot be classified as either pathogenic or benign are categorized as variants of uncertain significance (VUSs). However, while pathogenic and benign variants identified during genetic testing are clinically actionable, VUSs are the cause of deep uncertainty for patients and their health care providers, as an unknown fraction are functionally deleterious and therefore likely pathogenic. For example, individuals with germline VUSs in a pancreatic cancer susceptibility gene are not eligible for clinical surveillance programs that are associated with improved patient outcomes, unless they otherwise meet family history criteria (Goggins et al., 2020; Stoffel et al., 2019). Similarly, patients with breast or pancreatic cancer and a germline BRCA2 VUS would not be eligible for treatment with olaparib, a poly (ADP-ribose) polymerase inhibitor (Golan et al., 2019; Tutt et al., 2021). Reclassification of VUSs into pathogenic or benign strata has real-world, life-or-death consequences that necessitate a high degree of accuracy.
Germline VUSs in hereditary cancer genes are a common finding in patients with cancer and frequently can be reclassified as pathogenic on the basis of in vitro functional evidence (Kimura et al., 2022). In patients with pancreatic ductal adenocarcinoma (PDAC), germline CDKN2A VUSs affecting p16INK4a, most often rare missense variants, are found in up to 4.3% of patients (Chaffee et al., 2018; Kimura et al., 2021; McWilliams et al., 2018; Roberts et al., 2016; Shindo et al., 2017; Zhen et al., 2015). As functional data from well-validated in vitro assays are incorporated into ACMG variant interpretation guidelines, we recently determined the functional consequence of 29 CDKN2A VUSs identified in patients with PDAC using an in vitro cell proliferation assay (Kimura et al., 2022; Richards et al., 2015). We found that over 40% of VUSs assayed were functionally deleterious and could be reclassified as likely pathogenic.
Functional characterization, however, is time-consuming, expensive, and requires technical and scientific expertise. These limitations hinder assessment of in silico variant effect predictors and patient access to functional data that may allow reclassification of VUSs into clinically actionable strata. As CDKN2A VUSs will continue to be identified in patients with cancer undergoing genetic testing, we developed a high-throughput functional assay to provide a broad interpretation framework for CDKN2A variants. We characterized all possible CDKN2A missense variants and compared our functional classifications to recently developed in silico models based on machine learning to determine the accuracy of variant effect predictions.
Results
Functional characterization of CDKN2A missense variants
We utilized a codon optimized CDKN2A sequence for our multiplexed functional assay. Expression of codon optimized CDKN2A or the synonymous CDKN2A variants, p.L32L, p.G101G, and p.V126V, in PANC-1, a PDAC cell line with a homozygous deletion of CDKN2A, resulted in significant reduction in cell proliferation (P value < 0.0001; Figure 1-figure supplement 1A). There was no significant difference between codon optimized CDKN2A and the three synonymous variants assayed. Conversely, expression of three pathogenic variants, p.L32P, p.G101W, and p.V126D, in PANC-1 cells did not result in any significant changes in cell proliferation. To determine if there were unappreciated selective effects during in vitro culture, we generated a CellTag library based on the pLJM1 plasmid that contained twenty non-functional 9 base pair barcodes of equal representation. We then transduced PANC-1 cells that stably expressed codon optimized CDKN2A with the CellTag library (Day 0) and determined representation of each barcode in the cell pool on Day 9 and at confluency (Day 45). We found no statistically significant changes in barcode representation, indicating that representation of a pool of functionally neutral variants is stable over a period of in vitro culture representing our assay time course (Figure 1-figure supplement 1B, Appendix 1-table 1).
We next determined whether we could identify functionally deleterious CDKN2A variants at a single residue when all amino acid variants were assayed simultaneously. We generated lentiviral expression plasmid libraries for all 156 CDKN2A amino acid residues, where each library contained all possible amino acids at a single residue. Twenty-seven variants (27 of 3,120, 0.87%) were represented in the plasmid libraries at ≤ 1%. Expression plasmids for each of these 27 variants were individually generated by site directed mutagenesis and added to the corresponding plasmid library to a calculated representation of 5% (Figure 1-figure supplement 2A and 2B, Appendix 1-table 2). Plasmid libraries were then individually amplified, and lentivirus produced. To confirm that the representation of each variant was maintained after transduction, we transduced three lentiviral libraries (amino acid residues p.R24, p.H66, and p.A127) individually into PANC-1 cells and determined the proportion of each variant in the amplified plasmid library and in the cell pool at Day 9 post-transduction. The proportion of each variant in the amplified plasmid library and in the cell pool at Day 9 were highly correlated (Figure 1-figure supplement 2C and 2D, Appendix 1-table 3).
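The library arithmetic above can be checked with a short sketch. The names `AMINO_ACIDS`, `N_RESIDUES`, and `library_variants` are illustrative and not part of the study's analysis code; the sketch only assumes that each of the 156 single-residue libraries contains the 20 standard amino acids.

```python
# Illustrative sketch: names here are hypothetical, not from the study.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
N_RESIDUES = 156                      # CDKN2A (p16INK4a) residues assayed


def library_variants(wild_type: str) -> dict:
    """Split the 20 amino acids at one residue into synonymous and missense."""
    if wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {wild_type}")
    missense = [aa for aa in AMINO_ACIDS if aa != wild_type]
    return {"synonymous": [wild_type], "missense": missense}


total_variants = N_RESIDUES * len(AMINO_ACIDS)        # all amino acid variants
total_missense = N_RESIDUES * (len(AMINO_ACIDS) - 1)  # missense variants only
low_fraction = 27 / total_variants                    # variants at <= 1%

print(total_variants, total_missense, f"{low_fraction:.2%}")
```

Under these assumptions the totals match the 3,120 variants and 2,964 missense variants reported in the text.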
For two CDKN2A amino acid residues that include pathogenic and benign variants, p.V126 and p.R144, we determined the representation of each variant in the transduced cell pool at Day 9 and at confluency after a period of in vitro culture, Day 23 and Day 31 post-transduction, respectively (Figure 1A and 1B, Appendix 1-table 4, Appendix 1-table 5). Two synonymous variants, p.V126V and p.R144R, as well as a previously reported benign variant, p.R144C, either decreased or maintained their representation in the cell pool during in vitro culture as determined by the number of sequence reads supporting the variant. Representation of a previously reported pathogenic variant, p.V126D, increased in the cell pool. Notably, several other variants, including p.V126R, p.V126W, p.V126K, and p.V126Y, also increased in representation in the cell pool, suggesting that additional amino acid changes at this residue are functionally deleterious (Figure 1A).
To functionally characterize 2,964 CDKN2A missense variants, PANC-1 cells were transduced with each of the 156 lentiviral expression libraries individually and representation of each CDKN2A variant in the resulting cell pool determined at Day 9 after transduction and at confluency (Day 16 – 40) (Appendix 1-table 5). Variant read counts were then analyzed using a gamma generalized linear model (GLM), that does not rely on annotation of pathogenic and benign variants to set classification thresholds, and variants with statistically significant P values were classified as functionally deleterious (log2 P values ≤ −53.2). Variants with P values that did not reach statistical significance were classified as either of indeterminate function (log2 P values > −53.2 and < −5.8) or functionally neutral (log2 P values ≥ −5.8).
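The three-way classification rule can be written as a small function. This is a minimal sketch that takes a precomputed log2 P value as input; it does not reproduce the gamma GLM itself, and the function and constant names are illustrative.

```python
# Thresholds as stated in the text; the names are hypothetical.
DELETERIOUS_MAX_LOG2_P = -53.2  # log2 P <= -53.2: functionally deleterious
NEUTRAL_MIN_LOG2_P = -5.8       # log2 P >= -5.8: functionally neutral


def classify(log2_p: float) -> str:
    """Map a log2 P value from the GLM onto a functional class."""
    if log2_p <= DELETERIOUS_MAX_LOG2_P:
        return "functionally deleterious"
    if log2_p >= NEUTRAL_MIN_LOG2_P:
        return "functionally neutral"
    return "indeterminate function"


# Example inputs chosen for illustration only.
for value in (-60.0, -19.3, -2.0):
    print(value, classify(value))
```

Both thresholds are inclusive on the side of the named class, so a variant with a log2 P value of exactly −53.2 is called functionally deleterious and one at exactly −5.8 is called functionally neutral.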
We found that 525 of 2,964 missense variants (17.7%) were functionally deleterious in our assay (Figure 2A, Figure 2-figure supplement 1A, Appendix 1-table 4). In addition, 1,784 variants (60.2%) were classified as functionally neutral, with the remaining 655 variants (22.1%) classified as indeterminate function (Figure 2A, Appendix 1-table 4). In general, our results were consistent with previously reported classifications. Of variants identified in patients with cancer and previously reported to be functionally deleterious in published literature and/or reported in ClinVar as pathogenic or likely pathogenic (benchmark pathogenic variants), 27 of 32 (84.4%) were functionally deleterious in our assay (Figure 2B, Figure 2-figure supplement 1B and 1C, Appendix 1-table 4) (Chaffee et al., 2018; Chang et al., 2016; Horn et al., 2021; Hu et al., 2018; Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016; Zhen et al., 2015). Five benchmark pathogenic variants were characterized as indeterminate function, with log2 P values from −19.3 to −33.2. Of 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign (benchmark benign variants), all were characterized as functionally neutral in our assay (Figure 2B, Figure 2-figure supplement 1B and 1C, Appendix 1-table 4) (Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016). Of 31 VUSs previously reported to be functionally deleterious, 28 (90.3%) were functionally deleterious and 3 (9.7%) were of indeterminate function in our assay. Similarly, of 18 VUSs previously reported to be functionally neutral, 16 (88.9%) were functionally neutral and 2 (11.1%) were of indeterminate function in our assay (Figure 2B, Figure 2-figure supplement 1B and 1C, Appendix 1-table 4).
We next compared variant classifications using the gamma GLM to variant classifications using a normalized fold change method (Brenan et al., 2016; Giacomelli et al., 2018). Classification of missense variants using normalized fold change also differentiated benchmark pathogenic and benchmark benign variants (Figure 2-figure supplement 2A and 2B, Appendix 1-table 6). Using benchmark pathogenic variants and benchmark benign variants to set thresholds for classification, we classified all variants as either functionally deleterious (log2 normalized fold change ≤ 0.24), indeterminate function (log2 normalized fold change > 0.24 and < 1.09), or functionally neutral (log2 normalized fold change ≥ 1.09). Using these thresholds, 12 of 18 VUSs (66.7%) previously reported to be functionally neutral were classified as functionally neutral, while 6 (33.3%) were of indeterminate function. Similarly, of 31 VUSs previously reported to be functionally deleterious, 30 (96.8%) were functionally deleterious and 1 (3.2%) was of indeterminate function (Figure 2-figure supplement 2A and 2B, Appendix 1-table 6). Overall, 632 of 2,964 missense variants were functionally deleterious (21.3%), 674 variants were indeterminate function (22.7%), and 1,658 variants were functionally neutral (55.9%) using log2 normalized fold change to classify variants (Figure 2-figure supplement 2C, Appendix 1-table 6). Notably, 517 of 525 variants (98.5%) classified as functionally deleterious and 1,586 of 1,784 variants (88.9%) classified as functionally neutral using the gamma GLM were similarly classified using log2 normalized fold change (Figure 2-figure supplement 2D).
To confirm the reproducibility of our variant classifications, 28 amino acid residues were assayed in duplicate, and variants classified using the gamma GLM. The majority of missense variants, 452 of 560 (80.7%), had the same functional classification in each of the two replicates (Figure 2-figure supplement 3A and 3B, Appendix 1-table 4). We also determined whether underrepresentation in the cell pool at Day 9 affected variant functional classifications. Fifty-three of 2,964 missense variants (1.8%) were present in the cell pool at Day 9 of the first assay replicate (experiment 1) at < 2%, as determined by the number of sequence reads supporting the variant (Figure 2-figure supplement 4A, Appendix 1-table 4). There was no statistically significant difference in the proportion of variants classified as functionally deleterious for variants present in less than 2% of the cell pool at Day 9 (12 of 53 variants; 22.6%), and variants present in more than 2% of the cell pool (496 of 2,911 variants; 17.0%) (P value = 0.28) (Figure 2-figure supplement 4B). We also found no significant differences in the proportion of variants classified as functionally deleterious for variants present in more than 2% of the cell pool at Day 9 when variants were binned in 1% intervals (Figure 2-figure supplement 4B).
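Replicate agreement of this kind can be expressed as the fraction of variants that receive the same class in both replicates. The sketch below uses a hypothetical `concordance` helper and toy labels; it is not the study's analysis code.

```python
def concordance(replicate_1: list, replicate_2: list) -> float:
    """Fraction of variants with the same classification in both replicates."""
    if len(replicate_1) != len(replicate_2):
        raise ValueError("replicates must cover the same variants")
    if not replicate_1:
        raise ValueError("no variants supplied")
    same = sum(a == b for a, b in zip(replicate_1, replicate_2))
    return same / len(replicate_1)


# Toy labels for four variants (illustrative only).
rep_1 = ["deleterious", "neutral", "indeterminate", "neutral"]
rep_2 = ["deleterious", "neutral", "neutral", "neutral"]
print(concordance(rep_1, rep_2))

# The reported agreement, 452 of 560 variants, as a percentage.
reported = round(100 * 452 / 560, 1)
print(reported)
```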
Comparison to in silico prediction algorithms
As in silico predictions of variant effect are integrated into ACMG variant interpretation guidelines as supporting evidence, we compared the ability of different algorithms, including recently described algorithms that incorporate deep-learning models of protein structure, to predict the functional consequence of CDKN2A missense variants. We compared our functional classifications to predictions from Combined Annotation Dependent Depletion (CADD), Polymorphism Phenotyping v2 (PolyPhen-2), Sorting Intolerant From Tolerant (SIFT), Variant Effect Scoring Tool score (VEST), AlphaMissense, ESM1b, and PrimateAI-3D. In silico predictions for all missense variants were available for PolyPhen-2, SIFT, VEST, AlphaMissense, and ESM1b. For CADD and PrimateAI-3D, 910 (152 functionally deleterious, 196 indeterminate, and 562 functionally neutral) and 904 (152 functionally deleterious, 196 indeterminate, and 556 functionally neutral) missense variants had in silico predictions available, respectively (Appendix 1-table 7). In silico variant effect predictors performed similarly across a broad range of performance characteristics (Appendix 1-table 8). Accuracy of in silico model predictions was 39.5 – 85.4% (CADD – 45.1%; PolyPhen-2 – 39.5%; SIFT – 60.9%; VEST – 71.9%; AlphaMissense – 71.6%; ESM1b – 59.2%; and PrimateAI-3D – 85.4%) (Figure 3). We also assessed sensitivity, specificity, positive predictive value, and negative predictive value for each model. We found that sensitivity was 0.25 – 0.98 (CADD – 0.97; PolyPhen-2 – 0.98; SIFT – 0.79; VEST – 0.91; AlphaMissense – 0.94; ESM1b – 0.95; and PrimateAI-3D – 0.25), specificity was 0.27 – 0.98 (CADD – 0.35; PolyPhen-2 – 0.27; SIFT – 0.57; VEST – 0.68; AlphaMissense – 0.67; ESM1b – 0.51; and PrimateAI-3D – 0.98), positive predictive value was 0.22 – 0.68 (CADD – 0.23; PolyPhen-2 – 0.22; SIFT – 0.28; VEST – 0.38; AlphaMissense – 0.38; ESM1b – 0.3; and PrimateAI-3D – 0.68), and negative predictive value was 0.87 – 0.98 (CADD – 0.98; PolyPhen-2 – 0.98; SIFT – 0.93; VEST – 0.97; AlphaMissense – 0.98; ESM1b – 0.98; and PrimateAI-3D – 0.87).
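For reference, the five performance measures reported here follow from a two-by-two comparison of predicted and functional classes. The sketch below uses the standard definitions with toy data; the `performance` function is illustrative, and how variants of indeterminate function were assigned for this comparison is not specified here, so the sketch assumes every variant already carries a binary functional label.

```python
def performance(truth: list, predicted: list) -> dict:
    """Standard binary metrics; True means deleterious (or predicted so)."""
    if len(truth) != len(predicted):
        raise ValueError("inputs must have the same length")
    tp = sum(t and p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy example: six variants, two of them functionally deleterious.
functional = [True, True, False, False, False, False]
in_silico = [True, False, True, False, False, False]
metrics = performance(functional, in_silico)
print(metrics)
```

Because functionally deleterious variants are a minority class, a predictor that rarely calls variants deleterious can reach high accuracy and specificity while retaining low sensitivity, which is the pattern reported for PrimateAI-3D.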
We also tested the effect of combining multiple in silico predictors. In total, 904 missense variants had in silico predictions from all 7 algorithms. The remaining 2,060 missense variants had in silico predictions from 5 algorithms. Of variants with in silico predictions from all 7 algorithms, 378 (41.8%) had predictions of deleterious or pathogenic effect from a majority of algorithms (≥ 4), and of these, 137 (36.2%) were functionally deleterious in our assay. Similarly, of 2,060 missense variants that had in silico predictions from 5 algorithms, 1,107 (53.7%) had predictions of deleterious or pathogenic effect from a majority of algorithms (≥ 3), of which 361 (32.6%) were functionally deleterious in our assay (Appendix 1-table 7).
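The majority rule described above (at least 4 of 7, or at least 3 of 5, algorithms) can be sketched as a simple vote over whichever predictions are available for a variant. The `majority_deleterious` helper and the boolean encoding of each predictor's call are assumptions made for illustration.

```python
def majority_deleterious(calls: list) -> bool:
    """True when more than half of the available predictors call the
    variant deleterious or pathogenic (>= 4 of 7, or >= 3 of 5)."""
    if not calls:
        raise ValueError("at least one prediction is required")
    return sum(bool(c) for c in calls) > len(calls) / 2


# Seven predictors available, four deleterious calls: majority reached.
print(majority_deleterious([True, True, True, True, False, False, False]))
# Five predictors available, two deleterious calls: no majority.
print(majority_deleterious([True, True, False, False, False]))

# The two groups of variants together cover every missense variant.
all_missense = 904 + 2060
print(all_missense)
```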
Distribution of functionally deleterious variants
Analysis of functionally deleterious variants may highlight critical and non-critical residues for CDKN2A function. We found that functionally deleterious missense variants were not distributed evenly across CDKN2A. CDKN2A contains four ankyrin repeats that mediate protein-protein interactions, ankyrin repeat 1 at codon 11–40, ankyrin repeat 2 at codon 44–72, ankyrin repeat 3 at codon 77–106, and ankyrin repeat 4 at codon 110–139 (Goldstein, 2004; Ruas and Peters, 1998; Sun et al., 2010) (Figure 2-figure supplement 5A). Functionally deleterious variants were enriched in ankyrin repeat 1 (21.0%, adjusted P value = 0.01), ankyrin repeat 2 (26.2%, adjusted P value = 1.0 × 10−10), and ankyrin repeat 3 (26.3%, adjusted P value = 2.6 × 10−11), while depleted in ankyrin repeat 4 (6.5%, adjusted P value = 3.2 × 10−13) and non-ankyrin repeat regions (6.8%, adjusted P value = 0) (Figure 2-figure supplement 5B). Moreover, functionally deleterious variants were further enriched within 10 residue subregions of ankyrin repeats 1–3, with 37.0% of variants in residues 16–25 of ankyrin repeat 1, 40.0% of variants in residues 46–55 of ankyrin repeat 2, and 48.0% of variants in residues 80–89 of ankyrin repeat 3 being classified as functionally deleterious (Figure 2C, Appendix 1-table 4).
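The codon ranges given above are enough to assign any residue to a domain and to tally the deleterious fraction per domain. The sketch below uses those ranges with a few toy variant calls; the function names and the example calls are illustrative, and the enrichment statistics themselves are not reproduced.

```python
from collections import defaultdict

# Codon ranges (inclusive) as stated in the text.
ANKYRIN_REPEATS = {
    "ankyrin repeat 1": (11, 40),
    "ankyrin repeat 2": (44, 72),
    "ankyrin repeat 3": (77, 106),
    "ankyrin repeat 4": (110, 139),
}


def domain_of(codon: int) -> str:
    """Return the ankyrin repeat containing a codon, or 'non-ankyrin'."""
    if not 1 <= codon <= 156:
        raise ValueError("codon must be between 1 and 156")
    for name, (start, end) in ANKYRIN_REPEATS.items():
        if start <= codon <= end:
            return name
    return "non-ankyrin"


def deleterious_fraction_by_domain(variants: list) -> dict:
    """variants: (codon, is_deleterious) pairs -> fraction per domain."""
    totals = defaultdict(int)
    deleterious = defaultdict(int)
    for codon, is_deleterious in variants:
        domain = domain_of(codon)
        totals[domain] += 1
        deleterious[domain] += bool(is_deleterious)
    return {d: deleterious[d] / totals[d] for d in totals}


# Toy calls for illustration only (not assay data).
toy = [(83, True), (84, True), (90, False), (126, False), (144, False)]
print(deleterious_fraction_by_domain(toy))
```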
Across all single residues, the mean percent of functionally deleterious missense variants was 17.7% (95% confidence interval: 12.7% - 20.9%) (Figure 2-figure supplement 5C, Appendix 1-table 4). At five amino acid residues, p.G23, p.G55, p.H83, p.D84, and p.G89, 17 of 19 (89.5%) possible missense variants were functionally deleterious. Notably, these residues are conserved between human and murine p16 (Byeon et al., 1998). In addition, p.H83 has been reported to stabilize peptide loops connecting the helix-turn-helix structure of the four ankyrin repeats (Byeon et al., 1998), whereas p.D84 and p.G89 are located in a 20-residue region reported to interact with CDK4 and CDK6 (Fåhraeus et al., 1996). Conversely, 18 residues were tolerant of amino acid substitutions, with no missense variant characterized as functionally deleterious in our assay (Figure 2-figure supplement 5C, Appendix 1-table 4).
We also determined whether the location of variants in protein domains correlated with in silico predictions for the 904 missense variants with predictions from all 7 algorithms (Figure 3-figure supplement 1A – 1H) and the 2,060 missense variants with predictions from 5 algorithms (Figure 3-figure supplement 2A – 2H). Notably, Ank2 and Ank3 domains had more variants predicted to have deleterious or pathogenic effect by the majority of algorithms compared to Ank1, Ank4, and non-Ank domains (Figure 3-figure supplement 1C, Figure 3-figure supplement 2C). We also found increasing agreement between in silico predictions of deleterious or pathogenic effect and functionally deleterious classification in our assay as the number of algorithms predicting deleterious or pathogenic effects increased (Figure 3-figure supplement 1B, Figure 3-figure supplement 2B). This was true for all CDKN2A protein domains assessed (Figure 3-figure supplement 1D – 1H, Figure 3-figure supplement 2D – 2H).
Functional effect of CDKN2A somatic mutations
Somatic alterations in CDKN2A are a frequent finding in many types of cancer. However, not all somatic alterations are unequivocally deleterious to protein function. Missense somatic mutations are particularly challenging to functionally interpret and the presence of a functionally neutral somatic mutation may impact patient care (Tung et al., 2020). To understand the functional effect of missense somatic mutations in CDKN2A, we functionally classified mutations reported in the Catalogue Of Somatic Mutations In Cancer (COSMIC) (Forbes et al., 2009), The Cancer Genome Atlas (TCGA) (Muddabhaktuni and Koyyala, 2021), patients with cancer undergoing sequencing at The Johns Hopkins University School of Medicine (JHU), and the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets Clinical Sequencing Cohort (MSK-IMPACT) (Cheng et al., 2015). Overall, 355 unique missense somatic mutations were reported, of which 119 (33.5%) were functionally deleterious in our assay (Appendix 1-table 9). The percent of missense somatic mutations that were classified as functionally deleterious was greater than the percent of all possible CDKN2A missense variants classified as functionally deleterious, suggesting enrichment of functionally deleterious missense changes among somatic mutations (Figure 2A, Appendix 1-table 4, Appendix 1-table 9). The proportion of missense somatic mutations that were functionally deleterious was similar in COSMIC, TCGA, JHU, and MSK-IMPACT. We found that 34.2% - 53.4% of unique missense somatic mutations were classified as functionally deleterious, with 61.4% - 67.6% of patients having a functionally deleterious somatic mutation (Figure 4A, Appendix 1-table 9). As with functionally deleterious variants, functionally deleterious missense somatic mutations were also not distributed evenly across CDKN2A, being enriched within ankyrin repeat 3 (Figure 4B, Appendix 1-table 9). We found that 32.4% - 50.0% of all functionally deleterious missense somatic mutations occurred within ankyrin repeat 3, with 48.0% - 58.0% of patients in each cohort having a functionally deleterious missense somatic mutation in this domain. Notably, 65.7% - 76.0% of functionally deleterious missense somatic mutations in this domain were in residues 80–89 (Appendix 1-table 9).
When considering unique missense somatic mutations, 26 of 355 (7.3%) would be classified as pathogenic or likely pathogenic by ACMG classification guidelines and these were found in 263 of 1,176 (22.4%) patients in COSMIC, 45 of 185 (24.3%) patients in TCGA, 40 of 184 (21.7%) patients in JHU, and 46 of 174 (26.4%) patients in MSK-IMPACT (Figure 4-figure supplement 1A and 1B). In each cohort, the most prevalent of these somatic mutations were p.His83Tyr and p.Asp84Asn, with more than half of the patients with a somatic mutation that could be classified as pathogenic or likely pathogenic having either the p.His83Tyr or p.Asp84Asn alteration (Figure 4-figure supplement 1C). In our functional assays, these somatic mutations were both classified as functionally deleterious.
We were also able to determine the functional classification of CDKN2A missense somatic mutations in COSMIC, TCGA, JHU, and MSK-IMPACT by cancer type. We found that 22.2% - 100% of CDKN2A missense somatic mutations were functionally deleterious depending on cancer type (Figure 4-figure supplement 2A–2D). When considering missense somatic mutations reported in any database, there was a statistically significant depletion of functionally deleterious mutations in colorectal adenocarcinoma (20.4%; adjusted P value = 5.4 × 10−9) (Figure 4C). As the proportion of missense somatic mutations that were functionally deleterious was less in colorectal carcinoma compared to other types of cancer, we assessed whether somatic mutations in mismatch repair genes (MLH1, MLH3, MSH2, MSH6, PMS1, and PMS2) were associated with the functional status of CDKN2A missense somatic mutations. Thirty-five patients in COSMIC had a CDKN2A missense somatic mutation, of which 12 (34.3%) had a somatic mutation in a mismatch repair gene. We found that no patients with a somatic mutation in a mismatch repair gene had a functionally deleterious CDKN2A missense somatic mutation compared to 6 of 23 samples (26.1%) without a somatic mutation in a mismatch repair gene (P value = 0.062).
CDKN2A variants in variant databases
The Genome Aggregation Database (gnomAD) v4.1.0 reports 287 missense variants in CDKN2A, including 13 pathogenic, 4 likely pathogenic, 3 likely benign, and 3 benign variants, and 264 VUSs, classified using ACMG variant interpretation guidelines (Figure 5A and 5B, Appendix 1-table 10). Of the 264 missense VUSs, 177 (67.0%) were functionally neutral, 56 (21.2%) were indeterminate function, and 31 (11.7%) were functionally deleterious in our assay using the gamma GLM for classification (Figure 5C). Similarly, ClinVar reports 395 CDKN2A missense VUSs, of which 256 (64.8%) were functionally neutral, 94 (23.8%) were indeterminate function, and 45 (11.4%) were functionally deleterious in our assay (Figure 5D, Appendix 1-table 11).
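The percentages above follow directly from the reported counts, as the short sketch below shows. It also computes the share of VUSs that received a determinate (neutral or deleterious) classification. The dictionary and function names are illustrative, and only the counts stated in the text are used.

```python
# Counts of missense VUSs by functional class, as reported in the text.
GNOMAD_VUS = {
    "functionally neutral": 177,
    "indeterminate function": 56,
    "functionally deleterious": 31,
}
CLINVAR_VUS = {
    "functionally neutral": 256,
    "indeterminate function": 94,
    "functionally deleterious": 45,
}


def percentages(counts: dict) -> dict:
    """Percentage of VUSs in each functional class, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def determinate_percentage(counts: dict) -> float:
    """Percentage of VUSs classified as either neutral or deleterious."""
    total = sum(counts.values())
    determinate = total - counts["indeterminate function"]
    return round(100 * determinate / total, 1)


for name, counts in (("gnomAD", GNOMAD_VUS), ("ClinVar", CLINVAR_VUS)):
    print(name, percentages(counts), determinate_percentage(counts))
```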
Discussion
VUSs in hereditary cancer susceptibility genes, predominantly rare missense variants, are a frequent finding in patients undergoing genetic testing and the cause of significant uncertainty. ACMG variant interpretation guidelines incorporate functional data, as well as other evidence such as in silico predictions of variant effect, to aid classification of variants as either pathogenic or benign. CDKN2A VUSs are a frequent finding in patients with PDAC. We previously found that over 40% of CDKN2A VUSs identified in patients with PDAC were functionally deleterious and therefore could be reclassified as likely pathogenic. In this study, we developed a high-throughput in vitro assay and functionally characterized 2,964 CDKN2A missense variants, representing all possible single amino acid variants. We found that 525 missense variants (17.7%) were functionally deleterious. These pre-defined functional characterizations are a resource for the scientific community and can be integrated into variant interpretation schema to aid classification of CDKN2A germline variants and somatic mutations.
We classified CDKN2A missense variants using a gamma GLM, with thresholds determined using the change in representation of 20 non-functional barcodes in a pool of PANC-1 cells stably expressing CDKN2A after a period of in vitro growth. Our variant classifications were not biased by using assay outputs for previously reported (benchmark) pathogenic or benign variants to determine thresholds. Even so, CDKN2A missense variant classifications were remarkably similar using a gamma GLM or normalized fold change with thresholds determined using benchmark pathogenic and benign variants. Of missense variants classified as functionally deleterious using a gamma GLM, 98.5% were similarly classified using normalized fold change.
We repeated our functional assay twice for 28 CDKN2A residues. For the remaining 128 residues of CDKN2A, the functional assay was completed once. While we found general agreement between functional classifications from each replicate for the 28 residues assayed in duplicate, additional repeats for each residue are necessary to determine variability in variant functional classifications.
Our characterization of all possible CDKN2A missense variants allowed us to assess the ability of in silico algorithms, including the recently published machine learning-based predictors AlphaMissense, ESM1b, and PrimateAI-3D, to predict the pathogenicity or functional effect of CDKN2A missense variants. We found that all in silico variant effect predictors assessed performed similarly. Highest accuracy was observed with PrimateAI-3D at 85.4%, followed by VEST at 71.9% and AlphaMissense at 71.6%. Importantly, even in silico predictors performing best in one metric may perform poorly in others. For example, PrimateAI-3D had the highest specificity (0.98) and positive predictive value (0.68), but the lowest sensitivity (0.25) and negative predictive value (0.87). Given that reclassification of VUSs in hereditary cancer genes into inappropriate strata has significant implications for patients, use of in silico models for clinical variant interpretation, including those utilizing machine learning, may be premature. Ultimately, our data support current ACMG guidelines that include in silico predictions of variant effect as supporting evidence of pathogenicity or benign impact.
Our study also provides other insights for the implementation of variant interpretation guidelines. ACMG guidelines include presence of a missense variant at a residue with a previously reported pathogenic variant as moderate evidence of pathogenicity. We found that functionally deleterious missense variants were not evenly distributed across CDKN2A. We found enrichment of functionally deleterious missense variants in ankyrin repeats 1–3 and depletion in ankyrin repeat 4. Notably, no CDKN2A residue was completely intolerant of amino acid changes, suggesting, at least for CDKN2A, that the presence of a pathogenic missense variant at a residue should be used with caution when classifying other missense variants at the same residue.
We characterized variants based upon a broad cellular phenotype, cell proliferation, in a single PDAC cell line. It is possible that CDKN2A variant functional classifications are cell-specific and assay-specific. Our assay may not encompass all cellular functions of CDKN2A and an alternative assay of a specific CDKN2A function, such as CDK4 binding, may result in different variant functional classifications. Furthermore, CDKN2A variants may have different effects if alternative cell lines are used for the functional assay. However, cell-specific effects appear to be limited. In our previous study, we characterized 29 CDKN2A VUSs in three PDAC cell lines, using cell proliferation and cell cycle assays, and found agreement between all functional classifications (Kimura et al., 2022).
This study supports the utility of our in vitro functional assay. In general, we found that benchmark pathogenic variants, benchmark benign variants, and VUSs previously reported to be functionally deleterious had congruent functional classifications in our assay. Moreover, we found that functionally deleterious effects were enriched among somatic missense mutations, and depleted in missense VUSs in gnomAD, compared to all CDKN2A missense variants. Importantly, our functional assay provides evidence to reclassify 301 of 395 (76.2%) missense VUSs reported in ClinVar and 208 of 264 (78.8%) missense VUSs reported in gnomAD. These include 45 (11.4%) missense VUSs in ClinVar and 31 (11.7%) missense VUSs in gnomAD that could be reclassified as likely pathogenic variants.
In this study, we determined functional classifications for all possible CDKN2A missense variants. Comparison of our functional classifications to in silico variant effect predictors, including recently described algorithms based on machine learning, provides performance benchmarks and supports current recommendations integrating computational data into variant interpretation guidelines.
Methods
Cell lines
PANC-1 (American Type Culture Collection, Manassas, VA; catalog no. CRL-1469), a human PDAC cell line with a homozygous deletion of CDKN2A (Caldas et al., 1994), and 293T (American Type Culture Collection; catalog no. CRL-3216), a human embryonic kidney cell line, were maintained in Dulbecco’s modified Eagle’s medium (Thermo Fisher Scientific Inc., Waltham, MA; catalog no. 11995–065) supplemented with 10% fetal bovine serum (Thermo Fisher Scientific Inc.; catalog no. 26140–079). Cell line authentication and mycoplasma testing were performed using the GenePrint 10 System (Promega Corporation, Madison, WI; catalog no. B9510) and the PCR-based MycoDtect kit (Greiner Bio-One, Monroe, NC; catalog no. 463 060) (Genetics Resource Core Facility, The Johns Hopkins University, Baltimore, MD).
CDKN2A somatic mutation data
CDKN2A (p16INK4; NP_000068.1) missense somatic mutation data were obtained from the Catalogue Of Somatic Mutations In Cancer (Forbes et al., 2009), The Cancer Genome Atlas (Muddabhaktuni and Koyyala, 2021), patients with cancer undergoing sequencing at The Johns Hopkins University School of Medicine (Baltimore, MD), and the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets Clinical Sequencing Cohort (Cheng et al., 2015). CDKN2A variant data were obtained from gnomAD v.4.1.0 and ClinVar (Landrum et al., 2014).
Plasmids
pHAGE-CDKN2A (Addgene, Watertown, MA; plasmid no. 116726) was created by Gordon Mills & Kenneth Scott (Ng et al., 2018). pLJM1 (Addgene; plasmid no. 91980) was created by Joshua Mendell (Golden et al., 2017). pLentiV_Blast (Addgene; plasmid no. 111887) was created by Christopher Vakoc (Tarumoto et al., 2020). psPAX2 (Addgene; plasmid no. 12260) was created by Didier Trono, and pCMV-VSV-G (Addgene; plasmid no. 8454) was created by Bob Weinberg (Stewart et al., 2003).
CDKN2A expression plasmid libraries
A codon-optimized CDKN2A cDNA encoding the p16INK4A amino acid sequence (NP_000068.1) was designed (Appendix 1-table 12), and pLJM1 containing codon-optimized CDKN2A (pLJM1-CDKN2A) was generated by Twist Bioscience (South San Francisco, CA). Using pLJM1-CDKN2A, 156 plasmid libraries were then synthesized such that each library contained all 20 possible amino acid variants (19 missense and 1 synonymous) at a given position, generating 500 ng of each plasmid library (Twist Bioscience). The proportion of each variant in each library is shown in Appendix 1-table 2. Variants with a representation of less than 1% in a plasmid library were individually generated using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552), and added to the library to a calculated proportion of 5%. Primers used for site-directed mutagenesis are given in Appendix 1-table 13. Each library was then amplified to generate at least 5 μg of plasmid DNA using the QIAGEN Plasmid Midi Kit (QIAGEN, Germantown, MD; catalog no. 12143).
Single variant CDKN2A expression plasmids
Individual pLJM1-CDKN2A expression constructs for the CDKN2A missense and synonymous variants p.L32L, p.L32P, p.G101G, p.G101W, p.V126D, and p.V126V were generated using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552). Primers used for site-directed mutagenesis are given in Appendix 1-table 13. Integration of each CDKN2A variant was confirmed by Sanger sequencing (Genewiz, South Plainfield, NJ) with the CMV Forward sequencing primer (CGCAAATGGGCGGTAGGCGTG). The manufacturer’s protocol was followed unless otherwise specified.
CellTag plasmid library
Twenty nonfunctional 9 base pair barcodes (“CellTags”) were subcloned into pLentiV_Blast using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552) (Biddy et al., 2018). Primers used to generate each CellTag plasmid are given in Appendix 1-table 13. Integration of each CellTag was confirmed by Sanger sequencing (Genewiz) (sequencing primer: AACTGGGAAAGTGATGTCGTG). The manufacturer’s protocol was followed unless otherwise specified. CellTag plasmids were then pooled to form a CellTag plasmid library with equal representation of each CellTag plasmid.
Lentivirus production
Lentivirus production was performed as previously described with the following modifications (Kimura et al., 2022). pLJM1 lentiviral expression vectors (plasmid libraries and single variant expression plasmids) and lentiviral packaging vectors (psPAX2 and pCMV-VSV-G) were transfected into 293T cells using Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific, Waltham, MA; catalog no. L3000008). Media was collected at 24 hours and 48 hours, pooled, and lentiviral particles concentrated using Lenti-X Concentrator (Clontech, Mountain View, CA; catalog no. 631231) using the manufacturer’s protocol.
Lentiviral transduction
PANC-1 cells were used for CDKN2A plasmid library and single variant CDKN2A expression plasmid transductions. PANC-1 cells previously transduced with pLJM1-CDKN2A (PANC-1CDKN2A) and selected with puromycin were used for CellTag library transductions. Briefly, 1 × 10⁵ cells were cultured in media supplemented with 10 μg/mL polybrene and transduced with × 10⁷ transducing units per mL of lentivirus particles. Cells were then centrifuged at 1,200 × g for 1 hour. After 48 hours of culture at 37 °C and 5% CO2, transduced cells were selected using 3 μg/mL puromycin (CDKN2A plasmid libraries and single variant CDKN2A expression plasmids) or 5 μg/mL blasticidin (CellTag plasmid library) for 7 days. The expected MOI was one. After selection, cells were trypsinized and 5 × 10⁵ cells were seeded into T150 flasks. DNA was collected from the remaining cells, and this sample was designated Day 9. T150 flasks were cultured until confluent and then DNA was collected. The time for cells to become confluent varied for each amino acid residue (Day 16–40, Appendix 1-table 5). DNA was extracted from PANC-1 cells using the PureLink Genomic DNA Mini Kit (Invitrogen, Carlsbad, CA; catalog no. K1820–01). The assay for the CellTag library was repeated in triplicate. We repeated our CDKN2A assay in duplicate for 28 residues. For the remaining 128 CDKN2A residues, the assay was completed once.
Generation of sequence libraries
Library preparation and sequencing were performed as previously described with the following modifications (Kinde et al., 2011). For the 1st stage PCR, 3 target-specific primers were designed to amplify CDKN2A amino acid positions 1 to 53, 54 to 110, and 111 to 156 (Appendix 1-table 13). Forward and reverse 1st stage primers contained 5’ M13F (GTAAAACGACGGCCAGC) and M13R (CAGGAAACAGCTATGAC) sequences, respectively, to enable amplification and ligation of Illumina adapter sequences in a 2nd stage PCR (Appendix 1-table 13). For the 1st stage PCR, each DNA sample was amplified in three reactions, each containing 66 ng of template DNA and 12.5 μL of Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs; catalog no. M0494S), with the following cycling conditions: 98 °C for 30 seconds; 18 cycles of 98 °C for 10 seconds, 72 °C for 30 seconds, 72 °C for 25 seconds; 72 °C for 2 minutes. 1st stage PCR products for each sample were then pooled and purified using the Agencourt AMPure XP system (Beckman Coulter, Inc., Brea, CA; catalog no. A63881), eluting into 50 μL of elution buffer. Purified PCR product was amplified in a 2nd stage PCR to add Illumina adapter sequences and indexes (Appendix 1-table 13). 2nd stage PCR amplification was performed with the KAPA HiFi HotStart PCR Kit (Kapa Biosystems, Wilmington, MA; catalog no. KK2501) in 25 μL reactions containing 5 μL of 5X KAPA HiFi Buffer, 0.75 μL of 10 mM KAPA dNTP Mix, 0.75 μL of 10 μM forward primer, 0.75 μL of 10 μM reverse primer, 0.25 μL of 1st stage PCR product, and 0.5 μL of 1 U/μL KAPA HiFi HotStart DNA Polymerase, with the following cycling conditions: 95 °C for 3 minutes; 25 cycles of 98 °C for 20 seconds, 62 °C for 15 seconds, 72 °C for 1 minute. 2nd stage PCR products were purified with the Agencourt AMPure XP system (Beckman Coulter, Inc.; catalog no. A63881) into 30 μL of elution buffer. Samples were quantified by Qubit using the dsDNA HS assay kit (Invitrogen; catalog no. Q33230).
Sequencing and analysis
Sequence libraries were pooled in equimolar amounts into groups of 16 samples and sequenced on the Illumina MiSeq System (Illumina, San Diego, CA) with the MiSeq Reagent Kit v2 (300 cycles) (Illumina; catalog no. MS-102–2002) to generate 150 base pair paired-end reads. Samples were demultiplexed and FASTQ sequence read files were generated with MiSeq Control Software 2.5.0.5 (Illumina). Paired sequence reads were then combined into a single contiguous sequence using Paired-End Read Merger (Zhang et al., 2014). Reads supporting each variant at a given amino acid position were counted using Perl.
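The counting step can be illustrated with a minimal sketch. The original analysis used Perl; the Python below is only an illustration of the idea. The function name `count_variants`, the `codon_offset` parameter, and the assumption that every merged read starts at the same amplicon coordinate are hypothetical and are not taken from the study.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def count_variants(reads, codon_offset):
    """Count the amino acid encoded at one codon across merged reads.

    codon_offset is the 0-based index of the codon's first base in each read
    (hypothetical: assumes all reads begin at the same amplicon coordinate).
    """
    counts = Counter()
    for read in reads:
        codon = read[codon_offset:codon_offset + 3].upper()
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is not None:  # skips truncated reads and codons with N
            counts[amino_acid] += 1
    return counts

# Toy merged reads; the second codon (offset 3) is the position of interest.
reads = ["ATGCTGGAA", "ATGCCGGAA", "ATGCTGGAA", "ATGNNGGAA", "ATGC"]
counts = count_variants(reads, codon_offset=3)
```

In a real pipeline the synonymous variant would need to be distinguished from the reference codon at the nucleotide level, so counting by codon rather than by amino acid may be preferable.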
Functional characterization of CDKN2A variants using a gamma generalized linear model
We determined if a variant has a fitness advantage by assessing the significance of the observed ratio $r_{v,cf}$ at confluence between the number of cells with a missense variant $v$ and the number of cells with a synonymous variant at a given amino acid position. Using the missense variant as a benchmark variant, we assumed that the distribution of $r_{v,cf}$ can be explained by two key covariates: $r_{v,init}$, which represents the missense variant-to-synonymous variant ratio at Day 9, and $p_{v,init}$, the proportion of the missense variant cells among all variants, including the synonymous variant, at the studied position. More specifically, given the variables $r_{v,init}$ and $p_{v,init}$, the ratio at confluence follows a Gamma distribution:
$$r_{v,cf} \mid r_{v,init},\, p_{v,init} \sim \mathrm{Gamma}(\alpha, \mu_v)$$
where the mean $\mu_v$ of the Gamma distribution is such that:
$$\log(\mu_v) = a \log(r_{v,init}) + b \log(p_{v,init})$$
Here, the parameters of the null model to estimate are $\alpha$, $a$, and $b$, where $\alpha$ is the shape parameter of the Gamma distribution and is assumed to be the same for all variants. This model is a Gamma Generalized Linear Model (GLM) over the response variable $r_{v,cf}$ with a log-link function and covariates $\log(r_{v,init})$ and $\log(p_{v,init})$. Estimating the parameters provides a null distribution of $r_{v,cf}$, from which a p-value is generated for every observed $r_{v,cf}$ for any variant at a given position.
To estimate the parameters $\alpha$, $a$, and $b$, we utilized three control experiments where the CellTag plasmid library was transduced into PANC-1CDKN2Aco cells so that each CellTag represented a neutral variant. For a single experiment, every variant can be considered as wild-type, and we test the other 19 variants against it, knowing that they are neutral and therefore follow the null distribution. This provides us with 19 × 20 triplets ($r_{v,cf}$, $r_{v,init}$, $p_{v,init}$) for every experiment, yielding 1140 data points when considering all three experiments together. To estimate the parameters using these 1140 data points, we fit the corresponding GLM using the sklearn.linear_model module.
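As an illustration of this kind of fit, the sketch below estimates the two regression coefficients of a Gamma GLM with a log link and no intercept using iteratively reweighted least squares, written with the Python standard library only. It is not the sklearn.linear_model call used in the study. The synthetic data, the true parameter values, the function names, and the Pearson (moment) estimate of the shape parameter are all assumptions made for the example; the source does not state how the shape parameter was estimated.

```python
import math
import random

def fit_gamma_glm(y, x1, x2, n_iter=50):
    """Fit log(mu) = a*x1 + b*x2 (Gamma GLM, log link, no intercept) by IRLS."""
    a = b = 0.0
    for _ in range(n_iter):
        s11 = s12 = s22 = t1 = t2 = 0.0
        for yi, u, v in zip(y, x1, x2):
            eta = a * u + b * v
            mu = math.exp(eta)
            # Working response; IRLS weights equal 1 for Gamma with a log link.
            z = eta + (yi - mu) / mu
            s11 += u * u
            s12 += u * v
            s22 += v * v
            t1 += u * z
            t2 += v * z
        # Solve the 2x2 normal equations for the updated coefficients.
        det = s11 * s22 - s12 * s12
        a, b = (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det
    return a, b

def estimate_shape(y, x1, x2, a, b):
    """Pearson estimate of the Gamma shape (assumed approach): 1 / dispersion."""
    total = 0.0
    for yi, u, v in zip(y, x1, x2):
        mu = math.exp(a * u + b * v)
        total += ((yi - mu) / mu) ** 2
    dispersion = total / (len(y) - 2)
    return 1.0 / dispersion

# Synthetic "neutral" data: 1140 triplets with made-up true parameters.
random.seed(0)
TRUE_A, TRUE_B, TRUE_SHAPE = 1.0, 0.1, 20.0
log_r_init = [random.gauss(0.0, 0.5) for _ in range(1140)]
log_p_init = [math.log(random.uniform(0.02, 0.10)) for _ in range(1140)]
r_cf = []
for u, v in zip(log_r_init, log_p_init):
    mu = math.exp(TRUE_A * u + TRUE_B * v)
    # gammavariate(shape, scale) has mean shape * scale = mu.
    r_cf.append(random.gammavariate(TRUE_SHAPE, mu / TRUE_SHAPE))

a_hat, b_hat = fit_gamma_glm(r_cf, log_r_init, log_p_init)
alpha_hat = estimate_shape(r_cf, log_r_init, log_p_init, a_hat, b_hat)
```

With real CellTag data, the three lists would hold the observed ratio at confluence and the log-transformed Day 9 ratio and proportion for each of the 1140 neutral triplets.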
After the estimation of parameters $\alpha$, $a$, and $b$, every observation for a tested variant $v$ at a given position, that is, every triplet ($r_{v,cf}$, $r_{v,init}$, $p_{v,init}$), yields a p-value, defined as the probability of observing a ratio at confluence that is at least $r_{v,cf}$ given $p_{v,init}$ and $r_{v,init}$ under the null Gamma model. As some variants were tested in repeated experiments, we combined their associated p-values into a single p-value using Fisher’s method. Finally, to determine if a variant presents a fitness advantage, we applied the Benjamini-Hochberg procedure to the p-values of all tested variants, fixing the False Discovery Rate at a level of 0.05.
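The two post-processing steps, combining replicate p-values with Fisher’s method and controlling the False Discovery Rate with the Benjamini-Hochberg procedure, can be sketched in plain Python as follows. The function names and the example p-values are illustrative only, and the code assumes every p-value is strictly greater than zero.

```python
import math

def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k d.o.f. under the null."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # X / 2
    # Closed-form chi-square survival function for an even number of d.o.f. (2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total

def benjamini_hochberg(pvalues, fdr=0.05):
    """Return booleans marking which hypotheses are rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank passing its threshold
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# A variant assayed twice, with p = 0.05 in each replicate (made-up values).
combined = fisher_combine([0.05, 0.05])

# Made-up per-variant p-values after combination.
calls = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.05)
```

A single p-value passes through `fisher_combine` unchanged, so variants assayed once and variants assayed in duplicate can be handled by the same code path.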
Functional characterization of CDKN2A variants using log2 normalized fold change
Fold change for each variant was calculated as the ratio of the proportion of total reads representing a variant at confluency (Day 16–40, Appendix 1-table 5) to the proportion of total reads representing that variant on Day 9 after transduction. Fold change was then normalized to the synonymous variant at each residue, and log2 normalized fold change values were calculated (Appendix 1-table 4, Appendix 1-table 6). Variants with log2 normalized fold change values greater than or equal to the minimum value of benchmark pathogenic variants were characterized as functionally deleterious, while variants with values smaller than or equal to the maximum value of benchmark benign variants were characterized as functionally neutral (Appendix 1-table 6). Log2 normalized fold change values between these defined thresholds were classified as indeterminate. Mean values were used for replicated variants.
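The calculation described above can be sketched as follows. The read counts and the two thresholds are invented for the example; in the study the thresholds come from the benchmark pathogenic and benign variants (Appendix 1-table 6), and the function names are hypothetical.

```python
import math

def log2_normalized_fold_change(day9, confluent, variant, synonymous):
    """day9 and confluent map variant name -> read count at one residue."""
    def proportion(counts, name):
        return counts[name] / sum(counts.values())

    fc_variant = proportion(confluent, variant) / proportion(day9, variant)
    fc_synonymous = proportion(confluent, synonymous) / proportion(day9, synonymous)
    return math.log2(fc_variant / fc_synonymous)

def classify(score, min_pathogenic, max_benign):
    """Apply the two benchmark-derived thresholds to a log2 normalized fold change."""
    if score >= min_pathogenic:
        return "functionally deleterious"
    if score <= max_benign:
        return "functionally neutral"
    return "indeterminate"

# Made-up read counts for three variants at one residue.
day9 = {"L32L": 100, "L32P": 100, "L32A": 100}
confluent = {"L32L": 50, "L32P": 400, "L32A": 50}

score_p = log2_normalized_fold_change(day9, confluent, "L32P", "L32L")
score_a = log2_normalized_fold_change(day9, confluent, "L32A", "L32L")

# Hypothetical thresholds, not the values used in the study.
MIN_PATHOGENIC, MAX_BENIGN = 1.0, 0.3
call_p = classify(score_p, MIN_PATHOGENIC, MAX_BENIGN)
call_a = classify(score_a, MIN_PATHOGENIC, MAX_BENIGN)
```

In this toy example the L32P variant expands eightfold relative to the synonymous variant, giving a log2 normalized fold change of about 3, while L32A tracks the synonymous variant and scores 0.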
Data visualization
A heat map of individual variant p-values by amino acid position was generated using R with the heatmaply package (Galili et al., 2018).
Cell proliferation assay
Cell proliferation assays were performed as previously described with the following modifications (Kimura et al., 2022). 1 × 10⁵ cells were seeded into in vitro culture (Day 0). Cells were counted on Day 14 using a TC20 Automated Cell Counter (Bio-Rad Laboratories, Hercules, CA; catalog no. 1450102). Relative cell proliferation value was calculated as cell number normalized to empty vector control. Assays were repeated in triplicate. Mean cell proliferation value and standard deviation (s.d.) were calculated.
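A small sketch of the normalization is given below. It assumes that each replicate is normalized to the empty vector count from the same replicate, which the text does not specify; the cell counts and the function name are made up for illustration.

```python
import statistics

def relative_proliferation(variant_counts, empty_vector_counts):
    """Normalize each replicate to its paired empty vector control (assumed pairing)."""
    ratios = [v / e for v, e in zip(variant_counts, empty_vector_counts)]
    return statistics.mean(ratios), statistics.stdev(ratios)

# Made-up Day 14 cell counts from three replicates.
variant = [2.0e5, 2.2e5, 1.8e5]
empty_vector = [4.0e5, 4.0e5, 4.0e5]

mean_value, sd_value = relative_proliferation(variant, empty_vector)
```

Here `statistics.stdev` returns the sample standard deviation, which is the usual choice for triplicate measurements.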
Variant effect predictions
Publicly available algorithms were used to predict the consequence of CDKN2A missense variants. Prediction algorithms used included: CADD (Kircher et al., 2014), PolyPhen-2 (Adzhubei et al., 2010), SIFT (Kumar et al., 2009), VEST (Carter et al., 2013), AlphaMissense (Cheng et al., 2023), ESM1b (Brandes et al., 2023), and PrimateAI-3D (Gao et al., 2023) (Appendix 1-table 7). PolyPhen-2, SIFT, VEST, AlphaMissense, and ESM1b predictions were available for all missense variants. CADD scores were available for 910 missense variants and, where multiple CADD scores were possible, mean values were used. PrimateAI-3D prediction scores were available for 904 assayed missense variants.
Statistical analyses
Statistical analyses were performed using JMP v.11 (SAS, Cary, NC) and the Python statsmodels package (version 0.14.0). Student’s t-test was used to compare mean cell proliferation values. A chi-square test was used to compare the proportion of functionally deleterious variants for variants present in < 2% and ≥ 2% of the cell pool at Day 9. A Fisher’s exact test was used to compare the prevalence of functionally deleterious CDKN2A variants in colorectal cancer cases from COSMIC with and without somatic mutations in mismatch repair genes. Z-tests with multiple test correction performed with the Bonferroni method were used in the following comparisons: 1) proportion of functionally deleterious variants present in < 2% of the cell pool and ≥ 2% of the cell pool at Day 9 binned in 1% intervals, 2) proportion of variants in each domain predicted to have deleterious or pathogenic effect by the majority of algorithms, 3) proportion of functionally deleterious variants in each domain, and 4) proportion of functionally deleterious missense variants and somatic mutations.
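For readers unfamiliar with the proportion comparisons listed above, a two-proportion z-test with Bonferroni correction can be sketched in plain Python as below. This is a generic pooled-variance, two-sided version written for illustration, not the statsmodels call used in the study, and the counts are invented.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions using the pooled variance."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value

def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

# Made-up counts: 30 of 100 deleterious in one group vs 10 of 100 in another.
z_stat, p_raw = two_proportion_ztest(30, 100, 10, 100)

# Correcting two hypothetical comparisons for multiple testing.
adjusted = bonferroni([0.01, 0.5])
```

The adjusted p-values are then compared against the chosen significance level in the usual way.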
Acknowledgments
Funding
National Institutes of Health grant P50CA62924 (NJR)
Susan Wojcicki and Dennis Troper (NR)
The Sol Goldman Pancreatic Cancer Research Center (NJR)
The Rolfe Pancreatic Cancer Foundation (NJR)
The Japanese Society of Gastroenterology Support for Young Gastroenterologists Studying in the United States (HK)
The Japan Society for the Promotion of Science Overseas Research Fellowships (HK)
Footnotes
Competing interests
Authors declare that they have no competing interests.
Data and materials availability
All data are available in the main text or the supplementary materials.
References
- Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. 2010. A method and server for predicting damaging missense mutations. Nat Methods 7:248–249. doi: 10.1038/nmeth0410-248
- Biddy BA, Kong W, Kamimoto K, Guo C, Waye SE, Sun T, Morris SA. 2018. Single-cell mapping of lineage and identity in direct reprogramming. Nature 564:219–224. doi: 10.1038/s41586-018-0744-4
- Brandes N, Goldman G, Wang CH, Ye CJ, Ntranos V. 2023. Genome-wide prediction of disease variant effects with a deep protein language model. Nat Genet 55:1512–1522. doi: 10.1038/s41588-023-01465-0
- Brenan L, Andreev A, Cohen O, Pantel S, Kamburov A, Cacchiarelli D, Persky NS, Zhu C, Bagul M, Goetz EM, Burgin AB, Garraway LA, Getz G, Mikkelsen TS, Piccioni F, Root DE, Johannessen CM. 2016. Phenotypic Characterization of a Comprehensive Set of MAPK1/ERK2 Missense Mutants. Cell Rep 17:1171–1183. doi: 10.1016/j.celrep.2016.09.061
- Byeon IJL, Li J, Ericson K, Selby TL, Tevelev A, Kim HJ, O’Maille P, Tsai MD. 1998. Tumor suppressor p16INK4A: Determination of solution structure and analyses of its interaction with cyclin-dependent kinase 4. Mol Cell 1:421–431. doi: 10.1016/S1097-2765(00)80042-8
- Caldas C, Hahn SA, Luis T, Marks C, Schutte M, Seymour AB, Weinstein CL, Hruban RH, Yeo CJ, Kern SE. 1994. Frequent somatic mutations and homozygous deletions of the p16 (MTS1) gene in pancreatic adenocarcinoma. Nat Genet 8:27–32.
- Carter H, Douville C, Stenson PD, Cooper DN, Karchin R. 2013. Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics 14 Suppl 3:S3. doi: 10.1186/1471-2164-14-s3-s3
- Chaffee KG, Oberg AL, McWilliams RR, Majithia N, Allen BA, Kidd J, Singh N, Hartman A-R, Wenstrup RJ, Petersen GM. 2018. Prevalence of Germline Mutations in Cancer Genes. Genet Med 20:119–127. doi: 10.1038/gim.2017.85
- Chang MT, Asthana S, Gao SP, Lee BH, Chapman JS, Kandoth C, Gao JJ, Socci ND, Solit DB, Olshen AB, Schultz N, Taylor BS. 2016. Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat Biotechnol 34:155–163. doi: 10.1038/nbt.3391
- Cheng DT, Mitchell TN, Zehir A, Shah RH, Benayed R, Syed A, Chandramohan R, Liu ZY, Won HH, Scott SN, Rose Brannon A, O’Reilly C, Sadowska J, Casanova J, Yannes A, Hechtman JF, Yao J, Song W, Ross DS, Oultache A, Dogan S, Borsu L, Hameed M, Nafa K, Arcila ME, Ladanyi M, Berger MF. 2015. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): A hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J Mol Diagnostics 17:251–264. doi: 10.1016/j.jmoldx.2014.12.006
- Cheng J, Novati G, Pan J, Bycroft C, Žemgulytė A, Applebaum T, Pritzel A, Wong LH, Zielinski M, Sargeant T, Schneider RG, Senior AW, Jumper J, Hassabis D, Kohli P, Avsec Ž. 2023. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381:eadg7492. doi: 10.1126/science.adg7492
- Cubuk C, Garrett A, Choi S, King L, Loveday C, Torr B, Burghel GJ, Durkie M, Callaway A, Robinson R, Drummond J, Berry I, Wallace A, Eccles D, Tischkowitz M, Whiffin N, Ware JS, Hanson H, Turnbull C, CanVIG-UK. 2021. Clinical likelihood ratios and balanced accuracy for 44 in silico tools against multiple large-scale functional assays of cancer susceptibility genes. Genet Med 23:2096–2104. doi: 10.1038/s41436-021-01265-z
- Fåhraeus R, Paramio JM, Ball KL, Lain S, Lane DP. 1996. Inhibition of pRb phosphorylation and cell-cycle progression by a 20-residue peptide derived from P16CDKN2/INK4A. Curr Biol. doi: 10.1016/s0960-9822(02)00425-6
- Forbes SA, Tang G, Bindal N, Bamford S, Dawson E, Cole C, Kok CY, Jia M, Ewing R, Menzies A, Teague JW, Stratton MR, Futreal PA. 2009. COSMIC (the Catalogue of Somatic Mutations In Cancer): A resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38:652–657. doi: 10.1093/nar/gkp995
- Galili T, O’Callaghan A, Sidi J, Sievert C. 2018. Heatmaply: An R package for creating interactive cluster heatmaps for online publishing. Bioinformatics 34:1600–1602. doi: 10.1093/bioinformatics/btx657
- Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich ASD, Fiziev PP, Kuderna LFK, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O’Donnell-Luria A, Rehm HL, Xu J, Rogers J, Marques-Bonet T, Farh KKH. 2023. The landscape of tolerated genetic variation in humans and primates. Science 380. doi: 10.1126/science.abn8197
- Giacomelli AO, Yang X, Lintner RE, McFarland JM, Duby M, Kim J, Howard TP, Takeda DY, Ly SH, Kim E, Gannon HS, Hurhula B, Sharpe T, Goodale A, Fritchman B, Steelman S, Vazquez F, Tsherniak A, Aguirre AJ, Doench JG, Piccioni F, Roberts CWM, Meyerson M, Getz G, Johannessen CM, Root DE, Hahn WC. 2018. Mutational processes shape the landscape of TP53 mutations in human cancer. Nat Genet 50:1381–1387. doi: 10.1038/s41588-018-0204-y
- Goggins M, Overbeek KA, Brand R, Syngal S, Del Chiaro M, Bartsch DK, Bassi C, Carrato A, Farrell J, Fishman EK, Fockens P, Gress TM, Van Hooft JE, Hruban RH, Kastrinos F, Klein A, Lennon AM, Lucas A, Park W, Rustgi A, Simeone D, Stoffel E, Vasen HFA, Cahen DL, Canto MI, Bruno M. 2020. Management of patients with increased risk for familial pancreatic cancer: updated recommendations from the International Cancer of the Pancreas Screening (CAPS) Consortium. Gut 69:7–17. doi: 10.1136/gutjnl-2019-319352
- Golan T, Hammel P, Reni M, Van Cutsem E, Macarulla T, Hall MJ, Park J-O, Hochhauser D, Arnold D, Oh D-Y, Reinacher-Schick A, Tortora G, Algül H, O’Reilly EM, McGuinness D, Cui KY, Schlienger K, Locker GY, Kindler HL. 2019. Maintenance Olaparib for Germline BRCA-Mutated Metastatic Pancreatic Cancer. N Engl J Med 381:317–327. doi: 10.1056/nejmoa1903387
- Golden RJ, Chen B, Li T, Braun J, Manjunath H, Chen X, Wu J, Schmid V, Chang TC, Kopp F, Ramirez-Martinez A, Tagliabracci VS, Chen ZJ, Xie Y, Mendell JT. 2017. An Argonaute phosphorylation cycle promotes microRNA-mediated silencing. Nature 542:197–202. doi: 10.1038/nature21025
- Goldstein AM. 2004. Familial melanoma, pancreatic cancer and germline CDKN2A mutations. Hum Mutat 23:630. doi: 10.1002/humu.9247
- Horn IP, Marks DL, Koenig AN, Hogenson TL, Almada LL, Goldstein LE, Romecin Duran PA, Vera R, Vrabel AM, Cui G, Rabe KG, Bamlet WR, Mer G, Sicotte H, Zhang C, Li H, Petersen GM, Fernandez-Zapico ME. 2021. A rare germline CDKN2A variant (47T>G; p16-L16R) predisposes carriers to pancreatic cancer by reducing cell cycle inhibition. J Biol Chem 296:1–11. doi: 10.1016/J.JBC.2021.100634
- Hu C, Hart SN, Polley EC, Gnanaolivu R, Shimelis H, Lee KY, Lilyquist J, Na J, Moore R, Antwi SO, Bamlet WR, Chaffee KG, DiCarlo J, Wu Z, Samara R, Kasi PM, McWilliams RR, Petersen GM, Couch FJ. 2018. Association between inherited germline mutations in cancer predisposition genes and risk of pancreatic cancer. JAMA 319:2401–2409. doi: 10.1001/jama.2018.6228
- Jaffe A, Wojcik G, Chu A, Golozar A, Maroo A, Duggal P, Klein AP. 2011. Identification of functional genetic variation in exome sequence analysis. BMC Proc 5:9–13. doi: 10.1186/1753-6561-5-S9-S13
- Kimura H, Klein AP, Hruban RH, Roberts NJ. 2021. The Role of Inherited Pathogenic CDKN2A Variants in Susceptibility to Pancreatic Cancer. Pancreas 50:1123–1130. doi: 10.1097/MPA.0000000000001888
- Kimura H, Paranal RM, Nanda N, Wood LD, Eshleman JR, Hruban RH, Goggins MG, Klein AP, Brand R, Cote ML, Du M, Gallinger S, Goggins M, Kurtz RC, Petersen GM, Rustgi AK, Schwartz AG, Stoffel EM, Syngal S, Zogopoulos G, Roberts NJ. 2022. Functional CDKN2A assay identifies frequent deleterious alleles misclassified as variants of uncertain significance. eLife 11:1–16. doi: 10.7554/eLife.71137
- Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. 2011. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 108:9530–9535. doi: 10.1073/pnas.1105422108
- Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. 2014. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46:310–315. doi: 10.1038/ng.2892
- Kumar P, Henikoff S, Ng PC. 2009. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 4:1073–1082. doi: 10.1038/nprot.2009.86
- Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, Maglott DR. 2014. ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res 42:980–985. doi: 10.1093/nar/gkt1113
- McWilliams RR, Wieben ED, Chaffee KG, Antwi SO, Raskin L, Olopade OI, Li D, Highsmith WE, Colon-Otero G, Khanna LG, Permuth JB, Olson JE, Frucht H, Genkinger J, Zheng W, Blot WJ, Wu L, Almada LL, Fernandez-Zapico ME, Sicotte H, Pedersen KS, Petersen GM. 2018. CDKN2A germline rare coding variants and risk of pancreatic cancer in minority populations. Cancer Epidemiol Biomarkers Prev 27:1364–1370. doi: 10.1158/1055-9965.EPI-17-1065
- Muddabhaktuni BMC, Koyyala VPB. 2021. The Cancer Genome Atlas. Indian J Med Paediatr Oncol 42:353–355. doi: 10.1055/s-0041-1735440
- Ng PKS, Li J, Jeong KJ, Shao S, Chen H, Tsang YH, Sengupta S, Wang Z, Bhavana VH, Tran R, Soewito S, Minussi DC, Moreno D, Kong K, Dogruluk T, Lu H, Gao J, Tokheim C, Zhou DC, Johnson AM, Zeng J, Ip CKM, Ju Z, Wester M, Yu S, Li Y, Vellano CP, Schultz N, Karchin R, Ding L, Lu Y, Cheung LWT, Chen K, Shaw KR, Meric-Bernstam F, Scott KL, Yi S, Sahni N, Liang H, Mills GB. 2018. Systematic Functional Annotation of Somatic Mutations in Cancer. Cancer Cell 33:450–462.e10. doi: 10.1016/j.ccell.2018.01.021
- Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E, Voelkerding K, Rehm HL. 2015. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17:405–424. doi: 10.1038/gim.2015.30
- Roberts NJ, Norris AL, Petersen GM, Bondy ML, Brand R, Gallinger S, Kurtz RC, Olson SH, Rustgi AK, Schwartz AG, Stoffel E, Syngal S, Zogopoulos G, Ali SZ, Axilbund J, Chaffee KG, Chen YC, Cote ML, Childs EJ, Douville C, Goes FS, Herman JM, Iacobuzio-Donahue C, Kramer M, Makohon-Moore A, McCombie RW, Wyatt Mcmahon K, Niknafs N, Parla J, Pirooznia M, Potash JB, Rhim AD, Smith AL, Wang Y, Wolfgang CL, Wood LD, Zandi PP, Goggins M, Karchin R, Eshleman JR, Papadopoulos N, Kinzler KW, Vogelstein B, Hruban RH, Klein AP. 2016. Whole genome sequencing defines the genetic heterogeneity of familial pancreatic cancer. Cancer Discov 6:166–175. doi: 10.1158/2159-8290.CD-15-0402
- Ruas M, Peters G. 1998. The p16(INK4a)/CDKN2A tumor suppressor and its relatives. Biochim Biophys Acta - Rev Cancer 1378. doi: 10.1016/S0304-419X(98)00017-1
- Shindo K, Yu J, Suenaga M, Fesharakizadeh S, Cho C, Macgregor-Das A, Siddiqui A, Witmer PD, Tamura K, Song TJ, Almario JAN, Brant A, Borges M, Ford M, Barkley T, He J, Weiss MJ, Wolfgang CL, Roberts NJ, Hruban RH, Klein AP, Goggins M. 2017. Deleterious germline mutations in patients with apparently sporadic pancreatic adenocarcinoma. J Clin Oncol 35:3382–3390. doi: 10.1200/JCO.2017.72.3502
- Stewart SA, Dykxhoorn DM, Palliser D, Mizuno H, Yu EY, An DS, Sabatini DM, Chen ISY, Hahn WC, Sharp PA, Weinberg RA, Novina CD. 2003. Lentivirus-delivered stable gene silencing by RNAi in primary cells. RNA 9:493–501. doi: 10.1261/rna.2192803
- Stoffel EM, Mckernin SE, Brand R, Canto M, Goggins M, Moravek C. 2019. Evaluating Susceptibility to Pancreatic Cancer: ASCO Provisional Clinical Opinion. J Clin Oncol 37:153–164. doi: 10.1200/JCO.18.01489
- Sun P, Nallar SC, Raha A, Kalakonda S, Velalar CN, Reddy SP, Kalvakolanu DV. 2010. GRIM-19 and p16INK4a synergistically regulate cell cycle progression and E2F1-responsive gene expression. J Biol Chem 285:27545–27552. doi: 10.1074/jbc.M110.105767
- Tarumoto Y, Lin S, Wang J, Milazzo JP, Xu Y, Lu B, Yang Z, Wei Y, Polyanskaya S, Wunderlich M, Gray NS, Stegmaier K, Vakoc CR. 2020. Salt-inducible kinase inhibition suppresses acute myeloid leukemia progression in vivo. Blood 135:56–70. doi: 10.1182/blood.2019001576
- Tung NM, Robson ME, Ventz S, Santa-Maria CA, Nanda R, Marcom PK, Shah PD, Ballinger TJ, Yang ES, Vinayak S, Melisko M, Brufsky A, DeMeo M, Jenkins C, Domchek S, D’Andrea A, Lin NU, Hughes ME, Carey LA, Wagle N, Wulf GM, Krop IE, Wolff AC, Winer EP, Garber JE. 2020. TBCRC 048: Phase II Study of Olaparib for Metastatic Breast Cancer and Mutations in Homologous Recombination-Related Genes. J Clin Oncol 38:4274–4282. doi: 10.1200/JCO.20.02151
- Tutt ANJ, Garber JE, Kaufman B, Viale G, Fumagalli D, Rastogi P, Gelber RD, de Azambuja E, Fielding A, Balmaña J, Domchek SM, Gelmon KA, Hollingsworth SJ, Korde LA, Linderholm B, Bandos H, Senkus E, Suga JM, Shao Z, Pippas AW, Nowecki Z, Huzarski T, Ganz PA, Lucas PC, Baker N, Loibl S, McConnell R, Piccart M, Schmutzler R, Steger GG, Costantino JP, Arahmani A, Wolmark N, McFadden E, Karantza V, Lakhani SR, Yothers G, Campbell C, Geyer CE. 2021. Adjuvant Olaparib for Patients with BRCA1- or BRCA2-Mutated Breast Cancer. N Engl J Med 384:2394–2405. doi: 10.1056/nejmoa2105215
- Wilcox EH, Sarmady M, Wulf B, Wright MW, Rehm HL, Biesecker LG, Abou Tayoun AN. 2022. Evaluating the impact of in silico predictors on clinical variant classification. Genet Med 24:924–930. doi: 10.1016/j.gim.2021.11.018
- Zhang J, Kobert K, Flouri T, Stamatakis A. 2014. PEAR: A fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30:614–620. doi: 10.1093/bioinformatics/btt593
- Zhen DB, Rabe KG, Gallinger S, Syngal S, Schwartz AG, Goggins MG, Hruban RH, Cote ML, Mcwilliams RR, Roberts NJ, Cannon-Albright LA, Li D, Moyes K, Wenstrup RJ, Hartman A-R, Seminara D, Klein AP, Petersen GM. 2015. BRCA1, BRCA2, PALB2, and CDKN2A Mutations in Familial Pancreatic Cancer (FPC): A PACGENE Study. Genet Med 17:569–577. doi: 10.1038/gim.2014.153