Cell Reports Methods. 2023 Nov 2;3(11):100628. doi: 10.1016/j.crmeth.2023.100628

Sequencing-based functional assays for classification of BRCA2 variants in mouse ESCs

Kajal Biswas 1,5, Alexander Y Mitrophanov 2, Sounak Sahu 1, Teresa Sullivan 1, Eileen Southon 1,4, Darryl Nousome 3, Susan Reid 1, Sakshi Narula 1, Julia Smolen 1, Trisha Sengupta 1, Maximilian Riedel-Topper 1, Medha Kapoor 1, Anav Babbar 1, Stacey Stauffer 1, Linda Cleveland 1, Mayank Tandon 3, Tyler Malys 2, Shyam K Sharan 1,6
PMCID: PMC10694496  PMID: 37922907

Summary

Sequencing of genes such as BRCA1 and BRCA2 is recommended for individuals with a personal or family history of early-onset and/or bilateral breast and/or ovarian cancer or a history of male breast cancer. Such sequencing efforts have resulted in the identification of more than 17,000 BRCA2 variants. The functional significance of most variants remains unknown; consequently, they are called variants of uncertain clinical significance (VUSs). We have previously developed mouse embryonic stem cell (mESC)-based assays for functional classification of BRCA2 variants. We have now developed a next-generation sequencing (NGS)-based approach for functional evaluation of BRCA2 variants using pools of mESCs expressing 10–25 BRCA2 variants from a given exon. We use this approach for functional evaluation of 223 variants listed in ClinVar. Our functional classification of BRCA2 variants is concordant with the classification reported in ClinVar or with those reported by other orthogonal assays.

Keywords: BRCA2, breast cancer, variants of uncertain significance, VUS, bacterial artificial chromosome, BAC, recombineering, mouse ES cells, DNA repair, cell viability, functional assay

Highlights

  • Functional classification of 223 BRCA2 variants using a mouse ESC-based assay

  • NGS-based multiplexed approach for functional evaluation of variants

  • Calculation of functional scores based on cell viability and drug sensitivity

  • Use of a statistical model to determine probability of impact on function of variants

Motivation

We have previously used mouse ESCs for functional classification of BRCA2 variants using XTT-based cell proliferation assays. While the results are reliable, the approach is time consuming because each variant is analyzed individually. This has limited the number of variants that can be examined at a time. To overcome this, we developed an NGS-based medium-throughput approach for functional evaluation of multiple BRCA2 variants at a time.


Sequencing-based genetic testing has identified thousands of variants of uncertain significance (VUSs) in the critical breast cancer susceptibility gene BRCA2. To advance the throughput of VUS characterization, Biswas et al. present a multiplexed NGS-based functional assay in mouse embryonic stem cells and use it to classify 223 BRCA2 variants.

Introduction

BRCA2 is frequently mutated in the general population (1 mutation in 1,000 unaffected individuals).1 Germline pathogenic variants of BRCA2 are associated with increased risk of breast, ovarian, prostate, and pancreatic cancer.2 Clinical management of individuals with a family history of early-onset breast or ovarian cancer or a history of male breast cancer includes sequence-based genetic testing of BRCA1 and BRCA2. Identification of individuals carrying a pathogenic BRCA variant can lead to better cancer surveillance, prevention, and therapeutic options.3 Sequencing-based genetic testing has resulted in the identification of more than 17,000 BRCA2 variants that are listed in the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/). More than 3,000 of these variants are considered to be variants of uncertain significance (VUSs) because their association with the disease is unknown. Individuals carrying a VUS must cope with the uncertainty of clinical management. The majority (89.5%) of the VUSs reported in the ClinVar database are missense variants, and the functional consequences of these single-amino-acid changes remain unknown. Many of these VUSs are relatively rare in the general population and in cancer patients. This makes it difficult to reliably classify them using population-based studies, including multifactorial models of pathology, co-occurrence with cancer, and co-segregation data.

The American College of Medical Genetics and Genomics (ACMG) has developed guidelines to classify variants based on criteria of evidence from population data, in silico predictions, functional data, and segregation analyses. The ACMG has recommended a five-tier classification system to classify variants: benign, likely benign, uncertain significance, likely pathogenic, and pathogenic.4 According to the ACMG guidelines, a well-established functional assay that determines the impact of a mutation on function is regarded as strong evidence for classifying variants.5 The Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) expert panel and ACMG determined that variants should be classified as functionally normal or functionally abnormal based on their impact on function in a functional assay. The functionally abnormal variants can be further classified as complete loss of function, partial loss of function/intermediate, gain of function, or dominant negative.6

Several assays, including our mouse embryonic stem cell (mESC)-based assay, have been developed in recent years to evaluate the functional significance of BRCA2 variants.7,8,9,10,11,12,13,14,15,16,17,18 Many of these assays have been used to analyze hundreds of BRCA2 variants by improving throughput, often classifying the variants with a computational model that assesses their functional impact.10,12,19 The mESC-based assay is a comprehensive assay to evaluate BRCA2 variants.7,13 In this assay, individual variants are generated by recombineering in a bacterial artificial chromosome (BAC) clone containing full-length human BRCA2. The individual BAC clone encoding a single BRCA2 variant is then expressed in Brca2cko/ko mESCs and examined for its ability to complement the loss of Brca2 in mESCs by assessing cell viability and sensitivity to different DNA-damaging agents7,8,13,14,16,17,18 (Figure 1A). We have recently used a statistical algorithm to predict the probabilities of impact on function (PIFs) using the data from the mESC-based assay.9 However, generating experimental data using an mESC-based assay is time consuming and laborious. Consequently, very few VUSs can be analyzed at any given time.

Figure 1. Schematic representation of the multiplexed mESC-based assay

(A) Schematic representation of the mESC-based functional assay. The PL2F7 mESCs (carrying a conditional Brca2 allele with two loxP sites and a knockout allele of Brca2) are complemented for the loss of the conditional Brca2 allele after Cre expression by the BAC DNA encoding the human BRCA2 gene containing one variant. The recombinants were selected in hypoxanthine, aminopterin, thymidine (HAT)-containing medium. Viable HATr cells show no impact of BRCA2 variants on function. Viable HATr cells were further tested to distinguish the variants that have moderate loss of function by their sensitivity to different DNA-damaging agents. A star in the BAC construct represents the variant. Solid arrows denote loxP sites, and the two halves of the hypoxanthine-guanine phosphoribosyltransferase (HPRT) minigene are marked in solid boxes as HP and RT on the conditional allele of Brca2.

(B) Selection of two independent clones of mESCs that express BRCA2 variants. After introduction of BAC into PL2F7 cells, transfected cells were selected and analyzed for protein expression. Two clones were selected from each variant to eliminate positional bias of BAC integration.

(C) Multiplexing of the downstream process by mixing mESC clones expressing different variants. After mixing, two independent experiments were performed from each mix, and at the indicated steps, samples were collected for DNA isolation and deep sequencing.

(D) Schematic representation of loss of BRCA2 variants after selection with HAT, cisplatin, or olaparib. Genomic DNA was isolated, PCR amplified, and deep sequenced to quantify the relative abundance of variants. The ratio of abundance was normalized to the no-drug (M15) control. The relative viability data were further analyzed using the Bayesian statistical model to determine the probability of impact on function of each variant. Oval colored shapes represent cells with a variant, and solid-colored rectangles represent sequence reads.

Here, we describe an experimental approach that improves the throughput of the mESC-based assay using next-generation sequencing (NGS). We further used the cell viability and drug sensitivity data from NGS (for cisplatin, a DNA interstrand crosslinker, and olaparib, a poly(ADP-ribose) polymerase [PARP] inhibitor [PARPi]) to generate PIF values of BRCA2 VUSs using a statistical model. We multiplexed the mESC-based assay at two steps: (1) generation of multiple variants of the same exon using recombineering20 and (2) multiplexing variants of the same exon for cell viability and drug sensitivity assays. The BAC clones with variants were individually transfected into PL2F7 (Brca2cko/ko) cells. We then selected the clones that expressed the BRCA2 protein13 (Figure 1B). Nonsense variants, predicted to encode a truncated protein, were screened by RT-PCR. The mESC clones expressing BRCA2 variants from the same exons were then pooled and transfected with Pgk-Cre to induce loss of the conditional Brca2 allele. Cells undergoing Cre recombinase (Cre)-mediated recombination were then selected in a medium containing hypoxanthine-aminopterin-thymidine (HAT).13 After HAT selection, the pool of cells was treated with olaparib or cisplatin for a drug sensitivity assay (Figure 1C). Cell viability in response to loss of the Brca2 conditional allele or to the drug treatment was determined by the presence of a variant allele detected by NGS (Figure 1D). Further, the functional scores calculated from the assay for 223 variants were used to build a statistical model to generate the PIFs and interpret the functional impact of the variants.

Results

Selection of variants

We selected 223 BRCA2 variants listed in the ClinVar database that impact residues encoded by exons 11 (65 variants), 14 (52 variants), 20 (19 variants), 21 (50 variants), and 25 (37 variants) (Table 1). Among these variants, 32 are nonsense variants that are classified as pathogenic in ClinVar (Table 1). Sixteen variants classified as benign or likely benign in ClinVar were used as neutral controls (Table 1). All control variants are in different exons. Sixty-five variants are in exon 11 of BRCA2, the 3,387-bp exon that codes for the region containing 8 BRC repeats. Among the selected variants in exon 11, eight are located in the BRC1 repeat, and the remaining are in the region flanking the BRC1 repeat (Figure S1). All selected variants from exon 14 are located in the region preceding the DNA binding domain of BRCA2, whereas the variants from exons 20, 21, and 25 are located in the oligonucleotide/oligosaccharide-binding (OB) folds OB2 or OB3 of the DNA binding domain of BRCA2 (Figure S1). Six variants are present in introns 20 and 21 (Figure S1).

Table 1. Classification of variants using the mESC-based assay

Columns: HGVS nucleotide | HGVS protein | Exon | ClinVar | ESC assay: HAT FS | Cis FS | Ola FS | HAT PIF | Cis PIF | Ola PIF | Combined PIF | Class
c.7022G>T p.Arg2341Leu 14 US −0.2140 −0.7767 −0.8636 0.0014 0.0451 0.0228 0.00238888 F
c.7021C>G p.Arg2341Gly 14 US −0.0298 −0.4448 −0.4406 0.0007 0.0219 0.0068 0.00083833 F
c.7078T>C p.Ser2360Pro 14 US −0.0611 −0.0113 −0.1099 0.0008 0.0103 0.0031 0.0008022 F
c.7040C>A p.Pro2347Gln 14 US 0.1133 −0.3929 −0.3591 0.0004 0.0198 0.0056 0.00053352 F
c.7082A>G p.His2361Arg 14 US 0.1876 0.0751 0.1213 0.0003 0.0091 0.0020 0.00035197 F
c.7069C>G p.Leu2357Val 14 US 0.1014 −0.1810 −0.1874 0.0004 0.0135 0.0037 0.00049037 F
c.7010C>T p.Thr2337Ile 14 US 0.0353 −0.3679 −0.3835 0.0005 0.0189 0.0059 0.00066076 F
c.7033C>G p.Gln2345Glu 14 US −0.1095 −0.5951 −0.6937 0.0009 0.0299 0.0137 0.00132712 F
c.7042A>C p.Asn2348His 14 US 0.2523 0.3826 0.2206 0.0003 0.0064 0.0017 0.00028432 F
c.7049C>T p.Thr2350Ile 14 US 0.2557 −0.1121 −0.1881 0.0003 0.0121 0.0037 0.00031581 F
c.7045T>C p.Phe2349Leu 14 US −4.8694 −5.3479 −5.5131 1.0000 1.0000 1.0000 1 NF
c.7022G>A p.Arg2341His 14 US 0.0016 −0.1774 −0.1532 0.0006 0.0134 0.0035 0.00066299 F
c.7066T>G p.Phe2356Val 14 US 0.4826 0.6687 0.7290 0.0001 0.0051 0.0009 0.00014814 F
c.7067T>A p.Phe2356Tyr 14 US 0.0292 −0.4491 −0.4405 0.0006 0.0221 0.0068 0.00071176 F
c.7037A>G p.Asn2346Ser 14 US −0.0138 −0.0395 0.0640 0.0007 0.0108 0.0022 0.00067466 F
c.7051G>T p.Ala2351Ser 14 US 0.1069 0.7191 0.7994 0.0004 0.0049 0.0008 0.00043611 F
c.7034A>G p.Gln2345Arg 14 US −0.3732 −0.5543 −0.8801 0.0026 0.0274 0.0240 0.00324057 F
c.7030A>G p.Ile2344Val 14 US −0.1915 −0.2654 −0.2683 0.0012 0.0156 0.0045 0.00131887 F
c.7017G>C p.Lys2339Asn 14 B −0.2044 −0.1943 −0.1084 0.0013 0.0138 0.0031 0.00135582 F
c.7027G>A p.Glu2343Lys 14 US −0.5738 −0.1653 −0.1540 0.0062 0.0132 0.0035 0.0062201 F
c.7051G>C p.Ala2351Pro 14 US −0.3355 −0.4772 −0.5280 0.0022 0.0234 0.0086 0.00241123 F
c.7072T>C p.Ser2358Pro 14 US 0.0349 0.5921 0.7200 0.0006 0.0053 0.0009 0.00055467 F
c.7021C>T p.Arg2341Cys 14 US −0.2295 0.0849 0.0244 0.0014 0.0090 0.0024 0.00146803 F
c.7025A>G p.Gln2342Arg 14 US −0.3478 0.2651 0.2819 0.0023 0.0072 0.0015 0.00233593 F
c.7081C>T p.His2361Tyr 14 US −0.3196 −0.3121 −0.1247 0.0021 0.0170 0.0032 0.00212651 F
c.3170A>G p.Lys1057Arg 11 US 0.4549 0.8779 0.2546 0.0002 0.0046 0.0016 0.00016178 F
c.3206C>T p.Ser1069Phe 11 US −0.0121 0.5133 −0.0854 0.0006 0.0057 0.0030 0.00066375 F
c.3256A>G p.Ile1086Val 11 CIP 0.5243 0.4844 0.2458 0.0001 0.0058 0.0016 0.00013868 F
c.3310A>C p.Thr1104Pro 11 US 0.4720 −0.0627 0.0147 0.0001 0.0112 0.0024 0.00017503 F
c.3319C>T p.Gln1107Stop 11 P −3.5779 −4.1744 −4.1490 1.0000 1.0000 1.0000 1 NF
c.3395A>G p.Lys1132Arg 11 US 0.5309 1.1654 0.4936 0.0001 0.0043 0.0011 0.00013209 F
c.3403T>C p.Tyr1135His 11 US −0.4727 −0.2659 −0.1237 0.0039 0.0156 0.0032 0.00399517 F
c.3413A>T p.Gln1138Leu 11 US −0.0826 −0.1557 −0.1797 0.0008 0.0129 0.0037 0.00087922 F
c.3419G>C p.Ser1140Thr 11 US −0.0684 2.7684 0.9349 0.0008 0.0214 0.0007 0.00080588 F
c.3446T>C p.Met1149Thr 11 US 0.5185 0.7778 0.3449 0.0001 0.0048 0.0014 0.00013781 F
c.3458A>G p.Lys1153Arg 11 CIP 0.2278 1.1051 0.4566 0.0003 0.0043 0.0012 0.00029983 F
c.3469G>T p.Glu1157Stop 11 P −5.1118 −6.1137 −6.3412 1.0000 1.0000 1.0000 1 NF
c.3475T>A p.Cys1159Ser 11 0.4656 1.3119 0.6174 0.0002 0.0044 0.0010 0.00015452 F
c.3499A>G p.Ile1167Val 11 US −0.1309 0.1616 0.0775 0.0010 0.0082 0.0022 0.0010112 F
c.3509C>T p.Ala1170Val 11 CIP 1.1074 0.1653 0.5177 0.0000 0.0081 0.0011 5.0152E-05 F
c.3515C>G p.Ser1172Trp 11 US 0.9492 0.8025 0.1095 0.0001 0.0047 0.0020 6.2456E-05 F
c.9458G>C p.Gly3153Ala 25 CIP 1.2132 1.7336 1.6517 0.0000 0.0053 0.0007 3.9357E-05 F
c.9401G>T p.Gly3134Val 25 US 0.9960 1.9634 2.0011 0.0000 0.0064 0.0008 5.4015E-05 F
c.9307A>G p.Ile3103Val 25 US 1.1745 1.5836 1.6928 0.0000 0.0048 0.0007 4.0886E-05 F
c.9275A>G p.Tyr3092Cys 25 CIP 0.9407 0.9271 1.1863 0.0001 0.0045 0.0006 5.654E-05 F
c.9263C>T p.Ala3088Val 25 US 1.2376 1.3473 1.5995 0.0000 0.0044 0.0006 3.7711E-05 F
c.9375C>G p.Leu3125= 25 LB 1.3594 1.3959 1.4045 0.0000 0.0045 0.0006 3.3459E-05 F
c.9477C>A p.Phe3159Leu 25 CIP 1.2213 1.3938 1.5744 0.0000 0.0045 0.0006 3.8411E-05 F
c.9275A>C p.Tyr3092Ser 25 US 0.3896 −1.5691 −1.1056 0.0002 0.3288 0.0499 0.01659796 F
c.9449C>T p.Pro3150Leu 25 US 0.0503 −1.7698 −0.9606 0.0005 0.5013 0.0310 0.01605649 F
c.9356T>G p.Leu3119Stop 25 P −3.6916 −3.6811 −3.3757 1.0000 0.9999 0.9998 1 NF
c.9500A>C p.Glu3167Ala 25 US 2.4392 2.5451 2.4458 0.0000 0.0141 0.0013 5.1474E-05 F
c.9285C>G p.Asp3095Glu 25 P/LP −3.8228 −5.0215 −4.9208 1.0000 1.0000 1.0000 1 NF
c.9374T>A p.Leu3125His 25 LP −4.1949 −5.9723 −5.9196 1.0000 1.0000 1.0000 1 NF
c.9294C>G p.Tyr3098Stop 25 P −3.1644 −3.7933 −3.5801 1.0000 1.0000 1.0000 1 NF
c.9302T>G p.Leu3101Arg 25 CIP −3.0114 −5.1579 −4.4468 1.0000 1.0000 1.0000 1 NF
c.9481A>T p.Lys3161Stop 25 P −3.3027 −3.2420 −3.2204 1.0000 0.9989 0.9995 1 NF
c.9455A>G p.Glu3152Gly 25 US −4.0623 −4.8274 −4.1532 1.0000 1.0000 1.0000 1 NF
c.2813C>A p.Ala938Glu 11 CIP −0.6316 −0.7041 −0.3713 0.0080 0.0381 0.0057 0.00825901 F
c.3085A>G p.Met1029Val 11 US 0.2874 0.3794 0.3655 0.0002 0.0064 0.0013 0.00025506 F
c.3075G>T p.Lys1025Asn 11 US 0.5270 0.4829 0.4088 0.0001 0.0058 0.0013 0.00013577 F
c.2786T>C p.Leu929Ser 11 B 0.3933 0.2543 0.1975 0.0002 0.0073 0.0017 0.00019514 F
c.2987T>G p.Leu996Arg 11 B 0.8617 0.6970 0.6141 0.0001 0.0050 0.0010 6.6665E-05 F
c.2965T>G p.Tyr989Asp 11 US −0.5989 −0.8827 −0.5099 0.0069 0.0581 0.0082 0.00739321 F
c.2798C>G p.Thr933Arg 11 US 0.3625 0.2828 0.2333 0.0002 0.0071 0.0016 0.0002103 F
c.2849T>A p.Val950Asp 11 CIP 0.6114 0.9657 0.8870 0.0001 0.0044 0.0008 0.0001081 F
c.2803G>C p.Asp935His 11 B −0.0744 −0.3379 −0.4463 0.0008 0.0178 0.0069 0.00093119 F
c.2927C>T p.Ser976Phe 11 B/LB 0.3687 −0.2071 −0.4607 0.0002 0.0141 0.0072 0.00029688 F
c.3073A>G p.Lys1025Glu 11 CIP −0.6727 −0.7125 −0.4728 0.0097 0.0389 0.0074 0.010024 F
c.2944A>C p.Ile982Leu 11 CIP 0.7377 0.4180 0.2804 0.0001 0.0062 0.0015 8.8507E-05 F
c.2979G>A p.Trp993Stop 11 P −4.8245 −6.1596 −6.4292 1.0000 1.0000 1.0000 1 NF
c.3142G>A p.Val1048Ile 11 US 0.4854 0.7102 0.5539 0.0001 0.0049 0.0010 0.00014791 F
c.3166C>T p.Gln1056Stop 11 P −3.2693 −2.4294 −2.7127 1.0000 0.9384 0.9892 0.99999994 NF
c.3088T>G p.Phe1030Val 11 CIP 0.5298 0.0240 0.6611 0.0001 0.0098 0.0009 0.00013664 F
c.2836G>C p.Asp946His 11 US 0.6596 0.9479 0.8467 0.0001 0.0045 0.0008 9.7257E-05 F
c.2899C>G p.Leu967Val 11 US 0.6510 −0.1807 0.0867 0.0001 0.0135 0.0021 0.00012434 F
c.3092T>C p.Phe1031Ser 11 US 0.5260 −0.5788 −0.4747 0.0001 0.0289 0.0075 0.00034466 F
c.2926T>A p.Ser976Thr 11 B/LB 0.7574 0.4405 0.6034 0.0001 0.0060 0.0010 8.1881E-05 F
c.3122G>A p.Ser1041Asn 11 US 0.5944 0.8211 0.8285 0.0001 0.0047 0.0008 0.00011271 F
c.3137A>G p.Glu1046Gly 11 CIP 0.0539 −0.3310 −0.1431 0.0005 0.0176 0.0034 0.000575 F
c.2854G>T p.Ala952Ser 11 US 0.8083 0.2244 0.4404 0.0001 0.0076 0.0012 7.7641E-05 F
c.3109C>T p.Gln1037Stop 11 P −4.4515 −5.5910 −5.5669 1.0000 1.0000 1.0000 1 NF
c.3515C>T p.Ser1172Leu 11 B 0.0007 0.8151 0.6568 0.0006 0.0047 0.0009 0.00062294 F
c.3503T>A p.Met1168Lys 11 US 0.8817 0.5852 0.6997 0.0001 0.0054 0.0009 6.4338E-05 F
c.3367A>G p.Ser1123Gly 11 US 0.9431 0.8715 0.9687 0.0001 0.0046 0.0007 5.6687E-05 F
c.3437A>G p.Glu1146Gly 11 US 0.9490 0.5847 0.7685 0.0001 0.0054 0.0008 5.7345E-05 F
c.3362C>A p.Ser1121Stop 11 P −3.1092 −4.1491 −4.3650 1.0000 1.0000 1.0000 1 NF
c.3265C>T p.Gln1089Stop 11 P −3.0456 −4.5726 −4.6121 1.0000 1.0000 1.0000 1 NF
c.3539A>G p.Lys1180Arg 11 CIP 0.8001 0.2632 0.4744 0.0001 0.0072 0.0012 7.7971E-05 F
c.3262C>T p.Pro1088Ser 11 CIP 0.4008 0.4517 0.2431 0.0002 0.0060 0.0016 0.00018833 F
c.3526G>A p.Val1176Ile 11 US 0.2902 0.6521 0.1806 0.0002 0.0051 0.0018 0.00025369 F
c.3103G>T p.Glu1035Stop 11 P −4.6635 −5.2485 −5.3876 1.0000 1.0000 1.0000 1 NF
c.7102T>G p.Leu2368Val 14 CIP 0.5605 0.3858 0.3938 0.0001 0.0063 0.0013 0.00012646 F
c.7142C>T p.Pro2381Leu 14 US −0.0513 −0.0918 0.1150 0.0007 0.0117 0.0020 0.00076683 F
c.7025A>C p.Gln2342Pro 14 US −0.0644 −0.1821 −0.2759 0.0008 0.0135 0.0046 0.00084049 F
c.7009A>G p.Thr2337Ala 14 US −0.3626 −0.5921 −0.7156 0.0025 0.0297 0.0146 0.00290388 F
c.7100C>T p.Thr2367Ile 14 US −0.2539 0.7600 0.7363 0.0016 0.0048 0.0009 0.00159616 F
c.7150C>G p.Gln2384Glu 14 US 0.7474 0.6018 0.6046 0.0001 0.0053 0.0010 8.2728E-05 F
c.7133C>G p.Ser2378Stop 14 P −3.6377 −4.6589 −4.8437 1.0000 1.0000 1.0000 1 NF
c.7095T>A p.His2365Gln 14 US −0.1041 0.4352 0.3600 0.0009 0.0061 0.0014 0.00090799 F
c.7139A>G p.His2380Arg 14 US 0.5762 0.2725 0.2727 0.0001 0.0072 0.0015 0.00012491 F
c.7039C>G p.Pro2347Ala 14 US 0.3155 0.2242 0.2279 0.0002 0.0076 0.0017 0.00023964 F
c.7093C>T p.His2365Tyr 14 US 0.0306 0.0714 0.0070 0.0006 0.0092 0.0025 0.0005808 F
c.7147T>C p.Tyr2383His 14 US −2.6611 −3.0326 −3.1749 0.9998 0.9966 0.9993 0.99999917 NF
c.7138C>T p.His2380Tyr 14 US 0.2068 0.0454 0.0454 0.0003 0.0095 0.0023 0.00033626 F
c.7107A>C p.Glu2369Asp 14 US −3.1135 −4.0509 −4.2792 1.0000 1.0000 1.0000 1 NF
c.7142C>A p.Pro2381Gln 14 US −1.3017 −2.1833 −2.5570 0.2184 0.8383 0.9744 0.8568118 I
c.7073C>G p.Ser2358Cys 14 US 0.1058 −0.0417 −0.0193 0.0004 0.0108 0.0026 0.00046198 F
c.7115C>G p.Ser2372Stop 14 P −2.7216 −3.5670 −4.1443 0.9999 0.9998 1.0000 0.99999998 NF
c.7028A>T p.Glu2343Val 14 US 0.3533 0.3632 0.3595 0.0002 0.0065 0.0014 0.00021264 F
c.7126G>C p.Ala2376Pro 14 US −0.5944 −0.9135 −0.7131 0.0068 0.0626 0.0145 0.00768141 F
c.7136G>T p.Gly2379Val 14 US −5.4959 −5.6089 −6.1601 1.0000 1.0000 1.0000 1 NF
c.7121A>G p.Asn2374Ser 14 CIP 0.0312 −0.3099 −0.2501 0.0006 0.0169 0.0043 0.00062956 F
c.7136G>A p.Gly2379Glu 14 US −1.6839 −3.5697 −3.7299 0.7538 0.9998 1.0000 0.9999561 NF
c.8657C>G p.Pro2886Arg 21 0.0958 −0.7838 −0.5446 0.0004 0.0458 0.0090 0.0008604 F
c.8679G>T p.Gln2893His 21 US 0.2385 −0.2311 0.3208 0.0003 0.0147 0.0014 0.00030634 F
c.8672C>G p.Thr2891Arg 21 US −0.0015 −0.2788 −0.0183 0.0006 0.0160 0.0026 0.00066501 F
c.8681A>G p.Gln2894Arg 21 US −0.4399 −1.5203 −0.9234 0.0034 0.2931 0.0275 0.01146638 F
c.8651A>T p.Tyr2884Phe 21 US −0.0402 −0.2931 0.0278 0.0007 0.0164 0.0024 0.00075337 F
c.8710C>T p.Leu2904Phe 21 US −0.3337 −2.8621 −1.9416 0.0022 0.9919 0.6195 0.61527078 I
c.8686C>T p.Arg2896Cys 21 US −0.3156 −1.0178 −0.5279 0.0020 0.0813 0.0086 0.00273603 F
c.8651A>G p.Tyr2884Cys 21 CIP −0.0501 −0.8476 −0.3635 0.0007 0.0533 0.0056 0.00103992 F
c.8663G>T p.Arg2888Leu 21 US −0.1668 −0.6515 −0.3279 0.0011 0.0339 0.0052 0.00131122 F
c.8680C>T p.Gln2894Stop 21 P −5.0250 −5.9901 −5.4301 1.0000 1.0000 1.0000 1 NF
c.8707G>T p.Glu2903Stop 21 P −5.9357 −7.5663 −6.6560 1.0000 1.0000 1.0000 1 NF
c.8665G>T p.Ala2889Ser 21 US 0.6434 1.7082 1.1282 0.0001 0.0052 0.0007 0.00010071 F
c.8633-1G>A Intron P/LP −5.3153 −5.5640 −5.2994 1.0000 1.0000 1.0000 1 NF
c.8704G>A p.Ala2902Thr 21 US 0.4549 −1.1871 −0.4189 0.0002 0.1256 0.0065 0.00096687 F
c.8687G>A p.Arg2896His 21 CIP −0.2058 −0.8948 −0.5829 0.0013 0.0598 0.0100 0.00191661 F
c.8722G>A p.Val2908Met 21 US −0.0797 −0.7944 −0.4695 0.0008 0.0470 0.0074 0.00116919 F
c.8690C>G p.Ala2897Gly 21 US −0.0893 −0.8324 −0.4401 0.0009 0.0514 0.0068 0.00120308 F
c.8663G>A p.Arg2888His 21 US 0.4929 1.1285 1.1205 0.0001 0.0043 0.0007 0.00014291 F
c.8662C>T p.Arg2888Cys 21 B 0.1056 −0.2758 0.0369 0.0004 0.0159 0.0023 0.00047119 F
c.8639C>G p.Thr2880Arg 21 US −0.1524 −0.4348 −0.2405 0.0011 0.0215 0.0042 0.00116683 F
c.8668C>A p.Leu2890Ile 21 CIP −0.5353 −2.0841 −1.3181 0.0052 0.7742 0.1029 0.08441721 I
c.8663G>C p.Arg2888Pro 21 US −0.0962 −0.0197 −0.0626 0.0009 0.0105 0.0028 0.00090399 F
c.8714A>G p.Tyr2905Cys 21 US −0.0733 −1.5721 −1.1494 0.0008 0.3312 0.0579 0.01994992 F
c.8702G>T p.Gly2901Val 21 CIP −0.0639 −1.6055 −1.0553 0.0008 0.3574 0.0422 0.01586066 F
c.8708A>G p.Glu2903Gly 21 US 0.2096 −0.2141 −0.1920 0.0003 0.0143 0.0038 0.00036535 F
c.8732C>T p.Ala2911Val 21 US −0.0423 0.5644 0.4160 0.0007 0.0054 0.0012 0.00072643 F
c.8737G>A p.Asp2913Asn 21 US 0.0662 −0.3298 −0.3259 0.0005 0.0176 0.0051 0.00058478 F
c.8746T>A p.Tyr2916Asn 21 US 0.1240 0.0412 0.0657 0.0004 0.0096 0.0022 0.0004299 F
c.8732C>A p.Ala2911Glu 21 P 0.5565 0.4966 0.5866 0.0001 0.0057 0.0010 0.00012521 F
c.8737G>C p.Asp2913His 21 US 0.2936 0.3892 0.3913 0.0002 0.0063 0.0013 0.00025023 F
c.8744C>T p.Ala2915Val 21 US 0.3841 0.3975 0.4609 0.0002 0.0063 0.0012 0.00019441 F
c.8644A>T p.Lys2882Stop 21 P −5.0381 −6.6902 −6.3356 1.0000 1.0000 1.0000 1 NF
c.8752G>C p.Glu2918Gln 21 US 0.2601 0.2159 0.3473 0.0003 0.0076 0.0014 0.00027783 F
c.8754+2T>G Intron P −4.4425 −8.5100 −8.5823 1.0000 1.0000 1.0000 1 NF
c.8633-2A>G Intron P −6.1292 −7.9408 −7.0895 1.0000 1.0000 1.0000 1 NF
c.8738A>T p.Asp2913Val 21 US −0.3507 0.4720 −0.4400 0.0024 0.0059 0.0068 0.00239288 F
c.8677C>T p.Gln2893Stop 21 P −6.3245 −8.7391 −9.4322 1.0000 1.0000 1.0000 1 NF
c.8695C>T p.Gln2899Stop 21 P −5.9213 −8.2198 −7.4967 1.0000 1.0000 1.0000 1 NF
c.8754+1G>T Intron P/LP −6.1534 −9.3828 −7.8963 1.0000 1.0000 1.0000 1 NF
c.8646A>T p.Lys2882Asn 21 US 0.1195 0.3945 0.3809 0.0004 0.0063 0.0013 0.00042298 F
c.8633-2A>T Intron P −5.0434 −6.9790 −7.2312 1.0000 1.0000 1.0000 1 NF
c.8754+1G>A Intron P −6.0750 −7.4268 −6.9858 1.0000 1.0000 1.0000 1 NF
c.9301C>G p.Leu3101Val 25 US 0.7495 1.0248 1.1194 0.0001 0.0044 0.0007 8.0084E-05 F
c.9271G>A p.Val3091Ile 25 CIP 1.9255 1.9417 1.8830 0.0000 0.0063 0.0007 2.8814E-05 F
c.9414A>G p.Leu3138= 25 LB 1.9071 1.7683 1.6661 0.0000 0.0054 0.0007 2.7722E-05 F
c.9367A>G p.Ser3123Gly 25 US 1.0917 1.9432 1.5569 0.0000 0.0063 0.0006 4.6219E-05 F
c.9292T>C p.Tyr3098His 25 B 1.3183 1.5267 1.5487 0.0000 0.0047 0.0006 3.4899E-05 F
c.9350A>C p.His3117Pro 25 US 1.0466 1.0860 1.3583 0.0000 0.0043 0.0006 4.7865E-05 F
c.9501G>A p.Glu3167= 25 1.6919 1.4348 1.4565 0.0000 0.0045 0.0006 2.7661E-05 F
c.9466C>T p.Gln3156Stop 25 P −4.2907 −6.1321 −5.2203 1.0000 1.0000 1.0000 1 NF
c.9376C>T p.Gln3126Stop 25 P −4.1324 −4.7481 −4.4496 1.0000 1.0000 1.0000 1 NF
c.9371A>T p.Asn3124Ile 25 P −3.9263 −7.6510 −6.4842 1.0000 1.0000 1.0000 1 NF
c.8539G>A p.Glu2847Lys 20 CIP −0.9734 −4.4171 −4.5893 0.0425 1.0000 1.0000 0.99999935 NF
c.8593T>G p.Leu2865Val 20 CIP 1.0598 0.8185 0.8329 0.0000 0.0047 0.0008 4.7936E-05 F
c.8575C>T p.Gln2859Stop 20 P −2.7456 −6.2237 −6.1633 0.9999 1.0000 1.0000 1 NF
c.8618T>G p.Phe2873Cys 20 US 1.2591 1.3325 1.3907 0.0000 0.0044 0.0006 3.6782E-05 F
c.8503T>C p.Ser2835Pro 20 B 1.6813 1.3591 1.3828 0.0000 0.0044 0.0006 2.7686E-05 F
c.8573A>G p.Gln2858Arg 20 CIP 0.8718 1.1406 1.1764 0.0001 0.0043 0.0006 6.349E-05 F
c.8548G>A p.Glu2850Lys 20 US 1.0769 0.7884 0.7826 0.0000 0.0047 0.0008 4.7041E-05 F
c.8518A>G p.Ile2840Val 20 US 1.5107 1.0592 1.0904 0.0000 0.0044 0.0007 3.0081E-05 F
c.8525G>A p.Arg2842His 20 B 1.4195 −0.2924 −0.4282 0.0000 0.0164 0.0066 0.00013772 F
c.8572C>A p.Gln2858Lys 20 CIP 1.0425 1.2039 1.2558 0.0000 0.0043 0.0006 4.818E-05 F
c.8525G>T p.Arg2842Leu 20 CIP 0.6106 −1.0058 −1.1800 0.0001 0.0788 0.0642 0.00516208 F
c.8599A>C p.Thr2867Pro 20 US 1.0508 0.7940 0.7847 0.0000 0.0047 0.0008 4.8739E-05 F
c.8591C>T p.Ala2864Val 20 US 0.6814 0.3922 0.3434 0.0001 0.0063 0.0014 9.8065E-05 F
c.8629G>T p.Glu2877Stop 20 P −4.2046 −4.8618 −4.8729 1.0000 1.0000 1.0000 1 NF
c.8504C>A p.Ser2835Stop 20 P −3.2603 −3.4495 −3.6358 1.0000 0.9997 1.0000 1 NF
c.8567A>C p.Glu2856Ala 20 CIP 1.2920 0.8997 0.9775 0.0000 0.0045 0.0007 3.6047E-05 F
c.7068T>G p.Phe2356Leu 14 US 0.0051 −1.4464 −1.7247 0.0006 0.2446 0.3781 0.09306036 I
c.3299A>T p.Asn1100Ile 11 CIP 0.1578 1.1638 0.1082 0.0004 0.0043 0.0020 0.00037551 F
c.3362C>G p.Ser1121Stop 11 P −4.0407 −4.7562 −5.2877 1.0000 1.0000 1.0000 1 NF
c.3496G>A p.Val1166Ile 11 US −0.3355 −0.2721 −0.3444 0.0022 0.0158 0.0054 0.00229499 F
c.9425A>G p.Asp3142Gly 25 US 1.2529 1.1127 1.1823 0.0000 0.0043 0.0006 3.708E-05 F
c.9385C>G p.Pro3129Ala 25 US 0.8141 1.1596 0.9435 0.0001 0.0043 0.0007 7.0883E-05 F
c.9276T>G p.Tyr3092Stop 25 P −3.5541 −3.8671 −3.9063 1.0000 1.0000 1.0000 1 NF
c.9382C>T p.Arg3128Stop 25 P −4.1565 −4.7846 −3.9642 1.0000 1.0000 1.0000 1 NF
c.9317G>A p.Trp3106Stop 25 P −2.3918 −2.4653 −2.3196 0.9981 0.9472 0.9138 0.99974674 NF
c.9309A>G p.Ile3103Met 25 US 1.1199 0.2147 0.1092 0.0000 0.0076 0.0020 5.6182E-05 F
c.2971A>G p.Asn991Asp 11 B 1.6490 1.5217 2.2475 0.0000 0.0047 0.0010 2.9989E-05 F
c.2881C>T p.Gln961Stop 11 P −4.8245 −5.9455 −6.7592 1.0000 1.0000 1.0000 1 NF
c.2818C>T p.Gln940Stop 11 P −7.1483 −7.0299 −8.1113 1.0000 1.0000 1.0000 1 NF
c.3076A>T p.Lys1026Stop 11 P −5.6613 −8.4729 −8.1921 1.0000 1.0000 1.0000 1 NF
c.3032C>G p.Thr1011Arg 11 CIP 0.9236 0.3661 0.4184 0.0001 0.0065 0.0012 6.3314E-05 F
c.3132T>G p.Cys1044Trp 11 US 0.7906 0.3435 0.2161 0.0001 0.0066 0.0017 8.2198E-05 F
c.2872A>G p.Ser958Gly 11 US 0.6581 0.8378 0.8054 0.0001 0.0046 0.0008 9.7826E-05 F
c.2803G>A p.Asp935Asn 11 B/LB 0.5735 0.1404 0.4089 0.0001 0.0084 0.0013 0.00012519 F
c.3270G>C p.Met1090Ile 11 US 2.5023 1.1308 2.2957 0.0000 0.0043 0.0011 3.9885E-05 F
c.3304A>T p.Asn1102Tyr 11 B 0.5268 −0.3602 −0.0590 0.0001 0.0186 0.0028 0.0001811 F
c.7141C>T p.Pro2381Ser 14 US 0.1934 0.2064 0.1441 0.0003 0.0077 0.0019 0.00034255 F
c.7141C>G p.Pro2381Ala 14 US 0.2626 −0.0195 −0.0283 0.0003 0.0105 0.0027 0.00029313 F
c.7119C>G p.Ser2373Arg 14 US 0.1417 0.1874 0.0888 0.0004 0.0079 0.0021 0.00040277 F
c.7115C>A p.Ser2372Stop 14 P −1.9846 −5.4167 −5.5699 0.9606 1.0000 1.0000 1 NF
c.7106A>G p.Glu2369Gly 14 US −0.5600 −1.3960 −1.1140 0.0058 0.2155 0.0514 0.0168051 F
c.8702G>A p.Gly2901Asp 21 CIP −0.0512 −0.7515 −0.9849 0.0007 0.0425 0.0335 0.00216736 F
c.8692T>G p.Leu2898Val 21 US −0.8861 −2.5100 −1.4343 0.0274 0.9567 0.1528 0.16956546 I
c.8739C>G p.Asp2913Glu 21 US 0.7681 1.2035 0.8567 0.0001 0.0043 0.0008 7.7631E-05 F
c.8699A>T p.Asp2900Val 21 US −0.0918 0.5854 0.5101 0.0009 0.0054 0.0011 0.00086614 F
c.8740C>G p.Pro2914Ala 21 US 0.5999 1.0249 1.0170 0.0001 0.0044 0.0007 0.00011065 F
c.8678A>G p.Gln2893Arg 21 US 0.1813 0.1716 0.4155 0.0003 0.0081 0.0012 0.00035048 F
c.8656C>A p.Pro2886Thr 21 US 0.2438 −0.1361 −0.2628 0.0003 0.0125 0.0044 0.00033615 F
c.2837A>G p.Asp946Gly 11 CIP 0.4408 0.4513 0.2464 0.0002 0.0060 0.0016 0.00017005 F
c.2771A>T p.Asn924Ile 11 CIP −0.4580 0.5980 0.6023 0.0037 0.0053 0.0010 0.00370591 F
c.8647C>T p.Pro2883Ser 21 CIP −6.4547 −10.133 −9.1098 1.0000 1.0000 1.0000 1 NF
c.9353T>C p.Met3118Thr 25 CIP −0.8240 −0.8750 −0.2229 0.0201 0.0570 0.0040 0.02037364 F
c.9454G>A p.Glu3152Lys 25 CIP 1.6899 2.0613 1.8425 0.0000 0.0071 0.0007 2.9922E-05 F
c.9380G>A p.Trp3127Stop 25 P −5.8464 −7.9743 −7.7484 1.0000 1.0000 1.0000 1 NF
c.8572C>T p.Gln2858Stop 20 P −4.3478 −6.3224 −6.2583 1.0000 1.0000 1.0000 1 NF
c.8594T>G p.Leu2865Stop 20 P −6.6953 −8.8540 −8.1387 1.0000 1.0000 1.0000 1 NF
c.8489G>A p.Trp2830Stop 20 P −4.6542 −3.6163 −3.7287 1.0000 0.9999 1.0000 1 NF

FS, functional score; PIF, probability of impact on function; Cis, cisplatin; Ola, olaparib; F, functional; NF, non-functional; I, indeterminate; B, benign; LB, likely benign; P, pathogenic; LP, likely pathogenic; US, uncertain significance; CIP, conflicting interpretation of pathogenicity.

Generation of variants and multiplexing the mESC-based assay

Each variant was generated by recombineering in a BAC clone (CTD-2342K5 with a 127-kb insert) containing full-length BRCA2.8,20 All BACs were sequenced to confirm the presence of the desired mutation and the lack of any undesired mutations in that region. For most of the variants, two independent mESC (PL2F7) clones expressing full-length BRCA2 were selected for further analysis. A single clone was used for variants where two equally expressing clones were not obtained. Subsequently, mESC clones expressing variants from the same exons were pooled together.

Two pools of mESCs were generated for each exon batch using two independently generated BRCA2-variant-expressing clones. Cell samples from each pool were collected before Pgk-Cre expression (termed “M15”), after HAT selection of recombinant clones (termed “HAT”), and after the selection of recombinant clones in the presence of olaparib or cisplatin. The genomic DNA obtained from the collected cell samples was used to PCR amplify the genomic region where the variants are located, and the amplicons were sequenced on the Illumina MiSeq platform.

Functional scores of variants

The frequency of each variant was calculated as described in the STAR Methods. The frequency of variants in the M15 sample followed a normal, bell-shaped distribution (Figure S2). The variant frequencies after HAT, cisplatin, and olaparib treatment were used to calculate the functional score for each variant (STAR Methods). Functional scores of two biological replicates for each treatment (HAT, cisplatin, or olaparib) and for each pool were calculated. Functional scores of two biological replicates under all conditions showed a high correlation (r = 0.95, 0.95 for HAT; 0.98, 0.97 for cisplatin; 0.98, 0.95 for olaparib) (Figure S3A), suggesting equal representation between the replicates. Hence, we generated the average functional score from the two biological replicates for each treatment and for each pool. The average functional scores of the control variants followed a bimodal distribution (neutral or pathogenic controls) under all conditions (Figure S3B). Further, we analyzed the correlation of average functional scores between the two pools (technical replicates) for each treatment and found that the two technical replicates were correlated under all conditions, most strongly for HAT (r = 0.99 for HAT, 0.67 for cisplatin, and 0.70 for olaparib) (Figure S3C). Finally, we averaged the functional scores of the two technical replicates to generate the functional score for each assay (HAT, cisplatin, and olaparib) (Table 1). For 41 variants, we either failed to get two independent BRCA2-expressing clones or they had a frequency of sequencing reads of less than 0.001 in the M15 sample. The functional scores for these variants were calculated from two replicates, and the functional scores for the remaining 182 variants were calculated from four replicates (Table S1).
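The scoring and averaging steps in this paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the actual functional-score formula is given in the STAR Methods, which are not reproduced here, so the log2 ratio of a variant's frequency after selection to its frequency in the M15 sample, the small pseudo-frequency, and all function names are assumptions made for this example. Only the 0.001 M15 frequency cutoff and the two-level averaging (biological replicates within a pool, then the pools) are taken from the text.

```python
import math
from statistics import mean

# Read-frequency cutoff in the M15 sample quoted in the text.
MIN_M15_FREQUENCY = 0.001


def variant_frequency(variant_reads, total_reads):
    """Fraction of sequencing reads that carry a given variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def passes_m15_cutoff(freq_m15):
    """True if the variant is sufficiently represented before selection."""
    return freq_m15 >= MIN_M15_FREQUENCY


def functional_score(freq_selected, freq_m15, pseudo=1e-6):
    """ASSUMPTION: score = log2(selected frequency / M15 frequency).

    The pseudo-frequency (hypothetical) avoids log2(0) when a variant
    drops out completely after HAT or drug selection.
    """
    ratio = (freq_selected + pseudo) / (freq_m15 + pseudo)
    if ratio <= 0:
        raise ValueError("frequencies must give a positive ratio")
    return math.log2(ratio)


def average_score(scores_by_pool):
    """Average biological replicates within each pool, then average the pools.

    scores_by_pool maps a pool name to its list of replicate scores.
    Pools without usable data are skipped, which mirrors the variants
    scored from two replicates instead of four.
    """
    pool_means = [mean(reps) for reps in scores_by_pool.values() if reps]
    if not pool_means:
        raise ValueError("no replicate scores available")
    return mean(pool_means)


# Illustrative read counts (made up for this example).
m15 = variant_frequency(5_000, 100_000)
hat = variant_frequency(1_250, 100_000)
if passes_m15_cutoff(m15):
    print(round(functional_score(hat, m15), 2))
print(average_score({"pool1": [-1.0, -3.0], "pool2": [-2.0, -4.0]}))
```

Under this assumed definition, a variant that is depleted after selection gets a negative score and a variant that keeps its share of the pool scores near zero, which matches the sign convention visible in Table 1.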

We next examined the functional scores of the variants. The functional scores in all three assays were bimodally distributed (Figure 2A). All pathogenic control variants were scored below −1.98 (n = 27; median −4.13, −4.86, and −5.22 for HAT, cisplatin, and olaparib, respectively), and all neutral variants were scored at or above −0.46 (n = 16; median 0.66, 0.34, and 0.50 for HAT, cisplatin, and olaparib, respectively). The functional scores obtained in the cisplatin and olaparib sensitivity assays showed the highest correlation (r = 0.98), whereas the correlation of the HAT functional score with the cisplatin (r = 0.87) or the olaparib (r = 0.89) functional score was less than 0.90 (Figure 2B).

Figure 2.

Functional score (FS) distribution and the correlation of FS between three assays

(A) The distribution of FSs calculated from HAT, cisplatin, and olaparib sensitivity of mESCs with different BRCA2 variants. The color codes demarcate controls and variants of uncertain significance (VUSs). Variants that were classified as benign or likely benign in ClinVar were used as neutral/functional controls, and those that were classified as pathogenic or likely pathogenic were used as pathogenic/non-functional controls. A total of 223 variants were analyzed.

(B) FSs from the three different assays (sensitivity to HAT, cisplatin, or olaparib) are plotted against each other to compare the results of the assays. The Spearman correlation coefficient is displayed in each plot.

Evaluation of BRCA2 variants by a combined-mixture statistical model and prediction of the impact on function

We used the functional scores in a statistical model to calculate the PIF for each BRCA2 variant in the dataset (n = 223) (Tables 1 and S2). We further used these PIFs to classify the variants as non-functional (PIF > 0.99), functional (PIF ≤ 0.05), and indeterminate (0.05 < PIF ≤ 0.99) (Table 1).9 Because the functional score distributions for the BRCA2 variants with known pathogenicity (i.e., the control variants in our study: 27 pathogenic and 16 benign variants) could be regarded as normal (Figure S4; Lilliefors test, p > 0.1), mixtures of two normal distributions were fitted to the data from each of the three assays (Figure S5), and the PIFs were calculated from these distributions, as described in the STAR Methods. The main version of our PIF calculation algorithm relied on the semi-supervised learning paradigm21 and had a special structure that combined the mixture models for all three assays. The resulting PIFs are shown in Figure 3, which compares three reduced algorithm versions (Figures 3A–3C) with the full version (Figure 3D). These reduced versions of our algorithm also relied on semi-supervised learning but used the data from only one of the three assays (HAT, cisplatin, or olaparib) for PIF calculation. Remarkably, for the majority of the calculated PIFs, the width of the confidence intervals was negligible, indicating a high robustness of the PIF calculation results (Figure 3D; Table S2). Moreover, only a few BRCA2 variants were classified as indeterminate. The full algorithm version yielded a 100% variant classification accuracy when applied to the full dataset (i.e., fitting accuracy) as well as a 100% cross-validation accuracy. In contrast, for each of the reduced algorithm versions, both full-dataset accuracy and cross-validation accuracy were lower (Figures 3A–3C). This clearly indicates the benefit of using the full, three-assay dataset for PIF calculation.
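The classification thresholds stated above can be illustrated with a minimal sketch. The function name `classify_pif` is hypothetical and not taken from the deposited analysis code; the two example PIF values are those reported in this paper for p.Arg2842Leu and p.Tyr3092Ser.

```python
def classify_pif(pif: float) -> str:
    """Map a probability of impact on function (PIF) to a functional class.

    Thresholds follow the text: non-functional if PIF > 0.99,
    functional if PIF <= 0.05, indeterminate otherwise.
    """
    if not 0.0 <= pif <= 1.0:
        raise ValueError("PIF must lie in [0, 1]")
    if pif > 0.99:
        return "non-functional"
    if pif <= 0.05:
        return "functional"
    return "indeterminate"


# PIF values reported in the text for two variants.
reported = {"p.Arg2842Leu": 0.005, "p.Tyr3092Ser": 0.016}
for variant, pif in reported.items():
    print(variant, classify_pif(pif))
```

Note that the boundaries are asymmetric: a PIF of exactly 0.05 counts as functional, whereas a PIF of exactly 0.99 remains indeterminate.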

Figure 3.

Probability of impact on function (PIF) for the BRCA2 variants in the full dataset (n = 223), calculated using the main (semi-supervised learning) version of our PIF calculation algorithm

The four subplots correspond to the four types of assays that provided FS data used in PIF calculation: HAT only (A), cisplatin only (B), olaparib only (C), and HAT, cisplatin, and olaparib combined (D). The dashed lines correspond to the variant classification thresholds of 0.05 and 0.99; a PIF value is considered indeterminate when 0.05 < PIF ≤ 0.99. In each subplot, the circles represent individual BRCA2 variants. Vertical lines show the 95% confidence intervals for each PIF. Values of confidence intervals are listed in Table S2.

To gain insights into the advantages of semi-supervised learning for PIF calculation, we used the analysis strategy described above in combination with a supervised learning version of our algorithm.21 While the main, semi-supervised version of our algorithm uses all of the data in the training set for model fitting, the supervised learning version is trained using only the control variants (whose functional status, benign or pathogenic, is known). The resulting PIFs (Figure S6) demonstrated that, while the classification accuracy of the supervised learning algorithm applied to the full dataset was still 100%, the number of variants classified as indeterminate was considerably larger, and the number of PIFs with wide confidence intervals was also considerably larger than those for the semi-supervised learning algorithm version (Table S3). This suggests that supervised learning, compared with semi-supervised learning, was characterized by increased uncertainty and, evidently, decreased reliability of the PIF calculation results. Moreover, the cross-validation accuracy for supervised learning was less than 100%. We thus concluded that our semi-supervised learning algorithmic approach combining data from all three functional assays (HAT, cisplatin, and olaparib) demonstrated superior performance compared with the other PIF calculation approaches we considered here.

To gain further insights into our semi-supervised learning results, we generated PIF-PIF plots for different combinations of assay data used in the PIF calculation (Figure S7). We observed that the PIFs calculated using different data for the same BRCA2 variant could be notably different, suggesting that the incorporation of the information from different assays in reliable PIF calculation is necessary because of their complementary and non-redundant nature. For the PIF based on a single assay, the cisplatin-based PIF typically exceeded the olaparib-based PIF, which, in turn, almost always exceeded the HAT-based PIF (Figures S7A–S7C). The three-assay PIF always exceeded (but in many cases negligibly) the corresponding HAT-only PIF (Figure S7D), a relation that was largely reversed for the cisplatin- and olaparib-only PIF (Figures S7E and S7F). However, in all PIF-PIF plots, the largest PIF-PIF differences were detected for the variants that were classified as non-functional by at least one of the PIFs. This implies that our suggested use of all three assays (HAT, cisplatin, and olaparib) in PIF calculation will be particularly important for the identification of nonfunctional BRCA2 variants.

Comparison of ESC-based variant classification with other available variant information

We next compared our calculated PIFs for the variants with the results from other published functional assays, ClinVar data, in silico predictions (BayesDel and Priors), and ACMG codes (Figure 4; Tables S4 and S5). Comparison of our prediction of the impact on function for the variants with the results from the high-throughput functional assay based on complementation of PARPi sensitivity (mixed-all nominated-in-one-BRCA [MANO-B] method)12 showed a high concordance with the variants that were classified as fClass 1/2 (neutral) or fClass 4/5 (pathogenic) (Figures 4A and 4D; Table S4). Nine variants that were classified as fClass 1/2 showed a PIF in favor of functional variant classification in our assay, except p.Glu2847Lys, which is classified as pathogenic in our assay (PIF 0.99) (Figure 5A). This variant is classified as intermediate in the homologous recombination (HR)-based functional assay with a probability of 0.343.19 All the variants, except p.Arg2842Leu, that were classified as fClass 4/5 by the MANO-B method showed a PIF in favor of non-functionality (Figures 4A, 4D, and 5A; Table S4). The variant p.Arg2842Leu is classified as functional in our assay (PIF 0.005) (Figure 5A). This variant is intermediate in the HR-based assay with a probability of 0.06 and complements cell survival and HR in the ESC-based assay.15,19 Eight variants (p.Ser1172Leu, p.Thr2337Ala, p.Tyr2905Cys, p.Gly2901Val, p.Asp2913His, p.Asp2913Val, p.Arg2842His, and p.Asp2900Val) that were classified as fClass 2/3 or fClass 3/4 showed a PIF in favor of neutrality in our assay (Figures 4A, 4D, and 5A; Table S4). Among these eight variants, p.Ser1172Leu and p.Arg2842His are classified as benign in ClinVar (Table 1).22 The p.Arg2842His variant is classified as benign using the HR assay.19 We could not compare the variants that were identified as intermediate by our functional assay because they were not tested by the MANO-B or the HR assay.

Figure 4.

Comparison of multiplexed mESC-based variant classification with other functional, multifactorial, and ClinVar classifications

(A) Comparison of variants that were classified using the MANO-B functional assay.12

(B) Homologous recombination (HR)-based assay11 and classification using the mESC-based assay.

(C) Plot showing the PIFs and classification using the mESC-based assay and classification of the variants in ClinVar. Overlapping dots are circled (dotted), and the number of dots in each circle is represented as n (numbers of variants on each circle). The dots that are not circled represent a single variant.

(D) Side-by-side comparison of mESC assay-based classification of variants that have other functional assay data published. We used the data from the mixed-all nominated-in-one-BRCA (MANO-B) method of classification, the mESC-based assay reported by Mesman et al.,15 and the HR assay-based classification.12,19

Figure 5.

Position of classified variants in BRCA2

(A) BRCA2 domains are marked with colored rectangles, with the amino acid (aa) position marked in parentheses. The variants located in different regions are marked below. OB, oligonucleotide/oligosaccharides binding; CTRB, C-terminal RAD51 binding.

(B) Position of the splicing variants analyzed in this study. Functional variants are marked in green, non-functionals in magenta, and indeterminants in saffron.

Among the selected variants, 15 variants with mutations in the DNA-binding domains of BRCA2 have been evaluated previously using the HR assay for prediction of pathogenicity.19 Nine of them (p.Leu2865Val, p.Phe2873Cys, p.Arg2842His, p.Glu2856Ala, p.Glu3152Lys, p.Tyr3098His, p.Ser3123Gly, p.Ala3088Val, and p.Tyr3092Cys) were classified as benign, three (p.Asn3124Ile, p.Leu3125His, and p.Asp3095Glu) as pathogenic, and three (p.Arg2842Leu, p.Tyr3092Ser, and p.Glu2847Lys) as intermediate using the HR assay. All variants classified as benign or pathogenic by the HR assay were classified in our NGS-based mouse ESC assay as functional or non-functional, respectively (Figures 4A, 4D, and 5A; Tables 1 and S4). Two intermediate variants, p.Arg2842Leu and p.Tyr3092Ser, with probabilities of 0.06 and 0.1, respectively, in the HR assay are classified as functional in our assay, with a PIF of 0.005 for p.Arg2842Leu and 0.016 for p.Tyr3092Ser (Figure 5A). Both variants rescued ESC lethality and complemented the HR defect in mESCs.15 The p.Glu2847Lys variant is classified as non-functional in our assay (Figure 5A; Table 1). Based on cell survival alone, p.Glu2847Lys is functional (HAT PIF 0.04). However, when the cell viability and drug sensitivity assays were combined, it was classified as non-functional (PIF 0.99). Seven of the selected variants (p.Tyr3092Ser, p.Arg2888Cys, p.Tyr3098His, p.Asn3124Ile, p.Asp3095Glu, p.Arg2842Leu, and p.Glu2856Ala) were analyzed by Mesman et al.15 using an mESC-based assay, and two of them (p.Asn3124Ile and p.Asp3095Glu), which failed to complement cell viability, are classified as non-functional in our assay (Figures 4D and 5A). The remaining five variants that were able to complement cell viability and supported HR in the mESC-based assay are classified as functional in our assay (Figure 5A; Table S4).15

All variants that were classified as pathogenic in ClinVar (p.Asp3095Glu, p.Leu3125His, c.8633-1G>A, c.8633-2A>G, c.8633-2A>T, c.8754+1G>A, c.8754+1G>T, and c.8754+2T>G) are classified as non-functional in our assay, except for p.Ala2911Glu, which has a PIF in favor of functionality (Figures 4C, 5A, and 5B; Table 1). This variant is classified as fClass 1/2 by the MANO-B method.12 ClinVar has classified this variant with no assertion criteria, and the classification was based on a single occurrence in a Fanconi anemia patient.12,23

Computational determination of prior probabilities of pathogenicity is based on the location of the mutated amino acid in the protein as well as the impact of the mutated nucleotide on splicing (database: http://hci-priors.hci.utah.edu/PRIORS/),24 and the in silico prediction tool BayesDel score combines multiple deleteriousness predictors to calculate an overall score.25 In BayesDel, the PIF criterion for benign for BRCA2 is PIF < 0.08, and the PIF criterion for pathogenic is PIF > 0.5.26 These comparisons show high agreement with the PIF values and functionality predictions obtained from our assay (Tables 1 and S4).

Discussion

There are more than 3,000 BRCA2 missense variants that are classified as VUS in ClinVar. It is important to classify these variants using multiple functional assays with high sensitivity and specificity to determine their functional impact. Several functional assays, including our mESC-based assays, have been developed to determine the functional impact of BRCA2 VUSs.12,13,15,19 The results of these functional assays have been used in computational models to calculate the PIF for the individual variant.9,12,19

Although we have used the mESC-based functional assays to functionally classify more than 150 BRCA2 variants, the process to generate the variants and examine their functional impact is laborious and time consuming. To expedite the process, we now multiplexed the assay using an NGS-based approach. We used the change in the variant sequence counts in the pools of variants to determine the effect of each variant on cell viability and sensitivity to cisplatin and olaparib. This is based on the change in the frequency of the sequence reads corresponding to each variant relative to their frequency prior to the HAT, cisplatin, and olaparib treatments. A significant reduction in the frequency after the treatments suggests that a loss of protein function is caused by the variant. By this approach, we were successful in simultaneously analyzing multiple variants located within the same exon of BRCA2. We also combined the results on cell viability and sensitivity to two different drugs. This is an advantage over other functional assays, where impact on cell viability, HR, or sensitivity to DNA-damaging drugs was evaluated.12,19,27 We previously reported a Bayesian hierarchical model that provided improved accuracy in predicting PIFs.9 The PIF calculation algorithm we developed in this study shares a number of important features with that model, such as the use of semi-supervised learning and mixtures of normal distributions as well as the idea of combining the mixtures that model the data obtained from different experimental assays. At the same time, our algorithm appears to be simpler, logically and computationally, than our previously used model.

It is to be noted that the mESCs expressing the variants were treated with drugs after HAT selection and that the variants that failed to rescue viability in HAT were excluded from the drug assay. The variants that survived HAT selection are the only ones that can provide information on the response to cisplatin and olaparib treatment. Calculation of PIF from combining multiple assays helps us to classify the variants with more confidence. We have observed previously that some variants surviving in HAT (0.05 < PIF ≤ 0.99) exhibited a wide range of drug sensitivities (none to extreme sensitivity), impacting their functional status obtained only from cell viability data.9 Here, we identify five variants (p.Pro2381Gln, p.Leu2904Phe, p.Leu2890Ile, p.Phe2356Leu, and p.Leu2898Val) as intermediate variants (0.05 < PIF ≤ 0.99), with more susceptibility to cisplatin or olaparib compared with HAT (Figure S1A; Table 1). All of these variants are in the DNA-binding domain of BRCA2, a region where most of the variants that affect BRCA2 function are located. However, no other functional assay data are available for these variants. Their intermediate status will be confirmed in the future based on data from different functional assays and/or epidemiological data.

In our assay, other than the nonsense variants, 16 missense variants had a PIF greater than 0.99. The defective function of p.Asp3095Glu, p.Leu3125His (PIF > 0.998), p.Asn3124Ile, c.8633-1G>A, c.8633-2A>G, c.8633-2A>T, c.8754+1G>A, c.8754+1G>T, and c.8754+2T>G is supported by multifactorial data or other functional assays.12,19,22,28 When we previously analyzed p.Asp3095Glu and p.Asn3124Ile using our mESC-based assay, they were classified as non-functional.9 This further supports that our approach of multiplexing can classify the variants as accurately as when we examine them individually. We found seven missense variants (p.Phe2349Leu, p.Leu3101Arg, p.Glu3152Gly, p.Gly2379Val, p.Gly2379Glu, p.Glu2369Asp, p.Tyr2383His, and p.Pro2883Ser) with PIFs in favor of non-functionality. Among those missense variants, p.Leu3101Arg is classified as fClass 4/5 using a functional assay based on complementation of PARPi sensitivity, supporting our data.12 There are no functional or multifactorial data available for the other variants. The missense variants that have an impact on BRCA2 function are all located in the DNA-binding domain, where many non-functional variants are located.

Among the 223 variants analyzed in this report, 65 are in the largest exon of BRCA2, exon 11. All BRC repeats involved in RAD51 binding are encoded by exon 11. We failed to identify any non-functional missense variants in this region, suggesting possible redundancy of function of BRC repeats (Figure 5A). Notably, two missense mutations located in BRC2 and BRC7 (p.Ser1221Pro and p.Thr1980Ile) have been reported recently to have a significant impact on BRCA2 function.29 These variants have not been evaluated by us or by other functional assays.

The results obtained by our sequencing-based multiplexing approach using our mESC-based functional assays are highly consistent with the International Agency for Research on Cancer (IARC) classification and other functional assays. We could not compare our data with CRISPR-based prime editing data because of the lack of overlap between the variants that were analyzed.10 There is a clear discrepancy between our results and ClinVar in the classification of the p.Ala2911Glu variant, which was found in a single Fanconi anemia (FA) patient.23 Our functional evaluation of the variant suggests that p.Ala2911Glu has no functional impact. We hypothesize that the variant contributing to the FA phenotype remains to be identified. Another BRCA2 variant, p.Lys2729Asn (c.8187G>T), was identified in an FA patient and was a pathogenic variant.23 Functional evaluations, including those by us, suggested this variant to be functional.7,11 Re-evaluation of the patient DNA and cDNA samples revealed the presence of another variant in cis in the 5′ untranslated region.30 This variant has been shown to reduce mRNA stability, thus affecting BRCA2 function.30 It will be interesting to find out whether there are other variants present in the FA patient that may be responsible for the FA phenotype. These findings demonstrate the importance of inclusion of functional data in evaluating the impact of BRCA2 variants to determine the risk of cancer for mutation carriers.

To keep pace with the rate at which new BRCA2 VUSs are being identified, high-throughput functional assays, such as the clustered regularly interspaced short palindromic repeats (CRISPR)-based saturation genome editing (SGE), are being developed. Although SGE has the potential to classify thousands of variants, the number of BRCA2 VUSs listed in ClinVar that have been analyzed by SGE so far is quite limited.10,27 Future SGE studies targeting the entire coding sequence of the gene will lead to the functional classification of more VUSs. However, VUS classification results using the data from a single functional assay are less likely to be used for patient risk assessment and clinical management. Having results from multiple well-established functional assays that have high sensitivity and specificity, as well as availability of epidemiological data and co-segregation and/or co-occurrence data, will together help with functional assessment of VUSs. In conclusion, we developed a multiplexing strategy for an established mESC-based functional assay that classifies BRCA2 variants with high sensitivity and specificity and can be used to expedite VUS classification in the future.

Limitations of study

In this study, BRCA2 variants were classified based on functional scores that were calculated using the DNA sequencing reads, which does not take into consideration the transcript levels of the variants in the pool. We did not demonstrate that our approach can be used to examine the impact of variants on splicing. RNA sequencing (RNA-seq) analysis using appropriate PCR primers and total RNA from cells in the initial pool and HAT pool, when combined with the DNA sequencing reads, can identify variants that affect splicing or mRNA stability.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Bacterial and virus strains

One shot Top10 competent cells Invitrogen Cat# C404003
E. coli SW102 National Cancer Institute NCI recombineering website: https://redrecombineering.ncifcrf.gov/

Chemicals, peptides, and recombinant proteins

Gelatin (0.1%) Stemcell Technologies Cat# 07903
Knockout Dulbecco’s Modified Eagle Medium (DMEM) Gibco Cat# 10829-018
FBS (Hi Performance) Gibco Cat# 16000-044
β-mercaptoethanol Sigma Cat# M3148
Glutamine Penicillin Streptomycin Thermo Fisher Scientific Cat# 10378016
Trypsin EDTA (0.5%), no phenol red Thermo Fisher Scientific Cat# 15400054
DPBS (1X) Gibco Cat# 14190-144
Dimethyl sulfoxide (DMSO) Millipore Sigma Cat# D2650
Cisplatin Sigma Aldrich Cat# P4394
Olaparib Selleckchem Cat# AZD2281
G418 Thermo Fisher Scientific Cat# 10131035
Ampicillin Sigma Aldrich Cat# A9393
Galactose Sigma Aldrich Cat# 48260
2-deoxy-galactose Sigma Aldrich Cat# D4407
Biotin Sigma Aldrich Cat# B4501
L-Leucine Sigma Aldrich Cat# L8000
Kanamycin Sigma Aldrich Cat# K1637
Glycerol VWR Cat# J61059.K2
(NH4)2SO4 VWR Cat# BDH9216
KH2PO4 VWR Cat# BDH9268
FeSO4⋅7H2O VWR Cat# 95033-256
MgSO4·7H2O VWR Cat# MK569124
KOH VWR Cat# BDH9262
Bacto-tryptone VWR Cat# 76628
Yeast extract VWR Cat# AAJ60287-36
NaCl VWR Cat# BDH9286
Water Thermo Fisher Scientific Cat# 10977015
KH2PO4 VWR Cat# BDH9268
Na2HPO4⋅7H2O VWR Cat# 95035-872
NH4Cl VWR Cat# BDH9208
Agarose VWR Cat# 97064

Antibodies

Rabbit polyclonal BRCA2 Bethyl Lab Cat # A303-434A-T-1, RRID: AB_3073617
Mouse monoclonal Vinculin antibody Santa Cruz Biotech Cat# sc25336, RRID: AB_628438
Goat anti-rabbit HRP conjugated Thermo Fisher Scientific Cat# 31460, RRID:AB_228341
Goat anti-mouse HRP conjugated Thermo Fisher Scientific Cat# 31430, RRID: AB_228307

Recombinant DNA

pGalK National Cancer Institute NCI recombineering website: https://redrecombineering.ncifcrf.gov/

Critical commercial assays

QIAprep Spin miniprep kit Qiagen Cat# 27104
Qiagen Plasmid Maxi kit Qiagen Cat# 12165
Mouse embryonic stem cell nucleofector kit Lonza Cat# VPH-1001
Zymo DNA isolation kit Zymo research Cat# D3020
Platinum Taq Hifi DNA Polymerase Invitrogen Cat# 11304011
QIAquick Spin columns Qiagen Cat# 28115
QIAquick PCR purification kit Qiagen Cat# 28106
One-tube RT-PCR kit Qiagen Cat# 210210
96-well RNA isolation kit Thermo Fisher Scientific Cat# 12173011A
ECL plus Western blotting detection system Amersham Cat# RPN2132
QIAquick gel extraction kit Qiagen Cat# 28704
TruSeq Nano DNA Library Prep Kit Illumina Cat# 20015965

Experimental models: Cell lines

Mouse embryonic stem cells (Brca2Cko/-) [Clone: PL2F7] Sharan lab Kuznetsov et al., 2008
SNLP mouse feeders Sharan lab Kuznetsov et al., 2008

Oligonucleotides

Ex11d-Forward ATACCTTGGCATTAGATAATC Integrated DNA Technologies N/A
Ex11d-Reverse CAACTGTACCTTCAAATTGC Integrated DNA Technologies N/A
Ex11c-Forward CAGACTTGACTTGTGTAAACGAACC Integrated DNA Technologies N/A
Ex11c-Reverse GTATTAATTGACTGAGGCTTGC Integrated DNA Technologies N/A
Exon 14-forward AGAATAGTATCACCATGTAGC Integrated DNA Technologies N/A
Exon 14-reverse AAGACTTTGGTTGGTCTGCC Integrated DNA Technologies N/A
Exon 20-Forward cgaactcctgacctcaggtgatcc Integrated DNA Technologies N/A
Exon 20-Reverse ggcttagacctgatatttctgtccc Integrated DNA Technologies N/A
Exon21-Forward CTTTGGGTGTTTTATGCTTGG Integrated DNA Technologies N/A
Exon21-Reverse ATCAAGCCTCATTATATGTCC Integrated DNA Technologies N/A
Exon 25-Forward CATCTAACACATCTATAATAACATTC Integrated DNA Technologies N/A
Exon 25-Reverse GTGGTGATGCTGAAAAGTAACC Integrated DNA Technologies N/A
Exon 11-RT-Fwd TGGTTTTGTCAAATTCAAGAATTGG Integrated DNA Technologies N/A
Exon 14 RT-Rev CCAATCAAGCAGTAGCTGTAACTTTCAC Integrated DNA Technologies N/A

Deposited data

NGS sequencing data This paper Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE244578
Statistical analysis Code This paper Code: https://zenodo.org/record/8160202

Software and algorithms

Genomic data analysis, statistical analysis R studio Software: https://posit.co/download/rstudio-desktop/
Sanger sequencing analysis 4Peaks Software: https://nucleobytes.com/4peaks/
Genome editing analysis Crispresso2 Tools: http://crispresso2.pinellolab.org/submission
Variant caller ANNOVAR Database: https://annovar.openbioinformatics.org/en/latest/#reference
Excel Microsoft Software: https://www.microsoft.com/en-us/microsoft-365/excel
Acrobat Adobe Software: https://www.adobe.com/acrobat.html
Illustrator Adobe Software: https://www.adobe.com/products/illustrator.html

Other

Vac-Man vacuum manifold Promega Cat# PR-A7231
Micropulser electroporator Bio-Rad Cat# 1652100
Gene Pulser Xcell Electroporation Systems Bio-Rad Cat# 1652660

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Shyam K. Sharan (sharans@mail.nih.gov).

Materials availability

All cell lines generated in this study are available upon request.

Experimental model and study participant details

PL2F7 mouse embryonic stem cells are a derivative of AB2.2 cells (male)13 and were maintained in Knockout DMEM supplemented with 15% FBS, 1X GPS, and 0.0072 μl/ml β-mercaptoethanol at 37°C and 5% CO2.

Method details

Variant nomenclature and in silico analysis

HGVS nomenclature for cDNA and proteins was followed, in which cDNA position +1 corresponds to the A of the ATG initiation codon in the BRCA2 sequence (GenBank accession number NM_000059.3). BayesDel scores were obtained from the BayesDel database (https://fengbj-laboratory.org/BayesDel). Variants with a high BayesDel score are predicted to be non-functional, and those with a low score are predicted to be functional. The prior probabilities of pathogenicity for the variants were obtained from the HCI database (http://priors.hci.utah.edu/PRIORS).

BRCA2 variant expressing mESC generation

All selected variants of an exon were generated using PCR by amplifying about 400 bp of exon or exons with adjacent intron sequences to make a pool of variant DNA. The pooled DNA was used to introduce the variants into the BAC clone (CTD-2342K5 with a 127 kb insert) containing full-length BRCA2 in SW102 cells by a recombineering method as described previously.7,15 After recombineering, BACs were sequenced to confirm the presence of the correct variant and absence of any undesired mutations. Oligonucleotide sequences are available upon request.

BAC DNA (20 μg) carrying various mutant alleles of BRCA2 was electroporated into 1.0 × 10⁷ PL2F7 mESCs, which were then selected in the presence of G418 (Invitrogen) and characterized as described previously.9 Each BAC carrying a single BRCA2 variant was electroporated and selected individually to ensure that the ES cells harbor a single mutant BAC clone. For selecting the BRCA2-expressing clones, protein isolation and Western blotting were carried out as described before.9 Rabbit polyclonal BRCA2 antibody (recognizes an epitope between residues 450–500; Bethyl Lab, Cat# A303-434A-T-1, 1:2000 dilution) and mouse monoclonal Vinculin antibody (Santa Cruz Biotech, Cat# sc25336, 1:200,000 dilution) were used to detect proteins. The ECL plus Western blotting detection system (Amersham) was used for chemiluminescent detection. To confirm BRCA2 expression for intronic and nonsense variants, RT-PCR was performed using the One-tube RT-PCR kit (Qiagen) with primers from exon 11 (5′-TGGTTTTGTCAAATTCAAGAATTGG-3′) and exon 14 (5′-CCAATCAAGCAGTAGCTGTAACTTTCAC-3′) following the manufacturer’s protocol. DNA from BAC-containing mESC clones expressing human BRCA2 was sequenced to further confirm the presence of the correct variant without any undesired mutation.

Multiplexed functional analysis

Ten to twenty-five BAC-containing PL2F7 ESC clones, each expressing a single BRCA2 variant, were pooled and cultured for two passages. The variants in each pool were from the same exon and were located within a span of ∼300 bp. Two such pools were generated using two independent clones expressing each BRCA2 variant. The conditional allele of Brca2 in mESCs was deleted by electroporating 20 μg of Pgk-Cre plasmid as described previously.9 Two independent electroporations were carried out for each pool, and the resulting cells were subjected to the same treatment at subsequent steps. After electroporation, 7.5 × 10⁶ cells were subjected to HAT selection as described previously.9 An equal number of unelectroporated cells were cultured without drug selection. This sample served as the M15 control. Cells were collected for genomic DNA isolation after HAT selection and for the M15 control. One million HAT-selected cells were subjected to drug sensitivity testing. For drug selection, 0.4 μM cisplatin and 0.05 μM PARP inhibitor (olaparib) were used for five days, with re-feeding of drug-containing media on days 3 and 5. After drug selection, cells were collected for genomic DNA isolation.

The genomic DNA obtained from the collected cell samples was used for PCR amplification of the respective exons of BRCA2 using Invitrogen Platinum High-fidelity Taq polymerase according to the manufacturer’s protocol. Ten PCR reactions were carried out for each sample, and the PCR products were pooled and purified using the Qiagen PCR purification kit. For each batch of variants, sixteen samples were used [2 pools (biological replicates) × 2 Pgk-Cre electroporations (technical replicates) × 4 treatments (M15, HAT, Cis, Ola)] for deep sequencing of the respective exons by Illumina paired-end sequencing on the MiSeq sequencer (2 × 300 cycles), allocating ∼3 million reads to each sample to quantify the relative abundance of each variant under the different conditions. Libraries for next-generation sequencing were prepared using the TruSeq Nano DNA Library Prep Kit from Illumina according to the manufacturer’s protocol. The quantity and quality of the libraries were evaluated using a Qubit 2.0 fluorometer (Thermo Fisher Scientific) and an Agilent 2200 TapeStation system.
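The sixteen-sample layout per batch follows directly from the 2 × 2 × 4 design. A short sketch that enumerates it is given here; the sample labels are hypothetical placeholders, not the identifiers used in the deposited data.

```python
from itertools import product

# Hypothetical labels for the design factors described in the text.
pools = ["pool1", "pool2"]                    # two pools of independent clones
electroporations = ["cre1", "cre2"]           # two Pgk-Cre electroporations per pool
treatments = ["M15", "HAT", "Cis", "Ola"]     # four conditions

# One sequencing sample per combination of pool, electroporation, and treatment.
samples = [
    f"{pool}_{cre}_{treatment}"
    for pool, cre, treatment in product(pools, electroporations, treatments)
]

print(len(samples))  # 16
```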

Sequencing data analysis

The paired-end reads were aligned to the reference sequence using the Needleman-Wunsch alignment algorithm after the reads were demultiplexed using bcl2fastq (Illumina). The fastq files were merged using FLASH, and CRISPResso2 was used for quantifying the total number of aligned reads.31 The individual unique alignments were annotated with a custom-made variant caller modified from ANNOVAR (release 2019-10-21) and the R/Bioconductor Biostrings package in R version 4.1.2 (Software: https://bioconductor.org/packages/Biostrings).31,32 Merged reads containing “N” bases and any insertions or deletions were removed from the analysis. The abundances of SNVs were quantified when the reads contained a single-nucleotide substitution and no additional mutations or deletions in the sequencing read. Read counts for each SNV were then normalized to the total read coverage of the sequencing library after adding a pseudo-count of 1 to all reads for all conditions. Individual variants with a read count of more than 1 in 1,000 reads in the M15 condition were used for further analysis. Dropout or enrichment scores were calculated by taking the ratio of the frequency of each SNV after HAT, cisplatin, or olaparib treatment over that in M15. The scores were expressed in log2 scale, which we define as the functional scores of SNVs in HAT, cisplatin, and olaparib. The functional scores, averaged between 2 or 4 independent replicates, were used to calculate the probabilities of impact on function (PIF) for all the variants in the dataset.
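The read-filtering rule described above (discard reads with “N” bases or indels, and count a read only if it carries exactly one substitution) can be sketched as follows. This is a simplified illustration, not the CRISPResso2/ANNOVAR-based pipeline used in the study: it assumes that each merged read has already been aligned and trimmed to the amplicon, so that a length difference from the reference signals an insertion or deletion. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Snv = Tuple[int, str, str]  # (1-based position, reference base, alternate base)


def single_substitution(read: str, reference: str) -> Optional[Snv]:
    """Return the SNV if the read differs from the reference by exactly one
    substitution; return None for wild-type, multi-mutant, or rejected reads."""
    if len(read) != len(reference):  # assumed to indicate an insertion or deletion
        return None
    if "N" in read:                  # ambiguous base calls are discarded
        return None
    mismatches = [
        (i + 1, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]
    return mismatches[0] if len(mismatches) == 1 else None


def count_snvs(reads: Iterable[str], reference: str) -> Counter:
    """Count reads carrying exactly one single-nucleotide substitution."""
    counts: Counter = Counter()
    for read in reads:
        snv = single_substitution(read, reference)
        if snv is not None:
            counts[snv] += 1
    return counts


# Toy example with a made-up 8-bp reference.
reference = "ACGTACGT"
reads = [
    "ACGTACGT",  # wild-type: not counted
    "ACGAACGT",  # one substitution (T>A at position 4)
    "ACGAACGT",  # same SNV again
    "ACNTACGT",  # contains N: discarded
    "ACGTACG",   # shorter than the reference: discarded
    "TCGAACGT",  # two substitutions: discarded
]
snv_counts = count_snvs(reads, reference)
```

In this toy example only the two reads carrying T>A at position 4 contribute to the counts.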

Calculation of functional scores

The abundance of each variant was calculated using only sequencing reads that aligned completely with the BRCA2 reference sequence, apart from the single-nucleotide variation. The frequency of each variant was calculated as the number of variant reads divided by the total number of aligned reads. Variants that had a frequency of at least 1 in 1,000 in the M15 sample were subjected to further analysis. Functional scores were calculated separately for the HAT, cisplatin, and olaparib treatments as the log2 ratio of the variant frequency in the treated sample to the average frequency in the M15 samples of the two biological replicates (two independent Pgk-Cre deliveries).
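The frequency and log2-ratio steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline (which used CRISPResso2, ANNOVAR, and R). The variant names and read counts are hypothetical, and two details are assumptions: the pseudo-count is added to each variant's read count before dividing by the total, and the 1-in-1,000 cutoff is applied to the mean M15 frequency.

```python
import math

def variant_frequencies(counts, total_reads, pseudo_count=1):
    """Frequency of each variant: (reads + pseudo-count) / total aligned reads."""
    return {v: (n + pseudo_count) / total_reads for v, n in counts.items()}

def functional_scores(treated_freq, m15_freqs, min_m15_freq=1e-3):
    """log2(treated / mean M15) for variants at or above the M15 abundance cutoff."""
    scores = {}
    for variant, freq in treated_freq.items():
        m15_mean = sum(f[variant] for f in m15_freqs) / len(m15_freqs)
        if m15_mean >= min_m15_freq:  # assumed: cutoff applied to the mean M15 frequency
            scores[variant] = math.log2(freq / m15_mean)
    return scores

# Hypothetical read counts (not data from the study), 1,000,000 aligned reads each.
total = 1_000_000
m15_rep1 = variant_frequencies({"var_A": 4999, "var_B": 2999, "var_C": 1}, total)
m15_rep2 = variant_frequencies({"var_A": 4999, "var_B": 2999, "var_C": 3}, total)
hat = variant_frequencies({"var_A": 4999, "var_B": 374, "var_C": 0}, total)

hat_scores = functional_scores(hat, [m15_rep1, m15_rep2])
for name, score in sorted(hat_scores.items()):
    print(f"{name}: {score:.2f}")
```

In this toy input, var_A keeps its frequency (score near 0), var_B drops eightfold (score near −3), and var_C is excluded because it falls below the M15 abundance cutoff.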

Quantification and statistical analysis

Statistical methods and PIF calculation

The functional scores from each of the three assays (HAT, cisplatin, and olaparib) were regarded as statistical samples (with one measurement for a given assay and a given BRCA2 variant constituting one random data point) and were analyzed using statistical and machine-learning methods to calculate PIF for the variants. For each assay, the distributions of the replicate-averaged and log-transformed functional scores were tested for normality using normal quantile-quantile plots; such plots were generated separately for benign and pathogenic BRCA2 variants. Moreover, each of the six statistical samples (3 assays × 2 ClinVar-classification phenotypes) was tested for normality using the Lilliefors test. The details of the statistical models and algorithms for PIF calculation are described below.

PIF-calculation algorithm: The main (semi-supervised-learning) version

For each of the three assays (HAT, cisplatin, and olaparib), the functional-score distribution characterizing the entire dataset (n=223, the total number of BRCA2 variants analyzed) was modeled as a mixture of two normal distributions. The two mixture components represented pathogenic and benign BRCA2 variants, respectively. The distributions were fit to the functional-score data by numerical likelihood maximization (implemented as negative log likelihood minimization in the code).33

Before the fitting, each mixture model was initialized by assigning equal weights of 0.5 to the mixture components (i.e., the prior probability of pathogenicity was set to 0.5), and the components’ means and variances were set to the sample means and variances estimated from the corresponding data subset [characterizing either benign (n=16) or pathogenic (n=27) BRCA2 variants]. For the fitting, we used the R-language optim() function with typical parameters (the Nelder-Mead optimization algorithm with a relative tolerance of $1.0 \times 10^{-8}$).

The semi-supervised-learning nature of our method was reflected in the structure of the likelihood function, which contained information on BRCA2 variants with a known phenotype (i.e., pathogenic or benign), as well as variants of uncertain significance (VUS). Let $f(x, m, v)$ denote the normal probability density with mean $m$ and variance $v$. Then, the two-component mixture in our case has the general form,

$$g(x, p, m_p, v_p, m_b, v_b) = p\,f(x, m_p, v_p) + (1 - p)\,f(x, m_b, v_b),$$

where $p$ is the prior probability of pathogenicity, $m_p$ and $v_p$ are the mean and variance of the mixture component for the pathogenic variants, and $m_b$ and $v_b$ are the mean and variance of the mixture component for the benign variants. Using this notation, the likelihood, $l(\cdot)$, can be written as follows:

$$l(p, m_p, v_p, m_b, v_b \mid x) = \prod_{i=1}^{N_p} p\,f(x_i^{(p)}, m_p, v_p) \times \prod_{i=1}^{N_b} (1 - p)\,f(x_i^{(b)}, m_b, v_b) \times \prod_{i=1}^{N_u} g(x_i^{(u)}, p, m_p, v_p, m_b, v_b),$$

where $x$ is the vector of the functional scores of the variants in the training set, $x = (x_i^{(p)}, x_i^{(b)}, x_i^{(u)})$; $x_i^{(p)}$, $x_i^{(b)}$, and $x_i^{(u)}$ are the functional scores for pathogenic, benign, and VUS variants, respectively; $N_p$, $N_b$, and $N_u$ are the numbers of pathogenic, benign, and VUS variants, respectively, in the training set.
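As an illustration of the structure of this likelihood, the following Python sketch evaluates its negative logarithm for a given parameter vector. It is not the authors' R code; the function names and the toy scores are hypothetical, and the sketch assumes that no density underflows to zero. A function of this kind could be passed to a general-purpose minimizer such as a Nelder-Mead routine.

```python
import math

def normal_pdf(x, mean, var):
    """Normal density f(x, m, v) with mean m and variance v."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, p, mp, vp, mb, vb):
    """Two-component mixture g: p * pathogenic density + (1 - p) * benign density."""
    return p * normal_pdf(x, mp, vp) + (1.0 - p) * normal_pdf(x, mb, vb)

def neg_log_likelihood(params, x_path, x_benign, x_vus):
    """Negative log of the semi-supervised likelihood for labeled and VUS scores."""
    p, mp, vp, mb, vb = params
    if not (0.0 < p < 1.0) or vp <= 0.0 or vb <= 0.0:
        return math.inf  # reject parameter values outside the valid range
    nll = 0.0
    for x in x_path:      # known pathogenic: pathogenic component only
        nll -= math.log(p * normal_pdf(x, mp, vp))
    for x in x_benign:    # known benign: benign component only
        nll -= math.log((1.0 - p) * normal_pdf(x, mb, vb))
    for x in x_vus:       # unlabeled: full mixture
        nll -= math.log(mixture_pdf(x, p, mp, vp, mb, vb))
    return nll

# Hypothetical functional scores (not data from the study).
x_path, x_benign, x_vus = [-4.0, -3.5], [0.1, -0.2], [-3.8, 0.0]
nll_plausible = neg_log_likelihood((0.5, -3.75, 0.5, 0.0, 0.5), x_path, x_benign, x_vus)
nll_swapped = neg_log_likelihood((0.5, 0.0, 0.5, -3.75, 0.5), x_path, x_benign, x_vus)
print(nll_plausible < nll_swapped)
```

Passing an empty list for `x_vus` drops the third product, which gives the likelihood restricted to variants with a known phenotype.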

The parameters of the two-component mixture, including the prior probability of pathogenicity $p$, were obtained by maximizing the likelihood as described at the beginning of this subsection. This fitting strategy was applied independently to the data characterizing each of the three assays (HAT, cisplatin, and olaparib). The distributions obtained as the result of the fitting were used to calculate assay-specific probabilities of pathogenicity (denoted $\mathrm{PP}_i$) for individual BRCA2 variants using the Bayes formula:

$$\mathrm{PP}_i = \frac{p\,f(x_i, m_p, v_p)}{g(x_i, p, m_p, v_p, m_b, v_b)},$$

where $x_i$ is the functional score of the variant $i$, and the distribution-parameter values on the right-hand side (i.e., $p, m_p, v_p, m_b, v_b$) are the ones maximizing the likelihood. Importantly, this formula was also used to calculate $\mathrm{PP}_i$ in the case when the model trained on one dataset was applied to another dataset, here termed the target set (such as in cross-validation and bootstrapping; see below). In such a case, the variable values (i.e., the functional scores for HAT, cisplatin, and olaparib) were taken from the target set, whereas the probability-density parameters were taken from the distributions fitted to the training set.
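The Bayes step can be illustrated with a short Python sketch. The parameter values below are hypothetical stand-ins for fitted values, not estimates from the study, and the function names are ours.

```python
import math

def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def prob_pathogenic(x, p, mp, vp, mb, vb):
    """Posterior probability that a variant with functional score x is pathogenic."""
    pathogenic_part = p * normal_pdf(x, mp, vp)
    benign_part = (1.0 - p) * normal_pdf(x, mb, vb)
    return pathogenic_part / (pathogenic_part + benign_part)

# Hypothetical fitted parameters: prior 0.3, pathogenic N(-4, 1), benign N(0, 1).
fitted = dict(p=0.3, mp=-4.0, vp=1.0, mb=0.0, vb=1.0)

pp_low_score = prob_pathogenic(-5.0, **fitted)   # strongly depleted variant
pp_midpoint = prob_pathogenic(-2.0, **fitted)    # equidistant from both means
pp_neutral = prob_pathogenic(1.0, **fitted)      # score typical of benign variants
print(pp_low_score, pp_midpoint, pp_neutral)
```

At the midpoint between two components of equal variance the two densities cancel, so the posterior equals the prior; this is a convenient sanity check on any implementation.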

For each of the three assays, we denote the $\mathrm{PP}_i$ by $\mathrm{PP}_i^{(\mathrm{HAT})}$, $\mathrm{PP}_i^{(\mathrm{Cis})}$, and $\mathrm{PP}_i^{(\mathrm{Ola})}$, respectively, where $i$ is an index marking individual BRCA2 variants in the target dataset ($i = 1, \ldots, 223$ in the case when the target set is our full dataset; the cases of target subsets are treated in a similar way). The $\mathrm{PP}_i$ were used to calculate the probabilities of impact on function ($\mathrm{PIF}_i$) as follows:

$$\mathrm{PIF}_i = \mathrm{PP}_i^{(\mathrm{HAT})} + \left(1 - \mathrm{PP}_i^{(\mathrm{HAT})}\right)\mathrm{PP}_i^{(\mathrm{Cis})}\,\mathrm{PP}_i^{(\mathrm{Ola})}.$$ (Equation 1)

This is a heuristic formula motivated by the total probability formula, and it can be interpreted as follows. If we know that $\mathrm{PP}_i^{(\mathrm{HAT})}$ is large (i.e., close to 1), then it is likely that the variant $i$ is pathogenic, i.e., $\mathrm{PIF}_i$ should also be large. Alternatively, if $\mathrm{PP}_i^{(\mathrm{HAT})}$ is rather small (i.e., close to 0), then the variant $i$ will still likely be pathogenic if both $\mathrm{PP}_i^{(\mathrm{Cis})}$ and $\mathrm{PP}_i^{(\mathrm{Ola})}$ are large. The formula in Equation 1 allows us to define $\mathrm{PIF}_i$ in a way that reflects this logic and also ensures that the $\mathrm{PIF}_i$ lie between 0 and 1, as properly defined probabilities should.

Equation 1 is the PIF formula used in the main version of our algorithm. Additionally, we considered algorithm versions where the $\mathrm{PIF}_i$ were calculated using the data from only one of the three assays: HAT, cisplatin, or olaparib. In those cases, the $\mathrm{PIF}_i$ were set to be equal to $\mathrm{PP}_i^{(\mathrm{HAT})}$, $\mathrm{PP}_i^{(\mathrm{Cis})}$, or $\mathrm{PP}_i^{(\mathrm{Ola})}$, respectively.
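The behavior of the combined PIF formula at its limiting cases can be checked with a minimal Python sketch. The function name is ours and the input probabilities are arbitrary illustrative values.

```python
def combine_pif(pp_hat, pp_cis, pp_ola):
    """Combined PIF: the HAT probability plus the remaining probability mass
    weighted by the product of the cisplatin and olaparib probabilities."""
    return pp_hat + (1.0 - pp_hat) * pp_cis * pp_ola

# Limiting cases described in the text (illustrative inputs only).
print(combine_pif(1.0, 0.0, 0.0))   # HAT alone indicates pathogenicity
print(combine_pif(0.0, 1.0, 1.0))   # both drug assays indicate pathogenicity
print(combine_pif(0.0, 0.9, 0.0))   # only one drug assay is high
print(combine_pif(0.5, 0.5, 0.5))   # intermediate evidence in all three assays
```

Because each input lies in [0, 1], the second term never exceeds the remaining mass, so the result stays within [0, 1].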

PIF-calculation algorithm: The supervised-learning version

We compared our main, semi-supervised-learning version of the algorithm (described above) with a supervised-learning version to assess the relative advantages of semi-supervised learning in the context of PIF estimation. The difference between the semi-supervised-learning and supervised-learning algorithm versions was that, in the latter, the likelihood function incorporated the data only from the BRCA2 variants for which the phenotype was known (i.e., only the pathogenic and benign variants; no VUS). Specifically, the supervised-learning likelihood function had the form:

$$l(p, m_p, v_p, m_b, v_b \mid x) = \prod_{i=1}^{N_p} p\,f(x_i^{(p)}, m_p, v_p) \times \prod_{i=1}^{N_b} (1 - p)\,f(x_i^{(b)}, m_b, v_b),$$

where the notation is the same as above.
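A useful property of the supervised likelihood is that its maximum has a closed form: the prior is the fraction of labeled variants that are pathogenic, and each component's mean and variance are the maximum-likelihood sample estimates for that group. The Python sketch below computes these values as a cross-check on a numerical fit. This is a standard result that we add for illustration; the analysis described here used numerical optimization, and the scores below are hypothetical.

```python
def supervised_mle(x_path, x_benign):
    """Closed-form maximizer of the supervised likelihood: (p, mp, vp, mb, vb)."""
    def mean(values):
        return sum(values) / len(values)

    def variance_mle(values, center):
        # Maximum-likelihood variance divides by N rather than N - 1.
        return sum((v - center) ** 2 for v in values) / len(values)

    p = len(x_path) / (len(x_path) + len(x_benign))
    mp, mb = mean(x_path), mean(x_benign)
    return p, mp, variance_mle(x_path, mp), mb, variance_mle(x_benign, mb)

# Hypothetical functional scores (not data from the study).
p, mp, vp, mb, vb = supervised_mle([-5.0, -4.0, -3.0], [-1.0, 0.0, 1.0, 0.0])
print(p, mp, vp, mb, vb)
```

No such closed form exists once the VUS mixture terms are included, which is one reason a numerical optimizer is needed for the semi-supervised version.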

PIF-based variant-phenotype prediction and algorithm-accuracy assessment

From the calculated $\mathrm{PIF}_i$, the phenotype (pathogenic or benign) for each BRCA2 variant, $i$, was predicted using the thresholds accepted in the BRCA2 research community: $\mathrm{PIF}_i \le 0.05$ indicates a benign BRCA2 variant, $\mathrm{PIF}_i > 0.99$ indicates a pathogenic variant, and all other cases are considered indeterminate.9 When applied to the full dataset (or to its subsets), the classification accuracy of our algorithm was calculated as the percentage of the BRCA2 variants with a known phenotype (i.e., pathogenic or benign) whose phenotype was predicted correctly from the algorithm-generated $\mathrm{PIF}_i$. Predictive accuracy of the algorithm on the full dataset (n=223) was assessed using K-fold cross-validation with $K = 3, 6, 9, 43$. For each variant, the PIF used in the K-fold cross-validation accuracy assessment was calculated, and compared to the known benign/pathogenic labels, only for the partition in which this variant was in the cross-validation test set (i.e., the set not used for model fitting in that cross-validation round). For each K, the folds were defined by (randomized) partitioning of the data subset containing only the variants with a known phenotype. Because the number of known-phenotype BRCA2 variants in our full dataset was 43, 43-fold cross-validation was equivalent to leave-one-out cross-validation. The accuracy of K-fold cross-validation was calculated as the classification accuracy on each of the K folds, averaged across all the folds.
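The thresholding and accuracy calculation can be sketched in Python as follows. Two points are assumptions made for illustration: the benign cutoff is treated as inclusive (PIF of 0.05 or less), and an indeterminate call for a variant with a known phenotype is counted as incorrect. The PIF values and labels are hypothetical.

```python
def classify(pif, benign_max=0.05, pathogenic_min=0.99):
    """Map a PIF value to a predicted phenotype using the two thresholds."""
    if pif <= benign_max:       # assumed inclusive benign cutoff
        return "benign"
    if pif > pathogenic_min:
        return "pathogenic"
    return "indeterminate"

def classification_accuracy(pifs, known_labels):
    """Percentage of known-phenotype variants whose phenotype is predicted correctly."""
    correct = sum(classify(p) == label for p, label in zip(pifs, known_labels))
    return 100.0 * correct / len(known_labels)

# Hypothetical PIF values and known labels (not data from the study).
pifs = [0.999, 0.50, 0.01, 0.05]
labels = ["pathogenic", "pathogenic", "benign", "benign"]

calls = [classify(p) for p in pifs]
accuracy = classification_accuracy(pifs, labels)
print(calls, accuracy)
```

Here the second variant is known to be pathogenic but receives an indeterminate call, so three of the four variants are counted as correct.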

While the commonly used values for K include 5 and 10, our choice of K represented a spectrum of possible K values, intended to generate a more complete cross-validation picture.34 We used a generalized cross-validation algorithm, in which the integer K did not have to be a divisor of 43 (indeed, 43 is a prime). In that generalization, we set the fold size, $F(K)$, to $[43/K]$ (here, the brackets denote the integer part of a number) and, for each K, the cross-validation training set consisted of $F(K) \cdot K$ randomly selected variants with a known phenotype.

For the semi-supervised-learning algorithm version, the datasets for each of the $K - 1$ folds used for training included all the VUS, which were appended to the pathogenic and benign variants (that were randomly selected for that fold) and used together in the training procedure; in the remaining fold, which was used for testing, VUS were not included. That way, for each K, the training data were completely separated from the test data. For the supervised-learning algorithm version, the VUS were not used in the cross-validation procedure.
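The fold construction can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' R implementation: variant identifiers are represented by hypothetical integer indices, the VUS are appended once to the pooled training folds, and the function names are ours.

```python
import random

def make_folds(labeled_ids, k, rng):
    """Draw F(K) * K labeled variants at random and split them into K equal folds."""
    fold_size = len(labeled_ids) // k              # F(K) = integer part of N / K
    chosen = rng.sample(labeled_ids, fold_size * k)
    return [chosen[i * fold_size:(i + 1) * fold_size] for i in range(k)]

def train_test_splits(labeled_ids, vus_ids, k, rng, use_vus=True):
    """Yield (train, test) pairs; VUS join the training data only."""
    folds = make_folds(labeled_ids, k, rng)
    for i, test in enumerate(folds):
        train = [v for j, fold in enumerate(folds) if j != i for v in fold]
        if use_vus:                                # semi-supervised version
            train = train + list(vus_ids)
        yield train, test

# Hypothetical identifiers: 43 known-phenotype variants and 180 VUS.
labeled_ids = list(range(43))
vus_ids = list(range(43, 223))

fold_sizes = {k: len(make_folds(labeled_ids, k, random.Random(0))[0]) for k in (3, 6, 9, 43)}
splits = list(train_test_splits(labeled_ids, vus_ids, 6, random.Random(0)))
print(fold_sizes, len(splits))
```

With 43 labeled variants, the fold sizes for K = 3, 6, 9, and 43 are 14, 7, 4, and 1, so one, one, seven, and zero labeled variants, respectively, are left out of a given randomized partition. Setting `use_vus=False` gives the supervised variant.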

PIF confidence intervals via bootstrapping

We calculated 95% confidence intervals for the $\mathrm{PIF}_i$ using bootstrapping. Bootstrapping is a common resampling strategy that allows one to quantify uncertainty in the output of a statistical model. It involves random sampling with replacement from a given dataset, which yields a bootstrapped dataset.34 Ten thousand bootstrapped datasets were generated from the full experimental dataset (n=223). To preserve the structure of the dataset, the pathogenic (n=27), benign (n=16), and VUS (n=180) variants were bootstrapped independently. We then fitted our statistical model (i.e., the combined mixtures of two normal distributions described above) to each of the bootstrapped datasets using the semi-supervised-learning approach, and then applied the fitted model to our full dataset. Thus, for each $i$ ($i = 1, \ldots, 223$), we had a sample of 10,000 values of $\mathrm{PIF}_i$, from which we calculated the confidence interval for $\mathrm{PIF}_i$ using the percentile method.35 The confidence intervals for the $\mathrm{PIF}_i$ from our supervised-learning approach were calculated in a similar way, but VUS were not used in the bootstrap.
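The resampling scheme can be sketched in Python as follows. The sketch is illustrative only: the scores are hypothetical, the percentile rule is a simple nearest-rank variant (other interpolation conventions exist), and `toy_statistic` is a stand-in for the full step of refitting the mixture model and recomputing a PIF.

```python
import math
import random

def stratified_bootstrap(groups, rng):
    """Resample each group with replacement, keeping every group size fixed."""
    return {name: rng.choices(values, k=len(values)) for name, values in groups.items()}

def percentile_ci(samples, level=0.95):
    """Percentile-method interval using a simple nearest-rank rule."""
    ordered = sorted(samples)
    alpha = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[math.floor(alpha * last)], ordered[math.ceil((1.0 - alpha) * last)]

# Hypothetical functional scores per group (not data from the study).
groups = {
    "pathogenic": [-4.2, -3.9, -3.5, -4.8],
    "benign": [0.1, -0.2, 0.3],
    "vus": [-3.7, 0.0, -1.5, 0.2, -4.1],
}

def toy_statistic(sample):
    """Stand-in for 'refit the model and compute a PIF': mean pathogenic score."""
    values = sample["pathogenic"]
    return sum(values) / len(values)

rng = random.Random(0)
replicates = [toy_statistic(stratified_bootstrap(groups, rng)) for _ in range(1000)]
ci_low, ci_high = percentile_ci(replicates)
print(ci_low, ci_high)
```

Resampling each group separately keeps the numbers of pathogenic, benign, and VUS variants constant across bootstrapped datasets, which is the structure-preserving property described above.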

Software and hardware implementation of the computational procedures

The PIF calculation and analysis code was written in R 4.0.2 (2020-06-22), using RStudio 1.3.1073, and was developed and run on a Dell Latitude 7400 laptop computer with an Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz processor and 16 GB RAM under the 64-bit Windows 10 Enterprise 20H2 operating system.

Acknowledgments

We thank members of the Sharan Laboratory for helpful discussions and suggestions. We thank Dr. Edwin Iversen (Duke University) for helpful discussions during the initial stages of data analysis and development of statistical models. We thank Dr. Elizabeth Conner from the CCR Genomic Core for Sanger Sequencing and Bao Tran and Jyoti Shetty from the CCR sequencing facility for library preparation and NGS. The graphical abstract contains some images obtained through a paid subscription to BioRender. This research is supported by the Intramural Research Program, Center for Cancer Research, National Cancer Institute, US National Institutes of Health (to S.K.S.). The content of this publication does not necessarily reflect the views or policies of the US Department of Health and Human Services or of the National Institutes of Health.

Author contributions

K.B. and S.K.S. conceptualized the idea. K.B., T.S., E.S., S.R., S.N., J.S., T.S., M.R.-T., M.K., A.B., S. Stauffer, and L.C. performed the experiments. S. Sahu, D.N., and M.T. performed computational analyses. A.Y.M. developed the PIF calculation algorithm and performed statistical analyses. T.M. contributed ideas and led initial discussions pertaining to the development of the PIF calculation algorithm. K.B., A.Y.M., and S.K.S. wrote the manuscript. S.K.S. supervised the work. All authors reviewed and edited the manuscript.

Declaration of interests

The authors declare no competing interests.

Inclusion and diversity

We support inclusive, diverse, and equitable conduct of research.

Published: November 2, 2023

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.crmeth.2023.100628.

Supplemental information

Document S1. Figures S1–S7
mmc1.pdf (1.2MB, pdf)
Table S1. Number of replicates used for functional score calculation of each variant, related to Figure 1C
mmc2.xlsx (15.3KB, xlsx)
Table S2. CIs of PIF of BRCA2 variants based on survival in HAT, cisplatin, and olaparib, individually and combined, related to Table 1
mmc3.xlsx (39.1KB, xlsx)
Table S3. PIF and corresponding CI of BRCA2 variants based on survival in HAT, cisplatin, and olaparib, individually and combined, related to Table 1
mmc4.xlsx (50KB, xlsx)
Table S4. Comparison of mESC-based functional classification of variants with other published functional assays, related to Figure 4 and Table 1
mmc5.xlsx (19.6KB, xlsx)
Table S5. mESC-based classification of variants and their ACMG codes, related to Figure 4 and Table 1
mmc6.xlsx (25.5KB, xlsx)
Document S2. Article plus supplemental information
mmc7.pdf (6.4MB, pdf)

Data and code availability

References

  • 1. Hu C., Hart S.N., Gnanaolivu R., Huang H., Lee K.Y., Na J., Gao C., Lilyquist J., Yadav S., Boddicker N.J., et al. A Population-Based Study of Genes Previously Implicated in Breast Cancer. N. Engl. J. Med. 2021;384:440–451. doi: 10.1056/NEJMoa2005936.
  • 2. LaDuca H., Polley E.C., Yussuf A., Hoang L., Gutierrez S., Hart S.N., Yadav S., Hu C., Na J., Goldgar D.E., et al. A clinical guide to hereditary cancer panel testing: evaluation of gene-specific cancer associations and sensitivity of genetic testing criteria in a cohort of 165,000 high-risk patients. Genet. Med. 2020;22:407–415. doi: 10.1038/s41436-019-0633-8.
  • 3. Paluch-Shimon S., Cardoso F., Sessa C., Balmana J., Cardoso M.J., Gilbert F., Senkus E., ESMO Guidelines Committee. Prevention and screening in BRCA mutation carriers and other breast/ovarian hereditary cancer syndromes: ESMO Clinical Practice Guidelines for cancer prevention and screening. Ann. Oncol. 2016;27:v103–v110. doi: 10.1093/annonc/mdw327.
  • 4. Richards S., Aziz N., Bale S., Bick D., Das S., Gastier-Foster J., Grody W.W., Hegde M., Lyon E., Spector E., et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 2015;17:405–424. doi: 10.1038/gim.2015.30.
  • 5. Brnich S.E., Abou Tayoun A.N., Couch F.J., Cutting G.R., Greenblatt M.S., Heinen C.D., Kanavy D.M., Luo X., McNulty S.M., Starita L.M., et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019;12:3. doi: 10.1186/s13073-019-0690-2.
  • 6. Spurdle A.B., Greville-Heygate S., Antoniou A.C., Brown M., Burke L., de la Hoya M., Domchek S., Dörk T., Firth H.V., Monteiro A.N., et al. Towards controlled terminology for reporting germline cancer susceptibility variants: an ENIGMA report. J. Med. Genet. 2019;56:347–357. doi: 10.1136/jmedgenet-2018-105872.
  • 7. Biswas K., Das R., Alter B.P., Kuznetsov S.G., Stauffer S., North S.L., Burkett S., Brody L.C., Meyer S., Byrd R.A., Sharan S.K. A comprehensive functional characterization of BRCA2 variants associated with Fanconi anemia using mouse ES cell-based assay. Blood. 2011;118:2430–2442. doi: 10.1182/blood-2010-12-324541.
  • 8. Biswas K., Das R., Eggington J.M., Qiao H., North S.L., Stauffer S., Burkett S.S., Martin B.K., Southon E., Sizemore S.C., et al. Functional evaluation of BRCA2 variants mapping to the PALB2-binding and C-terminal DNA-binding domains using a mouse ES cell-based assay. Hum. Mol. Genet. 2012;21:3993–4006. doi: 10.1093/hmg/dds222.
  • 9. Biswas K., Lipton G.B., Stauffer S., Sullivan T., Cleveland L., Southon E., Reid S., Magidson V., Iversen E.S., Jr., Sharan S.K. A computational model for classification of BRCA2 variants using mouse embryonic stem cell-based functional assays. NPJ Genom. Med. 2020;5:52. doi: 10.1038/s41525-020-00158-5.
  • 10. Erwood S., Bily T.M.I., Lequyer J., Yan J., Gulati N., Brewer R.A., Zhou L., Pelletier L., Ivakine E.A., Cohn R.D. Saturation variant interpretation using CRISPR prime editing. Nat. Biotechnol. 2022;40:885–895. doi: 10.1038/s41587-021-01201-1.
  • 11. Guidugli L., Carreira A., Caputo S.M., Ehlen A., Galli A., Monteiro A.N.A., Neuhausen S.L., Hansen T.V.O., Couch F.J., Vreeswijk M.P.G., ENIGMA consortium. Functional assays for analysis of variants of uncertain significance in BRCA2. Hum. Mutat. 2014;35:151–164. doi: 10.1002/humu.22478.
  • 12. Ikegami M., Kohsaka S., Ueno T., Momozawa Y., Inoue S., Tamura K., Shimomura A., Hosoya N., Kobayashi H., Tanaka S., Mano H. High-throughput functional evaluation of BRCA2 variants of unknown significance. Nat. Commun. 2020;11:2573. doi: 10.1038/s41467-020-16141-8.
  • 13. Kuznetsov S.G., Liu P., Sharan S.K. Mouse embryonic stem cell-based functional assay to evaluate mutations in BRCA2. Nat. Med. 2008;14:875–881. doi: 10.1038/nm.1719.
  • 14. Li L., Biswas K., Habib L.A., Kuznetsov S.G., Hamel N., Kirchhoff T., Wong N., Armel S., Chong G., Narod S.A., et al. Functional redundancy of exon 12 of BRCA2 revealed by a comprehensive analysis of the c.6853A>G (p.I2285V) variant. Hum. Mutat. 2009;30:1543–1550. doi: 10.1002/humu.21101.
  • 15. Mesman R.L.S., Calléja F.M.G.R., Hendriks G., Morolli B., Misovic B., Devilee P., van Asperen C.J., Vrieling H., Vreeswijk M.P.G. The functional impact of variants of uncertain significance in BRCA2. Genet. Med. 2019;21:293–302. doi: 10.1038/s41436-018-0052-2.
  • 16. Sirisena N., Biswas K., Sullivan T., Stauffer S., Cleveland L., Southon E., Dissanayake V.H.W., Sharan S.K. Functional evaluation of five BRCA2 unclassified variants identified in a Sri Lankan cohort with inherited cancer syndromes using a mouse embryonic stem cell-based assay. Breast Cancer Res. 2020;22:43. doi: 10.1186/s13058-020-01272-z.
  • 17. Stauffer S., Biswas K., Sharan S.K. Bypass of premature stop codons and generation of functional BRCA2 by exon skipping. J. Hum. Genet. 2020;65:805–809. doi: 10.1038/s10038-020-0768-0.
  • 18. Sullivan T., Thirthagiri E., Chong C.E., Stauffer S., Reid S., Southon E., Hassan T., Ravichandran A., Wijaya E., Lim J., et al. Epidemiological and ES cell-based functional evaluation of BRCA2 variants identified in families with breast cancer. Hum. Mutat. 2021;42:200–212. doi: 10.1002/humu.24154.
  • 19. Guidugli L., Shimelis H., Masica D.L., Pankratz V.S., Lipton G.B., Singh N., Hu C., Monteiro A.N.A., Lindor N.M., Goldgar D.E., et al. Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches. Am. J. Hum. Genet. 2018;102:233–248. doi: 10.1016/j.ajhg.2017.12.013.
  • 20. Tubeuf H., Caputo S.M., Sullivan T., Rondeaux J., Krieger S., Caux-Moncoutier V., Hauchard J., Castelain G., Fiévet A., Meulemans L., et al. Calibration of Pathogenicity Due to Variant-Induced Leaky Splicing Defects by Using BRCA2 Exon 3 as a Model System. Cancer Res. 2020;80:3593–3605. doi: 10.1158/0008-5472.CAN-20-0895.
  • 21. van Engelen J.E., Hoos H.H. A survey on semi-supervised learning. Mach. Learn. 2020;109:373–440.
  • 22. Easton D.F., Deffenbaugh A.M., Pruss D., Frye C., Wenstrup R.J., Allen-Brady K., Tavtigian S.V., Monteiro A.N.A., Iversen E.S., Couch F.J., Goldgar D.E. A systematic genetic assessment of 1,433 sequence variants of unknown clinical significance in the BRCA1 and BRCA2 breast cancer-predisposition genes. Am. J. Hum. Genet. 2007;81:873–883. doi: 10.1086/521032.
  • 23. Howlett N.G., Taniguchi T., Olson S., Cox B., Waisfisz Q., De Die-Smulders C., Persky N., Grompe M., Joenje H., Pals G., et al. Biallelic inactivation of BRCA2 in Fanconi anemia. Science. 2002;297:606–609. doi: 10.1126/science.1073834.
  • 24. Vallée M.P., Di Sera T.L., Nix D.A., Paquette A.M., Parsons M.T., Bell R., Hoffman A., Hogervorst F.B.L., Goldgar D.E., Spurdle A.B., Tavtigian S.V. Adding In Silico Assessment of Potential Splice Aberration to the Integrated Evaluation of BRCA Gene Unclassified Variants. Hum. Mutat. 2016;37:627–639. doi: 10.1002/humu.22973.
  • 25. Feng B.J. PERCH: A Unified Framework for Disease Gene Prioritization. Hum. Mutat. 2017;38:243–251. doi: 10.1002/humu.23158.
  • 26. Tian Y., Pesaran T., Chamberlin A., Fenwick R.B., Li S., Gau C.L., Chao E.C., Lu H.M., Black M.H., Qian D. REVEL and BayesDel outperform other in silico meta-predictors for clinical variant classification. Sci. Rep. 2019;9 doi: 10.1038/s41598-019-49224-8.
  • 27. Findlay G.M., Daza R.M., Martin B., Zhang M.D., Leith A.P., Gasperini M., Janizek J.D., Huang X., Starita L.M., Shendure J. Accurate classification of BRCA1 variants with saturation genome editing. Nature. 2018;562:217–222. doi: 10.1038/s41586-018-0461-z.
  • 28. Parsons M.T., Tudini E., Li H., Hahnen E., Wappenschmidt B., Feliubadaló L., Aalfs C.M., Agata S., Aittomäki K., Alducci E., et al. Large scale multifactorial likelihood quantitative analysis of BRCA1 and BRCA2 variants: An ENIGMA resource to support clinical variant classification. Hum. Mutat. 2019;40:1557–1578. doi: 10.1002/humu.23818.
  • 29. Jimenez-Sainz J., Mathew J., Moore G., Lahiri S., Garbarino J., Eder J.P., Rothenberg E., Jensen R.B. BRCA2 BRC missense variants disrupt RAD51-dependent DNA repair. Elife. 2022;11 doi: 10.7554/eLife.79183.
  • 30. Bakker J.L., Thirthagiri E., van Mil S.E., Adank M.A., Ikeda H., Verheul H.M.W., Meijers-Heijboer H., de Winter J.P., Sharan S.K., Waisfisz Q. A novel splice site mutation in the noncoding region of BRCA2: implications for Fanconi anemia and familial breast cancer diagnostics. Hum. Mutat. 2014;35:442–446. doi: 10.1002/humu.22505.
  • 31. Clement K., Rees H., Canver M.C., Gehrke J.M., Farouni R., Hsu J.Y., Cole M.A., Liu D.R., Joung J.K., Bauer D.E., Pinello L. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 2019;37:224–226. doi: 10.1038/s41587-019-0032-3.
  • 32. Wang K., Li M., Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164. doi: 10.1093/nar/gkq603.
  • 33. MacDonald I.L. Numerical Maximisation of Likelihood: A Neglected Alternative to EM? Int. Statistical Rev. 2014;82:296–308. doi: 10.1111/insr.12041.
  • 34. James G., Witten D., Hastie T., Tibshirani R. Springer Texts in Statistics; Springer: 2017. An Introduction to Statistical Learning: With Applications in R.
  • 35. Vittinghoff E., Glidden D.V., Shiboski S.C., McCulloch C.E. Springer; 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models (Statistics for Biology and Health)

Articles from Cell Reports Methods are provided here courtesy of Elsevier
