Abstract
Single nucleotide variants in the open reading frames (ORFs) of pharmacogenes are important causes of interindividual variability in drug response. The functional characterization of variants of unknown significance within ORFs remains a major challenge for pharmacogenomics. Deep mutational scanning (DMS) is a high‐throughput technique that makes it possible to analyze the functional effect of hundreds of variants in a parallel and scalable fashion. We adapted a “landing pad” DMS system to study the function of missense variants in the ORFs of cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19). We studied 230 observed missense variants in the CYP2C9 and CYP2C19 ORFs and found that 19 of 109 CYP2C9 and 36 of 121 CYP2C19 variants displayed less than ~ 25% of the wild‐type protein expression, a level that may have clinical relevance. Our results support DMS as an efficient method for the identification of damaging ORF variants that might have potential clinical pharmacogenomic application.
Study Highlights.
WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC?
☑ With the increasing application of next generation sequencing (NGS) to known “pharmacogenes,” large numbers of open reading frame (ORF) “variants of unknown significance” (VUS) are being identified. However, the functional implications of those VUS remain unclear.
WHAT QUESTION DID THIS STUDY ADDRESS?
☑ This study was designed to determine whether the application of deep mutational scanning (DMS) might make it possible to determine the functional effects of VUS that have been observed in the ORFs of cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19).
WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE?
☑ The results of the study demonstrate that deep mutational scanning (DMS) can be used to determine the functional implications of ORF VUS in CYP2C9 and CYP2C19.
HOW MIGHT THIS CHANGE CLINICAL PHARMACOLOGY OR TRANSLATIONAL SCIENCE?
☑ The application of DMS to additional pharmacogenes would potentially expand the accuracy and clinical utility of the application of NGS to pharmacogenomics.
Genetic polymorphisms in or near pharmacogenes are a major cause of individual variation in drug response phenotypes.1 Cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19) are genes that encode important cytochrome P450 enzymes that catalyze the phase I biotransformation of a variety of therapeutic drugs, including antiplatelet agents, selective serotonin reuptake inhibitors, and proton pump inhibitors.2, 3, 4, 5 Several years ago, the Mayo Clinic launched the RIGHT 1K study, in which next generation DNA sequencing (NGS) was performed with DNA from 1013 Mayo Biobank participants to identify variants in 84 pharmacogenes, including CYP2C9 and CYP2C19.1, 6 However, many of the polymorphisms observed in those patients were variants of unknown significance (VUS).1, 7, 8 In a recent publication, we functionally characterized six novel nonsynonymous open reading frame (ORF) variants in the CYP2C9 gene and seven nonsynonymous ORF variants in the CYP2C19 gene observed in DNA from participants in the RIGHT 1K study.9 Conventional methods for the characterization of individual sequence variants “one‐at‐a‐time” are reliable, but they are also time‐consuming, labor‐intensive, and not easily scalable. As a result, only a limited number of variants can practically be investigated in that fashion. Predictive algorithms such as PolyPhen‐2, SIFT, and PROVEAN, among others, have been developed to help identify deleterious variants, but their reliability is variable and inadequate for clinical application.10, 11, 12 Laboratory‐based assays are still required to reliably interpret the impact of ORF missense variants on protein function. As a result, a significant gap remains in the functional interpretation of the large number of missense VUS being discovered as DNA sequencing is applied ever more broadly.
Deep mutational scanning (DMS) methods provide a platform with which a large number of missense variants can be interrogated in parallel.13, 14 For example, Fowler et al. developed a DMS assay for all possible variants in clinically important genes, such as BRCA1, TPMT, and PTEN, linking genotypes to functionally determined phenotypes.15, 16 Specifically, engineered “landing pad” HEK 293T cells were used as a platform to integrate pooled variant libraries, resulting in one variant per cell for further functional assessment.15 DMS includes the creation of variant libraries for a gene, selecting the library for function (e.g., resistance to drug or fluorescence markers of protein quantity) and—finally—high‐throughput DNA sequencing of variants to link them with the “activity” assayed in a functional test.
In our previous “one‐at‐a‐time” functional genomic study, we identified a series of missense variants in the ORFs of CYP2C9 and CYP2C19 that resulted in decreased protein expression as a result of proteasome‐mediated degradation, presumably due to an alteration in protein folding. In the present study, fluorescence‐activated cell sorting (FACS) was used as a selection mechanism for cells expressing variant sequences (i.e., the cells were sorted according to the abundance of reporter protein expression). We should point out that saturation mutagenesis can also be used to create a “saturation” library, including all possible mutations for an ORF sequence in a single reaction, and that approach has been applied in some previous DMS studies.15 However, many of the variants created are unlikely to occur clinically. Therefore, we applied a different mutagenesis approach, nicking mutagenesis, to generate libraries that contained human genomic variation known to be present in the general population. Generation of variants by nicking mutagenesis is not exhaustive, but it is much more efficient than the generation of mutants one at a time by site‐directed mutagenesis.17
In the present study, we have used DMS to analyze the functional implications of missense variants that have been observed in the ORFs of CYP2C9 and CYP2C19, ORFs that were fused to green fluorescent protein (GFP). Recombinase was used to integrate the variant libraries into landing pad cells, one variant per cell. Multiplexed functional selection performed with FACS was then used to separate the cells into different “bins” based on fluorescence readout at the single cell level. Amplicon sequencing of DNA in each bin, followed by computational analysis of the frequency of variants appearing in each bin, was used to determine levels of protein expression. To identify potentially severely damaging variants for these two important CYP pharmacogenes, we analyzed 230 observed missense variants in ORFs present in a publicly available database consisting of exome sequencing data for 60,000 general population subjects. We included genetic variants with allele frequencies > 0.00001 observed by the Exome Aggregation Consortium (ExAC; http://exac.broadinstitute.org/) as well as novel ORF VUSs from the Mayo RIGHT 1K and RIGHT 10K projects.6, 9, 18 The RIGHT 1K samples served as a control because we had already studied them individually.9 We found that 19 of 109 CYP2C9 and 36 of the 121 CYP2C19 missense variants that were studied displayed less than ~ 25% of the wild‐type (WT) protein expression, a level that might have clinical relevance.9 We also compared variant calling by the DMS method with the predictions of computational algorithms and, finally, we validated severely damaging variants by the use of Western blot analysis. Our findings suggest that DMS can be an efficient method for the high‐throughput identification of low protein abundance ORF variants that might have potential clinical implications.
Methods
Generation of landing pad HEK 293T cells
The landing pad construct and the promoter‐less cassette for CYP ORFs were created by Gibson assembly (Supplemental Text). TALENs were used to create double‐stranded breaks at the AAVS1 site and homology‐directed repair was used to introduce landing pad constructs into HEK 293T cells.15 The landing pad construct expressed doxycycline‐inducible blue fluorescent protein (BFP), which was used as a selection marker for landing pad insertion. Thirty percent of HEK 293T cells display hypotriploid karyotypes.19 The assay requires that a single variant be integrated per cell. Single cells were sorted into each well of a 96‐well plate for cloning. To identify which cells contained a single landing pad, we used Bxb1 recombinase‐mediated integration of a 1:1 mixture of promoter‐less GFP and mCherry plasmids and analyzed fluorescence by flow cytometry. Clones with the lowest percentage of cells positive for both GFP and mCherry were the most likely to contain a single landing pad and were considered candidates for further assay. The attB‐mCherry and attB‐GFP plasmids were transfected 24 hours after Bxb1 recombinase transfection and BFP induction by doxycycline. After 5 days, candidate clones were trypsinized, washed with phosphate‐buffered saline, and fixed in 4% formaldehyde at 4°C for 10 minutes. The cells were analyzed on a FACS CantoX flow cytometer (BD Biosciences, San Jose, CA) using FACSDiva version 8.0 software and FlowJo software version 10 (BD Biosciences). The FACS CantoX instrument utilizes colinear 405, 488, and 561 nm lasers plus forward and side angle light scatter. Fluorescence images of variants were visualized using fluorescence microscopy (EVOS FLoid Cell Imaging Station; Life Technologies, Waltham, MA).
Nicking mutagenesis for variant library preparation
Nicking mutagenesis methods were modified from Whitehead et al. to construct variant libraries for ORFs containing CYP2C9 and CYP2C19 missense variants.17 Nicking mutagenesis uses Nt.BbvCI and Nb.BbvCI sites and exonuclease cleavage to degrade WT template DNA. Nicking of the WT template by the Nt.BbvCI endonuclease, followed by exonuclease digestion, was used to form single‐stranded DNA. At the annealing step, five oligos, each carrying a variant of interest, were annealed separately to single‐stranded templates, and the five reactions were then pooled in one pot, with high‐fidelity DNA polymerase and Taq ligase used to close the double strand. The second template strand was degraded by Nb.BbvCI endonuclease cleavage and exonuclease digestion, and a new second strand was synthesized with a common primer on a cassette plasmid backbone. Phosphorylated oligos for CYP2C9 and CYP2C19 variants were purchased from IDT (Coralville, IA). Sanger sequencing was used to validate the sequences of variant clones. Detailed protocols are provided in the Supplemental Text.
Fluorescence‐activated cell sorting
Library cells were washed, trypsinized, and resuspended in phosphate‐buffered saline containing 5% fetal bovine serum. Cells were then sorted on a FACSAria with 407, 488, and 532 nm lasers (BD Biosciences) into four bins, and the cells were collected in culture medium. BFP−/mCherry+ cells containing CYP2C9 or CYP2C19 variants were flow sorted and grown for 5 days. BFP−/mCherry+ cells were sorted again to determine the protein expression of CYP2C9/CYP2C19 variants based on their GFP/mCherry ratios. Gates were set based on GFP/mCherry ratios for cells integrating known CYP2C9/CYP2C19 variants and WT proteins as gating references. Four gates were set to dissect the pooled libraries into four different bins based on GFP/mCherry ratios. Data were analyzed with FACSDiva version 8.0.1 software.
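The four‐way gating on GFP/mCherry ratio can be illustrated with a minimal Python sketch. The gate boundaries below are hypothetical placeholders; the actual gates were set from reference variants and WT constructs, and their numeric values are not given here. The function name `assign_bin` is likewise only illustrative.

```python
from bisect import bisect_right

# Hypothetical upper bounds of bins 1-3 on the GFP/mCherry ratio
# (expressed relative to WT = 1.0). These are NOT the gates used in the study.
GATE_UPPER_BOUNDS = [0.25, 0.50, 0.75]


def assign_bin(gfp, mcherry, bounds=GATE_UPPER_BOUNDS):
    """Return the bin number (1-4) for one cell from its GFP/mCherry ratio."""
    if mcherry <= 0:
        raise ValueError("mCherry signal must be positive for an mCherry+ cell")
    ratio = gfp / mcherry
    # bisect_right counts how many gate boundaries lie at or below the ratio.
    return bisect_right(bounds, ratio) + 1


# Four made-up cells spanning low to WT-like GFP/mCherry ratios.
cells = [(0.1, 1.0), (0.4, 1.0), (0.6, 1.0), (1.0, 1.0)]
bins = [assign_bin(gfp, mcherry) for gfp, mcherry in cells]
```

Using the ratio rather than raw GFP intensity normalizes for cell‐to‐cell differences in cassette expression, because mCherry is expressed from the same integrated cassette.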
Variant calling
Variant frequencies were calculated by high‐throughput sequencing of the DNA collected in each sorted bin. Genomic DNA was extracted from sorted cells using DNA extraction kits (Qiagen) and amplicons were sequenced as described in the Supplemental Text. Fastq files were aligned with respective CYP2C9 and CYP2C19 reference sequences using BWA mem aligner version 0.7.15. Samtools mpileup version 1.5 was used with a custom Python script for single nucleotide variation calling. VarScan pileup2indel version 2.3.9 was used for calling INDELS. A base quality score cutoff of 20 and a mapping quality score cutoff of 20 were applied for both single nucleotide variation and INDEL calling. Custom scripts were used to summarize the data and add allele frequencies at all positions in the reference sequence. Variant counts in each bin were tabulated and each variant’s frequency in each bin was calculated. The effects of variants were obtained based on the frequency of variants in each bin.
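The per‐bin tabulation step can be sketched as follows. This is a minimal illustration, not the custom scripts used in the study: it assumes that a variant's frequency in a bin is the number of reads supporting the variant allele divided by the read depth at that position, and the read counts shown are invented.

```python
def frequencies_per_bin(counts):
    """Convert read counts to variant frequencies.

    counts maps bin name -> {variant: (alt_reads, depth)}.
    Returns bin name -> {variant: frequency}, with 0.0 where depth is zero.
    """
    freqs = {}
    for bin_name, variants in counts.items():
        freqs[bin_name] = {
            variant: (alt_reads / depth if depth > 0 else 0.0)
            for variant, (alt_reads, depth) in variants.items()
        }
    return freqs


# Invented counts for one variant in two of the four bins.
example_counts = {
    "bin1": {"c.218C>T": (30, 1000)},
    "bin4": {"c.218C>T": (5, 1000)},
}
freqs = frequencies_per_bin(example_counts)
```

In this toy example the variant is enriched in bin1 relative to bin4, the pattern expected for a low‐abundance variant.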
Results
Generation of landing pad cell lines
The landing pad construct was generated and a promoter‐less cassette was constructed with ORFs for CYP2C9 and CYP2C19. The ORFs were fused to GFP as an indicator of protein expression, and mCherry was independently expressed as a transfection control. The DMS assay requires that only a single transgenic cassette be integrated per cell, which could be guaranteed by having only a single landing pad per cell (Figure 1 a). To generate a cell line that integrated only a single copy of the landing pad, the landing pad construct was inserted into HEK 293T cells and cells were selected by BFP. Single cells positive for BFP were sorted into each well of a 96‐well plate for cloning. We observed different levels of BFP expression among different landing pad clones, probably because HEK 293T cells have been reported to have hypotriploid karyotypes.19 To identify a candidate clone with only a single landing pad, we selected > 30 clones with low BFP expression levels. We then tested those clones by using Bxb1 recombinase to integrate a 1:1 mixture of GFP‐only and mCherry‐only promoter‐less plasmids (plasmid sequences are shown in the Supplemental Text) into landing pad clones, followed by flow cytometry analysis. If a candidate clone contained only a single landing pad, it would integrate either GFP or mCherry, but not both. For clone 20, quadrant Q2 for the flow cytometry showed a negligible percentage of cells with both GFP and mCherry expression (Figure 1 b). In addition, no overlay of green and red was observed in the fluorescence image for clone 20, once again indicating that clone 20 had only a single landing pad (Figure 1 c). As a result, clone 20 was selected for use in the following experiments. This system allowed us to monitor variant protein expression in a high‐throughput fashion.
Figure 1.
Generation of HEK 293T landing pad cells. (a) Plasmid maps of the landing pad construct and the promoter‐less cassette for CYP open reading frames that were fused to green fluorescent protein (GFP) and engineered for the simultaneous expression of IRES‐mCherry. (b) Flow cytometry results for landing pad HEK 293T clone 20 that had a low percentage of both GFP and mCherry, compatible with the conclusion that clone 20 contains a single landing pad. (c) Merged fluorescence photos showing that mCherry and GFP did not superimpose for clone 20. The landing pad clone 20 integrated either GFP or mCherry, which indicated that clone 20 contained a single landing pad. BFP, blue fluorescent protein; CMV, human cytomegalovirus promoter; IRES, internal ribosome entry site.
Effect of CYP2C9 and CYP2C19 variants on protein levels
In our previous study, CYP2C9 218C>T, CYP2C9 343A>C, CYP2C19*3 (636G>A), and CYP2C19 815A>G affected final protein quantity, resulting in varying levels of decreased protein expression.9 In the present study, the reporter consisted of a promoter‐less cassette containing the C‐terminus of the CYP2C9 or CYP2C19 ORFs fused to GFP. That construct was expressed only once it had integrated into the landing pad, which ensured one variant per cell and disrupted expression of the landing pad BFP. Flow cytometry analysis of BFP−/mCherry+ cells showed that known “damaging” variants expressed significantly reduced levels of GFP and lower GFP/mCherry ratios, indicating that those cells expressed less protein. Mean GFP/mCherry ratios of known damaging variants were compared with those of WT proteins. The value for CYP2C9 218C>T was 28.4%; CYP2C9 343A>C was 48.7%; CYP2C19*3 (636G>A) was 17.6%; and CYP2C19 815A>G was 61.2% of the mean WT GFP/mCherry ratios, percentages that were in close agreement with Western blot results that we had previously reported (Figure 2 a–c).9 Next, we created constructs for nonsynonymous variants with allele frequencies > 0.00001 as reported by the Exome Aggregation Consortium and by the Mayo Clinic RIGHT 1K study and created variant libraries for CYP2C9 and CYP2C19. Pooled variant libraries for both genes were integrated into landing pad cells. Using already known damaging variants for CYP2C9 and CYP2C19 as well as WT constructs as references for FACS gating, the cells were sorted into different “bins” based on GFP/mCherry ratios (Figure 2 d). Variants of CYP2C9/CYP2C19 with the lowest GFP/mCherry ratios (< 25% protein expression) were sorted into bin1 and were classified as severely damaging. WT‐like variants were sorted into bin4. The gating for four‐way sorting of CYP2C9/CYP2C19 pooled variant libraries is shown in Figure 3 a,b. Pools of cells in each bin were collected and were then used as input material for DNA sequencing.
We used NGS to monitor the frequencies of variants in each bin. Variant frequencies (F v) appearing in each bin were then used to determine protein abundance scores. Abundance scores for each CYP2C9 and CYP2C19 variant were calculated by use of the following equation:
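The equation itself is not reproduced in this text. A form consistent with the weighting described for the abundance score (bin1 weighted 0.25 and bin4 weighted 1.0) is sketched below. The evenly spaced weights for bin2 and bin3 and the normalization by the summed frequencies are assumptions, not values stated here; F_{v,i} denotes the frequency of variant v in bin i.

```latex
\[
\text{Abundance score}_v
  = \frac{\sum_{i=1}^{4} w_i \, F_{v,i}}{\sum_{i=1}^{4} F_{v,i}},
\qquad
(w_1, w_2, w_3, w_4) = (0.25,\ 0.5,\ 0.75,\ 1.0)
\]
```

Under this form, a variant found only in bin1 would score 0.25 and a variant found only in bin4 would score 1.0, so the abundance scores reported for individual variants would fall between those two limits.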
Figure 2.
Flow cytometry of cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19) constructs with known variants. (a,b) Flow cytometry analysis of blue fluorescent protein (BFP)−/mCherry+ cells that integrated wild‐type or known damaging variants, CYP2C9 218C>T and CYP2C9 343A>C, CYP2C19*3 and CYP2C19 815A>G. Note that for both wild‐type (WT) allozymes, most of the cells eluted toward higher green fluorescent protein (GFP)/mCherry ratios, while allozymes containing damaging variants eluted at significantly lower GFP/mCherry ratios than did the cells containing the WT. Mean GFP/mCherry ratios for those variants were consistent with Western blot results obtained during our previous study.9 (c) Cells transfected with constructs expressing known severely damaging variants (CYP2C9 218C>T and CYP2C19*3) eluted at low GFP/mCherry ratios, whereas other variants eluted between the two extremes (CYP2C9 343A>C and CYP2C19 815A>G), compatible with their being “damaging” variants. Four “bins” were established based on WT CYP2C9 and CYP2C19 and known damaging CYP2C9 and CYP2C19 variants, as shown diagrammatically in (d).
Figure 3.
Fluorescence‐activated cell sorting of pooled cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19) variant libraries. Blue fluorescent protein−/mCherry+ cells integrating CYP2C9 and CYP2C19 pooled variant libraries were sorted into four bins based on their green fluorescent protein (GFP)/mCherry ratios. Gates were set based on wild‐type CYP2C9 and CYP2C19 and known damaging CYP2C9 and CYP2C19 variants. Pools of sorted cells in each bin were collected and were used as input material for subsequent amplicon DNA sequencing.
For each experiment, an “abundance score” for each variant studied was obtained by multiplying the variant frequency in each bin by a weighting value (0.25–1), with bin1 assigned 0.25 and bin4 assigned 1.0.16 The final abundance score for each variant was calculated by averaging mean abundance scores across replicate assays. Variants were scored in at least three experiments, as shown graphically in Figure 4 and Figure S1. Variants were classified as “severely damaging,” “damaging,” or “tolerated,” with thresholds chosen on the basis of abundance scores for known CYP2C9/CYP2C19 variants.9 Using Western blot results and corresponding enzyme activities from our previous publication as a reference,9 CYP2C9 variants with abundance scores below 0.578 (CYP2C9 1076T>C) were classified as “severely damaging,” displaying ~ 25% protein abundance as compared with WT, whereas those with abundance scores above that threshold but lower than 0.670 (CYP2C9 343A>C) were classified as “damaging,” displaying ~ 50% of the protein abundance of the WT allozyme. Variants with scores above this threshold were considered “tolerated.” CYP2C19 variants with abundance scores below 0.597 (CYP2C19 1349C>A) were classified as “severely damaging,” whereas those with abundance scores above this threshold, but lower than 0.635 (CYP2C19 557G>A), were classified as “damaging.” Variants with scores above this threshold were considered to be “tolerated.” We found that 19 of 109 CYP2C9 and 36 of 121 CYP2C19 missense variants displayed less than ~ 25% of WT protein expression.
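The scoring and classification steps described above can be sketched in Python. This is an illustration under stated assumptions rather than the analysis code used in the study: the weights for bin2 and bin3 (0.5 and 0.75) and the normalization by total frequency are assumed, the cutoffs are taken as the Table 1 abundance scores of the CYP2C9 reference variants 1076T>C and 343A>C, and a score equal to a reference value is treated as belonging to the lower class, which matches how those two reference variants are labeled in Table 1.

```python
from statistics import mean

# bin1 and bin4 weights are from the text; bin2 and bin3 weights are assumed.
WEIGHTS = {"bin1": 0.25, "bin2": 0.50, "bin3": 0.75, "bin4": 1.00}

# CYP2C9 reference scores from Table 1 (1076T>C and 343A>C).
CYP2C9_SEVERE_CUTOFF = 0.578003356
CYP2C9_DAMAGING_CUTOFF = 0.670161008


def abundance_score(freqs):
    """Weighted average of bin weights for one variant in one replicate."""
    total = sum(freqs.values())
    if total == 0:
        raise ValueError("variant was not observed in any bin")
    return sum(WEIGHTS[b] * f for b, f in freqs.items()) / total


def final_score(replicates):
    """Average the per-replicate abundance scores."""
    return mean(abundance_score(r) for r in replicates)


def classify(score, severe_cutoff, damaging_cutoff):
    """Assign one of the three classes (cutoffs inclusive, an assumption)."""
    if score <= severe_cutoff:
        return "severely damaging"
    if score <= damaging_cutoff:
        return "damaging"
    return "tolerated"


# Invented bin frequencies for one variant in a single replicate.
toy = {"bin1": 0.4, "bin2": 0.3, "bin3": 0.2, "bin4": 0.1}
toy_score = abundance_score(toy)
toy_class = classify(toy_score, CYP2C9_SEVERE_CUTOFF, CYP2C9_DAMAGING_CUTOFF)
```

Applied to scores listed in Table 1, `classify` returns "severely damaging" for CYP2C9*11 (0.549201141), "damaging" for CYP2C9*9 (0.653893822), and "tolerated" for CYP2C9*29 (0.672275959), in line with the calls in the table.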
Figure 4.
Protein abundance scores for 109 cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and 121 cytochrome P450 family 2 subfamily C member 19 (CYP2C19) variants. (a) Abundance score values for CYP2C9 and CYP2C19 variants. Variants having abundance scores lower than that for CYP2C9 (1076T>C) were classified as severely damaging, whereas variants having abundance scores lower than that for CYP2C9 (343A>C) were classified as damaging variants. The results shown are averages for four replicates. (b) Mean abundance scores for CYP2C19 variants are shown in the histogram. Variants having abundance scores lower than that for CYP2C19 (1349C>A) were classified as severely damaging, whereas those with abundance scores lower than that for CYP2C19 (557G>A) were classified as damaging variants. The results shown are average abundance scores for four replicates. SD values are listed in Tables S1 and S2.
The variant calling results obtained by the use of DMS for variants from the ExAC study that had allele frequencies > 0.00001 and variants from the Mayo RIGHT 10K study are listed in Tables 1 and 2. The DMS results were also compared with predictions obtained using SIFT, PROVEAN, and PolyPhen‐2, and those results are listed in the corresponding tables for CYP2C9 and CYP2C19, respectively. For variants that resulted in dramatically reduced protein expression levels, CYP2C9*11, CYP2C9*21, and another 12 CYP2C9 variants, as well as CYP2C19*8, CYP2C19*22, and another 26 CYP2C19 variants, the results were in good agreement among all three of these predictive algorithms. However, five CYP2C9 variants and seven CYP2C19 variants displayed < 25% of WT protein expression even though SIFT predicted that they were “tolerated.” In summary, we found that the three commonly applied algorithms, which we tested on our 230 variants, disagreed among themselves 30.4% of the time, and at least one algorithm disagreed with our DMS assay results 54.7% of the time. We also searched PharmVar, a repository for pharmacogene variation that records functionally validated variants. For CYP2C9*21 and CYP2C19*2B, *8 and *22, our results were consistent with those reported in PharmVar.20 However, PharmVar currently lists several CYP2C9 and CYP2C19 allozymes that we found to be severely damaging as having only limited or moderate evidence. Our results therefore provide additional information with regard to the functional implications of these variants. Finally, 15 CYP2C9 and 30 CYP2C19 variants that we found to be severely damaging had not previously been reported or were reported to have uncertain function in PharmVar.20
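The two disagreement rates quoted above can be computed from per‐variant calls in a straightforward way. The sketch below is illustrative only: it assumes each algorithm's prediction and each DMS result have first been reduced to a binary "damaging" or "tolerated" label (how that mapping was done is not described here), and the four example variants are invented.

```python
TOOLS = ("sift", "provean", "polyphen2")


def disagreement_rates(calls):
    """Return (fraction where the tools disagree among themselves,
    fraction where at least one tool disagrees with the DMS call)."""
    n = len(calls)
    if n == 0:
        raise ValueError("no variants supplied")
    among_tools = sum(1 for c in calls if len({c[t] for t in TOOLS}) > 1)
    versus_dms = sum(1 for c in calls if any(c[t] != c["dms"] for t in TOOLS))
    return among_tools / n, versus_dms / n


# Invented calls for four hypothetical variants.
toy_calls = [
    {"dms": "damaging", "sift": "damaging", "provean": "damaging", "polyphen2": "damaging"},
    {"dms": "damaging", "sift": "tolerated", "provean": "damaging", "polyphen2": "damaging"},
    {"dms": "tolerated", "sift": "damaging", "provean": "damaging", "polyphen2": "damaging"},
    {"dms": "tolerated", "sift": "tolerated", "provean": "tolerated", "polyphen2": "tolerated"},
]
among_rate, dms_rate = disagreement_rates(toy_calls)
```

In the toy data, the second variant counts toward both rates, whereas the third counts only toward disagreement with DMS because all three tools agree with one another.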
Table 1.
Protein abundance scores of CYP2C9 variants from the ExAC browser and the Mayo RIGHT 10K study
Exact cDNA | Exact amino acid | RSID | Common allele name | Allele frequency | RIGHT 10K WT (variant prevalence) | RIGHT 10K heterozygous | RIGHT 10K homozygous | DMS functional study | DMS abundance score | PharmVar
---|---|---|---|---|---|---|---|---|---|---
c.752A>G | p.His251Arg | rs2256871 | CYP2C9*9 | 0.006715 | 10079 | 5 | 0 | Damaging | 0.653893822 | No record
c.449G>A | p.Arg150His | rs7900194 | CYP2C9*8 | 0.005242 | 10074 | 1 | 0 | Damaging | 0.617433538 | Uncertain function
c.1003C>T | p.Arg335Trp | rs28371685 | CYP2C9*11 | 0.003784 | 10033 | 51 | 0 | Severely Damaging | 0.549201141 | possibly decreased
c.374G>A | p.Arg125His | rs72558189 | CYP2C9*14 | 0.002953 | 10081 | 3 | 0 | Damaging | 0.627008929 | No record
c.1465C>T | p.Pro489Ser | rs9332239 | CYP2C9*12 | 0.001912 | 10007 | 77 | 0 | Damaging | 0.583353141 | No record
c.1080C>G | p.Asp360Glu | rs28371686 | CYP2C9*5 | 0.001163 | | | | Damaging | 0.629538267 | Moderate evidence
c.801C>T | p.Phe267hPhe | rs149158426 | | 0.000855 | | | | Damaging | 0.597268637 | No record
c.835C>A | p.Pro279Thr | rs182132442 | CYP2C9*29 | 0.0004702 | 10080 | 4 | 0 | Tolerated | 0.672275959 | possibly decreased
c.1324G>A | p.Gly442Ser | rs368545396 | | 0.0004286 | 10072 | 12 | 0 | Damaging | 0.603761357 | unknown function
c.14T>C | p.Val5Ala | rs138957855 | | 0.0003048 | 10077 | 7 | 0 | Tolerated | 0.688673248 | No record
c.394C>T | p.Arg132Trp | rs199523631 | CYP2C9*45 | 0.0002804 | 10078 | 6 | 0 | Damaging | 0.639522141 | unknown function
c.1264A>G | p.Ser422Gly | rs776769484 | | 0.0002477 | 10082 | 2 | 0 | Damaging | 0.627667871 | No record
c.895A>G | p.Thr299Ala | rs72558192 | CYP2C9*16 | 0.0002473 | | | | Tolerated | 0.673073074 | No record
c.520C>G | p.Pro174Ala | rs199539783 | | 0.0001649 | 10079 | 5 | 0 | Damaging | 0.651742423 | No record
c.269T>C | p.Leu90Pro | rs72558187 | CYP2C9*13 | 0.0001401 | | | | Damaging | 0.616189783 | possibly decreased
c.431G>A | p.Arg144His | rs141489852 | | 0.0001401 | | | | Damaging | 0.651784299 | No record
c.1084C>G | p.Leu362Val | . | | 0.000132 | | | | Damaging | 0.586379329 | No record
c.1088C>T | p.Pro363Leu | rs150663116 | | 0.0001319 | 10082 | 2 | 0 | Damaging | 0.581270896 | possibly decreased
c.449G>T | p.Arg150Leu | | | 0.0001319 | 10074 | 9 | 0 | Tolerated | 0.689018975 | No record
c.290G>C | p.Arg97Thr | . | | 0.0001236 | | | | Damaging | 0.608683972 | No record
c.515G>C | p.Cys172Ser | rs147617899 | | 0.0001155 | 10079 | 5 | 0 | Tolerated | 0.748597212 | No record
c.1034T>C | p.Met345Thr | . | | 0.0001154 | | | | Damaging | 0.601859628 | No record
c.1341A>C | p.Leu447Phe | rs59485260 | | 0.0001072 | 10079 | 5 | 0 | Damaging | 0.609375 | No record
c.343A>C | p.Ser115Arg | rs771237265 | | 0.000101 | 10082 | 2 | 0 | Damaging | 0.670161008 | No record
c.1095C>A | p.Ser365Arg | rs139532088 | | 0.00009894 | | | | Damaging | 0.578622159 | Uncertain function
c.980T>C | p.Ile327Thr | rs57505750 | CYP2C9*31 | 0.0000907 | | | | Damaging | 0.611262013 | No record
c.89C>T | p.Pro30Leu | rs142240658 | CYP2C9*21 | 0.00009067 | 10082 | 2 | 0 | Severely Damaging | 0.483953225 | Uncertain function
c.448C>T | p.Arg150Cys | rs17847037 | CYP2C9*48 | 0.00009066 | | | | Tolerated | 0.710010892 | No record
c.688C>G | p.His230Asp | . | | 0.0000837 | | | | Damaging | 0.65625 | unknown function
c.389C>T | p.Thr130Met | rs200965026 | CYP2C9*44 | 0.00008248 | 10080 | 4 | 0 | Damaging | 0.639755882 | No record
c.517G>A | p.Ala173Thr | rs373758696 | | 0.00008247 | 10080 | 4 | 0 | Severely Damaging | 0.548579579 | No record
c.395G>A | p.Arg132Gln | rs200183364 | CYP2C9*33 | 0.00008246 | 10083 | 1 | 0 | Tolerated | 0.711347628 | No record
c.1362G>C | p.Gln454His | . | CYP2C9*19 | 0.00008243 | | | | Damaging | 0.645897332 | No record
c.680C>T | p.Pro227Leu | . | | 0.0000756 | | | | Tolerated | 0.705389146 | possibly decreased
c.1439C>T | p.Pro480Leu | rs530950257 | | 0.00007417 | 10082 | 2 | 0 | Severely Damaging | 0.479818446 | possibly decreased
c.556C>T | p.Arg186Cys | rs150435881 | | 0.00006602 | | | | Damaging | 0.592060606 | possibly decreased
c.1166T>C | p.Ile389Thr | . | | 0.00006597 | | | | Damaging | 0.631849327 | No record
c.1421A>G | p.Asn474Ser | rs141011391 | | 0.00006593 | 10082 | 2 | 0 | Damaging | 0.647619048 | No record
c.373C>T | p.Arg125Cys | . | | 0.00005775 | | | | Tolerated | 0.677726163 | uncertain function
c.1187A>G | p.His396Arg | rs187793133 | | 0.00005772 | | | | Tolerated | 0.700935174 | No record
c.1370A>G | p.Asn457Ser | . | CYP2C9*61 | 0.0000577 | | | | Severely Damaging | 0.543305795 | uncertain function
c.263T>C | p.Ile88Thr | rs139656048 | | 0.00005768 | | | | Tolerated | 0.676032437 | No record
c.218C>T | p.Pro73Leu | rs762081829 | | 0.0000569 | 10083 | 1 | 0 | Severely Damaging | 0.571077996 | No record
c.595G>A | p.Glu199Lys | . | | 0.00004956 | | | | Damaging | 0.651252305 | No record
c.371G>A | p.Arg124Gln | rs12414460 | CYP2C9*42 | 0.0000495 | | | | Damaging | 0.656819836 | uncertain function
c.458T>C | p.Val153Ala | rs373993395 | | 0.00004945 | 10081 | 3 | 0 | Damaging | 0.58402074 | uncertain function
c.719T>C | p.Met240Thr | . | | 0.00004163 | | | | Damaging | 0.640173483 | No record
c.619A>T | p.Ile207Phe | . | | 0.00004134 | | | | Tolerated | 0.674486125 | No record
c.371G>T | p.Arg124Leu | rs12414460 | | 0.00004125 | | | | Severely Damaging | 0.566767249 | uncertain function
c.539C>T | p.Ser180Phe | rs200149294 | | 0.00004125 | 10083 | 1 | 0 | Severely Damaging | 0.568764708 | uncertain function
c.164C>T | p.Thr55Ile | rs771905380 | | 0.00004124 | 10082 | 2 | 0 | Tolerated | 0.701629687 | No record
c.1004G>A | p.Arg335Gln | . | CYP2C9*34 | 0.00004122 | | | | Tolerated | 0.695204529 | No record
c.709G>A | p.Val237Ile | . | | 0.00003334 | | | | Damaging | 0.634745896 | No record
c.624G>C | p.Leu208Phe | . | | 0.00003308 | | | | Tolerated | 0.681854013 | No record
c.538T>C | p.Ser180Pro | . | | 0.000033 | | | | Severely Damaging | 0.529048923 | No record
c.541A>G | p.Ile181Val | . | | 0.000033 | | | | Tolerated | 0.689873007 | No record
c.1297C>T | p.Arg433Trp | . | | 0.00003299 | | | | Severely Damaging | 0.564814815 | No record
c.386T>A | p.Met129Lys | rs750097042 | | 0.00003299 | 10083 | 1 | 0 | Damaging | 0.643411363 | No record
c.122A>T | p.Asn41Ile | . | | 0.00003298 | | | | Severely Damaging | 0.458665155 | No record
c.1024A>G | p.Arg342Gly | . | | 0.00003298 | | | | Severely Damaging | 0.575108971 | normal function
c.1036C>T | p.Pro346Ser | . | | 0.00003298 | 10081 | 3 | 0 | Damaging | 0.654218921 | No record
c.1429G>A | p.Ala477Thr | . | CYP2C9*30 | 0.00003297 | | | | Severely Damaging | 0.558712121 | uncertain function
c.632C>T | p.Pro211Leu | . | | 0.00002482 | | | | Tolerated | 0.692533067 | No record
c.629G>A | p.Ser210Asn | . | | 0.00002481 | | | | Damaging | 0.644966918 | No record
c.600G>T | p.Lys200Asn | rs766903671 | | 0.00002478 | 10082 | 2 | 0 | Damaging | 0.663003421 | No record
c.370C>T | p.Arg124Trp | . | CYP2C9*43 | 0.00002475 | | | | Tolerated | 0.737650497 | No record
c.1153A>T | p.Thr385Ser | . | | 0.00002474 | | | | Severely Damaging | 0.509235183 | No record
c.863A>G | p.Glu288Gly | . | | 0.00002474 | | | | Damaging | 0.601528578 | No record
c.1135T>C | p.Tyr379His | . | | 0.00002474 | | | | Damaging | 0.66603031 | No record
c.433G>T | p.Val145Phe | . | | 0.00002473 | | | | Severely Damaging | 0.460289634 | No record
c.1022A>G | p.Asp341Gly | . | | 0.00002473 | | | | Severely Damaging | 0.470510412 | No record
c.704A>G | p.Lys235Arg | . | | 0.00001668 | | | | Tolerated | 0.68009768 | uncertain function
c.791T>G | p.Ile264Ser | rs761895497 | | 0.00001656 | 10082 | 2 | 0 | Damaging | 0.599331078 | uncertain function
c.773A>G | p.Asn258Ser | . | | 0.00001656 | | | | Tolerated | 0.684042216 | No record
c.638T>C | p.Ile213Thr | . | | 0.00001655 | | | | Damaging | 0.648500178 | No record
c.637A>C | p.Ile213Leu | . | | 0.00001655 | | | | Tolerated | 0.676605037 | No record
c.618G>T | p.Lys206Asn | . | | 0.00001654 | | | | Tolerated | 0.690691147 | No record
c.356A>C | p.Lys119Thr | . | CYP2C9*41 | 0.00001651 | | | | Damaging | 0.601033475 | decreased function
c.1076T>C | p.Ile359Thr | rs56165452 | CYP2C9*4 | 0.0000165 | | | | Severely Damaging | 0.578003356 | possibly decreased
c.389C>A | p.Thr130Lys | . | | 0.0000165 | | | | Damaging | 0.668748634 | unknown function
c.820G>A | p.Glu274Lys | . | | 0.0000165 | | | | Tolerated | 0.687495665 | No record
c.1080C>A | p.Asp360Glu | rs28371686 | | 0.00001649 | | | | Damaging | 0.605882567 | No record
c.137G>A | p.Gly46Asp | . | | 0.00001649 | | | | Damaging | 0.611297817 | No record
c.109C>T | p.Pro37Ser | . | | 0.00001649 | | | | Damaging | 0.62487462 | No record
c.1045G>A | p.Asp349Asn | . | | 0.00001649 | | | | Damaging | 0.648954995 | No record
c.908G>T | p.Ser303Ile | . | | 0.00001649 | | | | Damaging | 0.653594136 | No record
c.986G>A | p.Arg329His | . | | 0.00001649 | | | | Damaging | 0.66037359 | unknown function
c.1136A>G | p.Tyr379Cys | rs773704286 | | 0.00001649 | 10083 | 1 | 0 | Damaging | 0.665754704 | No record
c.1214A>T | p.Glu405Val | . | | 0.00001649 | | | | Tolerated | 0.675922464 | No record
c.146A>G | p.Asp49Gly | . | CYP2C9*37 | 0.00001649 | | | | Tolerated | 0.678200758 | uncertain function
c.422T>C | p.Ile141Thr | rs148615754 | | 0.00001649 | | | | Tolerated | 0.681659253 | No record
c.1159A>G | p.Ile387Val | rs764211126 | CYP2C9*56 | 0.00001649 | 10083 | 1 | 0 | Tolerated | 0.688530952 | No record
c.968T>C | p.Val323Ala | . | | 0.00001649 | | | | Tolerated | 0.700837059 | No record
c.989T>C | p.Val330Ala | . | | 0.00001649 | | | | Tolerated | 0.727107621 | No record
c.257C>A | p.Ala86Asp | . | | 0.00001648 | | | | Severely Damaging | 0.513086248 | No record
c.38G>A | p.Cys13Tyr | . | | 0.00001648 | | | | Damaging | 0.590756013 | uncertain function
c.445G>A | p.Ala149Thr | . | CYP2C9*46 | 0.00001648 | | | | Damaging | 0.604943716 | No record
c.312A>T | p.Glu104Asp | . | | 0.00001648 | | | | Damaging | 0.638095238 | No record
c.271G>A | p.Gly91Arg | . | | 0.00001648 | | | | Damaging | 0.651222774 | No record
c.296T>A | p.Ile99Asn | . | | 0.00001648 | | | | Damaging | 0.651222774 | No record
c.1415T>C | p.Val472Ala | . | | 0.00001648 | | | | Damaging | 0.668443691 | No record
c.247G>A | p.Val83Met | . | | 0.00001648 | | | | Damaging | 0.669561409 | uncertain function
c.460G>A | p.Glu154Lys | . | 0.00001648 | 10083 | 1 | 0 | TOLERATED | 0.688904038 | decreased function | ||
c.296T>C | p.Ile99Thr | . | 0.00001648 | TOLERATED | 0.6911887 | No record | |||||
c.8C>A | p.Ser3Tyr | . | 0.00001648 | TOLERATED | 0.694461693 | No record | |||||
c.791T>C | p.Ile264Thr | rs761895497 | 0.0000109 | 10082 | 2 | 0 | TOLERATED | 0.683929331 | unknown function | ||
c._del707 | p.Asn236Thrfs*5 | . | 0.000004099 | Severely Damaging | 0.25 | No record | |||||
c.709G>C | p.Val237Leu | . | Not recorded | Damaging | 0.623703953 | No record | |||||
c.229C>T | p.Leu77Met | . | Not recorded | TOLERATED | 0.72516812 | No record |
DMS, deep mutational scanning; ExAC, Exome Aggregation Consortium; RSID, Reference SNP ID number; WT, wild‐type.
Table 2.
Protein abundance scores of CYP2C19 variants from the ExAC browser and the Mayo Right 10K study
Exact cDNA | Exact Amino acid | RSID | Common allele name | Allele Frequency | Right 10K (variant prevalence): WT | Right 10K (variant prevalence): Heterozygous | Right 10K (variant prevalence): Homozygous | DMS: Functional study | DMS: Abundance score | PharmVar |
---|---|---|---|---|---|---|---|---|---|---|
c.991G>A | V331I | rs3758581 | | 0.06242 | 8802 | 1249 | 33 | Damaging | 0.599924823 | Normal function |
c.276G>C | E92D | rs17878459 | CYP2C19*2B | 0.0236 | 9480 | 598 | 6 | TOLERATED | 0.713318402 | No record |
c.518C>T | A173V | rs61311738 | | 0.005818 | 10080 | 4 | 0 | TOLERATED | 0.643519769 | No record |
c.481G>C | A161P | rs181297724 | | 0.004786 | | | | Severely damaging | 0.475593771 | No function |
c.449G>A | R150H | rs58973490 | CYP2C19*11 | 0.002652 | 9985 | 98 | 1 | TOLERATED | 0.69271491 | normal function |
c.55A>C | I19L | rs17882687 | CYP2C19*15 | 0.002127 | 10079 | 5 | 0 | Damaging | 0.623042396 | normal function |
c.358T>C | W120R | rs41291556 | CYP2C19*8 | 0.00154 | 10045 | 39 | 0 | Severely damaging | 0.47978285 | No function |
c.1228C>T | R410C | rs17879685 | CYP2C19*13 | 0.001517 | 10082 | 2 | 0 | TOLERATED | 0.676655086 | Normal function |
c.431G>A | R144H | rs17884712 | CYP2C19*9 | 0.001079 | 10080 | 4 | 0 | Damaging | 0.621114695 | decreased function |
c.365A>C | E122A | rs17885179 | | 0.0009637 | 10082 | 2 | 0 | TOLERATED | 0.753787879 | No record |
c.784G>A | D262N | . | | 0.0008586 | | | | Damaging | 0.614038591 | No record |
c.448C>T | R150C | rs142974781 | | 0.0004859 | 10083 | 1 | 0 | Damaging | 0.62953438 | No record |
c.680C>T | P227L | rs6413438 | CYP2C19*10 | 0.000448 | 10079 | 5 | 0 | Damaging | 0.624712679 | decreased function |
c.985C>T | R329C | rs59734894 | | 0.0003872 | 10078 | 6 | 0 | Severely damaging | 0.554502254 | No record |
c.394C>T | R132W | rs149590953 | | 0.0003871 | 10081 | 3 | 0 | Damaging | 0.635242943 | No record |
c.241G>A | E81K | rs149072229 | | 0.0003871 | 10068 | 16 | 0 | TOLERATED | 0.667726879 | No record |
c.374G>A | R125H | rs141774245 | | 0.0002965 | 10062 | 21 | 0 | Damaging | 0.615675954 | No record |
c.337G>A | V113I | rs145119820 | | 0.0002718 | 10079 | 5 | 0 | TOLERATED | 0.683885811 | No record |
c.1295A>T | K432I | rs146991374 | | 0.000264 | | | | Damaging | 0.609742945 | No record |
c.217C>T | R73C | rs145328984 | CYP2C19*30 | 0.0002553 | 10082 | 1 | 0 | Damaging | 0.623329263 | uncertain function |
c.1078G>A | D360N | rs144036596 | | 0.0002471 | | | | TOLERATED | 0.693150526 | None |
c.395G>A | R132Q | rs72552267 | CYP2C19*6 | 0.0002389 | 10082 | 2 | 0 | TOLERATED | 0.718026114 | No function |
c.164C>G | T55S | rs572853437 | | 0.000132 | | | | TOLERATED | 0.718673467 | No record |
c.557G>C | R186P | rs140278421 | CYP2C19*22 | 0.0001236 | 10083 | 1 | 0 | Severely damaging | 0.560050139 | No function |
c.1004G>A | R335Q | rs118203757 | CYP2C19*24 | 0.0001236 | 10080 | 1 | 0 | Damaging | 0.601312932 | no function |
c.557G>A | R186H | | | 0.000108 | | | | Damaging | 0.635198497 | No record |
c.831C>A | N277K | . | | 9.888E‐05 | 10080 | 3 | 0 | TOLERATED | 0.681062109 | No record |
c.1048G>A | A350T | rs201509150 | | 9.884E‐05 | | | | Severely damaging | 0.536022841 | No record |
c.1127T>A | F376Y | . | | 9.884E‐05 | 10083 | 1 | 0 | TOLERATED | 0.711485306 | No record |
c.25C>G | L9V | . | | 0.0000908 | | | | TOLERATED | 0.731784659 | No record |
c.1007G>T | S336I | rs143833145 | | 9.061E‐05 | | | | TOLERATED | 0.637387252 | No record |
c.1036C>T | P346S | . | | 0.0000906 | | | | Severely damaging | 0.580346079 | No record |
c.556C>T | R186C | rs183701923 | | 8.242E‐05 | | | | Severely damaging | 0.564313994 | No record |
c.389C>T | T130M | rs150152656 | | 8.237E‐05 | 10082 | 2 | 0 | TOLERATED | 0.684802678 | No record |
c.527A>G | N176S | rs57700608 | | 7.417E‐05 | 10074 | 10 | 0 | TOLERATED | 0.671875 | No record |
c.1075A>G | I359V | . | | 7.414E‐05 | | | | Damaging | 0.603645833 | No record |
c.440A>G | E147G | rs147453531 | | 7.413E‐05 | 10081 | 3 | 0 | TOLERATED | 0.643455877 | No record |
c.781C>T | R261W | . | | 6.605E‐05 | | | | Severely damaging | 0.501301905 | No record |
c.836A>C | Q279P | rs61526399 | | 6.591E‐05 | | | | Severely damaging | 0.570907901 | No record |
c.1001A>T | N334I | . | | 0.0000659 | | | | TOLERATED | 0.648759438 | No record |
c.221T>C | M74T | . | | 6.589E‐05 | | | | Severely damaging | 0.58131891 | No record |
c.1324C>T | R442C | rs192154563 | CYP2C19*16 | 5.772E‐05 | | | | Damaging | 0.62128858 | decreased function |
c.1021G>C | D341H | rs770829708 | | 5.766E‐05 | 10081 | 3 | 0 | TOLERATED | 0.656054948 | No record |
c.85C>T | L29F | . | | 4.946E‐05 | | | | TOLERATED | 0.692408379 | No record |
c.837G>T | Q279H | . | | 4.944E‐05 | | | | TOLERATED | 0.678912244 | No record |
c.1003C>T | R335W | rs368758960 | | 4.942E‐05 | 10083 | 1 | 0 | Severely damaging | 0.497372294 | No record |
c.1034T>A | M345K | rs201132803 | | 4.942E‐05 | 10080 | 1 | 0 | Severely damaging | 0.497916398 | No record |
c.371G>A | R124Q | rs200346442 | | 4.942E‐05 | | | | Damaging | 0.602183853 | No record |
c.925G>A | A309T | . | | 4.942E‐05 | | | | TOLERATED | 0.636880409 | No record |
c.218G>A | R73H | . | | 4.119E‐05 | | | | Severely damaging | 0.476206097 | No record |
c.1072T>C | Y358H | . | | 4.119E‐05 | | | | TOLERATED | 0.648167478 | No record |
c.370C>T | R124W | . | | 4.118E‐05 | | | | Severely damaging | 0.505962416 | No record |
c.726T>A | S242R | . | | 3.311E‐05 | | | | TOLERATED | 0.702093879 | No record |
c.801C>A | F267L | rs377674118 | | 3.303E‐05 | 10083 | 1 | 0 | Severely damaging | 0.537597325 | No record |
c.1171C>G | L391V | . | | 3.298E‐05 | | | | Severely damaging | 0.569318436 | No record |
c.1205C>T | P402L | . | | 3.298E‐05 | | | | TOLERATED | 0.642305732 | No record |
c.1216A>G | M406V | rs144056033 | | 3.298E‐05 | 10082 | 2 | 0 | TOLERATED | 0.699731714 | No record |
c.1060G>A | E354K | . | | 3.295E‐05 | | | | Severely damaging | 0.475659325 | No record |
c.896C>G | T299R | . | | 3.295E‐05 | | | | Severely damaging | 0.531790263 | No record |
c.305T>C | L102P | . | | 3.295E‐05 | | | | Severely damaging | 0.574479079 | No record |
c.905C>T | T302I | rs58259047 | | 3.295E‐05 | | | | Damaging | 0.602678118 | No record |
c.1120G>A | V374I | rs113934938 | CYP2C19*28 | 3.295E‐05 | | | | Damaging | 0.627074235 | normal function |
c.961G>T | A321S | . | | 3.295E‐05 | | | | TOLERATED | 0.639949178 | No record |
c.928C>T | L310F | . | | 3.295E‐05 | 10082 | 2 | 0 | TOLERATED | 0.679301841 | No record |
c.907A>G | S303G | . | | 3.295E‐05 | | | | TOLERATED | 0.717662336 | No record |
c.648C>G | C216W | . | | 0.0000274 | | | | TOLERATED | 0.656662248 | No record |
c.671A>T | D224V | . | | 2.561E‐05 | | | | Severely damaging | 0.53699972 | No record |
c.753C>G | H251Q | rs148247410 | | 2.478E‐05 | | | | Severely damaging | 0.563668153 | No function |
c.769A>T | I257F | . | | 2.477E‐05 | | | | TOLERATED | 0.681306328 | No record |
c.631C>A | P211T | . | | 2.476E‐05 | | | | TOLERATED | 0.659306032 | No record |
c.629C>A | T210N | . | | 2.476E‐05 | | | | TOLERATED | 0.684317643 | No record |
c.1316G>T | G439V | . | | 2.474E‐05 | | | | TOLERATED | 0.665474123 | No record |
c.1371C>A | N457K | . | | 2.474E‐05 | | | | TOLERATED | 0.682756843 | No record |
c.1414G>A | V472I | . | | 2.473E‐05 | | | | TOLERATED | 0.670673077 | No record |
c.60G>C | W20C | . | | 2.473E‐05 | | | | TOLERATED | 0.672724033 | No record |
c.562G>A | D188N | rs370803989 | CYP2C19*35 | 2.473E‐05 | 10080 | 4 | 0 | TOLERATED | 0.677997182 | uncertain function |
c.82A>G | K28E | . | | 2.473E‐05 | | | | TOLERATED | 0.680943598 | No record |
c.185G>C | G62A | . | | 2.472E‐05 | | | | Severely damaging | 0.508683191 | No record |
c.190G>A | V64M | rs150045105 | | 2.472E‐05 | | | | TOLERATED | 0.650769501 | No record |
c.848C>T | T283I | . | | 2.472E‐05 | | | | TOLERATED | 0.65615054 | No record |
c.1037C>T | P346L | . | | 2.471E‐05 | | | | Severely damaging | 0.502953907 | No record |
c.1080C>A | D360E | . | | 2.471E‐05 | | | | Severely damaging | 0.549265027 | No record |
c.1078G>C | D360H | rs144036596 | | 2.471E‐05 | 10081 | 3 | 0 | Severely damaging | 0.567361111 | No record |
c.373C>T | R125C | rs200150287 | | 2.471E‐05 | 10083 | 1 | 0 | TOLERATED | 0.669159602 | No record |
c.850A>G | I284V | . | | 2.471E‐05 | | | | TOLERATED | 0.685908298 | No record |
c.430C>T | R144C | . | | 2.471E‐05 | | | | TOLERATED | 0.688887841 | No record |
c.721G>A | E241K | . | | 1.657E‐05 | | | | Damaging | 0.619158302 | No record |
c.728A>G | D243G | . | | 1.655E‐05 | | | | TOLERATED | 0.696277128 | No record |
c.1288G>C | A430P | . | | 1.652E‐05 | | | | Severely damaging | 0.545128622 | No record |
c.813G>A | M271I | . | | 1.652E‐05 | | | | TOLERATED | 0.663920987 | No record |
c.169C>G | L57V | . | | 1.651E‐05 | | | | Damaging | 0.634196461 | No record |
c.1465C>T | P489S | . | | 0.0000165 | | | | Severely damaging | 0.577527563 | No record |
c.1373T>G | L458R | rs761587034 | | 1.649E‐05 | 10083 | 1 | 0 | Severely damaging | 0.51802352 | No record |
c.1349C>A | T450N | rs141690375 | | 1.649E‐05 | | | | Severely damaging | 0.597025262 | No record |
c.1439C>T | P480L | rs779501712 | | 1.649E‐05 | 10083 | 1 | 0 | Damaging | 0.626171339 | No record |
c.1325G>A | R442H | rs138112316 | | 1.649E‐05 | | | | TOLERATED | 0.646762738 | No record |
c.65A>G | Q22R | rs144928727 | | 1.649E‐05 | | | | TOLERATED | 0.656465743 | No record |
c.1398C>A | D466E | . | | 1.649E‐05 | | | | TOLERATED | 0.661470643 | No record |
c.593T>C | M198T | rs186489608 | | 1.649E‐05 | | | | TOLERATED | 0.667063402 | No record |
c.1330G>C | E444Q | . | | 1.649E‐05 | | | | TOLERATED | 0.694466074 | No record |
c.1213G>C | E405Q | . | | 1.649E‐05 | | | | TOLERATED | 0.711027827 | No record |
c.518C>A | A173D | rs61311738 | | 1.648E‐05 | 10080 | 4 | 0 | Severely damaging | 0.499614976 | No record |
c.1076T>A | I359N | . | | 1.648E‐05 | | | | Severely damaging | 0.522206232 | No record |
c.197C>G | T66S | . | | 1.648E‐05 | | | | Damaging | 0.626403544 | No record |
c.862G>A | V288I | . | | 1.648E‐05 | | | | TOLERATED | 0.673350956 | No record |
c.537C>G | C179W | . | | 1.648E‐05 | | | | TOLERATED | 0.694502569 | No record |
c.865A>G | I289V | . | | 1.648E‐05 | | | | TOLERATED | 0.714496776 | No record |
c.445G>A | A149T | . | | 1.647E‐05 | | | | Severely damaging | 0.521050873 | No record |
c.905C>G | T302R | rs58259047 | | 1.647E‐05 | | | | Severely damaging | 0.521776385 | No record |
c.331G>A | G111R | . | | 1.647E‐05 | | | | Severely damaging | 0.541454278 | No record |
c.1021G>A | D341N | . | | 1.647E‐05 | | | | Severely damaging | 0.571470548 | No function |
c.409G>C | G137R | . | | 1.647E‐05 | | | | Severely damaging | 0.571889977 | No record |
c.1013G>T | C338F | . | | 1.647E‐05 | | | | Damaging | 0.609881164 | No record |
c.217C>A | R73S | rs145328984 | | 1.647E‐05 | 10082 | 1 | 0 | TOLERATED | 0.651405999 | No record |
c.410G>A | G137E | . | | 1.647E‐05 | | | | TOLERATED | 0.653990969 | No record |
c.326G>C | G109A | . | | 1.647E‐05 | | | | TOLERATED | 0.658559311 | No record |
c.347A>G | N116S | . | | 1.647E‐05 | | | | TOLERATED | 0.666187952 | No record |
c.271G>C | G91R | rs118203756 | CYP2C19*23 | 1.647E‐05 | | | | TOLERATED | 0.668239425 | uncertain function |
c.1002C>A | N334K | rs563052490 | | 1.647E‐05 | 10081 | 3 | 0 | TOLERATED | 0.687674269 | No record |
c.1112C>G | T371S | rs568155950 | | 1.647E‐05 | 10083 | 1 | 0 | TOLERATED | 0.715706169 | No record |
c.578A>G | Q193R | | | 4.06E‐06 | | | | TOLERATED | 0.658206476 | No record |
ExAC, Exome Aggregation Consortium; RSID, Reference SNP ID number; WT, wild‐type.
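The Right 10K genotype counts in Table 2 can be converted to a cohort allele frequency for comparison with the ExAC value in the same row. The sketch below assumes a diploid, biallelic site; the function name `allele_frequency` is introduced here for illustration and is not part of the study's analysis pipeline.

```python
def allele_frequency(wt: int, het: int, hom: int) -> float:
    """Variant allele frequency from genotype counts (diploid, biallelic site)."""
    n_subjects = wt + het + hom
    if n_subjects == 0:
        raise ValueError("no genotyped subjects")
    # Each heterozygote carries one variant allele, each homozygote carries two.
    return (het + 2 * hom) / (2 * n_subjects)


# Right 10K counts for CYP2C19 c.991G>A (V331I) as listed in Table 2.
af_v331i = allele_frequency(wt=8802, het=1249, hom=33)
print(f"{af_v331i:.4f}")  # cohort allele frequency, to compare with ExAC 0.06242
```

For this row the cohort estimate is about 0.065, close to the ExAC frequency of 0.06242 listed in the table.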
Validation of severely damaging variants of CYP2C9 and CYP2C19
In silico predictions were not always consistent with our DMS results, as described in the preceding paragraph, which emphasizes the need to validate variant classifications using DMS or other functional assays. Although DMS called damaging variants with higher throughput than classical mutagenesis methods, we still needed to confirm the accuracy of the calls for the variants that we studied. Therefore, we validated our variant protein expression calls by Western blot analysis. As shown in Figure 5a,b, our newly identified severely damaging variants for CYP2C9 and CYP2C19 displayed significantly decreased protein expression (< 25% of WT protein expression, with the exception of CYP2C9 371G>T, which had ~ 50%) compared with the WT protein, which is shown at the far right in each of the panels. The binning patterns for variant frequencies for selected allozymes are shown in Figure 5c,d.
Figure 5.
Western blot validation of cytochrome P450 family 2 subfamily C member 9 (CYP2C9) and cytochrome P450 family 2 subfamily C member 19 (CYP2C19) allozymes identified as containing severely damaging variants. (a,b) The protein expression of CYP2C9 and CYP2C19 in blue fluorescent protein−/mCherry+ cells that had integrated severely damaging variants was validated by Western blot analysis. mCherry and β‐actin were used as loading controls. Each image includes a control lane for wild‐type (WT) CYP2C9 or CYP2C19. (c,d) Variant frequencies for three representative “severely” damaging variants for CYP2C9 and CYP2C19 showing their distributions in each of the four bins.
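Expression "relative to WT" on a Western blot is commonly computed by dividing each band intensity by its loading control and then dividing by the same ratio for the WT lane. The sketch below illustrates that arithmetic only. The band intensities are hypothetical placeholders rather than measurements from Figure 5, and the function name and the use of 0.25 as a cutoff are assumptions based on the ~ 25% level mentioned in the text.

```python
def relative_expression(variant_band: float, variant_loading: float,
                        wt_band: float, wt_loading: float) -> float:
    """Loading-control-normalized variant expression as a fraction of WT."""
    if variant_loading <= 0 or wt_loading <= 0 or wt_band <= 0:
        raise ValueError("loading controls and WT band must be positive")
    variant_norm = variant_band / variant_loading
    wt_norm = wt_band / wt_loading
    return variant_norm / wt_norm


# Hypothetical densitometry values (arbitrary units), for illustration only.
fraction_of_wt = relative_expression(
    variant_band=200.0, variant_loading=1000.0,
    wt_band=1000.0, wt_loading=1000.0,
)
is_below_cutoff = fraction_of_wt < 0.25  # assumed cutoff of ~25% of WT
print(fraction_of_wt, is_below_cutoff)
```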
Discussion
The functional characterization of ORF missense variants in clinically important pharmacogenes remains a major challenge for pharmacogenomics. In a recent study, we identified and functionally characterized six novel nonsynonymous ORF variants in CYP2C9 and seven in CYP2C19 based on Mayo Right 1K data.9 We found that the enzyme activities of those variants generally correlated well with protein expression levels.9 Missense variants in CYP2C9/CYP2C19 may alter protein folding, leading to decreased protein expression as a result of accelerated proteasome‐mediated degradation, a major factor responsible for decreased enzymatic activity in pharmacogenomics.9, 21, 22, 23 Because allozymes containing nonsynonymous CYP2C9/CYP2C19 ORF single nucleotide polymorphisms lose function through decreased protein expression, their function can be analyzed by fluorescence reporter assays.9 Given the very large number of missense variants in ORFs, it is impractical to link the genotypes of these variants to their functional phenotypes using “one‐at‐a‐time” expression systems. Fowler and colleagues developed DMS to analyze variants for all possible amino acid alterations in several genes. Saturation mutagenesis was also used for DMS in several previous studies.9, 24, 25 That approach has advantages for use in pre‐emptive pharmacogenomics and makes it possible to interpret variant function based on protein structural mapping.16, 26 Degenerate codons were used to generate the saturation libraries, but some variants of interest may be missed due to codon bias, with up to 30–50% of possible variants missing from the final data sets.16 CYP2C9 and CYP2C19 each have ORFs that are 1.6 kb in length, so it would be difficult to generate saturation mutation libraries.
As a result, we chose to apply a modification of the nicking mutagenesis method developed by Whitehead et al.17 to create focused libraries of missense variants that had been reported to occur in humans. Specifically, we analyzed 230 nonsynonymous ORF variants for CYP2C9 and CYP2C19 from the ExAC study that had minor allele frequencies > 0.00001. All of those variants had been shown to occur in humans and were not so rare as to be “private.” FACS was used to separate variants with differing protein expression levels, and the sorted populations were subsequently analyzed by NGS, which made it possible to calculate the frequency of each of the variants studied in each of our four FACS bins (see Figures 2 and 3).
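The step from per‐bin read counts to a single abundance score can be sketched as follows. This is a minimal illustration that assumes a weighted average of bin frequencies with weights of 0.25, 0.50, 0.75, and 1.00 from the lowest to the highest fluorescence bin, in the style of the multiplex abundance assay cited as reference 16. The exact scoring procedure used in this study is not restated in this section, so the weights, the function names, and the read counts below are assumptions for illustration.

```python
from typing import List, Sequence

# Assumed bin weights, ordered from lowest to highest fluorescence bin.
BIN_WEIGHTS = (0.25, 0.50, 0.75, 1.00)


def bin_frequencies(variant_counts: Sequence[int],
                    bin_totals: Sequence[int]) -> List[float]:
    """Frequency of one variant in each bin: variant reads / all reads in that bin."""
    return [c / t if t else 0.0 for c, t in zip(variant_counts, bin_totals)]


def weighted_score(freqs: Sequence[float],
                   weights: Sequence[float] = BIN_WEIGHTS) -> float:
    """Weighted average of the bin weights, weighted by the variant's bin frequencies."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("variant was not observed in any bin")
    return sum(w * f for w, f in zip(weights, freqs)) / total


# Hypothetical read counts for one variant and for each whole bin.
freqs = bin_frequencies([10, 0, 0, 0], [100, 100, 100, 100])
print(weighted_score(freqs))  # a variant found only in the lowest bin scores 0.25
```

Under this assumed weighting, scores are bounded between 0.25 and 1.00, and a variant recovered only from the lowest bin scores exactly 0.25, which is the value listed for the frameshift variant in Table 1.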
X‐ray crystal structures have been determined for CYP2C9 and CYP2C19.27, 28 Six substrate recognition sites (SRS) have been identified in CYP2C enzyme sequences: amino acids 96–117 (SRS1), 198–205 (SRS2), 233–240 (SRS3), 286–304 (SRS4), 359–369 (SRS5), and 470–477 (SRS6).29 Nineteen of the 75 CYP2C9 and 9 of the 58 CYP2C19 damaging variants listed in Tables 1 and 2, which displayed reductions in protein expression of at least 50%, mapped to known SRS, so they may also influence substrate binding. Other damaging variants listed in Tables 1 and 2 fall outside of those sites but may influence activity due to the disruption of active sites, although they have no influence on protein abundance. For example, we have reported that CYP2C9 709G>C and CYP2C19 65A>G displayed significantly reduced enzyme activity, but their protein levels were similar to that of the WT.9 In silico predictions have been widely applied to predict variation in protein structure and function. Our previous work and that of others support the importance of applying additional, functional methods to validate results obtained by using predictive algorithms.30, 31 Therefore, we compared variant calling by DMS with the predictions of computational algorithms, and differences were found between our results and those of the prediction algorithms, as listed in Tables S4 and S5. Those differences may be due to either the accuracy of the prediction algorithms or to underlying molecular mechanisms. For example, CYP2C9 709G>C, CYP2C19 65A>G, and CYP2C19*13 have WT‐like protein expression but loss of enzyme activity.9, 32 We also determined whether variants identified by DMS were truly damaging by the use of Western blot analyses, as shown in Figure 5.
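The mapping of a variant onto the six SRS can be expressed as a simple interval lookup using the residue ranges given above. The sketch below is illustrative; the helper names `residue_position` and `srs_for_residue` are introduced here and are not taken from the study's analysis code.

```python
import re
from typing import Optional

# Residue ranges (inclusive) for the six substrate recognition sites listed in the text.
SRS_REGIONS = {
    "SRS1": (96, 117),
    "SRS2": (198, 205),
    "SRS3": (233, 240),
    "SRS4": (286, 304),
    "SRS5": (359, 369),
    "SRS6": (470, 477),
}


def residue_position(protein_change: str) -> int:
    """Extract the residue number from a change such as 'p.Lys200Asn' or 'K200N'."""
    match = re.search(r"\d+", protein_change)
    if match is None:
        raise ValueError(f"no residue number found in {protein_change!r}")
    return int(match.group())


def srs_for_residue(position: int) -> Optional[str]:
    """Return the SRS containing this residue, or None if it lies outside all six."""
    for name, (start, end) in SRS_REGIONS.items():
        if start <= position <= end:
            return name
    return None


# Variants taken from Tables 1 and 2.
for change in ("p.Lys200Asn", "V288I", "I359N", "p.Ala477Thr", "p.Pro211Leu"):
    print(change, srs_for_residue(residue_position(change)))
```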
A limitation of DMS based on fluorescence is that some genes have been found not to be amenable to this assay, because damaging variants for those genes had fluorescence intensities similar to those of the WT proteins.16 We found that DMS seems to be most sensitive for screening for severely damaging variants, which displayed clear fluorescence separation from WT‐like variants. However, intermediate‐fluorescing variants required careful interpretation. The validation of functional studies for variants characterized in this fashion will be essential if we are to incorporate these results into clinical decision making and electronic health records. To validate the most severely damaging variants that we identified by DMS, we used Western blot assays as our standard functional assay, even though those studies were laborious and time‐consuming; they remain necessary at this time. Protein expression is an important aspect of functional genomics, but only one aspect. Additional functional validation based on enzyme activities for different CYP substrates should be performed in the future to further extend the functional characterization of the variant allozymes that we have studied. The regulation of CYP activity is a complex process involving multiple mechanisms, which include transcriptional regulation by nuclear receptors, such as the pregnane X receptor, the constitutive androstane receptor, and the glucocorticoid receptor, and by members of the TSPYL gene family.33, 34, 35, 36 DMS, as we have used it, has the limitation of not addressing upstream DNA sequence variants, such as CYP2C19*17, which results in an increase in transcriptional activity.37, 38, 39 High‐throughput methodology for studying variants outside of ORFs will be required for the interpretation of CYP variants that map outside of protein coding sequences.
As a result, DMS is not a “final answer” but rather represents a significant step forward in our efforts to link genomic variation to variation in drug response phenotypes.
In summary, we have identified and validated 15 CYP2C9 and 30 CYP2C19 severely damaging variants that had not previously been reported in PharmVar.20 Those variants are potentially clinically actionable. Functional studies of those variants showed decreased protein expression, which could result in decreased drug metabolism. Our results add information that may help to improve the accuracy of current prediction algorithms and they may also provide test data sets for machine‐learning methods that might “learn” to predict the effects of ORF missense variants. The Mayo Clinic recently expanded the RIGHT 1K study to include an even larger cohort, the RIGHT 10K study that includes an additional 10,085 DNA samples with sequencing data for 77 pharmacogenes.6 Single nucleotide polymorphisms from both RIGHT 1K and RIGHT 10K studies were included in our analyses. This same DMS methodology can now be implemented to study other important pharmacogenes and preemptive NGS Mayo RIGHT 10K data as ever larger numbers of ORF missense variants are identified.
Funding
This study was funded by National Institutes of Health (NIH) grants U19 GM61388 (The Pharmacogenomics Research Network), R01 GM28157, R01 GM125633, and T32 GM08685, and by the Mayo Clinic Center for Individualized Medicine.
Conflict of Interest
Drs. Weinshilboum and Wang are both co‐founders of and stockholders in OneOme, LLC. All other authors declared no competing interests for this work.
Author Contributions
L.Z., V.S., J.Y., D.L., and R.W. wrote the manuscript. L.Z., V.S., J.Y., L.W., and R.W. designed the research. L.Z., I.M., S.D., and J.R. performed the research. L.Z., V.S., and K.K. analyzed the data. All authors have given final approval of the manuscript for submission.
Supporting information
Figure S1.
Table S1.
Table S2.
Table S3.
Table S4.
Table S5.
Table S6.
Supplemental Text.
Contributor Information
Liewei Wang, Email: Wang.Liewei@mayo.edu.
Richard Weinshilboum, Email: weinshilboum.richard@mayo.edu.
References
- 1. Weinshilboum, R.M. & Wang, L. Pharmacogenomics: precision medicine and drug response. Mayo Clin. Proc. 92, 1711–1722 (2017).
- 2. Aldrich, S.L., Poweleit, E.A., Prows, C.A., Martin, L.J., Strawn, J.R. & Ramsey, L.B. Influence of CYP2C19 metabolizer status on escitalopram/citalopram tolerability and response in youth with anxiety and depressive disorders. Front. Pharmacol. 10, 99 (2019).
- 3. Hicks, J.K. et al. Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for CYP2D6 and CYP2C19 genotypes and dosing of selective serotonin reuptake inhibitors. Clin. Pharmacol. Ther. 98, 127–134 (2015).
- 4. Sim, S.C. et al. A common novel CYP2C19 gene variant causes ultrarapid drug metabolism relevant for the drug response to proton pump inhibitors and antidepressants. Clin. Pharmacol. Ther. 79, 103–113 (2006).
- 5. Veldic, M. et al. Cytochrome P450 2C19 poor metabolizer phenotype in treatment resistant depression: treatment and diagnostic implications. Front. Pharmacol. 10, 83 (2019).
- 6. Bielinski, S.J. et al. Preemptive genotyping for personalized medicine: design of the right drug, right dose, right time‐using genomic data to individualize treatment protocol. Mayo Clin. Proc. 89, 25–33 (2014).
- 7. Chiasson, M., Dunham, M.J., Rettie, A.E. & Fowler, D.M. Applying multiplex assays to understand variation in pharmacogenes. Clin. Pharmacol. Ther. 106, 290–294 (2019).
- 8. Daly, A.K., Rettie, A.E., Fowler, D.M. & Miners, J.O. Pharmacogenomics of CYP2C9: functional and clinical considerations. J. Pers. Med. 8, pii: E1 (2017).
- 9. Devarajan, S. et al. Pharmacogenomic next‐generation DNA sequencing: lessons from the identification and functional characterization of variants of unknown significance in CYP2C9 and CYP2C19. Drug Metab. Dispos. 47, 425–435 (2019).
- 10. Vaser, R., Adusumalli, S., Leng, S.N., Sikic, M. & Ng, P.C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).
- 11. Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Meth. 7, 248–249 (2010).
- 12. Choi, Y. & Chan, A.P. PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics 31, 2745–2747 (2015).
- 13. Kinney, J.B. & McCandlish, D.M. Massively parallel assays and quantitative sequence‐function relationships. Annu. Rev. Genomics Hum. Genet. 20, 99–127 (2019).
- 14. Fowler, D.M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Meth. 11, 801–807 (2014).
- 15. Matreyek, K.A., Stephany, J.J. & Fowler, D.M. A platform for functional assessment of large variant libraries in mammalian cells. Nucleic Acids Res. 45, e102 (2017).
- 16. Matreyek, K.A. et al. Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 50, 874–882 (2018).
- 17. Wrenbeck, E.E., Klesmith, J.R., Stapleton, J.A., Adeniran, A., Tyo, K.E. & Whitehead, T.A. Plasmid‐based one‐pot saturation mutagenesis. Nat. Meth. 13, 928–930 (2016).
- 18. Lek, M. et al. Analysis of protein‐coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
- 19. Stepanenko, A.A. & Dmitrenko, V.V. HEK293 in cell biology and cancer research: phenotype, karyotype, tumorigenicity, and stress‐induced genome‐phenotype evolution. Gene 569, 182–190 (2015).
- 20. Gaedigk, A. et al. The Pharmacogene variation (PharmVar) consortium: incorporation of the human cytochrome P450 (CYP) allele nomenclature database. Clin. Pharmacol. Ther. 103, 399–401 (2018).
- 21. Wang, L., Nguyen, T.V., McLaughlin, R.W., Sikkink, L.A., Ramirez‐Alvarado, M. & Weinshilboum, R.M. Human thiopurine S‐methyltransferase pharmacogenetics: variant allozyme misfolding and aggresome formation. Proc. Natl. Acad. Sci. USA 102, 9394–9399 (2005).
- 22. Wang, L., Yee, V.C. & Weinshilboum, R.M. Aggresome formation and pharmacogenetics: sulfotransferase 1A3 as a model system. Biochem. Biophys. Res. Commun. 325, 426–433 (2004).
- 23. Li, F., Wang, L., Burgess, R.J. & Weinshilboum, R.M. Thiopurine S‐methyltransferase pharmacogenetics: autophagy as a mechanism for variant allozyme degradation. Pharmacogenet. Genomics 18, 1083–1094 (2008).
- 24. Fowler, D.M., Stephany, J.J. & Fields, S. Measuring the activity of protein variants on a large scale using deep mutational scanning. Nat. Protoc. 9, 2267–2284 (2014).
- 25. Starita, L.M. et al. Activity‐enhancing mutations in an E3 ubiquitin ligase identified by high‐throughput mutagenesis. Proc. Natl. Acad. Sci. USA 110, E1263–E1272 (2013).
- 26. Gasperini, M., Starita, L. & Shendure, J. The power of multiplexed functional analysis of genetic variants. Nat. Protoc. 11, 1782–1787 (2016).
- 27. Wang, J.F., Wei, D.Q., Li, L., Zheng, S.Y., Li, Y.X. & Chou, K.C. 3D structure modeling of cytochrome P450 2C19 and its implication for personalized drug design. Biochem. Biophys. Res. Commun. 355, 513–519 (2007).
- 28. Reynald, R.L., Sansen, S., Stout, C.D. & Johnson, E.F. Structural characterization of human cytochrome P450 2C19: active site differences between P450s 2C8, 2C9, and 2C19. J. Biol. Chem. 287, 44581–44591 (2012).
- 29. Gotoh, O. Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins inferred from comparative analyses of amino acid and coding nucleotide sequences. J. Biol. Chem. 267, 83–90 (1992).
- 30. Rettie, A.E. & Jones, J.P. Clinical and toxicological relevance of CYP2C9: drug‐drug interactions and pharmacogenetics. Annu. Rev. Pharmacol. Toxicol. 45, 477–494 (2005).
- 31. Flanagan, S.E., Patch, A.M. & Ellard, S. Using SIFT and PolyPhen to predict loss‐of‐function and gain‐of‐function mutations. Genet. Test Mol. Biomarkers 14, 533–537 (2010).
- 32. Zi, J. et al. Effects of CYP2C9*3 and CYP2C9*13 on diclofenac metabolism and inhibition‐based drug‐drug interactions. Drug Metab. Pharmacokinet. 25, 343–350 (2010).
- 33. Chen, Y., Ferguson, S.S., Negishi, M. & Goldstein, J.A. Induction of human CYP2C9 by rifampicin, hyperforin, and phenobarbital is mediated by the pregnane X receptor. J. Pharmacol. Exp. Ther. 308, 495–501 (2004).
- 34. Sahi, J., Shord, S.S., Lindley, C., Ferguson, S. & LeCluyse, E.L. Regulation of cytochrome P450 2C9 expression in primary cultures of human hepatocytes. J. Biochem. Mol. Toxicol. 23, 43–58 (2009).
- 35. Dvorak, Z. & Pavek, P. Regulation of drug‐metabolizing cytochrome P450 enzymes by glucocorticoids. Drug Metab. Rev. 42, 621–635 (2010).
- 36. Qin, S. et al. TSPYL family regulates CYP17A1 and CYP3A4 expression: potential mechanism contributing to abiraterone response in metastatic castration‐resistant prostate cancer. Clin. Pharmacol. Ther. 104, 201–210 (2018).
- 37. Scott, S.A. et al. Clinical Pharmacogenetics Implementation Consortium guidelines for cytochrome P450–2C19 (CYP2C19) genotype and clopidogrel therapy. Clin. Pharmacol. Ther. 90, 328–332 (2011).
- 38. Li‐Wan‐Po, A., Girard, T., Farndon, P., Cooley, C. & Lithgow, J. Pharmacogenetics of CYP2C19: functional and clinical implications of a new variant CYP2C19*17. Br. J. Clin. Pharmacol. 69, 222–230 (2010).
- 39. Shirasaka, Y. et al. Interindividual variability of CYP2C19‐catalyzed drug metabolism due to differences in gene diplotypes and cytochrome P450 oxidoreductase content. Pharmacogenomics J. 16, 375–387 (2016).