Abstract
Aim:
Perform in silico predictions of functional consequences of CYP2C9 variants identified by next-generation sequencing in Puerto Ricans.
Methods:
Identified low-frequency CYP2C9 variants (minor allele frequencies <2%) were evaluated using the Combined Annotation-Dependent Depletion (CADD v1.3) tools and molecular modeling/docking analysis to predict impact on CYP2C9 activity.
Results:
CYP2C9*5, *8, *9, *11, *12, *21 and a novel *61 induce conformational changes that affect the binding site of S-warfarin. Most of these deleterious variants occur at higher frequency among individuals with a large proportion of African ancestry.
Conclusion:
The unfavorable distance of S-warfarin from the heme group and the weaker binding interactions caused by these CYP2C9 variants suggest major complications during warfarin therapy. This study contributes to the field by predicting functional alterations of rare CYP2C9 variants for the first time in Hispanics.
Keywords: Caribbean Hispanics, CYP2C9, DNA sequencing, docking, pharmacogenetics, S-warfarin, single nucleotide variations
Warfarin (also known as Coumadin®) is an oral anticoagulant commonly prescribed to prevent thromboembolisms in patients with atrial valve replacement, pulmonary embolism, deep vein thrombosis, myocardial infarction and atrial fibrillation, among other indications [1]. S-warfarin is mainly metabolized by the CYP450 isoform 2C9 (CYP2C9) to form 7-hydroxy warfarin [2]. Variants in warfarin-related genes (i.e., CYP2C9 and VKORC1) in combination with other nongenetic clinical factors explain approximately 50% of variability in warfarin dose requirements [3].
CYP450s are relevant for biotransformation of drugs and xenobiotics. Specifically, CYP2C9 has an important role in drug metabolism, with 12.8% of prescribed drugs being transformed through this Phase I metabolic pathway [4]. CYP2C9 not only metabolizes warfarin, but also other commonly used drugs such as phenytoin and valproic acid (anticonvulsants), candesartan and losartan (angiotensin receptor blockers), glyburide and tolbutamide (antidiabetic drugs) and NSAIDs [5]. Williams et al. reported the crystal structure of the drug-metabolizing CYP2C9 enzyme [6]. They identified a large hydrophobic pocket in this enzyme that suggests either an ability to accommodate more than one ligand simultaneously, or displacement of the ligand toward the heme group for hydroxylation by a second ligand [6]. Genetic variants resulting in amino acid substitutions are able to change the hydroxylation capacity of the enzyme; however, the effects are substrate specific [7–9]. CYP2C9*2 and CYP2C9*3, the most common CYP2C9 genetic predictors of warfarin dose requirements, are known to decrease metabolism of warfarin. CYP2C9*2 alters the interaction of the cytochrome with the P450 oxidoreductase, whereas CYP2C9*3 occurs in the active site of the enzyme, reducing the catalytic activity [10,11].
Many adverse drug effects are explained by polymorphisms in genes encoding members of the cytochromes (CYPs) [12]. Furthermore, >40% of the US FDA recommendations based on pharmacogenetic information include a gene encoding a member of the CYPs [13]. A previous report postulated that the majority of genetic variants identified in CYPs are rare or novel variants [12]. Hence, the effect of these genetic variations merits an in-depth evaluation. However, clinical annotation regarding pathogenicity of genetic variations in existing databases (e.g., Human Gene Mutation Database and ClinVar) is biased between Europeans and non-Europeans [14]. The positive correlation between African ancestry and the number of functionally neutral variants (common and nondeleterious variants) was shown to decrease as more annotated pathogenic and low-frequency variants are included [14]. This correlation suggests that Africans have higher genetic diversity than Europeans, but most of these variants lack pathogenicity annotations in the databases [14]. Ethno-specific genetic variants have been shown to increase predictability of warfarin dose requirements [15]. For example, most genetic-guided dosing algorithms only include CYP2C9*2 and CYP2C9*3, but Scott et al. found that the inclusion of CYP2C9*8 and other variants relevant to the African population reclassified 15% of the patients from one predicted metabolic phenotype to another [16].
From a genetic standpoint, Hispanics are admixed individuals with varying degrees of contribution from African, European and Native American ancestries. Puerto Ricans are Caribbean Hispanics characterized by having a greater African contribution than other Hispanics [17]. Given their high level of genetic admixture and African ancestry, along with a limited characterization of their Native American component, we reasoned that Puerto Ricans may harbor some rare genetic variants at CYP2C9 and VKORC1 that might not be properly characterized in terms of their structural alterations, functional consequences and deleteriousness. Additionally, Caribbean Hispanics are under-represented in pharmacogenetics studies [18]. Accordingly, the present study was aimed at performing in silico predictions of structural alterations of low-frequency genetic variants identified by next-generation sequencing (NGS) in Puerto Ricans.
Methods
Subjects
This is a secondary analysis of data from 108 participants in an open-label, single-center, retrospective cohort study (ClinicalTrials.gov Identifier NCT01318057). Patients on warfarin were recruited from the Veterans Affairs Caribbean Healthcare System (VACHS)-affiliated anticoagulation clinic in San Juan, Puerto Rico, which serves a predominantly Caribbean Hispanic population. A full description of the study cohort as well as detailed information on the patient recruitment process can be found elsewhere [19]. The study was approved by the Institutional Review Boards of the Veterans Affairs Caribbean Healthcare System (#00558) and the University of Puerto Rico at Medical Sciences Campus (A4070109). The clinical protocol was conducted according to the principles in the Declaration of Helsinki. Written informed consent was obtained from each participant prior to enrollment.
DNA sequencing
Targeted next-generation sequencing of genomic DNA specimens from 108 participants was performed with the Ion Personal Genome Machine from Life Technologies (CA, USA), according to the manufacturer’s protocol. A full description of this procedure can be found elsewhere [20]. Briefly, the Ion AmpliSeq™ Designer software was used to generate a custom panel of 339 PCR primer pairs directed toward the CYP2C9 and VKORC1 loci as well as their flanking regions (5 kb up and downstream). DNA libraries for NGS were constructed using the Ion AmpliSeq 2.0 Library Preparation and equalized to ensure similar final concentrations among samples (∼100 pM) with the Ion Library Equalizer™ (Thermo Fisher Scientific; MA, USA). The sequencing results were analyzed with the Torrent Suite Software (v4.6). The aligned sequences were then uploaded to the Ion Reporter software cloud for variant calling and annotation with the Torrent Variant Caller algorithm (TVC v4.4-8). A total of 459 variants including single nucleotide variants (SNVs) and insertion/deletions were identified by NGS and annotated; 376 SNVs remained after removing low-quality variants (Q <30).
Characterization of low-frequency variants
Low-frequency variants (minor allele frequencies [MAF] <2%) were identified and evaluated for potential deleteriousness using the Combined Annotation Dependent Depletion tool (CADD v1.3) from the University of Washington (http://cadd.gs.washington.edu/info). CADD applies a ‘Phred-scale’ scoring system (C-score) to coding and noncoding genetic variants (single nucleotide variants and insertion/deletions), integrating different types of information: conservation, amino acid substitutions, pathogenicity and experimentally measured regulatory effects. The tool prioritizes functional, deleterious and disease-causal variants using the following scores: Grantham, SIFT, PolyPhen (PolyP), GERP, PhastCons and PhyloP. Grantham, SIFT and PolyP provide information about the effects of amino acid substitutions. GERP, PhastCons and PhyloP provide information about conservation of the locus. The C-score is based on a ranking of variant deleteriousness [21]. Variants with higher C-scores (preferably >10) occurring in coding regions were selected for in silico modeling and analysis using Phyre v2.0, AutoDock v4.2.6 and PyMOL v1.7.4 as molecular viewer [22].
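The selection rule described above (MAF <2%, coding region, C-score preferably >10) can be sketched as a simple filter. This is an illustrative sketch only, not the authors' pipeline: the `Variant` class and `prioritize` function are hypothetical names, and the example values are a subset taken from Tables 1 and 2 of this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    label: str
    maf: float      # minor allele frequency in the warfarin cohort (Table 2, "MAF Warf")
    c_score: float  # CADD PHRED-like C-score (Table 1)
    coding: bool    # True when the variant lies in a coding region


def prioritize(variants, maf_cutoff=0.02, c_cutoff=10.0):
    """Keep low-frequency coding variants above the C-score cutoff, highest C-score first."""
    kept = [
        v for v in variants
        if v.maf < maf_cutoff and v.c_score > c_cutoff and v.coding
    ]
    return sorted(kept, key=lambda v: v.c_score, reverse=True)


# Subset of the reported variants, used only to illustrate the filter.
variants = [
    Variant("CYP2C9*8", 0.019, 7.88, True),
    Variant("CYP2C9*21", 0.005, 25.1, True),
    Variant("CYP2C9*11", 0.009, 27.9, True),
    Variant("CYP2C9*61", 0.005, 0.001, True),
    Variant("VKORC1 regulatory", 0.004, 13.0, False),
]

selected = [v.label for v in prioritize(variants)]
print(selected)  # ['CYP2C9*11', 'CYP2C9*21']
```

A strict cutoff would exclude CYP2C9*8 and CYP2C9*61, yet both were modeled in this study, which is consistent with the C-score threshold being a preference ("preferably >10") rather than a hard filter.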
Structure prediction of CYP2C9 variants & docking analysis
Polymorphisms resulting in amino acid changes (missense mutations) were evaluated by in silico modeling to determine potential changes in binding of S-warfarin to CYP2C9 variant proteins. The CYP2C9 amino acid sequence was retrieved from the Universal Protein Resource (UniProt; sequence Id: P11712). The wild-type (WT) sequence was modified for the amino acid substitutions resulting from the variants: CYP2C9*5 (D360E), CYP2C9*8 (R150H), CYP2C9*9 (H251R), CYP2C9*11 (R335W), CYP2C9*12 (P489S), CYP2C9*21 (P30L) and CYP2C9*61 (R144C, N457S). The resulting amino acid sequences were submitted in FASTA format, with no gaps, to the Protein Homology/Analogy Recognition Engine server (Phyre v2.0; www.sbg.bio.ic.ac.uk/phyre2) [23]. The results were evaluated for alignment and secondary structure prediction. These predictions were based on the protein sequence of the CYP2C9 variants, comparison with the known structure of the CYP2C9 WT protein and amino acid sequence alignment.
For molecular docking, AutoDock v4.2.6 was used with AutoDockTools v1.5.4 as the graphical user interface [24]. AutoDock is a suite of tools used to predict binding of small molecules (ligands) to proteins of known 3D structure. It is based on a hybrid of a genetic algorithm and an adaptive local search method, named the Lamarckian genetic algorithm. The program can determine protein–ligand interactions from the properties (polar atoms and bond rotations) of the ligand and the binding site of the protein. However, it does not provide information about the intrinsic functionality of the protein. Changes in the 3D structure of the protein caused by mutations of its residues are expected to impair the ability of the ligand to interact with the protein. For ligand preparation, the structure of S-warfarin was drawn using ChemDraw Ultra v7.0 and energy-minimized with MOPAC AM1 in Chem3D Ultra 7.0. The ligand was saved in PDB file format. The PDB file for the S-warfarin-CYP2C9 complex was obtained from the Protein Data Bank (PDB: 1OG5).
For the macromolecule preparation, after removing S-warfarin from the crystal structure, AutoDockTools was used to prepare the WT CYP2C9 binding region and S-warfarin for docking, as well as to create a grid box of 60 × 80 × 60 Å with a grid spacing of 0.375 Å centered on the original position of S-warfarin. For the CYP2C9 variants, the grid box was centered on Phe 476, a key residue that has been identified to form π stacking interactions with the phenyl group of S-warfarin. A flexible docking of 100 genetic algorithm (GA) runs was performed with the number of individuals in the population set to 200 and the maximum number of energy evaluations set to 25,000,000, with other parameters accepted as suggested by AutoDockTools. These parameters were also used for clustering (root mean square deviation = 2 Å) of the results obtained. PyMOL v2.0 (www.pymol.org) was used for protein alignment and as molecular viewer to generate high-quality molecular structures [25].
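The sequence-editing step can be illustrated with a short sketch. It assumes substitutions written in the one-letter form used above (e.g., R144C) with 1-based positions; `apply_substitutions` and `to_fasta` are hypothetical helper names, and the 12-residue sequence is a toy example, not the UniProt P11712 sequence.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply substitutions such as 'R144C' (1-based positions) to a protein sequence."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{sub}: position {pos} is outside the sequence")
        # Guard against mislabeled variants: the stated reference residue must match.
        if residues[pos - 1] != ref:
            raise ValueError(
                f"{sub}: expected {ref} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = alt
    return "".join(residues)


def to_fasta(header, sequence, width=60):
    """Format a sequence as an ungapped FASTA record."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines)


# Toy sequence for illustration only (NOT the real CYP2C9 sequence).
toy = "MDSLRVNLCHAP"
mutant = apply_substitutions(toy, ["R5C", "N7S"])
print(to_fasta("toy_double_variant", mutant))
```

For a haplotype such as CYP2C9*61, both substitutions would be passed in one call so that a single variant sequence is produced. The reference-residue check is a cheap safeguard before a sequence is submitted for modeling.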
Results
Evaluation of low-frequency SNVs identified with NGS
A larger number of SNVs were identified in the CYP2C9 sequenced region (i.e., seven missense, three synonymous, 29 upstream, 25 downstream, five at the 3´-UTR and 274 intronic variants), whereas the VKORC1 locus harbored 24 variants (i.e., two synonymous, four upstream, six downstream, one at the 3´-UTR and 11 intronic variants). A novel SNP at intron 4 of the CYP2C9 gene (chr 10:96,729,552) was found to be in strong linkage disequilibrium with CYP2C9*3 (r² = 0.92; D′ = 1), one of the most important predictors of warfarin dose requirements. Furthermore, a novel haplotype was identified in CYP2C9. It is defined by two SNPs at exons 3 and 9, respectively. This haplotype was recently accepted by the Pharmacogene Variation Consortium (www.PharmVar.org) and is now denoted as CYP2C9*61. The defining SNP (signature) of CYP2C9*61 is a rare missense variant at exon 9 of the gene (rs202201137; chr 10:96,748,682; c.1370A>G transition; N457S). The haplotype also carries the exon 3 variant that defines the CYP2C9*2 allele (rs1799853; c.430C>T, R144C). This haplotype was found in a patient with low warfarin dose requirements (3 mg/day). According to prediction scores assessed with the CADD tool, CYP2C9*61 was classified as nondeleterious. The scores also suggest that CYP2C9*61 is tolerated and benign; therefore, its impact on CYP2C9 enzymatic activity remains unclear.
From the total set of low-frequency SNVs identified by NGS with predicted CADD-estimated deleteriousness (i.e., C-scores >15), nine were in coding regions and one in the regulatory region (Table 1). The variants rs149158426, close to a splicing site, and rs114071557, which affects the translation initiation start site, were present in four individuals requiring <3 mg/day of warfarin. The genomic locations of these SNVs are highly conserved according to GERP, PhastCons and PhyloP scores (Table 1). The variant rs149158426 has only been found in four Caribbean Hispanics of the 1000 Genomes Project [26], but it was also found in Europeans (MAF = 0.0010) and African–Americans (MAF = 0.0007) by eMERGE (Table 2) [26,27]. The rs142240658, rs114071557 and CYP2C9*5 (D360E) variants, all present in our study cohort, have previously been reported at very low frequencies among individuals of European heritage (MAF = 0.00007, 0.0001 and 0.0001, respectively) in eMERGE [27].
Table 1. Combined Annotation Dependent Depletion scores for predicted functionality and conservation of low-frequency variants in CYP2C9 and VKORC1 loci, identified by next-generation sequencing. Variants are organized in descending probability of deleteriousness (according to C-score).
| SNP Id | Allele | Position | Gene | Change | Type | Effect | Grantham‡ (amino acid substitution) | PolyPhen¶ (amino acid substitution) | SIFT§ (amino acid substitution) | PhastCons†† (conservation) | PhyloP‡‡ (conservation) | GERP# (conservation) | C-score† (PHRED-like) | Number of patients | Warfarin stable dose (mg/day) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs28371685 | CYP2C9*11 | 96740981 | CYP2C9 | C>T | Missense | Arg335Trp | 101 | 0.914 | 0.00 | 0.893 | 0.91 | 2.85 | 27.9 | 2 | 2.89 |
| rs142240658 | CYP2C9*21 | 96698528 | CYP2C9 | C>T | Missense | Pro30Leu | 98 | 0.997 | 0.01 | 0.299 | 0.856 | 2.78 | 25.1 | 1 | 3.21 |
| rs2256871 | CYP2C9*9 | 96708974 | CYP2C9 | A>G | Missense | His251Arg | 29 | 0.98 | 0.00 | 0.124 | 1.45 | 3.19 | 23.8 | 4 | 3.84 |
| rs28371686 | CYP2C9*5 | 96741058 | CYP2C9 | C>G | Missense | Asp360Glu | 0 | 0.93 | 0.00 | 0.23 | 0.02 | -0.53 | 23.5 | 3 | 4.00 |
| rs9332339 | CYP2C9*12 | 96748777 | CYP2C9 | C>T | Missense | Pro489Ser | 74 | 0.681 | 0.00 | 0.464 | 0.77 | 2.50 | 23.5 | 1 | 1.71 |
| rs149158426 | rs149158426 | 96709023 | CYP2C9 | G>A | Synonymous | +19 bps splicing site | ND | ND | ND | 0.99 | 1.84 | 3.29 | 15.43 | 1 | 2.89 |
| rs55894764 | rs55894764 | 31106015 | VKORC1 | C>T | Synonymous | Arg12Arg | ND | ND | ND | 0.997 | 0.604 | 2.54 | 15.25 | 3 | 7.02 |
| rs114071557 | CYP2C9*36 | 96698440 | CYP2C9 | A>G | Initiator codon | Loss of transcription start | ND | ND | ND | 0.862 | 1.65 | 3.69 | 15.14 | 2 | 3.00 |
| Unknown | Unknown | 31107155 | VKORC1 | C>T | Regulatory | Regulatory | ND | ND | ND | 0.777 | -0.239 | -1.150 | 13 | 1 | 4.29 |
| rs7900194 | CYP2C9*8 | 96702066 | CYP2C9 | G>A | Missense | Arg150His | 29 | 0.005 | 0.33 | 0.553 | -0.450 | 3.54 | 7.88 | 4 | 4.16 |
| rs202201137; rs1799853 | CYP2C9*61 | 96748682; 96702047 | CYP2C9 | A>G; C>T | Missense | Asn457Ser; Arg144Cys | 46 | 0.009 | 0.83 | 0.014 | 0.368 | 2.30 | 0.001 | 1 | 3.00 |
Warfarin stable dose is an average of warfarin dose requirements of patients who are carriers of the variant.
†C-score is a PHRED-like CADD ranking system that estimates the probability that the variant is deleterious. This score allows prioritizing variants with the highest predicted deleteriousness. Values depend on dataset.
‡Grantham score categorizes amino acid substitutions based on chemical dissimilarity, where higher scores reflect higher evolutionary distance and are expected to be more deleterious.
§SIFT predicts deleteriousness based on the protein structure and function (scores range from 0 to 1, where 0 predicts a higher chance of being damaging).
¶PolyPhen is the probability that the substitution is damaging (scores range from 0 to 1, where values closer to 1 predict deleteriousness).
#GERP is the number of substitutions expected under neutrality minus the number of observed substitutions (in a multiple sequence alignment); positive scores indicate that a site is under evolutionary constraint; values range from -12.3 to 6.17, where values closer to 6.17 indicate conservation.
††PhastCons values in primates (Pri-PhastCons) are a ratio that represents the average rate of substitutions in conserved regions relative to the average rate in nonconserved regions; scores closer to 0 indicate less conservation.
‡‡PhyloP values in primates (Pri-PhyloP) compare the probability of observed substitutions under the hypothesis of neutral evolutionary rate; positive values indicate conservation and negative values indicate selection.
CADD: Combined Annotation Dependent Depletion; ND: Not determined.
Table 2. Minor allele frequencies of low-frequency variants among Africans, Hispanics (Puerto Ricans, Mexican–Americans, Peruvians and Colombians) and Europeans from the 1000 Genomes project as well as Puerto Rican patients on warfarin.
| Gene | SNP Id | Variant | MAF AFR | MAF AMR | MAF EUR | MAF Warf |
|---|---|---|---|---|---|---|
| CYP2C9 | rs142240658 | CYP2C9*21 | <0.001 | 0.0003 | 0.000 | 0.005 |
| CYP2C9 | rs114071557 | CYP2C9*36 | 0.001 | 0.000 | 0.000 | 0.009 |
| CYP2C9 | rs7900194 | CYP2C9*8 | 0.053 | 0.001 | 0.001 | 0.019 |
| CYP2C9 | rs2256871 | CYP2C9*9 | 0.082 | 0.001 | 0.001 | 0.019 |
| CYP2C9 | rs149158426 | NA | 0.000 | 0.006 | 0.000 | 0.004 |
| CYP2C9 | rs28371685 | CYP2C9*11 | 0.024 | 0.001 | 0.002 | 0.009 |
| CYP2C9 | rs28371686 | CYP2C9*5 | 0.017 | 0.001 | 0.000 | 0.014 |
| CYP2C9 | rs9332339 | CYP2C9*12 | 0.000 | 0.000 | 0.000 | 0.005 |
| CYP2C9 | rs202201137 and rs1799853 | CYP2C9*61 | 0.000 | <0.005† | 0.000 | 0.005 |
| VKORC1 | rs55894764 | rs55894764 | 0.000 | 0.019 | 0.030 | 0.014 |
| VKORC1 | NA | NA | 0.000 | 0.000 | 0.000 | 0.004 |
†Among the 104 Puerto Ricans in the 1000 Genomes Project, two heterozygous individuals (HG01190 and HG01325) were identified (MAF = 0.010).
AFR: African; AMR: Hispanic; EUR: European; MAF: Minor allele frequency; NA: Not available/not found; Warf: Warfarin.
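The allele-frequency arithmetic behind the Table 2 footnote can be checked with a minimal sketch. It assumes a diploid autosomal locus, so each individual contributes two chromosomes; `minor_allele_frequency` is a hypothetical helper name.

```python
def minor_allele_frequency(het_carriers, hom_carriers, n_individuals):
    """MAF at a diploid autosomal locus: minor-allele copies / total chromosomes."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    allele_copies = het_carriers + 2 * hom_carriers
    return allele_copies / (2 * n_individuals)


# Two heterozygous carriers among the 104 Puerto Ricans in the 1000 Genomes Project.
maf_1000g_pr = minor_allele_frequency(het_carriers=2, hom_carriers=0, n_individuals=104)
print(round(maf_1000g_pr, 3))  # 0.01
```

Under the same assumption, a single heterozygous carrier among the 108 study participants would give 1/216, or about 0.005, in line with the MAF Warf values reported for the variants seen in one patient.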
Based on scores evaluating the effect of amino acid substitutions (Grantham, SIFT and PolyP), rs142240658, CYP2C9*9 (H251R), CYP2C9*11 (R335W), CYP2C9*5 (D360E) and CYP2C9*12 (P489S) are all deleterious. The CYP2C9*8 (R150H) and CYP2C9*61 (N457S, R144C) substitutions were predicted as benign and tolerated according to PolyPhen and SIFT scores, respectively. Notably, patients who were carriers of all these rare variants (except CYP2C9*5) had low warfarin dose requirements (<4 mg/day). With the exception of CYP2C9*12, most of the observed rare variants identified in the study cohort are more frequent among Africans than in other parental populations [26].
None of the variants identified in VKORC1 resulted in amino acid substitutions. Two rare variants were found within the VKORC1 locus in two different individuals. The rs55894764, a synonymous variant detected in an individual with a warfarin dose requirement of 7 mg/day, is a conserved site as suggested by the PhastCons score. However, patients who are carriers of this variation were also carriers of the VKORC1-1639 G allele (Haplotype B), a known predictor of high warfarin dose requirements. A novel variant located at the 3´-UTR of the gene occurred in three patients who are sensitive to warfarin and were also carriers of the VKORC1 Haplotype A (predictor of low warfarin dose requirements). Conservation scores did not identify this variant within an evolutionarily conserved region. Additionally, an apparently novel variant was found at the regulatory region of VKORC1 in one individual who required a stable warfarin dose of 4.29 mg/day, which is close to the standard of care (5 mg/day).
Molecular modeling of CYP2C9 protein variants & docking analysis
Molecular modeling of CYP2C9 variants
The 3D structures of CYP2C9 protein variants were generated by introducing the FASTA format amino acid sequence in the Phyre v2.0 server. The predicted models were obtained based on the protein sequence of CYP2C9 WT, as well as on amino acid sequence alignment and structure comparison. The predicted models of CYP2C9 protein variants were superimposed and aligned with the available crystal structure of CYP2C9 WT using PyMOL (PDB: 1OG5, Figure 1). Structural conformations of CYP2C9 variants aligned with CYP2C9 WT at nearly the same position and showed calculated root-mean-square deviation (RMSD) <1 Å.
Figure 1. Representation of 3D structure and structure alignment of CYP2C9 wild-type (silver gray) and variant proteins generated using PyMOL.
(A) Representative docking of S-warfarin (magenta) into wild-type CYP2C9 and comparison with x-ray structure (Protein Data Bank: 1OG5, silver gray). Binding energy ΔG = -9.71 kcal/mol. Hydrogen-bond interactions between the ketone of the 3-position substituent of S-warfarin and the side-chain carboxamide of asparagine 217 are shown as dashed lines. Residues 101-111 are omitted for clarity. Silver gray represents the wild-type protein and each color represents a different CYP2C9 variant: (B) CYP2C9*5 (D360E), (C) CYP2C9*8 (R150H), (D) CYP2C9*9 (H251R), (E) CYP2C9*11 (R335W), (F) CYP2C9*12 (P489S), (G) CYP2C9*21 (P30L), and double variant (H) CYP2C9*61 (R144C, N457S).
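The RMSD criterion used to compare the superimposed variant models with the WT structure can be sketched as follows. This is a minimal illustration of the distance measure only: it assumes the two coordinate sets are already superimposed and matched atom by atom, and it does not perform the alignment itself, which in this study was done in PyMOL. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) coordinates in Å."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already-superimposed atom positions (Å), for illustration only.
wt_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
variant_atoms = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]

value = rmsd(wt_atoms, variant_atoms)
print(f"RMSD = {value:.3f} Å; below 1 Å: {value < 1.0}")
```

An RMSD below 1 Å over matched atoms indicates that the overall fold of a model is essentially unchanged, so any effect of a substitution has to be sought in local side-chain contacts instead.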
Structural analyses
To validate our docking calculations, the structure of S-warfarin was re-docked into the available x-ray crystal structure of CYP2C9 WT (Figure 1A). The molecular docking of S-warfarin with CYP2C9 showed that the molecule interacts with the side chains of F476, F100, F114 and N217. The phenyl group of S-warfarin lies predominantly in the hydrophobic pocket of CYP2C9 active site forming a parallel-displaced π-stacking contact with the phenyl group of residues F100 and F476. The 4-hydroxycoumarin moiety of S-warfarin forms parallel π-stacking with F114. Hydrogen bonding interactions are observed between the ketone group in 3-position of S-warfarin and the side-chain carboxamide of N217. The calculated distance between the Fe-Heme and C7 of S-warfarin is 11.1 Å, which is nearly at the same position as the reported x-ray structure (11.0 Å) of CYP2C9 WT with a calculated binding energy of -9.71 kcal/mol, as estimated by docking using AutoDock.
The CYP2C9*5 variant encodes a protein with a substitution at position 360, where a negatively charged glutamate replaces aspartate (D360E) (Figure 1B). Both variants (CYP2C9*3 and CYP2C9*5) occur close to the position where the heme group lies in the active site of CYP2C9. In the WT, the aspartate side chain of D360 forms a hydrogen bond with the hydroxyl group of the T392 side chain. As observed in the model, the carboxylate side chain of the variant E360 is one -CH2- longer than that of D360. This elongation causes the carboxylate group of E360 to point away from T392 and to form hydrogen bonds with the more distant H396 residue. The calculated distance between Fe-Heme and C7 in CYP2C9*5 is 11.0 Å (Figure 2B). The calculated binding energy for this model is -8.45 kcal/mol, which suggests a lower affinity for warfarin than the WT. CYP2C9*8 encodes R150H at the D′-loop of CYP2C9, replacing a 3-carbon aliphatic side chain and positively charged guanidinium group with an aromatic imidazole residue (Figure 1C). According to the model, no hydrogen bonds are formed between the H150 substitution and other residues. The calculated distance between Fe-Heme and C7 in CYP2C9*8 is 11.7 Å, which is 0.7 Å longer than in the WT (Figure 2C). The calculated binding energy is -8.42 kcal/mol. CYP2C9*9 (H251R) encodes an amino acid substitution away from the hydrophobic binding pocket, at the G′-loop of CYP2C9 (Figure 1D). The H251 residue in the WT forms hydrogen bonds with the carboxylate side chain of D262. In the variant model, the flexible guanidinium side chain of R251 points away from D262. The calculated distance between Fe-Heme and C7 in CYP2C9*9 is 11.6 Å, and the binding energy is -8.45 kcal/mol (Figure 2D).
Figure 2. Representative docking calculations of S-warfarin with CYP2C9 wild-type and protein variants.
The protein–ligand interactions and Fe-Heme distance to C7 of S-warfarin are represented for (A) CYP2C9 wild-type, (B) CYP2C9*5 (D360E), (C) CYP2C9*8 (R150H), (D) CYP2C9*9 (H251R), (E) CYP2C9*11 (R335W), (F) CYP2C9*12 (P489S), (G) CYP2C9*21 (P30L) and (H) CYP2C9*61 (R144C, N457S). The AutoDock scores for ligand binding ranged from -8.33 to -8.47 kcal/mol between S-warfarin and variants, and -9.71 kcal/mol with wild-type. The docking calculations and molecular views were generated using AutoDock and PyMOL, respectively. Distances between Fe-Heme and C7 of S-warfarin were measured using PyMOL.
The CYP2C9*11 variant protein (R335W) carries an amino acid substitution that is not within the hydrophobic binding pocket of CYP2C9 and that replaces the positively charged guanidinium group of arginine at position 335 with the nonpolar hydrophobic indole group of tryptophan (Figure 1E). This mutation occurs at the J-J′-turn of CYP2C9, disrupting ionic bonds formed between R335 (J-helix) and the carboxylate side chain of D341 (J′-helix). The calculated distance between Fe-Heme and C7 in CYP2C9*11 is 11.5 Å, and the binding energy is -8.47 kcal/mol (Figure 2E). The CYP2C9*12 variant encodes the P489S mutation at the end of the amino acid sequence of the variant protein (Figure 1F). The pyrrolidine side chain of P489 does not exhibit important residue interactions in the WT. However, the polar hydroxymethyl side chain of the S489 mutant is in close proximity to, and forms a hydrogen bond with, the guanidinium side chain of R157. The docking calculations showed that the distance between Fe-Heme and C7 of the ligand is 11.6 Å, with a binding energy of -8.43 kcal/mol (Figure 2F). The CYP2C9*21 variant protein carries the P30L mutation at the flexible A′-turn site, pointing away from the protein core and active site (Figure 1G). As suggested by the molecular model, the carboxamide backbone group of both the WT P30 and the mutant L30 forms a hydrogen bond with the hydroxyl side chain of T66. The docking calculation showed that the distance between Fe-Heme in CYP2C9*21 and C7 of the ligand is 11.7 Å, and the binding energy is -8.43 kcal/mol (Figure 2G).
The CYP2C9*61 haplotype includes two missense mutations (R144C and N457S) that occur at two distant sites in exons 3 and 9, respectively [28]. The R144C substitution has previously been widely used as a predictor of S-warfarin dose requirements in pharmacogenomic algorithms. The hydrogen bond between R144 and the carboxamide backbone of Q261 in the WT is not observed with the C144 substitution in the variant protein. Similarly, the carbonyl group of the polar carboxamide side chain of N457 in the WT forms hydrogen bonds with the carboxamide side chain of Q324. However, with the S457 substitution, no hydrogen bond between the hydroxymethyl side chain and Q324 is observed (Figure 1H). The docking calculations showed that the distance between Fe-Heme in the CYP2C9*61 haplotype and C7 of S-warfarin is 11.3 Å, and the binding energy is -8.33 kcal/mol, which indicates a lower affinity than for the other genotypes (Figure 2H). The AutoDock scores of S-warfarin binding with these variants (i.e., CYP2C9*5, CYP2C9*8, CYP2C9*9, CYP2C9*11, CYP2C9*12, CYP2C9*21 and CYP2C9*61) ranged from -8.33 to -8.47 kcal/mol, compared with -9.71 kcal/mol for S-warfarin binding to CYP2C9 WT. An increase in the distance (Δ = 0.1 to 0.7 Å) between C7 of S-warfarin and the heme group may decrease the likelihood that the drug is hydroxylated. Notably, the orientation of F476 required for positioning S-warfarin close to the heme group seems to be affected in each of the simulated mutant proteins.
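The docking results reported in this section can be collected and compared against the WT in a few lines. The sketch below uses only the distances and binding energies stated in the text; the `summarize` function is a hypothetical helper, and the choice of 11.0 Å (the x-ray distance) as the WT baseline is an assumption made because the CYP2C9*8 distance is described as 0.7 Å longer than WT.

```python
WT_ENERGY = -9.71    # kcal/mol, S-warfarin docked into CYP2C9 WT
WT_DISTANCE = 11.0   # Å, Fe-heme to C7 in the x-ray structure (assumed baseline; the re-docked pose gave 11.1 Å)

# allele -> (Fe-heme to C7 distance in Å, AutoDock binding energy in kcal/mol), as reported above
docking = {
    "CYP2C9*5": (11.0, -8.45),
    "CYP2C9*8": (11.7, -8.42),
    "CYP2C9*9": (11.6, -8.45),
    "CYP2C9*11": (11.5, -8.47),
    "CYP2C9*12": (11.6, -8.43),
    "CYP2C9*21": (11.7, -8.43),
    "CYP2C9*61": (11.3, -8.33),
}


def summarize(results, wt_distance, wt_energy):
    """Return (allele, distance change in Å, energy change in kcal/mol) relative to WT."""
    rows = []
    for allele, (distance, energy) in results.items():
        rows.append((allele, round(distance - wt_distance, 2), round(energy - wt_energy, 2)))
    return rows


rows = summarize(docking, WT_DISTANCE, WT_ENERGY)
for allele, d_dist, d_energy in rows:
    print(f"{allele:10s}  Δdistance = {d_dist:+.1f} Å  ΔΔG = {d_energy:+.2f} kcal/mol")
```

Positive energy differences mean weaker predicted binding than WT; with these inputs they fall between +1.24 and +1.38 kcal/mol. Under the 11.0 Å baseline, CYP2C9*5 shows no distance change, so the lower bound of the reported 0.1 to 0.7 Å range depends on which WT reference distance is used.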
Discussion
The purpose of the present study was to gain insight into the genetic variants found in admixed Puerto Ricans. Low-frequency variants (MAF <2%) found among Puerto Ricans treated with warfarin were evaluated using in silico tools (i.e., CADD and AutoDock v 4.2.6) after prioritizing mutations located at coding regions of VKORC1 and CYP2C9 [21,29]. These variants are not expected to explain variability in warfarin dose requirements from a population perspective (especially if analyzed using association tests due to their low frequency), but may have the potential to influence warfarin response in carriers. Most of the selected variants have higher frequency among African populations. Although some of the variants evaluated in this study are extremely rare in Africans (i.e., CYP2C9*21 and *61), they are not in Puerto Ricans. Interestingly, most of these variants were found among patients with warfarin dose requirements lower than 4 mg/day.
Overall, we have identified multiple rare variants in both CYP2C9 and VKORC1 pharmacogenes with potentially deleterious functions affecting warfarin dose requirements in Puerto Ricans. Among these variants, seven rare CYP2C9 alleles/haplotypes were evaluated for in silico predictions of functional alterations in Caribbean Hispanics from Puerto Rico, including CYP2C9*5, CYP2C9*8, CYP2C9*9, CYP2C9*11, CYP2C9*12, CYP2C9*21 and CYP2C9*61. The latter being a novel haplotype that has only been described in Puerto Ricans [28]. Our structural analysis of protein variants revealed distal effects of mutated residues that could alter important interactions between S-warfarin and the active site of CYP2C9 protein variants. This work contributes to the field by identifying and predicting the functional alterations of rare CYP2C9 variants, and one novel CYP2C9 haplotype (i.e., CYP2C9*61) through a combination of in silico tools for the first time in Caribbean Hispanics.
In silico modeling predicted a decrease in the binding affinity and an increase in the distance of S-warfarin from the heme group induced by all tested CYP2C9 variants. A stronger binding affinity of S-warfarin in the active site and a shorter distance of its C7 to the heme group, result in greater metabolic activity of the CYP2C9 WT enzyme. The first ligand–protein interaction between S-warfarin and CYP2C9 lies in a well-defined hydrophobic pocket near the catalytic site [6]. However, the principal mechanism by which these CYP2C9 variants seem to reduce the binding of S-warfarin is by placing the residues F100, F114 and A103 distant enough to prevent them from the required ligand–protein interactions. Both, the bicyclic and phenyl rings of S-warfarin form important π-stacking interactions in a parallel configuration with the phenyl side chains of F114 and F100, respectively. Also, the hydrophobic bicyclic ring of S-warfarin forms van der Waals contact with A103 (Figure 2A). Those three hydrophobic interactions of S-warfarin with the protein were disrupted in the presence of CYP2C9 variants. Normally, S-warfarin is positioned relatively distant (10–11 Å) from the hydroxylation site; therefore, it is speculated that it needs to be moved through the big hydrophobic pocket to place the C7 in a favorable position to reach the FeO group, triggered by a protein conformational movement. Accordingly, the F476 residue that has conformational mobility, is essential for this displacement due to strong π-stacking interactions with S-warfarin, trapping the molecule and facilitating this shift toward the heme group. As suggested in Williams et al., this conformational movement may be due to protein–protein interactions with the electron-transfer CYP450 reductase [6,30].
Most of the variants identified with NGS occurred in CYP2C9; only 10% of the variants occurred at the VKORC1 locus (including PRSS53). In fact, VKORC1 is evolutionarily conserved across species, and it is hypothesized that it evolved from its paralog gene, VKORC1L1, to keep pace with the evolutionary emergence of endothelium and vasculature among vertebrates [31]. On the other hand, members of the CYP450 family (e.g., CYP2C9) carry a large number of rare genetic variants that probably resulted from recent genomic events, as a consequence of rapid population growth and weak selective pressure [12]. CYP450s have a role in the biotransformation of endogenous substances and xenobiotics (e.g., drugs).
Previous in vitro studies provide evidence for our in silico predictions. For example, CYP2C9*11 (R335W) was predicted by the CADD tool to be the most deleterious low-frequency variant found in Puerto Ricans. In vitro experiments by Tai et al. showed that the mutant enzyme was unstable in a membrane complex, suggesting that the deleteriousness of CYP2C9*11 arises from improper folding and instability of the enzyme [8]. Furthermore, in vitro assays showed that the protein product of CYP2C9*21 (P30L) is not able to catalyze warfarin C7-hydroxylation, and that CYP2C9*9 (H251R) retained 82% of the enzymatic activity of the WT enzyme [32].
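As an aid to reading CADD-based rankings such as the one above, the following Python sketch converts between a PHRED-like scaled score and the fraction of variants ranked at least as deleterious. It assumes the commonly described definition of the scaled CADD score, -10·log10(rank/total); the function names and example values are hypothetical and are not taken from the study.

```python
import math

def top_fraction(scaled_score):
    """Fraction of all scored variants ranked at least as deleterious.

    Assumes the PHRED-like definition score = -10 * log10(rank / total),
    so a scaled score of 10 maps to the top 10% and 20 to the top 1%.
    """
    return 10 ** (-scaled_score / 10)

def scaled_score(rank, total):
    """PHRED-like scaled score for a variant ranked `rank` (1 = most deleterious) out of `total`."""
    if not 0 < rank <= total:
        raise ValueError("rank must satisfy 0 < rank <= total")
    return -10 * math.log10(rank / total)

# Hypothetical scores for illustration only; NOT results from the study.
for score in (10, 20, 30):
    print(f"scaled score {score}: top {top_fraction(score):.1%} of variants")
```

Under this assumption, higher scaled scores indicate variants predicted to be more deleterious relative to all possible substitutions, which is how a variant can be ranked as the most deleterious in a set without any statement about its in vitro activity.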
Conclusion & future perspective
Contrasting with our in silico predictions, CYP2C9*8 has been found to decrease warfarin metabolism through underexpression of CYP2C9 cDNA [33]. Furthermore, Cavallari et al. found that CYP2C9*8 is in linkage disequilibrium with c.-1188T>C (rs4918758) and c.-1766T>C (rs933209), both located in the promoter of CYP2C9 [34]. They observed that these promoter polymorphisms were associated with slower clearance of S-warfarin, lower dose requirements among African–Americans, and decreased mRNA expression of CYP2C9 in liver tissue [34]. Similar to CYP2C9*8, in silico predictions revealed that CYP2C9*61 is not under evolutionary constraint and, therefore, is not highly conserved. Additionally, Grantham, SIFT and PolyPhen scores suggest that CYP2C9*61 is tolerated and benign. However, these scores do not necessarily rule out an impact on activity, as was demonstrated earlier for CYP2C9*8. In fact, a warfarin dose requirement of 3 mg/day in the patient carrying CYP2C9*61 suggested that this variant further reduces the activity of the protein encoded by this haplotype [20,28]. Although CYP2C9 variants can be mechanistically predicted as deleterious, observed phenotypic effects (e.g., dose requirements) will also depend on substrate and therapeutic dose. Therefore, further clinical studies are warranted in order to draw valid conclusions.
Executive summary
Background
Warfarin is an oral anticoagulant, which is principally metabolized by CYP2C9 to form 7-hydroxy warfarin.
Polymorphisms in CYP2C9 and VKORC1 loci explain around 30% of warfarin dose variability in individuals of mostly European descent.
This study provides in silico predictions of CYP2C9 low-frequency functional variants identified with next-generation sequencing in Caribbean Hispanics from Puerto Rico.
Methods
Targeted sequencing with the Ion Personal Genome Machine was performed on genomic DNA from 108 participants.
Low-frequency CYP2C9 variants (minor allele frequencies <2%) identified among participants were evaluated for conservation, amino-acid substitution effects, pathogenicity and potential deleteriousness using the Combined Annotation Dependent Depletion (CADD v1.3) tools.
Molecular modeling/docking analysis was performed to evaluate the impact of seven low-frequency missense variants on CYP2C9 activity.
Results & discussion
Conformational changes in the seven identified low-frequency variants (CYP2C9*5, CYP2C9*8, CYP2C9*9, CYP2C9*11, CYP2C9*12, CYP2C9*21 and novel CYP2C9*61) affect the binding site of S-warfarin.
In all CYP2C9 variants, in silico modeling predicted a decrease in binding affinity and increase in the distance of S-warfarin from the heme group.
Most of these potentially deleterious variants occur at higher frequency among individuals with large African ancestry.
This study contributes to the field by using in silico tools to predict the functional alterations of seven rare CYP2C9 variants for the first time in Caribbean Hispanics on warfarin.
Acknowledgments
The authors first want to acknowledge the patients of the Veterans Affairs Caribbean Healthcare Center for voluntarily participating in this study. This material is the result of work supported with resources and the use of facilities at the Veterans Affairs Caribbean Health Care Center in San Juan, Puerto Rico. A special acknowledgement goes to A Labastida, who helped the authors with the analysis of sequencing data.
Footnotes
Disclaimer
The contents of this manuscript do not represent the views of the Veterans Affairs Caribbean Healthcare System, the Department of Veterans Affairs, the National Institutes of Health or the US Government.
Financial & competing interests disclosure
K Claudio-Campos and J Duconge held a without-compensation employment status with the Pharmacy Service, VA Caribbean Healthcare Systems, in San Juan, Puerto Rico, at the time of conducting the study. This work was supported by SC1 grant # HL123911 from the National Heart, Lung and Blood Institute, and the Minority Biomedical Research Support – Support of Competitive Research (SCORE) Program of the National Institute of General Medical Sciences. Part of this research was supported by the Research Centers in Minority Institutions (RCMI) award 8G12 MD007600 from the National Institute on Minority Health and Health Disparities and by the Minority Biomedical Research Support-Research Initiative for Scientific Enhancement at the University of Puerto Rico Medical Sciences Campus with the grant R25 GM061838. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.
References
Papers of special note have been highlighted as: • of interest
- 1. Coumadin (warfarin sodium) Label, prescribing information Bristol-Myers Squibb, NJ, USA (2010). http://packageinserts.bms.com/pi/pi_coumadin.pdf
- 2. Choonara IA, Haynes BP, Cholerton S, Breckenridge AM, Park BK. Enantiomers of warfarin and vitamin K1 metabolism. Br. J. Clin. Pharmacol. 22(6), 729–732 (1986).
- 3. Johnson JA, Gong L, Whirl-Carrillo M. et al. Clinical Pharmacogenetics Implementation Consortium guidelines for CYP2C9 and VKORC1 genotypes and warfarin dosing. Clin. Pharmacol. Ther. 90(4), 625–629 (2011).
- 4. Zanger UM, Schwab M. Cytochrome P450 enzymes in drug metabolism: regulation of gene expression, enzyme activities, and impact of genetic variation. Pharmacol. Ther. 138(1), 103–141 (2013).
- 5. Katzung BG, Trevor AJ. Basic & Clinical Pharmacology (13th Edition). McGraw-Hill Education, New York City, NY, US, 66–85 (2015).
- 6. Williams PA, Cosme J, Ward A, Angove HC, Matak Vinkovic D, Jhoti H. Crystal structure of human cytochrome P450 2C9 with bound warfarin. Nature 424(6947), 464–468 (2003). • The important study reports the crystal structure of cytochrome P2C9 (CYP2C9) with the anticoagulant drug S-warfarin bound in its active site. The study is pivotal in the understanding of the structural features required in CYP homologs proteins and mutants during drug metabolism.
- 7. Allabi AC, Horsmans Y, Alvarez JC. et al. Acenocoumarol sensitivity and pharmacokinetic characterization of CYP2C9 *5/*8,*8/*11,*9/*11 and VKORC1*2 in black African healthy Beninese subjects. Eur. J. Drug Metab. Pharmacokinet. 37(2), 125–132 (2012).
- 8. Tai G, Farin F, Rieder MJ. et al. In-vitro and in-vivo effects of the CYP2C9*11 polymorphism on warfarin metabolism and dose. Pharmacogenet. Genomics 15(7), 475–481 (2005).
- 9. Goldstein JA. Clinical relevance of genetic polymorphisms in the human CYP2C subfamily. Br. J. Clin. Pharmacol. 52(4), 349–355 (2001).
- 10. Crespi CL, Miller VP. The R144C change in the CYP2C9*2 allele alters interaction of the cytochrome P450 with NADPH:cytochrome P450 oxidoreductase. Pharmacogenetics 7(3), 203–210 (1997).
- 11. Pavani A, Naushad SM, Stanley BA. et al. Mechanistic insights into the effect of CYP2C9*2 and CYP2C9*3 variants on the 7-hydroxylation of warfarin. Pharmacogenomics 16(4), 393–400 (2015). • After the identification of variants R144C and I359L, the authors report a docking analysis about the effect of variants CYP2C9*2 and CYP2C9*3 proteins and pharmacokinetics of S-warfarin.
- 12. Fujikura K, Ingelman-Sundberg M, Lauschke VM. Genetic variation in the human cytochrome P450 supergene family. Pharmacogenet. Genomics 25(12), 584–594 (2015).
- 13. Frueh FW, Amur S, Mummaneni P. et al. Pharmacogenomic biomarker information in drug labels approved by the United States food and drug administration: prevalence of related drug use. Pharmacotherapy 28(8), 992–998 (2008).
- 14. Kessler MD, Yerges-Armstrong L, Taub MA. et al. Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry. Nature Commun. 7, 12521 (2016).
- 15. Duconge J, Ruano G. Admixture and ethno-specific alleles: missing links for global pharmacogenomics. Pharmacogenomics 17(14), 1479–1482 (2016).
- 16. Scott SA, Jaremko M, Lubitz SA, Kornreich R, Halperin JL, Desnick RJ. CYP2C9*8 is prevalent among African–Americans: implications for pharmacogenetic dosing. Pharmacogenomics 10(8), 1243–1255 (2009).
- 17. Bryc K, Velez C, Karafet T. et al. Colloquium paper: genome-wide patterns of population structure and admixture among Hispanic/Latino populations. Proc. Natl Acad. Sci. USA 107(Suppl. 2), 8954–8961 (2010).
- 18. Cavallari LH, Perera MA. The future of warfarin pharmacogenetics in under-represented minority groups. Future Cardiol. 8(4), 563–576 (2012).
- 19. Duconge J, Ramos AS, Claudio-Campos K. et al. A novel admixture-based pharmacogenetic approach to refine warfarin dosing in Caribbean Hispanics. PLoS ONE 11(1), e0145480 (2016).
- 20. Claudio-Campos K, Labastida A, Ramos A. et al. Warfarin anticoagulation therapy in Caribbean Hispanics of Puerto Rico: a candidate gene association study. Front. Pharmacol. 8, 347 (2017). • Identify that CYP2C9 rs2860905 is a better predictor for warfarin dose requirement than CYP2C9*2 and CYP2C9*3 among Caribbean Hispanics of Puerto Rico.
- 21. Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genet. 46(3), 310–315 (2014).
- 22. Lill MA, Danielson ML. Computer-aided drug design platform using PyMOL. J. Comput. Aided Mol. Des. 25(1), 13–19 (2011).
- 23. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJ. The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10(6), 845–858 (2015). • Develop a protocol as a detailed tutorial on interpreting the results of the remote homology modeling server Phyre.
- 24. Morris GM, Huey R, Lindstrom W. et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30(16), 2785–2791 (2009). • Develop a new version of AutoDock as a bioinformatics tool capable of predicting bound conformations and binding energies of ligands with macromolecules. This new version was calibrated on a set of 188 diverse protein–ligand complexes of known structure and binding energy, showing a standard error of about 2–3 kcal/mol in prediction of binding-free energy in cross-validation studies.
- 25. Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC (2017). https://pymol.org/2/ • The new version of PyMOL molecular graphics viewer, introduce new features including; native retina resolution / 4k display support, Excel exporter plugin (Windows and Mac), better third-party plugin and custom scripting support, and properties editor, among others.
- 26. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526(7571), 68–74 (2015).
- 27. Rasmussen-Torvik LJ, Stallings SC, Gordon AS. et al. Design and anticipated outcomes of the eMERGE-PGx project: a multicenter pilot for preemptive pharmacogenomics in electronic health record systems. Clin. Pharmacol. Ther. 96(4), 482–489 (2014).
- 28. Claudio-Campos KI, Gonzalez-Santiago P, Renta JY. et al. CYP2C9*61, a rare missense variant identified in a Puerto Rican patient with low warfarin dose requirements. Pharmacogenomics 20(1), 3–8 (2019). • In a case report identify a rare missense variant in CYP2C9*61, rs202201137 (10:96748682; c.1370A>G; p. Asn457Ser) in a Puerto Rican patient that requires low warfarin dose.
- 29. Cosconati S, Forli S, Perryman AL, Harris R, Goodsell DS, Olson AJ. Virtual Screening with AutoDock: theory and practice. Expert Opin. Drug Discov. 5(6), 597–607 (2010).
- 30. Modi S, Sutcliffe MJ, Primrose WU, Lian LY, Roberts GC. The catalytic mechanism of cytochrome P450 BM3 involves a 6 Å movement of the bound substrate on reduction. Nat. Struct. Biol. 3(5), 414–417 (1996).
- 31. Oldenburg J, Watzka M, Bevans CG. VKORC1 and VKORC1L1: why do vertebrates have two vitamin K 2,3-epoxide reductases? Nutrients 7(8), 6250–6280 (2015).
- 32. Niinuma Y, Saito T, Takahashi M. et al. Functional characterization of 32 CYP2C9 allelic variants. Pharmacogenomics J. 14(2), 107–114 (2014).
- 33. Liu Y, Jeong H, Takahashi H. et al. Decreased warfarin clearance associated with the CYP2C9 R150H (*8) polymorphism. Clin. Pharmacol. Ther. 91(4), 660–665 (2012).
- 34. Cavallari LH, Vaynshteyn D, Freeman KM. et al. CYP2C9 promoter region single-nucleotide polymorphisms linked to the R150H polymorphism are functional suggesting their role in CYP2C9*8-mediated effects. Pharmacogenet. Genomics 23(4), 228–231 (2013).


