Author manuscript; available in PMC: 2021 May 14.
Published in final edited form as: J Phys Chem B. 2020 May 4;124(19):3903–3908. doi: 10.1021/acs.jpcb.0c01857

Computational Investigation of APOBEC3H Substrate Orientation and Selectivity

Mark A Hix, G Andrés Cisneros
PMCID: PMC7313631  NIHMSID: NIHMS1597887  PMID: 32321250

Abstract

APOBEC3H is a cytidine deaminase protein best known for its involvement in antiretroviral activity in humans. It acts upon a single-stranded DNA (ssDNA) substrate with preferential targeting of a 5′-TCA-3′ motif. Currently available crystal structures do not include the ssDNA substrate in the A3H system, nor is the mechanism of recognition for the preferred sequence known. To determine the position and orientation of the substrate in the active site, we used high-performance computing to perform molecular dynamics simulations on several systems of APOBEC3H. We examined different DNA sequences in the active site to determine the structural and chemical mechanism by which the preferred sequence is recognized. We found residues N49, K50, K51, and K52 to be relevant to the recognition of 3′-adenine and residues S86 and S87 to be relevant to the recognition of 5′-thymine, with both recognitions primarily driven by electrostatic nonbonded interactions.

■ INTRODUCTION

APOBEC3 enzymes (A3s) are cytidine deaminases responsible for dC → dU mutations. Most A3s act selectively on a 5′-dTdC-3′ motif, with the exception of A3G, which prefers a 5′-dCdC-3′ motif.1–3 A3H is known for its role in innate human immunity to retroviruses including HIV.4 A3H acts upon a single-stranded DNA or RNA substrate and mutates cytidine to uracil, preferring a 5′-dTdCdA-3′ sequence.5 A3H is stable as a monomer in solution but is known to be an RNA-mediated dimer in vivo.6 There are several available crystal structures of the A3H monomer; however, none of the reported structures have been crystallized in complex with a substrate or substrate analog. Here, we present a computational investigation on the orientation of the substrate on A3H and possible structural determinants for the selectivity of the preferred 5′-dTdCdA-3′ substrate motif.5

■ METHODS

The initial crystal structure for monomeric A3H was obtained from the RCSB Protein Data Bank (pdbid: 5W45).7 The sequence was confirmed to match the A3H-HapI consensus via UniProt.8 Protonation states of ionizable residues were assigned with H++, followed by system preparation with tleap in AmberTools16.9,10 All systems were neutralized with Cl− and solvated in water using a minimum distance of 12 Å between the protein surface and the edge of the box. The parameter sets employed were ff14SB,11 OL15,12 YIL,13 and TIP3P.14

Molecular dynamics (MD) simulations were run in the NVT ensemble at 300 K after minimization (50 steps of steepest descent followed by 450 steps of conjugate gradient) and iterative thermalization (20 equally spaced stages from 10 K to 300 K at 12,500 steps per stage) using a Berendsen thermostat.15 Positional restraints with a force constant of 25.0 kcal mol−1 Å−2 were applied to the Zn2+, coordinating residues, water in the active site, and the deoxycytidine, as the active site dissociated in unrestrained simulations. The cytidine base was held in the active site with a distance restraint of 15.0 kcal mol−1 Å−2. Every system was simulated for 250 ns (in triplicate) with the pmemd.cuda module in AMBER18, using a 2.0 fs time step, SHAKE for all bonds involving hydrogen atoms, a cutoff distance of 8.0 Å for nonbonded interactions, and the smooth particle-mesh Ewald method for long-range Coulomb interactions.16–18 Twelve systems (termed A–L) were generated by rotating the phosphate backbone of the ssDNA strand with respect to the central dC nucleotide in the active site (Figure 1a). The tested ssDNA sequence was 5′-dAdAdAdTdCdAdAdAdA-3′. All systems were built with the Modeler software.19,20 Input coordinates and topologies are included in the Supporting Information.
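The thermalization and production settings quoted above can be turned into a quick bookkeeping check. The following is a minimal sketch, not the authors' input files: it assumes that the 20 stages include both the 10 K and 300 K end points and that the 2.0 fs time step also applies during heating. The function name heating_schedule is hypothetical.

```python
import numpy as np

def heating_schedule(t_start=10.0, t_end=300.0, n_stages=20,
                     steps_per_stage=12_500, dt_fs=2.0):
    """Return stage temperatures (K) and the simulated time per stage (ps).

    Assumption: stages are evenly spaced and include both end points.
    """
    temps = np.linspace(t_start, t_end, n_stages)
    stage_ps = steps_per_stage * dt_fs / 1000.0  # fs -> ps
    return temps, stage_ps

temps, stage_ps = heating_schedule()
total_heating_ps = stage_ps * len(temps)

# Number of MD steps in one 250 ns production run at a 2.0 fs time step.
production_steps = int(250 * 1_000_000 / 2.0)  # 250 ns expressed in fs

print(temps.round(1))
print(f"{stage_ps} ps per stage, {total_heating_ps} ps of heating in total")
print(f"{production_steps} production steps per replicate")
```

Under these assumptions each stage covers 25 ps, so heating takes 500 ps in total, which is small compared with the 250 ns production runs.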

Figure 1.

(a) Twelve possible orientations of ssDNA substrate on A3H, (b) protein–substrate interaction energies with error bars showing standard deviation (calculated via the numpy python module using the first and last halves of the individual trajectories), (c) RMSD of model substrate over time with respect to starting orientation, and (d) RMSF of nucleotides in ssDNA substrate.
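The caption states that the error bars come from the first and last halves of the individual trajectories. One plausible reading, sketched here with NumPy, is to average each half of every replicate and take the standard deviation over those block means. The exact procedure is not given in the text, so the pooling over replicates and the population standard deviation (NumPy's default, ddof=0) are assumptions, and half_block_stats is a hypothetical name.

```python
import numpy as np

def half_block_stats(trajectories):
    """Mean and standard deviation over first-half and last-half block means.

    trajectories: iterable of 1D per-frame energy series, one per replicate.
    Assumption: each replicate contributes two block means.
    """
    block_means = []
    for series in trajectories:
        series = np.asarray(series, dtype=float)
        mid = series.size // 2
        block_means.append(series[:mid].mean())  # first half
        block_means.append(series[mid:].mean())  # last half
    block_means = np.array(block_means)
    return float(block_means.mean()), float(block_means.std())

# Toy per-frame energies for two replicates (illustrative numbers only).
mean_e, std_e = half_block_stats([[1.0, 1.0, 3.0, 3.0], [2.0, 2.0, 2.0, 2.0]])
print(mean_e, std_e)
```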

■ RESULTS

Structural and dynamic properties, as well as average nonbonded (Coulomb and van der Waals) residuewise interactions via energy decomposition analysis (EDA), were calculated for each system (Figure 1) with cpptraj21 and an in-house FORTRAN90 program (available in the ESI of ref 22).23–26 The EDA results suggest that system H has the most favorable total nonbonded interaction energy (Figure 1b). The ssDNA backbone for system B is oriented in the opposite 5′−3′ direction to system H and has a lower interaction energy than adjacent systems. Root mean squared deviation (RMSD) analysis shows that systems B and H have the smallest deviations with respect to the original structure, suggesting a more stable initial orientation of the ssDNA (Figure 1c, Figure S1). This is further supported by an RMS fluctuation (RMSF) analysis focused on the individual nucleotides in the substrate (Figure 1d, Figure S2). The average RMSF of the substrate for each system indicates that systems B and H have the smallest average fluctuations for the ssDNA substrate (2.0 and 2.1 Å, respectively). Additionally, it was observed that the systems adjacent to orientations B and H tended to shift the alignment of the DNA strand toward these two orientations. Analysis of residuewise interactions with the entire ssDNA substrate suggests that systems B and H have the most favorable total interaction energies (Figure 2, Figures S3 and S4), with loop 1 (17RRLRRPYYPRKALL30) and the region comprising residues 115–130, which includes K117, K121, and R124, providing significant favorable interactions with the substrate. Taken together, these results suggest that the orientation of the phosphate backbone corresponding to the B or H systems allows for more favorable interactions with protein residues.
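For readers less familiar with the two structural metrics used above, the definitions can be sketched in a few lines of NumPy. This is an illustration of the formulas only, not the cpptraj workflow used in the study: it assumes the frames have already been aligned to a common reference, and the function names are hypothetical.

```python
import numpy as np

def rmsd_to_first_frame(traj):
    """RMSD (per frame) of all atoms relative to the first frame.

    traj: array of shape (n_frames, n_atoms, 3), assumed pre-aligned.
    """
    traj = np.asarray(traj, dtype=float)
    sq_disp = ((traj - traj[0]) ** 2).sum(axis=2)  # squared displacement per atom
    return np.sqrt(sq_disp.mean(axis=1))

def rmsf_per_atom(traj):
    """RMSF (per atom) around the time-averaged position."""
    traj = np.asarray(traj, dtype=float)
    sq_dev = ((traj - traj.mean(axis=0)) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))

# Toy trajectory: 2 frames, 2 atoms; only the first atom moves by 2 Å along x.
toy = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
       [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
print(rmsd_to_first_frame(toy))
print(rmsf_per_atom(toy))
```

A per-nucleotide RMSF such as the one in Figure 1d would then be an average of the per-atom values over the atoms of each nucleotide.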

Figure 2.

Nonbonded interaction energy between each residue and the ssDNA for (a) system B and (b) system H, with favorable (unfavorable) residue–substrate interactions in blue (red).
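The residue–substrate energies plotted here are sums of Coulomb and van der Waals pair terms. A single-frame version of such a sum is sketched below. This is not the in-house FORTRAN90 program: the Lennard-Jones form with Rmin/2 and ε combining rules, the approximate Coulomb constant, and the function name group_interaction are all assumptions made for illustration.

```python
import numpy as np

COULOMB_K = 332.06  # approximate Coulomb constant, kcal mol^-1 Å e^-2 (assumed)

def group_interaction(xyz_a, q_a, rmin_half_a, eps_a,
                      xyz_b, q_b, rmin_half_b, eps_b):
    """Coulomb and Lennard-Jones energy between two atom groups (one frame).

    Assumed combining rules: Rmin_ij = Rmin_i/2 + Rmin_j/2, eps_ij = sqrt(eps_i * eps_j).
    No cutoff and no periodic imaging are applied in this sketch.
    """
    xyz_a, xyz_b = np.asarray(xyz_a, float), np.asarray(xyz_b, float)
    q_a, q_b = np.asarray(q_a, float), np.asarray(q_b, float)
    # Pairwise distance matrix between atoms of group A and group B.
    r = np.linalg.norm(xyz_a[:, None, :] - xyz_b[None, :, :], axis=2)
    coulomb = COULOMB_K * np.outer(q_a, q_b) / r
    rmin = np.add.outer(np.asarray(rmin_half_a, float), np.asarray(rmin_half_b, float))
    eps = np.sqrt(np.outer(np.asarray(eps_a, float), np.asarray(eps_b, float)))
    ratio6 = (rmin / r) ** 6
    vdw = eps * (ratio6 ** 2 - 2.0 * ratio6)
    return float(coulomb.sum()), float(vdw.sum())

# Toy pair: opposite unit charges 4 Å apart, sitting at the LJ minimum.
e_coul, e_vdw = group_interaction([[0, 0, 0]], [1.0], [2.0], [0.1],
                                  [[4, 0, 0]], [-1.0], [2.0], [0.1])
print(e_coul, e_vdw)
```

Averaging these two numbers over all frames, for each protein residue against the whole ssDNA, would give one bar of a plot like Figure 2.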

Systems B and H were run for an additional 250 ns from the end of the initial simulation with the restraints on the target dC removed to observe the stability of the enzyme–substrate complex. Interatomic distances between the reacting carbon on the cytidine and the water oxygen in the active site were measured over the trajectories to determine which system maintained a more stable binding. The ssDNA substrate in system B remained bound in the active site for the entire duration of the simulation (average 3.2 Å). Conversely, system H exhibited sporadic loss of binding. This suggests that while the total interaction energy between the protein surface and the substrate is more favorable with system H, the local interactions with the target cytidine are more stable with system B.
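The distance monitoring described above amounts to a per-frame distance series plus a summary of how often the substrate stays close. A minimal sketch follows; the 4.0 Å "bound" cutoff is a hypothetical threshold chosen for illustration (the text reports only an average distance), and periodic imaging is ignored.

```python
import numpy as np

def distance_series(atom_a_xyz, atom_b_xyz):
    """Per-frame distance between two atoms; inputs have shape (n_frames, 3)."""
    a = np.asarray(atom_a_xyz, dtype=float)
    b = np.asarray(atom_b_xyz, dtype=float)
    return np.linalg.norm(a - b, axis=1)

def bound_fraction(distances, cutoff=4.0):
    """Fraction of frames at or below the cutoff (hypothetical 4.0 Å default)."""
    return float(np.mean(np.asarray(distances) <= cutoff))

# Toy coordinates: reacting carbon fixed at the origin, water oxygen moving.
carbon = np.zeros((4, 3))
oxygen = np.array([[3.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 6.0]])
d = distance_series(carbon, oxygen)
print(d, d.mean(), bound_fraction(d))
```

A trajectory with sporadic loss of binding would show up as a bound fraction noticeably below 1 even when the mean distance looks reasonable.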

Based on the above results, system B was selected for subsequent simulations to investigate the selectivity of A3H for the reported consensus sequence. Six new systems were generated by modifying the nucleotides that flank the central dC to investigate the effects of different bases in place of the preferred dTdCdA motif. Thymine recognition was tested by generating three systems: 5′-dAdCdA-3′, 5′-dCdCdA-3′, and 5′-dGdCdA-3′ substrates. Adenine recognition was tested with 5′-dTdCdC-3′, 5′-dTdCdG-3′, and 5′-dTdCdT-3′ substrates.

EDA was performed to compare the differences (if any) in protein/substrate interaction between the 5′-dTdCdA-3′ motif and all other systems (Figures S5–S10). The total nonbonded interaction between the protein and the nucleotides on the 5′ position suggests that dT is favored over dG (−239 kcal mol−1) and dA (−38 kcal mol−1), but is less favored than dC (+81 kcal mol−1). The 5′ dT nucleotide shows favorable interactions with R17, R21, and R26 when compared with those from purine nucleotides; however, these interactions are generally unchanged when dC is in the 5′ flanking position. dT also has more favorable interactions with S86 (between −1 and −5 kcal mol−1) and less favorable interactions with S87 (between +1 and +5 kcal mol−1) compared with the other nucleotides (see Figure 3b). The total nonbonded interaction of the protein with nucleotides in the 5′ position shows that dA at this position is 5 kcal mol−1 less favorable than dT, dG is 61 kcal mol−1 less favorable, and dC is 68 kcal mol−1 less favorable. Hydrogen bond (HB) analysis indicates that a hydrogen bond is present between the 5′ flanking nucleotide and S86 for 25.5% of the simulation time in the case of a 5′-dT, compared with 11.7%, 0.8%, and 0.0% for dA, dC, and dG, respectively. The HB persistence between a 5′ deoxynucleotide and S87 is 0.9% for dT, compared with 0.2% (dA), 10.7% (dC), and 17.6% (dG).
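The hydrogen-bond persistence percentages quoted above are fractions of frames in which a geometric criterion is met. A minimal sketch of such a count is given below. The criteria used in the study are not stated, so the 3.0 Å donor–acceptor distance and 135° donor–hydrogen–acceptor angle are assumed cutoffs, and hbond_persistence is a hypothetical name.

```python
import numpy as np

def hbond_persistence(donor, hydrogen, acceptor, max_dist=3.0, min_angle_deg=135.0):
    """Percentage of frames in which a donor-H...acceptor hydrogen bond is present.

    Each input has shape (n_frames, 3). Cutoffs are assumed, not taken from the paper.
    """
    donor = np.asarray(donor, dtype=float)
    hydrogen = np.asarray(hydrogen, dtype=float)
    acceptor = np.asarray(acceptor, dtype=float)
    da_dist = np.linalg.norm(acceptor - donor, axis=1)
    # Angle at the hydrogen between the H->donor and H->acceptor vectors.
    v1, v2 = donor - hydrogen, acceptor - hydrogen
    cos_ang = (v1 * v2).sum(axis=1) / (np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1))
    angle = np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))
    present = (da_dist <= max_dist) & (angle >= min_angle_deg)
    return 100.0 * float(present.mean())

# Toy frames: linear and close, linear but too far, close but bent, linear and close.
don = np.zeros((4, 3))
hyd = np.tile([1.0, 0.0, 0.0], (4, 1))
acc = np.array([[2.9, 0.0, 0.0], [3.5, 0.0, 0.0], [1.0, 2.0, 0.0], [2.9, 0.0, 0.0]])
persistence = hbond_persistence(don, hyd, acc)
print(persistence)
```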

Figure 3.

(a) Electrostatic potential of A3H RNA-mediated dimer in solution with ssDNA mapped to solvent-accessible surface. Negative charges shown in red, positive charges shown in blue. ESP was calculated on the RNA-mediated dimer without model substrate using APBS in PDB2PQR.34 Inset is active site with substrate cytidine in position. (b) Residues S86 and S87 interacting with the 5′ thymine. (c) 3′ Adenine in a pocket formed by R26, N49, K50, and K51.

The total nonbonded interaction between the entire ssDNA strand and the protein is more favorable with dA in the 3′ flanking position by between 31 and 245 kcal mol−1 compared with all other nucleotides at the same position. Our results suggest that 3′-dA shows strong attractive interactions with R17, R21, R26, K51, and K52. When compared with the other nucleotides in the same position, the arginines interact more favorably with dA by at least 5 kcal mol−1 (see Figure 3c). These arginines are on loop 1, which has previously been shown to be involved in DNA binding and recognition.27–29 In contrast, the lysines show slight preference for all three other bases. A3H-R26 is structurally homologous to A3A-R28 and A3B-R211, which have been previously reported as key residues that drive selectivity in the respective A3s, and both preferentially act upon a 5′-dTdC-3′ substrate.2,30,31 The nonbonded interaction between the individual nucleotides at the 3′ position and the entire protein suggests that dC and dT are less favored by 33 kcal mol−1 and 55 kcal mol−1, respectively, while dG shows <1 kcal mol−1 difference in interaction compared with dA. These results are consistent with previous experimental results showing similar selectivity for 5′-dTdCdA-3′ and 5′-dTdCdG-3′.5 Hydrogen bond analysis indicates an HB is formed between dA and R26 for 26.2% of the simulation, compared with 0.8%, 0.0%, and 14.8% for dC, dG, and dT, respectively.

A3G, A3F, and AID have a homologous loop, corresponding to residues 313 to 322 in A3G. Previous studies have shown that A3G-CTD 5′-dCdC-3′ selectivity is driven by this loop and can be modified by mutating to the homologous AID or A3F sequences.3,32,33 Based on our results, the DNA binding orientation precludes the interaction of the ssDNA substrate with the region in A3H that is homologous to the A3G 313–322 loop. One major difference observed between A3H and other A3s relates to the structure of the bound ssDNA substrate. In other A3s it has been reported that the DNA adopts a hairpin conformation.2 In A3H, the substrate can orient in a hairpin conformation in a monomeric system; however, A3H forms an RNA-mediated dimer, which obstructs this orientation.

Dimer simulations were carried out to test the possibility of the B, H, and hairpin ssDNA orientations in the active site. The details of the system setup and simulations are reported in ref 22. The dimer structure shows that the active sites in each monomer are located near the RNA-mediated dimer interface of A3H. This location effectively prevents the ssDNA substrate from adopting a hairpin conformation. When a dimer system containing the superposed hairpin structure of A3A in both active sites was considered, the simulation was unstable due to strongly repulsive interactions between the ssDNA substrates and the RNA. Conversely, a track of positively charged surface residues is observed connecting the two active sites in the A3H dimer (Figure 3). For the B and H systems, the initial structures were simulated with two separate strands, one in each active site. During the initial stages of the simulations, both strands were observed to come together and thus a single strand spanning both active sites through the RNA interface was also considered as shown in Figure 3a.

■ CONCLUSIONS

In conclusion, we have performed computational simulations to investigate the binding orientation and substrate selectivity indicators for A3H. Our results suggest that the preferred binding orientation aligns the ssDNA substrate along a track that provides favorable interactions with the target cytidine and the two flanking residues, including three arginines that significantly favor the 3′ flanking adenine. Our results are consistent with previous experimental reports on substrate binding and consensus sequence selectivity. Additionally, some of the A3H residues predicted to be related to selectivity are homologous with selectivity residues reported for other A3s. These results also provide possible targets for mutagenesis to investigate the role of the selectivity filters for the consensus sequence of A3H.

■ ACKNOWLEDGMENTS

This work was supported by NIH grant R01GM108583. Computational time was provided by the University of North Texas CASCaM’s CRUNTCh3 high-performance cluster partially supported by NSF grant CHE-1531468.

Footnotes

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.0c01857.

Information about energy decomposition analysis program; substrate binding orientation studies raw data: (i) RMSD for each of 12 systems, (ii) RMSF for each of 12 systems, (iii) residuewise interaction energy between protein and substrate for each of 12 systems (Coulomb and van der Waals shown separately); substrate recognition studies raw data: EDA for each of seven trials (van der Waals and Coulomb interactions shown separately; interactions with each protein residue and 3′, dC, and 5′ residues shown separately) (PDF)

The authors declare no competing financial interest.

■ REFERENCES

(1) Liu M; Mallinger A; Tortorici M; Newbatt Y; Richards M; Mirza A; Van Montfort RL; Burke R; Blagg J; Kaserer T Evaluation of APOBEC3B recognition motifs by NMR reveals preferred substrates. ACS Chem. Biol. 2018, 13, 2427–2432.
(2) Shi K; Carpenter MA; Banerjee S; Shaban NM; Kurahashi K; Salamango DJ; McCann JL; Starrett GJ; Duffy JV; Demir Ö; et al. Structural basis for targeted DNA cytosine deamination and mutagenesis by APOBEC3A and APOBEC3B. Nat. Struct. Mol. Biol. 2017, 24, 131–139.
(3) Ebrahimi D; Alinejad-Rokny H; Davenport MP Insights into the motif preference of APOBEC3 enzymes. PLoS One 2014, 9, e87679.
(4) Conticello SG The AID/APOBEC family of nucleic acid mutators. Genome Biol. 2008, 9, 229.
(5) Starrett GJ; Luengas EM; McCann JL; Ebrahimi D; Temiz NA; Love RP; Feng Y; Adolph MB; Chelico L; Law EK; et al. The DNA cytosine deaminase APOBEC3H haplotype I likely contributes to breast and lung cancer mutagenesis. Nat. Commun. 2016, 7, 12918.
(6) Shaban NM; Shi K; Lauer KV; Carpenter MA; Richards CM; Salamango D; Wang J; Lopresti MW; Banerjee S; Levin-Klein R; et al. The Antiviral and Cancer Genomic DNA Deaminase APOBEC3H Is Regulated by an RNA-Mediated Dimerization Mechanism. Mol. Cell 2018, 69, 75–86.e9.
(7) Ito F; Yang H; Xiao X; Li S-X; Wolfe A; Zirkle B; Arutiunian V; Chen XS Understanding the structure, multimerization, subcellular localization and mC selectivity of a genomic mutator and anti-HIV factor APOBEC3H. Sci. Rep. 2018, 8, 3763.
(8) Bateman A; Martin M; O’Donovan C; Magrane M; Alpi E; Antunes R; Bely B; Bingley M; Bonilla C; Britto R; et al. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017, 45, D158–D169.
(9) Schafmeister CEAF; Ross WS; Romanovski V LEAP 1995.
(10) Case D; Cerutti D; Cheatham T III; Darden T; Duke R; Giese T; Gohlke H; Goetz A; Greene D; Homeyer N; et al. Amber16 2017.
(11) Maier JA; Martinez C; Kasavajhala K; Wickstrom L; Hauser KE; Simmerling C ff14SB: Improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–3713.
(12) Zgarbová M; Šponer J; Otyepka M; Cheatham TE; Galindo-Murillo R; Jurečka P Refinement of the sugar-phosphate backbone torsion beta for AMBER force fields improves the description of Z- and B-DNA. J. Chem. Theory Comput. 2015, 11, 5723–5736.
(13) Yildirim I; Stern HA; Kennedy SD; Tubbs JD; Turner DH Reparameterization of RNA χ torsion parameters for the AMBER force field and comparison to NMR spectra for cytidine and uridine. J. Chem. Theory Comput. 2010, 6, 1520–1531.
(14) Jorgensen WL; Chandrasekhar J; Madura JD; Impey RW; Klein ML Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935.
(15) Berendsen HJC; Postma JPM; van Gunsteren WF; DiNola A; Haak JR Molecular dynamics with coupling to an external bath. J. Chem. Phys. 1984, 81, 3684–3690.
(16) Götz AW; Williamson MJ; Xu D; Poole D; Le Grand S; Walker RC Routine microsecond molecular dynamics simulations with AMBER on GPUs. 1. Generalized Born. J. Chem. Theory Comput. 2012, 8, 1542–1555.
(17) Salomon-Ferrer R; Götz AW; Poole D; Le Grand S; Walker RC Routine microsecond molecular dynamics simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Ewald. J. Chem. Theory Comput. 2013, 9, 3878–3888.
(18) Essmann U; Perera L; Berkowitz ML; Darden T; Lee H; Pedersen LG A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577–8593.
(19) Šali A; Blundell TL Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993, 234, 779–815.
(20) Fiser A; Do RKG; Šali A Modeling of loops in protein structures. Protein Sci. 2000, 9, 1753–1773.
(21) Roe DR; Cheatham TE PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9, 3084–3095.
(22) Hix MA; Wong L; Flath B; Chelico L; Cisneros GA Induced mutagenesis by the DNA cytosine deaminase APOBEC3H Haplotype I protects against lung cancer. https://www.biorxiv.org/content/10.1101/2020.02.28.970509v1 2020.
(23) Cisneros GA; Wang M; Silinski P; Fitzgerald M; Yang W Theoretical and experimental determination on two substrates turned over by 4–oxalocrotonate tautomerase. J. Phys. Chem. A 2006, 110, 700–708.
(24) Dewage SW; Cisneros GA Computational analysis of ammonia transfer along two intramolecular tunnels in Staphylococcus aureus glutamine-dependent amidotransferase (GatCAB). J. Phys. Chem. B 2015, 119, 3669–3677.
(25) Elias AA; Cisneros GA In Biomolecular Modelling and Simulations; Karabencheva-Christova T, Ed.; Advances in protein chemistry and structural biology; Academic Press, 2014; Vol. 96; pp 39–75.
(26) Graham SE; Syeda F; Cisneros GA Computational prediction of residues involved in fidelity checking for DNA synthesis in DNA polymerase I. Biochemistry 2012, 51, 2569–2578.
(27) Logue EC; Bloch N; Dhuey E; Zhang R; Cao P; Herate C; Chauveau L; Hubbard SR; Landau NR A DNA sequence recognition loop on APOBEC3A controls substrate specificity. PLoS One 2014, 9, e97062.
(28) Shi K; Carpenter MA; Kurahashi K; Harris RS; Aihara H Crystal structure of the DNA deaminase APOBEC3B catalytic domain. J. Biol. Chem. 2015, 290, 28120–30.
(29) Lu X; Zhang T; Xu Z; Liu S; Zhao B; Lan W; Wang C; Ding J; Cao C Crystal structure of DNA cytidine deaminase ABOBEC3G catalytic deamination domain suggests a binding mode of full-length enzyme to single-stranded DNA. J. Biol. Chem. 2015, 290, 4010–21.
(30) Hou S; Silvas TV; Leidner F; Nalivaika EA; Matsuo H; Yilmaz NK; Schiffer CA Structural analysis of the active site and DNA binding of human cytidine deaminase APOBEC3B. J. Chem. Theory Comput. 2019, 15, 637–647.
(31) Rathore A; Carpenter MA; Demir Ö; Ikeda T; Li M; Shaban NM; Law EK; Anokhin D; Brown WL; Amaro RE; et al. The local dinucleotide preference of APOBEC3G can be altered from 5′-CC to 5′-TC by a single amino acid substitution. J. Mol. Biol. 2013, 425, 4442–4454.
(32) Kohli RM; Maul RW; Guminski AF; McClure RL; Gajula KS; Saribasak H; McMahon MA; Siliciano RF; Gearhart PJ; Stivers JT Local sequence targeting in the AID/APOBEC family differentially impacts retroviral restriction and antibody diversification. J. Biol. Chem. 2010, 285, 40956–40964.
(33) Carpenter MA; Rajagurubandara E; Wijesinghe P; Bhagwat AS Determinants of sequence-specificity within human AID and APOBEC3G. DNA Repair 2010, 9, 579–587.
(34) Jurrus E; Engel D; Star K; Monson K; Brandi J; Felberg LE; Brookes DH; Wilson L; Chen J; Liles K; et al. Improvements to the APBS biomolecular solvation software suite. Protein Sci. 2018, 27, 112–128.
