Nature Communications 8:15644 doi: 10.1038/ncomms15644 (2017); Published 5 Jun 2017
Transcription factors (TFs) are DNA-binding proteins that regulate gene expression. Sequence-specific TFs recognize DNA via specific amino acid-base hydrogen bonds and contacts that read local DNA shape (ref. 1). Studying base and shape readout modes of TFs in vivo has been challenging due to technical issues associated with current approaches for mapping TF-binding sites (TFBSs). We recently introduced Chromatin Endogenous Cleavage with sequencing (ChEC-seq), an in vivo mapping method based on fusing Micrococcal Nuclease (MNase) to a TF (ref. 2). Upon addition of calcium to permeabilized cells, tethered MNase cuts DNA adjacent to the bound TF and the released fragments are sequenced to provide a high-resolution genome-wide TFBS map. We used ChEC-seq to map the budding yeast TFs Abf1, Reb1 and Rap1 and obtained data similar to high-resolution ChIP-seq without the need for cross-linking, chromatin solubilization or antibodies.
When cells were collected <1 min after calcium addition, most TFBSs contained a TF-specific sequence motif (‘fast’ sites). We also reported ‘slow’ sites with low motif scores that appeared after ∼10 min. We found that DNA shape features of high-scoring (mostly fast) and low-scoring (mostly slow) TFBSs corresponded closely, but differed from randomly chosen sites not overlapping high- or low-scoring sites. In our study, DNA shape features of fast and slow sites were centred on the best match to the TF consensus motif; however, randomly chosen genomic intervals were not similarly centred on the best motif match. Rossi, Lai and Pugh now find that when random sites are motif-centred, the shape features correspond closely to slow site features (ref. 3), which might suggest that DNA shape is insufficient to explain binding site selection by the TFs Abf1, Reb1 and Rap1. However, given that sequence and shape features covary (ref. 4), it is problematic to rely on motif-dependent analyses to draw conclusions about whether a TF recognizes DNA shape (ref. 5).
To address this problem, we aligned DNA shape feature vectors for unique fast and slow ChEC-seq sites for each TF using a procedure that relied only on shape data and was not directly informed by sequence alignment. Given the possibility of overlap between nearby TFBSs, we identified unique sites that do not intersect with any other ChEC-seq sites within intervals ranging from 100 to 500 bp surrounding ChEC-seq peak maxima, with larger windows associated with increasing stringency. For Abf1 and Reb1, we found that average fast and slow site shape features were well correlated at a range of interval widths (P≪0.001; Fig. 1a–c). We also searched sites using a ‘shape profile’ defined using the average fast site features and found that score distributions for fast and slow sites differed only slightly (P>0.03), but were very different from random and free MNase sites (P≪10⁻¹⁰) for Abf1 (Fig. 1b) and Reb1 (not shown). The major shape feature proximal to Abf1 motifs is a deformation to the helix indicative of motif-proximal poly(dA:dT) tracts (Fig. 1a), a sequence feature we observed at slow sites in our original study (ref. 2). Consistent with the recognition of a preferred shape signature by Abf1 and Reb1 at fast and slow sites, random sites and free MNase sites were not well correlated with fast and slow sites (Fig. 1a–c). We do not observe shape features enriched for poly(dA:dT) tracts at free MNase sites (Fig. 1a,b), suggesting that the detection of this shared shape feature at fast and slow ChEC-seq sites is not simply due to the higher prevalence of these features within nucleosome-depleted regions. Shape features at Rap1 fast and slow sites were not well correlated (P<0.1; Fig. 1c,d). The robustness of the correlation between average fast and slow shape features for Abf1 and Reb1 across a range of interval widths (Fig. 1d) suggests that sampling of similar shapes by TFs may explain binding events, even within promoters where fast and slow sites co-occur.
From these motif-independent analyses, we conclude that fast and slow binding sites for Abf1 and Reb1 have similar shape features.
We next queried a TF-gene regulatory association database (ref. 6) and asked whether TF-slow site associations had been previously observed in mapping or gene expression studies orthogonal to ChEC-seq. Consistent with our previous demonstration that slow sites were recovered as sites without the canonical motif in other studies (ref. 2), the proportion of fast and slow sites documented or proposed to regulate proximal genes in previous studies (Fig. 1e) was similar across a range of interval widths. This suggests that slow sites with shape features similar to fast sites are likely true binding sites and not simply experimental noise due to cleavage proximal to fast sites.
What accounts for the differential sensitivity of these TFs to DNA shape? All three TFs are essential and have roles in maintaining nucleosome organization (refs 7,8); however, Rap1 is unique in that it also functions in chromatin silencing at the mating type locus and telomeres (ref. 9). Promoter architecture in Saccharomyces cerevisiae may provide a basis for this functional specialization (ref. 4). We observed marked deviations in DNA shape in the average aligned fast and slow site profiles for Abf1 (Fig. 1a) and Reb1, but not Rap1 (not shown), consistent with the presence of poly(dA:dT) tracts, which are known to exclude nucleosomes and play a role in establishing canonical chromatin architecture (refs 4,10). Abf1 and Reb1 have been proposed to be dependent on poly(dA:dT) tracts for their localization and function (refs 11,12,13). It has been suggested that poly(dA:dT) tracts may participate in regulating ribosomal protein gene promoters, which are also bound by Rap1 (ref. 14); however, our inability to detect significant DNA shape contributions to Rap1 binding may be due to the comparatively small number of sites tested. We speculate that promoters with poly(dA:dT) tracts not only exclude nucleosomes, but also have shape features that help recruit TFs that actively maintain nucleosome depletion (ref. 15). Indeed, binding site-proximal poly(dA:dT) tracts have been proposed to enhance binding (ref. 16), potentially by increasing accessibility of the adjacent major groove (ref. 17). Thus, TF functional diversity and architecture of yeast promoters may explain the varying sensitivities of TFs to DNA shape. In this context, we anticipate that ChEC-seq will be a useful tool for generating high-resolution maps of protein-DNA interactions, with the potential to provide insights into the in vivo role of DNA shape in TFBS recognition.
Methods
We defined unique sites as those for which intervals of 100–500 bp width centred on Abf1, Reb1, Rap1 and free MNase ChEC-seq peak maxima did not intersect. As a null set, we generated 1,500 random intervals from the sacCer3 genome assembly that did not overlap with ChEC-derived peaks. Shape features in 201-bp windows centred on peak maxima were determined as described (ref. 4) using the DNAshapeR package (ref. 18). At each interval width for a given TF, sites that did not have overlapping shape alignment windows were selected for alignment. Motif-independent alignment involved comparing each site against every other site within a given class and determining the shift that maximized the cosine similarity. Within a class, all sites were aligned to an internal centroid, defined as the site with the smallest sum of squared cosine similarities versus all other sites. Sites were then shifted relative to the centroid and class-specific average features were computed. Pearson's r was used to quantify the similarity of average shape features between classes (reported P values are two-tailed) without shifting the average features relative to each other. Given the strong A/T MNase cleavage preference (not shown) in the 5-bp window centred on peak maxima, we excluded these positions from the alignment. Further, because shape readout likely occurs near the TFBS, the largest shift considered was 25 bp and alignment was limited to the 90-bp interval centred at the peak maximum. Parameters used for all site classes, including the random and free MNase sites, were identical. Shape profiles for Abf1 and Reb1 were defined as the regions in the average fast shape features with the largest information gain relative to shuffled sequences.
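The shift-based, motif-independent alignment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is in the repository cited in the Methods), and several details are assumptions made for illustration: the function names (`cosine`, `window`, `best_shift`, `align`, `toy_site`) are hypothetical, each site is a single one-dimensional feature vector, the 5-bp central exclusion is omitted, and the centroid is taken as the site with the smallest sum of squared cosine distances (1 − similarity) to all other sites, which is one reading of the centroid definition.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def window(site, shift, width):
    """Slice of length `width` centred on the midpoint of `site`, offset by `shift`."""
    start = len(site) // 2 + shift - width // 2
    return site[start:start + width]


def best_shift(ref, site, width, max_shift):
    """Return (shift, similarity) for the shift of `site` that best matches `ref`.

    `ref` is held at shift 0. Ties go to the smallest absolute shift.
    Requires len(site) >= width + 2 * max_shift so every window is full length.
    """
    ref_win = window(ref, 0, width)
    candidates = []
    for s in range(-max_shift, max_shift + 1):
        sim = cosine(ref_win, window(site, s, width))
        candidates.append((sim, -abs(s), s))
    sim, _, shift = max(candidates)
    return shift, sim


def align(sites, width, max_shift):
    """Align all sites to an internal centroid.

    Returns (centroid index, per-site shifts, position-wise mean of aligned windows).
    """
    n = len(sites)

    def cost(i):
        # Assumption: sum of squared cosine distances from site i to every
        # other site, each evaluated at the other site's best shift.
        return sum(
            (1.0 - best_shift(sites[i], sites[j], width, max_shift)[1]) ** 2
            for j in range(n) if j != i
        )

    centroid = min(range(n), key=cost)
    shifts = [best_shift(sites[centroid], s, width, max_shift)[0] for s in sites]
    aligned = [window(s, sh, width) for s, sh in zip(sites, shifts)]
    mean = [sum(col) / n for col in zip(*aligned)]
    return centroid, shifts, mean


def toy_site(offset, length=21):
    """Toy feature vector: one bump displaced `offset` positions from the centre."""
    v = [0.0] * length
    for k, val in enumerate([1.0, 3.0, 5.0, 3.0, 1.0]):
        v[length // 2 + offset - 2 + k] = val
    return v
```

With the parameters reported in the Methods, a call would look like `align(sites, width=90, max_shift=25)` on 201-position feature vectors; the toy vectors from `toy_site` are only there to make the behaviour easy to check on small inputs.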
Score distributions were generated by scoring the aligned fast, slow, free MNase and random sites in the same 90-bp interval used for shape alignment using correlation distance to the shape profile; Mann–Whitney U-tests were performed for pairwise comparisons of the resulting distributions. To determine whether putative TFBSs regulate nearby genes, we assigned them to their closest (≤1 kb) genes and queried YEASTRACT (ref. 6). Source code for these analyses is publicly available (https://github.com/sivakasinathan/shape_align).
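As a rough illustration of the scoring step, the following Python sketch computes a correlation distance (1 − Pearson's r) between a site and a shape profile and compares two score distributions with a Mann–Whitney U statistic. It is a sketch under stated assumptions rather than the published code: the function names are hypothetical, the profile is scanned across every same-length sub-window of the site and the smallest distance is kept, constant windows are assigned r = 0, and the P value comes from a normal approximation without tie or continuity correction.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat constant vectors as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(x, y):
    """Correlation distance, 1 - r: near 0 for matching shapes, near 2 for inverted ones."""
    return 1.0 - pearson_r(x, y)


def best_profile_score(site, profile):
    """Smallest correlation distance between `profile` and any sub-window of `site`."""
    m = len(profile)
    return min(
        correlation_distance(site[i:i + m], profile)
        for i in range(len(site) - m + 1)
    )


def mann_whitney_u(a, b):
    """U statistic for `a` (pairs with a > b; ties count 0.5) and a two-sided P value.

    The P value uses the normal approximation with no tie or continuity
    correction, so it is only a rough guide for small or heavily tied samples.
    """
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n, m = len(a), len(b)
    mu = n * m / 2.0
    sigma = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p
```

In use, one would score every site in each class with `best_profile_score` and pass the resulting lists pairwise to `mann_whitney_u`, for example fast versus slow scores and fast versus random scores.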
Additional information
How to cite this article: Kasinathan, S. et al. Correspondence: Reply to ‘DNA shape is insufficient to explain binding'. Nat. Commun. 8, 15644 doi: 10.1038/ncomms15644 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
The authors declare no competing financial interests.
Author contributions: S.K., G.E.Z. and S.H. designed the study; S.K. wrote the software and performed the motif-independent alignments and statistical analyses; B.X. and R.R. performed DNA shape analyses; S.K. wrote the paper with input from all authors.
References
- Rohs R. et al. Origins of specificity in protein-DNA recognition. Annu. Rev. Biochem. 79, 233–269 (2010).
- Zentner G. E., Kasinathan S., Xin B., Rohs R. & Henikoff S. ChEC-seq kinetics discriminates transcription factor binding sites by DNA sequence and shape in vivo. Nat. Commun. 6, 8733 (2015).
- Rossi M. J., Lai W. K. M. & Pugh B. F. DNA shape is insufficient to explain binding. Nat. Commun. 8, 15643 (2017).
- Zhou T. et al. DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res. 41, W56–W62 (2013).
- Krietenstein N. et al. Genomic nucleosome organization reconstituted with pure proteins. Cell 167, 709–721. e712 (2016).
- Teixeira M. C. et al. The YEASTRACT database: an upgraded information system for the analysis of gene and genomic transcription regulation in Saccharomyces cerevisiae. Nucleic Acids Res. 42, D161–D166 (2014).
- Hartley P. D. & Madhani H. D. Mechanisms that specify promoter nucleosome location and identity. Cell 137, 445–458 (2009).
- Ganapathi M. et al. Extensive role of the general regulatory factors, Abf1 and Rap1, in determining genome-wide chromatin structure in budding yeast. Nucleic Acids Res. 39, 2032–2044 (2011).
- Shore D. RAP1: a protean regulator in yeast. Trends Genet. 10, 408–412 (1994).
- Struhl K. & Segal E. Determinants of nucleosome positioning. Nat. Struct. Mol. Biol. 20, 267–273 (2013).
- Wu R. & Li H. Positioned and G/C-capped poly(dA:dT) tracts associate with the centers of nucleosome-free regions in yeast promoters. Genome Res. 20, 473–484 (2010).
- Moreira J. M., Remacle J. E., Kielland-Brandt M. C. & Holmberg S. Datin, a yeast poly(dA:dT)-binding protein, behaves as an activator of the wild-type ILV1 promoter and interacts synergistically with Reb1p. Mol. Gen. Genet. 258, 95–103 (1998).
- Goncalves P. M. et al. Transcription activation of yeast ribosomal protein genes requires additional elements apart from binding sites for Abf1p or Rap1p. Nucleic Acids Res. 23, 1475–1480 (1995).
- Reja R., Vinayachandran V., Ghosh S. & Pugh B. F. Molecular mechanisms of ribosomal protein gene coregulation. Genes Dev. 29, 1942–1954 (2015).
- Iyer V. & Struhl K. Poly(dA:dT), a ubiquitous promoter element that stimulates transcription via its intrinsic DNA structure. EMBO J. 14, 2570–2579 (1995).
- Afek A., Sela I., Musa-Lempel N. & Lukatsky D. B. Nonspecific transcription-factor-DNA binding influences nucleosome occupancy in yeast. Biophys. J. 101, 2465–2475 (2011).
- Levo M. et al. Unraveling determinants of transcription factor binding outside the core binding site. Genome Res. 25, 1018–1029 (2015).
- Chiu T.-P. et al. DNAshapeR: an R/Bioconductor package for DNA shape prediction and feature encoding. Bioinformatics 32, 1211–1213 (2016).