Abstract
Cytosine methylation is a well-characterized epigenetic mark and occurs at both CG and non-CG sites in DNA. Both methylated CG (mCG)- and mCH (H = A, C, or T)-containing DNAs, especially mCAC-containing DNAs, are recognized by methyl-CpG–binding protein 2 (MeCP2) to regulate gene expression in neuron development. However, the molecular mechanism involved in the binding of the methyl-CpG–binding domain (MBD) of MeCP2 to these different DNA motifs is unclear. Here, we systematically characterized the DNA-binding selectivities of the MBD domains in MeCP2 and MBD1–4 with isothermal titration calorimetry–based binding assays, mutagenesis studies, and X-ray crystallography. We found that the MBD domains of MeCP2 and MBD1–4 bind mCG-containing DNAs independently of the sequence identity outside the mCG dinucleotide. Moreover, some MBD domains bound to both methylated and unmethylated CA dinucleotide–containing DNAs, with a preference for the CAC sequence motif. We also found that the MBD domains bind to mCA or nonmethylated CA DNA by recognizing the complementary TG dinucleotide, which is consistent with an overlooked ligand of MeCP2, i.e. the matrix/scaffold attachment regions (MARs/SARs) with a consensus sequence of 5′-GGTGT-3′ that was identified in the early 1990s. Our results also explain why MeCP2 exhibits similar binding affinity to both mCA- and hmCA-containing dsDNAs. In summary, our results suggest that in addition to mCG sites, unmethylated CA or TG sites also serve as DNA-binding sites for MeCP2 and other MBD-containing proteins. This discovery expands the genome-wide activity of MBD-containing proteins in gene regulation.
Keywords: X-ray crystallography, epigenetics, DNA methylation, 5-methylcytosine, gene regulation, CAC motif, CpH methylation, methyl-CpG–binding domain (MBD), methyl-CpG–binding protein 2
Introduction
Cytosine methylation occurs prevalently at CG dinucleotide sites, with about 70% of CG sites being subject to methylation (m) in the human genome (1). Nevertheless, cytosine methylation is also present at CH (H = A, T, or C) sites (2, 3), and non-CG methylation (mCH) accounts for about 25% of the total cytosine methylation in both embryonic stem cells and neurons, contributing to transcriptional repression and imprinting, similar to CG methylation (4–6). Non-CG methylation occurs in virtually all human tissues and is associated with repression of development-related genes during differentiation of adult stem cells (7).
mCG-mediated transcriptional repression occurs through the recognition of mCG by a family of proteins containing the MBD domain, a specific methyl-CpG–binding domain of about 70 residues. Eleven MBD domain–containing proteins have been identified in mammals, including MeCP2, MBD1–6, SETDB1/2, and BAZ2A/B.
In both mouse and human neurons, mCH is mainly located in chromatin regions of low CG density, which is established and maintained by DNMT3A (2, 4). Among the three CH dinucleotides, CA is the major target for cytosine methylation (2, 4, 8, 9). A flurry of recent studies demonstrate that MeCP2, a protein involved in neuron development whose mutations are linked to Rett syndrome and other neurological diseases (10, 11), interacts with mCH sites, particularly the mCA sites, in neurons, implying that the MeCP2–mCA interaction plays a key role in regulation of gene expression in normal neuron development (4, 12–14). MeCP2 mainly represses long genes (>100 kb) with high mCA density that are primarily expressed in brain (13). EMSA analysis indicates that MeCP2 binds to mCA as tightly as to mCG DNA and that MeCP2 prefers mCA over mCT and mCC (13, 14). Hydroxylation of mCG into hmCG (hmC is 5-hydroxymethylcytosine) significantly reduces its binding affinity to MeCP2, whereas hydroxylation of mCA into hmCA does not affect its binding to MeCP2 (14).
Recent progress in understanding the physiological role of mCA recognition by MeCP2 motivated us to carry out systematic analysis of mCG and mCH binding to the MBD domains of human MeCP2 and MBD1–4 by using ITC and crystallography. We found that the MBD domains of MeCP2 and MBD1–4 bound to mCG DNAs independent of the sequence identity outside the mCG dinucleotide, and the MBD domains of both MeCP2 and MBD1/2/4 could bind to mCA DNAs with a preference for the mCAC sequence motif. We next determined the crystal structures of the MBD2 MBD domain in complex with several different DNA ligands, including mCG, mCAT, mCAC, and unmodified CAC dsDNAs. We found that the MBD domain of MBD2 recognizes the mCA or CA via binding to their complementary TG dinucleotide and explained why the MBD domains favor the mCAC motif. Taken together, our results presented here imply that the unmethylated CA (or TG) DNAs also serve as the binding sites for MeCP2 and other MBD proteins, and also provide a foundation to study how the TG dinucleotide–binding ability of some MBD proteins, including that of MeCP2, impacts their genome-wide distributions and associated gene expression regulation.
Results and discussion
MBD domains of MeCP2 and MBD1–4 bind to mCG DNA independent of the sequence outside the mCG dinucleotide
The methyl-CG binding ability and sequence selectivity of MBD domains have been studied extensively. For instance, MeCP2 has been reported to prefer some A/T nucleotides surrounding the fully methylated CG dinucleotide (15). On the basis of the SELEX selection assay, the MBD domain of MBD1 has been shown to preferentially recognize mCG within the TCGCA and TGCGCA sequence contexts (16). By surface plasmon resonance (SPR) and structural analysis, the MBD domain of cMBD2 (chicken MBD2) was reported to preferentially recognize the mCGG sequence (17). The MBD domain of MBD3, which was initially found to lack mCpG-binding ability (18–20), has been reported to display preferential binding to 5hmC by EMSAs (21) or preferential binding to mCG by residual dipolar coupling analysis (22). The MBD4 protein contains a T/G or U/G mismatch-specific glycosylase domain in addition to the MBD domain, and its MBD domain was found to recognize the mCG/TG mismatch DNA, a product of the deamination of methylated CG DNA, as well as the mCG DNA (23, 24). However, some reports indicate that the sequence flanking the mCG dinucleotide does not affect the binding ability of the MBD domains (24, 25). In this study, we have systematically measured the binding affinities between the recombinant MBD domains of human MeCP2 as well as MBD1–4 and mCG-containing DNA with different lengths and sequence contexts by ITC (Tables 1 and 2 and Fig. S1). We did not observe significant sequence selectivity beyond the mCG dinucleotide. To elucidate the structural determinants of our observations, we determined crystal structures of the MBD domain of MBD2 in complex with two different mCG-containing dsDNAs (mCGG and mCGT) (Fig. 1 and Table S1).
Table 1.
Binding affinities (Kd values) of the MBD domains of MeCP2 and MBD2 with different DNA (μM)
WB means weak binding. For these weak binding ITC data, there is too little heat to be fitted accurately; NB means no detectable binding.

Table 2.
Binding affinities (Kd values) of the MBD domains of MBD1/3/4 with different DNA (μM)
WB means weak binding. For these weak binding ITC data, there is too little heat to be fitted accurately; NB means no detectable binding.

Figure 1.
Structural basis for the recognition of mCG DNA by the MBD domain of MBD2. A, overall structure of the MBD domain of MBD2 in complex with a mCG DNA in a schematic cartoon representation. The protein is shown in blue; the DNA ligand is shown in green except for the mC6–G6′ and G7–mC7′ bp, which are shown as yellow and red sticks, respectively. The mCG dinucleotide-interacting residues in MBD2 are shown as stick models, and water molecules are shown as red spheres. B, detailed interactions of the mCG dinucleotide-specific recognition by the MBD2 MBD domain. The interacting residues and DNA bases are shown in the same mode as in A. A and B, hydrogen bonds formed between protein residues and bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp. C, schematic diagram of the detailed interactions between MBD2 and mCG DNA. Direct and water-mediated hydrogen bonds are indicated by solid and dashed red arrows, respectively. The stacking interactions between Arg-166 and mC6, Arg-188, and mC7′ are indicated by gray arrows. D, superposition of the complex structures of the MBD2 MBD domain, respectively, with AmCGT (blue) and CmCGG (green) DNA. E–G, structural comparison of the mCG DNA recognition by the MBD domains of human and chicken MBD2. The protein is shown in blue, and the DNA ligand is shown in green. The mCG-interacting residues in both human MBD2 and chicken MBD2 are shown as stick models, and hydrogen bonds formed between protein residues and DNA are shown as dashed lines.
In both crystal structures the base-specific protein–DNA interactions are largely confined to the mCG dinucleotide motif (Fig. 1, A–C). We did not observe any base-specific interaction between protein and methylated DNA outside the mCG dinucleotide. The two MBD2–mCG complex structures could be well superimposed with a root mean square deviation of 0.66 Å over aligned backbone Cα atoms (Fig. 1D). In contrast to the published cMBD2–mCG structure (17), we found that Lys-174 of human MBD2 did not interact with the guanine following the mCG dinucleotide, explaining why MBD2 does not display sequence selectivity other than the mCG dinucleotide (Fig. 1, E–G). Although the human MBD2 MBD domain is 95% identical to that of cMBD2, our affinities were slightly stronger than those of cMBD2 (17). Based on the complex structures, we could not establish a causal link between the few differing sequence positions and the observed difference in affinity because these different amino acids do not play an obvious role in binding. Thus, we propose that the binding discrepancies for our human MBD2 and reported cMBD2 may result from the different experimental techniques, i.e. ITC versus SPR (17). The sequence-independent binding of these MBD domains is not only consistent with our binding results, but is also in line with crystal structures of MeCP2 and MBD4 in complex with the mCG DNA solved by others, which reveal that the MBD domains of MeCP2 and MBD4 barely make contact with any bases other than the CG dinucleotide (Fig. S2) (24, 26, 27). Taken together, the MBD domains of MeCP2 and MBD1–4 display no sequence selectivity outside the mCG dinucleotide.
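The root mean square deviation quoted above can be illustrated with a minimal sketch. It assumes the two sets of Cα coordinates have already been paired atom by atom and optimally superimposed (the superposition step itself, for example by a least-squares fit, is not shown), and the function name `rmsd` is hypothetical rather than taken from any software used in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same length unit as the input) between two paired coordinate lists.

    Assumes the structures are already superimposed; each item is an (x, y, z) tuple.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice the value depends on which atoms are paired and how the superposition is done, so numbers from different programs are only comparable when those choices match.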
It has been reported that the MBD domains recognize the duplex mCG dinucleotide through two highly conserved arginine “fingers” (Fig. 1B, Fig. S2, B and D) (26, 27). Each of the two arginine fingers recognizes one mCG dinucleotide from the duplex mCG DNA and forms a stair motif (28). This stair-shaped motif is usually held together by three kinds of interactions: bidentate hydrogen bonds between the arginine side chain and the guanine base; cation–π interactions between the guanidinium group of the same arginine side chain and the 5-methylcytosine (5-mC) 5′ to the guanine; and the nucleobase stacking interactions between the two bases in the mCG dinucleotide (Fig. 1, B and C, and Fig. S2). Cytosine methylation enlarges the binding interface and enhances cation–π interactions between 5-methylcytosine and arginine (28). The stair-shaped motif is also found in other protein–DNA complexes and usually consists of an arginine residue interacting with consecutive bases (pyrimidine followed by guanine) (29, 30). Therefore, we propose that the two arginine fingers and the two symmetrically related mCG steps are the structural determinants of the specific interactions between the MBD domains and mCG DNA.
MBD domains of MeCP2 as well as MBD1/2/4 bind to mCA DNA with a preference for mCAC sequence motif
As the MBD domain of MeCP2 recognizes mCA DNA in addition to mCG DNA (4, 12–14), we also measured the binding affinities of the MBD domains of MBD1–4 in addition to MeCP2 to different non-CG DNAs by ITC (Fig. 2, A and B, Fig. S1, and Tables 1 and 2). We found that the MBD domains of MeCP2 and MBD1/2/4 bound to mCA DNA, albeit weaker than to mCpG DNA in general, and the MBD domain of MBD3 exhibited only weak binding ability to mCA (Tables 1 and 2 and Fig. S1). We found that Tyr-178 of MBD2 formed a water-mediated hydrogen bond with mCG DNA in the MBD2 complex structures (Fig. 1B), and this interaction is also conserved in the MeCP2–mCpG DNA structure (Fig. S2B) (26). This conserved tyrosine residue has been proposed to be critical for mCG binding (26, 27, 31), but it is substituted with phenylalanine (Phe-34) in MBD3 (Fig. 2C), which cannot form a hydrogen bond as tyrosine does in MBD2 and MeCP2 (26, 27, 31). As a result, MBD3 is a weaker mCG binder, and an even weaker binder to mCA DNA (Table 2 and Fig. S1). Our ITC binding results also revealed that MBD domains bind to mCT and mCC DNAs only weakly (Tables 1 and 2 and Fig. S1), consistent with the earlier report that the MBD domain of MeCP2 binds to mCT and mCC DNAs as weakly as unmethylated CG DNA (13).
Figure 2.
MBD domain proteins possess CAC-binding ability. A, ITC-binding curves for the MBD domain of MBD2 and its mutants with different dsDNAs. B, ITC-binding curves for the MBD domain of MeCP2 and its mutants with different dsDNAs. NB, no detectable binding. C, sequence alignment of the MBD domains of human MBD2 (NP_003918.1), MBD1 (NP_001191065.1), MBD3 (NP_001268382.1), MBD4 (NP_001263199.1), and MeCP2 (NG_007107.2). The secondary structures of MBD2 and MeCP2 are indicated at the top and bottom of the sequences, respectively. The mCG dinucleotide-interacting residues of MBD2 and MeCP2 are labeled.
Motif analysis of the genome-wide CH methylation identifies that CH methylation prominently occurs in the context of trinucleotide mCAC in neuron cells (4, 8, 32, 33). Interestingly, our ITC results also revealed that the MBD domains of MeCP2 and MBD1/2/4 preferred mCAC over other mCAH (H = T, G and A) DNA (Fig. 2, A and B, Fig. S1, and Tables 1 and 2), in line with the observation that the preferential binding of MeCP2 to mCAC is critical for cerebral gene expression in the brain (32). Taken together, the MBD domains of MeCP2 and MBD1/2/4 exhibited binding abilities to mCA DNAs with a preference for the mCAC sequence motif.
Structural basis for the mCA recognition by the MBD domain
To understand the molecular basis of the mCA recognition by MeCP2, we tried to co-crystallize the MBD domain of MeCP2 with different mCA DNAs, but our co-crystallization attempts failed. Because our binding results also revealed that the MBD domains of MBD1/2/4 were able to recognize mCA DNA, we tried their co-crystallization and successfully determined the crystal structure of the MBD domain of MBD2 in complex with an mCAT DNA at a resolution of 2.05 Å (Fig. 3, A–C, and Table S1). In the MBD2–mCAT complex structure, the MBD domain of MBD2 adopted a canonical MBD-fold, with a C-terminal α-helix packed against the three-stranded β-sheet. The β-sheet was inserted into the major groove of mCA DNA and interacted with the mCA dinucleotide extensively (Fig. 3A).
Figure 3.
Structural basis for the selective recognition of mCA over mCT and mCC by the MBD2 MBD domain. A, overall structure of the MBD domain of MBD2 in complex with mCAT DNA in a schematic representation. The protein is shown in blue; the DNA ligand is shown in green, except the A5–T5′, mC6–G6′, and G7–mC7′ bp, which are shown as gray, yellow, and red sticks, respectively. The mCAT-interacting residues in MBD2 are shown as stick models. B, detailed interactions between the MBD2 MBD domain and mCAT DNA. The mCAT-interacting residues and DNA bases are shown in the same mode as in A. A and B, hydrogen bonds formed between protein residues and bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp. C, schematic diagram of the detailed interactions between MBD2 and mCAT DNA, with the intermolecular interactions indicated in the same way as shown in Fig. 1C. D, superposition of the MBD2–mCAT (blue) and MBD2–mCGG (gray) structures. Hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines; hydrogen bonds between bp are marked as gray dashed lines. E and F, structural models of the MBD2 MBD domain bound to the mCT and mCC DNA, respectively. The protein and DNA are shown in the same way as in B. Hydrogen bonds formed between the MBD domain and the top or the bottom CG pairs are marked as red and gray dashed lines, respectively.
In the MBD2-mCAT complex structure, Arg-166 formed two hydrogen bonds with the guanine base and simultaneously formed cation–π interactions with the pyrimidine ring of thymine in the TG dinucleotide that pairs with the mCA dinucleotide, completing an R/TG stair interaction motif (Fig. 3, B and C). Despite the same positively charged binding groove and the similar Arg-166 binding pattern between the MBD2–mCAT and other available MBD–mCG structures (Figs. 1B and 3B and Fig. S3, A and B) (15, 18–20, 26), there are significant differences between the mCA and mCG recognition. Different from the second mC–G pair recognition by Arg-188 in the MBD2–mCG complex, Arg-188, the other arginine finger, did not interact with the adenine of mCA dinucleotide, because both the side chain of Arg-188 and the 6-NH2 group of adenine function as hydrogen bond donors and could not form a hydrogen bond with each other. Instead, the side chain of Arg-188 was pushed away from the interaction interface, resulting in the loss of the cation–π interactions between Arg-188 and 5-mC (Fig. 3D and Fig. S4, A and B). The 5-mC did form a water-mediated hydrogen bond with Asp-176 and a C–H···O hydrogen bond with the main chain carbonyl oxygen of Arg-188 (Fig. 3, B and C) (34).
The arginine finger Arg-166 forms a salt bridge with the conserved residue Asp-176, as observed in the MBD–mCG complex structures (Fig. 3B). Because Arg-166 was fixed by Asp-176 with two intramolecular hydrogen bonds, and Arg-188 had more flexibility, Arg-166 was used to recognize the TG dinucleotide; otherwise, if the fixed Arg-166 recognized the complementary CA dinucleotide, then the adenine would form close contacts with Arg-166 because both are hydrogen bond donors. Consistently, our mutagenesis binding results revealed that mutating Arg-166 to alanine severely diminished its binding to mCA, whereas mutating Arg-188 to alanine just reduced its binding to mCA by about 4-fold, highlighting that Arg-166 is essential for the binding of MBD2 to mCA DNA (Fig. 2A and Fig. S1).
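To put the roughly 4-fold reduction for the Arg-188 mutant into energetic terms, a fold change in Kd can be converted to a change in binding free energy with ΔΔG = RT ln(fold change). The sketch below is only an illustration of that standard relation at the 25 °C used for the ITC assays; the function name is hypothetical, and the calculation is not part of the original analysis.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold_change(fold_change, temp_c=25.0):
    """Return the free-energy penalty (kcal/mol) for a given fold increase in Kd."""
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    temp_k = temp_c + 273.15  # convert Celsius to Kelvin
    return R_KCAL * temp_k * math.log(fold_change)

# A 4-fold weaker Kd corresponds to a penalty of less than 1 kcal/mol at 25 °C
penalty = ddg_from_fold_change(4.0)
```

By this measure a 4-fold loss is a modest energetic effect, consistent with Arg-188 contributing less to mCA binding than Arg-166.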
Interestingly, in the MeCP2–mCG DNA structures, Arg-133 (corresponding to Arg-188 in MBD2) also formed a hydrogen bond with Glu-137, in addition to the conserved salt bridge interactions between Arg-111 and Asp-121 (corresponding to Arg-166 and Asp-176 in MBD2, respectively) (Fig. 2C and Fig. S2, A and B). To investigate how MeCP2 recognizes mCA DNA, we also mutated Arg-111 and Arg-133 to alanine, and found that R111A disrupted the mCA DNA binding, whereas the R133A still retained modest mCA DNA binding (Fig. 2B and Fig. S1), implying that MeCP2 adopts a binding mode similar to that of MBD2 in binding mCA DNA.
Our structure also explained why mCC and mCT DNAs displayed significantly reduced binding affinities toward the MBD domains (Tables 1 and 2 and Fig. S1), because Arg-166 could not form cation–π interactions with the purine ring of adenine or guanine as it does with methylcytosine or thymine (Fig. 3, E and F). This binding mode also explained why MeCP2 exhibits similar binding affinities to both mCA and hmCA (14), because its MBD domain recognized the mCA mainly through its complementary sequence TG, a mimic of mCG, regardless of the modification status of CA.
Molecular basis for the preferential mCAC binding by the MBD domain
To further address why the MBD domains of MeCP2 and MBD1/2/4 prefer mCAC over other mCAH (H = A, T, and G) DNAs, we also determined the structures of the MBD2 MBD domain in complex with two different mCAC DNAs, respectively (Fig. 4, A–C, Fig. S3, C and D, and Table S1). The only difference between these two mCAC DNA sequences is that a thymine nucleotide located at the −2 position to the mCAC motif is replaced with a cytosine. These two structures are highly conserved, further implying that the flanking sequences do not affect the MBD binding. In the MBD2–mCAC complex structures, in addition to the interactions between Arg-166 and T6G7 dinucleotide, Arg-188 formed a hydrogen bond with G5 that pairs with the C5′ following the mC7′A6′ dinucleotide by taking a different conformation from that in the MBD2–mCAT structure (Figs. 4B and 5A and Fig. S3C), and this interaction is not allowed if the nucleotide following the mCA is not cytosine, explaining why the MBD domains of MeCP2 and MBD1/2/4 favor mCAC over other mCAH (H = G, A, and T) motifs (Fig. 2, A and B, Fig. S1, and Tables 1 and 2).
Figure 4.
Structural basis for the recognition of mCAC and CAC DNA by the MBD domain of MBD2. A, overall structure of the MBD domain of MBD2 in complex with an mCAC DNA in a schematic representation. The protein is shown in blue, and the DNA ligand is shown in green except the A4–T4′ (gray), G5–C5′ (gray), T6–A6′ (yellow), and G7–mC7′ (red) bp. The mCAC-interacting residues in MBD2 are shown as stick models. B, specific recognition of the mCAC trinucleotide by MBD2. The interacting residues and DNA bases are shown in the same mode as in A. C, schematic diagram of the detailed interactions between MBD2 and mCAC DNA. The direct hydrogen bonds and water-mediated hydrogen bonds are indicated by solid and dashed red arrows, respectively. The stacking interaction between Arg-166 and T6 is indicated by a gray arrow. D, overall structure of the MBD domain of MBD2 in complex with a CAC DNA in a schematic representation. The protein and DNA are shown the same as observed in A. E, detailed interactions between the MBD2 MBD domain and CAC DNA. The DNA-interacting residues and DNA bases are shown in the same mode as in D. F, schematic diagram of the detailed interactions between MBD2 and CAC DNA, with the intermolecular interactions indicated in the same way as shown in C. A, B, D, and E, hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp.
Figure 5.
Structural basis for preferential recognition of mCAC DNA by the MBD domain of MBD2. A, superposition of the MBD2–mCAT (blue) and MBD2–mCAC (orange) structures. B, superposition of the MBD2–mCAC (orange) and MBD2–CAC (green) structures. Hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp.
Cytosine methylation of the CA dinucleotide is not essential for the binding of MBD domains
The structural revelation that the MBD domain of MBD2 bound to the mCA DNA by specifically recognizing the complementary TG dinucleotide prompted us to investigate whether MBD2 was also able to recognize the unmethylated CA (or TG) DNA. Our binding results indeed revealed that the MBD domains of MBD2, MBD4, and MeCP2 could bind to the unmethylated CA DNA, albeit weaker than to mCA DNA (Fig. 2, A and B, Fig. S1, and Tables 1 and 2), presumably due to the lack of the C–H···O hydrogen bond between the 5-methyl group of the 5-mC and the main chain carbonyl oxygen of Arg-188 in MBD2. To illustrate the structural basis of the recognition of unmethylated CA DNA by the MBD domains, we determined the complex structure of the MBD2 MBD domain bound to a CAC-containing DNA (Fig. 4, D–F, and Table S1). The MBD2–CAC complex structure confirmed our hypothesis that the only difference between the MBD2–mCAC and MBD2–CAC structures is the loss of the C–H···O hydrogen bond between the cytosine of the CA dinucleotide and the main chain carbonyl oxygen of Arg-188 (Figs. 4, C and F, and 5B, and Fig. S4, C and D).
Although the MBD domain has long been established as a methyl-CG–binding domain (35), it was surprisingly reported as early as 1991 that the chicken attachment region-binding protein (ARBP), which was later found to be the MeCP2 homolog in chicken (36), recognizes the matrix/scaffold attachment regions (MARs/SARs) through a consensus sequence of 5′-GGTGT-3′ with flanking AT-rich sequences (37, 38), and this recognition depends on the MBD domain and a central 5′-GGTGT-3′ sequence (36, 37). Mutation of the central three nucleotides GTG of the 5′-GGTGT-3′ motif either abolishes or diminishes its binding to ARBP (or MeCP2) (37). The GTG sequence corresponds to the CAC sequence in the complementary strand of the DNA duplex. Furthermore, by re-assessing the previously published DNA binding database generated from the protein-binding microarray (PBM) assay, a technology developed to characterize DNA-binding sequence specificities of proteins, including transcription factors, in a high-throughput manner, we found that the MBD domain of MeCP2 selectively bound to unmethylated CA/TG sequence (Fig. 6A) (39, 40). Hence, these observations together with our findings presented here demonstrated that the binding of MBD domains, such as those of MeCP2 and MBD2, to mCA DNAs, is through the recognition of the complementary TG dinucleotide, and cytosine methylation of the CA dinucleotide is not essential for the binding of MBD domains.
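The strand relationship used in this argument (GTG on one strand reads as CAC on the other, and a TG step on one strand is a CA step on the other) can be checked with a short sketch. The helper names below are hypothetical and purely illustrative; they handle only unmodified A/C/G/T letters and say nothing about methylation state or binding affinity.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def dinucleotide_sites(seq, dinuc="TG"):
    """0-based top-strand positions where `dinuc` occurs on either strand."""
    seq = seq.upper()
    # A dinucleotide on the bottom strand appears as its reverse complement on top
    targets = {dinuc.upper(), reverse_complement(dinuc)}
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in targets]

# The MAR/SAR consensus 5'-GGTGT-3' carries CAC on its complementary strand
assert "CAC" in reverse_complement("GGTGT")
```

Scanning for `"TG"` with `dinucleotide_sites` therefore reports both TG and CA steps on the input strand, since each marks the same base-pair step in the duplex.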
Figure 6.
Structural basis for TG dinucleotide recognition by Kaiso, KLF4, and MBD2. A, protein-binding microarray of the MeCP2 MBD domain with the binding motifs highlighted. B, structural basis for TG recognition by Kaiso. T9–A9′ and G10–C10′ are shown as yellow and red sticks, respectively. The Kaiso residues involved in TG recognition are shown as blue sticks. C, structural basis for TG recognition by KLF4 with T5–A5′ and G6–mC6′ of dsDNA shown as yellow and red sticks, respectively. The KLF4 residues involved in TG recognition are shown as blue sticks. D, structural basis for TG recognition by MBD2. T6–A6′ and G7–mC7′ are shown as yellow and red sticks, respectively. The MBD2 residues involved in TG recognition are shown as blue sticks. Hydrogen bonds between protein residues and the top or the bottom CG pairs are marked as red and gray dashed lines, respectively.
The ability of some MBD domains to recognize both mCG and TG DNA is analogous to that of some other transcription factors (41), such as KLF4 (Krüppel-like factor 4) and Kaiso (42–44). Nevertheless, unlike KLF4 and Kaiso, which bind to both mCG and TG DNA located within some specific sequences (42–45), the MBD domains recognize mCG or GTG DNA without additional sequence selectivity. Comparing our structures with the KLF4–TG and Kaiso–TG complex structures, we found that, apart from the water-mediated interaction between Tyr-178 and DNA, MBD2 utilizes the conserved arginine residue and acidic amino acid to recognize the TG dinucleotide (Fig. 6, B–D). The TG motif binding by MBD domains also reminds us of another DNA sequence motif, i.e. the GT box motif, a GGTGTGGG-like sequence (46). The GT box is predominantly found in the proximal promoter regions or the more distal regulatory regions of mammalian genes with its CG-rich sequence unmethylated (also called GC box) (46). The GT and GC boxes together function as the recruiting elements for the Sp (specificity protein) and KLF families of transcription factors (46). Recent genome-wide MeCP2 distribution analysis reveals that, in addition to binding chromatin regions of high mCG density, MeCP2 also occupies chromatin sites of high mCH density but with lower mCG density. The distinctive MeCP2–mCG and MeCP2–mCA binding events may control different transcriptional programs during brain development (32). In this study, we revealed that the unmethylated CA (or TG) DNA might function as a novel biological ligand of MBD proteins. Consistently, the MBD proteins have been found to bind to chromatin in a methylation-independent manner, and more MeCP2 is located in the 5mC-scarce open chromatin regions than in the 5mC-rich heterochromatin regions (25, 47), further implicating the potential role of unmethylated CA dinucleotide in recruiting MeCP2 and other MBD proteins.
Nevertheless, how the CA (or TG) binding ability of MBD proteins, including that of MeCP2, impacts their genome-wide distributions and associated gene expression regulation warrants further studies.
Experimental procedures
Cloning, expression, and purification of MBD domains
MeCP2 (aa 80–164), MBD2 (aa 143–220), and MBD4 (aa 55–152) fragments of human genes were subcloned into pET28-MHL (Structural Genomics Consortium) expression vector to generate N-terminal His-tagged fusion proteins, whereas human MBD1 (aa 1–77) and MBD3 (aa 1–71) domains were subcloned into the pET28-GST-LIC (Structural Genomics Consortium) expression vector to generate N-terminal GST-tagged fusion proteins. The MBD domain mutants of MeCP2 (R111A and R133A) and MBD2 (R166A and R188A) were obtained by QuikChange site-directed mutagenesis (Agilent Technologies) using the MeCP2 (aa 80–164) and MBD2 (aa 143–220) expression constructs as the templates, respectively.
The recombinant proteins were overexpressed in Escherichia coli BL21 (DE3)-V2R-pRARE2 induced with 1 mM isopropyl β-d-thiogalactopyranoside at 14 °C overnight. The cell pellet was resuspended and lysed in a buffer containing 20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 0.5 mM phenylmethylsulfonyl fluoride, and 5% glycerol. Supernatant was collected after centrifugation at 16,000 × g for 1 h and then purified with nickel-nitrilotriacetic acid resin (Qiagen) or GSH-Sepharose 4 beads (GE Healthcare). Purified proteins were then treated with tobacco etch virus protease (for MeCP2, MBD2, and MBD4 proteins) or thrombin (for MBD1 and MBD3 proteins) to remove the tags. The treated samples were further purified by affinity chromatography, an anion-exchange column, and a gel-filtration column (GE Healthcare). Finally, the pure proteins were concentrated to 10 mg/ml in a buffer containing 20 mM Tris-HCl, pH 7.5, and 150 mM NaCl.
Isothermal titration calorimetry binding assay
All the DNA ligands used for ITC and crystallization experiments were synthesized by Integrated DNA Technologies and dissolved in the same buffer as the protein samples, containing 20 mM Tris-HCl, pH 7.5, and 150 mM NaCl. The DNA solution was then adjusted to around pH 7.5 using NaOH. The single-stranded DNA was annealed into DNA duplex as described before (48, 49). The concentrations of the protein and DNA samples were determined based on UV absorbance using the NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific). Each sample was measured at least three times to obtain an average concentration. ITC measurements were carried out with MBD domain proteins at concentrations from 20 to 60 μM and DNA ligands at concentrations from 0.5 to 1 mM. The assays were performed using MicroCal ITC or ITC200 (GE Healthcare) at 25 °C. Most samples were titrated once; the remaining samples were titrated more than once until we found optimal experimental conditions, mainly protein/DNA concentrations, that gave ITC curves with sufficient heat change to calculate the Kd reliably. For each binding pair, only the best curve was used to calculate the Kd, and the reported standard errors are the fitting errors from that curve. All the ITC curves with the corresponding thermodynamic parameters are shown in Fig. S1. To determine the Kd values, the data were fitted using the ITC data analysis module of Origin 7.0 (MicroCal Inc.) with the one-site binding model.
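As an illustration of what a one-site (1:1) binding model implies, the sketch below computes the equilibrium concentration of the protein–DNA complex from the total protein concentration, total DNA concentration, and Kd by solving the mass-action quadratic. It is not the Origin fitting routine used for the data in this study; the function names and the example concentrations in the comments are hypothetical.

```python
import math

def complex_conc(p_total, l_total, kd):
    """Equilibrium concentration of a 1:1 complex (same units as the inputs).

    Solves Kd = (P_total - PL) * (L_total - PL) / PL for PL, taking the
    physically meaningful (smaller) root of the quadratic.
    """
    if p_total < 0 or l_total < 0 or kd <= 0:
        raise ValueError("concentrations must be non-negative and kd positive")
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0

def fraction_bound(p_total, l_total, kd):
    """Fraction of the protein that is in complex with the ligand."""
    if p_total <= 0:
        raise ValueError("p_total must be positive")
    return complex_conc(p_total, l_total, kd) / p_total

# Hypothetical example: 40 uM protein, 40 uM DNA duplex, Kd of 5 uM
example_fraction = fraction_bound(40.0, 40.0, 5.0)
```

An actual ITC fit additionally models the heat released per injection, the binding stoichiometry, and dilution of the cell contents, none of which is attempted here.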
Crystallization
The purified proteins were mixed at a 1:1 molar ratio with different DNA ligands followed by incubation on ice for 30 min. The protein/DNA reaction mixtures were crystallized using the sitting drop vapor diffusion method at 18 °C by mixing 0.5 μl of the complex samples with 0.5 μl of the reservoir solution. Finally, we successfully obtained the complex crystals for MBD2 (aa 143–220) with the respective DNA ligands. The detailed crystallization conditions for each MBD–DNA complex are summarized in Table S1.
Data collection and structure determination
The native crystals were soaked in the crystallization solution supplemented with glycerol to a final concentration of 15% and frozen by immersion in liquid nitrogen. Diffraction data were collected at synchrotron or rotating anode X-ray sources under cooling to 100 K, processed with XDS (50), and merged with SCALA or AIMLESS (51). Structures were solved by molecular replacement with PHASER (52) using coordinates from PDB entries 3QMG and 2KY8 (for MBD2-CmCGG) or unpublished models (for the remaining MBD2 structures) as required. The MBD2–AmCAT complex was used as a starting model for the nearly isomorphous triclinic MBD2–AmCAC complex structure, which in turn was used as a starting model for the MBD2–ACAC complex. In these cases, a molecular replacement search was not needed, and POINTLESS (51) analysis and initial refinement were controlled by a DIMPLE (ccp4.github.io/dimple/) script.4 ARP/WARP (53) was used for electron density map improvement and COOT (54) for interactive model building. Restrained model refinement was performed with PHENIX.REFINE (55), REFMAC (56), and AUTOBUSTER (Cambridge, United Kingdom, Global Phasing Ltd.). MOLPROBITY (57) and the PARVATI server (58) were used for analysis of model geometry and atomic anisotropic displacement parameters, respectively. PDB_EXTRACT (59) and IOTBX.CIF (60) were used for the compilation of data collection and refinement statistics summarized in Table S1.
Coordinates and structure factors for the structures of the MBD domains in complex with their respective DNA ligands have been deposited in the Protein Data Bank (PDB) under accession codes 6C1A, 6C1U, 6C1T, 6C1V, 6CNP, and 6CNQ.
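For readers who want to retrieve the deposited models, the following sketch builds download links for the six accession codes. It assumes the public RCSB file-server URL pattern (`https://files.rcsb.org/download/<code>.<ext>`), which is not mentioned in the text; the helper name `rcsb_download_url` is hypothetical, and the code performs no network access.

```python
PDB_CODES = ["6C1A", "6C1U", "6C1T", "6C1V", "6CNP", "6CNQ"]


def rcsb_download_url(code, fmt="cif"):
    """Build a download URL for a PDB entry (assumed RCSB URL pattern)."""
    code = code.strip().upper()
    # Classic PDB identifiers: four alphanumeric characters, first is a digit
    if len(code) != 4 or not code.isalnum() or not code[0].isdigit():
        raise ValueError(f"not a four-character PDB code: {code!r}")
    if fmt not in ("cif", "pdb"):
        raise ValueError("fmt must be 'cif' or 'pdb'")
    return f"https://files.rcsb.org/download/{code}.{fmt}"


if __name__ == "__main__":
    for pdb_code in PDB_CODES:
        print(rcsb_download_url(pdb_code))
```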
Author contributions
K. L., C. X., M. L., A. Y., P. L., and T. R. H. investigation; K. L., C. X., and J. M. writing-original draft; K. L., C. X., and J. M. writing-review and editing; C. X. software; K. L., C. X., and J. M. validation; C. X. project administration; J. M. conceptualization; J. M. formal analysis; J. M. supervision.
Acknowledgments
We thank Wolfram Tempel for the structure determination. We thank Chuanbing Bian for the MBD2–mCG crystal structure, Aiping Dong for a coordinate averaging program, and Amy Wernimont for some diffraction data collection and crystal structure review. Some diffraction data were collected at the Structural Biology Center and at the General Medical Sciences and Cancer Institutes Structural Biology Facility at the Advanced Photon Source (GM/CA at APS), which is part of the X-ray Science Division at APS, Argonne National Laboratory. GM/CA received support from National Institutes of Health Grants ACB-12002 from NCI and AGM-12006 from NIGMS. Argonne is operated by the United States Department of Energy under Contract DE-AC02-06CH11357. The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the following: AbbVie; Bayer Pharma AG; Boehringer Ingelheim; Canada Foundation for Innovation; Eshelman Institute for Innovation; Genome Canada through Ontario Genomics Institute Grant OGI-055; Innovative Medicines Initiative (EU/EFPIA) ULTRA-DD Grant 115766; Janssen; Merck KGaA (Darmstadt, Germany); Merck Sharp and Dohme; Novartis Pharma AG; Ontario Ministry of Research, Innovation and Science (MRIS); Pfizer; São Paulo Research Foundation (FAPESP); Takeda; and Wellcome.
The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
This article contains Figs. S1–S4 and Table S1.
The atomic coordinates and structure factors (codes 6C1A, 6C1T, 6C1U, 6C1V, 6CNP, and 6CNQ) have been deposited in the Protein Data Bank (http://wwpdb.org/).
Please note that the JBC is not responsible for the long-term archiving and maintenance of this site or any other third party hosted site.
- m, methylation
- hm, 5-hydroxymethyl
- MBD, methyl-CpG-binding domain
- ITC, isothermal titration calorimetry
- aa, amino acid
- 5-mC, 5-methylcytosine
- hmC, 5-hydroxymethylcytosine
- c, chicken
- ARBP, attachment region-binding protein
- PDB, Protein Data Bank
- GST, glutathione S-transferase
- EMSA, electrophoretic mobility shift assay
- SPR, surface plasmon resonance.
References
- 1. Ehrlich M., Gama-Sosa M. A., Huang L. H., Midgett R. M., Kuo K. C., McCune R. A., and Gehrke C. (1982) Amount and distribution of 5-methylcytosine in human DNA from different types of tissues or cells. Nucleic Acids Res. 10, 2709–2721 10.1093/nar/10.8.2709
- 2. Ramsahoye B. H., Biniszkiewicz D., Lyko F., Clark V., Bird A. P., and Jaenisch R. (2000) Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc. Natl. Acad. Sci. U.S.A. 97, 5237–5242 10.1073/pnas.97.10.5237
- 3. Woodcock D. M., Crowther P. J., and Diver W. P. (1987) The majority of methylated deoxycytidines in human DNA are not in the CpG dinucleotide. Biochem. Biophys. Res. Commun. 145, 888–894 10.1016/0006-291X(87)91048-5
- 4. Guo J. U., Su Y., Shin J. H., Shin J., Li H., Xie B., Zhong C., Hu S., Le T., Fan G., Zhu H., Chang Q., Gao Y., Ming G. L., and Song H. (2014) Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain. Nat. Neurosci. 17, 215–222 10.1038/nn.3607
- 5. Lister R., Pelizzola M., Dowen R. H., Hawkins R. D., Hon G., Tonti-Filippini J., Nery J. R., Lee L., Ye Z., Ngo Q. M., Edsall L., Antosiewicz-Bourget J., Stewart R., Ruotti V., Millar A. H., et al. (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322 10.1038/nature08514
- 6. Kribelbauer J. F., Laptenko O., Chen S., Martini G. D., Freed-Pastor W. A., Prives C., Mann R. S., and Bussemaker H. J. (2017) Quantitative analysis of the DNA methylation sensitivity of transcription factor complexes. Cell Rep. 19, 2383–2395 10.1016/j.celrep.2017.05.069
- 7. Schultz M. D., He Y., Whitaker J. W., Hariharan M., Mukamel E. A., Leung D., Rajagopal N., Nery J. R., Urich M. A., Chen H., Lin S., Lin Y., Jung I., Schmitt A. D., Selvaraj S., et al. (2015) Human body epigenome maps reveal noncanonical DNA methylation variation. Nature 523, 212–216 10.1038/nature14465
- 8. Ichiyanagi T., Ichiyanagi K., Miyake M., and Sasaki H. (2013) Accumulation and loss of asymmetric non-CpG methylation during male germ-cell development. Nucleic Acids Res. 41, 738–745 10.1093/nar/gks1117
- 9. Ziller M. J., Müller F., Liao J., Zhang Y., Gu H., Bock C., Boyle P., Epstein C. B., Bernstein B. E., Lengauer T., Gnirke A., and Meissner A. (2011) Genomic distribution and inter-sample variation of non-CpG methylation across human cell types. PLoS Genet. 7, e1002389 10.1371/journal.pgen.1002389
- 10. Du Q., Luu P. L., Stirzaker C., and Clark S. J. (2015) Methyl-CpG-binding domain proteins: readers of the epigenome. Epigenomics 7, 1051–1073 10.2217/epi.15.39
- 11. Amir R. E., Van den Veyver I. B., Wan M., Tran C. Q., Francke U., and Zoghbi H. Y. (1999) Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat. Genet. 23, 185–188 10.1038/13810
- 12. Chen L., Chen K., Lavery L. A., Baker S. A., Shaw C. A., Li W., and Zoghbi H. Y. (2015) MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome. Proc. Natl. Acad. Sci. U.S.A. 112, 5509–5514 10.1073/pnas.1505909112
- 13. Gabel H. W., Kinde B., Stroud H., Gilbert C. S., Harmin D. A., Kastan N. R., Hemberg M., Ebert D. H., and Greenberg M. E. (2015) Disruption of DNA-methylation-dependent long gene repression in Rett syndrome. Nature 522, 89–93 10.1038/nature14319
- 14. Kinde B., Gabel H. W., Gilbert C. S., Griffith E. C., and Greenberg M. E. (2015) Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2. Proc. Natl. Acad. Sci. U.S.A. 112, 6800–6806 10.1073/pnas.1411269112
- 15. Klose R. J., Sarraf S. A., Schmiedeberg L., McDermott S. M., Stancheva I., and Bird A. P. (2005) DNA binding selectivity of MeCP2 due to a requirement for A/T sequences adjacent to methyl-CpG. Mol. Cell 19, 667–678 10.1016/j.molcel.2005.07.021
- 16. Clouaire T., de Las Heras J. I., Merusi C., and Stancheva I. (2010) Recruitment of MBD1 to target genes requires sequence-specific interaction of the MBD domain with methylated DNA. Nucleic Acids Res. 38, 4620–4634 10.1093/nar/gkq228
- 17. Scarsdale J. N., Webb H. D., Ginder G. D., and Williams D. C. Jr. (2011) Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence. Nucleic Acids Res. 39, 6741–6752 10.1093/nar/gkr262
- 18. Hendrich B., and Bird A. (1998) Identification and characterization of a family of mammalian methyl-CpG binding proteins. Mol. Cell. Biol. 18, 6538–6547 10.1128/MCB.18.11.6538
- 19. Zhang Y., Ng H. H., Erdjument-Bromage H., Tempst P., Bird A., and Reinberg D. (1999) Analysis of the NuRD subunits reveals a histone deacetylase core complex and a connection with DNA methylation. Genes Dev. 13, 1924–1935 10.1101/gad.13.15.1924
- 20. Saito M., and Ishikawa F. (2002) The mCpG-binding domain of human MBD3 does not bind to mCpG but interacts with NuRD/Mi2 components HDAC1 and MTA2. J. Biol. Chem. 277, 35434–35439 10.1074/jbc.M203455200
- 21. Yildirim O., Li R., Hung J. H., Chen P. B., Dong X., Ee L. S., Weng Z., Rando O. J., and Fazzio T. G. (2011) Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells. Cell 147, 1498–1510 10.1016/j.cell.2011.11.054
- 22. Cramer J. M., Scarsdale J. N., Walavalkar N. M., Buchwald W. A., Ginder G. D., and Williams D. C. Jr. (2014) Probing the dynamic distribution of bound states for methylcytosine-binding domains on DNA. J. Biol. Chem. 289, 1294–1302 10.1074/jbc.M113.512236
- 23. Hendrich B., Hardeland U., Ng H. H., Jiricny J., and Bird A. (1999) The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites. Nature 401, 301–304 10.1038/45843
- 24. Otani J., Arita K., Kato T., Kinoshita M., Kimura H., Suetake I., Tajima S., Ariyoshi M., and Shirakawa M. (2013) Structural basis of the versatile DNA recognition ability of the methyl-CpG binding domain of methyl-CpG binding domain protein 4. J. Biol. Chem. 288, 6351–6362 10.1074/jbc.M112.431098
- 25. Baubec T., Ivánek R., Lienert F., and Schübeler D. (2013) Methylation-dependent and -independent genomic targeting principles of the MBD protein family. Cell 153, 480–492 10.1016/j.cell.2013.03.011
- 26. Ho K. L., McNae I. W., Schmiedeberg L., Klose R. J., Bird A. P., and Walkinshaw M. D. (2008) MeCP2 binding to DNA depends upon hydration at methyl-CpG. Mol. Cell 29, 525–531 10.1016/j.molcel.2007.12.028
- 27. Ohki I., Shimotake N., Fujita N., Jee J., Ikegami T., Nakao M., and Shirakawa M. (2001) Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA. Cell 105, 487–497 10.1016/S0092-8674(01)00324-5
- 28. Zou X., Ma W., Solov'yov I. A., Chipot C., and Schulten K. (2012) Recognition of methylated DNA through methyl-CpG binding domain proteins. Nucleic Acids Res. 40, 2747–2758 10.1093/nar/gkr1057
- 29. Lamoureux J. S., and Glover J. N. (2006) Principles of protein-DNA recognition revealed in the structural analysis of Ndt80-MSE DNA complexes. Structure 14, 555–565 10.1016/j.str.2005.11.017
- 30. Rooman M., Liévin J., Buisine E., and Wintjens R. (2002) Cation-π/H-bond stair motifs at protein-DNA interfaces. J. Mol. Biol. 319, 67–76 10.1016/S0022-2836(02)00263-2
- 31. Fraga M. F., Ballestar E., Montoya G., Taysavang P., Wade P. A., and Esteller M. (2003) The affinity of different MBD proteins for a specific methylated locus depends on their intrinsic binding properties. Nucleic Acids Res. 31, 1765–1774 10.1093/nar/gkg249
- 32. Lagger S., Connelly J. C., Schweikert G., Webb S., Selfridge J., Ramsahoye B. H., Yu M., He C., Sanguinetti G., Sowers L. C., Walkinshaw M. D., and Bird A. (2017) MeCP2 recognizes cytosine methylated tri-nucleotide and di-nucleotide sequences to tune transcription in the mammalian brain. PLoS Genet. 13, e1006793 10.1371/journal.pgen.1006793
- 33. Kozlenkov A., Roussos P., Timashpolsky A., Barbu M., Rudchenko S., Bibikova M., Klotzle B., Byne W., Lyddon R., Di Narzo A. F., Hurd Y. L., Koonin E. V., and Dracheva S. (2014) Differences in DNA methylation between human neuronal and glial cells are concentrated in enhancers and non-CpG sites. Nucleic Acids Res. 42, 109–127 10.1093/nar/gkt838
- 34. Derewenda Z. S., Lee L., and Derewenda U. (1995) The occurrence of C–H···O hydrogen bonds in proteins. J. Mol. Biol. 252, 248–262 10.1006/jmbi.1995.0492
- 35. Lewis J. D., Meehan R. R., Henzel W. J., Maurer-Fogy I., Jeppesen P., Klein F., and Bird A. (1992) Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA. Cell 69, 905–914 10.1016/0092-8674(92)90610-O
- 36. Weitzel J. M., Buhrmester H., and Strätling W. H. (1997) Chicken MAR-binding protein ARBP is homologous to rat methyl-CpG-binding protein MeCP2. Mol. Cell. Biol. 17, 5656–5666 10.1128/MCB.17.9.5656
- 37. Buhrmester H., von Kries J. P., and Strätling W. H. (1995) Nuclear matrix protein ARBP recognizes a novel DNA sequence motif with high affinity. Biochemistry 34, 4108–4117 10.1021/bi00012a029
- 38. von Kries J. P., Buhrmester H., and Strätling W. H. (1991) A matrix/scaffold attachment region binding protein: identification, purification, and mode of binding. Cell 64, 123–135 10.1016/0092-8674(91)90214-J
- 39. Weirauch M. T., Cote A., Norel R., Annala M., Zhao Y., Riley T. R., Saez-Rodriguez J., Cokelaer T., Vedenko A., Talukder S., DREAM5 Consortium, Bussemaker H. J., Morris Q. D., Bulyk M. L., Stolovitzky G., and Hughes T. R. (2013) Evaluation of methods for modeling transcription factor sequence specificity. Nat. Biotechnol. 31, 126–134 10.1038/nbt.2486
- 40. Lam K. N., van Bakel H., Cote A. G., van der Ven A., and Hughes T. R. (2011) Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays. Nucleic Acids Res. 39, 4680–4690 10.1093/nar/gkq1303
- 41. Liu Y., Zhang X., Blumenthal R. M., and Cheng X. (2013) A common mode of recognition for methylated CpG. Trends Biochem. Sci. 38, 177–183 10.1016/j.tibs.2012.12.005
- 42. Liu Y., Olanrewaju Y. O., Zheng Y., Hashimoto H., Blumenthal R. M., Zhang X., and Cheng X. (2014) Structural basis for Klf4 recognition of methylated DNA. Nucleic Acids Res. 42, 4859–4867 10.1093/nar/gku134
- 43. Buck-Koehntop B. A., Stanfield R. L., Ekiert D. C., Martinez-Yamout M. A., Dyson H. J., Wilson I. A., and Wright P. E. (2012) Molecular basis for recognition of methylated and specific DNA sequences by the zinc finger protein Kaiso. Proc. Natl. Acad. Sci. U.S.A. 109, 15229–15234 10.1073/pnas.1213726109
- 44. Schuetz A., Nana D., Rose C., Zocher G., Milanovic M., Koenigsmann J., Blasig R., Heinemann U., and Carstanjen D. (2011) The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation. Cell. Mol. Life Sci. 68, 3121–3131 10.1007/s00018-010-0618-x
- 45. Daniel J. M., Spring C. M., Crawford H. C., Reynolds A. B., and Baig A. (2002) The p120(ctn)-binding partner Kaiso is a bi-modal DNA-binding protein that recognizes both a sequence-specific consensus and methylated CpG dinucleotides. Nucleic Acids Res. 30, 2911–2919 10.1093/nar/gkf398
- 46. Suske G., Bruford E., and Philipsen S. (2005) Mammalian SP/KLF transcription factors: bring in the family. Genomics 85, 551–556 10.1016/j.ygeno.2005.01.005
- 47. Shin J., Ming G. L., and Song H. (2013) By hook or by crook: multifaceted DNA-binding properties of MeCP2. Cell 152, 940–942 10.1016/j.cell.2013.02.017
- 48. Xu Y., Xu C., Kato A., Tempel W., Abreu J. G., Bian C., Hu Y., Hu D., Zhao B., Cerovina T., Diao J., Wu F., He H. H., Cui Q., Clark E., et al. (2012) Tet3 CXXC domain and dioxygenase activity cooperatively regulate key genes for Xenopus eye and neural development. Cell 151, 1200–1213 10.1016/j.cell.2012.11.014
- 49. Xu C., Liu K., Lei M., Yang A., Li Y., Hughes T. R., and Min J. (2018) DNA sequence recognition of human CXXC domains and their structural determinants. Structure 26, 85–95.e3 10.1016/j.str.2017.11.022
- 50. Kabsch W. (2010) XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132 10.1107/S0907444909047337
- 51. Evans P. R., and Murshudov G. N. (2013) How good are my data and what is the resolution? Acta Crystallogr. D Biol. Crystallogr. 69, 1204–1214 10.1107/S0907444913000061
- 52. McCoy A. J., Grosse-Kunstleve R. W., Adams P. D., Winn M. D., Storoni L. C., and Read R. J. (2007) Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 10.1107/S0021889807021206
- 53. Perrakis A., Harkiolaki M., Wilson K. S., and Lamzin V. S. (2001) ARP/wARP and molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 57, 1445–1450 10.1107/S0907444901014007
- 54. Emsley P., Lohkamp B., Scott W. G., and Cowtan K. (2010) Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 10.1107/S0907444910007493
- 55. Afonine P. V., Grosse-Kunstleve R. W., Echols N., Headd J. J., Moriarty N. W., Mustyakimov M., Terwilliger T. C., Urzhumtsev A., Zwart P. H., and Adams P. D. (2012) Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr. D Biol. Crystallogr. 68, 352–367 10.1107/S0907444912001308
- 56. Murshudov G. N., Skubák P., Lebedev A. A., Pannu N. S., Steiner R. A., Nicholls R. A., Winn M. D., Long F., and Vagin A. A. (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67, 355–367 10.1107/S0907444911001314
- 57. Chen V. B., Arendall W. B. 3rd, Headd J. J., Keedy D. A., Immormino R. M., Kapral G. J., Murray L. W., Richardson J. S., and Richardson D. C. (2010) MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 10.1107/S0907444909042073
- 58. Zucker F., Champ P. C., and Merritt E. A. (2010) Validation of crystallographic models containing TLS or other descriptions of anisotropy. Acta Crystallogr. D Biol. Crystallogr. 66, 889–900 10.1107/S0907444910020421
- 59. Yang H., Guranovic V., Dutta S., Feng Z., Berman H. M., and Westbrook J. D. (2004) Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank. Acta Crystallogr. D Biol. Crystallogr. 60, 1833–1839 10.1107/S0907444904019419
- 60. Gildea R. J., Bourhis L. J., Dolomanov O. V., Grosse-Kunstleve R. W., Puschmann H., Adams P. D., and Howard J. A. (2011) iotbx.cif: a comprehensive CIF toolbox. J. Appl. Crystallogr. 44, 1259–1263 10.1107/S0021889811041161