Abstract
Expression of a few master transcription factors can reprogram the epigenetic landscape and three-dimensional chromatin topology of differentiated cells and achieve pluripotency. During reprogramming, thousands of long-range chromatin contacts are altered, and changes in promoter association with enhancers dramatically influence transcription. Molecular participants at these sites have been identified, but how this re-organization might be orchestrated is not known. Biomolecular condensation is implicated in subcellular organization, including the recruitment of RNA polymerase in transcriptional activation. Here, we show that reprogramming factor KLF4 undergoes biomolecular condensation even in the absence of its intrinsically disordered region. Liquid–liquid condensation of the isolated KLF4 DNA binding domain with a DNA fragment from the NANOG proximal promoter is enhanced by CpG methylation of a KLF4 cognate binding site. We propose KLF4-mediated condensation as one mechanism for selectively organizing and re-organizing the genome based on the local sequence and epigenetic state.
Subject terms: Transcription factors, Transcription, Reprogramming
KLF4, OCT4, SOX2 and MYC cooperate to reorganize chromatin during somatic cell reprogramming. Here the authors show that KLF4 forms a liquid-like biomolecular condensate that recruits OCT4 and SOX2, and that condensation of the isolated KLF4 DNA binding domain with DNA is enhanced by CpG methylation.
Introduction
Krüppel-like factor 4 (KLF4) is a key constituent of reprogramming cocktails that transform fibroblasts to induced pluripotent stem cells (iPSCs)1–4. KLF4 cooperates with transcription factors (TFs) OCT4 and SOX2 in reprogramming to silence somatic enhancers and activate enhancers of pluripotency genes5,6, including the ‘gateway to pluripotency’ gene NANOG7, which is highly expressed in embryonic stem cells (ESCs)8–11. In pluripotent stem cells (PSCs), KLF4 is enriched at the NANOG12 and OCT413 loci, which interact through space with many other pluripotency-related genomic sites. KLF4 is enriched at ESC super-enhancers14 and at iPSC genomic anchors that make more than four contacts, further implicating KLF4 in chromatin organization15. How KLF4 or other TFs might initiate chromatin reorganizations that determine cell fate is of intense interest3,15,16.
KLF4 contacts the DNA major groove with three tandem C2H2 zinc fingers (ZnFs) that make specific interactions17,18 at 9 base pair (bp) cognate DNA sites19,20. The first 400 KLF4 residues are likely to be disordered because they have low sequence complexity, and intrinsically disordered regions (IDRs) of other TFs help to silence21,22 or activate23–26 gene expression. In current models for transcriptional activation, TFs bound to their cognate sites cooperate with co-localized co-activators to recruit Mediator complex and RNA polymerase II through IDR:IDR mediated biomolecular condensations23–26. The KLF4 DNA binding domain and IDR might participate in such processes in open chromatin, and the KLF4 preference for CpG methylated over unmethylated cognate sites27 combined with its ability to bind 6 bp partial sites in nucleosomal DNA16 could help target it to silenced chromatin. The ability of KLF4 to undergo biomolecular condensation could facilitate pioneer interactions with closed chromatin and, as others have speculated15, might stabilize long-range contacts between genomic loci.
Here, we show that KLF4 forms a liquid-like biomolecular condensate with DNA that recruits OCT4 and SOX2. Surprisingly, the intrinsically disordered region is not essential for KLF4 condensation in cells, and a KLF4 fragment comprising the isolated DNA binding domain (DBD) condenses with DNA in vitro. KLF4 DBD condensation with a NANOG promoter duplex is strongly enhanced by CpG methylation of a KLF4 cognate site, and ZnF point mutations that weaken interactions with DNA cognate sites decrease condensation in cells and in vitro. Single molecule methods show that KLF4 tandem zinc fingers bring together short DNA duplexes in dilute solution by a bridging interaction. We propose that bridging and/or condensation with DNA in a sequence- and CpG methylation-dependent manner underlie KLF4 function as a key chromatin organizer and pioneer transcription factor in somatic cell reprogramming.
Results
KLF4 forms nuclear condensates at modest expression levels
We used expression tags to monitor the distribution of KLF4 by fluorescence microscopy in HEK 293T cells or BJ fibroblasts, the somatic cells most widely used for reprogramming28. KLF4 fused to mTurquoise2 (KLF4-mTurq) localizes to the nucleus and forms small puncta or round droplets, whereas mTurquoise2 alone is diffusely distributed throughout the nucleus and the cytoplasm (Fig. 1a). Transfection produces cells with various expression levels of tagged protein; KLF4-mTurq distribution is diffuse at the lowest expression levels, but most cells that express detectable KLF4-mTurq show punctate expression or droplets (Fig. 1b). Because round droplets are hallmarks of liquid–liquid phase separation (LLPS)29,30, we monitored KLF4-mTurq fluorescence after photobleaching; fluorescence recovers rapidly in both large droplets and small puncta in BJ fibroblasts (Fig. 1c), indicating that KLF4-mTurq in the condensate diffuses rapidly and is therefore liquid-like. Time courses using 3D z-stack fluorescence imaging reveal fusion of small droplets in BJ fibroblasts (Fig. 1d), indicating a liquid-like KLF4-mTurq condensate. Treatment with 1,6-hexanediol largely dissolves the KLF4-mTurq puncta and round droplets in HEK 293T cells (Fig. 1e), consistent with a liquid-like condensate31.
A KLF4-mCherry fusion expresses at lower average levels than KLF4-mTurq but also forms puncta and droplets (Supplementary Fig. 1), indicating that the identity of the expression tag is not critical to condensation. Endogenous KLF4 levels have not been reported, but the KLF4-mTurq levels quantified by brightness (0.7 µM average for cells with puncta; 2.5 or 4.0 µM average for cells with small or large droplets, respectively; see Supplementary Fig. 2) are similar to those reported for TFs SOX2 or OCT432. We expect that KLF4 expression driven by vectors in reprogramming cocktails33 would result in robust biomolecular condensation.
The intrinsically disordered region is dispensable for KLF4 condensation
To identify domains that contribute to KLF4-mTurq biomolecular condensation, we expressed constructs lacking either the IDR (residues 1–417) or the DNA binding domain (DBD; residues 418–513) (Fig. 2a). KLF4ΔDBD-mTurq, which lacks the three tandem ZnFs, expresses well, is diffusely distributed throughout the cytoplasm and nucleus, and only rarely forms nuclear puncta (Fig. 2b, top). KLF4ΔIDR-mTurq, which lacks the low complexity region, expresses poorly, localizes to the nucleus, and forms droplets similar to KLF4-mTurq (Fig. 2b, bottom). Scoring cells for the presence of puncta and plotting them by total mTurq brightness reveals diffuse distribution at the lowest expression levels; KLF4-mTurq and KLF4ΔIDR-mTurq mutant form puncta at similar modest expression levels, whereas KLF4ΔDBD-mTurq forms puncta only at high expression levels (Fig. 2c). The dispensability of the IDR indicates that the DBD alone can drive KLF4 condensation, but the tag, which comprises most of KLF4ΔIDR-mTurq (Fig. 2a), may contribute in some way. To test directly whether the KLF4 tandem zinc fingers drive biomolecular condensation, we studied the isolated domain in vitro after expression in E. coli.
The KLF4 DNA binding domain phase separates with cognate DNA
Purified KLF4 DBD is readily soluble and does not condense or precipitate at physiological salt or upon addition of 10% PEG 8000, a crowding agent used to enhance the weak interactions that drive biomolecular condensation (Fig. 2d). Because proteins that bind RNA can undergo RNA-induced phase separation34, we tested the ability of NANK, a 30 bp NANOG promoter DNA duplex containing 3 KLF4 cognate sites, to induce phase separation of DBD. Adding 1 µM NANK to 6 µM DBD in physiological salt, without PEG, results in droplets that are visible by bright field microscopy (Fig. 2d, right). This DNA-induced condensation occurs without any labels or tags on either the isolated DBD or the DNA duplex. To determine if DBD and NANK co-localize, we labeled DBD with Alexa Fluor 594 (AF594) and mixed it with NANK in the presence of the dye YOYO-1, which binds DNA with high affinity35. Two-channel fluorescence images confirm that DBD-AF594 and NANK-YOYO-1 co-localize in all droplets (Fig. 2e). Time courses of NANK-induced DBD condensation show that droplets form, grow, fuse, settle, and wet the bottom surface (Fig. 2f), indicating that the phase is liquid-like. The liquid nature of large droplets is confirmed by rapid recovery of fluorescence after photobleaching localized regions (Fig. 2g). We conclude that DBD undergoes liquid–liquid phase separation (LLPS) with NANK at physiological salt without the need for crowding agents.
DBD:NANK LLPS depends in complex ways on the component concentrations. At 6 µM DBD, 0.25 µM NANK induces readily detectable LLPS, and increasing NANK up to 1 µM increases the amount of condensate, but further increases in NANK actually produce less condensate, and at 3 µM NANK and above LLPS is no longer detected (Fig. 2h). The lack of LLPS at high NANK suggests that phase separation requires a minimum DBD:NANK ratio, perhaps to saturate NANK cognate KLF4 sites. At any given DBD concentration, LLPS is not observed below a threshold NANK concentration; this NANK threshold is lower at high DBD levels (Fig. 2i, Supplementary Fig. 3). These data describe a phase diagram for DBD:NANK LLPS (Fig. 2i) and challenge us to understand the nature of DBD:NANK interaction.
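The ratio argument can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it takes the 6 µM DBD titration points quoted above and the three cognate sites per NANK duplex, and reports how many DBD molecules are available per duplex and per site. The function and variable names are ours and do not correspond to any analysis code used in this work.

```python
# Illustrative stoichiometry for the 6 µM DBD titration described in the text.
DBD_UM = 6.0                  # DBD concentration (µM), from the text
SITES_PER_NANK = 3            # KLF4 cognate sites per 30 bp NANK duplex, from the text
NANK_UM = [0.25, 1.0, 3.0]    # NANK concentrations (µM) quoted in the text


def dbd_per_duplex(dbd_um: float, dna_um: float) -> float:
    """Molar ratio of DBD to DNA duplex."""
    return dbd_um / dna_um


def dbd_per_site(dbd_um: float, dna_um: float, sites: int = SITES_PER_NANK) -> float:
    """Molar ratio of DBD to cognate sites, assuming every site is available."""
    return dbd_um / (dna_um * sites)


if __name__ == "__main__":
    for nank in NANK_UM:
        print(
            f"{nank:>5.2f} µM NANK: "
            f"{dbd_per_duplex(DBD_UM, nank):5.1f} DBD per duplex, "
            f"{dbd_per_site(DBD_UM, nank):4.2f} DBD per cognate site"
        )
```

At 3 µM NANK the mixture holds only two DBD per duplex, fewer than the three cognate sites, which is in line with the suggestion that LLPS requires a minimum DBD:NANK ratio.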
KLF4 DBD forms a 3:1 complex with a NANOG promoter duplex
KLF4 binds 9 bp cognate sites containing methylated GGCG or ‘intrinsically methylated’ GGTG17 and activates expression from the NANOG promoter through two GGTG elements36 (sites KLFA and KLFC in Fig. 3a). A GGCG element in the NANOG promoter reverse strand (KLFB in Fig. 3a) is methylated and silenced in germ cells37, and NANOG promoter hypermethylation must be reversed to achieve pluripotency38. Because the nature of KLF4 DBD binding to this DNA might be important for LLPS, we tested binding at these sites in vitro. Each of three 12 bp duplexes excerpted from the 30 bp NANOG promoter fragment NANK contains a central GG(C/T)G and binds DBD in electrophoretic mobility shift assays (EMSA) (Supplementary Fig. 4). EMSA titration of the 30 bp fragment NANK (or its CpG methylated variant, NANKm) with DBD gives complexes of three different mobilities, consistent with DBD forming 1:1, 2:1, and 3:1 complexes with these DNAs (Fig. 3a). At the same DNA concentrations, 3 DBD equivalents form a detectable 3:1 complex with NANKm, but 6 equivalents are needed to form such a complex with NANK, consistent with the KLF4 preference for GGmCG over GGCG27. We conclude that our KLF4 DBD preparations are well folded and show target selectivity, and that DBD can form a 3:1 complex with NANK.
KLF4 DBD forms a 1:1 complex with a cognate NANOG dodecamer
We determined the crystal structure of the 1:1 complex of DBD bound to a dodecamer containing the NANOG proximal promoter KLFA site (Fig. 3a, Table 1). As with previous KLF4 crystal structures with DNA17,18, each ZnF contacts three or four base pairs in the DNA major groove (Fig. 3b). The overall structure of the DBD:NKA complex is similar to previous KLF4:decamer complexes18 and each ZnF makes base-specific contacts mediated by one or more ‘specificity residues’ at positions −1, 2, 3, and 6 of the canonical C2H2 recognition code39. Residues R473 and R479 of ZnF2 (in positions −1 and 6) and R501 of ZnF3 (in position −1) hydrogen bond with the N7s and O6s of bases G5, G6, and G8, respectively; the arginine side chains also make polar interactions with water, ions, and/or aspartate side chains (Fig. 3c). As in previously solved structures, ZnF1 makes fewer base-specific contacts than ZnF2 or ZnF3. The only base-specific contact made by ZnF1 in this structure is H446 (in position 3) hydrogen bonding to N2 of base G10 (Fig. 3c). With other target DNAs, K443 (position −1) of ZnF1 contacts a G17,18; our target has a C at the corresponding position 11 in our dodecamer, and the K443 side chain is disordered in our structure. Interactions between a glutamate (E476, position 3) and the methyl group of T7 (Fig. 3c) in NKA are similar to those seen for KLF4 DBD bound to a DNA decamer containing methylated-C (PDB ID: 4M9E). KLF4 DBD binds to the GGTG of our dodecamer with the same conformation as it does to the GGmCG of a decamer (87 Cα-atoms superimpose with a root-mean-square deviation of 0.97 Å). When bound to a DNA heptamer (PDB ID: 2WBS), the ZnF2 and ZnF3 domains contact their target bases as in the decamer (354 atoms superimpose with a root-mean-square deviation of 0.48 Å), but ZnF1 adopts a new orientation18 (Fig. 3d) that has been invoked to explain a 6 bp consensus KLF4 binding site on nucleosomes16.
Table 1.
Statistic | KLF4:NKA |
---|---|
Data collection | |
Space group | P 1 21 1 |
Cell dimensions | |
a, b, c (Å) | 38.96, 46.69, 45.20 |
α, β, γ (°) | 90.0, 113.5, 90.0 |
Resolution (Å) | 22.17–2.14 (2.22–2.14)a |
I/σ (I) | 4.31 (1.23)a |
Completeness (%) | 96.14 (94.20)a |
Redundancy | 3.9 (3.8)a |
Rmeas | 0.2289 (2.774)a |
CC1/2 | 0.975 (0.304)a |
Refinement | |
Resolution (Å) | 22.17–2.14 (2.22–2.14)a |
No. reflections | 8016 (780)a |
Rwork/Rfreeb | 0.1975/0.2474 |
No. atoms | |
Proteins | 1142 |
Ligand/ion | 5 |
Water | 93 |
B-factors | |
Protein | 48.64 |
Ligand/ion | 47.56 |
Water | 45.46 |
R.m.s. deviations | |
Bond lengths (Å) | 0.009 |
Bond angles (°) | 1.03 |
aValues in parentheses are for the highest resolution shell.
b5.0% of the observed reflections were excluded from refinement for cross validation purposes.
KLF4 site overlap may drive non-canonical binding
The three NANK KLF4 sites match the 9 base KLF4 consensus site from JASPAR40 at 6 or 8 positions, but because the KLFA and KLFB sites overlap (Fig. 4a), how a 3:1 DBD:NANK complex forms (Fig. 3a, right) is not clear. We sought to determine the structure of three DBDs bound to NANK (or NANKm), but preparing 3:1 complexes invariably gave condensation, not crystallization. We, therefore, used superpositioning to determine if the canonical interactions seen in the DBD:NKA complex can be accommodated at sites KLFA, KLFB, and KLFC in a B-DNA model of the NANOG proximal promoter. Individual superpositions at each site suggest favorable protein:DNA contacts, and simultaneously placing DBD at KLFC and either KLFA or KLFB generates no clashes (Fig. 4b). However, canonical DBD occupation of both KLFA and KLFB causes intermonomer clashes of the two modeled ZnF1 domains (Fig. 4c). We conclude that in the observed 3:1 complexes (Fig. 3a), at least one ZnF1 projects away from the DNA. Using the pose for DBD bound to a DNA heptamer (Fig. 3d, black) at KLFA or KLFB with a canonical DBD posed at the other site relieves the clash (Fig. 4c). The four residue ZnF1–ZnF2 linker should allow ZnF1 to adopt different orientations, consistent with tandem ZnFs acting as independently folded “beads on a string”41. KLF4 would contact NANK with ZnF2 and ZnF3, using the 6 bp KLF4 binding site detected on nucleosomal DNA16 and retaining most of the basis for specificity inferred from structural analysis18.
If overlapping KLF4 sites have functional importance because they obligately expose a ZnF, then such arrangements should be conserved. The mouse NANOG promoter (Supplementary Fig. 5) contains four overlapping KLF4 cognate sites in the −90 to −65 region: two sites on the top strand contain GGTG (and match the 9 bp consensus at 7 or 8 positions), and one site on each strand contains GGCG (and matches the consensus at 6 or 8 positions). DBD binding schematics for these sites suggest that the mouse NANOG promoter would also exhibit a tethered, continuously solvent-exposed ZnF1 when saturated with KLF4, and that binding could be modulated by CpG methylation. Since tight DNA binding can be achieved by two ZnFs42 or one ZnF flanked by a basic region43, we hypothesized that the exposed ZnF1 might recruit a second DNA partner, and that such DNA bridging might drive biomolecular condensation.
KLF4 DBD can bridge two DNA duplexes
We tested this hypothesis using single molecule Förster resonance energy transfer (smFRET)44–46. In DBD:NANK models where ZnF1 is excluded from either KLFA or KLFB (Fig. 4d), the 5′ end of the NANK coding strand is 40–43 Å from the Cα of H446, a ZnF1 residue that canonically contacts the major groove. We therefore 5′ end labeled the NANK coding strand with fluorescent donors or acceptors, reasoning that close proximity of the excluded ZnF1 might enable it to bring another labeled DNA close enough for FRET (Fig. 4e). With two labeled DNAs co-dissolved at low concentrations (100 pM AF488-labeled NANK, 500 pM AF594-labeled NANK), donor emission is observed but FRET is not (Fig. 4f, left). FRET events induced by 1 µM unlabeled KLF4 DBD (Fig. 4f, right) show that DBD can bring two DNAs together (closer than 55 Å, the R0 for this label pair), providing a mechanism for DBD biomolecular condensation with DNA. Although detected at dilute concentrations that do not support mesoscale LLPS (Fig. 2i), these events are likely those that drive condensation at higher concentrations. The smFRET data do not define the stoichiometry of the complex(es), but they are consistent with our non-canonical model for DBD:NANK interaction (Fig. 4e).
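For orientation, the distance dependence of FRET follows the standard Förster relation E = 1/(1 + (r/R0)^6), so transfer efficiency is 50% at r = R0 and falls off steeply beyond it. The sketch below assumes an idealized donor–acceptor pair with the R0 of 55 Å quoted above; the sample distances are illustrative choices and are not measured dye-to-dye separations from this work.

```python
# Idealized Förster relation; R0 is taken from the text, the distances are illustrative.
R0_ANGSTROM = 55.0  # Förster radius for the AF488/AF594 pair, as quoted in the text


def fret_efficiency(r_angstrom: float, r0_angstrom: float = R0_ANGSTROM) -> float:
    """Transfer efficiency E = 1 / (1 + (r/R0)^6) for donor-acceptor distance r."""
    if r_angstrom <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / (1.0 + (r_angstrom / r0_angstrom) ** 6)


if __name__ == "__main__":
    # Hypothetical separations (Å): shorter than, equal to, and longer than R0.
    for r in (40.0, 55.0, 70.0, 100.0):
        print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r):.2f}")
```

Because of the sixth-power dependence, two labeled duplexes that are freely diffusing at picomolar concentrations are essentially never close enough to transfer energy, which is why FRET events report on DBD-dependent association.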
Non-cognate DNA can drive KLF4 DBD phase separation
If bridging between NANK molecules by a single continuously excluded ZnF can drive LLPS, then DBD bound to non-cognate sites might make similar bridging interactions when one ZnF transiently leaves the major groove. To test this, we mixed DBD with four DNA duplexes of 12 to 40 bp that lack a GG(C/T)G. None induce LLPS at 3.0 µM DBD and 0.25 µM DNA, conditions at which NANK readily drives LLPS, but at 10 µM DBD and 3 µM duplex, all four non-cognate DNAs produce phase separated droplets (Fig. 5a). We infer that DBD bound to non-cognate DNA samples binding modes that transiently expose ZnFs to interact with another duplex.
DBD phase separation with non-cognate DNAs at these modest concentrations reinforces the idea that the failure of 6 µM DBD to support LLPS at 3 µM NANK despite robust LLPS at 1 or 2 µM NANK (Fig. 2i) results from the sequestering of DBD into canonical complexes that depopulate states in which ZnF1 is exposed. At 2:1 DBD:NANK, energetically favored canonical modes can be adopted without steric clashes, as in Fig. 4b, so few DBD will adopt either obligately or transiently exposed binding modes. We reasoned that providing cognate sites in trans might therefore dissolve pre-formed condensate by sequestering DBD. We tested this by preparing 10 µM DBD with 3 µM NANK and allowing the mixture to undergo LLPS. Adding 0.5 equivalents of NANK causes rapid, total loss of the condensate (Fig. 5b, top). The initial and final states are consistent with the phase diagram (Fig. 2i), while the rapid dissolution shows that material readily exchanges between the aqueous phase and the enriched phase. This behavior and our stoichiometry-based explanation are similar to observations and the rationale for phase separation of tandem SH3 domains with a tandem substrate47. For a 3:1 DBD:FGF4 mixture, adding 0.5 equivalents of FGF4 DNA only modestly decreases the amount of condensate (Fig. 5b, bottom) because FGF4 has no high affinity KLF4 cognate sites.
DNA sequence strongly influences LLPS threshold concentration
To determine how the DNA sequence might influence the lowest concentration at which LLPS is observed (the threshold DNA concentration), we performed LLPS assays for the non-cognate 17-mer DNA SBE, for NANK, and for the CpG methylated substrate NANKm at a range of DBD concentrations (Fig. 5c). Even at 3 µM SBE, no LLPS is seen with 1.5 or 3.0 µM DBD; with 6.0 µM DBD, the threshold SBE concentration is 1.5 µM. For NANK, the thresholds for LLPS at 1.5, 3.0, and 6.0 µM DBD are 250, 125, and 125 nM (respectively); the 6.0 µM DBD threshold for NANK is more than 10-fold lower than that of SBE. For NANKm, the LLPS thresholds at 1.5, 3.0 and 6.0 µM DBD are 63, 31, and 16 nM (respectively); these thresholds are 4–8 fold lower than those for NANK, and the threshold for NANKm at 6.0 µM DBD is at least 90-fold lower than that of SBE. The degree of DNA interaction with DBD by EMSA correlates with DBD:DNA LLPS potential (Supplementary Fig. 6), and the very low threshold concentrations for NANKm indicate that condensation can be directed to high affinity KLF4 binding sites. We conclude that the DNA sequence can dramatically alter the propensity for DBD:DNA biomolecular condensation, and that CpG methylation of the NANOG promoter KLFB site (converting NANK to NANKm) strongly potentiates condensation.
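The fold differences quoted above follow directly from the threshold concentrations. The short sketch below simply recomputes them from the values given in the text; the dictionary layout and function name are ours and are used for illustration only.

```python
# LLPS threshold DNA concentrations (nM) quoted in the text, keyed by DBD concentration (µM).
THRESHOLD_NM = {
    "SBE": {6.0: 1500.0},  # 1.5 µM; no LLPS was seen with 1.5 or 3.0 µM DBD
    "NANK": {1.5: 250.0, 3.0: 125.0, 6.0: 125.0},
    "NANKm": {1.5: 63.0, 3.0: 31.0, 6.0: 16.0},
}


def fold_lower(reference: str, test: str, dbd_um: float) -> float:
    """How many fold lower the threshold of `test` is relative to `reference`."""
    return THRESHOLD_NM[reference][dbd_um] / THRESHOLD_NM[test][dbd_um]


if __name__ == "__main__":
    for dbd in (1.5, 3.0, 6.0):
        print(f"{dbd} µM DBD: NANKm is {fold_lower('NANK', 'NANKm', dbd):.1f}-fold below NANK")
    print(f"6.0 µM DBD: NANK is {fold_lower('SBE', 'NANK', 6.0):.1f}-fold below SBE")
    print(f"6.0 µM DBD: NANKm is {fold_lower('SBE', 'NANKm', 6.0):.1f}-fold below SBE")
```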
Zinc finger domain mutations attenuate DBD:DNA condensation
If the lower threshold concentrations for NANKm compared to NANK are caused by tight DBD binding that accompanies CpG methylation (Fig. 3a), then residues that participate in KLF4 base-specific recognition (see Fig. 3c) should be important to LLPS. On the other hand, if the observed condensation depends on an unfolded DBD fraction, a different DBD surface, or a trace contaminant, then large-to-small mutations at the “specificity residues” of the C2H2 recognition code39 should have no effect on LLPS. We, therefore, prepared DBD carrying a ZnF2 mutation (E476D, position 3) that weakens affinity for cognate KLF4 sites48 and a ZnF3 mutation (R501A, position −1) that weakens affinity by EMSA (Supplementary Fig. 6). The double mutant domain (DBDE476D/R501A) shows decreased LLPS compared to wild type for all three of the tested DNAs (Fig. 5d). Even at 3.0 µM SBE, no LLPS is detected with 6.0 µM DBDE476D/R501A. For NANK, no LLPS is detected for DBDE476D/R501A at 1.5 or 3.0 µM, though the threshold concentration at 6.0 µM is unaltered from wild type. For NANKm, no LLPS is detected at 1.5 µM DBDE476D/R501A, and the thresholds for LLPS at 3.0 and 6.0 µM are 8 fold higher for DBDE476D/R501A than for DBD. Two of the four non-cognate DNAs that condense robustly at 3 µM with 10 µM DBD (Fig. 5a) show no condensation at 3 µM with 10 µM DBDE476D/R501A (Supplementary Fig. 7). We conclude that the ZnF2 and ZnF3 surfaces that contact bases in cognate DNA are important for LLPS with both non-cognate and cognate DNAs.
We then transfected cells with constructs carrying wild type, R501A mutant, or E476D/R501A double mutant KLF4-mTurq and assessed their distributions by microscopy. More puncta are seen for wild type than the mutants (Fig. 5e, top), but mutant expression levels are lower than wild type. Automating the identification of “punctate” cells (>5 puncta detected) and plotting cells by their average fluorescence values reveals that both mutant proteins can be expressed at higher levels than wild type without conferring a “punctate” phenotype (Fig. 5f). At levels between 5.0 and 7.5 × 10³ arbitrary units, all cells expressing wild type fusions are classified as “punctate” but fewer than half of those expressing mutant proteins are so classified (Fig. 5f). Visual comparison of HEK 293T cells (Fig. 5e, bottom) or BJ fibroblasts (Supplementary Fig. 8) with equivalent average fluorescence confirms that the wild type construct supports a more punctate distribution than the point mutants. We conclude that at equivalent expression levels, both KLF4R501A-mTurq and KLF4E476D/R501A-mTurq undergo biomolecular condensation less readily than KLF4-mTurq. We infer that the DNA-contacting surfaces of ZnF2 and ZnF3 are therefore important to condensation mediated by full-length KLF4 in cells, and that the observed condensation is likely mediated by KLF4 molecules whose DNA binding domain is properly folded.
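The scoring rule used for this comparison can be sketched as follows. This is a minimal illustration and not the image-analysis code used here: the `Cell` record, the function names, and the example values are hypothetical, and only the ">5 puncta" criterion and the 5.0–7.5 × 10³ fluorescence window come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

PUNCTA_CUTOFF = 5  # from the text: a cell is "punctate" when >5 puncta are detected


@dataclass
class Cell:
    mean_fluorescence: float  # average fluorescence, arbitrary units (hypothetical field)
    n_puncta: int             # number of puncta detected in the cell (hypothetical field)


def is_punctate(cell: Cell) -> bool:
    """Apply the >5 puncta criterion."""
    return cell.n_puncta > PUNCTA_CUTOFF


def punctate_fraction(cells: Iterable[Cell], low: float, high: float) -> Optional[float]:
    """Fraction of cells in the fluorescence window [low, high] scored as punctate."""
    in_window = [c for c in cells if low <= c.mean_fluorescence <= high]
    if not in_window:
        return None  # no cells at this expression level
    return sum(is_punctate(c) for c in in_window) / len(in_window)


if __name__ == "__main__":
    # Made-up cells for demonstration only; these are not data from this study.
    demo = [Cell(5200.0, 12), Cell(6100.0, 3), Cell(7400.0, 9), Cell(9000.0, 20)]
    print(punctate_fraction(demo, 5.0e3, 7.5e3))
```

Comparing constructs within the same fluorescence window, rather than across whole populations, is what separates a difference in condensation propensity from a difference in expression level.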
KLF4 biomolecular condensates recruit SOX2 and OCT4
We then tested whether SOX2 and OCT4, TFs that cooperate with KLF4 at promoters and enhancers5,6,14,19, would co-localize to KLF4-mediated condensates by co-expressing KLF4-mTurq with OCT4-mCherry or SOX2-mCherry. OCT4-mCherry expressed alone shows a uniform nuclear distribution at low levels, with some tiny puncta at higher expression levels (Fig. 6a). SOX2-mCherry expressed alone shows distributions consistent with SOX2 acting as a bookmark for mitosis49: although usually uniform or showing tiny puncta, in some cells it highlights mitotic chromosomes (Fig. 6a). After co-transfection of vectors for TF-mCherry and KLF4-mTurq, only a fraction of cells express both tagged proteins. OCT4-mCherry co-localizes to KLF4-mTurq droplets and puncta (Fig. 6b, top) in all cells where both proteins are detected (n = 73, 2 biological replicates); OCT4-mCherry droplets are never seen in the absence of KLF4-mTurq. SOX2-mCherry co-localizes to KLF4 puncta and droplets in 74% of cells where both KLF4-mTurq and SOX2-mCherry fluorescence are detected (n = 39, 2 biological replicates); when SOX2-mCherry does not co-localize with KLF4-mTurq, its distribution resembles mitotic bookmarking (Fig. 6c, bottom). We conclude that the cellular KLF4 condensate can recruit OCT4 or SOX2.
To determine if the in vitro DBD:DNA condensed phase can recruit TFs, we labeled purified full-length OCT4 and SOX2 proteins with Alexa Fluor 647, mixed them with NANK (which lacks OCT4 or SOX2 cognate binding sites) with or without DBD, and monitored the mixtures by fluorescence microscopy. OCT4-AF647 or SOX2-AF647 mixed with NANK give homogeneous mixtures, but addition of DBD drives NANK into droplets that co-localize with OCT4-AF647 or SOX2-AF647 (Fig. 6d). We conclude that the DBD-mediated biomolecular condensate can recruit TFs, perhaps through non-specific TF:NANK interactions.
We then assessed DBD behavior with a polynucleosome substrate consisting of a 5 kbp plasmid DNA with sites for 11 nucleosomes and at least 9 GGTG motifs (Active Motif, Inc.). DBD colocalizes to droplets with polynucleosomes (Fig. 6e), and droplets induced by mixing DBD with polynucleosomes recruit labeled OCT4 or SOX2 (Fig. 6f). DBD condenses with this substrate at low concentrations: 250 nM DBD induces droplets with 210 pM plasmid/nucleosome complex (0.4 ng DNA/µl, Fig. 6g, left). This might reflect DBD enhancing the intrinsic ability of polynucleosomes to phase separate50. The binding of KLF4 to DNA in nucleosomes16 might mediate this enhancement or independently support condensation, but exposed plasmid DNA in this substrate might also drive condensation by recruiting many DBDs, giving it increased valency51,52.
KLF4 DBD condenses readily with long DNAs
To see if longer DNAs containing NANK could condense readily without nucleosomes, we examined NP, a 404 bp NANOG promoter fragment (−379 to +25) that includes NANK and 6 additional GGTG sites. 250 nM DBD readily condenses with 2.5 nM NP, but not with NANK at the same DNA weight concentration (0.6 ng/µl, 32 nM) (Fig. 6g, center panels). The threshold concentration for NANK at 6 µM DBD is 125 nM (Fig. 5c), so NP condenses at 24-fold lower DBD levels and 4-fold lower DNA weight concentration (50-fold lower mole concentration) than NANK. 250 nM DBD condenses robustly with 0.6 ng/µl (130 pM) NPE, a 7.4 kbp linear DNA containing portions of the NANOG promoter and its −5 enhancer53 and 93 GG(C/T)G sites (Fig. 6g, right). We conclude that long DNAs condense much more readily than short DNAs.
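The comparison between weight and mole concentrations rests on a simple conversion. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, a common rule of thumb that is our assumption and not a value stated in this work; the function names are likewise ours. With that assumption the conversion gives about 31 nM for 0.6 ng/µl of the 30 bp NANK, about 0.66 ng/µl for 2.5 nM of the 404 bp NP, and about 125 pM for 0.6 ng/µl of the 7.4 kbp NPE, close to the values quoted above.

```python
# Mass <-> molar conversion for double-stranded DNA.
AVG_G_PER_MOL_PER_BP = 650.0  # ASSUMPTION: rule-of-thumb average mass per base pair


def ng_per_ul_to_nM(ng_per_ul: float, length_bp: int,
                    g_per_mol_per_bp: float = AVG_G_PER_MOL_PER_BP) -> float:
    """Convert a DNA weight concentration (ng/µl) to a molar concentration (nM)."""
    molar_mass = length_bp * g_per_mol_per_bp   # g/mol
    grams_per_litre = ng_per_ul * 1e-3          # 1 ng/µl equals 1 mg/L
    return grams_per_litre / molar_mass * 1e9   # mol/L -> nM


def nM_to_ng_per_ul(nM: float, length_bp: int,
                    g_per_mol_per_bp: float = AVG_G_PER_MOL_PER_BP) -> float:
    """Convert a DNA molar concentration (nM) to a weight concentration (ng/µl)."""
    molar_mass = length_bp * g_per_mol_per_bp   # g/mol
    return nM * 1e-9 * molar_mass * 1e3         # g/L -> mg/L, i.e. ng/µl


if __name__ == "__main__":
    print(f"NANK, 0.6 ng/µl: {ng_per_ul_to_nM(0.6, 30):.1f} nM")
    print(f"NP, 2.5 nM: {nM_to_ng_per_ul(2.5, 404):.2f} ng/µl")
    print(f"NPE, 0.6 ng/µl: {ng_per_ul_to_nM(0.6, 7400) * 1e3:.0f} pM")
    print(f"NANK at 125 nM: {nM_to_ng_per_ul(125.0, 30):.2f} ng/µl")
```

The last line shows why the 125 nM NANK threshold corresponds to roughly four times the DNA weight concentration at which NP already condenses.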
Discussion
We propose that KLF4 organizes chromatin by forming condensates at genomic loci to which it is recruited in high numbers and then stabilizing the colocalization of such genomic sites when their KLF4:DNA condensates fuse during random diffusive collisions (Fig. 7). For the initial condensation, we expect that KLF4 would bind tightly to 6 bp on one DNA through ZnF2/ZnF3 but more weakly to another DNA through ZnF1, in a bridging mode, and that several KLF4 bound to one stretch of DNA would provide the valency needed to drive biomolecular condensation51,52. Cognate KLF4 sites19, overlapped sites (Fig. 4a, c), and partial 6 bp sites16 that might direct KLF4:DNA condensation at particular genomic loci will have their affinities modulated by CpG methylation and their accessibility influenced by nucleosomes and by other DNA binding proteins. KLF4:DNA condensation in vitro does not require IDR:IDR interactions, but in cells the KLF4 IDR may contribute to condensation (through homotypic interactions) or to recruitment of other factors (through heterotypic interactions). Chromatin modifying machinery would be able to reinforce or reverse KLF4:DNA condensation by altering the accessibility or methylation states of KLF4 binding sites.
KLF4 is found at both repressive and activating loops in PSCs15, indicating that contacts mediated by KLF4:DNA condensates are not sufficient to drive transcriptional activation. Spatial colocalization of genomic elements by KLF4:DNA condensates combined with IDR-centric models for transcriptional activation23–26,54,55 can explain many observed chromatin features. Promoters and enhancers associated with pluripotency are known to recruit KLF4 when they make long-range contacts5,12–15; we propose that KLF4 is condensed with DNA at these loci, helping to stabilize the observed long-range contacts (Fig. 7). Super-enhancers are larger than typical ESC enhancers, more enriched in KLF4, and able to recruit much higher levels of Mediator14. These properties can be explained by extensive KLF4:DNA condensates that bring together several enhancers, whose abilities to recruit transcription machinery through IDR:IDR interactions24,26 would be increased by their mutual proximity and by recruitment of TFs to the KLF4:DNA condensate. The KLF4-mediated recruitment of histone demethylase JMJD356 or DNA demethylase TET257 may be influenced by KLF4:DNA condensation, and the KLF4-mediated recruitment of cohesin13,56 may help to topologically link remote DNA segments held together by KLF4:DNA condensation.
KLF4 is functionally implicated at the NANOG promoter in somatic cell reprogramming5–7,12–15,19. We propose KLF4 binding and condensation as the first mechanistic steps in accessing the closed, highly methylated NANOG promoter during reprogramming. Silenced chromatin is compact, but KLF4 should diffuse into it readily because its folded ZnFs are small and its IDR is deformable. The human NANOG promoter has KLF4 cognate sites spaced by 15, 7, and 11 bp, so one of these sites must be partially exposed in nucleosomes, and KLF4 is known to bind to 6 bp partial sites in nucleosomal DNA16. KLF4 that binds to the CpG-methylated, nucleosome-wrapped NANOG promoter can recruit more KLF4 through condensation driven by its exposed ZnF1, and possibly through homotypic IDR:IDR interactions. When nucleosomal breathing motions expose DNA58, locally tethered KLF4 ZnFs will occupy newly exposed major grooves and prevent rewrapping. The local KLF4:DNA condensate will recruit TFs OCT4 and SOX2, biasing their diffusive searches59 to promoter sites within the condensate and further favoring nucleosome unwrapping; heterotypic IDR:IDR interactions between KLF4 and TFs could enhance recruitment.
Rising KLF4 levels early in reprogramming (Fig. 7a) will promote growth and fusion of KLF4:DNA condensates that help determine the long-range contacts made by the NANOG promoter (Fig. 7b–g). KLF4-enriched enhancers and promoters (Fig. 7c) that collide by random diffusion (Fig. 7d, e) will remain co-localized due to fusion of their KLF4:DNA condensates, within which KLF4 DNA bridging mediates a network of contacts among the key loci and nearby DNA (Fig. 7f, g). These steps driven by KLF4 expression could clear the way for recruitment of transcription machinery that initiates NANOG expression in mid-to-late stages of reprogramming12. A role for KLF4:DNA condensation in organizing chromatin can explain why an additional copy of KLF4 increases the efficiency of somatic cell reprogramming methods11 and commercial kits60 (CytoTune 2.0, Thermo Fisher), and why limiting KLF4 expression halts reprogramming at distinct stages of epigenetic reset but increasing KLF4 levels drives partially reprogrammed cells to iPSCs61.
DNA bridging by tandem C2H2 zinc fingers that we demonstrate here for KLF4 (Fig. 4f) could be widely implicated in chromatin structure and gene expression: the human genome contains more than 700 C2H2 ZnF proteins with four or more tandem ZnFs, having an average of 8.5 and as many as 30 ZnFs62. Many such proteins have ZnFs that are not needed to bind their DNA cognate sites and so might make bridging contacts; for instance, just three of the 11 tandem ZnFs in TZAP are sufficient to direct proteins to telomeres63. The TF GLIS1, which uses two of its five ZnFs to recognize targets64, enhances reprogramming by OCT4/SOX2/KLF465; if it were to make bridging contacts with its other three ZnFs, such contacts could be long-lived. ZnFs with unidentified functional roles are also common in proteins with repressive effects in chromatin: the repressor ZFP57 binds a methylated 6 bp motif in closed chromatin with two of its seven ZnFs66, and the N-terminal ZnF of the mouse repressor protein ZFP568 does not contact target DNA67. The architectural protein CTCF, which interacts through its N-terminal domain with cohesin68 and whose binding site polarity on DNA controls chromatin looping69, makes sequence-specific contacts with different target DNAs but its terminal ZnFs (ZnF1, ZnF10, and ZnF11) do not contribute to binding target DNA70. Our demonstration that the KLF4 ZnF tandem array makes DNA-bridging contacts that mediate condensation suggests that other C2H2 tandem ZnF proteins may bridge DNA making transient or long-lived contacts that contribute to biological function.
Methods
Bacterial strains
The E. coli strain DH5α (Thermo Fisher Scientific) was used for plasmid cloning and large-scale preparations of plasmid DNAs. The E. coli strain BL21 Star (DE3) (Thermo Fisher Scientific) was used for large-scale protein production.
Mammalian cell lines
The HEK 293T cell line (from ATCC, CRL-3216), Lenti-X 293T (from Takara Bio USA, TaKaRa Bio # 632180), and BJ fibroblasts (from ATCC, CRL-2522) were cultivated in Dulbecco’s Modified Eagle Medium (DMEM, Corning) with 10% (v/v) fetal bovine serum (FBS, Corning) and 1× antibiotic-antimycotic solution (Corning). All cells used in this study tested negative for mycoplasma contamination.
Construction of mammalian plasmids
All generated constructs and mutations were confirmed by DNA sequencing (Eurofins Genomics). The pHRT-GFP-AH lentiviral transfer vector was generated from pHR-CMV-TetO2_3C-Avi-His6 (Addgene #113887) by replacing the DNA fragment corresponding to the 5′-Chicken RPTPs signal sequence-HRV 3C site-3’ with a DNA fragment corresponding to 5′-BamHI-KpnI-TEV cleavage site-eGFP-3’. The insert fragment was amplified from the plasmid encoding TEV-eGFP using the primers eGFP-F/eGFP-R; see Supplementary Table 1 for all primers. The vector fragment was amplified from pHR-CMV-TetO2_3C-Avi-His6 using the primers pHRT-1F/pHRT-1R. The insert and vector fragments were ligated together using Gibson Assembly Master Mix (NEB) according to the manufacturer’s protocol.
Lentiviral vectors pHRT-mTu-AH and pHRT-mCh-AH were generated by replacing the eGFP gene in pHRT-GFP-AH with mTurquoise2 (mTu) and mCherry (mCh), respectively. The mTu insert was amplified from pmTurquoise2-Tubulin (Addgene #36202) using the primers mTu-1F/mTu-1R. The mTu gene has the A206K mutation to ensure obligate-monomer state71. The mCh insert was amplified from pBRY-nuclear mCherry-IRES-PURO (Addgene #52409) using the primers mTu-1F/mTu-1R. The vector fragment was amplified from pHRT-GFP-AH using the primers pHRT-1F/pHRT-2R. The insert and vector fragments were ligated together using Gibson Assembly Master Mix.
Vector pHRT-KLF4-mTu-AH was constructed to express KLF4 fused to a C-terminal TEV cleavage site-mTurquoise2-Avi-His6 tag (mTu-AH) in a lentiviral expression system. The KLF4 insert was amplified from the plasmid encoding KLF4 (GeneArt) using primers KLF4-1F/KLF4-1R. The insert fragment was digested with BamHI/KpnI and ligated into BamHI/KpnI-digested pHRT-mTu-AH lentiviral transfer vector using T4 DNA ligase (Promega).
pHRT-KLF4(2-417)-mTu-AH and pHRT-KLF4(418–513)-mTu-AH were constructed to express labeled KLF4 deletion constructs. The KLF4 coding regions corresponding to the intrinsically disordered region (IDR, residues 2-417) or the DNA binding domain (DBD, residues 418–513) were amplified from the plasmid encoding human KLF4 gene (GeneArt) using primer sets KLF4-2F/KLF4-2R or KLF4-3F/KLF4-3R, respectively, and ligated into BamHI/KpnI-digested pHRT-mTu-AH lentiviral transfer vector using Gibson Assembly Master Mix.
To construct the KLF4 single mutant (R501A) expression vector pHRT-KLF4_R501A-mTu-AH, two DNA fragments were amplified from pHRT-KLF4-mTu-AH using the primer sets KLF4-4F/KLF4-4R and KLF4-5F/KLF4-5R and ligated together using Gibson Assembly Master Mix.
To construct the KLF4 double mutant (E476D/R501A) expression vector pHRT-KLF4_E476D_R501A-mTu-AH, two DNA fragments were amplified from pHRT-KLF4_R501A-mTu-AH using the primer sets KLF4-4F/KLF4-6R and KLF4-7F/KLF4-5R, respectively, and ligated together using Gibson Assembly Master Mix.
To express fluorescently labeled OCT4 in the lentiviral expression system, the pHRT-OCT4-GFP-AH plasmid was built by amplifying the OCT4 gene from pGEX4T-1_WT_OCT4 (Addgene #40633) using the primers OCT4-1F/OCT4-1R and ligating into BamHI/KpnI-digested pHRT-GFP-AH using Gibson Assembly Master Mix. The pHRT-OCT4-mCh-AH construct for expression of the C-terminal TEV cleavage site-mCherry-Avi-His6 (mCh-AH) fused OCT4 was built by amplifying the OCT4 gene in pHRT-OCT4-GFP-AH using the primers OCT4-2F/OCT4-2R and ligating into EcoRI/KpnI-digested pHRT-mCh-AH using Gibson Assembly Master Mix.
To construct pHR-SOX2-mCh-Cry2olig, the SOX2 gene (insert) was amplified from the SOX2 gene in pEP4 E02S EN2L (gift from James Thomson, Addgene #20922) using the primers SOX2-1F/SOX2-1R. The vector fragment was amplified from pHR-mCh-Cry2olig (gift from Clifford Brangwynne, Addgene #101222) using the primers pHRT-3F/pHRT-3R. The insert and vector fragments were ligated together using Gibson Assembly Master Mix.
Construction of bacterial plasmids
To express the KLF4 DBD fused to an N-terminal streptavidin-binding Nano tag-His6-TEV cleavage site (NH6t) in the bacterial expression system, the KLF4 DBD gene (insert) was amplified from the plasmid encoding KLF4 (GeneArt) using the primers KLF4-8F/KLF4-8R. The vector fragment was amplified from pET15Nano6HT-SMAD1 (DNASU) using the primers pET15-1F/pET15-1R. The insert and vector fragments were ligated together using Gibson Assembly Master Mix to give pET15-NH6t-KLF4(418–513). KLF4 DBD mutations E476D and R501A were introduced into plasmid pET15-NH6t-KLF4(418–513) using the QuikChange Multi Site-Directed Mutagenesis Kit (Agilent) and the primers KLF4-9F/KLF4-10F according to the manufacturer’s protocol. Product pET15-NH6t-KLF4(418–513)_E476D_R501A was verified by DNA sequencing; residue numbering follows the human gene product.
Lentiviral transfection and transduction
Recombinant lentiviruses were produced by co-transfection of Lenti-X 293T cells (1 × 10⁶ cells in a gelatin-coated 10 cm cell culture dish) with lentiviral transfer construct (1.2 pmol), psPAX2 packaging plasmid (1.2 pmol; Addgene #12260), and pMD2.G envelope plasmid (0.7 pmol; Addgene #12259) using the Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific). Transfection was performed according to the manufacturer’s optimized protocols. Lentiviruses were harvested 3 days post-transfection, filtered through 0.45 μm-pore size PES filters, and concentrated 100-fold using Lenti Concentrator (OriGene). The lentiviral titer, determined by Lentivirus Titer Kit HIV-1 p24 ELISA Assay (OriGene), was ~2 × 10⁸ TU/mL.
Lentiviral transduction was carried out with 1 × 10⁵ host cells in a 24-well cell culture plate at a multiplicity of infection (MOI) of 10 in the presence of 8 μg/mL of polybrene. Lentivirus was removed after overnight incubation, and fresh cell culture media was added. Two days post-transduction, expression of the fluorescent proteins (eGFP, mTurquoise2, and mCherry) was verified using an EVOS fluorescence microscope (Thermo Fisher Scientific). Transduced cells were passaged in culture as bulk preparation for functional assays.
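The volume of concentrated virus needed per well follows directly from the cell number, the MOI, and the titer. The sketch below is only an illustration of that arithmetic: it assumes the ~2 × 10⁸ TU/mL titer applies to the concentrated stock, and the function name is hypothetical rather than taken from the study.

```python
def virus_volume_ul(cells, moi, titer_tu_per_ml):
    """Return the volume of viral stock (in µL) that supplies `moi`
    transducing units (TU) per cell for the given number of cells."""
    if titer_tu_per_ml <= 0:
        raise ValueError("titer must be positive")
    transducing_units = cells * moi              # total TU required
    volume_ml = transducing_units / titer_tu_per_ml
    return volume_ml * 1000.0                    # convert mL to µL


# Values quoted in the text: 1e5 cells per well, MOI of 10, ~2e8 TU/mL.
volume = virus_volume_ul(cells=1e5, moi=10, titer_tu_per_ml=2e8)
print(f"{volume:.1f} µL of concentrated virus per well")
```

Because the titer is quoted as approximate, the resulting volume should be read as an estimate.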
Purification of KLF4 DBD (418–513; DBD) WT and double mutant (E476D/R501A)
Either pET15-NH6t-KLF4(418–513) or pET15-NH6t-KLF4(418–513)_E476D_R501A was transformed into E. coli BL21 Star competent cells (Novagen, Merck KGaA, Darmstadt, Germany). Transformed cells were grown at 37 °C in Terrific Broth media containing 100 µg/mL carbenicillin antibiotic until optical density at 600 nm (OD600) reached 0.6. The culture was then transferred to an 18 °C incubator shaking at 250 rpm until OD600 reached 0.8–1.0. Protein expression was induced with 1 mM IPTG, followed by overnight growth at 18 °C with shaking at 250 rpm. Cells from the overnight culture were harvested by centrifugation at 7900 × g. Both WT and double mutant (E476D/R501A) KLF4 DBD were purified with the same procedure. The cell culture pellets were resuspended in denaturing lysis buffer (6 M urea, 1 mM 2-mercaptoethanol, 0.5 M NaCl, 109 mM sodium phosphate, pH 8). The resuspended pellets were lysed using a cell homogenizer (Avestin, Ottawa, Canada), and the soluble fraction was separated from the cell debris by centrifugation at 38,700 × g. Lysate containing the soluble fraction was filtered using a 0.25 µm filter (Corning). The His-tagged fusion protein was purified from the crude protein mixture by immobilized metal-affinity chromatography (IMAC) using a batch/gravity method. The lysate was applied to 5 mL of pre-equilibrated HisPur cobalt resin (Thermo Fisher Scientific) followed by extensive washing (20–50 column volumes). The protein was eluted using elution buffer (denaturing lysis buffer + 200 mM imidazole). The eluted protein was combined with 3 volumes of 0.1% (v/v) TFA, and acidified to ~pH 3. The soluble fraction was further purified by reverse-phase HPLC using a Zorbax 300SB C3 column (Agilent Technologies) and a 20–60% ACN gradient (Buffer A: dH2O with 0.1% (v/v) TFA; Buffer B: ACN with 0.1% (v/v) TFA). Pure fractions (by SDS-PAGE analysis) were then combined and dialyzed against deionized distilled water (dH2O).
Afterward, the protein solution was adjusted to ~pH 6–7 with 1 M Tris, pH 8 (final concentration ~10–20 mM). TEV protease was then added (1 mg TEV: 25 mg protein), and the sample was incubated overnight at 4 °C with rotation. The cleaved, untagged proteins were subsequently re-purified by HPLC (as described above), followed by dialysis against dH2O and flash freezing using liquid N2 prior to storage at −80 °C. CD spectroscopy (Aviv, Lakewood, NJ) was used to verify proper protein refolding, monitoring the transition from unfolded to folded state induced by changes in pH and the incremental addition of ZnSO4. For crystallization and subsequent experiments, refolding was performed by addition of 3.3 molar equivalents of ZnSO4 followed by buffer dilution using either 10 mM Tris, pH 8, or 1× TBS (140 mM NaCl, 25 mM Tris, pH 7.4) buffers.
Purification of mTurquoise2
For expression and purification of mTurquoise2 (mTurq) from bacterial cells, the pET15-mTu-AH plasmid was transformed into E. coli BL21 Star (DE3) chemically competent cells, which were then grown at 37 °C in Terrific Broth media with 100 μg/mL carbenicillin. When the OD600 reached 1.0 to 1.5, protein expression was induced with 1 mM IPTG. The cells were incubated overnight with shaking at 18 °C and harvested by centrifugation. Pellets were resuspended and lysed in 1–2 mL RIPA2 lysis buffer solutions using a handheld sonicator operating at 30% power for three cycles of 60 s on, 60 s off. RIPA2 lysis buffer consists of 1× PBS (1.8 mM KH2PO4, 10 mM Na2HPO4, 2.7 mM KCl, 137 mM NaCl), 0.5% Triton X-100, and 0.1% sodium deoxycholate. The fluorescent protein was purified using batch/gravity immobilized metal-affinity chromatography (IMAC). The beads were extensively washed with 50 column volumes of RIPA2 buffer plus 500 mM NaCl, and the protein was eluted with 200 mM imidazole. The eluate was diluted and passed through Q Sepharose beads. The protein was eluted with 500 mM NaCl, concentrated, and exchanged into a new buffer with 2 mM TCEP, 10% glycerol, 500 mM NaCl, 25 mM Tris, pH 7.5.
Purification of full-length (FL) OCT4 and SOX2
E. coli BL21 Star competent cells (Novagen) were transformed with pGEX4T-1_WT_OCT4 (GST-OCT4 fusion) or pET302-GB1-SOX2 (His-tag protein GB1 (h6GB1)-SOX2 fusion) expression plasmids. Protein expression was conducted as described above for KLF4 DBD. For h6GB1-SOX2, the final harvested cell culture pellet was resuspended in denaturing lysis buffer (8 M urea, 850 mM NaCl, 50 mM Tris, pH 8), lysed, and centrifuged. The supernatant was passed through an IMAC column with Co2+ resin. After 20 column volumes of washing with the lysis buffer, the protein was eluted using the same buffer plus 200 mM imidazole. The eluted protein was concentrated, diluted six-fold with refolding buffer (1× PBS plus 500 mM NaCl, 5% (v/v) glycerol, and 0.1% (v/v) Tween-20). The h6GB1 fusion tags from h6GB1-SOX2 proteins were cleaved (1:20 TEV:protein w/w ratio) overnight at 4 °C. Co2+ resin was used to remove h6GB1 and uncleaved proteins; Q Sepharose beads (GE Healthcare) were subsequently used to remove excess DNA. The flow through was mixed with TFA to a final concentration of 0.2% TFA and purified by C3 reverse phase HPLC using the procedure and gradient described above for the KLF4 DBD constructs. Purified fractions were lyophilized using Virtis BenchTop Pro (SP Scientific) and stored at −80 °C. Full length GST-OCT4 fusion proteins were first purified using standard non-denaturing GST purification methods. Briefly, cells were lysed in 1X PBS with 0.1% Triton X-100 and 5 mM DTT. The supernatant was bound to GST Sepharose beads (GE Healthcare), the beads were washed extensively, and the protein was eluted with 50 mM Tris, 10 mM GSH, pH 8. The eluate was dialyzed and cleaved overnight in 1× PBS and TEV protease (1:20 TEV:protein w/w ratio). The solution was passed through GST Sepharose beads to remove GST tag and any uncleaved fusion proteins. 
The flow through and precipitates from dialysis, which contained cleaved OCT4, were dissolved in 6 M guanidine hydrochloride (GdnHCl) and purified by C3 reverse phase HPLC (as described above). Purified OCT4 fractions were lyophilized and stored at −80 °C. Refolded OCT4 and SOX2 are functionally active (assayed by EMSA).
Fluorescent labeling of OCT4 and SOX2
OCT4 and SOX2 were labeled with Alexa Fluor 647 (AF647) maleimide (Thermo Fisher Scientific) using standard methods described previously45. Briefly, the proteins were dissolved in 6 M GdnHCl, 20 mM Tris pH 8, mixed with a 3–4-fold molar excess of AF647 maleimide dye, and incubated for 1 h at RT. Samples were then mixed with a 3-fold excess of 0.1% TFA/dH2O and purified by reverse phase HPLC. Purified fluorescently labeled samples were lyophilized and stored at −80 °C. SOX2-AF647 and OCT4-AF647 protein samples had 45% and 135% fluorescent labeling efficiency, respectively. For colocalization experiments (Fig. 6d, f), SOX2-AF647 was dissolved in 6 M GdnHCl and diluted ~200× in 10 mM sodium phosphate buffer, pH 8. OCT4-AF647 protein was dissolved in 6 M GdnHCl and then buffer exchanged with NAP-5 columns (GE Healthcare) to a final concentration of ~500 nM in 10 mM sodium phosphate buffer, pH 8. Samples were snap frozen for storage at −80 °C before use.
Fluorescent labeling of NANK
NANK DNA oligos with 5′ amino modified C6 (IDT) were purified by ethanol precipitation and labeled with a 10-fold molar excess of dye (Alexa Fluor 488 or 594 NHS ester; Invitrogen). The labeling reactions were performed at 30 °C with 30–60 min incubation. The Alexa Fluor 488- or 594-labeled NANK was then ethanol precipitated; the collected DNA pellets were dissolved in 0.1 M triethylammonium acetate at pH 7. Excess unconjugated dye was removed by passing the sample twice over NAP-5 columns (GE Healthcare). The fluorescent labeling efficiency of NANK was 30% (Alexa Fluor 488) and 50% (Alexa Fluor 594).
Protein and DNA concentration determination
DBD protein concentration was calculated from the UV absorbance at 280 nm using an extinction coefficient of 22,190 M−1 cm−1 (based on Tyr and Trp absorbance72). All DNA oligonucleotides (Supplementary Table 2) for crystallization, LLPS and EMSA experiments were obtained from Integrated DNA Technologies, Inc. (Coralville, IA). Concentrations of unlabeled duplex DNA (unmethylated and methylated) were calculated using the extinction coefficients of the single strands (IDT) and the formula73 that accounts for the hypochromicity (h): {εds,260nm = (εss,260nm + εreverse complement,260nm) × (1 − h)} and {h = (0.059 × fGC) + (0.287 × fAT)}, where fGC and fAT are the fractions of GC and AT base pairs, respectively. Fluorescent DNA concentration was measured using the extinction coefficient of the Alexa Fluor 647 dye. Fluorescent labeling efficiencies were calculated using the corrected extinction coefficients based on the manufacturer’s protocol (Invitrogen).
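The hypochromicity correction can be written out as a short calculation. The following is a minimal sketch, not the authors' script: the function names and the single-strand extinction coefficients are hypothetical placeholders, the sequence is assumed to contain only A, C, G, and T, and methylated cytosine is treated as ordinary C for the purpose of computing fGC (an assumption that the text does not address).

```python
def hypochromicity(seq):
    """h = 0.059 * fGC + 0.287 * fAT for one strand of the duplex."""
    seq = seq.upper()
    f_gc = (seq.count("G") + seq.count("C")) / len(seq)
    f_at = 1.0 - f_gc            # assumes only A, C, G, T are present
    return 0.059 * f_gc + 0.287 * f_at


def duplex_extinction(eps_ss, eps_ss_revcomp, seq):
    """eps_ds = (eps_ss + eps_reverse_complement) * (1 - h), at 260 nm."""
    return (eps_ss + eps_ss_revcomp) * (1.0 - hypochromicity(seq))


def duplex_concentration_molar(a260, eps_ds, path_cm=1.0):
    """Beer-Lambert law: c = A / (eps * l)."""
    return a260 / (eps_ds * path_cm)


# Example with the 12-mer used for crystallization; the two single-strand
# coefficients are hypothetical placeholders, not vendor values.
seq = "AGGGGGTGTGCC"
h = hypochromicity(seq)
eps_ds = duplex_extinction(120_000.0, 100_000.0, seq)
conc = duplex_concentration_molar(a260=0.5, eps_ds=eps_ds)
print(f"h = {h:.3f}, eps_ds = {eps_ds:.0f} M^-1 cm^-1, c = {conc * 1e6:.2f} µM")
```

For this GC-rich 12-mer (fGC = 0.75), the duplex coefficient is roughly 12% lower than the sum of the single-strand values.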
Preparation of 404 bp NP (NANOG promoter) and 7.4 kbp NPE (NANOG promoter enhancer)
The human NANOG promoter was amplified from pNanog-Luc (Addgene #25900) using the forward primer hNan-F2 (or -F1) and the reverse primer hNan-R1 (or -R2); the primers in parentheses are fluorescently labeled versions of the listed primers; see Supplementary Table 2. The 404 bp PCR fragments were purified using the QIAquick Gel Extraction Kit (Qiagen). The pGL-NanogP-5E minus plasmid53 containing the mouse 1535 bp NANOG promoter and 1337 bp enhancer (−5 kbp from NANOG promoter) was digested with PvuI (NEB) to linearize the plasmid. The DNA fragment containing NANOG promoter and enhancer was purified using the QIAquick Gel Extraction Kit (Qiagen).
Electrophoretic mobility shift assay (EMSA)
The binding reactions for the EMSA consisted of 1× EMSA buffer (0.01 mg/ml BSA, 0.1 mM DTT and 0.05 mM TCEP, 5% glycerol, 50 mM NaCl, 20 mM Tris pH 8) and unlabeled (50 nM–15 µM) protein (see figure legends for the exact protein and DNA concentrations, and buffer conditions). Protein concentrations were prepared by 2-fold serial dilution. Samples were loaded onto either 10%, 12% or 4–15% pre-cast Mini-PROTEAN Tris-Glycine gels (TG; Bio-Rad) and electrophoresed for 25–45 min at 120 V and 4 °C in 1× TG buffer (Bio-Rad). Gels from EMSA experiments using unlabeled DNAs were stained with EtBr or SYBR Green for 20 min prior to imaging. The gels were then imaged using a ChemiDoc with the appropriate filters and analyzed with the Image Lab software (Bio-Rad). EMSAs were performed with 2–3 independent replicates.
Crystallization and X-ray data collection
The KLF4 DBD:NKA complex was crystallized by the hanging drop vapor diffusion method at 20 °C. Purified and refolded human KLF4 (418–513) in 0.5 mM DTT, 20 mM Tris-HCl, pH 8.0, and 3.3 molar equivalents of ZnSO4 was mixed with a 1.2-fold molar excess of dodecameric DNA (12-mer: 5′-AGG GGG TGT GCC-3′). Crystals of KLF4 DBD:NKA were obtained by mixing equal volumes of KLF4 DBD:NKA complex (40 mg/mL total macromolecule) with reservoir solution containing 0.2 M sodium iodide (pH 7.0) and 20% w/v polyethylene glycol 3,350. Single crystal X-ray diffraction data were collected at 100 K on Beam Line 5.0.2 of the Advanced Light Source (UC Berkeley, USA) at wavelength (λ) = 1.00 Å, using an ADSC Q210 CCD detector. The collected data were integrated and scaled using iMosflm and SCALA, respectively74,75.
Structure solution and refinement
The crystal structure of KLF4 DBD:NKA was determined by the molecular replacement method using Phaser76. A prior crystal structure of the KLF4 DNA binding domain (PDB ID: 2WBS) was used as the search model. A unique solution was obtained for one molecule in the asymmetric unit. The dodecameric DNA was traced and fitted manually into electron density. The final model was obtained by iterative cycles of manual rebuilding using Coot77 and refinement using phenix.refine78. The PyMOL visualization program (https://pymol.org) was used for all the structural analyses and preparation of figures. The statistics for data collection and refinement are summarized in Supplementary Table 1. Residue numbering follows the human gene product; previous structures with identical ZnF sequences have been numbered according to the mouse gene product.
In vitro LLPS microscopy imaging
Monitoring for the presence/absence of LLPS droplets was performed at room temperature using an EVOS fluorescence imaging system (Thermo Fisher Scientific) with bright field and/or necessary filters (CFP (mTurquoise2), GFP (YOYO-1, AF488), Texas Red (AF594), Cy5 (AF647)). Within each set of experiments, the same light power and exposure time were used. Conditions for each set of experiments are detailed in the figure legends. To construct LLPS diagrams, various concentrations of KLF4 DBD and DNAs (cognate and non-cognate DNAs, see Fig. 2 for sequences) were prepared with either of the following buffers: TS buffer (70 mM NaCl, 12.5 mM Tris, pH 7.4; Figs. 2i, 5) or TS2 Buffer (140 mM NaCl, 25 mM Tris, pH 7.4; Supplementary Fig. 3). 100 nM YOYO-1 was added to samples in which the DNA was to be imaged by fluorescence microscopy. LLPS diagrams were based on images obtained after 30 min of incubation. To assess colocalization of KLF4 DBD:NANK droplets and full length OCT4 or SOX2, KLF4 DBD (9 μM) was mixed with NANK DNA (1.5 μM) and either OCT4-AF647 (95 nM) or SOX2-AF647 (140 nM) in TS buffer. Samples were incubated for 30 min to 1 h prior to imaging. Experiments were performed in 2–3 independent replicates. To assess colocalization of KLF4 DBD with recombinant polynucleosomes purchased from Active Motif, as in Fig. 6e, the commercial polynucleosomes (H3.1; 20 µg protein + 24 µg 5 kbp plasmid DNA; 0.55 µg/μl) in 10 mM Tris-HCl, pH 8.0, 1 mM EDTA, 2 mM DTT, 20% glycerol were diluted to a final concentration of 20 ng/μl in TS buffer and mixed with KLF4 DBD (10 µM). To assess colocalization of KLF4 DBD, polynucleosomes, and OCT4 or SOX2, KLF4 DBD (1 μM) was mixed with commercial polynucleosomes (11 ng/μl) and OCT4-AF647 (50 nM) or SOX2-AF647 (70 nM) in TS buffer. Samples were incubated for 30 min to 1 h prior to imaging. Experiments were performed in 2–3 independent replicates.
Fluorescence live cell confocal imaging
Fluorescence imaging of live cells (HEK 293T cells and BJ fibroblasts plated on poly-D-lysine coated 35 mm Ibidi μ-dishes and transduced with different plasmid constructs) was performed 2–3 days after lentiviral transduction using an EVOS fluorescence imaging system (Thermo Fisher Scientific) or LSM780 and LSM880 laser-scanning confocal microscope systems (Zeiss, Oberkochen, Germany) at 37 °C and 5% CO2 with a ×60 oil objective. Images were analyzed using Fiji (ImageJ 1.52c), Zen 2.3 (Zeiss, Oberkochen, Germany) and Imaris v9.2 (Zurich, Switzerland) microscopy image analysis software. Images were taken at 3–5 different field locations for each biological replicate.
Fluorescence recovery after photo-bleaching (FRAP) imaging in cells and in vitro
FRAP imaging of KLF4-mTurq droplets and puncta/clusters in HEK 293T or BJ fibroblast cells (2–3 days after lentiviral transduction) was performed using Zeiss LSM780 and LSM880 laser-scanning confocal microscope systems at 37 °C and 5% CO2. Nuclear regions of interest (ROIs; ~0.5–2 μm diameter) were selected, and reference ROIs were drawn in adjacent regions (within the cell). Following 2–3 baseline images, ROIs were bleached for 50–200 iterations at 100% laser power (458 nm and 488 nm) and imaged for up to 2–4 min post-bleaching to follow fluorescence recovery. FRAP recovery curves were corrected for background photobleaching (reference ROI in a separate droplet) and normalized against pre-bleach intensity values. FRAP data were fitted with an exponential function in the software Origin (Fig. 1c). For FRAP imaging in vitro, LLPS droplets were prepared by mixing trace-labeled KLF4 DBD (9 μM unlabeled DBD, 50 nM DBD-AF594) with trace-labeled NANK DNA (1.5 μM unlabeled NANK, 180 nM NANK-AF488). After ~1 h sample incubation, FRAP imaging was performed on droplets that had fused and settled close to the imaging surface. Using a Zeiss LSM780 (with ×60 objective), different regions (~1 μm diameter ROI) were bleached with 100% power (488 nm) and 90% power (594 nm) for 100 iterations. Pre- and post-bleaching images (simultaneous 488 and 594 nm channels) were collected for ~15 min at 5 s intervals. After background subtraction (reference ROI in a separate droplet) and normalization, the FRAP recovery curves (means and standard deviations) were plotted in the software Origin (Fig. 2g).
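The correction and normalization of a FRAP trace can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run in Origin: it assumes that acquisition photobleaching is corrected by dividing the bleached-ROI trace by the reference-ROI trace (one common convention; the text does not specify the exact form), and the function name and intensity values are hypothetical.

```python
def normalize_frap(bleached, reference, n_prebleach):
    """Correct a bleached-ROI trace using a reference ROI, then scale so
    that the mean of the pre-bleach frames equals 1.0."""
    if len(bleached) != len(reference):
        raise ValueError("traces must have the same number of frames")
    # Dividing by the reference ROI removes intensity loss that is due to
    # imaging itself rather than to the bleach pulse (assumed convention).
    ratio = [b / r for b, r in zip(bleached, reference)]
    pre_mean = sum(ratio[:n_prebleach]) / n_prebleach
    return [value / pre_mean for value in ratio]


# Hypothetical intensities: two baseline frames, then recovery after bleaching.
bleached_roi = [100.0, 100.0, 20.0, 40.0, 60.0]
reference_roi = [50.0, 50.0, 48.0, 48.0, 48.0]
curve = normalize_frap(bleached_roi, reference_roi, n_prebleach=2)
print([round(value, 3) for value in curve])
```

The normalized curve can then be passed to any exponential fitting routine to estimate the recovery time and mobile fraction.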
Single-molecule Förster resonance energy transfer (smFRET)
KLF4 DBD and NANK DNA binding interactions were monitored by single-molecule spectroscopy using a custom-built Alba confocal laser microscopy system (ISS, Champaign, Illinois). smFRET measurements were conducted in TS buffer (70 mM NaCl, 12.5 mM Tris, pH 7.4) at room temperature (21.5 ± 1 °C) by mixing 100 pM NANK 5′-labeled with Alexa Fluor 488 (FRET donor; Thermo Fisher Scientific) and 500 pM NANK 5′-labeled with Alexa Fluor 594 (FRET acceptor; Thermo Fisher Scientific), with or without 1 μM KLF4 DBD. Measurements were performed with 2 independent replicates. Freely diffusing FRET samples were excited with a 488-nm laser (ISS; ~115 μW). Fluorescence emission was split into donor and acceptor channels by a 605-nm long pass dichroic beam splitter, and donor and acceptor signals were further filtered using 535/50-nm and 641/75-nm bandpass emission filters, respectively. Emission was detected using SPCM-AQRH-16 avalanche photodiode detectors (Excelitas Technologies Corp., Waltham, MA). Data acquisition and FRET efficiency analysis were performed using VistaVision (64) 4.2.220.0 (ISS), correcting for acceptor emission due to direct excitation (1%) and fluorescence bleed-through of donor emission into the acceptor channel (5%), and applying a binning time of 500 µs. There were 40,335 and 33,160 events collected for DNA samples without DBD and with DBD, respectively (Fig. 4). smFRET histograms were fitted to Gaussian functions using OriginPro 2020 (OriginLab, Northampton, MA, USA). FRET efficiencies (EFRET) were calculated (using a value of unity for γ) from the corrected donor (ID) and acceptor (IA) fluorescence intensities as given by:
$$E_{\mathrm{FRET}} = \frac{I_A}{I_A + \gamma I_D}$$
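As a minimal sketch of this calculation, assuming the standard ratiometric form EFRET = IA/(IA + γ·ID), the per-event efficiency can be computed as below. The function name is hypothetical, the inputs are assumed to be intensities that have already been corrected for direct excitation and bleed-through, and the example counts are invented for illustration.

```python
def fret_efficiency(i_donor, i_acceptor, gamma=1.0):
    """Ratiometric FRET efficiency from corrected donor and acceptor
    intensities; gamma = 1.0 matches the value of unity used in the text."""
    total = i_acceptor + gamma * i_donor
    if total <= 0:
        raise ValueError("total corrected intensity must be positive")
    return i_acceptor / total


# Hypothetical corrected photon counts (donor, acceptor) for three events.
events = [(30.0, 10.0), (20.0, 20.0), (5.0, 45.0)]
efficiencies = [fret_efficiency(d, a) for d, a in events]
print(efficiencies)
```

Per-event values computed this way are what would be binned into the histograms that are subsequently fitted with Gaussian functions.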
LLPS quantification and statistical analysis
To construct phase diagrams, a matrix of different nucleic acid and protein concentrations was prepared, and the samples were incubated for 30 min. Images (fixed size of 153 × 114.7 μm) were collected at the same focal plane using the EVOS microscopy system. The mean fluorescence intensity and standard deviation of each *.tif image were determined with the ImageJ software. Data from 2–3 independent replicates were averaged; the coefficient of variation (CV) was calculated as the standard deviation divided by the mean. A condition was scored as positive for phase separation when CV > 0.2 and the mean fluorescence intensity was >4 arb. units (Figs. 2 and 5).
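The scoring rule for each condition in the phase diagram can be expressed as a short function. This is a sketch under assumptions, not the authors' pipeline: it assumes that the per-image mean and standard deviation are averaged across replicates before the CV is computed, and the function name and example values are hypothetical.

```python
from statistics import fmean


def is_phase_separated(replicates, cv_cutoff=0.2, mean_cutoff=4.0):
    """Score one condition from (mean, standard deviation) pairs measured
    on replicate images. Thresholds follow the text: CV > 0.2 and mean
    intensity > 4 arb. units."""
    avg_mean = fmean(m for m, _ in replicates)
    avg_sd = fmean(s for _, s in replicates)
    if avg_mean <= 0:
        return False                     # no signal, cannot be positive
    cv = avg_sd / avg_mean               # coefficient of variation
    return cv > cv_cutoff and avg_mean > mean_cutoff


# Hypothetical image statistics in arbitrary units.
droplets = [(10.0, 3.0), (12.0, 3.6)]    # bright and heterogeneous
uniform = [(10.0, 1.0), (12.0, 1.2)]     # bright but homogeneous
dim = [(3.0, 1.5)]                       # heterogeneous but too dim
print(is_phase_separated(droplets), is_phase_separated(uniform), is_phase_separated(dim))
```

Requiring both criteria excludes dim images whose CV is inflated by noise as well as bright images with uniform fluorescence.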
Quantification of fluorescence intensities and puncta in cells
Statistical tests (Student’s paired t-test) were performed using Origin and are noted in the figure legends. Puncta/droplets were identified using the Spots algorithm in Imaris software v9.2. Only spots that were localized in the nucleus, >500 nm in diameter, and >1500 arb. units in center intensity were chosen. The mean fluorescence intensities were determined with the Imaris software for the HEK 293T cells (Figs. 2 and 5) and with the ImageJ software for BJ fibroblasts (Supplementary Fig. 8).
Nuclear concentration determination of KLF4-mTurq
Nuclear concentrations of transiently expressed KLF4-mTurq were determined using a calibration plot of the fluorescence intensity/exposure time versus concentration of purified mTurquoise2 protein (Supplementary Fig. 2b). Using an EVOS fluorescence microscope, ×60 objective and CFP filter (Thermo Fisher Scientific), HEK 293T cells expressing KLF4-mTurq plated on 35 mm Ibidi μ-dish were imaged using 30% power, 15 ms exposure time. The nuclei boundaries were manually drawn in ImageJ and the mean fluorescence intensities quantified. The calibration plot was linearly fitted using Origin.
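The conversion from nuclear fluorescence to concentration amounts to fitting a straight line to the calibration standards and inverting it. The sketch below illustrates that step only; the calibration points, the nuclear intensity, and the function names are hypothetical and do not reproduce the values in Supplementary Fig. 2b.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_intensity(intensity, exposure_ms, slope, intercept):
    """Invert the calibration line for a measured mean nuclear intensity."""
    signal_per_ms = intensity / exposure_ms
    return (signal_per_ms - intercept) / slope


# Hypothetical standards: purified mTurquoise2 concentration (µM) versus
# fluorescence intensity divided by exposure time (arb. units per ms).
standards_um = [0.0, 1.0, 2.0, 4.0]
signal_per_ms = [2.0, 12.0, 22.0, 42.0]
slope, intercept = fit_line(standards_um, signal_per_ms)

# A nucleus with mean intensity 450 arb. units at 15 ms exposure.
nuclear_um = concentration_from_intensity(450.0, 15.0, slope, intercept)
print(f"slope = {slope:.1f}, intercept = {intercept:.1f}, c = {nuclear_um:.2f} µM")
```

Dividing intensity by exposure time, as in the calibration plot, lets standards and cells imaged at different exposures share one line, provided the detector response stays linear.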
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We thank the Baylor College of Medicine Integrated Microscopy Core for the use of the confocal microscopes. We thank Phoebe S. Tsoi for help with DBD protein expression and for reading the manuscript. This work was supported by NIH grants R01 GM122763 to J.C.F. and R21 NS107792 to A.C.M.F. Additional funding was provided by R01 NS105874 and R21 NS109678 to A.C.M.F., by a Core Facility Support Award from the Cancer Prevention and Research Institute of Texas (grant RP160805) to Dr. Martin Matzuk, and by R01 DK121970 and R61 HD099995 to Dr. Feng Li. The ALS-ENABLE beamlines are supported in part by the National Institutes of Health, National Institute of General Medical Sciences, grant P30 GM124169-01 and the Howard Hughes Medical Institute. The Advanced Light Source is a Department of Energy Office of Science User Facility under Contract No. DE-AC02-05CH11231. The Human Stem Cell Core at Baylor College of Medicine is supported in part by the College and NIH grants (P30 CA125123 Osborne and S10 OD028591 Kim).
Author contributions
A.C.M.F., C.K., K.R.M., and J.C.F. conceived the project. J.J.K., A.C.M.F., C.K., and J.C.F. supervised the experiments. K.J.C. and J.C.F. performed fluorescence cell-based experiments. R.S., M.D.Q., J.C.F., and A.C.M.F. performed in vitro condensation assays; J.C.F. performed the image processing and statistical analysis. H.P., A.L., and J.J.K. performed cell reprogramming experiments. K.J.C. cloned the bacterial and mammalian constructs. R.S., S.S., and J.C.F. purified the recombinant proteins. R.S. and C.K. grew crystals and determined the structure; B.S. acquired X-ray data; K.R.M. and C.K. curated the structure. K.R.M. and C.K. developed 3D models for DBD:DNA condensation. M.D.Q. and A.C.M.F. designed, implemented and analyzed single molecule fluorescence experiments. K.R.M. and J.C.F. wrote the manuscript. All authors edited the manuscript.
Data availability
Source data for plots, raw data for counts and intensity measurements, and uncropped gel images generated in this study are provided in a Source data file with this paper. The structure factors and coordinates for the KLF4 DBD:NKA structure have been deposited in the Protein Data Bank under the accession number 6VTX.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Rajesh Sharma, Kyoung-Jae Choi.
Contributor Information
Kevin R. MacKenzie, Email: km5@bcm.edu
Allan Chris M. Ferreon, Email: allan.ferreon@bcm.edu
Choel Kim, Email: ckim@bcm.edu
Josephine C. Ferreon, Email: josephine.ferreon@bcm.edu
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-25761-7.
References
- 1. Takahashi K, et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell. 2007;131:861–872. doi: 10.1016/j.cell.2007.11.019.
- 2. Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126:663–676. doi: 10.1016/j.cell.2006.07.024.
- 3. Schmidt R, Plath K. The roles of the reprogramming factors Oct4, Sox2 and Klf4 in resetting the somatic cell epigenome during induced pluripotent stem cell generation. Genome Biol. 2012;13:251. doi: 10.1186/gb-2012-13-10-251.
- 4. Beers J, et al. A cost-effective and efficient reprogramming platform for large-scale production of integration-free human induced pluripotent stem cells in chemically defined culture. Sci. Rep. 2015;5:11319. doi: 10.1038/srep11319.
- 5. Chronis C, et al. Cooperative binding of transcription factors orchestrates reprogramming. Cell. 2017;168:442–459 e420. doi: 10.1016/j.cell.2016.12.016.
- 6. Wei Z, et al. Klf4 interacts directly with Oct4 and Sox2 to promote reprogramming. Stem Cells. 2009;27:2969–2978. doi: 10.1634/stemcells.2008-0333.
- 7. Zhang P, Andrianakos R, Yang Y, Liu C, Lu W. Kruppel-like factor 4 (Klf4) prevents embryonic stem (ES) cell differentiation by regulating Nanog gene expression. J. Biol. Chem. 2010;285:9180–9189. doi: 10.1074/jbc.M109.077958.
- 8. Boyer LA, et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell. 2005;122:947–956. doi: 10.1016/j.cell.2005.08.020.
- 9. Silva J, et al. Nanog is the gateway to the pluripotent ground state. Cell. 2009;138:722–737. doi: 10.1016/j.cell.2009.07.039.
- 10. Chambers I, et al. Nanog safeguards pluripotency and mediates germline development. Nature. 2007;450:1230–1234. doi: 10.1038/nature06403.
- 11. Yu J, et al. Induced pluripotent stem cell lines derived from human somatic cells. Science. 2007;318:1917–1920. doi: 10.1126/science.1151526.
- 12. Apostolou E, et al. Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. Cell Stem Cell. 2013;12:699–712. doi: 10.1016/j.stem.2013.04.013.
- 13. Wei Z, et al. Klf4 organizes long-range chromosomal interactions with the oct4 locus in reprogramming and pluripotency. Cell Stem Cell. 2013;13:36–47. doi: 10.1016/j.stem.2013.05.010.
- 14. Whyte WA, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319. doi: 10.1016/j.cell.2013.03.035.
- 15. Di Giammartino DC, et al. KLF4 is involved in the organization and regulation of pluripotency-associated three-dimensional enhancer networks. Nat. Cell Biol. 2019;21:1179–1190. doi: 10.1038/s41556-019-0390-6.
- 16. Soufi A, et al. Pioneer transcription factors target partial DNA motifs on nucleosomes to initiate reprogramming. Cell. 2015;161:555–568. doi: 10.1016/j.cell.2015.03.017.
- 17. Liu Y, et al. Structural basis for Klf4 recognition of methylated DNA. Nucleic Acids Res. 2014;42:4859–4867. doi: 10.1093/nar/gku134.
- 18. Schuetz A, et al. The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation. Cell Mol. Life Sci. 2011;68:3121–3131. doi: 10.1007/s00018-010-0618-x.
- 19. Chen X, et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell. 2008;133:1106–1117. doi: 10.1016/j.cell.2008.04.043.
- 20. Shields JM, Yang VW. Identification of the DNA sequence that interacts with the gut-enriched Kruppel-like factor. Nucleic Acids Res. 1998;26:796–802. doi: 10.1093/nar/26.3.796.
- 21. Larson AG, et al. Liquid droplet formation by HP1alpha suggests a role for phase separation in heterochromatin. Nature. 2017;547:236–240. doi: 10.1038/nature22822.
- 22. Strom AR, et al. Phase separation drives heterochromatin domain formation. Nature. 2017;547:241–245. doi: 10.1038/nature22989.
- 23. Boija A, et al. Transcription factors activate genes through the phase-separation capacity of their activation domains. Cell. 2018;175:1842–1855 e1816. doi: 10.1016/j.cell.2018.10.042.
- 24. Sabari BR, et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018. doi: 10.1126/science.aar3958.
- 25. Chong S, et al. Imaging dynamic and selective low-complexity domain interactions that control gene transcription. Science. 2018. doi: 10.1126/science.aar2555.
- 26. Shrinivas K, et al. Enhancer features that drive formation of transcriptional condensates. Mol. Cell. 2019;75:549–561 e547. doi: 10.1016/j.molcel.2019.07.009.
- 27. Wan J, et al. Methylated cis-regulatory elements mediate KLF4-dependent gene transactivation and cell migration. Elife. 2017. doi: 10.7554/eLife.20068.
- 28. Sacco AM, et al. Diversity of dermal fibroblasts as major determinant of variability in cell reprogramming. J. Cell Mol. Med. 2019;23:4256–4268. doi: 10.1111/jcmm.14316.
- 29. Elbaum-Garfinkle S, et al. The disordered P granule protein LAF-1 drives phase separation into droplets with tunable viscosity and dynamics. Proc. Natl Acad. Sci. USA. 2015;112:7189–7194. doi: 10.1073/pnas.1504822112.
- 30. Zhang H, et al. RNA controls PolyQ protein phase transitions. Mol. Cell. 2015;60:220–230. doi: 10.1016/j.molcel.2015.09.017.
- 31. Patel A, et al. A liquid-to-solid phase transition of the ALS protein FUS accelerated by disease mutation. Cell. 2015;162:1066–1077. doi: 10.1016/j.cell.2015.07.047.
- 32. Xie L, et al. A dynamic interplay of enhancer elements regulates Klf4 expression in naive pluripotency. Genes Dev. 2017;31:1795–1808. doi: 10.1101/gad.303321.117.
- 33. Kang L, et al. The universal 3D3 antibody of human PODXL is pluripotent cytotoxic, and identifies a residual population after extended differentiation of pluripotent stem cells. Stem Cells Dev. 2016;25:556–568. doi: 10.1089/scd.2015.0321.
- 34. Maharana S, et al. RNA buffers the phase separation behavior of prion-like RNA binding proteins. Science. 2018;360:918–921. doi: 10.1126/science.aar7366.
- 35. Gunther K, Mertig M, Seidel R. Mechanical and structural properties of YOYO-1 complexed DNA. Nucleic Acids Res. 2010;38:6526–6532. doi: 10.1093/nar/gkq434.
- 36. Chan KK, et al. KLF4 and PBX1 directly regulate NANOG expression in human embryonic stem cells. Stem Cells. 2009;27:2114–2125. doi: 10.1002/stem.143.
- 37. Nettersheim D, et al. NANOG promoter methylation and expression correlation during normal and malignant human germ cell development. Epigenetics. 2011;6:114–122. doi: 10.4161/epi.6.1.13433.
- 38. Fouse SD, et al. Promoter CpG methylation contributes to ES cell gene regulation in parallel with Oct4/Nanog, PcG complex, and histone H3 K4/K27 trimethylation. Cell Stem Cell. 2008;2:160–169. doi: 10.1016/j.stem.2007.12.011.
- 39. Wolfe SA, Nekludova L, Pabo CO. DNA recognition by Cys2His2 zinc finger proteins. Annu Rev. Biophys. Biomol. Struct. 2000;29:183–212. doi: 10.1146/annurev.biophys.29.1.183.
- 40. Fornes O, et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2020;48:D87–D92. doi: 10.1093/nar/gkaa516.
- 41. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat. Rev. Mol. Cell Biol. 2005;6:197–208. doi: 10.1038/nrm1589.
- 42. Nunez N, et al. The multi-zinc finger protein ZNF217 contacts DNA through a two-finger domain. J. Biol. Chem. 2011;286:38190–38201. doi: 10.1074/jbc.M111.301234.
- 43. Omichinski JG, Pedone PV, Felsenfeld G, Gronenborn AM, Clore GM. The solution structure of a specific GAGA factor-DNA complex reveals a modular binding mode. Nat. Struct. Biol. 1997;4:122–132. doi: 10.1038/nsb0297-122.
- 44. Ferreon AC, Ferreon JC, Wright PE, Deniz AA. Modulation of allostery by protein intrinsic disorder. Nature. 2013;498:390–394. doi: 10.1038/nature12294.
- 45. Tsoi PS, et al. The N-terminal domain of ALS-Linked TDP-43 assembles without misfolding. Angew. Chem. Int Ed. Engl. 2017;56:12590–12593. doi: 10.1002/anie.201706769.
- 46. Moosa MM, Tsoi PS, Choi KJ, Ferreon ACM, Ferreon JC. Direct single-molecule observation of sequential DNA bending transitions by the Sox2 HMG Box. Int. J. Mol. Sci. 2018. doi: 10.3390/ijms19123865.
- 47. Li P, et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483:336–340. doi: 10.1038/nature10879.
- 48. Hashimoto H, et al. Distinctive Klf4 mutants determine preference for DNA methylation status. Nucleic Acids Res. 2016;44:10177–10185. doi: 10.1093/nar/gkw774.
- 49. Deluz C, et al. A role for mitotic bookmarking of SOX2 in pluripotency and differentiation. Genes Dev. 2016;30:2538–2550. doi: 10.1101/gad.289256.116.
- 50. Gibson BA, et al. Organization of chromatin by intrinsic and regulated phase separation. Cell. 2019;179:470–484 e421. doi: 10.1016/j.cell.2019.08.037.
- 51. Shin Y, Brangwynne CP. Liquid phase condensation in cell physiology and disease. Science. 2017. doi: 10.1126/science.aaf4382.
- 52. Banani SF, Lee HO, Hyman AA, Rosen MK. Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell Biol. 2017;18:285–298. doi: 10.1038/nrm.2017.7.
- 53. Blinka S, Reimer MH, Jr., Pulakanti K, Rao S. Super-enhancers at the nanog locus differentially regulate neighboring pluripotency-associated genes. Cell Rep. 2016;17:19–28. doi: 10.1016/j.celrep.2016.09.002.
- 54. Guo YE, et al. Pol II phosphorylation regulates a switch between transcriptional and splicing condensates. Nature. 2019;572:543–548. doi: 10.1038/s41586-019-1464-0.
- 55. Boehning M, et al. RNA polymerase II clustering through carboxy-terminal domain phase separation. Nat. Struct. Mol. Biol. 2018;25:833–840. doi: 10.1038/s41594-018-0112-y.
- 56. Huang Y, et al. JMJD3 acts in tandem with KLF4 to facilitate reprogramming to pluripotency. Nat. Commun. 2020;11:5061. doi: 10.1038/s41467-020-18900-z.
- 57. Sardina JL, et al. Transcription factors drive Tet2-mediated enhancer demethylation to reprogram cell fate. Cell Stem Cell. 2018;23:905–906. doi: 10.1016/j.stem.2018.11.001.
- 58. Li G, Levitus M, Bustamante C, Widom J. Rapid spontaneous accessibility of nucleosomal DNA. Nat. Struct. Mol. Biol. 2005;12:46–53. doi: 10.1038/nsmb869.
- 59. Halford SE, Marko JF. How do site-specific DNA-binding proteins find their targets? Nucleic Acids Res. 2004;32:3040–3052. doi: 10.1093/nar/gkh624.
- 60. Fusaki N, Ban H, Nishiyama A, Saeki K, Hasegawa M. Efficient induction of transgene-free human pluripotent stem cells using a vector based on Sendai virus, an RNA virus that does not integrate into the host genome. Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 2009;85:348–362. doi: 10.2183/pjab.85.348.
- 61. Nishimura K, et al. Manipulation of KLF4 expression generates iPSCs paused at successive stages of reprogramming. Stem Cell Rep. 2014;3:915–929. doi: 10.1016/j.stemcr.2014.08.014.
- 62. Emerson RO, Thomas JH. Adaptive evolution in zinc finger transcription factors. PLoS Genet. 2009;5:e1000325. doi: 10.1371/journal.pgen.1000325.
- 63. Li JS, et al. TZAP: A telomere-associated protein involved in telomere length control. Science. 2017;355:638–641. doi: 10.1126/science.aah6752.
- 64. Pavletich NP, Pabo CO. Crystal structure of a five-finger GLI-DNA complex: new perspectives on zinc fingers. Science. 1993;261:1701–1707. doi: 10.1126/science.8378770.
- 65. Maekawa M, et al. Direct reprogramming of somatic cells is promoted by maternal transcription factor Glis1. Nature. 2011;474:225–229. doi: 10.1038/nature10106.
- 66. Quenneville S, et al. In embryonic stem cells, ZFP57/KAP1 recognize a methylated hexanucleotide to affect chromatin and DNA methylation of imprinting control regions. Mol. Cell. 2011;44:361–372. doi: 10.1016/j.molcel.2011.08.032.
- 67. Patel A, et al. DNA conformation induces adaptable binding by tandem zinc finger proteins. Cell. 2018;173:221–233 e212. doi: 10.1016/j.cell.2018.02.058.
- 68. Pugacheva EM, et al. CTCF mediates chromatin looping via N-terminal domain-dependent cohesin retention. Proc. Natl Acad. Sci. USA. 2020;117:2020–2031. doi: 10.1073/pnas.1911708117.
- 69. de Wit E, et al. CTCF binding polarity determines chromatin looping. Mol. Cell. 2015;60:676–684. doi: 10.1016/j.molcel.2015.09.023.
- 70. Hashimoto H, et al. Structural basis for the versatile and methylation-dependent binding of CTCF to DNA. Mol. Cell. 2017;66:711–720 e713. doi: 10.1016/j.molcel.2017.05.004.
- 71. von Stetten D, Noirclerc-Savoye M, Goedhart J, Gadella TW, Jr., Royant A. Structure of a fluorescent protein from Aequorea victoria bearing the obligate-monomer mutation A206K. Acta Crystallogr. Sect. F. Struct. Biol. Cryst. Commun. 2012;68:878–882. doi: 10.1107/S1744309112028667.
- 72. Edelhoch H. Spectroscopic determination of tryptophan and tyrosine in proteins. Biochemistry. 1967;6:1948–1954. doi: 10.1021/bi00859a010.
- 73. Tataurov AV, You Y, Owczarzy R. Predicting ultraviolet spectrum of single stranded and double stranded deoxyribonucleic acids. Biophys. Chem. 2008;133:66–70. doi: 10.1016/j.bpc.2007.12.004.
- 74. Evans PR. An introduction to data reduction: space-group determination, scaling and intensity statistics. Acta Crystallogr. D. Biol. Crystallogr. 2011;67:282–292. doi: 10.1107/S090744491003982X.
- 75. Battye TG, Kontogiannis L, Johnson O, Powell HR, Leslie AG. iMOSFLM: a new graphical interface for diffraction-image processing with MOSFLM. Acta Crystallogr. D. Biol. Crystallogr. 2011;67:271–281. doi: 10.1107/S0907444910048675.
- 76. McCoy AJ, et al. Phaser crystallographic software. J. Appl. Crystallogr. 2007;40:658–674. doi: 10.1107/S0021889807021206.
- 77. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D. Biol. Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158.
- 78. Afonine PV, et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr. D. Biol. Crystallogr. 2012;68:352–367. doi: 10.1107/S0907444912001308.