Argentaro et al. 10.1073/pnas.0704057104.

Supporting Information

Files in this Data Supplement:

SI Figure 5
SI Figure 6
SI Table 1
SI Figure 7
SI Figure 8
SI Materials and Methods
SI Table 2
SI Figure 9
SI Table 3




SI Figure 5

Fig. 5. Sequence alignments for the ADD domain. (a) Sequence alignment of the ADD domain of ATRX with those of the DNMT3A, DNMT3B, and DNMT3L proteins. Absolutely conserved residues are marked with a black circle, strongly conserved residues with a gray circle, and weakly conserved residues with a white circle. The N-terminal zinc finger is shown as a light green bar, the PHD finger as a mauve bar, and the C-terminal extension as a light blue bar; conserved cysteine residues are marked as orange vertical bars. (b) Structure-based sequence alignment of the PHD zinc finger of ATRX with those PHD fingers for which structures are known (identified by their Protein Data Bank codes; all chain IDs used in the alignment are A). Residues considered as structurally equivalent are shown in uppercase, structurally dissimilar residues are shown in lowercase, and background colors follow the ClustalX scheme. Numbering is based on the ATRX sequence, and the positions of the metal-binding cysteines are indicated with triangles below the alignment.





SI Figure 6

Fig. 6. Analogy between the interaction between the GATA-like and PHD fingers of the ADD domain and the interaction of the N-terminal finger of GATA-1 with friend of GATA (FOG). Superposition of the ADD domain of ATRX onto the complex of the N-terminal zinc finger of GATA-1 with part of its protein partner, FOG. For FOG, only the helix that makes all of the contacts to GATA-1 is shown. The color scheme for the ADD domain protein is the same as in Figs. 1 and 2; the backbone and zinc ligands of the N-terminal zinc finger of GATA-1 are shown in gray (zinc in dark red), and the partial backbone of FOG is shown in dark red. The position occupied by FOG on the N-terminal GATA-1 finger corresponds roughly to the position occupied by the PHD finger on the GATA-like finger of the ADD domain. The superposition was carried out by fitting the N, Cα, and C' atoms of residues 167-180, 185-190, and 192-207 of ATRX to the corresponding residues of the N-terminal finger of GATA-1, based on the alignment in Fig. 2a.
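The backbone fit described in this legend can be illustrated with a short Python sketch. It assumes NumPy is available and uses the standard Kabsch least-squares method, which is a stand-in and not necessarily the procedure the authors' software used. The function names and the coordinate arrays are hypothetical; only the residue ranges come from the legend.

```python
import numpy as np

# Residue ranges quoted in the legend (ATRX numbering, inclusive).
FIT_RANGES = [(167, 180), (185, 190), (192, 207)]


def expand_ranges(ranges):
    """Return every residue number covered by a list of inclusive ranges."""
    return [res for lo, hi in ranges for res in range(lo, hi + 1)]


def kabsch_fit(mobile, target):
    """Least-squares superposition of `mobile` onto `target` (both N x 3).

    Returns the rotation matrix, the fitted copy of `mobile`, and the
    rmsd between the fitted and target coordinates.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    P = mobile - mobile_centroid
    Q = target - target_centroid
    # Covariance matrix of the centered coordinate sets and its SVD.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so that the result is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    fitted = (R @ P.T).T + target_centroid
    rmsd = float(np.sqrt(((fitted - target) ** 2).sum(axis=1).mean()))
    return R, fitted, rmsd


# 36 residues x 3 backbone atoms (N, C-alpha, C') = 108 fitted atoms.
n_fit_atoms = 3 * len(expand_ranges(FIT_RANGES))
```

In practice the two input arrays would hold the N, Cα, and C' coordinates of the aligned residues in the same order, and the resulting rotation and translation would then be applied to the whole ADD domain.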





SI Figure 7

Fig. 7. One-dimensional NMR spectra of mutant ADD domains. Spectra show the folding state of the ADD domain in WT and some representative mutant ATRX proteins, obtained at 27 °C at 600 MHz (WT, G249D, and P190S) or 500 MHz (Y266C and C220R). The first two spectra of mutants (G249D and P190S) are typical of results obtained for mutants that retain the folded state (SI Table 1); there is a comparable dispersion of the signals and although some changes of chemical shifts are visible, the overall pattern is clearly very similar to that for WT, indicating that the folded structure is largely unperturbed by these mutations. In contrast, the top two spectra (Y266C and C220R) are typical of results obtained for mutants classified as unfolded and for which sufficient yield of soluble protein was obtained to allow an NMR spectrum to be obtained (SI Table 1); all dispersed signals in the spectrum of WT are lost, and only signals consistent with unfolded and/or aggregated protein are visible. In most of these cases the samples were also very dilute.





SI Figure 8

Fig. 8. Possible mode of interaction of the GATA-like finger of the ADD domain with DNA. The ADD domain of ATRX is superposed onto the GATA-1 C-terminal zinc finger:DNA complex, looking down the axis of the GATA-1-bound DNA double helix. The color scheme for the ADD domain and GATA-1 protein is the same as in Figs. 1 and 2b, with the DNA backbone shown schematically in orange and the bases shown schematically in magenta and blue. As can be seen, the majority of the ADD domain would not clash substantially with DNA if it bound in an analogous fashion to GATA, although the C-terminal part of the C-terminal helix of the ADD domain might require some distortion and/or bending of the DNA. The superposition was carried out by fitting the N, Cα, and C' atoms of residues 167-180, 185-190, and 192-207 of ATRX to the corresponding residues of the C-terminal finger of GATA-1, based on the alignment in Fig. 2a (i.e., the same fitting as was used to generate Fig. 2b). The specific positioning of basic residues on the putative DNA-binding surface of the GATA-like finger in ATRX is different from that on the surface of GATA-1 itself (see main text), but this is not surprising. Many DNA-binding proteins, particularly "treble-clef" zinc fingers, such as steroid hormone receptors and GATA-1 (1), interact with DNA through the insertion of an exposed α-helix into the DNA major groove, but the detailed nature of the contacts made depends strongly on the individual DNA sequence being recognized and the angle between the DNA-recognition helix and the wall of the DNA major groove (2).

1. Grishin NV (2001) Nucleic Acids Res 29:1703-1714.

2. Rhodes D, Schwabe JWR, Chapman L, Fairall L (1996) Philos Trans R Soc London B 351:501-509.





SI Figure 9

Fig. 9. Calculated NMR ensemble. (a) Stereoview of the ensemble of 32 accepted NMR structures of the ADD domain of ATRX protein, superimposed using the N, Cα, and C' atoms of residues 168-209 and 218-293. (b) rmsd and X-PLOR total energy profiles for the 100 calculated structures of the ATRX-ADD domain. Filled circles represent rmsd values calculated independently by using each ensemble size, adding successive structures in order of increasing X-PLOR total energy term. Open circles represent the X-PLOR total energy terms for successive structures. In calculating the structural statistics in SI Table 3, only structures to the left of the red line were included.
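The rmsd profile in panel b can be sketched as follows. This is an illustrative outline only: it assumes the structures have already been superposed on the fitted atoms, which simplifies away the superposition step that a program such as CLUSTERPOSE performs, and every function name is hypothetical. Each structure is a list of (x, y, z) tuples for the same atoms in the same order.

```python
import math


def mean_structure(structures):
    """Average the coordinates atom by atom over pre-superposed structures."""
    n = len(structures)
    n_atoms = len(structures[0])
    return [
        tuple(sum(s[a][k] for s in structures) / n for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equal-length coordinate lists."""
    squared = sum(
        (p[k] - q[k]) ** 2 for p, q in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd_profile(structures, energies):
    """Mean rmsd to the mean structure for ensembles of growing size.

    Structures are added in order of increasing total energy, and the
    value for each ensemble size is recomputed from scratch.
    """
    order = sorted(range(len(structures)), key=lambda i: energies[i])
    profile = []
    for size in range(2, len(order) + 1):
        subset = [structures[i] for i in order[:size]]
        mean = mean_structure(subset)
        profile.append((size, sum(rmsd(s, mean) for s in subset) / size))
    return profile
```

Plotting the second element of each tuple against the first, alongside the sorted energies, would give a profile of the kind described in the legend; a cutoff such as the red line is then chosen by inspection.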





Table 1. ADD mutations, protein expression, and structural consequence

Mutation | Location* | In vitro†, % | In vivo‡, % | 1D NMR | Ref.
--- | --- | --- | --- | --- | ---
G175E | Buried | - | - | - | 1
176/7 InsQ | Surface | 30 | 32 | Folded | 2
Del (V178-K198) | - | - | 8 | - | 3
N179S | Surface | - | - | - | 4
H189D | Buried | 5 | 17 | Unfolded | 4
P190S | Buried | 5 | 28 | Folded | 5
P190L | Buried | 10 | - | Folded | 6
P190A | Buried | - | 26/29 | - | 2
L192F | Buried | Trace | 13 | - | 2
V194I | Buried | 10 | 24 | Folded | 6
V194A | Buried | - | - | - | 7
L195del | Buried | - | - | - | 4
C200S | Cysteine | Trace | 8 | Unfolded | 2
C200Y | Cysteine | - | - | - | 4
Q219P | Buried | 10 | - | Folded | 1
C220R | Cysteine | Trace | 12 | Unfolded | 2
C220Y | Cysteine | - | - | - | 8
W222S | Buried | None | 24 | - | 2
W222C | Buried | - | - | - | R.G. unpublished data
C223R | Cysteine | - | - | - | 7
L229F | Buried | 10 | - | Folded | 7
N237D | Buried | 5 | 7 | Folded | R.G. unpublished data
A238P | Buried | - | 17 | - | R.G. unpublished data
F239L | Buried | - | 19 | - | 5
C240G | Cysteine | - | - | - | 9
C240F | Cysteine | - | - | - | 2
C243Y | Cysteine | Trace | - | Unfolded | 4
C243R | Cysteine | - | - | - | 7
I244N | Buried | - | 17 | - | 5
L245P | Surface | - | - | - | 7
R246C | Surface | 50 | 38 | Folded | 2
R246L | Surface | 30 | 39 | Folded | 10
N247D | Buried | None | - | - | 5
G249D | Surface | 30-40 | 46/54 | Folded | 2
G249C | Surface | - | - | - | 1
E252L | Surface |  | - | - | 4
L253S | Buried |  | - | - | 4
W263R | Buried | - | 28 | - | 5
C265Y | Cysteine | Trace | - | - | 9
Y266C | Buried | 10 | 24 | Unfolded | 5

*Surface residues are those having a relative total side-chain exposure >50%, as calculated by using the program NACCESS (11).

†Level of expression obtained in Escherichia coli as compared to wild-type ADD construct.

‡Level of expression of endogenous ATRX in EBV-transformed cell lines derived from individuals with ADD mutations. Levels are given relative to the mean level of ATRX expression in normal controls. Where a cell line from a second affected individual has been tested, both values are given.

1. Villard L, Bonino MC, Abidi F, Ragusa A, Belougne J, Lossi AM, Seaver L, Bonnefont JP, Romano C, Fichera M, et al. (1999) J Med Genet 36:183-186.

2. Gibbons RJ, Bachoo S, Picketts DJ, Aftimos S, Asenbauer B, Bergoffen J, Berry SA, Dahl N, Fryer A, Keppler K, et al. (1997) Nat Genet 17:146-148.

3. Picketts DJ, Higgs DR, Bachoo S, Blake DJ, Quarrell OWJ, Gibbons RJ (1996) Hum Mol Genet 5:1899-1907.

4. Badens C, Lacoste C, Philip N, Martini N, Courrier S, Giuliano F, Verloes A, Munnich A, Leheup B, Burglen L, et al. (2006) Clin Genet 70:57-62.

5. Gibbons RJ, Higgs DR (2000) Am J Med Genet 97:204-212.

6. Wada T, Kubota T, Fukushima Y, Saitoh S (2000) Am J Med Genet 94:242-248.

7. Wada T, Fukushima Y, Saitoh S (2006) Am J Med Genet A 140:1519-1523.

8. Stevenson RE, Abidi F, Schwartz CE, Lubs HA, Holmes LB (2000) Am J Med Genet 94:383-385.

9. Steensma DP, Higgs DR, Fisher CA, Gibbons RJ (2004) Blood 103:2019-2026.

10. Fichera M, Romano C, Castiglia L, Failla P, Ruberto C, Amata S, Greco D, Cardoso C, Fontes M, Ragusa A (1998) Mutations in Brief No 176 Online Hum Mutat 12:214.

11. Hubbard SJ, Thornton JM (1993) NACCESS (University College London).

 





Table 2. Oligonucleotides used for site-directed mutagenesis

Mutation | Oligonucleotide
--- | ---
InsQ 176/7 | CACTGCTTGTGGACAACAACAGGTCAATCATTTTCAAAA
H189D | GCTTGTGGACAACAGGTCGATCATTTTCAAAAGATTCC
P190S | GATTCCATTTATAGACACTCTTCATTGCAAGTTCTTATTG
P190L | GATTCCATTTATAGACACTTGTCATTGCAAGTTCTTATTG
L192F | CCATTTATAGACACCCTTCATTCCAAGTTCTTATTTGTAAG
V194I | TAGACACCCTTCATTGCAAATTCTTATTTGTAAGAATTGCTT
C200S | GCAAGTTCTTATTTGTAAGAATTCCTTTAAGTATTACATGAGTG
Q219P | GACTCAGATGGAATGGATGAACCATGTAGGTGGTGTGCG
C220R | GATGGAATGGATGAACAACGTAGGTGGTGTGCGGAAGG
W222S | ATGGATGAACAATGTAGGTCGTGTGCGGAAGGTGGTGGA
L229F | GCGGAAGGTGGAAACTTTATTTGTTGTGACTTTTG
N237D | TGTTGTGACTTTTGCCATGATGCTTTCTGCAAGAAATGC
R246C | CTGCAAGAAATGCATTCTATGTAACCTTGGTCGAAAGGAGTTG
R246L | CTGCAAGAAATGCATTCTACTCAACCTTGGTCGAAAGGAGTTG
N247D | CTGCAAGAAATGCATTCTACGCGACCTTGGTCGAAAGGAGTTGTC
G249D | AATGCATTCTACGCAACCTTGATCGAAAGGAGTTGTCCACAATA
C265Y | GATGAAAACAACCAATGGTATTACTACATTTGTCACCCAGAG
Y266C | GGATGAAAACAACCAATGGTATTGCTGCATTTGTCACCCAGAGCC
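As a quick sanity check on oligonucleotides such as these, two primers that target the same codon can be compared base by base. The Python sketch below uses the P190S and P190L sequences from the table; the helper names are hypothetical, and the reading frame of the three-base window is an assumption, not something the table states.

```python
def mismatches(primer_a, primer_b):
    """Return the 0-based positions at which two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length")
    return [i for i, (a, b) in enumerate(zip(primer_a, primer_b)) if a != b]


def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    return sum(base in "GC" for base in primer.upper()) / len(primer)


P190S = "GATTCCATTTATAGACACTCTTCATTGCAAGTTCTTATTG"
P190L = "GATTCCATTTATAGACACTTGTCATTGCAAGTTCTTATTG"

diff = mismatches(P190S, P190L)
print(diff)                        # [19, 20]
print(P190S[18:21], P190L[18:21])  # TCT TTG
```

If the window at positions 18-20 is in frame, TCT and TTG would read as serine and leucine codons, which is consistent with the two mutation names.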




Table 3. Structural statistics for the ATRX-ADD ensemble

Statistic | Value
--- | ---
Structural restraints |
NOE-derived distance restraints |
Intraresidue | 500
Sequential (|i - j| = 1) | 340
Medium (2 ≤ |i - j| ≤ 4) | 231
Long (|i - j| > 4) | 351
H-bond restraints | 38 (19 H-bonds)
Dihedral angle restraints (χ1) | 31
No distance violations greater than 0.27 Å |
Statistics for 32 accepted structures |
Ramachandran statistics for residues 168-293 |
Most favored, % | 73.6
Additionally allowed, % | 24.4
Generously allowed, % | 1.9
Disallowed, % | 0.1
Mean XPLOR energy terms, kcal·mol⁻¹ (±SD) |
E(total) | 797.39 ± 19.6
E(van der Waals) | 338.2 ± 7.7
E(distance restraints) | 44.1 ± 4.8
rms deviations from XPLOR ideal geometries |
Bond lengths, Å | 0.0063
Bond angles, ° | 0.79
Improper angles, ° | 0.40
Average rmsd to average structure (±SD) for residues 168-209 and 218-293 |
N, Cα, and C' atoms, Å | 0.48 ± 0.12
All heavy atoms, Å | 0.96 ± 0.16
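The sequence-separation classes used for the NOE restraints in this table follow a simple rule on |i - j|. The Python sketch below encodes that rule and totals the reported counts; the function and variable names are hypothetical, and the restraint list itself is not part of this supplement.

```python
from collections import Counter


def noe_class(i, j):
    """Classify a restraint between residues i and j by sequence separation."""
    separation = abs(i - j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    if separation <= 4:
        return "medium"
    return "long"


def tally(residue_pairs):
    """Count restraints per class for an iterable of (i, j) residue pairs."""
    return Counter(noe_class(i, j) for i, j in residue_pairs)


# Counts as reported in the table.
TABLE_COUNTS = {"intraresidue": 500, "sequential": 340, "medium": 231, "long": 351}
total_noe_restraints = sum(TABLE_COUNTS.values())  # 1422
```

The total of 1,422 matches the sum of the four intensity categories given under Structure Calculations (34 + 218 + 588 + 582).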

 





SI Materials and Methods

Production of the ADD Domain of ATRX.

An ATRX-ADD domain construct suitable for structural analysis by NMR was identified by using a number of methods. The N-terminal border of the folded domain was identified by using limited proteolysis of the polypeptide encompassing residues 85-320. Identification of the C terminus involved the expression of different length constructs guided by secondary structure predictions. All constructs were expressed in Escherichia coli and affinity purified, and the degree of aggregation was assessed by using gel filtration on a Superdex 75 (Pharmacia, Peapack, NJ) column. The ADD domain polypeptide chosen for structural study spans residues 159-296 of human ATRX and contains three nonnative residues (Gly-Ala-Met) at its N terminus remaining from the tobacco etch virus (TEV) cleavage.

PCR products were cloned into the pET30a vector (Novagen, Madison, WI), modified to contain a TEV protease cleavage site following an N-terminal His6-tag and an S-tag (1). Proteins were produced in the host strain Rosetta(DE3)pLysS (Novagen). For the NMR analysis, the bacteria were grown in minimal media containing 15NH4Cl as the sole nitrogen source and [13C6]glucose as the sole carbon source and supplemented with 0.1 mM ZnCl2. Cell pellets were resuspended in lysis buffer [50 mM Tris•HCl (pH 7.5)/500 mM NaCl/10% glycerol/1 mM DTT/0.1 mM ZnCl2], and cells were disrupted by sonication. After centrifugation (30,000 × g for 20 min), the supernatant was incubated for 1 h with Ni-NTA resin (Qiagen, Valencia, CA) equilibrated in lysis buffer. After centrifugation at 1,500 rpm, the Ni-NTA resin was washed four times with lysis buffer with 20 mM imidazole added, and the protein was eluted with lysis buffer containing 300 mM imidazole. The imidazole and major impurities were removed by gel filtration on a Superdex 200 26/60 column (GE Healthcare, Chalfont St. Giles, U.K.) equilibrated in 10 mM Tris•HCl (pH 7.5)/500 mM NaCl/10% glycerol/1 mM DTT/0.1 mM ZnCl2. TEV proteolytic digestions were then performed overnight at room temperature by the addition of TEV protease at a ratio of 1:50 to protein sample. The TEV protease used was itself His6-tagged. The cleaved proteins were purified away from the TEV protease and the cleaved His tag by repeating the binding to Ni-NTA resin equilibrated in the column running buffer. The proteins were then further purified by gel filtration on a Superdex 200 26/60 column equilibrated in the same buffer. Samples for NMR were then concentrated and exchanged into 10 mM deuterated Tris•HCl (pH 6.8)/1 mM deuterated DTT/500 mM NaCl/0.1 mM ZnCl2.

Site-Directed Mutagenesis.

Primers for the described mutations are given in SI Table 2. Appropriate reading frame fusions and integrity of flanking sequences for all constructs created by PCR were confirmed by DNA sequence analysis of both strands.

NMR Experiments.

NMR samples were composed of 0.3-0.4 mM (15N-labeled) or 0.6-0.8 mM (15N,13C-labeled) solutions of ATRX 159-296 (with additional vector-derived residues Gly-Ala-Met at the N terminus) containing 20 mM [2H10]Tris•HCl, 1 mM [2H6]DTT, 0.1 mM ZnCl2, and 500 mM NaCl. Data were acquired at 27 °C on Bruker Avance 800, DMX600, and DRX500 spectrometers. Resonances were assigned by using a standard suite of triple-resonance NMR experiments, and 1H, 15N, and 13C chemical shifts were calibrated by using sodium 3,3,3-trimethylsilylpropionate (TSP) as internal 1H reference (2).

For 15N-labeled protein samples, the following data were acquired: 2D data sets: [15N-1H] HSQC, [1H-1H] NOESY experiments (without heteronuclear filtering; τm = 50, 100, and 150 ms), [1H-1H] NOESY experiments filtered to remove 15N-coupled signals either in both F1 and F2 (τm = 50 and 150 ms), or in F2 only (τm = 50 and 150 ms); 3D data sets: HNHB, 15N NOESY-HSQC (τm = 50 and 150 ms). For 15N/13C-labeled protein samples, the following data were acquired: 2D data sets: [15N-1H] HSQC, [13C-1H] HSQC covering full 13C spectral width, constant-time [13C-1H] HSQC covering only aliphatic 13C region, constant-time [13C-1H] HSQC covering only aromatic 13C region; 3D data sets: CBCANH, CBCACONH, HNCA, HNCOCA, HBHANH, HBHACONH, [1H-13C-1H] HCCH-TOCSY, [13C-13C-1H] HCCH-TOCSY, HCCH-COSY, and 13C NOESY-HSQC (τm = 50 and 150 ms), with separate data sets acquired for the 13C aliphatic and aromatic spectral regions.

Structure Calculations.

NOE distance restraints were derived from analysis of all of the data from NOE-based experiments. Cross-peak intensities were measured by using the program SPARKY (3) and grouped into four categories. The strongest dαN(i, i + 1) and interstrand dαα(i, j) connectivities in β-sheet regions were used to set the upper limit for the category "very strong" (0-2.3 Å, 34 constraints), strong dNN(i, i + 1) connectivities in α-helices defined the category "strong" (0-2.8 Å, 218 constraints), dαN(i, i + 3) cross-peaks in helices defined the category "medium" (0-3.5 Å, 588 constraints), and all remaining peaks were classified as weak (0-5 Å, 582 constraints). Lower bounds for all NOE restraints were set to zero (4), and no multiplicity corrections were required because r⁻⁶ summation was used for restraints involving groups of equivalent or nonstereo-assigned spins (5, 6). Amide protons in stable hydrogen bonds were identified in a 15N-HSQC spectrum recorded quickly after transferring a sample into 2H2O solution, and in cases where the corresponding acceptor was unambiguously identifiable, distance restraints (2.4 Å < dN...O < 3.7 Å and 1.4 Å < dH...O < 2.7 Å) were applied during the structure calculations. Stereo-specific assignments for 32 CβH2 groups were made by analyzing HNHB and short mixing time (τm = 50 ms) NOESY spectra.
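The restraint categories and the r⁻⁶ summation mentioned above can be written down compactly. The following Python sketch is illustrative only: the dictionaries restate the bounds and counts given in the text, the function names are hypothetical, and the summation shown is the generic (Σ r⁻⁶)^(-1/6) effective distance, not X-PLOR's own implementation.

```python
# Upper distance bounds (in angstroms) for the four intensity categories.
UPPER_BOUND = {"very strong": 2.3, "strong": 2.8, "medium": 3.5, "weak": 5.0}

# Number of restraints reported for each category.
COUNTS = {"very strong": 34, "strong": 218, "medium": 588, "weak": 582}


def restraint_bounds(category):
    """Return (lower, upper) bounds; all lower bounds are set to zero."""
    return (0.0, UPPER_BOUND[category])


def r6_sum_distance(distances):
    """Effective distance for a group of equivalent or ambiguous spins.

    Combines the individual proton-proton distances as
    (sum of r**-6) ** (-1/6), so the shortest contact dominates.
    """
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)
```

Because the summed term can only grow as more contributing distances are added, the effective distance is never longer than the shortest individual distance, which is why no separate multiplicity correction is needed.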

Structures were calculated from polypeptide chains with randomized φ and ψ torsion angles by using a two-stage simulated annealing protocol within the program X-PLOR, essentially as described elsewhere (7, 8) but employing larger numbers of cycles as follows: First stage calculations comprised Powell energy minimization (500 steps), dynamics at 1,000 K (25,000 steps), increase of the van der Waals force constant, tilting of the NOE potential function asymptote (4,000 steps), switching to a square-well NOE function then cooling to 300 K in 2,000 step cycles, and final Powell minimization (1,000 steps). Second stage calculations used Powell minimization (500 steps), increasing dihedral force constant during 4,000 step cycles of dynamics at 1,000 K (with a strong van der Waals force constant and square-well NOE potential function), cooling to 300 K in 1,000 step cycles, and 2,000 steps of final Powell minimization.
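For readers unfamiliar with the term, a square-well restraint potential is flat inside the allowed distance range and rises quadratically outside it. The Python sketch below shows that generic form; it is an assumption-laden illustration, since X-PLOR's actual function has further parameters, and the force constant `k` and the function name are hypothetical.

```python
def square_well_energy(distance, lower, upper, k=1.0):
    """Generic square-well restraint energy.

    Zero when lower <= distance <= upper, quadratic in the size of the
    violation otherwise. `k` is an arbitrary illustrative force constant.
    """
    if distance > upper:
        return k * (distance - upper) ** 2
    if distance < lower:
        return k * (lower - distance) ** 2
    return 0.0


# Example with the N...O hydrogen-bond bounds quoted in the text (2.4-3.7 angstroms).
inside = square_well_energy(3.0, 2.4, 3.7)    # no penalty within the well
outside = square_well_energy(4.7, 2.4, 3.7)   # 1.0 angstrom beyond the upper bound
```

With NOE restraints, whose lower bounds were set to zero, only the upper-bound branch would ever contribute.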

The absolute chirality of ligand arrangements around zinc was defined according to the Cahn-Ingold-Prelog convention, assigning decreasing priorities along the sequence as proposed by Berg (following the convention in ref. 9). The program CLUSTERPOSE was used to calculate the mean rmsd of ensembles to their mean structure (10), and the electrostatic surface was calculated with the program APBS (11).

Sequence and Structure Comparison.

Structures were visualized by using the program PYMOL (12). The multiple sequence-based sequence alignments were run with the program ClustalW (13), and the multiple structure-based sequence alignments were produced by manually combining multiple pairwise comparisons of the structures obtained by using the program ProSup (14).

Real-Time PCR.

Quantitative real-time PCR was performed with primers and 5' 6-carboxyfluorescein/3' nonfluorescent quencher probes from TaqMan gene expression assays (Applied Biosystems, Foster City, CA): ATRX, assay ID Hs00230877_m1 (spanning exons 21-23), and GAPDH, assay ID Hs99999905_m1. Duplicate real-time PCRs on each template were performed on a Sequence Detection System 7000 thermocycler (Applied Biosystems) by using TaqMan universal PCR mastermix (Applied Biosystems), 400 nM each primer, and 200 nM probe in 25-µl reactions.
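The text does not state how the real-time PCR data were converted to relative expression. One common approach for a target/reference pair such as ATRX and GAPDH is the comparative Ct (2^-ΔΔCt) calculation, sketched below in Python purely as an assumption; the function name and the Ct values in the example are hypothetical, and the formula presumes roughly 100% amplification efficiency for both assays.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, control_target_ct, control_reference_ct):
    """Comparative Ct estimate of target expression relative to a control.

    Each argument is a list of replicate Ct values (duplicates here).
    Returns 2 ** -(delta_ct_sample - delta_ct_control).
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_control = mean(control_target_ct) - mean(control_reference_ct)
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical duplicate Ct values: the sample needs one more cycle than the
# control to reach threshold for the target, at equal reference levels.
ratio = relative_expression([26.0, 26.0], [20.0, 20.0], [25.0, 25.0], [20.0, 20.0])
```

In this made-up example the ratio is 0.5, i.e., half the control level of target transcript after normalizing to the reference gene.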

Western Analysis.

Approximately 10 µg of total protein was run in 3-8% TAE gradient gels (Invitrogen, Paisley, U.K.) and transferred to nylon membrane (Immobilon-P; Millipore Corporation, Billerica, MA). Membranes were incubated with a mouse monoclonal anti-ATRX antibody (39f) at 1:10 dilution and subsequently with HRP-conjugated goat anti-mouse antibody (1:1,000; Dako UK, Ely, Cambridgeshire, U.K.). To correct for variations in loading, the membranes were reprobed with an antibody against mSin3a (1:2,000; Santa Cruz Biotechnology, Santa Cruz, CA) and subsequently with HRP-conjugated sheep anti-rabbit antibody (1:1,000; Dako). Signals were detected by using enhanced chemiluminescence (ECLplus; Amersham Pharmacia Biosciences, Piscataway, NJ). The volume intensity (Vol:INT) of each band was measured by optical density, with the lane-based quantitation calculated by averaging the pixel intensity across the band width and integrating over the band height. Any background signal was corrected by using a global subtraction method.
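The band quantitation described above can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions: the pixel grid, the single global background value, and the way the loading control and normal-control mean are combined are all illustrative choices, and the function names are hypothetical and do not correspond to the imaging software's own routines.

```python
def band_volume(pixel_rows, background=0.0):
    """Integrate a band from a grid of pixel intensities.

    `pixel_rows` holds one list per pixel row down the band height; each
    row is averaged across the band width, a single global background
    value is subtracted, and the row averages are summed.
    """
    return sum(sum(row) / len(row) - background for row in pixel_rows)


def percent_of_control(atrx_volume, loading_volume, control_ratios):
    """ATRX signal, normalized to the loading control, as % of the control mean."""
    ratio = atrx_volume / loading_volume
    control_mean = sum(control_ratios) / len(control_ratios)
    return 100.0 * ratio / control_mean


# Hypothetical 2 x 2 pixel band with a background of 5 intensity units.
volume = band_volume([[10, 20], [30, 40]], background=5)
```

Here the two row averages are 15 and 35, giving a background-corrected volume of 40; dividing such a value by the mSin3a volume from the same lane and by the mean ratio from normal controls would yield percentages of the kind listed in SI Table 1.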

1. Court R, Chapman L, Fairall L, Rhodes D (2005) EMBO Rep 6:39-45.

2. Wishart DS, Bigam CG, Yao J, Abildgaard F, Dyson HJ, Oldfield E, Markley JL, Sykes BD (1995) J Biomol NMR 6:135-140.

3. Goddard TD, Kneller DG SPARKY 3 (University of California, San Francisco).

4. Hommel U, Harvey TS, Driscoll PC, Campbell ID (1992) J Mol Biol 227:271-282.

5. Nilges M (1993) Proteins Struct Funct Genet 17:297-309.

6. Fletcher CM, Jones DNM, Diamond R, Neuhaus D (1996) J Biomol NMR 8:292-310.

7. Dutnall RN, Neuhaus D, Rhodes D (1996) Structure 4:599-611.

8. Muto Y, Pomeranz-Krummel D, Oubridge C, Hernandez H, Robinson C, Neuhaus D, Nagai K (2004) J Mol Biol 341:185-198.

9. Berg J (1988) Proc Natl Acad Sci USA 85:99-102.

10. Diamond R (1995) Acta Crystallogr D 51:127-135.

11. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA (2001) Proc Natl Acad Sci USA 98:10037-10041.

12. DeLano WL (2002) The PyMOL Molecular Graphics System (DeLano Scientific, Palo Alto, CA).

13. Thompson JD, Higgins DG, Gibson TJ (1994) Nucleic Acids Res 22:4673-4680.

14. Feng ZK, Sippl MJ (1996) Fold Des 1:123-132.