Acta Crystallographica Section F: Structural Biology Communications. 2019 May 21;75(Pt 6):439–449. doi: 10.1107/S2053230X19006599

A new crystal structure and small-angle X-ray scattering analysis of the homodimer of human SFPQ

Thushara Welwelwela Hewage a, Sofia Caria a,b, Mihwa Lee a,*
PMCID: PMC6572092  PMID: 31204691

The dimerization domain of splicing factor proline/glutamine-rich (SFPQ), an essential RNA-binding protein, has been crystallized in the C-centered orthorhombic space group C2221 with one monomer in the asymmetric unit, and its structure has been determined and refined to 1.9 Å resolution. The crystal structure was analyzed and compared with the solution scattering data.

Keywords: SFPQ, DBHS protein family, nuclear protein, RNA-recognition motif, PSPC1, NONO, dimerization, crystal structure, solution scattering, Drosophila behavior human splicing

Abstract

Splicing factor proline/glutamine-rich (SFPQ) is an essential RNA-binding protein that is implicated in many aspects of nuclear function. The structures of SFPQ and two paralogs, non-POU domain-containing octamer-binding protein and paraspeckle component 1, from the Drosophila behavior human splicing protein family have previously been characterized. The unusual arrangement of the four domains, two RNA-recognition motifs (RRMs), a conserved region termed the NonA/paraspeckle (NOPS) domain and a C-terminal coiled coil, in the intertwined dimer provides a potentially unique RNA-binding surface. However, the molecular details of how the four RRMs in the dimeric SFPQ interact with RNA remain to be characterized. Here, a new crystal structure of the dimerization domain of human SFPQ in the C-centered orthorhombic space group C2221 with one monomer in the asymmetric unit is presented. Comparison of the new crystal structure with the previously reported structure of SFPQ and analysis of the solution small-angle X-ray scattering data revealed subtle domain movements in the dimerization domain of SFPQ, supporting the concept of multiple conformations of SFPQ in equilibrium in solution. The domain movement of RRM1, in particular, may reflect the complexity of the RNA substrates of SFPQ. Taken together, the crystal and solution structure analyses provide a molecular basis for further investigation into the plasticity of nucleic acid binding by SFPQ in the absence of the structure in complex with its cognate RNA-binding partners.

1. Introduction  

Splicing factor proline/glutamine-rich (SFPQ), also known as polypyrimidine tract-binding protein-associated splicing factor (PSF), is a multifunctional nuclear protein which was first identified as a necessary protein for pre-mRNA splicing owing to its interaction with polypyrimidine-tract binding protein (PTB) (Patton et al., 1993). Since its first identification, SFPQ has been implicated in many aspects of nuclear function along with two paralogues: non-POU domain-containing octamer-binding protein (NONO) and paraspeckle component 1 (PSPC1). In addition to their structural function as core protein components of subnuclear bodies termed paraspeckles (Fox & Lamond, 2010; Sasaki et al., 2009), the three proteins SFPQ, NONO and PSPC1 play important roles in RNA biogenesis and transport, as well as in DNA damage repair (Bond & Fox, 2009; Knott et al., 2016).

Exclusive to the animal kingdom, members of the Drosophila behavior human splicing (DBHS) protein family are found from invertebrates to higher vertebrates. The genomes of higher vertebrates encode three DBHS proteins, SFPQ, NONO and PSPC1, whereas those of invertebrates and lower vertebrates typically encode a single DBHS protein (Knott et al., 2015). DBHS proteins share a common domain structure, termed the DBHS domain, that includes two tandem RNA-recognition motifs (RRMs), a NonA/paraspeckle (NOPS) domain and a C-terminal coiled-coil (CC) domain (Fig. 1 a; Lee et al., 2011). Sharing more than 70% sequence identity within the DBHS domain, the three DBHS proteins form obligate dimers, both homodimers and heterodimers. The structures of the DBHS proteins have previously been characterized: crystal structures have been obtained of the human NONO–PSPC1 heterodimer (Passon et al., 2012), SFPQ homodimer (Lee et al., 2015) and SFPQ–PSPC1 heterodimer (Huang et al., 2018), and of Caenorhabditis elegans NONO-1 (Knott et al., 2015). The structures confirmed the dimeric nature of the DBHS proteins and showed a novel arrangement of four RRMs in the dimer owing to the antiparallel configuration of the CC domain.

Figure 1.

Crystal structure of an SFPQ homodimer in space group C2221. (a) Domain architecture of SFPQ. (b) Stereoview of the crystal structure of SFPQ in space group C2221 in cartoon representation. Residues modeled in the final structure are colored gold (RRM1), dark blue (RRM2), orange (NOPS) and red (coiled coil). The construct boundary used in this study is indicated with a square box and includes residues 276–535 of SFPQ. (c) Stereoview of the SFPQ dimer with the other monomer (gray) generated by the symmetry operation (−x − 1, y, −z − 1/2).

Crystal structures of SFPQ have been reported in our previous study, providing the molecular details of homodimerization (Lee et al., 2015). Owing to the unusual antiparallel configuration of the CC domain, the structure of the full-length DBHS domain of human SFPQ displays remarkably extended α-helical regions of over 110 amino acids in a single helical span, resulting in an extended structure of over 265 Å in length. We demonstrated that polymerization mediated by the extended CC domain is crucial for the function of SFPQ in gene regulation as well as in paraspeckle formation. The unique arrangement of four RRMs in the dimer resulting from the unusual antiparallel configuration of the CC domain, in conjunction with the ability of SFPQ to polymerize, may suggest a unique mode of RNA interaction in SFPQ (Lee et al., 2015). However, the structural basis of nucleic acid binding by SFPQ and by DBHS proteins in general remains to be characterized.

Here, we present a new crystal structure of the dimerization domain of human SFPQ in a C-centered orthorhombic space group with one monomer in the asymmetric unit. We compared the new crystal structure with the previously reported structure of SFPQ and characterized the protein in solution via small-angle X-ray scattering (SAXS). In the absence of structures of SFPQ in complex with RNA, these analyses show subtle domain movements in the dimerization domain of SFPQ, providing insight into the plasticity of nucleic acid binding by SFPQ.

2. Materials and methods  

2.1. Protein expression and purification  

The expression and purification procedures for the human SFPQ (residues 276–535) homodimer have been described previously (Lee et al., 2015). Briefly, pCDF11-SFPQ-276–535 was transformed into Rosetta 2 (DE3) cells (Merck Millipore) and protein expression was induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at an A600 of 0.6–0.8 for 16 h. The protein was purified using nickel-affinity chromatography followed by His6-tag removal using Tobacco etch virus protease and size-exclusion chromatography. The purified, untagged SFPQ homodimer was typically concentrated to 6–12 mg ml−1 and stored at −80°C until further use.

2.2. Crystallization and X-ray diffraction data collection  

Crystals of the SFPQ homodimer were grown by hanging-drop vapor-diffusion experiments at 20°C. 2 µl SFPQ solution (6.0 mg ml−1) was mixed with 2 µl reservoir solution [0.1 M Tris–HCl pH 8.5, 35–40%(v/v) MPD] and equilibrated against 0.5 ml reservoir solution. Crystals were transferred directly to liquid nitrogen before data collection. Diffraction data were recorded to 1.90 Å resolution from a single crystal on beamline MX2 (Aragão et al., 2018) at the Australian Synchrotron at a wavelength of 0.954 Å at 100 K. The diffraction data were processed with XDS (Kabsch, 2010) and were merged and scaled with AIMLESS (Evans & Murshudov, 2013). The data were processed in space group C2221, with unit-cell parameters a = 65.1, b = 119.2, c = 72.8 Å. The asymmetric unit is estimated to contain one monomer, with a corresponding crystal volume per protein weight of 2.4 Å3 Da−1 and a solvent content of 47.7%. Data-collection and merging statistics are summarized in Table 1.
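The crystal volume per protein weight (Matthews coefficient) and solvent content quoted above follow from the unit-cell volume, the number of asymmetric units per cell and the monomer mass. The sketch below reproduces that arithmetic under stated assumptions: the function names are illustrative, Z = 8 is the number of general positions in space group C2221, the monomer mass of 30 049 Da is the sequence-derived value given later in the text, and the constant 1.23 corresponds to a typical protein partial specific volume of about 0.74 cm3 g−1. It is not the program used by the authors.

```python
def matthews_coefficient(a, b, c, z, mass_da, n_copies=1):
    """Return V_M (Å^3 per Da) for an orthorhombic unit cell.

    a, b, c  : cell edges in Å (all angles are assumed to be 90 degrees)
    z        : number of asymmetric units per cell (8 for C2221)
    mass_da  : mass of one protein chain in Da
    n_copies : chains per asymmetric unit
    """
    cell_volume = a * b * c
    return cell_volume / (z * n_copies * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / vm


vm = matthews_coefficient(65.1, 119.2, 72.8, z=8, mass_da=30049.0)
print(f"V_M = {vm:.2f} Å^3/Da, solvent content = {100 * solvent_fraction(vm):.1f}%")
```

With one monomer per asymmetric unit this gives a V_M of about 2.35 Å3 Da−1 (2.4 when rounded to one decimal place) and a solvent content of about 47.7%, in line with the values reported above.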

Table 1. Diffraction data-collection and refinement statistics.

Values in parentheses are for the highest resolution shell.

Data collection
 Space group C2221
 Unit-cell parameters (Å) 65.1, 119.2, 72.8
 Resolution (Å) 20.49–1.90 (1.95–1.90)
 No. of reflections 90911 (5816)
 No. of unique reflections 22646 (1436)
 Completeness (%) 99.8 (100.0)
 Multiplicity 4.0 (4.1)
 R merge (%) 4.5 (48.7)
 R p.i.m. (%) 3.0 (33.8)
 CC1/2 0.999 (0.827)
 Average I/σ(I) 15.8 (2.5)
Refinement
 R (%) 20.1 (25.0)
 R free (%) 24.3 (30.9)
 No. of reflections in test set 1113 [4.9%]
 No. of protein molecules per asymmetric unit 1
 R.m.s.d., bond lengths (Å) 0.01
 R.m.s.d., bond angles (°) 0.98
 Average B factors† (Å2)
  Overall 37.3
  Protein molecules 36.2
  Water molecules 46.3
 Ramachandran plot‡: residues other than Gly and Pro in
  Most favored regions (%) 99.2
  Additional allowed regions (%) 0.8
  Disallowed regions (%) 0
 PDB code 6ncq

† Calculated by BAVERAGE in the CCP4 suite (Winn et al., 2011).

‡ Calculated using MolProbity (Chen et al., 2010).

2.3. Structure solution and refinement  

The crystal structure of the SFPQ homodimer in space group C2221 was solved by molecular replacement using Phaser (McCoy et al., 2007) within the CCP4 suite (Winn et al., 2011). Chain A of the structure of the SFPQ homodimer (PDB entry 4wii; Lee et al., 2015) was used as the search model after removing all nonprotein atoms. One monomer was found in the asymmetric unit with a log-likelihood gain (LLG) of 5599 and a Z-score of 71.9. Iterative model building with Coot (Emsley et al., 2010) and refinement with autoBUSTER (Bricogne et al., 2016) were carried out. The final model consisted of one chain of SFPQ (residues 279–535) and 208 water molecules. The quality of the model was validated using MolProbity (Chen et al., 2010). The refinement statistics are included in Table 1. The atomic coordinates have been deposited in the Protein Data Bank as entry 6ncq.

2.4. Inline size-exclusion chromatography–small-angle X-ray scattering (SEC–SAXS)  

Inline SEC–SAXS was performed as described previously (Banjara et al., 2018). Briefly, the SFPQ homodimer at 12 mg ml−1 (75 µl) was subjected to size-exclusion chromatography using a Superdex 200 5/150 column pre-equilibrated with 20 mM Tris–HCl pH 7.5, 250 mM NaCl, 5%(v/v) glycerol at a flow rate of 0.4 ml min−1. During elution, SAXS analysis was conducted inline via a coupled coflow sample sheath-flow environment run at a fractional sample flow rate of 0.5 (Kirby et al., 2016; Ryan et al., 2018). SAXS data were acquired on the SAXS/WAXS beamline at the Australian Synchrotron (Kirby et al., 2013), which has been optimized for low instrument background. Data were recorded on a PILATUS 1M detector at a camera length of 2.680 m, providing a q-range of 0.006–0.35 Å−1 at 12 keV and a flux of 3.1 × 10^12 photons s−1, using continuous exposures with a 1 s integration time. Data-collection information is summarized in Table 2.
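The relationship between the photon energy, the wavelength and the momentum transfer q can be sketched as follows. The definition q = 4π sin(θ)/λ, with 2θ the scattering angle, is the standard SAXS convention and is assumed here rather than stated in the text; the function names are illustrative.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def wavelength_from_energy(energy_kev):
    """Photon wavelength in Å for a given energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def q_from_two_theta(two_theta_rad, wavelength):
    """Momentum transfer q (Å^-1) for scattering angle 2θ given in radians."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength


def two_theta_from_q(q, wavelength):
    """Scattering angle 2θ in radians for a given q (Å^-1)."""
    return 2.0 * math.asin(q * wavelength / (4.0 * math.pi))


lam = wavelength_from_energy(12.0)
print(f"wavelength at 12 keV = {lam:.4f} Å")
for q in (0.006, 0.35):  # limits of the reported q-range
    angle_deg = math.degrees(two_theta_from_q(q, lam))
    print(f"q = {q} Å^-1: 2θ = {angle_deg:.3f}°, real-space length 2π/q = {2 * math.pi / q:.0f} Å")
```

At 12 keV the wavelength evaluates to 1.0332 Å, and the reported q-range corresponds to real-space length scales from roughly 1000 Å down to about 18 Å.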

Table 2. SAXS data collection.

Instrument SAXS/WAXS beamline, Australian Synchrotron
Detector PILATUS 1M
Beam geometry (µm) 230 × 400
Fractional sample-flow rate§ 0.5
Wavelength (Å) 1.0332
Flux (photons s−1) 3.1 × 10^12
Camera length (m) 2.683
q range (Å−1) 0.006–0.35
Exposure time (s) Continuous 1 s data-frame measurements of SEC elution
Temperature (K) 298
Sample configuration SEC–SAXS with sheath-flow cell, effective sample path length 0.49 mm
SASBDB code SASDFK3

§ Kirby et al. (2016), Ryan et al. (2018).

2.5. SAXS data analysis  

During data collection, preliminary data analysis was carried out using the Australian Synchrotron SAXS/WAXS autoprocessing, in which the scatterBrain software package (https://www.ansto.gov.au/user-access/instruments/australian-synchrotron-beamlines/saxs-waxs) uses the first 20 frames of the data as buffer, averages them and subtracts them throughout SEC–SAXS data analysis (Kirby et al., 2013; Ryan et al., 2018). The UV absorbance at 280 nm, measured near the capillary, enabled determination of the protein concentration during the whole experiment with the parameters described in Table 3 (Kirby et al., 2016; Ryan et al., 2018). The initial I(0) and the radius of gyration (R g) were calculated using the autoprocessing package during the inline SEC–SAXS experiment for each frame.

Table 3. SAXS structural parameters.

Structural parameters reported for the dimerization domain of SFPQ from the inline SEC peak (Fig. 5 a)

Guinier analysis
I(0) (cm−1) 0.0590 ± 0.0002
R g (Å) 27.73 ± 0.12
P(r) analysis
I(0) (cm−1) 0.0590 ± 0.0001
R g (Å) 27.48 ± 0.08
d max (Å) 88.62
 Porod volume (Å3) 91900
 χ2 (total estimate from GNOM) 0.301
CorMap p-value 0.18
Molecular-mass determination: concentration-dependent
 Partial specific volume (cm3 g−1) 0.733
 Contrast (Δρ × 10^10) 2.758
 Extinction coefficient§ 0.53
 Calculated monomeric M r from sequence§ (kDa) 30.05
 UV path length 0.3
 Molecular mass M r [from I(0)] (kDa) 60.58
Molecular-mass determination: concentration-independent
 Molecular mass M r (MoW method)†† (kDa) 61.16
 Molecular mass M r (V c method)†† (kDa) 56.08

For concentration-dependent molecular-mass determination, this should be multiplied by 2.05 owing to coflow measurement.

Determined with MULCh (Whitten et al., 2008).

§ Determined with ProtParam (Gasteiger et al., 2005).

Determined as described by Orthaber et al. (2000).

†† Molecular mass determined with PRIMUS (Franke et al., 2017).

The images suggested by the autoprocessing preliminary analysis were inspected, averaged and subtracted using PRIMUS from the ATSAS suite of SAXS data-analysis tools (Franke et al., 2017). Data analysis was carried out as described in the 2017 publication guidelines (Trewhella et al., 2017), using frames selected across the elution peak on the basis of R g and of I(0) normalized by concentration. Molecular weight was estimated by concentration-independent means using SAXS MoW (Fischer et al., 2010; Piiadov et al., 2019) and the correlation volume (V c) in PRIMUS (Franke et al., 2017), as well as by concentration-dependent means (Orthaber et al., 2000) (Table 3). The radius of gyration was initially calculated from the reduced data using Guinier plots (Guinier, 1939) with AUTORG from the ATSAS suite (Franke et al., 2017). Uncertainties from Guinier fits are two standard errors of the slope of the fitted linear regression of ln[I(q)] versus q2. P(r) distribution analysis was also carried out using GNOM (Franke et al., 2017) to further characterize SFPQ in solution. Rigid-body analyses were performed with CRYSOL and CORAL (Franke et al., 2017) using the available crystal structures. In CORAL, the dimeric models obtained from the two crystal structures were fixed in position and the missing residues (Table 4) were generated ab initio, with χ2 calculated from two-standard-error uncertainties. A total of ten CORAL runs were carried out for the models generated from the two crystal structures. The data presented in this article refer to the CORAL models with the best χ2 (near 0.25 for two standard errors) and highest p-value obtained using CorMap (Franke et al., 2015). Information on data processing and analysis is summarized in Tables 3 and 4. The SAXS experimental data have been deposited in the Small-Angle Scattering Biological Data Bank (SASBDB) under accession code SASDFK3.
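The Guinier analysis described above rests on the approximation ln[I(q)] ≈ ln[I(0)] − (R g2/3)q2 at low q, so that R g follows from the slope of a straight-line fit. The sketch below illustrates this with an ordinary least-squares fit on noise-free synthetic data. It is not AUTORG, which also selects the fitting range automatically; the input values (R g = 27.7 Å, I(0) = 0.059) are illustrative numbers chosen to be close to those in Table 3, and the q grid is a hypothetical one that respects the usual q·R g < 1.3 limit.

```python
import math


def guinier_fit(q_values, intensities):
    """Least-squares fit of ln I(q) against q^2; returns (Rg, I0)."""
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # equals -Rg^2 / 3 in the Guinier regime
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free Guinier-region data (illustrative values only).
true_rg, true_i0 = 27.7, 0.059
q_grid = [0.006 + 0.001 * k for k in range(41)]  # 0.006 to 0.046 Å^-1
synthetic = [true_i0 * math.exp(-((q * true_rg) ** 2) / 3.0) for q in q_grid]

rg, i0 = guinier_fit(q_grid, synthetic)
print(f"Rg = {rg:.2f} Å, I(0) = {i0:.4f}, q_max * Rg = {q_grid[-1] * rg:.2f}")
```

On real data the points carry experimental errors, so a weighted fit and an uncertainty on the slope (reported here as two standard errors) would be needed in addition.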

Table 4. SAXS atomistic modeling.

                                  P212121 structure (PDB entry 4wii)    C2221 structure (PDB entry 6ncq)
CRYSOL (no constant subtraction)
 χ2                               0.45                                  0.37
 CorMap p-value                   0.00009                               0.01122
 Envelope R g (Å)                 24.99                                 26.22
 Molecular weight (kDa)           57.58                                 58.79
 Vol (Å3), Ra (Å), Dro (e Å−3)    73450, 1.80, 0.03                     67650, 1.72, 0.03
CORAL
 Chain position                   Fixed                                 Fixed
 N-terminal missing residues      Chain A, 4; chain B, 12               Chain A, 6; chain B, 6
 C-terminal missing residues      Chain A, 7; chain B, 6                None
 χ2                               0.35                                  0.51
 CorMap p-value                   0.02216                               0.00004

Two standard error uncertainties were used throughout (best fit χ2 near 0.25).

Chain B was generated by the symmetry operator −x − 1, y, −z − 1/2.

3. Results  

3.1. Crystal structure of the dimerization domain of SFPQ in space group C2221  

The dimerization domain of human SFPQ, encompassing the two RRMs, the NOPS domain and part of the CC domain (Fig. 1 a), was originally crystallized in the primitive orthorhombic space group P212121 with one dimer in the asymmetric unit (PDB entry 4wii), as described in our previous study (Lee et al., 2015). In the present work, the same dimerization domain of SFPQ (residues 276–535) has been crystallized in the C-centered orthorhombic space group C2221 with one monomer in the asymmetric unit (Fig. 1 b). The model of SFPQ in space group C2221 was refined to 1.90 Å resolution, with residuals R = 20.1% and R free = 24.3% (Table 1). The final model consisted of one monomer in the asymmetric unit with residues 279–535 and 208 water molecules. The dimer, which is the biological functional unit of SFPQ, can be generated by the crystallographic twofold symmetry (symmetry operator −x − 1, y, −z − 1/2; Fig. 1 c).
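How the second monomer of the dimer arises from the symmetry operator (−x − 1, y, −z − 1/2) can be sketched as below. The operator acts on fractional coordinates; the atom position used is hypothetical, and the fractional-to-Cartesian conversion is the simple edge scaling that is valid only for a cell with 90° angles, such as the orthorhombic cell reported here. Deposited coordinate files store Cartesian coordinates, so a real calculation would first convert them to fractional coordinates.

```python
CELL = (65.1, 119.2, 72.8)  # a, b, c in Å, from the data-processing statistics


def apply_symop(frac):
    """Apply the operator (-x - 1, y, -z - 1/2) to fractional coordinates."""
    x, y, z = frac
    return (-x - 1.0, y, -z - 0.5)


def frac_to_cart(frac, cell=CELL):
    """Fractional to Cartesian coordinates for an orthorhombic cell."""
    return tuple(f * edge for f, edge in zip(frac, cell))


atom = (-0.40, 0.25, -0.10)   # hypothetical fractional position in the monomer
mate = apply_symop(atom)      # equivalent position in the partner monomer
print("partner (fractional):", mate)
print("partner (Cartesian, Å):", frac_to_cart(mate))

# A twofold operation is its own inverse: applying it twice restores the atom.
restored = apply_symop(mate)

# Points with x = -1/2 and z = -1/4 lie on the twofold axis and do not move.
on_axis = (-0.5, 0.30, -0.25)
```

The operator leaves y unchanged, so the twofold axis relating the two monomers runs parallel to the b axis through x = −1/2, z = −1/4.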

Consistent with the previously characterized structure of the dimerization domain of human SFPQ, the structure of SFPQ in space group C2221 shows extensive intermolecular interactions within the dimer, which are primarily governed by hydrophobic interactions. The NOPS domain of one monomer makes extensive interactions with the second RRM (RRM2) of the second monomer, followed by antiparallel CC-domain interactions (Fig. 1 c). Analysis of the interface using PISA (Krissinel & Henrick, 2007) shows that the dimer interface involves all four domains (two RRMs, the NOPS domain and the CC domain), with over 40% of residues directly involved in the dimer interface. The overall structure of SFPQ in space group C2221 confirms the previously determined structure of SFPQ.

3.2. Comparison of the crystal structures of the dimerization domain of SFPQ  

The two structures of the dimerization domain of SFPQ are generally in good agreement: superposition of the structure of SFPQ in space group C2221 with chain A and chain B of that in space group P212121 results in root-mean-square deviations (r.m.s.d.s) of 1.45 Å for 250 Cα pairs and 1.31 Å for 244 Cα pairs, respectively (Fig. 2 a). Overall superposition of the two structures as dimers, however, shows subtle displacements of the first RRM (RRM1) and the terminal CC domain in addition to local differences in the NOPS domain (Fig. 2). The per-residue Cα deviations clearly show that the major differences are localized in RRM1, the NOPS domain and the distal CC domain, with maximum Cα deviations of 4.5 and 4.2 Å relative to chain A and chain B, respectively, of the structure of SFPQ in space group P212121 (Fig. 2 b).

Figure 2.

Comparison of the crystal structures of the dimerization domain of SFPQ. (a) Superposition of the dimer structures in ribbon representation. The structure of SFPQ in space group C2221 is shown in red with the other monomer generated by the crystallographic twofold symmetry in gray, while chain A from the structure of SFPQ in space group P212121 is shown in yellow and chain B in cyan. (b) R.m.s.d. Cα difference of the structure of SFPQ in space group C2221 from that in space group P212121 (chain A is shown in black and chain B in orange) showing that the major differences are localized in RRM1, the NOPS domain and the distal CC domain.

To assess the disposition of the four distinct domains in the dimerization domain of SFPQ in the two structures, the movement of the domain was analyzed using the protein domain-motion analysis program DynDom (Hayward & Berendsen, 1998). The two monomers of SFPQ in the P212121 structure were compared with the monomer of SFPQ in the C2221 structure, and the resulting domain movement of SFPQ in the C2221 structure relative to the two monomers of SFPQ in the P212121 structure is shown in Fig. 3(a). RRM1 and the distal CC domain are identified as moving domains that contribute to the major deviations in the disposition of the domains.

Figure 3.

Domain-movement analysis using DynDom (Hayward & Berendsen, 1998). (a) The domain movement of the structure of SFPQ in space group C2221 is color-coded: fixed domains are in blue, moving domains in red and bending residues in green (residues that were not used in the analysis are shown in gray). The structure of SFPQ in space group C2221 is superposed with chain A from the structure of SFPQ in space group P212121 (yellow, middle) and chain B from the structure of SFPQ in space group P212121 (cyan, right). (b, c) Close-up view of the moving domains: RRM1 and the CC domain. The structure of SFPQ in space group C2221 is superposed with chain A from the structure of SFPQ in space group P212121 in yellow (b) and with chain B from the structure of SFPQ in space group P212121 in cyan (c). (d) Close-up view of the canonical RNA-binding residues in RRM1. The structure of SFPQ in space group C2221 is superposed with chain A from the structure of SFPQ in space group P212121 (left) and with chain B from the structure of SFPQ in space group P212121 (right). (e) Superposition of the CC domains from the two structures of SFPQ showing different CC-domain curvature in the two structures. The same color scheme as in Fig. 2 was applied.

In the case of RRM1, the domain from the structure of SFPQ in space group C2221 superposes well individually with that from the P212121 structure (r.m.s.d.s of 0.45 Å for 83 Cα atoms from chain A and 0.34 Å for chain B). However, relative to the dimer core, RRM1 in the new structure is tilted away by rotation angles of 6.6° and 5.9° compared with RRM1 of chain A and chain B in the P212121 structure, respectively (Figs. 3 b and 3 c). The resulting dispositions of RRM1 in the two structures provide variations in the position of the conserved aromatic residues that are found in the canonical RRM (Fig. 3 d).

The distal CC domain of SFPQ in the C2221 structure also shows a significant variation in curvature in comparison to that in space group P212121 (Fig. 3 e). In contrast to the curved distal CC domains that are observed in the structure of SFPQ in space group P212121, the CC domain of SFPQ in space group C2221 is straight, resulting in an increase of the Cα distance between Tyr527 and Phe486 of the other monomer in the dimer from 6.1 and 5.9 Å in the P212121 structure to 7.4 Å. This change in curvature was identified in the DynDom analysis, with rotation angles of 8.2° for chain A of the structure of SFPQ in space group P212121 and 9.2° for chain B relative to the fixed domain (RRM2).

The straight distal CC domain in the C2221 structure is attributable to differences in local crystal packing. In the structure of SFPQ in space group P212121, the C-terminus of the CC domain is disordered, with the last residues modeled in the structure being His528 in chain A and Tyr529 in chain B. In contrast, the CC domain of SFPQ in the C2221 structure exhibits clear density up to the last residue, Leu535, where a series of hydrophobic and hydrogen-bond interactions with neighboring symmetry-related molecules stabilize the CC domain. In particular, the side chain of His530 forms a helix-stabilizing π-stacking interaction with the same residue in the neighboring monomer (symmetry operator −x − 1, y, −z + 1/2), stabilizing the conformation of the distal CC domain (Fig. 4). A similar π-stacking interaction of His530 with His553 from the extended CC domain of the symmetry-related neighboring dimer has been observed in full CC-domain constructs of SFPQ in a previous report (Lee et al., 2015). It has been shown that the polymerization of SFPQ mediated by the extended CC domain is a reversible dynamic process, depending on local concentration, and is essential for many aspects of the function of SFPQ (Lee et al., 2015). Taken together, these observations suggest that the CC domain of SFPQ is in dynamic equilibrium with multiple conformations, depending on the local concentration and oligomeric status.

Figure 4.

The stabilized C-terminal CC domain owing to crystal contacts. (a, b) The monomer in the asymmetric unit is shown using the same color scheme as in Fig. 1(b), forming a dimer with the neighboring symmetry-related monomer (−x − 1, y, −z − 1/2) shown in gray. The C-terminal CC domain is stabilized by the interaction with the neighboring dimer (yellow, −x − 1, y, −z + 1/2; green, x, y, z + 1). (c) Close-up view of the π-stacking interaction of His530.

In addition to the subtle differences in the disposition of RRM1 and the CC domain, the structure highlights the plasticity of the NOPS domain (Fig. 3 a). The conformations of the NOPS domains in the two chains of the previous dimer structure in space group P212121 are significantly different and the NOPS domain in the current structure is not identical to either of the previous conformations, but it is closer to the conformation in chain A of the P212121 structure. The NOPS domain is the most flexible domain among the four domains in the DBHS proteins, as shown by its higher B factor than the average B factor of every DBHS protein structure characterized so far. In the structure of the SFPQ–PSPC1 heterodimer the NOPS domain of SFPQ is disordered (Huang et al., 2018). The plasticity of the NOPS domain has been suggested to contribute to dimer partner selection and exchange among the three DBHS proteins in heterodimerization (Huang et al., 2018).

The effect of these local differences observed in RRM1, the NOPS domain and the CC domain in the overall dimer interface was analyzed by PISA (Krissinel & Henrick, 2007). The interface analyses of the two structures are within a comparable range: Δi G = −51.6 kcal mol−1 (p-value 0.033) and Δdiss G = 75.0 kcal mol−1 for the symmetry-generated dimer in space group C2221, while Δi G = −46.1 kcal mol−1 (p-value 0.061) and Δdiss G = 71.2 kcal mol−1 for the two monomers in space group P212121. The larger buried surface area of the dimer in the P212121 structure (10 420 Å2 in the P212121 structure; 9040 Å2 in the C2221 structure), however, indicates slightly more compact dimer formation in space group P212121.

3.3. Comparison of the solution X-ray scattering data with the crystal structure of SFPQ  

To evaluate the structure of the dimerization domain of SFPQ in solution, we carried out SAXS experiments using the size-exclusion chromatography (SEC) inline with SAXS (SEC–SAXS) method. Scattering data were collected while the protein sample was eluted from the size-exclusion chromatography column (Figs. 5 a, 5 b and 5 c). The chromatogram revealed that the dimerization domain of SFPQ eluted in a single resolved peak (Fig. 5 a) with an averaged R g and I(0)/c over the peak frames of 27.81 ± 0.06 Å and 0.0291 ± 0.0003 (for four averaged frames), respectively. The scattering profile retrieved from the peak conformed to a straight line in the low-q region on a Guinier plot (Fig. 5 b). The molecular weight calculated from the forward scattering intensity, I(0), for this peak corresponded to a species of ∼60 kDa. A similar molecular weight was obtained from concentration-independent measurements such as the MoW or V c methods (Table 3), confirming that the dimerization domain of SFPQ is a dimer in solution at the concentration used in this study (the theoretical molecular weight of a monomer is 30 049 Da; Table 3).

Figure 5.

Small-angle X-ray scattering data for the dimerization domain of SFPQ. (a) The inline size-exclusion elution profile of SFPQ (dark gray line). The dark squares and triangles represent the frames used for averaging the scattering profiles based on R g (squares) and I(0) normalized over concentration (triangles). (b) Measured scattering data for SFPQ (residues 276–535) in log[I(q)]–q representation. Inset: the Guinier plot of the low-angle portion of the scattering data is linear, consistent with a monodisperse solution. (c) P(r) distribution of the dimerization domain of SFPQ. (d) CRYSOL modeling analysis. Measured scattering data (gray circles) are overlaid with the predicted scattering profiles for the crystal structures of SFPQ in the two space groups: a blue line for the model in space group P212121 (PDB entry 4wii; CRYSOL: χ2 = 0.45, p-value = 0.00009) and a red line for the model in space group C2221 (PDB entry 6ncq; CRYSOL: χ2 = 0.37, p-value = 0.01122). (e, f) Models retrieved from CORAL hybrid modeling analysis of SFPQ with the two available crystallographic structures: the CORAL model in space group P212121 (e) and that in space group C2221 (f). (g, h) CORAL hybrid modeling analysis. Measured scattering data (gray circles) and the predicted scattering profiles for the CORAL hybrid models of SFPQ in the two space groups P212121 (e) and C2221 (f) are presented in the same way as in (d): a blue line for the model in space group P212121 (PDB entry 4wii; CORAL best fit: χ2 = 0.35, p-value = 0.02216) and a red line for the model in C2221 (PDB entry 6ncq; CORAL best fit: χ2 = 0.51, p-value = 0.00004).

To assess which of the two crystal structures of SFPQ better represents the structure of the protein in solution, CRYSOL analyses were carried out with both models (Table 4 and Fig. 5 d). The results suggested that the crystal structure of SFPQ in space group C2221 is a better fit than that in space group P212121, as judged by the similarity of its R g value to the experimentally obtained value and by the better fit statistics: a χ2 (two standard errors) of 0.37 with a p-value of 0.01122 (Table 4 and Fig. 5 d). However, this result could be misleading owing to the larger number of missing residues in the model of SFPQ in space group P212121 compared with that in space group C2221. For this reason, we carried out rigid-body modeling analysis using CORAL, in which the missing residues were generated ab initio (Petoukhov et al., 2012). The ten runs of rigid-body modeling using CORAL resulted in χ2 values for the fits of the models in P212121 from 0.35 to 0.37 with p-values from 0.00009 to 0.00222, while the χ2 values for the models in C2221 varied from 0.51 to 0.52 with p-values from 0 to 0.00004. In this article, we present the models retrieved with the best χ2 (near 0.25 for two standard errors) and highest p-values for the two crystallographic space groups (Table 4, Figs. 5 e and 5 f). It was observed that the models in both space groups fit the experimentally obtained SAXS data (Figs. 5 g and 5 h). However, comparison of the χ2 and p-values (CorMap) suggests that the best fit is obtained from the CORAL model of SFPQ in space group P212121, with a χ2 of 0.35 and a p-value of 0.02216 (Table 4). The discrepancy between the experimental data and the models, which was more obvious when the fits were presented in a Kratky plot (Fig. 5 h), in particular with the model in C2221, implies a dynamic nature of SFPQ in solution. This observation is consistent with the domain movements observed in the two crystal structures.
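The statement that the best χ2 is "near 0.25 for two standard errors" can be understood from the definition of the reduced χ2: if the experimental uncertainties are doubled, every squared, error-normalized residual is divided by four, so a fit that would give χ2 ≈ 1 with one-standard-error uncertainties gives χ2 ≈ 0.25. The sketch below illustrates this scaling with hypothetical intensities. It is not the CRYSOL or CORAL implementation, which also adjust scale and hydration parameters that are not reproduced here.

```python
def reduced_chi2(i_exp, i_model, sigma, error_scale=1.0, n_params=0):
    """Reduced chi-square between experimental and model intensities.

    error_scale multiplies every experimental uncertainty
    (2.0 corresponds to two-standard-error uncertainties).
    """
    dof = len(i_exp) - n_params
    total = sum(
        ((e - m) / (error_scale * s)) ** 2
        for e, m, s in zip(i_exp, i_model, sigma)
    )
    return total / dof


# Hypothetical data in which every residual equals exactly one standard error.
i_model = [1.00, 0.80, 0.60, 0.40]
sigma = [0.02, 0.02, 0.01, 0.01]
i_exp = [m + s for m, s in zip(i_model, sigma)]

chi2_one_sigma = reduced_chi2(i_exp, i_model, sigma)
chi2_two_sigma = reduced_chi2(i_exp, i_model, sigma, error_scale=2.0)
print(f"chi2 with 1-sigma errors: {chi2_one_sigma:.3f}")
print(f"chi2 with 2-sigma errors: {chi2_two_sigma:.3f}")
```

On this scale, the reported values of 0.35 to 0.52 lie above the ideal value of 0.25, which is consistent with the conclusion that neither crystallographic model reproduces the solution data perfectly.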

4. Discussion  

SFPQ has been implicated in many aspects of gene regulation, including RNA biogenesis, paraspeckle formation and DNA repair. Manifold functions of SFPQ appear to be mediated by its ability to interact with various nucleic acid targets, RNA as well as DNA, and to form complexes with protein interaction partners at different locations and stages of the cell cycle (Yarosh et al., 2015; Knott et al., 2016). The functional biological unit of the DBHS proteins is a dimer and the molecular basis of the dimerization has been extensively characterized, in particular by structure determination (Passon et al., 2012; Lee et al., 2015; Knott et al., 2015; Huang et al., 2018). The importance of polymerization mediated by the extended CC domain in the multiple functions of SFPQ has also been probed (Lee et al., 2015). However, the molecular details of how DBHS proteins, including SFPQ, interact with other interaction partners remain to be characterized.

In this study, we report a new crystal structure of the dimerization domain of SFPQ in space group C2221 with one monomer in the asymmetric unit and compare it with the previously reported structure in space group P212121. While confirming the overall arrangement of the four distinct domains (two RRMs, a NOPS domain and a CC domain), comparison of the two structures highlights subtle domain movements that are localized mainly in RRM1 and the distal CC domain. SAXS data analysis further supported the concept that the dimerization domain of SFPQ is dynamic in solution, given that neither of the crystallographic models fits the solution scattering data perfectly. This is presumably owing to multiple conformations generated by domain movements that are in equilibrium.

The domain movement of SFPQ may facilitate interaction with a wide range of interaction partners, including RNA partners. Recent CLIP-Seq data show that SFPQ has a binding preference for noncoding regions of RNA, including introns and 3′ UTRs, and interacts directly with the 3′ UTRs of mRNAs to modulate microRNA (miRNA) targeting of mRNAs with long 3′ UTRs (Bottini et al., 2017). Interactions of SFPQ with long noncoding RNAs (lncRNAs) have also been documented: direct interaction with NEAT1 to form paraspeckles (Hirose et al., 2014) and with MALAT1 and CTBP1-AS to mediate its regulatory effects in colorectal cancer (Ji et al., 2014) and advanced prostate cancer (Takayama et al., 2013), respectively. These findings indicate that SFPQ may bind to a wide range of RNAs with versatile sequence and/or secondary-structure specificity.

Although the precise modes of these interactions remain to be characterized, comparison of the two structures of SFPQ presented in this study suggests that the disposition of RRM1 in particular may affect the selection of, and/or affinity for, RNA partners. RRM1 of SFPQ possesses the consensus ribonucleoprotein (RNP) 1 and 2 sequences critical for canonical RNA interactions. The positions of the conserved aromatic residues (Phe300, Phe334 and Phe336) in RNP1 and RNP2 from the two structures are significantly different owing to the domain movement of RRM1 (Fig. 3d). Owing to the unprecedented spatial arrangement of the four RRMs in the dimer of the DBHS proteins, it has previously been speculated that a unique mode of RNA binding may exist (Passon et al., 2012). In conjunction with the propensity of SFPQ to polymerize at high local concentrations (Lee et al., 2015), subtle changes in the relative position of RRM1 in the dimer may further facilitate the accommodation of a wide range of RNA partners with different sequence and/or secondary-structure specificities. Further studies, including the structural characterization of SFPQ in complex with its RNA partners, will be needed to define the precise interaction mode(s) of SFPQ with RNA.
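One simple way to quantify a domain movement such as that of RRM1 is to superpose two models on a structurally invariant core and then measure how far the remaining atoms are displaced. The sketch below uses a Kabsch least-squares superposition for this purpose; it is an assumed illustration rather than the procedure used in this study, and the coordinates are synthetic rather than taken from either SFPQ structure. All function names are hypothetical.

```python
import numpy as np


def fit_rigid_transform(mobile, target):
    """Kabsch fit: rotation and centroids that best map `mobile` onto `target`."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    covariance = (mobile - mobile_centroid).T @ (target - target_centroid)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    return rotation, mobile_centroid, target_centroid


def apply_transform(coords, rotation, mobile_centroid, target_centroid):
    """Apply a fitted transform to any set of coordinates (N x 3)."""
    coords = np.asarray(coords, dtype=float)
    return (coords - mobile_centroid) @ rotation.T + target_centroid


def displacements(coords_a, coords_b):
    """Per-atom distance between two equally sized coordinate sets."""
    diff = np.asarray(coords_a, dtype=float) - np.asarray(coords_b, dtype=float)
    return np.linalg.norm(diff, axis=1)


# Synthetic example: atoms 0-8 form a rigid "core", atom 9 has moved by 2 A.
rng = np.random.default_rng(0)
reference = rng.normal(scale=10.0, size=(10, 3))
moved = reference.copy()
moved[9] += np.array([2.0, 0.0, 0.0])
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
mobile = moved @ rot_z.T + np.array([5.0, -3.0, 1.0])

core = slice(0, 9)
transform = fit_rigid_transform(mobile[core], reference[core])
shifts = displacements(apply_transform(mobile, *transform), reference)
print(np.round(shifts, 3))
```

Fitting on the core only, and then applying the transform to every atom, keeps the movement of the mobile region from being absorbed into the superposition itself.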

Supplementary Material

PDB reference: dimerization domain of human SFPQ in space group C2221, 6ncq

Acknowledgments

Aspects of this research were undertaken on the Macromolecular Crystallography (MX) beamlines and SAXS/WAXS beamline at the Australian Synchrotron, ANSTO, Melbourne, Victoria, Australia, and we thank the beamline staff for their professional support. We acknowledge the La Trobe University–Comprehensive Proteomics Platform (La Trobe University) for providing infrastructure and expertise.

Funding Statement

This work was funded by Australian Research Council grant DE150101243 and a La Trobe University TBM Fellowship, both to Mihwa Lee.

References

  1. Aragão, D., Aishima, J., Cherukuvada, H., Clarken, R., Clift, M., Cowieson, N. P., Ericsson, D. J., Gee, C. L., Macedo, S., Mudie, N., Panjikar, S., Price, J. R., Riboldi-Tunnicliffe, A., Rostan, R., Williamson, R. & Caradoc-Davies, T. T. (2018). J. Synchrotron Rad. 25, 885–891.
  2. Banjara, S., Mao, J., Ryan, T. M., Caria, S. & Kvansakul, M. (2018). J. Biol. Chem. 293, 5464–5477.
  3. Bond, C. S. & Fox, A. H. (2009). J. Cell Biol. 186, 637–644.
  4. Bottini, S., Hamouda-Tekaya, N., Mategot, R., Zaragosi, L. E., Audebert, S., Pisano, S., Grandjean, V., Mauduit, C., Benahmed, M., Barbry, P., Repetto, E. & Trabucchi, M. (2017). Nature Commun. 8, 1189.
  5. Bricogne, G. B. E., Brandl, M., Flensburg, C., Keller, P., Paciorek, W., Roversi, P. S. A., Smart, O. S., Vonrhein, C. & Womack, T. O. (2016). BUSTER v.2.10.2. Global Phasing, Cambridge, UK.
  6. Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
  7. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
  8. Evans, P. R. & Murshudov, G. N. (2013). Acta Cryst. D69, 1204–1214.
  9. Fischer, H., de Oliveira Neto, M., Napolitano, H. B., Polikarpov, I. & Craievich, A. F. (2010). J. Appl. Cryst. 43, 101–109.
  10. Fox, A. H. & Lamond, A. I. (2010). Cold Spring Harb. Perspect. Biol. 2, a000687.
  11. Franke, D., Jeffries, C. M. & Svergun, D. I. (2015). Nature Methods, 12, 419–422.
  12. Franke, D., Petoukhov, M. V., Konarev, P. V., Panjkovich, A., Tuukkanen, A., Mertens, H. D. T., Kikhney, A. G., Hajizadeh, N. R., Franklin, J. M., Jeffries, C. M. & Svergun, D. I. (2017). J. Appl. Cryst. 50, 1212–1225.
  13. Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M. R., Appel, R. D. & Bairoch, A. (2005). The Proteomics Protocols Handbook, edited by J. M. Walker, pp. 571–607. Totowa: Humana Press.
  14. Guinier, A. (1939). Ann. Phys. 11, 161–237.
  15. Hayward, S. & Berendsen, H. J. C. (1998). Proteins, 30, 144–154.
  16. Hirose, T., Virnicchi, G., Tanigawa, A., Naganuma, T., Li, R., Kimura, H., Yokoi, T., Nakagawa, S., Bénard, M., Fox, A. H. & Pierron, G. (2014). Mol. Biol. Cell, 25, 169–183.
  17. Huang, J., Casas Garcia, G. P., Perugini, M. A., Fox, A. H., Bond, C. S. & Lee, M. (2018). J. Biol. Chem. 293, 6593–6602.
  18. Ji, Q., Zhang, L., Liu, X., Zhou, L., Wang, W., Han, Z., Sui, H., Tang, Y., Wang, Y., Liu, N., Ren, J., Hou, F. & Li, Q. (2014). Br. J. Cancer, 111, 736–748.
  19. Kabsch, W. (2010). Acta Cryst. D66, 125–132.
  20. Kirby, N., Cowieson, N., Hawley, A. M., Mudie, S. T., McGillivray, D. J., Kusel, M., Samardzic-Boban, V. & Ryan, T. M. (2016). Acta Cryst. D72, 1254–1266.
  21. Kirby, N. M., Mudie, S. T., Hawley, A. M., Cookson, D. J., Mertens, H. D. T., Cowieson, N. & Samardzic-Boban, V. (2013). J. Appl. Cryst. 46, 1670–1680.
  22. Knott, G. J., Bond, C. S. & Fox, A. H. (2016). Nucleic Acids Res. 44, 3989–4004.
  23. Knott, G. J., Lee, M., Passon, D. M., Fox, A. H. & Bond, C. S. (2015). Protein Sci. 24, 2033–2043.
  24. Krissinel, E. & Henrick, K. (2007). J. Mol. Biol. 372, 774–797.
  25. Lee, M., Passon, D. M., Hennig, S., Fox, A. H. & Bond, C. S. (2011). Acta Cryst. D67, 981–987.
  26. Lee, M., Sadowska, A., Bekere, I., Ho, D., Gully, B. S., Lu, Y., Iyer, K. S., Trewhella, J., Fox, A. H. & Bond, C. S. (2015). Nucleic Acids Res. 43, 3826–3840.
  27. McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
  28. Orthaber, D., Bergmann, A. & Glatter, O. (2000). J. Appl. Cryst. 33, 218–225.
  29. Passon, D. M., Lee, M., Rackham, O., Stanley, W. A., Sadowska, A., Filipovska, A., Fox, A. H. & Bond, C. S. (2012). Proc. Natl Acad. Sci. USA, 109, 4846–4850.
  30. Patton, J. G., Porro, E. B., Galceran, J., Tempst, P. & Nadal-Ginard, B. (1993). Genes Dev. 7, 393–406.
  31. Petoukhov, M. V., Franke, D., Shkumatov, A. V., Tria, G., Kikhney, A. G., Gajda, M., Gorba, C., Mertens, H. D. T., Konarev, P. V. & Svergun, D. I. (2012). J. Appl. Cryst. 45, 342–350.
  32. Piiadov, V., Ares de Araújo, E., Oliveira Neto, M., Craievich, A. F. & Polikarpov, I. (2019). Protein Sci. 28, 454–463.
  33. Ryan, T. M., Trewhella, J., Murphy, J. M., Keown, J. R., Casey, L., Pearce, F. G., Goldstone, D. C., Chen, K., Luo, Z., Kobe, B., McDevitt, C. A., Watkin, S. A., Hawley, A. M., Mudie, S. T., Samardzic-Boban, V. & Kirby, N. (2018). J. Appl. Cryst. 51, 97–111.
  34. Sasaki, Y. T., Ideue, T., Sano, M., Mituyama, T. & Hirose, T. (2009). Proc. Natl Acad. Sci. USA, 106, 2525–2530.
  35. Takayama, K., Horie-Inoue, K., Katayama, S., Suzuki, T., Tsutsumi, S., Ikeda, K., Urano, T., Fujimura, T., Takagi, K., Takahashi, S., Homma, Y., Ouchi, Y., Aburatani, H., Hayashizaki, Y. & Inoue, S. (2013). EMBO J. 32, 1665–1680.
  36. Trewhella, J., Duff, A. P., Durand, D., Gabel, F., Guss, J. M., Hendrickson, W. A., Hura, G. L., Jacques, D. A., Kirby, N. M., Kwan, A. H., Pérez, J., Pollack, L., Ryan, T. M., Sali, A., Schneidman-Duhovny, D., Schwede, T., Svergun, D. I., Sugiyama, M., Tainer, J. A., Vachette, P., Westbrook, J. & Whitten, A. E. (2017). Acta Cryst. D73, 710–728.
  37. Whitten, A. E., Cai, S. & Trewhella, J. (2008). J. Appl. Cryst. 41, 222–226.
  38. Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
  39. Yarosh, C. A., Iacona, J. R., Lutz, C. S. & Lynch, K. W. (2015). Wiley Interdiscip. Rev. RNA, 6, 351–367.

Articles from Acta Crystallographica. Section F, Structural Biology Communications are provided here courtesy of International Union of Crystallography
