Cooperativity boosts affinity and specificity of proteins with multiple RNA-binding domains

Simon H Stitzinger; Salma Sohrabi-Jahromi; Johannes Söding

doi:10.1093/nargab/lqad057

2023 Jun 9;5(2):lqad057. doi: 10.1093/nargab/lqad057

PMCID: PMC10251633 PMID: 37305168

Abstract

Numerous cellular processes rely on the binding of proteins with high affinity to specific sets of RNAs. Yet most RNA-binding domains display low specificity and affinity in comparison to DNA-binding domains. The best binding motif is typically enriched by less than a factor of 10 in high-throughput RNA SELEX or RNA bind-n-seq measurements. Here, we provide insight into how cooperative binding of multiple domains in RNA-binding proteins (RBPs) can boost their effective affinity and specificity to values orders of magnitude higher than those of their individual domains. We present a thermodynamic model to calculate the effective binding affinity (avidity) for idealized, sequence-specific RBPs with any number of RNA-binding domains (RBDs) given the affinities of their isolated domains. For seven proteins in which affinities for individual domains have been measured, the model predictions are in good agreement with measurements. The model also explains how a two-fold difference in binding site density on RNA can increase protein occupancy 10-fold. It is therefore rationalized that local clusters of binding motifs are the physiological binding targets of multi-domain RBPs.

INTRODUCTION

RNA-binding proteins (RBPs) regulate various steps of mRNA biogenesis including RNA splicing, localization, translation, and degradation (1). To ensure that these proteins bind the correct set of RNA molecules and at the right regions, the interactions have to be highly specific. Yet many RNA-binding domains (RBDs) bind to short and degenerate RNA motifs, often three, rarely more than five nucleotides in length (2,3), and the dissociation constants (K_d) of their RNA-binding domains are often in the micromolar range, sometimes hundreds of micromolar (4–11). In contrast, single DNA-binding domains typically recognize somewhat longer motifs (12–14) and the dissociation constants of most transcription factors are in the nanomolar range.

Despite the low affinity of the individual RNA-binding domains, cooperativity between multiple domains in an RBP can result in high specificities and avidities (defined as an ‘effective’ association constant, see Materials and Methods) for the entire RBP much higher than the K_as of individual domains (15,16). When RBPs form oligomers or polymers, all RNA-binding domains of the complex can bind RNA cooperatively. Roughly 80% of eukaryotic RBPs either have at least two binding domains (17) or assemble into homooligomeric complexes with multiple RNA-binding domains (18) (Figure 1).

Figure 1. — Most RBPs have more than one domain per chain or per homooligomeric complex. Numbers of RNA-binding domains per protein for proteins in the RNA-binding protein database (RBPDB) (17), which contains proteins from human, mouse, Drosophila and Caenorhabditis elegans. Inset shows in black the oligomeric state as predicted by the PDBePISA tool (18) for the 279 PDB structures of 136 of the RNA-binding proteins with only one domain.

The increase in avidity via cooperative binding can be explained by the high local concentration of a protein binding domain at the second RNA site when the first binding domain is bound to the first RNA site, which adds to the background concentration (19,20). We will show here that, when this local effective concentration c_eff is x-fold higher than the K_d of the still unbound binding site (in isolation), the effective K_d for this interaction can be x-fold lower than for the isolated RNA-binding domain.

Thermodynamic models of cooperative binding for two binding units have been developed for binding of bivalent antibodies to antigens (19,21), of ligand binding by bivalent and multivalent receptors (22,23), and of DNA-binding proteins with two DNA-binding domains (24). In all of these cases, the combination of multiple binding domains and target sites, and their connection through flexible linkers increases avidity in an analogous way to multi-domain RNA-binding.

To better understand cooperative RNA-protein interactions and the biological implications that arise from cooperativity, we need to model quantitatively the avidity of proteins or oligomeric complexes with more than two RNA-binding domains. So far, existing models have only described cooperative binding between two domains, with flexible linkers between the domains of one binding partner (20).

Here, our goal is to develop a simplified model that can provide biologists and biochemists insight into the important effects of cooperative binding of multi-domain RBPs. Our goal is not to develop a model that can make accurate predictions of avidities, as this would require, if at all possible, detailed atomic-level molecular dynamics simulations.

We present an equilibrium thermodynamic model for multi-domain RNA-binding with any number of RNA-binding domains. We treat the RNA linkers between binding motifs as worm-like chains and, in contrast to earlier work (19,20), we take the entropy of the chain into account. However, we have to simplify by ignoring interactions of the RNA linker with the proteins. The model can describe RNA-binding domains connected by flexible peptide linkers (25), which we also treat as worm-like chains.

Using this model, we can show that the avidity increases exponentially with each added pair of binding domain and target site. In this way, high affinities and specificities can be achieved with low-affinity and low-specificity RNA-binding domains. We validate the model on seven RNA-binding proteins for which the affinities of the entire protein and of individual domains have been measured. We find that the avidities estimated with the model are in good agreement with the measured values. Lastly and most importantly, we demonstrate that, by cooperative binding with multiple RNA-binding domains with the same binding preferences, RNAs can be sensitively distinguished based on their binding motif density. This result suggests that sequence-specific RBPs achieve high specificity and avidity by binding to clusters of binding sites on their target RNAs.

MATERIALS AND METHODS

Simple cooperative binding model

The model describes the cooperative, multivalent binding of RNA-binding proteins possessing n RNA-binding domains to an RNA with n binding sites (Figure 2A). To be able to calculate the avidity for the protein and its RNA substrate analytically, we need to make three simplifying assumptions. First, we assume that each RNA-binding domain can only bind to a single, cognate binding site on the RNA, so domain 1 to RNA site 1, domain 2 to RNA site 2, and so on. Second, we assume that an RNA is bound by at most a single protein. This is a good approximation as long as the local concentration of domains of the already bound protein at the RNA sites is much larger than the background protein concentration. When the linkers between binding sites on the RNA are short enough, typically up to about 20 nucleotides, the first-bound protein will outcompete all other proteins for binding to its RNA. Third, we assume that the RNA linker between motifs interacts neither with the proteins nor with other parts of the RNA.

Figure 2. — Thermodynamic model for cooperative RNA-protein interactions. (A) Illustration of an RNA with an RNA-binding protein binding to it. Each binding site on the RNA is bound by only one domain of the RBP. Each of these interactions has its individual K_a,i. (B–D) Reaction networks for one, two, and three binding sites on the RNA. Each system has 2ⁿ possible states. Every possible reaction step has an association constant equal to the individual K_a,i for the domain-to-RNA-site interaction multiplied by the concentration of the domain at its cognate site.

We denote binding configurations in this model by a binary string that indicates which sites are bound. For instance, 101 represents the configuration in which the first and third sites on the RNA are bound by the first and third domains of one protein.

Inter- and intramolecular reactions of first and second order

We have to consider two types of reactions. First, when the RNA and protein are not linked, all possible reactions are second order intermolecular reactions between one protein domain and its cognate RNA binding site. We call the association constant for this K_a,i (units of inverse molar, M⁻¹), where i is the index of the interacting domain and RNA site. These reactions only depend on the concentrations of free RNA, [0...0], and free protein, c (Figure 2B).

In the second case, where the protein is already bound to the RNA with at least one domain, new domains can bind in a first order intramolecular reaction and we can describe the unitless association constant for one binding step based on the law of mass action. For example, the reaction 10 ⇌ 11 (Figure 2A) depends on the local effective concentration c₁₂ of domain 2 (of the already bound protein) at RNA site 2. In a first, rough approximation, we can assume this concentration to be constant inside the volume accessible to RNA site 2 (19,20). The concentration is 1 divided by the accessible volume, a sphere with radius equal to the length l₁₂ of the RNA between sites 1 and 2 (Figure 3A): c₁₂ = 1/(4/3 π l₁₂³). (See below for a refinement of this estimate.) This is the same as c₂₁, the concentration of protein domain 1 at RNA site 1 when protein domain 2 is bound to RNA site 2.

Figure 3. — Effective concentration c₁₂ of domain 2 at RNA site 2, when at least one RNA site is already bound. (A) In the simplest approximation, the concentration is uniform inside the sphere of radius l₁₂ around the bound domain. l₁₂ is the RNA chain length between binding sites. (B) More realistically, when the RNA chain is treated as a ‘worm like chain’ the concentration c₁₂ has a Gaussian density (for large l₁₂). Its size depends on l₁₂ and the 3D distance d₁₂ between binding domains on the protein. (C) Comparison of the effective concentration in the simple case described in (A), with the more realistic worm-like chain in (B). The concentration decays much more slowly if modelled by a Gaussian. The distance between binding sites on the protein is d₁₂ = 3 nm.

The law of mass action for the reaction 10 ⇌ 11 reads

$$K_{a,2} = \frac{[11]}{[10]\, c_{12}}$$

and by rearranging we get for the association constant of the reaction 10 ⇌ 11,

$$K_{10 \rightarrow 11} = \frac{[11]}{[10]} = K_{a,2}\, c_{12} \qquad (1)$$

This means that all possible first order reaction steps have an apparent association constant (e.g. for the step 10 ⇌ 11) equal to the individual K_a,i for the domain-to-RNA-site interaction multiplied by the local concentration c_ij of the domain at its cognate site (Figure 2A).

Thermodynamic definition of the avidity K_av

We would like to calculate, for each concentration c of the RNA-binding proteins, what fraction of RNA molecules is bound by a protein. For a single binding domain and RNA binding site (n = 1), we can simply write the association constant of binding as K_a = [1]/([0] c), where [0] is the concentration of unbound RNA and [1] is the concentration of bound RNA (Figure 2B). When the RNA contains two binding sites and the protein contains two cognate domains (n = 2), we no longer have a single bound state but rather three: 10 (first site on RNA bound), 01 (second site bound), and 11 (both sites bound by protein) (Figure 2C). An association constant can only describe the equilibrium between two states. We therefore need a generalization of association constants to multistate systems. Following Kitov et al. (23), we can define the avidity (sometimes called ‘effective’, ‘apparent’, or ‘functional’ affinity or association constant) as the sum of the concentrations of all bound states divided by the concentrations of the two unbound species A and B:

$$K_\mathrm{av} = \frac{\sum_{s \,\in\, \text{bound states}} [s]}{[A]\,[B]} \qquad (2)$$

For instance, for the case of n = 2 RBDs per protein and two RNA binding sites per RNA, with [A] = [00] the concentration of free RNA and [B] = c the concentration of free protein, this gives us

$$K_\mathrm{av} = \frac{[10] + [01] + [11]}{[00]\, c} \qquad (3)$$

With the exception of Kitov et al. (23), the term avidity has so far mostly been used qualitatively to describe cooperativity in multivalent binding (16,26).

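By substituting all concentration terms in equation (3), using [10] = K_a,1 c [00], [01] = K_a,2 c [00] and [11] = K_a,2 c₁₂ [10], we can express the K_av for n = 2 in terms of the association constants K_a,i of the individual domain-to-RNA-site interactions:

$$K_\mathrm{av} = K_{a,1} + K_{a,2} + K_{a,1} K_{a,2}\, c_{12}$$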

The derivation for this has been shown elsewhere before. In the Supplementary Methods (Section 4) we derive the K_av for n = 2 for the alternative case, where the two domains have the same specificities such that each of them can bind to any of the two binding motifs on the RNA.

Mainly, however, we generalize the derivation to any number n. Detailed mathematical steps are shown in the Supplementary Methods (Section 1), while here, we focus on explaining the intuition behind the formulas. First, we need to write equation (2) for the reaction system with 2ⁿ states (shown in Figure 2B-D for one, two and three sites). By the same logic that leads to equation (1), we can substitute all concentration terms in equation (2). In the limiting case where the fully bound configuration dominates the partially bound states, that is, if K_{a,i − 1} c_{i − 1, i} ≫ 1 and c_{i − 1, i} K_a,i ≫ 1 for all i = 2, 3, …, n, we find that (Supplementary Methods, Section 1)

$$K_\mathrm{av} \approx K_{a,1} \prod_{i=2}^{n} K_{a,i}\, c_{i-1,i} \qquad (4)$$

Each added binding site approximately multiplies the avidity by a factor c_{i − 1, i}K_a,i. Intuitively, this is a consequence of the reaction path from the unbound state [0…0] to the fully bound state [1…1], for instance by flipping unbound sites to bound sites in the order from leftmost to rightmost site. The total K_a of such an n-step reaction (where the total reaction is the sum of individual steps), is the product of association constants of individual reaction steps.
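The approximation in equation (4) can be checked numerically by summing over all 2ⁿ − 1 bound configurations. The sketch below is a minimal illustration, not the implementation used for this work. It assumes that the statistical weight of a configuration is the product of the K_a,i of its bound sites and of one effective concentration per pair of consecutive bound sites, and it uses the simple uniform-sphere estimate for that concentration. The function names, the 10 μM K_d and the 11 nm linkers are illustrative choices.

```python
from itertools import product
from math import pi

AVOGADRO = 6.02214076e23  # molecules per mole


def sphere_conc(l_nm):
    """Molar concentration of one site in a sphere of radius l_nm (uniform model)."""
    volume_litre = 4.0 / 3.0 * pi * (l_nm * 1e-8) ** 3  # nm -> dm, dm^3 = L
    return 1.0 / (AVOGADRO * volume_litre)


def avidity_exact(k_a, linker_nm):
    """Sum the weights of all bound configurations (assumed weight rule, see text).

    k_a       -- association constants K_a,i in 1/M, one per domain/site pair
    linker_nm -- RNA contour lengths between neighbouring sites, len(k_a) - 1 values
    """
    total = 0.0
    for state in product((0, 1), repeat=len(k_a)):
        bound = [i for i, b in enumerate(state) if b]
        if not bound:
            continue  # skip the unbound state 0...0
        weight = 1.0
        for i in bound:
            weight *= k_a[i]
        # one effective concentration per pair of consecutive bound sites
        for a, b in zip(bound, bound[1:]):
            weight *= sphere_conc(sum(linker_nm[a:b]))
        total += weight
    return total


def avidity_approx(k_a, linker_nm):
    """Equation (4): only the fully bound configuration is kept."""
    result = k_a[0]
    for i in range(1, len(k_a)):
        result *= k_a[i] * sphere_conc(linker_nm[i - 1])
    return result


k_a = [1e5, 1e5, 1e5]   # K_d = 10 uM for every domain (illustrative)
linkers = [11.0, 11.0]  # 20 nt x 0.55 nm per nucleotide (illustrative)
ratio = avidity_exact(k_a, linkers) / avidity_approx(k_a, linkers)
print(f"exact / approximate avidity: {ratio:.2f}")
```

With these illustrative values each product K_a,i c_{i − 1, i} is about 30, so the fully bound configuration dominates and the exact sum exceeds the approximation by less than 10%.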

Effective concentrations using the worm-like chain model

The effective concentration c_ij of site j on the RNA at site j of the protein when site i is already bound was approximated above as the reciprocal of the accessible volume, 1/(4/3 π l_ij³), where l_ij is the chain length between binding sites i and j (20). This approximation neglects the entropy. The closer d_ij is to l_ij, the fewer spatial conformations are available to the linker. For a more accurate estimate, we use the worm-like chain model, a statistical mechanics description of semi-flexible polymers (27,28). Given a sufficient length l_ij, the local concentration c_ij has a Gaussian shape centered around site i (Figure 3B) (29). Its variance depends on l_ij and on the 3D distance d_ij between binding domains on the protein. The rigorous mathematical description of this case and of the second case, in which the protein has flexible linkers between domains that are allowed to move independently, is given in the Supplementary Methods (Sections 2 and 3, Figure S1).

When we consider the dependence of c_ij on the linker length l_ij, it is instructive to observe the difference between both models (Figure 3C). From a uniformly distributed concentration, one would expect the concentration enhancing effects of an RNA or protein linker to vanish much more quickly, compared to the worm-like chain model. According to this, cooperative binding can be observed even for RNAs with relatively long linkers between binding sites.
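The different decay of the two estimates can be illustrated numerically. The sketch below is not the formula from the Supplementary Methods: as an assumption, it uses the standard long-chain (Gaussian) limit of the worm-like chain, with mean squared end-to-end distance 2 l_p l, evaluated at the domain distance d. The function names are hypothetical. The values 0.55 nm per nucleotide, l_p = 2.7 nm and d = 3 nm are those quoted in this text.

```python
from math import exp, pi

AVOGADRO = 6.02214076e23  # molecules per mole
NM3_PER_LITRE = 1e24      # 1 L = 1e24 nm^3


def to_molar(density_per_nm3):
    """Convert a number density in 1/nm^3 into a molar concentration."""
    return density_per_nm3 * NM3_PER_LITRE / AVOGADRO


def c_sphere(l_nm):
    """Uniform model: one site in a sphere whose radius is the contour length."""
    return to_molar(1.0 / (4.0 / 3.0 * pi * l_nm ** 3))


def c_gauss(l_nm, d_nm, lp_nm=2.7):
    """Assumed Gaussian limit of the worm-like chain, valid only for l >> l_p."""
    mean_r2 = 2.0 * lp_nm * l_nm  # mean squared end-to-end distance
    density = (3.0 / (2.0 * pi * mean_r2)) ** 1.5 * exp(-3.0 * d_nm ** 2 / (2.0 * mean_r2))
    return to_molar(density)


for n_nt in (20, 50, 100, 200):
    l_nm = 0.55 * n_nt  # contour length of the RNA linker
    print(f"{n_nt:4d} nt  sphere {c_sphere(l_nm):.2e} M  gauss {c_gauss(l_nm, d_nm=3.0):.2e} M")
```

In this limit the uniform estimate falls off as l⁻³, whereas the Gaussian estimate falls off roughly as l^(−3/2), which is why it remains appreciable for longer linkers.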

Effect of different RNA motif densities

Consider a long RNA with N binding sites and proteins with n binding domains. We can estimate the avidity for proteins to bind the RNA in this special case by making additional simplifying assumptions. First, we assume that all binding domains bind to the same binding motifs, and we model the binding sites on the RNA with equal distances between them. Second, we assume that fully bound conformations with domains bound to adjacent binding motifs dominate the K_av (Eq. (4)). The number of binding conformations for an RNA with N motifs is then approximately N − n + 1 times higher than for an RNA with n motifs, because each of the conformations with all domains bound can be placed at N − n + 1 positions. Therefore, the avidity for the RNA with N binding sites is approximately

$$K_\mathrm{av}^{(N)} \approx (N - n + 1)\, K_{a,1} \prod_{i=2}^{n} K_{a,i}\, c_{i-1,i} \qquad (5)$$
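As a small numerical illustration of this estimate, the sketch below multiplies the fully bound term of equation (4) by the number of placements N − n + 1. The function name is hypothetical, and the effective concentration of 1 mM is an assumed value passed in by hand rather than derived from a linker length.

```python
def avidity_clustered(n_motifs, k_a, c_eff):
    """Approximate avidity for an RNA with n_motifs equally spaced motifs.

    k_a   -- association constants K_a,i (1/M) of the n domains
    c_eff -- effective concentration c_{i-1,i} (M), same for all neighbouring sites
    """
    n = len(k_a)
    if n_motifs < n:
        raise ValueError("the RNA needs at least as many motifs as the protein has domains")
    fully_bound = k_a[0]
    for k in k_a[1:]:
        fully_bound *= k * c_eff  # each further domain contributes K_a,i * c
    return (n_motifs - n + 1) * fully_bound  # number of placements on the RNA


k_a = [2e4] * 4  # four domains with K_d = 50 uM each
for n_motifs in (4, 6, 10):
    print(n_motifs, f"{avidity_clustered(n_motifs, k_a, c_eff=1e-3):.2e}")
```

At a fixed effective concentration the avidity grows only linearly with the number of motifs. The much steeper dependence on motif density arises because a denser arrangement also raises c_{i − 1, i} for every neighbouring pair.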

Simulation of cooperative binding with Gillespie algorithm

We cross-checked our analytical calculations described above with simulations using the Gillespie algorithm (30,31), implemented in the Python library GillesPy2 (32). We performed simulations of the model by defining all binding configurations as molecular entities in the simulation and determining the avidity based on trajectories of the simulated system (see Supplementary Methods, Section 5, for more details on how the simulations were set up).
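The idea behind such a cross-check can be sketched for the two-site case. The code below is a minimal pure-Python version of the Gillespie direct method for a single RNA molecule at constant free protein concentration. It does not use GillesPy2 and is not the setup described in the Supplementary Methods. The rate constants, the effective concentration and the function name are illustrative assumptions, and only one protein is allowed per RNA.

```python
import random


def simulate_two_site(k_a1, k_a2, c_eff, c, k_off=1.0, n_events=200_000, seed=0):
    """Estimate the avidity (1/M) from the time one RNA spends in state 00."""
    rng = random.Random(seed)
    k_on1, k_on2 = k_a1 * k_off, k_a2 * k_off  # on-rates chosen so that K_a = k_on / k_off
    # state -> (possible next states, corresponding rates in 1/s)
    transitions = {
        "00": (("10", "01"), (k_on1 * c, k_on2 * c)),  # intermolecular binding
        "10": (("00", "11"), (k_off, k_on2 * c_eff)),  # unbind, or intramolecular step
        "01": (("00", "11"), (k_off, k_on1 * c_eff)),
        "11": (("10", "01"), (k_off, k_off)),          # either domain dissociates
    }
    state, t_free, t_total = "00", 0.0, 0.0
    for _ in range(n_events):
        targets, rates = transitions[state]
        dwell = rng.expovariate(sum(rates))  # exponential waiting time in this state
        t_total += dwell
        if state == "00":
            t_free += dwell
        state = rng.choices(targets, weights=rates)[0]  # next state, proportional to rate
    p_bound = 1.0 - t_free / t_total
    return p_bound / ((1.0 - p_bound) * c)


k_av_sim = simulate_two_site(k_a1=1e5, k_a2=1e5, c_eff=3e-4, c=1e-6)
k_av_analytic = 1e5 + 1e5 + 1e5 * 1e5 * 3e-4  # sum over the three bound states
print(f"simulated {k_av_sim:.2e} 1/M, analytic {k_av_analytic:.2e} 1/M")
```

Because the avidity is the ratio of bound to unbound RNA divided by the protein concentration, the fraction of time a single RNA spends unbound is sufficient to estimate it. The simulated value fluctuates around the analytic one and converges as the number of events grows.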

Determining the model parameters

K_d values of individual binding domains are taken from experimental measurements such as electrophoretic mobility shift assays (EMSA) or isothermal titration calorimetry (ITC). Distances between binding sites on the protein are 3D Euclidean distances calculated from available PDB structures. The contour lengths of ssRNA linkers between binding sites and the lengths of flexible linkers between protein domains are estimated as the number of nucleotides or amino acids multiplied by a length per nucleotide of 5.5 Å (mean of five measurements) (33–37) or a length per amino acid of 3.8 Å (38), respectively. The persistence length l_p of ssRNA is estimated as 2.7 nm, the mean of five publications (33–37), and the mean persistence length for disordered proteins is 3.04 Å (38).

RESULTS

The model correctly estimates dissociation constants

To validate the new model, we analyzed seven multi-domain RBPs for which the K_d values of individual domains and the whole protein have been measured experimentally. We estimated the avidity for the full-length proteins using the dissociation constants of the individual domains and employing the analytical results outlined in the Supplementary Methods, section 1 (Figure 4). We cross-checked the calculations with simulations using the Gillespie algorithm.

Figure 4. — Measured avidities are in good agreement with model predictions. We found seven RBPs composed of two or three RBDs (or RBD pairs) for which dissociation constants of the full-length protein had been measured together with those of individual RBDs (4–11,39). We used the simple thermodynamic model to estimate the avidities of the full-length RBPs from those of their individual domains and from linker lengths l and protein binding site distances d and found agreement within a factor of ∼5. No free fitting parameters were used (see Supplementary Methods, section 6 for details). Orange triangles indicate the theoretical case of independent binding of the two or three domains (equivalent to an infinitely long RNA linker), calculated as the sum of K_a values of individual domains.

The proteins used are the zipcode binding protein 1 (ZBP1) (4), the heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) (5), the two terminal domains of the polypyrimidine tract binding protein (PTB) (6,7), the first four domains of the insulin-like growth factor 2 mRNA-binding protein 3 (IMP3 or IGF2BP3) (8), the first two KH2 domains of IMP1 (10), the U2 snRNP auxiliary factor (U2AF65) (11), and the K-homology splicing regulator protein (KSRP) (9,39) (see Supplementary Methods, section 6 for parameters used in the calculations). With the exception of IMP3 and KSRP, these proteins consist of two rigidly linked domains. In contrast, IMP3 consists of three domain pairs with flexible linkers between the pairs. In our model the first two of the three IMP3 domain pairs were represented as two binding sites, connected by a flexible linker. KSRP contains four KH-domains, with the middle two connected as a rigid unit. Measurements were done for the wild-type protein and for variants, in which mutations in the binding domains remove the ability to bind for that domain (see Supplementary Methods, section 7 for further assumptions we make, and predictions of the remaining measurements).

The measurements were done using fixed target RNA sequences. The affinity of full-length U2AF65 was measured for RNAs with three different linker lengths between the binding sites. This allows us to confirm the distance dependence in our model for the local concentration (Figure 3C). All predictions were within a factor of ∼5 of the experimental value, demonstrating the applicability of the model to multivalent, cooperative binding of RBDs to their RNA substrates.

Avidity increases exponentially with number of binding sites

We then asked how the avidities for RBPs depend on the number n of their RBDs (Figure 5A). We chose K_d values for RBDs and linker lengths in the ranges of typical RBPs. We observed an exponential increase in avidity with the number of binding sites by a factor K_a,i c_{i − 1, i} for each added domain (eq. (4)) (i.e. a shift in the concentration at half occupancy by the inverse of this factor). The local concentration of the RBDs, c_ij, depends on the linker length l between consecutive binding sites and the distance d between the consecutive RBDs, which determine the variance of the Gaussian concentration density (Figure 3B, Supplementary Methods, section 2). While the factor in real RBPs will depend on individual K_ds and distances between binding sites, the analysis shows that the inverse avidity can drop by orders of magnitude per domain added. So, the addition or removal of one domain—or one RNA binding site—can make the difference between binding and essentially no binding.

Figure 5. — Dependence of the avidity (effective association constant) of RBPs on the number of RBDs, their K_d’s, the effective local concentrations c_ij, and the binding site density on the RNA. (A) The inverse avidity decreases exponentially with the number of binding domains n, because each added binding site multiplies the avidity by ∼K_{a, i} times the local concentration c_{i − 1, i} of the free i’th RNA binding site at the site of the i’th free RBD (equation 4). The slope thus depends on the spacing of binding sites via c_{i − 1, i}. All RBD K_d’s were set to 10 μM and distances d between rigidly linked binding domains to 2 nm. (B) Individual RBDs contribute proportionally to the total avidity as long as their K_{d, i} is less than the local concentration (here shown for i = 3). The K_d for the first domain is K_{d, 1} = 10 μM, the K_ds for the second and third domain are varied as indicated. Distances between rigidly linked binding domains were 2 nm and the RNA linker length was 20 nt. (C) The inverse avidity decreases with the binding site density on the RNA. For this plot, we neglect non-sequential binding modes, which are much less populated than the sequential ones. K_ds of individual domains were 50 μM and the total RNA length was 200 nt. The horizontal line indicates a concentration of 0.1 μM used for the calculations in (D). (D) Binding probability of RBPs as a function of binding site density on the RNA at a concentration of 0.1 μM (horizontal line in (C)). Curves show fits with sigmoidal Hill functions, with Hill coefficients of h₁ = 0.99, h₂ = 2.35, h₃ = 4.01 and h₄ = 5.7 for one to four domains, respectively. Note the strongly cooperative, switch-like behaviour for n = 4 RBDs.

Contributions of individual domains to the avidity become negligible above a threshold in the individual K_d

To further investigate the effect of domain K_ds on the total avidity, we calculated the avidities for artificial RBPs with three domains, kept the K_d of the first domain constant and varied K_{d, 2} and K_{d, 3} (Figure 5B). As expected, the inverse avidity increases when the K_d of one individual domain is increased. According to equation (4), when K_{a, i} c_{i − 1, i} ≤ 1, or, equivalently, K_{d, i} ≥ c_{i − 1, i}, the contribution of domain i to the avidity quickly saturates (vertical line in Figure 5B), which was also concluded from experiments in (40). Only domains with a dissociation constant below the effective concentration contribute significantly to the avidity. As Figure 3C shows, this concentration can lie in the millimolar range.
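The origin of this threshold can be seen in the simplest case of two domains. The rearrangement below is a worked illustration for n = 2, obtained by summing the three bound states 10, 01 and 11 with their mass-action weights and writing K_{a,2} = 1/K_{d,2}. It is not a general proof for n domains.

```latex
K_\mathrm{av}
  = K_{a,1} + K_{a,2} + K_{a,1} K_{a,2}\, c_{12}
  = K_{a,1}\left(1 + \frac{c_{12}}{K_{d,2}}\right) + K_{a,2}

% Two limiting cases:
%   c_{12} \gg K_{d,2}:  K_\mathrm{av} \approx K_{a,1}\, c_{12} / K_{d,2}
%                        (avidity proportional to the affinity of domain 2)
%   c_{12} \ll K_{d,2}:  K_\mathrm{av} \approx K_{a,1} + K_{a,2}
%                        (independent binding, no cooperative gain)
```

The crossover between the two regimes lies at K_{d,2} ≈ c₁₂, which corresponds to the threshold described above.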

Protein binding can depend sensitively on the density of binding motifs on the RNA

The combination of multiple RNA-binding domains is important for providing the specificity needed to bind to the correct target RNAs (41). The density of binding sites on the RNA molecule is also an important determinant of binding affinity and specificity (42,43). To investigate this effect, we calculated the avidity and the binding probability (or relative occupancy) of RBPs as a function of the binding site density on the RNA based on equation (5) (Figure 5C and D). With increasing binding site density, the RNA linker length l between binding sites decreases, the standard deviation of the Gaussian density of the local concentration c_ij decreases, and the local concentration increases. Suppose the local concentration increases from c_ij to c′_ij. By equation (5), the avidity then increases by a factor c′_ij/c_ij for each domain added to the first, or (c′_ij/c_ij)^(n − 1) for the whole protein, as long as the approximation in equation (5) holds, that is, as long as K_{a, i} c_{i − 1, i} ≫ 1. Therefore, the inverse avidity decreases approximately exponentially with increasing binding site density (Figure 5C). With a growing number of domains, this results in lower threshold densities of the binding curves and increasingly switch-like binding behaviour (Figure 5D). To quantify the cooperativity of this transition, we fitted a sigmoidal Hill function 1/(1 + (D₀/D)^h) to the binding probability as a function of the binding site density D on the RNA. The Hill coefficient h, a common measure of cooperativity, grows somewhat faster than the number of domains (h₁ = 0.99, h₂ = 2.35, h₃ = 4.01 and h₄ = 5.7 for one to four domains, respectively).
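A Hill coefficient can also be estimated without a nonlinear fit. The sketch below uses the linearised Hill plot: for p = 1/(1 + (D₀/D)^h), the quantity log(p/(1 − p)) is a straight line in log D with slope h. This is offered as a simple alternative under that assumption, not as the fitting procedure used for Figure 5D. The function names and the synthetic data are illustrative.

```python
from math import log


def hill(density, d0, h):
    """Sigmoidal Hill function 1 / (1 + (D0 / D)^h)."""
    return 1.0 / (1.0 + (d0 / density) ** h)


def hill_coefficient(densities, occupancies):
    """Least-squares slope of log(p / (1 - p)) against log(D).

    Requires 0 < p < 1 for every occupancy value.
    """
    xs = [log(d) for d in densities]
    ys = [log(p / (1.0 - p)) for p in occupancies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


# Synthetic occupancies generated from a known Hill curve (illustrative values)
densities = [0.02, 0.03, 0.05, 0.08, 0.12]  # motifs per nucleotide
occupancies = [hill(d, d0=0.05, h=2.35) for d in densities]
print(f"recovered Hill coefficient: {hill_coefficient(densities, occupancies):.2f}")
```

For occupancy curves that are not exactly of Hill form, the slope depends on the density range chosen, so the range should be reported together with the coefficient.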

DISCUSSION

Thermodynamic model extends previous models of cooperative binding

Previous models treated cooperative binding for two binding sites. Crothers and Metzger developed a model to determine the avidity of the two binding sites of an antibody, estimating c_eff with the particle-in-a-sphere model (Figure 3A) and assuming that the second binding site is uniformly distributed inside a sphere with a radius of l around the first, already bound binding site (19). This model has been extended several times, taking into account different properties such as the chain length of the flexible linker between binding sites or domains, and has also been transferred to the context of RNA binding (20–22,24). All of these studies derive avidities for two domains. The results for n = 2 match our model, which describes binding for an arbitrary number of binding sites. Previous models can only describe a flexible linker between the binding sites on one binding partner. However, many RNA-binding proteins have flexible peptide linkers between their domains. We have therefore extended the model to include the possibility of flexible linkers in both binding partners.

Simplifying assumptions limit model accuracy

We describe a simple, idealized model system. Still, the model estimates of the avidity for the full-length proteins agree with the experimental measurements to within an order of magnitude (Figure 4). This supports the general validity of the model but also shows its limits as a predictive tool: its main value lies in the intuitive mechanistic insight it offers.

Various simplifying assumptions can potentially explain the deviations from measurements. Most notably, many linkers between RNA binding sites are very short. To estimate the effective local concentration c_eff, we use the assumption that the chain length is much larger than the persistence length (l_p, a measure of flexibility in the worm-like chain model) of the RNA, l ≫ l_p. If the chain length is shorter, the end-to-end distribution will no longer be an isotropic Gaussian but will depend on the initial tangent orientation of the bound end (29). It has been shown that only for chains that are sufficiently long relative to the persistence length does the distribution have a single maximum near the origin in the direction of the initial orientation, and that it approaches a Gaussian for longer chains (29). The chain lengths in the examples given earlier correspond to rather stiff chains. Depending on the orientation of the next binding site in relation to the first, the effective concentration and consequently the avidity can be over- or underestimated. To increase the accuracy of estimates for c_eff we would have to take into account other geometric properties of the protein in addition to the distance between binding sites. However, for short polymers, the analytical solution to the worm-like chain model becomes highly complex and the simplicity and intuition of the model would be lost.

In addition to short RNA linkers, RNA secondary structure and unspecific binding can decrease the accuracy of the predictions. Furthermore, the sequence of the RNA influences its flexibility. Many measurements of the persistence length of ssRNA have been done with repetitive sequences. Thus, for short chains the RNA sequence might have a stronger effect on estimations of RNA flexibility, while for longer chains this effect will most likely average out.

We describe two examples in which our simplifying assumptions are violated and our model fails to accurately predict the K_d of the full-length protein. For the two-domain protein TDP-43, which binds to UG-rich RNA, our model underestimates the K_d by more than an order of magnitude because the system violates two assumptions. First, since binding is measured against a (UG)₆-RNA, the RNA does not contain two well-defined binding sites, but instead a continuous interaction surface. Second, K_ds for the individual RRMs were only measured for (UG)₃- and (UG)₆-RNA and even vary across studies (44,45), and it is unclear whether they represent the true effective K_d,1 and K_d,2 in the complex. A second example is the binding of PTB (7) to different GABA RNA constructs (46). The RNAs are relatively long, and the lack of defined binding sites, the complex RNA secondary structure, and the possibility of multimerization of PTB, and thus the formation of complexes with stoichiometry other than 1:1, render our model inapplicable.

Disorder in RNA binding domains

We model two distinct situations with respect to the linkers between RBDs. In the first case, protein domains are rigidly linked and move together as a unit. In the second case, they are connected by a flexible linker and move independently, only restricted by the length of the linker. In reality, however, it is possible to observe situations in between these two extreme cases. Flexible protein linkers might either come in contact with the RNA, play a role in conformational changes of the two domains relative to each other, change their flexibility upon binding, or undergo a disorder-to-order transition (25,40). We do not expect these additional complexities to influence the general derivation of our model. Rather, all these situations require more complex calculations of the effective concentration c_eff, as the assumption of either completely independent or joint movement is violated.

Partial binding of the peptide linkers to the RNA after binding of one domain violates our model’s assumption of independent movement of the RNA and the unbound protein domain connected by the linker. A positive correlation could considerably increase the local concentration of the RNA binding motif at the second domain relative to our model’s estimate. In addition, the binding can result in a much reduced flexibility of the linkers. If the persistence length of RNA or peptide becomes too large, the distribution cannot be assumed isotropic, resulting in an increase or decrease of the effective local concentration of the RNA motif at the second RBD (see discussion above).

In addition to disordered linkers between domains from the same protein, intrinsically disordered regions can also lead to the association of RBDs from different proteins. This creates the possibility for cooperative binding in a similar way to what is described here. If two domains associate via their IDRs before binding to an RNA and this complex is stable on the timescale of RNA binding, the two domains can be treated in the same way as a two-domain protein, with a flexible linker between the domains. Increases in avidity are expected, whether RBDs are covalently linked or whether the effective number of domains is increased by dimerization or multimerization.
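As a rough illustration of why linking or associating two domains raises avidity, the following minimal sketch computes an overall K_d for a two-domain binder from a four-state partition function (unbound, domain 1 bound, domain 2 bound, both bound). It is a generic textbook-style calculation under stated assumptions, not a reproduction of the equations of this work: a single effective concentration c_eff is assumed for the second binding step, and the function name `two_domain_kd` and the example values (50 μM per domain, c_eff of 1 mM) are hypothetical.

```python
def two_domain_kd(kd1: float, kd2: float, c_eff: float) -> float:
    """Overall dissociation constant (M) of a two-domain binder.

    Assumed four-state model: the summed association constant is
        K_a = 1/kd1 + 1/kd2 + c_eff / (kd1 * kd2),
    where the last term is the doubly bound state. The overall K_d
    is 1 / K_a. All quantities are in molar units.
    """
    if kd1 <= 0 or kd2 <= 0:
        raise ValueError("dissociation constants must be positive")
    if c_eff < 0:
        raise ValueError("c_eff must be non-negative")
    return (kd1 * kd2) / (kd1 + kd2 + c_eff)


if __name__ == "__main__":
    kd_single = 50e-6  # illustrative single-domain K_d, 50 uM
    for c_eff in (0.0, 1e-4, 1e-3, 1e-2):  # illustrative c_eff values (M)
        kd_total = two_domain_kd(kd_single, kd_single, c_eff)
        print(f"c_eff = {c_eff:.0e} M -> overall K_d = {kd_total:.2e} M")
```

In this sketch, c_eff = 0 recovers two independent sites, and the overall K_d falls roughly in proportion to 1/c_eff once c_eff exceeds the single-domain K_ds. The same expression would apply to two domains held together by a stable IDR-mediated contact, provided that contact can be summarized by one c_eff.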

Multi-domain RBPs can distinguish sensitively between RNAs with different binding site densities

Analyses of high-throughput measurements of RNA binding affinities for 86 RNA-binding proteins by high-throughput RNA SELEX (47), 78 by RNA Bind-n-Seq (2), and 205 by RNAcompete (3) showed generally low enrichment factors of the most enriched motifs. Enriched motifs were short and degenerate for a substantial fraction of proteins, and motifs often consisted of short mono- or dinucleotide repeats (48). Our thermodynamic model of cooperative binding explains how such degenerate motifs, bound with relatively low affinities in the micro- to millimolar range, can yield highly selective binding to dense clusters of binding motifs, in which motif density determines the overall binding affinity as much as the affinity of individual motifs does. This underscores the need for bioinformatic methods that can learn ‘clustered motif’ binding models for multi-domain RBPs from high-throughput experiments.

Four RBDs result in a Hill-like coefficient of 3.3 for the dependence of avidity on motif density, corresponding to a roughly 10-fold change in avidity for a twofold change in motif density (Figure 5D). It is easy to imagine how homodimerization and homomultimerization of RBDs can increase the effective number of RBDs to much higher numbers, particularly in liquid phases enriching for certain RBPs (next subsection). Such homo-oligomer assemblies can bind target RNAs carrying a corresponding number of binding sites with exquisite specificity and affinity.

As an example, a study of the function of Nrd1/Nab3 heterodimers in recognizing and degrading antisense transcripts in yeast found that a merely 1.5-fold higher density of Nrd1 and Nab3 binding sites on antisense versus sense transcripts seems sufficient to selectively degrade antisense transcripts (43). It was later observed that, while the Nrd1/Nab3 dimer contains only two RNA-binding domains, both proteins contain disordered regions prone to form aggregates or even liquid droplet phases, and that association of Nrd1/Nab3 via these disordered regions leads to their polymerization or aggregation in concert with binding to their target RNA (49,50). The high effective number of binding domains in the formed polymers could therefore explain how high Hill coefficients can be realized (Figure 5D). Similarly, selective inhibition of polyadenylation of U1A mRNA over other mRNAs by U1A depends on the presence of two binding sites on the RNA with correct spacing, which allows two interacting U1A molecules to bind (51).
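To make the link between a Hill-like coefficient and sensitivity to motif density concrete, the following minimal sketch assumes that avidity scales locally as a power law in motif density, avidity ∝ density^h, with h playing the role of the Hill-like coefficient. This power-law form, the function names, and the example values of h are illustrative assumptions, not results taken from the model.

```python
import math


def avidity_fold_change(density_ratio: float, hill_coefficient: float) -> float:
    """Fold change in avidity for a given fold change in motif density.

    Assumes a local power law: avidity ~ density ** hill_coefficient.
    """
    if density_ratio <= 0:
        raise ValueError("density_ratio must be positive")
    return density_ratio ** hill_coefficient


def hill_coefficient_from_fold_changes(density_ratio: float,
                                       avidity_ratio: float) -> float:
    """Hill-like coefficient implied by paired fold changes (log-log slope)."""
    if density_ratio <= 0 or avidity_ratio <= 0 or density_ratio == 1:
        raise ValueError("ratios must be positive and density_ratio != 1")
    return math.log(avidity_ratio) / math.log(density_ratio)


if __name__ == "__main__":
    # A 1.5-fold density difference, as in the Nrd1/Nab3 example,
    # evaluated for a few hypothetical Hill-like coefficients.
    for h in (1, 2, 4, 8):
        print(f"h = {h}: avidity changes {avidity_fold_change(1.5, h):.2f}-fold")
```

Under this assumption, a modest difference in density only becomes a large difference in avidity when h is well above 1, which is why a high effective number of binding domains matters for discriminating sense from antisense transcripts.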

Figure 5C demonstrates that four RNA-binding domains achieve an avidity of around (2 nM)⁻¹ when each of the domains has a very modest single-domain RNA-binding affinity of (50 μM)⁻¹. This might be the reason why RBPs rarely contain more than four RNA-binding domains: the resulting avidities would simply be higher than what is needed in the cell.

Some motifs on the RNA consist of mono- or dinucleotide repeats, creating the possibility for multiple binding registers in one RNA motif (15,16). This can be seen, for example, in the HuR C-terminal RRM binding to AU-rich RNA regions (52) and also in PTB, one of our examples, which binds to polypyrimidine tracts (53). When the repeat regions are long enough, the protein domains can bind in more than one arrangement. The effect of N binding registers within one RNA motif on the affinity of an individual domain can be estimated through a simple statistical consideration: the K_d is divided by a factor of N (equation (5) can be applied here).
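The statistical correction for binding registers can be sketched in a few lines. The division of the K_d by N follows the text; the helper `count_registers`, which counts registers from a tract length, a domain footprint and a repeat period, is a hypothetical simplification introduced only for illustration, as are the example numbers.

```python
def count_registers(tract_length: int, footprint: int, period: int = 1) -> int:
    """Number of equivalent binding registers in a repeat tract.

    Hypothetical helper: assumes the domain covers `footprint` nucleotides
    and can start at every `period`-th position of a perfect repeat.
    """
    if footprint < 1 or period < 1:
        raise ValueError("footprint and period must be at least 1")
    if tract_length < footprint:
        return 0
    return (tract_length - footprint) // period + 1


def kd_with_registers(kd_single_register: float, n_registers: int) -> float:
    """Effective K_d of one domain for a motif offering N equivalent registers."""
    if n_registers < 1:
        raise ValueError("at least one binding register is required")
    return kd_single_register / n_registers


if __name__ == "__main__":
    # Illustrative values: a 12-nt dinucleotide repeat, a 4-nt footprint,
    # and a single-register K_d of 50 uM.
    n = count_registers(tract_length=12, footprint=4, period=2)
    print(n, kd_with_registers(50e-6, n))
```

The correction is linear in N, so on its own it yields only a modest gain per domain compared with the gains from adding further domains.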

The concept of ‘fuzziness’ describes the more general situation when every RNA binding site can at least to some degree bind to every protein domain (54). We calculate this effect in our model for two binding sites (Supplementary Methods, section 4). Including fuzzy binding in the calculations increases the number of possible bound configurations and thus the complexity of the combinatorics. However, it does not qualitatively change the results that we present here.

Multi-domain RNA-binding can promote phase separation

Phase-separated biological droplets/condensates, which function to concentrate and organize molecules inside the cell, form via multivalent networks of interactions (55). These multivalent interactions can arise from weak interactions between intrinsically disordered regions of the proteins and/or from multivalency through multiple connected domains (25,56). Many stages of RNA metabolism also involve phase separation (56–58), in which RNAs form condensates together with RNA-binding proteins (59). The same cooperativity that enables the formation of phase-separated condensates visible under a light microscope will also enable the formation of condensates or aggregates of RNAs and RNA-binding proteins on a nanoscale (60), containing only tens to thousands of molecules, perhaps even a single RNA (61,62). Within these aggregates, as well as within true condensates, the concentration of RNA-binding proteins and RNA is much higher than in the cytosol, and therefore even low-affinity binding sites on the RNA can become saturated. We suggest that this type of cooperativity is often amplified by the one we investigate here, involving multiple domains within one protein complex (56). A better quantitative understanding of it could help to give insights into the formation of RNA-protein aggregates and phase-separated condensates.
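The saturation of low-affinity sites at elevated concentration can be illustrated with the standard single-site binding isotherm, occupancy = c / (c + K_d). The sketch below assumes a single independent site and a known free protein concentration; the K_d of 50 μM and the concentrations are illustrative values, not measurements.

```python
def fraction_bound(free_protein_conc: float, kd: float) -> float:
    """Equilibrium occupancy of a single site: c / (c + K_d).

    Both arguments must use the same concentration unit.
    """
    if free_protein_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and K_d must be > 0")
    return free_protein_conc / (free_protein_conc + kd)


if __name__ == "__main__":
    kd = 50.0  # illustrative low-affinity site, in uM
    # Hypothetical free protein concentrations (uM): dilute versus enriched.
    for conc in (0.5, 5.0, 50.0, 500.0):
        print(f"{conc:7.1f} uM -> occupancy {fraction_bound(conc, kd):.3f}")
```

Occupancy reaches one half when the concentration equals the K_d, so a site that is almost empty at dilute concentrations can be largely occupied once the local concentration rises well above its K_d.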

Cooperative binding plays a role in other biomolecular interactions

While we focused on RNA-binding proteins in this work, the general concept described here is applicable to many other types of interactions. Most closely related might be DNA-binding proteins, which employ multi-domain binding in a similar way to RBPs (24,63). The first quantitative treatment of cooperative multivalent binding was applied to antibodies binding to antigens (19). Another application of the presented model could be the binding of proteins to intrinsically disordered regions in proteins (54,64). The same concept of exploiting multivalent binding to maximize avidity is used in fragment-based drug discovery (65,66) and in the development of small-molecule inhibitors for RNAs (67).

CONCLUSION

The simple thermodynamic model for RNA binding of multi-domain RBPs shows how cooperative binding of their domains can lead to very high specificity and avidity with RBDs that, alone, have low specificity and affinity. The actual binding motifs of multi-domain RBPs should be considered to be clusters of simple binding motifs, in which the total avidity is determined not only by the affinities of individual motifs but to a large extent by their number and density. A single additional site can change the avidity by two orders of magnitude (Figure 5C), and a twofold change in motif density can change avidity by a factor of 10 (Figure 5D).

DATA AVAILABILITY

The code for the simulations and all calculations is available at https://github.com/soedinglab/cooperative_rbp (permanent DOI: 10.5281/zenodo.7963695). The protein structures used in the validation of the model are available under the PDB accession codes 2n8l, 6dcl, 2adc, 6fq1, 6gqe, 6qey and 2jvz.

Supplementary Material

lqad057_Supplemental_File

ACKNOWLEDGEMENTS

Author contributions: S.H.S. implemented the algorithms and conducted all the analysis. J.S. conceptualized the idea. S.S.J. and J.S. supervised research. S.H.S., S.S.J. and J.S. wrote the manuscript.

Contributor Information

Simon H Stitzinger, Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany.

Salma Sohrabi-Jahromi, Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany.

Johannes Söding, Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany; Campus-Institut Data Science (CIDAS), Goldschmidtstrasse 1, 37077 Göttingen, Germany.

SUPPLEMENTARY DATA

Supplementary Data are available at NARGAB Online.

FUNDING

Focus program SPP2191 of the Deutsche Forschungsgemeinschaft. Funding for open access charge: Forschungsgemeinschaft [SPP2191].

Conflict of interest statement. None declared.

REFERENCES

1. Dreyfuss G., Kim V.N., Kataoka N. Messenger-RNA-binding proteins and the messages they carry. Nat. Rev. Mol. Cell. Biol. 2002; 3:195–205.
2. Dominguez D., Freese P., Alexis M.S., Su A., Hochman M., Palden T., Bazile C., Lambert N.J., Van Nostrand E.L., Pratt G.A. et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell. 2018; 70:854–867.
3. Ray D., Kazan H., Cook K.B., Weirauch M.T., Najafabadi H.S., Li X., Gueroussov S., Albu M., Zheng H., Yang A. et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature. 2013; 499:172–177.
4. Nicastro G., Candel A.M., Uhl M., Oregioni A., Hollingworth D., Martin S.R., Ramos A. Mechanism of β-actin mRNA Recognition by ZBP1. Cell Rep. 2017; 31:1187–1199.
5. Kooshapur H., Choudhury N.R., Simon B., Mühlbauer M., Jussopow A., Fernandez N., Jones A.N., Dallmann A., Gabel F., Camilloni C. et al. Structural basis for terminal loop recognition and stimulation of pri-miRNA-18a processing by hnRNP A1. Nat. Commun. 2018; 9:2479.
6. Lamichhane R., Daubner G.M., Thomas-Crusells J., Auweter S.D., Manatschal C., Austin K.S., Valniuk O., Allain F. H.-T., Rueda D. RNA looping by PTB: evidence using FRET and NMR spectroscopy for a role in splicing repression. Proc. Natl. Acad. Sci. U.S.A. 2010; 107:4105–4110.
7. Oberstrass F.C., Auweter S.D., Erat M., Y. H., Henning A., Wenter P., Reymond L., Amir-Ahmady B., Pitsch S., Black D.L. et al. Structure of PTB bound to RNA: specific Binding and Implications for Splicing Regulation. Science. 2005; 309:2054–2057.
8. Schneider T., Hung L.-H., Aziz M., Wilmen A., Thaum S., Wagner J., Janowski R., Müller S., Schreiner S., Friedhoff P. et al. Combinatorial recognition of clustered RNA elements by the multidomain RNA-binding protein IMP3. Nat. Commun. 2019; 10:2266.
9. García-Mayoral M.F., Díaz-Moreno I., Hollingworth D., Ramos A. The sequence selectivity of KSRP explains its flexibility in the recognition of the RNA targets. Nucleic Acids Res. 2008; 36:5290–5296.
10. Dagi R., Ball N.J., Ogrodowicz R.W., Hobor F., Purkiss A.G., Kelly G., Martin S.R., Taylor I.A., Ramos A. IMP1 KH1 and KH2 domains create a structural platform with unique RNA recognition and re-modelling properties. Nucleic Acids Res. 2019; 47:4334–4348.
11. Mackereth C.D., Madl T., Bonnal S., Simon B., Zanier K., Gasch A., Rybin V., Valcárcel J., Sattler M. Multi-domain conformational selection underlies pre-mRNA splicing regulation by U2AF. Nature. 2011; 475:408–411.
12. Jolma A., Yan J., Whitington T., Toivonen J., Ritta K.R., Rastas P., Morgunova E., Enge M., Taipale M., Wei G. et al. DNA-Binding Specificities of Human Transcription Factors. Cell. 2013; 152:327–339.
13. Franco-Zorilla J.M., López-Vidriero I., Carrasco J.L., Godoy M., Vera P., Solano R. DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci. U.S.A. 2014; 111:2367–2372.
14. Zhu C., Byers K.J., McCord R.P., Shi Z., Berger M.F., Newburger D.E., Saulrieta K., Smith Z., Shah M.V., Radhakrishnan M. et al. High-resolution DNA-binding specificity analysis of yeast transcription factors. Genome Res. 2009; 19:556–566.
15. Mackereth C.D., Sattler M. Dynamics in multi-domain protein recognition of RNA. Curr. Opin. Struct. Biol. 2012; 22:287–296.
16. Helder S., Blythe A.J., Bond C.S., Mackay J.P. Determinants of affinity and specificity in RNA-binding proteins. Curr. Opin. Struct. Biol. 2012; 38:83–91.
17. Cook K.B., Kazan H., Zuberi K., Morris Q., Hughes T.R. RBPDB: a database of RNA-binding specificities. Nucleic Acids Res. 2010; 39:D301–D308.
18. Krissinel E., Henrick K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 2007; 372:774–797.
19. Crothers D.M., Metzger H. The influence of polyvalency on the binding properties of antibodies. Immunochemistry. 1972; 9:341–357.
20. Shamoo Y., Abdul-Manan N., Williams K.R. Multiple RNA binding domains (RBDs) just don’t add up. Nucleic Acids Res. 1995; 23:725–728.
21. Zhou H.-X. Quantitative Account of the Enhanced Affinity of Two Linked scFvs Specific for Different Epitopes on the Same Antigen. J. Mol. Biol. 2003; 329:1–8.
22. Bobrovnik S.A. The influence of rigid or flexible linkage between two ligands on the effective affinity and avidity for reversible interactions with bivalent receptors. J. Mol. Recognit. 2007; 20:253–262.
23. Kitov P.I., Bundle D.R. On the nature of the multivalency effect: a thermodynamic model. J. Am. Chem. Soc. 2003; 125:16271–16284.
24. Zhou H.-X. The affinity-enhancing roles of flexible linkers in two-domain DNA-binding proteins. Biochemistry. 2001; 40:15069–15073.
25. Ottoz D. S.M., Berchowitz L.E. The role of disorder in RNA binding affinity and specificity. Open Biol. 2020; 10:200328.
26. Klein J.S., Bjorkman P.J. Few and far between: how HIV may be evading antibody avidity. PLoS Pathog. 2010; 6:e1000908.
27. Kratky O., Porod G. Röntgenuntersuchungen gelöster Fadenmoleküle. Recl. Trav. Chim. Pays-Bas. 1949; 68:1106–1122.
28. Saito N., Takahashi K., Yunoki Y. The statistical mechanical theory of stiff chains. J. Phys. Soc. Japan. 1967; 22:219–226.
29. Spakowitz A.J., Wang Z.-G. End-to-End distance vector distribution with fixed end orientations for the wormlike chain model. Phys. Rev. E. 2005; 72:041802.
30. Gillespie D.T. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 1976; 22:403–434.
31. Gillespie D.T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 1977; 81:2340–2361.
32. Abel J.H., Drawert B., Hellander A., Petzold L.R. GillesPy: a Python Package for Stochastic Model Building and Simulation. IEEE Life Sci. Lett. 2016; 2:35–38.
33. Chi Q., Wang G., Jiang J. The persistence length and length per base of single-stranded DNA obtained from fluorescence correlation spectroscopy measurements using mean field theory. Physica A. 2013; 393:1072–1079.
34. Chen H., Meisburger S.P., Pabit S.A., Sutton J.L., Webb W.W., Pollack L. Ionic strength-dependent persistence lengths of single-stranded RNA and DNA. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:799–804.
35. Laurence T.A., Kong X., Jäger M., Weiss S. Probing structural heterogeneities and fluctuations of nucleic acids and denatured proteins. Proc. Natl. Acad. Sci. U.S.A. 2005; 48:17348–17353.
36. Murphy M.C., Rasnik I., Cheng W., Lohman T.M., Ha T. Probing single-stranded DNA conformational flexibility using fluorescence spectroscopy. Biophys. J. 2004; 86:2530–2537.
37. Mills J., Vacano E., Hagerman P.J. Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of Poly(dT) and Poly(dA). J. Mol. Biol. 1999; 285:245–257.
38. Zhou H.-X. Loops in Proteins Can Be Modeled as Worm-Like Chain. J. Phys. Chem. B. 2001; 105:6763–6766.
39. Hollingworth D., Candel A.M., Nicastro G., Martin S.R., Briata P., Gherzi R., Ramos A. KH domains with impaired nucleic acid binding as a tool for functional analysis. Nucleic Acids Res. 2012; 40:6873–6886.
40. Sørensen C.S., Jendroszek A., Kjaergaard M. Linker dependence of avidity in multivalent interactions between disordered proteins. J. Mol. Biol. 2019; 431:4784–4795.
41. Lunde B.M., Moore C., Varani G. Modular design for efficient function. Nat. Rev. Mol. Cell Biol. 2007; 8:470–490.
42. Danilenko M., Dalgliesh C., Pagliarini V., Naro C., Ehrmann I., Feracci M., Kheirollahi-Chadegani M., Tyson-Capper A., Clowry G.J., Fort P. et al. Binding site density enables paralog-specific activity of SLM2 and Sam68 proteins in Neurexin2 AS4 splicing control. Nucleic Acids Res. 2017; 45:4120–4130.
43. Schulz D., Schwalb B., Kiesel A., Baejen C., Torkler P., Gagneur J., Soeding J., Cramer P. Transcriptome surveillance by selective termination of noncoding RNA synthesis. Cell. 2013; 155:1057–1087.
44. Kuo P.-H., Doudeva L.G., Wang Y.-T., Shen C.-K.J., Yuan H.S. Structural insights into TDP-43 in nucleic-acid binding and domain interactions. Nucleic Acids Res. 2009; 37:1799–1808.
45. Mackness B.C., Tran M.T., P. M.S., Matthews C.R., Zitzewitz J.A. Folding of the RNA recognition motif (RRM) domains of the amyotrophic lateral sclerosis (ALS)-linked protein TDP-43 reveals an intermediate state. J. Biol. Chem. 2014; 289:8264–8276.
46. Clerte C., Hall K.B. Characterization of multimeric complexes formed by the human PTB1 protein on RNA. RNA. 2006; 12:457–475.
47. Jolma A., Zhang J., Mondragón E., Morgunova E., Kivioja T., Laverty K.U., Yin Y., Zhu F., Bourenkov G., Morris Q. et al. Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences. Genome Res. 2020; 30:962–973.
48. Sohrabi-Jahromi S., Söding J. Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins. Bioinformatics. 2021; 37:i308–i316.
49. O’Rourke T.W., Loya T.J., Head P.E., Horton J.R., Reines D. Amyloid-like assembly of the low complexity domain of yeast Nab3. Prion. 2015; 9:34–47.
50. Loya T.J., O’Rourke T.W., Degtyareva N., Reines D. A network of interdependent molecular interactions describes a higher order Nrd1-Nab3 complex involved in yeast transcription termination. J. Biol. Chem. 2013; 288:34158–34167.
51. Varani L., Gunderson S.I., Mattaj I.W., Kay L.E., Neuhaus D., Varani G. The NMR structure of the 38 kDa U1A protein-PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nat. Struct. Mol. Biol. 2000; 7:329–335.
52. Ripin N., Boudet J., Duszczyk M.M., Hinniger A., Faller M., Krepl M., Gadi A., Schneider R.J., Šponer J., Meisner-Kober N.C. et al. Molecular basis for AU-rich element recognition and dimerization by the HuR C-terminal RRM. Proc. Natl. Acad. Sci. U.S.A. 2019; 116:2935–2944.
53. Auweter S.D., Oberstrass F.D., Allain F. H.-T. Solving the structure of PTB in complex with pyrimidine tracts: an NMR study of protein-RNA complexes of weak affinities. J. Mol. Biol. 2006; 367:174–186.
54. Olsen J.G., Teilum K., Kragelund B.B. Behaviour of intrinsically disordered proteins in protein-protein complexes with an emphasis on fuzziness. Cell. Mol. Life Sci. 2017; 74:3175–3183.
55. Banani S.F., Lee H.O., Hyman A.A., Rosen M.K. Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell. Biol. 2017; 18:285–298.
56. Sabari B.S., Dall’Agnese A., Young R.A. Biomolecular condensates in the nucleus. Trends Biochem. Sci. 2020; 45:961–977.
57. Fay M.M., Anderson P.J. The role of RNA in biological phase separations. J. Mol. Biol. 2018; 430:4685–4701.
58. Lin Y., Protter D. S.W., Rosen M.K., Parker R. Formation and maturation of phase-separated liquid droplets by RNA-binding proteins. Mol. Cell. 2015; 60:208–219.
59. Rhine K., Vidaurre V., Myong S. RNA droplets. Annu. Rev. Biophys. 2020; 49:247–265.
60. Lyon A.S., Peeples W.B., Rosen M.K. A framework for understanding the functions of biomolecular condensates across scales. Nat. Rev. Mol. Cell Biol. 2021; 22:215–235.
61. Mateu-Regue A., Christiansen J., Bagger F.O., Winther O., Hellriegel C., Nielsen F.C. Single mRNP analysis reveals that small cytoplasmic mRNP granules represent mRNA singletons. Cell Rep. 2019; 29:736–748.
62. Söding J., Zwicker D., Sohrabi-Jahromi S., Boehning M., Kirschbaum J. Mechanisms for active regulation of biomolecular condensates. Trends Cell Biol. 2020; 30:4–14.
63. Vuzman D., Polonsky M., Levy Y. Facilitated DNA search by multidomain transcription factors: cross talk via a flexible linker. Biophys. J. 2010; 99:1202–1211.
64. Arbesú M., Iruela G., Fuentes H., Teixeira J. M.C., Pons M. Intramolecular Fuzzy Interactions Involving Intrinsically Disordered Domains. Front. Mol. Biosci. 2018; 5:39.
65. Krishnamurthy V.M., Estroff L.A., Whitesides G.M. Multivalency in ligand design. 2006; John Wiley and Sons, Ltd; 11–53.
66. Tjandra K.C., Thordarson P. Multivalency in drug delivery – when it is too much of a good thing? Bioconjugate Chem. 2019; 30:503–514.
67. Disney M.D., Lee M.M., Pushechnikov A., Childs-Disney J.L. The role of flexibility in the rational design of modularly assembled ligands targeting the RNAs that cause the myotonic dystrophies. ChemBioChem. 2010; 11:375–382.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

lqad057_Supplemental_File

Click here for additional data file.^{(325.3KB, pdf)}

Data Availability Statement

[B1] 1. Dreyfuss G., Kim V.N., Kataoka N.. Messenger-RNA-binding proteins and the messages they carry. Nat. Rev. Mol. Cell. Biol. 2002; 3:195–205. [DOI] [PubMed] [Google Scholar]

[B2] 2. Dominguez D., Freese P., Alexis M.S., Su A., Hochman M., Palden T., Bazile C., Lambert N.J., Van Nostrand E.L., Pratt G.A.et al.. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell. 2018; 70:854–867. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B3] 3. Ray D., Kazan H., Cook K.B., Weirauch M.T., Najafabadi H.S., Li X., Gueroussov S., Albu M., Zheng H., Yang A.et al.. A compendium of RNA-binding motifs for decoding gene regulation. Nature. 2013; 499:172–177. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B4] 4. Nicastro G., Candel A.M., Uhl M., Oregioni A., Hollingworth D., Martin S.R., Ramos A.. Mechanism of β-actin mRNA Recognition by ZBP1. Cell Rep. 2017; 31:1187–1199. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B5] 5. Kooshapur H., Choudhury N.R., Simon B., Mühlbauer M., Jussopow A., Fernandez N., Jones A.N., Dallmann A., Gabel F., Camilloni C.et al.. Structural basis for terminal loop recognition and stimulation of pri-miRNA-18a processing by hnRNP A1. Nat. Commun. 2018; 9:2479. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B6] 6. Lamichhane R., Daubner G.M., Thomas-Crusells J., Auweter S.D., Manatschal C., Austin K.S., Valniuk O., Allain F. H.-T., Rueda D.. RNA looping by PTB: evidence using FRET and NMR spectroscopy for a role in splicing repression. Proc. Natl. Acad. Sci. U.S.A. 2010; 107:4105–4110. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B7] 7. Oberstrass F.C., Auweter S.D., Erat M., Y. H., Henning A., Wenter P., Reymond L., Amir-Ahmady B., Pitsch S., Black D.L.et al.. Structure of PTB bound to RNA: specific Binding and Implications for Splicing Regulation. Science. 2005; 309:2054–2057. [DOI] [PubMed] [Google Scholar]

[B8] 8. Schneider T., Hung L.-H., Aziz M., Wilmen A., Thaum S., Wagner J., Janowski R., Müller S., Schreiner S., Friedhoff P.et al.. Combinatorial recognition of clustered RNA elements by the multidomain RNA-binding protein IMP3. Nat. Commun. 2019; 10:2266. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B9] 9. García-Mayoral M.F., Díaz-Moreno I., Hollingworth D., Ramos A.. The sequence selectivity of KSRP explains its flexibility in the recognition of the RNA targets. Nucleic Acids Res. 2008; 36:5290–5296. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10] 10. Dagi R., Ball N.J., Ogrodowicz R.W., Hobor F., Purkiss A.G., Kelly G., Martin S.R., Taylor I.A., Ramos A.. IMP1 KH1 and KH2 domains create a structural platform with unique RNA recognition and re-modelling properties. Nucleic Acids Res. 2019; 47:4334–4348. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B11] 11. Mackereth C.D., Madl T., Bonnal S., Simon B., Zanier K., Gasch A., Rybin V., Valcárcel J., Sattler M.. Mutli-domain conformational selection underlies pre-mRNA splicing regulation by U2AF. Nature. 2011; 475:408–411. [DOI] [PubMed] [Google Scholar]

[B12] 12. Jolma A., Yan J., Whitington T., Toivonen J., Ritta K.R., Rastas P., Morgunova E., Enge M., Taipale M., Wei G.et al.. DNA-Binding Specificities of Human Transcription Factors. Cell. 2013; 152:327–339. [DOI] [PubMed] [Google Scholar]

[B13] 13. Franco-Zorilla J.M., López-Vidriero I., Carrasco J.L., Godoy M., Vera P., Solano R.. DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci. U.S.A. 2014; 111:2367–2372. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B14] 14. Zhu C., Byers K.J., McCord R.P., Shi Z., Berger M.F., Newburger D.E., Saulrieta K., Smith Z., Shah M.V., Radhakrishnan M.et al.. High-resolution DNA-binding specificity analysis of yeast transcription factors. Genome Res. 2009; 19:556–566. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B15] 15. Mackereth C.D., Sattler M.. Dynamics in multi-domain protein recognition of RNA. Curr. Opin. Struct. Biol. 2012; 22:287–296. [DOI] [PubMed] [Google Scholar]

[B16] 16. Helder S., Blythe A.J., Bond C.S., Mackay J.P.. Determinants of affinity and specificity in RNA-binding proteins. Curr. Opin. Struct. Biol. 2012; 38:83–91. [DOI] [PubMed] [Google Scholar]

[B17] 17. Cook K.B., Kazan H., Zuberi K., Morris Q., Hughes T.R.. RBPDB: a database of RNA-binding specificities. Nucleic Acids Res. 2010; 39:D301–D308. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B18] 18. Krissinel E., Henrick K.. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 2007; 372:774–797. [DOI] [PubMed] [Google Scholar]

[B19] 19. Crothers D.M., Metzger H.. The influece of polyvalency on the binding properties of antibodies. Immunochemistry. 1972; 9:341–357. [DOI] [PubMed] [Google Scholar]

[B20] 20. Shamoo Y., Abdul-Manan N., Williams K.R.. Multiple RNA binding domains (RBDs) just don’t add up. Nucleic Acids Res. 1995; 23:725–728. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B21] 21. Zhou H.-X. Quantative Account of the Enhanced Affinity of Two Linked scFvs Specific for Different Epitopes on the Same Antigen. J. Mol. Biol. 2003; 329:1–8. [DOI] [PubMed] [Google Scholar]

[B22] 22. Bobrovnik S.A. The influence of rigid or flexible linkage between two ligands on the effective affinity and avidity for reversible interactions with bivalent receptors. J. Mol. Recognit. 2007; 20:253–262. [DOI] [PubMed] [Google Scholar]

[B23] 23. Kitov P.I., Bundle D.R.. One the nature of the multivalency effect: a thermodynamic model. J. Am. Chem. Soc. 2003; 125:16271–16284. [DOI] [PubMed] [Google Scholar]

[B24] 24. Zhou H.-X. The affinity-enhancing roles of flexible linkers in two-domain DNA-binding proteins. Biochemistry. 2001; 40:15069–15073. [DOI] [PubMed] [Google Scholar]

[B25] 25. Ottoz D. S.M., Berchowitz L.E.. The role of disorder in RNA binding affinity and specificity. Open Biol. 2020; 10:200328. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B26] 26. Klein J.S., Bjorkman P.J.. Few and far between: how HIV may be evading antibody avidity. PLoS Phathog. 2010; 6:e1000908. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B27] 27. Kratky O., Porod G.. Rötgenuntersuchungen gelöster Fadenmoleküle. Recl. Trav. Chim. Pays-Bas. 1949; 68:1106–1122. [Google Scholar]

[B28] 28. Saito N., Takahashi K., Yunoki Y.. The statistical mechanical theory of stiff chains. J. Phys.Soc. Japan. 1967; 22:219–226. [Google Scholar]

[B29] 29. Spakowitz A.J., Wang Z.-G.. End-to-End distance vector distribution with fixed end orientations for the wormlike chain model. Phys. Rev. E. 2005; 72:041802. [DOI] [PubMed] [Google Scholar]

[B30] 30. Gillespie D.T. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 1976; 22:403–434. [Google Scholar]

[B31] 31. Gillespie D.T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 1977; 81:2340–2361. [Google Scholar]

[B32] 32. Abel J.H., Drawert B., Hellander A., Petzold L.R.. GillesPy: a Python Package for Stochastic Model Building and Simulation. IEEE Life Sci. Lett. 2016; 2:35–38. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B33] 33. Chi Q., Wang G., Jiang J.. The persistence length and length per base of single-stranded DNA obtained from fluorescence correlation sprectroscopy measurements using mean field theory. Physica A. 2013; 393:1072–1079. [Google Scholar]

[B34] 34. Chen H., Meisburger S.P., Pabit S.A., Sutton J.L., Webb W.W., Pollack L.. Ionic strength-dependent persistence lengths of single-stranded RNA and DNA. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:799–804. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B35] 35. Laurence T.A., Kong X., Jäger M., Weiss S.. Probing structural heterogeneities and fluctuations of nucleic acids and denaturated proteins. Proc. Natl. Acad. Sci. U.S.A. 2005; 48:17348–17353. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B36] 36. Murphy M.C., Rasnik I., Cheng W., Lohman T.M., Ha T. Probing single-stranded DNA conformational flexibility using fluorescence spectroscopy. Biophys. J. 2004; 86:2530–2537.

[B37] 37. Mills J., Vacano E., Hagerman P.J. Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of Poly(dT) and Poly(dA). J. Mol. Biol. 1999; 285:245–257.

[B38] 38. Zhou H.-X. Loops in Proteins Can Be Modeled as Worm-Like Chain. J. Phys. Chem. B. 2001; 105:6763–6766.

[B39] 39. Hollingworth D., Candel A.M., Nicastro G., Martin S.R., Briata P., Gherzi R., Ramos A. KH domains with impaired nucleic acid binding as a tool for functional analysis. Nucleic Acids Res. 2012; 40:6873–6886.

[B40] 40. Sørensen C.S., Jendroszek A., Kjaergaard M. Linker dependence of avidity in multivalent interactions between disordered proteins. J. Mol. Biol. 2019; 431:4784–4795.

[B41] 41. Lunde B.M., Moore C., Varani G. RNA-binding proteins: modular design for efficient function. Nat. Rev. Mol. Cell Biol. 2007; 8:479–490.

[B42] 42. Danilenko M., Dalgliesh C., Pagliarini V., Naro C., Ehrmann I., Feracci M., Kheirollahi-Chadegani M., Tyson-Capper A., Clowry G.J., Fort P. et al. Binding site density enables paralog-specific activity of SLM2 and Sam68 proteins in Neurexin2 AS4 splicing control. Nucleic Acids Res. 2017; 45:4120–4130.

[B43] 43. Schulz D., Schwalb B., Kiesel A., Baejen C., Torkler P., Gagneur J., Soeding J., Cramer P. Transcriptome surveillance by selective termination of noncoding RNA synthesis. Cell. 2013; 155:1075–1087.

[B44] 44. Kuo P.-H., Doudeva L.G., Wang Y.-T., Shen C.-K.J., Yuan H.S. Structural insights into TDP-43 in nucleic-acid binding and domain interactions. Nucleic Acids Res. 2009; 37:1799–1808.

[B45] 45. Mackness B.C., Tran M.T., P. M.S., Matthews C.R., Zitzewitz J.A. Folding of the RNA recognition motif (RRM) domains of the amyotrophic lateral sclerosis (ALS)-linked protein TDP-43 reveals an intermediate state. J. Biol. Chem. 2014; 289:8264–8276.

[B46] 46. Clerte C., Hall K.B. Characterization of multimeric complexes formed by the human PTB1 protein on RNA. RNA. 2006; 12:457–475.

[B47] 47. Jolma A., Zhang J., Mondragón E., Morgunova E., Kivioja T., Laverty K.U., Yin Y., Zhu F., Bourenkov G., Morris Q. et al. Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences. Genome Res. 2020; 30:962–973.

[B48] 48. Sohrabi-Jahromi S., Söding J. Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins. Bioinformatics. 2021; 37:i308–i316.

[B49] 49. O’Rourke T.W., Loya T.J., Head P.E., Horton J.R., Reines D. Amyloid-like assembly of the low complexity domain of yeast Nab3. Prion. 2015; 9:34–47.

[B50] 50. Loya T.J., O’Rourke T.W., Degtyareva N., Reines D. A network of interdependent molecular interactions describes a higher order Nrd1-Nab3 complex involved in yeast transcription termination. J. Biol. Chem. 2013; 288:34158–34167.

[B51] 51. Varani L., Gunderson S.I., Mattaj I.W., Kay L.E., Neuhaus D., Varani G. The NMR structure of the 38 kDa U1A protein-PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nat. Struct. Mol. Biol. 2000; 7:329–335.

[B52] 52. Ripin N., Boudet J., Duszczyk M.M., Hinniger A., Faller M., Krepl M., Gadi A., Schneider R.J., Šponer J., Meisner-Kober N.C. et al. Molecular basis for AU-rich element recognition and dimerization by the HuR C-terminal RRM. Proc. Natl. Acad. Sci. U.S.A. 2019; 116:2935–2944.

[B53] 53. Auweter S.D., Oberstrass F.D., Allain F.H.-T. Solving the structure of PTB in complex with pyrimidine tracts: an NMR study of protein-RNA complexes of weak affinities. J. Mol. Biol. 2006; 367:174–186.

[B54] 54. Olsen J.G., Teilum K., Kragelund B.B. Behaviour of intrinsically disordered proteins in protein-protein complexes with an emphasis on fuzziness. Cell. Mol. Life Sci. 2017; 74:3175–3183.

[B55] 55. Banani S.F., Lee H.O., Hyman A.A., Rosen M.K. Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell Biol. 2017; 18:285–298.

[B56] 56. Sabari B.S., Dall’Agnese A., Young R.A. Biomolecular condensates in the nucleus. Trends Biochem. Sci. 2020; 45:961–977.

[B57] 57. Fay M.M., Anderson P.J. The role of RNA in biological phase separations. J. Mol. Biol. 2018; 430:4685–4701.

[B58] 58. Lin Y., Protter D.S.W., Rosen M.K., Parker R. Formation and maturation of phase-separated liquid droplets by RNA-binding proteins. Mol. Cell. 2015; 60:208–219.

[B59] 59. Rhine K., Vidaurre V., Myong S. RNA droplets. Annu. Rev. Biophys. 2020; 49:247–265.

[B60] 60. Lyon A.S., Peeples W.B., Rosen M.K. A framework for understanding the functions of biomolecular condensates across scales. Nat. Rev. Mol. Cell Biol. 2021; 22:215–235.

[B61] 61. Mateu-Regue A., Christiansen J., Bagger F.O., Winther O., Hellriegel C., Nielsen F.C. Single mRNP analysis reveals that small cytoplasmic mRNP granules represent mRNA singletons. Cell Rep. 2019; 29:736–748.

[B62] 62. Söding J., Zwicker D., Sohrabi-Jahromi S., Boehning M., Kirschbaum J. Mechanisms for active regulation of biomolecular condensates. Trends Cell Biol. 2020; 30:4–14.

[B63] 63. Vuzman D., Polonsky M., Levy Y. Facilitated DNA search by multidomain transcription factors: cross talk via a flexible linker. Biophys. J. 2010; 99:1202–1211.

[B64] 64. Arbesú M., Iruela G., Fuentes H., Teixeira J.M.C., Pons M. Intramolecular Fuzzy Interactions Involving Intrinsically Disordered Domains. Front. Mol. Biosci. 2018; 5:39.

[B65] 65. Krishnamurthy V.M., Estroff L.A., Whitesides G.M. Multivalency in ligand design. 2006; John Wiley and Sons, Ltd; 11–53.

[B66] 66. Tjandra K.C., Thordarson P. Multivalency in drug delivery – when it is too much of a good thing? Bioconjugate Chem. 2019; 30:503–514.

[B67] 67. Disney M.D., Lee M.M., Pushechnikov A., Childs-Disney J.L. The role of flexibility in the rational design of modularly assembled ligands targeting the RNAs that cause the myotonic dystrophies. ChemBioChem. 2010; 11:375–382.
