Skip to main content
Cell Reports Methods logoLink to Cell Reports Methods
. 2022 Mar 18;2(3):100185. doi: 10.1016/j.crmeth.2022.100185

Principles and pitfalls of high-throughput analysis of microRNA-binding thermodynamics and kinetics by RNA Bind-n-Seq

Karina Jouravleva 1,, Joel Vega-Badillo 1, Phillip D Zamore 1,2,3,∗∗
PMCID: PMC9017153  PMID: 35475222

Summary

RNA Bind-n-Seq (RBNS) is a cost-effective, high-throughput method capable of identifying the sequence preferences of RNA-binding proteins and of qualitatively defining relative dissociation constants. Although RBNS is often described as an unbiased method, several factors may influence the outcome of the analysis. Here, we discuss these biases and present an analytical strategy to estimate absolute binding affinities from RBNS data, extend RBNS to kinetic studies, and develop a framework to compute relative association and dissociation rate constants. As proof of principle, we measured the equilibrium binding properties of mammalian Argonaute2 (AGO2) guided by eight microRNAs (miRNAs) and kinetic parameters for let-7a. The miRNA-binding site repertoires, dissociation constants, and kinetic parameters calculated from RBNS data using our methods correlate well with values measured by traditional ensemble and single-molecule approaches. Our data provide additional quantitative measurements for Argonaute-bound miRNA binding that should facilitate development of quantitative targeting rules for individual miRNAs.

Keywords: RNA-protein interactions, high-throughput biochemistry, binding affinity, dissociation constant, association rate constant, dissociation rate constant, RNA motif, RNA interference, miRNA, Argonaut

Graphical abstract

graphic file with name fx1.jpg

Highlights

  • Several factors may influence the outcome of the RBNS analysis

  • An analytical strategy estimates absolute binding affinities from RBNS data

  • A simple framework to compute relative association and dissociation rate constants

  • Absolute affinities of microRNA-directed binding of AGO2 to a variety of sites

Motivation

The speed at which regulatory proteins find their RNA targets and the time they spend bound to specific regulatory sequence motifs govern the specificity and extent of target repression or activation. RBNS was developed to qualitatively describe the specificity of RNA-binding proteins. Here, we present an analytical strategy to measure absolute affinity values and relative association and dissociation rate constants from RBNS data. The resulting analysis provides a resource for replacing >500 biochemical binding assays in examining physicochemical properties and kinetic behavior of miRNA-guided Argonaute (AGO) proteins. Our quantitative measurements should allow prediction of complex and dynamic miRNA-mRNA regulatory networks.


Jouravleva et al. show that, when designed and analyzed with potential biases in mind, RNA Bind-n-Seq provides a high-throughput route to discovering the sequence preferences of RNA-binding proteins, defining their absolute equilibrium binding dissociation constants, and delineating their relative association and dissociation rate constants for sites of disparate sequence and affinity.

Introduction

Predicting the regulatory consequences of molecular interactions based on measured biochemical properties is a long-standing goal in biology. The strength of an interaction is quantitatively described by its equilibrium dissociation constant (KD), the substrate concentration required for half-maximal complex formation. By definition, KD is the ratio of the rate of complex dissolution, described by the dissociation rate constant, koff, and the rate of complex formation, described by the association rate constant, kon. Interactions with the same affinity may arise from different kinetic behaviors that may vary by orders of magnitude: one set of interactions may be driven by rapid recognition and binding (large kon), while another may be driven by slower kon but increased complex stability (small koff). Knowledge of kon limits the possible mechanisms for target finding, e.g., whether a binding event is diffusion-limited or whether additional proteins can speed the search for high-affinity binding sites. koff determines the lifespan of the binding interaction and thus provides insight into whether a process is likely to be driven by a hit-and-run mechanism or require continued site occupancy to exert a regulatory effect. Thus, the affinities and the dynamics of molecular interaction provide critical information for developing quantitative models of a regulatory network.

Interactions between RNA-binding proteins (RBPs) and mRNAs are dynamic (Müller-McNicoll and Neugebauer, 2013; Licatalosi et al., 2020) and at the core of gene regulation (Moore, 2005). Current approaches to determine thermodynamic and kinetic properties generally involve compromises in throughput, comprehensiveness, and quantitative precision. For example, targeted biochemical approaches—including electrophoretic mobility shift assays and fluorescence resonance energy transfer—provide quantitative biophysical measurements but can only interrogate known interactions and are therefore inherently low throughput (Ladbury, 1995; Ladbury and Chowdhry, 1996; Garland, 1996; Schuck, 1997; Frey and Albin, 2001; Erickson et al., 2001, 2003; Hellman and Fried, 2007; Shi and Herschlag, 2009). By contrast, high-throughput sequencing methods that rely on crosslinking and immunoprecipitation yield comprehensive lists of RNA-binding motifs but do not enable quantitative assessment of binding affinities (Licatalosi et al., 2008; König et al., 2010; Zhao et al., 2010; Danan et al., 2016; Sugimoto et al., 2012; Kishore et al., 2011; Campbell and Wickens, 2015). Other strategies provide high-throughput, quantitative information for intermolecular associations but use complicated experimental setups unavailable to many laboratories (Buenrostro et al., 2014; Tome et al., 2014; Nutiu et al., 2011; Maerkl and Quake, 2007; Martin et al., 2012; Sharma et al., 2021).

RNA Bind-n-Seq (RBNS) determines the specificity of proteins for nucleic acids in vitro using a single-step binding assay and a high-throughput sequencing readout, making the method widely accessible and cost-effective. Originally developed to identify the repertoire of DNA sequence motifs by zinc-finger proteins and calculate their relative binding affinities (Zykovich et al., 2009), it was subsequently applied to RBPs (Lambert et al., 2014; Taliaferro et al., 2016; Dominguez et al., 2018; Hale et al., 2018; McGeary et al., 2019; Van Nostrand et al., 2020). Experimental procedures for RBNS and the computational strategy to identify RBP binding sites and their relative KD values have been presented previously (Lambert et al., 2015; McGeary et al., 2019).

Here, we revisit RBNS, providing practical guidelines for performing the assay and highlighting potential biases and pitfalls influencing the outcome of RBNS analysis. We report a novel computational approach that extracts absolute KD values from RBNS data, providing a high-throughput route to quantitatively describe intermolecular interactions. Finally, we extend RBNS to measure relative kon and koff values, enabling a deeper understanding of reversible bimolecular interactions. To test our method, we use published (McGeary et al., 2019) and new data for target binding by mammalian AGO2 loaded with different miRNAs. In vivo, miRNAs guide AGO2 to repress gene expression (Liu et al., 2004; Meister et al., 2004; Baek et al., 2008; Selbach et al., 2008; Guo et al., 2010; Doench and Sharp, 2004; Hendrickson et al., 2009; Huntzinger and Izaurralde, 2011; Bazzini et al., 2012). Unlike most RBPs (69 of 78 assayed), which have detectable binding affinity for just one or two sequence motifs (Van Nostrand et al., 2020), AGO proteins take their binding specificity from their 21-nt-long miRNA guides. Consequently, AGO proteins can bind a variety of sites with affinities ranging from ∼1 pM to ∼10 nM. AGO proteins are ideal for illustrating the biases inherent in RBNS, discussing experimental considerations, and benchmarking our strategies for measuring absolute KD values and relative kon and koff values.

Results

Overview of RNA Bind-n-Seq

RBNS begins with incubation of a purified RBP with a large pool of RNA molecules, each containing a region of random sequence (Figure 1). After reaching equilibrium, protein-bound RNA is separated from unbound RNA and then extracted from the nitrocellulose membrane, reverse transcribed, amplified, and sequenced. Analysis of sequencing data readily identifies a list of preferentially bound sequence motifs. Performing multiple individual binding reactions across a broad range of RBP concentration allows measurement of relative KD (Lambert et al., 2015).

Figure 1.

Figure 1

Overview of RNA Bind-n-Seq

Experimental design considerations

RISC purification and choice of concentration range

Assembly and purification of miRISC (RNA-induced silencing complex) comprising purified human or mouse AGO2 and a synthetic miRNA guide has been described elsewhere (Figure 2A) (Flores-Jasso et al., 2013). Because different miRISC preparations can have different percent activities for the same amount of protein, a titration experiment to quantify the concentration of binding-competent miRISC is a key step for determining the protein concentrations to use in RBNS reactions. We estimate total miRISC concentration by northern blot and use double-filter binding assays to measure the equilibrium binding of active miRISC with a high-affinity RNA target (Figures 2B and 2C). Concentration of RNA target is chosen to be much greater than its measured KD, and miRISC concentration is varied by an order of magnitude above and below the target concentration. The titration data are then fit to a quadratic equation (Figure 2C). The stoichiometry of the bound complex is 1:1 (Wee et al., 2012); therefore, the breakpoint in fraction bound versus the ratio of protein to ligand indicates the amount of active miRISC.

Figure 2.

Figure 2

Sequential steps of RBNS

(A) Purification of miRISC containing a single, unique small RNA guide.

(B) Total concentration of purified miRISC is estimated by northern blot.

(C) Concentration of binding-competent miRISC is measured by a titration experiment. KD is the apparent dissociation constant, r is the molar ratio of [RISC] to [RNA], n is the stoichiometric equivalence point, ƒ is the fraction bound, and ƒmax is the maximum fraction bound. Error bars report propagated SD.

(D) After the binding step of RBNS, the two membranes are separated, imaged, and quantified.

To examine binding to both high- and low-affinity sites, RBP concentrations in RBNS reactions should span and exceed KD values of those sites. miRISC has subnanomolar affinity for its canonical sites—mouse AGO2 loaded with let-7a has KD ∼10 pM for an 8mer site but ∼200 pM for a 6mer (Becker et al., 2019; Flores-Jasso et al., 2013; Salomon et al., 2015). Therefore, we surveyed logarithmically spaced miRISC concentrations from ∼5 pM to ∼1 nM. In addition, we performed a no-RISC binding experiment to detect potential method-specific biases in background binding (Figure 2D).

RNA design

The length of the random sequence region of RNA ligands is an important aspect of RBNS design (Lambert et al., 2015). miRNA canonical binding sites are 6–8-nt long, whereas noncanonical sites can be 11–12 nt long (Shin et al., 2010). To capture binding at both, we randomized 20 nt, an RNA pool of 420 = 1.1 × 1012 distinct RNA sequences. Each binding reaction was carried out with 2 pmol of RNA—1.2 × 1012 molecules; thus, the pool theoretically contained >104 copies of each of the 412 = 1.7 × 107 possible 12mers and >108 copies of each of the 46 = 4.1 × 103 possible 6mers. High-throughput sequencing of the RNA pool generally confirmed these expectations and showed an overall balanced base composition, albeit with a slightly reduced frequency of guanine and slightly higher than expected frequency of cytosine (Figure S1A). In addition, some 10mers were depleted 10-fold, and others were over-represented by 5-fold compared with expected frequency (Figure S1B). Such departures from randomness in the RNA pool is corrected for by calculating the ratio of the frequency of each 10mer protein-bound RNA to that in the starting pool.

In typical RBNS experiments, the random sequence RNA region is flanked by constant primer-binding regions used for sequencing (Figure 1). This design simplifies library preparation, avoids biases that can result from RNA ligation, and ensures that any RNA carried over from protein purification will not contaminate the sequenced library. However, fixed primer-binding sequences can affect the secondary structure ensemble of the RNA pool (Cook et al., 2017) and bias interpretation of RBNS assays with RBPs that recognize structured elements. miRISC binds single-stranded sequence motifs; therefore, secondary structures will have little effect on miRISC binding unless they occlude a site. We observed a ∼1.5-fold higher enrichment of canonical binding sites at the 5′ end of the random sequence region for miRISC bearing let-7a, miR-34b, or miR-449a (Figures S1C and S1D).

Nevertheless, constant primer-binding sequences may bias the RBNS assay if they contain a motif for miRISC binding. In this case, miRISC may bind to virtually any RNA molecule in the pool, underestimating the enrichment of authentic binding motifs within the randomized sequence region, reducing RBNS sensitivity. miRNAs guide miRISC to their targets primary through Watson-Crick pairing with their “seed” sequence, miRNA nucleotide positions g2–g8 (Lewis et al., 2003, 2005; Rajewsky and Socci, 2004; Krek et al., 2005), which is displayed by AGO2 in a helical geometry ready to base pair (Parker et al., 2005; Wang et al., 2008; Elkayam et al., 2012; Nakanishi et al., 2012). Figure S1C presents the nomenclature for canonical site type classification. Given the high diversity and short length of miRNA seed sequences, the potential for miRISC binding to the constant regions is high. For example, constant regions in the RNA pool used by McGeary et al. (2019) contained at least one canonical site for 170 human miRNAs, including miR-7.

Even noncanonical, seed-matched sites with undetectable affinity may impact binding by interacting with higher affinity sites, e.g., those with slow kon. miRISC binds rapidly to a short, seed-matched 4mer-m3.6 site, but this transient binding does not produce a measurable interaction. In contrast, 3′-only sites typically have slower association rate constants (∼107 M−1 s−1) than seed-matched sites (Salomon et al., 2015), because miRISC does not pre-organize the 3′ region of its miRNA guide. Yet a 3′-only 10mer-m10.19 site placed adjacent to a 4mer-m3.6 site present in the constant region of the RNA target displayed a diffusion-limited apparent kon (Figure S1E). Constant regions cannot be readily modified to avoid biases, as they must remain compatible with Illumina sequencing. Inspired by Becker et al. (2019), we use cDNA oligonucleotides to block the common sequences present in each RNA molecule. Making these regions double-stranded prevents miRISC binding and disfavors intramolecular secondary structures, leaving only the randomized sequence and four 5′ and 3′ flanking nucleotides accessible (Figure 1). We benchmarked the blocker strategy using miR-449a miRISC, which has a low-affinity seed-matched binding site (5mer-A1) in the 3′ constant region of the RNA pool. DNA blocking oligonucleotides increased the enrichment of canonical sites 2.5- to 5-fold by decreasing the fraction of RNA with no binding site in the random sequence region (Figure S1F).

Time to reach equilibrium

KD, the equilibrium dissociation constant, must be measured after binding reactions have reached equilibration. As reviewed by Jarmoskaite et al. (2020), binding follows an exponential curve characterized by its half-life, t1/2. A conservative standard for equilibration is five half-lives, corresponding to 96.6% completion. Half-life is also linked to the equilibration rate constant keq:

keq=ln2t1/2.

Thus, the time for a binding reaction to reach 96.6% of completion is given by

T=5×t1/2=5×ln2keq.

For the binding equilibrium where miRISC interacts with an RNA sitei, the equilibration rate constant is described by

keq=kon×[miRISC]×[sitei]+koff.

The longer-lived the complex, the longer the incubation time required to reach equilibrium. For the let-7a 8mer site, kon = 2.4 ± 0.1 × 108 M−1s−1, koff = 0.0036 ± 0.0003 s−1 (Salomon et al., 2015) and the concentration of the 8mer site in the RNA pool used for RBNS is 40 pM (Table S1). Under these conditions, the binding reaction with the lowest miRISC concentration (8 pM) should reach equilibrium in 16 ± 2 min. To provide sufficient time for equilibration, we incubated binding reactions for 2 h.

RNA pool concentration

In RBNS, the RNA pool concentration is the same for all binding reactions, while the RBP concentration varies. To measure KD, the “titration” regime—in which the concentration of a binding site is much greater than KD—must be avoided (reviewed by Jarmoskaite et al., 2020). We measured the frequency of known canonical and noncanonical binding sites in the RNA pool by high-throughput sequencing. Under our experimental conditions, concentrations of known binding sites are of the same order of magnitude as their KD values. This is an acceptable regime, and KD can be determined by an appropriate binding equation that explicitly accounts for bound protein and does not rely on the common assumption that [Protein]free ∼ [Protein]total (see STAR Methods).

Separation of bound and free RNA molecules

After reaching equilibrium, bound and free RNA are separated, e.g., by electrophoretic mobility shift assay, capture on beads coupled to specific antibodies that recognize the RBP or a ligand attached to the synthetic miRNA, and a double-filter binding assay.

The double-filter method has been used extensively to investigate the equilibrium binding and kinetic properties of protein-nucleic acid interactions (Wong and Lohman, 1992, 1993; Riggs et al., 1970a, 1970b; Winter and von Hippel, 1981; Clore et al., 1982; Wee et al., 2012; Xiao and MacRae, 2020). The nitrocellulose filter preferentially retains protein and protein-bound nucleic acids. The positively charged nylon filter placed directly beneath the nitrocellulose membrane traps protein-free nucleic acids not retained by the nitrocellulose.

RBNS using miRISC immobilized on paramagnetic beads provides an obvious route to high-throughput automation of the method, a prerequisite for defining the site types and their KD values for all known mammalian miRNAs. We compared the double-filter binding assay with miRISC immobilized on paramagnetic beads via an antibody-FLAG epitope tag interaction (Figures 1 and S2A). As the readout, we calculated the enrichment of canonical binding sites: the frequency of a binding site in miRISC-bound reads divided by its frequency in the starting pool (Figures 3B and S2B). Enrichment of canonical sites was 5- to 10-fold lower when bound RNA was recovered using bead-immobilized miRISC. Concomitantly, RNA non-specifically associated with miRISC was more abundant for the beads than in the double-filter binding assay, diluting the specifically recovered RNA and reducing RBNS sensitivity. We developed an alternative method for enrichment analysis to account for this higher background. Using this approach (described below) for the two methods yielded highly similar lists of binding sites (Figure S2C) with comparable affinities (≤2-fold difference for 12 of 18 binding sites) (Figure S2D). While magnetic bead assays present a clear advantage for automation, the greater sensitivity and reduced handling of the double-filter binding assay is generally simpler, especially for kinetic assays.

Figure 3.

Figure 3

RBNS enables de novo discovery of binding sites

(A) Pairing of enriched sites.

(B) Enrichment profile of canonical let-7a sites observed at each of the five miRISC concentrations tested.

(C) Illustration of de novo site discovery strategy. Occurrences of all k-mers are calculated in all the sequencing reads (black lines). Each λ-nt-long read contains λ – k + 1 motifs. For example, the magenta 10mer appears four times in the miRISC-bound sequencing data (five reads total), therefore its frequency in this sample is 4 ÷ (5 × 55). The most enriched motif is selected, and its Z score is compared with the Z score threshold. The magenta motif is the most enriched, and its Z score is above the Z score threshold; therefore, this motif is retained. Next, all reads containing the identified motif are masked in the miRISC-bound and RNA pool sequencing data (gray lines). All enrichment values are recalculated on masked reads to eliminate “shadow” motifs (blue). The orange motif is the most enriched at this iteration and is retained. Nevertheless, its Z score (calculated from the original enrichment values) is below the Z score threshold; therefore, the orange k-mer is not considered as enriched. This iterative procedure continues until the Z score of the most enriched binding site is below the Z score threshold.

De novo identification of binding sites

Principle

The number of occurrences of all k-mers of a specific length is counted over all the reads in a binding reaction. These counts are then divided by the total count of all k-mers to yield the k-mer frequency. The same procedure is repeated over the reads from the RNA input pool to account for any sequence biases present in the random region of RNA molecules. Finally, enrichment of a k-mer is defined as the frequency of the k-mer in protein-bound reads divided by its frequency in the RNA input pool (Figure 3C). A motif is identified as a binding site if (1) its enrichment is either above an arbitrarily defined value (McGeary et al., 2019) or is greater than a chosen Z score threshold (Lambert et al., 2014; Dominguez et al., 2018; Van Nostrand et al., 2020) and (2) it is not enriched in the no-RISC control reaction. Identified binding sites were also not enriched in a binding reaction containing the miRNA but not AGO2, indicating miRISC-specific binding (Figure S3A).

RBNS sensitivity depends on RBP concentration

Enrichment values are dependent on the concentration of RBP and produce a characteristic unimodal curve (Lambert et al., 2014). At low protein concentrations, enrichment of a motif increases with increasing protein concentration, as increasing amounts of bound RNA improve signal over a constant background of RNA molecules recovered even in the absence of protein. At high RBP concentrations, high-affinity motifs are saturated, and binding is driven toward lower affinity sites, resulting in a lower fraction of high-affinity motifs (Figure 3B). In our experiments using AGO2 miRISC, the relative rankings of binding motifs obtained at different protein concentrations were always preserved (Figures 3B, S1F, and S2B) and highly correlated (Spearman’s ρ > 0.83 at adjacent protein concentrations), the binding reaction with the highest protein concentration offered the greatest sensitivity for low-affinity sites (Figure S3B). To limit sensitivity bias when comparing binding sites within various miRNAs, it is preferable to measure the active concentration of AGO2 and perform site discovery at similar active miRISC concentrations.

Sensitivity depends on the criteria chosen for significance

Motifs are typically considered enriched if their enrichment scores (McGeary et al., 2019) or Z scores (Lambert et al., 2014; Dominguez et al., 2018; Van Nostrand et al., 2020) exceed an arbitrarily defined threshold. Because binding site enrichment scores depend on the amount of RNA non-specifically recovered in the assay—increased background dilutes the specific signal, lowering enrichment scores—using a fixed threshold to compare different datasets may bias motif discovery. For example, a threshold of 10 (Figure S3C) (McGeary et al., 2019) applied to our AGO2 let-7a datasets generated by the double-filter binding method would yield a list of 17 enriched sites, but only 4 binding sites when using bead-immobilized miRISC (Figure S3D). This bias is prevented by using Z scores, which indicate the number of standard deviations by which enrichment values differ from the mean enrichment (Figure S2C). Previous reports used various values of Z score for different proteins and motif lengths (Lambert et al., 2014; Dominguez et al., 2018; Van Nostrand et al., 2020). To standardize the procedure of choosing a threshold, we consider a motif significantly enriched if its Z score ≥ 99.9 percentile (Figure S3E).

Finally, site discovery may be biased by “shadow motifs”—k-mers occurring in reads with high-affinity binding motifs but not conferring binding by themselves. An iterative procedure allows detection of “true binders” by masking reads harboring the most enriched binding site in the protein sample and in the RNA pool, then repeating the analysis iteratively (Figure 3C) (McGeary et al., 2019; Dominguez et al., 2018; Van Nostrand et al., 2020). All enrichments are recalculated on the masked reads to obtain the resulting most enriched motif, with this process continuing until the enrichment Z score (calculated from the original enrichment values) no longer meets the criterion for significance.

Motif size

Known miRISC binding sites are 5–12 nt long. As enrichment is calculated as the ratio of motif frequency in protein sample over RNA input pool, and division by zero is not supported, k-mers must be present in sequenced reads of the RNA pool. With current sequencing costs, one can readily sequence ∼20 million reads for each condition. At this depth, ∼200 reads of every possible 10mer will be present within the 20-nt-long randomized central region, but not all 11mers will be sampled. As sequencing costs fall, deeper sequencing should allow longer motif discovery.

Estimation of absolute KD values using synthetic datasets

k-mer enrichment approximates relative binding affinities, because high-affinity binding sites are more enriched than lower affinity sites (Zykovich et al., 2009). However, enrichment depends on RBP concentration, saturation of high-affinity sites, and background binding (Figures 3B, S1F, and S2B). Consequently, enrichment (1) has an upper limit that depends on the length of both k-mers and the RNA molecules and (2) is not directly proportional to binding affinities (Lambert et al., 2015). Recently, a strategy was proposed to estimate relative KD values solely from sequenced data without prior information on miRISC concentration, fraction bound of RNA molecules, and non-specific RNA recovered as background (McGeary et al., 2019). We modified this procedure to simultaneously determine absolute KD values using maximum likelihood estimation (MLE) (Figure 4A). In this statistical model, the parameter values—the KD values of various sites, miRISC concentration, and the background of RNA interacting non-specifically with the beads, membrane, or tubes—of a mathematical model predicting expected sequencing counts are fit to maximize the likelihood of observing the predicted sequencing counts in experimental data. We tested our approach using simulated RBNS datasets containing four binding sites and corresponding to our typical experimental conditions. Figure S4A shows the convergence of a typical MLE fit. Our method accurately estimated the ground truth affinity values of the four binding sites (Figures 4B and S4B).

Figure 4.

Figure 4

Estimation of KD values by RBNS is robust and reproducible

(A) The mathematical equation derived from the biochemical model at equilibrium describes predicted concentration of binding site i in sequencing data, given KD values of all binding sites, miRISC concentration, and background. These parameters are fit simultaneously to maximize the likelihood of observing predicted sequencing counts in experimental data.

(B) Testing KD estimation with simulated data. RBNS data were modeled by simulating miRISC binding to RNA pool containing four binding sites and no-site molecules. Stock concentration of miRISC was equal to 2.1 nM (dataset 1) or 8.1 nM (datasets 2 and 3). Background was set to 0.1 nM and KDnosite was set to 5 nM. Error bars indicate 95% CI on the median.

(C) Comparison of sub-datasets when the highest miRISC concentration and no-RISC binding reaction were removed (Pearson’s r = 0.979). Binding sites identified in Figure 3 were used to compute KD.

(D) Correspondence between fitted KD values of enriched binding sites estimated from two independent binding experiments (Pearson’s r = 0.974). The solid orange and red lines indicate 2- and 10-fold differences, respectively. Dashed diagonal lines show y = x.

(E) Comparison of fitted KD values when multi-site reads were fractionally assigned to corresponding site types or excluded from the analysis.

Estimation of absolute KD values using experimental datasets

Next, we applied our MLE procedure to experimental let-7a datasets. To test whether the method’s KD estimates were robust, we removed one sample from the dataset and re-fitted the parameters using the remaining binding reactions. Estimates obtained by this leave-one-out procedure strongly correlated in pairwise comparisons (Pearson’s r ≥ 0.969 for each of the 15 pairwise comparisons), even when we removed the highest miRISC concentration or the no-RISC binding reaction (Figures 4C and S4C). Estimates were also highly reproducible in two independent experiments, each comprising five protein concentrations and no-RISC control (Pearson’s r ≥ 0.974; Figure 4D).

To estimate KD values and their 95% confidence intervals (CI), we first performed fitting optimization 2,000 times. Each binding experiment used two independent trials to account for errors caused by sample-to-sample variation. Second, we bootstrapped sequencing data 10 times to account for the error caused by the multinomial down-sampling of RNA molecules during sequencing. Third, each fitting optimization was performed 100 times using independent and partially randomized starting points to account for the error of ending at a local minimum. Importantly, while values for the initial guess spanned a >20-fold range, fitted parameters were robustly estimated, yielding narrow 95% CI (Figures 5, 6, and S2D).

Figure 5.

Figure 5

KD values estimated by RBNS agree with previous results

(A) Comparison of KD values estimated by RBNS and measured by ensemble biochemistry and single-molecule approaches.

(B) KD values fitted for AGO2 let-7a RBNS from McGeary et al. (2019). Error bars indicate 95% CI on the median. Center: pairing of enriched sites identified by de novo site discovery. Right: enrichment profile of canonical let-7a sites.

Figure 6.

Figure 6

RBNS provides insight into miRNA targeting

(A) KD values for t2–t8, t2–t7, and t2–t6 targets with different t1 nucleotides.

(B–D) KD values for t2–t8 targets with t1A and different one-nucleotide mismatches (B), deletions (C), and bulges (D) at indicated positions. We note that deletion of N nt and N+1 nt may yield the same sequence. Because we cannot discriminate between these two sites, they are annotated as “N/N+1”.

Error bars indicate 95% CI on the median. Horizontal dashed line indicates KD8mer. Adenine (orange), uridine (magenta), cytidine (cyan), and guanosine (blue).

We note that our procedure for estimating KD values uses only reads containing one miRNA-binding site. Reads containing multiple instances of binding sites represented ≤1% of sequence reads in the starting library. Omission of these reads did not affect estimation of KD values: fractionally assigning such multi-site reads to appropriate binding sites yielded the same results as excluding them (Figure 4E).

Fitted parameters were also insensitive to the number and identity of binding sites. For example, when optimization was performed with 15 statistically enriched site types (Figure 5A) our analysis for let-7a sites estimated KD8mer = 4 pM (95% CI = [2, 6]), KD7mer−A1 = 125 pM (95% CI = [116, 140]), and KD6mer−A1 = 490 pM (95% CI = [484, 497]). Using 12 seed-matched site types with different t1 nucleotides (Figure S4D) gave essentially the same results: KD8mer = 4 pM (95% CI = [4, 6]), KD7mer−A1 = 135 pM (95% CI = [125, 146]), and KD6mer−A1 = 524 pM (95% CI = [516, 535]). Our results agreed well with KD values measured by ensemble biochemistry and single-molecule approaches (Figures 5A, S4E, and S4F) (Wee et al., 2012; Salomon et al., 2015; Becker et al., 2019; Sheu-Gruttadauria et al., 2019) and with the relative binding affinities measured by McGeary et al. (2019) (Figure 5B).

High-throughput absolute KD estimation provides insights into miRNA targeting

Our analytical approach can estimate the affinity of any binding site of interest—enriched or not—provided the motif is represented in the sequenced data. We used our strategy to compute the KD values for de novo identified binding sites from previously published RBNS datasets (McGeary et al., 2019) and from additional datasets generated in this study (Figures 5B, S5A, and S6). We found that canonical binding sites display miRNA-specific differences in their affinities. For example, the C. elegans miRNA, lsy-6, loaded into human AGO2 had a 5-fold weaker affinity for an 8mer than AGO2:miR-1 miRISC (Figure S5A). Notably, the seed of miR-1 is predicted by nearest neighbor methods (Turner and Mathews, 2010) to pair more strongly with its target than the seed of the lsy-6 miRNA (Figure S5B), consistent with the known relationship between the predicted strength of seed-pairing and the efficacy of target mRNA repression (Garcia et al., 2011). While affinity increased with increased predicted pairing stability, the correlation between measured affinity and affinity predicted by nearest neighbor free energy was significant only for 7mer-m8 binding sites (Figure S5B). Affinity of seed-matched sites was also increased by A at position t1 (Figure 6A), which preferentially interact with a t1-nucleotide-binding pocket in AGO2 (Schirle et al., 2015).

Importantly, miRNAs bound some noncanonical sites with greater affinity than a canonical 6mer site (Figures 5B, S5A, and S6). These included 3′-only sites that extensively pair to the 3′ region of the miRNA without pairing to the seed. These sites were identified in the let-7a, lsy-6, miR-155, and miR-124 datasets using the Z score approach and display as much as 4-fold greater binding affinity than that of the canonical 6mer site (Table S3). By contrast, miR-1, miR-7, miR-449a, and miR-34b bound 3′-only sites poorly (KD > 1.2 nM). We note that the seed regions of these miRNAs are predicted by nearest neighbor analysis to pair more strongly and their 3′ regions more weakly than the equivalent sites for let-7a, lsy-6, miR-155, and miR-124, likely explaining the pronounced miRNA-specific differences in the repertoire of noncanonical sites.

Single target mismatches, insertions, and deletions disrupting the seed detectably increased KD (Figures 6B–6D) (Wee et al., 2012; Becker et al., 2019; Sheu-Gruttadauria et al., 2019). Surprisingly, some sites bearing a central 1-nt bulge in the seed were detected among enriched motifs (Figures 3A, 5B, S5A, and S6), suggesting that these bulges are better tolerated. For example, insertion of guanosine between positions 5 and 6 (b5.6G) in miR-124 seed-matched sites decreased affinity by only 21-fold (Figure 6D). This binding site corresponds to a nucleation-bulge site and was previously identified by high-throughput crosslinking and immunoprecipitation (Chi et al., 2012). Nevertheless, 67% of enriched sites (8 of 12) with one target nucleotide inserted in the center of the seed do not use this mode of binding, suggesting that bulged sites may be more common than previously appreciated, but are not used by all miRNAs.

Determining relative kinetic parameters by RBNS using synthetic datasets

Kinetic studies of molecular interactions provide powerful insights into the underlying microscopic mechanisms and functional outcome. We reasoned that RBNS might enable high-throughput measurement of the kinetic behavior of an RBP if the bound RNA was recovered at different times after the binding reaction was started (Figure S7A). Traditional association kinetics experiments monitor the time-dependent progress of RNA binding at different protein concentrations. To simplify the experiment, one usually uses pseudo first-order conditions, with a low concentration of one reactant (e.g., RNA) and ≥10-fold higher concentration of the second reactant (e.g., RBP). Under these conditions, the observed rate constant kobs is related to RBP concentration and the association and dissociation rate constants as kobs = kon×[RBP]+koff. A plot of kobs versus [RBP] is linear, with slope kon and y intercept koff. Therefore, association kinetics requires performing a time series with several protein concentrations. An alternative approach measures binding at different time points under pseudo first-order kinetics, while blocking reassociation by dilution or adding a high-affinity RNA competitor target at high concentration. In this experimental setup, a single protein concentration is used; koff is first determined from the dissociation and is then used to infer kon from kobs.

We developed a computational strategy that does not assume pseudo first-order kinetics and simultaneously fits association and dissociation rate constants using the same RBNS dataset (Figure 7A). Our mathematical model predicts the read counts for each site type across the time series as a function of kon and koff values for each miRNA site type (including “no-site”). We also estimated a constant amount of RNA molecules recovered as background in all samples.

Figure 7.

Figure 7

Estimation of kon and koff values by RBNS

(A) The mathematical equation derived from the biochemical model describes the predicted concentration of binding site i at a time point t in the sequencing data, given kon and koff values of all binding sites, and background. These parameters are fit simultaneously to maximize the likelihood of observing predicted sequencing counts in experimental data.

(B and C) Testing kon and koff estimation with simulated data. RBNS data were modeled by simulating miRISC binding to RNA pool containing four binding sites and no-site molecules. Dataset 1 contains binding sites with similar koff but different kon values. Dataset 2 contains binding sites with similar kon but different koff values. (B) Convergence of a representative MLE fit. The cost function ƒcost is minimized to a final value ƒcost,end. ƒcost, GT: ƒcost calculated with ground truth parameters. (C) Fitted kon and koff values are reported relative to those of site 4.

(D) Comparison of sub-datasets when the shortest and the longest times were removed (Pearson’s r = 0.998). Indicated are the site types showing >2-fold difference. Fitted parameters include kon (nM−1 s−1), koff (s−1), and background (nM).

(E) Correspondence between fitted kon and koff values estimated from two independent binding experiments (Pearson’s r = 0.86). Kinetic parameters are reported relative to those of 8mer site. Indicated are the site types showing >4-fold difference.

(F) Comparison of kon and koff values estimated by RBNS and measured by co-localization single-molecule spectroscopy (CoSMoS). CoSMoS data from Figure S5D and (Salomon et al., 2015). Error bars indicate 95% CI on the median (RBNS) or propagated error on the mean (CoSMoS). RNA molecules used to measure binding of 5mer-A1 by CoSMoS unintentionally contained an additional 5mer-m2.6-w2 binding site. The solid orange and red lines indicate 2- and 10-fold difference, respectively. Dashed diagonal lines show y = x.

To test our approach, we used simulated kinetic RBNS datasets, in which the observations were modeled to match our typical experimental conditions (Figure S7B). Our analytical method did not recover the absolute ground truth parameters (Figure S7B). The cost function ƒcost, which we minimize during the optimization procedure, is a linear combination of exponential functions. This function has multiple minima, and ƒcost was routinely trapped at a local minimum yielding kon and koff values different from the ground truth (Figure 7B). Nevertheless, our method was able to discriminate between fast and slow binders—but maybe was not able to distinguish rank order within these broad classes of sites—and preserved the relative differences in kinetic parameters among binding sites (Figure 7C).

Estimation of relative kinetic parameters using experimental datasets

To benchmark our approach, we used AGO2 let-7a miRISC, whose kinetics have been measured previously (Figures S5D and S5E) (Becker et al., 2019; Salomon et al., 2015). Leave-one-out validation showed that the fitted kon and koff parameters were robustly estimated (Pearson’s r ≥ 0.995 for each of the 55 pairwise comparisons; Table S4). We note that estimation of koff for long-lived binding events was sensitive to removing the first and the equilibrium time points from the kinetic series (Figure 7D). Estimates differed by ≤2-fold between two independent experiments using two different miRISC preparations (Pearson’s r = 0.86) with two noteworthy exceptions (Figure 7E). The kon of 5mer-m8 and the koff of 11mer-m11.21 differed between experiments by 5- and 3.5-fold, respectively. These observations highlight two pitfalls of RBNS that we also observed measuring binding affinities: estimation is less accurate for (1) longer and therefore more sparse motifs, and (2) for no-site motifs. (The 5mer-m8 site type was not enriched in our RBNS equilibrium data but was included in the kinetic analysis for comparison with single-molecule results.) Importantly, our mathematical model and fitting algorithm successfully discriminated between fast—the canonical seed-matched 8mer, 7mer-m8, 7mer-A1, and 6mer sites—and slow binders—the 3′-only site 11mer-m11.21 and the 5mer-m8 (Figure 7F). The same hierarchy of fast and slow binding sites was observed by single-molecule assays, albeit the dynamic range of kon was compressed relative to RBNS. This variance reflects technical differences between the two methods. First, single-molecule approaches use one unique RNA target >20 nt long, whereas RBNS measures binding for sites ≤12 nt, because longer motifs are too infrequent in the sequencing data. Second, the identity of nucleotides flanking binding sites impacts kon in vitro, with different flanking contexts spanning as much as a 100-fold range (Figure S7C) (Becker et al., 2019). RBNS averages these differences while single-molecule assays are typically performed using the most favorable context.

Our approach readily distinguished between long-lived and short-lived interactions: canonical seed-matched sites had the slowest departure rates (small koff), whereas the 11mer-m11.21 and 5mer-m8 sites displayed the fastest (Figure 7F). Although consistent with single-molecule measurements, the dynamic range of koff values estimated by RBNS was narrower than that offered by single-molecule assays. We do not currently understand the source of the compression of dynamic range in koff, but it is unlikely to arise from technical issues in the MLE procedure, because our method preserved the relative relationship among koff values with simulated data (Figure 7C). If sufficient protein sample is available, we suggest performing both equilibrium and kinetic series to obtain KD and kon and use these values to infer koff.

Discussion

RBNS is a straightforward and cost-effective strategy to interrogate the sequence specificities of RBPs. RBNS is often characterized as an unbiased method for site discovery in vitro. The method does enable simultaneous binding of RBPs to an exhaustive list of motifs. But as discussed above, several factors in the experimental design and computational analysis can alter the outcome of the assay. Nevertheless, when designed and analyzed with potential biases in mind, RBNS provides a high-throughput route to assessing the biochemical and kinetic properties of RBPs.

Despite its greater utility, RBNS has mainly been used to obtain a list of binding motifs (Taliaferro et al., 2016; Dominguez et al., 2018; Hale et al., 2018; Van Nostrand et al., 2020). Notably, some studies have employed RBNS to obtain qualitative, relative binding affinities (Lambert et al., 2014; McGeary et al., 2019). Here, we present a computational method that estimates the absolute values of dissociation constants for an RBP of interest without prior knowledge of miRISC concentration, fraction of RNA molecules bound, or non-specific RNA recovered as background. Our procedure uses sequencing reads containing single sites or a no-site. Because the proportion of reads with multiple sites is low (≤1%), we did not consider a more complex model considering all states in which the AGO-miRNA complex is bound to target sites. Importantly, estimating absolute—rather than relative—affinities does not require considering non-specific binding to no-site RNAs, which display more variance in affinity estimates than sequence-specific binding sites.

Applying our approach to simulated RBNS datasets showed that the method accurately estimated the ground truth KD values of the binding sites. Our ability to measure absolute KD values enables direct comparison of binding affinities for the same regulatory site type among different miRNAs. For example, our analysis revealed a 5-fold range in binding affinities for 8mers of human miR-1 and worm lsy-6. If miR-1 and lsy-6 were present in the same concentrations in a cell and the distribution of their 8mer sites was the same for the two, one would observe a smaller perturbation of gene expression for a loss of lsy-6 function than that of miR-1.

We benchmarked our strategy using the well-studied AGO2:let-7a miRISC and obtained robust results in good agreement with previous biochemical and single-molecule measurements (Becker et al., 2019; Salomon et al., 2015; Sheu-Gruttadauria et al., 2019; Wee et al., 2012; Schirle et al., 2015). As affinities for all motifs are assessed simultaneously, RBNS combined with our computational methods may become an alternative to traditional low-throughput quantitative methods. For example, our analysis of published and newly generated AGO2 datasets provides a resource for replacing >450 binding assays. Our estimation of dissociation constants for canonical and noncanonical sites for eight different miRNAs gives further support to binding models established using one or two miRNA sequences (Becker et al., 2019; Salomon et al., 2015; Sheu-Gruttadauria et al., 2019; Wee et al., 2012; Schirle et al., 2015).

We envision that absolute KD values measured using RBNS may enable prediction of changes in the occupancy of miRNA-guided AGO2 or other RBPs at sites across the transcriptome in response to developmental and external stimuli, thereby allowing the modeling of the resulting changes in regulatory activity. For miRNAs, RBNS using miRISC immobilized on paramagnetic beads promises to allow determination of the site types and equilibrium binding affinities for all mammalian miRNAs, a prerequisite for predicting the occupancy and identity of miRNA target sites in vivo. Finally, kon and koff measured by RBNS may prove useful for understanding the mechanisms of molecular interactions, providing additional information for developing quantitative models of biology.

Limitations of the study

The method presented here has broad utility in quantitatively assessing the specificity of RNA- or DNA-binding proteins for nucleic acids. We consider miRISC in solution with the RNA pool, and we assume that miRISC may bind one of the k-mers within the RNA molecule with 1:1 stoichiometry and a Hill coefficient of 1 and recovery of bound RNA is complete. For KD estimation, we also assume that the reaction has reached equilibrium, and concentrations of binding sites are below or of the same order of magnitude as their KD values. De novo identification of binding sites, and reliable measurement of KD, kon, and koff values of motifs of interest are achieved if these sites are represented in the sequenced reads. For RNA molecules with 20-nt random region and datasets consisting of 10–20 million sequenced reads after filtering steps, the maximum effective motif size for de novo site discovery is 10 nt. Interrogation of longer motifs could be achieved by including a constant region, such as an imperfect seed sequence, in addition to a random region (McGeary et al., 2022). For fitting KD, kon, and koff values, longer motifs can be interrogated, as the sequenced data are not required to contain all k-mers of a certain length. While our approach correctly ranked long-lived and short-lived interactions, fitted koff values may underestimate differences among the binding motifs. Therefore, we suggest fitting KD and kon and using these values to infer koff.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Chemicals, peptides, and recombinant proteins

DMEM, high glucose, GlutaMAX Supplement Thermo Fisher 10566024
Heat-inactivated FBS Life Technologies 10082-147
Polybrene Sigma TR-1003
Puromycin dihydrochloride Thermo Fisher A1113803
Klenow fragment 3′-to-5′ exo-minus New England Biolabs M0212S
Streptavidin New England Biolabs N7021S
Heparin Sigma H4784
Protocatechuic acid Aldrich 37580
Pseudomonas sp. protocatechuate 3,4-Dioxygenase Sigma P8279
Trolox Aldrich 238813
Propyl gallate Sigma P3130
4-nitrobenzyl alcohol Aldrich N12821
Proteinase K EMD Millipore 70663-5
AccuPrime Pfx DNA Polymerase Invitrogen 12344024
3XFLAG Sigma Aldrich F4799

Critical commercial assays

Anti-FLAG M2 paramagnetic beads Sigma M8823
Dynabeads MyOne Streptavidin T1 Life Technologies 65602
Protran nitrocellulose Sigma GE10600002
Hybond-XL Cytiva RPN303S
SuperScript III First-Strand Synthesis System Invitrogen 18080051
TransIT-2020 Mirus Bio MIR 5406

Deposited data

Raw sequencing data from mouse RISC RBNS This study NCBI: PRJNA807105
Raw sequencing data from human RISC RBNS McGeary et al. (2019) GSE140220

Experimental models: Cell lines

HEK 293T stably overexpressing FLAG-tagged mouse AGO2 This study N/A

Oligonucleotides

Table S1 This study N/A

Recombinant DNA

pScalps Puro EGFP 3XFLAG-AGO2 This study N/A
psPAX2 Gift of Ken-Edwin Aryee N/A
pMD2.G Gift of Ken-Edwin Aryee N/A

Software and algorithms

Python v2.7.11 Python Software Foundation https://www.python.org
Python module NumPy v1.16.6 Harris et al., 2020 https://numpy.org/
Python module SciPy v1.2.3 Virtanen et al., 2020 http://www.scipy.org
MATLAB vR2020b Natick, MA: MathWorks https://www.mathworks.com
CoSMoS pipeline v1 Smith et al. (2019) https://github.com/qnano/cosmos_pipeline
Estimation of KD values and kinetic parameters This study https://doi.org/10.6084/m9.figshare.19180952

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to, and will be fulfilled by, the lead contact, Phillip D. Zamore (phillip.zamore@umassmed.edu), or by completing the request form at https://www.zamorelab.umassmed.edu/reagents.

Materials availability

Plasmids generated in this study are available for non-commercial use upon request without restriction.

Experimental model and subject details

Cell lines

Studies used HEK293T cells stably overexpressing FLAG-AGO2. AGO2 cDNA was amplified by RT-PCR from mouse testis total RNA. Restriction cloning was used to add the AGO2 coding sequence to pScalps Puro EGFP, fusing the sequence in-frame with N-terminal 3XFLAG tag. Lentivirus transfer vectors were packaged by co-transfection with psPAX2 and pMD2.G (4:3:1) using TransIT-2020 (Mirus Bio) in HEK293T cells. Supernatant containing lentivirus was used to transduce HEK293T cells in the presence of 16 μg/ml polybrene (Sigma) to obtain stable FLAG-AGO2-expressing cell lines. Three sequential transductions were performed to maximize recombinant protein production. The transduced cells were selected in the presence of 2 μg∙ml−1 puromycin for 2 weeks, then the cells expressing the 5% highest EGFP fluorescence were selected by FACS (UMASS Medical School Flow Cytometry Core) and expanded. Cells were cultured at 37°C, 5% CO2 in DMEM (Gibco, Life Technologies) supplemented with 10% heat-inactivated fetal bovine serum (Sigma).

Method details

RISC purification

Cells stably expressing recombinant FLAG-AGO2 protein were expanded. Cells were washed once with ice-cold PBS, collected by scraping and centrifuged at 500 × g for 5 min. Cell pellets were flash-frozen in liquid nitrogen and stored at −80°C. Cell extract was essentially prepared as described (Dignam et al., 1983). Briefly, the cell pellet was washed three times in ice-cold PBS and once in Buffer A (10 mM HEPES-KOH pH 7.9, 10 mM potassium acetate, 1.5 mM magnesium acetate, 0.01% (w/v) CHAPS, 0.5 mM DTT, 1 mM AEBSF, hydrochloride, 0.3 μM Aprotinin, 40 μM Bestatin, hydrochloride, 10 μM E-64, 10 μM Leupeptin hemisulfate). Next, the pellet was resuspended in twice its volume with buffer A and incubated on ice for 20 min to allow the cells to swell. The cells were subsequently lysed on ice with a Dounce homogenizer using 40 strokes of a tight pestle (B type). The homogenate was centrifuged at 2,000 × g to remove nuclei and cell membranes. Next, 0.11 volumes (relative to the volume of the clarified supernatant from the low speed centrifugation) of Buffer B (300 mM HEPES-KOH, pH 7.9, 1.4 M potassium acetate, 30 mM magnesium acetate, 0.01% (w/v) CHAPS, 0.5 mM DTT, 1 mM AEBSF, hydrochloride, 0.3 μM Aprotinin, 40 μM Bestatin, hydrochloride, 10 μM E-64, 10 μM Leupeptin, hemisulfate) was added, followed by centrifugation at 100,000 × g for 20 min at 4°C; the supernatant corresponds to the S100 extract. Ice-cold 80% (w/v) glycerol was then added to achieve a 20% (w/v) final glycerol concentration and mixed by gentle inversion. S100 was aliquoted, frozen in liquid nitrogen, and stored at −80°C.

To capture AGO2 protein, clarified lysate was incubated for 2 h at 4°C rotating with 20 μl anti-FLAG M2 paramagnetic beads (Sigma) per ml lysate. Beads were washed three times with wash buffer (30 mM HEPES-KOH, pH 7.9, 120 mM potassium acetate, 3.5 mM magnesium acetate, 2 mM DTT, 0.01% (w/v) CHAPS). Immobilized AGO2 was loaded by incubating with 1 μM single-stranded miRNA guide (Table S2) in wash buffer for 1 h at 37°C. Unbound miRNA guide was removed by washing the beads three times with wash buffer. AGO2 and AGO2:miRNA were eluted for 1 h at room temperature with 100 ng∙μl−1 3XFLAG peptide in wash buffer.

miRISC was purified as described (Flores-Jasso et al., 2013). Briefly, the assembled miRISC was incubated overnight at 4°C with a biotinylated, 2′-O-methyl capture oligonucleotide (Table S2) linked to streptavidin paramagnetic beads (Dynabeads MyOne Streptavidin T1, Life Technologies). miRISC was eluted with a competitor oligonucleotide (Table S2) for 2 h at room temperature. Excess competitor oligonucleotide was removed by incubating the eluate with streptavidin paramagnetic beads (Dynabeads MyOne Streptavidin T1, Life Technologies) for 15 min at room temperature. Finally, miRISC was dialyzed at 4°C against three changes (3 h each) of a 3,000-fold excess of wash buffer supplemented with 20% (w/v) glycerol. miRISC was aliquoted, frozen in liquid nitrogen, and stored at −80°C. For single-molecule analysis, guide strands were labeled with 3′ Alexa Fluor 555 (Life Technologies).

Northern blotting

Northern blotting was essentially performed as described (Pall and Hamilton, 2008). Briefly, miRNA guide standards and miRISC were first resolved on a denaturing 15% polyacrylamide gel, transferred to Hybond-XL (Cytiva) by semi-dry transfer at 20 V for 1 h, and the RNA crosslinked to the membrane with 0.16 M EDC in 0.13 M 1-methylimidazole, pH 8.0, at 60°C for 1 h. The crosslinked membrane was pre-hybridized in Church’s buffer (1% w/v BSA, 1 mM EDTA, 0.5 M phosphate buffer, and 7% w/v SDS) at 37°C for 1 h. Radiolabeled, 25 pmol 5′ 32P-DNA probe (Table S2) in Church buffer was added to the membrane and allowed to hybridize overnight at 37°C, followed by two washes with 2× SSC containing 0.1% w/v SDS and two washes with 1× SSC containing 0.1% w/v SDS at 37°C for 15 min. The membrane was air dried and exposed to a storage phosphor screen.

Quantification of active miRISC by double-filter binding assay

Binding assays were essentially performed as described (Wee et al., 2012). Double-filter binding assays measured the equilibrium binding of active, binding-competent miRISC with 28-nt long RNA target fully complementary to the miRNA guide. To block cleavage, the target RNA contained a phosphorothioate linkage flanked by 2′-O-methyl ribose at positions t10 and t11 (Schwarz et al., 2004; Rivas et al., 2005; Wee et al., 2012). Binding reactions were performed in 5 μl in the presence of 30 mM HEPES-KOH, pH 7.9, 120 mM potassium acetate, 3.5 mM magnesium acetate, 2 mM DTT, 0.01% (w/v) CHAPS. A 5′ 32P-RNA target (0.5 nM) complementary to the seed region of the miRNA guide (Table S2) was incubated with a range of miRISC concentrations from 0.05 nM to 5 nM. The assay also included a no-miRISC binding reaction using miRISC storage buffer. Binding reactions were incubated at 37°C for 1 hr. RNA binding was measured by capturing protein-RNA complexes on Protran nitrocellulose (Whatman, GE Healthcare Bioscience, Pittsburgh, PA) and unbound RNA on a Hybond-XL (Cytiva) in a Bio-Dot apparatus (Bio-Rad, Hercules, CA). After applying the sample under vacuum, membranes were washed with 10 μl equilibration buffer (30 mM HEPES-KOH, pH 7.9, 120 mM potassium acetate, 3.5 mM magnesium acetate, 2 mM DTT). Membranes were air-dried and signals detected by phosphorimaging.

Co-localization single-molecule spectroscopy

Single-stranded RNA targets were generated as described (Salomon et al., 2015). Typically, 100 pmol RNA target (Table S2) was mixed with a 1.5-fold molar excess of Klenow template oligonucleotide (Table S2) in 7.5 μl of 10 mM HEPES-KOH, pH 7.4, 20 mM sodium chloride, and 0.1 mM EDTA. Samples were incubated at 90°C for 5 min in a heat block. The heat block was then switched off and allowed to cool to room temperature. Afterwards, the annealed strands (30% of final reaction volume) were added without further purification to a 3′ extension reaction, comprising 1× NEB Buffer 2 (New England Biolabs, Ipswich, MA), 1 mM dATP, 1 mM dCTP, 0.12 mM Alexa Fluor 647-aminohexylacrylamido-dUTP (Life Technologies), and 0.2 U/μl Klenow fragment (3′-to-5′ exo-minus, New England Biolabs) and incubated at 37°C for 1 h. The reaction was quenched with 500 mM (f.c.) ammonium acetate and 20 mM (f.c.) EDTA. A 15-fold molar excess of ‘trap’ oligonucleotide (Table S2) was added to the Klenow template oligonucleotide. The entire reaction was precipitated overnight at −20°C with three volumes of ethanol. The labeled target was recovered by centrifugation, dried, dissolved in loading buffer (7 M Urea, 25 mM EDTA), incubated at 95°C for 5 min, and resolved on a 6% polyacrylamide gel and isolated by electroelution.

Single-molecule experiments were performed and analyzed as described (Smith et al., 2019). Fresh cover glasses were prepared for each day of imaging. Cover glasses (Gold Seal 24 Å∼ 60 mM, No. 1.5, Cat. #3423), and glass coverslips (Gold Seal 25 Å∼ 25 mM, No. 1, Cat. #3307) were cleaned by sonicating for 30 min in NanoStrip (KMG Chemicals, Houston, TX), washed with 10 changes of deionized water, and dried with a stream of nitrogen. Two ∼2 mm diameter lines of high vacuum grease (Dow Corning, Midland, MI) were applied to the cover glass to create a flow cell. Three layers of self-sticking labeling tape (Fisher, Cat. No. 159015R) were applied outside of the flow cell. The coverslip was placed on top of the cover glass with a ∼0.5 mm gap between the cover glass and coverslip. To minimize non-specific binding of protein and RNA to the glass surface, microfluidic chambers were incubated with 2 mg∙ml−1 poly-L-lysine-graft-PEG-biotin in 10 mM HEPES-KOH, pH 7.5, at room temperature for 30 min and washed extensively with imaging buffer (30 mM HEPES-KOH, pH 7.9, 120 mM potassium acetate, 3.5 mM magnesium acetate, 20% (w/v) glycerol) immediately before use. To immobilize biotinylated RNA targets, streptavidin (0.01 mg∙ml−1, Sigma) was incubated for 5 min in each microfluidic chamber. Unbound streptavidin was washed away with imaging buffer.

Immediately before each experiment, a flow cell was incubated for 2 min with imaging buffer supplemented with 75 μg∙ml−1 heparin (Sigma H4784), oxygen scavenging system (Crawford et al., 2008; Aitken et al., 2008) (2.5 mM protocatechuic acid [Aldrich 37580] and 0.5 U∙ml−1 Pseudomonas sp. protocatechuate 3,4-Dioxygenase [Sigma P8279]), triplet quenchers (Dave et al., 2009) (1 mM Trolox [Aldrich 238813], 1 mM propyl gallate (Sigma P3130), and 1 mM 4-nitrobenzyl alcohol [Aldrich N12821]). The chamber was then filled with ∼100 pM target in imaging buffer supplemented with 75 μg∙ml−1 heparin, oxygen scavenging system, and triplet quenchers. Target deposition was monitored by taking a series of images; once the desired density was achieved, the flow cell was washed three times with imaging buffer supplemented with oxygen scavenging system and triplet quenchers. A syringe pump (KD Scientific, Holliston, MA) running in withdrawal mode at 0.15 ml∙min−1 was applied to the flow cell outlet to introduce AGO2:miRNA complex (pre-heated to 37°C) supplemented with an oxygen scavenging system and triplet quenchers. Typically, 3,000 frames were collected at 5 frames per s. A digitally-controlled heater (TP-LH, Tokai Hit) maintained objective temperature at 42°C. A custom fabricated heating stage (Smith et al., 2019) was heated to 40°C to achieve a sample temperature of 37°C. The temperature on the surface of the cover glass was independently monitored with a Type E, 0.25 mM O.D. thermocouple (Omega Engineering Inc., Sutton, MA) inserted between the top and the bottom cover glasses.

Imaging was performed on an IX81-ZDC2 zero-drift inverted microscope equipped with a cellˆTIRF motorized multicolor TIRF illuminator with 561 and 640 nm 100 mW lasers and a 100×, oil immersion, 1.49 numerical aperture UAPO N TIRF objective with FN=22 (Olympus, Tokyo, Japan). Fluorescence signals were split with a main dichroic mirror (Olympus OSF-LFQUAD) and triple emission filter (Olympus U-CZ491561639M). The primary image was relayed to two ImagEM X2 EM-CCD cameras (C9100-23B, Hamamatsu Photonics, Hamamatsu, Japan) using a Cairn three-way splitter equipped with a longpass dichroic mirror (T635lpxr-UF2, Chroma). Illumination and acquisition parameters were controlled with cellˆTIRF and MetaMorph software (Molecular Devices, Sunnyvale, CA), respectively.

Images were recorded as uncompressed TIFF files and merged into stacked TIFF files. Images were processed using the pipeline (Smith et al., 2019) as described in the manual. Co-localization events required that (1) the intensity of AGO2:miRNA complex >150 photons, (2) ratio intensity of the AGO2:miRNA complex to the local background >1, (3) the distance between the target and guide was < 1.2 pixel, and (4) sigma < 4.6. To exclude short, non-specific events, the minimal event duration was set to 2 frames. To overcome short temporary loss of miRISC fluorescent signal due to fluorescent dye blinking, the gap parameter was set to 2 frames. Only the first binding event at each target location was used to estimate arrival time and dwell time, to minimize errors caused by occupation of sites by photobleached molecules. The same analysis was automatically performed on ‘dark’ locations, i.e., regions that contained no target molecules; these served as a control for non-specific binding of AGO2:miRNA complex to the surface of the cover glass.

RNA bind-n-Seq for de novo site discovery and KD measurements

Two libraries of RNA oligonucleotides, each containing a central region of 20 random-sequence positions (Table S2), were synthesized with equal ratio of bases (25:25:25:25) (IDT), 5′ 32P-radiolabeled, and gel-purified. After phenol-chloroform extraction and ethanol precipitation, RNA was denatured at 90°C for 1 min, annealed to BRTP primer (Table S2) and reverse transcribed using SuperScript III. RNA was degraded by alkaline hydrolysis using 0.4 M sodium hydroxide for 1 h at 55°C, and cDNA was recovered by ethanol precipitation. The sample was then amplified with AccuPrime Pfx DNA Polymerase (Invitrogen). The reactions were run on a 2% agarose gel, amplicons were purified then sequenced using a NextSeq 500 (Illumina) to obtain 75-nt, single-end reads.

Because of the randomness of the central region, sequence composition differed between the two libraries. For example, frequencies of the four nucleotides at each position within the randomized region of RNA molecules varied between the two libraries (Figure S1A). RNA pool 1 was used in RBNS of AGO2:miR-34b trials 1 and 2, AGO2:let-7a trial1, and let-7a trials 1 and 2; RNA pool 2 was used in RBNS of AGO2:let-7a trial 2, AGO2:let-7a trials 1 and 2 (recovery of bound RNA using M2 FLAG beads), AGO2:miR-449a – blockers trials 1 and 2, AGO2:miR-449a + blockers trials 1 and 2, miR-34b trials 1 and 2, miR-449a trials 1 and 2, and association experiments. DNA blocking oligonucleotides were synthesized (IDT) and annealed to RNA library in 30 mM HEPES-KOH, pH 7.5, 120 mM potassium acetate, 3.5 mM magnesium acetate using a 1:1.2 molar ratio of RNA pool to DNA blockers by first incubating at 95°C for 1 min, then at 65°C for 10 min, and finally cooled to room temperature.

Each experiment included five or six binding reactions. The highest concentration of miRISC used corresponded to 40% (v/v) of the stock solution and equaled 0.8–5 nM (f.c.) active protein. For additional reactions, the stock was serially diluted 3.2-fold in storage buffer. Each experiment also included a mock binding reaction (no-RISC control) using protein storage buffer without miRISC. For each miRNA, we performed an additional binding reaction using protein storage buffer with miRNA guide at the highest miRISC concentration assayed, but lacking AGO2 protein. All binding reactions (20 μl) were performed in 25 mM HEPES-KOH, pH 7.9, 110 mM potassium acetate, 3.5 mM magnesium acetate, 0.01% (w/v) CHAPS, 2 mM DTT, 8% (w/v) glycerol and contained 100 nM (f.c.) RNA library. To reduce non-specific binding, each reaction also included 2.5 μg∙μl−1 BSA and 0.5 μg∙μl−1 yeast tRNA. Reactions were incubated for 2 h at 37°C and then filtered through a Protran nitrocellulose membrane (Whatman, GE Healthcare Bioscience, Pittsburgh, PA) on top of a Hybond-XL (Cytiva) nylon membrane in a Bio-Dot apparatus (Bio-Rad, Hercules, CA). To reduce retention of free single-stranded RNA, we pre-conditioned nitrocellulose and nylon membranes prior to use as described (Smolarsky and Tal, 1970; Wong and Lohman, 1993). Nitrocellulose filters were pre-soaked in 0.4 M potassium hydroxide for 10 min. Nylon filters were incubated in 0.1 M EDTA, pH 8.2 for 10 min, washed three times in 1 M sodium chloride for 10 min each followed by a quick rinse (∼15 s) in 0.5 M sodium hydroxide. Nitrocellulose and nylon filters were then rinsed in water until the pH returned to neutral and equilibrated in wash buffer (20 mM HEPES-KOH, pH 7.9, 100 mM potassium acetate, 3.5 mM magnesium acetate, 1 mM DTT) for at least 1 h at 37°C. After applying the sample under vacuum, membranes were washed with 100 μl wash buffer for 3 s. Membranes were air-dried and signals detected by phosphorimaging to monitor binding. The nitrocellulose membranes containing miRISC-bound RNA were excised and incubated with 1 μg∙μl−1 Proteinase K (Thermo Fischer) in 100 mM Tris-HCl, pH 7.5, 10 mM EDTA, 150 mM sodium chloride, 1% (w/v) SDS for 1 h at 45°C shaking at 300 rpm. After phenol-chloroform extraction and ethanol precipitation, RNA was reverse transcribed, amplified, and sequenced using the procedure described above for the RNA pool.

RNA bind-n-Seq using paramagnetic beads

Binding reactions were assembled as described above. After incubation for 1 h at 37°C, reactions were transferred to tubes containing 10 μl anti-FLAG M2 paramagnetic beads (Sigma). Prior to use, beads were washed three times with wash buffer (20 mM HEPES-KOH, pH 7.9, 100 mM potassium acetate, 3.5 mM magnesium acetate, 1 mM DTT) and incubated for 1 h at 37°C with wash buffer containing 2.5 μg∙μl−1 BSA and 0.5 μg∙μl−1 yeast tRNA. After adding binding reactions to beads, samples were incubated at 37°C for 1 h. Beads were captured in a magnetic stand, the supernatant containing unbound RNA removed, and the beads washed with 100 μl wash buffer for 3 s. miRISC and bound RNA were eluted twice for 45 min at room temperature with 100 ng∙ml−1 3XFLAG peptide in wash buffer. RNA libraries were prepared as described above.

RNA bind-n-Seq for kinetic measurements

The nitrocellulose and nylon membranes were prepared as described above. Binding reactions were performed in 25 mM HEPES-KOH, pH 7.9, 110 mM potassium acetate, 3.5 mM magnesium acetate, 0.01% (w/v) CHAPS, 2 mM DTT, 8% (w/v) glycerol, and contained 100 nM (f.c.) RNA library (RNA pool 2) and 136 pM (trial 1) or 170 pM (trial 2) AGO2:let-7a. To reduce non-specific binding, each reaction also included 2.5 μg∙μl−1 BSA and 0.5 μg∙μl−1 yeast tRNA. All components except the RNA pool were combined to generate a master mix, which was aliquoted and placed at 37°C. The RNA pool was added to the first aliquot, and the timer started. After 1 h, the RNA pool was added to the second aliquot. After 1.5 h, the RNA pool was added to the third aliquot. This procedure was repeated until all aliquots received RNA pool at the desired time points. After 2 h from initiating the first binding reaction, all the samples were applied under vacuum to the membranes, which were washed with 100 μl wash buffer. The nitrocellulose membranes containing miRISC-bound RNA were excised, RNA extracted, reverse-transcribed, amplified, and sequenced as described above.

Quantification and statistical analysis

Quantification of active miRISC by double-filter binding assay

To measure concentration of active, binding-competent miRISC, titration data were fit to

f(r)=fmax×r+KD+n(r+KD+n)24×r×n2×n

where KD is the apparent dissociation constant, r is the molar ratio of [RISC] to [RNA], n is the stochiometric equivalence point, ƒ is the fraction bound, ƒmax is the maximum fraction bound.

Co-localization single-molecule spectroscopy

The individual datasets were saved and combined. The binding rate (kon) was determined by fitting the cumulative fractions of miRISC arrivals to:

f(t)=1(1h)×ekon×th×e(kon+kNS)×t

and reported per time unit and concentration of introduced miRISC. A dwell time distribution was fitted as

f(t)=N×ekoff×t+A×ekNS1×t+B×ekNS2×t.

Parameters relative to non-specific association of miRISC with the glass surface (kNS, on-rate for non-specific arrivals; h, fraction of control locations having received non-specific arrivals; kNS1 and kNS2, off-rate for non-specific binding events and rate of photobleaching; A and B, their respective amplitudes) were determined from the fitting of data for control locations. Values of kon and koff were derived from data collected from >900 individual RNA target molecules. Error was evaluated by 1,000-cycle bootstrapping of 90% of the data.

Quality control of high-throughput sequencing data

Only Illumina reads containing TGG (the first nucleotides of the 3′ adapter) at positions 21–23 were analyzed. Sequences were filtered (Phred quality score ≥20 for all nucleotides, and “N” base calls disallowed), and the 3′ adapter sequence (5′-TGG AAT TCT CGG GTG CCA AGG-3′) removed.

Enrichment values

Occurrences of all 10-nt long motifs (10-mers) were counted in all the reads of each RBNS sample. These counts were then divided by the total count of all 10-mers to give motif frequencies. Enrichment of a motif was computed as the ratio of the motif frequency in the protein-bound samples over the frequency in the RNA pool. Z-score of a motif was computed as Z=RR¯S where R is enrichment of the motif, R¯ is the mean of enrichment values of all 10-mers, and S is the sample standard deviation of enrichment values of all 10-mers. A motif was considered significant if its Z-score was ≥99.9 percentile.

De novo site discovery

Enrichments in the library from the binding reaction with the greatest miRISC concentration were used for the following iterative procedure: (1) enrichment values of all 10-mers were calculated; (2) the hundred most enriched 10-mers were interrogated for base-pairing with the guide miRNA; (3) the most enriched site type was identified; (4) Z-scores of motifs belonging to the site type were compared to the Z-score threshold; (5) all reads containing the binding site were masked in the miRISC-bound library and the RNA pool so that stepwise enrichments of subsequent 10-mers could be used to eliminate subsequent ‘shadow’ motifs; (6) all enrichment values were then recalculated on the masked read sets to obtain the resulting most enriched 10-mers. This process continued until the Z-score of the most enriched binding site (calculated from the original enrichment values) was < 99.9 percentile.

To identify a binding site at each iteration, the one-hundred most enriched 10-mers were tested for base-pairing with the guide miRNAs. If perfect complementarity was not observed, the 10-mer was tested for any of the following in this order: (1) complementarity to nine contiguous miRNA positions, allowing a single bulged target nucleotide; (2) complementarity to ten contiguous miRNA positions while allowing for wobble pairing; (3) complementarity to ten contiguous miRNA positions while allowing a non-wobble mismatch. If none of these configurations allowed assigning the motif to a binding site, the procedure was repeated with two 9-mers within the 10-mer, the three 8-mers within the 10-mer, etc., until a configuration of base-pairing was identified.

Read assignments

Each sequencing read in RNA pool and miRISC-bound libraries was interrogated for presence of all binding sites of interest. The entire single-stranded sequence was interrogated: the 20-nt random-sequence region flanked by constant primer-binding sequences in the case when blockers were not used and the 20-nt random-sequence region flanked by 4 nucleotides of constant primer-binding sequence on either side in the case when blockers were annealed to the RNA pool. For analysis of published datasets (McGeary et al., 2019), the entire 87-nt sequenced encompassing the 37-nt random sequence region and constant primer-binding regions was searched. A read was assigned to a site category if it contained one single binding motif. Reads containing multiple instances of binding sites (from the same or a different site category) and reads containing partially overlapping sites were not included in the analysis and represented ≤1% of libraries. Reads that did not have any of binding motifs of interest were classified as reads with a no-site.

Modeling of RBNS experiments

RBNS data was modeled by simulating the equilibrium binding of an RNA Binding Protein (RBP) with an RNA input pool. The stock solution of the RBP was set to 2.1 nM (dataset 1) and 8.1 nM (datasets 2 and 3). Each in silico experiment included five binding reactions. The highest concentration of the RBP used corresponded to 40% (v/v) of the stock solution. For additional reactions, the stock was serially diluted 3.2-fold. Each experiment also included a mock binding reaction (no-RISC control). We considered the RBP binding to an RNA pool (100 nM f.c.) containing four binding sites with affinities equal to 5, 20, 100 and 500 pM (datasets 1 and 2) or 10, 50, 250 and 1,000 pM (dataset 3). The concentration of specific motifs was set to 15 pM. We also included nonspecific binding sites (KD, nosite = 5 nM). We constructed a system of equations relating the concentrations of the free and bound states of RPB and binding sites to the KD values for each binding site and the total concentrations of each species. This system was solved numerically for each input value of RBP in MATLAB using function fsolve. We allowed recovery of 0.1 nM background RNA.

Kinetic RBNS data was modeled by simulating the association of an RBP with an RNA input pool. Concentration of the RBP was set to 150 pM. Each in silico experiment included fourteen time points, ranging from 0 to 7,200 s. We considered the RBP binding to an RNA pool (100 nM f.c.) containing four binding sites with kon1=0.1 nM–1s–1, kon2=0.01 nM–1s–1, kon3=0.001 nM–1s–1, kon4=0.0001 nM–1s–1, konno-site=0.0001 nM–1s–1 and koff1=0.0001 s–1, koff2=koff3=koff4=0.01 s–1, koffno-site=10 s–1 (dataset 1) and kon1=kon2=kon3=kon4=0.1 nM–1s–1, konno-site=0.001 nM–1s–1 and koff1=0.00001 s–1, koff2=0.001 s–1, koff3=0.01 s–1, koff4=0.1 s–1, koffno-site=10 s–1 (dataset 2). We constructed a system of equations relating the concentrations of the free and bound states of RPB and binding sites to the KD values for each binding site and the total concentrations of each species. This system was solved numerically for each input value of RBP in MATLAB using function solve. We allowed recovery of 0.1 nM background RNA.

Concentrations of each site recovered in equilibrium or kinetic RBNS in silico experiments were converted to number of molecules. The input RNA pool and bound RNA were subsampled with no replacement to yield 20×106 molecules—our typical sequencing depth.

Maximum likelihood estimation (MLE) of parameters from RBNS data

Let M be a mathematical model that predicts read counts of each binding site type given information about the RNA pool and a set of KD values for each site type. Let D be observed data from RNA sequencing of n binding reactions. Maximum Likelihood Estimation (MLE) is used to estimate KD values, so that D is the most probable given M. We note the likelihood function L(D|M):

L(D|M)=L((Reads1,,Readsj,,Readsn)|(KD1,,KDi,,KDm)) (Equation 1)

where Readsj is sequencing data from the binding reaction j and KDi is dissociation constant for binding site i. Binding in reaction j is independent from binding in reaction j+1, so the joint probability is a product of individual probabilities:

P((Reads1,,Readsj,,Readsn)|(KD1,,KDi,,KDm))=j=1nP(Readsj|(KD1,,KDi,,KDm)) (Equation 2)

Each probability is a value comprised between 0 and 1. Multiplying small values is prone to numerical underflow and introduces errors because the computer can only store a certain number of digits. Therefore, we describe L in terms of log conditional probabilities:

L(D|M)=ln[P((Reads1,,Readsj,,Readsn)|(KD1,,KDi,,KDm))]
L(D|M)=ln[j=1nP(Readsj|(KD1,,KDi,,KDm))]
L(D|M)=j=1nln[P(Readsj|(KD1,,KDi,,KDm))] (Equation 3)

In the binding reaction j, observations from RNA sequencing follow Negative Multinomial Distribution with possible outcomes {site1, site2, …, sitem−1, noiste}. Therefore, the probability mass function is:

Pj((reads1,j,,readsi,j,,readsm1,j|readsm,j),{p1,,pi,,pm1})=(i=1mreadsi,j1)!×pm,jreadsm,j(readsm,j1)!×i=1m1pi,jreadsi,j(readsi,j)! (Equation 4)

where pi,j is the exected frequency of a binding site type i in a binding reaction j and is given by the mathematical model:

pi,j=xi,ji=1mxi,j (Equation 5)

with xi,j being the concentration of a binding site i in binding reaction j predicted by M.

After combining Equations (3), (4), and (5), L(D|M) becomes:

L(D|M)=j=1nln[(i=1mreadsi,j)!×readsm,ji=1mreadsi,j×i=1m(xi,j/i=1mxi,j)readsi,j(readsi,j)!] (Equation 6)

Cost function for parameter fitting

Wedefineacostfunction,fcost,asfcost=L(D|M). (Equation 7)

A cost function maps values of variables into a real number, intuitively representing some “cost” associated with the event. An optimization procedure seeks to maximize the likelihood function, so that the model looks the most like the data. Because we define the cost function as the opposite of the likelihood function, fcost will be minimized upon fitting.

fcost=j=1nln[(i=1mreadsi,j)!×readsm,ji=1mreadsi,j×i=1m(xi,j/i=1mxi,j)readsi,j(readsi,j)!] (Equation 8)

Using product, quotient and power rules of natural logarithm, Equation (8) can be simplified to yield:

fcost=j=1n[ln((readsi,j)!)ln((i=1mreadsi,j)!)ln(readsm,ji=1mreadsi,j)+ln(i=1mxi,j)×i=1mreadsi,ji=1m(readsi,j×ln(xi,j))] (Equation 9)

We note that some terms in the expression of fcost do not depend on parameters from the mathematical model. Therefore, fcost can be written as

fcost=j=1n[ln(i=1mxi,j)×i=1mreadsi,ji=1m(readsi,j×ln(xi,j))]+C

where C=j=1n[ln((readsi,j)!)ln((i=1mreadsi,j)!)ln(readsm,ji=1mreadsi,j)]

Inclusion of a constant term may result in a loss of significant figures. Moreover, a function f and a function f+C have the same optima; therefore minimizing Equation (9) is equivalent to minimizing

fcost=j=1m[ln(i=1mxi,j)×i=1mreadsi,ji=1m(readsi,j×ln(xi,j))] (Equation 10)

Implementation of MLE

Minimization of the cost function is performed by minimize function from a Python-based library SciPy using the Limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm (L-BFGS) with default parameters on acceptance of convergence. Minimize requires two inputs: the function fcost to minimize and an initial guess of the variables. Both MLE of KD values and MLE of kinetic parameters use fcost defined above, but with xi,j coming from the appropriate mathematical model (see Mathematical Model for MLE of KD values and Mathematical Model for MLE of kon and koff values below). The BFGS method does not require the gradient of the cost function, as it can be estimated using fine differences. Nevertheless, the analytical gradient can be supplied and enhances the efficiency of the optimization process. Therefore, derivation of the cost function for MLE of KD values and MLE of kinetic parameters is developed below.

Mathematical model for MLE of KD values

We considered the simplified reaction:

RISC+RNARISC:RNA

For a given binding reaction j and a miRNA binding site i, the recovered concentration of sitei, xi,j, is a sum of si,j and nsi,j, which correspond to concentrations of miRISC-bound and non-specifically recovered sitei, respectively.

xi,j=si,j+nsi,j (Equation 11)

Concentrations of miRISC-bound sitei, depends on dissociation constant KD of site i:

KDi=[sitei,j]free×[AGO:miRNAfree]jsi,j (Equation 12)

where [AGO:miRNAfree]j is concentration of unbound miRISC in the reaction j and [sitei,j]free is the concentration of unbound site i in the binding reaction j.

Let [sitei]pool be the concentration of a binding site i in the RNA pool used in RBNS. After substituting [sitei,j]free by [sitei]poolsi,j, Equation (12) becomes:

si,j=[AGO:miRNAfree]j×[sitei]poolKDi+[AGO:miRNAfree]j (Equation 13)

We describe concentration of non-specifically recovered sitei by:

nsi,j=cj×[sitei,j]free (Equation 14)

where cj is a sample-specific proportionality constant and [sitei,j]free is the concentration of unbound site i in the binding reaction j.

We consider background to be the total concentration of all non-specifically recovered RNA, which is assumed to be the same in all binding reactions:

background=i=1mnsi,j=i=1mcj×[sitei,j]free=cj×i=1m[sitei,j]free
Therefore,cj=backgroundi=1m[sitei,j]free (Equation 15)

After substituting si,j by expression Equation (13) and combining with Equations (14) and (15), xi,j can be expressed in terms of KD of site i and background:

xi,j=[sitei]pool×([AGO:miRNAfree]jKDi+[AGO:miRNAfree]j)×(1backgroundk=1m[sitek]poolk=1m[AGO:miRNAfree]j×[sitek]poolKDi+[AGO:miRNAfree]j)+backgroundk=1m[sitek]poolk=1m[AGO:miRNAfree]j×[sitek]poolKDi+[AGO:miRNAfree]j (Equation 16)

Equation (16) predicts concentrations of each binding site type, given information on RNA pool and a set of KD values for each site type. We note that this equation also contains [AGO:miRNAfree]j, concentration of unbound miRISC in a reaction j. [AGO:miRNAfree]j cannot be calculated explicitly, as it depends on concentration of miRISC bound to all site types. Therefore, [AGO:miRNAfree]j is approximated at each iteration of the optimization routine by solving the equation:

DFj×ago[AGO:miRNAfree]ji=1m[AGO:miRNAfree]j×[sitei]poolKDi+[AGO:miRNAfree]j=0 (Equation 17)

To find root of Equation (17), minimize_scalar from a Python-based library SciPy is used within the interval (0, ago).

Parameters fitted during MLE of KD values

The parameters to optimize includes KD values of site1, …, sitem–1, and no-site, as well as background, the total concentration of all non-specifically recovered RNA, and ago, the stock concentration of active AGO:miRNA complex. Upon optimization, some parameters may receive negative values, which is meaningless for affinities and concentrations. Therefore, we perform exponential transformation and define θ1, …, θk, …, θm, θm+1, and θm+2 as

KDk=eθk for k in [1, m]

ago=eθm+1

background=eθm+2 (Equation 18)

Derivation of ƒcost for MLE of KD values

The function fgrad(θ) returns the derivative of fcost with respect to each θ :

fgrad(θ1,,θk,,θm+2)=(dfcostdθ1,,dfcostdθk,,dfcostdθm+2) (Equation 19)

We derive dfcostdθk using the chain rule:

dfcostdθk=j=1ni=1mfcostxi,j×xi,jθk (Equation 20)

First, to derive fcostxi,j, substitute fcost by Equation (10):

fcostxi,j=(j=1n[ln(i=1mxi,j)×i=1mreadsi,ji=1m(readsi,j×ln(xi,j))])xi,j

After using natural logarithm properties, the expression is simplified to:

fcostxi,j=z=1mreadsz,jz=1mxz,jreadsi,jxi,j (Equation 20.1)

Next, to derive dfcostdθk, we note that xi,j contains si,j and background, which depend on θk. Therefore, we use the chain rule to write:

xi,jθk=xi,jbackground×backgroundθk+z=1m(xi,jsz,j×sz,jθk) (Equation 20.2)

To calculate the partial derivative of xi,j with respect to background, we substitute xi,j by Equation (11), and combining it with Equations (14) and (15) yields:

xi,jbackground=(si,j+background×[sitei]poolsi,jz=1m([sitez]poolsz,j))background=[sitei]poolsi,jz=1m([sitez]poolsz,j) (Equation 20.2.1)

To calculate the partial derivative of background with respect to θk, we note that background=eθm+2; therefore:

backgroundθk=background×δk(m+2),whereδk(m+2)={1ifk=m+20ifkm+2 (Equation 20.2.2)

To calculate the partial derivative of xi,j with respect to si,j, we substitute xi,j by Equation (11), and combining it with Equations (14) and (15) yields:

xi,jsz,j=(si,j×(1backgroundk=1m([sitek]poolsk,j))+background×[sitei]poolk=1m([sitek]poolsk,j))sz,j

After using the composite function rule and rearranging, the partial derivative of xi,j with respect to si,j becomes:

xi,jsz,j=(1backgroundk=1m([sitek]poolsk,j))×δzi+background×[sitek]poolsi,j(k=1m[sitek]poolk=1msk,j)2,whereδzi={1ifz=i0ifzi (Equation 20.2.3)

By combining expressions of partial derivatives of xi,j with respect to background and si,j, expression Equation (20.2) becomes:

xi,jθk=[sitei]poolsi,jz=1m([sitez]poolsz,j)×background×δk(m+2)+z=1m([(1backgroundk=1m([sitek]poolsk,j))×δzi+background×[sitek]poolsi,j(k=1m[sitek]poolk=1msk,j)2]×sz,jθk)

which can be further simplified to yield:

xi,jθk=[sitei]poolsi,jz=1m([sitez]poolsz,j)×background×δk(m+2)+background×[sitek]poolsi,j(k=1m[sitek]poolk=1msk,j)2×z=1msz,jθk+(1backgroundk=1m[sitek]poolk=1msk,j)×si,jθk (Equation 20.3)

We note that xi,jθk requires expression of si,jθk. si,j is function of KDi and [AGO:miRNAfree]j; therefore we apply the chain rule:

si,jθk=si,jKDi×KDiθk+si,j[AGO:miRNAfree]j×[AGO:miRNAfree]jθk (Equation 20.4)

To calculate the partial derivative of si,j with respect to KD of site i, we substitute si,j by Equation (13):

si,jKDi=([AGO:miRNAfree]j×[sitei]poolKDi+[AGO:miRNAfree]j)KDi

We use the composite function rule to write:

si,jKDi=[AGO:miRNAfree]j×[sitei]pool(KDi+[AGO:miRNAfree]j)2 (Equation 20.4.1)

To calculate the partial derivative of KDi with respect to θk, we note that KDi=eθi for i in [1, m]:

KDiθk=eθiθk=eθi×δki=KDi×δki,whereδki={1ifk=i0ifki (Equation 20.4.2)

To calculate the partial derivative of si,j with respect to [AGO:miRNAfree]j, we substitute si,j by Equation (13):

si,j[AGO:miRNAfree]j=[AGO:miRNAfree]j([AGO:miRNAfree]j×[sitei]poolKDi+[AGO:miRNAfree]j)

After using the composite function rule and rearranging, the expression above becomes:

si,j[AGO:miRNAfree]j=KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2 (Equation 20.4.3)

To calculate the partial derivative of [AGO:miRNAfree]j with respect to θk, we note that [AGO:miRNAfree]j=DFj×agoi=1msi,j, where DFj is the dilution factor of miRISC in the binding reaction j. Therefore, we can write:

[AGO:miRNAfree]jθk=DFj×ago×δk(m+1)(i=1msi,j)θk,whereδk(m+1)={1ifk=m+10ifkm+1 (Equation 20.4.4.1)

Use Equation (20.4) to calculate (i=11si,j)θk:

(i=1msi,j)θk=i=1m(si,jKDi×KDiθk)+i=1msi,j[AGO:miRNAfree]j×[AGO:miRNAfree]jθk (Equation 20.4.4.2)

Combining Equation (20.4.4.2) with Equation (20.4.4.1) and substituting si,jKDi by Equation (20.4.1), KDiθk by Equation (20.4.2) and si,j[AGO:miRNAfree]j by Equation (20.4.3) yields after simplification:

(i=1msi,j)θk=i=1m([AGO:miRNAfree]j×[sitei]pool(KDi+[AGO:miRNAfree]j)2×KDi×δki)+i=1mDFj×ago×δk(m+1)×KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)21+i=1mKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2 (Equation 20.4.4.3)

We note that i=1m([AGO:miRNAfree]j×[sitei]pool(KDi+[AGO:miRNAfree]j)2×KDi×δki) can be simplified:

i=1m([AGO:miRNAfree]j×[sitei]pool(KDi+[AGO:miRNAfree]j)2×KDi×δki)=[AGO:miRNAfree]j×[sitei]pool×KDi(KDi+[AGO:miRNAfree]j)2×I[1m](k),

where I[1m](k)={1ifk[1m]0ifk[1m]

Therefore, Equation (20.4.4.3) becomes:

(i=1msi,j)θk=[AGO:miRNAfree]j×[sitei]pool×KDi(KDi+[AGO:miRNAfree]j)2×I[1m](k)+i=1mDFj×ago×δk(m+1)×KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)21+i=1mKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2 (Equation 20.4.4.4)

Finally, we can combine all the partial derivatives given by Equations (20.4.1), (20.4.2), (20.4.3), (20.4.4.1), and (20.4.4.4) to calculate si,jθk:

si,jθk=[AGO:miRNAfree]j×[sitei]pool(KDi+[AGO:miRNAfree]j)2×KDi×δki+KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2×[DFj×ago×δk(m+1)[AGO:miRNAfree]j××[sitei]pool×KDi(KDi+[AGO:miRNAfree]j)2×I[1m](k)+i=1mDFj×ago×δk(m+1)×KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)21+i=1mKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2] (Equation 20.4.4.5)

Equation (20.4.4.5) can be further re-arranged:

si,jθk=[AGO:miRNAfree]j×[sitei]pool×KDi(KDi+[AGO:miRNAfree]j)2×(δkiKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2×I[1m](k)1+i=1mKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2)+KDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2×DFj×ago×δk(m+1)1+i=1mKDi×[sitei]pool(KDi+[AGO:miRNAfree]j)2 (Equation 20.5)

The gradient of fcost can now be computed and is given by Equation (20):

dfcostdθk=j=1ni=1mfcostxi,j×xi,jθk

Substituting fcostxi,j by Equation (20.1) and xi,jθk by Equation (20.3) yields:

dfcostdθk=j=1ni=1m(z=1mreadsz,jz=1mxz,jreadsi,jxi,j)×{[sitei]poolsi,jz=1m([sitez]poolk=1msz,j)×background×δk(m+2)+background×[sitek]poolsi,jk=1m([sitek]poolk=1msk,j)2×z=1msz,jθk+(1backgroundk=1m[sitek]poolk=1msk,j)×si,jθk}

where (i=1msi,j)θk is given by Equation (20.4.4.4) and si,jθk is given by Equation (20.5).

Initial guess for MLE of KD values and calculation of 95% cIs

Bootstrapping of 95% of the data was performed ten times on sequencing reads from each binding reaction and the RNA pool. MLE of KD values was performed on each bootstrapped sample by using 100 different combinations of 10 initial guesses of miRISC concentration (in the range 0.5–25 nM) and 10 initial guesses of KD for RNA with no enriched site (in the range 0.5–10 nM). KD values were initialized as the inverse of the average enrichment values. The background was initialized at 0.1 nM. All the initial guesses were partially randomized by adding a value drawn from a normal distribution with mean 0 and standard deviation 0.1. The cost function was evaluated in the presence of physically meaningful constraints on the parameters: 0.1 pM ≤ KDsite ≤ 100 nM, 100 pM ≤ KDno-site ≤ 10,000 nM, 100 pM ≤ ago ≤ 100 nM, and 5 pM ≤ background ≤ 5 nM. Any of the fitted parameters were at the boundaries at the end of the optimization routine. KD estimates, the background, and the stock concentration of miRISC provided by MLE were used to predict counts of each binding site type in sequencing data. These counts were compared with observed sequencing data, and MLE results were retained if Pearson correlation coefficient was >0.90. Results from independent starting points satisfying this criterion were combined. All bootstrapped samples were combined. Finally, estimates from two independent RBNS assays were merged. Median and 95% confidence intervals on medians were reported.

Mathematical model for MLE of kon and koff values

We considered the simplified reaction:

RISC+RNAkoffkonRISC:RNA

For a binding reaction at time t and miRNA binding site i, the recovered concentration of sitei xi,t is a sum of miRISC-bound sitei, si,t, and non-specifically recovered sitei. Similar to how Equation (11) was obtained above to estimate KD, xi,t is described by

xi,t=si,t×(1backgroundk=1m([sitek]poolsk,t))+background×[sitei]poolk=1m([sitek]poolsk,t) (Equation 21)

where si,t is given by

dsidt=koni×[sitei]free×[AGO:miRNAfree]koffi×si (Equation 22)

As [sitei]free=[sitei]poolsi, Equation (22) becomes:

dsidt+si×(koni×[AGO:miRNAfree]+koffi)=koni×[sitei]pool×[AGO:miRNAfree] (Equation 23)

This is a linear differential equation with constant coefficient of the type:

y′(x)+ay(x)=b, whose solution is y(x)=ba+ceax.

Therefore,

si,t=koni×[sitei]pool×[AGO:miRNAfree]tkoni×[AGO:miRNAfree]t+koffi+c×e(koni×[AGO:miRNAfree]t+koffi)t

To define c, we consider initial conditions. There is no complex formation at t = 0, i.e., si,0=0 and [AGO:miRNAfree]0=[AGO:miRNA]total.

Therefore,

c=koni×[sitei]pool×[AGO:miRNA]totalkoni×[AGO:miRNA]total+koffi

and

si,t=koni×[sitei]pool×[AGO:miRNAfree]tkoni×[AGO:miRNAfree]t+koffikoni×[sitei]pool×[AGO:miRNA]totalkoni×[AGO:miRNA]total+koffi×e(koni×[AGO:miRNAfree]t+koffi)t (Equation 24)

After substituting si,t by Equation (24), the Equation (21) becomes:

xi,t=(koni×[sitei]pool×[AGO:miRNAfree]tkoni×[AGO:miRNAfree]t+koffikoni×[sitei]pool×[AGO:miRNA]totalkoni×[AGO:miRNA]total+koffi×e(koni×[AGO:miRNAfree]t+koffi)t)×(1backgroundk=1m[sitek]poolk=1m(konk×[sitek]pool×[AGO:miRNAfree]tkonk×[AGO:miRNAfree]t+koffkkonk×[sitek]pool×[AGO:miRNA]totalkonk×[AGO:miRNA]total+koffk×e(konk×[AGO:miRNAfree]t+koffk)t))+background×[sitei]poolk=1m[sitek]poolk=1m(konk×[sitek]pool×[AGO:miRNAfree]tkonk×[AGO:miRNAfree]t+koffkkonk×[sitek]pool×[AGO:miRNA]totalkonk×[AGO:miRNA]total+koffk×e(konk×[AGO:miRNAfree]t+koffk)t) (Equation 25)

Equation (25) predicts concentrations of each binding site type at time t, given information on RNA pool and a set of kon and koff values for each site type. Our typical binding experiment consists of 10 time points and measures concentrations of 10 binding sites, yielding a system of 100 xi,t equations in total. Some of these equations are likely linearly dependent on others; but we reasoned that there should remain enough independent equations to estimate 21 variables (kon and koff values of 10 sites, and the background). Therefore, we do not impose any additional constrains.

We also note that the Equation (25) contains [AGO:miRNAfree]t, concentration of unbound miRISC at time t. [AGO:miRNAfree]t cannot be calculated explicitly, as it depends on the concentration of miRISC bound to all site types. Therefore, [AGO:miRNAfree]t is approximated at each iteration of the optimization routine by solving the equation:

[AGO:miRNA]total[AGO:miRNAfree]ti=1m(koni×[sitei]pool×[AGO:miRNAfree]tkoni×[AGO:miRNAfree]t+koffikoni×[sitei]pool×[AGO:miRNA]totalkoni×[AGO:miRNA]total+koffi×e(koni×[AGO:miRNAfree]t+koffi)t)=0 (Equation 26)

To find root of Equation (26), minimize_scalar from a Python-based library SciPy is used within the interval (0, [AGO:miRNA]total). In the current version of implementation, [AGO:miRNA]total is supplied by the user, not fit.

Parameters fitted during MLE of kon and koff values

The set of parameters to optimize contains kon and koff values of site1, …, sitem–1, and no-site, as well as background, the total concentration of all non-specifically recovered RNA. Upon optimization, some parameters may receive negative values, which is meaningless for kinetic rate constants and concentrations. Therefore, we perform exponential transformation and define θ1, …, θm, …, θ2m, and θ2m+1 as

konk=eθk for k in [1, m]

koffk=eθm+k for k in [1, m]

background=eθ2m+1 (Equation 27)

Derivation of ƒcost for MLE of kon and koff values

The function fgradθ returns the derivative of fcost with respect to each θ :

fgrad(θ1,,θ2m+1)=(dfcostdθ1,,dfcostdθ2m+1) (Equation 28)

We derive dfcostdθk using the chain rule:

dfcostdθk=t=1ni=1mfcostxi,t×xi,tθk (Equation 29)

As demonstrated above, fcostxi,t is given by the Equation (20.1) and xi,tθk is given by the Equation (20.3). The Equation (20.3) contains si,tθk and z=1msz,jθk whose expressions differ from RBNS experiments at equilibrium, and therefore should be calculated.

si,t contains koni, koffi, and [AGO:miRNAfree]t that depend on θk, so partial derivatives are required. We use the chain rule to write:

si,tθk=si,tkoni×koniθk+si,tkoffi×koffiθk+si,t[AGO:miRNAfree]t×[AGO:miRNAfree]tθk (Equation 30)

To calculate the partial derivative of si,j with respect to koni, we use the expression of si,j from Equation (24). After applying the quotient rule and some rearrangements, we obtain:

si,tkoni=[sitei]pool×[AGO:miRNAfree]t×koffi(koni×[AGO:miRNAfree]t+koffi)2[sitei]pool×[AGO:miRNA]totalkoni×[AGO:miRNA]total+koffi×e(koni×[AGO:miRNAfree]t+koffi)t×(koffikoni×[AGO:miRNA]total+koffi[AGO:miRNAfree]t×t×koni) (Equation 30.1)

To calculate the partial derivative of koni with respect to θk, we note that konk=eθk for k in [1, m]:

koniθk=eθiθk=eθi×δki=koni×δki,whereδki={1ifk=i0ifki (Equation 30.2)

To calculate the partial derivative of si,j with respect to koffi, we use the expression of si,j from Equation (24). After applying the quotient rule and some rearrangements, we obtain:

si,tkoffi=koni×[sitei]pool×[AGO:miRNAfree]t(koni×[AGO:miRNAfree]t+koffi)2+koni×[sitei]pool×[AGO:miRNA]total(koni×[AGO:miRNAfree]t+koffi)2×e(koni×[AGO:miRNAfree]t+koffi)t×(t+1koni×[AGO:miRNA]total+koffi) (Equation 30.3)

To calculate the partial derivative of koffi with respect to θk, we note that koffk=eθm+k for k in [1, m]:

koffiθk=eθ(m+i)θk=eθ(m+i)×δk(m+i)=koffi×δk(m+i),whereδk(m+i)={1ifk=m+i0ifkm+i (Equation 30.4)

To calculate the partial derivative of si,j with respect to [AGO:miRNAfree]t, we use the expression of si,j from Equation (24). After applying the quotient rule and some rearrangements, we obtain:

si,t[AGO:miRNAfree]t=koni×[sitei]pool×koffi(koni×[AGO:miRNAfree]t+koffi)2+koni×[sitei]pool×[AGO:miRNA]totalkoni×tkoni×[AGO:miRNA]total+koffi×e(koni×[AGO:miRNAfree]t+koffi)t (Equation 30.5)

To calculate the partial derivative of [AGO:miRNAfree]t with respect to θk, we note that

[AGO:miRNAfree]tθk=([AGO:miRNA]totali=1msi,t)θk=i=1msi,tθk

Substituting si,tθk by Equation (30) and simplifying the expression yields:

[AGO:miRNAfree]tθk=i=1m(si,tkoni×koniθk)+i=1m(si,tkoffi×koffiθk)1+i=1msi,t[AGO:miRNAfree]t

Some terms can be simplified further:

i=1m(si,tkoni×koniθk)=i=1m(si,tkoni×koni×δki)=si,tkoni×koni×I[1…m](k),whereI[1...m](k)={1ifk[1...m]0ifk[1...m]
i=1m(si,tkoffi×koffiθk)=i=1m(si,tkoffi×koffi×δk(m+i))=si,tkoffi×koffi×I[m+1...2m](k),whereI[m+1...2m](k)={1ifk[m+1...2m]0ifk[m+1...2m]

After simplifying, we obtain:

[AGO:miRNAfree]tθk=si,tkoni×koni×I[1m](k)+si,tkoffi×koffi×I[m+12m](k)1+i=1msi,t[AGO:miRNAfree]t (Equation 30.6)

The gradient of fcost can now be computed.

Because of the complexity of the equations, the full solution of dfcostdθk is not shown.

dfcostdθk is given by Equation (29), where xi,tθk is substituted by Equation (20.3), si,tθk by Equation (30), si,tkoni×koniθk by Equations (30.1) and (30.2), si,tkoffi×koffiθk by Equations (30.3) and (30.4), si,t[AGO:miRNAfree]t by Equation (30.5), and [AGO:miRNAfree]tθk by Equation (30.6).

Initial guess for MLE of kon and koff values and calculation of 95% cIs

Bootstrapping of 95% of the data was performed ten times on sequencing reads from each time point and the RNA pool. MLE was performed on each bootstrapped sample 100 times. The kon values of all sites were initialized as Enrichmentsite30 sEnrichmentsite0 s(t30 st0 s)×[RISC]total. The koff values of binding sites were initialized as the inverse of the average enrichment values, and the koff for no-site was set to 100 s–1. The background was initialized at 0.1 nM. All the initial guesses were partially randomized by adding a value drawn from a normal distribution with mean 0 and standard deviation 0.001 (except that standard deviation 1 or 0.1 was used for koffno-site and the background, respectively). The cost function was evaluated in the presence of physically meaningful constraints on the parameters: 105 M–1 s–1konall ≤ 109 M–1 s–1, 0.0005 s–1koffsite ≤ 20 s–1, 0.0005 s–1koffno-site ≤ 200 s–1, and 5 pM ≤ background ≤ 10 nM. None of the fitted parameters were at the boundaries at the end of the optimization routine. The estimates provided by MLE were used to predict counts of each binding site type in sequencing data. These counts were compared with observed sequencing data, and MLE results were retained if Pearson correlation coefficient was >0.90. Results from independent starting points satisfying this criterion were combined. All bootstrapped samples were combined. Finally, estimates from two independent RBNS assays were merged. Median and 95% confidence intervals on medians were reported.

Acknowledgments

We thank members of the Zamore laboratory for critical comments on the manuscript, Ildar Gainetdinov for advice on designing the RNA library, Amena Arif for gift of pScalps Puro EGFP plasmid, and Ken-Edwin Aryee for providing psPAX2 and pMD2.G plasmids. This work was supported in part by National Institutes of Health grant R35 GM136275 to P.D.Z. K.J. was supported by a Charles A. King Trust Postdoctoral Fellowship.

Author contributions

Conceptualization, K.J., J.V.-B., and P.D.Z.; methodology, K.J. and J.V.-B.; investigation, K.J. and J.V.-B.; formal analysis, K.J.; software, K.J.; writing – original draft, K.J.; writing – review & editing, K.J. and P.D.Z.; supervision, P.D.Z.; funding acquisition, P.D.Z.

Declaration of interests

The authors declare no competing interests.

Published: March 18, 2022

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.crmeth.2022.100185.

Contributor Information

Karina Jouravleva, Email: karina.jouravleva@umassmed.edu.

Phillip D. Zamore, Email: phillip.zamore@umassmed.edu.

Supplemental information

Document S1. Figures S1–S7 and Tables S1–S4
mmc1.pdf (1.4MB, pdf)
Document S2. Article plus supplemental information
mmc2.pdf (4.6MB, pdf)

Data and code availability

RBNS sequencing data have been deposited at National Center for Biotechnology Information Sequence Read Archive and are publicly available as of the date of publication using accession number PRJNA807105.

All published software required to reanalyze the data reported in this paper is described in the quantification and statistical analysis section below. All original code has been deposited at https://figshare.com/articles/software/MicroRNA-binding_thermodynamics_and_kinetics_by_RNA_Bind-n-Seq/19180952 and is publicly available as of the date of publication.

Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

References

  1. Aitken C.E., Marshall R.A., Puglisi J.D. An oxygen scavenging system for improvement of dye stability in single-molecule fluorescence experiments. Biophys. J. 2008;94:1826–1835. doi: 10.1529/biophysj.107.117689. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Baek D., Villén J., Shin C., Camargo F.D., Gygi S.P., Bartel D.P. The impact of microRNAs on protein output. Nature. 2008;455:64–71. doi: 10.1038/nature07242. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Bazzini A.A., Lee M.T., Giraldez A.J. Ribosome profiling shows that miR-430 reduces translation before causing mRNA decay in zebrafish. Science. 2012;336:233–237. doi: 10.1126/science.1215704. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Becker W.R., Ober-Reynolds B., Jouravleva K., Jolly S.M., Zamore P.D., Greenleaf W.J. High-throughput analysis reveals rules for target RNA binding and cleavage by AGO2. Mol. Cell. 2019;75:741–755.e11. doi: 10.1016/j.molcel.2019.06.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Buenrostro J.D., Araya C.L., Chircus L.M., Layton C.J., Chang H.Y., Snyder M.P., Greenleaf W.J. Quantitative analysis of RNA-protein interactions on a massively parallel array reveals biophysical and evolutionary landscapes. Nat. Biotechnol. 2014;32:562–568. doi: 10.1038/nbt.2880. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Campbell Z.T., Wickens M. Probing RNA-protein networks: biochemistry meets genomics. Trends Biochem. Sci. 2015;40:157–164. doi: 10.1016/j.tibs.2015.01.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Chi S.W., Hannon G.J., Darnell R.B. An alternative mode of microRNA target recognition. Nat. Struct. Mol. Biol. 2012;19:321–327. doi: 10.1038/nsmb.2230. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Clore G.M., Gronenborn A.M., Davies R.W. Theoretical aspects of specific and non-specific equilibrium binding of proteins to DNA as studied by the nitrocellulose filter binding assay. Co-operative and non-co-operative binding to a one-dimensional lattice. J. Mol. Biol. 1982;155:447–466. doi: 10.1016/0022-2836(82)90481-8. [DOI] [PubMed] [Google Scholar]
  9. Cook K.B., Vembu S., Ha K.C.H., Zheng H., Laverty K.U., Hughes T.R., Ray D., Morris Q.D. RNAcompete-S: combined RNA sequence/structure preferences for RNA binding proteins derived from a single-step in vitro selection. Methods. 2017;126:18–28. doi: 10.1016/j.ymeth.2017.06.024. [DOI] [PubMed] [Google Scholar]
  10. Crawford D.J., Hoskins A.A., Friedman L.J., Gelles J., Moore M.J. Visualizing the splicing of single pre-mRNA molecules in whole cell extract. RNA. 2008;14:170–179. doi: 10.1261/rna.794808. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Danan C., Manickavel S., Hafner M. PAR-CLIP: a method for transcriptome-wide identification of RNA binding protein interaction sites. Methods Mol. Biol. 2016;1358:153–173. doi: 10.1007/978-1-4939-3067-8_10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Dave R., Terry D.S., Munro J.B., Blanchard S.C. Mitigating unwanted photophysical processes for improved single-molecule fluorescence imaging. Biophys. J. 2009;96:2371–2381. doi: 10.1016/j.bpj.2008.11.061. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Dignam J.D., Lebovitz R.M., Roeder R.G. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 1983;11:1475–1489. doi: 10.1093/nar/11.5.1475. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Doench J.G., Sharp P.A. Specificity of microRNA target selection in translational repression. Genes Dev. 2004;18:504–511. doi: 10.1101/gad.1184404. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Dominguez D., Freese P., Alexis M.S., Su A., Hochman M., Palden T., Bazile C., Lambert N.J., Van Nostrand E.L., Pratt G.A., et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell. 2018;70:854–867.e9. doi: 10.1016/j.molcel.2018.05.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Elkayam E., Kuhn C.D., Tocilj A., Haase A.D., Greene E.M., Hannon G.J., Joshua-Tor L. The structure of human argonaute-2 in complex with miR-20a. Cell. 2012;150:100–110. doi: 10.1016/j.cell.2012.05.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Erickson M.G., Alseikhan B.A., Peterson B.Z., Yue D.T. Preassociation of calmodulin with voltage-gated Ca(2+) channels revealed by FRET in single living cells. Neuron. 2001;31:973–985. doi: 10.1016/s0896-6273(01)00438-x. [DOI] [PubMed] [Google Scholar]
  18. Erickson M.G., Liang H., Mori M.X., Yue D.T. FRET two-hybrid mapping reveals function and location of L-type Ca2+ channel CaM preassociation. Neuron. 2003;39:97–107. doi: 10.1016/s0896-6273(03)00395-7. [DOI] [PubMed] [Google Scholar]
  19. Flores-Jasso C.F., Salomon W.E., Zamore P.D. Rapid and specific purification of Argonaute-small RNA complexes from crude cell lysates. RNA. 2013;19:271–279. doi: 10.1261/rna.036921.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Frey K.A., Albin R.L. Receptor binding techniques. Curr. Protoc. Neurosci. 2001;1:1.4. doi: 10.1002/0471142301.ns0104s00. [DOI] [PubMed] [Google Scholar]
  21. Garcia D.M., Baek D., Shin C., Bell G.W., Grimson A., Bartel D.P. Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs. Nat. Struct. Mol. Biol. 2011;18:1139–1146. doi: 10.1038/nsmb.2115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Garland P.B. Optical evanescent wave methods for the study of biomolecular interactions. Q. Rev. Biophys. 1996;29:91–117. doi: 10.1017/s0033583500005758. [DOI] [PubMed] [Google Scholar]
  23. Guo H., Ingolia N.T., Weissman J.S., Bartel D.P. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature. 2010;466:835–840. doi: 10.1038/nature09267. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Hale M.A., Richardson J.I., Day R.C., McConnell O.L., Arboleda J., Wang E.T., Berglund J.A. An engineered RNA binding protein with improved splicing regulation. Nucleic Acids Res. 2018;46:3152–3168. doi: 10.1093/nar/gkx1304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Harris C.R., Millman K.J., van der Walt S.J., Gommers R., Virtanen P., Cournapeau D., Wieser E., Taylor J., Berg S., Smith N.J., et al. Array programming with NumPy. Nature. 2020;585:357–362. doi: 10.1038/s41586-020-2649-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Hellman L.M., Fried M.G. Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions. Nat. Protoc. 2007;2:1849–1861. doi: 10.1038/nprot.2007.249. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Hendrickson D.G., Hogan D.J., McCullough H.L., Myers J.W., Herschlag D., Ferrell J.E., Brown P.O. Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol. 2009;7:e1000238. doi: 10.1371/journal.pbio.1000238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Huntzinger E., Izaurralde E. Gene silencing by microRNAs: contributions of translational repression and mRNA decay. Nat. Rev. Genet. 2011;12:99–110. doi: 10.1038/nrg2936. [DOI] [PubMed] [Google Scholar]
  29. Jarmoskaite I., AlSadhan I., Vaidyanathan P.P., Herschlag D. How to measure and evaluate binding affinities. Elife. 2020;9:e57264. doi: 10.7554/eLife.57264. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Kishore S., Jaskiewicz L., Burger L., Hausser J., Khorshid M., Zavolan M. A quantitative analysis of CLIP methods for identifying binding sites of RNA-binding proteins. Nat. Methods. 2011;8:559–564. doi: 10.1038/nmeth.1608. [DOI] [PubMed] [Google Scholar]
  31. König J., Zarnack K., Rot G., Curk T., Kayikci M., Zupan B., Turner D.J., Luscombe N.M., Ule J. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 2010;17:909–915. doi: 10.1038/nsmb.1838. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Krek A., Grün D., Poy M.N., Wolf R., Rosenberg L., Epstein E.J., MacMenamin P., da Piedade I., Gunsalus K.C., Stoffel M., et al. Combinatorial microRNA target predictions. Nat. Genet. 2005;37:495–500. doi: 10.1038/ng1536. [DOI] [PubMed] [Google Scholar]
  33. Ladbury J.E. Counting the calories to stay in the groove. Structure. 1995;3:635–639. doi: 10.1016/s0969-2126(01)00197-6. [DOI] [PubMed] [Google Scholar]
  34. Ladbury J.E., Chowdhry B.Z. Sensing the heat: the application of isothermal titration calorimetry to thermodynamic studies of biomolecular interactions. Chem. Biol. 1996;3:791–801. doi: 10.1016/s1074-5521(96)90063-0. [DOI] [PubMed] [Google Scholar]
  35. Lambert N., Robertson A., Jangi M., McGeary S., Sharp P.A., Burge C.B. RNA Bind-n-Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins. Mol. Cell. 2014;54:887–900. doi: 10.1016/j.molcel.2014.04.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Lambert N.J., Robertson A.D., Burge C.B. RNA bind-n-seq: measuring the binding affinity landscape of RNA-binding proteins. Methods Enzymol. 2015;558:465–493. doi: 10.1016/bs.mie.2015.02.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lewis B.P., Shih I.H., Jones-Rhoades M.W., Bartel D.P., Burge C.B. Prediction of mammalian microRNA targets. Cell. 2003;115:787–798. doi: 10.1016/s0092-8674(03)01018-3. [DOI] [PubMed] [Google Scholar]
  38. Lewis B.P., Burge C.B., Bartel D.P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005;120:15–20. doi: 10.1016/j.cell.2004.12.035. [DOI] [PubMed] [Google Scholar]
  39. Licatalosi D.D., Mele A., Fak J.J., Ule J., Kayikci M., Chi S.W., Clark T.A., Schweitzer A.C., Blume J.E., Wang X., et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature. 2008;456:464–469. doi: 10.1038/nature07488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Licatalosi D.D., Ye X., Jankowsky E. Approaches for measuring the dynamics of RNA-protein interactions. Wiley Interdiscip. Rev. RNA. 2020;11:e1565. doi: 10.1002/wrna.1565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Liu J., Carmell M.A., Rivas F.V., Marsden C.G., Thomson J.M., Song J.J., Hammond S.M., Joshua-Tor L., Hannon G.J. Argonaute2 is the catalytic engine of mammalian RNAi. Science. 2004;305:1437–1441. doi: 10.1126/science.1102513. [DOI] [PubMed] [Google Scholar]
  42. Maerkl S.J., Quake S.R. A systems approach to measuring the binding energy landscapes of transcription factors. Science. 2007;315:233–237. doi: 10.1126/science.1131007. [DOI] [PubMed] [Google Scholar]
  43. Martin L., Meier M., Lyons S.M., Sit R.V., Marzluff W.F., Quake S.R., Chang H.Y. Systematic reconstruction of RNA functional motifs with high-throughput microfluidics. Nat. Methods. 2012;9:1192–1194. doi: 10.1038/nmeth.2225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. McGeary S.E., Lin K.S., Shi C.Y., Pham T.M., Bisaria N., Kelley G.M., Bartel D.P. The biochemical basis of microRNA targeting efficacy. Science. 2019;366:eaav1741. doi: 10.1126/science.aav1741. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. McGeary S.E., Bisaria N., Pham T.M., Wang P.Y., Bartel D.P. MicroRNA 3’-compensatory pairing occurs through two binding modes, with affinity shaped by nucleotide identity and position. Elife. 2022;11:e69803. doi: 10.7554/eLife.69803. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Meister G., Landthaler M., Patkaniowska A., Dorsett Y., Teng G., Tuschl T. Human Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. Mol. Cell. 2004;15:185–197. doi: 10.1016/j.molcel.2004.07.007. [DOI] [PubMed] [Google Scholar]
  47. Moore M.J. From birth to death: the complex lives of eukaryotic mRNAs. Science. 2005;309:1514–1518. doi: 10.1126/science.1111443. [DOI] [PubMed] [Google Scholar]
  48. Müller-McNicoll M., Neugebauer K.M. How cells get the message: dynamic assembly and function of mRNA–protein complexes. Nat. Rev. Genet. 2013;14:275–287. doi: 10.1038/nrg3434. [DOI] [PubMed] [Google Scholar]
  49. Nakanishi K., Weinberg D.E., Bartel D.P., Patel D.J. Structure of yeast Argonaute with guide RNA. Nature. 2012;486:368–374. doi: 10.1038/nature11211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Nutiu R., Friedman R.C., Luo S., Khrebtukova I., Silva D., Li R., Zhang L., Schroth G.P., Burge C.B. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nat. Biotechnol. 2011;29:659–664. doi: 10.1038/nbt.1882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Pall G.S., Hamilton A.J. Improved northern blot method for enhanced detection of small RNA. Nat. Protoc. 2008;3:1077–1084. doi: 10.1038/nprot.2008.67. [DOI] [PubMed] [Google Scholar]
  52. Parker J.S., Roe S.M., Barford D. Structural insights into mRNA recognition from a PIWI domain-siRNA guide complex. Nature. 2005;434:663–666. doi: 10.1038/nature03462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Rajewsky N., Socci N.D. Computational identification of microRNA targets. Dev. Biol. 2004;267:529–535. doi: 10.1016/j.ydbio.2003.12.003. [DOI] [PubMed] [Google Scholar]
  54. Riggs A.D., Bourgeois S., Cohn M. The lac repressor-operator interaction. 3. Kinetic studies. J. Mol. Biol. 1970;53:401–417. doi: 10.1016/0022-2836(70)90074-4. [DOI] [PubMed] [Google Scholar]
  55. Riggs A.D., Suzuki H., Bourgeois S. Lac repressor-operator interaction. I. Equilibrium studies. J. Mol. Biol. 1970;48:67–83. doi: 10.1016/0022-2836(70)90219-6. [DOI] [PubMed] [Google Scholar]
  56. Rivas F.V., Tolia N.H., Song J.J., Aragon J.P., Liu J., Hannon G.J., Joshua-Tor L. Purified Argonaute2 and an siRNA form recombinant human RISC. Nat. Struct. Mol. Biol. 2005;12:340–349. doi: 10.1038/nsmb918. [DOI] [PubMed] [Google Scholar]
  57. Salomon W.E., Jolly S.M., Moore M.J., Zamore P.D., Serebrov V. Single-molecule imaging reveals that argonaute reshapes the binding properties of its nucleic acid guides. Cell. 2015;162:84–95. doi: 10.1016/j.cell.2015.06.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Schirle N.T., Sheu-Gruttadauria J., Chandradoss S.D., Joo C., MacRae I.J. Water-mediated recognition of t1-adenosine anchors Argonaute2 to microRNA targets. eLife. 2015;4:e07646. doi: 10.7554/eLife.07646. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Schuck P. Reliable determination of binding affinity and kinetics using surface plasmon resonance biosensors. Curr. Opin. Biotechnol. 1997;8:498–502. doi: 10.1016/s0958-1669(97)80074-2. [DOI] [PubMed] [Google Scholar]
  60. Schwarz D.S., Tomari Y., Zamore P.D. The RNA-induced silencing complex is a Mg2+-dependent endonuclease. Curr. Biol. 2004;14:787–791. doi: 10.1016/j.cub.2004.03.008. [DOI] [PubMed] [Google Scholar]
  61. Selbach M., Schwanhäusser B., Thierfelder N., Fang Z., Khanin R., Rajewsky N. Widespread changes in protein synthesis induced by microRNAs. Nature. 2008;455:58–63. doi: 10.1038/nature07228. [DOI] [PubMed] [Google Scholar]
  62. Sharma D., Zagore L.L., Brister M.M., Ye X., Crespo-Hernández C.E., Licatalosi D.D., Jankowsky E. The kinetic landscape of an RNA-binding protein in cells. Nature. 2021;591:152–156. doi: 10.1038/s41586-021-03222-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Sheu-Gruttadauria J., Xiao Y., Gebert L.F., MacRae I.J. Beyond the seed: structural basis for supplementary microRNA targeting by human Argonaute2. EMBO J. 2019;38:e101153. doi: 10.15252/embj.2018101153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Shi X., Herschlag D. Fluorescence polarization anisotropy to measure RNA dynamics. Methods Enzymol. 2009;469:287–302. doi: 10.1016/S0076-6879(09)69014-5. [DOI] [PubMed] [Google Scholar]
  65. Shin C., Nam J.W., Farh K.K., Chiang H.R., Shkumatava A., Bartel D.P. Expanding the microRNA targeting code: functional sites with centered pairing. Mol. Cell. 2010;38:789–802. doi: 10.1016/j.molcel.2010.06.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Smith C.S., Jouravleva K., Huisman M., Jolly S.M., Zamore P.D., Grunwald D. An automated Bayesian pipeline for rapid analysis of single-molecule binding data. Nat. Commun. 2019;10:272. doi: 10.1038/s41467-018-08045-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Smolarsky M., Tal M. Novel method for measuring polyuridylic acid binding to ribosomes. Biochim. Biophys. Acta. 1970;199:447–452. doi: 10.1016/0005-2787(70)90087-0. [DOI] [PubMed] [Google Scholar]
  68. Sugimoto Y., König J., Hussain S., Zupan B., Curk T., Frye M., Ule J. Analysis of CLIP and iCLIP methods for nucleotide-resolution studies of protein-RNA interactions. Genome Biol. 2012;13:R67. doi: 10.1186/gb-2012-13-8-r67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Taliaferro J.M., Lambert N.J., Sudmant P.H., Dominguez D., Merkin J.J., Alexis M.S., Bazile C., Burge C.B. RNA sequence context effects measured in vitro predict in vivo protein binding and regulation. Mol. Cell. 2016;64:294–306. doi: 10.1016/j.molcel.2016.08.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Tome J.M., Ozer A., Pagano J.M., Gheba D., Schroth G.P., Lis J.T. Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling. Nat. Methods. 2014;11:683–688. doi: 10.1038/nmeth.2970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Turner D.H., Mathews D.H. NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Res. 2010;38:D280–D282. doi: 10.1093/nar/gkp892. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Van Nostrand E.L., Freese P., Pratt G.A., Wang X., Wei X., Xiao R., Blue S.M., Chen J.Y., Cody N.A.L., Dominguez D., et al. A large-scale binding and functional map of human RNA-binding proteins. Nature. 2020;583:711–719. doi: 10.1038/s41586-020-2077-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Virtanen P., Gommers R., Oliphant T.E., Haberland M., Reddy T., Cournapeau D., Burovski E., Peterson P., Weckesser W., Bright J., et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17:261–272. doi: 10.1038/s41592-019-0686-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Wang Y., Sheng G., Juranek S., Tuschl T., Patel D.J. Structure of the guide-strand-containing Argonaute silencing complex. Nature. 2008;456:209–213. doi: 10.1038/nature07315. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Wee L.M., Flores-Jasso C.F., Salomon W.E., Zamore P.D. Argonaute divides its RNA guide into domains with distinct functions and RNA-binding properties. Cell. 2012;151:1055–1067. doi: 10.1016/j.cell.2012.10.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Winter R.B., von Hippel P.H. Diffusion-driven mechanisms of protein translocation on nucleic acids. 2. The Escherichia coli repressor--operator interaction: equilibrium measurements. Biochemistry. 1981;20:6948–6960. doi: 10.1021/bi00527a029. [DOI] [PubMed] [Google Scholar]
  77. Wong I., Lohman T.M. Allosteric effects of nucleotide cofactors on Escherichia coli Rep helicase-DNA binding. Science. 1992;256:350–355. doi: 10.1126/science.256.5055.350. [DOI] [PubMed] [Google Scholar]
  78. Wong I., Lohman T.M. A double-filter method for nitrocellulose-filter binding: application to protein-nucleic acid interactions. Proc. Natl. Acad. Sci. U S A. 1993;90:5428–5432. doi: 10.1073/pnas.90.12.5428. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Xiao Y., MacRae I.J. Robust differential microRNA targeting driven by supplementary interactions in vitro. RNA. 2020;26:162–174. doi: 10.1261/rna.072264.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Zhao J., Ohsumi T.K., Kung J.T., Ogawa Y., Grau D.J., Sarma K., Song J.J., Kingston R.E., Borowsky M., Lee J.T. Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol. Cell. 2010;40:939–953. doi: 10.1016/j.molcel.2010.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Zykovich A., Korf I., Segal D.J. Bind-n-Seq: high-throughput analysis of in vitro protein-DNA interactions using massively parallel sequencing. Nucleic Acids Res. 2009;37:e151. doi: 10.1093/nar/gkp802. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Document S1. Figures S1–S7 and Tables S1–S4
mmc1.pdf (1.4MB, pdf)
Document S2. Article plus supplemental information
mmc2.pdf (4.6MB, pdf)

Data Availability Statement

RBNS sequencing data have been deposited at National Center for Biotechnology Information Sequence Read Archive and are publicly available as of the date of publication using accession number PRJNA807105.

All published software required to reanalyze the data reported in this paper is described in the quantification and statistical analysis section below. All original code has been deposited at https://figshare.com/articles/software/MicroRNA-binding_thermodynamics_and_kinetics_by_RNA_Bind-n-Seq/19180952 and is publicly available as of the date of publication.

Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from Cell Reports Methods are provided here courtesy of Elsevier

RESOURCES