Molecular Therapy. Nucleic Acids. 2019 Oct 18;18:893–902. doi: 10.1016/j.omtn.2019.10.010

Amount of Escape Estimation Based on Bayesian and MCMC Approaches for RNA Interference

Tian Liu 1, Yongzhen Pei 1,2, Changguo Li 3, Ming Ye 1,4
PMCID: PMC6881653  PMID: 31756682

Abstract

The amount of short interfering RNA (siRNA) escaping from the endosome has a significant impact on the efficiency of RNAi. In general, the initial injected amount of siRNAs during the experiment is known, and also the amount of siRNAs after the experiment can be revealed by the level of mRNA measured. However, it is impossible to measure the amount of siRNAs that escape from the endosome and really take part in the chemical reaction of RNAi by detecting the biological organism and its tissues. Inspired by the bottleneck effect in the virus, we introduce the Bayesian approach to infer the amount of escape based on a single type and multiple types of siRNA, respectively. With the consideration of the large calculation quantity of the accurate posterior distribution and the unavailable analytic expression of the likelihood function, our article proposes to take samples by the improved Markov chain Monte Carlo (MCMC) method. The article takes the silencing gene of the synthesis of chitin and the interfering multiple target oncogene as numerical examples to show that our improved MCMC method has higher operation efficiency compared to the Bayesian approach. Our research models siRNA endosome escape using statistical methods for the first time. It perhaps provides a theoretical basis to decrease the cost of a biotic experiment for the future and the standardized statistical approaches for the amount of escape estimation.

Keywords: amount of escape, Bayesian inference, RNA interference, MCMC method, Gillespie algorithm

Introduction

RNAi is a biological process, highly conserved during evolution, in which double-stranded RNA (dsRNA) recognized in the cell induces the specific degradation of homologous mRNA.1 Endogenously expressed long dsRNA is first cleaved into short interfering RNA (siRNA) by an enzyme such as Dicer, a component of the gene-silencing mechanism, and the short RNA molecules are then used as guides to target homologous RNA species.2,3 Specific suppression of gene expression can be achieved by injecting or feeding dsRNA. Introducing siRNA into insect cells to silence the expression of target genes offers a potential new tool for biological pest control.4 For example, the RNAi pathway could be applied to reduce the breeding of lepidopteran and coleopteran insect pests via restraining the planta expression,5 and Mao et al.6 provide a strategy to impair larval tolerance of gossypol by interfering with a cotton bollworm RNA. As a highly efficient technology, RNAi has also developed rapidly in the fields of infectious disease and tumor gene therapy,7,8 and it has been applied to diseases that are difficult to treat with traditional drugs, such as chronic hepatitis B virus infection.9 In addition, individualized treatment schemes can be designed according to the conditions of different patients.

A significant barrier to efficient siRNA uptake lies in the plasma membrane. In spite of their small size, siRNA molecules are prevented from crossing biological membranes by their negative charge and hydrophilicity. The intracellular transport of siRNAs begins in early endosomal vesicles. Subsequently, as these early endosomes fuse with sorting endosomes, the siRNAs are transferred to the late endosomes. Only a small part of the siRNAs can escape from the endosomes; the rest is moved to the lysosomes together with the endosomal contents. The lysosomes, which contain various nucleases, acidify the endosomal content, and the siRNAs are degraded in turn. Figure 1 provides a schematic diagram of the uptake and intracellular trafficking of a targeted siRNA. Thus, in order to avoid lysosomal degradation, siRNAs have to escape from the endosomes and enter the cytosol, where they associate with the RNAi machinery.10 In addition, it has been found that some of the generated siRNAs are not derived directly from the cleavage of dsRNA but rather from a chain reaction of RNA polymerase. Using a single strand of siRNA as a primer and the target mRNA as a template, this reaction amplifies the target mRNA under the action of RNA-mediated RNA polymerase (RdRP) and generates a new siRNA subpopulation.11 These siRNAs, in turn, continue to react with the target mRNA and degrade it.12 This cyclical amplification process explains why a small amount of dsRNA can induce strong gene-silencing effects.

Figure 1.

The Process of Escape of siRNA

Uptake and intracellular trafficking of a targeted siRNA delivery vehicle.10

We find that the process of siRNA delivery resembles the biological effect called a bottleneck. A bottleneck describes the phenomenon in which the number of individuals in a group is reduced drastically, or the group even becomes extinct, because of drastic changes in the environment. When we inject a certain amount of siRNA into a pest, only a small fraction of the siRNA can cross the plasma membrane and participate in RNAi, and the remaining siRNAs are degraded. The small amount of escaping siRNA (commonly known as the bottleneck size) then forms a new population through the amplification process.13 Accurate quantification of the amount of escape for RNAi is vital for several reasons. First, estimating the amount of siRNAs escaping from the endosome helps us to study the biological mechanism of endosomal escape more definitively. Second, knowledge of the amount of escape in RNAi processes is important for rationally designing strategies that optimize the amount of siRNA used to interfere with the target RNA. Finally, when we inject multiple types of siRNA, the amount of escape affects the levels of the types that escape from the endosome into the cytosol and thereby affects the effect of interference.

Bottlenecks have been studied extensively, mostly through qualitative analysis of transmission bottleneck sizes,14 and Abel et al.15 provide a biologically motivated introduction to bottlenecks. Sobel et al.16 use deep-sequencing data to construct the likelihood expression of a transmission bottleneck on the basis of the beta-binomial sampling method. Inspired by this work on bottlenecks, we suggest new ideas for gauging the escaping amounts of siRNA for a single type and for multiple types, respectively. After the observed data are simulated by the Gillespie algorithm, the probability distributions of the escaping amounts of siRNA are estimated by combining the Bayesian approach with the nearest neighbor method.17 However, this procedure is inefficient in practice, because the Gillespie algorithm must be invoked and run many times. We therefore provide an alternative approach that samples the escaping amounts of siRNA by the Markov chain Monte Carlo (MCMC) method and takes the mean of the samples as the estimate of the escaping amount, which reduces the computation time. Finally, comparisons indicate that the estimates inferred by both the Bayesian and MCMC methods approximate the true value.

Results

Silence Gene Controlling the Synthesis of Chitin

The oriental migratory locust is a crucial pest in agriculture.18 Recently, the locust plague has broken out more frequently and severely in China.19 As we know, the growth and development of locust strictly depend on the biosynthesis and degradation of chitin, which is absent in plants and vertebrates. So, chitin metabolism represents an attractive target for developing safe and effective insecticides.20

RNAi can be used to silence genes that control the synthesis of chitin, sequentially leading to the death of locusts. After siRNAs are injected into the locust, they are governed by stochastic processes, including amplification, degradation, immigration, and emigration, which are dominated by a parameter set θ={α,λ,μ,σ}. Let S(t) be the amount of the current siRNAs. Then, four stochastic processes are modeled by four biochemical reactions as follows:

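$$S \xrightarrow{\alpha} 2S \;\; (a), \qquad S \xrightarrow{\lambda} \emptyset \;\; (b), \qquad \emptyset \xrightarrow{\mu} S \;\; (c), \qquad S \xrightarrow{\sigma} \emptyset \;\; (d).$$ (Equation 1)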

Next, the biological significance of the construction and parameters in Equation 1 are presented.

  • α is the amplification rate of siRNAs that have escaped. Equation 1a means that given the current amount S(t), a unit of new siRNA is generated in the time interval (t,t+dt) with probability αS(t)dt.

  • λ is the degradation rate of siRNA due to endocytosis. Equation 1b means that, given the current state S(t), a unit of siRNA is degraded by lysosomes in the time interval (t,t+dt) with probability λS(t)dt.

  • μ is the immigration rate of a new siRNA molecule. Equation 1c reveals that a unit of siRNA immigrates in our system from the neighboring cells with probability μdt in the time interval (t,t+dt).

  • σ is the emigration rate of siRNA. Equation 1d shows that siRNA will decrease one unit with the emigration of siRNA into the neighboring cells in the time interval (t,t+dt) with the probability σS(t)dt for the given current state S(t).

Take the parameter values α=0.6, λ=0.3, μ=0.6, σ=0.23 as an example. With the initial value S(0)=5, simulations of the siRNA dynamics by the Gillespie algorithm are illustrated in Figure 2. The value at Δt=12 h can then be recorded as our observation data s2, the amount of siRNA after amplification.
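The simulation described above can be sketched in Python. This is a minimal illustration of a direct-method Gillespie simulation of the four reactions in Equation 1, not the authors' code; the function name `gillespie`, its argument order, and the seeded `random.Random` generator are assumptions made for the example.

```python
import random


def gillespie(s0, dt, alpha, lam, mu, sigma, rng):
    """Simulate Equation 1 from S(0) = s0 and return S(dt).

    Hypothetical helper: births (amplification, immigration) occur at
    rate alpha*S + mu, deaths (degradation, emigration) at (lam + sigma)*S.
    """
    t, s = 0.0, s0
    while True:
        birth = alpha * s + mu
        death = (lam + sigma) * s
        total = birth + death
        if total <= 0.0:             # no reaction can fire any more
            return s
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > dt:                   # next event falls after the observation time
            return s
        # Choose the event type in proportion to its rate.
        s += 1 if rng.random() * total < birth else -1


# Parameter values and initial state quoted in the text.
theta = (0.6, 0.3, 0.6, 0.23)   # (alpha, lambda, mu, sigma)
rng = random.Random(0)          # fixed seed only for reproducibility
runs = [gillespie(5, 12.0, *theta, rng) for _ in range(5)]
print(runs)                     # five independent values of S(12 h)
```

Merging the two "birth" reactions and the two "death" reactions is only a convenience: the state changes by +1 or -1 either way, so the distribution of S(Δt) is the same as when the four reactions are tracked separately.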

Figure 2.

Time Evolutions of the siRNA

The simulations for the dynamic of the siRNA by the Gillespie algorithm21 are illustrated and the lines with five different colors represent five simulations.

Next, the above observation data s2 are employed to estimate the amount of escape s1 or its posterior distribution p(s1|s2) and meanwhile, demonstrate the efficacy of Algorithm 1 and Algorithm 2 for the single type of siRNA.

  1. Given the target amount of escape s1∈{1,3,5,7,70,140,700}.

  2. Get the data {(s2)1,(s2)2,...,(s2)101} i.i.d. ∼ Gillespie(s1,Δt,θ).

  3. Make s2 be the median of {(s2)j}j=1,...,101.

  4. Acquire p(s1|s2) by Algorithm 1 and the sample mean of s1 by Algorithm 2, respectively.

  5. Compare p(s1|s2) and the mean with the target s1, respectively.

Algorithm 1. Estimation of Probability Distributions p(s1|s2).

Input: the amount of siRNAs after amplification s2, time interval Δt, and the parameter set θ.

Output: the probability p(s1|s2) for s1=1,...,smax (smax≤s2).

  1. For s1=1 to smax, do

  2.     Simulate {(s2)1,(s2)2,...,(s2)100} from s1 by the Gillespie algorithm

  3.     Get p̂(s2|s1) from {(s2)j}j=1,...,100 by the nearest neighbor method

  4.     Set prob[s1]=p̂(s2|s1)

  5. Set [p̂(s1=1|s2),...,p̂(s1=smax|s2)]=prob/sum(prob)

  6. Return [p̂(s1=1|s2),...,p̂(s1=smax|s2)]
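The steps of Algorithm 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `gillespie` is a hypothetical helper simulating Equation 1, and `knn_density` assumes the common one-dimensional k-nearest-neighbor form k/(2·n·r_k) with k=10 and a radius floor of 0.5 (because counts are integers); the paper does not specify these details.

```python
import random
import statistics


def gillespie(s0, dt, alpha, lam, mu, sigma, rng):
    """Hypothetical helper: simulate Equation 1 and return S(dt)."""
    t, s = 0.0, s0
    while True:
        birth = alpha * s + mu
        total = birth + (lam + sigma) * s
        if total <= 0.0:
            return s
        t += rng.expovariate(total)
        if t > dt:
            return s
        s += 1 if rng.random() * total < birth else -1


def knn_density(x, samples, k=10):
    """k-nearest-neighbor density estimate at x (assumed form, see lead-in)."""
    dists = sorted(abs(x - y) for y in samples)
    radius = max(dists[k - 1], 0.5)   # floor avoids a zero radius on integer data
    return k / (len(samples) * 2.0 * radius)


def posterior_s1(s2, dt, theta, s_max, n_sim=100, rng=None):
    """Normalized estimates of p(s1 | s2) for s1 = 1..s_max (flat prior)."""
    rng = rng or random.Random()
    prob = []
    for s1 in range(1, s_max + 1):
        sims = [gillespie(s1, dt, *theta, rng) for _ in range(n_sim)]
        prob.append(knn_density(s2, sims))   # estimate of p(s2 | s1)
    total = sum(prob)
    return [p / total for p in prob]


theta = (0.6, 0.3, 0.6, 0.23)
rng = random.Random(1)
# Observation: median of 101 runs started from the target s1 = 7.
s2 = statistics.median(gillespie(7, 12.0, *theta, rng) for _ in range(101))
post = posterior_s1(s2, 12.0, theta, s_max=int(s2), rng=rng)
mode = 1 + max(range(len(post)), key=post.__getitem__)
print(s2, mode)   # observed amount and posterior mode
```

Each value of s1 costs `n_sim` Gillespie runs, so the total cost grows with both s_max and the number of simulations per value.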

For targets s1=7, s1=70, and s1=700, we obtained the posterior distributions p(s1|s2) of the escaping amount by Algorithm 1 (Figures 3A–3C) and take their modes, 9, 67, and 687, as the estimates of the escaping amount, respectively. For the same targets, the samples of the escaping amount generated by Algorithm 2 are displayed in Figures 4A–4C, and their means after burn-in are 5, 78, and 687. Both kinds of estimates are close to the targets, which indicates that the two algorithms are effective.

Figure 3.

Posterior Distributions p(s1|s2) Estimated by Bayesian Inference

Posterior distributions p(s1|s2) of amount of escape estimated using Algorithm 1 for (A) s1=7, (B) s1=70, and (C) s1=700.

Figure 4.

The Results of Sampling for Single Type of siRNA Obtained by MCMC Method

The three panels at the top visualize the sampled data of s1. For all other panels, the posterior distributions p(s1|s2) obtained using Algorithm 2 are delineated. (A) refers to the target s1=7, (B) to the s1=70, and (C) to s1=700.

Algorithm 2. Generating the Samples of s1.

Input: the amount of siRNAs after amplification s2, time interval Δt, the parameter set θ, initial value s1(0), number of iterations N, and cycle index k=0.

Output: the samples s1(0),s1(1),...,s1(N).

  1. Simulate s2(k) from s1(k) by the Gillespie algorithm, and calculate d=|s2(k)−s2|

  2. For k=0 to N, do

  3.     Generate a proposed value s1' from proposal distribution q(s1'|s1(k))

  4.     Simulate s2' from s1' by the Gillespie algorithm, and calculate d'=|s2'−s2|

  5.     Sample u from uniform distribution U(0,1)

  6.     Calculate the acceptance probability α by Equation 7

  7.     If u≤α(s1',s1(k)), then

  8.         Accept s1', and set s1(k+1)=s1', d=d'

  9.     else

  10.        Reject s1', and set s1(k+1)=s1(k), d=d

  11. Return s1(0),s1(1),...,s1(N)
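A compact Python sketch of Algorithm 2 follows. It is illustrative only: the paper requires a symmetric proposal but does not fix its form, so the integer random walk of half-width `step` is an assumption, as are the helper `gillespie`, the observed value `s2 = 23`, and the burn-in length. Proposals below 1 are rejected, which keeps the proposal symmetric.

```python
import math
import random
import statistics


def gillespie(s0, dt, alpha, lam, mu, sigma, rng):
    """Hypothetical helper: simulate Equation 1 and return S(dt)."""
    t, s = 0.0, s0
    while True:
        birth = alpha * s + mu
        total = birth + (lam + sigma) * s
        if total <= 0.0:
            return s
        t += rng.expovariate(total)
        if t > dt:
            return s
        s += 1 if rng.random() * total < birth else -1


def mh_sample_s1(s2, dt, theta, s1_init, n_iter, step=3, rng=None):
    """Metropolis-Hastings chain for s1 with likelihood exp(-|s2_sim - s2|)."""
    rng = rng or random.Random()
    s1 = s1_init
    d = abs(gillespie(s1, dt, *theta, rng) - s2)      # distance of current state
    chain = [s1]
    for _ in range(n_iter):
        prop = s1 + rng.randint(-step, step)          # symmetric proposal (assumed)
        if prop >= 1:
            d_prop = abs(gillespie(prop, dt, *theta, rng) - s2)
            # Equation 7: min{1, exp(-d') / exp(-d)}, written to avoid overflow.
            accept = 1.0 if d_prop <= d else math.exp(d - d_prop)
            if rng.random() <= accept:
                s1, d = prop, d_prop
        chain.append(s1)
    return chain


theta = (0.6, 0.3, 0.6, 0.23)
s2 = 23                                               # hypothetical observation
chain = mh_sample_s1(s2, 12.0, theta, s1_init=5, n_iter=2000, rng=random.Random(2))
burn_in = 500                                         # assumed burn-in length
print(statistics.mean(chain[burn_in:]))               # point estimate of s1
```

Each iteration needs only one Gillespie run, compared with the 100 runs per candidate value in Algorithm 1, which is where the saving in computation time comes from.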

Interfere Multiple Target Oncogene

Related studies have found that the cancerization of normal cells is the consequence of the interaction of multiple genes. However, conventional therapies, which mostly target a single gene, cannot completely inhibit the growth of tumors. RNAi technology can be used to silence genes. Yin et al.22 suggested that injecting multiple types of siRNA can specifically interfere with multiple target oncogenes simultaneously and thereby inhibit the growth and proliferation of cancer cells synergistically.

Consequently, for multiple types, a hypothesis is given that we inject seven types of siRNA v0[1],v0[2],...,v0[7] for gene therapy. Then, the observation data v2 could be simulated by the Gillespie algorithm, as previously mentioned. Algorithm 3 and Algorithm 4 are applied to estimate the amount of escaping siRNAs and verify the efficacy of these two methods by the following steps.

  1. Given the initial injected amount v0=(v0[1],v0[2],...,v0[7]) with v0[i]=600 for i=1,...,7.

  2. Given the target amount of escape s1∈{1,3,5,7,70,140}.

  3. Generate a mode v1 using the multivariate hypergeometric distribution related to random samples of size s1 from v0.

  4. Get the data {(v2)1,(v2)2,...,(v2)101} i.i.d. ∼ Gillespie(v1,Δt,θ).

  5. Then, make v2 be the median of {(v2)j}j=1,...,101.

  6. Acquire p(s1|v0,v2) by Algorithm 3 and the sample mean of s1 by Algorithm 4, respectively.

  7. Compare p(s1|v0,v2) and the mean with the target s1, respectively.

Algorithm 3. Estimation of Probability Distributions p(s1|v0,v2).

Input: the initial injected amount of siRNAs of various types v0, the amount of siRNAs after amplification v2, time interval Δt, and the parameter set θ.

Output: the probability p(s1|v0,v2) for s1=1,...,smax.

  1. For s1=1 to smax, do

  2.     For k=1 to 1,000, do

  3.         Sample v1 from the multivariate hypergeometric distribution with s1, v0

  4.         Calculate p(v1|v0) by Equation 11

  5.         Set a=p(v1|v0)

  6.         Set b=1

  7.         For v1[i]∈v1, do

  8.             Simulate {(v2[i])1,(v2[i])2,...,(v2[i])100} from v1[i] by the Gillespie algorithm

  9.             Get p̂(v2[i]|v1[i]) from {(v2[i])j}j=1,...,100 by the nearest neighbor method

  10.            Set p=p̂(v2[i]|v1[i])

  11.            Set b=b×p (b is p̂(v2|v1) at last)

  12.        Set prob[k]=a×b

  13.    Set numert[s1]=sum(prob)

  14. Set [p̂(s1=1|v0,v2),...,p̂(s1=smax|v0,v2)]=numert/sum(numert)

  15. Get the mode of [p̂(s1=1|v0,v2),...,p̂(s1=smax|v0,v2)] as an estimation of s1

  16. Return [p̂(s1=1|v0,v2),...,p̂(s1=smax|v0,v2)]
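Algorithm 3 can be sketched in Python as below. This follows the listed steps literally but is not the authors' code: the helpers `gillespie`, `knn_density`, `sample_v1`, and `pmf_v1` are hypothetical, the nearest-neighbor estimator uses an assumed k/(2·n·r_k) form, and the usage example shrinks the loop sizes (the algorithm uses 1,000 configurations and 100 simulations) and uses a made-up observation `v2` so that it runs quickly.

```python
import random
from math import comb


def gillespie(s0, dt, alpha, lam, mu, sigma, rng):
    """Hypothetical helper: simulate Equation 1 and return S(dt)."""
    t, s = 0.0, s0
    while True:
        birth = alpha * s + mu
        total = birth + (lam + sigma) * s
        if total <= 0.0:
            return s
        t += rng.expovariate(total)
        if t > dt:
            return s
        s += 1 if rng.random() * total < birth else -1


def knn_density(x, samples, k=10):
    """Assumed k-nearest-neighbor density estimate, radius floored at 0.5."""
    dists = sorted(abs(x - y) for y in samples)
    return k / (len(samples) * 2.0 * max(dists[k - 1], 0.5))


def sample_v1(v0, s1, rng):
    """Draw v1 by sampling s1 molecules without replacement from v0."""
    pool = [i for i, n in enumerate(v0) for _ in range(n)]
    v1 = [0] * len(v0)
    for i in rng.sample(pool, s1):
        v1[i] += 1
    return v1


def pmf_v1(v1, v0):
    """Multivariate hypergeometric probability of v1 given v0 (Equation 11)."""
    num = 1
    for n0, n1 in zip(v0, v1):
        num *= comb(n0, n1)
    return num / comb(sum(v0), sum(v1))


def posterior_s1_multi(v0, v2, dt, theta, s_max, n_config=1000, n_sim=100, rng=None):
    """Normalized estimates of p(s1 | v0, v2) for s1 = 1..s_max."""
    rng = rng or random.Random()
    numer = []
    for s1 in range(1, s_max + 1):
        acc = 0.0
        for _ in range(n_config):
            v1 = sample_v1(v0, s1, rng)
            a = pmf_v1(v1, v0)
            b = 1.0                                   # becomes the estimate of p(v2 | v1)
            for obs, start in zip(v2, v1):
                sims = [gillespie(start, dt, *theta, rng) for _ in range(n_sim)]
                b *= knn_density(obs, sims)
            acc += a * b
        numer.append(acc)
    total = sum(numer)
    return [x / total for x in numer]


theta = (0.6, 0.3, 0.6, 0.23)
v0 = [600] * 7
v2 = [3, 1, 4, 2, 0, 5, 2]                            # hypothetical observation
post = posterior_s1_multi(v0, v2, 12.0, theta, s_max=5,
                          n_config=5, n_sim=30, rng=random.Random(4))
print(post)
```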

Posterior distributions p(s1|v0,v2) estimated by Algorithm 3 are shown in Figures 5A–5C for targets s1=7, s1=70, and s1=140. The modes, taken as the estimates of the escaping amount, are 9, 65, and 135, respectively. For the same targets, we draw 10,000 samples by Algorithm 4 and report them in Figures 6A–6C. After burn-in, we obtain the estimates 9, 69, and 133 by calculating the sample means. The estimates are close to the true values.

Figure 5.

Posterior Distributions p(s1|v0,v2) Estimated by Bayesian Inference

Posterior distributions of amount of escape estimated using Algorithm 3 for (A) s1=7, (B) s1=70, and (C) s1=140.

Algorithm 4. Generating the Samples of s1.

Input: the initial injected amount of siRNAs v0, the amount of siRNAs after amplification v2, time interval Δt, parameter set θ, initial value s1(0), number of iterations N, and cycle index k=0.

Output: the samples s1(0),s1(1),...,s1(N).

  1. Sample v1(k) from the multivariate hypergeometric distribution with v0, s1(k)

  2. Simulate v2(k) from v1(k) using the Gillespie algorithm, and calculate d=‖v2(k)−v2‖

  3. For k=0 to N, do

  4.     Generate a proposed value s1' from proposal distribution q(s1'|s1(k))

  5.     Sample v1' from the multivariate hypergeometric distribution with v0, s1'

  6.     Simulate v2' from v1' by the Gillespie algorithm, and calculate d'=‖v2'−v2‖

  7.     Sample u from uniform distribution U(0,1)

  8.     Calculate the acceptance probability α by Equation 17

  9.     If u≤α(s1',s1(k)), then

  10.        Accept s1', and set s1(k+1)=s1', d=d'

  11.    else

  12.        Reject s1', and set s1(k+1)=s1(k), d=d

  13. Return s1(0),s1(1),...,s1(N)
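The multi-type sampler of Algorithm 4 can be sketched in the same way. As before, this is a hedged illustration: the helper names, the integer random-walk proposal, the Euclidean norm used for the distance between composition vectors, the observation `v2`, and the burn-in length are all assumptions, since the text does not fix them.

```python
import math
import random
import statistics


def gillespie(s0, dt, alpha, lam, mu, sigma, rng):
    """Hypothetical helper: simulate Equation 1 and return S(dt)."""
    t, s = 0.0, s0
    while True:
        birth = alpha * s + mu
        total = birth + (lam + sigma) * s
        if total <= 0.0:
            return s
        t += rng.expovariate(total)
        if t > dt:
            return s
        s += 1 if rng.random() * total < birth else -1


def sample_v1(v0, s1, rng):
    """Draw v1 by sampling s1 molecules without replacement from v0."""
    pool = [i for i, n in enumerate(v0) for _ in range(n)]
    v1 = [0] * len(v0)
    for i in rng.sample(pool, s1):
        v1[i] += 1
    return v1


def simulate_distance(v0, s1, v2, dt, theta, rng):
    """Sample v1, amplify each type independently, return the distance to v2."""
    v1 = sample_v1(v0, s1, rng)
    sim = [gillespie(n, dt, *theta, rng) for n in v1]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sim, v2)))  # assumed norm


def mh_sample_s1_multi(v0, v2, dt, theta, s1_init, n_iter, step=3, rng=None):
    """Metropolis-Hastings chain for s1 given the compositions v0 and v2."""
    rng = rng or random.Random()
    s1 = s1_init
    d = simulate_distance(v0, s1, v2, dt, theta, rng)
    chain = [s1]
    for _ in range(n_iter):
        prop = s1 + rng.randint(-step, step)          # symmetric proposal (assumed)
        if 1 <= prop <= sum(v0):
            d_prop = simulate_distance(v0, prop, v2, dt, theta, rng)
            # Equation 17: min{1, exp(-d') / exp(-d)}, written to avoid overflow.
            accept = 1.0 if d_prop <= d else math.exp(d - d_prop)
            if rng.random() <= accept:
                s1, d = prop, d_prop
        chain.append(s1)
    return chain


theta = (0.6, 0.3, 0.6, 0.23)
v0 = [600] * 7
v2 = [3, 1, 4, 2, 0, 5, 2]                            # hypothetical observation
chain = mh_sample_s1_multi(v0, v2, 12.0, theta, s1_init=10, n_iter=500,
                           rng=random.Random(6))
print(statistics.mean(chain[100:]))                   # mean after an assumed burn-in
```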

Figure 6.

The Results of Sampling for Multiple Types of siRNAs Obtained by MCMC Method

The three panels at the top visualize the sampled data of s1. For all other panels, the posterior distributions p(s1|v0,v2) obtained using Algorithm 4 are delineated. (A) refers to the target s1=7, (B) to the s1=70, and (C) to s1=140.

Discussion

The amount of siRNAs escaping from the endosome is one of the key factors determining the efficiency of RNAi, but it is difficult to observe or calculate in experiments. In this paper, two methods are proposed to estimate the amount of escape from the amount after the reaction and the amount of injection, given knowledge of the dynamics during amplification. One estimates the posterior distribution of the escaping amount by the Bayesian approach; the other draws samples of the escaping amount by the MCMC method and uses the mean of the samples as an estimate. For the traditional Bayesian approach, we present specific algorithms combined with the nearest neighbor method, which is used to estimate p(s2|s1). For the MCMC method, the acceptance probability of the Metropolis-Hastings (MH) algorithm is controlled by the distance between the simulation and the observed data. Furthermore, algorithms for estimating the escaping amount are given for a single type of siRNA and for multiple types of siRNAs, respectively. To examine the validity of our algorithms, two examples are presented: silencing a gene for the synthesis of chitin and blocking multiple target oncogenes. Our work offers statistical ways to infer the amount of siRNAs participating in the actual RNAi reaction. It may also provide a theoretical basis for decreasing the cost of future biological experiments.

Even so, there are still some problems worth exploring further. First, the MCMC method failed to estimate the posterior distribution that could express the uncertainty through the variance of the distributions, although it improves the efficiency. It indicates that a more comprehensive method that takes into account the accuracy of estimation, efficiency, and expression of uncertainty together is required. Besides, the estimation of the bottleneck size is only built on the assumption that the dynamics during amplification are known. When the partial data are missing, how to estimate the amount of escape and the parameters together is the problem for further consideration. In future research, we will try to find the solutions to these problems.

Materials and Methods

Single Type of siRNA

In general, we introduce only a single type of siRNA, aimed at a specific RNA, into the organism. The processes by which siRNAs escape from the endosome and amplify intracellularly have been described in the Introduction and are pictured in Figure 7. Define the initial injected amount of siRNAs as s0, the amount of siRNAs that escape from the endosome as s1, and the amount of siRNAs after amplification as s2 (Figure 7). Obviously, s1≤s2. Then, given the amount of siRNAs after amplification, Bayesian inference or MCMC can be applied to estimate the posterior distribution of the escaping amount of siRNAs, as well as its value.

Figure 7.

Diagrammatic Representation of the Process that siRNAs Escape and the Amount at Each Stage

Firstly, the siRNAs with an initial injected amount s0 escape from endosome. And then, the escaping siRNAs s1 are amplified to s2 after Δt time.

Bayesian Inference

According to the Bayesian framework, the amount of siRNAs escaping from the endosome can be estimated by the posterior probability distributions. Given the observations of the amount after amplification, the distribution is given by

$$p(\text{amount of escape } (s_1) \mid \text{amount after amplification } (s_2)).$$

The merit of the use of the Bayesian approach is that we not only could get the estimates of the most probable amount of escape (in terms of the modes of the distribution), but also, we could be aware of the uncertainty via the variance of the distributions. Then, the posterior probability p(s1|s2) is given by

$$p(s_1 \mid s_2) = \frac{p(s_1)\,p(s_2 \mid s_1)}{\sum_{s_1} p(s_1)\,p(s_2 \mid s_1)} \propto p(s_1)\,p(s_2 \mid s_1).$$ (Equation 2)

With the further assumption of the prior p(s1) to be equally likely, one gets

$$p(s_1 \mid s_2) = \frac{p(s_2 \mid s_1)}{\sum_{s_1} p(s_2 \mid s_1)}.$$ (Equation 3)

Then, the posterior distribution p(s1|s2) can be obtained through estimating all of the probability p(s1|s2) for s1=1,...,smax, where smax is the maximum of escaping amount s1. The detailed process is shown as follows.

First, starting from s1, we perform n simulations using the Gillespie stochastic algorithm,21 according to a parameter set θ for the dynamics, and obtain the finite simulating samples of s2 after time interval Δt:

{(s2)1,(s2)2,...,(s2)n},

from which p(s2|s1,θ) is estimated using the nearest neighbor method,17 which is a classical nonparametric estimation method.

Second, substituting all of the probabilities p(s2|s1,θ) into Equation 3, one gets the estimate of the probability distribution p(s1|s2).

The detailed procedure for estimating the distribution p(s1|s2) is given in Algorithm 1.

Algorithm 1 implies that the Gillespie algorithm runs n times each time the loop executes. Algorithm 1 is therefore time consuming if the number of simulations n is large. So, in order to improve the running efficiency of the program, we adopt the MCMC method to estimate the escaping amount of siRNAs.

MCMC Method

The MCMC method includes Gibbs and MH, which are techniques simulating the random variables by using the Markov chain.23 In this paper, we choose MH to sample single variable s1, rather than Gibbs from the target distribution, being the conditional distribution of interest. Here, the target distribution, that is, posterior distribution p(s1|s2) in Equation 2, is proportional to the product of prior p(s1) and likelihood p(s2|s1).

From the ideas of MCMC, we need to compute the acceptance probability α(s1',s1(k)), where s1(k) is the kth sample and s1' is a proposed value. From the symmetry of the proposal distribution, namely q(s1'|s1(k))=q(s1(k)|s1'),24 and the equally likely prior p(s1), the acceptance probability can be simplified to

$$\alpha\bigl(s_1', s_1^{(k)}\bigr) = \min\left\{1, \frac{p(s_2 \mid s_1')}{p(s_2 \mid s_1^{(k)})}\right\}.$$ (Equation 4)

Because p(s2|s1') and p(s2|s1(k)) in Equation 4 are unknown, we next pursue a novel approach to compute them. For p(s2|s1(k)), we first simulate one value s2(k) from s1(k) after a certain time Δt by the Gillespie algorithm. Second, we compute the distance between the given value s2 and the simulation s2(k), denoted by d=|s2(k)−s2|. Finally, the likelihood25 is calculated by

$$p\bigl(s_2 \mid s_1^{(k)}\bigr) = e^{-d}.$$ (Equation 5)

Similarly, the other likelihood in Equation 4 is calculated by

$$p(s_2 \mid s_1') = e^{-d'},$$ (Equation 6)

where d'=|s2'−s2|, and s2' is simulated from s1' in the same way as s2(k).

From all of the above, the acceptance probability in Equation 4 becomes

$$\alpha\bigl(s_1', s_1^{(k)}\bigr) = \min\left\{1, \frac{e^{-d'}}{e^{-d}}\right\}.$$ (Equation 7)
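Because both likelihoods are exponentials of distances, the ratio in the acceptance probability collapses to a single exponential of the difference d'−d. A short worked form, using only the symbols defined above:

```latex
\alpha\bigl(s_1', s_1^{(k)}\bigr)
  = \min\Bigl\{1, \frac{e^{-d'}}{e^{-d}}\Bigr\}
  = \min\bigl\{1, e^{-(d' - d)}\bigr\}
  = \begin{cases}
      1, & d' \le d, \\
      e^{-(d' - d)}, & d' > d.
    \end{cases}
```

A proposal whose simulation lands at least as close to the observation as the current one is therefore always accepted. For instance, with d=2 and d'=5 the proposal is accepted with probability e^{-3}, roughly 0.05.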

The procedure of sampling s1 by the MCMC method is listed in Algorithm 2.

Multiple Types of siRNA

When multiple types of siRNAs are injected to affect different target RNAs, the stochastic process of the siRNAs is as shown in Figure 8. Assume that we inject m types of siRNA whose initial injected amounts are v0[1],v0[2],...,v0[m], where v0[i]≥0 is the amount of the ith siRNA. Because of endocytosis, the amounts of siRNA decline to the relatively lower values v1[1],v1[2],...,v1[m]. After amplification, the composition of siRNA develops into v2[1],v2[2],...,v2[m] (Figure 8). With the initial injected amount v0=(v0[1],v0[2],...,v0[m]) and the amount v2=(v2[1],v2[2],...,v2[m]) after amplification known, Bayesian inference and the MCMC method can be applied to estimate the posterior distribution p(s1|v0,v2) and to sample s1 from this posterior distribution, respectively.

Figure 8.

Diagrammatic Representation of the Process in which siRNAs Escape and Their Propensity to Stochastic Variability in Terms of Both the Amount and the Composition of Their Population

Different colors represent different types of siRNAs. The initially injected multiple types of siRNA have composition v0. After endocytosis, their amounts decline to v1. Subsequently, the escaping siRNAs are amplified to v2 after time Δt.

Bayesian Inference

For the posterior distribution p(s1|v0,v2), we have

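$$p(s_1 \mid v_0, v_2) = p\Bigl(\bigcup_{v_1 \,\mathrm{s.t.}\, \mathrm{sum}(v_1)=s_1} v_1 \Bigm| v_0, v_2\Bigr) = \sum_{v_1 \,\mathrm{s.t.}\, \mathrm{sum}(v_1)=s_1} p(v_1 \mid v_0, v_2),$$ (Equation 8)

where sum(v1)=Σi v1[i]. Again, from Bayes’ theorem, one gets

$$p(v_1 \mid v_0, v_2) = \frac{p(v_1 \mid v_0)\,p(v_2 \mid v_1, v_0)}{\sum_{v_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1, v_0)} = \frac{p(v_1 \mid v_0)\,p(v_2 \mid v_1)}{\sum_{v_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1)}.$$ (Equation 9)

Therefore, the incorporation of Equations 8 and 9 yields

$$p(s_1 \mid v_0, v_2) = \frac{\sum_{v_1 \,\mathrm{s.t.}\, \mathrm{sum}(v_1)=s_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1)}{\sum_{v_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1)} = \frac{\sum_{v_1 \,\mathrm{s.t.}\, \mathrm{sum}(v_1)=s_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1)}{\sum_{s_1} \sum_{v_1 \,\mathrm{s.t.}\, \mathrm{sum}(v_1)=s_1} p(v_1 \mid v_0)\,p(v_2 \mid v_1)}.$$ (Equation 10)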

Assume that all types of siRNAs are phenotypically identical and have the same probability of escaping from the endosome. Then, the distribution v1[1],v1[2],...,v1[m] of s1 could be considered as sampling randomly without replacement from the initial injected amount with distribution v0. So, we can select the amount of escape from a multivariate hypergeometric distribution with v0 and sum(v1)=s1. The probability of drawing v1 from v0 is given by

$$p(v_1 \mid v_0, \mathrm{sum}(v_1)=s_1) = \frac{\binom{v_0[1]}{v_1[1]}\binom{v_0[2]}{v_1[2]}\cdots\binom{v_0[m]}{v_1[m]}}{\binom{v_0[1]+v_0[2]+\cdots+v_0[m]}{v_1[1]+v_1[2]+\cdots+v_1[m]}}.$$ (Equation 11)
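The multivariate hypergeometric probability can be evaluated directly with binomial coefficients. The sketch below is illustrative; the function name `mv_hypergeom_pmf` and the small composition `v0 = [2, 1, 3]` are assumptions chosen so that every configuration can be enumerated and the probabilities checked to sum to 1.

```python
from itertools import product
from math import comb


def mv_hypergeom_pmf(v1, v0):
    """Probability of drawing composition v1 without replacement from v0."""
    num = 1
    for n0, n1 in zip(v0, v1):
        num *= comb(n0, n1)               # ways to choose n1 of the n0 molecules
    return num / comb(sum(v0), sum(v1))   # all ways to choose sum(v1) molecules


v0 = [2, 1, 3]   # hypothetical injected amounts for three types
s1 = 3           # hypothetical amount of escape
# Enumerate every v1 with 0 <= v1[i] <= v0[i] and sum(v1) = s1.
configs = [v1 for v1 in product(*(range(n + 1) for n in v0)) if sum(v1) == s1]
total = sum(mv_hypergeom_pmf(v1, v0) for v1 in configs)
print(len(configs), total)
```

Python integers have arbitrary precision, so the same function also works for amounts such as v0[i]=600, where the binomial coefficients are very large.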

In reality, components of v2 are simulated by the Gillespie algorithm in view of parameter vector θ. So, for convenience, p(v2|v1) is denoted by p(v2|v1,θ), which is factorized in accordance with the independence between each type of siRNA as follows:

$$p(v_2 \mid v_1, \theta) = \prod_i p(v_2[i] \mid v_1, \theta) = \prod_i p(v_2[i] \mid v_1[i], \theta).$$ (Equation 12)

Then, p(v2[i]|v1[i],θ) could be estimated the same way that we estimate p(s2|s1,θ), used in Algorithm 1.

The computation of p(v1|v0) and p(v2|v1), which are needed for Equation 10, has been addressed in the previous paragraphs, but we still have to sum over all of the summands obtained as v1 takes every possible value such that Σi v1[i]=s1. One key problem is that, for a given value of s1, the number of possible values of v1 grows superexponentially with s1.26 We thus face a combinatorial and computational challenge, and an alternative approach is required.
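To give a sense of how many summands are involved, the number of candidate vectors v1 can be counted in closed form in the simplest case. The sketch below assumes every v0[i] is at least s1 (true for v0[i]=600 and the targets used in the Results), so that the upper bounds v1[i]≤v0[i] never bind and the standard stars-and-bars count applies; the function name `n_configs` is hypothetical.

```python
from itertools import product
from math import comb


def n_configs(s1, m):
    """Number of vectors of m non-negative integers that sum to s1.

    Stars-and-bars count; valid here only when every v0[i] >= s1, so the
    bounds v1[i] <= v0[i] never bind (an assumption of this sketch).
    """
    return comb(s1 + m - 1, m - 1)


# Brute-force check on a small case: m = 3 types, s1 = 4.
brute = sum(1 for v1 in product(range(5), repeat=3) if sum(v1) == 4)
print(brute == n_configs(4, 3))

# Number of summands for m = 7 types and three target amounts of escape.
for s1 in (7, 70, 140):
    print(s1, n_configs(s1, 7))
```

Each of these configurations would need its own set of Gillespie simulations to estimate p(v2|v1), which is why exhaustive summation is impractical.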

To avoid the combinatorial problem, the more probable configuration of v1, such as the modes of v1, could replace the summands that consider all possibilities of v1 in Equation 10. Requena et al.27 have elaborated an algorithm to solve this question, but now, we provide a simpler sampling method that is to sample points v1 randomly from multivariate hypergeometric distribution p(v1|v0) for enough times so that most of these points would be adjacent to the modes. The concrete execution of the sampling procedure is shown in Algorithm 3.

Likewise, as discussed in the context above, there are problems of efficiency with this approach. Therefore, it is tempting to attempt to use the MCMC method.

MCMC Method

Multiple types are also appropriate for the MCMC method. Similar to the single type, our target distribution is posterior distribution p(s1|v0,v2) now. From Equation 9, we get

$$p(s_1 \mid v_0, v_2) \propto p(s_1 \mid v_0)\,p(v_2 \mid s_1).$$ (Equation 13)

In view of the equally likely prior p(s1) and Equation 13, the acceptance probability of the MH method is given by

$$\alpha\bigl(s_1', s_1^{(k)}\bigr) = \min\left\{1, \frac{p(v_2 \mid s_1')}{p(v_2 \mid s_1^{(k)})}\right\}.$$ (Equation 14)

To evaluate the acceptance probability, we first draw v1' from the multivariate hypergeometric distribution with s1' and the given v0. Afterward, we simulate one vector v2' from v1' after Δt by the Gillespie algorithm, and the distance between the given value v2 and the simulation v2' is recorded as d'=‖v2'−v2‖. Finally, we give the numerator in Equation 14 as

$$p(v_2 \mid s_1') = e^{-d'}.$$ (Equation 15)

Let v2(k) be simulated from s1(k) in the same way, and let d=‖v2(k)−v2‖. The denominator in Equation 14 is then computed by

$$p\bigl(v_2 \mid s_1^{(k)}\bigr) = e^{-d}.$$ (Equation 16)

Then, we accept s1' with probability

$$\alpha\bigl(s_1', s_1^{(k)}\bigr) = \min\left\{1, \frac{e^{-d'}}{e^{-d}}\right\}.$$ (Equation 17)

The exact process of the MCMC method is described in Algorithm 4.

Author Contributions

Y.P. conceived the project and designed the framework of this paper; T.L. and C.L. carried out the mathematical analyses, performed the simulations, and wrote the first draft; M.Y. polished and revised the final draft. All authors contributed to the manuscript and approved the final manuscript.

Conflicts of Interest

The authors declare no competing interests.

Acknowledgments

The work was supported by the National Natural Science Foundation of China (11471243 and 11971023).

References

  • 1. Fire A., Xu S., Montgomery M.K., Kostas S.A., Driver S.E., Mello C.C. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature. 1998;391:806–811. doi: 10.1038/35888.
  • 2. Wilson J.A., Richardson C.D. Induction of RNA interference using short interfering RNA expression vectors in cell culture and animal systems. Curr. Opin. Mol. Ther. 2003;5:389–396.
  • 3. Dykxhoorn D.M., Novina C.D., Sharp P.A. Killing the messenger: short RNAs that silence gene expression. Nat. Rev. Mol. Cell Biol. 2003;4:457–467. doi: 10.1038/nrm1129.
  • 4. Wilson J.A., Richardson C.D. Hepatitis C virus replicons escape RNA interference induced by a short interfering RNA directed against the NS5b coding region. J. Virol. 2005;79:7050–7058. doi: 10.1128/JVI.79.11.7050-7058.2005.
  • 5. Baum J.A., Bogaert T., Clinton W., Heck G.R., Feldmann P., Ilagan O., Johnson S., Plaetinck G., Munyikwa T., Pleau M. Control of coleopteran insect pests through RNA interference. Nat. Biotechnol. 2007;25:1322–1326. doi: 10.1038/nbt1359.
  • 6. Mao Y.B., Cai W.J., Wang J.W., Hong G.J., Tao X.Y., Wang L.J., Huang Y.P., Chen X.Y. Silencing a cotton bollworm P450 monooxygenase gene by plant-mediated RNAi impairs larval tolerance of gossypol. Nat. Biotechnol. 2007;25:1307–1313. doi: 10.1038/nbt1352.
  • 7. Yang W.Q., Zhang Y. RNAi-mediated gene silencing in cancer therapy. Expert Opin. Biol. Ther. 2012;12:1495–1504. doi: 10.1517/14712598.2012.712107.
  • 8. Ma T., Pei Y., Li C., Zhu M. Periodicity and dosage optimization of an RNAi model in eukaryotes cells. BMC Bioinformatics. 2019;20:340. doi: 10.1186/s12859-019-2925-z.
  • 9. Wooddell C.I., Rozema D.B., Hossbach M., John M., Hamilton H.L., Chu Q., Hegge J.O., Klein J.J., Wakefield D.H., Oropeza C.E. Hepatocyte-targeted RNAi therapeutics for the treatment of chronic hepatitis B virus infection. Mol. Ther. 2013;21:973–985. doi: 10.1038/mt.2013.31.
  • 10. Dominska M., Dykxhoorn D.M. Breaking down the barriers: siRNA delivery and endosome escape. J. Cell Sci. 2010;123:1183–1189. doi: 10.1242/jcs.066399.
  • 11. Dougherty W.G., Parks T.D. Transgenes and gene suppression: telling us something new? Curr. Opin. Cell Biol. 1995;7:399–405. doi: 10.1016/0955-0674(95)80096-4.
  • 12. Sijen T., Fleenor J., Simmer F., Thijssen K.L., Parrish S., Timmons L., Plasterk R.H.A., Fire A. On the role of RNA amplification in dsRNA-triggered gene silencing. Cell. 2001;107:465–476. doi: 10.1016/s0092-8674(01)00576-1.
  • 13. Dybowski R., Restif O., Price D.J., Mastroeni P. Inferring within-host bottleneck size: A Bayesian approach. J. Theor. Biol. 2017;435:218–228. doi: 10.1016/j.jtbi.2017.09.011.
  • 14. Moncla L.H., Zhong G., Nelson C.W., Dinis J.M., Mutschler J., Hughes A.L., Watanabe T., Kawaoka Y., Friedrich T.C. Selective Bottlenecks Shape Evolutionary Pathways Taken during Mammalian Adaptation of a 1918-like Avian Influenza Virus. Cell Host Microbe. 2016;19:169–180. doi: 10.1016/j.chom.2016.01.011.
  • 15. Abel S., Abel zur Wiesch P., Davis B.M., Waldor M.K. Analysis of bottlenecks in experimental models of infection. PLoS Pathog. 2015;11:e1004823. doi: 10.1371/journal.ppat.1004823.
  • 16. Sobel L.A., Weissman D., Greenbaum B., Ghedin E., Koelle K. Transmission Bottleneck Size Estimation from Pathogen Deep-Sequencing Data, with an Application to Human Influenza Virus. J. Virol. 2017;91:e00171-17. doi: 10.1128/JVI.00171-17.
  • 17. Silverman B.W. Chapman and Hall/CRC; 1986. Density Estimation for Statistics and Data Analysis.
  • 18. Zhang J., Liu X., Zhang J., Li D., Sun Y., Guo Y., Ma E., Zhu K.Y. Silencing of two alternative splicing-derived mRNA variants of chitin synthase 1 gene by RNAi is lethal to the oriental migratory locust, Locusta migratoria manilensis (Meyen). Insect Biochem. Mol. Biol. 2010;40:824–833. doi: 10.1016/j.ibmb.2010.08.001.
  • 19. Xia J.Y. Analysis on the outbreak of Locusta migratoria manilensis and its control strategies. Plant Protection Technology and Extension. 2002;22:7–10.
  • 20. Cohen E. Chitin synthesis and inhibition: a revisit. Pest Manag. Sci. 2001;57:946–950. doi: 10.1002/ps.363.
  • 21. Gillespie D.T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 1977;81:2340–2361.
  • 22. Yin J.Q., Gao J., Shao R., Tian W.N., Wang J., Wan Y. siRNA agents inhibit oncogene expression and attenuate human tumor cell growth. J. Exp. Ther. Oncol. 2003;3:194–204. doi: 10.1046/j.1359-4117.2003.01092.x.
  • 23. Gasparini M. Chapman and Hall/CRC; 1996. Markov Chain Monte Carlo in Practice.
  • 24. Wilkinson D.J. Stochastic modelling for systems biology. In: Wilkinson D.J., editor. Briefings in Bioinformatics. Chapman and Hall/CRC; 2006. pp. 204–205.
  • 25. Pandey A., Mubayi A., Medlock J. Comparing vector–host and SIR models for dengue transmission. Math. Biosci. 2013;246:252–259.
  • 26. Stanley R.P. Cambridge University Press; 1997. Enumerative Combinatorics.
  • 27. Requena F., Ciudad N.M. The Maximum Probability 2 × c Contingency Tables and the Maximum Probability Points of the Multivariate Hypergeometric Distribution. Commun. Stat.-Theor. M. 2003;9:1737–1752.

Articles from Molecular Therapy. Nucleic Acids are provided here courtesy of The American Society of Gene & Cell Therapy
