Skip to main content
Genetics logoLink to Genetics
. 2016 Dec 9;205(2):827–841. doi: 10.1534/genetics.116.197285

Evolution of Resistance Against CRISPR/Cas9 Gene Drive

Robert L Unckless *, Andrew G Clark †,, Philipp W Messer ‡,1
PMCID: PMC5289854  PMID: 27941126

Abstract

CRISPR/Cas9 gene drive (CGD) promises to be a highly adaptable approach for spreading genetically engineered alleles throughout a species, even if those alleles impair reproductive success. CGD has been shown to be effective in laboratory crosses of insects, yet it remains unclear to what extent potential resistance mechanisms will affect the dynamics of this process in large natural populations. Here we develop a comprehensive population genetic framework for modeling CGD dynamics, which incorporates potential resistance mechanisms as well as random genetic drift. Using this framework, we calculate the probability that resistance against CGD evolves from standing genetic variation, de novo mutation of wild-type alleles, or cleavage repair by nonhomologous end joining (NHEJ)—a likely by-product of CGD itself. We show that resistance to standard CGD approaches should evolve almost inevitably in most natural populations, unless repair of CGD-induced cleavage via NHEJ can be effectively suppressed, or resistance costs are on par with those of the driver. The key factor determining the probability that resistance evolves is the overall rate at which resistance alleles arise at the population level by mutation or NHEJ. By contrast, the conversion efficiency of the driver, its fitness cost, and its introduction frequency have only minor impact. Our results shed light on strategies that could facilitate the engineering of drivers with lower resistance potential, and motivate the possibility to embrace resistance as a possible mechanism for controlling a CGD approach. This study highlights the need for careful modeling of the population dynamics of CGD prior to the actual release of a driver construct into the wild.

Keywords: CRISPR/Cas9, gene drive, homing drive, mutagenic chain reaction, whole population replacement


THE prospect of driving genetically modified alleles to fixation in a population has enticed scientists for >40 years (Curtis 1968; Foster et al. 1972). Potential applications are broad and ambitious, including the eradication of vector-borne diseases such as malaria, dengue, and Zika (Burt 2003; Esvelt et al. 2014; Champer et al. 2016). For example, mosquitoes could be genetically altered such that they can no longer transmit Plasmodium parasites. If these altered alleles could then be spread in a wild population, we could effectively eradicate malaria in humans. Similarly, a gene drive could be used to reverse insecticide resistance in an agricultural pest, or to spread a deleterious allele in an invasive species to drive it toward extinction.

While various strategies for implementing a gene drive have been discussed since the 1970s, all previously proposed mechanisms have faced significant obstacles. Ongoing efforts to transfer the Medea system from flour beetles to other insects have, thus far, fallen short (Chen et al. 2007; Akbari et al. 2014). Underdominance systems have also been developed (Altrock et al. 2010; Akbari et al. 2013; Reeves et al. 2014), but these are slow-spreading systems that require multiple releases of large numbers of male transgenic insects. The use of Wolbachia-mediated strategies have perhaps been most successful (Lambrechts et al. 2015).

The recently proposed CRISPR/Cas9 gene drive (CGD) system now promises a highly adaptable approach for driving genetically engineered alleles into a population, even when those alleles carry substantial fitness costs (Gantz and Bier 2015; Gantz et al. 2015). CGD falls into the category of homing endonuclease gene drives, first proposed by Burt (2003), which consists of a genetic construct (driver allele) encoding an endonuclease that can cleave a genomic target site and then insert itself into that site. CGD uses a Cas9 endonuclease for cleavage that can be engineered to target virtually any 20-bp-long nucleotide sequence in a genome, using a target-specific guide RNA (gRNA). The driver construct is flanked by two homology arms to facilitate its incorporation at the cleavage site during homology-directed repair. In addition, arbitrary payload sequences can be included in the construct that will be transmitted alongside the driver. By converting heterozygotes for the driver allele into homozygotes (Figure 1A), a CGD construct can “cheat” the rules of Mendelian inheritance, allowing for rapid spread of the driver in the population (Burt 2003; Unckless et al. 2015).

Figure 1.

Figure 1

(A) CGD mechanism and creation of resistance alleles by NHEJ: In heterozygotes, Cas9 from the driver construct cleaves the wild-type chromosome at the target site. (i) Cleavage repair by homology-directed repair using the sister chromatid as a template will convert heterozygotes for the driver allele into homozygotes. (ii) Break-repair by NHEJ, on the other hand, can create a resistance allele, e.g., by introducing an indel (green) at the target site. (B) Driver dynamics and resistance allele creation rates in our standard model: driver frequency initially grows exponentially in our standard model and the driver fixes after ∼30 generations (assuming resistance does not evolve). As the driver increases in frequency, so does se(t). The establishment probability of a newly arisen resistance allele, π(t), increases with se(t), but is already quite high in generation zero, because resistance alleles present at that time only have to survive a few generations of drift before their relative selective advantage becomes noticeable. The rate of resistance allele creation by de novo mutation, uμ(t), is proportional to the frequency of wild types and thus decreases as the driver becomes more frequent. The rate of resistance allele creation by NHEJ, uδ(t), is proportional to the frequency of driver/wild-type heterozygotes and is thus maximal for intermediate driver frequencies. (C) Probability that resistance establishes from the SGV as a function of the initial copy number, n0, of resistance alleles present when the driver is introduced. The three curves show our standard model (black), a scenario in which the driver cleaves at only 50% efficiency (blue), and one where driver cost is 10 times smaller (red).

As a proof of principle, Gantz and Bier (2015) showed that CGD was highly effective at converting heterozygotes for a driver allele into homozygotes in a laboratory cross of Drosophila melanogaster. In their experiment, insects homozygous for a CGD construct that disrupts the X-linked yellow gene were crossed to wild-type insects, resulting in 254 of 261 daughters with the disrupted yellow phenotype (Gantz and Bier 2015). Since this phenotype is recessive, CGD must have disrupted the wild-type allele in these females. In another study, a similar construct conferring resistance to Plasmodium falciparum was successfully introduced into a laboratory population of Anopheles mosquitoes (Gantz et al. 2015). A third study recently generated a CGD targeting a mosquito fertility gene for population suppression (Hammond et al. 2016). Several theoretical studies have further shown that gene drives can, in principle, spread very rapidly in a population, even when the driver allele carries a substantial fitness cost (Burt 2003; Sinkins and Gould 2006; Deredec et al. 2008; Gould 2008; North et al. 2013; Burt 2014; Unckless et al. 2015; Drury et al. 2016).

Despite these initial results, it is still unclear whether CGD can actually lead to fixation of the driver allele in a large natural population because of several factors that might influence its population dynamics. Arguably the most critical problem is the evolution of resistance against the drive (Burt 2003; Deredec et al. 2008, 2011; Esvelt et al. 2014; Bull 2016; Noble et al. 2016b). Such resistance could arise, for instance, through alleles at the target locus that cannot be converted by CGD because they are not recognized by the driver’s gRNA. In fact, a single nucleotide mutation in the 3-bp-long PAM motif of the target sequence will typically suffice to create such a resistance allele. Most indels within the target will likely create a resistance allele as well. Importantly, the driver itself is expected to produce its own resistance alleles when cleavage is repaired by end joining mechanisms such as nonhomologous end joining (NHEJ) or microhomology-mediated end joining, rather than homologous recombination, and such resistance alleles have already been observed in several experiments (Gantz and Bier 2015; Gantz et al. 2015; Hammond et al. 2016; Wang et al. 2016).

If resistance alleles are less deleterious than the driver allele—as will likely be the case for many types of resistance alleles—natural selection will favor these alleles over the driver. As a result, they will rise in frequency and should eventually replace the driver. Thus, even though a driver may initially spread to high frequency in the population, its ultimate fate will depend on whether resistance alleles have emerged during this process. While several strategies have been proposed for reducing resistance potential; including the use of multiple gRNAs, the targeting of essential genes, daisy chains, or poison-antidote systems (Esvelt et al. 2014; Champer et al. 2016; Noble et al. 2016a); it remains to be seen how well these approaches can actually work in practice.

It is clear that any informed decision about the potential consequences and risks of releasing a CGD construct into a wild population requires that we first understand the dynamics of this process on the population level, including potential resistance mechanisms. Here, we build on previous theoretical results to devise a comprehensive population genetic model of CGD, which allows us to quantitatively study the population dynamics of such systems, calculate the probability that resistance evolves, and predict how resistance alleles will interfere with the spread of a driver construct.

Results

Population genetic model of CGD

Our model for studying the population dynamics of CGD is based on the original drive system proposed by Gantz and Bier (2015), which hearkens back to Burt (2003). We consider a single locus with three types of alleles: wild type (0); driver (d); and resistance (r), where we define a resistance allele to be any allele that cannot drive and also cannot be converted into a functioning driver by the CGD machinery (complete resistance). The driver cleaves the wild-type allele in heterozygous driver/wild-type embryos at rate c. In a fraction δ of cases, cleavage repair creates a resistance allele when NHEJ introduces an indel at the target site. In reality, such resistance alleles can be created by any repair mechanism that changes the target sequence without incorporating the functional drive machinery (e.g., microhomology-mediated end joining or the incorporation of a loss-of-function driver construct) (Champer et al. 2016), but we refer to this entire class as NHEJ for simplicity. Resistance alleles can also arise by de novo mutation in wild-type alleles at rate μ, specifying an effective rate that includes all possible mutations creating resistance. We set the fitness of wild-type homozygotes to ω00=1, whereas all other genotypes can carry arbitrary fitness cost, ωij=1sij1. The driver is initially introduced into the population in generation t=0 at frequency xd(0). We model a large, panmictic population with variance effective population size Ne and census population size N.

Note that there could generally also exist other types of alleles at the locus. For example, resistance alleles that arise by de novo mutation could be distinguishable from those that arise by NHEJ, or resistance alleles that were already present as natural variation in the population. Depending on the specific drive scenario, these different resistance alleles could carry different costs. For simplicity, we combine all resistance alleles into a single class with the same fitness costs in our model.

We will typically perform our numerical analyses using a “standard” model, for which we vary individual parameters independently, while keeping the other parameters constant. This will allow us to dissect how each individual parameter affects the general behavior of the model. Our standard model assumes a scenario in which cleavage is efficient, c=0.9, as one would likely aim to design a driver this way. We further assume that resistance alleles carry no cost, sr0=srr=0, which we will relax later. Driver alleles, by contrast, carry a substantial cost that is codominant, sd0=0.1=sdd/2. In those analyses where both the driver and the resistance allele carry a cost, we will assume that these costs are codominant to each other, sdr=sd0+sr0. We further set δ=106 and μ=108. Both are likely conservative choices with respect to the actual rates at which resistance alleles arise in many systems of interest, such as insects. In fact, NHEJ rates are probably several orders of magnitude higher in most systems, in which case the evolution of resistance from NHEJ will be practically assured (as we will show below). We still choose a very low value of δ for our standard model both to be able to study systems in which NHEJ has been intentionally suppressed and because it is the parameter space where resistance is not inevitable. We finally set N=Ne=106 for simplicity, and introduce driver alleles into the population at frequency xd(0)=105. Table 1 provides a summary of the different parameters and their default values in our standard model.

Table 1. Description of all model parameters.

Parameter Standard model Description
xd(t=0) 10−5 Introduction frequency of driver allele
c 0.9 Cleavage rate in driver/wild-type heterozygotes
δ 10−6 Fraction of cleavage events resulting in resistance allele by NHEJ
μ 10−8 Rate at which wild-type alleles mutate into resistance alleles
sd0 0.1 Fitness cost of driver/wild-type heterozygotes
sdr 0.1 Fitness cost of driver/resistance heterozygotes
sr0 0.0 Fitness cost of resistance/wild-type heterozygotes
sdd 0.2 Fitness cost of driver homozygotes
srr 0.0 Fitness cost of resistance homozygotes
Ne 106 Variance effective population size
N 106 Census population size

Deterministic frequency dynamics

In our model, we can express the expectation values of allele frequencies xd(t), xr(t), and x0(t) as functions of their frequencies in the previous generation; the fitness costs of the different genotypes; and the rates c, μ, and δ. We initially assume that δ1 and μ1, such that the generation of new resistance alleles does not noticeably affect wild-type and driver allele frequencies (we will relax this assumption below). Given allele frequencies xd, xr, and x0 in generation t, the expected values of these frequencies in generation t+1 are then

E[xd]=xdx0[(1c)ωd0+2cωdd]+xdxrωdr+xd2ωddω¯(t), (1)
E[xr]=xrx0ωr0+xrxdωdr+xr2ωrrω¯(t), (2)
E[x0]=1E[xd]E[xr]. (3)

The contributions to the numerator in Equation 1 are as follows: The first term specifies the contributions of driver/wild-type heterozygotes that either successfully converted into driver homozygotes and thus now have fitness ωdd [fraction c(1δ)c of driver/wild-type heterozygotes], or failed to convert and thus remain heterozygotes with fitness ωd0 [fraction 1c(1δ)1c of driver/wild-type heterozygotes]. The second and third terms specify the contributions from driver/resistance heterozygotes (fitness ωdr) and driver homozygotes (fitness ωdd), respectively. In Equation 2, the first term in the numerator specifies the contribution from resistance/wild-type heterozygotes, the second term the contributions from resistance/driver heterozygotes, and the third term the contribution from resistance homozygotes. Equation 3 follows from the condition that allele frequencies have to sum to one. The mean population fitness in generation t is given by

ω¯(t)=2xdx0[(1c)ωd0+cωdd]+2xrx0ωr0+2xrxdωdr+ixi2ωii. (4)

These equations fully describe the dynamics of allele frequencies from a given set of starting frequencies in a deterministic scenario, which will be appropriate when allele frequencies are sufficiently large that genetic drift can be ignored. Additionally, these equations can be incorporated into a Wright–Fisher framework that explicitly takes drift into account.

We and others have previously shown that these dynamics can already produce rich behavior in the absence of resistance (Burt 2003; Deredec et al. 2008; Unckless et al. 2015). Possible outcomes can include the fixation of the driver, loss of the driver, and both stable or unstable equilibria, depending on the cleavage rate, c, the fitness costs of the driver allele, sdd and sd0, and the initial frequency xd(t=0) at which the driver is introduced into the population. Briefly, in scenarios with high cleavage rate and small fitness costs, the driver will usually fix even when introduced at low frequency, whereas for low cleavage rates and high fitness costs the driver may not be able to invade at all. Intermediate scenarios can give rise to equilibria that are usually unstable, but can be stable if cleavage rates are small and fitness costs are low and recessive. For the remainder of this study, we will focus only on cases where, in the absence of resistance, the driver allele can go to fixation.

Origin of resistance alleles

We distinguish three mechanisms by which resistance alleles can originate: (i) standing genetic variation (SGV), (ii) de novo mutation of wild-type alleles after the driver is introduced (specified by μ), and (iii) cleavage repair by NHEJ (specified by δ). Initially we consider each of these mechanisms independently, assuming the others do not occur. To quantify the expected supply of resistance alleles from SGV, we assume mutation-selection-drift balance prior to the introduction of the driver, which is determined by the fitness costs of resistance alleles in the absence of the driver and the population-level mutation rate toward resistance alleles, θ=4Neμ (Hermisson and Pennings 2005). Once the driver is introduced, de novo mutation of wild-type alleles will create resistance alleles at rate

uμ(t)=2Nμ[(1xd)2+xd(1xd)(1c)]. (5)

This rate is proportional to the number of wild-type alleles and thus decreases as the driver spreads. NHEJ during break repair will create resistance alleles at rate

uδ(t)=2Nδcxd(1xd), (6)

which is proportional to the number of wild-type/driver heterozygotes and thus is highest when the driver is at intermediate frequency.

Establishment of resistance alleles

The overall likelihood that resistance evolves will depend on the supply of resistance alleles by the individual mechanisms, and the probability that a new resistance allele is not lost to drift when it is still rare (its so-called establishment probability). Before we discuss the individual contributions of resistance from SGV, de novo mutation, and NHEJ, we will first calculate these establishment probabilities in our model.

As long as resistance alleles are still rare, they should be present mostly in heterozygotes, and we can therefore neglect resistance homozygotes at this stage. If a resistance allele is present at frequency xr(t)1 in generation t, its expected frequency in the next generation will be

E[xr]xr(1xd)ωr0+xrxdωdrω¯(t), (7)

which can be rearranged into

E[xr]xr[1+se(t)]withse(t)=(1xd)ωr0+xdωdrω¯(t)1. (8)

This “effective” selection coefficient se(t) specifies the expected change in frequency of the resistance allele in generation t. Note that se(t) is time dependent because it is a function of the driver frequency (Figure 1B). If resistance alleles carry a cost themselves, then se(t) could initially be negative and turn positive only after the driver has reached a certain frequency in the population.

To calculate se(t) in a given generation t, we need to know the frequency xd(t) of the driver allele in that generation. Since we assume that resistance alleles are still at very low frequency, such that xd(t)1x0(t), the dynamics from Equations 1–4 simplifies to

xd=xd(1xd)[(1c)ωd0+2cωdd]+xd2ωddω¯(t)withω¯(t)=2xd(1xd)[(1c)ωd0+cωdd]+xd2ωdd+(1xd)2. (9)

Assuming a deterministic model for the driver frequency, we can then calculate xd(t) recursively from its starting frequency xd(0).

We define establishment probability π(t) as the probability that a resistance allele present in a single copy in generation t successfully survives in the population (i.e., is not lost by drift while it is still rare). Uecker and Hermisson (2011) showed how the establishment probability of a new mutation can be calculated when its selection coefficient is time dependent. Their theory can be directly applied to our scenario if we consider a resistance allele to be a new mutation with time-dependent selection coefficient se(t). Defining g(t,t)=i=tt1se(i) and f(t,t)=i=tt1eg(t,i)[1ese(i)]/se(i), we obtain (Appendix A)

π(t){21+(N/Ne)[f(t,tfix)+eg(t,tfix)/se(tfix)],t<tfix21+(N/Ne)×1/se(tfix),ttfix. (10)

Both of these expressions break down when se(tfix)=0, which could be the case if the fitness cost of the driver is completely dominant, such that resistance alleles would not provide any fitness advantage in driver/resistance heterozygotes. We will discuss this particular scenario below. For now, we assume that se(tfix)>0. Note that as the driver spreads, the supply rates of resistance alleles from de novo mutation and NHEJ, the selective advantage of being resistant, and the probability that resistance escapes stochastic loss are all dynamic quantities (Figure 1B).

Given an estimate for π(t), we can directly calculate the probability that a resistance allele will successfully establish in the population when it is initially present in n0 copies at time t

Pr(establish|n0,t)1[1π(t)]n0. (11)

Figure 1C shows how Pr(establish|n0) scales for our standard model with the initial copy number of resistance alleles present in the SGV at the time the driver is introduced.

Resistance from the standing genetic variation

The probability that resistance evolves from alleles present as SGV will depend on the expected frequencies of such alleles in the population in generation t=0 and their establishment probabilities provided by Equation 11. We can use existing theory for the well-studied problem of evolutionary rescue from SGV (Hermisson and Pennings 2005; Orr and Unckless 2008) for calculating this probability. In particular, we will use the analytical framework of Hermisson and Pennings (2005) and consider the introduction of the driver to be the environmental shift, which then renders a previously neutral or deleterious resistance allele as beneficial as mean population fitness declines when the driver spreads. We assume that prior to introduction of the driver, resistance alleles arise at rate μ per generation per haploid wild-type genome, and that they are evolving under mutation-selection-drift balance, specified by fitness costs sr0 and srr. We define θ=4Neμ to be twice the population-level mutation rate toward resistance alleles. After introduction of the driver, the fitness effects of resistance alleles are given by se(t), as defined in Equation 8.

Let PSGV denote the probability that resistance successfully evolves from any resistance allele present in the SGV at the time the driver is introduced, assuming that SGV is the only possible source of resistance alleles (which we can assure in our model by setting μ and δ to zero once the driver is introduced). For this case, we obtain

PSGV=1exp[θln(1+2Neπ(0)σ+1)]withσ={4Nesr00<sr0<srr2Nesrr0=sr0<srr. (12)

The derivation follows Hermisson and Pennings (2005) and involves integrating establishment probabilities over the expected frequency distribution of resistance mutations under mutation-selection-drift balance. Our result for the codominant case (0 < sr0 < srr) follows directly from Equation 8 in Hermisson and Pennings (2005), where we can identify hsd (the heterozygous fitness cost of resistance alleles) with sr0 in our model. The factor 2hsb in this equation is the establishment probability of a mutation present in a single copy in generation zero, which corresponds to π(0) in our model. The result for recessive resistance costs (0=sr0<srr) follows from the discussion provided after equation A11 in Hermisson and Pennings (2005), where they show that the factor 2hαdαd in their equation 8 needs to be replaced by 2Nes in the recessive case. The factor s in this expression specifies the homozygous fitness cost of the mutation, which corresponds to srr in our model.

Figure 2 shows PSGV as a function of θ in our standard model, while simultaneously varying the cleavage rate (Figure 2A), the fitness cost of the driver allele (Figure 2B), or the fitness cost of resistance (Figure 2, C and D). Interestingly, as the cleavage rate (c) increases, so does the probability of resistance from SGV. This is because higher c leads to a faster spread of the driver, and resistance alleles present in the SGV will thus experience their net fitness advantage sooner; increasing their chances of surviving drift. Generally, as long as resistance alleles provide a net fitness advantage over the driver, evolution of resistance from the SGV is practically assured whenever θ0.1, whereas it remains unlikely for θ0.1. This is consistent with the general result that the probability of adaptation to an environmental shift using alleles from the SGV should depend only weakly on the selective advantage of these alleles in the new environment, but should be mostly determined by how many such alleles were present when the environment changed, which is a function of θ and the selective disadvantage of these alleles prior to the change (Hermisson and Pennings 2005).

Figure 2.

Figure 2

Probability that resistance evolves from SGV in our standard model. (A) PSGV as a function of θ and cleavage rate: higher c increases PSGV by increasing π(0) (see panel E). (B) PSGV as a function of θ and driver cost: higher driver costs also increases PSGV by increasing π(0) (see F). (C) PSGV when resistance alleles carry a codominant fitness cost (sr0=srr/2>0). (D) Same as (C), but assuming recessive fitness costs (srr>sr0=0). Higher fitness cost of resistance alleles lead to lower PSGV only if fitness costs are codominant, whereas recessive costs have almost no noticeable effect. To vary θ in (A–D), we always varied Ne while keeping μ=108 constant. (E) Establishment probability as a function of cleavage rate: π(0) increases with c because resistance alleles can experience their net fitness advantage faster when the driver spreads faster. Our analytics (line) agree well with numerical simulations (●) under a Wright–Fisher model with conversion, selection, and drift. (F) Same as (E) but varying driver costs instead of cleavage rate.

If resistance alleles carry a fitness cost themselves, this lowers the probability that resistance evolves from the SGV. However, this effect is much more pronounced when fitness costs are codominant than when they are recessive, consistent with the fact that even a deleterious allele can still reach a noticeable frequency in the SGV as long as its costs are only recessive. In our model, even for substantial recessive costs (srr=0.1), we still find PSGV to be practically indistinguishable from the cost-free scenario.

Resistance from de novo mutation

Any given resistance allele that arises from de novo mutation as a single copy in generation t will successfully establish with probability π(t), assuming that resistance does not arise from any other mechanism. Thus, the overall probability that at least one allele that arose by de novo mutation during the time interval [1,t] establishes in the population is given by

Pr(tμ<t)=1i=1teuμ(i)π(i). (13)

The factor euμ(i)π(i) here specifies the probability that no resistance allele from generation i successfully establishes in the population.

Note that uμ(t) is defined in terms of the census population size N because the overall supply of resistance mutations depends on how many wild-type genomes are present in the population overall, not the effective population size. If the effective population size differs from the census size, this will not affect uμ, but will change the establishment probability of a new resistance allele initially present in just a single copy in the population, specified by Equation A1. However, this change is often well approximated by π(Ne/N)π, where Ne technically refers to the so-called “fixation” effective population size (Otto and Whitlock 1997). In that case, N simply cancels out in the products uμ(i)π(i). We always set N=Ne in our standard model for simplicity.

Resistance alleles from a de novo mutation can only arise as long as the driver has not yet fixed in the population. Afterward there will no longer be wild-type alleles present that could mutate into resistance alleles: thus, uμ(ttfix)=0. The overall probability that resistance establishes from a de novo mutation arising after introduction of the driver is therefore

Pμ=Pr(tμ<tfix). (14)

Figure 3, A–C, shows how Pμ depends on the parameters in our standard model. Again, we find that the key parameter determining the likelihood of resistance is θ=4Neμ. However, in contrast to resistance from SGV, both the cleavage rate of the drive (c) and the cost of the driver allele (sd0) can noticeably affect Pμ as well. Specifically, higher cleavage rates decrease Pμ (the opposite effect they had on PSGV). This is because higher cleavage rates speed up the spread of the driver, providing less time for resistance alleles to emerge by de novo mutation before the driver becomes fixed. Higher fitness costs of the driver, on the other hand, increase Pμ by increasing the establishment probability of resistance alleles, as resistance alleles will have a higher net fitness advantage over the driver. When resistance alleles carry (codominant) fitness costs themselves, this has only a marginal effect on Pμ, as long as they still provide a net fitness advantage. Since we assume that resistance homozygotes are irrelevant for the establishment probability of a new resistance allele, recessive costs have no noticeable effect on Pμ in our model.

Figure 3.

Figure 3

Probability that resistance evolves from de novo mutation or NHEJ in our standard model. (A) Pμ as a function of θ and cleavage rate: higher cleavage rate decreases Pμ, because it leads to faster fixation of the driver, providing less time for resistance alleles to arise during its spread. (B) Pμ as a function of θ and driver cost: higher driver costs increase Pμ by increasing π(t) throughout the drive. (C) Pμ as a function of θ and (codominant) resistance costs: resistance costs have only very little effect on Pμ until they become comparable to the cost of the driver. To vary θ in (A–C), we always varied Ne while keeping μ=108 constant. (D) Pδ as a function of Neδ and cleavage rate, which has very little effect on Pδ because the overall number of resistance alleles created by NHEJ throughout the spread of the driver is proportional to the overall number of cleavage events that occur during its spread, which does not depend strongly on c (a driver always has to produce roughly 2N cleavage events over the course of its spread to become fixed in the population). (E) Pδ as a function of Neδ and driver cost: higher driver costs increase Pδ by increasing π(t), similar to their effect on Pμ. (D) Pδ as a function of Neδ and (codominant) resistance costs, which have only very little effect on Pδ. To vary Neδ in (D–F), we always varied Ne while keeping δ=106 constant.

When comparing PSGV with Pμ for the same set of model parameters, we find that PSGV is generally higher than Pμ. Hence, resistance is more likely to evolve from resistance alleles already present when the driver is introduced, than from de novo mutation after its introduction. The only exception would be a scenario of a very inefficient drive (small c), in which case it would take a long time until the driver becomes prevalent in the population, reducing π(0) while at the same time providing more time for de novo resistance alleles to emerge.

Resistance from NHEJ

Similar to de novo mutation, any given resistance allele that arises in generation 0<t<tfix by NHEJ will successfully establish with probability π(t), assuming that resistance does not arise from any other mechanism. The overall probability that at least one such allele that arose during the time interval [1,t] establishes in the population is then given by

Pr(tδ<t)=1i=1teuδ(i)π(i). (15)

This equation is analogous to Equation 13 for the de novo mutation scenario, except for uμ(t) being exchanged by uδ(t). Given that uδ(t) will be zero once the driver has fixed, we can again use this result to calculate the overall probability that at least one resistance allele from this mechanism establishes in the population:

Pδ=Pr(tδ<tfix). (16)

Figure 3, D–F, shows how Pδ depends on the parameters in our standard model. To ensure that there is no resistance from the SGV or de novo mutation, we set μ=0. We then varied Neδ by varying Ne while keeping δ=106 constant. As we discussed above, this is likely to be a very low estimate for the rate at which cleavage repair generates resistance alleles through NHEJ in many systems, which may be difficult to achieve in practice. Higher rates will always increase Pδ, making this estimate conservative with regard to the probability that resistance evolves by this mechanism.

Given the similarity between the NHEJ and de novo mutation scenarios, it is not very surprising that the key parameter determining the likelihood of resistance in the NHEJ scenario is the product Neδ, similar to the parameter θ=4Neμ in the de novo scenario. Whenever Neδ becomes on the order of one or larger in our model, evolution of resistance from NHEJ becomes likely.

In contrast to the de novo mutation scenario, the cleavage rate has very little impact on Pδ. This can be understood from the fact that the overall number of resistance alleles that arise by NHEJ before driver fixation is proportional to the overall number of cleavage events that occur, which is independent of the actual cleavage rate (as long as selection against the driver does not yet have a strong effect on its spread). Successful fixation, on average, always requires 2N conversion events throughout this process, regardless of the actual conversion rate. Fitness costs of the driver, however, can have a strong impact on Pδ, with higher fitness costs leading to lower Pδ by reducing π(t) — the same effect they had in the de novo mutation scenario.

Total resistance probability

Given the individual probabilities PSGV, Pμ, and Pδ, we can calculate the total probability that resistance evolves by any of the three mechanisms

Ptot1(1PSGV)(1Pμ)(1Pδ). (17)

Figure 4 shows Ptot for a wide range of parameters and limiting cases in our standard model. Consistent with our results for the individual resistance mechanisms, we find that the total resistance probability depends primarily on three parameters: Ne, μ, and δ. Specifically, whenever θ=4Neμ0.1, resistance is likely to evolve from the SGV, as long as resistance alleles provide a net fitness advantage over the driver. This result is largely independent of the conversion rate of the driver, as well as the absolute fitness cost of the driver. Resistance alleles that arise from de novo mutation after introduction of the driver are generally less likely to contribute to resistance than alleles from the SGV. NHEJ produces resistance whenever Neδ1, unless fitness costs of the driver are very small.

Figure 4.

Figure 4

Total probability that resistance evolves in our model. (A–C) Ptot as a function of μ and δ in our standard model, assuming three different effective population sizes (A) Ne=105, (B) 106, and (C) 107. Larger values of Ne increase the probability of resistance, yet even in the scenario with Ne=105, low values of δ<105 are still required if resistance from NHEJ is to be prevented. (D–F) Ptot as a function of μ and δ for fixed Ne=106 in three limiting cases of our standard model: (D) scenario of a very weak drive with cleavage rate c=0.2. Comparison with (B) shows that this has almost no effect on Ptot. (E) Scenario with low driver costs, sd0=sdd/2=0.01. While this allows for larger values of δ, the dependence on μ does not change much, since PSGV remains largely unaffected by the driver costs (compare with Figure 2B). (F) Scenario in which resistance alleles carry (codominant) fitness costs almost as large (90%) as those of the driver, sr0=srr/2=0.09. In this case, resistance is unlikely to evolve from SGV or de novo mutation. However, it is still likely to evolve by NHEJ unless δ<105.

Relaxing model assumptions

Recessive and dominant driver costs:

So far, we have assumed that driver fitness costs are codominant. If they are recessive, the driver can initially rise in frequency faster, as driver heterozyogtes carry no costs. Conversely, if they are dominant, this will slow down the initial spread of the driver. In our standard model, the differences in driver frequency trajectories between recessive, codominant, and dominant driver fitness costs are only marginal (Figure 5A). This is because driver dynamics are dominated by conversion, rather than selection, which only becomes important once driver fitness costs are of the same order as the cleavage rate (sd0c).

Figure 5.

Figure 5

(A) Driver allele frequency trajectories, effective fitness advantage of resistance allele, and establishment probabilities under recessive, codominant, and completely dominant driver fitness costs in our standard model. π(t) is not visible in the dominant scenario as it is very close to zero. (B) Threshold frequencies x0 at which a resistance allele needs to be present to have the given establishment probability in the case of a completely dominant driver with cost sd0. (C) Replacement of driver by resistance allele. Shown is the deterministic frequency trajectory of a driver allele in our standard model and its replacement by a resistance allele from the SGV that was present as a single copy (solid line) or 100 copies (dashed line) when the driver was introduced. (D) Time until resistance allele and driver allele reach 50% population frequency in (C) when varying driver cost sd0. Once driver cost reaches sd00.25 in our model, the driver will be outpaced by the resistance allele before it can reach 50%.

Dominance of the driver can nevertheless have a strong impact on the probability that resistance evolves, due to its effects on the fitness of driver/resistance heterozygotes. Resistance alleles that are still rare will predominantly be present in driver/resistance heterozygotes once a driver is frequent. If driver costs are recessive, se(t) will then be larger than in the codominant case, increasing π(t), and thus Pμ and Pδ. By contrast, if driver costs are dominant, driver/resistance heterozygotes will have no fitness advantage over driver homozygotes. Once wild-type alleles have been completely displaced by the driver, only resistance homozygotes would then have a fitness advantage. Unfortunately our approach for calculating establishment probabilities based on Equation 10 fails here because it does not take resistance homozygotes into account.

We can still estimate whether resistance will likely evolve in this case by mapping the problem onto the analogous problem of whether a beneficial mutation can establish in a population when its fitness effects are completely recessive. Such a recessive beneficial mutation with fitness 1+s in homozygotes and fitness 1 in heterozygotes has an establishment probability of approximately 2s/(Neπ) when initially present in a single copy (Kimura 1962), which we can extend to the probability that the mutation will successfully establish when it is initially present at frequency x01:

Pr(establish|x0)1(12sNeπ)2Nex01e1.6x0Nes. (18)

In our standard scenario with cost-free resistance, the net fitness advantage of resistance homozygotes in this case will be ssdd/(1sdd), assuming that the driver is completely dominant and close to fixation. Figure 5B shows that for resistance to evolve in this scenario, resistance alleles need to be present at substantial initial frequency x0. For example, when the driver has a dominant fitness cost of sdd=0.1, resistance alleles would have to be present at ∼0.1% frequency to have a 50% chance of successfully establishing. Note that this would likely be the case in scenarios where δ is no longer very small, as we will show next in our treatment of the frequent NHEJ regime.

Frequent NHEJ regime:

Our dynamical model described by Equations 1–4 assumes that δ is small enough that we can effectively neglect the contribution of NHEJ events to changes in allele frequencies between generations. This may no longer be the case if NHEJ becomes more frequent, say δ=0.01. Extending our model to such a scenario is straightforward (Appendix B). Evolution of resistance is practically ensured in this regime as long as resistance alleles carry any net fitness advantage relative to driver alleles, given that there will always be ample supply of resistance alleles. As long as allele frequency trajectories are still dominated by conversion, rather than selection, the ratio of resistance allele frequency over driver allele frequency should then simply be

xr(t)xd(t)δ1δ. (19)

This is because every time CGD-induced cleavage in a driver/wild-type heterozygote is repaired, in a fraction δ of cases a resistance allele will be added to the population, whereas in a faction 1δ of cases a driver allele will be added. Fitness differences between driver and resistance alleles will change their relative frequencies over time, but this can be slow, unless these fitness differences become large. For a driver with high conversion rate and moderate fitness costs, we would expect that at the time the wild-type allele is lost from the population, the population frequency of resistance alleles should still be close to the ratio given in Equation 19.

High introduction frequency:

Our simulations show that varying the introduction frequency of the driver has almost no noticeable effect on the total probability that resistance evolves, which is due to two competing effects: Increasing the introduction frequency will increase π(0), and thus PSGV, because resistance alleles present in the SGV will experience their fitness advantage faster. At the same time, this will also lead to faster fixation of the driver, leaving less time for de novo resistance mutations to occur, which will decrease Pμ. The two effects approximately cancel out. Pδ will generally not depend much on the introduction frequency, as long as xd(0)1, because most resistance alleles produced by NHEJ arise when the driver is at intermediate frequency. Note that changing xd(0) can still have a noticeable impact on the fixation time of the driver. For example, while it takes the driver t0.526 generations to reach frequency 0.5 in our standard model with xd(0)=105, this only takes t0.510 generations when initially introduced at frequency x(0)=0.01.

Varying population size:

We have so far assumed that population size remains constant over time. This assumption is likely violated in many systems. Insect populations, for instance, can show dramatic fluctuations in population size over just a few generations (Berryman 2002). In temperate areas, winters can result in population crashes, followed by rapid increase during the growing season. Management practices for pests, such as insecticide application, can also change population sizes dramatically, as could CGD itself if the goal is to spread a harmful allele in the population.

Relaxing the assumption of constant population size can affect three aspects of our analytical framework: (i) the supply of resistance alleles in the SGV, (ii) the resistance allele creation rates uμ(t) and uδ(t), and (iii) the establishment probability π(t) of a resistance allele. The framework by Uecker and Hermisson (2011) we adopted for calculating π(t) does in fact allow for arbitrary changes in population size over time. Given functions N(t) and Ne(t), Equation A1 then becomes

π(t)=21+t[N(t)/Ne(x)]etxse(y)dydx, (20)

which can be solved numerically for any given demography in the same way we did for the constant population size scenario. Both uμ(t) and uδ(t) are already functions of t in our model. Under varying population size, these rates become:

uμ(t)=2N(t)μ[(1xd)2+xd(1xd)(1c)] (21)
uδ(t)=2N(t)δcxd(1xd). (22)

Given π(t) and rates uμ(t) and uδ(t), the resistance probabilities Pμ and Pδ can then be calculated according to Equations 14 and 16 the same way we did for the constant population size scenario.

To our knowledge, there are no general analytical solutions for mutation-selection-drift balance under arbitrary demography scenarios. Calculation of PSGV will therefore typically rely on numerical simulations to infer the expected frequency of resistance alleles in the SGV. However, such distributions could be easily obtained from Wright–Fisher simulations under any given demographic model, allowing for numerical estimation of PSGV.

Data availability

Our theory for modeling CGD dynamics and calculating resistance probabilities is implemented in a C++ command line program. This program takes as input the parameter values of the given model: xd(0),c,μ,δ,sd0,sdr,sr0,srr,Ne,N. For the given parameters, it then calculates the driver allele frequency trajectory xd(t) under a deterministic model, specified by Equations 1–4 in the absence of resistance; the rate uμ(t) at which resistance alleles are expected to arise by de novo mutation; the rate uδ(t) at which they are expected to arise by NHEJ; the effective selection coefficient se(t) of resistance alleles; and the establishment probability π(t) of a resistance allele arising in a single copy in generation t. Results are provided for each generation 0ttfix. The program also calculates the individual resistance probabilities Ptot, PSGV, Pμ, and Pδ. Executables, source code, and documentation for this program are available at http://messerlab.org/resources/.

Discussion

The prospect of using CGD for genetic manipulation of entire populations has sparked both optimism and deep concern. While this technology could help us tackle some of the most pressing public health challenges, such as vector-borne diseases, there are also serious doubts regarding our ability to control a gene drive once released into a wild population. In this study, we showed that resistance will almost inevitably evolve against standard CGD approaches, most likely due to resistance alleles produced by the driver itself when cleavage is repaired by NHEJ-type mechanisms.

We focused in our analyses on population parameters we feel are appropriate for many proposed target systems of CGD, such as insect populations. Estimates of single nucleotide mutation rates tend to be in the range of 109<μ<108 in such organisms (Haag-Liautard et al. 2007; Keightley et al. 2009, 2014), and a single mutation in the PAM motif probably already suffices to create a resistance allele against CGD (Cong et al. 2013). In addition, indels should typically occur within the 20-bp-long target sequence of a CGD construct at rates comparable to single nucleotide mutation rates (Schrider et al. 2013). In our model, we showed that resistance from SGV and de novo mutation is then likely whenever Ne>106, compatible with estimates of effective population sizes in many insect populations (Leffler et al. 2012). In fact, the values of Ne relevant for CGD in natural insect populations may often be much larger than these values of long-term Ne obtained from levels of neutral diversity, which tend to be dominated by historical bottlenecks and seasonal population crashes (e.g., caused by the winters in a temperate region) (Hartl and Clark 1997; Berryman 2002; Karasov et al. 2010; Messer and Petrov 2013). During warm seasons, insect population sizes can easily reach billions and more, and resistance alleles will inevitably arise during these times when the supply of resistance alleles is not limited by mutation. If enough generations occur before the next population collapse, resistance alleles could reach a sufficiently high frequency to persist throughout the collapse (Wilson et al. 2014). In such scenarios, resistance probabilities may then actually depend on the specific time during the year a driver is released.

Yet even in much smaller populations, resistance is still likely to evolve from the repair of driver-induced cleavage events by NHEJ, unless the NHEJ rate can be limited to below δ<1/Ne. This poses a very low boundary for many natural populations, which should be quite difficult to achieve in practice, given that NHEJ rates observed in previous CGD experiments were typically on the order of 1% or larger (Gantz et al. 2015). In conclusion, our results suggest that resistance should evolve almost inevitably against standard CGD approaches in most natural populations, unless NHEJ can be effectively suppressed, fitness costs of the driver are completely dominant, or these fitness costs are on par with those of resistance alleles.

Our modeling framework also allows us to study and compare resistance probabilities under specific CGD strategies (Esvelt et al. 2014). One proposed strategy is to disrupt an existing gene (e.g., a gene involved in insecticide resistance). In this case, cleavage repair by NHEJ could still achieve the intended result—a disrupted copy of the gene. Since these alleles should carry similar fitness costs as the driver, they are not likely to rise in frequency. However, resistance may still evolve if some mutations change the target sequence without actually disrupting the gene, or if the ability to drive itself is already associated with some fitness costs, as will likely be the case if off-target effects of the driver have not been completely suppressed (Fu et al. 2013; Anderson et al. 2015; Zhang et al. 2015). Another proposed strategy is the insertion of a new gene (e.g., a gene that prevents mosquitoes from transmitting malaria). If the target site lies in a nonfunctional region, NHEJ could then create cost-free resistance alleles, rendering the evolution of resistance highly probable if the driver carries a fitness cost.

Several CGD strategies have already been proposed specifically to combat resistance (Burt 2003; Deredec et al. 2008, 2011; Esvelt et al. 2014; Noble et al. 2016a,b). One such strategy is to include multiple gRNAs in the construct to target different sites. The probability that resistance evolves should then remain very low in our model, due to the large number of mutations required for an allele that is resistant to all gRNAs. Yet in practice such a strategy will likely add complexities that require more sophisticated modeling, such as allele classes with intermediate levels of resistance and possibly intermediate fitness costs. Resistance could then still evolve via a succession of intermediate steps. Another approach, often proposed in concert with multiple gRNAs, is the targeting of a conserved region of an essential gene (Esvelt et al. 2014). The idea being that individuals carrying NHEJ alleles will then have very low fitness—even as heterozygotes—and therefore will be quickly purged from the population. However, this strategy may not be appropriate in all cases. For example, if the goal is to knock out an insecticide-resistance gene, then the target gene will be predetermined. Furthermore, NHEJ may still occasionally repair sequences in such a way that renders them resistant to further drive, but without introducing significant fitness consequence. Our analytical framework can accommodate such an approach if we create two categories of NHEJ alleles: lethal and viable. The lethal NHEJ alleles can simply be ignored, while the viable NHEJ alleles can be modeled the same way as in our original model, but arising at a lower effective rate. Note that even if the proportion of such viable NHEJ among all NHEJ alleles is only 0.1–1%, as long as overall NHEJ rates are high enough, resistance will still likely evolve in our model.

All of these approaches are based on the paradigm that resistance is always an impediment for practical CGD applications, which needs to be averted using more elaborate CGD approaches. Yet maybe this is not always true. In fact, there may be scenarios in which the ensured evolution of resistance could actually provide a valuable safety mechanism for controlling a driver, while still allowing the approach to effectively achieve its goal. This is because even when resistance alleles will ultimately replace the driver allele, this replacement will be driven by natural selection, which can be much slower than the initial rise of the driver. In our standard model, for instance, it takes >100 generations until resistance alleles reach 50% frequency even when they were already present when the driver was introduced, whereas the driver reaches >99% frequency in <30 generations (Figure 5C). This lag time depends on the net selective advantage of the resistance allele over the driver and will be even longer for lower driver costs (Figure 5D). Short-term population transformations can thus still be attained. Subsequent CGD constructs could be designed and released that specifically target resistance alleles, based on regular surveys of genetic variation at the target site—a process that may be more effective and safe than the use of the CGD system with multiple gRNAs (Esvelt et al. 2014). Our model provides a quantitative framework for predicting the population dynamics of driver and resistance alleles in such scenarios and can shed light on how drive parameters need to be tuned to orchestrate key features of these dynamics, such as the initial rate of spread of the driver, the maximum frequency it can attain in the population, and the time it will take until resistance alleles will replace the driver. Note that our theory also allows the assessment of strategies involving the intentional release of resistance alleles for controlling a CGD approach.

It is clear that there is a need for more detailed modeling of CGD on several fronts. For example, we assumed that resistance is a binary trait, yet resistance levels could depend on number, type, and location of mutations in the target site (Cong et al. 2013). Expression of the driver in the germline, instead of embryos, could alleviate some of the recessive fitness costs of the driver in converted individuals (Deredec et al. 2008). Furthermore, resistance alleles may be created if homology-directed repair inserts the driver construct, but introduces errors that prevent it from driving. We also limited our study to resistance at the target site, even though trans-resistance might be common. Target species may harbor natural variation for Cas9 expression levels or may produce peptides or RNA that silences the CRISPR machinery (Bondy-Denomy et al. 2013). The possibility of all these other resistance mechanisms suggests that our estimates for the probability that resistance evolves are likely to be conservative. Finally, we have not addressed CGD strategies in which an allele is spread with the intention to actively diminish a population. While our model does provide a framework for analyzing such strategies when provided with an explicit model for how the spread of the suppression allele will affect population size, we still know very little about how an actual population would react to such an approach.

Throughout this analysis we remained purposefully agnostic to the potential benefits and risks of this potent biological technology. We do, however, acknowledge the need for extensive discussion among scientists, policy makers, and the public before any release of a CGD into a wild population is ever considered (Esvelt et al. 2014; Oye et al. 2014; Akbari et al. 2015). It is our hope that this work facilitates such discussion.

Acknowledgments

We thank Anthony Long for suggesting the nonhomologous end joining resistance concern and Jackson Champer, Brittny Smith, and members of the Messer and Clark laboratories for helpful discussions. This work was supported by the National Institutes of Health grants R01 GM-064590 to A.G.C. and K99 GM-114714 to R.L.U, and funds from the College of Agriculture and Life Sciences at Cornell University to P.W.M.

Appendix A: Calculation of Establishment Probabilities

Uecker and Hermisson (2011) developed a framework for calculating the establishment probability of a new mutation when its selection coefficient is time dependent. Their theory can be directly applied to our scenario if we consider a resistance allele to be a new mutation with time-dependent selection coefficient se(t). According to equation 16 in Uecker and Hermisson (2011), the establishment probability of such a resistance allele when initially present in a single copy at time t, is then given by

π(t)=21+(N/Ne)tetxse(y)dydx. (A1)

To calculate the improper integral in the denominator of Equation A1 for t < tfix, we can split it into two components, prior to and after fixation of the driver allele

tetxse(y)dydx=ttfixetxse(y)dydx+tfixetxse(y)dydx. (A2)

The integral in the exponent of the second summand still extends back to time t, but we can again partition it into a component prior to tfix and afterward, and then make use of the fact that se(ttfix)=se(tfix) will no longer depend on t in the latter component. This yields

tfixetxse(y)dydx=tfixe[ttfixse(y)dy+tfixxse(y)dy]dx=ettfixse(y)dy×tfixese(tfix)(xtfix)dx=ettfixse(y)dyse(tfix). (A3)

For ttfix, we obtain

tetxse(y)dydx=1se(tfix). (A4)

Both of these integrals break down when se(tfix)=0, which could be the case if the fitness cost of the driver is completely dominant and resistance alleles would therefore not provide any fitness advantage in driver/resistance heterozygotes. Here we assume that se(tfix)>0.

Importantly, all the above integrals are defined over continuous time, whereas in our Wright–Fisher-type model, generations are discrete and values for se(t) are only defined for t0. Before we can estimate establishment probabilities in our discrete model, we first have to map these integrals onto sums over discrete generations. In the following, we will always use i,t,t0 to denote discrete variables (generations), whereas x,y will denote continuous variables.

For mapping se(t) onto continuous time, we will extend it to a piecewise constant function with se(t+x)=s(t) for 0<x<1. We can then associate the integrals inside the exponents in Equation A2, estimated between two discrete generations t < t with sums of the form

g(t,t)=ttse(y)dyi=tt1se(i). (A5)

Discretization of the outer integral in Equation A2 can be achieved by partitioning it into the individual integrals between subsequent generations, yielding

f(t,t)=ttetxse(y)dydx=i=tt1ii+1etxse(y)dydx=i=tt1ii+1e[tise(y)dy+ixse(y)dy]dx=i=tt1eg(t,i)ii+1ese(i)(xi)dx=i=tt1eg(t,i)1ese(i)se(i). (A6)

Both g(t,t) and f(t,t) are now successfully expressed in terms of only the values of se(i) estimated in discrete generations i0. Combining all the above results, we obtain for the establishment probability of a resistance allele arising in generation t in a single copy in our discrete model

π(t)={21+(N/Ne)[f(t,tfix)+eg(t,tfix)/se(tfix)],t<tfix21+(N/Ne)×1/se(tfix),ttfix. (A7)

Appendix B: Frequent NHEJ Regime

Our dynamical model for the allele frequencies described by Equations 1–4 assumed that the rates at which resistance alleles are created by mutation and NHEJ are small enough that we can neglect their contribution to changes in allele frequencies between generations. This may no longer be the case if NHEJ becomes more frequent. Extending our dynamical model to such a scenario is straightforward. In this case, the dynamics described by Equations 1–4 become

E[xd]=xdx0[(1c)ωd0+2c(1δ)ωddcδωdr]+xdxrωdr+xd2ωddω¯(t), (B1)
E[xr]=xrx0ωr0+xrxdωdr+xr2ωrr+xdx0cδωdrω¯(t), (B2)
E[x0]=1E[xd]E[xr], (B3)
ω¯(t)=2xdx0[(1c)ωd0+c(1δ)ωddcδωdr]+2xrx0ωr0+2xrxdωdr+ixi2ωii. (B4)

Footnotes

Communicating editor: J. Hermisson

Literature Cited

  1. Akbari O. S., Matzen K. D., Marshall J. M., Huang H., Ward C. M., et al. , 2013.  A synthetic gene drive system for local, reversible modification and suppression of insect populations. Curr. Biol. 23: 671–677. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Akbari O. S., Chen C. H., Marshall J. M., Huang H., Antoshechkin I., et al. , 2014.  Novel synthetic Medea selfish genetic elements drive population replacement in Drosophila; a theoretical exploration of Medea-dependent population suppression. ACS Synth. Biol. 3: 915–928. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Akbari O. S., Bellen H. J., Bier E., Bullock S. L., Burt A., et al. , 2015.  Safeguarding gene drive experiments in the laboratory. Science 349: 927–929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Altrock P. M., Traulsen A., Reeves R. G., Reed F. A., 2010.  Using underdominance to bi-stably transform local populations. J. Theor. Biol. 267: 62–75. [DOI] [PubMed] [Google Scholar]
  5. Anderson E. M., Haupt A., Schiel J. A., Chou E., Machado H. B., et al. , 2015.  Systematic analysis of CRISPR–Cas9 mismatch tolerance reveals low levels of off-target activity. J. Biotechnol. 211: 56–65. [DOI] [PubMed] [Google Scholar]
  6. Berryman A. A., 2002.  Population Cycles: The Case for Trophic Interactions. Oxford University Press, New York. [Google Scholar]
  7. Bondy-Denomy J., Pawluk A., Maxwell K. L., Davidson A. R., 2013.  Bacteriophage genes that inactivate the CRISPR/Cas bacterial immune system. Nature 493: 429–432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Bull J. J., 2016.  Lethal gene drive selects inbreeding. bioRxiv DOI: 10.1101/046847. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Burt A., 2003.  Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc. Biol. Sci. 270: 921–928. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Burt A., 2014.  Heritable strategies for controlling insect vectors of disease. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369: 20130432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Champer J., Buchman A., Akbari O. S., 2016.  Cheating evolution: engineering gene drives to manipulate the fate of wild populations. Nat. Rev. Genet. 17: 146–159. [DOI] [PubMed] [Google Scholar]
  12. Chen C. H., Huang H., Ward C. M., Su J. T., Schaeffer L. V., et al. , 2007.  A synthetic maternal-effect selfish genetic element drives population replacement in Drosophila. Science 316: 597–600. [DOI] [PubMed] [Google Scholar]
  13. Cong L., Ran F. A., Cox D., Lin S., Barretto R., et al. , 2013.  Multiplex genome engineering using CRISPR/Cas systems. Science 339: 819–823. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Curtis C. F., 1968.  Possible use of translocations to fix desirable genes in insect pest populations. Nature 218: 368–369. [DOI] [PubMed] [Google Scholar]
  15. Deredec A., Burt A., Godfray H. C. J., 2008.  The population genetics of using homing endonuclease genes in vector and pest management. Genetics 179: 2013–2026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Deredec A., Godfray H. C. J., Burt A., 2011.  Requirements for effective malaria control with homing endonuclease genes. Proc. Natl. Acad. Sci. USA 108: E874–E880. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Drury D. W., Siniard D. J., Zentner G. E., Wade M. J., 2016.  CRISPR/Cas9 gene drives in genetically variable and non-randomly mating wild populations. bioRxiv DOI: 10.1101/071670. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Esvelt K. M., Smidler A. L., Catteruccia F., Church G. M., 2014.  Concerning RNA-guided gene drives for the alteration of wild populations. eLife 3: e03401. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Foster G. G., Whitten M. J., Prout T., Gill R., 1972.  Chromosome rearrangements for the control of insect pests. Science 176: 875–880. [DOI] [PubMed] [Google Scholar]
  20. Fu Y., Foden J. A., Khayter C., Maeder M. L., Reyon D., et al. , 2013.  High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 31: 822–826. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Gantz V. M., Bier E., 2015.  The mutagenic chain reaction: a method for converting heterozygous to homozygous mutations. Science 348: 442–444. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Gantz V. M., Jasinskiene N., Tatarenkova O., Fazekas A., Macias V. M., et al. , 2015.  Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proc. Natl. Acad. Sci. USA 112: E6736–E6743. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Gould F., 2008.  Broadening the application of evolutionarily based genetic pest management. Evolution 62: 500–510. [DOI] [PubMed] [Google Scholar]
  24. Haag-Liautard C., Dorris M., Maside X., Macaskill S., Halligan D. L., et al. , 2007.  Direct estimation of per nucleotide and genomic deleterious mutation rates in Drosophila. Nature 445: 82–85. [DOI] [PubMed] [Google Scholar]
  25. Hammond A., Galizi R., Kyrou K., Simoni A., Siniscalchi C., et al. , 2016.  A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae. Nat. Biotechnol. 34: 78–83. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Hartl D. L., Clark A. G., 1997.  Principles of Population Genetics, Vol. 116 Sinauer Associates, Sunderland, MA. [Google Scholar]
  27. Hermisson J., Pennings P. S., 2005.  Soft sweeps molecular population genetics of adaptation from standing genetic variation. Genetics 169: 2335–2352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Karasov T., Messer P. W., Petrov D. A., 2010.  Evidence that adaptation in Drosophila is not limited by mutation at single sites. PLoS Genet. 6: 1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Keightley P. D., Trivedi U., Thomson M., Oliver F., Kumar S., et al. , 2009.  Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines. Genome Res. 19: 1195–1201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Keightley P. D., Pinharanda A., Ness R. W., Simpson F., Dasmahapatra K. K., et al. , 2014.  Estimation of the spontaneous mutation rate in Heliconius melpomene. Mol. Biol. Evol. 32: 239–243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Kimura M., 1962.  On the probability of fixation of mutant genes in a population. Genetics 47: 713–719. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Lambrechts L., Ferguson N. M., Harris E., Holmes E. C., McGraw E. A., et al. , 2015.  Assessing the epidemiological effect of wolbachia for dengue control. Lancet Infect. Dis. 15: 862–866. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Leffler E. M., Bullaughey K., Matute D. R., Meyer W. K., Segurel L., et al. , 2012.  Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol. 10: e1001388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Messer P. W., Petrov D. A., 2013.  Population genomics of rapid adaptation by soft selective sweeps. Trends Ecol. Evol. 28: 659–669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Noble C., Min J., Olejarz J., Buchthal J., Chavez A., et al. , 2016a Daisy-chain gene drives for the alteration of local populations. bioRxiv DOI: 10.1101/057307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Noble C., Olejarz J., Esvelt K., Church G., Nowak M., 2016b Evolutionary dynamics of CRISPR gene drives. bioRxiv DOI: 10.1101/057281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. North A., Burt A., Godfray H. C., Buckley Y., 2013.  Modelling the spatial spread of a homing endonuclease gene in a mosquito population. J. Appl. Ecol. 50: 1216–1225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Orr H. A., Unckless R. L., 2008.  Population extinction and the genetics of adaptation. Am. Nat. 172: 160–169. [DOI] [PubMed] [Google Scholar]
  39. Otto S. P., Whitlock M. C., 1997.  The probability of fixation in populations of changing size. Genetics 146: 723–733. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Oye K. A., Esvelt K., Appleton E., Catteruccia F., Church G., et al. , 2014.  Regulating gene drives. Science 345: 626–628. [DOI] [PubMed] [Google Scholar]
  41. Reeves R. G., Bryk J., Altrock P. M., Denton J. A., Reed F. A., 2014.  First steps towards underdominant genetic transformation of insect populations. PLoS One 9: e97557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Schrider D. R., Houle D., Lynch M., Hahn M. W., 2013.  Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster. Genetics 194: 937–954. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Sinkins S. P., Gould F., 2006.  Gene drive systems for insect disease vectors. Nat. Rev. Genet. 7: 427–435. [DOI] [PubMed] [Google Scholar]
  44. Uecker H., Hermisson J., 2011.  On the fixation process of a beneficial mutation in a variable environment. Genetics 188: 915–930. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Unckless R. L., Messer P. W., Connallon T., Clark A. G., 2015.  Modeling the manipulation of natural populations by the mutagenic chain reaction. Genetics 201: 425–431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Wang Z., Pan Q., Gendron P., Zhu W., Guo F., et al. , 2016.  CRISPR/Cas9-derived mutations both inhibit HIV-1 replication and accelerate viral escape. Cell Rep. 15: 481–489. [DOI] [PubMed] [Google Scholar]
  47. Wilson B. A., Petrov D. A., Messer P. W., 2014.  Soft selective sweeps in complex demographic scenarios. Genetics 198: 669–684. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Zhang X.-H., Tee L. Y., Wang X.-G., Huang Q.-S., Yang S.-H., 2015.  Off-target effects in CRISPR/Cas9-mediated genome engineering. Mol. Ther. Nucleic Acids 4: e264. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Our theory for modeling CGD dynamics and calculating resistance probabilities is implemented in a C++ command line program. This program takes as input the parameter values of the given model: xd(0),c,μ,δ,sd0,sdr,sr0,srr,Ne,N. For the given parameters, it then calculates the driver allele frequency trajectory xd(t) under a deterministic model, specified by Equations 1–4 in the absence of resistance; the rate uμ(t) at which resistance alleles are expected to arise by de novo mutation; the rate uδ(t) at which they are expected to arise by NHEJ; the effective selection coefficient se(t) of resistance alleles; and the establishment probability π(t) of a resistance allele arising in a single copy in generation t. Results are provided for each generation 0ttfix. The program also calculates the individual resistance probabilities Ptot, PSGV, Pμ, and Pδ. Executables, source code, and documentation for this program are available at http://messerlab.org/resources/.


Articles from Genetics are provided here courtesy of Oxford University Press

RESOURCES