Molecular Therapy. 2012 Mar 27;20(6):1177–1186. doi: 10.1038/mt.2012.47

Integration Frequency and Intermolecular Recombination of rAAV Vectors in Non-human Primate Skeletal Muscle and Liver

Ali Nowrouzi 1, Magalie Penaud-Budloo 2, Christine Kaeppel 1, Uwe Appelt 1, Caroline Le Guiner 2,3, Philippe Moullier 2,3,4, Christof von Kalle 1, Richard O Snyder 2,4,5, Manfred Schmidt 1,*
PMCID: PMC3369298  PMID: 22453768

Abstract

The comprehensive characterization of recombinant adeno-associated viral (rAAV) integration frequency and persistence for assessing rAAV vector biosafety in gene therapy is severely limited due to the predominance of episomal rAAV vector genomes maintained in vivo. Introducing rAAV insertional standards (rAIS), we show that linear amplification-mediated (LAM)-PCR and deep sequencing can be used for validated measurement of rAAV integration frequencies. Integration of rAAV2/1 or rAAV2/8, following intramuscular (IM) or regional intravenous (RI) administration of therapeutically relevant vector doses in nine adult non-human primates (NHP), occurs at low frequency between 10⁻⁴ and 10⁻⁵ both in NHP liver and muscle, but with no preference for specific genomic loci. High resolution mapping of inverted terminal repeat (ITR) breakpoints in concatemeric and integrated vector genomes reveals distinct vector recombination hotspots, including large deletions of up to 3 kb. Moreover, retrieval of integrated rAAV genomes indicated an approximately threefold increase in liver compared to muscle. This molecular analysis of rAAV persistence in NHP provides a promising basis for a reliable genotoxic risk assessment of rAAV in clinical trials.

Introduction

Insertional mutagenesis by integrating retroviruses has been associated with subtle and severe adverse events in gene therapy trials.1,2,3 The genomic locus of the provirus has proven to be at least partially responsible for the deregulation of neighboring genes, increasing the risk for cancer.4 Although recombinant adeno-associated virus (rAAV) vectors are predominantly maintained episomally, detection of rAAV proviruses encoding the human β-glucuronidase gene in murine hepatocellular carcinomas suggested that insertional mutagenesis by rAAV vectors could lead to malignant transformation.5,6 AAV-induced transformation in mice has not been reproduced with different rAAV vectors;7,8 however, in light of the increasing number of rAAV-based clinical trials, these findings underscore the need for highly sensitive detection techniques, which provide essential and precise information on rAAV persistence, integration frequency, and its genetic safety for clinical use.6

The biology of rAAV includes the intra- and intermolecular recombination of vector genomes resulting in the formation of monomeric circles or concatemeric molecules in head-to-tail (H-T), head-to-head (H-H) or tail-to-tail (T-T) orientation9,10,11,12,13 that are able to persist as episomes.14 However, rare integration events have been detected in human primary cells and in mice with preferences for active genes, CpG-islands8,15,16 and chromosomal breakage sites.17,18 The rare integration events, together with strong secondary structures present in the inverted terminal repeats (ITR), have severely hampered the detection of rAAV proviruses in clinical and preclinical samples. Thus, genome-wide mapping of rAAV integration sites from in vivo transduced tissues has been limited due to several factors.6 First, rAAV vector genomes are maintained in non-human primates (NHP) predominantly as monomeric and concatemeric episomes.14,19,20 These extrachromosomal molecules can compete for primer binding and detection of rare proviruses by PCR-based methods. Second, since rearrangements occur during concatemerization and/or integration,9 the molecular and bioinformatic analyses of the rAAV forms are crucial. And finally, given the heterogeneity and the biodistribution of the rAAV genomes in quiescent tissue, proviral sequences can be restricted to only a few cells in vivo unless they are derived from a cell cluster or clonal tumor.6,21 Thus, for comprehensive preclinical biosafety assessment and clinical monitoring of rAAV vectors, highly sensitive and validated technologies are needed.

Molecular insights into vector persistence, stability, and integration frequency obtained from large animal models are useful for increasing the safety of patients enrolled in rAAV-mediated gene therapy trials. In particular, rAAV integration frequency has rarely been assessed in clinically relevant large animal models such as the NHP. We and others have shown that a precise and reliable estimation of rAAV integration frequencies in different organs of NHP transduced at clinically relevant doses has so far been limited either by the complete failure to detect integration events or by the small number of retrieved integration sites (IS).14,22,23 Thus, we argued that a defined recombinant AAV insertional standard (rAIS) needs to be established for defining the sensitivity and specificity limits for rAAV IS retrieval in NHP, which is important for a proper genotoxic risk assessment of initiated and upcoming rAAV clinical trials.

Here, we utilized linear amplification-mediated (LAM)-PCR and next generation sequencing24,25,26 to dissect the persistence of rAAV genomes in NHP up to 34 months postinjection with respect to their integration frequency and molecular configuration. Molecular analyses of >44,000 vector-vector concatemer and vector-cellular junction breakpoints revealed recombination hotspots within the ITR and the vector genome in vivo.

The detection of concatemerized vector configurations containing up to 3 kb deletions points to a subpopulation of potentially non-functional vector units, possibly expressing truncated transgene products. We determined that integration of rAAV in primate liver or skeletal muscle following intramuscular (IM) or regional intravenous (RI) vector administration occurs at a frequency of <10⁻⁵ with no preferences for host cell genome hotspots or transcribed regions. Our results indicate that the genotoxic potential of rAAV for insertional mutagenesis by deregulating potential microRNA clusters, oncogenes or tumor suppressor genes is substantially reduced.

Results

Generation of a clonal standard to quantify rAAV integration frequencies

For reliable determination of rAAV integration frequency in transduced tissues by linear amplification-mediated PCR (LAM-PCR) and deep sequencing, measurements of the ratio of episomal versus integrated rAAV genomes are essential. We hypothesized that by spiking engineered integration standards into preclinical samples known to harbor predominantly episomal rAAV vector genomes, a copy number-dependent retrieval frequency of individual rAAV IS and concatemer sequences can be measured. Therefore, we generated cell clones harboring integrated rAAV genomes by transducing Cos cells (an African green monkey cell line) with a rAAV-RSV-tGFP-IRES-Puro-wpre-ISceI vector. These cell lines are applicable for analyzing the copy number-dependent retrieval frequency of individual IS following LAM-PCR and 454 pyrosequencing.

rAAV vector-cellular junctions from 10 transduced cell clones were characterized by 454 pyrosequencing of LAM-PCR amplicons. We chose two clones, Cos1 and Cos7, each containing a rAAV provirus that has 57 and 110 bp of the ITR deleted, respectively (Figure 1a,b). According to absolute quantitative PCR measurements of the single vector-cellular junction identified by LAM-PCR and 454 sequencing (Supplementary Figure S1), we calculated that Cos7 contains 3.67 rAAV copies/ng DNA. However, when quantifying vector copy number by targeting the RSV promoter of the rAAV genome, we found a strikingly higher copy number of 200 copies/ng, most likely indicating that the rAAV genome is integrated as a tandem concatemer. The characterized single-cell clone Cos7 was used as a rAIS applicable for dissecting the rAAV integration frequency in vivo by LAM-PCR and deep sequencing.
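The gap between the two qPCR measurements can be turned into a rough estimate of how many vector units lie behind the single junction. This is a minimal sketch, assuming that both assays amplify with comparable efficiency and that every RSV copy belongs to the one integrated concatemer; neither assumption is verified in the text, and the variable names are illustrative.

```python
# Copy numbers per ng of Cos7 DNA, as reported in the text.
junction_copies_per_ng = 3.67   # integrant-specific qPCR (vector-cellular junction)
rsv_copies_per_ng = 200.0       # vector-specific qPCR (RSV promoter)

# Assumption: each integrated concatemer contributes exactly one junction of
# this type, so the ratio approximates RSV copies per integration event.
units_per_integrant = rsv_copies_per_ng / junction_copies_per_ng
print(f"about {units_per_integrant:.1f} RSV copies per vector-cellular junction")
```

The ratio is far above one, which is what would be expected if many vector units were joined in tandem at a single genomic site.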

Figure 1.


Recombinant AAV insertional standards (rAIS) for LAM-PCR coupled with 454 sequencing. (a) Cell clones harboring integrated rAAV genomes were generated after transduction of a Cos cell line with the rAAV-RSV-tGFP-IRES-Puro-wpre-ISceI vector and puromycin selection. Two clones were selected that each harbor a single rAAV integration site (IS) (dark gray box) with a partial ITR sequence (gray arrow); these truncated ITRs mimic the majority of the rAAV forms found following transduction in vivo. (b) LAM-PCR was performed from 100 ng of DNA and the amplicon size was 272 bp and 186 bp for the Cos1 and Cos7 clones, respectively. The ~30 bp contributed by the linker ligation required for subsequent amplification by LAM-PCR (see methods) must be added to the sizes of the Cos1 and Cos7 amplicons depicted in the agarose gel. (c) The efficacy of amplifying and sequencing the vector–host genome junction through an ITR with varying length was measured by the relative retrieval frequency of the IS following 454 pyrosequencing. The primer used for linear amplification is located in the RSV promoter sequence (white arrow in a). AAV, adeno-associated virus; ITR, inverted terminal repeat; LAM-PCR, linear amplification-mediated PCR.

We first analyzed whether the strong secondary structure present within the ITR influences the efficiency of retrieving rAAV ITR sequences of different lengths. In current protocols using Sanger sequencing platforms, identification of vector-vector or vector-cellular junctions is strongly hampered by the secondary structure of the ITR.14,27 Among the clones generated, we were able to detect full ITR sequences integrated into the Cos genome; thus, the assay is suitable for amplifying full ITR structures. The efficiency of amplifying and sequencing partial ITRs with different deletions (and consequently different amplicon lengths) was calculated by comparing the relative sequence count of retrieved rAAV IS with varying degrees of deletion in the ITR. To this end, the percentage of sequence reads corresponding to the actual IS in Cos1 and Cos7 was calculated based on the total sequence count of the samples (Figure 1c). The results showed that the efficiency of retrieving ITRs of different lengths and with varying degrees of secondary structure differs but is not substantially limited when LAM-PCR and pyrosequencing technologies are applied together (Figure 1c). The sensitivity of LAM-PCR and 454 sequencing to detect the rAIS was evaluated on limiting dilutions of DNA obtained from rAIS Cos7. LAM-PCR and 454 sequencing detected the rAIS down to 0.11 vector copies based on quantification of integrant-specific copy numbers (Supplementary Table S1).

Molecular characterization of persistent rAAV configurations and detection of rare rAAV integration events in NHP muscle and liver

It has been previously shown that rAAV genomes reside predominantly as episomal monomeric and concatemeric circles in primate muscle for up to 22 months after IM injection.14 Whereas H-T concatemeric fragments were predominantly retrieved, Sanger sequencing of rAAV LAM-PCR amplicons did not reveal any integrated rAAV genomes in skeletal muscle.14 To gain more comprehensive molecular insight into the integration frequency and molecular persistence of rAAV genomes after long-term follow-up, we characterized rAAV concatemers and rare vector-cellular junctions by LAM-PCR and 454 pyrosequencing in liver and skeletal muscle of nine NHP injected IM or RI as previously described.28 The vector used to transduce the NHP encodes the LEA29Y (belatacept) molecule under the control of the constitutive RSV promoter and the WPRE sequence. The rAAV-RSV-LEA29Y-WPRE-pA-injected animals exhibited the typical transgene expression profile, characterized by an increase in the first 3 months, followed by a slower decrease and a steady state achieved from 6 months onwards.28 Vector copy numbers ranged from 0.3 to 2.8 and 0.6 to 36.1 vector genome copies per diploid genome in the liver and skeletal muscle of treated animals, respectively (Table 1).

Table 1. Overview of analyzed animals.


A broad distribution of rAAV2/1 and rAAV2/8 in several organs including the liver following IM or RI injection in NHP has been observed.28 This raises the question whether the integration frequency of rAAV varies in different organs under the chosen administration routes. In terms of genotoxic safety, insertional events in the liver are of concern after reports were published of hepatocellular carcinoma being linked to integrated rAAV genomes in mice.5,6 A precise and reliable estimation of rAAV integration frequencies in different organs of NHP transduced at clinically relevant doses needs to be evaluated for the risk assessment of initiated and upcoming rAAV clinical trials.

LAM-PCR was repeatedly performed on 0.5–1 µg of genomic DNA obtained from the liver and muscle of nine NHP and the resulting LAM-PCR amplicons (Supplementary Figure S2) were sequenced by 454 pyrosequencing. From DNA derived from liver and skeletal muscle, we extracted 44,630 LAM-PCR amplicon sequence reads matching our validation criteria (see methods) for being vector-vector concatemeric fragments or vector-cellular junctions. Vector-vector concatemers in H-H or H-T configuration composed 76% of the analyzed reads while only 4% of the sequence reads were identified as rAAV proviruses mapping to the rhesus genome (see below). Of the remaining 20%, 15% of the reads were predominantly too short for accurate mapping to the genome or vector while 5% of the reads resembled ITR-plasmid junctions probably generated by recombination of vector genomes with plasmid backbone sequence that could be packaged in virions during rAAV vector production29 (Supplementary Figure S3).
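As a quick consistency check on this read composition, the counts and percentages quoted in the text can be compared directly. This is a minimal sketch; the variable names are illustrative, and the per-category counts derived from the rounded percentages are back-calculated approximations, not values reported by the authors.

```python
total_reads = 44_630          # validated LAM-PCR reads (from the text)
concatemer_reads = 33_786     # validated vector-vector reads (from the text)

concatemer_pct = 100 * concatemer_reads / total_reads
print(f"concatemers: {concatemer_pct:.1f}% of validated reads")

# The remaining categories are given only as rounded percentages, so the
# read counts below are rough back-calculations.
reported_pct = {"provirus": 4, "too short to map": 15, "ITR-plasmid": 5}
approx_counts = {name: round(total_reads * pct / 100)
                 for name, pct in reported_pct.items()}
print(approx_counts)
```

The concatemer share computed from the two reported counts rounds to the 76% quoted in the text.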

A comparative analysis of ITR breakpoints present in the detected integrated rAAV genomes (see below) revealed that over 70% of the breakpoints in the rAAV ITR are clustered in the A′-B region while the rest are dispersed throughout the ITR (Figure 2a). A total of 33,786 sequences matching the validity criteria as being concatemeric vector-vector fragments (sequences that continue with a rAAV-specific fragment after the break) were obtained. These sequences were used to devise a high resolution map of ITR breakpoints in NHP liver and muscle by calculating the relative sequence count for each basepair starting from the first base of the LAM-PCR primer until the end of the 5′ITR (Figure 2b). Sequence coverage along the 5′ITR revealed intact ITRs in 0.2% of all concatemeric genome fragments analyzed. Interestingly, a significant (more than threefold) decrease in ITR sequence coverage was detectable in the B palindrome in seven out of nine NHP analyzed (Figure 2b). This reveals that preferential sites within the ITR are involved in inter-vector recombination and concatemerization of rAAV after transduction of primate tissues.
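The per-base coverage described above can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the function names, the window-based drop criterion, and the toy breakpoint positions are all assumptions made for the example.

```python
from collections import Counter

def coverage_profile(break_positions, length):
    """Fraction of reads still matching the vector at each base 1..length.

    break_positions holds, for each read, the last base (counted from the
    first primer base) that still aligns before the breakpoint.
    """
    ends = Counter(break_positions)
    n = len(break_positions)
    remaining = n
    profile = []
    for pos in range(1, length + 1):
        profile.append(remaining / n)
        remaining -= ends.get(pos, 0)  # reads ending at pos do not cover pos + 1
    return profile

def steep_drops(profile, window, fold=3.0):
    """1-based start positions where coverage falls more than `fold` within `window` bases."""
    hits = []
    for i in range(len(profile) - window):
        after = profile[i + window]
        if after > 0 and profile[i] / after > fold:
            hits.append(i + 1)
    return hits

# Hypothetical toy data: most reads break at base 10, one read spans all 12 bases.
toy_breaks = [3, 10, 10, 10, 10, 10, 10, 10, 10, 12]
toy_profile = coverage_profile(toy_breaks, length=12)
print(steep_drops(toy_profile, window=2))
print(f"reads covering the full region: {toy_profile[-1]:.0%}")
```

In this representation, the last entry of the profile corresponds to the fraction of reads with an intact region, analogous to the 0.2% intact ITRs reported in the text.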

Figure 2.


High resolution mapping of ITR breakpoints in rAAV concatemeric fragments and proviruses in vivo reveals hotspots of recombination. (a) Percentage of rAAV proviral insertions with ITR breakpoints at particular regions within the ITR in skeletal muscle and liver. (b) Deep sequence analysis was used to detect ITR breakpoints of rAAV concatemers in skeletal muscle and liver of nine NHP. To this end, the sequence coverage was calculated for each basepair starting from the primer that was used to prepare the samples for sequencing, through the 5′ITR until reaching the first break. Only sequences that continue with a rAAV-specific fragment after the break were taken into account. Preferentially, breakpoints were detected in the B region of the ITR, indicated by a significant decrease in sequence coverage. (c) H-H or H-T concatemers were distinguished by analyzing the orientation of each side of the concatemeric amplicons. Red arrows indicate the location of recombination hotspots which result in up to 3 kb deletions of single concatemerized vector units (see e). (d) The relative proportion of concatemers was calculated by measuring the total sequence counts of retrieved H-H and H-T concatemers obtained both in liver and skeletal muscle of all analyzed NHP. In cases where only ITR-ITR fragments were sequenced, the H-T or H-H configuration cannot be determined; these are designated as u.d. (e) To identify the configuration of the rAAV concatemers, 454-sequence reads were identified that discontinuously align in exactly two parts to the vector sequence. The resulting concatemeric sequences can be divided by a junction representing the recombination spot in the 5′ITR (x-axis) and the whole vector sequence (y-axis). The x-axis represents the coordinates of the 5′ITR starting with the position of the sequencing primer (462 bp) and coordinates on the y-axis indicate the exact position of breakpoints on the whole vector sequence (1–4,173 bp). The two vector coordinates describing the transitions of the two-part alignments were drawn as red (head-to-tail) and blue (head-to-head) spots. Each dot in the graphs represents the exact coordinates of breakpoints according to the 5′ITR junction (x-axis) and the whole vector (y-axis), which can be in the ITR itself or at other locations of the vector. ITR, inverted terminal repeat; NHP, non-human primates; rAAV, recombinant adeno-associated virus; u.d., undetermined.

The concatemer sequences were further divided based on the orientation (see methods) of the vector-vector junctions (Figure 2c). In total, 5,844 and 3,399 unique concatemeric configurations were identified in skeletal muscle and liver, respectively. The relative retrieval frequency of rAAV H-T and H-H concatemers revealed an almost equal persistence of H-T and H-H concatemers in skeletal muscle, whereas in liver the retrieval of H-T concatemers was more than threefold higher than that of H-H concatemers (Figure 2d). We were not able to quantify the proportion of T-T concatemers because amplification was always primed from the RSV promoter, which is located in the 5′ portion of the vector.

For a more precise analysis of concatemers, hybrid concatemeric reads containing two vector-vector fragments separated by a 5′-ITR breakpoint were mapped to the whole vector genome, identifying recombination hotspots in the ITR and within the body of the vector genome in H-T and H-H configuration (Figure 2e). To this end, 454-sequence reads that discontinuously align in exactly two parts to the vector sequence were identified, and the exact position of the intermolecular recombination was resolved by identifying the breakpoint within the 5′ITR region and the flanking sequence mapping within the body of the vector genome or ITR vector boundaries (Figure 2e). The distribution of recombination hotspots within the vector was similar between concatemers retrieved from skeletal muscle and liver. Two recombination hotspots with the 5′ITR of H-T concatemers are located at positions 700 and 1,700 bp within the vector body, revealing large deletions (up to 3 kb) within single vector units which have concatemerized (Figure 2c,e).
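One way to picture the two-part alignment analysis is as a small record per read holding both aligned segments, from which an orientation and a pair of breakpoint coordinates are derived. The sketch below is hypothetical: the `Segment` type, the strand-based rule (same strand taken as head-to-tail, opposite strands as head-to-head), and the example coordinates are assumptions for illustration, since the orientation criteria actually used are only referenced in the methods.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One aligned part of a read, in vector coordinates (1-based)."""
    start: int
    end: int
    strand: str  # '+' or '-'

def classify_junction(itr_side: Segment, partner: Segment) -> str:
    # Assumed rule: if the partner segment continues on the same strand as the
    # primer-proximal segment, the units are joined head-to-tail; otherwise
    # they are joined head-to-head.
    return "H-T" if itr_side.strand == partner.strand else "H-H"

def breakpoint_coordinates(itr_side: Segment, partner: Segment) -> tuple:
    # x: where the primer-proximal segment stops inside the 5'ITR region
    # y: where the partner segment begins on the whole vector
    return (itr_side.end, partner.start)

# Hypothetical read: primed at position 462, breaks at 120, continues at 1,700.
first = Segment(start=462, end=120, strand="-")
second = Segment(start=1700, end=1500, strand="-")
print(classify_junction(first, second), breakpoint_coordinates(first, second))
```

Plotting the (x, y) pairs for all reads, colored by the classification, would give a scatter of the kind described for Figure 2e.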

LAM-PCR analyses performed on 0.5 and 1 µg genomic DNA could detect 178 unique rAAV insertional events in the liver samples (Table 2). In muscle, a total of 153 unique rAAV-IS were retrieved from 1 µg of each NHP analyzed. From these, a total of 129 IS from liver and 119 from skeletal muscle were assigned to unique loci of the rhesus macaque genome (Supplementary Table S2). The total number of IS detected for each individual animal is given in Table 2. Interestingly, in our NHP model, preferential hotspots of rAAV-IS or preferences for integration into RefSeq genes were not observed. The annotation of genes in the rhesus genome is incomplete; however, a comparative IS mapping to the human reference genome (Supplementary Figure S4) did not reveal any significant preferences for integration in RefSeq genes, oncogenes or CpG islands. Mapping of the IS to the rhesus genome revealed that only 4 and 2% of all IS retrieved from muscle and liver, respectively, were located inside RefSeq genes. Mapping the IS to homologous regions in the human genome, in which almost all RefSeq genes have been annotated, indicated that the proportion of IS inside RefSeq genes increases to 26% for both muscle and liver; however, compared to computer-simulated randomly selected IS, the frequency of integration into RefSeq genes seems to be even lower than expected by chance (Supplementary Figure S4). The frequency of insertions in the vicinity of CpG islands also revealed no significant selection for these regions (Supplementary Figure S4). Thus, at the total number of IS analyzed in our model, gene coding regions do not appear to be favored by rAAV-IS, in contrast to what was described previously in rodents.8,15
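The comparison against computer-simulated random IS can be illustrated with a small Monte Carlo sketch. Everything about the genome here is hypothetical: a single toy chromosome, two made-up gene intervals covering 30% of it, and illustrative function names. Only the number of mapped IS (129 + 119) and the observed 26% come from the text, and the real analysis would use the annotated human or rhesus genome.

```python
import random

def fraction_in_genes(sites, genes):
    """Share of sites falling inside any (start, end) half-open gene interval."""
    hits = sum(any(start <= pos < end for start, end in genes) for pos in sites)
    return hits / len(sites)

def simulated_mean_fraction(n_sites, genome_length, genes, n_rounds=200, seed=0):
    """Mean in-gene fraction over repeated sets of uniformly random sites."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_rounds):
        sites = [rng.randrange(genome_length) for _ in range(n_sites)]
        fractions.append(fraction_in_genes(sites, genes))
    return sum(fractions) / n_rounds

# Toy genome (hypothetical): 1 Mb with genes covering 30% of the sequence.
toy_genes = [(0, 100_000), (400_000, 600_000)]
expected = simulated_mean_fraction(n_sites=129 + 119, genome_length=1_000_000,
                                   genes=toy_genes)
observed = 0.26  # fraction of IS in RefSeq genes after mapping to human (from the text)
print(f"random expectation {expected:.2f} vs observed {observed:.2f}")
```

With a real annotation, an observed fraction at or below the simulated expectation would support the conclusion that genes are not preferred targets.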

Table 2. Retrieved unique rAAV insertion sites.


Relative quantification of rAAV concatemers and proviruses in liver and muscle of NHP

To calculate the copy number of individual rAAV proviruses by LAM-PCR and pyrosequencing, we utilized the rAIS cell line and assessed the frequency of retrieving the established rAIS in an excess of episomal rAAV genomes. DNA from the Cos7 cell clone was added in limiting dilutions into DNA extracted from the skeletal muscle of two primates (Mac1 and Mac7) that had received rAAV2/1 or rAAV2/8 vector, respectively. The RSV primers used for LAM-PCR anneal to the integrated sequence of the Cos7 clone as well as to the sequence of the vector that was injected in NHP. The ratio of the recovered rAIS to the total recovered LAM-PCR amplicons showed a linear relationship between relative sequence counts and decreasing copy numbers of the rAIS, indicating that the retrieval of individual IS is copy number dependent (Figure 3a). The retrieval frequency of rAIS Cos7 DNA mixed with genomic DNA obtained from muscle of NHP containing high vector copy number (Mac1; 31.4 vector copies/diploid genome) compared to the DNA mix with low vector copy number (Mac7; 1.3 vector copies/diploid genome) showed that the retrieval of rAIS is less efficient in the Mac1 mixture than in the Mac7 mixture (Figure 3b). This analysis indicates that rAIS can be detected in an up to 2,500-fold excess of episomal monomeric and/or concatemeric vector genomes (Figure 3b). However, higher numbers of episomes hamper rAIS retrieval. In addition, the detection limit for the rAIS rose dramatically, from 0.11 vector copies in the absence of episomes to 110 vector copies in the presence of episomes. Although we were able to detect 110 rAIS vector copies in ~25,000-fold limiting dilution compared to vector copy numbers present in NHP muscle samples, this was not reproducible in all cases, pointing to the fact that detection of rare rAAV IS is probably stochastic based on the total number of episomes present in the tissue.
Generally, the retrieval frequency of rAAV insertional events in the presence of concatemeric and episomal rAAV molecules from 1 µg of total DNA was between 10⁻⁴ and 10⁻⁵ per diploid genome regardless of the tissue analyzed or the serotype of rAAV used (Figure 3c; Table 3). However, relative to the vector copy number in the individual animals, the IS frequency is two- to 50-fold higher in liver than in skeletal muscle (Table 3). Accordingly, calculating the retrieval of integrated vector genomes and concatemers based on their relative sequence count, we observed a threefold higher retrieval frequency of rAAV IS in the liver, most likely influenced by the presence of a higher degree of competing episomes in the muscle (Figure 3d).

Figure 3.


Retrieval of rAAV insertion sites by LAM-PCR and deep sequencing is copy number dependent. (a) For analyzing the sensitivity of capturing rAAV insertion sites mimicking the in vivo context, genomic DNA obtained from transduced primate muscle (harboring predominantly episomes) was mixed with DNA derived from the Cos7 rAAV insertion standard and LAM-PCR was performed followed by deep sequencing. To calculate the relative retrieval frequency of the rAIS, we measured the sequence count of the rAIS in each dilution. The retrieval frequency of all integration sites was calculated from the total amount of break-evident reads, revealing a copy number dependent retrieval of rAAV insertion sites. (b) The retrieval frequency of rAIS (Cos7) spiked into genomic DNA was compared in samples obtained from muscle of NHP containing high (Mac1; 31.4 vector copies/diploid genome) and low vector copy numbers (Mac7; 1.3 vector copies/diploid genome). (c) The quantitative relation of rAAV insertion sites compared to concatemeric vector configurations predominantly derived from episomal vector genomes was calculated for each individual animal analyzed for muscle and liver, respectively. To this end, all LAM-PCR amplicons which passed our validation criteria for being concatemeric or potential insertion sites were divided into groups representing either rAAV insertion sites or concatemers, and their relative sequence counts were compared. (d) Quantitative measurement of captured rAAV proviruses in NHP liver and muscle following IM and RI vector administration. IM, intramuscular; IS, integration sites; LAM-PCR, linear amplification-mediated PCR; NHP, non-human primates; rAAV, recombinant adeno-associated virus; rAIS, rAAV insertional standards; RI, regional intravenous.

Table 3. Integration frequency of rAAV in 1 µg of DNA from skeletal muscle and liver of NHP.


Discussion

rAAV vectors are being widely used for a broad range of gene therapeutic applications. The first cases of rAAV-induced insertional mutagenesis in mice have been reported, and safety concerns have been raised.5,6 Current knowledge about rAAV integration in vivo mostly relies on experiments in neonatal mice receiving high doses of rAAV vectors containing a bacterial origin of replication used for bacterial subcloning of potential vector-cellular junctions before sequencing.15,27,30 Recently, rAAV insertion sites from tumors and healthy tissues were analyzed by LM-PCR and pyrosequencing in 80 adult mice 18 months postinjection, revealing no clear molecular correlation between tumor initiation and rAAV insertions.8 Despite the gain of general knowledge concerning the pattern and spectrum of rAAV integration in vivo, most of these studies are biased for selection of IS, and the applicability of these data for predicting the situation in NHP and humans is not clear. Although the immune response against the therapeutic protein or against the viral capsid is now of great concern, other aspects addressing the interaction between the vector and the host should be systematically considered in large animal models. As technology improves, one could expect that more sensitive methods can better detect rare integration events. We have previously shown that LAM-PCR coupled with Sanger sequencing failed to retrieve rAAV integrations from NHP skeletal muscle;14 however, LAM-PCR coupled with high-throughput sequencing can be used for the detection of rAAV proviruses in studies using NHP, though the number of retrieved IS was limited.22 By defining the sensitivity and specificity limits for IS retrieval in NHP, we shed light on controversies concerning the integration frequency of rAAV gene therapy vectors and lay a basis for future biosafety analysis in clinical rAAV gene therapy studies.

We demonstrate that retrieval of rare rAAV insertional events by pyrosequencing is feasible using small amounts of DNA and is suitable for evaluating rAAV integration frequency with different serotypes and administration routes. The pyrosequencing technology also proves feasible for comprehensive sequencing of intact ITRs. The sequencing of >30,000 rAAV concatemeric fragments from both primate skeletal muscle and liver has enabled us to identify preferential breakpoints in the ITR at high resolution. Molecular analyses of concatemers at late time-points have also identified a high diversity of vector-vector junctions revealing large deletions in rAAV genomes. It is most likely that concatemerization in this context also results in a portion of rAAV vectors which lack transgene expression, as Sun et al. recently also suggested,20 or vectors that express truncated products that could elicit immune responses.31 In this respect, potential transcripts resulting from complex concatemerized rAAV genomes may be more heterogeneous than previously assumed and warrant further characterization. The semi-quantitative measurement of AAV concatemers and proviruses provided here reveals that 96 and 93% of the vector genomes assimilate into concatemers in primate muscle and liver, respectively. These concatemers are predominantly episomal, since nearly all vector copies are associated with episomal circles, but integrated proviruses can also be concatemerized; thus, the number of vector-cellular junctions relative to vector-vector junctions is very small. We have identified 5–34 rAAV IS from 1 µg DNA obtained from skeletal muscle and liver of each individual NHP. Given that in 1 µg of DNA, we have analyzed ~1.56 × 10⁵ diploid genomes, the frequency of rAAV IS can be calculated to be between 10⁻⁴ and 10⁻⁵.
However, taking into account that most skeletal muscle samples had significantly higher vector copy numbers, the frequency of detecting single insertional events per vector genome was two- to 50-fold lower in muscle compared to liver, possibly resulting from higher numbers of episomal rAAV genomes in the muscle competing for IS retrieval. The demonstration that episomes and/or concatemers compete for the ability to retrieve integrated rAAV genomes (Figure 3b) suggests that the integration frequency is likely to be higher and dependent on the total vector numbers of the tissue or section analyzed.
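The frequency estimate in this paragraph follows from simple arithmetic, sketched below. The value of 6.4 pg of DNA per diploid genome is an assumption inferred from the stated ~1.56 × 10⁵ genomes per µg rather than a number given in the text, and the function names are illustrative.

```python
PG_PER_DIPLOID_GENOME = 6.4  # assumption: implied by ~1.56e5 diploid genomes per µg

def diploid_genomes(dna_ng: float) -> float:
    """Number of diploid genome equivalents in a given mass of DNA."""
    return dna_ng * 1000.0 / PG_PER_DIPLOID_GENOME  # ng -> pg, then per genome

def integration_frequency(n_insertion_sites: int, dna_ng: float = 1000.0) -> float:
    """Retrieved insertion sites per diploid genome analyzed."""
    return n_insertion_sites / diploid_genomes(dna_ng)

genomes = diploid_genomes(1000.0)
low = integration_frequency(5)    # fewest IS retrieved per animal (from the text)
high = integration_frequency(34)  # most IS retrieved per animal (from the text)
print(f"{genomes:.0f} genomes; frequency range {low:.1e} to {high:.1e}")
```

Under these assumptions the range spans roughly 3 × 10⁻⁵ to 2 × 10⁻⁴ per diploid genome, in line with the order of magnitude quoted in the text. Because retrieval competes with episomes, these values describe detected events and may understate the true integration frequency.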

In terms of the clinical safety of rAAV vectors, this report provides insight into the integration frequency of rAAV, showing that, despite the predominance of rAAV episomal and concatemeric structures, rare and random integration events can be detected in the liver and muscle following IM or RI administration. In light of the current debate about the integration frequency of AAV vectors in vivo, our results support the view that rAAV integration is a rare event, substantially outnumbered by episomally persistent rAAV forms even 34 months post-transduction. Because AAV integration has been linked to the development of hepatocellular carcinoma in mice,5 detection of integrated vector genomes in the liver of NHP following IM or RI injection may be of concern for genotoxic events.6 However, although rAAV integrates at low frequency and none of the NHP in the rAAV studies reported by us or other laboratories22,28,32,33,34,35 developed hepatocellular carcinomas or other potential genotoxic side effects, insertional mutagenesis cannot be ruled out entirely. The analyses conducted in this study indicate a low risk for insertional mutagenesis in liver and skeletal muscle at the doses administered. However, further decreasing the genotoxic risk through rAAV vector design, similar to approaches being applied for next generation retroviral vectors (e.g., tissue-specific promoters36 or the incorporation of chromatin insulators37), can potentially reduce possible vector-induced side effects on neighboring cellular genes. Insights concerning the integration frequency of rAAV in large animal models are relevant for further vector biosafety assessment of rAAV serotypes in preclinical models and support a low genotoxic risk for rAAV in future human clinical trials.

Materials and Methods

Production of rAAV vectors and rAAV administration in NHP. The rAAV2/1 and 2/8 vectors containing the LEA29Y (belatacept) transgene under the transcriptional control of the Rous Sarcoma Virus promoter and followed by the Woodchuck post-transcriptional regulatory element (WPRE) were produced as described previously.28 The IM (Mac 1 to 4 and Mac 9) or RI (Mac 5 to 8) administration of 5 × 10¹² vg/kg of rAAV was detailed in Toromanoff et al.28 Total DNA was extracted at different times postinjection from biopsies of the injected tibialis anterior muscle or transduced liver.28 The rAAV2/1-RSV-tGFP-IRES-Puro-ISceI vector was produced by transfection of the corresponding vector plasmid in 293 cells and subsequent purification by cesium chloride gradient.

Generation of rAAV-transduced human- and rhesus-derived cell clones. 5 × 10⁶ cells of the African green monkey-derived Cos cell line were transduced at a multiplicity of infection of 1,000 with rAAV2/1-RSV-tGFP-IRES-Puro-wpre-ISceI. Afterwards, cell clones were cultivated under 1 µg/ml of puromycin for 12 passages to select for cells containing stably integrated rAAV vector genomes. Under such selection conditions, most episomal rAAV vector genomes are gradually lost during cell division, and cells surviving puromycin selection contain integrated rAAV genomes. For further characterization, real-time PCR was performed as described below to determine the vector copy number per 30 ng of genomic DNA and, using cellular primers complementary to the cellular region flanking the insertion together with the primer used for LAM-PCR, to quantify the vector-cellular junction copy number.

Detection of rAAV proviruses by LAM-PCR and next generation pyrosequencing. LAM-PCR was performed to isolate rAAV flanking sequences in samples of 1 µg total DNA extracted from NHP muscle and liver as described previously.14,26 An additional PCR step with barcoded fusion primers (5′-GCCTCCCTCGCGCCATCAG(N)4-6CTTCCAGAGCATGGCTACGTAG-3′ and 5′-Bio-GCCTTGCCAGCCCGCTCAGAGTGGCACAGCAGTTAGG-3′, specific for the rAAV vector and the linker, respectively) was performed for next generation pyrosequencing (GS FLX 454 pyrosequencing; Roche, Mannheim, Germany). LAM-PCR sequences obtained by 454 pyrosequencing were analyzed by automated bioinformatical data mining, including trimming of sequences, alignment using UCSC BLAT analyzing tools, and identification of nearby genes and other integrating features as previously described.38 Characterization of stable integration sites in Cos-derived cell clones was performed on 100 ng of total DNA by LAM-PCR followed by next generation pyrosequencing as described above, except that the following barcoded fusion primer (5′-GCCTCCCTCGCGCCATCAG(N)4-6GATAAGCTGTCAAACATGACG-3′) was used for next generation pyrosequencing. The sensitivity of LAM-PCR and the efficiency of amplifying and sequencing complete or partial ITRs by LAM-PCR followed by pyrosequencing were analyzed in a LAM-PCR experiment as described above, using a dilution series ranging from 300 to 0.03 ng of DNA from different Cos-derived cell clones (Cos1: mid-length ITR; Cos7: short-length ITR).
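
The layout of the vector-specific fusion primer given above (454 adaptor, a 4 to 6 nt barcode, then the rAAV-specific primer) suggests how reads can be assigned to samples. The following Python sketch is illustrative only and is not the pipeline used in this study: it assumes exact primer matches, and the function name `split_fusion_read` is hypothetical.

```python
import re

# Both sequences are copied from the vector-specific fusion primer in the text.
ADAPTOR_A = "GCCTCCCTCGCGCCATCAG"          # 454 adaptor portion
VECTOR_PRIMER = "CTTCCAGAGCATGGCTACGTAG"   # rAAV-specific portion

# Expected read layout: adaptor, 4-6 nt barcode, vector primer, remainder.
# Assumption: no mismatches are tolerated in adaptor or primer.
FUSION_RE = re.compile(
    rf"^{ADAPTOR_A}([ACGT]{{4,6}}){VECTOR_PRIMER}([ACGTN]*)$"
)


def split_fusion_read(read):
    """Return (barcode, downstream_sequence), or None if the layout is absent."""
    match = FUSION_RE.match(read.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A real implementation would normally allow a small number of mismatches in the primer region; exact matching is used here only to keep the sketch short.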

Bioinformatical analyses of LAM-PCR–derived amplicon sequences. LAM-PCR sequence reads obtained by 454 pyrosequencing were initially aligned to the proviral vector sequence using a locally installed BLAST+39 instance configured for maximum alignment sensitivity (word size: 4, e value: 10, identity: 75%). The resulting alignment data were used to remove all sequence reads that did not start with the fusion primer sequence (minimum identity: 95%). Moreover, sequence reads that aligned continuously and in full length to the provirus were omitted from further analyses. The remaining sequences were then inspected with respect to proper alignment of their 3′ ends and divided into two partitions: (i) sequence reads that did not align to the proviral vector sequence at their 3′ end, suggesting vector insertion events, and (ii) sequence reads that aligned discontinuously but in full length to the proviral vector, suggesting viral deletion and recombination events such as concatemers of proviral or episomal vectors. The 5′ ends of sequences suggesting vector insertions were trimmed with respect to vector alignments, mapped to the target genome using UCSC BLAT to derive chromosomal insertion coordinates, and finally analyzed by automated bioinformatical data mining tools, including identification of nearby genes and other integrating features as previously described.22,25 Sequences supporting the hypothesis of vector concatemerization were analyzed for whether they aligned in two parts to the proviral sequence, that is, whether their 5′ ends aligned properly to the fusion primer region within the minus strand of the 5′ ITR while the remaining 3′ nucleotides aligned to a different part of the proviral sequence. Where both ends of a sequence aligned to the minus strand of the vector, H-T concatemers were assumed, whereas 3′ sequence ends aligning to the plus strand indicated H-H recombinations.
Because AAV vector genomes carry identical ITRs at both ends, it is often impossible to distinguish between alignments to the 3′ and the 5′ ITR. For sequence reads whose 3′ ends aligned equally well to both ITRs, the recombination type implying the smaller deletion was assumed. Finally, the two vector coordinates describing the transition in each two-part alignment were screened for redundancy to call incidences of concatemeric events.
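
The partitioning rules described above can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions, not the software used in this study: the `Block` type, the function names, and the `tolerance` parameter are hypothetical, reads are assumed to have already passed the fusion primer filter, and gaps between alignment blocks are not inspected.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    """One alignment block of a read against the vector (hypothetical type)."""
    read_start: int  # 0-based, inclusive
    read_end: int    # exclusive
    strand: str      # "+" or "-" strand of the vector


def classify_read(read_len: int, blocks: List[Block], tolerance: int = 3) -> str:
    """Assign a read to one of the partitions described in the text."""
    blocks = sorted(blocks, key=lambda b: b.read_start)
    # The 5' end must align to the vector (fusion primer region).
    if not blocks or blocks[0].read_start > tolerance:
        return "discard"
    covered_to = max(b.read_end for b in blocks)
    # (i) 3' end not explained by the vector: candidate insertion event.
    if read_len - covered_to > tolerance:
        return "insertion_candidate"
    # Continuous full-length vector alignment: omitted from further analysis.
    if len(blocks) == 1:
        return "full_length_vector"
    # (ii) Discontinuous but full-length: deletion or recombination event.
    if blocks[0].strand != "-":
        return "unclassified_rearrangement"
    return "H-T" if blocks[-1].strand == "-" else "H-H"


def resolve_itr_ambiguity(deletion_by_type: Dict[str, int]) -> str:
    """For 3' ends matching both ITRs, keep the type implying the smaller deletion."""
    return min(deletion_by_type, key=deletion_by_type.get)
```

For example, `classify_read(100, [Block(0, 40, "-"), Block(40, 100, "+")])` falls into the H-H category, because the 3′ block aligns to the plus strand.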

Real-time PCR. The vector copy number per diploid genome in the primate samples was determined by LEA29Y-specific TaqMan amplification relative to the copy number of the endogenous epsilon-globin gene.28 The integration site copy number of the main IS in the rhesus-derived single cell clone Cos7 was determined by absolute quantitative real-time PCR (LightCycler 480; Roche) based on SYBR Green I. The plasmid used as standard reference was generated by cloning the tracking PCR product of the unique IS of Cos7 into the pCR 2.1 TOPO vector (Invitrogen, Carlsbad, CA). The sample was analyzed in quadruplicate using 30 ng of DNA per reaction. Primers specific to the promoter region (A: 5′-CATGCAATTGTCGGTCAAGCC-3′ and B: 5′-CGTCATGTTTGACAGCTTATC-3′) were used to quantify the total vector copy number, while IS-corresponding primers (C: 5′-GACCATCCTTCCCAATTAGCC-3′ and D: 5′-GTCGAGATCTTCCAGAGCATG-3′) were used to quantify the copies of the Cos7-specific IS.
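
Absolute quantification against a plasmid standard, as described above, is commonly done by fitting a standard curve of threshold cycle (Ct) against log10 copy number and interpolating the samples. The Python sketch below illustrates that general approach only; the function names are hypothetical and the calculation is not taken from this study.

```python
import math


def fit_standard_curve(standard_copies, standard_cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in standard_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(standard_cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, standard_cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of a sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


def mean_copies(replicate_cts, slope, intercept):
    """Average copy number over technical replicates (e.g., a quadruplicate)."""
    values = [copies_from_ct(ct, slope, intercept) for ct in replicate_cts]
    return sum(values) / len(values)


def amplification_efficiency(slope):
    """PCR efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0
```

Applying the same curve-fitting step to both primer pairs (A/B for total vector copies, C/D for the Cos7-specific IS) would allow the two copy numbers to be compared for the same 30 ng input.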

SUPPLEMENTARY MATERIAL Figure S1. Vector copy number calculation of rAIS (Cos7) by qPCR. Figure S2. Amplified LAM-PCR amplicons. Figure S3. Overview of sequenced and validated LAM-PCR amplicons. Figure S4. Association of rAAV insertion sites for RefSeq genes or CpG islands. Table S1. Sensitivity of LAM-PCR for retrieval of rAIS. Table S2. Chromosomal location of rAAV insertion sites.

Acknowledgments

We thank Christian Weber, Kai Lukat, Nadine Krenzer and Anne Arens for technical and bioinformatical assistance. We also thank the personnel at the Boisbonne Centre (large animal facility, ONIRIS, Nantes) and the Vector Core at the University Hospital of Nantes for providing the rAAV2/1 and rAAV2/8 stocks. Funding was provided by the European Network for the Advancement of Clinical Gene Transfer and Therapy (CLINIGENE) and the Deutsche Forschungsgemeinschaft (SPP1230). R.O.S. is an inventor on patents related to recombinant AAV technology. R.O.S. owns equity in a gene therapy company that is commercializing AAV for gene therapy applications. To the extent that the work in this manuscript increases the value of these commercial holdings, R.O.S. has a conflict of interest. The other authors declared no conflict of interest.

REFERENCES

  1. Hacein-Bey-Abina S, Von Kalle C, Schmidt M, McCormack MP, Wulffraat N, Leboulch P, et al. LMO2-associated clonal T cell proliferation in two patients after gene therapy for SCID-X1. Science. 2003;302:415–419.
  2. Ott MG, Schmidt M, Schwarzwaelder K, Stein S, Siler U, Koehl U, et al. Correction of X-linked chronic granulomatous disease by gene therapy, augmented by insertional activation of MDS1-EVI1, PRDM16 or SETBP1. Nat Med. 2006;12:401–409.
  3. Schwarzwaelder K, Howe SJ, Schmidt M, Brugman MH, Deichmann A, Glimm H, et al. Gammaretrovirus-mediated correction of SCID-X1 is associated with skewed vector integration site distribution in vivo. J Clin Invest. 2007;117:2241–2249.
  4. Nienhuis AW, Dunbar CE, Sorrentino BP. Genotoxicity of retroviral integration in hematopoietic cells. Mol Ther. 2006;13:1031–1049. doi: 10.1016/j.ymthe.2006.03.001.
  5. Donsante A, Miller DG, Li Y, Vogler C, Brunt EM, Russell DW, et al. AAV vector integration sites in mouse hepatocellular carcinoma. Science. 2007;317:477.
  6. Russell DW. AAV vectors, insertional mutagenesis, and cancer. Mol Ther. 2007;15:1740–1743. doi: 10.1038/sj.mt.6300299.
  7. Bell P, Wang L, Lebherz C, Flieder DB, Bove MS, Wu D, et al. No evidence for tumorigenesis of AAV vectors in a large-scale study in mice. Mol Ther. 2005;12:299–306.
  8. Li H, Malani N, Hamilton SR, Schlachterman A, Bussadori G, Edmonson SE, et al. Assessing the potential for AAV vector genotoxicity in a murine model. Blood. 2011;117:3311–3319.
  9. Schultz BR, Chamberlain JS. Recombinant adeno-associated virus transduction and integration. Mol Ther. 2008;16:1189–1199. doi: 10.1038/mt.2008.103.
  10. Yang J, Zhou W, Zhang Y, Zidon T, Ritchie T, Engelhardt JF. Concatamerization of adeno-associated virus circular genomes occurs through intermolecular recombination. J Virol. 1999;73:9468–9477. doi: 10.1128/jvi.73.11.9468-9477.1999.
  11. Miao CH, Nakai H, Thompson AR, Storm TA, Chiu W, Snyder RO, et al. Nonrandom transduction of recombinant adeno-associated virus vectors in mouse hepatocytes in vivo: cell cycling does not influence hepatocyte transduction. J Virol. 2000;74:3793–3803.
  12. Nakai H, Storm TA, Kay MA. Recruitment of single-stranded recombinant adeno-associated virus vector genomes and intermolecular recombination are responsible for stable transduction of liver in vivo. J Virol. 2000;74:9451–9463. doi: 10.1128/jvi.74.20.9451-9463.2000.
  13. Choi VW, McCarty DM, Samulski RJ. Host cell DNA repair pathways in adeno-associated viral genome processing. J Virol. 2006;80:10346–10356. doi: 10.1128/JVI.00841-06.
  14. Penaud-Budloo M, Le Guiner C, Nowrouzi A, Toromanoff A, Chérel Y, Chenuaud P, et al. Adeno-associated virus vector genomes persist as episomal chromatin in primate muscle. J Virol. 2008;82:7875–7885.
  15. Nakai H, Montini E, Fuess S, Storm TA, Grompe M, Kay MA. AAV serotype 2 vectors preferentially integrate into active genes in mice. Nat Genet. 2003;34:297–302. doi: 10.1038/ng1179.
  16. Miller DG, Trobridge GD, Petek LM, Jacobs MA, Kaul R, Russell DW. Large-scale analysis of adeno-associated virus vector integration sites in normal human cells. J Virol. 2005;79:11434–11442. doi: 10.1128/JVI.79.17.11434-11442.2005.
  17. Miller DG, Rutledge EA, Russell DW. Chromosomal effects of adeno-associated virus vector integration. Nat Genet. 2002;30:147–148. doi: 10.1038/ng824.
  18. Miller DG, Petek LM, Russell DW. Adeno-associated virus vectors integrate at chromosome breakage sites. Nat Genet. 2004;36:767–773. doi: 10.1038/ng1380.
  19. Nathwani AC, Gray JT, Ng CY, Zhou J, Spence Y, Waddington SN, et al. Self-complementary adeno-associated virus vectors containing a novel liver-specific human factor IX expression cassette enable highly efficient transduction of murine and nonhuman primate liver. Blood. 2006;107:2653–2661.
  20. Sun X, Lu Y, Bish LT, Calcedo R, Wilson JM, Gao G. Molecular analysis of vector genome structures after liver transduction by conventional and self-complementary adeno-associated viral serotype vectors in murine and nonhuman primate models. Hum Gene Ther. 2010;21:750–761. doi: 10.1089/hum.2009.214.
  21. Flageul M, Aubert D, Pichard V, Nguyen TH, Nowrouzi A, Schmidt M, et al. Transient expression of genes delivered to newborn rat liver using recombinant adeno-associated virus 2/8 vectors. J Gene Med. 2009;11:689–696.
  22. Mattar CN, Nathwani AC, Waddington SN, Dighe N, Kaeppel C, Nowrouzi A, et al. Stable human FIX expression after 0.9G intrauterine gene transfer of self-complementary adeno-associated viral vector 5 and 8 in macaques. Mol Ther. 2011;19:1950–1960.
  23. Niemeyer GP, Herzog RW, Mount J, Arruda VR, Tillson DM, Hathcock J, et al. Long-term correction of inhibitor-prone hemophilia B dogs treated with liver-directed AAV2-mediated factor IX gene therapy. Blood. 2009;113:797–806.
  24. Gabriel R, Eckenberg R, Paruzynski A, Bartholomae CC, Nowrouzi A, Arens A, et al. Comprehensive genomic access to vector integration in clinical gene therapy. Nat Med. 2009;15:1431–1436.
  25. Paruzynski A, Arens A, Gabriel R, Bartholomae CC, Scholz S, Wang W, et al. Genome-wide high-throughput integrome analyses by nrLAM-PCR and next-generation sequencing. Nat Protoc. 2010;5:1379–1395.
  26. Schmidt M, Schwarzwaelder K, Bartholomae C, Zaoui K, Ball C, Pilz I, et al. High-resolution insertion-site analysis by linear amplification-mediated PCR (LAM-PCR). Nat Methods. 2007;4:1051–1057.
  27. Inagaki K, Piao C, Kotchey NM, Wu X, Nakai H. Frequency and spectrum of genomic integration of recombinant adeno-associated virus serotype 8 vector in neonatal mouse liver. J Virol. 2008;82:9513–9524. doi: 10.1128/JVI.01001-08.
  28. Toromanoff A, Chérel Y, Guilbaud M, Penaud-Budloo M, Snyder RO, Haskins ME, et al. Safety and efficacy of regional intravenous (r.i.) versus intramuscular (i.m.) delivery of rAAV1 and rAAV8 to nonhuman primate skeletal muscle. Mol Ther. 2008;16:1291–1299.
  29. Chadeuf G, Ciron C, Moullier P, Salvetti A. Evidence for encapsidation of prokaryotic sequences during recombinant adeno-associated virus production and their in vivo persistence after vector delivery. Mol Ther. 2005;12:744–753. doi: 10.1016/j.ymthe.2005.06.003.
  30. Inagaki K, Lewis SM, Wu X, Ma C, Munroe DJ, Fuess S, et al. DNA palindromes with a modest arm length of ≳20 base pairs are a significant target for recombinant adeno-associated virus vector integration in the liver, muscles, and heart in mice. J Virol. 2007;81:11290–11303.
  31. Li C, Goudy K, Hirsch M, Asokan A, Fan Y, Alexander J, et al. Cellular immune response to cryptic epitopes during therapeutic gene transfer. Proc Natl Acad Sci USA. 2009;106:10770–10774.
  32. Gao G, Lu Y, Calcedo R, Grant RL, Bell P, Wang L, et al. Biology of AAV serotype vectors in liver-directed gene transfer to nonhuman primates. Mol Ther. 2006;13:77–87.
  33. Jiang H, Couto LB, Patarroyo-White S, Liu T, Nagy D, Vargas JA, et al. Effects of transient immunosuppression on adenoassociated, virus-mediated, liver-directed gene transfer in rhesus macaques and implications for human gene therapy. Blood. 2006;108:3321–3328.
  34. Nathwani AC, Rosales C, McIntosh J, Rastegarlari G, Nathwani D, Raj D, et al. Long-term safety and efficacy following systemic administration of a self-complementary AAV vector encoding human FIX pseudotyped with serotype 5 and 8 capsid proteins. Mol Ther. 2011;19:876–885.
  35. Toromanoff A, Adjali O, Larcher T, Hill M, Guigand L, Chenuaud P, et al. Lack of immunotoxicity after regional intravenous (RI) delivery of rAAV to nonhuman primate skeletal muscle. Mol Ther. 2010;18:151–160.
  36. Fougerousse F, Bartoli M, Poupiot J, Arandel L, Durand M, Guerchet N, et al. Phenotypic correction of alpha-sarcoglycan deficiency by intra-arterial injection of a muscle-specific serotype 1 rAAV vector. Mol Ther. 2007;15:53–61.
  37. Emery DW. The use of chromatin insulators to improve the expression and safety of integrating gene transfer vectors. Hum Gene Ther. 2011;22:761–774. doi: 10.1089/hum.2010.233.
  38. Cartier N, Hacein-Bey-Abina S, Bartholomae CC, Veres G, Schmidt M, Kutschera I, et al. Hematopoietic stem cell gene therapy with a lentiviral vector in X-linked adrenoleukodystrophy. Science. 2009;326:818–823.
  39. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.

Articles from Molecular Therapy are provided here courtesy of The American Society of Gene & Cell Therapy
