ABSTRACT
The genetic stability of cell lines is a critical analytical attribute required to demonstrate the quality of cells over time. During cell passage, mutations can arise in the genomic DNA, potentially leading to changes in the final vaccine product. The identity and integrity of master cell banks, extended cell banks, complementing cell lines or recombinant cell lines expressing transgenes have to be tested throughout the production process by the vaccine manufacturer. Over the past few years, the traditional methods for evaluation of genetic stability have been replaced with molecular approaches including quantitative PCR, digital PCR and high throughput sequencing. However, these molecular-based approaches are used in research laboratories and not within a GMP-compliant environment. In this article, we briefly discuss some opportunities and challenges in characterization of the genetic stability of vaccine cell lines with these molecular-based approaches.
KEYWORDS: cell lines, genetic stability, molecular approaches, transgenes, vaccines
Overview
Cell cultures are known to be under stress and selection due to the artificial environment in which they exist and the laboratory manipulations to which they are subject.1-3 Genetic mutations that occur during cell passaging can lead to a heterogeneous population, and some mutations might confer a selective advantage. This may allow a small subpopulation to overtake the initial cell culture, resulting in a final cell population that is different from the starting pool.1-3 For manufacturers of biologics, including vaccines, gene therapies and therapeutic proteins, spontaneous mutations occurring during cell culturing pose a risk to the use of recombinant proteins.3 Expression of recombinant proteins generally begins with cloning the gene of interest into a vector (such as a plasmid or viral vector) and introducing that vector into a specific host (such as bacterial, human, animal or insect cell lines); the desired protein is then produced within the host cells, and can be purified and subsequently formulated for medicinal use. The industrial use of recombinant proteins allows for large quantities of a specific protein to be easily produced and purified.
Given the possibility of genetic mutations occurring during the production of recombinant proteins, genetic stability testing is a fundamental step in confirming the consistency of the recombinant proteins produced. Multiple health and regulatory authorities have outlined guidelines for biologic manufacturers to ensure the genetic stability of protein expression systems. For example, the United States Food and Drug Administration (FDA) guideline, “Supplement to the Points to Consider in the Production and Testing of New Drugs and Biologicals Produced by Recombinant DNA Technology: Nucleic Acid Characterization and Genetic Stability,”4 describes the testing profiles required to demonstrate protein and DNA construct stability. Similar guidelines exist within the World Health Organization (WHO)5 and other pharmacopeias.6
One key assessment of genetic stability is confirmation of the size and nucleotide sequence of the genome region of interest and, if applicable, verification of the sequence of the promoter and other contributing transcriptional elements. Guidelines by the International Council for Harmonisation (ICH),7 WHO Technical Report Series,5 European Pharmacopeia,6 and the FDA Points to Consider4 advise that genetic stability testing should be done, at a minimum, on the Master Cell Banks (MCB) and the Extended Cell Banks (ECB) to ensure the consistency of the product.
Stability of transgenes in complementing cell lines
Over recent years, several recombinant cell lines have been developed for the production of vaccine candidates. Many replication-deficient viruses, in which essential genes are deleted from the viral genome in order to produce infectious but non-replicating virus candidate vaccines, require a cell line to complement the replication defects of the viral vaccine strain to allow efficient viral production. Genetic stability of the transgenes introduced into these cell lines is an important concern, since transgene instability may lead to inconsistent expression of the gene products.8,9 Additionally, this transgene instability may negatively impact vaccine yields. Quantitative polymerase chain reaction (qPCR) is a common molecular method for assessing transgene stability in the genome of a constructed cell line.10,11 However, qPCR has some technical limitations, including differences in amplification efficiency between some reference standards and test samples.12
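The effect of an efficiency mismatch on a qPCR copy number estimate can be illustrated numerically. The sketch below assumes idealized exponential amplification, in which N0 starting copies reach N0 × (1 + E)^c copies after c cycles, and a fixed detection threshold. The function names, the threshold and the efficiency values are illustrative assumptions and are not taken from the cited assays.

```python
import math

def cq(n0, efficiency, threshold=1e10):
    """Cycle at which n0 starting copies reach the threshold, assuming
    ideal exponential growth: n0 * (1 + efficiency) ** cq == threshold."""
    return math.log(threshold / n0) / math.log(1.0 + efficiency)

def copies_from_standard(cq_value, standard_efficiency, threshold=1e10):
    """Back-calculate starting copies under the assumption that the sample
    amplified with the same efficiency as the reference standard."""
    return threshold / (1.0 + standard_efficiency) ** cq_value

true_copies = 1000.0
# Hypothetical case: the test sample amplifies at 90% efficiency...
sample_cq = cq(true_copies, efficiency=0.90)
# ...but is quantified against a standard that amplifies at 100%.
estimate = copies_from_standard(sample_cq, standard_efficiency=1.00)

print(f"Cq = {sample_cq:.2f}, estimated copies = {estimate:.1f}")
```

Under these assumptions the sample crosses the threshold later than a perfectly efficient reaction would, so the back-calculated copy number falls well below the true value of 1000. When the two efficiencies are equal, the same calculation recovers the true value.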
Recently, Sanofi Pasteur developed two molecular approaches for evaluation of gene copy numbers in the AV529-19 transgenic cell line. AV529-19 is a Vero cell line specifically engineered to express the coding regions of transgenes HSV-1 UL5 and UL29; it is used for the propagation of a candidate herpesvirus type II vaccine with defective UL5 and UL29 coding regions.13
The first approach was to develop a qPCR-based method to assess UL5 and UL29 gene copy number. This assay is based on a relative quantitation that involves an in-house reference standard from which reportable values for UL5 and UL29 gene copy numbers are extrapolated. However, as the Vero cell line was not fully sequenced at the time, construction of a test reference standard without significant optimization was difficult.13 Therefore, a second molecular method was developed, using digital PCR (dPCR) technology to demonstrate the stability of the transgene copy number in the AV529-19 cell line.13 This high throughput dPCR-based approach did not require any test reference standard, and demonstrated high precision and accuracy. dPCR involves partitioning each test sample into a large number of individual reactions, with reaction partitions that received a target sequence counted as positive and those that did not receive a target molecule counted as negative. The combined results are then evaluated using the Poisson distribution to convert the fraction of positive reaction partitions into a copy number determination. The developed method was able to resolve the issues associated with the traditional qPCR-based approach, and confirmed the stability of UL5 and UL29 transgenes in the Vero MCB and ECB.13
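The Poisson conversion used in dPCR can be sketched as follows. This is a minimal illustration, assuming that all partitions have the same volume and that target molecules distribute randomly among them. The function name and the partition counts are hypothetical and are not taken from the AV529-19 study.

```python
import math

def copies_per_partition(positive, total):
    """Mean target copies per partition (lambda) from the positive fraction.

    Under a Poisson model the fraction of negative partitions is
    exp(-lambda), so lambda = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        # If every partition is positive the estimate is unbounded.
        raise ValueError("require total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)

# Hypothetical run: 20,000 partitions, 6,000 of them positive.
lam = copies_per_partition(6000, 20000)
total_copies = lam * 20000  # estimated target molecules across all partitions

print(f"lambda = {lam:.4f}, estimated copies = {total_copies:.0f}")
```

The estimated number of molecules exceeds the number of positive partitions because some partitions receive more than one target molecule, which is exactly what the Poisson correction accounts for.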
High throughput sequencing approach for evaluation of genetic stability in vaccine cell lines
The two tests commonly used to demonstrate genetic stability are Southern Blot analysis (Ph. Eur. 0784)6 and restriction fragment length polymorphism (RFLP).14 These assays are complex, time consuming, sensitive to genetic instability, and usually require large amounts of test sample.15,16 However, with the development of newer molecular techniques, genetic stability can now be confirmed by Sanger sequencing.14 In general, multiple coverage of the region of interest by capillary sequencing is required by the regulatory agencies to demonstrate stability.4,5 This can be demonstrated by full coverage of the insert region using a minimum of two amplicons, with each amplicon sequenced in both the forward and reverse directions.
The recent development of high throughput sequencing (HTS) instruments now offers better sensitivity than Sanger sequencing, with the number of reads reaching upwards of hundreds of millions, allowing for in-depth assessment of very minor variations. While Sanger sequencing is based on the sequencing of a pool of amplified molecules and can potentially detect variants at frequencies as low as 15%, HTS can detect variants at frequencies of 0.1% or even lower.17-19 Furthermore, although not yet required by any of the regulatory agencies, the use of HTS technology allows for the genetic analysis of regions outside of the construct, such as host or plasmid sequences, to assess additional genetic changes. It is anticipated that regulatory agencies will request this information as HTS becomes more accepted and widely used for confirming genetic stability.20
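The link between read depth and the ability to see a low-frequency variant can be illustrated with a simple binomial model. The sketch below assumes that each read covering a position independently carries the variant with probability equal to the variant frequency, that sequencing errors are ignored, and that a variant is reported only when at least five supporting reads are seen. The function name and the five-read cutoff are assumptions made for illustration.

```python
from math import comb

def prob_at_least(k, depth, freq):
    """Probability of observing at least k variant-supporting reads at a
    position covered by `depth` reads, for a variant present at `freq`."""
    below = sum(
        comb(depth, i) * freq**i * (1.0 - freq) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below

# Variant present at 0.1% of molecules, reported only with >= 5 reads.
for depth in (1_000, 5_000, 20_000):
    p = prob_at_least(5, depth, 0.001)
    print(f"depth {depth:>6}: P(>=5 variant reads) = {p:.4f}")
```

With these assumptions a 0.1% variant is rarely seen at a depth of 1,000 reads but is almost always seen at 20,000 reads, which is why deep coverage is a precondition for the sensitivity figures quoted for HTS.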
There are multiple HTS platforms available, and each system has its own advantages and disadvantages. A comparison of the systems available from Ion Torrent, Illumina and Pacific Bioscience is outlined in Table 1.
Table 1.
Parameter | Ion Torrent Personal Genome Machine | Ion Torrent Ion Proton | Illumina MiSeq | Illumina HiSeq | Illumina NextSeq | Pacific Biosciences |
---|---|---|---|---|---|---|
Read length | 400 bp | 75 bp | 250 bp | 250 bp | 150 bp | >15 kb |
Depth of reads | 10 million | 80 million | 15 million | 1 billion | 500 million | 50,000 |
Other considerations | High error rates in regions of homopolymers | High error rates in regions of homopolymers | Short read length makes mapping repetitive or highly similar regions very difficult | Short read length makes mapping repetitive or highly similar regions very difficult | Short read length makes mapping repetitive or highly similar regions very difficult | Long reads that can sequence across tandemly duplicated regions |
The copy number for the gene of interest within the DNA construct is often increased to two or three copies to increase the amount of protein produced.21,22 These highly similar regions can be problematic for data analysis when using short-read technology, because short reads generated from within repeated sequences cannot be assigned unambiguously to one copy (de-convoluted). In comparison, tandemly repeated genes are not an issue for the long reads generated from the Pacific Biosciences system. Although recent work has improved on the sensitivity of the Pacific Biosciences platform, this system has a lower accuracy for sequenced nucleotides than the Illumina or the Ion Torrent platforms.23-25
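The mapping ambiguity caused by tandem gene copies can be shown with a toy construct. The sketch below builds a hypothetical construct carrying two identical copies of a randomly generated "gene" between random flanking sequences, then locates a short and a long read by exact string matching. Real read mappers tolerate mismatches and use indexed references, so this only illustrates the principle; the sequences, lengths and function names are all assumptions.

```python
import random

def random_sequence(length, rng):
    """Random DNA string used as a stand-in for real sequence."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def find_all(reference, read):
    """All 0-based positions where `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        start = reference.find(read, start + 1)
    return hits

rng = random.Random(0)
gene = random_sequence(300, rng)
# Hypothetical construct: flank + two tandem gene copies + flank (800 nt).
construct = random_sequence(100, rng) + gene + gene + random_sequence(100, rng)

short_read = gene[100:175]      # 75 nt read lying entirely inside one copy
long_read = construct[50:750]   # 700 nt read spanning both copies and flanks

print("short read maps to:", find_all(construct, short_read))
print("long read maps to: ", find_all(construct, long_read))
```

The short read matches once in each gene copy, so a variant it carries cannot be assigned to a specific copy, whereas the long read is anchored by unique flanking sequence and maps to a single location.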
The Ion Torrent platform is based on a sequential nucleotide flow and measurement of a change in voltage during nucleotide incorporation, and has a high error rate when sequencing through a region of homopolymers. The Illumina platform uses a reversible dye terminator technique that is more prone to mis-incorporation of a nucleotide.23 Therefore, for assessing genetic stability and demonstrating consistency of the nucleotide sequence of the gene, the choice of sequencing system may ultimately depend on the genetic structure of the construct. Another possibility is to combine PCR amplification with HTS, with the region of interest first amplified by PCR before being sequenced. Multiple PCR amplicons will allow the separation of repetitive regions (for example, a gene copy number greater than one) before being sequenced in the same reaction by multiplexing.
Analysis of HTS data for assessment of genetic stability is another aspect that needs to be considered. Traditional Sanger sequencing produces a chromatogram that represents the consensus sequence of the population of the sequenced sample. In contrast, each read from an HTS run represents a nucleic acid molecule in the initial population of the sample. Depending on whether PCR amplicons were used to enrich the targeted region prior to sequencing, HTS data may also include signals from the vector, host or environmental background. For data analysis, examination of Sanger sequencing data is relatively straightforward and, if necessary, can be done manually. However, analysis of HTS data requires bioinformatics software for read mapping to ensure there is full coverage of the construct and that potential variants are detected. De novo assembly of the data requires a balance between generating multiple continuous sequences (known as contigs) that do not collapse (potentially due to variants) and mis-assembly, where erroneous reads are assembled together. Although HTS offers much higher sensitivity due to its depth of coverage, the sequencing error rate of different HTS instruments should be considered when identifying potential variants.26
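The coverage check that follows read mapping can be sketched in a few lines. The example below assumes that reads have already been aligned by external software and are available as half-open (start, end) intervals on the construct, and it reports every stretch whose depth falls below a chosen minimum. The function name, the interval representation and the example coordinates are hypothetical.

```python
def coverage_gaps(construct_length, read_intervals, min_depth=1):
    """Return half-open (start, end) regions of the construct whose read
    depth is below `min_depth`, given aligned reads as (start, end)."""
    # Difference array: +1 where a read starts, -1 where it ends.
    diff = [0] * (construct_length + 1)
    for start, end in read_intervals:
        start = max(start, 0)
        end = min(end, construct_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    gaps = []
    depth = 0
    gap_start = None
    for pos in range(construct_length):
        depth += diff[pos]  # running sum gives the depth at this position
        if depth < min_depth:
            if gap_start is None:
                gap_start = pos
        elif gap_start is not None:
            gaps.append((gap_start, pos))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, construct_length))
    return gaps

# Hypothetical alignments on a 100 nt construct.
reads = [(0, 40), (30, 70), (80, 100)]
print(coverage_gaps(100, reads))               # regions with no coverage
print(coverage_gaps(100, reads, min_depth=2))  # regions covered fewer than twice
```

An empty result at the required minimum depth indicates that the construct is fully covered. Any reported region would need additional sequencing before variants within it could be assessed.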
The use of an HTS system within a validated (or qualified) setting to evaluate genetic stability needs to be considered if it will ultimately be used as a control test for release of the product. Currently, most HTS systems are suitable for research and development purposes only, and do not necessarily fulfill regulatory requirements such as those specified in FDA 21 CFR Part 11 and EudraLex Annex 11. As a computerized system, regulatory compliance includes, at a minimum, secured access, an audit trail, data backup, control of software and use of electronic signatures, all of which must be applied to both the hardware and software components. Furthermore, the technology used must be optimized specifically for nucleic acid extraction, amplification and post-amplification manipulations, and controls need to be established to monitor the performance of the test application.
Concluding remarks
One of the important attributes for vaccine production is to establish a cell line which is stable over time. In recent years, molecular approaches including qPCR, dPCR, and HTS have supplanted the traditional analytical testing techniques and are acknowledged by experts and regulatory agencies. However, while each technology has its own inherent advantages, there are some challenges with each approach. qPCR is the most common validated molecular approach for evaluation of genetic stability of cell lines, being compliant with Good Manufacturing Practice (GMP) guidelines and easy to validate. However, selection of internal reference genes and the difference in amplification efficiency between test reference standards and test samples can be a challenge with this method. dPCR-based approaches are able to overcome these shortcomings as they do not require the use of a reference standard. HTS is another technology which significantly impacts the field. However, as with dPCR, there is limited support for implementing HTS systems in a regulated environment. The majority of the available software associated with dPCR or HTS is suitable for research use only and remains to be implemented within a GMP-compliant environment. Despite these challenges, it is important to continue to improve current assays. The benefit of higher sensitivity and broader coverage of the molecular region of interest will improve the product consistency for biologic product manufacturers.
Disclosure of potential conflicts of interest
All authors are employees of Sanofi Pasteur.
Acknowledgment
Medical editing services were provided by Adam McGechan, inScience Communications, Springer Healthcare.
Funding
All authors are employees of Sanofi Pasteur. Sanofi Pasteur was involved in the design and conduct of the review and the decision to publish. Editorial assistance was provided by Springer Healthcare, funded by Sanofi Pasteur.
Author contributions
SN, LG-L and AA were involved in the conception, design and conduct of this review. All authors contributed to drafting and critically reviewing the publication and approved the final manuscript for submission.
References
- [1]. ATCC, Technical Bulletin 7, Passage Number Effects In Cell Lines. 2010
- [2]. Hughes P, Marshall D, Reid Y, Parkes H, Gelber C. The costs of using unauthenticated, over-passaged cell lines: how much more data do we need? Biotechniques 2007; 43(5):575; PMID:18072586; https://doi.org/10.2144/000112598
- [3]. World Health Organization, Replacement of Annex 1 of WHO Technical Report Series, No 878, Recommendations for the evaluation of animal cell cultures as substrates for the manufacture of biological medicinal products and for the characterization of cell banks. 2010
- [4]. Food and Drug Administration, Center for Biologics Evaluation and Research. Supplement to the Points to Consider in the Production and Testing of New Drugs and Biologicals Produced by Recombinant DNA Technology: Nucleic Acid Characterization and Genetic Stability. 1992
- [5]. World Health Organization Technical Report Series: biological products: general recommendations, Guidelines on evaluation of similar Biotherapeutic Products (SBPs), ECBS. 2009
- [6]. Ph Eur, Monographs: Medicinal and Pharmaceutical Substances, Products of Recombinant DNA Technology. 2016
- [7]. ICH Topic Q 5 B, Quality of Biotechnological Products: Analysis of the expression construct in cell lines used for production of rDNA derived protein products. 1996
- [8]. Hughes ED, Qu YY, Genik SJ, Lyons RH, Pacheco CD, Lieberman AP, Samuelson LC, Nasonkin IO, Camper SA, Van Keuren ML, et al. Genetic variation in C57BL/6 ES cell lines and genetic instability in the Bruce4 C57BL/6 ES cell line. Mamm Genome 2007; 18(8):549-58; PMID:17828574; https://doi.org/10.1007/s00335-007-9054-0
- [9]. Aumiller JJ, Mabashi-Asazuma H, Hillar A, Shi X, Jarvis DL. A new glycoengineered insect cell line with an inducibly mammalianized protein N-glycosylation pathway. Glycobiology 2012; 22(3):417-28; PMID:22042767; https://doi.org/10.1093/glycob/cwr160
- [10]. Paakkanen R, Vauhkonen H, Eronen KT, Jarvinen A, Seppanen M, Lokki ML. Copy number analysis of complement C4A, C4B and C4A silencing mutation by real-time quantitative polymerase chain reaction. PLoS One 2012; 7(6):e38813; PMID:22737222; https://doi.org/10.1371/journal.pone.0038813
- [11]. Castley AS, Martinez OP. Molecular analysis of complement component C4 gene copy number. Methods Mol Biol 2012; 882:159-71; PMID:22665233
- [12]. Svec D, Tichopad A, Novosadova V, Pfaffl MW, Kubista M. How good is a PCR efficiency estimate: Recommendations for precise and robust qPCR efficiency assessments. Biomol Detect Quantif 2015; 3:9-16; PMID:27077029; https://doi.org/10.1016/j.bdq.2015.01.005
- [13]. Azizi A, Aidoo F, Gisonni-Lex L, McNeil B. Determination of HSV-1 UL5 and UL29 gene copy numbers in an HSV complementing Vero cell line. J Biotechnol 2013; 168(4):382-7; PMID:24140636; https://doi.org/10.1016/j.jbiotec.2013.10.002
- [14]. World Health Organization Standard Operating Procedure Mutant Analysis by PCR and Restriction Enzyme Cleavage (MAPREC) for Oral Poliovirus (SABIN) Vaccine Types 1, 2 OR 3, Version 5. 2012
- [15]. Adzitey F. Genetic diversity of Escherichia coli isolated from ducks and the environment using enterobacterial repetitive intergenic consensus. Pak J Biol Sci 2013; 16(20):1173-8; PMID:24506018; https://doi.org/10.3923/pjbs.2013.1173.1178
- [16]. Yang L, Ding J, Zhang C, Jia J, Weng H, Liu W, Zhang D. Estimating the copy number of transgenes in transformed rice by real-time quantitative PCR. Plant Cell Rep 2005; 23(10-11):759-63; PMID:15459795; https://doi.org/10.1007/s00299-004-0881-0
- [17]. Mantel N, Girerd Y, Geny C, Bernard I, Pontvianne J, Lang J, Barban V. Genetic stability of a dengue vaccine based on chimeric yellow fever/dengue viruses. Vaccine 2011; 29(38):6629-35; PMID:21745519; https://doi.org/10.1016/j.vaccine.2011.06.101
- [18]. Flaherty P, Natsoulis G, Muralidharan O, Winters M, Buenrostro J, Bell J, Brown S, Holodniy M, Zhang N, Ji HP. Ultrasensitive detection of rare mutations using next-generation targeted resequencing. Nucleic Acids Res 2012; 40(1):e2; PMID:22013163; https://doi.org/10.1093/nar/gkr861
- [19]. Xu F, Wang W, Wang P, Jun LM, Chung SP, Wang J. A fast and accurate SNP detection algorithm for next-generation sequencing data. Nat Commun 2012; 3:1258; PMID:23212387; https://doi.org/10.1038/ncomms2256
- [20]. Treangen TJ, Salzberg SL. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet 2011; 13(1):36-46; PMID:22124482
- [21]. Liang A, Cao S, Han L, Yao Y, Moaeen-Ud-Din M, Yang L. Construction and evaluation of the eukaryotic expression plasmid encoding two copies of somatostatin genes fused with hepatitis B surface antigen gene S. Vaccine 2008; 26(23):2935-41; PMID:18455280; https://doi.org/10.1016/j.vaccine.2008.03.036
- [22]. Ross TM, Xu Y, Bright RA, Robinson HL. C3d enhancement of antibodies to hemagglutinin accelerates protection against influenza virus challenge. Nat Immunol 2000; 1(2):127-31; PMID:11248804; https://doi.org/10.1038/77802
- [23]. Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Gu Y. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 2012; 13:341; PMID:22827831; https://doi.org/10.1186/1471-2164-13-341
- [24]. Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M. Comparison of next-generation sequencing systems. J Biomed Biotechnol 2012; 2012:251364; PMID:22829749
- [25]. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol 2012; 30(7):693-700; PMID:22750884; https://doi.org/10.1038/nbt.2280
- [26]. Ng SH, Azizi A, Edamura K, Malott RJ, Charlebois RL, Logvinoff C, Schreiber M, Mallet L, Gisonni-Lex L. Preliminary evaluation of next-generation sequencing performance relative to qPCR and in vitro cell culture tests for Human Cytomegalovirus. PDA J Pharm Sci Technol 2014; 68(6):563-71; PMID:25475630; https://doi.org/10.5731/pdajpst.2014.01013