Abstract
Nicotiana benthamiana is employed around the world for many types of research and one transgenic line has been used more extensively than any other. This line, 16c, expresses the Aequorea victoria green fluorescent protein (GFP) highly and constitutively, and has been a major resource for visualising the mobility and actions of small RNAs. Insights into the mechanisms studied at a molecular level in N. benthamiana 16c are likely to be deeper and more accurate with a greater knowledge of the GFP gene integration site. Therefore, using next generation sequencing, genome mapping and local alignment, we identified the location and characteristics of the integrated T-DNA. As suggested by previous molecular hybridisation and inheritance data, the transgenic line contains a single GFP-expressing locus. However, the GFP coding sequence differs from that originally reported. Furthermore, a 3.2 kb portion of a transposon appears to have co-integrated with the T-DNA. The integration site mapped to a region of the genome represented by Nbv0.5scaffold4905 in the www.benthgenome.com assembly and, with less integrity, by Niben101Scf03641 in the www.solgenomics.net assembly. The transposon is not endogenous to laboratory strains of N. benthamiana or to Agrobacterium tumefaciens strain GV3101 (MP90), which was reportedly used in the generation of line 16c. However, it is present in the popular LBA4404 strain. The integrated transposon sequence includes its 5’ terminal repeat and a transposase gene, and is immediately adjacent to the GFP gene. This unexpected genetic arrangement may contribute to the characteristics that have made the 16c line such a popular research tool, and it alerts researchers taking transgenic plants to commercial release to the possible presence of this genomic hitchhiker.
Introduction
Reporter genes have been profoundly important in advancing biological research. In plants, one of the first widely adopted reporter systems utilised β-glucuronidase (GUS) [1], which converts a soluble clear substrate to an insoluble blue precipitate. Unfortunately, this is a destructive histochemical technique. The more recent discovery and application of the green fluorescent protein (GFP) from Aequorea victoria allows reporter assays to be live, continuous and non-destructive, and has revolutionised molecular science. GFP was first used in bacteria and animal cells but was rapidly adopted by plant researchers following the addition of plant regulatory signals and codon optimisation. The pioneering plant expression constructs, such as mGFP4, mGFP5, and mGFP5-ER [2, 3], have been used extensively in both monocotyledon and dicotyledon plants. For many of these species, different research groups have produced their own GFP-expressing lines, and rarely has a single transgenic GFP-expressing line of a species been embraced by a whole research sector. The 16c line of N. benthamiana, from David Baulcombe’s laboratory, is the notable exception. It was generated alongside three other lines (GFP8, GFP17b, and GFPY) [4], which have not been widely distributed, but 16c has been cited in more than 750 publications. These publications cover virus-plant interactions, RNAi, mobile signals, florigens, grafting, viroids, protein structure and function, protein-protein interactions, human and avian viruses, and silencing suppressors. We have already published and made available the genome and transcriptome sequences of N. benthamiana [5–7] (www.benthgenome.com), and believe that reporting further details about the location and characteristics of the T-DNA insertion in the 16c line will aid the plant research community.
Materials and methods
Plant material
Nicotiana benthamiana cultivars LAB [5, 7] and 16c (a generous gift from David Baulcombe, UK) were grown in soil (Plugger custom Mix, Debco, supplemented with Osmocote® slow release fertiliser) in a controlled growth chamber at a constant temperature of 22.5°C, with a 16 h day length, 300 μm/m2 light and 60% relative humidity.
DNA extractions
DNA was extracted from approximately 100 mg of freshly ground leaf material using the NucleoSpin® Plant II (Macherey-Nagel) extraction kit as per manufacturer’s instructions with the following modification: 1% v/v β-mercaptoethanol (Sigma-Aldrich) was added to the lysis buffer to inhibit oxidative damage. Extracted DNA was quantified spectrophotometrically using the Nanodrop 2000 (Thermo-Fisher) and an aliquot of 300 ng run on a 1% TAE (tris-acetate-EDTA) agarose gel and visualised under blue light to determine its integrity.
Genome sequencing
The 16c DNA library was prepared using the TruSeq® Nano Kit (Illumina) as per manufacturer’s instructions and processed on one lane of the NextSeq® 500 (Illumina) housed at the Queensland University of Technology, Central Analytical Research Facility (QUT CARF). Raw reads were filtered prior to alignment: Trimmomatic [8] was used to keep only reads of 150 nt, FastUniq was used to remove PCR duplicates, and BBsplit from the BBMap package (https://sourceforge.net/projects/bbmap/) was used to remove reads aligning to the mitochondrial and chloroplast genomes (Nicotiana tabacum GenBank ID BA000042.1 and Z00044.2, respectively). The remaining 212,397,922 paired end reads were aligned to the pBIN19-mGFP5-ER sequence using Bowtie2 on the Galaxy Queensland platform (http://galaxy-qld.genome.edu.au/galaxy).
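The logic of the first two filtering steps can be illustrated with a minimal Python sketch. It is not the Trimmomatic or FastUniq implementation, and the function names `parse_fastq` and `filter_pairs` are hypothetical. It simply assumes that a read pair is kept only when both mates have the target length and that duplicates are pairs with identical mate sequences.

```python
from typing import Iterable, Iterator, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_pairs(
    pairs: Iterable[Tuple[Read, Read]], length: int = 150
) -> Iterator[Tuple[Read, Read]]:
    """Keep pairs whose mates are both exactly `length` nt long and
    drop any pair whose two sequences have already been seen
    (an assumed, simplified definition of a PCR duplicate)."""
    seen = set()
    for r1, r2 in pairs:
        if len(r1[1]) != length or len(r2[1]) != length:
            continue
        key = (r1[1], r2[1])
        if key in seen:
            continue
        seen.add(key)
        yield r1, r2
```

In use, two FASTQ files would be parsed in parallel, for example `filter_pairs(zip(parse_fastq(f1), parse_fastq(f2)))`. Holding every sequence pair in memory is only practical for small test sets, not for a library of this size.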
RNA extraction and RNAseq
Total RNA was isolated from two-week-old 16c seedlings using the PureLink® RNA Mini Kit as per manufacturer's protocol with an on-column DNase treatment. RNA integrity was checked on a 1% TAE gel prior to library preparation. The NGS (next generation sequencing) library was prepared using the Illumina TruSeq® Stranded Total RNA with Ribo-Zero kit as per manufacturer’s protocol and processed on a HiSeq® (Illumina) housed at the Australian Genome Research Facility (AGRF).
Sanger sequencing of T-DNA insert in 16c
Primer pairs B+E and D+F (Table 1) were used to amplify the genomic DNA sequences flanking the T-DNA LB and RB. PCR amplifications were carried out as follows: an initial denaturation step at 95°C for 3 min, followed by 35 cycles of 95°C for 15 s, 55°C for 15 s, and 72°C for 4 min, followed by a final extension for 6 min at 72°C. Reactions contained 2G Robust HotStart ReadyMix (KAPA), 0.5 μM each of forward and reverse primer, and 10 ng of 16c DNA as template, with the volume made up to 20 μL with molecular grade water. The resulting PCR amplicons were gel excised and cloned into pGEM®-T Easy (Promega) using standard molecular techniques. Four clones were sequenced using Sanger sequencing.
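As a quick check on the cycling programme, the total programmed hold time can be computed from the step durations given above. The sketch below is illustrative only. The variable and function names are hypothetical, and the figure excludes ramping between temperatures, which depends on the thermocycler.

```python
# Step durations taken from the text, in seconds.
INITIAL_DENATURATION_S = 3 * 60       # 95°C for 3 min
CYCLES = 35
CYCLE_STEPS_S = (15, 15, 4 * 60)      # 95°C, 55°C, 72°C
FINAL_EXTENSION_S = 6 * 60            # 72°C for 6 min


def total_hold_time_s(initial, cycles, cycle_steps, final):
    """Sum of programmed hold times; ramp times are not included."""
    return initial + cycles * sum(cycle_steps) + final


total_s = total_hold_time_s(
    INITIAL_DENATURATION_S, CYCLES, CYCLE_STEPS_S, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 60:.1f} min")  # 9990 s = 166.5 min
```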
Table 1. List of primers used in this study.
Primer ID | Primer sequence | Description |
---|---|---|
A | AGGAATATATGTTGGGTTTGAATC | N. benthamiana flanking RB in 16c Fwd2 |
B | AATTCTGGAAATATCAAAGGTG | N. benthamiana flanking RB in 16c Fwd1 |
C | GCTTAGCTCATTAAACTCCAGA | NOS promoter Rev |
D | CTCGGCCACAAGTTGGAATA | Internal primer for mGFP5-ER Fwd |
E | GGTCTTGAAGTTGGCTTTGATG | Internal primer for mGFP5-ER Rev |
F | CACTTGTGAGGGGAGAATATAA | N. benthamiana flanking LB in 16c Rev |
To ascertain the scaffold into which the T-DNA had integrated, primer pair A+C (Table 1) was used to generate a 1.7 kb amplicon, which was analysed by Sanger sequencing as previously described.
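Basic properties of the primers in Table 1 can be tabulated with a short Python sketch. The sequences are copied from Table 1. The helper names are hypothetical, and the Wallace rule used here, Tm ≈ 2(A+T) + 4(G+C), is only a rough estimate for short oligonucleotides. It is not stated to be the method used to design these primers.

```python
# Primer sequences as listed in Table 1 (5' to 3').
PRIMERS = {
    "A": "AGGAATATATGTTGGGTTTGAATC",
    "B": "AATTCTGGAAATATCAAAGGTG",
    "C": "GCTTAGCTCATTAAACTCCAGA",
    "D": "CTCGGCCACAAGTTGGAATA",
    "E": "GGTCTTGAAGTTGGCTTTGATG",
    "F": "CACTTGTGAGGGGAGAATATAA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (°C) by the Wallace rule."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, i.e. the sequence of the opposite strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), f"{gc_fraction(seq):.2f}", wallace_tm(seq))
```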
Results
Genomic location of the T-DNA
Genomic DNA was extracted from young leaves of the 16c line and used to generate a genomic DNA library. This was sequenced on an Illumina NextSeq® 500 machine and, after trimming and filtering, produced 212,397,922 paired end reads. The reads were aligned to the sequences of both pBin-35S-mGFP5 and pBin-35S-mGFP5-ER. This revealed that, in contrast to the original report [4], the GFP gene within the T-DNA encodes a protein possessing both an ER targeting signal peptide and an ER retention signal. These signal-encoding sequences and all but one nucleotide (C->T, H46Y) in the body of the GFP gene are identical to those of mGFP5-ER [3]. Many of the reads aligned perfectly with the transfer region of the Ti plasmid and formed one contig, but none aligned with the left border (LB) or right border (RB) sequences (Fig 1). Furthermore, neither terminus of the contig had T-DNA-like sequences. At the potential RB and LB ends are 115 nt and 108 nt of non-T-DNA sequence, respectively. A BLAST comparison of the 115 nt sequence with the genome assembly of the LAB strain (www.benthgenome.com) produced many alignments but only one uninterrupted 100% match, which was with a region of Nbv0.5scaffold4905. The 108 nt sequence gave no significant alignments in a BLAST search of the N. benthamiana genome but produced several good alignments with bacterial sequences in the NCBI non-redundant database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). To investigate this further, two primer pairs were used to amplify regions of the 16c genome (Fig 1D). The first pair used one primer binding within the 115 nt sequence (primer B) and the other within the mGFP5-ER gene (primer E). The second pair used one primer (primer F) designed to bind to a region 175 nt downstream of the primer B site on Nbv0.5scaffold4905 and the other (primer D) binding within the GFP gene. Primers B+E produced an amplicon of 4.17 kb, and primers D+F gave a 4.17 kb product (Fig 1F). These were cloned and sequenced.
The B+E amplicon sequence matched the expected region of the contig sequence but the D+F product was much larger than anticipated and contained 3.2 kb of sequence matching a bacterial transposon (Fig 1B).
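The search for a single uninterrupted 100% match can be sketched in Python as an exact-match search on both strands. This is not how BLAST works, since BLAST performs heuristic local alignment. The function names and the toy scaffold sequences in the usage note are hypothetical.

```python
from typing import Dict, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_occurrences(query: str, target: str) -> int:
    """Count exact, possibly overlapping, occurrences of query in target."""
    count, start = 0, target.find(query)
    while start != -1:
        count += 1
        start = target.find(query, start + 1)
    return count


def exact_hits(query: str, scaffolds: Dict[str, str]) -> Dict[str, int]:
    """Map scaffold name to the number of perfect matches on either strand.
    Scaffolds without a perfect match are omitted."""
    query = query.upper()
    rc = reverse_complement(query)
    hits = {}
    for name, seq in scaffolds.items():
        seq = seq.upper()
        n = count_occurrences(query, seq)
        if rc != query:  # avoid double counting palindromic queries
            n += count_occurrences(rc, seq)
        if n:
            hits[name] = n
    return hits


def unique_perfect_match(query: str, scaffolds: Dict[str, str]) -> Optional[str]:
    """Return the scaffold name if exactly one perfect match exists
    across all scaffolds, otherwise None."""
    hits = exact_hits(query, scaffolds)
    if len(hits) == 1:
        name, n = next(iter(hits.items()))
        if n == 1:
            return name
    return None
```

For example, `unique_perfect_match("AAC", {"s1": "GGAACGG", "s3": "TTTT"})` returns `"s1"`. Adding a scaffold that carries the reverse complement, such as `"CCGTTCC"`, makes the match ambiguous and the function returns `None`.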
Although the 115 nt sequence matches perfectly with Nbv0.5scaffold4905, it also has near-perfect matches with >100 different scaffolds in both the benthgenome.com and solgenomics.net assemblies. Therefore, a primer (primer A) binding ~1.5 kb upstream of the primer B site on Nbv0.5scaffold4905 was synthesised and used in conjunction with a primer binding to the NOS promoter (primer C) to test for and amplify a longer genomic region (Fig 1D). The sequence of the 5’ 1.5 kb of the 1.7 kb amplicon matched perfectly with Nbv0.5scaffold4905 but poorly with all other scaffolds. This unequivocally identifies Nbv0.5scaffold4905 as representative of the region of the 16c genome harbouring the mGFP5-ER T-DNA insert. The results also show that 57 nt has been lost from the plant’s genome during T-DNA integration.
Interrogating the solgenomics.net assembly using BLAST with this 1.5 kb sequence produced multiple hits on the Niben101Scf03641 scaffold. Pairwise comparison of the sequences produced two regions of almost perfect alignment (1008 nt and 523 nt), separated by a 41 nt region that appears to be due to an insertion in Niben101Scf03641 (Fig 2). The region of N. benthamiana sequence surrounding the T-DNA insertion site (115+57+108 nt) matches many scaffolds and sometimes many times within a scaffold. This suggests that the sequence is distributed widely throughout the genome. Its repeated presence is also evident from the great apparent depth of coverage for the sequences adjacent to the T-DNA insertion (Fig 1C). If they were only present at this location, their depth of coverage should be similar to that for the T-DNA (i.e. ~50x). However, their apparent depth is approximately 60 times greater (~3000x).
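The depth-of-coverage argument amounts to a simple ratio, shown in the sketch below. The depths are the approximate values quoted above. The function name is hypothetical, and treating the T-DNA as a single-copy reference is the assumption made in the text.

```python
def apparent_copy_number(flank_depth: float, single_copy_depth: float) -> float:
    """Estimate how many genomic copies of a sequence exist, relative to a
    region assumed to be present once (here, the T-DNA)."""
    if single_copy_depth <= 0:
        raise ValueError("single_copy_depth must be positive")
    return flank_depth / single_copy_depth


T_DNA_DEPTH = 50      # approximate depth over the T-DNA (~50x)
FLANK_DEPTH = 3000    # approximate depth over the flanking sequence (~3000x)

print(apparent_copy_number(FLANK_DEPTH, T_DNA_DEPTH))  # 60.0
```

A ratio near 1 would indicate a sequence unique to the insertion site, whereas the observed ratio of about 60 is consistent with a sequence repeated many times in the genome.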
To investigate the expression of genes inserted at this locus, an RNAseq library was prepared from 16c seedlings. The resulting NGS library contained 40,940,956 100 bp single end reads which were aligned to the T-DNA+transposon sequence. This confirmed the high expression level of mGFP5-ER compared to the NPTII gene (Fig 1E).
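When comparing read support for two genes of different lengths, such as mGFP5-ER and NPTII, counts are usually normalised by feature length. The sketch below shows one common normalisation (reads per kilobase per million reads). It is not stated to be the method used here. The function names are hypothetical, and using the full library size as the denominator is an assumption.

```python
TOTAL_READS = 40_940_956  # single end reads in the RNAseq library (from the text)


def rpkm(read_count: int, feature_length_bp: int,
         total_reads: int = TOTAL_READS) -> float:
    """Reads per kilobase of feature per million reads in the library."""
    return read_count * 1e9 / (feature_length_bp * total_reads)


def expression_ratio(count_a: int, length_a: int,
                     count_b: int, length_b: int) -> float:
    """Length-normalised ratio of gene A to gene B within one library.
    The library size cancels, so only counts and lengths are needed."""
    return (count_a / length_a) / (count_b / length_b)
```

Read counts and gene lengths would be taken from the alignment to the T-DNA+transposon sequence. No such values are given in the text, so none are assumed here.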
Unusual co-integration of T-DNA with a partial transposon sequence
A de novo genome assembly was constructed using the 16c reads. This produced a contig containing the same pBIN19-mGFP5-ER T-DNA sequence that we found by mapping the 16c raw reads onto the plasmid sequence. Assuming that the T-DNA cut sites are between the third and fourth nucleotide of the 25 bp border repeat sequence [9], there is a 56 bp T-DNA truncation at the RB and a 444 bp truncation at the LB. There is also a co-integrated partial transposon sequence (Fig 1B). This sequence appears to be part of the class II Tn3-type transposable element, Tn5393 (GenBank Accession No. M96392.1), and contains its 5' inverted terminal repeat (IR), transposase (tnpA) gene, recombination region (res), and 128 nt of its resolvase (tnpR) gene. While an intact Tn5393 transposon also harbours two streptomycin resistance genes, an insertion element and a 3' IR [10] (Fig 1A), these components are not integrated into 16c either in Nbv0.5scaffold4905 or elsewhere in the genome, as their sequences are not present in the entire 16c read library. Searching the genome of non-transgenic N. benthamiana failed to detect any Tn5393 sequences but they could be found in the fire blight bacterium, Erwinia amylovora [10] and in plasmids borne by a range of other bacteria [10–12]. This suggests that the transposon sequences adjacent to the T-DNA integration in 16c have been translocated from a bacterial source. While A. tumefaciens strains GV3101 (MP90) and EHA105 do not possess this transposon, it is present in strain LBA4404 [13].
The original description of the generation of N. benthamiana line 16c [4] reported a simple 3:1 inheritance of the GFP transgene, from a heterozygous parent, and a single band in genomic blots. These results are indicative of a single T-DNA insertion locus. The absence of N. benthamiana/T-DNA (+Tn5393) junctions in the 16c library sequence reads, other than the two in Nbv0.5scaffold4905, is consistent with this interpretation. Furthermore, there were no read sequences that suggest a complex integration of more than one copy of the T-DNA at the same locus. The complete sequence of the 16c T-DNA+partialTn5393 locus has been lodged at GenBank with the accession number KY464890 and it is also available on a dedicated page of benthgenome.com.
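A 3:1 segregation ratio of this kind is conventionally assessed with a chi-square goodness-of-fit test. The sketch below illustrates the calculation and is not taken from the original report [4]. The function names and the progeny counts in the usage note are hypothetical. The critical value 3.841 corresponds to one degree of freedom at a significance level of 0.05.

```python
CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05


def chi_square_3_to_1(n_gfp: int, n_non_gfp: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = n_gfp + n_non_gfp
    expected = (0.75 * total, 0.25 * total)
    observed = (n_gfp, n_non_gfp)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def consistent_with_single_locus(n_gfp: int, n_non_gfp: int) -> bool:
    """True if the counts do not differ significantly from 3:1."""
    return chi_square_3_to_1(n_gfp, n_non_gfp) < CHI2_CRITICAL_DF1_P05
```

With hypothetical counts of 75 GFP-expressing and 25 non-expressing progeny the statistic is 0, whereas 60 and 40 give a statistic of 12 and the 3:1 hypothesis would be rejected.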
Discussion
Over the last decade, Arabidopsis thaliana and Oryza sativa have been the popular model species for studying molecular mechanisms in plant biology. Both are diploid, have fully sequenced small genomes, are easy to transform and have a broad range of available mutants. The T-DNA insertion lines of A. thaliana, in particular, have been very useful. N. benthamiana has been used extensively for transient gene expression in the study of many mechanisms, but much less so using stable transgenesis because of the complexity of its allopolyploid genome, lack of genomic sequence information, and the dearth of well-defined mutants [14]. However, there is now considerable interest in how mechanisms operate in polyploid genomes, as most crop species are polyploids. There are also two independent draft genome assemblies of N. benthamiana (benthgenome.com and solgenomics.net), from Australian and American research groups, respectively, and CRISPR-Cas technology appears capable of efficiently generating targeted N. benthamiana mutants [15, 16]. The GFP gene in 16c has been used in many studies including those on the induction and maintenance of epigenetic changes [17, 18], and as a reporter for post-transcriptional gene silencing [19]. However, many of these assays have relied on incomplete or inaccurate details of the GFP locus. In order to furnish researchers with more accurate details for future studies utilising 16c, we determined the sequence and location of this locus and discovered some unexpected details. We confirmed, at a sequence level, the assertion that the GFP transgene is in a single locus and that it is highly expressed when compared to the NPTII selectable marker gene. We determined that the GFP transgene has the ER targeting and retention signals of mGFP5-ER and not the untargeted GFP design of mGFP5, as reported in the original description [4].
We also identified a sequence variation that alters an amino acid residue within the GFP protein; this change is not among those reported to have arisen in mGFP5-ER from mGFP4-ER.
Interestingly, when mapping the T-DNA to Nbv0.5scaffold4905 and to Niben101Scf03641, we found a 41 nt insertion in the solgenomics.net assembly. This insertion may reflect either a misassembly or a genuine difference between the two isolates. If it is the latter case, this suggests that the background of 16c is more similar to the Australian N. benthamiana LAB isolate than to the one used by the American research group [20].
The most surprising finding was that a large section of an alien transposon, including its tnpA gene, has been inserted adjacent to the T-DNA at its integration site. This transposon, Tn5393, is present in plasmids borne by a range of bacterial species [10–12] including some A. tumefaciens strains [13, 21]. A recent study [13] of over 300 transgenic rice lines, transformed using the LBA4404 strain of A. tumefaciens, found this transposon to have co-integrated into the genome of 26% of them. The Tn5393 sequence is often integrated immediately adjacent to the T-DNA and often as an incomplete copy. This almost exactly mirrors the situation in 16c. However, the A. tumefaciens strain reported [4] to have been used to create 16c was GV3101 (MP90), which does not harbour the transposon sequence [13]. Perhaps LBA4404 or some other Tn5393-bearing A. tumefaciens strain was mistakenly used for the transformation, or was contaminating the GV3101 (MP90) culture that was used.
The 16c line of N. benthamiana has been employed widely to track the systemic mobility of gene silencing, after local induction (Fig 1G), and to assay suppression of gene silencing by potential viral suppressor proteins [22, 23]. Moreover, plant transposons are often the target of transcriptional and post-transcriptional gene silencing [24, 25]. This raises the possibility that the bacterial transposon sequence adjacent to the GFP gene in 16c has an enhancing effect on the plant’s silencing response, once it is triggered. This could explain why 16c is such a sensitive epigenetic research tool. Until now, researchers have been unaware of 16c’s integrated bacterial transposon sequence. We envisage that future studies using 16c will not only monitor the GFP gene but also examine the chromatin status of the adjacent transposon sequence. Indeed, if the sequence is found to enhance the induced silencing reporter system, it could become a blueprint for the design of similar reporter systems in other species. Furthermore, with the current regulatory climate requiring clean, well-documented transgenic insertions in commercialised GMO crops, we suggest that it would be prudent to assay for this transposon to ensure that it has not been unwittingly introduced into transgenic lines destined for commercial release.
Acknowledgments
The authors thank Professor Jacqueline Batley, Dr Satomi Hayashi, Ms Sahana Manoli, Mr Vincent Chand and staff at AGRF for their assistance with Illumina and Sanger sequencing.
Data Availability
Data are deposited in GenBank (GenBank IDs are given in the paper).
Funding Statement
This work was supported by an ARC Laureate Fellowship (FL160100155).
References
- 1. Jefferson RA, Kavanagh TA, Bevan MW. GUS fusions: beta-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. The EMBO Journal. 1987;6(13):3901–7.
- 2. Haseloff J, Siemering KR, Prasher DC, Hodge S. Removal of a cryptic intron and subcellular localization of green fluorescent protein are required to mark transgenic Arabidopsis plants brightly. Proceedings of the National Academy of Sciences. 1997;94(6):2122–7.
- 3. Haseloff J, Siemering KR. The Uses of Green Fluorescent Protein in Plants. In: Green Fluorescent Protein. John Wiley & Sons, Inc.; 2005. p. 259–84.
- 4. Ruiz MT, Voinnet O, Baulcombe DC. Initiation and maintenance of virus-induced gene silencing. The Plant Cell. 1998;10(6):937–46.
- 5. Naim F, Nakasugi K, Crowhurst RN, Hilario E, Zwart AB, Hellens RP, et al. Advanced engineering of lipid metabolism in Nicotiana benthamiana using a draft genome and the V2 viral silencing-suppressor protein. PLoS ONE. 2012;7(12):e52717. doi:10.1371/journal.pone.0052717
- 6. Nakasugi K, Crowhurst RN, Bally J, Wood CC, Hellens RP, Waterhouse PM. Transcriptome Sequence Assembly and Analysis of RNA Silencing Genes of Nicotiana benthamiana. PLoS ONE. 2013;8(3):e59534. doi:10.1371/journal.pone.0059534
- 7. Bally J, Nakasugi K, Jia F, Jung H, Ho SYW, Wong M, et al. The extremophile Nicotiana benthamiana has traded viral defence for early vigour. Nature Plants. 2015;1.
- 8. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England). 2014;30(15):2114–20.
- 9. Wang K, Stachel SE, Timmerman B, Marc VM, Zambryski PC. Site-specific nick in the T-DNA border sequence as a result of Agrobacterium vir Gene Expression. Science (New York, NY). 1987;235(4788):587–91.
- 10. Chiou C, Jones AL. Nucleotide sequence analysis of a transposon (Tn5393) carrying streptomycin resistance genes in Erwinia amylovora and other gram-negative bacteria. Journal of Bacteriology. 1993;175(3):732–40.
- 11. Sundin GW, Monks DE, Bender CL. Distribution of the streptomycin-resistance transposon Tn5393 among phylloplane and soil bacteria from managed agricultural habitats. Canadian Journal of Microbiology. 1995;41(9):792–9.
- 12. L'Abée-Lund TM, Sørum H. Functional Tn5393-Like Transposon in the R Plasmid pRAS2 from the Fish Pathogen Aeromonas salmonicida subspecies salmonicida Isolated in Norway. Applied and Environmental Microbiology. 2000;66(12):5533–5.
- 13. Kim S-R, An G. Bacterial transposons are co-transferred with T-DNA to rice chromosomes during Agrobacterium-mediated transformation. Molecules and Cells. 2012;33(6):583–9. doi:10.1007/s10059-012-0010-9
- 14. Goodin MM, Zaitlin D, Naidu RA, Lommel SA. Nicotiana benthamiana: Its History and Future as a Model for Plant-Pathogen Interactions. Molecular Plant-Microbe Interactions. 2008;21(8):1015–26. doi:10.1094/MPMI-21-8-1015
- 15. Li J-F, Norville JE, Aach J, McCormack M, Zhang D, Bush J, et al. Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nature Biotechnology. 2013;31(8):688–91. doi:10.1038/nbt.2654
- 16. Nekrasov V, Staskawicz B, Weigel D, Jones JDG, Kamoun S. Targeted mutagenesis in the model plant Nicotiana benthamiana using Cas9 RNA-guided endonuclease. Nature Biotechnology. 2013;31(8):691–3. doi:10.1038/nbt.2655
- 17. Dadami E, Dalakouras A, Zwiebel M, Krczal G, Wassenegger M. An endogene-resembling transgene is resistant to DNA methylation and systemic silencing. RNA Biology. 2014;11(7):934–41. doi:10.4161/rna.29623
- 18. Dadami E, Moser M, Zwiebel M, Krczal G, Wassenegger M, Dalakouras A. An endogene-resembling transgene delays the onset of silencing and limits siRNA accumulation. FEBS Letters. 2013;587(6):706–10. doi:10.1016/j.febslet.2013.01.045
- 19. Baumberger N, Tsai C-H, Lie M, Havecker E, Baulcombe DC. The polerovirus silencing suppressor P0 targets ARGONAUTE proteins for degradation. Current Biology. 2007;17(18):1609–14. doi:10.1016/j.cub.2007.08.039
- 20. Bombarely A, Rosli HG, Vrebalov J, Moffett P, Mueller LA, Martin GB. A Draft Genome Sequence of Nicotiana benthamiana to Enhance Molecular Plant-Microbe Biology Research. Molecular Plant-Microbe Interactions. 2012;25(12):1523–30. doi:10.1094/MPMI-06-12-0148-TA
- 21. Henkel CV, den Dulk-Ras A, Zhang X, Hooykaas PJJ. Genome Sequence of the Octopine-Type Agrobacterium tumefaciens Strain Ach5. Genome Announcements. 2014;2(2).
- 22. Fusaro AF, Correa RL, Nakasugi K, Jackson C, Kawchuk L, Vaslin MFS, et al. The Enamovirus P0 protein is a silencing suppressor which inhibits local and systemic RNA silencing through AGO1 degradation. Virology. 2012;426(2):178–87. doi:10.1016/j.virol.2012.01.026
- 23. Takeda A, Sugiyama K, Nagano H, Mori M, Kaido M, Mise K, et al. Identification of a novel RNA silencing suppressor, NSs protein of Tomato spotted wilt virus. FEBS Letters. 2002;532(1–2):75–9.
- 24. Nuthikattu S, McCue AD, Panda K, Fultz D, DeFraia C, Thomas EN, et al. The initiation of epigenetic silencing of active transposable elements is triggered by RDR6 and 21–22 nucleotide small interfering RNAs. Plant Physiology. 2013;162(1):116–31. doi:10.1104/pp.113.216481
- 25. Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nature Reviews Genetics. 2007;8(4):272–85. doi:10.1038/nrg2072