eLife. 2018 Mar 9;7:e32110. doi: 10.7554/eLife.32110

Figure 1. Overview of RB-TDNAseq and T-DNA insert density in R. toruloides coding regions.

(A) General strategy of RB-TDNAseq. A library of binary plasmids bearing an antibiotic resistance cassette (NATR) and a random 20 base-pair sequence ‘barcode’ (N20) flanked by specific priming sites (P1/P2) is introduced into a population of A. tumefaciens carrying a vir helper plasmid. A. tumefaciens efficiently transfers the T-DNA fragment into the target fungus (A. tumefaciens-mediated transformation, ATMT). NATR colonies are then combined to make a mutant pool. T-DNA-genome junctions are sequenced by TnSeq, thereby associating each barcode with the location of its insertion (Map). The mutant pool is then cultured under specific conditions, and the relative abundance of each mutant strain is measured by sequencing a short, specific PCR product amplified from the barcodes (BarSeq) and counting the occurrences of each barcode sequence (Count). Finally, for each gene, count data are combined across all barcodes mapping to insertions in that gene to obtain a robust measure of relative fitness for strains bearing mutations in that gene (Fitness Estimation). (B) Histogram of insert density in coding regions (start codon to stop codon) for all genes and for genes with orthologs reported to be essential in A. nidulans, C. neoformans, N. crassa, S. cerevisiae, or S. pombe. The following figure supplements are available for Figure 1.
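The Count and Fitness Estimation steps in panel A can be illustrated with a minimal Python sketch. It assumes barcodes have already been mapped to genes and counted in a reference sample and in a condition sample. The function name `gene_fitness`, the pseudocount, and the median centering are illustrative assumptions, not the procedure used in the paper.

```python
import math
from collections import defaultdict
from statistics import median


def gene_fitness(barcode_to_gene, ref_counts, cond_counts, pseudocount=1.0):
    """Return {gene: centered log2 ratio} from per-barcode read counts.

    Hypothetical simplification: counts for all barcodes mapped to a gene
    are summed, a log2(condition / reference) ratio is taken with a
    pseudocount, and the median across genes is subtracted so that a
    typical gene scores 0.
    """
    ref_sum = defaultdict(float)
    cond_sum = defaultdict(float)
    for barcode, gene in barcode_to_gene.items():
        # Barcodes never observed in a sample contribute zero reads.
        ref_sum[gene] += ref_counts.get(barcode, 0)
        cond_sum[gene] += cond_counts.get(barcode, 0)

    raw = {
        gene: math.log2((cond_sum[gene] + pseudocount) / (ref_sum[gene] + pseudocount))
        for gene in ref_sum
    }
    if not raw:
        return {}
    center = median(raw.values())
    return {gene: value - center for gene, value in raw.items()}
```

Under these assumptions, a gene whose barcodes drop in abundance relative to the reference receives a negative score, and a gene whose barcodes keep pace with the rest of the pool scores near zero.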

Figure 1—figure supplement 1. Schematic of TnSeq and BarSeq libraries generated using RB-TDNAseq.

(A) In the TnSeq protocol, genomic DNA is sheared into ~300 bp fragments, and Illumina TruSeq adapters are ligated to both ends. T-DNA junctions are then specifically enriched by PCR with a T-DNA-specific primer and an adapter-specific primer. (B) In the BarSeq protocol, genomic DNA is used as a template for a more robust and quantitative PCR on the barcoded region of the T-DNA insert. Phasing error caused by the identical T-DNA sequences flanking the random barcodes was reduced by adding sequence diversity at the beginning of each read: a random 6 bp sequence for TnSeq and a random 2–4 bp sequence for BarSeq.
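Because the BarSeq reads described in panel B begin with a random 2–4 bp stretch, the barcode does not start at a fixed offset in the read. A minimal Python sketch of one way to recover it is shown here: anchor on the two priming sites and capture the 20 bases between them. The flanking sequences used in the example call are placeholders, not the actual P1/P2 sites, and the function names are hypothetical.

```python
import re


def make_barcode_pattern(p1, p2, barcode_len=20, min_phase=2, max_phase=4):
    """Compile a regex for: phasing prefix + P1 + barcode + P2.

    The prefix length range (2-4 bp) follows the BarSeq description;
    p1 and p2 must be supplied by the caller.
    """
    return re.compile(
        rf"^[ACGTN]{{{min_phase},{max_phase}}}"
        rf"{re.escape(p1)}([ACGT]{{{barcode_len}}}){re.escape(p2)}"
    )


def extract_barcode(read, pattern):
    """Return the barcode from a read, or None if the read does not fit."""
    match = pattern.match(read.upper())
    return match.group(1) if match else None


# Placeholder flanks for illustration only (assumed, not from the paper).
EXAMPLE_P1 = "GATTACAGGC"
EXAMPLE_P2 = "CCTGATCGAA"
EXAMPLE_PATTERN = make_barcode_pattern(EXAMPLE_P1, EXAMPLE_P2)
```

Reads whose prefix falls outside the expected length range, or whose barcode contains an ambiguous base, return `None` and can simply be dropped before counting.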
Figure 1—figure supplement 2. Complexities of T-DNA insertions.

(A) Inferred topology of T-DNA insertions from associations of barcodes and adjacent genomic or T-DNA sequence. Only three of the observed insertion types could be mapped using the TnSeq protocol. (B) Sanger sequencing of barcodes from single colonies isolated from the pool. Multiple overlapping peaks in the barcode region suggest that multiple T-DNAs are present in a single strain. Note that these T-DNAs may be integrated at the same locus or at different loci. Inherent noise in barcode amplification and sequencing introduces significant ambiguity in this analysis, so the inferred rate of multiple barcode insertion (29%) should be considered a maximum estimate.
Figure 1—figure supplement 3. Observed biases in T-DNA insertion locations.

(A) Frequency of T-DNA insertion mapping was consistent across all 30 IFO 0880 scaffolds. (B) Histogram of GC content in 100 base pair regions flanking insertion sites and in random 100 base pair regions. (C) Proportion of the R. toruloides IFO 0880 genome in promoter regions, terminator regions, untranslated regions transcribed to mRNA, coding exons, and introns versus the proportion of T-DNA insertions mapped to those sequences. (D) Distribution of T-DNA insertion density across the length of scaffold 1. Total inserts were summed across a rolling 1,000 base pair window, both for the observed insertions and for a simulated random mutant pool that assumes biases for insertion in promoters, terminators, and untranslated transcribed regions.
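The window-based summaries in panels B and D can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, insertion positions are taken to be 0-based coordinates on one scaffold, and the "100 base pair region flanking an insertion site" is interpreted as a window centered on the site, which is only one possible reading of the caption.

```python
from bisect import bisect_left


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def flanking_gc(scaffold_seq, site, width=100):
    """GC fraction of a window of `width` bp centered on an insertion site.

    The window is clipped at the scaffold ends (an assumption).
    """
    half = width // 2
    start = max(0, site - half)
    end = min(len(scaffold_seq), site + half)
    return gc_fraction(scaffold_seq[start:end])


def rolling_insert_counts(positions, scaffold_len, window=1000, step=1):
    """Count insertions in each window [start, start + window)."""
    ordered = sorted(positions)
    counts = []
    for start in range(0, scaffold_len - window + 1, step):
        lo = bisect_left(ordered, start)
        hi = bisect_left(ordered, start + window)
        counts.append(hi - lo)
    return counts
```

The same `rolling_insert_counts` function could be applied to simulated insertion positions to produce a comparison track like the one described for panel D; how those positions are simulated is left open here.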