Fig. 1.
Schematic overview of the UNAGI pipeline. Reads from the ONT MinION are first stranded by searching for poly(A) or poly(T) tails at their ends and are separated into two files, sense and antisense. These reads are then mapped to the genome using Minimap2, and their sequences are corrected against the genome. From these results, spikes and drops in coverage, as well as spikes in the number of 5′ or 3′ sites, are identified as transcriptional unit boundaries. The reads are also parsed for splicing information and for long open reading frames (ORFs), allowing the detection of isoforms. When several isoforms are discovered, only the major isoforms are annotated in the main output, while all isoforms are listed in specific outputs.
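
The stranding step described in the caption can be illustrated with a minimal sketch. This is not the UNAGI implementation: the function name `strand_read`, the tail length threshold `min_tail`, and the search `window` are hypothetical choices made for illustration, and the sketch assumes that a poly(A) run near the 3′ end marks a sense read while a poly(T) run near the 5′ end marks an antisense read.

```python
def strand_read(seq, min_tail=10, window=50):
    """Classify a read as 'sense', 'antisense', or 'unknown'.

    Assumptions (illustrative, not taken from UNAGI):
    - a run of at least `min_tail` A's within the last `window` bases
      suggests a sense read;
    - a run of at least `min_tail` T's within the first `window` bases
      suggests an antisense read;
    - reads with both or neither signal are left unassigned.
    """
    seq = seq.upper()
    has_poly_a = "A" * min_tail in seq[-window:]
    has_poly_t = "T" * min_tail in seq[:window]
    if has_poly_a and not has_poly_t:
        return "sense"
    if has_poly_t and not has_poly_a:
        return "antisense"
    return "unknown"


def split_by_strand(reads):
    """Group (name, sequence) pairs by inferred strand."""
    groups = {"sense": [], "antisense": [], "unknown": []}
    for name, seq in reads:
        groups[strand_read(seq)].append(name)
    return groups
```

In a real pipeline the two groups would be written to separate files before mapping. Tolerating sequencing errors inside the tail would need a fuzzier match than the exact homopolymer search used here.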
