2020 Feb;30(2):287–298. doi: 10.1101/gr.251512.119

Figure 2.

Flowchart of new isoform calling by TrackCluster. (A) Assignment of sequence tracks (gray bars with different numbers) to a given locus based on read mapping against the C. elegans genome. Existing isoforms are also included as individual tracks (black bars). Two reads that either show little overlap with any existing exon in a given locus or are antisense to it are excluded from subsequent analysis. (B) First round of track clustering based on distance scores (see Methods). (C) Read tracks (excluding existing transcripts) are merged if their distance scores satisfy our cutoff. From each group, only the track with the largest summed exon size is retained (indicated with “√”), along with the existing transcript (indicated with “#”), for subsequent isoform calling. The remaining tracks (indicated with “×”), including existing transcripts, are assigned as “subreads” and used only for expression quantification and boundary correction. Note that during track merging, a minor shift in an exon-intron boundary caused by read error (indicated with “*”; within a 5% change in “score 1” as defined in Methods) is permitted to avoid overcalling of novel isoforms. (D) The tracks retained from C are subjected to a second round of track clustering based on mutual distance scores (see “score 2” in Methods). (E) The tracks (including existing transcripts) are merged if their distances satisfy our cutoff, to avoid calling a novel isoform from a possibly partially degraded read retained in C, except for those starting with an SL. (F) Existing annotated (black) and novel (gray) isoforms after junction correction (see Supplemental Fig. S4). The retained track is called a novel isoform because its distance score with any existing transcript satisfies our cutoff. (G–I) Schematic representation of each category of newly identified isoform. Novel isoforms involving a newly defined 5′ and/or 3′ end. “5′ and/or 3′ extra or missing” is defined as a novel isoform with an extra or missing exon at either or both ends relative to an existing transcript. “UTR extension” or “UTR truncation” is defined as a novel isoform involving changes only in the UTR relative to an existing transcript. (J–M) Novel isoforms involving exon changes within the gene body. Note that straightforward assignment of an exon combination constitutes the main advantage of long reads (J).
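
The merging step in panel C can be illustrated with a minimal Python sketch. It is not the TrackCluster implementation. The track representation (lists of half-open exon intervals), the function names, the greedy grouping strategy, and the cutoff value are all assumptions made for illustration. In particular, `exon_overlap_distance` is a hypothetical stand-in for the distance scores defined in Methods, which are not reproduced here. The sketch shows only the retention rule stated in the caption: within each merged group, the track with the largest summed exon size is kept and the rest become subreads.

```python
from typing import Callable, List, Sequence, Tuple

Exon = Tuple[int, int]      # (start, end), half-open coordinates (assumed)
Track = Sequence[Exon]      # exons of one read or one annotated transcript


def summed_exon_size(track: Track) -> int:
    """Total number of bases covered by the exons of a track."""
    return sum(end - start for start, end in track)


def exon_overlap_distance(a: Track, b: Track) -> float:
    """Hypothetical placeholder distance, NOT the score from Methods.

    Returns 1 - (shared exonic bases / union of exonic bases).
    """
    bases_a = {p for start, end in a for p in range(start, end)}
    bases_b = {p for start, end in b for p in range(start, end)}
    union = bases_a | bases_b
    if not union:
        return 0.0
    return 1.0 - len(bases_a & bases_b) / len(union)


def merge_tracks(
    tracks: List[Track],
    distance: Callable[[Track, Track], float],
    cutoff: float,
) -> List[Tuple[Track, List[Track]]]:
    """Greedily group tracks and keep one representative per group.

    Tracks are visited from largest to smallest summed exon size, so the
    first member of every group is its largest track. A track joins the
    first group whose representative lies within `cutoff`; otherwise it
    starts a new group. Returns (retained_track, subreads) pairs.
    """
    groups: List[List[Track]] = []
    for track in sorted(tracks, key=summed_exon_size, reverse=True):
        for group in groups:
            if distance(group[0], track) < cutoff:
                group.append(track)
                break
        else:
            groups.append([track])
    return [(group[0], group[1:]) for group in groups]


# Illustrative tracks: t2 differs from t1 by a small boundary shift,
# t3 uses a different second exon.
t1 = [(0, 100), (200, 300)]
t2 = [(0, 100), (200, 295)]
t3 = [(0, 100), (400, 500)]

# The cutoff of 0.05 is arbitrary and chosen only for this example.
merged = merge_tracks([t1, t2, t3], exon_overlap_distance, cutoff=0.05)
for retained, subreads in merged:
    print(retained, "subreads:", subreads)
```

In this example `t2` is absorbed into the group represented by `t1`, while `t3` forms its own group. Passing the distance function as a parameter keeps the grouping logic separate from the score definition, so the same sketch could be reused with a different score for the second clustering round.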