LaSSO (Lariat Sequence Site Origin), an algorithm to build a lariat database, along with a workflow to identify lariat reads from RNA-seq data. (A) The algorithm pseudocode. LaSSO takes a given intron sequence of length L and uses the first (read length − 1) bases of this intron as the 3′-lariat segment (if the intron is shorter, the whole sequence is used). To generate the 5′-lariat segments for all possible lariat structures, LaSSO iterates over the intron, selecting one base at a time as the putative branch point. LaSSO works from the 3′ end of the intronic sequence toward the 5′ end until it reaches the first intronic base. For each branch point, LaSSO keeps only the last (read length − 1) bases of the 5′-lariat segment (again, if the segment is shorter, the whole sequence is used). LaSSO then concatenates the 5′ segment, the branch point, and the 3′ segment of the lariat sequence, yielding a diagnostic lariat signature. To generate all possible exon-skipping lariat sequences for a given transcript, the input sequence and algorithm were slightly altered. Briefly, for a gene with two introns and three exons, only a single skipping event can occur. Therefore, the input sequence is the upstream intron with the downstream intron attached to its 3′ end. To avoid database redundancy, the algorithm iterates L times, where L refers only to the length of the downstream intron, not to the combined introns. Thus, the 5′ segment of the skipping lariat sequence is generated from the downstream intron, while the 3′ segment of the skipping lariat always corresponds to the 5′ end of the upstream intron. For more than two introns, all possible skipping events are considered, i.e., Sn = (I − 1) × I/2 (I: number of introns; Sn: number of skipping events). (B) Scheme for all possible lariat signatures accounted for by LaSSO.
Intron excision results in diagnostic cDNA products upon reverse transcription, where the sequence upstream of the branch point precedes the 5′ end of the intron (resulting in 5′- and 3′-lariat segments, respectively). (Green) 5′-lariat segment from upstream intron; (red) 3′-lariat segment from upstream intron; (orange) 5′-lariat segment from downstream intron; (blue) 3′-lariat segment from downstream intron. (C) Lariat detection workflow (see main text for details).
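The procedure summarized in panel A can be illustrated with a minimal Python sketch. This is not the LaSSO source code: the function names (`lariat_signatures`, `skipping_signatures`, `skipping_event_count`) and the 0-based branch point index are hypothetical choices made for illustration. The sketch also assumes that the 5′-lariat segment excludes the branch point base itself and that, for exon-skipping lariats, the 5′ segment is confined to the downstream intron.

```python
from typing import List, Tuple


def _signatures(five_source: str, three_source: str,
                read_length: int) -> List[Tuple[int, str]]:
    """Build (branch_point_index, signature) pairs.

    five_source:  sequence scanned for putative branch points
    three_source: sequence whose 5' end supplies the 3'-lariat segment
    """
    if read_length < 2:
        raise ValueError("read_length must be at least 2")
    flank = read_length - 1
    # 3'-lariat segment: first (read length - 1) bases, or the whole
    # sequence if it is shorter (slicing handles this automatically).
    three_seg = three_source[:flank]
    out = []
    # Walk from the 3' end toward the 5' end, one base at a time.
    for bp in range(len(five_source) - 1, -1, -1):
        # 5'-lariat segment: at most the last (read length - 1) bases
        # upstream of the branch point (assumed to exclude the branch point).
        five_seg = five_source[max(0, bp - flank):bp]
        out.append((bp, five_seg + five_source[bp] + three_seg))
    return out


def lariat_signatures(intron: str, read_length: int) -> List[Tuple[int, str]]:
    """All single-intron lariat signatures, one per putative branch point."""
    return _signatures(intron, intron, read_length)


def skipping_signatures(upstream: str, downstream: str,
                        read_length: int) -> List[Tuple[int, str]]:
    """Exon-skipping signatures: branch point in the downstream intron,
    3' segment taken from the 5' end of the upstream intron."""
    return _signatures(downstream, upstream, read_length)


def skipping_event_count(n_introns: int) -> int:
    """Sn = (I - 1) * I / 2 possible skipping events for I introns."""
    return (n_introns - 1) * n_introns // 2
```

For a toy intron such as `"GTAAGTCTAACAG"` and a read length of 4, `lariat_signatures` returns one entry per base, 13 in total, starting with the branch point at the last base. Real introns and read lengths yield far larger databases, which is why each flank is capped at (read length − 1) bases.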