Expanding control datasets and enforcing read count thresholds improves filtering power when analyzing mis-splicing events
The number of splicing events identified decreases slightly as control size increases. Enforcing a read coverage threshold has a larger effect on event counts, particularly for singleton events, where filtering out events supported by a single read removes up to 95% of them. LCLs appear to exhibit the greatest number of splicing events regardless of read count filter, although this may reflect differences in sequencing depth between tissues. These data were generated from 2,000 bootstraps for control sizes of 30, 60, and 90 individuals. Outliers are data points lying more than 1.5 times the interquartile range below the 25th percentile or above the 75th percentile.
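
The filtering and outlier rule described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that an event is retained when it meets the read threshold and is absent from every sampled control, and that each bootstrap draws controls without replacement from a larger pool; the caption specifies neither. All function names, parameter names, and data structures here are hypothetical.

```python
import random
import statistics


def count_retained_events(sample_events, controls, min_reads=1):
    """Count events with enough reads that appear in none of the controls.

    sample_events: dict mapping event ID -> supporting read count (assumed layout).
    controls: list of sets of event IDs, one set per control individual.
    """
    seen_in_controls = set().union(*controls) if controls else set()
    return sum(
        1
        for event, reads in sample_events.items()
        if reads >= min_reads and event not in seen_in_controls
    )


def bootstrap_counts(sample_events, control_pool, control_size,
                     n_boot=2000, min_reads=1, seed=0):
    """Repeat the count over random control subsets of a fixed size.

    Assumption: controls are drawn without replacement within each draw.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_boot):
        controls = rng.sample(control_pool, control_size)
        counts.append(count_retained_events(sample_events, controls, min_reads))
    return counts


def tukey_outliers(values, k=1.5):
    """Return values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]
```

Under these assumptions, calling `bootstrap_counts` once per control size (30, 60, 90) and per read threshold would give the distributions summarized in the box plots, and `tukey_outliers` would flag the points drawn individually. Quartile conventions differ between plotting libraries, so the exact outlier set may vary slightly.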