TABLE 2.
Input features for the machine learning classifier
| Feature | Definition |
|---|---|
| Hits | No. of insertion sites within the ORF |
| Reads | No. of reads within the ORF |
| Hits in promoter | No. of hits within 100 bp upstream of ORF start codon |
| ORF length | Total length of ORF coding sequence (intron-free) |
| Insertion index<sup>a</sup> | No. of hits in the ORF divided by ORF length |
| Noncoding window<sup>a</sup> | Noncoding sequence (including introns) within 10 kb up- and downstream of ORF |
| Neighborhood index (NI) | Insertion index normalized to the noncoding window (hits divided by length) |
| Hit-free interval (HFI) | Length of longest insertion-free interval divided by ORF length |

<sup>a</sup> These features were input indirectly to calculate NI and HFI.
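
The derived features in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 0-based half-open coordinates, the reading of NI as the ORF insertion index divided by the hit density of the noncoding window, and the choice to count the gaps between the ORF boundaries and the first and last insertion as insertion-free intervals are all assumptions made for the example.

```python
from typing import Iterable


def insertion_index(hits: int, orf_length: int) -> float:
    """Number of hits in the ORF divided by ORF length (bp)."""
    return hits / orf_length


def neighborhood_index(orf_hits: int, orf_length: int,
                       window_hits: int, window_length: int) -> float:
    """Insertion index normalized to the noncoding window.

    Assumption: normalization means dividing the ORF insertion index by the
    hit density (hits / length) of the surrounding noncoding window.
    """
    window_density = window_hits / window_length
    if window_density == 0:
        # No insertions in the window: the ratio is undefined.
        return float("nan")
    return insertion_index(orf_hits, orf_length) / window_density


def hit_free_interval(positions: Iterable[int],
                      orf_start: int, orf_end: int) -> float:
    """Longest insertion-free interval divided by ORF length.

    Assumption: coordinates are 0-based and half-open [orf_start, orf_end),
    and the gaps between the ORF boundaries and the outermost insertions
    count as insertion-free intervals.
    """
    orf_length = orf_end - orf_start
    inside = sorted(p for p in positions if orf_start <= p < orf_end)
    bounds = [orf_start] + inside + [orf_end]
    longest = max(b - a for a, b in zip(bounds, bounds[1:]))
    return longest / orf_length
```

For a hypothetical 1,000-bp ORF with insertions at positions 100, 400, and 450, `hit_free_interval([100, 400, 450], 0, 1000)` finds the longest gap between position 450 and the ORF end (550 bp). An ORF with no insertions at all gets an HFI of 1.0 under these assumptions.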