Table 5. Gene models proposed by BRAKER2, before and after filtering.
Metric | Initial model set | Intermediate filtered set | High-confidence set |
---|---|---|---|
Total genes | 1,460,545 | 32,360 | 41,632 |
Average CDS length (bp) | 613.90 | 1,099.08 | 1,146.4 |
Average number of exons | 2.78 | 4.22 | 4.48 |
Average intron length (bp) | 2,362 | 2,233 | 3,894 |
Max intron length (bp) | 385,133 | 159,979 | 1,399,110 |
Total monoexonics | 941,659 | — | 5,165 |
Total multiexonics | 518,886 | 32,360 | 36,466 |
The intermediate set was produced by removing monoexonic models, models with more than 50% of their length in a masked region, models annotated as retrodomains, and models lacking a functional annotation from EnTAP. The high-confidence set comprises the intermediate set plus monoexonic and multiexonic models derived from transcript evidence, with any fully nested gene models removed.
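
The filtering criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce Table 5. The `GeneModel` record, its field names, and the helper functions are hypothetical. The sketch also assumes that "length" means the genomic span of the model, and that "fully nested" means a model lying entirely within a strictly longer model on the same sequence; the source does not specify either point.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneModel:
    # Hypothetical record; field names are illustrative, not from BRAKER2 or EnTAP.
    gene_id: str
    seqid: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    n_exons: int
    masked_bp: int  # bases of the model overlapping masked regions
    is_retrodomain: bool
    has_entap_annotation: bool

    @property
    def length(self) -> int:
        # Assumption: "length" is the genomic span of the model.
        return self.end - self.start + 1


def passes_intermediate_filter(m: GeneModel, max_masked_fraction: float = 0.5) -> bool:
    """Return True if a model survives the four intermediate-set criteria."""
    if m.n_exons < 2:  # monoexonic
        return False
    if m.masked_bp / m.length > max_masked_fraction:  # more than 50% masked
        return False
    if m.is_retrodomain:
        return False
    if not m.has_entap_annotation:
        return False
    return True


def intermediate_set(models: List[GeneModel]) -> List[GeneModel]:
    return [m for m in models if passes_intermediate_filter(m)]


def drop_fully_nested(models: List[GeneModel]) -> List[GeneModel]:
    """Remove models contained entirely within a strictly longer model
    on the same sequence (assumed definition of "fully nested")."""
    kept = []
    for m in models:
        nested = any(
            o is not m
            and o.seqid == m.seqid
            and o.start <= m.start
            and m.end <= o.end
            and o.length > m.length
            for o in models
        )
        if not nested:
            kept.append(m)
    return kept
```

The nested-model check compares every pair of models, which is adequate for illustration; a set of tens of thousands of models would more likely be handled with a sorted sweep or an interval tree.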