Table 2. LTR HMM and RepeatMasker cross correlation at high specificity.
LTR HMM | Threshold | HMM+ REP+ | HMM+ REP− | HMM− REP+ | Sensitivity | Specificity | Number of hits on 63 Mbp random sequence |
Hml | 5 | 395 | 18 | 57 | 0.87 | 0.96 | <1 |
Gamma | 5 | 391 | 159(20) | 602 | 0.39 | 0.71 | <1 |
Beta | 7 | 146 | 12 | 313 | 0.32 | 0.92 | <1 |
Lenti | 3 | 2 | 14 | 1556 | 0.00 | 0.13 | <1 |
General | 4 | 102 | 88 | 1452 | 0.07 | 0.54 | <1 |
Combined | - | 804 | 276 | 719 | 0.53 | 0.74 | - |
The table shows the number of LTRs detected by each LTR HMM, compared with the RepeatMasker output for LTRs of the same group, on chromosome 19 (63 million bp) of the human genome assembly hg15. The thresholds were chosen so that each HMM gives roughly the same number of additional positives (HMM+ REP−): 10–100. An algorithm for removal of CT-rich repeats was used, as described in Results. The number of false positives for runs of the five LTR HMMs on 63 million bp of random sequence is shown in the last column. The figure in parentheses (Gamma, HMM+ REP−) is the number of non-ERV1 elements.
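The caption does not say how the Sensitivity and Specificity columns are derived from the counts. The tabulated values are consistent with sensitivity = (HMM+ REP+) / (HMM+ REP+ + HMM− REP+) and specificity = (HMM+ REP+) / (HMM+ REP+ + HMM+ REP−), treating RepeatMasker as the reference; the latter quantity is often called positive predictive value elsewhere. The sketch below checks that reading against the table. The formulas are an inference from the numbers, not a definition given in the source, and the names `ROWS`, `sensitivity`, `specificity` and `matches_table` are illustrative only.

```python
# Counts and published ratios copied from Table 2.
# Tuple layout: (HMM+ REP+, HMM+ REP-, HMM- REP+, sensitivity, specificity)
ROWS = {
    "Hml":      (395,  18,   57, 0.87, 0.96),
    "Gamma":    (391, 159,  602, 0.39, 0.71),  # 159 is the full count, not the 20 in parentheses
    "Beta":     (146,  12,  313, 0.32, 0.92),
    "Lenti":    (  2,  14, 1556, 0.00, 0.13),
    "General":  (102,  88, 1452, 0.07, 0.54),
    "Combined": (804, 276,  719, 0.53, 0.74),
}


def sensitivity(both: int, hmm_only: int, rep_only: int) -> float:
    """Assumed definition: fraction of RepeatMasker LTRs also found by the HMM."""
    return both / (both + rep_only)


def specificity(both: int, hmm_only: int, rep_only: int) -> float:
    """Assumed definition: fraction of HMM hits confirmed by RepeatMasker."""
    return both / (both + hmm_only)


def matches_table(tolerance: float = 0.0051) -> bool:
    """True if every recomputed ratio agrees with the published two-decimal value.

    The tolerance is half a unit in the last printed decimal plus a little
    slack for floating-point error (Lenti specificity is exactly 0.125).
    """
    for both, hmm_only, rep_only, sens, spec in ROWS.values():
        if abs(sensitivity(both, hmm_only, rep_only) - sens) > tolerance:
            return False
        if abs(specificity(both, hmm_only, rep_only) - spec) > tolerance:
            return False
    return True


if __name__ == "__main__":
    for name, (both, hmm_only, rep_only, _, _) in ROWS.items():
        sens = sensitivity(both, hmm_only, rep_only)
        spec = specificity(both, hmm_only, rep_only)
        print(f"{name:9s} sensitivity={sens:.3f} specificity={spec:.3f}")
    print("consistent with table:", matches_table())
```

Under this reading, the Gamma specificity of 0.71 is reproduced only with the full HMM+ REP− count of 159, not with the parenthesized 20.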