Table 1.
Regular-RESs

| Cell lines | Tools | Known SNP (%) | Alu: Total | Alu: A-to-G (%) | Alu: Precision (%) | Alu: FDR (%) | Repetitive non-Alu: Total | Repetitive non-Alu: A-to-G (%) | Repetitive non-Alu: Precision (%) | Repetitive non-Alu: FDR (%) | Nonrepetitive: Total | Nonrepetitive: A-to-G (%) | Nonrepetitive: Precision (%) | Nonrepetitive: FDR (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GM12878, cell | SPRINT | 0 | 336 304 | 97.7 | 96.5 | – | 14 019 | 87.8 | 97.2 | – | 5407 | 49.8a | 96.8 | – |
| GM12878, cell | Ramaswami et al.* | 100 | 147 029 | 95.8 | – | – | 2385 | 97.4 | – | – | 1451 | 86.6 | – | – |
| GM12878, cytosolic | SPRINT | 0 | 359 725 | 98.9 | 96.9 | – | 5469 | 96 | 96.8 | – | 2081 | 75.9 | 95.5 | – |
| GM12878, cytosolic | GIREMI* | 70 | 36 131 | 99 | 99.4 | – | 267 | 83.7 | 84.3 | – | 1193 | 82.8 | 73.8 | – |
| GM12878, cytosolic | GIREMI* | 100 | 39 757 | 99.7 | – | – | 260 | 88.6 | – | – | 1010 | 73.5 | – | – |
| U87MG | SPRINT | 0 | 48 085 | 99.6 | 96.2 | 3.2 | 988 | 99.5 | 97.1 | 4.5 | 296 | 87.8 | 91.2 | 0 |
| U87MG | GIREMI | 100 | 2152 | 99.8 | – | 0.7 | 114 | 96.5 | – | 9 | 509 | 88.6 | – | 53 |
| U87MG | RNAEditor | 100 | 62 979 | – | – | 8.2 | 6142 | – | – | 42.3 | 155 | – | – | 55.5 |
| U87MG | REDItools (de novo) | 100 | 628 | 96.5 | – | 3.6 | 238 | 46.2 | – | 80 | 14 949 | 39.7 | – | 100 |
| U87MG | JACUSA (RRD) | 100 | 2154 | 94.7 | – | – | 331 | 39 | – | – | 4527 | 20.8 | – | – |

Hyper-RESs

| Cell lines | Tools | Total | A-to-G (%) |
|---|---|---|---|
| GM12878, cell | SPRINT | 328 762 | 97.9 |
| GM12878, cell | Porath et al.* | 157 077 | 96 |
| U87MG | SPRINT | 57 913 | 96 |
| U87MG | Porath et al.* | 27 124 | 94.6 |
Note: ‘*’ means the data are derived from the corresponding study. The details for running competing tools are described in Methods. ‘–’ means not assessed, for the following reasons. For A-to-G rate, RNAEditor only outputs A-to-G changes, so its rate cannot be assessed. For precision, in the U87MG dataset all methods except SPRINT call RESs by removing SNPs in dbSNP from the called SNVs, so their precision cannot be assessed. For FDR, only the U87MG dataset is used for assessment, because it includes an ADAR knockdown sample. Since JACUSA (RRD) compares the variants of RNA (CTRL) with those of RNA (KD), the FDR of JACUSA is unavailable.
a: The lower A-to-G rate is attributed to the presence of many clustered G-to-T changes, which may be sequencing artefacts of the RNA-seq data in the ENCODE project (see Supplementary Fig. S5 for details).
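
As a rough illustration of how the A-to-G (%) columns can be read, the sketch below assumes that the rate is simply the share of called sites whose substitution type is A-to-G. The function name `a_to_g_rate`, the two-letter substitution codes, and the toy input are hypothetical and are not taken from any of the tools in the table. The sketch also assumes that substitution types are already strand-resolved. It shows how a cluster of G-to-T calls, like the one described in the footnote, pulls the rate down.

```python
from collections import Counter


def a_to_g_rate(substitutions):
    """Return the percentage of called sites whose substitution is A-to-G.

    `substitutions` is an iterable of two-letter codes such as "AG" or "GT"
    (reference base followed by observed base); this encoding is an
    assumption made for the example.
    """
    counts = Counter(substitutions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no called sites")
    return 100.0 * counts["AG"] / total


# Toy input (not real data): 5 A-to-G calls, 4 G-to-T calls, 1 C-to-T call.
calls = ["AG"] * 5 + ["GT"] * 4 + ["CT"]
rate = a_to_g_rate(calls)
print(f"A-to-G rate: {rate:.1f}%")
```

In this toy input the four G-to-T calls alone account for 40% of the sites, so the A-to-G rate is far below what the same calls would give without them.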