2014 Nov 11;43(Database issue):D130–D137. doi: 10.1093/nar/gku1063

Table 1. Comparison of the old Rfam 11.0 BLAST and Infernal 1.0 search strategy versus the new Rfam 12.0 Infernal 1.1 search strategy for 15 of 200 randomly chosen families.

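| Accession | Family ID | Length (nt) | # of seed seqs | Time new (h) | Time old (h) | Time (old/new) | New total hits | Old total hits | New unique hits | Old unique hits |
|---|---|---|---|---|---|---|---|---|---|---|
| Top five families | | | | | | | | | | |
| RF00028 | Intron_gpI | 251 | 12 | 125.0 | 357.2 | 2.8 | 71 433 | 60 264 | 11 175 | 1 |
| RF00026 | U6 | 104 | 188 | 31.2 | 181.1 | 5.8 | 66 517 | 62 174 | 4367 | 14 |
| RF00003 | U1 | 166 | 100 | 11.6 | 64.0 | 5.5 | 15 770 | 14 867 | 904 | 1 |
| RF00162 | SAM | 108 | 433 | 8.3 | 590.0 | 70.8 | 4905 | 4797 | 108 | 0 |
| RF00050 | FMN | 140 | 144 | 17.1 | 169.9 | 23.9 | 4381 | 4306 | 76 | 1 |
| Middle five families | | | | | | | | | | |
| RF01426 | snoR126 | 101 | 4 | 40.3 | 7.3 | 0.2 | 78 | 66 | 12 | 0 |
| RF01252 | snR5 | 196 | 11 | 41.1 | 9.8 | 0.2 | 76 | 72 | 4 | 0 |
| RF00544 | snopsi28S-3327 | 143 | 14 | 11.3 | 15.1 | 1.3 | 75 | 74 | 1 | 0 |
| RF00439 | SNORD87 | 85 | 10 | 26.8 | 12.6 | 0.5 | 75 | 74 | 1 | 0 |
| RF01537 | TB11Cs2H1 | 70 | 7 | 5.8 | 7.3 | 1.3 | 74 | 73 | 1 | 0 |
| Bottom five families | | | | | | | | | | |
| RF01439 | S_pombe_snR36 | 164 | 2 | 25.0 | 1.7 | 0.1 | 5 | 2 | 3 | 0 |
| RF01448 | S_pombe_snR93 | 143 | 2 | 11.0 | 1.5 | 0.1 | 4 | 3 | 1 | 0 |
| RF00967 | mir-281 | 83 | 2 | 6.0 | 2.6 | 0.4 | 4 | 4 | 0 | 0 |
| RF00925 | MIR1027 | 142 | 2 | 20.4 | 1.6 | 0.1 | 3 | 3 | 0 | 0 |
| RF01576 | DdR8 | 88 | 2 | 10.4 | 1.6 | 0.2 | 2 | 2 | 0 | 0 |
| all 200 | - | - | - | 4222.2 | 4069.8 | 0.96 | 201 814 | 179 681 | 22 312 | 53 |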

The top five, middle five and bottom five families are shown, ranked by the number of hits found above Rfam GA thresholds using the new search strategy. Identical Rfam 12.0 score thresholds and CM parameters were used for both strategies (new: Rfam 12.0 CM file in Infernal 1.1 format; old: Rfam 12.0 CM file converted to Infernal 1.0 format using Infernal 1.1's cmconvert program). For each family, columns 1–4 give the Rfam accession, family identifier, model length in nucleotides and number of sequences in the seed alignment; columns 5–7 give the running time in hours for the new strategy, the running time in hours for the old strategy and the ratio of the two (old/new); columns 8 and 9 give the number of hits found above the per-family Rfam 12.0 thresholds by the new and old strategies, respectively; column 10 gives the number of unique hits found by the new strategy but not the old; and column 11 gives the number of unique hits found by the old strategy but not the new. A unique hit is a hit found by one strategy that none of the hits found by the other strategy overlaps by ≥1 nucleotide on the same strand. The 200 families were randomly chosen from the 2190 families that exist in both Rfam 12.0 and Rfam 11.0, the last release for which the old strategy was used. MIR1122 (RF00906) was initially included in the 200, but we replaced it with another random choice (SNORD97, RF01291) after learning that MIR1122 is clearly related to a MITE (miniature inverted-repeat transposable element) in plants and that the curators of the microRNA database miRBase (4) suspect it may not be a true miRNA gene. If the family is removed from miRBase, it will also be removed from Rfam.
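The unique-hit definition in the footnote can be illustrated with a minimal Python sketch. This is not the authors' code: the `Hit` record, the `overlaps` and `unique_hits` helpers and the example coordinates are all hypothetical, and the sketch assumes that a hit is described by a target sequence name, 1-based inclusive start and end coordinates, and a strand.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical hit record; field names are assumptions for illustration."""
    seq_id: str  # name of the target sequence
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlaps(a: Hit, b: Hit) -> bool:
    """True if a and b share >= 1 nucleotide on the same sequence and strand."""
    if a.seq_id != b.seq_id or a.strand != b.strand:
        return False
    # Sort the endpoints so that hits reported with start > end still work.
    a_lo, a_hi = sorted((a.start, a.end))
    b_lo, b_hi = sorted((b.start, b.end))
    return a_lo <= b_hi and b_lo <= a_hi


def unique_hits(mine, theirs):
    """Return the hits in `mine` that no hit in `theirs` overlaps."""
    # Group the other strategy's hits by (sequence, strand) so each hit is
    # compared only against candidates that could possibly overlap it.
    by_key = defaultdict(list)
    for h in theirs:
        by_key[(h.seq_id, h.strand)].append(h)
    return [
        h for h in mine
        if not any(overlaps(h, o) for o in by_key.get((h.seq_id, h.strand), []))
    ]


# Made-up coordinates, not data from the table.
new_hits = [
    Hit("seqA", 10, 60, "+"),
    Hit("seqA", 200, 260, "+"),
    Hit("seqB", 5, 80, "-"),
]
old_hits = [
    Hit("seqA", 55, 120, "+"),
    Hit("seqB", 5, 80, "+"),
]

new_unique = unique_hits(new_hits, old_hits)  # analogous to column 10
old_unique = unique_hits(old_hits, new_hits)  # analogous to column 11
```

In this toy input, the new hit at seqA 10–60 overlaps the old hit at seqA 55–120 and so is not unique, whereas the two seqB hits cover the same coordinates on opposite strands and therefore each count as unique to their own strategy. Because one hit can overlap several hits from the other strategy, the unique counts need not equal the difference between the total hit counts.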