Nucleic Acids Res. 2014 Nov 11;43(Database issue):D130–D137. doi: 10.1093/nar/gku1063

Table 2. Summary statistics for Rfam-based annotation of RNAs in various genomes and metagenomics data sets.

| Genome/data set | Size (Mb) | Number of hits | Number of families | CPU time (hours) | Mb/hour |
|---|---|---|---|---|---|
| Homo sapiens | 3099.7 | 14508 | 796 | 650 | 4.8 |
| Sus scrofa (pig) | 2808.5 | 6177 | 625 | 460 | 6.1 |
| Drosophila melanogaster | 168.7 | 4321 | 156 | 30 | 5.7 |
| Caenorhabditis elegans | 100.3 | 1022 | 175 | 20 | 5.2 |
| Saccharomyces cerevisiae | 12.2 | 376 | 96 | 1.7 | 7.3 |
| Escherichia coli | 4.6 | 256 | 112 | 0.46 | 10.2 |
| Bacillus subtilis | 4.1 | 211 | 52 | 0.57 | 7.2 |
| Methanocaldococcus jannaschii | 1.7 | 257 | 18 | 0.31 | 5.6 |
| Aquifex aeolicus | 1.6 | 52 | 7 | 0.22 | 7.3 |
| Borrelia burgdorferi | 0.9 | 44 | 7 | 0.22 | 4.1 |
| Human immunodeficiency virus (HIV) | 0.01 | 12 | 10 | 0.016 | 0.63 |
| Human gut microbiome sample (sample ERS167139, 454 sequencing) | 166.1 | 4342 | 54 | 22 | 7.7 |
| Human gut microbiome sample (sample ERS235581, Illumina HiSeq sequencing) (28) | 52.9 | 3159 | 47 | 8.5 | 6.2 |
| Ocean metagenome (sample SRS580499, Illumina genome analyzer) | 44.3 | 6692 | 59 | 13 | 3.5 |

The cmsearch program of Infernal 1.1 was used with Rfam 12.0 CM files and the following command-line options: --noali --cut_ga --rfam --nohmmonly --cpu 0. Overlapping hits were resolved so that no nucleotide is matched by more than one family: where hits overlapped, the hit with the lower E-value was kept (or, when E-values tied, the hit with the higher bit score). All searches were run as single execution threads on 3.0 GHz Intel Xeon processors. The Homo sapiens, Sus scrofa, Drosophila melanogaster and Saccharomyces cerevisiae genomes were obtained from Ensembl release 76 (http://www.ensembl.org/) (26). The Escherichia coli (K12 substr MG1655), Bacillus subtilis (BSn5), Methanocaldococcus jannaschii (DSM 2661), Aquifex aeolicus (VF5) and Borrelia burgdorferi (CA-11 2A) genomes were obtained from release 23 of Ensembl Genomes (http://ensemblgenomes.org/) (27). For all of these, the sequence file searched was downloaded via FTP and has the suffix .dna.toplevel.fa.gz. The HIV genome used is ENA accession AJ291720. The metagenomic samples were downloaded from the EBI Metagenomics Portal (https://www.ebi.ac.uk/metagenomics/) (29) and can be accessed by the sample accession listed in the table. The ‘CPU time’ and ‘Mb/hour’ columns are rounded to two significant digits.
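
The overlap-resolution rule described in the footnote can be illustrated with a minimal Python sketch. This is not the authors' script: the `Hit` record, its field names and the greedy strategy are assumptions made for illustration, and the family names in the example are hypothetical. The sketch also assumes that coordinates are normalised so that start <= end, that strand is ignored, and that any two overlapping hits conflict even when they belong to the same family.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One search hit. Field names are illustrative, not Infernal's output format."""
    seq_id: str
    start: int       # 1-based inclusive; assumed start <= end (strand ignored)
    end: int
    family: str
    evalue: float
    bitscore: float


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if both hits lie on the same sequence and share a nucleotide."""
    return a.seq_id == b.seq_id and a.start <= b.end and b.start <= a.end


def resolve_overlaps(hits: list[Hit]) -> list[Hit]:
    """Greedily keep hits so that no nucleotide is covered by two kept hits.

    Hits are ranked by ascending E-value, with ties broken by descending
    bit score. A hit is kept only if it overlaps none of the hits kept so far.
    """
    ranked = sorted(hits, key=lambda h: (h.evalue, -h.bitscore))
    kept: list[Hit] = []
    for hit in ranked:
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    # Return the survivors in genomic order for readability.
    return sorted(kept, key=lambda h: (h.seq_id, h.start))


if __name__ == "__main__":
    # Hypothetical hits: famA/famB overlap, and famC/famD overlap with tied E-values.
    example = [
        Hit("chr1", 100, 200, "famA", 1e-10, 50.0),
        Hit("chr1", 150, 250, "famB", 1e-5, 30.0),
        Hit("chr1", 300, 400, "famC", 1e-3, 20.0),
        Hit("chr1", 390, 450, "famD", 1e-3, 25.0),
    ]
    for hit in resolve_overlaps(example):
        print(hit.family, hit.start, hit.end)
```

In this example famA is kept over famB because of its lower E-value, and famD is kept over famC because the E-values tie and famD has the higher bit score. The pairwise check is quadratic in the number of hits; for genome-scale hit lists an interval index per sequence would likely be preferable.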