2014 Nov 11;43(Database issue):D130–D137. doi: 10.1093/nar/gku1063

Table 2. Summary statistics for Rfam-based annotation of RNAs in various genomes and metagenomics data sets.

Genome/data set	Size (Mb)	# of hits	# of fams	CPU time (hours)	Mb/hour
Homo sapiens	3099.7	14 508	796	650	4.8
Sus scrofa (pig)	2808.5	6177	625	460	6.1
Drosophila melanogaster	168.7	4321	156	30	5.7
Caenorhabditis elegans	100.3	1022	175	20	5.2
Saccharomyces cerevisiae	12.2	376	96	1.7	7.3
Escherichia coli	4.6	256	112	0.46	10.2
Bacillus subtilis	4.1	211	52	0.57	7.2
Methanocaldococcus jannaschii	1.7	257	18	0.31	5.6
Aquifex aeolicus	1.6	52	7	0.22	7.3
Borrelia burgdorferi	0.9	44	7	0.22	4.1
Human immunodeficiency virus (HIV)	0.01	12	10	0.016	0.63
Human gut microbiome sample (sample ERS167139, 454 sequencing)	166.1	4342	54	22	7.7
Human gut microbiome sample (sample ERS235581, Illumina HiSeq sequencing) (28)	52.9	3159	47	8.5	6.2
Ocean metagenome (sample SRS580499, Illumina genome analyzer)	44.3	6692	59	13	3.5

The cmsearch program of Infernal 1.1 was used with Rfam 12.0 CM files and the following command-line options: --noali --cut_ga --rfam --nohmmonly --cpu 0. Overlapping hits were removed so that no nucleotide was matched by more than one family: where hits overlapped, the hit with the lower E-value was kept (or, for tied E-values, the hit with the higher bit score). All searches were run as single execution threads on 3.0 GHz Intel Xeon processors. The Homo sapiens, Sus scrofa, Drosophila melanogaster and Saccharomyces cerevisiae genomes searched were obtained from Ensembl release 76 (http://www.ensembl.org/) (26), and the Escherichia coli (K12 substr MG1655), Bacillus subtilis (BSn5), Methanocaldococcus jannaschii (DSM 2661), Aquifex aeolicus (VF5) and Borrelia burgdorferi (CA-11 2A) genomes were obtained from release 23 of Ensembl Genomes (http://ensemblgenomes.org/) (27). For all of these, the sequence file searched was downloaded via FTP and has the suffix .dna.toplevel.fa.gz. The HIV genome used is ENA accession AJ291720, and the metagenomic samples were downloaded from the EBI Metagenomics Portal (https://www.ebi.ac.uk/metagenomics/) (29) and can be accessed by the sample accessions listed in the table. The ‘CPU time’ and ‘Mb/hour’ columns are rounded to two significant digits.
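
The overlap-removal rule in the note above can be illustrated with a short Python sketch. It is not the authors' script. The `Hit` record and its field names are hypothetical, as are the two helper functions, and the sketch makes three assumptions the note does not state: hits overlap only when they lie on the same sequence and strand, overlaps are resolved greedily from the best-ranked hit downwards, and a hit is dropped when it overlaps any retained hit.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One family hit. All field names are hypothetical, chosen for illustration."""
    seq_id: str    # name of the searched sequence
    strand: str    # "+" or "-"
    start: int     # 1-based inclusive coordinate, with start <= end
    end: int
    family: str    # family identifier
    evalue: float
    score: float   # bit score


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if two hits share at least one position.

    Assumption: only hits on the same sequence and strand can overlap.
    """
    return (
        a.seq_id == b.seq_id
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def remove_overlaps(hits: list[Hit]) -> list[Hit]:
    """Greedily retain hits so that no two retained hits overlap."""
    # Rank by ascending E-value; tied E-values are ordered by descending bit score.
    ranked = sorted(hits, key=lambda h: (h.evalue, -h.score))
    kept: list[Hit] = []
    for hit in ranked:
        # Keep a hit only if it overlaps none of the better-ranked hits already kept.
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return kept
```

Calling `remove_overlaps` on a list of `Hit` records returns the retained hits in rank order. Under the greedy assumption, a hit that overlaps only a discarded hit is itself retained. The note does not say how such chains were handled, so this is one possible reading.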