2017 Jan 25;45(6):2960–2972. doi: 10.1093/nar/gkw1350

Table 2. The number and percent of reads filtered at each stage of pre-processing for all datasets used in this section.

Category	M. Liver	M. EF	HEK293	HEK293, Gao	C. elegans (aggregate)
Raw data	9E+6	3E+7	3E+7	3E+7	9E+8
Poor quality	8E+4 (1%)	4E+5 (1%)	5E+5 (2%)	3E+5 (1%)	4E+7 (4%)
Ribosomal	5E+6 (55%)	6E+6 (17%)	2E+6 (7%)	2E+7 (66%)	5E+8 (56%)
No alignment	1E+6 (15%)	9E+6 (27%)	3E+6 (11%)	3E+6 (10%)	2E+8 (25%)
Multimappers	6E+5 (7%)	8E+6 (23%)	6E+6 (20%)	2E+6 (9%)	3E+7 (3%)
Non-periodic	1E+5 (1%)	4E+5 (1%)	4E+5 (2%)	1E+5 (0%)	6E+7 (7%)
Usable	1E+6 (21%)	1E+7 (31%)	1E+7 (59%)	4E+6 (14%)	4E+7 (5%)

‘Raw data’ gives the total number of reads in the dataset. ‘Poor quality’ reads are either too short after adapter removal or have inadequate FASTQ quality scores. ‘Ribosomal’ reads map to known ribosomal sequences. ‘No alignment’ reads do not align to the genome. ‘Multimappers’ map to multiple locations in the genome. ‘Non-periodic’ reads have lengths whose metagene profiles do not show a periodic signal. ‘Usable’ reads are kept for further analysis. The detailed counts for all samples, including all C. elegans replicates, are given in Supplementary File 7. We obtain a much higher percentage of rRNA reads from dauer stage lysates than from lysates of other developmental stages of the C. elegans life cycle (2).
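The bookkeeping behind a table of this kind can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function name `filtering_summary` and the read counts are hypothetical. It assumes each percentage is taken relative to the raw read count, which the table suggests because each column sums to roughly 100%.

```python
def filtering_summary(raw_reads, removed_per_stage):
    """Return (category, count, percent of raw) rows for sequential filters.

    `removed_per_stage` maps each filter name to the number of reads it
    removes, in the order the filters are applied. The last row holds the
    reads that survive every filter ("Usable").
    """
    if raw_reads <= 0:
        raise ValueError("raw_reads must be positive")

    rows = []
    remaining = raw_reads
    for stage, removed in removed_per_stage.items():
        if removed < 0 or removed > remaining:
            raise ValueError(f"invalid count for stage {stage!r}: {removed}")
        remaining -= removed
        # Assumption: percentages are relative to the raw total,
        # not to the reads remaining after the previous filter.
        rows.append((stage, removed, 100.0 * removed / raw_reads))

    rows.append(("Usable", remaining, 100.0 * remaining / raw_reads))
    return rows


# Hypothetical counts for illustration only; these are NOT values from Table 2.
example_counts = {
    "Poor quality": 10_000,
    "Ribosomal": 500_000,
    "No alignment": 150_000,
    "Multimappers": 70_000,
    "Non-periodic": 20_000,
}

rows = filtering_summary(1_000_000, example_counts)
for stage, count, pct in rows:
    # One significant figure for counts, whole percentages, as in the table.
    print(f"{stage}\t{count:.0E} ({pct:.0f}%)")
```

Because the published counts are rounded to one significant figure, recomputing percentages from the table entries themselves will not reproduce the printed percentages exactly; the unrounded counts are in Supplementary File 7.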