Table 5.
Processing of N. vectensis metatranscriptomes to remove ribosomal RNAs and low-quality sequences.
Process performed on sample for rRNA depletion (a) | Initial: total sequence pairs | Removed in QC: large subunit rRNA | Removed in QC: small subunit rRNA | Removed in QC: 5S or ITS rRNA | Removed in QC: tandem repeats | Removed in QC: Illumina adaptors | Final: non-ribosomal sequence pair units (b) (% of initial) |
---|---|---|---|---|---|---|---|
Poly(A)purist + RNaseH | 1,433,848 | 987,049 | 418,609 | 1358 | 701 | 12,824 | 13,307 (0.92%) |
Poly(A)purist + mRNAonly | 969,506 | 429,858 | 208,186 | 753 | 24,171 | 195,907 | 110,631 (11.40%) |
Poly(A)purist + RNaseH + mRNAonly | 1,655,964 | 1,124,016 | 307,443 | 1454 | 1659 | 153,249 | 68,143 (4.11%) |
Poly(A)purist + MICROBEnrich + MICROBExpress + mRNAonly | 653,926 | 368,740 | 33,371 | 475 | 3242 | 1592 | 246,506 (37.7%) |
Poly(A)purist + mRNAonly + DSNuclease | 107,964 | 95,649 | 12,188 | 1 | 2 | 7 | 117 (~0.1%) |
Total RNA unprocessed | 1,100,418 | 767,151 | 235,462 | 741 | 3793 | 2550 | 90,721 (8.24%) |
Total analyzed | 5,921,626 | 3,772,463 | 1,215,259 | 4782 | 33,568 | 366,129 | 529,425 (8.9%) |
ITS, Internal Transcribed Spacer.
(a) The processes implemented for depletion of rRNA and non-bacterial mRNA included treatment of total RNA with: (1) RNaseH after hybridization with DNA oligos targeting specific conserved regions of rRNA (RNaseH is an endonuclease that specifically degrades the RNA strand of RNA:DNA hybrids); (2) the MICROBEnrich™ Kit (Ambion Part No. AM1901) and MICROBExpress™ Kit (Ambion Part No. AM1905), a pair of kits that rely on a novel capture oligo hybridization protocol to selectively remove eukaryotic rRNA and bacterial rRNA, respectively; (3) mRNAonly reagent (Epicentre), an exonuclease-based method that selectively degrades RNAs with 5′-monophosphates; (4) duplex-specific nuclease (DSN) treatment after hybridization with DNA oligos targeting specific conserved regions of rRNA (DSN specifically degrades dsDNA and the DNA strand of DNA:RNA hybrids); and (5) the Poly(A)purist kit, which uses oligo(dT) cellulose to preferentially bind the poly(A) tails of eukaryotic mRNA. Treatments were used in the combinations specified above, and kits were used according to the manufacturers' protocols.
(b) A sequence pair unit is one of three things: (1) a sequence pair in which both ends passed filtering; (2) a pair of sequences merged into one sequence because the ends overlap; or (3) a pair of sequences clipped to one sequence because of adaptor contamination.
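The percentages in the final column can be recomputed from the counts as final units divided by initial pairs. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the names `TABLE5` and `percent_retained` are hypothetical, and the initial count for the Poly(A)purist + RNaseH + mRNAonly row is read as 1,655,964, the value consistent with the stated column total.

```python
# Counts transcribed from Table 5 as (initial sequence pairs, final non-ribosomal units).
# TABLE5 and percent_retained are illustrative names, not part of the original analysis.
TABLE5 = {
    "Poly(A)purist + RNaseH": (1_433_848, 13_307),
    "Poly(A)purist + mRNAonly": (969_506, 110_631),
    "Poly(A)purist + RNaseH + mRNAonly": (1_655_964, 68_143),
    "Poly(A)purist + MICROBEnrich + MICROBExpress + mRNAonly": (653_926, 246_506),
    "Poly(A)purist + mRNAonly + DSNuclease": (107_964, 117),
    "Total RNA unprocessed": (1_100_418, 90_721),
}


def percent_retained(initial: int, final: int) -> float:
    """Return the final non-ribosomal units as a percentage of the initial pairs."""
    return 100.0 * final / initial


# Column totals, for comparison with the "Total analyzed" row.
total_initial = sum(initial for initial, _ in TABLE5.values())
total_final = sum(final for _, final in TABLE5.values())

if __name__ == "__main__":
    for treatment, (initial, final) in TABLE5.items():
        print(f"{treatment}: {percent_retained(initial, final):.2f}%")
    print(f"Total analyzed: {total_initial:,} pairs, {total_final:,} units, "
          f"{percent_retained(total_initial, total_final):.2f}%")
```

With these inputs the summed columns equal the "Total analyzed" row (5,921,626 initial pairs and 529,425 final units), which gives a quick check on transcription errors in individual rows.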