Table 2: Criteria for data set quality classification
Metric | Meaning | Fail threshold | Warn threshold | Code |
---|---|---|---|---|
NumReadsQcPass | No. of reads that passed QC filtering | <50 reads per gene<sup>a</sup> | <500 reads per gene<sup>a</sup> | 1 |
QcPassRate | Proportion of reads that passed QC filtering | <60% | <80% | 2 |
STAR_UniqMapRate | Proportion of reads mapped uniquely to the reference genome using STAR | <50% | <70% | 3 |
STAR_AssignRate | Proportion of reads assigned to genes with STAR | <40% | <60% | 4 |
STAR_AssignedReads | No. of reads assigned to genes with STAR | <50 reads per gene<sup>a</sup> | <500 reads per gene<sup>a</sup> | 5 |
Kallisto_MapRate | Proportion of reads assigned to transcripts with Kallisto | <40% | <60% | 6 |
Kallisto_MappedReads | No. of reads assigned to transcripts with Kallisto | <50 reads per gene<sup>a</sup> | <500 reads per gene<sup>a</sup> | 7 |
DatasetCorrel | Pearson correlation coefficient with the average of the passed data sets | – | <0.5 | 8 |
<sup>a</sup> The number of protein-coding genes was obtained from Ensembl and used as an estimator of transcriptome complexity.
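
The thresholds in Table 2 can be applied mechanically. The following is a minimal sketch, not the authors' implementation. It assumes that rates are supplied as fractions between 0 and 1, that "reads per gene" means the read count divided by the number of protein-coding genes, and that a data set fails if any metric falls below its fail threshold. The function name `classify_dataset` and the returned structure are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (metric, fail threshold, warn threshold, code, normalise per gene?)
# Values are taken from Table 2; rates are expressed as fractions (assumption).
CRITERIA: List[Tuple[str, Optional[float], float, int, bool]] = [
    ("NumReadsQcPass", 50, 500, 1, True),
    ("QcPassRate", 0.60, 0.80, 2, False),
    ("STAR_UniqMapRate", 0.50, 0.70, 3, False),
    ("STAR_AssignRate", 0.40, 0.60, 4, False),
    ("STAR_AssignedReads", 50, 500, 5, True),
    ("Kallisto_MapRate", 0.40, 0.60, 6, False),
    ("Kallisto_MappedReads", 50, 500, 7, True),
    ("DatasetCorrel", None, 0.5, 8, False),  # no fail threshold in Table 2
]


def classify_dataset(
    metrics: Dict[str, float], n_protein_coding_genes: int
) -> Tuple[str, List[int], List[int]]:
    """Return (status, fail_codes, warn_codes) for one data set.

    Hypothetical helper: status is "fail" if any metric is below its fail
    threshold, "warn" if any is below its warn threshold, else "pass".
    """
    fail_codes: List[int] = []
    warn_codes: List[int] = []
    for name, fail_thr, warn_thr, code, per_gene in CRITERIA:
        if name not in metrics:
            continue  # metrics that were not computed are skipped
        value = metrics[name]
        if per_gene:
            # Read counts are scaled by transcriptome complexity (footnote a).
            value = value / n_protein_coding_genes
        if fail_thr is not None and value < fail_thr:
            fail_codes.append(code)
        elif value < warn_thr:
            warn_codes.append(code)
    if fail_codes:
        status = "fail"
    elif warn_codes:
        status = "warn"
    else:
        status = "pass"
    return status, fail_codes, warn_codes
```

For example, with 20,000 protein-coding genes, a data set with 600,000 STAR-assigned reads has 30 reads per gene and would be flagged with code 5 under these assumptions.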