Table 4.
Step-by-step output as numbers of qualifying genes along the LSTNR analytical pipeline for liver transcriptomes from male Sprague-Dawley rats after toxicant exposure based on the mode-of-action training RNAseq dataset by the MAQC phase III SEQC crowdsource toxicogenomics (TGxSEQC) effort (GEO accession number: GSE55347).
| Criteria | Hepatotoxicity: Mode-of-Action Rat Models (N = 54) |
|---|---|
| Genes with uniquely aligned reads | 30,852 |
| Distribution of gene-wise RPM means | P(y) ~ Weibull3P(y; α, β, γ), y = RPM; α = 6.7 RPM, β = 0.38, γ = 2.5 × 10⁻³ RPM |
| Independent filtering: Genes with average RPM y > α | 9,593 |
| Linearized normalizing transformant: GLM Linear Predictor | (y − γ)⁻¹ |
| Transformant two-way ANOVA: resolved genes across groups with respect to gene-wise mean | 3,975 |
| Resolution-Weighted ANOVA: Significant Genes (SGs) with FDR-adjusted p < 0.05 based on differences in resolution-weighted RPM log-fold changes (Log2FC) relative to baseline condition | 5,983 |
| Differential expression: DEGs = subset of SGs that exhibit both: | 5,864 |
| Reproducibility: LSTNR genes = subset of SGs that exhibit both: | 1,953 |
| Expectable DEGs: DEGREEs = Ensembl-annotated DEGs with a reproducible expectation estimate (i.e., DEGs that are also LSTNRs) and official Entrez symbol | 1,510 |
| Transcriptional profiling: Profiler DEGREEs = top DEGREEs ranked by retrospective statistical power with monotonically decreasing within-gene effect sizes ΔLog2FC | 65 Profiler DEGREEs |
| Diagnostic targets: Biomarkers = minimal subset of Profiler DEGREEs with predictive discriminant power based on sequential partition tree analysis (ROC scores > 0.9 per phenotype) | Ucp3, Tmem86b, Sugct, Acaa1b, Hadhb, Tfam, Acaa1a, and Gsdmd |
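
The independent-filtering and transformation rows of the table can be illustrated with a minimal sketch. It is not the LSTNR implementation. It assumes that α is the Weibull scale, β the shape, and γ the threshold (location) parameter, and it reads the transformant as the reciprocal of (y − γ). The function names and the example gene values are hypothetical.

```python
import math

# Parameters reported in Table 4. Their roles are an assumption:
# alpha = scale (RPM), beta = shape (dimensionless), gamma = threshold (RPM).
ALPHA = 6.7
BETA = 0.38
GAMMA = 2.5e-3

def weibull3p_cdf(y, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """CDF of a three-parameter Weibull; zero at or below the threshold gamma."""
    if y <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((y - gamma) / alpha) ** beta))

def independent_filter(mean_rpm, alpha=ALPHA):
    """Keep genes whose gene-wise mean RPM strictly exceeds alpha."""
    return {gene: y for gene, y in mean_rpm.items() if y > alpha}

def transform(y, gamma=GAMMA):
    """Reciprocal transformant 1 / (y - gamma), as read from the table."""
    return 1.0 / (y - gamma)

# Hypothetical gene-wise mean RPM values, for illustration only.
example = {"geneA": 0.4, "geneB": 6.7, "geneC": 25.0, "geneD": 310.0}

kept = independent_filter(example)
transformed = {gene: transform(y) for gene, y in kept.items()}

for gene, value in transformed.items():
    print(gene, round(value, 6))
```

Under the assumed parameterization, the fitted CDF equals 1 − e⁻¹ (about 0.63) at y = α + γ, so filtering on y > α discards roughly the lower two thirds of the fitted distribution of gene-wise means.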