2018 May 16;9:176. doi: 10.3389/fgene.2018.00176

Table 3.

Number of qualifying genes at each step of the LSTNR analytical pipeline for RNAseq data deposited in TCGA, from four realizations of patient-derived breast cancer transcriptomes across four molecular subtypes (courtesy of Li and Bushel, 2016; doi: 10.1186/s12864-016-2584-7).

Criteria Breast Cancer Molecular Subtypes (TCGA) (N = 200)
Realization 1 (N = 50) Realization 2 (N = 50) Realization 3 (N = 50) Realization 4 (N = 50) All Specimens (N = 200)
Genes with uniquely aligned reads 20,532
Distribution of gene-wise RPM means P(y) ~ Weibull3P(y; α, β, γ); y = RPM
α = 25.4 RPM α = 24.1 RPM α = 25.0 RPM α = 21.9 RPM α = 22.2 RPM
β = 0.53 β = 0.53 β = 0.54 β = 0.49 β = 0.49
γ = 9.9 × 10⁻³ RPM γ = 6.6 × 10⁻³ RPM γ = 1.1 × 10⁻² RPM γ = 1.6 × 10⁻³ RPM γ = 1.5 × 10⁻³ RPM
Independent filtering: Genes with average y > α 8,005 8,110 8,083 8,562 8,538
Linearized normalizing transformant: GLM Linear Predictor (y − γ)⁻¹
Transformant two-way ANOVA: resolved genes across groups with respect to gene-wise mean 4,295 381 638 2,281 2,851
Resolution-Weighed ANOVA: Significant Genes (SGs) with FDR adj. p < 0.05 based on differences in resolution-weighed RPM log-fold changes (Log2FC) relative to baseline condition 4,465 5,086 4,537 4,618 6,193
Altogether: 7,749 Final Overlap: 1,509
Intersection: 1,617
Differential expression: DEGs = subset of SGs that exhibit both:
  • resolution-weighed effect size above 5% of gene-wise variation (δLog2FC > 0.3 × σSSR); and

  • post-hoc pairwise-significant Log2FC differences between at least two groups (Student's t-test p < 0.05)

3,736 3,377 3,497 3,617 6,093
Altogether: 6,407 Final Overlap: 908
Intersection: 976
Reproducibility: LSTNRs = subset of SGs that exhibit both:
  • resolution-weighed effect size above 5% of gene-wise variation (δLog2FC > 0.3 × σSSR); and

  • at least one group with Log2FC differences vs. baseline greater than 95% Tolerance Interval of gene × group residuals among SGs (post-hoc pairwise-significance not required)

1,370 1,102 1,130 1,210 1,511
Altogether: 2,193 Final Overlap: 337
Intersection: 368
Expectable DEGs: DEGREEs = Ensembl-annotated DEGs with a reproducible expectation estimate (i.e., DEGs that are also LSTNRs) and official Entrez symbol Intersection: 366 1,511
Final Overlap: 336
Transcriptional profiling: Profiler DEGREEs = top DEGREEs ranked by retrospective statistical power with monotonically decreasing within-gene effect sizes ΔLog2FC 200 Profiler DEGREEs (consensus)
Diagnostic targets: Biomarkers = minimal subset of Profiler DEGREEs with predictive discriminant power based on sequential partition tree analysis (ROC scores > 0.9 per phenotype) CBX7, ESR1, FOXC1, and FOXM1
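
The counting steps in the table can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on stated assumptions: α is read as the Weibull scale, β as the shape, and γ as the threshold; "Altogether" is read as the union of the per-realization gene sets, "Intersection" as the genes shared by all realizations, and "Final Overlap" as that intersection restricted to the all-specimens set. The function names and the toy gene identifiers are hypothetical.

```python
import math


def weibull3p_sf(y, alpha, beta, gamma):
    """Survival function P(Y > y) of a three-parameter Weibull.

    Assumed parameterization: alpha = scale, beta = shape, gamma = threshold.
    """
    if y <= gamma:
        return 1.0
    return math.exp(-(((y - gamma) / alpha) ** beta))


def independent_filter(mean_rpm, alpha):
    """Return the genes whose mean RPM is strictly greater than alpha."""
    return {gene for gene, y in mean_rpm.items() if y > alpha}


def summarize_overlap(per_realization, pooled):
    """Count union, intersection, and pooled overlap of gene sets.

    Assumed reading of the table: "Altogether" = union over realizations,
    "Intersection" = genes found in every realization, "Final Overlap" =
    that intersection restricted to the pooled (all-specimens) set.
    """
    sets = [set(s) for s in per_realization]
    altogether = set().union(*sets)
    intersection = set.intersection(*sets)  # needs at least one realization
    final_overlap = intersection & set(pooled)
    return {
        "altogether": len(altogether),
        "intersection": len(intersection),
        "final_overlap": len(final_overlap),
    }


# Toy data with hypothetical gene names and values, not taken from the study.
toy_means = {"G1": 30.0, "G2": 10.0, "G3": 25.4, "G4": 120.5}
kept = independent_filter(toy_means, alpha=25.4)
counts = summarize_overlap([{"G1", "G4", "G5"}, {"G4", "G5", "G6"}], {"G4", "G7"})
```

Under this parameterization, `weibull3p_sf(25.4, 25.4, 0.53, 9.9e-3)` is close to e⁻¹ ≈ 0.37, because γ is tiny relative to α. That is of the same order as the 8,005 of 20,532 genes (about 39%) that pass the independent filter in Realization 1.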