108-Gene Signature Predicts Overall Survival in Several Epithelial-Derived Cancers
For the TCGA datasets, expression data for the 108 genes were extracted by matching gene names against the hg38 annotation. For each dataset, we modeled survival and generated a Kaplan-Meier (KM) curve based on the candidate gene signature score, with each sample assigned a dichotomous score (high or low gene signature score). p values were taken from the log-likelihood statistic of the Cox proportional hazards models. For verification, we randomly permuted a subset of expression and outcome (time, vital status) values, breaking the relationship between expression and outcome. This was done 1,000 times; for each permuted dataset, we modeled survival as described for the original analysis and generated a distribution of log-likelihood statistics. Note: not all genes were identified in each dataset; the number of genes used to generate each KM curve is indicated above the corresponding graph.
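The permutation check described above can be illustrated with a minimal sketch; it is not the authors' code. It assumes a median split for the high/low assignment (the cutoff is not stated in the text), a single binary covariate in the Cox model, Breslow handling of tied event times, and an empirical p value with a +1 correction. All function names (`dichotomize`, `cox_lr_statistic`, `permutation_test`) are hypothetical and use only the Python standard library.

```python
import math
import random
from statistics import median


def dichotomize(scores):
    """Label samples 1 (high) or 0 (low); the median cutoff is an assumption."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def risk_counts(times, events, groups):
    """For each observed event, count at-risk samples in the low and high groups."""
    rows = []
    for t_i, e_i, g_i in zip(times, events, groups):
        if not e_i:
            continue  # censored samples contribute only through risk sets
        n_high = sum(1 for t, g in zip(times, groups) if t >= t_i and g == 1)
        n_low = sum(1 for t, g in zip(times, groups) if t >= t_i and g == 0)
        rows.append((g_i, n_low, n_high))
    return rows


def partial_loglik(beta, rows):
    """Cox partial log likelihood (Breslow ties) for one binary covariate."""
    w = math.exp(beta)
    return sum(beta * g - math.log(n_low + n_high * w) for g, n_low, n_high in rows)


def cox_lr_statistic(times, events, groups, max_iter=50, tol=1e-9):
    """Likelihood-ratio statistic 2 * (ll(beta_hat) - ll(0)), fit by Newton-Raphson."""
    rows = risk_counts(times, events, groups)
    beta = 0.0
    for _ in range(max_iter):
        w = math.exp(beta)
        grad = hess = 0.0
        for g, n_low, n_high in rows:
            p = n_high * w / (n_low + n_high * w)
            grad += g - p
            hess -= p * (1.0 - p)
        if hess == 0.0:
            break  # no information about beta (e.g. only one group present)
        step = max(-1.0, min(1.0, grad / hess))  # clamp the step for stability
        beta = max(-20.0, min(20.0, beta - step))  # bound beta under separation
        if abs(step) < tol:
            break
    return max(0.0, 2.0 * (partial_loglik(beta, rows) - partial_loglik(0.0, rows)))


def permutation_test(times, events, groups, n_perm=1000, seed=0):
    """Shuffle (time, status) pairs against group labels to build a null distribution."""
    rng = random.Random(seed)
    observed = cox_lr_statistic(times, events, groups)
    outcomes = list(zip(times, events))
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(outcomes)  # breaks the expression-outcome relationship
        p_times = [t for t, _ in outcomes]
        p_events = [e for _, e in outcomes]
        null_stats.append(cox_lr_statistic(p_times, p_events, groups))
    exceed = sum(1 for s in null_stats if s >= observed)
    return observed, null_stats, (exceed + 1) / (n_perm + 1)
```

In this sketch, `permutation_test(times, events, dichotomize(scores))` returns the observed statistic, the permuted distribution, and an empirical p value. For real analyses, an established survival package would normally replace the hand-written Newton-Raphson fit.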