Fig. 1.
Training and testing errors, over 100 iterations, of a logistic regression classifier with elastic net penalty for Training Distribution Matching (TDM)-transformed glioblastoma RNA-seq data from The Cancer Genome Atlas. a Schematic describing the terms used for training, testing, and validating our model. We applied 5-fold cross-validation to the full dataset, with each fold consisting of a training and a testing split. The model is then applied as an ensemble classifier to a set of in-house samples (validation set). b Receiver operating characteristic (ROC) curves for all 500 classifiers (5 folds × 100 iterations) that make up the ensemble model, applied to both the training and testing sets. Also shown is the aggregate performance of the ensemble classifier. c The cumulative density of the area under the ROC curve (AUROC) for the training and testing partitions.
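The count of 500 classifiers follows from repeating 5-fold cross-validation 100 times. The sketch below, in plain Python, illustrates one way such splits could be generated. It is not the authors' code: the function names, the reshuffling of samples at each iteration, the sample count of 50, and the use of a simple mean to aggregate the ensemble are all assumptions made for illustration.

```python
import random


def repeated_kfold_splits(n_samples, n_folds=5, n_iterations=100, seed=0):
    """Yield (train_idx, test_idx) pairs: n_folds splits per iteration."""
    rng = random.Random(seed)
    for _ in range(n_iterations):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # assumption: samples are reshuffled each iteration
        for k in range(n_folds):
            test_idx = indices[k::n_folds]  # every n_folds-th sample forms fold k
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield train_idx, test_idx


def ensemble_score(per_classifier_scores):
    """Assumed aggregation: mean of the individual classifiers' scores."""
    return sum(per_classifier_scores) / len(per_classifier_scores)


# Hypothetical sample count, chosen only so the folds divide evenly.
splits = list(repeated_kfold_splits(n_samples=50))
print(len(splits))  # 500 train/test splits, one classifier per split
```

Each split would train one classifier, and a validation sample would then be scored by all 500 classifiers and the scores combined, here by averaging.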