Figure 3.
Out-of-sample crossvalidation predictions of ER status. (A) One-at-a-time crossvalidation predictions of classification probabilities for training cases from the factor regression analysis. The horizontal axis shows estimates of the overall factor score in the regression. The vertical axis shows the corresponding estimated classification probabilities, with 90% probability intervals marked as dashed lines to indicate uncertainty about these estimates. The analysis and predictions for each tumor are based on the screened subset of the 100 most discriminatory genes, to parallel current practice in expression studies by other groups. (B) One-at-a-time crossvalidation predictions of classification probabilities for training cases in the ER study, in a format similar to that of A. In this instance, each case is predicted only on the basis of the ER status of the remaining training tumors, with the subset of 100 genes reselected in each case. The figure presents the resulting honest uncertainties about true predictive accuracy in a practical setting, reflecting inherent variability due to heterogeneity of expression profiles.
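The contrast between the two panels, as the caption appears to describe it, can be illustrated with a minimal sketch: genes screened once (as in A) versus reselected with each case held out (as in B). This is not the factor regression analysis behind the figure. It substitutes a simple diagonal linear discriminant classifier with equal class priors, and the function names (top_genes, dlda_probability, loo_predictions), the gene-ranking score, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def top_genes(X, y, k):
    """Indices of the k genes with the largest absolute standardized mean difference."""
    X0, X1 = X[y == 0], X[y == 1]
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Average of within-class variances; the small constant avoids division by zero.
    pooled_sd = np.sqrt((X0.var(axis=0, ddof=1) + X1.var(axis=0, ddof=1)) / 2) + 1e-8
    return np.argsort(-np.abs(diff / pooled_sd))[:k]


def dlda_probability(X_train, y_train, x_new):
    """P(class 1 | x_new) under diagonal LDA with equal priors (a stand-in model)."""
    X0, X1 = X_train[y_train == 0], X_train[y_train == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    var = (X0.var(axis=0, ddof=1) + X1.var(axis=0, ddof=1)) / 2 + 1e-8
    # Log posterior odds for a shared diagonal covariance.
    score = np.sum((mu1 - mu0) / var * (x_new - (mu0 + mu1) / 2))
    return 1.0 / (1.0 + np.exp(-np.clip(score, -30, 30)))


def loo_predictions(X, y, k, reselect):
    """One-at-a-time crossvalidation probabilities for every case.

    reselect=False: genes are screened once using all cases.
    reselect=True:  genes are reselected from the remaining cases each time.
    """
    n = len(y)
    fixed = top_genes(X, y, k)
    probs = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        genes = top_genes(X[train], y[train], k) if reselect else fixed
        probs[i] = dlda_probability(X[train][:, genes], y[train], X[i, genes])
    return probs


# Hypothetical data: pure noise, so no gene carries real class information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 500))
for reselect in (False, True):
    p = loo_predictions(X, y, k=20, reselect=reselect)
    print("reselect =", reselect, "accuracy =", np.mean((p > 0.5) == y))
```

On data with no real signal, screening genes once with the held-out case included would be expected to make predictions look better than they are, whereas reselecting within each fold should not; this is one way to read the "honest uncertainties" of panel B.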