Figure 1.
Correlation between actual and predicted heights as a function of the number of SNP hits activated in the predictor. Although the lines are difficult to separate visually, each one represents the training of a predictor using 453k individuals. Correlation is computed on five separate, nonoverlapping sets of 5k individuals not used in training. The phase transition region (roughly, ) corresponds to rapid growth in correlation on this graph, with the number of hits growing from near 0 to >5000. The correlation and penalization values given in the lower right-hand corner are averages over a set of LASSO runs; each run generates a slightly different value (and even a slightly different beta vector).
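
The evaluation described in this caption can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes a fitted coefficient vector is already available and uses small synthetic genotypes; the names `pearson`, `predict`, `evaluate`, `make_set`, `true_beta`, and `beta_hat` are hypothetical and chosen only for illustration. For one candidate beta vector, it counts the active SNPs (nonzero coefficients) and computes the correlation between predicted and actual phenotype on each of five nonoverlapping held-out sets.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # e.g. a predictor with no active SNPs
    return sxy / math.sqrt(sxx * syy)


def predict(beta, genotypes):
    """Linear predictor: one score per individual (row of genotypes)."""
    return [sum(b * g for b, g in zip(beta, row)) for row in genotypes]


def evaluate(beta, holdout_sets):
    """Return (number of active SNPs, correlation on each held-out set)."""
    hits = sum(1 for b in beta if b != 0.0)
    corrs = [pearson(predict(beta, G), y) for G, y in holdout_sets]
    return hits, corrs


# Synthetic toy data (assumption): 20 SNPs, of which 5 are truly causal.
random.seed(0)
n_snps = 20
true_beta = [1.0 if j < 5 else 0.0 for j in range(n_snps)]


def make_set(n):
    """Draw n individuals with genotypes in {0, 1, 2} and a noisy phenotype."""
    G = [[random.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n)]
    y = [p + random.gauss(0, 1) for p in predict(true_beta, G)]
    return G, y


# Five nonoverlapping held-out sets, drawn independently of any training data.
holdout_sets = [make_set(200) for _ in range(5)]

# A hypothetical sparse estimate with three active SNPs.
beta_hat = [0.9, 1.1, 0.8] + [0.0] * 17
hits, corrs = evaluate(beta_hat, holdout_sets)
mean_corr = sum(corrs) / len(corrs)
```

Repeating `evaluate` for the beta vector obtained at each penalization value would trace out one correlation curve per held-out set against the number of active SNPs, which is the kind of plot the caption describes.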