Figure 4. SVR is superior to linear regression in age prediction.
(a) The minimal MAD of predicted age as a function of the number of sites used as independent variables. The 11 CpG sites selected from the Sequenom MassARRAY dataset were combined into sets of one to 11 independent variables. For each set, an SVR model was fitted on all but one sample and used to predict the age of the held-out sample, and the minimal MAD of the predicted age was recorded for each number of independent variables. (b) Predicted versus observed age of all 49 subjects, using the SVR model with six markers. An MAD of 2.8 years was observed, which is slightly higher than that obtained with 11 markers. (c) Predicted versus observed age using multivariate linear regression with three DNA methylation markers obtained from a recent study (ref. 18). The original BeadChip data for these three sites were extracted and used to predict age with a multivariate linear regression model, giving an MAD of 6.27 years. (d) Predicted versus observed age using SVR with the same three DNA methylation markers (ref. 18). An MAD of 4.23 years was obtained, which is lower than the MAD obtained with multivariate linear regression on the same data (panel c) and lower than the MAD reported in the published study for multivariate linear regression on pyrosequencing data (5.4 years; ref. 18).
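
The evaluation described in panels (a), (c) and (d) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, treats MAD as the mean absolute difference between predicted and observed age, uses arbitrary SVR hyperparameters, and runs on synthetic stand-in data with four hypothetical sites instead of the study's measurements. The helper names `loo_mad` and `minimal_mad_by_size` are invented for this sketch.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVR


def loo_mad(model, X, y):
    """Leave-one-out MAD: fit on all but one sample, predict the held-out one."""
    predicted = cross_val_predict(model, X, y, cv=LeaveOneOut())
    # Assumption: MAD is the mean absolute difference between prediction and truth.
    return float(np.mean(np.abs(predicted - y)))


def minimal_mad_by_size(make_model, X, y):
    """For each number of sites k, return the lowest LOO MAD over all k-site subsets."""
    n_sites = X.shape[1]
    best = {}
    for k in range(1, n_sites + 1):
        best[k] = min(
            loo_mad(make_model(), X[:, list(cols)], y)
            for cols in combinations(range(n_sites), k)
        )
    return best


# Synthetic stand-in data (NOT the study's measurements): 49 subjects, 4 sites
# whose methylation level drifts linearly with age plus noise.
rng = np.random.default_rng(0)
n_samples, n_sites = 49, 4
age = rng.uniform(20, 80, size=n_samples)
slopes = rng.uniform(-0.006, 0.006, size=n_sites)
noise = rng.normal(0, 0.03, size=(n_samples, n_sites))
methylation = np.clip(0.5 + np.outer(age, slopes) + noise, 0.0, 1.0)

# Hyperparameters below are illustrative guesses, not values from the paper.
svr_curve = minimal_mad_by_size(
    lambda: SVR(kernel="rbf", C=100.0, epsilon=0.5), methylation, age
)
linear_mad = loo_mad(LinearRegression(), methylation, age)

for k, mad in svr_curve.items():
    print(f"SVR, best {k}-site subset: MAD = {mad:.2f} years")
print(f"Linear regression, all sites: MAD = {linear_mad:.2f} years")
```

The exhaustive subset search grows quickly with the number of sites (2047 non-empty subsets for 11 sites, each needing one fit per held-out sample), so the synthetic example is kept to four sites. Whether SVR beats linear regression on this synthetic data says nothing about the reported results.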
