2022 Aug 22;5:856. doi: 10.1038/s42003-022-03812-z

Fig. 3. Nonlinear model consistently outperforms linear ones for prediction of multiple complex phenotypes in a multi-ethnic dataset.

Linear (PRS, pink), regularized linear (LASSO, teal), and nonlinear (XGBoost, gray and purple) models were used to predict the harmonized phenotypes from TOPMed SNP data after adjustment for covariates. Two versions of the XGBoost model are shown: the first uses only the SNPs as features (gray; XGBoost alone), and the second also includes the PRS as one of the features (purple; XGBoost with PRS). The LASSO model (teal) was trained on the same set of SNPs as the XGBoost models. The inset (gray) shows the heritability of the same phenotypes in the same database, estimated by the restricted maximum-likelihood (REML) approach; error bars are 95% confidence intervals.
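
The difference between the two XGBoost feature sets described in the caption can be illustrated with a minimal sketch. It assumes the PRS is the usual weighted sum of allele dosages; the caption does not state how the PRS was computed, and the function names, genotypes, and per-SNP weights below are hypothetical and used only for illustration.

```python
from typing import List, Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of allele dosages (0, 1, or 2 per SNP) for one individual.

    Assumption: this is the standard additive PRS definition, not necessarily
    the exact procedure used for the figure.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight is required per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def add_prs_feature(genotypes: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[List[float]]:
    """Return a copy of the SNP matrix with the PRS appended as the last column."""
    return [list(row) + [polygenic_risk_score(row, weights)] for row in genotypes]


# Toy data: 3 individuals x 3 SNPs (hypothetical dosages and effect weights).
genotypes = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
]
weights = [0.5, -0.25, 1.0]

snps_only = [list(row) for row in genotypes]          # features for "XGBoost alone"
snps_plus_prs = add_prs_feature(genotypes, weights)   # features for "XGBoost with PRS"
```

Either matrix could then be passed to a gradient-boosted tree learner; the only difference between the two variants in this sketch is the extra PRS column.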