2023 Jan 10;14:35. doi: 10.1038/s41467-022-35343-w

Table 2.

Impact of removing feature clusters (selected by their respective linkage distances) on the predictive performance of the LGBM model.

| No. of features | Mean absolute error (n = 10) | Standard deviation (n = 10) | Ward's linkage distance |
| --- | --- | --- | --- |
| 17 | 0.116 | 0.018 | 0.00 |
| 15 | 0.116 | 0.017 | 0.06 |
| 13 | 0.142 | 0.017 | 0.12 |
| 12 | 0.143 | 0.017 | 0.24 |
| 11 | 0.143 | 0.018 | 0.29 |
| 10 | 0.143 | 0.019 | 0.35 |
| 9 | 0.139 | 0.022 | 0.53 |
| 8 | 0.150 | 0.021 | 0.76 |
| 5 | 0.296 | 0.023 | 0.82 |
| 4 | 0.296 | 0.024 | 0.88 |

The performance of the LGBM model with different numbers of input features was assessed by comparing the mean and standard deviation of the absolute error (AE) values obtained over 10 trials, each of which randomly assigned 20% of the drug–polymer combinations to a holdout test set.
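The evaluation protocol described in this footnote can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`repeated_holdout_mae`, `mean_baseline`), the seed, and the synthetic data are hypothetical, and a trivial mean predictor stands in for the LGBM model so that the sketch runs with the standard library alone. It also assumes that the reported standard deviation is the sample standard deviation of the per-trial mean absolute errors, which the footnote does not state explicitly.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
# A model is any callable that trains on (X_train, y_train) and predicts X_test.
FitPredict = Callable[[List[Row], List[float], List[Row]], List[float]]


def repeated_holdout_mae(
    X: List[Row],
    y: List[float],
    fit_predict: FitPredict,
    n_trials: int = 10,
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of the per-trial MAE."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(X)))
    trial_maes = []
    for _ in range(n_trials):
        # Draw a fresh random holdout split for every trial.
        indices = list(range(len(X)))
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        abs_errors = [abs(p - y[i]) for p, i in zip(preds, test_idx)]
        trial_maes.append(statistics.fmean(abs_errors))
    return statistics.fmean(trial_maes), statistics.stdev(trial_maes)


def mean_baseline(X_train, y_train, X_test):
    """Placeholder model (assumption): predicts the training-set mean."""
    mu = statistics.fmean(y_train)
    return [mu] * len(X_test)


# Synthetic stand-in data: 50 samples with 17 features each (hypothetical).
data_rng = random.Random(1)
X_demo = [[data_rng.random() for _ in range(17)] for _ in range(50)]
y_demo = [sum(row[:3]) / 3 for row in X_demo]

mae_mean, mae_sd = repeated_holdout_mae(X_demo, y_demo, mean_baseline)
print(f"MAE over 10 trials: {mae_mean:.3f} +/- {mae_sd:.3f}")
```

To mimic the rows of the table, the same function would be called once per feature subset, passing only the retained columns of `X` and a `fit_predict` wrapper around the actual regressor in place of `mean_baseline`.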