2021 Feb 24;2021:6953197. doi: 10.34133/2021/6953197

Table 2.

Results of regression models created with random forest. R² values are presented for the bean and maize aggregate phene metrics, both for the random forest model using the entire set of variables and for the model using only the most important variables.

Values are R² (% variance explained).

| Aggregate phenotypic metric | Bean: model with all variables | Bean: model with most important variables | Maize: model with all variables | Maize: model with most important variables |
|---|---|---|---|---|
| Total length | 89.5 | 91.6 | 82 | 85 |
| Total area | 87 | 87 | 78 | 81 |
| Total volume | 81.7 | 88.5 | 79 | 81.6 |
| Volume distribution | 87 | 91 | 61 | 66 |
| Max no. of roots | 78.8 | 84 | 67 | 72.8 |
| Median no. of roots | 79.9 | 87 | 71 | 75 |
| Bushiness | 62 | 67 | 36 | 41 |
| Max depth | 98.6 | 99.6 | 79 | 84 |
| Max width | 91 | 90 | 95 | 99 |
| Convex hull area | 97.8 | 97 | 90 | 93.4 |
| Convex hull volume | 97.6 | 97.6 | 87 | 89.9 |
| Ellipse minor axis | 94.9 | 93.6 | 80 | 85 |
| Ellipse major axis | 96.7 | 97.3 | 95 | 98.6 |
| Ellipse aspect ratio | 85.9 | 87.4 | 51.9 | 62 |
| Solidity | 97.4 | 97.5 | 89 | 89 |
| FD | 67 | 68 | 16 | 20 |
| FA | 93.5 | 94.9 | 88 | 90 |

Random forest provides its own reliable internal statistics, which can be used for validation and model selection. The main criterion for estimating the internal predictive ability of the random forest models, and for selecting among them, is the value of R². In random forest, R² is interpreted as a measure of how well the model predicts independent samples. Random forest models were run with the aggregate phenotype as the dependent variable and all the phenes as predictor variables. The most important variables were chosen based on the % increase in mean squared error, and the random forest models were then rerun with only those variables.
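The two quantities named above can be illustrated with a minimal Python sketch. It is not the authors' code and it does not fit a random forest: a fixed stand-in predictor takes the place of a fitted model, the data are synthetic, and all function and variable names are hypothetical. R² is computed as 100 × (1 − MSE / variance of the response), and variable importance is computed as a literal percent increase in mean squared error after shuffling one predictor. Random forest implementations typically compute this on out-of-bag samples and may scale it differently.

```python
import random
from statistics import fmean


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def pct_variance_explained(y_true, y_pred):
    """R^2 as a percentage: 100 * (1 - MSE / variance of y)."""
    mean_y = fmean(y_true)
    var_y = fmean([(t - mean_y) ** 2 for t in y_true])
    return 100.0 * (1.0 - mean_squared_error(y_true, y_pred) / var_y)


def pct_increase_in_mse(predict, rows, y, column, rng):
    """Percent rise in MSE after shuffling the values of one predictor column."""
    baseline = mean_squared_error(y, [predict(r) for r in rows])
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted_rows = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    permuted = mean_squared_error(y, [predict(r) for r in permuted_rows])
    return 100.0 * (permuted - baseline) / baseline


# Synthetic data (an assumption for illustration): the response depends on
# predictor 0 only; predictor 1 is irrelevant noise.
rng = random.Random(0)
rows = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(50)]
y = [3.0 * r[0] + rng.gauss(0, 1) for r in rows]


def predict(row):
    """Stand-in for a fitted model's prediction function (hypothetical)."""
    return 3.0 * row[0]


r2_pct = pct_variance_explained(y, [predict(r) for r in rows])
importance = [pct_increase_in_mse(predict, rows, y, c, rng) for c in range(2)]

# Keep only predictors whose permutation raises the error, then a reduced
# model would be refit on those columns and its R^2 compared with r2_pct.
selected = [c for c, inc in enumerate(importance) if inc > 0]
print(r2_pct, importance, selected)
```

In this sketch, shuffling the irrelevant predictor leaves the error unchanged, so only the informative predictor is selected. With a real fitted forest, the same comparison would be made between the R² of the full model and that of the model refit on the selected variables.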