. 2023 Oct 17;11:17. doi: 10.1186/s40170-023-00319-x

Table 4.

Balanced error rate (BER) for classification using random forest models, stratified by follow-up time from sample collection to colorectal cancer diagnosis of cases

                                             < 5 years  5–9 years  10–15 years  > 15 years  All samples
Three-level outcomes
  Location (proximal, distal, rectal) [a]         0.72       0.63         0.66        0.59         0.62
  KRAS/BRAF (KRAS, BRAF, both wt) [a]             0.68       0.67         0.70        0.64         0.67
Two-level outcomes
  Stage (stages I–II and stages III–IV) [b]       0.50       0.41         0.44        0.46         0.49
  KRAS (mutation, wild type) [b]                  0.50       0.47         0.42        0.44         0.50
  BRAF (mutation, wild type) [b]                  0.50       0.50         0.52        0.51         0.50
  MSI (MSI, MSS) [b]                              0.50       0.51         0.50        0.50         0.50

wt, wild type; MSI, microsatellite instability; MSS, microsatellite stable. None of the potential confounders (body mass index, smoking status, education level, diabetes, alcohol intake, and recreational physical activity) was selected in the variable selection step of the random forest models, with the exception of body mass index in the tumor location analysis restricted to samples taken < 5 years prior to diagnosis.
[a] Balanced error rate for a three-class problem with expected BER by chance of 0.67.
[b] Balanced error rate for a two-class problem with expected BER by chance of 0.50.
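
The chance levels quoted in the footnote are consistent with the usual definition of the balanced error rate as the mean of the per-class error rates: a classifier that ignores the data and always predicts a single class is wrong on every sample of the other K − 1 classes, giving 1 − 1/K (0.67 for K = 3, 0.50 for K = 2). The following is a minimal Python sketch under that assumption; the source does not spell out its formula, and the function name and the toy labels are illustrative, not study data.

```python
from collections import defaultdict


def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class error rates, over the classes present in y_true.

    Assumes the standard BER definition; the source does not state its formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    totals = defaultdict(int)  # samples per true class
    errors = defaultdict(int)  # misclassified samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth != pred:
            errors[truth] += 1

    per_class = [errors[c] / totals[c] for c in totals]
    return sum(per_class) / len(per_class)


# Toy, deliberately imbalanced labels (hypothetical, not from the study).
three_true = ["proximal"] * 6 + ["distal"] * 3 + ["rectal"] * 1
three_pred = ["proximal"] * 10  # always predict the majority class
three_class_chance = balanced_error_rate(three_true, three_pred)

two_true = ["wild type"] * 8 + ["mutation"] * 2
two_pred = ["wild type"] * 10  # always predict the majority class
two_class_chance = balanced_error_rate(two_true, two_pred)

print(round(three_class_chance, 2), round(two_class_chance, 2))
```

Because each class contributes equally regardless of its size, class imbalance does not move these chance levels, which is why a BER near 0.67 (three-level outcomes) or 0.50 (two-level outcomes) in the table indicates performance no better than chance.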