Figure 6. Multinomial Regression Model (MRM) and Classification And Regression Tree (CART) process flows.
A: MRM process flow. The model is a simple extension of binary logistic regression that allows the dependent (outcome) variable to take more than two categories. The fitted model can then be applied to new explanatory variables (i.e. observations without known genotypes) to predict the unknown genotypes. B: CART process flow. This method builds a binary decision tree (i.e. a series of evaluations, each based on a single concomitant variable) and aims to split the data so that individuals are maximally separated in terms of the variable of interest. At each point, the evaluation of an individual is either positive or negative, and the procedure seeks a cut-off point within the range of values of the concomitant variable such that the positive and negative groups each contain the maximal number of individuals of the same type. This learned series of evaluations can then be applied to a new set of individuals whose concomitant variables are known but whose types are not, in order to predict their types. Here the concomitant variables are the intensities, while the “types” are genotypes.
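
The two prediction steps described in the caption can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The function names (`multinomial_probs`, `best_cutoff`), the coefficient values and the intensity values are all hypothetical and chosen only for illustration. The cut-off search counts individuals that match the majority type on their side of the split, following the wording of the caption; CART implementations generally optimise an impurity measure such as the Gini index instead.

```python
import math
from collections import Counter


def multinomial_probs(x, coefs):
    """Genotype probabilities for one individual under a multinomial logistic model.

    x     : list of explanatory variables (here, intensities)
    coefs : dict mapping genotype -> (intercept, list of slopes); hypothetical values
    """
    scores = {
        g: b0 + sum(b * xi for b, xi in zip(slopes, x))
        for g, (b0, slopes) in coefs.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(exps.values())
    return {g: e / total for g, e in exps.items()}


def best_cutoff(values, labels):
    """Search one concomitant variable for the cut-off that maximises the number
    of individuals sharing the majority type of their side of the split.

    Returns (cutoff, n_matching); cutoff is None if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    best = (None, -1)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a cut-off
        left = Counter(label for _, label in pairs[:i])
        right = Counter(label for _, label in pairs[i:])
        n_matching = left.most_common(1)[0][1] + right.most_common(1)[0][1]
        if n_matching > best[1]:
            best = ((low + high) / 2, n_matching)
    return best


# Illustrative data only: six individuals with known genotypes.
intensities = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
genotypes = ["AA", "AA", "AA", "AB", "AB", "AB"]
cutoff, n_matching = best_cutoff(intensities, genotypes)

# Hypothetical fitted coefficients for a three-genotype model.
coefs = {"AA": (1.0, [-2.0]), "AB": (0.0, [0.0]), "BB": (-1.0, [2.0])}
probs = multinomial_probs([0.15], coefs)
predicted = max(probs, key=probs.get)
```

A full tree would repeat the cut-off search recursively within each resulting group and across all concomitant variables, and the multinomial coefficients would be estimated from individuals with known genotypes rather than set by hand.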