Overview of the RCDML pipeline. (A) The RCDML pipeline consists of four main processing steps (diamonds): data preprocessing, feature selection, model training, and validation. Each step has an input and an output (squares). The descriptions (ovals) are shown next to the corresponding workflow step. The final output of the model is a confusion matrix giving the numbers of false positives, false negatives, true positives, and true negatives. (B) Splitting of the data into training, validation, and test sets. The original dataset is split into two sets: a training and validation set, and a holdout set. The training and validation set is divided into five folds; at each iteration, one fold is used for validation and the remaining four for training. After the model is selected, it is retrained on the full training and validation set and tested on the holdout set.
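
The splitting scheme in panel (B) and the confusion-matrix output in panel (A) can be illustrated with a minimal sketch. This is not the RCDML implementation: the function names, the 20% holdout fraction, the random seed, and the sample count of 100 are all assumptions chosen for illustration, since the caption specifies only the two-way split and the five folds.

```python
import random


def holdout_split(n_samples, holdout_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train_val, holdout).

    The 20% holdout fraction is an assumption; the caption does not state it.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(n_samples * holdout_fraction))
    return indices[n_holdout:], indices[:n_holdout]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each fold is validation once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


# Hypothetical dataset of 100 samples: 80 for training/validation, 20 held out.
train_val, holdout = holdout_split(100)

# Five-fold cross-validation on the training and validation set only.
splits = list(k_fold(train_val, k=5))

# Toy labels and predictions, used only to show the confusion-matrix counts.
example_counts = confusion_counts([1, 0, 1, 0], [1, 1, 0, 0])
```

In this sketch the holdout indices never enter the cross-validation loop, which mirrors the caption: model selection uses only the five folds, and the holdout set is reserved for the final test of the selected model.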