Skip to main content
. 2023 Mar 8;2:e41264. doi: 10.2196/41264

Figure 3.

Figure 3

A visual illustration of the sequential forward selection process for identifying feature subsets that maximize the performance of the ML pipeline. The candidate feature subsets were evaluated by using 5-fold cross-validation. For each subset of document types, 40 experiments were conducted with all possible combinations of 5 folds, 2 vectorizers, and 4 ML models. FPR: false positive rate. ML: machine learning; TF-IDF: term frequency-inverse document frequency; TPR: true positive rate.