2020 Nov 3;10:18926. doi: 10.1038/s41598-020-76141-y

Figure 5.

Workflow of the TPOT pipeline. At the start, each group of original data was randomly divided into a training set and a test set in an 8:2 ratio. Within the pipeline, the training set was repeatedly passed through data cleaning, feature selection, feature construction, feature processing, model selection, and parameter optimization until a pipeline (classifier) with parameters optimized for a specific model was selected. In the last stage, the training and test data were combined and fed into the optimal pipeline to verify that the pipeline and its parameters were optimal. The operators selected in the best pipeline include built-in TPOT operators (OneHotEncoder, FeatureSetSelector) and classifiers from the scikit-learn library (ExtraTreesClassifier and RandomForestClassifier).
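
The split-then-select workflow in the caption can be illustrated with a minimal sketch that uses scikit-learn alone. This is not the authors' code and does not run TPOT: a 5-fold cross-validation comparison stands in for TPOT's automated search. The synthetic data, the hyperparameters, and all variable names are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption: the study's data are not reproduced here).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)

# Random 8:2 split into training and test sets, as described in the caption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The two scikit-learn classifiers named in the caption.
# Hyperparameters are illustrative, not the values TPOT selected.
candidates = {
    "ExtraTreesClassifier": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "RandomForestClassifier": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Model selection on the training set only, via 5-fold cross-validation
# (a simple substitute for TPOT's search over pipelines).
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the selected classifier on the full training set, then score it
# on the held-out test set.
best_model = candidates[best_name].fit(X_train, y_train)
test_accuracy = best_model.score(X_test, y_test)

print(f"Selected: {best_name}")
print(f"Mean CV accuracy: {cv_scores[best_name]:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Keeping the test set out of model selection gives a check on the chosen classifier that is independent of the data used to pick it. The sketch omits the caption's final step, in which the training and test data are combined and passed through the optimal pipeline.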