2023 Dec 21;15(2):e02050-23. doi: 10.1128/mbio.02050-23

TABLE 5.

Information that is useful to report when describing ML modeling efforts

Raw data processing: Contaminant removal, abundance/prevalence filtering, normalization, and transformation (see references 15, 36).
Data splitting: Ratios of the train, test, and validation splits; variation in performance across different data splits; source of any external test sets.
Feature selection: Method of feature selection; the stage of the modeling process at which it was applied; whether it was performed within cross-validation folds.
Model type: Type(s) of model(s) used and the rationale for choosing them; metrics used for model evaluation and optimization.
Hyperparameters: Final hyperparameters selected for the model and the methods used for hyperparameter optimization.
Code: Open-source code and publicly available data are ideal but not always required.
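
One way to make the items in the table easy to check before submission is to keep them in a small machine-readable record. The following is a minimal sketch in Python, not a format prescribed by the source: the class name `MLReport`, its field names, the `REQUIRED` tuple, and the helper `missing_items` are all hypothetical, and the example values are invented for illustration only. The code row is treated as optional here because the table describes open code and data as ideal but not always required.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class MLReport:
    """Hypothetical record with one field per row of the table."""
    raw_data_processing: Optional[str] = None
    data_splitting: Optional[str] = None
    feature_selection: Optional[str] = None
    model_type: Optional[str] = None
    hyperparameters: Optional[str] = None
    code: Optional[str] = None  # optional in this sketch, so not enforced


# Rows treated as mandatory in this sketch (an assumption, not a rule
# stated in the table).
REQUIRED = (
    "raw_data_processing",
    "data_splitting",
    "feature_selection",
    "model_type",
    "hyperparameters",
)


def missing_items(report: MLReport) -> list:
    """Return the names of required fields that are unset or empty."""
    return [name for name in REQUIRED if not getattr(report, name)]


# Illustrative values only; they do not describe any real study.
report = MLReport(
    raw_data_processing="prevalence filter, then centered log-ratio transform",
    data_splitting="80/20 train/test, repeated over 100 random seeds",
    model_type="random forest, evaluated by AUROC",
)

print(missing_items(report))  # ['feature_selection', 'hyperparameters']
print(json.dumps(asdict(report), indent=2))
```

A record like this could be serialized alongside the analysis code so that reviewers can see at a glance which items have been documented and which are still missing.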