TABLE 1.
Testing performance of machine learning classifiers for Salmonella prevalence prediction
| Model name | Optimal hyperparameters | Precision | Recall | F1 score | Accuracy |
|---|---|---|---|---|---|
| Adaptive Boosting Classifier | learning_rate: 0.1, n_estimators: 100 | 0.90 | 0.87 | 0.87 | 0.87 |
| Decision Tree Classifier | criterion: “gini”, max_depth: None, min_samples_leaf: 2, min_samples_split: 5 | 0.96 | 0.96 | 0.96 | 0.96 |
| Gaussian Naive Bayes | var_smoothing: 1 × 10⁻⁹ | 0.87 | 0.87 | 0.87 | 0.87 |
| Logistic Regression | C: 100, solver: “newton-cg” | 0.84 | 0.83 | 0.83 | 0.83 |
| Multi-layer Perceptron Classifier | activation: “tanh”, alpha: 0.0001, hidden_layer_sizes: (100, 50, 50), learning_rate: “constant”, solver: “adam” | 0.37 | 0.61 | 0.46 | 0.61 |
| Random Forest Classifier | max_depth: None, max_features: “log2”, min_samples_split: 10, n_estimators: 100 | 0.87 | 0.87 | 0.87 | 0.87 |
| Stochastic Gradient Descent Classifier | alpha: 0.001, learning_rate: “optimal”, loss: “perceptron”, penalty: “l1” | 0.37 | 0.61 | 0.46 | 0.61 |
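
In every row the recall equals the accuracy, which is what support-weighted averaging of per-class metrics always produces. The sketch below is an illustration under stated assumptions, not the authors' evaluation code: it assumes the reported scores are support-weighted averages and uses a hypothetical binary test set in which the majority class makes up 61% of the samples. Under those assumptions, a classifier that predicts only the majority class yields the same four values reported for the Multi-layer Perceptron and Stochastic Gradient Descent classifiers. The function name `weighted_metrics` and the label arrays are invented for the example.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return support-weighted precision, recall, F1, and plain accuracy."""
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        # Assumption: precision for a class that is never predicted counts as 0.
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / n  # class share of the test set
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return precision, recall, f1, accuracy


# Hypothetical test set: 61 samples of class 0 and 39 of class 1.
y_true = [0] * 61 + [1] * 39
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

metrics = weighted_metrics(y_true, y_pred)
print([round(m, 2) for m in metrics])  # [0.37, 0.61, 0.46, 0.61]
```

If the assumptions hold, the two identical rows would indicate that those models did not learn to separate the classes, rather than that they share some genuine predictive behavior. Confirming this would require the per-class scores or confusion matrices, which the table does not report.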