2020 Dec 11;20(24):7096. doi: 10.3390/s20247096

Table 3.

Model category, type, R packages used and short descriptions of classification models.

| Model Category | Model Type | R Package | Model Description |
|---|---|---|---|
| Logic-based | Classification and regression tree (CART) | rpart [27] | Lightweight and fast decision tree structure that keeps each decision visible. However, single trees lack the complexity of other methods and may not perform as well as ensemble algorithms. |
| Ensemble (bagging) | Random forest (RF) | randomForest [28] | Builds an ensemble of many independent decision trees, each trained on a random sample of the training data drawn with replacement (known as bagging). The large number of trees is used to form a consensus, and the most common output across the trees is selected as the prediction. |
| Ensemble (boosting) | eXtreme gradient boosting (XGB); C5.0 (C50); Stochastic gradient boosting (GBM) | xgboost [15]; C50 [30]; gbm [31] | Boosting methods fit trees on a modified version of the original data. By training multiple models additively and in sequence, these algorithms can identify and correct the errors of weaker, single decision trees. For example, GBM differs from RF in the order in which the decision trees are built and in the method by which the results are combined. |
| Support vector machine | Support vector machine (SVM), with radial basis function | e1071 [29] | SVM is an effective tool for datasets with large dimensionality (i.e., a large number of features). |
| Neural network | Feed-forward neural network (Nnet); Model averaged neural network (AvNnet) | nnet [32]; avnnet [33] | Influenced by the function and structure of biological neural networks and can learn highly complex patterns. By using hidden layers, they create intermediary representations of data that other models cannot reproduce. AvNnet fits multiple Nnet models and uses the average of the predictions from each constituent model. |
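
The two combination rules described in the table, majority voting over bagged trees (RF) and averaging of constituent predictions (AvNnet), can be illustrated with a minimal sketch. It is written in plain Python rather than with the R packages listed above, and it is not the implementation used in the study: the one-split "stump" learner, the function names, and the default of 25 models are assumptions made for illustration only.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split decision stump on 1-D data (hypothetical toy learner).

    Tries every observed value as a threshold and keeps the split with the
    fewest training errors.
    """
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label


def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train independent stumps, each on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_majority(models, x):
    """Combine the models by consensus: return the most common predicted class."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def average_probabilities(per_model_probs):
    """Average class probabilities across models (the AvNnet-style combination)."""
    n = len(per_model_probs)
    return [sum(column) / n for column in zip(*per_model_probs)]
```

With a toy dataset such as `xs = [1, 2, 3, 4, 5, 6, 7, 8]` and two made-up class labels split at the midpoint, `predict_majority(bagged_stumps(xs, ys), x)` returns the consensus class for a new value `x`. Boosting differs in that each new model depends on the errors of the previous ones, so the models cannot be trained independently as they are here.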