Diagnostics (Basel). 2024 Jan 10;14(2):156. doi: 10.3390/diagnostics14020156

Table 1.

An overview of AI models with brief descriptions. (Minimal illustrative code sketches for these models follow the table.)

Support vector machine
A supervised ML model used for both classification and regression. It works by finding a hyperplane that maximally separates data points of different classes in a high-dimensional space, aiming to maximize the margin between the classes.
Advantages:
  • Effectively handles unstructured and semi-structured data
  • Low generalization error
Disadvantages:
  • Requires long training time for large datasets
  • Results may be difficult to interpret

Logistic regression
A supervised ML algorithm used for binary classification. It models the probability of an instance belonging to a particular class using the logistic function, and the decision boundary is a linear combination of input features.
Advantages:
  • Provides insight into feature relevance
  • Efficient for small datasets
  • Rapid training
Disadvantages:
  • Assumption of linearity
  • Sensitivity to outliers

Random forest
A supervised ML algorithm used for classification and regression tasks. It builds multiple decision trees during training and merges their predictions to improve accuracy and robustness. Each tree is trained on a random subset of the data, and the final prediction is determined by a majority vote (for classification) or an average (for regression).
Advantages:
  • High accuracy in capturing complex relationships in data
  • Efficient on large datasets
  • Provides insight into feature relevance
Disadvantages:
  • Sensitive to small changes in data
  • Bias towards dominant classes

Gradient-boosted decision tree
An ensemble learning technique used for both classification and regression. It builds a series of decision trees sequentially, with each tree correcting the errors of the previous ones. It combines the predictions of individual trees to create a strong predictive model.
Advantages:
  • High accuracy in capturing complex relationships in data
  • Provides insight into feature relevance
  • Effective on structured and unstructured data
Disadvantages:
  • Requires long training time for large datasets
  • Complex interpretability

k-nearest neighbour
A supervised ML algorithm for classification and regression that predicts a data point’s label (by majority vote) or value (by averaging) from its nearest neighbours in the dataset. The “k” represents the number of neighbours considered for the prediction.
Advantages:
  • No training period
  • Supports dynamic data addition
  • Efficient for small datasets
Disadvantages:
  • Poor performance with large datasets
  • Impact of irrelevant features
  • Sensitivity to outliers and missing data

Naive Bayes
A supervised ML algorithm for classification. It is based on Bayes’ theorem and assumes independence between features. The algorithm calculates the probability of a data point belonging to a particular class by considering the probabilities of its individual features.
Advantages:
  • Simple and fast
  • Requires small amount of training data
  • Less sensitive to irrelevant features
Disadvantages:
  • Assumption of feature independence
  • “Zero Probability” issue
  • Sensitivity to outliers and missing data

Convolutional neural network
A DL algorithm designed for image and video recognition. It uses convolutional layers to automatically and adaptively learn spatial hierarchies of features from input data.
Advantages:
  • Learns hierarchical features from spatial data
  • Allows parameter sharing, reducing overfitting
  • Automated feature learning
Disadvantages:
  • Large datasets needed
  • Requires long training time
  • Complex interpretability
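
The classical models in Table 1 can be illustrated with a short, self-contained sketch. The snippet below is not from the article: the use of scikit-learn, the breast-cancer toy dataset, the train/test split, and all hyperparameters are illustrative assumptions, chosen only to show how each algorithm is typically instantiated and compared.

```python
# A minimal sketch (not from the article): fitting the classical models of
# Table 1 with scikit-learn on a toy dataset. Dataset, split, and
# hyperparameters are illustrative assumptions only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# One entry per classical row of Table 1 (CNN excluded; see the separate sketch below).
models = {
    "Support vector machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "Gradient-boosted decision tree": GradientBoostingClassifier(random_state=42),
    "k-nearest neighbour": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Naive Bayes": GaussianNB(),
}

for name, model in models.items():
    model.fit(X_train, y_train)                            # train on the training split
    acc = accuracy_score(y_test, model.predict(X_test))    # evaluate on held-out data
    print(f"{name}: accuracy = {acc:.3f}")
```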
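
The convolutional neural network row describes a DL model rather than a classical one. A minimal sketch is given below, assuming PyTorch as the framework; the single-channel 28x28 input size, layer widths, and two-class output are illustrative assumptions, not details from the article.

```python
# A minimal sketch (not from the article) of a small convolutional neural
# network of the kind described in Table 1, written with PyTorch.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Convolutional layers learn spatial feature hierarchies with shared weights.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 14x14 -> 7x7
        )
        # A fully connected head maps the learned features to class scores.
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

# Forward pass on a dummy batch of four 28x28 grayscale images.
model = SmallCNN(num_classes=2)
logits = model(torch.randn(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 2])
```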