Skip to main content
. 2024 Mar 28;10(4):81. doi: 10.3390/jimaging10040081

Table 1.

Display of different types of machine learning models used in computer vision.

Approach Supervision Machine Learning Model Description
Deep
Learning
Supervised Convolutional Neural Network Mostly used for classification and segmentation. It includes wide range of model architectures such as ResNet, VGG-net, and AlexNet.
Mask-region-based Convolutional Neural Network CNN type primarily employed for detecting objects in input images.
YOLO CNN types primarily employed for image segmentation or classification.
U-net Type of CNN mainly used for image segmentation.
Gated recurrent unit Type of recurrent neural network tailored for modeling time dependent data to address long-range dependencies in sequential data.
Long short-term memory (LSTM)
Vision transformer Novel category of CNNs. Adopts transformer architecture commonly used in NLP and shows high performance in image classification benchmarks.
Unsupervised Convolutional Deep Belief Network (CDBN) Type of deep generative models that is constructed by stacking max-pooling Convolutional
Restricted Boltzmann Machines (CRBMs).
Autoencoder A type of neural network that specializes in learning to convert data into a compact and efficient representation, often employed for the purpose of dimensionality reduction.
Traditional Supervised k-nearest neighbors Assigns class labels or values according to the distance of the input data to the k-nearest neighbors in the training data.
Binary Tree Decision-making algorithm that navigates the tree from root to leaf to make decisions based on specific features or attributes.
Naïve Bayes Probabilistic machine learning algorithm that classifies data based on the conditional independence between every pair of features.
Support vector machine (SVM) It uses the kernel trick to find a linear decision boundary to separate input data in the transformed space.
Fuzzy Inference System A computational model that uses fuzzy logic to perform reasoning on uncertain or imprecise information.
Fisher’s linear discriminant analysis Classifies input data based on linear combination of features that represent items in each class.
Linear Mixed Model An extension of simple linear models that allow fixed and random effects, useful for complex data
Logistic/linear regression This is a statistical model that uses the logistic function to predict the probability of a specific class.
Supervised
and
Unsupervised
Random forest Ensemble learning method that comprises multiple trees trained on random subsets of data. The final prediction is aggregated from all trees.
Neural network It is a conventional machine learning model employed for classification and regression. In comparison to existing deep methods, it exhibits lower accuracy.
Singular Value Decomposition (SVD) It decomposes the input feature space into 3 generic and familiar matrices.
Unsupervised Fuzzy C-means A computational model that uses fuzzy logic to perform reasoning on uncertain or imprecise information
Gaussian Mixture Model Segmentation It uses Gaussian distribution to partition pixels into similar segments

Table 1 outlines the types of machine learning models used in CV algorithm development. As computer vision has progressed over the years, the use of deep supervised models has increased. This innovation includes the use of the transformers and autoencoders listed above.