. 2024 Mar 28;10(4):81. doi: 10.3390/jimaging10040081

Table 1.

Display of different types of machine learning models used in computer vision.

Approach	Supervision	Machine Learning Model	Description
Deep Learning	Supervised	Convolutional Neural Network	Mostly used for classification and segmentation. It includes wide range of model architectures such as ResNet, VGG-net, and AlexNet.
		Mask-region-based Convolutional Neural Network	CNN type primarily employed for detecting objects in input images.
		YOLO	CNN types primarily employed for image segmentation or classification.
		U-net	Type of CNN mainly used for image segmentation.
		Gated recurrent unit	Type of recurrent neural network tailored for modeling time dependent data to address long-range dependencies in sequential data.
		Long short-term memory (LSTM)
		Vision transformer	Novel category of CNNs. Adopts transformer architecture commonly used in NLP and shows high performance in image classification benchmarks.
	Unsupervised	Convolutional Deep Belief Network (CDBN)	Type of deep generative models that is constructed by stacking max-pooling Convolutional Restricted Boltzmann Machines (CRBMs).
	Unsupervised	Autoencoder	A type of neural network that specializes in learning to convert data into a compact and efficient representation, often employed for the purpose of dimensionality reduction.
Traditional	Supervised	k-nearest neighbors	Assigns class labels or values according to the distance of the input data to the k-nearest neighbors in the training data.
		Binary Tree	Decision-making algorithm that navigates the tree from root to leaf to make decisions based on specific features or attributes.
		Naïve Bayes	Probabilistic machine learning algorithm that classifies data based on the conditional independence between every pair of features.
		Support vector machine (SVM)	It uses the kernel trick to find a linear decision boundary to separate input data in the transformed space.
		Fuzzy Inference System	A computational model that uses fuzzy logic to perform reasoning on uncertain or imprecise information.
		Fisher’s linear discriminant analysis	Classifies input data based on linear combination of features that represent items in each class.
		Linear Mixed Model	An extension of simple linear models that allow fixed and random effects, useful for complex data
		Logistic/linear regression	This is a statistical model that uses the logistic function to predict the probability of a specific class.
	Supervised and Unsupervised	Random forest	Ensemble learning method that comprises multiple trees trained on random subsets of data. The final prediction is aggregated from all trees.
		Neural network	It is a conventional machine learning model employed for classification and regression. In comparison to existing deep methods, it exhibits lower accuracy.
		Singular Value Decomposition (SVD)	It decomposes the input feature space into 3 generic and familiar matrices.
	Unsupervised	Fuzzy C-means	A computational model that uses fuzzy logic to perform reasoning on uncertain or imprecise information
	Unsupervised	Gaussian Mixture Model Segmentation	It uses Gaussian distribution to partition pixels into similar segments

Table 1 outlines the types of machine learning models used in CV algorithm development. As computer vision has progressed over the years, the use of deep supervised models has increased. This innovation includes the use of the transformers and autoencoders listed above.