Table 2.
Computational Techniques for Machine Learning
Technique | Advantages | Limitations | Application Example
---|---|---|---
Supervised learning | | |
Linear or logistic regression | Easy to use, well suited to small datasets, easy to interpret and understand, and less prone to overfitting | Inappropriate for nonlinear modeling and large datasets, relatively low predictive accuracy, and (for linear regression) unable to perform classification | Evaluate the risk of hypertension (30,31) and predict incident hypertension (32)
Artificial neural network | Good for large datasets and nonlinear modeling; can capture potential interactions between variables | Time-consuming; difficult to interpret or understand (the "black box" effect); prone to overfitting; limited generalizability; vulnerable to adversarial examples; requires hyperparameter tuning | Evaluate the risk of hypertension (30,33)
Random forest | Good for nonlinear modeling and assessing variable importance; well suited to prediction and classification | Time-consuming; less useful for descriptive analysis; prone to overfitting; inappropriate for large datasets; requires high computational power | Predict incident hypertension (32), hypertension risk (31), and transitions in hypertension control status (34)
Support vector machines | High predictive accuracy; kernel methods extend linear classifiers to nonlinear problems; good for small datasets, text classification, and image recognition | Inappropriate for large or noisy datasets; nonparametric inference (no P value); not ideal for multiclass classification or high-dimensional spaces | Predict incident hypertension (32)
Unsupervised learning | | |
Cluster analysis | Easy to understand via a dendrogram; insensitive to outliers (hierarchical clustering); simple to use | Difficult to choose k (the number of clusters); sensitive to outliers (k-means clustering); does not handle missing data; arbitrary distance metric and linkage criteria; nonparametric inference (no P value) | Classification of hypertension (35)
Principal component analysis | Less prone to overfitting; reduces noise and feature dimensionality with minimal loss of information | Inappropriate for nonlinear modeling; difficult to understand or interpret; some information may be lost in discarded dimensions | Evaluation of medication adherence (39)
Combined tools | | |
Ensemble | Less prone to overfitting; well suited to multiclass classification; reduces bias; can combine results from different algorithms (supervised and unsupervised) | Sensitive to outliers | Predict incident hypertension (38)
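The supervised and ensemble techniques in the table can be sketched side by side. The following is an illustrative scikit-learn example on synthetic data; the dataset, hyperparameters, and the specific voting ensemble are assumptions for demonstration, not the methods used in the cited hypertension studies:

```python
# Illustrative comparison of supervised learners and a voting ensemble
# on a synthetic binary-classification task (assumed, not from the source).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic dataset standing in for a clinical risk cohort
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": SVC(kernel="rbf"),
}
# Ensemble: combine the three base learners by majority (hard) voting
models["voting ensemble"] = VotingClassifier(
    estimators=[(name, model) for name, model in models.items()])

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.2f}")
```

Note how the table's trade-offs surface even here: the logistic regression is the fastest to fit and easiest to interpret via its coefficients, while the ensemble trades that transparency for reduced variance across the base learners.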
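The unsupervised techniques can be illustrated the same way. This sketch (synthetic data and parameters assumed, not from the cited studies) first reduces dimensionality with PCA, then clusters the reduced data with k-means; note that k must be supplied by the analyst, which is the limitation flagged in the table:

```python
# Illustrative PCA + k-means pipeline on synthetic data (assumed example).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic 8-dimensional data with 3 latent groups
X, _ = make_blobs(n_samples=300, centers=3, n_features=8, random_state=0)

# PCA: project 8 features onto 2 components with minimal information loss
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)
print(f"variance explained by 2 components: "
      f"{pca.explained_variance_ratio_.sum():.2f}")

# k-means: the number of clusters k=3 must be chosen in advance
km = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = km.fit_predict(X2)
```

Chaining PCA before clustering is a common design choice: the table's PCA row (noise and dimensionality reduction) directly mitigates the k-means row's sensitivity to noisy, high-dimensional input.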