JACC Asia. 2021 Sep 21;1(2):162–172. doi: 10.1016/j.jacasi.2021.07.005

Table 2.

Computational Techniques for Machine Learning

| Technique | Advantages | Limitations | Application example |
| --- | --- | --- | --- |
| **Supervised learning** | | | |
| Linear or logistic regression model | Easy to use; good for small datasets; easy to interpret and understand; less prone to overfitting | Inappropriate for nonlinear modeling and large datasets; relatively low predictive accuracy; unable to perform classification (linear regression) | Evaluating the risk of hypertension (30,31); predicting incident hypertension (32) |
| Artificial neural network | Good for large datasets and nonlinear modeling; easy to identify potential interactions between variables | Time-consuming; difficult to interpret or understand (eg, "black box" effect); prone to overfitting; problems with generalizability; vulnerable to adversarial examples; requires hyperparameter tuning | Evaluating the risk of hypertension (30,33) |
| Random forest | Good for nonlinear modeling and variable-importance assessment; well suited for prediction and classification | Time-consuming; less useful for descriptive analysis; prone to overfitting; inappropriate for large datasets; requires high computational power | Predicting incident hypertension (32), the risk of hypertension (31), and transitions in hypertension control status (34) |
| Support vector machines | High predictive accuracy; can transform a linear classifier into a nonlinear one; good for small datasets, text classification, and image recognition | Inappropriate for large and noisy datasets; nonparametric inference (no P value); not ideal for multiclass classification or high-dimensional spaces | Predicting incident hypertension (32) |
| **Unsupervised learning** | | | |
| Cluster analysis | Easy to understand via dendrogram; insensitive to outliers (hierarchical clustering); simple to use | Difficult to choose a k value (number of clusters); sensitive to outliers (k-means clustering); does not handle missing data; arbitrary distance metric and linkage criteria; nonparametric inference (no P value) | Classification of hypertension (35) |
| Principal component analysis | Less prone to overfitting; good for reducing noise and feature dimensionality; minimal loss of information | Inappropriate for nonlinear modeling; difficult to understand or interpret; possible loss of information in some dimensions | Evaluation of medication adherence (39) |
| **Combined tools** | | | |
| Ensemble | Less prone to overfitting; ideal for multiclass classification; easy to reduce biases; can combine results from different algorithms (supervised and unsupervised) | Sensitive to outliers | Prediction of incident hypertension (38) |
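The supervised techniques compared in Table 2 can be sketched side by side with scikit-learn. This is an illustrative example only, not code from the article: the synthetic dataset stands in for a cohort with a binary outcome such as incident hypertension, and all hyperparameters are arbitrary assumptions.

```python
# Minimal sketch (assumed setup): compare three supervised learners from
# Table 2 on a synthetic binary-classification dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic data standing in for clinical features and a binary outcome
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": SVC(kernel="rbf"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # held-out accuracy
    print(f"{name}: {scores[name]:.2f}")
```

In practice the trade-offs listed in the table show up here directly: the logistic model exposes interpretable coefficients, while the forest and SVM trade interpretability for flexibility on nonlinear structure.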
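For the unsupervised section, principal component analysis reduces feature dimensionality with minimal information loss, as the table notes. A minimal NumPy sketch, assuming an arbitrary 200-sample, 5-feature data matrix:

```python
# Minimal PCA sketch (illustrative data, not from the article): project a
# 5-feature dataset onto its 2 highest-variance components.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # assumed data matrix
X_centered = X - X.mean(axis=0)      # PCA requires centered features

# Eigendecomposition of the covariance matrix
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

# Keep the two components with the largest variance
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order[:2]]
X_reduced = X_centered @ components

print(X_reduced.shape)  # (200, 2)
explained = eigvals[order[:2]].sum() / eigvals.sum()
print(f"variance explained: {explained:.2f}")
```

The `explained` ratio quantifies the table's caveat about "possible loss of information in some dimensions": whatever variance the discarded components carried is gone from `X_reduced`.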