Table 2.
Computational Techniques for Machine Learning
Technique | Advantages | Limitations | Application Example
---|---|---|---
Supervised learning | | |
Linear or logistic regression | Easy to use, well suited to small datasets, easy to interpret and understand, and less prone to overfitting | Inappropriate for nonlinear modeling and large datasets, relatively low predictive accuracy, and (for linear regression) unable to perform classification | Evaluate the risk of hypertension (30,31) and predict incident hypertension (32)
Artificial neural network | Good for large datasets and nonlinear modeling; can capture potential interactions between variables | Time-consuming; difficult to interpret or understand (the "black box" effect); prone to overfitting; limited generalizability; vulnerable to adversarial examples; requires hyperparameter tuning | Evaluate the risk of hypertension (30,33)
Random forest | Good for nonlinear modeling and assessing variable importance; well suited to prediction and classification | Time-consuming; less useful for descriptive analysis; prone to overfitting; inappropriate for large datasets; requires high computational power | Predict incident hypertension (32), hypertension risk (31), and transitions in hypertension control status (34)
Support vector machines | High predictive accuracy; kernel methods extend linear classifiers to nonlinear problems; good for small datasets, text classification, and image recognition | Inappropriate for large or noisy datasets; nonparametric inference (no P value); not ideal for multiclass classification or high-dimensional spaces | Predict incident hypertension (32)
Unsupervised learning | | |
Cluster analysis | Easy to understand via a dendrogram; insensitive to outliers (hierarchical clustering); simple to use | Difficult to choose k (the number of clusters); sensitive to outliers (k-means clustering); does not handle missing data; arbitrary distance metric and linkage criteria; nonparametric inference (no P value) | Classification of hypertension (35)
Principal component analysis | Less prone to overfitting; reduces noise and feature dimensionality with minimal loss of information | Inappropriate for nonlinear modeling; difficult to understand or interpret; some information may be lost in discarded dimensions | Evaluation of medication adherence (39)
Combined tools | | |
Ensemble | Less prone to overfitting; well suited to multiclass classification; reduces bias; can combine results from different algorithms (supervised and unsupervised) | Sensitive to outliers | Predict incident hypertension (38)
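The supervised and ensemble techniques in the table can be sketched side by side. The following is an illustrative scikit-learn example on synthetic data; the dataset, hyperparameters, and the specific voting ensemble are assumptions for demonstration, not the methods used in the cited hypertension studies:

```python
# Illustrative comparison of supervised learners and a voting ensemble
# on a synthetic binary-classification task (assumed, not from the source).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic dataset standing in for a clinical risk cohort
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": SVC(kernel="rbf"),
}
# Ensemble: combine the three base learners by majority (hard) voting
models["voting ensemble"] = VotingClassifier(
    estimators=[(name, model) for name, model in models.items()])

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.2f}")
```

Note how the table's trade-offs surface even here: the logistic regression is the fastest to fit and easiest to interpret via its coefficients, while the ensemble trades that transparency for reduced variance across the base learners.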
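The unsupervised techniques can be illustrated the same way. This sketch (synthetic data and parameters assumed, not from the cited studies) first reduces dimensionality with PCA, then clusters the reduced data with k-means; note that k must be supplied by the analyst, which is the limitation flagged in the table:

```python
# Illustrative PCA + k-means pipeline on synthetic data (assumed example).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic 8-dimensional data with 3 latent groups
X, _ = make_blobs(n_samples=300, centers=3, n_features=8, random_state=0)

# PCA: project 8 features onto 2 components with minimal information loss
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)
print(f"variance explained by 2 components: "
      f"{pca.explained_variance_ratio_.sum():.2f}")

# k-means: the number of clusters k=3 must be chosen in advance
km = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = km.fit_predict(X2)
```

Chaining PCA before clustering is a common design choice: the table's PCA row (noise and dimensionality reduction) directly mitigates the k-means row's sensitivity to noisy, high-dimensional input.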