2020 Jun 22;2(8):3115–3130. doi: 10.1039/d0na00388c

An overview of some basic machine learning algorithms.

| Algorithm | Brief introduction | Advantages | Disadvantages | Representative applications |
|---|---|---|---|---|
| Regression analysis | Finds regression equations and predicts dependent variables | Well developed and widely used | Needs large amounts of data and may overfit in practical applications | Machine learning with systematic density-functional theory calculations: application to melting temperatures of single- and binary-component solids |
| Naïve Bayes classifier | Classifies data into several categories by choosing the most probable one | Only a small amount of data is needed to estimate the essential parameters | The feature-independence hypothesis does not always hold | A naïve-Bayes classifier for damage detection in engineering materials |
| Support vector machine (SVM) | Finds a hyperplane that divides a group of points into two categories | Generalizes well and handles high-dimensional datasets properly | Not well suited to multiclass classification problems | PVP-SVM: sequence-based prediction of phage virion proteins using a support vector machine |
| Decision tree and random forest | Splits the source dataset into several subsets so that all data are judged and classified | The calculation process is easy to comprehend, and large amounts of data can be handled | A high-performance decision tree or random forest is difficult to obtain, and overfitting may occur | High-throughput machine-learning-driven synthesis of full-Heusler compounds |
| Artificial neural network (ANN) | By imitating neuron activity, an ANN automatically finds underlying patterns in its inputs | Great self-improving ability, robustness and high fault tolerance | Its internal calculation processes are very difficult to understand | Learning from the Harvard Clean Energy Project: the use of neural networks to accelerate materials discovery |
| Deep learning | Originated from ANNs; aims to build a neural network that analyzes data by imitating the human brain | The best self-adjusting and self-improving abilities compared with other ML methods | As a new trend in ML, deep learning has not yet been well studied, and many of its defects are still unclear | Artificial intelligence in neuropathology: deep learning-based assessment of tauopathy |
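
As an illustration of the naïve Bayes entry in the table, the following is a minimal sketch of a Gaussian naïve Bayes classifier in plain Python. It is not taken from the cited work: the function names `fit_gaussian_nb` and `predict_gaussian_nb`, the Gaussian likelihood, the variance floor, and the toy data with its "intact"/"damaged" labels are all assumptions made for this example. It shows the two points the table makes: the class with the highest posterior probability is chosen, and the features are treated as independent, so their log-likelihoods simply add up.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log-prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    n_total = len(X)
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small variance floor avoids division by zero (an assumption here).
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / n_total), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for sample x."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        # Independence hypothesis: per-feature log-likelihoods are summed.
        score = log_prior
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up data: two features per sample and two hypothetical classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]
y = ["intact", "intact", "intact", "damaged", "damaged", "damaged"]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.0]))  # prints "intact"
print(predict_gaussian_nb(model, [3.1, 0.6]))  # prints "damaged"
```

Only a class prior plus one mean and one variance per feature and class are estimated, which is why a small dataset can be enough. When features are strongly correlated, the summed log-likelihoods count the same evidence more than once, which is the disadvantage listed in the table.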