
Table 1. Summary of benefits, assumptions and limitations of different machine learning algorithms

Decision Tree (Quinlan 1986)

  Benefits:
  • easy to understand and efficient training algorithm
  • order of training instances has no effect on training
  • pruning can deal with the problem of overfitting

  Assumptions and/or Limitations:
  • classes must be mutually exclusive
  • final decision tree is dependent upon the order of attribute selection
  • errors in the training set can result in overly complex decision trees
  • missing values for an attribute make it unclear which branch to take when that attribute is tested
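
To make the pruning point concrete, here is a minimal sketch of training a depth-limited decision tree; scikit-learn, the iris dataset, and the depth limit are illustrative choices for the example, not part of the original table.

```python
# A minimal pruned-decision-tree sketch (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Limiting max_depth is a simple pre-pruning control: it caps tree
# complexity so errors in the training set are less likely to produce
# an overly complex tree.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))
```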

Naïve Bayes (Langley et al 1992)

  Benefits:
  • foundation based on statistical modelling
  • easy to understand and efficient training algorithm
  • order of training instances has no effect on training
  • useful across multiple domains

  Assumptions and/or Limitations:
  • assumes attributes are statistically independent*
  • assumes normal distribution on numeric attributes
  • classes must be mutually exclusive
  • redundant attributes mislead classification
  • attribute and class frequencies affect accuracy
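
As an illustration of the independence and normality assumptions listed above, the following sketch uses Gaussian naïve Bayes; scikit-learn and the iris dataset are assumptions for the example, not part of the table.

```python
# A minimal naive Bayes sketch (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# GaussianNB fits one normal distribution per attribute and class, then
# multiplies the per-attribute likelihoods as if attributes were
# statistically independent, exactly the assumptions listed above.
nb = GaussianNB()
nb.fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))
```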

k-Nearest Neighbour (Patrick & Fischer 1970; Aha 1992)

  Benefits:
  • fast classification of instances
  • useful for non-linear classification problems
  • robust with respect to irrelevant or novel attributes
  • tolerant of noisy instances or instances with missing attribute values
  • can be used for both regression and classification

  Assumptions and/or Limitations:
  • slower to update concept description
  • assumes that instances with similar attributes will have similar classifications
  • assumes that attributes will be equally relevant
  • becomes computationally expensive as the number of attributes increases
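
Because distance-based classification weights every attribute equally, scaling the attributes usually matters in practice; the sketch below pairs a standard scaler with a k-NN classifier. scikit-learn, k = 5, and the iris dataset are illustrative assumptions.

```python
# A minimal k-nearest-neighbour sketch (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardizing first matters because Euclidean distance treats every
# attribute as equally relevant, as the table notes.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```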

Neural Network (Rumelhart et al 1986)

  Benefits:
  • can be used for classification or regression
  • able to represent Boolean functions (AND, OR, NOT)
  • tolerant of noisy inputs
  • instances can be classified by more than one output

  Assumptions and/or Limitations:
  • difficult to understand the structure of the algorithm
  • too many attributes can result in overfitting
  • optimal network structure can only be determined by experimentation
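
The sketch below trains a small feed-forward network; the single hidden layer of 10 units stands in for the structural choice the table says can only be found by experimentation. scikit-learn's MLPClassifier and the iris dataset are assumptions for the example.

```python
# A minimal feed-forward neural network sketch (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# hidden_layer_sizes is the structure that, per the table, can only be
# tuned by experimentation; (10,) is one illustrative guess.
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)
print("test accuracy:", net.score(X_test, y_test))
```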

Support Vector Machine (Vapnik 1982; Russell and Norvig, pp 749–52)

  Benefits:
  • models nonlinear class boundaries
  • overfitting is unlikely to occur
  • computational complexity reduced to a quadratic optimization problem
  • easy to control the complexity of the decision rule and the frequency of error

  Assumptions and/or Limitations:
  • training is slow compared to Bayes and Decision Trees
  • difficult to determine optimal parameters when training data are not linearly separable
  • difficult to understand the structure of the algorithm
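
A kernelized SVM illustrates how nonlinear class boundaries are modelled while a single parameter, C, trades decision-rule complexity against training error; scikit-learn, the RBF kernel, and C = 1.0 are illustrative assumptions for this sketch.

```python
# A minimal support vector machine sketch (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel yields nonlinear class boundaries; C controls the
# trade-off between decision-rule complexity and training error.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
svm.fit(X_train, y_train)
print("test accuracy:", svm.score(X_test, y_test))
```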

Genetic Algorithm (Holland 1975)

  Benefits:
  • simple algorithm, easy to implement
  • can be used in feature classification and feature selection
  • primarily used in optimization
  • always finds a “good” solution (not always the best solution)

  Assumptions and/or Limitations:
  • computation or development of the scoring function is non-trivial
  • not the most efficient method for finding optima; tends to find local optima rather than the global optimum
  • complications involved in the representation of training/output data
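
To show the selection/crossover/mutation loop, and why the scoring (fitness) function is the hard part, here is a small genetic algorithm on a toy “count the ones” bit-string problem; the fitness function, population size, and mutation rate are all illustrative choices, not values from the paper.

```python
# A minimal genetic algorithm sketch on a toy bit-string problem.
import random

random.seed(0)
GENOME_LEN, POP_SIZE, GENERATIONS, MUT_RATE = 20, 30, 50, 0.05

def fitness(genome):
    # Toy scoring function (count the ones); designing one for a real
    # problem is the non-trivial step the table warns about.
    return sum(genome)

def mutate(genome):
    return [1 - g if random.random() < MUT_RATE else g for g in genome]

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def select(population):
    # Tournament selection: the fittest of a small random sample survives.
    return max(random.sample(population, 3), key=fitness)

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
# A "good" solution is found reliably, but it may be a local optimum.
print("best fitness:", fitness(best), "of", GENOME_LEN)
```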