Logistic regression | An algorithm that estimates the probability of a dichotomous (binary) outcome from one or more covariates using the logistic function. | Classification
Decision tree | A flowchart-like algorithm that recursively splits data into branches, choosing each split by a criterion such as information gain. The terminal branches (leaves) give the output of the algorithm (a class or a value). | Classification/regression
(Simple) neural network | An algorithm loosely inspired by the architecture of the human brain. Layers of nodes are connected to one another by edges whose weights are learned during training. | Classification/regression
K-nearest neighbors | A simple algorithm that predicts the class or value of an observation from the k training examples nearest to it (the examples with the most similar features). | Classification/regression
Support vector machine | An algorithm that finds the boundary (a line in two dimensions, a hyperplane in general) that maximizes the margin between classes. New observations are classified according to the side of the boundary on which they fall. | Classification/regression
K-means | A clustering method that partitions observations into k clusters, assigning each observation to the cluster whose mean (centroid) is nearest to it. | Clustering
Hierarchical clustering | A type of cluster analysis that builds a hierarchy of clusters, usually displayed as a dendrogram. In the agglomerative approach, pairs of clusters are merged successively as one moves up the hierarchy. | Clustering
Principal component analysis | An algorithm that uses an orthogonal transformation to convert high-dimensional data into lower-dimensional data while retaining as much of the important information (variance) as possible. | Dimensionality reduction
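
Three of the definitions above (logistic regression, k-nearest neighbors, and k-means) can be made concrete with a minimal Python sketch. It uses only the standard library; the function names, the toy data, the choice of Euclidean distance, and the fixed number of k-means iterations are all assumptions made for illustration and do not come from the table. The logistic regression part shows only how a probability is computed from given weights, not how the weights are fitted.

```python
import math
from collections import Counter


def logistic(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Logistic regression prediction: probability of the positive class
    for covariates x, given already-fitted (here: hypothetical) weights."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return logistic(z)


def knn_classify(query, examples, labels, k=3):
    """k-nearest neighbors: majority vote among the k training examples
    closest to the query (Euclidean distance assumed)."""
    order = sorted(range(len(examples)),
                   key=lambda i: math.dist(query, examples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, initial_centroids, n_iter=10):
    """k-means: repeatedly assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = list(initial_centroids)  # copy so the input is not modified
    for _ in range(n_iter):
        assignment = [nearest_centroid(p, centroids) for p in points]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    # Final assignment against the final centroids
    assignment = [nearest_centroid(p, centroids) for p in points]
    return assignment, centroids


# Toy data (hypothetical): two well-separated groups in two dimensions
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    print(predict_proba([0.0, 0.0], weights=[1.0, 1.0], bias=0.0))
    print(knn_classify((1.1, 1.0), points, labels, k=3))
    print(kmeans(points, [(0.0, 0.0), (6.0, 6.0)]))
```

Note that k-nearest neighbors uses the labels (supervised), whereas k-means ignores them and recovers the two groups from the feature values alone (unsupervised), which matches the task types listed in the table.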