Definition |
Algorithms that learn relationships between input and output attributes based on a set of labeled examples. |
Algorithms that attempt to find patterns in data clusters with similar characteristics, looking for unidentified or uninformed categories and outcomes. |
Advantages |
Analysis of multiple parameters; quick, automatic solution for large-scale questions and high accuracy. |
Less human interference in data analysis; excellent for multimodal or multidimensional data sources; allows identification of new outcomes. |
Disadvantages |
Requires data to be labeled; may be impractical for large volumes of data. Tendency to overfit data. |
High cost; complex techniques. It requires a large amount of data to elaborate the algorithm, and it can be challenging to interpret the results. |
Main tasks |
Regression, classification, prognostic model, and survival analysis. |
Reducing dimensionality of the problem and grouping. |
Examples of algorithms |
Logistic regression, decision trees, random forests, and artificial neural networks. |
Principal component analysis, hierarchical clustering, autoencoders, and linear discriminant analysis. |