Table 3.
Glossary of common and important terms in the field of machine learning and artificial intelligence for seizure detection, prediction, and forecasting.
| Term | Abbreviation | Definition |
|---|---|---|
| Artificial intelligence | AI | Tools trained using historical data to perform a broad array of tasks mirroring human intelligence, including tasks for which they have not been explicitly trained |
| Machine learning | ML | Algorithms trained using historical data to maximize performance on a specific task |
| Deep learning | DL | ML algorithms that use multiple (“deep”) layers of data processing to improve performance |
| Support vector machine | SVM | A type of machine learning algorithm that identifies the maximum-margin hyperplane separating classes in the training data, based on the hardest-to-classify examples, called “support vectors” |
| Neural networks | NN | A type of machine learning algorithm that passes input data through one or more layers of hidden combinations of the inputs, called nodes, to identify complex patterns that may improve performance. The interconnections between these hidden nodes are loosely modeled on the connections between neurons. This technique is commonly used in DL |
| Training set | – | Historical data that ML/AI algorithms use to learn patterns in data |
| Testing set | – | Data separated from the training set that is used to tune higher-level choices in ML/AI development (e.g., selecting which of several ML/AI algorithms performs best) |
| Validation set | – | Data separated from the training and testing sets that is used to estimate performance on unseen data |
| Feature | – | Quantitative data that can be input into ML/AI algorithms to perform predictions. Also known as independent variables or predictors |
| Feature selection | – | Identifying a subset of the input data that is most related to the outcome of interest |
| Peeking | – | An error in ML/AI tool development in which “validation” data is used during training or testing (e.g., using the validation set to choose the superior ML/AI method) |
| Leakage | – | An error in ML/AI tool development in which “validation” data leaks into intermediate stages of training or testing (e.g., feature selection; see the first sketch following this table) |
| Bootstrapping | – | Empirical estimation of the variability of results by repeating the analyses on datasets resampled at random with replacement |
| Permutation testing | – | Empirical estimation of the variability of chance (null hypothesis) results by repeating the analyses on datasets in which the outcome of interest has been randomly shuffled without replacement (see the second sketch following this table) |
| Sensitivity/Recall | – | The percentage of positive outcomes (e.g., seizures) that were correctly identified |
| Positive predictive value/Precision | PPV | The percentage of outcomes predicted to be positive that were indeed positive (e.g., predicted seizures that actually occurred) |
| False positive rate | FPR | The rate at which the ML/AI algorithm predicts a seizure when no seizure occurred |
| Deficiency time | – | The percent of time when the device or ML/AI algorithm is not recording high enough quality information to make a reliable prediction of outcomes |
| Area under the receiver operating characteristic curve | AUC | Area under the curve showing the trade-off between sensitivity and specificity across decision thresholds |
| Area under the precision-recall curve | PR-AUC, PRC | Area under the curve showing the trade-off between precision and recall across decision thresholds (see the third sketch following this table) |
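To make the data-splitting terms concrete, the first sketch below shows a three-way split that follows this glossary's convention (testing set for model selection, validation set held out for final evaluation) and avoids peeking and leakage. It is a minimal illustration, assuming synthetic data, scikit-learn, and hypothetical names such as `X_dev`; it is not drawn from any study discussed here.

```python
# A minimal sketch (assumptions: synthetic data, scikit-learn, hypothetical
# names like X_dev) of a three-way split that follows this glossary's
# convention and avoids peeking and leakage.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 50))            # 300 examples, 50 candidate features
y = rng.integers(0, 2, size=300)          # hypothetical labels (1 = seizure)

# Hold out the validation set first; leave it untouched until the last step
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.2,
                                              random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_dev, y_dev,
                                                    test_size=0.25,
                                                    random_state=0)

# Fit feature selection on the training set only; fitting it on all the data
# would let validation information "leak" into development
selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)

# Develop and compare candidate algorithms on the testing set; choosing a
# winner on the validation set instead would be "peeking"
model = SVC().fit(selector.transform(X_train), y_train)
test_acc = model.score(selector.transform(X_test), y_test)

# Report final performance once, on the held-out validation set
val_acc = model.score(selector.transform(X_val), y_val)
print(f"testing accuracy={test_acc:.2f}  validation accuracy={val_acc:.2f}")
```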
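The second sketch illustrates bootstrapping and permutation testing around a single performance metric (here, AUC). The sample size, iteration counts, and synthetic scores are illustrative assumptions, not values from any cited study.

```python
# A minimal sketch (assumptions: synthetic data, scikit-learn, 1,000
# resamples) of bootstrapping and permutation testing around the AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=200)
y_score = np.clip(0.4 * y_true + 0.8 * rng.random(200), 0, 1)
observed = roc_auc_score(y_true, y_score)

# Bootstrapping: resample cases WITH replacement to estimate variability
boot_aucs = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if len(np.unique(y_true[idx])) < 2:      # AUC needs both classes present
        continue
    boot_aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
ci_lo, ci_hi = np.percentile(boot_aucs, [2.5, 97.5])   # 95% interval

# Permutation testing: shuffle outcomes WITHOUT replacement to build the
# chance (null) distribution of the same metric
null_aucs = [roc_auc_score(rng.permutation(y_true), y_score)
             for _ in range(1000)]
p_value = np.mean(np.array(null_aucs) >= observed)

print(f"AUC={observed:.2f}  95% CI=({ci_lo:.2f}, {ci_hi:.2f})  p={p_value:.3f}")
```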
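The third sketch computes the performance metrics defined above, again with scikit-learn on synthetic labels. Note two assumptions: FPR is computed here as FP / (FP + TN), whereas seizure-device studies often report it as false alarms per unit time; and `average_precision_score` is used as a common estimate of the area under the precision-recall curve.

```python
# A minimal sketch (assumptions: synthetic labels, scikit-learn, a 0.5
# decision threshold) computing the metrics defined in the glossary.
import numpy as np
from sklearn.metrics import (average_precision_score, precision_score,
                             recall_score, roc_auc_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                  # 1 = seizure occurred
y_score = np.clip(0.4 * y_true + 0.8 * rng.random(200), 0, 1)  # model scores
y_pred = (y_score >= 0.5).astype(int)                  # thresholded calls

sensitivity = recall_score(y_true, y_pred)             # TP / (TP + FN)
ppv = precision_score(y_true, y_pred)                  # TP / (TP + FP)
fp = np.sum((y_true == 0) & (y_pred == 1))
tn = np.sum((y_true == 0) & (y_pred == 0))
fpr = fp / (fp + tn)                                   # false positive rate
auc = roc_auc_score(y_true, y_score)                   # area under ROC curve
pr_auc = average_precision_score(y_true, y_score)      # estimates PR-AUC

print(f"Sensitivity={sensitivity:.2f}  PPV={ppv:.2f}  FPR={fpr:.2f}  "
      f"AUC={auc:.2f}  PR-AUC={pr_auc:.2f}")
```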