2024 Feb 15;15:1356436. doi: 10.3389/fimmu.2024.1356436

Table 4.

ML techniques and potential use cases in healthcare/MS care.

ML technique Use cases under research with regard to healthcare/MS care
Supervised learning
(labelled training data required)
a) Classification algorithms
b) Regression algorithms
⇨ A known number of classes
⇨ Used for prediction
a) Prediction of the diagnostic category for a patient
− Subtyping of patients through the identification of specific genetic mutations (genotypes), the cause of disease (endotypes) or the clinical manifestation of symptoms (phenotypes) = classification
b) Prediction of the degree of functional impairment in a patient [e.g., (107)]
− Subtyping patients into phenotypes by modelling motor function decline, disease duration, or the slope of progression = regression
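The difference between the two supervised tasks above can be illustrated with a minimal sketch. It is not taken from the cited studies: the data values, the class labels "A" and "B", and the function names are invented for illustration, and a nearest-centroid rule and a least-squares line stand in for whatever classification and regression algorithms a real study would use.

```python
from statistics import mean

def fit_centroids(samples):
    """Classification: learn one mean feature vector per known class label."""
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_class.items()}

def predict_class(centroids, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Invented labelled training data: (feature vector, known class).
labelled = [([1.0, 2.0], "A"), ([1.2, 1.8], "A"),
            ([4.0, 5.0], "B"), ([4.2, 5.2], "B")]
centroids = fit_centroids(labelled)
predicted = predict_class(centroids, [1.1, 2.1])  # a discrete category

# Invented continuous target: an impairment score recorded once per year.
years = [0.0, 1.0, 2.0, 3.0]
scores = [1.0, 1.5, 2.0, 2.5]
slope, intercept = fit_line(years, scores)  # slope = change in score per year
```

Classification returns one of a known set of labels, whereas regression returns a continuous quantity such as the slope of progression.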
Unsupervised learning
(no labelled training data required)
⇨ An unknown number of classes
⇨ Used for analysis
a) Clustering patients into patient groups [e.g., (108, 109)]
− Identification of patients who share similar features, either observable characteristics (phenotyping) or underlying dysfunctional or pathological mechanisms of MS (endotyping)
b) Reduction of high-dimensional datasets through the generation of simpler representations of highly complex data = generalization
− Option 1: find hidden dependencies, drop features that add little or no information, and retain the features that add the most = feature selection; the resulting lower-dimensional dataset keeps the important information and is ready for training other ML algorithms
− Option 2: find hidden dependencies and create a new dataset that still contains most of the relevant information by applying a linear or non-linear transformation to the original feature space = feature extraction; the new dataset can then be used with other ML algorithms
c) Identification of sequences = association
− Find common patterns and relationships between different features in a dataset – indicate which combinations of features occur most often and which do not
− Prediction of disease progression based on the cluster the patient is placed in
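Two of the unsupervised steps above, feature selection and clustering, can be sketched as follows. This is an illustrative assumption rather than the method of the cited studies: the data are invented, a variance threshold stands in for feature selection, and k-means (Lloyd's algorithm with a deterministic start) stands in for the clustering algorithm. Note that k-means needs the number of groups `k` as an input, so in practice it would be chosen by the analyst or by a model-selection criterion.

```python
from statistics import mean, pvariance

def drop_low_variance(rows, threshold):
    """Feature selection: keep only columns whose variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    return keep, [[row[i] for i in keep] for row in rows]

def kmeans(points, k, iterations=20):
    """Clustering: Lloyd's algorithm, using the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(col) for col in zip(*members)]
    return assignment, centroids

# Invented, unlabelled measurements; the third feature is constant and
# therefore carries no information.
patients = [[1.0, 1.0, 5.0], [8.0, 9.0, 5.0], [1.2, 0.8, 5.0],
            [8.2, 9.1, 5.0], [0.9, 1.1, 5.0], [7.9, 8.8, 5.0]]

kept_columns, reduced = drop_low_variance(patients, threshold=0.01)
labels, centers = kmeans(reduced, k=2)
```

No labels are supplied at any point; the group structure is inferred from the feature values alone.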
Reinforcement learning
(no training data needed, but constant feedback is provided to the algorithm)
Trial and error – the goal is to maximize the cumulative reward (i.e., to minimize penalized errors)
a) Drug discovery [e.g., (110)]: virtual generation of optimal molecules with desired properties; adverse reactions or side-effects are fed back to the algorithm as punishment, whereas an improvement in disease course is fed back as reward
b) Digital twin [e.g., (111, 112)]: build a virtual copy of a real patient, let the algorithm learn sequentially as data accrue, and provide feedback in order to constantly re-evaluate the treatment regimen and recommend the best combination of treatment parameter values for keeping the virtual patient as healthy as possible
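The reward-and-punishment loop described above can be sketched with an epsilon-greedy bandit, one of the simplest reinforcement learning schemes. Everything here is a toy assumption: the action names and feedback values are invented, the feedback is deterministic, and real drug-discovery or digital-twin systems use far richer state, reward, and policy models.

```python
import random

def run_bandit(feedback, steps=200, epsilon=0.1, seed=0):
    """Learn by trial and error which action yields the highest average feedback."""
    rng = random.Random(seed)
    actions = list(feedback)
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for step in range(steps):
        if step < len(actions):
            action = actions[step]        # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)  # explore: pick a random action
        else:
            # Exploit: pick the action with the best average feedback so far.
            action = max(actions, key=lambda a: totals[a] / counts[a])
        reward = feedback[action]         # positive = reward, negative = punishment
        totals[action] += reward
        counts[action] += 1
    estimates = {a: totals[a] / counts[a] for a in actions}
    return estimates, counts

# Hypothetical feedback per action (invented values, not clinical data).
toy_feedback = {"option_a": -1.0, "option_b": 0.2, "option_c": 1.0}
estimates, counts = run_bandit(toy_feedback)
best_action = max(estimates, key=estimates.get)
```

The algorithm starts with no training data; its estimates come entirely from the feedback it receives after each action.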