1 |
Shen et al |
Cell |
2020 |
COVID-19 |
Random Forest |
Identification of severe COVID-19 cases based on molecular signatures of proteins and metabolites |
Severity identification was conducted on 18 non-severe and 13 severe patients and identified 29 important variables (22 proteins, 7 metabolites), with 1 patient incorrectly classified. The model was then tested on an independent cohort of 10 patients, in which all severe patients were correctly identified except 1 |
doi: 10.1016/j.cell.2020.05.032. Epub 2020 May 28. PMID: 32492406; PMCID: PMC7254001 |
2 |
Han et al |
Nature |
2021 |
Human gut microbiota |
Random Forest |
Identification of distinct metabolites to differentiate between different taxonomic groups |
The model revealed subsets of chemical features that are highly conserved and predictive of taxonomic identification, e.g., over-representation of amino acid metabolism |
doi: 10.1038/s41586-021-03707-9. Epub 2021 Jul 14. PMID: 34262212; PMCID: PMC8939302 |
3 |
Liang et al |
Cell |
2020 |
Human pregnancy metabolome |
Linear regression |
Untargeted metabolomic profiling and identification of metabolic changes in human pregnancy |
Detection of many of the previously reported pregnancy-associated metabolite profiles; >95% of the pregnancy-associated metabolites were previously unreported |
doi: 10.1016/j.cell.2020.05.002. PMID: 32589958; PMCID: PMC7327522 |
4 |
Hogan et al |
EBioMedicine |
2021 |
Influenza |
Gradient boosted decision trees and random forest |
Untargeted metabolomics approach for diagnosis of influenza infection |
Untargeted metabolomics identified 3,318 ion features for further investigation. The described LC/Q-TOF method, in conjunction with a machine learning model, differentiated between influenza-positive and influenza-negative samples with sensitivity and specificity over 0.9 |
doi: 10.1016/j.ebiom.2021.103546. Epub 2021 Aug 19. PMID: 34419924; PMCID: PMC8385175 |
5 |
Bifarin et al |
J Proteome Res |
2021 |
Renal Cell Carcinoma |
Partial Least Squares; Random Forest Recursive feature elimination; K-NN |
A 10-metabolite panel predicted Renal Cell Carcinoma within the test cohort with 88% accuracy |
A total of 7,147 metabolites were narrowed down to a panel of 10 and tested with 4 ML algorithms, all of which correctly identified RCC status with high accuracy in the test cohort |
doi: 10.1021/acs.jproteome.1c00213. Epub 2021 Jun 23. PMID: 34161092 |
6 |
Tiedt et al |
Ann Neurology |
2020 |
Ischemic Stroke |
Random Forest classification; Linear discriminant analysis; Logistic regression; K-NN; Naive Bayes; SVM |
Identified 4 metabolites showing high accuracy in differentiating between Ischemic stroke and Stroke Mimics |
Levels of 41 metabolites showed significant association with Ischemic stroke compared to controls. Top 4 metabolites show high accuracy in differentiating between stroke and mimics |
https://doi.org/10.1002/ana.25859 |
7 |
Liu et al |
Mol Metab |
2021 |
Diabetic kidney disease |
Linear discriminant analysis; SVM; Random Forest; Logistic regression |
Serum integrative omics provide stable and accurate biomarkers for early warning and diagnosis of Diabetic Kidney Disease |
Combination of α2-macroglobulin, cathepsin D, and CD324 could serve as a surrogate protein biomarker using 4 different ML methods |
doi: 10.1016/j.molmet.2021.101367. Epub 2021 Nov 1. PMID: 34737094; PMCID: PMC8609166 |
8 |
Oh et al |
Cell Metab |
2020 |
Cirrhosis |
Random Forest |
Comparison of gut microbiome dysregulation in differentiating between advanced fibrosis and cirrhosis |
Identified a core set of gut microbiome that could be used as a universal non-invasive test for cirrhosis |
doi: 10.1016/j.cmet.2020.06.005. PMID: 32610095; PMCID: PMC7822714 |
9 |
Delafiori et al |
Anal Chem |
2021 |
COVID-19 |
ADA tree boosting; Gradient tree boosting; Random forest; Partial least squares; SVM |
Combine ML with mass spectrometry to identify COVID-19 in plasma samples within minutes |
Diagnosis can be derived from raw data with diagnosis specificity 96%, sensitivity 83% |
doi: 10.1021/acs.analchem.0c04497. Epub 2021 Jan 20. PMID: 33471512; PMCID: PMC8023531 |
10 |
Jung et al |
Biomed Pharmacother |
2021 |
Coronary artery disease |
Logistic regression |
10-year risk prediction model based on 5 selected serum metabolites |
provided initial evidence that blood xanthine and uric acid levels play different roles in the development of machine learning models for primary/secondary prevention or diagnosis of CAD. Purine-related metabolites in blood are applicable to machine learning model development for CAD risk prediction and diagnosis |
doi: 10.1016/j.biopha.2021.111621. Epub 2021 May 10. PMID: 34243599 |
11 |
Wallace et al |
J Pathol |
2020 |
Cancer |
Linear discriminant analysis |
Comparison between metabolic profile of tumor patients and the predictive ability of machine learning algorithm to interpret metabolite data |
Application of machine learning algorithms to metabolite profiles improved predictive ability for hard-to-interpret cases of head and neck paragangliomas (99.2%) |
doi: 10.1002/path.5472. Epub 2020 Jul 1. PMID: 32462735; PMCID: PMC7548960 |
12 |
Kouznetsova et al |
Metabolomics |
2019 |
Bladder cancer |
Logistic regression |
Elucidate the biomarkers including metabolites and corresponding genes for different stages of Bladder cancer, show their distinguishing and common features, and create a machine-learning model for classification of stages of Bladder cancer |
The best performing model was able to predict metabolite class with an accuracy of 82.54%. The same model was applied to three separate sets of metabolites obtained from public sources, one set of the late-stage metabolites and two sets of the early-stage metabolites. The model was better at predicting early-stage metabolites with accuracies of 72% (18/25) and 95% (19/20) on the early sets, and an accuracy of 65.45% (36/55) on the late-stage metabolite set. |
doi: 10.1007/s11306-019-1555-9. PMID: 31222577 |
13 |
Murata et al |
Breast Cancer Res Treat |
2019 |
Breast Cancer |
Multiple logistic regression |
Combinations of salivary metabolomics and machine learning methods show potential for non-invasive screening of breast cancer |
Polyamines were identified to be significantly elevated in saliva of breast cancer patients |
doi: 10.1007/s10549-019-05330-9. Epub 2019 Jul 8. PMID: 31286302 |
14 |
Liu et al |
BMC Genomics |
2016 |
Major Depressive Disorder |
SVM; Random Forest |
Identifying the metabolomics signature of major depressive disorder subtypes |
∼80% accuracy in classification of melancholic depression |
doi: 10.1186/s12864-016-2953-2. PMID: 27549765; PMCID: PMC4994306 |
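
Several entries above report accuracy together with the underlying counts, for example entry 12 (Kouznetsova et al): 18/25, 19/20, and 36/55. As a minimal sketch of how such percentages follow from the counts, the snippet below recomputes them. The helper name `accuracy` and the set labels are illustrative assumptions, not taken from the cited study; only the counts come from the table.

```python
def accuracy(correct: int, total: int) -> float:
    """Return classification accuracy as a percentage (correct / total * 100)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts as listed in the table entry for Kouznetsova et al (2019).
# The dictionary keys are descriptive labels chosen for this sketch.
reported_counts = {
    "early-stage set 1": (18, 25),
    "early-stage set 2": (19, 20),
    "late-stage set": (36, 55),
}

for name, (correct, total) in reported_counts.items():
    print(f"{name}: {accuracy(correct, total):.2f}%")
```

Run as written, this prints 72.00%, 95.00%, and 65.45%, matching the figures quoted in the table.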
|