Artificial Intelligence (AI) has been part of the medical community for decades in the form of Clinical Decision Support Systems that aid physicians in the diagnosis and categorization of patients (1). Recent years have seen a shift from expert-derived models to machine learning (ML) models within Clinical Decision Support Systems, owing to the ability of ML models to make more accurate predictions by exploiting higher dimensional and often complex data. In many cases, ML models gain their advantage in accuracy by capturing complex and often nonlinear relationships between the features used to make the prediction. However, the hype and excitement around these methods are tempered by the limited utility of often black-box solutions in a clinical setting, driven by skepticism of results that are difficult for practitioners not only to interpret but to explain to their patients (1-3). This skepticism is not unfounded: multiple examples of black-box solutions identifying incidental correlates as the key predictors have highlighted the potential for bias in a training set, or reward system, that an ML model may exploit. A well-known example is a model that discerned wolves from huskies based on snow in the background rather than on features of the animals (1, 4, 5).
AI and ML are a growing presence in endocrinology (6) and have already made inroads in the treatment of diabetes (7). Endocrinology is especially well positioned to take advantage of the upsurge in ML work focused on interpretability and explainability, which develops models based on the trade-off between accuracy and transparency rather than accuracy alone (5). In their recent article in The Journal of Clinical Endocrinology & Metabolism, Shmoish et al. (8) demonstrate the power of an explainable ML model to predict adult height from growth measurements of children before and at the age of 6 years. They not only demonstrate superior predictive accuracy but also use the explainable model to provide insight into the factors underlying that performance and to identify deficiencies that could be addressed in future developments.
The prediction of adult height was undertaken via a suite of ML algorithms and compared with 3 common expert-derived metrics: target height, conditional target height, and the “grandma” method. There was little difference among the 3 expert-derived methods, despite the first 2 relying on parental height while the last simply doubles the baby's length at a specified age. In all cases, however, the ML model dramatically improved the prediction of adult height.
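The expert-derived baselines mentioned above can be sketched in a few lines of code. The exact formulas are assumptions for illustration: the +/- 6.5 cm sex adjustment for target height and doubling length at around age 2 for the "grandma" method are commonly quoted conventions, not necessarily the precise definitions used in the study.

```python
def target_height_cm(father_cm: float, mother_cm: float, sex: str) -> float:
    """Mid-parental (target) height: the average of the parents' heights,
    adjusted up for boys and down for girls (6.5 cm is a common convention)."""
    mid_parental = (father_cm + mother_cm) / 2
    return mid_parental + 6.5 if sex == "M" else mid_parental - 6.5


def grandma_method_cm(length_cm: float) -> float:
    """'Grandma' method: adult height approximated as double the child's
    length at a specified age (age 2 is often quoted; assumed here)."""
    return 2 * length_cm


# Example: boy with a 180 cm father and 166 cm mother, measuring 85 cm at age 2.
print(target_height_cm(180, 166, "M"))  # mid-parental 173 cm + 6.5 cm
print(grandma_method_cm(85))
```

Both baselines reduce a lifetime of growth to a single linear rule, which is precisely the simplicity an ML model can improve upon by combining many measurements.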
Generalization, the ability of a model to perform well on previously unseen data, is an important property of ML models. One of the best strategies for assessing it is to use data that were collected independently of the training data but are expected to be drawn from the same distribution (5). Shmoish et al. (8) demonstrate that their model is robust to both where and how the measurements are collected. The Pearson correlation between predicted and observed adult height was R = 0.87 in the validation set associated with the training data (a Swedish cohort of children born in 1974), and was approximately the same in another Swedish cohort started in 1990 (R = 0.88) and in a separate Edinburgh cohort of children born between 1972 and 1976 (R = 0.88).
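The validation metric reported above is the standard Pearson correlation between predicted and observed heights. A minimal self-contained sketch of that computation, using toy numbers rather than any cohort data, looks like this:

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples,
    e.g. predicted vs observed adult heights in a held-out cohort."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy predicted and observed adult heights (cm); illustrative only.
predicted = [170.0, 182.5, 165.2, 176.8, 190.1]
observed = [171.4, 180.9, 166.0, 178.2, 188.7]
print(pearson_r(predicted, observed))
```

Computing the same statistic on each independent cohort, as the authors do, is what supports the claim that the model generalizes beyond its training distribution.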
The explainability of the model is based on feature importance metrics that can be extracted directly from the chosen ML model, a Random Forest; this is a benefit of the Random Forest approach in addition to its often high prediction accuracy. The most important features driving the model were the average height of the child between 3.4 and 4 years of age and the child's sex, with secondary features such as growth velocity and weight playing little role in the prediction. The authors then explored the model results to identify potential biases and, drawing on their expertise in the field, to infer their causes. One interesting example is that the model is more accurate for girls than for boys, possibly tied to the earlier age at which girls reach adult height, which may indicate that 6 years of age is too young for a highly accurate prediction for boys. There also appears to be a shift in predictions towards the mean, with short subjects overestimated and tall subjects underestimated, which may be due either to the Random Forest approach or to a lack of adequate data at the extremes. These insights, along with the inclusion of potential environmental impacts on growth, offer a strong foundational model for future researchers to build upon.
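Reading feature importances out of a trained Random Forest is a one-line operation in common libraries. The sketch below is not the authors' pipeline: the feature names echo those discussed above, but the synthetic data, coefficients, and scikit-learn setup are assumptions purely to illustrate how the importance metric is obtained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic features loosely modeled on those discussed in the text.
height_at_4y = rng.normal(102, 6, n)      # cm; made strongly predictive
sex = rng.integers(0, 2, n)               # 0 = girl, 1 = boy
growth_velocity = rng.normal(6, 1, n)     # cm/year; made weakly predictive

# Synthetic adult height driven mostly by early height and sex.
adult_height = (
    50 + 1.2 * height_at_4y + 12 * sex + 0.5 * growth_velocity
    + rng.normal(0, 2, n)
)

X = np.column_stack([height_at_4y, sex, growth_velocity])
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, adult_height)

# Impurity-based importances sum to 1 and rank the features directly.
for name, imp in zip(["height_at_4y", "sex", "growth_velocity"],
                     model.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

Because the importances come from the fitted model itself rather than a post hoc surrogate, they offer exactly the kind of transparency the editorial highlights: a clinician can see which measurements drive a given prediction.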
In conclusion, Shmoish et al. (8) present an ML model to predict adult height based on growth measurements easily obtained in any clinical setting, demonstrating accurate predictions based solely on observational data. Furthermore, they delve into the model to identify its most important features, which helps explain both why the model performs so well on validation and independent data and why it may be challenged in specific groups, such as extremely short or tall individuals. Such explainable AI models are well suited to using easily acquired patient data in the clinic and thus have the potential to transform how endocrine disorders are diagnosed and treated.
Glossary
Abbreviations
- AI
artificial intelligence
- ML
machine learning
Additional Information
Disclosures: The author has nothing to disclose.
Data Availability
Not applicable; no datasets were generated or analyzed.
References
- 1. Amann J, Blasimme A, Vayena E, Frey D, Madai VI; Precise4Q consortium. Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Med Inform Decis Mak. 2020;20(1):310.
- 2. Afanasiev O, Berghout J, Brenner SE, et al. Computational challenges and artificial intelligence in precision medicine. Pac Symp Biocomput. 2021;26:166-171.
- 3. El-Sappagh S, Alonso JM, Islam SMR, Sultan AM, Kwak KS. A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer’s disease. Sci Rep. 2021;11(1):2660.
- 4. Lapuschkin S, Wäldchen S, Binder A, Montavon G, Samek W, Müller KR. Unmasking Clever Hans predictors and assessing what machines really learn. Nat Commun. 2019;10(1):1096.
- 5. Wilkinson J, Arnold KF, Murray EJ, et al. Time to reality check the promises of machine learning-powered precision medicine. Lancet Digit Health. 2020;2(12):e677-e680.
- 6. Gubbi S, Hamet P, Tremblay J, Koch CA, Hannah-Shmouni F. Artificial intelligence and machine learning in endocrinology and metabolism: the dawn of a new era. Front Endocrinol (Lausanne). 2019;10:185.
- 7. Rigla M, García-Sáez G, Pons B, Hernando ME. Artificial intelligence methodologies and their application to diabetes. J Diabetes Sci Technol. 2018;12(2):303-310.
- 8. Shmoish M, German A, Devir N, et al. Prediction of adult height by machine learning technique. J Clin Endocrinol Metab. 2021. doi: 10.1210/clinem/dgab093.