Abstract
Life sciences researchers using Artificial Intelligence are under pressure to innovate faster than ever. Large, multilevel, and integrated datasets offer the promise of unlocking novel insights and accelerating breakthroughs. Although more data are available than ever, only a fraction is being curated, integrated, understood, and analyzed. Artificial Intelligence focuses on how computers learn from data and mimic human thought processes. It increases learning capacity and provides decision support at scales that are transforming the future of healthcare. This article reviews machine learning applications in healthcare, with a focus on clinical, translational, and public health applications, and gives an overview of the important role of privacy, data sharing, and genetic information.
Keywords: Artificial intelligence, Machine Learning, Precision Medicine, Integrated Health Care Systems, Medical Informatics
Introduction
Machine learning, a popular subdiscipline of Artificial Intelligence, uses large datasets to identify interaction patterns among variables. These techniques can discover previously unknown associations, generate novel hypotheses, and drive researchers and resources toward the most fruitful directions.1 Machine learning is applied in various fields, including finance, autonomous driving, and smart homes. In medicine, machine learning is widely used to build automated clinical decision systems.
Most machine learning approaches fall into two main categories: supervised and unsupervised methods. Supervised methods are well suited to classification and regression. Recent examples include detection of a lung nodule from a chest x-ray;2 risk estimation models for anticoagulation therapy;3 implantation of automated defibrillators in cardiomyopathy;4 classification of stroke and stroke mimics;5 modeling of CD4+ T cell heterogeneity;6 outcome prediction in infectious diseases;7 detection of arrhythmia in electrocardiograms;8 and design and development of in silico clinical trials,9 among others.
Unsupervised learning does not require labeled data. It aims to identify hidden patterns present in the data and is often used in data exploration and the generation of novel hypotheses.2 In three separate studies of heart failure with preserved ejection fraction, a heterogeneous condition with no proven therapies,10 researchers used unsupervised learning2 to revisit failed clinical trials of spironolactone,11 enalapril,12 and sildenafil13 versus placebo and to identify a subclass of patients who might benefit from specific therapies, without human intervention.
Other approaches exist, such as reinforcement learning, which can be viewed as a combination of supervised and unsupervised learning that maximizes accuracy by trial and error14 (Table 1).
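To make the idea of finding patient subgroups without labels concrete, the following is a minimal sketch of k-means clustering, one common clustering method. It is an illustration only and not the method used in the cited studies; the two-feature patient measurements are hypothetical, and the centroids are simply initialized from the first k points to keep the run deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Group unlabeled points into k clusters (Lloyd's algorithm)."""
    centroids = list(points[:k])  # assumption: simple deterministic start
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids

# Hypothetical, standardized measurements for eight patients (two features each).
patients = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.2),
            (5.0, 5.0), (5.2, 4.8), (4.9, 5.1), (5.1, 5.3)]
labels, centroids = kmeans(patients, k=2)
print(labels)
```

No outcome label is supplied at any point; the two groups emerge from the measurements alone, which is why such subgroups still need validation in independent cohorts.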
Table 1.
ML type | Description | Characteristics | Limitations |
---|---|---|---|
Supervised learning | Labeled dataset; system trained with human feedback | Applications include classification, regression, and prediction; ideal for modeling disease prognosis or treatment outcome. Modeling algorithms include artificial neural networks (ANN), support vector machines (SVM), and random forests (RF) | Requires a large amount of labeled data for training; needs validation in an independent cohort |
Unsupervised learning | Data not labeled by humans | Applications mainly include pattern recognition; ideal for modeling disease mechanisms and identifying hidden patterns in genotype or phenotype data. Modeling algorithms include various clustering methods | Needs validation in several independent cohorts |
Reinforcement learning | Hybrid approach; the goal is to maximize accuracy by trial and error; especially useful in complex environments | Applications include chemistry, robotics, games, resource management in computer clusters, and personalized recommendations | Memory intensive |
Deep learning is a subset of machine learning that mimics the operation of the human brain, using multiple layers of artificial neural networks to generate automated predictions from training datasets. Models based on deep learning tend to have many parameters and layers; thus, overfitting can lead to poor predictive performance on new data. Increasing the training sample size, decreasing the number of hidden layers, and ensuring the data are well balanced can help prevent overfitting. Overall, deep learning is compelling in image recognition15 as well as in modeling disease onset16 using temporal relations among events. A deep neural network was trained on more than 37,000 head computed tomography scans for intracranial hemorrhage and subsequently evaluated on 9,500 unseen cases, reducing time to diagnosis of new outpatient intracranial hemorrhage by 96% with an accuracy of 84%.17
Cognitive computing, a subset of artificial intelligence, involves self-learning systems that use pattern recognition and natural language processing on semi-structured or unstructured data. Cognitive computing mimics the operation of human thought processes, with the goal of creating automated computerized models that can solve problems without human assistance. Examples include research in brain-computer interfaces18,19 and commercial products such as IBM Watson.20
Although none of these approaches can rapidly and simultaneously consider different disease-related parameters in a user-independent fashion, they are promising avenues and are changing the way medicine is practiced. Healthcare providers should be ready for the upcoming Artificial Intelligence age and embrace the added capabilities that would lead to more efficient and effective care. In this article, we review the applications and challenges as well as the ethical considerations and perspectives of machine learning in medicine, translational research, and public health (Table 2).
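The overfitting problem described above can be illustrated with a small sketch. It does not use a deep network; a 1-nearest-neighbour classifier stands in for any high-capacity model that can memorize its training set. The data are synthetic (one feature, labels flipped with 30% probability), and all names are hypothetical.

```python
import random

def make_data(n, rng, noise=0.3):
    """Return (x, observed_label) pairs; the true rule is x > 0.5,
    but each observed label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        true_label = int(x > 0.5)
        observed = 1 - true_label if rng.random() < noise else true_label
        data.append((x, observed))
    return data

def predict_1nn(train, x):
    """Return the observed label of the closest training point."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]

def accuracy(train, data):
    """Fraction of cases in `data` whose label the model reproduces."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)   # unseen cases from the same process

train_acc = accuracy(train, train)  # the model has memorized these cases
test_acc = accuracy(train, test)    # performance on unseen cases
print(train_acc, test_acc)
```

The gap between the two numbers is the signature of overfitting: the model reproduces the noise in its training labels, so only the accuracy on held-out cases says anything about how it would perform on new patients.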
Table 2.
Field | Application |
---|---|
Clinical | Disease prediction and diagnosis |
Clinical | Treatment effectiveness and outcome prediction |
Translation | Drug discovery and repurposing |
Translation | (In silico) clinical trials |
Public health | Epidemic outbreak prediction |
Public health | Precision health |
Clinical Application
Disease prediction and diagnosis:
Despite the increasing application of artificial intelligence in healthcare, research mainly concentrates on cancer, nervous system diseases, and cardiovascular diseases, because they are the leading causes of disability and mortality. However, infectious and chronic diseases (e.g., type 2 diabetes,21 inflammatory bowel disease,22 C. difficile infection9) have also been receiving considerable attention. Early diagnosis can now be achieved for many conditions by improving the extraction of clinical insight and feeding such insight into a well-trained and validated system.23 For instance, the United States Food and Drug Administration (FDA) permitted marketing of diagnostic software designed to detect wrist fractures in adult patients.24 In another study of 1,634 images of cancerous and healthy lung tissue, the algorithm identified healthy cases and distinguished between two common types of lung cancer as accurately as three pathologists.25 In the United States, more than 6% of the adult population is affected by depression. In one study, image heatmap pattern recognition predicted major depressive disorder with 74% accuracy.26
Several studies are examining the potential of artificial intelligence for timely and precise disease diagnosis. Supervised methods are effective tools for capturing nonlinear relationships in complex and multifactorial disease classification. In a cohort study of 260 patients, Abedi et al.27 found that an artificial neural network model could diagnose acute cerebral ischemia better than trained emergency medical responders. Although noisy data and experimental limitations reduce the clinical utility of the models, deep learning methods can address these limitations by reducing the dimensionality of the data through layered auto-encoding analyses. Examples include analysis of more than 1,400 images from 308 histopathology regions of skin to detect basal cell carcinoma and differentiate malignant from benign lesions, achieving a diagnostic accuracy of >90% compared with experts;28 and examination of more than 41,000 digital screening mammograms to identify dense or non-dense breast tissue, where 94% of the 10,763 deep learning assessments were accepted by the interpreting radiologist.29
Treatment effectiveness and outcome prediction:
Treatment effectiveness and outcome prediction are also important areas, with potential clinical implications for disease management strategies and personalized care plans. A decade ago, only molecular and clinical information was exploited to predict cancer outcomes. With the development of high-throughput technologies, including genomic, proteomic, and imaging technologies, new types of input parameters have been collected and used for prediction. With a large sample size and integrated multi-modal data types, including histological or pathological assessments,30 these methods could considerably (15–25%) improve the accuracy of cancer susceptibility, outcome prediction, and prognosis.31
Electronic health records (EHRs) are effective tools for documenting and sharing healthcare information. Integrating machine learning-based modeling designed specifically for administrative datasets can facilitate the detection of potential complications and improve health care resource utilization and outcomes at a personalized level.32,33 Machine learning applied to EHR data has been shown to predict outcomes in sepsis patients. A large-scale machine learning-based mortality study by Samad et al.34 of more than 170,000 patients with 331,317 echocardiograms achieved 96% accuracy in predicting patient survival from echocardiography combined with EHR data. In terms of algorithm improvement, Smith et al.35 developed a deep neural network model for 12-lead ECG analysis and compared it with the conventional algorithm on emergency department ECGs; the model showed an accuracy of 92% for finding a major abnormality.
Artificial Intelligence analytics can be used in the management of chronic diseases characterized by multi-organ involvement, acute variable events, and long illness progression latencies. For instance, retinopathy can be predicted using machine learning. A deep learning algorithm for detecting and grading diabetic retinopathy and macular edema, evaluated on two validation datasets, achieved high specificity and sensitivity for detecting moderately severe retinopathy and macular edema, with each image graded by ophthalmologists between three and seven times.36
To improve care in congestive heart failure, one study used supervised machine learning on 46 clinical variables from 397 patients with heart failure with preserved ejection fraction. A phenotypic heatmap predicted patient survival more accurately than commonly employed risk assessment tools.2
One of the goals of precision medicine in cancer is the accurate prediction of optimal drug therapies from the genomic data of individual patient tumors.37 In one study, researchers presented an open-source algorithm for predicting the response of cancers to seven common chemotherapeutic medications.38 The success of precision medicine depends on the ability of algorithms to translate large compendia of -omics data into clinically actionable predictions. For example, Costello et al.39 analyzed 44 drug sensitivity prediction algorithms on 53 breast cancer cell lines with available genomic information to predict dose-response values of growth inhibition for each cell line exposed to 28 therapeutic compounds.
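Because results such as these are reported as sensitivity and specificity, it may help to see how the two quantities are computed from a model's predictions and a reference standard. The sketch below is generic and uses hypothetical labels (1 = disease present); it does not reproduce any data from the cited study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy case")
    return {
        "sensitivity": tp / (tp + fn),   # share of diseased cases detected
        "specificity": tn / (tn + fp),   # share of healthy cases cleared
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical reference grades and model outputs for ten images.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(reference, predicted)
print(metrics)
```

Reporting sensitivity and specificity separately matters when disease is rare, since a model that labels every case as healthy can still reach a high overall accuracy.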
Translation Application
Drug discovery and repurposing:
About 25% of all discovered drugs were the result of chance, when different domains were brought together accidentally.40 Targeted drug discovery is preferred in pharmaceuticals because of its explicit mechanism, higher success rate, and lower cost when compared with traditional blind screening. Machine learning is now used in the drug discovery process for the following reasons: 1) the high costs of drug development; 2) the increasing availability of three-dimensional structural information that can guide the characterization of drug targets; and 3) the extremely low success rates in clinical trials.41 Machine learning can be used as a bridge to achieve cross-domain linkage. It can identify a newly approved drug by recognizing contextual clues such as a discussion of its indication or side effects.20
Despite these novel approaches in drug discovery, there are important challenges, including data access and the fact that in general, different data sets are stored in a variety of repositories. Furthermore, raw data from clinical trials and other pre-clinical studies are typically not available. However, overall, artificial intelligence has been successful when applied to available sources, including the use of drug information to extract insight about mechanism-of-action by applying techniques such as similarity metrics across all diseases to find shared pathways.20 Another example includes the use of natural language processing for identification of hidden or novel associations that might be important in the detection of potential drug adverse effects based on scientific publications.42
Clinical trial and in silico clinical trials:
Clinical trial design has its roots in classical experimental design. However, the clinical investigators are not able to control various sources of variability. Ethical issues are paramount in clinical research. Subject enrollment can become lengthy and costly.43,44
A machine learning approach using in silico datasets was introduced to describe the numerical methods used in drug development in oncology by modeling biological systems in the setting of clinical trial studies and hospital databases, paving the way to predictive, preventive, personalized, and participatory medicine.45 This approach gives researchers the ability to partially replace animals or humans in a clinical trial and to generate virtual patients with specific characteristics to enhance the outcome of such studies. These methods are especially helpful for pediatric or orphan disease trials and can be applied in pharmacokinetics and pharmacodynamics from the preclinical phase to post-marketing.46 In one study, a large in silico randomized, placebo-controlled Phase III clinical trial was designed in which investigators used virtual treatments on synthetic Crohn’s disease patients. Results showed a positive correlation between the initial disease activity score and the drop in the disease activity score, with efficacy differing among medications.47 The model did not score the investigational drug GED-0301 highly; this prediction was further validated when the company running the clinical trial of GED-0301 stopped the phase III trial after it failed to clear an interim futility review.48 In silico clinical trials can have considerable potential in the design and discovery phases of biomedical products, biomarker identification, dosing optimization, and determining the duration of the proposed intervention.49
Public Health Relevance
Epidemic outbreak prediction:
The distribution pattern of an infectious disease between population groups with known probabilities is based on prior knowledge of the ecological and biological features of the environment. Early prediction of an epidemic (such as the peak and duration of infection) is possible if model parameters are partially known.50 Potential outbreak areas for filoviruses were predicted in the West, Southwest, and Central parts of Uganda, which are related to bat distribution and previous outbreak areas.51 In another study, Kesorn et al.52 predicted the morbidity rate of dengue hemorrhagic fever in central Thailand by estimating the infection rate in female Aedes aegypti mosquitoes and larvae, and achieved a prediction accuracy of >95% and 88% in the training and test sets, respectively.
Precision Health
Genetic and biomedical studies continue to investigate the connections between genes and human traits or diseases. Regularized logistic regression is an important tool for related applications. Many studies rely on large-scale sensitive genotype or phenotype data, and sharing across institutions is paramount for the success of such studies.53
There are many such examples in recent years. For instance, in a recent case-control study with limited sample size, researchers developed an algorithm to integrate personal whole genome sequencing and EHR data and used this algorithm to study abdominal aortic aneurysm. They assessed the effectiveness of modifying personal lifestyles given personal genome baselines, demonstrating the model’s utility as a personal health management model. Such studies have the potential to shed light on the biological architecture of other complex diseases.54 In a recent review, Torkamani et al. examined the core disciplines that enable high-definition medicine given recent technological advances and high-resolution data.55
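As a rough illustration of what regularized logistic regression does, the sketch below fits a logistic model by gradient descent with an L2 (ridge) penalty. It is a minimal sketch under stated assumptions: the genotype-like features (risk-allele count and a second binary covariate), the case/control labels, and the penalty values are all hypothetical, and production analyses would normally use an established statistics library rather than hand-written optimization.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_l2(X, y, lam, lr=0.1, epochs=2000):
    """Gradient descent on the mean log-loss plus (lam / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The penalty contributes lam * w to the gradient; the bias is not penalized.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical data: [risk-allele count (0-2), binary covariate], 1 = case.
X = [[0, 1], [0, 0], [1, 1], [1, 0], [2, 1], [2, 0], [0, 1], [2, 0]]
y = [0, 0, 0, 1, 1, 1, 0, 1]

w_weak, b_weak = fit_logistic_l2(X, y, lam=0.01)     # light penalty
w_strong, b_strong = fit_logistic_l2(X, y, lam=1.0)  # heavy penalty

norm_weak = math.sqrt(sum(v * v for v in w_weak))
norm_strong = math.sqrt(sum(v * v for v in w_strong))
print(norm_weak, norm_strong)
```

The heavier penalty shrinks the coefficients toward zero, trading some fit on the training data for stability, which is the property that makes regularization attractive when there are many genetic variables and comparatively few participants.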
Challenges and perspectives
Machine learning’s ultimate goal is to develop algorithms that are capable of self-improving with experience and continuously learning from new data and insights, to find answers to an array of questions. The compelling opportunities in precision medicine offered by complex algorithms are accompanied by computational challenges. In 2012 the Obama administration announced the “Big Data Research and Development Initiative” investment to “help solve some of the Nation’s most pressing challenges”.56 Achieving this potential requires novel approaches to address at least three technical challenges:57 1) volume: the scale of data inputs, outputs, and attributes; this challenge can be addressed in part by using clusters of CPUs, data-sharing systems or the cloud, and deep learning methods; 2) variety: different formats of data (image, video, and text); this challenge can be partially addressed by using novel deep learning methods to integrate data from various sources; and 3) velocity: the speed of streaming data; to address this challenge, online learning approaches can be developed.
The ethical challenges presented by data science have also been an area of debate. These challenges can be mapped within the conceptual space and described by three branches of research: the ethics of data and privacy, the ethics and morality of algorithms, and the ethics and values of practices.58 Among those, privacy has been the center of attention. Privacy is defined as a fundamental human right in the Universal Declaration of Human Rights at the 1948 United Nations General Assembly. Machine learning plays a key role in the development of precision medicine, whereby treatment is customized to the clinical or genetic risk factors of the patient. These advances require collecting and sharing massive amounts of data and thus generate concern about privacy.59
At the same time, healthcare institutions need to communicate with the public and collaborate with scientific communities, as well as government agencies.60 In this situation, a privacy-preserving framework is necessary and should be applied to a large range of domains where the privacy and confidentiality of study participants and institutions are of concern.61 As standard practice, many institutions collaborate by sharing de-identified clinical data or by performing a meta-analysis in which each contributing site analyzes its data in-house. These processes reduce the scope of clinical data sharing. Several resources support data sharing at scale. For example, the DNAnexus clinical trial solution service powers the FDA’s platform for advancing regulatory standards.62 St. Jude Cloud is a data-sharing resource for the global research community.63 eMERGE is a national network organized and funded by the National Human Genome Research Institute (NHGRI) that combines DNA biorepositories with electronic medical record (EMR) systems for large-scale, high-throughput genetic research in support of implementing genomic medicine.64 In Europe, the UK Biobank is a national and international health resource with unparalleled research opportunities, open to all bona fide health researchers.65
The most important issue when developing machine learning in a clinical setting is trust: whether both clinicians and patients accept the recommendations provided by the system.66 The data are noisy, complex, and high-dimensional, with thousands of variables, and are biased toward the catchment area of the originating hospital systems where the model was trained. Furthermore, data are not missing at random. Missingness can be due to incompleteness, inconsistency, or inaccuracy.67,68 Imputation, the prediction of missing values, also has its unique challenges. Standardized techniques such as the MICE algorithm69 or novel imputation methods70 have been proposed. Other challenges in mining EHR data include: 1) different protocols and changes are introduced at various time periods, without documentation for the research team; and 2) policy changes and reimbursement rules are introduced that may affect how patients seek care and how treatment is redesigned based on their needs and their insurance coverage. Therefore, to develop models using EHR data, researchers must work closely with care providers and others within the healthcare system to increase the predictive power of the modeling-enabled discoveries.
Other limitations are the lack of interoperability across technology platforms over time and the massive expansion of structured and unstructured data elements. Natural language processing can be used to process and contextualize different medical words and expressions.71 However, robust infrastructures have to be in place to handle a large number of clinical notes. For instance, it is possible to use robust infrastructure to process millions of notes and identify patients who are in need of a follow-up appointment for preventive care in hospital settings.72
Today’s machine learning approaches are close to handling real-world conditions. Owing to rapid technological advancements, tasks previously limited to humans will be taken on by algorithms.73 Machine learning’s ability to transform data into insight will affect the field of medicine, displacing much of the work of radiologists and anatomical pathologists. However, clinical medicine has always required doctors to handle huge amounts of data, from history and physical exam to laboratory and imaging studies and, more recently, genetic data. The ability to manage this complexity has always set good doctors apart.74
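The chained-equations idea behind imputation methods such as MICE can be sketched in a few lines. The version below is a deliberately simplified, deterministic single-imputation variant for two variables: it alternates simple linear regressions until the filled-in values settle. Full MICE implementations additionally draw imputations from a predictive distribution and produce multiple completed datasets, which this sketch does not do. The variable pairing and values are hypothetical, and the sketch assumes no record is missing both values.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def chained_impute(rows, n_iter=10):
    """Fill None entries in two-column rows by alternating regressions."""
    cols = [[r[0] for r in rows], [r[1] for r in rows]]
    missing = [[v is None for v in col] for col in cols]
    # Step 1: start from the mean of the observed values in each column.
    for c in (0, 1):
        observed = [v for v in cols[c] if v is not None]
        mean = sum(observed) / len(observed)
        cols[c] = [mean if v is None else v for v in cols[c]]
    # Step 2: for each column in turn, regress it on the other column using
    # rows where it was observed, then re-predict its missing entries.
    for _ in range(n_iter):
        for target, other in ((0, 1), (1, 0)):
            idx = [i for i in range(len(rows)) if not missing[target][i]]
            slope, intercept = fit_line([cols[other][i] for i in idx],
                                        [cols[target][i] for i in idx])
            for i in range(len(rows)):
                if missing[target][i]:
                    cols[target][i] = intercept + slope * cols[other][i]
    return list(zip(cols[0], cols[1]))

# Hypothetical paired measurements following y = 2x + 1, with two gaps.
records = [(1, 3), (2, 5), (3, None), (4, 9), (None, 11), (6, 13)]
completed = chained_impute(records)
print(completed)
```

Because the imputation model only sees what was recorded, any systematic reason for missingness (for example, a test ordered only for sicker patients) carries through to the filled-in values, which is one reason close collaboration with care providers remains necessary.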
Clinical Significance.
Artificial intelligence increases learning capacity and provides decision support at scales that are transforming the future of healthcare.
Artificial intelligence has been implemented in disease diagnosis and prognosis, treatment optimization and outcome prediction, drug development, and public health.
Technological advances require collecting and sharing massive amounts of data and thus generate concern about privacy.
Funding:
This work was supported in part by Geisinger Research, by funds from the National Institutes of Health (NIH) grant No. R56HL116832 to Sutter Health and sub-awarded to VA (Sub-PI, Geisinger), and by funds from the Defense Threat Reduction Agency (DTRA) grant No. HDTRA1-18-1-0008 to Virginia Tech and sub-awarded to VA (Sub-PI, Geisinger, sub-award No. 450557-19D03). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Footnotes
Competing Interests: None.
References
- 1. Koprowski R, Foster KR. Machine learning and medicine: book review and commentary. Biomed. Eng. Online 17, 17 (2018).
- 2. Deo RC. Machine Learning in Medicine. Circulation 132, 1920–30 (2015).
- 3. Lip GY, Nieuwlaat R, Pisters R, Lane DA, Crijns HJ. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the euro heart survey on atrial fibrillation. Chest 137, 263–272 (2010).
- 4. O’Mahony C, Jichi F, Pavlou M, Monserrat L, Anastasakis A, Rapezzi C, Biagini E, Gimeno JR, Limongelli G, McKenna WJ, Omar RZ, Elliott PM; Hypertrophic Cardiomyopathy Outcomes Investigators. A novel clinical risk prediction model for sudden cardiac death in hypertrophic cardiomyopathy (HCM risk-SCD). Eur. Heart J 35, 2010–2020 (2014).
- 5. Abedi V, Goyal N, Tsivgoulis G, Hosseinichimeh N, Hontecillas R, Bassaganya-Riera J, Elijovich L, Metter JE, Alexandrov AW, Liebeskind DS, Alexandrov AV, Zand R. Novel Screening Tool for Stroke Using Artificial Neural Network. Stroke 48, 1678–1681 (2017).
- 6. Lu P, Abedi V, Mei Y, Hontecillas R, Hoops S, Carbo A, Bassaganya-Riera J. Supervised learning methods in modeling of CD4+ T cell heterogeneity. BioData Min 4, 27 (2015).
- 7. Bogle B, Balduino R, Wolk DM, Farag HA, Kethireddy S, Chatterjee A, Abedi V. Predicting Mortality of Sepsis Patients in a Multi-Site Healthcare System using Supervised Machine Learning. In: Int’l Conf. of Health Informatics and Medical Systems 9–15 (2018). https://csce.ucmss.com/cr/books/2018/LFS/CSREA2018/HIM3645.pdf
- 8. Chen Y et al. Classification of short single lead electrocardiograms (ECGs) for atrial fibrillation detection using piecewise linear spline and XGBoost. Physiol. Meas (2018). doi: 10.1088/1361-6579/aadf0f
- 9. Leber A, Hontecillas R, Abedi V, Tubau-Juni N, Zoccoli-Rodriguez V, Stewart C, Bassaganya-Riera J. Modeling new immunoregulatory therapeutics as antimicrobial alternatives for treating Clostridium difficile infection. Artif. Intell. Med 78, 1–13 (2017).
- 10. Udelson JE. Heart failure with preserved ejection fraction. Circulation 124, 540–543 (2011).
- 11. Pitt B, Pfeffer MA, Assmann SF, Boineau R, Anand IS, Claggett B, Clausell N, Desai AS, Diaz R, Fleg JL, Gordeev I, Harty B, Heitner JF, Kenwood CT, Lewis EF, O’Meara E, Probstfield JL, Shaburishvili T, Shah SJ, Solomon SD, Sweitzer NK, Yang S, McKinlay SM; TOPCAT Investigators. Spironolactone for heart failure with preserved ejection fraction. N. Engl. J. Med 370, 1383–1392 (2014).
- 12. Kitzman DW, Hundley WG, Brubaker PH, Morgan TM, Moore JB, Stewart KP, Little WC. A randomized double-blind trial of enalapril in older patients with heart failure and preserved ejection fraction: effects on exercise tolerance and arterial distensibility. Circ. Heart Fail 3, 477–485 (2010).
- 13. Guazzi M, Vicenzi M, Arena R, Guazzi MD. Pulmonary hypertension in heart failure with preserved ejection fraction: a target of phosphodiesterase-5 inhibition in a 1-year study. Circulation 124, 164–174 (2011).
- 14. Krittanawong C, Zhang H, Wang Z, Aydar M, Kitai T. Artificial Intelligence in Precision Cardiovascular Medicine. J. Am. Coll. Cardiol 69, 2657–2664 (2017).
- 15. Lee EJ, Kim YH, Kim N, Kang DW. Deep into the Brain: Artificial Intelligence in Stroke Imaging. J. Stroke 19, 277–285 (2017).
- 16. Choi E, Schuetz A, Stewart WF & Sun J. Using recurrent neural network models for early detection of heart failure onset. J. Am. Med. Informatics Assoc 24, 361–370 (2017).
- 17. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, Suever JD, Geise BD, Patel AA, Moore GJ. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical work flow integration. npj Digit. Med (2018). doi: 10.1038/s41746-017-0015-z
- 18. Bashivan P, Yeasin M, Bidelman GM. Temporal progression in functional connectivity determines individual differences in working memory capacity. In: 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, 2943–2949 (2017). doi: 10.1109/IJCNN.2017.7966220
- 19. Elahian B, Yeasin M, Mudigoudar B, Wheless JW, Babajani-Feremi A. Identifying seizure onset zone from electrocorticographic recordings: A machine learning approach based on phase locking value. Seizure 51, 35–42 (2017).
- 20. Chen Y, Elenee Argentinis JD, Weber G. IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clin Ther. 38, 688–701 (2016). https://www.sciencedirect.com/science/article/pii/S0149291815013168
- 21. Kagawa R et al. Development of Type 2 Diabetes Mellitus Phenotyping Framework Using Expert Knowledge and Machine Learning Approach. J. Diabetes Sci. Technol 11, 791–799 (2017).
- 22. Bassaganya-Riera J. Introduction to accelerated path to cures and precision medicine in inflammatory bowel disease. In: Bassaganya-Riera J, ed. Accelerated Path to Cures. New York: Springer International; 1–6 (2018).
- 23. Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, Wang Y, Dong Q, Shen H, Wang Y. Artificial intelligence in healthcare: past, present and future. Stroke Vasc. Neurol 2, 230–243 (2017).
- 24. US Food and Drug Administration. FDA permits marketing of artificial intelligence algorithm for aiding providers in detecting wrist fractures [press release]. (May 24 2018). Available at: https://www.fda.gov/newsevents/newsroom/pressannouncements/ucm608833.htm. Accessed February 5, 2019.
- 25. Razavian N & Tsirigos A. Pathologists meet their match in tumour-spotting algorithm. Nature 561, 436–437 (2018).
- 26. Schnyer DM, Clasen PC, Gonzalez C, Beevers CG. Evaluating the diagnostic utility of applying a machine learning algorithm to diffusion tensor MRI measures in individuals with major depressive disorder. Psychiatry Res Neuroimaging 264, 1–9 (2017).
- 27. Abedi V, Goyal N, Tsivgoulis G, Hosseinichimeh N, Hontecillas R, Bassaganya-Riera J, Elijovich L, Metter JE, Alexandrov AW, Liebeskind DS, Alexandrov AV, Zand R. Novel Screening Tool for Stroke Using Artificial Neural Network. Stroke 48, 1678–1681 (2017).
- 28. Cruz-Roa AA, Arevalo Ovalle JE, Madabhushi A, González Osorio FA. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. Med. Image Comput. Comput. Interv 16, 403–10 (2013).
- 29. Lehman CD, Yala A, Schuster T, Dontchos B, Bahl M, Swanson K, Barzilay R. Mammographic Breast Density Assessment Using Deep Learning: Clinical Implementation. Radiology (2018).
- 30. Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J 15, 8–17 (2014).
- 31.Cruz JA WD Applications of machine learning in cancer prediction and prognosis. Cancer Inform 11, 59–77 (2007). [PMC free article] [PubMed] [Google Scholar]
- 32.Rivers EP, McIntyre L, Morro DC, R. K Early and innovative interventions for severe sepsis and septic shock: taking advantage of a window of opportunity. Can. Med. Assoc. tion J 173, 1054–1065 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Hillman K, Chen J, Cretikos M, Bellomo R, Brown D, Doig G, Finfer S, F. A M. study investigators. Introduction of the medical emergency team (MET) system: a cluster-randomised controlled trial. Lancet 365, 2091–7 (2005).
- 34.Samad MD, Ulloa A, Wehner GJ, Jing L, Hartzel D, Good CW, Williams BA, Haggerty CM, F. B Predicting Survival From Large Echocardiography and Electronic Health Record Datasets: Optimization With Machine Learning. JACC Cardiovasc Imaging (2018). doi: 10.1016/j.jcmg.2018.04.026
- 35.Smith SW et al. A Deep Neural Network Learning Algorithm Outperforms a Conventional Algorithm For Emergency Department Electrocardiogram Interpretation. Journal of Electrocardiology (Elsevier Inc, 2018). doi: 10.1016/j.jelectrocard.2018.11.013
- 36.Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros J, Kim R, Raman R, Nelson PC, Mega JL, W. D Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 316, 2402–2410 (2016).
- 37.Haverty PM, Lin E, Tan J, Yu Y, Lam B, Lianoglou S, Neve RM, Martin S, Settleman J, Yauch RL, B. R Reproducible pharmacogenomic profiling of cancer cell line panels. Nature 533, 333–337 (2016).
- 38.Huang C, Mezencev R, Mcdonald JF & Vannberg F Open source machine-learning algorithms for the prediction of optimal cancer drug therapies. PLoS One 12, 1–14 (2017).
- 39.Costello JC et al. A community effort to assess and improve drug sensitivity prediction algorithms. Nat. Biotechnol 32, 1202–12 (2014).
- 40.Hargrave-Thomas E, Yu B, R. J Serendipity in anticancer drug discovery. World J. Clin. Oncol 3, 1–6 (2012).
- 41.Lu Pinyi, Bevan David R., Leber Andrew, Hontecillas Raquel, Nuria Tubau-Juni, B.-R. J in Accelerated Path to Cures (ed. Bassaganya-Riera J) 7–24 (Springer, 2018).
- 42.Abedi V, Yeasin M, Zand R Empirical study using network of semantically related associations in bridging the knowledge gap. J. Transl. Med 12, 324 (2014). doi: 10.1186/s12967-014-0324-9. https://www.ncbi.nlm.nih.gov/pubmed/25428570
- 43.Design and Analysis of Clinical Trials. PennState Eberly College of Science
- 44.Zand R et al. in Accelerated Path to Cures 57–77 (Springer International Publishing, 2018). doi: 10.1007/978-3-319-73238-1_5
- 45.Gal J, Milano G, Ferrero JM, Saâda-Bouzid E, Viotti J, Chabaud S, Gougis P, Le Tourneau C, Schiappa R, Paquet A, C. E. Optimizing drug development in oncology by clinical trial simulation: Why and how? Brief. Bioinform 1–15 (2017). doi: 10.1093/bib/bbx055
- 46.Harnisch L, Shepard T, Pons G, O. D. P Modeling and simulation as a tool to bridge efficacy and safety data in special populations. CPT pharmacometrics Syst. Pharmacol 27, e28 (2013).
- 47.Abedi V et al. Chapter 28 Phase III Placebo-Controlled, Randomized Clinical Trial With Synthetic Crohn’s Disease Patients to Evaluate Treatment Response. Emerg. Trends Appl. Infrastructures Comput. Biol. Bioinformatics, Syst. Biol 411–427 (2016).
- 48.Taylor NP Celgene cans phase 3 trial of $710M Crohn’s drug GED-0301. Fierce Biotech (October 20, 2017). Available at: https://www.fiercebiotech.com/biotech/celgene-cans-phase-3-trial-710m-crohn-s-drug-ged-0301 [Accessed February 5, 2019].
- 49.Carlier A, Vasilevich A, Marechal M, de Boer J, G. L In silico clinical trials for pediatric orphan diseases. Sci. Rep 8, 2465 (2018).
- 50.Zamiri A, Yazdi HS, G. S. Temporal and spatial monitoring and prediction of epidemic outbreaks. IEEE J. Biomed. Heal. informatics 19, 735–744 (2015).
- 51.Nyakarahuka L, Ayebare S, Mosomtai G, Kankya C, Lutwama J, Mwiine FN, S. E Ecological Niche Modeling for Filoviruses: A Risk Map for Ebola and Marburg Virus Disease Outbreaks in Uganda. PLoS Curr 5, (2017).
- 52.Kesorn K, Ongruk P, Chompoosri J, Phumee A, Thavara U, Tawatsin A, S. P Morbidity Rate Prediction of Dengue Hemorrhagic Fever (DHF) Using the Support Vector Machine and the Aedes aegypti Infection Rate in Similar Climates and Geographical Areas. PLoS One 10, e0125049 (2015).
- 53.Xie W, Kantarcioglu M, Bush WS, Crawford D, Denny JC, Heatherly R, M. B SecureMA: protecting participant privacy in genetic association meta-analysis. Bioinformatics 30, 3334–41 (2014).
- 54.Li J et al. Decoding the Genomics of Abdominal Aortic Aneurysm. Cell 174, 1361–1372.e10 (2018).
- 55.Torkamani A, Andersen KG, Steinhubl SR & Topol EJ High-Definition Medicine. Cell 170, 828–843 (2017).
- 56.The White House. Obama administration unveils ‘big data’ initiative: announces $200 million in new R&D investments [press release] (March 29 2012). Available at: https://obamawhitehouse.archives.gov/the-press-office/2015/11/19/release-obama-administration-unveils-big-data-initiative-announces-200. Accessed February 5, 2019.
- 57.Chen X, Member S & Lin X Big data deep learning: challenges and perspectives. IEEE Access 2, (2014). doi: 10.1109/ACCESS.201.2325029
- 58.Floridi L, Taddeo M What is data ethics? Philos Trans A Math Phys Eng Sci 374, (2016) [pii: 20160360].
- 59.Azencott C Machine learning and genomics: precision medicine versus patient privacy. Philos. Trans. Ser. A Math., Phys. Eng. Sci 376, (2018).
- 60.Kayaalp M Patient Privacy in the Era of Big Data. Balkan Med. J 35, 8–17 (2018).
- 61.Li W, Liu H, Yang P, X. W Supporting Regularized Logistic Regression Privately and Efficiently. PLoS One 11, e0156479 (2016).
- 62.DNANEXUS. OPTIMIZE & DE-RISK YOUR CLINICAL TRIALS (2018). https://www.dnanexus.com/clinicaltrials
- 63.St Jude Children’s Research Hospital. St. Jude Cloud (2018). https://www.stjude.cloud/
- 64.eMERGE. Electronic Medical Records and Genomics network (2018). https://emerge.mc.vanderbilt.edu/
- 65.UK Biobank. (2018). https://www.ukbiobank.ac.uk/
- 66.Mehta N D. M Machine learning, natural language programming, and electronic health records: The next step in the artificial intelligence journey? J. Allergy Clin. Immunol 141, 2019–2021 (2018).
- 67.Botsis T, Hartvigsen G, Chen F, W. C Secondary Use of EHR: Data Quality Issues and Informatics Opportunities. AMIA Jt. Summits Transl. Sci 1, 1–5 (2010).
- 68.Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, Wood AM, C. J Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 338 (2009). doi: 10.1136/bmj.b2393
- 69.White IR, Royston P, W. A Multiple imputation using chained equations: Issues and guidance for practice. Stat. Med 30, 377–399 (2011).
- 70.Abedi Vida, Shivakumar Manu K., Lu Pinyi, Hontecillas Raquel, Leber Andrew, Ahuja Monika, Ulloa Alvaro E., Shellenberger Joshua M., B.-R J. Latent-Based Imputation of Laboratory Measures from Electronic Health Records: Case for Complex Diseases. bioRxiv (2018). doi: 10.1101/275743
- 71.Miller DD B. E Artificial Intelligence in Medical Practice: The Question to the Answer? Am. J. Med 131, 129–133 (2018).
- 72.Karunakaran B, Misra D, Marshall K, Mathrawala D & Kethireddy S Closing the loop — Finding lung cancer patients using NLP in 2017 IEEE International Conference on Big Data (Big Data) 2452–2461 (IEEE, 2017). doi: 10.1109/BigData.2017.8258203
- 73.Erickson BJ, Korfiatis P, Akkus Z & Kline TL Machine Learning for Medical Imaging 505–515 (2017).
- 74.Obermeyer Z & Emanuel EJ Predicting the Future - Big Data, Machine Learning, and Clinical Medicine. N. Engl. J. Med 375, 1216–1219 (2016).