Abstract
Machine Learning (ML) is a branch of Artificial Intelligence (AI) within Computer Science in which programmable machines imitate human learning behavior with the help of statistical methods and data. The healthcare industry is one of the largest and busiest sectors in the world, functioning with an extensive amount of manual moderation at every stage. Most clinical documents concerning patient care are hand-written by experts; only selected reports are machine-generated. This process raises the chances of misdiagnosis, thereby posing a risk to a patient's life. Recent efforts to automate manual operations have made extensive use of ML. This paper surveys the applicability of ML approaches in automating medical systems and discusses the optimized statistical ML frameworks that encourage better service delivery in clinical settings. The widespread adoption of Deep Learning (DL) and ML techniques as the underlying systems for a variety of wellness applications is accompanied by challenges and a myriad of security concerns. This work identifies a variety of vulnerabilities occurring in medical procurement and acknowledges the concerns over predictive performance from a privacy point of view. Finally, it provides possible risk-limiting measures and directions for open challenges in the future.
Introduction
In this era of technological advancement, ML/DL systems have transformed industries such as governance, manufacturing, and transportation. Over the past couple of years, the use of intelligent systems has increased manifold in various domains, including our routine life. One such realm is healthcare [1, 2], which had earlier been impervious to large-scale technological disruption. The healthcare industry across the globe has evolved extensively with the advent of machine intelligence. Nasr et al. [3] explore current state-of-the-art smart healthcare systems, highlighting significant topics such as wearable and smartphone devices for fitness monitoring, ML for illness prediction, and assistive frameworks, including social robots designed for assisted living environments. Bharadwaj et al. [4] discuss applications of ML algorithms integrated with the Healthcare Internet of Things (H-IoT) in terms of their benefits, selection, and potential future aspects. ML/DL techniques have delivered exceptional results in versatile tasks such as brain tumor segmentation [5], saliva sample classification of COPD patients [6], chronic neurological disorder assistance [7], anomaly recognition in the artificial pancreas [8], clinical image reconstruction [9], and cancerous cell classification, to name a few. It is expected that in the coming years intelligent software systems will take over much of the human labor that radiologists and physicians put into examining medical documents. ML will transform conventional medical practice and research. Healthcare has emerged as an active application area for ML/DL models, which achieve human-level performance in various pathological tasks [10]. Some investigations reported that intelligent models outperformed clinical experts in certain respects. Esteva et al. [11] illustrate the categorization of skin lesions with a single CNN evaluated against 21 board-certified dermatologists on biopsy-proven clinical diagnosis of the deadliest skin cancer. The findings show that AI can classify skin cancer with a degree of accuracy equivalent to dermatologists. Rajpurkar et al. [12] build the CheXNet algorithm, which diagnoses pneumonia from chest X-rays and outperforms experienced radiologists on the F1 measure. The drive to enhance the performance of ML models in comparison to humans has resulted in a marked increase in the development of computer-aided diagnostic systems. The potential of AI systems for healthcare applications has increased with the development of advanced technologies such as the Internet of Things (IoT), Big Data, and cloud computing. Together with these technologies, AI can produce highly accurate monitoring and prediction systems that can facilitate human-centric emergency medical assistance.
Shishvan et al. [13] reviewed a variety of emerging ML algorithms in the context of comprehensive healthcare services. The work introduces the applicability of intelligent algorithms in multiple steps such as data extraction, feature selection, model fitting, training, and execution, together with a set of performance metrics for evaluation [14]. Kumar et al. [15] develop a classification structure to categorize the recurrence of specific health conditions based on the clinical history of patients, using pre-trained word2vec, GloVe, domain-trained, universal sentence encoder, and fastText embeddings to classify sixteen morbidity conditions within medical histories. In this digital era, healthcare services have extended to wearable devices, IoT, and cloud applications as we attain a deeper understanding of embedded and automated systems in the clinical context [4]. Developing targeted therapies for personalized treatments, accurately localizing disease hubs, and identifying morbidities will become feasible if intelligent systems are developed carefully, with due attention to the liabilities associated with them [16].
Dhief et al. [17] presented an extensive review of IoT frameworks and state-of-the-art techniques used in healthcare and voice pathology surveillance systems, whereas Alhussein et al. [18] investigated a voice abnormality detection system using DL on mobile healthcare frameworks. Researchers and physicians are reviewing numerous approaches to apply DL methods to Intensive Care Units (ICUs) and critical-care concerns [19–21]. Similarly, Ganainy et al. [22] proposed a real-time consultation system for the clinical context which forecasts the current status of Mean Arterial Pressure (MAP) values at the bedside using new ML structures. The majority of intelligent applications utilizing customer records have produced disappointing results at some point because of their obsession with metrics [23–25]. Privacy is often compromised when data is transmitted or analyzed to build a predictive system [26–28]. This paper surveys the diverse techniques of ML and their application in the healthcare ecosystem. The contributions of the paper and a brief of the subsequent sections are provided next [29–32].
This paper shares a concise statistical background of ML Algorithms while discussing multiple ML models, their application in clinical aspects, along with certain hindrances, and any possible solutions to tackle those shortcomings.
This paper outlines various challenges related to medical analysis using ML and DL techniques.
This paper analyses and lists different heterogeneous sources contributing to healthcare data and the flaws associated.
This paper describes the applications of ML in healthcare for medical prognosis, computer-aided detection, diagnosis, and treatment. Further, the associated drawbacks are outlined as well.
This paper lists different types of vulnerabilities in the ML pipeline and their sources. Further, the work highlights various techniques to avoid information breaches and preserve the privacy of data for clinical users.
The remainder of the paper is organized as follows. Section 2 presents the various ML algorithms, their applications, and their mathematical background. Section 3 presents the different applications of ML in healthcare systems and describes the present scenario, in which intelligent systems are used to automate regular tasks. Section 4 examines the probable vulnerabilities that are encountered during the preparation of ML models in the healthcare pipeline. Section 5 presents a study of the privacy challenges arising from the involvement of AI systems and various approaches to preserving privacy. Finally, Sect. 6 presents future prospects and areas that require further research, followed by the chapter conclusion in Sect. 7.
Background of ML Algorithms
The majority of developing countries have invested their time and money in advanced technical prospects that in some way or other prove to be cost-effective in the long run. Development is often associated with the advent of automated machinery and mechanical systems as we grow towards becoming a data-centric world. Managing and effectively using data at the industrial level is an irksome task if humans run the errands; this is where ML/DL-based intelligent systems gain their importance. ML algorithms are developed specifically to support models that solve problems in different domains (e.g., healthcare, fintech, industry) [33]. Okay et al. [34] demonstrate that applying Interpretable Machine Learning (IML) models to sophisticated and difficult-to-interpret ML approaches provides thorough interpretability while preserving accuracy, which is challenging when crucial medical choices are at stake. Ileberi et al. [35] implement an ML-based framework using the Synthetic Minority Over-sampling Technique (SMOTE) for credit card fraud detection and report that it outperforms other prevailing methodologies. Ahsan et al. [36] propose a prognostics framework based on statistics-driven ML modeling for forecasting qualification test results of electronic components, allowing a decrease in qualification test cost and time. Hari et al. [37] offer a supervised ML method that models the behavior of Gallium Nitride (GaN) power electronic devices to reliably forecast the current waveforms and switching voltage of these devices. Seng et al. [38] concentrate on how computer vision (CV) and ML practices may be applied to existing viticulture operations and vineyard management to obtain industry-relevant outcomes. Rehman et al. [39] provide an ML technique for the localization of brain tumor cells utilizing texton-map images on FLAIR scans of Magnetic Resonance Images (MRI). Singh et al. [40] offer an ensemble-based classification technique that combines AI, fog computing, and smart health to create a reliable platform for the early identification of COVID-19 infection. Comparatively, Vyas et al. [41] offer an ML model powered by a multimodal method for assessing a patient's willingness to recommend the hospital, which plays an important part in designing actions based on patient choice. Some ML algorithms and their purposes are discussed in the forthcoming sections. A summary of the different ML algorithms discussed in this chapter is depicted in Fig. 1.
Regression Models
Regression analysis is a statistical modeling method that aims to define a relationship (linear or polynomial) between a dependent variable and one or more independent variables [42]. This predictive modeling technique can be utilized for forecasting, time-series modeling, predictive analysis, etc. The types of regression in common use include Linear, Polynomial, Logistic, Multivariate, Ridge, and Bayesian Linear Regression. Some of these are discussed next.
Linear Regression
Linear regression models are a cornerstone of supervised learning for predicting a quantitative response from the relation linking the independent variables (input vector) to the dependent variable (output). The relationship is represented by a linear function. In the ML arena, linear regression models stand out for their simplicity while remaining of considerable interest and easy to interpret. Velez et al. [43] presented a straightforward definition of interpretability in ML as the capacity to explain or present a model's behavior in terms understandable to a human. Linear regression seeks a linear function $f$ that relates an input vector $x \in \mathbb{R}^{d}$ to a real-valued output $y$ (i.e., $y = f(x)$) as

$$f(x) = \beta_0 + \beta^{\top} x = \beta_0 + \sum_{j=1}^{d} \beta_j x_j \tag{1}$$

where $\beta_0$ is identified as the intercept of the function and $\beta \in \mathbb{R}^{d}$ is the coefficient vector corresponding to the individual input variables. To calculate the regression coefficients $\beta_0$ and $\beta$, a training set $(A, y)$ is required, where $A \in \mathbb{R}^{k \times d}$ denotes the $k$ training inputs $a_1, \dots, a_k$ and $y \in \mathbb{R}^{k}$ denotes the $k$ training outputs, each $a_i \in \mathbb{R}^{d}$ being associated with the real-valued output $y_i$. The prime objective is to minimize the empirical risk, which quantifies through a loss function the deviation between the prediction $f(a_i)$ and the response $y_i$ for each $i$. Loss functions measure how far the model's predictions deviate from the actual outputs. The least-squares estimate is one of the most widely used loss functions for regression models and has minimal variance amongst all unbiased linear estimates. Fitting a regression model by minimizing the Residual Sum of Squares (RSS) between the predicted outputs and the labels is expressed as [44]

$$\min_{\beta_0,\,\beta}\; \mathrm{RSS}(\beta_0, \beta) = \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^{\top} a_i \right)^{2} \tag{2}$$

Certain downsides include high variance, where a model may fit the training set closely but overfit noisy or otherwise unrepresentative training data, reducing prediction accuracy on new data. Alternative approaches help here. Linear Dimension Reduction (LDR) generates a low-dimensional linear mapping of the original high-dimensional or noisy data that maintains some characteristic of interest, denoises or compresses the data, and extracts important feature spaces, while forward selection or backward elimination helps avoid overfitting and improve robustness. The processing and manipulation of data are often associated with noise, which diminishes the model's performance [45]. The link between regularization and robustness to noise is represented as:

$$\min_{\beta}\; \max_{\Delta \in \mathcal{U}}\; g\bigl( y - (A + \Delta)\beta \bigr) \tag{3}$$

In this regard, the noise $\Delta$ is expected to vary within an uncertainty set $\mathcal{U} \subseteq \mathbb{R}^{k \times d}$, and the learner inherits the robust behavior, where $g$ is a convex function that measures the residuals [46]. Regression models can lose interpretability when the number of features is large relative to the amount of data; to overcome this shortcoming and multicollinearity, various feature selection strategies are applied.
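The least-squares fit described above can be illustrated with a minimal sketch for a single predictor, where the RSS minimizer has a closed form. The function names and the toy data are assumptions made for illustration and do not come from the surveyed works.

```python
def fit_simple_ols(xs, ys):
    """Return (beta0, beta1) minimizing sum((y - beta0 - beta1 * x) ** 2)."""
    k = len(xs)
    x_mean = sum(xs) / k
    y_mean = sum(ys) / k
    # Closed-form least-squares solution for one predictor.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


def rss(xs, ys, beta0, beta1):
    """Residual sum of squares between predictions and labels."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data (not taken from the paper).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
beta0, beta1 = fit_simple_ols(xs, ys)
print(beta0, beta1, rss(xs, ys, beta0, beta1))
```

Any other choice of intercept or slope gives a larger RSS on the same data, which is what "least squares" means here.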
Shrinkage Models
To produce a more stable model, the values of the regression coefficients are shrunk with the help of regularization methods, also known as shrinkage methods, at the cost of introducing some bias into the model estimate. The principal intention behind shrinkage methods is to penalize the regression coefficients in the loss function, pulling them towards a central point, like the mean. With training pairs $(a_i, y_i)$, $i = 1, \dots, k$, intercept $\beta_0$, and coefficient vector $\beta$, a common shrinkage method is Ridge Regression, which penalizes the norm-2 of the regression coefficients

$$\min_{\beta_0,\,\beta}\; \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^{\top} a_i \right)^{2} + \lambda \lVert \beta \rVert_2^{2} \tag{4}$$

where $\lambda \ge 0$ controls the shrinkage magnitude. Lasso regression penalizes norm-1 instead and minimizes the quantity

$$\min_{\beta_0,\,\beta}\; \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^{\top} a_i \right)^{2} + \lambda \lVert \beta \rVert_1 \tag{5}$$

Least Absolute Shrinkage and Selection Operator (Lasso) Regression is an extension of linear regression supplemented by shrinkage. The lasso approach favors models with fewer parameters and is well-suited for models with high degrees of multicollinearity or for automating some elements of model selection. Lasso models are more interpretable than ridge regression because a large $\lambda$ forces some of the estimated coefficients to be exactly zero. The estimation accuracy of subset selection is affected by the noise present in the input dataset; to reduce the effect of noise and to avoid numerical issues, the Tikhonov regularization term ($\frac{1}{2\gamma} \lVert \beta \rVert_2^{2}$) with weight $\gamma > 0$ is introduced along with the cutting plane approach [47].
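The different effect of the two penalties can be seen in a minimal sketch for one predictor and no intercept, where both problems have closed-form solutions (ridge rescales the least-squares coefficient, lasso soft-thresholds it). The function names, the no-intercept simplification, and the toy data are assumptions made for illustration.

```python
import math


def ols_1d(xs, ys):
    """Least-squares coefficient for y ~ beta * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def ridge_1d(xs, ys, lam):
    """Minimizer of sum((y - beta*x)^2) + lam * beta^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def lasso_1d(xs, ys, lam):
    """Minimizer of sum((y - beta*x)^2) + lam * |beta| (soft-thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical toy data (not taken from the paper).
xs = [-1.0, 0.0, 1.0]
ys = [-0.4, 0.1, 0.6]
for lam in (0.0, 1.0, 3.0):
    print(lam, ridge_1d(xs, ys, lam), lasso_1d(xs, ys, lam))
```

In this sketch the ridge coefficient shrinks towards zero but never reaches it, whereas the lasso coefficient becomes exactly zero once the penalty is large enough, which is the behavior that makes lasso usable for feature selection.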
Regression Models Beyond Linearity
Linear regression extends naturally to non-linear terms, which can capture complex relationships between the predictors and the response. The family of non-linear regression models includes step functions, exponential models, local regression, smoothing and regression splines, and polynomial regression. Alternatively, Generalized Additive Models (GAMs) [48] maintain the additivity of the original predictors $x_1, \dots, x_d$, and the relation between every feature and the response $y$ is expressed using nonlinear functions $f_j(x_j)$ such as

$$y = \beta_0 + \sum_{j=1}^{d} f_j(x_j) \tag{6}$$
To preserve a certain level of predictors interpretability concerning linear models, GAMs escalate the flexibility and accuracy of prediction with the aid of non-parametric models such as boosting and random forest. The predictors are expressed in the form of . The efficacy of GAMs is underrepresented in scenarios where observations exceed predictors. Piecewise affine forms appear as suitable models when the correlated function is found separable, discontinuous, or fuzzy to complex nonlinear expressions [49, 50].
Classification
Classification refers to the mapping of unlabelled data items (entity α) to categories based on a training dataset $(a_i, y_i)$, $i = 1, \dots, k$, where every $a_i$ has a predefined class label $y_i$ in a specific category. Classification admits multiclass and binary approaches, including logistic regression, Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), and decision tree mechanisms [51].
Logistic Regression
In many critical domains, a functional relationship between $x$ and $y$ is absent. In this situation, the relation between $x$ and $y$ has to be described more generally by a probability distribution $P(x, y)$, assuming that the training data are independent samples from $P$. Here the label $y$ is assumed to be binary, i.e., $y \in \{0, 1\}$, and the best class-membership decision is to choose the label that maximizes the conditional distribution $P(y \mid x)$. Logistic regression models the probability of belonging to one of the two classes of the dataset by [52]

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^{\top} x)}} \tag{7}$$

The decision boundary between the two classes is a hyperplane described by $\beta_0 + \beta^{\top} x = 0$. The parameters $\beta_0$ and $\beta$ are obtained by the maximum-likelihood estimation method

$$\max_{\beta_0,\,\beta}\; \sum_{i=1}^{k} \Bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr], \qquad p_i = P(y = 1 \mid a_i) \tag{8}$$

To reach a globally optimal solution, first-order methods such as gradient descent and second-order methods such as Newton's method come into play. Gradient descent locates a local minimum of a differentiable function by taking repeated steps in the direction opposite to the function's gradient at the current point, i.e., in the steepest descent direction. Each iteration of Newton's method fits a parabola to the graph of a twice-differentiable function at a trial value p and then moves to the minimum or maximum of that parabola (its stationary point). Further tuning of logistic regression models can be achieved by variable selection to avoid overfitting, using forward selection to add variables or backward elimination to withdraw variables based on the statistical significance of the coefficients.
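The first-order approach can be sketched for one predictor as gradient ascent on the log-likelihood (equivalently, gradient descent on its negative). This is an illustrative sketch under assumed settings: labels coded as 0/1, a fixed step size, a fixed number of steps, and hypothetical toy data that is deliberately not perfectly separable so that the estimate stays finite.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.05, steps=5000):
    """One-predictor logistic regression fitted by gradient ascent.

    ys must contain 0/1 labels; lr and steps are illustrative defaults.
    """
    beta0, beta1 = 0.0, 0.0
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(beta0 + beta1 * x)  # derivative of the log-likelihood
            grad0 += err
            grad1 += err * x
        beta0 += lr * grad0
        beta1 += lr * grad1
    return beta0, beta1


# Hypothetical toy data with overlapping classes (not taken from the paper).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
beta0, beta1 = fit_logistic(xs, ys)
print(beta0, beta1, sigmoid(beta0 + beta1 * 5.0))
```

The fitted decision boundary is the point where `beta0 + beta1 * x` equals zero, matching the hyperplane description in the text.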
Decision Trees
Classification is often associated with a non-parametric model, the Decision Tree (DT), which reaches a decision on any hypothetical or real-world instance using splitting rules expressed as a tree data structure. Statistical indicators (such as mean, median, or mode) of the segmented training data underlie the model's prediction. DTs are good for large datasets with few dimensions and can handle both numerical and categorical values. Entropy is calculated for each candidate, i.e., as a probability-weighted average, and combined to find the average of each node, represented as $H(s) = -\sum_{i} p_i \log_2 p_i$, where ‘H’ represents the entropy for the given set ‘s’ and ‘$p_i$’ is the relative frequency of class ‘i’ in the data. Similarly, the Gini impurity, given as $G(s) = 1 - \sum_{i} p_i^{2}$, evaluates the impurity of each candidate node, and hence the root with the least impurity can be picked easily. The Information Gain (IG), which quantifies the quality of a split, is represented as

$$IG(s, a) = H(s) - \sum_{v \in \mathrm{values}(a)} \frac{|s_v|}{|s|}\, H(s_v) \tag{9}$$

where $s_v$ is the subset of ‘s’ in which variable ‘a’ takes the value $v$, simplifying it to $IG = H_{\text{before}} - H_{\text{after}}$. This can be estimated as

$$IG(s, a) = H(s) - H(s \mid a) \tag{10}$$

where ‘$H(s \mid a)$’ is the entropy for the data given the variable ‘a’. To avoid overfitting of data, pruning along with other techniques such as Smit and Konin are taken into consideration. Pruning of a tree is an essential measure to ensure unbiased decisions, represented as

$$R_{\alpha}(T) = R(T) + \alpha\, |T| \tag{11}$$

where ‘R(T)’ is the total misclassification rate of the terminal nodes, ‘|T|’ is the number of terminal nodes, and ‘$\alpha$’ is the cost-complexity measure. Various recursive procedures help in splitting the training dataset into segments. Since recursive procedures are greedy by nature, they sometimes fail to reach the global optimum, which motivates alternatives such as heuristic approaches based on mathematical programming paradigms (i.e., linear optimization) and dynamic programming. Consider an example of a simple classification tree, where the tree determines the health status and need for exercise of elderly people based on their activities. Figure 2 represents the decision process. Okaty et al. [53] propose a new stratum-based DT model for precise localization of anatomical landmarks in clinical image analysis. Liang et al. [54] provide an effective and privacy-preserving DT classification strategy for health monitoring systems (PPDT). They turn a DT classifier into Boolean vectors and then encrypt them with symmetric key encryption. Zhu et al. [55] present a Multi-ringed (MR) Forest framework based on DTs for the reduction of false positives in pulmonary nodule detection. Algorithms that generate decision trees from data include Classification and Regression Tree (CART), Iterative Dichotomiser 3 (ID3), and C4.5.
Algorithm.
Step 1: Start.
Step 2: Randomly shuffle the dataset and select n training samples from it with replacement.
Step 3: Calculate the entropy of the target.
Step 4: The dataset is then split on the different attributes. The entropy for each branch is calculated and then added proportionally to get the total entropy for the split. The resulting entropy is subtracted from the entropy before the split. The result is the Information Gain, or decrease in entropy.
Step 5: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches and repeat the same process on every branch.
Step 5.1: A branch with an entropy of 0 is a leaf node.
Step 5.2: A branch with an entropy of more than 0 needs further splitting.
Step 6: End.
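Steps 3 to 5 above can be sketched as follows: compute the entropy of the target, compute the information gain of each attribute, and pick the attribute with the largest gain as the decision node. The helper names, the attribute names, and the four-row dataset are hypothetical and chosen only to echo the elderly-exercise example; the Gini impurity is included for comparison.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    k = len(labels)
    return -sum((c / k) * math.log2(c / k) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    k = len(labels)
    return 1.0 - sum((c / k) ** 2 for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    after = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - after


def best_attribute(rows, labels, attributes):
    """Attribute with the largest information gain (the decision node)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data (not taken from the paper).
rows = [
    {"active": "yes", "diet": "good"},
    {"active": "yes", "diet": "poor"},
    {"active": "no", "diet": "good"},
    {"active": "no", "diet": "poor"},
]
labels = ["fit", "fit", "unfit", "unfit"]
print(best_attribute(rows, labels, ["active", "diet"]))
```

A full tree builder would apply `best_attribute` recursively to each branch until the branch entropy reaches zero, as in Steps 5.1 and 5.2.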
SVM
Among the supervised machine learning algorithms in the statistical learning category, SVMs receive considerable attention in optimization approaches. SVMs aim to identify a hyperplane with maximum margin separating two classes. Given a training set $(A, y)$ with $k$ training inputs $a_i \in \mathbb{R}^{d}$ and $y_i \in \{-1, 1\}$ being the binary response variable, SVM identifies the separating hyperplane as $w^{\top} x + b = 0$, where $w$ represents the vector of coefficients for the input variables and $b$ is the intercept of the separating hyperplane [56].
Hard margin SVM
Hard margin SVM is the simplest version of SVM and assumes that a hyperplane exists which separates the data into two classes without misclassification. The resulting optimization problem is a linearly constrained convex quadratic problem. Training this model identifies a hyperplane that separates the data while maximizing the distance from the hyperplane to the closest data point. The distance of a data point $a_i$ to the hyperplane is given by

$$\frac{\lvert w^{\top} a_i + b \rvert}{\lVert w \rVert_2} \tag{12}$$

where $\lVert \cdot \rVert_2$ expresses the norm-2. Therefore, the data points with labels $y_i = 1$ are on one side of the hyperplane such that $w^{\top} a_i + b \ge 1$, while the data points with labels $y_i = -1$ are on the other side, $w^{\top} a_i + b \le -1$. To find the hyperplane, the following optimization problem has to be solved,

$$\min_{w,\,b}\; \frac{1}{2} \lVert w \rVert_2^{2} \tag{13}$$

s.t. $y_i (w^{\top} a_i + b) \ge 1$, $i = 1, \dots, k$, $w \in \mathbb{R}^{d}$, $b \in \mathbb{R}$, which is recognized as a convex quadratic problem. Forcing the data to be separable by a linear hyperplane often costs accuracy or is simply infeasible, which limits the practicability of this version of SVM; this is where soft-margin SVMs outperform hard-margin SVMs.
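As a small illustration of the hard-margin conditions, the sketch below computes the distance of a point to a hyperplane and checks whether a candidate hyperplane satisfies the separation constraints on a toy dataset. The names `w` (coefficient vector) and `b` (intercept), the helper functions, and the data are assumptions made for illustration; labels are coded as -1 and +1.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def distance_to_hyperplane(w, b, x):
    """Euclidean distance from point x to the hyperplane w.x + b = 0."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))


def satisfies_hard_margin(w, b, points, labels):
    """True if every point meets the constraint y_i * (w.a_i + b) >= 1."""
    return all(y * (dot(w, x) + b) >= 1.0 for x, y in zip(points, labels))


# Hypothetical toy data (not taken from the paper).
points = [[0.0, 0.0], [1.0, 1.0], [3.0, 0.0], [4.0, 2.0]]
labels = [-1, -1, 1, 1]
w, b = [1.0, 0.0], -2.0
print(satisfies_hard_margin(w, b, points, labels))
print(min(distance_to_hyperplane(w, b, x) for x in points))
```

For a feasible hyperplane the closest points lie at distance one over the norm of `w`, so minimizing that norm is the same as maximizing the margin.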
Soft margin SVM
The convex quadratic problem becomes infeasible when the data are not linearly separable. An alternative is to minimize the average error. To limit the number of data points falling on the wrong side of the hyperplane, a slack variable $\xi_i$ is introduced in the constraints and penalized in the objective function as a proxy. With training pairs $(a_i, y_i)$, $y_i \in \{-1, 1\}$, coefficient vector $w$, and intercept $b$, the soft-margin optimization problem is stated as

$$\min_{w,\,b,\,\xi}\; \frac{1}{2} \lVert w \rVert_2^{2} + P \sum_{i=1}^{k} \xi_i \quad \text{s.t.} \quad y_i (w^{\top} a_i + b) \ge 1 - \xi_i, \;\; i = 1, \dots, k \tag{14}$$

where $\xi_i \ge 0$, $w \in \mathbb{R}^{d}$, $b \in \mathbb{R}$, and $P > 0$ is a hyperparameter weighting the penalty. Another alternative is to introduce the error term in the objective function using the squared hinge loss instead of the hinge loss. Replacing norm-2 with norm-1 in this formulation leads to a linear optimization problem.
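The soft-margin idea can be sketched without a quadratic programming solver by minimizing the equivalent unconstrained objective, half the squared norm of the coefficients plus a penalty times the hinge loss, with plain subgradient descent. This is a swapped-in solution technique chosen for brevity, not the method of the cited works; the `penalty`, `lr`, and `epochs` parameters and the toy data are assumptions, and labels are coded as -1 and +1.

```python
def train_soft_margin_svm(points, labels, penalty=1.0, lr=0.01, epochs=2000):
    """Subgradient descent on 0.5*||w||^2 + penalty * sum(hinge losses)."""
    d = len(points[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # hinge loss is active for this point
                for j in range(d):
                    grad_w[j] -= penalty * y * x[j]
                grad_b -= penalty * y
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Class label (+1 or -1) given by the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


# Hypothetical toy data (not taken from the paper).
points = [[2.0, 2.0], [3.0, 3.0], [3.0, 2.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_soft_margin_svm(points, labels)
print(w, b, [predict(w, b, x) for x in points])
```

With a fixed step size the iterates only settle into a neighborhood of the optimum, so the learned coefficients should be read as approximate.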
Sparse SVM
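Various approaches have been proposed to deal with sparsity (feature selection in the classification model) in SVMs, among which the 1-norm and the elastic net (combining the 1-norm and 2-norm) are common. The elastic net tunes the bias towards one of the norms using a hyperparameter [57]. The number of features selected can be modeled in the soft-margin optimization problem by using binary variables $z_j \in \{0, 1\}$, where $z_j = 1$ indicates that feature $j$ is selected and $z_j = 0$ otherwise. A constraint restricting the number of features to a desired bound $r$ can then be added, resulting in a mixed-integer quadratic problem:

$$\min_{w,\,b,\,\xi,\,z}\; \frac{1}{2} \lVert w \rVert_2^{2} + P \sum_{i=1}^{k} \xi_i \tag{15}$$

s.t. $y_i (w^{\top} a_i + b) \ge 1 - \xi_i$ for $i = 1, \dots, k$, $-M z_j \le w_j \le M z_j$ for $j = 1, \dots, d$, $\sum_{j=1}^{d} z_j \le r$, $\xi_i \ge 0$, $z_j \in \{0, 1\}$, $w \in \mathbb{R}^{d}$ and $b \in \mathbb{R}$, where $(a_i, y_i)$ are the training pairs, $P > 0$ weights the slack penalty, and $M$ is a sufficiently large constant.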
SVR
Support Vector Regression (SVR) is a supervised machine learning technique designed to handle regression problems. Regression analysis is useful for examining the relationship between one or more predictor variables and a dependent variable, and SVR can balance the complexity of the model against the prediction error [58]. SVR is an extension of the classic SVM, which was introduced for binary classification; its core idea is to find a linear function $f(x) = w^{\top} x + b$ that approximates the training set $(A, y)$, where $y \in \mathbb{R}^{k}$, within a tolerance $\varepsilon$ [59]. SVR has shown strong performance on regression problems with high-dimensional data. SVR uses a similar approach to SVM, with hyperplanes defined by a few support vectors, and can also handle non-linear regression [60]. However, such a linear function might not always exist, so slack variables $\xi_i$ and $\xi_i^{*}$ expressing deviations beyond the tolerance are introduced and minimized, in the same way as in soft-margin SVMs. The optimization problem is stated as follows.

$$\begin{aligned} \min_{w,\,b,\,\xi,\,\xi^{*}}\; & \frac{1}{2} \lVert w \rVert_2^{2} + P \sum_{i=1}^{k} \left( \xi_i + \xi_i^{*} \right) \\ \text{s.t.}\; & y_i - w^{\top} a_i - b \le \varepsilon + \xi_i, \\ & w^{\top} a_i + b - y_i \le \varepsilon + \xi_i^{*}, \\ & \xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, k \end{aligned} \tag{16}$$

Tuning the hyperparameter (P) adjusts the weight on deviation from the tolerance. This deviation is the $\varepsilon$-insensitive loss function given by

$$\lvert \xi \rvert_{\varepsilon} = \begin{cases} 0 & \text{if } \lvert \xi \rvert \le \varepsilon \\ \lvert \xi \rvert - \varepsilon & \text{otherwise} \end{cases} \tag{17}$$
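The tolerance-based loss can be sketched directly: deviations inside the tolerance band cost nothing, and larger deviations are penalized linearly. The sketch also evaluates the corresponding unconstrained SVR objective for a fixed linear model; the function names, the `penalty` parameter, and the toy data are assumptions made for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tolerance band, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(w, b, points, targets, epsilon, penalty):
    """0.5*||w||^2 + penalty * sum of epsilon-insensitive losses."""
    regularizer = 0.5 * sum(wj * wj for wj in w)
    total_loss = sum(
        epsilon_insensitive_loss(y, sum(wj * xj for wj, xj in zip(w, x)) + b, epsilon)
        for x, y in zip(points, targets)
    )
    return regularizer + penalty * total_loss


# Hypothetical toy data (not taken from the paper).
points = [[1.0], [2.0], [3.0]]
targets = [2.0, 4.5, 5.0]
print(svr_objective([2.0], 0.0, points, targets, epsilon=0.5, penalty=1.0))
```

Only the third point falls outside the tolerance band in this example, so it is the only one that contributes to the loss term.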
Clustering
Clustering is a widely used class of unsupervised learning that focuses mainly on grouping a set of objects into smaller clusters of similar kind. This common statistical data analysis technique finds its application in the domains of pattern recognition, bioinformatics, data compression, image analysis, and information retrieval. Healthcare sectors collect massive amounts of data from various healthcare service providers, and this data may include patient information, medical tests, and treatment specifics. Because of the intricacy of the data obtained, analyzing it for decision-making on a patient's health state is tough. Numerous strategies, such as clustering, are currently used by healthcare practitioners to determine a patient's health state. Clustering divides huge datasets into smaller groups based on related properties [61] and is usually used to find commonalities between data points; generating clusters or groups of items in a dataset has been the most common use of unsupervised (unlabeled) learning. Given an input $A \in \mathbb{R}^{k \times d}$, which includes k unlabelled observations $a_1, \dots, a_k$ with $a_i \in \mathbb{R}^{d}$, clustering aims to procure $K$ subsets of $A$, i.e., individual clusters, which are homogeneous as well as separated. The number of clusters $K$ acts as a tuning parameter that needs to be fixed before examining the clusters. The degree of separation and homogeneity can be modeled based on different criteria, which give rise to several types of clustering algorithms such as K-means Clustering, Capacitated Clustering, and Hierarchical Clustering.
K-Means
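K-means clustering, or minimum sum-of-squares clustering, is a vector quantization method that aims to partition the $k$ data observations into $K$ disjoint clusters such that each observation belongs to the cluster with the nearest mean (centroid). The number of clusters is chosen by close examination of the elbow curve, of similarity indicators such as the Calinski-Harabasz index or silhouette values, or via mathematical programming approaches [62]. With binary variables $u_{ij}$, equal to 1 if observation $a_i$ is assigned to cluster $j$ and 0 otherwise, and the centroid $\mu_j$ of each cluster $j$, the problem of minimizing the within-cluster variance is given as a nonlinear problem [63]

$$\min_{u,\,\mu}\; \sum_{i=1}^{k} \sum_{j=1}^{K} u_{ij}\, \lVert a_i - \mu_j \rVert_2^{2} \quad \text{s.t.} \quad \sum_{j=1}^{K} u_{ij} = 1, \;\; i = 1, \dots, k \tag{18}$$

with $\mu_j \in \mathbb{R}^{d}$ and $u_{ij} \in \{0, 1\}$. Introducing the variable $d_{ij}$, which denotes the distance of observation $i$ from centroid $j$, the following formulation with a linear objective is obtained

$$\min_{u,\,\mu,\,d}\; \sum_{i=1}^{k} \sum_{j=1}^{K} d_{ij} \quad \text{s.t.} \quad d_{ij} \ge \lVert a_i - \mu_j \rVert_2^{2} - M (1 - u_{ij}), \;\; \sum_{j=1}^{K} u_{ij} = 1, \;\; d_{ij} \ge 0, \;\; u_{ij} \in \{0, 1\} \tag{19}$$

where $M$ is a sufficiently large constant. Apart from the above-mentioned methods, several alternatives such as heuristic approaches based on the gradient method, the bundle approach, and a column generation approach are in practice. Figure 3 shows the clusters produced by K-means together with their centroids, all distinctly separated.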
K-Means Clustering Algorithm.
Input: dataset of coordinates D, count of clusters K.
Step 1: Initialize K centroids randomly.
Step 2: Assign each coordinate in dataset D to the closest centroid. This distributes all coordinates into K clusters based on their similarity.
Step 3: Re-compute the coordinates of the centroids as the mean of the coordinates in each cluster.
Step 4: Repeat Steps 2 and 3 until the positions of the centroids become constant or fixed.
Output: Data points with cluster membership.
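The steps above can be sketched as follows. The function names, the seeded random initialization from the data points, the stopping rule on unchanged assignments, the iteration cap, and the toy data are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def k_means(points, num_clusters, seed=0, max_iter=100):
    """Return (assignment, centroids) following Steps 1 to 4."""
    rng = random.Random(seed)
    # Step 1: pick distinct data points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, num_clusters)]
    assignment = None
    for _ in range(max_iter):
        # Step 2: attach each point to its closest centroid.
        new_assignment = [
            min(range(num_clusters), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop once the memberships no longer change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each centroid to the mean of its members.
        for j in range(num_clusters):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return assignment, centroids


# Hypothetical toy data with two well-separated groups (not from the paper).
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
assignment, centroids = k_means(points, num_clusters=2)
print(assignment, centroids)
```

Because the result depends on the initial centroids, practical use usually repeats the procedure from several random starts and keeps the partition with the smallest within-cluster sum of squares.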
Capacitated Clustering
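The Capacitated Centred Clustering Problem (CCCP) aims to define a set of clusters with limited capacity, with homogeneity expressed through the similarity to the cluster's centre. Considering a set of potential clusters $1, \dots, K$, the CCCP can be mathematically represented as

$$\begin{aligned} \min_{u,\,v}\; & \sum_{i=1}^{k} \sum_{j=1}^{K} s_{ij}\, u_{ij} \\ \text{s.t.}\; & \sum_{j=1}^{K} u_{ij} = 1, \quad i = 1, \dots, k, \\ & \sum_{j=1}^{K} v_j \le K, \\ & u_{ij} \le v_j, \quad i = 1, \dots, k, \;\; j = 1, \dots, K, \\ & \sum_{i=1}^{k} q_i\, u_{ij} \le Q_j, \quad j = 1, \dots, K, \\ & u_{ij},\ v_j \in \{0, 1\} \end{aligned} \tag{20}$$

where $K$ is the upper bound on the number of clusters and $s_{ij}$ represents the measure of dissimilarity between cluster $j$ and observation i. $Q_j$ is the capacity of cluster $j$, and $q_i$ is the weight of observation $i$. Variable $u_{ij}$ denotes the assignment of observation $i$ to cluster $j$, and variable $v_j$ is equal to 1 when cluster $j$ is used. If $s_{ij}$ is a distance and the clusters are homogeneous, then the formulation also models the well-known facility location problem [64].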
Linear Dimension Reduction
Linear dimensionality reduction methods have been developed extensively over many years in statistics and applied fields and have become an indispensable tool for analysing high-dimensional and noisy data. These methods improve the model's interpretability by producing a low-dimensional linear mapping of the original high-dimensional data that preserves features of interest in the output sample [65].
Principal Components
Principal component analysis (PCA) aims to minimize the sum of squared residual errors between the original high-dimensional data and the projected data points. PCA results are assessed in terms of explained variance, which refers to the amount of information retained from the original feature set. For centered observations $a_1, \dots, a_k$, PCA was formulated originally as

$$\min_{v}\; \sum_{i=1}^{k} \bigl\lVert a_i - (v^{\top} a_i)\, v \bigr\rVert_2^{2} \quad \text{s.t.} \quad \lVert v \rVert_2 = 1 \tag{21}$$

where $v$ is a unit vector. This formulation is sensitive to the presence of outliers. It is equivalent to the "maximizing variance" derivation given as

$$\max_{v}\; v^{\top} S\, v \quad \text{s.t.} \quad v^{\top} v = 1 \tag{22}$$

where $S = \frac{1}{k} \sum_{i=1}^{k} a_i a_i^{\top}$ is the sample covariance matrix of the centered data. PCA finds its application in various data analytics problems which benefit from dimensionality reduction mechanisms. For linear regression models, there exists Principal Component Regression (PCR), a two-stage procedure that inherits the properties of PCA, with the advantage of using fewer predictors and a reduced prediction time on the same dataset. Among all the positive outcomes of PCA, the main known drawback is interpretability.
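The variance-maximizing view of PCA can be sketched by computing the first principal component as the leading eigenvector of the sample covariance matrix. The sketch uses power iteration, a swapped-in numerical technique chosen so that only the standard library is needed; the function name, the iteration count, and the toy data are assumptions made for illustration.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction, explained-variance ratio) of the first component."""
    k = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / k for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[p] * c[q] for c in centered) / k for q in range(d)]
           for p in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # starting direction
    for _ in range(iterations):
        u = [sum(cov[p][q] * v[q] for q in range(d)) for p in range(d)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    variance = sum(v[p] * cov[p][q] * v[q] for p in range(d) for q in range(d))
    total = sum(cov[p][p] for p in range(d))
    return v, variance / total


# Hypothetical toy data lying close to the line y = 2x (not from the paper).
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
direction, explained = first_principal_component(rows)
print(direction, explained)
```

Projecting the centered data onto `direction` gives the one-dimensional representation; the explained-variance ratio indicates how much of the original spread that projection retains.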
Problems in Healthcare Sector
A shift toward a data-driven socioeconomic approach to health is taking place. This is due to the increased volume, velocity, and diversity of data obtained from the public and private sectors across healthcare and the natural sciences. Over the last five years, there has been remarkable advancement in informatics technologies and computational intelligence for use in health and biomedical sciences. However, the full potential of data to address the breadth and extent of human health problems has yet to be realized. The properties of health data present intrinsic limitations to the effective implementation of typical data mining and ML technologies. Aside from their volume ('Big Data'), health data are difficult to manage because of their complexity, heterogeneity, dynamic nature, and unpredictability. Finally, practical obstacles in applying new and current standards across different health providers and research organizations have hindered data management and the interpretability of the results. Oliveira et al. [66] address the issue of interpretability of the results acquired from the study of clinical data and go on to explain the cluster labels by deciphering the appropriate events. Likewise, Mengoudi et al. [67] use self-supervised representation learning to train DNNs to detect diverse cognitive processes in healthy people. As a result, the model learns to encode high-level semantic information, which is then utilized to distinguish between healthy controls and people with dementia. Intelligent methods are now being used to address such challenges in the healthcare business.
Applications of ML in Healthcare
The healthcare sector generates a large quantity of heterogeneous data daily, which makes the data difficult to analyse and process by conventional methods. DL and ML methods help automate these arduous tasks and turn the data into actionable insights. The data sources can be grouped into distinct categories such as medical data, social media data, environmental data, and genomics. Table 1 summarizes the contributions of various researchers in different domains of ML applicability over time. ML/DL techniques can serve to automate and improve performance in major healthcare application areas such as prognosis, diagnosis, treatment, and clinical workflow. The extensive amount of heterogeneous data flowing into healthcare systems is depicted in Fig. 4.
Table 1.
| Application of ML in healthcare | References | Year | Contribution |
|---|---|---|---|
| Electronic health records (EHRs) | Stojanovic et al. [68] | 2017 | Modeled healthcare quality via compact representations of EHRs |
| | Brisimi et al. [69] | 2018 | Presented prediction of chronic disease hospitalization from EHRs |
| | Shickel et al. [70] | 2018 | Analyzed advances in DL techniques for EHRs |
| | Fuente et al. [71] | 2019 | Developed a solution for searching behavioral patterns in EHRs using the Random Forest algorithm |
| | Harerimana et al. [72] | 2019 | Presented deep learning strategies for EHR analytics |
| | Bernardini et al. [73] | 2020 | Developed solutions for discovering type-2 diabetes in EHRs using sparse balanced SVMs |
| | Tsang et al. [74] | 2020 | Modeled sparse data for feature selection in the prediction of dementia patients' admission using EHRs |
| | Lee et al. [75] | 2021 | Proposed classification of opioid usage for total joint replacement patients |
| | Kumar et al. [15] | 2021 | Developed ensemble ML approaches for morbidity identification from clinical data |
| Medical image analysis | Zebari et al. [76] | 2020 | Improved automated segmentation of pectoral muscle and breast cancer boundary in mammogram images |
| | Zech et al. [77] | 2018 | Developed automated annotation of clinical radiology reports using natural language-based models |
| | Jing et al. [78] | 2018 | Developed automatic generation of radiology imaging reports |
| | Li et al. [79] | 2021 | Developed a solution using histopathological images to classify and diagnose lung cancer subtypes |
| | Mandal et al. [64] | 2018 | Surveyed medical imaging transformation across the healthcare spectrum |
| | Umamaheswari et al. [80] | 2018 | Developed digital imaging to classify and segment acute lymphoblastic leukemia cells |
| | Wang et al. [81] | 2019 | Used sparse multi-regularization learning and multi-level dual network features to classify breast cancer images |
| | Abhinaav et al. [82] | 2019 | Developed an ML mechanism using extracted Papanicolaou Smear images to detect abnormality and severity of cells |
| | Bora et al. [83] | 2020 | Proposed a radiograph generating reconstruction mechanism for facilitating AI in medical imaging |
| Treatment | Weng et al. [84] | 2017 | Provided analysis on ML prediction of cardiovascular risk using routine medical data |
| | Fatima et al. [85] | 2017 | Surveyed ML algorithms for disease diagnosis |
| | Zhao et al. [86] | 2019 | Applied an ML approach for drug repositioning of Schizophrenia and anxiety disorders |
| | Jamshidi et al. [87] | 2020 | Proposed DL approaches for diagnosis and treatment of the novel coronavirus |
| | Li et al. [88] | 2019 | Assessed ML for predicting severity in liver fibrosis for chronic HBV |
| | Noaro et al. [89] | 2021 | Developed an ML-based model for improving the calculation of Insulin Bolus in type-1 diabetes therapy |
| | Yang et al. [90] | 2017 | Proposed a combined ML algorithm for effective medical diagnosis and treatment using an inference engine |
| | Chaitra et al. [91] | 2020 | Proposed an ML model for diagnostic prediction of autism spectrum disorder |
| Computer-aided detection (CAD) | Saygılı et al. [92] | 2021 | Developed ML methods and soft computing strategies for computer-aided Covid-19 detection from CT-Scan and X-ray images |
| | Abdelsalam et al. [93] | 2018 | Presented the computer-aided detection of leukemia using microscopic blood-based ML |
| | Wu et al. [94] | 2018 | Developed DL techniques to detect hookworm in wireless endoscopy images |
| | Yu et al. [95] | 2021 | Implemented ML-aided imaging analytics for histopathological image diagnosis |
| Disease prediction and diagnosis | Suresh et al. [96] | 2017 | Presented clinical event prediction and analysis using DL mechanisms |
| | Rau et al. [97] | 2018 | Presented a study using ML for predicting the mortality rate of patients with isolated moderate and severe traumatic brain injury |
| | Kim et al. [98] | 2017 | Proposed ML-based diagnosis of major depressive disorder by combining heart rate data |
| | Pellegrini et al. [99] | 2018 | Developed ML-assisted diagnosis of dementia and cognitive impairment |
| | Akbulut et al. [100] | 2018 | Presented an ML system for foetal health condition prediction based on maternal clinical history |
| | Karhade et al. [101] | 2018 | Developed ML algorithms for predicting 5-year survival of spinal chordoma patients |
| | Abdar et al. [102] | 2019 | Proposed a new ML technique for the diagnosis of coronary artery disease |
| | Burdick et al. [103] | 2020 | Used ML to develop a prediction system for respiratory decompensation in coronavirus patients |
| | Hashem et al. [104] | 2020 | Developed ML models for diagnosis of HCV-related chronic liver disease and hepatocellular carcinoma |
| | Magesh et al. [105] | 2020 | Developed an explainable ML model using LIME on imagery for early detection of Parkinson's disease |
| | Shen et al. [106] | 2021 | Presented risk-predicting ML models in the diagnosis of Escherichia coli sepsis in patients |
| | Montolío et al. [107] | 2020 | Applied ML to disability prediction and diagnosis of multiple sclerosis using optical coherence tomography |
| Clinical time-series data | Yu-Wei et al. [108] | 2019 | Used recurrent neural networks for prediction of unplanned ICU readmission |
| | Xie et al. [110] | 2020 | Compared benchmarks of classical time-series ML models with new algorithms on blood glucose prediction in type-1 diabetes |
| | Pezoulas et al. [111] | 2021 | Used time-series gene expression data for the detection of a diagnostic biomarker in Kawasaki disease |
| | Nancy et al. [112] | 2017 | Applied a bio-statistical mining approach for the classification of multivariate clinical time-series data observed at varying intervals |
| | Froc et al. [113] | 2021 | Characterized urinary tract endometriosis over a collected one-year national series data of 232 patients |
| | Wallace et al. [114] | 2018 | Simplified the function of speech recognition admissibility in medical documentation aspects |
| Clinical speech and audio processing | Zamani et al. [115] | 2020 | Presented an automated Pterygium detection using ML/DL approaches |
| Prognosis | Ke et al. [117] | 2019 | Presented an automated image annotation based on multi-label data augmentation and deep CNNs |
| | Davi et al. [118] | 2019 | Utilized ML and human genome data for severe dengue prognosis |
| | Liu et al. [119] | 2019 | Proposed a weakly supervised DL technique for brain disease prognosis using MRI data and incomplete clinical scores |
| | Fang et al. [120] | 2020 | Discussed the ML approach for feature selection in stroke prognosis |
| | Wang et al. [121] | 2019 | Presented a transfer learning least squares SVM mechanism in bladder cancer prognosis |
| | Cai et al. [122] | 2020 | Presented ML models and CT quantification approaches for assessment of disease prognosis and severity of coronavirus patients |
| | Zack et al. [123] | 2019 | Developed ML techniques for forecasting patient prognosis after percutaneous coronary intervention |
| | He et al. [124] | 2021 | Developed an ML prediction model for acute kidney injury following donation after cardiac death liver transplantation |
Electronic Health Records (EHRs)
Electronic Health Records (EHRs) hold a large amount of data, recorded daily by hospitals and other healthcare services, covering the medication history of patients and other details regarding their recovery. Extracting clinical features from EHRs manually is an extremely tedious task, and ML-based methods make it easier to extract the data required to facilitate the diagnosis process. Diverse approaches have been presented to diagnose conditions such as diabetes, lung infections due to Covid-19, and the advancement of tumorous cells from unstructured EHRs. The unstructured records in EHRs are mainly examined for two tasks, i.e., length-of-stay and mortality prediction. Studies have observed that diagnostic prediction degrades when ML models trained on historical records are tested on new (unseen) data. Stojanovic et al. [68] presented a study in which they coupled EHRs with advanced ML tools for predicting major parameters of healthcare quality. The study is dedicated to reduced-dimensional vector representations of patients' clinical procedures and conditions. Brisimi et al. [69] developed ML methods to predict the probability of hospitalization due to two of the most important chronic diseases, i.e., diabetes and heart ailments. The predictions rely on the clinical history of patients recorded in EHRs. Recent years have seen an enormous increase in the volume of digital medical data collected in EHRs. Shickel et al. [70] surveyed current research on the application of DL to clinical tasks on EHR data and identified several gaps in current EHR-based research. Likewise, Fuente et al. [71] studied the behavioral patterns in patients' EHRs using the Random Forest algorithm. Their study mainly focuses on finding correlations between different diseases or the factors associated with them.
Analytics plays an important role in data-driven systems for medical facilitation. Harerimana et al. [72] offered an intuitive review of optimized DL approaches for managing and utilizing data from EHRs. The exponential rise in the availability of data might ease the data requirements of ML processes. However, predictive performance is traded off against computation time, which can become critical in medical emergencies. Diabetes is one of the most common conditions found amongst the Indian population. The early discovery of type 2 diabetes (T2D) helps in treating patients more pragmatically and prevents severity. Bernardini et al. [73] introduced an ML algorithm known as the Sparse Balanced Support Vector Machine (SB-SVM), trained extensively on the abundant data recorded in EHRs, to detect T2D early and efficiently. The SB-SVM produces promising results and outperforms competing approaches by providing the best trade-off between computation time and predictive performance. Similarly, with the help of EHRs, researchers have developed several mechanisms such as admission prediction for Dementia patients [74], classification of Opioid usage for Joint Replacement patients [75], and morbidity identification [15], to name a few.
ML in Medical Image Analysis
ML systems have become firmly established in the analysis of medical images. These computational techniques allow the efficient extraction of important information from image samples produced using various imaging modalities (e.g., MRI, Computed Tomography (CT), Positron Emission Tomography (PET), and ultrasound imaging). Recent advances in computational hardware are allowing physicists to revise old AI algorithms and experiment with new mathematical ideas [76]. The mechanically produced images enable diagnosis of the root of an illness and localization of abnormalities in various parts of the body. The significant tasks in clinical image analysis comprise detection, segmentation, localization [77], classification, enhancement, reconstruction, etc. [78]. As a result, a completely automated intelligent system for medical image analysis is expected to provide services such as segmentation, localization, detection, and classification. M. Li et al. [79] presented an experimental study in computer-aided lung-cancer diagnosis based on histopathological imaging. Their best classifier, i.e., the Relief-SVM (relevant features - Support Vector Machine) model, achieved the highest accuracy, thereby verifying the potential of auxiliary diagnostic models using medical images. ML and AI have influenced treatment procedures in numerous ways. A detailed review of how AI is remodeling the medical imaging spectrum is presented by Mandal et al. [64]. Likewise, Umamaheswari et al. [80] propose an algorithm for the classification and segmentation of Acute Lymphoblastic Leukaemia cells. The system takes analyzed clinical images as its sole input. For the sake of graphical analysis, segmentation is performed first, followed by morphological operators and Otsu's thresholding. 
The use of nucleus characteristics in conjunction with a supervised KNN classifier improves classification rates, yielding an estimated accuracy of 95.92 percent on average. Based on histological images, Wang et al. [81] presented an in-depth study in which they improved the existing detection accuracy for malignant cells in breast cancer. They adopted a dual-network multi-relation regularized learning method to boost performance. Cervical cancerous cells are classified using a test called the Papanicolaou Smear (PAP) test. Abhinaav et al. [82] devised an algorithm that uses the image dataset produced by the PAP test to separate normal cells from affected cells. Histopathological images are prone to uncertainty, which degrades the performance of an ML model trained on them. However, the introduction of fuzzy modeling techniques has significantly reduced bias and uncertainty in the image data [83].
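Otsu's thresholding, mentioned above as part of the segmentation stage, can be sketched as follows. This is a generic illustration on a flat list of 8-bit grey levels, not the pipeline of Umamaheswari et al. [80]; the function name `otsu_threshold` and the toy pixel values are assumptions made for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes the between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels with intensity <= t
    sum0 = 0  # sum of the intensities of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t

# Toy image: a dark background and a bright "nucleus" region.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the segmented foreground
print(t, sum(mask))
```

In practice the resulting binary mask would typically be cleaned up with morphological operators before features are extracted for a classifier.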
Applications of ML in Treatment
Recent innovations and research in ML applications for healthcare have paved the way for better treatment options. The medication process follows a three-step procedure of prognosis, diagnosis, and treatment. In the diagnosis phase, medical images are studied by expert clinicians and radiologists to interpret the possible risks and cures. An extensive amount of medical data is produced daily by small and large healthcare facilities; the information collected is put through rigorous supervision, and the findings are recorded in reports. However, preparing such reports requires expertise, and if they are handled by less experienced staff in nascent healthcare services, the result may be a misdiagnosis or a flawed conclusion about a critical case. Moreover, preparing textual medical documents at an organizational level can be tedious and wearying for clinical experts and radiologists, so researchers have attempted to address specific problems using ML techniques. Zech et al. [77] proposed a Natural Language Processing (NLP) based method for the annotation of radiology reports. A similar study conducted by Jing et al. [78] proposed a multi-task ML framework for the automatic description and tagging of clinical radiology images. Similarly, researchers and physicians have found ways to blend methods such as Convolutional Neural Networks (CNN), RNN, and LSTM into state-of-the-art automated architectures for predictive systems that localize affected areas of the body [84, 85]. Zhao et al. [86] presented a study in which they developed ML algorithms for possible drug repositioning in Depression and Schizophrenia disorders. SVMs outperformed the other algorithms tested. The Covid-19 outbreak has claimed thousands of lives and has created a profusion of difficulties. 
Researchers and medical experts have since worked extensively to find ways of saving lives, and technology has been an integral part of that effort. Jamshidi et al. [87] curated a collection of diverse DL approaches for the diagnosis and treatment of Covid-19 patients. Li et al. [88], in turn, utilized ML approaches for assessing the degree of severity of Liver Fibrosis in chronic HBV. As noted earlier, diabetes remains one of the most researched conditions. Noaro et al. [89] presented a study examining the ability of ML models to improve the Insulin Bolus calculation in type 1 diabetes. Further studies also illustrate the use of ML for treatment [90, 91].
ML in Computer-Aided Detection
ML has been used extensively as a major strategy in CAD schemes, i.e., Computer-Aided Detection/Diagnosis, for classifying lesion candidates into certain classes. CAD is an interdisciplinary technology blending elements of AI and ML with radiology and pathology image processing; IBM's Watson is a well-known example. The automatic interpretation of medical images has proved to be highly valuable in assisting radiologists and doctors in their clinical treatment when time constraints are paramount. The workflow takes into account various feature-scoring techniques such as the Fisher score discriminator, the t-test, and the chi-square test, together with several traditional processes including predictive algorithms, Computer Vision, and Image processing methods. Saygılı et al. [92] examined several classification models to support early computer-aided diagnosis and treatment of Covid-19 using image processing and ML. Their proposed approach achieved an accuracy of 99.02% on the X-ray image dataset. Correspondingly, Abdelsalam et al. [93] explored the use of CNNs in computer-aided Leukaemia detection using microscopic blood images. Much of the discussion revolves around detecting human ailments and finding possible cures using ML. Considering hookworm, one of the most common infectious diseases with fatal outcomes, especially in children, He et al. [94] proposed a broad study on hookworm detection. Their study presents an ML detection framework for Wireless Capsule Endoscopy (WCE) images which simultaneously tracks the tubular patterns of hookworms and models visual representations. Extending to imaging analytics for pathological image diagnosis, Yu et al. [95] presented an extensive review of the topic.
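As an illustration of the feature-scoring step, the Fisher score of a single feature can be computed as the ratio of between-class scatter to within-class scatter. The sketch below is a generic formulation and is not drawn from any of the cited CAD systems; the function name `fisher_score` and the toy feature values are assumptions.

```python
from collections import defaultdict

def fisher_score(values, labels):
    """Fisher score of one feature: sum_k n_k (mu_k - mu)^2 divided by
    sum_k n_k var_k, where k ranges over the classes."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)

    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for vals in groups.values():
        n = len(vals)
        mu = sum(vals) / n
        var = sum((v - mu) ** 2 for v in vals) / n  # population variance
        between += n * (mu - overall_mean) ** 2
        within += n * var
    # A feature that is constant within every class separates perfectly.
    return between / within if within > 0 else float("inf")

labels = [0, 0, 0, 1, 1, 1]                    # e.g. benign vs. lesion
informative = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]   # class means differ
uninformative = [1.0, 3.0, 2.0, 1.1, 2.9, 2.0]  # class means coincide
print(fisher_score(informative, labels), fisher_score(uninformative, labels))
```

Features would then be ranked by score and only the highest-ranked ones passed on to the classifier.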
Disease Prediction and Diagnosis
Early disease prediction and diagnosis can be vital in saving a person's life. Predictive ML methods enable early prognosis and diagnosis from medical data, which reduces the time needed to begin treating the disease. Surveys report that certain ML algorithms have been successful in predicting cardiovascular risk from clinical data [96], and studies have concluded that ML improves the effectiveness of prognosis and diagnosis. ML-based methods have proved capable in the prognosis and prediction of cancer and in the detection of various conditions such as virulent infections, dengue, hepatitis, heart problems, malaria, and diabetes [97]. Major Depressive Disorder (MDD) is a mood disorder studied under biological psychiatry, and it is very prevalent amongst young people today. Its diagnosis demands that the root cause be unravelled. Kim et al. [98] studied MDD and applied ML to classify peripheral biomarkers using Heart Rate Variability (HRV) and serum proteomic analysis data. ML has been observed to assist in many cognitive diagnosis procedures, as evidenced by the systematic review of Pellegrini et al. [99]. Following this, Akbulut et al. [100] proposed several ML techniques for monitoring and predicting foetal health status based on maternal clinical history. ML has been developed extensively through the years, and the decision support provided by ML models has reduced the workload on clinical professionals to a considerable extent. Karhade et al. [101] developed a Bayes Point machine for the prediction of 5-year survival in spinopelvic chordoma. The ML model was developed specifically for this rare pathology, yet accuracy was not compromised. Likewise, Abdar et al. [102] proposed ML techniques for the diagnosis of coronary artery diseases. Burdick et al. [103] employed ML for the prediction and diagnosis of respiratory decompensation in Covid-19 patients. 
Hashem et al. [104] developed predictive models for the diagnosis of chronic liver diseases along with Hepatocellular Carcinoma. Magesh et al. [105] presented a study on the early detection of Parkinson's disease. Likewise, diagnostic models for Escherichia coli infection [106], multiple sclerosis [107], and other conditions have been developed.
ML for Clinical Time-Series Data
Time-series data is a collection of numerical/statistical features monitored over a certain period. Clinical time-series data combines medical observations that periodically track the evolution of the data points of primary concern. Applications of clinical time-series ML modeling cover the prediction of health status in Intensive Care Units (ICUs) using CNNs and Long Short-Term Memory networks (LSTMs) [108], mortality rate prediction for patients with Traumatic Brain Injury (TBI) [109], and the assessment of blood pressure and Intracranial Pressure (ICP), which are prime signs of Cerebrovascular Autoregulation (CA) in TBI patients [109]. Recent studies state that integrating time-series data with multivariate models tremendously increases their predictive power for forecasting tasks such as prognosis, diagnosis, and recommendation. Xie et al. [110] benchmarked ML time-series models on the prediction of blood glucose levels in Type 1 diabetic patients. Pezoulas et al. [111] gathered time-series microarray gene expression data to build a predictive system. The resulting model detects candidate biomarkers of Kawasaki disease. Every ML application needs to be fed a large amount of data for better performance. However, data management is considered one of the most tedious yet crucial jobs. Nancy et al. [112] applied a bio-statistical mining approach for the efficient classification and management of time-series data observed at irregular time intervals. Similarly, Froc et al. [113] listed clinical attributes of urinary tract endometriosis in a series of 232 patients collected over one year.
Clinical Speech and Audio Processing
In clinical environments, staff are required to generate huge amounts of documentation, including clinical reports, imaging reports, and discharge applications, which takes a lot of time and is highly strenuous for clinicians. Wallace et al. [114] noted that, given the workload already on experts, documentation is an added burden that takes 50% of their time, and as a result the interaction time with patients is curtailed. This situation is stressful for clinicians and emotionally disconnecting for patients who require attention, so clinical speech and audio processing offers some relief. Its applications include contactless services based on speech communication, automated transcript generation, clinical notes synthesis, and correspondence in emergencies when staff are unavailable. These methods are time- and cost-effective and increase productivity. In managing healthcare infrastructure internally, applications of clinical audio and speech processing have been successful where automation is a new modality for patients as well as clinicians [115]. Clinical speech processing confronts two major challenges, disfluency and utterance segmentation, both of which stall processing.
ML in Prognosis
Prognosis refers to the process of forecasting the likely outcome of a disease based on medical trials. The process includes the identification of potential risks, the ascertainment of early stages of the disease's development, and the likelihood of survival. Collins et al. [116] stated that ML models facilitating prognosis are fed with multimodal patient data for improved performance. Recent research on the potential applications of ML in medical prognosis [117] emphasizes the adoption of personalized medicine, a nascent field that still requires extensive development. Achieving the translational impact of personalized medicine is expected to require robust validation strategies and the use of ML. Davi et al. [118] proposed an ML classification method developed using human genome markers for severe dengue prognosis. Another study on ML models, by Liu et al. [119], demonstrates brain disease prognosis using incomplete clinical scores and MRI data. Various other predictive systems have been developed using ML/DL approaches. ML can also be utilized for selecting features in stroke prognosis [120]. Wang et al. [121] presented a transfer-learning approach for bladder cancer prognosis. Cai et al. [122] investigated ML models for the assessment and quantification of severity and prognosis in Covid-19 patients. Finally, Zack et al. [123] leveraged ML techniques for forecasting prognosis after Percutaneous Coronary Intervention, and He et al. [124] studied the prediction of acute kidney injury following donation after cardiac death liver transplantation.
Sources of Vulnerabilities in ML Pipeline
The applications of ML in healthcare are still at a nascent stage of development. The challenges arising from security breaches and privacy violations are discussed in this section. Cyber defense strategies have not yet matured in the healthcare domain, which threatens the confidentiality of, and confidence in, the ML models developed. In addition, the major challenges faced during ML pipeline development, along with the potential sources of vulnerability that cause them, are pointed out next.
Vulnerabilities in Data Collection
Vulnerabilities can creep into even carefully amassed medical data, considering the large amount of information collected every day in various formats such as medical images, radiology reports, health surveys, patient/disease registries, and clinical trials data. Handling this mass of information requires considerable human effort and time, during which data is likely to be degraded; to reduce such failures, automation based on ML/DL is brought into practice. Even when medical data is consolidated with vigilance, there can be various sources of weakness that affect the proper functioning of the primary ML/DL systems, some of which are discussed below.
Unqualified Personnel
The highly interpersonal, data-driven healthcare system requires a lot of technical and non-technical assistance. Technical personnel with the strong computational and statistical skills needed to develop effective ML/DL-based systems that improve the efficacy of medical processes and time-management strategies are in limited supply. Given this shortage, hospitals come to depend entirely on physicians or researchers who lack the computational expertise required for developing such systems [125].
Environmental and Instrumental Noise
The process of digital data collection and regulation is sometimes accompanied by environmental and instrumental disturbances. Even a slight disturbance in certain diagnostic procedures, such as multishot MRI, where extensive supervision is required, can lead to undesirable noise in the acquired data, thereby increasing the risk of misdiagnosis.
Vulnerabilities Due to Data Annotation
ML/DL applications require extensive model training for strong predictive performance. For medical applications, most models are extensively trained on clinically produced images that require every sample to be annotated. This tedious task of assigning labels should mostly be performed by clinical experts who can prepare domain-enriched datasets, or by automated algorithms [126]. Professionals are reluctant to take on secondary tasks such as data labeling because it consumes much of their crucial time, so trainee staff (who have little domain expertise) are employed for the task. As a result, problems such as coarse labels, misclassification, and class imbalance arise. Several vulnerabilities due to data annotation are noted below.
Ambiguous Ground Truth
Finlayson et al. [127] presented a study that highlights the ambiguity in the ground truth of medical datasets. Even well-defined diagnostic tasks are debated among medical experts, and further mishandling or malicious attacks by some users make diagnosis, and hence treatment, difficult even under expert supervision.
Improper Annotation
Proper annotation of data samples is critical for certain life-saving healthcare applications. ML/DL mechanisms are deployed for automated image labeling tasks, which can often lead to coarse-grained labels and mislabelling [128]. These problems may undermine the predictive capabilities of healthcare systems, as discussed next.
Efficiency Challenges
Efficacy is the prime factor in monitoring the performance of an ML/DL-based system. Particular challenges that affect the quality of data, and therefore performance, are limited and imbalanced datasets, class imbalance and bias, and sparsity. Newly identified diseases do not have much recorded history, and this limitation lowers a model's performance in predicting their outcomes. Class imbalance is a common problem in supervised ML/DL models and arises from a mismatched or non-uniform distribution of data amongst the respective classes. Data sparsity refers to missing values in the input data that arise from skipped or unreported samples. All these problems have a significant effect on the functioning of ML/DL techniques.
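Both class imbalance and sparsity can be quantified before any model is trained. The sketch below uses the common inverse-frequency weighting heuristic, N / (K · n_c), which is an assumption chosen for illustration and is not a method prescribed by the surveyed works; the function names and the toy data are hypothetical.

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: N / (K * n_c) for each class c,
    so that rare classes receive proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def missing_fraction(rows):
    """Share of cells that are missing (encoded here as None)."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)

# 90 healthy records versus 10 records of a rare condition.
labels = ["healthy"] * 90 + ["disease"] * 10
print(class_weights(labels))

# Three patient records with two unreported measurements.
records = [[36.6, None], [None, 120.0], [37.2, 118.0]]
print(missing_fraction(records))
```

Weights of this kind are usually passed to the loss function during training so that errors on the minority class are penalized more heavily.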
Vulnerabilities in Model Training
Vulnerabilities concerning ML/DL model training comprise improper or partial training, model poisoning, privacy infringement, and incomplete data rendering. Improper training means that inappropriate parameters (such as the number of epochs or the test/training ratio) are fed to the model, leaving it prone to incorrect inferences. ML/DL models are also exposed to cyber-attacks such as adversarial attacks, Trojan attacks, and backdoor attacks, which breach the integrity of the underlying system [129]. These impediments limit the effective use of ML/DL models and thereby hold back the development of security-critical and life-critical applications.
Vulnerabilities in Deployment Phase
Deployment of ML/DL systems in a healthcare ecosystem requires extensive human effort; consequently, to preserve the robustness of the system, accountability has to be considered in the deployment phase. Vulnerabilities that occur in the deployment phase of ML/DL systems include distribution shifts and incomplete data. Distribution shifts arise because models are expected to be deployed on data from a different domain, and such models are also vulnerable to adversarial attacks [130]. Since ML models are trained on past medical data, their predictive efficacy degrades on future data. Certain circumstances result in incomplete data collection, which might influence the outcomes of the procedure. Incomplete records can either be dropped or have their missing values replaced with the mean of the column; however, these practices may often lead to false positives and false negatives, which can have severe consequences in medical care systems. To ensure the accurate prediction of problems and diagnoses, compact and complete data is vital.
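The two ways of handling incomplete records mentioned above, dropping them or replacing missing values with the column mean, can be sketched as follows. Missing values are assumed to be encoded as `None`; the function names and the toy measurements are hypothetical.

```python
def drop_incomplete(rows):
    """Keep only the records that have no missing value."""
    return [row for row in rows if None not in row]

def impute_column_mean(rows):
    """Replace each missing value with the mean of the observed values
    in the same column. A column with no observed value stays None."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

# Columns: heart rate, systolic blood pressure (toy values).
records = [[60.0, 110.0], [None, 130.0], [80.0, None]]
print(drop_incomplete(records))     # two of the three records are lost
print(impute_column_mean(records))  # all records kept, gaps filled
```

Dropping shrinks the dataset, while mean imputation keeps every record but pulls the imputed patient toward the population average, which is one way such practices can produce the misleading predictions described above.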
Vulnerabilities in Testing Phase
Vulnerabilities in the testing phase typically arise from training anomalies caused by incomplete data, altered data fed for inference, and unlabelled medical image inputs, to name a few. These problems could result in severe outcomes such as false positives or false negatives, which limit the accurate prediction of the condition or disease. Loopholes in the prediction pipeline are critical for a patient's treatment. Ultimately, ML-based healthcare is not just about reducing effort or about predictive analysis and treatment; it demands the circumspect deployment of statistical/analytical methods in the underlying systems [131].
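False positives and false negatives are usually summarized at test time through sensitivity and specificity. The following minimal sketch assumes binary labels with 1 denoting the presence of the disease; the function name and the toy labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # correctly flagged
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # missed cases
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # correctly cleared
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false alarms
    sensitivity = tp / (tp + fn)  # share of diseased patients detected
    specificity = tn / (tn + fp)  # share of healthy patients cleared
    return sensitivity, specificity

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
print(sensitivity_specificity(y_true, y_pred))
```

A false negative lowers sensitivity and corresponds to a missed diagnosis, whereas a false positive lowers specificity and corresponds to unnecessary follow-up or treatment.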
ML for Healthcare: Challenges
Scientists and researchers are using ML/DL techniques to produce smart solutions that help streamline administrative as well as diagnostic procedures in a medical management ecosystem. Several challenges must be addressed for the careful advancement of ML/DL-based systems toward viable healthcare applications. Some of the challenges that impede the performance and applicability of automated systems are discussed in this section. Table 2 summarizes the probable challenges faced when applying ML in a healthcare ecosystem.
Table 2.
ML in healthcare challenges | Description |
---|---|
Safety challenges | The model's prediction precision without expert intervention is questioned. Identifying rare, underlying health problems is challenging. Enabling ML techniques to identify subtly hidden cases is the key to ensuring safety. |
Privacy challenges | Preserving privacy can be challenging. Patients expect their confidential information to be safeguarded. Anonymization can prevent unauthorized access and privacy breaches. |
Ethical challenges | Data accumulation requires authorization. Patients' dignity must be preserved while collecting data. If ethical concerns are not addressed, ML applications have an unfavourable impact. |
Availability of quality data | The information available is heterogeneous. Data collected during practice have issues (bias, redundancy) that adversely affect the algorithms. High-quality practical data requires resources and well-maintained services. |
Causality is challenging | Reasoning while taking decisions in crucial health problems is essential. Queries that require expert reasoning cannot be answered from a medical data perspective alone. Forming causal rationalization from data is challenging. |
Updating hospital infrastructure is inflexible | Independent sections of healthcare avoid frequent information exchange. Frictionless communication requires upgrading antiquated systems. The difficulties in upgrading hospital infrastructure raise concerns for modern-day healthcare practices using ML/DL. |
Safety Challenges
Safety is not a measure of how well an ML/DL model performs in a controlled environment; it accounts for how reliably the model can determine a patient's condition without any expert intervention. The majority of patients under a doctor's supervision have common health conditions, and it is the doctor's responsibility to look for any underlying rare, subtle, or hidden health problems. Arachchige et al. [26] introduced the PriModChain framework, which is described as enforcing safety in the functioning of various mechanical applications in the healthcare domain. Enabling ML/DL applications to recognize such rare, subtle events helps ensure the safety of present automated systems.
Privacy Challenges
Privacy is the right of every user (i.e., patient). Preserving privacy in a data-driven healthcare ecosystem is a challenging task, because trust is intertwined with issues such as integrity, confidentiality, authenticity, accountability, data management, and identity, to name a few [132]. Patients expect their medical service providers to safeguard their confidential information from being mishandled or breached through unauthorized access; mitigating privacy breaches is therefore critical to protecting patients from potential harm. One way of preserving the confidentiality of data is anonymization, which guards against the re-identification of individuals [133]. In addition, strict attention should be paid to every stage of data collection and transmission.
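One way to reason about re-identification risk is a k-anonymity check: a dataset is k-anonymous if every combination of quasi-identifiers (here, age and postcode) is shared by at least k records. The source does not name this technique, so the sketch below is an illustrative assumption, with hypothetical records and a hypothetical generalization rule.

```python
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers: 10-year age band, 3-character postcode prefix."""
    age, postcode, diagnosis = record
    decade = (age // 10) * 10
    return (f"{decade}-{decade + 9}", postcode[:3] + "**", diagnosis)

def k_anonymity(records):
    """Smallest group size over the quasi-identifier columns (all but the last)."""
    groups = Counter(record[:-1] for record in records)
    return min(groups.values())

# Hypothetical records: (age, postcode, diagnosis); names are already removed.
patients = [
    (34, "20131", "asthma"),
    (37, "20145", "diabetes"),
    (52, "30310", "copd"),
    (58, "30322", "asthma"),
]

print(k_anonymity(patients))                           # raw quasi-identifiers
print(k_anonymity([generalize(p) for p in patients]))  # after generalization
```

Even with names removed, each raw record here is unique on age and postcode, so an attacker who knows those two facts could single out a patient. Generalization trades data precision for larger indistinguishable groups.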
Ethical Challenges
Ethical usage of data in an ML-driven healthcare system is of utmost importance. Great care should be taken while accumulating data for building ML models, keeping the sociological aspects of the targeted population in mind. Patients' concern for preserving their dignity should be respected during data collection. If ethical terms are not taken care of, the use of intelligent systems will have an unfavorable impact. To extend fair and ethical considerations to uncertain and complex scenarios, a clear understanding of ML systems in this regard is expected [134].
Availability of Quality Data
Another shortcoming in a healthcare ecosystem is the limited availability of diverse, good-quality data. Every day, an extensive amount of heterogeneous patient information is generated across medical institutions, yet only an inadequate amount of useful data reaches researchers and the scientific community. Producing high-quality practical data requires resources and services with good maintenance and management. An ample supply of quality data would enable professionals to develop systems for illness prediction and treatment. Data collected during practice can have issues such as bias and redundancy, which surface as adverse outcomes in the algorithms. Intelligent systems cannot distinguish racial bias from fair judgement, because they learn from the human behaviour recorded in the data. For example, if people without health coverage are denied medical services, a system trained on such records can reproduce that pattern, and research has shown that an AI system can make racially biased predictions [135]. The training data itself also contributes to modeling challenges [136–138].
Causality is Challenging
Causality can be challenging from a medical perspective. Understanding the importance of reasoning, i.e., asking "What if?", while taking decisions in crucial healthcare problems is essential [139]. Consider a situation where we need to analyse how the outcome would have changed had the doctor prescribed treatment 1 rather than treatment 2. Queries of this kind cannot be answered by analysing medical data alone; they require causal reasoning. In healthcare applications, learning from observational data and drawing inferences from it is the norm, but forming causal explanations from such data is challenging and requires building causal models. ML/DL models lack fundamental reasoning under the hood and produce output based on correlations and patterns without considering the causal links in between. In practical applications, this limitation in causal analysis may raise concerns about the predictions of AI systems. Acknowledging the causal effect of certain variables on the target outcome is paramount for fair predictive behaviour.
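The treatment 1 versus treatment 2 question can be illustrated with a small worked example. The counts below are made up, and the adjustment shown (standardizing stratum-level recovery rates to the overall severity mix) is valid only under the stated assumption that disease severity is the sole confounder. It is a sketch of the idea, not a method taken from the cited works.

```python
# Hypothetical counts: (patients, recovered) per severity stratum and treatment.
data = {
    "mild":   {"treatment_1": (90, 81),   "treatment_2": (270, 234)},
    "severe": {"treatment_1": (260, 182), "treatment_2": (80, 52)},
}

def naive_rate(treatment):
    """Recovery rate that ignores severity (a purely correlational view)."""
    total = sum(data[s][treatment][0] for s in data)
    recovered = sum(data[s][treatment][1] for s in data)
    return recovered / total

def adjusted_rate(treatment):
    """Weight each stratum's recovery rate by that stratum's population share."""
    population = sum(n for s in data for n, _ in data[s].values())
    rate = 0.0
    for s in data:
        stratum_size = sum(n for n, _ in data[s].values())
        n, recovered = data[s][treatment]
        rate += (recovered / n) * (stratum_size / population)
    return rate

for t in ("treatment_1", "treatment_2"):
    print(t, round(naive_rate(t), 3), round(adjusted_rate(t), 3))
```

In these made-up numbers treatment 1 is given mostly to severe cases, so its raw recovery rate looks worse even though it does better within each severity level. A model that learns only the raw correlation would favour the wrong treatment.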
Updating Hospital Infrastructure is Inflexible
Healthcare organizations favor independent operations and mostly avoid sharing information. Frictionless information exchange requires fixing and updating antiquated software, which can be time-consuming and is often not cost-effective. Finlayson et al. [127] reported that as late as the 2010s most hospitals were still operating on the ninth version of the International Classification of Diseases (ICD) system, even though the updated version, ICD-10, had been released in the early 1990s. The difficulties in upgrading hospital infrastructure and internal management systems can raise concerns about the applicability of recent DL/ML practices.
Future Research Directions
In this section, various issues that require active research attention related to the security, privacy, and robustness of ML in the Healthcare ecosystem are discussed.
Machine Learning on the Edge
The use of ML in healthcare applications has grown exponentially in recent years. Research in ML has transformed traditional methods and moved toward smart, energy-efficient use of wearable devices, IoT sensors, and similar hardware. With the development of smart cities and transportable medical devices such as portable ventilators, oxygen concentrators, and MRI machines, there is a constant demand for refined ML models trained on Edge devices. This demand is constrained by limited hardware support and by the lack of high computational processing capability on such devices. ML on Edge devices is still at a nascent stage and requires attention from the research community. Growth in this domain will lead to faster care in critical situations and continuous monitoring of a patient's health from a remote location, thereby improving healthcare facilities for a better lifestyle and timely medical assistance.
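One commonly used way to fit models onto constrained hardware is weight quantization, which stores parameters as small integers instead of floating-point numbers. The source does not prescribe a technique, so the following is only a minimal sketch of symmetric 8-bit quantization with hypothetical weights and function names.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    # Fall back to a scale of 1.0 if every weight is zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]

weights = [0.42, -1.27, 0.05, 0.98]  # hypothetical model weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight then needs one byte instead of four or eight, at the cost of a rounding error bounded by half the scale factor. Whether that error is clinically acceptable has to be validated per model.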
Handling Dataset Annotation
The output of AI systems is highly dependent on the labeled datasets used for training and inference. This requires medical experts to annotate medical data (such as images, clinical reports, and signals) manually, spending a lot of their valuable time on this tedious work. A variety of practical medical data annotated with accurate labels improves the evaluation of ML/DL models and exposes weaknesses that might otherwise go unnoticed. Manual labeling of data into the respective classes is, however, demanding, tedious, and energy draining. Automated approaches such as active learning should be adopted and developed to address this impediment.
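Active learning reduces annotation effort by asking experts to label only the samples the current model is least sure about. The sketch below shows one simple selection rule, uncertainty sampling for a binary classifier. The sample ids, the probabilities, and the function name are hypothetical.

```python
def select_for_annotation(probabilities, budget):
    """Return the ids of the `budget` samples whose predicted probability
    of the positive class is closest to 0.5 (least confident)."""
    ranked = sorted(probabilities, key=lambda sid: abs(probabilities[sid] - 0.5))
    return ranked[:budget]

# Hypothetical model outputs on unlabeled scans: id -> P(disease).
unlabeled = {
    "scan_a": 0.97,
    "scan_b": 0.52,
    "scan_c": 0.08,
    "scan_d": 0.41,
    "scan_e": 0.66,
}

to_label = select_for_annotation(unlabeled, budget=2)
print(to_label)  # the scans an expert would be asked to annotate next
```

After the expert labels the selected scans, the model is retrained and the selection is repeated, so expert time goes to the cases that are expected to be most informative.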
Distributed Data Management and ML
In healthcare systems, data generation is distributed, i.e., data is produced by various departments within a hospital and by other hospitals spread across different locations. This puts pressure on efficient data sharing and management for clinical analysis, particularly when ML models are used. ML/DL models are usually developed on the general assumption that all analytical information is easily accessible and centrally available. The shortcomings caused by poorly managed information exchange need the attention of developers and researchers, who could collaboratively tackle the administration of distributed data and ML.
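One widely studied way to train on distributed data without centralizing it is federated averaging, in which each site trains locally and only model parameters are combined. The source does not name this approach, so the sketch below is an assumption offered for illustration, with hypothetical parameter vectors and sample counts.

```python
def federated_average(updates):
    """Weighted average of parameter vectors; weights are local sample counts.

    `updates` is a list of (parameters, n_samples) pairs, one per site,
    where all parameter vectors have the same length.
    """
    total = sum(n for _, n in updates)
    size = len(updates[0][0])
    return [
        sum(params[i] * n for params, n in updates) / total
        for i in range(size)
    ]

# Hypothetical locally trained parameters from three hospitals.
site_updates = [
    ([0.2, 1.0], 100),
    ([0.4, 0.8], 300),
    ([0.6, 0.6], 100),
]

global_params = federated_average(site_updates)
print(global_params)
```

Patient records never leave the hospitals in this scheme, although the shared parameters can still leak information and would need additional protection in practice.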
Fair and Accountable ML
Qayyum et al. [140], in analyzing the robustness and security of ML/DL techniques, reasoned that the results of the models are biased and lack accountability. Ensuring the fairness and precision of predictions is of cardinal importance for life-critical applications in healthcare systems. Trading off the accuracy and accountability of these models could result in adverse outcomes and put patients' health at risk. The fairness of predictions suffers in the many kinds of cases for which little data is available. Tuning models with fair judgement and interpretability in mind will make them more robust and keep them from repeating misjudgements present in past clinical records. Further study is needed in this area to develop dynamic methods that ensure safety and reduce imperfections.
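A simple first check for biased predictions is to compare an error metric across patient groups. The sketch below computes the true positive rate per group and the gap between groups (sometimes called an equal-opportunity gap). The records, group names, and function name are hypothetical.

```python
def true_positive_rate(records, group):
    """Share of truly positive patients in `group` that the model flagged."""
    positives = [r for r in records if r["group"] == group and r["label"] == 1]
    if not positives:
        return None  # the metric is undefined without positive cases
    return sum(r["pred"] for r in positives) / len(positives)

# Hypothetical predictions for diseased patients (label 1) in two groups.
records = [
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 1, "pred": 0},
    {"group": "B", "label": 1, "pred": 1},
    {"group": "B", "label": 1, "pred": 0},
    {"group": "B", "label": 1, "pred": 0},
    {"group": "B", "label": 1, "pred": 0},
]

tpr_a = true_positive_rate(records, "A")
tpr_b = true_positive_rate(records, "B")
print(tpr_a, tpr_b, abs(tpr_a - tpr_b))
```

A large gap means diseased patients in one group are missed far more often than in the other, a signal to revisit the training data or the decision threshold before deployment.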
Model-Driven ML
The use of ML and AI for predictive analysis in healthcare applications comes with benefits as well as liabilities. Latif et al. [141] discussed the caveats associated with these tools; failing to recognize their lapses can prove critical in clinical terms. The advantages of these models often suggest that data, once available in abundance, can drive hypothesis generation without validation and interpretation by medical experts, which invites unavoidable problems. To avoid these pitfalls, it is important to combine data-driven methods with hypothesis- and model-based approaches so as to bring controlled precision to these studies. Building robust, secure, and accountable ML deliverables that are technically precise requires further research.
Conclusion
ML is driven by statistically grounded algorithms that fall into different categories such as regression, classification, and clustering. All of these algorithms assist in building intelligent solutions for automating clinical tasks and detecting suspected diseases. The traditional practice of healthcare service delivery has changed vastly with the advent of ML- and DL-based approaches. However, to ensure secure, bias-free, and sound use of these models, the challenges must be addressed. This paper provides a brief introduction to several ML algorithms, discusses their capabilities and constraints, and points to reliable standards for avoiding shortcomings in model building. It also provides a synopsis of the challenges arising in the ML deployment pipeline for healthcare infrastructure by classifying the different sources of risk in it. Finally, this work discusses possible solutions for providing users as well as clinical experts in a healthcare ecosystem with secure, robust, and privacy-protected ML solutions for privacy-sensitive applications. The paper closes by outlining the potential of ML techniques in the healthcare sector and the privacy considerations linked with it.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Subhasmita Swain, Email: 2019003860.subhasmita@ug.sharda.ac.in.
Bharat Bhushan, Email: bharat_bhushan1989@yahoo.com.
Gaurav Dhiman, Email: gdhiman0001@gmail.com.
Wattana Viriyasitavat, Email: hardgolf@gmail.com.
References
- 1. Kumar A, Krishnamurthi R, Nayyar A, Sharma K, Grover V, Hossain E. A novel smart healthcare design, simulation, and implementation using healthcare 4.0 processes. IEEE Access. 2020;8:118433–118471. doi: 10.1109/ACCESS.2020.3004790.
- 2. Yang G, et al. Homecare robotic systems for healthcare 4.0: visions and enabling technologies. IEEE J Biomed Health Inform. 2020;24(9):2535–2549. doi: 10.1109/JBHI.2020.2990529.
- 3. Nasr M, Islam MM, Shehata S, Karray F, Quintana Y. Smart healthcare in the age of AI: recent advances, challenges, and future prospects. IEEE Access. 2021;9:145248–145270. doi: 10.1109/ACCESS.2021.3118960.
- 4. Bharadwaj HK, et al. A review on the role of machine learning in enabling IoT based healthcare applications. IEEE Access. 2021;9:38859–38890. doi: 10.1109/ACCESS.2021.3059858.
- 5. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P-M, Larochelle H. Brain tumor segmentation with deep neural networks. Med Image Anal. 2017;35:18–31. doi: 10.1016/j.media.2016.05.004.
- 6. Zarrin PS, Roeckendorf N, Wenger C. In-vitro classification of saliva samples of COPD patients and healthy controls using machine learning tools. IEEE Access. 2020;8:168053–168060. doi: 10.1109/ACCESS.2020.3023971.
- 7. Aslam AR, Altaf MAB. An on-chip processor for chronic neurological disorders assistance using negative affectivity classification. IEEE Trans Biomed Circuits Syst. 2020;14(4):838–851. doi: 10.1109/TBCAS.2020.3008766.
- 8. Meneghetti L, Terzi M, Del Favero S, Susto GA, Cobelli C. Data-driven anomaly recognition for unsupervised model-free fault detection in artificial pancreas. IEEE Trans Control Syst Technol. 2020;28(1):33–47. doi: 10.1109/TCST.2018.2885963.
- 9. Mehta J, Majumdar A. Rodeo: robust de-aliasing autoencoder for real-time medical image reconstruction. Pattern Recogn. 2017;63:499–510. doi: 10.1016/j.patcog.2016.09.022.
- 10. Bejnordi BE, Veta M, Van Diest PJ, Van Ginneken B, Karssemeijer N, Litjens G, Van Der Laak JA, Hermsen M, Manson QF, Balkenhol M, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199–2210. doi: 10.1001/jama.2017.14585.
- 11. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115. doi: 10.1038/nature21056.
- 12. Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K, et al (2017) Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv:1711.05225
- 13. Shishvan OR, Zois D, Soyata T. Machine intelligence in healthcare and medical cyber physical systems: a survey. IEEE Access. 2018;6:46419–46494. doi: 10.1109/ACCESS.2018.2866049.
- 14. Li JP, Haq AU, Din SU, Khan J, Khan A, Saboor A. Heart disease identification method using machine learning classification in e-healthcare. IEEE Access. 2020;8:107562–107582. doi: 10.1109/ACCESS.2020.3001149.
- 15. Kumar V, Recupero DR, Riboni D, Helaoui R. Ensembling classical machine learning and deep learning approaches for morbidity identification from clinical notes. IEEE Access. 2021;9:7107–7126. doi: 10.1109/ACCESS.2020.3043221.
- 16. Paranjape K, Schinkel M, Nanayakkara P. Short keynote paper: mainstreaming personalized healthcare-transforming healthcare through new era of artificial intelligence. IEEE J Biomed Health Inform. 2020;24(7):1860–1863. doi: 10.1109/JBHI.2020.2970807.
- 17. Al-Dhief FT, et al. A survey of voice pathology surveillance systems based on internet of things and machine learning algorithms. IEEE Access. 2020;8:64514–64533. doi: 10.1109/ACCESS.2020.2984925.
- 18. Alhussein M, Muhammad G. Voice pathology detection using deep learning on mobile healthcare framework. IEEE Access. 2018;6:41034–41041. doi: 10.1109/ACCESS.2018.2856238.
- 19. Tsang G, Xie X, Zhou S-M. Harnessing the power of machine learning in dementia informatics research: issues, opportunities, and challenges. IEEE Rev Biomed Eng. 2020;13:113–129. doi: 10.1109/RBME.2019.2904488.
- 20. Tong Y, Messinger AI, Luo G. Testing the generalizability of an automated method for explaining machine learning predictions on asthma patients’ asthma hospital visits to an academic healthcare system. IEEE Access. 2020;8:195971–195979. doi: 10.1109/ACCESS.2020.3032683.
- 21. Fiaidhi J. Envisioning insight-driven learning based on thick data analytics with focus on healthcare. IEEE Access. 2020;8:114998–115004. doi: 10.1109/ACCESS.2020.2995763.
- 22. El-Ganainy NO, Balasingham I, Halvorsen PS, Rosseland LA. A new real time clinical decision support system using machine learning for critical care units. IEEE Access. 2020;8:185676–185687. doi: 10.1109/ACCESS.2020.3030031.
- 23. Sierra-Sosa D, et al. Scalable healthcare assessment for diabetic patients using deep learning on multiple GPUs. IEEE Trans Industr Inf. 2019;15(10):5682–5689. doi: 10.1109/TII.2019.2919168.
- 24. Kumar R, Dhiman G. A comparative study of fuzzy optimization through fuzzy number. Int J Mod Res. 2021;1:1–14.
- 25. Chatterjee I. Artificial intelligence and patentability: review and discussions. Int J Mod Res. 2021;1:15–21.
- 26. Arachchige PCM, Bertok P, Khalil I, Liu D, Camtepe S, Atiquzzaman M. A trustworthy privacy preserving framework for machine learning in industrial IoT systems. IEEE Trans Ind Inf. 2020;16(9):6092–6102. doi: 10.1109/TII.2020.2974555.
- 27. Vaishnav PK, Sharma S, Sharma P. Analytical review analysis for screening COVID-19. Int J Mod Res. 2021;1:22–29.
- 28. Nair R, Soni M, Bajpai B, Dhiman G, Sagayam KM. Predicting the death rate around the world due to COVID-19 using regression analysis. Int J Swarm Intell Res (IJSIR). 2022;13(2):1–13. doi: 10.4018/IJSIR.287545.
- 29. Sharma S, Gupta S, Gupta D, Juneja S, Singal G, Dhiman G, Kautish S (2022) Recognition of Gurmukhi handwritten city names using deep learning and cloud computing. Sci Program
- 30. Zeidabadi FA, Doumari SA, Dehghani M, Montazeri Z, Trojovsky P, Dhiman G. MLA: a new mutated leader algorithm for solving optimization problems. CMC-Comput Mater Continua. 2022;70(3):5631–5649. doi: 10.32604/cmc.2022.021072.
- 31. Zeidabadi FA, Doumari SA, Dehghani M, Montazeri Z, Trojovsky P, Dhiman G. AMBO: all members-based optimizer for solving optimization problems. CMC-Comput Mater Continua. 2022;70(2):2905–2921. doi: 10.32604/cmc.2022.019867.
- 32. Alharbi Y, Alferaidi A, Yadav K, Dhiman G, Kautish S (2021) Denial-of-service attack detection over IPv6 network based on KNN algorithm. Wirel Commun Mobile Comput
- 33. Chinnasamy R, Deepika A, Senthil T. Machine learning algorithms: a background artifact. Int J Eng Technol. 2018;7:143–149.
- 34. Okay FY, Yıldırım M, Özdemir S. Interpretable machine learning: a case study of healthcare. International Symposium on Networks, Computers and Communications (ISNCC). 2021;2021:1–6. doi: 10.1109/ISNCC52172.2021.9615727.
- 35. Ileberi E, Sun Y, Wang Z. Performance evaluation of machine learning methods for credit card fraud detection using SMOTE and AdaBoost. IEEE Access. 2021;9:165286–165294. doi: 10.1109/ACCESS.2021.3134330.
- 36. Ahsan M, Stoyanov S, Bailey C, Albarbar A. Developing computational intelligence for smart qualification testing of electronic products. IEEE Access. 2020;8:16922–16933. doi: 10.1109/ACCESS.2020.2967858.
- 37. Hari N, Ahsan M, Ramasamy S, Sanjeevikumar P, Albarbar A, Blaabjerg F. Gallium nitride power electronic devices modeling using machine learning. IEEE Access. 2020;8:119654–119667. doi: 10.1109/ACCESS.2020.3005457.
- 38. Seng KP, Ang L-M, Schmidtke LM, Rogiers SY. Computer vision and machine learning for viticulture technology. IEEE Access. 2018;6:67494–67510. doi: 10.1109/ACCESS.2018.2875862.
- 39. Rehman ZU, Zia MS, Bojja GR, Yaqub M, Jinchao F, Arshid K. Texture based localization of a brain tumor from MR-images by using a machine learning approach. Med Hypotheses. 2020;141:109705. doi: 10.1016/j.mehy.2020.109705.
- 40. Singh PD, Kaur R, Dhiman G, Bojja GR (2021) BOSS: a new QoS aware blockchain assisted framework for secure and smart healthcare as a service. Expert Syst e12838
- 41. Vyas P, Bojja G, Ambati LS, Liu J, Ofori M (2021) Prediction of patient willingness to recommend hospital: a machine learning-based exploratory study.
- 42. Xie Y, Li Y, Xia Z, Yan R. An improved forward regression variable selection algorithm for high-dimensional linear regression models. IEEE Access. 2020;8:129032–129042. doi: 10.1109/ACCESS.2020.3009377.
- 43. Doshi-Velez F, Kim B. Towards a rigorous science of interpretable machine learning. Tech Rep. 2017. arXiv:1702.08608
- 44. Gambella C, Ghaddar B, Naoum-Sawaya J (2020) Optimization problems for machine learning: a survey. Eur J Oper Res
- 45. Bertsimas D, Copenhaver MS. Characterization of the equivalence of robustification and regularization in linear and matrix regression. Eur J Oper Res. 2018;270(3):931–942. doi: 10.1016/j.ejor.2017.03.051.
- 46. Bengio Y, Lodi A, Prouvost A. Machine learning for combinatorial optimization: a methodological tour d’horizon. Tech Rep. 2018. arXiv:1811.06128
- 47. Bertsimas D, Van Parys B, et al. Sparse high-dimensional regression: exact scalable algorithms and phase transitions. Ann Stat. 2020;48(1):300–323. doi: 10.1214/18-AOS1804.
- 48. Laskowski M, Ambroziak SJ, Correia LM, Świder K. On the usefulness of the generalised additive model for mean path loss estimation in body area networks. IEEE Access. 2020;8:176873–176882. doi: 10.1109/ACCESS.2020.3025118.
- 49. Yang X, et al. Piecewise linear regression based on plane clustering. IEEE Access. 2019;7:29845–29855. doi: 10.1109/ACCESS.2019.2902620.
- 50. D’Ambrosio C, Lodi A, Wiese S, Bragalli C. Mathematical programming techniques in water network optimization. Eur J Oper Res. 2015;243(3):774–788. doi: 10.1016/j.ejor.2014.12.039.
- 51. Baumann P, Hochbaum DS, Yang YT. A comparative study of the leading machine learning techniques and two new optimization algorithms. Eur J Oper Res. 2019;272(3):1041–1057. doi: 10.1016/j.ejor.2018.07.009.
- 52. Lan L, Wang Z, Zhe S, Cheng W, Wang J, Zhang K. Scaling up kernel SVM on limited resources: a low-rank linearization approach. IEEE Trans Neural Netw Learn Syst. 2019;30(2):369–378. doi: 10.1109/TNNLS.2018.2838140.
- 53. Oktay O, et al. Stratified decision forests for accurate anatomical landmark localization in cardiac images. IEEE Trans Med Imaging. 2017;36(1):332–342. doi: 10.1109/TMI.2016.2597270.
- 54. Liang J, Qin Z, Xue L, Lin X, Shen X. Efficient and privacy-preserving decision tree classification for health monitoring systems. IEEE Internet Things J. 2021;8(16):12528–12539. doi: 10.1109/JIOT.2021.3066307.
- 55. Zhu H, et al. MR-forest: a deep decision framework for false positive reduction in pulmonary nodule detection. IEEE J Biomed Health Inform. 2020;24(6):1652–1663. doi: 10.1109/JBHI.2019.2947506.
- 56. Ghaddar B, Naoum-Sawaya J. High dimensional data classification and feature selection using support vector machines. Eur J Oper Res. 2018;265(3):993–1004. doi: 10.1016/j.ejor.2017.08.040.
- 57. Vapnik VN. The Nature of Statistical Learning Theory. 2000. doi: 10.1007/978-1-4757-3264-1.
- 58. Zhang F, O'Donnell LJ. Support vector regression. Mach Learn. 2020. doi: 10.1016/b978-0-12-815739-8.00007-9.
- 59. Cafieri S, Costa A, Hansen P. Reformulation of a model for hierarchical divisive graph modularity maximization. Ann Oper Res. 2014;222:213–226. doi: 10.1007/s10479-012-1286-z.
- 60. Üstün B, Melssen WJ, Buydens LMC. Visualization and interpretation of support vector regression models. Anal Chim Acta. 2007;595(1–2):299–309. doi: 10.1016/j.aca.2007.03.023.
- 61. Kulis B, Jordan MI (2012) Revisiting k-means: new algorithms via Bayesian nonparametric. In: Proceedings of the 29th international conference on machine learning (ICML ’12), pp. 513–520, Edinburgh, UK
- 62. Aloise D, Hansen P, Liberti L. An improved column generation algorithm for minimum sum-of-squares clustering. Math Program. 2012;131:195–220. doi: 10.1007/s10107-010-0349-7.
- 63. Cunningham JP, Ghahramani Z (2015)
- 64. Mandal S, Greenblatt AB, An J. Imaging intelligence: AI is transforming medical imaging across the imaging spectrum. IEEE Pulse. 2018;9(5):16–24. doi: 10.1109/MPUL.2018.2857226.
- 65. Noothout JMH, et al. Deep learning-based regression and classification for automatic landmark localization in medical images. IEEE Trans Med Imaging. 2020;39(12):4011–4022. doi: 10.1109/TMI.2020.3009002.
- 66. De Oliveira H, Augusto V, Jouaneton B, Lamarsalle L, Prodel M, Xie X. Automatic and explainable labeling of medical event logs with autoencoding. IEEE J Biomed Health Inform. 2020;24(11):3076–3084. doi: 10.1109/JBHI.2020.3021790.
- 67. Mengoudi K, et al. Augmenting dementia cognitive assessment with instruction-less eye-tracking tests. IEEE J Biomed Health Inform. 2020;24(11):3066–3075. doi: 10.1109/JBHI.2020.3004686.
- 68. Stojanovic J, Gligorijevic D, Radosavljevic V, Djuric N, Grbovic M, Obradovic Z. Modeling healthcare quality via compact representations of electronic health records. IEEE/ACM Trans Comput Biol Bioinf. 2017;14(3):545–554. doi: 10.1109/TCBB.2016.2591523.
- 69. Brisimi TS, Xu T, Wang T, Dai W, Adams WG, Paschalidis IC. Predicting chronic disease hospitalizations from electronic health records: an interpretable classification approach. Proc IEEE. 2018;106(4):690–707. doi: 10.1109/JPROC.2017.2789319.
- 70. Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Health Inform. 2018;22(5):1589–1604. doi: 10.1109/JBHI.2017.2767063.
- 71. de la Fuente C, Urrutia A, Chávez E. Using the random forest algorithm for searching behavior patterns in electronic health records. IEEE Lat Am Trans. 2019;17(05):875–881. doi: 10.1109/TLA.2019.8891957.
- 72. Harerimana G, Kim JW, Yoo H, Jang B. Deep learning for electronic health records analytics. IEEE Access. 2019;7:101245–101259. doi: 10.1109/ACCESS.2019.2928363.
- 73. Bernardini M, Romeo L, Misericordia P, Frontoni E. Discovering the type 2 diabetes in electronic health records using the sparse balanced support vector machine. IEEE J Biomed Health Inform. 2020;24(1):235–246. doi: 10.1109/JBHI.2019.2899218.
- 74. Tsang G, Zhou S-M, Xie X. Modeling large sparse data for feature selection: hospital admission predictions of the dementia patients using primary care electronic health records. IEEE J Transl Eng Health Med. 2021;9:1–13. doi: 10.1109/JTEHM.2020.3040236.
- 75. Lee S, Wei S, White V, Bain PA, Baker C, Li J. Classification of opioid usage through semi-supervised learning for total joint replacement patients. IEEE J Biomed Health Inform. 2021;25(1):189–200. doi: 10.1109/JBHI.2020.2992973.
- 76. Zebari DA, Zeebaree DQ, Abdulazeez AM, Haron H, Hamed HNA. Improved threshold based and trainable fully automated segmentation for breast cancer boundary and pectoral muscle in mammogram images. IEEE Access. 2020;8:203097–203116. doi: 10.1109/ACCESS.2020.3036072.
- 77. Zech J, Pain M, Titano J, Badgeley M, Schefflein J, Su A, Costa A, Bederson J, Lehar J, Oermann EK. Natural language–based machine learning models for the annotation of clinical radiology reports. Radiology. 2018;287(2):570–580. doi: 10.1148/radiol.2018171093.
- 78. Jing B, Xie P, Xing E (2018) On the automatic generation of medical imaging reports. In: 56th annual meeting of the association for computational linguistics (ACL)
- 79. Li M, et al. Research on the auxiliary classification and diagnosis of lung cancer subtypes based on histopathological images. IEEE Access. 2021;9:53687–53707. doi: 10.1109/ACCESS.2021.3071057.
- 80. Umamaheswari D, Geetha S. Segmentation and classification of acute lymphoblastic leukemia cells tooled with digital image processing and ML techniques. Second International Conference on Intelligent Computing and Control Systems (ICICCS). 2018;2018:1336–1341. doi: 10.1109/ICCONS.2018.8662950.
- 81. Wang Y, Huang F, Zhang Y, Zhang R, Lei B, Wang T (2019) Breast cancer image classification via multi-level dual-network features and sparse multi-relation regularized learning. In: 2019 41st annual international conference of the IEEE engineering in medicine and biology society (EMBC), pp 7023–7026.
- 82. Abhinaav R, Brindha D. Abnormality detection and severity classification of cells based on features extracted from papanicolaou smear images using machine learning. Int Conf Comput Commun Inform (ICCCI). 2019;2019:1–5. doi: 10.1109/ICCCI.2019.8822131.
- 83. Bora AP, Joshi AD, Sawant ST (2020) Digitally reconstructed radiograph generation for enabling AI/ML in medical imaging. In: 2020 11th international conference on computing, communication and networking technologies (ICCCNT), pp 1–6.
- 84. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS ONE. 2017;12(4):e0174944. doi: 10.1371/journal.pone.0174944.
- 85. Fatima M, Pasha M. Survey of machine learning algorithms for disease diagnostic. J Intell Learn Syst Appl. 2017;9(01):1.
- 86. Zhao K, So H-C. Drug repositioning for schizophrenia and depression/anxiety disorders: a machine learning approach leveraging expression data. IEEE J Biomed Health Inform. 2019;23(3):1304–1315. doi: 10.1109/JBHI.2018.2856535.
- 87. Jamshidi M, et al. Artificial intelligence and COVID-19: deep learning approaches for diagnosis and treatment. IEEE Access. 2020;8:109581–109595. doi: 10.1109/ACCESS.2020.3001973.
- 88. Li N, et al. Machine learning assessment for severity of liver fibrosis for chronic HBV based on physical layer with serum markers. IEEE Access. 2019;7:124351–124365. doi: 10.1109/ACCESS.2019.2923688.
- 89. Noaro G, Cappon G, Vettoretti M, Sparacino G, Favero SD, Facchinetti A. Machine-learning based model to improve insulin bolus calculation in type 1 diabetes therapy. IEEE Trans Biomed Eng. 2021;68(1):247–255. doi: 10.1109/TBME.2020.3004031.
- 90. Yang S, Wei R, Guo J, Xu L. Semantic inference on clinical documents: combining machine learning algorithms with an inference engine for effective clinical diagnosis and treatment. IEEE Access. 2017;5:3529–3546. doi: 10.1109/ACCESS.2017.2672975.
- 91. Chaitra N, Vijaya PA, Deshpande G. Diagnostic prediction of autism spectrum disorder using complex network measures in a machine learning framework. Biomed Signal Process Control. 2020;62:102099. doi: 10.1016/j.bspc.2020.102099.
- 92. Saygılı A. A new approach for computer-aided detection of coronavirus (COVID-19) from CT and X-ray images using machine learning methods. Appl Soft Comput. 2021;105:107323. doi: 10.1016/j.asoc.2021.107323.
- 93.Nagiub EM, Abdelsalam KF, Hussain NM, Omar QT, Ali CA, detection L, using microscopic blood image based machine learning “convolutional neural network”, clinical lymphoma myeloma and leukemia, 18(Supplement), 1, (2018) Page S297. ISSN. 10.1016/j.clml.2018.07.246
- 94.He J, Wu X, Jiang Y, Peng Q, Jain R. Hookworm detection in wireless capsule endoscopy images with deep learning. IEEE Trans Image Process. 2018;27(5):2379–2392. doi: 10.1109/TIP.2018.2801119. [DOI] [PubMed] [Google Scholar]
- 95.Yu Y, Wang J, Chun HE, Xu Y, Fong ELS, Wee A, Yu A. Implementation of machine learning-aided imaging analytics for histopathological image diagnosis, systems medicine. New York: Academic Press; 2021. pp. 208–221. [Google Scholar]
- 96.Suresh H (2017) Clinical event prediction and understanding with deep neural networks. Ph.D. dissertation, Massachusetts Institute of Technology
- 97.Qayyum A, Qadir J, Bilal M, Al-Fuqaha A (2020) Secure and robust machine learning for healthcare: a survey [DOI] [PubMed]
- 98.Kim EY, Lee MY, Kim SH, Ha K, Kim KP, Ahn YM. Diagnosis of major depressive disorder by combining multimodal information from heart rate dynamics and serum proteomics using machine-learning algorithm. Progr Neuro-Psychopharmacol Biol Psychiatry. 2017;76:65–71. doi: 10.1016/j.pnpbp.2017.02.014. [DOI] [PubMed] [Google Scholar]
- 99.Pellegrini E, Ballerini L, Hernandez M, Chappell FM, González-Castro V, Anblagan D, Danso S, Muñoz-Maniega S, Job D, Pernet D, Mair G, MacGillivray TJ, Trucco E, Wardlaw JM. Machine learning of neuroimaging for assisted diagnosis of cognitive impairment and dementia: a systematic review. Alzheimer's Dementia Diagn Assess Dis Monit. 2018;10:519–535. doi: 10.1016/j.dadm.2018.07.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Akbulut A, Ertugrul E, Topcu V. Fetal health status prediction based on maternal clinical history using machine learning techniques. Comput Methods Programs Biomed. 2018;163:87–100. doi: 10.1016/j.cmpb.2018.06.010. [DOI] [PubMed] [Google Scholar]
- 101.Karhade AV, Thio Q, Ogink P, Kim J, Lozano-Calderon S, Raskin K, Schwab JH. Development of machine learning algorithms for prediction of 5-year spinal chordoma survival. World Neurosurgery. 2018;119:e842–e847. doi: 10.1016/j.wneu.2018.07.276. [DOI] [PubMed] [Google Scholar]
- 102.Abdar M, Wojciech Książek U, Acharya R, Tan R-S, Makarenkov V, Pławiak P. A new machine learning technique for an accurate diagnosis of coronary artery disease. Comput Methods Programs Biomed. 2019;179:104992. doi: 10.1016/j.cmpb.2019.104992. [DOI] [PubMed] [Google Scholar]
- 103.Burdick H, Lam C, Mataraso S, Siefkas A, Braden G, Dellinger RP, McCoy A, Vincent JL, Green-Saxena A, Barnes G, Hoffman J, Calvert J, Pellegrini E, Das R (2020) Prediction of respiratory decompensation in Covid-19 patients using machine learning: the READY trial. Comput Biol Med 124:103949 [DOI] [PMC free article] [PubMed]
- 104.Hashem S, ElHefnawi M, Habashy S, El-Adawy M, Esmat G, Elakel W, Abdelazziz AO, Nabeel MM, Abdelmaksoud AH, Elbaz TM, Shousha HI (2020) Machine learning prediction models for diagnosing hepatocellular carcinoma with HCV-related chronic liver disease. Comput Methods Program Biomed 196:105551 [DOI] [PubMed]
- 105.Magesh PR, Myloth RD, Tom RJ (2020) An explainable machine learning model for early detection of parkinson's disease using LIME on DaTSCAN imagery. Comput Biol Med 126:104041 [DOI] [PubMed]
- 106.Shen H, Hu Y, Liu X, Jiang Z, Ye H, Takshe A, Dulaimi SHKA (2021) Application of machine learning risk prediction mathematical model in the diagnosis of Escherichia coli infection in patients with septic shock by cardiovascular color doppler ultrasound. Results Phys 26:104368
- 107.Montolío A, Martín-Gallego A, Cegoñino J, Orduna E, Vilades E, Garcia-Martin E, del Palomar AP (2021) Machine learning in diagnosis and disability prediction of multiple sclerosis using optical coherence tomography. Comput Biol Med 133:104416 [DOI] [PubMed]
- 108.Lin Y-W, Zhou Y, Faghri F, Shaw M, Campbell R. Analysis and prediction of unplanned intensive care unit readmission using recurrent neural networks with long short-term memory. PLoS ONE. 2019;14:e0218942. doi: 10.1371/journal.pone.0218942. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Rau C-S, Kuo P-J, Chien P-C, Huang C-Y, Hsieh H-Y, Hsieh C-H (2018) Mortality prediction in patients with isolated moderate and severe traumatic brain injury using machine learning models. PLoS ONE 13(11):e0207192 [DOI] [PMC free article] [PubMed]
- 110.Xie J, Wang Q. Benchmarking machine learning algorithms on blood glucose prediction for type I diabetes in comparison with classical time-series models. IEEE Trans Biomed Eng. 2020;67(11):3101–3124. doi: 10.1109/TBME.2020.2975959. [DOI] [PubMed] [Google Scholar]
- 111.Pezoulas VC, Papaloukas C, Veyssiere M, Goules A, Tzioufas AG, Soumelis V, Fotiadis DI (2021) A computational workflow for the detection of candidate diagnostic biomarkers of Kawasaki disease using time-series gene expression data. Comput Struct Biotechnol J 19:3058–3068 [DOI] [PMC free article] [PubMed]
- 112.Nancy JY, Khanna NH, Kannan A (2010) A bio-statistical mining approach for classifying multivariate clinical time series data observed at irregular intervals. Expert Syst Appl 78
- 113.Froc E, Dubernard G, Bendifallah S, Hermouet E, Rubod-Dit-Guillet C, Canis M, Warembourg S, Golfier F, Fauconnier A, Roman H, Philip C-A (2021) Clinical characteristics of urinary tract endometriosis: a one-year national series of 232 patients from 31 endometriosis expert centers (by the FRIENDS group). Eur J Obst Gynecol Reprod Biol. 10.1016/j.ejogrb.2021.06.018.7 [DOI] [PubMed]
- 114.Wallace DS (2018) The role of speech recognition in clinical documentation. Nuance communications. Accessed 14 Dec 2019 https://www.hisa.org.au/slides/hic18/wed/SimonWallace.pdf.
- 115.Zamani NSM, Zaki WMDW, Huddin AB, Hussain A, Mutalib HA, Ali A. Automated pterygium detection using deep neural network. IEEE Access. 2020;8:191659–191672. doi: 10.1109/ACCESS.2020.3030787. [DOI] [Google Scholar]
- 116.Collins A, Yao Y (2018) Machine learning approaches: data integration for disease prediction and prognosis. In: Applied computational genomics. Springer, New York, pp 137–141.
- 117.Ke X, Zou J, Niu Y. End-to-end automatic image annotation based on deep CNN and multi-label data augmentation. IEEE Trans Multimedia. 2019;21(8):2093–2106. doi: 10.1109/TMM.2019.2895511. [DOI] [Google Scholar]
- 118.Davi C, et al. Severe dengue prognosis using human genome data and machine learning. IEEE Trans Biomed Eng. 2019;66(10):2861–2868. doi: 10.1109/TBME.2019.2897285. [DOI] [PubMed] [Google Scholar]
- 119.Liu M, Zhang J, Lian C, Shen D. Weakly supervised deep learning for brain disease prognosis using MRI and incomplete clinical scores. IEEE Trans Cybern. 2020;50(7):3381–3392. doi: 10.1109/TCYB.2019.2904186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Fang G, Liu W, Wang L (2020) A machine learning approach to select features important to stroke prognosis. Comput Biol Chem 88:107316 [DOI] [PubMed]
- 121.Wang G, Zhang G, Choi K-S, Lam K-M, Lu J (2020) Output based transfer learning with least squares support vector machine and its application in bladder cancer prognosis. Neurocomputing 387:279–292
- 122.Cai W, Liu T, Xue X, Luo G, Wang X, Shen Y, Fang Q, Sheng J, Chen F, Liang T (2020) CT quantification and machine-learning models for assessment of disease severity and prognosis of COVID-19 patients. Acad Radiol 27(12):1665–1678 [DOI] [PMC free article] [PubMed]
- 123.Zack CJ, Senecal C, Kinar Y, Metzger Y, Bar-Sinai Y, Widmer RJ, Lennon R, Singh M, Bell MR, Lerman A, Gulati R (2019) Leveraging machine learning techniques to forecast patient prognosis after percutaneous coronary intervention. JACC Cardiovasc Intervent 12(14):1304–1311 [DOI] [PubMed]
- 124.He Z-L, Zhou J-B, Liu Z-H, Dong S-Y, Zhang Y-T, Shen T, Zheng S-S, Xu X (2021) Application of machine learning models for predicting acute kidney injury following donation after cardiac death liver transplantation. Hepatob Pancr Dis Int 20(3):222–231 [DOI] [PubMed]
- 125.Ghadirzadeh A, Chen X, Yin W, Yi Z, Björkman M, Kragic D. Human-centered collaborative robots with deep reinforcement learning. IEEE Robot Autom Lett. 2021;6(2):566–571. doi: 10.1109/LRA.2020.3047730. [DOI] [Google Scholar]
- 126.Veltri P, Vizza P, Cristofaro M, Kallaverja E (2021) Clinical data annotation for parotid neoplasia management. In: 2021 IEEE 9th international conference on healthcare informatics (ICHI), pp. 445–446.
- 127.Finlayson SG, Bowers JD, Ito J, Zittrain JL, Beam AL, Kohane IS. Adversarial attacks on medical machine learning. Science. 2019;363(6433):1287–1289. doi: 10.1126/science.aaw4399. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Alfeld S, Zhu X, Barford P (2016) Data poisoning attacks against autoregressive models. In: Thirtieth AAAI conference on artificial intelligence
- 129.Papernot N, McDaniel P, Sinha A, Wellman M (2016) Towards the science of security and privacy in machine learning. arXiv:1611.03814, 2016.
- 130.Begoli E, Bhattacharya T, Kusnezov D. The need for uncertainty quantification in machine-assisted medical decision making. Nat Mach Intell. 2019;1(1):20. doi: 10.1038/s42256-018-0004-1. [DOI] [Google Scholar]
- 131.Pollard TJ, Chen I, Wiens J, Horng S, Wong D, Ghassemi M, Mattie H, Lindmeer E, Panch T. Turning the crank for machine learning: ease, at what expense? Lancet Digit Health. 2019;1(5):e198–e199. doi: 10.1016/S2589-7500(19)30112-8. [DOI] [PubMed] [Google Scholar]
- 132.Sahi MA, et al. Privacy preservation in e-healthcare environments: state of the art and future directions. IEEE Access. 2018;6:464–478. doi: 10.1109/ACCESS.2017.2767561. [DOI] [Google Scholar]
- 133.Al-Rubaie M, Chang JM. Privacy-preserving machine learning: threats and solutions. IEEE Secur Priv. 2019;17(2):49–58. doi: 10.1109/MSEC.2018.2888775. [DOI] [Google Scholar]
- 134.Zhang J, Bareinboim E (2018) Fairness in decision-making the causal explanation formula. In: Thirty-second AAAI conference on artificial intelligence
- 135.Chen I, Johansson FD, Sontag D (2018) Why is my classifier discriminatory? In: Advances in neural information processing systems, pp 3539–3550.
- 136.Ghassemi M, Naumann T, Schulam P, Beam AL, Chen IY, Ranganath R. Practical guidance on artificial intelligence for healthcare data. Lancet Digit Health. 2019;1(4):e157–e159. doi: 10.1016/S2589-7500(19)30084-6. [DOI] [PubMed] [Google Scholar]
- 137.Panch T, Mattie H, Celi LA (2019) The inconvenient truth about AI in healthcare. Npj Digit Med 2(1):1–3 [DOI] [PMC free article] [PubMed]
- 138.Perone CS, Ballester P, Barros RC, Cohen-Adad J. Unsupervised domain adaptation for medical imaging segmentation with self-ensembling. Neuroimage. 2019;194:1–11. doi: 10.1016/j.neuroimage.2019.03.026. [DOI] [PubMed] [Google Scholar]
- 139.Schulam P, Saria S (2017) Reliable decision support using counterfactual models. In: Advances in neural information processing systems, pp 1697–1708
- 140.Qayyum A, Usama M, Qadir J, Al-Fuqaha A (2019) Securing connected & autonomous vehicles: challenges posed by adversarial machine learning and the way forward. arXiv:1905.12762
- 141.Latif S, Qayyum A, Usama M, Qadir J, Zwitter A, Shahzad M. Caveat emptor: the risks of using big data for human development. IEEE Technol Soc Mag. 2019;38(3):82–90. doi: 10.1109/MTS.2019.2930273. [DOI] [Google Scholar]