Published in final edited form as: Semin Ultrasound CT MR. 2022 Feb 11;43(2):133–141. doi: 10.1053/j.sult.2022.02.002

A No-Math Primer on the Principles of Machine Learning for Radiologists

Matthew D Lee *, Mohammed Elsayed *, Sumit Chopra *,, Yvonne W Lui *
PMCID: PMC9363000  NIHMSID: NIHMS1827367  PMID: 35339253

Abstract

Machine learning is becoming increasingly important in both research and clinical applications in radiology due to recent technological developments, particularly in deep learning. As these technologies are translated toward clinical practice, there is a need for radiologists and radiology trainees to understand the basic principles behind them. This primer provides an accessible introduction to the vocabulary and concepts that are central to machine learning and relevant to the radiologist.

Introduction

Over the past decade, artificial intelligence (AI) has rapidly gained prominence in scientific and public discourse with promises to revolutionize many aspects of our lives. Spurred on by advances in deep learning, AI has become progressively adept at analyzing images of everyday objects and text.1 Inspired by this success, medical imaging researchers have adopted these techniques for clinical images and reported expert-level performance in certain focused imaging tasks.2-5 The US Food and Drug Administration has begun approving diagnostic imaging tools that use these technologies.6

While these developments are exciting, lay discussion around the topic is often hyperbolic and scientific reports confoundingly technical. Disentangling fact from exaggeration and jargon is a challenge. Nevertheless, understanding the advantages and limitations of these technologies is crucial for interpreting their results as machine learning is increasingly applied to clinical practice.

This primer is primarily aimed at radiologists with no technical background. Here, we provide an accessible introduction to the basic principles of machine learning most relevant to radiology and describe the context in which they may be encountered.

What are Artificial Intelligence, Machine Learning, and Deep Learning?

When John McCarthy coined the term “artificial intelligence” in 1956, he created a field of research around the idea that machines could perhaps simulate human intelligence.7 AI is the study and development of computer algorithms able to perform tasks traditionally believed to require human intelligence, such as making decisions and having conversations. Today, AI is a broad field encompassing many diverse approaches and techniques including machine learning and deep learning. Early in the history of AI, scientists programmed systems using explicit rules. An example of a rule-based system in radiology is a system called ICON, developed in 1987, which receives as input information about the findings on chest radiographs of patients with Hodgkin’s disease.8 Based on the findings and the clinician’s proposed diagnosis, ICON generates a discussion of the differential diagnosis, including whether the findings support the proposed diagnosis or other competing diagnoses. While the rule-based approach works in certain cases where there are a limited number of possibilities, it does not perform well in situations of increasing complexity that require detailed contextual knowledge.9

Machine learning arose from the desire to allow computers to perform flexibly in complex environments with a huge number of variables and to identify patterns in data. Machine learning is the branch of AI that develops and uses algorithms that can learn from data, creating a model in the process. The more data machine learning models are exposed to, the more they improve when faced with similar data, just like humans. “Classical” machine learning refers to algorithms that evolved from statistical methods and do not use artificial neural networks. Classical machine learning algorithms rely on first identifying specific features in data. Features are any measurable property of the data, such as the presence of edges and corners in image analysis at the pixel level or perfusion measures of a tumor at a high level. Originally, such features were defined by researchers and helped focus their methods on data that were believed to be relevant based on a priori scientific knowledge. However, this process of experts identifying features to include is limited by the fact that it relies on a priori knowledge. As radiologists, we understand that sometimes we do not know which imaging features or metrics are most relevant.

Deep learning is a branch of machine learning that was originally based on our concept of the mammalian brain’s multilayer architecture. Nodes called artificial neurons are arranged in multiple layers forming large hierarchical networks with connections between nodes relaying information between layers. By feeding data into these artificial neural networks, deep learning algorithms can derive features from data on their own, obviating the need for human-guided feature selection employed in classical machine learning. Although historically inspired by the brain, deep learning does not seek to directly simulate brain structure or function. Until the 2010s, deep learning models were generally regarded as too computationally costly to practically experiment with and train. Advances in computing power over the past decade, along with the availability of large public datasets such as ImageNet,10 have accelerated the field’s growth to the point that some deep learning models can even be run on a home laptop, depending on their complexity. Some of deep learning’s most significant advances have occurred in computer vision, speech recognition, and natural language processing, all of which intimately relate to the daily work of radiologists.1

How do Machine Learning Algorithms Work?

After radiology residents dictate an imaging study, attending radiologists review the images with the residents and correct their reports while teaching them about specific findings and diseases. In this scenario, the residents' learning is supervised. When residents are on call and do not have an attending to review their work, much learning also occurs, albeit unsupervised. Receiving praise for a job well done or criticism for poorly executed work is an example of learning through positive and negative reinforcement.

In an analogous way, machine learning may be supervised, unsupervised, or based on reinforcement. For learning to be supervised, each training example must be associated with a “ground-truth” label or annotation. The model compares its prediction with the ground-truth label and adjusts itself to increase the likelihood of matching the label. In the case of the radiology resident, the attending serves to provide the ground truth, and the resident learns how to refine their reports to align with the attending. In unsupervised learning, there are no ground-truth labels, so the model attempts to discover patterns in the data on its own. Similarly, unsupervised radiology residents rely on observing many cases and recognizing patterns on their own to determine what is normal or abnormal. Between supervised and unsupervised learning lies semi-supervised learning, which uses both labeled and unlabeled examples. At a high level, semi-supervised learning uses the correlations between the features of labeled and unlabeled sets of examples to extend the labels from the labeled examples to the unlabeled examples. In reinforcement learning, an algorithm performs actions and receives rewards or punishments (usually using a points-based system) that shape it to learn how to reach a desirable outcome.

Supervised learning has so far been the most popular approach in early machine learning efforts applied to radiology. However, it can require labeling of large amounts of data, which may be a challenging, tedious, and time-consuming task demanding domain expertise. Where supervised learning incorporates human knowledge in the form of ground-truth labels, unsupervised learning finds patterns in the data and can cluster or group data that are similar to one another without such guidance. Such approaches can provide unique insights into data. Advances in unsupervised learning hold much promise for the future of machine learning given the limitations of supervised learning.

Classical Machine Learning Models

Classical machine learning methods can be very effective for certain problems. The choice of algorithm depends on the type of question. Classical machine learning methods applied to radiological imaging can be useful in cases where the number of examples is limited and features such as size, density, or other quantitative measures can be leveraged as input. Classical machine learning methods can also be used as baseline models against which we may compare more advanced, deep learning-based models. Here we provide a brief overview of some of the most common methods including linear and logistic regression, support vector machine, k-nearest neighbors, decision trees, and random forest classifiers (Table 1).

Table 1.

Summary of Machine Learning Methods

Linear Regression
  Strengths: Efficiently predicts continuous variables; easy to interpret
  Weaknesses: Assumes a linear relationship between variables; sensitive to outliers
  Example: Survival prediction in hepatocellular carcinoma after chemoembolization33

Logistic Regression
  Strengths: Provides probabilities of classification; easy to interpret
  Weaknesses: Sensitive to outliers
  Example: Prediction of no-shows to radiology appointments34

Support Vector Machine
  Strengths: Classification robust to outliers; adaptable to linear and nonlinear data; effective for very high-dimensional data
  Weaknesses: Slow and computationally expensive for large datasets; does not work well when classes overlap
  Example: Differentiation of high-grade glioma progression vs treatment-related changes based on dynamic contrast-enhanced MRI35

k-Nearest Neighbors
  Strengths: Efficiently clusters data without a separate training period; robust to outliers
  Weaknesses: Slow for large high-dimensional datasets; dependent on empirical choice of k; sensitive to outliers
  Example: Prediction of placenta accreta spectrum based on MRI of placenta previa36

Decision Tree
  Strengths: Can handle any type of data; robust to missing data; easy to interpret
  Weaknesses: Slow for high-dimensional data; sensitive to small changes in data; prone to overfitting
  Example: Diagnosis of biliary atresia based on MRI and ultrasound findings37

Random Forest Classifier
  Strengths: Hierarchical structure helps identify most important features; robust to missing data and outliers; less prone to overfitting than individual decision trees
  Weaknesses: Slow to generate large numbers of complex trees
  Example: Prediction of malignancy and invasiveness in ground-glass lung nodules38

Neural Network
  Strengths: Various architectures adaptable to different types of data and tasks; robust to noise; works well with very large datasets
  Weaknesses: Slow and computationally expensive; may require large, labeled datasets; may be overly sensitive to irrelevant features; not easily explainable
  Examples: Segmentation of knee joint cartilage and bones39; automated detection of pneumothorax40

A common type of problem that may be tackled is a regression problem, where we are interested in learning the relationship that one or more independent variables (input) may have with a dependent variable (output). The simplest form of regression is linear regression, which—as we might recall from high school mathematics—can be used to calculate the line that best fits the data (Fig. 1).11 When linear regression is used in machine learning, the model searches for the best fit line by iteratively sampling available training data. The best fit line is then used to predict some continuous variable (eg, survival time, age). Logistic regression, on the other hand, computes a hyperplane (a plane in a high-dimensional space) that best separates the input data into two classes (Fig. 2).12 In two-dimensional space, the hyperplane is a line. When given a new data point, the logistic model will then predict its class depending on which side of the hyperplane it lies. In other words, the hyperplane acts as a decision boundary.
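
To make these two regressions concrete, here is a minimal sketch using scikit-learn (a library chosen here purely for illustration; the article prescribes no toolkit). The feature values and labels are synthetic.

```python
# A minimal sketch of linear and logistic regression with scikit-learn.
# All data below are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(seed=0)

# Linear regression: predict a continuous output (e.g., survival time)
# from a single input feature.
x = rng.uniform(0, 10, size=(50, 1))
y_continuous = 2.5 * x.ravel() + rng.normal(0, 1, size=50)  # noisy linear trend
lin = LinearRegression().fit(x, y_continuous)
print("Predicted value at x=4:", lin.predict([[4.0]]))

# Logistic regression: predict a binary class; the learned decision
# boundary is the hyperplane described in the text.
y_binary = (x.ravel() > 5).astype(int)
log = LogisticRegression().fit(x, y_binary)
print("Class probability at x=6:", log.predict_proba([[6.0]]))
```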

Figure 1.

Linear regression is used in machine learning to iteratively estimate the best fit linear equation for the data. Here, a linear relationship between input data (x-axis) and output result (y-axis) is learned and used to predict new, unseen data.

Figure 2.

Logistic regression classifies data according to which side of a hyperplane the data lies on. The hyperplane describes the probability that a point belongs to the yellow group.

Another type of classical machine learning classifier algorithm is the support vector machine (SVM) (Fig. 3).13 The SVM separates two classes by drawing a decision boundary that maximizes the space between the boundary and the data points closest to it. In one example in radiology, an SVM model was developed to predict survival in glioma patients based on perfusion MRI metrics by drawing decision boundaries between survivors and non-survivors at multiple time points.14
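
As an illustration, a minimal SVM sketch with scikit-learn (again an illustrative library choice) might look like the following; the two synthetic clusters stand in for two diagnostic classes.

```python
# A minimal sketch of a support vector machine classifier on synthetic
# two-feature data; all values are illustrative only.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters standing in for two classes.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# A linear-kernel SVM finds the maximum-margin decision boundary.
clf = SVC(kernel="linear").fit(X, y)
print("Predicted class:", clf.predict([[0.0, 4.0]]))
```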

Figure 3.

A support vector machine (SVM) classifies data by finding parallel hyperplanes (dashed lines) that separate the classes as much as possible and using the line between them (solid line) as a decision boundary.

Unlike the previously discussed methods, k-nearest neighbors (k-NN) is a technique that does not try to draw a decision boundary between classes.15 Instead, k-NN classifies each new data point according to the k data points closest to it, where the variable k is chosen empirically (Fig. 4). While an advantage of k-NN is that it does not require a training phase, comparing each new value to its neighbors may take an unreasonably long time for large datasets. In one study, k-NN was used to classify brain MR images as normal or abnormal.16
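
A minimal k-NN sketch with scikit-learn (an illustrative choice), with k set to 5 as in Figure 4 and synthetic data:

```python
# A minimal sketch of k-nearest neighbors; data are synthetic.
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# No conventional training occurs: the model stores the examples and,
# at prediction time, votes among the 5 closest stored points.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print("Predicted class:", knn.predict([[1.0, 2.0]]))
```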

Figure 4.

In k-nearest neighbors, a point is classified based on the k number of points closest to it. In this example where k = 5, the red point is surrounded by two yellow and three blue points, so it would be classified as blue.

Decision trees are flowcharts that can be used for classification or regression.17 A decision tree consists of a series of nodes that split data into different classes based on the value of a particular feature. This process is like playing the game 20 Questions, in which a player asks increasingly specific questions to determine what the other player is thinking of (Fig. 5). While decision trees are intuitive to humans, they are sensitive to noise in the training data. To overcome this sensitivity, multiple decision trees can be combined into a random forest classifier. In a random forest classifier, the decision trees each have a random selection of features and are trained on a random sample of training data to predict an outcome by a majority vote.18 In radiology, researchers have used random forest classifiers to classify lesions based on radiomics or texture features from CT and/or MRI; examples include predicting pathologic subtypes of renal masses19 and genetic biomarkers in glioblastoma.20
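
To contrast a single tree with the ensemble, here is a minimal sketch with scikit-learn (an illustrative library choice) on an invented dataset:

```python
# A minimal sketch contrasting a single decision tree with a random
# forest; the dataset is synthetic and illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# One tree: intuitive but sensitive to noise in the training data.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Many trees, each trained on a random sample of the data with a random
# selection of features, classify by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(tree.predict(X[:1]), forest.predict(X[:1]))
```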

Figure 5.

A decision tree is a series of nodes that split data into classes based on specific features. New data start at the root node. If the new data point passes the root node test, it is sent to the next node; otherwise, it is sent to a leaf node corresponding to a specific class. This process continues until the data point is classified.

Classical machine learning algorithms such as these can perform well in some scenarios but are limited in an important way when it comes to analyzing radiological images. They are restricted by the use of data features as inputs and, because these algorithms do not learn intermediate representations of the images themselves, they cannot readily analyze a whole image.21 Directly converting pixel/voxel values from a complex radiologic image into inputs for a classical machine learning algorithm rarely works well due to the wide variation in normal and abnormal anatomy. Researchers recognized that a more effective approach might be for the algorithm to learn abstractions from the data before attempting classification, leading to the development of deep learning.

Neural Networks

An artificial neural network consists of interconnected nodes, or artificial neurons, that each process data.9 Information flows through the network from an input layer composed of nodes, through one or more hidden layers, and finally to an output layer (Fig. 6). “Deep” in deep learning refers to neural networks with numerous hidden layers, as opposed to “shallow” networks with only one or a few hidden layers. Neural networks may be structured in many ways, yielding different architectures suited for different tasks.22

Figure 6.

Neural networks consist of multiple layers of artificial neurons. Using hidden layers and nonlinear processes, a neural network can learn to model complex relationships in data.

Convolutional neural networks (CNNs) are neural networks that can receive two- or three-dimensional images as inputs.1 Through multiple layers that perform different mathematical operations, the network learns progressively more complex, abstract representations of the data. The first layers extract basic features such as edges, whereas later layers may extract complex objects such as an organ or lesion. Common types of layers used in a convolutional neural network include convolutional layers and pooling layers. Convolutional layers synthesize information from a group of adjacent pixels/voxels to learn local motifs, whereas pooling layers group adjacent information together in an attempt to represent larger structural information.1 These layers help make CNNs less prone to overfitting (being overly influenced by local and possibly unique variations in data) and more robust to image transformations such as translation and rotation.
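
To make this architecture concrete, here is a minimal sketch in PyTorch (a framework chosen only for illustration) of a tiny CNN with the convolutional and pooling layers just described; the layer sizes are arbitrary, not a recommended design.

```python
# A minimal, illustrative CNN: two convolution + pooling stages followed
# by a linear classifier. Layer sizes are arbitrary choices.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # learns local motifs
            nn.ReLU(),
            nn.MaxPool2d(2),                             # pools adjacent information
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # more abstract features
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 16 * 16, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# One batch of four 64x64 single-channel "images".
logits = TinyCNN()(torch.randn(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```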

Modern neural networks contain millions of nodes and connections between nodes. Learning or training the network modifies the strengths of the connections, which are called "weights" by machine learning scientists. Weights are adjusted using algorithms that calculate how the weights should change to minimize the model's error. The networks may learn via many different methods including supervised, unsupervised, semi-supervised, and reinforcement methods as discussed previously. In addition, other modes of learning are emerging. Transfer learning leverages a previously trained network for a different but related task. Initial training of transfer networks usually uses natural images (non-medical images of everyday objects and/or subjects), which are widely available in large labeled public datasets such as ImageNet,10 a well-known repository of several million images. For example, investigators have adapted state-of-the-art neural networks trained on natural images to clinical imaging tasks such as fracture detection in wrist radiographs.23 A Fourier transform can even be applied to these natural images to render them in k-space for MR image reconstruction tasks.24 The features from a large pre-trained network can provide a starting point for researchers, with the hope of decreasing the computational cost of training a network from scratch as well as the number of labeled ground-truth medical images required for successful training.
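
A minimal sketch of transfer learning, assuming PyTorch and torchvision (illustrative choices, not prescribed by the text) and a hypothetical two-class task such as fracture vs no fracture: the ImageNet-pretrained backbone keeps its learned features while a new final layer is trained on the medical images.

```python
# A minimal transfer-learning sketch (requires torchvision >= 0.13 for
# the weights API). The two-class task is hypothetical.
import torch.nn as nn
from torchvision import models

# Start from a network pretrained on ImageNet natural images.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Optionally freeze the pretrained feature extractor...
for param in model.parameters():
    param.requires_grad = False

# ...and swap in a new classification head, which is then fine-tuned
# on the labeled medical images.
model.fc = nn.Linear(model.fc.in_features, 2)
```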

How are Machine Learning Models Developed?

To understand the myriad publications in radiology that involve machine learning, it is important to understand the basic methods. This section reviews some general steps of classical machine learning and deep learning models, from conception to deployment.

Define the Problem

It is important to understand the problem or gap in knowledge that a given piece of research purports to address. From a clinical standpoint, it is essential to have a sense of the overall impact of the problem. It is also important for researchers to consider whether machine learning is a good approach for their specific problem. In some instances, rule-based programming or other non-machine learning approaches can be sufficient. Reviewing the literature for what other researchers have done for similar problems could help clarify the approach.

Obtain the Data

Machine learning models can require large quantities of data to produce accurate and generalizable results. While most models are trained and tested on data from a single institution, this approach may prevent the model from performing well on data from other institutions. To overcome this, many studies seek to combine data from multiple institutions or from public repositories such as The Cancer Imaging Archive of the National Cancer Institute.25 For supervised and semi-supervised learning, ground-truth labels must also be collected. Depending on the project, the ground truth may be defined in different ways, such as diagnoses from the medical record, findings from clinical radiology reports, or radiologists’ consensus opinion. As discussed previously, labeling large amounts of data accurately and consistently is important for training the network but can present challenges to the researcher.

Explore and Prepare the Data

A popular saying in computer science that certainly applies to machine learning is, "Garbage in, garbage out": namely, the performance of a machine learning model is highly dependent on data quality. High-quality data are accurate, complete, consistent, and relevant. Researchers typically try to understand and explore their data by performing statistical tests, visualizing the data in charts or graphs, and removing irrelevant dimensions or variables. Errors or outliers may bias the model and impair generalization, so researchers try to adopt systematic ways of handling missing, erroneous, or extreme data. Large datasets, especially from clinical settings, can be imperfect but can often show the true and relevant clinical range of such data if adequately prepared. Data may need to be labeled, for example as normal or abnormal, depending on the model and question.
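
As a toy sketch of such systematic handling, assuming pandas (an illustrative choice); the table, the column name, and the plausibility threshold are all invented.

```python
# A toy data-cleaning sketch: drop implausible measurements, then fill
# remaining gaps with the median. All values here are invented.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "lesion_size_mm": [12.0, np.nan, 8.5, 450.0, 10.2],  # 450 mm is implausible
    "label": [1, 0, 0, 1, 0],
})

# Keep rows whose measurement is missing or within a plausible range...
df = df[df["lesion_size_mm"].isna() | (df["lesion_size_mm"] < 100)].copy()

# ...then impute the missing value with the median of the valid values.
df["lesion_size_mm"] = df["lesion_size_mm"].fillna(df["lesion_size_mm"].median())
print(df)
```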

Choose a Metric to Optimize During Learning

To determine whether a model is learning adequately, a suitable metric is necessary to quantify the model’s degree of error. These metrics are calculated using what is called a cost function, or loss function, to provide a quantitative measure of the difference between what the model predicts and what is expected. During learning, a model attempts to minimize its cost function. Different cost functions may be used depending on the types of data involved and the task that the model is supposed to perform. Regression models, for example, frequently use mean squared error as the cost function.
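
As a concrete (invented) example, the following computes mean squared error between a regression model's predictions and the expected values; this is the quantity such a model would try to minimize during training.

```python
# A minimal cost-function sketch: mean squared error between predictions
# and expected values. The numbers are illustrative.
import numpy as np

predicted = np.array([3.1, 4.8, 7.2])
expected = np.array([3.0, 5.0, 7.0])

# Training adjusts the model to make this number smaller.
mse = np.mean((predicted - expected) ** 2)
print(mse)  # 0.03
```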

Train and Tune the Model

The data are generally split into three groups: training, validation, and test sets. Examples from the training set are passed through the model, which makes a prediction and adjusts itself to improve future predictions. These adjustments to parameters of the model are small and made iteratively to improve the model. This process is called training the model or learning. While training data are used to optimize a model, validation data are used to determine which is the best model. Many different models can be tried to see which performs best on the validation set. For many machine learning problems, it is neither an obvious nor trivial task to figure out what kind of model will work best. Models also have parameters that are set before training called hyperparameters, such as the number of hidden layers in a neural network or the number of decision trees in a random forest classifier. Modifying these hyperparameters, or “tuning” the model, may influence the model’s learning and performance on the validation set. As more research is done, experience from previous work can provide insight into what types of models and model architectures may be suitable for related tasks.
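
A minimal sketch of this workflow with scikit-learn (an illustrative choice): split off a test set, carve a validation set from the remainder, tune one hyperparameter on the validation set, and reserve the test set for the final evaluation described in the next section. The data and hyperparameter grid are invented.

```python
# A minimal train/validation/test and tuning sketch; data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)

# Hold out a test set first, then carve a validation set from the rest.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# "Tune" one hyperparameter (number of trees) on the validation set only.
best_n, best_score = None, 0.0
for n_trees in (10, 50, 100):
    model = RandomForestClassifier(n_estimators=n_trees, random_state=0).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_n, best_score = n_trees, score

# The untouched test set is used once, at the very end.
final = RandomForestClassifier(n_estimators=best_n, random_state=0).fit(X_train, y_train)
print("Test accuracy:", final.score(X_test, y_test))
```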

Test the Model

Once the model finishes training and tuning, the test set is used to assess the final model's performance. It is important not to mix test data with training or validation data; that would be like seeing the answers to an exam ahead of time. If data from the validation or test sets are present in the training set, this is called "data leakage" and leads researchers to overestimate the model's performance. Proper methodology avoids errors such as this.

Deploy the Model

Currently, there is much interest in establishing best practices towards successful deployment and utilization of machine learning tools in clinical practice. Machine learning models need reliable integrations with existing imaging data infrastructure. Discussions with key stakeholders, including radiologist end users, information technology specialists, and other clinicians are important to move models toward successful integration into clinical practice. Testing performance and ongoing quality measures are needed. A nuanced understanding of these models’ capabilities, as well as their risks and limitations, is warranted.26

How are Machine Learning Models Assessed?

To develop a nuanced appreciation of how a machine learning model is performing in clinical practice, it is important to understand the various ways in which models may be evaluated. As discussed previously, cost functions are one way to assess a model’s performance. While the cost function is used to minimize errors during training, it can also be used to evaluate the model on non-training data. In some cases, however, the cost function may be too abstract to interpret in a clinically meaningful way.

The standard statistical methods used to quantify model accuracy depend on the type of task the model performs. For classification models, the receiver operating characteristic (ROC) curve is a graphical summary of model performance (Fig. 7). The area under the receiver operating characteristic curve (AUROC or AUC) can be calculated to describe a classifier's overall diagnostic performance: AUC = 1 indicates perfect performance; AUC = 0.5 indicates performance no better than chance. Other measures of accuracy that are commonly used in clinical medicine, such as sensitivity, specificity, positive predictive value, and negative predictive value, may be calculated depending on where we choose to set the threshold for the model to predict the class. Depending on the clinical purpose a tool is designed for, one may be more or less interested in optimizing one or more of these statistical measures. For example, screening tests often prioritize high sensitivity, while diagnostic tests used to determine clinical treatment pathways require high specificity. To demonstrate a model's clinical utility, these measures may be compared to those calculated for human radiologists performing the same task. It is important to note that positive and negative predictive values are affected by the prevalence of the condition in the population studied.
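
As an illustration, the sketch below computes AUC plus threshold-dependent sensitivity and specificity with scikit-learn (a library chosen only for illustration); the labels and predicted probabilities are invented.

```python
# A minimal sketch of classifier evaluation; labels and probabilities
# are invented for illustration.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.6, 0.2, 0.8, 0.7, 0.55, 0.3, 0.9])

# AUC summarizes performance across all possible thresholds.
print("AUC:", roc_auc_score(y_true, y_prob))

# Choosing a threshold (here 0.5) turns probabilities into class
# predictions, from which sensitivity and specificity follow.
y_pred = (y_prob >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Sensitivity:", tp / (tp + fn), "Specificity:", tn / (tn + fp))
```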

Figure 7.

Receiver operating characteristic (ROC) curves represent a classifier's diagnostic performance. A classifier that performs no better than chance has an area under the curve (AUC) of 0.5 (dashed red). A classifier that performs perfectly has an AUC of 1 (green). A classifier that performs better than chance but not quite perfectly has an AUC between 0.5 and 1 (blue).

In many instances, qualitative assessment of model predictions against expert assessment is warranted. In particular, analyzing discrepant predictions is important to determine the nature of the model’s discrepancies and to identify patterns in them. By carefully examining the model’s errors, we can gain an understanding of the model’s systematic behaviors, biases, and limitations. This insight may then be leveraged to improve the model.

When machine learning models are deployed in real-world clinical environments, it is vital to study their effectiveness and acceptance by users and stakeholders. These qualitative and quantitative investigations can help us understand the successes and challenges in the clinical deployment of machine learning imaging tools. Even after being validated in the clinical environment, machine learning models’ performance must be continuously evaluated to monitor for changes in effectiveness due to shifts in population characteristics over time.

What are the Limitations of Machine Learning in Radiology?

Supervised machine learning algorithms often require sizable, annotated datasets to produce accurate and generalizable results. Compared to natural image datasets such as ImageNet, there are far fewer and smaller medical imaging datasets available. Medical images also often have much higher original image resolution compared to natural images and thus require greater computer memory for training, which is limited by computer hardware. Medical images also tend to have more complex classes/diagnoses and difficulties related to accurate labeling.

Accounting for variation in imaging protocols between institutions as well as in the appearance of findings across populations is a further challenge. Sharing data between institutions may help models generalize across populations but is complicated by the need to safeguard protected health information. To help overcome some of the challenges related to data sharing, researchers are beginning to investigate federated learning approaches, in which a model, instead of data, is shared across institutions.27 In federated learning, learning can occur locally with updates to model parameters relayed from local sites to a central site, which aggregates updates from multiple sites into a consensus model. While federated learning is still in its infancy, it has the potential to draw upon the advantages of a broad distribution of cases from different institutions while mitigating risk associated with sharing protected health information.
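
To illustrate only the aggregation step (real federated learning frameworks additionally handle communication, privacy safeguards, and weighting by site size), here is a toy sketch in which hypothetical parameter arrays from three sites are averaged into a consensus model.

```python
# A toy sketch of the central aggregation step in federated learning:
# only model parameters, never patient data, leave each site.
import numpy as np

def federated_average(site_weights):
    """Average the parameter arrays contributed by each participating site."""
    return [np.mean(layer, axis=0) for layer in zip(*site_weights)]

# Parameters from three hypothetical institutions (one "layer" each here).
site_a = [np.array([0.2, 0.5])]
site_b = [np.array([0.4, 0.3])]
site_c = [np.array([0.3, 0.4])]

consensus = federated_average([site_a, site_b, site_c])
print(consensus)  # [array([0.3, 0.4])]
```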

The performance of a machine learning model depends on its training data. Thus, machine learning models have the potential to perpetuate biases that may be present in data, which could exacerbate health inequity.28 For example, the demographics of patients represented in an imaging dataset are inherently shaped by access to medical imaging services.29 Certain populations may be under-represented in the data, and disease manifestations in such populations may not be learned well by any model trained on these data. A joint statement on the ethics of AI in radiology by multinational radiological societies poses a series of questions for radiology researchers to ask themselves about the data, algorithms, and practice implications as they develop AI models.30

Finally, another criticism of certain classes of machine learning models, such as neural networks, is that they function like “black boxes”: exactly why they produce a certain output is often impossible to explain in a meaningful way to human observers.31 By processing numerous complex interactions in the data, a deep neural network learns features of data that are not translatable into comprehensible language. This lack of interpretability has, for some tasks, been a barrier to clinical adoption and trust among medical practitioners and the public. Explainability is a rapidly developing area of interest in machine learning and promises to offer potential solutions in the future.32

Conclusion

This primer introduces the basic concepts and terminology of machine learning relevant to radiologists and radiology trainees in a non-technical, accessible way. Machine learning is poised to change the landscape of the clinical practice of radiology. Having a grasp of the methods behind these new approaches will empower radiologists as they interpret literature, purchase equipment and software, and adopt new methods of workflow optimization.

Acknowledgments

Supported in part through funding from the NIH/NIBIB, P41 EB017183.

References

1. LeCun Y, Bengio Y, Hinton G: Deep learning. Nature 521:436–444, 2015
2. Esteva A, Kuprel B, Novoa RA, et al: Dermatologist-level classification of skin cancer with deep neural networks. Nature 542:115–118, 2017
3. Nirschl JJ, Janowczyk A, Peyster EG, et al: A deep-learning classifier identifies patients with clinical heart failure using whole-slide images of H&E tissue. PLoS One 13:e0192726, 2018
4. Gulshan V, Peng L, Coram M, et al: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316:2402–2410, 2016
5. Larson DB, Chen MC, Lungren MP, et al: Performance of a deep-learning neural network model in assessing skeletal maturity on pediatric hand radiographs. Radiology 287:313–322, 2018
6. Benjamens S, Dhunnoo P, Meskó B: The state of artificial intelligence-based FDA-approved medical devices and algorithms: An online database. npj Digit Med 3:118, 2020
7. McCarthy J, Minsky M, Rochester N, et al: A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI Mag 27:12–14, 2006
8. Swett HA, Miller PL: ICON: A computer-based approach to differential diagnosis in radiology. Radiology 163:555–558, 1987
9. Goodfellow I, Bengio Y, Courville A: Deep Learning. Cambridge, MA: MIT Press, 2016
10. Deng J, Dong W, Socher R, et al: ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL: IEEE, 2009
11. Montgomery DC, Peck EA, Vining GG: Introduction to Linear Regression Analysis, 5th ed. Hoboken, NJ: Wiley, 2020
12. Hosmer DW, Lemeshow S, Sturdivant RX: Applied Logistic Regression, 3rd ed. Hoboken, NJ: Wiley, 2013
13. Noble WS: What is a support vector machine? Nat Biotechnol 24:1565–1567, 2006
14. Emblem KE, Pinho MC, Zöllner FG, et al: A generic support vector machine model for preoperative glioma survival associations. Radiology 275:228–234, 2015
15. Kramer O: K-nearest neighbors. In: Dimensionality Reduction with Unsupervised Nearest Neighbors, Vol 51. Berlin, Heidelberg: Springer, 2013
16. Rajini NH, Bhavani R: Classification of MRI brain images using k-nearest neighbor and artificial neural network. In: 2011 International Conference on Recent Trends in Information Technology (ICRTIT). Chennai, India: IEEE, 2011
17. Safavian SR, Landgrebe D: A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21:660–674, 1991
18. Breiman L: Random forests. UC Berkeley TR567, 1999
19. Raman SP, Chen Y, Schroeder JL, et al: CT texture analysis of renal masses. Acad Radiol 21:1587–1596, 2014
20. Calabrese E, Villanueva-Meyer JE, Cha S: A fully automated artificial intelligence method for non-invasive, imaging-based identification of genetic alterations in glioblastomas. Sci Rep 10:11852, 2020
21. Razavian N, Knoll F, Geras KJ: Artificial intelligence explained for non-experts. Semin Musculoskelet Radiol 24:3–11, 2020
22. Sengupta S, Basak S, Saikia P, et al: A review of deep learning with special emphasis on architectures, applications and recent trends. Knowl Based Syst 194:105596, 2020
23. Kim DH, MacKinnon T: Artificial intelligence in fracture detection: Transfer learning from deep convolutional neural networks. Clin Radiol 73:439–445, 2018
24. Muckley MJ, Ades-Aron B, Papaioannou A, et al: Training a neural network for Gibbs and noise removal in diffusion MRI. Magn Reson Med 85:413–428, 2021
25. Clark K, Vendt B, Smith K, et al: The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. J Digit Imaging 26:1045–1057, 2013
26. Recht MP, Dewey M, Dreyer K, et al: Integrating artificial intelligence into the clinical practice of radiology: Challenges and recommendations. Eur Radiol 30:3576–3584, 2020
27. Rieke N, Hancox J, Li W, et al: The future of digital health with federated learning. npj Digit Med 3:119, 2020
28. Pot M, Kieusseyan N, Prainsack B: Not all biases are bad: Equitable and inequitable biases in machine learning and radiology. Insights Imaging 12:13, 2021
29. Americo L, Ramjit A, Wu M, et al: Health care disparities in radiology: A primer for resident education. Curr Probl Diagn Radiol 48:108–110, 2019
30. Geis JR, Brady AP, Wu CC, et al: Ethics of artificial intelligence in radiology: Summary of the Joint European and North American Multisociety statement. J Am Coll Radiol 16:1516–1521, 2019
31. Hinton G: Deep learning—a technology with the potential to transform health care. JAMA 320:1101, 2018
32. Vellido A: The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput Appl 32:18069–18083, 2020
33. Dong S, Ye X-D, Yuan Z, et al: Relationship of apparent diffusion coefficient to survival for patients with unresectable primary hepatocellular carcinoma after chemoembolization. Eur J Radiol 81:472–477, 2012
34. Harvey HB, Liu C, Ai J, et al: Predicting no-shows in radiology using regression modeling of data available in the electronic medical record. J Am Coll Radiol 14:1303–1309, 2017
35. Artzi M, Liberman G, Nadav G, et al: Differentiation between treatment-related changes and progressive disease in patients with high grade brain tumors using support vector machine classification based on DCE MRI. J Neurooncol 127:515–524, 2016
36. Romeo V, Ricciardi C, Cuocolo R, et al: Machine learning analysis of MRI-derived texture features to predict placenta accreta spectrum in patients with placenta previa. Magn Reson Imaging 64:71–76, 2019
37. Kim YH, Kim M-J, Shin HJ, et al: MRI-based decision tree model for diagnosis of biliary atresia. Eur Radiol 28:3422–3431, 2018
38. Mei X, Wang R, Yang W, et al: Predicting malignancy of pulmonary ground-glass nodules and their invasiveness by random forest. J Thorac Dis 10:458–463, 2018
39. Liu F, Zhou Z, Jang H, et al: Deep convolutional neural network and 3D deformable approach for tissue segmentation in musculoskeletal magnetic resonance imaging. Magn Reson Med 79:2379–2391, 2018
40. Kao C-Y, Lin C-Y, Chao C-C, et al: Automated radiology alert system for pneumothorax detection on chest radiographs improves efficiency and diagnostic performance. Diagnostics 11:1182, 2021
