JCO Clinical Cancer Informatics. 2019 May 29;3:CCI.18.00121. doi: 10.1200/CCI.18.00121

A Deep Learning–Based Decision Support Tool for Precision Risk Assessment of Breast Cancer

Tiancheng He 1, Mamta Puppala 1, Chika F Ezeana 1, Yan-siang Huang 1,2, Ping-hsuan Chou 1,2, Xiaohui Yu 1, Shenyi Chen 1, Lin Wang 1, Zheng Yin 1, Rebecca L Danforth 1, Joe Ensor 1, Jenny Chang 1, Tejal Patel 1, Stephen TC Wong 1,
PMCID: PMC10445790  PMID: 31141423

Abstract

PURPOSE

The Breast Imaging Reporting and Data System (BI-RADS) lexicon was developed to standardize mammographic reporting to assess cancer risk and facilitate the decision to biopsy. Because of substantial interobserver variability in the application of the BI-RADS lexicon, the decision to biopsy varies greatly and results in overdiagnosis and excessive biopsies. The false-positive rate from mammograms is estimated to be approximately 7% to 10% overall, but within the BI-RADS 4 category, it is greater than 70%. Therefore, we developed the Breast Cancer Risk Calculator (BRISK) to target a well-characterized and specific patient subgroup (BI-RADS 4) rather than a broad heterogeneous group in assessing breast cancer risk.

METHODS

BRISK provides a novel precise risk assessment model to reduce overdiagnosis and unnecessary biopsies. It was developed by applying natural language processing and deep learning methods on 5,147 patient records archived in the Houston Methodist systemwide data warehouse from 2006 to May 2015, including imaging and pathology reports, mammographic images, and patient demographics. Key characteristics for BI-RADS 4 patients were collected and computed to output an index measure for biopsy recommendation that is clinically relevant and informative and improves upon the traditional BI-RADS 4 scores.

RESULTS

For the validation set, we assessed data from 1,247 BI-RADS 4 patients, including mammographic images and medical reports. The BRISK model sensitivity to predict malignancy was 100%, whereas the specificity was 74%. The total accuracy of our implemented model in BRISK was 81%. Overall area under the curve was 0.93.

CONCLUSION

BRISK for abnormal mammogram uses integrative artificial intelligence technology and has demonstrated high sensitivity in the prediction of malignancy. Prospective evaluation is under way and can lead to improvement in patient-physician engagement in making informed decisions with regard to biopsy.

INTRODUCTION

Early diagnosis and treatment of breast cancer improves prognosis and patient outcomes.1,2 Diagnostic mammography and ultrasonography are frequently performed to evaluate patients with palpable breast masses.3 The American Cancer Society also recommends that women with an average risk of breast cancer undergo regular screening mammography starting at age 45 years.4 The diagnostic work-up of an abnormal mammographic or ultrasound finding includes image-guided needle biopsy for histologic characterization of the suspicious finding. However, overtreatment by biopsy is a predominant and costly issue in breast imaging and leads to patients experiencing unnecessary anxiety and invasive procedures. The American College of Radiology developed the Breast Imaging Reporting and Data System (BI-RADS) lexicon to standardize mammographic reporting to assess cancer risk and facilitate biopsy decision making.2 Because of substantial interobserver variability in the application of the BI-RADS lexicon, biopsy decision-making accuracy varies greatly and results in overdiagnosis and overtreatment by biopsy. The BI-RADS lexicon classifies mammograms into one of seven assessment categories. Category 0 is inconclusive and requires additional imaging; categories 1 and 2 are largely taken as negative/benign; category 3 as probably benign; categories 5 and 6 as highly suggestive of malignancy; and category 4 as suspicious, with a predicted malignancy risk of 2% to 95%. The overall mammography false-positive rate is estimated to be approximately 7% to 10%,5,6 with BI-RADS 4 containing well over 70% false-positive biopsy results.7,8 The wide predicted malignancy range of BI-RADS 4 limits its clinical utility.9 Standard practice is to biopsy all BI-RADS 4 category patients for diagnosis. False-positive mammograms are estimated to cost approximately $4 billion per year.10

Biopsies have additional drawbacks, including pain, bruising, risk of infection, and breast scarring.11 Distinguishing malignant from benign breast lesions on the basis of noninvasive imaging in BI-RADS 4 patients is crucial to successful and responsive clinical decisions. An ideal diagnosis support stratification system would consistently distinguish between benign and malignant imaging findings, but no such system exists, which puts the burden on clinicians and patients to make decisions amid high uncertainty. A few models have been published for predicting breast cancer risk from mammographic features and demographic factors,12-14 including mammographic breast density. Other research teams have focused on the application of machine learning methods. These techniques have been used to model the risk of developing cancer and the progression and treatment of diseases. Additional models for predicting the risk of breast cancer from mammographic features and demographic factors have been published.15-17 These studies suggest that accurate decisions could be made using the probability of cancer as an outcome measure, but they do not provide a much-desired scoring system for the decision to biopsy. Moreover, our observation is that images alone do not capture all necessary information to predict cancer risk. Thus, there is an urgent need for a system that can better stratify the risk of cancer and the need for biopsy in BI-RADS 4 patients and thereby reduce the number of unnecessary biopsies and their undesirable adverse effects, risks, and costs.

CONTEXT

  • Key Objective

  • This medical decision support system is the first to apply natural language processing and deep learning to thousands of images and reports for patients with findings suggestive of breast cancer and to enhance engagement between the patient and clinician in making an informed decision about whether to biopsy, thereby reducing unnecessary biopsies and patient anxiety.

  • Knowledge Generated

  • We validated the tool with images, clinical data, free-text reports, demographics, and other administrative information for 5,000 Breast Imaging Reporting and Data System 4 patients, extracted from the eight-hospital clinical research data warehouse at Houston Methodist. We demonstrate that our deep learning algorithm for breast cancer risk assessment can help to identify subtle patterns and to generate an optimal index for the decision to biopsy while also improving the accuracy of diagnosis support for the management of patients with findings suggestive of breast cancer.

  • Relevance

  • Upon the successful completion of the work, we established a Web site for public access by physicians, oncologists, and patients to use our model to better assess the risk of Breast Imaging Reporting and Data System 4 patients and to reduce unnecessary biopsy and patient anxiety.

We describe the Breast Cancer Risk Calculator (BRISK) tool, which outputs a risk assessment measure to aid breast cancer biopsy decisions in BI-RADS 4 patients. BRISK incorporates a new analytic model that deploys deep learning, an advanced subset of machine learning built from neural networks of three or more layers. By measuring similarities among objects, a deep learning algorithm can group objects with similar features and distinguish among classes. The ability of deep learning to generate new features without being explicitly instructed means that researchers can save time on feature selection and work with richer, more complex, and more comprehensive feature sets.18,19 Validation studies show that our deep learning algorithm–based BRISK tool can identify subtle patterns accurately. It generates a more optimized risk score for the decision to biopsy while enhancing the accuracy of the diagnosis support stratification system for abnormal breast imaging.

METHODS

Data

We constructed a database to aggregate large volumes of de-identified charts and images from BI-RADS 4 category patients who underwent biopsy within 3 months of their abnormal mammogram at Houston Methodist Hospital between January 2006 and May 2015. The database contains additional information that includes ultrasound images, mammographic reports, patient demographic variables, and pathology results. The data were from multiple sources: The ultrasound images were acquired from the picture archiving and communication system of Houston Methodist Hospital, and reports were retrieved from Houston Methodist Environment for Translational Enhancement and Outcomes Research, our systemwide clinical data warehouse.20 We managed data sets by linking the pseudo-medical record number. Figure 1 shows how the database was built.

FIG 1.

The database architecture of the Breast Cancer Risk Calculator system. The breast database leverages the Houston Methodist clinical information infrastructure and has two components: de-identified clinical reports and patient demographic variables extracted from the Houston Methodist data warehouse and image signatures generated from the de-identified images retrieved online from the picture archiving and communication system (PACS). The Houston Methodist data warehouse contains all clinical data of approximately 4 million inpatients and outpatients, patient demographics, and business and finance data of the eight hospitals within the Houston Methodist system to support online querying, outcome studies, and clinical research. PACS contains all images acquired within Houston Methodist. This figure shows the various vendor databases that Houston Methodist Hospital (HMH) leverages for storing and managing clinical data, including ATHENA NextGen (Orlando, FL), MEDHOST (Franklin, TN), a health information system (HIS; Health Quest, Lagrangeville, NY), a radiology information system image connector (RIS-IC; Centricity; GE Healthcare, Chicago, IL), historical data from an Allscripts (Chicago, IL) database, OR Manager (Access Intelligence, Rockville, MD), and our current electronic medical record system (Epic Systems, Verona, WI), as well as various other internal and external data sources, such as a UnitedHealthCare clinical database (UHC CDB; Edina, MN) and a Vizient database (Irving, TX) for quality- and insurance-related data. CMS, Centers for Medicare & Medicaid Services; ED, emergency department; METEOR, Houston Methodist Environment for Translational Enhancement and Outcomes Research; OP, operative note; OR, operating room.

Model Development

The first step in building our model was to accurately extract cancer risk variables from the clinical reports. Manual extraction of information is a time-intensive and costly process. Therefore, we applied a natural language processing (NLP) interface module21,22 to extract patient demographics and other variables automatically from the clinical reports.20,23,24 Methodist Hospital Text Teaser, our in-house NLP tool, automatically extracts clinical report and patient demographic variable data for risk modeling20 under the guidance of a breast oncologist. It can search and retrieve specific clinical information from free-text reports. Mammographic reports associated with breast biopsies referred for cancer prognosis were collected.
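As an illustration only, the following minimal Python sketch shows the kind of rule-based extraction such an NLP module performs on a free-text report. The regular expressions and field names here are hypothetical and far simpler than the actual Methodist Hospital Text Teaser pipeline.

```python
import re

# Illustrative patterns for a few risk variables; these are hypothetical
# stand-ins, not the patterns used by Methodist Hospital Text Teaser.
PATTERNS = {
    "birads": re.compile(r"BI-?RADS(?:\s*category)?\s*[:#]?\s*([0-6])", re.I),
    "density": re.compile(r"(almost entirely fatty|scattered fibroglandular|"
                          r"heterogeneously dense|extremely dense)", re.I),
    "mass_size_mm": re.compile(r"(\d+(?:\.\d+)?)\s*mm\b", re.I),
}

def extract_variables(report_text: str) -> dict:
    """Pull structured risk variables out of a free-text mammography report."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report_text)
        found[name] = match.group(1) if match else None
    return found

report = ("Heterogeneously dense breasts. Irregular 12 mm mass at 10 o'clock. "
          "Assessment: BI-RADS category 4.")
print(extract_variables(report))
# {'birads': '4', 'density': 'heterogeneously dense', 'mass_size_mm': '12'}
```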

On the basis of the extracted cancer risk variables, a deep learning model was developed for BRISK to aid in predicting malignancy. Figure 2 illustrates a schematic diagram of the model, which contains two deep autoencoder networks and one multilayer perceptron. The first autoencoder, the medical image autoencoder deep network, is used for dimension reduction while preserving the features of the original images (mammogram and ultrasound images). The compressed image features (the middle hidden layer) are then combined with other clinical features as the input to the second autoencoder, called the multiple feature autoencoder. Similarly, its hidden layer represents lower-dimensional features and is used as the input of the final decision-making multilayer perceptron. The output of the final multilayer perceptron forms the risk decision for biopsy. We constructed this artificial intelligence system by minimizing the categorical cross-entropy loss. The parameters estimated are the weight vectors of the filters, a bias term in the activation function, and a weight vector of the softmax function. Optimization is performed using stochastic gradient descent25 and back propagation.26-28

FIG 2.

The deep learning architecture of the Breast Cancer Risk Calculator system. We developed the model with two autoencoder deep networks and one multilayer perceptron. The medical image autoencoder deep network is used for compressing the features from the original images (mammogram and ultrasound images). Compressed image features are combined with imaging and clinical features as the input of the multiple feature autoencoder deep network, the compressed features of which are used as input of the final decision-making multilayer perceptron. The output of the multilayer perceptron is the final decision of biopsy.

The model input includes ultrasound images, clinical report features, mammography images and report features, and demographic features. Only basic preprocessing, such as contrast adjustment, brightness correction, and image size normalization, was performed on the original patient images. Clinical features of each BI-RADS 4 patient are extracted from demographic and mammographic reports in our structured database and organized as a feature matrix, with different rows representing different feature types. As a preprocessing step, the feature vector of each row is first generated from the word2vec model of Mikolov et al.29 These models produce text features, with each unique word assigned a corresponding vector in the embedding space. To ensure that the dimensionalities of clinical feature vectors are identical, zero padding30 is applied across different feature types. Table 1 lists all features used as the model input. Figure 3 shows examples of mammography and ultrasound images for three BI-RADS 4 patients with suspected breast cancer and different pathology findings.
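A minimal Python sketch of this preprocessing follows, assuming PIL and NumPy. The autocontrast step stands in for the contrast/brightness adjustment, and the per-row vectors stand in for word2vec outputs; both are illustrative assumptions rather than the exact BRISK pipeline.

```python
import numpy as np
from PIL import Image, ImageOps

def preprocess_image(path: str, size: int = 512) -> np.ndarray:
    """Basic preprocessing as described: contrast adjustment via autocontrast,
    then resizing to a common 512x512 grid, scaled to [0, 1]."""
    img = Image.open(path).convert("L")    # grayscale mammogram/ultrasound
    img = ImageOps.autocontrast(img)       # simple contrast/brightness adjustment
    img = img.resize((size, size))         # image size normalization
    return np.asarray(img, dtype=np.float32) / 255.0

def pad_feature_matrix(rows: list[np.ndarray]) -> np.ndarray:
    """Zero-pad word2vec-derived feature vectors of unequal length so every
    row of the clinical feature matrix has the same dimensionality."""
    width = max(len(r) for r in rows)
    matrix = np.zeros((len(rows), width), dtype=np.float32)
    for i, r in enumerate(rows):
        matrix[i, : len(r)] = r
    return matrix

# Hypothetical per-feature-type vectors (e.g., from a trained word2vec model)
rows = [np.random.rand(100), np.random.rand(60), np.random.rand(80)]
print(pad_feature_matrix(rows).shape)  # (3, 100)
```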

TABLE 1.

Selected Features Stored in the Database for the Breast Cancer Risk Assessment System


FIG 3.

Examples of mammography and ultrasound images for three Breast Imaging Reporting and Data System 4 patients with suspected breast cancer with different findings from Houston Methodist Hospital. (A) The biopsy-based pathology finding is benign. (B) The biopsy-based pathology finding is ductal carcinoma in situ. (C) The biopsy-based pathology finding is carcinoma.

Model Setting

Medical image autoencoder deep network.

Convolutional neural networks are used as encoders and decoders because the inputs are images.31-35 The encoder consists of a stack of two-dimensional (2D) convolution and 2D max-pooling layers (max-pooling being used for spatial downsampling), whereas the decoder consists of a stack of 2D convolution and 2D upsampling layers. An input-output relation is defined as

$$\hat{\mu} = \sum_{i=1}^{M_1} \left(W_f \circ W_p\right)\!\left(x_{\mathrm{CAE}}\right)$$

where $x_{\mathrm{CAE}}$ is the input set, and $\hat{\mu}$ is the reconstructed output of this network. $M_1$ is the number of neurons in the network. $W_f$ denotes the set of learnable connection weights between the convolution layer and the pooling layer, and $W_p$ denotes the set of learnable connection weights between the pooling layer and the compressed layer. The input is the image data matrix of size $512 \times 512 \times N$, where $N$ is the number of input images. This network comprises one convolution layer, one pooling layer, one feature output layer, one unpooling layer, and one deconvolution layer.
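A minimal sketch of such a convolutional autoencoder in TensorFlow 2.x Keras follows. The filter counts and kernel sizes are illustrative assumptions, not the values used in BRISK; the named `compressed_features` layer corresponds to the middle hidden layer whose output feeds the multiple feature autoencoder.

```python
from tensorflow.keras import layers, Model

# Encoder: 2D convolution + 2D max-pooling (spatial downsampling).
# Decoder: 2D upsampling + 2D convolution. Sizes are illustrative.
inputs = layers.Input(shape=(512, 512, 1))
x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2), padding="same")(x)            # downsampling
code = layers.Conv2D(8, (3, 3), activation="relu",
                     padding="same", name="compressed_features")(x)
x = layers.UpSampling2D((2, 2))(code)                          # upsampling
outputs = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="sgd", loss="mse")               # reconstruction error

# The trained encoder (input -> compressed_features) supplies the compressed
# image features used downstream by the multiple feature autoencoder.
encoder = Model(inputs, code)
autoencoder.summary()
```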

Multiple feature autoencoder deep network.

We used the variational Bayesian approach36-40 for multiple feature autoencoder learning, which is applied by the training algorithm called stochastic gradient variational Bayes.36,41 The variational autoencoder model inherits autoencoder architecture that compresses our multiple features from the input layer into a short code and then uncompresses that code to match closely the original feature set. It also makes strong assumptions about the distribution of our feature set. The encoding network comprises one input layer and one hidden layer, and the decoding network comprises one feature layer and one hidden layer. An input-output relation for a neuron is defined as

$$\hat{\vartheta} = \sum_{i=1}^{M_2} \left(W_g \circ W_q\right)\!\left(x_{\mathrm{VAE}}\right)$$

where $x_{\mathrm{VAE}}$ is the input set, and $\hat{\vartheta}$ is the reconstructed output of this network. $M_2$ is the number of neurons in the network. $W_g$ denotes the set of learnable connection weights between the combined feature layer and the hidden layer, and $W_q$ denotes the set of learnable connection weights between the hidden layer and the compressed layer. The input is the feature data matrix (ie, the output feature matrix of the medical image autoencoder deep network) plus the normalized features listed in Table 1. This network comprises one input feature layer, two hidden layers, one feature output layer, and one uncoded layer.
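Below is a minimal TensorFlow 2.x Keras sketch of a variational autoencoder of this shape, trained with the reparameterization trick used by stochastic gradient variational Bayes. All dimensions are illustrative assumptions, not the BRISK settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

FEATURE_DIM, HIDDEN, LATENT = 256, 64, 16   # illustrative dimensions

inputs = layers.Input(shape=(FEATURE_DIM,))
h = layers.Dense(HIDDEN, activation="relu")(inputs)
z_mean = layers.Dense(LATENT)(h)
z_log_var = layers.Dense(LATENT)(h)

def sample(args):
    mean, log_var = args
    eps = tf.random.normal(tf.shape(mean))
    return mean + tf.exp(0.5 * log_var) * eps   # reparameterization trick

z = layers.Lambda(sample)([z_mean, z_log_var])
h_dec = layers.Dense(HIDDEN, activation="relu")(z)
outputs = layers.Dense(FEATURE_DIM)(h_dec)

vae = Model(inputs, outputs)
# Reconstruction error plus the KL term that encodes the distributional
# assumption on the latent code.
kl = -0.5 * tf.reduce_mean(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
vae.add_loss(kl)
vae.compile(optimizer="sgd", loss="mse")
```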

Decision-making multilayer perceptron.

The multilayer perceptron is a feed-forward, supervised neural network topology.42,43 It consists of one input layer, one hidden layer, and one output layer. Each neuron sums its weighted inputs to form a net input and produces an output by passing this net input through a rectified linear unit activation function. An input-output relation for a neuron is defined as

$$\hat{\theta} = \sum_{i=1}^{M_3} W_h\!\left(x_{\mathrm{MLP}}\right)$$

where $x_{\mathrm{MLP}}$ is the input set, and $\hat{\theta}$ is the output of our model. $M_3$ is the number of neurons in the network, and $W_h$ denotes the set of learnable connection weights between the input layer and the hidden layer. In our work, the input is the feature data matrix (ie, the output feature matrix of the medical image autoencoder deep network) plus the normalized features listed in Table 1. This network comprises one input layer, one hidden layer, and one output layer.
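A minimal TensorFlow 2.x Keras sketch of such a decision-making perceptron follows, with ReLU hidden units and a softmax output over the two classes (benign vs malignant), trained with categorical cross-entropy and stochastic gradient descent as described. The layer widths are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

LATENT = 16  # compressed features from the multiple feature autoencoder

inputs = layers.Input(shape=(LATENT,))
h = layers.Dense(32, activation="relu")(inputs)    # weighted sum + ReLU transfer
outputs = layers.Dense(2, activation="softmax")(h) # benign vs malignant

mlp = Model(inputs, outputs)
mlp.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
            loss="categorical_crossentropy", metrics=["accuracy"])
```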

Optimization

To optimize the combination of correlation between the learned representations and the reconstruction errors of the autoencoders, the following error function was used to optimize our model; it consists of two autoencoder networks and the multilayer perceptron:

$$\min_{W_f, W_g, W_p, W_q, W_h} \left\{ \frac{\alpha}{N} \sum_{i=1}^{N} \left[ \left\| x_i - p\big(f(x_i)\big) \right\|^2 + \left\| y_i - q\big(g(y_i)\big) \right\|^2 \right] + \frac{\beta}{N} \sum_{i=1}^{N} \left\| h\big(g(y_i)\big) - \theta \right\|^2 \right\}$$

where $x_i$, $i = 1, \ldots, N$, is the input image of the medical image autoencoder deep network ($N$ is the sample size) and $y_i$, $i = 1, \ldots, N$, is the input of the multiple feature autoencoder deep network; $f$ and $p$ are the encoding and decoding networks of the medical image autoencoder deep network, whereas $g$ and $q$ are the encoding and decoding networks of the multiple feature autoencoder deep network; $h$ denotes the decision-making multilayer perceptron. $W_f$, $W_g$, $W_p$, $W_q$, and $W_h$ are the learnable parameters of each network, $\alpha$ and $\beta$ are adjustable parameters in this function, and $\theta$ is the desired output.

To constrain the optimization, sparsity is a desired characteristic for our network because it allows the use of a greater number of hidden units and, therefore, gives the network the ability to learn different connections and extract different features. In the two autoencoder networks, we separately defined the average activation value of a hidden-layer neuron as

$$\hat{\rho}_i^f = \frac{1}{N} \sum_{i=1}^{N} f(x_i) \quad \text{and} \quad \hat{\rho}_i^g = \frac{1}{N} \sum_{i=1}^{N} g(y_i)$$

We defined the sparsity parameter $\rho$ as the desired average activation value for every hidden neuron, and by initializing it to a value close to zero, we can enforce sparsity as follows:

$$\hat{\rho}_i^f \approx \rho \quad \text{and} \quad \hat{\rho}_i^g \approx \rho$$

To achieve this, a Kullback-Leibler (KL) divergence term44 was used:

$$O\,\mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_i^f\big) + O\,\mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_i^g\big) = O\left[ \rho \log\frac{\rho}{\hat{\rho}_i^f} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_i^f} \right] + O\left[ \rho \log\frac{\rho}{\hat{\rho}_i^g} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_i^g} \right]$$

The KL divergence is measured $O$ times (where $O = 4$ in our validation) between a Bernoulli random variable with mean $\rho$ and Bernoulli random variables with means $\hat{\rho}_i^f$ and $\hat{\rho}_i^g$ used to model a single neuron. Therefore, the final form of the energy function is:

$$\min_{W_f, W_g, W_p, W_q, W_h} \left\{ \frac{\alpha}{N} \sum_{i=1}^{N} \left[ \left\| x_i - p\big(f(x_i)\big) \right\|^2 + \left\| y_i - q\big(g(y_i)\big) \right\|^2 + O\,\mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_i^f\big) + O\,\mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_i^g\big) \right] + \frac{\beta}{N} \sum_{i=1}^{N} \left\| h\big(g(y_i)\big) - \theta \right\|^2 \right\}$$
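The sparsity penalty can be computed directly from its definition. The following NumPy sketch evaluates the KL term for illustrative values of $\rho$ and the observed mean activations; these numbers are placeholders, not values from our experiments.

```python
import numpy as np

def sparsity_kl(rho: float, rho_hat: np.ndarray) -> float:
    """Sum over hidden units of KL between a Bernoulli with mean rho (the
    sparsity target) and a Bernoulli with the unit's observed mean activation."""
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)   # numerical safety
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))

rho = 0.05                                        # desired average activation
rho_hat = np.array([0.04, 0.10, 0.05, 0.30])      # observed mean activations
penalty = sparsity_kl(rho, rho_hat)               # added O times to the loss
print(round(penalty, 4))
```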

RESULTS

In the validation, we evaluated the model by comparing its predictions with BI-RADS 4 patient pathology reports. We defined five diagnosis types from the biopsy report: benign, atypia, lobular carcinoma in situ, ductal carcinoma in situ (DCIS), and carcinoma. Invasive breast cancer and DCIS were considered positive results; any other pathology diagnosis was considered free of breast cancer and a negative result. Appendix Table A1 lists the number of BI-RADS 4 patients in our training and testing sets. The two categories used in our classifier were benign and malignant. The 5,147 patients were randomly divided into 3,900 (76%) for training and 1,247 (24%) for blind validation (testing). In the training data, 79% of the patients (3,090 of 3,900) had benign findings, and 21% (810 of 3,900) had malignant findings. In the testing data, 72% of the patients (897 of 1,247) had benign findings, and 28% (350 of 1,247) had malignant findings. Because patients with benign findings naturally outnumber those with malignant findings, this class imbalance was expected; it was addressed by applying the kernel modification method of He and Garcia.45 To avoid overfitting while fine-tuning the network parameters on the training set of 3,900 patients, 10-fold cross-validation was applied according to the MATLAB-based method.46 After splitting the training data set into 10 equal parts, we trained our model on nine parts and evaluated it on the remaining part, repeating this process until each of the 10 parts had been used for evaluation. We finally selected the model with the best performance metric for additional testing on our reserved testing data.
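The following sketch illustrates this 10-fold cross-validation protocol. A scikit-learn logistic regression and random features stand in for the BRISK network and its inputs; both are placeholders for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Placeholder training set: 3,900 patients with 64 placeholder features each.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((3900, 64)), rng.integers(0, 2, 3900)

# 10 folds: train on 9 parts, evaluate on the held-out part, rotate.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for tr_idx, va_idx in skf.split(X_train, y_train):
    clf = LogisticRegression(max_iter=1000).fit(X_train[tr_idx], y_train[tr_idx])
    scores.append(clf.score(X_train[va_idx], y_train[va_idx]))
print(f"mean CV accuracy over 10 folds: {np.mean(scores):.3f}")
```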

Table 2 lists the details of the validation results using the blind validation (testing) data set of 1,247 patients. To keep the clinical risk low, we adjusted the model to guarantee that all patients with a high cancer risk are referred to biopsy, which requires the sensitivity of our model to be 100%. This choice sacrifices specificity and increases the false-positive rate, meaning that some patients with benign disease are misclassified and would still be sent for biopsy. The sensitivity of our model was 100% (all 350 patients with actual malignant results were predicted malignant), and the specificity was 74% (663 of the 897 patients with actual benign results were predicted benign; Table 2). The total accuracy of our implemented model in BRISK was 81%.
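This operating-point choice can be sketched as follows: set the decision threshold just below the lowest predicted risk among true malignancies so that sensitivity is 100%, then measure the resulting specificity. The scores below are random placeholders, not BRISK outputs.

```python
import numpy as np

# Placeholder test set mirroring the class sizes: 350 malignant, 897 benign.
rng = np.random.default_rng(1)
y_true = np.concatenate([np.ones(350), np.zeros(897)])
scores = np.where(y_true == 1, rng.uniform(0.4, 1.0, 1247),   # malignant scores
                  rng.uniform(0.0, 0.8, 1247))                # benign scores

# Threshold just below the lowest malignant score: every malignancy is flagged.
threshold = scores[y_true == 1].min() - 1e-9
y_pred = (scores >= threshold).astype(int)

tp = int(((y_pred == 1) & (y_true == 1)).sum())
tn = int(((y_pred == 0) & (y_true == 0)).sum())
sensitivity = tp / int((y_true == 1).sum())   # 1.0 by construction
specificity = tn / int((y_true == 0).sum())
accuracy = (tp + tn) / len(y_true)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"accuracy={accuracy:.2f}")
```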

TABLE 2.

Number of BI-RADS 4 Patients’ Cases and Validation Results (total accuracy: 81%)


BRISK is a diagnosis support tool for risk stratification of patients with abnormal mammograms that uses integrative artificial intelligence technology. It improves patient-physician engagement in making an informed decision about whether to biopsy. To our knowledge, no previous work combines both clinical and imaging data for such an application. To validate our model objectively, we calculated the area under the curve and compared it with that of traditional machine learning methods, including decision tree,47 discriminant analysis,48 logistic regression,49 k-nearest neighbors,50 support vector machine,51 and long short-term memory networks.52 Although the convolutional neural network53 is a good deep learning model for medical image classification, it is limited in handling the multiple types of information our model requires, so we did not include it in the comparison. Table 3 lists the model settings and area under the curve values for the machine learning methods compared. Our model substantially outperformed the other methods. In Appendix Figures A1 and A2, we show the robustness plot of the BRISK model across accuracy scales and the receiver operating characteristic curve from our model validation. We will continue to validate the model in the prospective study and compare it with other newly published models in the future.
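A baseline comparison of this kind can be sketched with scikit-learn classifiers, with synthetic, imbalanced data standing in for the BRISK feature set; the AUC values it prints are illustrative only and unrelated to Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: ~75% benign to mimic the class imbalance.
X, y = make_classification(n_samples=2000, n_features=64,
                           weights=[0.75], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

baselines = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(),
    "support vector machine": SVC(probability=True, random_state=0),
}
for name, clf in baselines.items():
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC={auc:.3f}")
```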

TABLE 3.

All Model Settings and the AUC Values After Using Different Machine Learning Methods for Comparison


DISCUSSION

We present the BRISK tool, which uses an integrative artificial intelligence strategy that incorporates NLP, image processing, a deep learning–based analytic model, and data from thousands of BI-RADS 4 patients to achieve precise breast cancer biopsy risk assessment and decision support. Patient data included mammogram and ultrasound images, free-text radiology and pathology reports, patient demographics, and other administrative information extracted from the clinical data warehouse of Houston Methodist. BRISK was able to categorize abnormal mammogram findings into subtypes (benign, atypia, lobular carcinoma in situ, DCIS, and carcinoma) and improve the biopsy recommendation compared with BI-RADS 4 recommendations, with high sensitivity and specificity. We have validated the tool with data from thousands of BI-RADS 4 patients. After prospective studies are conducted, the tool will be made accessible on the Web, where BRISK will display an index measure of biopsy recommendation that is more clinically relevant and informative than traditional BI-RADS scores. Through the Web site, clinicians will be able to enter image data and clinical variables of the patient under consideration for breast cancer risk assessment and biopsy.

To our knowledge, this diagnosis support model is the first for abnormal breast imaging to use NLP combined with deep learning methods. We compared the proposed model with manual review results and showed that our method maintains high accuracy. Because BI-RADS 4 poses a major problem of unnecessary biopsies, the current version of BRISK focuses specifically on BI-RADS 4 patients. In the future, we will extend our model to the other BI-RADS categories, where the false-positive burden is much smaller. Our model was trained and tested using large numbers of patients from Houston Methodist, and we are now extending the evaluation of BRISK to other cancer center sites. Finally, we plan to extend these integrative artificial intelligence tools to reduce overdiagnosis and overtreatment of other types of cancer.

ACKNOWLEDGMENT

We thank our physician collaborators and hospital information technology colleagues at the Houston Methodist Hospital for help with this project.

Appendix

FIG A1.

Model robustness plot of the proposed model, showing how accuracy scales with the number of data instances. AUC, area under the curve.

FIG A2.

Receiver operating characteristic (ROC) curve of experimental results using our Breast Cancer Risk Calculator (BRISK) model. The arrow indicates the model used in our BRISK application.

TABLE A1.

Composition of the Training and Test Sets


Supported partly by the John S. Dunn Research Foundation and Ting Tsung and Wei Fong Chao Center for Bioinformatics Research and Imaging for Neurosciences as well as by the facilities and clinical and imaging data sources of Houston Methodist.

T.H. and M.P. are co-first authors.

AUTHOR CONTRIBUTIONS

Conception and design: Tiancheng He, Mamta Puppala, Chika F. Ezeana, Shenyi Chen, Joe Ensor, Jenny Chang, Tejal Patel, Stephen T.C. Wong

Financial support: Stephen T.C. Wong

Administrative support: Tejal Patel, Jenny Chang, Stephen T.C. Wong

Provision of study materials or patients: Mamta Puppala, Yan-siang Huang, Ping-hsuan Chou, Stephen T.C. Wong

Collection and assembly of data: Tiancheng He, Mamta Puppala, Chika F. Ezeana, Ping-hsuan Chou, Xiaohui Yu, Shenyi Chen, Lin Wang, Jenny Chang, Tejal Patel, Stephen T.C. Wong

Data analysis and interpretation: Tiancheng He, Mamta Puppala, Chika F. Ezeana, Yan-siang Huang, Zheng Yin, Rebecca L. Danforth, Joe Ensor, Jenny Chang, Tejal Patel, Stephen T.C. Wong

Manuscript writing: All authors

Final approval of manuscript: All authors

Accountable for all aspects of the work: All authors

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/jco/site/ifc.

Joe Ensor

Consulting or Advisory Role: Aetna, Statanalytics ldd

Jenny Chang

Consulting or Advisory Role: Genentech, Celgene

Travel, Accommodations, Expenses: Celgene, Genentech

Stephen T.C. Wong

Stock and Other Ownership Interests: Akiri, Community Health System, Health2047

Consulting or Advisory Role: Health2047

Travel, Accommodations, Expenses: Health2047

No other potential conflicts of interest were reported.

REFERENCES

  • 1. American Cancer Society: Cancer Facts & Figures 2018. American Cancer Society, 2018. https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2018.html
  • 2. Hortobagyi GN, Connolly JL, D'Orsi CJ, et al: Breast, in AJCC Cancer Staging Manual. New York, NY, Springer, 2002
  • 3. Lehman CD, Lee AY, Lee CI: Imaging management of palpable breast abnormalities. AJR Am J Roentgenol 203:1142-1153, 2014
  • 4. Smith RA, Cokkinides V, von Eschenbach AC, et al: American Cancer Society guidelines for the early detection of cancer. CA Cancer J Clin 52:8-22, 2002
  • 5. Weigert J, Steenbergen S: The Connecticut experiment: The role of ultrasound in the screening of women with dense breasts. Breast J 18:517-522, 2012
  • 6. Lehman CD, Lee CI, Loving VA, et al: Accuracy and value of breast ultrasound for primary imaging evaluation of symptomatic women 30-39 years of age. AJR Am J Roentgenol 199:1169-1177, 2012
  • 7. van Luijt PA, Fracheboud J, Heijnsdijk EA, et al: Nation-wide data on screening performance during the transition to digital mammography: Observations in 6 million screens. Eur J Cancer 49:3517-3525, 2013
  • 8. Halladay JR, Yankaskas BC, Bowling JM, et al: Positive predictive value of mammography: Comparison of interpretations of screening and diagnostic images by the same radiologist and by different radiologists. AJR Am J Roentgenol 195:782-785, 2010
  • 9. Bent CK, Bassett LW, D'Orsi CJ, et al: The positive predictive value of BI-RADS microcalcification descriptors and final assessment categories. AJR Am J Roentgenol 194:1378-1383, 2010
  • 10. Ong M-S, Mandl KD: National expenditure for false-positive mammograms and breast cancer overdiagnoses estimated at $4 billion a year. Health Aff (Millwood) 34:576-583, 2015
  • 11. Wallace AS, Nelson JP, Wang Z, et al: In support of the Choosing Wisely campaign: Perceived higher risk leads to unnecessary imaging in accelerated partial breast irradiation? Breast J 24:12-15, 2018. doi: 10.1111/tbj.12832
  • 12. Ayer T, Alagoz O, Chhatwal J, et al: Breast cancer risk estimation with artificial neural networks revisited: Discrimination and calibration. Cancer 116:3310-3321, 2010
  • 13. Park CC, Rembert J, Chew K, et al: High mammographic breast density is independent predictor of local but not distant recurrence after lumpectomy and radiotherapy for invasive breast cancer. Int J Radiat Oncol Biol Phys 73:75-79, 2009
  • 14. Eriksson L, Czene K, Rosenberg LU, et al: Mammographic density and survival in interval breast cancers. Breast Cancer Res 15:R48, 2013
  • 15. Ayer T, Chen Q, Burnside ES: Artificial neural networks in mammography interpretation and diagnostic decision making. Comput Math Methods Med 2013:832509, 2013
  • 16. Stojadinovic A, Eberhardt C, Henry L, et al: Development of a Bayesian classifier for breast cancer risk stratification: A feasibility study. Eplasty 10:e25, 2010
  • 17. Burnside ES, Davis J, Chhatwal J, et al: Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings. Radiology 251:663-672, 2009
  • 18. Bengio Y: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2:1-127, 2009
  • 19. Maas AL, Hannun AY, Ng AY: Rectifier nonlinearities improve neural network acoustic models. Proc Int Conf Machine Learning 30:1-3, 2013
  • 20. Puppala M, He T, Chen S, et al: METEOR: An enterprise health informatics environment to support evidence-based medicine. IEEE Trans Biomed Eng 62:2776-2786, 2015
  • 21. Patel TA, Puppala M, Ogunti RO, et al: Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods. Cancer 123:114-121, 2017
  • 22. He T, Puppala M, Ogunti R, et al: Deep learning analytics for diagnostic support of breast cancer disease management, in 2017 IEEE EMBS International Conference on Biomedical and Health Informatics. Piscataway, NJ, Institute of Electrical and Electronics Engineers, 2017, pp 365-368
  • 23. Wilhelm SM, Wang TS, Ruan DT, et al: The American Association of Endocrine Surgeons guidelines for definitive management of primary hyperparathyroidism. JAMA Surg 151:959-968, 2016
  • 24. Puppala M, He T, Yu X, et al: Data security and privacy management in healthcare applications and clinical data warehouse environment, in 2016 IEEE EMBS International Conference on Biomedical and Health Informatics. Piscataway, NJ, Institute of Electrical and Electronics Engineers, 2016, pp 5-8
  • 25. Breuel TM: The effects of hyperparameters on SGD training of neural networks. arXiv preprint arXiv:1508.02788, 2015
  • 26. Rumelhart DE, Hinton GE, Williams RJ: Learning representations by back-propagating errors. Nature 323:533-536, 1986
  • 27. Joachims T: Text categorization with support vector machines: Learning with many relevant features, in Nédellec C, Rouveirol C (eds): Machine Learning: ECML-98. Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence). Berlin, Germany, Springer, 1998, pp 137-142
  • 28. Li X, Roth D: Learning question classifiers, in Proceedings of the 19th International Conference on Computational Linguistics, Volume 1. Stroudsburg, PA, Association for Computational Linguistics, 2002, pp 1-7
  • 29. Mikolov T, Sutskever I, Chen K, et al: Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 1:3111-3119, 2013
  • 30. Kim Y: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882, 2014
  • 31. van den Oord A, Kalchbrenner N, Espeholt L, et al: Conditional image generation with PixelCNN decoders. Adv Neural Inf Process Syst 1:4790-4798, 2016
  • 32. Badrinarayanan V, Kendall A, Cipolla R: SegNet: A deep convolutional encoder-decoder architecture for scene segmentation. IEEE Trans Pattern Anal Mach Intell 39:2481-2495, 2017
  • 33. Chen H, Zhang Y, Kalra MK, et al: Low-dose CT with a residual encoder-decoder convolutional neural network (RED-CNN). arXiv preprint arXiv:1702.00288, 2017
  • 34. Xu J, Luo X, Wang G, et al: A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing 191:214-223, 2016
  • 35. Schmidhuber J: Deep learning in neural networks: An overview. Neural Netw 61:85-117, 2015
  • 36. Kingma DP, Welling M: Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013
  • 37. Kingma DP, Welling M: Stochastic gradient VB and the variational auto-encoder. Second International Conference on Learning Representations 1:1-14, 2014
  • 38. Tran DL, Walecki R, Eleftheriadis S, et al: DeepCoder: Semi-parametric variational autoencoders for facial action unit intensity estimation. arXiv preprint arXiv:1704.02206, 2017
  • 39. Mescheder L, Nowozin S, Geiger A: Adversarial variational Bayes: Unifying variational autoencoders and generative adversarial networks. arXiv preprint arXiv:1701.04722, 2017
  • 40. Rosca M, Lakshminarayanan B, Warde-Farley D, et al: Variational approaches for auto-encoding generative adversarial networks. arXiv preprint arXiv:1706.04987, 2017
  • 41. Rezende DJ, Mohamed S, Wierstra D: Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082, 2014
  • 42. Tang J, Deng C, Huang G-B: Extreme learning machine for multilayer perceptron. IEEE Trans Neural Netw Learn Syst 27:809-821, 2016
  • 43. Naraei P, Abhari A, Sadeghian A: Application of multilayer perceptron neural networks and support vector machines in classification of healthcare data, in 2016 Future Technologies Conference (FTC). Piscataway, NJ, Institute of Electrical and Electronics Engineers, 2016, pp 848-852
  • 44. Vincent P, Larochelle H, Bengio Y, et al: Extracting and composing robust features with denoising autoencoders, in Proceedings of the 25th International Conference on Machine Learning. New York, NY, Association for Computing Machinery, 2008, pp 1096-1103
  • 45. He H, Garcia EA: Learning from imbalanced data. IEEE Trans Knowl Data Eng 21:1263-1284, 2009
  • 46. Golub GH, Von Matt U: Generalized cross-validation for large-scale problems. J Comput Graph Stat 6:1-34, 1997
  • 47. Melillo P, De Luca N, Bracale M, et al: Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability. IEEE J Biomed Health Inform 17:727-733, 2013
  • 48. Kan M, Shan S, Zhang H, et al: Multi-view discriminant analysis. IEEE Trans Pattern Anal Mach Intell 38:188-194, 2016
  • 49. Manogaran G, Lopez D: Health data analytics using scalable logistic regression with stochastic gradient descent. Int J Adv Intell Paradigms 10:118-132, 2018
  • 50. Khamis HS, Cheruiyot KW, Kimani S: Application of k-nearest neighbour classification in medical data mining. Information and Communication Technology Research 4:121-128, 2014
  • 51. Razzaghi T, Roderick O, Safro I, et al: Multilevel weighted support vector machine for classification on healthcare data with missing values. PLoS One 11:e0155119, 2016
  • 52. Lipton ZC, Kale DC, Elkan C, et al: Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677, 2015
  • 53. Araújo T, Aresta G, Castro E, et al: Classification of breast cancer histology images using convolutional neural networks. PLoS One 12:e0177544, 2017
