Abstract
Objective:
To assess whether a computer-aided diagnosis (CAD) system can predict pathological Complete Response (pCR) to neoadjuvant chemotherapy (NAC) prior to treatment using texture features.
Methods:
Response to treatment of 44 patients was defined according to the histopathology of the resected tumour and extracted axillary nodes in two ways: (a) pCR+ (Smith’s Grade = 5) vs pCR− (Smith’s Grade < 5); (b) pCRN+ (pCR+ and absence of residual lymph node metastases) vs pCRN−. A CAD system was developed to: (i) segment the breasts; (ii) register the DCE-MRI sequence; (iii) detect the lesion and (iv) extract 27 3D texture features. The role of individual texture features, multiparametric models and Bayesian classifiers in predicting patients’ response to NAC was evaluated.
Results:
A cross-validated Bayesian classifier fed with 6 features was able to predict pCR with a specificity of 72% and a sensitivity of 67%. Conversely, 2 features were used by the Bayesian classifier to predict pCRN, obtaining a sensitivity of 69% and a specificity of 61%.
Conclusion:
A CAD scheme that extracts texture features from an automatically segmented 3D mask of the tumour could predict pathological response to NAC. Additional research should be performed to validate these promising results on a larger cohort of patients and using different classification strategies.
Advances in knowledge:
This is the first study assessing the role of an automatic CAD system in predicting the pathological response to NAC before treatment. Fully automatic methods represent the backbone of standardized analysis and may help in the timely management of patients who are candidates for NAC.
INTRODUCTION
Neoadjuvant chemotherapy (NAC) has a leading role in the preoperative treatment of patients with large breast lesions,1,2 thanks to its clinical advantages. First, it allows downstaging of the tumour so that more conservative therapies can be proposed instead of mastectomy.3,4 In addition, an improved survival rate has been reported for patients achieving pathological Complete Response (pCR) after the treatment.5 Moreover, with this therapy it is possible to monitor the treatment response by measuring the “in vivo” tumour changes during and after NAC. It has been demonstrated that NAC could lead to a pCR in up to 30% of patients with breast cancers.6,7 However, the rate of response to NAC is limited and depends on the breast cancer subtype.8–15 If non-response could be identified early, ineffective chemotherapy could be stopped and unnecessary toxicity for the patient could be avoided.5 Indeed, early identification of treatment response would be of key importance to improve patients’ management, since it would enable the use of alternative, potentially more effective therapies tailored to the individual patient.
Magnetic Resonance Imaging (MRI) is currently used in clinical practice to assess the response at the end of NAC.16 In several studies, the variation of morpho-functional features provided by MRI before and during the course of NAC has been shown to be a potential “surrogate” biomarker for the early discrimination between patients who will and will not have responded by the end of the treatment.16–19 However, the suboptimal reproducibility of these features is the main limitation to their application in daily clinical practice.
Recently, the predictive value of quantitative biomarkers based on textural characteristics of the image has been explored. Textural analysis has gained wide application in medical image analysis20–25 for its ability to characterize the spatial dependence of grey levels using high-order statistics. In particular, it has proven useful to detect and characterize breast lesions and, more recently, has shown promise in predicting tumour response to therapy.15,26–29 However, most previous studies evaluated quantitative parameters acquired after the completion of at least one cycle of chemotherapy; therefore, their findings could not be used to redirect treatment regimens for patients who are not likely to respond to NAC.29 Moreover, the majority of previous studies did not use the pathological response after surgery as reference standard, which is the time point best associated with the final prognosis of the patient.26 To the best of our knowledge, only three previous studies explored the role of two-dimensional textural analysis of Dynamic Contrast Enhanced (DCE)-MRI obtained prior to treatment to predict pathological response to NAC.26,28,29 In particular, Teruel et al26 showed, on a dataset of 58 patients, that 4 individual texture features were significantly correlated with pCR, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.68. Michoux et al28 and Golden et al29 developed multiparametric classifiers to predict pathological non-response to therapy using 69 and 60 patients, respectively. The former reached a predictive accuracy of 68%, while the latter obtained an AUC equal to 0.68 in predicting pathological Complete Response. These previous results were promising; however, further investigations are needed to better generalize these findings, i.e. including larger groups of tumour subtypes, to exploit the whole tumour characteristics by using a three-dimensional (3D) approach, and to standardize the methods by developing an automatic computer-aided diagnosis (CAD) scheme to segment the tumours.
In this scenario, the objective of this proof-of-concept study is to assess the feasibility of a CAD scheme able to automatically extract quantitative 3D biomarkers and classify each patient according to the likelihood of pCR to NAC, by considering also the different immunohistochemical subtypes.
METHODS AND MATERIALS
Patients
Patients were retrospectively included from a prospective single-centre observational study performed at our institution between November 2010 and February 2014, having the following inclusion criteria: (a) age between 18 and 65 years; (b) presence of imaging-guided core-biopsy proven Stage II/III operable breast cancer (T > 3 cm) or inoperable locally-advanced breast cancer; and (c) unifocal or multiple masses at baseline MRI. This study was conducted in compliance with the ethical regulations of our Institution and patients were asked to provide written informed consent before entering the study. All patients were evaluated by our multidisciplinary team clinic before and after the completion of NAC. Enrolled subjects underwent 4 cycles of treatment based on a combination of doxorubicin 50 mg m–2 bolus i.v. followed by paclitaxel 175 mg m–2 as a 3 h i.v. infusion. In patients with baseline left ventricular ejection fraction <55%, doxorubicin was omitted and monochemotherapy with paclitaxel 225 mg m–2 as a 3 h i.v. infusion was administered. Cycles were repeated every 21 days if absolute neutrophil count ≥ 1500 μl–1 and platelets ≥ 100,000 μl–1, and otherwise delayed until resolution of hematologic toxicity.17 Women with HER2-positive breast cancer also received trastuzumab. All patients underwent surgery, which was performed between 14 and 35 days after the completion of NAC.
Immunohistochemical analysis and pathological tumour response
Immunohistochemical (IHC) analysis was performed on specimens from imaging-guided core-biopsies. Positivity for Estrogen Receptor (ER) and Progesterone Receptor (PgR) status was defined as immunostaining in ≥1% of invasive tumour cells, while Ki67 was considered positive when expressed by more than 14% of tumour cells.30 HER2 status was assessed according to the ASCO/CAP Guideline recommendation.31 Positivity was defined as a 3+ score by IHC in >30% of invasive tumour cells using the HercepTest (Dako, Glostrup, Denmark). Equivocal cases at IHC (2+ score or 3+ in ≤30% of invasive tumour cells) were subjected to fluorescence in situ hybridization analysis. A ratio of HER2 gene signals to chromosome 17 signals of more than 2.2 was used as a cut-off to define HER2 gene amplification. Tumours were divided into Luminal A (ER-positive and Ki67 <14% and HER2-negative), Luminal B (ER-positive and Ki67 ≥14% and either HER2-positive or HER2-negative), HER2-enriched (ER-negative and HER2-positive) and triple-negative (ER-negative and HER2-negative).30 The histopathological tumour response was evaluated using a five-point assessment scheme described by Smith et al:32 Grade 1, some alteration to individual malignant cells but no reduction in overall numbers as compared with the pre-treatment core biopsy; Grade 2, a mild loss of invasive tumour cells but overall cellularity still high; Grade 3, a considerable reduction in tumour cells up to an estimated 90% loss; Grade 4, a marked disappearance of invasive tumour cells such that only small clusters of widely dispersed cells could be detected; and Grade 5, no invasive tumour cells identifiable in the sections from the site of the previous tumour, that is, only in situ disease or tumour stroma remained. Grade 5 response was deemed to represent a pCR of the primary cancer. Pathological Complete Response at axillary level was classified as absence of residual invasive tumour in the lymph nodes.
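The subtype rules stated above can be written as a small decision function. The following is only an illustrative sketch: the function name `ihc_subtype` and its arguments are hypothetical, and the ER-positive, Ki67 <14%, HER2-positive combination is returned as unclassified because the stated definitions do not cover it.

```python
from typing import Optional


def ihc_subtype(er_positive: bool, her2_positive: bool,
                ki67_percent: float) -> Optional[str]:
    """Map IHC markers to the subtype labels used in the text.

    Returns None for a marker combination that the stated rules
    do not cover (ER-positive, Ki67 < 14%, HER2-positive).
    """
    if er_positive:
        if ki67_percent >= 14:
            # Luminal B may be either HER2-positive or HER2-negative
            return "Luminal B/HER2+" if her2_positive else "Luminal B/HER2-"
        if not her2_positive:
            return "Luminal A"
        return None  # not defined by the rules quoted above
    # ER-negative tumours are split by HER2 status only
    return "HER2-enriched" if her2_positive else "Triple negative"
```

Note that PgR status does not enter the subtype definitions as stated, so it is not an argument of this sketch.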
MRI protocol
MRI examination was carried out before the first cycle of chemotherapy (baseline), within 2 weeks from the second cycle (intermediate) and after the completion of the planned treatment, within 1 week before surgery. MRI was acquired with a 1.5 T scanner and a dedicated phased-array 8-channel coil (HDx Signa Excite, GE HealthCare, Milwaukee, WI), with the patient in the prone position and following the recommended technical requirements for breast imaging.16 In particular, the DCE-MRI study was performed using a fat-sat 3D fast spoiled gradient-echo sequence (VIBRANT®, General Electric, Milwaukee, WI) with slice thickness = 2.6 mm, acquisition matrix = 416 × 416 and flip angle = 10°. A total of six scans were acquired for each study: one baseline, 4 contrast-enhanced frames with 90 s time resolution, and one delayed frame acquired 8 min after i.v. contrast agent administration (Multihance, Bracco Imaging, Milan, Italy). The contrast-enhanced study was started simultaneously with the bolus injection of 0.1 mmol kg–1 of gadolinium chelate, infused in the antecubital vein by power injector at a rate of 2 ml s−1 and followed by a saline flush. Thirty-three patients were imaged along the axial plane with repetition time/echo time (TR/TE) = 5.4/2.6 ms and pixel size = 0.39 mm2, while 11 patients were imaged using a sagittal sequence with TR/TE = 4.8/1.9 ms and pixel size = 0.22 mm2.
Image analysis
The lesion segmentation method is based on a previously developed, fully automatic algorithm33 that consists of several steps. First, the sagittal volumes are converted into axial images and resampled accordingly (upsampled along the x-axis and downsampled along the z-axis). Then, an elastic registration is performed to align the enhanced DCE-MRI frames to the unenhanced one, thus correcting for misalignments due to patient motion. Once all datasets are registered, the breasts are segmented using the algorithm developed by Giannini et al34 adapted to the fat-sat 3D fast spoiled gradient-echo sequence. Finally, the tumour is automatically segmented on the subtracted mean intensity projection image over time, normalized by the contrast enhancement of the mammary vessels. This normalization has been shown to be useful to cope with the significant variations of signal intensities between patients due to different scanners, coils, acquisition modalities, types and amounts of contrast agent injected, patients’ physiology and other external factors, and to provide a reliable automatic segmentation.33 Since the automatic algorithm may produce false positives (FPs), an experienced radiologist (more than 20 years of experience in interpreting breast MRI) selected, for each patient, the true positive among the segmented areas. In cases of multifocal disease, the largest tumour was selected as the index tumour and taken into consideration for the subsequent steps.
Features extraction
Twenty-seven 3D textural features were extracted from the subtracted post-contrast first frame of the pre-NAC (baseline) DCE-MRI studies. In particular, 17 features were derived from the grey-level co-occurrence matrices (GLCM),35 and 10 were computed from the grey-level run length method (GLRLM).36 The GLCM is a tabulation of how often different combinations of grey levels occur between neighbouring voxels in an image along a given direction. Therefore, the GLCM allows the calculation of second-order statistics, i.e. statistics describing the relationship between pairs of voxels in the image. Conversely, in the GLRLM method, each element GLRLMθ(i, j) represents the number of runs of j adjacent voxels with grey level i, calculated in direction θ.
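To make the GLRLM definition concrete, the sketch below counts runs along a single array axis and derives the short and long run emphasis from the resulting matrix. It is a minimal NumPy illustration, not the in-house C++/ITK implementation: the function names are hypothetical, the whole array is treated as the ROI (no tumour mask), and only one of the 13 directions is handled.

```python
import numpy as np


def glrlm_along_axis(volume, n_levels, axis=-1):
    """Run-length matrix for one direction (a single array axis).

    Entry [i, j - 1] is the number of runs of exactly j consecutive
    voxels with grey level i. Assumes integer levels in [0, n_levels).
    """
    vol = np.moveaxis(np.asarray(volume), axis, -1)
    max_run = vol.shape[-1]
    glrlm = np.zeros((n_levels, max_run), dtype=np.int64)
    for line in vol.reshape(-1, max_run):
        run_level, run_len = line[0], 1
        for value in line[1:]:
            if value == run_level:
                run_len += 1
            else:
                glrlm[run_level, run_len - 1] += 1
                run_level, run_len = value, 1
        glrlm[run_level, run_len - 1] += 1  # close the last run of the line
    return glrlm


def long_run_emphasis(glrlm):
    """Weights each run by the square of its length."""
    j = np.arange(1, glrlm.shape[1] + 1)
    return float((glrlm * j ** 2).sum() / glrlm.sum())


def short_run_emphasis(glrlm):
    """Weights each run by the inverse square of its length."""
    j = np.arange(1, glrlm.shape[1] + 1)
    return float((glrlm / j ** 2).sum() / glrlm.sum())
```

For a toy volume `[[[0, 0, 1], [1, 1, 1]]]` with two grey levels, the matrix holds one run of length 2 at level 0, and one run of length 1 and one of length 3 at level 1.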
Before extracting texture parameters, we first equalized the region of interest (ROI) histogram by rescaling the signal intensities within each ROI, between the first and the 99th percentile, into 256 bins. Then, to take into account the contribution of all voxels adjacent to the reference one, the GLCMs and the GLRLMs were generated for each of the 13 directions characterizing a 3D image. In the case of the GLCM calculation, a distance of one voxel was chosen. The 13 matrices were then averaged to make the method rotationally invariant with respect to the distribution of texture. Finally, the following 17 features were obtained from the GLCMs: contrast,35 correlation1,35 correlation2,37 energy,35 entropy,35 homogeneity,35 sum variance,35 sum entropy,35 sum average,35 difference variance,35 difference entropy,35 information measure of correlation1,35 information measure of correlation2,35 cluster prominence,38 cluster shade,39 dissimilarity40 and maximum probability.41 Moreover, the following 10 features were computed from the GLRLMs: short run emphasis,41 long run emphasis,41 grey level distribution,41 run length distribution,41 low grey level runs emphasis,42 high grey level runs emphasis,42 short run low grey level emphasis,43 short run high grey level emphasis,43 long run low grey level emphasis43 and long run high grey level emphasis.43 All texture features were computed using in-house software implemented in C++ with the ITK libraries.44
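The steps described above (percentile rescaling into 256 bins, one GLCM per direction at a distance of one voxel, averaging over the 13 directions) can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the in-house software: the function names are hypothetical, the ROI is assumed to be a cuboid array with at least two voxels per dimension (no tumour mask), and the GLCM is made symmetric, which the text does not specify.

```python
import itertools

import numpy as np


def quantize(roi_values, n_bins=256):
    """Rescale intensities between the 1st and 99th percentile into n_bins levels."""
    lo, hi = np.percentile(roi_values, [1, 99])
    clipped = np.clip(roi_values, lo, hi)
    scaled = (clipped - lo) / max(hi - lo, 1e-12)
    return np.minimum((scaled * n_bins).astype(np.int64), n_bins - 1)


# The 26 neighbours of a voxel form 13 opposite pairs; keep one offset per pair.
DIRECTIONS = [d for d in itertools.product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]


def glcm_3d(levels, offset, n_levels):
    """Symmetric, normalised co-occurrence matrix for one voxel offset (distance 1)."""
    src, dst = [], []
    for size, step in zip(levels.shape, offset):
        src.append(slice(max(0, -step), size - max(0, step)))
        dst.append(slice(max(0, step), size - max(0, -step)))
    a = levels[tuple(src)].ravel()  # reference voxels
    b = levels[tuple(dst)].ravel()  # their neighbours along the offset
    glcm = np.zeros((n_levels, n_levels), dtype=np.float64)
    np.add.at(glcm, (a, b), 1.0)
    glcm += glcm.T  # assumption: count each pair in both orders
    return glcm / glcm.sum()


def mean_glcm(levels, n_levels):
    """Average of the 13 directional matrices."""
    return np.mean([glcm_3d(levels, d, n_levels) for d in DIRECTIONS], axis=0)


def contrast(glcm):
    i, j = np.indices(glcm.shape)
    return float(((i - j) ** 2 * glcm).sum())


def energy(glcm):
    return float((glcm ** 2).sum())
```

A perfectly uniform volume gives a contrast of 0 and an energy of 1, which is a quick sanity check for any implementation of these two features.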
Statistical analysis
Response to treatment was dichotomized as follows:
at breast level: pCR+ (Smith’s Grade = 5) vs pCR− (Smith’s Grade < 5);
at both breast and axillary level: pCRN+ (Smith’s Grade = 5 plus either complete absence of residual nodal metastases or presence of nodal micrometastasis) vs pCRN− (Smith’s Grade < 5 and any other status of axillary lymph node metastases).
Tumour subtypes and immunohistochemical characteristics were classified as previously described, and differences between pCR+ and pCR−, and between pCRN+ and pCRN− tumours were assessed using the Fisher’s exact mid-P test.
Age and tumour size were expressed as median with interquartile ranges in parentheses and their association with NAC response was evaluated using the Mann-Whitney test.
The relationship between outcome (pCR and pCRN) and texture features was explored by two approaches: mono-parametric and multiparametric. We first evaluated the performance of each 3D texture parameter in predicting the pathological response to therapy at breast level, with or without nodal level, using the ROC curve. AUC, sensitivity and specificity at the best cut-off were computed. The best cut-off is the one that maximizes the Youden index (sensitivity + specificity − 1), i.e. the cut-point of the ROC curve that optimizes the biomarker’s differentiating ability when equal weight is given to sensitivity and specificity.45 A p-value < 0.05 was considered as indicating an AUC significantly greater than 0.5.
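As an illustration of the cut-off selection described above, the following sketch scans all observed values of a feature and keeps the threshold that maximizes the Youden index. The analyses in this study were run in MedCalc; the function names here are hypothetical, and the decision rule "positive when value > cut-off" is an assumption that must be flipped for features reported with a "≤" cut-off.

```python
import numpy as np


def auc(values, labels):
    """Area under the ROC curve as P(positive > negative) + 0.5 * P(tie)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = values[labels][:, None] - values[~labels][None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1.

    A case is called positive when its value is strictly above the cutoff.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = (None, 0.0, 0.0, -np.inf)
    for cutoff in np.unique(values):
        predicted = values > cutoff
        sens = float(np.mean(predicted[labels]))     # true positive rate
        spec = float(np.mean(~predicted[~labels]))   # true negative rate
        j = sens + spec - 1.0
        if j > best[3]:
            best = (float(cutoff), sens, spec, j)
    return best[:3]
```

When several cut-offs tie on the Youden index, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.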
Analyses were performed with a statistical software (MedCalc Statistical Software version 17.4, Ostend, Belgium).
Afterwards, we combined the features into two different multiparametric classifiers. For both classifiers, we performed a feature selection step to discard uninformative characteristics in order to prevent over-fitting, speed up the learning process as well as improve the model’s interpretability. The first classifier was the logistic regression model, in which features were selected using the backward regression method. This method consists in entering all the variables in the model and sequentially removing (one-at-a-time) those that are non-significant (p > 0.20) for the model (i.e. having the largest p).
Subsequently, a Bayesian classifier was tested. In this case, the classical approach referred to as the “filter approach” was used to perform feature selection. This method consists in first ranking all features based on a criterion independent of the classifier, and then selecting features from this ranked list by setting a threshold that accounts for the classifier performance. Two different ranking methods were used: the Fisher (F)-score method46 and the value of the AUC of the individual features. To avoid arbitrarily selecting a threshold on the number of features, we used a previously published method,47 in which the first n features were extracted from the sorted list and the classification performance achieved with this feature subset was computed. The classification performance was thus derived as a function of the number n of first-ranked features. Performance was measured as the AUC, accuracy, sensitivity and specificity derived from a leave-one-out cross-validation.48 The leave-one-out approach involves training on all but one case, testing the classification on the left-out patient, and repeating the procedure until each case has been tested individually. Accuracy, sensitivity and specificity were then estimated and used to identify the set of features that yielded the best predictive models.
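The filter approach with leave-one-out cross-validation can be sketched as follows. Several choices here are assumptions made only for illustration and are not stated in the text: the Bayesian classifier is taken to be a Gaussian naive Bayes model, the F-score uses a commonly quoted definition (between-class separation of the means divided by the sum of within-class sample variances), features are ranked once on all cases, and the best subset is chosen by accuracy alone. All function names are hypothetical.

```python
import numpy as np


def f_score(X, y):
    """Per-feature F-score; X is (cases, features), y is a boolean array."""
    pos, neg = X[y], X[~y]
    num = (pos.mean(0) - X.mean(0)) ** 2 + (neg.mean(0) - X.mean(0)) ** 2
    den = pos.var(0, ddof=1) + neg.var(0, ddof=1)
    return num / den


def gnb_predict(X_train, y_train, x):
    """Gaussian naive Bayes: True if the positive class has the larger log-posterior."""
    scores = []
    for cls in (False, True):
        Xc = X_train[y_train == cls]
        mu, var = Xc.mean(0), Xc.var(0) + 1e-9  # small floor avoids division by zero
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        scores.append(np.log(len(Xc) / len(X_train)) + log_lik)
    return scores[1] > scores[0]


def loo_performance(X, y):
    """Leave-one-out accuracy, sensitivity and specificity."""
    preds = np.array([
        gnb_predict(np.delete(X, i, axis=0), np.delete(y, i), X[i])
        for i in range(len(y))
    ])
    acc = float(np.mean(preds == y))
    sens = float(np.mean(preds[y]))
    spec = float(np.mean(~preds[~y]))
    return acc, sens, spec


def best_top_n(X, y):
    """Evaluate the n first-ranked features for every n; return the best by accuracy."""
    order = np.argsort(f_score(X, y))[::-1]
    results = [(n, *loo_performance(X[:, order[:n]], y))
               for n in range(1, X.shape[1] + 1)]
    return max(results, key=lambda r: r[1]), order
```

Ranking by individual AUC instead of F-score only requires replacing the `f_score` call with a per-feature AUC. Repeating the ranking inside each leave-one-out fold would be a stricter variant of the same idea.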
RESULTS
Forty-four patients were included in the dataset. Patient and lesion characteristics are reported in Table 1. Age, tumour size and histological type did not differ significantly between pCR+ and pCR− or between pCRN+ and pCRN−.
Table 1.
Characteristic | All (n = 44) | pCR+ (n = 15) | pCR− (n = 29) | p-value | pCRN+ (n = 13) | pCRN− (n = 31) | p-value |
Age | 46 (39–53) | 46 (40–53) | 47 (38–52) | 0.9802a | 46 (40–54) | 47 (38–53) | 0.8773a |
Size | 37.5 (30–50) | 38 (30–50) | 36 (30–50) | 0.7369a | 38 (30–52) | 36 (30–49) | 0.4965a |
Histological type | |||||||
IDC | 39 | 13 (33.3%) | 26 (66.7%) | 0.8234b | 13 (33.3%) | 26 (66.7%) | 0.2223b |
ILC | 3 | 1 (33.3%) | 2 (66.7%) | 0.7701b | 0 | 3 (100%) | 0.3739b |
Mucinous cancer | 1 | 0 | 1 (100%) | 0.6705b | 0 | 1 (100%) | 0.6477b |
Squamous cancer | 1 | 1 (100%) | 0 | 0.1705b | 0 | 1 (100%) | 0.6477b |
Immunohistochemical | |||||||
ER positivity | 29 | 4 (13.8%) | 25 (86.2%) | 0.0002b | 4 (13.8%) | 25 (86.2%) | 0.0027b |
PgR positivity | 27 | 4 (14.8%) | 23 (85.2%) | 0.0013b | 4 (14.8%) | 23 (85.2%) | 0.0114b |
ER– & PgR– | 14 | 10 (71.4%) | 4 (28.6%) | 0.0002b | 8 (57.1%) | 6 (42.9%) | 0.0076b |
Ki67 > 14% | 38 | 14 (36.8%) | 24 (63.2%) | 0.0939b | 12 (31.6%) | 26 (68.4%) | 0.4959b |
HER2 positivity | 16 | 9 (56.2%) | 7 (43.8%) | 0.0122b | 7 (43.8%) | 9 (56.2%) | 0.1307b |
Subtypes | |||||||
Luminal A | 4 | 0 | 4 (100%) | 0.1899b | 0 | 4 (100%) | 0.1865b |
Luminal B/HER2– | 18 | 2 (11.1%) | 16 (88.9%) | 0.0144b | 2 (11.1%) | 16 (88.9%) | 0.0313b |
Luminal B/HER2+ | 5 | 2 (40.0%) | 3 (60.0%) | 0.8273b | 2 (40.0%) | 3 (60.0%) | 0.4619b |
Triple negative | 4 | 3 (75.0%) | 1 (25.0%) | 0.0509b | 3 (75.0%) | 1 (25.0%) | 0.0379b |
HER2-enriched | 10 | 7 (70.0%) | 3 (30.0%) | 0.0039b | 5 (50.0%) | 5 (50.0%) | 0.0871b |
pCR+, Smith’s Grade = 5; pCR−, Smith’s Grade < 5; pCRN+, pCR+ plus either complete absence of residual nodal metastases or presence of nodal micrometastasis; pCRN−, Smith’s Grade < 5 and any other status of axillary lymph node metastases. IDC, invasive ductal carcinoma; ILC, invasive lobular carcinoma; ER, estrogen receptor; PgR, progesterone receptor; HER2, human epidermal growth factor receptor 2. Age and tumour size are expressed as median with interquartile ranges in parentheses, while other measurements are expressed as counts with percentages in parentheses.
a: p-value of the Mann-Whitney test.
b: p-value of the Fisher’s exact mid-P test.
ER positive and PgR positive tumours were less represented in the pCR+ and pCRN+ groups, while no differences were observed between Ki67 positive and negative tumours. pCR+ were significantly more represented in HER2 positive group and in tumours with both negative ER and PgR. pCRN+ were significantly more represented in tumours with both negative ER and PgR. According to tumour subtypes, pCR+ were significantly more represented in the HER2-enriched group, while pCRN+ were significantly more represented in the triple negative group. Both pCR+ and pCRN+ were less represented in patients with luminal B/HER2– tumours.
Mono-parametric approach
When individual parameters were compared with pCR outcome, we found 7 parameters with an AUC significantly greater than 0.5 (Table 2). The feature with the highest AUC was contrast, with a sensitivity and specificity at the best cut-off equal to 46.7 and 93.1%, respectively. Two examples of the image processing pipeline are shown in Figures 1 and 2. Contrast was significantly higher for pCR+ tumours (Figure 1) than for pCR− tumours (Figure 2), while correlation showed lower values for pCR+ tumours. Considering texture features from the GLRLM, we found that higher long run emphasis and higher long run high grey level emphasis were correlated with better response to therapy. The 3D parameters correlated (p < 0.05) with the pCRN outcome were cluster shade, sum variance, long run emphasis and long run high grey level emphasis (Table 3). The highest AUC was reached using long run high grey level emphasis, and was equal to 0.747, with 100% sensitivity and 54.8% specificity at the best cut-off point.
Table 2.
Variable | AUC | SE | Cut-off | Sensitivity (%) | Specificity (%) |
Contrast | 0.722 | 0.0851 | >3385 | 46.7 | 93.1 |
Correlation | 0.715 | 0.0848 | ≤1.465 × 10−4 | 60.0 | 82.8 |
Sum variance | 0.674 | 0.0813 | >74770 | 86.7 | 51.7 |
Difference variance | 0.699 | 0.0874 | ≤2.46 × 10−5 | 46.7 | 89.7 |
Difference entropy | 0.713 | 0.0859 | >4.751 | 60.0 | 82.8 |
LRE | 0.676 | 0.0806 | >1.247 | 80.0 | 58.6 |
LRHGE | 0.708 | 0.0777 | >24213 | 93.3 | 55.2 |
AUC, area under the ROC curve; SE, standard error; LRE, long run emphasis; LRHGE, long run high grey level emphasis; pCR+, Smith’s Grade = 5; pCR−, Smith’s Grade < 5; pCRN+, pCR+ plus either complete absence of residual nodal metastases or presence of nodal micrometastasis; pCRN−, Smith’s Grade < 5 and any other status of axillary lymph node metastases.
Table 3.
Variable | AUC | SE | Cut-off | Sensitivity (%) | Specificity (%) |
Cluster shade | 0.685 | 0.0842 | ≤7039 | 92.3 | 41.9 |
Sum variance | 0.687 | 0.0817 | >74770 | 92.3 | 51.6 |
LRE | 0.712 | 0.0790 | >1.258 | 84.6 | 61.3 |
LRHGE | 0.747 | 0.0731 | >24213 | 100.0 | 54.8 |
AUC, area under the ROC curve; SE, standard error; LRE, long run emphasis; LRHGE, long run high grey level emphasis; pCR+, Smith’s Grade = 5; pCR−, Smith’s Grade < 5; pCRN+, pCR+ plus either complete absence of residual nodal metastases or presence of nodal micrometastasis; pCRN−, Smith’s Grade < 5 and any other status of axillary lymph node metastases.
Multi-parametric approach
Using the logistic regression classifier, we obtained a model to predict pCR response in which two parameters were kept: sum variance (p = 0.04) and difference entropy (p = 0.01). The AUC of the model was 0.795 (95% CI [0.647–0.902]), with a sensitivity and a specificity at the best cut-off (0.36) of 80 and 69%, respectively (Figure 3). When the logistic regression classifier was used to predict pCR with a lymph nodal response, i.e. pCRN, three texture features were maintained in the model: cluster shade (p = 0.04), long run emphasis (p = 0.11) and long run high grey level emphasis (p = 0.19). The AUC of the model was equal to 0.764 (95% CI [0.612–0.879]), with a sensitivity of 46% and a specificity of 100% at the best cut-off (0.53).
When predicting pCR with the Bayesian classifier, the best accuracy (70%) was obtained when the first 6 features ranked by the F-score (cluster shade, correlation, contrast, difference entropy, correlation 1 and difference variance) were fed into the classifier. Specificity and sensitivity were equal to 72 and 67%, respectively. The Bayesian classifier was also tested to predict the pCR associated with a lymph node response, i.e. pCRN. In this case, the best results were obtained when the features were ranked according to the AUC value of each individual feature. When the first 2 features (long run high grey level emphasis and long run emphasis) were included, the highest sensitivity was reached (69%), with a specificity of 61% and an accuracy of 64%.
DISCUSSION
In this study we developed and tested a CAD scheme that aims to predict pathological response to NAC based on 3D texture features extracted from an automatic segmentation of the tumour at baseline MRI examination. We demonstrated that some individual texture features can discriminate between responders and non-responders at breast and axillary level before NAC. When analysing pathologic response at breast level (pCR+ vs pCR−), seven parameters reached statistical significance, with very different diagnostic performances (three parameters reached high sensitivity, while the other four had good specificity). When considering the prediction of pathological response at both breast and axillary level (pCRN+ vs pCRN−), four parameters were able to discriminate, before the treatment, between patients achieving pCRN+ and subjects obtaining pCRN−, and all of them reached high sensitivity, with poor specificity. These results suggest that DCE-MRI could be used as a non-invasive examination to predict, with fair accuracy, which patients are likely to respond to NAC.
Previously, only Michoux et al28 and Teruel et al26 assessed the role of individual texture features in predicting pathological response to NAC. Michoux et al28 demonstrated that the inverse difference moment, which is a measure of the local homogeneity of the grey levels, was inversely correlated to the NAC response, and that it showed the highest AUC (0.711) in predicting NAC response. Analogously, we found that higher contrast, which is inversely correlated to homogeneity, was an index of a better response to NAC (AUC = 0.722). This behaviour might be explained by the higher vascularity which characterizes tumours that are more likely to respond to NAC, i.e. pCR+.49 Indeed, in highly vascularized tumours the general shape of the blood vessels is altered and deformed, becoming very rough and resulting in increased irregularity. Thus, when a contrast agent is injected, if the tumoural region shows a higher vascularization, there is enhanced intensity of blood vessels in the images and a corresponding increase of contrast values.50 In addition, in our dataset we showed that patients with negative steroid receptor status were more likely to achieve pCR+, and it has been demonstrated that the VEGF+ phenotype is more frequently associated with a negative steroid receptor status.51 Therefore, higher contrast in pCR+ patients might also be explained by the fact that most pCR+ patients had a negative steroid receptor status. On the other hand, compared with the study of Michoux et al,28 we did not obtain a significant AUC when considering the homogeneity of the ROI per se. The reason might be twofold. First, Michoux et al28 derived the texture features from a 2D ROI, defined as the largest region of contiguous pixels with the same behaviour in amplitude and wash-in of the signal intensity vs time curve, rather than using a 3D segmentation of the tumour. Therefore, they did not evaluate differences between slices of the tumour that might affect the homogeneity of the ROI. Second, their dataset included only patients with invasive ductal carcinoma, therefore homogeneity could be biased by the choice of a single histological type of cancer. Teruel et al26 showed that sum variance and sum entropy were the most predictive parameters for NAC response. Despite the great variations in imaging protocols and patient cohorts, we obtained similar results when considering sum variance and difference entropy, which supports the potential of texture features as standard and robust quantitative biomarkers.
One of the main advantages of our work is that we used the pathological response after surgery as reference standard, which represents the time point best associated with the final prognosis of the patient, also considering the evolution of the patients from their clinical response to their final pathological outcome.26
A second important result of our study is that we demonstrated that it could be feasible to develop robust models for the early prediction of the response to NAC at both breast and axillary level by combining different texture features in a multiparametric approach. In particular, we have shown that a two-parameter logistic regression classifier can predict pCR with a higher AUC than the mono-parametric approach (0.795 vs 0.722). More importantly, we demonstrated that a cross-validated Bayesian classifier can reach an accuracy of 70% in predicting pCR. This result could appear similar to that obtained with some individual features; however, the comparison is flawed, as cross-validation could not be performed in the mono-parametric analysis. Cross-validation is necessary to get an unbiased estimate of the predictive accuracy of the model in patients that were not used to train the classifier and to assess the relevance of the working hypothesis and its clinical applicability. Only two previous studies developed cross-validated classifiers to predict pathological response to NAC from pre-treatment texture features. Michoux et al28 obtained an accuracy of 68% with a specificity of 62% and sensitivity of 84% in predicting non-responder patients using a k-means classifier. Similarly, Golden et al29 reported that the use of 31 GLCM-derived features prior to treatment was able to predict pCR with an AUC of 0.68 for patients with triple-negative breast cancers. In our work we reached a slightly higher accuracy, and we also took advantage of two innovative approaches that could overcome some limitations of previous studies.
First, our dataset comprised both invasive ductal carcinoma and invasive lobular carcinoma, which is more representative of daily clinical practice, and it comprises different tumour subtypes. Second, we developed a fully automatic CAD scheme able to segment the whole tumour and provide a 3D mask which is less operator dependent.52 To the best of our knowledge, there are no studies that extracted pre-treatment texture features from an automatically segmented 3D mask of the tumour. Fully automatic lesion detection could represent the backbone of standardized analysis and may contribute to the introduction of quantitative biomarkers for a timely management of patients who are candidates for NAC. Moreover, the automatic segmentation is also able to reduce the post-processing time for the radiologist. Another strength of our work is that we tried to define some imaging biomarkers that were related not only to the pathological response at breast level, but also to the overall loco-regional response, because of its relevant clinical implications in terms of adjuvant treatment and prognosis. Indeed, it has been shown that the achievement of pCR at both breast and axillary node levels is associated with improved long-term clinical outcomes.
There are some limitations of our work. First, this is a retrospective study based on a limited number of patients, therefore a second study with a larger dataset should be performed to validate our results. However, in this proof-of-concept study we obtained promising results from texture features extracted from an automatic 3D segmentation of the tumour, and this could set the basis for the development of computer-assisted prediction solutions for breast MRI. Second, in our study a specific timepoint corresponding to the enhancement peak on intensity-time curves has been evaluated, based on the findings of previously published works. Indeed, Ahmed et al15 evaluated the correlation between pre-NAC texture features and the clinical response to NAC at different timepoints, and they found significant differences occurring at 1 and 2 min after contrast injection. Further tests on late timepoints should be conducted to evaluate whether different timepoints could better predict pathological response to NAC at breast and nodal level. A possible further limitation could be the inclusion of mass tumours only, due to the fact that in our institution tumour response in patients with non-mass cancers is monitored by clinical examinations and conventional imaging. However, our clinical work-up aims to combine clinical evidence, resource optimization and tailored treatment. It has been demonstrated that the accuracy of MRI in the assessment of tumour response is higher in mass lesions than in diffuse cancers53 and that the likelihood of conservative treatment after NAC is lower in patients with non-mass lesions at baseline.54 As a consequence, we prefer to reserve a costly examination for those patients who more frequently benefit from both MRI and conservative surgical treatment after NAC.
In conclusion, in this work we showed that a CAD scheme that extracts texture features from an automatically segmented 3D mask of the tumour could help in predicting pathological response to NAC at both breast and axillary level, which is a response more relevant in terms of adjuvant treatment and prognosis. From a clinical point of view, such methods should, ideally, obtain a very high negative predictive value (i.e. ≥90%), thus achieving a twofold advantage for patients: (a) an early modification of the treatment for those patients who are unlikely to respond; (b) a reduction of toxicity due to unnecessary treatments. In this study, we reached a negative predictive value of 81% when predicting pCR+ patients; therefore, additional research should be performed to increase the performance of our method. In general, we would like to develop and test other statistical classifiers (e.g. support vector machines) and/or unsupervised algorithms (e.g. k-means or hierarchical clustering) to improve classification performance. In addition, it would be interesting to combine texture features with dynamic and pharmacokinetic modelling, thus adding further functional information. Finally, it would be necessary to validate our promising results on a larger prospective cohort of patients and using different imaging acquisition protocols.
Nevertheless, the findings of this work might support the development of schemes that better select patients eligible for NAC, thus avoiding unnecessary treatment if the regimen is predicted to be unsuccessful.
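As a brief illustration of how the negative predictive value discussed above relates to sensitivity and specificity, the following minimal Python sketch computes these metrics from confusion-matrix counts. The function name `diagnostic_metrics` and the counts are hypothetical: the counts were chosen only so that the resulting rates come out close to the values quoted in this paper, and they are not the study's actual confusion matrix.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # responders correctly predicted
        "specificity": tn / (tn + fp),  # non-responders correctly predicted
        "ppv": tp / (tp + fp),          # predicted responders who truly respond
        "npv": tn / (tn + fn),          # predicted non-responders who truly do not respond
    }


# Hypothetical counts for illustration only (NOT taken from the study data).
metrics = diagnostic_metrics(tp=10, fn=5, tn=21, fp=8)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With counts of this kind, the negative predictive value depends on the number of false negatives relative to true negatives, so reaching a target such as ≥90% would require fewer responders to be misclassified as non-responders.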
Contributor Information
Valentina Giannini, Email: valentina.giannini@ircc.it.
Simone Mazzetti, Email: simone.mazzetti@ircc.it.
Agnese Marmo, Email: marmoagnese@gmail.com.
Filippo Montemurro, Email: gianninivalentina@gmail.com.
Daniele Regge, Email: valentina.giannini@ircc.it.
Laura Martincich, Email: laura.martincich@ircc.it.
References
- 1. M Kaufmann, G von Minckwitz, R Smith, V Valero, L Gianni, W Eiermann, et al. International expert panel on the use of primary (preoperative) systemic treatment of operable breast cancer: review and recommendations. J Clin Oncol 2003; 21: 2600–8.
- 2. SD Heys, AW Hutcheon, TK Sarkar, KN Ogston, ID Miller, S Payne, et al. Neoadjuvant docetaxel in breast cancer: 3-year survival results from the Aberdeen trial. Clin Breast Cancer 2002; 3(Suppl. 2): S69–S74.
- 3. JS Mieog, JA van der Hage, CJ van de Velde. Preoperative chemotherapy for women with operable breast cancer. Cochrane Database Syst Rev 2007; 18: CD005002.
- 4. B Fisher, J Bryant, N Wolmark, E Mamounas, A Brown, ER Fisher, et al. Effect of preoperative chemotherapy on the outcome of women with operable breast cancer. J Clin Oncol 1998; 16: 2672–85.
- 5. H Kim, HH Kim, JS Park, HJ Shin, JH Cha, EY Chae. Prediction of pathological complete response of breast cancer patients undergoing neoadjuvant chemotherapy: usefulness of breast MRI computer-aided detection. Br J Radiol 2015; 88: 20150143.
- 6. MF Press, G Sauter, M Buyse, L Bernstein, R Guzman, A Santiago, et al. Alteration of topoisomerase II-alpha gene in human breast cancer: association with responsiveness to anthracycline-based chemotherapy. J Clin Oncol 2011; 29: 859–67.
- 7. JC Chang, EC Wooten, A Tsimelzon, SG Hilsenbeck, MC Gutierrez, R Elledge, et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet 2003; 362: 362–9.
- 8. GP Barbi, P Marroni, P Bruzzi, G Nicolò, M Paganuzzi, GB Ferrara. Correlation between steroid hormone receptors and prognostic factors in human breast cancer. Oncology 1987; 44: 265–9.
- 9. G von Minckwitz, HP Sinn, G Raab, S Loibl, JU Blohmer, H Eidtmann, et al. Clinical response after two cycles compared to HER2, Ki-67, p53, and bcl-2 in independently predicting a pathological complete response after preoperative chemotherapy in patients with operable carcinoma of the breast. Breast Cancer Res 2008; 10: R30.
- 10. LJL Esserman, E Kaplan, S Partridge, D Tripathy, H Rugo, J Park, et al. MRI phenotype is associated with response to doxorubicin and cyclophosphamide neoadjuvant chemotherapy in stage III breast cancer. Ann Surg Oncol 2001; 8: 549–59.
- 11. R Nishimura, T Osako, Y Okumura, M Hayashi, N Arima. Clinical significance of Ki-67 in neoadjuvant chemotherapy for primary breast cancer as a predictor for chemosensitivity and for prognosis. Breast Cancer 2010; 17: 269–75.
- 12. A Fangberget, LB Nilsen, KH Hole, MM Holmen, O Engebraaten, B Naume, et al. Neoadjuvant chemotherapy in breast cancer-response evaluation and prediction of response to treatment using dynamic contrast-enhanced and diffusion-weighted MR imaging. Eur Radiol 2011; 21: 1188–99.
- 13. HM Kuerer, LA Newman, TL Smith, FC Ames, KK Hunt, K Dhingra, et al. Clinical course of breast cancer patients with complete pathologic primary tumor and axillary lymph node response to doxorubicin-based neoadjuvant chemotherapy. J Clin Oncol 1999; 17: 460–9.
- 14. JR Gralow, HJ Burstein, W Wood, GN Hortobagyi, L Gianni, G von Minckwitz, et al. Preoperative therapy in invasive breast cancer: pathologic assessment and systemic therapy issues in operable disease. J Clin Oncol 2008; 26: 814–9.
- 15. A Ahmed, P Gibbs, M Pickles, L Turnbull. Texture analysis in assessment and prediction of chemotherapy response in breast cancer. J Magn Reson Imaging 2013; 38: 89–101.
- 16. F Sardanelli, C Boetes, B Borisch, T Decker, M Federico, FJ Gilbert, et al. Magnetic resonance imaging of the breast: recommendations from the EUSOMA working group. Eur J Cancer 2010; 46: 1296–316.
- 17. L Martincich, F Montemurro, G De Rosa, V Marra, R Ponzone, S Cirillo, et al. Monitoring response to primary chemotherapy in breast cancer using dynamic contrast-enhanced magnetic resonance imaging. Breast Cancer Res Treat 2004; 83: 67–76.
- 18. AA Tardivon, L Ollivier, C El Khoury, F Thibault. Monitoring therapeutic efficacy in breast carcinomas. Eur Radiol 2006; 16: 2549–58.
- 19. E Bufi, P Belli, M Costantini, A Cipriani, M Di Matteo, A Bonatesta, et al. Role of the apparent diffusion coefficient in the prediction of response to neoadjuvant chemotherapy in patients with locally advanced breast cancer. Clin Breast Cancer 2015; 15: 370–80.
- 20. W Chen, ML Giger, H Li, U Bick, GM Newstead. Volumetric texture analysis of breast lesions on contrast-enhanced magnetic resonance images. Magn Reson Med 2007; 58: 562–71.
- 21. HP Chan, D Wei, MA Helvie, B Sahiner, DD Adler, MM Goodsitt, et al. Computer-aided classification of mammographic masses and normal tissue: linear discriminant analysis in texture feature space. Phys Med Biol 1995; 40: 857–76.
- 22. H Li, ML Giger, OI Olopade, A Margolis, L Lan, MR Chinander. Computerized texture analysis of mammographic parenchymal patterns of digitized mammograms. Acad Radiol 2005; 12: 863–73.
- 23. P Gibbs, LW Turnbull. Textural analysis of contrast-enhanced MR images of the breast. Magn Reson Med 2003; 50: 92–8.
- 24. SB Antel, DL Collins, N Bernasconi, F Andermann, R Shinghal, RE Kearney, et al. Automated detection of focal cortical dysplasia lesions using computational models of their MRI characteristics and texture analysis. Neuroimage 2003; 19: 1748–59.
- 25. A Vignati, S Mazzetti, V Giannini, F Russo, E Bollito, F Porpiglia, et al. Texture features on T2-weighted magnetic resonance imaging: new potential biomarkers for prostate cancer aggressiveness. Phys Med Biol 2015; 60: 2685–701.
- 26. JR Teruel, MG Heldahl, PE Goa, M Pickles, S Lundgren, TF Bathen, et al. Dynamic contrast-enhanced MRI texture analysis for pretreatment prediction of clinical and pathological response to neoadjuvant chemotherapy in patients with locally advanced breast cancer. NMR Biomed 2014; 27: 887–96.
- 27. F Aghaei, M Tan, AB Hollingsworth, W Qian, H Liu, B Zheng. Computer-aided breast MR image feature analysis for prediction of tumor response to chemotherapy. Med Phys 2015; 42: 6520–8.
- 28. N Michoux, S Van den Broeck, L Lacoste, L Fellah, C Galant, M Berlière, et al. Texture analysis on MR images helps predicting non-response to NAC in breast cancer. BMC Cancer 2015; 15: 574.
- 29. DI Golden, JA Lipson, ML Telli, JM Ford, DL Rubin. Qualitative and quantitative image-based biomarkers of therapeutic response in triple-negative breast cancer. AMIA Jt Summits Transl Sci Proc 2013; 2013: 62.
- 30. L Martincich, V Deantoni, I Bertotto, S Redana, F Kubatzki, I Sarotto, et al. Correlations between diffusion-weighted imaging and breast cancer biomarkers. Eur Radiol 2012; 22: 1519–28.
- 31. AC Wolff, ME Hammond, JN Schwartz, KL Hagerty, DC Allred, RJ Cote, et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. J Clin Oncol 2007; 25: 118–45.
- 32. IC Smith, SD Heys, AW Hutcheon, ID Miller, S Payne, et al. Neoadjuvant chemotherapy in breast cancer: significantly enhanced response with docetaxel. J Clin Oncol 2002; 20: 1456–66.
- 33. A Vignati, V Giannini, M De Luca, L Morra, D Persano, LA Carbonaro, et al. Performance of a fully automatic lesion detection system for breast DCE-MRI. J Magn Reson Imaging 2011; 34: 1341–51.
- 34. V Giannini, A Vignati, L Morra, D Persano, D Brizzi, L Carbonaro, et al. A fully automatic algorithm for segmentation of the breasts in DCE-MR images. Conf Proc IEEE Eng Med Biol Soc 2010; 2010: 3146–9.
- 35. RM Haralick, K Shanmugam, I Dinstein. Textural features for image classification. IEEE Trans Syst Man Cybern 1973; SMC-3: 610–21.
- 36. RW Conners, CA Harlow. A theoretical comparison of texture algorithms. IEEE Trans Pattern Anal Mach Intell 1980; 2: 204–22.
- 37. DA Clausi. An analysis of co-occurrence texture statistics as a function of grey level quantization. Can J Remote Sens 2002; 28: 45–62.
- 38. JA Nystuen, FW Garcia. Sea ice classification using SAR backscatter statistics. IEEE Trans Geosci Remote Sens 1992; 30: 502–9.
- 39. M Unser. Sum and difference histograms for texture classification. IEEE Trans Pattern Anal Mach Intell 1986; 8: 118–25.
- 40. DG Barber, EF LeDrew. SAR sea ice discrimination using texture statistics: a multivariate approach. Photogramm Eng Remote Sensing 1991; 57: 385–95.
- 41. ME Shokr. Evaluation of second-order texture parameters for sea ice classification from radar images. J Geophys Res 1991; 96: 10625–40.
- 42. MM Galloway. Texture analysis using gray level run lengths. Comput Graph Image Process 1975; 4: 172–9.
- 43. BV Dasarathy, EB Holder. Image characterizations based on joint gray level—run length distributions. Pattern Recognit Lett 1991; 12: 497–502.
- 44. HJ Johnson, M McCormick, L Ibanez. The ITK software guide. 3rd ed. Kitware, Inc; 2013. http://www.itk.org/ItkSoftwareGuide.pdf
- 45. WJ Youden. Index for rating diagnostic tests. Cancer 1950; 3: 32–5.
- 46. RO Duda, PE Hart, DG Stork. Pattern classification; 2000.
- 47. E Niaf, O Rouvière, F Mège-Lechevallier, F Bratan, C Lartizien. Computer-aided diagnosis of prostate cancer in the peripheral zone using multiparametric MRI. Phys Med Biol 2012; 57: 3833–51.
- 48. K Baumann. Cross-validation as the objective function for variable-selection techniques. TrAC Trends in Analytical Chemistry 2003; 22: 395–406.
- 49. WH Kuo, CN Chen, FJ Hsieh, MK Shyu, LY Chang, PH Lee, et al. Vascularity change and tumor response to neoadjuvant chemotherapy for advanced breast cancer. Ultrasound Med Biol 2008; 34: 857–66.
- 50. JV Raja, M Khan, VK Ramachandra, O Al-Kadi. Texture analysis of CT images in the characterization of oral cancers involving buccal mucosa. Dentomaxillofac Radiol 2012; 41: 475–80.
- 51. D Coradini, C Pellizzaro, S Veneroni, L Ventura, MG Daidone. Infiltrating ductal and lobular breast carcinomas are characterised by different interrelationships among markers related to angiogenesis and hormone dependence. Br J Cancer 2002; 87: 1105–11.
- 52. GP Liney, P Gibbs, C Hayes, MO Leach, LW Turnbull. Dynamic contrast-enhanced MRI in the differentiation of breast tumors: user-defined versus semi-automated region-of-interest analysis. J Magn Reson Imaging 1999; 10: 945–9.
- 53. CE Loo, ME Straver, S Rodenhuis, SH Muller, J Wesseling, MJ Vrancken Peeters, et al. Magnetic resonance imaging response monitoring of breast cancer during neoadjuvant chemotherapy: relevance of breast cancer subtype. J Clin Oncol 2011; 29: 660–6.
- 54. ER Price, J Wong, R Mukhtar, N Hylton, LJ Esserman. How to use magnetic resonance imaging following neoadjuvant chemotherapy in locally advanced breast cancer. World J Clin Cases 2015; 3: 607–13.