Abstract
Purpose:
Conventional breast MRI is highly sensitive for cancer detection but prompts some false-positives. We performed a prospective, multicenter study to determine whether apparent diffusion coefficients (ADCs) from diffusion weighted imaging (DWI) can decrease MRI false-positives.
Experimental Design:
107 women with MRI-detected BI-RADS 3, 4, or 5 lesions were enrolled from March 2014 to April 2015. ADCs were measured both centrally and at participating sites. Receiver operating characteristic (ROC) analysis was employed to assess diagnostic performance of centrally-measured ADCs and identify optimal ADC thresholds to reduce unnecessary biopsies. Lesion reference standard was based on either definitive biopsy result or at least 337 days of follow-up after the initial MRI procedure.
Results:
Of 107 women enrolled, 67 patients (median age 49, range 24–75 years) with 81 lesions with confirmed reference standard (28 malignant, 53 benign) and evaluable DWI were analyzed. 67/81 lesions were BI-RADS 4 (n=63) or 5 (n=4) and recommended for biopsy. Malignancies exhibited lower mean centrally-measured ADCs (mm2/s) than benign lesions (1.21×10−3 vs. 1.47×10−3, p<0.0001, area under ROC curve=0.75, 95% confidence interval [CI] 0.65–0.84). In centralized analysis, application of an ADC threshold (1.53×10−3 mm2/s) lowered the biopsy rate by 20.9% (14/67; 95% CI 11.2–31.2%) without affecting sensitivity. Application of a more conservative threshold (1.68×10−3 mm2/s) to site-derived ADCs reduced the biopsy rate by 26.2% (16/61) but missed three cancers.
Conclusion:
DWI can re-classify a substantial fraction of suspicious breast MRI findings as benign and thereby decrease unnecessary biopsies. ADC thresholds identified in this trial should be validated in future Phase III studies.
Introduction
Over the last two decades, breast MRI has emerged as the most powerful tool for breast cancer detection, with numerous multicenter trials reporting sensitivities in high-risk women roughly double those of mammography or ultrasound alone (1–8). Breast cancer detection on MRI relies on the presence of suspicious enhancement after injection of gadolinium contrast to identify areas of abnormal vascularity, a common characteristic of breast malignancies. Although initially associated with relatively high false-positive rates, several of the more recent high-risk screening trials demonstrated breast MRI specificity and positive predictive value can exceed that of conventional breast imaging modalities, such as mammography and ultrasound (1,2). Nonetheless, many benign pathologies still exhibit suspicious enhancement that cannot be distinguished from malignancies, resulting in unnecessary biopsies. In fact, recent studies including community site performance in the United States have demonstrated that 19–36% of MRI recommendations for biopsy yield cancer (9–11), leading to criticisms that breast MRI can cause real harm, particularly when used for pre-operative evaluation of newly diagnosed breast cancer (12–14).
Diffusion weighted MR imaging (DWI) has been proposed as a complementary adjunct sequence to improve breast MRI accuracy. Although initially used to identify early signs of stroke, technical advances have expanded DWI’s use to oncologic applications in extracranial organ systems (15). DWI assesses how freely water molecules can diffuse within tissue, which can be quantified by the apparent diffusion coefficient (ADC). It has been shown in multiple organ systems, particularly the brain, prostate, and liver, that lower ADC values correlate with higher tumor cellularity, which in turn can be used to discriminate cancers from benign lesions and even stratify malignancy grades (16,17). Over the last decade, multiple single institution studies have shown that breast malignancies also exhibit lower ADCs on average than benign lesions, and the addition of DWI could improve breast MRI performance (18–27). However, DWI is not routinely used by breast imagers because of several important limitations in these prior studies. Specifically, the optimal ADC threshold to decrease false-positives has ranged greatly (0.9 to 1.76×10−3 mm2/s) among the many studies due to varying diffusion sensitization (‘b’) values utilized in the DWI scan protocols (26,27). Furthermore, many published reports on DWI have excluded lesion subgroups in most need of improved diagnostic characterization. These subtypes include lesions smaller than 10–12 mm, which account for over half of MRI findings (28) and are more likely than larger lesions to be a false-positive (28,29), and less well-defined non-mass enhancement (NME) lesions (30–32), which require an expensive MRI-guidance procedure for sampling more often than masses (33).
Given the promise of DWI to improve breast MRI accuracy but the wide variability in single institution studies, there is a pressing need to determine a generalizable ADC threshold to facilitate clinical implementation (27). To address this, the Eastern Cooperative Oncology Group – American College of Radiology Imaging Network (ECOG-ACRIN) Cancer Research Group A6702 Phase II multicenter trial was designed to confirm ADC differences between malignant and benign lesions across systems and practice sites for all lesion types detected on breast MRI. Furthermore, the trial was designed to identify potential ADC thresholds that could reduce biopsies of benign lesions prompted by breast MRI that could be tested in future phase III trials.
Materials and Methods
Study Design
This single arm, prospective, Health Insurance Portability and Accountability Act-compliant, multi-institution, Phase II imaging trial (ClinicalTrials.gov NCT02022579) (34) was performed in accordance with the U.S. Common Rule. Each participating site received institutional review board approval, and patients were enrolled between March 2014 and April 2015. All data collection and analyses were planned before the trial was initiated (35). Potentially eligible women 18 years or older planning to undergo a clinical breast MRI examination for any clinical indication provided written informed consent to undergo a study-specific DWI sequence during their examination. All breast MRIs were interpreted using only non-DWI sequences, and those with at least one MRI-detected abnormality classified as BI-RADS category 3 (probably benign), 4 (suspicious for malignancy), or 5 (highly suggestive of malignancy) were enrolled on the study. In order to prevent bias from performance at any one institution, enrollment was limited to 40 subjects per participating site. Participants with a qualifying lesion who underwent neoadjuvant chemotherapy before lesion biopsy were excluded in order to limit the possibility of false-negative pathology results. Management of individual lesions was based on institutional standard-of-care with the expectation that all participants would undergo either biopsy of the MRI-detected finding within a month of the study MRI or imaging/clinical follow-up for the MRI abnormality approximately one year after the study MRI exam. From the subsequent biopsy, imaging, and/or clinical follow-up performed through one year after the study MRI, the reference standard for each lesion was determined as described in more detail below.
MRI Acquisition
Imaging was performed on 1.5 or 3 tesla MRI scanners with conventional dynamic contrast enhanced breast MRI acquired in accordance with each institution’s standard-of-care and American College of Radiology accreditation guidelines (36). A standardized DWI protocol was acquired prior to contrast injection using a commercially available diffusion-weighted single-shot spin-echo echo planar imaging sequence (37): axial acquisition, parallel imaging (reduction factor ≥2), fat suppression (method selected by site), 1.5 to 2 mm in-plane resolution, 4 mm slice thickness, and scan time ~5 minutes. Diffusion gradients were applied in three orthogonal directions to measure isotropic ADC using diffusion sensitizations (b-values) of 0, 100, 600, and 800 s/mm2. Each MRI system was required to pass study DWI quality control (37), which included central review of test scans in temperature-controlled phantoms (38) (to evaluate system ADC bias and uniformity, relative signal-to-noise, and scan protocol compliance) and representative patient scans (to verify lack of artifacts, adequate signal-to-noise, and homogeneous fat suppression).
Clinical Breast MRI Interpretation
All MRIs were prospectively interpreted in accordance with the 5th edition BI-RADS Atlas (39). The BI-RADS assessment category given for each lesion was required to be based on non-DWI sequences only. For each lesion given a BI-RADS category 3, 4, or 5, radiologists at each site recorded basic morphology (focus, mass, or NME), maximal lesion size, kinetic enhancement worst curve type (initial phase: fast>medium>slow; delayed phase: washout>plateau>persistent), and signal intensity on T2-weighted images (low/high).
ADC measurements
Lesion ADCs were measured both centrally and at each site. For the primary aim of this study, centralized analysis, including quality assessment of the diffusion weighted images, was performed by trained research scientists at the University of Washington under supervision of the co-chairs of the study (SCP >15 years and HR seven years quantitative breast DWI experience) and blinded to lesion outcomes. DWI scans were processed using custom software developed with MATLAB (MathWorks, Natick, Massachusetts). ADC maps were calculated using a classic monoexponential decay model (40) and linear least squares fitting of the signal decay with increasing b-value. Adequate quality for DWI was determined by subjective assessment of the presence of artifacts on diffusion-weighted images (including susceptibility-related distortions, misregistration between varying b-value diffusion weighted images, poor signal-to-noise, partial volume averaging, or poor fat suppression) that would affect DWI visibility and/or accurate ADC evaluation of the BI-RADS category 3, 4, and 5 lesions in question. Lesions affected by inadequate diffusion weighted images were considered non-evaluable and excluded from the final analysis set.
In order to obtain the central ADC measures, lesions were identified on the diffusion weighted images (b=800 s/mm2) by visually cross-referencing the appearance with the conventional contrast-enhanced MR images to facilitate lesion localization and to assist in avoiding adjacent, uninvolved normal fibroglandular and adipose tissue. Whole lesion regions of interest (ROIs) were then drawn on the diffusion weighted images with the assistance of a semi-automated thresholding tool (41) to further prevent erroneous inclusion of non-lesion tissue, and these ROIs were then propagated to the ADC maps. Lesion ADCs were calculated as the mean of voxel values within the lesion ROI.
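The monoexponential fit and whole-lesion averaging described above can be illustrated with a minimal sketch. It is not the study's MATLAB software: the function names `fit_adc`, `lesion_mean_adc`, and `simulate` are hypothetical, the signals are noise-free synthetic values, and the fit is assumed to be an ordinary least-squares line through ln(S) versus b at the protocol b-values.

```python
import math

B_VALUES = [0.0, 100.0, 600.0, 800.0]  # s/mm^2, the b-values of the study protocol


def fit_adc(signals, b_values=B_VALUES):
    """Least-squares ADC estimate for one voxel.

    Model: S(b) = S0 * exp(-b * ADC), so ln S = ln S0 - b * ADC.
    The ADC is the negated slope of ln(S) against b, in mm^2/s.
    """
    log_s = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(log_s) / n
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, log_s))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


def lesion_mean_adc(voxel_signals):
    """Mean of the voxel-wise ADCs inside a lesion ROI."""
    adcs = [fit_adc(s) for s in voxel_signals]
    return sum(adcs) / len(adcs)


def simulate(adc, s0=1000.0):
    """Noise-free synthetic signal decay for a voxel with a known ADC."""
    return [s0 * math.exp(-b * adc) for b in B_VALUES]


# Three synthetic ROI voxels with made-up ADCs (illustration only).
roi_voxels = [simulate(1.1e-3), simulate(1.2e-3), simulate(1.3e-3)]
lesion_adc = lesion_mean_adc(roi_voxels)
print(f"lesion ADC = {lesion_adc * 1e3:.2f} x 10^-3 mm^2/s")
```

With real acquisitions, voxels with zero or negative signal and noise at high b-values would need explicit handling before taking the logarithm.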
For secondary analyses, site radiologists prospectively recorded a site-measured ADC for each lesion after determining BI-RADS assessment. This was performed using their institutions’ clinical software with only the following guidelines: ROIs should be drawn on ADC maps generated from all acquired b-values over the largest solid portion of the lesion, avoiding normal tissue and areas of necrosis.
Reference Standard
The reference standard for each lesion was determined from results of image-guided biopsy, surgery, and follow-up MRIs. Lesions with indeterminate reference standards were excluded from final analyses. Reference standard for a lesion was indeterminate for BI-RADS category 4 or 5 lesions if no sampling of the lesion was performed and there was no follow-up MRI that downgraded the finding. Furthermore, BI-RADS category 4 or 5 lesions that were excised during surgery for another lesion (e.g. an ipsilateral cancer) without prior sampling were also excluded due to the inability to definitively correlate the pathologic outcomes with the lesion in question. Reference standard for BI-RADS category 3 lesions was follow-up MRI at least 337 days (to allow inclusion of patients for whom follow-up occurred up to 4 weeks earlier than a full year) after study MRI without BI-RADS upgrade to category 4 or 5. Any high-risk lesion diagnosed by pre-surgical sampling required either excision or downgrade to BI-RADS category 2 (benign) or 1 (negative) on follow-up MRI performed at least 337 days after study examination.
Statistical Methods
Conventional MRI performance was described by calculating abnormal interpretation rate and positive predictive value 2 (PPV2). Abnormal interpretation rate was defined as the number of women enrolled onto the trial who had an MRI examination with at least one BI-RADS category 3, 4, or 5 lesion divided by the total number of women who consented to the screening stage of the study. PPV2, the fraction of MRI recommendations for biopsy that ultimately yield malignancy, was calculated as the number of BI-RADS category 4 or 5 lesions deemed malignant based on reference standard divided by the number of all BI-RADS category 4 or 5 lesions.
The mean centrally-measured ADCs of malignant and benign lesions were compared using the bootstrap method and explored within lesion subgroups. The utility of centrally-measured ADCs for discriminating malignant and benign lesions was evaluated using receiver operating characteristic (ROC) curves. The lesion level ROC curve was constructed empirically, the area under the ROC curve (AUC) was estimated by the trapezoid rule, and the 95% confidence interval (CI) of the AUC was calculated with the bootstrap method. The highest malignant ADC value in the cohort was identified to determine the optimal ADC threshold to reduce benign biopsies without reducing sensitivity, where lesions with ADCs above this threshold could hypothetically be considered benign without biopsy, and its 95% CI was calculated using the bootstrap method. PPV2 after application of an ADC threshold was calculated as the number of malignant BI-RADS category 4 and 5 lesions with ADCs ≤ the specified threshold divided by the total number of BI-RADS category 4 and 5 lesions with ADCs ≤ the threshold. Potential decreases in biopsy rates after application of an ADC threshold were defined as the number of BI-RADS category 4 and 5 lesions above the threshold divided by the total number of BI-RADS category 4 and 5 lesions. Changes in biopsy rates were calculated as binomial proportions, and the bootstrap method was used to calculate their 95% CIs and to test whether the reductions were statistically significantly greater than 0%.
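As an illustration of these lesion-level calculations, the sketch below computes an empirical AUC, the sensitivity-preserving threshold (the highest malignant ADC), the resulting biopsy reduction, and a percentile bootstrap interval. The ADC values are invented, all function names are hypothetical, and the bootstrap resamples individual lesions within each class, which is an assumption; the trial's resampling scheme (for example, how multiple lesions per patient were handled) is not specified here.

```python
import random


def empirical_auc(benign, malignant):
    """Area under the empirical ROC curve (trapezoid rule).

    Equals the probability that a random malignant lesion has a lower
    ADC than a random benign lesion, with ties counted as one half.
    """
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m < b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


def sensitivity_preserving_threshold(malignant):
    """Highest malignant ADC: calling ADC > threshold benign misses no cancer."""
    return max(malignant)


def biopsy_reduction(benign, malignant, threshold):
    """Fraction of biopsy-recommended lesions with ADC above the threshold."""
    lesions = benign + malignant
    return sum(adc > threshold for adc in lesions) / len(lesions)


def bootstrap_ci(benign, malignant, stat, n_boot=2000, seed=0):
    """Percentile 95% CI from lesion-level resampling within each class."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        b = rng.choices(benign, k=len(benign))
        m = rng.choices(malignant, k=len(malignant))
        values.append(stat(b, m))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


# Made-up ADCs (x 10^-3 mm^2/s) for biopsy-recommended lesions.
benign = [1.10, 1.35, 1.42, 1.55, 1.60, 1.71, 1.88]
malignant = [0.95, 1.05, 1.20, 1.30, 1.48]

auc = empirical_auc(benign, malignant)
threshold = sensitivity_preserving_threshold(malignant)
reduction = biopsy_reduction(benign, malignant, threshold)
auc_lo, auc_hi = bootstrap_ci(benign, malignant, empirical_auc)
print(f"AUC = {auc:.3f} (95% CI {auc_lo:.3f}-{auc_hi:.3f})")
print(f"threshold = {threshold:.2f}, biopsy reduction = {reduction:.1%}")
```

Because the threshold is the maximum malignant ADC in the same data used to evaluate it, sensitivity is 100% by construction in this toy example, which mirrors the resubstitution caveat that applies to any data-derived cutoff.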
As outlined by the study protocol, a more conservative and potentially more generalizable ADC threshold was obtained by inflating the data-derived optimal ADC threshold by 10% (37), and the effect of this conservative ADC cutoff on biopsy reduction also was evaluated. The performance for reducing biopsies in clinical practice was further explored by applying the conservative ADC threshold directly to the site-measured lesion ADC values. Central and site-measured lesion ADCs were compared using the paired t-test. All analyses were performed using SAS 9.4 (SAS Institute, Cary, North Carolina) and R Studio 3.3.3 (https://cran.r-project.org). P-values <0.05 were considered significant.
Results
Study Population
The trial participation flowchart is provided in Figure 1. From January 14, 2014 to March 13, 2015, 1002 women from 10 academic institutions consented to participate in the trial prior to undergoing breast MRI. One hundred seven women from nine institutions had at least one qualifying lesion and were enrolled, resulting in an abnormal interpretation rate of 10.7% (107/1002). One subject was subsequently determined to be ineligible due to concurrent chemotherapy, leaving 106 eligible patients with 146 BI-RADS category 3, 4, or 5 lesions. Thirty-nine patients (65 lesions) were excluded due to either not completing the study (four lesions), or missing reference standard (28 lesions: 10 BI-RADS category 3, 17 BI-RADS category 4, and one BI-RADS category 5), or non-evaluable DWI (33 lesions, Figure 1). Of the 33 lesions for which DWI was not evaluable, artifacts related to poor fat suppression, low SNR, susceptibility-related distortion, and/or misregistration/motion between high and low b-value images reduced image quality for 24 lesions, while partial volume averaging precluded lesion localization and visibility for 9 lesions. The final analysis set comprised 67 patients (median age=49, range 24–75) with 81 lesions (57 patients with one lesion, six with two lesions, and four with three lesions) with a verified reference standard (17 invasive carcinomas, 11 ductal carcinomas in situ [DCIS], 53 benign) who underwent MRI with DWI at 1.5 or 3 tesla for a variety of clinical indications (Table 1).
Table 1.
Patient/Exam feature | Eligible (N=106) | Final analysis set (N=67) |
---|---|---|
Age at enrollment (years) | ||
Mean ± standard deviation | 48.9 ± 12.0 | 48.9 ± 12.2 |
Median (range) | 47.5 (24.0–75.0) | 49.0 (24.0–75.0) |
Number of BI-RADS 3, 4, or 5 lesions, n (%) | ||
1 | 77 (72.6) | 57 (85.1) |
2 | 20 (18.9) | 6 (9.0) |
3 | 7 (6.6) | 4 (6.0) |
4 | 2 (1.9) | 0 (0.0) |
Clinical indication for MR imaging, n (%) | ||
Evaluate extent of disease for known breast cancer | 47 (44.3) | 32 (47.8) |
Further evaluation of lesion detected on other imaging | 4 (3.8) | 2 (3.0) |
Short interval follow-up MR imaging | 6 (5.7) | 4 (6.0) |
Screening due to personal history of breast cancer | 6 (5.7) | 5 (7.5) |
Screening due to genetic risk or family history of breast cancer | 24 (22.6) | 12 (17.9) |
Other clinical indication | 9 (8.5) | 7 (10.4) |
Multiple clinical indications | 10 (9.4) | 5 (7.5) |
MR (B₀) field strength (tesla, T), n (%) | ||
1.5 T | 42 (39.6) | 27 (40.3) |
3.0 T | 64 (60.4) | 40 (59.7) |
MR vendor platform, n (%) | ||
Philips 1.5 T | 26 (24.5) | 14 (20.9) |
Siemens 1.5 T | 4 (3.8) | 3 (4.5) |
GE 1.5 T | 12 (11.3) | 10 (14.9) |
Philips 3.0 T | 39 (36.8) | 26 (38.8) |
Siemens 3.0 T | 18 (17.0) | 10 (14.9) |
GE 3.0 T | 7 (6.6) | 4 (6.0) |
Conventional MRI Features and Performance
Of the 81 lesions in the final analysis set, four were BI-RADS category 5 (all with malignant reference standard), 63 were category 4 (24/63 [38.1%] malignant), and 14 were category 3 (all benign) on conventional MRI. Masses (45/81, 55.6%) were the most common morphology described on the basis of conventional breast MRI, followed by NME (32/81, 39.5%) and foci (4/81, 4.9%). The PPV2 of conventional breast MRI BI-RADS assessments for identifying malignancies was 41.8% (28/67), resulting in a benign biopsy rate of 58.2% (39/67).
Centrally-Measured Lesion ADCs and Effect on MRI Performance
The mean centrally-measured ADC value of all 81 lesions was 1.38 ± 0.29×10−3 mm2/s. Pathology-proven malignancies demonstrated a significantly lower mean ADC (1.21 ± 0.21×10−3 mm2/s) than benign lesions (1.47 ± 0.29×10−3 mm2/s, p<0.0001), with illustrative examples provided in Figure 2. Significant differences in ADCs between malignancies and benign lesions persisted across all groups when lesions were stratified by morphology or size (p<0.05 for all comparisons, Table 2).
Table 2.
Lesion Subset | Benign, n | Benign ADC (×10−3 mm2/s), mean ± SD | Malignant, n | Malignant ADC (×10−3 mm2/s), mean ± SD | p-value | AUC (95% CI) |
---|---|---|---|---|---|---|
All lesions | 53 | 1.47 ± 0.29 | 28 | 1.21 ± 0.21 | < 0.0001 | 0.75 (0.65–0.84) |
Mass | 31 | 1.51 ± 0.30 | 14 | 1.23 ± 0.16 | < 0.0001 | 0.79 (0.66–0.90) |
NME | 20 | 1.43 ± 0.25 | 12 | 1.18 ± 0.27 | 0.0098 | 0.72 (0.51–0.89) |
Size ≤ 10 mm | 27 | 1.48 ± 0.28 | 12 | 1.27 ± 0.16 | 0.0025 | 0.75 (0.61–0.88) |
Size > 10 mm | 26 | 1.46 ± 0.30 | 16 | 1.17 ± 0.24 | 0.0005 | 0.76 (0.61–0.90) |
ADC = apparent diffusion coefficient, AUC = area under the receiver operating characteristic curve, NME = non-mass enhancement, SD = standard deviation
Using ROC curve analysis, the estimated AUC for ADC to predict malignancy was 0.75 (95% CI 0.65–0.84, Figure 3). The ADC threshold associated with 100% sensitivity and maximal specificity was 1.53×10−3 mm2/s (95% CI 1.40–1.53×10−3 mm2/s), above which there were 14 benign BI-RADS category 4 lesions (11 masses, 3 NMEs) and no BI-RADS category 5 lesions. Hypothetical application of this ADC threshold to this cohort, where lesions with ADC>1.53×10−3 mm2/s would not undergo biopsy, resulted in an 11.0% increase in PPV2 (52.8% [28/53] vs. 41.8% [28/67]) and a corresponding 20.9% (14/67) reduction of the biopsy recommendation rate without missing any cancers (Table 3). This represents a 36% (14/39) reduction in the number of false-positive MRI findings and unnecessary biopsy recommendations prompted by MRI. Application of the ADC threshold to BI-RADS category 4 lesions resulted in a 10.9% increase in PPV2 (49.0% [24/49] vs. 38.1% [24/63]) and a 22.2% (14/63) reduction of the biopsy rate (Table 3), while there was no change in PPV2 or biopsy rates in BI-RADS category 5 lesions (all malignant). Biopsy reductions were not calculated for BI-RADS category 3 lesions since these are considered findings that are so unlikely to be cancer (≤2% chance of malignancy) that imaging surveillance is preferred over biopsy. Of note, ADCs ranged 1.14 to 2.00×10−3 mm2/s for the 14 BI-RADS category 3 lesions, with 10 (71%) exhibiting ADCs at or below the threshold.
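The percentages in the preceding paragraph follow directly from the reported lesion counts. The short check below reproduces them; the variable and function names are illustrative, and only counts stated in the text are used.

```python
def ppv2(n_malignant, n_recommended):
    """Fraction of biopsy recommendations that yield malignancy."""
    return n_malignant / n_recommended


# Counts reported in the text for BI-RADS category 4 and 5 lesions.
n_recommended = 67
n_malignant = 28
n_above_threshold = 14  # all benign, ADC > 1.53 x 10^-3 mm^2/s

n_benign = n_recommended - n_malignant  # 39 benign biopsy recommendations
ppv2_before = ppv2(n_malignant, n_recommended)
ppv2_after = ppv2(n_malignant, n_recommended - n_above_threshold)
biopsy_reduction = n_above_threshold / n_recommended
false_positive_reduction = n_above_threshold / n_benign

print(f"PPV2: {ppv2_before:.1%} -> {ppv2_after:.1%}")                   # 41.8% -> 52.8%
print(f"biopsy reduction: {biopsy_reduction:.1%}")                      # 20.9%
print(f"false-positive reduction: {false_positive_reduction:.1%}")      # 35.9%
```

No malignancy lies above the threshold, so the numerator of PPV2 is unchanged and only the denominator shrinks (67 to 53).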
Table 3.
Lesion Group | Lesions recommended for biopsy (BI-RADS 4 and 5), N total | Malignant, Nm | Benign, NB | Lesions with ADC > threshold*, n | Reduction in biopsy rate using ADC threshold, n/N (%) | 95% CI | p-value |
---|---|---|---|---|---|---|---|
All lesions | 67 | 28 | 39 | 14 | 14/67 (20.9%) | 11.2–31.2% | < 0.0001 |
BI-RADS Assessment Category | |||||||
BI-RADS 4 only | 63 | 24 | 39 | 14 | 14/63 (22.2%) | 11.8–32.8% | < 0.0001 |
BI-RADS 5 only | 4 | 4 | 0 | 0 | 0/4 (0%) | - | 1.0 |
Morphology | |||||||
Mass | 38 | 14 | 24 | 11 | 11/38 (28.9%) | 15.9–44.1% | 0.0001 |
NME | 26 | 12 | 14 | 3 | 3/26 (11.5%) | 0–25.9% | 0.066 |
Size ≤ 10 mm | 30 | 12 | 18 | 7 | 7/30 (23.3%) | 9.1–39.3% | 0.0022 |
Size > 10 mm | 37 | 16 | 21 | 7 | 7/37 (18.9%) | 7.5–33.3% | 0.0028 |
ADC = apparent diffusion coefficient, NME = non-mass enhancement
NOTE: ADC threshold (ADC≤1.53×10−3 mm2/s) identified to maintain 100% sensitivity.
ADC Performance Within Lesion Subsets
ROC curve analyses to evaluate the performance of ADC to discriminate cancers from benign lesions based on morphologic subtypes demonstrated an AUC of 0.79 for masses and 0.72 for NMEs (Table 2, Figure 3). When stratifying by size, the AUC was 0.75 for lesions ≤10 mm and 0.76 for lesions >10 mm (Table 2). The optimal ADC threshold value of 1.53×10−3 mm2/s remained unchanged within each subgroup when evaluated separately. However, applying this threshold resulted in a significant reduction in the biopsy rate for masses of 28.9% (11/38; p=0.0001), whereas the reduction for NMEs of 11.5% did not reach statistical significance (3/26; p=0.066, Table 3). Reductions in biopsy rates were significant for both lesions ≤10 mm and >10 mm in size, at 23.3% (7/30; p=0.0022) and 18.9% (7/37; p=0.0028), respectively (Table 3).
Conservative ADC threshold Performance
Testing of a more conservative and potentially more generalizable 10% inflated ADC threshold (1.53×10−3 mm2/s × 1.10 = 1.683×10−3 mm2/s) to the same cohort resulted in a more modest decrease in the biopsy rate of 10.4% (7/67). Seven benign lesions (all BI-RADS 4) exhibited ADCs above the conservative ADC threshold, corresponding to a 4.9% increase in PPV2 (46.7% [28/60] vs. 41.8% [28/67]).
Potential Clinical Performance of Site-Measured ADCs
ADCs for 73 of 81 lesions were prospectively measured and submitted by the site radiologists. Site-measured ADCs were not submitted for eight lesions because the site radiologist indicated that the lesion was not sufficiently visible on DWI to draw an ROI. For three lesions, ADCs were submitted by the site radiologist but were excluded from analyses because ADC maps were calculated at the site level using incorrect b-values. Thus, the performance of site-measured ADCs using the correct b-values was evaluated for 70 of the 81 lesions (26 malignant, 44 benign; 9 BI-RADS category 3, 57 BI-RADS category 4, 4 BI-RADS category 5).
Site ADCs were slightly higher than central ADCs for this subset, although the difference was not significant (mean = 1.44 ± 0.44×10−3 mm2/s vs. 1.37 ± 0.29×10−3 mm2/s, p=0.064). Of the 61 lesions in this subset recommended for biopsy (BI-RADS category 4 or 5), 16 had site-measured ADCs above the conservative ADC threshold (1.683×10−3 mm2/s), 13 of which were benign while three were malignant. Therefore, applying the conservative ADC threshold at the site level would have increased PPV2 by 8.5% (51.1% [23/45] vs. 42.6% [26/61]) and lowered the biopsy rate by 26.2% (16/61), but with an 11.5% (3/26) decrease in sensitivity due to three missed malignancies. These three malignancies included one DCIS (site ADC=1.7×10−3 mm2/s vs. central ADC=1.53×10−3 mm2/s) and two invasive ductal carcinomas (IDC 1 site ADC = 2.75×10−3 mm2/s vs. central ADC = 1.36×10−3 mm2/s and IDC 2 site ADC = 1.7×10−3 mm2/s vs. central ADC = 1.26×10−3 mm2/s).
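The trade-off reported above can be reproduced from the stated counts, and the three missed cancers can be checked against the conservative threshold using both their site and central ADCs. This is a sketch only; the helper `threshold_impact` and the dictionary layout are hypothetical, while all numbers are taken from the text.

```python
CONSERVATIVE_THRESHOLD = 1.53 * 1.10  # x 10^-3 mm^2/s, i.e. 1.683


def threshold_impact(n_recommended, n_malignant, above_benign, above_malignant):
    """Effect of withholding biopsy for lesions with ADC above a threshold."""
    n_above = above_benign + above_malignant
    return {
        "ppv2_before": n_malignant / n_recommended,
        "ppv2_after": (n_malignant - above_malignant) / (n_recommended - n_above),
        "biopsy_reduction": n_above / n_recommended,
        "sensitivity_loss": above_malignant / n_malignant,
    }


# Site-measured subset: 61 BI-RADS 4 or 5 lesions, 26 malignant;
# 16 above the conservative threshold (13 benign, 3 malignant).
impact = threshold_impact(61, 26, 13, 3)
for name, value in impact.items():
    print(f"{name}: {value:.1%}")
# ppv2_before: 42.6%, ppv2_after: 51.1%,
# biopsy_reduction: 26.2%, sensitivity_loss: 11.5%

# (site ADC, central ADC) in units of 10^-3 mm^2/s for the missed cancers.
missed = {"DCIS": (1.70, 1.53), "IDC 1": (2.75, 1.36), "IDC 2": (1.70, 1.26)}
differences = {}
for name, (site, central) in missed.items():
    differences[name] = round(site - central, 2)
    print(name, "site above:", site > CONSERVATIVE_THRESHOLD,
          "central above:", central > CONSERVATIVE_THRESHOLD,
          "difference:", differences[name])
```

All three cancers exceed the conservative threshold only on the site measurement; their central ADCs fall at or below it, so none would have been missed with the central values.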
Discussion
In this multi-center trial of prospectively collected breast DWI data, we confirm that suspicious lesions on conventional breast MRI that are malignant exhibit lower mean ADCs than their benign counterparts. In exploring the potential clinical impact of DWI on breast MRI performance, our centralized analysis demonstrated that applying an ADC threshold could significantly decrease benign biopsies prompted by breast MRI without reducing sensitivity. To estimate results in clinical practice, we found application of a conservative ADC threshold to prospectively measured site-ADCs would have reduced the overall biopsy rate by 26%, but also could have led to a delay in diagnosis of one DCIS lesion and two IDCs.
Our study supports results from many single site studies reporting lower ADC values in malignancies compared to benign lesions (18–25). Two recent meta-analyses have further suggested there is great potential for clinical implementation of DWI (26,27), with one reporting a pooled sensitivity of 84% and specificity of 79% for lesion ADC measures (26). However, the majority of the studies included were retrospective and excluded lesion subtypes that are more problematic to assess on DWI, leading to potential overestimations in DWI performance from selection bias. Furthermore, there were wide variations in DWI acquisition and ADC quantitation, limiting generalizability (27). In particular, the use of varying b-values among these studies greatly impacts reported ADC thresholds, with higher b-values leading to lower ADCs due to suppression of perfusion effects and increased sampling of slowly diffusing water pools (18,42). Finally, prior studies selected optimal ADC thresholds to balance sensitivity and specificity equally, which could result in higher rates of missed cancers if implemented prospectively.
This multicenter trial brings implementation of DWI into breast MRI interpretation closer to clinical practice than prior reports. By utilizing a consistent protocol with specific b-values across a range of MRI platforms, this study confirms that DWI can discriminate a significant fraction of benign lesions from malignancies not achievable on conventional MRI alone. Furthermore, it identifies an appropriate ADC threshold to decrease unnecessary biopsies while minimizing impact on sensitivity and provides benchmarks for improvement in PPV2 when clinically implemented. To minimize selection bias, this study included consecutive suspicious lesions without restrictions on morphologic features or size, and prospective BI-RADS assessments were recorded at each site independent from DWI information and prior to biopsy or follow-up. Finally, all ADCs utilized for identification of an optimal threshold were centrally-measured in high-quality DW acquisitions using a standardized approach to ensure consistency, blinded to pathology outcomes and results from imaging and clinical follow-up.
Our data suggest an ADC threshold in the range of 1.6 × 10−3 mm2/s (1.53 – 1.68 × 10−3 mm2/s) may be appropriate for clinical implementation of this standardized DWI protocol. This threshold was found to be optimal for all lesion subtypes, although our results suggest DWI is approximately twice as useful for reducing unnecessary biopsies of masses as of NMEs. We hypothesize this difference is due to NME lesion ADC measurements being less precise than those for masses because of DWI spatial resolution limitations. Because NMEs account for the greatest fraction of MRI-guided biopsies, which are more expensive and time consuming than other biopsy modalities, it is important that higher resolution DWI approaches be emphasized in future research to reduce the benign biopsy rate of these lesion subtypes. We also found that the ADC threshold from this study was more useful for BI-RADS 4 than for BI-RADS 3 or 5 lesions, as all BI-RADS 5 lesions were malignant and all BI-RADS 3 lesions were benign in our study cohort. In fact, application of the threshold to BI-RADS 3 lesions would have upgraded 10/14 benign lesions to a biopsy recommendation. Given the very low likelihood of malignancy in the BI-RADS 3 probably benign category (≤2% by definition) and recommendation to follow with imaging rather than biopsy, it is unlikely that application of DWI could further improve specificity for these lesions. Accordingly, this study suggests caution should be exercised when applying ADC thresholds to BI-RADS 3 findings. Furthermore, others have suggested a stratified ADC approach based on initial BI-RADS assessments, which could be explored in future work (43).
This trial illustrates that several hurdles remain for using breast DWI in clinical practice. First and foremost, 29% (33/114) of lesions could not be evaluated due to technical issues, illustrating that additional work is needed to improve DWI image quality and consistency. Site clinicians also found measuring lesion ADCs to be challenging: eight of 81 evaluable lesions were not measured due to the radiologist indicating the finding was not visible on DWI. When we tested the effect of applying an ADC threshold in the clinical setting in the subset of lesions with site-measured ADCs, we found a potential 26% (16/61) biopsy reduction at the expense of three false-negatives (one DCIS and two IDCs). The differences in site-measured versus centrally-measured ADCs of these three missed cancers ranged from 0.17×10−3 mm2/s for a DCIS lesion to 1.39×10−3 mm2/s for an IDC. We hypothesize these false-negatives on DWI were due to challenges in performing site measurements using less sophisticated software than available for central ADC quantitation, which may have led to inclusion of adjacent normal fibroglandular tissue and/or necrotic tumor. Accordingly, our findings support the need for improved commercially-available ADC measurement tools to facilitate the safe clinical implementation of DWI.
Our study has several additional important limitations. Twenty-eight lesions in 21 patients were excluded due to incomplete reference standard, which could have biased the results. ADCs were measured by calculating mean values of an ROI of the entirety of the lesion. While this approach was prescribed by the A6702 protocol because it is generally the most common approach to ADC quantitation, some authors have indicated that “hot spot” measurements of the lowest ADC within a lesion can yield superior accuracy (44). Finally, the potential for reducing unnecessary biopsies was assessed in data obtained from the same patient cohort that determined the ADC thresholds. Thus, it is important that this threshold value continue to be validated and revised in new patient populations, which could be addressed through a larger scale Phase III prospective trial.
In summary, this multicenter trial confirms that many benign lesions identified on conventional breast MRI exhibit significantly higher ADC values than malignancies, and that the use of an ADC threshold could avoid many unnecessary biopsies. The study describes a potentially generalizable ADC threshold obtained from multisite, multiplatform data using a standardized protocol that should be validated in future prospective clinical studies. It also highlights that challenges remain in obtaining consistent high-quality breast DWI and accurate site ADC measurements. Thus, further work on optimizing DWI acquisition and ADC quantitation is needed as breast DWI becomes implemented clinically.
Translational Relevance
Breast MRI is the most sensitive tool for the detection of breast cancer, as the great majority of breast cancers enhance after the administration of a gadolinium-based contrast agent. Although the specificity of modern breast MRI can exceed that of conventional breast imaging modalities, many benign pathologies also exhibit suspicious enhancement, prompting unnecessary biopsies and limiting the value of breast MRI applications. A considerable amount of retrospective data has emerged from single institution studies supporting the use of non-contrast diffusion weighted imaging (DWI) to decrease breast MRI false-positives. This prospective, Phase II, multicenter trial confirms that DWI can assist with discriminating benign from malignant pathologies that exhibit suspicious enhancement, avoiding a substantial number of unnecessary biopsies. Furthermore, the trial identifies quantitative DWI-based apparent diffusion coefficient thresholds that could be applied and validated in future Phase III trials to facilitate clinical translation.
Acknowledgments
This study was conducted by the ECOG-ACRIN Cancer Research Group (Peter J. O’Dwyer, MD and Mitchell D. Schnall, MD, PhD, Group Co-Chairs) and supported by the National Cancer Institute of the National Institutes of Health under the following award numbers: CA180820, CA180794, CA180828, CA180801, CA180847, CA180799, CA180816, CA180858, CA180870, CA18086, CA180791, CA151326, CA207290, and CA166104. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government. The authors also acknowledge those individuals who have contributed substantially to the work reported in the manuscript, including the A6702 Trial Team, the patients who participated in the study, and the staff members who contributed to the conduct of the study at the University of Washington, the University of Michigan, the University of Pennsylvania, MD Anderson Cancer Center, University of Wisconsin, Northwestern University, Vanderbilt University, New York University, University of California, San Francisco, and Oregon Health Sciences University.
Potential Conflicts of Interest: Linda Moy: Grant support – Siemens Medical Solutions. Basak Dogan: Consultant – Endomagnetics, Inc. Wei Yang: Medical Advisory Board Member – Seno Medical Instruments. Mitchell D. Schnall: Grant support – Siemens Medical Solutions. Constance Lehman: Grant support – GE Healthcare. Christopher Comstock: Grant support – Bracco Diagnostics; Speaker honorarium – Bayer Pharmaceutical. Savannah Partridge: Grant support – GE Healthcare and Philips Healthcare.
Footnotes
NOTE: Habib Rahbar previously received grant support from GE Healthcare, which is no longer active (ended in 2017). Wei Yang reports prior consulting relationships with GE Healthcare (ended November 2017), Elsevier (ended November 2016) and Wolters Kluwer (ended June 2017). Thomas Chenevert reports he is the co-inventor of IP assigned to and managed by the University of Michigan, and that has the potential to be licensed. Constance Lehman reports prior service on the GE Healthcare Advisory Board.
References
- 1. Sardanelli F, Podo F, Santoro F, Manoukian S, Bergonzi S, Trecate G, et al. Multicenter surveillance of women at high genetic breast cancer risk using mammography, ultrasonography, and contrast-enhanced magnetic resonance imaging (the high breast cancer risk italian 1 study): final results. Invest Radiol 2011;46(2):94–105 doi 10.1097/RLI.0b013e3181f3fcdf.
- 2. Kuhl C, Weigel S, Schrading S, Arand B, Bieling H, Konig R, et al. Prospective multicenter cohort study to refine management recommendations for women at elevated familial risk of breast cancer: the EVA trial. J Clin Oncol 2010;28(9):1450–7 doi 10.1200/JCO.2009.23.0839.
- 3. Riedl CC, Luft N, Bernhart C, Weber M, Bernathova M, Tea MK, et al. Triple-modality screening trial for familial breast cancer underlines the importance of magnetic resonance imaging and questions the role of mammography and ultrasound regardless of patient mutation status, age, and breast density. J Clin Oncol 2015;33(10):1128–35 doi 10.1200/JCO.2014.56.8626.
- 4. Berg WA, Zhang Z, Lehrer D, Jong RA, Pisano ED, Barr RG, et al. Detection of breast cancer with addition of annual screening ultrasound or a single screening MRI to mammography in women with elevated breast cancer risk. JAMA 2012;307(13):1394–404 doi 10.1001/jama.2012.388.
- 5. Leach MO, Boggis CR, Dixon AK, Easton DF, Eeles RA, Evans DG, et al. Screening with magnetic resonance imaging and mammography of a UK population at high familial risk of breast cancer: a prospective multicentre cohort study (MARIBS). Lancet 2005;365(9473):1769–78 doi 10.1016/S0140-6736(05)66481-1.
- 6. Kriege M, Brekelmans CT, Boetes C, Besnard PE, Zonderland HM, Obdeijn IM, et al. Efficacy of MRI and mammography for breast-cancer screening in women with a familial or genetic predisposition. N Engl J Med 2004;351(5):427–37 doi 10.1056/NEJMoa031759.
- 7. Lehman CD, Isaacs C, Schnall MD, Pisano ED, Ascher SM, Weatherall PT, et al. Cancer yield of mammography, MR, and US in high-risk women: prospective multi-institution breast cancer screening study. Radiology 2007;244(2):381–8 doi 10.1148/radiol.2442060461.
- 8. Sardanelli F, Podo F, D’Agnolo G, Verdecchia A, Santaquilani M, Musumeci R, et al. Multicenter comparative multimodality surveillance of women at genetic-familial high risk for breast cancer (HIBCRIT study): interim results. Radiology 2007;242(3):698–715 doi 10.1148/radiol.2423051965.
- 9. Lee JM, Ichikawa L, Valencia E, Miglioretti DL, Wernli K, Buist DSM, et al. Performance Benchmarks for Screening Breast MR Imaging in Community Practice. Radiology 2017;285(1):44–52 doi 10.1148/radiol.2017162033.
- 10. Niell BL, Gavenonis SC, Motazedi T, Chubiz JC, Halpern EP, Rafferty EA, et al. Auditing a breast MRI practice: performance measures for screening and diagnostic breast MRI. J Am Coll Radiol 2014;11(9):883–9 doi 10.1016/j.jacr.2014.02.003.
- 11. Strigel RM, Rollenhagen J, Burnside ES, Elezaby M, Fowler AM, Kelcz F, et al. Screening Breast MRI Outcomes in Routine Clinical Practice: Comparison to BI-RADS Benchmarks. Acad Radiol 2017;24(4):411–7 doi 10.1016/j.acra.2016.10.014.
- 12. Feig S. Cost-effectiveness of mammography, MRI, and ultrasonography for breast cancer screening. Radiol Clin North Am 2010;48(5):879–91 doi 10.1016/j.rcl.2010.06.002.
- 13. Houssami N, Hayes DF. Review of preoperative magnetic resonance imaging (MRI) in breast cancer: should MRI be performed on all women with newly diagnosed, early stage breast cancer? CA Cancer J Clin 2009;59(5):290–302 doi 10.3322/caac.20028.
- 14. Houssami N, Turner RM, Morrow M. Meta-analysis of pre-operative magnetic resonance imaging (MRI) and surgical treatment for breast cancer. Breast Cancer Res Treat 2017;165(2):273–83 doi 10.1007/s10549-017-4324-3.
- 15. Taouli B, Beer AJ, Chenevert T, Collins D, Lehman C, Matos C, et al. Diffusion-weighted imaging outside the brain: Consensus statement from an ISMRM-sponsored workshop. J Magn Reson Imaging 2016;44(3):521–40 doi 10.1002/jmri.25196.
- 16. White NS, McDonald C, Farid N, Kuperman J, Karow D, Schenker-Ahmed NM, et al. Diffusion-weighted imaging in cancer: physical foundations and applications of restriction spectrum imaging. Cancer Res 2014;74(17):4638–52 doi 10.1158/0008-5472.CAN-13-3534.
- 17. Charles-Edwards EM, deSouza NM. Diffusion-weighted magnetic resonance imaging and its application to cancer. Cancer Imaging 2006;6:135–43 doi 10.1102/1470-7330.2006.0021.
- 18. Bogner W, Gruber S, Pinker K, Grabner G, Stadlbauer A, Weber M, et al. Diffusion-weighted MR for differentiation of breast lesions at 3.0 T: how does selection of diffusion protocols affect diagnosis? Radiology 2009;253(2):341–51 doi 10.1148/radiol.2532081718.
- 19. Dijkstra H, Dorrius MD, Wielema M, Pijnappel RM, Oudkerk M, Sijens PE. Quantitative DWI implemented after DCE-MRI yields increased specificity for BI-RADS 3 and 4 breast lesions. J Magn Reson Imaging 2016;44(6):1642–9 doi 10.1002/jmri.25331.
- 20. El Khouli RH, Jacobs MA, Mezban SD, Huang P, Kamel IR, Macura KJ, et al. Diffusion-weighted imaging improves the diagnostic accuracy of conventional 3.0-T breast MR imaging. Radiology 2010;256(1):64–73 doi 10.1148/radiol.10091367.
- 21. Guo Y, Cai YQ, Cai ZL, Gao YG, An NY, Ma L, et al. Differentiation of clinically benign and malignant breast lesions using diffusion-weighted imaging. J Magn Reson Imaging 2002;16(2):172–8.
- 22. Parsian S, Rahbar H, Allison KH, Demartini WB, Olson ML, Lehman CD, et al. Nonmalignant breast lesions: ADCs of benign and high-risk subtypes assessed as false-positive at dynamic enhanced MR imaging. Radiology 2012;265(3):696–706 doi 10.1148/radiol.12112672.
- 23. Partridge SC, DeMartini WB, Kurland BF, Eby PR, White SW, Lehman CD. Quantitative diffusion-weighted imaging as an adjunct to conventional breast MRI for improved positive predictive value. AJR Am J Roentgenol 2009;193(6):1716–22 doi 10.2214/AJR.08.2139.
- 24. Spick C, Pinker-Domenig K, Rudas M, Helbich TH, Baltzer PA. MRI-only lesions: application of diffusion-weighted imaging obviates unnecessary MR-guided breast biopsies. Eur Radiol 2014;24(6):1204–10 doi 10.1007/s00330-014-3153-6.
- 25. Woodhams R, Matsunaga K, Kan S, Hata H, Ozaki M, Iwabuchi K, et al. ADC mapping of benign and malignant breast tumors. Magn Reson Med Sci 2005;4(1):35–42.
- 26. Chen X, Li WL, Zhang YL, Wu Q, Guo YM, Bai ZL. Meta-analysis of quantitative diffusion-weighted MR imaging in the differential diagnosis of breast lesions. BMC Cancer 2010;10:693 doi 10.1186/1471-2407-10-693.
- 27. Zhang L, Tang M, Min Z, Lu J, Lei X, Zhang X. Accuracy of combined dynamic contrast-enhanced magnetic resonance imaging and diffusion-weighted imaging for breast cancer detection: a meta-analysis. Acta Radiol 2016;57(6):651–60 doi 10.1177/0284185115597265.
- 28. Demartini WB, Kurland BF, Gutierrez RL, Blackmore CC, Peacock S, Lehman CD. Probability of malignancy for lesions detected on breast MRI: a predictive model incorporating BI-RADS imaging features and patient characteristics. Eur Radiol 2011;21(8):1609–17 doi 10.1007/s00330-011-2094-6.
- 29. Kawai M, Kataoka M, Kanao S, Iima M, Onishi N, Ohashi A, et al. The Value of Lesion Size as an Adjunct to the BI-RADS-MRI 2013 Descriptors in the Diagnosis of Solitary Breast Masses. Magn Reson Med Sci 2018;17(3):203–10 doi 10.2463/mrms.mp.2017-0024.
- 30. Hirano M, Satake H, Ishigaki S, Ikeda M, Kawai H, Naganawa S. Diffusion-weighted imaging of breast masses: comparison of diagnostic performance using various apparent diffusion coefficient parameters. AJR Am J Roentgenol 2012;198(3):717–22 doi 10.2214/AJR.11.7093.
- 31. Suo S, Zhang K, Cao M, Suo X, Hua J, Geng X, et al. Characterization of breast masses as benign or malignant at 3.0T MRI with whole-lesion histogram analysis of the apparent diffusion coefficient. J Magn Reson Imaging 2016;43(4):894–902 doi 10.1002/jmri.25043.
- 32. Baltzer PA, Benndorf M, Dietzel M, Gajda M, Camara O, Kaiser WA. Sensitivity and specificity of unenhanced MR mammography (DWI combined with T2-weighted TSE imaging, ueMRM) for the differentiation of mass lesions. Eur Radiol 2010;20(5):1101–10 doi 10.1007/s00330-009-1654-5.
- 33. Myers KS, Kamel IR, Macura KJ. MRI-guided breast biopsy: outcomes and effect on patient management. Clin Breast Cancer 2015;15(2):143–52 doi 10.1016/j.clbc.2014.11.003.
- 34. DCE-MRI and DWI for Detection and Diagnosis of Breast Cancer (ACRIN 6702). https://clinicaltrials.gov/ct2/show/NCT02022579. Last accessed November 1, 2018.
- 35. American College of Radiology Imaging Network (ACRIN) 6702 Protocol Documents. https://www.acrin.org/TabID/879/Default.aspx. Last accessed November 1, 2018.
- 36. American College of Radiology Breast MRI Accreditation Program: Modalities. http://www.acraccreditation.org/Modalities/Breast-MRI. Last accessed November 1, 2018.
- 37. American College of Radiology Imaging Network (ACRIN) 6702 Imaging Manual. https://www.acrin.org/Portals/0/Protocols/6702/ImagingMaterias/6702SiteImagingManualFinalv101302014.pdf. Last accessed November 1, 2018.
- 38. Malyarenko D, Galban CJ, Londy FJ, Meyer CR, Johnson TD, Rehemtulla A, et al. Multi-system repeatability and reproducibility of apparent diffusion coefficient measurement using an ice-water phantom. J Magn Reson Imaging 2013;37(5):1238–46 doi 10.1002/jmri.23825.
- 39. Morris EA, Comstock CE, Lee CH, et al. ACR BI-RADS® Magnetic Resonance Imaging. In: ACR BI-RADS® Atlas, Breast Imaging Reporting and Data System. Reston, VA, American College of Radiology; 2013.
- 40. Stejskal EO, Tanner JE. Spin diffusion measurements: spin echoes in the presence of a time-dependent field gradient. J Chem Phys 1965;42(288):288–92.
- 41. Rahbar H, Kurland BF, Olson ML, Kitsch AE, Scheel JR, Chai X, et al. Diffusion-Weighted Breast Magnetic Resonance Imaging: A Semiautomated Voxel Selection Technique Improves Interreader Reproducibility of Apparent Diffusion Coefficient Measurements. J Comput Assist Tomogr 2016;40(3):428–35 doi 10.1097/RCT.0000000000000372.
- 42. Clark CA, Le Bihan D. Water diffusion compartmentation and anisotropy at high b values in the human brain. Magn Reson Med 2000;44(6):852–9.
- 43. Pinker K, Bickel H, Helbich TH, Gruber S, Dubsky P, Pluschnig U, et al. Combined contrast-enhanced magnetic resonance and diffusion-weighted imaging reading adapted to the “Breast Imaging Reporting and Data System” for multiparametric 3-T imaging of breast lesions. Eur Radiol 2013;23(7):1791–802 doi 10.1007/s00330-013-2771-8.
- 44. Bickel H, Pinker K, Polanec S, Magometschnigg H, Wengert G, Spick C, et al. Diffusion-weighted imaging of breast lesions: Region-of-interest placement and different ADC parameters influence apparent diffusion coefficient values. Eur Radiol 2017;27(5):1883–92 doi 10.1007/s00330-016-4564-3.