Author manuscript; available in PMC: 2019 Mar 12.
Published in final edited form as: Radiology. 2018 Jan 9;287(2):416–422. doi: 10.1148/radiol.2017170770

ACR BI-RADS Assessment Category 4 Subdivisions in Diagnostic Mammography

Utilization and Outcomes in the National Mammography Database

Mai Elezaby 1, Geng Li 2, Mythreyi Bhargavan-Chatfield 3, Elizabeth S Burnside 4, Wendy B DeMartini 5
PMCID: PMC6413875  NIHMSID: NIHMS992724  PMID: 29315061

Abstract

Purpose:

To determine the utilization and positive predictive value (PPV) of the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) category 4 subdivisions in diagnostic mammography in the National Mammography Database (NMD).

Materials and Methods:

This study involved retrospective review of diagnostic mammography data submitted to the NMD from January 1, 2008 to December 30, 2014. Utilization rates of BI-RADS category 4 subdivisions were compared by year, facility (type, location, census region), and examination (indication, finding type) characteristics. PPV3 (positive predictive value for biopsies performed) was calculated overall and according to category 4 subdivision. The χ2 test was used to test for significant associations.

Results:

Of 1 309 950 diagnostic mammograms, 125 447 (9.6%) were category 4, of which 33.3% (41 841 of 125 447) were subdivided. Subdivision utilization rates were higher (P < .001) in practices that were community, suburban, or in the West; for examination indication of prior history of breast cancer; and for the imaging finding of architectural distortion. Of 41 841 category 4 subdivided examinations, 4A constituted 55.6% (23 258 of 41 841) of the examinations; 4B, 31.8% (13 302 of 41 841) of the examinations; and 4C, 12.6% (5281 of 41 841) of the examinations. Pathologic outcomes were available in 91 563 examinations, and overall category 4 PPV3 was 21.1% (19 285 of 91 563). There was a statistically significant difference in PPV3 according to category 4 subdivision (P < .001): The PPV3 of 4A was 7.6% (1274 of 16 784), that of 4B was 22.2% (2317 of 10 408), and that of 4C was 69.3% (2839 of 4099).

Conclusion:

Although BI-RADS suggests their use, subdivisions were utilized in the minority (33.3% [41 841 of 125 447]) of category 4 diagnostic mammograms, with variability based on facility and examination characteristics. When subdivisions were used, PPV3s were in BI-RADS–specified malignancy ranges. This analysis supports the use of subdivisions in broad practice and, given benefits for patient care, should motivate increased utilization.


Standardized reporting of mammographic examinations, as outlined in the Breast Imaging Reporting and Data System (BI-RADS) atlas, mandates providing a BI-RADS assessment category based on the most suspicious imaging features (1). BI-RADS assessment categories are a well-defined seven-point classification system of numeric codes (zero to six) and accompanying text descriptors that are assigned to mammographic examinations. These risk-stratifying categories are intended to convey the likelihood of malignancy clearly to referring clinicians, and they typically guide management recommendations. Examinations assigned a BI-RADS assessment category 4 (hereafter described as category 4) have been defined as those with “findings that do not have the classic appearance of malignancy but are sufficiently suspicious to justify recommendation for biopsy” (2). Category 4 covers a wide range of likelihood of malignancy, from more than 2% to less than 95%. The concordant management recommendation is tissue diagnosis, ranging from imaging-guided aspiration to imaging-guided biopsy or surgical excision. Since 2003, BI-RADS (4th and 5th editions) has suggested the use of category 4 subdivisions to provide improved stratification of likelihood of malignancy. The current definitions (5th edition) of likelihood of malignancy for category 4 subdivisions are as follows: 4A > 2% to ≤ 10% (low suspicion of malignancy), 4B > 10% to ≤ 50% (moderate suspicion), and 4C > 50% to < 95% (high suspicion) (1,3). Hence, subdivisions provide more graded risk stratification, which facilitates informed patient treatment, including setting patient and provider expectations for outcomes. They also provide more meaningful data for measuring radiologists’ performance, ensure more accurate radiologic-pathologic correlation, and generate more robust data for radiologic research (4–7).
The benefits of utilization of BI-RADS 4 subdivisions also apply to the reporting of ultrasonographic and magnetic resonance (MR) imaging findings (8–10). Of note, the Mammography Quality Standards Act (MQSA) requires specific wording of assessment categories in mammography reports. For category 4 examinations, an assessment of “suspicious” is the mandated wording; subdivisions are optional (11).

The use and outcomes of BI-RADS assessment categories are key measures of physician performance, because variability and outcomes below desirable benchmarks substantially affect patient care (12). Understanding category 4 utilization patterns and results is particularly important, as this category accounts for the majority of tissue diagnosis recommendations and procedures. In addition, category 4 subdivisions are encouraged in practice because of their potential benefits, but little is known about their utilization and outcomes. Thus, the purpose of our study was to determine the utilization and positive predictive values (PPVs) of biopsy performed (PPV3) of category 4 subdivisions for diagnostic mammography, using the National Mammography Database (NMD), a large national registry.

Materials and Methods

This study was approved by the NMD Registry and was Health Insurance Portability and Accountability Act compliant. The ACR data registries, including the NMD, qualify as quality improvement and have institutional review board exemption. We retrospectively reviewed diagnostic mammography data that were voluntarily submitted and self-reported by mammography facilities participating in the NMD.

Inclusion and Exclusion Criteria

We included all diagnostic mammography data submitted to the NMD between January 1, 2008 and December 30, 2014. Mammographic examinations were determined to be diagnostic based on facility designation. For the positive predictive value calculations, we included those diagnostic mammograms for which final pathology results were available.

NMD Facilities and Data Collection

The NMD is one of the American College of Radiology (ACR) National Radiology Data Registries. Since 2008, the NMD has grown into one of the largest national mammography registries. The NMD is a clinical data registry with the overarching goal of guiding quality improvement efforts within its participating mammography facilities. It provides auditing feedback to these facilities as compared with national benchmarks and peers from similar practice settings (13). The comprehensiveness of the mammography data submitted to the NMD, as well as the wide spectrum of its participating facilities, is expected to provide a good representation of current trends in mammographic interpretation in routine clinical practice (14), including category 4 utilization.

Data collected from mammography facilities participating in the NMD include facility-level and examination-level data. Facility data encompass practice type, practice setting, and census region. The NMD divides practice types into four categories: academic facilities, which are associated with academic institutions; multispecialty facilities, which provide radiologic services as well as services pertaining to other medical specialties; freestanding facilities, which are not associated with hospitals or other breast health specialties; and community facilities, which do not conform to any of the other facility definitions. Practice settings include metropolitan, suburban, and rural. Census region is based on the U.S. government designation: Northeast, Midwest, West, and South.

Examination data submitted to the NMD include patient demographics and examination indication. The patient demographics include age and personal history of breast cancer. The indications for diagnostic mammograms include additional evaluation of screening mammogram, short interval follow up, evaluation of a clinical problem, and previous history of breast cancer.

Examination-level image interpretation data are also collected by the NMD. These include BI-RADS finding type, BI-RADS assessment category (BI-RADS 0–5), and management recommendation (additional imaging, short-term follow-up, biopsy, and clinical evaluation). Of note, during the study period (2008–2014) and up to 2016, the NMD data upload software version 2.0 did not include BI-RADS 6 records. During that time, the NMD version 2.0 required facilities to exclude BI-RADS 6 records from their submitted data; therefore, these data were not included in our analysis.

BI-RADS Category 4 Subdivision Utilization

We calculated the utilization rate of category 4 subdivisions overall and according to year, facility characteristics (practice type, practice setting, and census region), and examination data (diagnostic examination indication and BI-RADS imaging finding type). In the context of increasing NMD participation, we also determined how use of these subdivisions evolved over time.

Outcome Measurements and PPV Calculation

We determined the PPV3 overall and according to category 4 subdivision for diagnostic mammograms where final pathology results were available.

PPV3 was calculated according to the BI-RADS 5th edition as the number of category 4 examinations with tissue diagnosis of invasive breast cancer or ductal carcinoma in situ (DCIS) divided by the number of category 4 examinations for which tissue diagnosis was performed.
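For illustration, the PPV3 definition above can be written as a short calculation. The following is a minimal Python sketch, not the authors' R code; the function name `ppv3` is a hypothetical helper, and the counts are those reported in Table 2.

```python
def ppv3(cancers: int, biopsied: int) -> float:
    """PPV3: examinations with a tissue diagnosis of cancer (invasive or DCIS)
    divided by examinations for which tissue diagnosis was performed."""
    if biopsied <= 0:
        raise ValueError("biopsied must be a positive count")
    if not 0 <= cancers <= biopsied:
        raise ValueError("cancers must lie between 0 and biopsied")
    return cancers / biopsied


# Counts reported in Table 2: (No. of cancers, examinations with pathologic confirmation).
table2_counts = {
    "4 Overall": (19285, 91563),
    "4A": (1274, 16784),
    "4B": (2317, 10408),
    "4C": (2839, 4099),
}

for category, (cancers, biopsied) in table2_counts.items():
    print(f"{category}: proportion = {ppv3(cancers, biopsied):.3f}")
```

The denominator is the number of examinations with pathologic confirmation, not the total number of category 4 examinations, which is why PPV3 can be computed only where final pathology results were available.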

Statistical Analysis

Binomial proportion confidence interval was calculated for PPV3 in each category 4 subdivision, and category 4 subdivision utilization rate in each category of facility and examination characteristics. Pairwise proportion differences and 95% confidence intervals (CIs) were also calculated between categories in each facility and examination characteristic.
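The text does not state which binomial interval method was used. As a sketch under that caveat, the normal-approximation (Wald) interval is shown here in Python; the helper name `wald_ci` is hypothetical, and choosing the Wald method is an assumption, although it yields limits that agree with the intervals listed in Table 2 to three decimals.

```python
import math
from statistics import NormalDist


def wald_ci(successes: int, n: int, level: float = 0.95):
    """Normal-approximation (Wald) confidence interval for a binomial proportion.

    Returns (proportion, lower limit, upper limit), with limits clipped to [0, 1].
    """
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Category 4A: 1274 cancers among 16 784 examinations with pathologic confirmation.
p, low, high = wald_ci(1274, 16784)
print(f"4A: {p:.3f} ({low:.3f}, {high:.3f})")
```

With counts this large, other interval methods (for example, Wilson or exact intervals) would be expected to give nearly identical limits.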

The χ2 test was used to determine statistically significant associations, with P < .05 considered to indicate a statistically significant difference. The χ2 test was used to test whether PPV3 was significantly different for BI-RADS 4 subdivided categories (4A, 4B, and 4C), and whether the proportion of BI-RADS 4 subdivided into categories was significantly different according to facility and examination characteristics.
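As an illustration of the χ2 test of association, the sketch below applies a Pearson χ2 test to the 3 × 2 table of subdivision (4A, 4B, 4C) by pathologic outcome (cancer, no cancer), using counts derived from Table 2. It is written in plain Python rather than the R used for the study; the helper names `chi2_statistic` and `chi2_sf` are hypothetical, and no continuity correction is applied, which is an assumption.

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution for integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum in closed form.
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from the df = 1 tail and add the remaining terms.
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, df // 2 + 1)
    )
    return tail


# Rows: 4A, 4B, 4C. Columns: cancer, no cancer (pathologic confirmation minus cancers).
table = [
    [1274, 16784 - 1274],
    [2317, 10408 - 2317],
    [2839, 4099 - 2839],
]
stat, df = chi2_statistic(table)
p_value = chi2_sf(stat, df)
print(f"chi-square = {stat:.1f}, df = {df}, P < .001: {p_value < 0.001}")
```

The same function applies unchanged to the utilization tables, for example subdivided versus not subdivided counts by practice type.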

After the χ2 test showed an overall significant difference among groups, post hoc tests were used to identify where the differences occurred. First, pairwise comparisons with Bonferroni corrections of the P values were performed. For the pairwise comparisons between proportions, a two-tailed test was used.
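The pairwise step can be sketched as follows in Python. This is an illustrative simplification, not the authors' R code: it uses a pooled two-proportion z test without continuity correction, multiplies each two-tailed P value by the number of pairs (Bonferroni), and takes the practice-setting counts from Table 3. The helper names are hypothetical.

```python
import math
from itertools import combinations
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (no continuity correction)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def pairwise_bonferroni(groups):
    """Two-tailed pairwise comparisons with Bonferroni-adjusted P values.

    groups maps a label to (subdivided examinations, all category 4 examinations).
    """
    pairs = list(combinations(groups, 2))
    adjusted = {}
    for a, b in pairs:
        z = two_proportion_z(*groups[a], *groups[b])
        p_two_tailed = 2 * NormalDist().cdf(-abs(z))
        adjusted[(a, b)] = min(1.0, p_two_tailed * len(pairs))
    return adjusted


# Practice setting counts from Table 3.
practice_setting = {
    "Metropolitan": (28747, 86126),
    "Suburban": (10541, 31004),
    "Rural": (2553, 8317),
}

for pair, p_adj in pairwise_bonferroni(practice_setting).items():
    print(pair, f"adjusted P = {p_adj:.4f}")
```

Because the correction multiplies each P value by the number of comparisons, a difference that is nominally significant on its own may no longer be significant after adjustment.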

Then, to see which group had a significantly higher BI-RADS 4 subdivided rate than the other groups, the group with the highest (or lowest) BI-RADS 4 subdivided rate was compared with the sum of the rest of the groups, with Bonferroni corrections of the P values. To test whether the proportion of a certain category was significantly higher or lower than that of the others, the one-sided z test was used. All of the analyses were performed by using R (version 3.2.4) (15).
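A minimal Python sketch of the "one group versus the rest" comparison follows. It is an illustration under stated assumptions rather than the authors' R code: the pooled one-sided z test is used without continuity correction, the Bonferroni multiplier `n_tests` is a hypothetical parameter (the text does not state how many comparisons were corrected for), and the counts come from Table 3.

```python
import math
from statistics import NormalDist


def one_sided_vs_rest(groups, target, n_tests=1, higher=True):
    """One-sided z test of the target group's proportion against all other groups pooled.

    groups maps a label to (subdivided examinations, all category 4 examinations).
    Returns (z statistic, Bonferroni-adjusted one-sided P value).
    """
    x_t, n_t = groups[target]
    x_r = sum(x for name, (x, _) in groups.items() if name != target)
    n_r = sum(n for name, (_, n) in groups.items() if name != target)
    pooled = (x_t + x_r) / (n_t + n_r)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_r))
    z = (x_t / n_t - x_r / n_r) / se
    # Upper tail when testing "higher than the rest", lower tail otherwise.
    p_one_sided = NormalDist().cdf(-z) if higher else NormalDist().cdf(z)
    return z, min(1.0, p_one_sided * n_tests)


practice_setting = {
    "Metropolitan": (28747, 86126),
    "Suburban": (10541, 31004),
    "Rural": (2553, 8317),
}

# Assumption: correct for three tests, one per practice setting.
z, p_adj = one_sided_vs_rest(practice_setting, "Suburban", n_tests=3)
print(f"Suburban vs rest: z = {z:.2f}, adjusted one-sided P = {p_adj:.4f}")
```

Passing `higher=False` tests whether the target group's rate is lower than that of the remaining groups, as in the comparison for the group with the lowest rate.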

Results

During our study period, the number of facilities participating in the NMD and contributing to the study data gradually increased, from 11 facilities in 2008 to 165 facilities in 2014 (Table E1 [online]). A total of 1 309 950 diagnostic mammograms were performed between January 1, 2008 and December 30, 2014. Category 4 was assigned to 125 447 (9.6%) of 1 309 950 examinations in 112 958 women with a mean age of 55.3 years (range, 28–86 years). These women constituted our study cohort. Table 1 shows the distribution of BI-RADS assessment categories among diagnostic mammography examinations. BI-RADS categories 1 and 2 constituted the majority of diagnostic examinations (64.7% [847 599 of 1 309 950]), and BI-RADS category 5 represented a small proportion of diagnostic examinations (1.1% [14 086 of 1 309 950]).

Table 1.

Distribution of BI-RADS Assessment Categories in Diagnostic Mammography Examinations

| BI-RADS Assessment Category | No. of Examinations | Percentage |
|---|---|---|
| 0 | 140 736 | 10.7 |
| 1 | 240 861 | 18.3 |
| 2 | 606 738 | 46.3 |
| 3 | 182 082 | 13.9 |
| 4 (All) | 125 447 | 9.6 |
| 5 | 14 086 | 1.1 |
| Total | 1 309 950 | 100 |

Utilization of BI-RADS Assessment Category 4 Subdivisions and Positive Predictive Value

Of the 125 447 diagnostic mammograms with category 4, category 4 subdivisions were utilized in 41 841 examinations (33.3%). Of the subdivided category 4 examinations, category 4A was the most common, accounting for the majority of those subdivided, followed by 4B, while 4C was least commonly used (Table 2). Pathologic outcome was available for 91 563 examinations, for an ascertainment rate of 73% (91 563 of 125 447). Pathology results yielded cancer in 19 285 examinations, for an overall category 4 PPV3 of 21.1% (19 285 of 91 563). This was similar to the PPV3 of the category 4 not-subdivided examinations of 21.3%, where 12 855 of 60 272 examinations showed cancer. Of the category 4A examinations, 1274 showed cancer, for a PPV3 of 7.6% (1274 of 16 784). Of the category 4B examinations, 2317 showed cancer, for a PPV3 of 22.2% (2317 of 10 408). Of the category 4C examinations, 2839 showed cancer, for a PPV3 of 69.3% (2839 of 4099). These results are summarized in Table 2 (and Table E2 [online]). The differences in PPV3 of category 4 subdivisions were statistically significant (P < .002).

Table 2.

PPV3 for BI-RADS Assessment Category 4 Overall and Subdivided

| BI-RADS Assessment Category | Total No. of Examinations | Examinations with Pathologic Confirmation | No. of Cancers | PPV3 (%)* | Proportion (95% CI) | P Value† | BI-RADS–specified Likelihood of Malignancy |
|---|---|---|---|---|---|---|---|
| 4 Overall | 125 447 | 91 563 | 19 285 | 21.1 | 0.211 (0.208, 0.213) | | > 2% but < 95% |
| 4 Not subdivided | 83 606 | 60 272 | 12 855 | 21.3 | 0.213 (0.21, 0.217) | | > 2% but < 95% |
| 4A | 23 258 | 16 784 | 1274 | 7.6 | 0.076 (0.072, 0.08) | <.001 | > 2% to ≤ 10% |
| 4B | 13 302 | 10 408 | 2317 | 22.2 | 0.223 (0.215, 0.231) | .032§ | > 10% to ≤ 50% |
| 4C | 5281 | 4099 | 2839 | 69.3 | 0.693 (0.678, 0.707) | <.001 | > 50% to < 95% |

Note: CI = confidence interval.

* Percentage of diagnostic mammograms for which biopsy was performed that resulted in a tissue diagnosis of cancer within 12 months.

† P value for the comparison of PPV3 between each of the BI-RADS category 4 subdivisions (A, B, or C) and the category 4 not subdivided group.

‡ Statistically significant at P < .01.

§ Statistically significant at P < .05.

‖ Statistically significant at P < .001.
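The comparison between observed PPV3 and the BI-RADS–specified likelihood-of-malignancy ranges can be checked with a few lines of Python. This is a minimal sketch; the names `BIRADS_RANGES` and `in_specified_range` are hypothetical, the range boundaries are those quoted from the BI-RADS 5th edition in the Introduction, and the counts are from Table 2.

```python
# BI-RADS 5th edition ranges: (lower bound, upper bound, is the upper bound inclusive?).
# 4A: > 2% to <= 10%; 4B: > 10% to <= 50%; 4C: > 50% to < 95%.
BIRADS_RANGES = {
    "4A": (0.02, 0.10, True),
    "4B": (0.10, 0.50, True),
    "4C": (0.50, 0.95, False),
}


def in_specified_range(subdivision: str, ppv: float) -> bool:
    """Return True when a PPV3 falls inside the range specified for the subdivision."""
    low, high, upper_inclusive = BIRADS_RANGES[subdivision]
    if upper_inclusive:
        return low < ppv <= high
    return low < ppv < high


# (No. of cancers, examinations with pathologic confirmation) from Table 2.
observed = {"4A": (1274, 16784), "4B": (2317, 10408), "4C": (2839, 4099)}

for name, (cancers, biopsied) in observed.items():
    ppv = cancers / biopsied
    print(f"{name}: proportion = {ppv:.3f}, within range: {in_specified_range(name, ppv)}")
```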

Utilization of BI-RADS Assessment Category 4 Subdivisions by Year

The Figure illustrates the utilization rate of category 4 subdivisions according to year. Years 2008 and 2009 showed the highest utilization rates of category 4 subdivisions, at 54% (655 of 1209) and 46% (3394 of 7416), respectively. However, they represented the initial years of data accrual, with the lowest volumes of diagnostic mammograms.

Facility Characteristic Data

Table 3 and Table E3 (online) illustrate the utilization of category 4, overall and subdivided, according to facility characteristics (practice type, practice setting, and census region). For practice type, community practices contributed the largest share of category 4 diagnostic examinations (49 477 of 125 447 examinations). They were also the most likely to use category 4 subdivisions (41% [20 327 of 49 477] of their BI-RADS 4 examinations), a significantly higher utilization rate than that of other practice types (P < .001). Academic practices utilized category 4 subdivisions in less than one-quarter of their category 4 examinations (24% [5810 of 23 972]), a significantly lower utilization rate than that of other practice types (P < .001). As for practice setting, facilities in metropolitan locations contributed the majority of diagnostic examinations assigned a category 4. However, facilities in suburban settings had a significantly higher utilization rate of category 4 subdivisions, at 34% (10 541 of 31 004) (P < .05), while facilities in rural settings had a significantly lower utilization rate, at 31% (2553 of 8317) (P < .001). For census region, facilities from the Northeast region contributed the largest volume of category 4 diagnostic examinations. However, the category 4 subdivision utilization rate was highest among West region facilities, at 56% (18 457 of 32 793) (P < .001), and lowest among Northeast region facilities, at 18% (6042 of 34 608) (P < .001).

Table 3.

Utilization of BI-RADS Assessment Category 4 Subdivisions: Facility Characteristics

| Facility Characteristic | No. of Diagnostic Examinations Assessed as BI-RADS 4 Overall | No. of Diagnostic Examinations Assessed as BI-RADS 4 with Subdivisions* | Proportion† | P Value‡ |
|---|---|---|---|---|
| Practice type | | | | <.001 |
| Academic | 23 972 | 5810 (24.2) | 0.242 (0.237, 0.248) | |
| Community | 49 477 | 20 327 (41.1)§ | 0.411 (0.407, 0.415) | |
| Multispecialty | 5548 | 1428 (25.7) | 0.257 (0.246, 0.269) | |
| Freestanding | 46 450 | 14 276 (30.7) | 0.307 (0.303, 0.312) | |
| Practice setting | | | | <.001 |
| Metropolitan | 86 126 | 28 747 (33.4) | 0.334 (0.331, 0.337) | |
| Suburban | 31 004 | 10 541 (34.0)§ | 0.34 (0.335, 0.345) | |
| Rural | 8317 | 2553 (30.7) | 0.307 (0.297, 0.317) | |
| Census region | | | | <.001 |
| Northeast | 34 608 | 6042 (17.5) | 0.175 (0.171, 0.179) | |
| Midwest | 28 806 | 8561 (29.7) | 0.297 (0.292, 0.302) | |
| South | 29 240 | 8781 (30.0) | 0.3 (0.295, 0.306) | |
| West | 32 793 | 18 457 (56.3)§ | 0.563 (0.557, 0.568) | |

* Data in parentheses are percentages.

† Data in parentheses are 95% confidence intervals.

‡ P value of the χ2 test for the association between BI-RADS 4 subdivision and each facility characteristic.

§ Statistically significant higher utilization rate of BI-RADS category 4 subdivisions.

[Figure: utilization rate of category 4 subdivisions according to year.]

Examination Characteristic Data

Table 4 and Table E4 (online) illustrate the utilization of category 4, overall and subdivided, according to examination characteristics (examination indication and finding type). The largest share of diagnostic examinations with a category 4 was performed for the evaluation of a clinical breast problem (48% [60 348 of 125 447]). Although previous history of breast cancer constituted the indication with the lowest volume of examinations (5% [6206 of 125 447]), it had the highest utilization rate of category 4 subdivisions, at 44.6% (2766 of 6206), which was statistically significant (P < .001).

Table 4.

Utilization of BI-RADS Assessment Category 4 Subdivisions: Examination Characteristics

| Characteristic | No. of Diagnostic Examinations Assessed as BI-RADS 4 Overall | No. of Diagnostic Examinations Assessed as BI-RADS 4 with Subdivisions* | Proportion† | P Value‡ |
|---|---|---|---|---|
| Examination indication | | | | <.001 |
| Additional evaluation of screening mammography findings | 51 039 | 16 822 (33.0) | 0.330 (0.326, 0.334) | |
| Evaluation of a clinical breast problem | 60 348 | 19 494 (32.3) | 0.323 (0.320, 0.327) | |
| Short-interval follow-up | 7854 | 2759 (35.1) | 0.351 (0.341, 0.362) | |
| Previous history of breast cancer | 6206 | 2766 (44.6)§ | 0.446 (0.433, 0.458) | |
| Finding type | | | | <.001 |
| Mass | 10 115 | 3470 (34.3) | 0.343 (0.334, 0.352) | |
| Calcifications | 15 292 | 5274 (34.5) | 0.345 (0.337, 0.352) | |
| Architectural distortion | 451 | 249 (55.2)§ | 0.552 (0.506, 0.597) | |
| Asymmetries | 1585 | 678 (42.8) | 0.428 (0.404, 0.452) | |
| Other | 3524 | 686 (19.5) | 0.195 (0.182, 0.208) | |

* Data in parentheses are percentages.

† Data in parentheses are 95% confidence intervals.

‡ P value of the χ2 test for the association between BI-RADS 4 subdivision and each examination characteristic.

§ Statistically significant higher utilization rate of BI-RADS category 4 subdivisions.

Data on imaging finding type were available for only 27 443 of the 125 447 category 4 diagnostic mammography examinations (21.8%). Architectural distortion was the imaging finding type in only 1.6% (451 of 27 443) of these examinations, yet category 4 subdivisions were utilized in 55% (249 of 451) of them (P < .001).

Discussion

Of 125 447 category 4 diagnostic mammograms in the NMD, a registry of mammography data from across the United States, category 4 subdivisions were infrequently utilized (33.3% [41 841 of 125 447]). However, when used, their PPV3s were in the BI-RADS–specified ranges.

The use of category 4 subdivisions has been suggested in BI-RADS since 2003, to aid patient treatment decisions, monitor the interpretive performance of radiologists, and promote radiologic research (5,6,8–10,16). However, we observed low utilization of subdivisions across our study interval. The initial, higher utilization (47% [4049 of 8625]) was likely due to the small numbers of enrolled facilities that were the earlier adopters of the NMD and were likely more committed to standardization and adherence to guidelines.

Reports on variability in interpretation of screening (17–19) and diagnostic (20,21) mammography investigated the impact of physician characteristics such as subspecialty training, years of experience, and volumes of yearly interpreted examinations on improved performance (22–24). In our analysis of facility characteristics, community practices were more likely to utilize subdivisions in their category 4 examinations (P < .001). This may appear contradictory to prior reports that emphasized the role of subspecialization in adherence to BI-RADS, a characteristic commonly associated with academic practice (17,22,23,25). In a study by Miglioretti et al (20), association with an academic institution was the strongest predictor of improved diagnostic accuracy. One possible explanation of our results may relate to the high-volume practices in the NMD (13). High volumes also have been associated with improved performance (26). Additionally, recent changes in the radiology workforce challenge the traditional concept of generalists interpreting mammography in community practice (27). Increasing breast imaging fellowships and growing demands for subspecialization may have increased the number of breast imaging specialists in community practice (28,29). We also identified higher utilization in suburban practice settings and in the West census region. However, regional variations in health care delivery have many confounding variables that are not captured fully enough in our data to allow meaningful interpretation (30).

Our results demonstrate higher utilization of category 4 subdivisions with examinations for prior history of breast cancer. This patient population, often more knowledgeable about breast cancer and more involved in their health care decisions, may have higher demands for better stratification of likelihood of malignancy (31).

Architectural distortion was the finding type most strongly associated with the use of category 4 subdivisions (P < .001). This observation should be interpreted with caution given the low number (22% [27 443 of 125 447]) of examinations with finding type data. However, the predominant use of subdivisions with architectural distortion may reflect the need for better stratification of likelihood of malignancy to guide management decisions. This echoes the challenges in the management of architectural distortion with the advent of digital breast tomosynthesis, a topic at the forefront of breast imaging in recent years (32). On the other hand, the lower utilization of subdivisions with calcifications is unexpected because the majority of prior reports on subdivisions focused on the imaging features of microcalcifications and their predictive power (5,6,33). Our results may reflect the continued challenges in estimating the likelihood of malignancy for calcifications in clinical practice.

The distribution across 4A–C in category 4 subdivided examinations was similar to that in prior reports (5,6,16); 4A accounted for more than half, at 56% (23 258 of 41 841). Additionally, PPV3 for each category 4 subdivision was within the acceptable performance ranges specified in BI-RADS (P < .002). Specifically, it was 7.6% (1274 of 16 784) for 4A, 22.2% (2317 of 10 408) for 4B, and 69.3% (2839 of 4099) for 4C. This is an important observation, because the specified ranges in BI-RADS are based on published data from specialized breast imaging practices. Our results therefore support the reproducibility of risk stratification of category 4 subdivisions when used in routine clinical practice. On the other hand, the PPV3 of category 4 not-subdivided examinations was 21% (12 855 of 60 272), at the low end of BI-RADS acceptable ranges of diagnostic performance (20%–45% for abnormal screening finding; 30%–55% for palpable lump) (1).

We identified utilization patterns in other assessment categories that warrant further study. For example, 11% (140 736 of 1 309 950) of diagnostic mammograms were assigned category 0 (incomplete). This is higher than the 4.6% previously reported (21) and is inconsistent with the BI-RADS guideline that the use of category 0 in diagnostic mammography be kept to a minimum (1). Additionally, 13.9% (182 082 of 1 309 950) of diagnostic examinations were assigned category 3 (probably benign). This observation is hard to interpret given variability in the utilization of category 3 in clinical practice and the paucity of well-defined acceptable ranges for its use in the diagnostic setting (34).

Our study had limitations. First, the NMD is a large database; however, facilities’ participation is voluntary and may not be representative of all clinical practice. Second, the NMD is not linked to tumor registries; pathologic outcomes are self-reported and are not available for all examinations. Third, the type of practices participating in the NMD likely changed over time; early participants were likely more predisposed to standardization and adherence to guidelines. Fourth, our analysis focused on BI-RADS assessment categories in diagnostic mammography. This limitation pertains to the data collection process by the NMD, which within our study period has not included US or MR imaging reporting data elements. Last, our analysis identified other assessment categories’ utilization patterns that were outside the scope of this manuscript. These, however, warrant further study and will be the focus of subsequent work.

In conclusion, in our study of 125 447 category 4 diagnostic mammograms in the NMD, subdivisions (4A-C) resulted in improved stratification of likelihood of malignancy within BI-RADS-specified ranges. The low utilization of category 4 subdivisions highlights an opportunity for improving the quality of mammographic interpretation through continued education of radiologists regarding the benefits of their use.

Supplementary Material

Supplemental Tables

Implication for Patient Care.

  • Using category 4 subdivisions to stratify the likelihood of malignancy can benefit patient treatment decisions, potentially leading to improvements in patient care.

Abbreviations:

ACR = American College of Radiology

BI-RADS = Breast Imaging Reporting and Data System

NMD = National Mammography Database

PPV = positive predictive value

PPV3 = PPV of biopsy performed

Footnotes

Disclosures of Conflicts of Interest: M.E. disclosed no relevant relationships. G.L. disclosed no relevant relationships. M.B. disclosed no relevant relationships. E.S.B. disclosed no relevant relationships. W.B.D. disclosed no relevant relationships.

Contributor Information

Mai Elezaby, Department of Radiology, University of Wisconsin School of Medicine and Public Health, 600 Highland Ave, Madison, WI 53792.

Geng Li, Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, 600 Highland Ave, Madison, WI 53792.

Mythreyi Bhargavan-Chatfield, American College of Radiology, Reston, Va.

Elizabeth S. Burnside, Department of Radiology and Carbone Comprehensive Cancer Center, University of Wisconsin School of Medicine and Public Health, 600 Highland Ave, Madison, WI 53792.

Wendy B. DeMartini, Department of Radiology, Stanford University School of Medicine, Stanford, Calif.

References

  • 1.D’Orsi CJ, Sickles E, Mendelson EB, Morris EA, et al. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. Reston, Va: American College of Radiology, 2013. [Google Scholar]
  • 2.Sickles E, D’Orsi CJ, Bassett LW, et al. ACR BI-RADS Mammography In: ACR BI-RADS Atlas, Breast Imaging Reporting and SystemData. Reston, Va: American College of Radiology, 2013. [Google Scholar]
  • 3.D’Orsi CJ, Mendelson EB, Ikeda DM. Breast Imaging Reporting and Data System: ACR BI-RADS—breast imaging atlas. Reston, Va: American College of Radiology, 2003. [Google Scholar]
  • 4.Liberman L, Abramson AF, Squires FB, Glassman JR, Morris EA, Dershaw DD. The breast imaging reporting and data system: positive predictive value of mammographic features and final assessment categories. AJR Am J Roentgenol 1998;171(1):35–40. [DOI] [PubMed] [Google Scholar]
  • 5.Bent CK, Bassett LW, D’Orsi CJ, Sayre JW. The positive predictive value of BI-RADS microcalcification descriptors and final assessment categories. AJR Am J Roentgenol 2010; 194(5):1378–1383. [DOI] [PubMed] [Google Scholar]
  • 6.Sanders MA, Roland L, Sahoo S. Clinical implications of subcategorizing BI-RADS 4 breast lesions associated with microcalcification: a radiology-pathology correlation study. Breast J 2010;16(1):28–31. [DOI] [PubMed] [Google Scholar]
  • 7.Chaiwerawattana A, Thanasitthichai S, Boonlikit S, et al. Clinical outcome of breast cancer BI-RADS 4 lesions during 2003–2008 in the National Cancer Institute Thailand. Asian Pac J Cancer Prev 2012;13(8):4063–4066. [DOI] [PubMed] [Google Scholar]
  • 8.Lazarus E, Mainiero MB, Schepps B, Koelliker SL, Livingston LS. BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value. Radiology 2006;239(2):385–391. [DOI] [PubMed] [Google Scholar]
  • 9.Raza S, Goldkamp AL, Chikarmane SA, Birdwell RL. US of breast masses categorized as BI-RADS 3, 4, and 5: pictorial review of factors influencing clinical management. Radio-Graphics 2010;30(5):1199–1213. [DOI] [PubMed] [Google Scholar]
  • 10.Maltez de Almeida JR, Gomes AB, Barros TP, Fahel PE, de Seixas Rocha M. Subcategorization of suspicious breast lesions (BI-RADS category 4) according to MRI criteria: role of dynamic contrast-enhanced and diffusion-weighted imaging. AJR Am J Roentgenol 2015;205(1):222–231. [DOI] [PubMed] [Google Scholar]
  • 11.Mourad WG. Mammography Reports - Are you doing right by your patients? Mammography Quality Standards Act and Program. MQSA Insights. http://wayback.archive-it.org/7993/20170112093819/http://www.fda.gov/Radiation-EmittingProducts/MammographyQualityStandardsActandProgram/FacilityScorecard/ucm113812.htm. Updated October 31, 2014. Accessed June 8, 2016.
  • 12.Inappropriate Use of “Probably Benign” Assessment Category in Screening Mammograms-National Quality Strategy Domain: Efficiency and Cost Reduction. https://www.acr.org/%20~/media/ACR/Documents/P4P/2016%20PQRS/DX/2016_PQRS_Measure_146_11_17_2015.pdf. Updated November 17, 2015. Accessed January 14, 2017.
  • 13.Lee CS, Bhargavan-Chatfield M, Burnside ES, Nagy P, Sickles EA. The National Mammography Database: preliminary data. ajr am j roentgenol 2016;206(4):883–890. [DOI] [PubMed] [Google Scholar]
  • 14.D’Orsi CJ, Sickles EA. 2017 Breast Cancer Surveillance Consortium reports on interpretive performance at screening and diagnostic mammography: welcome new data, but not as benchmarks for practice. Radiology 2017; 283(1):7–9. [DOI] [PubMed] [Google Scholar]
  • 15.Team R A language and environment for statistical computing R Foundation for Statistical Computing. 2006; Vienna, Austria: https://www.R-project.org/. Published 2016. Accessed October 10, 2017. [Google Scholar]
  • 16.Torres-Tabanera M, Cardenas-Rebollo JM, Villar-Castaño P, et al. Analysis of the positive predictive value of the subcategories of BI-RADS(®) 4 lesions: preliminary results in 880 lesions [in Spanish]. Radiologia 2012;54(6):520–531. [DOI] [PubMed] [Google Scholar]
  • 17.Elmore JG, Jackson SL, Abraham L, et al. Variability in interpretive performance at screening mammography and radiologists’ characteristics associated with accuracy. Radiology 2009;253(3):641–651. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Beam CA, Conant EF, Sickles EA. Factors affecting radiologist inconsistency in screening mammography. Acad Radiol 2002;9(5):531–540. [DOI] [PubMed] [Google Scholar]
  • 19.Beam CA, Conant EF, Sickles EA. Association of volume and volume-independent factors with accuracy in screening mammogram interpretation. J Natl Cancer Inst 2003;95(4):282–290. [DOI] [PubMed] [Google Scholar]
  • 20.Miglioretti DL, Smith-Bindman R, Abraham L, et al. Radiologist characteristics associated with interpretive performance of diagnostic mammography. J Natl Cancer Inst 2007;99(24):1854–1863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Sickles EA, Miglioretti DL, Ballard-Barbash R, et al. Performance benchmarks for diagnostic mammography. Radiology 2005;235(3):775–790. [DOI] [PubMed] [Google Scholar]
  • 22.Leung JW, Margolin FR, Dee KE, Jacobs RP, Denny SR, Schrumpf JD. Performance parameters for screening and diagnostic mammography in a community practice: are there differences between specialists and general radiologists? AJR Am J Roentgenol 2007;188(1):236–241. [DOI] [PubMed] [Google Scholar]
  • 23.Sickles EA, Wolverton DE, Dee KE. Performance parameters for screening and diagnostic mammography: specialist and general radiologists. Radiology 2002;224(3):861–869. [DOI] [PubMed] [Google Scholar]
  • 24.Elmore JG, Cook AJ, Bogart A, et al. Radiologists’ interpretive skills in screening vs. diagnostic mammography: are they related? Clin Imaging 2016;40(6):1096–1103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Jackson SL, Abraham L, Miglioretti DL, et al. Patient and radiologist characteristics associated with accuracy of two types of diagnostic mammograms. AJR Am J Roentgenol 2015;205(2):456–463. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Esserman L, Cowley H, Eberle C, et al. Improving the accuracy of mammography: volume and outcome relationships. J Natl Cancer Inst 2002;94(5):369–375. [DOI] [PubMed] [Google Scholar]
  • 27.Bluth EI, Cox J, Bansal S, Green D. The 2015 ACR Commission on Human Resources Workforce Survey. J Am Coll Radiol 2015;12(11):1137–1141. [DOI] [PubMed] [Google Scholar]
  • 28.Baxi SS, Liberman L, Lee C, Elkin EB. Breast imaging fellowships in the United States: who, what, and where? AJR Am J Roentgenol 2009;192(2):403–407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Dutton SC, Sze GK, Lund PL, Bluth EI. Radiology practice environment: options, variations, and differences-a report of the ACR Commission on Human Resources. J Am Coll Radiol 2014;11(4):352–358. [DOI] [PubMed] [Google Scholar]
  • 30.Birkmeyer JD, Reames BN, McCulloch P, Carr AJ, Campbell WB, Wennberg JE. Understanding of regional variation in the use of surgery. Lancet 2013;382(9898):1121–1129. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Say R, Murtagh M, Thomson R. Patients’ preference for involvement in medical decision making: a narrative review. Patient Educ Couns 2006;60(2):102–114. [DOI] [PubMed] [Google Scholar]
  • 32.Partyka L, Lourenco AP, Mainiero MB. Detection of mammographically occult architectural distortion on digital breast tomosynthesis screening: initial clinical experience. AJR Am J Roentgenol 2014;203(1):216–222. [DOI] [PubMed] [Google Scholar]
  • 33.Burnside ES, Ochsner JE, Fowler KJ, et al. Use of microcalcification descriptors in BI-RADS 4th edition to stratify risk of malignancy. Radiology 2007;242(2):388–395. [DOI] [PubMed] [Google Scholar]
  • 34.Geller BM, Barlow WE, Ballard-Barbash R, et al. Use of the American College of Radiology BI-RADS to report on the mammographic evaluation of women with signs and symptoms of breast disease. Radiology 2002;222(2):536–542. [DOI] [PubMed] [Google Scholar]

Supplementary Materials

Supplemental Tables