Abstract
Background:
Breast computed tomography (CT) is an emerging breast imaging modality, and ongoing developments aim to improve breast CT’s ability to detect microcalcifications. To understand the effects of different parameters on microcalcification detectability, a virtual clinical trial study was conducted using hybrid images and convolutional neural network (CNN)-based model observers. Mathematically generated microcalcifications were embedded into breast CT data sets acquired at our institution, and parameters related to calcification size, calcification contrast, cluster diameter, cluster density, and image display method (i.e., single slices, slice averaging, and maximum-intensity projections) were evaluated for their influence on microcalcification detectability.
Purpose:
To investigate the individual effects and the interplay of parameters affecting microcalcification detectability in breast CT.
Methods:
Spherical microcalcifications of varying diameters (0.04, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40 mm) and native intensities were computer simulated to portray the partial volume effects of the imaging system. Calcifications were mathematically embedded into 109 patient breast CT volume data sets as individual calcifications or as clusters of calcifications. Six numbers of calcifications (1, 3, 5, 7, 10, 15) distributed within six cluster diameters (1, 3, 5, 6, 8, 10 mm) were simulated to study the effect of cluster density. To study the role of image display method, 2D regions of interest (ROIs) and 3D volumes of interest (VOIs) were generated using single slice extraction, slice averaging, and maximum-intensity projection (MIP). 2D and 3D CNNs were trained on the ROIs and VOIs, and receiver operating characteristic (ROC) curve analysis was used to evaluate detection performance. The area under the ROC curve (AUC) was used as the primary performance metric.
Results:
Detection performance decreased with increasing section thickness, and peak detection performance occurred using the native section thickness (0.2 mm) and MIP display. The MIP display method, despite using a single slice, yielded comparable performance to the native section thickness, which employed 50 slices. Reduction in slices did not sacrifice detection accuracy and provided significant computational advantages over multi-slice image volumes. Larger cluster diameters resulted in reduced overall detectability, while smaller cluster diameters led to increased detectability. Additionally, we observed that the presence of more calcifications within a cluster improved the overall detectability, while fewer calcifications decreased it.
Conclusions:
As breast CT is still a relatively new breast imaging modality, there is an ongoing need to identify optimal imaging protocols. This work demonstrated the utility of MIP presentation for displaying image volumes containing microcalcification clusters. It is likely that human observers may also benefit from viewing MIPs compared to individual slices. The results of this investigation begin to elucidate how model observers interact with microcalcification clusters in a 3D volume, and will be useful for future studies investigating a broader set of parameters related to breast CT.
1. INTRODUCTION
Microcalcifications are key indicators of potential early-stage breast cancer which can manifest in various morphologies and distributions1-3. The American College of Radiology has released a BI-RADS report4 on morphological patterns of microcalcifications commonly seen in mammography along with their potential for malignancy. Large rod-like calcifications, for instance, are typically benign and do not necessitate further diagnostic imaging or biopsy. Clustered microcalcifications – dense groupings of small (< 1 mm) calcifications within a small region – are likely to be malignant5,6, and generally necessitate biopsy. During breast cancer screening exams, it is crucial to accurately detect and characterize microcalcifications.
Digital mammography has historically been used for breast cancer screening, and its capabilities for imaging with high resolution have been valuable for microcalcification detection7,8. Mammography’s sensitivity substantially decreases, however, when used to image patients with dense breasts due to the superposition of fibroglandular tissue which can obscure lesions. In recent years, breast computed tomography (CT) has emerged as a promising alternative imaging tool9,10, with the key advantage of accessing fibroglandular anatomy inside the breast without the superposition of neighboring tissue. Initial clinical studies9 suggested that breast CT is better than mammography at detecting mass lesions, but that mammography is better than breast CT at detecting microcalcifications. Since that work was published, our laboratory at UC Davis has developed a higher-resolution breast CT scanner which achieves nearly four times the spatial resolution of the earlier-generation scanner, based on their system modulation transfer functions11. This scanner’s ability to detect microcalcifications is under evaluation, and protocols for optimizing the scanner for microcalcification detection are being developed. Ideally, such optimization would involve extensive clinical trials with human observers to evaluate image quality, diagnostic performance, and the impact of various imaging parameters. However, the challenges of conducting large-scale clinical trials, such as time and cost limitations, necessitate the exploration of alternative approaches.
Simulation studies have been proposed as an alternative to clinical studies, where synthetic images are generated using phantom imaging12-14 or computer simulations15-19. These studies are also called “virtual clinical trials” 20-22. In this work, hybrid images are computer simulated, where mathematically generated microcalcifications are embedded into the reconstructed volumes of patient breast CT images acquired at our institution using the high-resolution breast CT scanner. This approach enables the investigation of many lesion-related parameters (e.g., size) while preserving crucial patient- (e.g., breast density) and imaging-related parameters (e.g., resolution). Hybrid images can also be generated by inserting synthetic lesions in the projection domain; this approach has been explored in both mammography23 and digital breast tomosynthesis24. While there are advantages to this approach, such as a more accurate simulation of certain imaging processes (e.g. beam hardening), it is challenging to implement this approach for breast CT due to the computational burden of reconstructing image volumes for every iteration of lesion insertion. Thus, we insert lesions into the reconstructed image domain while simulating key imaging processes such as partial volume and apodization kernel.
To detect the simulated microcalcifications, a convolutional neural network (CNN) is used in lieu of a human observer. CNN-based model observers are thought to approximate human visual perception25-27, providing an efficient and reproducible means of quantifying microcalcification detectability across several important parameters. CNNs have been shown to be useful and versatile across imaging contexts including breast CT27,28. Traditional model observers, such as the pre-whitened matched filter or channelized Hotelling observer, are not explored in this study, though they are commonly used for observer studies. The field has generally found that CNN-based model observers can outperform mathematical model observers, and this includes recent studies by Baek et al.26,28,29 using synthetic breast CT images with simulated anatomical background, and Lyu et al.30 using hybrid breast CT images with patient anatomical background. These studies concluded that CNNs capture more diagnostic information from breast CT images than mathematical filters, and therefore may be more useful for conducting optimal performance studies. As we are interested in optimizing breast CT parameters for microcalcification detection, we focus on CNN-based observers in this work. In this study, 2D and 3D CNN models are used to detect individual microcalcifications and microcalcification clusters embedded in 109 patient breast CT data sets, and parameters related to calcification size, calcification contrast, cluster diameter, cluster density, and image display method (i.e., single slices, slice averaging, and maximum-intensity projections) are studied.
2. METHODS
2.1. Breast CT system
Four generations of pendant-geometry, cone-beam breast CT scanners have been designed and fabricated in our laboratory over the past two decades. The fourth-generation prototype scanner follows a design similar to that of previously reported scanners9 but achieves nearly four times the spatial resolution of the early scanners based upon modulation transfer function (MTF) analysis11. This is due to the combination of a pulsed x-ray source, a smaller focal spot, and a higher-resolution flat-panel detector. The x-ray source contains a rotating tungsten anode x-ray tube with a nominal focal spot of 0.3 × 0.3 mm and a 14 kW x-ray generator (CMP200SE, CPI, Ontario, Canada). The flat-panel detector (Dexela 2923M, Varex, Salt Lake City, UT) features a 0.45 mm thick thallium-activated structured cesium iodide (CsI:Tl) scintillator coupled to CMOS detector elements. The gantry rotates in the horizontal plane during image acquisition, using a 60 kV x-ray beam with 0.20 mm copper filtration. A total of 500 projection images are acquired in ~11 s. Acquired projection images are reconstructed using the Feldkamp algorithm31 with a Shepp-Logan apodization kernel cutting off at 1.5 times Nyquist.
2.2. Patient images
An IRB-approved clinical trial was conducted at UC Davis Medical Center evaluating the fourth-generation breast CT system as a tool for breast cancer screening and for the diagnostic breast exam. Patients with suspicious lesions (BI-RADS 4 or 5) based on screening breast imaging (mammography or tomosynthesis) were eligible to participate in the study. The scanning protocol involved four sequential scans of the 1) unaffected breast prior to contrast-injection, 2) affected breast prior to contrast injection, 3) affected breast after contrast injection, and 4) unaffected breast after contrast injection. After scans were acquired, patients underwent breast biopsy on areas of suspicion based upon standard of care which provided the ground truth diagnosis. To date, 58 women have been scanned on the Doheny breast CT system resulting in 222 breast CT volume data sets. Contrast injection is advantageous for visualizing malignant solid tumors due to local perfusion of contrast agent through “leaky” angiogenetic vessels, and this contrast greatly improves lesion conspicuity32. Because this study focuses on microcalcification detectability, only pre-contrast volume data sets were used to avoid contrast enhancement in and around microcalcifications. In total, 109 pre-contrast volume data sets were selected for this study to be used as anatomical background for mathematical microcalcification insertion. Each volume data set contained 800-900 reconstructed slices (1024 × 1024 matrix size) with isotropic voxel dimensions of 0.2 mm.
2.3. Microcalcification simulation
Spherical microcalcification profiles were generated using methodology developed by Hernandez et al.13 First, a single high-resolution microcalcification profile was generated by inserting a sphere of intensity $I_0$ in the center of a 330 × 330 × 330 matrix with an isotropic voxel dimension of 0.01 mm. Then, the volume was blurred in the $xy$ plane by (i) converting each coronal slice to the frequency domain using the 2D Fourier transform, (ii) multiplying the slice by the 2D coronal plane MTF measured on the breast CT system, and (iii) converting the blurred slice to the spatial domain using the 2D inverse Fourier transform. The volume was then blurred in the $z$ dimension by (i) converting each vector in the $z$-direction to the frequency domain using the 1D Fourier transform, (ii) multiplying the vector by the 1D $z$-direction MTF measured on the breast CT system, and (iii) converting the blurred vector to the spatial domain using the 1D inverse Fourier transform. The MTFs were not assumed to be equal in the $xy$ and $z$ directions due to the role of the Shepp-Logan reconstruction kernel in the $x$ and $y$ dimensions but not the $z$ dimension. The high-resolution profile was then downsampled to match the voxel dimensions of the breast CT system (0.2 mm), resulting in a 17 × 17 × 17 matrix, $P$. The microcalcification profile generation process is illustrated in Figure 1 for a 0.2 mm diameter calcification.
Figure 1:
Simulation of a 0.2 mm microcalcification profile. As a (a) mathematically generated high-resolution microcalcification undergoes (b) XYZ MTF blurring and (c) down sampling, the microcalcification’s edges are blurred and signal is attenuated.
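The three steps in Figure 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the measured system MTFs are not given in the text, so a hypothetical Gaussian MTF (`gaussian_mtf`, with assumed half-value frequencies `f50_xy` and `f50_z`) stands in for them, and block averaging with zero padding is assumed for the downsampling step.

```python
import numpy as np

def gaussian_mtf(freq, f50):
    """Hypothetical Gaussian MTF that falls to 0.5 at frequency f50 (cycles/mm)."""
    return np.exp(-np.log(2.0) * (freq / f50) ** 2)

def simulate_profile(diameter_mm, fine_voxel_mm=0.01, n_fine=330,
                     coarse_voxel_mm=0.2, f50_xy=1.0, f50_z=1.5):
    # (a) High-resolution sphere of unit intensity centred in the matrix.
    ax = (np.arange(n_fine) - (n_fine - 1) / 2.0) * fine_voxel_mm
    r2 = ax[:, None, None] ** 2 + ax[None, :, None] ** 2 + ax[None, None, :] ** 2
    vol = (r2 <= (diameter_mm / 2.0) ** 2).astype(float)

    # (b) In-plane blur: 2D FFT of each slice, multiply by the 2D MTF, inverse FFT.
    f = np.fft.fftfreq(n_fine, d=fine_voxel_mm)
    mtf_xy = gaussian_mtf(np.hypot(f[:, None], f[None, :]), f50_xy)
    vol = np.fft.ifft2(np.fft.fft2(vol, axes=(0, 1)) * mtf_xy[:, :, None],
                       axes=(0, 1)).real

    # Through-plane blur: 1D FFT along z, multiply by the 1D MTF, inverse FFT.
    mtf_z = gaussian_mtf(np.abs(f), f50_z)
    vol = np.fft.ifft(np.fft.fft(vol, axis=2) * mtf_z[None, None, :], axis=2).real

    # (c) Downsample by block averaging; zero-pad to a multiple of the factor.
    k = int(round(coarse_voxel_mm / fine_voxel_mm))
    n_coarse = -(-n_fine // k)            # ceiling division, e.g. 330 / 20 -> 17
    pad = n_coarse * k - n_fine
    lo = pad // 2
    vol = np.pad(vol, [(lo, pad - lo)] * 3)
    return vol.reshape(n_coarse, k, n_coarse, k, n_coarse, k).mean(axis=(1, 3, 5))
```

With the default arguments the fine grid matches the 330 × 330 × 330 matrix described above, which needs several hundred megabytes of memory; a coarser grid such as `simulate_profile(0.2, fine_voxel_mm=0.05, n_fine=66)` gives the same 17 × 17 × 17 output shape for a quick check.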
In addition to edge blurring, the blurring procedure resulted in grayscale reduction due to partial volume effects. Let the intensity of a simulated high-resolution calcification, $I_0$, be 1000 HU, and the intensity of the background volume, $I_{bg}$, be −200 HU, as shown in Figure 1. Let $I_{peak}$ be the peak intensity in HU of the calcification after the blurring procedure described above. $I_{peak}$ was computed for a range of computer-generated calcification diameters between 0 and 1 mm. A plot of peak intensity as a function of calcification diameter in mm, $d$, is shown in Figure 2. Based on Figure 2, it was found that intensity reduction resulting from the blurring procedure can be modeled using a logistic function for any input intensity in HU, $I_0$, calcification diameter in mm, $d$, and background intensity in HU, $I_{bg}$, using Equation 1:
$$I_{peak}(I_0, d, I_{bg}) = I_{bg} + \frac{I_0 - I_{bg}}{1 + e^{-(d - 0.52)/0.15}} \tag{1}$$
Figure 2:
Peak intensity in HU of calcifications after blurring as a function of calcification diameter, $d$, showing the partial-volume effects in the system. The grayscale values of the initial high-resolution calcification and background were 1000 HU and −200 HU, respectively.
Equation 1 is derived from fitting the data points in Figure 2 with a logistic function ($R^2$ = 0.99). The parameters in Equation 1 (i.e., 0.52 and 0.15) are unique to our imaging system but can be derived for any imaging system, and illustrate the key principle of partial volume that occurs in any imaging system: objects smaller than the spatial resolution of the imaging system experience blurring and signal attenuation when imaged. In Figure 2, as the calcification diameter approaches zero, the peak intensity of the blurred calcification approaches the intensity of the background, $I_{bg}$, and as calcification diameter exceeds 1 mm, the peak intensity of the blurred calcification approaches the native intensity, $I_0$. In this study, the background intensity, $I_{bg}$, was computed for each simulated microcalcification as the median intensity of a 7 × 7 × 7 subvolume of anatomical background surrounding the location of microcalcification insertion.
The native intensity, $I_0$, was estimated empirically by applying Equation 1 to select calcifications found in our clinical patient data sets. Two large calcifications in two separate patients were identified using the patients’ screening exam radiology reports and verified using the corresponding pathology reports. Only large, pathology-confirmed calcifications were used for determining $I_0$ in order to maximize accuracy in delineating calcification edges. Custom-built breast CT viewing software was used to measure the diameter and peak intensity for each calcification. The median intensity of the tissue surrounding the calcifications was also measured. These variables were applied to Equation 1 and resulted in native intensity values, $I_0$, of 361 HU and 768 HU. The range of native intensity values corresponds to varying densities and compositions of individual calcifications in our clinical data sets1,33.
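The forward use of the logistic model (native intensity to peak intensity) and its inverse use (measured peak intensity back to native intensity) can be illustrated as follows. The sketch assumes that the two fitted parameters quoted in the text act as the logistic midpoint (0.52 mm) and scale (0.15 mm); that assignment, and the function names, are assumptions made for illustration rather than a confirmed reading of the fitted equation.

```python
import math

D_MID_MM = 0.52    # assumed role: diameter at which half the contrast is retained
D_SCALE_MM = 0.15  # assumed role: width of the logistic transition

def contrast_fraction(d_mm):
    """Fraction of the native contrast that survives blurring (logistic in diameter)."""
    return 1.0 / (1.0 + math.exp(-(d_mm - D_MID_MM) / D_SCALE_MM))

def peak_intensity(i_native, d_mm, i_bg):
    """Forward model: peak HU after blurring, given native HU, diameter, background HU."""
    return i_bg + (i_native - i_bg) * contrast_fraction(d_mm)

def native_intensity(i_peak, d_mm, i_bg):
    """Inverse model: native HU from a measured peak HU, diameter, and background HU."""
    return i_bg + (i_peak - i_bg) / contrast_fraction(d_mm)
```

The inverse function mirrors the empirical procedure described above, in which the measured diameter, peak intensity, and surrounding tissue intensity of a pathology-confirmed calcification are used to solve for its native intensity.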
2.4. Hybrid image generation
For every breast CT volume, lesion-present and lesion-absent region of interest (ROI) patches were generated at random locations within the breast. For each ROI, a lesion center was computer generated using a random number generator and kept if the surrounding volume was fully contained within the patient’s breast and did not contain skin. Otherwise, a new lesion center was computer generated. For viable lesion centers, the surrounding volume served as the anatomical background for mathematical microcalcification insertion. This method of lesion center selection was consistent for lesion-present and lesion-absent patch generation. Random lesion center selection allowed the data sets to contain a variety of anatomical backgrounds, which supported model generalizability. The lesion centers were saved and reused for every iteration of ROI generation in this study to reduce inter-dataset variability, so that differences in model observer performance could be attributed to the parameter of interest rather than to differences in the local anatomical background. The training ROIs differed from the testing ROIs because separate patient data sets were held out for testing, as explained in the following section.
2.4.1. 2D image generation
First, 2D hybrid image ROIs were generated such that each ROI contained a single microcalcification of known diameter at the center of the ROI. This setting can be described as signal-known-exactly (SKE), location-known-exactly (LKE). SKE, LKE tasks can be used to establish baselines for evaluating an observer’s detection performance prior to experimenting with unknown or uncertain signals and locations.34
Let $B$ be a 120 × 120 × 120 anatomical background volume, $C$ be a volume of the same size with a blurred microcalcification profile in the center and zeros elsewhere, and $H$ be the simulated volume with the inserted calcification. $C$ is scaled such that when it is added to $B$, the peak intensity of $H$ equals $I_{peak}$ from Equation 1. This process simulates the partial volume-related blurring of a calcification with surrounding voxels. Let $(i, j, k)$ be the indices of the peak intensity voxel in $C$. The simulated volume is then defined as:
$$H = B + \frac{I_{peak} - B(i, j, k)}{C(i, j, k)}\, C \tag{2}$$
The simulated volume, $H$, was generated at each lesion center and displayed in 2D by either extracting the center slice of $H$ in the axial view plane or by slice averaging across adjacent slices to model thicker sections. Similarly, lesion-absent ROIs were generated and displayed in 2D by either extracting the center slice of the background volume, $B$, in the axial view plane or by slice averaging across adjacent slices. The resulting 2D ROIs had dimensions of 120 × 120 × 1. Five microcalcification diameters (0.04, 0.10, 0.15, 0.20, 0.40 mm), three input intensities (361, 565, 768 HU), and five section thicknesses (0.2, 0.6, 1.0, 3.0, 11.0 mm) were studied. The equivalent number of slices for each section thickness was 1, 3, 5, 15, and 55 slices, respectively.
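The insertion and 2D display steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the blurred profile is placed so that its peak sits at the central voxel of the background, the third array axis is taken as the slice direction, and the function names `insert_calcification` and `display_section` are hypothetical.

```python
import numpy as np

def insert_calcification(background, profile, i_peak):
    """Add a blurred profile at the centre of `background`, scaled so that the
    peak voxel of the resulting hybrid volume equals `i_peak` (in HU)."""
    calc = np.zeros_like(background, dtype=float)
    start = [b // 2 - p // 2 for b, p in zip(background.shape, profile.shape)]
    region = tuple(slice(s, s + p) for s, p in zip(start, profile.shape))
    calc[region] = profile
    peak = np.unravel_index(np.argmax(calc), calc.shape)
    scale = (i_peak - background[peak]) / calc[peak]
    return background + scale * calc

def display_section(volume, n_slices=1):
    """Average `n_slices` (odd) adjacent slices centred on the middle slice.
    n_slices=1 simply extracts the centre slice."""
    mid = volume.shape[2] // 2
    half = n_slices // 2
    return volume[:, :, mid - half: mid + half + 1].mean(axis=2)
```

For example, `display_section(hybrid, 5)` would model a 1.0 mm section from 0.2 mm slices, and the same call on the unmodified background volume would give the matching lesion-absent ROI.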
Out of the 109 total breast CT volume data sets, 99 data sets (N = 99, ~90%) were used to generate training ROIs for training the model observer. For each volume data set, 200 lesion-present ROIs and 200 lesion-absent ROIs were generated from different lesion centers resulting in 39,600 total training ROIs. 10% of the training ROIs (3960 ROIs) were reserved for validation during the training process. The remaining 10 breast CT volume data sets (N = 10, ~10%) were used to generate a testing data set. Again, 200 lesion-present ROIs and 200 lesion-absent ROIs were generated from each volume data set resulting in 4000 total testing ROIs. Sample lesion-present ROIs of simulated SKE LKE microcalcifications are shown in Figures 3 & 4.
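The data set sizes quoted above follow directly from a volume-level hold-out, which the short sketch below reproduces. How the 10 testing volumes were chosen is not stated in the text, so the seeded random shuffle here is an assumption, as is the helper name `split_volumes`.

```python
import random

def split_volumes(volume_ids, n_test=10, seed=0):
    """Hold out whole breast CT volumes for testing (assumed: random selection)."""
    ids = list(volume_ids)
    random.Random(seed).shuffle(ids)
    return ids[n_test:], ids[:n_test]

train_ids, test_ids = split_volumes(range(109))

rois_per_volume = 200 + 200                      # lesion-present + lesion-absent
n_train_rois = len(train_ids) * rois_per_volume  # 99 volumes
n_val_rois = n_train_rois // 10                  # 10% reserved for validation
n_test_rois = len(test_ids) * rois_per_volume    # 10 volumes
```

Splitting at the volume level, rather than at the ROI level, keeps every ROI from a given breast on one side of the split, so the testing ROIs never share anatomical background with the training ROIs.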
Figure 3:
Example ROIs of a single simulated microcalcification centered in the field of view. Microcalcification diameter is varied across columns. Native intensity of microcalcifications prior to the blurring procedure ($I_0$) is varied across rows. Partial volume effects reduce microcalcification conspicuity as microcalcification diameter becomes smaller than the resolution of the imaging system (0.2 mm). Field of view within each ROI: 12 mm × 12 mm.
Figure 4:
Example ROIs of a simulated 0.4 mm microcalcification with 565 HU native intensity displayed using different section thicknesses. Field of view within each ROI: 12 mm × 12 mm.
2.4.2. 3D image generation
3D hybrid image volumes were generated with simulated microcalcification clusters. A single cluster was inserted at the center of every volume of interest (VOI), and microcalcifications of uniform diameters were randomly placed within the cluster. This task had similarities to location-known-exactly tasks because the cluster was centered at a known location, but similarities as well to location-known-statistically tasks because individual calcifications were randomly placed within the cluster across different lesions.
Let $B$ be a 50 × 50 × 50 anatomical background volume at a random lesion center in the breast. For the $m$-th calcification, let $C_m$ be a volume of zeros of the same size. A spherical boundary is defined within $C_m$ based on the cluster diameter, $D$, representing the boundary of the cluster. A random number generator was used to define the location, $\mathbf{r}_m$, within the cluster where a blurred microcalcification profile, $P$, was inserted. $P$ is inserted into $C_m$ at $\mathbf{r}_m$. $C_m$ is then scaled such that when it is added to $B$, the peak intensity of the simulated volume, $H$, at that location equals $I_{peak,m}$ from Equation 1. Microcalcifications are repeatedly inserted at random locations within the cluster until the desired number of microcalcifications, $N_c$, are inserted. Because the background intensity, $I_{bg}$, is computed at every location within the cluster, microcalcifications within the same cluster varied in intensity, as is common in breast CT images. Let $(i_m, j_m, k_m)$ be the indices of the peak intensity voxel in $C_m$. The simulated VOI with inserted clusters is defined as:
$$H = B + \sum_{m=1}^{N_c} \frac{I_{peak,m} - B(i_m, j_m, k_m)}{C_m(i_m, j_m, k_m)}\, C_m \tag{3}$$
Simulated volumes with inserted clusters, $H$ (50 × 50 × 50), were used as lesion-present VOIs, and background volumes, $B$ (50 × 50 × 50), generated from separate lesion centers were used as lesion-absent VOIs. Thicker sections were modeled by averaging every $n$ slices, resulting in 50 × 50 × $N$ VOIs, where $N$ is defined as:
$$N = \left\lfloor \frac{50}{n} \right\rfloor \tag{4}$$
The maximum intensity projection (MIP) was also generated from each VOI in the axial view plane, resulting in 50 × 50 × 1 ROIs. Five microcalcification diameters (0.20, 0.25, 0.30, 0.35, 0.40 mm), six numbers of calcifications per cluster (1, 3, 5, 7, 10, 15), and six cluster diameters (1, 3, 5, 6, 8, 10 mm) were studied. Six section thicknesses (0.2, 0.6, 1.0, 2.2, 3.0, 9.8 mm) in addition to the MIP were studied to understand the role of image display method on microcalcification detectability. The equivalent number of slices for each section thickness was 1, 3, 5, 11, 15, and 49 slices, respectively. Native intensity was fixed at 565 HU.
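The cluster placement and the two display methods compared in this study (slice averaging and MIP) can be sketched as below. This is a hedged illustration, not the authors' code: rejection sampling inside a sphere is one plausible way to draw random calcification locations, trailing slices that do not fill a complete section are assumed to be dropped, and all function names are hypothetical.

```python
import numpy as np

def random_points_in_sphere(n_points, cluster_diameter_mm, shape,
                            voxel_mm=0.2, rng=None):
    """Draw voxel locations uniformly inside a sphere centred in the VOI."""
    rng = np.random.default_rng() if rng is None else rng
    centre = np.array(shape) // 2
    radius_vox = cluster_diameter_mm / (2.0 * voxel_mm)
    points = []
    while len(points) < n_points:
        offset = rng.uniform(-radius_vox, radius_vox, size=3)
        if np.dot(offset, offset) <= radius_vox ** 2:   # keep points inside the sphere
            idx = np.clip(np.rint(centre + offset), 0, np.array(shape) - 1)
            points.append(tuple(int(v) for v in idx))
    return points

def average_sections(voi, n):
    """Average every n adjacent slices along the last axis (incomplete section dropped)."""
    n_sections = voi.shape[2] // n
    trimmed = voi[:, :, : n_sections * n]
    return trimmed.reshape(voi.shape[0], voi.shape[1], n_sections, n).mean(axis=3)

def max_intensity_projection(voi):
    """Collapse the VOI to one image by keeping the brightest voxel along each ray."""
    return voi.max(axis=2)
```

Averaging dilutes a small bright calcification with the background voxels in the same section, whereas the MIP keeps the brightest voxel along each ray, which is one way to see why the two displays can behave so differently for the same single-image output size.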
Out of the 109 total breast CT volume data sets, 99 data sets (N = 99, ~90%) were used to generate the training data set for training the model observer. For each volume data set, 150 lesion-present VOIs and 150 lesion-absent VOIs were generated from different lesion centers resulting in 29,700 total training VOIs. 10% of the training VOIs (2970 VOIs) were reserved for validation during the training process. The remaining 10 breast CT volume data sets (N = 10, ~10%) were used to generate the testing data set. Again, 150 lesion-present VOIs and 150 lesion-absent VOIs were generated from each volume data set resulting in 3000 total testing VOIs. Sample VOIs of simulated microcalcification clusters displayed using MIP are shown in Figure 5.
Figure 5:
Example images of simulated microcalcification clusters displayed using maximum intensity projection (MIP). Calcification diameter, cluster diameter, and number of calcifications are varied. Field of view within each ROI: 24 mm × 24 mm.
2.5. Detectability estimation using CNN
2.5.1. 2D CNN
A 2D convolutional neural network (CNN) was used to detect simulated microcalcifications in the SKE, LKE image data sets. The input to the CNN was a 2D ROI (120 × 120 × 1), and the output was a decision variable between 0 and 1 scaled by the sigmoid function. The decision variables were used for ROC curve analysis to estimate overall detectability of a data set.
The CNN architecture is shown in Figure 6. The CNN consisted of three layers: two convolutional layers and one fully-connected layer. The three-layered architecture was found to efficiently train the model without leading to overfitting or constraints on memory. The convolutional layers used 3 × 3 convolutional filters with strides of 1. Batch normalization was implemented after the first convolutional layer. Max pooling was then implemented after both convolutional layers with a pool size of 2 × 2 and a stride of 1. Dropout was implemented after both max pooling layers with a rate of 0.2, and after the fully-connected layer with a rate of 0.5. The rectified linear unit (ReLU) activation function was applied in all three layers. The binary-cross entropy (BCE) loss function was used:
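$$\mathrm{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right] \tag{5}$$
where $y_i$ is the ground truth label (0 or 1), $\hat{y}_i$ is the predicted value, and $N$ is the number of samples. Accuracy was one metric used to monitor the model’s performance:
$$\mathrm{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}} \tag{6}$$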
while the area under the ROC curve (AUC) was used to evaluate overall detection performance, as described in Section 2.6. The CNN model was implemented in Python using the Keras library35. The Adam optimizer36 was used with a learning rate of 1e-5 and a batch size of 64. Training ran between 50 – 150 epochs. An NVIDIA GeForce GTX 1080 GPU was used.
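The loss and monitoring metric above can be written out in plain NumPy for reference. This is a sketch for illustration, not the Keras internals: the clipping constant `eps` and the 0.5 decision threshold used to turn a sigmoid output into a class label are assumptions.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels (0 or 1) and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    labels = np.asarray(y_true).astype(bool)
    decisions = np.asarray(y_pred) >= threshold
    return float(np.mean(decisions == labels))
```

Accuracy depends on the chosen threshold, while the AUC summarizes performance over all thresholds, which is consistent with using accuracy only to monitor training and the AUC as the detection metric.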
Figure 6:
2D CNN architecture: two convolutional layers followed by a fully-connected layer.
2.5.2. 3D CNN
A 3D CNN was used to detect simulated microcalcification clusters in the hybrid 3D image VOIs. The input to the CNN was a 50 × 50 × N VOI, where N varied with section thickness, and the output was a scalar-valued decision variable between 0 and 1 scaled by the sigmoid function. The 3D CNN model was designed to mimic the sequential 2D analysis performed by human observers when examining 3D breast CT volumes. Human observers review 3D image volumes slice-by-slice without immediate access to true 3D depth information. Findings from each slice are synthesized into one classification decision for each image volume. To mimic this process, a 3 × 3 × 1 convolutional kernel was used instead of a 3 × 3 × 3 convolutional kernel, which is commonly employed for 3D image data. The 3 × 3 × 1 kernel capitalizes on local spatial information within each slice while disregarding inter-slice information. The CNN synthesizes information from each slice into one decision variable in the fully-connected layer. This kernel choice also suited the range of slices comprising different data sets owing to section thickness: slice averaging across the entire VOI resulted in a single 50 × 50 × 1 slice which required 2D convolutional kernels, while VOIs displayed in the native section thickness contained 50 slices.
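The claim that a 3 × 3 × 1 kernel uses in-plane information while ignoring neighboring slices can be checked numerically. The sketch below is not the trained network: it applies a single random 3 × 3 × 1 kernel to a small random volume using a hypothetical helper, `conv3d_valid`, and compares the result with filtering each slice separately in 2D.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv3d_valid(volume, kernel):
    """'Valid' 3D cross-correlation (the operation a CNN layer computes), no padding."""
    windows = sliding_window_view(volume, kernel.shape)
    return np.einsum("xyzijk,ijk->xyz", windows, kernel)

rng = np.random.default_rng(0)
voi = rng.normal(size=(50, 50, 5))      # small stand-in for a 50 x 50 x N VOI
kernel = rng.normal(size=(3, 3, 1))     # in-plane 3 x 3 support, one slice deep

out_3d = conv3d_valid(voi, kernel)

# Apply the same kernel to every slice on its own and stack the results.
per_slice = np.stack(
    [conv3d_valid(voi[:, :, k:k + 1], kernel)[:, :, 0] for k in range(voi.shape[2])],
    axis=2,
)

same = np.allclose(out_3d, per_slice)   # True: slices never mix
```

Because the kernel is one voxel deep, the output for each slice depends only on that slice, so the same layer definition works whether the input holds 50 slices or a single averaged or MIP image.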
The CNN consisted of four layers: three convolutional layers and one fully-connected layer. While the 2D and 3D CNN architectures were similar, an extra convolutional layer was added to the 3D CNN due to the difficulty of detecting relatively small targets (1-15 calcifications) compared to the background space (up to $50^3$ voxels). Moreover, the microcalcifications in the 3D generated data sets were signal-known-statistically (SKS), whereas the microcalcifications in the 2D generated data sets were SKE, LKE, increasing the complexity of the detection task. All three convolutional layers used 3 × 3 × 1 convolutional kernels with strides of 1. Batch normalization was implemented after the first convolutional layer. Max pooling was implemented after all three convolutional layers with a pool size of 2 × 2 × 1 and a stride of 1. The pooling layers further downscaled the spatial dimensions of the feature maps, reducing computational complexity and providing translational invariance. Dropout was implemented after the three max pooling layers with a rate of 0.2, and after the fully-connected layer with a rate of 0.5 to enhance generalization by randomly deactivating neurons during training. The rectified linear unit (ReLU) activation function was applied in all four layers.
The 3D CNN model was trained using the Adam optimizer with a learning rate of 1e-5, a batch size of 64, and the BCE loss function. Training occurred over 100-300 epochs. The model's performance was evaluated using accuracy and AUC metrics. The CNN was implemented in Python using an NVIDIA GeForce GTX 1080 GPU. The 3D CNN model with the 3 × 3 × 1 convolutional kernel is shown in Figure 7.
Figure 7:
3D CNN architecture: three convolutional layers followed by a fully-connected layer.
2.6. Performance evaluation and statistical analysis
CNN-generated decision variables were used with receiver operating characteristic (ROC) curve analysis to estimate overall detection performance. For all decision variables related to one testing breast CT volume, an empirical ROC curve was constructed by plotting the true positive rate against the false positive rate at various threshold values. The area under the ROC curve (AUC) was computed, signifying detection performance in that breast CT volume. AUCs were computed on a breast-volume basis to elucidate variability related to breast density, anatomical and quantum noise, and motion artifacts. To estimate overall detection performance across all the testing breast CT volumes, the mean and standard deviation of testing-volume AUCs were also computed. In Section 3, the mean AUC, $\overline{AUC}$, was plotted with 95% confidence error bars to represent uncertainty in the detectability estimations. 95% confidence error bars were computed using Equation 7:
$$CI_{95\%} = \overline{AUC} \pm 1.96\, \frac{\sigma_{AUC}}{\sqrt{N}} \tag{7}$$
where $CI_{95\%}$ is the 95% confidence interval, $\overline{AUC}$ is the mean AUC across testing breast CT volumes, $\sigma_{AUC}$ is the standard deviation of AUCs across testing breast CT volumes, and $N$ is the number of testing volumes ($N = 10$). $\overline{AUC}$ was the primary performance metric in this study.
For all comparative tests, the Mann-Whitney U-test was employed. All tests were two-sided, and statistical significance was defined as a difference with a $p$-value less than 0.05. In cases where multiple comparisons were conducted, Bonferroni correction was applied by dividing the desired significance level by the number of comparisons. Statistical analysis was performed using MATLAB (The MathWorks Inc., Natick, MA).
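The per-volume evaluation described in this section can be sketched in Python as follows. The analysis in the study was performed in MATLAB, so this is only an illustrative equivalent under stated assumptions: the empirical AUC is computed through its equivalence with the normalized Mann-Whitney U statistic, the confidence interval assumes a normal-approximation multiplier of 1.96 and a sample standard deviation, and the function names are hypothetical.

```python
import numpy as np

def empirical_auc(scores_present, scores_absent):
    """Area under the empirical ROC curve: the probability that a lesion-present
    score exceeds a lesion-absent score, with ties counted as one half."""
    p = np.asarray(scores_present, dtype=float)[:, None]
    a = np.asarray(scores_absent, dtype=float)[None, :]
    return float(np.mean((p > a) + 0.5 * (p == a)))

def mean_auc_with_ci(per_volume_aucs, z=1.96):
    """Mean AUC across testing volumes and the half-width of its confidence interval."""
    aucs = np.asarray(per_volume_aucs, dtype=float)
    half_width = z * aucs.std(ddof=1) / np.sqrt(aucs.size)   # assumed: sample std
    return float(aucs.mean()), float(half_width)

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons
```

In use, `empirical_auc` would be called once per testing volume on that volume's decision variables, and the resulting ten AUCs passed to `mean_auc_with_ci` to obtain one plotted point with its error bar.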
3. RESULTS
3.1. SKE, LKE detection task
3.1.1. Effect of microcalcification size and native intensity
SKE LKE detection performance is plotted as a function of microcalcification diameter for three native intensities in Figure 8. When the signal and location are known, microcalcifications that are notably smaller than the resolution of the imaging system (0.20 mm) can be detected despite partial volume effects. Detection performance increases monotonically with calcification diameter and native intensity, and maximum detection performance is achieved when calcification diameter equals or exceeds 0.40 mm. As expected, larger and “brighter” calcifications are more conspicuous than smaller and “dimmer” calcifications. For the remainder of the study, a native intensity of 565 HU is used.
Figure 8:
Effect of size and native intensity on SKE, LKE microcalcification detectability. A single microcalcification was placed at the center of the ROI for this SKE LKE task. AUC is plotted as a function of calcification diameter for three native intensities (361, 565, 768 HU). Error bars correspond to 95% confidence intervals for each estimate.
3.1.2. Effect of section thickness
SKE LKE detection performance is plotted as a function of section thickness for five calcification diameters in Figure 9. For the largest calcification (0.40 mm), detection performance is generally unaffected as section thickness increases until the thickness reaches 1.0 mm (5 slices), where performance begins to decrease. For calcifications smaller than 0.40 mm, thicker sections reduce detection performance. The native section thickness (0.20 mm) of breast CT enables peak detection performance across all calcification diameters, and detection performance degrades precipitously as section thickness increases.
Figure 9:
Effect of section thickness on SKE LKE microcalcification detectability for five calcification diameters and five section thicknesses (0.2, 0.6, 1.0, 3.0, 11.0 mm). A single microcalcification was placed at the center of the ROI for this SKE LKE task. Error bars correspond to 95% confidence intervals for each estimate.
3.2. Cluster detection task
3.2.1. Effect of image display method
Figure 10 shows the effect of image display method on CNN detection performance for a 5-mm diameter cluster containing 3 microcalcifications. Mean AUC is plotted as a function of section thickness, and detection performance using maximum intensity projection (MIP) is also plotted for five calcification diameters. Data points for the 0.20- and 0.25-mm calcifications across section thicknesses are omitted from the plot so that the remaining curves are easier to distinguish.
Figure 10:
Effect of image display method. Detection performance using slice averaging (i.e., section thickness) for a 5-mm cluster of 3 calcifications is plotted for three calcification diameters (0.30, 0.35, 0.40 mm) and denoted using solid circles. Detection performance using maximum intensity projection (MIP) is plotted for five calcification diameters (0.20, 0.25, 0.30, 0.35, 0.40 mm) and denoted using solid triangles. Error bars correspond to 95% confidence intervals for each estimate.
Detection performance decreases with increasing section thickness, and peak detection performance occurs using the native section thickness (0.2 mm) and the MIP. When slices are averaged across the entire volume (50 slices or 9.8 mm thickness), microcalcifications of all three sizes are indistinguishable from anatomical background (AUC ~ 0.5). For the three microcalcification sizes, there was no statistically significant difference in detectability between using the MIP and using the native section thickness (, , for 0.30, 0.35, and 0.40-mm microcalcifications, respectively). The MIP images contained a single slice, whereas the native section thickness images contained 50 slices; as a result, computing time for the MIP images was nearly 50 times shorter than for the native section thickness images. For the remainder of the study, MIP was used to display the simulated images.
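The difference between slice averaging and MIP can be made concrete with a small NumPy sketch. It is illustrative only: the function names, the choice of axis 0 as the slice direction, and the toy volume (zero background with one bright voxel of arbitrary intensity) are assumptions, not the study's processing pipeline.

```python
import numpy as np


def average_slices(voi, n_slices):
    """Average n_slices contiguous slices around the central slice
    (axis 0 is assumed to be the slice direction) into one 2D section."""
    depth = voi.shape[0]
    if not 1 <= n_slices <= depth:
        raise ValueError("n_slices must be between 1 and the number of slices")
    start = depth // 2 - n_slices // 2
    return voi[start:start + n_slices].mean(axis=0)


def max_intensity_projection(voi):
    """Collapse the volume to 2D by keeping the maximum along axis 0."""
    return voi.max(axis=0)


if __name__ == "__main__":
    # Toy 50 x 50 x 50 volume: zero background, one bright voxel standing in
    # for a small calcification (intensity value is arbitrary).
    voi = np.zeros((50, 50, 50))
    voi[25, 25, 25] = 200.0

    for n in (1, 5, 50):
        section = average_slices(voi, n)
        print(n, "slices averaged, peak value:", section[25, 25])
    print("MIP peak value:", max_intensity_projection(voi)[25, 25])
```

In this toy case the peak of the averaged section falls in proportion to 1/n as more slices are combined, while the MIP keeps the full peak value in a single 2D image. Real volumes also contain anatomical background and noise, which the MIP retains as local maxima, so the sketch shows only the signal-dilution part of the effect.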
3.2.2. Effect of cluster diameter
Figure 11 shows the effect of cluster diameter on CNN detection performance for clusters containing 3 microcalcifications displayed using MIP. Mean AUC is plotted as a function of cluster diameter for four calcification diameters.
Figure 11:
Effect of cluster diameter on detection performance for clusters of 3 microcalcifications displayed using MIP. Error bars correspond to 95% confidence intervals for each estimate.
The data points for the 0.40-mm microcalcifications frame the diagnostic performance of breast CT when the detection target is a cluster of 3 calcifications. For the 0.40-mm microcalcifications, near-optimal detection performance is achieved (AUC ~ 1.0), and cluster diameter has minimal effect on cluster detectability. For microcalcifications at the threshold of detectability (i.e., 0.20-, 0.25-, and 0.30-mm microcalcifications), cluster diameter becomes important: overall detectability decreases as cluster diameter increases.
3.2.3. Effect of number of calcifications
Figure 12 displays the effect of the number of calcifications on CNN detection performance for clusters measuring 5 mm in diameter displayed using MIP. Mean AUC is plotted as a function of number of calcifications for four calcification diameters.
Figure 12:
Effect of number of calcifications on detection performance for clusters measuring 5 mm in diameter displayed using MIP. Error bars correspond to 95% confidence intervals for each estimate.
As in Figure 11, Figure 12 shows that detection of clusters of relatively large (0.40 mm) microcalcifications is minimally affected by the number of microcalcifications. For microcalcifications at the threshold of detectability (i.e., 0.20-, 0.25-, and 0.30-mm microcalcifications), overall cluster detectability improves as the number of microcalcifications in the cluster increases and declines as it decreases.
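Cluster diameter and number of calcifications together determine how densely a cluster is packed. The sketch below shows one way such a cluster could be generated and its density computed. It is a hypothetical illustration: uniform placement of calcification centers inside a spherical cluster, and the names `sample_cluster_centers` and `cluster_number_density`, are assumptions rather than the placement rule used in the study.

```python
import numpy as np


def sample_cluster_centers(n_calcs, cluster_diameter_mm, seed=None):
    """Draw n_calcs center positions (in mm, relative to the cluster center)
    uniformly inside a sphere of the given diameter (assumed placement rule)."""
    rng = np.random.default_rng(seed)
    radius = cluster_diameter_mm / 2.0
    # Uniform sampling in a ball: random unit direction times radius * u^(1/3).
    directions = rng.normal(size=(n_calcs, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.random(n_calcs) ** (1.0 / 3.0)
    return directions * radii[:, None]


def cluster_number_density(n_calcs, cluster_diameter_mm):
    """Calcifications per cubic millimeter of a spherical cluster."""
    volume_mm3 = (np.pi / 6.0) * cluster_diameter_mm ** 3
    return n_calcs / volume_mm3


if __name__ == "__main__":
    centers_mm = sample_cluster_centers(3, 5.0, seed=0)
    voxel_mm = 0.2  # voxel size quoted in the Discussion
    print(np.round(centers_mm / voxel_mm).astype(int))  # offsets in voxels
    for diameter in (1, 3, 5, 6, 8, 10):
        print(diameter, "mm:", cluster_number_density(3, diameter), "per mm^3")
```

Because the cluster volume grows with the cube of its diameter, doubling the diameter at a fixed number of calcifications reduces the number density eightfold, which is consistent with the trends in Figures 11 and 12 for calcifications near the threshold of detectability.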
4. DISCUSSION
Our results demonstrated the utility of the maximum intensity projection (MIP) for displaying image volumes containing microcalcification clusters. We found no statistically significant difference in detection performance between the MIP and the native section thickness, whereas thicker sections led to reduced detection performance. This result suggests that the CNN relies primarily on maximum intensities within breast CT image volumes to determine microcalcification presence. The MIP display method, despite using a single slice, yielded performance comparable to that of the native section thickness, which employed 50 slices. Reducing the number of slices did not sacrifice detection accuracy, which is useful in the context of CNNs, where MIPs offer significant computational advantages over multi-slice image volumes. The MIP procedure essentially compresses the 3D image into a 2D image, making microcalcification detection more efficient without a measurable loss in performance. Future studies will capitalize on the MIP to investigate additional parameters with improved efficiency. While CNNs do not necessarily predict human observer performance, human observers may also benefit from viewing MIPs rather than individual slices.
It is noteworthy that the native section thickness was the optimal section thickness for detecting individual microcalcifications and clusters of microcalcifications. These findings differ from those of previous studies conducted in our laboratory for the detection of mass lesions, where the optimal section thickness for detecting small (1 mm), unenhanced mass lesions was the equivalent of 3-5 reconstructed slices [17]. We suspect that the lowered detection performance for mass lesions in the thinnest section was due to the interference of quantum noise with the signal. Microcalcifications are higher-contrast objects than mass lesions and may be less susceptible to quantum noise. One advantage of breast CT over mammography is the ability to adjust the display (section thickness or MIP) of the image volume in real time using viewing software, and this ability will be important during screening exams when lesion positions are not known a priori, or when lesions of interest vary in morphology.
The effects of cluster diameter and number of calcifications on overall cluster detection performance were also investigated. Our results indicated that cluster diameter affects the detectability of microcalcification clusters, particularly for smaller microcalcifications near the threshold of detectability. Larger cluster diameters resulted in reduced overall detectability, while smaller cluster diameters led to increased detectability. Additionally, we observed that the presence of more calcifications within a cluster improved the overall detectability, while fewer calcifications decreased it. These results begin to elucidate how model observers interact with microcalcification clusters in a 3D volume.
These results underscore the well-known reality that the challenge of detecting microcalcifications is the challenge of resolving small objects. The voxel size of our current breast CT scanner is 0.2 mm. Consider two calcifications of the same composition, one larger than the voxel size and the other smaller: the larger calcification will appear notably brighter when imaged on our breast CT system because of partial volume effects. In this study, we mathematically modeled the loss of intensity owing to partial volume effects specific to our system and then evaluated the detectability of partial-volumed calcifications across clinical parameters. These models may be useful for quantitatively estimating the improvement in detectability that may arise from adjusting components of the breast CT system or protocols, such as reconstructed voxel size.
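The partial volume argument can be illustrated with a simple geometric sketch. This is a deliberately simplified stand-in for the system-specific model used in the study: it assumes a sphere centered on a cubic voxel, linear mixing of calcification and background intensities by volume fraction, and no system blur. The function names and the HU values in the demo are hypothetical.

```python
import numpy as np


def central_voxel_fill_fraction(diameter_mm, voxel_mm=0.2, n_sub=41):
    """Fraction of the central voxel's volume occupied by a sphere centered
    on that voxel, estimated by sub-sampling the voxel on an n_sub^3 grid."""
    # Sub-voxel center coordinates spanning the voxel, in mm.
    coords = ((np.arange(n_sub) + 0.5) / n_sub - 0.5) * voxel_mm
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    inside = x**2 + y**2 + z**2 <= (diameter_mm / 2.0) ** 2
    return float(inside.mean())


def partial_volume_intensity(diameter_mm, native_hu, background_hu, voxel_mm=0.2):
    """Central-voxel intensity under an assumed linear volume-mixing model."""
    f = central_voxel_fill_fraction(diameter_mm, voxel_mm)
    return f * native_hu + (1.0 - f) * background_hu


if __name__ == "__main__":
    # Hypothetical native and background intensities, for illustration only.
    for d in (0.04, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40):
        print(d, "mm:", partial_volume_intensity(d, native_hu=1000.0, background_hu=0.0))
```

Under these assumptions, a centered sphere whose diameter is at most the voxel size fills a fraction πd³/(6v³) of the voxel, so a 0.10-mm sphere occupies only about 6.5% of a 0.2-mm voxel, and the voxel is completely filled only once the diameter reaches v√3 (about 0.35 mm for a 0.2-mm voxel). The steep dependence on diameter is the reason sub-voxel calcifications appear much dimmer than their native intensity.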
This study had a number of limitations. The field of view of the 2D 120 × 120 × 1 ROIs was 24 mm × 24 mm. Due to GPU memory constraints, the field of view was reduced to 10 mm × 10 mm × 10 mm (50 × 50 × 50 voxels) when generating 3D VOIs. The reduced volumetric field of view limited our ability to simulate larger cluster diameters and limited the CNN’s learning of anatomical background. Nevertheless, the extensive training dataset of 29,700 VOIs derived from 109 distinct breast CT datasets likely enabled the CNN to learn the nuances of breast anatomical background during training. Moreover, the methodology for simulating microcalcifications and microcalcification clusters was simplified such that each microcalcification was spherical and each cluster contained microcalcifications of homogeneous size. While this simplification does not capture the full complexity of real-world microcalcifications and clusters, it enabled a controlled investigation into specific parameters of interest: microcalcification size, native intensity, cluster diameter, number of calcifications, and image display method. Contrast-enhanced imaging was not explored in this study, but initial studies have demonstrated the utility of contrast-enhanced breast CT in improving the conspicuity of small, malignant calcifications such as those associated with ductal carcinoma in situ (DCIS) [37], and contrast enhancement will likely bring added benefit. Future studies should investigate the influence of these additional factors to obtain a more comprehensive understanding of breast CT optimization for microcalcification detection.
5. CONCLUSION
This study investigated individual effects and the interplay of parameters affecting microcalcification detectability in breast CT. We inserted mathematically generated microcalcifications into acquired patient breast CT images and used CNN-based model observers to evaluate microcalcification detectability across clinical and imaging parameters. As breast CT is still a relatively new breast imaging modality, there is an ongoing need to identify optimal imaging protocols. The results of this investigation will be useful for optimizing breast CT imaging protocols for the detection of microcalcifications, a crucial endeavor for the translation of breast CT to the clinic.
Acknowledgements:
This study was funded in part by the NIH: R01 EB025829 (UCSB), AWD101462-q (MIDRC), and R01 CA181081 (UCD).
Footnotes
Conflicts of interest:
Authors AMH, CKA, and JMB have financial interest in Izotropic Corporation, a commercial breast CT company. Author JMB is editor-in-chief of Medical Physics but is blinded from the review of this paper and had no role to play in decisions regarding it.
REFERENCES
- 1. Fandos-Morera A, Prats-Esteve M, Tura-Soteras JM, Traveria-Cros A. Breast tumors: composition of microcalcifications. Radiology. 1988;169(2):325–327. doi: 10.1148/radiology.169.2.2845471
- 2. Henrot P, Leroux A, Barlier C, Génin P. Breast microcalcifications: The lesions in anatomical pathology. Diagn Interv Imaging. 2014;95(2):141–152. doi: 10.1016/j.diii.2013.12.011
- 3. Le Gal M, Chavanne G, Pellier D. Diagnostic value of clustered microcalcifications discovered by mammography (apropos of 227 cases with histological verification and without a palpable breast tumor). Bull Cancer. 1984;71(1):57–64.
- 4. Breast Imaging Reporting & Data System. Accessed May 23, 2023. https://www.acr.org/Clinical-Resources/Reporting-and-Data-Systems/Bi-Rads
- 5. Kallergi M. Computer-aided diagnosis of mammographic microcalcification clusters. Med Phys. 2004;31(2):314–326. doi: 10.1118/1.1637972
- 6. Bent CK, Bassett LW, D’Orsi CJ, Sayre JW. The Positive Predictive Value of BI-RADS Microcalcification Descriptors and Final Assessment Categories. Am J Roentgenol. 2010;194(5):1378–1383. doi: 10.2214/AJR.09.3423
- 7. Fischer U, Baum F, Obenauer S, et al. Comparative study in patients with microcalcifications: full-field digital mammography vs screen-film mammography. Eur Radiol. 2002;12(11):2679–2683. doi: 10.1007/s00330-002-1354-x
- 8. Arodź T, Kurdziel M, Popiela TJ, Sevre EOD, Yuen DA. Detection of clustered microcalcifications in small field digital mammography. Comput Methods Programs Biomed. 2006;81(1):56–65. doi: 10.1016/j.cmpb.2005.10.002
- 9. Lindfors KK, Boone JM, Nelson TR, Yang K, Kwan ALC, Miller DF. Dedicated Breast CT: Initial Clinical Experience. Radiology. 2008;246(3):725–733. doi: 10.1148/radiol.2463070410
- 10. Boone JM, Nelson TR, Lindfors KK, Seibert JA. Dedicated Breast CT: Radiation Dose and Image Quality Evaluation. Radiology. 2001;221(3):657–667. doi: 10.1148/radiol.2213010334
- 11. Gazi PM, Yang K, Burkett GW, Aminololama-Shakeri S, Anthony Seibert J, Boone JM. Evolution of spatial resolution in breast CT at UC Davis. Med Phys. 2015;42(4):1973–1981. doi: 10.1118/1.4915079
- 12. Suryanarayanan S, Karellas A, Vedantham S, Sechopoulos I, D’Orsi CJ. Detection of Simulated Microcalcifications in a Phantom with Digital Mammography: Effect of Pixel Size. Radiology. 2007;244(1):130–137. doi: 10.1148/radiol.2441060977
- 13. Hernandez AM, Becker AE, Hyun Lyu S, Abbey CK, Boone JM. High-resolution μCT imaging for characterizing microcalcification detection performance in breast CT. J Med Imaging. 2021;8(05). doi: 10.1117/1.JMI.8.5.052107
- 14. Ghammraoui B, Zidan A, Alayoubi A, Zidan A, Glick SJ. Fabrication of microcalcifications for insertion into phantoms used to evaluate x-ray breast imaging systems. Biomed Phys Eng Express. 2021;7(5):055021. doi: 10.1088/2057-1976/ac1c64
- 15. Timberg P, Båth M, Andersson I, Mattsson S, Tingberg A, Ruschin M. In-plane visibility of lesions using breast tomosynthesis and digital mammography: In-plane visibility of lesions using BT and DM. Med Phys. 2010;37(11):5618–5626. doi: 10.1118/1.3488899
- 16. Shaheen E, Van Ongeval C, Zanca F, et al. The simulation of 3D microcalcification clusters in 2D digital mammography and breast tomosynthesis: Simulation of 3D microcalcification clusters. Med Phys. 2011;38(12):6659–6671. doi: 10.1118/1.3662868
- 17. Packard NJ, Abbey CK, Yang K, Boone JM. Effect of slice thickness on detectability in breast CT using a prewhitened matched filter and simulated mass lesions. Med Phys. 2012;39(4):1818–1830. doi: 10.1118/1.3692176
- 18. Lefebvre F, Benali H, Gilles R, Di Paola R. A simulation model of clustered breast microcalcifications. Med Phys. 1994;21(12):1865–1874. doi: 10.1118/1.597186
- 19. Gong X, Glick SJ, Liu B, Vedula AA, Thacker S. A computer simulation study comparing lesion detection accuracy with digital mammography, breast tomosynthesis, and cone-beam CT breast imaging: Comparison of lesion detectability with 3 breast imaging modalities. Med Phys. 2006;33(4):1041–1052. doi: 10.1118/1.2174127
- 20. Badano A, Graff CG, Badal A, et al. Evaluation of Digital Breast Tomosynthesis as Replacement of Full-Field Digital Mammography Using an In Silico Imaging Trial. JAMA Netw Open. 2018;1(7):e185474. doi: 10.1001/jamanetworkopen.2018.5474
- 21. Barufaldi B, Vent TL, Bakic PR, Maidment ADA. Computer simulations of case difficulty in digital breast tomosynthesis using virtual clinical trials. Med Phys. 2022;49(4):2220–2232. doi: 10.1002/mp.15553
- 22. Bakic PR, Barufaldi B, Higginbotham D, et al. Virtual clinical trial of lesion detection in digital mammography and digital breast tomosynthesis. In: Chen GH, Lo JY, Gilat Schmidt T, eds. Medical Imaging 2018: Physics of Medical Imaging. SPIE; 2018:5. doi: 10.1117/12.2294934
- 23. De Sisternes L, Brankov JG, Zysk AM, Schmidt RA, Nishikawa RM, Wernick MN. A computational model to generate simulated three-dimensional breast masses. Med Phys. 2015;42(2):1098–1118. doi: 10.1118/1.4905232
- 24. Vancoillie L, Marshall N, Cockmartin L, Vignero J, Zhang G, Bosmans H. Verification of the accuracy of a hybrid breast imaging simulation framework for virtual clinical trial applications. J Med Imaging. 2020;7(04):1. doi: 10.1117/1.JMI.7.4.042804
- 25. Massanes F, Brankov JG. Evaluation of CNN as anthropomorphic model observer. In: Medical Imaging 2017: Image Perception, Observer Performance, and Technology Assessment. Vol 10136. SPIE; 2017:188–194. doi: 10.1117/12.2254603
- 26. Han M, Baek J. A convolutional neural network-based anthropomorphic model observer for signal-known-statistically and background-known-statistically detection tasks. Phys Med Biol. 2020;65(22):225025. doi: 10.1088/1361-6560/abbf9d
- 27. Fan F, Ahn S, Man BD, et al. Deep learning-based model observers that replicate human observers for PET imaging. In: Medical Imaging 2020: Image Perception, Observer Performance, and Technology Assessment. Vol 11316. SPIE; 2020:53–58. doi: 10.1117/12.2547505
- 28. Kim B, Han M, Baek J. A Convolutional Neural Network-Based Anthropomorphic Model Observer for Signal Detection in Breast CT Images Without Human-Labeled Data. IEEE Access. 2020;8:162122–162131. doi: 10.1109/ACCESS.2020.3021125
- 29. Kim G, Han M, Shim H, Baek J. A convolutional neural network-based model observer for breast CT images. Med Phys. 2020;47(4):1619–1632. doi: 10.1002/mp.14072
- 30. Lyu SH, Abbey CK, Hernandez AM, Boone JM. Pre-whitened matched filter and convolutional neural network based model observer performance for mass lesion detection in non-contrast breast CT. Med Phys. Published online 2023. doi: 10.1002/mp.16685
- 31. Feldkamp LA, Davis LC, Kress JW. Practical cone-beam algorithm. J Opt Soc Am A. 1984;1(6):612. doi: 10.1364/JOSAA.1.000612
- 32. Prionas ND, Lindfors KK, Ray S, et al. Contrast-enhanced Dedicated Breast CT: Initial Clinical Experience. Radiology. 2010;256(3):714–723. doi: 10.1148/radiol.10092311
- 33. Frappart L, Remy I, Lin HC, et al. Different types of microcalcifications observed in breast pathology: Correlations with histopathological diagnosis and radiological examination of operative specimens. Virchows Arch A Pathol Anat Histopathol. 1987;410(3):179–187. doi: 10.1007/BF00710823
- 34. Eckstein MP, Abbey CK. Model observers for signal-known-statistically tasks (SKS). In: Krupinski EA, Chakraborty DP, eds. ; 2001:91–102. doi: 10.1117/12.431177
- 35. Chollet F. Keras. Published online 2015. https://github.com/fchollet/keras
- 36. Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. Published online January 29, 2017. Accessed January 26, 2023. http://arxiv.org/abs/1412.6980
- 37. Aminololama-Shakeri S, Abbey CK, Gazi P, et al. Differentiation of ductal carcinoma in-situ from benign micro-calcifications by dedicated breast computed tomography. Eur J Radiol. 2016;85(1):297–303. doi: 10.1016/j.ejrad.2015.09.020












