Abstract
The purpose of this study was to evaluate stereoscopic perception of low-dose breast tomosynthesis projection images. In this Institutional Review Board exempt study, craniocaudal breast tomosynthesis cases (N = 47), consisting of 23 biopsy-proven malignant mass cases and 24 normal cases, were retrospectively reviewed. A stereoscopic pair comprising two projection images that were ±4° apart from the zero angle projection was displayed on a Planar PL2010M stereoscopic display (Planar Systems, Inc., Beaverton, OR, USA). An experienced breast imager verified the truth for each case stereoscopically. A two-phase blinded observer study was conducted. In the first phase, two experienced breast imagers rated their ability to perceive 3D information using a scale of 1–3 and described the most suspicious lesion using the BI-RADS® descriptors. In the second phase, four experienced breast imagers were asked to make a binary decision on whether they saw a mass for which they would initiate a diagnostic workup, to report the location of the mass, and to provide a confidence score in the range of 0–100. The sensitivity and the specificity of the lesion detection task were evaluated. The results from our study suggest that radiologists who can perceive stereo can reliably interpret breast tomosynthesis projection images using stereoscopic viewing.
Keywords: Breast tomosynthesis, Stereoscopic display, 3D perception, Low-dose projections
Background
In breast tomosynthesis imaging, 15–30 x-ray projections of the breast are typically acquired over a limited angular range of 15–50° [1–3]. Multiple slices of the breast can be synthesized from the projection images, and these slices provide 3D visual information about the structures within the breast. Yet, due to limited angular sampling, tomosynthesis is often regarded as a quasi-3D modality as the resolution between slices (in the direction perpendicular to the detector plane) is poorer than the in-plane (plane parallel to the detector plane) resolution [4]. The typical in-plane resolution provided by present day tomosynthesis systems is around 100 μm per pixel while the resolution between slices is around 1 mm.
The imaging geometry of breast tomosynthesis naturally lends itself to stereoscopic (stereo) visualization provided that the angular separation between the projection images is carefully selected. Stereo mammography has been shown to be effective in reducing unnecessary patient recalls [5]. However, in the clinical studies conducted to date, the dose of the stereo mammographic examination was twice the dose of a regular screening mammographic examination [5]. The x-ray dose under which each tomosynthesis projection image is acquired is much lower than the dose under which a standard mammogram is acquired. Consequently, tomosynthesis projection images have poor contrast due to a low signal to noise ratio [6] when compared to traditional x-ray mammograms. However, Webb et al. [7] showed that radiologists performed significantly better in detecting simulated breast masses on tomosynthesis projection images viewed on a stereo display than on regular mammograms.
For our research, we focused on the effect stereo viewing of tomosynthesis projection images had on the 3D perception and on the detection and characterization performance of real masses. The main objectives of our study were twofold: (1) Can radiologists perceive 3D information when viewing the low-dose projection images stereoscopically and (2) can mass detection be reliably performed on tomosynthesis projection images using stereoscopic viewing? These questions were geared toward finding whether stereo viewing of tomosynthesis projection images could be a viable reading mode for interpreting breast tomosynthesis data in the future since stereo visualization of tomosynthesis projections has the potential to reveal the 3D structure of the breast unlike the current cine or slice-by-slice viewing modes.
Materials and Methods
Data Set
Breast tomosynthesis images were provided by Hologic, Inc. (Bedford, MA, USA). A total of 47 craniocaudal breast tomosynthesis cases, comprising 23 biopsy-proven malignant mass cases and 24 normal cases, were used in this Institutional Review Board exempt study. Each case consisted of 15 projection images, which spanned an angular range of approximately 15°. A stereo pair was formed by selecting two images that were approximately ±4° apart from the zero angle projection (total separation of approximately 8°). No other criterion was used in selecting the projection images of the stereo pair. Previous studies with stereo mammography have shown that in order to achieve good depth perception of breast tissue, the angle of separation between the two images of the stereo pair should be between 6° and 10° [5].
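The pair selection described above can be sketched in a few lines of Python. This is only an illustration: the even angular spacing of the 15 projections and the function name `select_stereo_pair` are assumptions, since the exact acquisition angles are not listed in this paper.

```python
def select_stereo_pair(angles_deg, half_separation_deg=4.0):
    """Return the indices of the projections closest to -half and +half separation."""
    indices = range(len(angles_deg))
    left = min(indices, key=lambda i: abs(angles_deg[i] + half_separation_deg))
    right = min(indices, key=lambda i: abs(angles_deg[i] - half_separation_deg))
    return left, right


# Hypothetical acquisition: 15 projections evenly spaced over a 15 degree arc.
n_projections = 15
span_deg = 15.0
angles = [-span_deg / 2 + i * span_deg / (n_projections - 1)
          for i in range(n_projections)]

left, right = select_stereo_pair(angles)
separation = angles[right] - angles[left]
print(left, right, round(separation, 2))
```

Under these assumed angles the selected pair is separated by a little more than 8°, which stays inside the 6° to 10° range cited from the stereo mammography literature [5].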
An experienced breast imager with close to 20 years of experience in interpreting screening mammograms certified each case that was used in this study as being a normal or a lesion case. This breast imager, to whom we will refer as the “truth radiologist,” determined the ground truth location information on each stereo pair of projection images by visually comparing the lesion depicted on the projection images and the corresponding reconstructed slices. For the reconstructed slices, Hologic, Inc. provided the ground truth location information, and the truth radiologist used this information to determine the ground truth location on the stereo pair of projection images. The truth radiologist also assessed the tissue density and the lesion subtlety. Table 1 illustrates the distribution of densities represented in the cohort of cases used for the study.
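Table 1. Distribution of breast tissue densities in the cohort of cases used for the study (number of cases)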
Fatty | Scattered densities | Heterogeneously dense | Extremely dense |
---|---|---|---|
10 | 21 | 13 | 3 |
Data Preprocessing
We manually preprocessed the stereo pairs of the raw projection images to enhance their contrast. This was achieved by manually adjusting the DICOM window width/window level parameter to obtain satisfactory contrast in the two images of the stereo pair. The DICOM window width/window level parameter was set to the same value in both images of the stereo pair. The truth radiologist certified each stereo pair as having sufficient contrast for interpretation. Subsequently, we corrected the histograms of the two images by shifting the median of one histogram with respect to the other such that they were identically matched in shape. To perceive stereoscopic effect, the images of the stereo pair had to be rotated by 90° in the clockwise direction such that the left breast was displayed as though the patient was in a prone position, while the right breast was displayed as though the patient was in a supine position. This display of images, while different from the display of conventional mammographic images, provides horizontal parallax, which in turn induces stereoscopic perception. The stereo display mode used in our study was similar to that used by Getty et al. in their stereo mammography study [5]. Finally, a white dot was randomly placed on either the left or the right side of the non-tissue region of the images. The white dot was used as a reference by readers while commenting on the location of the abnormality found on the image. Figure 1 illustrates an example of a processed stereo pair of tomosynthesis projection images of the right breast used in our study.
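A minimal sketch of the two preprocessing steps described above (a shared window width/level followed by a median shift) is given below. It makes several assumptions that are not stated in this paper: images are represented as flat lists of pixel values, the windowing is a simplified linear mapping rather than the exact DICOM formula, and "matching" is interpreted as adding a constant offset so that the two medians coincide. The function names are hypothetical.

```python
from statistics import median


def apply_window(pixels, center, width, out_max=255.0):
    """Simplified linear window/level mapping, clipped to [0, out_max]."""
    low = center - width / 2.0
    windowed = []
    for p in pixels:
        t = (p - low) / width
        t = min(1.0, max(0.0, t))  # clip values outside the window
        windowed.append(t * out_max)
    return windowed


def match_median(reference, target):
    """Shift `target` by a constant so its median equals that of `reference`."""
    shift = median(reference) - median(target)
    return [p + shift for p in target]


# Hypothetical pixel values for the two images of a stereo pair.
left_raw = [100, 200, 300, 400, 500]
right_raw = [120, 220, 320, 420, 520]

# The same window parameters are applied to both images, as in the study.
left_img = apply_window(left_raw, center=300, width=400)
right_img = apply_window(right_raw, center=300, width=400)
right_matched = match_median(left_img, right_img)
print(median(left_img), median(right_matched))
```

A constant shift aligns the medians but does not change the shape of the histogram, so this sketch should be read as one possible interpretation of the correction rather than the exact procedure used in the study.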
Methodology
The stereo pairs were displayed on a Planar PL2010M stereo display (maximum resolution 1,600 × 1,200) manufactured by Planar Systems, Inc. (Beaverton, OR, USA). This was not a standard mammographic display, but a research workstation. The Stereoscopic Player™ (3dtv.at, Linz, Austria) software was used to load the stereo pair on the stereo display system. A stereo pair loaded on this stereo display system can be fused by a viewer with normal stereo acuity who wears lightweight passive cross-polarized glasses.
The study was blinded and was held over two phases. The main objective of the first phase was to assess if the radiologists could perceive 3D information from the low-dose projection images. The readers who took part in the first phase of the study were two experienced breast imagers, each having more than 15 years of experience in interpreting screening mammograms. In addition to these two breast imagers, a radiologist who specialized in reading computed tomography scans and an imaging physicist participated in the study design as trial readers. The trial readers evaluated the cases to determine if 3D information could be perceived.
A questionnaire was devised with the help of the truth radiologist to collect information on how well the study readers perceived 3D information on a scale of 1–3 (1, no 3D perception; 2, moderate 3D perception; and 3, excellent 3D perception). The readers were asked to rate their general qualitative perception of the 3D cyclopean view. The questionnaire also contained specific questions for describing the morphological characteristics of the most suspicious finding identified by the readers. The morphological characteristics of the lesions were described using the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS®) descriptors [8]. The readers were asked to describe the lesion subtlety on a scale of 1–5, where a rating of 1 meant that the lesion was extremely subtle and a rating of 5 meant that the lesion was extremely obvious. The readers were also asked to characterize the breast density according to one of four BI-RADS® categories: mainly fatty, scattered fibroglandular densities, heterogeneously dense, and extremely dense [8].
The first phase of the study was split across two sessions, which were held under the same ambient lighting conditions (dark room reading). In the first session, 24 cases were shown to the readers, while in the second session, 23 cases were shown. Each session had a mix of true positive and true negative cases that were shown in a random order. In a training phase at the start of the first session, each reader was shown three stereo tomosynthesis pairs (one mass and two normal) that were not part of the study set, in order to become accustomed to the passive cross-polarized glasses, the stereo display, and the questionnaire. A break of at least 30 min was scheduled between the two sessions. Each reader was free to adjust the seating and viewing distance in order to best perceive the 3D view. As soon as a stereo pair was loaded on the stereoscopic display system using the Stereoscopic Player™ software, a stopwatch was started and the readers were given at least 20 s to examine the display. The main reason behind waiting for at least 20 s was to ensure that the readers had adequate time to fuse the stereo pair to form the cyclopean view and become comfortable with their 3D perception rather than interpreting the case based on their monoscopic perception. If in those 20 s the readers did not indicate that they were ready to make an interpretation, then they were prompted to see if they were ready. The readers were free to take more time if needed or indicate that they were ready to make an interpretation even before the 20 s had elapsed. The images of each case were presented on the display until the readers had answered all of the questions for that particular case. Only the craniocaudal stereo pairs were available to the readers, with no additional images or display modes. Further, the readers were not allowed to modify any of the display parameters such as magnification or contrast.
The main objective of the second phase was to assess if mass detection could be reliably performed using stereo viewing of tomosynthesis projection images. Here, we expanded on the first phase of the study and compared the mass detection performance of stereo viewing vs. monoscopic (mono) viewing. Four experienced breast imagers, each having an experience of more than 10 years in interpreting screening mammograms, participated in the study as blinded readers. One reader was common to both phases. Like in the first phase, the second phase of the study was also held over two sessions, with a mix of true positive and true negative cases shown in a random order. A total of 23 cases were shown in one session and 24 cases in the other session. In each session, the reader was shown each case twice, once in the stereo mode and once in the mono mode, in a random order. In the mono mode, the same image of the stereo pair was displayed on both monitors of the stereo display system. The reader was not told what the current viewing mode was. Each case was shown on the display system to the readers until the readers had completed interpreting the case. The cross-polarized stereo glasses were used by the readers throughout the session even while viewing the images in the mono mode. The readers were not allowed to modify display parameters such as magnification and contrast.
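The presentation scheme of a second-phase session can be sketched as follows. This is a hypothetical illustration: the paper states only that each case was shown once per viewing mode in a random order, so the fully interleaved shuffle, the case identifiers, and the function name `build_session` are assumptions.

```python
import random


def build_session(case_ids, seed=None):
    """Return a shuffled list of (case_id, mode) presentations, two per case."""
    rng = random.Random(seed)
    presentations = [(case, mode)
                     for case in case_ids
                     for mode in ("stereo", "mono")]
    rng.shuffle(presentations)  # the reader is not told the mode of each item
    return presentations


# Hypothetical session with 23 cases, giving 46 blinded presentations.
session = build_session(list(range(1, 24)), seed=2024)
print(len(session), session[:3])
```

A fixed seed is used here only so that a schedule can be regenerated for record keeping.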
In the second phase, the readers were asked to provide a binary decision on whether they saw a mass for which they would initiate a diagnostic workup or not and provide a confidence score in the range of 0–100 that indicated their confidence in the presence of the mass. A confidence score of 0 meant that the reader was 100 % certain that there was no mass, while a confidence score of 100 meant that the reader was 100 % certain that there was a mass. The binary decisions were collected, as they closely resembled how the readers would operate in an actual clinical setting. The confidence scores, which indicated the reader’s confidence in the presence of an abnormality, were collected for analyzing the readers’ performance in an experimental setting using the receiver operating characteristic curve (ROC). Previous works [9] on analyzing binary and continuous/multi-category ratings in mammography observer studies have demonstrated that these two decision-making processes yield similar reader performances even though the binary true positive fraction (TPF) and false-positive fraction (FPF) operating points do not always lie on the ROC curves but in their vicinity.
The Randot stereo acuity test was administered to every reader who participated in this study, including the truth radiologist, since stereo perception is an innate ability and not all human beings view stereoscopic images equally well. In fact, 4–10 % of humans exhibit some degree of stereo deficiency [10]. All the readers passed the Randot stereo acuity test.
Statistical Analysis
Statistical analyses were conducted to assess how well the study readers perceived 3D information within the breast and their diagnostic performance under the stereo and the mono viewing modes.
Inter-reader agreement between the two readers in their descriptions of the shape and margin properties and their ratings of lesion subtlety and tissue density collected during the first phase were quantified using percent agreement. We did not use the Kappa statistic for assessing inter-reader agreement as we found that the Kappa statistic values indicated a low agreement when the percent agreement was very high. This is a well-documented problem in the statistics literature with the Kappa statistic as the Kappa is very sensitive to trait prevalence in the population under consideration. This problem is commonly referred to as the Kappa paradox [11].
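The Kappa paradox can be reproduced with a small numerical example. The sketch below computes percent agreement and Cohen's Kappa for two readers; the ratings are hypothetical and were chosen only to show that a highly prevalent category can yield a Kappa near zero even when the readers agree on 90 % of cases.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of cases on which the two readers gave the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's Kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    chance = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (observed - chance) / (1 - chance)  # undefined when chance == 1


# Hypothetical ratings: one category dominates, and the readers disagree
# only on the last two of 20 cases.
reader_1 = ["dense"] * 18 + ["fatty", "dense"]
reader_2 = ["dense"] * 18 + ["dense", "fatty"]

agreement = percent_agreement(reader_1, reader_2)
kappa = cohen_kappa(reader_1, reader_2)
print(agreement, round(kappa, 3))
```

Here the chance agreement is 0.905, slightly above the observed agreement of 0.90, so Kappa is slightly negative despite the high raw agreement.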
From the binary decisions collected in the second phase, we computed the overall binary TPF and FPF for each reader and each viewing mode. Similarly, the empirical ROC curves were generated using the confidence scores for each reader and each viewing mode. We first analyzed whether the binary decisions and the continuous ratings resulted in similar reader performances. A bootstrap analysis was carried out to ascertain this. This analysis was similar to what was described by Gur et al. [9] in their work comparing mammography reader performances under binary and continuous/multi-category ratings. We summarize the key steps: Let us suppose that the $i$th reader is denoted by $R_i$, and that for this reader and a given viewing mode (stereo or mono), the binary TPF and FPF obtained were $(\mathrm{TPF}_i^{B}, \mathrm{FPF}_i^{B})$. We then evaluated the TPF at the binary FPF, $\mathrm{FPF}_i^{B}$, from the corresponding ROC curve. We used linear interpolation for values of $\mathrm{FPF}_i^{B}$ that were not present in the list of FPF values used to generate the empirical ROC curve. Let us denote the linearly interpolated TPF value from the ROC curve by $\mathrm{TPF}_i^{\mathrm{ROC}}$. The signed vertical difference $\mathrm{TPF}_i^{B} - \mathrm{TPF}_i^{\mathrm{ROC}}$ yields a measure that is indicative of how similar the reader performances are under the binary decision and continuous confidence scores.
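The comparison of the binary operating point with the ROC curve can be sketched as follows for a single reader and viewing mode. The data are hypothetical, the function names are ours, and two conventions are assumptions rather than details taken from the study: the difference is taken as the binary TPF minus the ROC TPF, and when the binary FPF coincides with a vertical step of the empirical curve, the highest TPF on that step is used.

```python
def binary_operating_point(decisions, truth):
    """Return (FPF, TPF) from 0/1 workup decisions and 0/1 ground truth."""
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    tp = sum(1 for d, t in zip(decisions, truth) if d and t)
    fp = sum(1 for d, t in zip(decisions, truth) if d and not t)
    return fp / n_neg, tp / n_pos


def empirical_roc(scores, truth):
    """Return ROC vertices (FPF, TPF) from (0, 0) to (1, 1)."""
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, t in zip(scores, truth) if s >= thr and t)
        fp = sum(1 for s, t in zip(scores, truth) if s >= thr and not t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_tpf_at(points, fpf):
    """TPF of the empirical ROC curve at a given FPF (linear interpolation)."""
    exact = [y for x, y in points if x == fpf]
    if exact:
        return max(exact)  # assumed convention on vertical steps
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 < fpf < x1:
            return y0 + (y1 - y0) * (fpf - x0) / (x1 - x0)
    raise ValueError("fpf must lie in [0, 1]")


# Hypothetical reader data: 4 mass cases followed by 4 normal cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [90, 80, 50, 30, 50, 50, 20, 10]   # confidence scores, 0-100
decisions = [1, 1, 1, 0, 1, 0, 0, 0]        # 1 = would initiate a workup

fpf_b, tpf_b = binary_operating_point(decisions, truth)
tpf_roc = roc_tpf_at(empirical_roc(scores, truth), fpf_b)
difference = tpf_b - tpf_roc
print(fpf_b, tpf_b, tpf_roc, difference)
```

In this example two normal cases share the score 50, so the empirical curve has no vertex at an FPF of 0.25 and the ROC TPF at that point is obtained by interpolation.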
To assess whether the signed difference in sensitivities was significant or not, we performed bootstrap sampling. For each reader, we resampled the mass cases and the normal cases independently to ensure that the final sample had the same number of mass and normal cases as in the original dataset. It is important to note that we did not resample cases across readers, and each reader’s data were analyzed independently of the others’ data. We generated 5,000 bootstrap samples, and for each sample, the signed vertical difference between the binary and the ROC TPF values was computed as described above. The mean-subtracted bootstrap difference distribution was then used to evaluate the two-sided bootstrap p value with the test statistic being the signed vertical difference computed from the observed data.
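A minimal sketch of the stratified bootstrap for one reader and one viewing mode is shown below. It is written under stated assumptions: the data are hypothetical, the helper names are ours, the difference is taken as binary TPF minus ROC TPF, and the two-sided p value is computed as the fraction of mean-subtracted bootstrap differences whose magnitude is at least that of the observed difference. The study's exact p value computation may differ in detail.

```python
import random


def roc_tpf_at(pos_scores, neg_scores, fpf):
    """Empirical ROC TPF at a given FPF, with linear interpolation."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(pos_scores) | set(neg_scores), reverse=True):
        points.append((sum(s >= thr for s in neg_scores) / len(neg_scores),
                       sum(s >= thr for s in pos_scores) / len(pos_scores)))
    exact = [y for x, y in points if x == fpf]
    if exact:
        return max(exact)  # assumed convention on vertical steps
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 < fpf < x1:
            return y0 + (y1 - y0) * (fpf - x0) / (x1 - x0)
    raise ValueError("fpf must lie in [0, 1]")


def signed_difference(pos, neg):
    """pos/neg: lists of (decision, score) for mass and normal cases."""
    tpf = sum(d for d, _ in pos) / len(pos)
    fpf = sum(d for d, _ in neg) / len(neg)
    return tpf - roc_tpf_at([s for _, s in pos], [s for _, s in neg], fpf)


def bootstrap_p_value(pos, neg, n_boot=5000, seed=0):
    """Stratified bootstrap: mass and normal cases are resampled separately."""
    rng = random.Random(seed)
    observed = signed_difference(pos, neg)
    diffs = []
    for _ in range(n_boot):
        boot_pos = [rng.choice(pos) for _ in pos]  # keeps the number of masses
        boot_neg = [rng.choice(neg) for _ in neg]  # keeps the number of normals
        diffs.append(signed_difference(boot_pos, boot_neg))
    mean = sum(diffs) / len(diffs)
    extreme = sum(abs(d - mean) >= abs(observed) for d in diffs)
    return observed, extreme / n_boot


# Hypothetical (decision, confidence score) pairs for one reader.
mass_cases = [(1, 90), (1, 80), (1, 50), (0, 30)]
normal_cases = [(1, 50), (0, 50), (0, 20), (0, 10)]
observed, p_value = bootstrap_p_value(mass_cases, normal_cases, n_boot=1000)
print(observed, p_value)
```

Resampling within each class keeps the denominators of the TPF and FPF fixed, which is why the mass and normal cases are drawn separately.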
The area under the ROC curve (AUC) and the standard deviation in AUC were also computed from the observed data for the stereo and the mono viewing modes. Further, since breast imaging radiologists usually operate at sensitivities greater than 90 %, the partial AUC values for 90 and 95 % sensitivities were also computed using the pROC software package [12]. The differences in partial AUC values for 90 and 95 % sensitivities were statistically assessed [12].
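For illustration, the sketch below computes a trapezoidal AUC and a partial AUC restricted to a high-sensitivity region from an empirical ROC curve. It is not the pROC implementation used in the study; it assumes that the partial AUC is the area of specificity integrated over sensitivities from a lower bound (for example 0.90) up to 1, so that its maximum possible value equals the width of that range. The data and function names are hypothetical.

```python
def empirical_roc(pos_scores, neg_scores):
    """Return ROC vertices (FPF, TPF) from (0, 0) to (1, 1)."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(pos_scores) | set(neg_scores), reverse=True):
        points.append((sum(s >= thr for s in neg_scores) / len(neg_scores),
                       sum(s >= thr for s in pos_scores) / len(pos_scores)))
    return points


def partial_auc_sensitivity(points, se_min):
    """Integrate specificity over sensitivity in [se_min, 1] (trapezoids)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 <= se_min or y1 == y0:
            continue  # segment lies below the range or has zero width
        sp0, sp1 = 1.0 - x0, 1.0 - x1
        if y0 < se_min:
            # Clip the segment at the lower sensitivity bound.
            sp0 = sp0 + (sp1 - sp0) * (se_min - y0) / (y1 - y0)
            y0 = se_min
        area += (y1 - y0) * (sp0 + sp1) / 2.0
    return area


# Hypothetical confidence scores for 4 mass cases and 4 normal cases.
mass_scores = [90, 80, 60, 30]
normal_scores = [70, 40, 20, 10]

roc = empirical_roc(mass_scores, normal_scores)
auc = partial_auc_sensitivity(roc, 0.0)     # full area under the curve
pauc_90 = partial_auc_sensitivity(roc, 0.90)
pauc_95 = partial_auc_sensitivity(roc, 0.95)
print(auc, pauc_90, pauc_95)
```

With a lower bound of 0 the function returns the full trapezoidal AUC, since integrating specificity over sensitivity gives the same area as integrating sensitivity over the false-positive fraction.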
Results
The two readers who took part in the first phase of the study perceived moderate or excellent 3D information in 89.36 % (42/47) and 93.62 % (44/47) of the cases. Both readers remarked that the 3D perception was better in cases that depicted rich vasculature and other linear structures.
For the BI-RADS® mass shape, mass margin, and assessment ratings, the two readers agreed on 7 out of 16 (43.75 %), 11 out of 16 (68.75 %), and 12 out of 16 (75 %) cases, respectively. For mass subtlety and tissue density ratings, the two readers agreed on 12 out of 16 (75 %) and 35 out of 47 (74.47 %) cases, respectively. There were only 16 lesions that were correctly detected by both the readers, and hence, the denominator while computing the BI-RADS® percent agreement is 16 as opposed to 23 (the total number of lesion cases).
Figures 2 and 3 illustrate the ROC curves for the four readers along with the binary operating points for each reader for the mono and the stereo viewing modes, respectively. As can be seen from Figures 2 and 3, the binary operating points do not necessarily lie on the ROC curves. Table 2 lists the bootstrap p values comparing the reader performance using the binary and continuous rating scales for the mono and the stereo viewing modes. Out of the eight experimental conditions (four readers and two viewing modes), a statistically significant difference in performance between the binary and continuous ratings was observed for only one condition: the mono viewing mode of reader 2. For the remaining experimental conditions, no significant differences in performance were observed between the binary and the continuous confidence scores. Table 3 lists the AUC values and their standard deviations, the partial AUC values at 90 and 95 % sensitivities, and p values indicating whether the differences in partial AUC values at 90 and 95 % sensitivities were significant or not for the mono and the stereo viewing modes. From Table 3, we see that the partial AUC values at 90 and 95 % sensitivities were significantly better for readers 1 and 2 under the stereo viewing mode than under the mono viewing mode. For readers 3 and 4, the differences in the partial AUC values at 90 and 95 % sensitivities were not statistically significant. Out of the four readers, only reader 1 took part in both phases of the study.
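Table 2. Bootstrap p values comparing reader performance under the binary and continuous rating scales for the mono and the stereo viewing modes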
Reader number | Bootstrap p value (mono viewing mode) | Bootstrap p value (stereo viewing mode) |
---|---|---|
1 | 0.133 | 0.053 |
2 | 0.036 | 0.169 |
3 | 0.466 | 0.194 |
4 | 0.403 | 0.211 |
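Table 3. AUC values (± standard deviation), partial AUC values at 90 and 95 % sensitivities, and p values comparing the partial AUC values between the mono and the stereo viewing modes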
Reader number | Mono viewing mode AUC | Stereo viewing mode AUC | Partial AUCs at 90 and 95 % TPF (mono viewing mode) | Partial AUCs at 90 and 95 % TPF (stereo viewing mode) | p value comparing partial AUC at 90 % TPF between mono and stereo viewing modes | p value comparing partial AUC at 95 % TPF between mono and stereo viewing modes |
---|---|---|---|---|---|---|
1 | 0.870 ± 0.05 | 0.887 ± 0.04 | 0.022, 0.006 | 0.064, 0.031 | 0.037 | 0.013 |
2 | 0.888 ± 0.04 | 0.942 ± 0.03 | 0.033, 0.014 | 0.071, 0.032 | 0.026 | 0.037 |
3 | 0.889 ± 0.05 | 0.904 ± 0.04 | 0.028, 0.0002 | 0.038, 0.013 | 0.636 | 0.302 |
4 | 0.921 ± 0.03 | 0.894 ± 0.05 | 0.061, 0.024 | 0.062, 0.024 | 0.977 | 0.984 |
Discussion
The main advantage breast tomosynthesis offers over traditional screening mammography is the reduction in overlapping out-of-plane tissue structures that hinder the early detection of breast cancer. Indeed, clinical studies conducted to date suggest that breast tomosynthesis might significantly improve the detection of noncalcified lesions, particularly in breasts composed of dense tissue [13]. Tomosynthesis also reduces the number of unnecessary patient recalls when used in conjunction with screening mammography [14, 15].
Yet, the clinical studies conducted to date employed reading modes in which the individual tomosynthesis projection images were viewed as though they were traditional screen film mammograms and the reconstructed slices were viewed using a cine or slice-by-slice viewing mode. It is not clear what the optimal reading mode for breast tomosynthesis images is. Individual projection images, which are acquired at a low x-ray dose, are noisy and have the same limitation as screening mammography due to the effect of overlapping out-of-plane tissue structures.
The quality of the reconstructed slices depends on the reconstruction algorithm that is used [4]. Further, when reconstructed slices are reviewed, there are many slices that would need to be reviewed by the radiologist. If tomosynthesis is used in routine screening, then the volume of data that is generated on a daily basis could potentially be very large and fatiguing. The clinical study conducted by Good et al. [16] reaffirmed this point; there is a need to understand and identify the optimal reading mode for breast tomosynthesis data.
Tomosynthesis projection images are amenable to stereo viewing provided that the angle of separation between the two projection images is carefully selected. Stereo viewing has the potential to reveal the 3D structure of the breast and to enable better real lesion detection performance as revealed by clinical studies conducted with stereo mammography [5]. To the best of our knowledge, this is the first study to explore stereo viewing of tomosynthesis projection images of real breast lesions. A study that compares most closely with our study is the study by Webb et al. [7]. However, as noted earlier, the study by Webb et al. involved simulated breast masses rather than real masses [7]. Additionally, Webb et al. evaluated only lesion detection performance in their study [7], while we explored other questions such as how well 3D information was perceived and how well real breast masses can be characterized in addition to detecting them. We have demonstrated that stereo viewing of tomosynthesis projections is indeed possible with minimal preprocessing of the image data to improve the contrast. The pre-processing can be automated as described by Webb et al. [7]. Thus, stereo viewing could be a viable reading mode for tomosynthesis projection images in the future.
There are some limitations in the current study, which we are addressing in ongoing work. Apart from the limited number of readers involved in this pilot study, the cases that were analyzed in this study contained only mass lesions. We believe that it would be interesting to assess the performance of detecting microcalcifications in the stereo viewing mode. Further, in the current study, we selected projection images that were approximately 4° apart from the zero angle projection. The optimal separation between the projection images is not yet known. If the angle of separation is too small, then the stereoscopic reader cannot perceive enough depth. On the other hand, if the angle of separation is too large, then it is difficult for the reader to fuse the images stereoscopically. A point related to the viewing angle is the number of stereo pairs of projection images that need to be presented to the readers for optimal detection performance. In the current study, we used a single pair of stereo projections. However, we do not claim that the rest of the projection images have redundant information and need not be used. In fact, it would be interesting to study how many stereo pairs of projection images are needed for ensuring complete breast coverage such that optimal detection performance is achieved without compromising on the reading time. The optimal number of stereo pairs of projection images could eventually impact the radiation dose as one could then perform tomosynthesis examinations by acquiring only a specified number of projection images. Webb et al. used 22 stereo pairs of tomosynthesis projection images, each separated by approximately 6°, in their study [7]. However, Webb et al. did not analyze if all the 22 pairs were actually needed to ensure good detection performance [7]. 
Finally, an important point we would like to make about the stereo mode is that stereo visualization is an innate ability and not all human beings can see 3D content stereoscopically [10].
Conclusion
In this study, we have assessed the feasibility of viewing low-dose breast tomosynthesis projection images on a stereoscopic display. Our findings suggest that reliable stereoscopic 3D display and interpretation of breast tomosynthesis projection images is possible. Stereoscopic viewing could prove to be an efficient reading mode for breast tomosynthesis projection data.
Acknowledgments
The authors would like to acknowledge the support of Hologic, Inc. (Bedford, MA, USA) in this project. In particular, the authors would like to acknowledge Dr. Loren Niklason and Dr. Ashwini Kshirsagar at Hologic, Inc. for providing assistance with the breast tomosynthesis data.
References
- 1. Niklason LT, Christian BT, Niklason LE, et al: Digital tomosynthesis in breast imaging. Radiology 205:399–406, 1997
- 2. Dobbins JT III, Godfrey DJ: Digital x-ray tomosynthesis: Current state of the art and clinical potential. Phys Med Biol 48:R65–106, 2003
- 3. Ren B, Ruth C, Wu T, et al: A new generation FFDM/tomosynthesis fusion system with selenium detector. Proc. SPIE Medical Imaging: Physics of Medical Imaging 7622:76220B-76220B-11, 2010
- 4. Karellas A, Vedantham S: Breast cancer imaging: A perspective for the next decade. Med Phys 35:4878–4897, 2008. doi: 10.1118/1.2986144
- 5. Getty DJ, D’Orsi CJ, Pickett RM: Stereoscopic digital mammography: Improved accuracy of lesion detection in breast cancer screening. Lect Notes Comput Sci 5116:74–79, 2008. doi: 10.1007/978-3-540-70538-3_11
- 6. Prince JL, Links JM: Projection radiography. Medical imaging signals and systems. Pearson Prentice Hall, Upper Saddle River, NJ, 2006
- 7. Webb LJ, Samei E, Lo JY, et al: Comparative performance of multiview stereoscopic and mammographic display modalities for breast lesion detection. Med Phys 38:1972–1980, 2011
- 8. D’Orsi CJ, Bassett LW, Berg WA, et al: BI-RADS: Mammography, 4th edition. In: D’Orsi CJ, Mendelson EB, Ikeda DM, et al: Breast Imaging Reporting and Data System: ACR BI-RADS - Breast Imaging Atlas. American College of Radiology, Reston, VA, 2003
- 9. Gur D, Bandos AI, King JL, et al: Binary and multi-category ratings in a laboratory observer performance study: A comparison. Med Phys 35:4404–4409, 2008
- 10. Richards W: Stereopsis and stereoblindness. Exp Brain Res 10:380–388, 1970. doi: 10.1007/BF02324765
- 11. Gwet KL: Computing inter-rater reliability and its variance in the presence of high agreement. Br J Math Stat Psychol 61:29–48, 2008. doi: 10.1348/000711006X126600
- 12. Robin X, Turck N, Hainard A, et al: pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77, 2011
- 13. Rafferty E, Niklason LT: Comparison of FFDM with breast tomosynthesis to FFDM alone: Performance in fatty and dense breasts. Proc. Tomosynthesis Imaging Symposium: Frontiers in Research and Clinical Applications, Duke University, Durham, NC, USA, 2009 (unpublished)
- 14. Poplack SP, Tosteson TD, Kogel CA, Nagy HM, et al: Digital breast tomosynthesis: Initial experience in 98 women with abnormal digital screening mammography. AJR Am J Roentgenol 189:616–623, 2007
- 15. Gur D, Abrams GS, Chough DM, et al: Digital breast tomosynthesis: Observer performance study. AJR Am J Roentgenol 193:586–591, 2009
- 16. Good WF, Abrams GS, Catullo VJ, et al: Digital breast tomosynthesis: A pilot observer study. AJR Am J Roentgenol 190:865–869, 2008