Abstract
Purpose
Posterior acoustic shadow width has been proposed as a more accurate measure of kidney stone size compared to direct measurement of stone width on ultrasound (US). Published data in humans to date have been based on a research US system. Herein, we compare these two measurements in clinical US images.
Methods
Thirty patient image sets where computed tomography (CT) and US images were captured less than one day apart were retrospectively reviewed. Five blinded reviewers independently assessed the largest stone in each image set for shadow presence and size. Shadow size was compared to US and CT stone sizes.
Results
Eighty percent of included stones demonstrated an acoustic shadow; 83% of stones without a shadow were ≤ 5 mm on CT. Average stone size was 6.5 mm ± 4.0 on CT, 10.3 mm ± 4.1 on US, and 7.5 mm ± 4.2 by shadow width. On average, US overestimated stone size by 3.8 mm ± 2.4 based on stone width (p < 0.001) and 1.0 mm ± 1.4 based on shadow width (p < 0.0098). Shadow measurements decreased misclassification of stones by 25% among three clinically relevant size categories (≤ 5 mm, 5.1–10 mm, > 10 mm), and by 50% for stones ≤ 5 mm.
Conclusions
US overestimates stone size compared to CT. Retrospective measurement of the acoustic shadow from the same clinical US images is a more accurate reflection of true stone size than direct stone measurement. Most stones without a posterior shadow are ≤ 5 mm.
Keywords: Ultrasonography, calculi, nephrolithiasis, urolithiasis, computed tomography, size
Introduction
Stone size is a critical factor in determining management options for urolithiasis [1, 2]. Previous studies have demonstrated that ultrasound (US) tends to overestimate average stone size by 1.5 to 2.2 mm [3–5]. However, interest in US for the diagnosis and management of nephrolithiasis has grown amid concerns regarding cost and radiation exposure associated with computed tomography (CT), which remains the imaging modality of choice [6–8].
To improve stone size accuracy with US, the posterior acoustic shadow width has been proposed as an adjunctive measure. Using a research US platform, this approach has been shown to be more accurate than direct measurement of stone size in phantom models and human subjects [9, 10]. On US, renal stones appear as a hyperechoic signal with a hypoechoic shadow extending behind the stone. System settings and US imaging modality can affect the appearance of the hyperechoic boundaries, and therefore the measured stone size, whereas the acoustic shadow is generally unaffected [11].
In 2016, Sternberg and colleagues published a retrospective multi-institutional study of patients undergoing formal renal US and CT [5]. They reported an average overestimation of US stone size of 2.2 mm, with even greater overestimation for small stones ≤ 5 mm (mean 3.3 mm). In the present study, a subset of these US images was retrospectively reviewed for the presence of a posterior acoustic shadow. Shadow width was sized and compared to reported US and CT measurements. This is the first validation of posterior acoustic shadow measurements in clinical images obtained using a commercial US system.
Methods
As described in the multi-institutional study by Sternberg et al [5], data were originally collected from patients undergoing formal renal US and low-dose non-contrast CT within 1 day of each other, for three cohorts at the University of Vermont Medical Center, Massachusetts General Hospital, and Dartmouth-Hitchcock Medical Center. Length, width, and height were determined on axial, coronal, and sagittal sections for the largest stone on CT, and reference stone size was determined by the largest dimension. Clinical US examinations were performed by licensed sonographers and formally read by radiologists as part of clinical care. US stone size was based on the largest reported stone measurement in any dimension. Imaging was optimized per the sonographer’s preference without software or hardware system modifications.
In the present study, de-identified CT data and US images from the University of Vermont cohort were obtained for retrospective review. Stones with incomplete imaging and images with multiple overlapping stones were excluded. The largest stone in each remaining image set was then independently reviewed by 5 individuals (a sonographer, US engineer, endourologist, endourology fellow, and PGY4 urology resident) at the University of Washington for the presence and size of the posterior acoustic shadow. All reviewers were blinded to original US and CT stone measurements. Stone depth on US was recorded as a surrogate for body mass index (BMI).
Shadow Sizing Protocol
The method for sizing stone shadows has been reported previously [10]. US images were displayed using a MATLAB™ (MathWorks, Natick, MA) interface, which includes two moveable guide lines and a caliper. Reviewers were instructed to use the caliper to measure the shadow width, and the guide lines to help delineate the shadow borders as needed (Fig 1). There was also a check box if the reviewer felt no shadow was present. All measurements were made approximately 1 cm posterior to the stone. No numerical information was included to indicate stone size or caliper length.
Statistical Analysis
Stone shadow was deemed present if at least one reviewer made a size measurement. This was reported as a dichotomous variable. Absolute bias between stone and shadow measurements on US was determined and compared to the reference CT stone size using a linear mixed effects model to account for within-stone correlations. This was reported as mean bias. Average CT stone size, US stone size, and shadow width were also calculated.
Based on CT size measurements, all shadowing stones were placed into clinically relevant size categories previously used within the literature (≤5 mm, 5.1–10 mm, and >10 mm) [4, 5, 12]. Within these groups, concordance of each stone’s CT size with its reported US size and average shadow measurement was determined. For each size category, average US stone size, shadow size, and magnitude of deviation from CT stone size were also calculated.
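The categorization and concordance check described above can be sketched as follows. This is an illustrative sketch only: the function names `size_category` and `concordance` and the example measurements are hypothetical and are not taken from the study data or its SAS analysis. It also assumes sizes are recorded to 0.1 mm, so that no value falls between 5.0 and 5.1 mm.

```python
def size_category(size_mm):
    """Assign a stone size (mm) to one of the three categories used in the text."""
    if size_mm <= 5.0:
        return "<=5 mm"
    if size_mm <= 10.0:
        return "5.1-10 mm"
    return ">10 mm"


def concordance(ct_sizes, test_sizes):
    """Fraction of stones whose test measurement falls in the same category as CT."""
    pairs = list(zip(ct_sizes, test_sizes))
    agree = sum(size_category(ct) == size_category(t) for ct, t in pairs)
    return agree / len(pairs)


# Hypothetical measurements (mm) for four stones, for illustration only.
ct = [4.0, 4.5, 7.0, 12.0]
us = [8.0, 4.8, 9.0, 15.0]
shadow = [5.0, 4.9, 8.0, 13.0]

print(concordance(ct, us))      # fraction concordant using US stone size
print(concordance(ct, shadow))  # fraction concordant using shadow width
```

Misclassification is simply one minus the concordance returned here.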
Inter-rater reliability of shadow measurements was assessed by intra-class correlation (ICC) with a 95% confidence interval. Two-sided p-values < 0.05 were considered statistically significant. All analyses were performed in SAS 9.4 (SAS Institute, Cary, NC, USA). This retrospective study received institutional review board approval.
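For readers unfamiliar with the ICC, the sketch below computes one common form from a stones-by-reviewers table. The source does not state which ICC form was used or how stones not measured by every reviewer were handled, so this example assumes a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), with complete data. The function name `icc_2_1` and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per stone, each holding one
    measurement per reviewer (no missing values).
    """
    n = len(ratings)        # number of stones (targets)
    k = len(ratings[0])     # number of reviewers (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: stones, reviewers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical shadow widths (mm): 4 stones x 3 reviewers.
example = [
    [4.0, 4.5, 4.2],
    [7.1, 6.8, 7.4],
    [10.2, 11.0, 10.5],
    [5.5, 5.0, 5.8],
]
print(round(icc_2_1(example), 3))
```

Values near 1 indicate that most of the variation in measurements comes from differences between stones rather than differences between reviewers.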
Results
Forty-four image sets were evaluated for inclusion. Fourteen were excluded because of incomplete imaging, lack of imaging reports, or multiple overlapping stones in the images precluding identification of the primary stone or stone shadow. Of the 30 stones included for review, one was a ureteral stone and the remainder were renal stones. The mean CT stone size was 6.5 mm ± 4.0. Mean reported US stone size was 10.3 mm ± 4.1, at a mean depth of 6.8 cm ± 2.1.
Overall, 80% (24/30) of included stones demonstrated a posterior acoustic shadow. Average shadow width was 7.5 mm ± 4.2, with ICC of 0.86 (95% CI 0.77, 0.94) among all reviewers. Of those stones without a shadow, 83% (5/6) were ≤ 5 mm. A post-hoc power analysis showed a sample size of 14 pairs as sufficient to obtain 80% power at type I error level 0.05 in comparing US stone measurements.
Shadow width more closely approached CT stone size than US stone size (Fig 2), particularly for smaller stones. Reported US stone size consistently overestimated CT stone size with a mean bias of 3.8 mm ± 2.4 (p < 0.001), while shadow width demonstrated a mean bias of 1.0 mm ± 1.4 (p < 0.0098). The difference in mean absolute bias between US stone and shadow measurements was statistically significant (p < 0.0001). Using the shadow measurement, there was a greater than 3-fold improvement in the number of stones measuring within 1 mm of the CT measurement (10/24), compared to reported US stone size (3/24).
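The bias and within-1-mm comparisons above can be illustrated with a simplified sketch. The study itself used a linear mixed effects model in SAS to account for multiple reviewers per stone; the code below does not reproduce that model and only takes per-stone differences from CT. The function name `bias_summary` and the paired sizes are hypothetical. The final line uses the counts reported in the text (10/24 versus 3/24).

```python
from statistics import mean, stdev


def bias_summary(ct_sizes, test_sizes, tolerance_mm=1.0):
    """Return (mean bias, SD of bias, count within tolerance of CT)."""
    diffs = [t - ct for ct, t in zip(ct_sizes, test_sizes)]
    within = sum(abs(d) <= tolerance_mm for d in diffs)
    return mean(diffs), stdev(diffs), within


# Hypothetical paired sizes (mm) for five stones, for illustration only.
ct = [4.0, 5.0, 6.5, 9.0, 12.0]
us = [8.0, 8.5, 10.0, 12.5, 14.0]
shadow = [5.0, 5.5, 8.0, 9.5, 13.5]

us_bias, us_sd, us_within = bias_summary(ct, us)
sh_bias, sh_sd, sh_within = bias_summary(ct, shadow)
print(us_bias, us_within)
print(sh_bias, sh_within)

# Fold improvement reported in the text: 10/24 versus 3/24 stones within 1 mm.
print(round((10 / 24) / (3 / 24), 2))  # 3.33
```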
Shadowing stones were categorized by CT size: ≤5 mm (n = 12), 5.1–10 mm (n = 7), and >10 mm (n = 5) (Table 1). Mean CT stone size, mean reported US stone size, and mean shadow size for each group are listed in Table 2a. Using reported US stone size, 58% of all shadowing stones were misclassified; using shadow size, misclassification was reduced to 33%. The greatest degree of misclassification was among stones ≤ 5 mm, where misclassification was reduced from 91.7% using US stone measurements to 41.7% using shadow measurements. Absolute deviation from CT stone size for this subgroup was reduced by 63%, from 4.3 mm ± 2.7 to 1.6 mm ± 1.6 on average, respectively, with notably less variability (Table 2b).
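As a check on the arithmetic, the misclassification percentages quoted above can be recomputed from the counts in Table 1. The sketch below is illustrative; the helper name `misclassified_pct` is hypothetical, and the matrices simply transcribe the Table 1 counts (rows are the US-based category, columns are the CT category).

```python
# Rows: test-measurement category; columns: CT category.
# Category order in both directions: <=5 mm, 5.1-10 mm, >10 mm.
us_counts = [
    [1, 0, 0],
    [9, 4, 0],
    [2, 3, 5],
]
shadow_counts = [
    [7, 2, 0],
    [4, 4, 0],
    [1, 1, 5],
]


def misclassified_pct(counts, ct_column=None):
    """Percent of stones off the diagonal, overall or within one CT column."""
    cols = range(len(counts)) if ct_column is None else [ct_column]
    total = sum(counts[r][c] for r in range(len(counts)) for c in cols)
    correct = sum(counts[c][c] for c in cols)
    return 100.0 * (total - correct) / total


print(round(misclassified_pct(us_counts), 1))         # 58.3
print(round(misclassified_pct(shadow_counts), 1))     # 33.3
print(round(misclassified_pct(us_counts, 0), 1))      # 91.7
print(round(misclassified_pct(shadow_counts, 0), 1))  # 41.7
```

The differences, 58.3 − 33.3 and 91.7 − 41.7, correspond to the 25 and 50 percentage-point reductions cited in the abstract.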
Table 1.
a) US stone measurement | CT 0–5 mm | CT 5.1–10 mm | CT >10 mm |
---|---|---|---|
0–5 mm | 1 (8.3%) | 0 | 0 |
5.1–10 mm | 9 | 4 (57.1%) | 0 |
>10 mm | 2 | 3 | 5 (100%) |
b) Mean posterior acoustic shadow measurement | CT 0–5 mm | CT 5.1–10 mm | CT >10 mm |
---|---|---|---|
0–5 mm | 7 (58.3%) | 2 | 0 |
5.1–10 mm | 4 | 4 (57.1%) | 0 |
>10 mm | 1 | 1 | 5 (100%) |
Table 2a.
CT size category | CT size (mean ± SD) | US stone size (mean ± SD) | Posterior acoustic shadow size (mean ± SD) |
---|---|---|---|
0–5 mm (n = 12) | 4.0 mm ± 1.0 | 8.4 mm ± 2.3 | 5.2 mm ± 1.9 |
5.1–10 mm (n = 7) | 5.9 mm ± 0.9 | 9.6 mm ± 3.2 | 6.5 mm ± 2.7 |
>10 mm (n = 5) | 13.0 mm ± 2.7 | 15.2 mm ± 3.1 | 14.2 mm ± 2.4 |
Table 2b.
CT size category | % overestimated by US stone size (mean ± SD) | % overestimated by shadow size (mean ± SD) | % underestimated by shadow size (mean ± SD) |
---|---|---|---|
0–5 mm (n = 12) | 100% (4.3 mm ± 2.7) | 66.7% (2.0 mm ± 1.8) | 33.3% (0.63 mm ± 0.5) |
5.1–10 mm (n = 7) | 100% (3.7 mm ± 2.5) | 71.4% (1.8 mm ± 1.2) | 28.6% (2.1 mm ± 1.5) |
>10 mm (n = 5) | 100% (2.8 mm ± 1.6) | 80% (1.7 mm ± 1.6) | 20% (0.55 mm ± N/A) |
Discussion
Prior phantom and human studies using a research ultrasound platform have indicated that posterior acoustic shadow width measurement improves the accuracy of stone size measurement on US [9, 10]. This is the first study to validate this technique in human subjects using a clinical ultrasound system at a second institution. We demonstrate that retrospective measurement of shadow width improves size accuracy on US compared to reported stone size, and approaches the 1 mm accuracy of CT reported by Kishore and colleagues [13]. Shadow measurements also reduced misclassification of stones in clinically relevant size categories by 25% overall and 50% for stones ≤ 5 mm. Though the overall degree of size misclassification in this study is higher than reported by others using the same size categories (27–28% overall and 17–60% for stones ≤ 5 mm, respectively) [4, 12], this may be explained by the exclusion of non-shadowing stones from size classification, over 80% of which were ≤ 5 mm. As smaller stones have a high likelihood of spontaneous passage, improved size concordance using shadow measurements is clinically relevant [1, 2] and may impact clinical management decisions [14].
Posterior acoustic shadow measurements may be a particularly useful technique in the acute setting, where there is growing interest in ultrasound as first-line imaging for patients with suspected nephrolithiasis [6, 15]. This approach addresses the inherent tradeoffs with CT between optimizing size accuracy and minimizing radiation exposure. By improving stone size accuracy and identification of small, passable stones on US, triage and management decisions may be facilitated without CT, which can be reserved only for ambiguous cases. This may be most useful for children, young women, and recurrent stone formers, where minimizing ionizing radiation is a priority.
The software interface in this study simulates sizing calipers on standard ultrasound platforms, making this technique easy to apply within existing clinical ultrasound systems without any additional hardware or software modifications. Moreover, caliper use should already be familiar to practicing sonographers, radiologists, and clinicians. This allows the technique to be easily integrated into clinical use.
This is a small retrospective study subject to the potential biases of such a study design, and it has several additional limitations. The US images were obtained for clinical purposes and were not specifically optimized for the detection or sizing of the posterior acoustic shadow. Moreover, reviewed images were limited to only those saved within the formal study, rather than the full complement of ultrasound images seen in real time by the sonographer. However, this is reflective of US imaging in the “real world,” where the stone or posterior acoustic shadow may be sub-optimally captured due to time or operator dependence, and providers are dependent on the technical expertise of their sonographers.
As images were de-identified, no clinical data on body mass index or body habitus were available. Such factors potentially influence image quality and the appearance of the posterior acoustic shadow. In vitro, the accuracy of US stone size worsened with increasing depth, but this does not hold true for shadow measurements [9]. In this study, average stone depth for all included stones was 6.9 cm ± 2.1, with a maximum depth of 11.8 cm. This suggests that the shadow measurement remains feasible for obese patients.
CT and US stone measurements were obtained by a single reviewer, whereas shadow measurements were evaluated across five reviewers. However, this was accounted for with a linear mixed effects model. As an additional sensitivity analysis, a second provider reviewed the US images and re-measured stone size in a blinded fashion. Though average US stone size measurements were significantly different between reviewers (10.3 mm ± 4.1 vs. 8.4 mm ± 3.2, p < 0.0001; ICC 0.72), this did not alter stone size misclassification, or the superior accuracy of shadow measurement compared to US stone size (p < 0.0001 for reviewer 1 and p < 0.0099 for reviewer 2). Such variation is not unique to US, as CT stone size measurements have also been reported to vary among radiologists [16, 17]. Moreover, our reviewers included 2 individuals naïve to the shadow measuring technique, suggesting that the high ICC for shadow measurements is not dependent on experienced reviewers.
Despite these limitations, this is the first study to assess posterior acoustic shadow in human subjects with a commercial US unit in the clinical setting. Moreover, it validates the improved accuracy of this technique utilizing images obtained by blinded sonographers at a second institution. This technique requires no software or hardware modification and is available for immediate clinical use. Further multi-institutional studies with greater sample sizes may better clarify the accuracy, reproducibility, transferability, and clinical utility of this approach to measuring stone size on US.
Conclusions
Direct measurement of stones on US consistently overestimates stone size compared to CT imaging, while the posterior acoustic shadow width appears more accurate. This has been shown using clinical US images obtained in human subjects. On average, the shadow overestimates CT stone size by about 1 mm and decreases misclassification of stones within clinically relevant size categories, particularly for stones ≤ 5 mm. Most stones without a posterior shadow are ≤ 5 mm. These are clinically significant findings, as these stones have the highest probability of spontaneous passage.
Acknowledgments
This work is part of a large collaborative effort, and we appreciate the help of our many collaborators at the University of Vermont, University of Washington (UW) Center for Industrial and Medical Ultrasound, the UW Department of Urology, and within National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Program Project DK043881.
Funding: Funding was provided by the National Space Biomedical Research Institute through National Aeronautics and Space Administration (NASA) Grant NCC 9-58 and National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Grant DK043881.
Abbreviations
- CT
computed tomography
- US
ultrasound
- BMI
body mass index
- ICC
intra-class correlation
Footnotes
Author contributions:
JC Dai: Project development, data collection and management, data analysis, manuscript writing/editing
B Dunmire: Project development, data collection and management, data analysis, manuscript writing/editing
KM Sternberg: Project development, manuscript editing
Z Liu: Data analysis, manuscript editing
T Larson: Data collection and management, manuscript editing
J Thiel: Data collection, manuscript editing
HC Chang: Data collection, manuscript editing
JD Harper: Project development, manuscript editing
MR Bailey: Project development, manuscript writing/editing
MD Sorensen: Project development, data collection, manuscript editing
Conflict of Interest: Michael R. Bailey, Barbrina Dunmire, and Mathew D. Sorensen have equity in and consulting agreements with SonoMotion Inc. which has licensed intellectual property from the University of Washington related to this technology. For the remaining authors, no competing conflicts of interest exist.
Ethical approval: All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. For this type of study formal consent is not required.
References
- 1. Hubner WA, Irby P, Stoller ML. Natural history and current concepts for the treatment of small ureteral calculi. Eur Urol. 1993;24:172–176. doi: 10.1159/000474289.
- 2. Assimos D, et al. Surgical Management of Stones: American Urological Association/Endourological Society Guideline, PART II. J Urol. 2016;196:1161–1169. doi: 10.1016/j.juro.2016.05.091.
- 3. Fowler KAB, Locken JA, Duchesne JH, Williamson MR. US for detecting renal calculi with nonenhanced CT as a reference standard. Radiology. 2002;222:109–13. doi: 10.1148/radiol.2221010453.
- 4. Ray AA, Ghiculete D, Pace KT, Honey RJD. Limitations to ultrasound in the detection and measurement of urinary tract calculi. Urology. 2010;76:295–300. doi: 10.1016/j.urology.2009.12.015.
- 5. Sternberg KM, et al. Ultrasonography Significantly Overestimates Stone Size When Compared to Low-dose, Noncontrast Computed Tomography. Urology. 2016;95:67–71. doi: 10.1016/j.urology.2016.06.002.
- 6. Smith-Bindman R, et al. Ultrasonography vs Computed Tomography for Suspected Nephrolithiasis. N Engl J Med. 2014;371:1100–1110. doi: 10.1056/NEJMoa1404446.
- 7. Ferrandino MN, et al. Radiation Exposure in the Acute and Short-Term Management of Urolithiasis at 2 Academic Centers. J Urol. 2009;181:668–673. doi: 10.1016/j.juro.2008.10.012.
- 8. Coursey CA, et al. ACR Appropriateness Criteria® Acute Onset Flank Pain–Suspicion of Stone Disease. Ultrasound Q. 2012;28:227–233. doi: 10.1097/RUQ.0b013e3182625974.
- 9. Dunmire B, et al. Use of the Acoustic Shadow Width to Determine Kidney Stone Size with Ultrasound. J Urol. 2016;195:171–176. doi: 10.1016/j.juro.2015.05.111.
- 10. May P, et al. Stone-Mode Ultrasound for Determining Renal Stone Size. J Endourol. 2016;30:958–962. doi: 10.1089/end.2016.0341.
- 11. Dunmire B, et al. Tools to Improve the Accuracy of Kidney Stone Sizing with Ultrasound. J Endourol. 2015;29:147–152. doi: 10.1089/end.2014.0332.
- 12. Kanno T, et al. The efficacy of ultrasonography for the detection of renal stone. Urology. 2014;84:285–288. doi: 10.1016/j.urology.2014.04.010.
- 13. Kishore TAA, Pedro RN, Hinck B, Monga M. Estimation of Size of Distal Ureteral Stones: Noncontrast CT Scan Versus Actual Size. Urology. 2008;72:761–764. doi: 10.1016/j.urology.2008.05.047.
- 14. Ganesan V, De S, Greene D, Torricelli FCM, Monga M. Accuracy of ultrasonography for renal stone detection and size determination: is it good enough for management decisions? BJU Int. 2017;119:464–469. doi: 10.1111/bju.13605.
- 15. Sternberg KM, Littenberg B. Trends in imaging use for the evaluation and follow-up of kidney stone disease: A single center experience. J Urol. 2017;198:383–388. doi: 10.1016/j.juro.2017.01.072.
- 16. Patel SR, et al. Automated renal stone volume measurement by noncontrast computerized tomography is more reproducible than manual linear size measurement. J Urol. 2011;186:2275–2279. doi: 10.1016/j.juro.2011.07.091.
- 17. Lidén M, Andersson T, Geijer H. Making renal stones change size-impact of CT image post processing and reader variability. Eur Radiol. 2011;21:2218–2225. doi: 10.1007/s00330-011-2171-x.