Author manuscript; available in PMC: 2024 Oct 1.
Published in final edited form as: J Magn Reson Imaging. 2023 Jan 16;58(4):1153–1160. doi: 10.1002/jmri.28593

Effect of Averaging Measurements From Multiple MRI Pulse Sequences on Kidney Volume Reproducibility in Autosomal Dominant Polycystic Kidney Disease

Hreedi Dev 1, Chenglin Zhu 1, Arman Sharbatdaran 1, Syed I Raza 1, Sophie J Wang 1, Dominick J Romano 1, Akshay Goel 1, Kurt Teichman 1, Mina C Moghadam 1, George Shih 1, Jon D Blumenfeld 2,3, Daniil Shimonov 2,3, James M Chevalier 2,3, Martin R Prince 1,4,*
PMCID: PMC10947493  NIHMSID: NIHMS1973752  PMID: 36645114

Abstract

Background:

Total kidney volume (TKV) is an important biomarker for assessing kidney function, especially for autosomal dominant polycystic kidney disease (ADPKD). However, TKV measurements from a single MRI pulse sequence have limited reproducibility, ± ~5%, similar to ADPKD annual kidney growth rates.

Purpose:

To improve TKV measurement reproducibility on MRI by extending artificial intelligence algorithms to automatically segment kidneys on T1-weighted, T2-weighted, and steady state free precession (SSFP) sequences in axial and coronal planes and averaging measurements.

Study Type:

Retrospective training, prospective testing.

Subjects:

Three hundred ninety-seven patients (356 with ADPKD, 41 without), 75% for training and 25% for validation, 40 ADPKD patients for testing and 17 ADPKD patients for assessing reproducibility.

Field Strength/Sequence:

T2-weighted single-shot fast spin echo (T2), SSFP, and T1-weighted 3D spoiled gradient echo (T1) at 1.5 and 3T.

Assessment:

2D U-net segmentation algorithm was trained on images from all sequences. Five observers independently measured each kidney volume manually on axial T2 and using model-assisted segmentations on all sequences and image plane orientations for two MRI exams in two sessions separated by 1–3 weeks to assess reproducibility. Manual and model-assisted segmentation times were recorded.

Statistical Tests:

Bland–Altman analysis, Shapiro–Wilk test (normality assessment), Pearson’s chi-squared test (categorical variables); Dice similarity coefficient, intraclass correlation coefficient, and concordance correlation coefficient for analyzing TKV reproducibility. A P-value < 0.05 was considered statistically significant.

Results:

In 17 ADPKD subjects, model-assisted segmentations of axial T2 images were significantly faster than manual segmentations (2:49 minutes vs. 11:34 minutes), with no significant difference in the absolute percent difference in TKV between scans 1 and 2 (5.3% vs. 5.9%, P = 0.88). Absolute percent differences between the two scans for model-assisted segmentations on other sequences were 5.5% (axial T1), 4.5% (axial SSFP), 4.1% (coronal SSFP), and 3.2% (coronal T2). Averaging measurements from all five model-assisted segmentations significantly reduced absolute percent difference to 2.5%, further improving to 2.1% after excluding an outlier.

Data Conclusion:

Measuring TKV on multiple MRI pulse sequences in coronal and axial planes is practical with deep learning model-assisted segmentations and can improve TKV measurement reproducibility more than 2-fold in ADPKD.

Evidence Level:

2

Technical Efficacy:

Stage 1


Total kidney volume (TKV) indexed to patient height, known as height-adjusted total kidney volume (ht-TKV), is an important renal biomarker. It is used for tracking disease progression in autosomal dominant polycystic kidney disease (ADPKD), as well as predicting risk of developing end-stage kidney disease and helping to determine eligibility for tolvaptan therapy.1,2

Measuring TKV is a tedious, operator-dependent process requiring manual contouring of each kidney on every slice of CT or MRI scans. MRI is preferred over CT to avoid unnecessary radiation exposure, in measurements that are often repeated many times over the patient’s lifetime.

However, these measurements have notable interobserver variability, with root mean square errors as high as 11%.3 Since mean TKV in ADPKD increases by about 5% per year, but can be quite variable, patients often must wait years between measurements to reliably determine their rate of ht-TKV increase.4 There is, therefore, a need for kidney volume measurements with better reproducibility, which would facilitate measurement of ht-TKV changes occurring over shorter intervals.

Recently, deep learning-based automated kidney segmentation on axial T2-weighted images, coronal T2-weighted images with and without fat saturation, and coronal T1-weighted images has shown promising results, with accuracy in the range of 90% to 97% using manual contouring as the ground truth.5-10 Although the accuracy of automating renal contouring may not exceed that of manual contouring for an individual MRI pulse sequence, its efficiency opens the possibility of obtaining kidney volume measurements from multiple pulse sequences and image plane orientations in an MRI exam. Since averaging N independent measurements reduces the standard deviation of random error by the square root of N, the mean volume obtained by averaging measurements from five MRI pulse sequences acquired in the same exam can be expected to reduce variability by the square root of five (i.e., 2.2-fold).
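The square-root-of-N argument above can be checked with a small simulation. This is a minimal sketch, not part of the study: it assumes each measurement carries independent, zero-mean Gaussian error with a 5% standard deviation, and the function name `simulate_sd_of_mean` and its parameters are hypothetical. Measurements from different sequences in the same exam may share systematic error, so the real-world gain could be smaller than this idealized case.

```python
import math
import random
import statistics


def simulate_sd_of_mean(n_measurements, sd_single=5.0, n_trials=20000, seed=0):
    """Empirical SD (in percent) of the mean of n independent noisy measurements."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # Assumption: errors are independent, zero-mean, Gaussian.
        errors = [rng.gauss(0.0, sd_single) for _ in range(n_measurements)]
        means.append(statistics.fmean(errors))
    return statistics.pstdev(means)


sd_one = simulate_sd_of_mean(1)
sd_five = simulate_sd_of_mean(5)
print(f"SD, single measurement: {sd_one:.2f}%")
print(f"SD, mean of five:       {sd_five:.2f}%")
print(f"Reduction: {sd_one / sd_five:.2f}-fold (theory: {math.sqrt(5):.2f}-fold)")
```

Under these assumptions the simulated reduction should land close to the theoretical value of about 2.2-fold.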

Thus, the aim of this study was to extend the deep learning automation of kidney volume measurements from axial T2-weighted images to T1-weighted and steady state free precession (SSFP) images in both axial and coronal image planes, and to average the TKV values from each in order to improve TKV reproducibility.

Materials and Methods

This Institutional Review Board approved, HIPAA compliant study utilized existing images from 397 patients (356 with ADPKD, 41 without ADPKD) for training the deep learning algorithm (Fig. 1). Demographic data for all patients and creatinine values were obtained to calculate the average estimated glomerular filtration rate (eGFR). Performance of the model was assessed on 20 external and 20 prospectively acquired ADPKD subject data sets that were not included in the training/validation pool. An additional 17 ADPKD subjects prospectively underwent MRI scanning twice within a 3-week interval. These subjects undergoing two MRI scans to assess reproducibility and for prospective testing signed informed research consent prior to imaging. The requirement for informed consent was waived for the subjects contributing images to algorithm development and retrospective data analysis. No participants were treated with tolvaptan, a type 2 vasopressin receptor antagonist FDA approved for the treatment of ADPKD.11 Test set and reproducibility subjects were scanned with T2-weighted fast spin echo (T2; axial and coronal), SSFP (axial and coronal), and T1-weighted 3D gradient echo sequences (T1; axial). Subjects in the training and validation sets were scanned with a mixture of these sequences. MRI scan protocol details are provided in Table S1 in the Supplemental Material. All subjects were requested to fast for 2 hours prior to MRI.

FIGURE 1:

Patient flowchart.

Artificial Intelligence Model Development

This work utilizes a previously reported open-source deep learning-based 2D U-net model with an EfficientNet framework pretrained on ImageNet and then trained for kidney segmentation on axial T2 and axial SSFP images from 117 patients.5,12 Our repository contains the code for training and inference, as well as the model weights.13 The published models were refined by repeating training with additional axial T2 labeled images (N = 347) split 75:25 (training:validation), using cross-validation and stratifying by TKV and scanner type as before.
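A stratified 75:25 split of the kind described above can be sketched as follows. This is an illustration only and is not taken from the authors' repository: the record fields (`tkv_bin`, `scanner`), the bin labels, and the function name `stratified_split` are all hypothetical, and the actual stratification thresholds are not given in the text.

```python
import random
from collections import defaultdict


def stratified_split(cases, train_fraction=0.75, seed=0):
    """Split cases into train/validation lists within each (TKV bin, scanner) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[(case["tkv_bin"], case["scanner"])].append(case)

    train, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)  # randomize order inside the stratum
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        validation.extend(members[n_train:])
    return train, validation


# Hypothetical cases: 48 records spread evenly over four strata.
cases = [
    {
        "id": i,
        "tkv_bin": "small" if i % 2 == 0 else "large",
        "scanner": "1.5T" if i % 4 < 2 else "3T",
    }
    for i in range(48)
]
train, validation = stratified_split(cases)
print(len(train), len(validation))  # 36 12
```

Splitting inside each stratum keeps the proportion of kidney sizes and scanner types similar in the training and validation sets, which is the point of stratifying.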

The same model was used to train on images from additional pulse sequences including 2D coronal T2 (N = 133), 2D axial SSFP (N = 83), 2D coronal SSFP (N = 30), and 3D axial T1 (N = 84). We pretrained the model with the axial T2 data before training for coronal T2, axial T1 or axial SSFP images. Because the coronal SSFP images share the same plane as coronal T2, we pretrained the network with coronal T2 prior to training with the coronal SSFP images. Model performance on each sequence was evaluated on independent external and prospective test sets using radiologist (MRP), with over 25 years of experience, corrected model outputs as the ground truth.

Measuring Reproducibility

The set of checkpoints with the lowest Dice similarity coefficient (DSC) loss for each sequence was used to obtain model inferences on 17 ADPKD subjects who were scanned twice within a 3-week interval during which there were no clinical events. For each case, five independent observers corrected the model contours while noting the time required: four with experience segmenting at least 100 cases (AS, SIR, HD, and CZ) and one inexperienced observer (SW) who had labeled fewer than 10 cases. To prevent memory bias, the corrections were assigned and completed in random order with a minimum 1-week interval between correcting annotations on the same patient. To determine how the time required for these corrections compared to the typical times spent on manual TKV measurements, the axial T2-weighted scans were manually contoured by each observer while noting the time required.

Statistical Analysis

Continuous variables were characterized as mean ± standard deviation when normally distributed, and as median and interquartile range if not normally distributed. Normality was assessed using the Shapiro–Wilk test. For categorical variables, significance of comparisons was assessed using Pearson’s chi-squared test. The DSC, intraclass correlation coefficient (ICC), and concordance correlation coefficient (CCC) were used to compare 1) manual vs. model-assisted segmentations for the axial T2 images, 2) model-assisted segmentations for successive exams across all five MRI pulse sequences, 3) the average of the model-assisted segmentation for successive exams, 4) the average model-assisted segmentation after excluding the measurement from the sequence which had the greatest deviation from the mean for successive exams, and 5) the average model-assisted segmentation after excluding all measurements >10% different or <−10% different from the mean for successive exams.14,15
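Two of the agreement metrics named above have compact definitions that can be sketched in a few lines. The following is a minimal illustration using the standard formulas for the Dice similarity coefficient and Lin's concordance correlation coefficient. It is not the authors' analysis code: the function names and the toy inputs are hypothetical, and masks are represented as flat lists of 0/1 voxels for simplicity.

```python
import statistics


def dice_coefficient(mask_a, mask_b):
    """DSC = 2|A and B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    return 1.0 if total == 0 else 2.0 * intersection / total


def concordance_ccc(x, y):
    """Lin's CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    vx, vy = statistics.pvariance(x), statistics.pvariance(y)
    cov = statistics.fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)


# Toy masks: 2 shared voxels, 3 labeled voxels in each mask.
print(dice_coefficient([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0]))
# Identical measurements give perfect concordance; a constant offset lowers it.
print(concordance_ccc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(concordance_ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

Unlike the Pearson correlation, the CCC penalizes a systematic offset between two sets of measurements, which is why the third call returns less than 1 even though the values are perfectly correlated.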

A Bland–Altman analysis of mean absolute percent difference was used to assess the agreement between the first and second MRI scans obtained within 3 weeks of each other for each pulse sequence used in the 17 ADPKD patients. Pearson correlation coefficients and CCCs were calculated to evaluate interobserver agreement across the five pulse sequences. The reproducibility of kidney volume measurements in these two sets of scans was assessed using Welch’s t-test. The significance of the mean absolute percent differences between the first and second scans across the five MR pulse sequences was assessed with a paired Student’s t-test (significance level = 0.05).
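The scan-rescan quantities used in this analysis can be sketched as below. The text does not state the exact formula for absolute percent difference, so this sketch assumes the difference is normalized by the mean of the two measurements. The function names and the TKV values are hypothetical, and the 1.96 multiplier gives the conventional 95% limits of agreement.

```python
import statistics


def absolute_percent_difference(v1, v2):
    """|v1 - v2| relative to the mean of the two measurements, in percent (assumed definition)."""
    return abs(v1 - v2) / ((v1 + v2) / 2.0) * 100.0


def bland_altman(scan1, scan2):
    """Return (bias, lower limit, upper limit) for the scan2 - scan1 differences."""
    diffs = [b - a for a, b in zip(scan1, scan2)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical TKV values (arbitrary volume units) for four subjects, two scans each.
scan1 = [1000.0, 1500.0, 800.0, 2200.0]
scan2 = [1040.0, 1470.0, 820.0, 2150.0]

apd = [absolute_percent_difference(a, b) for a, b in zip(scan1, scan2)]
bias, lower, upper = bland_altman(scan1, scan2)
print(f"mean absolute percent difference: {statistics.fmean(apd):.2f}%")
print(f"bias: {bias:.1f}, limits of agreement: {lower:.1f} to {upper:.1f}")
```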

Results

Demographic data along with the number of DICOM images and mean eGFR for the patients utilized in training/validation of each MRI pulse sequence, as well as the 20 patients in the prospective set, 20 patients in the external set, and 17 patients scanned twice (within 3 weeks) for reproducibility analysis are shown in Table 1. There were no significant differences in age (P = 0.44), gender (P = 0.86), and eGFR (P = 0.37) between the groups.

TABLE 1.

Clinical and demographic characteristics for the training/validation as well as prospective, external and reproducibility test sets.

| Parameters | Training/Validation: Axial T2 | Training/Validation: Axial SSFP | Training/Validation: Axial T1 | Training/Validation: Coronal T2 | Training/Validation: Coronal SSFP | Prospective Test Set | External Test Set | Reproducibility Test Set |
|---|---|---|---|---|---|---|---|---|
| Number of Patients | 347 | 83 | 84 | 133 | 30 | 20 | 20 | 17 |
| DICOM images | 24,879 | 4253 | 11,945 | 6026 | 1248 | 7424 | 3738 | 11,857 |
| Male:Female (%male) | 160:187 (46%) | 35:48 (42%) | 42:42 (50%) | 63:70 (47%) | 12:18 (40%) | 10:10 (50%) | 11:9 (55%) | 7:10 (41%) |
| Age at scan (years) | 49 ± 14 | 48 ± 14 | 49 ± 16 | 49 ± 14 | 48 ± 14 | 45 ± 16 | 47 ± 14 | 48 ± 14 |
| eGFR (mL/minute/1.73 m2) | 71 ± 32 | 67 ± 30 | 66 ± 29 | 67 ± 30 | 68 ± 30 | 74 ± 28 | 63 ± 35 | 67 ± 31 |

Note: Categorical variable values are reported as number followed by percentage; continuous variables are reported as mean ± SD or median (interquartile range). eGFR = estimated glomerular filtration rate.

Model Training and Validation

Model validation results for each MRI pulse sequence are shown in Table S2a-c in the Supplemental Material. The highest mean DSC values with the lowest mean percent absolute error were observed with segmentations on axial T2-weighted images, consistent with the model having been trained on the largest number of patients for axial T2. For axial T2, DSC was 0.99 and 0.97 in the prospective, and 0.96 and 0.95 in the external test sets for right and left kidneys, respectively. The average DSC across all five pulse sequences was 0.96 in the right kidney and 0.94 in the left kidney on the prospective evaluation and 0.94 and 0.93 on the external test set for right and left kidneys, respectively.

Time for Manual and Model-Assisted Segmentations

The average time required to segment right and left kidneys across the five observers (Table 2) was significantly less for model-assisted segmentation than for manual segmentation of axial T2-weighted images (2:49 minutes vs. 11:34 minutes). Completing model-assisted segmentations on all five sequences required a mean of 21:01 minutes for the five observers. The axial T1 images required substantially more time to correct than the other sequences, 8:29 minutes, because axial T1 had twice as many slices and generally the lowest DSC compared to the other sequences (Fig. 2; Table S2 in the Supplemental Material).
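The timing figures quoted above can be cross-checked with simple arithmetic on the minutes:seconds values. This is only a sketch with hypothetical helper names (`to_seconds`, `to_mmss`). Summing the five rounded per-sequence totals gives 21:00, one second off the reported 21:01, presumably because the reported mean was computed before rounding.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Convert a number of seconds back to a 'minutes:seconds' string."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


manual_axial_t2 = to_seconds("11:34")
model_assisted = {
    "axial T2": "2:49",
    "axial T1": "8:29",
    "axial SSFP": "3:46",
    "coronal T2": "2:40",
    "coronal SSFP": "3:16",
}

total = sum(to_seconds(t) for t in model_assisted.values())
speedup = manual_axial_t2 / to_seconds(model_assisted["axial T2"])
print(to_mmss(total))     # 21:00
print(round(speedup, 1))  # 4.1
```

On these numbers, model assistance is roughly four times faster than manual contouring for axial T2, and all five model-assisted sequences together take a little under twice the time of one manual measurement.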

TABLE 2.

Average segmentation times (minutes:seconds)

| | Manual: Axial T2 | Model-assisted: Axial T2 | Model-assisted: Axial T1 | Model-assisted: Axial SSFP | Model-assisted: Coronal T2 | Model-assisted: Coronal SSFP | 5-Sequence Time (model-assisted) |
|---|---|---|---|---|---|---|---|
| Right kidney | 5:29 ± 2:37* | 1:31 ± 1:52 | 3:28 ± 2:10 | 1:55 ± 0:49 | 1:32 ± 2:12 | 1:36 ± 0:32 | 10:01 ± 7:36 |
| Left kidney | 6:05 ± 2:49* | 1:18 ± 0:47 | 5:01 ± 7:02 | 1:52 ± 0:41 | 1:09 ± 0:37 | 1:40 ± 0:40 | 11:00 ± 9:47 |
| Total | 11:34 ± 5:26* | 2:49 ± 2:39 | 8:29 ± 9:13 | 3:46 ± 1:30 | 2:40 ± 2:49 | 3:16 ± 1:12 | 21:01 ± 17:23 |

* Significantly different from model-assisted axial T2, P < 0.006

FIGURE 2:

Variability of masks created by five observers for each pulse sequence. The color indicates agreement as follows: Red—Five observers agree; Yellow—Four observers agree; Green—Three observers agree; Blue—Two observers agree; Purple—One observer labeled these voxels (no agreement). (a) Axial T2 weighted image. (b) Axial Steady State Free Precession. (c) Axial T1 weighted image. (d) Coronal T2 weighted image. (e) Coronal Steady State Free Precession image.

Reproducibility of Fully Manual vs. Model-Assisted Contouring

Table 3 shows the absolute percent difference in the TKV measurements between two MRI scans performed within 3 weeks where no changes in kidney volumes were expected and Fig. 3 shows Bland–Altman plots for TKV measurement agreement. ICC, CCC, and the coefficient of variation are shown in Table S3 in the Supplemental Material for right kidney, left kidney, and TKV. The average absolute percent difference among the five observers on all five sequences was 4.5%, ranging from 3.2% on the coronal T2-weighted pulse sequence to 5.5% on the axial T1-weighted pulse sequence with model assistance. There was a nonsignificant trend toward better reproducibility with coronal sequences with mean absolute percent differences of 3.2% and 4.1% for coronal T2 and coronal SSFP respectively, compared to 5.3% for axial T2, 4.5% axial SSFP, and 5.5% axial T1, P = 0.12 (Fig. 4).

TABLE 3.

Absolute percent difference between TKV measurements on two consecutive MRI scans for each MRI pulse sequence, the average of all five sequences, and averages after excluding one outlier sequence with the volume measurement having greatest difference from the mean or after excluding all outlier sequences >10% different from the mean.

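| Total Kidney Volume | Reader 1 | Reader 2 | Reader 3 | Reader 4 | Reader 5 | All Readers | Mean TKV* |
|---|---|---|---|---|---|---|---|
| Contouring experience** | Yes | Yes | Yes | Yes | No | | |
| Manual contouring: Axial T2 | 9% | 5.6% | 4.3% | 4.9% | 5.9% | 5.9% | 1490 |
| Model-assisted contouring: Axial T2 | 6.7% | 5.2% | 4.8% | 4.6% | 5.1% | 5.3% | 1497 |
| Model-assisted contouring: Axial T1 | 5.1% | 6.3% | 5.3% | 4.9% | 5.9% | 5.5% | 1353 |
| Model-assisted contouring: Axial SSFP | 5.2% | 3.8% | 4.0% | 4.2% | 5.1% | 4.5% | 1418 |
| Model-assisted contouring: Coronal T2 | 3.1% | 2.6% | 3.2% | 3.5% | 3.8% | 3.2% | 1397 |
| Model-assisted contouring: Coronal SSFP | 4.6% | 5.2% | 3.2% | 3.3% | 4.0% | 4.1% | 1466 |
| Model-assisted contouring: Average of individual sequences | 4.9% | 4.6% | 4.1% | 4.1% | 5.0% | 4.5% | 1434 |
| Model-assisted contouring: Average of all sequences | 3.2% | 2.4% | 2.2% | 2.2% | 2.4% | 2.5% | 1434 |
| Model-assisted contouring: Average excluding one outlier | 2.9% | 2.2% | 1.5% | 1.5% | 2.4% | 2.1% | 1423 |
| Model-assisted contouring: Average excluding >10% outliers | 2.3% | 2.0% | 2.2% | 1.8% | 2.7% | 2.2% | 1418 |

* Mean TKV of all 17 ADPKD subjects

** “Yes” indicates experience labeling at least 100 cases; “No” indicates experience labeling fewer than 10 cases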

FIGURE 3:

Bland–Altman plots of the difference between total kidney volume measurements obtained in the first and second scans for the 17 ADPKD subjects using model-assisted segmentations on the (a) axial T2, (b) axial T1, (c) axial SSFP, (d) coronal SSFP, (e) coronal T2, and (f) the average of the five sequences.

FIGURE 4:

Variability of the masks by scan and labeling method between five observers. The color indicates agreement as follows: Red—Five observers agree; Yellow—Four observers agree; Green—Three observers agree; Blue—two observers agree; Purple—One observer labeled these voxels (no agreement). (a) First scan, manual labeling (top left). (b) First scan, model assisted labeling (top right). (c) Second scan, manual labeling (bottom left). (d) Second scan, model assisted labeling (bottom right).

Averaging the volumes measured on all five pulse sequences in each exam significantly reduced the absolute percent difference (averaged over five observers) from 4.5% to 2.5%. Compared to manual contouring, the absolute percent difference significantly reduced from 5.9% to 2.5%. A further improvement in reproducibility was achieved by excluding outliers. Excluding outliers defined as the TKV measurement farthest from the mean and averaging the TKV measurements from the remaining four sequences resulted in a significantly lower absolute percent difference of 2.1% compared to a mean of 4.5% without averaging. Excluding outliers defined as TKV measurements >10% higher or lower than the mean yielded a 2.2% absolute percent difference.
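The averaging and outlier-exclusion rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the five volumes are hypothetical, and the 10% rule is assumed to be applied once, relative to the mean of all five measurements.

```python
import statistics


def mean_all(volumes):
    """Simple average of the per-sequence TKV measurements."""
    return statistics.fmean(volumes)


def mean_excluding_farthest(volumes):
    """Drop the single measurement farthest from the mean, then average the rest."""
    center = statistics.fmean(volumes)
    farthest = max(range(len(volumes)), key=lambda i: abs(volumes[i] - center))
    kept = [v for i, v in enumerate(volumes) if i != farthest]
    return statistics.fmean(kept)


def mean_excluding_threshold(volumes, threshold_pct=10.0):
    """Drop every measurement more than threshold_pct away from the mean, then average."""
    center = statistics.fmean(volumes)
    kept = [v for v in volumes if abs(v - center) / center * 100.0 <= threshold_pct]
    return statistics.fmean(kept) if kept else center


# Hypothetical TKV measurements from five sequences in one exam; the last is an outlier.
volumes = [1500.0, 1360.0, 1420.0, 1400.0, 1830.0]
print(mean_all(volumes))                  # 1502.0
print(mean_excluding_farthest(volumes))   # 1420.0
print(mean_excluding_threshold(volumes))  # 1420.0
```

The first rule always discards exactly one measurement, while the threshold rule discards none when all five measurements agree to within 10% and can discard several when they do not.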

There was no significant difference between the performance of the least experienced observer and the experienced observers.

Discussion

It has not been practical to segment kidneys enlarged by ADPKD on more than one pulse sequence from an MRI exam because of the time required, even though abdominal MRI exams routinely include five or more sequences suitable for organ volume measurements. Accelerating kidney segmentations by extending a deep learning model developed for axial T2 images to also work on T1 and SSFP sequences acquired in coronal and axial planes reduces processing times and makes it practical to measure organ volumes on multiple sequences of an abdominal MRI exam. In this study, we found in 17 ADPKD patients imaged twice within a 3-week interval (where no change in kidney volume was expected) that kidney volume measurement variability decreased from an average of 5.9% absolute difference for a single sequence contoured manually, to 2.5% absolute difference after averaging measurements obtained from five sequences. This was further reduced to 2.1% absolute difference by excluding the measurement with greatest deviation from the mean prior to averaging the remaining four volume measurements.

High reproducibility for kidney volume measurements in ADPKD is important because kidney volumes typically increase by only 5% annually.2 Reducing TKV measurement variability to 2.1% has the potential to detect changes in organ volumes over a shorter time interval. This may provide more accurate and timely information for patients, including responses to established (e.g., tolvaptan) and investigational therapeutic interventions.11,16

Currently, not all radiologists have access to artificial intelligence technology. Under these circumstances, our data suggest a trend toward coronal acquisitions being more reproducible than axial acquisitions. We hypothesize that coronal acquisitions may have fewer breath holding issues compared to axial acquisitions, which sometimes extend into the pelvis due to enlarged polycystic liver and kidneys. However, in some patients, axial measurements were more reproducible than coronal T2. Accordingly, we still favor analyzing both axial and coronal sequences, focusing on sequences with the best reproducibility. Performing multiple volume measurements identifies outlier measurements which can be excluded to further improve reproducibility.

In this study, T2-weighted images were biased toward larger volumes while axial T1-weighted images were biased toward smaller volumes. There might have been closer agreement if the T1-weighted measurements had been performed after contrast administration, but this was beyond the scope of the current study in which all subjects were imaged without contrast.3 SSFP, based upon T2/T1 contrast, produced volume measurements that closely matched the average of all five sequences. This suggests that any patient being followed with a TKV measurement algorithm utilizing only one sequence should use the same pulse sequence for every follow-up measurement. Since the SSFP sequence produced measurements closest to the average of all sequences and is an efficient acquisition that produces higher resolution data in less time, it might be considered favorable for single sequence measurements if resources are limited. However, SSFP may have more artifacts, especially in large field-of-view acquisitions required to cover enlarged kidneys, potentially reducing reproducibility and confidence.17

Limitations

The processing time for our deep learning 2D U-net model-assisted analysis of all five acquisitions in this study was over 20 minutes. However, there are many opportunities for improving the model to reduce correction times. Each model-assisted measurement represents additional labeled data that can be used for retraining the algorithm for continuous improvement. With more data, a 3D model may become practical, improving accuracy by incorporating an additional spatial dimension into discriminating kidney from background voxels. Furthermore, a systematic approach to analyzing the outliers for correctable errors may improve measurement accuracy. We only explored simple averaging and outlier exclusion prior to averaging; other approaches to averaging could include test-time augmentation, Bayesian inference, and other more complex approaches.18,19 Finally, there may be ways of improving data acquisition. Another limitation is the absence of a true ground truth for the volume measurements, which is difficult to obtain since ADPKD kidneys are rarely removed even at the time of transplantation. Also, labeled data was not available for every sequence for every patient, which reduced model performance on some pulse sequences. However, use of model-assisted contours resulted in full volume measurements for every sequence being comparable to fully manual contouring. We made no effort here to optimize the MRI data acquisition using dedicated, optimized pulse sequences; this offers additional opportunities to improve reproducibility.

Conclusion

This study demonstrated the utility of measuring organ volumes on multiple MRI acquisitions with different sequences and image plane orientations to leverage the power of averaging for improving measurement reproducibility. Better reproducibility may enable TKV measurements on MRI to be performed at shorter intervals to assess changes in disease severity and treatment responses more quickly.

Supplementary Material

Supplementary Materials

Acknowledgment

This study was supported by the Weill Cornell Medical College (WCMC) Clinical and Translational Science Center (CTSC) (UL1TR002384) and the Shaw Foundation.

References

1. Chapman AB, Bost JE, Torres VE, et al. Kidney volume and functional outcomes in autosomal dominant polycystic kidney disease. Clin J Am Soc Nephrol 2012;7(3):479–486.
2. Grantham JJ, Torres VE. The importance of total kidney volume in evaluating progression of polycystic kidney disease. Nat Rev Nephrol 2016;12(11):667–677.
3. Bae KT, Tao C, Zhu F, et al. MRI-based kidney volume measurements in ADPKD: Reliability and effect of gadolinium enhancement. Clin J Am Soc Nephrol 2009;4(4):719–725.
4. Grantham JJ, Torres VE, Chapman AB, et al. Volume progression in polycystic kidney disease. N Engl J Med 2006;354(20):2122–2130.
5. Goel A, Shih G, Riyahi S, et al. Deployed deep learning kidney segmentation for polycystic kidney disease MRI. Radiol Artif Intell 2022;4(2):e210205.
6. Kline TL, Korfiatis P, Edwards ME, et al. Performance of an artificial multi-observer deep neural network for fully automated segmentation of polycystic kidneys. J Digit Imaging 2017;30(4):442–448.
7. Mu G, Ma Y, Han M, Zhan Y, Zhou X, Gao Y. Automatic MR kidney segmentation for autosomal dominant polycystic kidney disease. Proceedings of the Society of Photo-optical Instrumentation Engineers; 2019. San Diego, CA: Volpage. https://ui.adsabs.harvard.edu/abs/2019SPIE10950E.0XM/abstract
8. Van Gastel MDA, Edwards ME, Torres VE, Erickson BJ, Gansevoort RT, Kline TL. Automatic measurement of kidney and liver volumes from MR images of patients affected by autosomal dominant polycystic kidney disease. J Am Soc Nephrol 2019;30(8):1514–1522.
9. Raj A, Tollens F, Hansen L, et al. Deep learning-based total kidney volume segmentation in autosomal dominant polycystic kidney disease using attention, cosine loss, and sharpness aware minimization. Diagnostics 2022;12(5):1159.
10. Taylor J, Thomas R, Metherall P, Ong A, Simms R. MO012: Development of an accurate automated segmentation algorithm to measure total kidney volume in ADPKD suitable for clinical application (the Cystvas study). Nephrol Dial Transplant 2022;37(Supplement_3):i6.
11. Torres VE, Chapman AB, Devuyst O, et al. Tolvaptan in patients with autosomal dominant polycystic kidney disease. N Engl J Med 2012;367(25):2407–2418.
12. Deng J, Dong W, Socher R, Li L-J, Kai L, Li F-F. ImageNet: A large-scale hierarchical image database. Miami: IEEE; 2009.
13. Goel A. ADPKD segmentation in PyTorch. Volume 2021. San Francisco: GitHub; 2021. https://github.com/aksg87/adpkd-segmentation-pytorch
14. Zou KH, Warfield SK, Bharatha A, et al. Statistical validation of image segmentation quality based on a spatial overlap index. Acad Radiol 2004;11(2):178–189.
15. Taha AA, Hanbury A. Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool. BMC Med Imaging 2015;15(1):29.
16. Rangan GK, Wong ATY, Munt A, et al. Prescribed water intake in autosomal dominant polycystic kidney disease. NEJM Evid 2022;1(1):1–13. doi: 10.1056/EVIDoa2100021
17. Chapman AB, Guay-Woodford LM, Grantham JJ, et al. Renal structure in early autosomal-dominant polycystic kidney disease (ADPKD): The consortium for radiologic imaging studies of polycystic kidney disease (CRISP) cohort. Kidney Int 2003;64(3):1035–1045.
18. Kimura M. Understanding test-time augmentation. Proceedings of Neural Information Processing: 28th International Conference, ICONIP 2021; Dec Sanur, Bali, Indonesia: Springer International Publishing; 2021. p 558–569.
19. Van De Schoot R, Depaoli S, King R, et al. Bayesian statistics and modelling. Nat Rev Methods Primers 2021;1:1.
