Author manuscript; available in PMC: 2018 Apr 1.
Published in final edited form as: J Pediatr Urol. 2016 Sep 6;13(2):192–198. doi: 10.1016/j.jpurol.2016.06.020

Reliability of grading of vesicoureteral reflux and other findings on voiding cystourethrography

Anthony J Schaeffer a, Saul P Greenfield b, Anastasia Ivanova c, Gang Cui c, J Michael Zerin d, Jeanne S Chow e,f, Alejandro Hoberman g, Ranjiv I Mathews h, Tej K Mattoo i, Myra A Carpenter c, Marva Moxey-Mims j, Russell W Chesney k, Caleb P Nelson e
PMCID: PMC5339054  NIHMSID: NIHMS818792  PMID: 27666144

Summary

Introduction

Voiding cystourethrography (VCUG) is the modality of choice to diagnose vesicoureteral reflux (VUR). Although grading of VUR is essential for prognosis and clinical decision-making, the inter-observer reliability for grading has been shown to vary substantially. The Randomized Intervention for Children with VesicoUreteral Reflux (RIVUR) trial provides a large cohort of children with VUR to better understand the reliability of VCUG findings.

Objective

To determine the inter-observer consistency of the grade of VUR and other VCUG findings in a large cohort of children with VUR.

Study design

The RIVUR trial is a randomized controlled trial of antimicrobial prophylaxis in children with VUR diagnosed after UTI. Each enrollment VCUG was read by a local clinical (i.e. non-reference) radiologist, and independently by two blinded RIVUR reference radiologists. Reference radiologists’ disagreements were adjudicated for trial purposes. The grade of VUR and other VCUG findings were extracted from the local clinical radiologist’s report. The unit of analysis included individual ureters and individual participants. We compared the three interpretations for grading of VUR and other VCUG findings to determine the inter-observer reliability.

Results

Six hundred and two non-reference radiology reports from 90 institutions were reviewed and yielded the grade of VUR for 560 left and 524 right ureters. All three radiologists agreed on VUR grade in only 59% of ureters; two of three agreed on 39% of ureters; and all three disagreed on 2% of ureters (Table). Agreement was better (≥92%) for other VCUG findings (e.g. bladder shape “normal”). The non-reference radiologists’ grade of VUR differed from the reference radiologists’ adjudicated grade by exactly one grade level in 19% of ureters, and by two or more grade levels in 2.2% of ureters. When the participant was the unit of analysis, all three radiologists agreed on the grade of VUR in both ureters in just 43% of cases.

Discussion

Our study shows considerable and clinically relevant variability in grading VUR by VCUG. This variability was consistent when comparing non-reference to the adjudicated reference radiologists’ assessment and the reference radiologists to each other. This study was limited to children with a history of UTI and grade I–IV VUR and may not be generalizable to all children who have a VCUG.

Conclusion

The considerable inter-observer variability in VUR grading has both research and clinical implications, as study design, risk stratification, and clinical decision-making rely heavily on grades of VUR.

Keywords: Vesico-ureteral reflux, Voiding cystourethrogram, Urinary tract infection, Radiology, Concordance, Classification

Introduction

Voiding cystourethrography (VCUG) is the modality of choice to diagnose vesicoureteral reflux (VUR) and certain other urinary tract abnormalities in children. VCUG permits grading of VUR using the five-level International Reflux Scale (IRS) [1], and grade of VUR is strongly associated with outcomes such as spontaneous resolution, recurrence of urinary tract infection (UTI), renal scarring, and others. Recent guidelines recommend clinical decision-making based on the grade of VUR on VCUG, including observation (without medical therapy) for selected children with grade I or II VUR [5].

While both research findings and clinical recommendations assume that the IRS grading scale is reliable and reproducible, this may not be the case, as grade discrepancies between readers occur in significant numbers of children with VUR [6–8]. However, there are limitations to the published data, and more rigorous measurement of inter-observer agreement regarding VCUG findings is needed.

The Randomized Intervention for Children with VesicoUreteral Reflux (RIVUR) trial is a randomized double-blind placebo-controlled trial of antimicrobial prophylaxis among children with grade I–IV VUR and urinary tract infection [9]. Each child included in RIVUR had a VCUG at study entry, and each VCUG was independently reviewed by a local clinical (non-reference) radiologist, as well as by two RIVUR reference radiologists.

The purpose of this study was to assess inter-observer agreement for grade of VUR and other VCUG findings among RIVUR participants.

Materials and methods

The RIVUR trial randomized 607 children with VUR to trimethoprim-sulfamethoxazole prophylaxis or placebo [9]. Three assessments of pre-enrollment VCUG images were performed: one by the non-reference radiologist who performed the initial clinical interpretation, and one by each of the two reference pediatric radiologists designated to read VCUGs for the RIVUR trial (JMZ, JSC). The RIVUR reference radiologists assessed grade of VUR and other findings including but not limited to bladder (i.e. shape, trabeculation) and urethral anatomy. Disagreements between reference radiologists were adjudicated. Reference radiologists accessed original digital images placed on compact discs in the DICOM format and viewed them on high-resolution monitors. There was no standardized reader software.

The non-reference radiology reports were reviewed for findings besides VUR using the same data form used by the reference radiologists. Two differences exist between the non-reference and reference assessments. First, although each reference radiologist was required to provide an assessment for every item on the RIVUR data report form, the local (non-reference) radiologists were under no such requirement. Accordingly, not every item assessed by the RIVUR reference radiologists was included in every non-reference VCUG report. If no statement was made about a certain item (e.g. the grade of VUR was given for the left but not the right ureter) in the non-reference report, the field for that item was coded as missing. However, all other items mentioned in the report were analyzed. Second, the non-reference radiologist could report intermediate grades of VUR (e.g. grade II–III), whereas the reference radiologists were only permitted to use grades I, II, III, IV, or V. Of ureters graded by non-reference radiologists, 8% (90/1080) received such an intermediate grade. To handle this discrepancy analytically, intermediate grades of reflux assigned by non-reference radiologists were given 0.5 units for each grade they encompassed. For example, 0.5 units were assigned to grade II and 0.5 units to grade III for non-reference grade II–III VUR.
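The half-unit rule for intermediate grades can be illustrated with a minimal Python sketch. The function names (`grade_weights`, `tally`) and the grade labels are illustrative assumptions; the study's own analysis was performed in SAS and its code is not shown in this article.

```python
from collections import Counter

# Whole grades; "0" stands for no reflux (label chosen for this sketch).
GRADES = ["0", "I", "II", "III", "IV", "V"]


def grade_weights(report_grade):
    """Split a reported grade into per-grade units.

    A whole grade such as "III" gets 1.0 unit; an intermediate grade
    such as "II-III" gets 0.5 units for each grade it encompasses.
    """
    # Accept either an en dash or a hyphen between the two grades.
    parts = [p.strip() for p in report_grade.replace("–", "-").split("-")]
    for part in parts:
        if part not in GRADES:
            raise ValueError(f"unknown grade: {part!r}")
    share = 1.0 / len(parts)
    return {part: share for part in parts}


def tally(report_grades):
    """Sum the units per grade over a list of reported grades."""
    totals = Counter()
    for reported in report_grades:
        for grade, weight in grade_weights(reported).items():
            totals[grade] += weight
    return dict(totals)


if __name__ == "__main__":
    # Three hypothetical ureters: II, II-III, and III.
    print(tally(["II", "II–III", "III"]))
```

Under this scheme category totals need not be integers, which is why counts such as 637.5 can appear in the tables.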

A three-way comparison between the non-reference radiologist and each reference radiologist was used to determine the number and percent of responses in which all three radiologists agreed, two radiologists agreed and one disagreed, or all three radiologists disagreed. Similarly, two-way comparisons of the number and percent of concordant responses were made between the non-reference radiologist and the adjudicated reference radiologists’ assessments, and between the two reference radiologists’ assessments. For some comparisons, broader severity categories of low-grade (I or II), moderate (III), or high-grade (IV or V) were considered. Most analyses used the individual ureter as the unit of analysis. However, we also determined the agreement in the grade of VUR using the individual patient as the unit of analysis. In this scenario, analysis was only possible in participants who had a VUR grade assigned for each ureter. Agreement was possible only if both ureters had the same grade of VUR across all three radiologists. We analyzed agreement in assessments for the two-way comparisons using the unweighted Kappa statistic, a conservative, chance-corrected measure of agreement between raters’ evaluations of mutually exclusive categorical items [10]. Statistical analyses were performed using SAS 9.3 (SAS Institute, Cary, NC, USA). All analyses were two-sided, and a p-value <0.05 was considered to be significant.
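As a rough illustration of the three-way comparison and the broader severity categories, the following Python sketch classifies each ureter by how many of the three reads coincide. It handles whole grades only (intermediate grades would need the half-unit rule), and the function names and sample reads are hypothetical, not taken from the study data.

```python
from collections import Counter


def severity(grade):
    """Map a whole VUR grade to the broader severity category."""
    if grade in ("I", "II"):
        return "low"
    if grade == "III":
        return "moderate"
    if grade in ("IV", "V"):
        return "high"
    return "none"  # grade "0", i.e. no reflux


def three_way(non_ref, ref1, ref2):
    """Classify one ureter by the number of distinct grades assigned."""
    distinct = len({non_ref, ref1, ref2})
    return {1: "all agree", 2: "two agree", 3: "all disagree"}[distinct]


def summarize(reads):
    """Return {category: (count, percent)} over a list of three-read tuples."""
    counts = Counter(three_way(*read) for read in reads)
    n = len(reads)
    return {
        category: (counts[category], round(100 * counts[category] / n))
        for category in ("all agree", "two agree", "all disagree")
    }


if __name__ == "__main__":
    # Hypothetical reads: (non-reference, reference 1, reference 2).
    reads = [
        ("II", "II", "II"),
        ("III", "III", "III"),
        ("II", "III", "III"),
        ("I", "II", "III"),
    ]
    print(summarize(reads))
```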

Results

Six hundred and two non-reference VCUG reports from 90 facilities were available. These reports provided the non-reference VUR grade for 560 left and 524 right ureters. For this report, there were 49 males and 553 females studied with a median age of 11 months at the time of VCUG (interquartile range 5–30 months). The adjudicated (i.e. reference) grade of VUR for the 602 participants was grade 0 in 319 ureters (26.3%), grade I in 104 ureters (8.6%), grade II in 402 ureters (33.1%), grade III in 321 ureters (26.4%), grade IV in 54 ureters (4.4%), and grade V in one (0.1%) ureter; assessments were not available for three ureters.

All three radiologists agreed on the grade of VUR in 59% (638/1081) of ureters (Table 1). When comparing non-reference with the adjudicated reference assessment, 25% (275/1080) of VUR grades were discordant (Kappa 0.66, 95% CI 0.62–0.69). Table 2 shows that reference radiologists did no better, with 24% (283/1202) of ureters having discrepant grades that required adjudication (Kappa 0.68, 95% CI 0.65–0.71).

Table 1.

Agreement between three radiologists on VCUG assessments.^a

| Assessment | All three radiologists agree, N (%) | Two radiologists agree and one disagrees, N (%) | All three radiologists disagree, N (%) |
| --- | --- | --- | --- |
| Voided during study (n = 559 studies) | 517 (92) | 42 (8) | |
| Reflux grade (n = 1081 ureters)^b,c | 637.5 (59) | 416.5 (39) | 27 (2) |
| Complete duplication (n = 820 ureters)^b | 785 (96) | 35 (4) | |
| Paraureteral diverticulum (n = 923 ureters)^b | 897 (97) | 26 (3) | |
| Bladder shape (n = 265 studies) | 261 (98) | 4 (2) | |
| Bladder trabeculations (n = 490 studies) | 480 (98) | 10 (2) | |
| Bladder ureterocele (n = 374 studies) | 374 (100) | | |
| Bladder wall diverticulum (n = 496 studies) | 472 (95) | 24 (5) | |
| Normal urethra (n = 407 studies) | 397 (98) | 10 (2) | |
| Spinning top urethra (n = 407 studies) | 399 (98) | 8 (2) | |
| Osseous structures normal (n = 304 studies) | 301 (99) | 3 (1) | |

^a Restricted to studies where non-reference radiologists and/or reference radiologists assessed the specific component. For example, non-reference radiologists commented on whether voiding was observed in the study in 559 participants and on the bladder shape in 265 participants.

^b Right and left side ureters were pooled.

^c Comparisons between equal categories are required, so intermediate grades of reflux assigned by non-reference radiologists were given 0.5 units for each grade they encompassed. For example, 0.5 units of credit were assigned to grade II and 0.5 units to grade III for a participant whose reflux was graded II–III by the non-reference radiologist.

Table 2.

Agreement between non-reference radiologist and adjudicated reference assessments and between each reference radiologist’s assessments.

Columns 2–4: non-reference radiologist compared with adjudicated reference.^a Columns 5–7: reference radiologist 1 compared with reference radiologist 2.^b

| Assessment | Concordant N (%) | Discrepant N (%) | Kappa (95% CI) | Concordant N (%) | Discrepant N (%) | Kappa (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| Voided during study | 542 (95) | 26 (5) | 0.39 (0.20–0.57) | 565 (96) | 26 (4) | 0.41 (0.23–0.60) |
| Reflux grade^c | 805 (75) | 275 (25) | 0.66 (0.62–0.69) | 919 (76) | 283 (24) | 0.68 (0.65–0.71) |
| Complete duplication^c | 842 (96) | 31 (4) | 0.40 (0.23–0.507) | 811 (98) | 14 (2) | 0.71 (0.57–0.86) |
| Paraureteral diverticulum^c | 907 (98) | 16 (2) | 0.42 (0.19–0.65) | 1173 (97) | 31 (3) | 0.38 (0.21–0.55) |
| Bladder shape | 262 (99) | 3 (1) | 0.39 (−0.15 to 0.94) | 598 (99) | 4 (1) | −0.00 (−0.00–0.00) |
| Bladder trabeculations | 483 (99) | 7 (1) | 0.22 (−0.15 to 0.58) | 591 (98) | 11 (2) | −0.003 (−0.008–0.002) |
| Bladder ureterocele | 374 (100) | 0 (0) | | 602 (100) | 0 (0) | |
| Bladder wall diverticulum | 480 (97) | 16 (3) | 0.32 (0.08–0.56) | 577 (96) | 25 (4) | 0.30 (0.10–0.50) |
| Normal urethra | 417 (99) | 3 (1) | 0.40 (−0.14 to 0.94) | 499 (98) | 10 (2) | −0.01 (−0.02–0.00) |
| Spinning top urethra | 412 (99) | 6 (1) | 0.24 (−0.16 to 0.64) | 509 (98) | 8 (2) | 0.00 (0.00–0.00) |
| Osseous structures normal | 303 (99) | 4 (1) | 0.00 (0.00–0.00) | 594 (100) | 1 (0) | −0.00 (−0.00–0.00) |

^a Restricted to studies where non-reference radiologists assessed specific component. For example, non-reference radiologists commented on whether voiding was observed during the study for 568 participants and on the reflux grade in 1080 ureters.

^b Restricted to 603 studies where reference information was available. Each item was not assessed in every participant. For example, in only 591 out of 603 studies were images available that allowed determination of voiding; 1202 ureters were assessed.

^c Right and left ureters were pooled.

Disagreements sometimes differed widely. Three ureters had grade 0 VUR (no reflux) according to one radiologist, grade I or grade II by a second radiologist, and grade III VUR by a third radiologist; 11 ureters were graded as having grade I or II by one, grade III by another, and grade IV or grade V by a third radiologist.

Among ureters for which two of three radiologists agreed, 359 were graded on an ordinal scale by nonreference radiologists and could be compared with scores of reference radiologists. Assessments for 12% (43/359) of these ureters differed within a severity category (i.e. disagreement within low-grade [grade I vs. grade II] VUR). However, disagreements were also noted between severity levels. Forty-six percent (164/359) of disagreements occurred between low (grade I or II) and moderate (grade III) VUR. Other areas of disagreement included 83 ureters graded as none and some reflux, 65 ureters as both moderate (grade III) and high-grade (grade IV or V) VUR, and four ureters as both low-grade (grade I or II) and high-grade (grade IV or V) VUR.

Tables 3 and 4 show variation in grades of VUR assigned by the non-reference radiologists and adjudicated RIVUR grades. One would expect less disagreement in VUR that is isolated to the ureter (grade I) versus that which extends to the collecting system (≥grade II). This was supported by these data. Specifically, of the 133 ureters assigned grade I by the non-reference radiologists, 15% (20/133) were interpreted as grade II and 14% (18/133) had no reflux by the reference radiologist.

Table 3.

Cross-tabulation of VUR grade comparing non-reference grade with adjudicated reference grade.^a

Left ureter (n = 558): non-reference radiologist grade (rows) by reference adjudicated grade (columns)

| Non-reference grade | None | I | II | III | IV | V | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| None | 76 | 1 | 2 | 0 | 0 | 0 | 79 |
| I | 8 | 60 | 12 | 1 | 0 | 0 | 81 |
| I–II | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| II | 1 | 2 | 142 | 28 | 0 | 0 | 173 |
| II–III | 0 | 0 | 13 | 17 | 0 | 0 | 30 |
| III | 1 | 0 | 35 | 91 | 14 | 0 | 141 |
| III–IV | 0 | 0 | 1 | 7 | 3 | 0 | 11 |
| IV | 0 | 0 | 2 | 16 | 20 | 1 | 39 |
| IV–V | 0 | 0 | 0 | 2 | 1 | 0 | 3 |
| V | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| Total | 86 | 63 | 207 | 163 | 38 | 1 | |

Right ureter (n = 522): non-reference radiologist grade (rows) by reference adjudicated grade (columns)

| Non-reference grade | None | I | II | III | IV | V | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| None | 120 | 2 | 0 | 0 | 0 | 0 | 122 |
| I | 10 | 35 | 7 | 0 | 0 | 0 | 52 |
| I–II | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| II | 5 | 1 | 134 | 20 | 0 | 0 | 160 |
| II–III | 1 | 0 | 14 | 20 | 0 | 0 | 35 |
| III | 1 | 0 | 25 | 81 | 6 | 0 | 113 |
| III–IV | 0 | 0 | 0 | 6 | 2 | 0 | 8 |
| IV | 0 | 0 | 2 | 19 | 4 | 0 | 25 |
| IV–V | 0 | 0 | 0 | 2 | 1 | 0 | 3 |
| V | 0 | 0 | 1 | 1 | 2 | 0 | 4 |
| Total | 137 | 38 | 183 | 149 | 15 | 0 | |

Rows correspond to the VUR grade assigned by non-reference radiologists and columns correspond to the adjudicated VUR grade by reference RIVUR radiologists. Intermediate grades of reflux are shown for non-reference radiologists, although reference radiologists did not assign intermediate grades of VUR. The shaded cells represent ureters for which the non-reference radiologists and RIVUR radiologists agreed in their grading. Off-diagonal cells indicate discrepancy between the non-reference radiologists’ grade and the adjudicated grade assigned by RIVUR radiologists. For example, one left ureter was considered grade IV by a non-reference radiologist and as grade V by RIVUR radiologists.

^a Restricted to 1080 ureters in which non-reference and adjudicated reference VUR grades were available for comparison.

Table 4.

Degree of discrepancy between RIVUR radiologists and non-reference radiologists.

| Non-reference radiologists’ VUR grade^a | N (%) |
| --- | --- |
| No different from RIVUR read | 763 (71) |
| Lower than RIVUR read (subsequently upgraded) | 136 (13) |
| <1 VUR grade lower than RIVUR read (takes into account the intermediate grades of reflux) | 42 (4) |
| 1 VUR grade lower than RIVUR read (subsequently upgraded by RIVUR) | 91 (8) |
| 2 VUR grades lower than RIVUR read (subsequently upgraded by RIVUR) | 3 (0.3) |
| 3 VUR grades lower than RIVUR read (subsequently upgraded by RIVUR) | 0 |
| Higher than RIVUR read (subsequently downgraded) | 181 (17) |
| <1 VUR grade higher than RIVUR read (takes into account the intermediate grades of reflux) | 42 (4) |
| 1 VUR grade higher than RIVUR read (subsequently downgraded by RIVUR) | 118 (11) |
| 2 VUR grades higher than RIVUR read (subsequently downgraded by RIVUR)^b | 17 (2) |
| 3 VUR grades higher than RIVUR read (subsequently downgraded by RIVUR)^c | 4 (0.4) |

^a Restricted to 1080 ureters in which non-reference and adjudicated reference VUR grades were available for comparison.

^b Includes five intermediate grade ureters that were between one and two grades above adjudicated reference radiologists’ grade.

^c Includes one intermediate grade ureter that was between two and three grades above adjudicated reference radiologists’ grade.

The data depict a different story for higher grades of reflux. Twenty-four percent (60/254) of ureters assessed to be grade III by non-reference radiologists were rated grade II by reference radiologists; two ureters were judged to have no VUR and 8% (20/254) were rated grade IV by reference radiologists. Among 64 ureters assessed to have grade IV reflux by non-reference radiologist, 55% (35/64) were subsequently downgraded to grade III and 6% (4/64) were downgraded to grade II by reference radiologists.
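The fractions quoted in this paragraph can be re-derived by pooling the left and right rows of Table 3. The Python sketch below transcribes the Table 3 counts; the variable and function names are illustrative assumptions and are not part of the original analysis.

```python
# Columns: adjudicated reference grade, in Table 3 order.
COLS = ["None", "I", "II", "III", "IV", "V"]

# Rows: non-reference grade; counts transcribed from Table 3.
LEFT = {
    "None": [76, 1, 2, 0, 0, 0],
    "I": [8, 60, 12, 1, 0, 0],
    "I-II": [0, 0, 0, 0, 0, 0],
    "II": [1, 2, 142, 28, 0, 0],
    "II-III": [0, 0, 13, 17, 0, 0],
    "III": [1, 0, 35, 91, 14, 0],
    "III-IV": [0, 0, 1, 7, 3, 0],
    "IV": [0, 0, 2, 16, 20, 1],
    "IV-V": [0, 0, 0, 2, 1, 0],
    "V": [0, 0, 0, 1, 0, 0],
}
RIGHT = {
    "None": [120, 2, 0, 0, 0, 0],
    "I": [10, 35, 7, 0, 0, 0],
    "I-II": [0, 0, 0, 0, 0, 0],
    "II": [5, 1, 134, 20, 0, 0],
    "II-III": [1, 0, 14, 20, 0, 0],
    "III": [1, 0, 25, 81, 6, 0],
    "III-IV": [0, 0, 0, 6, 2, 0],
    "IV": [0, 0, 2, 19, 4, 0],
    "IV-V": [0, 0, 0, 2, 1, 0],
    "V": [0, 0, 1, 1, 2, 0],
}


def pooled_row(non_ref_grade):
    """Add the left and right ureter counts for one non-reference grade."""
    return [l + r for l, r in zip(LEFT[non_ref_grade], RIGHT[non_ref_grade])]


def breakdown(non_ref_grade):
    """Return {reference grade: (count, row total)} for one non-reference grade."""
    row = pooled_row(non_ref_grade)
    total = sum(row)
    return {col: (count, total) for col, count in zip(COLS, row)}


if __name__ == "__main__":
    for grade in ("III", "IV"):
        for col, (count, total) in breakdown(grade).items():
            print(f"non-reference {grade} -> reference {col}: {count}/{total}")
```

Run on these counts, the pooled grade III row gives 60/254 read as grade II and 20/254 as grade IV, and the pooled grade IV row gives 35/64 read as grade III and 4/64 as grade II.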

Using the individual patient as the unit of analysis, assessments for both ureters were available for 492 participants. Sixty-six had intermediate grades of VUR (e.g. grade II–III) assessed by the non-reference radiologist. Excluding these 66 participants (as agreement at the patient level was impossible), all three radiologists agreed on VUR assessments for 185/426 patients (43%). Alternatively, if we include those participants with intermediate non-reference grades, and count agreement as either of the two values in the intermediate grade matching the reference grade (e.g. grade II–III would be in agreement if the reference grade was II or III), 213 (43%) of 492 patients had agreement on both ureters among all three radiologists.
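The two patient-level rules described here (excluding participants with an intermediate grade, or counting an intermediate grade as agreeing when either of its values matches) might be sketched as follows. This is an illustrative Python sketch with hypothetical function names and inputs; grades are assumed to be hyphen-separated labels such as "II-III", with "0" for no reflux.

```python
def has_intermediate(non_ref_pair):
    """True if either ureter received an intermediate non-reference grade."""
    return any("-" in grade for grade in non_ref_pair)


def matches(non_ref_grade, ref_grade):
    """True if the reference grade equals the non-reference grade,
    or either value of an intermediate non-reference grade."""
    return ref_grade in non_ref_grade.split("-")


def patient_agrees(non_ref_pair, ref1_pair, ref2_pair):
    """All three radiologists agree on both ureters (left, right)."""
    return all(
        r1 == r2 and matches(nr, r1)
        for nr, r1, r2 in zip(non_ref_pair, ref1_pair, ref2_pair)
    )


if __name__ == "__main__":
    # Hypothetical participants: (non-reference, reference 1, reference 2),
    # each given as a (left, right) pair of grades.
    patients = [
        (("II", "0"), ("II", "0"), ("II", "0")),
        (("II-III", "0"), ("III", "0"), ("III", "0")),
        (("II", "I"), ("II", "I"), ("II", "II")),
    ]
    strict = [p for p in patients if not has_intermediate(p[0])]
    print(sum(patient_agrees(*p) for p in strict), "of", len(strict))
    print(sum(patient_agrees(*p) for p in patients), "of", len(patients))
```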

Aside from VUR grade, most of the VCUG assessment items had high agreement between radiologists. As shown in Table 1, the three-way comparison showed at least 92% agreement for all other assessments.

Discussion

Our study shows considerable variability in agreement when grading VUR. We found agreement between all three radiologists on grade of VUR in 59% of ureters; at the patient level, agreement is even lower at 43%. Even the RIVUR reference radiologists, when compared with each other, disagreed at a similar rate (24%) to that seen in comparing the non-reference with adjudicated reference readings (25%), despite the reference radiologists having agreed a priori on the grading criteria to be used, and having already completed a pilot study assessing VCUGs. These findings confirm that grading of VUR on VCUGs is subject to clinically significant inter-observer variability.

Previous studies have reported mixed results regarding the consistency of grading of VUR among multiple readers. Craig et al. reported 95% agreement in grade of VUR between three experienced pediatric radiologists reading 265 consecutive VCUGs for children presenting after their first febrile UTI, with corresponding Kappa values greater than 0.9 [11]. O’Neil et al. assessed the reliability in grade of VUR among pediatric radiologists, attending pediatric urologists, and urology residents using 200 consecutive VCUGs [12]. These authors reported Kappa values between 0.91 and 0.97, suggesting excellent agreement. Kronemer et al. reported variability in grading of VUR for 74 VCUGs interpreted by two radiologists using different interpretation formats [6]. In contrast to the aforementioned studies, these authors showed that among the 39 ureters with reflux, disagreement in grade occurred in 20 (51%), with a Kappa statistic of 0.7. Metcalfe et al. assessed the reliability in grade of VUR on 38 VCUGs assessed by four pediatric radiologists, five pediatric urologists, and four urology residents [8]. They reported poor inter-observer reliability with Kappa values ranging from 0.4 to 0.7 for most comparisons.

Our results show worse agreement in grading VCUGs when compared with the pilot study conducted by the RIVUR investigators [7]. In the pilot study, among 75 VCUGs interpreted, grades of VUR were in agreement in 85–86% of cases (Kappa 0.83–0.85), with most disagreements occurring for low and intermediate grades of reflux. Radiologists who interpreted images in the pilot study were the reference radiologists in this primary study. The increase in variability found in the present study may be explained by a larger number of studies, a higher rate of VUR, or other unknown factors.

Many of the calculated kappa scores were strikingly low for some of the high-agreement items (e.g. bladder shape), especially in light of the high percentage agreement and low variability for many of these items. This is because the Kappa coefficient is influenced by not just the percent agreement but also the prevalence of responses; lower coefficients can result when responses preferentially lie in two of the four quadrants of a 2 × 2 concordance table [13].
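The prevalence effect on Kappa can be seen in a small worked example. The Python sketch below computes the unweighted Cohen's Kappa from a square concordance table; both 2 × 2 tables are hypothetical counts chosen for illustration and are not data from this study.

```python
def cohen_kappa(table):
    """Unweighted Cohen's Kappa for a square table of counts.

    table[i][j] is the number of items rated category i by rater A
    and category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    if expected == 1:
        # Both raters used a single category for every item; Kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Both hypothetical tables have 90% raw agreement (90 of 100 on the diagonal).
    balanced = [[45, 5], [5, 45]]  # finding present in about half of studies
    skewed = [[1, 5], [5, 89]]     # finding rare: most reads are "absent"
    print(round(cohen_kappa(balanced), 2))
    print(round(cohen_kappa(skewed), 2))
```

With identical raw agreement, the balanced table yields a Kappa of 0.80 while the skewed table yields roughly 0.11, because chance-expected agreement is already very high when nearly every response falls in one category.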

Our data have both clinical and research implications. Discussions about prognosis and management decisions regarding antimicrobial prophylaxis and surgical intervention rely heavily on the grade of VUR. This is particularly true if the clinician uses a specific grade threshold to make treatment decisions, and a number of research studies have shown that resolution rates, renal scarring, and rates of recurrent UTI are dependent on VUR grade [14–16]. If the grading on which these analyses were based is of uncertain reliability, then the validity of the analyses themselves is called into question.

Looking more closely at the divergent grading between radiologists, it becomes apparent how this can affect patient care. If the clinician surgically intervenes on children with grade IV or higher VUR, nearly two-thirds of RIVUR participants with “high grade” reflux would have been inappropriately managed (46 of 75 ureters identified in the community as having ≥ grade IV reflux were subsequently re-assigned to grade III or lower by the reference radiologists). Conversely, VUR was not observed by the reference radiologists in 27 ureters assessed by the non-reference radiologists to have VUR. These data support the notion that interpretation of VCUG images has limited inter-observer reliability; treating clinicians should personally review images prior to making any management decisions, and may wish to take VUR grades with the proverbial “grain of salt,” as but one element of the larger clinical picture.

This study should be viewed in light of its limitations. The reference radiologists were not present to view the procedure, so their assessment of VUR is limited to the images provided. This could result in less nuanced assessments than those that can only be made when viewing in real time, such as transient VUR that is not visible on the saved images. As there was no standardization of VCUG technique among the numerous clinical centers, some disagreement in interpretation could result from differences in technique (e.g. use of oblique images). These factors could result in differences between reference and non-reference radiologists, but they would not have accounted for any differences between reference radiologists, both of whom viewed the same images. Further, the magnitude of variability being similar for all comparisons suggests that technical factors did not have a major effect. Another possible source of variation was the lack of standardized viewing equipment. While essentially all films were reviewed on high-quality digital computer screens, differences in resolution could lead to some disagreement in findings, for example if lower-resolution monitors obscured certain subtle features.

Also, the RIVUR trial included only children with grade I–IV VUR, so our findings may differ from a sample that includes patients with grade V VUR. Also, because participants had a history of UTI, our findings may not be generalizable to all children who have a VCUG. Our analysis was complicated by the existence of intermediate grades of VUR that were reported by non-reference radiologists, making comparison difficult with the ordinal grading used by the reference radiologists. To include assessments for these participants in the analysis and to mitigate bias, we assigned a half-unit count to each grade included in the interval. The counter-intuitive consequence of this was that some of the count totals for ureters were not integers, as certain grade categories include “half of a ureter.” However, we feel that this was the most consistent way to deal with the non-reference radiologists having the option of intermediate grades, which the reference radiologists did not; analytically, our solution is sound.

As noted above, we were missing values for some VCUG elements, most often from non-reference radiologists who did not follow a template (as each reference radiologist was required to do). This effect was mitigated by the large sample size. Lastly, the data structure and study design limited our ability to perform a multivariable analysis to determine which factors were associated with disagreement (or agreement).

Conclusions

There is considerable inter-observer variability for grading VUR. This has both research and clinical implications, as study design, risk stratification, and clinical decision-making rely heavily on grades of VUR.

Table.

Characteristic
No. of VCUG reports analyzed 602
Gender of participants
 Male 49
 Female 553
Age in months at time of VCUG (median) [IQR] 11 [5,30]
No. of ureters analyzed 1081
Reflux grade agreement
 Between non-reference and each reference radiologist (three-way)
  All three agree 638/1081 (59%)
  Two agree, one disagree 417/1081 (39%)
  All three disagree 27 (2%)
 Between non-reference and adjudicated reference radiologists’ score (two-way)
  Agree 805 (75%)
  Disagree 275 (25%)
  Kappa (95% CI) 0.66 (0.62–0.69)

Acknowledgments

Funding

This research was supported by grants U01 DK074059, U01 DK074053, U01 DK074082, U01 DK074064, U01 DK074062, U01 DK074063 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health, Department of Health and Human Services. This trial was also supported by the University of Pittsburgh Clinical and Translational Science Award (UL1RR024153 and UL1TR000005), and the Children’s Hospital of Philadelphia Clinical and Translational Science Award (UL1TR000003) both from the National Center for Research Resources, now at the National Center for Advancing Translational Sciences, National Institutes of Health. Dr. Nelson is supported by grant K23-DK088943 from NIDDK. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Diabetes and Digestive and Kidney Diseases or the National Institutes of Health.

The authors would like to thank the RIVUR clinical trial volunteers whose participation helped answer important clinical questions. We are also indebted to the study coordinators at each trial site, especially Ilina Rosoklija and Lisa Gravens-Mueller whose assistance was vital to gather a complete dataset.

Footnotes

Conflict of interest

None.

Uncited references

[2–4].

References

1. Lebowitz RL, Olbing H, Parkkulainen KV, Smellie JM, Tamminen-Mobius TE. International system of radiographic grading of vesicoureteric reflux. International Reflux Study in Children. Pediatr Radiol. 1985;15(2):105–9. doi: 10.1007/BF02388714.
2. Garin EH, Olavarria F, Garcia Nieto V, Valenciano B, Campos A, Young L. Clinical significance of primary vesicoureteral reflux and urinary antibiotic prophylaxis after acute pyelonephritis: a multicenter, randomized, controlled study. Pediatrics. 2006;117(3):626–32. doi: 10.1542/peds.2005-1362.
3. Ardissino G, Avolio L, Dacco V, Testa S, Marra G, Viganò S, et al. Long-term outcome of vesicoureteral reflux associated chronic renal failure in children. Data from the ItalKid Project. J Urol. 2004;172(1):305–10. doi: 10.1097/01.ju.0000129067.30725.16.
4. Roussey-Kesler G, Gadjos V, Idres N, Horen B, Ichay L, Leclair MD, et al. Antibiotic prophylaxis for the prevention of recurrent urinary tract infection in children with low grade vesicoureteral reflux: results from a prospective randomized study. J Urol. 2008;179(2):674–9; discussion 679. doi: 10.1016/j.juro.2007.09.090.
5. Peters CA, Skoog SJ, Arant BS Jr, Copp HL, Elder JS, Hudson RG, et al. Summary of the AUA guideline on management of primary vesicoureteral reflux in children. J Urol. 2010;184(3):1134–44. doi: 10.1016/j.juro.2010.05.065.
6. Kronemer KA, Don S, Luker GD, Hildebolt C. Soft-copy versus hard-copy interpretation of voiding cystourethrography in neonates, infants, and children. AJR Am J Roentgenol. 1999;172(3):791–3. doi: 10.2214/ajr.172.3.10063884.
7. Greenfield SP, Carpenter MA, Chesney RW, Zerin JM, Chow J. The RIVUR voiding cystourethrogram pilot study: experience with radiologic reading concordance. J Urol. 2012;188(4 Suppl):1608–12. doi: 10.1016/j.juro.2012.06.032.
8. Metcalfe CB, Macneily AE, Afshar K. Reliability assessment of international grading system for vesicoureteral reflux. J Urol. 2012;188(4 Suppl):1490–2. doi: 10.1016/j.juro.2012.02.015.
9. Carpenter MA, Hoberman A, Mattoo TK, Mathews R, Keren R, Chesney RW, et al. The RIVUR trial: profile and baseline clinical associations of children with vesicoureteral reflux. Pediatrics. 2013;132(1):e34–45. doi: 10.1542/peds.2012-2301.
10. Cohen J. A coefficient of agreement for nominal scales. Ed Psych Msmt. 1960;20(1):37–46.
11. Craig JC, Irwig LM, Christie J, Lam A, Onikul E, Knight JF, et al. Variation in the diagnosis of vesicoureteric reflux using micturating cystourethrography. Pediatr Nephrol. 1997;11(4):455–9. doi: 10.1007/s004670050316.
12. O’Neil BB, Cartwright PC, Maves C, Hoeg K, Presson AP, Wallis MC. Reliability of voiding cystourethrogram for the grading of vesicoureteral reflux. J Pediatr Urol. 2014;10(1):107–11. doi: 10.1016/j.jpurol.2013.06.014.
13. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990;43(6):543–9. doi: 10.1016/0895-4356(90)90158-l.
14. Estrada CR Jr, Passerotti CC, Graham DA, Peters CA, Bauer SB, Diamond DA, et al. Nomograms for predicting annual resolution rate of primary vesicoureteral reflux: results from 2,462 children. J Urol. 2009;182(4):1535–41. doi: 10.1016/j.juro.2009.06.053.
15. Dias CS, Silva JM, Diniz JS, Lima EM, Marciano RC, Lana LG, et al. Risk factors for recurrent urinary tract infections in a cohort of patients with primary vesicoureteral reflux. Pediatr Infect Dis J. 2010;29(2):139–44. doi: 10.1097/inf.0b013e3181b8e85f.
16. Peters C, Rushton HG. Vesicoureteral reflux associated renal damage: congenital reflux nephropathy and acquired renal scarring. J Urol. 2010;184(1):265–73. doi: 10.1016/j.juro.2010.03.076.
