Bar graphs depict iterative development of consensus on the validity of
reference values for radiomics features. We sought to identify reliable
reference values for radiomics features in an iterative standardization
process. In phase I, features were computed without prior image
processing, whereas in phase II, features were assessed after image
processing with five predefined configurations (configurations
A–E; Appendix
E1 [online]). The panels show, A, the
overall development of consensus on the validity of (tentative)
reference values in phases I and II and, B, the
development of consensus in phase II, according to image processing
configuration. Consensus on the validity of a reference value is based
on the number of research teams that produce the same value for a
feature (weak: <3; moderate: three to five; strong: six to nine;
very strong: ≥10). We analyzed consensus at each analysis time point;
the intervals between time points varied (arbitrary unit; arb. unit).
New features were included at time points 5 and 22, causing an
apparent decrease in consensus. For phase II, we first analyzed
consensus at time point 10. Image processing configurations C and D were
altered after time point 16. Configuration E was altered after revising
the resegmentation processing step at time point 22. See Appendix E1
(online) for more information regarding the timeline.
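
The consensus categories described in this caption can be sketched as a small lookup. This is a minimal illustration, not the authors' implementation: the function name `consensus_level` and the sample counts are hypothetical, and the sketch assumes that "weak" means fewer than three matching teams, so that a count of three falls in exactly one category (moderate).

```python
from collections import Counter


def consensus_level(n_matching_teams: int) -> str:
    """Map the number of teams reporting the same feature value to a consensus category.

    Assumed thresholds: weak (<3), moderate (3-5), strong (6-9), very strong (>=10).
    """
    if n_matching_teams < 0:
        raise ValueError("number of matching teams cannot be negative")
    if n_matching_teams < 3:
        return "weak"
    if n_matching_teams <= 5:
        return "moderate"
    if n_matching_teams <= 9:
        return "strong"
    return "very strong"


# Hypothetical match counts for a handful of features at one analysis time point.
example_counts = [1, 3, 5, 6, 9, 10, 14]

# Tally of features per consensus category, analogous to one bar in the graph.
tally = Counter(consensus_level(n) for n in example_counts)
```

Tallying the categories over all features at each time point would yield the kind of stacked counts that the bar graphs display; the actual data are those reported in the figure and Appendix E1.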