As researchers actively working in the field of pediatric neuroimaging and artificial intelligence, we were pleased to read your recent review, "Artificial Intelligence for Neuroimaging in Pediatric Cancer," by Dalboni da Rocha et al. [1], which reflects the growing interest and rapid progress in this area. Reviews of this kind are critical for synthesizing knowledge and guiding future research directions.
However, we write to clarify several points in Table 1, as some of the reported segmentation results may inadvertently misrepresent the original studies or blur important distinctions, particularly regarding model evaluation metrics, dataset characteristics, and segmentation targets. The discrepancies are not confined to a single entry: in several studies, key methodological details, such as whether a reported Dice score reflects whole-tumor or subregion segmentation, or whether a metric such as Pearson's correlation was used, were unclear or omitted. These inconsistencies, while subtle, can mislead readers when interpreting and comparing model performance across studies.
Our intention is to support clarity, consistency, and transparency in the evaluation of AI models for pediatric brain tumor segmentation, especially as the field moves toward clinical integration. We respectfully offer the following clarifications.
Examples of clarifications:
Boyd et al. [2] reported a whole-tumor Dice score of 0.88 for low-grade gliomas, based on external datasets from CBTN (n = 60) and DFCI/BCH (n = 100), using T2-weighted MRI. The model employed a stepwise transfer-learning strategy, starting from BraTS 2021 adult glioma weights (n = 1251) and fine-tuning on CBTN data (n = 124). This methodological context is essential for interpreting the reported performance.
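For readers unfamiliar with this training strategy, a minimal, self-contained sketch follows; the tiny model, checkpoint name, learning rate, and synthetic data are our own illustrative placeholders, not the actual pipeline of Boyd et al. [2].

```python
# Minimal, self-contained sketch of stepwise transfer learning (PyTorch).
# The tiny model, checkpoint name, and synthetic data are placeholders for
# the actual nnU-Net-style pipeline described by Boyd et al. [2].
import torch
from torch import nn, optim

model = nn.Sequential(                      # stand-in for a real U-Net
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 1),
)

# Step 1: initialize from weights pretrained on adult glioma data (BraTS 2021).
# Here we save and reload random weights only to keep the sketch runnable.
torch.save(model.state_dict(), "adult_glioma_pretrained.pt")
model.load_state_dict(torch.load("adult_glioma_pretrained.pt"))

# Step 2: fine-tune on a (synthetic) pediatric cohort at a low learning rate,
# adapting the pretrained features rather than training from scratch.
optimizer = optim.Adam(model.parameters(), lr=1e-5)
images = torch.randn(4, 1, 64, 64)          # stand-in for T2-weighted MRI
masks = (torch.rand(4, 1, 64, 64) > 0.9).float()
for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(model(images), masks)
    loss.backward()
    optimizer.step()
```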
Mulvany et al. [3] presented Dice scores ranging from 0.657 to 0.967, reflecting different tumor subregions. The 0.931 Dice score corresponds specifically to whole-tumor segmentation on the PED BraTS 2024 validation dataset (n = 91), with a training set of n = 261 cases.
Vossough et al. [4] reported a Dice score of 0.9 for whole-tumor segmentation on an internal test set (n = 293) and a Pearson correlation coefficient of 0.98, which reflects volume agreement rather than spatial overlap. These are distinct metrics and should not be used interchangeably.
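The practical consequence is easy to demonstrate: a model can reproduce every case's tumor volume exactly (Pearson's r = 1) while overlapping the ground truth poorly in space. The following minimal NumPy sketch uses entirely synthetic masks to make this concrete.

```python
# Synthetic sketch: volume agreement (Pearson r) versus spatial overlap
# (Dice). All masks are hypothetical.
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

n_cases = 10
gt = np.zeros((n_cases, 64, 64), dtype=bool)
pred = np.zeros_like(gt)
for i in range(n_cases):
    r = 5 + i                             # ground-truth square of varying size
    gt[i, 10:10 + r, 10:10 + r] = True
    pred[i, 14:14 + r, 14:14 + r] = True  # same size, shifted by 4 voxels

mean_dice = np.mean([dice(gt[i], pred[i]) for i in range(n_cases)])
r_volume = np.corrcoef(gt.reshape(n_cases, -1).sum(axis=1),
                       pred.reshape(n_cases, -1).sum(axis=1))[0, 1]

print(f"mean Dice = {mean_dice:.2f}")   # ~0.31: poor spatial overlap
print(f"Pearson r = {r_volume:.2f}")    # 1.00: perfect volume agreement
```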
Our study [5] reported a Dice score of 0.642 as an average across tumor subregions (enhancing, non-enhancing, cystic, and edema). The whole-tumor Dice score was 0.681 on CBTN (n = 30 LGG cases) and 0.866 on the held-out PED BraTS 2024 dataset (n = 26). This distinction between whole-tumor and subregion metrics is important when making cross-study comparisons.
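To illustrate why the two summaries can diverge, the synthetic sketch below constructs one large subregion that segments well and one tiny subregion that segments poorly: the subregion average drops sharply, while the whole-tumor Dice, computed on the union of the subregion masks, stays high. All region names, sizes, and shifts here are hypothetical.

```python
# Hypothetical illustration: a small, hard-to-segment subregion pulls the
# subregion *average* down, while whole-tumor Dice (computed on the union
# of subregion masks) stays high. All masks and shifts are synthetic.
import numpy as np

def dice(a, b):
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

# (row0, n_rows, col0, n_cols, prediction shift in columns)
regions = {
    "edema":         (0, 40, 10, 40, 2),   # large region, small shift
    "non-enhancing": (42, 8, 10, 20, 4),
    "cystic":        (52, 4, 10, 10, 5),
    "enhancing":     (58, 4, 10,  4, 3),   # tiny region, large relative shift
}
gt, pred = {}, {}
for name, (r0, nr, c0, nc, s) in regions.items():
    g = np.zeros((64, 64), dtype=bool)
    g[r0:r0 + nr, c0:c0 + nc] = True
    p = np.zeros_like(g)
    p[r0:r0 + nr, c0 + s:c0 + nc + s] = True
    gt[name], pred[name] = g, p

per_region = {n: dice(gt[n], pred[n]) for n in regions}
wt = dice(np.any(list(gt.values()), axis=0),
          np.any(list(pred.values()), axis=0))

for n, d in per_region.items():
    print(f"{n:14s} Dice = {d:.2f}")
print(f"mean subregion Dice = {np.mean(list(per_region.values())):.2f}")  # ~0.62
print(f"whole-tumor Dice    = {wt:.2f}")                                  # ~0.92
```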
We appreciate your consideration of these clarifications. Clear reporting of segmentation performance metrics, dataset details, and model specifications is essential for advancing AI applications in pediatric neuroimaging [7,8].
To facilitate consistent, accurate, and meaningful comparisons across segmentation models [6,7], we suggest that future tables and summaries distinguish between the following (an illustrative entry format follows the list):
Whole-tumor and subregion Dice scores;
Spatial overlap metrics (Dice) versus volume-based metrics (e.g., Pearson's r); and
Internal versus external validation cohorts, with appropriate dataset specifications.
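As an illustration of such an entry format, a single model's results might be reported along the following lines (all values are hypothetical):

Study | Target | Metric | Score | Validation cohort
Model A | Whole tumor | Dice | 0.90 | External (n = 100)
Model A | Subregion average | Dice | 0.65 | External (n = 100)
Model A | Tumor volume | Pearson's r | 0.97 | Internal (n = 250)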
We hope these clarifications are helpful and contribute to the broader goal of advancing transparency and reproducibility in pediatric neuroimaging AI research.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Dalboni da Rocha J.L., Lai J., Pandey P., Myat P.S.M., Loschinskey Z., Bag A.K., Sitaram R. Artificial Intelligence for Neuroimaging in Pediatric Cancer. Cancers. 2025;17:622. doi: 10.3390/cancers17040622.
2. Boyd A., Ye Z., Prabhu S., Tjong M.C., Zha Y., Zapaishchykova A., Vajapeyam S., Hayat H., Chopra R., Liu K.X., et al. Expert-level pediatric brain tumor segmentation in a limited data scenario with stepwise transfer learning. medRxiv. 2023. doi: 10.1148/ryai.230254.
3. Mulvany T., Griffiths-King D., Novak J., Rose H. Segmentation of Pediatric Brain Tumors using a Radiologically informed, Deep Learning Cascade. arXiv. 2024; arXiv:2410.14020.
4. Vossough A., Khalili N., Familiar A.M., Gandhi D., Viswanathan K., Tu W., Haldar D., Bagheri S., Anderson H., Haldar S., et al. Training and Comparison of nnU-Net and DeepMedic Methods for Autosegmentation of Pediatric Brain Tumors. Am. J. Neuroradiol. 2024;45:1081–1089. doi: 10.3174/ajnr.A8293.
5. Bengtsson M., Keles E., Durak G., Anwar S., Velichko Y.S., Linguraru M.G., Waanders A.J., Bagci U. A new logic for pediatric brain tumor segmentation. Proceedings of the 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI); Houston, TX, USA, 14–17 April 2025; pp. 1–5.
6. Familiar A.M., Kazerooni A.F., Vossough A., Ware J.B., Bagheri S., Khalili N., Anderson H., Haldar D., Storm P.B., Resnick A.C., et al. Towards consistency in pediatric brain tumor measurements: Challenges, solutions, and the role of artificial intelligence-based segmentation. Neuro-Oncology. 2024;26:1557–1571. doi: 10.1093/neuonc/noae093.
7. Jha D., Durak G., Sharma V., Keles E., Cicek V., Zhang Z., Srivastava A., Rauniyar A., Hagos D.H., Tomar N.K., et al. A Conceptual Framework for Applying Ethical Principles of AI to Medical Practice. Bioengineering. 2025;12:180. doi: 10.3390/bioengineering12020180.
8. Sabuncu M.R., Wang A.Q., Nguyen M. Ethical Use of Artificial Intelligence in Medical Diagnostics Demands a Focus on Accuracy, Not Fairness. NEJM AI. 2025;2:AIp2400672. doi: 10.1056/AIp2400672.
